Designated Hitter
January 10, 2008
In Defense of the Hall of Fame
By Mark Armour

[Editor's Note: As always, the views expressed in this article are those of the author and do not necessarily reflect the official policy or position of Baseball Analysts and/or its writers.]

Over the holidays, I spent a lot of time poring over issues of The Sporting News from the 1960s. Typically distracted by stories that have nothing to do with my task, I came across many discussions about who should be in the Hall of Fame. This was 45 years ago, so the articles were about guys like Sam Rice, George Kelly, Elmer Flick, or Jim Bottomley, written by Shirley Povich, Fred Lieb, Lee Allen, or Taylor Spink, with testimony from Branch Rickey, Joe McCarthy, or Casey Stengel, old men who knew a thing or two about talent. There were stories like this every off-season, largely anecdotal, well-written, and fascinating. My reading has been like a refresher course in early 20th century baseball.

What was missing from these newspapers were all of the “No” votes. Back in the day, a writer would pull out his typewriter to support some old ballplayer, but there were no stories about why someone was overrated or unqualified. Had baseball blogs existed in 1962, some modern expert could have lectured Povich about Sam Rice’s WARP score, or blasted Rickey for his silly misevaluation of George Kelly. But we missed out on all of that good fun, and eventually all these guys, and others like them, got in.

The argument for George Kelly, as I recall it, went something like this: starred on offense and defense for the only National League team ever to win four consecutive pennants (still true), won multiple HR and RBI titles, credited by John McGraw with getting more important hits than any man who ever played for him, and had a cool nickname (“Highpockets”). Using the standards of the time, that’s a decent argument. Not perfect, insufficient even, but not a bad resume. Kelly was a fine player.

There are very few people around anymore who think George Kelly should be in the Hall of Fame (Bill James has suggested he is the worst player in the Hall), though there are also few people around who know anything about him—what teams he played for, his impact on those teams, what his great manager thought of him, how he played the game. All we know about him, or think we know about him, is how good his statistics were. Not good enough, apparently.

I am not suggesting that George Kelly “deserves” his plaque—whatever that means. Rather, I am saying that the man and his accomplishments and his stories have been buried by the avalanche of his Hall of Fame case. The memories and opinions of Fred Lieb and Branch Rickey have been replaced with … what exactly? Is there anyone out there that has anything to say about any of these players besides their statistics? Forget George Kelly, does anyone have any colorful stories about Bert Blyleven or Andre Dawson to help me get through the winter? Even Joe Posnanski, one of our best bloggers, has felt a need to serve up endless “How Good Was He?” columns this winter. Say it ain’t so, Joe.

Having read dozens of Hall of Fame arguments on the web in the past few weeks, by good people, some of them my friends, I find several problems with them in the main. Walking timidly into the lion’s den, let me summarize.

Recently there has been some debate on various internet sites, including this one, about who deserves to vote in Hall of Fame elections. Let me tell you what I think. If I were in charge of the process, I would require that all voters understand what the Hall of Fame actually is before gaining the privilege. I would make every voter take a history test. There are 200 members of the Hall of Fame who were chosen based on their play in the major leagues, and I would expect each of the voters to understand (at the very least) the careers and qualifications of all of those men—the highlights, great moments, opinions of contemporaries.

Does this mean that the correct 200 players are in the Hall of Fame? No, of course not. Does this mean that 200 is the right size? No. However, I suggest that whatever standards you come up with should be “reasonably” consistent with the current membership list. If you want to say that the voters overvalue the players of the 1930s, or that 3B is underrepresented, or there are not enough Yankees, you must do so while not dynamiting a 70-year-old institution. You want to ignore the bottom 10% of the Hall, we can live with that.

Jay Jaffe, a fine writer and analyst over at Baseball Prospectus, invented a measure called JAWS (which uses WARP as its basis) and compares new candidates to the JAWS score of the average HOF player at his position. Actually, if I have this right, he first removes the worst inductee at each position (and four pitchers) and then uses the average of the rest. This process might suggest that Jay believes that half of the Hall of Fame is unqualified, or at least suspect. My bright friend Rob Neyer uses Win Shares, but has a similarly strict standard, recently writing, "I believe that if a player is among the best dozen or so at his position, he belongs in the Hall of Fame; or, alternatively, that if he's better than half the players at his position already in the Hall, he belongs in the Hall of Fame." When considering that there are about 18 HOFers per position now, and that there are several non-inductees that Rob supports, he is implying that about 40% of the current members are unqualified.
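To make the arithmetic concrete, here is a minimal sketch of the kind of standard described above: drop the lowest-scoring inductee at a position, average the rest, and see how many current members fall short of that bar. The scores are invented for illustration and are not real JAWS or WARP values, the function names are my own, and the four-pitcher adjustment is left out.

```python
# Hypothetical career-value scores for the inductees at one position.
# These numbers are invented for illustration; they are NOT real JAWS or WARP values.
hall_scores = [95, 80, 72, 66, 60, 55, 48, 40]


def positional_standard(scores):
    """Average of the inductees' scores after dropping the single lowest one."""
    kept = sorted(scores)[1:]  # discard the worst inductee at the position
    return sum(kept) / len(kept)


def share_below(scores, bar):
    """Fraction of current inductees whose score is strictly under the bar."""
    return sum(1 for s in scores if s < bar) / len(scores)


standard = positional_standard(hall_scores)
below = share_below(hall_scores, standard)
print(f"standard = {standard:.1f}, share of members below it = {below:.0%}")
```

Because the bar is an average rather than a minimum, a sizable share of the existing membership will sit below it under almost any set of inputs, which is the point the paragraph above is making.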

I mention Jay and Rob because they are two of the more talented and visible writers on this subject, and I suspect most people reading this agree with them on this issue. With all due respect, and writing as a product of the same general community of thought, I have a different view.

Look, I am down with the idea that the Hall of Fame contains several questionable players. (Not bad players—there are no players even remotely “bad” in the Hall of Fame.) But, I am sorry, if you want to impose standards that 40% or 50% of the current Hall does not reach, then, in my opinion, you should not get to vote. You are ignoring what the Hall of Fame actually is. You can’t wave away 40% of the Hall and claim to be interested in helping. And there are no “tiers” in the Hall of Fame either—every member is honored equally.

Parenthetically, if every voter were like Rob and Jay, and voted only for, say, the best 12 players at each position, the actual HOF bar would be even higher than that. Not all voters are going to agree on who these 12 guys are, and you need 75% of the vote. The effective standard becomes that 75% of the voters have to put you in the top 12. Which I suspect would leave you with something like 8 guys per position. We will reach the point where we only elect superstars and relief pitchers. Oh look, here we are.
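A rough sketch of that effect, under a deliberately simple assumption: each voter independently puts a given player in his personal top 12 with the same probability. Real ballots are not independent, so treat this as an illustration rather than a model of the BBWAA. The function name and the 400-voter electorate are my own choices.

```python
from math import ceil, comb


def election_probability(n_voters, p_support, threshold=0.75):
    """Chance that at least `threshold` of the voters name the player.

    Assumes every voter independently includes the player with
    probability `p_support` (a simplifying assumption, not a fact
    about how real voters behave).
    """
    needed = ceil(threshold * n_voters)
    # Exact binomial upper tail: sum P(exactly k supporters) for k >= needed.
    return sum(
        comb(n_voters, k) * p_support**k * (1 - p_support) ** (n_voters - k)
        for k in range(needed, n_voters + 1)
    )


for p in (0.90, 0.80, 0.70, 0.50):
    prob = election_probability(400, p)
    print(f"support {p:.0%}: chance of election {prob:.4f}")
```

With a large electorate the curve is close to a cliff at 75%: a player whom most voters rank in their top 12 can still have almost no chance of election if that majority falls a few points short.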

Another problem with the analytical arguments is that they are so … strident. The current message from the stat community to the Hall of Fame and its voters goes something like this: “Your institution is riddled with poor selections, and most of the current voting writers are morons. P.S. Please find enclosed my application to join your fine group.” It’s a bit like saying, “I don’t like your wife, but if you have me over for dinner I can give her a few tips on her attitude.”

Every time some poor writer released their Hall of Fame ballot last month, unless it had the “right” guys on it, the voter was deemed not smart enough, unthinking. I don’t really want to quote examples because I am in enough trouble already, but, trust me, if you voted for Jack Morris you were mocked. (Sure, Morris had more Win Shares and the same WARP as Rich Gossage, and no GM in their right mind would prefer Gossage to Morris, even before considering Morris’s epic post-season performances. Apparently “relief pitcher” is a separate position now. Coming soon: the top 12 “seventh-place hitters”. But I digress…)

Jim Rice received 72% of the vote on Tuesday, an overwhelming consensus of support, 12% more than Franklin Roosevelt’s 1936 landslide over Alf Landon. Are these 72% all just not smart enough? Four hundred journalists, many of whom saw Rice play hundreds of times, just need to think this through properly? How did we all get so confident? I submit, sheepishly, that perhaps it is we who need to open our minds.

Me? Sure, I have argued for all the “smart” guys—Ron Santo and Bert Blyleven and Tim Raines—at cocktail parties. Even Tony Oliva, which is a big hit, believe me. But I suggest we all could use a little humility. The idea that we can confidently separate Dale Murphy and Andre Dawson statistically is nuts—who you prefer is basically a matter of taste. Defense, adjusting for eras, quality of competition, integration, position, post-season play, intangibles? If you are approached by someone who claims to have unraveled these issues statistically—I strongly urge you to run.

My final problem with all of the analytical Hall of Fame arguments: there are too many of them, and they all say the same thing. Once you have decided to use Win Shares, or WARP, or JAWS, there is really no need for a lengthy explanation. If you want to explain the internals of Win Shares and make the case for why you are using it as opposed to something else, go right ahead. But once you have defined the parameters of the debate on your terms, there is nowhere to go unless you typed some of the numbers incorrectly. The reason people come up with a different answer is that you confidently co-opted the question.

The only way one can add to the conversation is to supply some sort of color or nuance—a description of performances in big games, quotes from opponents or managers, a great World Series catch, your own personal memories. Does this matter? I suggest it matters in one sense at least: without it, you don’t really have an article that hasn’t been written before. Are you all really going to write the same Jim Rice stories again next year?

When I was about 12 years old, I received a little book for Christmas about the Hall of Fame, written by Ken Smith (who was the librarian at the Hall for many years), containing biographies of all of the current members. It was not great literature by any means, but I must have read that paperback three or four times, and it played its small part in my baseball education. Reading about Hugh Duffy and Tommy McCarthy got me curious about the great Boston Beaneater teams of the 1890s, just as Frankie Frisch and Dizzy Dean brought me to the Gas House Gang, and Eddie Collins and Frank Baker to the powerhouse Athletics teams of the early 1910s. Although the book focused on the players, it was the great teams that made the stories interesting. The teams, it seemed to me, were what baseball history was really all about.

I think we all agree that if George Kelly had played for the Phillies in the 1920s instead of the Giants, he would not be in the Hall of Fame. (He would actually be more respected than he is, since instead of being a “joke Hall of Famer” he would be an “unappreciated star”.) However, he *did* play for the Giants, and this seems wholly relevant to the conversation. John McGraw somehow won ten pennants with Christy Mathewson (who was only around for five of them) and a bunch of players like George Kelly—great defenders who could hit a little. The only NL team ever to win four straight flags, the 1921-24 Giants, had four Hall of Famers: Frankie Frisch, George Kelly, Ross Youngs, and the shortstop (Dave Bancroft, giving way to Travis Jackson), all but Frisch considered “mistakes” by today’s experts. How many Hall of Famers should be on this great team? It is consistent with the purpose of the Hall of Fame, in my view, to honor baseball’s champions.

If you begin with the premise that the 200 guys in the Hall of Fame should be the 200 statistically-best careers in history, a premise all analysts have rallied around, then George Kelly does not have a case. If you modify this premise, if you believe that being on this great Giants team gets him extra points, that the word of John McGraw carries additional weight, that first base defense was more important at that time and place than it is today, that career length is less important to you, we start inching along and suddenly his case seems less ridiculous. This is not a case I would make, but this is the case that the people who lived and watched those teams made about George Kelly. If the guy who John McGraw thought was the best player on a four-time champion—if this is the worst guy in the place, how bad can it really be?

Don’t worry, I am not asking for your support for George Kelly, although I do suggest you pause at his plaque the next time you are in that great museum in Cooperstown. He’s got a nice story. Jack Morris and Jim Rice have nice stories too, and the smart people advocating their candidacies are worth a listen.

Mark Armour writes baseball from his home in Corvallis, Oregon. He was the co-author, with Dan Levitt, of the award-winning book Paths to Glory, the editor of Rain Check: Baseball in the Pacific Northwest, and the director of SABR's Baseball Biography Project. His next large project is the life of Joe Cronin. He can be reached at markarmour@comcast.net.

Comments

Mark, if there was widespread acknowledgment by the BBWAA of all the great statistical work that has been done to further our understanding of the game, and they then rejected it from time to time on the basis that it should not be the Hall of Stats, that would be one thing.

But the writers use stats to justify their cases, too. All the time. Just the wrong ones.

Also, the assumption that those who evidence players' Hall cases with newer stats care little for the actual stories sounds pretty presumptive to me. I think stats and stories go hand in hand in terms of enhancing our understanding of baseball's past.

Defense, adjusting for eras, quality of competition, integration, position, post-season play, intangibles? If you are approached by someone who claims to have unraveled these issues statistically—I strongly urge you to run.

Once you have decided to use Win Shares, or WARP, or JAWS, there is really no need for a lengthy explanation.

How many Hall of Famers should be on this great team? It is consistent with the purpose of the Hall of Fame, in my view, to honor baseball’s champions. ... If the guy who John McGraw thought was the best player on a four-time champion—if this is the worst guy in the place, how bad can it really be?

This is a great piece of analysis, but I think there is one thing left out. Why do you need to have anyone vote at all if you can choose players based on WARP, Win Shares or some other uber-stat?

Those who want the Hall of Fame to be the Hall of Statistically Superior Players should just create their own institution. Just understand that a lot of the people who play the game will simply ignore the honor of their selection. Because they recognize that all the statistics are really just very crude measures of the game they played.

Mr. Armour,
Well written and refreshing. I'm a huge proponent of building your own case with your words and actions vs. demeaning and tearing down others'. As a former collegiate player, a good (I think the best) litmus test when deciding between the tough nominees, is to put yourself in the dugout and say to yourself, if I was going to start a 162 game season, who would I want to be on my team, considering the total package, i.e. stats, hustle, attitude, tangibles and intangibles included, who do I want as a teammate? All the best. Shark

Sully: as to your first point, sure the writers use stats. In your opinion, they are the wrong ones, and in their opinion, yours are the wrong ones. On and on it goes.

For the second, I do not make the assumption you tar me with. I comment only about what I like to read, and how that relates to what people are writing. All analysts might be secretly squirreled away reveling in Bert Blyleven stories. If so, their secret is safe.

:-)

Mark, do you think writers take the time to understand the more advanced metrics?

Do half the people who read sites like this take the time to really understand the more advanced metrics?

My point to Mark is this:

I know what a HR is, know what an RBI is, know what batting average is, know what fielding percentage is, know what wins are and know what ERA is.

I wonder if much of the BBWAA could say the same of park effects, FRAA or FRAR, VORP or even how much a simple AVG/OBP/SLG line tells you.

This is a great article. Although I disagree with some of it, the point that stats can sometimes overwhelm the arguments is well taken.

At the same time, however, it's quite clear that there is little consistency or sense to much of the voting. Jim Rice is the best example. Yes, he was a very good player, but yes he wasn't very good for very long. You might be able to add some ancillary points for him, but where? The Sox won zero World Series with him.

Supporters throw out Tony Perez as a comparable. But Tony Perez was a key cog on the Big Red Machine. I can see people arguing that Perez's numbers are not so great, but to me, he was part of a dynasty, and that dynasty should be reflected in the Hall.

I think most SABR folks leave some room for the Perezes of the world, but if it means a Rice is allowed to use Perez as justification for his own admission, then they start to get angry.

For me, I would prefer there be some floor for performance at each position. If a guy has this level of performance, he's a shoo-in. If he doesn't, it's time to look at WS rings, AS appearances, impact on the game (Roger Maris, for example), and so forth.

Stat heads may just want that floor and nothing else, but I disagree with that too.

Sully: I think you (as an analyst) are fighting two problems here, which are different but related.

1. Many people are not wired to understand advanced baseball metrics. This is simple biology. This does not mean these people are stupid, and it does not mean (IMO) that their voice is unimportant in this matter.

2. There are also many people, and I count myself among them, who understand advanced baseball metrics perfectly well but simply believe that they can't tell the story well enough. I can understand, for example, that Jack Morris has a WARP of 92 (or whatever), realize exactly what that means, even buy into its accuracy [though I believe there are huge error bars in all of this stuff], ... yet still come to the conclusion that his Hall of Fame case is better than that number would suggest. Perhaps the fact that you could count on him for 240 innings every year has additional value. Perhaps 1991 Game 7 should count as 20 wins. Perhaps I believe that the 1980s was the best baseball (in terms of quality of play) ever, and that having no starting pitchers born after 1947 in the Hall (but three relievers) is a problem.

There are cases to be made like this for a lot of players. We all have to allow for personal biases. It is not a science. I, for one, am glad of that.

And of course Bert Blyleven is a grave omission.

Your second point is very well taken and I agree that the sort of anecdotal/peripheral evidence you cite for Morris is well worth considering (although I still can't distinguish him from Dennis Martinez).

Many people are not wired to understand advanced baseball metrics. This is simple biology.

I don't buy this. The metrics themselves are ancillary. You are telling me that an individual intelligent enough to be employed by widely circulated newspapers cannot grasp that not making outs is fundamental to scoring runs? Or that fielding percentage tells us very little? Or that failing to consider context (park effects, era, etc.) when citing stats is inadequate? Or that wins often tell us as much about the run support the pitcher gets as they do about the pitcher himself?

And on and on and on...

That's always been the funny thing to me about "advanced" baseball principles. They're very, very logical. I played baseball until I was 21, was considered a heady, coach-on-the-field type and only after I stopped playing and picked up the NBJHBA did I feel I started to understand the game.

If you take the time to read the best work out there, it's really pretty simple.

I think it's a little more complicated than that. First off, yes I do believe that there are smart people who don't think about baseball the way you do. Part of aptitude is what you are interested in. It's baseball; being a fan is supposed to be fun.

I am one of those people for whom baseball became even more fun once James started writing his abstracts. But for many smart people, likely more people, reading even simple analysis is boring and irrelevant to the enjoyment of the game. If this was a college course, perhaps they could ace it. But they should not have to take the course in the first place. To say we understand the game "better" than they do... I can't go there.

So there are three questions.

1) Do you think the Hall of Fame should honor the best players ever to play?

2) Do you think that we have the means to adequately understand which players (with plenty of room for disagreement on the margins) have contributed the most to their teams' winning efforts (the object is to win, after all)?

3) Should the voters try and take the time to read "even simple analysis" even if it is boring to them?

I am not trying to impose my way of enjoying baseball on anybody. I just think HOF voters should make more of an effort to honor the best players, since that is what they are purporting to do.

I have to admit, although I quote WS, WARP, etc. from time to time, I like stats that can be tracked when watching and keeping score of games. Counting stats such as Times on Base, Total Bases, Outs, K, BB, GB, FB, etc. are all very straightforward, understandable, and, importantly, descriptive. The same goes for rate stats like AVG/OBP/SLG/OPS, ERA, K/9, K/BB, etc.

Without delving into the more complex derivative- or formula-based "great" stats, I strongly believe beat writers, columnists, editors, cartoonists, and other Hall of Fame voters would be well-served to concentrate on the aforementioned fundamental measurements of player productivity (especially when viewed in the context of era, league, home ballpark, and position) rather than relying primarily on the Triple Crown stats of AVG/HR/RBI (and hits at the exclusion of walks) or ERA/W/SO, as well as number of MVPs, Cy Young Awards, All-Star games, etc.

I defend those who used basic stats up through the 1970s and perhaps into the 1980s because, for the most part, *we* didn't know better. But with the advent of the computer, game logs, and play-by-play data, as well as the contributions of analysts like Bill James (and many others), *we* can - and should - ask more of the writers and voters than what we did for the first 50 years of the HOF voting.

Statistics are not separate from the game. The right ones simply help illuminate which players produced and which ones didn't. I believe one can enjoy watching and describing the game even more once they take the time to understand stats that are really no more complicated than what it must have felt like after Henry Chadwick introduced many of them back in the 1800s.

Great article. I try never to justify one player making the HOF on the basis of who is already there because, let's face it, so many old timers were chosen before we had records of everything or methods to analyze those records. Similarly, I try not to argue against a player except in the context that a better eligible player deserves selection first (i.e. Blyleven is one of the best all time in K's, Shutouts, IP; Morris is not one of the best all time in anything; Blyleven gets in first).

For example, I feel that Dale Murphy deserves consideration ahead of Andre Dawson. He gets some of one of my pet bugaboos; Murphy was a minor league catcher which shortened his major league career (knees can only take so much) so his misuse and thus lessening of his quantity based stats was not his fault. Similar use of that prediction and its obverse (players who start catching later catch longer) would have predicted long careers for Mike Piazza and Bob Boone, and now do so for Russell Martin. But much of my impressions of Dale Murphy and Andre Dawson come from Vin Scully, either over the radio or on television. I'm not attributing anything negative to Scully, but maybe Irish Scully has a mild appreciation for a fine Irish name like Murphy. Since I'm not good enough to make skill judgments on my own, I rely on stats, especially the park and league adjusted stats, to help me decide.

To deny intangibles is to deny the greatness of players like Phil Rizzuto and Ozzie Smith. I mean, based on stats, Smith doesn't deserve to be there. And yet I *always* fought to get Cardinals tickets when we split up our season seats because I wanted to see Ozzie. His last season, I had to spend that game in the hospital because my kid got her hand caught in a door, and I *think* I was more upset about my kid than missing the game, but the lie detector might not agree with me.

I go to Dodger Stadium (and others) to see greatness, to see the best players in the world do something spectacular. If all baseball was to me was statistics, I wouldn't bother going to the games. You knew going to a Sandy Koufax or Fernando Valenzuela game was something special because the cars would be backed up all the way onto Sunset Blvd. approaching Dodger Stadium. So let the guys who see 150 games a year vote for the players that gave them a special tingle on their way into the stadium. I'm sure Rice has intangibles, and I'm *very* sure that those intangibles were felt in Boston far more than anywhere else. But let's keep on trying to make sure they vote for the players who earned it without the tingle like Raines and Blyleven as well as the intangible guys like Rice and Morris.

Basic reactions to the article:

1) Well-written, though I largely disagree with it.

2) Everybody who talks about baseball uses stats. Everyone. It's a statistical game. The argument is over which stats are useful/illuminating and which are not.

3) I do not think that "intangibles" should be completely ignored. There are things we cannot measure. But I do think that a HoF case built primarily on intangibles is weak. YMMV.

4) Jack Morris had good PS performances. He also had some clunkers. Hardly anyone discusses the clunkers, for some reason... A corollary (at least before the HGH thing) would be Andy Pettitte. Lots of my fellow Yankees fans talk Andy up as a great post-season pitcher. Yet if you look at his record, you see a mix of performances that range from excellent to terrible, and they average out to pretty good. I've enjoyed AP's career. But he's made out to be something he's not (in more ways than one, apparently). So it seems to be with Jack Morris.

5) The "relief pitcher" "7th place hitter" comment/argument is actually an interesting one. Flesh it out, because I think it may have merit. Many analysts believe that a good starter > a great reliever (and a decent starter > a good reliever, and so on). Jack Morris - good starter. Goose Gossage - great reliever. One is dubbed a deserving HoFer by many analysts, the other is dubbed a potential mistake.

That link to George Kelly actually goes to George Kell, another questionable HOF pick.

Cool article though.

For the record, this is the voting criteria for the Hall of Fame: "Voting shall be based upon the player's record, playing ability, integrity, sportsmanship, character, and contributions to the team(s) on which the player played."

As to Sully's first question: Do you think the Hall of Fame should honor the best players ever to play?

Not really. I think the HOF should honor the players who did the best things. What does that mean? That's up to you.

Bill Mazeroski's HR. How much does this inform how great of a player he was? Very little--it's just one hit. All modern analysis believes it to be irrelevant--the post-season does not count.

If I were asked to rank the Top 20 2B of all time, this HR would be more or less ignored. If I were asked to vote on the HOF, this HR would absolutely matter.

When some people read this argument they are apt to say, "What? So is Don Larsen a HOFer too?"

I said it matters, I did not say it carried the day. There is room for nuance in this conversation. Not all plate appearances are created equal, not all stages are the same size.

Once you open this can of worms, a can which deserves to be opened in my view, suddenly a little art has been mixed in with the science. Which is WONDERFUL, in my opinion.

Again, there are two separate issues:
1. I don't believe that analysis alone can tell you who has the best statistics, because of dozens of issues like peak-career, league and era strengths, and various other adjustments that are really just "estimates" at this point.
2. Even if you did have the perfect solution for (1), this does not (for me) encompass precisely what the HOF is supposed to be honoring.

For almost everyone (Willie Mays, So Taguchi) this does not matter. But the gray area is where all the debates are, and the gray area is at least 100 players deep, perhaps more.

I expect you (Sully) and I would parcel out this gray area in similar ways, but I don't believe that my way is in any way more valid than many other ways.

I'm afraid that my bright friend Mark has misinterpreted one of my views. He writes:

*******
My bright friend Rob Neyer uses Win Shares, but has a similarly strict standard, recently writing, "I believe that if a player is among the best dozen or so at his position, he belongs in the Hall of Fame; or, alternatively, that if he's better than half the players at his position already in the Hall, he belongs in the Hall of Fame." When considering that there are about 18 HOFers per position now, and that there are several non-inductees that Rob supports, he is implying that about 40% of the current members are unqualified.
*******

I'm implying no such thing, though I can understand how Mark might assume that I am. My point was that if a player meets one of those criteria he *clearly* belongs in the Hall of Fame. That doesn't mean that other guys don't have valid cases, and I don't think the percentage of non-deserving Hall of Famers is anything *like* 40 percent. My guess? 20-25 percent, at most.

Mark, if you would be so kind, can you just tell me your point? I ask this with humility, not snark.

Are you calling for more civility?

Would you like the Hall to be thought of more as a museum?

Help me out if you wouldn't mind.

For the record, this is the voting criteria for the Hall of Fame: "Voting shall be based upon the player's record, playing ability, integrity, sportsmanship, character, and contributions to the team(s) on which the player played."

I always get a kick out of that description, particularly the part about "playing ability." When it comes to honoring a player *after* his career, I don't really care about his "ability" anymore and, instead, I only care about his "performance" or using one of the criteria established by the Hall, the "player's record."

But that is neither here nor there. The voters (both the BBWAA and the Vets Committees) have created the de facto standards, as James has so effectively pointed out, by virtue of who *they* have inducted into the Hall of Fame.

If "integrity, sportsmanship, and character" are such important criteria, why have certain players with questionable "integrity, sportsmanship, and character" been voted into the Hall and others haven't? Or, more relevant to today, why are some current (and future) candidates being scorned while others before them received a free pass, if you will?

If anything, it seems that these intangibles are more arguable than the stats. But the voters who resort to such intangibles like Rice's fear factor do so fully knowing that it can't be measured (although I might argue differently). As such, if it can't be measured, it follows that it can't be challenged. Ergo, if somebody makes such a claim, it becomes irrefutable in their minds.

More than anything, I'm just asking for consistency in the implementation of the criteria.

I wonder if much of the BBWAA could say the same of park effects

Do we really know anything about the "park effect" on an individual player? I don't think so. The folks who vaguely understood that Yankee Stadium is an advantage for left-handed power hitters are miles ahead of anyone who is applying the Yankee Stadium "park effect" randomly to every player based on statistical analysis.

Do half the people who read sites like this take the time to really understand the more advanced metrics?

From reading people's comments I doubt that even the people who create some of these "advanced" metrics really understand them. Or even some of the old-fashioned ones.

For instance, people use innings pitched as a measurement of how much a player pitched rather than recognizing it as a measure of how many outs they were able to get.

not making outs is fundamental to scoring runs?

It's what you do that contributes to scoring runs that is fundamental to scoring runs. Sometimes making an out contributes more than not making an out.

This is similar to the exaggerated importance of getting on base, which ignores the reality that how likely individual players are to score once on base varies at least as widely as how likely they are to get on base. A walk is not the same as a hit; it's not even the same as a walk by a different player.

If you take the time to read the best work out there, it's really pretty simple.

Only to people who fail to see, or choose to ignore, its complexity. In fact, that is the basic problem with a lot of statistical gurus, including Bill James. They have a simple understanding of both baseball and statistics and the result is simple answers to complex questions.

And that is reason enough for people to ignore a lot of the statistical analysis based on catchily named stats. They rarely actually measure what their catchy name implies.

Why am I getting a "the honorable minister from _____" House of Commons vibe when I read "My bright friend..." ?

For those not familiar, the honorable minister lead-in is generally accompanied by a snarky smile and followed by some skillful knifework.

Actually, Mark and I really are friends. While I certainly can't speak for him, I consider Mark one of the brightest people I know. Both of us, I think, were simply winking at each other.

Right, then. No knife fight. ;)

Nope. But I might short-sheet his futon mattress the next time he stays over.

Ooh, sleepovers. Now I'm getting an ARod-Jeter vibe...

Hey Ross, question for ya: here are the 2007 leaders in VORP, a "catchily named stat:"

1. Alex Rodriguez
2. Hanley Ramirez
3. Magglio Ordonez
4. David Ortiz
5. David Wright
6. Chipper Jones
7. Matt Holliday
8. Jorge Posada
9. Albert Pujols
10. Miguel Cabrera
11. Prince Fielder
12. Chase Utley
13. Carlos Pena
14. Curtis Granderson
15. Jimmy Rollins
16. Ichiro Suzuki
17. Vladimir Guerrero
18. Ryan Braun
19. Barry Bonds
20. Victor Martinez
21. Grady Sizemore
22. Ryan Howard
23. Derek Jeter
24. Aaron Rowand
25. Todd Helton
26. Carlos Beltran
27. Placido Polanco
28. Brian Roberts
29. Derrek Lee
30. Edgar Renteria

Bearing in mind that VORP is an offense-only stat, do you think that it fails to measure what its "catchy name implies?" In other words, is this a bad list of the top 30 offensive performances in the 2007 regular season?

I'm not wedded to VORP. I think it's solid, but imperfect, particularly because it doesn't say anything about defense, and its consideration of baserunning is, AFAIK, limited to steals/caught. But it seems to me that's a pretty good list, and probably a much, much better list than the BBWAA would concoct.

First off, I want to thank Rich for running my article. I wrote him a few days ago what I wanted to write, and he agreed to run it. I didn't expect most people here to agree with me, but I appreciate being able to speak out. I am much more drawn to write when I find myself at odds with the prevailing winds. I hope Rich lets me back. :-)

Next, Sully asks me what my point is. I tell you, there is nothing that makes a writer feel better than spending 2500 words on a subject, plus several long comments, and being asked to explain his point. :-( I am kidding. Sort of.

I had lots of points, but I will be brief.
1. There is too much snideness in HOF discussion, on both sides. I did not point out the sportswriters' snideness, because that is more or less common knowledge 'round these parts.
2. Analytical HOF debates generally include a lot of assumptions about what the purpose of the HOF is, which is the cause of most of the disagreement, IMO.

Baseball writing has become a 12-month occupation. 40 years ago, baseball writers would write once a week in the winter, just to fill you in on who signed a contract or spoke at the Manchester Elks Club. With lots of year-round writers, and no games, the HOF debates (and the MVP debates, which have most of the same issues) take over for six weeks, and there really isn't enough to say on the subject. I see no solution.

The games can't get here soon enough. :-)

Mark, you say there's "too much snideness" in these discussions. But they're not discussions. A discussion is what happens when you and I talk about our kids, or where we're going to eat after a local SABR meeting. These are DEBATES, and while you might describe them as snide, I would describe them as (mostly) spirited. This is what people do when they debate things, whether it's sports or politics or the use of Freudian symbolism in "The Great Gatsby" (I made that up, as I don't have any idea if there's anything Freudian in "The Great Gatsby").

I'm not sure why I'm supposed to get along with Dan Shaughnessy, regarding this particular issue, when we're debating from completely different perspectives with completely different motivations. It's *going* to be contentious. And as near as I can tell, we're not hurting anybody.

Well, Rob, if they all wrote like Shaughnessy I would agree with you. But you know as well as I do that there are good smart sportswriters advocating for Jack Morris, for example, and I think some people on "this" side have a strong tendency to paint with too broad of a brush.

I see no difference between "sabermetricians don't think players are human beings" and "Jack Morris supporters don't understand how baseball games are won and lost" (which is actually much kinder than I have seen this point made).

Really? I have never written, "Jack Morris supporters don't understand how baseball games are won or lost." Can you show me someone who has? Five someones? Outside of message boards?

This criticism reminds me of all those "Look at what the crazy bloggers are saying!" comments ... meanwhile, the guys being PAID to write and talk about politics and public policy -- Bill O'Reilly, Chris Matthews, etc. -- are either buffoons or just plain bat-shit crazy. I'm all for a rational, respectful debate (or if you prefer, discussion). But shouldn't we start at the top, with the guys who actually have, you know, audiences?

To reinforce Rob's point, my first reaction to the post was "who are these stat people he is talking about?"

I have a sharp tongue but try and always be sure to steer clear of ad hominems.

I also happen to like the smells of the ballpark, etc.

And funny you mention the political commentariat, Rob.

The post smacked of David Broder-esque advocacy of bipartisanship for bipartisanship's sake. Forget which side of the aisle the sound policy falls on.

Well put, Sully. "It doesn't matter who's right -- and maybe nobody's right, really! -- as long as we all get along."

No thank you, sir.

You know, I am not going to provide the examples you ask for. Obviously, you are all free to draw whatever conclusions you wish from that.

Apparently I am a minority of one on this. Carry on.

Bearing in mind that VORP is an offense-only stat, do you think that it fails to measure what its "catchy name implies?"

Yes. You can come up with a similar top 20 list using all sorts of offensive stats.

And you make the case that Aaron Rowand belongs on that list, but Torii Hunter doesn't. Given the contracts each just signed, it appears the baseball industry doesn't agree with you. I don't need VORP to know that Alex Rodriguez is more valuable than Nick Punto. So what exactly does VORP measure?

Bearing in mind that VORP is an offense-only stat

You mean keeping in mind that the "catchy name" doesn't really represent "Value" but simply one part of a player's value? That pretty much describes the problem. And we won't even discuss what a "replacement player" really is because it doesn't really matter - it's basically a conceptual framework that has little or nothing to do with the actual number.

The fact that you can calculate a number doesn't guarantee it has meaning. You can add apples and oranges and divide by lemons and call it fruit salad.

In any disagreement, the truth probably lies somewhere in the middle.

People who mostly look at numbers would be better served by learning more about the history of the game. Writers who only go to games and ignore "the new math" would be well served to at least understand what some of the new stats are trying to measure.

Mark's article is fun, I just can't get enough of this HOF debate and history stuff. Plus after BJ put down George Kelly, it only makes me want to learn more about Mr. Kelly.

Funny thing is, almost all of my analyst friends absolutely love the history of the game. Meanwhile, many writers I know are interested in the "new math" only when it serves their immediate purpose.

To me, that's a simple truth that Mark missed.

Actually, Mark, I really enjoyed the article. Last time I checked it was the Hall of FAME, not the hall of the statistically best players of all time (though all those who meet that criterion should always be considered).
I couldn't agree more with Rich's point about the context in which many people were elected. I wasn't alive in the 50's and I think it's wholly unfair to look back through statistical glasses and deem someone unworthy.
I agree with Neyer's point that the discussions are more spirited than spiteful...besides, much better than people talking about basketball or, god forbid, gridiron.
I think any piece that gets people talking is a good one, so well done.

I just want to say, nice job. Really. Mark, Rob and Rich.

I liked the column, liked the debate. The infinite "Shades of gray" vibe is quite welcome.

An interesting take. However, I do think Mark has mischaracterized an aspect of the JAWS project when he writes, "This process might suggest that Jay believes that half of the Hall of Fame is unqualified, or at least suspect."

Aside from the omission of the bottom of the barrel in figuring the JAWS benchmark at each position, the point of my system isn't that half of the Hall's members are unqualified, it's to identify the candidates on the ballot who raise the standards of the Hall because they surpass the benchmark at their position. The Hall's rolls have been compromised by the admission of some dubious players, and while we can't undo what's done, we should focus on candidates who don't compromise the standards further instead of the ones who merely meet a minimum threshold.

For example, my system says that Jim Rice is about as good a candidate as Ralph Kiner, but that among HOF leftfielders, both would rank in the bottom tier among their positional peers. We're not going to kick Kiner out, but I don't think we should be jazzed up about Rice joining him, as it appears he will next year.

Furthermore, I've fully acknowledged from the outset of my project that there are certainly elements of players' Hall of Fame cases that JAWS can't capture (postseason performance, awards, league leadership, milestones, pioneering efforts, etc). Subjective arguments will always have a place in the discussion, but they shouldn't dominate it.

Rob wrote: "Funny thing is, almost all of my analyst friends absolutely love the history of the game. Meanwhile, many writers I know are interested in the "new math" only when it serves their immediate purpose. To me, that's a simple truth that Mark missed."

I didn't miss it, it just had nothing to do with my story. I am only commenting about what you (the plural you, not the singular) write. Whether someone is interested in old stories is only interesting to me (for the purposes of this story) if they write about them.

I expect you have run across people with this odd view, but it ain't me.

it's to identify the candidates on the ballot who raise the standards of the Hall because they surpass the benchmark at their position.

I think one of the questions raised here was whether "their position" is a critical component of the discussion. Gossage or Morris? Why should Gossage get in the HOF because he was used at a position that usually went to much lesser pitchers?

It appears the idea is that there is no minimum defensive requirement except that they were good enough for a manager to play them at a particular position. But it seems to me a player whose defense is well below average needs to more than make up for it with their bat.

Subjective arguments will always have a place in the discussion, but they shouldn't dominate it.

Why not? Are the subjective judgments really less reliable than a crude statistical analysis that can't capture the complexity of the game?

But Mark, I *do* write about the old stories. I've written whole books full of old stories, and my next book -- out this spring! -- is 100% old stories. You mentioned me in the piece (by the way, thanks) so you *seem* to have had me in mind.

Or is your point simply that we don't write about the old stories at the same time we're writing about a player's Hall of Fame credentials? If so, you're probably right, as I don't think Tim McCarver's bubble-blowing talents have much bearing on his Hall of Fame case...

I think one of the questions raised here was whether "their position" is a critical component of the discussion.

Yes, I believe position is an important part of the discussion. If nothing else it's an organizing principle that should help us identify a representative cross-section of players for a given time period and provide a means of historical comparison of apples to apples and catchers to catchers, and even relievers to relievers. All-Star teams don't consist entirely of slugging first basemen and corner outfielders, so why should the Hall of Famers from that era over-represent such players?

Even if defensive skill isn't a mandatory asset for potential Hall of Famers at a given position, it's wholly relevant to incorporate defensive statistics because they account for things like, as you suggest, a player whose below-average defense requires some additional offensive input. Systems like Win Shares and WARP also do a good job of apportioning defensive responsibility and value fairly, such that we can have a basis of comparison between the slick-fielding shortstop and the lumbering first baseman.

Are the subjective judgments really less reliable than a crude statistical analysis that can't capture the complexity of the game?

I'd like to think JAWS rises above a description of "crude statistical analysis" given that it accounts for offense, defense and pitching in runs and wins, adjusting for park and league scoring levels, and that it takes into account both career and peak performance. It's still a tool, but my hope is that it's a useful one when it comes to sorting out Hall of Fame candidates.

Once you have decided to use Win Shares, or WARP, or JAWS, there is really no need for a lengthy explanation... The only way one can add to the conversation is to supply some sort of color or nuance—a description of performances in big games, quotes from opponents or managers, a great World Series catch, your own personal memories.

Well, jeez, I'd like to think that somewhere within the 10,000-odd words I spilled on JAWS this past six weeks contained some amount of nuance regarding the players under discussion, but I do heartily apologize for not being able to spit out the equivalent of "The Glory of Their Times" in that context.

Ross,

"Yes. You can come up with a similar top 20 list using all sorts of offensive stats."

Even if this was true, that doesn't make VORP a bad stat.

"And you make the case that Aaron Rowand belongs on that list, but Torii Hunter doesn't. Given the contracts each just signed, it appears the baseball industry doesn't agree with you."

First off, that's a VORP ranking, not a Rob-in-CT ranking. I just put it up as an example for discussion.

Just because Torii Hunter got a bigger contract doesn't make him a better player. A contract results not only from the player's ability (real or perceived) but the market at the time, the needs of the team, the skill of the agent, the desires of the player himself (see: Wakefield, Tim), and probably a few other things I'm not thinking of.

Torii Hunter ranked 45th in VORP in 2007, by the way.

Hunter hit .287 .334 .505. He stole 18 bases, but was caught 9 times.
Rowand hit .309 .374 .515. He stole 6 bases and was caught 3 times.

I do not know for sure if VORP considers park effects (yes, I admit it! I don't know everything under the hood!!), but I know Philly is considered a hitter's park. I don't recall what Minny's park effect is. Even if you adjust Rowand's numbers down a bit, he out-hit Hunter. I wouldn't just wave away 40 points of OBP.

Also, as we've both noted, VORP does indeed fail to incorporate defense, which is a big part of Hunter's game. Granted, the same is true of Rowand, since they are both CFers with good defensive reputations.

So a more accurate (less catchy) name would be "Offensive Value Over Replacement Player." Or maybe just "Offensive Value." It's made fairly clear, I think, that "replacement player" is a theoretical construct. There is reasoning behind it, but it is ultimately an arbitrary number, and I'm fine with that. This somehow makes VORP useless?

Again, I'm not holding it up as some perfect single-number answer to life, the universe and everything (that would be 42, of course). I'm merely asserting that it is a pretty good stat, and miles better than what many members of the BBWAA tend to use when discussing a player's ability.

I think it's already been noted well that ability is not the only component when evaluating a player for the HoF. While I think it should be the most important thing, I agree with many of the other posters that things like character, adversity, etc. should matter. It's a question of how much you think those things should matter.

One more thing on Hunter v. Rowand. The VORP ranking was for 2007 only. Teams presumably (hopefully!!) look at more than just the last year's results when evaluating a player. I don't recall, but maybe Hunter was actually the better hitter over a 3-year period, and Rowand just had one flukey good year in '07. In that case, assuming for a moment that their defense is roughly even (not necessarily true), paying Hunter more might be perfectly reasonable, and also not at odds with the 2007 VORP ranking of 30th for Rowand and 45th for Hunter.

Why not? Are the subjective judgments really less reliable than a crude statistical analysis that can't capture the complexity of the game?

Yes! In fact, they most certainly are. Subjective judgements tell us Jeter is one of the best shortstops in baseball. "Crude" statistical analysis tells us he is actually quite possibly the worst.

Also, for what it's worth, using contracts to figure out who the baseball industry thinks is better is ridiculous. Gil Meche is going to make $11 million next year, and Josh Beckett is going to make $10 million.

This site is called baseball ANALYSTS. What do you hope to get out of your visits here?

Jay Jaffe wrote: "Well, jeez, I'd like to think that somewhere within the 10,000-odd words I spilled on JAWS this past six weeks contained some amount of nuance regarding the players under discussion, but I do heartily apologize for not being able to spit out the equivalent of "The Glory of Their Times" in that context."


I am guessing that Jay actually does not heartily apologize, and he has no reason to. On the chance that this comment was directed at me, I assure you that I do not wish you to do anything at all of the sort. I hope you continue to write what you wish, since you do it so very well. I have quoted (i.e., "ripped off") JAWS on more than one occasion.

All I hope is that the *readers*, and yes I am looking at *you* over there, seek out other parts to this story, perhaps even become *writers* of these stories yourself, and realize that these debates often do a poor job (IMO) of illuminating the game. No one is under any obligation to tell the *whole* story, just as no one should be under any illusions that they are telling anything more than a part of it, and that the people they disagree with are as well.

that doesn't make VORP a bad stat.

It just means it has little if any meaning. If you added Total Bases to RBI's and divided by games played and called it Power Quotient you would probably end up with a list of some of the game's best power hitters. That doesn't mean the number has any real meaning.

I just put it up as an example for discussion.

You put that list up in defense of VORP, so if your claim is VORP is a measure of a player's "value" then you need to defend it as such.

I do not know for sure if VORP considers park effects (yes, I admit it! I don't know everything under the hood!!)

Which was sort of my point. Most people who use VORP don't know what it means. That is really its usefulness - it hides any weaknesses inside a black box. How many people even know that it only includes offense?

This somehow makes VORP useless?

What does your list show it is useful for? Here is a list of the 10 pitchers with Greatest Value Above Replacement:

Sabathia-CLE
Webb-ARI
Harang-CIN
Blanton-OAK
Halladay-TOR
Hudson-ATL
Lackey-LAA
Peavy-SDP
Haren-OAK
Santana-MIN

Again, I'm not holding it up as some perfect single-number answer to life, the universe and everything

Then what are you holding it up as? It does not seem to have any other use - especially since you admit you don't really know what it means "under the hood". If you look at the "uber-stats" they really all claim to measure the same thing - the comprehensive value of a player.

All-Star teams don't consist entirely of slugging first basemen and corner outfielders, so why should the Hall of Famers from that era over-represent such players?

Because those players arguably contributed more to their team's success than people who play other positions. As was pointed out above, not everyone bats at the top of the order, should we judge someone as a HOF candidate because they were the best number eight hitter in the league? Should middle relievers have their own category for the HOF? Apples to apples.

Systems like Win Shares and WARP also do a good job of apportioning defensive responsibility and value fairly,

You claim they apportion it "fairly" and yet even among professional baseball people there is a wide disagreement about the relative value of offense, pitching and defense. Essentially this is the same claim to have discovered that the answer is 42.

Even if defensive skill isn't a mandatory asset

Who said it wasn't? The problem is pretty simple, is a slick fielding second baseman more valuable than a poor fielding shortstop? Position is only the starting point for discussion of defense. And it has little to do with their offensive value, the position they play doesn't matter when they come to the plate. So why would you divide players by position and then talk about their offensive value?

I'd like to think JAWS rises above a description of "crude statistical analysis" given that it accounts for offense, defense and pitching in runs and wins, adjusting for park and league scoring levels, and that it takes into account both career and peak performance.

There are no accurate measurements of "offense", "defense", "pitching". There are only measurements of specific results for players who performed in a wide variety of circumstances. At best, they are crude measures of performance. In essence, you are trying to calculate differences to the third decimal point starting with numbers rounded to the nearest 100. And each calculation multiplies the uncertainty of the results.

If you want to compare Alex Rodriguez to Nick Punto, you don't need these calculations. And, if you want to compare Torii Hunter and Aaron Rowand I doubt it has any meaning at all. Is there any real difference between 30th and 45th rankings in VORP? Does anyone who uses VORP know the significance? I doubt it.

Regarding Mark Armour's point that no starting pitcher but three relief pitchers born after 1947 has been elected I have a minor quibble: Dennis Eckersley spent 12 of his 24 major league seasons as a starter.

Still relief pitchers of a generation outnumbering starting pitchers in the hall is significant. It seems to me that being the only starting pitchers born in the 50s who have come close to election is a strong argument in favor of Blyleven and Morris. Has that argument been made, yet?

Err, make that "starting pitchers other than Eckersley..."

Has that argument been made, yet?

I had been working on that argument and was planning on writing about it but Jim Caple beat me to the punch yesterday. It is a good read.

When it comes to the HOF, there are three types of people. 1) the no-brain entries 2) the no-brain rejections. 3) those in-between

There is no real debate about the candidacy of guys like Cy Young, Willie Mays, etc. With these guys, the statistical crowd and the general public have no problems. The "steroid" era has caused some confusion in there, in that before BALCO or HGH, Barry Bonds and Mark McGwire might have fallen into that area (and maybe even Rafael Palmeiro).

As far as the number 2 group goes: I guess out there some people believe that Todd Stottlemyre was a HOF person, father, husband, whatever, but the chance of finding someone not related who believes that Todd Stottlemyre belongs in the Baseball HOF is remote.

That means that the primary HOF discussions revolve around the guys in the number 3 group. That's where Bert Blyleven and Tommy John and Catfish Hunter and even Tom Glavine and the like go. Everyone disagrees about those guys. Analysts disagree, sportswriters disagree, everyone disagrees. So why are some in and some not? Because it's a popularity contest, so the more people who think he should be in (for whatever reason), the more likely that the guy is enshrined. Do numbers play a part? Sure. Does popularity? Playing on a winner, being a nice guy who's good to sportswriters and his mother? They are all factors. And they are all right.

When Mark talks about the bloggers who flat out state that voters are morons, he's talking about the FireJoeMorgans of the world. There are plenty of those out there to cite.

The more popular Internet-writers like Neyer obviously don't go nearly as far as FJM, but a 1000-word absolute-shredding of someone like Shaughnessy's work obviously doesn't paint him in a favorable light.

The more popular Internet-writers like Neyer obviously don't go nearly as far as FJM, but a 1000-word absolute-shredding of someone like Shaughnessy's work obviously doesn't paint him in a favorable light.

No, it does.

Shaughnessy has nothing but contempt for those who think differently than him.

"It just means it has little if any meaning. If you added Total Bases to RBI's and divided by games played and called it Power Quotient you would probably end up with a list of some of the game's best power hitters. That doesn't mean the number has any real meaning."

Define "real meaning." Further, that list would be flawed, IMO, because RBIs don't mean much when removed from the context of RBI chances (how many ducks were on that pond?). But yeah, I'm sure the list would be full of guys with good power. If power is all you want to look at, maybe such a metric (modified to adjust for RBI opportunities) would be useful. More useful than Isolated Power (Slugging % - Batting Average)? I don't think so, but feel free to argue the point if you'd like.
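For anyone following along, Isolated Power as defined above is simple enough to sketch in a few lines. This is only an illustration: the function name `isolated_power` is made up here, and the inputs are the 2007 AVG/OBP/SLG lines for Hunter and Rowand quoted earlier in this thread.

```python
def isolated_power(slg: float, avg: float) -> float:
    """Isolated Power = slugging percentage minus batting average."""
    return round(slg - avg, 3)


# 2007 lines as quoted earlier in the thread: (AVG, OBP, SLG)
hunter = (0.287, 0.334, 0.505)
rowand = (0.309, 0.374, 0.515)

hunter_iso = isolated_power(slg=hunter[2], avg=hunter[0])
rowand_iso = isolated_power(slg=rowand[2], avg=rowand[0])

# The "40 points of OBP" mentioned earlier is just this difference.
obp_gap = round(rowand[1] - hunter[1], 3)

print(f"Hunter ISO: {hunter_iso:.3f}")  # Hunter ISO: 0.218
print(f"Rowand ISO: {rowand_iso:.3f}")  # Rowand ISO: 0.206
print(f"OBP gap (Rowand - Hunter): {obp_gap:.3f}")  # OBP gap (Rowand - Hunter): 0.040
```

On these single-season numbers Hunter shows slightly more isolated power while Rowand reached base more often, which is the sort of comparison where one number by itself doesn't settle much.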

"How many people even know that it only includes offense?"

Anyone who bothers to read the definition (quoted later in this post).

"Then what are you holding it up as? It does not seem to have any other use - especially since you admit you don't really know what it means "under the hood". If you look at the "uber-stats" they really all claim to measure the same thing - the comprehensive value of a player."

I've said what I'm holding it up as: a pretty good measure of a player's offensive value - nothing more. Superior to things used by others, in many cases. Not perfect, but good. I've said it a number of times.

This apparently isn't worthwhile to you. To me, it is, especially compared to prattling on about batting average or how Joe Ballplayer is just so damned *gritty*.

I admit I don't know if VORP makes adjustments for park effects (if it doesn't, I'd count that as a weakness). So yes, I do not have a full understanding of the mechanics under the hood or in the black box. For that reason, coupled with others I've mentioned, VORP is not something I look at and figure I know all there is to know.

VORP does *not* claim to provide a comprehensive value of a player. That would be silly for a system that doesn't incorporate defense.

I'll quote the definition, straight from BP:

"The number of runs contributed beyond what a replacement-level player at the same position would contribute if given the same percentage of team plate appearances. VORP scores do not consider the quality of a player's defense."
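Read literally, that definition boils down to a subtraction: the player's runs minus what a replacement-level player would produce in the same playing time. Here is a rough sketch of just that arithmetic. It is not BP's actual formula: the function name, the runs input, the replacement rate per plate appearance, and every number below are hypothetical placeholders, and the positional and park adjustments are not modeled at all.

```python
def value_over_replacement(player_runs: float,
                           player_pa: int,
                           replacement_runs_per_pa: float) -> float:
    """Runs contributed beyond a replacement-level player given the same
    number of plate appearances (offense only, no defense)."""
    replacement_runs = replacement_runs_per_pa * player_pa
    return player_runs - replacement_runs


# Hypothetical inputs, for illustration only.
example = value_over_replacement(
    player_runs=110.0,             # made-up estimate of runs contributed
    player_pa=650,                 # made-up plate appearances
    replacement_runs_per_pa=0.10,  # made-up replacement-level rate
)
print(f"Runs over replacement: {example:.1f}")
```

The point of the sketch is only that the result depends heavily on where the replacement rate is set, which is the "theoretical construct" part of the discussion above.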

As for concocting your own list - that's fine, anyone can, sure. I agree that most people can figure out that ARod is good and Nick Punto isn't. Where it gets interesting is debating the relative value of two players who are (or seem to be at first) similar in performance. And, for offensive performance, I think VORP is useful - at least as a starting point for discussion.

Please, though, I don't want people to come away with all this just thinking "bloggers are mean." I wrote 2500 words. One of my points, of many, is that people in this debate, to use Rob's term, attack a person's motives or intellect or honesty, rather than attacking their argument. Whether this is mean or not is up to you to decide--but it is certainly ineffective, unless your goal is to throw some raw meat to the people who already agree with you.

Another point is that people are more likely to see this happening if they disagree with the writer than they do if they agree with the writer.

This is certainly not about taking sides. Baseball Analysts is my favorite baseball site.

Another point is that I cannot compose simple sentences on this site without typos, wrong words, and poor spelling. I am really smarter than this guys, I swear.

I checked. VORP *does* factor in park factors.

Mark - I disagree with your entire article, but I very much enjoyed reading it. It was well thought out, and well written. Great job.

Speaking of FireJoeMorgan, there's a pretty good shredding of a "Harold Baines for the HoF!" article up. Over the top? Yeah. Unnecessarily nasty? Definitely.

FJM isn't the rule, though. It tops the charts in SORB (Snark Over Replacement Blog).

Please, though, I don't want people to come away with all this just thinking "bloggers are mean." I wrote 2500 words. One of my points, of many, is that people in this debate, to use Rob's term, attack a person's motives or intellect or honesty, rather than attacking their argument. Whether this is mean or not is up to you to decide--but it is certainly ineffective, unless your goal is to throw some raw meat to the people who already agree with you. (bold emphasis mine)

While I agree with you that in general the vitriol goes over the line in attacking subjective arguments at times (mostly on sites like FJM, where that is what they do), I think part of the issue is that, because the arguments are subjective, the author and argument become intertwined.

If some theoretical author says, "David Eckstein is invaluable for his hustle, and I don't care what fancy stats say." How can one dispassionately argue that?

"If some theoretical author says, "David Eckstein is invaluable for his hustle, and I don't care what fancy stats say." How can one dispassionately argue that?"

Well, you could ignore it, which is likely what I would do but is probably a boring answer. :-)

You could write generally about the role of hustle in baseball and what you think of it, suggesting (for example) that most of his hustle is captured perfectly well by his statistics.

Or you could say. "Theoretical author likes and admires David Eckstein, and I think that is great. However, the claim that these admirable qualities have intrinsic value over and above Eckstein's statistics doesn't really hold up under scrutiny."

Or you could say. "Theoretical author is a typical old fossil whose appreciation of the game petrified about 1958."

Fair enough, Mark. However, I think that, in general, serious statistical writers (sites like FJM being comedic rather than serious) do take the tack of your first two responses.

Again, I agree with the general point that you make; however, in reading your article, it seemed to "feel" (how's that for a subjective argument? :)) that you only pointed the finger in one direction when it's clear that it can be pointed in both directions (and *maybe* a bit more in the other :) ).

A vast, vast, vast majority of stats writers (and bloggers like Joe Pos who you pointed out) feel that both statistical and anecdotal evidence are valid. Reaction to certain writers gets heated, I feel, because they dismiss statistical analysis as unimportant and meaningless. Except of course when the stats support their subjective evidence.

All that being said, I appreciate your article for the discussion that it opens up. Well done.

Define "real meaning."

It actually measures something. That number doesn't, but if you can't see that I understand your fascination with VORP.

I've said what I'm holding it up as: a pretty good measure of a player's offensive value

But it isn't a measure of a player's offensive value is it? Alex Rodriguez's offensive value is no less because he is playing third base. But VORP compares players to a theoretical replacement player who plays the same position. Rather than measuring a player's "offensive value", it more accurately measures a player's roto value where every shortstop with x-number of games is equal for purposes of the game.

Where it gets interesting is debating the relative value of two players who are (or seem to be at first) similar in performance

And where the differences in VORP are completely meaningless as the Hunter/Rowand example shows. And that example includes players at the same position.

at least as a starting point for discussion.

It isn't a starting point. It's an end point. You say VORP, I say WARP and neither one of them measures anything tangible. They are just concocted numbers, like my "Power Quotient". You just argue about whose number is better instead of about the baseball players.

Reaction to certain writers gets heated, I feel, because they dismiss statistical analysis as unimportant and meaningless. Except of course when the stats support their subjective evidence.

Where are these writers who dismiss statistical analysis? Virtually all those writers will use batting average, RBIs, wins, stolen bases, ERA. How are they more dismissive of statistical analysis than those that dismiss those statistics?

The question is not whether to use statistics but whether to use measurements of tangible results or creative concoctions that don't really measure anything tangible?

Worse is that many of the people using even simple statistics don't really understand what they mean. Take K/9 as an example. It measures what percentage of outs a pitcher gets by strikeout. That is not the same as "how often" a pitcher gets a strikeout, which would be K/batter faced.
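
To make the K/9 distinction above concrete, here is a minimal Python sketch. The two pitchers and their totals are hypothetical, invented purely for illustration: both record 200 strikeouts over 600 outs (200 innings), but one faces more batters because more of them reach base.

```python
def k_per_9(strikeouts: int, outs_recorded: int) -> float:
    """Strikeouts per nine innings; nine innings is 27 outs."""
    return 27 * strikeouts / outs_recorded


def k_per_batter(strikeouts: int, batters_faced: int) -> float:
    """Share of batters faced who struck out."""
    return strikeouts / batters_faced


# Two hypothetical pitchers with identical strikeouts and outs,
# differing only in how many batters they faced.
stingy = {"so": 200, "outs": 600, "bf": 780}
generous = {"so": 200, "outs": 600, "bf": 900}

for name, p in (("stingy", stingy), ("generous", generous)):
    print(name,
          round(k_per_9(p["so"], p["outs"]), 2),
          round(k_per_batter(p["so"], p["bf"]), 3))
```

Both pitchers come out with the same K/9, while the share of batters struck out differs, because K/9 divides by outs and ignores the batters who reached base.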

Or the folks that think "Isolated Power" represents how much power a hitter has since that's what it's called.

Kyle: Your criticism (that I only called out one side for snideness) is well taken, and was made by a couple of other people off-line. My reasoning, perhaps flawed, was that the people I am largely speaking to on this site are already well aware of the other snideness, and it is called out often on the web. To the extent that this reasoning made it appear I was taking sides, I regret it.

To your point that the vast majority of bloggers consider both statistical and anecdotal arguments, I do not want to speak for anyone. I will say, however, that such thinking should lead to a less confident conclusion, and everyone ought to be aware of that. Small differences of opinion on really basic things like peak-career, post-season importance, the proper size of the Hall, winning teams, not even considering more controversial things like leadership, can create hugely different ballots.

What is it with the strawman that I'm "fascinated" by VORP? I brought it up because you were ranting about "catchily named" stats. I felt the need to provide an example of such a stat and discuss it. VORP was just one possibility. Oy.

"But it isn't a measure of a player's offensive value is it? Alex Rodriguez's offensive value is no less because he is playing third base. But VORP compares players to a theoretical replacement player who plays the same position. Rather than measuring a player's "offensive value", it more accurately measures a player's roto value where every shortstop with x-number of games is equal for purposes of the game."

It's a measure of a player's offensive value over a certain baseline ("replacement level"). RL is a proxy for "oh shit, ARod broke his leg, call up hesucksky from AAA." The positional adjustment gives extra credit to players who play difficult defensive positions at which offense is generally less abundant, such as SS, CF, 2B, C and to a lesser extent 3B. The basic theory behind that is that, in a pinch, it's easier to find a guy you can stick at 1B who will hit a little bit than it is to find a guy who can play SS and hit, which seems reasonable enough to me. You disagree?

I've never played "roto" so I'm not sure what you mean by roto value. A guy who plays all 162 games and racks up a lot of plate appearances will have a higher VORP than a player who is slightly better by rate but who misses a bunch of games, it's true. VORP is a counting stat, not a rate stat. That leads me to wonder how good it is at measuring a catcher's offensive value, and is one of the reasons I don't just go look at VORP and then shut my mind off.

"And where the differences in VORP are completely meaningless as the Hunter/Rowand example shows. And that example includes players at the same position."

How does the Hunter/Rowand example show that? Rowand was a better hitter than Hunter last season. VORP reflects that, and quantifies it (Rowand: VORP of 52, Hunter, VORP of 39).

"It isn't a starting point. It's an end point. You say VORP, I say WARP and neither one of them measures anything tangible. They are just concocted numbers, like my "Power Quotient". You just argue about whose number is better instead of about the baseball players."

Speaking of strawman arguments...

It's a starting point for me. I don't say VORP! and rest my case, when discussing a player. Never have, never will.

I don't know where you get that idea from. As for being "concocted" - is this shorthand for anything more complicated than batting average? Is batting average tangible to you? It's a number. It requires a formula to calculate. Is it therefore concocted and totally meaningless?

I don't usually spend time arguing about the relative virtues of performance metrics. In fact, this is probably the only time I have (or will), and it was merely for the purpose of trying to get you to flesh out what your problem(s) with such stats is/are.

I am guessing that Jay actually does not heartily apologize, and he has no reason to. On the chance that this comment was directed at me...

It was directed at you, Mark, but I think my sarcasm fell flat when juxtaposed with my reply to Ross. I appreciate the kind words and the feedback your piece has provided.

It's a measure of a player's offensive value over a certain baseline ("replacement level").

No it isn't. Whatever it measures, it is being compared to a variety of different baselines depending on what position a player played most.

The basic theory behind that is that, in a pinch, it's easier to find a guy you can stick at 1B who will hit a little bit than it is to find a guy who can play SS and hit

Again, is Rodriguez not as good an offensive player because he is playing third instead of shortstop? No. He produces the same offense either way.

VORP is a counting stat

What does it count? Theoretical runs?

I don't say VORP! and rest my case, when discussing a player. Never have, never will.

That is exactly what you did above. And since you don't seem to have a clear idea of how it is calculated, how is it even a starting point when you don't know where it is you are starting?

If someone says batting average is a starting point, I think most people here can tell you what is missing, whether it's at bats, walks or power. Where do you go from VORP?

As for being "concocted" - is this shorthand for anything more complicated than batting average?

No, it's shorthand for complicated formulas that claim to have a meaning that is not apparent from the calculation. Concocted stats claim to be "objective" but in fact contain a variety of subjective judgments which are not apparent in the statistic.

Rowand was a better hitter than Hunter last season. VORP reflects that, and quantifies it (Rowand: VORP of 52, Hunter, VORP of 39).

VORP claims he was a third better, which he clearly wasn't. And by most measurements, Hunter produced more offense than Rowand.

BTW here is another description of VORP:

"Value Over Replacement Player (and rate) -- developed by Keith Woolner of Stathead Consulting, VORP measures how many runs a player would contribute to a league average team compared to a replacement level player at the same position who was given the same percentage of team plate appearances as the original player had."

In other words, it has nothing to do with a player's actual offensive value. It is based entirely on a statistical model. So you end up with Rowand a third more valuable than Hunter, all evidence to the contrary.
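
The description quoted above can be illustrated with a toy calculation. This is a deliberately simplified sketch, not Woolner's actual formula: it assumes a player's offense is already expressed in runs, and that a replacement player at each position produces runs at a fixed rate per plate appearance. The function name and every number below are hypothetical.

```python
def runs_over_replacement(player_runs: float, plate_appearances: int,
                          replacement_runs_per_pa: float) -> float:
    """Runs produced beyond what a replacement would produce in the same PAs."""
    return player_runs - replacement_runs_per_pa * plate_appearances


# Hypothetical replacement rates (runs per plate appearance) by position.
# The shortstop baseline is set lower because offense is assumed scarcer there.
REPLACEMENT_RATE = {"SS": 0.10, "1B": 0.12}

# The same hypothetical batting line, 100 runs in 650 plate appearances,
# measured against two different positional baselines.
as_shortstop = runs_over_replacement(100, 650, REPLACEMENT_RATE["SS"])
as_first_baseman = runs_over_replacement(100, 650, REPLACEMENT_RATE["1B"])

print(f"as SS: {as_shortstop:.1f}, as 1B: {as_first_baseman:.1f}")
```

Under these assumptions the identical batting line scores higher against the shortstop baseline than against the first-base baseline, which is the positional effect both sides of this exchange are arguing about.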

Further, that list would be flawed, IMO, because RBIs don't mean much when removed from the context of RBI chances (how many ducks were on that pond?)

BTW - Hunter had more RBIs than Rowand with fewer plate appearances with runners on base and fewer with runners in scoring position. Rowand had 26 plate appearances with the bases loaded and drove in 15 runs. Hunter had 18 plate appearances and drove in 19 runs.

How does VORP capture that in its measurement of "value"? The answer is it doesn't.
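
For reference, the bases-loaded figures quoted above work out as follows, taking the numbers in the comment at face value (they are not independently checked here).

```python
# Bases-loaded plate appearances and runs driven in, as quoted in the comment.
bases_loaded = {
    "Rowand": {"pa": 26, "rbi": 15},
    "Hunter": {"pa": 18, "rbi": 19},
}

# Runs driven in per bases-loaded plate appearance for each player.
rbi_per_pa = {name: line["rbi"] / line["pa"] for name, line in bases_loaded.items()}

for name, rate in rbi_per_pa.items():
    print(f"{name}: {rate:.2f} runs driven in per bases-loaded plate appearance")
```

On these figures Hunter drove in roughly one run per bases-loaded plate appearance and Rowand a bit more than half a run. The samples are small (18 and 26 plate appearances), so the gap should be read with that in mind.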

It's silly to consider being the best #7 hitter in baseball a particularly valuable skill. Players batting below 5th in the lineup almost all do so because they aren't as good as somebody with a comparable skill set (high OBP, high SP, or high OPS) who bats higher. On a great team, a good or even great hitter might wind up batting seventh, but that's because there are six better hitters ahead of him. If he were playing for a normal team, he'd be batting higher.


But it is useful to consider VORP by position. Every player at every position in baseball is there because they either field adequately and hit well (Ernie Banks), field well and hit adequately (Ozzie Smith), or some combination of the two. When a great hitting mediocre fielding shortstop (like Banks) loses enough defensive skill (or the team acquires a better fielder) then if the offense is high enough, the player moves down the defensive spectrum. So comparing players to who would replace them at that position is valid. Mike Piazza went from a catcher with some defensive value (having an excellent Catcher's ERA) despite his weak throwing arm, to a catcher with lousy defensive value but still high VORP offense, and finally to a terrible catcher with marginal DH skills, all in context of being one of the very best hitting catchers of all time.


Is Jorge Posada a better hitter than Albert Pujols, who is just behind him on the VORP list? Of course not. But it would be relatively easy for most teams to find a #4 outfielder or a regular pinch hitter or even a minor league first baseman who could probably put up an 800 OPS if Pujols went down. Finding a catcher who could provide adequate or better defense and an 800 OPS is hard. And that's why the Hall of Fame will never be based upon any one statistic, unless it's one of those defining X hundred or Y thousand that always guarantees entry: VORP can't justify Ozzie Smith. But VORP does help to put the kind of season Posada had in perspective, and maybe help us identify a player who didn't lead his league in anything important offensively, but did very well for his position.


The Hall of Fame tends to underrepresent the defensive positions because those players don't hit as well. It's hard for catchers to retain their speed, squatting all the time, and thus they'll get fewer leg hits and stolen bases. It's harder for middle infielders to maintain the range and quickness necessary to be an Ozzie Smith if they bulk up enough to hit 20 dingers. VORP helps us there. It's not the only stat, and I certainly wouldn't want to lean on it for someone like Derek Jeter, who has become dreadful defensively. But Jeter has built his own HOF resume and will get consideration regardless of his VORP.

A general point. I came to sabermetrics late in life (I am 65), at first because I loved the wit and iconoclasm in the writing, but increasingly because it made baseball discussions meaningful. In the traditional mode I grew up with, discussion meant essentially hurling opinions at each other. Who was better, Mantle, Mays or Snider? That was the question of my youth, and the debates were not debates at all, just random references and observations. After a while, it was simply boring and pointless.

To me, the essence of the sabermetric approach is that there is no orthodoxy, no "book" with all the answers. Rather, every hypothesis is open to question and revision, and even those that seem most sacred and generally accepted are subject to scrutiny and disciplined criticism. It means that there is responsibility and continual reassessment. Naturally, as with any movement, some become extremists, intolerant of heresy, but the whole point of sabermetrics, it seems to me, is to encourage heresy, and the people drawn to it seem by nature to be skeptical of absolutes.

I think the reception of Voros McCracken's ideas about the extent to which pitchers control the results of balls in play is an excellent example of the sabermetric approach. I love the notion itself, but from the beginning was skeptical since it always has seemed to me that pitchers do differ in their ability to create bad swings and weakly hit balls, as well as to generate grounders. Over the years, the idea has spawned a widespread discussion with many revisions. I doubt any final truth has been discovered, but it certainly has given us all new insights into the issue. I have never seen anything in the discussions of traditionalists that engenders a similar curiosity or growth of understanding. All I see is the application of received wisdom to new circumstances essentially to support a pre-conceived notion. So one team wins the World Series because of youthful enthusiasm and immunity from pressure while another loses it because of lack of experience. (By the way, applied even when the facts deny the very premise in some cases.)

Again, quite naturally, some will be strident or arrogant. I happen to like FJM because I often agree with their evisceration of particular journalists, but I also often cringe at some of their cruder snarkiness and narrow view (sometimes) that one statistic closes their case. I am less amused when I disagree with them, as I do violently on their steroids stance for example. So I can empathize with those who find them intolerable even while I enjoy their misery.

But in the end, when Neyer and James disagree, there is a reason to read both and assess their arguments. Should Shaughnessy and Chass disagree, I simply have no interest in trying to understand why because there is nothing there to discuss. It is simply two people expressing opinions, perhaps with a bit more literacy than some posters on the Heater, for example, but with little more credibility.

Thanks for the thoughtful response, Bob. I think you nailed it.

As Bill James told me over breakfast at the Winter Meetings in Anaheim in December 2004, "Good sabermetrics respects the validity of all types of evidence, including that which is beyond the scope of statistical validation." All of us (both inside and outside of the sabermetric community) would be wise to remember that.

James was neither a "statistician" as he has been called nor a "stathead" as many of us have been termed. Sure, Bill was known for debunking baseball's conventional wisdom through the use of statistical evidence but his innovations like the Defensive Spectrum are about organizing *concepts*, not *stats*. His greatest contribution of all is teaching us the importance of dealing with *questions* rather than *answers*.

I would like to take this time to thank Mark and our readers for the spirited exchange. For the most part, I believe it was instructive and healthy. Congratulations to almost all of you.

For the record, I have asked a certain reader to refrain from participating in the comment threads in the future. Although I encourage dissenting opinions (the article itself is Exhibit One in that testimony), I find it a waste of time to wade through contrary viewpoints when they are nothing more than an opportunity to pick a fight with another reader. That is neither instructive nor healthy.

The purpose of Baseball Analysts is to inform, entertain, and engage a reader base that is among the most intelligent in the world of baseball. In order to remain true to this founding principle, I have found it necessary on a couple of occasions to take a stand like I did today for the long-term beneficial interests of the site and its readers. I hope and trust that everyone understands. Thanks again.

I'm late to this party, but I do want to add that I thought the article and the following comments were a fun read (though I skipped the VORP debate). Way to stimulate a discussion, Mark.

As Mark says, these Hall of Fame debates (and they are debates) get intense because of their timing. We love talking baseball, this is the Internet, and everything gets dissected to the bone cause there's nothing else to talk about. I both love it and eventually get sick of it.

However, I don't think Mark picked the two best examples to illustrate his points. I have found that Jay and Rob are both extremely aware of baseball's history and the importance of considering factors that can't be quantified. They've even written about it. And I don't think either one of them has been inappropriately snide. Peter Gammons has been more snide than either of them.

Regarding Mark's second point:

"Analytical HOF debates generally include a lot of assumptions about the purpose of the HOF, which is the cause for most of the disagreement, IMO."

Do analytic writers make assumptions any more than non-analytic ones? Of course not. As someone said, the best analytic articles bring those assumptions to the fore and make them explicit -- Rob and Jay being two great examples. Isn't that a good thing?

I know Mark cited their work as examples of the high HOF benchmarks being used in these debates (high standards that may well be justified, BTW, considering that these are only BBWAA votes we're talking about, not Veterans Committee choices) but they were the only two examples he cited. So, if there are other examples of his points (hopefully, not me!!!), I don't know what they are and I can't tell whether I agree with him or not. On the surface, I don't.

Re Blyleven: I don't know how this enters into it at all, but his color work for the Twins is pretty good in comparison to his peer group. Blyleven and partner are entertaining and make the game fun to watch and listen to. I'll qualify that the evaluation is strictly from an entertainment perspective, not based on knowledge of stats and metrics.

On Rice being at 72%, with the better metrics we look at now we can see that Rice was probably not a HOF level player. But nothing we tell those writers now is going to change their memory of not just watching all those games, but believing for that entire era that Jim Rice was one of the very best offensive forces in the game. Just about everybody believed that at the time. So to get overly philosophical, perhaps even if it's not true it should still count. Kind of the opposite of the 'if a tree falls in the woods' argument. Go ahead and crucify me now please.

Thanks for your piece, Mark. It does what good writing should do -- make us think, keep us on our toes, help us live more in questions than in answers.

As others have pointed out, however, I'm not sure Neyer and Jaffe make the best targets for your criticism. I read most of what these two gentlemen write, and they always strike me as tough-minded without being close-minded. And I know both of them incorporate "extra-numerical" stuff like intangibles, intuition, and anecdote (where appropriate) into their HOF arguments.

What's more, I'm persuaded by Neyer's point that Hall of Fame debates SHOULD be rough and tumble. On the news last week I saw something about a gathering in Oklahoma hosted by Michael Bloomberg to discuss the viability of third-party presidential candidates. The message of this conference seemed to be: stop all this partisan bickering! Can't we all just get along! And I wanted to say to these guys -- don't you realize that the Framers INTENDED us not to get along? Don't you know that bickering, checks, balances, even, yes, a little gridlock now and again, is precisely what the Constitution designed our system to do?

I think baseball debates operate much the same way. Sure, it'd be nice if everyone were more reasonable and dispassionate, but it doesn't hurt to draw a little blood now and again. It makes us better citizens.

But I think there is a difference between passionate, rough and tumble debate, even some blood-letting, and simple name calling or dismissive sarcasm. In fact, I think it possible to call an argument irrelevant, misleading, invalid, even foolish without applying similar terms to the people writing the argument.

In fact, it makes more sense to me to mock those "in-house" people with whom we often agree than those who approach issues from another perspective altogether. The former will usually get it and return the compliment in a way that is really bantering, while the latter just close up more. And the ad hominem attacks will tend to distract from the actual issues.

I understand the frustration of dealing with people who seem oblivious to reasoned discussion, and also the difficulty in ignoring them. (Of course, I am certain many with whom I disagree find me intolerably closed-minded.) But if no common ground can be found, that discussion must end up as a screaming match, so it probably is better to keep the focus on the argument and not the person, and to frame the argument in reasoned tones (taking the high ground, I suppose) to encourage those on the fence to consider those arguments more seriously.

I want to make another comment which may be colored by the fact that most of the people I am referring to are in the 50-80 year old generation. A few are considerably younger however, so perhaps what I have observed has some relevance.

I wonder if sites such as this overestimate the extent to which they have penetrated fans' consciousness. I am not talking about front offices but the general population. I am sure that book sales and site hits indicate widespread awareness, but is it so compared to the mass of the population?

I play ball with over 70 people. They are quite varied, and are certainly not stupid, but I have yet to find one who even has heard of sabermetrics. And I have asked directly. As a result, when we talk sports, and most have been ardent fans their whole lives with deep memories and full of traditional lore, I find our starting points miles apart.

For example, a few people were incensed that Pedroia won the ROY over Delmon Young, pointing out that Delmon was superior in nearly every statistical category. When I said the opposite was true, they were not just nonplussed but angry, and quoted home runs and RBIs to prove their point.

In my view, people with fixed views need to be approached gingerly and with respect. A no-holds-barred, attack mode can do no good. It's not a question of conversion, of arrogant missionary activity, but of trying to reframe the grounds of discussion little by little so that a comfort zone can be attained where disagreement is productive.

I agree, Bob. And I agree that Mark's piece is a useful corrective to much of the arrogance and closed-mindedness you sometimes read on blogs or in chatrooms. There's a difference between a debate and a screaming match.

My only point was that spirited discussions about anything people care about -- baseball, politics, whatever -- are bound to become messy and awkward at times, even shrill. And that's okay. It's all part of the process of finding common ground.

Anyone else think that Bob R. should write some material for this site? Or at least have a blog that I could visit?

Mark -- I think it's fair to say, after reading this comments section and the things written by Rob Neyer, Jay Jaffe, and others, that

1) You have materially misrepresented their positions.
2) Your comment regarding George Kelly being an "unappreciated star" were he on the same-era Phillies is simply absurd, and comes with no supporting argument. You don't even take up Bill James' principal complaint, namely that he got entry by way of his voting cronies.
3) I find this piece unpersuasive in the extreme, poorly thought out and poorly researched. It strikes me as a series of excuses, a sort of "the dog ate my homework" for those too lazy to actually concoct a real argument.

Bob,

I am a 17-year-old high school senior, and I run into the same exact problems you reference in your last comment. I often start by describing the stat I will reference before I actually say anything. For example: "wouldn't you like to know how many runs he contributed to the team above what an average player would have?" I refrain from using replacement players in normal conversation because it just makes it more confusing. And if you did write for this site, I would most certainly read what you have to say.