Abstracts From The Abstracts
Part Five: 1981 Baseball Abstract
The 5th Annual Edition is the last of Bill James' self-published Baseball Abstracts. It features a light-yellow cover with artwork by Susan McCarthy of a dozen baseballs in an egg carton with the handwritten inscription "Baseball Fever - Hatch It". Susie, as Bill calls her in the acknowledgements section, also created a silkscreen on the "same motif as the cover design". The 15" x 23" four-color serigraph, "suitable for matting and framing", was offered for sale at $12 ($9 cost, $3 for postage and handling).
James also lists price information on back issues, noting 1980 and 1979 are "still available" for $10 and $7, respectively. He also mentions doing a reprint of the 1978 and 1977 editions and writes: "If you want one--and, to be frank about it, I don't know why anyone would--then send me a check or m.o. for the appropriate amount, the check sent before September 1 but dated (please) September 10. On September 1 I will count the checks and reprint the number needed. These copies will be exactly like the originals except they will say 'REPRINT' on the cover."
Down below, James offers the 1977 book for $5 ("Virtually all numbers; only one essay remains interesting at this point") and the 1978 version for $8 ("Far better, but still largely outdated").
James opens the 1981 Abstract with a two-page "Dear Reader" letter devoted entirely to sabermetrics: what it is, how he defines it, and how it differs from sportswriting. He draws three distinctions:
1) Sportswriting draws on the available evidence, and forces conclusions by selecting and arranging that evidence so that it points in the direction desired. Sabermetrics introduces new evidence, previously unknown data derived from original source material.
2) Sportswriting designs its analysis to fit the situation being discussed; sabermetrics designs methods which would be applicable not only in the present case but in any other comparable situation. The sportswriter says this player is better than that one because this player had 20 more home runs, 10 more doubles, and 40 more walks and those things are more important than that player's 60 extra base hits and 31 extra stolen bases, and besides, there is always defense and if all else fails team leadership. If player C is introduced into this discussion, he is a whole new article. Sabermetrics puts into place formulas, schematic designs, or theories of relationship which could compare not only this player to that one, but to any player who might be introduced into the discussion.
3) Sportswriters characteristically begin their analysis with a position on an issue; sabermetrics begins with the issue itself. The most over-used form in journalism is the diatribe, the endless impassioned and quasi-logical pitches for the cause of the day--Mike Norris for the Cy Young Award, Rickey Henderson for MVP, Gil Hodges for the Hall of Fame, everybody for lower salaries and let's all line up against the DH. Sportswriting "analysis" is largely an adversary process, with the most successful sportswriter being the one who is the most effective advocate of his position. I personally, of course, have positions which I advocate occasionally, but sabermetrics by its nature is unemotional, non-committal. The sportswriter attempts to be a good lawyer; the sabermetrician, a fair judge.
For that reason, good sabermetrics respects the validity of all types of evidence, including that which is beyond the scope of statistical validation.
On the subject of sabermetrics, James intersperses the following remarks in the Player Ratings and Comments section:
Bad sabermetrics attempts to end the discussion by saying that I have studied the issue and this is the answer. Good sabermetrics attempts to contribute to the discussion in such a way as to enable it to move forward on a ground of common understanding.
Bad sabermetricians characteristically insist that those things which cannot be measured are not important, like Earnshaw Cook's incredible assertion that major league teams should play the best hitters available, more or less regardless of defense. Bad sabermetricians run from the monster in terror, and insist that he does not really exist, that there is only That Shadow.
Speaking of Cook, James' disdain for the mathematician-turned-author shows up later in the book: "Cook knew everything about statistics and nothing at all about baseball--and for that reason, all of his answers are wrong, all of his methods useless."
James also takes on Tom Boswell's Big Bang Theory and Total Average. He disproves Boswell's assertion that, in a significant majority of games, "the winning team will score more runs in one inning than the loser will in all nine". James says that "Boswell reached his conclusion by studying World Series games, and World Series games are not typical of regular-season games".
As to Total Average, James huffs, "The world needs another offensive rating system like Custer needed more Indians...What we really need, as I wrote three years ago, is for the amateurs to clear the floor."
I don't mean to sound harsh or negative about the work that Boswell has done. He is a first-rate writer, and I would happily say that he was a first-rate sabermetrician if I thought that any of you would believe it. If, like most of the nation's sportswriters, he had never developed a single idea about how baseball games were won, if he had never done a half-hour's research to check his idea, then I would not be criticizing him. It would hardly seem wise or fair to single him out for criticism because he did have a single idea, and he did do a half-hour's research, give or take ten minutes. The best ideas are those which have one saying, "Well, I wonder why nobody else ever thought of that?" Boswell has yet to come up with such an idea. But I would give a week's pay to have Boswell working for a KC newspaper, where I could read his stuff regularly. He's good.
James uses his Value Approximation method throughout the book, including the creation of a "Talent Balance Sheet" for each major league team as well as a spin-off version that he refers to as "Trade Value" (defined as years remaining multiplied by established value). James says he might call Trade Value "Estimated Future Approximate Value" except for the fact that "I hate acronyms".
No, they're alright in their place but they are too dangerous when you are writing about numbers; you can wind up saying that A has an EV of X and a PYR of Y and X times Y equals Z which is his EFAV and nobody knows what the hell you are talking about. Or cares.
"The Favorite Toy" essay was a culmination of James' search "for some way to estimate accurately a player's chances of attaining some particular career goal". As James saw it, "the most relevant issues in the question of whether A can X are:
1) Distance. How far away is A from X?
2) Momentum. How fast is A approaching X? And
3) Time. How long does A have to attain X?"
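The three factors can be combined into a single estimate. The sketch below is a hedged illustration, not the 1981 text: this article does not reproduce James' formula, so the constants used here (24, 0.6, 0.75, 1.5, and the 0.5 subtraction) are taken from the form of the Favorite Toy as it is commonly cited from later years and may not match the 1981 version. The function name, its parameters, and the simple clamp to the range 0 to 1 are assumptions made for illustration, and the sample player is hypothetical.

```python
def favorite_toy(age, current, goal, recent_seasons):
    """Estimate the chance that a player reaches a career goal.

    recent_seasons: totals for the last three seasons, most recent first.
    Constants follow the commonly cited later form of the method (assumed).
    """
    # 1) Distance: how far away is the player from the goal?
    need = goal - current
    if need <= 0:
        return 1.0  # goal already reached

    # 2) Momentum: weighted average of the last three seasons,
    #    but never less than 75% of the most recent season.
    s1, s2, s3 = recent_seasons
    established = (3 * s1 + 2 * s2 + s3) / 6
    established = max(established, 0.75 * s1)

    # 3) Time: estimated seasons remaining, with a floor of 1.5.
    years_left = max(24 - 0.6 * age, 1.5)

    # Projected remaining total relative to the need, minus one half.
    chance = years_left * established / need - 0.5

    # Simplification (assumption): clamp to [0, 1] instead of
    # applying any finer caps.
    return min(max(chance, 0.0), 1.0)


# Hypothetical 30-year-old with 1500 hits and seasons of 180, 170, 160.
print(round(favorite_toy(30, 1500, 3000, (180, 170, 160)), 3))  # about 0.193
```

Under these assumptions a player projected to get exactly what he needs comes out at 50%, and one projected to get half of what he needs comes out at zero.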
James provides a table of "Major League Players Who Have At Least A .01 Chance Of Getting to 3000 Hits". The top three players are Rod Carew (53%), Robin Yount (32%), and George Brett (31%). Seven players in total eventually reached the magical plateau, including the three above plus Eddie Murray (24%), Dave Winfield (13%), Paul Molitor (7%), and Henderson (6%). Pete Rose (1978) and Carl Yastrzemski (1979) had already reached the 3000 hit mark, while Wade Boggs, Tony Gwynn, and Cal Ripken had yet to make their major league debuts.
The total expectation for all of the players figured, from Rod Carew to Bill Russell, is 6.16; that is to say, this system would estimate that, out of all major league players now playing regularly but with less than 3000 hits, about 6 would eventually reach that mark. That is, I think, a very good guess.
With respect to breaking Hank Aaron's career home run total, James predicts that the player "will have to do it from ahead, because no one, comparing age levels, is going to catch Aaron from behind". James points out, "The best 5-year home run period of Aaron's career began when Henry was 35 years old, and that is without historical precedent. So if you're behind him at 33 or 35, forget it."
Although Barry Bonds was only 16 years old when James wrote the 1981 Abstract, I thought it would be interesting to compare Bonds' and Aaron's home run totals from age 35 on.
Age   Aaron   Bonds
35    44      49
36    38      73
37    47      46
38    34      45
39    40      27 (through 7/31/04)
Bonds has outhomered Hank, 240-203, from age 35-39 with a third of a season still to go. This is not meant to put James or Aaron down at all; rather, it is designed to show once again just how incredible Bonds has been these past five years.
James writes, "If I were to name five American League players who should win an MVP award, I would name Molitor, Murray, Parrish, Wilson, and Henderson." He tells us to "save the list and we'll see". Well, I'm here to tell you, Bill, that you picked one correctly (Henderson, 1990). To his credit, Henderson or Murray finished second in the A.L. voting the following three years. However, the closest that Parrish and Wilson ever came to winning the MVP award was 9th and 10th, respectively.
James further develops his ideas as to determining won/lost percentages in "Pythagoras and Logarithms". The essay includes a graph and several formulas, including the log5 equation. James even provides a table of the NFL standings, showing the actual and projected W/L records (the latter based on points and opposition points by the Pythagorean method). "Of the 28 teams, 21 are within 1 game of being correct; the standard error is 1.27 games." James goes on to say that the approach "could also be used to estimate the won/lost records of hockey teams...and to do any number of other jobs within the world of sports statistics".
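For readers who want to try the two formulas, here is a minimal sketch in their widely known forms. It is an illustration, not a transcription of the essay: the function names are assumptions, the default exponent of 2 is the classic baseball version, and the exponent James applied to the NFL points totals is not given in this article, so it is left as a parameter. The sample team is hypothetical.

```python
def pythagorean_pct(scored, allowed, exponent=2.0):
    """Expected winning percentage from runs (or points) scored and allowed."""
    s = scored ** exponent
    a = allowed ** exponent
    return s / (s + a)


def projected_wins(scored, allowed, games, exponent=2.0):
    """Expected wins over a schedule of the given length."""
    return pythagorean_pct(scored, allowed, exponent) * games


def log5(a, b):
    """Chance that a team with winning percentage a beats a team with
    winning percentage b. Undefined when both are 0 or both are 1."""
    return (a - a * b) / (a + b - 2 * a * b)


# Hypothetical team: 800 runs scored, 700 allowed, 162 games.
print(round(pythagorean_pct(800, 700), 3))      # about 0.566
print(round(projected_wins(800, 700, 162), 1))  # about 91.8
print(round(log5(0.600, 0.400), 3))             # about 0.692
```

Two evenly matched teams come out at exactly .500 under log5, and a team that scores as many as it allows comes out at .500 under the Pythagorean method, which is a quick sanity check on both.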
In "Other Voices", fellow sabermetrician Pete Palmer writes a letter to James, telling him, "I still like my formula best." James concedes the difference between the two is that "the Palmer formula tends to be slightly more conservative, and to give answers which are a little closer to .500".
James concludes in his essay on "Ability and Career Expectation" that the peak period for players is "more 26 to 30 than 28 to 32". He summarizes his finding by saying "very few players are still at their best at age 32". James also puts forth the notion that all players "move downward [as far as Offensive W-L%] and leftward [defensive spectrum] over time".
Elsewhere, James raises several questions about areas of performance that are not as easily quantifiable.
We do not know how many times each player was thrown out attempting to take an extra base. We do not know how many times each player gave away a base by throwing to the wrong one. We do not know how many hits Mark Belanger has robbed the opposition of over the years, how many doubles Greg Luzinski has given away. We don't have any idea how many runs Roberto Clemente prevented by keeping people at third on sacrifice flies. We couldn't even guess how many runs Mickey Cochrane saved his teams by knowing what pitches to call for, or Carlton Fisk. We do not know which or whether players are especially good in the clutch. And this is only the shadow of the monster; our whole ignorance is much larger than we can conceive of.
James was early in discovering that, much to his and everyone else's surprise, power pitchers were more likely to have "better durability" than control pitchers.
On the subject of pitching, James refutes the belief that "pitching is 75% of baseball" and suggests "about 35%" as the correct weighting. One of the most persuasive arguments James makes on behalf of hitting over pitching is as follows:
Another point which seems to me to be relevant is that the spread of occurrence of every single type of offensive incident is wider for hitters than it is for pitchers. No pitcher allows home runs as often as Mike Schmidt hits them, or as rarely as Duane Kuiper hits them. No pitcher allows opposition batters an average of as high as George Brett, or as low as that of the league's lowest average. No pitcher strikes out batters as often as Gorman Thomas strikes out, or as in-frequently (sic) as Brett strikes out. No pitcher walks batters as often as Gene Tenace walks, or as rarely as Rob Picciolo walks, even though "walks" are traditionally considered the province of the pitcher.
While sitting at the typewriter, James notices that "the lion's share of championships have been won by teams which play in pitcher's parks" and concludes by saying "more research, more research" when wondering if "there is an inherent advantage to a team which must force itself to learn to play the 1-run game that is often forced upon it by a low-scoring battle".
A section entitled "Joint Project" was basically a request put forth by James to code the pitching motions of all the major league pitchers. It was his "first reader-participation project". The letters used in the coding remind me of those proposed in the last chapter in The Neyer/James Guide to Pitchers, entitled "Pitcher Codes".
In a similar vein, James tells his readers he has a folder on his desk labeled The Baseball Analyst that for the last two years "has collected mostly dust".
The Baseball Analyst, if it is to be, is to be a journal of sabermetrics. I will edit it, and occasionally make comments or even small contributions, but 90 to 95% of it will be written by other people. People like you.
If you are interested, this is the way I'm going to set it up. The Analyst, of course, will not pay for copy. All people who contribute, whether they contribute a 5-page article or a paragraph, must also subscribe...The number printed will be exactly the number of subscribers. The Analyst would come out six times a year, and contain 20 pages an issue. The pages would look, in general, a lot like the earliest Abstract's (sic)--photo copied, staple bound. The cost: $12 a year.
The system is set up to avoid the possibility of the Analyst running in the red, because I just couldn't afford to carry the thing if it doesn't pay for its own way...If you are interested, send a check for $12 to "Bill James" or "The Baseball Analyst", drawn on an account that will still be active on August 1...I will put all of the checks, and all of the work, in that same dusty folder, cleaned out for the occasion. If, on August 1, that folder contains at least 50 checks and at least 40 pages of material--enough for the first two issues--then the Baseball Analyst will finally get off the couch. If it fails on either account, the checks will be returned to you, and the folder put away.
Next up: 1982 Baseball Abstract, the first of seven soft-cover annuals published by Ballantine Books.