Baseball Beat
July 18, 2004
Abstracts From The Abstracts
By Rich Lederer

Part Three: 1979 Baseball Abstract

The 3rd Annual Edition of the Baseball Abstract had the words "By Bill James" on the salmon-colored cover for the first time. The book was still held together by three staples. The page count rose from 115 the previous year to 120, but the 1979 version is about half the weight because, in another first, the text was copied on both sides of the paper.

For the second consecutive year, James wrote a "Dear Reader" letter (which graces the opening page of the book), signing it as Editor and Publisher along with a "Copyright Bill James 1979" and "All Rights Reserved" for the first time.

With a speed which would be considered quick for geologic changes, the Baseball Abstract shows promise of emerging from obscurity. The media is beginning to take note; I never much wanted to be famous, but if somebody offers rich. . .

. . .You will note, if you read carefully, that I often use mechanical metaphors. I am a mechanic with numbers, tinkering with the records of baseball games to see how the machinery of the baseball offense works. I do not start with the numbers any more than a mechanic starts with a monkey wrench. I start with the game, with the things that I see and the things that people say there. And I ask, "Is it true? Can you validate it? Can you measure it? How does it fit in with the rest of the machinery?" And for those answers, I go to the record books.

What is remarkable to me is that I have so little company. Baseball keeps copious records, and people talk about them and argue about them and think about them a great deal. Why doesn't anybody use them? Why doesn't anybody say, in the face of this contention or that one, "Prove it. Baseball's got a million records and if that is true you can prove it, so prove it." Why do people argue about which catcher throws best, rather than figure the catchers' records against base-stealers? I really don't know.

But that, essentially, is what I do. I hope you like it. But if not, your money will be expeditiously [my emphasis] refunded.

James writes a two-page essay on "The Defensive Record" with a sub-title "A word of explanation." He concludes that "(1) the more important measure of a player's defensive ability is not his fielding average, but his range factor, which is simply the number of plays per game that the fielder makes, and (2) the important measure of a defensive team is the percentage of all balls put into play against it that it can get to and make a play on."

James attempts to prove his point by stating that the "good defensive teams" based on Defensive Efficiency Record (DER) allowed fewer runs during the 1978 season than "poor defensive teams", whereas there were no clear patterns based on fielding averages. "DER, always, correlates well with W/L Pct."
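
Both measures reduce to simple ratios. The Python sketch below is only an illustration: it assumes a "play" means a putout or an assist, and it treats DER as plays made divided by balls put into play, which paraphrases the description quoted above rather than reproducing James' own worksheet. The function names and the sample numbers are hypothetical.

```python
def range_factor(putouts, assists, games):
    """Plays made per game; assumes a 'play' is a putout or an assist."""
    return (putouts + assists) / games


def defensive_efficiency(plays_made, balls_in_play):
    """Share of balls put into play that the defense converted into outs."""
    return plays_made / balls_in_play


# Hypothetical fielder: 300 putouts and 10 assists in 150 games.
example_rf = range_factor(300, 10, 150)

# Hypothetical team: 3,010 plays made on 4,300 balls put into play.
example_der = defensive_efficiency(3010, 4300)

print(round(example_rf, 2), round(example_der, 3))
```

Unlike fielding average, neither ratio involves errors directly; a fielder or team is credited only for the plays actually made.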

The form of the book is very similar to that of the prior year. James covers the National League East and West, then the American League East and West. Rather than providing tidbits on each team as in the last review, I am going to include James' most interesting commentary irrespective of its place in the book.

In response to questions about the accuracy of his numbers, James tells his readers that he sells two things, "an approach, a novel way of looking at the statistics which brings out insights you can't get otherwise, and the general truths which emerge from that."

The general truth is that Richie Hebner hit vastly better last year in Philadelphia than on the road. If it is important to you that difference might have been .324-.239 rather than .327-.236, or that those games started by Randy Jones may have seen only 49 double plays rather than 50, then I suggest you do two things: ask for your money back, and count them yourself. And have fun. . .

James writes an essay on "Guidry/Rice, A Primer of Stat Analysis" in which he compares the top two finishers in the A.L. MVP voting. James introduces the concept of runs created for the first time (and, in fact, lists the number of runs created for all the regulars along with monthly and season totals as well as home/road splits).

We begin with the offense. A hitter should be measured by his success in that which he is trying to do, and that which he is trying to do is create runs. It is startling, when you think about it, how much confusion there is about this. I find it remarkable that, in listing offenses, the league offices will list first--meaning best--not the team which has scored the most runs, but the team with the highest team batting average. It should be obvious that the purpose of an offense is not to compile a high batting average.

...There are two essential offensive statistics: on-base %, and advancement percentage (Total bases divided by Plate appearances). Other things tend to magnify or minimize the effects of those two; speed maximizes the effect of on-base percentage, timeliness maximizes the effects of advancement percentage.

James then provides the following formula for runs created:

(H + W - CS) x (TB + .7 SB) / (AB + W + CS)
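
As a rough check on the formula, here is a minimal Python sketch. The function name is hypothetical, and the batting line used for Jim Rice (677 AB, 213 H, 58 W, 406 TB, 7 SB, 5 CS) is an assumption taken from the public record for 1978, not from the Abstract itself. Outs are assumed to be AB - H + CS.

```python
def runs_created(h, w, cs, tb, sb, ab):
    """1979 Abstract formula: (H + W - CS) x (TB + .7 SB) / (AB + W + CS)."""
    return (h + w - cs) * (tb + 0.7 * sb) / (ab + w + cs)


# Assumed 1978 line for Jim Rice (public record, not listed in the article).
rice_rc = runs_created(h=213, w=58, cs=5, tb=406, sb=7, ab=677)

# Outs used, assumed here to be at bats minus hits plus caught stealing.
rice_outs = 677 - 213 + 5

print(round(rice_rc), rice_outs)  # prints: 148 469
```

Under those assumptions the result agrees with the 148 runs created and 469 outs cited for Rice in the Guidry/Rice comparison. Note that the formula is simply an on-base rate, (H + W - CS) / (AB + W + CS), multiplied by an advancement total, TB + .7 SB.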

James proceeds to tell us that "70 is about an 'average' runs created total for a full-time player. 80 is above average, 90 good, 100 or more excellent. These standards generally conform to those of the 'Run' or 'RBI' totals." He provides a table comparing "the results of these formula computations to actual run totals" for every team in the majors.

Of the 26 teams, 9 estimates are within 1%, 20 are within 3%, 25 within 5%. The big error is on California, and it is caused by an unusual number of HBP--60% more than any other team in the league, and a total over twice the league average.

Several runs created formulas were derived later, some of which incorporate hit by pitches and grounded into double plays. The reason James originally left out HBP is that "there aren't that many of them, the data is hard to come by, and it isn't worth the loss in simplicity", but he advises us to "use your common sense--if you're figuring Ron Hunt, count his HBP as walks." James leaves out GIDP "for the same reason I leave out RBI and runs scored--they are influenced by what the guy in front of you does. There is not an equal spread of opportunity."

With this information in hand, James calculates that Rice created 148 runs while using up 469 outs vs. 78 runs for an average American Leaguer, concluding that "Rice's superiority to the league average is therefore 70 runs (approximately 9 games)." Guidry, on the other hand, allowed 61 runs in 274 innings whereas an average American League pitcher would have allowed 129 runs over that span, concluding that "Guidry's superiority is therefore 68 runs--and Rice, by two runs, wins the award."
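
The arithmetic behind that first-pass comparison can be laid out in a few lines of Python. The variable and function names are hypothetical, and the runs-per-nine-innings figures are a derived illustration rather than something quoted from the Abstract.

```python
# Figures quoted above from the 1979 Abstract.
rice_created, league_hitter_created = 148, 78      # over the same 469 outs
guidry_allowed, league_pitcher_allowed = 61, 129   # over the same 274 innings
innings = 274

rice_margin = rice_created - league_hitter_created      # runs added
guidry_margin = league_pitcher_allowed - guidry_allowed  # runs saved


def runs_per_nine(runs, innings_pitched):
    """Runs allowed per nine innings (all runs, not just earned runs)."""
    return runs * 9 / innings_pitched


print(rice_margin, guidry_margin, rice_margin - guidry_margin)  # prints: 70 68 2
print(round(runs_per_nine(guidry_allowed, innings), 2),
      round(runs_per_nine(league_pitcher_allowed, innings), 2))
```

Both margins are measured against an average player, which is exactly the baseline James goes on to question.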

James admits that "it is, fortunately, not that simple...but there's a limit to how long anybody can think about this at one time, so take a break and I'll get back to the subject on page 83."

In Guidry/Rice Part II, James reminds his readers that:

...we were attempting to compare Rice and Guidry by comparing each to an average player, and calculating how many more runs the MVP candidate had saved or created. The first complication is the substitution level. Is it appropriate, in discussing the player's worth, to compare him to an 'average' player, or to some level below that, a so-called 'replacement level' at which a minor leaguer or the best available fringe player, in case of need, might fill in. It would be a lot easier to use the average, no doubt about it, but a lot better to use the replacement level.

James also discusses "park illusions," mentioning five effects: (1) dimensions, (2) playing surface, (3) configuration, (4) visibility, and (5) climate. He says "if you add all of those things up, their potential impact on statistics becomes so obvious that the arguments against such things tend to degenerate rapidly."

James then makes one final adjustment for defense "both ways." He adjusts Guidry's runs saved downward by suggesting that "defense is probably 70% pitching and 30% defensive play," applying a .7 factor to his replacement-level and park-adjusted runs saved total. He adjusts Rice's runs created upward by giving him "credit for 'saving' at least 7 to 10 runs" based on his leading A.L. left fielders in range factor (2.26) "in a park which has a small left-field area and in which, traditionally, left fielders have had low range factors" as well as his "excellent" fielding average (.989) and assists total (13 in 114 games).

The conclusion that James reaches is that "it was close and arguable--but Jim Rice was the MVP." I grab my Win Shares book at this point to see if James may have had a change of heart over the years and learn that Rice is credited with 36 (tops in the league) and Guidry with 31 (2nd). Although the inputs have changed, the bottom line is that Rice won out both ways--in James' more primitive attempt 25 years ago and in his more sophisticated and updated analysis.

Later, in "Guidry/Rice: A Post Script," James volunteers that "the purpose of this essay, of course, was not to put to rest the MVP debate as much as to introduce a variety of analytical theories and techniques that you might not be familiar with." He also discusses timeliness and clutch factor before bringing up Victory-Important RBI, a stat that gives more weight to RBI in winning close games than in blow-outs and no weight whatsoever in defeats. VI-RBI, as James calls it, lacks merit in my mind, and I'm glad it never took hold.

James also launches the novel idea that "there exists a spectrum of defensive positions, left to right, which goes something like this: first base, left field, right field, third base, center field, second base, shortstop", claiming that "each position is more difficult to play than the position before it."

In discussing the fact that Davey Lopes scored "only" 90 runs in 1978, James discloses a "generally accurate format for estimating how many runs a lead-off man will score" as follows: (Times on First x .35) + (Times on Second x .55) + (Triples x .80) + (Home Runs x 1.00). He adds that the discrepancies "can be explained by failings or bonuses from the offense behind them, the player's own speed or lack of it, and random deviation from chance."
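
The lead-off estimate is a plain weighted sum, sketched here in Python. The function name and the sample season are hypothetical; they are not Lopes' actual totals, which the review does not list.

```python
def estimated_leadoff_runs(times_on_first, times_on_second, triples, home_runs):
    """Weighted sum using the four coefficients James gives."""
    return (times_on_first * 0.35
            + times_on_second * 0.55
            + triples * 0.80
            + home_runs * 1.00)


# Hypothetical lead-off season: 200 times on first, 40 times on second,
# 5 triples and 10 home runs.
estimate = estimated_leadoff_runs(200, 40, 5, 10)
print(round(estimate, 1))
```

Each weight can be read as the approximate chance of scoring from that starting point, so a gap between the estimate and the actual run total points toward the hitters batting behind the lead-off man.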

In the case of Lopes, James says the discrepancy is due to Bill North batting second. North, whose only extra-base hits in his 110 games with the Dodgers that season were his 10 doubles, ranked "dead last in the league in isolated power, meaning that he just never scores a runner from first."

Moving along, James comments on the "shortage of third basemen in the Hall of Fame", the baddd [his word and spelling] choice of Bob Horner over Ozzie Smith as the Rookie of the Year, and "the nonsensical notion that a pitcher 'wins' or 'loses.'"

James also observes that "a team that improves dramatically in one season will almost always decline markedly in the next," "the largest element in shutting off the running game is not the catcher's arm...or the pitcher's move...but the pitcher's ability to throw strikes," and that "it is a mistake to try to build the pitching staff first" due to the fact that their careers are "in perpetual danger of coming to an abrupt end."

Long before any player was ever on pace to draw more than 100 intentional walks in a single season, James noted that "Rod Carew last year once swung at two pitches when he was being intentionally walked, trying to get the pitcher to throw him something he could reach." He writes that "this was once a common strategy", mentioning that Cap Anson and King Kelly "did that often." He finishes with "For some reason it isn't done anymore." Hmmm. Paging Barry Bonds, paging Barry Bonds...

In the section on the Kansas City Royals, James makes the assertion that they "are not the kind of team that traditionally has done well in a short, crucial series."

A short series favors power hitting, for reasons that I won't get into, and the Royals do not have that much power. They have always had a deep pitching staff with a deep bullpen--but a short series favors a team with outstanding front-line pitching. How often are you going to use your #5 starter in a World Series, anyway? A short series favors a team that runs the bases conservatively--the Royals run them aggressively.

In view of the recent thinking so heavily influenced by the playoff success of the Anaheim Angels and Florida Marlins, I find the opinions expressed by James 25 years ago quite interesting. Was James wrong? Or are the Angels and Marlins flukes? And what does this say about the Oakland A's approach and the validity of Billy Beane's staunchest critics?

Next up: 1980 Baseball Abstract

[Additional reader comments and retorts at Baseball Primer.]

Comments

I am loving this series Rich.

For what season did James test his Runs Created formula on the teams?

J.C. - James tested the RC formula on the teams for the 1978 season. He listed the results on page 30 of the 1979 Baseball Abstract. James calculated RC with and without SB and found that the difference was negligible. He even wrote, "the major leagues as a whole score no more runs than they would with no stolen base attempts at all".