Baseball Beat
March 06, 2006
The Great Discussion
By Rich Lederer

Sabermetrics. The term, which is derived from the acronym for the Society for American Baseball Research (SABR), was coined by Bill James in the 1980 Baseball Abstract.

A year ago I wrote in this letter that what I do does not have a name and cannot be explained in a sentence or two. Well, now I have given it a name: Sabermetrics, the first part to honor the acronym of the Society for American Baseball Research, the second part to indicate measurement. Sabermetrics is the mathematical and statistical analysis of baseball records.

James admitted to me in our interview in December 2004 that his original meaning was "not a very good definition." Bill said he had recently stumbled across an even worse definition in a dictionary ("the computerized use of baseball statistics") because "computers don't have anything to do with it." The Senior Baseball Operations Advisor for the Boston Red Sox was pleased to learn that he said "good sabermetrics respects the validity of all types of evidence, including that which is beyond the scope of statistical validation" in the 1981 Baseball Abstract.

I'm glad to know I wrote that back then. In the wake of Moneyball, some people have tried to set up a tension in the working baseball community between people who see the game through statistics and people who see it through scouting. There is no natural tension there. There's only tension there if you think that you understand everything. If you understand that you're not really seeing the whole game through the numbers or you're not seeing the whole thing through your eyes, there is no real basis for tension and there's no reason for analysts and scouts not to be able to talk and agree on things.

A year after The Great Debate, hosted by Alan Schwarz of Baseball America, I gathered three top baseball minds in the hopes of advancing the discussion beyond the idea that sabermetricians are nothing more than statheads. Joining me today are Tom Tango, Mitchel Lichtman, and Eric Van.

Tom (aka Tangotiger) and Mitchel (MGL), along with Andy Dolphin, recently published The Book: Playing The Percentages in Baseball. The Book is aimed at coaches, managers, and front office executives, as well as baseball fans interested in strategies such as batter/pitcher matchups, platooning, the sacrifice bunt, base stealing, and much more.

All three of my guests are noted sabermetricians. Tom works full-time in computer systems development and part-time as a consultant to professional sports teams (currently in the NHL, formerly in MLB); Mitchel is currently a senior analyst for the St. Louis Cardinals; and Eric, whose lifelong dalliance with sabermetrics turned serious in 1999 when he started posting analysis to Usenet, was hired as a consultant in 2005 by the Boston Red Sox after his work on Sons of Sam Horn caught the eye of John Henry and his management team.

Please feel free to pull up a chair, listen in, and enjoy.

Rich: It's March 2006. Nearly 30 years have passed since Bill James wrote his first Baseball Abstract. The sabermetric community has grown significantly in numbers and respect over the last few decades. Our voices are now being heard more than ever. Let's take a few minutes to assess where we've been, where we are, and where we're going.

Mitchel: Wow, that's a heck of a question to start off with, Rich! It sounds like the topic for an entire book (by Bill James, no doubt)! The way I see it, there has been an evolution of sorts on two fronts. One is with the technology/information itself. We know and understand infinitely more about baseball (in a sabermetric sense) than we did 30 years ago. Two is with the acceptance and use by the fans, the media, and the teams themselves. The latter appears to be much slower and much more disjointed, for various reasons.

As far as the future is concerned, I anticipate that teams "jumping on the sabermetric bandwagon," if you will, will continue to accelerate at a rapid pace. As far as the information and technology is concerned, I anticipate that the evolutionary pace will slow down considerably. In certain "industries" there is a limit to the amount of information/understanding that can be gleaned. Sabermetrics, and the game of baseball in general, is one of those industries, I believe. Sabermetrics is more like "trigonometry in mathematics" than "computers in science." With sabermetrics, as with trigonometry, you create a number of theories, constructs, and paradigms, and then you move on to something else. We are not quite ready to move on to something else, but we are close, in my opinion.

Tom: Where we've been? Tons of ideas from tons of people, and tons of superb work produced. Where are we? Like in everything in life, you have a lot of people who contribute to the party, and you have a lot of people who insist on sticking their heads in the party to tell the partygoers that the party is lame. Where are we going? As the community keeps expanding, you will naturally get factions. And that's what's going to happen here.

I will disagree with Mitchel about the slowdown. If anything, it's going to accelerate. What should now happen with the data is that we'll be plotting everything in 4D. You will not only plot the exact location of all the fielders and the ball, but also do so in real-time, from the moment the ball is in the pitcher's hand until he gets it back. This kind of data is the gold mine that we've been looking for. The slowdown will happen if MLB and the data owners consider it more important that 30 analysts look at this data instead of 30,000.

Mitchel: Tom, I agree with you that "data in 4D" is one of the next (and exciting) frontiers, so to speak, and that there is lots of potential in analyzing that kind of data, especially on the defensive front. However, that is why I said that we are not quite ready to move on. And while we will never actually complete the sabermetric quest, I do think that the pace is slowing and will slow down considerably.

Eric: I don't see either a slowdown or an acceleration! Or rather, I see both. Punctuated equilibrium. I'll agree with Mitchel in that the progress we've made in many individual areas appears to be slowing down. And yet most such areas are amenable to breakthroughs, and breakthroughs spawn orgies of new work. We may see a future in which whole years go by with nothing too exciting happening, and other years when we're all piling gleefully on some new concept or approach, like we did with DIPS.

The reason why you're going to see this pattern is that so many of our problems are bedeviled by Bill James' fog. For instance, no one has ever shown statistically that hot and cold streaks are real, but you can watch the game and see that guys go through weeks where their mechanics are off and their numbers suffer. What's now clear to me is that this is real and important and still absolutely swamped by random variation. Manny Ramirez had one of his patented two-to-three week slumps last May and June and right in the middle had seven straight singles in Yankee Stadium, only two of which were hit hard. Tons of noise and a weak signal. Every so often, someone's going to make a breakthrough on one of these difficult problems, and we'll all be very busy and very happy for a while.

Mitchel: Eric, I disagree with your comments on "Bill James' fog." While I have never had a problem with his basic premise (that just because something cannot be measured or measured well, using certain statistical techniques, does not mean that it does not exist), I consider anything which cannot be measured or supports the null hypothesis with a high degree of certainty to be essentially a non-issue, at least in a practical sense.

For example, in our new book we actually do find the "existence" of a clutch-hitting skill. We also explain, however, that for all practical purposes, we might as well assume that there is none (which was the previous finding by most researchers). We also analyze "streak hitting and pitching" and similarly find a small level of predictive value, but again, nothing to write home about, and nothing that would have a whole lot of practical significance as far as decision-making or evaluation of players goes. I do not consider these kinds of issues important in sabermetrics, other than for their nominal value I suppose.

Eric: I'm a little less sanguine about the inevitable flood of data that Tom correctly anticipates. I already have more data than I can comfortably process! A wealth of new data will not necessarily translate instantly to a wealth of new findings; there will inevitably be a period where we learn how to play with our new toys.

In terms of our influence on the public and the profession, I see the former steadily growing. As for the latter, does anyone actually have an idea how many teams employ sabermetric consultants, and how many total consultants are on MLB payrolls? I do think it will become universal, and fairly soon.

Rich: Well, now that you've raised the question, how many teams would you estimate employ sabermetric consultants?

Mitchel: Good question. I have no idea exactly. Obviously St. Louis, Oakland, and Boston are the most notable. I have heard that Cleveland, Toronto, and San Diego may use sabermetrics and employ analysts.

Tom: I think the way that question is posed can have answers that lead to different conclusions. Even if a team employs a "sabermetric consultant," do they listen to him? In my limited experience, teams have this dichotomy of overspending on players, and under-spending on the support staff. Spend equals listen.

Eric: Let's hear it for under-spending! When I first met with the Sox, I pointed out that they were spending $200,000 on the free agent market for each extra run scored or prevented (it's now $300,000), so that there was no way they could overpay me. That elicited a great round of laughter from those assembled.

Mitchel: That's hilarious! The story, that is, not what you said. Speaking of dollars per marginal run, for position players at least, I try to counsel the Cards not to spend more than $200,000 per marginal run (also on the free agent market). I consider anything less than that to be somewhat of a bargain. Less than $150,000 is a real bargain, and less than $100,000 (almost impossible) is an absolute steal.

Tom: Yes, $200,000 per run or $2 million per win should be the going rate. If teams operated on that basis, they wouldn't even need a sabermetrician! The sports markets are incredibly inefficient and ripe for the taking. But teams fall all over themselves to overspend on players. Guys will give up a limb for a chance at Carmen Electra, without thinking that they can wait a minute for Jessica Alba to round the corner. Or wait for Keira Knightley.

Rich: Tom, are we so advanced now we know what Jessica Alba's baserunning is worth?

Tom: You'd be surprised.

Eric: Mitchel, do you make an exception at the top of the talent pyramid? Do you break the bank just for the elite? You're not going to wrap up Albert long term for that kind of money.

Mitchel: Yes and no, Eric. As I've said many times in many different forums, the essential bottom line for the owner (whose interests I essentially look out for) is, "How much net profit will this player provide over the course of his contract, as compared to how much money we are paying him, and what are the alternatives?" That is usually a function of that player's marginal win value (as compared to some baseline, like a replacement player) over the length of that contract (among other things). As Tom said, to figure that, teams don't really need sabermetricians. All they need is the Marcel formula and a calculator or spreadsheet!
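
For readers who haven't seen it, the Marcel system Mitchel mentions really can be sketched in a few lines: a 5/4/3 weighting of the last three seasons, regression toward the league mean, and a small age adjustment. The constants below roughly follow Tango's published description, but treat this as an illustrative sketch rather than the canonical implementation:

```python
def marcel_rate(rates, pas, league_rate, age,
                weights=(5, 4, 3), regress_pa=1200):
    """Minimal Marcel-style projection of a rate stat (e.g., OBP).

    rates -- the stat over the last three seasons, most recent first
    pas   -- plate appearances in those seasons
    """
    # 5/4/3 weighting of the last three seasons
    num = sum(w * r * pa for w, r, pa in zip(weights, rates, pas))
    den = sum(w * pa for w, pa in zip(weights, pas))
    # Regress toward the league mean: ~1200 PA of league-average performance
    projected = (num + regress_pa * league_rate) / (den + regress_pa)
    # Simple age adjustment: improve below 29, decline above
    if age < 29:
        projected *= 1 + 0.006 * (29 - age)
    else:
        projected *= 1 - 0.003 * (age - 29)
    return projected

# Hypothetical hitter: .360/.345/.330 OBP over 600/580/550 PA, age 27
print(round(marcel_rate([.360, .345, .330], [600, 580, 550], .335, 27), 3))
# -> 0.350
```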

I do "allow" some leeway for elite, "top of the pyramid" players, where supply and demand really affects the market (even though it really shouldn't). But anything more than 3 or 4 million per marginal win (per year, of course) is generally a waste of money. Compare that to Konerko's contract, which will cost the White Sox around 8 or 9 mil per marginal win - or Jeter's current salary, which is almost 7 mil per marginal win. Heck, Albert is currently worth around 7 wins above replacement and is making only around $15 million per. Of course, he was signed pre-arb, I think, which entitles the Cards to a substantial discount, as compared to a free-agent player.

Tom: Actually, didn't they wrap him up for 110/7? I didn't run the numbers, but that sounds right to me. Don't forget that baseball inflation is probably 10%, so in 7 years, the marginal cost will double. Pujols may have been underpaid!
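
The arithmetic behind this exchange is simple enough to spell out. A quick sketch using the figures the panel cites (the ten-runs-per-marginal-win conversion is the usual rule of thumb; the contract numbers are theirs, not verified values):

```python
def cost_per_marginal_win(total_salary, years, wins_per_year):
    """Average annual salary per marginal win above the baseline."""
    return (total_salary / years) / wins_per_year

# Pujols at the 110/7 deal Tom cites, at roughly 7 wins above replacement:
per_win = cost_per_marginal_win(110_000_000, 7, 7)
print(round(per_win))       # ~2,244,898: well under Mitchel's $3-4M ceiling
print(round(per_win / 10))  # ~224,490 per marginal run, near the "going rate"
```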

Rich: Generally speaking, are teams still spending too much on scouting and too little on performance analysts? Or should the extra money that would go to pay for the latter come out of the hide of the players or the profits of the owners?

Tom: Well, we don't even know how much they spend on scouting. I've asked, and they don't like to tell me. My best guess is they spend $15-20 million a year on scouting, minors, and player development. That sounds low to me.

Eric: If a team is spending nothing on analysis, there are obviously hundreds of guys in the field who could do a solid, competent job...and because it's a fun job with hundreds of candidates for thirty positions, they're not going to pay much. The bang for buck here is off the scale. The much more interesting question is how many analysts there are who are way more than competent, who can do more than just prevent their team from making Saber 101 mistakes, but can come up with great stuff, stuff that gives their team a real edge over rivals whose analysis is pedestrian. It will be very interesting to see how much money such elite analysts can eventually make, if they can establish a track record of adding that kind of value. I know I'm working on it.

Mitchel: I have no idea what teams should or do spend on scouting. I have never asked the Cardinals and no matter what they said, it wouldn't mean much to me anyway. As far as what teams do or should spend on "analysis," I ditto what Eric just said. And I don't think it is an "either/or" thing, although teams may perceive it that way, at least for now. At the present time and probably in the near future, teams can get a more than competent sabermetrician for pennies on the dollar. As more teams recognize the value of a good analyst or two (or three), the supply and demand balance will change, competition will likely heat up, and analysts will make more money. There is a limit, however, for various reasons. For one thing, as the "baseline" increases, analysts will be able to save their teams less and less money, as compared to other teams or the average team. For another, geeks and nerds will always make a lot less than athletes. I guess eventually we will have to set the value of a "replacement-level sabermetrician" and go from there. Perhaps we should also form a union and start hiring agents like Boras or Moorad. Without collective bargaining or some other powerful force in the market (like extreme competition), it is difficult for anyone to make a whole lot of money.

Eric: I think we've stumbled on a question that had never occurred to me before - just how much value can a top analyst add, above a replacement-level one? What kinds of new discoveries are out there, and how exploitable might they be in terms of getting a competitive advantage? And a thorny related question: let's say an analyst crunches some pitch-type data from BIS and discovers some wonderful new platoon pattern. A pattern that could be exploited to get a competitive edge, but also a pattern that every fan would be interested in and would add to everyone's appreciation of the game. Is it fair to sell this finding to an MLB club for their exclusive use, or is there a scientific obligation to publish?

Tom: I think you should publish, after a couple of years. One thing that I negotiate in all my contracts is that I maintain IP rights to all my work, and that I grant the team or individual a non-exclusive, non-transferable, perpetual-use license. I don't want to happen to me what happened to Kramer.

Rich: Well, Bill James has said that he wishes he could talk about certain studies, but that the Sox now own the rights to some of his recent findings. In the 1988 Baseball Abstract, James released the formulas and theories behind his old work in Breakin' The Wand. I guess it comes down to whether or not you are independent or employed by a team.

Mitchel: As Tom said, or at least implied, if you are employed by a team or work for them as an IC (independent contractor), it is up to the two parties to decide how to deal with the IP rights. Obviously, teams would like as much exclusivity and ownership as possible. It is certainly a little frustrating and disappointing when James says something like, "I would love to talk about X, but I can't."

In my case with the Cardinals, I have an agreement which is very fair and balanced. With some of my work I retain ownership and there is no exclusivity agreement, and with other stuff the team acquires most of the rights. I also have a limited non-compete clause in my contract. To be honest, I have not looked at the contract in a while and there have never been any disagreements between us. The Cardinals are a very pleasant organization to work with and we have a very deferential, almost informal, relationship.

Eric: Having studied neuroscience, I was trying to work out an analogy with pharmaceutical research and, unfortunately, it just doesn't fly. If you find a new serotonin receptor subtype, and think you can design a drug to target it, you have to publish the scientific finding as an eventual justification for the drug's efficacy. You probably have a year or two head start on the competition in terms of developing the drug, and once you beat everyone to the market with it, you patent it! So there are no incentives against making scientific findings public.

If we do unionize, we might want to consider a policy whereby all our contracts state that such research becomes public domain after, say, 10 years (via the rights reverting back to us for publication). It's nice to give your employer a competitive edge but I'd hate to see the scientific understanding of the game suffer as a result.

Mitchel: I don't think that sabermetricians have any responsibility whatsoever to publish or release any of their work in the public domain. It is their work and it is up to them to decide what suits them best. We are not talking about the cure for cancer or global warming here.

Eric: C'mon. There's a profound correlation this century between global temperatures and strikeout rates. And it's not like we lack a causal explanation in terms of hot air.

Tom: Right, I agree. Some people expect strikeout rates to jump 1% based solely on our discussion here today.

Rich: Tom, you have stated before that sabermetrics includes both quantitative and qualitative measures. Do you care to elaborate on that point?

Tom: I think people like to associate "numbers" and performance analysis to sabermetrics, and relegate scouting and observation as some ugly duckling. Sabermetrics is about the search for truth about baseball. And, at its core, baseball is about the physical and mental abilities of its players, which manifest themselves in explosions a handful of times in a game. Since we have limited samples in which to evaluate a player by his performance, we need to supplement that with some keen observations. The pinnacle of sabermetrics is the convergence of performance analysis and scouting.

Mitchel: Tom, I know that it is not politically correct to "bash" traditional scouting and observation, so I won't. I will say, however - and you and I have had this discussion before - that the more data we have - the "explosions" you refer to - the less we need scouting and other "subjective" data in order to reach the correct conclusions. To a large extent, an infinite amount of unbiased data always yields perfect results. This is an important point that is often missed or at least misunderstood by even good analysts.

Tom: There is no question that if you had an infinite sample, you would have no need for observational analysis. It's essentially a scale, where good scouting can be worth 300 at bats, just to use as an illustration. That is, if I had a player with a 300 AB season, and I had a good scout who watched him for 5 or 10 games, I would "weight" his analysis by 300 AB. However, after a couple of seasons, my player will now have 1200 or 1500 AB, and the scout is still worth 300 AB. So, the scout becomes less and less relevant as the player piles up more AB.
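
Tom's scale is just a weighted average, with the scout's opinion treated as a fixed number of pseudo at bats. A minimal sketch (all of the numbers are illustrative):

```python
def blend(observed_rate, ab, scout_rate, scout_ab=300):
    """Combine a player's observed rate with a scout's estimate,
    weighting the scout's read as `scout_ab` at bats."""
    return (observed_rate * ab + scout_rate * scout_ab) / (ab + scout_ab)

# After 300 AB, the scout's .280 read counts as much as a .320 stat line...
print(round(blend(0.320, 300, 0.280), 3))   # 0.300
# ...but after 1500 AB it barely moves the estimate.
print(round(blend(0.320, 1500, 0.280), 3))  # 0.313
```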

Eric: The convergence of sabermetrics and scouting has me as juiced as Tom but for a different reason. When I dream at night I dream of spreadsheets, and they have not just the columns we're all used to from The Bill James Handbook, but all the scouting-style data that BIS gathers: who throws what pitches how fast, all that. And I'm running correlations between that data and the standard numbers, and looking for career patterns and so forth. And Liv Tyler is lending a hand with the thornier linear regressions. They're pretty good dreams.

Tom: Yes, the scouting-style data is exactly what I'm talking about, as anyone who follows my Fans' Scouting Report project knows. We need to capture all these traits of players, all the little things, so that we can better appreciate the context of the performance, and properly assign a value to the performance.

Eric: I want to return to something Mitchel said earlier: "I consider anything which cannot be measured or supports the null hypothesis with a high degree of certainty to be essentially a non-issue, at least in a practical sense." And I think that's irrefutable. But the question is, are the things that are unmeasurable going to stay that way? Some very real and important things can be unmeasurable if enough noise is added. Who's to say that the right noise filter doesn't exist?

Mitchel: Eric, sure, heretofore-unused statistical techniques as well as new methodologies can reduce background noise and otherwise enable us to measure things that we were previously unable to measure. But, to tell you the truth, if quality researchers have had difficulty measuring something in the past, it is most likely not worth a whole lot even if it can eventually be measured. That is not an absolute statement, of course. We are talking about a relatively simple environment to study (with all due respect to Bill James, who generally refers to baseball as a complex dynamic system), as compared with, say, quantum physics or cosmology.

Rich: Well, Mitchel, I would rather talk about baseball than the structure of the universe any day. With that in mind, I'd like to go around the room and hear the most interesting topic you are working on right now.

Eric: Hmm . . . I actually did recently send a letter to New Scientist about the structure of the universe (some unappreciated implications of Heim's Grand Unified Theory). This may be why it took me 35 years to get a career going in sabermetrics . . .

Tom: I've started a few things, and they are all based off the play-by-play and pitch-by-pitch logs. Studes at Major League Baseball Graphs, with his Batted Ball Index project, did a sensational job on the first thing I was dipping my toes into. I was also dipping my toes into what David Pinto has already done with his fielding graphs. And David Appelman has already done the third thing that I've been working on, on and off: understanding pitch-by-pitch data. There are plenty of great minds out there working their butts off.

I think the Holy Grail centers around understanding the pitch-by-pitch process. This is what baseball is all about, this is where performance analysis can do the most damage, this is where you can have a real impact on the approach to hitters and pitchers themselves, and this is where scouting and game theory really come to the forefront. It's the center of the baseball universe. My guess is that top baseball game designers may have cracked this nut already, and I would bet that Tom Tippett may be ahead of everyone on this. Just a guess. This is a journey I'd love to take, if I had time.

Mitchel: Well, I can't really say, as it is all proprietary, but I can say that in 10 or 12 years when it becomes public, it will rock the baseball world! Just kidding!

I'm not really working on anything earth-shattering right now. I have recently revamped my entire UZR methodology, which doesn't really mean anything to too many people, as I haven't published any wholesale results in a long time anyway. And, of course, I've been "scooped" by John Dewan in terms of any future public disclosure of UZR ratings in the form of a book. That is fair, as the original concept of a "zone rating" and even an "ultimate zone rating" was originally published by John and STATS Inc (although I developed my own "zone rating" independently and at about the same time - along with several other people that I know of - remember DeCoursey's and Nichols' "defensive average" back in the late 80's or early 90's?).

Eric: You kids! The adjective "back" should be reserved for the early 70's. I had to hand-calculate league OBP's and emend my copy of the 1974 MacMillan Encyclopedia in ballpoint ink. And walk a mile to school, too. Carrying the book.

Mitchel: I don't think I'm that much younger than you, Eric! Anyway, I am also working on an "ultimate, ultimate zone rating (UUZR)" which, rather than using distinct zones or vectors and the probabilities of catching a certain type of ball within them, uses a smooth function such that we can basically plug in the x, y coordinates of a batted ball (along with the usual characteristics - speed, type, etc.) and come up with the probability of that ball being caught, regardless of whether we already have an historical "baseline" for that particular type of ball at those coordinates. I am also going to incorporate into the UUZR methodology subjective ratings on all plays made (which STATS routinely provides) to improve the integrity of the data.
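
One way to picture the "smooth function" Mitchel describes: instead of looking up a catch rate in a discrete zone, fit a continuous model, such as a logistic surface, over the raw coordinates and ball characteristics. This is a conceptual sketch only; the feature set and coefficients are invented for illustration and are not the actual UUZR internals:

```python
import math

def catch_probability(x, y, speed, coef):
    """Smooth catch-probability surface over batted-ball coordinates.
    In practice, coef would be fit to historical play-by-play data;
    the simple polynomial feature set here is purely illustrative."""
    features = [1.0, x, y, speed, x * y, x * x, y * y]
    z = sum(c * f for c, f in zip(coef, features))
    return 1.0 / (1.0 + math.exp(-z))  # logistic link

# Hypothetical coefficients and a hypothetical ball in play:
coef = [0.5, -0.01, 0.02, -0.03, 0.0001, -0.0002, -0.0001]
print(round(catch_probability(x=40, y=250, speed=85, coef=coef), 3))
```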

As well, I am working on better ways of "park adjusting" player stats in order to do better context-neutral projections as well as to determine the future value of a player in a specific park, especially when that player changes home teams. I am continually working on improving my projection models, as these are really at the heart of what a sabermetrician can do for a team. Tom might disagree with this as he tends to think that one projection system is basically as good as another.

Tom: For established big-league hitters, that's pretty much true. You can more or less prove that the maximum r possible for a forecasting system is around .75, while a group of fans can get you .65, and these sophisticated forecasting systems are at the .70 level (as a basic illustration). That's for hitters. For pitchers and fielders, that's not true of course.

As for park factors, I've been talking about this for years. I find it extremely disappointing that we always talk about a single park factor, when that's simply not reality. Busch Stadium cannot possibly affect Coleman, McGee, and Jack Clark the same way, and we should not pretend that it does. Same for Coors. Yes, using something is better than nothing. But, there's been very little published on this subject and very little innovation.

Eric: The overall park factors work fine for evaluating past value, but can be close to worthless for predicting future value. And there's a whole breakthrough project waiting to be done correlating weather and park data. Look at the year-to-year PF variation for Dodger Stadium vs. a place that actually has weather like Wrigley Field.
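
For readers following along, the one-number park factor the panel is criticizing is usually computed something like this (a simplified sketch; real implementations also adjust for schedule imbalance and use multi-year samples):

```python
def simple_park_factor(home_rs, home_ra, home_g, road_rs, road_ra, road_g):
    """One-number run park factor: scoring rate at home vs. on the road.
    PF > 1.0 suggests a hitter's park, PF < 1.0 a pitcher's park."""
    home_rate = (home_rs + home_ra) / home_g
    road_rate = (road_rs + road_ra) / road_g
    return home_rate / road_rate

# Hypothetical team: 420 scored / 400 allowed at home, 380 / 390 on the road
print(round(simple_park_factor(420, 400, 81, 380, 390, 81), 3))  # 1.065
```

The complaint above is that a single number like this averages over every kind of hitter; a slap hitter and a pull-power slugger do not see the same park.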

Mitchel: My next big project is delving into the pitch-by-pitch data (TLV data - type, location, and velocity) that Tom just mentioned. He is right that this is one of the Holy Grails left in baseball analysis with respect to evaluating and "scouting" players (and understanding and incorporating game theory into the analysis) in a very different way than we have been doing for the last 20 years.

Eric: I'm doing some interesting things for the Sox that I won't talk about. On my own, as you might have guessed from my fog challenge to Mitchel, I'm chasing some of the Holy noise-obscured Grails. I've found a lot of interesting things about pitchers' BABIP (and can we please start calling it BPA? The tops of my spreadsheet columns thank you).

Mitchel: I'm all for that (BPA). BABIP is way too long. Almost as bad as TINSTAAPP!

Eric: For instance, team BPA depends significantly on team K and BB rates. So good pitchers do allow a lower BPA, and differences between pitchers must be reasonably large. It also means that when you use BPA as a team defensive metric (and all the best people do), you want to tweak it to adjust for the quality of the staff as evidenced by the K and BB rates.
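
A sketch of the tweak Eric describes. Team BPA here is hits minus home runs over balls in play; the adjustment coefficients are hypothetical placeholders standing in for whatever values his regression actually produced:

```python
def team_bpa(hits, hr, ab, k, sf=0):
    """Batting average on balls in play allowed (BPA, a.k.a. BABIP)."""
    return (hits - hr) / (ab - k - hr + sf)

def defense_bpa(observed_bpa, k_rate, bb_rate,
                lg_bpa=0.295, lg_k=0.165, lg_bb=0.085,
                k_coef=-0.15, bb_coef=0.10):
    """Recenter team BPA for staff quality as evidenced by K and BB rates.
    k_coef and bb_coef are hypothetical: a high-strikeout, low-walk staff
    is expected to allow a somewhat lower BPA, so part of a low observed
    BPA is credited to the pitchers rather than to the defense."""
    expected = lg_bpa + k_coef * (k_rate - lg_k) + bb_coef * (bb_rate - lg_bb)
    return lg_bpa + (observed_bpa - expected)

# A high-strikeout staff (19% K, 8% BB) that allowed a .290 BPA:
print(round(defense_bpa(0.290, 0.19, 0.08), 4))  # 0.2943: defense nearer average
```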

I'm also just wrapping up my other recent project. I'm about to send the Hardball Times an article that, I believe, proves that RISP hitting differences are real rather than random (a question so settled in the other direction that Keith Woolner omitted it from "Baseball's Hilbert Problems" in the 2000 Baseball Prospectus). I'm not talking about "clutch hitting," but real and reasonably common variations in performance by hitters with RISP in response to the different tactics of the batter/pitcher matchup. I hope it will open up that topic for a good deal of further analysis.

Rich: Thanks for the chat, guys. Based on our discussion, I think it is safe to say that there is a good deal of further analysis ahead of us in a number of areas.

[Additional reader comments and retorts at Baseball Primer.]

Comments

"I think the Holy Grail centers around understanding the pitch-by-pitch process."

You mean like the great stuff Rich came up with last week? ('Strikeout Proficiency') I really enjoyed those two articles -- and the tons of mileage that came out of it over at BTF and Pinto's site -- as well as this roundtable.

"The slowdown will happen if MLB and the data owners consider it more important that 30 analysts look at this data instead of 30,000."

I think this is a real issue, especially with the current TLV data. This stuff isn't publicly accessible. As far as I know, the closest you can get is Retrosheet, and while a lot can be gleaned from their play-by-play data, it doesn't have the same granularity as TLV data.

You can’t exactly blame the companies that collect this type of data for not publicly disseminating it because I’m sure it’s expensive to collect and for companies like BIS it’s their bread and butter. On the other hand, the data is far too expensive for almost all hobbyists and if you can afford it, it comes with understandable distribution restrictions.

Over at FanGraphs, we’ve agreed with BIS to freely distribute raw TLV data for retired players only, but for any league wide studies to be done it may take over a decade for enough players to retire. And it certainly doesn’t help with the evaluation of current players.

Unfortunately, I think for the foreseeable future it may be that there are only the 30 analysts looking at the data. I’m not exactly sure how the problem is going to be fixed either unless there's a Retrosheet style project to collect this type of data.

"we’ve agreed with BIS to freely distribute raw TLV data for retired players only"

Well, that is very refreshing! At least one company cares enough about the researchers that they will allow data to enter the public domain. Since BIS only has data since 2002, this doesn't apply to them, but I'd also say that companies should release data that is 5 years old or older. After all, that data is pretty much worthless to a team or outfits like Yahoo, etc. Even setting up a nominal charge would be great. But for an analyst, it is really irrelevant if the data is from 2002 or 2007. We need data to understand the behaviours of players. Releasing old data that's not generating any revenue also makes good business sense in that it may bring you a larger customer base, some of whom may start to buy your current data.

The 5-year statute of limitations is a terrific idea to promulgate (and Tom makes a strong argument for why it will be good for the stat companies.) I've never done a study where data that old correlated to present performance.

It's in the interest of teams to get this old data into the hands of the general saber community, too. An analyst for a team can do a much better job with the recent, proprietary data if he can draw on a large body of public work on its general interpretation. Instead of 30 (or fewer) analysts separately inventing the wheel, you've got something resembling a normal scientific community, where the general work is in the public domain and the applications are for profit.

Thank GOD somebody -- i.e., MGL -- has finally come out and said the obvious about James's "Fog" article. (I mean, besides me.) James makes too much out of the "fog"; his results are of no practical importance. To do otherwise is to stand the scientific method on its head.

Rob, I couldn't disagree with you more. As I've said in several places now (and for many years), the strength of a correlation doesn't tell you the size of the signal being measured; it tells you the size of the signal less the size of the noise. And a signal of any strength can be obscured by sufficient noise, hence the fog argument. There are certain things that are important to the game that are inevitably swamped by noise and hence difficult to measure, but that doesn't mean we give up on understanding them. The initial findings about the weakness of the correlation of BPA (BABIP) led many people to assume that the individual range in BPA was not "baseball significant" (even after I, Tom Tippett, and others showed that it was statistically significant). Well, it turns out that if you don't attempt to measure a pitcher's true BPA, your estimate of his true ERA is likely to be off by 0.20 or even 0.40 runs, which is to say $1 - $3 million a year of salary in terms of value on the FA market. It's swamped in fog and it's of immense practical importance. Not everything on Bill's list of fog-shrouded phenomena is going to prove as real and important, but each needs to be looked at more closely.
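
Eric's signal-and-noise argument is easy to demonstrate with a small simulation: give every hitter a real, permanent clutch skill, simulate two seasons of limited clutch at bats, and the year-to-year correlation nearly vanishes anyway. All parameters here are invented for illustration:

```python
import random

random.seed(1)
N, CLUTCH_AB = 1000, 150  # hitters, clutch ABs per season
# A perfectly real skill: true talent spread of +/- 10 points of average
true_skill = [random.gauss(0.0, 0.010) for _ in range(N)]

def season(skill, base=0.270):
    """Observed clutch average over a small binomial sample."""
    hits = sum(random.random() < base + skill for _ in range(CLUTCH_AB))
    return hits / CLUTCH_AB

y1 = [season(s) for s in true_skill]
y2 = [season(s) for s in true_skill]

def corr(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

# Expected r is only ~0.07: binomial noise over 150 AB swamps a real
# 10-point spread in true talent. That is the "fog."
print(round(corr(y1, y2), 3))
```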

I never said we should stop trying to find useful information, but neither should we pretend that something exists just because we haven't been able to disprove it. The problem I see with the "Fog" document is that it very much implies the existence of things that quite frankly haven't been proven. Unlike James and his hypothetical picket, the existence of the army is not a given, let alone its presence. His choice of metaphors was extremely poor, as it would seem to make it incumbent on those who say "there is no proof of X as a skill" to prove that negative assertion, as opposed to those who believe it exists to prove their positive assertion. Well, negative assertions aren't provable, by definition.

I think this (the "fog" issue) is more a matter of degree and semantics than anything else.

James' statement:

Cramer was using random data as proof of nothingness and I did the same, many times, and many other people also have done the same. But I'm saying now that's not right; random data proves nothing and it cannot be used as proof of nothingness.
Why? Because whenever you do a study, if your study completely fails, you will get random data. Therefore, when you get random data, all you may conclude is that your study has failed. Cramer's study may have failed to identify clutch hitters because clutch hitters don't exist, as he concluded, or it may have failed to identify clutch hitters because the method doesn't work, as I now believe. We don't know. All we can say is that the study has failed.

was a poor way of representing his point of view, and I agree with Rob's assessment.

One, every good scientist understands that when he does an "experiment" and it "fails," the best he can say is, "We found no evidence of whatever it is we were looking for," or some such thing, and that, "Given the test we did, there is such and such chance that we made a Type I or Type II error," etc. That is banal.

It is also banal to state that an experiment might fail because it was poorly designed, etc., as James does in the above statement. Actually, it is more than banal. It is downright ridiculous to state that, "We shouldn't get all excited or jump to any conclusions when a scientist finds no evidence of something, because the study may have been bad." Well, no shit! That applies equally when we do find evidence of something. And that is why we have things like peer review and duplicated, independent research before we in fact jump to any wholesale conclusions. That really has nothing to do with the "fog" issue, per se.

So while I don't necessarily disagree with Eric's point of view, I think that we are in some sense arguing about angels dancing on the head of a pin.

Obviously if someone uses poor statistical techniques (including too-small samples, etc.), we take their finding with a grain of salt. However, when we do accept a certain thesis, we operate on the assumption that it was derived in a responsible, scientific manner, always leaving open the possibility that the results are "wrong." So what? That is the nature of science. I forget who said it, and I am probably butchering what he said anyway, but science is discovering truths through the scientific method until such time as someone with better tools or data disproves those truths.

Bottom line is that I think this whole "fog" issue is overrated. Eric can flippantly state that "a large signal can be obscured by noise such that it is difficult to measure," but the fact of the matter is that that is exceedingly rare in baseball. If a signal is hard to measure, it is almost without a doubt not very important in a practical sense. DIPS is a poor example of something that supports Eric's hypothesis. Voros made an initial blanket statement that pitchers have little or no control over BABIP, as opposed to other outcomes (BB, K, and HR). He was right then and is right now. How little (as well as other related stuff) is another issue altogether.

Anyway I'll get off the soapbox and return you to your regularly scheduled programming.

The problem I see with the "Fog" document is that it very much implies the existence of things that quite frankly haven't been proven.

Does it? Or does it merely caution people not to jump too quickly to the conclusion that noise = insignificance? (Which, to my mind, is a much more useful reminder to lay readers than it is to a working scientist like MGL.)

If I were managing or assembling a team, I'd rather allow myself to be open to the possibility that Gary Sheffield really *does* hit better in the clutch, or that Davey Lopes really *did* hit lefties better than the platoon spread for righties would predict. It would be near or at the bottom of the stuff I'd look for or rely on, but I see far less harm personally in keeping the door open for such data than slamming it shut.

But then, I'll never interact with a baseball team in any meaningful way, so who cares?

Matt, all due respect, but you don't post sentries to look for the Iraqi army in Arizona. That's my problem with James's metaphor.

Oh, and finally: if the Angels -- and this is just one part of this discussion -- don't take RISP and RISP2 hitting seriously, why let slip that they do, that it's the most important stat they keep?

Bill James' Fog piece was actually useful because a lot of non-scientists reading sabermetric work don't understand the distinction between proving something doesn't exist and not proving that something exists. This is very problematic in the practice of the work for a couple of reasons.

  1. Practitioners read that hot hands don't exist, and they say the sabermetric community is out of touch and ignore every useful study.
  2. Some studies that show non-existence are indeed wrong. Mitchel said they showed that streakiness existed but was small -- that is different from studies that concluded it didn't exist.

I work in basketball, and it is extremely common for well-trained statisticians jumping into this field to set up their basketball study completely wrong. The most common mistake leads to the result that offensive rebounds are useless. This is because they construct the study wrong -- as soon as you tell them how to craft it, they see a true value showing up in their results. And that value isn't small. It is not a small signal, but a lot of noise introduced by the study. I wouldn't doubt that this sort of thing happens in the more difficult topics of baseball, as well.

So there is a lot of fog that makes the practice of sabermetrics (in baseball or basketball) a little tough.

Dean Oliver
Consultant to the Seattle Supersonics
Author, Basketball on Paper

James' real power is his language. We talk about the Fog piece, and we don't even have to go back to read it. We *remember* it, like a song. I say "Michelle", and you're singing the entire Beatles song. We say "Fog", and we think of the James piece.

I agree with Dean that the effect of the Fog piece was to remind many people of the difference between something that doesn't exist and something that you haven't found. It's not the same thing. At the same time, if, like Mitchel, you have looked high and low for something and still haven't found it, then for all intents and purposes, even if you do end up finding it, it will probably be useless to you.

Unless you are meticulous, it's hard to tell whether what you are looking for is a bomb or a grain of sand.