Why You Shouldn't Bet on Baseball
Alternative title to this post: Why Tango Tiger still has a day job
I really enjoyed the feedback to my last article on why umpires should be biased in favor of control pitchers. Most of the folks with a solid statistical background responded favorably, while the unwashed masses thought I was ridiculous and should be run out of the blogosphere on a rail. (Perhaps I don't give the detractors enough credit; if they were washed and educated, then it was an equally entertaining Ask Marilyn-esque experience.) Anyhow, I thought I'd repeat the experience by picking on another piece of low-hanging fruit: betting on baseball games. I know a fair number of people who bet on sports. They don't understand that betting on sports is an investment in entertainment, not a viable means of turning baseball knowledge into cold hard cash. This post should be far less iconoclastic than the biased umpire post, but the central point doesn't appear to be widely known.
So here's the skinny: you should not bet on baseball. In the long run, you'll lose. No model that you can develop can be anticipated/demonstrated to beat Vegas. I don't care how good you (think you) are. I don't care how much you think any given line is outrageous. You can't be expected to win. You might think (as I did) that many people who bet on baseball may make poor predictions, and that the intelligent bettor may be able to profit off of them. You'd just have to be better than the average bettor, right? Wrong. I'd argue that Vegas profits off of these folks (because they set the lines), but the rest of us shouldn't get involved.
A couple caveats: First, I've been recording the Vegas odds for a couple years, and have analyzed data for 2007 and 2008. I forgot to download them this year before they disappeared off of the web service I use, so I don't have this year's odds. But I'd be happy to wager that nothing has changed because the system Vegas uses hasn't changed. Second, I am only looking at betting on who wins individual games; there are a number of other bets one could place, and maybe you could make money betting on which pitcher is most likely to start the top of the 7th inning, or which batter is likely to adjust their cup first. I'm not touching those wagers.
In 2008, Vegas came really close to being perfect. The variance between the actual outcome of all regular season games and Vegas' prediction for those games was within the range that one can attribute to random chance. If we start by assuming that Vegas' lines perfectly reflected the likelihood of each team winning each game, then 75 percent of the time the variance between the predictions and the actual outcomes would be greater than it was in 2008.
That doesn't mean that Vegas is 75% likely to be perfect. We'd have to go all Bayesian and start assuming silly things to figure out precisely how good Vegas really is. But think about that statistic: if Vegas were perfect, they still would have had a 75% chance of making a worse set of predictions than they did in 2008. So somehow, by hook or by crook, they made some ridiculously accurate predictions in 2008.
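For anyone who wants to try this kind of check on their own data, here is a minimal sketch of one way to do it. It is not the code behind the numbers above. It assumes "variance" means the mean squared difference between the predicted home-win probability and the 0/1 result, and the season it runs on is made up (the 2,430 games and the 0.40 to 0.65 range of lines are placeholders, not real odds). The function names are hypothetical.

```python
import random

def brier(probs, outcomes):
    """Mean squared difference between predicted home-win probabilities and 0/1 results."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

def share_of_perfect_seasons_worse(probs, outcomes, n_sims=500, seed=0):
    """Treat probs as the true odds, replay the season n_sims times, and
    return the fraction of replays that score worse than what was observed."""
    rng = random.Random(seed)
    observed = brier(probs, outcomes)
    worse = 0
    for _ in range(n_sims):
        simulated = [1 if rng.random() < p else 0 for p in probs]
        if brier(probs, simulated) > observed:
            worse += 1
    return worse / n_sims

# Made-up season for illustration only (assumption: not real Vegas lines).
rng = random.Random(42)
lines = [rng.uniform(0.40, 0.65) for _ in range(2430)]
results = [1 if rng.random() < p else 0 for p in lines]
print(share_of_perfect_seasons_worse(lines, results))
```

With real closing lines and real results in place of the made-up lists, a returned value of 0.75 would correspond to the 75% figure described above.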
It is still possible that the bookmakers got lucky, and that there is money to be made in betting on baseball. So I ran a couple more simulations. According to my numbers, it is crazy unlikely (p<.01) that Vegas is off more than 4% per game. This is because almost all baseball games have a true home-win probability of somewhere close to 50%. Even when the Yankees meet the Royals, the odds aren't far from 50%.
So let's run with my rough estimate that Vegas is off by no more than 4%. In order to make money betting on baseball, you'd have to do better than that. You won't make money if you're just better than chance (e.g., by picking the Yankees every time, or picking the home team every time). If you matched their 4% inaccuracy, you'd lose money a little more often than you won money (on a year-by-year basis). If you barely beat the 4% inaccuracy, with, say, 3.8% inaccuracy, you'd be expected to make a little money each year, but the likelihood that you would lose money in any given year would still be very high. If you removed 25% of the error in Vegas' estimates, so that your estimates deviated from the true probabilities by 3%, you'd make a profit 73% of the time (again, on a year-by-year basis...so 2-3 years out of every ten, you'd net a loss), for an average return of 3 cents on the dollar. I make more than that in my checking account (granted, it's a great rate for a checking account, but still...).
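If you want to play with this kind of comparison yourself, here is a rough sketch of one way to set it up. It is not the simulation behind the figures above, and it should not be expected to reproduce them. Everything in it is an assumption: true home-win probabilities are drawn between 40% and 60%, both Vegas and the bettor see the truth plus Gaussian noise with the stated average absolute error, the bettor puts one unit on whichever side looks underpriced, and bets pay at the fair odds implied by the line with no bookmaker cut.

```python
import math
import random

def clamp(x, lo=0.05, hi=0.95):
    """Keep a noisy probability inside a sane range."""
    return min(hi, max(lo, x))

def season_return(vegas_err, bettor_err, n_games=2430, seed=0):
    """Average profit per one-unit bet over one simulated season (no vig)."""
    rng = random.Random(seed)
    to_sd = math.sqrt(math.pi / 2)  # converts a mean absolute error into a Gaussian sd
    profit = 0.0
    for _ in range(n_games):
        true_p = rng.uniform(0.40, 0.60)  # assumed true home-win probability
        line = clamp(true_p + rng.gauss(0, vegas_err * to_sd))
        guess = clamp(true_p + rng.gauss(0, bettor_err * to_sd))
        home_won = rng.random() < true_p
        if guess > line:
            # Bettor thinks the home team is underpriced: bet one unit on home.
            profit += (1 - line) / line if home_won else -1.0
        else:
            # Otherwise bet one unit on the away team.
            profit += line / (1 - line) if not home_won else -1.0
    return profit / n_games

def share_of_winning_seasons(vegas_err, bettor_err, n_seasons=200):
    """Fraction of simulated seasons that end with a profit."""
    wins = sum(season_return(vegas_err, bettor_err, seed=s) > 0 for s in range(n_seasons))
    return wins / n_seasons

print(share_of_winning_seasons(0.04, 0.03))
```

Because this sketch leaves out the bookmaker's cut, it flatters the bettor. Adding a cut would push every number it produces downward.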
There are enough variables out there that there's a distinct possibility that I'm wrong. If you have the data to demonstrate that you can win reliably, I'd like to see it. But until then, I'm sticking with the numbers, which say that Vegas is really, really good, and you'd have to be considerably better (25% better) to even make a decent return on your investment.
I also have a novel answer to the question: "If you're so smart, why aren't you rich?" Because I'm smart enough to know it's a scam. Flame away :)
Following this post, there were a number of replies, mostly harsh ones. I expected nothing less, since there are far more people invested in baseball betting being a sound investment than the alternative. In fact, I was hoping for it; my initial analyses were run to see how large the margin for potential profit was for my own practical purposes. In the comments, I promised that if someone posted verifiable data that demonstrated that I was wrong, I would say so. I'm going to relax that standard and give Umaga credit for making a reasonable argument that I found convincing along with his own purported ROI.
Based on responses like those made by Umaga, I'll change my position: (1) No one can make money betting on the closing line, or lines that end up looking very similar to the closing line; (2) if you can predict which opening lines are particularly poor, and you bet early enough, you may be able to make money. In summary, I'd say that there are a small few (yes, likely financial quants, or ex-quants) who can leverage the peculiarities in the system to make money. I have no data for this, but I'm convinced that it's true. But the vast majority of people, betting on something that looks like the closing line, are not likely to make money.
Some have commented that a bookmaker's job is to balance the books, and to some extent to exploit known biases in bettors (such as to exaggerate the probability of a favorite, like the Yankees, to win). The story is that this pushes the moneyline away from the "true" probability of each team winning, creating a margin for people to put "smart money" in. I'm fully aware of how these lines are set, and how they change, but it doesn't change the story. If there are any biases such as these, the "smart money" is completely canceling them out. We know this because the closing line is as close to being a perfect measure of game outcome as is practically possible. It does not show the systematic bias that we would expect to see. Thus, if a bettor is going to exploit this, he would have to do so early, before the line drifts towards the closing line.
Others commented that I was being dense, and that "of course" the moneyline makes a "ridiculously good prediction" of game outcome. They argue that baseball betting is essentially a prediction market for baseball game outcomes. These comments absolutely miss the point: (1) a prediction market is not guaranteed to converge at a perfect prediction; (2) even if it *did* converge to the perfect prediction, the 2008 closing lines were better than you would expect a perfect prediction system to be 75% of the time. Kyle is wrong when he says "of course" bookmakers are that good; by random chance we would expect them to be measurably worse even if we expected prediction markets to converge to a perfect prediction (which seems to be Kyle's other point, which, of course, is some combination of silly and naive).
The reason is this (in answer to TomC's question): every baseball game is a Bernoulli trial; that is, there are two possible outcomes: the home team wins or the away team wins. There is a probability, p, that the home team wins, and a probability, q=1-p, that it loses. Thus, each baseball game is essentially a weighted coin flip. A perfect prediction system would have access to the "true" probability of each team winning (p and q). If you were to bet on whichever team has the greatest chance of winning, the outcome of your bet would also be a Bernoulli trial. This means that the variance between your optimal guess and the actual outcome has a known distribution: a binomial distribution. If we know the number of games we bet on, and we know the true odds of winning each of those bets, we can calculate a probability distribution for the variance between the actual outcome and our optimal guess. Thus, we can say things like "There is a 75% chance that the variance between our optimal guess and the actual outcome is less than some number, k."
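To make the coin-flip picture concrete, here is a small sketch (the function name is hypothetical). It takes a list of true home-win probabilities, assumes you always bet the favorite, and returns the expected number of winning picks along with the standard deviation around it. It treats games as independent. Strictly speaking, the count of wins is binomial only when every game has the same win probability; with different probabilities per game it is a sum of independent Bernoulli trials, but the mean and variance formulas used here hold either way.

```python
import math

def favorite_pick_summary(home_probs):
    """Expected number of correct picks, and its standard deviation,
    when always betting the side with the higher true win probability."""
    fav = [max(p, 1 - p) for p in home_probs]       # win chance of each pick
    mean = sum(fav)                                 # sum of Bernoulli means
    sd = math.sqrt(sum(m * (1 - m) for m in fav))   # sum of Bernoulli variances
    return mean, sd

# Illustrative input: 100 games where the favorite wins 55% of the time.
mean, sd = favorite_pick_summary([0.55] * 100)
print(round(mean, 2), round(sd, 2))
```

Even with perfect knowledge of the odds in this example, you would expect about 55 correct picks out of 100, give or take about 5, which is why a single season's results leave so much room for luck.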
In 2008, the variance between the bookmakers' closing line and the actual outcome was very small. In fact, it was less than we would have expected by chance 75% of the time.
What does that imply? If you were betting on the closing line, and you had perfect access to the true probability of each team winning, you would still have a 75% chance of being outperformed by (the average) bookmakers in 2008. If you can't outperform the bookmakers, you can't make money. Thus, you can't play the closing line, or lines that end up being similar to the closing line, and win.
But Umaga's point is a good one and well taken: I said both that you can't make money betting on baseball, and that you can't make money betting on the closing line. But these two claims are not equivalent. In the end, I'm convinced he is correct: I'll stand by the latter claim and back off of the former. You can't make money on the closing line, but you may be able to make money on the opening line or rogue lines (of which there are many). According to this story, making money on sports betting requires the bettor to be clever and look for opportunities to exploit, because the predictive market is really good. So for the vast majority of the sports bettors out there--the ones who don't have MBAs; who haven't had quant jobs at hedge funds; and who don't try to jump on opening lines before they drift away--those folks are buying entertainment every time they bet on a game.
Lastly, I'll point out that since bookmakers take a percentage of the action, this isn't even a zero-sum game; it's a negative-sum game. One commenter, Garrett Weinzierl, doesn't like the implication that sports bettors are "all sailing off the edge." But since this is a negative-sum game, most bettors are sailing off the edge. For this system to work, most have to be sailing off the edge. If you're a sports bettor, I'm not saying you're sailing in the wrong direction. Only Garrett knows where his boat is going. But if you're not at risk of going over an edge, you're an exception to the rule.