Behind the Scoreboard
April 24, 2010
The Science of Playoffs
By Sky Andrecheck

What is the point of having playoffs? I mean this not as a snarky rhetorical, but as a real question. Playoffs are a given in every major sport today, but it wasn't always that way. Before 1969, there were no playoffs in baseball, just one seven-game series between the winners of two completely separate leagues. Other sports were quicker to adopt playoffs. The NFL adopted playoffs in 1933 - a one-game championship between the winners of the East and West divisions. The NBA had extensive playoffs almost from its inception, allowing 12 of its 17 teams to make the playoffs in 1950. The NHL was also an early adopter. The NBA scaled back its playoffs. Meanwhile, MLB has expanded its playoffs. With all of these different approaches to playoffs, who is right? To answer that, the point of the playoffs must be determined.

One way to look at it is that playoffs shouldn't be necessary at all unless there's a tie for first place. If the goal is to choose the best team and crown them as champion, then that would be the ideal approach. For instance, in 2009, the Yankees had a record that was six games better than any other AL team. Is there anything that could have happened in the playoffs to overturn the evidence that the Yankees were the best AL team in 2009? Not really. If we assume that teams remain static over the course of a season, the best way to evaluate a team's skill is to look at its overall record including the playoffs (schedule adjusted). Even had the Yankees been swept out of the first round, the evidence would point to the conclusion that New York had the best ballclub. So if it's a certainty that the Yankees were the best AL team, then what's the point of having any playoffs? Why not just send them straight to the World Series? That's one view, and a view baseball had until 1969.

But here's another view. The statement above is actually incorrect. It is not a certainty that the Yankees are the best team. It is a certainty that, no matter what happens in the playoffs, the Yankees are most likely the best team. But because of variability, we can't ever be certain who's the best, and sometimes we may not be very close to certain at all.

Consider a scenario in which, based on their records, one team has a 70% chance of being the "true" best team. Meanwhile, there's a 30% probability that a second team is the "true" best team. What do we do? One approach is to simply give the first team the championship. They are most likely the best team, and so they should be given the title. But the second team might be mad. "Hey, we deserve 30% of that championship!" Well, they don't give out parts of championships. But you could, in essence, give them 30% of a championship by giving them a 30% chance to win one whole championship. This could take place by having the commissioner pick a ball out of a lottery at the end of the season. The lottery machine would be filled 70% with Team A's balls and 30% with Team B's balls. If your team's ball comes up, it is awarded the championship. They could do that. But it would be a pretty awful way to end a season.

Championships should be decided on the field, not by ping pong balls. The solution to the uncertainty? Playoffs. Instead of drawing a lottery, you can simply set up playoffs. And, if you structure the playoffs so that Team A has a 70% chance to win and Team B has a 30% chance to win, then you will have achieved the same effect. Except now the championship is decided on the field. These playoffs give each team a chance to win in proportion to the probability that it was the best team in the regular season. If your team, based on the regular season, has a 10% chance of being the "true" best team, then you will have a 10% chance of winning the playoffs and claiming the championship. Seems fair to me.

Of course, that's only if you set up the playoffs just right. If you were to make the National League playoffs a 16-team NCAA-style single-elimination knockout tournament, those two probabilities would not even be close to matching one another. The chance that a bad team could win the tournament would be much higher than the probability that it was the best team during the regular season.

An Example

Let's take a simple example with just two teams. In 2008, the Cubs won 97 games, with a .602 WPCT. The Phillies won 92 games, with a .568 WPCT. If we regress these to the mean, which I won't go into here, you get the Cubs with a predicted "true" WPCT of .569 and the Phillies with a predicted "true" WPCT of .548. Each of these has a standard error of about .032. Hence, the probability that the Cubs are "truly" better than the Phillies is about 68%. So, in order to be "fair", a playoff series should be structured so that the Cubs have a 68% chance of winning. However, in a seven-game series with home field advantage to the Cubs, Chicago has only a 56% chance to win (this includes the fact that the Cubs are likely better than the Phillies). But 56% is too low. Their five-game lead is substantial, and should not be so easily erased by a simple best-of-seven series. So how about we change things up? We still play a seven-game series, but we spot the Cubs a 1-0 lead. Running the numbers again, now the Cubs have a 69% chance of winning the series. That's almost perfect! It gives the Cubs an advantage, but the Phillies still have a chance to win. And they can do it by winning just four out of six games.
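For readers who want to check the arithmetic, here is a rough Python sketch of the two calculations above. It is not the exact model behind the quoted figures: it assumes the two "true" WPCT estimates are independent normals, uses the log5 formula for the per-game win probability, and ignores both home field advantage and the uncertainty in each team's true talent when playing out the series. The function names are hypothetical.

```python
from math import erf, sqrt


def prob_truly_better(wpct_a, wpct_b, se):
    """P(team A's true WPCT exceeds team B's), treating each estimate as
    an independent normal with the same standard error (an assumption)."""
    z = (wpct_a - wpct_b) / (se * sqrt(2))
    return 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF at z


def log5(wpct_a, wpct_b):
    """Per-game chance that A beats B on a neutral field (log5 formula,
    an assumed stand-in for whatever game model the article used)."""
    a = wpct_a * (1 - wpct_b)
    b = wpct_b * (1 - wpct_a)
    return a / (a + b)


def series_win_prob(p, wins_needed=4, head_start=0):
    """P(A wins a first-to-`wins_needed` series) when A wins each game
    independently with probability p and is spotted `head_start` wins."""
    def win(a_wins, b_wins):
        if a_wins == wins_needed:
            return 1.0
        if b_wins == wins_needed:
            return 0.0
        return p * win(a_wins + 1, b_wins) + (1 - p) * win(a_wins, b_wins + 1)
    return win(head_start, 0)


# Regressed "true" WPCTs and standard error quoted in the text
cubs, phillies, se = 0.569, 0.548, 0.032

p_better = prob_truly_better(cubs, phillies, se)
p_game = log5(cubs, phillies)

print(f"P(Cubs truly better)       = {p_better:.2f}")
print(f"Best of seven, even start  = {series_win_prob(p_game):.2f}")
print(f"Best of seven, Cubs up 1-0 = {series_win_prob(p_game, head_start=1):.2f}")
```

The first number should come out at about 68%. The two series numbers should land near, but not exactly on, the 56% and 69% quoted above, since this sketch leaves out home field and talent uncertainty.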

The above set-up is the fairest one to determine the championship between the Phillies and Cubs. Of course, purists will say that the Cubs should be awarded the championship regardless. After all, if the Phillies win 4 out of 6 games, the Cubs will still have a better record than Philadelphia. Hence, the Cubs still are the team that's most likely the best "true" team in the league. Even though the above system is "fair", it still quite easily allows for the championship to be awarded to a team which is probably inferior. And that's part of the point. Likely inferior teams can win, but only if they do something extraordinary in the playoffs.

2009 AL Playoffs

Now let's take a larger example. In the 2009 AL, any baseball fan could tell you that there were three dominant teams: the Yankees, the Angels, and the Red Sox, with the Yankees likely being the best of the bunch. The probabilities I calculated of being the true best team back that up. The Yankees, with six more victories than any other team, have a probability over 50%, while the Red Sox and Angels are significantly lower. The probabilities for other teams are close to 0%. So did the AL playoffs match those probabilities well? Take a look at the chart below:

[Chart: 2009 AL teams, probability of being the true best team vs. actual playoff probability (alprobs.PNG)]

The Yankees were not amply rewarded for their regular season dominance, and their playoff probabilities were much too low. Additionally, the Twins' and Tigers' probabilities were much too high. And as a final issue, the Red Sox also had too high a probability. So how could the 2009 AL playoffs have been made fairer? First, by limiting the teams to just New York, Boston, and Los Angeles, you can set the Tigers' and Twins' probabilities down to zero. Then, since New York is far ahead and LA and Boston are close together, it makes sense to have Boston and LA play each other in a five-game series, with the winner playing the Yankees. What happens if we test that scenario, with LA and New York having the home field advantage? We get the following probabilities:

Yankees: 58%
Angels: 23%
Red Sox: 19%

The Red Sox probability is a little higher than we'd like, but overall it's a pretty spot on match to the probabilities that each is the true best team in the league. Additionally, in their guts, I think most fans would agree that this would have been a fair playoff setup given the results of the regular season.
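A bracket like this can be checked with a few lines of Python. The sketch below is only illustrative: the "true" WPCT values are my own placeholders (each team's 2009 record regressed roughly 70% of the way from .500), the final is assumed to be best of seven, and home field and talent uncertainty are ignored, so it will not reproduce the numbers above exactly. All names are hypothetical.

```python
from math import comb


def log5(wpct_a, wpct_b):
    """Per-game chance that A beats B on a neutral field (log5 formula)."""
    a = wpct_a * (1 - wpct_b)
    b = wpct_b * (1 - wpct_a)
    return a / (a + b)


def series_win_prob(p, wins_needed):
    """P(A wins a first-to-`wins_needed` series) with independent games:
    A takes the clinching win after losing k games, for k < wins_needed."""
    q = 1 - p
    return sum(
        comb(wins_needed - 1 + k, k) * p**wins_needed * q**k
        for k in range(wins_needed)
    )


# ASSUMED "true" WPCTs, for illustration only (not the article's values)
talent = {"Yankees": 0.595, "Angels": 0.569, "Red Sox": 0.560}

# Round 1: Angels vs. Red Sox, best of five (first to 3 wins)
la_beats_bos = series_win_prob(log5(talent["Angels"], talent["Red Sox"]), 3)

# Final: Yankees vs. the survivor, ASSUMED best of seven (first to 4 wins)
ny_beats_la = series_win_prob(log5(talent["Yankees"], talent["Angels"]), 4)
ny_beats_bos = series_win_prob(log5(talent["Yankees"], talent["Red Sox"]), 4)

title = {
    "Yankees": la_beats_bos * ny_beats_la + (1 - la_beats_bos) * ny_beats_bos,
    "Angels": la_beats_bos * (1 - ny_beats_la),
    "Red Sox": (1 - la_beats_bos) * (1 - ny_beats_bos),
}

for team, prob in title.items():
    print(f"{team}: {prob:.0%}")
```

Swapping in different talent estimates, series lengths, or a bye structure is then just a matter of changing the inputs and comparing the output with each team's probability of being the true best team.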

2009 NL Playoffs

Now I'll move to the NL and get a little wild. The NL was more evenly spread. The Dodgers had the best record, but several other teams were close behind. Additionally, there were several lagging contenders who, because of the overall parity, could potentially be the best true team in the league. The chart below shows the probabilities for the 2009 NL:

[Chart: 2009 NL teams, probability of being the true best team vs. actual playoff probability (nlprobs.PNG)]

Overall, the probabilities are not way off like they were for the 2009 AL. However, there are still some inequities. The actual playoff probabilities are too high for each of the playoff teams and too low for the teams that did not make the playoffs. Playing around with the numbers, here's the closest I could come to evening this out:

[Chart: proposed 2009 NL playoff bracket (playoffscenario.PNG)]

As you can see, the lowly Cubs do make the playoffs. But it will take a three-game sweep of the mighty Dodgers to advance. Additionally, teams such as Atlanta and Florida also have a shot, but will need to win two straight games against their superior foes to advance. The probabilities in this scenario match well with the probabilities of each team being the best true NL team. The results are below:

Dodgers: 28%
Phillies: 21%
Rockies: 19%
Cardinals: 13%
Giants: 9%
Marlins: 4%
Braves: 4%
Cubs: 1%

Conclusion

In this way, this playoff set-up is fairer and often allows more teams to make the playoffs. Obviously, the drawback is that the playoffs aren't set in advance, with the additional drawback being that it's hard to match the probabilities exactly. So at least one team will end up getting the short end of the stick, and then they'll be mad. Additionally, really complicated playoff systems don't exactly have the best track record in major sports (see the BCS). Still, I think a scenario like this is inherently fairer in that it rewards teams in proportion to their accomplishments during the regular season - something that the current system famously does not do.

Ideally, a system like this would work pretty well for a non-major sport that was a little more flexible on its scheduling and a little less rigid in its traditions. But, to be honest, it's likely impractical at any level. Still, this method can be used to evaluate playoff structures and see where the holes are. In baseball, it's clear that inferior teams have too large an edge in the playoffs. In other sports, depending on the structure, length of season, true talent distribution, the size of the home field advantage, etc., things may be different.

Comments

One shortcoming here is that, in the AL scenario, you're not taking into account the effect that a nearly week-long layoff would have on the Yankees.

A wholly different view is this: playoffs will not definitively settle "who is best" questions, so we should regard them as simply being a chance to see baseball played at its current best level--in effect, a series of exhibition games, but with ample tangible and intangible rewards available to provide motivation.

Back in the day, a good part of the charm and attraction of the World Series was that it was the first and only time that the two teams could meet: that greatly augmented fan interest by providing season-long fodder for "who's really best", because that couldn't be tested, directly or indirectly, till the Series.

If we today had "real baseball" as our goal, rather than financial returns from the post-season, we would establish four 7-team leagues (nominally structured as one pleases, say AL E & W and NL E & W) and a two-level playoff structure, with no interleague play. Yes, not every city gets to host every team--what does that signify in an age in which television viewership greatly exceeds physical attendance? A side benefit is that every game would be directly meaningful to the league championship, and the winner in each league more likely to be the best team in that league.

Nothing like that is ever going to happen, because virtually the sole desideratum today is money, but it's fun to contemplate.

I don't think the playoff system is really meant to determine, in a serious way, who the best team is. Its purpose is really to generate revenue while providing a dramatic conclusion to the regular season. The owners would never even consider the kind of reform you're talking about, because they don't WANT the best team to win the WS on anything like a consistent basis. That would make it predictable and therefore a less compelling piece of entertainment.

Even though the three-stage playoff format is here to stay, I do think purists have a legitimate gripe about how difficult this format makes it for the best team to actually win the WS. But I think there is a simple solution that would at least somewhat address this concern: Each year, at the beginning of the playoffs, there should be a trophy awarded to the NL and AL teams who "win" the regular season. Each league would have its own trophy, named for an historical luminary associated with that league. The trophy presentation would be a part of the opening ceremonies of whatever divisional playoff series the winner was participating in.

The award would entitle the team to display a banner like the ones used to denote division championships, pennants, and WSs. Thus, finishing with the best RS record would be recognized as an achievement on a par with these other achievements.

To get the ball rolling, I would further propose awarding the trophy retroactively going back to the first year of the current playoff format.

I agree with the above two comments: the purpose of the playoffs seems to have more to do with keeping fans happy and generating additional revenue, with a tournament featuring the best or most high-profile teams, than with determining "the best" team. If we look at it this way, it's important that the teams in the playoffs are actually the teams that the fans want to see in the playoffs. For this reason, keeping traditions alive (not monkeying with an accepted system too much), representing different regions, and keeping the integrity of the leagues (though not the divisions) become important. I would add that I like pennant races and would like a playoff structure that featured them.

The Yankees can hire the Washington Generals to keep them in top shape during the week then.

What are those regressed winning percentages? Isn't an observed winning percentage an unbiased estimator of the population value? Or are you imposing a prior that all teams are .500? In general, a Bayesian approach might be helpful here....

Eric,
Yes, those are regressed assuming a distribution of true talent with mean .500 and SD .06. So that's the prior. The SD was calculated by subtracting the expected variance from the actual variance in WPCTs over a number of seasons. So you are correct, the Bayesian-style approach is used when I said I regressed each team's WPCT. Tango's got a lot on this at his Book blog if you want to read up more....

Others - Keep in mind that this system doesn't prove who's best. It simply gives teams a chance to win in proportion to the probability that they are the best. I think that the more those numbers deviate, the more unfair the sport's playoff structure "feels". I think that's true of any sport or any setup, be it MLB, beer league softball, a kids' ping-pong tournament, etc.
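For anyone who wants to try the regression described in the reply to Eric, here is a minimal Python sketch. It assumes a normal prior on true talent with mean .500 and SD .06, as stated in that reply, and approximates the sampling variance of an observed WPCT with the binomial variance at .500. That last approximation and the function name are my own assumptions, so the output may differ slightly from the figures in the article.

```python
def regress_wpct(wins, losses, prior_mean=0.500, prior_sd=0.06):
    """Shrink an observed WPCT toward the prior mean.

    Returns (estimated true WPCT, standard error of that estimate),
    using a normal prior combined with a normal approximation to the
    binomial sampling noise in the observed record."""
    games = wins + losses
    observed = wins / games
    # ASSUMPTION: sampling variance evaluated at the prior mean (.500)
    sampling_var = prior_mean * (1 - prior_mean) / games
    prior_var = prior_sd ** 2
    # Weight on the observed record; the rest goes to the prior mean
    weight = prior_var / (prior_var + sampling_var)
    estimate = prior_mean + weight * (observed - prior_mean)
    # Posterior variance: precisions (inverse variances) add
    std_error = (1 / (1 / prior_var + 1 / sampling_var)) ** 0.5
    return estimate, std_error


# 2008 regular season records: Cubs 97-64, Phillies 92-70
for team, (wins, losses) in {"Cubs": (97, 64), "Phillies": (92, 70)}.items():
    estimate, std_error = regress_wpct(wins, losses)
    print(f"{team}: true WPCT ~ {estimate:.3f} (SE {std_error:.3f})")
```

With these inputs the estimates should come out within a few thousandths of the .569 and .548 used in the Cubs and Phillies example, with a standard error close to .032.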

I think the whole point of the playoffs is to randomize the winner. I think American owners are afraid that if the champion is too predictable before the season, a la European football, then the fans will lose interest. Hence the fact that lower-seeded teams have too much of a chance in the playoffs is not a bug, it's a feature.

I think you missed what are really the two main reasons for playoffs.

The original reason for playoffs was to determine which team was best when the teams did not play each other in the regular season. The World Series prior to inter-league play is a prime example. Since the American League champion never played any National League teams during the season, there was no way to know whether they were better than the National League without going head to head in the post-season. This also applies, to a lesser extent, to the present day situation in which teams play more games against teams in their own division than against teams not in their division.

The second reason for playoffs is money, plain and simple. Adding more teams to the post-season increases broadcast and ticket revenue. More playoff spots mean that more teams get to play meaningful games late in the season. More playoff games means more commercial time can be sold.

The college basketball playoffs just expanded to 68 teams even though none of the additional slots will go to teams that have even a 2% chance of making it to the final game. To me it is obvious that the new slots were only added to increase revenue and not because there are truly deserving teams who got left out under the 65 team setup.

But no matter how you set up a playoff system, you can never be sure that the best team will win. If the 2010 MLB playoffs were repeated 100 times there's a good chance that all 8 of the participating teams would win the World Series at least once. The teams are all good and in a short series, anything can happen.

One final thought regarding the futility of playoffs is the question of what is meant by a "team". Is it the team as it was constituted in April? Or is it the team that took the field in July? Or is it the team that played in September? Imagine if the Cardinals make the playoffs but Albert Pujols and Matt Holliday break their wrists on the last day of the season and can't play in the post-season. Obviously the "team" that St Louis puts on the field for the playoffs is not the same team that was so good from April to September.

As fans, we love to debate which teams are the best, but the reality is that teams are in a constant state of flux. The team that is best today might be mediocre tomorrow as a result of a couple of injuries to key players.

Just a thought here: How about realizing that no matter how you manage it, awarding a championship is unfair? Let's suppose that one team won 150 games one year, and is 99.9% sure of being the best team in baseball. No one would argue with that. However, it's still not fair because some of the 18 players on that team are not, in fact, among the best 18 players in baseball. In fact, there is a pretty good chance that even with that scenario, the best player in baseball is not on that team.

So, should we now divvy up the championship between players? Or should we recognize that championships are one, just one, way of enjoying a season? It seems to me that baseball and college football are the 2 sports which most appreciate the good that goes with a season that does not involve championships. Why devise a complicated way of making championships the be-all and end-all in these sports? Why not simply devise other ways, such as in baseball individual awards and recognition, and in college football the bowl system, to ensure that teams not winning championships also have a little fun?

If you want to get closer to the true percentages, why not just give teams a run handicap? Give the Yankees a 1-0 lead to start off the games, for example.

Many people have rightly pointed out that the point of the playoffs isn't really to "determine the best team". Look at this another way...perhaps (as I see it) the point of the regular season is to determine who deserves to be in the playoffs (and thereby have a right to play for the championship)?

Even if you decide the playoffs are for determining "the best team", using regular season results to handicap things seems to be the unfair thing to me. The regular season is affected by a lot of factors, especially things like injuries...what if there is a team that is twice as talented as the next closest team, but suffers many injuries (but gets healthy in time for the playoffs)? I don't want to see this team handicapped. Teams already benefit from having a better record in the regular season, due to confidence near season's end that the playoffs will be earned, and because the team will be able to set its rotation and rest players.

Despite the fact that the 2009 Yankees had the best regular season record AND won the World Series, I'm still not convinced that they were the "best" (i.e. most talented?) team. However, they were a team with a lot of good health, a great level of base talent, and much of that talent performed to or above expectations. They also took advantage of their home ballpark and won many games in lucky fashion, then used the benefits of the strong regular season (including rest and the ability to choose the days they would play, which let them use a three-man rotation for the playoffs).

The 2009 Yankees deserved and earned their championship, but it didn't definitively prove anything about a "best team". Why would you ever want to reward what could be nothing more than good luck with a playoff advantage?

Another thought I had on this, though I think it applies more to some other sports such as football, is that such a system would effectively eliminate the highly successful teams that just "coast" at the end of the season - every game really does matter now, even after "clinching" first place in a league.

Much of the variability in baseball lies in the fact that baseball is more truly an individual sport than a team one. Other than infield groundouts or outfield assists, there is not much to be said for one player's performance being based on another.

Another example is with pitching. The 1995 Braves had Maddux, Glavine, and Smoltz, who had a combined 47-16 record that year. But every 4th day, you would get Steve Avery, who with a 7-13 mark was quite obviously not up to par. However, when you enter the playoff scenarios, suddenly this 1995 team became significantly better, as only those three starters were required, and Avery was relegated to bullpen duty. Perhaps there is some meat to be found in analyzing the 'probably superior' teams and their chances of winning in the playoffs, based on this new 3-man rotation.

What if there was no such thing as a monolithic "best team in baseball"? What if team quality -- team "true skill" -- changes on a monthly, daily, or even hourly basis? What if the best team yesterday is not the best team tomorrow, or the best team at this exact moment is never the best team again for the rest of the season, but reached a level for however brief a time that will not be matched by any other team?