The Science of Playoffs
What is the point of having playoffs? I mean this not as a snarky rhetorical, but as a real question. Playoffs are a given in every major sport today, but it wasn't always that way. Before 1969, there were no playoffs in baseball, just one seven-game series between the winners of two completely separate leagues. Other sports were quicker to adopt playoffs. The NFL adopted playoffs in 1933 - a one-game championship between the winners of the East and West divisions. The NBA had extensive playoffs almost from its inception, allowing 12 of its 17 teams to make the playoffs in 1950. The NHL was also an early adopter. Since then, the NBA has scaled back its playoffs, while MLB has expanded its own. With all of these different approaches to playoffs, who is right? To answer that, we first have to determine the point of the playoffs.
One way to look at it is that playoffs shouldn't be necessary at all unless there's a tie for first place. If the goal is to choose the best team and crown them as champion, then that would be the ideal approach. For instance, in 2009, the Yankees had a record that was six games better than any other AL team. Is there anything that could have happened in the playoffs to overturn the evidence that the Yankees were the best AL team in 2009? Not really. If we assume that teams remain static over the course of a season, the best way to evaluate a team's skill is to look at its overall record, including the playoffs (schedule adjusted). Even had the Yankees been swept out of the first round, the evidence would still point to the conclusion that New York had the best ballclub. So if it's a certainty that the Yankees were the best AL team, then what's the point of having any playoffs? Why not just send them straight to the World Series? That's one view, and a view baseball held until 1969.
But here's another view. The statement above is actually incorrect. It is not a certainty that the Yankees were the best team. What is certain is that, no matter what happens in the playoffs, the Yankees were most likely the best team. But because of variability, we can't ever be certain who's the best, and sometimes we may not be very close to certain at all.
Consider a scenario in which, based on their records, one team has a 70% chance of being the "true" best team. Meanwhile, there's a 30% probability that a second team is the "true" best team. What do we do? One approach is to simply give the first team the championship. They are most likely the best team, and so they should be given the title. But the second team might be mad. "Hey, we deserve 30% of that championship!" Well, they don't give out parts of championships. But you could, in essence, give them 30% of a championship by giving them a 30% chance to win one whole championship. This could take place by having the commissioner pick a ball out of a lottery machine at the end of the season. The machine would be filled so that 70% of the balls belong to Team A and 30% belong to Team B. If your team's ball comes up, it is awarded the championship. They could do that. But it would be a pretty awful way to end a season.
Championships should be decided on the field, not by ping pong balls. The solution to the uncertainty? Playoffs. Instead of drawing a lottery, you can simply set up playoffs. And, if you structure the playoffs so that Team A has a 70% chance to win and Team B has a 30% chance to win, then you will have achieved the same effect. Except now the championship is decided on the field. These playoffs give each team a chance to win in proportion to the probability that it was the best team in the regular season. If your team, based on the regular season, has a 10% chance of being the "true" best team, then you will have a 10% chance of winning the playoffs and claiming the championship. Seems fair to me.
Of course, that's only if you set up the playoffs just right. If you were to make the National League playoffs a 16-team NCAA-style single elimination knockout tournament, those two probabilities would not even be close to matching one another. The chance that a bad team could win the tournament would be much higher than the probability that they were the best team during the regular season.
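To make the knockout problem concrete, here is a rough sketch that computes exact title probabilities for a single-game, single-elimination bracket. Everything in it is an assumption for illustration: the 16 "true" winning percentages are made up (evenly spaced from .600 down to .400, not real NL teams), the bracket order is arbitrary, and the per-game win probability comes from Bill James's log5 formula rather than from anything calculated in this article.

```python
def log5(a, b):
    """Per-game probability that a team with true WPCT a beats one with true WPCT b
    (log5 formula; an assumed model, not the article's own calculation)."""
    return a * (1 - b) / (a * (1 - b) + b * (1 - a))


def knockout_title_probs(strengths):
    """Exact title probabilities for a single-game knockout bracket.

    strengths: true WPCTs in bracket order; length must be a power of two.
    Returns a list with each team's probability of winning the whole thing.
    """
    n = len(strengths)
    alive = [1.0] * n  # alive[i] = P(team i has survived to the current round)
    size = 1           # number of teams in each sub-bracket entering this round
    while size < n:
        nxt = [0.0] * n
        for i in range(n):
            sibling = (i // size) ^ 1  # the sub-bracket team i must face next
            start = sibling * size
            # Weight each possible opponent by its chance of having survived.
            p_win = sum(alive[j] * log5(strengths[i], strengths[j])
                        for j in range(start, start + size))
            nxt[i] = alive[i] * p_win
        alive = nxt
        size *= 2
    return alive


# Hypothetical league: 16 teams spaced evenly from .600 down to .400.
strengths = [0.600 - i * 0.200 / 15 for i in range(16)]
probs = knockout_title_probs(strengths)
for s, p in zip(strengths, probs):
    print(f"true WPCT {s:.3f}: title probability {p:.3f}")
```

Under these made-up numbers, even the .400 team keeps a real shot at the title, although nobody would claim it has a meaningful chance of being the best team in the league. That gap is exactly the mismatch described above.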
Let's take a simple example with just two teams. In 2008, the Cubs won 97 games, with a .602 WPCT. The Phillies won 92 games, with a .568 WPCT. If we regress these to the mean, which I won't go into here, we get the Cubs with a predicted "true" WPCT of .569 and the Phillies with a predicted "true" WPCT of .548. Each of these has a standard error of about .032. Hence, the probability that the Cubs are "truly" better than the Phillies is about 68%. So, in order to be "fair", a playoff series should be structured so that the Cubs have a 68% chance of winning. However, in a seven-game series with home field advantage to the Cubs, Chicago has only a 56% chance to win (this includes the fact that the Cubs are likely better than the Phillies). But 56% is too low. Their five-game lead is substantial, and should not be so easily erased by a simple best-of-seven series. So how about we change things up? We still play a seven-game series, but we spot the Cubs a 1-0 lead. Running the numbers again, now the Cubs have a 69% chance of winning the series. That's almost perfect! It gives the Cubs an advantage, but the Phillies still have a chance to win. And they can do it by winning just four out of six games.
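For readers who want to check the arithmetic, here is a minimal sketch of the two calculations. It rests on assumptions that are mine, not necessarily the ones behind the figures above: the two "true" WPCT estimates are treated as independent normal variables, the per-game win probability comes from the log5 formula, every game is treated as independent with the same probability, and home field advantage is ignored. Because of those simplifications, the series numbers come out close to, but not identical to, the 56% and 69% quoted above.

```python
from math import erf, sqrt


def prob_truly_better(wpct_a, wpct_b, se):
    """P(team A's true WPCT exceeds team B's), assuming both estimates are
    independent normals that share the same standard error."""
    z = (wpct_a - wpct_b) / (se * sqrt(2))  # the difference has SE = se * sqrt(2)
    return 0.5 * (1 + erf(z / sqrt(2)))     # standard normal CDF at z


def log5(a, b):
    """Per-game probability that A beats B (log5; an assumed model)."""
    return a * (1 - b) / (a * (1 - b) + b * (1 - a))


def series_win_prob(p, a_needs, b_needs):
    """P(team A wins the series) when A still needs a_needs wins, B still needs
    b_needs wins, and A wins each game independently with probability p."""
    if a_needs == 0:
        return 1.0
    if b_needs == 0:
        return 0.0
    return (p * series_win_prob(p, a_needs - 1, b_needs)
            + (1 - p) * series_win_prob(p, a_needs, b_needs - 1))


cubs, phillies, se = 0.569, 0.548, 0.032  # regressed values from the text

p_better = prob_truly_better(cubs, phillies, se)
p_game = log5(cubs, phillies)
straight_up = series_win_prob(p_game, 4, 4)  # ordinary best-of-seven
head_start = series_win_prob(p_game, 3, 4)   # Cubs spotted a 1-0 lead

print(f"P(Cubs truly better):       {p_better:.3f}")
print(f"Best-of-seven, no lead:     {straight_up:.3f}")
print(f"Best-of-seven, Cubs up 1-0: {head_start:.3f}")
```

The same `series_win_prob` function handles any head start or series length, so it can be used to search for the structure whose win probability lands closest to the "truly better" probability.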
The above set-up is the fairest one to determine the championship between the Phillies and Cubs. Of course, purists will say that the Cubs should be awarded the championship regardless. After all, if the Phillies win 4 out of 6 games, the Cubs will still have a better record than Philadelphia. Hence, the Cubs still are the team that's most likely the best "true" team in the league. Even though the above system is "fair", it still quite easily allows for the championship to be awarded to a team which is probably inferior. And that's part of the point. Likely inferior teams can win, but only if they do something extraordinary in the playoffs.
2009 AL Playoffs
Now let's take a larger example. In the 2009 AL, any baseball fan could tell you that there were three dominant teams: the Yankees, the Angels, and the Red Sox, with the Yankees likely being the best of the bunch. The probabilities I calculated of being the true best team back that up. The Yankees, with six more victories than any other team, have a probability over 50%, while the Red Sox and Angels are significantly lower. The probabilities for other teams are close to 0%. So did the AL playoffs match those probabilities well? Take a look at the chart below:
The Yankees were not amply rewarded for their regular season dominance, and their playoff probabilities were much too low. Additionally, the Twins' and Tigers' probabilities were much too high. And as a final issue, the Red Sox also had too high a probability. So how could the 2009 AL playoffs have been made fairer? First, by limiting the teams to just New York, Boston, and Los Angeles, you can set the Tigers' and Twins' probabilities down to zero. Then, since New York is far ahead and LA and Boston are close together, it makes sense to have Boston and LA play each other in a five-game series, with the winner playing the Yankees. What happens if we test that scenario, with LA and New York having the home field advantage? We get the following probabilities:
Red Sox: 19%
The Red Sox probability is a little higher than we'd like, but overall it's a pretty spot-on match to the probabilities that each is the true best team in the league. Additionally, in their guts, I think most fans would agree that this would have been a fair playoff setup given the results of the regular season.
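A three-team bracket like this one is easy to evaluate by hand or in a few lines of code. The sketch below is illustrative only: the per-game probabilities are placeholders I made up, not the values behind the numbers above; home field advantage is ignored; and the final is assumed to be a best-of-seven, which the setup above does not actually specify.

```python
from math import comb


def best_of(n_games, p):
    """P(winning a best-of-n_games series) with a constant per-game win
    probability p and independent games."""
    need = n_games // 2 + 1
    # Sum over k = number of losses suffered before the clinching win.
    return sum(comb(need - 1 + k, k) * p**need * (1 - p)**k for k in range(need))


def three_team_title_probs(p_la_vs_bos, p_ny_vs_la, p_ny_vs_bos,
                           semi_games=5, final_games=7):
    """Title probabilities when LA and Boston play a semifinal and the winner
    meets New York. The final's length is an assumption."""
    la_advances = best_of(semi_games, p_la_vs_bos)
    bos_advances = 1 - la_advances
    ny_beats_la = best_of(final_games, p_ny_vs_la)
    ny_beats_bos = best_of(final_games, p_ny_vs_bos)
    return {
        "Yankees": la_advances * ny_beats_la + bos_advances * ny_beats_bos,
        "Angels": la_advances * (1 - ny_beats_la),
        "Red Sox": bos_advances * (1 - ny_beats_bos),
    }


# Placeholder per-game probabilities (hypothetical, for illustration only).
title = three_team_title_probs(p_la_vs_bos=0.52, p_ny_vs_la=0.55, p_ny_vs_bos=0.55)
for team, p in title.items():
    print(f"{team}: {p:.3f}")
```

Changing `semi_games` and `final_games`, or the placeholder probabilities, shows how much of the title probability each structural choice moves from one team to another.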
2009 NL Playoffs
Now I'll move to the NL and get a little wild. The NL was more evenly spread. The Dodgers had the best record, but several other teams were close behind. Additionally, there were several lagging contenders who, because of the overall parity, could potentially be the best true team in the league. The chart below shows the probabilities for the 2009 NL:
Overall, the probabilities are not way off like they were for the 2009 AL. However, there are still some inequities. The actual playoff probabilities are too high for each of the playoff teams, and they are too low for the teams that did not make the playoffs. Playing around with the numbers - here's the closest I could come to evening this out:
As you can see, the lowly Cubs do make the playoffs. But it will take a three-game sweep of the mighty Dodgers to advance. Additionally, teams such as Atlanta and Florida also have a shot, but will need to win two straight games against their superior foes to advance. The probabilities in this scenario match well with the probabilities of each team being the best true NL team. The results are below:
In this way, this playoff set-up is actually both more fair and often allows more teams to make the playoffs. Obviously, the drawback is that the playoffs aren't set in advance, with the additional drawback being that it's hard to match the probabilities exactly. So at least one team will end up getting the short end of the stick, and then they'll be mad. Additionally, really complicated playoff systems don't exactly have the best track record in major sports (see the BCS). Still, I think a scenario like this is inherently fairer in that it rewards teams in proportion to their accomplishments during the regular season - something that the current system famously does not do.
Ideally, a system like this would work pretty well for a non-major sport that was a little more flexible on its scheduling and a little less rigid in its traditions. But, to be honest, it's likely impractical at any level. Still, this method can be used to evaluate playoff structures and see where the holes are. In baseball, it's clear that inferior teams have too large an edge in the playoffs. In other sports, depending on the structure, length of season, true talent distribution, the size of the home field advantage, etc., things may be different.