Behind the Scoreboard November 17, 2009
MVP Award Balloting: Is It Fair?

The MVP and Cy Young Awards are closely upon us, and soon we'll know this year's choices. As you know, the balloting for these awards is done by two baseball writers from each city. I'll spare the indignation about why award choices are limited to just two BBWAA members when there are a host of other highly qualified people who could be consulted on the awards, and concentrate on the balloting process.

For the MVP award, voters rank their top 10 choices. Each 1st place vote receives 14 points, each 2nd place vote receives 9 points, each 3rd place vote receives 8 points, etc., down to where each 10th place vote receives 1 point. The candidate with the most total points is the MVP. This weighting system seems fair enough. But is it? Why shouldn't a first place vote be worth 10 points? Or 20 points?

What's more, the Cy Young does things differently. There, the writers only select their top three players for the award. A first place vote gets 5 points, a second place vote gets 3 points, while a third place vote gets 1 point. This strikes me as odd, since it would seem that a system good enough for the MVP would be good enough for the Cy Young, and vice-versa.
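As a rough illustration of the two point schemes just described, here is a minimal Python sketch. The point values come from the text above; the function names, the ballot format (a list of player names ordered from 1st place down), and the sample ballots are assumptions made up for the example.

```python
from collections import defaultdict

# Point values by ballot position, as described in the article.
MVP_POINTS = [14, 9, 8, 7, 6, 5, 4, 3, 2, 1]
CY_YOUNG_POINTS = [5, 3, 1]


def tally(ballots, points):
    """Sum points per player; each ballot lists players from 1st place down."""
    totals = defaultdict(int)
    for ballot in ballots:
        # Ignore any names listed beyond the number of scored positions.
        for position, player in enumerate(ballot[:len(points)]):
            totals[player] += points[position]
    return dict(totals)


def winner(ballots, points):
    """Player with the most points (ties go to whoever was tallied first)."""
    totals = tally(ballots, points)
    return max(totals, key=totals.get)


# Hypothetical three-voter Cy Young ballots; the names are placeholders.
example_ballots = [
    ["Pitcher A", "Pitcher B", "Pitcher C"],
    ["Pitcher B", "Pitcher A", "Pitcher C"],
    ["Pitcher A", "Pitcher C", "Pitcher B"],
]
print(tally(example_ballots, CY_YOUNG_POINTS))
print(winner(example_ballots, CY_YOUNG_POINTS))
```

Passing MVP_POINTS and ten-name ballots to the same function gives the MVP tally, which is why the two awards could in principle share one system.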

Ballot Weighting Based on Empirical Win Values

An alternate, perhaps more accurate, method of balloting would be to have each voter assess the value of each player (perhaps measured in wins). Each voter would give a value score and the player with the highest average value among the writers would be the MVP. While in theory this would work, in practice this would probably be a mess. Writers would be working off of different internal scales and the votes would be all over the map. One guy might think that his first place choice is 10 times as valuable as his 10th place choice, while another guy might think that his first place choice is only twice as valuable. While this could represent real differences between the two writers' valuation of these players, more likely it would be a function of different perceptions of value, and the different scales each writer is using in their heads.

Because of these issues, the 1 through 10 balloting that currently takes place is probably the way to go. This essentially forces everyone to be on the same valuation scale. The #1 choice gets 14 times the weight of a #10 choice regardless of the individual writer's evaluation of their relative worth. But is the 14-9-8 scale that is used a good one?

Going with the theory that the weighting system should correspond proportionately to the value of the player, let's look at the Wins Above Replacement (WAR) values for the top players over the past 25 years. The following list shows the average WAR value for players ranked 1 through 10. The #1 player averaged 9.4 WAR, while the #2 player averaged 8.3 WAR. Meanwhile the #10 player averaged 5.9 WAR.

Needless to say, if we used these weights for the MVP balloting, the results would be vastly different. However, this wouldn't be right either, because it assumes that anybody left off of a ballot altogether has a value of 0. Of course, any writer would consider a serious MVP candidate to have a value far greater than zero, even if he did leave that player off his ballot. So, how to evaluate those unranked players? Since the writer didn't rank that player, we don't know how he values him. Assuming an 11th place vote for players left off the ballot seems a bit too optimistic, but a serious MVP candidate couldn't be too far behind. Subjectively, it seems reasonable to me to assume a 13th place ranking for unballoted MVP candidates - giving them an estimated WAR of 5.5.

Using this WAR scale (9.4 points for a 1st place finish, 8.3 points for 2nd place...5.9 points for 10th place, and 5.5 points for unranked players) would probably be the most fair ballot weighting system. How does this compare with the system MLB actually uses? While the weights seem to be very different, this is mostly because the systems are on two different scales. To make them comparable, we can convert the WAR system to a scale where 0 points are given to a player left off the ballot and there are 59 total points doled out altogether. When we do this, we see that in fact the two balloting systems are extremely similar.
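The conversion described above can be sketched in Python as follows. This is one reading of the procedure, not necessarily the exact calculation used for the article: subtract the unranked baseline from every rank's WAR so an off-ballot player scores 0, then scale so the weights sum to the desired total. The function name and the toy input are assumptions; the toy numbers are deliberately not the article's WAR data.

```python
def rescale_weights(war_by_rank, unranked_war, total_points=59):
    """Shift so an unranked player gets 0, then scale to sum to total_points."""
    shifted = [war - unranked_war for war in war_by_rank]
    scale = total_points / sum(shifted)
    return [round(value * scale, 1) for value in shifted]


# Toy input for illustration only (NOT the article's WAR averages):
# three ranked slots, an unranked baseline of 1.0, and 7 points to hand out.
toy_weights = rescale_weights([3.0, 2.0, 1.5], unranked_war=1.0, total_points=7)
print(toy_weights)
```

To apply it to the MVP ballot, war_by_rank would hold the ten average WAR values for ranks 1 through 10, with unranked_war=5.5 and total_points=59.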

Overall, the WAR system advocates giving slightly more weight to players who finish 1st and 2nd in the balloting, while giving slightly less weight to those thereafter, with the exception of the 10th place vote. In particular, second place votes are undervalued (they are worth 9 points, whereas they should be worth 10.3 points). In all, however, there is very little to quibble with. If I were starting from scratch I would choose a 15-10-8-7-etc. system instead, but this is a very small difference. Kudos to Major League Baseball, which has used the same ballot weights since 1938. It really got it right with its MVP ballot system.

How about the Cy Young? As I mentioned previously, the current system gives 5 points for first place, 3 points for second, and 1 point for third. Going through the same process as above for pitchers only, the WAR scale recommends 4.9 points for first place, 2.6 points for second and 1.4 points for third. Again, this scale is fairly similar to the one used by MLB, though MLB slightly overvalues second place votes and undervalues third place votes. Though it might be better to go with a 14-9-8-etc. system (or a 15-10-8-etc. system) just so writers have a chance to rank more players, the current system works pretty well too.

Conclusion

Overall, the method by which MLB chooses its MVP and Cy Young Awards isn't the most important thing on Earth. However, it's nice to know that MLB is doing something right. It would have been fairly easy to screw these up. For instance, a 10-9-8-etc. MVP ballot system would be off from reality by quite a bit. However, the systems currently in place do a good job of reflecting the actual differences in value of players as ranked by the sportswriters. Whoever was initially responsible for this system did his job well. For once, it's nice that the traditional way of doing things is also the right one.

Since there can only be one winner for these awards, it strikes me that an alternative vote/ instant runoff system would make sense.

As a caveat, in the US this sort of system is only used for local elections in San Francisco, and was adopted quite recently. It is also used in Australian elections. Actual two-ballot runoff systems are used more widely, but I doubt MLB would want to go through the expense of running two ballots.

The way alternative vote works is that voters rank their first, second, third choices, etc., for the awards as before. A player obtaining a majority (not just a plurality) of first-choice votes would get the award. If no player obtains a majority, the player with the fewest votes is eliminated, and the next-ranked choices of the people who voted for him are counted as first choices and redistributed. Eventually someone gets a majority. Instant runoff works much the same way, except only second choices are considered, so the system is somewhat similar but less finely grained.
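For anyone who wants to try the elimination process on old ballots, here is a minimal Python sketch of it, under a few assumptions of mine: each ballot is a list of names from first choice down, a ballot with no surviving names simply stops counting, and ties for last place are broken alphabetically (a real election would need an agreed tie-break rule). The function name and sample ballots are made up.

```python
from collections import Counter


def alternative_vote(ballots):
    """Eliminate the last-place candidate each round until someone has a majority."""
    remaining = {candidate for ballot in ballots for candidate in ballot}
    while remaining:
        # Count each ballot for its highest-ranked candidate still in the race.
        counts = Counter()
        for ballot in ballots:
            for candidate in ballot:
                if candidate in remaining:
                    counts[candidate] += 1
                    break
        active = sum(counts.values())
        leader, leader_votes = counts.most_common(1)[0]
        if leader_votes * 2 > active:  # strict majority of ballots still counting
            return leader
        # Drop the candidate with the fewest votes (alphabetical tie-break).
        loser = min(remaining, key=lambda c: (counts[c], c))
        remaining.discard(loser)
    return None  # only reached when there are no ballots at all


# Hypothetical five-voter panel; the names are placeholders.
sample_ballots = [
    ["A", "B", "C"],
    ["A", "C", "B"],
    ["B", "C", "A"],
    ["B", "C", "A"],
    ["C", "B", "A"],
]
print(alternative_vote(sample_ballots))
```

In the sample, nobody has a majority in the first round (A 2, B 2, C 1), so C is eliminated and that ballot moves to B, who then wins 3 to 2.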

Granted, this would work somewhat differently for sports awards than in the political arena. But I've noticed in the history of these awards, usually the winner actually gets a large minority of first choice votes. I'm not sure what the rationale for weighted votes really is.

Hi Ed, Good thought. The difference is that in a political arena, people are voting for their favorite candidate. In an awards scenario, we have a panel of (presumed) experts objectively ranking the players' value. In this scenario, unlike the political one, there is no "strategic" voting (ex. leaving Mauer off the ballot in hopes that Jeter will win), hence no need for a runoff.

The weighted scenario allows a consensus #2 pick to be the MVP over a guy who some experts think is #1 but other experts think is #10. Because we trust the voters to give accurate assessments of a player's value, using the weighting method provides a lot more information and a much better result.
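A quick worked example of that point, as a Python sketch using the 14-9-8-...-1 MVP weights. The ten-voter panel and the two vote splits are hypothetical, and the comparison looks only at these two players.

```python
MVP_POINTS = [14, 9, 8, 7, 6, 5, 4, 3, 2, 1]


def points_for(ranks):
    """Total points for one player, given the rank (1-10) each voter assigned."""
    return sum(MVP_POINTS[rank - 1] for rank in ranks)


# Hypothetical 10-voter panel.
consensus = points_for([2] * 10)              # ranked 2nd on every ballot
polarizing = points_for([1] * 5 + [10] * 5)   # five 1st-place, five 10th-place votes
print(consensus, polarizing)
```

The consensus pick finishes with 90 points against 75 for the polarizing one, even though he received no first place votes.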

So, does your system, which seems ok to me, alter historical results of who won the MVP or the Cy Young awards? I don't know where the data is, but if you do maybe that would be an interesting follow up article.

Ed, IRV is actually used in lots of jurisdictions throughout the US other than just SF (http://www.fairvote.org/?page=1960) and was recently implemented for future Academy Awards as well.

I agree with Sky on this. I'm a huge supporter of IRV in political elections but think it is unnecessary for MVP voting, for the reasons that Sky lists as well as the fact that people like to know who came in second, third, etc. Using IRV we would still be able to "know" that, but it would be less accurate, as the 2nd place votes would not represent voters' true beliefs. IRV is only accurate at determining which person is able to gain a majority of the vote.

Great question Wimbo. I don't think it would change a huge amount - probably only for the closest races since MLB's weights are pretty accurate.

In the 1979 race, Keith Hernandez and Willie Stargell tied. Under my system, Hernandez should have beaten Stargell very slightly (224.9 to 223.2). Hernandez got a lot more second place votes (8 vs. 3), which are undervalued, hence giving him the "victory".

Not sure where most of that data is, but I got the 1979 data from right here at Baseball Analysts, in Bill Dean's article from 2006. http://baseballanalysts.com/archives/2006/05/who_was_really.php

This is the fairest way (called the Borda method) if you don't have a select pool of 10 candidates or 3 to start with. Otherwise, you cannot use IRV or Range voting.

If you had the pool of 10 or 3 then range voting would be the best, when YOU give them a rank 00 - 99.

Add them up, and you have a winner. IRV has too many flaws, and has been dumped in several cities because of this.

Recent YouTube Postings are quite comical in pointing them out:

This is very interesting, I'd never given much thought to the exact weighted values assigned in these votes, and I'm pleased that you found the values are sensible.

Anyone have any idea how the writers settled on these values in the first place? Was there an established procedure back then for these votes, or did they hire an expert to determine fair numbers?

Re: Got the System Right

I assure you that whoever thought of the 14-9-8-7-etc. MVP ballot system got it right by mere accident. I believe this system has been in place since 1931, and I doubt VORP and WAR were factors in their balloting method. Someone should do an analysis of all the "bad" winners/ballots over the years... the 1964 NL MVP, for instance.

Sammy,

Borda is a little better than IRV but definitely worse than score voting (aka "range voting").

Here are Bayesian Regret figures to prove that:
http://scorevoting.net/UniqBest.html

Here's a much more thorough analysis:
http://scorevoting.net/rangeVborda.html