Measuring Offense with Batting Runs
Two weeks ago, we looked at the fielding performance of all major leaguers (well, all but catchers and pitchers). I figured it wouldn’t hurt to take a look at offensive performance here today.
When you talk about offensive metrics, well, you’ve got a lot to talk about. You’ve got linear methods (like Pete Palmer’s Batting Runs), multiplicative methods (like Bill James’ Runs Created), rate stats (OBP/SLG, wOBA, GPA, etc.), and a bunch of other things you could do. Really, your stat of choice should depend heavily upon what question you’re trying to answer. Anyway, rather than try to recap the history of run estimation, something I would inevitably fail miserably at, let me just explain what I did.
Palmer’s Batting Runs
That’s essentially the stat we’ve calculated, and you can read a little about it here. It should be very similar to the number located on each player page at Baseball-Reference.com (“BtRns” under Special Batting). If you’re new to this stuff, well, the process is actually pretty simple. You take a player’s stats (singles, doubles, triples, etc.) and multiply each one by the corresponding weight in the formula. So, if Milton Bradley has 53 singles, you multiply that by .47, then multiply his doubles by .85, and so on. At the end, you subtract his outs times roughly .3. Base stealing is added in separately, and is simply .22*SB - .38*CS.
What you end up with is the number of runs above (or below) average a player has produced in his given playing time.
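If it helps to see that arithmetic as code, here is a minimal Python sketch of the calculation described above. The single, double, home run, stolen base, and out weights are the ones quoted in this article. The triple (1.02) and walk (.33) weights are not given here and are assumed values, and the stat line is made up for illustration apart from the 53 singles. How you count outs (for example, at-bats minus hits) is also left as an assumption for the caller.

```python
# Weights quoted in the article, plus two ASSUMED values (marked below).
WEIGHTS = {
    "1B": 0.47,
    "2B": 0.85,
    "3B": 1.02,  # assumption: not stated in the article
    "HR": 1.40,
    "BB": 0.33,  # assumption: not stated in the article
}
SB_WEIGHT = 0.22
CS_WEIGHT = -0.38


def batting_runs(stats, out_value=0.286):
    """Runs above or below average from hitting events.

    `stats` maps event names to counts and must include "outs".
    `out_value` defaults to the AL figure mentioned in the article.
    """
    positive = sum(weight * stats.get(event, 0) for event, weight in WEIGHTS.items())
    return positive - out_value * stats["outs"]


def stolen_base_runs(sb, cs):
    """Base stealing runs, added in separately: .22*SB - .38*CS."""
    return SB_WEIGHT * sb + CS_WEIGHT * cs


# Hypothetical stat line; only the 53 singles come from the article.
example_line = {"1B": 53, "2B": 20, "3B": 1, "HR": 15, "BB": 50, "outs": 200}

total = batting_runs(example_line) + stolen_base_runs(sb=4, cs=2)
print(f"{total:.2f}")  # roughly 23.35 runs above average
```

Keeping the weights in one table makes it easy to swap in a different set, which matters because the answer you get depends entirely on which weights you trust.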
Adjusting for Parks
Surely, we want to make some adjustment for the park that a player plays his home games in. To do this, we take the league’s out value (.286 for the AL) and multiply it by the player’s park factor. For, let’s say, Jason Varitek, we penalize him .297 (.286*1.04) for each out, rather than .286. If we go through and do this for every player, we have a pretty decent park adjustment*. By the way, I used Patriot’s park factors.
*There is a more complicated, more technically correct way to make this adjustment. The difference, however, is pretty tiny, so I’m just sticking with the simpler adjustment.
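The simple adjustment can be sketched in a few lines of Python. The .286 out value and the 1.04 park factor are the Varitek figures from the text; the function names and the 300-out total are hypothetical, chosen only to show the size of the effect.

```python
def park_adjusted_out_value(league_out_value, park_factor):
    """Scale the cost of an out by the player's park factor."""
    return league_out_value * park_factor


def outs_penalty(outs, league_out_value, park_factor):
    """Total runs subtracted for a player's outs after the park adjustment."""
    return outs * park_adjusted_out_value(league_out_value, park_factor)


# Figures from the article: AL out value .286, park factor 1.04.
per_out = park_adjusted_out_value(0.286, 1.04)
print(round(per_out, 3))  # 0.297

# Hypothetical playing time: 300 outs in a neutral park vs. the adjusted park.
neutral = outs_penalty(300, 0.286, 1.00)
adjusted = outs_penalty(300, 0.286, 1.04)
print(round(adjusted - neutral, 1))  # extra runs charged because of the park
```

Because only the out value changes, a hitter in a hitter-friendly park gets charged a bit more for each out while his hits keep the same weights.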
As I understand it, a linear weights type method is the best way to go for individual hitters. While something like Runs Created is a fine run estimator for teams, it will often overvalue great hitters, because it treats a hitter as if he were interacting with himself in a lineup of clones rather than with his actual teammates (i.e., there aren’t nine Albert Pujolses in the batting order, but rather one Pujols and eight mortals). Batting Runs doesn’t make that assumption; it values each event as if it happened in an average team context.
And unlike, say, OPS, probably the most popular stat on the internet, we actually know what Batting Runs is measuring – runs! We know there’s a difference between an .800 OPS and a .900 OPS, but we don’t really know what one point of OPS is worth. The difference between 30 Batting Runs above average and 20 in, let’s say, 400 PA, is 10 runs. Pretty simple and straightforward.
The negatives have more to do with the simplicity of my calculation than anything else. There are things you can (and probably should) add like double play adjustments, a different out value for strikeouts, and so on. It all depends on how accurate and detailed you want to get. Next time we do this, probably at the end of the year, we’ll use a more detailed formula.
Furthermore, the weights used here are long term averages and are not based on any specific context. For instance, if you want to know how many runs J.D. Drew added to the Red Sox, rather than to an average team, you’d probably want to look at something like Custom Linear Weights.
Also, remember that this method counts, say, every home run as 1.40 runs, as that is what it’s worth in the long run. However, if a player has a particularly clutch year or something, he’s obviously getting undercut here. Going back to what I said earlier, it really depends on what exactly you want to measure.
Finally, this is just one year’s worth of stats, and does not represent a player’s true talent. To find that, or at least estimate it, you’d want multiple years of data, regression to the mean, an age adjustment, and so on.
Alright, enough babbling, let’s see some numbers. Here are the top 15 hitters in each league:
Rank  AL                      NL
 1.   Rodriguez, NY    38.2   Pujols, Stl      52.2
 2.   Bradley, Tex     34.4   Berkman, Hou     47.4
 3.   Sizemore, Cle    32.3   Jones, Atl       42.7
 4.   Markakis, Bal    28.2   Holliday, Col    38.1
 5.   Drew, Bos        28.1   Ludwick, Stl     33.5
 6.   Quentin, Chi     28.0   Ramirez, Fla     31.1
 7.   Morneau, Min     26.7   Wright, NY       30.2
 8.   Huff, Bal        26.2   Utley, Phi       27.7
 9.   Kinsler, Tex     26.1   Lee, Hou         26.8
10.   Hamilton, Tex    25.5   McCann, Atl      26.4
11.   Cabrera, Det     24.9   Burrell, Phi     26.2
12.   Youkilis, Bos    24.3   Gonzalez, SD     24.6
13.   Roberts, Bal     23.5   Braun, Mil       23.1
14.   Ramirez, Bos     22.7   Teixeira, Atl    22.9
15.   Giambi, NY       22.6   Bay, Pit         22.3
And how about the worst 10:
Rank  AL                       NL
 1.   Pena, KC         -27.0   Sanchez, Pit     -23.4
 2.   Gomez, Min       -17.9   Francoeur, Atl   -21.8
 3.   Johjima, Sea     -17.9   Patterson, Cin   -21.3
 4.   Betancourt, Sea  -15.7   Vizquel, SF      -21.1
 5.   Cabrera, NY      -15.6   Taveras, Col     -20.4
 6.   Varitek, Bos     -14.4   Bourn, Hou       -19.8
 7.   Vidro, Sea       -14.4   Jones, LA        -18.6
 8.   Gutierrez, Cle   -14.3   Greene, SD       -17.9
 9.   Marte, Cle       -13.9   Pena, Was        -17.7
10.   Bynum, Bal       -13.7   Young, Ari       -15.8
Here’s the spreadsheet with all players*:
*I took out the pitchers in the NL while making the calculations. Of course, I’m just realizing it now, but I forgot to do the same in the AL (darn inter-league play). I took them out now, but I’m hoping it didn’t have too much of an effect on the final numbers (and I really don’t think it did).
Unlike the fielding spreadsheet, unfortunately, this one won’t automatically update – I had some computer issues and had to use someone else’s, and I couldn’t seem to get the auto-update thing to work. Anyway, feel free to play around in there and use the numbers for whatever you’d like.
Now that we’ve covered hitting and fielding, we’re getting close to a pretty decent little player evaluation ‘system.’ Add in some positional adjustments, some league adjustments, maybe a base running stat, and some other stuff and we’d be pretty good. But hopefully this will tide you over in those message board/blog debates.
Next time, if my computer returns safely, we’ll dig a little deeper into the fielding data available at The Hardball Times.