Dee-Fense . . . Dee-Fense . . .
Last year, Justin Inaz popularized a new fielding stat, based on the freely available data from the Hardball Times. This year I decided to set up a spreadsheet (one that can automatically update!) and keep track of fielding performance, using Justin’s process. While there are plenty of advanced fielding metrics out there, such as MGL’s Ultimate Zone Rating, David Pinto’s Probabilistic Model of Range, and John Dewan’s Plus/Minus, I figured it wouldn’t hurt to have one more. It may not get as detailed as those listed above, but it’s pretty good and it’s available all the time (and for free).

The Methodology

The Hardball Times provides us with some great information to evaluate fielding performance. On their fielding stats page, they report, for each and every player, the number of balls hit into the player’s zone, the number of plays made on balls in their zone, and the number of plays made on balls hit outside of their zone. With these three numbers in hand, we can get a pretty solid grasp of a player’s fielding performance. But, before we get to that, we’ve got a few definitions to get out of the way:

Now, how do we go about turning three numbers into a decent fielding metric? Well, let’s take a look at Mariners’ shortstop Yuniesky Betancourt, as an example. He’s had 244 balls hit into his zone, and of those 244 chances, he’s turned 200 of them into outs. The average shortstop turns about 83% of balls in zone into outs, so we would expect the average SS to make about 203 plays, given 244 chances. Betancourt is about -3 plays compared to average.

How do we handle out of zone performance? Betancourt’s made just 17 out of zone plays so far in 2008. The average shortstop makes about .13 out of zone plays per in zone chance*, so we’d expect the average SS to have about 32 out of zone plays, given Yuni’s in zone chances. This puts Betancourt at -15 on OOZ balls and about -18 plays overall.

*One major assumption is being made here: that the number of in zone chances a player gets also reflects the number of out of zone chances he’ll have. Since we don’t know exactly how many OOZ chances anyone actually has, we have to estimate this number somehow. Some people believe innings or total balls in play or something else would be a better proxy, but I’m using in zone chances here.

We now have Betancourt at -18 plays, but we’re not quite done yet. It’s a lot easier to work in terms of runs because that’s generally how we measure things in baseball, so we have to make one final conversion. Using the numbers derived from Chris Dial, we can turn plays into runs, simply by multiplying plays by .753 for shortstops (the factor varies by position, as saving a play in, say, the outfield is, on average, more valuable than saving a play in the infield). Betancourt now ends up at about -13 runs, which makes him the second-worst MLB shortstop, ahead of only Bobby Crosby (-14.6).

That is essentially what you do, with every player, at every position (of course, Excel makes that a little bit easier, or at least it’s supposed to, if you know what you’re doing).

The Good and the Bad

There are a number of reasons why this metric (stat, translation, conversion, whatever you want to call it) is pretty darn good, and there are also, of course, many limitations.

Positives:
Negatives:
I think that, if we keep the limitations in mind, this can be a very useful number to look at. Of course, we can’t get carried away with two-thirds of a season’s stats, both because of the limitations mentioned above, and because of the relatively small amount of data we’re working with. With that said, let’s take a look at the best and worst teams and individual fielders so far in 2008.

Teams

Below are all 30 teams, listed in order of runs saved above average (through Monday, July 28):

STL 45.1
ATL 39.4
CHN 35.7
OAK 35.1
SD 31.0
HOU 28.3
PHI 28.0
LAN 27.5
MIL 22.4
TOR 18.5
LAA 15.4
NYN 10.3
TB 9.0
SF 4.7
CHA -2.8
SEA -5.5
COL -6.1
BOS -7.3
DET -7.6
ARI -8.6
WAS -10.8
CLE -12.9
BAL -17.0
CIN -19.2
PIT -24.3
TEX -26.4
FLA -38.1
NYA -48.9
MIN -49.2
KC -58.4

The Cardinals come out on top, at about 45 runs above average. The Cards are led by a great infield trio of Adam Kennedy (+14.9), Albert Pujols (+11.5), and Cesar Izturis (+10.2). The Braves are also anchored by three great infielders in Yunel Escobar (+17.4), Chipper Jones (+16.5), and Mark Teixeira (+10.6). The Cubs are led by rookie right fielder Kosuke Fukudome (+15.5). Other standouts include Derrek Lee (+7.2) and Mike Fontenot (+6.8).

The Royals find themselves trailing the majors, at 58 runs below average. They have eight players who are at least 5 runs below average. Minnesota’s been hurt badly by their infield defense: Justin Morneau (first, -10.7), Alexi Casilla (second, -6.1), Brendan Harris (short, -5.6), and Mike Lamb (third, -12.6). The Yankees owe most of their poor rating to Bobby Abreu, who trails the majors at 27.5 runs below average.

Best and Worst Fielders

The subtitle there is a bit of a misnomer, as you’d like to have more than one year of data to truly determine the best and worst fielders. But here are the top 20 fielders so far in 2008, ranked in order of runs saved above average (these aren’t per 150 innings or anything, by the way; this is the player’s total so far):

Utley, Phi 25.0
Rolen, Tor 23.8
Beltre, Sea 21.8
Ellis, Oak 18.3
Hardy, Mil 17.5
Escobar, Atl 17.4
Giles, SD 16.9
Jones, Atl 16.5
Fukudome, Chi 15.5
Hannahan, Oak 14.9
Kennedy, Stl 14.9
Berkman, Hou 13.9
Winn, SF 13.2
Anderson, CHA 13.1
Votto, Cin 12.9
Rios, Tor 12.7
Gutierrez, Cle 11.5
Pujols, Stl 11.5
Helton, Col 11.3
Figgins, LAA 11.1

And how about the trailers:

Abreu, NYA -27.5
Wells, Tor -21.9
Jacobs, Fla -17.9
Encarnacion, Cin -15.9
Hawpe, Col -15.5
McLouth, Pit -15.3
Griffey Jr, Cin -15.1
Mora, Bal -14.8
Blake, Cle -14.6
Crosby, Oak -14.6
Ramirez, Bos -14.1
Betancourt, Sea -13.3
Ordonez, Det -13.3
Lamb, Min -12.6
Easley, NYN -12.5
Hermida, Fla -12.5
Quentin, CHA -12.1
Cantu, Tor -12.1
Castillo, NYN -12.0
Kinsler, Tex -11.8

Figuring you might be interested in, oh, say, the 800 some players in between the top and bottom 20, here’s the full spreadsheet. There you’ve got ratings at every position, positional averages in some of the key stats, and team totals again. Feel free to use it however you’d like, of course, and let me know if you have any questions. And let me know if I’ve messed anything up, be it in the spreadsheet or in any of my rambling above. I am by no means any type of expert on fielding analysis, but I find it fascinating, and I hope you do too.

Myron Logan writes about the Padres and baseball at Friar Forecast.
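For anyone who would rather see the methodology as code than as a spreadsheet, here is a minimal Python sketch of the Betancourt example. It assumes the rounded rates quoted in the article (83% of balls in zone converted, .13 out of zone plays per in zone chance, .753 runs per play for shortstops). The function and parameter names are illustrative only and are not taken from the spreadsheet.

```python
# Sketch of the plays-to-runs calculation described in the article.
# Assumption: the rates passed in below are the rounded figures quoted in
# the text; the names are hypothetical, not the spreadsheet's own.

def runs_above_average(biz, plays_in_zone, plays_out_of_zone,
                       avg_zone_rate, avg_ooz_per_chance, runs_per_play):
    """Return (in-zone plays, OOZ plays, runs) relative to an average fielder."""
    # Plays an average fielder would make given the same in-zone chances
    expected_in_zone = avg_zone_rate * biz
    # In-zone chances are used as a proxy for out-of-zone chances
    expected_ooz = avg_ooz_per_chance * biz
    in_zone_diff = plays_in_zone - expected_in_zone
    ooz_diff = plays_out_of_zone - expected_ooz
    # Convert total plays above/below average into runs
    runs = (in_zone_diff + ooz_diff) * runs_per_play
    return in_zone_diff, ooz_diff, runs


# Yuniesky Betancourt, SS, with the numbers given in the article
in_zone, ooz, runs = runs_above_average(
    biz=244, plays_in_zone=200, plays_out_of_zone=17,
    avg_zone_rate=0.83, avg_ooz_per_chance=0.13, runs_per_play=0.753)

print(f"in zone: {in_zone:+.1f}, out of zone: {ooz:+.1f}, runs: {runs:+.1f}")
```

With these rounded rates the result comes out at roughly -13 runs. The -13.3 shown for Betancourt in the list above presumably reflects the unrounded positional averages in the spreadsheet.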
Comments
Nice work! It's great to see others using these data. -j
Posted by: jinaz at July 31, 2008 8:03 AM
So, that's Chipper Jones up there in 8th place despite missing a quarter of the season to date? Wowsers. I was under the impression that every certified advanced defensive statistic was required to rank Chipper as the worst defensive 3B in history. :-)
Seriously, though, always good to see new defensive metrics.
Posted by: mravery at July 31, 2008 9:54 PM
Justin, thanks!
mravery, Chipper has been great on out of zone plays this year -- basically, that's where he has all of his value (I've got him at +18.6 plays out of zone and +2 plays in zone, which ends up working out to about +20.6 plays/+16.5 runs).
My off-the-cuff guess is that he's playing closer to the fringe of the 3b zone than the average third baseman, which has allowed him to get to many OOZ balls.
... but I have not seen much of him over the last few years and I could be way off on that. Also, John Dewan's plus/minus (located at Bill James Online) has him at +7 plays this year, 10th overall at 3b. Not sure how the other metrics usually see him, but, yeah, it's interesting that he's (apparently) having this good of a year so late in his career.
Posted by: Myron Logan at July 31, 2008 10:44 PM
Myron-
Regarding the OOZ plays, I know that the Braves seem to employ a shift a lot more often than other teams. Could this positioning account for some of Chipper's OOZ plays? Do you make any adjustments for defensive shifts and the like?
Posted by: mravery at August 1, 2008 7:03 AM
Yea, it definitely could. I don't really think I could adjust for it, because I don't have the actual play by play data -- just what THT reports. But I'm not sure.
He still manages to be about average in zone, which is pretty impressive, if they're employing a lot of shifts.
Posted by: Myron Logan at August 1, 2008 8:18 AM
Methinks I understand why the Cardinals are "outperforming" their preseason predictions. I also think that when the "experts" make those predictions they never take defense into consideration.
Posted by: Nick at August 1, 2008 9:42 AM
Great work.
Also, I noticed in the best fielders list you're missing a couple of guys who play multiple positions. For instance, Jayson Werth has saved a total of 12.5 runs and Marco Scutaro has saved 12.6 runs. And some of the guys on the list actually have more runs saved if you add up the multiple positions.
Posted by: Andrew Hertzog at August 1, 2008 9:43 AM
I saw the Werth double-up as well...didn't realize he was so talented. But watching Utley every day, I always wondered how he JUST misses so many batted balls and I guess this goes to explain it...he is involved in more out of zone plays than anyone. Where most second basemen watch the ball squirt through the infield, Utley is horizontal with the hopes of fielding one out of every ten of those. Also, I didn't think Victorino sucked in the outfield, but I guess he does.
Posted by: Ryan Dodson at August 1, 2008 12:16 PM
What does it mean that 8 of the top 9 teams are from the NL?
Posted by: Chris at August 1, 2008 1:25 PM
Ryan and Andrew ... doh, that is a mistake on my part. Good catch. I guess you can consider that list to be the best fielders at their primary position (or something) ...
Chris, I think it possibly means that pitchers hit the ball in play more weakly than position players on average, and that helps the NL in these rankings (or it could mean nothing at all, just that the NL has better fielders). You could split the data up for the NL and AL, but I decided to combine them to get a larger sample for the averages. I'm not sure if that was the correct decision ...
Posted by: Myron Logan at August 1, 2008 5:28 PM
What are the chances the guys at the Hardball Times can incorporate this into their stat pages?
Posted by: Hyltzn at August 1, 2008 6:17 PM
Wow, that is really impressive work, thanks for referencing your sources and walking through your steps. For somebody just delving into metrics it's a big big help.
Posted by: Jeff at August 1, 2008 6:45 PM
You're my boy Myron!
Posted by: Melvin Nieves at August 2, 2008 12:23 AM
Thanks for the kind words, guys!
Hyltzn, I believe I've read a couple of times that they were asked not to by Baseball Info Solutions, to avoid any confusion with John Dewan's plus/minus stat. Hopefully there is no problem with what people like me are doing here (and I doubt there is).
Posted by: Myron Logan at August 2, 2008 10:40 AM
Ah, that's understandable. I guess I'll just replicate in my own spreadsheet by going off of yours.
A quick question. What does "RZR desc arrow" mean?
Posted by: Hyltzn at August 2, 2008 8:02 PM
That is just RZR (revised zone rating, or plays divided by balls in zone). I think the "desc arrow" part just has something to do with how I sorted it on THT when I imported it to Excel.
If you split up the leagues, let me know if there are major changes, if you'd like ...
Posted by: Myron Logan at August 2, 2008 10:03 PM
Alright. Any estimate as to how many +/- plays translate to 1 run? Don't Dewan and Baseball Info solutions translate plus/minus to runs?
Posted by: Hyltzn at August 2, 2008 10:23 PM
Alright.
Posted by: Hyltzn at August 2, 2008 10:26 PM
And don't mind my post 2 posts ago. I'm an idiot who completely forgot about the part where you explained exactly what I was asking.
Posted by: Hyltzn at August 2, 2008 10:30 PM
So, what are the average number of runs saved at each position? What I want to do is take, for example, Beltre's 21.8 and add it to the average number of runs saved at third-base to see how many total runs he saved. Same thing with Blake and his -14.6. If these numbers represent the number of runs saved above average, there must be a number that represents the average number of runs saved at each position. Thanks.
Posted by: Ralph C. at August 3, 2008 5:14 AM
Ralph, it depends on what that average player is "over". A replacement-level defender?
Posted by: Hyltzn at August 3, 2008 1:00 PM
Ralph, not sure if I'm following you ... but each position is compared to all players at that position, and average is always 0. So +22 for Beltre is 22 runs above average at third, with average being 0.
You could look at just the starters or something (like Hyltzn may be suggesting) and then you may get a different figure. fwiw, in the few studies I've seen, bench players are usually pretty comparable to starters when it comes to fielding.
Posted by: Myron Logan at August 3, 2008 2:09 PM
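The "average is always 0" point can be illustrated with a small Python sketch. It assumes the positional averages are pooled rates (total plays divided by total balls in zone across everyone at the position), which may not be exactly how the spreadsheet computes them. The three players and their numbers are made up purely for illustration.

```python
# Hypothetical three-player pool at one position:
# name -> (balls in zone, in-zone plays, out-of-zone plays).
# These figures are invented for illustration and are not real data.
pool = {
    "Player A": (300, 220, 40),
    "Player B": (250, 170, 25),
    "Player C": (200, 130, 30),
}

# Assumption: positional averages are pooled over all players at the position
total_biz = sum(biz for biz, _, _ in pool.values())
zone_rate = sum(p for _, p, _ in pool.values()) / total_biz  # in-zone conversion rate
ooz_rate = sum(o for _, _, o in pool.values()) / total_biz   # OOZ plays per in-zone chance

# Plays above average for each player: actual minus expected, in zone plus OOZ
plays_above_avg = {
    name: (p - zone_rate * biz) + (o - ooz_rate * biz)
    for name, (biz, p, o) in pool.items()
}

for name, diff in plays_above_avg.items():
    print(f"{name}: {diff:+.1f} plays")

# Apart from floating-point rounding, the position as a whole nets out to zero
position_total = sum(plays_above_avg.values())
print(f"position total: {abs(position_total):.1f}")
```

Under that assumption, every plus at a position is offset by minuses elsewhere at the same position, so there is no separate "average runs saved" figure to add on top of a player's rating.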
Very interesting post and spreadsheet; thank you very much.
For those new to Excel, if you click in the top left square (to the left of the column labelled "A") to select the entire spreadsheet, and then click on Data at the very top and sort by whatever metric you want (Ascending or Descending) you can group by Team (column D), Position (column F), runs saved (column X) and so forth. You can also do a sub sort by clicking on column E (to put all players within a league together) and column F, to see all fielders by league by position listed.
I was surprised to see some of these metrics. Being a Dodger fan, I note that Blake DeWitt is the best fielder on the team in this study, which is pretty darned good for a third baseman. James Loney, Juan Pierre!, Andruw Jones, and Angel Berroa also help a lot, which is somewhat surprising in that the last three are all part time players. I guess there is so little speed in left field in general that Pierre really has a big edge, even though his routes aren't the best. Pierre in CF is barely average. Kemp in CF is pretty bad (-10), in RF better (-5), and Ethier in RF is below average, in LF average. So the Dodgers are gaining a lot of arm by having those three arranged Pierre-Kemp-Ethier, but aside from throwing considerations would be better off arranging them Ethier-Pierre-Kemp (-2.4 versus -5). It also appears their best outfield (pre-Manny) is Pierre-Jones-Kemp, arm considerations aside. I guess Pierre's speed makes up for more than his bad routes cost him.
Manny Ramirez, at -14, is of course hopeless, and having him in left and then Pierre in center merely makes things worse. Were I managing the Dodgers I would send Jones down to AAA to finish rehabbing his knee (after surgery, he rushed back when Pierre went out but he's not 100% healthy yet). If Jones returns to health and form (say, last year's levels), then I'd go Manny-Jones-Kent; even without Pierre's speed at the top of the lineup, the defense is significantly better and a lineup of Kemp-Martin-Manny-Loney-Blake-Kent will provide enough offense. If Furcal comes back, then the Dodgers should only play Pierre as a pinch runner, pinch hitter, and defensive replacement for Manny. Pierre for Manny may be the biggest defensive plus in the majors.
Posted by: Richard Aronson at August 3, 2008 7:18 PM
Any chance you could run these for as far back as your data goes? Having up to date '08 stats is pretty great, but it would be easier to know what to make of your metric if we had further results.
Posted by: colin at August 4, 2008 1:08 PM
colin, I agree completely. Then we could look at the reliability of the stat from year to year, run projections, and things like that. It's definitely something I want to do in the near future ...
Posted by: Myron Logan at August 4, 2008 6:42 PM
I guess what I was asking is this: For example, let's take Beltre-- how many runs has he saved so far this year? If everyone is starting out at zero, then that means if Beltre is a +22, he has saved 22 runs this year? How many runs has each defender saved this year? If Abreu is -25, for example (I'm being lazy and not looking at the chart again), does this mean that, with zero being the average, he has not saved, or let up, 25 runs this season-- not saving runs but "giving" them up, lacking a better word? Thanks.
Posted by: Ralph C. at August 5, 2008 3:24 PM
Ralph, sorry for the late response. But, yes, I think you are on the right track. If a player is at 0, that means he is average for his position. If a player is -25, that means he's 25 runs worse than an average player at that position (he's theoretically cost his team 25 runs over what an average player would cost them) ... he may be 40 runs worse than the best players at that position.
Posted by: Myron Logan at August 8, 2008 9:00 AM