Designated Hitter
July 31, 2008
Dee-Fense . . . Dee-Fense . . .
By Myron Logan

Last year, Justin Inaz popularized a new fielding stat, based on the freely available data from the Hardball Times. This year I decided to set up a spreadsheet (one that can automatically update!) and keep track of fielding performance, using Justin’s process. While there are plenty of advanced fielding metrics out there, such as MGL’s Ultimate Zone Rating, David Pinto’s Probabilistic Model of Range, and John Dewan’s Plus/Minus, I figured, if anything, it wouldn’t hurt to have one more. It may not get as detailed as those listed above, but it’s pretty good and it’s available all the time (and for free).

The Methodology

The Hardball Times provides us with some great information to evaluate fielding performance. On their fielding stats page, they report, for each and every player, the number of balls hit into the player’s zone, the number of plays made on balls in their zone, and the number of plays made on balls hit outside of their zone. With these three numbers in hand, we can get a pretty solid grasp of a player’s fielding performance. But, before we get to that, we’ve got a few definitions to get out of the way:

  • BIZ (balls in zone) – This is the number of balls hit into a player’s zone. A zone (or zones) is defined as the area on the field where at least 50% of balls are turned into outs, at the position in question.

  • Plays – This category is simply plays made on balls in zone.

  • OOZ (out of zone plays) – This is the number of plays a fielder makes on balls hit outside of his zone.

    Now, how do we go about turning three numbers into a decent fielding metric? Well, let’s take a look at Mariners’ shortstop Yuniesky Betancourt, as an example. He’s had 244 balls hit into his zone, and of those 244 chances, he’s turned 200 of them into outs. The average shortstop turns about 83% of balls in zone into outs, so we would expect the average SS to make about 203 plays, if they had 244 chances. Betancourt is about -3 compared to average.
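The in zone step is just a subtraction, and a minimal sketch of it in Python might look like this. The function name is made up for illustration; the 244 chances, 200 plays, and 83% average are the figures quoted above.

```python
def in_zone_plays_above_average(biz, plays, league_rate):
    """Plays made in zone minus what an average fielder would make on the same chances."""
    expected_plays = biz * league_rate  # e.g. 244 * 0.83 = about 202.5
    return plays - expected_plays

# Betancourt's in zone numbers as quoted in the text
betancourt_iz = in_zone_plays_above_average(biz=244, plays=200, league_rate=0.83)
print(round(betancourt_iz, 1))  # -2.5, which rounds to the "about -3" above
```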

    How do we handle out of zone performance? Betancourt’s made just 17 out of zone plays so far in 2008. The average shortstop makes about .13 out of zone plays per in zone chance*, so we’d expect the average SS to have about 32 out of zone plays, given Yuni’s in zone chances. This puts Betancourt at -15 on OOZ balls and about -18 plays overall.

    *One major assumption is being made here: that the number of in zone chances a player gets also reflects the number of out of zone chances he’ll have. Since we don’t know exactly how many OOZ chances anyone actually has, we have to estimate this number somehow. Some people believe innings or total balls in play or something else would be a better proxy, but I’m using in zone chances here.
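Here is the out of zone step as a small Python sketch. It builds in the assumption from the footnote, that in zone chances stand in for out of zone opportunities. The function name is hypothetical; the 17 plays and the .13 rate are the figures from the text.

```python
def ooz_plays_above_average(biz, ooz_plays, league_ooz_per_biz):
    """Out of zone plays minus the number expected from an average fielder.

    Assumes in zone chances (biz) are a fair proxy for out of zone
    opportunities, since the real number of OOZ chances is unknown.
    """
    expected_ooz = biz * league_ooz_per_biz  # e.g. 244 * 0.13 = about 31.7
    return ooz_plays - expected_ooz

# Betancourt's out of zone numbers as quoted in the text
betancourt_ooz = ooz_plays_above_average(biz=244, ooz_plays=17, league_ooz_per_biz=0.13)
print(round(betancourt_ooz, 1))  # -14.7, or about -15
```

Swapping in innings or total balls in play as the proxy would only change what gets passed in place of `biz`.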

    We now have Betancourt at -18 plays, but we’re not quite done yet. It’s a lot easier to work in terms of runs because that’s generally how we measure things in baseball, so we have to make one final conversion. Using the numbers derived from Chris Dial, we can turn plays into runs, simply by multiplying plays by .753 for shortstops (it varies by position as saving a play in, say, the outfield, is, on average, more valuable than saving a play in the infield). Betancourt now ends up at about -13 runs, or the second-worst MLB shortstop, ahead of only Bobby Crosby (-14.6).
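Putting the pieces together, a rough sketch of the full plays-to-runs calculation for Betancourt could look like the following. Only the shortstop factor of .753 is given in the text, so that is the only one used here; the constant and function names are made up for illustration.

```python
SS_RUNS_PER_PLAY = 0.753  # shortstop factor quoted in the text; other positions differ

def plays_to_runs(plays_above_average, runs_per_play):
    """Convert plays above (or below) average into runs."""
    return plays_above_average * runs_per_play

in_zone = 200 - 244 * 0.83      # plays made minus expected in zone plays
out_of_zone = 17 - 244 * 0.13   # OOZ plays minus expected OOZ plays
total_plays = in_zone + out_of_zone

betancourt_runs = plays_to_runs(total_plays, SS_RUNS_PER_PLAY)
print(round(total_plays, 1), round(betancourt_runs, 1))  # -17.2 -13.0
```

Working from the unrounded play totals gives about -13.0 runs. The small gap from the -13.3 in the table further down presumably comes from the league rates in the text being rounded, though that is a guess.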

    That is essentially what you do, with every player, at every position (of course, Excel makes that a little bit easier, or at least it’s supposed to, if you know what you’re doing).
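For anyone who would rather script it than fight with Excel, here is one way the same loop could be sketched in Python. Two big caveats: the three shortstop rows are invented purely for illustration (they are not real 2008 totals), and the code assumes the league averages are simply the pooled totals at the position, which may not be exactly how the spreadsheet computes them.

```python
from dataclasses import dataclass

@dataclass
class FieldingLine:
    player: str
    biz: int    # balls in zone
    plays: int  # plays made on balls in zone
    ooz: int    # plays made on balls out of zone

def rate_position(lines, runs_per_play):
    """Return {player: runs above average} for all fielders at one position."""
    total_biz = sum(line.biz for line in lines)
    league_rzr = sum(line.plays for line in lines) / total_biz      # plays per BIZ
    league_ooz_rate = sum(line.ooz for line in lines) / total_biz   # OOZ plays per BIZ
    ratings = {}
    for line in lines:
        in_zone = line.plays - line.biz * league_rzr
        out_of_zone = line.ooz - line.biz * league_ooz_rate
        ratings[line.player] = (in_zone + out_of_zone) * runs_per_play
    return ratings

# Invented rows, for illustration only
shortstops = [
    FieldingLine("Shortstop A", biz=250, plays=215, ooz=40),
    FieldingLine("Shortstop B", biz=240, plays=195, ooz=25),
    FieldingLine("Shortstop C", biz=260, plays=212, ooz=32),
]

ratings = rate_position(shortstops, runs_per_play=0.753)
for player, runs in sorted(ratings.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{player:12s} {runs:+.1f}")
```

Because the averages come from the same pool of players, the ratings at a position add up to zero, which is what "runs above average" should do.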

    The Good and the Bad

    There are a number of reasons why this metric (stat, translation, conversion, whatever you want to call it) is pretty darn good, and there are also, of course, many limitations.

    Positives:

    • It’s based on play-by-play data. It doesn’t try to estimate opportunities based on regular fielding stats. Rather, the folks at Baseball Info Solutions use video analysis of each play to derive the numbers. It’s a big step up over Range Factor and some of the other non-pbp metrics.

    • It counts both in zone and out of zone performance, and it also keeps them separated (so you don’t get problems like this). I see a lot of people looking at RZR (plays/BIZ) and maybe trying to eyeball OOZ performance. Well, now you don’t have to do that. They’re both combined so you can get a picture of a fielder’s total contribution (at least in the range aspect of fielding).

    • It’s available for free and we can update it when we want. Some of the more detailed metrics are often not updated until the end of the year, or are behind a paywall, or aren’t displayed at all for various reasons. Well, this may not be the most detailed -- more on that later -- but, thanks to the folks at The Hardball Times, it’s always there for us!

    Negatives:

    • There aren’t a lot of adjustments, like you’ll see in, say, something like Ultimate Zone Rating. For example, there isn’t an adjustment for the speed of the ball. A scorching grounder to the shortstop is going to look just like a routine ground ball, as long as they’re both in the shortstop’s zone, by this metric. Also, there are no park adjustments, and that could be a problem, especially in the outfield.

    • A ball is either determined to be in the fielder’s zone or out of it. We all know that not all balls hit into a fielder’s zone are created equal. If one player gets a bunch of balls on the fringe of his zone one year, it could make him look worse than he really is, though we expect stuff like that to even out as we get more and more data.

    • As mentioned above, we don’t know exactly how many opportunities a fielder has out of his zone. We can make an estimate, but there could certainly be problems with it.

    • It certainly does not include every aspect of fielding; rather it concentrates on the range aspect. For instance, things like pop ups and double plays aren’t included for infielders, throwing arms aren’t included for outfielders, and scooping bad throws out of the dirt isn’t considered for first basemen.

    I think that, if we keep the limitations in mind, this can be a very useful number to look at. Of course, we can’t get carried away with two-thirds of a season’s stats, both because of the limitations mentioned above, and because of the relatively small amount of data we’re working with. With that said, let’s take a look at the best and worst teams and individual fielders so far in 2008.

    Teams

    Below are all 30 teams listed, in order of runs saved above average (through Monday, July 28):

    STL     45.1
    ATL     39.4
    CHN     35.7
    OAK     35.1
    SD      31.0
    HOU     28.3
    PHI     28.0
    LAN     27.5
    MIL     22.4
    TOR     18.5
    LAA     15.4
    NYN     10.3
    TB       9.0
    SF       4.7
    CHA     -2.8
    SEA     -5.5
    COL     -6.1
    BOS     -7.3
    DET     -7.6
    ARI     -8.6
    WAS    -10.8
    CLE    -12.9
    BAL    -17.0
    CIN    -19.2
    PIT    -24.3
    TEX    -26.4
    FLA    -38.1
    NYA    -48.9
    MIN    -49.2
    KC     -58.4
    

    The Cardinals come out on top, at about 45 runs above average. The Cards are led by a great infield trio of Adam Kennedy (+14.9), Albert Pujols (+11.5), and Cesar Izturis (+10.2). The Braves are also anchored by three great infielders in Yunel Escobar (+17.4), Chipper Jones (+16.5), and Mark Teixeira (+10.6). The Cubs are led by rookie right fielder Kosuke Fukudome (+15.5). Other standouts include Derrek Lee (+7.2) and Mike Fontenot (+6.8).

    The Royals find themselves trailing the majors, at 58 runs below average. They have eight players who are at least 5 runs below average. Minnesota’s been hurt badly by their infield defense: Justin Morneau (first, -10.7), Alexi Casilla (second, -6.1), Brendan Harris (short, -5.6), and Mike Lamb (third, -12.6). The Yankees owe most of their poor rating to Bobby Abreu, who trails the majors at 27.5 runs below average.

    Best and Worst Fielders

    The subtitle there is a bit of a misnomer, as you’d like to have more than one year of data to truly determine the best and worst fielders. But here are the top 20 fielders so far in 2008, ranked in order of runs saved above average (these aren’t per 150 innings or anything, by the way – this is the player’s total so far):

    Utley, Phi      25.0
    Rolen, Tor      23.8
    Beltre, Sea     21.8
    Ellis, Oak      18.3
    Hardy, Mil      17.5
    Escobar, Atl    17.4
    Giles, SD       16.9
    Jones, Atl      16.5
    Fukudome, Chi   15.5
    Hannahan, Oak   14.9
    Kennedy, Stl    14.9
    Berkman, Hou    13.9
    Winn, SF        13.2
    Anderson, CHA   13.1
    Votto, Cin      12.9
    Rios, Tor       12.7
    Gutierrez, Cle  11.5
    Pujols, Stl     11.5
    Helton, Col     11.3
    Figgins, LAA    11.1
    

    And how about the trailers:

    Abreu, NYA      -27.5
    Wells, Tor      -21.9
    Jacobs, Fla     -17.9
    Encarnacion,Cin -15.9
    Hawpe, Col      -15.5
    McLouth, Pit    -15.3
    Griffey Jr, Cin -15.1
    Mora, Bal       -14.8
    Blake, Cle      -14.6
    Crosby, Oak     -14.6
    Ramirez, Bos    -14.1
    Betancourt, Sea -13.3
    Ordonez, Det    -13.3
    Lamb, Min       -12.6
    Easley, NYN     -12.5
    Hermida, Fla    -12.5
    Quentin, CHA    -12.1
    Cantu, Fla      -12.1
    Castillo, NYN   -12.0
    Kinsler, Tex    -11.8
    

    Figuring you might be interested in, oh, say, the 800 some players in between the top and bottom 20, here’s the full spreadsheet.

    There you’ve got ratings at every position, positional averages in some of the key stats, and team totals again. Feel free to use it however you’d like, of course, and let me know if you have any questions. And let me know if I’ve messed anything up, be it in the spreadsheet or in any of my rambling above. I am by no means any type of expert on fielding analysis, but I find it fascinating, and I hope you do too.

    Myron Logan writes about the Padres and baseball at Friar Forecast.

  • Comments

    Nice work! It's great to see others using these data. -j

    So, that's Chipper Jones up there in 8th place despite missing a quarter of the season to date? Wowsers. I was under the impression that every certified advanced defensive statistic was required to rank Chipper as the worst defensive 3B in history. :-)

    Seriously, though, always good to see new defensive metrics.

    Justin, thanks!

    mravery, Chipper has been great on out of zone plays this year -- basically, that's where he has all of his value (I've got him at +18.6 plays out of zone and +2 plays in zone, which ends up working out to about +20.6 plays/+16.5 runs).

    My off-the-cuff guess is that he's playing closer to the fringe of the 3b zone than the average third baseman, which has allowed him to get to many OOZ balls.

    ... but I have not seen much of him over the last few years and I could be way off on that. Also, John Dewan's plus/minus (located at Bill James Online) has him at +7 plays this year, 10th overall at 3b. Not sure how the other metrics usually see him, but, yeah, it's interesting that he's (apparently) having this good of a year so late in his career.

    Myron-

    Regarding the OOZ plays, I know that the Braves employ a shift what seems like a lot more often than other teams. Could this positioning account for some of Chipper's OOZ plays? Do you make any adjustments for defensive shifts and the like?

    Yea, it definitely could. I don't really think I could adjust for it, because I don't have the actual play by play data -- just what THT reports. But I'm not sure.

    He still manages to be about average in zone, which is pretty impressive, if they're employing a lot of shifts.

    Methinks I understand why the Cardinals are "outperforming" their preseason predictions. I also think that when the "experts" make those predictions they never take defense into consideration.

    Great work.

    Also, I noticed in the best fielders list you're missing a couple guys who play multiple positions. For instance Jayson Werth has saved a total of 12.5 runs and Marco Scutaro has saved 12.6 runs. And some of the guys on the list actually have more saved if you add the multiple positions.

    I saw the Werth double-up as well...didn't realize he was so talented. But watching Utley every day, I always wondered how he JUST misses so many batted balls and I guess this goes to explain it...he is involved in more out of zone plays than anyone. Where most second basemen watch the ball squirt through the infield, Utley is horizontal with the hopes of fielding one out of every ten of those. Also, I didn't think Victorino sucked in the outfield, but I guess he does.

    What does it mean that 8 of the top 9 teams are from the NL?

    Ryan and Andrew ... doh, that is a mistake on my part. Good catch. I guess you can consider that list to be the best fielders at their primary position (or something) ...

    Chris, I think it possibly means that pitchers hit the ball in play more weakly than position players on average, and that helps the NL in these rankings (or it could mean nothing at all, just that the NL has better fielders). You could split the data up for the NL and AL, but I decided to combine them to get a larger sample for the averages. I'm not sure if that was the correct decision ...

    What are the chances the guys at the Hardball Times can incorporate this into their stat pages?

    Wow, that is really impressive work, thanks for referencing your sources and walking through your steps. For somebody just delving into metrics it's a big big help.

    You're my boy Myron!

    Thanks for the kind words, guys!

    Hyltzn, I believe I've read a couple of times that they were asked not to by Baseball Info Solutions, to avoid any confusion with John Dewan's plus/minus stat. Hopefully (and I doubt there is), there is no problem with what people like me are doing here.

    Ah, that's understandable. I guess I'll just replicate in my own spreadsheet by going off of yours.

    A quick question. What does "RZR desc arrow" mean?

    That is just RZR (revised zone rating, or plays divided by balls in zone). I think the "desc arrow" part just has something to do with how I sorted it on THT when I imported it to Excel.

    If you split up the leagues, let me know if there are major changes, if you'd like ...

    Alright. Any estimate as to how many +/- plays translate to 1 run? Don't Dewan and Baseball Info solutions translate plus/minus to runs?

    Alright.

    And don't mind my post 2 posts ago. I'm an idiot who completely forgot about the part where you explained exactly what I was asking.

    So, what are the average number of runs saved at each position? What I want to do is take, for example, Beltre's 21.8 and add it to the average number of runs saved at third-base to see how many total runs he saved. Same thing with Blake and his -14.6. If these numbers represent the number of runs saved above average, there must be a number that represents the average number of runs saved at each position. Thanks.

    Ralph, it depends on what that average player is "over". A replacement-level defender?

    Ralph, not sure if I'm following you ... but each position is compared to all players at that position, and average is always 0. So +22 for Beltre is 22 runs above average at third, with average being 0.

    You could look at just the starters or something (like Hyltzn may be suggesting) and then you may get a different figure. fwiw, in the few studies I've seen, bench players are usually pretty comparable to starters when it comes to fielding.

    Very interesting post and spreadsheet; thank you very much.

    For those new to Excel, if you click in the top left square (to the left of the column labelled "A") to select the entire spreadsheet, and then click on Data at the very top and sort by whatever metric you want (Ascending or Descending) you can group by Team (column D), Position (column F), runs saved (column X) and so forth. You can also do a sub sort by clicking on column E (to put all players within a league together) and column F, to see all fielders by league by position listed.

    I was surprised to see some of these metrics. Being a Dodger fan, I note that Blake DeWitt is the best fielder on the team in this study, which is pretty darned good for a third baseman. James Loney, Juan Pierre!, Andruw Jones, and Angel Berroa also help a lot, which is somewhat surprising in that the last three are all part-time players. I guess there is so little speed in left field in general that Pierre really has a big edge, even though his routes aren't the best. Pierre in CF is barely average. Kemp in CF is pretty bad (-10), in RF better (-5), and Ethier in RF is below average, in LF average. So the Dodgers are gaining a lot of arm by having those three arranged Pierre-Kemp-Ethier, but aside from throwing considerations would be better off arranging them Ethier-Pierre-Kemp (-2.4 versus -5). It also appears their best outfield (pre-Manny) is Pierre-Jones-Kemp, arm considerations aside. I guess Pierre's speed makes up for more than his bad routes cost him.

    Manny Ramirez, at -14, is of course hopeless, and having him in left and then Pierre in center merely makes things worse. Were I managing the Dodgers I would send Jones down to AAA to finish rehabbing his knee (after surgery, he rushed back when Pierre went out but he's not 100% healthy yet). If Jones returns to health and form (say, last year's levels), then I'd go Manny-Jones-Kemp; even without Pierre's speed at the top of the lineup, the defense is significantly better and a lineup of Kemp-Martin-Manny-Loney-Blake-Kent will provide enough offense. If Furcal comes back, then the Dodgers should only play Pierre as a pinch runner, pinch hitter, and defensive replacement for Manny. Pierre for Manny may be the biggest defensive plus in the majors.

    Any chance you could run these for as far back as your data goes? Having up to date '08 stats is pretty great, but it would be easier to know what to make of your metric if we had further results.

    colin, I agree completely. Then we could look at the reliability of the stat from year to year, run projections, and things like that. It's definitely something I want to do in the near future ...

    I guess what I was asking is this: For example, let's take Beltre-- how many runs has he saved so far this year? If everyone is starting out at zero, then that means if Beltre is a +22, he has saved 22 runs this year? How many runs has each defender saved this year? If Abreu is -25, for example (I'm being lazy and not looking at the chart again), does this mean that, with zero being the average, he has not saved, or let up, 25 runs this season-- not saving runs but "giving" them up, lacking a better word? Thanks.

    Ralph, sorry for the late response. But, yes, I think you are on the right track. If a player is at 0, that means he is average for his position. If a player is -25, that means he's 25 runs worse than an average player at that position (he's theoretically cost his team 25 runs over what an average player would cost them) ... he may be 40 runs worse than the best players at that position.