Saber Talk
August 29, 2008
THT Fielding Data, 2004-2007
By Myron Logan

A few weeks ago, we used the fielding stats at The Hardball Times to make a little fielding metric. We looked at the best and worst teams and players of 2008. That's great, but if we really want to analyze fielding in a meaningful way, we need more data. THT offers stats that go back to 2004, so let's go through the same process with the 04-07 seasons. Now, rather than just half a year's worth of stats, we'll have close to five years of data.

The Process (Briefly)

The methodology is explained in depth in the post linked above, but let's run through a quick recap so everyone is on the same page. Basically, we have data on each fielder's performance both inside and outside his defined "zone." We use both areas to find out how many runs a player is worth, above or below average. Here's a quick example, with numbers just for illustration.

Nomar -- 50 BIZ, 40 plays, .8 RZR (plays/BIZ)
League average RZR (at short) -- .82
League avg. plays (for 50 BIZ) -- 41 plays

So, in his zone, Nomar would be 1 play below average. We do the same thing on out of zone balls, with the only difference being that we don't know exactly how many opportunities players have out of their zone. We assume that in-zone chances reflect out of zone chances, and we use BIZ as a proxy for OOZ opportunities. If you're confused here, check out the link up top, as it may answer some of your questions.
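Here's a minimal sketch of that step in Python. The in-zone numbers are the illustrative Nomar figures from above; the out-of-zone numbers (7 OOZ plays and a league rate of 0.10 OOZ plays per BIZ) and the function name are made up purely for illustration.

```python
def plays_above_average(plays, opportunities, league_rate):
    """Plays made minus the plays an average fielder would make
    on the same number of opportunities."""
    expected = opportunities * league_rate
    return plays - expected

# In zone: 40 plays on 50 BIZ against a league-average RZR of .82.
in_zone = plays_above_average(plays=40, opportunities=50, league_rate=0.82)

# Out of zone: BIZ stands in for the unknown number of OOZ chances.
# Both OOZ figures below are hypothetical.
league_ooz_rate = 0.10  # hypothetical: league OOZ plays per BIZ at short
ooz = plays_above_average(plays=7, opportunities=50, league_rate=league_ooz_rate)

print(round(in_zone, 1), round(ooz, 1))
```

The same function handles both cases; the only thing that changes is which league rate you feed it.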

After we've done that, using Chris Dial's conversions, we turn plays above/below average into runs above/below average. And ... that's it. Not too difficult.
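As a rough sketch, the conversion is just a multiplication. The runs-per-play value below is a placeholder, not one of Chris Dial's actual figures, and applying a single value to in-zone and out-of-zone plays combined is an assumption made here for simplicity. The out-of-zone total is hypothetical, too.

```python
def runs_above_average(plays_above_avg, runs_per_play):
    """Convert plays above/below average into runs above/below average."""
    return plays_above_avg * runs_per_play

# Placeholder only; substitute the real positional conversion value.
HYPOTHETICAL_RUNS_PER_PLAY = 0.75

in_zone_plays = -1.0   # Nomar's illustrative in-zone result
ooz_plays = 2.0        # hypothetical out-of-zone result

total_plays = in_zone_plays + ooz_plays
runs = runs_above_average(total_plays, HYPOTHETICAL_RUNS_PER_PLAY)
print(round(runs, 2))
```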

Positional Averages

If you look closely at the positional averages from year to year (which others have done), you'll notice some pretty big differences. For instance, here's RZR in the outfield for all four years:

	2004	2005	2006	2007
LF	0.63	0.633	0.861	0.855
CF	0.796	0.815	0.894	0.888
RF	0.65	0.648	0.888	0.877

For 2004-2005, the average RZR (plays made in zone divided by total balls in zone) in left is around .63. In 2006 and 2007, it jumped up to over .85. You may notice a similar thing happening in right field, and to a lesser extent, center field. Surely, outfielders didn't all of a sudden improve in the 2005 offseason; rather, something changed in the way the zones are drawn or in how fly balls or line drives are handled by the folks over at Baseball Info Solutions (that's where THT gets the data).

There are some differences in the infield, too, but they aren't quite as bad. There are plenty of ways to deal with this problem (check the first link in the last paragraph), but note that here we're just calculating the stats year-by-year (i.e., we made no attempt to normalize the numbers like Mr. Wyers did). You'll be able to see all of the positional averages if you want to download the data at the bottom of the page.
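To see why the year-by-year baseline matters, here's a small sketch using the left field averages from the table above. The fielder (75 plays on 100 BIZ) is hypothetical, and so are the names in the code.

```python
# League-average RZR in left field, from the table above.
LF_RZR = {2004: 0.63, 2005: 0.633, 2006: 0.861, 2007: 0.855}

def in_zone_plays_above_average(plays, biz, year, league_rzr=LF_RZR):
    """Compare a fielder to the league average for that specific season."""
    return plays - biz * league_rzr[year]

# Hypothetical left fielder: 75 plays on 100 balls in zone.
for year in sorted(LF_RZR):
    print(year, round(in_zone_plays_above_average(75, 100, year), 1))
```

The identical stat line rates as roughly 12 plays above average against the 2004-2005 baselines and roughly 11 plays below average against the 2006-2007 ones, which is why each season is measured against its own averages here.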

The Best and Worst Teams

This is from 2004-2007, and is simply a team's overall runs above or below average, found by adding up all the players' numbers on each team:

Top 15 Teams

Year    Team    Runs
2007    ATL     93.5
2006    STL     91.3
2004    PHI     79.0
2006    HOU     74.6
2005    CHA     69.5
2006    ATL     68.2
2007    NYN     67.8
2004    LAN     65.2
2006    SEA     60.0
2006    MIL     53.9
2007    TOR     52.1
2005    LAA     50.6
2005    SEA     49.6
2007    STL     46.2
2007    KC      43.0

The 2007 Atlanta Braves outfield was probably one of the better defensive outfields of the past few years, at least by these numbers. Check it out:

A. Jones   31.1 runs
Diaz       19.1
Francoeur  15.0
Harris      8.6

That's like 74 runs above average, just in the outfield. And, get this, they didn't have one outfielder who was rated below average (unless you count Pete Orr, who missed the one ball in his zone ; )
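For anyone checking the math, the outfield total is just the sum of the four players listed above; team totals are built the same way across the whole roster. A quick sketch (the variable names are mine):

```python
# 2007 Braves outfielders and their runs above average, from the list above.
braves_of_2007 = {
    "A. Jones": 31.1,
    "Diaz": 19.1,
    "Francoeur": 15.0,
    "Harris": 8.6,
}

total = sum(braves_of_2007.values())
print(round(total, 1))
```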

The 2006 Cardinals were anchored by two corner infielders, Albert Pujols at first (30.7) and Scott Rolen at third (31.4). The '04 Phillies were led by Jim Thome (18.7), David Bell (14.2), Jason Michaels (12.3), and a bunch of other guys who were in the plus 5 range.

Bottom 15 Teams

Year	Team	Runs
2005	NYA	-102.4
2007	TB	-89
2005	CIN	-85.7
2007	CHA	-82.5
2006	PIT	-81.3
2005	FLA	-80.2
2005	ARI	-80
2004	NYA	-77.1
2006	NYA	-69.8
2007	CLE	-63.5
2006	BOS	-63
2007	BOS	-55
2006	CIN	-49.8
2005	KC	-49
2007	CIN	-48.2

Ouch. The Yankees show up three times, and the '05 team was the worst of the previous four seasons. Their worst performers were Derek Jeter (-43.6), Robinson Cano (-35.9), Bernie Williams (-24.7), and Gary Sheffield (-18).

The '07 Tampa Bay performance was more of a team effort, but Elijah Dukes (-13.8) and Akinori Iwamura (-10.5) show up at the bottom. The '05 Reds had an outfield of Ken Griffey Jr., Adam Dunn, and Wily Mo Pena. Nuff said.

Best and Worst Players

Note that these are player performances in a single year at a single position. Some players played multiple positions, so their overall performance may have been better or worse than the numbers displayed here.

The Top 15

Year	Last	Pos	Runs
2005	Rowand	CF	44.6
2007	Suzuki	CF	34.4
2004	A-Rod	3B	33.3
2007	Grand.	CF	32.6
2007	Wright	3B	32.2
2004	Rolen	3B	31.8
2005	Logan	CF	31.4
2006	Rolen	3B	31.4
2007	Jones	CF	31.1
2006	Pujols	1B	30.7
2005	Everett	SS	30.4
2007	Pujols	1B	30.3
2005	Craw.	LF	30.2
2005	Teix.	1B	29.8
2005	Suzuki	RF	29.7

The Bottom 15

Year	Last	Pos	Runs
2005	Ramirez	LF	-43.8
2005	Jeter	SS	-43.6
2006	Ramirez	LF	-41.7
2005	Cano	2B	-35.9
2005	Griffey	CF	-35.5
2007	Ramirez	LF	-34.1
2007	Braun	3B	-33.2
2004	B.Will.	CF	-32.5
2007	J.Baut.	3B	-30.5
2004	Jeter	SS	-29.1
2004	Blake	3B	-28.8
2007	Dye	RF	-28.2
2007	Jeter	SS	-27.6
2007	Atkins	3B	-27.2
2004	Young	SS	-25.0

The Data

You can download the full spreadsheet for each year right here: 2004, 2005, 2006, and 2007.

Please feel free to mess around with those spreadsheets all you'd like. Also, note that these calculations were all produced by me, so there could well be mistakes.

Anyway, with almost five years of data now, we can begin to better understand fielding through these freely available numbers. In this space over the coming months, we'll hopefully take a look at things like aging, projections, the reliability of these numbers, bench players' vs. starters' fielding, and so on. But you can surely get a head start now.

Comments

An interesting comparison would be gold glove winners compared to these statistical numbers. Clearly, it should be shown or noted that certain players have received gold glove awards when they were not entirely worthy.

It could also be noted that defensive metrics are in no way definitive. Some are better than others and the ones at THT seem to be among the best, but I don't trust them. Rating Matt Diaz as well above-average is crazy. He's just not a good defensive outfielder at all. There is also no way Akinori Iwamura was a below-average defensive player last year at 3B. Eli Dukes was terrible because he was playing out of position in CF, but Aki was a really good defensive 3B.

Tyler-

Why should we believe your observations instead of Myron's numbers, apart from, "Well, he just looked awesome"? How can you explain Iwamura's bad rating? Myron's given us objective data, and you've simply said, "that data is wrong" without explaining.

My understanding was that the 2006 Boston defense, setting aside Manny, was supposed to be really good. And these metrics seem to really hate Manny.

Also, how retarded was it for Seattle to move Ichiro back to RF?

Craig, yes, that would be pretty interesting ...

Tyler, I hear ya. Fielding stats surely aren't definitive, but I don't think scouting is either. Probably some combination of both would work best, especially with a small number of games to work with. I think when anyone tosses out fielding statistics, it is important to remember that they are merely an estimate of actual fielding prowess, and not the end-all-be-all.

NBarnes, Manny is hurt by the Green Monster in these calculations, as they are not park-adjusted. So, he probably isn't as bad as they say he is. But UZR and the other park-adjusted numbers never really like his fielding either, IIRC.