The Baseball Analysts: Saber Talk Archives

Leveling the Playing Field

By Myron Logan and Mike Rogers

The Rule 4 draft is, without question, one of the most important events of the year for Major League teams. One great draft can change the future of a franchise. The draft gives teams an opportunity to acquire young, talented players for a relatively small financial commitment. If one of them reaches the bigs, and becomes even an average player, you’ll garner yourself a ton of value over that player’s first six years.

Naturally, then, the draft, and studying the amateur players, is a major part of each organization’s yearly workload. Consider this response from Chris Long, Padres’ Senior Quantitative Analyst, in an interview with us last year:

What's so amazing about the baseball draft, and I'm sure the draft in other sports, is the sheer number of players to consider. Different ages, sizes, polish, playing environments, growth potentials, levels of competition faced, ability components, injury tendencies, and it goes on. Then there's the information you get from the scouts. Which scouts are better? Are they looking at the right players, in the right way, the right number of times? What's the best way to integrate all of the information you have? Overlaying all of this are considerations of finance, utility, need, risk and the poker game of the actual draft. Draft the right player and he could be worth $50 or even $100 million in value to your club (see Pujols). Draft the wrong players and you'll waste millions and negatively impact your club for years. It's an extremely difficult, messy, noisy, and thoroughly insane problem to work on. It's beautiful.

We all know about scouting. It's crucial to the game, especially in college and high school, and it isn't going anywhere. But a less explored area (at least on the 'net), and perhaps an equally important one, is the thorough analysis of college statistics. Many times, people will bring up what Chris brought up in the above passage, saying there are too many factors to consider, too much noise in the data. There are varying levels of competition, parks, player aging, limited sample sizes, the switch from aluminum bats to wood, etc. It goes on and on.

They are, of course, right on the money. Looking at the raw stats of two college players is probably a hopeless endeavor. Let's look at a quick, made-up example:

Player A: .300/.480/.680
Player B: .280/.420/.600

They are somewhat close, but if that's all we know about each player, we’ll probably go with Player A every time. But, let's say Player B played against the third-toughest opponents in Division 1 and also played in a big pitcher's park. Player A played in a small conference, against relatively weak competition, and a great hitter's park. Now who are you goin' with? And not to mention, this is a simplified example, which leaves out many significant factors. But it just serves as a reminder that the numbers, alone, are just numbers; they have relatively little utility in sorting out baseball players on the college level.

Anyway, as you can see, the reservations people have about college stats are real. However, there's no reason why we can't try to make some adjustments, and make some sense of the madness.

We've spent the last four months importing and adjusting collegiate baseball statistics in an attempt to neutralize the numbers to allow for cross-conference comparisons. To do this, we've discovered that Boyd's World is an invaluable tool. He gets much of the credit for accumulating a lot of the data and making it available online.

Now, our methods were actually pretty simple. We're judging the players in our system on a few things that we feel are a solid scope for the offensive skills necessary to succeed in professional baseball. They include:

Weighted On-Base Average (wOBA)
Isolated Power (IsoP; slugging percentage minus batting average)
Strikeout percentage (K%; strikeouts divided by plate appearances)
Walk percentage (BB%; walks divided by plate appearances)
Speed score

All of the above are pretty self-explanatory, especially with the Wins Above Replacement explosion that happened in November and December of 2008 around the sabermetric blogosphere. However, the wOBA formula we used did not include stolen bases. Honestly, it wasn't for any particular reason; we just happened to grab the one copy of the formula that did not include them.

As for speed score, it's measuring "baseball speed," or, at least, that's the intended goal. It's actually a fairly generic speed score that is not much unlike the one Bill James used in his earlier works.
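For readers who like to see the arithmetic, here is a minimal sketch of the three simplest components in the list above. The function names and the strikeout and walk totals are ours, invented for illustration; only the slash line comes from the made-up Player B earlier in the piece.

```python
def isolated_power(slg: float, avg: float) -> float:
    """IsoP: slugging percentage minus batting average."""
    return slg - avg


def rate_per_pa(events: int, plate_appearances: int) -> float:
    """Share of plate appearances ending in the given event (K% or BB%)."""
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    return events / plate_appearances


# Player B's made-up line from above: .280/.420/.600
iso_p = isolated_power(slg=0.600, avg=0.280)

# Hypothetical counting stats, invented purely for this example
k_pct = rate_per_pa(events=30, plate_appearances=250)
bb_pct = rate_per_pa(events=40, plate_appearances=250)

print(f"IsoP {iso_p:.3f}, K% {k_pct:.1%}, BB% {bb_pct:.1%}")
```

wOBA and speed score are left out here because their exact weights are not spelled out in this post.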

But, what do we take into account when adjusting these numbers? For us, it was park factors and level of competition faced. Those two components can vary from team-to-team in such a dramatic fashion that you'd initially swear they aren't right. For instance, Air Force had a 4-year park factor from 2005-2008 of 145. Conversely, a school like Longwood University had a park factor over that time of just 72. With such drastic discrepancies, it was important to address this. Again, drawing from Boyd's World, we have multiple-year park factors. He lists two for each team, one being a PF and one being TPF – or Park Factor and Total Park Factor. The former is just rating that team's home park, while the latter is rating all of the parks that team played in over the course of time it was tracked. So, Air Force's 145 park factor is just their home park. Playing in the Mountain West, they frequent some of the most hitter friendly parks in collegiate baseball, and their Total Park Factor was 128 from 2005-08. Basically, over those years, Air Force's team played in environments that were 28 percent more offense-friendly than a neutral ballpark, which would have a rating of 100.

To neutralize for park factors, we take the wOBA for each hitter, and simply run it through this: wOBA*square root(100/Total Park Factor). This nets us a Park-Adjusted wOBA (PAwOBA).

But that's just the first part of the components to neutralize. You also have to take into account the competition these numbers are being tested against. As mentioned previously, two stat lines, unadjusted, are not equal. Thankfully, Boyd's World comes through again with his Strength of Schedule ratings. To neutralize this, we do pretty much the same from above.

PAwOBA*square root(Strength of Schedule/100)
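The two adjustments above can be sketched in a few lines. This is only an illustration under our own naming: `park_adjust`, `schedule_adjust`, and `neutralize` are hypothetical helpers, the .450 raw wOBA and the strength-of-schedule value of 95 are invented, and we assume both Boyd's World ratings are scaled so that 100 is neutral. The 128 Total Park Factor is Air Force's figure quoted earlier.

```python
import math


def park_adjust(stat: float, total_park_factor: float) -> float:
    """Deflate (or inflate) a stat by the square root of 100 / TPF."""
    return stat * math.sqrt(100.0 / total_park_factor)


def schedule_adjust(stat: float, strength_of_schedule: float) -> float:
    """Scale a stat by the square root of SoS / 100."""
    return stat * math.sqrt(strength_of_schedule / 100.0)


def neutralize(stat: float, total_park_factor: float,
               strength_of_schedule: float) -> float:
    """Apply the park adjustment first, then the schedule adjustment."""
    return schedule_adjust(park_adjust(stat, total_park_factor),
                           strength_of_schedule)


# Hypothetical hitter: raw wOBA of .450 in Air Force's run environment
# (TPF 128) against a slightly soft schedule (SoS 95, made up).
raw_woba = 0.450
pa_woba = park_adjust(raw_woba, 128)
adj_woba = neutralize(raw_woba, 128, 95)

print(f"raw {raw_woba:.3f} -> park-adjusted {pa_woba:.3f} -> fully adjusted {adj_woba:.3f}")
```

Taking the square root softens the adjustment, so a 128 park factor trims roughly 12 percent off the stat rather than the full 28.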

This gets us a wOBA for each player that is now both park- and competition-adjusted. We do this for IsoP as well, using the same method and simply substituting IsoP for wOBA. And before we jump straight to the table (even though this is going on long enough), we'd like to give a brief introduction to our "Score" category. We don't have a catchy name for it yet (although we're open to all suggestions), but it encompasses all of the categories that we're tracking. It weights the adjusted wOBA, adjusted IsoP, K% and BB%, and throws in our speed score as well. But we've rambled enough. On to the 2008 stats for the first five college bats taken in the 2008 Rule 4 draft:

A note on speed score: It's scaled down so it goes as such: -5 is terrible, 0 is bad, 5 is average, 10 is good, 15 is great, 20+ is flat-out burner.

The above are nothing more than just the 2008 numbers for the first five college bats taken last June. They are not meant to be a predictor of talent moving from aluminum to wood bats. Instead, it's just, at the moment, adjusting to see who had the best statistical seasons when you account for who they were playing and where. When the 2009 draft comes around, we'll have a better tool to judge player performance than just the raw stats, and hopefully it will shed some light onto who the top prospects are.

Also, don't forget that we haven't considered positional values or defense. A player's position is very important at this level. Players that start on the left of the spectrum (1b, left field, right field) have to hit a ton to make it in the bigs. Most great prospects start on the right side of the spectrum as amateurs and gradually shift to the left as they age, provided that their bats can play at those less-demanding defensive positions.

Additional Resources

Earlier in the article, it was mentioned that this type of stuff has been somewhat unexplored on the Internet. While that may be the case, there's certainly been plenty of research into the area:

Right here at Baseball Analysts, Kent Bonham did some very similar work back in 2006.
Jeff Sackmann, partnering with Bonham, runs collegesplits.com. He also does great work at The Hardball Times, much of it focusing on the college game and its numbers.
This post at Sons of Sam Horn details how to go about some of these adjustments.
Lincoln Hamilton, at Project Prospect, has also done some similar analysis.

A Glance at the MVP Candidates

By Myron Logan

Last week we took a look at how we should go about picking the Most Valuable Player in each league. Now, let's take a look at some of the leaders in a few different stats. You may remember that we broke down the selection process into a few different categories; context-neutral stats, context-dependent stats, and contribution to real wins. The third category is largely unexplored, at least in terms of stats we could use, so we'll concentrate on the first two.

Context-Neutral Leaderboard

There are plenty of places you can go to find context-neutral stats. They are probably the most popular of the three categories mentioned above. To keep things simple, we'll go with Justin's Total Value Estimates. The great thing about this stat is that it includes virtually everything you'd want to include, like hitting (based on linear weights) and fielding (based on zone rating and revised zone rating). Also, there are adjustments for park, league, position, and players are measured against replacement level. Here are the top 10 players in the American and National League:

American League

Sizemore   Cle  76.6
Rodriguez  NY   63.0
Pedroia    Bos  53.5
Granderson Det  53.3
Roberts    Bal  50.2
Mauer      Min  49.9
Hamilton   Tex  49.2
Beltre     Sea  49.0
Youkilis   Bos  44.6
Markakis   Bal  44.0

National League

Pujols   Stl  87.3
Berkman  Hou  75.5
Jones    Atl  67.4
Utley    Phi  66.8
Ramirez  Fla  65.2
Beltran  NY   56.8
Holliday Col  56.6
Wright   NY   53.7
Giles    SD   52.4
Braun    Mil  48.4

(numbers through September 5th)

As you can see there, Sizemore and Pujols are the clear leaders. There's a similar drop to second place in both leagues and then some bunching up after that.

Context-Dependent Leaderboard

Conveniently, Sky Kalkman's taken Win Probability Added and incorporated fielding, position, and replacement level to create a stat similar to Justin's. Here's the NL leaderboard, this time in wins above replacement rather than runs:

National League

Berkman  Hou  8.6
Pujols   Stl  7.9
Beltran  NY   7.1
Ramirez  Fla  6.8
Holliday Col  6.7
Wright   NY   5.7
Burrell  Phi  5.2
Lee      Hou  5.0
Utley    Phi  4.8

So, what's happened in the NL as we've switched from straight linear weights to WPA? Well, Lance Berkman has jumped over Albert Pujols to take the top spot. He's been particularly clutch (1.78 clutchiness points), while Pujols has merely been average in the clutch. Chase Utley's dropped from third to tenth thanks to his struggles in clutch situations, as measured again by FanGraphs' clutchiness (-2.13, last in the NL). Remember, this doesn't mean he isn't clutch; in fact, in his career, he's actually been above average in clutchiness. It just means that if you believe context (i.e., performance with men on base, in late game situations, etc.) should be considered in the MVP voting, Chase Utley probably isn't your guy.

Sky hasn't run the calculations yet in the AL, so here's the poor-man's version (just plain old WPA, without the positional adjustments, fielding, and so on):

Mauer    Min  4.16
Morneau  Min  4.14
Quentin  Chi  3.89
Hamilton Tex  3.63
Pena     TB   3.55
Cabrera  Det  3.53
Sizemore Cle  3.38
Pedroia  Bos  3.23
Huff     Bal  2.64
Ibanez   Sea  2.55

Remember, the names aren't that important yet. Heck, we've still got a few crucial weeks left in the season. For now, what's more important is that we understand what everybody is talking about when all of the MVP articles roll around. A columnist talking about clutchness in Minnesota ... he's in the context-considered camp. A blogger ridiculing the aforementioned columnist's clutch argument ... he's probably in the context-neutral camp.

Putting the V in MVP

By Myron Logan

In the next month, there will undoubtedly be a ton of debate surrounding the MVP award, in both leagues. While people will have many angles for their choice, from a sabermetric perspective, we should all be on the same page, or at least understand where everyone else is coming from.

In this entry at The Book Blog, there is an in-depth discussion of last year's MVP award, and, more importantly, the process (or processes) one should use to pick the winner. What follows is my attempt to convert that lengthy thread into an article, and hopefully add my own twist. So, thanks in advance to Tango, MGL, and all the commenters over there for their help in shaping my opinion on this matter (of course, if I screw something up, which is almost inevitable, don't blame them!).

Anyway, let's get back to the discussion. There's one word here that really throws everyone off, and that is value. How do we define value? Well, there isn't a simple answer. If you read the above linked thread, there are three views that come up most often:

Context-Neutral Stats

Examples: Batting Runs, VORP, Runs Created, etc.

If you're in this group, you believe that clutch performance shouldn't be considered in the MVP voting process. A home run in a 10-0 blowout is worth just as much as a walk off homer in the 9th. An example of a context neutral batting stat is Pete Palmer's Batting Runs (which we've discussed here before). As you can see from the formula, each homer (or any event, for that matter) is worth the same (1.4), regardless of when it occurs.

Context Considered

Example: Win Probability Added

Now, we're looking at "clutch" performance or, more generally, context. WPA looks at how much each event changes a team's chance of winning. So, by WPA, a solo homer in a 10-0 blowout may be worth, oh, let's say .01 WPA points (or virtually nothing) and a walk off homer in the 9th might be worth around .5 WPA points. There's a huge difference there. So, if a player does well (or poorly) in clutch situations, it's going to impact his MVP candidacy, under this process. Note that the player's team doesn't necessarily have to win; the team can lose but a player can still gain WPA points, or contribute to a theoretical win.
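Here is a rough sketch of that bookkeeping, assuming you already have a win expectancy before and after each plate appearance. The win expectancy figures below are made up to match the ballpark numbers in the paragraph above; they are not taken from a real win expectancy table, and `wpa` and `total_wpa` are just our names for the illustration.

```python
def wpa(win_exp_before: float, win_exp_after: float) -> float:
    """Win Probability Added for one event: the change in win expectancy."""
    return win_exp_after - win_exp_before


def total_wpa(events: list[tuple[float, float]]) -> float:
    """Sum WPA over a list of (before, after) win expectancy pairs."""
    return sum(wpa(before, after) for before, after in events)


# Made-up win expectancies for the batter's team
blowout_homer = wpa(0.985, 0.995)   # solo shot in a 10-0 game
walk_off_homer = wpa(0.50, 1.00)    # tie game, 9th inning
rally_killer = wpa(0.40, 0.25)      # an out that hurts, so WPA is negative

print(f"blowout homer:  {blowout_homer:+.2f}")
print(f"walk-off homer: {walk_off_homer:+.2f}")
print(f"rally killer:   {rally_killer:+.2f}")
print(f"total:          {total_wpa([(0.985, 0.995), (0.50, 1.00), (0.40, 0.25)]):+.2f}")
```

The same home run is worth wildly different amounts depending on the game state, which is exactly the point of the context-considered camp.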

Must Contribute to Real Wins

Example: This may be a good attempt

Unlike the above process, here you're only counting performance that directly affects the team's win total. If a player hits three home runs in an 8-6 loss, he doesn't get any credit. This is certainly going to favor players on winning teams and players that do well in wins.

It's important to note that these are just three general groups. There are surely others out there that can be considered, and of course there are sub-groups inside of these groups and so on. The point is, as Tango says, you've got to pick a position and stick with it. There's a good chance that there are three or four reasonable MVP candidates in each league, depending on your stance.

Fielding and Other Stuff

As you'll note, we've only talked about offense so far, really. We can't ignore fielding, and base running, and the other facets of the game. At this point with fielding, we're almost always going to have a "context-neutral" stat, whether we use UZR, PMR, THT's stats, or whatever. There's no clutchness factor in any of the fielding metrics (you wonder why Derek Jeter doesn't fare well ; ). Until someone makes a WPA-like fielding stat, we're going to have to use what's available. Also, there are a slew of other things to consider, like, as mentioned, base running, positional adjustments, park adjustments, and so on.

When a writer talks about a player's huge hits in big wins, you're probably going to be shaking your head, as his overall numbers may not be that great. But, remember, that writer may just be onto something. While that player may not have been the best player in the league, he just may have added the most value.

THT Fielding Data, 2004-2007

By Myron Logan

A few weeks ago, we used the fielding stats at The Hardball Times to make a little fielding metric. We looked at the best and worst teams and players of 2008. That's great, but if we really want to analyze fielding in a meaningful way, we need more data. THT offers stats that go back to 2004, so let's go through the same process with the 04-07 seasons. Now, rather than just half a year's worth of stats, we'll have close to five years of data.

The Process (Briefly)

The methodology is explained quite in-depth in the above linked post, but let's go through a quick recap just to be sure everyone is on the same page. Basically, we have data on each fielder's performance in their defined "zone" and out of it. We're using both of these areas to find out how many runs a player is worth, above or below average. Here's a quick example, with numbers just for illustration.

Nomar -- 50 BIZ, 40 plays, .8 RZR (plays/BIZ)
League average RZR (at short) -- .82
League avg. plays (for 50 BIZ) -- 41 plays

So, in his zone, Nomar would be 1 play below average. We do the same thing on out of zone balls, with the only difference being that we don't know exactly how many opportunities players have out of their zone. We assume that in-zone chances reflect out of zone chances, and we use BIZ as a proxy for OOZ opportunities. If you're confused here, check out the link up top, as it may answer some of your questions.

After we've done that, using Chris Dial's conversions, we turn plays above/below average into runs above/below average. And ... that's it. Not too difficult.
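The recap above boils down to something like the sketch below. The in-zone numbers are the Nomar illustration from this post; the out-of-zone figures and the runs-per-play value are placeholders we made up, since Chris Dial's actual conversions aren't reproduced here. Function names are ours.

```python
def plays_above_average(plays_made: float, biz: float, league_rate: float) -> float:
    """Plays made minus what a league-average fielder makes on the same chances."""
    return plays_made - league_rate * biz


def fielding_runs(in_zone_plays: float, out_zone_plays: float, biz: float,
                  league_rzr: float, league_ooz_per_biz: float,
                  runs_per_play: float) -> float:
    """Combine in-zone and out-of-zone plays above average, then convert to runs.

    BIZ stands in for out-of-zone opportunities, as described above.
    """
    in_zone = plays_above_average(in_zone_plays, biz, league_rzr)
    out_zone = plays_above_average(out_zone_plays, biz, league_ooz_per_biz)
    return (in_zone + out_zone) * runs_per_play


# In-zone: the illustrative Nomar line (50 BIZ, 40 plays, league RZR .82)
nomar_in_zone = plays_above_average(plays_made=40, biz=50, league_rate=0.82)

# Out-of-zone plays, league OOZ rate, and runs per play are all hypothetical
nomar_runs = fielding_runs(in_zone_plays=40, out_zone_plays=8, biz=50,
                           league_rzr=0.82, league_ooz_per_biz=0.15,
                           runs_per_play=0.75)

print(f"in-zone plays vs. average: {nomar_in_zone:+.1f}")
print(f"runs vs. average:          {nomar_runs:+.2f}")
```

In practice the runs-per-play figure would differ by position, so a single constant like this is only a stand-in.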

Positional Averages

If you look closely at the positional averages from year to year (which others have done), you'll notice some pretty big differences. For instance, here's RZR in the outfield for all four years:

	2004	2005	2006	2007
LF	0.63	0.633	0.861	0.855
CF	0.796	0.815	0.894	0.888
RF	0.65	0.648	0.888	0.877

For 2004-2005, the average RZR (plays made in zone divided by total balls in zone) in left is around .63. In 2006 and 2007, it jumped up to over .85. You may notice a similar thing happening in right field, and to a lesser extent, center field. Surely, outfielders didn't all of a sudden improve in the 2005 off season; rather, something happened to the way the zones are drawn or how fly balls or line drives are handled by the folks over at Baseball Info Solutions (that's where THT gets the data).

There are some differences in the infield, too, but they aren't quite as bad. There are plenty of ways to deal with this problem (check the first link in the last paragraph), but note that here we're just calculating the stats year-by-year (i.e., we made no attempt to normalize the numbers like Mr. Wyers did). You'll be able to see all of the positional averages if you want to download the data at the bottom of the page.

The Best and Worst Teams

This is from 2004-2007, and is simply a team's overall runs above or below average, found by adding up all the players' numbers on each team:

Top 15 Teams

Year    Team    Runs
2007    ATL     93.5
2006    STL     91.3
2004    PHI     79.0
2006    HOU     74.6
2005    CHA     69.5
2006    ATL     68.2
2007    NYN     67.8
2004    LAN     65.2
2006    SEA     60.0
2006    MIL     53.9
2007    TOR     52.1
2005    LAA     50.6
2005    SEA     49.6
2007    STL     46.2
2007    KC      43.0

The 2007 Atlanta Braves outfield was probably one of the better defensive outfields of the past few years, at least by these numbers. Check it out:

A. Jones   31.1 runs
Diaz       19.1
Francoeur  15.0
Harris      8.6

That's like 74 runs above average, just in the outfield. And, get this, they didn't have one outfielder who was rated below average (unless you count Pete Orr, who missed the one ball in his zone ; )

The 2006 Cardinals were anchored by two corner infielders, Albert Pujols at first (30.7) and Scott Rolen at third (31.4). The '04 Phillies were led by Jim Thome (18.7), David Bell (14.2), Jason Michaels (12.3), and a bunch of other guys who were in the plus 5 range.
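The team totals are nothing fancier than a group-and-sum over the player rows. As a small sketch, here it is with only the handful of player figures quoted above, so the Cardinals number is a partial sum of two players, not the full team total. The row layout and the `team_totals` name are our own.

```python
from collections import defaultdict

# (year, team, player, runs above average) -- only the players named above
player_rows = [
    (2007, "ATL", "A. Jones", 31.1),
    (2007, "ATL", "Diaz", 19.1),
    (2007, "ATL", "Francoeur", 15.0),
    (2007, "ATL", "Harris", 8.6),
    (2006, "STL", "Pujols", 30.7),
    (2006, "STL", "Rolen", 31.4),
]


def team_totals(rows):
    """Sum player runs above/below average by (year, team)."""
    totals = defaultdict(float)
    for year, team, _player, runs in rows:
        totals[(year, team)] += runs
    return dict(totals)


totals = team_totals(player_rows)
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
for (year, team), runs in ranked:
    print(year, team, round(runs, 1))
```

Run against the full spreadsheets, the same loop would reproduce the team tables in this section.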

Bottom 15 Teams

Year	Team	Runs
2005	NYA	-102.4
2007	TB	-89
2005	CIN	-85.7
2007	CHA	-82.5
2006	PIT	-81.3
2005	FLA	-80.2
2005	ARI	-80
2004	NYA	-77.1
2006	NYA	-69.8
2007	CLE	-63.5
2006	BOS	-63
2007	BOS	-55
2006	CIN	-49.8
2005	KC	-49
2007	CIN	-48.2

Ouch. The Yankees show up three times, and the '05 team was the worst of the previous four seasons. Their worst performers were Derek Jeter (-43.6), Robinson Cano (-35.9), Bernie Williams (-24.7), and Gary Sheffield (-18).

The '07 Tampa Bay performance was more of a team effort, but Elijah Dukes (-13.8) and Akinori Iwamura (-10.5) show up at the bottom. The '05 Reds had an outfield of Ken Griffey Jr., Adam Dunn, and Wily Mo Pena. Nuff said.

Best and Worst Players

Note that these are player performances in a single year at a single position. Some players could have played multiple positions, and obviously performed better or worse overall than the numbers displayed here.

The Top 15

Year 	Last 	Pos 	runs
2005	Rowand	CF	44.6
2007	Suzuki	CF	34.4
2004	A-Rod	3B	33.3
2007	Grand.	CF	32.6
2007	Wright	3B	32.2
2004	Rolen	3B	31.8
2005	Logan	CF	31.4
2006	Rolen	3B	31.4
2007	Jones	CF	31.1
2006	Pujols	1B	30.7
2005	Everett	SS	30.4
2007	Pujols	1B	30.3
2005	Craw.	LF	30.2
2005	Teix.	1B	29.8
2005	Suzuki	RF	29.7

The Bottom 15

Year 	Last 	Pos 	runs 
2005	Ramirez	LF	-43.8
2005	Jeter	SS	-43.6
2006	Ramirez	LF	-41.7
2005	Cano	2B	-35.9
2005	Griffey	CF	-35.5
2007	Ramirez	LF	-34.1
2007	Braun	3B	-33.2
2004	B.Will.	CF	-32.5
2007	J.Baut.	3B	-30.5
2004	Jeter	SS	-29.1
2004	Blake	3B	-28.8
2007	Dye	RF	-28.2
2007	Jeter	SS	-27.6
2007	Atkins	3B	-27.2
2004	Young	SS	-25.0

The Data

You can download the full spreadsheet for each year right here: 2004, 2005, 2006, and 2007.

Please feel free to mess around with those spreadsheets all you'd like. Also, note that these calculations were all produced by me, so there could surely be mistakes.

Anyway, with almost five years of data now, we can begin to better understand fielding through these freely available numbers. In this space over the coming months, we'll hopefully take a look at things like aging, projections, the reliability of these numbers, bench players' vs. starters' fielding, and so on. But you can surely get a head start now.

Measuring Offense with Batting Runs

By Myron Logan

Two weeks ago, we looked at the fielding performance of all major leaguers (well, all but catchers and pitchers). I figured it wouldn’t hurt to take a look at offensive performance here today.

When you talk about offensive metrics, well, you’ve got a lot to talk about. You’ve got linear methods (like Pete Palmer’s Batting Runs), multiplicative methods (like Bill James’ Runs Created), rate stats (OBP/SLG, wOBA, GPA, etc.), and a bunch of other things you could do. Really, your stat of choice should depend heavily upon what question you’re trying to answer. Anyway, rather than try to recap the history of run estimation, something I would inevitably fail miserably at, let me just explain what I did.

Palmer’s Batting Runs

That’s essentially the stat we’ve calculated, and you can read a little about it here. It should be very similar to the number located on each player page at Baseball-Reference.com (“BtRns” under Special Batting). If you’re new to this stuff, well, the process is actually pretty simple. You take a player’s stats (singles, doubles, triples, etc.) and multiply them by the corresponding number in the formula. So, if Milton Bradley has 53 singles, you multiply that by .47, then take his doubles and multiply them by .85, and so on. At the end, you subtract (outs * ~.3). Base stealing is added in separately, and is simply .22*SB - .38*CS.

What you end up with is the number of runs above (or below) average a player has produced in his given playing time.
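A minimal sketch of that calculation is below. The single, double, and home run weights, the roughly .3 out value, and the stolen base formula are the ones quoted in this post. The triple (1.02) and walk (.33) weights are an assumption, borrowed from the commonly cited version of Palmer's formula, and may not match exactly what was used here. The stat line is invented apart from the 53 singles.

```python
# Event weights. 1B, 2B and HR are quoted in the post; 3B and BB are
# ASSUMED from the commonly cited Palmer formula.
WEIGHTS = {"1B": 0.47, "2B": 0.85, "3B": 1.02, "HR": 1.40, "BB": 0.33}


def stolen_base_runs(sb: int, cs: int) -> float:
    """Base stealing component: .22 per steal, minus .38 per caught stealing."""
    return 0.22 * sb - 0.38 * cs


def batting_runs(singles: int, doubles: int, triples: int, homers: int,
                 walks: int, outs: int, out_value: float = 0.3,
                 sb: int = 0, cs: int = 0) -> float:
    """Runs above (or below) average from linear weights plus base stealing."""
    positive = (WEIGHTS["1B"] * singles
                + WEIGHTS["2B"] * doubles
                + WEIGHTS["3B"] * triples
                + WEIGHTS["HR"] * homers
                + WEIGHTS["BB"] * walks)
    return positive - out_value * outs + stolen_base_runs(sb, cs)


# Hypothetical stat line (only the 53 singles echo the example above)
runs = batting_runs(singles=53, doubles=20, triples=1, homers=15,
                    walks=50, outs=200, out_value=0.3, sb=4, cs=2)
print(f"Batting Runs: {runs:+.2f}")
```

Because every event has a fixed weight, the total is just a weighted count: no event is worth more or less depending on when it happened.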

Adjusting for Parks

Surely, we want to make some adjustment for the park that a player plays his home games in. To do this, we take the outs number (.286 for the AL) and multiply it by the player’s park factor. For, let’s say, Jason Varitek, we penalize him .297 (.286*1.04) for his outs, rather than .286. If we go through and do this for every player, we have a pretty decent park adjustment*. By the way, I used Patriot’s park factors.

*There is a more complicated, more technically correct way to make this adjustment. The difference, however, is pretty tiny, so I’m just sticking with the simpler adjustment.
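In code, the simple version of the park adjustment is a one-liner. The sketch below uses the Varitek numbers from above; the 300 outs are hypothetical, and the park factor is expressed as a ratio (1.04) rather than on a 100 scale. The function name is ours.

```python
def park_adjusted_out_value(league_out_value: float, park_factor: float) -> float:
    """Scale the per-out penalty by the player's park factor."""
    return league_out_value * park_factor


AL_OUT_VALUE = 0.286

# Varitek example from above: park factor of 1.04
varitek_out_value = park_adjusted_out_value(AL_OUT_VALUE, 1.04)

# Over a hypothetical 300 outs, how many extra runs does the adjustment cost him?
outs = 300
extra_penalty = outs * (varitek_out_value - AL_OUT_VALUE)

print(f"adjusted out value: {varitek_out_value:.3f}")
print(f"extra runs charged over {outs} outs: {extra_penalty:.1f}")
```

A hitter in a pitcher's park (factor below 1.00) gets the opposite treatment: each out costs him a little less.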

The Good

As I understand it, a linear weights type method for individual hitters is the best way to go. While something like Runs Created is a fine run estimator, oftentimes it will overvalue great hitters, because they interact with their teammates and not in a lineup of clones (i.e., there aren’t nine Albert Pujols’ in the batting order, but rather one Pujols and eight mortals). Runs Created assumes a player interacts outside of a team construct, while Batting Runs does not.

And unlike, say, OPS, probably the most popular stat on the internet, we actually know what Batting Runs is measuring – runs! We know there’s a difference between an .800 OPS and a .900 OPS, but we don’t really know what one point of OPS is worth. The difference between 30 Batting Runs above average and 20 in, let’s say, 400 PA, is 10 runs. Pretty simple and straightforward.

Limitations

The negatives have more to do with the simplicity of my calculation than anything else. There are things you can (and probably should) add like double play adjustments, a different out value for strikeouts, and so on. It all depends on how accurate and detailed you want to get. Next time we do this, probably at the end of the year, we’ll use a more detailed formula.

Furthermore, the weights used here are long term averages and are not based on any specific context. For instance, if you want to know how many runs J.D. Drew added to the Red Sox, rather than an average team, you’d probably want to look at something like Custom Linear Weights.

Also, remember that this method counts, say, every home run as 1.40 runs, as that is what it’s worth in the long run. However, if a player has a particularly clutch year or something, he’s obviously getting undercut here. Going back to what I said earlier, it really depends on what exactly you want to measure.

Finally, this is just one year’s worth of stats, and does not represent a player’s true talent. To find that, or at least estimate it, you’d want multiple years of data, regression to the mean, an age adjustment, and so on.

Alright, enough babbling, let’s see some numbers. Here are the top 15 hitters in each league:

AL                         NL	
1. Rodriguez, NY    38.2   1. Pujols, Stl     52.2
2. Bradley, Tex     34.4   2. Berkman, Hou    47.4
3. Sizemore, Cle    32.3   3. Jones, Atl      42.7
4. Markakis, Bal    28.2   4. Holliday, Col   38.1
5. Drew, Bos        28.1   5. Ludwick, Stl    33.5
6. Quentin, Chi     28.0   6. Ramirez, Fla    31.1
7. Morneau, Min     26.7   7. Wright, NY      30.2
8. Huff, Bal        26.2   8. Utley, Phi      27.7
9. Kinsler, Tex     26.1   9. Lee, Hou        26.8
10.Hamilton, Tex    25.5   10.McCann, Atl     26.4
11.Cabrera, Det     24.9   11.Burrell, Phi    26.2
12.Youkilis, Bos    24.3   12.Gonzalez, SD    24.6
13.Roberts, Bal     23.5   13.Braun, Mil      23.1
14.Ramirez, Bos     22.7   14.Teixeira, Atl   22.9
15.Giambi, NY       22.6   15.Bay, Pit        22.3

And how about the worst 10:

AL                         NL		
1. Pena, KC        -27.0   1. Sanchez, Pit   -23.4
2. Gomez, Min      -17.9   2. Francoeur, Atl -21.8
3. Johjima, Sea    -17.9   3. Patterson, Cin -21.3
4. Betancourt, Sea -15.7   4. Vizquel, SF    -21.1
5. Cabrera, NY     -15.6   5. Taveras, Col   -20.4
6. Varitek, Bos    -14.4   6. Bourn, Hou     -19.8
7. Vidro, Sea      -14.4   7. Jones, LA      -18.6
8. Gutierrez, Cle  -14.3   8. Greene, SD     -17.9
9. Marte, Cle      -13.9   9. Pena, Was      -17.7
10.Bynum, Bal      -13.7   10.Young, Ari     -15.8

Here’s the spreadsheet with all players*:

*I took out the pitchers in the NL while making the calculations. Of course, I’m just realizing it now, but I forgot to do the same in the AL (darn inter-league play). I took them out now, but I’m hoping it didn’t have too much of an effect on the final numbers (and I really don’t think it did).

Unlike the fielding spreadsheet, unfortunately, this one won’t automatically update – I had some computer issues and had to use someone else’s, and I couldn’t seem to get the auto-update thing to work. Anyway, feel free to play around in there and use the numbers for whatever you’d like.

Now that we’ve covered hitting and fielding, we’re getting close to a pretty decent little player evaluation ‘system.’ Add in some positional adjustments, some league adjustments, maybe a base running stat, and some other stuff and we’d be pretty good. But hopefully this will tide you over in those message board/blog debates.

Next time, if my computer returns safely, we’ll dig a little deeper into the fielding data available at The Hardball Times.

*Big thanks to Patriot for helping me better understand a few things and Baseball Prospectus for the data.