Baseball Beat
January 04, 2010
Graphing the Hitters
By Rich Lederer

Thanks to Fangraphs and Jeremy Greenhouse, I now have access to the 2009 stats in three spreadsheets covering 706 hitters, 664 pitchers, and 1,877 rows for fielders (including seven for Ben Zobrist). While combing through these numbers, it occurred to me that I had graphed pitchers and payroll efficiency over the years but never hitters. Well, that's about to change.

If a picture is worth a thousand words, then a graph is worth at least as many. Tables are nice to peruse but graphs are clearly more visual than columns and rows of stats. Although there is nothing groundbreaking as it relates to the graphs that I have chosen to present, I believe they tell their own stories. They are designed to be simple and straightforward. Two axes, four quadrants, and player names identifying outliers.

The first graph, which I call Productivity, plots on-base percentages on the x-axis and slugging averages on the y-axis for every qualified batter in 2009. The intersection of the MLB averages for OBP (.333) and SLG (.418) creates quadrants that classify players as above average in both (upper right), below average in both (lower left), or above average in one and below average in the other (upper left and lower right).
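For readers who would rather sort the spreadsheet than eyeball the graph, the quadrant rule can be sketched in a few lines of Python. This is only an illustration, not how the graph was built: the function name `quadrant` is made up for the example, and treating a hitter who sits exactly on an average as "not above" is an assumption. The two sample lines are the 2009 OBP and SLG figures for Albert Pujols and Yuniesky Betancourt.

```python
# League-average cut lines for 2009 MLB (OBP on the x-axis, SLG on the y-axis).
LEAGUE_OBP = 0.333
LEAGUE_SLG = 0.418


def quadrant(obp, slg, avg_obp=LEAGUE_OBP, avg_slg=LEAGUE_SLG):
    """Return which quadrant of the OBP-vs-SLG graph a hitter lands in.

    Assumption: a value exactly equal to the average counts as "not above".
    """
    above_obp = obp > avg_obp
    above_slg = slg > avg_slg
    if above_obp and above_slg:
        return "upper right"   # above average in both
    if not above_obp and not above_slg:
        return "lower left"    # below average in both
    if above_slg:
        return "upper left"    # power without on-base
    return "lower right"       # on-base without power


# 2009 (OBP, SLG) pairs for two of the outliers.
hitters = {
    "Albert Pujols": (0.443, 0.658),
    "Yuniesky Betancourt": (0.274, 0.351),
}

for name, (obp, slg) in hitters.items():
    print(f"{name}: {quadrant(obp, slg)}")
```

Swapping in a different pair of averages (say, one league only) just means passing `avg_obp` and `avg_slg` explicitly.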

Note: You can download a spreadsheet containing the AVG, OBP, SLG, and OPS of the 155 hitters here. This information can also be used to locate the 135 players not labeled in the graph below.

Productivity.png

I've got two questions:

1. Is Albert Pujols any good?
2. Is Yuniesky Betancourt really the starting shortstop for the Kansas City Royals?

OK, I've got one more:

3. Did Royals GM Dayton Moore just sign Jason Kendall to a two-year contract for $6 million?

Bonus round:

4. Is it true that Moore signed a four-year extension with the Royals through 2014 more than a year before his current deal expired?

The answer to all four questions is ... drum roll, please ... YES!

Pujols (.443 OBP, .658 SLG) is very, very good. He carried my fantasy baseball team to a championship in 2009. Thank goodness for pulling the piece of paper with "1" out of the hat prior to our draft. He won his third National League Most Valuable Player Award unanimously, leading the senior circuit in OBP, SLG, OPS (1.101), OPS+ (188), R (124), HR (47), XBH (93), TOB (310), TB (374), and several other advanced metrics. Prince Albert doesn't turn 30 until the middle of this month, yet he has already produced over 1,700 hits and 800 walks, slugged 387 doubles and 366 home runs, and surpassed 1,000 runs scored and 1,100 runs batted in over the first nine years of his career.

Betancourt, on the other hand, had the lowest OBP (.274) and the seventh-worst SLG (.351) in the majors. The distinction of ranking dead last in SLG went to Yuniesky's newest teammate, the 35-year-old Kendall, who has "hit" .261/.336/.321 (OPS+ of 76) with 8 HR in nearly 3,000 plate appearances since being traded by the Pittsburgh Pirates (or was it the "Stealers"?) after the 2004 season.

Joe Mauer (very good) and Emilio Bonifacio (very bad) also stood out last year. Mauer was named AL MVP, sweeping the Triple Crown in rate stats with a .365 AVG, .444 OBP, and .587 SLG while winning his third batting title in the past four seasons. He also led the league in OPS (1.031) and OPS+ (170). Did I mention that Kendall is Mauer's third-most similar player through age 26?

Speaking of Bonifacio, how many fantasy owners picked him up when he was hitting .583/.600/.833 after the first week of the season? He rewarded them by putting up a .233/.288/.279 line the rest of the way.

There are a number of other interesting observations from the Productivity graph. For example, check out the names of the high-OBP and high-SLG players in the northeast quadrant. In addition to Pujols, the list includes Prince Fielder (.412/.602), Joey Votto (.414/.567), Derrek Lee (.393/.579), Ryan Howard (.360/.571), and Kendry Morales (.355/.569). First basemen all. The diamond directly below Votto's is Kevin Youkilis (.413/.548). The one down and to the right of Lee's is Miguel Cabrera (.396/.547). The diamond that is between Youk and Miggy is Adrian Gonzalez (.407/.551). Lastly, the one down and to the left of Lee is Mark Teixeira (.383/.565).

The following graph is a duplicate of the one above but it also includes a trendline. I chose a linear trendline as it is virtually the same as the other choices. The equation for the dataset of all qualified hitters is y = 1.1493x + 0.051. Or, more specifically, SLG = 1.1493 x OBP + 0.051. Due to the lack of pitchers and bench players, the qualified group produced a simple average OBP of .354 and SLG of .458, or 6.3% and 9.6%, respectively, higher than the league norm.
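The arithmetic in that paragraph is easy to check. Here is a rough sketch, assuming the trendline is an ordinary least-squares fit (a least-squares line always passes through the point of averages, so plugging in the group's average OBP should hand back its average SLG). The helper names `trend_slg` and `pct_above` are invented for the example.

```python
# Trendline coefficients for qualified hitters: SLG = 1.1493 * OBP + 0.051
SLOPE = 1.1493
INTERCEPT = 0.051


def trend_slg(obp):
    """SLG implied by the linear trendline for a given OBP."""
    return SLOPE * obp + INTERCEPT


def pct_above(value, baseline):
    """Percentage by which value exceeds baseline."""
    return (value / baseline - 1) * 100


qualified_obp, qualified_slg = 0.354, 0.458   # simple averages, qualified hitters
league_obp, league_slg = 0.333, 0.418         # MLB averages

# The average qualified hitter should sit on the line.
print(round(trend_slg(qualified_obp), 3))                # 0.458

# How far the qualified group runs ahead of the league as a whole.
print(round(pct_above(qualified_obp, league_obp), 1))    # 6.3
print(round(pct_above(qualified_slg, league_slg), 1))    # 9.6
```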

Productivity with Trendline.png

The hitters below the trendline get more of their productivity from OBP while those above the line get more from SLG. While many of the players below the trendline are not particularly skilled at reaching base (wherefore art thou Bonifacio?), they are even more inept at hitting for power.
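Which side of the line a hitter falls on comes down to the sign of one subtraction: actual SLG minus the SLG the trendline would predict from his OBP. The sketch below uses the equation SLG = 1.1493 x OBP + 0.051 and the 2009 lines for three hitters; the names `slg_residual` and `side_of_trendline` are hypothetical, chosen only for this illustration.

```python
# Trendline for qualified hitters in 2009: SLG = 1.1493 * OBP + 0.051
SLOPE = 1.1493
INTERCEPT = 0.051


def slg_residual(obp, slg):
    """Actual SLG minus the SLG predicted by the trendline at this OBP."""
    return slg - (SLOPE * obp + INTERCEPT)


def side_of_trendline(obp, slg):
    """'above' means more value from SLG; 'below' means more from OBP."""
    gap = slg_residual(obp, slg)
    if gap > 0:
        return "above"
    if gap < 0:
        return "below"
    return "on"


# 2009 (OBP, SLG) pairs.
hitters = {
    "Albert Pujols": (0.443, 0.658),
    "Joe Mauer": (0.444, 0.587),
    "Yuniesky Betancourt": (0.274, 0.351),
}

for name, (obp, slg) in hitters.items():
    gap = slg_residual(obp, slg)
    print(f"{name}: {side_of_trendline(obp, slg)} the line by {abs(gap):.3f}")
```

The size of the gap matters as much as its sign: a hitter a few points below the line is close to the typical OBP/SLG mix, while a large gap marks a lopsided profile.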

Nick Johnson, Chone Figgins, Luis Castillo, and Russell Martin derived most of their offensive value last year from getting on base. Jose Lopez and Bengie Molina hit for some power but made far too many outs. Todd Helton and Derek Jeter were two of the more productive hitters, combining on base with slugging but generating more value from the former than the latter.

Although Mauer and Pujols led their respective leagues in OBP, both players slugged at an even higher rate relative to the league average. Given that Mauer and Pujols are standout defensive players as well, it's not difficult to understand why they were named the Most Valuable Players in 2009.

Comments

Fantastic work. I love this chart.

What purpose does the trendline serve? This dataset may show a correlation, but formulating SLG as a function of OBP seems somewhat faulty.

Great chart. Mind boggling to see what pujols did this year. Surprise of the year to me was how poor of a year both russel martin

Would be interesting to stratify the data set.

even a simple \ \ \ overlay categorized simply by levels along the 20-80 scale would be a neat-looking device.

Thanks William and Jim.

@RockiesMagicNumber: In addition to providing a visual as to whether a hitter is getting more of his offensive value from on-base or slugging (which was the stated purpose of the trendline), I actually think it helps support the idea that every basis point of OBP is more valuable than SLG, if for no other reason than the league-wide slugging average is about 25% higher than on-base.

I recognize that part of the correlation is, of course, due to the fact that the two variables are not totally independent of each other. Do note that I didn't give an R-squared as it wasn't my intention to make an issue of the correlation itself. Instead, it was designed as an instrument to easily see if a hitter was deriving more of his offensive value from OBP or SLG.

Could you please make available the spreadsheet in an Excel format? I believe in Numbers this is done using File > Save As, then using the Save Copy As option to select Excel Document.

Thanks, as always, for the great work like this piece.

Great stuff, but just one thing: as far as nicknames go, "Prince Albert" is probably one you want to toss back in the waste bin. Wrong connotations.

how about a HOF version which you can send to the dude at SI Heynman(sp?) to show how Mac and Martinez perhaps are a bit better than he thinks. that would be an interesting cloud..

@NaOH: I thought I had saved the spreadsheet in an Excel format the first time. Oh well. I just uploaded an Excel version. Please let me know if it transferred properly. Thanks.

@randyr: This graph would be an excellent visual for Hall of Fame candidates as well. The OBP and SLG are not adjusted for league or ballpark but the unadjusted numbers would still tell a worthwhile story.

This makes me even more delighted to know the Mets will likely soon sign Bengie Molina.

Ah c'mon.

We got to hear about the royals enough here in KC.

Can't I get a break from em here?

I go to this site to hear about good players.

I love this stuff. I'd be interested in seeing these graphs for some of Babe Ruth's seasons or Barry Bonds' MVP years to show how much ahead of everybody else that they were. Maybe a career chart, of present and / or hopeful HOFers would be interesting. Thanks for your work.

great chart. who says the AL is better than the NL? The NL has a lot of exciting players, who can match up with the AL. I would prefer to watch the NL stars over the AL

@Hotcorner

Is this chart your basis for that statement?

Thanks, Rich. It's an Excel file now and I got it.

Excellent graph and writing, Rich.

This is an awesome graph, and I love this graph, but "Wherefore art thou Bonifacio" means "Why are you Bonifacio", which, unless Bonifacio's name means something, doesn't make any sense.