Baseball Beat
January 27, 2009
BABIP: Slicing and Dicing Groundball Out Rates
By Rich Lederer

The Hardball Times has been at the forefront of publishing batted ball information on its website and in its Baseball Annuals for the past five years. Led by Dave Studeman, THT has written several articles on this subject, including two recent studies on BABIP, one by co-authors Chris Dutton and Peter Bendix and another by Derek Carty.

BABIP, of course, stands for Batting Average on Balls in Play. Some analysts prefer BA/BIP, others BABiP. No matter how the acronym is presented, Batting Average on Balls in Play measures exactly what it says: the batting average on all batted balls other than home runs. It is calculated as (H-HR)/(AB-K-HR) or, with sacrifice flies counted as balls in play, (H-HR)/(AB-K-HR+SF).
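For readers who want to compute it themselves, here is a minimal sketch of the formula in Python. The function name `babip` and the sample stat line are made up for illustration; passing `sf=0` gives the version of the formula without sacrifice flies.

```python
def babip(h, hr, ab, k, sf=0):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = ab - k - hr + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (h - hr) / balls_in_play


# Hypothetical season line, invented purely for illustration.
with_sf = babip(h=150, hr=20, ab=500, k=100, sf=5)
without_sf = babip(h=150, hr=20, ab=500, k=100)
print(f"with SF: {with_sf:.3f}, without SF: {without_sf:.3f}")
```

Adding sacrifice flies to the denominator lowers the result slightly, because a sacrifice fly is a ball in play that goes for an out but is not counted as an at-bat.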

Batting Average on Balls in Play is basically the opposite of Defensive Efficiency Ratio (DER) or, perhaps more precisely, 1-DER. BABIP is used for batters whereas DER is used for team defense. Depending on one's perspective, either BABIP or DER can be employed when it comes to pitchers.

In a study on Defense Independent Pitching Stats (DIPS) eight years ago (has it really been that long?), Voros McCracken determined that Batting Average on Balls in Play was primarily a function of a pitcher's defense, ballpark and luck, rather than an actual skill. Here is McCracken's original conclusion in his own words: "There is little if any difference among major-league pitchers in their ability to prevent hits on balls hit in the field of play."

Over the ensuing years, several researchers and analysts have modified and improved the thinking behind DIPS as more information — particularly batted ball data — has become available. But the basic fact remains: Pitchers have less control over BABIP than hitters.

According to Carty, "Most pitchers regress toward the league average BABIP of around .300 or .305. Very few pitchers can repeatedly do better or worse than this, so we say that pitchers have very little control over BABIP. Hitters, on the other hand, can have a substantial amount of control over BABIP. Ichiro Suzuki, for example, has a .356 career BABIP. Hitters do not regress toward league average, rather, they each regress toward their own, unique number."
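One simple way to picture the distinction Carty draws is a weighted blend of the observed BABIP and the number it is expected to regress toward. The sketch below is my own illustration, not one of the estimators from his study. The `weight` parameter and the observed figures are hypothetical; only the .300 league average and Ichiro's .356 career mark come from the quote above.

```python
def regress(observed, target, weight):
    """Blend an observed BABIP with the value it is expected to regress toward.

    weight is the share of trust placed in the observed figure (0 to 1).
    It is a hypothetical knob here, not a value taken from Carty's study.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return weight * observed + (1.0 - weight) * target


LEAGUE_BABIP = 0.300        # league average cited in the quote
ICHIRO_CAREER_BABIP = 0.356  # career figure cited in the quote

# Pitchers are pulled toward the league average ...
pitcher_estimate = regress(observed=0.270, target=LEAGUE_BABIP, weight=0.2)
# ... while a hitter is pulled toward his own established level.
hitter_estimate = regress(observed=0.330, target=ICHIRO_CAREER_BABIP, weight=0.2)

print(f"pitcher: {pitcher_estimate:.3f}, hitter: {hitter_estimate:.3f}")
```

The arithmetic is the same for both; what changes is the target, which is the whole point of the pitcher/hitter distinction.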

Carty then asks, "What is that number?" He proceeds to evaluate a number of BABIP estimators to find out which ones do the best "job of predicting the following year's BABIP." You can read about his process and results here.

I'm a pattern-recognition type and noticed a few common threads when thumbing through the batted ball stats in The Hardball Times Baseball Annual during the offseason. While some of my observations are included in one way or another in THT studies, I believe we can achieve even more accuracy with a few more tweaks here and there.

OK, for some background information . . .

According to THT, the MLB average groundball out rate was 74 percent in 2007 and 2008. By comparison, the MLB average flyball out rate was 83 percent in 2007 and 84 percent in 2008. Another way of looking at those percentages is to say that batters hit about .260 on groundballs and .160-.170 on outfield flyballs (excluding home runs).

The line drive out rate was 29 percent in 2008, meaning batters hit roughly .710 on these batted balls. The hit rate on infield flies is nearly non-existent as pop-ups are converted into outs 99 percent of the time.
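The conversion from out rate to batting average is simply one minus the out rate. A quick sketch using the 2008 figures quoted above (the dictionary and function names are mine, and the calculation assumes every batted ball that isn't an out is counted as a hit):

```python
# 2008 out rates by batted ball type, as quoted from THT in the text.
OUT_RATES_2008 = {
    "groundball": 0.74,
    "outfield flyball": 0.84,
    "line drive": 0.29,
    "infield fly": 0.99,
}


def average_from_out_rate(out_rate):
    """Approximate batting average on a batted ball type as 1 - out rate."""
    return 1.0 - out_rate


for kind, out_rate in OUT_RATES_2008.items():
    print(f"{kind:<17} {average_from_out_rate(out_rate):.2f}")
```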

When it comes to batting average, line drives are king, followed by groundballs, outfield flyballs, and infield flies. Put it all together and National and American League teams hit .298 and .302, respectively, on balls in play in 2008. NL and AL clubs had BABIP of .301 and .305 in 2007.

However, when it comes to production, flyballs are more valuable than groundballs. To wit, including home runs, line drives produced .40 runs in 2007 and .39 in 2008, while the average outfield flyball yielded .18 runs in 2007 and 2008. Meanwhile, the average groundball generated .05 runs per event in 2007 and .04 in 2008.

From the perspective of pitchers, all else being equal, groundball types tend to give up more hits but fewer runs than flyball types. Groundball pitchers generally allow more unearned runs, as I observed in February 2006, due to the greater frequency of errors on balls hit on the ground than in the air.
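To put the hits-versus-runs tradeoff in concrete terms, the sketch below scales the 2008 figures cited above to 100 batted balls of each type. The names are mine. Note one mismatch carried over from the source numbers: the hit rates exclude home runs while the run values include them, so this is a rough comparison rather than a precise one.

```python
# 2008 figures from the text: batting average on each batted ball type
# (excluding home runs) and average runs per event (including home runs).
PROFILE_2008 = {
    "groundball": {"hit_rate": 0.26, "runs_per_ball": 0.04},
    "outfield flyball": {"hit_rate": 0.16, "runs_per_ball": 0.18},
}


def per_100(kind):
    """Return (hits, runs) expected from 100 batted balls of the given type."""
    profile = PROFILE_2008[kind]
    hits = round(100 * profile["hit_rate"], 1)
    runs = round(100 * profile["runs_per_ball"], 1)
    return hits, runs


for kind in PROFILE_2008:
    hits, runs = per_100(kind)
    print(f"100 {kind}s: about {hits:.0f} hits, {runs:.0f} runs")
```

On these averages, 100 groundballs yield more hits than 100 outfield flyballs but only a fraction of the runs.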

Nonetheless, I wanted to focus on the average groundball out rate as a variable impacting BABIP. I compiled a list of outliers (high and low) for the 2007 and 2008 seasons. The minimum number of plate appearances required for inclusion was 300. THT listed players by team and did not provide combined results for players who performed for two clubs. For this exercise, I simply took a weighted average of the groundball out rate based on plate appearances as opposed to actual batted balls. The differences between the two should be minor.
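The weighting works roughly as sketched below. The team-level rates are Jim Edmonds's 2008 split as shown in the 2008 list further down (84 with the Cubs, 89 with the Padres); the plate appearance counts are round placeholder numbers chosen for illustration, not his actual totals, and the function name is my own.

```python
def weighted_out_rate(stints):
    """Plate-appearance-weighted average of per-team groundball out rates.

    stints is a list of (out_rate_percent, plate_appearances) pairs.
    """
    total_pa = sum(pa for _, pa in stints)
    if total_pa == 0:
        raise ValueError("no plate appearances")
    return sum(rate * pa for rate, pa in stints) / total_pa


# Rates from the 2008 list; PA counts are hypothetical placeholders.
edmonds_stints = [
    (84, 300),  # CHC
    (89, 100),  # SD
]

combined = weighted_out_rate(edmonds_stints)
print(f"combined groundball out rate: {combined:.0f}")
```

Weighting by groundballs hit rather than plate appearances would need only a different second element in each pair.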

2007 Highest Groundball Out Rates

Jack Cust               86
Adam Lind               85
Bobby Crosby            84
Jason Giambi            84
Paul Lo Duca            84
Dave Ross               84
Kevin Millar            83
Brian Schneider         83
Rich Aurilia            82
Adam Dunn               82
Prince Fielder          82
Josh Fields             82
Kenji Johjima           82
Dioner Navarro          82
Gregg Zaun              82
Jermaine Dye            81
Ryan Howard             81
Tadahito Iguchi         81
Luke Scott              81
Richie Sexson           81
Marcus Giles            80
Alex Gonzalez           80
Khalil Greene           80
Geoff Jenkins           80
Paul Konerko            80
Yorvit Torrealba        80

Most of these hitters are bigger, slower types with older-player skill sets. There isn't a speedster on the list. Ten of the 26 players hit lefthanded and one (Dioner Navarro) bats both. More than 25 percent are catchers. Only five play middle infield or center field.

Marcus Giles only hit .275 on balls in play in 2007 after producing BABIP of .337-.365 from 2003-2005. Was his high out rate (low success rate) on groundballs in 2007 the reason he hit so poorly on balls in play, or was the reason he hit so poorly on balls in play that he was no longer hitting the ball as hard as he once did? Note that Giles didn't play in the majors in 2008.

2008 Highest Groundball Out Rates

Jim Edmonds             85 (84 CHC/89 SD)
Corey Patterson         85
Jim Thome               85
Brandon Boggs           83
Jose Castillo           83
Carlos Delgado          83
Jack Hannahan           83
Eric Hinske             83
Craig Counsell          82
Todd Helton             82
Ryan Howard             82
Brian Schneider         82
Nick Swisher            82
Lyle Overbay            81
Alfonso Soriano         81
Omar Vizquel            81
Adrian Beltre           80
Ken Griffey Jr.         80 (81 CWS/80 CIN)
Mike Jacobs             80
Kenji Johjima           80
Carlos Ruiz             80
Jose Vidro              80

Once again, there are a number of bigger, slower, and/or older types. The list is composed almost exclusively of catchers and corner position players. Thirteen of the 22 hitters bat lefthanded and four are switch-hitters. Ryan Howard, Kenji Johjima, and Brian Schneider showed up on both lists of high groundball out rates.

Alfonso Soriano and Corey Patterson are the only two players with plus speed. Given the fact that he bats righthanded and runs well, Soriano was the biggest surprise to me.

Interestingly, Travis Hafner made an out on 87 percent of his groundballs in 2008 but only had 234 plate appearances.

2007 Lowest Groundball Out Rates

Matt Kemp               53
Ryan Ludwick            62
Corey Hart              63
Matt Diaz               63
Ichiro Suzuki           63
B.J. Upton              63
Ryan Braun              64
Eric Byrnes             65
Akinori Iwamura         65
Mike Lamb               65
Moises Alou             66
Chris Burke             66
Jose Guillen            66
Mike Lowell             66
Hunter Pence            66
Jayson Werth            66
Orlando Cabrera         67
Cliff Floyd             67
Matt Holliday           67
Raul Ibanez             67
Derek Jeter             67
Nook Logan              67
Placido Polanco         67
Jorge Posada            67
Hanley Ramirez          67
Mark Reynolds           67
Rickie Weeks            67

Of the 27 qualifiers, 20 are RHB, only six are LHB, and one is a switch-hitter. There are also more middle infielders and center fielders on this list than on the lists of high groundball out rates.

Matt Kemp's extraordinarily low rate was based on 311 plate appearances. In this case, you can't chalk it up to small sample size because he posted another low rate the following year. At 65 percent it was much higher than in 2007, but still low enough to tie for third among all qualifiers.

2008 Lowest Groundball Out Rates

Rickie Weeks            61
Dan Uggla               64
Jason Bay               65 (63 BOS/66 PIT)
Milton Bradley          65
Gabe Gross              65
Matt Holliday           65
Matt Kemp               65
Mike Aviles             66
Scott Hairston          66
Adam Jones              66
Manny Ramirez           66 (59 LAD/69 BOS)
Justin Upton            66
Shane Victorino         66
Jason Bartlett          67
Ryan Braun              67
Ben Francisco           67
Carlos Gomez            67
Franklin Gutierrez      67
Cristian Guzman         67
Akinori Iwamura         67
Reed Johnson            67
Evan Longoria           67
Jose Lopez              67
Hunter Pence            67
Brian Roberts           67

Nineteen of the 25 players hit righthanded, while just two bat from the left side and four are switch-hitters. Once again, there are more middle infielders and center fielders on this list than on the list of high groundball out rates.
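The handedness tallies quoted for the four lists are easier to compare as shares. A quick sketch follows, with names of my own choosing. The counts are the ones reported in the text, except that the righthanded totals for the two "highest" lists are not stated directly and are derived here as the list size minus the lefthanded and switch-hitting counts.

```python
# (lefthanded, switch, righthanded) counts for each list.
# Righthanded counts for the two "highest" lists are derived, not quoted:
# 26 - 10 - 1 = 15 and 22 - 13 - 4 = 5.
HANDEDNESS = {
    "2007 highest": (10, 1, 15),
    "2008 highest": (13, 4, 5),
    "2007 lowest": (6, 1, 20),
    "2008 lowest": (2, 4, 19),
}


def right_handed_share(counts):
    """Fraction of a list's hitters who bat righthanded only."""
    left, switch, right = counts
    return right / (left + switch + right)


for label, counts in HANDEDNESS.items():
    print(f"{label:<13} RHB share: {right_handed_share(counts):.0%}")
```

Righthanded hitters make up roughly three-quarters of both low-out-rate lists, against a bit more than half of the 2007 high list and under a quarter of the 2008 high list.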

In addition to Kemp, Ryan Braun, Matt Holliday, Akinori Iwamura, Hunter Pence, and Rickie Weeks had extraordinarily low groundball out rates in 2007 and 2008.
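Finding the repeaters is just a set intersection of the two lists. The sketch below uses the 2007 and 2008 lowest-rate lists as printed above and also reports how much each repeater's rate moved; the variable names are my own.

```python
# Lowest groundball out rates (percent), transcribed from the two lists above.
LOW_2007 = {
    "Matt Kemp": 53, "Ryan Ludwick": 62, "Corey Hart": 63, "Matt Diaz": 63,
    "Ichiro Suzuki": 63, "B.J. Upton": 63, "Ryan Braun": 64, "Eric Byrnes": 65,
    "Akinori Iwamura": 65, "Mike Lamb": 65, "Moises Alou": 66, "Chris Burke": 66,
    "Jose Guillen": 66, "Mike Lowell": 66, "Hunter Pence": 66, "Jayson Werth": 66,
    "Orlando Cabrera": 67, "Cliff Floyd": 67, "Matt Holliday": 67,
    "Raul Ibanez": 67, "Derek Jeter": 67, "Nook Logan": 67, "Placido Polanco": 67,
    "Jorge Posada": 67, "Hanley Ramirez": 67, "Mark Reynolds": 67,
    "Rickie Weeks": 67,
}
LOW_2008 = {
    "Rickie Weeks": 61, "Dan Uggla": 64, "Jason Bay": 65, "Milton Bradley": 65,
    "Gabe Gross": 65, "Matt Holliday": 65, "Matt Kemp": 65, "Mike Aviles": 66,
    "Scott Hairston": 66, "Adam Jones": 66, "Manny Ramirez": 66,
    "Justin Upton": 66, "Shane Victorino": 66, "Jason Bartlett": 67,
    "Ryan Braun": 67, "Ben Francisco": 67, "Carlos Gomez": 67,
    "Franklin Gutierrez": 67, "Cristian Guzman": 67, "Akinori Iwamura": 67,
    "Reed Johnson": 67, "Evan Longoria": 67, "Jose Lopez": 67,
    "Hunter Pence": 67, "Brian Roberts": 67,
}

# Players who appear on both lists.
repeaters = sorted(LOW_2007.keys() & LOW_2008.keys())

# Change in out rate from 2007 to 2008, in percentage points.
changes = {name: LOW_2008[name] - LOW_2007[name] for name in repeaters}

for name in repeaters:
    print(f"{name:<16} {LOW_2007[name]} -> {LOW_2008[name]} ({changes[name]:+d})")
```

The same approach applied to the two highest-rate lists would pick out the repeaters there as well.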

What variables account for these outliers? Speed is obviously a major factor, not only because fast runners beat out more infield singles but also because these burners force more fielding and throwing errors as infielders have to act more quickly. Whether a hitter bats left or right appears to have a large influence as well, although the actual results are somewhat counterintuitive, as one might think that LHB would have a higher success rate than RHB.

Lefthanded batters who pull the ball to first and second basemen (and even to the left of shortstops) are hurt by the shorter (or lack of) throws in completing the out. Some of these hitters are more likely to be victimized by defensive shifts than righthanded pull hitters. Of note, LHB who slap the ball to the left side of the infield — such as Ichiro and Iwamura — appear to have higher success/lower out rates than pull hitters. An examination (and perhaps incorporation) of spray charts would be helpful here.

In addition to speed, I believe hustle or effort may play a minor role. While difficult to measure, all else being equal, I suspect players who bust their tails down the line will convert grounders into hits or errors at a higher rate than those who rarely turn it up when running to first.

Two more factors for consideration are the velocity and trajectory of groundballs. Harder-hit balls are more likely to get through the infield and become hits, while high hoppers have a better chance of succeeding than routine two- or three-bounce grounders, especially among those players who run well.

The presence and speed of baserunners, as well as the number of outs and the score, can have an effect on groundball out rates. The most likely impact is when there is a runner being held on first base, opening up the right side of the infield. Additional contextual items to consider, among others, include double play situations where middle infielders pinch toward second base and the positioning of infielders in late and close games.

There is a lot of food for thought here, all designed to improve the retrospective and predictive powers of the BABIP models.

* * *

Courtesy of The Hardball Times, here is some additional information as it relates to batted ball data.

% of Plate Appearances

         2008     2007
K%        18       17
BB%       10        9

% of Batted Balls

         2008     2007
GB%       44       43
LD%       20       19
FB%       36       38

Many thanks to Dave Studeman and The Hardball Times for the stats in this article.

Comments

I would like to see, if possible, just how negatively the shift affects players like Ryan Howard, especially average-wise.

Not so many artificial turf fields these days but might be interesting to see if there is a park adjustment to groundball out rates.

I'm fascinated by all of this. I wonder if anyone calculates SLGBIP, which seems like it might complement DER by incorporating doubles and triples allowed into the mix.

So Manny wasn't dogging it after all. Now get the Boston media to report the statistical truth.

Excellent article, Rich. This continues to be one of my favorite sites.

FWIW, Fielder usually got the severe shift and surprisingly Counsell gets a shift as well. Also, Soriano had leg issues that affected him over some of the season.

Note that what is represented in the THT article as the BaseballHQ BABIP formula uses speed (as best as they can estimate it with a metric called SX) as an input, so pointing out that speed will affect BABIP on groundballs isn't a completely new insight.

Also, I believe that BaseballHQ also uses rolling three year averages for BABIP, similar to the Marcel calculation, so their projection system probably uses both inputs. I say "probably" because like most projection systems other than Marcel, the details are proprietary.

This is an interesting study. I actually did an article precisely on this topic, breaking down BABIP by batted ball types. Here's the link:

http://www.thegoodphight.com/2009/1/16/726379/babip-projection-and-new-s

In general, I found that outfield hits on groundballs tend to have a year to year correlation of about .10-.15 and infield hits per groundball tend to correlate by about .45 or so. The main correlates that I found using a regression were

1) historical groundball BABIP
2) historical infield hits per groundball
3) contact rate (as measured by fangraphs)
4) age, which had by far its largest effect on groundball BABIP as compared with line drives and flyballs.

The regression had an R-squared of about .15.

Revhalofan, rather than conclude that Manny was not dogging it, there should be another explanation like how hard he hit his GB, or positioning of the defense. He was dogging it.