BABIP: Slicing and Dicing Groundball Out Rates
The Hardball Times has been at the forefront of publishing batted ball information on its website and in its Baseball Annuals for the past five years. Led by Dave Studeman, THT has written several articles on this subject, including two recent studies on BABIP, one by co-authors Chris Dutton and Peter Bendix and one by Derek Carty.

BABIP, of course, stands for Batting Average on Balls in Play. Some analysts prefer BA/BIP, others BABiP. No matter how the acronym is presented, Batting Average on Balls in Play measures exactly what it says: the batting average on all batted balls other than home runs. The formula is calculated as (H-HR)/(AB-K-HR) or, with sacrifice flies included in the denominator, (H-HR)/(AB-K-HR+SF). Batting Average on Balls in Play is basically the complement of Defensive Efficiency Ratio (DER) or, perhaps more precisely, 1-DER. BABIP is used for batters whereas DER is used for team defense. Depending on one's perspective, either BABIP or DER can be employed when it comes to pitchers.

In a study on Defense Independent Pitching Stats (DIPS) eight years ago (has it really been that long?), Voros McCracken determined that Batting Average on Balls in Play was primarily a function of a pitcher's defense, ballpark and luck, rather than an actual skill. Here is McCracken's original conclusion in his own words: "There is little if any difference among major-league pitchers in their ability to prevent hits on balls hit in the field of play."

Over the ensuing years, several researchers and analysts have modified and improved the thinking behind DIPS as more information, particularly batted ball data, has become available. But the basic fact remains: Pitchers have less control over BABIP than hitters. According to Carty, "Most pitchers regress toward the league average BABIP of around .300 or .305. Very few pitchers can repeatedly do better or worse than this, so we say that pitchers have very little control over BABIP. Hitters, on the other hand, can have a substantial amount of control over BABIP.
Ichiro Suzuki, for example, has a .356 career BABIP. Hitters do not regress toward league average, rather, they each regress toward their own, unique number."

Carty then asks, "What is that number?" He proceeds to evaluate a number of BABIP estimators to find out which ones do the best "job of predicting the following year's BABIP." His process and results are laid out in his THT study.

I'm a pattern-recognition type and noticed a few common threads when thumbing through the batted ball stats in The Hardball Times Baseball Annual during the offseason. While some of my observations are included in one way or another in THT studies, I believe we can achieve even more accuracy with a few more tweaks here and there.

OK, for some background information . . . According to THT, the MLB average groundball out rate was 74 percent in 2007 and 2008. By comparison, the MLB average flyball out rate was 83 percent in 2007 and 84 percent in 2008. Another way of looking at those percentages is to say that batters hit about .260 on groundballs and .160-.170 on outfield flyballs (excluding home runs). The line drive out rate was 29 percent in 2008, meaning batters hit roughly .710 on these batted balls. Hits on infield flies are nearly non-existent, as pop-ups are converted into outs 99 percent of the time. When it comes to batting average, line drives are king, followed by groundballs, outfield flyballs, and infield flies. Put it all together and National and American League teams hit .298 and .302, respectively, on balls in play in 2008. NL and AL clubs had BABIP of .301 and .305 in 2007.

However, when it comes to production, flyballs are more valuable than groundballs. To wit, including home runs, the average line drive produced .40 runs in 2007 and .39 in 2008, while the average outfield flyball yielded .18 runs in 2007 and 2008. Meanwhile, the average groundball generated .05 runs per event in 2007 and .04 in 2008.
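As a rough sketch, the BABIP formula from the opening and the out-rate arithmetic above can be written in a few lines of Python. The function names and the sample stat line are illustrative assumptions, not anything taken from THT; only the out rates (74, 83, 84 and 29 percent) come from the figures quoted above.

```python
def babip(h, hr, ab, k, sf=0):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF).

    Leaving sf at 0 gives the simpler (H - HR) / (AB - K - HR) version.
    """
    balls_in_play = ab - k - hr + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (h - hr) / balls_in_play


def hit_rate(out_rate_pct):
    """Batting average implied by an out rate given in whole percent."""
    return (100 - out_rate_pct) / 100


# Hypothetical stat line, for illustration only (not a real player).
print(f"BABIP: {babip(h=180, hr=20, ab=600, k=100, sf=5):.3f}")

# League out rates quoted in the article.
for label, out_pct in [("GB 2007-08", 74), ("FB 2007", 83),
                       ("FB 2008", 84), ("LD 2008", 29)]:
    print(f"{label}: batters hit about {hit_rate(out_pct):.3f}")
```

Note that where the sacrifice-fly term sits matters: SF belongs inside the denominator, since a sacrifice fly is a ball in play that does not count as an at-bat.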
From the perspective of pitchers, all else being equal, groundball types tend to give up more hits but fewer runs than flyball types. Groundball pitchers generally allow more unearned runs, as I observed in February 2006, due to the greater frequency of errors on balls hit on the ground than in the air.

Nonetheless, I wanted to focus on the average groundball out rate as a variable impacting BABIP. I compiled a list of outliers (high and low) for the 2007 and 2008 seasons. The minimum number of plate appearances required for inclusion was 300. THT listed players by team and did not provide combined results for players who performed for two clubs. For this exercise, I simply took a weighted average of the groundball out rate based on plate appearances as opposed to actual batted balls. The differences between the two should be minor.

2007 Highest Groundball Out Rates
Jack Cust 86
Adam Lind 85
Bobby Crosby 84
Jason Giambi 84
Paul Lo Duca 84
Dave Ross 84
Kevin Millar 83
Brian Schneider 83
Rich Aurilia 82
Adam Dunn 82
Prince Fielder 82
Josh Fields 82
Kenji Johjima 82
Dioner Navarro 82
Gregg Zaun 82
Jermaine Dye 81
Ryan Howard 81
Tadahito Iguchi 81
Luke Scott 81
Richie Sexson 81
Marcus Giles 80
Alex Gonzalez 80
Khalil Greene 80
Geoff Jenkins 80
Paul Konerko 80
Yorvit Torrealba 80

Most of these hitters are bigger, slower types with old-player skills. Not a speedster on the list. Ten of the 26 players hit lefthanded and one (Dioner Navarro) bats both. More than 25 percent are catchers. Only five play middle infield or center field.

Marcus Giles only hit .275 on balls in play in 2007 after producing BABIP of .337-.365 from 2003-2005. Was his high out/low success rate on groundballs in 2007 the reason he hit so poorly on balls in play, or did he hit so poorly on balls in play because he wasn't hitting the ball as hard as he once did? Note that Giles didn't play in the majors in 2008.
2008 Highest Groundball Out Rates
Jim Edmonds 85 (84 CHC/89 SD)
Corey Patterson 85
Jim Thome 85
Brandon Boggs 83
Jose Castillo 83
Carlos Delgado 83
Jack Hannahan 83
Eric Hinske 83
Craig Counsell 82
Todd Helton 82
Ryan Howard 82
Brian Schneider 82
Nick Swisher 82
Lyle Overbay 81
Alfonso Soriano 81
Omar Vizquel 81
Adrian Beltre 80
Ken Griffey Jr. 80 (81 CWS/80 CIN)
Mike Jacobs 80
Kenji Johjima 80
Carlos Ruiz 80
Jose Vidro 80

Once again, there are a number of bigger, slower, and/or older types. The list is composed almost exclusively of catchers and corner position players. Thirteen of the 22 hitters bat lefthanded and four are switch-hitters. Ryan Howard, Kenji Johjima, and Brian Schneider showed up on both lists of high groundball out rates. Alfonso Soriano and Corey Patterson are the only two players with plus speed. Given the fact that he bats righthanded and runs well, Soriano was the biggest surprise to me. Interestingly, Travis Hafner made an out on 87 percent of his groundballs in 2008 but only had 234 plate appearances.

2007 Lowest Groundball Out Rates
Matt Kemp 53
Ryan Ludwick 62
Corey Hart 63
Matt Diaz 63
Ichiro Suzuki 63
B.J. Upton 63
Ryan Braun 64
Eric Byrnes 65
Akinori Iwamura 65
Mike Lamb 65
Moises Alou 66
Chris Burke 66
Jose Guillen 66
Mike Lowell 66
Hunter Pence 66
Jayson Werth 66
Orlando Cabrera 67
Cliff Floyd 67
Matt Holliday 67
Raul Ibanez 67
Derek Jeter 67
Nook Logan 67
Placido Polanco 67
Jorge Posada 67
Hanley Ramirez 67
Mark Reynolds 67
Rickie Weeks 67

Of the 27 qualifiers, 20 are RHB, only six are LHB, and one is a switch-hitter. There are also more middle infielders and center fielders on this list than on the lists of high groundball out rates. Matt Kemp's extraordinarily low rate was based on 311 plate appearances. In this case, you can't chalk it up to small sample size because he repeated the feat the following year, albeit at a much higher rate than the previous season but still low enough to tie for third among all qualifiers.
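The plate-appearance weighting used for two-team players, such as Jim Edmonds (84 CHC/89 SD) in the 2008 list above, can be sketched as follows. This is a minimal illustration, not THT's method: the function name is made up, and the plate appearance totals are round placeholder numbers chosen for the example rather than Edmonds' actual splits.

```python
def pa_weighted_rate(stints):
    """Combine per-team out rates into one rate, weighted by plate appearances.

    stints: list of (out_rate_pct, plate_appearances) pairs, one per team.
    """
    total_pa = sum(pa for _, pa in stints)
    if total_pa == 0:
        raise ValueError("no plate appearances")
    return sum(rate * pa for rate, pa in stints) / total_pa


# Out rates (84 and 89) are from the list; the PA figures are hypothetical.
edmonds_2008 = [(84, 300), (89, 100)]
combined = pa_weighted_rate(edmonds_2008)
print(f"Combined groundball out rate: {combined:.2f} percent")
```

Weighting by actual groundballs instead of plate appearances would only change the result if the player's groundball frequency differed noticeably between the two clubs, which is why the two approaches should land close together.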
2008 Lowest Groundball Out Rates
Rickie Weeks 61
Dan Uggla 64
Jason Bay 65 (63 BOS/66 PIT)
Milton Bradley 65
Gabe Gross 65
Matt Holliday 65
Matt Kemp 65
Mike Aviles 66
Scott Hairston 66
Adam Jones 66
Manny Ramirez 66 (59 LAD/69 BOS)
Justin Upton 66
Shane Victorino 66
Jason Bartlett 67
Ryan Braun 67
Ben Francisco 67
Carlos Gomez 67
Franklin Gutierrez 67
Cristian Guzman 67
Akinori Iwamura 67
Reed Johnson 67
Evan Longoria 67
Jose Lopez 67
Hunter Pence 67
Brian Roberts 67

Nineteen of the 25 players hit righthanded, while just two bat from the left side and four are switch-hitters. Once again, there are more middle infielders and center fielders on this list than on its high-rate counterpart. In addition to Kemp, Ryan Braun, Matt Holliday, Akinori Iwamura, Hunter Pence, and Rickie Weeks had extraordinarily low groundball out rates in 2007 and 2008.

What variables account for these outliers? Speed is obviously a major factor, not only because fast runners beat out more infield singles but also because these burners force more fielding and throwing errors as infielders are forced to act more quickly.

Whether a hitter bats left or right appears to have a large influence as well, although the actual results are somewhat counterintuitive, as one might think that LHB would have a higher success rate than RHB. Lefthanded batters who pull the ball to first and second basemen (and even to the left of shortstops) are hurt by the shorter throws (or lack of any throw) needed to complete the out. Some of these hitters are also more likely to be victimized by defensive shifts than righthanded pull hitters. Of note, LHB who slap the ball to the left side of the infield, such as Ichiro and Iwamura, appear to have higher success/lower out rates than pull hitters. An examination (and perhaps incorporation) of spray charts would be helpful here.

In addition to speed, I believe hustle or effort may play a minor role.
While difficult to measure, all else being equal, I suspect players who bust their tails down the line will convert grounders into hits or errors at a higher rate than those who rarely turn it up when running to first.

Two more factors for consideration are the velocity and trajectory of groundballs. Harder hit balls are more likely to get through the infield and become hits, while high hoppers have a better chance of succeeding than routine two- or three-bounce grounders, especially among those players who run well.

The presence and speed of baserunners, as well as the number of outs and the score, can have an effect on groundball out rates. The most likely impact is when there is a runner being held on first base, opening up the right side of the infield. Additional contextual items to consider, among others, include double play situations where middle infielders pinch toward second base and the positioning of infielders in late and close games.

There is a lot of food for thought here, all designed to improve the retrospective and predictive powers of the BABIP models.

Courtesy of The Hardball Times, here is some additional batted ball data.

% of Plate Appearances   2008   2007
K%                       18     17
BB%                      10     9

% of Batted Balls        2008   2007
GB%                      44     43
LD%                      20     19
FB%                      36     38

Many thanks to Dave Studeman and The Hardball Times for the stats in this article.
Comments
I would like to see, if possible, just how negatively the shift affects players like Ryan Howard, especially average-wise.
Posted by: Steve at January 27, 2009 9:17 AM
Not so many artificial turf fields these days but might be interesting to see if there is a park adjustment to groundball out rates.
Posted by: Gilbert at January 27, 2009 9:37 AM
I'm fascinated by all of this. I wonder if anyone calculates SLGBIP, which seems like it might complement DER by incorporating doubles and triples allowed into the mix.
Posted by: rfs1962 at January 27, 2009 10:12 AM
So Manny wasn't dogging it after all. Now get the Boston media to report the statistical truth.
Posted by: Rev Halofan at January 27, 2009 6:07 PM
Excellent article, Rich. This continues to be one of my favorite sites.
Posted by: Kevin at January 28, 2009 5:44 AM
FWIW, Fielder usually got the severe shift and surprisingly Counsell gets a shift as well. Also, Soriano had leg issues that affected him over some of the season.
Posted by: Hal at January 28, 2009 8:46 AM
Note that what is represented in the THT article as the BaseballHQ BABIP formula uses speed (as best as they can estimate it with a metric called SX) as an input, so pointing out that speed will affect BABIP on groundballs isn't a completely new insight.
Also, I believe that BaseballHQ also uses rolling three year averages for BABIP, similar to the Marcel calculation, so their projection system probably uses both inputs. I say "probably" because like most projection systems other than Marcel, the details are proprietary.
Posted by: Detroit Michael at January 28, 2009 1:18 PM
This is an interesting study. I actually did an article precisely on this topic, breaking down BABIP by batted ball types. Here's the link:
http://www.thegoodphight.com/2009/1/16/726379/babip-projection-and-new-s
In general, I found that outfield hits on groundballs tend to have a year to year correlation of about .10-.15 and infield hits per groundball tend to correlate by about .45 or so. The main correlates that I found using a regression were
1) historical groundball BABIP
2) historical infield hits per groundball
3) contact rate (as measured by fangraphs)
4) age, which had by far its largest effect on groundball BABIP as compared with line drives and flyballs.
The regression had an R-squared of about .15.
Posted by: MattS at January 28, 2009 10:44 PM
Revhalofan, rather than conclude that Manny was not dogging it, there should be another explanation like how hard he hit his GB, or positioning of the defense. He was dogging it.
Posted by: Bill at January 29, 2009 9:20 AM