Designated Hitter February 12, 2009
BABIP: Progressing and Regressing Groundball Out Rates

A couple of weeks ago, Rich Lederer asked what variables account for extraordinarily low groundball out rates. So, using a similar method to that which Peter Bendix and Chris Dutton used to find expected BABIP, we dug deeper and ran a regression to find expected average on groundballs.

Intuitively, one would think that faster players with the ability to find holes in the infield have the best success rates on groundballs. As Lederer pointed out, defensive alignments and batter handedness are also variables that will affect groundball average. While infield shifts are difficult to quantify, we still attempted some statistical approaches to analyze their effects. And to account for handedness, we limited our sample to only left-handers or switch-hitters batting lefty. Our sample included 206 players with at least 200 total ground balls since 2002. We then ran a linear regression to find the factors that influence a batter's groundball average.

Five variables were significant at a one percent level in our regression—a ratio of pulled groundballs to opposite field groundballs, the percentage of grounders hit to center field, a speed score developed by Bill James, bunt hits per plate appearance, and homers per ball in air. The R-squared is .4648. Here is the regression output, if you're into that sort of stuff.
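For readers who want to try a regression of this kind themselves, here is a minimal sketch of ordinary least squares in plain Python. It is not our actual model: it uses only two of the five predictors, the six data rows are invented for illustration, and the names `fit_ols`, `predict`, and `r_squared` are our own hypothetical helpers.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augment each row with its right-hand side so row operations hit both.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitute from the last unknown up to the first.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [intercept, b1, b2, ...] minimizing the squared error."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(beta, row):
    """Expected groundball average for one hitter's predictor values."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))


def r_squared(beta, rows, y):
    """Share of the variance in y explained by the fitted model."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - predict(beta, r)) ** 2 for r, yi in zip(rows, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Invented rows: (speed score, pull/opp groundball ratio) -> groundball average.
ROWS = [(5.0, 3.0), (7.0, 2.0), (3.0, 8.0), (4.0, 6.0), (6.0, 10.0), (2.0, 4.0)]
GB_AVG = [0.245, 0.270, 0.195, 0.215, 0.210, 0.205]

beta = fit_ols(ROWS, GB_AVG)
print("coefficients:", [round(b, 4) for b in beta])
print("R-squared:", round(r_squared(beta, ROWS, GB_AVG), 4))
```

Solving the normal equations directly is fine for a handful of predictors and a couple hundred hitters; a statistics package would also report the significance levels quoted above, which this sketch does not compute.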

The location of groundballs, along with the batter’s speed, seems to have the most influence on groundball hit rate, confirming our suspicions. Hitting the ball the other way forces a longer throw, and busting it down the line on grounders is probably the most advantageous way a player can utilize his speed. Velocity of groundballs was difficult to account for. Line drive percentage and grounded-into-double-play percentage, which are likely tied to how hard a groundball is hit, proved insignificant.

Many of you might know that the split in batted ball hit average is about .715 on liners, .235 on grounders, and .140 on fly balls. Now, we can break that down further with this data. Lefties hit for a lower average on grounders than righties by about 10-15 points. Opposite field grounders and grounders up the middle from lefties go for hits about 30% of the time, while pulled grounders go for hits only 15-20% of the time.

Interestingly, hitting homeruns has a negative impact on pulled and total groundball average, but is one of the most significant positively correlated variables that go into opposite field average. One guess is that power hitters tend to hit weaker groundballs to the right side when they roll over their wrists. Or perhaps they pull the ball into a shift, which seems to be applied only to power hitters due to a likely managerial bias. But when these homerun hitters do hit opposite field groundballs, however rarely, those grounders are apparently more likely to go for hits than opposite field grounders from slash hitters.

One of the main reasons we calculated our expected average value was to examine the exaggerated infield shift more closely. In our sample, we came up with nearly 20 players who we believed to have been “overshifted,” meaning they faced a defensive alignment in which the shortstop plays on the second-base side of the bag and the second baseman goes to short right field. The shift was originally introduced as a way to get Ted Williams out, and it was brought back into vogue to foil Barry Bonds. By comparing a player’s expected average with his actual average, and using several more basic methods, we were able to draw conclusions about the use of the shift. An actual average that differs significantly from the corresponding expected average indicates that our regression model does not account for something affecting the hitter; when the actual average falls short, that something may be a defensive shift.
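The comparison of actual and expected averages boils down to a subtraction and a sort. A minimal sketch follows, with hypothetical hitters and invented averages; the helper names are ours, and the default 20-point cutoff is only an illustrative threshold.

```python
def points_vs_expected(actual_avg, expected_avg):
    """Actual minus expected groundball average, in points (thousandths)."""
    return round((actual_avg - expected_avg) * 1000)


def shortfall_candidates(players, min_shortfall=20):
    """Return (name, points) pairs at least `min_shortfall` points below
    expectation, largest shortfall first."""
    gaps = [(name, points_vs_expected(actual, expected))
            for name, (actual, expected) in players.items()]
    flagged = [(name, gap) for name, gap in gaps if gap <= -min_shortfall]
    return sorted(flagged, key=lambda item: item[1])


# Hypothetical hitters: name -> (actual GB average, expected GB average).
SAMPLE = {
    "Hitter A": (0.190, 0.219),
    "Hitter B": (0.228, 0.221),
    "Hitter C": (0.201, 0.223),
    "Hitter D": (0.240, 0.205),
}

for name, gap in shortfall_candidates(SAMPLE):
    print(f"{name}: {gap} points vs. expected")
```

A hitter who lands on this list is only a candidate: the shortfall says the model missed something, not what it missed.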

The players whose expected groundball average most exceeded actual groundball average were Barry Bonds, Rafael Palmeiro, Mark Teixeira, Adam Dunn, and Jack Cust. Their averages all fell at least 20 points below their expected averages, and Jack Cust’s came up almost 30 points short. With this information, we looked at their traditional BABIPs with men on base and with nobody on base as a loose measure of when these batters are being shifted and when they’re not. We should note that the average BABIP with men on is slightly higher than with nobody on, and for pull-hitting lefties the difference will be even greater, as the first baseman will often have to hold a runner on, opening up the hole between first and second base. Bonds, Palmeiro, and Cust all gained at least 30 extra points of BABIP with men on; Bonds had a .265 BABIP with nobody on and .338 with men on. Dunn showed little split, and we could not isolate Teixeira’s situational left-handed at-bats from his right-handed at-bats. All of these players pull their groundballs at least six times as often as they hit grounders to the opposite field, and they all have low speed scores, making them prime candidates to be victims of the shift.
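The situational split is easy to reproduce from basic counting stats. The sketch below uses the standard BABIP definition, (H - HR) / (AB - K - HR + SF); the two stat lines are invented and the function names are ours.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


def split_in_points(with_men_on, with_bases_empty):
    """BABIP gained with men on base, in points (thousandths)."""
    return round((with_men_on - with_bases_empty) * 1000)


# Invented stat lines for one hypothetical pull-hitting lefty.
bases_empty = babip(hits=60, home_runs=15, at_bats=250, strikeouts=50)
men_on = babip(hits=55, home_runs=10, at_bats=200, strikeouts=40, sac_flies=5)

print(f"bases empty: {bases_empty:.3f}")
print(f"men on:      {men_on:.3f}")
print(f"split:       {split_in_points(men_on, bases_empty)} points")
```

Plugging in the Bonds figures quoted above, `split_in_points(0.338, 0.265)` works out to 73 points.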

Other players who get shifted and who have averages below their expected averages include: Prince Fielder, Justin Morneau, Mike Jacobs, and Jason Giambi. Giambi’s BABIP has been an astounding 95 points higher with men on than with nobody on.

What was almost as interesting was the list of shifted players whose average exceeds their expected average – potentially meaning the shift is not effective against them. David Ortiz, Carlos Pena, and Travis Hafner all fit into this category. There was no noticeable difference between the skill sets of these players and those of the first group, so some other factors must explain this difference. Perhaps this second group includes hitters who are better at locating their hits against the shift. Ortiz does have a split of 45 points between his BABIP with men on vs. nobody on, so we won’t discount the impact of the shift on him.

Within this group of shifted batters, there were some other noteworthy discoveries. Ryan Howard has an incredibly high pull-to-opposite-field-groundball ratio of 11.875, the largest in our sample, yet his average and expected average were about equal, as both values fell within the .200-.205 range. Given his dramatic pull/opp ratio, we have little doubt that the shift has affected him, so we dug deeper to find the answer. Looking at the hitting charts provided by MLB.com and checking the locations of his groundball outs, we found a cluster of outs in short right field over the last two years, but not before, meaning the decision to shift him might have been recent. Indeed, in 2005-2006, Howard hit .237 on grounders, and then when the shift came into play regularly in 2007-2008, he hit only .175 on grounders. Also notable were Hafner's and Morneau’s comparatively low pull/opp ratios, which were 3.98 and 2.99, respectively. According to this statistic, neither player would be an obvious candidate for the shift – yet both are shifted, and as noted earlier, it would appear that the shift is detrimental to Morneau. However, the 3-4 defense applied to Hafner never made much sense, as he has rather moderate pull-to-opposite-field-groundball and groundball-to-flyball ratios.
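For anyone who wants to compute these location measures from their own batted ball counts, here is a short sketch. The groundball counts are illustrative only: they were chosen to reproduce a ratio of 11.875 and are not Howard's actual totals, which we do not list here. The function names are ours.

```python
def pull_opp_ratio(pulled, opposite):
    """Pulled groundballs per opposite-field groundball."""
    if opposite == 0:
        return float("inf")  # never went the other way in this sample
    return pulled / opposite


def center_share(pulled, center, opposite):
    """Fraction of all groundballs hit up the middle."""
    total = pulled + center + opposite
    return center / total if total else 0.0


def change_in_points(before_avg, after_avg):
    """Change in groundball average between two periods, in points."""
    return round((after_avg - before_avg) * 1000)


# Illustrative counts only (not real totals for any player).
pulled, center, opposite = 95, 60, 8

print("pull/opp ratio:", pull_opp_ratio(pulled, opposite))
print("share up the middle:", round(center_share(pulled, center, opposite), 3))
# The .237 and .175 groundball averages are the two Howard figures quoted above.
print("change:", change_in_points(0.237, 0.175), "points")
```

Because the ratio divides by a small number of opposite-field grounders, a difference of one or two balls can move it a lot, which is worth keeping in mind when comparing hitters near the top of the list.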

Finally, we looked for any left-handed batters with high pull percentages, who would therefore be good candidates for the defensive shift. Nate McLouth had a pull/opp ratio of 10.208, but his speed statistic is quite high, explaining why teams probably choose not to shift him. If you’re fielding balls in short right field, you won’t get a fast player out. Nick Swisher’s pull/opp ratio is 10.92, yet teams do not shift him. Russell Branyan and David Dellucci are also strong candidates for a shift, but none of these players fits the hulking power hitter profile, so managers don’t give much thought to creative ways to get them out.

We ran a logistic regression using a value of one if we had evidence that the player had been shifted and zero if not. It turns out that homerun-per-flyball and groundball-to-flyball ratios have been the most significant factors in determining which players get shifted. Bonds’ expected shift score was one, meaning that he is truly the prototype of the shifted player. Pull percentage and intentional walks per plate appearance were also significant at a five percent level, but we believe that opposite field groundball rate should be taken into account as well. Evacuating that side of the infield against a hitter who hits any significant number of opposite field groundballs is simply giving away hits, no matter how many pulled grounders get taken away. There is a clear managerial bias toward shifting power hitters without taking batted ball location sufficiently into account.
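A logistic regression of this shape can be sketched in a few lines of plain Python. This is an assumption-laden toy, not our fitted model: it uses only two predictors, the eight hitters are invented, the fit is by simple gradient descent rather than a statistics package, and `fit_logistic` and `shift_score` are hypothetical names.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def standardize(rows):
    """Return z-scored rows plus the per-column means and standard deviations."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
           for c, m in zip(cols, means)]
    scaled = [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]
    return scaled, means, sds


def fit_logistic(rows, labels, lr=0.1, epochs=3000):
    """Batch gradient descent on the mean log-loss; returns [intercept, w1, ...]."""
    w = [0.0] * (len(rows[0]) + 1)
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for row, y in zip(rows, labels):
            err = sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], row))) - y
            grad[0] += err
            for j, xj in enumerate(row):
                grad[j + 1] += err * xj
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w


def shift_score(w, scaled_row):
    """Fitted probability (between 0 and 1) that a hitter is shifted."""
    return sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], scaled_row)))


# Invented hitters: (HR per fly ball, groundball-to-flyball ratio), shifted?
FEATURES = [(0.26, 0.70), (0.22, 0.85), (0.24, 0.90), (0.20, 0.80),
            (0.08, 1.60), (0.10, 1.40), (0.06, 1.90), (0.12, 1.30)]
SHIFTED = [1, 1, 1, 1, 0, 0, 0, 0]

scaled, means, sds = standardize(FEATURES)
weights = fit_logistic(scaled, SHIFTED)
scores = [shift_score(weights, r) for r in scaled]
for features, score in zip(FEATURES, scores):
    print(features, round(score, 3))
```

A score near one, like the one our model gave Bonds, means the hitter looks like the typical shifted player on these predictors; it says nothing about whether shifting him actually works.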

Our study is not perfect. We found no good way to quantify the shift, which would allow us to distinguish between players who receive a full shift and those who receive a partial one, or those who are shifted all the time and those for whom only some teams put on the defensive shift. Nevertheless, our study shows some interesting results. By comparing expected and actual groundball averages, we believe that the shift had the most significant impact on Bonds, Palmeiro, and Cust, and that it had surprisingly little impact on batters like Ortiz and Pena. In addition, we suggest that Swisher might be a good candidate to shift, and we suggest that managers make decisions based on evidence rather than player reputation. These are only basic observations, yet they shed some light on the hard-to-quantify defensive shift.

Leanne Brotsky, David Estabrook, Jeremy Greenhouse, Kimberly Miner, and Steven Smith assisted in writing this article. We would also like to thank Evan Chiachiaro, Dan Rathman, and Anthony Doina, who participated in Baseball Analysis at Tufts’ research committee. Any questions can be directed to TuftsBAT@gmail.com.

Interesting article. I'm guessing that the independent variables that you guys used were for the same year as the year the dependent variable was measured for, right?

I asked a similar question using very different methodology in a previous piece that I think is somewhat complementary to your article. I tried to predict ground ball BABIP using historical data from the previous three years. I was able to get an R-squared of .1526 using an average of the previous three years' groundball BABIPs, an average of the previous three years' infield hit rates, and the previous three years' contact rate as measured by fangraphs.com (% of pitches swung at that were at least fouled off).

I think that the negative correlation of homerun rate and groundball BABIP is something I've noticed a few times before. I attribute it to selection bias-- people who are good enough to make the major leagues need to be good at avoiding outs one way or the other. Those who are bad at both hitting homeruns and reaching via groundball are likely to never make the majors, yielding a negative correlation for those who stay.
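The selection effect described here can be illustrated with a small simulation. It is a toy under stated assumptions: two skills drawn independently from standard normal distributions, and a hypothetical rule that a player reaches the majors only if the two skills sum to more than a cutoff. None of the numbers come from real player data.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def simulate_selection(n=20000, cutoff=1.0, seed=0):
    """Return (correlation among all players, correlation among those selected)."""
    rng = random.Random(seed)
    # Each player gets independent (power, groundball-hit) skill values.
    pool = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    # Hypothetical promotion rule: combined skill must clear the cutoff.
    majors = [(p, g) for p, g in pool if p + g > cutoff]
    return pearson(*zip(*pool)), pearson(*zip(*majors))


r_pool, r_majors = simulate_selection()
print("all players:   ", round(r_pool, 3))
print("major leaguers:", round(r_majors, 3))
```

The two skills are uncorrelated in the full pool by construction, but a clearly negative correlation appears once only the players above the cutoff are kept.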

The shift issue is something I addressed once recently too, but quite differently as well. I found that lefty power hitters were able to get hits in higher leverage situations due to the fact that the shift wasn't as strong with runners on base. Hence, their BABIPs were far higher with runners on base than without, making them unintentionally "clutch". Here's that link:

http://www.thegoodphight.com/2009/1/29/741980/there-is-clutch-or-the-cas

All in all, very interesting article. I think the sabermetric community is really starting to get a better understanding of BABIP. Great work, guys.

http://www.thegoodphight.com/2009/1/16/726379/babip-projection-and-new-s

Swisher is a switch hitter. I assume the 10.92 pull/opp ratio mentioned for him is only for his left-handed at-bats?

Matt, all variables were from the same years.

Ken, that's correct. We took all switch-hitters' data only from when they were batting left-handed.

Nice article, a few notes:

Your batting averages on different types of batted balls are wrong, unless you're leaving out home runs.

Including home runs, a simple rule is that batters have higher fly ball batting averages on pulled fly balls and higher ground ball batting averages on ground balls hit the other way. This is true for both righty and lefty batters, but the ground ball difference is more extreme for lefty batters.

My belief (no stats to back this up) is that pulled ground balls are hit harder, but defenses play batters to pull ground balls -- so ground balls to the opposite field are more likely to find holes even if they're not hit as strongly. As a proof that pulled ground balls are hit more strongly, there are more doubles on pulled ground balls for both lefty and righty batters, even with the lower BA on those hits.

Also, hitting line drives has nothing to do with power. A line drive is a trajectory. I would have used HR/F as a power measure.

Cool article.

The "astounding" split for Giambi was something that we Yankees fans watched in despair for several years. He'd rip a grounder or low liner in the hole between 1st & 2nd and be thrown out easily by the SS or 2B playing in shallow RF. He never adjusted to it, so they kept doing it. Smart.

Even when the big lefty can hit the ball the other way, sometimes that robs his power. If you have David Ortiz up with two outs and shift him and he dribbles one through the left side for a single, that is a VERY low run expectancy. Move the bat down 1 inch or so on a hard ground ball to the right side, and it becomes a home run.

So, the next analysis is how the "shift beaters" change their ISO and OPS as a result of "counter punching".

Studes, thanks for the comment. I'm posting this on TheBookBlog comments too.

The batting averages on different types of balls in play were taken from Baseball-Reference. It's my mistake for not indicating that it was BABIP to which I was referring, and not total average, so yes, homeruns were excluded. I'm not even sure why I chose to do that.

Why hitters fare better on opposite field grounders than pulled grounders is something we tried to look at. 12 players in our sample actually had higher pull averages on grounders than opposite field averages, and none of them were homerun hitters, while the players with the biggest split in favor of opposite field average were Thome, Howard, Bonds, Chipper, Delgado, Gross and Giambi. Defensive positioning definitely played a part in it. It seems that the most important thing one can do to get hits on pulled grounders is to be fast, and the most important thing one can do to get hits on opposite field grounders is to hit homeruns and thereby draw the defense over. Of course, logically it should be that the more pulled grounders you hit, the better your opposite field average would be as the defense moves over, but that's not what the data says.

I'm sorry, but I don't quite understand the last part of your comment. By hitting for power, do you mean power of groundball hit? Because we tested to see if LD% and HR/F mattered in GB Avg. My guess was that line drives might be correlated with power of groundballs, since they might follow a similar swing path or something. In that regard, I was probably wrong. As for actual slugging power, I would not use LD% to account for that either.

Rob, glad you liked the article. I'm a Yankee fan too, and the shift definitely hurt Giambi. But do you remember when he was able to pull one past Brian Roberts and the shift? His reaction was pretty funny. http://cache.deadspin.com/assets/images/deadspin/2008/07/jasongiambifinger.jpg

Cliff, I believe the guys at The Book have done some analysis on what the break even point is for it to be worth an Ortiz to bunt against the shift or "counter-punch." I think most of the time it would have a higher run expectancy to beat the shift than swing away.