The Case of Michael Young and Line Drive Rates
Courtesy of The Hardball Times, the table below details the top 20 line-drive rates over the past five seasons. Do you notice any repeaters? There are only two players who qualified more than once: David Wright twice (2005 and 2008) and Michael Young FOUR times (2004-2007).
Does this data say more about Young's proclivity for hitting liners, his home ballpark, or the bias of scorekeepers? A combination of the three? Or perhaps something else?
This table captures a number of career years. Freddy Sanchez hit .344 with an OPS of .851 in 2006 vs. career averages of .300 and .753. Brian Roberts hit .314/.903 in 2005 vs. .284/.771. Geoff Jenkins hit .292/.888 in 2005 vs. .275/.834. Chone Figgins hit .330/.825 in 2007 vs. .290/.743. Ryan Ludwick hit .299/.966 in 2008 vs. .273/.857. Brady Clark hit .306/.798 in 2005 vs. .277/.744. Joe Mauer hit .347/.936 in 2006 vs. .317/.856.
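For readers who want to see the size of those spikes side by side, here is a minimal sketch that subtracts each career line from the season line. The numbers are the ones quoted in the paragraph above; the names `career_years` and `gains` are my own illustrative choices, not anything from The Hardball Times.

```python
# (season AVG, season OPS, career AVG, career OPS), as quoted in the text above
career_years = {
    "Freddy Sanchez 2006": (.344, .851, .300, .753),
    "Brian Roberts 2005": (.314, .903, .284, .771),
    "Geoff Jenkins 2005": (.292, .888, .275, .834),
    "Chone Figgins 2007": (.330, .825, .290, .743),
    "Ryan Ludwick 2008": (.299, .966, .273, .857),
    "Brady Clark 2005": (.306, .798, .277, .744),
    "Joe Mauer 2006": (.347, .936, .317, .856),
}


def gains(line):
    """Return (AVG gain, OPS gain) of the season over the career line."""
    avg, ops, career_avg, career_ops = line
    return round(avg - career_avg, 3), round(ops - career_ops, 3)


for player, line in career_years.items():
    avg_gain, ops_gain = gains(line)
    print(f"{player}: AVG {avg_gain:+.3f}, OPS {ops_gain:+.3f}")

# Unweighted means across the seven seasons listed
n = len(career_years)
mean_avg_gain = round(sum(gains(l)[0] for l in career_years.values()) / n, 3)
mean_ops_gain = round(sum(gains(l)[1] for l in career_years.values()) / n, 3)
print(f"Mean gain: AVG {mean_avg_gain:+.3f}, OPS {mean_ops_gain:+.3f}")
```

By this simple arithmetic, the seven seasons average roughly 31 points of batting average and 87 points of OPS above the players' career lines. The means are unweighted, so treat them as a quick illustration rather than a study.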
Other than Juan Pierre, all of these players had BA/BIP over .300 with a mean of .340. Young, for what it's worth, owns three of the top four BA/RISP within this sample, including the only one greater than .400.
Of note, Young is the only Texas player included in the above list, which suggests LD% has more to do with the hitter than with the effects of the ballpark or scorekeeper. However, Mark Teixeira had a 28.2% LD rate in 2003. In addition, Hank Blalock (2005), Milton Bradley (2008), and Ian Kinsler (2008) had rates that fell just outside the top 20. As such, I think it is fair to say that ballparks influence LD rates.
According to Baseball Analysts contributor Jeremy Greenhouse, about 50 Rangers have had at least 100 plate appearances since 2005, and their average line-drive rate (sans Young) was 20.5% vs. 19.9% league-wide. Furthermore, in a study at FanGraphs, Brian Cartwright determined that "a batter is 18% more likely to have a batted ball coded as a LD" in Arlington . . . "while in Minneapolis, it's 20% less likely."
As Tangotiger wrote in response to Brian's work, "A 'line drive' is not necessarily a line drive. If hitters are showing as hitting 20% fewer line drives in the Metrodome than away from the Metrodome, we don't know if it's because the Metrodome depresses LD rates, or if it's because the scorer in Minnesota is depressing it. Since it makes a huge difference when looking at LD and FB rates, then you need some sort of park factor to normalize the data . . . Taking a guess, I have to believe this is a scorer issue. A line drive is really a batted ball that leaves the bat at a certain angle, at a certain velocity. I don't see how those things would affect whether a ball is a LD, FB, or GB, regardless of the park you are in. I can see how the scorer can be influenced by the positioning of the fielder (and worse, if the fielder caught the ball or not), and try to assign a batted ball code."
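To make the idea of "some sort of park factor" concrete, here is a rough sketch of how such a normalization might work. It rests on assumptions that are mine, not Brian's or Tango's: that "18% more likely" and "20% less likely" can be treated as multiplicative factors of 1.18 and 0.80 on the coded LD rate at home, that a hitter puts half of his balls in play at home, and that road parks average out to neutral. The function name and the 25% observed rate are hypothetical.

```python
def park_adjusted_ld_rate(observed_ld_rate, home_factor, home_share=0.5):
    """Estimate a park-neutral LD rate from an observed full-season rate.

    Assumes (hypothetically) that the home park multiplies the coded LD rate
    by `home_factor` and that road parks are neutral (factor 1.0) on average.
    """
    if not 0.0 <= home_share <= 1.0:
        raise ValueError("home_share must be between 0 and 1")
    if home_factor <= 0:
        raise ValueError("home_factor must be positive")
    # Blend the home factor with a neutral road factor by share of batted balls
    blended_factor = home_share * home_factor + (1.0 - home_share) * 1.0
    return observed_ld_rate / blended_factor


# Factors read off Cartwright's quoted findings (my interpretation)
ARLINGTON_FACTOR = 1.18   # "18% more likely" to be coded as a LD
METRODOME_FACTOR = 0.80   # "20% less likely"

hypothetical_observed = 0.25  # illustrative rate, not any real player's
texas_adjusted = park_adjusted_ld_rate(hypothetical_observed, ARLINGTON_FACTOR)
minnesota_adjusted = park_adjusted_ld_rate(hypothetical_observed, METRODOME_FACTOR)
print(f"Arlington hitter: {texas_adjusted:.3f}")
print(f"Metrodome hitter: {minnesota_adjusted:.3f}")
```

Under these assumptions, the same observed 25% rate would shrink to about 22.9% for a hitter based in Arlington and grow to about 27.8% for one based in Minneapolis. The adjustment is the same whether the park or the scorer causes the skew, which is exactly the distinction the factor alone cannot settle.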
The thread attached to Tango's comments is fascinating and includes posts by Colin Wyers, Mike Fast, MGL, Greg Rybarczyk, Dave Studeman, and David Gassko. It is worth reading if you're into advanced batted ball studies. As studes points out, "From my work in the 2006 THT Annual, there was a greater standard error in line drive rates per park than in GB or Outfield Fly rates. Not outrageously higher, but definitely higher." You can also download a PDF of the 2004 THT Annual that includes Robert Dudek’s groundbreaking article on hang time, which matters because, as Tango notes, "how much time it takes for the ball and the fielder to intersect" is what really differentiates one batted ball from another.
There are a number of questions to ask when it comes to batted balls. What share of a batted-ball classification is attributable to the hitter or pitcher, the ballpark, or the scorekeeper? What distinguishes a line drive from a hard-hit groundball or a looping flyball? Is a one-hopper that skips past the infield classified as a grounder or a liner? Does the ball have to hit the outfield grass first in order to be coded as a line drive? How high can a ball be hit and still be considered a line drive? Should the outcome have an effect on how a batted ball is coded? Does the outcome have an effect?
Play-by-play, batted ball, pitch f/x. We know a lot more today than we did just five years ago, and we will know a lot more in five years than we know today. Hit f/x is next. Stats are not ridiculous. Only those who ignore (the right) stats are ridiculous.