Hitter Scouting Reports
One of the interesting statistics available at Fangraphs is how hitters perform against different types of pitches. Presumably, using this data, we can see how well hitters handle various pitches, be it fastballs, sliders, curves, cutters, etc. The statistic of interest is Runs Above Average (RAA) per 100 pitches (for instance, for fastballs, the stat is wFB/C, denoting the runs above average the player contributed per 100 fastballs).
At first blush it would seem that we could identify the best fastball hitting players in baseball from this statistic. Likewise, with curveballs, sliders, change-ups, etc. However, one of the big problems with this data is it is very noisy. One year, a player may appear to hit best against fastballs, while the next year it may be curveballs. For instance, in 2007 it appeared that Aramis Ramirez hit very well against curveballs (wCB/C of 5.09), while the next year he hit curveballs very poorly (wCB/C of -2.53). This past year, he appeared to be about average. One of the key questions is whether these fluctuations are real, and whether these stats, in general, can be trusted.
For this analysis, I looked at five pitches: the fastball, the slider, the cutter, the curveball, and the changeup. For each of these pitches I gathered data for all 212 players with 400 or more plate appearances (PA) in the 2008 season.
Here's how the basics broke down: Relative to their overall abilities, hitters did best against fastballs (.20 RAA per 100 pitches) and change-ups (.14 RAA per 100 pitches), about average against curveballs (-.05 RAA per 100 pitches), and worse against cutters (-.34 RAA per 100 pitches) and sliders (-.55 RAA per 100 pitches).
These averages are fine, although what I'm really interested in is how individual batters varied. Are some hitters really better at hitting the fastball? And what's the spread of the distribution?
As a first step, I subtracted each hitter's overall average RAA per 100 pitches from his RAA per 100 pitches against each pitch type. Obviously someone like Albert Pujols hits well against pretty much all pitches, but I'm interested in which pitches he hits best. This adjustment takes care of that.
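The adjustment can be sketched in a few lines of Python. This is only an illustration: the function name `relative_to_overall` and all of the numbers below are made up, not taken from the Fangraphs data.

```python
# Hypothetical RAA per 100 pitches for one hitter, by pitch type.
# These values are illustrative only, not real Fangraphs numbers.
raa_per_100 = {"fastball": 2.0, "slider": 0.5, "curveball": 1.0}
overall_raa_per_100 = 1.25  # hypothetical overall RAA per 100 pitches


def relative_to_overall(by_pitch, overall):
    """Subtract the hitter's overall rate from each pitch-type rate."""
    return {pitch: value - overall for pitch, value in by_pitch.items()}


adjusted = relative_to_overall(raa_per_100, overall_raa_per_100)
print(adjusted)
```

A positive value means the hitter did better against that pitch than he did overall, and a negative value means he did worse, regardless of how good a hitter he is in general.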
More interesting is the distribution of talent regarding the ability to hit each type of pitch. The standard deviation of hitter abilities for each pitch (weighted by the number of plate appearances) is the following:
Again, at first glance, it appears that the fastball has the smallest variation in the ability to hit it, while the cutter has the most. But of course, a lot of this variation is due to chance alone. Not that many cutters are thrown, so of course the variation in RAA per 100 pitches will be fairly high.
What we can do is to calculate the expected variance due to chance alone. Knowing that the standard error for RAA on a typical 600 PA season is 10.75 runs, we can work backwards and find that the standard deviation for RAA on a single pitch is .2243 (10.75/(600*3.83)^.5). Knowing this, we get the following estimates for amount of variability that is expected to occur just by chance:
As you can see by comparing these figures to the ones above, most of the variability in performance against various pitches can be explained by chance alone. In some cases (change-ups, sliders), the variability expected by chance even slightly exceeds the actual variability in the data. This indicates that basically there is no "real" difference between batters in the ability to hit the change-ups and sliders thrown to them (more on this in a moment).
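The chance-only calculation described above can be sketched roughly as follows. The 10.75 runs, 600 PA, and 3.83 pitches per PA come from the text; the function name and the two pitch counts are hypothetical, and the sketch assumes every pitch is independent with the same variance, which is a simplification.

```python
import math

# Figures quoted in the text.
SE_RAA_PER_SEASON = 10.75   # standard error of RAA over a 600 PA season
PA_PER_SEASON = 600
PITCHES_PER_PA = 3.83

# Standard deviation of RAA on a single pitch.
sd_per_pitch = SE_RAA_PER_SEASON / math.sqrt(PA_PER_SEASON * PITCHES_PER_PA)


def chance_sd_per_100(n_pitches):
    """SD of RAA per 100 pitches expected from chance alone over n_pitches.

    Assumes pitches are independent with equal variance (a simplification).
    """
    return 100 * sd_per_pitch / math.sqrt(n_pitches)


print(round(sd_per_pitch, 4))  # 0.2243

# Hypothetical pitch counts: a commonly seen pitch versus a rare one.
for n in (1400, 150):
    print(n, round(chance_sd_per_100(n), 2))
```

The fewer pitches of a given type a hitter sees, the larger the spread that chance alone produces, which is why rarely thrown pitches such as cutters look so variable.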
For the other pitches, the ratio of the variances tells us how much we need to regress each hitter's data. For fastballs, we have to regress 77%, while cutters and curves must each be regressed 89%. Most of the variability is due to chance alone. For instance, in 2008, Adam Dunn had an RAA against fastballs that was 1.11 runs per 100 pitches better than his average production. However, when we regress based on the above, we find that Dunn was just .43 runs per 100 pitches better against fastballs - not all that much different from a normal hitter, who was .22 runs better against fastballs.
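As a rough sketch of the regression step, assuming the usual approach of shrinking toward the league mean by the share of variance attributable to chance (the function names here are hypothetical):

```python
def regression_fraction(observed_sd, chance_sd):
    """Share of observed variance explained by chance, capped at 1."""
    return min(1.0, chance_sd ** 2 / observed_sd ** 2)


def regress_toward_mean(observed, league_mean, regression):
    """Shrink an observed value toward the league mean.

    regression is the fraction of the gap removed (0.77 means 77%).
    """
    return league_mean + (1 - regression) * (observed - league_mean)


# Rounded figures from the Adam Dunn fastball example in the text.
dunn = regress_toward_mean(observed=1.11, league_mean=0.22, regression=0.77)
print(round(dunn, 2))  # 0.42
```

With the rounded inputs quoted in the text this gives about .42 rather than .43; the small gap presumably comes from rounding in the published figures. The cap in `regression_fraction` covers cases like change-ups and sliders, where the variance expected by chance exceeds the observed variance and the estimate is regressed all the way to the mean.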
With luck accounting for so much of the variability in the above data, the RAA per 100 pitches figures from Fangraphs are fairly limited in their use. In fact, for all pitches except fastballs, the observed variability was not significantly different from the variability expected by chance, leading one to believe that there may not be any true talent difference at all.
So what does this all mean? We've all seen players who "can't hit the curveball" or are "great fastball hitters". Does this analysis show that these players don't exist at all? Not so fast. While it does show that players don't seem to actually hit pitches differently, we are ignoring another extremely important factor - how often the batter sees each pitch.
It stands to reason that pitchers would throw more curveballs to the player who "can't hit the curve" and fewer fastballs to great fastball hitters. And presumably they'll throw fewer and fewer fastballs and more and more curveballs until the batter starts to expect the curve and his efficacy against the curveball actually begins to match his ability against the fastball. In a game theory sense, the game would reach an equilibrium when expected RAA was the same for each pitch. A batter may be a truly better fastball hitter and a weak curveball hitter, but as pitchers throw fewer fastballs, their fastballs become tougher to hit because the batter sees them less often. Likewise, if the pitcher throws mostly curveballs, the batter can sit on the curve and he will begin to hit better against that pitch. In a nutshell, pitchers throw fastball hitters fewer fastballs, making them more of a surprise and tougher to hit, and as a result, the batter's RAA per fastball decreases. At least, that's my theory.
So, an important follow-up is whether some hitters do indeed see fewer fastballs than others. The averages and standard deviations of how often hitters see each type of pitch can be seen below.
As you can see, very little of the variation in the types of pitches seen is due to chance. This means that there is a reason that some batters see more of one type of pitch than others. Presumably, the reason is scouting reports, which indicate how best to pitch particular hitters. Alexei Ramirez saw a fastball a league-low 47% of the time. Meanwhile, Juan Pierre saw a fastball over 70% of the time. Those differences are no fluke. Unlike the RAA per pitch data, these percentages are stable. Ramirez was pitched fastballs just 50% of the time in 2009, while Pierre has seen about 70% fastballs in each year of his career.
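To get a feel for how little chance can move these percentages, here is a minimal sketch that treats each pitch as an independent draw (a binomial assumption of mine, not necessarily the method used for the figures above). The 0.60 league fastball share is a hypothetical placeholder, and the pitch count reuses the 600 PA and 3.83 pitches per PA from earlier.

```python
import math


def chance_sd_of_share(p, n_pitches):
    """Binomial standard deviation of an observed pitch share."""
    return math.sqrt(p * (1 - p) / n_pitches)


# Assumption: roughly 600 PA * 3.83 pitches per PA, about 2,300 pitches.
n = round(600 * 3.83)

# 0.60 is a hypothetical league-wide fastball share, for illustration only.
sd = chance_sd_of_share(0.60, n)
print(round(sd, 4))

# Gap between the two fastball shares quoted in the text, in chance-SD units.
print(round((0.70 - 0.47) / sd, 1))
```

Under these assumptions the chance-only spread is on the order of one percentage point, so a 23-point gap like the one between Ramirez and Pierre is far too large to be noise.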
So, given that there is very little "true" difference in the actual RAA per pitch, but there are significant and consistent differences in the way that hitters are actually pitched, I am led to believe that the best indicator of a hitter's strengths is the proportion of pitches thrown to him. RAA per pitch, while a cool stat, has so much variability that it's rendered nearly useless. The percentage of fastballs (or other pitches) seen is a much more stable and reliable indicator of a batter's strengths and weaknesses. In essence, the advance scouts have already done our work for us in identifying a batter's abilities. To find a hitter's strengths and weaknesses, all we have to do is watch how teams pitch to him.
A last look at this subject is examining the relationship between RAA per 100 pitches and the percentage of each type of pitch seen. If my game theory presumption were true, we would see basically no relationship between the two variables. The graphs below show the relationships.
As you can see, the RAA per 100 pitches and the percentage of pitches seen have basically no relationship for sliders, cutters, change-ups, or curveballs. For fastballs there is a weak relationship, showing that hitters who get fewer fastballs are better at hitting them. From a game theory perspective, it shows that pitchers could throw even fewer fastballs than they do already to good fastball hitters (there may be other factors to consider besides just optimizing the outcome of each individual pitch, however, so there may be other good reasons why pitchers would continue to throw fastballs to a good fastball hitter).
Overall, this has been a somewhat sprawling piece on a tricky topic, so I'll sum up. Looking at the evidence, it appears that when trying to identify a hitter's strengths and weaknesses against particular pitches, looking at how he actually did against those pitches is not a particularly useful measure. More indicative is the frequency with which a batter was thrown each pitch. The better a hitter is against a particular pitch, the less often he will see it. This entire issue of selection bias is an important one to consider, especially when doing pitch f/x analysis or other pitch-by-pitch studies.