Command Post
August 10, 2007
Makin' a Filter
By Joe P. Sheehan

Jamie Moyer and Josh Beckett both throw fastballs, but while Moyer's tops out around 85 MPH, Beckett's travels 10 MPH faster. Looking at each pitcher separately, it's easy to classify his fastball, but the only thing the two fastballs have in common is that each is the fastest pitch its pitcher throws. To expand my examination of when pitchers throw certain pitches, I want to classify every pitch that has been tracked by the pitch f/x system as either a fastball or an off-speed pitch. To differentiate effectively between the two groups, each pitcher has to be compared to himself and not to an outside standard that would classify Moyer's 85 MPH fastball as an off-speed pitch.

In each appearance by a pitcher, I found the average speed of his pitches as they crossed the plate, and then divided the velocity of each pitch in that appearance by the average, which gave me a value for each pitch, standardized for that day. I then classified each pitch as a fastball or off-speed, using only that standard value. Obviously this isn't a perfect method for classifying pitches, and there is some level of inaccuracy with the labels, but it's simple, relatively accurate for fastballs vs. off-speed pitches, and I think it's a good start in automating the classification process.

Testing the method on individual pitchers, the results generally agreed with a visual inspection of their pitch charts, but the algorithm I used to classify pitches had problems with certain types of off-speed pitches. To fix the problems, I used a single cut-off point on the standardized value to separate fastballs from everything else. Generally speaking, a pitch that was faster than the pitcher's average speed for that appearance was usually a fastball and anything slower was off-speed. This was the case for every type of pitcher I examined, which will be important.
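As a rough sketch of the standardize-then-cut-off approach described above (not the code actually used for this article), the steps might look like the following in Python. The dictionary keys `pitcher`, `game_id` and `end_speed`, the function names, and the sample speeds are all assumptions made for illustration.

```python
from collections import defaultdict

def standardized_speeds(pitches):
    """Divide each pitch's speed by the average speed for that appearance.

    Each pitch is a dict with hypothetical keys 'pitcher', 'game_id' and
    'end_speed' (speed as the ball crosses the plate, in MPH).
    """
    by_appearance = defaultdict(list)
    for p in pitches:
        by_appearance[(p["pitcher"], p["game_id"])].append(p["end_speed"])
    averages = {key: sum(v) / len(v) for key, v in by_appearance.items()}
    return [p["end_speed"] / averages[(p["pitcher"], p["game_id"])] for p in pitches]

def classify_pitches(pitches, cutoff=1.0):
    """Label a pitch 'fastball' if it was faster than the appearance average."""
    return ["fastball" if ratio > cutoff else "off-speed"
            for ratio in standardized_speeds(pitches)]

# Made-up speeds: an 85 MPH pitch is a fastball for one pitcher
# and an off-speed pitch for the other.
sample = [
    {"pitcher": "Moyer", "game_id": 1, "end_speed": 85.0},
    {"pitcher": "Moyer", "game_id": 1, "end_speed": 84.0},
    {"pitcher": "Moyer", "game_id": 1, "end_speed": 75.0},
    {"pitcher": "Moyer", "game_id": 1, "end_speed": 70.0},
    {"pitcher": "Beckett", "game_id": 1, "end_speed": 95.0},
    {"pitcher": "Beckett", "game_id": 1, "end_speed": 96.0},
    {"pitcher": "Beckett", "game_id": 1, "end_speed": 85.0},
    {"pitcher": "Beckett", "game_id": 1, "end_speed": 80.0},
]
labels = classify_pitches(sample)
```

Keeping the cut-off as a parameter makes it easy to try a different threshold for an unusual pitcher without touching the standardization step.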

Some pitches are going to be improperly classified with this method as well, but the problem is smaller than it was with the algorithm. Because different types of pitchers behaved so similarly, the cut-off also worked better than the algorithm when classifying pitches for multiple pitchers. Here's a pitch chart from Roy Halladay to give a sense of where the distinction is being made between pitches.


One thing to keep in mind, and it's shown clearly in Halladay's graph, is that I didn't make any attempt to separate 2-seam and 4-seam fastballs for pitchers that throw both pitches, which will slightly skew the results for those pitchers.

Once I was automatically classifying individual pitchers, I went back and classified every pitch in my database as either a fastball or an off-speed pitch. Before I looked at when pitches were thrown though, I needed to establish some baselines. Of all the pitches in my database, 62% have been fastballs. Some basic splits are in the table below.

Split     Fastball%   Total Pitches
Overall   62%         122072
RHP/RHH   63%         46849
RHP/LHH   61%         43197
LHP/RHH   61%         23415
LHP/LHH   63%         8611
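For readers who want to reproduce this kind of split, here is a minimal sketch, assuming each pitch record already carries a `label` from the classification step plus hypothetical `pitcher_hand` and `batter_hand` fields. The second half is a consistency check on the published table: the four split totals add up to the overall pitch count, and the pitch-weighted average of the four percentages rounds to the overall 62%.

```python
from collections import Counter

def fastball_pct_by_split(pitches):
    """Return {split: (fastball share, total pitches)} by handedness matchup."""
    totals, fastballs = Counter(), Counter()
    for p in pitches:
        split = f"{p['pitcher_hand']}HP/{p['batter_hand']}HH"
        totals[split] += 1
        fastballs[split] += p["label"] == "fastball"  # True counts as 1
    return {s: (fastballs[s] / totals[s], totals[s]) for s in totals}

# Values copied from the table above: (fastball share, total pitches).
published = {
    "RHP/RHH": (0.63, 46849),
    "RHP/LHH": (0.61, 43197),
    "LHP/RHH": (0.61, 23415),
    "LHP/LHH": (0.63, 8611),
}
total = sum(n for _, n in published.values())
overall = sum(pct * n for pct, n in published.values()) / total
print(total, round(overall * 100))  # 122072 62
```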

It seems that pitchers throw more fastballs to same-side hitters, but overall 62% looks pretty good as an average. Here's a list of the 10 pitchers who throw the highest and lowest percentage of fastballs (min 100 pitches).

Name               FB%     Total Pitches
Scot Shields       75%     531
Todd Jones         75%     116
Darren Oliver      75%     357
Joakim Soria       75%     206
Alan Embree        73%     380
R. Betancourt      73%     173
Jay Marshall       73%     319
Mike Timlin        73%     146
Aaron Sele         72%     127
Macay McBride      72%     263
Cole Hamels        48%     341
Ian Snell          47%     285
Akinori Otsuka     46%     293
Tom Glavine        46%     324
Matt Wise          46%     121
C. Villanueva      45%     508
Royce Ring         44%     248
Kiko Calero        44%     314
Justin Speier      42%     363
Jamie Walker       37%     151

This list is pretty interesting and the full list it came from might be even more interesting. First of all, Jamie Walker throws a ridiculously small percentage of fastballs compared to the league average. 37% is more than 3 standard deviations from the mean, so he must have reasonably good off-speed pitches to rely on them so extensively. Comparing pitchers to each other gave me insight into some differences in pitch selection I was unaware of. I knew Hamels and Glavine relied heavily on pitches other than their fastballs, but I had no idea they threw their fastballs less than half the time. Similarly, I was surprised at how frequently the leaders threw their fastballs. Joel Zumaya missed the pitch limit cut-off, but he threw his fastball 84% of the time. Hitters essentially knew his fastball was coming, but there still wasn't much they could do with it. One other tidbit from this chart is regarding Beckett. He was subject to criticism last season that he was relying on his fastball too much. This season he has thrown it 65% of the time, which is above average, but not in the category of 'over-reliance'.
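A related but different check from the one quoted above is to ask whether sampling noise alone could produce Walker's number. The sketch below treats each of his 151 pitches as an independent draw from a pitcher who throws the league-average 62% fastballs, which is a simplifying assumption, and uses the binomial standard error. It is not the spread across pitchers that the "3 standard deviations" figure refers to, and the function name is hypothetical.

```python
import math

def binomial_z(observed_pct, n_pitches, baseline_pct):
    """Z-score of an observed share against a baseline, using binomial noise."""
    standard_error = math.sqrt(baseline_pct * (1 - baseline_pct) / n_pitches)
    return (observed_pct - baseline_pct) / standard_error

# Jamie Walker: 37% fastballs on 151 pitches, against the 62% overall rate.
z = binomial_z(0.37, 151, 0.62)
print(round(z, 1))  # -6.3
```

Under those assumptions, a gap that large is very unlikely to be a fluke of a 151-pitch sample, which fits the reading that Walker really does pitch differently.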

In a previous article, I examined the pitch selection of Jake Peavy and Dan Haren, based on the Leverage Index of the situation. I didn't have any baselines to compare their averages to, but now I do. Instead of using LI to separate situations, I took a suggestion from a comment by Tangotiger and created three groups of situations based on the run value of a strikeout vs. regular out. Using the win value of a strikeout vs. regular out would probably be a better distinction, but that's for another article. A strikeout is much more valuable than a regular out primarily when there is a runner on third base with fewer than two outs, while the value of a regular out is higher than a strikeout if there is a runner on first or first and second, with one or no outs. The chart below shows the fastball percentages for each situation, split by the pitcher/batter matchup.

Split       High K     Low K      Everything Else
Overall     60%        64%        62%
RHP/RHH     62%        65%        63%
RHP/LHH     60%        64%        61%
LHP/RHH     58%        63%        61%
LHP/LHH     63%        64%        62%

In every case, the percentage of fastballs thrown is lower when the pitcher needs a strikeout, which is what we expected going in (and saw in the case of Peavy and Haren). The differences between situations aren't severe, but in the 'overall' case especially, the sample size is large enough that the differences are real.
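The three situation groups can be written down as a small function. This is one reading of the definitions given earlier, not the exact rule used for the table: any state with a runner on third and fewer than two outs counts as "High K" (including bases loaded), and a runner on first with third base empty and fewer than two outs counts as "Low K". The function and argument names are hypothetical.

```python
def situation_group(on_first, on_second, on_third, outs):
    """Assign a base-out state to one of the three strikeout-value groups."""
    if outs < 2 and on_third:
        return "High K"           # strikeout much better than a regular out
    if outs < 2 and on_first and not on_third:
        return "Low K"            # runner on first, or on first and second
    return "Everything Else"

print(situation_group(False, False, True, 1))   # High K
print(situation_group(True, True, False, 0))    # Low K
print(situation_group(True, False, False, 2))   # Everything Else
```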

Below is a table showing the pitchers who have thrown the highest and lowest percentage of fastballs when they need a strikeout (min 20 pitches). It is a little misleading to just compare the percentage of fastballs a pitcher throws when he needs a strikeout to the league average and say anything less than the league average (more breaking balls) is good while anything higher is bad. A pitcher should throw whatever pitch he has that can get the most swings-and-misses in a high K situation, and for some pitchers, their best swing-and-miss pitch happens to be their fastball. Pitchers rely on their fastballs generally, but certain pitchers should and do use it even more in situations where they need a strikeout.

Name             FB%      Total Pitches
Carlos Silva     88%      25
Matt Belisle     87%      23
Greg Maddux      84%      43
Chris Sampson    81%      21
Vicente Padilla  80%      111
Adam Eaton       79%      29
Manny Delcarmen  79%      33
Odalis Perez     77%      31
Scot Shields     77%      31
Jay Marshall     76%      34
Rudy Seanez      39%      28
Javier Lopez     39%      41
Vinnie Chulk     38%      21
Matt Cain        38%      29
Will Ohman       36%      22
C. Villanueva    36%      22
Mike MacDougal   35%      20
Kelvim Escobar   34%      62
Scott Baker      32%      25
Mike Thompson    32%      22

Manny Delcarmen is one of the pitchers who relies more on his fastball when he needs a strikeout, and we can check whether he should. Delcarmen gets a swinging strike 13% of the time he throws his fastball (in any situation), while he gets a swinging strike only 10% of the time with his off-speed pitches. If those ratios are real, and not the product of a small sample size so far, Delcarmen appears to be justified in relying on his fastball more when he needs a strikeout. The downside to this is if hitters know a fastball is coming nearly 80% of the time with a runner on third and less than two outs, it would seem to lose some of its swing-and-miss capabilities...unless it is such a good fastball that hitters can't hit it even when they know it's coming, in which case a pitcher should use it more heavily when he needs a strikeout. There should be some point where that circular loop ends and an equilibrium is reached between the amount a pitch is thrown and its ability to cause swings-and-misses.
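A swinging-strike comparison like the one for Delcarmen can be sketched as follows, assuming each pitch record has a `label` ("fastball" or "off-speed") and a `result` field in which a swing and miss is recorded as "swinging_strike". Both field names, the result strings, and the sample records are made up for illustration.

```python
def swinging_strike_rates(pitches):
    """Return {label: swinging strikes / pitches thrown} for each pitch class."""
    thrown, whiffs = {}, {}
    for p in pitches:
        label = p["label"]
        thrown[label] = thrown.get(label, 0) + 1
        if p["result"] == "swinging_strike":
            whiffs[label] = whiffs.get(label, 0) + 1
    return {label: whiffs.get(label, 0) / thrown[label] for label in thrown}

sample = [
    {"label": "fastball", "result": "swinging_strike"},
    {"label": "fastball", "result": "ball"},
    {"label": "fastball", "result": "in_play"},
    {"label": "off-speed", "result": "called_strike"},
    {"label": "off-speed", "result": "foul"},
]
rates = swinging_strike_rates(sample)
```

With only a few dozen pitches in a category, a 13% vs. 10% gap can easily come from a handful of swings, so the pitch counts are worth reporting next to the rates.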

I've covered some of the flaws in the methodology I used to separate pitches, but overall I was quite happy with the results. When I compared the overall fastball percentages for individual pitchers to Inside Edge on ESPN and my own individual pitcher graphs, the percentages were close in all three cases. The next step in this type of analysis is to separate out the different off-speed pitches that I lumped together, which adds another layer of information about pitchers and pitch selection. A changeup and curveball are two very different pitches and could be used for very different purposes by a pitcher.

I'm going to close with one last table, this one showing the fastball percentage on extreme pitcher's counts (0&2 and 1&2) and extreme hitter's counts (3&0, 3&1).

Count             Fastball%     Total Pitches
3&0 and 3&1       83%           4340
0&2 and 1&2       54%           18091

I should have separated the 3-ball counts by the cost of a walk. Even so, it seems amazing that pitchers are so afraid of walking a hitter in those counts that they become Zumaya-esque in terms of pitch selection, without the amazing fastball to back it up. In a count that already favors the hitter, hitters see almost all fastballs, which is one big reason why hitters have a .630 SLG in 3&0 and 3&1 counts this year.


Obviously, I'm a Giants fan: given that Cain has a great mid-90's fastball with "stuff" but he only uses it 38% of the time when he needs a strikeout, it seems that someone should slap him upside the head and tell him to throw his fastball more.

In fact, just before his recent high in strikeouts, he admitted in an interview that he wasn't relying on his fastball for a strike, but observed other pitchers and realized that he should, leading to him now getting more strikeouts, so perhaps he slapped himself.

However, he went through a similar thing last year where he didn't trust his fastball, thinking major league hitters could hit it, but when they skipped his start, he calmed down and started hurling the heat.

The problem seems to be that he's preparing for the day when his heater is gone, but he's only 22 years old, so he should just fling the heat and work on being a pitcher when he has a big lead.

Cain has been using his changeup and slider more in his last 2 starts. His fastball isn't in the mid-90's according to the Gameday data I have observed. His fastball has been at 92-94. That's still good enough for him to get it by hitters.

If you happened to view the Gameday for his start @San Diego, his fastball was in the mid-90's, but don't be fooled. I have noticed that SD's PITCHf/x system is reading fastballs as faster than normal. I compared his fastball speed to other pitchers in the game and everyone else's figures were high as well.

He hasn't had many leads to work with this year, and the bullpen has blown a few of his wins. So there goes some of that confidence right there.

How often did Beckett throw his fastball last season?

Inside Edge has him throwing the fastball 71% of the time and that includes pitches from this season, last season and I think the season before. So he may have been throwing it 75% of the time. I actually have him throwing it 60% of the time so far this season in the games with enhanced gameday he's pitched.

Someone has been watching too much Biodome buuuuuuuuuudddy

Nice article. Just pointing out the obvious here, but your filter will fail for pitchers with extremely high or low fastball percentages. (For instance, Tim Wakefield's average speed pitch will still likely be a knuckleball, and all of his curveballs will likely be classified as fastballs.)

Wow, this is awesome research. Interesting to see Halladay's velocity mostly under 90. Are you sure about that? I thought his sinker always went 92-94.

Interesting, though I think sinkerballers actually use a fastball more because they WANT the hitter to hit it and beat it into the ground. That's why I see Silva up there so high.

As for pitchers, any pitcher who can't crack 90-92 MPH, I firmly suggest throwing sinkers. IMO it's almost a must.


It seems to me you might get cleaner results if you also took into consideration the vertical and horizontal movement. Why not, for right-handed pitchers, look only at pfx_x values that are negative and for left-handers pfx_x values that are positive and for both pfx_z values that are positive? If there's any reasonable amount of backspin on the ball the pfx_z won't be negative. The lateral movement is more dicey since I'm not sure you want cut fastballs in your classification as a fastball, and if you do then you wouldn't use pfx_x.
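A rough sketch of this movement filter in Python, using the sign conventions exactly as stated in the comment. The function name and the `use_pfx_x` switch (for the case where cut fastballs should still count as fastballs) are hypothetical, and the check would presumably be combined with the speed cut-off rather than replace it.

```python
def has_fastball_movement(pitcher_hand, pfx_x, pfx_z, use_pfx_x=True):
    """True if the movement matches the suggested fastball filter.

    pitcher_hand is 'R' or 'L'. The vertical test requires positive pfx_z
    (backspin); the lateral test requires negative pfx_x for right-handers
    and positive pfx_x for left-handers.
    """
    if pfx_z <= 0:
        return False
    if not use_pfx_x:
        return True  # ignore lateral movement so cutters still pass
    return pfx_x < 0 if pitcher_hand == "R" else pfx_x > 0

print(has_fastball_movement("R", -6.0, 9.0))                   # True
print(has_fastball_movement("R", 2.0, 9.0))                    # False
print(has_fastball_movement("R", 2.0, 9.0, use_pfx_x=False))   # True
```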

Great stuff as usual.

i agree. great work.

I'm really happy someone got the Biodome joke.

I actually didn't include Wakefield in the analysis for that reason. I need to look at the actual values again, but I think his fastballs were pitches that were more than 4-5% faster than average. Using the cutoff at 1 grouped too many knuckleballs as fastballs I think. Something like what Dan suggested should work well for Wakefield.

Adding on to what Dan said, you could classify any pitch from a righty with a positive pfx_x value (and negative for a lefty) as a cut fastball if the pfx_z value is over 8 or 7. Jake Peavy's and Matsuzaka's cutters often seem to have a lower value than that, though (although there seems to be no clear distinction between their cutter and slider, and it may be better to call their cutters off-speed pitches anyway).

Another addition for classifying pitches would be to treat slow pitches with negative pfx_x values for LHP (and positive for RHP) that also have negative pfx_z values as curveballs. I believe that speed would be the primary differentiator between curves and splitters.

Graham Rhodenizer, were the velocities you thought Halladay threw based on radar gun readings? If so, then they are not reliable, as radar guns have inconsistent readings and can be higher or lower than actual. The PITCHf/x system (when properly calibrated) is more reliable than a radar gun.
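The cutter and curveball rules from the two paragraphs above could be sketched like this. The pfx_z threshold of 7 is the lower of the two values mentioned, "slow" is taken to mean slower than the pitcher's average speed for the appearance (a speed ratio below 1.0), and every name in the code is hypothetical.

```python
def refine_label(pitcher_hand, pfx_x, pfx_z, speed_ratio, cutter_min_pfx_z=7.0):
    """Return 'cut fastball', 'curveball', or None if neither rule applies.

    speed_ratio is the pitch speed divided by the pitcher's average speed
    for that appearance (an assumed definition of a 'slow' pitch).
    """
    # Positive pfx_x for a righty, negative for a lefty, as in the comment.
    breaks_that_way = pfx_x > 0 if pitcher_hand == "R" else pfx_x < 0
    if not breaks_that_way:
        return None
    if pfx_z > cutter_min_pfx_z:
        return "cut fastball"
    if pfx_z < 0 and speed_ratio < 1.0:
        return "curveball"
    return None

print(refine_label("R", 2.0, 9.0, 1.02))    # cut fastball
print(refine_label("L", -3.0, -5.0, 0.88))  # curveball
print(refine_label("R", -6.0, 9.0, 1.03))   # None
```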

Great Article!