Makin' a Filter
Jamie Moyer and Josh Beckett both throw fastballs, but while Moyer's tops out around 85 MPH, Beckett's travels 10 MPH faster. Looking at each pitcher separately, it's easy to classify his fastball, but the only thing the two fastballs have in common is that each is the fastest pitch its pitcher throws. To expand my examination of when pitchers throw certain pitches, I want to classify every pitch that has been tracked by the PITCHf/x system as either a fastball or an off-speed pitch. To differentiate effectively between the two groups, each pitcher has to be compared to himself and not to an outside standard that would classify Moyer's 85 MPH fastball as an off-speed pitch.
In each appearance by a pitcher, I found the average speed of his pitches as they crossed the plate, and then divided the velocity of each pitch in that appearance by the average, which gave me a value for each pitch, standardized for that day. I then classified each pitch as a fastball or off-speed, using only that standard value. Obviously this isn't a perfect method for classifying pitches, and there is some level of inaccuracy with the labels, but it's simple, relatively accurate for fastballs vs. off-speed pitches, and I think it's a good start in automating the classification process.
When I tested the method on individual pitchers, the results generally agreed with a visual inspection of their pitch charts, but the algorithm I used to classify pitches had problems with certain types of off-speed pitches. To fix the problems, I used a cut-off point on the standard value to separate fastballs from everything else. Generally speaking, a pitch that was faster than the average speed was a fastball and anything slower was off-speed. This was the case for every type of pitcher I examined, which will be important.
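As a rough illustration, the per-appearance standardization and cut-off described above could be sketched in Python like this. The field names (pitcher, game_id, speed), the function name, and the sample speeds are all hypothetical, and the cut-off of 1.0 simply encodes "faster than that day's average"; treat it as a sketch of the idea rather than the exact code behind the numbers in this article.

```python
from collections import defaultdict

# Assumed cut-off: a ratio above 1.0 means the pitch was faster than the
# pitcher's average speed in that appearance.
FASTBALL_CUTOFF = 1.0


def classify_pitches(pitches, cutoff=FASTBALL_CUTOFF):
    """Label each pitch 'fastball' or 'off-speed' relative to its own appearance.

    `pitches` is a list of dicts with hypothetical keys:
    'pitcher', 'game_id', and 'speed' (MPH as the pitch crosses the plate).
    """
    # Collect the speeds thrown in each (pitcher, game) appearance.
    speeds = defaultdict(list)
    for p in pitches:
        speeds[(p["pitcher"], p["game_id"])].append(p["speed"])

    # Average speed per appearance.
    averages = {key: sum(v) / len(v) for key, v in speeds.items()}

    labeled = []
    for p in pitches:
        avg = averages[(p["pitcher"], p["game_id"])]
        ratio = p["speed"] / avg  # the standardized value for that day
        label = "fastball" if ratio > cutoff else "off-speed"
        labeled.append({**p, "std_speed": ratio, "label": label})
    return labeled


# Made-up speeds for a soft-tosser and a hard thrower in the same game.
sample = [
    {"pitcher": "soft", "game_id": 1, "speed": 85.0},
    {"pitcher": "soft", "game_id": 1, "speed": 84.0},
    {"pitcher": "soft", "game_id": 1, "speed": 75.0},
    {"pitcher": "soft", "game_id": 1, "speed": 70.0},
    {"pitcher": "hard", "game_id": 1, "speed": 95.0},
    {"pitcher": "hard", "game_id": 1, "speed": 94.0},
    {"pitcher": "hard", "game_id": 1, "speed": 88.0},
    {"pitcher": "hard", "game_id": 1, "speed": 78.0},
]
labeled_sample = classify_pitches(sample)
```

With these made-up speeds, the 85 MPH pitch from the soft-tosser is labeled a fastball while the 88 MPH pitch from the hard thrower is labeled off-speed, because each is compared only to its own pitcher's average for that day.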
Some pitches are going to be improperly classified with this method as well, but the problem is smaller than it was with the algorithm, and because different types of pitchers behave so similarly, this method worked better than the algorithm when classifying pitches for multiple pitchers. Here's a pitch chart from Roy Halladay to give a sense of where the distinction is being made between pitches. One thing to keep in mind, and it's shown clearly in Halladay's graph, is that I didn't make any attempt to separate 2-seam and 4-seam fastballs for pitchers who throw both pitches, which will slightly skew the results for those pitchers.
Once I was automatically classifying individual pitchers, I went back and classified every pitch in my database as either a fastball or an off-speed pitch. Before I looked at when pitches were thrown, though, I needed to establish some baselines. Of all the pitches in my database, 62% have been fastballs. Some basic splits are in the table below.

Split      Fastball%   Total Pitches
Overall    62%         122072
RHP/RHH    63%         46849
RHP/LHH    61%         43197
LHP/RHH    61%         23415
LHP/LHH    63%         8611

It seems that pitchers throw more fastballs to same-side hitters, but overall 62% looks pretty good as an average. Here's a list of the 10 pitchers who throw the highest and lowest percentage of fastballs (min 100 pitches).

Name              FB%    Total Pitches
Scot Shields      75%    531
Todd Jones        75%    116
Darren Oliver     75%    357
Joakim Soria      75%    206
Alan Embree       73%    380
R. Betancourt     73%    173
Jay Marshall      73%    319
Mike Timlin       73%    146
Aaron Sele        72%    127
Macay McBride     72%    263
------------------------------
Cole Hamels       48%    341
Ian Snell         47%    285
Akinori Otsuka    46%    293
Tom Glavine       46%    324
Matt Wise         46%    121
C. Villanueva     45%    508
Royce Ring        44%    248
Kiko Calero       44%    314
Justin Speier     42%    363
Jamie Walker      37%    151

This list is pretty interesting and the full list it came from might be even more interesting.
First of all, Jamie Walker throws a ridiculously small percentage of fastballs compared to the league average. 37% is more than 3 standard deviations from the mean, so he must have reasonably good off-speed pitches to rely on them so extensively. Comparing pitchers to each other gave me insight into some differences in pitch selection I was unaware of. I knew Hamels and Glavine relied heavily on pitches other than their fastballs, but I had no idea they threw their fastballs less than half the time. Similarly, I was surprised at how frequently the leaders threw their fastballs. Joel Zumaya missed the pitch limit cut-off, but he threw his fastball 84% of the time. Hitters essentially knew his fastball was coming, but there still wasn't much they could do with it.
One other tidbit from this chart concerns Beckett. He was criticized last season for relying on his fastball too much. This season he has thrown it 65% of the time, which is above average, but not in the category of 'over-reliance'.
In a previous article, I examined the pitch selection of Jake Peavy and Dan Haren, based on the Leverage Index of the situation. I didn't have any baselines to compare their averages to, but now I do. Instead of using LI to separate situations, I took a suggestion from a comment by Tangotiger and created three groups of situations based on the run value of a strikeout vs. a regular out. Using the win value of a strikeout vs. a regular out would probably be a better distinction, but that's for another article. A strikeout is much more valuable than a regular out primarily when there is a runner on third base and fewer than two outs, while the value of a regular out is higher than that of a strikeout if there is a runner on first or runners on first and second, with one or no outs. The chart below shows the fastball percentages for each situation, split by the pitcher/batter matchup.

Split      High K   Low K   Everything Else
Overall    60%      64%     62%
RHP/RHH    62%      65%     63%
RHP/LHH    60%      64%     61%
LHP/RHH    58%      63%     61%
LHP/LHH    63%      64%     62%

In every case, the percentage of fastballs thrown is lower when the pitcher needs a strikeout than when a regular out is worth more, which is what we expected going in (and saw in the case of Peavy and Haren). The differences between situations aren't severe, but in the 'overall' case especially, the sample size is large enough that the differences are real.
Below is a table showing the pitchers who have thrown the highest and lowest percentage of fastballs when they need a strikeout (min 20 pitches). It is a little misleading to just compare the percentage of fastballs a pitcher throws when he needs a strikeout to the league average and say anything less than the league average (more breaking balls) is good while anything higher is bad. A pitcher should throw whatever pitch he has that can get the most swings-and-misses in a high K situation, and for some pitchers, their best swing-and-miss pitch happens to be their fastball. Pitchers rely on their fastballs generally, but certain pitchers should and do use them even more in situations where they need a strikeout.

Name               FB%    Total Pitches
Carlos Silva       88%    25
Matt Belisle       87%    23
Greg Maddux        84%    43
Chris Sampson      81%    21
Vicente Padilla    80%    111
Adam Eaton         79%    29
Manny Delcarmen    79%    33
Odalis Perez       77%    31
Scot Shields       77%    31
Jay Marshall       76%    34
-----------------------------------
Rudy Seanez        39%    28
Javier Lopez       39%    41
Vinnie Chulk       38%    21
Matt Cain          38%    29
Will Ohman         36%    22
C. Villanueva      36%    22
Mike MacDougal     35%    20
Kelvim Escobar     34%    62
Scott Baker        32%    25
Mike Thompson      32%    22

Manny Delcarmen is one of the pitchers who relies more on his fastball when he needs a strikeout, and we can check whether he should. Delcarmen gets a swinging strike 13% of the time he throws his fastball (in any situation), while he gets a swinging strike only 10% of the time with his off-speed pitches.
If those ratios are real, and not the product of a small sample size so far, Delcarmen appears to be justified in relying on his fastball more when he needs a strikeout. The downside is that if hitters know a fastball is coming nearly 80% of the time with a runner on third and fewer than two outs, it would seem to lose some of its swing-and-miss capability... unless it is such a good fastball that hitters can't hit it even when they know it's coming, in which case a pitcher should use it more heavily when he needs a strikeout. There should be some point where that circular loop ends and an equilibrium is reached between how often a pitch is thrown and its ability to cause swings-and-misses.
I've covered some of the flaws in the methodology I used to separate pitches, but overall I was quite happy with the results. When I compared the overall fastball percentages for individual pitchers to Inside Edge on ESPN and my own individual pitcher graphs, the percentages were close in all three cases. The next step in this type of analysis is to separate out the different off-speed pitches that I lumped together, which adds another layer of information about pitchers and pitch selection. A changeup and a curveball are two very different pitches and could be used for very different purposes by a pitcher.
I'm going to close with one last table, this one showing the fastball percentage on extreme pitcher's counts (0&2 and 1&2) and extreme hitter's counts (3&0 and 3&1).

Count          Fastball%   Total Pitches
3&0 and 3&1    83%         4340
0&2 and 1&2    54%         18091

I should have separated the 3-ball counts by the cost of a walk, but it seems amazing that pitchers are so afraid of walking a hitter in those counts that they become Zumaya-esque in terms of pitch selection, without the amazing fastball to back it up. In a count that already favors the hitter, hitters see almost all fastballs, which is one big reason why hitters have a .630 SLG in 3&0 and 3&1 counts this year.
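For anyone who wants to reproduce splits like the situation and count tables in this article, here is a minimal Python sketch of the grouping. The function names, the dict keys (balls, strikes, label), and the reading of "runner on first or first and second" as "a runner on first with nobody on third" are my assumptions for illustration; the label field is assumed to already hold a fastball/off-speed classification for each pitch.

```python
from collections import defaultdict


def situation_bucket(outs, on_first, on_second, on_third):
    """Group a base-out state by the value of a strikeout vs. a regular out."""
    # Runner on third with fewer than two outs: a strikeout is worth more.
    if on_third and outs < 2:
        return "High K"
    # Runner on first, or on first and second, with one or no outs:
    # a regular out is worth more than a strikeout.
    # (on_second is accepted for completeness; it does not change the bucket.)
    if on_first and not on_third and outs < 2:
        return "Low K"
    return "Everything Else"


def count_bucket(balls, strikes):
    """Group a ball-strike count into extreme hitter's or pitcher's counts."""
    if (balls, strikes) in {(3, 0), (3, 1)}:
        return "hitter's count"
    if (balls, strikes) in {(0, 2), (1, 2)}:
        return "pitcher's count"
    return "other"


def fastball_pct(pitches, key, min_pitches=1):
    """Return {group: (fastball percentage, total pitches)} for each group.

    `key` maps a pitch dict to its group; groups with fewer than
    `min_pitches` pitches are dropped, like the minimums used in the tables.
    """
    totals = defaultdict(int)
    fastballs = defaultdict(int)
    for p in pitches:
        group = key(p)
        totals[group] += 1
        if p["label"] == "fastball":
            fastballs[group] += 1
    return {
        g: (100.0 * fastballs[g] / n, n)
        for g, n in totals.items()
        if n >= min_pitches
    }


# Tiny made-up sample, only to show the call shape.
demo = [
    {"balls": 3, "strikes": 0, "label": "fastball"},
    {"balls": 3, "strikes": 1, "label": "fastball"},
    {"balls": 0, "strikes": 2, "label": "off-speed"},
    {"balls": 1, "strikes": 2, "label": "fastball"},
    {"balls": 1, "strikes": 1, "label": "off-speed"},
]
by_count = fastball_pct(demo, key=lambda p: count_bucket(p["balls"], p["strikes"]))
```

Passing a different key function (pitcher name, pitcher/batter handedness, or situation_bucket applied to the base-out state) gives the other kinds of splits with the same aggregation.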
Comments
Obviously, I'm a Giants fan: given that Cain has a great mid-90's fastball with "stuff" but he only uses it 38% of the time when he needs a strikeout, it seems that someone should slap him upside the head and tell him to throw his fastball more.
In fact, just before his recent high in strikeouts, he admitted in an interview that he wasn't relying on his fastball for a strike, but observed other pitchers and realized that he should, leading to him now getting more strikeouts, so perhaps he slapped himself.
However, he went through a similar thing last year where he didn't trust his fastball, thinking major league hitters could hit it, but when they skipped his start, he calmed down and started hurling the heat.
The problem seems to be that he's preparing for the day when his heater is gone, but he's only 22 years old, so he should just fling the heat and work on being a pitcher when he has a big lead.
Posted by: obsessivegiantscompulsive at August 10, 2007 1:50 PM
Cain has been using his changeup and slider more in his last 2 starts. His fastball isn't in the mid-90's according to the Gameday data I have observed. His fastball has been at 92-94. That's still good enough for him to get it by hitters.
If you happened to view the Gameday for his start @San Diego, his fastball was in the mid-90's, but don't be fooled. I have noticed that SD's PITCHf/x system is reading fastballs as faster than normal. I compared his fastball speed to other pitchers in the game and everyone else's figures were high as well.
He hasn't had many leads to work with this year, and the bullpen has blown a few of his wins. So there goes some of that confidence right there.
Posted by: XV84 at August 10, 2007 3:39 PM
How often did Beckett throw his fastball last season?
Posted by: Joe at August 10, 2007 3:41 PM
Inside Edge has him throwing the fastball 71% of the time and that includes pitches from this season, last season and I think the season before. So he may have been throwing it 75% of the time. I actually have him throwing it 60% of the time so far this season in the games with enhanced gameday he's pitched.
Posted by: ultxmxpx at August 10, 2007 7:45 PM
Someone has been watching too much Biodome buuuuuuuuuudddy
Posted by: Trenchtown at August 11, 2007 1:01 PM
Nice article. Just pointing out the obvious here, but your filter will fail for pitchers with extremely high or low fastball percentages. (For instance, Tim Wakefield's average speed pitch will still likely be a knuckleball, and all of his curveballs will likely be classified as fastballs.)
Posted by: AP at August 12, 2007 12:08 PM
Wow, this is awesome research. Interesting to see Halladay's velocity mostly under 90. Are you sure about that? I thought his sinker always went 92-94.
Interesting though. I think sinkerballers actually use a fastball more because they WANT the hitter to hit it and beat it into the ground. That's why I see Silva up there so high.
As for any pitcher who can't crack 90-92 MPH, I firmly suggest throwing sinkers. IMO it's almost a must.
www.bluejayfever.com
New Jays Site/Forum
Posted by: Graham Rhodenizer at August 12, 2007 3:05 PM
It seems to me you might get cleaner results if you also took into consideration the vertical and horizontal movement. Why not, for right-handed pitchers, look only at pfx_x values that are negative, for left-handers pfx_x values that are positive, and for both pfx_z values that are positive? If there's any reasonable amount of backspin on the ball, the pfx_z won't be negative. The lateral movement is more dicey, since I'm not sure whether you want cut fastballs classified as fastballs; if you do, then you wouldn't use pfx_x.
Great stuff as usual.
Posted by: DanAgonistes at August 12, 2007 3:07 PM
i agree. great work.
Posted by: eric at August 12, 2007 5:05 PM
I'm really happy someone got the Biodome joke.
AP
I actually didn't include Wakefield in the analysis for that reason. I need to look at the actual values again, but I think his fastballs were pitches that were more than 4-5% faster than average. Using the cutoff at 1 grouped too many knuckleballs as fastballs I think. Something like what Dan suggested should work well for Wakefield.
Posted by: joe p at August 13, 2007 9:10 AM
Adding on to what Dan said, you could classify any pitch from a righty with a positive pfx_x value (and negative for a lefty) as a cut fastball if the pfx_z value is over 8 or 7. Jake Peavy's and Matsuzaka's cutters often seem to have a lower value than that, though (although there seems to be no clear distinction between their cutters and sliders, and it may be better to call their cutters off-speed pitches anyway).
Posted by: ultxmxpx at August 13, 2007 1:00 PM
Another addition for classifying pitches would be to treat slow pitches with negative pfx_x values for LHP (and positive for RHP) that also have negative pfx_z values as curveballs. I believe that speed would be the primary differentiator between curves and splitters.
Posted by: joe p at August 13, 2007 2:02 PM
Graham Rhodenizer, were the velocities you thought Halladay threw based on radar gun readings? If so, then they are not reliable, as radar guns have inconsistent readings that can be higher or lower than the actual speed. The PITCHf/x system (when properly calibrated) is more reliable than a radar gun.
Posted by: XV84 at August 14, 2007 1:06 AM
Great Article!
Posted by: Kevin at August 15, 2007 8:44 AM