Pitchers, Pitch by Pitch
Last week, in this same Designated Hitter column, Dan Fox took an excellent look at what batters did on a pitch-by-pitch basis. Well, guess what? Pitchers have pitch-by-pitch stats, too, and they're just as interesting! I've sliced this data dozens of ways, and there are literally hundreds of different stats you could create from Baseball Info Solutions' "pitch data," so I'm only going to focus on the four I believe are most relevant.
When a pitcher throws the ball, it can either land in or out of the strike zone. Pitchers will throw the ball in the strike zone anywhere from 44% of the time to 65% of the time. (If this sounds familiar, it's because I went over this same stat, but for batters, in my Dissecting Plate Discipline article.) Let's call this stat Zone Ratio (ZRatio), which will simply be the ratio of pitches thrown in the strike zone to pitches thrown out of the strike zone.
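Because ZRatio is a ratio rather than a percentage, it can help to see how the two relate. The sketch below is only an illustration: the function names are made up for this example, and the counts are chosen to mirror the 44% and 65% figures quoted above rather than taken from the pitch data.

```python
def zratio(in_zone: int, out_zone: int) -> float:
    """Ratio of pitches in the strike zone to pitches out of it."""
    if out_zone <= 0:
        raise ValueError("need at least one pitch outside the zone")
    return in_zone / out_zone


def zone_pct(ratio: float) -> float:
    """Convert a ZRatio back into the share of pitches thrown in the zone."""
    return ratio / (1.0 + ratio)


# Illustrative counts per 100 pitches (assumed, not real pitcher totals).
high = zratio(65, 35)   # a pitcher in the zone 65% of the time
low = zratio(44, 56)    # a pitcher in the zone 44% of the time

print(f"{high:.2f} {low:.2f}")   # 1.86 0.79
print(f"{zone_pct(high):.0%}")   # 65%
```

A ZRatio above 1 means more pitches in the zone than out of it; a ZRatio below 1 means the opposite.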
You won't be surprised when I tell you this stat correlates well with walks, but not all pitchers who have a low ZRatio necessarily walk a lot of batters. Let's take a look at the top and bottom 5 lists for starting pitchers only.
Top 5 ZRatio (starters)      Bottom 5 ZRatio (starters)
Carlos Silva       1.86      Al Leiter          0.90
Paul Byrd          1.49      Kirk Rueter        0.89
Brad Halsey        1.48      Scott Downs        0.88
Bartolo Colon      1.47      Felix Hernandez    0.87
Greg Maddux        1.46      Dewon Brazelton    0.81
Seeing pitchers like Dewon Brazelton and Al Leiter, who walked over 6 batters per 9 innings last season, on the bottom list isn't much of a shock, but it is a little odd to see Felix Hernandez and Scott Downs, who both walked under 3.5 batters per 9 innings. Looking at the top list, Carlos Silva threw far and away the highest percentage of pitches in the strike zone in baseball, which sounds about right considering his minuscule walk rate of 0.4 batters per 9 innings.
Top 5 ZRatio (relievers)     Bottom 5 ZRatio (relievers)
R. Betancourt      1.65      Ryan Dempster      0.87
Heath Bell         1.54      Akinori Otsuka     0.85
Matt Belisle       1.51      J.C. Romero        0.83
Luis Ayala         1.51      Mike Gonzalez      0.83
Paul Quantrill     1.48      Mike Wuertz        0.78
Looking at relief pitchers, no one appears out of place on the bottom list, but it is interesting to see the Cubs' closer Ryan Dempster and the Pirates' possible closer Mike Gonzalez. I wonder if throwing that many pitches out of the strike zone will catch up to them eventually? The top of the list is pretty ho-hum in my opinion.
After the pitcher throws the ball, the batter can either swing or take the pitch. Batters should typically be expected to swing at a high percentage of pitches inside the strike zone, but what I find fascinating are pitchers who can make batters swing at pitches outside the strike zone. For this we're going to look at outside swing percentage (OSwing), which is the percentage of pitches thrown outside the strike zone that a batter swings at.
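As a quick sketch of the definition, OSwing is just swings at out-of-zone pitches divided by all out-of-zone pitches. The function name and the counts below are hypothetical and are not drawn from the Baseball Info Solutions data.

```python
def oswing(swings_outside: int, pitches_outside: int) -> float:
    """Share of out-of-zone pitches that the batter swung at."""
    if pitches_outside <= 0:
        raise ValueError("need at least one pitch outside the zone")
    if not 0 <= swings_outside <= pitches_outside:
        raise ValueError("swings must be between 0 and pitches_outside")
    return swings_outside / pitches_outside


# Hypothetical season totals for one pitcher (assumed for illustration).
print(f"{oswing(304, 1000):.2%}")  # 30.40%
```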
Perhaps you could consider this a measure of deception. Pitchers will cause batters to swing at pitches outside the strike zone anywhere from just under 10% to just over 32% of the time. It doesn't have a great correlation with anything, but I suppose it matches up best with a pitcher's strikeout-to-walk ratio. Once again, let's look at the top and bottom 5 lists for starting pitchers.
Top 5 OSwing (starters)        Bottom 5 OSwing (starters)
Brad Radke         31.51%      Hayden Penn        13.75%
Johan Santana      30.43%      John Maine         13.41%
Curt Schilling     29.75%      Zach Day           13.10%
Felix Hernandez    28.59%      Glendon Rusch      11.99%
Odalis Perez       27.94%      Scott Erickson      9.95%
In the top 5 we have a pretty interesting list, including arguably the best pitcher in baseball, Johan Santana, who is second in OSwing only to his teammate Brad Radke. Felix Hernandez also shows up and is the only player on the list who has a ZRatio less than 1. On the bottom of the list, there's not really anyone worth mentioning.
Top 5 OSwing (relievers)       Bottom 5 OSwing (relievers)
Brad Lidge         32.54%      Jesus Colome       12.43%
Rudy Seanez        30.24%      Matt Mantei        12.32%
Derrick Turnbow    28.48%      Armando Benitez    11.67%
Mike Wuertz        28.11%      Danny Kolb         11.03%
J. Papelbon        27.87%      Nate Bump           9.82%
The top list of relievers is just as impressive, with two closers. Only Mike Wuertz has an ERA over 3. Bringing up the rear are former closers Matt Mantei and Danny Kolb. And then there's Armando Benitez, whose presence I find particularly odd. I'm really not sure what he's doing there, but I bet if you were to look at his OSwing in previous seasons, it wouldn't be anywhere near the bottom.
Moving along, once a batter has decided to swing at a pitch, he can either make contact with it or whiff at the ball. Pitchers will have batters swing and hit their pitches between roughly 60% and 90% of the time. Let's simply call this Contact, which is the percentage of pitches a batter makes contact with when he swings the bat. Obviously this will correlate quite well with a pitcher's strikeouts.
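To make the definition concrete, here is a minimal sketch of Contact and its complement, the swing-and-miss rate. The function names and the counts are assumptions made for this example only.

```python
def contact(swings_with_contact: int, swings: int) -> float:
    """Share of swings on which the batter made contact (fair or foul)."""
    if swings <= 0:
        raise ValueError("need at least one swing")
    if not 0 <= swings_with_contact <= swings:
        raise ValueError("contact count must be between 0 and swings")
    return swings_with_contact / swings


def whiff(swings_with_contact: int, swings: int) -> float:
    """Share of swings that missed entirely; the complement of Contact."""
    return 1.0 - contact(swings_with_contact, swings)


# Hypothetical totals: 120 swings with contact out of 200 swings.
print(f"{contact(120, 200):.2%}")  # 60.00%
print(f"{whiff(120, 200):.2%}")    # 40.00%
```

For pitchers, lower Contact is better, which is why the lowest percentages make up the top list.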
Looking at the top and bottom 5 Contact lists for starting pitchers, Johan Santana makes another appearance on a top list. It looks like Kerry Wood still has what it takes to make batters miss, if he could actually stay healthy. The same goes for Kelvim Escobar, who is not just looking to stay healthy but could also join the pitching elite. The bottom of the list is scattered with pitchers who barely strike out anyone, including Carlos Silva. Should I just reserve a spot for a Twins starting pitcher on every list?
Bottom 5 Contact (starters)    Top 5 Contact (starters)
Kirk Rueter        91.58%      Ezeq. Astacio      74.46%
Carlos Silva       91.08%      Johan Santana      74.26%
Kirk Saarloos      89.66%      Jake Peavy         73.86%
Shawn Estes        89.21%      Kelvim Escobar     71.81%
Ryan Drese         89.09%      Kerry Wood         70.45%
Taking a look at the relievers, there are two of the best closers in Brad Lidge and Joe Nathan on the top list. Ugueth Urbina used to close but recently has ended up on teams with established closers. Wuertz shows up on another list. Could he possibly be a future closer? At the bottom of the list are pitchers you wouldn't trust to close out Little League games.
Bottom 5 Contact (relievers)   Top 5 Contact (relievers)
Paul Quantrill     91.21%      Joe Nathan         67.51%
Scott Munter       90.44%      Ugueth Urbina      67.24%
Kevin Gryboski     90.20%      Mike Wuertz        60.45%
Nate Bump          90.04%      Brad Lidge         59.86%
Jesse Crain        88.12%      Rudy Seanez        59.25%
Finally, when a batter makes contact with the ball, it can either be put into play or fouled off. I'm not so interested in what batters do to pitches outside the strike zone, but more so what they do to pitches inside the strike zone. So let's look at the ratio of pitches inside the strike zone that are fouled off and call it the Foul Ratio (FRatio).
FRatio correlates quite well with strikeouts, but also has some correlation with a pitcher's fly ball percentage. It's a little strange, but basically it suggests that pitchers who put the ball in play more frequently are often ground ball pitchers. Pitchers will have an FRatio of anywhere from 0.45 to 1.45.
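The exact formula for FRatio isn't spelled out here, so the sketch below rests on an assumption: that it is in-zone pitches fouled off divided by in-zone pitches put into play, which is the reading that fits a stat ranging above 1. The function names are invented for this example.

```python
def fratio(fouls_in_zone: int, in_play_in_zone: int) -> float:
    """ASSUMED definition: in-zone fouls divided by in-zone balls in play."""
    if in_play_in_zone <= 0:
        raise ValueError("need at least one in-zone ball in play")
    return fouls_in_zone / in_play_in_zone


def foul_share(ratio: float) -> float:
    """Share of in-zone contact that went foul, under the same assumption."""
    return ratio / (1.0 + ratio)


# The two ends of the 0.45 to 1.45 range quoted above.
print(f"{foul_share(1.45):.1%}")  # 59.2%
print(f"{foul_share(0.45):.1%}")  # 31.0%
```

Under that assumption, a pitcher at the top of the range sees about six of every ten in-zone contacts go foul, while a pitcher at the bottom sees only about three.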
Top 5 FRatio (starters)      Bottom 5 FRatio (starters)
Mark Prior         1.19      Ric. Rodriguez     0.55
Chris Young        1.17      Carlos Silva       0.53
Erik Bedard        1.14      Kirk Rueter        0.50
Matthew Cain       1.13      Scott Erickson     0.47
Kyle Davies        1.12      Mike Gosling       0.45
Looking at starting pitchers only, the top list has some pretty interesting names on it. It's worth noting that only Mark Prior has an OSwing over 20% on this list. No list would be complete without Silva, so he shows up on the bottom list (have you learned enough about him yet?).
Top 5 FRatio (relievers)     Bottom 5 FRatio (relievers)
B.J. Ryan          1.43      Pete Walker        0.55
J. Papelbon        1.41      Joey Eischen       0.54
Russ Springer      1.37      Scott Sauerbeck    0.54
Scott Eyre         1.36      T. Mulholland      0.54
Ugueth Urbina      1.35      Brian Shouse       0.50
B.J. Ryan heads up the top list for relievers, but is the only active closer of the lot. Most of the high-profile closers in baseball aren't too far from the top 5. There's no one too notable towards the bottom of the list, but former closer Danny Graves missed the 5 spot by just 0.3%. In addition, Joey Eischen and Scott Sauerbeck managed to strike out a good number of batters despite having a lousy FRatio.
So what kind of conclusions can we draw from looking at a pitcher's pitch-by-pitch data? Well, it's clear to me that having a high OSwing and a high FRatio is preferable, so let's look at one final list which is a combination of the two. I believe this should give us a good indication of a pitcher's overall skill level or possibly potential. For lack of a better name, let's call this stat Potential. Here are the top 10 starters and top 10 relievers.
Top 10 Starters (Potential)    Top 10 Relievers (Potential)
Johan Santana      0.332       J. Papelbon        0.392
Curt Schilling     0.317       Joe Nathan         0.345
Brad Radke         0.279       Robert Jenks       0.331
Rick Helling       0.266       Brad Lidge         0.328
Mark Prior         0.264       Eddie Guardado     0.319
Scott Kazmir       0.260       Mariano Rivera     0.303
Rich Harden        0.259       Juan Rincon        0.301
Jake Peavy         0.250       Scott Eyre         0.291
B. McCarthy        0.239       Jose Valverde      0.290
Robinson Tejeda    0.235       R. Betancourt      0.288
These are two very prestigious lists with some interesting players thrown in. The only starter who seems totally out of place to me is Rick Helling, since everyone else is either already a good pitcher or is seen as one with great potential. The relievers are no different, as you have 5 of the best closers in baseball, and no one had an ERA over 3. I'd show the bottom lists, but there's really no one worth mentioning.
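How OSwing and FRatio are combined into Potential isn't stated. One guess, and it is only a guess, is a simple product of the two. Jonathan Papelbon is the only pitcher who appears on the OSwing, FRatio, and Potential lists at once, so his published figures allow a rough check of that guess. The function name below is made up for this sketch.

```python
def potential_guess(oswing: float, fratio: float) -> float:
    """ASSUMED combination: OSwing times FRatio. Not a confirmed formula."""
    return oswing * fratio


# Papelbon's published figures: OSwing 27.87%, FRatio 1.41, Potential 0.392.
guess = potential_guess(0.2787, 1.41)
print(f"{guess:.3f}")  # 0.393, within rounding of the published 0.392

# If the guess holds, Mark Prior's FRatio (1.19) and Potential (0.264)
# would imply this OSwing:
print(f"{0.264 / 1.19:.1%}")  # 22.2%
```

The implied figure for Prior fits the earlier note that his OSwing is over 20%, but a single full data point cannot rule out other formulas.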
What will be really interesting to see is whether these stats have predictive power. My guess is that they probably do, but next year, when the Baseball Info Solutions 2006 pitch data is complete, we'll be able to take a much better look at whether or not any of these stats correlate from year to year. There's obviously a lot of work to be done, and analysis like this is just scratching the surface, but it seems to me that pitch-by-pitch data is the future of player-based statistical analysis.
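Once two seasons of data exist, one simple way to run that year-to-year check would be a Pearson correlation between each pitcher's value in one season and the next. The sketch below assumes that approach. The numbers in it are made-up placeholders, not real 2005 or 2006 figures.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 each")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# PLACEHOLDER values: the same four hypothetical pitchers in two seasons.
oswing_year1 = [0.30, 0.25, 0.20, 0.15]
oswing_year2 = [0.28, 0.26, 0.19, 0.17]

print(f"{pearson(oswing_year1, oswing_year2):.2f}")
```

A value near 1 would suggest the stat reflects a repeatable skill, while a value near 0 would suggest it is mostly noise from season to season.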
David Appelman is the creator of FanGraphs.com. You can contact him via e-mail.
[Additional reader comments and retorts at Baseball Primer.]
Great article. I'm very interested to see how these numbers vary from year to year to determine their predictive value.
Carlos Silva might need an article all to himself. How is this guy getting people out? From the statistics cited above, he works exclusively in the strike zone, hitters frequently make contact, and usually that contact is solid enough to put the ball in play. So what's his deal? Is it that he has such impeccable control that hitters can only get enough of the bat on it to produce weak grounders and fly balls? Did hitters just happen to hit everything right at defenders? If he had a very low BABIP, we might draw the conclusion that he just got lucky and balls were hit right at defenders, who fielded the ball well, but at .295 he was about average. His G/F ratio of 1.83 shows he's inducing more ground balls, but not at such a high rate that coupled with good infield D it would explain his success in 2005.
It would be interesting if it were possible to determine the average velocity of batted balls in play for a particular pitcher. Clearly this is not the type of information readily available to the masses (who knows, teams might keep that kind of info), but I'm guessing that another way a guy like Silva succeeds in making it easier for his defenders to field batted balls is by locating pitches in such a way that hitters just don't hit the ball hard. Batters' Isolated Power against him might give some indication of how hard they're hitting the ball, but that still fails to explain his success as he falls pretty much in the middle of the pack with an ISO of .151.
Posted by: Dan Vacek at March 2, 2006 8:52 AM
This is great stuff; very likely this could work like BABIP, only for pitchers!
Posted by: matty fred at March 2, 2006 9:57 AM
Yeah, Silva is a pretty odd pitcher. He put a greater percentage of balls in play in his 2005 season than any other active pitcher has in any year of their career. Also, taking a closer look at pitch location is something I definitely want to try and do this year.
Posted by: Appelman at March 2, 2006 10:28 AM
DA: Very interesting analysis. I'm copying this post from Primer in case you don't hang out there:
It would be great if at some point David examined whether and how much Out-of-zone swing, contact rate, and foul percentage predict BABIP. If they do, and if they showed decent y-t-y correlation, these measures may allow us to detect hit-prevention ability in young pitchers.
I thought it was odd that he combined just OZone and Foul rate at the end, w/o contact (unless I misunderstood). I'd think a combined metric that includes 1) swing out of zone, 2) swing/miss, and 3) swing/foul would be even better. Perhaps add 4) strike/take as well, to include all pitches on which there's a good outcome for pitcher.
Posted by: Guy at March 2, 2006 11:14 AM
thanks David, that is really pretty interesting
Posted by: AgRyan04 at March 2, 2006 4:56 PM
Jonathan Papelbon just dominates the reliever category ... but it should be noted that it's a smaller sample size for him.
Posted by: Brian at March 3, 2006 7:49 AM
You mention using BIS data. I don't see mention of the time frame you use to judge the relievers. Is it just regular season, or do you include their post season numbers, since they had the most at stake there? Is it just 2005, or their entire major league regular and post season numbers?
Posted by: susan mullen at March 4, 2006 5:59 PM
All the stats are from the 2005 regular season only. I'll be doing this same exercise after the 2006 season to see if any of these stats remain constant from year to year.
Posted by: Appelman at March 4, 2006 8:08 PM