Designated Hitter
March 02, 2006
Pitchers, Pitch by Pitch
By David Appelman

Last week, in this same Designated Hitter column, Dan Fox took an excellent look at what batters did on a pitch-by-pitch basis. Well, guess what? Pitchers have pitch-by-pitch stats, too, and they're just as interesting! I've sliced this data dozens of ways, and there are literally hundreds of different stats you could create from Baseball Info Solutions' "pitch data," so I'm only going to focus on the four I believe are most relevant.

When a pitcher throws the ball, it can land either in or out of the strike zone. Pitchers will throw the ball in the strike zone anywhere from 44% to 65% of the time. (If this sounds familiar, it's because I went over this same stat, but for batters, in my Dissecting Plate Discipline article.) Let's call this stat Zone Ratio (ZRatio), which is simply the ratio of pitches thrown in the strike zone to pitches thrown out of the strike zone.
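As a quick sanity check on the definition, the two ends of that percentage range can be converted into ZRatio values. This is only a sketch; the function names are made up for illustration and are not part of the Baseball Info Solutions data.

```python
def zone_pct_to_zratio(zone_pct: float) -> float:
    """Convert the share of pitches in the zone (0 to 1) into ZRatio,
    i.e. in-zone pitches divided by out-of-zone pitches."""
    if not 0.0 <= zone_pct < 1.0:
        raise ValueError("zone_pct must be in [0, 1)")
    return zone_pct / (1.0 - zone_pct)


def zratio_to_zone_pct(zratio: float) -> float:
    """Inverse conversion: ZRatio back to the share of pitches in the zone."""
    if zratio < 0.0:
        raise ValueError("zratio must be non-negative")
    return zratio / (1.0 + zratio)


for pct in (0.44, 0.65):
    print(f"{pct:.0%} in zone -> ZRatio {zone_pct_to_zratio(pct):.2f}")
# 44% in zone -> ZRatio 0.79
# 65% in zone -> ZRatio 1.86
```

A ZRatio of 1.00 therefore means exactly half of a pitcher's pitches were in the zone, and anything below 1.00 means he threw more balls out of the zone than in it.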

You won't be surprised when I tell you this stat correlates well with walks, but not all pitchers that have a low ZRatio necessarily walk a lot of batters. Let's take a look at the top and bottom 5 lists for starting pitchers only.

Top 5 ZRatio (starters)       Bottom 5 ZRatio (starters)
Carlos Silva     1.86         Al Leiter        0.90
Paul Byrd        1.49         Kirk Rueter      0.89
Brad Halsey      1.48         Scott Downs      0.88
Bartolo Colon    1.47         Felix Hernandez  0.87
Greg Maddux      1.46         Dewon Brazelton  0.81

Seeing pitchers like Dewon Brazelton and Al Leiter, who walked over 6 batters per 9 innings last season, on the bottom list isn't much of a shock, but it is a little odd to see Felix Hernandez and Scott Downs, who both walked under 3.5 batters per 9 innings. Looking at the top list, Carlos Silva threw far and away the highest percentage of pitches in the strike zone in baseball, which sounds about right considering his minuscule walk rate of 0.4 batters per 9 innings.

Top 5 ZRatio (relievers)      Bottom 5 ZRatio (relievers)
R. Betancourt    1.65         Ryan Dempster    0.87
Heath Bell       1.54         Akinori Otsuka   0.85
Matt Belisle     1.51         J.C. Romero      0.83
Luis Ayala       1.51         Mike Gonzalez    0.83
Paul Quantrill   1.48         Mike Wuertz      0.78

Looking at relief pitchers, no one appears out of place on the bottom list, but it is interesting to see the Cubs' closer Ryan Dempster and the Pirates' possible closer Mike Gonzalez. I wonder whether throwing that many pitches out of the strike zone will catch up to them eventually. The top of the list is pretty ho-hum in my opinion.

After the pitcher throws the ball, the batter can either swing or take the pitch. Batters should typically be expected to swing at a high percentage of pitches inside the strike zone, but what I find fascinating are pitchers who can make batters swing at pitches outside the strike zone. For this we're going to look at outside swing percentage (OSwing), which is the percentage of pitches thrown outside the strike zone that a batter swings at.

Perhaps you could consider this a measure of deception. Pitchers will cause batters to swing at pitches outside the strike zone anywhere from 9% to 31% of the time. It doesn't have a great correlation with anything, but I suppose it matches up best with a pitcher's strikeout-to-walk ratio. Once again, let's look at the top and bottom 5 lists for starting pitchers.

Top 5 OSwing (starters)       Bottom 5 OSwing (starters)
Brad Radke       31.51%       Hayden Penn      13.75%
Johan Santana    30.43%       John Maine       13.41%
Curt Schilling   29.75%       Zach Day         13.10%
Felix Hernandez  28.59%       Glendon Rusch    11.99%
Odalis Perez     27.94%       Scott Erickson    9.95%

In the top 5 we have a pretty interesting list, including Johan Santana, arguably the best pitcher in baseball, who is second in OSwing only to his teammate Brad Radke. Felix Hernandez also shows up and is the only player on the list who has a ZRatio less than 1. On the bottom of the list, there's not really anyone worth mentioning.

Top 5 OSwing (relievers)      Bottom 5 OSwing (relievers)
Brad Lidge       32.54%       Jesus Colome     12.43%
Rudy Seanez      30.24%       Matt Mantei      12.32%
Derrick Turnbow  28.48%       Armando Benitez  11.67%
Mike Wuertz      28.11%       Danny Kolb       11.03%
J. Papelbon      27.87%       Nate Bump         9.82%

The top list of relievers is just as impressive, with two closers. Only Mike Wuertz has an ERA over 3. Bringing up the rear are former closers Matt Mantei and Danny Kolb. And then there's Armando Benitez, whose presence I find particularly odd. I'm really not sure what he's doing there, but I bet if you were to look at his OSwing in previous seasons, it wouldn't be anywhere near the bottom.

Moving along, once a batter has decided to swing at a pitch, he can either make contact with it or whiff at the ball. Pitchers will have batters swing and hit their pitches roughly between 60% and 90% of the time. Let's simply call this Contact, which is the percentage of pitches a batter makes contact with when he swings the bat. Obviously this will correlate quite well with a pitcher's strikeouts.
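To make the two swing-based definitions concrete, here is a minimal sketch that tallies OSwing and Contact from a handful of pitches. The `Pitch` record and the sample pitches are invented for illustration; the real Baseball Info Solutions data will be laid out differently.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Pitch:
    """Hypothetical pitch record, not the actual BIS schema."""
    in_zone: bool   # did the pitch land in the strike zone?
    swing: bool     # did the batter swing?
    contact: bool   # did the bat touch the ball? (only meaningful on a swing)


def o_swing(pitches: List[Pitch]) -> Optional[float]:
    """Share of out-of-zone pitches that the batter swung at."""
    outside = [p for p in pitches if not p.in_zone]
    if not outside:
        return None
    return sum(p.swing for p in outside) / len(outside)


def contact_rate(pitches: List[Pitch]) -> Optional[float]:
    """Share of swings (in or out of the zone) that made contact."""
    swings = [p for p in pitches if p.swing]
    if not swings:
        return None
    return sum(p.contact for p in swings) / len(swings)


sample = [
    Pitch(in_zone=True, swing=True, contact=True),
    Pitch(in_zone=True, swing=False, contact=False),
    Pitch(in_zone=False, swing=True, contact=False),
    Pitch(in_zone=False, swing=False, contact=False),
    Pitch(in_zone=False, swing=False, contact=False),
    Pitch(in_zone=False, swing=True, contact=True),
    Pitch(in_zone=True, swing=True, contact=False),
    Pitch(in_zone=False, swing=False, contact=False),
]

print(f"OSwing:  {o_swing(sample):.2%}")       # 2 swings on 5 outside pitches
print(f"Contact: {contact_rate(sample):.2%}")  # 2 contacts on 4 swings
```

Note that the two stats use different denominators: OSwing divides by pitches outside the zone, while Contact divides by swings.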

Looking at the top and bottom 5 Contact lists for starting pitchers, Johan Santana makes another appearance on a top list. It looks like if Kerry Wood could actually stay healthy, he's still got what it takes to make batters miss, along with Kelvim Escobar, who is not just looking to stay healthy but could also join the pitching elite. The bottom of the list is scattered with pitchers who barely strike out anyone, including Carlos Silva. Should I just reserve a spot for a Twins starting pitcher on every list?

Bottom 5 Contact (starters)   Top 5 Contact (starters)
Kirk Rueter      91.58%       Ezeq. Astacio    74.46%
Carlos Silva     91.08%       Johan Santana    74.26%
Kirk Saarloos    89.66%       Jake Peavy       73.86%
Shawn Estes      89.21%       Kelvim Escobar   71.81%
Ryan Drese       89.09%       Kerry Wood       70.45%

Taking a look at the relievers, there are two of the best closers in Brad Lidge and Joe Nathan on the top list. Ugueth Urbina used to close but recently has ended up on teams with established closers. Wuertz shows up on another list. Could he possibly be a future closer? At the bottom of the list are pitchers you wouldn't trust to close out Little League games.

Bottom 5 Contact (relievers)  Top 5 Contact (relievers)
Paul Quantrill   91.21%       Joe Nathan       67.51%
Scott Munter     90.44%       Ugueth Urbina    67.24%
Kevin Gryboski   90.20%       Mike Wuertz      60.45%
Nate Bump        90.04%       Brad Lidge       59.86%
Jesse Crain      88.12%       Rudy Seanez      59.25%

Finally, when a batter makes contact with the ball, it can either be put into play or fouled off. I'm not so interested in what batters do to pitches outside the strike zone, but more so what they do to pitches inside the strike zone. So let's look at the ratio of pitches inside the strike zone that are fouled off and call it the Foul Ratio (FRatio).

FRatio correlates quite well with strikeouts, but also has some correlation with a pitcher's fly ball percentage. It's a little strange, but basically it suggests that pitchers who put the ball in play more frequently are often ground ball pitchers. Pitchers will have an FRatio of anywhere from 0.45 to 1.45.
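The exact formula for FRatio isn't spelled out above. One reading that fits both the 0.45 to 1.45 range and the "fouled off or put into play" split is in-zone fouls divided by in-zone balls put into play. The sketch below assumes that reading, and the function names and counts are hypothetical.

```python
def f_ratio(zone_fouls: int, zone_in_play: int) -> float:
    """ASSUMED definition: in-zone pitches fouled off divided by
    in-zone pitches put into play."""
    if zone_in_play <= 0:
        raise ValueError("zone_in_play must be positive")
    return zone_fouls / zone_in_play


def foul_share(fratio: float) -> float:
    """Under the same assumption, the share of in-zone contact that is foul."""
    if fratio < 0.0:
        raise ValueError("fratio must be non-negative")
    return fratio / (1.0 + fratio)


# Made-up counts: 119 fouls against 100 balls in play on in-zone pitches.
print(f"FRatio: {f_ratio(119, 100):.2f}")

for fr in (0.45, 1.45):
    print(f"FRatio {fr:.2f} -> {foul_share(fr):.0%} of in-zone contact is foul")
```

If that reading is right, the ends of the range would correspond to roughly 31% and 59% of in-zone contact going foul.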

Top 5 FRatio (starters)       Bottom 5 FRatio (starters)
Mark Prior       1.19         Ric. Rodriguez   0.55
Chris Young      1.17         Carlos Silva     0.53
Erik Bedard      1.14         Kirk Rueter      0.50
Matthew Cain     1.13         Scott Erickson   0.47
Kyle Davies      1.12         Mike Gosling     0.45

Looking at starting pitchers only, the top list has some pretty interesting names on it. It's worth noting that only Mark Prior has an OSwing over 20% on this list. No list would be complete without Silva, so he shows up on the bottom list (have you learned enough about him yet?).

Top 5 FRatio (relievers)      Bottom 5 FRatio (relievers)
B.J. Ryan        1.43         Pete Walker      0.55
J. Papelbon      1.41         Joey Eischen     0.54
Russ Springer    1.37         Scott Sauerbeck  0.54
Scott Eyre       1.36         T. Mulholland    0.54
Ugueth Urbina    1.35         Brian Shouse     0.50

B.J. Ryan heads up the top list for relievers, but is the only active closer of the lot. Most of the high-profile closers in baseball aren't too far from the top 5. There's no one too notable towards the bottom of the list, but former closer Danny Graves missed the fifth spot by just .3%. In addition, Joey Eischen and Scott Sauerbeck managed to strike out a good number of batters despite having a lousy FRatio.

So what kind of conclusions can we draw from looking at a pitcher's pitch-by-pitch data? Well, it's clear to me that having a high OSwing and a high FRatio is preferable, so let's look at one final list which is a combination of the two. I believe this should give us a good indication of a pitcher's overall skill level or possibly potential. For lack of a better name, let's call this stat Potential. Here are the top 10 starters and top 10 relievers.

Top 10 Starters (Potential)   Top 10 Relievers (Potential)
Johan Santana    0.332        J. Papelbon      0.392
Curt Schilling   0.317        Joe Nathan       0.345
Brad Radke       0.279        Robert Jenks     0.331
Rick Helling     0.266        Brad Lidge       0.328
Mark Prior       0.264        Eddie Guardado   0.319
Scott Kazmir     0.260        Mariano Rivera   0.303
Rich Harden      0.259        Juan Rincon      0.301
Jake Peavy       0.250        Scott Eyre       0.291
B. McCarthy      0.239        Jose Valverde    0.290
Robinson Tejeda  0.235        R. Betancourt    0.288

These are two very prestigious lists with some interesting players thrown in. The only starter that seems totally out of place to me is Rick Helling since everyone else is either already a good pitcher or is seen as one with great potential. The relievers are no different as you have 5 of the best closers in baseball and no one had an ERA over 3. I'd show the bottom lists, but there's really no one worth mentioning.
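How OSwing and FRatio are combined into Potential isn't stated. One guess that lines up with the published numbers is a simple product, OSwing (as a fraction) times FRatio. The sketch below tests that guess against the one pitcher who appears in all three reliever lists, Jonathan Papelbon; treat the formula as an assumption, not as the confirmed definition.

```python
def potential(o_swing: float, f_ratio: float) -> float:
    """ASSUMED formula: OSwing (as a fraction, not a percent) times FRatio."""
    return o_swing * f_ratio


def implied_o_swing(potential_value: float, f_ratio: float) -> float:
    """Back out the OSwing that the assumed formula would require."""
    if f_ratio <= 0.0:
        raise ValueError("f_ratio must be positive")
    return potential_value / f_ratio


# Papelbon: OSwing 27.87%, FRatio 1.41, listed Potential 0.392.
print(f"Papelbon under the product guess: {potential(0.2787, 1.41):.3f}")

# Mark Prior: FRatio 1.19, listed Potential 0.264.
print(f"Prior's implied OSwing: {implied_o_swing(0.264, 1.19):.1%}")
```

The product comes out at about 0.393 for Papelbon, within rounding of the listed 0.392, and Prior's implied OSwing of about 22% agrees with the earlier remark that his OSwing was over 20%. Two data points are suggestive, not proof.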

What will be really interesting to see is whether these stats have predictive power. My guess is that they probably do, but next year, when the Baseball Info Solutions 2006 pitch data is complete, we'll be able to take a much better look at whether or not any of these stats correlate from year to year. There's obviously a lot of work to be done, and analysis like this is just scratching the surface, but it seems to me that pitch-by-pitch data is the future of player-based statistical analysis.
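Once a second season is available, the year-to-year check could look something like the sketch below: pair up each pitcher's value in both seasons and compute a Pearson correlation. The pitcher names and numbers here are placeholders, not real data, and the helper names are made up for illustration.

```python
from math import sqrt
from typing import Dict, List, Tuple


def paired_values(year1: Dict[str, float],
                  year2: Dict[str, float]) -> Tuple[List[float], List[float]]:
    """Keep only pitchers who appear in both seasons, in a stable order."""
    common = sorted(set(year1) & set(year2))
    return [year1[p] for p in common], [year2[p] for p in common]


def pearson_r(xs: List[float], ys: List[float]) -> float:
    """Plain Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Placeholder OSwing values for two seasons (NOT real data).
o_swing_2005 = {"Pitcher A": 0.30, "Pitcher B": 0.22, "Pitcher C": 0.14, "Pitcher D": 0.26}
o_swing_2006 = {"Pitcher A": 0.28, "Pitcher B": 0.24, "Pitcher C": 0.15, "Pitcher E": 0.19}

xs, ys = paired_values(o_swing_2005, o_swing_2006)
print(f"{len(xs)} pitchers in both seasons, r = {pearson_r(xs, ys):.3f}")
```

A value of r near 1 would suggest the stat reflects a repeatable skill, while a value near 0 would suggest it is mostly noise. In practice you would also want a minimum number of pitches per pitcher before including him.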

David Appelman is the creator of FanGraphs.com. You can contact him via e-mail.

[Additional reader comments and retorts at Baseball Primer.]

Comments

Great article. I'm very interested to see how these numbers vary from year to year to determine their predictive value.

Carlos Silva might need an article all to himself. How is this guy getting people out? From the statistics cited above, he works exclusively in the strike zone, hitters frequently make contact, and usually that contact is solid enough to put the ball in play. So what's his deal? Is it that he has such impeccable control that hitters can only get enough of the bat on it to produce weak grounders and fly balls? Did hitters just happen to hit everything right at defenders? If he had a very low BABIP, we might draw the conclusion that he just got lucky and balls were hit right at defenders, who fielded the ball well, but at .295 he was about average. His G/F ratio of 1.83 shows he's inducing more ground balls, but not at such a high rate that coupled with good infield D it would explain his success in 2005.

It would be interesting if it were possible to determine the average velocity of batted balls in play for a particular pitcher. Clearly this is not the type of information readily available to the masses (who knows, teams might keep that kind of info), but I'm guessing that another way a guy like Silva succeeds in making it easier for his defenders to field batted balls is by locating pitches in such a way that hitters just don't hit the ball hard. Batters' Isolated Power against him might give some indication of how hard they're hitting the ball, but that still fails to explain his success as he falls pretty much in the middle of the pack with an ISO of .151.

This is great stuff; very likely this could work like BABIP, only for pitchers!

Yeah, Silva is a pretty odd pitcher. He put a greater percentage of balls in play in his 2005 season than any other active pitcher has in any year of their career. Also, taking a closer look at pitch location is something I definitely want to try and do this year.

DA: Very interesting analysis. I'm copying this post from Primer in case you don't hang out there:

It would be great if at some point David examined whether and how much Out-of-zone swing, contact rate, and foul percentage predict BABIP. If they do, and if they showed decent y-t-y correlation, these measures may allow us to detect hit-prevention ability in young pitchers.

I thought it was odd that he combined just OZone and Foul rate at the end, w/o contact (unless I misunderstood). I'd think a combined metric that includes 1) swing out of zone, 2) swing/miss, and 3) swing/foul would be even better. Perhaps add 4) strike/take as well, to include all pitches on which there's a good outcome for pitcher.

thanks David, that is really pretty interesting

Fascinating stuff.

Jonathan Papelbon just dominates the reliever category ... but it should be noted that it's a smaller sample size for him.

You mention using BIS data. I don't see mention of the time frame you use to judge the relievers. Is it just the regular season, or do you include their postseason numbers, since they had the most at stake there? Is it just 2005, or their entire major league regular season and postseason numbers?

All the stats are from the 2005 regular season only. I'll be doing this same exercise after the 2006 season to see if any of these stats remain constant from year to year.