Baseball Beat
February 09, 2009
Categorizing Starting Pitchers by Strikeout and Groundball Rates - 2008 Edition
By Rich Lederer

Strikeout and groundball rates have become my favorite way to evaluate pitchers. While I also pay close attention to walk rates, I am most interested in whether pitchers can miss bats and keep batted balls in the park.

The reasons are simple and straightforward: (1) strikeouts are the out of choice and (2) groundballs are preferred over flyballs and line drives. Except for the rare missed third strike, a strikeout always produces an out and no chance for runners to advance bases (other than a stolen base). Among batted ball types, infield flies are the least harmful, followed by groundballs, outfield flies, and line drives.

Thanks to the advancements in play-by-play data, we can even place a value on the run impact of each event. For example, according to information gathered from The Hardball Times, strikeouts have had a run impact of approximately -0.11, infield flies -0.09, groundballs 0.04, outfield flies 0.18, and line drives 0.39 per incident over the past few seasons.
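As a rough illustration of how these per-event values might be applied, the sketch below totals the run impact of a set of event counts. The run values are the approximate figures quoted above; the event counts and the function name are invented for the example and do not describe any real pitcher.

```python
# Approximate run impact per event, as quoted above from The Hardball Times.
RUN_VALUES = {
    "strikeout": -0.11,
    "infield_fly": -0.09,
    "groundball": 0.04,
    "outfield_fly": 0.18,
    "line_drive": 0.39,
}


def total_run_impact(event_counts):
    """Sum (run value x count) over every event type supplied."""
    return sum(RUN_VALUES[event] * count for event, count in event_counts.items())


# Hypothetical season line; the counts are made up for illustration only.
sample = {
    "strikeout": 150,
    "infield_fly": 20,
    "groundball": 250,
    "outfield_fly": 150,
    "line_drive": 100,
}
print(round(total_run_impact(sample), 2))
```

Swapping 50 of the outfield flies for groundballs in the sample lowers the total, which is the point of the comparison between batted ball types.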

Although groundballs generate more hits and errors than flyballs, their run impact is lower because the hits are usually limited to singles and an occasional double down the first or third base line, whereas balls in the air that turn into hits more often become doubles, triples, or home runs. By definition, groundball pitchers give up fewer flyballs and line drives. In addition, groundball rates fluctuate less than home run rates because park effects, weather, and other forms of randomness play a huge role when it comes to the outcome of long flyballs, especially among pitchers. Therefore, if you want to maintain a low home run rate, the best thing to do is to keep batted balls on the ground.

Based on the above information, it follows that just as pitchers with high strikeout rates would generally fare better than those with low rates, pitchers with high groundball rates would normally fare better than those with low rates. Furthermore, it also suggests that pitchers who combine higher strikeout and groundball rates will outperform those with lower rates.

With the foregoing in mind, I introduced the idea of categorizing pitchers by strikeout and groundball rates for the 2006 season in January 2007 (Part I: Starters/Part II: Relievers). I also generated this information for the 2007 season in March 2008 (Part I: Starters/Part II: Relievers) and will once again provide it for the 2008 campaign, beginning with starters today and relievers tomorrow.

Consistent with the methodology that I have used in the past, the universe of starters consists of all pitchers who completed 100 or more innings and started in at least 33 percent of their appearances. There were 135 pitchers who met these requirements in 2008. Among these qualifiers, the average K/BF rate was 16.90% and the average GB rate was 43.45%. The mean K and GB rates are highlighted in red in the graph below. These averages separate the starting pitchers into four quadrants.

By placing pitchers in quadrants, one can easily distinguish those with above-average strikeout and groundball rates (referred to herein as the northeast quadrant), above-average strikeout and below-average groundball rates (southeast quadrant), above-average groundball and below-average strikeout rates (northwest quadrant), and below-average groundball and strikeout rates (southwest quadrant).
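The qualification rule and the quadrant assignment can be sketched roughly as follows. This is a minimal illustration, not the actual spreadsheet behind the study: the `Pitcher` class, the sample pitchers, and the choice to treat the averages as simple means of the qualifiers' rates (with a pitcher exactly at the mean counted as below average) are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Pitcher:
    name: str
    innings: float
    games: int
    starts: int
    k_rate: float   # strikeouts per batter faced, in percent
    gb_rate: float  # groundball rate, in percent


def qualifies(p):
    """100 or more innings and starts in at least 33 percent of appearances."""
    return p.innings >= 100 and p.games > 0 and p.starts / p.games >= 0.33


def quadrant(p, mean_k, mean_gb):
    """North/south compares GB rate to the mean; east/west compares K rate."""
    north_south = "north" if p.gb_rate > mean_gb else "south"
    east_west = "east" if p.k_rate > mean_k else "west"
    return north_south + east_west


def classify(pitchers):
    """Return {name: quadrant} for every qualifying pitcher."""
    pool = [p for p in pitchers if qualifies(p)]
    mean_k = sum(p.k_rate for p in pool) / len(pool)
    mean_gb = sum(p.gb_rate for p in pool) / len(pool)
    return {p.name: quadrant(p, mean_k, mean_gb) for p in pool}


# Hypothetical pitchers; none of these lines are real data.
staff = [
    Pitcher("A", 200, 32, 32, 24.0, 52.0),
    Pitcher("B", 180, 30, 30, 22.0, 36.0),
    Pitcher("C", 150, 28, 28, 12.0, 55.0),
    Pitcher("D", 120, 25, 20, 11.0, 35.0),
    Pitcher("E", 110, 60, 5, 25.0, 50.0),   # too few starts
    Pitcher("F", 80, 14, 14, 20.0, 45.0),   # too few innings
]
result = classify(staff)
print(result)
```

Pitchers E and F are dropped before the means are computed, so they do not influence where the quadrant lines fall.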

The simple average and weighted average (by innings) ERA and RA are detailed in the table below. Whether using simple or weighted, ERA or RA, the message is crystal clear:

  • Pitchers with above-average K rates outperform those with below-average K rates.
  • Pitchers with above-average GB rates outperform those with below-average GB rates.
  • Pitchers who combine above-average K and GB rates outperform all others.
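For readers unfamiliar with the distinction, the sketch below contrasts a simple average with an innings-weighted average. The two ERA and innings figures are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def simple_mean(values):
    """Every pitcher counts equally, regardless of workload."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Each pitcher's value counts in proportion to his weight (innings here)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical two-pitcher group: ERA and innings pitched.
eras = [3.00, 4.50]
innings = [200.0, 100.0]

print(simple_mean(eras))
print(weighted_mean(eras, innings))
```

The weighted figure leans toward the pitcher who threw more innings, so the two averages can differ noticeably when workloads within a quadrant are uneven.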


[Table: 2008 SP K/GB ERA and RA by Type]


Not surprisingly, pitchers with the highest strikeout and groundball rates had the lowest average ERA, while those with the lowest K and GB rates had the highest average ERA. In the hybrid categories, pitchers with above-average strikeout and below-average groundball rates beat those with below-average K and above-average GB rates. The order of preference is the northeast quadrant, followed by the southeast, northwest, and southwest.

Looking at the outliers in the graph is one of the most interesting aspects of this study. Starting with the northeast quadrant and going clockwise, Derek Lowe, Brandon Webb, and, to a lesser extent, Ubaldo Jimenez, Roy Halladay, Chad Billingsley, A.J. Burnett, CC Sabathia, and Edinson Volquez, plus Tim Lincecum, Rich Harden, Scott Kazmir, Chris Young, Jason Bergmann, Brian Burres, Livan Hernandez, Aaron Cook, Fausto Carmona, and Tim Hudson all stand out for their extreme (good or bad) strikeout and/or groundball rates. Is there anybody who wouldn't take the outliers in the northeast quadrant over the outliers in the southwest quadrant? Lowe (3.24), Webb (3.30), Jimenez (3.99), Halladay (2.78), Billingsley (3.14), Burnett (4.07), Sabathia (2.70), Volquez (3.21), and Lincecum (2.62) all had much lower ERAs than Bergmann (5.09) and Burres (6.04).


[Graph: 2008 starting pitchers by K and GB rates]

Data and graph courtesy of David Appelman, FanGraphs.


Let's take a closer look at the results. Pitchers in the northeast, southeast, and southwest quadrants are sorted by K/BF rates. Pitchers in the northwest quadrant are listed in the order of GB rates.


[Table: northeast quadrant]


The two Cy Young award winners headline this year's northeast quadrant. Lincecum had the second-highest K/BF rate in the majors (trailing only Harden) while generating an above-average GB rate. Cliff Lee made the highly unusual leap from the dreaded southwest quadrant in 2006 (14.63%, 32.70%) and 2007 (14.89%, 35.28%) to the more tony northeast quadrant in one fell swoop, primarily owing to a harder fastball and improved movement that produced a career-best O-Swing%.

Burnett, Doug Davis, Dan Haren, Felix Hernandez, Roy Oswalt, Sabathia, and Webb have inhabited the northeast quadrant in each of our studies covering the past three seasons. If asked, "Which one is not like the others?" I'm confident that we would all answer, "Doug Davis." The 33-year-old lefthander has been near the bottom of the NE rankings in all three campaigns, barely exceeding the hurdle in both metrics each time. Davis also had the highest walk rate of this otherwise elite group in 2006, 2007, and 2008. He is what he is: an ever-so-slightly better-than-average starting pitcher who gives up his share of hits and walks while doing a reasonably good job of missing bats and keeping the ball in the yard.

There have been just nine cases in the past three seasons of pitchers combining a 20% K rate with a 50% GB rate. King Felix is the only pitcher to accomplish this feat all three years. He posted the same K rate in 2008 as in 2007, but his GB rate dropped from 60.83% to 52.14%. Nonetheless, his three-peat is impressive, especially when you consider that he won't turn 23 until after the 2009 season starts.

Burnett is a two-time member of the 20-50 club, coming up just short on the GB side of the equation in 2008. Halladay joined the ranks this year, whiffing at least 20% for the first time since 2001. Known as a groundball pitcher, Roy was part of the northwest quadrant the previous two seasons.

Potential breakout candidates and fantasy sleepers include Jorge de la Rosa, Clayton Kershaw, Manny Parra, and Andrew Miller. Besides above-average K and GB rankings, these pitchers share two things in common: all four youngsters are southpaws with a high walk rate.

I am intrigued by de la Rosa, who was 5-2 with a 2.45 ERA and compelling peripheral stats in August and September, a period covering 11 games and nine starts (including five at Coors Field) and 58.2 innings.

Miller (6th) and Kershaw (7th) were selected back-to-back by the Tigers and Dodgers in the first round of the 2006 draft. Miller (University of North Carolina) was widely considered the top college pitcher and Kershaw (Highland Park HS, Dallas) the best high school hurler. Detroit traded Miller and Cameron Maybin (and four others) to Florida for Miguel Cabrera and Dontrelle Willis in December 2007, while Los Angeles has held on to Kershaw. Both lefties pitched an almost identical number of innings in the majors last season with the soon-to-be 21-year-old Kershaw getting the better of Miller, who turns 24 in May. Note that Clayton's K/BF and GB rates were also higher than Andrew's and his BB rate (11.06% to 11.38%) was slightly better as well.

             IP     H    R    ER  HR   BB   SO   ERA
Kershaw     107.7  109   51   51  11   52  100   4.26
Miller      107.3  120   78   70   7   56   89   5.87
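The ERA column can be checked with the standard formula, nine times earned runs divided by innings pitched. The sketch below assumes that the 107.7 and 107.3 in the IP column are rounded versions of 107 2/3 and 107 1/3 innings.

```python
def era(earned_runs, innings):
    """Earned run average: earned runs allowed per nine innings."""
    return 9 * earned_runs / innings


# Assumption: 107.7 and 107.3 in the table stand for 107 2/3 and 107 1/3 innings.
kershaw_era = era(51, 107 + 2 / 3)
miller_era = era(70, 107 + 1 / 3)

print(round(kershaw_era, 2), round(miller_era, 2))
```

Passing total runs instead of earned runs to the same function gives RA, which is where the gap between the two widens further, since Miller allowed eight unearned runs and Kershaw none.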


[Table: southeast quadrant]


Harden stands out as the only starting pitcher in the majors with a K/BF over 30%. Over the past three years, just one starter per season has achieved this status with Francisco Liriano (30.44% in 2006) and Erik Bedard (30.15% in 2007) preceding Harden in the 30-something club. Liriano and Bedard fell victim to injuries. Following in their footsteps wouldn't be something particularly new to Harden, now would it?

Kazmir has ranked second, first, and second in the southeast quadrant over the past three seasons, although it is a bit disturbing to note that his GB rate fell more than 10 percentage points below his 2006 and 2007 levels. Josh Beckett dropped out of the northeast and into the southeast grouping for the first time while slightly topping his K rate from his outstanding summer in 2007 (23.60%).

Jake Peavy, Wandy Rodriguez, Gil Meche, and Ian Snell have been members of the northeast or southeast quadrant in each of the past three seasons, while Ervin Santana, Javier Vazquez, Cole Hamels, Chris Young, Johan Santana, Ted Lilly, Oliver Perez, Jered Weaver, Matt Cain, Ben Sheets, Aaron Harang, Bronson Arroyo, and Justin Verlander have been firmly ensconced in the SE quadrant for three years running.


[Table: northwest quadrant]


The northwest quadrant always produces a mixed bag of pitchers. Unlike their counterparts in the SE, who thrive on strikeouts, hurlers in the NW succeed by inducing grounders and keeping the ball in the park.

Carmona tops the list for the second consecutive year. Although Fausto's GB rate exceeded the rarefied 60% mark once again, his K rate fell off the cliff (from a reasonable 15.59% in 2007 when he finished fourth in the AL CYA voting to a dangerously low 10.56% in 2008). Worse yet, his K/BB rate plummeted from 2.25 to 0.83. The good news for Indians fans is that Carmona just turned 25 in December so he still has time to get his mojo back.

Paul Maholm, Carlos Zambrano, Odalis Perez (yes, Odalis Perez), Adam Wainwright, and Armando Galarraga (whose K and GB rates are essentially league average) are within hailing distance of meeting the minimum standards of the NE quadrant. With a solid K rate and a top ten GB%, the 26-year-old Maholm deserves attention as a pitcher coming into his own. Zambrano fell out of the NE for the first time since this study began, pitching to contact more often than before while improving his walk rate to a level not seen since his stellar season in 2004. Meanwhile, don't bet on Galarraga to improve his W-L record or ERA as his BABIP of .247 was unsustainably low.

At the other end of the spectrum, Livan Hernandez and Kyle Kendrick aren't long for the majors with K rates below 10%. A free agent, Hernandez may find it difficult to convince an employer to allow him to wear a big league uniform in 2009, even at the minimum salary.


[Table: southwest quadrant]


Repeating my comments from each of the past two series, "[The southwest] is the quadrant that you want to avoid. It is inhabited by some of the worst starters in the game. If you fail to miss bats and don't keep the ball on the ground when it is put into play, you are going to run into trouble." The only way to survive in this quadrant is to put up as close to league-average K and/or GB rates as possible (see Brandon Backe, Vicente Padilla, Todd Wellemeyer, Matt Garza, Gavin Floyd, Jorge Campillo, Kevin Millwood, and Andy Sonnanstine) or to throw strikes, maintain a low walk rate, and duck when the ball is put into play. However, all of these types of pitchers live on the edge with very little margin for error.

As I am wont to say, "When it comes to evaluating pitchers, I would rather know their strikeout and groundball rates than their ERA. Throw in walk rates and you have almost everything you need to know about a pitcher. Focusing on these components gives one a much more comprehensive understanding of a pitcher's upside and downside than looking at a single metric such as ERA."

Tomorrow: Categorizing Relievers by Strikeout and Groundball Rates.

Comments

This is one of my favorite things you do on this site. Thanks.

I may have missed it, but have you considered using K/BB ratio rather than K/BF ratios for the graph? Or perhaps doing the K/BB ratio on a separate graph with GBs? Is there a reason that K/BF ratio is better or more accurately reflects effectiveness?

For example, looking at the names, Sonnanstine is slightly below average in both categories you have, so he ends up in the "worst" quadrant, but his K/BB ratio is 3.35 because he walks so few. It seems to me he had a better year than Ian Snell who ends up in the southeast quadrant because of his K/BF rate although his K/BB rate was 1.52, or Doug Davis in the "best" quadrant, but who also had a K/BB rate under 2 (1.75).

"Meanwhile, don't bet on Galarraga to improve his W-L record or ERA as his BABIP of .247 was unsustainably low."

But his infield defense is improving this year, with Inge and Everett joining Polanco.

This is my favorite set of graphs of the year every year. I really liked it when you did the minors and some of the standout prospects were at the ends of the charts. Thank you.

It's amazing to see similar numbers/peripherals from Andy Pettitte and Jon Lester, then realize Pettitte had an ERA nearly a run and a third higher. Maybe it was defense, "luck", or something else, but Pettitte appears to have a solid chance of improvement in 2009.

Kevin,

I'd guess it was mostly a mixture of defense and luck, with some skill on the side (Lester does have better stuff). The luck may correct, but Lester should retain a defensive edge, despite some upgrades by the Yanks. Still, I'm glad to have Pettitte back. I'll take another ~200 innings of what he can provide, yessir.

Great stuff! Thanks.

I've done something like this the past few seasons (and I probably got the idea for it after reading one of these posts in a previous year).

I use BB/9 in addition to K-rate and GB%. I calculate Z-scores for each pitcher in these three categories and then sum each pitcher's individual Z-Scores to get a Total Z-Score.

I created a GSheet using this method (and using your 100 IP for starters and 30 IP for relievers qualifiers). Check it out: http://spreadsheets.google.com/pub?key=pcwHMrO9Xyn-nMRwxfxpiRQ

Good job, Ryan. Although my preference would be to use K/BF and BB/BF, using K/9 and BB/9 won't change the results all that much.

Your Z-Score equally weights the three categories, which is why groundball pitchers are faring so well. Try running your Z-Scores using a 3-2-1 weighting (K/BB/GB), and I bet the rankings will do an even better job of capturing the best pitchers.
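A weighted Z-Score total along these lines might look like the sketch below. It is only an illustration of the idea, not the spreadsheet's actual formula: the use of the population standard deviation, the negation of the walk Z-Score (so that fewer walks scores higher), and the three sample pitchers are all assumptions.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to mean 0 and standard deviation 1.

    Assumes the values are not all identical (non-zero spread).
    """
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


def total_z(k_rates, bb_rates, gb_rates, weights=(3, 2, 1)):
    """Weighted sum of K, BB, and GB Z-Scores for each pitcher."""
    w_k, w_bb, w_gb = weights
    z_k = z_scores(k_rates)
    z_bb = z_scores(bb_rates)
    z_gb = z_scores(gb_rates)
    # Assumption: the walk Z-Score is subtracted because a lower walk rate is better.
    return [w_k * k - w_bb * bb + w_gb * gb for k, bb, gb in zip(z_k, z_bb, z_gb)]


# Hypothetical pitchers, best to worst in every category.
k_rates = [20.0, 15.0, 10.0]
bb_rates = [2.0, 3.0, 4.0]
gb_rates = [50.0, 45.0, 40.0]

scores = total_z(k_rates, bb_rates, gb_rates)
print(scores)
```

Passing `weights=(1, 1, 1)` reproduces the equal weighting, which makes it easy to compare the two sets of rankings side by side.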

Done!

I'd be interested in using K/BF and BB/BF, but I can't find BF anywhere at Fangraphs. Am I blind? Or do you get that data elsewhere?

Is there any way to get a GSheet for this data or possibly forward me a CSV/XLS etc?

Your work is most appreciated.