Baseball Beat
May 18, 2010
The Most Underappreciated Batted Ball Type
By Rich Lederer

Call them pop-ups, pop flies, or infield flies. While these batted balls are one and the same, they are not outfield fly balls despite getting lumped together by many baseball sites and analysts. Like Rodney Dangerfield, they get no respect.

Infield fly balls are converted into outs about 99% of the time. In other words, only 1% of all pop-ups become hits. By comparison, roughly 75% of all line drives, 25% of ground balls, and 20% of fly balls result in hits (including home runs). Line drives also have the highest run value, followed by fly balls and ground balls.

If pop-ups are routinely turned into outs with no advancement by base runners, then they should be treated more like strikeouts for the purpose of performance analysis than anything else. Unlike line drives, fly balls and ground balls, pop-ups and strikeouts have no (or negative) run value.

When it comes to breaking out batted balls, I favor Baseball Prospectus over FanGraphs. My preference is not due to the source (BP uses Gameday/MLB Advanced Media and FG uses Baseball Info Solutions) but rather to the fact that the former categorizes pop-ups as a separate batted ball event (POP) whereas the latter includes infield fly balls (IFFB) as a subset of fly balls (FB). (You can read Colin Wyers' article, David Appelman's rebuttal, and a thorough discussion at The Book if you are interested in how this data is collected.)

Using BP's custom statistic reports, let's take a look at the four different batted ball types as a percentage of all batted balls for 2009 and 2010.

[Table image: the four batted ball types as a percentage of all batted balls, 2009 and 2010]

As shown, pop-ups account for approximately 7%-8% of all batted balls. While this rate is a fraction of the other batted ball events, it is worth knowing because pop flies are almost always converted into outs.

Batted balls represent about 72% of all plate appearances, with walks (9%), hit by pitches (1%), and strikeouts (18%) accounting for the balance.

[Table image: plate appearance outcomes as a percentage of all plate appearances]

While there is a lot of interesting information in the table above, I would like to focus on POP and SO rates as it seems to me that these "automatic outs" could be combined when analyzing pitchers (and hitters, for that matter). Importantly, inducing infield flies appears to be a repeatable skill, much like strikeouts and ground balls, although perhaps not to the same extent.

As shown, SO and POP total about 23.5% of all plate appearances. All else equal, I believe that pitchers with higher POP rates (particularly as a percentage of plate appearances that end in neither a strikeout nor a ground ball) should be preferred over those with lower rates. If nothing else, it is my hope that such pitchers may gain greater respect from those who overlook them now.
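As a rough check on that combined figure, the rounded league rates quoted above can be chained together. This is only a sketch: the 7.5% pop-up share is an assumed midpoint of the 7%-8% range, so the result is approximate rather than an exact reproduction of the table.

```python
# Rounded league-wide shares quoted in the text.
batted_ball_share_of_pa = 0.72  # batted balls as a share of plate appearances
so_share_of_pa = 0.18           # strikeouts as a share of plate appearances
pop_share_of_batted = 0.075     # ASSUMPTION: midpoint of the 7%-8% pop-up range

# Convert pop-ups from a share of batted balls to a share of plate appearances.
pop_share_of_pa = pop_share_of_batted * batted_ball_share_of_pa

# Combined "automatic out" rate per plate appearance.
so_plus_pop = so_share_of_pa + pop_share_of_pa

print(f"POP per PA:    {pop_share_of_pa:.1%}")
print(f"SO+POP per PA: {so_plus_pop:.1%}")
```

With these rounded inputs, pop-ups come to a little over 5% of plate appearances, which puts the combined rate within rounding distance of the 23.5% cited.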

While I want to like SIERA for many of its innovations, I'm not convinced that "pop-ups represent a potential problem for the pitcher in the future."

"Pop-up rate was allowed to negatively affect SIERA because it is a symptom of the pitcher throwing the ball that generates an upward trajectory, which could lead to an increase in home runs. A pitcher’s skills are throwing strikes, making hitters miss, and throwing with angles and spins such that the trajectory of the ball is downward when it hits the bat. A popup almost always represents an out, but it also represents a potential problem for the pitcher in the future."

Moving forward, here are the 2009 rankings of all pitchers with 100 or more innings and an above-average SO + POP rate (SO plus POP divided by PA).

Num NAME PA BB HBP SO FB GB LD POP SO+POP
1 Rich Harden 609 67 6 171 107 152 67 42 34.98%
2 Clayton Kershaw 701 91 1 185 117 177 78 55 34.24%
3 Justin Verlander 982 63 6 269 193 241 151 58 33.30%
4 Tim Lincecum 905 68 6 261 145 283 109 35 32.71%
5 Jake Peavy 410 34 1 110 74 120 48 23 32.44%
6 Zack Greinke 915 51 4 242 195 255 116 53 32.24%
7 Javier Vazquez 874 44 4 238 176 255 126 34 31.12%
8 Johan Santana 701 46 3 146 144 193 99 71 30.96%
9 Jered Weaver 882 66 4 174 220 202 118 99 30.95%
10 Scott Baker 828 48 4 162 209 214 100 91 30.56%
11 Jon Lester 843 64 3 225 142 269 108 32 30.49%
12 Jonathan Sanchez 710 88 6 177 149 187 68 36 30.00%
13 Yovani Gallardo 793 94 5 204 146 231 84 30 29.51%
14 Tommy Hanson 522 46 5 116 105 146 68 37 29.31%
15 Ricky Nolasco 785 44 2 195 176 218 116 35 29.30%
16 Dan Haren 909 38 4 223 192 285 128 40 28.93%
17 Ted Lilly 706 36 2 151 197 182 87 51 28.61%
18 Jorge De La Rosa 799 83 9 193 137 239 103 35 28.54%
19 Cole Hamels 814 43 5 168 157 261 117 63 28.38%
20 Matt Garza 861 79 11 189 173 233 126 51 27.87%
21 Max Scherzer 741 63 10 174 158 215 90 32 27.80%
22 Wandy Rodriguez 849 63 5 193 176 276 96 41 27.56%
23 CC Sabathia 938 67 9 197 178 296 132 59 27.29%
24 Josh Johnson 855 58 6 191 126 307 125 42 27.25%
25 Chad Billingsley 823 86 7 179 143 262 101 45 27.22%
26 Aaron Harang 703 43 4 142 156 191 121 47 26.88%
27 Carlos Zambrano 733 78 9 152 136 229 87 43 26.60%
28 Adam Wainwright 970 66 3 212 150 360 136 45 26.49%
29 Roy Halladay 963 35 5 208 163 366 141 45 26.27%
30 Joe Blanton 837 59 8 163 181 257 116 56 26.16%
31 Josh Beckett 883 55 7 199 165 299 126 32 26.16%
32 Felix Hernandez 977 71 8 217 164 367 113 38 26.10%
33 Ubaldo Jimenez 914 85 10 198 125 344 112 40 26.04%
34 Barry Zito 818 81 8 154 163 235 120 59 26.04%
35 Francisco Liriano 609 65 6 122 123 178 80 36 25.94%
36 Randy Wolf 862 58 6 160 211 263 103 61 25.64%
37 Chad Gaudin 664 76 8 139 132 199 79 31 25.60%
38 Edwin Jackson 890 70 5 161 194 267 128 66 25.51%
39 A.J. Burnett 896 97 10 195 184 259 117 33 25.45%
40 Scott Richmond 610 59 0 117 144 151 101 38 25.41%
41 Matt Cain 886 73 3 171 211 263 112 53 25.28%
42 John Danks 839 73 5 149 170 282 98 62 25.15%
43 Brett Anderson 734 44 3 150 132 280 91 34 25.07%
44 Ryan Dempster 842 65 6 172 171 296 95 39 25.06%
45 Scott Kazmir 647 60 6 117 160 160 99 45 25.04%
46 Roy Oswalt 757 42 8 138 149 265 104 51 24.97%
47 David Hernandez 462 46 1 68 130 109 62 46 24.68%
48 J.A. Happ 685 56 5 119 166 204 86 50 24.67%
49 Justin Masterson 568 60 8 119 96 213 51 21 24.65%
50 Chris Carpenter 750 38 7 144 110 319 93 39 24.40%
51 Gavin Floyd 797 59 2 163 154 263 125 31 24.34%
52 Cliff Lee 969 43 5 181 203 325 159 53 24.15%
53 Joba Chamberlain 709 76 12 133 135 222 93 38 24.12%
54 Ervin Santana 614 47 10 107 155 178 77 39 23.78%
55 Johnny Cueto 740 61 14 132 158 230 103 43 23.65%
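The last column can be reproduced directly from the counting stats. A minimal sketch, using three rows copied from the table above; the function name `so_plus_pop_rate` is only illustrative.

```python
def so_plus_pop_rate(so: int, pop: int, pa: int) -> float:
    """Return strikeouts plus pop-ups as a fraction of plate appearances."""
    return (so + pop) / pa


# (name, PA, SO, POP) taken from the 2009 table.
rows = [
    ("Rich Harden", 609, 171, 42),
    ("Clayton Kershaw", 701, 185, 55),
    ("Jered Weaver", 882, 174, 99),
]

for name, pa, so, pop in rows:
    print(f"{name}: {so_plus_pop_rate(so, pop, pa):.2%}")
```

Weaver's line shows how a pitcher with a middling strikeout total can still rank in the top ten on the strength of his pop-ups.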

Among all pitchers with 100 or more innings in 2009, Jered Weaver (15.5%), Scott Baker (14.8%), Tim Wakefield (14.1%), Johan Santana (14.0%), David Hernandez (13.3%), Clayton Kershaw (12.9%), Micah Owings (11.6%), Rich Harden (11.4%), David Huff (11.1%), and Todd Wellemeyer (11.1%) posted the highest pop-up rates as a percentage of batted balls. Weaver (11.2%), Baker (11.0%), Wakefield (10.8%), Santana (10.1%), Hernandez (10.0%), Huff (9.1%), Owings (8.7%), Wellemeyer (8.4%), Jamie Moyer (8.3%), and Jeremy Guthrie (8.0%) produced the most infield flies as a percentage of plate appearances.
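The two denominators give noticeably different numbers, so it helps to see both computed side by side. This sketch assumes that batted balls are simply FB + GB + LD + POP as listed in the 2009 table; the helper name `pop_rates` is hypothetical.

```python
def pop_rates(pa: int, fb: int, gb: int, ld: int, pop: int) -> tuple[float, float]:
    """Return (POP per batted ball, POP per plate appearance)."""
    batted_balls = fb + gb + ld + pop  # ASSUMPTION: the four BP batted ball types
    return pop / batted_balls, pop / pa


# Counting stats from the 2009 table: PA, FB, GB, LD, POP.
weaver = pop_rates(pa=882, fb=220, gb=202, ld=118, pop=99)
baker = pop_rates(pa=828, fb=209, gb=214, ld=100, pop=91)

print(f"Jered Weaver: {weaver[0]:.1%} of batted balls, {weaver[1]:.1%} of PA")
print(f"Scott Baker:  {baker[0]:.1%} of batted balls, {baker[1]:.1%} of PA")
```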

Importantly, the rankings of pitchers by SO + POP and POP rates are not meant to identify the most valuable pitchers as neither takes into consideration BB, HBP, or HR rates. However, I wonder if Fielding Independent Pitching (FIP) couldn't be improved by combining SO and POP in its formula, which is typically defined as (HR*13+(BB+HBP-IBB)*3-K*2)/IP plus a league-specific factor (usually around 3.2) to create an equivalent ERA number.
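For reference, the FIP formula quoted above translates into a few lines of code. This is a sketch of the formula as stated here, not an official implementation; the default league factor of 3.2 is the approximate value mentioned in the text, and the sample pitcher line is hypothetical.

```python
def fip(hr: int, bb: int, hbp: int, ibb: int, k: int, ip: float,
        league_factor: float = 3.2) -> float:
    """Fielding Independent Pitching, per the formula quoted in the text."""
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp - ibb) - 2 * k) / ip + league_factor


# HYPOTHETICAL season line, for illustration only.
example = fip(hr=20, bb=60, hbp=5, ibb=3, k=180, ip=200.0)
print(f"FIP: {example:.2f}")
```

Note that pop-ups appear nowhere in the inputs, which is the gap this article is pointing at.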

The formula for FIP would need to be tinkered with to account for the effect of POP, as simply adding POP to SO wouldn't work. The multipliers or the league-specific factor would need to be changed to equate the newly constructed FIP with ERA.

Here are the top ten leaders for 2010 (among pitchers with 40 or more IP):

Num NAME PA BB HBP SO FB GB LD POP SO+POP
1 Tim Lincecum 218 15 0 69 28 70 26 10 36.24%
2 Clayton Kershaw 197 29 3 52 29 51 14 19 36.04%
3 Jered Weaver 205 12 0 59 47 56 18 13 35.12%
4 Colby Lewis 212 21 3 54 43 50 22 20 34.91%
5 Tommy Hanson 204 13 3 56 50 52 18 12 33.33%
6 Phil Hughes 170 15 0 42 35 42 23 13 32.35%
7 Brandon Morrow 187 27 3 54 33 39 25 6 32.09%
8 Yovani Gallardo 228 29 0 61 27 66 37 9 30.70%
9 Justin Verlander 203 20 1 46 33 60 28 16 30.54%
10 Jonathan Sanchez 178 20 2 45 38 45 19 9 30.34%

Tim Lincecum, Clayton Kershaw, Jered Weaver, and Justin Verlander are the only pitchers who ranked in the top ten in both 2009 and 2010. Tommy Hanson (14th in 2009 and 5th in 2010), Yovani Gallardo (13th and 8th), and Jonathan Sanchez (12th and 10th) rank in the top 15 in both years.

The former clearly has the greater influence on SO + POP, yet the latter adds value at the margin. The ability to induce pop-ups should not be dismissed when evaluating pitchers. Furthermore, it is my belief that certain pitchers have a knack for allowing fewer home runs as a percentage of outfield fly balls than the league average. Saying a pitcher is "lucky" because he has a lower HR/FB rate than the league average is simplistic, as is resorting to xFIP as a standalone measure (especially when a pitcher has a sufficiently large sample size to evaluate). By the same token, labeling a pitcher with a below-average BABIP "lucky" may not be totally accurate either.

The analytical community has come a long way on batted ball info. Paying more attention to pop-ups would be instructive in my opinion. Digging deeper into pitcher-batter results as they relate to pitch types, pitch sequencing, ball-strike counts, and bases occupied could lead us to solve some of the mysteries previously ascribed to luck and randomness. For example, pitchers with "plus" changeups may induce more than their fair share of pop-ups and lazy fly balls.

More than anything, I hope this article leads to additional discussion and research with respect to analyzing pitchers.

* * *

Update: Tom Tango sent me an email with a link to Tango's Lab: Batted Ball FIP. He pointed me to posts #8 and #9. Leave it to Tangotiger to have developed a formula for batted ball FIP (bbFIP). The formula is as follows:

ERA = 11*[(BB+LD)-(SO+iFB)]/PA + 3*(oFB-GB)/PA + 4.2

Note: the league-specific factor may differ depending on the data source

A line drive is like a walk, an infield fly is like a strikeout, and the gap between an outfly and a groundball is about one-fourth the gap between BB and SO.
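To see the formula in action, here is a minimal sketch applied to two rows of the 2009 table above. It assumes that BP's FB column corresponds to outfield flies (oFB) and POP to infield flies (iFB), and it uses the 4.2 constant as given, even though the note above says that factor may differ by data source. The results are therefore illustrative, not Tango's published figures.

```python
def bb_fip(pa: int, bb: int, so: int, ld: int, ifb: int, ofb: int, gb: int,
           league_factor: float = 4.2) -> float:
    """Batted ball FIP, per the formula quoted in the text."""
    if pa <= 0:
        raise ValueError("plate appearances must be positive")
    walk_like = bb + ld          # a line drive is treated like a walk
    strikeout_like = so + ifb    # an infield fly is treated like a strikeout
    return (11 * (walk_like - strikeout_like) / pa
            + 3 * (ofb - gb) / pa
            + league_factor)


# 2009 table rows; ASSUMPTION: FB -> ofb and POP -> ifb.
harden = bb_fip(pa=609, bb=67, so=171, ld=67, ifb=42, ofb=107, gb=152)
halladay = bb_fip(pa=963, bb=35, so=208, ld=141, ifb=45, ofb=163, gb=366)

print(f"Rich Harden bbFIP:  {harden:.2f}")
print(f"Roy Halladay bbFIP: {halladay:.2f}")
```

The coefficients also bear out the "one-fourth" remark: swapping a strikeout for a walk moves the numerator by 22, while swapping a ground ball for an outfield fly moves it by 6.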

In post #16, Tangotiger lists the results by root mean square error (RMSE) of bbFIP (1.05), SIERA (1.05), and FIP (1.11) and concludes "I’d say that bbFIP is a worthy addition here. Not to mention that it’s in the same spirit as FIP (linear and simple coefficients)."
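For readers unfamiliar with the yardstick, RMSE measures how far an estimator's values sit from actual ERA on average, with larger misses penalized more heavily. A minimal sketch follows; the sample numbers are hypothetical and are not the data behind the figures quoted above.

```python
import math


def rmse(predicted: list[float], actual: list[float]) -> float:
    """Root mean square error between two equal-length sequences."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# HYPOTHETICAL estimator values and actual ERAs for three pitchers.
estimated = [3.5, 4.0, 4.5]
actual_era = [3.0, 4.0, 5.0]

print(f"RMSE: {rmse(estimated, actual_era):.3f}")
```

A lower RMSE means the estimator tracks ERA more closely across the sample, which is the sense in which 1.05 beats 1.11.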

If you have the time and interest, go ahead and read the entire discussion. Brian Cartwright goes into even more detail with numerous tables listing the predictive value of run estimators. As Brian notes, it is important to distinguish between "describing the past vs. predicting the future." I agree. Some skills are more repeatable than others. Guy cautions, "The farther forward you look, the more the skills change/deteriorate." He also warns against "survivor bias" in these studies. Excellent points all.

Comments

some interesting thoughts. before going any further i would want to see some kind of evidence that a pitcher has some semblance of control over generating pop flies. a glance at those numbers seems to me that it could just be attributed to chance.

if a pitcher can indeed exhibit some degree of control, is it just a byproduct of being a flyball pitcher or is it more the result of hitters getting beaten and getting jammed on the hands?

Amen. I agree with almost everything you said. I usually like to set baselines for HR/FB rate for each pitcher but these obviously require larger sample sizes than K rates, for example. I would have liked to have seen more tables where you just isolated pop-up % and more talk about the consistency of pop-up % from year-to-year without lumping it in with strikeouts.

MGL did the heavy lifting on this subject six years ago.

Not surprisingly, a pitcher's FB and GB as a percentage of his total BIP (essentially his G/F ratio) are very much within a pitcher's control and appear to be relatively stable from year to year. The number of IF pop flies and to some extent OF pop flies, as a percentage of all non-GB BIP, are somewhat a unique function of the pitcher as well. In other words, good pitchers may tend to get more pop flies than bad pitchers, as a percentage of their total non-ground ball balls in play.

The question is not so much whether pitchers have "control" over their pop flies, as a function of non-GB BIP, but how much "control" they have, given a particular sample size. Pitchers have "control" over everything. How much control is the important thing. For example, the reason we use FIP or DIPS ERA is not because pitchers have no control over non-HR BIP, but because they have much less control over them than they do BB, SO, and HR. And the reason we use xFIP is not because pitchers have no control over their HR/FB, but because they have much less control over that than they do HR per BIP or per PA.

If you want to lump the pop fly in with K in an FIP or DIPS formula, you better make sure that pitcher control over pop flies is similar to that of the BB, SO, and HR, and greater than that of the other non-HR BIP.

So how much control does a pitcher have to have over an outcome like pop flies, to include it in an FIP or DIPS formula? There is no clear answer. It depends on the sample size of the data. A lot of people do not understand that DIPS and FIP work better on smaller sample sizes and that non-FIP and DIPS formulas like ERC, BaseRuns and even regular old ERA or RA work better than DIPS or FIP with very large samples. In between, take your pick. And when I say, "Works better," I mean in terms of predicting future RA or ERA or in describing a pitcher's true talent.

So before we talk about how important a pitcher's pop fly percentage is, and whether we want to include it in a FIP or DIPS formula, we need to quantify that "control" by looking at something like year to year correlations. My guess is that year to year correlation is not going to be nearly as high as it is for BB, SO, and even HR, and I would hesitate to include it in an FIP or DIPS formula, at least for anything but a small sample of data.

MGL: Thanks. I agree with you on everything, including the fact that ERC, ERA, and RA "work better than DIPS or FIP with very large samples" (despite the reliance by many on FIP and xFIP in such cases).

As it relates to pop flies, I believe such batted balls are, at a minimum, useful in describing the past. They may be less helpful in predicting the future, especially compared to SO and BB. I'm a K-BB-GB (in that order) guy myself, as the ability to miss bats trumps all, but I think there is value at the margin in paying attention to POP/IFFB/iFB as well.

I agree with Rich that it makes sense to include PU's along with strikeouts. Sure, for predicting future performance we would like to separate the two as strikeouts are more predictive than PU's; however, for retrospective performance the two have essentially equal value and should be viewed similarly.

Great article. I've been seeing for years discussions about how Barry Zito induces pop outs, and this article clearly validates that thinking.

As long as FIP does not account for pop outs, it will fall down on the job in analyzing pitchers like Zito, who most sabers have been denigrating for years. I wonder how many of the Tom Tippett "Crafty Lefty" category had high SO+POP%.

I'm not really up on the latest and greatest of these, so this is probably a stupid question: I was wondering how FanGraphs' tERA would fit into the discussion of comparing SIERA, bbFIP, and FIP.

Lastly, it is interesting that all the Giants' main starters were in the top 41: Lincecum, Sanchez, Zito, Cain. I know Lincecum and Sanchez have the K's to get high on the list, but Zito certainly doesn't and Cain, while good, is not that great at striking out a lot. But I guess both were good at getting pop outs, though not at league-leading percentages, to boost their ranking for SO+POP%.

tRA is probably the best DIPS metric out there, because it includes all of the same stuff as FIP but also a pitcher's batted ball rates. So guys like Zito will be rated more in line with their abilities.

It's not as predictive as FIP or some of the other stats, but that isn't necessarily what we are looking for in DIPS.