The Baseball Analysts: The Most Under Appreciated Batted Ball Type

By Rich Lederer

Call them pop-ups, pop flies, or infield flies. Whatever the name, these batted balls are one and the same, and they are not outfield fly balls, despite being lumped in with them by many baseball sites and analysts. Like Rodney Dangerfield, they get no respect.

Infield fly balls are converted into outs about 99% of the time. In other words, only 1% of all pop-ups become hits. By comparison, roughly 75% of all line drives, 25% of ground balls, and 20% of fly balls result in hits (including home runs). Line drives also have the highest run value, followed by fly balls and ground balls.

If pop-ups are routinely turned into outs with no advancement by base runners, then they should be treated more like strikeouts for the purpose of performance analysis than anything else. Unlike line drives, fly balls and ground balls, pop-ups and strikeouts have no (or negative) run value.

When it comes to breaking out batted balls, I favor Baseball Prospectus over FanGraphs. My preference is not due to the source (BP uses Gameday/MLB Advanced Media and FG uses Baseball Info Solutions) but because BP categorizes pop-ups as a separate batted ball event (POP), whereas FG includes infield fly balls (IFFB) as a subset of fly balls (FB). (You can read Colin Wyers' article, David Appelman's rebuttal, and a thorough discussion at The Book if you are interested in how this data is collected.)

Using BP's custom statistic reports, let's take a look at the four different batted ball types as a percentage of all batted balls for 2009 and 2010.

[Table: the four batted ball types (FB, GB, LD, POP) as a percentage of all batted balls, 2009 and 2010]

As shown, pop-ups account for approximately 7%-8% of all batted balls. While this rate is a fraction of the other batted ball events, it is worth knowing because pop flies are almost always converted into outs.

Batted balls represent about 72% of all plate appearances with walks (9%), hit by pitches (1%), and strikeouts (18%) accounting for the balance.

[Table: plate appearance outcomes (BB, HBP, SO, and the batted ball types) as a percentage of all plate appearances]

While there is a lot of interesting information in the table above, I would like to focus on POP and SO rates as it seems to me that these "automatic outs" could be combined when analyzing pitchers (and hitters, for that matter). Importantly, inducing infield flies appears to be a repeatable skill, much like strikeouts and ground balls, although perhaps not to the same extent.

As shown, SO and POP total about 23.5% of all plate appearances. All else equal, I believe that pitchers with higher POP rates (particularly as a percentage of outcomes other than strikeouts and ground balls) should be preferred over those with lower rates. If nothing else, it is my hope that such pitchers may gain greater respect from those who overlook them now.
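As a rough check on that total, the arithmetic can be sketched from the approximate shares quoted in the text (batted balls at about 72% of plate appearances, strikeouts at about 18%, pop-ups at 7%-8% of batted balls). These are rounded figures rather than the exact table values, and the function name is only illustrative.

```python
# Approximate league-wide shares quoted in the text (rounded, not exact table values).
BATTED_BALL_SHARE_OF_PA = 0.72  # batted balls as a share of plate appearances
STRIKEOUT_SHARE_OF_PA = 0.18    # strikeouts as a share of plate appearances


def automatic_out_share(pop_share_of_batted_balls: float) -> float:
    """Return SO + POP as a share of all plate appearances."""
    pop_share_of_pa = BATTED_BALL_SHARE_OF_PA * pop_share_of_batted_balls
    return STRIKEOUT_SHARE_OF_PA + pop_share_of_pa


if __name__ == "__main__":
    # Try the low end, midpoint, and high end of the 7%-8% pop-up range.
    for pop_share in (0.07, 0.075, 0.08):
        total = automatic_out_share(pop_share)
        print(f"POP at {pop_share:.1%} of batted balls: SO + POP = {total:.1%} of PA")
```

With these rounded inputs the total lands between roughly 23% and 24% of plate appearances, in line with the 23.5% figure.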

While I want to like SIERA for many of its innovations, I'm not convinced that "pop-ups represent a potential problem for the pitcher in the future."

Pop-up rate was allowed to negatively affect SIERA because it is a symptom of the pitcher throwing the ball that generates an upward trajectory, which could lead to an increase in home runs. A pitcher’s skills are throwing strikes, making hitters miss, and throwing with angles and spins such that the trajectory of the ball is downward when it hits the bat. A popup almost always represents an out, but it also represents a potential problem for the pitcher in the future.

Moving forward, here are the 2009 rankings of all pitchers with 100 or more innings and an above-average SO + POP rate (SO plus POP divided by PA).

Num	NAME	PA	BB	HBP	SO	FB	GB	LD	POP	SO+POP
1	Rich Harden	609	67	6	171	107	152	67	42	34.98%
2	Clayton Kershaw	701	91	1	185	117	177	78	55	34.24%
3	Justin Verlander	982	63	6	269	193	241	151	58	33.30%
4	Tim Lincecum	905	68	6	261	145	283	109	35	32.71%
5	Jake Peavy	410	34	1	110	74	120	48	23	32.44%
6	Zack Greinke	915	51	4	242	195	255	116	53	32.24%
7	Javier Vazquez	874	44	4	238	176	255	126	34	31.12%
8	Johan Santana	701	46	3	146	144	193	99	71	30.96%
9	Jered Weaver	882	66	4	174	220	202	118	99	30.95%
10	Scott Baker	828	48	4	162	209	214	100	91	30.56%
11	Jon Lester	843	64	3	225	142	269	108	32	30.49%
12	Jonathan Sanchez	710	88	6	177	149	187	68	36	30.00%
13	Yovani Gallardo	793	94	5	204	146	231	84	30	29.51%
14	Tommy Hanson	522	46	5	116	105	146	68	37	29.31%
15	Ricky Nolasco	785	44	2	195	176	218	116	35	29.30%
16	Dan Haren	909	38	4	223	192	285	128	40	28.93%
17	Ted Lilly	706	36	2	151	197	182	87	51	28.61%
18	Jorge De La Rosa	799	83	9	193	137	239	103	35	28.54%
19	Cole Hamels	814	43	5	168	157	261	117	63	28.38%
20	Matt Garza	861	79	11	189	173	233	126	51	27.87%
21	Max Scherzer	741	63	10	174	158	215	90	32	27.80%
22	Wandy Rodriguez	849	63	5	193	176	276	96	41	27.56%
23	CC Sabathia	938	67	9	197	178	296	132	59	27.29%
24	Josh Johnson	855	58	6	191	126	307	125	42	27.25%
25	Chad Billingsley	823	86	7	179	143	262	101	45	27.22%
26	Aaron Harang	703	43	4	142	156	191	121	47	26.88%
27	Carlos Zambrano	733	78	9	152	136	229	87	43	26.60%
28	Adam Wainwright	970	66	3	212	150	360	136	45	26.49%
29	Roy Halladay	963	35	5	208	163	366	141	45	26.27%
30	Joe Blanton	837	59	8	163	181	257	116	56	26.16%
31	Josh Beckett	883	55	7	199	165	299	126	32	26.16%
32	Felix Hernandez	977	71	8	217	164	367	113	38	26.10%
33	Ubaldo Jimenez	914	85	10	198	125	344	112	40	26.04%
34	Barry Zito	818	81	8	154	163	235	120	59	26.04%
35	Francisco Liriano	609	65	6	122	123	178	80	36	25.94%
36	Randy Wolf	862	58	6	160	211	263	103	61	25.64%
37	Chad Gaudin	664	76	8	139	132	199	79	31	25.60%
38	Edwin Jackson	890	70	5	161	194	267	128	66	25.51%
39	A.J. Burnett	896	97	10	195	184	259	117	33	25.45%
40	Scott Richmond	610	59		117	144	151	101	38	25.41%
41	Matt Cain	886	73	3	171	211	263	112	53	25.28%
42	John Danks	839	73	5	149	170	282	98	62	25.15%
43	Brett Anderson	734	44	3	150	132	280	91	34	25.07%
44	Ryan Dempster	842	65	6	172	171	296	95	39	25.06%
45	Scott Kazmir	647	60	6	117	160	160	99	45	25.04%
46	Roy Oswalt	757	42	8	138	149	265	104	51	24.97%
47	David Hernandez	462	46	1	68	130	109	62	46	24.68%
48	J.A. Happ	685	56	5	119	166	204	86	50	24.67%
49	Justin Masterson	568	60	8	119	96	213	51	21	24.65%
50	Chris Carpenter	750	38	7	144	110	319	93	39	24.40%
51	Gavin Floyd	797	59	2	163	154	263	125	31	24.34%
52	Cliff Lee	969	43	5	181	203	325	159	53	24.15%
53	Joba Chamberlain	709	76	12	133	135	222	93	38	24.12%
54	Ervin Santana	614	47	10	107	155	178	77	39	23.78%
55	Johnny Cueto	740	61	14	132	158	230	103	43	23.65%
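The SO+POP column in the table is simply (SO + POP) / PA. A minimal sketch of that calculation, using the first three rows of the 2009 table as input (the class and function names are illustrative, not from any library):

```python
from typing import NamedTuple


class PitcherLine(NamedTuple):
    """The columns needed for the SO + POP rate."""
    name: str
    pa: int   # plate appearances (batters faced)
    so: int   # strikeouts
    pop: int  # pop-ups


def so_pop_rate(line: PitcherLine) -> float:
    """Strikeouts plus pop-ups as a share of plate appearances."""
    return (line.so + line.pop) / line.pa


# First three rows of the 2009 table above.
LINES_2009 = [
    PitcherLine("Rich Harden", 609, 171, 42),
    PitcherLine("Clayton Kershaw", 701, 185, 55),
    PitcherLine("Justin Verlander", 982, 269, 58),
]

# Rank from highest to lowest SO + POP rate.
ranked = sorted(LINES_2009, key=so_pop_rate, reverse=True)

if __name__ == "__main__":
    for rank, line in enumerate(ranked, start=1):
        print(f"{rank}. {line.name}: {so_pop_rate(line):.2%}")
```

Run on these rows, the sketch reproduces the table's 34.98%, 34.24%, and 33.30%.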

Among pitchers with 100 or more innings, Jered Weaver (15.5%), Scott Baker (14.8%), Tim Wakefield (14.1%), Johan Santana (14.0%), David Hernandez (13.3%), Clayton Kershaw (12.9%), Micah Owings (11.6%), Rich Harden (11.4%), David Huff (11.1%), and Todd Wellemeyer (11.1%) induced the most pop-ups as a percentage of batted balls. Weaver (11.2%), Baker (11.0%), Wakefield (10.8%), Santana (10.1%), Hernandez (10.0%), Huff (9.1%), Owings (8.7%), Wellemeyer (8.4%), Jamie Moyer (8.3%), and Jeremy Guthrie (8.0%) produced the most infield flies as a percentage of plate appearances.
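The two pop-up rates differ only in the denominator: batted balls (FB + GB + LD + POP) in the first case, plate appearances in the second. A short sketch using Jered Weaver's 2009 line from the table (the function name is illustrative):

```python
from typing import Tuple


def pop_rates(pa: int, fb: int, gb: int, ld: int, pop: int) -> Tuple[float, float]:
    """Return (POP per batted ball, POP per plate appearance)."""
    batted_balls = fb + gb + ld + pop
    return pop / batted_balls, pop / pa


if __name__ == "__main__":
    # Jered Weaver, 2009 (from the table above).
    per_batted_ball, per_pa = pop_rates(pa=882, fb=220, gb=202, ld=118, pop=99)
    print(f"POP per batted ball: {per_batted_ball:.1%}")
    print(f"POP per PA:          {per_pa:.1%}")
```

For Weaver this gives 15.5% of batted balls and 11.2% of plate appearances, matching the figures quoted above.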

Importantly, the rankings of pitchers by SO + POP and POP rates are not meant to identify the most valuable pitchers as neither takes into consideration BB, HBP, or HR rates. However, I wonder if Fielding Independent Pitching (FIP) couldn't be improved by combining SO and POP in its formula, which is typically defined as (HR*13+(BB+HBP-IBB)*3-K*2)/IP plus a league-specific factor (usually around 3.2) to create an equivalent ERA number.

The formula for FIP would need to be tinkered with to account for the effect of POP, as simply adding POP to SO wouldn't work. The multipliers or the league-specific factor would need to be changed to equate the newly constructed FIP with ERA.
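For reference, the standard FIP formula quoted above can be sketched as follows. This is only the existing formula, not a POP-adjusted version (no such coefficients are given here), and the stat line in the example is hypothetical, chosen purely to show the arithmetic.

```python
def fip(hr: int, bb: int, hbp: int, ibb: int, so: int, ip: float,
        league_constant: float = 3.2) -> float:
    """Standard FIP: (13*HR + 3*(BB + HBP - IBB) - 2*K) / IP + league constant.

    `ip` is innings pitched as a true decimal (e.g. 200 1/3 innings = 200.333),
    and `league_constant` is the league-specific factor, usually around 3.2.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp - ibb) - 2 * so) / ip + league_constant


if __name__ == "__main__":
    # Hypothetical stat line, for illustration only.
    example = fip(hr=20, bb=50, hbp=5, ibb=3, so=200, ip=200.0)
    print(f"FIP: {example:.2f}")
```

Folding POP into the strikeout term would mean re-deriving the multipliers or the league constant so that the result still lines up with ERA.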

Here are the top ten leaders in SO + POP rate for 2010 (among pitchers with 40 or more IP):

Num	NAME	PA	BB	HBP	SO	FB	GB	LD	POP	SO+POP
1	Tim Lincecum	218	15		69	28	70	26	10	36.24%
2	Clayton Kershaw	197	29	3	52	29	51	14	19	36.04%
3	Jered Weaver	205	12		59	47	56	18	13	35.12%
4	Colby Lewis	212	21	3	54	43	50	22	20	34.91%
5	Tommy Hanson	204	13	3	56	50	52	18	12	33.33%
6	Phil Hughes	170	15		42	35	42	23	13	32.35%
7	Brandon Morrow	187	27	3	54	33	39	25	6	32.09%
8	Yovani Gallardo	228	29		61	27	66	37	9	30.70%
9	Justin Verlander	203	20	1	46	33	60	28	16	30.54%
10	Jonathan Sanchez	178	20	2	45	38	45	19	9	30.34%

Tim Lincecum, Clayton Kershaw, Jered Weaver, and Justin Verlander are the only pitchers who ranked in the top ten in both 2009 and 2010. Tommy Hanson (14th in 2009 and 5th in 2010), Yovani Gallardo (13th and 8th), and Jonathan Sanchez (12th and 10th) ranked in the top 15 in both years.

Strikeouts clearly have the greatest influence on SO + POP, yet pop-ups add value on the margin. The ability to induce pop-ups should not be dismissed when evaluating pitchers. Furthermore, it is my belief that certain pitchers have a knack for allowing fewer home runs as a percentage of outfield fly balls than the league average. Saying a pitcher is "lucky" because he has a lower HR/FB rate than the league average is simplistic, as is resorting to xFIP as a standalone measure (especially when a pitcher has a sufficiently large sample size to evaluate). By the same token, labeling a pitcher with a below-average BABIP "lucky" may not be totally accurate either.

The analytical community has come a long way on batted ball info. Paying more attention to pop-ups would be instructive in my opinion. Digging deeper into pitcher-batter results as they relate to pitch types, pitch sequencing, ball-strike counts, and bases occupied could lead us to solve some of the mysteries previously ascribed to luck and randomness. For example, pitchers with "plus" changeups may induce more than their fair share of pop-ups and lazy fly balls.

More than anything, I hope this article leads to additional discussion and research with respect to analyzing pitchers.

* * *

Update: Tom Tango sent me an email with a link to Tango's Lab: Batted Ball FIP. He pointed me to posts #8 and #9. Leave it to Tangotiger to have developed a formula for batted ball FIP (bbFIP). The formula is as follows:

ERA = 11*[(BB+LD)-(SO+iFB)]/PA + 3*(oFB-GB)/PA + 4.2

Note: the league-specific factor may differ depending on the data source

A line drive is like a walk, an infield fly is like a strikeout, and the gap between an outfly and a groundball is about one-fourth the gap between BB and SO.
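To see the formula in action, here is a minimal sketch that applies it to Rich Harden's 2009 line from the table above. It assumes BP's FB and POP columns can stand in for oFB and iFB, keeps the 4.2 constant even though the note above says it may differ by data source, and uses BB alone (no HBP) because that is how the formula is written.

```python
def bb_fip(pa: int, bb: int, so: int, ld: int, ifb: int, ofb: int, gb: int,
           league_constant: float = 4.2) -> float:
    """Batted ball FIP as quoted above:

    11 * [(BB + LD) - (SO + iFB)] / PA + 3 * (oFB - GB) / PA + constant
    """
    if pa <= 0:
        raise ValueError("plate appearances must be positive")
    walk_like_minus_strikeout_like = (bb + ld) - (so + ifb)
    outfield_fly_minus_grounder = ofb - gb
    return (11 * walk_like_minus_strikeout_like / pa
            + 3 * outfield_fly_minus_grounder / pa
            + league_constant)


if __name__ == "__main__":
    # Rich Harden, 2009. Assumption: BP's FB maps to oFB and POP maps to iFB.
    harden = bb_fip(pa=609, bb=67, so=171, ld=67, ifb=42, ofb=107, gb=152)
    print(f"bbFIP: {harden:.2f}")
```

Under those assumptions Harden comes out at about 2.55. The coefficients also show the "one-fourth" relationship: the BB-to-SO gap is 22 (from +11 to -11) while the outfield fly-to-grounder gap is 6 (from +3 to -3), and 6/22 is roughly a quarter.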

In post #16, Tangotiger lists the results by root mean square error (RMSE) of bbFIP (1.05), SIERA (1.05), and FIP (1.11) and concludes "I’d say that bbFIP is a worthy addition here. Not to mention that it’s in the same spirit as FIP (linear and simple coefficients)."
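For readers unfamiliar with the measure, RMSE is the square root of the average squared gap between an estimator's value and the actual result, so lower is better. The sketch below shows only the calculation. The numbers are placeholders invented for the example and are not taken from Tango's study.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Placeholder values for illustration only (not real pitcher data).
    estimated_era = [3.0, 4.0, 5.0]
    actual_era = [4.0, 4.0, 4.0]
    print(f"RMSE: {rmse(estimated_era, actual_era):.3f}")
```

On that scale, the gap between 1.05 and 1.11 means bbFIP and SIERA missed by slightly less, on average, than FIP did in that comparison.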

If you have the time and interest, go ahead and read the entire discussion. Brian Cartwright goes into even more detail with numerous tables listing the predictive value of run estimators. As Brian notes, it is important to distinguish between "describing the past vs. predicting the future." I agree. Some skills are more repeatable than others. Guy cautions, "The farther forward you look, the more the skills change/deteriorate." He also warns against "survivor bias" in these studies. Excellent points all.

Comments

some interesting thoughts. before going any further i would want to see some kind of evidence that a pitcher has some semblance of control over generating pop flies. a glance at those numbers seems to me that it could just be attributed to chance.

if a pitcher can indeed exhibit some degree of control, is it just a byproduct of being a flyball pitcher or is it more the result of hitters getting beaten and jammed on the hands?

Posted by: dutchbrowncoat at May 18, 2010 12:53 PM

Amen. I agree with almost everything you said. I usually like to set baselines for HR/FB rate for each pitcher but these obviously require larger sample sizes than K rates, for example. I would have liked to have seen more tables where you just isolated pop-up % and more talk about the consistency of pop-up % from year-to-year without lumping it in with strikeouts.

Posted by: Dillon at May 18, 2010 12:54 PM

MGL did the heavy lifting on this subject six years ago.

Not surprisingly, a pitcher's FB and GB as a percentage of his total BIP (essentially his G/F ratio) are very much within a pitcher's control and appear to be relatively stable from year to year. The number of IF pop flies and to some extent OF pop flies, as a percentage of all non-GB BIP, are somewhat a unique function of the pitcher as well. In other words, good pitchers may tend to get more pop flies than bad pitchers, as a percentage of their total non-ground ball balls in play.

Posted by: Rich Lederer at May 18, 2010 6:17 PM

The question is not so much whether pitchers have "control" over their pop flies, as a function of non-GB BIP, but how much "control" they have, given a particular sample size. Pitchers have "control" over everything. How much control is the important thing. For example, the reason we use FIP or DIPS ERA is not because pitchers have no control over non-HR BIP, but because they have much less control over them than they do BB, SO, and HR. And the reason we use xFIP is not because pitchers have no control over their HR/FB, but because they have much less control over that than they do HR per BIP or per PA.

If you want to lump the pop fly in with K in an FIP or DIPS formula, you better make sure that pitcher control over pop flies is similar to that of the BB, SO, and HR, and greater than that of the other non-HR BIP.

So how much control does a pitcher have to have over an outcome like pop flies, to include it in an FIP or DIPS formula? There is no clear answer. It depends on the sample size of the data. A lot of people do not understand that DIPS and FIP work better on smaller sample sizes and that non-FIP and DIPS formulas like ERC, BaseRuns and even regular old ERA or RA work better than DIPS or FIP with very large samples. In between, take your pick. And when I say, "Works better," I mean in terms of predicting future RA or ERA or in describing a pitcher's true talent.

So before we talk about how important a pitcher's pop fly percentage is, and whether we want to include it in a FIP or DIPS formula, we need to quantify that "control" by looking at something like year to year correlations. My guess is that year to year correlation is not going to be nearly as high as it is for BB, SO, and even HR, and I would hesitate to include it in an FIP or DIPS formula, at least for anything but a small sample of data.

Posted by: MGL at May 18, 2010 10:10 PM

MGL: Thanks. I agree with you on everything, including the fact that ERC, ERA, and RA "work better than DIPS or FIP with very large samples" (despite the reliance by many on FIP and xFIP in such cases).

As it relates to pop flies, I believe such batted balls are, at a minimum, useful in describing the past. They may be less helpful in predicting the future, especially compared to SO and BB. I'm a K-BB-GB (in that order) guy myself as the ability to miss bats trumps all but think there is value on the margin in paying attention to POP/IFFB/iFB as well.

Posted by: Rich Lederer at May 18, 2010 11:07 PM

I agree with Rich that it makes sense to include PU's along with strikeouts. Sure, for predicting future performance we would like to separate the two as strikeouts are more predictive than PU's; however, for retrospective performance the two have essentially equal value and should be viewed similarly.

Posted by: Nick Steiner at May 19, 2010 12:39 AM

Great article. I've been seeing for years discussions about how Barry Zito induces pop outs, and this article clearly validates that thinking.

As long as FIP does not account for pop outs, it will fall down on the job in analyzing pitchers like Zito, who most sabers have been denigrating for years. I wonder how many of the Tom Tippett "Crafty Lefty" category had high SO+POP%.

I'm not really up on the latest and greatest of these, so this is probably a stupid question: I was wondering how FanGraphs' tERA would fit into the discussion of comparing SIERA, bbFIP, and FIP.

Lastly, it is interesting that all the Giants' main starters were in the Top 41: Lincecum, Sanchez, Zito, Cain. I know Lincecum and Sanchez have the K's to get high on the list, but Zito certainly doesn't, and Cain, while good, is not that great at striking out a lot of hitters. But I guess both were good at getting pop outs, though not at league-leading percentages, to boost their ranking for SO+POP%.

Posted by: obsessivegiantscompulsive at May 20, 2010 5:20 PM

tRA is probably the best DIPS metric out there, because it includes all of the same stuff as FIP but also a pitcher's batted ball rates. So guys like Zito will be rated more in line with their abilities.

It's not as predictive as FIP or some of the other stats, but that isn't necessarily what we are looking for in DIPS.

Posted by: Nick Steiner at May 21, 2010 11:40 PM