Touching Bases
October 21, 2010
Count Oddities
By Jeremy Greenhouse

I've been doing a lot of thinking about game theory and how it relates to pitch selection and swing rates. I finally decided to run some numbers to find the baselines for swinging, pitch selection, and strike throwing based on the ball/strike count.

The rate at which pitchers throw strikes aligns perfectly with the average run expectancy in each count. However, batters' swing rates are not likewise dictated by run expectancy. Instead, batters like to swing more the deeper they get in the count.

Batters swing 74% of the time on full counts, by far the highest percentage of any count. At the other end, they swing at only 6% of 3-0 pitches.

Pitchers simply aren't good enough at throwing strikes on 3-0 to warrant batters mixing their strategy between swinging and taking. Pitchers only hit the zone about 60% of the time 3-0, whereas they would need to hit it at least 70% of the time to make batters consider swinging I believe. Strangely, batters are eight times as likely to swing on 3-1 as they are on 3-0. I think straight takes on 3-1 might be a viable strategy at times.

We already know and accept that batters don't act completely rationally on the first pitch. Some players just don't like swinging 0-0, so they don't, and that's that. Yet they up their swing rates from 27% on 0-0 to 40% on 1-0, even though pitchers have similar pitch selections and locations and, more importantly, the reward of taking is greater.

There is a 50/50 split between fastballs and off-speed pitches on 0-2 and 1-2 counts. Naturally, fastballs are thrown in the zone at a higher frequency. What's odd is that batters swing at more off-speed pitches on those counts.

The big question is: how much do batters learn from pitch to pitch? The deeper into his repertoire a pitcher must go, the greater the advantage is for the batter. There are probably advantages to taking pitches besides drawing balls. I don't think this applies to the full count, though, which might be why the swing rate is too damn high.

Here's the relevant data. I should note that I used the same strike zone model for all counts, which means that more pitches would be called strikes on 3-0 than listed as being in the zone, and fewer strikes would be called than listed on 0-2.

Count   FB%     Zone%   Swing%
3-0     95.2%   58.5%    6.6%
3-1     85.0%   57.5%   54.3%
2-0     81.6%   55.3%   40.0%
3-2     69.4%   54.0%   73.7%
2-1     68.5%   52.6%   58.7%
1-0     68.6%   52.0%   40.7%
0-0     68.1%   50.2%   26.7%
1-1     56.4%   46.5%   52.9%
2-2     54.0%   43.8%   65.4%
0-1     55.3%   41.8%   46.1%
1-2     49.2%   35.7%   57.8%
0-2     52.4%   29.0%   49.4%
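
For anyone who wants to poke at these numbers, here is a minimal Python sketch that loads the table and checks a couple of the comparisons made in the post. The variable and function names are just illustrative; the only data are the twelve rows listed above.

```python
# Count -> (FB%, Zone%, Swing%), copied from the table in the post.
COUNTS = {
    "3-0": (95.2, 58.5, 6.6),
    "3-1": (85.0, 57.5, 54.3),
    "2-0": (81.6, 55.3, 40.0),
    "3-2": (69.4, 54.0, 73.7),
    "2-1": (68.5, 52.6, 58.7),
    "1-0": (68.6, 52.0, 40.7),
    "0-0": (68.1, 50.2, 26.7),
    "1-1": (56.4, 46.5, 52.9),
    "2-2": (54.0, 43.8, 65.4),
    "0-1": (55.3, 41.8, 46.1),
    "1-2": (49.2, 35.7, 57.8),
    "0-2": (52.4, 29.0, 49.4),
}


def swing(count):
    """Swing% for a given ball-strike count."""
    return COUNTS[count][2]


# How many times more often batters swing on 3-1 than on 3-0.
ratio_31_30 = swing("3-1") / swing("3-0")

# Jump in swing rate from 0-0 to 1-0, in percentage points.
jump_00_10 = swing("1-0") - swing("0-0")

# Counts ordered from most to least swing-happy.
by_swing = sorted(COUNTS, key=swing, reverse=True)

print(f"3-1 vs 3-0 swing ratio: {ratio_31_30:.1f}x")
print(f"0-0 to 1-0 jump: {jump_00_10:.1f} points")
print("Highest and lowest swing counts:", by_swing[0], by_swing[-1])
```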


"There is a 50/50 split between fastballs and off-speed pitches on 0-2 and 1-2 counts. Naturally, fastballs are thrown in the zone at a higher frequency. What's odd is that batters swing at more off-speed pitches on those counts."

Very interesting, Jeremy. I tried to think about this by re-organizing your chart based on the number of strikes in the count. It seems like the FB% tells us what I think most of us know; pitchers throw more fastballs to avoid walks when there are 2 or 3 balls in the count (exception being 2-2, which makes sense).

The zone % also makes sense to me. With no strikes, it's within about 8 percentage points from 0 balls to 3 balls - very little change. With one strike, it ranges from 41 to 57%, and a hitter is just as likely to see a strike 3-1 as he is 3-0. The interesting data is with two strikes. At 0-2 and 1-2, there's really not much incentive to throw a strike; see if the hitter will get himself out. But get to 3 balls and 3-0, 3-1, and 3-2 are within about 5 percentage points of one another.

With the swing %, a hitter is more likely to swing with each additional strike. Wouldn't that just be due to the batter knowing he might HAVE to swing in that count (particularly with two strikes)? I suppose the 3-2 swing % is highly correlated with the increased zone % from pitchers; that is, a pitcher is more likely to throw a strike with three balls and a hitter is more likely to swing with two strikes, so this is a perfect storm.
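
The regrouping by strikes described in this comment can be sketched in a few lines of Python. This is only an illustration using the table from the post; the helper names (`values_by_strikes`, `spread`) are made up for the example.

```python
from collections import defaultdict

# Count -> (FB%, Zone%, Swing%), copied from the table in the post.
COUNTS = {
    "3-0": (95.2, 58.5, 6.6), "3-1": (85.0, 57.5, 54.3),
    "2-0": (81.6, 55.3, 40.0), "3-2": (69.4, 54.0, 73.7),
    "2-1": (68.5, 52.6, 58.7), "1-0": (68.6, 52.0, 40.7),
    "0-0": (68.1, 50.2, 26.7), "1-1": (56.4, 46.5, 52.9),
    "2-2": (54.0, 43.8, 65.4), "0-1": (55.3, 41.8, 46.1),
    "1-2": (49.2, 35.7, 57.8), "0-2": (52.4, 29.0, 49.4),
}
ZONE, SWING = 1, 2  # column positions inside each tuple


def values_by_strikes(column):
    """Group one column of the table by the number of strikes."""
    groups = defaultdict(list)
    for count, row in COUNTS.items():
        _balls, strikes = count.split("-")
        groups[int(strikes)].append(row[column])
    return dict(sorted(groups.items()))


def spread(values):
    """Gap between the largest and smallest value, in percentage points."""
    return round(max(values) - min(values), 1)


# Zone% spread across ball counts, for 0, 1 and 2 strikes.
zone_spread = {s: spread(v) for s, v in values_by_strikes(ZONE).items()}

# Average Swing% for 0, 1 and 2 strikes (unweighted across counts).
mean_swing = {s: sum(v) / len(v) for s, v in values_by_strikes(SWING).items()}

# Zone% spread across the three 3-ball counts.
three_ball_spread = spread([COUNTS[c][ZONE] for c in ("3-0", "3-1", "3-2")])

print("Zone% spread by strikes:", zone_spread)
print("Mean Swing% by strikes:", mean_swing)
print("Zone% spread on 3-ball counts:", three_ball_spread)
```

Note that the averages are unweighted: each count gets equal weight, because the post does not give the number of pitches thrown in each count.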

"Pitchers only hit the zone about 60% of the time 3-0, whereas they would need to hit it at least 70% of the time to make batters consider swinging I believe."

Jeremy, this is not correct. You're making an assumption in all of your analysis that batters select their strategy independent of the pitch thrown. That's true for the pitcher -- they don't know what strategy the hitter will use -- but not true for the hitter. They decide whether to swing after seeing the pitch -- a huge difference. Consider the swings at 3-0: I bet at least 90% of them are strikes. So the comparison you need to do is between a 3-1 outcome (take) and the actual production when swinging. If the hitter misses, it's still 3-1 so no difference. But on contact, 3-0 hitters really mash: a .409/.409/.866 line last year. That's a much better outcome for the hitter than being 3-1. If you figure the hitter would draw a walk on 10% of those pitches now swung at, then it's probably a wash. Which is exactly what should happen!

And what is the evidence that hitters should take more often at 3-1? Currently, they put up an OPS of 1.067 after 3-1 -- basically, the average hitter becomes Albert Pujols. We should be very cautious about concluding that hitters doing that well are making major mistakes.

You need to remember that decades of baseball evolution have produced these behaviors, which are very stable over time. There is almost zero chance that there are major inefficiencies remaining. If you find one, it means you're doing something wrong. An individual hitter may act non-optimally, of course, but the most you will find among all hitters is very small inefficiencies (if that).

Guy, I feel like you're ignoring the data.

What you describe I should do on 3-0 counts, Dave Allen did for 3-2 counts, but you dismissed his conclusions for whatever reason.

"There is almost zero chance that there are major inefficiencies remaining. If you find one, it means you're doing something wrong."

I mean how do I respond to that? All I was trying to do was provide the data.

I suppose I wasn't clear enough with what I meant on the 3-0 thing, and maybe I was wrong as well. There are obviously pitchers who hit the zone 70% of the time on 3-0. Those are probably the pitchers against whom batters swing, and I believe they swing correctly. There are also probably pitchers who can only hit the zone 50% of the time on 3-0. I believe that taking every one of those pitches is a viable strategy. You believe batters should always be prepared to try to swing at the 10-20% of pitches that they should swing at. Whatever. Both strategies are going to come out way ahead. Maybe I underestimate how batters are able to wait until they see the pitch to make their decision. I'll give you that.

OK, then you ask what evidence there is batters should take more on 3-1.

I just provided the data at the end of the article. 54% swings at 3-1, 6% on 3-0. There is little chance that both of those strategies are perfectly correct. That's all I'm saying. Similarly, how do you explain the 0-0 and 1-0 swing rate differences? I offered that batters don't like swinging on the first pitch, and maybe that assumption is wrong, but there is almost surely some inefficiency there. And how do you explain batters swinging at off-speed pitches more than fastballs on 1-2 and 0-2 counts even though off-speed pitches are more often out of the zone?

I'll tell you what, I will provide you with whatever data you want, and you can write up an article for next Thursday, or I will run whatever numbers you want and try to write it up myself. Just tell me what to do, and I'll do my best to do it. In exchange, you can't say that decades of baseball evolution have wiped out all cognitive biases and produced equilibria in every facet of the game.

Hey Jeremy. I'm not ignoring your data, but certainly it's possible I'm misinterpreting it. Let me see if I can explain my perspective more clearly.

One disagreement is with the assumption I think you're making of independence between pitch location and the hitter swing decision. In fact the pitches swung at and those taken are very different. So to demonstrate hitters should take more on a given count, you need to figure out which additional pitches will be taken, estimate the outcomes on these extra takes, and then compare that to what the hitter produces when swinging at that same set of pitches. None of which is simple.

Let's take 3-0. You can probably estimate what % of those pitches would be called a ball (my guess is 10-15%). Then determine if that set of outcomes is better or worse than the production when hitters swing. Maybe that will show hitters should take 100%, in which case my intuition is wrong. But my point is that simply knowing whether pitchers throw 60%, 70%, or 80% strikes at 3-0 can't answer this question.

Dave did not do this analysis at all for the 3-2 count, so I'm not sure what you mean there. He simply showed that hitters swing more than 50% of the time at a set of balls that would be better to take. But hitters take some strikes and swing at some balls at every count -- by itself that doesn't tell us if they are swinging too much or too little. For example, if hitters start trying to take more borderline pitches at 3-2, it's easy to imagine -- looking at Dave's data -- that for each 10 extra takes 2 would be from the take zone (good outcome), 3 from the 50-50 zone (neutral), and 5 from the swing zone (bad). It's not obvious that this would be good for the hitter. The point is that taking more balls will always mean taking more strikes as well. To measure the net impact, we have to know both the distribution of pitch locations AND how hitters' higher take rate varies by location. That's why I suggested testing what happens if you combine hitters' 2-2 swing/take rates with the 3-2 pitch distribution.
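
To put rough arithmetic on the "10 extra takes" example, here is a minimal sketch. The 2/3/5 split comes from the comment above; the per-pitch run values passed to `net_change` are hypothetical placeholders, not measured numbers.

```python
# Extra takes per 10, by location, from the example in the comment.
EXTRA_TAKES = {"take_zone": 2, "fifty_fifty": 3, "swing_zone": 5}


def net_change(gain_per_good_take, loss_per_bad_take):
    """Net run change per 10 extra takes.

    gain_per_good_take: runs gained by taking a pitch in the take zone
    loss_per_bad_take:  runs lost by taking a pitch in the swing zone
    Pitches in the 50-50 zone are treated as neutral (zero change).
    Both inputs are HYPOTHETICAL; the post does not supply them.
    """
    return (EXTRA_TAKES["take_zone"] * gain_per_good_take
            - EXTRA_TAKES["swing_zone"] * loss_per_bad_take)


def break_even_ratio():
    """How many times larger the gain must be than the loss to break even."""
    return EXTRA_TAKES["swing_zone"] / EXTRA_TAKES["take_zone"]


print("Gain/loss ratio needed to break even:", break_even_ratio())
print("Equal gain and loss of 0.05 runs:", net_change(0.05, 0.05))
```

With this split, a good take has to be worth 2.5 times as much as a bad take costs before taking more pitches pays off, which is why the net effect is not obvious from the swing and take zones alone.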

On the other points, well, each is complicated:

54% swings at 3-1, 6% on 3-0: Why must this be inconsistent? The cost of strike one at 3-0 is very small compared to the cost of strike two. And I think the umps' zone is larger at 3-1.

0-0 vs. 1-0 swing rate: I agree the 0-0 swing rate seems too low. But again, we'd need to figure out where the extra swings would come. And perhaps, as you say, the hitter gains in comfort/learning by taking the pitch.

Why do hitters swing at 2-strike breaking balls? Great question. I don't know. I assume hitters have less ability to correctly judge whether breaking balls will be strikes, and this is exacerbated when they are behind in the count. But the important point is that "surprising" is not a synonym for "wrong." Our inability to explain it is as likely to mean we are missing something as to indicate MLB hitters are guilty of mass insanity.

OK, I finally understand what you're getting at. I can now see how Dave's analysis might have been lacking, although I'm not sure we can do any better. Tell me if these are the steps you suggest.

1: Find value of taking on 3-2.
2: Find value of swinging on 3-2.
3: Use minimax to solve for correct swing rates.
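
As a rough illustration of step 3, here is a minimal Python sketch that solves a 2x2 zero-sum game between batter (swing or take) and pitcher (strike or ball). It treats the two choices as simultaneous, which is a simplification given the point above that batters decide after seeing the pitch. The payoff numbers are hypothetical placeholders standing in for the values from steps 1 and 2.

```python
def solve_2x2(a, b, c, d):
    """Solve the zero-sum game [[a, b], [c, d]] for the row player.

    Rows are the batter's choices (swing, take), columns are the
    pitcher's (strike, ball), and entries are run values to the batter.
    Returns (p_row1, q_col1, value). If a pure-strategy saddle point
    exists, p_row1 and q_col1 are returned as None.
    """
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        return None, None, maximin
    denom = a - b - c + d
    p_row1 = (d - c) / denom   # batter's probability of swinging
    q_col1 = (d - b) / denom   # pitcher's probability of a strike
    value = (a * d - b * c) / denom
    return p_row1, q_col1, value


# HYPOTHETICAL 3-2 run values to the batter (not from the post):
SWING_AT_STRIKE = 0.05
SWING_AT_BALL = -0.20
TAKE_STRIKE = -0.30   # called strike three
TAKE_BALL = 0.30      # walk

p_swing, q_strike, value = solve_2x2(
    SWING_AT_STRIKE, SWING_AT_BALL, TAKE_STRIKE, TAKE_BALL
)
print(f"swing rate {p_swing:.3f}, strike rate {q_strike:.3f}, value {value:.4f}")
```

At the mixed equilibrium the pitcher is indifferent between throwing a strike and a ball, and the batter is indifferent between swinging and taking. The output depends entirely on the assumed payoffs, so it should not be read as an estimate of the correct swing rate.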

The big problem here is that the value of swinging on 3-2 is interconnected with the probability of swinging on 3-2, so we can't really use theoretical swing rates but keep the same values that result from the actual swing rates.

That said, I can certainly test your method, if I understand it correctly.

"The cost of strike one at 3-0 is very small compared to the cost of strike two."

Sorry, but that's not true. If you believe that is true, then maybe Major Leaguers believe that is true as well, which could be why they swing so often on 3-1 counts. Perhaps 3-2 from 3-1 should be much more costly than 3-1 from 3-0, but as it stands, it is not.

I was suggesting two possible studies.

For 3-0, simply look at the c. 6% of pitches swung at and estimate the outcomes if hitters took them all (my guess was about 15% BB, 85% 3-1). Then compare that set of outcomes to what happens when hitters actually swing (swinging strikes, fouls, and balls in play).
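
A minimal sketch of that 3-0 comparison, assuming a swing either puts the ball in play or leaves the count at 3-1 (whiff or foul). The 15% ball rate is the guess from the comment; every run value and the in-play share are hypothetical placeholders.

```python
def take_value(p_ball, rv_walk, rv_31):
    """Expected run value of taking: a walk if called a ball, else 3-1."""
    return p_ball * rv_walk + (1 - p_ball) * rv_31


def swing_value(p_in_play, rv_in_play, rv_31):
    """Expected run value of swinging: ball in play, else 3-1."""
    return p_in_play * rv_in_play + (1 - p_in_play) * rv_31


def break_even_in_play(p_ball, p_in_play, rv_walk, rv_31):
    """Run value on balls in play at which swinging equals taking."""
    target = take_value(p_ball, rv_walk, rv_31)
    return (target - (1 - p_in_play) * rv_31) / p_in_play


P_BALL = 0.15      # guess from the comment: 15% BB, 85% 3-1 on a take
P_IN_PLAY = 0.55   # HYPOTHETICAL share of 3-0 swings put in play
RV_WALK = 0.30     # HYPOTHETICAL run value of a walk
RV_31 = 0.14       # HYPOTHETICAL run value of reaching a 3-1 count

needed = break_even_in_play(P_BALL, P_IN_PLAY, RV_WALK, RV_31)
print(f"Balls in play must be worth at least {needed:.3f} runs")
```

The actual study would replace the placeholders with measured outcomes on the pitches hitters swung at.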

For 3-2, I suggest "mapping" hitters' 2-2 strategy onto the actual 3-2 distribution of pitches. So for 3-2 fastballs in location X, assign the run value hitters achieve when the count is 2-2. This assumes that hitters could use their 2-2 strategy, which involves taking more pitches, when the count is 3-2 (which seems reasonable). The new 3-2 result will of course be more BBs, more called strike 3s, fewer hits, and fewer swinging strike 3s. Does that all net out to a gain for the hitters?

OK, Guy, I'll try my best. I think mapping the 2-2 strategy on 3-2 pitches could produce funky results.

What I was thinking on 3-2 is that you would assign the 2-2 run value on each pitch. So, if a curveball in location X has a run value of -.1 when the count is 2-2, assign that same run value to those pitches at 3-2. Presumably, the 2-2 run values will generally be lower for balls in the strike zone, but higher for balls out of the zone.

However, I'm sure this is harder to do than I'm making it sound!
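
The mapping idea can at least be sketched as a weighted average: weight each pitch bin's 2-2 run value by how often that bin shows up at 3-2, then compare with the same average using the actual 3-2 run values. The bins, shares, and run values below are all hypothetical placeholders; real bins would be pitch type by location, with values measured on a common basis.

```python
# HYPOTHETICAL share of 3-2 pitches in each (pitch type, location) bin.
PITCH_SHARE_32 = {
    ("FB", "in_zone"): 0.40,
    ("FB", "out"): 0.29,
    ("OFF", "in_zone"): 0.14,
    ("OFF", "out"): 0.17,
}

# HYPOTHETICAL run values to the hitter per pitch, by bin.
RV_AT_22 = {
    ("FB", "in_zone"): -0.02, ("FB", "out"): 0.03,
    ("OFF", "in_zone"): -0.04, ("OFF", "out"): 0.02,
}
RV_AT_32 = {
    ("FB", "in_zone"): 0.00, ("FB", "out"): 0.01,
    ("OFF", "in_zone"): -0.02, ("OFF", "out"): -0.01,
}


def mapped_value(shares, run_values):
    """Average run value per pitch, weighting each bin by its share."""
    return sum(shares[b] * run_values[b] for b in shares)


actual_32 = mapped_value(PITCH_SHARE_32, RV_AT_32)
mapped_22 = mapped_value(PITCH_SHARE_32, RV_AT_22)

print(f"actual 3-2 strategy:  {actual_32:+.4f} runs per pitch")
print(f"2-2 strategy at 3-2:  {mapped_22:+.4f} runs per pitch")
print(f"difference:           {mapped_22 - actual_32:+.4f}")
```

A positive difference would suggest the more patient 2-2 approach helps at 3-2, but with made-up inputs the sign here means nothing.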

I'm not sure if you're aware of this but are just ignoring it for the purposes of this discussion, and if you are, sorry for bringing up something this obvious. But generally, hitters are not *allowed* to swing at 3-0 pitches unless given the green light by the manager. And the green light is typically only given in certain situations, to certain players, e.g. runners in scoring position with first base open, a high-average hitter at the plate, close game, etc. A ridiculously good power hitter like Pujols might be given the green light with no runners on, but there's almost no chance a non-power hitter would be.

I would say there are also other factors aside from immediate run expectancy that factor into a hitter's decision on whether to swing or not. They might be sacrificing a small amount of run expectancy by taking 0-0 pitches more often than is optimal, but the hope is that this strategy will cause the pitcher to work more, throw more pitches, and eventually result in greater run expectancy in all counts later in the game when the pitcher is tired--or perhaps greater run expectancy later in the series when the bullpen is used up in this game because this starting pitcher had to be removed early.