F/X Visualizations July 31, 2009
Measuring a Pitcher's Ability to Locate a Pitch

In many of my past posts I have displayed heat maps showing how a specific value (HR rate, run value, BABIP) varies over pitch location. One thing I mentioned in passing in the BABIP post, but probably should have been mentioning all along, is that just because a location is the best to pitch to does not mean a pitcher should attempt to throw it there. We must think about a pitcher's ability to locate and what happens if he misses his spot. MGL put it best in asking this question, in this post at the Book Blog:

Let’s say that pitch f/x data tells us the following about a particular pitcher or group of pitchers:

On the average, the run value of a high inside fastball is -.001 where minus is good for the pitcher. The run value of a low outside fastball is +.001. In other words, the run value of the former is better than the run value of the latter.

Now, put all pitch sequence and game theory stuff aside.

In an average situation against an average batter, where those run values above absolutely apply, which pitch should a pitcher attempt to throw, and why? We are just talking about one pitch, and again, put aside anything to do with pitch sequences and game theory.

Low and away.

Your phrasing: “which pitch should a pitcher attempt to throw, and why?” The key word is attempt. If you make a mistake down and away, you probably won’t get burned as much as if you make a mistake going up.

If he has perfect control, then by all means take the one with the better value, but there is human error involved.

And MGL's further explanation:

You CANNOT use the run values of pitch locations based on hit f/x data to make any decisions about what pitches to throw unless you consider what happens when you miss your exact location (and the distribution of those misses, location-wise), which will happen some non-trivial percentage of the time.

I was thinking about the pitch f/x article or two a while back that told us exactly what I told you - that the high inside fastball was a very effective pitch. What the data and article did NOT tell you was the run value of a pitch that was ATTEMPTED to be thrown high and inside. ...

In general the reason why pitchers do NOT throw high and/or inside that much in this day and age is not because they are not man enough anymore as some broadcasters would have you believe, but it is not necessarily because a high inside fastball is a bad pitch (if it hits that location). It is because a miss on that pitch will more often result in a HR (or extra base hit) or a hit batter. As well, batters will take a difficult to hit high and inside pitch more often now than they would in the old days when the strike zone was higher than it is now.

Here is a visual representation of what he is talking about. Below is the run value of a pitch from a right-handed pitcher to a left-handed batter.

Suppose location B, up and in, has a slightly better run value for the pitcher than location A. If a pitcher could hit location B exactly, that would be the best place to pitch. But some fraction of the time he misses, and a miss when throwing to B ends up in a less favorable place than a miss when throwing to A. Depending on how often he hits his spot, and by how far he misses, he might be better off pitching to the spot with the worse run value.

Ultimately, what we would want to know is, for a particular pitcher, pitch type and intended pitch location, the probability density function of where the pitch will end up. This, combined with the run value map, would give us the expected run value if that pitcher attempts to throw to a given location.
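As a rough sketch of that calculation, the code below weights each cell of a run value grid by the chance that a pitch aimed at a target cell ends up there. Everything specific here is an assumption made for illustration: the function names, the 3x3 grid of made-up run values, and the choice of an isotropic Gaussian for the misses (truncated to the grid and renormalized). Negative values are good for the pitcher, as in MGL's example.

```python
import math


def gaussian_weight(dx, dz, sigma):
    """Unnormalized isotropic 2-D Gaussian density at offset (dx, dz)."""
    return math.exp(-(dx * dx + dz * dz) / (2.0 * sigma * sigma))


def expected_run_value(run_values, target, sigma):
    """Expected run value when aiming at grid cell target = (row, col).

    run_values: 2-D list, run value of a pitch that ends up in each cell.
    sigma: spread of the misses, measured in cell widths (an assumption).
    The miss distribution is cut off at the grid edge and renormalized.
    """
    t_row, t_col = target
    weight_sum = 0.0
    value_sum = 0.0
    for r, row in enumerate(run_values):
        for c, rv in enumerate(row):
            w = gaussian_weight(c - t_col, r - t_row, sigma)
            weight_sum += w
            value_sum += w * rv
    return value_sum / weight_sum


# Made-up run values (not real data). B has the best value but sits
# next to a very hittable cell; A is slightly worse but has safer neighbors.
RUN_VALUES = [
    [0.000, -0.010, 0.040],
    [0.000, 0.010, 0.020],
    [-0.008, 0.000, 0.005],
]
LOCATION_B = (0, 1)
LOCATION_A = (2, 0)

for sigma in (0.1, 1.0):
    ev_a = expected_run_value(RUN_VALUES, LOCATION_A, sigma)
    ev_b = expected_run_value(RUN_VALUES, LOCATION_B, sigma)
    better = "A" if ev_a < ev_b else "B"
    print(f"sigma={sigma}: aim at A {ev_a:+.4f}, aim at B {ev_b:+.4f}, better: {better}")
```

With these invented numbers a pitcher with almost perfect control should aim at B, while one whose misses spread about one cell in each direction does better aiming at A.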

We do not have that information now, and we will probably never have anything that specific. But if we knew the location of the catcher's mitt, we would have some indication of where a pitch was intended. This was brought up at both pitchf/x summits, and Marv White of Sportvision said that it is possible given the current technology, but not at the top of their list of things to do. There is some discussion over at the Book Blog about how hard it would be to collect this data and how much information it would give us. Either way, I add my vote to that of other analysts interested in that data.

Without that, though, I wanted to see if I could estimate how close a pitcher comes to hitting his spots. Without knowing where each pitch was intended to go this is impossible, but I think we can get an estimate for at least one pitcher. Once again I turn to Mariano Rivera. Check out the location of his pitches to LHBs.

The vertical location varies quite a bit, but there are two clear horizontal areas he pitches to. If we assume that he intends to throw all of his pitches either just inside the right edge of the zone or just inside the left edge of the zone, we can then see how close he is, along the horizontal axis, to hitting his spot.

I do think he probably varies the intended horizontal location by count, probably intending to pitch closer to the zone when he has three balls and pitching even farther out on the edge when he is ahead in the count. So I am going to restrict my attention to pitches from 0-0, 1-0, 0-1 and 1-1 counts.

Since the horizontal location varies by vertical location, I am going to look at the deviation from the black lines below.
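A minimal sketch of how such a deviation could be computed follows. It assumes each target is a straight line giving the intended horizontal location as a function of height, and that each pitch was aimed at whichever line it ended up nearer to. The line coefficients, the field names (px, pz, balls, strikes) and the three pitches are made up for illustration and are not the values behind the plots in this post.

```python
# Counts kept in the analysis, as (balls, strikes).
EARLY_COUNTS = {(0, 0), (1, 0), (0, 1), (1, 1)}


def horizontal_deviation(px, pz, target_lines):
    """Signed horizontal distance (feet) from the nearest target line.

    target_lines: list of (intercept, slope) pairs; each line gives the
    assumed intended horizontal location as x = intercept + slope * pz.
    """
    deviations = [px - (intercept + slope * pz) for intercept, slope in target_lines]
    return min(deviations, key=abs)


def early_count_deviations(pitches, target_lines):
    """Deviations for pitches thrown in 0-0, 1-0, 0-1 and 1-1 counts only."""
    return [
        horizontal_deviation(p["px"], p["pz"], target_lines)
        for p in pitches
        if (p["balls"], p["strikes"]) in EARLY_COUNTS
    ]


# Hypothetical target lines and pitches, for illustration only.
TARGET_LINES = [(-0.9, 0.1), (0.5, 0.1)]
PITCHES = [
    {"px": -0.5, "pz": 2.0, "balls": 0, "strikes": 0},
    {"px": 0.9, "pz": 3.0, "balls": 1, "strikes": 1},
    {"px": 0.0, "pz": 2.5, "balls": 3, "strikes": 2},  # dropped: 3-2 count
]
DEVIATIONS = early_count_deviations(PITCHES, TARGET_LINES)
print(DEVIATIONS)
```

Assigning each pitch to the nearer line is itself a simplification: a pitch that missed badly toward the other target would be counted as a small miss from the wrong line.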

Here is a histogram of the deviations from these black lines.

Over 75% of his pitches are within half a foot to either side of the target along the horizontal axis. In other words, 75% of the time he can get his pitch within a 1-foot horizontal strip. Over 50% of his pitches are within 1/3 of a foot to either side of his target along the horizontal axis. So half the time he gets it in an 8-inch horizontal strip.
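For readers who want to reproduce this kind of summary from their own list of deviations, here is a small sketch. The function names and the sample deviations are invented for illustration; they are not Rivera's data.

```python
def share_within(deviations, half_width):
    """Fraction of deviations (feet) within half_width of the target."""
    if not deviations:
        raise ValueError("need at least one deviation")
    inside = sum(1 for d in deviations if abs(d) <= half_width)
    return inside / len(deviations)


def strip_width_inches(half_width_ft):
    """Full width in inches of a strip reaching half_width_ft to each side."""
    return 2.0 * half_width_ft * 12.0


# Made-up horizontal deviations in feet, for illustration only.
SAMPLE = [-0.62, -0.41, -0.30, -0.12, -0.05, 0.02, 0.10, 0.28, 0.45, 0.80]

for half_width in (0.5, 1.0 / 3.0):
    share = share_within(SAMPLE, half_width)
    width = strip_width_inches(half_width)
    print(f"{share:.0%} of pitches fall in a {width:.0f}-inch strip")
```

The second helper is just the unit conversion used above: 1/3 of a foot to either side of the target is a strip 8 inches wide.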

This all assumes that you believe that he is always throwing at one of two targets. If you think he aims at a range of horizontal locations, then the variation I have measured comes partially from that range of locations and partially from his ability to locate. In that case I am ascribing some variation in intended location to his ability to locate, so I think you can take these numbers as the least accurate he could possibly be. They also say nothing about how far he is from his intended target along the vertical axis, because I have no way of knowing his intended vertical target.

I think of this as a first attempt at measuring how close a pitcher is to hitting his intended location. Catcher mitt location data will get us closer to measuring it, but it is probably something we will never be able to fully measure.

Good stuff. Marv believes, if anything, catcher's feet will be our first bit of empirical data. Finding feet on dirt is easier than mitt on everything else. It takes us half-way, I suppose.

That said, spotting the glove is possible, just not as easy.

On SNY they actually track the catcher's mitt... then on HRs (or other hits in general) they show where the catcher's mitt was located and where the actual pitch went... I'm surprised at how much pitchers miss their intended targets.

Harry,

I had forgotten he said that. Hopefully that comes soon.

john,

What is SNY? Is the data from that publicly available?

Sadly no. SNY is the Mets TV station (Sportsnet New York), and they don't make the data publicly available.

garik16,

Thanks, that is really interesting. I will have to try and catch a game on SNY.

Great work. How many pitchers have groups like Mo so we can compare?

I actually did a project on this exact topic for a game theory class last semester. We devised a model for imperfect pitch location and tested some simple probability density functions on expected value of pitch locations against certain batter's tendencies (by pitch location).

Most Gaussian pdfs centered on the target location have the effect of 'smoothing' out the valuation. So in the case you showed with A and B, if the pitcher misses the locations with the same pdf (locally), he usually should still go for the lower value location B. But with a large enough st.dev. on the Gaussian, the strategy would change to location A. In our project, we tried to find out how inaccurate (by width of the Gaussian) a pitcher had to be in order to affect his strategy.
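A minimal sketch of that kind of test, not the model from the class project: smooth a one-dimensional run value profile with a Gaussian centered on each target and step through candidate widths until aiming at A beats aiming at B. The profile, the positions of A and B, the candidate widths and the function names are all made up for illustration, and the Gaussian is cut off at the ends of the profile and renormalized.

```python
import math


def smoothed_value(values, target, sigma):
    """Expected run value when aiming at index target with Gaussian misses.

    sigma is measured in cells; weights are renormalized over the profile.
    """
    weights = [
        math.exp(-((i - target) ** 2) / (2.0 * sigma ** 2))
        for i in range(len(values))
    ]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def crossover_sigma(values, idx_a, idx_b, sigmas):
    """First sigma in sigmas at which aiming at A beats aiming at B.

    Lower run value is better for the pitcher. Returns None if B stays
    the better target for every candidate width.
    """
    for sigma in sigmas:
        if smoothed_value(values, idx_a, sigma) < smoothed_value(values, idx_b, sigma):
            return sigma
    return None


# Made-up run value profile across the plate (not real data).
PROFILE = [0.000, -0.008, 0.000, 0.005, -0.010, 0.040]
IDX_A = 1  # slightly worse spot with forgiving neighbors
IDX_B = 4  # best spot, next to a very hittable one

print(crossover_sigma(PROFILE, IDX_A, IDX_B, [0.25, 0.5, 0.75, 1.0]))
```

With this invented profile the switch happens between a width of 0.25 and 0.5 cells; a finer list of candidate widths would narrow that down.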

The strategy also depends on the count and the runners' positions, since these change the values of certain pitch locations.

Obviously, the main shortfall is the ability to devise a pitcher's unique probability density function for each of his pitches since it would change on a daily basis (is he leaving his curveball up today?, is the fastball tailing more than usual?, etc)

It's nice to see someone else has at least thought about quantifying something like this. Keep up the great work!