Run Value by Pitch Location
[Editor's note: Dave Allen has agreed to join Baseball Analysts. He is a graduate student whose research involves analysis of spatial data and spatially explicit modeling. He also loves baseball. Dave will combine these two interests in the F/X Visualizations series.]
A lot of interesting new sabermetric work has become possible over the past two years with the availability of the PITCHf/x data. In this new blog entry, I will continue this analysis and present the results in a simple, yet hopefully effective, visual manner.
This first post builds on work that Joe Sheehan did a year ago looking at the run value of each pitch based on its location. He placed each pitch into one of 25 bins and calculated the average run value in each bin. In the post he suggested that it would be interesting to get rid of the bins and take a continuous approach. A year later, it seems no one has done that, so I thought it would be a good way to launch my work.
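For readers who want to see what the binned approach amounts to, here is a minimal sketch in Python with NumPy. It is not Sheehan's code. The 5-by-5 layout, the bin edges, the variable names and the three toy pitches are all assumptions made for illustration; the only thing taken from his post is the idea of 25 bins with an average run value in each.

```python
import numpy as np

def binned_run_value(px, pz, run_value, x_edges, z_edges):
    """Average run value per location bin; NaN where a bin holds no pitches."""
    px, pz, run_value = (np.asarray(a, dtype=float) for a in (px, pz, run_value))
    # Sum of run values and number of pitches falling in each (x, z) bin.
    total, _, _ = np.histogram2d(px, pz, bins=[x_edges, z_edges], weights=run_value)
    count, _, _ = np.histogram2d(px, pz, bins=[x_edges, z_edges])
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(count > 0, total / count, np.nan)

# Hypothetical 5 x 5 grid (25 bins); edges are in feet and purely illustrative.
x_edges = np.linspace(-1.5, 1.5, 6)   # horizontal location
z_edges = np.linspace(1.0, 4.0, 6)    # height above the ground

# Toy pitches, not real data: two low and away, one down the middle.
px = [-1.4, -1.3, 0.0]
pz = [1.1, 1.2, 2.5]
run_value = [0.04, 0.02, -0.06]

grid = binned_run_value(px, pz, run_value, x_edges, z_edges)
print(grid)
```

The result is indexed as `grid[x_bin, z_bin]`. The hard edges between bins are what the continuous approach is meant to remove.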
Using the first table in this post, I assigned a run value to every pitch in the PITCHf/x database, not just pitches that ended an at-bat, and then averaged the run value of all the pitches in each location. I split the data up by handedness of the pitcher and batter. The number in parentheses is the average run value for all pitches regardless of location. The images are from the catcher's perspective, so a right-handed batter stands to the left of the strike zone and a left-handed batter stands to the right of the strike zone.
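One way to average run value by location without bins is a kernel-weighted mean, where every pitch contributes to every point on the plot with a weight that falls off with distance. The sketch below uses a Gaussian kernel. This is an assumed smoother chosen for illustration, not necessarily the one behind the images here, and the bandwidth, grid, column names and toy pitches are all hypothetical.

```python
import numpy as np

def smoothed_run_value(px, pz, run_value, grid_x, grid_z, bandwidth=0.25):
    """Gaussian-kernel-weighted mean run value at each (grid_x, grid_z) point."""
    px, pz, run_value = (np.asarray(a, dtype=float) for a in (px, pz, run_value))
    gx, gz = np.meshgrid(grid_x, grid_z, indexing="ij")
    # Squared distance from every grid point to every pitch: (nx, nz, n_pitches).
    d2 = (gx[..., None] - px) ** 2 + (gz[..., None] - pz) ** 2
    w = np.exp(-0.5 * d2 / bandwidth ** 2)
    # Weighted average of run values; nearby pitches dominate.
    return (w * run_value).sum(axis=-1) / w.sum(axis=-1)

# Toy pitches, not real data. Each handedness group gets one constant run value
# so the output is easy to check by eye.
px = np.array([-0.5, 0.0, 0.5, 0.2])
pz = np.array([2.0, 2.5, 3.0, 1.8])
run_value = np.array([0.02, 0.02, -0.05, -0.05])
p_throws = np.array(["R", "R", "L", "L"])   # pitcher handedness (assumed name)
stand = np.array(["R", "R", "L", "L"])      # batter handedness (assumed name)

grid_x = np.linspace(-1.5, 1.5, 7)
grid_z = np.linspace(1.0, 4.0, 7)

# One surface per pitcher/batter handedness combination that has any pitches.
surfaces = {}
for hand_p in "RL":
    for hand_b in "RL":
        mask = (p_throws == hand_p) & (stand == hand_b)
        if mask.any():
            surfaces[(hand_p, hand_b)] = smoothed_run_value(
                px[mask], pz[mask], run_value[mask], grid_x, grid_z
            )
```

The bandwidth controls how far a pitch's influence reaches: small values give a noisy surface, large values blur real features. On a full season of pitches the distance array would be too large to build in one piece, so the grid would have to be processed in chunks.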
This method reproduces some of Sheehan's results:
This continuous approach also gives some additional insights beyond Sheehan's:
Tango and Lichtman made some important comments on the limitations of Sheehan's original work, which did not split the data by swing/take or by pitch type. These critiques apply equally, if not more so, here because I did not split the data by count as Sheehan did.
I hope to address these points in future posts. For example, I assume the peak of negative-to-zero-valued pitches a foot above the center of the zone is mostly the result of 'high heat' fastballs in pitchers' counts. By analyzing the run value of pitch locations for just fastballs in specific counts, I will be able to confirm or reject this assumption.