F/X Visualizations
March 23, 2009
Deconstructing the Fastball Run Value Map
By Dave Allen

In a previous post I presented a map showing the run value of a fastball based on its location. In this post I will examine that map in more depth. Consider the two locations, A and B, in the figure below.

[Figure: fastball run value map with locations A and B marked]

These locations have about the same run value, just below 0, but for different reasons. Taken pitches at location A are called strikes, while taken pitches at location B are called balls. For the two locations to have the same run value, pitches swung at in location A must have, on average, higher run value outcomes than pitches swung at in location B. Not brain surgery so far: swinging at fastballs down the middle is better than swinging at fastballs a foot above the strike zone. We could try to explain the rest of the pattern by intuition in the same manner, but why guess when we have the data to explain it properly? I will present that data in this post.

The run value of a pitch is determined by the outcome of four events.

  1. Whether the batter swings at the pitch.
  2. If no to 1, whether the taken pitch is called a ball or a strike.
  3. If yes to 1, whether the batter makes contact.
  4. If yes to 3, the run value of that contact.
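The four events combine into a single expected run value for a location. Here is a minimal sketch in Python, assuming the sign convention used in this post (a called ball has a positive run value). The function and parameter names are hypothetical and are not taken from the analysis behind the maps.

```python
def expected_run_value(p_swing, p_called_strike, p_contact,
                       rv_contact, rv_whiff, rv_called_strike, rv_ball):
    """Expected run value of a pitch at one location.

    Probabilities are in [0, 1]. All names are illustrative assumptions:
      p_swing          -- chance the batter swings (event 1)
      p_called_strike  -- chance a taken pitch is called a strike (event 2)
      p_contact        -- chance a swing makes contact (event 3)
      rv_contact       -- average run value of contact (event 4)
      rv_whiff, rv_called_strike, rv_ball -- run values of the other outcomes
    """
    # Take branch: the umpire calls the pitch a strike or a ball.
    rv_take = p_called_strike * rv_called_strike + (1 - p_called_strike) * rv_ball
    # Swing branch: the batter either makes contact or misses.
    rv_swing = p_contact * rv_contact + (1 - p_contact) * rv_whiff
    # Weight the two branches by how often the batter swings.
    return p_swing * rv_swing + (1 - p_swing) * rv_take
```

With this decomposition, two locations can share the same overall run value while differing in every component, which is the situation at locations A and B.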

Below I present a series of three images for each handedness combination that show how the outcomes of these four events vary by location for fastballs. Reading left to right:

  • The first image addresses events 1 and 2. The heat map shows the swing percentage by location (event 1). On top of it are three contour lines marking where 75%, 50% and 25% of taken pitches were called strikes (event 2). So if a batter took a pitch inside the smallest circle, it was called a strike over 75% of the time. If he took a pitch in the doughnut between the smallest and middle circles, it was called a strike between 50% and 75% of the time, and so on.
  • The second image addresses event 3, showing the contact percentage of pitches swung at.
  • The final image addresses event 4, showing the run value of a contacted pitch (including foul balls).
At the top of each image is the average value over all locations.
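For readers who want to try something similar, the rates behind the first two images can be approximated by grouping pitches into location bins. The sketch below is an assumption-laden stand-in, not the method used to draw the maps: it uses plain square bins with no smoothing, and the record keys (`x`, `z`, `swing`, `contact`, `strike`) are hypothetical.

```python
from collections import defaultdict

def rates_by_location(pitches, bin_size=0.25):
    """Group pitches into square bins and compute per-bin rates.

    `pitches` is an iterable of dicts with hypothetical keys:
      x, z     -- horizontal and vertical location in feet
      swing    -- True if the batter swung
      contact  -- True if a swing made contact (only read on swings)
      strike   -- True if a taken pitch was called a strike (only read on takes)
    """
    counts = defaultdict(lambda: {"n": 0, "swings": 0, "contacts": 0, "called": 0})
    for p in pitches:
        # Floor division assigns each pitch to the bin that contains it.
        key = (int(p["x"] // bin_size), int(p["z"] // bin_size))
        c = counts[key]
        c["n"] += 1
        if p["swing"]:
            c["swings"] += 1
            c["contacts"] += bool(p["contact"])
        else:
            c["called"] += bool(p["strike"])

    rates = {}
    for key, c in counts.items():
        takes = c["n"] - c["swings"]
        rates[key] = {
            "swing_pct": c["swings"] / c["n"],
            # None where a rate is undefined (no swings or no takes in the bin).
            "contact_pct": c["contacts"] / c["swings"] if c["swings"] else None,
            "called_strike_pct": c["called"] / takes if takes else None,
        }
    return rates
```

Bins with few pitches will give noisy rates, so some form of smoothing or a minimum sample size would be needed before drawing contours from output like this.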

[Figure fa_rr: the three fastball images for RHB vs RHP]

There is a lot going on in this series of images, and they might be intimidating at first. My suggestion is to focus on the leftmost image, spend some time looking at it, and once you understand it move on to the next. Do the same with the middle image before moving on to the rightmost one.

With these images we can better explain the pattern in the overall fastball run value map. Consider location B in the first graph, the area of slightly negative run valued fastballs above the strike zone. Batters swing at pitches in this location over 50% of the time, make contact only around 70% of the time, and the result of that contact is negatively valued. So the swung-at pitches have a strongly negative run value. The taken pitches are almost all called balls (this location is outside the largest strike contour), which have a very high positive run value. The result is the slightly negative value we see in the first graph. Similar explanations can be made for any part of the run value map.

The region of highest swing percentage overlaps with the regions of highest contact percentage and run value of contacted pitches, and the 75% called strike contour, but is not entirely coincident with any of these. This means that hitters are not making entirely optimal swing decisions based on their ability to make contact, the value of that contact or how the strike zone is called. [1]

Contact percentage and run value of contacted pitches both reach their maximum slightly down and in from the center of the zone. But the overall regions of high contact percentage and run value of contacted pitches are not exactly the same. The region of high contact percentage is a diagonal swath from the top-in corner of the zone to the middle of the bottom of the zone. The region of high run value of contacted pitches is a diagonal swath from the bottom-in corner of the zone to the middle of the top of the zone.

Another interesting result is how the called strike zone compares to the rulebook strike zone. The inside and the top of the zone are called fairly well (the 50% contour runs along the rulebook zone on these edges), but the outside edge is shifted away a couple inches (the 75% contour runs along the rulebook zone's outside edge) and the bottom of the zone is shifted significantly up (the 25% contour is ABOVE the bottom edge). In addition, the strike zone is rounded rather than rectangular. These results are not new. John Walsh, David Pinto and Jonathan Hale have each shown all or some of these before, but it is nice to see that my analysis reproduces their results.

[Figure fa_lr: the three fastball images for RHB vs LHP]

For the most part these are quite similar to the righty/righty images. One interesting thing we can address with these images is why RHBs do better against LHPs than RHPs. First, compare the location of the highest swing percentage relative to the strike contours in the RHB vs LHP and RHB vs RHP images. In the RHB vs LHP images it is much more coincident along the horizontal axis, although it is still too high along the vertical axis. That means RHBs are swinging at more pitches in the called strike zone and taking more pitches outside the called strike zone against lefties than righties, which begins to explain their success. In addition, RHBs have a higher contact percentage and higher run value on contacted pitches versus LHPs compared to RHPs. So righties are better at each component of the at-bat against LHPs than RHPs.

[Figure fa_rl: the three fastball images for LHB vs RHP]

These are almost mirror images of RHB vs LHP above and the overall averages are very close. It is interesting to see how the strike zone is called differently to LHBs. The top is called well and the bottom is called very high just like to RHBs. The outside edge is shifted away as it is to RHBs, but that shift is larger with the 75% contour extending outside of the rulebook zone. The inside of the zone is also shifted outside a couple inches (the 25% contour runs along the rulebook edge), which was not the case to RHBs. Walsh and Pinto also observed these results.

[Figure fa_ll: the three fastball images for LHB vs LHP]

While LHBs' success against RHPs is very similar to RHBs' success against LHPs, LHBs fare much worse against LHPs than RHBs do against RHPs. Lefties swing at even more pitches outside the called zone, take more pitches inside the zone and make less and poorer contact against LHPs than RHBs do against RHPs.

Overall I was very surprised to see that in every case the average run value of a contacted fastball is negative. This is probably because I included foul balls in this group, but it is still surprising.

With these images one can understand the fastball run value maps in this post. Now if you go back, look at these maps and see something surprising, you can use the images presented here to understand what is going on.

In future posts I will present similar images for the other pitch types.



1. Brian Cartwright made the following comment in this post:
One idea I never followed thru on is first identify hr% by location (and pitch type and count), as you have done here, then for each hitter (his favorite zones and pitches to go deep) then finally see how well each player recognizes the mashable pitches - what are the swing% for batters when they see a pitch in the best hitting zone? My opinion is that Barry Bonds and Brian Giles hit a high pct of homers because of superior pitch recognition, and putting the bat on the ball when they swung, not because of hitting the ball an extra-ordinary distance.
This suggests an interesting way of evaluating batters: how well does their swing percentage map coincide with their home run rate map, contact percentage map or run value of contacted pitches map? It would be interesting to see if Giles' region of highest swing percentage is more in line with his region of highest run value than that of the average hitter, presented above.

Comments

Dave, I suggest you use different color schemes.
While these are very attractive, it's not easy to decode values, especially for the third one in each row.
I'd like to point you to this web utility that is very helpful in color selection:
http://www.personal.psu.edu/cab38/ColorBrewer/ColorBrewer_intro.html
There is also an R package based on Prof. Cynthia Brewer's palettes (RColorBrewer).

Max,

I will check that out. Thanks for the suggestion.

Dave

Max,

I looked at ColorBrewer and I agree with you. There are some color schemes there that would make these images clearer, particularly the run value image as you point out. My next post will reproduce these images for the other pitch types and I am going to use the existing color schemes. That is partly out of laziness, because I already have the images made, but also to make comparisons across pitches easier.

But in the future I will definitely think about this for my new work. Thanks.