The Same Things
Every pitch has a unique fingerprint that differentiates it from all other pitches. Many factors give a pitch its identity: speed, how much movement it has, the handedness of the batter and pitcher, the location of the pitch, and the sequence of pitches that led to the pitcher throwing it. This week I want to look at how similar different pitches are. Do Brad Lidge and Joe Nathan throw a similar slider? (They don't.) If so, how similar is it? (Not very; Lidge's is similar to Jonathan Broxton's, Nathan's is more like Bobby Jenks'.) If not, what parts are different? (Nathan's is faster and has a bigger pfx_z value, but a smaller pfx_x value.)
Using the pitch classifications from my database, I found the average speed and pfx values for every pitch I had data for. For example, Josh Beckett's fastball has an average speed of 95 MPH, a pfx_x value of -7.4" and a pfx_z value of 8.7". (Pfx_x/z values are how the pitch actually moved relative to a spin-less version of it. They measure, in inches, how far the spin the pitcher put on the ball moved it.) Once I had the average values for all the pitches, I found the z-score for each value, relative to all other pitches. To compare a pitch to Beckett's fastball, I subtracted that pitch's z-score in each category from the z-score of Beckett's fastball and squared the result. This gives the distance between each pitch and Beckett's fastball for each category, and summing those squared differences gives the total difference between Beckett's fastball and the other pitch.
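For anyone who wants to try the calculation, here is a minimal sketch of the distance step described above. It rests on several assumptions: the function names are made up for illustration, the z-scores use the population standard deviation (I haven't specified which kind), and the sample is just a handful of pitches whose averages appear in the tables in this post, so the numbers it produces will not match scores computed against the full database. It also stops at the raw distance and does not convert it to the 0-100 score.

```python
from statistics import mean, pstdev

# (name, pitch type, MPH, pfx_x, pfx_z): averages copied from this post's tables.
# This is a tiny illustrative sample, not the full set of pitches.
PITCHES = [
    ("Brandon Webb", "FB", 88.8, -10.13, 1.94),
    ("Derek Lowe", "FB", 90.3, -10.28, 3.87),
    ("Kameron Loe", "FB", 88.6, -8.73, 3.79),
    ("Shawn Hill", "FB", 89.6, -8.33, 3.80),
    ("Jeremy Accardo", "CH", 86.0, -8.46, 1.97),
    ("Barry Zito", "CB", 70.2, -0.69, -11.48),
    ("Mariano Rivera", "FB", 93.4, 2.72, 7.72),
]


def z_scores(values):
    """Standardize one category (assumption: population standard deviation)."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


def distances_from(target_name, pitches):
    """Sum of squared z-score differences between the target and every pitch."""
    # One column per category: MPH, pfx_x, pfx_z.
    columns = list(zip(*[p[2:] for p in pitches]))
    z_columns = [z_scores(col) for col in columns]
    # Back to one row of z-scores per pitch.
    z_rows = list(zip(*z_columns))
    names = [p[0] for p in pitches]
    target = z_rows[names.index(target_name)]
    return {
        (p[0], p[1]): sum((a - b) ** 2 for a, b in zip(row, target))
        for p, row in zip(pitches, z_rows)
    }


if __name__ == "__main__":
    result = distances_from("Brandon Webb", PITCHES)
    # Smallest distance first, so the most similar pitches lead the list.
    for (name, pitch), dist in sorted(result.items(), key=lambda kv: kv[1]):
        print(f"{name:16s} {pitch}  {dist:.2f}")
```

A smaller distance means a more similar pitch, and the target pitch always has a distance of zero from itself. Because each category is standardized first, a 1 MPH gap and a 1" gap in movement are weighed by how unusual they are rather than by their raw size.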
Derek Lowe relies heavily on his sinker to produce a ton of ground ball outs. Lowe is reputed to have one of the best sinkers in baseball, which I won't argue, but what's the difference between Lowe's sinker and Brandon Webb's? How similar are the two pitches to each other and what other pitches are they similar to? If my similarity scores are measuring what I think they are, Lowe and Webb's sinkers will be most similar to other sinking fastballs, and hopefully will be similar to each other. The table below shows the pitches most similar to each sinker along with the similarity score for each pitch.
Name               Pitch  Throws  MPH   pfx_x    pfx_z  Score
Brandon Webb       FB     R       88.8  -10.13"  1.94"  100
Franquelis Osoria  FB     R       90.8  -9.45"   2.15"  96
Kameron Loe        FB     R       88.6  -8.73"   3.79"  96
Derek Lowe         FB     R       90.3  -10.28"  3.87"  96
Shawn Hill         FB     R       89.6  -8.33"   3.80"  95
Jeremy Accardo     CH     R       86.0  -8.46"   1.97"  95

Name               Pitch  Throws  MPH   pfx_x    pfx_z  Score
Derek Lowe         FB     R       90.2  -10.28"  3.87"  100
Yorman Bazardo     FB     R       89.9  -9.38"   4.89"  97
Jake Westbrook     FB     R       91.1  -8.99"   3.71"  97
Luis Ayala         FB     R       89.6  -8.53"   4.57"  97
Shawn Hill         FB     R       89.6  -8.33"   3.80"  96
Kameron Loe        FB     R       88.6  -8.73"   3.79"  96
Webb's sinker is slightly more distinctive than Lowe's, primarily due to the spin he imparts on the ball (he has the smallest pfx_z number for a fastball and combines it with a large absolute pfx_x value). One cool thing to notice is that the fifth most similar pitch to Webb's sinker is Accardo's changeup. Changeups typically have a smaller pfx_z value than fastballs, sinking more than a fastball thrown by the same pitcher, and Accardo's mirrors Webb's sinker. Overall, though, I would classify the similar pitches in both cases (as well as other similar pitches that fell outside the top five) as sinkers, which gives some confidence that the system is actually finding similar pitches.
Name             Pitch  Throws  MPH   pfx_x   pfx_z    Score
Barry Zito       CB     L       70.2  -0.69"  -11.48"  100
Ted Lilly        CB     L       71.0  -4.34"  -8.95"   92
Sean Marshall    CB     L       73.2  -4.26"  -9.91"   92
Rick VandenHurk  CB     R       71.0  4.47"   -9.79"   90
Jo-Jo Reyes      CB     L       73.3  -2.95"  -7.33"   90
Doug Davis       CB     L       68.4  -5.39"  -8.48"   90
The first thing to realize is that Zito's curve is much more distinctive than either of the two sinkers: the most similar pitch to it scores only 92, while the sinkers' closest matches score 96 and 97. The reason for this is the lack of horizontal movement. Zito throws almost a true 12-to-6 curveball, and as a result a right-handed pitcher's pitch shows up on his list of most similar pitches. I'm not saying that VandenHurk's curve is going to look like Zito's to a batter, but Zito's curve is so unusual that there aren't many pitches similar to it, thrown by either LHP or RHP. Hill's curve doesn't show up at the top of Zito's list because Hill's is thrown faster, has a smaller pfx_z value, and has a larger pfx_x value. Zito's curveball is really a one-of-a-kind pitch.
Speaking of unique pitches, let's talk about Mariano Rivera's cutter. I've been somewhat fascinated with Rivera's cutter since I started working with the PITCHf/x data. For those who might be unaware, despite being a right-handed pitcher, Rivera is hit harder by right-handed batters than by left-handed batters. This is due to the cutter, which moves in on left-handed batters and causes lots of weak contact and broken bats. The list of pitches similar to Rivera's cutter covers a pretty wide selection of pitch types.
Name                Pitch  Throws  MPH   pfx_x   pfx_z  Score
Mariano Rivera      FB     R       93.4  2.72"   7.72"  100*
Jared Burton        FB     R       93.4  1.57"   7.58"  98*
Brandon Medders     SL     R       91.2  2.27"   9.40"  95
Juan Salas          FB     R       90.9  1.02"   8.05"  95*
Jon Lester          FB     L       92.1  4.50"   9.56"  95
Jason Isringhausen  CT     R       90.3  1.69"   7.92"  95
Randy Flores        FB     L       90.0  1.79"   7.41"  95
Jonathan Broxton    CT     R       96.3  1.03"   8.40"  94
Brian Wolfe         CT     R       92.6  -0.39"  6.97"  94
Kevin Cameron       FB     R       91.9  -0.11"  6.64"  94
Again, these aren't necessarily pitches that will look like Rivera's cutter to hitters, but pitches that move like it. A pitcher's release point plays a huge role in what a pitch looks like, but for right now, don't worry about that. Jared Burton's fastball actually looks like a close match to the cutter, but the horizontal movement on Rivera's cutter is the most distinctive aspect of the pitch, and Burton's pitch doesn't come close to matching it. Brandon Medders' slider looks close too, but drops less and is a little slower. The pitches that have similar horizontal movement to the cutter are primarily thrown by left-handed pitchers, with very few pitches thrown by right-handed pitchers having that much movement in to left-handed hitters. The right-handed pitchers with a * next to their score in the list above have reverse splits (right-handed batters hit them better than left-handed ones), but only Burton and Rivera show a reverse split on the pitch in the list. I'm probably reading too much into a sketchy list (that also has sample size problems), but I'm going to keep an eye on Burton.
I think this is a cool way to look at pitches and see similarities that might otherwise have gone unseen. Right now, the similarity scores I'm using are based on how the pitch moves, independent of how the batter perceives it, which isn't the ideal solution. In addition to movement and speed, the sequencing and location of pitches have a large impact on how they are viewed by the batter. For Jamie Moyer's fastball, the two most similar pitches are Cole Hamels' changeup and Johan Santana's changeup. The similarity speaks highly of the movement on Moyer's fastball, but without looking, I would guess that Moyer throws his fastball mostly in situations where Santana and Hamels throw their fastballs, not their changeups. If I can get the similarity scores to reflect how batters view the pitches, the scores will become much more useful.
The pitch I called Burton's fastball could be two different pitches, one of which behaves like a regular 4-seamer and one of which behaves almost exactly like Rivera's cutter. The red cluster in the chart below is what I initially called Burton's fastball, and if you look at the far left of the cluster, you can see a somewhat separate cluster that could be a regular 4-seam fastball, with the cutter occurring more on the right. Without first-hand information about the types of pitches a pitcher throws, I wouldn't be comfortable making a distinction between two such similar groupings, but it looks like this might be something.
I have Burton throwing the cutter around 50% of the time, the 4-seamer 25%, and the slider and changeup making up the other 25%. Justin, do you know if Burton throws his cutter that often?
If you're curious, here are the values of the two cutters. They are pretty much a dead-on match, with Burton's actually having a higher (more "movement") pfx_x value. I would kill for data on Rivera's cutter from when he was at his absolute peak, though, and I wonder if he's lost an inch or two off his cutter since then.
Name MPH, pfx_x, pfx_z