Command Post December 14, 2007
The Same Things

Every pitch has a unique fingerprint that differentiates it from all other pitches. Many factors give each pitch its identity: its speed, how much it moves, the handedness of the batter and pitcher, its location, and the sequence of pitches that led up to it. This week I want to look at how similar different pitches are. Do Brad Lidge and Joe Nathan throw a similar slider? (They don't.) If so, how similar is it? (Not very; Lidge's is similar to Jonathan Broxton's, and Nathan's is more like Bobby Jenks'.) If not, what parts are different? (Nathan's is faster and has a bigger pfx_z value, but a smaller pfx_x value.)

Using the pitch classifications from my database, I found the average speed and pfx values for every pitch I had data for. For example, Josh Beckett's fastball has an average speed of 95 MPH, a pfx_x value of -7.4" and a pfx_z value of 8.7". (The pfx_x and pfx_z values are how the pitch actually moved relative to a spin-less version of it. They measure, in inches, how much movement the pitcher's spin put on the ball.) Once I had the average values for all the pitches, I found the z-score for each value, relative to all other pitches. For each category, I then subtracted the z-score of the pitch I was comparing from the z-score of Beckett's fastball and squared the result. This gives the distance between each pitch and Beckett's fastball for each category, and summing those squared differences gives the total difference between Beckett's fastball and the other pitch.
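As a rough sketch of that calculation, here is one way it could be written in Python. It makes a few assumptions that are not from the original work: it uses the population standard deviation, it standardizes against only the handful of averages quoted in this post (so the z-scores are relative to five pitches, not the full database), and the names `PITCHES`, `z_scores` and `distances_from` are purely illustrative. How the raw distance gets converted to the 0-100 score isn't spelled out above, so the sketch stops at the distance itself (smaller means more similar).

```python
from statistics import mean, pstdev

# Average (MPH, pfx_x, pfx_z) values quoted in this post. A real run would
# include every pitch in the database; this handful only shows the mechanics.
PITCHES = {
    "Beckett FB": (95.0, -7.4, 8.7),
    "Webb FB": (88.8, -10.13, 1.94),
    "Lowe FB": (90.3, -10.28, 3.87),
    "Zito CB": (70.2, -0.69, -11.48),
    "Rivera FB": (93.4, 2.72, 7.72),
}


def z_scores(pitches):
    """Standardize each column (MPH, pfx_x, pfx_z) against all pitches given."""
    columns = list(zip(*pitches.values()))
    means = [mean(col) for col in columns]
    stdevs = [pstdev(col) for col in columns]  # assumption: population std dev
    return {
        name: tuple((v - m) / s for v, m, s in zip(values, means, stdevs))
        for name, values in pitches.items()
    }


def distances_from(target, pitches):
    """Sum of squared z-score differences between target and every pitch."""
    z = z_scores(pitches)
    return {
        name: sum((a - b) ** 2 for a, b in zip(z[target], zs))
        for name, zs in z.items()
    }


if __name__ == "__main__":
    dist = distances_from("Webb FB", PITCHES)
    # Print from most similar (smallest distance) to least similar.
    for name, d in sorted(dist.items(), key=lambda item: item[1]):
        print(f"{name:12s} {d:6.2f}")
```

Because every category is standardized first, a one-standard-deviation gap in speed counts the same as a one-standard-deviation gap in either movement value.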

Derek Lowe relies heavily on his sinker to produce a ton of ground ball outs. Lowe is reputed to have one of the best sinkers in baseball, which I won't argue, but what's the difference between Lowe's sinker and Brandon Webb's? How similar are the two pitches to each other and what other pitches are they similar to? If my similarity scores are measuring what I think they are, Lowe and Webb's sinkers will be most similar to other sinking fastballs, and hopefully will be similar to each other. The table below shows the pitches most similar to each sinker along with the similarity score for each pitch.

```
Name              Pitch  Throws  MPH     pfx_x    pfx_z   Score
Brandon Webb      FB     R       88.8   -10.13"   1.94"   100
Franquelis Osoria FB     R       90.8    -9.45"   2.15"    96
Kameron Loe       FB     R       88.6    -8.73"   3.79"    96
Derek Lowe        FB     R       90.3   -10.28"   3.87"    96
Shawn Hill        FB     R       89.6    -8.33"   3.80"    95
Jeremy Accardo    CH     R       86.0    -8.46"   1.97"    95
```
```
Name            Pitch   Throws  MPH      pfx_x  pfx_z   Score
Derek Lowe      FB      R       90.2   -10.28"  3.87"   100
Yorman Bazardo  FB      R       89.9    -9.38"  4.89"    97
Jake Westbrook  FB      R       91.1    -8.99"  3.71"    97
Luis Ayala      FB      R       89.6    -8.53"  4.57"    97
Shawn Hill      FB      R       89.6    -8.33"  3.80"    96
Kameron Loe     FB      R       88.6    -8.73"  3.79"    96
```

Webb's sinker is slightly more unique than Lowe's, primarily due to the spin he imparts on the ball (he has the smallest pfx_z number for a fastball and combines it with a large absolute pfx_x value). One cool thing to notice is that the fifth most similar pitch to Webb's sinker is Accardo's changeup. Changeups typically have a smaller pfx_z value than fastballs, sinking more than a fastball thrown by the same pitcher, and Accardo's mirrors Webb's sinker. Overall though, I would classify the similar pitches in both cases (as well as other similar pitches that fell outside the top five) as sinkers, giving some confidence that the system is actually finding similar pitches.

I wanted to look at breaking balls too. Just from observing the two, Barry Zito and Rich Hill appear to have very similar curveballs. Let's see what the list says.

```
Name             Pitch  Throws  MPH     pfx_x   pfx_z     Score
Barry Zito       CB     L       70.2    -0.69"  -11.48"   100
Ted Lilly        CB     L       71.0    -4.34"   -8.95"    92
Sean Marshall    CB     L       73.2    -4.26"   -9.91"    92
Rick VandenHurk  CB     R       71.0     4.47"   -9.79"    90
Jo-Jo Reyes      CB     L       73.3    -2.95"   -7.33"    90
Doug Davis       CB     L       68.4    -5.39"   -8.48"    90
```

The first thing to realize is that Zito's curve is much more unique than either of the two sinkers. The reason for this is the lack of horizontal spin. Zito throws almost a true 12-to-6 curveball, and as a result, a right-handed pitcher's pitch shows up on his list of most similar pitches. I'm not saying that VandenHurk's curve is going to look like Zito's to a batter, but Zito's curve is so unique that there aren't many pitches similar to it, thrown by either LHP or RHP. Hill's curve doesn't show up at the top of Zito's list because Hill's is thrown faster, has a smaller pfx_z value, and has a larger pfx_x value. Zito's curveball is really a unique pitch.

Speaking of unique pitches, let's talk about Mariano Rivera's cutter. I've been somewhat fascinated with Rivera's cutter since I started working with the pitch f/x data. For those who might be unaware, despite being a right-handed pitcher, Rivera is hit harder by right-handed batters than left-handed batters. This is due to the cutter, which moves in on left-handed batters and causes lots of weak contact and broken bats. The list of pitches similar to Rivera's cutter has a pretty wide selection of pitches.

```
Name               Pitch  Throws  MPH    pfx_x   pfx_z   Score
Mariano Rivera     FB     R       93.4    2.72"  7.72"   100*
Jared Burton       FB     R       93.4    1.57"  7.58"    98*
Brandon Medders    SL     R       91.2    2.27"  9.40"    95
Juan Salas         FB     R       90.9    1.02"  8.05"    95*
Jon Lester         FB     L       92.1    4.50"  9.56"    95
Jason Isringhausen CT     R       90.3    1.69"  7.92"    95
Randy Flores       FB     L       90.0    1.79"  7.41"    95
Jonathan Broxton   CT     R       96.3    1.03"  8.40"    94
Brian Wolfe        CT     R       92.6   -0.39"  6.97"    94
Kevin Cameron      FB     R       91.9   -0.11"  6.64"    94
```

Again, these aren't necessarily pitches that will look like Rivera's cutter to hitters, but pitches that move like it. A pitcher's release point plays a huge role in what a pitch looks like, but for right now, don't worry about that. Jared Burton's fastball actually looks like a close match to the cutter, but the horizontal movement of Rivera's cutter is the most unique aspect of the pitch, and Burton's pitch doesn't come close to matching it. Brandon Medders' slider looks close too, but drops less and is a little slower. The pitches that have similar horizontal movement to the cutter are primarily thrown by left-handed pitchers, with very few pitches thrown by right-handed pitchers having that much movement in to left-handed hitters. The right-handed pitchers with a * next to their score in the list above have reverse splits (right-handed batters hit them better than left-handed ones), but only Burton and Rivera show a reverse split on the pitch in the list. I'm probably reading too much into a sketchy list (that also has sample size problems), but I'm going to keep an eye on Burton.

I think this is a cool way to look at pitches and see similarities that might have otherwise gone unseen. Right now, the similarity scores I'm using are based more on how the pitch moves, independent of how the batter perceives it, which isn't the ideal solution. In addition to just the movement and speed, the sequencing and location of pitches has a large impact on how they are viewed by the batter. For Jamie Moyer's fastball, the two most similar pitches are Cole Hamels' changeup and Johan Santana's changeup. The similarity speaks highly to the movement on Moyer's fastball, but without looking, I would guess that Moyer throws his fastball mostly in situations where Santana and Hamels throw their fastballs, not their changeups. If I can get the similarity scores to reflect how batters view the pitches, the scores will become much more useful.
---------------------------------------------------------

12/18 Update:
Here's what I've got with Burton...

The pitch I called his fastball could be 2 different pitches, one of which behaves like a regular 4-seamer and one of which behaves almost exactly like Rivera's cutter. The red cluster in the chart below is what I initially called Burton's fastball and if you look at the far left of the cluster, you can see a somewhat separate cluster that could be a regular 4-seam fastball, with the cutter occurring more on the right. Without having first-hand information about the types of pitches a pitcher throws I wouldn't be comfortable making a distinction between 2 such similar groupings, but it looks like this might be something.
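For anyone who wants to try this kind of split themselves, here is a minimal sketch of one way to separate a single "fastball" group into two by horizontal movement. It is not the method behind the chart: it is a plain one-dimensional two-means split on pfx_x, the function name `two_means_1d` is made up for illustration, and the sample readings are invented numbers, not Burton's actual pitches.

```python
from statistics import mean


def two_means_1d(values, iterations=20):
    """Split values into two groups around two centers (1-D k-means, k=2)."""
    low_center, high_center = min(values), max(values)
    low, high = [], []
    for _ in range(iterations):
        # Assign each value to whichever center it is closer to.
        low = [v for v in values if abs(v - low_center) <= abs(v - high_center)]
        high = [v for v in values if abs(v - low_center) > abs(v - high_center)]
        if not low or not high:
            break  # everything landed in one group; nothing to split
        new_low, new_high = mean(low), mean(high)
        if (new_low, new_high) == (low_center, high_center):
            break  # centers stopped moving
        low_center, high_center = new_low, new_high
    return low, high


# Made-up pfx_x readings (inches) for illustration only, not Burton's data.
sample_pfx_x = [-4.1, -3.6, -3.2, -2.9, 2.1, 2.6, 2.9, 3.0, 3.3, 3.4]
four_seam_like, cutter_like = two_means_1d(sample_pfx_x)
print(len(four_seam_like), "pitches on the left,", len(cutter_like), "on the right")
```

A split like this will always return two groups if the values differ at all, so it can't tell you whether there really are two pitches. That still takes first-hand knowledge of what the pitcher throws.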

I have Burton throwing the cutter around 50% of the time, the 4-seamer 25%, and the slider and changeup being the other 25%...Justin, do you know if Burton throws his cutter that often?

If you're curious, here are the values of the 2 cutters...pretty much a dead-on match, with Burton's actually having a higher (more "movement") pfx_x value. I would kill for data on Rivera's cutter when he was at his absolute peak though, and I wonder if he's lost an inch or two off his cutter since then.

Name, MPH, pfx_x, pfx_z
Burton, 93.50, 2.92, 7.94
Rivera, 93.35, 2.72, 7.72

Really neat stuff--I can see this type of work eventually being worked into a PECOTA-like system to really improve our player projections.

With respect to Burton, I asked John Walsh to run some pitchf/x profiles a few months back, which I profiled in this post. He pointed out the qualitative similarities between Burton and Rivera's cutters. Pretty neat to see how similar they came out in your quantitative analysis. All warnings about small sample sizes are well-taken, but as a Reds fan it's hard not to feel a prickle of excitement about him.
-j

j

I'm excited to follow Burton too. I took a quick look at his stats when I was writing about him (I didn't know who he was) and was impressed. I love finding these oddities/possible gems about different players in the pitch fx data.

Interesting and good post!

This is fantastic! It seems like with time, if you could find correlations between certain types of pitches and pitcher success, then you could use this tool to help evaluate prospects.

Wouldn't it be nice to know that the young kid in your system has a cutter similar to Rivera? Or a sinker similar to Webb?

I took a quick look at his stats when I was writing about him (I didn't know who he was) and was impressed.

The big warning flag with him is his walk rate, but it got much better over the course of the season. It'll be interesting to see what happens with him in '08.
-justin

Xlnt. Once the fingerprinting has been identified, calculating how well batters (both individually and in the aggregate) perform on certain types of pitches would be a logical next step (for advanced scouting and player evaluation).

I'm stating the obvious here, but it's not just the movement that makes a certain pitcher elite. In Rivera's case, it's the movement of his cutter combined with his control and how easily he can repeat his motion, delivering the same quality pitch time after time. That ability to repeat is as much the reason for success as is the cutter.

Very nice, Joe.

Great work. I'm just puzzled why Rivera's is classed as a fastball and Broxton's as a cutter. Does it matter?

GeneralBlu,

It's just a convention because of the way I label pitches. Usually, I label the fastest/most frequently thrown pitch as a pitcher's fastball, and base the other names off of the fastball. Broxton has a fastball, slider and something else that I call a cutter. Rivera has his cutter and virtually nothing else (he throws a 4-seamer occasionally) so the cutter gets labeled as a fastball.

I think the names of pitches aren't as important as what they do. If you were labeling sinkers, I don't think there is a line that can be drawn between a sinker and a non-sinking fastball. Although, by naming the pitches, I think you are able to get some information about the general intent of the pitch, such as fastballs vs. offspeed in different situations.

Joe, it is important because delivery, grip, and arm speed are a big part of the signal (and deception) to a hitter.

It may not be crucial on every pitch, but the pitch's end result is often not what the hitter is actually reacting to.

Hi Joe,

I got this question in an email from BLee about your study and thought I'd copy it over here (with minor edits for clarity, noted by brackets), as I thought it was a great question:

[Justin makes] the point in [his] initial Burton-Rivera article that Burton seems to throw two different types of fastballs, presumably a two-seamer and a cutter. Did Sheehan separate the two types of fastballs in his analysis? You initially noted that Burton throws far more "regular" fastballs than Rivera, who relies almost exclusively on the cutter. Sheehan notes that the horizontal break on Rivera's cutter is what separates it from anybody else's. If he didn't separate Burton's two types of fastballs, the "regular" fastballs on the negative side of the graph would bring down his overall average pfx_x in Sheehan's analysis. Is it possible that Rivera and Burton's cutters are even more alike than we think?

Thanks,
Justin

Justin

I don't have my spreadsheet/pitch type graphs at work, so I don't know exactly how I separated the pitches out. Josh Kalk (who's usually pretty similar to me for pitch types) has Burton with a fastball, slider and change, although he doesn't label Burton's fastball as a cutter, which is something he does in other cases. The link for Kalk's take on Burton is below, and I'll check out what I think of Burton when I get home.

http://baseball.bornbybits.com/plots/Jared_Burton.html

I put an update at the end of the article so I could post Burton's pitch chart.

A quick comment about z-scores... they only make sense if the underlying data are normal. Have you checked whether the underlying data are normal? I thought about this sort of approach for computing similarity scores for hitters, but things like SLG and OBP are not normal (and why would they be given that the lower end of the distribution gets shipped off to AAA). However, it might be sensible to do this in this instance, if pitch movements and velocities follow a normal distribution (one can check this via tests of normality).
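As a hedged illustration of what such a check could look like, here is a small sketch of the Jarque-Bera statistic, which combines sample skewness and excess kurtosis into a single number. The function name `jarque_bera` is just a label for this example, and the input is only the eight sinker speeds from the two tables in the post, which is far too few values for the test to mean anything. It shows the mechanics, not a result.

```python
from statistics import mean


def jarque_bera(values):
    """Return (skewness, excess kurtosis, JB statistic) for a sample."""
    n = len(values)
    m = mean(values)
    # Central moments of the sample (dividing by n, as the JB test does).
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("all values are identical; moments are undefined")
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skew ** 2 + excess_kurt ** 2 / 4.0)
    return skew, excess_kurt, jb


# Average sinker speeds (MPH) from the two sinker tables in the post.
sinker_mph = [88.8, 90.8, 88.6, 90.3, 89.6, 89.9, 91.1, 89.6]
skew, excess_kurt, jb = jarque_bera(sinker_mph)
print(f"skew={skew:.2f} excess kurtosis={excess_kurt:.2f} JB={jb:.2f}")
```

For large samples, a JB value above roughly 5.99 would reject normality at the 5% level (chi-square with 2 degrees of freedom). That approximation is poor for small samples, so the full set of pitch averages would be needed to say anything useful.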

Justin, do you know if Burton throws his cutter that often?

Unfortunately, I can't give much scouting help on him. I think I've only seen him pitch once or twice, as Reds games weren't on the tube much out here in Arizona in the second half. And even if I'd seen him more, my ability to identify pitches is miserable.

Thanks for taking the time to look at him though! I'll be watching him much more closely next year.
-j

Thanks for the explanation, Joe. I understand that this is the perfect illustration of the gist of your article. I just couldn't grasp that a 96 mph pitch wasn't the guy's main fastball. Broxton throws hard!