F/X Visualizations
April 17, 2009
What Did We Know This Time Last Year?
By Dave Allen

This early in the season the leader and laggard boards often have some interesting names, and it is fun to theorize which of these are legitimate breakouts (or breakdowns) and which are small-sample-size flukes. The pitchf/x data adds a powerful tool to help with this classification. It allows us to look deeper into why a pitcher may have struggled or succeeded in a start. We have already seen some great analysis along these lines. RJ Anderson has a series of posts looking at Lincecum's, Sabathia's and Wheeler's performances thus far based on pitch speed, movement, and release point. River Avenue Blues broke down Wang's first two games to see what might be up.

These are good examples of using all the data pitchf/x offers to assess recent performance. Of course, what often happens is that people just look at fastball speed and ignore movement, location, and release point data. For example, after Cole Hamels's first poor start everyone focused on his 86 mph fastball, but, as Hamels said himself, he started off with a fastball in the mid-80s early last year too. The image below shows Hamels's average fastball speed by start. The x-axis is not scaled by date, but by start (so no matter how far apart in time two consecutive starts are, they are always the same distance apart along the x-axis). The division between seasons is marked with a red line.

hamels_sp_start.png

Hamels's fastball speed is right where it was last year (not to say that we should be worry free about Hamels; last year he pitched 261 innings after just 189 in 2007). This provides a useful way to see if a pitcher's speed is within his normal variation. Consider Wang:

wang_sp_start.png

His fastball in his injury-shortened 2008 was 2 mph slower than his fastball in 2007. For his first two starts of 2009 it is in the low range of his already low 2008 numbers. That could mean trouble.

As I noted earlier, the best pitchf/x analysis will take into account all the data, but most people will be lazy. Like I just did, they will look at just fastball speed. So I wanted to know how much we could learn by looking only at that. More specifically, what can we say about performance for the rest of the season by looking just at fastball speed thus far into the season? I looked back at last year to find out. Most starters have started two games with about 100 pitches per start, about half of them fastballs. So what can we know with 100 fastballs' worth of data?

I started off with the average speed of every pitcher's first 100 fastballs in 2008 and compared that with his average fastball speed for all of 2007. I wanted to see how well each pitcher performed from that point forward, so I found his FIP for the rest of the 2008 season, starting with the game after the one in which he threw his 100th fastball. (FIP stands for fielding independent pitching. Developed by Tangotiger, it roughly gives the expected ERA of a pitcher if he pitched in front of an average defense.) From that I subtracted the pitcher's preseason CHONE projected FIP. (CHONE, created by Sean Smith, is one of the best projection systems.) The result is how the pitcher performed over the rest of the season relative to his projection. Here are the players with the biggest increases and decreases in fastball speed.

The second column is how much faster (or slower) the player's first 100 fastballs of 2008 were compared to his 2007 fastballs. A positive number means a faster fastball in 2008. The third column is FIP minus projected FIP. As with ERA, a low FIP is good, so a negative difference means the pitcher outperformed his projection.
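For readers who want to try something similar, here is a minimal sketch of how the two numeric columns could be computed. It is not the code behind this post. The FIP function uses the commonly published form of the formula, and the post does not say which exact variant or league constant was used, so the 3.20 constant is an assumption. The function names and all input numbers are made up for illustration.

```python
from statistics import mean

def speed_dif(first_fastballs_2008, all_fastballs_2007, n=100):
    """Mean speed of the first n 2008 fastballs minus the 2007 mean (mph)."""
    return mean(first_fastballs_2008[:n]) - mean(all_fastballs_2007)

def fip(hr, bb, hbp, k, ip, constant=3.20):
    """Commonly published FIP formula; `constant` is an assumed league constant."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant

# Made-up inputs for one hypothetical pitcher, not real data.
speeds_2008 = [92.0] * 100 + [88.0] * 40   # only the first 100 are used
speeds_2007 = [91.0] * 500
rest_of_season_fip = fip(hr=15, bb=50, hbp=5, k=150, ip=180.0)
projected_fip = 3.80                        # stand-in for a CHONE projection

print(round(speed_dif(speeds_2008, speeds_2007), 2))   # second column
print(round(rest_of_season_fip - projected_fip, 2))    # third column
```

In a real version the speed lists would come from the pitchf/x pitch records, filtered to fastballs and sorted by date, and the FIP inputs would cover only the games after the 100th fastball.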

+---------------------+--------------+----------------+
| Name                | FB speed dif | FIP - proj FIP |
+---------------------+--------------+----------------+
| Ervin Santana       |         2.28 |          -1.16 |
| Tim Lincecum        |         1.65 |          -0.83 |
| Josh Beckett        |         1.36 |          -0.45 |
| John Maine          |         1.07 |           0.10 |
| Santiago Casilla    |         1.06 |           0.92 |
| Wandy Rodriguez     |         0.96 |          -0.84 |
| Manny Delcarmen     |         0.89 |          -0.86 |
| Wilfredo Ledezma    |         0.82 |          -0.05 |
| Shaun Marcum        |         0.79 |          -0.26 |
| Leo Nunez           |         0.77 |           0.05 |
+---------------------+--------------+----------------+
| Francisco Rodriguez |        -2.34 |           0.05 |
| Mike Mussina        |        -2.34 |          -1.37 |
| Daniel Cabrera      |        -2.49 |           0.82 |
| Brad Lidge          |        -2.51 |          -1.10 |
| Jeff Suppan         |        -2.61 |           0.80 |
| Oliver Perez        |        -2.81 |           0.21 |
| Chris Young         |        -3.42 |           0.41 |
| Bob Howry           |        -3.89 |           0.84 |
| Cole Hamels         |        -3.90 |           0.15 |
| Heath Bell          |        -4.01 |           0.30 |
+---------------------+--------------+----------------+
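One quick way to read the table is to count how often the sign of the speed change lines up with the sign of the performance difference. A minimal sketch, with the number pairs transcribed from the table above (the variable names are just for illustration):

```python
# (FB speed dif, FIP - proj FIP) pairs transcribed from the table above.
faster = [(2.28, -1.16), (1.65, -0.83), (1.36, -0.45), (1.07, 0.10),
          (1.06, 0.92), (0.96, -0.84), (0.89, -0.86), (0.82, -0.05),
          (0.79, -0.26), (0.77, 0.05)]
slower = [(-2.34, 0.05), (-2.34, -1.37), (-2.49, 0.82), (-2.51, -1.10),
          (-2.61, 0.80), (-2.81, 0.21), (-3.42, 0.41), (-3.89, 0.84),
          (-3.90, 0.15), (-4.01, 0.30)]

# A negative FIP difference means the pitcher beat his projection.
beat_projection = sum(1 for _, fip_dif in faster if fip_dif < 0)
missed_projection = sum(1 for _, fip_dif in slower if fip_dif > 0)

print(beat_projection, "of", len(faster), "faster pitchers beat their projection")
print(missed_projection, "of", len(slower), "slower pitchers missed their projection")
```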

Although there is considerable variation, seven of the ten pitchers with the largest increases in fastball speed outperformed their projections, and eight of the ten with the largest decreases underperformed theirs. In addition, the top two were two of the biggest breakout pitching performances of last year, and you could have seen it just 100 fastballs into the season. Of course, the trend is not perfect: 100 fastballs into the season Brad Lidge, Mike Mussina, Hamels and Francisco Rodriguez were way below their 2007 averages, and they all had great seasons (although Hamels's and Rodriguez's performances were slightly worse than projected). Here are the results for all players.

fip_sp.png

The relationship is highly significant (p < .01), but explains little of the variation (r^2 = 0.05). The equation for the best-fit line is y = -0.24 - 0.15x, where x is the difference in fastball speed (first 100 '08 fastballs minus '07 fastballs) and y is remaining '08 FIP minus projected FIP. So an increase of one mph is worth a 0.15 decrease in FIP (and each decrease of one mph is worth a 0.15 increase in FIP). Also, if a pitcher is throwing just as fast in his first 100 fastballs of the season as he did all of last season (x = 0), you would expect him to outperform his projection by almost 0.25 runs. If you thought going into the season that he was a 4.00 FIP (or ERA) pitcher, and his first 100 fastballs are just as fast as his fastballs the year before, you would expect him to be roughly a 3.75 FIP (or ERA) pitcher. But there is so much unexplained variation (95%, in fact) that this pitcher could end up performing very well or very poorly.
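To make the arithmetic concrete, here is a small sketch that plugs numbers into the fitted line. The intercept and slope come straight from the equation above; the function names are made up for illustration. With an r^2 of 0.05 the output is only the center of a very wide range, not a forecast to lean on.

```python
INTERCEPT = -0.24   # expected FIP minus projection when speed is unchanged
SLOPE = -0.15       # change in FIP per mph of fastball speed difference

def expected_fip_vs_projection(speed_dif_mph):
    """Fitted value of (rest-of-season FIP - projected FIP)."""
    return INTERCEPT + SLOPE * speed_dif_mph

def adjusted_fip(projected_fip, speed_dif_mph):
    """Projection shifted by the fitted line."""
    return projected_fip + expected_fip_vs_projection(speed_dif_mph)

# A 4.00 projected pitcher with no change in speed, then one who lost 2 mph.
print(round(adjusted_fip(4.00, 0.0), 2))
print(round(adjusted_fip(4.00, -2.0), 2))
```

The first call gives 3.76, which the text rounds to 3.75. The second gives 4.06, because the 0.30 penalty for losing 2 mph slightly outweighs the -0.24 intercept.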

So, although the trend is significant, there is so much unexplained variation I would say with just the speed of the first 100 fastballs we don't know that much more than before. But that will not stop me from posting this season's leaders and laggards in fastball speed difference. Some of the pitchers have not reached the 100 fastball cutoff used in the above analysis. Remember someone at the top of the list could end up with very poor performance relative to projection, like Santiago Casilla last year. A pitcher at the bottom could end up like Mussina.

Greatest difference between '09 fastball speed thus far and '08 fastball speed

+-------------------+--------+--------+
| Name              | Number |    Dif |
+-------------------+--------+--------+
| Todd Coffey       |     61 |   1.93 |
| Justin Verlander  |    119 |   1.81 |
| Kevin Correia     |    109 |   1.23 |
| Jonathan Sanchez  |     74 |   1.14 |
| Josh Johnson      |    163 |   1.14 |
| Matt Albers       |     55 |   1.13 |
| Chris Volstad     |    117 |   1.09 |
| Adam Eaton        |     55 |   1.09 |
| Armando Galarraga |     97 |   0.98 |
| Jason Marquis     |    105 |   0.94 |
+-------------------+--------+--------+
| Geoff Geary       |     63 |  -2.04 |
| Matt Harrison     |     59 |  -2.05 |
| Daniel Cabrera    |    131 |  -2.25 |
| Manny Delcarmen   |     68 |  -2.26 |
| Oliver Perez      |    126 |  -2.39 |
| Joe Saunders      |    128 |  -2.44 |
| Daisuke Matsuzaka |     62 |  -2.44 |
| Hideki Okajima    |     55 |  -2.66 |
| Dana Eveland      |     91 |  -2.88 |
| Dennis Sarfate    |     67 |  -3.12 |
+-------------------+--------+--------+

With all the caveats I will still venture that the pitchers at the top of the list, as a whole, out-perform their projections and the pitchers at the bottom under-perform. It will be interesting to see if any of the names on the top of this list turn out to be this season's Tim Lincecum or Ervin Santana.

Sorry this post was a little light on visualizations. I promise my next post will make up for it.

Comments

Excellent read Dave - especially love the visualization of the last graph

Cool stuff. Did you re-classify any pitches? Do you get similar results with end speeds?

The Wang article @ RAB was shaky, IMO - lots of talk about release points and movement without even addressing the alignment issues in Yankee Stadium in 2008.

Man, Dave, your stuff is always interesting and well presented. Can't wait till the next post!

Shouldn't players who spent a significant amount of time on the DL be omitted from the study to correct for a selection bias?

I can see pitchers who spent time on the DL last season coming back into 09 throwing harder only because their injury has been corrected. This could work in the opposite way as well, a pitcher like Matsuzaka lands on the DL after a couple starts where he struggles to hit 90.

I think this would have more value with injuries removed, as then going forward you could point to a drop in FB velocity and have somewhat of an either or proposition (hurt or going to do poorly).

This could be very interesting stuff, seeing as when a pitcher drops their arm angle they get more movement at the sacrifice of sheer speed.

Interesting stuff. I'd say I can't wait to see the end results, but that would mean the season is over.

Jason and Snowball2 (nice to see a fellow Simpsons fan), thanks for the kind words.

Harry, I am using the provided pitchf/x pitch classifications. I am still looking into and thinking about pitch reclassification. But that is a very good point, especially since the classification algorithm changed this year. So fastball speed changes could be a result of fastballs being classified differently this year compared to last year.

Steve, that is a good suggestion. I did try to take out all guys switching between the bullpen and rotation (and vice versa), but I didn’t think of omitting players coming off of DL stints. Do you, or anyone else for that matter, know of a DL database?

What effect do you think the WBC and longer spring training might have?

Good work, Dave.

DL database linked in my name.

Anonymous,

I am not sure about longer spring training, but I looked at all pitchers who participated in the WBC this year. The WBC pitchers' fastballs so far this year have been 0.76 mph slower on average than their 2008 fastballs. The non-WBC pitchers' fastballs so far this year have been 0.88 mph slower. The difference, however, is not statistically significant (p=0.43).

Vegas Watch,
thanks for the link.