The Baseball Analysts: I've Seen That Before

By Jeremy Greenhouse

While a pitcher's stuff diminishes over the course of a game, the effects I found were relatively small. So why do batters gain an edge over pitchers as the game goes on? Well, baseball is a game of adjustments. Batters get their timing down and start picking up the ball out of the pitcher's hand. All that good stuff.

The first time a batter faces a curveball, he might be caught off-guard. That’s why pitchers throw predominantly fastballs the first time through the order. And that’s why batters do so well the third time they face a pitcher. They’ve seen most of his repertoire, and are able to recognize the curve. As the saying goes, “Fool me once, shame on you. Fool me…you can’t get fooled again.”

First, here is the average run value per 100 pitches based on the number of times a batter has seen a given type of pitch. I include all data points for which I have approximately 1,000 pitches.

For reference:
F2: Sinker/Two-Seam Fastball
F4: Four-Seam Fastball
CB: Curveball
SL: Slider
CH: Changeup
FC: Cut Fastball
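
As a rough illustration of the tabulation described above, here is a minimal Python sketch. It is not the code used for this article. The record layout (game, pitcher, batter, pitch type, run value), the function name, and the toy numbers are all assumptions made for the example. The sign convention, where a positive run value favors the batter, is my reading of the text. The 1,000-pitch cutoff mirrors the approximate threshold mentioned above.

```python
from collections import defaultdict


def rv100_by_times_seen(pitches, min_pitches=1000):
    """Average run value per 100 pitches, keyed by (pitch_type, nth time seen).

    `pitches` is an iterable of (game_id, pitcher_id, batter_id, pitch_type,
    run_value) tuples in chronological order. This layout is hypothetical.
    """
    seen = defaultdict(int)      # times this batter has seen this pitch type from this pitcher in this game
    totals = defaultdict(float)  # summed run value per (pitch_type, n)
    counts = defaultdict(int)    # number of pitches per (pitch_type, n)

    for game_id, pitcher_id, batter_id, pitch_type, run_value in pitches:
        matchup = (game_id, pitcher_id, batter_id, pitch_type)
        seen[matchup] += 1
        bucket = (pitch_type, seen[matchup])
        totals[bucket] += run_value
        counts[bucket] += 1

    # Keep only buckets with enough pitches, then scale to 100 pitches.
    return {
        bucket: 100.0 * totals[bucket] / counts[bucket]
        for bucket in counts
        if counts[bucket] >= min_pitches
    }


# Toy data (made up) just to show the shape of the result.
toy = [
    ("g1", "p1", "b1", "CH", -0.05),
    ("g1", "p1", "b1", "F4", 0.02),
    ("g1", "p1", "b1", "CH", 0.10),
    ("g1", "p1", "b2", "CH", -0.01),
]
table = rv100_by_times_seen(toy, min_pitches=1)
```

With the real threshold in place, a bucket such as the 12th fastball only appears if roughly 1,000 such pitches exist, which is why the curves stop at different points for different pitch types.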

This chart indicates that a batter facing a fastball from the same pitcher for the 12th time will perform better than a batter facing a pitcher's first fastball. Chances are, however, that batters who face 12 fastballs are better than those who only face a few. One way to get around this bias might be to take the difference in run value between the 11th fastball and 12th fastball. This method, called the delta method, allows you to compare apples to apples as each change in measurement is at least composed of players from the same sample. This produced the following chart:

The magnitude of the results is enormous, if the results are to be believed. A batter facing a changeup for a fifth time is expected to perform over five runs per 100 pitches better than he performs the first time he sees the changeup. That's pretty much the difference between the best and worst hitter in the league. Unfortunately, I have to say that I don't think the delta method is the way to go here, and I'm not sure how to fix my sampling problems.

Batters who face at least three changeups have an rv100 of 0.2 on the third changeup, but they only have an rv100 of -1.1 on the second change. This is a delta of 1.3 runs. Meanwhile, batters who face at least four changeups have an rv100 of -1.3 runs on the third change and 0.3 on the fourth, another huge delta of 1.6 runs. This would mean that batters perform roughly three runs per 100 pitches better (1.3 + 1.6 = 2.9) on the fourth changeup they see than on the second. The oddity here is that batters who face at least three changeups are above average on the third changeup, but batters who face at least four changeups are well below average on the third changeup. I think what this means is that once pitchers get burned on a given pitch, they quit throwing it to that batter the rest of the game. I don't know how to solve for these biases.
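
To make the arithmetic concrete, here is a minimal Python sketch of the cumulative delta calculation, using only the changeup figures quoted above. The function name and the input layout (one pair of rv100 values per "saw at least n pitches" group) are assumptions for illustration, not the code behind the chart.

```python
def cumulative_deltas(groups):
    """Running sum of within-group changes in rv100.

    `groups` is a list of (rv100 on pitch n-1, rv100 on pitch n) pairs,
    one pair per group of batters who saw at least n pitches, in order of n.
    Each delta compares the same batters with themselves.
    """
    running_total = 0.0
    result = []
    for previous_rv100, current_rv100 in groups:
        running_total += current_rv100 - previous_rv100
        result.append(round(running_total, 1))
    return result


# Changeup figures quoted in the text:
#   saw >= 3 changeups: second = -1.1, third = 0.2
#   saw >= 4 changeups: third = -1.3, fourth = 0.3
changeup_groups = [(-1.1, 0.2), (-1.3, 0.3)]
print(cumulative_deltas(changeup_groups))  # [1.3, 2.9]
```

Note that the third changeup enters twice with two different values (0.2 and -1.3), because the two groups are not the same set of batters. That mismatch is exactly the sampling problem described above.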

I went on and produced the same two charts, except this time at the at-bat level instead of the game level.

Batters who face seven fastballs in an at-bat are good, in that they are able to work the count. Meanwhile, pitchers who throw five sliders in an at-bat are good, in that they are either ahead in the count or can locate their breaking balls.

Using the delta method:

No pitch gains in effectiveness after it's been thrown once already in an at-bat. This finding was applicable at the game level as well. However, there are differences between the at-bat and game level. Off-speed pitches such as the changeup and curveball lose more value than fastballs during the game, given an even distribution of pitches. But in an at-bat, off-speed pitches do not lose as much effectiveness as fastballs when they're repeatedly thrown. It makes sense to me that changeups are the worst pitch to show multiple times to the same batter throughout the game, since the success of changeups is built on deception. Yet I'm not sure why changeups don't lose as much effectiveness in an at-bat once thrown multiple times as fastballs do. I think it has something to do with the count in which they're thrown and the theory of the out pitch.

Comments

Have you researched any other type of data that concludes a changeup being more effective than a fastball when repeatedly thrown? Really interesting what you have here.

Posted by: Danny at January 21, 2010 1:14 AM

For some reason, the charts did not come through for me. However, had to laugh. . .nice W reference.

Posted by: Mark at January 21, 2010 6:15 AM

Very interesting research. Could you add confidence intervals around your data points? That would give us a better idea about how relevant the averages are.

Posted by: Bjoern at January 21, 2010 9:14 AM

Without looking at any numbers, I'm willing to guess that for most pitchers

a) fastball velocity decreases throughout a game
b) smaller differences in velocity between the fastball and a pitcher's other pitches lead to decreased effectiveness (within range & reason)

Maybe including this as a variable will help discern how much of the difference is fatigue versus pitch selection?

Posted by: Lou at January 21, 2010 10:53 AM

Lou,

He's studied this before, if you look back at some of the older articles. And good luck disentangling that from the effect of the batters getting used to the pitches.

Jeremy,

This is absolutely fascinating work. It's incredible to see what you've come up with over the last few weeks... And it makes me shudder to think what others, working for teams full time, are coming up with. Imagine if this were your full time job. :)

Good stuff!

Posted by: Patrick at January 21, 2010 12:01 PM

Danny, I'm not sure what you mean by type of data. This is the first time I've done research in this type of pitch sequencing.

Mark, I don't know what's up with the charts not coming through for some people, or why Google Reader doesn't show Google Charts. I don't know what to do about that.

Bjoern, I don't have the raw data handily available to calculate confidence intervals. But I wouldn't trust the results too much whether they're statistically significant or not.

Patrick, thanks for the comment.

Posted by: Jeremy Greenhouse at January 21, 2010 12:12 PM

Jeremy,

Can you tell us a bit more about the delta method?

I assume each point shows us the change in expected runs for the 1st and 2nd pitch, including only data for which hitters saw two pitches. But I don't believe the numbers. Are you sure you're not normalizing by something? That is, are you sure the Y axis is the expected change in runs per 100 ABs?

Posted by: cdm at January 21, 2010 1:07 PM

Chris,

I'm not entirely sure what you're asking. I'll do my best to explain what I did.

For all batters who saw at least two pitches, I found the average run value for the first and second pitches and multiplied by 100. For all batters who saw at least three pitches, I found the average run value for the second and third pitches and multiplied by 100. And so on. The charts express the cumulative deltas. So the first point represents the difference between pitch one and pitch two, and the second point represents that difference plus the difference between pitch two and pitch three. That's it.

Posted by: Jeremy Greenhouse at January 21, 2010 1:17 PM

Thanks Jeremy,

I'm just trying to interpret the delta plots, and I find that I can't. There are two things I can't reconcile with the graphs: (1) I would expect the derivative to be smaller than the values in the plots above them; (2) the plots of RV/100 show positive and negative slopes, so the derivative should have both positive and negative values.

The method you described sounds right. Are you sure there isn't a small glitch somewhere in the analysis that produced the delta plots? Or am I missing something?

Posted by: cdm at January 21, 2010 2:08 PM

Chris, here's what my numbers showed:

Batters who face at least three changeups have an rv100 of 0.2 on the third changeup, but they only have an rv100 of -1.1 on the second change. This is a delta of 1.3 runs. Meanwhile, batters who face at least four changeups have an rv100 of -1.3 runs on the third change and 0.3 on the fourth, another huge delta of 1.6 runs. So the first graph shows the average run value on the third and fourth changeups to be 0.2 and 0.3 runs. That's a small derivative. The second graph shows the change in run values to be 1.3 and 1.6 runs. Those are very large. I think the reason I didn't get any negative values is that pitchers stop throwing a pitch type once they get burned by it. So if a batter hits a homer off a slider, the pitcher probably wouldn't give up a homer if he threw another slider, but he's become scared to throw it, or he's been pulled from the game. Anyway, that's the theory I'm working with. I tried to express that I don't have much confidence in the results due to the sampling problems, but I am confident that the numbers I produced are accurate.

Posted by: Jeremy Greenhouse at January 21, 2010 2:30 PM

Hi Jeremy,

Batters who face >=3 changeups:
second: -1.1
third: 0.2

Batters who face >=4 changeups:
third: -1.3
fourth: 0.3

Since batters who face >=4 changeups are a subset of the batters who face >=3 changeups, in order for >=4 to average to -1.3, the batters who face exactly 3 changeups must have a crazy high run value associated with that third changeup. Hence your explanation that they are getting hammered on that pitch. What is the distribution of outcomes for that 3rd change-up for batters that see exactly 3 change-ups (HR, 1B, OUT, etc)? I assume that's how you calculated RV/100?

Posted by: cdm at January 22, 2010 8:49 AM
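
The subset argument in the comment above can be written as a weighted average. The rv100 on the third changeup for the >=3 group is a blend of the "exactly 3" group and the ">=4" group, so the "exactly 3" value can be backed out once the mix is known. Below is a minimal Python sketch. The thread does not give the actual split between the two groups, so `share_exactly_n` is a purely hypothetical input, and the function name is made up for the example.

```python
def implied_rv100_exactly_n(rv_at_least_n, rv_at_least_n_plus_1, share_exactly_n):
    """Back out the rv100 on pitch n for batters who saw exactly n pitches.

    Uses the weighted-average identity
        rv_at_least_n = s * rv_exactly_n + (1 - s) * rv_at_least_n_plus_1
    where s is the share of nth pitches thrown to batters who saw exactly n.
    The share is NOT given in the thread; any value passed in is an assumption.
    """
    if not 0 < share_exactly_n <= 1:
        raise ValueError("share_exactly_n must be in (0, 1]")
    remainder = (1 - share_exactly_n) * rv_at_least_n_plus_1
    return (rv_at_least_n - remainder) / share_exactly_n


# Third changeup: 0.2 for the >=3 group, -1.3 for the >=4 group (from the thread).
for share in (0.25, 0.5, 0.75):  # hypothetical shares, for illustration only
    print(share, round(implied_rv100_exactly_n(0.2, -1.3, share), 2))
```

Whatever the true share is, the implied value for the "exactly 3" group has to be above 0.2, and it climbs quickly as that group becomes a smaller fraction of the sample.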

Chris, here are the numbers I have for 3rd changeup for batters that see exactly 3 change-ups. Let me know if you want the distribution for any other pitch types/counts.

Ball	38.20%
Strike	42.26%
Out	12.05%
Single	4.76%
Double	1.51%
Triple	0.17%
Homer	0.84%

I use the methodology outlined by John Walsh and Joe P. Sheehan to calculate run value. I then multiply that by 100 to scale it to 100 pitches.

Posted by: Jeremy Greenhouse at January 22, 2010 3:09 PM