The Baseball Analysts: Spitballing on Command

By Jeremy Greenhouse

At best, quantifying command is really difficult. At worst it's a foolish endeavor. The reason is that, while we may know the precise location of a pitch thanks to PITCHf/x data, we have no idea of the pitcher's intention. Perhaps pitchers could fill out a survey after every inning, or perhaps someone could track the target of the catcher's glove. Maybe these data are being collected somewhere, but they certainly aren't publicly available. But we beat on.

Mike Fast in the 2009 Hardball Times Annual took a shot at measuring Cliff Lee's command, and Dave Allen tried with Mariano Rivera. Borrowing ideas from both of them, I attempted to rank a group of pitchers by command.

My sample consists of pitches that I have classified as four-seam fastballs in RHB vs. RHP matchups on 0-0 counts. 100 pitchers have thrown at least 200 such pitches, giving me over 60,000 data points.

First, I came up with a heat map. It shows what you'd expect: fastballs up-and-in or down-and-away are most successful. Then I predicted each pitch's expected run value based on its location and ranked pitchers by those expected values. Here are the top five:

Greg Maddux
Trevor Hoffman
Yusmeiro Petit
Phil Hughes
Paul Byrd

Maddux's command is legendary, so it speaks well of the method that he ranks so highly. I'm pretty sure all of these guys have good reputations for command. And the bottom five:

Seth McClung
Fausto Carmona
Rich Harden
Dennis Sarfate
Matt Albers

Looking at a pitcher's walk rate usually suffices in grading command. Since 2007, all of these guys have surrendered their fair share of walks, and all those balls show up in the numbers.
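For anyone who wants to play along, here is a minimal sketch of that first method, not my actual code. It assumes each pitch is a tuple of (pitcher, px, pz, run value), with locations in feet and negative run values favoring the pitcher. The half-foot grid, the function names, and the sample data are all made up for illustration; a real version would smooth the heat map rather than use raw bins.

```python
from collections import defaultdict

BIN_FT = 0.5  # assumed bin width in feet; not a value from the article


def bin_of(px, pz):
    """Map a plate location (feet) to a grid cell."""
    return (int(px // BIN_FT), int(pz // BIN_FT))


def build_heat_map(pitches):
    """Average observed run value per grid cell, pooled across all pitchers."""
    totals = defaultdict(lambda: [0.0, 0])
    for _, px, pz, rv in pitches:
        cell = totals[bin_of(px, pz)]
        cell[0] += rv
        cell[1] += 1
    return {key: s / n for key, (s, n) in totals.items()}


def rank_by_expected_rv(pitches, heat_map):
    """Mean location-based expected run value per pitcher, lowest (best) first."""
    per_pitcher = defaultdict(list)
    for name, px, pz, _ in pitches:
        per_pitcher[name].append(heat_map[bin_of(px, pz)])
    means = {name: sum(v) / len(v) for name, v in per_pitcher.items()}
    return sorted(means.items(), key=lambda item: item[1])


# Made-up sample: (pitcher, px, pz, run value).
sample = [
    ("A", -0.7, 1.8, -0.05), ("A", -0.6, 1.9, -0.03),
    ("B", 0.1, 2.6, 0.08), ("B", 0.2, 2.7, 0.04), ("B", -0.7, 1.7, -0.04),
]
ranking = rank_by_expected_rv(sample, build_heat_map(sample))
```

The point of scoring each pitch by the league-wide value of its location, rather than by its own outcome, is that the pitcher gets credit for where the ball went and not for what the batter did with it.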

So I think that method has legs. I controlled for a fair amount of things (batter/pitcher handedness, count, pitch type), but one could go even further and regress the league-wide locational run values to each batter's own heat map. The sample sizes get small, so for left-handed fastballs to left-handed batters, I'd probably combine 0-2 counts with 1-2 counts, and use both two-seam and four-seam fastballs. Regression to the mean and stuff.
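One common way to do that kind of regression is a simple sample-size-weighted blend. This is only a sketch of the idea: the function name is hypothetical, and `prior_n` (how many league-average pitches to mix in) is an assumed tuning knob, not a number I have estimated.

```python
def shrink_toward_league(batter_rv, batter_n, league_rv, prior_n=200):
    """Blend a batter's run value at a location with the league value.

    The batter's own number gets more weight as his sample grows;
    prior_n is an assumed constant controlling how fast that happens.
    """
    weight = batter_n / (batter_n + prior_n)
    return weight * batter_rv + (1 - weight) * league_rv


# A batter with 50 pitches in a cell keeps only 20% of his own signal here.
blended = shrink_toward_league(batter_rv=0.10, batter_n=50, league_rv=0.02, prior_n=200)
```

With no pitches at all in a cell, the estimate falls back to the league value, which is the behavior you want when the samples get tiny.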

I also tried clustering analysis. In a situation as specific as RHB vs. RHP, 0-0 count, pitchers generally have more types of pitch offerings to choose from than pitch locations. With fastballs, you either go high heat or throw at the knees. With sliders, there's back foot or back door. Curves are intended to be thrown either anywhere in the dirt or anywhere in the zone. Anyway, those are the assumptions you need to make if you believe clustering makes sense. Furthermore, if you're limited to k-means clustering, you might as well assume that all pitchers have two intended locations for their fastballs. That's what I did, anyway. So I gave each pitcher his own two separate cluster centers, and found each pitch's standard deviation from those centers, grouping by pitcher. Here were the leaders:

Greg Maddux
Brett Myers
Joakim Soria
Ricky Nolasco
John Lackey

Maddux is no Rivera, but he's head-and-shoulders above the other 99 pitchers in my sample when it comes to command, so it lends validation to the power of PITCHf/x that two rudimentary analyses can pull out Maddux's needles from the haystack. The bottom five:

Matt Garza
Brandon Morrow
Seth McClung
Rich Harden
David Aardsma

I believe that Aardsma's four-seam fastball is an outlier in several ways. Though I'm not disregarding this piece of data, I don't think it means what it's supposed to mean. But all of these guys are prone to the walk. It would be weird if somebody had excellent command outside the strike zone, so that his expected run values based on location graded out poorly, but he had really tight clusters of pitches. This would indicate good command but poor approach. I always get that feeling watching Dice-K.
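Here is a rough sketch of the clustering idea, again not my actual code. It runs plain k-means with two centers on one pitcher's (px, pz) locations, then scores him by the root-mean-square distance from each pitch to its nearest center, which is one reasonable reading of "standard deviation from those centers." The function names, the fixed seed, and both sample pitchers are invented for illustration.

```python
import math
import random


def two_means(points, iterations=50, seed=0):
    """Plain k-means (Lloyd's algorithm) with k=2 on a list of (px, pz) points."""
    rng = random.Random(seed)
    centers = rng.sample(points, 2)
    for _ in range(iterations):
        groups = ([], [])
        for p in points:
            nearest = min((0, 1), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        updated = [
            (sum(x for x, _ in g) / len(g), sum(z for _, z in g) / len(g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if updated == centers:
            break
        centers = updated
    return centers


def spread_from_centers(points, centers):
    """Root-mean-square distance from each pitch to its nearest center."""
    sq = [min(math.dist(p, c) for c in centers) ** 2 for p in points]
    return math.sqrt(sum(sq) / len(sq))


# Made-up locations in feet: one pitcher hits two spots, the other sprays.
tight = [(-0.8, 1.6), (-0.7, 1.7), (-0.8, 1.7), (0.6, 3.3), (0.7, 3.4), (0.6, 3.4)]
wild = [(-1.2, 1.0), (-0.2, 2.2), (-0.9, 2.0), (0.2, 3.9), (1.1, 2.8), (0.5, 3.0)]
scores = {
    name: spread_from_centers(pts, two_means(pts))
    for name, pts in (("tight", tight), ("wild", wild))
}
```

A smaller score means tighter clusters. Note that this measure says nothing about whether the two spots are good ones, which is exactly the good-command-but-poor-approach case.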

So Maddux, Nolasco, Hughes, and Petit are in the top ten of both lists. I know Maddux and Nolasco have great reputations for control; I'm unsure about the other two. Garza, Sarfate, Harden, and McClung show up in the bottom ten of both lists; Sarfate and McClung definitely have no aptitude for command.

The ultimate goal here is to evaluate pitchers. I feel confident that with a sample of 50 pitches, I could assess a guy's stuff. I think a pitcher would need to have thrown over 1,000 pitches, assuming he's not walking the ballpark, to provide an ample PITCHf/x sample for evaluating command, given the need to drill down the data by pitch types, batter types, and counts. And it takes precisely 4,242 pitches to get a good read on a pitcher's intangibles.

Comments

I'm not a technologist, but it seems to me that we must have the tools to track the location of the catcher's mitt at the point of release, and then the distance the glove moved to catch the pitch. Wouldn't that tell us just about everything we need to know about command?

Posted by: GreggB at June 10, 2010 3:45 AM

Good stuff (or should I say command?), Jeremy. I always get a kick out of your wit, too.

I believe you are on to something here. The names on the top and bottom ten lists are not random. Yusmeiro Petit, to the extent he got by, clearly did so based on command and not stuff (87 mph FB). It just didn't translate that well to the big leagues but he looked like the next coming of Greg Maddux two years into his pro career. Similarly, if Paul Byrd didn't have command, he wouldn't have had anything.

Looking forward to your piece on intangibles!

Posted by: Rich Lederer at June 10, 2010 8:57 AM

Gregg, I think it would.

Rich, I agree that the only reason Maddux/Byrd/Petit were able to pitch in the Bigs at 87 must have been because they had command. What I'm hoping to find eventually when I come up with these StuffRV/CommandRV/fxRV lists are guys who might defy logic and are therefore over/under valued. Maybe some 87 MPH pitchers actually have poor command but good results. (Intangibles!) More importantly, I'm on the lookout for guys with good stuff and command, but a poor ERA, or maybe even poor peripherals. I'd like to come up with a group of PITCHf/x breakout candidates. It'll take a while.

Posted by: Jeremy Greenhouse at June 10, 2010 9:43 PM

This is really awesome, Jeremy.
For what it's worth, I once saw a "catcher's glove" tracker on a MASN Orioles broadcast. I know extremely little about it and how scientific it is, but it did seem intriguing. The clip of it is at 1:12:
http://baltimore.orioles.mlb.com/video/play.jsp?content_id=7222499

Posted by: Lucas Apostoleris at June 11, 2010 9:34 AM

I feel like I'm seeing the evolution of baseball analysis as we speak.

Why is it that some people find this kind of thinking offensive? They want to keep baseball dumbed down, but I love getting as much info as possible.

Kudos to you for diving in on such a seemingly difficult task. That's how progress is made!

Posted by: Peter at June 11, 2010 1:27 PM

Lucas, the catcher's glove tracker on MASN (and SNY uses something similar) is unfortunately not an automated one. One of the guys in the video production crew clicks on the catcher's glove in the video to produce that effect in the replay. That's not the same as knowing where the catcher's glove is located in space. Nor is it the same as having that data available for a large sample of pitches.

Posted by: Mike Fast at June 14, 2010 8:23 AM

Thanks for clearing that up, Mike. I figured it was something like that since I hadn't heard of there being any available data for the catcher's target.

Posted by: Lucas Apostoleris at June 15, 2010 11:53 AM