Spitballing on Command
At best, quantifying command is really difficult. At worst it's a foolish endeavor. The reason is that, while we may know the precise location of a pitch thanks to PITCHf/x data, we have no idea of the pitcher's intention. Perhaps pitchers could fill out a survey after every inning, or perhaps someone could track the target of the catcher's glove. Maybe these data are being collected somewhere, but they certainly aren't publicly available. But we beat on.
Mike Fast in the 2009 Hardball Times Annual took a shot at measuring Cliff Lee's command, and Dave Allen tried with Mariano Rivera. Borrowing ideas from both of them, I attempted to rank a group of pitchers by command.
My sample consists of pitches that I have classified as four-seam fastballs in RHB vs. RHP matchups on 0-0 counts. One hundred pitchers have thrown at least 200 such pitches, giving me over 60,000 data points.
First, I came up with a heat map. It shows what you'd expect: fastballs up-and-in or down-and-away are most successful. Then I predicted each pitch's expected run value based on its location. Here are the top six:
Maddux's command is legendary, so it speaks well of the method that he ranks so highly. I'm pretty sure all of these guys have good reputations for command. And the bottom five:
Looking at a pitcher's walk rate usually suffices in grading command. Since 2007, all of these guys have surrendered their fair share of walks, and all those balls show up in the numbers.
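For anyone who wants to try the location-based ranking, here is a minimal sketch in Python. It is not the exact procedure used for the lists above: it assumes the heat map is a plain grid of bins with no smoothing, that px and pz are plate locations in feet, and that run_value is measured from the batter's side, so lower is better for the pitcher. The function names, bin count, and ranges are all made up for illustration.

```python
import numpy as np

def location_run_values(px, pz, run_value, bins=5,
                        x_range=(-2.0, 2.0), z_range=(0.5, 4.5)):
    """Build a crude heat map: the mean run value in each location bin."""
    px, pz = np.asarray(px, dtype=float), np.asarray(pz, dtype=float)
    run_value = np.asarray(run_value, dtype=float)
    x_edges = np.linspace(x_range[0], x_range[1], bins + 1)
    z_edges = np.linspace(z_range[0], z_range[1], bins + 1)
    # digitize gives 1..bins for in-range values; shift to 0-based and
    # clip so pitches outside the grid fall into the edge bins.
    xi = np.clip(np.digitize(px, x_edges) - 1, 0, bins - 1)
    zi = np.clip(np.digitize(pz, z_edges) - 1, 0, bins - 1)
    total = np.zeros((bins, bins))
    count = np.zeros((bins, bins))
    np.add.at(total, (xi, zi), run_value)
    np.add.at(count, (xi, zi), 1.0)
    # Empty bins keep a value of 0.0 (an arbitrary choice for this sketch).
    heat = np.divide(total, count, out=np.zeros_like(total), where=count > 0)
    return heat, xi, zi

def rank_by_expected_run_value(pitcher, px, pz, run_value, bins=5):
    """Average each pitcher's location-based expected run value, best first."""
    pitcher = np.asarray(pitcher)
    heat, xi, zi = location_run_values(px, pz, run_value, bins=bins)
    expected = heat[xi, zi]  # each pitch gets the league value for its bin
    means = {str(p): float(expected[pitcher == p].mean())
             for p in np.unique(pitcher)}
    return sorted(means.items(), key=lambda kv: kv[1])
```

Because every pitch is scored by the league-wide value of its bin rather than by its own outcome, a pitcher's average reflects only where he puts the ball, not what happened afterward.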
So I think that method has legs. I controlled for a fair number of things (batter/pitcher handedness, count, pitch type), but one could go even further and regress the league-wide locational run values to each batter's own heat map. The sample sizes get small, so for left-handed fastballs to left-handed batters, I'd probably combine 0-2 counts with 1-2 counts, and use both two-seam and four-seam fastballs. Regression to the mean and stuff.
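One common way to handle the small-sample problem is a shrinkage estimate: blend each batter's per-bin run value with the league value, leaning on the batter's own number only as his pitch count in that bin grows. The sketch below assumes that reading of the idea. The function name and the constant k (the pitch count at which batter and league get equal weight) are hypothetical, and k would have to be estimated from the data rather than taken from here.

```python
import numpy as np

def shrink_toward_league(batter_mean, batter_n, league_mean, k=50.0):
    """Blend batter-specific bin values with league values by sample size.

    batter_mean, batter_n, and league_mean are arrays of the same shape
    (one entry per location bin). k is an assumed tuning constant.
    """
    batter_mean = np.asarray(batter_mean, dtype=float)
    batter_n = np.asarray(batter_n, dtype=float)
    league_mean = np.asarray(league_mean, dtype=float)
    # Bins the batter has never seen fall back entirely to the league value.
    batter_mean = np.where(batter_n > 0, batter_mean, league_mean)
    weight = batter_n / (batter_n + k)
    return weight * batter_mean + (1.0 - weight) * league_mean
```

With k = 50, a bin where the batter has seen 50 pitches lands halfway between his own value and the league's; a bin with 5 pitches stays close to the league value.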
I also tried cluster analysis. In a situation as specific as RHB vs. RHP, 0-0 count, pitchers generally have more types of pitch offerings to choose from than pitch locations. With fastballs, you either go high heat or throw at the knees. With sliders, there's back foot or back door. Curves are intended to be thrown either anywhere in the dirt or anywhere in the zone. Anyway, those are the assumptions you need to make if you believe clustering makes sense. Furthermore, if you're limited to k-means clustering, you might as well assume that all pitchers have two intended locations for their fastballs. That's what I did, anyway. So I gave each pitcher his own two separate cluster centers, and measured how far his pitches strayed from those centers, taking the standard deviation pitcher by pitcher. Here were the leaders:
Maddux is no Rivera, but he's head-and-shoulders above the other 99 pitchers in my sample when it comes to command, so it lends validation to the power of PITCHf/x that two rudimentary analyses can pull out Maddux's needles from the haystack. The bottom five:
I believe that Aardsma's four-seam fastball is an outlier in several ways. Though I'm not disregarding this piece of data, I don't think it means what it's supposed to mean. But all of these guys are prone to the walk. It would be weird if somebody had excellent command outside the strike zone, so that his expected run values based on location graded out poorly, but he had really tight clusters of pitches. This would indicate good command but poor approach. I always get that feeling watching Dice-K.
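The two-cluster idea described above can be sketched roughly as follows. This is one interpretation, not the code behind the lists: it runs a bare-bones k-means with k = 2 on a single pitcher's (px, pz) locations, seeds the centers with his lowest and highest pitches (an assumption that matches the "knees or high heat" premise and keeps the result deterministic), and reports the root-mean-square distance from each pitch to its assigned center as the spread. The function names are hypothetical.

```python
import numpy as np

def two_means(points, n_iter=50):
    """Plain k-means with k=2 on an (n, 2) array of (px, pz) locations."""
    points = np.asarray(points, dtype=float)
    # Seed with the lowest and highest pitches by height.
    order = np.argsort(points[:, 1])
    centers = points[[order[0], order[-1]]].copy()
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(2):
            if np.any(labels == j):  # leave an empty cluster where it is
                new_centers[j] = points[labels == j].mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment against the final centers.
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return centers, dists.argmin(axis=1)

def cluster_spread(points):
    """RMS distance from each pitch to its own cluster center."""
    points = np.asarray(points, dtype=float)
    centers, labels = two_means(points)
    dist = np.linalg.norm(points - centers[labels], axis=1)
    return float(np.sqrt(np.mean(dist ** 2)))
```

Running cluster_spread once per pitcher and sorting ascending would give a leaderboard of the kind shown above, with smaller numbers meaning tighter clusters. The result is in whatever units the locations use.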
So Maddux, Nolasco, Hughes, and Petit are in the top ten of both lists. I know Maddux and Nolasco have great reputations for control; I'm unsure about the other two. Garza, Sarfate, Harden, and McClung show up in the bottom ten of both lists. Sarfate and McClung definitely have no aptitude for command.
The ultimate goal here is to evaluate pitchers. I feel confident that with a sample of 50 pitches, I could assess a guy's stuff. I think a pitcher would need to have thrown over 1,000 pitches, assuming he's not walking the ballpark, to provide an ample PITCHf/x sample for evaluating command, given the need to drill down into the data by pitch types, batter types, and counts. And it takes precisely 4,242 pitches to get a good read on a pitcher's intangibles.