Touching Bases, February 04, 2010
Hitters by Zones
By Jeremy Greenhouse

Few in MLB can beat a well-located pitch down and away. I wanted to look up those who could, so I broke the plate area down into nine zones, scaling the vertical component of the pitch for the batter’s height. For this analysis, I decided to restrict my sample to only 2009 pitches at which the batter swung. Here’s a crude chart showing the percentage of swings in each zone and how batters fare when swinging, indicated by color.
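For anyone who wants to try a similar breakdown, here is a rough Python sketch of sorting a pitch into one of nine zones. The cutoffs are assumptions for illustration, not necessarily the ones behind the chart: height is rescaled so the batter's own strike zone runs from 0 to 1 and is split into thirds, and the inside/away boundary is set at half a foot from the center of the plate. The function and argument names are made up.

```python
def classify_zone(horizontal_ft, height_ft, zone_bottom_ft, zone_top_ft,
                  half_width_ft=0.5):
    """Return a zone label such as 'Down-Away' or 'Middle-Middle'.

    horizontal_ft: distance from the center of the plate, negative toward the
        batter (inside) and positive away from him (assumed sign convention).
    height_ft, zone_bottom_ft, zone_top_ft: pitch height and the batter's own
        strike-zone limits, which is how height gets scaled per batter.
    half_width_ft: assumed half-width of the middle column.
    """
    # Rescale so 0.0 is the bottom of this batter's zone and 1.0 is the top.
    scaled = (height_ft - zone_bottom_ft) / (zone_top_ft - zone_bottom_ft)
    if scaled < 1 / 3:
        row = "Down"
    elif scaled < 2 / 3:
        row = "Middle"
    else:
        row = "Up"

    if horizontal_ft <= -half_width_ft:
        col = "In"
    elif horizontal_ft >= half_width_ft:
        col = "Away"
    else:
        col = "Middle"
    return f"{row}-{col}"
```

With these assumed cutoffs, a pitch over the heart of the plate at mid-zone height comes back as Middle-Middle, and anything below the bottom third lands in one of the Down zones.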

[Figure: Zones.jpg, percentage of swings in each of the nine zones, colored by run value on swings]

Batters have the advantage when the pitch is middle-middle, and for the other eight zones, the run value is negative.

Getting right to the leaderboards. There are nine of these, but I’m going to leave the commentary short and I’ll leave a spreadsheet at the end.

Down-In

Name Runs Swings
Derrek Lee 5.6 57
David Wright 3.8 72
Corey Hart 3.6 60
Hunter Pence 2.8 73
Carlos Delgado 2.6 11
Chase Headley -5.9 58
Ryan Braun -6.1 84
Aubrey Huff -6.2 56
David Ortiz -6.2 64
Ryan Howard -6.6 84

Ryan Howard and David Ortiz are similar types of hitters who like the ball out over the plate but can get beat inside. Carlos Delgado hit a homer, three doubles and a single on his eleven swings at pitches down and in.

Down-Middle

Name Runs Swings
Joey Votto 10.6 193
Brian Roberts 9.9 204
Miguel Cabrera 9.7 191
Dustin Pedroia 6.9 150
Nick Markakis 6.8 160
Garret Anderson -11.7 174
Nate McLouth -12.3 125
Jack Cust -12.7 124
Dan Uggla -13.4 185
Derek Jeter -13.9 173

I’m surprised Derek Jeter’s on this list, as he’s a successful groundball hitter. Dan Uggla and Jack Cust on the other hand are fly ball hitters.

Down-Away

Name Runs Swings
Carlos Gonzalez 1.8 69
Denard Span 1.5 68
Ichiro Suzuki 1.4 121
Robinzon Diaz 1.2 18
Trevor Crowe 1.2 17
Hideki Matsui -12.8 107
Adam LaRoche -13.4 145
Jayson Werth -13.5 138
Ryan Howard -13.8 231
Brandon Inge -14.0 120

It appears foot speed is instrumental if one is to succeed by swinging at pitches down and away. I’m assuming the highest percentage of grounders are on pitches in this location, and speed is important to get on base via the grounder. Pitching Howard down in the zone seems to be a good idea.

Middle-In

Name Runs Swings
Martin Prado 13.2 87
Michael Young 10.9 132
James Loney 10.2 83
Mike Cameron 8.8 113
Derrek Lee 8.3 116
Willie Bloomquist -7.1 121
Lyle Overbay -7.2 42
Jeff Francoeur -7.6 172
Edgar Renteria -8.5 132
Mark DeRosa -14.1 125

Derrek Lee likes the ball inside.

Middle-Middle

Name Runs Swings
Prince Fielder 30.7 249
Mark Teixeira 29.9 294
Ryan Braun 29.6 281
Adam Dunn 25.3 294
Andre Ethier 25.2 323
Augie Ojeda -10.9 128
Nick Punto -11.3 191
Luis Rodriguez -11.8 129
Ty Wigginton -12.0 219
Dioner Navarro -13.1 174

This is clearly the most telling list in terms of quality of hitter. To be successful swinging the bat, you have to be able to hit the ball pitched down the middle.

Middle-Away

Name Runs Swings
Adrian Gonzalez 8.2 156
Robinson Cano 7.2 175
Ryan Braun 7.2 101
Nick Markakis 6.3 178
Brad Hawpe 5.9 228
Pedro Feliz -10.5 129
Jimmy Rollins -10.7 301
Chase Utley -11.1 232
Curtis Granderson -13.3 252
Aaron Hill -13.6 152

I already knew that Adrian Gonzalez and Robinson Cano excelled hitting the ball the other way, so it makes sense that they also excel at hitting outside pitches. The Phillies are not so good at hitting the ball when pitched away. They are good at baserunning, however.

Up-In

Name Runs Swings
Casey McGehee 5.1 84
Michael Young 5.0 85
Marco Scutaro 3.8 43
Seth Smith 3.8 14
Pablo Sandoval 3.1 81
Hunter Pence -7.1 77
Matt Holliday -7.7 85
Clint Barmes -8.0 75
Jhonny Peralta -8.6 85
Michael Cuddyer -10.3 123

Michael Young also likes the ball inside. He beat out Lee by six runs last year on pitches at least half a foot inside. Seth Smith had seven hits on the 14 pitches he swung at up and in, including four for extra bases.

Up-Middle

Name Runs Swings
Michael Cuddyer 10.7 186
Raul Ibanez 9.7 114
Aaron Hill 9.6 223
Kevin Youkilis 7.5 172
Todd Helton 7.4 168
Orlando Cabrera -10.3 204
Jason Giambi -11.3 109
Mike Cameron -11.6 122
Jose Bautista -11.9 136
Mark Reynolds -13.5 177

Michael Cuddyer was last at pitches up and in, but first at pitches up and over the plate. I find this very interesting. If you’re a pitcher, you can jam Cuddyer, but you better not miss.

Up-Away

Name Runs Swings
Albert Pujols 5.5 82
Matt Wieters 4.7 42
Chris Coghlan 4.7 76
Matt Kemp 3.9 56
Jacoby Ellsbury 3.8 58
Jimmy Rollins -6.0 110
Rafael Furcal -6.1 93
Jorge Cantu -6.4 56
Brian Roberts -7.3 76
Emilio Bonifacio -8.0 73

It took you a whole article to find Albert Pujols at the top of a leaderboard. My analysis confirms Rich Lederer's preliminary hypothesis. Pujols continues to be good.

Here's a spreadsheet containing all hitters with at least ten pitches swung at in a zone. And why not? Pitchers too.
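If you'd rather build the leaderboards than download them, the bookkeeping is simple. Below is a minimal Python sketch, assuming you already have one row per swing with the batter, the zone, and the run value of the swing. The function name and the tuple layout are hypothetical, and the ten-swing minimum mirrors the spreadsheet cutoff.

```python
from collections import defaultdict

def zone_leaderboard(swings, zone, min_swings=10):
    """Rank batters by total run value on swings in one zone.

    swings: iterable of (batter, zone, run_value) tuples, one per swing.
    Returns (batter, runs, swing_count) rows, best first.
    """
    runs = defaultdict(float)
    counts = defaultdict(int)
    for batter, swing_zone, run_value in swings:
        if swing_zone == zone:
            runs[batter] += run_value
            counts[batter] += 1
    # Drop batters below the swing minimum, then sort by total runs.
    rows = [(b, round(runs[b], 1), counts[b])
            for b in runs if counts[b] >= min_swings]
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

The top five and bottom five of the returned list correspond to the two halves of each table above.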

Touching Bases, February 01, 2010
Thoughts on Bloomberg Sports
By Jeremy Greenhouse

Bloomberg Sports unveiled its two new products to the media on Sunday afternoon, and I was one of those fortunate enough to be in attendance. Thoughts:

The fantasy product, to be released this month on a trial basis, contains a draft kit and in-season tools. Player news, stats, and data visualizations are all available with at most three clicks of the mouse. Bloomberg Sports is not providing any new data sources to the consumer, but in partnership with MLB and Rotowire, BBGSports aggregates relevant player statistics and news, laying the data out in a friendly and efficient interface. Pretty much all of the offensive and pitching stats/splits available on Baseball Reference and FanGraphs are available in Bloomberg’s product. Even better, those stats that aren’t included can be written into the system. You can create new stats and the product is adaptable to the most obscure fantasy league settings. All of these stats can be easily ranked and charted. The best visualization I saw was their “spider” chart, which is similar to Justin Bopp’s DiamondView and Kevin Dame's 5 Tool Analyzer.

Attached to the fantasy product will be a team of writers led by Jonah Keri, whose background in business and baseball analysis makes him a neat fit, but more importantly, Keri’s refined post-up game and precise outlet passes are reminiscent of a younger, Jewish Wes Unseld. BBGSports has decided to produce some of its written content for free, and lock some behind a pay wall. I imagine the free content will be similar to FanGraphs’ written content, in that it will use progressive analysis to inform the reader as well as to promote the site’s statistical engine. But what will be behind the pay wall? The Baseball Prospectus model is sensible in that BP leaves its more random material, for lack of a better term, in the open (Interviews, TWIQ, Roundtables), while leaving its selling point—progressive analysis—behind the pay wall. However, BBGSports isn’t selling its analysis. In fact, BBGSports is selling others' analysis, as Bloomberg specializes in collecting and distributing relevant news from thousands and thousands of web sites. So I wonder if BBGSports is just going to put some of its written content behind the pay wall to satisfy the consumer who likes to feel that he’s getting more bang for his buck. I hope that BBGSports finds a way to differentiate its free analysis from that which is paid for. I look forward to seeing what Keri and Co. have in store, and who it is that composes Keri’s company.

My chief criticism of BBGSports’ fantasy product is, oddly enough, with its only never-before-seen-to-me data. Again, I don't think the product was built to harvest any new data, but rather to provide an incredibly convenient database that consists of already-available information. In that mission, BBGSports has succeeded. But BBGSports went ahead and set up a proprietary algorithm to rank players in a traditional 5x5 fantasy league. The rank, called “B-Rank,” is not customizable to league settings as of yet and the methodology behind the ranking system was not explained despite multiple questions from the audience. The speakers, headlined by the impressive Stephen Orban, did not share any intentions to market the B-Rank, nor did they explain the B-Rank’s value, yet they nevertheless insisted on keeping it entirely secret. Now, to be fair, there is a very nice ranking feature that allows you to rank players using whatever categories and filters you’d like, and exclude drafted players or put players on your watch list and all that good stuff. But the B-Rank looms over it. One of my favorite things about my fantasy experience at ESPN is the player rater, which rates players in each category based on a Z-Score, and then sums those scores to form a comprehensive rating. This is intuitive and understandable, and I can adjust these rankings to my own whims since I understand what goes into them. But with the B-Rank, I have no idea why players are ranked where they are.
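To show why a Z-Score rater is easy to reason about, here is a minimal Python sketch of the idea: standardize each category across the player pool, then add up a player's scores. This is only the general technique as described above, not ESPN's actual implementation, and the function name and input layout are hypothetical. It also assumes higher is better in every category, so something like ERA would need its sign flipped first.

```python
from statistics import mean, pstdev

def z_score_ratings(players):
    """Sum per-category z-scores for each player.

    players: dict of name -> dict of category -> value (higher is better).
    """
    categories = list(next(iter(players.values())).keys())
    ratings = {name: 0.0 for name in players}
    for cat in categories:
        values = [stats[cat] for stats in players.values()]
        mu, sigma = mean(values), pstdev(values)
        for name, stats in players.items():
            # A category in which every player is tied contributes nothing.
            z = (stats[cat] - mu) / sigma if sigma > 0 else 0.0
            ratings[name] += z
    return ratings
```

Because every input is visible, you can reweight or drop a category and see exactly how the ranking moves, which is the transparency B-Rank lacks.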

Same with the new projection system. Even if BBGSports is releasing the new PECOTA, we wouldn’t be buying it, since BBGSports hasn’t shown that it is an expert in sabermetrics, and the speakers were in fact adamant that they are not baseball experts. So why should I care that BBGSports is launching a projection system? If you were to follow the projection’s advice and draft Ryan Howard fourth or Matt Kemp sixth, I would take pity on your children, for they would have been born to a poor fantasy baseball player. Instead of taking its cue from Baseball Prospectus, whose initiative it is to develop new and progressive analytics, BBGSports should follow in FanGraphs’ footsteps and assemble an assortment of projections. And if BBGSports wants its own projection system, I feel the user should have the ability to modify the projections however he or she pleases. If BBGSports wants B-Rank to catch on, then BBGSports will need to treat it the same way as FanGraphs treated WAR. FanGraphs went through pains to ensure that readers understood the thought process and calculations behind WAR. It would be a big plus and potential selling point for BBGSports to create a ranking system that can become universally accepted among fantasy players, but that’s not happening if fantasy players don’t know what the hell B-Rank consists of.

BBGSports might want to allow one of its programmers to play around with the data and periodically release new metrics that incline to the sabermetric bent. As I’ve stated, I don’t think Bloomberg should be trying to introduce any proprietary metrics, but along the same lines as BBGSports' written analysis, perhaps a quantitative analyst can demonstrate how the product in place can be utilized to develop one’s own projections/rankings/metrics using only the data provided by BBGSports. The B-Rank would be a great start, if only its purpose wasn't defeated by protecting the algorithm.

Fortunately, BBGSports appears genuinely interested in consumer feedback. I feel that its willingness to accept and respond to feedback will be instrumental to BBGSports' success. The fantasy product exists to make the fantasy player’s job easier and more fun, which necessitates the fantasy player’s input. As for the pro product, with only 30 teams to sell to, BBGSports will have to cater individually to each and every team. To get a glimpse of the pro product, see David Appelman’s post. Incorporated into the pro product are pitchf/x data and the tools to integrate whatever proprietary information teams are already holding into the BBGSports database, which can only be accessed via a proper bar code and fingerprint. The visuals provided by Appelman and Ben Kabak speak to BBGSports as an innovative and interactive product. And from what I've heard and seen so far, improvements will be ongoing.

Already in an advantageous relationship with MLB and MLB Advanced Media, Bloomberg Sports will likely want to partner up with STATS, Baseball Info Solutions, and Baseball America. Bloomberg Sports will eventually become the leading distributor for all private data collectors, as BBGSports does a better job of presenting that data than any other provider I’ve seen.

Touching Bases, January 28, 2010
On the Out Pitch
By Jeremy Greenhouse

Tim Lincecum retired 89% of batters he got to 0-2 or 1-2 counts. They had no chance. Here's how Lincecum's pitch selection breaks down on 0-2 and 1-2 counts, and the results of each pitch type.

FB CH SL CB
Usage 43% 31% 20% 6%
Ball 36% 36% 39% 31%
Out 32% 43% 46% 51%
Hit 7% 4% 4% 1%
RV100 0.3 -4.6 -5.4 -7.8

I'm grouping his four-seam and two-seam fastball. When I split the two, I find his two-seamer is much more effective than his four-seamer, but still not even as valuable as his off-speed offerings. I mean his changeup and slider are true out pitches. In fact, his change might be the best out pitch in baseball. You probably already know that. Yet his fastball on these counts is merely average. Would he be better off sacrificing some of the effectiveness from his changeup in exchange for some added effectiveness on his fastball? Theoretically, yes, this would be the right move, and theoretically, he could do this by throwing his changeup so often that batters come to expect it, and at the same time throwing his fastball so rarely that it acts like an out pitch, in that batters are fooled by it.

Yet for some reason, whenever I look at a pitcher's different pitch type run values, I notice disparities. Check out the A's duo of Brett Anderson and Mike Wuertz, who possibly possess the two best sliders in the game. Apparently, their fastballs suffer in spite of their extraordinary sliders. My guess is that they use their sliders as out pitches, so I wanted to see if there's a trend among pitchers to have a disparity in value between their out pitch and their fastballs. This type of analysis could, and probably should, be done for all counts, but I've been intrigued by the theory of the out pitch, so I'm limiting my sample to only pitches on 0-2 and 1-2 counts.

For the sake of simplicity, I'm grouping all fastballs together (four-seam, two-seam, cutter), and all off-speed pitches together (curve, slider, change, splitter, knuckler). So, in the following plot each pitcher represents a data point (minimum 200 pitches, Mo excluded), and the color of each dot represents how often a pitcher throws his fastball.

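[Figure: outpitch.jpg, fastball versus off-speed run value by pitcher on 0-2 and 1-2 counts, colored by fastball frequency]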

There appears to be a slightly positive trend line heading in the direction we would expect. Pitchers who extract value from one pitch type tend to get some value out of their other pitch types. Also, I see more yellow and red points on the right side and more blue points on the left side, meaning pitchers who throw more off-speed pitches have had better success with them than pitchers who throw fewer off-speed pitches.

Given that the average run value is defined as zero, 59% of pitchers perform at an above average rate with their off-speed offerings, while only 38% are above average with their fastballs. There are two and a half times more pitchers who have above average off-speed pitches and below average fastballs than pitchers who have below average off-speed pitches and above average fastballs.

As for correlation coefficients, which are on a scale of -1 to 1 with 1 representing a strong positive relationship, -1 representing a strong negative relationship, and 0 representing little or no correlation, I found that there is a weak correlation of .09 between fastball and off-speed run values. In addition, there is a correlation of -.25 between pitch type run value and pitch type frequency. Again, all of these data suggest that pitchers are not throwing their best pitches often enough in out pitch situations.
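For readers who want to check a correlation like this themselves, here is a minimal Python sketch of the standard Pearson coefficient. The function name is made up, and the inputs would be two equal-length lists, for example each pitcher's fastball run value and off-speed run value; no real data is included here.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and of squared deviations about each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

A result near 1 or -1 means the two lists move together or in opposition, while something like .09 means knowing one value tells you very little about the other.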

Returning to the above graph, one interesting note I made is that the two bluest points also show up as the two highest points on the graph. This means that the two pitchers who have the lowest fastball percentage have also had the poorest fastball results. Want to take a guess at the names behind the data points?

Well, it turns out knuckleballers should stick to the knuckleball. R.A. Dickey and Tim Wakefield aren't fooling anybody by trying to sneak a fastball in there. Wake's thrown 34 fastballs in 0-2/1-2 counts, and he's generated nine outs compared to six hits. That's abysmal. Dickey is just as bad, with 14 outs against nine hits. They're doing batters a favor by throwing fastballs.

There seems to be a stigma to pitching backwards, but if your out pitch is your best pitch, and you can throw it for strikes and it doesn't add stress on your arm, then you should consider turning your fastball into a secondary pitch, making it a potential out pitch as well.

Pitch type run values don't tell the whole story. It's important to look at what happens in the entire at-bat, not just the one pitch. For example, it's possible that pitchers are throwing fastballs outside the strike zone to set up breaking balls as their out pitch. So they're intentionally lowering the value of their fastballs, and therefore are getting better overall results when they throw the fastball even though the fastball doesn't get the glory in the run value column. However, the conclusions I found when looking at the linear weights value of the entire at bat remain the same as when I analyzed single pitch run values.

I'm including a scatter plot of the categories I've used: fastball/off-speed percentage, fastball/off-speed run value, and fastball/off-speed linear weights (the overall linear weights value of the at-bat following the 0-2/1-2 fastball/off-speed pitch). Use the scroll bar on the bottom right to locate your pitcher of interest.

Touching Bases, January 21, 2010
I've Seen That Before
By Jeremy Greenhouse

While a pitcher's stuff diminishes over the course of a game, the effects I found were relatively small. So why do batters gain an edge over pitchers as the game goes on? Well, baseball is a game of adjustments. Batters get their timing down and start picking up the ball out of the pitcher's hand. All that good stuff.

The first time a batter faces a curveball, he might be caught off-guard. That’s why pitchers throw predominantly fastballs the first time through the order. And that’s why batters do so well the third time they face a pitcher. They’ve seen most of his repertoire, and are able to recognize the curve. As the saying goes, “Fool me once, shame on you. Fool me…you can’t get fooled again.”

First, here is the average run value per 100 pitches based on the number of times a batter has seen a given type of pitch. I include all data points for which I have approximately 1,000 pitches.

For reference:
F2: Sinker/Two-Seam Fastball
F4: Four-Seam Fastball
CB: Curveball
SL: Slider
CH: Changeup
FC: Cut Fastball

This chart indicates that a batter facing a fastball from the same pitcher for the 12th time will perform better than a batter facing a pitcher's first fastball. Chances are, however, that batters who face 12 fastballs are better than those who only face a few. One way to get around this bias might be to take the difference in run value between the 11th fastball and 12th fastball. This method, called the delta method, allows you to compare apples to apples as each change in measurement is at least composed of players from the same sample. This produced the following chart:

The magnitude of the results is enormous, if the results are to be believed. A batter facing a changeup for a fifth time is expected to perform over five runs per 100 pitches better than he performs the first time he sees the changeup. That's pretty much the difference between the best and worst hitter in the league. Unfortunately, I have to say that I don't think the delta method is the way to go here, and I'm not sure how to fix my sampling problems. Batters who face at least three changeups have an rv100 of 0.2 on the third changeup, but they only have an rv100 of -1.1 on the second change. This is a delta of 1.3 runs. Meanwhile, batters who face at least four changeups have an rv100 of -1.3 runs on the third change and 0.3 on the fourth, another huge delta of 1.6 runs. This would mean that batters perform three runs per 100 pitches better on the fourth changeup they see than on the second. The oddity here is that batters who face at least three changeups are above average on the third changeup, but batters who face at least four changeups are well below average on the third changeup. I think what this means is that once pitchers get burned on a given pitch, they quit throwing it to that batter the rest of the game. I don't know how to solve for these biases.
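The chaining arithmetic behind those changeup numbers can be sketched in a few lines of Python. This only reproduces the bookkeeping, with a hypothetical function name. Each pair holds the rv100 on two consecutive sightings for the batters who saw at least that many, so both numbers in a pair come from the same sample.

```python
def chained_deltas(cohorts):
    """Return per-step deltas and their running total.

    cohorts: list of (rv_previous, rv_current) pairs, one per step, where both
    values in a pair are measured on the same group of batters.
    """
    deltas = [round(current - previous, 1) for previous, current in cohorts]
    running = []
    total = 0.0
    for delta in deltas:
        total = round(total + delta, 1)
        running.append(total)
    return deltas, running

# Changeup figures quoted above: (second, third) for batters who saw at least
# three changeups, then (third, fourth) for batters who saw at least four.
deltas, running = chained_deltas([(-1.1, 0.2), (-1.3, 0.3)])
print(deltas, running)
```

The two steps add up to 2.9, the "three runs per 100 pitches" figure, even though the two cohorts disagree by 1.5 runs about the third changeup itself. That mismatch is the sampling problem in a nutshell.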

I went on and produced the same two charts, except this time at the at-bat level instead of the game level.

Batters who face seven fastballs in an at-bat are good, in that they are able to work the count. Meanwhile, pitchers who throw five sliders in an at-bat are good, in that they are either ahead in the count or can locate their breaking balls.

Using the delta method:

No pitch gains in effectiveness after it's been thrown once already in an at-bat. This finding was applicable at the game level as well. However, there are differences between the at-bat and game level. Off-speed pitches such as the changeup and curveball lose more value than fastballs during the game, given an even distribution of pitches. But in an at-bat, off-speed pitches do not lose as much effectiveness as fastballs when they're repeatedly thrown. It makes sense to me that changeups are the worst pitch to show multiple times to the same batter throughout the game, since the success of changeups is built on deception. Yet I'm not sure why changeups don't lose as much effectiveness in an at-bat once thrown multiple times as fastballs do. I think it has something to do with the count in which they're thrown and the theory of the out pitch.

Touching Bases, January 14, 2010
Pitch Counts and Pitch Classifications
By Jeremy Greenhouse

Consider this part two to my study on pitch counts and pitchf/x.

The first time through a lineup, pitchers traditionally throw fastballs, and then switch to off-speed pitches when facing batters a second time. In order to isolate the effects of pitch counts on a pitcher's stuff as opposed to his pitch selection, I had to classify a whole lot of pitches. That was fun.

There were about 5,000 games in which a pitcher threw 100 pitches during the pitchf/x era. These pitchers performed admirably to have lasted that long into a game, so this sample won't be representative of all, or even most, starters. To illustrate the point that pitchers mix up their repertoire over the course of a game:

Six pitches are regularly thrown throughout any given game. The four-seam fastball (F4) belongs in most every pitcher's repertoire, though some sidearmers or sinkerball specialists will only throw fastballs of the two-seam variety (F2). These two pitches are often difficult to distinguish from one another, be it by the human eye, or by the detailed pitchf/x data. Cut fastballs (FC) are also difficult to make out at times from four-seamers and sliders. Sliders (SL), curveballs (CB), and changeups (CH) increase in usage over the course of the game. Knuckleballs and splitters are thrown only one or two percent of all pitches, so I won't include them in this study, and I made no attempt to classify screwballs, shuutos, or gyroballs, since I'd guess they compose about .001% of pitches in the last three years.

Perhaps some pitches are more useful later in the game than others. In theory, all pitch types should have the same effectiveness. Game theory would dictate that if a pitcher's curveball is better than his fastball, he should throw his curveball so often that batters come to expect it. Therefore his fastball gains value. Eventually, the two pitches become equal in terms of overall effectiveness. For one reason or another (maybe there is credence to the notion of the "out pitch"), this theory does not hold true for many pitchers, or at a league-wide level. The run value of fastballs is higher than the run value of breaking balls, which would signify that pitchers are under-using their secondary pitches. (Keep in mind, the main advantage to using run values is that they take the count into account.) As you will see in the below image, this trend narrows, but still exists, even as pitchers use more off-speed offerings deeper into the game.

All run values per 100 pitches. The high points and low points in the graph represent the high points and low points in the opponent's batting order.

It seems to me that changeups are ineffective pitches at the start of the game, but gain effectiveness later in the game. This makes sense intuitively. The graph also lends merit to the manager's decision to leave these pitchers in for 100 pitches, as the sample of pitchers is clearly above average through 90 pitches. However, these pitchers were also undoubtedly lucky. They would not make it to 100 pitches if they gave up runs. That's where my metric for measuring a pitcher's stuff based on a pitch's physical characteristics comes into play.

First, the two least impressive types of pitches in terms of stuff: the sinker and changeup.

As you'll see with each of these charts, there's something funky going on in the first several pitches of the ballgame. I'm not even going to attempt to form a guess as to why changeups appear to have a better StuffRV as the game goes on. The success of changeups is obviously not built on how "nasty" they are.

Again, for some reason, we should disregard the first dozen points or so. Pitchers throw fastballs an inordinate amount of time on the first pitch, and apparently, anything they throw lacks in stuff. They're warming up or something. Maybe they know batters tend to not swing at the first pitch of the game. I don't know. But you see that with all three types of fastballs, from the tenth pitch to the hundredth, a pitcher loses about a 10th to a 20th of a run in StuffRV per 100 pitches.

Finally, breaking balls.

So, even pitchers who have successful games lose a significant amount of stuff over the course of a game. Since this sample represents an above average group of pitchers, I'd imagine lesser ones deal with inferior durability. I would be comfortable saying that the quality of a generic starting pitcher's stuff decreases by at least .05 runs per 100 pitches from his first pitch to his last.

Touching Bases, December 31, 2009
Pitch Counts and Pitchf/x
By Jeremy Greenhouse

I remember Randy Johnson throwing 99 to finish a complete game. Back in their day, Nolan Ryan and Bob Feller probably did that on a regular basis (if you were to ask them). There's a lengthy list of early 20th century pitchers who pitched complete games in both ends of a doubleheader. So what's the driving force behind the pitch count craze? Are we going soft?

I don't think there's some grand scheme to baby pitchers. I do think that pitchers nowadays exert exponentially more effort on each pitch than pitchers of yesteryear, but our contemporaries could still probably hold up past the hundred pitch mark. The main reason pitchers get pulled before they reach their limit is because there's little incentive not to pull them. Take a look at Baseball Reference's splits. Pitchers allow a .726 OPS the first time through the order, then the OPS jumps 40 points the next time through and another 40 points after that. So managers make the correct decision to insert a reliever who has the advantage of facing batters for the first time. With eight-man bullpens, there's no reason not to go to a reliever early. So the question becomes not if, in the current environment, we should continue to adhere to pitch counts, but why? Does the pitcher lose effectiveness, or does the batter adjust to even the fastest of fastballs having already seen it in his three previous plate appearances?

With pitchf/x data, you can tease out the pitcher's part in the pitcher/batter matchup. A pitcher really controls five things:

-Where the ball is released
-Where the ball lands
-How hard the ball is thrown
-How much the ball spins
-What direction the ball spins

Here, I will concern myself with the final three components, which I believe define what we call a pitcher's "stuff." For example, the average fastball from a right-handed pitcher (92 MPH, nine inches of rise, seven inches of run) is worth about half a run below average per 100 pitches. I will call that its StuffRV. The following graph demonstrates the average StuffRV (per 100) and a smoothed out actual run value (per 100).

There's a lot going on here.

-Our main concern is with a pitcher's endurance with regards to his stuff. The takeaway from this graph, then, is that from a pitcher's 10th pitch to his 60th pitch, his stuff will deteriorate by about a 10th of a run per 100 pitches.

-My methodology grades out fastballs as inferior to breaking balls. You can tell by looking at the very first mark on the graph. A pitcher's first pitch of the day is a fastball about 80% of the time, while in total, pitchers throw fastballs 60% of the time. On an 0-0 count otherwise, pitchers throw fastballs just under three quarters of the time. Same as on pitches two through ten: 70-75%. For some reason, pitchers like to start their outings off with a fastball.

-A pitcher's success is, of course, largely dependent on the batter, and you can see when each lineup spot tends to hit by following the true run value curve. Pitchers face the eighth and ninth batters in the order generally during their 25th to 35th pitches and again their 60th to 70th pitches. The two peaks of the True RV line occur when starting pitchers are generally facing the 4th and 5th batters in the lineup.

-Relievers have better stuff than starters. The section from 1-15 pitches is composed mostly of relievers, and that's the lowest trough in the StuffRV curve.

-Those pitchers who managers leave in past the 100-pitch mark are well above average, and their stuff continues to be above average. I'll account for this survivor bias another time. For now, I'd rather do brief case studies of one pitcher who maintains his stuff throughout the game, and another who does not.

I correlated every pitcher's pitch count with his StuffRV on that pitch. Brett Anderson seems to pick up steam the deeper he goes into a game. I classified his pitches into four clusters: fastball, slider, changeup, and curveball. So the first thing I did was look for trends in his velocity and movement. Well, nothing really stood out. His slider gains almost an inch in movement by the end of the game, but I don't think that's it. Then I remembered that Anderson's slider was the most valuable slider in baseball last year, and it edges out Zack Greinke's as the *nastiest* starter's slider in baseball by my rankings.

Pitches FB SL CH CU
1-25 67% 23% 6% 4%
26-50 51% 28% 14% 8%
51-75 43% 31% 12% 14%
75+ 39% 38% 11% 12%

So there you go. He challenges hitters with fastballs the first time through the lineup and then switches to mainly off-speed pitches, which are his bread and butter. Hence, you might say, he improves his stuff as the game goes on.
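A usage table like Anderson's is straightforward to build once the pitches are classified. Here is a minimal Python sketch, assuming a list of (pitch number in the game, pitch type) pairs. The function names are hypothetical, and the sketch treats pitch 75 as part of the 51-75 bucket, with the "75+" row covering everything after it, which is an assumption about how the table was cut.

```python
from collections import Counter, defaultdict

BUCKETS = [(1, 25, "1-25"), (26, 50, "26-50"), (51, 75, "51-75")]

def bucket_label(pitch_number):
    """Map a pitch's position in the outing to its row in the table."""
    for low, high, label in BUCKETS:
        if low <= pitch_number <= high:
            return label
    return "75+"  # assumed to mean every pitch after the 75th

def pitch_mix(pitches):
    """Return bucket -> {pitch_type: percent of pitches in that bucket}.

    pitches: iterable of (pitch_number_in_game, pitch_type) tuples.
    """
    counts = defaultdict(Counter)
    for number, pitch_type in pitches:
        counts[bucket_label(number)][pitch_type] += 1
    mix = {}
    for label, counter in counts.items():
        total = sum(counter.values())
        mix[label] = {t: round(100 * n / total) for t, n in counter.items()}
    return mix
```

Rounding each cell to a whole percent means a row can sum to 99 or 101, so a row that does not add to exactly 100 is not necessarily an error.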

Jered Weaver, on the other hand, has worse stuff by my calculation as the game goes on. Weaver throws his fastball 68% of the time in his first 25 pitches, compared to 52% from his 51st pitch on, and in exchange his changeup usage increases from 10% to 23%. Not only is there a difference in Weaver's pitch selection, but there's also a notable change in his pitch quality. Here are the characteristics of his fastball as the game goes on:

Pitches Velocity StuffRV True RV
1-25 90.0 -0.19 -0.29
26-50 89.7 -0.13 0.06
51-75 89.2 -0.10 0.63
75+ 89.0 -0.07 0.16

But pitchers who have a changeup as good as Weaver's don't rely on stuff to get by. Weaver's all about deception. And that stuff I don't know how to measure.

Touching Bases, December 26, 2009
Batted Ball Location Leaderboards
By Jeremy Greenhouse

My first post on this site in February borrowed the main idea of Dave Studeman's batted ball reports, except instead of looking at the trajectory of batted balls, I grouped them by vector. A full season has passed, so who were the best pull hitters in baseball this year?

Value of Pulled Batted Balls

Pulled09.jpg


Chase Utley and Hanley Ramirez were both worth about 40-45 runs above average with the bat on the year. Utley got some value out of walking and taking his HBPs, but the bulk of their value at the plate came from them pulling balls. Reading John Walsh's piece in this year's Hardball Times Annual, I realized that Utley and Hanley weren't fully appreciated because their contributions outside the batter's box were equally valuable. Their baserunning, position, fielding value, and ability to stay on the field add another 40-45 runs to their value.

Albert Pujols and Kevin Youkilis returned from last year's top ten, while Dan Uggla, second last year, finishes one spot outside the top ten.

I think part of the reason that Youk and Jason Bay are listed is that they get to take advantage of the Green Monster. I'm not trying to discredit them, since they're both excellent right-handed hitters, but I am trying to discredit Dustin Pedroia and Mike Lowell. Here is the average run value of pulled fly balls and line drives for Boston's four main RHBs since 2008.

Player Home Away Diff
Kevin Youkilis 0.58 0.48 0.10
Jason Bay 0.52 0.43 0.09
Mike Lowell 0.47 0.26 0.21
Dustin Pedroia 0.40 0.22 0.18

Lowell pulls half his balls in play, too, so I doubt there's any park that he'd rather play in than Fenway. As for Pedroia, he has a career .332/.391/.505 line at home. On the road, he hits .283/.350/.406. He has never hit a 400-foot home run in his career according to Hit Tracker. I doubt anybody is more suited for his home park than Pedroia is for Fenway.

At the bottom of the list is Casey Kotchman, who I believe is the only first baseman to have totaled a negative value on pulled balls. Over a quarter of Kotchman’s balls in play were pulled groundballs, and he hit .073 on those. In 2008, a whopping third of his balls in play were pulled grounders, though he managed to hit .154 on them, so it's possible defenses have figured him out.

Value of Center Field Batted Balls

Center09.jpg


Both Phillies repeat on this leaderboard from last year, while O-Cab and Pedroia again prove their ineptitude at hitting the ball up the middle.

Ryan Howard focused his prodigious power to center this year. Previously, Howard hit the plurality of his home runs the opposite way three times in his career, and in 2007, he had pulled the highest share of his homers, but this year, he hit a remarkable 21 of his 45 homers to center. Mark Reynolds came closest to matching Howard with 17 home runs to center.

Value of Opposite Field Batted Balls

Opposite09.jpg


Joe Mauer’s 35-run total is absurd. He was worth 13 runs going the other way last year and his -10 runs on pulled balls actually was a league low. Now, he's cracked both the center field and opposite field top ten, and his futility pulling the ball was skimmed down to -5 runs. Mauer hit 34% of his balls to the opposite field, while the league average is 27%. His backup Mike Redmond hits the highest rate of balls the other way in the league.

Only Adrian Gonzalez hit more opposite-field homers than Joe Mauer this year. Adrian Gonzalez in Fenway Park would be scary. Derek Jeter, who’s always had opposite field power, hit the most home runs to right field batting right handed this year, possibly rejuvenated by the even shorter short porch at the New Yankee Stadium. In 2008, Jeter had better luck going the other way with his fly balls when he was on the road than he did when he was at home. That split did not continue in 2009. Jeter produced slightly better results on flies to right in the New Yankee Stadium than he did while playing on the road.

Jimmy Rollins' batted ball profile continues to perplex. He hit an anemic .200 on grounders this year, below his already mediocre .231 career average. Though speed is important for batters to reach base safely on grounders, spraying the ball to all fields might be even more weighty. Rollins hit only 7% of his groundballs the other way, which allows defenses to shift their fielders to one side of the field, and signifies that he's rolling over on the ball when he hits grounders. Placido Polanco, Jermaine Dye, and Joe Crede all hit over a third of their flies to the opposite field, but under 5% of those balls fall for hits.

A spreadsheet containing the full results can be found here. Batted ball location data via MLBAM. The field was partitioned equally into thirds to classify right/center/left.
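The thirds classification can be sketched as follows. This is an illustration under my own assumptions rather than the exact code behind the leaderboards: I assume each batted ball already has a spray angle in degrees, with -45 on the left field line, 0 straight up the middle, and +45 on the right field line. MLBAM supplies hit coordinates rather than angles, so that conversion is left out here.

```python
def field_third(spray_angle_deg):
    """Classify a fair ball into left/center/right using equal 30-degree thirds.

    Assumed convention: -45 is the left field line, +45 the right field line.
    """
    if not -45.0 <= spray_angle_deg <= 45.0:
        raise ValueError("ball is not in fair territory")
    if spray_angle_deg < -15.0:
        return "left"
    if spray_angle_deg <= 15.0:
        return "center"
    return "right"


def direction(spray_angle_deg, bats):
    """Translate a field third into pulled/center/opposite for a batter.

    bats is "R" or "L". A right-handed batter pulls to left field,
    a left-handed batter pulls to right field.
    """
    third = field_third(spray_angle_deg)
    if third == "center":
        return "center"
    pull_side = "left" if bats == "R" else "right"
    return "pulled" if third == pull_side else "opposite"


print(direction(-30.0, "R"))  # a ball down the left side from a righty
print(direction(-30.0, "L"))  # the same ball from a lefty
```

Switch hitters need the batting side for each plate appearance, not a single label per player.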

Touching BasesDecember 23, 2009
Aybar vs. Greinke
By Jeremy Greenhouse

Marc Topkin of the St. Petersburg Times on July 18:

Manager Joe Maddon had his reasons for starting Willy Aybar on Saturday.

Some he could explain, such as wanting to keep Aybar fresh for his primary duties as the Rays' top pinch-hitter. And some Maddon couldn't, and wouldn't, derived from extensive research and data analysis by the Rays front office staff that deduced Aybar would be a prime weapon against Royals ace Zack Greinke…

"Free Willy," Maddon said. "This is something we do back at the office, and we really crunch numbers, just so many different things. And Willy came out on top vs. Greinke, so we had to throw him out there."

The research is based on what Maddon called "an esoteric system" and had to be thorough and complex because Aybar had never faced Greinke. And it went beyond the more visual "swing planes" they have discussed before in arranging matchups.

It is also proprietary, Maddon said, joking that revealing it would carry the potential penalty of banishment to semipro ball back in eastern Pennsylvania.

"I would probably end up managing the Japan-Jeddo Stars," he said.

Aybar went 3 for 3 off Greinke.

On December 7, Tommy Rancel of DRaysBay published this exchange he had with Tampa Bay Rays coordinator of baseball operations James Click:

TR: what does Willy Aybar know about Zack Greinke?

JC: Whatever it is, I hope he's told our other hitters.

I’m intrigued.

The Idea

Use pitchf/x data to create a projection system for individual batter/pitcher matchups.

The Qualifications

I have none. The idea is overly ambitious, and I quickly realized I'm not the man for the job.

The Method

Chris Moore rather brilliantly ranked the best fastballs in baseball using five parameters: horizontal location, vertical location, velocity, vertical movement, and horizontal movement. Zack Greinke unsurprisingly came out on top.

Chris only looked at fastballs from right-handed pitchers against right-handed batters. If Chris were to have looked at RHP vs. LHB matchups, I’m sure Greinke would not have come out ahead, and instead Mariano Rivera would have topped the list. But what about RHPs against only Willy Aybar?

So I came up with a way to predict Aybar’s performance given certain pitch tendencies. For example, Aybar does best against slow fastballs around 90 MPH and he likes the ball down the middle. Plots to illustrate these points.

Aybar.jpg


You can't plot all five dimensions together, but the point is that I made a model using all five variables. I then predicted that model onto a data set containing only Greinke pitches. So the model doesn't have any idea how Greinke would pitch Aybar, but it knows that Greinke likes to throw 94 MPH fastballs on the outer part of the plate, and it knows that Aybar likes to hit 88 MPH fastballs down the middle. After some regression to the mean, you have yourself a projection.

The Technical Details

My first data set consisted of all pitches Aybar faced from 2008 to July 18, 2009, and I tried to limit my sample further to only non-sidearming/knuckleballing RHPs. I ran a local regression to predict run values, weighting recent data the most heavily. My second data set contained all pitches from Greinke to LHBs over the same time span. I predicted my model onto that data set. Next, I regressed the expected run values for Aybar against Greinke toward the actual run values of Greinke vs. all LHBs he faced. I then regressed my projection even further to the average performance of switch-hitting LHBs against RHPs, which I found to be around the league average .330 wOBA.
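To make the steps above concrete, here is a rough sketch. It is not the model I actually ran: in place of a proper local regression it uses a simpler kernel-weighted average (a Nadaraya-Watson smoother), and the bandwidths, the recency half-life, the shrinkage constants, and every data row are invented for illustration. It also stays in run values per pitch instead of converting to wOBA.

```python
import math


def gaussian(distance, bandwidth):
    """Gaussian kernel weight for one feature."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def local_prediction(train, query, bandwidths, half_life_days=180.0):
    """Kernel-weighted mean run value of the batter's pitches near `query`.

    train: list of (features, days_ago, run_value) for pitches the batter saw.
    Recent pitches get more weight through an assumed exponential decay.
    """
    num = 0.0
    den = 0.0
    for features, days_ago, run_value in train:
        weight = 0.5 ** (days_ago / half_life_days)
        for x, q, h in zip(features, query, bandwidths):
            weight *= gaussian(x - q, h)
        num += weight * run_value
        den += weight
    return num / den if den > 0 else 0.0


def regress_toward(estimate, target, n, k):
    """Shrink `estimate` toward `target`; at n == k they get equal weight."""
    return (n * estimate + k * target) / (n + k)


# Features: (horizontal location ft, vertical location ft, velocity mph,
#            horizontal movement in, vertical movement in). All hypothetical.
batter_pitches = [
    ((0.0, 2.5, 88.0, -5.0, 9.0), 30, 0.10),
    ((0.2, 2.4, 89.0, -6.0, 8.0), 200, 0.06),
    ((-0.8, 2.0, 94.0, -4.0, 10.0), 10, -0.05),
    ((-0.9, 1.8, 95.0, -3.0, 11.0), 400, -0.03),
]
pitcher_pitches = [
    (-0.8, 2.1, 94.0, -4.0, 10.0),
    (-0.7, 1.9, 93.5, -4.5, 10.5),
]
bandwidths = (0.5, 0.5, 2.0, 3.0, 3.0)

# Step 1: predict the batter's run value on each of the pitcher's pitches.
raw = sum(
    local_prediction(batter_pitches, p, bandwidths) for p in pitcher_pitches
) / len(pitcher_pitches)

# Step 2: shrink toward the pitcher's actual results against that side.
pitcher_vs_side = -0.01  # hypothetical run value per pitch
step2 = regress_toward(raw, pitcher_vs_side, n=len(batter_pitches), k=50)

# Step 3: shrink again toward the league baseline (0.0 = average).
projection = regress_toward(step2, 0.0, n=len(batter_pitches), k=50)
print(f"raw {raw:+.4f}, after shrinkage {projection:+.4f}")
```

With a sample this thin, the shrinkage steps dominate, which is more or less what happened with the real projection.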

The Results

I predict Aybar to be precisely league average against Greinke.

My analysis gleans hardly any new insight into player projections. Aybar is below average against RHPs, but Greinke isn’t a world-beater himself against LHBs, having allowed an .824 OPS against LHBs in 2008.

Pretty much, I don’t think you’re going to get enough data from 1,000 pitches from hitters to beat out traditional projection systems. (For pitchers, however, any amount of pitchf/x data adds significant value.) So I guess I'm not on the same track as Friedman, Click, Kalk and the rest of whatever the Rays have going on in baseball ops.

I actually projected Aybar against all RHPs, and for what it's worth, I predict Aybar will do well against Pedro Martinez and poorly against Mariano Rivera. My model tells me Aybar will do surprisingly well against Roy Oswalt and surprisingly poorly against Armando Galarraga. It's not worth much.

The Loosely-Related Tim McCarver Quote

"I said it was Izturis who didn't get the bunt down last year. It was actually Manny Aybar. Excuse me, Erick Aybar, not his younger brother Manny who plays for Tampa Bay."
Touching BasesDecember 10, 2009
Crowding the Plate
By Jeremy Greenhouse

Roger Dorn earned back some respect when he showed that he was willing to take one for the team. But really, that pitch was so far up and in, it would’ve been more impressive to have seen him in his old age avoid it.

Some players, though, do have the ability to dodge pitches. Orlando Cabrera has seen over 5,000 pitches in the last two years, and has been able to get out of the way of all but one of them. At the other end of the spectrum, Chase Utley has taken his base on 51 HBPs in the last couple years, 21 more than the next closest batter.

Batters are hit in just over 1% of plate appearances when facing same-handed pitchers, while opposite-handed matchups result in half as many HBPs. 10% of pitches are inside in same-handed plate appearances, while 7% are inside in opposite-handed plate appearances. This explains some of the difference in hit by pitch probability. Using 2008-2009 pitchf/x data, I found the expected probability of a batter getting hit by a pitch that is at least a foot from the center of the plate—more or less all pitches that would normally be called for a ball inside.


HBPLHP.jpg


The figure of the batter (borrowed from Mike Fast) stands approximately a foot off the plate.

Here you see that if you’re a pitcher and have the intent of throwing a bean ball, you should throw at the batter’s back, where 80-90% of pitches will hit him.

The portion from the knees down (about one and a half feet off the ground and lower) shows more gray area in the opposite-handed graphs than in the same-handed graphs. The head area is also more of a danger zone for same-handed batters. My guess is that batters of the same handedness as the pitcher pick up the ball later in the pitcher's delivery than they do facing opposite-handed pitchers, and therefore same-handed batters have less time to react to the pitch as they realize it’s going to hit them.

I also considered that velocity might be a factor in hit by pitch expectancy. Again, my sample is restricted to pitches inside.


HBPVelo.jpg


I’d imagine this trend has to do with the relationship between a pitcher’s velocity and his control. Breaking balls and 100 MPH heaters are not located as well as 90 MPH fastballs. But maybe batters are also more willing to get hit by the slowest pitches and not as able to get out of the way of the fastest pitches. I decided to include horizontal location, vertical location, and velocity as components in a regression to find the probability of HBPs.

Player Inside Pitches Extra HBPs HBP Probability
Chase Utley 266 40 449%
Carlos Quentin 288 26 442%
Jason Giambi 255 22 502%
Kelly Shoppach 231 21 414%
Aaron Rowand 352 20 340%
Billy Butler 526 -13 13%
Vladimir Guerrero 544 -14 37%
Edgar Renteria 501 -16 6%
Michael Young 510 -16 15%
Adrian Gonzalez 587 -17 39%

HBP probability is the rate at which a batter is hit by pitches over what would be expected from the average batter of that handedness.
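Here is a small sketch of how the last two columns relate, as I read them: "Extra HBPs" is actual minus expected, and "HBP Probability" is actual divided by expected. The per-pitch probabilities would come from the regression on location and velocity. The ones below are invented, and `hbp_summary` is just an illustrative name.

```python
def hbp_summary(expected_probs, actual_hbp):
    """Compare a batter's actual HBP total with the sum of per-pitch expectations.

    expected_probs: one model probability per inside pitch the batter saw.
    """
    expected = sum(expected_probs)
    return {
        "inside_pitches": len(expected_probs),
        "expected": expected,
        "extra": actual_hbp - expected,
        "ratio": actual_hbp / expected,
    }


# Hypothetical batter: 244 inside pitches, most of them unlikely to hit anyone.
probs = [0.02] * 200 + [0.10] * 40 + [0.50] * 4
summary = hbp_summary(probs, actual_hbp=25)
print(
    f"{summary['inside_pitches']} pitches, "
    f"{summary['extra']:.1f} extra HBPs, "
    f"{summary['ratio']:.0%} of expected"
)
```

A batter who gets hit exactly as often as the average hitter would on the same pitches comes out at zero extra HBPs and 100%.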

I imagine this list is most indicative of how far batters stand from the plate. I also believe that pitchers are aware of which batters are willing to take their HBPs, so such batters are not pitched inside as often as they would be otherwise.

Here are the charts of the best and worst at being hit by pitches, though I'm not sure you want to be the best in this category.


UtleyHBP.jpg


Solid points are HBPs, while hollow points are everything else. The background portrays the HBP probability for all LHBs.

Utley’s getting hit by anything at least a foot inside, where 16 of 21 pitches went for HBPs. He’s getting hit by anything at all inside and four feet up, about the location of his elbow, where eight of 11 pitches went for HBPs. In fact, Utley’s been hit on 10 pitches not charted, as they were less than a foot inside. All of those pitches were also at the letters or higher, so I’d imagine he leaned into at least a couple of them.

Meanwhile, Adrian Gonzalez has been hit by a few more pitches than anyone else on the list of laggards. He just gets pitched inside quite a bit and is more adept at dodging balls than Patches O’Houlihan.

Full results, including pitchers, can be found here.

Touching BasesDecember 03, 2009
Controlling the Zone
By Jeremy Greenhouse

"The STRIKE ZONE is that area over home plate the upper limit of which is a horizontal line at the midpoint between the top of the shoulders and the top of the uniform pants, and the lower level is a line at the hollow beneath the knee cap. The Strike Zone shall be determined from the batter's stance as the batter is prepared to swing at a pitched ball."

Eddie Gaedel knows not a called strike. The 3-foot-7 dwarf took four balls in his lone Major League plate appearance. (If you want to see a discussion on the practicality of short pinch-hitters taken well beyond its logical extreme, follow this link.)

Gaedel physically shrunk the strike zone. I’m interested to see what batters can control the strike zone without any such advantage. Who manages to earn a ball on a pitch on the black or a strike on a pitch at the letters? That’s where pitchf/x comes into play.

John Walsh and Dave Allen have found the true dimensions of the strike zone using pitchf/x data. Jonathan Hale has studied individual umpire strike zones and found that Cy Young winners and control pitchers get better calls, and Hale dispelled the myth that rookies get big leagued by umps.

I assigned every pitch since 2008 an expected called strike probability based on the horizontal location of the pitch and a scaled vertical location*, while also accounting for batter handedness, pitch movement/velocity, and the umpire. After that, I added up the expected balls and called strikes of players, and the actual ball/strike numbers for all players. Here are the batters who have the largest disparity between their expected ball probability and the actual rate at which balls are called on them.

Player Balls:Called Strikes Extra Balls Ball Probability
Nelson Cruz 2.9 49 115%
Chris Iannetta 3.1 64 115%
Rick Ankiel 3.6 47 115%
Michael Young 2.5 98 114%
Ryan Raburn 2.7 37 114%
John Buck 2.9 38 113%
Josh Hamilton 4.3 40 112%
Brad Wilkerson 2.2 28 112%
Justin Morneau 2.6 82 112%
Ken Griffey Jr. 2.6 79 112%
Hunter Pence 2.5 -69 90%
Carlos Beltran 2.5 -75 89%
Alfonso Soriano 2.8 -56 88%
Travis Hafner 2.2 -52 88%
Brandon Phillips 2.7 -73 87%

Michael Young and Carlos Beltran (who I suppose is synonymous with the called strike to Met fans) have the highest and lowest number of extra balls among all players, respectively. The average difference between a called strike and a ball is between a tenth and an eighth of a run. So Young has gotten nearly 20 runs of value out of controlling the strike zone better than Beltran has. To look deeper into this, I plotted their respective strike zones (Beltran's a switch hitter, so two for him) against the league average strike zones. Inside these contour lines, a pitch is more likely than not to be called a strike, while outside the contour lines, pitches are called for balls greater than 50% of the time.

Beltranyoung.jpg

The difference between Beltran and Young can be seen at the knees. I should note the caveat that this entire effect could be caused by a few stringers listing Beltran’s bottom of the strikezone too high and Young's too low.
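The run estimate for Young versus Beltran is simple arithmetic on the Extra Balls column, and it can be checked in a few lines. The two run values per call are the bounds quoted above (a tenth and an eighth of a run); the function name is mine.

```python
def runs_from_calls(extra_calls, runs_per_call):
    """Convert a surplus of favorable calls into runs."""
    return extra_calls * runs_per_call


young_extra_balls = 98
beltran_extra_balls = -75

# Gap in favorable calls between the two batters over 2008-09.
gap = young_extra_balls - beltran_extra_balls

low = runs_from_calls(gap, 0.100)   # a tenth of a run per call
high = runs_from_calls(gap, 0.125)  # an eighth of a run per call
print(f"{gap} calls is worth roughly {low:.1f} to {high:.1f} runs")
```

The gap works out to 173 calls, so the "nearly 20 runs" figure sits in the middle of that range.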

I don't want to make any rash conclusions on what type of players get the benefit of the doubt from umpires, but with three Rangers in the top ten, and another five Rangers in the next dozen on my list, I feel that I can say with confidence that Rudy Jaramillo is paying off umpires. Just throwing it out there. But I'm pretty sure it's true.

Seriously, though, one of the first things I noticed was that 10 of the top 30 players on the leaderboard were catchers. It turns out catchers are 2-3% more likely to have a pitch called a ball than average. It's fully possible that that's just noise, of course.

I was especially interested in batters' luck in full count situations. The leverage of a full count is double that of any other count, with the disparity in value between a walk and an out coming in at around 0.6 runs. It turns out that Jack Cust, who has taken more full count pitches in the last two years than anyone but Adam Dunn, has had easily the best luck on full counts, with ten more balls called than expected. (Dunn's had one fewer than expected.)

Here I've plotted Cust's called strikes in green, balls in red, and the average LHB strike zone contour in blue.

Custfull.jpg

I count two strikes easily outside the zone, and nine balls that were easily inside the zone. Most batters experience a smaller strike zone on full count than on average, but Cust has been particularly lucky. Serves him right for not swinging too often in a full count.

How about on the pitcher's side?

Player Balls:Called Strikes Extra Strikes Strike Probability
Tom Glavine 2.1 88 178%
Peter Moylan 2.3 45 130%
Derek Lowe 1.9 288 129%
Gary Majewski 1.9 28 126%
Francisco Cordero 2.2 73 123%
Jorge Campillo 1.8 87 122%
B.J. Ryan 2.2 40 118%
Jamie Moyer 2.0 171 118%
Matt Maloney 1.9 18 117%
Tim Hudson 2.2 61 117%
Dontrelle Willis 2.7 -30 86%
Luke French 2.4 -29 86%
T.J. Beam 2.5 -16 86%
Brandon League 2.9 -35 85%
Denny Bautista 2.4 -42 83%
Ryan Tucker 3.6 -22 78%


It seems like control pitchers have better luck with umpires, which Hale has already shown. (The correlation between balls to called strikes ratio, which I'd consider a decent measure of control, and strike probability is -.32.) Mariano Rivera, the best control pitcher in baseball, is 13% more likely to have his pitches called for strikes than you would expect. While watching Mo, you can sometimes actually see him intentionally try to expand the zone. When Hale did his study in 2007, he found that Derek Lowe had the highest rate of extra strikes per game. Demonstrating this ability for three straight years might be worth looking into further. Jamie Moyer and Lowe have also gotten five more strikeouts than expected on full counts, the most in the majors.

About the reliability of these ball and strike probabilities: For batters, the split-half correlation for "ball probability" (which I'm defining as the probability of a called pitch being called a ball above what is expected) reaches .5 when I limit my sample to batters with a minimum of 125 called pitches. It takes batters with at least 600 called pitches to reach a .7 correlation. The league average pitches per plate appearance is 3.8, and an average of 2.1 of those pitches are called for a ball or strike by the umpire. So I’d say that it takes about 300 plate appearances for this metric to stabilize. You can compare that to more common metrics by reading the series by Pizza Cutter. Or take a sample of players with at least 50 plate appearances, and you know to regress halfway to the mean. For pitchers, r = .5 when the sample is limited to pitchers who have thrown at least 60 called pitches, and it takes 300 called pitches to reach an r of .7.
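A split-half check of this sort can be sketched as below. Treat it as an illustration only: I split each player's called pitches into alternating halves here, whereas a random split is the more usual choice, and the residuals (actual call minus expected ball probability) are invented. The helper names are mine.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def split_half_correlation(residuals_by_player, min_pitches):
    """Correlate each player's mean residual in one half with the other half.

    Players with fewer than `min_pitches` called pitches are dropped.
    Returns None when fewer than three players qualify.
    """
    first, second = [], []
    for residuals in residuals_by_player.values():
        if len(residuals) < min_pitches:
            continue
        even, odd = residuals[0::2], residuals[1::2]
        first.append(sum(even) / len(even))
        second.append(sum(odd) / len(odd))
    if len(first) < 3:
        return None
    return pearson(first, second)


def called_pitches_to_pa(called_pitches, called_per_pa=2.1):
    """Convert a called-pitch threshold into plate appearances."""
    return called_pitches / called_per_pa


# Hypothetical per-pitch residuals for four players.
residuals = {
    "A": [0.3, 0.1] * 5,
    "B": [-0.2, 0.0] * 5,
    "C": [0.0, 0.1] * 5,
    "D": [0.5, 0.5, 0.5, 0.5],  # too few called pitches, filtered out
}
r = split_half_correlation(residuals, min_pitches=6)
print(f"split-half r = {r:.2f}")
print(f"600 called pitches is about {called_pitches_to_pa(600):.0f} PA")
```

Raising `min_pitches` and watching where r crosses .5 and .7 is how thresholds like 125 and 600 called pitches fall out.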

*And Glove Slap to Tango on how to scale vertical location. I unfortunately decided to use the mean values of every batter's top and bottom strike zone values as inputted by MLBAM stringers. I probably should have scaled to the median, or better yet the median by month. Maybe next time.

Touching BasesNovember 19, 2009
Holliday-Bay: Visual Scouting Reports 1.0
By Jeremy Greenhouse

Jason Bay and Matt Holliday are the two best hitters on the market. Holliday is a year younger than Bay, and will likely command a more lucrative contract. If you'd like to know how they stack up in left field, check out ESPN's recent articles analyzing the matter. But I’d like to concentrate on their hitting. Here’s how they stack up, per FanGraphs.

Over the past two years, they’ve been rather even hitters. Using 2008-09 pitchf/x data, I’ll take a deeper look.

A couple weeks ago, I introduced a series of graphs that try to provide a visual scouting report of sorts for hitters. Here's how each batter performs by pitch location.

They are strikingly similar compared to league average. Middle and lower in, they’re well above average, but they have weaknesses up and in. I'm surprised that hitters the caliber of Holliday and Bay perform worse than league average in any spots. Holliday also struggles more than the average batter on pitches down and out of the zone, while Bay appears to excel at pitches way down and away, likely a result of his excellent plate discipline.

No matter how I break these guys down, they'll turn out above league average at almost everything, so I prefer to compare them to themselves. The next set of graphs shows how they do relative to their own averages, as opposed to the league average. Therefore, every single batter will have some blue—even Pujols—and every single batter some red—even Tony Pena, and that's because every single batter has relative strengths and weaknesses.

Bay appears to have a great knowledge of the strike zone, as his “swing zone” and “strike zone” nearly overlap. (These contour lines indicate where the probability shifts from greater than 50% to less than 50%. For example, pitches outside the black ellipse are more likely to be called for a ball than a strike, and pitches inside the red ellipse are more likely to be swung at than taken.) Holliday, however, has a distinct region outside the strike zone where he owns a negative run value. This seems to stem from Holliday's propensity to expand the strike zone. Yet he doesn’t face the same problems up in the zone, even though he’s willing to swing at high balls too.

To look deeper into this, I plotted the same red 50% swing zone, and also included Holliday's contact zones at 75% and 90% intervals, which show where he's most likely to make contact when he swings. You can also see 50 separate points that indicate the location of pitches that resulted in Holliday's home runs.

What we're interested in is the very top and very bottom of his swing zone, the portions that extend beyond his strike zone. It turns out that these regions also extend beyond his 75% contact zones. There is slightly less area up top between his swing zone and contact zone than there is in the bottom region, meaning he is better at making contact on high pitches out of the strike zone than on low pitches out of the strike zone. But he hasn't hit homers in either of those regions. He has swung half the time at these bad balls, and whiffed over a quarter of the time when he does pull the trigger. The most important thing to remember is that both of these swing-and-miss regions would be called for balls more often than not if he would just lay off.

How about their platoon splits? I use release point data for these. Like the previous graphs, these are from the batter's point of view.

Bay exhibited a reverse platoon split two years ago, but over his career he has maintained a normal split. Normally I exclude Chad Bradford's release points, since they’re outliers, but I wanted to include them to show Bay’s success against submariners and sidearmers. He’s five-for-eight against Bradford in his career. Bay’s had less success against lefties with lower arm slots. He’s 0 for 16 with three walks against the likes of Brian Fuentes, Billy Wagner, and Javier Lopez. In 2009, Bay and Holliday both faced the highest rate of LHPs of their careers.

Now, I’m not sure if this next set of graphs will catch on, but I wanted to know how batters fare by pitch type, so here’s what I came up with. You have to have some knowledge of pitchf/x data to fully comprehend these graphs, but really all that I look for is to quickly see if there’s some type of obvious gradient from blue to red or red to blue that would suggest a batter does better against pitches of a certain velocity and break.

You can see very distinct sections in Bay's graphs where he excels against both LHPs and RHPs. These pitches have the same velocity and movement as your league average fastball (about 85-95 miles per hour with 5-10 inches of horizontal and vertical movement), which meshes with Bay's reputation as a fastball hitter. Over the past two seasons, Bay has been the fourth best hitter in baseball against the fastball. He’s not as good against curveballs, especially slower breaking pitches. I didn’t note anything remarkable in Holliday’s release point graphs nor his velocity/movement graphs, but Holliday does have interesting pitch splits. He saw 65% fastballs with the A’s and 55% with the Cardinals. In exchange, he saw his slider rate nearly double in St. Louis. The increase in slider percentage might have been part of the reason Holliday found renewed success, as he has been the top hitter in the Majors against the slider over the last two years.

Finally, hit locations.

Holliday was shipped out of Coors Field in the offseason, and he might have felt the hangover effect, having tailored his game to Coors where he has boasted a career OPS 160 points better than he has at all other venues. Or a combination of Oakland's pitcher's park, increased quality of competition, and decreased slider percentage plagued him. Or the first half of the season was just noise. His BABIP shot up from a career low .318 in Oakland to a career high .391 in St. Louis. Once he was traded, he hit more line drives and fewer infield flies. Due to its spacious foul grounds, the Coliseum's park factor for infield fly balls is around 104. More importantly, Holliday's home run per fly ball rate was just 9.7% in the Coliseum, well below his career rate of 16.5%. The average batter would see his home runs per fly ball plummet some 30% in a move from Colorado to Oakland. (Batted ball park effects from David Gassko.)

Meanwhile, Bay pulls balls at an extraordinary rate. Infielders should shift him to the pull side as much as is acceptable against right-handed batters. He pulls his fly balls at a high rate too.

Bay was traded to a haven in Fenway Park, where he could take advantage of the green monster in left field. Using Hit Tracker Online data, I plotted Bay's 2009 homers against his 2008 homers along with Fenway's and PNC's outfield dimensions.

15% of Bay's balls in play last year were fly balls to left, compared to 10% in 2008. Could this have been a conscious effort? In 2008, 43% of his flies to left were hits and 25% were homers. In 2009, 63% of Bay's flies to left were hits and 40% were homers. Thanks to the Monster, he managed more homers on flies to left last year than he had hits of any kind on flies to left two years ago. I'm sure the trade-off in opposite-field power for pull power yielded a net positive for Bay.

As always, these graphs are works in progress, so please feel free to leave comments on how to improve them.

Touching BasesNovember 13, 2009
With the Jumping and the Diving and the Whole Thing
By Jeremy Greenhouse

First there was the error. A century later, we finally have the natural antithesis to the error: the Web Gem.

The good people over at ESPN track all the best defensive plays in baseball on a daily basis, and come up with that short minute segment which is often a highlight of my night. This year, they began keeping track of who made each Web Gem, and were kind enough to share the data with me. Web Gems are intended solely for the purposes of the television viewer. They are simply the most entertaining plays to watch, and aren’t supposed to be used as a defensive measure. But errors really never should have been used as a defensive measure either. Nonetheless, these are all valuable data points, so my first order of business was to see how errors and Web Gems stack up. Here you have error to Web Gem ratio.

Position Ratio
C 14
1B 6
2B 4
3B 4
SS 4
RF 2
CF 1
LF 2
P 16

I assigned every player a position based on where he played the most innings, and all stats count toward that position.

There were five players who made no errors but tallied three or more Web Gems.

Name Position Web Gems
Grady Sizemore CF 5
Jason Bay LF 5
David DeJesus LF 4
Austin Kearns RF 3
Omar Vizquel SS 3

Here, you see some hits and some misses. Sizemore and Vizquel are, by all accounts, excellent defensive players. David DeJesus and Austin Kearns are average. And then there’s Jason Bay. For the Jason Bays of the world, I submit to you the Gary Matthews Jr. effect. Matthews, you may recall, made a stupendously phenomenal catch a couple years ago that was replayed and analyzed like the Zapruder film. His defensive reputation was built off of one play. And you can't point out the number of errors for outfielders to disprove the reputation, since outfielders don't make errors. Anyway, I hope nobody signs Jason Bay to a GMJ-type contract.

But it’s the aughts, and we’ve moved past errors. In fact, Baseball Info Solutions came up with a similar method presented in the Fielding Bible II called Good Plays/Misplays that uses objective criteria to come up with a more advanced Web Gems/Errors. These data aren’t available to the public, but some BIS defensive data is. FanGraphs lists the number of expected outs each non-catcher position player should make based on the distribution of balls in his zone. So I'm going to call the amount of Web Gems per expected out each player's Web Gem percentage.

Position Percentage
1B 0.88%
2B 0.93%
3B 1.49%
SS 1.11%
RF 0.89%
CF 1.25%
LF 0.82%

That looks much more like the defensive spectrum. Third basemen get a boost for playing the hot corner, where there are myriad opportunities to show off quick reactions as right-handed batters scorch balls down the line at over 100 MPH. 3Bs Ryan Zimmerman, Mark Reynolds, Brandon Inge, and David Wright were the only players to total double-digit Web Gems this year.
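For reference, the percentage is Web Gems divided by expected outs, pooled by each player's primary position. A minimal sketch with invented player rows (the real inputs are ESPN's Web Gem tallies and the expected outs listed at FanGraphs):

```python
from collections import defaultdict

# Hypothetical rows: (player, primary position, Web Gems, expected outs).
players = [
    ("Player A", "3B", 12, 400),
    ("Player B", "3B", 3, 600),
    ("Player C", "LF", 2, 250),
]


def web_gem_percentage(web_gems, expected_outs):
    """Web Gems per expected out."""
    return web_gems / expected_outs


def percentage_by_position(rows):
    """Pool Web Gems and expected outs by position, then take the ratio."""
    gems = defaultdict(int)
    outs = defaultdict(int)
    for _name, position, web_gems, expected_outs in rows:
        gems[position] += web_gems
        outs[position] += expected_outs
    return {pos: web_gem_percentage(gems[pos], outs[pos]) for pos in gems}


by_position = percentage_by_position(players)
for position, pct in sorted(by_position.items()):
    print(f"{position} {pct:.2%}")
```

Pooling before dividing keeps part-time players from swinging the position average the way a simple mean of individual percentages would.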

How does the ability to make the spectacular play match up with UZR, the most popular advanced defensive metric? For the rest of this article, I'll use the statistical method of correlation. A correlation coefficient returns the strength of the relationship between two variables. Closer to 1 indicates that there is a positive correlation, closer to -1 indicates a negative correlation, and closer to 0 means that there is no relationship. The overall correlation was .08, which is very weak. I think that on the Opening Day Web Gem segment, Karl Ravech should ask John Kruk* whether he knew that the .26 correlation coefficient between UZR and Web Gem percentage for third basemen was easily the strongest correlation of any position.

*How is it possible that someone who is so outspokenly anti-statistics literally walked away from the game the moment he reached a .300 career batting average?
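If you want to reproduce the position-by-position correlations, the calculation looks something like this. The rows are invented to show a perfect positive, a perfect negative, and a zero correlation. They are not the actual Web Gem or UZR figures, and the function names are my own.

```python
import math
from collections import defaultdict


def pearson(xs, ys):
    """Pearson correlation coefficient: covariance scaled to the range -1 to 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_by_position(rows):
    """Group (position, Web Gem %, UZR) rows and correlate within each position."""
    grouped = defaultdict(lambda: ([], []))
    for position, gem_pct, uzr in rows:
        grouped[position][0].append(gem_pct)
        grouped[position][1].append(uzr)
    return {pos: pearson(xs, ys) for pos, (xs, ys) in grouped.items()}


# Hypothetical rows: (position, Web Gem percentage, UZR).
rows = [
    ("3B", 1.0, 2.0), ("3B", 2.0, 4.0), ("3B", 3.0, 6.0),
    ("CF", 1.0, 3.0), ("CF", 2.0, 2.0), ("CF", 3.0, 1.0),
    ("LF", 1.0, 1.0), ("LF", 2.0, -1.0), ("LF", 3.0, -1.0), ("LF", 4.0, 1.0),
]
for position, r in sorted(correlation_by_position(rows).items()):
    print(f"{position} r = {r:+.2f}")
```

Real data will land somewhere in between these extremes, as with the .26 for third basemen.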

I would venture that Web Gem percentage correlates with UZR not because Web Gems assess skill, but because they track the most influential plays. The average runs saved per play defensively is .8, a tick higher than that for outfielders. I’d venture that most Web Gems are plays made no better than 10% of the time on average. So for every web gem, you can probably attribute at least half a run to that player's value.

Tangotiger’s invaluable Fans' Scouting Reports finished balloting this week. I’m guessing that Web Gems will be even more influential in shaping fans’ opinions than in swaying any defensive statistics. Here, I'll report the correlation coefficients between Web Gem percentage and several ratings from the FSR.

Average Rating Reaction Acceleration Velocity Hands Throwing Strength
1B 0.28 0.26 0.30 0.33 0.23 0.27
2B 0.24 0.21 0.24 0.25 0.22 0.21
3B 0.47 0.49 0.56 0.43 0.38 0.41
SS 0.31 0.31 0.36 0.27 0.28 0.22
LF 0.24 0.15 0.22 0.15 0.19 0.15
CF 0.00 -0.01 0.09 0.07 0.05 -0.06
RF -0.16 -0.11 -0.08 -0.17 0.00 -0.18

You see that fans are likely more influenced by spectacular plays made by infielders than by outfielders. Since such a significant portion of a third baseman's fielding ability is making the remarkable play, Web Gems correlate well for 3Bs in both UZR and scouting reports. The only surprising result I found is that there isn't a positive correlation between throwing strength from right and center fielders and Web Gem percentage. I figured a lot of outfield Web Gems would be a result of throwing strength. Perhaps throwing strength isn't strongly correlated with outfield assists. Something to look into.

And since the Gold Gloves were announced this week, I'll leave you with a table of each Gold Glover's relevant statistics as well as the guys at each position whom I consider to be the best not to have won the award. Adam Jones over Franklin Gutierrez really stands out as a poor selection.

Name Position Errors Gems Rating UZR Dewan
Adam Wainwright P 0 0 - - -1
Mark Buehrle P 1 0 - - 11
Brad Bergesen P 0 0 - - 6
Yadier Molina C 5 3 4.6 - 4
Joe Mauer C 3 1 4.5 - -1
Gerald Laird C 3 0 4.3 - 5
Adrian Gonzalez 1B 7 1 4.1 3.8 8
Mark Teixeira 1B 4 3 4.1 -3.7 0
Albert Pujols 1B 13 3 4.2 1.3 12
Orlando Hudson 2B 8 8 4.2 -3.3 6
Placido Polanco 2B 2 1 4.2 11.4 2
Chase Utley 2B 12 2 4.3 10.8 11
Ryan Zimmerman 3B 17 19 4.4 18.1 21
Evan Longoria 3B 13 2 4.5 18.5 16
Chone Figgins 3B 14 8 4.4 16.7 31
Jimmy Rollins SS 6 4 4.6 2.7 -2
Derek Jeter SS 8 2 3.6 6.6 4
Jack Wilson SS 12 9 4.4 14 27
Matt Kemp OF 2 6 4.4 2.6 -8
Adam Jones OF 5 7 4 -4.7 -10
Michael Bourn OF 3 7 4.6 8.6 5
Torii Hunter OF 1 5 4.3 -1.4 4
Shane Victorino OF 1 4 4.5 -4.1 -13
Ichiro Suzuki OF 4 3 4.6 10.5 12
Franklin Gutierrez OF 7 4 4.6 29.1 31
Carl Crawford OF 6 1 4.5 17.5 24
Nyjer Morgan OF 4 6 4.7 27.8 23

Thanks to the Baseball Tonight staff for giving me access to the Web Gem data. The Baseball Tonight schedule can be found here, and Web Gem leaderboards are updated during the season on the BBTN Clubhouse page here.

Touching BasesNovember 05, 2009
Visual Scouting Reports (Beta)
By Jeremy Greenhouse

What if I could just punch a couple lines into my computer and get to see the strengths and weaknesses of a player in graphical form? Harry Pavlidis does a good job using pitchf/x data to give a brief summary of pitchers, and Dave Allen is like King Midas graphing with R. I've set out to develop my own set of hitter graphs and I ask for your help in improving them for future, more in-depth, player analysis.

Here's what I've got so far, using Jayson Werth's 2008-2009 data as an example.

WerthGraphs.jpg

I'll break down the three components one by one. For now, the graphs represent the three most meaningful locations of the baseball's flight--from the pitcher's hand to the strike zone to the hit location. Here's Werth's "Batter Zone."

Werthzones.jpg

These are from the batter's perspective. Here, you can see Werth's expected run value is worst against pitches up in the zone and down and away. As you know, this holds true for most hitters. Where you see blue on the graph on the left, he performs worse than his average self. Then on the right, you see how he compares to the league average. He excels on pitches down and in, but is worse when challenged up.

So, how to improve these visualizations? I'm using a standard strike zone, but I'd like to create contour lines showing each batter's individual strike zones, and swing zones, showing where he's most likely to let it fly. I'm unsure how large the data frame should be. Right now, set at four feet by three feet, it captures the intricacies within the strike zone, but it might be leaving out some information for players like Vlad. The downside to expanding the frame is that for most graphs, the extra space will be occupied entirely by the average value of a ball, which will overwhelm the details of the visual. Lastly, for the graph on the right comparing Werth to average, I don't know whether to fix the color bar so that great hitters, like Chase Utley, appear red everywhere, since he's above average at everything, or to color in blue locations where he has a mere expected value of .01 runs better than average, since he's not as awesome in those locations as he is in others.

Here is how Werth does against release points, which is informative in showing his platoon splits.

Werthrelease.jpg

It appears to me that Werth has a normal platoon split, but struggles a fair bit against righties with a lower arm slot.

Lastly, Werth's spray charts.

Werthspray.jpg

Werth pulls his grounders at a high rate. In the outfield, depending on the precision of the data, the center fielder should shade a bit towards left.

I'd appreciate any input on how to improve this set of graphs. I'd also like to come up with graphs to show how hitters fare based on velocity and movement, but nothing comes to mind, and I have ideas for how to present hitf/x data if we ever get more of it.

I ran through the Phillies lineup excluding switch-hitters, so here they are, with brief comments. A quick glance at these graphs certainly won't give you any answers, but it might give some food for thought.

Chase Utley:

Utleygraphs.jpg

Utley is an insanely good hitter, no matter where you pitch him. However, don't try to brush him back, as Buster Olney suggested, because he will take his HBPs, which I'm guessing is what that graph's upper-right red portion consists of. He pulls almost everything.

Ryan Howard:

Howardgraphs.jpg

Howard also famously pulls his ground balls. Shifting against him is an obvious strategy, but the real question is where the third baseman should play.


Raul Ibanez:

Ibanezgraphs.jpg

Ibanez has batter zones similar to Utley's, but he's not as good anywhere.

Pedro Feliz:

Felizgraphs.jpg

Feliz is actually a good hitter on pitches away. I'd imagine that's because he lays off of most of them, since he can't hit them anyway. But he can be beat on the inner half. Feliz shows no platoon split and a normal spray chart.

Carlos Ruiz:

Ruizgraphs.jpg

Boy did Ruiz have a great series. He hits most of his flies the other way, but has hit all of his home runs to his pull field.

Please don't shy from sharing your thoughts.

Touching BasesOctober 29, 2009
A Quick Take on Velocity
By Jeremy Greenhouse

A few weeks ago, Max Marchi wrote an article for The Hardball Times analyzing pitchers' fastball velocity trends throughout the year. Last year on THT, Josh Kalk developed a preliminary aging curve for fastball velocity. Both Marchi and Kalk used pitchf/x data, which began being recorded reliably back in 2007. I decided to try to more or less replicate their studies, except using Baseball Info Solutions data from FanGraphs.

FanGraphs has monthly splits for all of its offensive and pitching statistics going back to 2002. My sample consisted of just a shade over 2,000 single seasons in which a pitcher recorded fastball velocities in each month. Borrowing the idea from Marchi, I divided a pitcher's monthly velocity by his yearly velocity to come up with a speed index for that month. The average velocity trend looks a lot like a temperature graph.
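A minimal sketch of the speed index: each month's velocity divided by the season's velocity, then averaged month by month across pitcher-seasons. The velocities below are invented for illustration, and the function names are hypothetical.

```python
def speed_indexes(monthly_velocity, yearly_velocity):
    """Map each month to monthly velocity divided by the season's velocity."""
    return {month: v / yearly_velocity for month, v in monthly_velocity.items()}


def average_trend(seasons):
    """Average each month's speed index over many pitcher-seasons.

    `seasons` is a list of (monthly_velocity_dict, yearly_velocity) pairs.
    """
    by_month = {}
    for monthly, yearly in seasons:
        for month, index in speed_indexes(monthly, yearly).items():
            by_month.setdefault(month, []).append(index)
    return {month: sum(vals) / len(vals) for month, vals in by_month.items()}


# Hypothetical pitcher-seasons (MPH), not the FanGraphs sample.
seasons = [
    ({"Apr": 91.0, "Jul": 92.6, "Sep": 92.1}, 92.0),
    ({"Apr": 88.2, "Jul": 89.4, "Sep": 89.0}, 89.0),
]
for month, index in average_trend(seasons).items():
    print(month, round(index, 4))
```

An index above 1 means the pitcher threw harder that month than he did over the full year.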

However, I'm not too concerned about adjusting for the weather. We understand that velocity increases from April to May, but I'm interested in the rate of increase among certain groups of players. I don't know how real these trends are, but for example, Zack Greinke over the last three years has averaged 92 in April and 94 in September. That's a dramatic increase in velocity, so is there something special about him that allows him to pick up steam? On the other hand, Pedro Martinez from 2002-2006 averaged 89.6 MPH on his fastball in April, but actually decreased his velocity as the year went on, averaging 88 MPH in September. Why would he wear down?

I've heard that it takes longer for taller players to find their mechanics, and they therefore throw harder later in the season than they do in April. So I grouped players by height, labeling those at least 6'6" as tall, and those who stand less than six feet as short.

The data seem to weakly support the notion that taller players take a bit more time to reach their velocity than shorter players. I repeated this process with a bunch of different groups of players. Graphs follow.

For girth, I used body mass index, which adjusts for weight by height. Weight recordings are never reliable, so take this for what it's worth.

It looks like heavier players might suffer a bit in the dog days of summer, which makes sense intuitively. Similarly, older pitchers (33 and older) might wear down in the summer months more so than younger pitchers (25 and younger).

And how about a velocity trend based on how hard the pitcher throws? Hard throwers averaged at least 93 miles per hour on their heater for the year while soft tossers were clocked at 88 MPH and below.

There appears to be a large difference in how hard they throw coming out of the gates in April. Or perhaps it's just that faster pitches are more affected by the temperature changes than slower pitches. Many of these groups will correlate with each other, so it could be that the reason tall pitchers start off slow is that they throw hard. Or vice versa.

Finally, I checked to see if velocity trends might be influenced by early workload. I grouped pitchers by those who threw greater than approximately 500 pitches in April, those who threw between approximately 100 and 500, and those who threw fewer than approximately 100.

Might a light pitch count in April pay dividends in August and September?

On to year-to-year trends, also known as aging curves. I tried to copy Kalk's method of using matched pairs and finding the difference between year one's velocity and year two's. My sample consisted of 3,275 matched pairs, with between 100 and 400 pairs in each group. However, unlike Kalk, I do not use a weighted average based on pitch count. I'm not entirely sure that an increased sample of pitches yields a more reliable average fastball velocity. I'm also not sure how to address the selective sampling issues, so for now, I don't.
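Here is a rough sketch of an unweighted matched-pairs calculation: pair each pitcher's season with his next one, average the velocity change at each age, and chain the averages into a curve. The chaining step and the sample rows are my assumptions for illustration. This is not Kalk's code, and it ignores the selective sampling problem in the same way the text does.

```python
from collections import defaultdict


def aging_deltas(seasons):
    """Unweighted mean velocity change from each age to the next.

    `seasons` is an iterable of (pitcher, year, age, velocity) rows.
    Only pitchers with consecutive seasons form a matched pair.
    """
    by_key = {(pitcher, year): (age, velo) for pitcher, year, age, velo in seasons}
    changes = defaultdict(list)
    for (pitcher, year), (age, velo) in by_key.items():
        following = by_key.get((pitcher, year + 1))
        if following is not None:
            changes[age].append(following[1] - velo)
    return {age: sum(d) / len(d) for age, d in sorted(changes.items())}


def aging_curve(deltas):
    """Chain per-age changes into a cumulative curve, starting at 0.

    Assumes the ages in `deltas` are consecutive.
    """
    curve = {}
    total = 0.0
    for i, age in enumerate(sorted(deltas)):
        if i == 0:
            curve[age] = 0.0
        total += deltas[age]
        curve[age + 1] = total
    return curve


# Hypothetical rows: (pitcher, year, age, average fastball MPH).
rows = [
    ("A", 2007, 27, 92.0), ("A", 2008, 28, 92.1), ("A", 2009, 29, 91.5),
    ("B", 2008, 28, 90.0), ("B", 2009, 29, 89.6),
]
deltas = aging_deltas(rows)
print(aging_curve(deltas))
```

Weighting each pair by pitch count, as Kalk did, would only change the averaging line in `aging_deltas`.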

My conclusions differ from Kalk's. He found that pitchers increase their velocity until they reach age 28 or 29. I find that velocity remains roughly constant until that age, at which point there is a rather sharp decline in fastball speed. Pitchers who survive in MLB into their thirties tend to lose around two MPH on their fastball.

Again, I'll separate players into groups and come up with aging curves. First by height, with 6'2" as the cutoff.

Strong evidence here that taller pitchers maintain their velocity better than shorter pitchers. How does this bode for Tim Lincecum, who already lost a couple miles per hour of hop on his pitches this year? Before making any assumptions, let's take a look at aging curves by weight class.

Perhaps Lincecum can take solace in the possibility that bulk might be a detriment in aging gracefully.

That's it. Let me know if there are any other types of players who you think might exhibit unusual velocity trends.

Touching BasesOctober 01, 2009
Bullpen Management
By Jeremy Greenhouse

It's the manager's job to get the most out of his players. With regard to the bullpen, this means optimally inserting relievers depending on such factors as the current baserunners, batter, and score. Hence, relief contributions tend to be measured by Win Probability Added. Having a bullpen ace makes the manager's job easier in that he doesn't even have to think about whom to give his highest-leveraged innings. LOOGYs are always nice too. FanGraphs has a statistic that compares a player's WPA in high-leverage situations vs. his WPA in low-leverage situations, to see how relatively Clutch that player is. Looking at the bullpen as a collective unit, we can more or less make the assumption that a Clutch bullpen has been managed well, which is to say that better relievers are pitching in well-deserved, higher-leveraged innings.

I collected all data from FanGraphs since 1979 on team bullpens. Here it is in the form of a Google Motion Chart. What you will see is each team bullpen's Clutch score this year plotted against its WPA/LI, which is a measure of how well the bullpen performed, treating high-leverage and low-leverage situations as equal.

While this data could be used to rank managers historically, I've chosen to focus only on this year for now. The Yankees, Red Sox, and Twins have been best at deploying their top relievers at opportune times thanks to three of the top closers in the game in Mariano Rivera, Jonathan Papelbon, and Joe Nathan. Meanwhile, the Pirates blow everybody else out of the water in bullpen mismanagement. Let’s see how they’ve gone about this. Sorted by the average leverage index for each reliever, here are the WPA figures for all Pirate relievers with at least 20 innings pitched.

Name pLI WPA
Matt Capps 1.6 -2.68
Sean Burnett 1.23 -0.04
John Grabow 1.23 1.19
Joel Hanrahan 0.92 1.04
Jesse Chavez 0.9 -0.83
Steven Jackson 0.88 -0.2
Jeff Karstens 0.78 -0.75
Evan Meek 0.65 0.14

Providing Matt Capps with the highly leveraged innings was a decent idea to start the season, but he’s been struggling this year. His walk rate has skyrocketed and his .370 BABIP isn’t helping matters. Meanwhile, Joel Hanrahan has been phenomenal with the Pirates so far, boasting a double-digit K/9 mark without having allowed a homer, yet he has been riding the pine when it’s mattered most.

A look at the Yankees, who have had the "Clutchiest" bullpen in baseball this year.

Name pLI WPA
Mariano Rivera 1.71 3.58
Phil Hughes 1.43 2.4
Phil Coke 1.29 0.88
Alfredo Aceves 1.07 1.92
Brian Bruney 0.99 1.02
David Robertson 0.68 -0.3
Brett Tomko 0.66 -0.55
Jonathan Albaladejo 0.65 -0.22
Jose Veras 0.62 -0.2
Edwar Ramirez 0.44 0.15

As a Yankee fan, it kills me that Phil Coke pitches more important innings than David Robertson. Other than that quibble, it’s hard to argue with what Joe Girardi’s done with the bullpen pieces he’s been given. Ramirez, Veras, and Albaladejo are clearly the three worst relievers to have seen time in the Yanks’ pen, and Girardi did a good job hiding them. Mo and Hughes make it easy at the top.
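One crude way to check whether the better relievers got the bigger innings is to look at every pair of relievers in a pen and ask how often the one with the higher pLI also has the higher WPA. This is only a rough sketch using the two tables above. It is not FanGraphs' Clutch statistic, and WPA itself grows with leverage, so treat the result as a sanity check rather than a grade.

```python
from itertools import combinations


def concordance(relievers):
    """Share of reliever pairs where higher pLI goes with higher WPA.

    `relievers` is a list of (pLI, WPA) tuples. Pairs tied on either
    value are skipped.
    """
    agree = 0
    comparable = 0
    for (pli_a, wpa_a), (pli_b, wpa_b) in combinations(relievers, 2):
        if pli_a == pli_b or wpa_a == wpa_b:
            continue
        comparable += 1
        if (pli_a > pli_b) == (wpa_a > wpa_b):
            agree += 1
    return agree / comparable


# (pLI, WPA) pairs copied from the Pirates and Yankees tables above.
pirates = [(1.6, -2.68), (1.23, -0.04), (1.23, 1.19), (0.92, 1.04),
           (0.9, -0.83), (0.88, -0.2), (0.78, -0.75), (0.65, 0.14)]
yankees = [(1.71, 3.58), (1.43, 2.4), (1.29, 0.88), (1.07, 1.92),
           (0.99, 1.02), (0.68, -0.3), (0.66, -0.55), (0.65, -0.22),
           (0.62, -0.2), (0.44, 0.15)]

print("Pirates:", round(concordance(pirates), 2))
print("Yankees:", round(concordance(yankees), 2))
```

By this measure the Yankees' ordering agrees with results in roughly three pairs out of four, while the Pirates' agrees in a bit less than half.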

Equal time to the Red Sox, who have a highly-touted bullpen which has performed at a merely average level when factoring out leverage.

Name pLI WPA
Jonathan Papelbon 2.19 4.83
Hideki Okajima 1.31 1.78
Ramon Ramirez 1.23 0.47
Manny Delcarmen 1.11 -0.31
Daniel Bard 0.93 -0.25
Justin Masterson 0.75 0.32
Takashi Saito 0.68 -0.26

I’m sure Sox fans would rather see Bard in higher-leverage situations, but besides that minor note, I would think the Nation would also be satisfied with Francona’s usage of the pen.

Having an elite closer the likes of Mo and Papelbon doesn't make the decisions that go into bullpen management that cut and dried, though. Two teams at the bottom of the rankings who have terrific closers are the Dodgers and Royals. How did they go about possibly mismanaging their bullpens?

Name pLI WPA Name pLI WPA
Jonathan Broxton 1.85 2.87 Joakim Soria 2 3.23
George Sherrill 1.71 1.08 John Bale 1.24 -1.36
Cory Wade 1.66 -0.95 Jamey Wright 1.04 -0.13
Hong-Chih Kuo 1.55 0.6 Juan Cruz 1.02 -0.81
Ramon Troncoso 1.34 2.16 Kyle Farnsworth 0.92 -2.18
Ronald Belisario 1.19 -1.17 Roman Colon 0.82 -0.84
Jeff Weaver 1.14 0.53 Robinson Tejeda 0.69 0.1
James McDonald 0.91 0.14 Ron Mahay 0.6 -1.22
Brent Leach 0.86 -0.16
Guillermo Mota 0.71 -0.97

First, it’s clear the Dodgers had a tremendous bullpen, while the Royals, well, not so much. The main problem for the Dodgers appears to be Joe Torre’s reliance on Cory Wade to start the year. With so many other terrific options, Torre waited too long to pull the plug on Wade, who's been in the minors for the last couple of months.

As for the Royals, at least Hillman was able to get Soria right. I’d love to know what the Royals saw in John Bale to make them think he was one of their top relievers. And how do they go out and sign Kyle Farnsworth to a $9-million deal, have him pitch better than they would have expected (better than he’s pitched in years), but put him in the least meaningful innings that he’s ever pitched? To be fair, his high-leverage stint against the Yankees Tuesday night didn’t work out too well. The Royals are also burying Robinson Tejeda at the bottom of their bullpen chain, which hasn’t worked out too well for them. And free Carlos Rosa!

In addition to using the best relievers in the most critical situations, managers also have to find a way to get the most out of their relievers by playing to their strengths. Which brings me to platoon splits. Failed starters can always get jobs as relievers if they have the ability to shut down same-handed batters.

The Braves have had a superb bullpen this year, and their Clutch score might be penalizing them for being equally awesome in both high- and low-leverage situations. Part of the reason for their success was their closer-by-committee tandem of righty Rafael Soriano and lefty Mike Gonzalez. For the following table, I went to Baseball Reference's splits pages and found how often each reliever had the platoon advantage as well as how much better he fared when facing same-handed batters. Baseball Reference calls the split that compares a pitcher's production to himself tOPS+, and for pitchers, lower is better, so Peter Moylan's ptnOPS+ of 65 would mean that Moylan allows an opposing OPS 35% worse against right-handed batters. Therefore, Bobby Cox should try to have Moylan face mainly righties.

Name Throws PTN% ptnOPS+ Name Throws PTN% ptnOPS+
Peter Moylan R 61% 65 Eric O'Flaherty L 50% 85
Jeff Bennett R 58% 69 Mike Gonzalez L 34% 78
Manny Acosta R 54% 95
Buddy Carlyle R 53% 114
Rafael Soriano R 52% 49
Kris Medlen R 46% 150

I was surprised to see that Soriano and Gonzalez, who do exhibit traditional platoon splits, have not been given the advantage of facing same-handed batters that often. Instead, it appears that Moylan and O’Flaherty have been used as the righty and lefty specialists while Bobby Cox has opted to allot Soriano and Gonzalez the eighth and ninth innings.

Running the numbers for the Nationals, nothing of note really came up.

Name Throws PTN% ptnOPS+ Name Throws PTN% ptnOPS+
Jay Bergmann R 64% 76 Ron Villone L 50% 94
Jorge Sosa R 60% 65 Sean Burnett L 50% 101
Julian Tavarez R 60% 64 Joe Beimel L 40% 98
Saul Rivera R 58% 92
Logan Kensing R 55% 110
Kip Wells R 55% 72
Joel Hanrahan R 50% 97
Mike MacDougal R 49% 83
Tyler Clippard R 47% 173

The fact that Mike MacDougal, he of the 32/38 K/BB ratio, is closing this year in Washington should say all you need to know about the state of the Nats' bullpen. But hey, they won the Harper lottery.

This type of analysis is essentially made for Tony La Russa, so I’ll put both parts together to try to grade his management.

Name pLI WPA
Ryan Franklin 1.94 2.23
Kyle McClellan 1.45 0.89
Jason Motte 1.04 0.17
Dennys Reyes 1.04 0.26
Trever Miller 0.89 1.2
Chris Perez 0.84 0.38
Blake Hawksworth 0.79 1.24
Brad Thompson 0.5 -0.32

Franklin has emerged as a reliable bullpen ace, and La Russa is thankful for that fact. Coming into the year, the likes of Jesse Todd, Jason Motte, and Chris Perez were names you heard vying for that closer job. After Franklin, though, La Russa has had struggles. He’s given high-leverage appearances to Motte, who has not been one of his better relievers. Hawksworth also may be a guy who's emerging that La Russa can start to trust more.

Name Throws PTN% ptnOPS+ Name Throws PTN% ptnOPS+
Chris Perez R 70% 88 Trever Miller L 62% 30
Jason Motte R 61% 71 Dennys Reyes L 60% 60
Kyle McClellan R 57% 121
Ryan Franklin R 54% 109
Blake Hawksworth R 53% 65
Brad Thompson R 52% 99

La Russa does a fantastic job of platooning. Both lefties he’s utilized out of the pen have had the benefit of facing a majority of same-handed batters. Trever Miller has put up great numbers this year, and La Russa would be well-served to use him as the southpaw in a righty-lefty combination with Kyle McClellan who has been holding his own as La Russa's go-to guy after Franklin. There is a dilemma in the case of Miller, who is truly exceptional against lefties to the tune of 37 strikeouts to six walks this year. So in a relatively close game, should La Russa bring him in once the starter is out and a lefty is up to ensure quality innings from Miller, or should La Russa at times wait and hope that Miller might have the chance to face a couple lefties in the 8th or 9th when the leverage is highest, but risk not pitching Miller at all?

There’s a lot more than what I've addressed that goes into bullpen management, but I think bullpen management is one of the more important in-game aspects of managing a baseball team, and it might be a subject that's possible to objectively grade.

Touching BasesSeptember 30, 2009
Thoughts from the 2009 New England Symposium on Statistics in Sports
By Jeremy Greenhouse

On Saturday, Harvard hosted NESSIS, a gathering of sports statisticians that could be billed as the little brother of the sports analytics conference at MIT, only geekier. I say that as a compliment.

Academia vs. Industry (vs. Internet?)

A couple of the best of both worlds were on display, as two names I've become familiar with, Shane Jensen and Tom Tippett, presented their analysis. Tippett, Director of Baseball Information Services for the Red Sox, presented research on special baseball tactics such as the bunt and stolen base. His findings often dissented from conventional sabermetric wisdom. A base stealer must steal at a clip of at least 70% to be deemed successful? Well, the break-even rate fluctuates wildly based on the game state. With nobody out in a one-run game, the break-even rate of stealing second is only 54%. In a two-plus run game, it’s 84%. With regard to the bunt, Tippett found that good bunters should continue to bunt thanks to the possibility of an error or hit, and that in the context where you’re playing for one run, bunting is often sensible depending on the hitter and upcoming batters. It seems to me that the guys who wrote The Book came up with similar conclusions. Of course, the real stuff Tippett does for the Red Sox is proprietary and can hardly be discussed.
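For readers who haven't seen where a break-even rate comes from, here is a minimal sketch. It compares the value of the situation before the attempt with the values after a successful steal and after a caught stealing. The run expectancy numbers are round illustrative values that I am assuming, not Tippett's figures. His game-state rates presumably come from win expectancy, which would slot into the same formula.

```python
def break_even_rate(value_now, value_if_safe, value_if_caught):
    """Success rate at which attempting a steal neither helps nor hurts.

    Solves p * value_if_safe + (1 - p) * value_if_caught = value_now for p.
    The values can be run expectancies or win expectancies.
    """
    gain = value_if_safe - value_now
    loss = value_now - value_if_caught
    if gain <= 0 or loss < 0:
        raise ValueError("expected value_if_caught <= value_now < value_if_safe")
    return loss / (gain + loss)


# Illustrative run expectancies (assumed): runner on first with nobody out,
# runner on second with nobody out, bases empty with one out.
rate = break_even_rate(value_now=0.87, value_if_safe=1.10, value_if_caught=0.27)
print(f"break-even success rate: {rate:.1%}")
```

With these inputs the answer lands near the familiar 70% rule of thumb. Swapping in win expectancies for a specific inning and score is what makes the break-even rate move around as much as Tippett reported.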

Tippett said that the more he studied an issue, the more often he found that managers tended to be right, without even knowing the data. Mike Zarren, a statistician for the Celtics, agreed. Zarren brought up two points of interest. First, he said that the reason it's hard for people within his industry and those from academia to collaborate is that academics are always interested in publishing while teams need to keep their research private. Secondly, Zarren was fond of mentioning the fact that the Celtics led the league in technical fouls last year, and that was before signing Rasheed Wallace. I pray for Tommy Heinsohn’s health.

Meanwhile, Jensen was one of three baseball analysts representing academics from the Wharton School at UPenn who presented their work. In a comparison of fielding metrics, Jensen's SAFE was deemed the most statistically advanced defensive metric publicly available. However, the guys on the Internet who distribute their data for free, in the forms of UZR and PMR, hold their own. Jensen's system also showed that Derek Jeter has been a subpar fielder in the past, so I have to question whether Jensen has an anti-New York bias, whether he's ever watched baseball, and the credentials of the Department of Statistics at the Wharton School.

Talking About Practice?

The topic of team practice was addressed by Gilbert Fellingham, a statistician at Brigham Young University and volleyball enthusiast. Fellingham studied point-by-point volleyball data to see what skills matched up best with results. He determined that, for instance, women’s volleyball teams should spend more time on their transition offense. Of course, there are some skills that are important but are difficult to improve upon, even with countless hours of practice. I’d imagine that every baseball player has a different skill that they should practice, but how can we quantify it? We can quantify player performance and we can detect player weaknesses, but we don't know what areas of weaknesses can be most efficiently improved upon through practice. I have no idea whether there's a uniform practice structure among teams or whether some teams have specific agendas.

The Lesser Sports

Benjamin Alamar of JQAS in researching play-calling in the NFL found that teams under-utilize the pass. When I was watching the Colts play the Cardinals Sunday night, it made my head hurt every time the Colts ran the ball. Keeping the Cardinals’ defense on its toes so that it can’t sit on the pass is important, but that really only matters when there’s a Nash equilibrium (no idea if I'm using that term correctly). What I mean is that if there are more than six people in the box, I really don’t see why the Colts would ever run. Ever. Even if the Cardinals are expecting pass, they still won't be able to defend it since they have x-number of linebackers who can’t do a thing when Peyton Manning airs it out. Alamar said that there’s even a greater chance of a play yielding negative expected points (think run expectancy) on a run than on a pass. The only downside to passing all the time is the risk put on the quarterback through wear to his arm and the threat of the crushing sack.

My favorite presenter may have been Wayne Winston, who provides Mark Cuban with his adjusted plus/minus numbers, which strongly appeal to me. In baseball, plus/minus is also known as WOWY, which I believe is most useful in assessing defensive value, catcher/pitcher batteries, and batter protection. I’ve long known that the statistics presented in an NBA box score are of much less value than those in a baseball box score. The interaction between teammates in basketball can be so subtle that we often don't know what to track. It's difficult to pinpoint why the Timberwolves perform better with Sebastian Telfair on the floor, but apparently they do. Plus/minus confirms in no uncertain terms that playing Ben Wallace in the series against the Cavs was a disaster. It also gives credence to this decade's Kevin Garnett vs. Tim Duncan and Kobe vs. Shaq debates.

Information on NESSIS can be found at its web site here.

Touching BasesSeptember 24, 2009
It's Not Whether You Win or Lose, It's Whom You Play
By Jeremy Greenhouse

An oft-overlooked piece of information by baseball analysts is strength of schedule at the player and team level. As the regular season winds down and we try to determine who has been the best team in baseball, as well as the Most Valuable Players and Cy Youngs of both leagues, I took a look at quality of opponent to see who has been helped and who has been hurt by the competition.

Giving credit where credit is due, to my knowledge Baseball Prospectus continues to hold the best readily available data on quality of opposition. On BP's adjusted standings page, there is a column for expected runs based on a team's batting line (UEQR) and another column for that same stat, except adjusted for strength of schedule (AEQR).

I plotted the difference between how many runs a team should have scored/allowed and the same number adjusted by strength of schedule. The scale is not very intuitive, so I will explain that teams in the upper-right quadrant have scored more runs and allowed fewer runs due to a relatively easy strength of schedule, while the reverse holds true for teams in the lower-left quadrant.

SOS.jpg

If you look closely, you may notice that each quadrant is made up of mainly teams from the same division. You see that the AL East may have the highest quality of play, the AL West the best run prevention or worst run scoring, the NL East the best run scoring or worst run prevention, and the NL Central the lowest quality of play.

Given a fair strength of schedule, the Orioles would have been expected to score some 20 more runs and allow some 20 fewer. Given this fact, as well as Baltimore's youth, and the concept of regression to the mean, you can already mark me down for the Orioles' over next year. Because of the potency of the Yankees’ lineup, the rest of the AL East actually should have earned better run production marks. Unsurprisingly, the Jays and Orioles have the largest difference between their second-order wins and third-order wins, or in English, they have faced the most skewed schedules in baseball.

Adjusting for luck and schedule, the Indians have carried the strongest offense in their division by a fair margin, and will surely make for a trendy pick, as usual, among analysts in next year's predictions. With my apologies to Rich Lederer, who is probably tired of his Angels’ Pythagorean record being discussed, I have to mention oddities in the Angels’ record. Not only have they managed to outplay their run differential, but according to BP, the Angels have gotten lucky in the number of runs they've scored and allowed. The Halos are really the only team in their division that can hit, so their staff is likely not as good as we think. Furthermore, Angels hitters lead the league in BABIP and have been unusually successful with runners on base as compared to their production with the bases empty. However, each team in the AL West plays defense ranging from above average to excellent, so to be fair, the entire division's run-scoring has been depressed by playing each other.

The American League owned a .546 winning percentage in Interleague Play this year, marking the fifth straight year of American League utter superiority. I do wonder whether any National League team would have boasted a winning record playing in the American League East.

To check out individual players' quality of opponent, I moved on to BP's quality of batters and pitchers faced reports. The reports give quality of opposition in terms of the triple-slash-stat line.* I limited my sample to pitchers with at least 300 batters faced and batters with at least 300 plate appearances.

*Instead of GPA and OPS, why haven't we ever used what I believe to be the most sensible combination of OBP and SLG, 1.75 multiplied by OBP and then added to SLG? We could then keep it on that scale, which has a league average of exactly one, as in 1.00. Wouldn't that be a rough measure of offensive production that makes everyone happy, more or less?
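As a quick sketch of the footnote's suggestion, the function below computes 1.75 times OBP plus SLG. The function name is made up, and the example line is the .333 OBP and .420 SLG that Zack Greinke's opponents carry in the pitcher table in this post.

```python
def weighted_ops(obp, slg, obp_weight=1.75):
    """OBP-weighted sum of on-base and slugging: obp_weight * OBP + SLG."""
    return obp_weight * obp + slg


# The batting line faced by Zack Greinke, taken from the pitcher table.
value = weighted_ops(0.333, 0.420)
print(round(value, 3))
```

A line like that, close to a typical major-league hitter's, comes out very near 1.00, which is what makes the scale easy to read.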

The top eight pitchers in baseball, and 14 of the top 15, who have faced the highest quality of opposition all hail from the American League East, including Roy Halladay coming in second to David Hernandez. Roy Halladay is awesome. Again, to illustrate the difference in quality of play between the leagues, I will refer to John Smoltz and Brad Penny. The slugging percentages of the opposing batters faced by Smoltz and Penny have gone down 20 and 11 points respectively since the pair left the AL East. Cliff Lee's difference has been a mere seven points in slugging. Todd Wellemeyer has had the easiest go of any pitcher this year.

You may not know of the stat kept at Baseball Reference called platoon percentage, but I feel it is an important piece of information in showing the competition a player has faced. Paul Maholm, whose platoon split I looked at last week, has been unlucky enough to have had the platoon advantage least often among pitchers with at least 100 innings.

The top six pitchers of the year in each league, in my opinion, excluding Lee who split time between leagues and whom I already mentioned:

First Last AVG OBP SLG PTN%
Roy Halladay .268 .344 .433 0.43
Justin Verlander .265 .338 .425 0.43
Felix Hernandez .264 .337 .424 0.45
Jon Lester .262 .334 .424 0.26
CC Sabathia .261 .333 .416 0.23
Zack Greinke .261 .333 .420 0.53
Adam Wainwright .255 .329 .403 0.54
Javier Vazquez .254 .328 .403 0.49
Dan Haren .253 .328 .401 0.48
Tim Lincecum .251 .329 .397 0.45
Chris Carpenter .252 .325 .393 0.53
Ubaldo Jimenez .253 .323 .394 0.47

Chase Headley and Kevin Kouzmanoff, who both might actually be quality Major Leaguers, have had the misfortune of not only playing half their games in Petco, but also facing the most difficult slate of pitchers among hitters. This list may well be flawed, since pitchers who get to throw multiple times against the Padres will have their stats inflated. This might be the reason that Padre hitters appear to have faced quality pitchers. Following this logic, it makes sense that Melky Cabrera and Derek Jeter have faced the pitchers with the highest aggregate OBP and SLG against, since these pitchers have been subjected to the Yankees. I'll present the list anyway.

First Last AVG OBP SLG PTN%
Derek Jeter .256 .338 .414 0.28
Mark Teixeira .255 .335 .412 1.00
Kevin Youkilis .254 .336 .406 0.28
Miguel Cabrera .252 .334 .400 0.27
Ben Zobrist .250 .333 .396 1.00
Joe Mauer .249 .332 .396 0.62
Hanley Ramirez .251 .337 .399 0.26
Chase Utley .249 .336 .397 0.65
Derrek Lee .250 .335 .400 0.18
Albert Pujols .247 .334 .398 0.26
Prince Fielder .247 .336 .395 0.73
Adrian Gonzalez .241 .328 .388 0.62
Touching BasesSeptember 15, 2009
How Release Points Affect Platoon Splits
By Jeremy Greenhouse

Dave Allen I am not, but I will do my best at an F/X visualizations-style piece. Below is the expected run value of a pitch based on its release point, which is defined as the point where the ball is measured 50 feet away from home plate. The image is from the batter's perspective, so points on the left tend to be thrown by righties and vice-versa.


RS%20RV.jpg


Kind of like a rainbow, kind of like a tie-dyed afro.

Looking at the image, my guess is that the graph says more about the context of the pitches than content. Managers can control when they deploy pitchers of a given arm slot, so in all likelihood, lower release points occur when the pitcher has a platoon advantage over the batter. For example, see that cluster of pitches about a foot off the ground and two feet on the third base side of the rubber? All 1,000 or so were thrown by righty one-out guy extraordinaire Chad Bradford, whose numbers exceed his talent thanks to his manager placing him in situations where he can be expected to succeed. So what I’m saying is that the above graph is more descriptive of batters than it is of pitchers.


RS%20RVL.jpg


The plots are more or less mirror images. Against same-handed batters, southpaws who stand far to the first-base side of the rubber and sling the ball from a low three-quarter arm slot are expected to shut down the opposition. Release points seem to have a much greater effect against left-handed batters than they do against right-handed batters, as you can see in the range of the color map. This is likely why LOOGYs are all the rage, while you rarely hear about ROOGYs other than the aforementioned Bradford.

I decided to try an analysis of individual players with large gaps in their platoon splits. Billy Butler would be one of the American League's elite hitters were pitchers only allowed to throw left-handed.


butrp.jpg


He hits standard righties at a rate of one run below average per 100 pitches, while he hits standard lefties to the tune of a couple runs above average per 100 pitches. However, pitchers with untraditional arm slots are where it gets interesting with Butler. He can't touch righty sidewinders, and he's grounded out weakly three times while facing those several submarine pitches from Bradford. Conversely, he has apparently picked up the ball quite well against lefty sidearmers in the limited time he's had against them. You win a prize if you said small sample size.

For my lefty hitter, I chose Ryan Howard, who might be out of a job playing baseball if all of us could only throw lefty.


howardrp.jpg


Oddly, Howard not only does poorly against lefties who sling the ball, but also righties who release the ball from the extreme third-base side of the rubber.

Paul Maholm exhibits an interesting split.


maholmr.jpg


He's much better against lefties in general, but it seems to me that his worst pitches against both sets of batters come when he releases the ball from straight on according to the batter's point of view. This is not what we saw in the league average split. It is abnormal that Maholm has performed better against righties when releasing the ball closer to the first-base line.

For such a great starting pitcher in the past, Brandon Webb sure shows a large platoon split. His go-to pitch, the sinker, does happen to be prone to the largest platoon split, on average, of any type of fastball. Keep in mind that the sets of graphs for Maholm and Webb vs. RHB and vs. LHB are set to different scales, so it appears as if they're dramatically altering their release points based on batter handedness, but it's actually just a fault of mine in setting the axes.


WebbRP.jpg


Webb is best when pitching from a higher release point that is closer to his body. Just a hunch, but I'd guess this has to do with the movement on his sinker he gains from a higher arm angle.

Touching BasesSeptember 09, 2009
On That Stuff
By Jeremy Greenhouse

Justin Verlander. A.J. Burnett. Ubaldo Jimenez. These are the names that make scouts salivate. Why? One word. Stuff.

Two components determine how nasty a pitcher’s stuff truly is: velocity and movement. We’ve had radar guns to track the league’s hardest throwers for some time (that would be Joel Zumaya, of course). But now, with the help of pitchf/x data and a local regression technique picked up from Dave Allen, we can come pretty close to quantifying a pitcher’s stuff. We can assign every single pitch an expected run value given its physical characteristics, be it velocity, movement, location, release point, or any other data point given by the pitchf/x data. For the purposes of measuring expected run value based on stuff (StuffRV), I used velocity, horizontal movement, and vertical movement as my three independent variables, and restricted my sample to only righties who released the ball from at least five feet off the ground, with a minimum of 1,000 pitches over the last three years. To the leaderboards.

Name StuffRV
A.J. Burnett -46
Felix Hernandez -31
Zack Greinke -26
Edwin Jackson -26
Ubaldo Jimenez -26
Chad Billingsley -23
Brian Wilson -22
Brandon Morrow -21
Roy Halladay -21
Matt Garza -20
Dave Bush 15
Jeff Suppan 17
Braden Looper 19
Livan Hernandez 20
Greg Maddux 23


Really no surprises on this list, which at least gives some validity to this method for evaluating stuff. I probably could have guessed 10 of these names correctly given 20 tries. One pitcher with lightning stuff who often goes overlooked is Gil Meche, who registered twelfth. One pitcher with deplorable stuff who does not go overlooked is Livan Hernandez.
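For readers wondering what an expected run value "given its physical characteristics" might look like in code, here is a rough sketch. It is not the code behind the leaderboard. It swaps the local regression for its simplest cousin, a Gaussian-kernel weighted average of the run values of similar pitches. The function names, the bandwidths, and the sample pitches are all made up for illustration.

```python
import math


def expected_run_value(query, pitches, bandwidths=(2.0, 3.0, 3.0)):
    """Kernel-weighted average run value of pitches that resemble `query`.

    query:      (velocity_mph, horizontal_move_in, vertical_move_in)
    pitches:    list of ((velocity, h_move, v_move), run_value) tuples
    bandwidths: hypothetical smoothing widths, one per variable
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for features, run_value in pitches:
        # Squared distance, with each variable measured in bandwidth units.
        dist_sq = sum(((q - f) / h) ** 2
                      for q, f, h in zip(query, features, bandwidths))
        weight = math.exp(-0.5 * dist_sq)  # nearby pitches count the most
        weighted_sum += weight * run_value
        weight_total += weight
    return weighted_sum / weight_total


def stuff_rv(pitcher_pitches, league_pitches):
    """Sum the expected run values over every pitch a pitcher threw."""
    return sum(expected_run_value(features, league_pitches)
               for features, _ in pitcher_pitches)


# Made-up league sample: hard, lively pitches fare better (negative run value).
league = [
    ((96.0, -8.0, 10.0), -0.02),
    ((95.0, -7.0, 9.0), -0.01),
    ((88.0, -4.0, 6.0), 0.02),
    ((87.0, -3.0, 5.0), 0.03),
]

print(expected_run_value((96.0, -8.0, 10.0), league))
print(expected_run_value((87.0, -3.0, 5.0), league))
```

As in the leaderboard, negative numbers favor the pitcher. A real version would need far more pitches and some care in choosing the bandwidths, which control how similar two pitches must be before one informs the other.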

Justin Verlander showed up a bit lower down in the rankings than I would’ve thought, sandwiched between Brett Myers and Jeremy Guthrie, and good for 30th out of 250 pitchers. But actually, Verlander's worse-than-expected ranking just goes to demonstrate that last year's off-putting performance was more than just random fluctuation. His average fastball velocity dipped below 94 miles per hour for the first time in his career last year, which was reflected in the actual run value of his fastball. However, his reputation as a flamethrower is well-deserved. Limiting the sample to only the top quartile of a pitcher’s stuff knocks Ubaldo Jimenez out of the top five in favor of Verlander, who, at his best, might be the league's best. Verlander has a very large difference between his best stuff and his worst stuff. By the numbers, it looks like Scot Shields owns the widest disparity between best and worst stuff. The difference in Verlander's fastball last year, which clocked in at 94 MPH with eight inches of run and ten inches of rise, and his fastball this year, which has the same average movement (to the nearest inch), but has picked up two miles per hour in velocity, is .4 runs per 100 pitches in expected value. The actual difference in Verlander's fastball, according to Fangraphs, is even greater than that, at just over a full run per 100 pitches. That uptick in velocity has made a big difference.

How about the best stuff on a per-100-pitch basis?

Name StuffRV/100
Brian Wilson -1.08
Matt Lindstrom -0.99
Jonathan Broxton -0.97
Joel Zumaya -0.96
Brandon Morrow -0.87
Heath Bell -0.85
Dennis Sarfate -0.83
Grant Balfour -0.78
Mariano Rivera -0.78
A.J. Burnett -0.73
Josh Geer 0.41
Edwar Ramirez 0.42
Livan Hernandez 0.48
Greg Maddux 0.53
Joe Nelson 0.56


Max-effort relievers and a couple of Yankees top the pitch-for-pitch list. Also, Dennis Sarfate has apparently followed the route of Daniel Cabrera—an Oriole with eyebrow-raising stuff who lost his velo and thus, likely his place in the Majors.

Greg Maddux really survived the latter part of his career on his pitching moxie. Even when I restrict the sample to the top 25% of his pitches, he continues to show below-average stuff. Nevertheless, he accumulated five WAR over the last couple years.

Come to think of it, setting a minimum of 1,000 pitches for this analysis might have been a mistake. Given the precision and granularity of the data, this technique could be used to assess a pitcher's stuff given only a handful of pitches. For example, I had never heard of Carlos Rosa before conducting this analysis, but now, from a sample of just 50 pitches, I can’t stop wondering why he’s not in the Majors. Great stuff. Decent control. The only evident knocks against him are his 2-8 Win-Loss record in AAA and 4.56 ERA. Maybe Dayton Moore knows something I don’t, or perhaps Rosa brought it just for his brief appearance in the Majors, or it’s possible GMDM is undervaluing a young talent who can get Major League hitters out. Actually, all three of these scenarios have probably taken place.

Who has made the most of his stuff? For this ranking, I subtracted the actual run value that each pitcher has been worth from the expected run value based on his stuff.

Name StuffRV-RV
Roy Halladay 61
Brandon Webb 60
Derek Lowe 54
Dan Haren 51
Greg Maddux 46
Jake Peavy 45
Jonathan Papelbon 38
Justin Duchscherer 37
Joakim Soria 35
Brad Penny -28
Carlos Silva -30
Adam Eaton -32
Bronson Arroyo -32
Miguel Batista -36


This group consists of pitchers who can succeed without the strikeout. While Webb and Lowe both have heavy sinkers that are able to generate grounders like clockwork, the location of these pitches down in the zone has likely had more to do with their success than the tremendous movement on their pitches.

And at the bottom, there are two types of pitchers. The Batistas and Pennys are granted roster spots because their passable stuff at least gives them upside, but the Eatons and Silvas of the world, who have crummy stuff and can’t even pitch up to their already limited abilities, well, those are the guys to whom you don’t give multi-year deals.

A spreadsheet containing most of the data used in this article.

Pitchf/x data from wantlinux.net via MLBAM. Thanks to Dave Allen and all others who helped me with the code used in this analysis.

Touching BasesMay 26, 2009
David Price's Debut
By Jeremy Greenhouse

For Cleveland sports fans, I don’t know if any moment could top LeBron James’ game-winning three pointer from Friday night. Last night’s ninth-inning comeback by the Indians wasn't half bad.

For Tampa Bay fans, though, last night's game was of greater importance than its bullpen collapse. Last night, David Price made his first start of the year.

Pitching in five regular season and five postseason games last year, Price served as an instrumental part in the Rays’ playoff run. Nevertheless, Price retained his rookie eligibility, and the Rays, managing a surplus in pitching, opted to option the 23-year-old southpaw down to AAA and keep youngsters Andy Sonnanstine and Jeff Niemann in the rotation as well as limit Price’s innings.

Following Price's phenomenal postseason performance, Josh Kalk penned everything you need to know about the man, who was named the second-best prospect in baseball (behind Matt Wieters) by Keith Law, Kevin Goldstein, and Baseball America.

In spring training, Price went 2-0 with a 1.08 ERA, but his six walks allowed in 8.1 innings of work were a bad sign. After Price’s second spring appearance, he admitted that he was experiencing difficulty.

"I've worked on my changeup so much, my slider's gone away," Price told mlb.com. "It's something I'm going to have to get back."

Considering the hype Price received, it's hard to believe that he still had areas where he needed to improve, but he's still just a kid with only a year of professional ball under his belt.

Price’s first six starts with AAA Durham were worrisome, as he posted a 1-4 record due to a disappointing 21:16 K:BB ratio. Price was drawing fewer swinging strikes and he was not inducing nearly as many ground balls in his 2009 stint with Durham as he had in 2008 across four levels. Yet Price seemed to have turned it around in the last couple of weeks leading up to his start yesterday. In what might be his final Minor League appearance of his career (knock on wood) Price went five innings of no-hit ball while striking out nine. Price entered the Rays' rotation when Scott Kazmir, to whom Kalk compared Price, hit the Disabled List. I set out to break down the second start of Price's Major League career.

Price came out firing. His first 14 pitches were four-seam fastballs clocking in between 94 and 98 miles per hour. Jamey Carroll drew a leadoff walk, and Grady Sizemore followed with a pop-up down the left field line. Carl Crawford made a futile attempt at a diving catch, which allowed the runners to advance to second and third with no outs. Then Price really flashed his potential.

Price worked ahead of the count on Victor Martinez with fastballs, and with two strikes, Martinez had little chance. Price busted Martinez inside with sliders which Martinez could do little else but foul off. Price then blew Martinez away with a 98-MPH fastball on the outside part of the plate. Price worked ahead of Jhonny Peralta with inside fastballs and finished him off with a hard slider inside. Price finished the inning by testing Shin-Soo Choo with fastballs up in the zone, and on 2-2 Price threw a heater over the heart of the plate that Choo took for a called strike three.

Needless to say, that stretch was Price’s most impressive, which is fair since it doesn’t really get much better than that.

The Rays gave Price a five-run cushion heading into the bottom of the second. However, Price walked the leadoff batter on four pitches, which just makes you wonder. There’s no reason that any Major League pitcher with a five-run lead should be walking the leadoff batter on four pitches. Price allowed five walks, which is the second time in his last four starts that he’s allowed that many. Walks have been a problem for Price. Since being promoted to AAA last year, Price has walked well over four batters per nine innings.

Let's take a look at Price's strikezone plot. This is from the catcher’s perspective, so pitches on the right are towards Price’s arm side, or inside to left-handed batters. Blue markers are pitches against righties, while red markers are pitches against lefties. Circles indicate fastballs while triangles indicate sliders.


david%20price.jpg


It looks to me like he tried to work away from lefties. His off-speed stuff was saved almost exclusively for righties, and he tried to keep his sliders in on them. Looking at his strikezone plot, I don't think Price was wild, but he didn't shy away from working himself into long at bats, which is unnecessary given that defense behind him and his ability to blow batters away.

Despite the leadoff walk in the second, Price retired the next three batters in order. With a full count on Ryan Garko, Price demonstrated the ability to keep the ball in the zone when necessary, as he forced Garko to foul off five pitches in the zone before popping out on a slider on the outside corner.

Price allowed two more baserunners in the third, but came out unscathed. The fourth inning was where it all started falling apart for Price and the Rays. The Rays had a 10-0 lead, yet Price was already at 77 pitches by the start of the inning, and his fastballs to the first two batters of the inning were down in velocity to 92-94 MPH. Mark DeRosa lined a single the other way and Garko pounded his third homer of the year on a knee-high fastball. Price picked the velocity back up against Matt LaPorta, working at 95-97 with his fastball to strike LaPorta out. Yet Price was up at 90 pitches, and he had apparently lost his command. Price walked the next two batters and was pulled by Joe Maddon, who had said in a pre-game interview that it was a goal for Price to go deep into the game. Neither of those baserunners came around to score, but Price was fortunate to forfeit only two runs after allowing nine baserunners in 3.1 innings.

Price, as usual, was 95% fastball/slider. He showed his spike curveball and changeup once or twice, but they were all wasted for balls.

I’d say he found his slider. Like last year, it averaged a velocity of 86-88 miles per hour. While Price doesn't generate significant horizontal movement, he actually got the ball to dive more in yesterday's start than he did on average last year. He releases his slider a couple inches farther from his body than he releases his fastball on average. There aren't many sliders thrown at 86-88, especially from the left side. Last year, Francisco Liriano and Randy Johnson threw the hardest sliders among left-handers. Both of them had little horizontal movement, like Price, and Liriano's and Johnson's sliders actually generated less vertical movement than Price's has. Nevertheless, all of these sliders have solid reputations and they have all accounted for above-average run values, which can now be found on Fangraphs. Swinging at Price's slider simply isn’t a good idea. Out of eight swings on his sliders, there were five fouls, two misses, and one pop out. However, when batters took the slider, only two of twelve pitches were called strikes. If he can locate the slider down in the zone, I believe it would be nearly untouchable.

His fastball averaged 96 MPH, which, for a starter, for a lefty, and for a human whose arm must follow the laws of biomechanics, is positively exceptional. The movement on it is nothing to write home about, though, in my opinion.

Price’s stuff is unbelievable. There’s no denying that. But walking that many batters is inexcusable, and it cost his team the game. Price has yet to have an outing of over six innings since he was called up to the Majors last year. Part of that is due to the Rays’ attempt to limit his innings. And part of that is Price’s propensity to throw too many pitches. The Rays were forced to go to their bullpen early, and they ended up not having enough arms to close out the game. Well, that’s not really fair. A bullpen should be able to close out a ninth-inning seven-run lead. Here’s the WPA chart from the biggest comeback of the year.

raysindians.jpg

Touching BasesMay 19, 2009
HitTracker F/X
By Jeremy Greenhouse

What happens when you combine Hit Tracker data with Pitch f/x data? You get a whole lot of data.

First, I looked at batter age in relation to standard home run distance. Standard home run distance is the distance a home run would travel in neutral conditions if it were to land at field level. My sample contains data on home runs from 2007 and 2008, totaling nearly 10,000 data points.

It appears to me that the age 25-29 peak holds true. I had data on 16 homers hit by players before their 21st birthday and the average distance was 420 feet. This is because Justin Upton is an absolute monster. The oldest grouping of players is likely biased since players who maintain the ability to hit home runs at that age are almost entirely power-happy first basemen and designated hitters. That group will be lighter on lighter-hitting middle infielders than the younger groups.

There are about 500-1000 home runs per grouping, which leaves it prone to skewness. Albert Pujols and Adam Dunn were born two months apart and their tremendous power probably contributed to the large break between ages 28-29 and 29-30.
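The grouping behind a chart like this is simple enough to sketch. The version below assumes each home run is recorded as a batter age in years plus a standard distance in feet. The function name and the three sample records are invented for illustration and are not the actual Hit Tracker data.

```python
from collections import defaultdict


def mean_distance_by_age(home_runs):
    """Average standard distance per one-year age group.

    home_runs: iterable of (batter_age_years, standard_distance_ft).
    Returns {age_group: (mean_distance, number_of_homers)}.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for age, distance in home_runs:
        group = int(age)  # a 28.7-year-old lands in the 28-29 group
        totals[group][0] += distance
        totals[group][1] += 1
    return {group: (total / count, count)
            for group, (total, count) in sorted(totals.items())}


# Made-up records, just to show the shape of the output.
sample = [(20.5, 420.0), (20.9, 430.0), (28.1, 400.0)]
for group, (mean_ft, count) in mean_distance_by_age(sample).items():
    print(f"{group}-{group + 1}: {mean_ft:.1f} ft over {count} homers")
```

Keeping the count next to each average makes it easier to spot the thin groups, like the 16 homers from the under-21 set, where one or two sluggers can move the number.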

Next up I graphed standard distance against a batter's weight. It’s a standard assumption that heavier players have more raw power. And even though listed player weights are some of the more unreliable baseball data available, the relationship is still undeniable.

Less obvious is the relationship between home run distance and batter height. Yet the trend is just as distinct.

When it comes to raw power, short players are at a greater disadvantage than light players while heavy players are at a greater advantage than tall players.

All of our assumptions about quantifiable measures that contribute to a batter’s power seem to hold true. Age, height, and weight are important in determining power. With pitch f/x data, we can also see what effects pitchers have on home run distance. This is getting into Defense Independent Pitching Statistics theory. Max Marchi wrote a couple of great articles combining hit location and pitch f/x data. A good chunk of gameday data from 2007 did not have pitch f/x data, so I am working with closer to 7,000 home runs.

One would think that pitch velocity plays a part in determining how hard a ball is hit. To compare apples to apples, I used Hit Tracker’s speed off bat measure instead of standard distance.

It looks to me like pitch velocity is insignificant. Perhaps on the slowest of pitches, the ball doesn’t receive the same force off the bat, but every group faster than 80 miles per hour generates a speed off bat within half a mile per hour of each other. That’s nothing.

I wanted to see if there were any balls that left the pitcher’s hand with a greater velocity than that which they flew off the bat. There were about a dozen cases, with the biggest disparity in velocity coming on a 345-foot, 96 mile per hour Carlos Pena homer off a 99 mile per hour A.J. Burnett fastball.

Now, if I were Dave Allen I would come up with some awesome heat charts to demonstrate the relationship between pitch location and standard distance. I am not. But I do have bar charts. Here is pitch height plotted against standard distance.

I’m 6’2” and the top of my knee is exactly two feet high. Meanwhile, the top of my belt would be 3.5 feet high, but there just aren’t that many homers hit in the top layer of the strike zone. It would appear that home runs are hit the farthest on pitches at or around the knees. I’m not a physicist, or a physician for that matter, but I believe there are two factors a batter can control in how far he hits the ball: force and trajectory. I decided to break these down by pitch height.

pitchheight.jpg

Batters hit the ball hardest on pitches down in the zone. But the elevation angle (defined by Hit Tracker as the angle above horizontal at which the ball left the bat, in degrees) might actually determine why balls fly farther when batters go down to get them. The increase in elevation angle with pitch height is uniform, and in general the lower the elevation angle, the higher the home run distance. The correlation coefficient between the terms is -.25. Furthermore, there is a correlation coefficient of -.5 between elevation angle and speed off bat, which affirms that batters want to get on top of the ball, so to speak. Of course, the reason for the negative correlation between home run distance and pitch height could actually be the horizontal launch angle. Maybe low pitches are easier to turn on than higher pitches.
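For anyone who wants to check a correlation coefficient like those by hand, here is a small sketch of the standard Pearson calculation. The five elevation angles and distances are made-up numbers, chosen only to show a negative relationship. They are not Hit Tracker data and will not reproduce the -.25 figure.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical homers: elevation angle in degrees, standard distance in feet.
elevation_angles = [24.0, 27.0, 30.0, 33.0, 36.0]
distances = [425.0, 410.0, 415.0, 395.0, 400.0]

print(f"r = {pearson_r(elevation_angles, distances):.2f}")
```

A value near -1 or +1 means the points fall close to a straight line, while something like -.25 describes a real but loose tendency with plenty of scatter around it.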

I broke down horizontal pitch location by batter handedness.

This is from the batter’s perspective, so pitches 2-6 inches from the center of the plate (on the right) are outside to right-handed batters.

I’m extremely surprised to see that batters hit pitches outside farther than they hit pitches inside.

I incorporated pitcher handedness as well as home run field location to find the differences in platoon splits.

platoon.jpg

Lefties not only hit longer homers on outside pitches than righties, but they also hit longer opposite-field home runs. These two points are probably intertwined. Other than that, I don’t see anything notable in platoon splits.

Finally, I looked at the count’s effect on home run distance. I might have saved the best for last, as there is quite a clear relationship, which strongly signifies a change in hitter approach.

counts.jpg

On 3-0, hitters get better pitches to hit and might even swing harder when they choose to let it fly, and with two strikes hitters get worse pitches to hit and might shorten their swing to protect the plate. Again, this is selective sampling. Batters will only hit home runs on decent pitches. And pitchers are even more likely to throw fastballs over the heart of the plate when behind in the count than they are when ahead.

Thanks to Greg Rybarczyk and MLB for making all this wonderful data freely available.

Touching BasesMay 12, 2009
Micah Owings the Hitter
By Jeremy Greenhouse

Maybe Dusty Baker knows what he’s doing.

On Sunday night the Cincinnati Reds trailed the St. Louis Cardinals by a run with two outs in the bottom of the ninth inning with the bases empty. Baker pinch hit pitcher Micah Owings for the fifth time this season. Clearly, Owings is not your average pitcher. He pitches respectably, but carries a big stick. Owings had been 2-4 on the year as a pinch hitter which was better than his 2-3 Win-Loss record as a starter.

From MLB.com's gameday, here's a summary of Owings' at bat.


owingsatbat.jpg


Owings took the first two offerings for balls, though they appear to be borderline strikes by the blurred edges of the strike zone shown in the graphic. Owings then fouled one off before taking another ball. Up 3-1 in the count, Owings took a pitch that might have been an inch or two off the corner—which is to say that it wasn’t a clear-cut call. Well, Owings thought it was. He tossed his bat to the dugout and decided to take his base before it was granted. The ump called him back. Owings wasn’t going to take another pitch after that. He fouled three straight off when Ryan Franklin, who, I should mention, had not yet blown a save on the season, resorted to his first off-speed pitch of the at-bat. Not a good idea. Franklin left a slider right over the heart of the plate that Owings crushed 384 feet to left-center field to tie the game at 7-7. The shot gave the Reds a 49% shift in win expectancy which, in one swing, made Owings the most valuable player of the game by WPA. The Reds went on to drop the contest 8-7, but Owings as usual received his fair share of accolades for his performance.*

*An aside, and my first Pozterisk on this site.

Pitchers like Owings, Carlos Zambrano, Dontrelle Willis, and Mike Hampton who have had nice runs with the bat tend to have their value overstated a bit since we in the media tend to focus on oddities. But it is my belief that the relative value of a pitcher's hitting ability is understated on the whole, considering most people don't give a second thought to how skilled a pitcher is with the stick.

Last year, Nate Silver took a look at several notable hitting pitchers in the game. He found that the difference in true talent between the best and worst hitting pitchers is worth about ten runs per year. Since pitchers are rarely allowed to bat in high-leverage situations, Tom Tango approximated that a pitcher's hitting ability could be equivalent to roughly -.125 to +.25 points in earned run average, or some 10%-20% of a pitcher's value. Last year, there were 120 pitchers who had at least 10 plate appearances and 120 pitchers who tossed at least 120 innings. The standard deviation in their pitching WAR was 1.74 wins compared to a standard deviation of .36 hitting WAR.

David Gassko penned a comprehensive history of hitting pitchers and the decline in such skill over the years. Silver had hypothesized that the lost art was a cause of the specialization of position players and pitchers. The best hitting pitchers tend to be those who spent the least amount of time in the minors since hitting is a skill that takes constant practice and the minors are the only place where pitchers can forget how to hit. Gassko concluded that even the half win that some pitchers provide with the bat can be worth half a million dollars. Should teams work with pitchers more on hitting?

This year, Ubaldo Jimenez had led the league in batting runs among pitchers before Owings went deep on Sunday. Jimenez had the highest average fastball velocity in the league last year and has been a productive pitcher each of the last two years thanks to above-average strikeout and home run rates from a Coors Field product. At 4.4 WAR, he would have been a solidly above average pitcher last year, if not for a league-worst -1.5 WAR on offense. This year, though, he has yet to allow a homer and is posting a positive batting WAR which has made for a solid season.

Wandy Rodriguez is having a nice year too but is due for some regression as his BABIP is down 60 points from last year to .263 and he, like Jimenez, has yet to allow a home run despite allowing 64 balls in the air. Still, his curve ball is one of the best in the league, year after year, and he has thrown it more often than all pitchers but A.J. Burnett thus far. Yet while he is ninth in the league for pitchers with 13.5 runs above replacement, he has given away a pitcher-worst -3.9 runs with the bat.

Owings is hitting .346/.414/.692 in 29 career plate appearances as a pinch hitter, a step up from his .315/.336/.556 line in 115 plate appearances as a pitcher.

Owings owns Georgia's high school home run record. A transfer at Tulane, Owings hit .355/.470/.719 before being drafted by the Arizona Diamondbacks as a 22-year-old. While rarely seeing time with the bat in the minor leagues, he more than held his own with a .359/.373/.500 line in 64 at bats.

Owings has taken a step back on the hill this year, but right now we’re concerned about his performance in the box. He’s managed an incredible .435 career BABIP thanks to an impressive 24.4 line drive percentage. In 2007 when he won the Silver Slugger Award, Owings hit four homers, all 400+-foot blasts including two shots off Buddy Carlyle on August 18 that traveled farther than 440 feet each. Now, I'm not saying Owings owns Carlyle, but Owings did hit doubles off him the other two times they met, so I wouldn't be surprised if Owings at least paid rent on Buddy. Owings has shown a strong reverse-platoon split, as demonstrated by this graph.


owingssplit.jpg


On balls in play, Owings follows a profile similar to most hitters. He pulls four times as many ground balls as he hits the other way and has a rather even distribution of fly balls while demonstrating most of his power on balls he pulls. Owings swings at well over half the pitches he sees and is not too sharp at making contact. But for some reason pitchers are willing to give him offerings inside the strike zone more often than not. And when he does make contact, he inflicts serious damage to the tune of a .261 ISO and 21.4% HR/FB. He has average speed and is an average baserunner too.
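In case ISO is unfamiliar: isolated power is simply slugging percentage minus batting average, which leaves only the extra bases. Here is a quick sketch applied to the two Owings slash lines quoted earlier. The function name is made up, and the .261 career figure above is not recomputed here since the underlying at-bat totals are not given.

```python
def isolated_power(slg, avg):
    """ISO = SLG - AVG, the extra bases per at bat (hypothetical helper name)."""
    return slg - avg


# Slash lines quoted above: .346/.414/.692 pinch hitting, .315/.336/.556 as a pitcher.
as_pinch_hitter = isolated_power(0.692, 0.346)
as_pitcher = isolated_power(0.556, 0.315)

print(f"ISO as a pinch hitter: {as_pinch_hitter:.3f}")
print(f"ISO as a pitcher:      {as_pitcher:.3f}")
```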

We always see pitch f/x breakdowns when hitters pitch, and Chone Smith just gave a neat overview of recent velocity for hitters on the mound, but how about breaking down how a pitcher hits with pitch f/x data?

Using all gameday data available for Owings' plate appearances since 2007, his rookie year, I’ll try to break down Owings' performance by pitch location. Here's my first shot at these types of graphs.


owings%20strike%20zone.jpg


He’s 6’5”, so his strike zone is a couple inches higher than average. It looks to me like he’s willing to chase pitches low out of the zone. The four home runs for which gameday has data appear to be in standard locations for right-handed hitters, as Dave Allen showed. I'd call him a low-ball hitter. But there are too many data points in here for my liking, so I’ll break it down Harry Pavlidis style. I made each zone about a foot in diameter, which appears to have been a mistake, but here it is...


owings%20slices.jpg


Owings will swing at anything over the plate or inside. He likes the ball down and in, but pitchers can get him to chase balls that are low, and he's not too strong at making contact on pitches up in the zone.

So the real question is what should be done with Owings. What do you do with a slightly below-average pitcher with some potential who adds value with the bat? I’ve had the idea of batting him third in away games and then subbing in the starter in the bottom of the first, but that idea is admittedly radical. I don’t at all advocate trying to turn him into Rick Ankiel, since Owings still has value as a pitcher. Maybe he could be turned into a reliever who comes into games as a pinch hitter. Well, what I hope is that Dusty Baker carves out a unique role for him or keeps giving him at bats as a pinch hitter. Players like Owings make the game more fun.

Touching Bases May 07, 2009
Findings from the Free Agent Market
By Jeremy Greenhouse

Curt Flood really started something with this whole free agency thing, huh? Using ESPN’s Free Agent Tracker, I collected data for all free agents since 2006 and used regression analysis to pick up on some trends.

WAR to Wages

This offseason, Fangraphs unveiled its Wins Above Replacement measure in the Value section of its stats pages. WAR is a statistic that combines offensive, defensive and positional value and sets it against a replacement-level baseline to find the marginal wins a player contributes to his team. There has been debate over how to convert these marginal wins into a marginal value in terms of dollars. One of the first things I looked at was whether the relationship between WAR and salary was linear or nonlinear. I plotted the WAR from each free agent's contract year—excluding those who were injured all year or who came over from Japan—against the average annual value of the contract they signed.


aav%20vs%20war.jpg


I admit that before having seen any data, I had a bias toward the nonlinear relationship, since it just makes intuitive sense to me.

The regression lines look rather similar. It would appear that the nonlinear regression has an advantage at the extremes, since it won’t predict negative salaries for very negative WAR and it better captures the exponential value of superstar players. However, there is little difference between the regression lines for the vast majority of players, those between 0 WAR and 5 WAR. The R-squared values, which measure the percentage of variance in Average Annual Value that is explained by WAR, are similar, in an impressive .62-.64 range. This affirms that a single year of WAR captures a lot of a player’s value. Keep in mind when looking at these R-squared values that R-squared cannot decrease when a new term is added to a polynomial equation, so we definitely cannot draw any conclusions about either method from this graph alone.
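The caveat about R-squared is easy to check on toy numbers. The sketch below fits a straight line and a quadratic by ordinary least squares and compares the two R-squared values. The WAR and salary figures are made up for illustration, they are not the free agent sample, and the quadratic only stands in for whichever nonlinear form one prefers.

```python
def fit_poly(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (fine for small inputs)."""
    n = degree + 1
    # Build A^T A and A^T y for the design matrix with columns 1, x, x^2, ...
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for row in range(col + 1, n):
            factor = ata[row][col] / ata[col][col]
            for k in range(col, n):
                ata[row][k] -= factor * ata[col][k]
            aty[row] -= factor * aty[col]
    # Back substitution; coefs[i] multiplies x**i.
    coefs = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(ata[row][k] * coefs[k] for k in range(row + 1, n))
        coefs[row] = (aty[row] - tail) / ata[row][row]
    return coefs

def r_squared(xs, ys, coefs):
    """Share of the variance in ys explained by the fitted polynomial."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum(
        (y - sum(c * x ** i for i, c in enumerate(coefs))) ** 2
        for x, y in zip(xs, ys)
    )
    return 1 - ss_resid / ss_total

# Hypothetical contract-year WAR and average annual value (in $ millions).
war = [-0.5, 0.2, 0.8, 1.5, 2.1, 3.0, 3.8, 4.5, 5.9, 7.0]
aav = [0.8, 1.0, 2.5, 3.0, 6.0, 8.5, 10.0, 14.0, 18.0, 23.0]

r2_linear = r_squared(war, aav, fit_poly(war, aav, 1))
r2_quadratic = r_squared(war, aav, fit_poly(war, aav, 2))
print(f"linear R-squared:    {r2_linear:.4f}")
print(f"quadratic R-squared: {r2_quadratic:.4f}")
```

Because the straight line is a special case of the quadratic, the second figure can never come out lower than the first, which is why the comparison alone cannot settle the linear versus nonlinear question.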

Time 100’s own Nate Silver, in deriving Marginal Value Over Replacement Player, used a nonlinear form of WARP. I have duplicated his graph here, which projects WARP for 2005's free agent class by using three years of WARP from 2002-2004 instead of the one previous year of WAR I used for 2006-2008 free agents. I have superimposed a rough line of best fit to portray the difference between a linear and nonlinear model.


AAV%20vs%20WARP%20copy.jpg


The thinking behind a nonlinear model is that there is an abnormal distribution of talent in baseball, which makes top talent disproportionately more valuable than average talent.

Phil Birnbaum shows that individual skills in the major leagues may be normally distributed. Anecdotally, this is reaffirmed by the 20-80 scouting scale, which is based on a normal distribution with a mean of 50 and standard deviation of 10. Furthermore, Tom Tango shows that “when you consider the number of opportunities each player gets (in the Major Leagues), the total effective talent distribution is rather typical.”
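To make the 20-80 scale concrete, here is a small Python sketch that treats a scouting grade as a draw from a normal distribution with mean 50 and standard deviation 10, and reports the share of players expected to grade at or below it. It assumes the scale follows that distribution exactly, which is an idealization, and the function name is only illustrative.

```python
from statistics import NormalDist

# Idealized 20-80 scale: mean 50, standard deviation 10.
SCOUTING_SCALE = NormalDist(mu=50, sigma=10)

def grade_percentile(grade):
    """Share of the player pool expected at or below this grade."""
    return SCOUTING_SCALE.cdf(grade)

for grade in (20, 40, 50, 60, 80):
    print(grade, f"{grade_percentile(grade):.1%}")
```

On this reading, an 80 grade sits three standard deviations above the mean, which is why true 80 tools are so rare.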

However, when observing only the Major Leagues, we neglect the fact that most subpar baseball talent resides at another level. There is an abundance of freely available talent that could provide marginal upgrades to current Major Leaguers. What this means in terms of player value is that below-average players will be disproportionately underpaid compared to above-average players due to the difference in the supply within each pool.

Bill James once wrote “talent in baseball is not normally distributed. It is a pyramid. For every player who is 10 percent above the average player, there are probably twenty players who are 10 percent below average.” I believe this theory holds if by baseball he means the total baseball universe and by average he means the Major League average. So, Tango may be right that, at the Major League level, talent follows a normal distribution, but when we add talent from all player pools, the curve does begin to look like the right tail of a normal distribution.

Think of it this way: would you rather have the right side of the Cardinals’ infield or the Reds’ infield? The combinations of Albert Pujols/Skip Schumaker and Joey Votto/Brandon Phillips will both produce 8 WAR, give or take. Through the currently dominant model for fair-market evaluation, both sets of players are worth some $35 million if you simply multiply their WAR by $4-5 million. But my intuition tells me that I'd rather have the pair on the Cardinals. The key is that Pujols takes up only one roster spot and provides the same value of a pair of players who take up two. I might be able to upgrade over Schumaker on the cheap eventually. We also must account for the fact that freely available talent is, well, free, while the superstars who bring in 5+ WAR will need to be acquired through trading or bidding.
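The arithmetic behind that comparison can be sketched in a few lines of Python. The dollars-per-win figure is the midpoint of the $4-5 million range above. The individual WAR splits and the exponent in the nonlinear version are hypothetical, chosen only to show the direction of the effect.

```python
DOLLARS_PER_WIN = 4_500_000  # midpoint of the $4-5 million range in the text

def linear_value(wars):
    """Fair-market value when every win costs the same."""
    return sum(DOLLARS_PER_WIN * w for w in wars)

def convex_value(wars, exponent=1.2):
    """Hypothetical nonlinear pricing; the exponent is purely illustrative."""
    return sum(DOLLARS_PER_WIN * w ** exponent for w in wars)

# Hypothetical splits of 8 total WAR across two roster spots.
star_and_role_player = [7.0, 1.0]
two_good_players = [4.0, 4.0]

print(linear_value(star_and_role_player), linear_value(two_good_players))
print(convex_value(star_and_role_player) > convex_value(two_good_players))
```

Under the linear model both pairs come out at the same $36 million. Any pricing curve that bends upward values the pair with the superstar more highly, which matches the intuition in the paragraph above.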

Furthermore, I found statistically significant evidence that the Type A tag for free agents is correlated with increased pay. In a practical sense, the Type A label decreases a player's value in a free market since it costs prospective teams a first-round pick to acquire the player or the label costs the player in leverage if he tries to re-sign with his former team. However, Type A free agents tend to be the best players in my sample, so it is evident that teams ignore the Type A tag and are willing to spend what it takes to reel in superior players.

Separating position players and pitchers, I find that it is much easier to predict position players' salaries in general, and the nonlinear regression fits better for position players than it does for pitchers. In separating the two pools of players, I decided to test for some skills that do not translate into a hitter’s or pitcher’s WAR, but still might directly relate to his salary.

General Managers dig the fastball

Fangraphs keeps track of pitch usage and velocity for all pitchers since 2002, and all the data can be easily exported to a spreadsheet. This is a good thing for baseball analysts. Dave Allen and Dan Turkenopf both used pitch f/x data to show how velocity relates to production. In these regressions, I account for a player’s WAR, and therefore can try to isolate the effect of a pitcher’s fastball velocity on his salary. Here is the regression output.

      Source |       SS       df       MS              Number of obs =     149
-------------+------------------------------           F(  4,   144) =   62.82
       Model |  1.7252e+15     4  4.3131e+14           Prob > F      =  0.0000
    Residual |  9.8863e+14   144  6.8655e+12           R-squared     =  0.6357
-------------+------------------------------           Adj R-squared =  0.6256
       Total |  2.7139e+15   148  1.8337e+13           Root MSE      =  2.6e+06
------------------------------------------------------------------------------
         aav |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         WAR |    2399138   153233.6    15.66   0.000      2096260     2702016
         fbv |   164514.8   72588.22     2.27   0.025     21038.76    307990.9
          o7 |  -423055.5   545027.9    -0.78   0.439     -1500344    654233.1
          o8 |   -1365307   508682.7    -2.68   0.008     -2370757   -359857.4
       _cons |  -1.19e+07    6496299    -1.83   0.069    -2.47e+07    954444.2
------------------------------------------------------------------------------


This means that there is statistically significant evidence that fastball velocity (fbv) contributes to a pitcher’s salary. For every additional mile per hour on his fastball, a pitcher is paid about $165,000 more per year. Fastball velocities typically range from 85 MPH to 95 MPH, so if two players were to put up the same WAR, but one was a soft tosser and the other a flame thrower, teams in an auction may bid up to a couple million dollars more for the harder thrower, based on that skill alone.
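Plugging the coefficients from the output above into a short Python function shows where the "couple million dollars" comes from. The sketch assumes that o7 and o8 are dummy variables for the 2007 and 2008 free agent classes, with 2006 as the baseline. That reading is an inference from the discussion of year effects, not something the output itself states.

```python
# Coefficients copied from the regression output above.
COEF = {
    "_cons": -1.19e7,
    "WAR": 2399138,
    "fbv": 164514.8,
    "o7": -423055.5,   # assumed: 1 for the 2007 free agent class
    "o8": -1365307,    # assumed: 1 for the 2008 free agent class
}

def predicted_aav(war, fastball_mph, year):
    """Predicted average annual value in dollars for a free agent pitcher."""
    o7 = 1 if year == 2007 else 0
    o8 = 1 if year == 2008 else 0
    return (COEF["_cons"] + COEF["WAR"] * war + COEF["fbv"] * fastball_mph
            + COEF["o7"] * o7 + COEF["o8"] * o8)

# Same WAR, same year, ten miles per hour apart.
velocity_gap = predicted_aav(1.0, 95, 2006) - predicted_aav(1.0, 85, 2006)
print(round(velocity_gap))  # 1645148

# R-squared recomputed from the sums of squares: Model SS / Total SS.
r_squared = 1.7252e15 / 2.7139e15
print(round(r_squared, 4))  # 0.6357
```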

I created two player pools, separating those with above-average fastball velocities and those with below-average fastball velocities. The average fastball in my sample of 149 pitchers travels 89.7 miles per hour. The WAR of both player pools is nearly identical, as the harder throwers average .97 WAR compared to .96 WAR for the softer throwers. Yet the harder throwers earned $4.9 million per year in free agency compared to $4.2 million for the latter group. Perhaps fastball velocity predicts future performance, or perhaps there is an allure to signing a player who can light up the radar gun, or maybe fans come out to see fast pitchers. No matter the case, throwing hard gets you paid.

I also included time-fixed effects in this regression, setting dummy variables to represent the year during which the pitcher became a free agent. We find statistically significant evidence of deflation in 2008. While 2006 and 2007 appear stable in terms of free agent salaries, pitchers with similar production in 2008 were liable to lose on average a million dollars per year on their contract because they hit the market at the wrong time.

General Managers dig the longball

By longball, I don’t mean home runs. I mean actual distance. From Hit Tracker, I included the average true distance in feet of home runs for all players in my dataset. I also included the weight of a player in pounds, which might measure raw power or might measure nothing, but was significant in the regression. Unfortunately, weight is also probably the least accurate data point I could use since there are no reliable sources for it.

      Source |       SS       df       MS              Number of obs =     169
-------------+------------------------------           F(  3,   165) =  123.05
       Model |  2.5996e+15     3  8.6653e+14           Prob > F      =  0.0000
    Residual |  1.1620e+15   165  7.0421e+12           R-squared     =  0.6911
-------------+------------------------------           Adj R-squared =  0.6855
       Total |  3.7616e+15   168  2.2390e+13           Root MSE      =  2.7e+06
------------------------------------------------------------------------------
         aav |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         WAR |    2256088   125521.3    17.97   0.000      2008253     2503923
        true |   28062.52   13259.32     2.12   0.036     1882.712    54242.32
      weight |    24497.9   10709.87     2.29   0.023     3351.842    45643.95
       _cons |  -1.49e+07    4881150    -3.05   0.003    -2.45e+07    -5253468
------------------------------------------------------------------------------

These measures are essentially independent of WAR but do affect salary. I believe home run distance and weight are actually capturing the phenomenon that has shown that there is a stronger correlation between slugging percentage and salary than between salary and most any other basic statistic. Weight and True Distance correlate very well with slugging percentage. We can say with confidence that there is a bias toward heavier players who hit for power, all else being equal. For every ten pounds of weight or ten feet in home run distance, a hitter can expect a positive return averaging around 250 grand.
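The ten-pound and ten-foot figures follow directly from the coefficients in the hitter regression, and the t statistics in the output are simply each coefficient divided by its standard error. A short Python check, using only numbers copied from the table (the helper names are illustrative):

```python
# (coefficient, standard error) copied from the hitter regression output.
HITTER_FIT = {
    "true": (28062.52, 13259.32),    # average true home run distance, feet
    "weight": (24497.9, 10709.87),   # listed weight, pounds
}

def ten_unit_effect(name):
    """Dollar change in predicted AAV for ten more feet or ten more pounds."""
    return 10 * HITTER_FIT[name][0]

def t_statistic(name):
    """Coefficient divided by its standard error."""
    coef, std_err = HITTER_FIT[name]
    return coef / std_err

for name in HITTER_FIT:
    print(name, round(ten_unit_effect(name)), round(t_statistic(name), 2))
```

Ten feet of distance works out to roughly $281,000 and ten pounds to roughly $245,000, which is where the "around 250 grand" summary comes from.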

This is not to say whether paying these players more for the ability to throw fast or hit long home runs is efficient or not. I did this analysis to observe trends in the market over the last few years, and I am not trying to comment on any sort of inefficiencies that may exist.

Thanks to all the data sources I used in this study, including ESPN, Fangraphs, Hit Tracker, Forbes, and Fantasypitchfx.

Edit: At Jake's request, I have separated the data series by year and added separate trendlines for each year.

aavwar2.jpg


Touching Bases April 27, 2009
Derek Holland Analysis
By Jeremy Greenhouse

I wrote this post last Wednesday night, and Derek L. Holland has since made another appearance, tossing three innings of one-run ball. His velocity was a bit down, but his pitch usage and movement were similar. He gave up two walks and threw a lot more balls as well. Here's what I wrote Wednesday about what seemed to me an auspicious introduction to a promising career.

Rookie Derek L. Holland made his Major League debut on Wednesday night against the Blue Jays, pitching two and a third scoreless innings.

Holland, 22, was drafted out of junior college in the 25th round of the 2006 Rule 4 draft. From there, Holland’s stock as a prospect rocketed upwards coinciding with the increase in his velocity. In his stint in A-ball in 2007, Holland threw 67 innings with a 3.22 ERA and 3.95 K/BB ratio. In 2008, across three leagues—the highest being AA Frisco—Holland made even more strides, lowering his walk and home run rates in 150.2 innings, which culminated in a 2.27 ERA and 157 strikeouts—third in the Minor Leagues. The performance garnered him Rangers Minor Leaguer of the year.

“What worked so well for me was being able to communicate with my catchers and staying ahead of the hitters,” Holland told mlb.com. “It was huge, and that was what helped me to keep having the hitters guessing. I feel as the year went along, I got stronger and my pitches became a little better.”

Coming into the year, Holland was a prospect on everyone’s radar, as he was ranked 40th by Kevin Goldstein, 31st by Baseball America, and 21st by Keith Law.

Here’s what Goldstein had to say about the flame-throwing left hander:

“The Good: Holland's velocity only got better during the year, as he began the year in the low 90s but was sitting at 94-96 mph while touching 99 by season's end. His arm speed rivals that of any southpaw in the minors, and the pitch also features excellent late life. His top secondary pitch is a plus changeup with depth, fade, and good arm-side deception.
The Bad: Holland is still struggling to come up with a consistent breaking ball. He throws a slider which either flashes plus or is below average depending on the day, and he can flatten the pitch out by overthrowing it. The leap he made last year was so unexpected that he still has some skeptics.”

And Law:

“He was 88-91 mph the following spring, then was 90-93 in the summer of '07 in Spokane. By the middle of 2008, he was already in Double-A, sitting 93-95 and touching 98, with natural bore and cut to the pitch and uncanny command. His changeup is already an above-average pitch, and he held right-handed hitters to a .215/.268/.305 line across three levels this year. His slider is still a work in progress, but it's improving, and he has enough command and deception to get left-handed hitters out in the minors. He doesn't have the raw upside of (Neftali) Feliz, but he's not far behind him in potential and is ahead of him in command and feel for pitching, and is the most likely of Texas' horde (pun intended) of pitching prospects to contribute to the big club in 2009.”

With that in mind, I broke down Holland’s first appearance in the show.

Holland entered in the 6th inning of a 6-3 game with the bases loaded and two outs. He had the platoon advantage against Adam Lind and promptly challenged Lind with two consecutive 96-MPH heaters. Ahead 1-2, Holland threw Lind a slider that broke off the plate outside that Lind just barely spoiled. Holland worked outside with another 95 MPH fastball and Lind fought it off for an infield hit. Holland again worked ahead of the count on Scott Rolen with fastballs before throwing a 1-2 slider that Rolen popped up.

Holland breezed through the seventh. He retired Kevin Millar on the first pitch of the inning, and then got into a ten-pitch duel with Rod Barajas. Holland fell behind with two high-and-wide fastballs. Yet he continued to work up in the zone, and Barajas was unable to catch up to any of his next four fastballs, fouling three off and swinging through another. When the count worked full, there was no doubt Holland would stay with the hard stuff, and after a couple more foul balls, Holland eventually induced a fly out on a letter-high fastball.

Holland picked up two strikeouts in the eighth. His best pitch of the night might have been a 1-2 ankle-high slider to Aaron Hill which was swung over for strike three. But he tried a 1-2 slider on the very next batter, and this time Alex Rios stayed on it for a single. Holland worked inside to Vernon Wells, and Wells was caught looking at a 92 MPH fastball right over the heart of the plate, a pitch Holland got away with.

He had lost his velocity by the 9th inning. In the seventh, Holland’s fastball averaged 96-97, but it fell to 93-94 in the ninth. He also missed his target on each of the first three pitches against his final batter. On 2-0, the catcher set up outside and the pitch sailed over the inside part of the plate, hammered for a single by Adam Lind. Lind was the only lefty Holland faced, and he got base hits on both encounters.

Holland certainly was able to work ahead of hitters, as he indicated was one of the keys to his success. He got into only one three-ball count and six two-strike counts.

Courtesy of Brooks baseball, here’s what his location chart looked like.

holland.jpg

Holland worked up. His shoulder might have been flying open a bit, because something caused him to consistently miss high and wide to his arm side.

Holland showed that plus velocity Goldstein referred to. In addition to his 94 MPH fastball, he generated solid movement on the pitch. However, he didn’t throw any pitch listed as a changeup, which Goldstein and Law called his best offspeed pitch. I could see why Goldstein referred to his slider as inconsistent. He was able to keep it down in the strikezone, which is always a positive, and coming in at 84-85 MPH, his slider has a nice speed differential with his hard stuff, but the break on the pitch was suspect. Though harder than most sliders, Holland’s had below average vertical and horizontal movement. Law noted Holland’s possible reverse platoon split, and the fact that Holland’s slider doesn’t break away from lefties certainly contributes to this.

Holland’s going to want to work on getting some more tilt on his slider. He also might want to start working down in the zone with his fastball, though his nerves might have made him overthrow a bit Wednesday night. His biggest asset is simply being a southpaw who can dial up 97 and throw strikes. He can survive with just that pitch if he comes out of the bullpen. I think Holland and Neftali Feliz will be tremendous assets to the Rangers' decrepit pitching situation in the future. But considering how wide open the A.L. West is, we could see the Rangers' superlative farm system pay dividends this year.

Touching Bases April 27, 2009
Parks and Conversation
By Jeremy Greenhouse

These notes don't fit into the post that I will hopefully have up tomorrow, but I thought I'd include this graphic of average ballpark dimensions from 2006-2008 here. I converted the dimensions found on Hit Tracker from pixels into feet, and here are the results by quartiles.

league%20park%20dimensions.jpg

Also, last week I linked to a couple excellent studies on park factors by Greg Rybarczyk and David Gassko, but I forgot to link to Jeff's excellent post on park factors, which I will be referencing as well in the future. Fortunately, his study used the same years of data as I did. It contains several useful pieces of information that I have not seen used in many other places, such as foul area and average wall height, a key piece of information that is missing from the above visual but can also be found at ballparks.com.

Lastly, I'm interested in hearing thoughts on whether it would be more informative to list numbers other than just averages for home run characteristics. For example, the bottom 10% of home runs in a certain park might tell us how easy it is to hit short home runs, while showing perhaps the top 25% could tell us how well the ball carries in a park. Or for certain players, the top quartile will give an indication of a player's raw power, while the bottom half may tell us more about how he used his park to his advantage.
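As a rough sketch of what those alternative summaries would look like, the Python below averages the shortest 10% and the longest 25% of a list of home run distances. The distances are invented for illustration and the cutoffs are just the ones floated above. Nothing here comes from the Hit Tracker data.

```python
from statistics import mean

def tail_means(distances, bottom_frac=0.10, top_frac=0.25):
    """Average of the shortest and the longest home runs in a sample."""
    ordered = sorted(distances)
    n = len(ordered)
    k_bottom = max(1, round(n * bottom_frac))
    k_top = max(1, round(n * top_frac))
    return mean(ordered[:k_bottom]), mean(ordered[-k_top:])

# Hypothetical true distances, in feet, for home runs hit in one park.
hypothetical_distances = [
    352, 361, 368, 374, 379, 383, 388, 391, 395, 398,
    401, 404, 408, 411, 415, 419, 424, 430, 438, 451,
]

short_end, long_end = tail_means(hypothetical_distances)
print(f"overall mean: {mean(hypothetical_distances):.1f}")
print(f"bottom 10%:   {short_end:.1f}")
print(f"top 25%:      {long_end:.1f}")
```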

Touching Bases April 21, 2009
Personal Park Effects (Part 1)
By Jeremy Greenhouse

There's been a lot of talk on this site about park effects, as Eric Walker and Sky Andrecheck just this week delved into home-field advantage, precision, and accuracy.

My idea is that not all park effects are uniform. For example, I believe that Mike Lowell and Dustin Pedroia are aided by the Green Monster to a greater extent than most hitters, and that Johnny Damon's home run production has been largely influenced by the short porch in Yankee Stadium's right field. So what I've set out to do is use Hit Tracker data to compare players' home runs at home and away from home, and perhaps come to conclusions about certain ballparks' effects on certain players.

I will not attempt to come up with my own home run factors. One reason is that if I look at only home runs here, I will face terrible selective sampling issues, which would make my results neither precise nor accurate. The other is that I'm not that smart. I'll just present the data, and try to infer results from it. For actual park effects, Walker linked to a paper by my friends at the Harvard Sports Analysis Collective, and in the future I will be referring to a couple of articles at The Hardball Times by David Gassko and Greg Rybarczyk.

Here are the averages for all regular season home runs from 2006-2008 for which Hit Tracker has information. Here is the glossary for the terms. I've broken the fields into left, center, and right. I'd love to get more granular if I had more data. The second column refers to how many home runs were hit over the timespan, and the percentage hit to each field. The rest are Hit Tracker terms.


Field 497.6 True Speed Elevation Apex Wind Temperature Altitude Standard Parks
Left 51% 394.7 104.2 29.9 91.0 1.9 1.0 2.1 389.6 24.2
Center 12% 421.7 106.6 28.9 94.7 2.6 1.3 2.3 415.6 20.5
Right 37% 393.8 103.9 29.9 90.7 1.9 1.1 2.1 388.6 23.9

I defined the dimensions so that 12% of balls went out to center, so as to be consistent with the work I did last week. However, Hit Tracker also gives horizontal launch angles, with which you can define your own dimensions rather accurately. As there are more right-handed batters than left-handed batters, there are more left-field home runs than right-field home runs. Other than that, the differences between left and right are negligible. Homers to center are hit harder and farther, but also need more help from atmospheric effects. On to specific ballparks. Click on the ballpark names to view their dimensions.
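For anyone who wants to define fields from the horizontal launch angles mentioned above, here is one way it might be done in Python. The sketch assumes the angle is expressed in degrees from straightaway center, negative toward left field, which may not match Hit Tracker's own convention. The sample angles are invented. The center band is sized so that about 12% of the sample falls in it.

```python
def center_half_width(angles, center_share=0.12):
    """Half-width of the center band that captures about center_share of homers."""
    ordered = sorted(abs(a) for a in angles)
    k = max(1, round(len(ordered) * center_share))
    return ordered[k - 1]

def classify_field(angle, half_width):
    """Label a home run Left, Center, or Right from its spray angle."""
    if abs(angle) <= half_width:
        return "Center"
    return "Left" if angle < 0 else "Right"

# Hypothetical spray angles in degrees (0 = dead center, negative = left field).
hypothetical_angles = [
    -40, -35, -33, -30, -28, -25, -22, -20, -18, -15, -12, -9, -6, -2,
    1, 5, 8, 11, 14, 19, 23, 27, 31, 36, 41,
]

half_width = center_half_width(hypothetical_angles)
counts = {"Left": 0, "Center": 0, "Right": 0}
for angle in hypothetical_angles:
    counts[classify_field(angle, half_width)] += 1
print(half_width, counts)
```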

Ameriquest Field

Homers in Arlington certainly travel. They get by far the greatest boost from temperature of any ballpark. There have actually been more home runs to right field than left, which likely means that it is easier to hit home runs in that direction. Indeed, home runs to right have to travel a shorter distance than those to left. This is likely compounded by the home team, the Texas Rangers, trying to exploit the advantage by stocking up on lefties or switch-hitters. The effect is most prominent at Yankee Stadium, where the Yankees are well known for going after left-handed batters whose production is enhanced by the short right-field fence.


Field 545 True Speed Elevation Apex Wind Temperature Altitude Standard Parks
Left 38% 403.1 105.2 29.7 92.3 0.9 4.6 2 395.2 27.7
Center 14% 423.1 106.5 29.3 97.3 0.5 4.9 2 415.3 20.2
Right 47% 396.2 103.8 29.6 90.4 2.2 4.5 2 387.3 22.6

Angels Stadium

Angels Stadium seems to play true to most of the league averages. It might be a bit easier than normal to hit home runs out to center.


Field 417 True Speed Elevation Apex Wind Temperature Altitude Standard Parks
Left 48% 401.3 104.7 29.8 91.4 5.5 1.3 1 394.0 25.2
Center 18% 413.8 105.2 28.5 89.8 4.9 1.0 1 407.5 15.1
Right 35% 394.8 103.6 30.1 90.9 3.2 1.3 1 389.7 24.1

AT&T Park

I'm surprised that AT&T Park has one of the strongest negative temperature effects. I guess being by the bay really cools the weather. This, and an anemic offense from the home team, contribute to the very small number of homers hit at AT&T. However, the wind will ratchet up at times. It's clearly a pitcher's park. I imagine it would have been helpful to break up this park into right-center and right field.


Field 363 True Speed Elevation Apex Wind Temperature Altitude Standard Parks
Left 64% 395.5 104.1 29.3 86.9 9.1 -2.6 0 388.8 22.9
Center 11% 423.1 106.7 28.0 89.4 9.4 -2.2 0 415.6 19.0
Right 25% 386.5 103.2 29.9 86.9 7.6 -2.5 0 381.3 24.4

Busch Stadium

It can get windy in Busch, which will inflate the actual distances of home runs. Overall, the park is fair.


Field 463 True Speed Elevation Apex Wind Temperature Altitude Standard Parks
Left 54% 399.7 104.5 30.1 91.8 5.0 2.6 2 390.4 26.0
Center 9% 424.5 106.0 29.4 95.9 7.2 2.9 2 412.9 19.3
Right 37% 399.5 104.2 30.3 92.8 4.0 2.3 2 391.5 25.5

Chase Field

Chase Field is clearly a home run park, but it is quite deep to center. There aren't many cheap home runs hit at Chase.


Field 522 True Speed Elevation Apex Wind Temperature Altitude Standard Parks
Left 54% 406.0 105.5 29.6 92.1 0.7 3.6 5 397.2 26.6
Center 7% 439.8 109.0 29.3 101.6 0.1 3.7 5 431.2 28.0
Right 39% 397.7 104.3 30.2 93.1 0.1 3.3 5 389.9 24.6

Citizens Bank Park

Citizens Bank Park's dimensions are right around league average, but the walls don't jut out toward right-center and left-center, making home runs attainable in those directions.


Field 654 True Speed Elevation Apex Wind Temperature Altitude Standard Parks
Left 53% 387.5 103.4 29.8 89.0 2.3 1.3 0 383.8 21.3
Center 11% 423.5 106.9 29.5 98.4 4.2 1.7 0 417.5 22.2
Right 36% 391.0 103.9 29.8 90.2 3.4 1.1 0 386.5 22.8

Comerica Park

Comerica is built for triples with its insanely deep walls in center field. Anyone who can hit homers out there is a man. For such a difficult home run park, it's impressive how many homers have been hit there.


Field 538 True Speed Elevation Apex Wind Temperature Altitude Standard Parks
Left 64% 394.7 105.1 29.3 89.2 0.9 0.2 2 391.2 26.0
Center 4% 440.2 109.8 28.6 99.1 3.5 -0.3 2 434.1 28.9
Right 32% 388.2 103.9 29.3 87.3 -0.5 0.3 2 386.0 25.1

Coors Field

Mile-high air is worth 21 feet in home run distance. Aside from that, there's not much notable about the park. The deep fences do a decent job of canceling out the extreme altitude effects.


Field 522 True Speed Elevation Apex Wind Temperature Altitude Standard Parks
Left 48% 415.5 104.7 30.0 91.5 2.1 1.1 21 391.1 24.5
Center 13% 441.1 106.8 29.4 96.1 3.6 0.6 21 416.0 19.3
Right 39% 414.3 104.4 29.4 88.8 1.9 0.2 21 391.6 23.1
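The Coors numbers make it easy to see how the columns fit together. Judging from the figures themselves, the standard distance looks like the true distance with the wind, temperature, and altitude contributions removed. That relationship is an inference from the table rather than something spelled out here, so the Python check below only asks whether the rows agree to within rounding.

```python
# Coors Field rows: (true, wind, temperature, altitude, standard), in feet.
COORS = {
    "Left": (415.5, 2.1, 1.1, 21, 391.1),
    "Center": (441.1, 3.6, 0.6, 21, 416.0),
    "Right": (414.3, 1.9, 0.2, 21, 391.6),
}

def implied_standard(true, wind, temperature, altitude):
    """True distance with the three atmospheric contributions stripped out."""
    return true - wind - temperature - altitude

residuals = {}
for field, (true, wind, temp, alt, standard) in COORS.items():
    residuals[field] = implied_standard(true, wind, temp, alt) - standard
    print(field, round(residuals[field], 1))
```

Each row lands within half a foot of the listed standard distance, so nearly all of the extra carry at Coors is the 21 feet attributed to altitude.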

Dodger Stadium

Dodger Stadium's center field doesn't reach 400 feet, so a rather high percentage of homers travel that way.


Field 423 True Speed Elevation Apex Wind Temperature Altitude Standard Parks
Left 42% 402.6 105.1 29.6 91.5 2.8 0.9 2 396.8 24.5
Center 18% 416.6 105.8 28.6 91.8 1.7 1.3 2 411.7 18.3
Right 39% 399.0 104.2 29.8 91.1 2.9 0.7 2 393.5 23.5

Dolphins Stadium

Dolphins Stadium is conducive to righties, so long as they can get some loft on their fly balls. Right field is the opposite, as home runs travel farther but not as high. Straightaway center is 400 feet, which is normal, but the walls jut out from there, making home runs into the power alleys difficult. Fly balls are aided by the temperature, though I'm not sure the temperature data accounts for whatever effect humidity might cause.


Field 503 True Speed Elevation Apex Wind Temperature Altitude Standard Parks
Left 61% 390.4 103.8 29.7 90.2 -0.2 4.2 0 386.3 22.8
Center 7% 426.5 107.9 27.3 90.5 0.9 4.4 0 421.1 24.4
Right 32% 396.6 104.9 28.8 87.9 0.3 4.4 0 391.6 26.2

Fenway Park

You can see how high home runs have to go to clear the Green Monster. Though the relationship is far from strict, ten feet in distance correlates with an extra foot and a half in apex height. But in Fenway, homers to center are 40 feet longer but only half a foot higher on average than those to left.


Field 442 True Speed Elevation Apex Wind Temperature Altitude Standard Parks
Left 54% 385.2 102.4 31.4 93.6 5.0 -0.7 0 377.7 21.7
Center 9% 423.3 106.7 28.8 94.1 7.3 -1.6 0 417.6 20.6
Right 37% 397.7 105.2 29.1 88.1 5.0 0.5 0 393.3 25.8
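To put the Green Monster effect in numbers, the Python sketch below applies the rule of thumb quoted above (about a foot and a half of apex for every ten feet of distance) to the Fenway left and center rows, and compares the expected apex gap with the one actually observed. The rule of thumb is loose, as noted, so this is only a back-of-the-envelope comparison.

```python
# Rule of thumb from the text: 10 feet of distance ~ 1.5 feet of apex.
APEX_FEET_PER_DISTANCE_FOOT = 1.5 / 10

# Fenway Park averages from the table: true distance and apex, in feet.
FENWAY = {
    "Left": {"true": 385.2, "apex": 93.6},
    "Center": {"true": 423.3, "apex": 94.1},
}

distance_gap = FENWAY["Center"]["true"] - FENWAY["Left"]["true"]
expected_apex_gap = distance_gap * APEX_FEET_PER_DISTANCE_FOOT
actual_apex_gap = FENWAY["Center"]["apex"] - FENWAY["Left"]["apex"]

print(round(distance_gap, 1))       # 38.1
print(round(expected_apex_gap, 1))  # 5.7
print(round(actual_apex_gap, 1))    # 0.5
```

Homers to center would be expected to peak nearly six feet higher than those to left, yet the observed gap is half a foot, which suggests how much extra height the wall demands of balls hit to left.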

Great American Ballpark

Great American seems to be a bit harder on righties than it is on lefties, but overall it plays as a home run hitter's park.


Field 675 True Speed Elevation Apex Wind Temperature Altitude Standard Parks
Left 44% 399.5 104.9 29.9 92.5 1.6 1.6 2 394.1 24.4
Center 14% 418.5 105.9 29.0 93.8 2.4 2.4 2 411.6 18.1
Right 42% 393.7 103.8 29.9 90.5 2.3 2.1 2 387.3 22.2

Jacobs Field

I would think that it shouldn't be too hard to hit balls out of the Jake to center, but there haven't been too many hit in that direction for some reason.


Field 481 True Speed Elevation Apex Wind Temperature Altitude Standard Parks
Left 44% 392.3 103.6 30.2 91.9 0.1 -0.1 3 389.7 24.4
Center 11% 412.2 104.8 29.0 91.6 3.3 1.2 3 405.4 15.0
Right 45% 385.7 102.4 30.2 89.6 2.2 -0.2 3 381.2 20.9

Kauffman Stadium

Looking at the atmospheric effects, I'm surprised it's so difficult to hit home runs at Kauffman, though the fences are kind of deep to right-center and left-center field.


Field 413 True Speed Elevation Apex Wind Temperature Altitude Standard Parks
Left 49% 404.8 105.7 30.0 94.3 0.7 2.6 4 397.9 28.0
Center 15% 429.0 107.8 28.6 95.4 0.3 3.5 4 421.8 23.9
Right 37% 403.5 104.9 30.5 96.0 1.0 2.4 4 396.5 26.8

McAfee Coliseum

I always thought McAfee was a more difficult home run park, but the dimensions aren't bad at all. It does have the worst wind and temperature effects of any park, though.


Field 407 True Speed Elevation Apex Wind Temperature Altitude Standard Parks
Left 53% 389.6 104.6 29.4 89.1 -1.9 -1.6 0 393.0 26.2
Center 12% 409.5 106.5 28.9 93.5 -4.7 -1.7 0 416.0 22.6
Right 36% 390.3 105.3 30.1 92.8 -3.6 -2.1 0 396.0 26.2

Metrodome

It takes some elevation to hit home runs over the baggy in right, but there are a lot of cheap home runs hit in that direction too. A 32.9 degree elevation angle is the highest figure for any field I came up with and 370 feet in standard distance is the lowest to a field either direction of center.


Field 413 True Speed Elevation Apex Wind Temperature Altitude Standard Parks
Left 50% 392.4 104.2 29.7 89.9 0.3 0.0 3 388.9 25.2
Center 16% 426.2 108.5 27.3 89.6 0.0 -0.1 3 423.1 24.4
Right 34% 373.6 100.2 32.9 97.4 0.0 0.1 3 370.3 18.0

Miller Park

Down the line to right is nice and short. The apex of home runs to center is unusual.


Field 556 True Speed Elevation Apex Wind Temperature Altitude Standard Parks
Left 51% 397.2 104.5 29.5 90.9 0.2 0.0 2 394.8 21.7
Center 15% 419.4 106.7 29.9 100.0 0.2 0.5 2 416.3 22.9
Right 34% 393.2 104.0 29.9 91.3 0.9 0.1 2 389.9 20.7

Minute Maid Park

That hill out in center sure makes things difficult for power hitters. It's unusual that the wind had an adverse effect on center-field home runs, since normally balls need a little help from the wind to carry that far.


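+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Field  | 531 | True  | Speed | Elevation | Apex  | Wind | Temperature | Altitude | Standard | Parks |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Left   | 59% | 376.6 | 101.9 |   31.5    |  93.5 |  0.2 |     1.7     |    0     |  374.6   |  19.2 |
| Center |  4% | 429.0 | 108.9 |   28.2    |  95.7 | -0.2 |     0.8     |    0     |  427.9   |  26.7 |
| Right  | 37% | 393.7 | 104.9 |   29.1    |  88.3 |  0.3 |     1.8     |    0     |  391.5   |  24.2 |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+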

Nationals Park

There's not much of a sample for Nationals Park, but it seems to play around league average, unlike RFK which was cavernous.


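+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Field  | 145 | True  | Speed | Elevation | Apex  | Wind | Temperature | Altitude | Standard | Parks |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Left   | 50% | 397.2 | 105.1 |   29.0    |  88.1 |  1.6 |     1.2     |    0     |  394.5   |  25.6 |
| Center | 19% | 412.2 | 105.5 |   28.4    |  90.2 |  1.8 |     2.2     |    0     |  408.1   |  16.7 |
| Right  | 32% | 389.1 | 102.5 |   30.8    |  92.6 |  1.5 |     2.5     |    0     |  384.9   |  21.7 |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+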

Oriole Park

Oriole Park is definitely a home run haven thanks to friendly atmospherics and a short fence in left.


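+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Field  | 570 | True  | Speed | Elevation | Apex  | Wind | Temperature | Altitude | Standard | Parks |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Left   | 50% | 388.1 | 102.9 |   29.7    |  87.7 |  2.3 |     2.6     |    0     |  383.1   |  21.3 |
| Center | 12% | 415.6 | 105.9 |   28.7    |  92.8 |  1.4 |     2.1     |    0     |  412.1   |  18.9 |
| Right  | 37% | 393.4 | 103.5 |   30.3    |  91.7 |  2.7 |     2.8     |    0     |  387.8   |  23.9 |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+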

Petco Park

Petco is death to righties. Wind might blow from left to right in Petco more often than not. Straightaway center isn't so deep, but the fences in the alleys extend out to 400 feet.


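+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Field  | 414 | True  | Speed | Elevation | Apex  | Wind | Temperature | Altitude | Standard | Parks |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Left   | 54% | 390.1 | 104.3 |   29.7    |  90.4 |  0.5 |    -0.6     |    0     |  390.0   |  26.1 |
| Center | 18% | 417.5 | 106.5 |   28.8    |  93.7 |  4.1 |    -0.9     |    0     |  414.3   |  18.5 |
| Right  | 29% | 396.5 | 104.3 |   29.7    |  89.5 |  5.9 |    -0.3     |    0     |  390.7   |  26.7 |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+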

PNC Park

PNC plays similarly to Petco, except it is even harder on righties and even easier out to center. Jason Bay must be happy getting out of that ballpark and into Fenway where he can pepper the left-field wall.


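+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Field  | 418 | True  | Speed | Elevation | Apex  | Wind | Temperature | Altitude | Standard | Parks |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Left   | 46% | 401.5 | 105.3 |   29.2    |  89.5 |  2.1 |     1.3     |    3     |  395.1   |  27.6 |
| Center | 15% | 415.2 | 104.8 |   29.8    |  95.9 |  2.6 |     1.8     |    3     |  407.7   |  16.2 |
| Right  | 39% | 394.1 | 103.8 |   31.0    |  95.3 |  1.1 |     0.9     |    3     |  388.9   |  24.7 |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+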

RFK Stadium

I almost feel bad for Nationals hitters who had to play in this behemoth of a stadium.


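+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Field  | 285 | True  | Speed | Elevation | Apex  | Wind | Temperature | Altitude | Standard | Parks |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Left   | 54% | 395.6 | 104.5 |   30.3    |  93.5 |  0.9 |     2.5     |    0     |  392.1   |  27.3 |
| Center | 11% | 426.8 | 108.4 |   27.7    |  93.0 | -0.2 |     0.9     |    0     |  425.8   |  26.3 |
| Right  | 35% | 397.5 | 104.5 |   30.9    |  96.9 |  1.8 |     2.1     |    0     |  393.4   |  27.8 |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+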

Rogers Centre

It's impressive that the Rogers Centre saw both an above-average number of home runs and above-average distances.


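+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Field  | 505 | True  | Speed | Elevation | Apex  | Wind | Temperature | Altitude | Standard | Parks |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Left   | 54% | 398.3 | 105.8 |   29.6    |  92.1 |  0.9 |     0.0     |    1     |  396.1   |  26.3 |
| Center | 12% | 425.9 | 108.5 |   28.1    |  94.7 | -0.5 |     0.1     |    1     |  425.1   |  25.3 |
| Right  | 34% | 397.6 | 105.6 |   29.4    |  90.9 | -0.3 |    -0.2     |    1     |  396.9   |  27.1 |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+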

Safeco Field

Safeco is awful for right-handed power hitters. It was an odd decision for the Mariners to go after Adrian Beltre and Richie Sexson.


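+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Field  | 448 | True  | Speed | Elevation | Apex  | Wind | Temperature | Altitude | Standard | Parks |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Left   | 48% | 386.2 | 103.8 |   30.2    |  91.2 |  1.2 |    -1.7     |    0     |  386.7   |  27.2 |
| Center |  9% | 413.9 | 106.7 |   28.7    |  94.0 |  0.4 |    -2.0     |    0     |  415.6   |  23.2 |
| Right  | 43% | 385.8 | 103.1 |   29.9    |  88.9 |  2.5 |    -2.3     |    0     |  385.5   |  23.7 |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+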

Shea Stadium

If you thought Shea Stadium was a pitcher's park, wait until you see how Citi Field plays.


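+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Field  | 499 | True  | Speed | Elevation | Apex  | Wind | Temperature | Altitude | Standard | Parks |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Left   | 48% | 391.6 | 103.2 |   30.8    |  93.8 |  4.5 |     0.5     |    0     |  386.6   |  23.7 |
| Center |  6% | 420.6 | 106.2 |   30.2    | 100.4 |  3.6 |     0.9     |    0     |  416.2   |  21.0 |
| Right  | 46% | 397.0 | 105.1 |   29.4    |  90.2 |  1.9 |     0.7     |    0     |  394.4   |  24.8 |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+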

Tropicana Field

The Trop conforms to league averages except to center where the walls are very deep.


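+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Field  | 528 | True  | Speed | Elevation | Apex  | Wind | Temperature | Altitude | Standard | Parks |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Left   | 52% | 388.9 | 103.7 |   30.0    |  90.6 |  0.2 |     0.9     |    0     |  388.0   |  24.2 |
| Center | 10% | 422.6 | 107.5 |   28.6    |  95.3 |  0.0 |     0.9     |    0     |  421.9   |  26.4 |
| Right  | 39% | 392.2 | 104.5 |   29.0    |  87.4 |  0.0 |     0.8     |    0     |  389.7   |  24.7 |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+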

Turner Field

Turner Field is deep down the lines, but hitters get a lot of help from altitude, wind, and temperature.


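+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Field  | 502 | True  | Speed | Elevation | Apex  | Wind | Temperature | Altitude | Standard | Parks |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Left   | 49% | 404.3 | 105.7 |   29.4    |  91.6 |  0.5 |     2.9     |    4     |  397.1   |  26.2 |
| Center | 17% | 423.6 | 106.0 |   29.7    |  98.0 |  2.6 |     3.3     |    4     |  414.1   |  19.5 |
| Right  | 34% | 400.9 | 104.3 |   30.3    |  93.4 |  2.3 |     2.8     |    4     |  392.0   |  25.9 |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+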

U.S. Cellular Field

I had always thought that there was a jet stream of wind that forced balls out of U.S. Cellular, but it appears that the park is friendly to home runs only because of the crazy-short fences. The deepest part of the park might not even reach 390 feet.


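+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Field  | 662 | True  | Speed | Elevation | Apex  | Wind | Temperature | Altitude | Standard | Parks |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Left   | 59% | 387.6 | 103.4 |   30.5    |  92.3 |  1.3 |    -0.4     |    2     |  384.3   |  21.5 |
| Center | 12% | 417.5 | 105.6 |   28.9    |  92.7 |  2.9 |     1.0     |    2     |  411.4   |  18.3 |
| Right  | 29% | 391.0 | 103.8 |   29.2    |  87.3 |  1.5 |    -0.4     |    2     |  387.6   |  22.4 |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+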

Wrigley Field

Wrigley Field is windy, who would've guessed? I don't think that the park has much to do with the Cubs' decision to stock up on right-handed bats.


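+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Field  | 552 | True  | Speed | Elevation | Apex  | Wind | Temperature | Altitude | Standard | Parks |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Left   | 62% | 393.5 | 103.2 |   29.6    |  87.9 |  4.6 |     0.9     |    2     |  386.0   |  19.0 |
| Center | 15% | 421.2 | 105.2 |   30.5    |  98.6 |  9.6 |     1.1     |    2     |  408.5   |  15.9 |
| Right  | 23% | 401.0 | 104.4 |   30.2    |  92.5 |  5.9 |     0.9     |    2     |  392.0   |  23.9 |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+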

Yankee Stadium

There's been a lot of talk about the new Yankee Stadium playing like a bandbox, but the old stadium wasn't so bad itself. The short right-field porch allowed the Yankees to stack up on lefties, so there has been a higher percentage of homers hit to right in Yankee Stadium than any other park.


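+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Field  | 533 | True  | Speed | Elevation | Apex  | Wind | Temperature | Altitude | Standard | Parks |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+
| Left   | 37% | 397.1 | 105.4 |   30.0    |  92.9 |  2.9 |     0.4     |    0     |  391.7   |  26.8 |
| Center | 10% | 425.0 | 107.5 |   28.1    |  91.7 |  5.2 |     0.8     |    0     |  418.8   |  21.7 |
| Right  | 54% | 379.1 | 101.8 |   30.6    |  89.3 |  2.8 |     0.6     |    0     |  375.5   |  21.4 |
+--------+-----+-------+-------+-----------+-------+------+-------------+----------+----------+-------+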


Next time, I will look at a team's home runs at home compared to a team's home runs away from home. The ultimate goal is to find how certain parks affect certain players.

Touching Bases
April 14, 2009
All-Time Home Run Location Leaderboards
By Jeremy Greenhouse

I’ve hypothesized, along with others, that Ryan Howard might be the best opposite-field power hitter of all time. Thanks to the wonders of Retrosheet (and Colin Wyers), we can get closer to answering that question.

I queried for all home runs in the Retrosheet era and came up with about 185,000 homers. I then eliminated all home runs that didn’t have a field location or were inside-the-parkers, which cut around 20,000 homers. Not all the homers cut were from the ‘50s: I think the worst year for home run location data in the Retrosheet era (1953-2008) was 1984, and the most accurate years are probably the ‘90s. Anyway, here’s the diagram Retrosheet uses. I coded the three zones on the batter’s pull side of center as pull field, the three zones on the other side as opposite field, and the straightaway zone as center field. Onward.
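
A minimal sketch of that coding step in Python. The seven zone labels (7L, 7, 78, 8, 89, 9, 9L) are an assumption about how the Retrosheet diagram names the outfield zones, and `classify` and `split_percentages` are hypothetical helper names, not the query actually used for the tables in this post. The sample data is made up.

```python
from collections import Counter

# Assumed zone labels, running from the left-field line to the right-field line.
LEFT_ZONES = {"7L", "7", "78"}
CENTER_ZONES = {"8"}
RIGHT_ZONES = {"89", "9", "9L"}


def classify(bats: str, zone: str) -> str:
    """Return 'pull', 'center', or 'opposite' for one home run.

    bats is 'L' or 'R' (the side the batter hit from on that swing);
    zone is one of the seven assumed labels above.
    """
    if zone in CENTER_ZONES:
        return "center"
    if zone in LEFT_ZONES:
        side = "left"
    elif zone in RIGHT_ZONES:
        side = "right"
    else:
        raise ValueError(f"unknown zone: {zone!r}")
    # Right-handed batters pull to left field; left-handed batters pull to right.
    pull_side = "left" if bats == "R" else "right"
    return "pull" if side == pull_side else "opposite"


def split_percentages(homers):
    """homers: iterable of (bats, zone) pairs -> {category: percent of total}."""
    counts = Counter(classify(bats, zone) for bats, zone in homers)
    total = sum(counts.values())
    return {k: round(100 * counts[k] / total, 1) for k in ("pull", "center", "opposite")}


# Made-up sample, only to show the shape of the output.
sample = [("L", "9"), ("L", "89"), ("L", "8"), ("L", "7"), ("R", "7L"), ("R", "9")]
print(split_percentages(sample))
```

Note that a switch hitter needs the batting side for each individual home run, not a single handedness per player, or the pull and opposite labels come out flipped for some of his homers.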

+----------+---------+-----------+-----------+
|   Bats   |  Pull%  |  Center%  | Opposite% | 
+----------+---------+-----------+-----------+
|  Left    |  76.3   |    11.8   |    11.9   |
|  Right   |  76.6   |    12.1   |    11.3   |  
+----------+---------+-----------+-----------+

Somewhat odd that lefties hit more opposite field homers than center field homers. This won't really shed light on the matter, but I felt like looking at splits against pitchers.

+----------+-----------+---------+-----------+-----------+
|   Bats   |  Pitches  |  Pull%  |  Center%  | Opposite% | 
+----------+-----------+---------+-----------+-----------+
|  Left    |   Left    |  77.6   |    11.6   |   10.9    |
|  Left    |   Right   |  76.0   |    11.8   |   12.1    |  
|  Right   |   Left    |  77.0   |    11.7   |   11.3    |
|  Right   |   Right   |  76.4   |    12.3   |   11.2    |  
+----------+-----------+---------+-----------+-----------+

So it appears that lefty pitchers have their homers pulled more often than righty pitchers—likely a result of southpaws being softer throwers. I wonder why lefties appear to hit homers to the opposite field against righties at an abnormal rate.

Alright, let’s look at the top home run hitters of all time.


alltimehomers.jpg

This was a convenient place to stop, as the next three hitters in line were switch hitters: Chipper Jones, Mickey Mantle, and Eddie Murray. Unfortunately, I messed up coding home run locations for them.

I had a feeling Jim Thome hit a very high percentage of homers to the opposite field. He and Ryan Howard are linked in more ways than one. I have to give credit to Rich Lederer for guessing that Mike Piazza would be among the tops in percentage of homers to the opposite field. But just wait until we get to my man Howard. Sheffield, not surprisingly, pulled twenty times as many homers as he hit the other way. I can’t say that I knew Ernie Banks was that extreme a pull hitter.

With a minimum of 100 home runs in my sample set, here are those with the highest pull percentage.


pull%25.jpg


No, not that Frank Thomas. You can see his splits above on the all-time leader list.

These are guys who don’t have the power to hit it out any other way. I’m impressed that so many batters have hit 100 homers without using an entire third of the field. The only other member of this group is Don Baylor, who hit 277 homers without an opposite field blast. I wanted to check on Ichiro Suzuki, since he fits in this school of hitters but hasn’t reached the 100 home run threshold. He’s hit a single opposite-field home run in his career. I hope it was memorable.


Center Field Percentage


center%25.jpg


Mike Marshalls have played a large part in my life over the last year. I recently learned that one Mike Marshall was an outcast Cy Young award-winning former teammate of Jim Bouton, who later became a doctor and developed radical pitching mechanics. Now I know that the year he retired, another Mike Marshall, of whom I had never heard, debuted as an impressive home run hitter to center field. They both had their best years with the Dodgers.

These guys all seem to have tons of raw power, as that’s what it takes to hit balls out to center. Chipper Jones belongs on this leaderboard, but was excluded due to my glitch with switch hitters.


Opposite Field Percentage


opposite%25.jpg

It’s always a pleasure to see Roberto Clemente top any list. The fact that he was such an extreme opposite field power hitter might be a tidbit not many knew about, so I’m glad I can contribute one of the more trivial pieces of information to his legend. I’m surprised to see Julio Franco here. I saw a game or two of his in my day (who didn’t), and I always thought his unique batting stance would be conducive to pulling balls, kind of like Gary Sheffield’s bat wiggle. I guess holding the bat parallel to the ground delays his swing so he makes contact with the ball as it travels further in the zone. Chuck Knoblauch, who was the opposite of Franco in that he held his bat practically parallel to the ground behind him instead of over his head, pulled 75% of his homers. Also irrelevant: Franco's hit multiple homers against both Oil Can Boyd and Russ Ortiz. I doubt many others can say that.

So Ryan Howard is clearly up there. When I made my claim about Howard, it was after seeing that he was the only player in the last four years to have recorded greater than 15 homers in a season to his weak side. I was looking at Baseball Info Solutions data then, which has Howard’s 177 career homers distributed as 37.29% to left, 32.20% to center, and 30.51% to right. So the center field zone I’m using is a bit smaller than that of BIS. I think we can say pretty definitively that he’s a great opposite-field home run hitter, but Clemente seems to be in a class by himself when it comes to opposite%. I assume that Clemente’s and Skowron’s opposite field numbers are somewhat inflated, since their center field numbers are depressed as a result of the much deeper fences back in the day. Additionally, the right-field line at Forbes Field was 300 feet, which may have padded Clemente's totals.

Was Bo Jackson the beginning of the hype machine? Or was it Brian Bosworth? I believe that Bo Jackson hit a home run to the opposite field so far that it went into orbit, only to be knocked down by a homer Matt Wieters hit last week, which I’m sure will in turn be bumped by Stephen Strasburg and then Bryce Harper.

Derek Jeter’s opposite-field prowess is well known, and I believe he’s the only batter in this group to have added to his tally this year. (I wrote that, and then on Monday, Howard hit a three-run shot to left-center field. We’ll have to see where they score that one.)

On to the single season leaderboards.


Single Season Pulled


singleseasonpull.jpg


Roger Maris is the home run king!


Single Season Center Field


singleasoncenter.jpg


Chipper had 17 in 1999 as well, but he is not included. '99 was an interesting year. 1985 was also an interesting year. Clearly, home run location data from that year are not reliable.

Finally, here's the leaderboard that started this whole ordeal.


Single Season Opposite Field


singleseasonopposite.jpg


There you have it. Howard is demonstrating opposite field power the likes of which we have never seen before.

Actually, one more note. Since I regret messing up the coding for switch hitters, I decided to go back and check on the five most notable I could think of in Mickey Mantle, Eddie Murray, Chipper Jones, Lance Berkman, and Mark Teixeira. Here are their career splits.


switchhitters.jpg


Looks like Murray and Teixeira were similar from both sides of the plate while Berkman has a split personality.

If you've made it this far, here's something that might interest you. In a Google Docs spreadsheet, I’ve included all batters with 50 career home runs in my dataset and, on another sheet, all batters with seasons of at least ten recorded home runs. If you want to search for a specific player, I’d suggest that you check out Baseball-Reference's home run logs. Sean Forman does good stuff over there.

Touching Bases
April 10, 2009
Thursday Thoughts
By Jeremy Greenhouse

No scheduled column today, so I'll be throwing a Barry Zito changeup. Luckily for us, Dave might also post later, so he'll bring the vintage Pedro change of pace. Here's what I got from yesterday's slate of games.

Kyle Davies threw seven scoreless innings yesterday. He got some buzz in the preseason as a potential breakout pitcher, as Joe Posnanski and scouts alike noted his September surge and excellent spring training. Last year, he posted a 4.06 ERA in spite of a mediocre 1.65 strikeout-to-walk ratio. However, in September, he improved those marks to a 2.27 ERA and 3.43 K/BB. Yesterday afternoon, he was lights out as he struck out eight in seven innings.

Davies is a standard four-pitch righty. He’s been making steady improvement since a disastrous second year in the Majors. Per FanGraphs, his fastball velocity since 2006 has risen from 90.6 to 91.3 to 91.5, and yesterday it was clocked at 91.7. Meanwhile, he’s improved his rate of drawing swinging strikes from 17.5% to 18% to 18.3%. Yesterday, he managed to induce 14 swings and misses on 52 swings.
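
To put yesterday's total on the same scale as those season rates, here is a quick calculation. It assumes the season figures are also measured per swing, which the text doesn't spell out, and `whiff_rate` is just an illustrative helper name.

```python
def whiff_rate(misses: int, swings: int) -> float:
    """Swinging strikes as a percentage of swings."""
    return 100 * misses / swings


# Yesterday's start: 14 swings and misses on 52 swings.
yesterday = whiff_rate(14, 52)
print(f"{yesterday:.1f}%")
```

On that basis, yesterday's rate works out to roughly 27%, well above the 17.5% to 18.3% range of the past three seasons.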

Davies throws a rising fastball which made him vulnerable to homers two years ago. Last year, his HR/FB dipped to 7%, which will probably regress to the mean this year. Even so, his peripherals are improving, so while he might continue to improve his GB/FB rate, he'll almost certainly allow more homers. Davies snaps off a curve with above average velocity, vertical, and horizontal movement, which I would say makes it a plus pitch. However, he shows a noticeably higher release point for his curve than other pitches, which can only serve to tip the pitch. Nevertheless, his curve was awesome yesterday. He threw only three of his 13 curves for balls, as he was able to draw a groundout, three swinging strikes, a foul ball, and five called strikes from the yakker. Davies’ changeup had some serious tail yesterday, and he threw it for strikes three quarters of the time, which is excellent. He began using the changeup more often in September of last year in favor of his fastball, as he threw the change 16% of the time as compared to 10% earlier in the season. His changeup and curve are both strong pitches, which makes him formidable against both right-handed and left-handed batters.

Davies' slider and fastball have minimal differential in terms of velocity, but sometimes with sliders, not mixing speeds helps to conceal the pitch. Sinkerballers will often complement their two-seamer with a strong sweeping slider, so they stay on the same plane and have similar velocity, and therefore are unrecognizable until about 30 feet from the plate. Davies, on the other hand, works up and down, complementing his rising fastball with a slider that has little horizontal movement but dives down. I would think his slider is his worst pitch, but he might just use it as a show-me pitch against righties. I could see Davies showing a reverse platoon split, since his slider seems to be substantially worse than his curve and change. I could buy him as a league average pitcher this year too.

Other thoughts: We saw a rather telling difference in managing philosophies in the Mariners’ and Cardinals’ games. Young flamethrowers Brandon Morrow and Jason Motte both got their first save opportunities of the year earlier this week, and they imploded, forfeiting ninth inning two-run leads. Up 2-0 yesterday, Don Wakamatsu decided to give Brandon Morrow another chance, and Morrow promptly came in and walked the first batter on four fastballs out of the zone. But Wakamatsu’s confidence in the youngster paid off, and so did Morrow’s confidence in his heater, as Morrow threw nothing but fastballs all inning, resulting in two strikeouts and a can of corn to center to end the game. Tony La Russa, however, was in the precarious position of trying to preserve a one-hitter. Did this game have any added significance as it was Chris Carpenter's first healthy start in three years? I don’t know, but La Russa must have somehow considered it a must-win, as he abandoned his bullpen strategy, leaving Jason Motte on the bench and trotting out Dennys Reyes. Reyes got the job done, but I still prefer Wakamatsu’s approach to bullpen usage thus far. Don't panic after one game.

My favorite moment of the day was in the Dodgers-Padres game. Vin Scully was calling the game, so you know it’s good. Heath Bell came on to pitch the ninth, and he had the luck of facing the heart of the Dodgers’ imposing lineup. Things looked bleak for the Padres’ new closer when Orlando Hudson led off with a triple, sending one Manny B. Ramirez to the plate with the tying run on third and no outs. But with the infield in, Bell got Manny to ground out to short, halting Hudson at third. Following an Andre Ethier walk, Russell Martin bounced into a double play, and thus the Padres were tied for third place with the Dodgers. We might have our first divisional race of the year on our hands.


Touching Bases
April 07, 2009
GameDay, MLB.TV, and Instant Replay
By Jeremy Greenhouse

The new MLB gameday and mlb.tv are unreal. MLBAM unveiled GameDay Premium, which will cost $20 for the season, but I’ll be sure to make the investment for the comprehensive pitch f/x data presentations, including hot/cold zones, velocity charts, pitch type charts, movement charts, and release point charts. Now that Josh Kalk took his player cards down, the only two sources left for real-time pitch f/x graphs and data are brooksbaseball and mlb gameday.

MLB.tv offered every game’s home, away, and radio broadcasts, and the DVR as well as “jump to inning” functions will be useful later on when I’m not watching games live. The option of displaying four games at once is awesome. Unfortunately, MLB’s archaic zoning laws prevent a friend of mine who lives in Pennsylvania from watching Mets, Yankees, Pirates, and Phillies games due to blackout restrictions.

As for actual baseball, it looks like the closer’s job in St. Louis may still be up for grabs. Jason Motte entered the ninth inning with a two-run lead and immediately brought the heat. His first pitch was a fastball in, and a hitter as experienced as Freddy Sanchez knew what to do, raking it for a double. Motte got the next two batters out before he unraveled. Motte was too predictable, as Adam LaRoche, Eric Hinske, and Jack Wilson all sat dead red. He challenged LaRoche with three fastballs, and LaRoche picked up his second hit of the game. Hinske pounded the first-pitch fastball for a double. Finally, Motte had the bases loaded with two outs and a one-run lead when he challenged Jack Wilson. Wilson was overmatched, swinging through the first-pitch fastball he knew was coming. He was able to foul off the second, but then Motte went to the well once too often, and Wilson caught up with a letter-high fastball for a game-winning three-run double. In all, Motte threw 22 fastballs in the 95-98 MPH range, but he might want to use his slider more often when he’s ahead in the count.

I already saw a couple contested home run calls for which replay wasn’t used. I think it was Yunel Escobar who hit a shot to center in the Sunday night game that might or might not have cleared the wall. The hit was ruled a double in spite of a fan’s protest that the ball had hit him in the chest. The next day Cesar Izturis lifted a ball to deep left which Johnny Damon had a bead on, but as he jumped at the wall a fan reached over and interfered with his arm, allowing the ball to travel into the stands. Is replay only going to be used during the playoffs or are we going to take this tool seriously?

Touching Bases
March 31, 2009
Can Albert Pujols Win the Triple Crown?
By Jeremy Greenhouse
“My guess is that we will see another Triple Crown winner in the next ten years. The historical trend lines are heading in that direction. That doesn’t necessarily mean anything, as, as I said, the historical trend lines may be simply a result of a random clustering of talent. It’s difficult, and it hasn’t happened for a long time, but it has not become impossible for some player to win the Triple Crown.” Bill James—June 6, 2008

Albert Pujols has a serious shot at winning the first Triple Crown since Frank Robinson and Carl Yastrzemski did so back in the 60s. It's been over 70 years since a National Leaguer led the league in home runs, batting average, and runs batted in. The only time Pujols has led the league in any triple crown category was when he boasted a .359 batting average back in 2003. He’s finished second in every category at least once. But this year might be different.

This year, Pujols might have a fully healthy elbow. This year, Chipper Jones might not threaten .400. This year, Ryan Howard might not pound 50 home runs. According to Joe Posnanski, you just have to have The Power to Believe. This is the year of Pujols.

Here's how Pujols has stacked up thus far in his career. This table shows Pujols' marks followed by the league leader's in parentheses.

+-------+-------------------+-----------+----------------+-------+-------------------+
| Year  |   Batting Average | Home Runs | Runs Batted In | Games | Plate Appearances |
+-------+-------------------+-----------+----------------+-------+-------------------+
| 2008  |     .357  (.364)  |  37  (48) |  116  (146)    |  148  |       641         |    
| 2007  |     .327  (.340)  |  32  (50) |  103  (137)    |  158  |       679         |
| 2006  |     .331  (.344)  |  49  (58) |  137  (149)    |  143  |       634         |
| 2005  |     .330  (.335)  |  41  (51) |  117  (128)    |  161  |       700         |
| 2004  |     .331  (.362)  |  46  (48) |  123  (131)    |  154  |       692         |
| 2003  |     .359  (.359)  |  43  (47) |  124  (141)    |  157  |       685         |
| 2002  |     .314  (.370)  |  34  (49) |  127  (128)    |  157  |       675         |
| 2001  |     .329  (.350)  |  37  (73) |  130  (160)    |  161  |       676         |
+-------+-------------------+-----------+----------------+-------+-------------------+


Let’s break it down by category. I've looked at six projection systems—Bill James, CHONE, Marcel, Oliver, PECOTA, and ZiPS—to give us an idea of what to expect.

Batting Average
Last year, Chipper Jones' .364 average narrowly edged Pujols’ .357 average for the batting title. This year, every projection system shows Pujols consistently hitting between .327 and .339. Chipper has a much wider range. CHONE and PECOTA, currently the two most trusted systems out there, completely disagree on Chipper. CHONE puts him at .310 while PECOTA shows Jones posting a .341 average to edge out Pujols. Jones’ true talent level with regards to batting average was the subject of much discussion here, here, and here. It's tough to say who has the edge between the two.


pujolsjones.jpg


Pujols and Chipper both excel in their plate discipline skills. Last year they had the lowest first-strike percentage of all National League batters to qualify for the batting title. They rarely see pitches inside the strike zone, and neither is prone to swing at pitches in general. In fact, Pujols and Chipper both walked more than they struck out. Pujols has achieved this feat seven straight years. When shooting for a high batting average, the importance of not striking out is, of course, that one has a greater chance at getting a hit if the ball is put into play.

Chipper and Pujols also excel at earning surefire hits by putting the ball out of play and over the fence. Low strikeout and high homerun totals give players a good chance at having a high average. The rest is dependent on BABIP. The factors that go into BABIP, according to an article by Peter Bendix and Chris Dutton, boil down to pitch recognition, speed, the ability to make solid contact, and the ability to spread the ball to all fields. Pujols hits a lot of line drives (20% career), and has incredible power (22.7% HR/FB, 84 XBH/year). He rarely swings, but when he does swing, he makes contact 90% of the time, which is above average and exceptional for someone who swings so hard. However, Pujols doesn’t spray the ball particularly well and isn’t too fast down the line. (He’s not slow, though. Fans gave him 46 out of 100 on speed, he’s an average to good baserunner, and he has a great glove.) Overall, xBABIP says that Pujols has gotten very lucky with BABIP lately, but nevertheless, Pujols' best shot at any of the categories is in batting average, where he and Jones are almost in a class by themselves.
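
To make the strikeout, home run, and BABIP relationship concrete, here is a small sketch. It relies on the standard definition BABIP = (H - HR) / (AB - K - HR + SF), solved for hits. The function name and the two season lines are hypothetical and are not either player's actual numbers.

```python
def batting_average(ab: int, strikeouts: int, home_runs: int, babip: float,
                    sac_flies: int = 0) -> float:
    """Batting average implied by strikeouts, home runs, and BABIP.

    Solves BABIP = (H - HR) / (AB - K - HR + SF) for hits H, then divides by AB.
    """
    balls_in_play = ab - strikeouts - home_runs + sac_flies
    hits = home_runs + babip * balls_in_play
    return hits / ab


# Two hypothetical hitters with the same power and the same BABIP,
# differing only in how often they strike out.
low_k = batting_average(ab=550, strikeouts=55, home_runs=35, babip=0.320)
high_k = batting_average(ab=550, strikeouts=150, home_runs=35, babip=0.320)
print(f"{low_k:.3f} vs {high_k:.3f}")
```

With everything else held equal, the extra 95 strikeouts cost the second hitter about 55 points of batting average, since each of those at-bats never gets a chance to become a hit.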

Other batting average contenders: David Wright and Hanley Ramirez project to hit better than .300 almost across the board. Their problem is that they strike out too much, having both eclipsed the century mark last year. Garrett Atkins. Milton Bradley. Matt Kemp, if his .376 career BABIP is sustainable. Chase Utley. Jose Reyes. Brian McCann. Manny Ramirez has a hitter's haven in Los Angeles. Pablo Sandoval is my sleeper.

Home Runs

Ryan Howard is going to be Pujols’ biggest challenger in home runs and runs batted in. Howard, unfortunately, is simply more one-dimensional than Pujols. The NL has no batting-average specialist like Ichiro in the AL, but Howard is the National League's specialist in hitting the ball a long way. A third of his fly balls clear the fence. Howard has hit 48, 47, and 58 long balls over the last three years. Not a single projection system has Pujols hitting greater than 41 homers. Meanwhile, not a single projection system has Howard hitting fewer than 40. But there is hope.

Looking at their skillsets, Pujols may actually be the better home run hitter, but is simply in worse circumstances. If we can establish that he has a higher talent level when it comes to homers, I say we can at least give him a legitimate shot to take the category.

Howard’s home park is hugely beneficial to his power output. Statcorner’s park factors show a crazy 116 HR/FB park factor for Philly and an equally ridiculous 87 HR/FB for St. Louis. (That’s Petco level. I had no idea.) Greg Rybarczyk used his Hit Tracker system to come up with a new method for calculating home run park factors. Howard is 15% more likely to hit homers in Citizens Bank Park to any field except for straightaway center, where Pujols would have an edge.
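
As a rough illustration of what a 116 versus an 87 means, the sketch below scales a neutral-park HR/FB rate by each factor. It assumes the factor applies multiplicatively with 100 as neutral, and it ignores that only home games are played in the park. The 20% neutral rate and the `park_adjusted_rate` helper are hypothetical.

```python
def park_adjusted_rate(neutral_rate: float, park_factor: float) -> float:
    """Scale a neutral-park HR/FB rate by a park factor where 100 is neutral."""
    return neutral_rate * park_factor / 100


NEUTRAL_HR_PER_FB = 0.20  # hypothetical hitter, not Howard's or Pujols' actual rate

philly = park_adjusted_rate(NEUTRAL_HR_PER_FB, 116)
st_louis = park_adjusted_rate(NEUTRAL_HR_PER_FB, 87)
print(f"Philadelphia {philly:.3f}, St. Louis {st_louis:.3f}, ratio {philly / st_louis:.2f}")
```

Under those assumptions, the same hitter would see about a third more of his fly balls leave the yard in Philadelphia than in St. Louis.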

Howard’s average homer traveled 400 feet last year, with a speed off the bat of 104 MPH. But Pujols demonstrated more raw power, as his average homer went 406 feet at 106 MPH off the bat. Furthermore, Howard's power figures seem to be declining, as his distance and speed figures are trending downward. Pujols shows more consistent power, with average distances of 406, 412, and 407 feet and speeds off the bat of 106, 109, and 110 MPH in past years.

Here's the placement of their home runs from last year. Pujols' home runs and Busch's outfield walls are in red, Howard's home runs and Citizens Bank's outfield walls are in blue.


howardpujols2.jpg


See that 20-foot discrepancy between Busch's left field wall and Citizens Bank Park's? It looks like Howard got three or four extra homers in that area, and there's little doubt in my mind that Pujols hit some fly balls out there that went for mere doubles.

Other home run contenders: Adam Dunn won the "golden sledgehammer" with an average of 419 feet and 109 MPH. Fortunately for Pujols, Dunn is now playing in Nationals Park, so his run of four straight seasons of exactly forty homers will likely come to an end. Ryan Braun and Prince Fielder are The Brewers Young Duo That Needs A Nickname. They're 24-25 years old and Fielder's already logged a 50 home run season while Braun's getting there. Joey Votto. Lance Berkman. Adrian Gonzalez was just profiled by Marc Normandin on Baseball Prospectus using Hit Tracker data, and it's crazy to think what he'd be hitting if he were still in Texas. Manny Ramirez. Alfonso Soriano. Chris Young is my sleeper, and who knows what Justin Upton is capable of?

Runs Batted In

Ryan Howard is out in front of the RBI race, but we all know how team-dependent those are. Last year, Chase Utley made up 32 of Howard's 146 RBI, but if Utley is dinged up, his decline, coinciding with Howard’s decline, would severely impact Howard's RBI potential. PECOTA, in fact, shows Pujols driving in more runs than Howard.

Last year, Pujols batted third behind Aaron Miles and Skip Schumaker, who did well getting on base in front of him. Schumaker should bat leadoff this year, which is a plus, since he's OBPed around .360 the last couple of years and upped that to .370 last year when he was the leadoff man. Hopefully Ryan Ludwick bats second, which would give the Cardinals' top two batters higher OBPs than the Phillies' top two of Jimmy Rollins and Shane Victorino. Pujols batted third most of last year, but it looks like Tony La Russa will switch Pujols to cleanup and insert Rick Ankiel into the three hole. The trio of Schumaker, Ludwick, and Ankiel ought to set the table nicely for Pujols, at least better than did Miles, Schumaker, and Cesar Izturis, whom La Russa batted ninth most of last season in place of the pitcher.

Of note, Howard had fewer extra base hits than Pujols, despite all the homers. The lack of doubles is a large part of the reason why Howard is overrated. Howard had 146 RBI to Pujols’ 116. They both earned just over half their RBI on homers, but Howard was able to earn twice as many RBI on singles, while hitting thirty fewer singles. This suggests Howard had men in scoring position more often than Pujols did. Indeed, Howard had 50 more plate appearances with runners in scoring position. Perhaps that evens out this year.

Pujols has been getting intentionally walked more and more, and last year was given a free pass twice as often as Howard. That doesn't bode well for Pujols, considering all those walks come during RBI chances. Furthermore, Howard’s BABIP with RISP was .383 compared to an overall .285 BABIP. This is likely explained by the infield shift, as Rich Lederer noted last year. On the other hand, Pujols faced terrible luck in RBI situations, suffering a BABIP with RISP 50 points below his season total. Check out this graph from fangraphs, and first off notice the age. Ryan Howard is older than Albert Pujols! Again, I had no idea.


pujolshowardbabip.jpg


If Howard can't collect hits within the field of play, and continues his strikeout percentage trend, he'll simply be relying on his homers for RBI. I've already shown that that faucet of production might run drier for Howard than it has in previous years. Howard has a strikeout percentage three times that of Pujols, and when they swing, Howard swings and misses three times more often too. Howard's skills are in decline. I’m going to say there’s a chance for Pujols to out ribeye Howard.

Other RBI contenders: David Wright, Carlos Beltran, and Carlos Delgado. The top of the Mets' lineup is really dangerous. Lance Berkman. Manny Ramirez. Joey Votto. Aramis Ramirez. Braun and Fielder. Garrett Atkins. Andre Ethier is my sleeper. The top of the Dodgers lineup is awesome too, and Ethier slugged .510 last year. If Adrian Gonzalez were to get traded, he could compete, but the Padres aren't scoring many runs this year.

In my opinion, Pujols is the best hitter for average, best hitter for power, and best hitter at driving in runs in the National League. The problem is that the pieces around him have yet to fall perfectly into place. His park, his lineup, and other Triple Crown category contenders have not been kind to him. I won’t predict that Pujols wins the Triple Crown, if only for the fact that no matter how overwhelming a favorite is in any category, the field is generally a better play thanks to random variance. But if Pujols does pull it off, don't tell me I didn't warn you.

Touching BasesMarch 24, 2009
Fun With Hit Tracker: Home Runs Over Time
By Jeremy Greenhouse

All home runs are not created equal. Over the course of a six-month season, things are bound to change. Players wear down or maybe some heat up. In the past, we've been able to find player trends by analyzing first-half and second-half splits or maybe even game logs. But now with new data sources, we can try to find out how or why players produce different outcomes over a season. Are they lucky? Do their skills improve? Do they fatigue?

Josh Kalk used pitch f/x data to show how pitchers fatigue during starts and he unveiled wear pattern charts for specific pitchers to show how some fatigue over the course of a season.

Another great new data source that has not received the same attention as pitch f/x is Hit Tracker. Developed by Greg Rybarczyk, Hit Tracker tracks every physical aspect of the home run. So how did the distance of home runs vary over the course of the 2008 season?



"True Distance" measures how far the ball actually traveled, or how far the ball would have traveled had it landed uninterrupted. I know, if only we could project how far Mickey Mantle’s and Ted Williams' legendary shots traveled. Well, Hit Tracker can. Here and here you go.

The chart seems to show that home run distances trend upward until early August and then fall slightly. It also appears that we can say with confidence that over the course of a week, the mean home run distance will be right around 390-400 feet. The first data points on the chart are a bit wacky, since the March average was 399 feet per home run, but then the first three days of April averaged 390-foot homers per day. Hence, the five-day rolling average is somehow much lower than the same month's average. But the main observation is that from April until July, there is a rather distinct increase in home run distance, around five feet per dinger. So what causes the change? Perhaps players need some time to get into their groove, or perhaps the environment becomes progressively more conducive to home runs. But how do we measure that? Did I mention that Hit Tracker also records the two most important components a batter can control? It captures where and exactly how hard the ball is hit. With the upcoming advent of hit f/x, we might get this data for all types of batted balls. The launch angle is measured in horizontal and vertical degrees from the point of contact three feet above home plate, while the speed off bat is measured in miles per hour. I chose to use the speed off bat as a measure for the player’s skill over time. I believe that a hitter's objective when he is at bat is to hit the ball as hard as possible. Here are the results:



Well, that appears to directly contradict what we saw in the first chart. Players seem to start off hot in the opening weeks of the season, but then by late May the average speed off bat flattens out at around 104 MPH until playoff time when there is a pretty decent rise. It would make sense that the select few who are able to hit homers in October do so with more power than the average hitter.

If it’s not the hitter who controls the change in home runs, then it must be the hitter’s environment. Fortunately, Hit Tracker also records atmospheric effects such as temperature, wind, and altitude. Altitude should theoretically remain constant over time, as stadiums don't traditionally switch locations. But wind and temperature flow with the seasons. Since both factors can negatively impact the distance a ball travels, I plotted the absolute average impact as well as the actual average.



The impact due to temperature is defined as “the distance gained or lost due to the impact of the ambient temperature, in feet, as compared to a 'standard' temperature of 70 degrees." Temperature, along with the Speed Off Bat, appears to largely explain the opening chart, which showed the average true distance of home runs over the course of 2008. I’m not a physicist, but I figure a change of one mile per hour is equivalent to about 1.5 feet per second, and the average home run stays in the air around 3-5 seconds. So if the speed off bat decreases half a mile per hour in the early months, then the batter is responsible for about a three-foot decrease in distance. Yet the average true distance of home runs increases from about 395 feet in April to 398 feet in July. While the batter might cause at most a five-foot dip as the season progresses, the temperature effect appears to rise from a minimum average of -5 feet to a high of 5, which would explain the rise in distance. I’m also not a meteorologist, but the symmetry makes sense to me as the temperature rises in the spring, then peaks around the June 21 solstice, maintaining that point through the dog days of August, until the temperature declines going into the Fall Classic. Not exactly shocking results.
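The back-of-the-envelope conversion above can be checked in a few lines. This is only a sketch of that rough reasoning, not a physics model: it assumes the change in speed off bat holds for the entire flight and ignores drag and launch angle, so it likely overstates the real effect. The function name is made up for illustration.

```python
# Rough check of the speed-to-distance estimate in the text.
# Assumption: a change in speed off bat persists for the whole flight
# (no drag, no launch-angle effects), which is a simplification.

FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600


def distance_change_feet(speed_change_mph, hang_time_s):
    """Feet gained or lost from a speed change held over the whole flight."""
    feet_per_second = speed_change_mph * FEET_PER_MILE / SECONDS_PER_HOUR
    return feet_per_second * hang_time_s


# Half a mile per hour lost, over flights of 3, 4 and 5 seconds.
for hang_time in (3, 4, 5):
    change = distance_change_feet(-0.5, hang_time)
    print(f"{hang_time} s in the air: {change:+.1f} feet")
```

With a four-second flight the loss comes out just under three feet, which is where the "about a three-foot decrease" figure comes from.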

Putting it all together with the standard distance, which controls for atmospheric effects and simply measures how far the ball would have been hit in neutral conditions:


Looks pretty even throughout the season, with the exception that distance possibly curls up at the start and end points. This could all be attributed to small sample size, though the fact that better players make the playoffs may have something to do with it. But do better players also start out hot? I'll be sure to keep note of it over the next few weeks.

Here's a chart of three years' worth of data. Out of about 15,400 homers, Hit Tracker was missing data on fewer than 300 of them. The table should be read as the mean of each category, followed by standard deviation in parentheses.

Month     Amount True Distance Speed Off Bat Wind Effect Temp Effect Standard Distance                  
March     26     399.8 (25.3)  105.6 (5.7)   5.4 (17.5)  -4.0 (2.4)  396.7 (33.5)      
April     2214   395.6 (24.7)  106.1 (5.2)   1.7 (13.2)  -2.5 (4.3)  393.8 (27.1)   
May       2522   396.0 (25.3)  105.6 (5.2)   1.8 (11.7)  -0.2 (3.7)  392.1 (26.2)
June      2545   396.6 (25.5)  105.1 (4.9)   2.0 (10.6)   2.0 (3.4)  390.0 (25.8)
July      2446   397.9 (26.1)  105.3 (5.0)   2.3 (11.0)   3.3 (3.2)  390.4 (25.3)
August    2641   397.0 (24.8)  105.9 (5.0)   1.4 (9.7)    2.7 (3.6)  392.7 (26.6)
September 2508   398.0 (26.1)  105.9 (5.2)   1.5 (10.0)   0.9 (3.1)  392.7 (26.6) 
October   242    393.8 (24.8)  105.8 (5.1)   3.0 (10.6)  -2.0 (3.4)  391.2 (26.7)
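To boil the monthly rows down to one number for the whole sample, the monthly means have to be weighted by home run count, since March and October contribute only a few hundred homers between them. Here is a small sketch using the counts and true-distance means copied from the table; because those means are already rounded, the result is approximate.

```python
# Monthly home run counts and mean true distance (feet), copied from the table.
monthly = {
    "March": (26, 399.8),
    "April": (2214, 395.6),
    "May": (2522, 396.0),
    "June": (2545, 396.6),
    "July": (2446, 397.9),
    "August": (2641, 397.0),
    "September": (2508, 398.0),
    "October": (242, 393.8),
}

total_homers = sum(count for count, _ in monthly.values())

# Weight each month's mean by how many home runs went into it.
weighted_mean = (
    sum(count * distance for count, distance in monthly.values()) / total_homers
)

# For contrast: treating every month as equally important.
simple_mean = sum(distance for _, distance in monthly.values()) / len(monthly)

print(f"home runs with data: {total_homers}")
print(f"count-weighted mean true distance: {weighted_mean:.1f} ft")
print(f"unweighted mean of monthly means:  {simple_mean:.1f} ft")
```

The same weighting would apply to any other column in the table.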
***

I wanted to do a mini-case study applying changes in home runs over time, and the clear choice for any such study is Ryan Howard. He gives us a nice sample to work with and such a large part of his value is built on home runs. He’s been on a clear decline since his age 26 season, so we can see whether there have been changes in his home runs year by year. Plus, if you look at his day-by-day graph on fangraphs, he’s been a rather remarkable second-half hitter.

howard.jpg

Over his career, he's held a 168-point difference in OPS between the first and second halves of the season. I'm not predicting that he'll continue the trend this year; I'm just pointing out that the trend has existed.

Howard also intrigues me since I believe he might be the best opposite field power hitter of all time. But that’s a subject I’ll tackle another time, hopefully. Again I decided to forgo the launch angles and stick to the effects of speed off bat, temperature, wind, and distance. Presented without much commentary:



It's evident that he's been hitting the ball with less force in recent years. I also like that you can easily see trend lines with positive slopes each year, confirming him as a late bloomer.



Wind impact appears to be random, but you can almost make out those parabolic curves in temperature impact. A power hitter might be prone to mid-summer surges thanks to those extra five to ten feet in fly ball distance from less dense air.



Not much notable. He averaged 403 feet in 2006, 406 feet in 2007, and 398 feet in 2008. Here's his table. It includes his home runs from 2006-2008 but is missing three from 2007. It should be read category mean followed by standard deviation.

Month     Amount True Distance Speed Off Bat Wind Effect Temp Effect Standard Distance                  
April        13   414.2 (27.3)  109.1 (5.1)   5.3 (12.5)  -3.0 (4.3)   408.3 (28.9)   
May          29   394.9 (29.1)  105.7 (5.5)   3.4 (9.7)    0.5 (4.4)   390.1 (33.0)
June         24   410.7 (32.9)  105.1 (5.2)   3.2 (11.4)   3.3 (2.8)   402.5 (30.9)
July         28   398.1 (27.7)  107.7 (6.0)   2.8 (7.9)    3.0 (2.9)   391.2 (29.0)
August       26   404.8 (30.8)  108.0 (6.9)  -3.6 (13.2)   4.3 (3.0)   403.2 (35.0)
September    30   400.2 (22.9)  106.4 (4.6)   1.0 (6.9)    2.1 (2.8)   396.8 (24.8) 
October      4    390.5 (25.0)  104.5 (6.5)   3.7 (4.5)   -2.5 (6.3)   389.3 (32.2)

All data was obtained from Hittrackeronline.com. Interested parties may contact webmaster@hittrackeronline.com

Touching BasesMarch 18, 2009
The UZR Era
By Jeremy Greenhouse
"The interesting question is why defense is so much more difficult to quantify than offense in all sports. Perhaps defense by its nature involves more interaction between individuals than individual actions, and perhaps the way to get past that is to embrace the concept and measure combinations of players." -- Bill James


Over the offseason, fangraphs unveiled Ultimate Zone Rating, a defensive metric developed by Mitchel Lichtman that measures how efficient a fielder is at turning balls in his area of responsibility into outs. The data, tracked by Baseball Info Solutions, ranges back to 2002 and is converted neatly into a runs saved figure. I’d like to give an overview of some notable teams and players throughout the years UZR has been available. As defense is a team effort, here’s a visual representation of how each team’s outfield has performed during the UZR era. The best outfield defenses, those that convert balls in play into outs at a high rate and limit advancement of baserunners with their arms, will be in the top right, while the worst will be in the bottom left.



ARM rating is uncorrelated with an outfielder's range, though the measures are not independent, since the amount of time it takes a fielder to reach the ball affects how he is able to hold baserunners. The value of outfield arms, which is usually not mentioned when evaluating team defense, can add or subtract 20 runs a year, so it’s definitely significant. However, to find an outfield’s true talent when it comes to arms, any figures would probably have to be heavily regressed.

The 2004-2007 Braves consistently had the best outfield in the Majors. With Andruw Jones patrolling center, the Braves were set at the second most influential defensive position on the diamond when it comes to fielding.* Jones was flanked in left by the likes of Ryan Langerhans, Matt Diaz, and Willie Harris, who all had great range. And in right, the Braves trotted out stalwart Jeff Francoeur and his rocket arm. Meanwhile, the Yankees from 2002-2006 consistently fielded the worst outfield in the Majors.

*The traditional defensive spectrum is well-known, but for reference: shortstops and center fielders are expected to make just over 2.5 outs per nine by UZR, followed by second basemen. Right fielders and third basemen come in at two expected outs per nine and left fielders a bit less. The fact that right fielders are expected to make more outs than left fielders goes against traditional baseball knowledge, which I believe states that fielders with more range should play left. Batters tend to hit more fly balls to the opposite field than to the pull field, and righties bat more than lefties, so this makes sense. Perhaps if there's a defensive whiz in right, say Ichiro Suzuki or Jayson Werth, they should switch fields if at the same time there's an albatross in left, say Raul Ibanez, depending on batter handedness and spray-chart information. Finally, first basemen come in at about one expected out per nine, though that of course does not account for throws first basemen handle.*

The Nationals/Expos franchise has put up the best ARM rating in the UZR era. In each of the final three years of their existence, the Expos' outfield led the league in ARM thanks to Vladimir Guerrero, Juan Rivera, Endy Chavez, and Brad Wilkerson. Of course, only Chavez had any range, so their defense as a whole trended around average. The collective outfield arms of the 2003 Detroit Tigers, the worst team ever (?), cost the team 20 runs, one of the worst marks on record. However, that number doesn’t really stand out among that team’s .300 on-base percentage, and 1.37 strikeout-to-walk ratio. What's the opposite of nitpicking?

The Rays' worst-to-first success has been fairly well documented. Their biggest improvement may have been their outfield defense, which saved nearly 70 runs more in 2008 than it did in 2007, the largest improvement by any outfield in the UZR era. B.J. Upton and Carl Crawford's numbers skyrocketed while Eric Hinske and Gabe Gross were great replacements for Delmon Young and Jonny Gomes. Considering the left and center fielders remained constant for the Rays both years, I wonder to what extent the difference can be attributed to individual improvements from Upton and Crawford, and how much of the success was thanks to the unit meshing together in terms of positioning. The Rays went on to the World Series, where they met the Phillies, who incidentally posted the exact same 74.3 team UZR. The Phillies were aided by their ARM rating of 22.1, the highest single-season mark to date. Pat Burrell was the only bad defender on the team, but his arm almost made up for what he lacked in range, while Shane Victorino and Jayson Werth are stellar all-around players.

Now let's take a look at the infield.



Though the Rays improved their infield by 50 runs in UZR from 2007-2008, the second biggest year-to-year leap by an infield, they trailed well behind the 2006 Kansas City Royals. In 2005, the Royals infield was 47 runs below average. In 2006, they were 32 runs above average. Nevertheless, the pitching staff still allowed more runs in '06 than in '05! In 2006, KC’s 5.29 FIP was the highest single-season mark of any club since the ridiculous run environments seen in 2000. But their defense did make a marked improvement. The Royals saved over 110 total runs on defense in 2006 compared to 2005, thanks to the additions of Mark Grudzielanek, Doug Mientkiewicz, and Reggie Sanders. The following year, 2007, the Royals' 78.5 UZR was the highest single-season total for any team since 2002. Mark Teahen was terrible in 2005, but he found his footing on both ends of the field in 2006, posting an average UZR and an .874 OPS. Then in 2007, he put together another solid year, losing production with the bat but gaining ground with the glove in his move to right field. Unfortunately, it all fell apart for him last year, and now we’ll see how he does at second base. Also in 2007, Tony Pena actually merited playing time, finishing second in UZR for shortstops behind only Omar Vizquel.

Remember that All-Star studded Rangers infield of Hank Blalock, Michael Young, Alfonso Soriano, and Mark Teixeira? It turns out they gave away a whole lot of their value on defense. In 2005, the Rangers infield had a UZR of -62.4, the worst ever. The 2007 Giants had the best infield on record. In the same vein, the Athletics infield last year had the highest double play run total, though it's a matter of only a dozen or so runs. Lastly, the Phillies have had the best infield defense in the last seven years, while the Rangers and Yankees have been worst.

The 2008 Phillies infield defense has been the topic of some discussion. Ryan Howard was so bad that the entire defense shifted to cover him, maximizing the range of Chase Utley, Jimmy Rollins, and Pedro Feliz. The Phils' infield saved 40 runs last year, an excellent figure, no matter how you slice it. To actually isolate Utley from Howard, it would probably be best to use a "With or Without You" analysis, comparing Utley's performance with Howard on the field against his performance with other first basemen, though the sample would be impossibly small.

I am forever on a quest to find why teams or players are "clutch," and outperform their expectations in high-leverage situations. I constantly correlate variables with fangraphs' clutch score, and I have so far found very weak correlations with strikeout rate and baserunning on offense, meaning teams that run the bases well and rarely strike out for some reason do better in more important situations. Now, with fielding, I found a weak correlation between clutch and double play runs. I suspect some teams are adept at employing relievers who specialize in inducing groundballs at opportune times, and therefore leverage their double play runs. It's also possible that some teams are able to effectively manage the intentional walk to their advantage late in games, setting up the double play.

I think splitting up defenses into infield and outfield units is a comprehensive method for evaluating team defenses, but it's often more interesting to look at individual players, so I'll leave you with the all-time leaders and laggards in UZR for all seasons from 2002-2008.


UZRleaders.jpg


Andruw Jones has by far the highest career UZR. By Sean Smith’s Wins Above Replacement leaderboard, Jones is 77th in WAR, making him a borderline Hall of Fame candidate. Any sort of resurgence would make him a near lock, but it’s currently looking bleak as is.

Arms are an area of study that has belonged to John Walsh, but UZR's ARM metric shows similar results, and confirms many players' reputations. Alex Rios has paced the league in ARM runs, while Ichiro and Francoeur trail slightly. In 2007, Francoeur's arm was the most valuable of any outfielder during the UZR era. On the other end, Juan Pierre's arm has been laughably bad, coming in nearly 20 runs worse than anyone else's over the years.

Jack Wilson was slickest at turning the double play in the UZR era, and he certainly does make it look pretty, if I do say so myself.

Finally, the Yankees. Bernie Williams and Hideki Matsui show up on the bottom ten list, and Derek Jeter, Gary Sheffield, Bobby Abreu, Jason Giambi and Johnny Damon also show up in the bottom 10th percentile, so yeah, the Yankees haven't valued defense highly.

Touching BasesMarch 10, 2009
Baserunning and Leverage
By Jeremy Greenhouse

Let’s set the scene.

2004 ALCS. Yankees vs. Red Sox. Game 4. Red Sox down a run, Dave Roberts on first, ninth inning, no outs.

Dave Roberts advanced on a stolen base to 2B.

2007 National League one-game playoff. Padres vs. Rockies. 13th inning tie game, Matt Holliday on third, no outs.

Jamey Carroll hit a sacrifice fly to right (Liner). Matt Holliday scored.

That’s how it looks in the box score, but those two baserunning plays might be the two most momentous swings in baseball over the last five years.

Baserunning statistics are rarely looked at, yet the difference between the best and worst individual baserunners is about 20 runs, or two wins. Pretty significant. Players like Holliday, Carlos Beltran and Ichiro Suzuki, and other efficient baserunners become underrated when this skill isn't accounted for. So is baserunning an underrated commodity in the grand scheme of things?

There are several advanced metrics for baserunning, but my choice for this analysis is Bill James Online’s “net gain,” which takes into account “basestealing, avoidance of the double play, and success at taking the extra base while avoiding being thrown out.” I tend to think of four bases as equivalent to about one run, though I could be off base there. Here's the relationship between runs scored and net bases. Each dot represents a team's single season total over the time span 2002-2008.



The r-squared between runs and net bases is .17, so it’s pretty clear that the least important part out of the four facets of the game—hitting, pitching, baserunning, and defense—is baserunning. The difference between the best and worst baserunning teams in the majors is around 50 runs. That can be compared to 125 run swings in fielding, and between 200-300 run differences in pitching and hitting, depending on the year.
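For anyone who wants to reproduce this kind of figure, here is a minimal sketch of how an r-squared falls out of a straight-line fit of runs on net bases. The team seasons below are invented purely for illustration and are not the Bill James Online data; the function names are my own.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys):
    """Share of the variance in ys explained by a straight-line fit on xs."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical team seasons: net bases gained and runs scored. Not real data.
net_bases = [-60, -35, -10, 0, 15, 40, 70, 95]
runs_scored = [735, 690, 802, 760, 715, 820, 748, 791]

slope, intercept = fit_line(net_bases, runs_scored)
print(f"runs per net base: {slope:.2f}")
print(f"r-squared: {r_squared(net_bases, runs_scored):.2f}")
```

An r-squared of .17 on the real data would mean a line through the scatter accounts for about a sixth of the team-to-team variation in runs.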

As demonstrated by "The Steal" and "The Sac Fly," mentioned at the beginning of this article, baserunning can at times be the make or break factor in any given game. Tom Tango developed, and statistically quantified, the concept of a leverage index to provide context to any game state. Baserunning, defense, hitting, and pitching can all be leveraged, be it through pinch-runners, pinch-hitters, defensive substitutions, or relief pitchers. I’d like to look at whether good baserunning teams also perform better in high-leverage situations. So, using one of my favorite statistics in fangraphs' “clutch” score and one of my favorite types of visual presentations in google’s motion chart, I compared a team’s baserunning to its ability to come through when it matters most. Here's a year-by-year graphic of all 14 American League teams' baserunning metrics plotted against their clutch score.



And now the National League:



The correlation coefficient between net baserunning and clutch score is .12, which isn’t significant, but it’s not zero. Furthermore, going from first to third or scoring from first has a bit of a stronger correlation than avoiding the double play and stealing bases. Strikeout percentage has an inverse relationship of similar strength to baserunning, so there are a couple variables that might weakly relate to how well teams can come through when it matters most.
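Whether a correlation of .12 counts as significant depends on the sample size. A quick sketch of the usual t-test for a correlation coefficient follows, assuming the sample is the 30 teams over the seven seasons from 2002-2008 (210 team seasons); that count is my assumption and is not stated with the correlation itself.

```python
import math


def t_statistic(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Assumption: 30 teams x 7 seasons (2002-2008) = 210 team seasons.
n_team_seasons = 30 * 7
t = t_statistic(0.12, n_team_seasons)

print(f"t = {t:.2f} with {n_team_seasons - 2} degrees of freedom")

# With about 200 degrees of freedom, |t| has to exceed roughly 1.97
# to be significant at the two-sided 5% level.
print("significant at 5%:", abs(t) > 1.97)
```

Under that assumption the statistic lands short of the cutoff, which squares with "isn’t significant, but it’s not zero."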

The average American League team is seven bases a year better than National League teams. I still don’t know what a National League style of play means other than inferior baseball. The Phillies have been the best baserunning team over the time frame, but they have been rather unclutch. The Angels rank sixth in baserunning, right behind the Yankees ironically enough, and the Halos have been twice as clutch as any team in the time period. Meanwhile, the Ozzieball White Sox and Bowdenball Nationals lagged in baserunning, while they put up neutral clutch scores.

How about a leaderboard of the most and least clutch teams since 2002?

clutch%20baserun.jpg

I find the bottom five teams on this list interesting. Well, the Tigers' .265 winning percentage is interesting too. But the Astros, Cubs, Indians, and Giants were all quality teams that won in spite of bad luck, unlike the Angels and Red Sox at the top who won because of it. Anyway, it looks like the clutch teams are better baserunners, but barely.

People sometimes try to explain the difference between a team’s Pythagorean winning percentage and its true winning percentage by the strength of that team's bullpen, baserunning, and "smallball" in general. But however a team creates or prevents runs, it is accounted for in the Pythagorean record. Then again, in many situations these aspects of the game are leveraged. So I decided to look at two things: the difference between a team’s winning percentage and its Pythagorean winning percentage, and its winning percentage in one-run games. The results indicated that overall baserunning can’t explain how a team fares in close games at all, despite Dusty Baker's claim that "you gotta have some speed to win close ball games."
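For reference, here is a short sketch of the Pythagorean comparison described above. It uses the classic exponent of 2; other exponents are in common use, and the one behind the numbers in this article is not stated, so treat the default as an assumption. The example team is invented.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def wins_above_pythagorean(wins, losses, runs_scored, runs_allowed):
    """Actual winning percentage minus Pythagorean winning percentage."""
    actual = wins / (wins + losses)
    return actual - pythagorean_win_pct(runs_scored, runs_allowed)


# Hypothetical team: 90-72 with 780 runs scored and 740 allowed.
expected = pythagorean_win_pct(780, 740)
gap = wins_above_pythagorean(90, 72, 780, 740)

print(f"Pythagorean winning percentage: {expected:.3f}")
print(f"actual minus Pythagorean: {gap:+.3f}")
```

The gap is the quantity one would then try to correlate with baserunning.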

I attempted to break the data down further by looking at pinch-runners and performance in different situations, but unfortunately the only data readily available were stolen base and caught stealing scores.

  • The Athletics last year had the two most steals from substitute players of any team since 2002, thanks to pinch-runner extraordinaire Rajai Davis. Davis had 42 plate appearances as a sub, picking up 11 singles and one walk, but he pinch-ran so often that he had more stolen base attempts than times he reached first base. Oddly, Davis was a better hitter than basestealer as a sub on the A’s, as he hit to the tune of .341/.357/.561, while he was successful in just 11 of 16 theft attempts. It didn’t really matter for the A’s, who showed unremarkable splits in clutch situations. However, I wouldn't dismiss the idea of keeping a 25th man on the roster as a specialist pinch runner.
  • The Phillies, the best baserunning team in the league each of the last two years, have topped the league in contributions from substitutions on the bases as well. Their sub-baserunners have put up 28 steals compared to a single caught stealing, while in the ninth inning the entire team has recorded 31 steals to one caught. But again, it seemingly makes no difference in the team’s record in tight games.
  • The incredibly unclutch Indians of 2005 were 3 for 11 stealing bases in situations with a leverage index above 1.5, and it probably did take them out of a game or two.

The sample sizes in these situations are small, so it’s hard to make conclusions using this data. But I think that the small sample size is a decent conclusion. While baserunning might be under-appreciated in today's game in a macro sense, it might be over-valued in explaining how an individual game is won and lost. Teams can leverage their baserunning to add a few runs over the course of a season, if that. Teams hold constant true-talent levels for baserunning, and it doesn't appear that the better clubs are able to achieve greater success by leveraging the ability at opportune times. Over 162 games, the difference between a team's offensive performance in high-leverage situations relative to their normal run production levels can't be explained by their baserunning.

Touching BasesMarch 03, 2009
To What Extent Do Batters Control Pitches?
By Jeremy Greenhouse

Ninety percent of the game is half mental, and that Yogiism is most apparent when it comes to the pitcher vs. batter matchup. Every at-bat has a story. Every pitcher has a repertoire of pitches from which to choose and he will use context and game theory when making his decisions. But perhaps the most important factor in determining pitch selection is the type of batter at the plate. So do batters control the type of pitches they see?

Dave Cameron recently got the ball rolling when he noted that the percentage of fastballs a batter sees is inversely tied to his isolated power. The relationship makes intuitive sense, and the correlation coefficient of -.59 suggests that power is one of the most important determinants in how often a pitcher will challenge someone with a fastball. I decided to test out a whole lot more correlations to see what affects what. To better understand correlations and regressions in baseball, I’d suggest reading this article by John Beamer. The main points: the correlation coefficient is “a statistic representing how closely two variables co-vary; it can vary from -1 (perfect negative correlation) through 0 (no correlation) to 1.” Also, correlation does not imply causation. There will be a significant amount of interaction between the variables. For example, a batter who swings quite often will receive plenty of breaking balls, as those pitches are harder to make contact with. The flip side is that a batter may only swing so much because he sees a lot of curves and can't lay off them.

First, let's take a look at who saw the most fastballs, breaking balls, and off-speed pitches in any season over the last four years.

pitchtype.jpg


It looks like hitters with no power saw the most fastballs, free-swinging power hitters saw the most breaking balls, and I don't see any rhyme or reason to the list of batters who saw a lot of changeups and splitters.

My first test was to run a correlation between ISO and fastball percentage over four years, using my sample of about 1700 batters. The correlation coefficient was -.45. My initial guess was that because my sample had a lower plate appearance minimum, batters with little reputation were being pitched differently than those on whom the pitcher had the book. I was proven wrong: setting the plate appearance minimum at 100, 300, or 500 resulted in a correlation coefficient of -.45 each time. The lower correlation in my data was consistent with most of my results, as running the same statistical tests that Dave Appelman ran on plate discipline stats also resulted in smaller coefficients.
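The threshold check described here is easy to sketch: filter the batter seasons by a plate appearance minimum, then recompute the correlation for each cutoff. The rows below are made up for illustration and are not the fangraphs data, so the coefficients they produce mean nothing beyond showing the mechanics.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical batter seasons: (plate appearances, ISO, fastball %).
seasons = [
    (120, 0.080, 66.0),
    (150, 0.210, 57.5),
    (210, 0.140, 62.0),
    (320, 0.095, 64.5),
    (380, 0.250, 55.0),
    (450, 0.170, 60.5),
    (520, 0.110, 63.0),
    (610, 0.230, 58.0),
    (680, 0.300, 54.0),
]

results = {}
for minimum_pa in (100, 300, 500):
    # Keep only the seasons that clear the plate appearance cutoff.
    kept = [(iso, fb) for pa, iso, fb in seasons if pa >= minimum_pa]
    isos = [iso for iso, _ in kept]
    fastball_pcts = [fb for _, fb in kept]
    results[minimum_pa] = pearson(isos, fastball_pcts)
    print(f"PA >= {minimum_pa}: n = {len(kept)}, r = {results[minimum_pa]:.2f}")
```

If the coefficient barely moves as the cutoff rises, as it did with the real sample, low-playing-time batters are not what is dragging the correlation down.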

Correlating fastball percentage with other traditional statistics confirms a lot of conventional baseball wisdom. The more a batter strikes out, the fewer fastballs and the more breaking balls he receives. There is also a positive relationship between strikeout percentage and fastball velocity. Unfortunately, no pitch type information correlates with batting average on balls in play. I had hoped that pitch type might be a factor in improving BABIP prediction models, but I guess not.

However, certain batted ball statistics do co-vary with pitch type. The stronger a batter’s pull tendency or fly ball tendency, the fewer fastballs he will likely receive over a year. Conversely, groundball hitters face a much higher percentage of fastballs. These types of hit trajectories and vectors are closely intertwined with power output, so this just further shows that pitchers tend to throw more fastballs to hitters who can’t do significant damage to them. This fear factor again comes through in testing how a pitcher will approach the zone against power hitters. There is a positive correlation between the number of wild pitches and passed balls and a batter's power based on stats like home runs per fly ball or ISO.

Plate discipline stats align quite well with pitch type stats. Showing a willingness to swing at pitches results in fewer fastballs, but making contact results in, or is the cause of, many fastballs. Moreover, free swingers face a higher fastball velocity than patient hitters, and contact hitters face a lower fastball velocity than power hitters. So when pitchers do challenge a scary hitter with a fastball, it appears that they dial it up. Or perhaps, only pitchers who can bring the heat will go after power hitters, while those with subpar fastballs simply avoid throwing fastballs altogether in those situations. And is there anything more frustrating than watching a batter swing at a slider in the dirt? There is a correlation between a batter's slider percentage and his swing percentage on pitches outside the strike zone, but the relationship only holds strong for batters who have established reputations in the league as hackers.

slidervsoswing.jpg

Notice the much lower coefficient of determination for players with between 100 and 150 plate appearances. There is a wider range of talent in this pool of players, but the spread in fastball percentage is also greater, suggesting a pitcher's choices are more random when they have less information on a batter.

Without expecting to find much, I tested the relationships between win probability statistics and pitch types. Though the results were statistically insignificant, they all made sense. Batters who have higher leverage indexes over the course of a year tend to see fewer fastballs and curveballs, but more changeups and sliders. Furthermore, batters who come up with more on the line face increased velocity from each type of pitch. Then I looked at one of my favorite statistics, the clutch score, a measurement of how much better or worse a player does in high leverage situations than he would have done in a context neutral environment. Nothing significant or interesting came up with regard to pitch type, but I like the idea of clutchiness so much that I correlated it with other variables. As reported in Tango's clutch project, fans prefer batters who can put the bat on the ball. Batters who hit for power and strike out a lot do indeed perform slightly worse in the clutch, while those more adept at making contact perform slightly better.

Unfortunately, I didn’t account for any type of platoon situation, which is of course one of the more important factors in determining pitch type. Same-handed batter-pitcher matchups see more breaking pitches, while opposite-handed matchups see more off-speed pitches such as changeups and splitters. Running a basic test to see how well this theory holds up, I coded lefties as 0 and righties as 1 and correlated handedness with pitch type. The percentage of sliders seen returned a correlation coefficient of .65, which confirms our suspicions. As righties see many more same-handed pitchers, they get a higher percentage of breaking pitches moving away from them. So even though lefties don't show up when searching for the leaders in slider percentage, that's just because they face a disproportionate number of opposite-handed pitchers.
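The handedness test can be sketched in a few lines. Correlating a 0/1 variable with a continuous one is just an ordinary Pearson correlation (sometimes called a point-biserial correlation). The batters and slider percentages below are hypothetical placeholders, not the sample used here, so the printed value will not match the .65 reported above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical batters: (bats, slider percentage seen). Made-up values.
batters = [("L", 10.5), ("L", 12.0), ("L", 11.2),
           ("R", 17.8), ("R", 15.1), ("R", 19.4)]

# Code lefties as 0 and righties as 1, as described in the text.
handedness = [0 if bats == "L" else 1 for bats, _ in batters]
slider_pct = [pct for _, pct in batters]

r = pearson_r(handedness, slider_pct)
print(f"r = {r:.2f}")
```

A positive r says righties see more sliders; swapping in changeup percentage would be expected to flip the sign.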

Ryan Howard has never been able to hit left-handed pitchers (a 300-point difference in OPS in his career), and as such, he has received the highest percentage of sliders of any lefty with 200 plate appearances in each of the last two years, but that still doesn’t place him in the top 25 in either year. On the other side of the spectrum, the correlation between changeups and handedness was -.54. Lefties face opposite-handed pitchers much more often than same-handed ones, and therefore see the changeup much more often than righties do. Going a step further, we see that righties receive faster sliders and lefties get faster changeups because right-handed pitchers throw harder than lefties in general. Righties are also more likely than lefties to see pitches in the strike zone.

Lastly, park factors were not accounted for, though they play a large role in determining why pitchers throw certain pitches. As Josh Kalk showed, pitchers are much more likely to throw their fastball/sinker (which FanGraphs classifies as the same pitch) in Coors than in other parks. Matt Holliday, who is much more of a power hitter than a contact hitter, would normally receive few fastballs, but when he plays in Coors, a pitcher’s best option is to bring the heat, as any kind of breaking ball in the thin air might get crushed. Therefore, Holliday has received a well above average number of fastballs in his career, and it'll be interesting to see whether his hitting approach changes as his pitch type profile changes.

Plugging a bunch of these variables into a multiple regression for fastball percentage yields an r-squared of .5, meaning that half the variance in how often a batter is thrown a fastball can be explained by the hitter's contact skills, power, and plate discipline. So what I'm interested in is what the rest of the variance can be attributed to. Game state and randomness will certainly affect a pitcher's decision on what he will throw. And pitchers will often simply disregard the batter’s reputation, pitching their own game based on their own strengths. The last possibility is that pitchers are actually using more advanced data in their decisions. You can observe a lot by watching, and if pitchers study batter film or actually learn batter tendencies with the advent of PITCHf/x data, it could change the art of the batter vs. pitcher matchup from what it was in Yogi's days.
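To make the r-squared figure concrete, here is a rough sketch of a multiple regression fit by ordinary least squares in plain Python. The three predictors (contact rate, isolated power, out-of-zone swing rate) are stand-ins I picked to represent contact skills, power, and plate discipline; they are an assumption, not necessarily the variables in the regression above, and every number is invented.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols_r_squared(predictors, target):
    """Fit target ~ intercept + predictors; return (coefficients, R^2)."""
    rows = [[1.0] + list(p) for p in predictors]  # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    coefs = solve_linear_system(xtx, xty)  # normal equations
    fitted = [sum(c * v for c, v in zip(coefs, r)) for r in rows]
    mean_y = sum(target) / len(target)
    ss_res = sum((y - f) ** 2 for y, f in zip(target, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in target)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the target is constant")
    return coefs, 1.0 - ss_res / ss_tot


# Hypothetical hitters: (contact %, isolated power, out-of-zone swing %).
hitters = [(88.0, 0.09, 22.0), (85.5, 0.14, 30.5), (80.2, 0.16, 25.0),
           (76.0, 0.25, 28.0), (72.5, 0.27, 33.0), (90.1, 0.10, 19.5),
           (78.3, 0.23, 35.0), (83.0, 0.13, 21.0)]
fastball_pct = [64.0, 61.5, 59.0, 55.5, 52.0, 66.0, 54.0, 61.0]  # made up

coefs, r2 = ols_r_squared(hitters, fastball_pct)
print(f"R^2 = {r2:.2f}")
```

R-squared here is one minus the ratio of unexplained to total variance, so a value of .5 leaves half of the variation for game state, pitcher preference, and noise.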


Touching Bases
February 24, 2009
Batted Ball Location Leaderboards
By Jeremy Greenhouse

With apologies to Dave Studeman, whose batted ball leaderboards on The Hardball Times are always must-reads, I decided to try a similar data presentation, breaking up batted ball stats by field of play instead of by type. Using linear weight run values, I developed lists showing who the most productive players were in 2008 when pulling the ball, taking the ball back up the middle, or going to the opposite field.


Value of Pulled Batted Balls

pulled.jpg

Every one of the top ten players when it came to pulling the ball happened to bat right-handed, which can be explained by their relative advantage when hitting ground balls. Righties who pull grounders force longer throws than lefties who pull grounders. These players are mainly fly ball hitters. In the case of switch-hitters like Chone Figgins, I combined their pulled/center/opposite field stats from each side of the plate, so right-handed balls to left are added to left-handed balls to right to come up with pulled batted balls.
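The switch-hitter bookkeeping described above boils down to classifying each batted ball relative to the side of the plate the batter was hitting from at the time, then summing run values. Here is a minimal sketch; the record layout, function names, and run values are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


def field_relative_to_batter(bat_side, field):
    """Map an absolute field ('left', 'center', 'right') to a direction.

    bat_side is the side the batter hit from on that play ('L' or 'R'),
    which is what lets a switch-hitter's two sides combine naturally.
    """
    if field == "center":
        return "center"
    pull_field = "left" if bat_side == "R" else "right"
    return "pulled" if field == pull_field else "opposite"


def run_value_by_direction(batted_balls):
    """Sum linear weight run values per batter and direction."""
    totals = defaultdict(lambda: defaultdict(float))
    for name, bat_side, field, run_value in batted_balls:
        direction = field_relative_to_batter(bat_side, field)
        totals[name][direction] += run_value
    return totals


# Hypothetical batted balls: (batter, side batted from, field, run value).
batted_balls = [
    ("Switch Hitter", "R", "left", 0.50),    # pulled while batting righty
    ("Switch Hitter", "L", "right", -0.25),  # pulled while batting lefty
    ("Switch Hitter", "L", "left", 0.75),    # opposite field
    ("Righty", "R", "center", -0.25),
]

totals = run_value_by_direction(batted_balls)
for name, by_direction in totals.items():
    print(name, dict(by_direction))
```

Keying on the side batted from per play, rather than a single handedness per player, is what folds a switch-hitter's balls to left as a righty and balls to right as a lefty into one pulled total.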

Jorge Cantu and Dan Uggla: who would’ve thunk? Uggla is a former Rule 5 pick and Cantu spent time last year in two different minor league systems before both found their rightful spots on the Marlins. I’d have to attribute their appearance on the leaderboard to coincidence. Dustin Pedroia and Kevin Youkilis, on the other hand, are given a bit of an extra push, as both are clearly aided by the Green Monster. Pedroia might be the player most perfectly suited to Fenway. Just check out his home run chart. He has yet to hit a 400-foot home run in his career. You have to wonder whether he’d be the MVP outside of that park, as it would certainly be a challenge to find a voter who checks park-adjusted stats.

I don’t think I ever expected to see the universally beloved Joe Mauer on the bottom of any list, but he gets murdered by pulled groundballs. Only three of his nine long balls went to right field in 2008, as he unfortunately never developed the 20-home-run power people were hoping for. Chone Figgins, Ryan Theriot, and Cesar Izturis all had one home run apiece last year, while Castillo tallied six, so it appears that a minimal amount of power is necessary to be successful pulling the ball.


Value of Center Field Batted Balls

center.jpg

It’s interesting that the top six players on this list bat right-handed. But the bottom four players do too, which suggests that the run of righties is random. It’s tough to choose between the hitters best at pulling the ball and those best at going up the middle, but I’m siding with the latter set of players. I’d classify the first set of hitters more as home run hitters and the second set as line drive types. Pedroia appears at the bottom of this list, likely because in Fenway he doesn’t derive the same benefit from his fly balls to center as he does from those hit toward the left-field wall. He picked up just three hits on 73 center-field flies.


Value of Opposite Field Batted Balls

opposite.jpg

What Mauer lacks in pulled balls he makes up for with his approach going the other way, as he is the only catcher to appear on a leaderboard. Nick Markakis, Matt Kemp, and Manny Ramirez all show up as top center and opposite field hitters. These guys are at times described as "pure" hitters, and this is why. I'd presume each one is quite talented at going with the pitch.

Without trying to sound hyperbolic, I have to ask: is Ryan Howard the greatest opposite-field power hitter ever? His 2006 and 2008 seasons, in which he crushed 25 and 20 opposite-field blasts, respectively, are the only years in the last four in which any player has hit more than even 15 homers to his weak side. Howard does have his opposite-field numbers skewed by his groundball run value, which is likely only positive due to the vacated side of the infield.

Analyzing Howard’s trends piqued my interest in a specific batted ball type and location: pulled groundballs. There were seven players who cost their team 20 runs on pulled grounders: Mark Teahen, Prince Fielder, Adrian Gonzalez, Ryan Howard, Jimmy Rollins, Casey Kotchman, and Carlos Delgado. Several of these players do indeed receive the defensive shift, but I immediately noticed one of these names is not like the others. Jimmy Rollins sticks out like a sore thumb. He’s a switch-hitter, and as such is the only non-lefty to appear on the list. He is far and away the fastest player in the group and an absolutely awesome baserunner, but he apparently wasn’t able to make the most of his speed last year when he put the ball in play, compiling a well below league average 19% hit rate on grounders and legging out a single bunt hit in seven attempts.

Adrian Gonzalez might actually have power that approaches Howard’s, but we’ll never know until he gets out of Petco. At the other end, one thing’s for certain: pitchers need to find ways to prevent Cantu from pulling the ball. Here's what the spray chart for Cantu, perhaps the best pull hitter and worst opposite field hitter in the game, looks like.

cantu.jpg