Quantifying Coachers, Part II
"The main quality a great third base coach must have is a fast runner." - Rocky Bridges, California Angels coach
"It's frustrating. Your job is not to get in the way of a rally." - Rich Donnelly, Dodgers third base coach after Game 1 of the 2006 NLDS
After the game, Dodgers third base coach Rich Donnelly noted that he didn't want to send Kent but saw that, with Drew close behind, he'd likely end up with two runners on third, and at that point he was hoping for a botched throw. And for some reason, perhaps their proximity or his attention being focused on the lead runner, Donnelly did not or could not give the stop sign to Drew.
As you'll recall, in part I we laid the groundwork for measuring the contribution of third base coaches (or "coachers," as they were originally termed in the 1870s) in the dimension of waving runners around. This time we'll revisit that framework to make an adjustment for team quality and then attempt to answer the question of whether there is a repeatable skill involved in this aspect of the game.

Contextual Matters?

We left off with the question of whether it is really fair to assign all of a team's baserunning (even the subset of plays discussed in part I) to the third base coach's influence. Keep in mind that failing to advance as frequently as the average runner in various situations, as well as getting thrown out, will both depress EqHAR, with the latter being much more costly than the former. Even so, it could be the case that Joey Cora of the White Sox was saddled with extremely slow runners who didn't advance as often as they should, or runners who don't take direction very well and run through his signs, or who simply don't hustle. And Dino Ebel of the Angels may be, and in fact is, blessed with a Chone Figgins who regularly scoots home on singles and doubles and never gets caught (Figgins was not thrown out in 56 opportunities and recorded the highest individual EqHAR at 4.93 in 2006).
Because this metric is dependent on the personnel a coach has to work with, an additional step is warranted that acknowledges that dependency. This step involves comparing the opportunities that coaches can be said to have some control over with ones that they do not.
If a team is populated with poor baserunners who have trouble advancing or regularly get thrown out in situations where the coach is a spectator, one might argue that those opportunities should serve as the baseline against which we judge the coach. Table 2 shows the results of this recalculation by adding the "non-coach" EqHAR opportunities and a final column, Ratio, which is the Rate for opportunities the coach has influence over divided by the Rate for the opportunities over which he has none.

Table 2: Third Base Coaches in 2006 Ordered by Ratio

                           Coach             Non-Coach
Team  Name                Opp   Rate    Opp   OA   EqHAR   Rate   Ratio
TBA   Tom Foley           163   1.15    313   12    -6.6   0.80    1.44
PHI   Bill Dancy          262   1.15    329    5    -1.2   0.96    1.20
BAL   Tom Trebelhorn      296   1.01    400    8    -6.1   0.84    1.20
SFN   Gene Glynn          220   0.95    346    6    -4.7   0.84    1.13
CLE   Jeff Datz           274   0.99    400    7    -3.4   0.91    1.09
SDN   Glenn Hoffman       231   1.00    348    7    -3.2   0.91    1.09
TOR   Brian Butterfield   237   0.99    387    9    -2.9   0.92    1.08
NYN   Manny Acta          228   1.05    293    4    -0.6   0.98    1.07
MIL   Dale Sveum          214   1.01    329   11    -1.7   0.95    1.06
ANA   Dino Ebel           238   1.19    373    9     5.2   1.13    1.06
CHA   Joey Cora           234   0.86    404    9    -7.5   0.81    1.05
COL   Mike Gallego        247   1.03    359   12    -0.8   0.98    1.05
OAK   Ron Washington      245   0.89    372   10    -6.0   0.85    1.04
WAS   Tony Beasley        239   1.03    314    9    -0.3   0.99    1.04
KCA   Luis Silverio       237   1.04    400   13     0.7   1.02    1.02
BOS   DeMarlo Hale        248   0.86    424    8    -7.6   0.85    1.01
SEA   Carlos Garcia       226   0.97    377   13    -0.2   1.00    0.97
SLN   Jose Oquendo        230   0.98    375    9     1.0   1.03    0.95
ARI   Carlos Tosca        275   1.01    332    5     2.0   1.07    0.95
DET   Gene Lamont         240   1.10    362    3     5.5   1.16    0.95
NYA   Larry Bowa          289   0.93    410    3    -0.2   1.00    0.94
PIT   Jeff Cox            230   0.98    399    2     1.8   1.04    0.93
LAN   Rich Donnelly       260   0.90    370   10    -1.0   0.97    0.92
CIN   Mark Berry          217   0.98    315    4     2.4   1.08    0.91
HOU   Doug Mansolino      214   1.11    344    1     7.6   1.23    0.91
TEX   Steve Smith         234   0.95    410    9     2.5   1.06    0.90
ATL   Fredi Gonzalez      231   0.94    362    6     2.5   1.06    0.89
MIN   Scott Ullger        222   1.01    452    8     6.6   1.14    0.88
FLO   Bobby Meacham       199   1.05    359    5     8.3   1.24    0.84
CHN   Chris Speier        199   0.94    350    3     7.2   1.22    0.77

Under this second measure Cora moves from 30th to 11th by virtue of his team racking up a very poor EqHAR of -7.5 and a rate of 0.81 in opportunities that Cora had little or no influence over. When the 0.86 rate in his coach-influenced opportunities is compared to that 0.81, Cora comes out at 1.05, thereby slightly outperforming his team. In Table 2 Washington and Gonzalez both look a little better, while Speier and Florida's Bobby Meacham fall by virtue of their respective teams performing quite well in non-coach opportunities, at 1.22 for the Cubs and 1.24 for the Marlins.
And what of the Angels' Ebel, who came out on top in Table 1 in part I? He slides to 10th since the Angels recorded a very respectable 1.13 rate in non-coach opportunities, while Tom Foley of the Devil Rays takes the top spot since his team performed so poorly in other opportunities (-6.6, 0.80) and so well when he was likely involved (5.3, 1.15).
This metric can be expanded to encompass multiple seasons and therefore a larger view. Table 3 shows these metrics for each of the 74 third base coaches employed from the beginning of the 2000 season through 2006.
Table 3: All Third Base Coaches 2000-2006

                            Coach                   Non-Coach
Name                Opp   OA   EqHAR   Rate    Opp   OA   EqHAR   Rate   Ratio
Billy Hatcher       387    6     5.1   1.06    573   21   -12.3   0.78    1.35
Bill Dancy          527   15     3.4   1.04    737   17   -11.3   0.84    1.23
Michael Cubbage     494   12     4.7   1.05    706   15   -11.1   0.85    1.23
Lance Parrish       189    5     0.9   1.02    243    8    -3.7   0.84    1.22
Cookie Rojas        221    5    -0.2   1.00    268    9    -4.6   0.83    1.20
Terry Bevington     439   12    -3.4   0.96    544   11    -9.2   0.82    1.17
Bobby Floyd         173    5    -2.7   0.93    316    8    -6.0   0.81    1.15
Jack Lind           211    2     4.7   1.10    273   10    -0.9   0.96    1.14
Tom Foley          1056   20    14.0   1.07   1609   43    -8.5   0.95    1.13
Dave Myers          986   16     7.7   1.04   1463   35   -10.7   0.92    1.12
Al Pedrique         223    2     5.3   1.11    308    4    -0.3   0.99    1.12
Juan Samuel         626   11     7.3   1.05    976   23    -3.9   0.95    1.11
Wendell Kim         624   20   -14.7   0.88    980   34   -19.5   0.80    1.10
Jeff Datz           274    5    -0.7   0.99    400    7    -3.4   0.91    1.09
John Russell        672   19    -1.5   0.99   1096   24   -10.0   0.91    1.09
Mike Cubbage        244    7    -1.3   0.97    310    8    -2.8   0.91    1.08
Jim Riggleman       270    7    -2.0   0.96    308   11    -3.5   0.90    1.07
Tom Trebelhorn     1323   32     6.6   1.03   2101   51    -5.9   0.97    1.06
Gene Lamont        1103   28     1.8   1.01   1730   49    -9.2   0.95    1.06
Eddie Rodriguez     475   11    -5.9   0.94    614   16    -6.7   0.89    1.06
Dino Ebel           238    3    10.3   1.19    373    9     5.2   1.13    1.06
Joey Cora           234    9    -7.7   0.86    404    9    -7.5   0.81    1.05
Joel Skinner       1087   27    15.5   1.07   1650   41     2.6   1.01    1.05
Ozzie Guillen       345   10     1.3   1.01    632   19    -2.1   0.97    1.05
John Vukovich      1130   33    -7.4   0.97   1491   41   -11.4   0.93    1.04
Tony Beasley        239    6     1.5   1.03    314    9    -0.3   0.99    1.04
Brian Butterfield  1195   24     6.1   1.03   1827   45    -1.9   0.99    1.04
Tim Flannery        683   18     6.5   1.05    710   20     0.7   1.01    1.04
Manny Acta         1032   17    15.3   1.07   1495   37     4.3   1.03    1.04
Ron Oester          407   11    -1.0   0.99    571   20    -2.4   0.96    1.03
Willie Randolph     976   20     7.4   1.04   1189   33     1.2   1.01    1.03
Ron Washington     1730   45     2.0   1.00   2272   40    -5.0   0.97    1.03
Carlos Tosca        712   13     0.6   1.00    969   17    -1.2   0.99    1.02
Dale Sveum          789   18   -20.9   0.87   1201   26   -18.6   0.85    1.01
Gene Glynn         1594   40   -20.0   0.94   2198   40   -15.1   0.93    1.01
Gary Pettis         379   14    -3.1   0.96    509   14    -2.5   0.95    1.01
DeMarlo Hale        248    5    -7.6   0.86    424    8    -7.6   0.85    1.01
Sonny Jackson       601   20   -16.6   0.86    896   24   -11.8   0.86    1.00
Al Newman           889   24     1.5   1.01   1384   28     1.1   1.01    1.00
Bryan Little        264    4     7.5   1.14    298    5     4.6   1.14    1.00
Luis Silverio       449    9     5.7   1.06    787   19     4.6   1.06    1.00
Mike Gallego        488    8     1.3   1.01    728   19     1.5   1.02    0.99
Dave Huppert        240    4    -0.7   0.99    318    7    -0.2   1.00    0.99
Pete Mackanin       201    5     0.4   1.01    228    8     0.5   1.02    0.99
Steve Smith        1082   21     1.7   1.01   1697   34     6.0   1.03    0.98
Doug Mansolino      867   18     7.6   1.05   1260   20     9.6   1.07    0.97
Jose Oquendo       1616   33    25.9   1.08   2267   49    23.1   1.11    0.97
Carlos Garcia       226    6    -1.5   0.97    377   13    -0.2   1.00    0.97
Tim Raines          204    9     2.9   1.06    335    7     3.2   1.10    0.97
Rob Picciolo        704   11     3.9   1.03   1163   24     6.7   1.07    0.97
Jerry Narron        494    8     7.7   1.06    611   12     6.5   1.10    0.97
Glenn Hoffman      1541   42   -13.5   0.95   2019   47    -2.8   0.99    0.96
Sandy Alomar        487   11    11.7   1.11    683   15    12.6   1.16    0.96
Fredi Gonzalez     1249   25     3.7   1.02   2005   32    14.0   1.06    0.95
Rich Donnelly      1594   48    -4.8   0.99   2176   52     7.4   1.04    0.95
Gary Allenson       366   18   -12.7   0.81    510   19    -8.0   0.85    0.95
Rafael Santana      408    8     0.7   1.01    717   12     6.0   1.08    0.94
Tim Foli            387   13    -1.2   0.99    502   15     2.9   1.05    0.94
Ned Yost            590   21    -8.4   0.93    797   24     0.0   1.00    0.93
Jeff Cox            847   23   -10.1   0.94   1384   22     1.2   1.01    0.93
Ron Roenicke       1538   40     2.9   1.01   1977   34    18.2   1.10    0.92
Ron Gardenhire      511   16    -0.4   1.00    479   13     4.3   1.09    0.92
John Mizerock       478   10    -1.0   0.99    790   13     6.4   1.08    0.91
Jeff Newman         207    4     2.7   1.07    359    3     6.1   1.17    0.91
Trent Jewett        354   10     2.5   1.04    454   10     6.2   1.14    0.91
Larry Bowa          495   10    -8.6   0.91    699    9     2.1   1.03    0.89
Mark Berry          684   18   -10.9   0.92    911   17     3.1   1.03    0.89
Scott Ullger        222    3     0.5   1.01    452    8     6.6   1.14    0.88
Rich Dauer          710   20     0.2   1.00    861   16    12.7   1.15    0.87
Matt Galante        592   19    -8.8   0.93    853   26     7.3   1.08    0.87
Luis Sojo           558   16    -6.3   0.94    718   12     5.8   1.09    0.86
John Stearns        206   10    -7.0   0.85    253   10    -0.4   0.98    0.86
Bobby Meacham       199    4     2.3   1.05    359    5     8.3   1.24    0.84
Chris Speier        860   22    -4.7   0.98   1158   15    24.0   1.22    0.80
Sam Perlozzo        254    5    -4.0   0.92    275    3     6.3   1.22    0.75
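For readers who want to reproduce the Ratio column, here is a minimal sketch in Python. It assumes the Ratio is simply the coach-influenced Rate divided by the non-coach Rate, as described in the text. The class and field names are illustrative and not taken from the original study. Because it works from the rounded rates printed in Table 2, a recomputed value can differ from the published one by 0.01 (Cora comes out at 1.06 here rather than the published 1.05).

```python
from dataclasses import dataclass


@dataclass
class CoachLine:
    """One row of Table 2; the field names are illustrative, not the authors'."""
    name: str
    coach_rate: float      # Rate in opportunities the coach can influence
    non_coach_rate: float  # Rate in opportunities where the coach is a spectator


def ratio(line: CoachLine) -> float:
    """Coach-influenced Rate divided by the non-coach baseline Rate."""
    return line.coach_rate / line.non_coach_rate


# Published (rounded) rates for three 2006 rows of Table 2.
rows = [
    CoachLine("Joey Cora", 0.86, 0.81),
    CoachLine("Chris Speier", 0.94, 1.22),
    CoachLine("Tom Foley", 1.15, 0.80),
]

# Sort from the coach who most outperformed his team's baseline to the least.
ranked = sorted(rows, key=ratio, reverse=True)
for line in ranked:
    print(f"{line.name:<14} {ratio(line):.2f}")
```

A ratio above 1.00 means the team ran the bases better in coach-influenced opportunities than in the baseline opportunities, which is why a coach on a poor baserunning team can still rank well.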
From an absolute perspective, Dale Sveum registered the lowest EqHAR at -20.9 during his time with the Red Sox in 2004-2005 and the Brewers in 2006, while Gary Allenson with Milwaukee in 2001-2002 had the lowest absolute rate at 0.81. In both cases, however, the poor performance of their teams buoyed their ratings. Cardinals third base coach Jose Oquendo had the highest absolute EqHAR of 25.9 in his seven years with Tony LaRussa, while Ebel recorded the highest rate at 1.19 in his single season with the Angels. These absolute numbers indicate that over the course of seven seasons the range in terms of EqHAR is around 55 runs.
In answer to the first question we posed in part I, the act of waving runners around is quantifiable, albeit imperfectly with the limitations already discussed. The quantification in the above analysis passes the test of reasonableness and takes the following form. In the absolute sense, third base coaches seem at most to be able to contribute just over one additional win or one loss over what would be expected in the course of a season (Sveum with the 2005 Red Sox recorded an EqHAR of -12.6 and Jerry Narron with the Rangers in 2000 was at +10.9). Over the course of seven seasons that contribution grows to around two and a half wins, indicating there is a large degree of variability in play.
However, judging a coach by that absolute metric is not necessarily equitable since it doesn't take into consideration the personnel the coach is working with. To correct for this, a ratio that uses a baseline can be calculated, and when that ratio is converted to runs, the range becomes -1.5 to +1.5 wins per season and -3 to +3 wins over the course of seven seasons.
While we've answered the first question in the affirmative, does the difference we see between third base coaches in a single season indicate that there is a disparity in skill between these coaches?
The standard way performance analysts have approached a question like this is to perform year-to-year comparisons in an effort to see if the effect being measured persists. As it turns out, roughly two-thirds of third base coaches remain in the role the following season, with a high of 24 retained over the winter of 2003-2004. Using the ratio calculated in the previous section, a correlation coefficient (denoted as r, where a value of -1 indicates a perfectly negative linear correlation and a value of 1 indicates a perfectly positive one) can be calculated for each pair of seasons as shown in Table 4.

Table 4: Year to Year Correlations in Ratio for Third Base Coaches

Year Pair   Coaches       r
2000-2001        19    0.34
2001-2002        20   -0.16
2002-2003        21   -0.10
2003-2004        24   -0.09
2004-2005        21   -0.02
2005-2006        19    0.31

From an overall perspective those 124 pairs can be graphed as shown in Figure 1.
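The year-to-year comparison can be sketched in Python as follows. This is an illustration rather than the authors' actual code: `pearson_r` is the textbook Pearson formula, `paired_ratios` is a hypothetical helper that keeps only coaches present in both seasons, and the two season dictionaries hold made-up ratios for made-up coaches. Only the `table_4` values come from the article. The sample-size-weighted mean at the end is a rough summary of the six published r values, not the pooled correlation over all 124 pairs.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)


def paired_ratios(year_a, year_b):
    """Keep only coaches who appear in both seasons, in a stable order."""
    names = sorted(set(year_a) & set(year_b))
    return [year_a[n] for n in names], [year_b[n] for n in names]


# HYPOTHETICAL ratios for made-up coaches, for illustration only.
season_1 = {"Coach A": 1.10, "Coach B": 0.95, "Coach C": 1.02, "Coach D": 0.88}
season_2 = {"Coach A": 0.97, "Coach B": 1.04, "Coach C": 1.01, "Coach E": 1.20}

xs, ys = paired_ratios(season_1, season_2)
print(f"pairs={len(xs)} r={pearson_r(xs, ys):.2f}")

# Table 4 as published: (year pair, returning coaches, r).
table_4 = [
    ("2000-2001", 19, 0.34), ("2001-2002", 20, -0.16), ("2002-2003", 21, -0.10),
    ("2003-2004", 24, -0.09), ("2004-2005", 21, -0.02), ("2005-2006", 19, 0.31),
]
total_pairs = sum(n for _, n, _ in table_4)

# Rough summary only: weights each season pair's r by its number of coaches.
weighted_mean_r = sum(n * r for _, n, r in table_4) / total_pairs
print(f"total pairs={total_pairs} weighted mean r={weighted_mean_r:.3f}")
```

The coach counts in Table 4 do add up to the 124 pairs quoted in the text, and the weighted mean of the six r values sits close to zero.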
There may be several reasons for this negative result. Reminiscent of the ongoing debate over clutch hitting, the skill this metric is trying to measure may be much more subtle than the metric can deliver. Instead of a coach being "responsible" for up to +1.5 wins per season, his actual contribution to those wins may be a fraction of that value, and hence the variability component in the numbers we use for correlation largely swamps the skill component. So there may indeed be a skill involved in waving runners around, but that skill is for all intents and purposes unimportant in the big scheme of things. The obvious dependence on his personnel would seem to support this.
Additionally, perhaps the metric is poorly designed and does not capture the skill at all, even though it exists. It could even be the case that there really is no skill involved in holding and sending runners (or, if you prefer, there is no skill difference between coaches at the major league level) and the differential results we see can be chalked up to a combination of personnel (try as we might to disentangle it, or due to turnover of the roster) and simple luck driven by anything and everything from the opponent's defense to the weather.
Our quest for knowledge about the game is just as often informed by studies that show no effect as by those that confirm our intuition. As for the influence of third base coaches in determining when to send and when to hold runners, the most we can say from this study (assuming our metric is relevant) is that if there is a skill involved, it is hard to measure, and although the judgment exercised on the field can often make the difference in individual plays, it doesn't manifest itself on the larger scale of seasons.
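The "variability swamps skill" argument can be made concrete with a small sketch. It rests on a simplifying assumption that is not part of the original study: each season's ratio is a stable skill component plus independent noise with the same spread every year. Under that assumption the expected year-to-year correlation is the skill variance divided by the total variance. The function name and the spreads below are hypothetical.

```python
def expected_year_to_year_r(skill_sd: float, noise_sd: float) -> float:
    """Expected correlation between two seasons of the same noisy measurement.

    Assumes observed = skill + noise, with skill constant across seasons and
    noise independent between seasons with the same spread each year.
    """
    skill_var = skill_sd ** 2
    noise_var = noise_sd ** 2
    if skill_var + noise_var == 0:
        raise ValueError("at least one standard deviation must be non-zero")
    return skill_var / (skill_var + noise_var)


# HYPOTHETICAL spreads in ratio units, chosen only to show the shape of the effect.
for skill_sd, noise_sd in [(0.10, 0.00), (0.10, 0.10), (0.02, 0.10)]:
    r = expected_year_to_year_r(skill_sd, noise_sd)
    print(f"skill_sd={skill_sd:.2f} noise_sd={noise_sd:.2f} expected r={r:.2f}")
```

With no noise the expected correlation is 1, with equal spreads it is 0.5, and once the noise is five times the skill spread it drops below 0.05. A real but small skill would therefore be hard to tell apart from no skill at all with samples of this size.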
Neal Williams is the president of the Rocky Mountain chapter of the Society for American Baseball Research.
Comments
Well, you could have broken up your results by even/odd halves or something like that, but there's a simpler way of knowing whether or not you're onto something: The smell test. According to the fans of every team he has ever coached (including yours truly), Wendell Kim is the absolutely worst third-base coach in the history of mankind.
Any list where he does not show up near the bottom, I'm sorry to say, is almost certainly telling us very little, simply because we KNOW that Kim is awful. Nonetheless, this is a very interesting look, and certainly, you're on the right track. I'm just not sure the sample size is ever going to be great enough to make up for all the other variables that come into play.
Posted by: David Gassko at March 15, 2007 3:14 AM
I had the same thought David and as you'll notice Kim had a rate of .88 which puts him 7th from the bottom. The problem is that the players he's had to work with also happen(?) to have done very poorly overall. This could mean that we need to weight the influence of the non-coach portion of the measure. Sample size is certainly a problem here even over seven years.
Of course it's also just possible that Kim, while deservedly having a poor reputation, may also have sent runners in low probability situations where they happened to have made it thus raising his rate up a bit from what it would have otherwise been. The end result is a poor assessment in the minds of fans but a successful outcome on the field. Also, the failures (20 kills in Kim's case) have a disproportionately large impact on the minds of fans.
And I think your last comment hits the nail on the head. There are lots of variables here in trying to measure something that is ultimately the decision of the individual runners and so at best it's a kind of secondary effect.
Posted by: DanAgonistes at March 15, 2007 9:11 AM
Wow, after all that, unfortunately we have little discernible skill. That surprises me. Oh well. Good idea though.
I hope that people appreciate how important it is to try and ascertain how much skill there is in a measure that has significant variability. If there is little skill, as in this case, the actual data do not pass the "so what" test.
For what it is worth, I think that the "smell" test is overrated and I am surprised that DSG made the comment he did.
You might as well make the same type of comment when it comes to clutch hitting and DIPS ("I don't care what the data and ensuing statistical analysis say, I know a clutch hitter or a pitcher that always gets hammered when I see one...").
Posted by: MGL at March 15, 2007 10:41 AM
Re: smell test.
I believe in it. Not in terms of "does clutch hitting exist", but in things a lot clearer. If I want a list of the greatest basestealers ever, I want to see Tim Raines on that list. If I want the best fielders since 2000, I want to see Scott Rolen, Adam Everett, and Ichiro on that list. If they are not there, I want to know exactly the reason. (e.g., Ichiro has terrible positioning... just an illustration).
If Dodgers fans are saying that hands-down, Kim is the worst third-base coach *ever*, well, he better be close to the bottom. If he's not, then I want to know the reason, like the uncertainty level is so high, that he could in fact be the worst.
Posted by: tangotiger at March 15, 2007 2:15 PM
While it's a bit of work, I recommend following this process:
http://www.tangotiger.net/catchers.html
Since players and 3B coaches turn over quite a bit, you might get a decent sample to work with.
Posted by: tangotiger at March 15, 2007 2:17 PM
"If Dodgers fans are saying that hands-down, Kim is the worst third-base coach *ever*, well, he better be close to the bottom."
If Kim is as bad as David suggests, then Dodgers fans probably think he is a great coach. : )
At the risk of reducing the sample sizes even more, I believe adding as much context into the study as possible is important. For example, it's a cardinal sin for a baserunner to make the first or third out at third base. Making the second out at third is considered a worthwhile gamble in most cases. Similarly, making the third out at home can also be a worthwhile gamble, especially if a weaker hitter is up next.
The point is that not all outs are created the same. As a result, the various outs should be viewed or weighed differently.
This was an outstanding effort to lift the fog and perhaps stimulate an intelligent discussion in order to take this study to the next level. Good job, Dan and Neal, and thanks for allowing Baseball Analysts to be the host site.
Posted by: Rich Lederer at March 15, 2007 2:34 PM
Well Mickey, what if someone came up with an ultimate clutch hitting measure, and it rated David Ortiz as the worst clutch hitter in baseball? Would you buy it? I agree that the smell test is generally overrated, but there are certain things that are just universal (if you told me Ortiz was not clutch, I would believe it, by the way).
Posted by: David Gassko at March 15, 2007 2:44 PM
All I said is that the smell test was "overrated" not that it doesn't have some value.
The whole point of sabermetrics is to see which smells are genuine and which are not.
We can certainly use our powers of observation and intuition to aid us in our search for the truth, which is the essence of the smell test, especially when it comes to things that we think are "obvious."
However, sometimes what we perceive as obvious turns out not to be true or at least to the extent we think it is true, which is where the power of objective analysis comes into play.
Posted by: MGL at March 15, 2007 2:56 PM
Rich, thanks for letting us post it. To your point about not all outs having the same weight, that is factored into the underlying EqHAR framework already from a Run Expectancy standpoint. So making the first out at third is more expensive than making the second out there.
Posted by: DanAgonistes at March 16, 2007 12:00 PM