Baseball Beat
October 25, 2006
Net Stolen Bases: Leaders and Laggards
By Rich Lederer

Back in March 2005, I introduced the concept of the net stolen base. The idea was founded on rewarding players for SB and penalizing them for CS. The original formula was SB - (2 * CS) = Net SB. In this year's version, I have also included pickoffs. The updated formula is now SB - (2 * (CS + PO)).
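As a minimal sketch, the updated formula can be written as a small Python function. The name `net_sb` and its `penalty` parameter are illustrative choices, not something from the original 2005 piece. The example uses Jose Reyes' 2006 line from the tables further down.

```python
def net_sb(sb: int, cs: int, po: int = 0, penalty: int = 2) -> int:
    """Net stolen bases: SB - penalty * (CS + PO)."""
    return sb - penalty * (cs + po)


# Jose Reyes, 2006: 64 SB, 17 CS, 3 PO
print(net_sb(64, 17, 3))  # 24
```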

The number of stolen bases, as a standalone stat, is misleading. It is somewhat like hits. If you don't know the number of at-bats, it is hard to put the number of hits into context. Likewise, when looking at SB, we need to know the number of attempts. Stolen bases as a percentage of attempts gives us a rate stat similar to batting average. However, in order to add value, a player needs to be successful stealing bases much more often than just 51% of the time. You see, CS is a double-edged sword. A runner who is cut down trying to steal not only produces an out, but he also removes himself from the base paths. As a result, we need to penalize him for the out as well as the lost baserunner.

In one of many studies on the value of stolen bases, James Click demonstrated that the breakeven point for stealing second base is approximately 73%, and it ranges from 70% to 93% (depending upon the number of outs) for stealing third base.

OUTS  STOLEN BASE  BREAKEVEN
0       Second       73.2%
1       Second       73.1%
2       Second       73.2%
0       Third        74.8%
1       Third        69.5%
2       Third        92.7%

The above breakeven points may zig and zag a percentage point or two from one season to the next, but the basic premise is the same year in and year out. When stealing bases, a player needs to be successful somewhere between 70% and 75% of the time. If not, he is doing more harm than good by attempting to steal bases. Sure, there are some other factors at play here, mainly the game context (i.e., the score, the number of outs, who's pitching, who's catching, and who's at bat). But, generally speaking, a baserunner needs to be called safe nearly three times as often as out when attempting to take those extra 90 feet.
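To make the comparison concrete, here is a rough sketch that checks a player's success rate against the approximate 73% breakeven for stealing second. The function names are hypothetical, and counting pickoffs as failed attempts is an assumption of this sketch rather than part of Click's study. The two sample lines come from the 2006 tables later in this article.

```python
BREAKEVEN_SECOND = 0.73  # approximate breakeven for stealing second, per the table above


def success_rate(sb: int, cs: int, po: int = 0) -> float:
    """Share of attempts that ended in a stolen base (pickoffs counted as failures)."""
    return sb / (sb + cs + po)


def clears_breakeven(sb: int, cs: int, po: int = 0,
                     breakeven: float = BREAKEVEN_SECOND) -> bool:
    """True if the success rate meets or beats the breakeven rate."""
    return success_rate(sb, cs, po) >= breakeven


print(clears_breakeven(45, 2, 1))   # Ichiro Suzuki: True
print(clears_breakeven(40, 19, 2))  # Scott Podsednik: False
```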

Before we get ahead of ourselves here, let's take a look at the past season's stolen base leaders.

2006 STOLEN BASE LEADERS

PLAYER               TEAM      SB
 1 Jose Reyes        NYM       64
 2 Juan Pierre       ChC       58
   Carl Crawford     TB        58
 4 Chone Figgins     LAA       52
 5 Hanley Ramirez    Fla       51
 6 Dave Roberts      SD        49
 7 Ichiro Suzuki     Sea       45
   Corey Patterson   Bal       45
 9 Felipe Lopez      Was       44
10 Alfonso Soriano   Was       41
11 Scott Podsednik   CWS       40
12 Rafael Furcal     LAD       37
   Ryan Freel        Cin       37
14 Jimmy Rollins     Phi       36
   Brian Roberts     Bal       36
16 Derek Jeter       NYY       34
17 Willy Taveras     Hou       33
18 Kenny Lofton      LAD       32
19 Bobby Abreu       NYY       30
20 Orlando Cabrera   LAA       27
21 Chris Duffy       Pit       26
22 Johnny Damon      NYY       25
   Mike Cameron      SD        25
   Luis Castillo     Min       25
   Eric Byrnes       Ari       25
   Brandon Phillips  Cin       25

Jose Reyes led the major leagues in SB with 64. Carl Crawford led the American League with 58. The question is: How valuable were these stolen bases?

Here are the stolen base leaders in the context of caught stealing, pickoffs, and net SB.

SB LEADERS WITH CS AND PO TOTALS

PLAYER               TEAM       SB      CS     PO   NET SB* 
 1 Jose Reyes        NYM        64      17      3     24     
 2 Juan Pierre       ChC        58      20      0     18
   Carl Crawford     TB         58       9      1     38
 4 Chone Figgins     LAA        52      16      0     20
 5 Hanley Ramirez    Fla        51      15      0     21
 6 Dave Roberts      SD         49       6      3     31
 7 Ichiro Suzuki     Sea        45       2      1     39
   Corey Patterson   Bal        45       9      0     27
 9 Felipe Lopez      Was        44      12      1     18
10 Alfonso Soriano   Was        41      17      2      3
11 Scott Podsednik   CWS        40      19      2    - 2
12 Rafael Furcal     LAD        37      13      0     11
   Ryan Freel        Cin        37      11      4      7
14 Jimmy Rollins     Phi        36       4      0     28
   Brian Roberts     Bal        36       7      2     18
16 Derek Jeter       NYY        34       5      0     24
17 Willy Taveras     Hou        33       9      2     11
18 Kenny Lofton      LAD        32       5      0     22
19 Bobby Abreu       NYY        30       6      1     16
20 Orlando Cabrera   LAA        27       3      0     21
21 Chris Duffy       Pit        26       1      1     22
22 Johnny Damon      NYY        25      10      0      5
   Mike Cameron      SD         25       9      0      7
   Luis Castillo     Min        25      11      1      1
   Eric Byrnes       Ari        25       3      2     15
   Brandon Phillips  Cin        25       2      2     17

* Net SB = SB - (2 * (CS + PO))

Based on Click's work, I could have used SB - (3 * (CS + PO)) rather than SB - (2 * (CS + PO)) to come up with a 75% breakeven point. However, I chose to err on the side of conservatism, plus I think it is slightly easier to compute the net number in your head using two times rather than three. When possible, remember to keep it simple, stupid. So, in honor of KISS, we will use the Deuce in our formula.
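The link between the multiplier and the breakeven point is simple arithmetic: net SB is zero when SB equals the multiplier times the failures, so the implied breakeven success rate is multiplier / (multiplier + 1). A quick sketch (the function name is illustrative):

```python
def implied_breakeven(penalty: float) -> float:
    """Success rate at which SB - penalty * (CS + PO) comes out to zero."""
    return penalty / (penalty + 1)


for k in (2, 3):
    print(k, f"{implied_breakeven(k):.1%}")
# 2 66.7%
# 3 75.0%
```

In other words, the Deuce implies a breakeven of about two-thirds, a bit more forgiving than the 70-75% range above.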

Based on the adjusted stolen base totals, the top 10 most efficient base stealers were as follows:

MOST EFFICIENT BASE STEALERS

PLAYER               TEAM      NET SB 
 1 Ichiro Suzuki     Sea         39
 2 Carl Crawford     TB          38
 3 Dave Roberts      SD          31
 4 Jimmy Rollins     Phi         28
 5 Corey Patterson   Bal         27 
 6 Jose Reyes        NYM         24
   Derek Jeter       NYY         24    
 8 Kenny Lofton      LAD         22
   Chris Duffy       Pit         22
10 Hanley Ramirez    Fla         21
   Orlando Cabrera   LAA         21

Ichiro Suzuki and Carl Crawford were the most efficient base stealers last year. Both combined quantity with efficiency. Feel free to add Dave Roberts if you're interested in the NL leader in this category.

Using the same formula, we can also determine the least efficient base stealers.

LEAST EFFICIENT BASE STEALERS

PLAYER                 TEAM     SB      CS     PO   NET SB
 1 Jamey Carroll       Col      10      12      3    -20
 2 Jeff Francoeur      Atl       1       6      1    -13
 3 Bill Hall           Mil       8       9      1    -12
 4 Jose Bautista       Pit       2       4      2    -10
 5 Reggie Abercrombie  Fla       6       5      2    - 8
   Ronny Cedeno        ChC       8       8      0    - 8
   Dan Uggla           Fla       6       6      1    - 8
 8 Magglio Ordonez     Det       1       4      0    - 7
   Yuniesky Betancourt Sea      11       8      1    - 7
   Ryan Zimmerman      Was      11       8      1    - 7
   David Eckstein      StL       7       6      1    - 7

Every player on the list above is literally costing his team outs and potentially runs and even wins. Jamey Carroll, Yuniesky Betancourt, and Ryan Zimmerman may have pleased their fantasy baseball owners in the stolen base department last year, but they were a net negative for their real owners with respect to stealing bases. Using Carroll as an example illustrates the point. The little second baseman stole 10 bases last year, but he also finished among the top 10 in CS (12) and was tied for second in PO (3). In other words, he cost the Rockies 15 outs while removing himself from the bases 15 times. Shake it all up and our formula suggests that Carroll was worth minus 20 bases when attempting to steal in 2006.

As it relates to the NL Rookie of the Year balloting, if you are at all torn among Hanley Ramirez, Dan Uggla, and Zimmerman, look no further than their net SB as perhaps a tie breaker. Ramirez contributed a net positive 21 bases - good for 10th best in MLB - while Uggla and Zimmerman cost their teams 8 and 7 bases, respectively, for a net differential of nearly 30 bases.

Jeff Francoeur is a certified out maker. He was third in the NL and fourth overall in outs (defined as AB - H + CS + GIDP + SH + SF) last year. He had a poor OBP (.293) and SB% (14%). A good athlete, Francoeur lacks discipline at the plate and on the bases. Oh, his ceiling is high. But he has more holes than Pinehurst in his game at the moment.

Looking at net stolen bases can be valuable in so many ways. For example, if a team wants to sign Alfonso Soriano to a huge long-term contract, I just hope that management realizes what it's getting or not getting, as the case may be. Every serious baseball fan knows by now that Soriano became just the fourth player to hit 40 home runs and steal 40 bases in the same season. But, if the truth be known, he wasn't much of an asset when attempting to steal bases in 2006. At best, Alfonso turned in an ever so slightly positive net SB contribution last year.

Worse yet is Scott Podsednik, whose claim to fame to date has been stealing bases. Although the Chicago White Sox left fielder stole 40 bases, he was a net liability in this department. Yep, you read that right. Podsednik was caught or picked off 21 times, resulting in a net contribution of minus 2 bases. If Podsednik's not helping you on the base paths, where is he adding value? Certainly not with the bat (.261/.330/.353).

*******

Net stolen bases can also be used at the team level. Here is a ranking from top to bottom based on the offensive side of the equation (exclusive of pickoffs).

TEAM                 SB    CS    NET SB
 1 NY Mets          146    35      76
 2 NY Yankees       139    35      69
 3 San Diego        123    31      61
 4 Cincinnati       124    33      58
 5 Baltimore        121    32      57
 6 Philadelphia      92    25      42
 7 LA Angels        148    57      34
 8 Seattle          106    37      32
 9 LA Dodgers       128    49      30
   Tampa Bay        134    52      30
11 Chicago Cubs     121    49      23
12 Pittsburgh        68    23      22
13 Oakland           61    20      21
14 Minnesota        101    42      17
15 Arizona           76    30      16
16 Cleveland         55    23       9
17 San Francisco     58    25       8
18 Houston           79    36       7
19 Texas             53    24       5
   Boston            51    23       5
21 Toronto           65    33     - 1
   Washington       123    62     - 1 
23 Chicago Sox       93    48     - 3
   Kansas City       65    34     - 3
   Milwaukee         71    37     - 3
26 St. Louis         59    32     - 5
27 Florida          110    58     - 6
28 Colorado          85    50     -15
29 Atlanta           52    35     -18
30 Detroit           60    40     -20

The Mets, Yankees, and Padres - playoff teams all - were 1-2-3 in the majors. The Tigers and Cardinals - battling each other in the World Series - ranked among the bottom five.

Now let's take a look at the defensive side of the picture: the stolen bases and caught stealing each team allowed. Here, a lower net SB is better.

TEAM                 SB    CS    NET SB
 1. Florida          69    46     -23
 2. Detroit          49    35     -21
 3. Baltimore        80    50     -20
    Cincinnati       50    35     -20
 5. Texas            67    40     -13
 6. Minnesota        54    31     - 8
 7. Seattle          72    38     - 4
 8. LA Angels        77    40     - 3
 9. Kansas City      58    30     - 2
    Pittsburgh      102    52     - 2
    NY Yankees       92    47     - 2
12. St. Louis        63    32     - 1
13. Arizona          90    45       0
14. Oakland          88    41       6
15. Colorado         99    42      15
16. Tampa Bay       108    46      16
17. San Francisco    98    40      18
18. Houston          78    28      22
19. Philadelphia     94    35      24
20. NY Mets         111    40      31
21. LA Dodgers      110    38      34
22. Milwaukee        97    31      35
23. Chicago Cubs    118    39      40
24. Atlanta         101    30      41
25. Chicago Sox     116    34      48
26. Washington      110    30      50
27. Cleveland       128    34      60
28. Boston          108    23      62
29. Toronto         130    32      66
30. San Diego       150    26      98

Detroit, as one would expect, doesn't look so bad here. The Tigers do a pretty good job at holding runners and Pudge Rodriguez has been one of the best catchers ever at throwing out runners attempting to steal. San Diego, on the other hand, looks downright awful. Mike Piazza, without a doubt the best-hitting catcher of all time, has never been known for his defensive prowess, especially his arm.

Netting out the offensive and defensive contributions for each team produces the following results:

TEAM               OFF    DEF    NET
 1 Cincinnati       58    -20     78
 2 Baltimore        57    -20     77
 3 NY Yankees       69    - 2     71
 4 NY Mets          76     31     45
 5 LA Angels        34    - 3     37
 6 Seattle          32    - 4     36
 7 Minnesota        17    - 8     25
 8 Pittsburgh       22    - 2     24
 9 Philadelphia     42     24     18
   Texas             5    -13     18
11 Florida         - 6    -23     17
12 Arizona          16      0     16
13 Oakland          21      6     15
14 Tampa Bay        30     16     14
15 Detroit         -20    -21      1
16 Kansas City     - 3    - 2    - 1
17 LA Dodgers       30     34    - 4
   St. Louis       - 5    - 1    - 4
19 San Francisco     8     18    -10
20 Houston           7     22    -15
21 Chicago Cubs     23     40    -17
22 Colorado        -15     15    -30
23 San Diego        61     98    -37
24 Milwaukee       - 3     35    -38
25 Cleveland         9     60    -51
   Washington      - 1     50    -51
   Chicago Sox     - 3     48    -51
28 Boston            5     62    -57
29 Atlanta         -18     41    -59
30 Toronto         - 1     66    -67

Seven of the eight playoff teams were essentially even to plus in terms of their effectiveness in stealing bases and preventing stolen bases. Only the Padres were hugely negative.

Whether plus or minus, keep in mind that stolen bases are worth less than one-half of a base per game to the best and worst teams. The difference between the #1 club (Cincinnati) and the #30 (Toronto) approaches one base per game - not insignificant but a relatively minor matter when compared to the disparities between the best and worst teams with respect to pitching, hitting, and fielding.
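The per-game figures can be checked with a few lines of Python. This is only a sketch: it assumes a 162-game schedule for every team, and the helper name `team_net` is made up for illustration. The inputs are the OFF and DEF values from the table above.

```python
GAMES = 162  # assumption: a full regular-season schedule


def team_net(offense: int, defense: int) -> int:
    """Net SB gained on offense minus net SB allowed on defense."""
    return offense - defense


cincinnati = team_net(58, -20)  # 78
toronto = team_net(-1, 66)      # -67

print(f"{cincinnati / GAMES:.2f}")              # 0.48
print(f"{(cincinnati - toronto) / GAMES:.2f}")  # 0.90
```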

In closing, it is important to make two points:

1. Stolen bases are not as valuable in today's high-scoring environment as they were in the Dead Ball era or during the pitching-dominated 1960s and early 1970s.

2. Outs are more precious today than ever before.

As a result, pay attention to both quantity and quality. The net stolen base concept does a good job at doing just that.

Comments

I actually think the PO total for Podsednik is low. Anyone who watched the Sox much this year will tell you that total should be closer to 10. I was keeping a tally of his pick offs early in the season, and I know I had him up to 3 in May.

This is good stuff, but I think it is state of the art for 1999.

SB, CS, and PO are raw data. They lack context. By combining them, you simply combine three things that lack context to create something with context. I think it's entirely valid, but not really current with today's metrics.

We're able to determine, to some degree, the effect of any action on a team's chances of winning, using the metrics on, e.g., fangraphs. Assuming that runners only attempt to steal when the result is SB, CS, or PO (that is, those three stats give us a 100% picture of the runner's intent), we can make a better determination. How?

Take each event (SB, CS, PO) and figure out the win% gained by a success and lost by a failure. Multiply each of those by the odds of a success/failure, as appropriate. Sum up all instances where this occurs.

Take Jose Reyes for example. He had 64 stolen bases and 20 failures. Therefore, he succeeds 76.2% of the time.

We'd take each of his 84 attempts and look at the two possible outcomes. We'd look up the value, in terms of added win percentage, for each of those outcomes. We'd multiply the "good" outcome by .762 and the "bad" outcome (which should be a negative number) by .238. We'd add those up to determine how much his attempt gained or lost his team.

Repeat that for the other 83 instances, and we'll have a better idea as to the effect of his base stealing.

The reason this is important is simply because a guy with a lot of "garbage time" steals is apt to have a higher success rate, but really shouldn't get all that much credit for the swipe. I'm making this example up, but you can imagine that Brian Roberts was about a 65% success rate guy in high leverage situations, but a 95% success rate guy in low leverage situations. Should a player like that be considered a significantly positive basestealer?
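The weighting described in this comment could be sketched roughly as follows. The win-probability gains and losses in the example are made-up placeholders, not real data, and the function names are hypothetical. Only the 64 steals and 20 failures come from the comment.

```python
def expected_attempt_value(p_success: float, win_gain: float, win_loss: float) -> float:
    """Expected change in win probability for one attempt.

    win_gain is the change after a successful steal; win_loss is the
    (negative) change after a caught stealing or pickoff.
    """
    return p_success * win_gain + (1 - p_success) * win_loss


def season_value(p_success: float, attempts: list[tuple[float, float]]) -> float:
    """Sum the expected value over every (win_gain, win_loss) attempt."""
    return sum(expected_attempt_value(p_success, gain, loss) for gain, loss in attempts)


# Hypothetical (gain, loss) pairs for three attempts; real values would be
# looked up from the game situation of each attempt.
attempts = [(0.02, -0.04), (0.01, -0.03), (0.05, -0.10)]

p = 64 / (64 + 20)  # 76.2% success rate
print(round(season_value(p, attempts), 4))
```

A full version would loop over all 84 attempts, and could use a separate success rate for high and low leverage situations to address the "garbage time" concern.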

Corey Patterson?

I think the disparity between The Cheat's calculation of Scotty Po's PO total and the lower total given in the article may come in the way pick-offs are scored. Somebody correct me if I'm wrong, but I think a pick-off is only scored when the player is tagged out heading back to the bag. If the runner is caught off the bag by the pitcher, heads toward the next base, and is tagged out there, then it goes down as a "caught stealing."

Either way you score it, the point of this article is a good and well-argued one. Guys like Podsednik and Soriano and their high stolen base totals are not nearly as valuable in real baseball as they are in fantasy baseball. Indeed when you factor in Podsednik's bat, it's difficult to justify holding a corner outfield spot for him.

I was surprised not to see Chris Duffy's name among the most efficient base stealers. He had 26 steals and was caught only once. Was he picked off more than three times?

what about corey patterson?

Uh oh...not another stat that makes Derek Jeter look good. I know some people won't be happy about that.
Seriously, though, Scott Podsednik is just a below average baseball player. Don't forget about his pathetic arm in left field.

Going down the line in the hopes of answering questions and responding to comments:

* I actually think the PO total for Podsednik is low.

As Shooty Babitt writes, "I think the disparity between The Cheat's calculation of Scotty Po's PO total and the lower total given in the article may come in the way pick-offs are scored." Yes, that is correct. As such, Podsednik's additional pickoffs are incorporated in his CS totals.

* This is good stuff, but I think it is state of the art for 1999.

Well, I'm a 1999-style guy. :)

Although Win Probability Added (WPA) adds more context, I'm not sure all stats can realistically - or should - be reported in a WPA manner. To wit, why keep track of batting average? Instead, just report the value of hits based on WPA. Same with other rate stats like OBP or SLG or, for that matter, raw stats such as HR and BB.

WPA is great, but there are a couple of issues here. First, the data is not readily available at this time. Secondly, I don't think stats in highly leveraged situations should necessarily overwhelm all others in all cases.

* I missed Corey Patterson and Chris Duffy the first time and have since corrected the above tables to include them. I didn't realize that my filter was looking at "qualified" players only. I re-ran the sorts using "all" players and Patterson and Duffy were the only two I missed last time around. Thanks.

I agree that it should not be necessarily reported as WPA.

I disagree with the breakeven point. As I've shown in THE BOOK, the breakeven point was something like 68.7%. SB, for example, also include errors that allow the runner to go to third, and some CS and PO actually leave the runner safe on 2B. It all depends on how things like these are calculated.

The breakeven point can be anywhere between 60-90% depending on the inning/score.

Nonetheless, the choice of 2 instead of 3 was definitely the right thing to do.

If you wanted to get rid of all the brackets, SB/2-CS-PO works just as well.
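A quick check of that claim: the bracket-free version is exactly half of the article's net SB, so the numbers differ but the rankings do not. The function names below are illustrative, and the three player lines are taken from the article's table.

```python
def net_sb(sb, cs, po):
    """The article's formula: SB - (2 * (CS + PO))."""
    return sb - 2 * (cs + po)


def half_form(sb, cs, po):
    """The bracket-free variant: SB/2 - CS - PO."""
    return sb / 2 - cs - po


players = {
    "Ichiro Suzuki": (45, 2, 1),
    "Jose Reyes": (64, 17, 3),
    "Scott Podsednik": (40, 19, 2),
}

for name, line in players.items():
    assert half_form(*line) * 2 == net_sb(*line)

by_net = sorted(players, key=lambda p: net_sb(*players[p]), reverse=True)
by_half = sorted(players, key=lambda p: half_form(*players[p]), reverse=True)
print(by_net == by_half)  # True
```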

I'd really like to see the lists using the 3 modifier too, if only to reference it and see what significant differences there are in the results compared to the formula using the 2 modifier. Any chance that could be added at the end of the article?

Also, a quick response to Redders: just because Derek Jeter is potentially overrated doesn't mean we're not allowed to extol his virtues, and he's always been known as an excellent baserunner.

May I also suggest a second list for the "least efficient basestealers" using a minimum number of steals attempted, maybe about 15? I think that would provide a nice reference to the least efficient BASESTEALERS, as opposed to guys like Magglio Ordonez who very rarely run, and so put up a terrible total due to a few CS.
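A sketch of how such a filter might look, using a handful of rows from the article's "least efficient" table. Counting pickoffs as attempts is an assumption of this sketch, and the function name is hypothetical. A true leaderboard would need the full league data, since players outside the article's list could qualify.

```python
# (player, SB, CS, PO) rows copied from the "least efficient" table above
ROWS = [
    ("Jamey Carroll", 10, 12, 3),
    ("Jeff Francoeur", 1, 6, 1),
    ("Bill Hall", 8, 9, 1),
    ("Ronny Cedeno", 8, 8, 0),
    ("Magglio Ordonez", 1, 4, 0),
    ("Ryan Zimmerman", 11, 8, 1),
]


def least_efficient(rows, min_attempts=15):
    """Return (player, net SB) pairs, worst first, for players with enough attempts."""
    qualified = [
        (name, sb - 2 * (cs + po))
        for name, sb, cs, po in rows
        if sb + cs + po >= min_attempts  # assumption: pickoffs count as attempts
    ]
    return sorted(qualified, key=lambda pair: pair[1])


for name, net in least_efficient(ROWS):
    print(f"{name:<16}{net:>4}")
```

With a 15-attempt minimum, Francoeur and Ordonez drop out of this sample while Carroll, Hall, Cedeno, and Zimmerman remain.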

Peter makes a good point above. I wonder how many of Ordonez's 4 CS were on failed hit and run plays. Given the Tigers' strikeout prone lineup, probably at least a couple of them. In these cases, there was no reasonable expectation that Ordonez would steal the base. On the other hand, we can assume that for basestealers like Soriano, Podsednik, and Castillo, the majority of their CS and PO were accumulated in a failed effort to steal a base.

Wrong, Wrong, Wrong. Why does the pitcher try to keep the runner close to the bag? Why does the first baseman hold the runner and in the process create a huge hole on the right side of the diamond?

Does the pitcher throw more fastballs and fewer curveballs to the next hitter because the runner on first base is a base stealing threat?

It may be next to impossible to truly understand the actual impact a stolen base has. And of course, what do you make of a situation where the runner is caught stealing, and the next batter hits what would have been an automatic double play? Clearly those outs should be SUBTRACTED from the outs totals.

What about making the third out of the inning trying to steal a base, when it allows the next hitter to lead off the next inning? Sometimes that is a no-brainer situation. For example: Grady Sizemore at bat, two outs, the runner at first is caught stealing, and now Grady gets to lead off the next inning. Not such a bad move really; that third out is much less costly than in other situations.

Alex wrote "what do you make of a situation where the runner is caught stealing, and the next batter hits what would have been an automatic double play . . . "

There is no such thing as a predestined play. Change one outcome and the game situation has changed. It is the Big Bang theory. Change a caught stealing with a station-to-station keep your foot on the bag strategy and you must replay the entire game from that point on. Didn't Dr. K say that if a butterfly flaps its wings in China it might cause a hurricane in the Caribbean?

should a higher value be given for stealing 3rd and home (and a bigger penalty assessed when caught)?

base stealing is always an interesting debate though, because we know that the different leverage situations cause completely different outlooks and the effect of removing force outs is hard to properly rate as well. and it's also not accounting for balks forced by steal attempts, or extra bases taken on errors while dealing with a steal.

Sure, there are some other factors at play here, mainly the game context (i.e., the score, the number of outs, who's pitching, who's catching, and who's at bat).

While you acknowledge this, you proceed as if it were not a significant issue. I don't see any analysis that supports that. The fact is stolen base attempts are not random. Runners don't take off and "see what happens". They are always weighing the risk of a specific situation against the benefits for that specific situation.

The relationship between risk and benefits fluctuates wildly for different stolen bases. Neither the attempts, nor the times caught stealing, represent a cross section of the times players are on base. And the situations where runners get caught are probably not the same as the situations where they don't.

But you are also misusing the studies that have been done of base stealing. They have not found a "break even point", as you state. What they have found is the "optimum" percentage which players ought to steal. If they get caught more often than the "optimum" they ought to make fewer attempts. But the flip side, not acknowledged by the authors, is that if they get caught less often than optimum they should have made more attempts because they left runs on the table that they would have scored if they stole the base.

The problem is that the theoretical optimum these studies have identified, isn't really very useful for analysis of actual base stealing because it depends on "all other things being equal" when they clearly aren't. One of those assumptions, unstated, is that the team attempting to steal bases is not consciously optimizing both its chances and the situations in which it attempts them.

Which is why it sounds absurd when you use the studies to say a player who got caught less often than the "optimum" hurt his team. The truth is it depends on whether he failed to score a run because he failed to attempt a steal. We don't really know what would have happened if he had made more attempts. And no statistical study can tell us that. But it can't tell us what would have happened if he had made fewer attempts either.

the suggestion that it's not significant is probably because over a larger sample size it would even out. not many people make the top of the list simply by stealing against lousy catchers. and i doubt anyone would only steal in useful situations if they were indeed up there.

over a larger sample size it would even out

I don't see how that would be true. For instance, who the batter is is not going to even out no matter how large the sample size.

As for the catcher, presumably the players factor that into their attempts. It's perfectly plausible to avoid getting caught by not running on certain catchers or pitchers. In fact, wouldn't you expect them to?

But the actual value of a stolen base doesn't change because of who is catching. It does change depending on the runner, the hitters behind him, the pitcher on the mound, how well that pitcher is pitching, the quality of both teams' bullpens, the score of the game, etc. Those things just don't even out no matter how large the sample. It's plausible, in fact likely, that the situations in which players get caught are qualitatively different than the situations when they don't.