Explaining Batter Outcomes in Percentage Terms
In the comments thread to Patrick Sullivan's thoughtful article, "Out" With The Old: A Better Way to Look at OBP, Mark Armour made the following observations and statements:

I think Sully is absolutely right here, especially because the denominator is plate appearances, not "at bats".

Mark's comments really hit home with me. More than anything, I like the idea of using plate appearances as the denominator to determine how often a player gets a hit, draws a walk, is hit by a pitch, or makes an out. In this way, we can measure everything in percentage terms, and the beauty of it all is that the four outcomes total 1 if expressed in decimal fractions (as Mark said) or 100% when stated in percentages.

All of these stats would be viewed on the same scale. Therefore, they would be more descriptive and easier to understand. Walks, hit by pitches, and outs would be put in their proper perspective, and each would be accounted for more fully than ever before.

For example, Ichiro Suzuki's batter outcomes could be expressed in decimal fractions or percentages, as follows:

DECIMAL FRACTIONS
                         HIT       BB      HBP      OUT
Ichiro Suzuki           .323     .067     .004     .606

PERCENTAGES
                       HIT %     BB %    HBP %    OUT %
Ichiro Suzuki         32.34%    6.66%    0.41%   60.60%

To make things even simpler, we could round these outcomes to the nearest whole number. Imagine Dave Niehaus, the play-by-play announcer for the Seattle Mariners, describing Suzuki when the left-handed hitter approaches the plate. "Suzuki produces a hit 32%, a walk or hit by pitch 7%, and an out 61% of the time." Or "Ichiro gets on base in 39% of his plate appearances and makes an out the other 61%."

Let's take a look at last year's leaders in hitting, walk, on-base, and out percentages (minimum of 502 total plate appearances).
TOP 20 IN HITTING PERCENTAGE (H/TPA)
                       HIT %     BB %    HBP %    OUT %
Ichiro Suzuki         32.34%    6.66%    0.41%   60.60%
Magglio Ordonez       31.81%   11.19%    0.29%   56.70%
Placido Polanco       31.20%    5.77%    1.72%   61.31%
Matt Holliday         30.29%    8.84%    1.40%   59.47%
Edgar Renteria        30.20%    8.47%    0.18%   61.14%
Hanley Ramirez        30.03%    7.37%    0.99%   61.61%
Carl Crawford         29.35%    5.10%    0.80%   64.75%
Mike Lowell           29.25%    8.12%    0.46%   62.17%
Michael Young         29.05%    6.79%    0.72%   63.44%
Jorge Posada          29.03%   12.56%    1.02%   57.39%
Chone Figgins         29.03%   10.14%    0.00%   60.83%
Dmitri Young          28.94%    8.66%    0.20%   62.20%
Derek Jeter           28.85%    7.84%    1.96%   61.34%
Chipper Jones         28.83%   13.67%    0.00%   57.50%
Chase Utley           28.71%    8.16%    4.08%   59.05%
Dustin Pedroia        28.40%    8.09%    1.20%   62.31%
Robinson Cano         28.25%    5.83%    1.20%   64.72%
Vladimir Guerrero     28.18%   10.76%    1.36%   59.70%
Aramis Ramirez        28.14%    7.71%    0.72%   63.44%
Alfonso Soriano       28.04%    5.02%    0.65%   66.29%

TOP 20 IN WALK PERCENTAGE (BB/TPA)
                       HIT %     BB %    HBP %    OUT %
Jack Cust             19.92%   20.71%    0.20%   59.17%
Pat Burrell           20.23%   19.06%    0.67%   60.03%
Jim Thome             22.20%   17.72%    1.12%   58.96%
Todd Helton           26.10%   17.01%    0.29%   56.60%
Carlos Pena           22.55%   16.83%    1.63%   58.99%
David Ortiz           27.29%   16.64%    0.60%   55.47%
Ryan Howard           21.91%   16.51%    0.77%   60.80%
Adam Dunn             21.84%   15.98%    0.79%   61.39%
Travis Hafner         21.94%   15.43%    1.06%   61.57%
Rickie Weeks          18.97%   15.42%    2.77%   62.85%
Nick Swisher          21.40%   15.17%    1.52%   61.91%
Albert Pujols         27.25%   14.58%    1.03%   57.14%
J.D. Drew             22.83%   14.31%    0.18%   62.68%
Gary Sheffield        22.09%   14.17%    1.52%   62.23%
Lance Berkman         23.35%   14.07%    1.20%   61.38%
Jason Varitek         21.43%   13.71%    1.54%   63.32%
Chipper Jones         28.83%   13.67%    0.00%   57.50%
Ken Griffey           23.43%   13.64%    0.16%   62.76%
Kevin Millar          21.53%   13.52%    1.42%   63.52%
Grady Sizemore        23.26%   13.50%    2.27%   60.96%

TOP 20 IN OBP/LOWEST OUT PERCENTAGE
                       OBP %    OUT %
David Ortiz           44.53%   55.47%
Todd Helton           43.40%   56.60%
Magglio Ordonez       43.30%   56.70%
Albert Pujols         42.86%   57.14%
Jorge Posada          42.61%   57.39%
Chipper Jones         42.50%   57.50%
Alex Rodriguez        42.23%   57.77%
David Wright          41.63%   58.37%
Jim Thome             41.04%   58.96%
Carlos Pena           41.01%   58.99%
Chase Utley           40.95%   59.05%
Jack Cust             40.83%   59.17%
Matt Holliday         40.53%   59.47%
Vladimir Guerrero     40.30%   59.70%
Miguel Cabrera        40.00%   60.00%
Derrek Lee            40.00%   60.00%
Mark Teixeira         40.00%   60.00%
Pat Burrell           39.97%   60.03%
Prince Fielder        39.50%   60.50%
Ichiro Suzuki         39.40%   60.60%

Total plate appearances, as provided by ESPN's stats, consist of every outcome, including sacrifice flies, sacrifice hits, and catcher's interferences. The latter should be factored into OBP but has not been for this exercise. One could argue over the inclusion of SH because a player is often asked to "give himself up" by the manager. However, in my book, the batter created an out, albeit one that would be more productive than a strikeout or infield flyout.

By the same token, I have not counted grounded into double plays as two outs. Doing so would result in the batter outcomes not adding up to 1 in decimal fractions or 100% in percentages. Besides, I believe GIDP, like RBI, should be viewed in the context of opportunities.

We can also look at total bases as a percentage of plate appearances. Rather than calling it slugging average (which is TB/AB), Branch Rickey termed total bases as a percentage of plate appearances "advancement percentage." This one might be best expressed as a decimal fraction.
TOP 20 IN SLUGGING OR ADVANCEMENT PERCENTAGE
                         SLG
Matt Holliday           .541
Alex Rodriguez          .531
Alfonso Soriano         .525
Magglio Ordonez         .521
Prince Fielder          .520
Chipper Jones           .517
David Ortiz             .511
Hanley Ramirez          .508
Carlos Pena             .502
Curtis Granderson       .500
Aramis Ramirez          .498
Chase Utley             .489
Jimmy Rollins           .488
Miguel Cabrera          .488
Mark Teixeira           .483
Corey Hart              .481
Ryan Howard             .477
Vladimir Guerrero       .476
Carlos Lee              .475
Albert Pujols           .473

Taking slugging or advancement percentage one step further, we can calculate bases per plate appearance (or BPA), defined as [TB+BB+HBP]/TPA. I have excluded SB, CS, and GIDP from this formula. BPA may also be best expressed as a decimal fraction.

TOP 20 IN BASES PER PLATE APPEARANCE
                         BPA
Alex Rodriguez          .695
Carlos Pena             .686
David Ortiz             .684
Prince Fielder          .673
Chipper Jones           .653
Ryan Howard             .650
Matt Holliday           .644
Jim Thome               .642
Magglio Ordonez         .636
Albert Pujols           .629
Adam Dunn               .625
Mark Teixeira           .621
Miguel Cabrera          .612
Chase Utley             .612
David Wright            .605
Jorge Posada            .603
Jack Cust               .602
Brad Hawpe              .597
Vladimir Guerrero       .597
Pat Burrell             .594

I realize that not all bases are created equal. A single is worth more than a walk, two singles are worth more than a double, and two doubles are worth more than a home run. Linear weights capture these finer points, but the differences are minor in the scope of the bigger picture.

In any event, I'm getting far afield from the original idea of expressing batter outcomes as a percentage of plate appearances. I believe this approach would serve to de-emphasize batting average while raising the awareness and value of walks, outs, and on-base percentage. If nothing else, it would be a good first step in highlighting what is and what isn't important when it comes to batting statistics.
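For anyone who wants to reproduce these figures, the arithmetic can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet behind the tables: the names BattingLine, outcome_shares, advancement_pct, and bases_per_pa are made up for the example, and Ichiro's counting stats (736 plate appearances, 238 hits, 49 walks, 3 hit by pitches, 292 total bases) are assumed inputs that do not appear in the article, chosen because they are consistent with his percentages above. Outs are taken as whatever is left of the plate appearances, which matches the treatment of sacrifices described earlier.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Season counting stats; field names are illustrative."""
    pa: int   # total plate appearances (TPA)
    h: int    # hits
    bb: int   # walks
    hbp: int  # hit by pitches
    tb: int   # total bases


def outcome_shares(line):
    """Return the four batter outcomes as fractions of plate appearances."""
    # Everything that is not a hit, walk, or HBP is treated as an out,
    # so sacrifices (and catcher's interference) fall into the out bucket.
    outs = line.pa - line.h - line.bb - line.hbp
    return {
        "hit": line.h / line.pa,
        "bb": line.bb / line.pa,
        "hbp": line.hbp / line.pa,
        "out": outs / line.pa,
    }


def advancement_pct(line):
    """Total bases per plate appearance (TB/TPA)."""
    return line.tb / line.pa


def bases_per_pa(line):
    """BPA as defined in the article: [TB+BB+HBP]/TPA."""
    return (line.tb + line.bb + line.hbp) / line.pa


# Assumed counting stats, not taken from the article.
ichiro = BattingLine(pa=736, h=238, bb=49, hbp=3, tb=292)
shares = outcome_shares(ichiro)

for name, share in shares.items():
    print(f"{name:>3}: {share:.3f}  ({share:.2%})")
```

Because outs are computed as the remainder, the four shares add up to 1 by construction, which is the whole point of using plate appearances as the denominator.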
Comments
So, in other words, Jorge Posada had an even better season than the excellent season most people thought he had. Impressive.
And, it's really time we get on voters to give the deserving David Ortiz an MVP one of these days...
While I wouldn't say we should replace the way we look at players, because that would mean relearning all the numbers we've picked up over the years about past players, I think adding these stats to the current collection makes a lot of sense. Nice job!
Posted by: Peter at February 5, 2008 8:45 AM
I've been doing this in my own player analysis for some time now, and it has always been a valuable tool.
My initial goal was similar to yours and Mark's: I wanted everything to add up to 1.00, so everything could be viewed in the proper context (that context being: how often does each particular event occur, as a percentage of all events).
To accomplish that end, we have to ignore some outs (GIDP, CS, other base-running outs) and some positive base advancements (SB), but I solved the problem by thinking about statistics as either primary or secondary. For example, BA, OBA, and Out% are all "Primary Performance," and the other outs or bases -- outs and bases that are conditional, like SB, CS, GIDP, etc. -- become "Secondary Performance" or "Conditional Performance."
Looking at a player's Primary Performance will generally give me a good idea of production, but looking at Secondary Performance is always helpful, especially when players are borderline-great/good. (Jim Rice comes to mind. His Secondary numbers are wretched, obviously.)
Anyway, I'm really hoping this "Everything adds up to 1.00" trend catches on.
Posted by: Derek at February 5, 2008 9:44 AM
Nice. I would think one could just combine BB and HBP, though, to make it simpler. One could have a new set of "slash stats" that add up to one.
Posted by: Blastings at February 5, 2008 10:46 AM
I like what you're doing here using plate appearances as a denominator.
However, the idea of expressing the stat as a percentage is something that I doubt the mainstream baseball culture will ever accept. I really don't think it does anything to "help" the mainstream fan accept the OBP concept better. If anything I think it will be far less accepted than OBP.
This is due to the same reasons why the metric system just won't take hold in the US, despite the fact that metric is a more precise and practical system for our base 10 counting. The mere fact that it is different from the established norms makes society resistant to its acceptance. Thus, I really believe that one of the reasons that the mainstream is at least begrudgingly accepting of OBP is that its format is similar to the norm of Batting Average. In other words, just like you can say "Player X has a batting average of three-hundred," you can say, "Player X has an on-base percentage of four-hundred."
Say what you will about how little sense it makes, but until we start measuring fastballs in kilometers per hour, rather than mph, I doubt percentage stats will ever take hold.
Posted by: Kyle at February 5, 2008 5:35 PM
Basing rates on PA is common sense and the way I've evaluated players for my DMB leagues for at least 10 years.
It fully exposes the fraud that batting average is, which is culturally difficult for baseball. Most surprising to those not familiar with the method is how many more hits the low walk/high average hitters get than the high walk/high average hitters.
For situational tactics, it is very useful. For example, if there is a man on second and two outs late in a tie game, your best chance to score that run can be easily calculated based on the next 2-3 hitters' hitting averages and on-base averages (against different pitching matchups with park effects considered, of course!).
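The man-on-second, two-out calculation described above might be sketched like this. It is a rough illustration under loud assumptions, not a method taken from the comment: any hit scores the runner, a walk or hit by pitch only extends the inning, each plate appearance is independent, and errors, matchups, and park effects are ignored. The function name chance_to_score is made up, the rates are the per-PA figures from the article's hitting table, and the batting order is hypothetical.

```python
def chance_to_score(hitters):
    """Chance the runner on second scores with two outs.

    hitters: (hit_rate, free_pass_rate) per plate appearance, in batting
    order, where free_pass_rate is BB% + HBP% as a decimal fraction.
    Assumes any hit scores the runner and any out ends the inning.
    Only the first three hitters are considered.
    """
    total = 0.0
    reach = 1.0  # chance the inning gets this far with the run not yet in
    for position, (hit, free_pass) in enumerate(hitters):
        if position >= 2:
            # Two free passes have loaded the bases, so a third one
            # forces the run home just as a hit would.
            total += reach * (hit + free_pass)
            break
        total += reach * hit
        reach *= free_pass  # only a walk or HBP keeps the inning alive
    return total


# Rates from the article's hitting table; the order is hypothetical.
lineup = [
    (0.2885, 0.0784 + 0.0196),  # Derek Jeter
    (0.2903, 0.1256 + 0.0102),  # Jorge Posada
    (0.2825, 0.0583 + 0.0120),  # Robinson Cano
]
print(f"Chance the run scores: {chance_to_score(lineup):.1%}")
```

Under these assumptions, nearly all of the chance comes from the first hitter's hit rate, because the inning reaches the second hitter only after a walk or hit by pitch.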
Statistics like GB% and FB% should also be based on PAs and are similarly useful.
Posted by: Nick Dowling at February 6, 2008 10:14 AM
Just a quick point. Qualification for hitting rate stats in Major League Baseball is not solely based on obtaining 502 plate appearances. You can qualify if your rate stat still ranks assuming you make an out in all remaining plate appearances necessary to reach 502. With this fact in mind, Barry Bonds' stats with making outs in the 25 additional plate appearances to reach 502 would be:
Hit % 18.73%
BB % 26.29%
HBP % 0.60%
OB % 45.62%
Out % 54.38%
APA .3824
BPA .6514
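The padding adjustment described above can be sketched as follows. This is a minimal illustration of the rule as stated in this comment, not an official formula: the function name padded_rates is made up, and the counting stats (477 plate appearances, 94 hits, 132 walks, 3 hit by pitches) are assumptions inferred from the percentages listed here and the 25 added plate appearances, not figures quoted in the thread.

```python
QUALIFYING_PA = 502  # minimum total plate appearances used in the article


def padded_rates(pa, h, bb, hbp, threshold=QUALIFYING_PA):
    """Outcome rates after treating every missing plate appearance as an out."""
    padded_pa = max(pa, threshold)  # qualified hitters are left unchanged
    on_base = h + bb + hbp
    return {
        "hit": h / padded_pa,
        "bb": bb / padded_pa,
        "hbp": hbp / padded_pa,
        "ob": on_base / padded_pa,
        "out": (padded_pa - on_base) / padded_pa,
    }


# Assumed counting stats, inferred from the percentages in this comment.
bonds = padded_rates(pa=477, h=94, bb=132, hbp=3)

for name, rate in bonds.items():
    print(f"{name:>3} %: {rate:.2%}")
```

Because the added plate appearances all count as outs, every on-base rate can only go down, so a hitter who still leads after padding would have led without it.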
Number One in all of MLB at the age of 43 in highest BB %, highest OB %, and lowest Out %! Still 6th in all of MLB in BPA! How could he still be so good at this late stage of his career if his success from 1999 thru 2005 was tied to performance-enhancing drugs, and with the Feds and everybody else breathing down his neck since then, making it impossible for him to continue to use?
I wonder what numbers a system like PECOTA or any other such system would project for Barry Bonds' supposed darkside years (1999 thru 2005), using his pre-darkside performances for 1986 thru 1998 and his post-darkside performances for 2006 and 2007 as the bookmarks for these projections?
Posted by: giantsrainman at February 6, 2008 4:35 PM
Well, I just wandered over here and noticed this thread. I am glad to see Rich's fine article.
The beauty of plate appearance stats is not only that you can get things to add up to 1, although that is certainly nice. It's that you really can get a feel for how likely something is to happen. If Albert Pujols is up at bat, what is the likelihood that he is going to get a hit? What about a HR? As a fan, this seems like a pretty interesting thing to know while you are sitting and watching the game. Yet with all the dozens of stats that might get flashed on the screen, none of them tell me this very interesting thing.
It's all on a Strat-O-Matic card, of course.
Posted by: Mark Armour at February 6, 2008 8:54 PM