Baseball Beat
February 05, 2008
Explaining Batter Outcomes in Percentage Terms
By Rich Lederer

In the comments thread to Patrick Sullivan's thoughtful article on "Out" With The Old: A Better Way to Look at OBP, Mark Armour made the following observations and statements:

I think Sully is absolutely right here, especially because the denominator is plate appearances, not "at bats".

Another advancement I would like to see everyone rally around is to eliminate the notion of an "at bat". The beauty of OBP, or Sully's OUT%, is that it really does tell you the likelihood of something happening. The denominator of batting average, slugging percentage, home-run percentage should all be "plate appearances".

I brought this up at a SABR meeting a few years ago and someone said, "but that would penalize players who walk," since Ted Williams' "BA" would go down dramatically more than Jim Rice's would. While true, statistics do not penalize anyone; they just try to explain what happened.

I just want all of these percentages (OUT %, batting average, walk %, etc.) to add up to 1.

Mark's comments really hit home with me. More than anything, I like the idea of using plate appearances as the denominator to determine how often a player gets a hit, walk, hit by pitch, or makes an out. In this way, we can measure everything in percentage terms and the beauty of it all is that the four outcomes total 1 if expressed in decimal fractions (as Mark said) or 100% when stated in percentages.

All of these stats would be viewed on the same scale. Therefore, they would be more descriptive and easier to understand. Walks, hit by pitches, and outs would be put in their proper perspective, and each would be accounted for more than ever before.

For example, Ichiro Suzuki's batter outcomes could be expressed in decimal fractions or percentages, as follows:

DECIMAL FRACTIONS

                   HIT       BB     HBP      OUT
Ichiro Suzuki     .323     .067    .004     .606

PERCENTAGES

                   HIT %    BB %   HBP %    OUT %
Ichiro Suzuki     32.34%   6.66%   0.41%   60.60%
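
As a rough sketch of the arithmetic behind these two tables, the four outcome shares can be computed from counting stats with plate appearances as the common denominator. The function name `batter_outcomes` is hypothetical, and the 2007 totals used for Suzuki (238 hits, 49 walks, 3 hit by pitches, 736 total plate appearances) are assumed values that are not stated in the article; they were chosen because they are consistent with the percentages shown above.

```python
def batter_outcomes(hits, walks, hbp, tpa):
    """Return the share of total plate appearances ending in each outcome."""
    if tpa <= 0:
        raise ValueError("tpa must be positive")
    hit = hits / tpa
    bb = walks / tpa
    hbp_rate = hbp / tpa
    # Anything that is not a hit, walk, or hit by pitch is treated as an out,
    # so the four shares add up to 1 by construction.
    out = 1.0 - hit - bb - hbp_rate
    return {"HIT": hit, "BB": bb, "HBP": hbp_rate, "OUT": out}


# Assumed 2007 totals for Ichiro Suzuki (not given in the article).
ichiro = batter_outcomes(hits=238, walks=49, hbp=3, tpa=736)

for name, share in ichiro.items():
    # Print each outcome as a decimal fraction and as a percentage.
    print(f"{name:<5}{share:.3f}  {share:7.2%}")
```

Defining the out share as the remainder, rather than counting outs separately, is what guarantees that the line totals 1 (or 100%).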

To make things even simpler, we could round these outcomes to their closest whole number. Imagine Dave Niehaus, the play-by-play announcer for the Seattle Mariners, describing Suzuki when the lefthanded hitter approaches the plate. "Suzuki produces a hit 32%, a walk or hit by pitch 7%, and an out 61% of the time." Or "Ichiro gets on base in 39% of his plate appearances and makes an out the other 61%."

Let's take a look at last year's leaders in hitting, walk, on-base, and out percentages (minimum of 502 total plate appearances).

TOP 20 IN HITTING PERCENTAGE (H/TPA)

                   HIT %    BB %   HBP %    OUT %
Ichiro Suzuki     32.34%   6.66%   0.41%   60.60%
Magglio Ordonez   31.81%  11.19%   0.29%   56.70%
Placido Polanco   31.20%   5.77%   1.72%   61.31%
Matt Holliday     30.29%   8.84%   1.40%   59.47%
Edgar Renteria    30.20%   8.47%   0.18%   61.14%
Hanley Ramirez    30.03%   7.37%   0.99%   61.61%
Carl Crawford     29.35%   5.10%   0.80%   64.75%
Mike Lowell       29.25%   8.12%   0.46%   62.17%
Michael Young     29.05%   6.79%   0.72%   63.44%
Jorge Posada      29.03%  12.56%   1.02%   57.39%
Chone Figgins     29.03%  10.14%   0.00%   60.83%
Dmitri Young      28.94%   8.66%   0.20%   62.20%
Derek Jeter       28.85%   7.84%   1.96%   61.34%
Chipper Jones     28.83%  13.67%   0.00%   57.50%
Chase Utley       28.71%   8.16%   4.08%   59.05%
Dustin Pedroia    28.40%   8.09%   1.20%   62.31%
Robinson Cano     28.25%   5.83%   1.20%   64.72%
Vladimir Guerrero 28.18%  10.76%   1.36%   59.70%
Aramis Ramirez    28.14%   7.71%   0.72%   63.44%
Alfonso Soriano   28.04%   5.02%   0.65%   66.29%

TOP 20 IN WALK PERCENTAGE (BB/TPA)

                   HIT %    BB %   HBP %    OUT %
Jack Cust         19.92%   20.71%  0.20%   59.17%
Pat Burrell       20.23%   19.06%  0.67%   60.03%
Jim Thome         22.20%   17.72%  1.12%   58.96%
Todd Helton       26.10%   17.01%  0.29%   56.60%
Carlos Pena       22.55%   16.83%  1.63%   58.99%
David Ortiz       27.29%   16.64%  0.60%   55.47%
Ryan Howard       21.91%   16.51%  0.77%   60.80%
Adam Dunn         21.84%   15.98%  0.79%   61.39%
Travis Hafner     21.94%   15.43%  1.06%   61.57%
Rickie Weeks      18.97%   15.42%  2.77%   62.85%
Nick Swisher      21.40%   15.17%  1.52%   61.91%
Albert Pujols     27.25%   14.58%  1.03%   57.14%
J.D. Drew         22.83%   14.31%  0.18%   62.68%
Gary Sheffield    22.09%   14.17%  1.52%   62.23%
Lance Berkman     23.35%   14.07%  1.20%   61.38%
Jason Varitek     21.43%   13.71%  1.54%   63.32%
Chipper Jones     28.83%   13.67%  0.00%   57.50%
Ken Griffey       23.43%   13.64%  0.16%   62.76%
Kevin Millar      21.53%   13.52%  1.42%   63.52%
Grady Sizemore    23.26%   13.50%  2.27%   60.96%

TOP 20 IN OBP/LOWEST OUT PERCENTAGE

                   OBP %    OUT %
David Ortiz       44.53%   55.47%
Todd Helton       43.40%   56.60%
Magglio Ordonez   43.30%   56.70%
Albert Pujols     42.86%   57.14%
Jorge Posada      42.61%   57.39%
Chipper Jones     42.50%   57.50%
Alex Rodriguez    42.23%   57.77%
David Wright      41.63%   58.37%
Jim Thome         41.04%   58.96%
Carlos Pena       41.01%   58.99%
Chase Utley       40.95%   59.05%
Jack Cust         40.83%   59.17%
Matt Holliday     40.53%   59.47%
Vladimir Guerrero 40.30%   59.70%
Miguel Cabrera    40.00%   60.00%
Derrek Lee        40.00%   60.00%
Mark Teixeira     40.00%   60.00%
Pat Burrell       39.97%   60.03%
Prince Fielder    39.50%   60.50%
Ichiro Suzuki     39.40%   60.60%

Total plate appearances, as provided by ESPN's stats, consist of every outcome, including sacrifice flies, sacrifice hits, and catcher's interference. The latter should be factored into OBP but has not been for this exercise. One could argue over the inclusion of SH because a player many times is asked to "give himself up" by the manager. However, in my book, the batter created an out, albeit one that would be more productive than a strikeout or infield flyout. By the same token, I have not counted grounded into double plays as two outs. Doing so would result in the batter outcomes not adding up to 1 in decimal fractions or 100% in percentages. Besides, I believe GIDP, like RBI, should be viewed in the context of opportunities.

We can also look at total bases as a percentage of plate appearances. Rather than calling it slugging average (which is TB/AB), Branch Rickey termed this measure "advancement percentage." This one might be best expressed as a decimal fraction.

TOP 20 IN SLUGGING OR ADVANCEMENT PERCENTAGE (TB/TPA)

                   SLG
Matt Holliday     .541
Alex Rodriguez    .531
Alfonso Soriano   .525
Magglio Ordonez   .521
Prince Fielder    .520
Chipper Jones     .517
David Ortiz       .511
Hanley Ramirez    .508
Carlos Pena       .502
Curtis Granderson .500
Aramis Ramirez    .498
Chase Utley       .489
Jimmy Rollins     .488
Miguel Cabrera    .488
Mark Teixeira     .483
Corey Hart        .481
Ryan Howard       .477
Vladimir Guerrero .476
Carlos Lee        .475
Albert Pujols     .473

Taking slugging or advancement percentage one step further, we can calculate bases per plate appearance (or BPA), defined as [TB+BB+HBP]/TPA. I have excluded SB, CS, and GIDP from this formula. BPA may also be best expressed as a decimal fraction.

TOP 20 IN BASES PER PLATE APPEARANCE

                   BPA
Alex Rodriguez    .695
Carlos Pena       .686
David Ortiz       .684
Prince Fielder    .673
Chipper Jones     .653
Ryan Howard       .650
Matt Holliday     .644
Jim Thome         .642
Magglio Ordonez   .636
Albert Pujols     .629
Adam Dunn         .625
Mark Teixeira     .621
Miguel Cabrera    .612
Chase Utley       .612
David Wright      .605
Jorge Posada      .603
Jack Cust         .602
Brad Hawpe        .597
Vladimir Guerrero .597
Pat Burrell       .594
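
The two rate stats in this part of the article, advancement percentage (TB/TPA) and bases per plate appearance ([TB+BB+HBP]/TPA), can be sketched in a few lines. The function names are hypothetical, and the 2007 totals used for Alex Rodriguez (376 total bases, 95 walks, 21 hit by pitches, 708 total plate appearances) are assumed values that do not appear in the article; they are consistent with the .531 and .695 listed for him above.

```python
def advancement_pct(total_bases, tpa):
    """Total bases per plate appearance (TB/TPA)."""
    return total_bases / tpa


def bases_per_pa(total_bases, walks, hbp, tpa):
    """Bases per plate appearance: (TB + BB + HBP) / TPA.

    SB, CS, and GIDP are left out, matching the definition in the article.
    """
    return (total_bases + walks + hbp) / tpa


# Assumed 2007 totals for Alex Rodriguez (not given in the article).
arod = {"total_bases": 376, "walks": 95, "hbp": 21, "tpa": 708}

apa = advancement_pct(arod["total_bases"], arod["tpa"])
bpa = bases_per_pa(**arod)

print(f"Advancement percentage: {apa:.3f}")
print(f"Bases per plate appearance: {bpa:.3f}")
```

Because both stats share the TPA denominator, the gap between them is simply the walk and hit-by-pitch share from the outcome tables.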

I realize that not all bases are created equal. A single is worth more than a walk, two singles are worth more than a double, and two doubles are worth more than a home run. Linear weights captures these finer points, but the differences are minor in the scope of the bigger picture.

In any event, I'm getting far afield from the original idea of expressing batter outcomes as a percentage of plate appearances. I believe this approach would serve to de-emphasize batting average while raising the awareness and value of walks, outs, and on-base percentage. If nothing else, it would be a good first step in highlighting what is and what isn't important when it comes to batting statistics.

Comments

So, in other words, Jorge Posada had an even better season than the excellent season most people thought he had. Impressive.

And, it's really time we get on voters to give the deserving David Ortiz an MVP one of these days...

While I wouldn't say we should replace the way we look at players, because that would mean relearning all the numbers we've picked up over the years about past players, I think adding these stats to the current collection makes a lot of sense. Nice job!

I've been doing this in my own player analysis for some time now, and it has always been a valuable tool.

My initial goal was similar to yours and Mark's: I wanted everything to add up to 1.00, so everything could be viewed in the proper context (that context being: how often does each particular event occur, as a percentage of all events).

To accomplish that end, we have to ignore some outs (GIDP, CS, other base-running outs) and some positive base advancements (SB), but I solved the problem by thinking about statistics as either primary or secondary. For example, BA, OBA, Out% are all "Primary Performance," and the other outs or bases -- outs and bases that are conditional, like SB, CS, GIDP, etc. -- become "Secondary Performance" or "Conditional Performance."

Looking at a player's Primary Performance will generally give me a good idea of production, but looking at Secondary Performance is always helpful, especially when players are borderline-great/good. (Jim Rice comes to mind. His Secondary numbers are wretched, obviously.)

Anyway, I'm really hoping this "Everything adds up to 1.00" trend catches on.

Nice. I would think one could just combine BB and HBP, though, to make it simpler. One could have a new set of "slash stats" that add up to one.

I like what you're doing here using plate appearances as a denominator.

However, the idea of expressing the stat as a percentage is something that I doubt the mainstream baseball culture will ever accept. I really don't think it does anything to "help" the mainstream fan accept the OBP concept better. If anything I think it will be far less accepted than OBP.

This is due to the same reasons why the metric system just won't take hold in the US, despite the fact that metric is a more precise and practical system for our base 10 counting. The mere fact that it is different from the established norms makes society resistant to its acceptance. Thus, I really believe that one of the reasons that the mainstream is at least begrudgingly accepting of OBP is that its format is similar to the norm of Batting Average. In other words, just like you can say "Player X has a batting average of three-hundred," you can say, "Player X has an on-base percentage of four-hundred."

Say what you will about how little it makes sense, but until we start measuring fastballs in kilometers per hour, rather than mph, I doubt percentage stats will ever take hold.

Basing on PA is common sense and the way I've evaluated players for my DMB leagues for at least 10 years.

It fully exposes the fraud that batting average is, which is culturally difficult for baseball. Most surprising to those not familiar with the method is how many more hits the low walk/high average hitters get than the high walk/high average hitters.

For situational tactics, it is very useful. For example, if there is a man on second and two outs late in a tie game, your best chance to score that run can be easily calculated based on the next 2-3 hitters' hitting averages and on base averages (against different pitching matchups with park effects considered of course!).
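
One way to sketch that calculation, under deliberately simple assumptions that are not part of the comment above: any hit scores the runner from second, a walk or hit by pitch just passes the turn to the next hitter, any out ends the inning, and errors, wild pitches, pitching matchups, and park effects are ignored. The function name `chance_to_score` is hypothetical. The per-plate-appearance rates are taken from the article's tables for Ichiro Suzuki and Jack Cust, used here only to contrast a high-average hitter with a high-walk hitter.

```python
def chance_to_score(hitters):
    """Estimate the chance the runner on second scores with two outs.

    hitters: list of (hit_rate, free_pass_rate) per plate appearance,
    in batting order. Only the listed hitters are considered.
    """
    total = 0.0
    reached = 1.0  # probability the inning is still alive for this hitter
    for hit_rate, free_pass_rate in hitters:
        total += reached * hit_rate   # assumed: any hit scores the runner
        reached *= free_pass_rate     # a walk or HBP brings up the next hitter
    return total


# (HIT %, BB % + HBP %) as decimal fractions, from the tables in the article.
ichiro = (0.3234, 0.0666 + 0.0041)
cust = (0.1992, 0.2071 + 0.0020)

ichiro_first = chance_to_score([ichiro, cust])
cust_first = chance_to_score([cust, ichiro])

print(f"Suzuki up first: {ichiro_first:.3f}")
print(f"Cust up first:   {cust_first:.3f}")
```

In this toy model the hit rate of the first hitter dominates, which is the commenter's point about per-plate-appearance hitting averages being the number to look at when a hit is what you need.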

Statistics like GB% and FB% should also be based on PAs and are similarly useful.

Just a quick point. Qualification for hitting rate stats in Major League Baseball is not solely based on obtaining 502 plate appearances. You can qualify if your rate stat still ranks assuming you make an out in all remaining plate appearances necessary to reach 502. With this fact in mind, Barry Bonds' stats, with outs charged for the 25 additional plate appearances needed to reach 502, would be:

Hit % 18.73%
BB % 26.29%
HBP % 0.60%
OB % 45.62%
Out % 54.38%
APA .3824
BPA .6514
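
The adjustment described above amounts to raising the denominator to 502 while leaving every numerator alone, so each missing plate appearance is charged as an out. Here is a minimal sketch; the function name `padded_rates` is hypothetical, and the 2007 totals used for Bonds (94 hits, 132 walks, 3 hit by pitches, 192 total bases, 477 total plate appearances) are assumed values that are not stated in the comment. They come out very close to the figures listed above.

```python
QUALIFYING_PA = 502  # minimum used for the leaderboards in the article


def padded_rates(hits, walks, hbp, total_bases, tpa, minimum=QUALIFYING_PA):
    """Rate stats with any shortfall below `minimum` counted as outs."""
    # Padding only grows the denominator; the numerators are unchanged.
    denom = max(tpa, minimum)
    on_base = hits + walks + hbp
    return {
        "HIT": hits / denom,
        "BB": walks / denom,
        "HBP": hbp / denom,
        "OB": on_base / denom,
        "OUT": 1.0 - on_base / denom,
        "APA": total_bases / denom,
        "BPA": (total_bases + walks + hbp) / denom,
    }


# Assumed 2007 totals for Barry Bonds (not given in the comment).
bonds = padded_rates(hits=94, walks=132, hbp=3, total_bases=192, tpa=477)

for name in ("HIT", "BB", "HBP", "OB", "OUT"):
    print(f"{name:<4}{bonds[name]:7.2%}")
for name in ("APA", "BPA"):
    print(f"{name:<4}{bonds[name]:.4f}")
```

A hitter who already has 502 or more plate appearances is unaffected, since `max(tpa, minimum)` leaves the denominator as is.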

Number One in all of MLB at the age of 43 in highest BB %, highest OB %, and lowest Out %! Still 6th in all of MLB in BPA! How could he still be so good at this late stage of his career if his success from 1999 thru 2005 was tied to performance enhancing drugs and with the Feds and everybody else breathing down his neck since then making it impossible for him to continue to use?

I wonder what numbers a system like PECOTA or any other such system would project for Barry Bonds' supposed darkside years (1999 thru 2005) using his pre-darkside performances for 1986 thru 1998 and his post-darkside performances for 2006 and 2007 as the bookmarks for these projections?

Well, I just wandered over here and noticed this thread. I am glad to see Rich's fine article.


The beauty of plate appearance stats is not only that you can get things to add up to 1, although that is certainly nice. It's that you really can get a feel for how likely something is to happen. If Albert Pujols is up at bat, what is the likelihood that he is going to get a hit? What about a HR? As a fan, this seems like a pretty interesting thing to know while you are sitting and watching the game. Yet with all the dozens of stats that might get flashed on the screen, none of them tell me this very interesting thing.

It's all on a Strat-o-matic card, of course.