The Predictive Value of Bases on Balls
Since the introduction of sophisticated baseball analysis twenty-five or so years ago, analysts have recognized that on-base percentage (OBP) is the most informative statistic when evaluating a batter. That is, of all the traditional statistics, OBP best reflects a batter's contribution to run scoring. Because the main difference between OBP and batting average is that the former includes bases on balls while the latter does not, the value of a walk has gained increased recognition in recent years.
Not surprisingly, as analysts gained an appreciation for the value of the base on balls, the ability to draw a walk began to be given weight in contexts where it may be less appropriate. Some began to look at a minor leaguer's propensity to walk as a predictor of major league success. The theory behind this idea derived mainly from the belief that a high walk total signified good plate discipline and strike zone judgment. The existence of these latter two skills, in turn, suggested a player had a higher likelihood of continuing to develop as a hitter.
I never really liked this theory. In the base data set I used in my work on modeling player careers, there was one player who debuted in the majors with a horrible walk ratio but later evolved into a Hall of Famer with a decent walk rate. Willie Stargell walked only 17 times in 438 plate appearances (excluding hit by pitch or sacrifice flies) in 1964, while recording a .304 OBP and .501 SLG. Seven years later Stargell had a huge season, leading the Pirates to the championship with a .398 OBP and .628 SLG. That year he drew 83 bases on balls in 594 plate appearances.
Of course one counterexample proves nothing, but it suggested to me that young players who walk only infrequently can develop into stars as well. Furthermore, one can easily imagine a theory supporting this opposing view: a young player who rarely walks has additional room for improvement as he further learns the strike zone (and some players, like Alfonso Soriano, keep hitting without ever learning the strike zone).
To test whether the ability to draw a walk as a youngster leads to a higher propensity to evolve into a quality major leaguer, I looked at a number of minor leaguers and compared how they developed. Specifically, I looked for pairs of regular players of the same age, minor league level (AAA, AA, A+, or A), and ability who differed significantly in their likelihood of walking. I generally looked for pairs that differed by at least .060 BB/PA, although most had a larger difference. I found 31 such pairs using the years 1998 and 1999; the average BB/PA for the low walk players was .047 (19 BB per 400 PA), while the average for the high walk players was .130 (52 BB per 400 PA). I have highlighted one pair below as an example.
Player              Level  Year  Age  MLEOW  OW25  BB/PA
Encarnacion, Mario  AA     1999  21   .515   .620  0.118
Barrett, Michael    AA     1998  21   .518   .623  0.056
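The matching rule described above can be sketched in Python. This is only an illustration, not the procedure actually used for the study: the `Season` class, its field names, and in particular the `max_ow25_gap` tolerance are assumptions, since the article says only that paired players were of the same "ability." The .060 BB/PA minimum gap and the example pair come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Season:
    """One minor league season. Field names are illustrative, not from the study."""
    name: str
    level: str        # "AAA", "AA", "A+", or "A"
    age: int
    ow25: float       # projected Offensive Winning Percentage at age 25
    bb_per_pa: float  # walks per plate appearance


def is_matched_pair(a, b, max_ow25_gap=0.010, min_walk_gap=0.060):
    """True if two seasons share age and level, have similar OW25,
    and differ in walk rate by at least min_walk_gap.

    max_ow25_gap is a hypothetical tolerance chosen for this sketch.
    """
    return (
        a.level == b.level
        and a.age == b.age
        and abs(a.ow25 - b.ow25) <= max_ow25_gap
        and abs(a.bb_per_pa - b.bb_per_pa) >= min_walk_gap
    )


def walks_per_400_pa(bb_per_pa):
    """Convert a walk rate into walks per 400 plate appearances."""
    return round(bb_per_pa * 400)


# The example pair from the table above.
encarnacion = Season("Encarnacion, Mario", "AA", 21, 0.620, 0.118)
barrett = Season("Barrett, Michael", "AA", 21, 0.623, 0.056)

print(is_matched_pair(encarnacion, barrett))            # True
print(walks_per_400_pa(0.047), walks_per_400_pa(0.130)) # 19 52
```

Note that the season year is deliberately not part of the match: the example pair comes from 1999 and 1998.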
To approximate ability I used Offensive Winning Percentage at age 25 (OW25), a statistic discussed at length in Paths to Glory, a book I co-authored with Mark Armour. Offensive Winning Percentage was developed by Bill James twenty-five years ago to measure the contribution of a player's batting statistics within the context of the game. Offensive Winning Percentage attempts to estimate the winning percentage of a team with eight other hitters of equal ability and league average pitching and fielding.
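For readers unfamiliar with the statistic, here is a minimal sketch of the commonly cited Pythagorean form of Offensive Winning Percentage. It is an assumption that this simple form is close to what was used here; the version in Paths to Glory may include park and other adjustments that are not shown. The function name, parameter names, and the exponent of 2 are illustrative choices.

```python
def offensive_winning_pct(rc_per_27_outs, league_runs_per_game, exponent=2.0):
    """Pythagorean estimate of the winning percentage of a lineup made up
    entirely of one hitter, facing league-average pitching and fielding.

    rc_per_27_outs: the hitter's runs created per 27 outs (one game's worth).
    league_runs_per_game: league-average runs scored per team per game.
    exponent: 2.0 is the classic choice; treated here as an assumption.
    """
    scored = rc_per_27_outs ** exponent
    allowed = league_runs_per_game ** exponent
    return scored / (scored + allowed)


# A league-average hitter comes out at exactly .500 by construction.
print(offensive_winning_pct(4.5, 4.5))  # 0.5
```

Under this form, a hitter who creates runs at the league-average rate scores .500, and the figure rises or falls as his run creation moves above or below the league context.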
James further introduced the concept of minor league equivalencies in the 1985 Baseball Abstract. He demonstrated that minor league batting statistics are meaningful and could be translated into major league equivalents. The key for making sense of minor league statistics, as with many of baseball's other statistical issues, is context. The three contextual items that one must consider in minor league player evaluation consist of the player's age, the level of the league, and the run context the team plays in, including the average runs scored per game in the league and effect of the team's home park.
For the analysis of minor league players I first convert the player's season statistics to a major league equivalent offensive winning percentage (MLEOW). And second, I adjust the MLEOW based on the player's age to predict what his Offensive Winning Percentage at the major league level will be at 25. Using this metric for all players provides a common evaluation point at an age by which most quality players have made their major league debut.
The pairs were then compared three years later to check if one group or the other improved more dramatically. To compare the two groups I took a simple average (not weighted by plate appearances) of their OW25 three years later. [Note: After three years some of these players had been promoted to the majors, so no minors to majors adjustment was needed.] To jump to the conclusion, there is little difference between the development of the two groups. Superficially, the high walk group seemed to exhibit a higher level three years later.
Type       Number  OW25  OW25 y+3
Low Walk   31      .392  .259
High Walk  31      .394  .322
For several reasons, two theoretical and one practical, I do not believe the difference above reflects a real difference in the development of players. On the theoretical side, a number of the marginal minor leaguers were receiving only limited plate appearances (leading to a wide range of non-representative OW25 values due to the small sample sizes), potentially skewing the simple averages. Second, adjusting for players no longer in Organized Baseball (OB) is a little tricky. If the average OW25 of the 62 players is close to .400, averaging in a zero for players out of OB will tend to artificially depress the numbers for a group. If these players remained in OB they would probably be below average, but not zero. In this sample the low walk group had nine players out of OB three years later, while the high walk group had six.
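A rough way to see how much the zeros matter is to back them out of the simple averages in the table above. The sketch below is my own arithmetic rather than part of the original analysis. It assumes that each player out of OB was counted as exactly zero, and it takes the out-of-OB counts as nine for the low walk group and six for the high walk group; the function name is illustrative.

```python
def mean_excluding_zeros(group_mean, n_players, n_zeros):
    """Recover the average of the non-zero entries, given a simple average
    that included n_zeros players counted as zero."""
    total = group_mean * n_players
    return total / (n_players - n_zeros)


# OW25 three years later, from the table above (31 players per group).
low_walk = mean_excluding_zeros(0.259, 31, 9)
high_walk = mean_excluding_zeros(0.322, 31, 6)

print(f"{low_walk:.3f} {high_walk:.3f}")  # 0.365 0.399
```

On these assumptions the gap between the groups shrinks from .063 to about .034 once the zeros are removed, although the small-sample players from the first concern are still included.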
To adjust for the first concern, I restricted the comparison to players who, three years later, had either more than 250 PA or none at all. I still included players with no plate appearances because being out of the league did indicate a lack of improvement. Making this first adjustment slightly narrowed the difference between the groups.
Type       Number  OW25 y+3
Low Walk   30      .262
High Walk  22      .312
On the practical side, one of the reasons why I do not believe that the above two tables suggest high walk players develop better is that there is not an equal likelihood for all players to move up to the major leagues. Poor minor league hitters are unlikely to make the majors regardless of whether they can draw a walk.
Type       Number  OW25  OW25 y+3
Low Walk   9       .538  .398
High Walk  9       .539  .389
Finally, I evaluated only the top prospects: those with an OW25 of at least .500. Only ten players (five in each group) hit this well. For these minor league hitters, the low walk group actually held its value better, although I wouldn't read too much into such a small sample size.
Type       Number  OW25  OW25 y+3
Low Walk   5       .596  .421
High Walk  5       .596  .371
One interesting aside of this analysis is that it gave me a chance to test my adjustments in calculating OW25, the estimated offensive winning percentage of a player at age 25. Obviously, individual players' performance varies widely over a three-year period, but for a group of players the value should remain fairly consistent. In fact this seems to be the case. The 37 players who had 250 or more PA three years later recorded an OW25 of .397; three years earlier those same 37 had an OW25 of .421. As can be seen above, all 62 averaged an OW25 of .393. In other words, the adjustments seem to reflect player aging fairly well. One can also see some regression to the mean, however, in the last two tables above: on average, players recording a very high OW25 in a single year do not seem to be able to hold that level as they move through the minor league system.
In terms of player value, baseball analysts have been instrumental in highlighting the value of a walk. A walk contributes to team run scoring, and the sabermetric offensive statistics include walks in calculating the run contribution of a batter. But this value should not be confused with some sort of predictive significance for young ballplayers, at least not until some additional research suggests otherwise.
Dan Levitt is the co-author of Paths to Glory, winner of the 2004 Sporting News-SABR Baseball Research Award. He manages the capital markets for a national commercial real estate firm.
[Additional reader comments and retorts at Baseball Primer.]
Interesting article, but the Barrett-Encarnacion example brings up a problem with any age matching study. Encarnacion aged by 2 years in the Age-gate scandals. He was actually 23 in 1999.
Unfortunately, I'm not sure how much we can actually trust any ages from Dominican players prior to the 9/11 crackdown.
It's possible that any study like this that requires age matching should simply exclude all Dominican players since their birth records are not sufficiently reliable.
Posted by: philly at August 25, 2005 6:07 AM
One issue I have is using OWP as the indicator of ability.
If player A walks 10 times a year, and player B walks 60 times a year, but they have the same OWP, player A is much better at everything but walking. They aren't really equivalent hitters at all.
I would think it would be better to look at players that were similar in all other respects (look at the counting numbers, adjusting for park if you can, etc.) except for walking and then compare their development.
I could be missing something here, but I'm fairly certain that's how Bill James used to do it with his matched set similarity score studies.
Once you have the matched sets, I'd use something like offensive Win Shares or offensive WARP3 (BRAR?) totalled to determine value down the line, as players washing out of the league is a very important part of the study. One of the big reasons people say that guys who don't walk will not be as productive is that they think the lack of strike zone judgement will cause just that.
I know Sickels says that if you divide them into four groups (high K/high BB; low K/low BB; high K/low BB; low K/high BB), while everyone wants the -K/+BB guy and no one wants the +K/-BB guy, he prefers the -K/-BB guy to the +K/+BB guy, all other things equal, which I'm sure would surprise some people. So that being said, how often did Stargell whiff in 1964? :-)
I'd love to see a similar study, using these four groups, perhaps on AA players in the same league in a season or something? Is there anywhere to get the data for a minor league season like 1998 in spreadsheet format?
Posted by: Joe Dimino at August 25, 2005 11:55 AM
Nice study. It raises an important question. Maybe the only thing we can say is that guys who have a tendency to walk a lot in the minors will tend to do so as well in the majors but not necessarily improve more in other areas than the low walk guys. But even this might need to be tested.
It could be that we would still want to know the walk totals of minor league hitters. If two guys both hit .300 with 25 HRs in AA, but one walks 50 times and the other 100, the latter guy might be the better prospect if he will continue to walk more in the future. Maybe that raises another question. Do minor league walk rates predict major league walk rates better than minor league HR rates predict major league HR rates?
Posted by: Cyril Morong at August 25, 2005 3:33 PM