Measuring the Taint: Steroids and the Court of Public Opinion
By Sky Andrecheck

After the Mitchell Report was released last year, baseball hoped to put its steroid past behind it. However, with this year's allegations of Alex Rodriguez and Manny Ramirez both juicing, steroids are once again back on baseball's front burner.

How A-Rod and Manny's legacies will be tainted by steroid accusations remains to be seen, but one of the questions for fans, baseball media, and Hall of Fame voters is how to treat alleged steroid users in the steroid age. While no player has actually been tried for taking steroids, all players stand in front of the court of public opinion, and this court, fair or not, will determine a player's legacy.

While the list of players somehow connected with steroids has grown to over 125 according to Baseball's Steroid Era, some alleged users seem to have escaped the taint and shame that comes with steroid use, while others have felt the full wrath of public scorn crash upon them. While I was watching a nationally televised early season game this year between the Cubs and Cardinals, the announcers lauded the amazing feel-good story of Rick Ankiel, the wild pitcher turned slugger, while conveniently not mentioning that he completed the transformation with the help of Human Growth Hormone. Ankiel had a prescription from a doctor and was not banned by Major League Baseball, but he still took HGH - it seems that he has been given a pass where other HGH users have been vilified - at least according to Miller and Morgan.

But while no polls of fans' perceptions have been taken, it got me thinking about how tainted certain ballplayers were due to steroids. While a poll might be ideal, another measure of steroid taint might be how many mentions of steroids linked with a player are in the media. Another might be how often fans refer to a player as a steroid user. Where's one place that the media and fans intersect to provide commentary on baseball? The internet of course.

One way of measuring the steroids stain is by using the all-powerful Google. To get a player's baseline number of mentions, I put a player's name in quotes and searched for all references within the past year. Then, to measure the stain of steroids, I searched for that player's name with the word "steroids" next to it and took note of how many hits were found in the last year for that search. Dividing the number of hits for a player and steroids, by the number of hits for the player overall, gives an estimate of the "percent tainted" for a particular player. I limited the searches to references within the last year to eliminate hits for that player before the steroids were found, as well as to give the controversy time to calm down - we're not as interested in how widely reported the story was at the time it broke, but in how a player is perceived after some time has passed.
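For anyone who wants to repeat the arithmetic, here is a minimal sketch of the ratio described above. The function name and the hit counts are made up for illustration; they are not the actual search results, and the sketch assumes you have already read the two hit counts off the search pages by hand.

```python
def percent_tainted(hits_with_steroids: int, hits_total: int) -> float:
    """Return the share of a player's hits that also mention steroids, in percent.

    hits_total: hits for the quoted player name over the past year.
    hits_with_steroids: hits for the quoted name plus the word "steroids".
    """
    if hits_total <= 0:
        raise ValueError("hits_total must be positive")
    if hits_with_steroids < 0:
        raise ValueError("hits_with_steroids cannot be negative")
    return 100.0 * hits_with_steroids / hits_total


# Hypothetical counts for illustration only, not real search data.
example = percent_tainted(hits_with_steroids=1_900, hits_total=10_000)
print(f"{example:.1f}% tainted")
```

Note that nothing here caps the result at 100%, so a value above 100 is a sign that the two counts are not comparable and should be treated with suspicion.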

Obviously this is an inexact science - the number of hits changes over time, and is subject to the unknown inner workings of Google. And of course, if you've ever searched for something on the internet before, you'll know that sometimes you get results that aren't what you want - a hit from a search of Ankiel and steroids might talk about Ankiel in one place and mention steroids in a totally different context further down the page. Ideally, we'd like to filter those out, but this method should still give decently accurate results.

Another potential problem was that if there was recent news on the particular player and steroids, this tended to give some bizarre results - Tejada recently pled guilty to lying to Congress, and for a few weeks there were more hits for Tejada and steroids than for Tejada alone. Now that inconsistency seems to have gone away. I'm not sure how this happens, but it's reason for caution when there has been recent news surrounding a player. For this reason, Manny Ramirez and A-Rod are not in the table below - the verdict is still out on how their usage will affect their legacies. The data I present here is about a week or so old - hopefully things haven't changed much.

For what it's worth, here is the table of the "Percent Tainted" for the alleged or proven steroid users. It's not a comprehensive list, but covers the highest profile players along with their usage and the source of their allegations.


Is there a pattern that can explain why some players seem to be more tainted than others? Not surprisingly, it's Bonds that tops the list. He's followed by Palmeiro, Clemens, and Caminiti, all high-profile steroids cases to be sure. A few guys, Knoblauch, Hill, Neagle, are high on the list, but are probably more an artifact of the method than of real public perception. These were players who were out of baseball and out of the public eye when their names surfaced in the Mitchell Report, leading to a high percentage of recent hits linking them to steroids. On the other hand, this didn't seem to affect Fernando Vina or David Justice, who were also out of baseball when the report surfaced, yet whose percentages were fairly low.

Of the other Mitchell Report guys, some players got off relatively easily. You don't hear much about Eric Gagne's steroid use, and the Google data backs this up, at only 19% tainted. Gary Matthews Jr. and Brendan Donnelly also seemed to get a pass from the public. Why I'm not sure, but my perceptions seem to match the Google data - the guys at the bottom of the list aren't guys you generally associate with steroids, even though there's evidence that they did them. Meanwhile, the guys at the top are the players I tend to link with steroids more readily.

Players that were simply rumored to have juiced, or were implicated via hearsay, were less likely to be judged harshly by the public. A guy like Bret Boone, whose numbers surely would indicate a steroid user, but who was implicated only by Jose Canseco, came in fairly low at 21%. Ivan Rodriguez and Magglio Ordonez were even lower. Puzzling is Canseco himself, who was 30% tainted - high but not as high as some others - even though he seems to have made his entire existence revolve around steroids.

At the bottom of the list is our man Rick Ankiel, who was found to have taken HGH, but claimed he had a good reason for it. The ESPN announcers weren't the only ones giving Ankiel a pass; it seems that most others did as well.

In general, it seems that the players who kept a low profile - no lawsuits, no interviews, no public outrage - fared the best. Guys like Clemens or Palmeiro, admittedly bigger stars to begin with, tried to refute the claims and ended up high on the list. It also seems best not to be linked with one of those guys - Andy Pettitte probably handled his situation as best he could, but being linked with Clemens assured his own use would be brought up time and time again. Ditto with Benito Santiago and Bonds.

While it's interesting to see the perception of players who have already been busted, we can use the same method to try to track which players - past and present - are most perceived to have taken steroids, even if no actual evidence or credible allegations have been made. This isn't a witch hunt, but rather simply taking measure of who the public suspects of possibly taking steroids.

For this, we must take the additional step of manually separating results that actually suspect a player of steroid use from results that, say, merely have a player commenting on steroid use without any implication at all. A search of Derek Jeter and steroids may turn up a lot of results, but they will be talking about him in relation to A-Rod’s use, not suspecting Jeter of steroid use himself. To do this, I manually looked at the first 20 hits, noted which were relevant suspicions and which were not, and proportionally scaled back the "taint percentage." To be fair, I also went back and did this for the proven steroid users, so the table above reflects this methodology as well. It's not foolproof to be sure, and it's somewhat subjective, but it's a way to combat the above problem.
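The proportional scaling can be sketched roughly as follows. This is only my reading of "proportionally scaled back": multiply the raw percentage by the fraction of sampled hits that were judged relevant. The function name and the example figures are hypothetical, and the relevance judgments themselves still have to be made by hand.

```python
def adjusted_taint(raw_percent: float, relevant_hits: int, sampled_hits: int = 20) -> float:
    """Scale a raw taint percentage by the share of sampled hits judged relevant.

    raw_percent: the unadjusted "percent tainted" figure.
    relevant_hits: how many of the sampled hits actually suspect the player.
    sampled_hits: how many top hits were inspected by hand (20 in the article).
    """
    if sampled_hits <= 0:
        raise ValueError("sampled_hits must be positive")
    if not 0 <= relevant_hits <= sampled_hits:
        raise ValueError("relevant_hits must be between 0 and sampled_hits")
    return raw_percent * relevant_hits / sampled_hits


# Hypothetical example: a raw figure of 12%, with 5 of the first 20 hits relevant.
print(f"{adjusted_taint(12.0, relevant_hits=5):.1f}% tainted after filtering")
```

With only 20 hits inspected, each hit moves the adjustment by 5 percentage points of the raw figure, which is one reason the adjusted numbers should be read loosely.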

Below is a table of players who have never been actually reported to have used steroids, and their taint percentages. The list consists of big power hitters and a few other all-star type players - the type of player who usually falls under suspicion, or at least attracts the attention of fans.


The most suspected, but never proven, player of all is not surprisingly Sammy Sosa. He's always been a face of the steroid era, despite never having actually been linked to using them, and his percent tainted is larger than most players who actually have been proven to take steroids. The other biggest suspicions seem to be based largely on statistics, which makes sense in light of the lack of actual evidence. Brady Anderson, Luis Gonzalez, Andruw Jones, and Adrian Beltre all had bizarre seasons of extremely high or extremely low production, presumably leading to their steroid suspicion.

Still, the lack of hard evidence leaves these players well below the average taint of players with actual allegations against them. Among active sluggers, David Ortiz and Albert Pujols, whom many regard as the greatest clean slugger, are not above suspicion either. As luck would have it, I was playing around with these numbers the day before the Manny Ramirez steroids story broke - he was pulling around 5% - which would have made him one of the more suspected sluggers in the game today.

Of course, these numbers are not hard and fast - a couple of wackos making baseless allegations can significantly increase the % tainted in the table above so there's probably a fairly large variance to these numbers - but of course baseless allegations are exactly the type of thing we are trying to measure.

While I'd really like to see a public opinion poll of baseball fans asking how much they thought a variety of players were helped by steroids, this admittedly flawed method seems to be a decent approximation for the public's opinion on many players. My main concern is my lack of knowledge about Google’s inner workings, and how these percentages might fluctuate for unknown reasons. Still, it’s pretty interesting to see how players stack up. It will be interesting, as time goes on, to see how the perception of players changes. For some the scandal may fade away, while others may be permanently branded as cheaters. The lists above may give an indication of which players will be which.


Measuring the taint...


Intelligent discussion? I will have none of that.

A matter wanting consideration is the differential treatment by the media of various players: most notably, there has been a press vendetta against Barry Bonds that goes back to well before the least hint of PED use (indeed, to before PED use was an issue). Sports Illustrated was at the forefront, but they were hardly alone. Several discussions of that bias and vendetta can be found at the "Steroids and Baseball" web site media-links page.

Canseco's relatively low number can be attributed to two things: 1) he confessed to using, everyone knows that, so there's no reason to search him (whereas people search Boone to see if he's ever been caught); and 2) he does so many other high profile things (boxing, MMA, etc.) that people probably search him with those terms.

What about Bagwell? Surely he should be in there?