Touching Bases
September 30, 2009
Thoughts from the 2009 New England Symposium on Statistics in Sports
By Jeremy Greenhouse

On Saturday, Harvard hosted NESSIS, a gathering of sports statisticians that could be billed as the little brother of the sports analytics conference at MIT, only geekier. I say that as a compliment.

Academia vs. Industry (vs. Internet?)

A couple of the best of both worlds were on display, as two names I've become familiar with, Shane Jensen and Tom Tippett, presented their analysis. Tippett, Director of Baseball Information Services for the Red Sox, presented research on special baseball tactics such as the bunt and stolen base. His findings often departed from conventional sabermetric wisdom. A base stealer must steal at a clip of at least 70% to be deemed successful? Well, the break-even rate fluctuates wildly based on the game state. With nobody out in a one-run game, the break-even rate of stealing second is only 54%. In a two-plus run game, it’s 84%. With regard to the bunt, Tippett found that good bunters should continue to bunt thanks to the possibility of an error or hit, and that in the context where you’re playing for one run, bunting is often sensible depending on the hitter and upcoming batters. It seems to me that the guys who wrote The Book came to similar conclusions. Of course, the real stuff Tippett does for the Red Sox is proprietary and can hardly be discussed.
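
For anyone who wants the arithmetic behind a break-even rate, here is a minimal sketch. It is not Tippett's method, which wasn't shown in that detail. The function name and the three input values are my own assumptions: each "value" is the run expectancy or win probability of a game state, and the numbers in the example are made up for illustration, not taken from any real table.

```python
def break_even_rate(value_now, value_success, value_failure):
    """Success rate at which attempting the steal neither helps nor hurts.

    Each argument is the value (run expectancy or win probability) of a
    game state: before the attempt, after a successful steal, and after
    the runner is caught. Solves p * gain == (1 - p) * loss for p.
    """
    gain = value_success - value_now   # what a successful steal adds
    loss = value_now - value_failure   # what getting caught costs
    if gain <= 0 or loss < 0:
        raise ValueError("expected value_failure <= value_now < value_success")
    return loss / (gain + loss)


# Hypothetical values, for illustration only.
rate = break_even_rate(value_now=0.85, value_success=1.10, value_failure=0.25)
print(f"break-even success rate: {rate:.1%}")
```

Because the three values change with the score, inning, and outs, the same formula can give very different break-even rates in different game states, which is the point Tippett was making.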

Tippett said that the more he studied an issue, the more often he found that managers tended to be right, without even knowing the data. Mike Zarren, a statistician for the Celtics, agreed. Zarren brought up two points of interest. First, he said that the reason it's hard for people within his industry and those from academia to collaborate is that academics are always interested in publishing while teams need to keep their research private. Second, Zarren was fond of mentioning the fact that the Celtics led the league in technical fouls last year, and that was before signing Rasheed Wallace. I pray for Tommy Heinsohn’s health.

Meanwhile, Jensen was one of three baseball analysts representing academics from the Wharton School at UPenn who presented their work. In a comparison of fielding metrics, Jensen's SAFE was deemed the most statistically advanced defensive metric publicly available. However, the guys on the Internet who distribute their data for free, in the forms of UZR and PMR, hold their own. Jensen's system also showed that Derek Jeter has been a subpar fielder in the past, so I have to question whether Jensen has an anti-New York bias, whether he's ever watched baseball, and the credentials of the Department of Statistics at the Wharton School.

Talking About Practice?

The topic of team practice was addressed by Gilbert Fellingham, a statistician at Brigham Young University and volleyball enthusiast. Fellingham studied point-by-point volleyball data to see what skills matched up best with results. He determined that, for instance, women’s volleyball teams should spend more time on their transition offense. Of course, there are some skills that are important but are difficult to improve upon, even with countless hours of practice. I’d imagine that every baseball player has a different skill that they should practice, but how can we quantify it? We can quantify player performance and we can detect player weaknesses, but we don't know which areas of weakness can be most efficiently improved upon through practice. I have no idea whether there's a uniform practice structure among teams or whether some teams have specific agendas.

The Lesser Sports

Benjamin Alamar of JQAS, in researching play-calling in the NFL, found that teams under-utilize the pass. When I was watching the Colts play the Cardinals Sunday night, it made my head hurt every time the Colts ran the ball. Keeping the Cardinals’ defense on its toes so that it can’t sit on the pass is important, but that really only matters when there’s a Nash equilibrium (no idea if I'm using that term correctly). What I mean is that if there are more than six people in the box, I really don’t see why the Colts would ever run. Ever. Even if the Cardinals are expecting pass, they still won't be able to defend it since they have x-number of linebackers who can’t do a thing when Peyton Manning airs it out. Alamar said that there’s an even greater chance of a play yielding negative expected points (think run expectancy) on a run than on a pass. The only downside to passing all the time is the risk put on the quarterback through wear to his arm and the threat of the crushing sack.

My favorite presenter may have been Wayne Winston, who provides Mark Cuban with his adjusted plus/minus numbers, which strongly appeal to me. In baseball, plus/minus is also known as WOWY, which I believe is most useful in assessing defensive value, catcher/pitcher batteries, and batter protection. I’ve long known that the statistics presented in an NBA box score are of much less value than those in a baseball box score. The interaction between teammates in basketball can be so subtle that we often don't know what to track. It's difficult to pinpoint why the Timberwolves perform better with Sebastian Telfair on the floor, but apparently they do. Plus/minus confirms in no uncertain terms that playing Ben Wallace in the series against the Cavs was a disaster. It also gives credence to this decade's Kevin Garnett vs. Tim Duncan and Kobe vs. Shaq debates.
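
To make the plus/minus idea concrete, here is a rough sketch of the raw on/off version. To be clear, this is not Winston's adjusted plus/minus, which also accounts for the quality of teammates and opponents on the floor. The stint format, the function name, and the sample numbers are all assumptions of mine for illustration.

```python
def on_off_net(stints, player):
    """Raw on/off net rating for one player, in points per 48 minutes.

    stints: iterable of (lineup, minutes, margin), where lineup is a set of
    player names, minutes is the stint length, and margin is the team's
    point differential during that stint (hypothetical format).
    """
    on_min = on_pts = off_min = off_pts = 0.0
    for lineup, minutes, margin in stints:
        if player in lineup:
            on_min += minutes
            on_pts += margin
        else:
            off_min += minutes
            off_pts += margin
    if on_min == 0 or off_min == 0:
        raise ValueError("player needs both on-court and off-court minutes")
    on_rate = 48 * on_pts / on_min     # team margin per 48 with player on
    off_rate = 48 * off_pts / off_min  # team margin per 48 with player off
    return on_rate - off_rate


# Made-up stints, for illustration only.
sample = [
    ({"A", "B"}, 12, 6),
    ({"B", "C"}, 12, -3),
    ({"A", "C"}, 24, 0),
]
print(on_off_net(sample, "A"))
```

The raw number rewards anyone who happens to share the floor with good teammates, which is why the adjusted version exists in the first place.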

Information on NESSIS can be found at its web site here.


Was this a misinterpretation or particularly subtle humor?

"With nobody on in a one run game, the break-even rate of stealing second is only 54%."

Seems the possibility of stealing is 0 with nobody on. Does that really mean runner on 1st (probably matters how many outs also)? Or is it talking about trying to take 2nd on a 2-out single in a 1 run game?

Gilbert. My bad, meant nobody out.

I thought it was common knowledge Jeter was a sub-par fielder.

So Gilbert thinks a mistake might be a joke, and eric g. thinks a joke is a mistake.

Anyway, nice article Jeremy. My stat geekdom is so tied to baseball that it makes me laugh to think of people studying volleyball data. And the current stat revolution in the NBA is much needed; anecdotally, I think basketball suffers from poor GMs more than any other sport.

great stuff. what's the deal with wharton anyways? "it sounds made up" -- cosmo kramer