Baseball’s a beautiful game for data journalist

Nate Silver is a statistics guru, or data-driven journalist, who became known for his political forecasting at the New York Times during the last presidential election.

Last year Silver announced he was leaving the Times for a new venture with ESPN, which backed the launch this month of Silver’s website, Five Thirty Eight.

Sports is a mainstay of Five Thirty Eight’s content, which also dives into politics, economics and other fields. Neil Paine is a major contributor to the site’s sports pages, bringing his analytical skills to stories on baseball and other sports. Paine used to work for the Sports-Reference group of statistical websites, and he is a consultant on analytics for the NBA’s Atlanta Hawks.

With baseball about to start, Paine spoke about the season ahead and what fans can gain from becoming more savvy about analytics. The following interview was conducted by email.

Q: What is your mission as a sportswriter for Five Thirty Eight?

A: Our mission as a site is to do “data journalism,” which means we still want to tell compelling stories, but we want to do it in a way that’s supported by statistics and the scientific method. We want to do it through data visualizations, or interactive graphics. And in a way, the sports world has been at the leading edge of some of this, with the popularity of sabermetrics. But at the same time, we want to be able to tell those types of stories to people who – while intelligent – aren’t necessarily hard-core statheads.

Q: Is baseball one of the most rewarding sports to follow and write about, because of the pure volume of statistics that are produced by 30 teams playing 162 games each?

A: Yes, from a data-analysis point of view, baseball is fantastic. The sample sizes are huge, the players don’t interact with each other and get entangled statistically to the degree you see in basketball or football, and there’s a ton of historical data to mine. But at the same time, there’s still a lot about the game we don’t know yet. For instance, defensive metrics have room to improve – and they’re going to introduce a new optical-tracking technology that looks like it will revolutionize defensive stats. So baseball is great from a stats perspective, and I’m also excited at the prospect of its analysis getting even better down the road.

Q: For a fan who has been watching baseball for years, but has never paid attention to statistics beyond, say, the triple-crown categories and basic pitching stats, what might they get from paying more attention to analytics and statistics?

A: I think it completely changes how you view the game. Certainly it changes how you evaluate players, at the very least. If you’re only looking at the triple-crown numbers, you miss so much about what a player does, because the information you have on him is biased by loads of factors that are external to the player himself. For example, RBI totals are hugely influenced by where the player hits in the lineup and how good his teammates are.

If you want to know about that specific player, why wouldn’t you want to cut out the noise that has nothing to do with him and zero in more closely on what he personally brings to the game, instead of including a bunch of stuff that he has no control over?

The same goes for pitchers – if you just look at wins, you’re not only getting info about the pitcher himself, but also a ton of noise from his bullpen support, the quality of the defense behind him, etc. Why use a stat that you know includes all of that noise, when there are alternatives out there which take those complicating factors out of play?

Oftentimes, debates like these are framed as “stats vs. no stats” or “stats versus the eye test.” But in reality, it’s not like traditionalists aren’t looking at any stats – they’re just looking at the wrong ones. And in 2014, there’s no reason to keep doing that.

Q: What about fantasy baseball “owners” – do they stand to benefit more than average fans from reading your columns and learning something about analytics?

A: We don’t plan to be in the business of giving fantasy sports advice per se, since that’s a market which has become really saturated in recent years. And there are already a lot of good people out there who specialize in that.

But I think fantasy owners can still benefit from many of the insights of sabermetrics. For instance, realizing that pitchers have little to no control over the batting average they allow on balls in play (BABIP) is a game-changer for fantasy, because it lets you identify pitchers who are over- or under-valued due to luck.

Q: Do you get much email from readers asking for fantasy advice? What about from readers arguing with your statistical analyses?

A: I rarely get feedback about fantasy advice, since we’re not really focused on that aspect of sports fandom. But I do get a lot of emails about analysis. Some of it is argumentative, which I think is good, because none of us are perfect – there are plenty of cases where someone brings up a fresh point I hadn’t considered, and it changes the conversation. The majority of it, though, is about ideas for future posts.

Q: I notice that your columns quote Bill James and cite some of his formulas that you employ. Is the world of advanced baseball stats a fairly friendly and cooperative one, as opposed to being cut-throat?

A: I think it’s a friendly community for the most part. I do know many of the other stats guys, and we make it a point to grab drinks together during the Sloan (Sports Analytics) Conference at MIT. We’re all pretty secretive about what we’re working on or the specifics of what we think about a given player, of course, because it’s a business. And there are people who are more paranoid than others about sharing information in conversations.

Q: Can you give me the name of a team and maybe an individual or two who you think might have breakthrough years in MLB this season?

A: The Giants and Angels are a couple of teams who disappointed last season, but the predictive systems (both Vegas’ over/unders and the various public statistical projections) think they’ll both have bounce-back years and be back in the playoff hunt.

As far as players go, it’s tough to say – but for what it’s worth, Baseball Prospectus’ PECOTA projection system thinks Travis d’Arnaud could have a breakout year. Mike Olt might be another one if he’s healthy and the Cubs give him a chance.

Q: For someone wanting to learn more about analytics and advanced statistics for baseball, is there any resource – online or not – that you might recommend in addition to Five Thirty Eight?

A: For those just getting started, I’d recommend reading books like “Baseball Between the Numbers,” Alan Schwarz’s “The Numbers Game,” Bill James’ “New Historical Baseball Abstract,” and any of Rob Neyer’s work. For the more advanced statheads, “The Book” by Tom Tango et al. is essential reading.

On a daily basis, the biggest online destinations are Fangraphs, Baseball Prospectus, and Beyond the Boxscore. Beyond those, there are dozens of smaller blogs that are doing fantastic work, from Phil Birnbaum’s Sabermetric Research to team-specific blogs like Lookout Landing and Crashburn Alley. It’s a great time to be a baseball fan, because there’s never been more quality analysis out there in the media than there is right now.