Sabermetrics innovator: Polling methods don't work

LISBON, Portugal -- Bill James and his wife couldn't decipher the television in their hotel room Tuesday night. So they tried to sleep, occasionally arising to check election returns on their computer.

The votes weren't adding up to the polling models. Not even close.

That troubled James for multiple reasons.

Hundreds of supposedly scientific polls were wrong about Donald Trump and Hillary Clinton. The data turned out to be rubbish. Once again, James was reminded how much better sports are than politics when it comes to applying advanced statistics.

"Junk it," James said of polling methods. "It doesn't work."

James is globally famous for pioneering Sabermetrics, advanced statistical analysis that revolutionized baseball and has influenced all other sports.

His seminal research was the foundation for "Moneyball," the 2003 best-seller and 2011 film on the struggling Oakland A's, who adopted James' concepts on a shoestring budget and reached the playoffs two years in a row.

James is a senior advisor with the Boston Red Sox, helping them in 2004 win their first World Series in 86 years.

Trump's victory was more stunning, astonishing, miraculous.

Polling analyst Nate Silver of FiveThirtyEight.com correctly predicted 99 of 100 states over the past two presidential elections, but gave Clinton a 71.4 percent likelihood of beating Trump.

Silver's forecast looked terrific compared to other models. Princeton University pegged Clinton's chances at 99 percent. The Huffington Post was at 98 percent, and the New York Times was at 85 percent.

But Silver's polling model also gave Trump a 20 percent chance of winning the Republican nomination.

Wednesday morning, I sat down with James at the 2016 Web Summit in Portugal.

James, 67, shared his thoughts on wayward political polling, his fears a civil war could be looming, how baseball analytics have jumped to other sports and whether a fantasy politics league could work.

All those presidential polls proved borderline useless. What happened?

There's an expert fallacy. People believe that experts, because they have a lot of knowledge about the field, have a greater ability to predict the future. In some rare cases that's true. But 99 times out of 100 the expert has no more ability to predict the future than the average man on the street does.

There are flashes of insight the experts have that justify their position on television. But none of us has the ability to predict movement. To give Nate Silver credit, Nate saw the election the way all the other experts did, but he had a big question mark there, and he was trying to tell people, "Don't be too sure." Two days ago, I think, he said there was a 71 percent chance Hillary Clinton would win, which would leave a big chance for an upset. He wasn't saying there was going to be an upset, so he was wrong. But he wasn't quite as wrong.

Nate was so precise with his projections in 2012, correctly predicting the winner in all 50 states. That might have given the general public a belief that certain people have a handle on foretelling presidential winners. As a pioneer of advanced statistics, how do you interpret whether Tuesday night was an anomaly versus a red flag that the methods are flawed?

It's a reminder that the world is a billion times more complicated than the human mind. Of course, everything that happens to me these days is a reminder.

I think of it like a chessboard, which is eight squares by eight squares, 64 squares. You sit down and think "He can do this, and I can do this, and he can do this, and I can do this." If you can think three moves ahead, you can beat almost anybody. If you can think four moves ahead, you're a chess grandmaster.

The world is a chessboard that's a million squares by a million squares. The human mind just isn't that sophisticated. We understand only little, tiny areas. A lot of times we think we've got it figured out, but we don't.

How did you process what happened Tuesday night?

Well, I'm worried. But in a sense less worried. I certainly didn't vote for Trump, and I didn't vote for him because he's unworthy to be president. But I have been concerned for 10 years, and I've been telling people we're drifting toward a civil war. We are getting further and further apart and angrier and angrier.

People, if you look at history, three years before there's a civil war have no idea they're drifting toward a civil war. Then there's a bridge too far, and you can't go back. Look at Sarajevo. Four years before the war, no one had an idea it was coming. Look at Rome in 64 B.C. People were saying terrible things about each other, and then the same people two years later were murdered by their political opponents. You think, "God damn, they had no idea this was coming, did they?" They thought they were living in a safe, stable democracy.

I still worry we could be drifting toward a civil war because we are so divided. But I hope this result vents some of that rage and some of that frustration from that one side so they do not feel excluded the way that they did yesterday and that it may help us to get past this period of intense anger.

What are your thoughts on the calls to scrap polling as any kind of science?

The old polling model doesn't work. It does need to be rethought. If you watch the commentary shows, they talk often about how good their polls are. They always say, "There's this poll, but it's bad. There's this poll, and it's good. We only focus on the good, reliable polls." Well, the "good, reliable polls" didn't turn out to be that good. And it's almost certain that some of the "bad polls" were actually accurate.

Those people need to rethink what their assumptions are and how they go about gathering poll data.

That sounds like a massive adjustment.

I don't think it's that big. The mistake they make seems obvious to me: They're too narrow. They have a process that has evolved since the last big one they got wrong in 1948. Since then, they had a process that has evolved and evolved and evolved, but it's gotten more narrow.

In what way?

There's a way you're supposed to do it. You're supposed to use registered voters. You're supposed to use likely voters. You're supposed to use voter rolls. You're supposed to make projections based on probability of turnout.

What you need to be doing instead is open up the polls so that you consider information from a wider variety of sources. In other words, don't spend so much time purifying the polls. Make them larger. Make them more inclusive.

What changes when it comes to data gathering would accomplish that goal?

Pollsters say there is a right method, but there is not. Pollsters say you call people on the phone so you know who you're talking to. You know whether they're likely to vote or not. You know whether they voted last time. You know whether they're registered Republican or Democrat because you have the voter list.

Junk it. It doesn't work. Deal with people you don't know. Use 25 different models. Go to a shopping mall and set up a booth where you hand out candy bars to anybody who'll fill out a poll for you. Walk down the street and stop every seventh person you see and ask. Put up buckets with pictures of Trump and Hillary and ask random people to drop a quarter in one or the other and count them up. Do it a hundred different ways and see if you can figure out 900 more. Then you get a broader understanding rather than a narrow understanding.

We are here at the 2016 Web Summit to talk about sports. We both make our livings in the sports world. Sports seem a little more trivial today, but maybe more important as a happy diversion. How have you reconciled chatting about sports today with the world changing so much in the past 24 hours?

Politics and sports keep intruding on each other. On my website, people are always screaming at me to stop talking about politics and stick to sports because sports are fun. I won't do it. The same things that are true in sports are true in politics.

Let's transition to sports for a bit, if only for that diversion.

That's OK.

Buffalo is an NFL, NHL town. How do you see the evolution of analytics impacting sports other than baseball?

Football is different in that a lot of serious analysis was done by the NFL teams themselves for a long time, whereas baseball analysis started as a public thing and then swept into the sport. People have been studying football films, which is a type of analysis, since the 1950s. So it has a different history.

For a lot of reasons, the NFL also has been less open to invasion by outside analysts because the professional analysts were better. They're better, in part, because the pro game is pretty closely aligned to the college game. Coaches move back and forth. College football coaches are organized, intelligent, analytic people at a high level, at a level higher than almost anybody you know. They have been driving analysis within their programs rather than waiting for outside analysts.

The fact that the information was proprietary kept team analysts ahead of us for a long time. At some point, that won't be true anymore. At some point, analysis in other sports will sweep ahead of football because of transparency. In fact, that's already sort of happening.

For example, on the issue of when you should punt, serious football analysts say you should go for it on fourth down way more than teams do. But they punt at midfield even though the data shows it's not worth it.

The "Moneyball" book and movie made analytics more of a phenomenon because it exposed so many casual fans to those ideas. But how much danger is there for people to assume that because advanced stats work so well for baseball, then they must be useful in sports like football or hockey?

Not a danger, but many dangers that are inherent to that thinking.

How so?

For several reasons. Baseball players get hundreds of trials a year against a mix of pitchers. Football players don't have the same bulk of data. Baseball players have records building up from the time they're in high school that are, in a sense, meaningful. They don't predict what's going to happen, but they can show where the player is at that moment. A football player might be a linebacker in college and a defensive lineman in the NFL. You don't have the same consistency.

To assume players in other sports can be measured the same way as baseball players would be an invitation to disaster. Stats in other sports are no doubt useful, but they're not the same as baseball.

You're constantly asked about the future of Sabermetrics and where they're headed. What's an angle or a point you don't feel gets addressed enough?

Most people still have no concept of how big it is. People know that baseball teams hire guys like me. What they don't understand is they hire a lot of guys like me. Also, our ways of thinking about things have invaded the whole culture.

["Moneyball" author] Michael Lewis is a great writer. He told a great story, which has been of great value to me. But he presented this conflict between the scouts and the analytic community. I never found that to be true. The scouts grew up reading this stuff and know it and understand it.

To bring our conversation full-circle, it seems there are some antiquated ways of looking at election data and projecting winners. Presidential analytics, you would think, is more important than sports analytics. How much more advanced are sports analytics compared to politics?

Sports are way ahead. Baseball benefitted from decisions made 150 years ago to keep really, really good records. The National League started in 1876 with what were, in retrospect, sophisticated records and methods. Teams didn't even have managers, but the league designated an official scorer for every game and instructed exactly how the record should be kept and standardized and filed to the league office to be processed.

Maybe fantasy politics would help us build better databases.

If you designed a political fantasy game that really worked, people would love it. People are obsessive about politics. If you developed better ways to measure things ... I don't think we have the same breadth of popular records as we do in baseball.

Think about a baseball card. I know the baseball card business is in the tank now, but the baseball card business is built on selling math to seventh-grade boys. There's a picture and a bunch of information and stats.

It's what led me to become a sportswriter.

Exactly! You know more about Gary Carter and Tim Raines and Razor Shines than you do about the people who work in your office. You grew up in Cleveland, right?

Yeah, so let's say Mike Hargrove.

Right. He was born Oct. 28, 1949, I think. [Oct. 26, 1949, actually]. But my point is you learn from a baseball card probably more than you know about the people you work with.

So why don't you have cards for political figures? There's no reason, really.

People have not done what [groundbreaking statistician] Henry Chadwick did for baseball, which was to think through the process of "How do you make a record for this guy?"

People haven't done that in politics, so we're still back in the mire.
