Tuesday, September 1, 2020

Forecasting Wars

The two most influential forecasting models out there right now are 538's and The Economist's. As you may have noticed, they tend to differ quite a bit. As I write, 538 gives Biden a 69 percent chance of winning the election, while The Economist gives him an 87 percent chance. That's a pretty big difference. Who's right?
Who knows? But perhaps the most interesting question is why--why do they differ so much? Andrew Gelman, the statistician and political scientist behind the Economist model, has an explanation (I'll illustrate a few of his points with some quick numerical sketches after the quote):
"I’ve been chewing more on the Florida forecast from Fivethirtyeight.
Their 95% interval for the election-day vote margin in Florida is something like [+16% Trump, +20% Biden], which corresponds to an approximate 95% interval of [42%, 60%] for Biden’s share of the two-party vote.
This is bugging me because it’s really hard for me to picture Biden only getting 42% of the vote in Florida.
By comparison, our Economist forecast gives a 95% interval of [47%, 58%] for Biden’s Florida vote share.
Is there really a serious chance that Biden gets only 42% of the vote in Florida?....
How did they get such a wide interval for Florida?
I think two things happened.
First, they made the national forecast wider. Biden has a clear lead in the polls and a lead in the fundamentals (poor economy and unpopular incumbent). Put that together and you give Biden a big lead in the forecast; for example, we give him a 90% chance of winning the electoral college. For understandable reasons, the Fivethirtyeight team didn’t think Biden’s chances of winning were so high. I disagree on this—I’ll stand by our forecast—but I can see where they’re coming from. After all, this is kind of a replay of 2016, when Trump did win the electoral college, and he also has the advantages of incumbency, for all that’s worth. You can lower Biden’s win probability by lowering his expected vote—you can’t do much with the polls, but you can choose a fundamentals model that forecasts less than 54% for the challenger—and you can widen the interval. Part of what Fivethirtyeight did is widen their intervals, and when you widen the interval for the national vote, this will also widen your interval for individual states.
Second, I suspect they screwed up a bit in their model of correlation between states. I can’t be sure of this—I couldn’t find a full description of their forecasting method anywhere—but I’m guessing that the correlation of uncertainties between states is too low. Why do I say this? Because the lower the correlation between states, the more uncertainty you need for each individual state forecast to get a desired national uncertainty....
Perhaps this is a byproduct of Fivethirtyeight relying too strongly on state polls and not fully making use of the information from national polls and from the relative positions of the states in previous elections.
If you think of the goal as forecasting the election outcome (by way of vote intentions; see item 4 in the above-linked post), then state polls are just one of many sources of information. But if you start by aggregating state polls, and then try to hack your way into a national election forecast, then you can run into all sorts of problems. The issue here is that the between-state correlation is mostly not coming from the polling process at all; it’s coming from uncertainty in public opinion changes among states. So you need some underlying statistical model of opinion swings in the 50 states, or else you need to hack in just the right correlation. I don’t think we did this perfectly either! But I can see how the Fivethirtyeight team could’ve not even realized the difficulty of this problem, if they were too focused on creating simulations based on state polls without thinking about the larger forecasting problem.
There’s a Bayesian point here, which is that correlation in the prior induces correlation in the posterior, even if there’s no correlation in the likelihood.
And, as we discussed earlier, if your between-state correlations are too low, and at the same time you’re aiming for a realistic uncertainty in the national level, then you’re gonna end up with too much uncertainty for each individual state.
At some level, the Fivethirtyeight team must realize this—earlier this year, Nate Silver wrote that correlated errors are “where often *most* of the work is in modeling if you want your models to remotely resemble real-world conditions”—but recognizing the general principle is not the same thing as doing something reasonable in a live application.
Also, setting up between-state uncertainties is tricky. I know this because Elliott, Merlin, and I struggled when setting up our own model, which indeed is a bit of a kluge when it comes to that bit.
Alternatively, you could argue that [42%, 60%] is just fine as a 95% interval for Biden’s Florida vote share—I’ll get back to that in a bit. But if you feel, as we do, that this 42% is too low to be plausible, then the above two model features—an expanded national uncertainty and too-low between-state correlations—are one way that Fivethirtyeight could’ve ended up there."
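A quick note on the arithmetic in Gelman's Florida example: a two-party vote margin m converts to a vote share of (1 + m)/2, so Trump +16 puts Biden at 42% and Biden +20 puts him at 60%. In code (my own illustration, nothing from either model):

    # Convert a two-party vote margin into a vote share: share = (1 + margin) / 2.
    def margin_to_share(margin):
        """Biden's two-party vote share given his margin (both as fractions)."""
        return (1 + margin) / 2

    for margin in (-0.16, 0.20):  # Trump +16 and Biden +20, as in the quote
        print(f"margin {margin:+.0%} -> Biden share {margin_to_share(margin):.0%}")
    # margin -16% -> Biden share 42%
    # margin +20% -> Biden share 60%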
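Gelman's claim that lower between-state correlation forces wider state intervals is also easy to check numerically. In a deliberately crude toy model (equal-weight states, a common per-state standard deviation sigma, a common pairwise correlation rho; all the numbers below are mine, not either team's), the national standard deviation works out to sigma * sqrt((1 + (n - 1) * rho) / n). Pin down the national uncertainty and solve for sigma, and you can watch the state intervals blow up as rho drops:

    import math

    # Hold the national sd fixed and solve for the per-state sd:
    #   sigma = sd_national * sqrt(n / (1 + (n - 1) * rho))
    def required_state_sd(sd_national, n, rho):
        return sd_national * math.sqrt(n / (1 + (n - 1) * rho))

    n, sd_national = 50, 0.02  # pin the national sd at 2 points of vote share
    for rho in (0.9, 0.5, 0.2):
        print(f"rho = {rho}: per-state sd = {required_state_sd(sd_national, n, rho):.1%}")
    # rho = 0.9: per-state sd = 2.1%
    # rho = 0.5: per-state sd = 2.8%
    # rho = 0.2: per-state sd = 4.3%

Cut the correlation from 0.9 to 0.2 and the per-state uncertainty roughly doubles, which is exactly the mechanism Gelman is pointing at.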
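His point about needing an underlying model of opinion swings has a simple illustration too: if each state's swing is a shared national swing plus idiosyncratic noise, between-state correlation falls out automatically instead of having to be hacked in. A sketch of that kind of setup (again with my own toy numbers, not either forecast's actual structure):

    import numpy as np

    rng = np.random.default_rng(0)
    n_states, n_sims = 50, 100_000
    state_means = rng.uniform(0.40, 0.60, n_states)  # hypothetical state baselines

    # Each simulated election: one shared national swing plus state-level noise.
    national_swing = rng.normal(0.0, 0.02, (n_sims, 1))
    state_noise = rng.normal(0.0, 0.02, (n_sims, n_states))
    shares = state_means + national_swing + state_noise

    # The shared swing induces correlation between any two states:
    print(np.corrcoef(shares[:, 0], shares[:, 1])[0, 1])
    # ~0.5, i.e., 0.02**2 / (0.02**2 + 0.02**2)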
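And his Bayesian point, that correlation in the prior induces correlation in the posterior even when the likelihood is independent, can be verified with a standard two-state conjugate normal update (once more, a toy example with made-up numbers):

    import numpy as np

    prior_cov = np.array([[1.0, 0.8],
                          [0.8, 1.0]])  # correlated prior across two states
    lik_cov = np.diag([0.5, 0.5])       # independent poll errors (diagonal)

    # Conjugate normal update: posterior precision is the sum of precisions.
    post_cov = np.linalg.inv(np.linalg.inv(prior_cov) + np.linalg.inv(lik_cov))
    post_corr = post_cov[0, 1] / np.sqrt(post_cov[0, 0] * post_cov[1, 1])
    print(f"posterior correlation: {post_corr:.2f}")  # ~0.47, still clearly positive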
All clear? Good! Now you can go back to picking between the models based on how pessimistic or optimistic you're feeling on any given day....
