As readers here will know, I’ve been tracking rankings of teams in the 2018 World Cup as predictions, as I have done for the past several World Cups. The 2018 competition wound up with 12 rankings, plus an ensemble representing the average of all 12. With the 48 group stage matches in the books, here are the results.
Not much separates the rankings at the top, with Lloyd’s, CBS and Goldman Sachs sharing the top spot. At the beginning I set the FIFA Rankings as the baseline to beat, since anyone with any amount of soccer knowledge could easily apply those to make predictions. The FIFA Rankings are fairly sophisticated, of course, but if a prediction can’t beat them, why bother?
It turned out that the FIFA Rankings proved an easy bar to best, with only 2 of the 12 predictions failing to do better. Of note, however, is that one of those two rankings was the overall winner when I ran this exercise in 2014. Prediction is hard!
There were 9 draws in the group stage (interestingly, exactly the same as in 2014), which did not add to the scores. Thus there were 39 possible matches from which to score in this very simple ranking system. A simple coin flip would lead to an expected 19.5 guessed correctly, so fairly obviously each of the methods added value over a truly random approach. But with only 5 matches separating top from bottom, it is very hard to say that randomness is not what distinguished places in the table.
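For readers who want to check the arithmetic, here is a rough sketch in Python. It assumes each of the 39 scored matches is an independent call with a fixed chance of being right, which is a simplification of my own and not how any of the rankings actually work. The function names and the 0.75 accuracy figure are purely illustrative.

```python
from math import comb, sqrt

def prob_at_least(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more correct picks."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def score_std_dev(n, p):
    """Standard deviation of the number of correct picks under Binomial(n, p)."""
    return sqrt(n * p * (1 - p))

total_matches = 48
draws = 9
scored_matches = total_matches - draws   # 39 matches that can be scored

# Expected correct picks from a fair coin flip on each scored match.
expected_coin_flip = scored_matches * 0.5

# Chance that pure coin flipping reaches the top score of 32.
top_score = 32
p_coin_reaches_top = prob_at_least(top_score, scored_matches)

# ASSUMPTION: a forecaster whose true hit rate is 0.75 (illustrative only).
# The spread of their score over 39 matches from luck alone:
luck_spread = score_std_dev(scored_matches, 0.75)

print(f"scored matches: {scored_matches}")
print(f"coin flip expectation: {expected_coin_flip}")
print(f"P(coin flip scores >= {top_score}): {p_coin_reaches_top:.6f}")
print(f"std dev of score at 75% accuracy: {luck_spread:.2f}")
```

Under these assumptions a coin flip almost never reaches the top of the table, yet a single forecaster of fixed skill would still see their score swing by a few matches from luck alone, which is about the size of the gap between first and last.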
The ensemble (representing the average ranking across the 12) did very well, finishing above most of the rankings and just one out of first place. This result reflects findings from the academic literature on forecasting, which suggest that such an integration of forecasts can often outperform any one method.
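The ensemble idea can be sketched in a few lines of Python. This is a minimal illustration, not my actual spreadsheet: it assumes each ranking assigns every team a position (1 is best), that a pick is the better-ranked team in each match, and that drawn matches are skipped. The team names, ranks and results below are made up for the example.

```python
from statistics import mean

def ensemble_ranking(rankings):
    """Average each team's rank position across all rankings (lower is better)."""
    teams = rankings[0].keys()
    return {team: mean(r[team] for r in rankings) for team in teams}

def pick_winner(ranking, team_a, team_b):
    """Pick the team with the better (lower) rank; None if the ranks are tied."""
    if ranking[team_a] == ranking[team_b]:
        return None
    return team_a if ranking[team_a] < ranking[team_b] else team_b

def score(ranking, results):
    """Count correct picks; drawn matches (winner is None) add nothing."""
    return sum(
        1
        for team_a, team_b, winner in results
        if winner is not None and pick_winner(ranking, team_a, team_b) == winner
    )

# HYPOTHETICAL data: three rankings of four fictional teams.
rankings = [
    {"A": 1, "B": 2, "C": 3, "D": 4},
    {"A": 2, "B": 1, "C": 4, "D": 3},
    {"A": 1, "B": 3, "C": 2, "D": 4},
]

# HYPOTHETICAL results: (team, team, winner), with None marking a draw.
results = [
    ("A", "B", "A"),
    ("C", "D", "D"),
    ("A", "C", None),
    ("B", "D", "B"),
]

ensemble = ensemble_ranking(rankings)
ensemble_score = score(ensemble, results)
individual_scores = [score(r, results) for r in rankings]

print("ensemble score:", ensemble_score)
print("individual scores:", individual_scores)
```

Averaging positions is only one way to combine forecasts; other schemes, such as weighting rankings by past performance, would need a different `ensemble_ranking`.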
No ranking had Germany crashing out and none had Japan going forward. Places in the table turned not on bold predictions running against the grain of conventional wisdom, but rather on odd situational results like the dreary England-Belgium match, where it was not clear whether one or both teams wanted a result, given where they would land in the knockout phase. Context, of course, is one reason why prediction has its limits.
Conventional wisdom is a safe haven when something truly unexpected happens. No one will fault a forecaster for failing to anticipate Germany’s early trip home. But predict Germany to go home when they don’t? You’d be mocked as a fool. Groupthink and psychology show up in our quantitative predictions even when we can see them. It is no wonder that results cluster together. There is comfort in the herd, but not fame or shame.
Back in 2014, when I ran this exercise, the top rankings also scored 32 in the group phase. Interesting. Now I’ll run a second competition in the knockout part of the tournament with all 16 teams and all 12 predictions. In 2014 these results were tight, with the 12 rankings getting between 9 and 11 matches correct out of 16.
In the prediction evaluation exercise there is everything to play for. In the opening matches there are four unanimous picks: Belgium, Brazil, Spain and England. Every other match has a split opinion. I’ll continue to post updates on Twitter @RogerPielkeJr as the tournament progresses. Comments welcomed!