Now that the group stage is finished, we needed a few tweaks to the simulation process (described in some more detail on my blog), which we spent some time debating and implementing before taking a look at the round of 16 games.

First off, the data on past World Cups show that substantially fewer goals are scored during the knockout stage. This makes sense: now it's make or break. This wasn't too difficult to deal with, though: we just needed to modify the distribution for the zero component of the number of goals ($\pi$, as described here). In this case, we've used a distribution centered at around 12%, with most of the mass concentrated between 8% and 15%.
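To make the role of the zero component concrete, here is a minimal sketch of a zero-inflated Poisson draw for the number of goals. Everything specific in it is an assumption for illustration: the text above does not name the family of the distribution on $\pi$, so a Beta(39, 286) is used simply because it has mean 0.12 and standard deviation of about 0.018, and the scoring rate of 1.3 goals is a made-up value, not an estimate from the model.

```python
import math
import random


def sample_pi(rng, a=39.0, b=286.0):
    """Draw the zero-component probability pi from a Beta(a, b) prior.

    Beta(39, 286) is an assumed choice: mean 0.12, sd of about 0.018.
    """
    return rng.betavariate(a, b)


def sample_poisson(rng, rate):
    """Draw from a Poisson(rate) using Knuth's multiplication method."""
    threshold = math.exp(-rate)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_goals(rng, rate, pi):
    """Zero-inflated Poisson: a structural zero with probability pi,
    otherwise an ordinary Poisson(rate) count."""
    if rng.random() < pi:
        return 0
    return sample_poisson(rng, rate)


if __name__ == "__main__":
    rng = random.Random(2014)
    # Hypothetical scoring rate, used only to show the mechanics.
    goals = [sample_goals(rng, 1.3, sample_pi(rng)) for _ in range(10000)]
    print("share of goalless simulations:", goals.count(0) / len(goals))
```

Raising the centre of the distribution on $\pi$ pushes more simulations to zero goals, which is the mechanism used here to reflect the lower scoring in the knockout stage.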

These are the predictions for the four remaining games at this stage. Germany, France and (only marginally) Argentina have a probability of winning exceeding 50%. The other games look closer.

France vs Nigeria

{mbox:significance/world-cup/france-nigeria.jpg|width=600|height=461|caption=Click to enlarge|title=Predictive distribution of results – France vs Nigeria}

Germany vs Algeria

{mbox:significance/world-cup/germany-algeria.jpg|width=600|height=461|caption=Click to enlarge|title=Predictive distribution of results – Germany vs Algeria}

Argentina vs Switzerland

{mbox:significance/world-cup/argentina-switzerland.jpg|width=600|height=461|caption=Click to enlarge|title=Predictive distribution of results – Argentina vs Switzerland}

Belgium vs USA

{mbox:significance/world-cup/belgium-usa.jpg|width=600|height=461|caption=Click to enlarge|title=Predictive distribution of results – Belgium vs USA}

Technically, there is a second issue, which is of course that draws can't really happen in the knockout stage: eventually the game ends either after extra time or on penalties. For now, we'll just use this prediction, but I'm trying to think of a reasonable way to include the extra complication in the model. The main difficulty is that in extra time the propensity to score drops even further: about 30% of the games that go to extra time end up going to penalties.
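Purely as an illustration of what such an extension might look like, and not as the approach that will end up in the model, the sketch below resolves a simulated draw after 90 minutes. It uses the 30% figure quoted above for games going to penalties; treating the shoot-out as a coin flip and splitting extra-time wins in proportion to the two scoring rates are both assumptions, and the function name and arguments are hypothetical.

```python
import random


def knockout_winner(rng, goals_a, goals_b, rate_a, rate_b, p_penalties=0.30):
    """Return "A" or "B" for a knockout game, given simulated 90-minute goals.

    rate_a and rate_b are the teams' (hypothetical) scoring rates.
    """
    if goals_a != goals_b:
        # Decided in normal time.
        return "A" if goals_a > goals_b else "B"
    if rng.random() < p_penalties:
        # Assumption: a penalty shoot-out is a fair coin flip.
        return "A" if rng.random() < 0.5 else "B"
    # Assumption: a game decided in extra time goes to each side
    # with probability proportional to its scoring rate.
    return "A" if rng.random() < rate_a / (rate_a + rate_b) else "B"


if __name__ == "__main__":
    rng = random.Random(16)
    # A drawn game between two sides with made-up scoring rates.
    wins_a = sum(knockout_winner(rng, 1, 1, 1.5, 1.0) == "A" for _ in range(10000))
    print("share of drawn games won by A:", wins_a / 10000)
```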

Here is what the model predicted for the first four games that were played over the weekend.

Brazil vs Chile

{mbox:significance/world-cup/brazil-chile.jpg|width=600|height=461|caption=Click to enlarge|title=Predictive distribution of results – Brazil vs Chile}

Colombia vs Uruguay

{mbox:significance/world-cup/columbia-uruguay.jpg|width=600|height=461|caption=Click to enlarge|title=Predictive distribution of results – Colombia vs Uruguay}

Netherlands vs Mexico

{mbox:significance/world-cup/netherlands-mexico.jpg|width=600|height=461|caption=Click to enlarge|title=Predictive distribution of results – Netherlands vs Mexico}

Costa Rica vs Greece

{mbox:significance/world-cup/costarica-greece.jpg|width=600|height=461|caption=Click to enlarge|title=Predictive distribution of results – Costa Rica vs Greece}

This article first appeared on Gianluca Baio’s personal blog. Image courtesy of Steindy

Significance Magazine