Quintessa mathematicians and scientists enjoy analysing data. Recently, Simon Rookyard presented predictions of the results in the group stage of the ongoing European Football Championship, based on an algorithm developed within Quintessa. With the group stage of the competition now complete, it is possible to assess the performance of the algorithm so far, and to present predictions for the remainder of the competition.
In a recent article we used our “N-Estimates” algorithm, created to rate sports teams, to predict the results of the 36 opening round matches of UEFA EURO 2020. Each of these predictions was accompanied by a graph to denote the most likely prediction (green) and the set of predictions within a 1-σ confidence interval (yellow). Figure 1 presents the EURO 2020 first round results again with the observed results highlighted.
We consider four metrics of algorithm performance: the percentage of matches in which the correct outcome (correct winner or draw) was predicted; the percentage of predictions with the correct goal difference; the percentage of predictions with the correct exact scoreline; and the percentage of matches inside the approximate 1-σ region. Table 1 compares the algorithm’s performance in each of these metrics against a benchmark value (the expectation if winners/goal differences/scorelines were selected randomly, or the 68% of the 36 matches which would be expected to fall within a 1-σ confidence interval).
Table 1: Number of matches (out of 36) satisfying each performance metric.

Metric | Expectation from Random Predictions | N-Estimates Algorithm | BBC Pundit Competition Leader |
---|---|---|---|
Correct Outcome | 13 | 20 | 20 |
Correct Goal Difference | 6 | 10 | 10 |
Correct Scoreline | 3 | 5 | 7 |
Inside 1-σ Region | 25 | 22 | Confidence levels not provided |
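The four metrics can be counted mechanically from pairs of predicted and observed scorelines. The following is a minimal sketch and not the N-Estimates code. The `Match` structure, the function names and the representation of the 1-σ region as a set of scorelines are all assumptions made for illustration. The three matches are made-up data that only exercise the code.

```python
from dataclasses import dataclass, field

Score = tuple[int, int]  # (first team goals, second team goals)


def sign(x: int) -> int:
    """Return 1, 0 or -1: first team wins, draw, or second team wins."""
    return (x > 0) - (x < 0)


@dataclass
class Match:
    predicted: Score
    observed: Score
    # Assumption: the 1-sigma region is stored as a set of scorelines.
    one_sigma: set[Score] = field(default_factory=set)


def evaluate(matches: list[Match]) -> dict[str, int]:
    """Count how many matches satisfy each of the four metrics."""
    counts = {"outcome": 0, "goal_difference": 0,
              "scoreline": 0, "inside_1_sigma": 0}
    for m in matches:
        pred_gd = m.predicted[0] - m.predicted[1]
        obs_gd = m.observed[0] - m.observed[1]
        counts["outcome"] += sign(pred_gd) == sign(obs_gd)
        counts["goal_difference"] += pred_gd == obs_gd
        counts["scoreline"] += m.predicted == m.observed
        counts["inside_1_sigma"] += m.observed in m.one_sigma
    return counts


# Hypothetical data, not EURO 2020 results.
toy_matches = [
    Match((2, 0), (2, 0), {(2, 0), (1, 0), (2, 1)}),  # exact scoreline
    Match((1, 0), (3, 1), {(1, 0), (1, 1), (2, 0)}),  # correct winner only
    Match((0, 0), (0, 1), {(0, 0), (0, 1), (1, 0)}),  # wrong outcome, inside region
]
print(evaluate(toy_matches))
```

Note that a correct scoreline implies a correct goal difference, which in turn implies a correct outcome, so the first three counts can never increase from one metric to the next.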
The algorithm performed well overall. On the counts in Table 1, the goal difference and the exact scoreline were each predicted correctly about 67% more often than would be expected by chance (10 against 6, and 5 against 3), and the outcome about 54% more often (20 against 13). Particular successes included predicting some unprecedented events, such as England winning their opening EURO game for the first time, and Austria’s first ever win at a European Championship.
However, there was more variability in the first round than anticipated: only 22 results fell within the approximate 1-σ region, compared with an expected 25. The algorithm’s worst predictions were not randomly distributed. Instead, they were associated with particular teams whose performances were systematically better or worse than expected: all but one of the matches in which the predicted and observed goal differences differed by more than two involved England, Scotland, Spain or Poland.
One interesting finding relates to the effect of a team playing in its own country. In the historical data used to train the algorithm, there is a clear advantage to the home team. For this reason, home advantage was included in the predictions where relevant. However, across the group-stage matches where home advantage was expected, our predictions favoured the home team by 1.08 goals per game more than the observed goal difference, on average. For matches played in a neutral country, the average predicted goal difference differed from the average observed goal difference by just 0.08 goals. This is a strong indication that home advantage, which is present in most football matches, is absent from EURO 2020.
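The home-advantage check described above amounts to a mean signed error in goal difference, computed separately for home and neutral-venue matches. A minimal sketch follows, assuming each match is stored as a pair (predicted goal difference, observed goal difference) from the perspective of the home or first-named team. The function name and all numbers are hypothetical and are not the EURO 2020 data.

```python
from statistics import mean


def mean_goal_difference_bias(matches):
    """Mean of (predicted - observed) goal difference.

    A positive value means the predictions favoured the home (or
    first-named) team by more than the results justified.
    """
    return mean(predicted - observed for predicted, observed in matches)


# Hypothetical (predicted, observed) goal differences.
home_matches = [(2, 0), (1, 1), (1, -1)]
neutral_matches = [(1, 1), (0, -1), (-1, 0)]

print(mean_goal_difference_bias(home_matches))     # bias with home advantage assumed
print(mean_goal_difference_bias(neutral_matches))  # bias at neutral venues
```

A bias close to zero for neutral venues alongside a large positive bias for home matches is the pattern that would point to an overestimated home advantage.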
It is also interesting to see how the algorithm compares to expert human judgement. BBC Sport has been running a prediction competition amongst its pundits, and so Table 1 includes statistics for the pundit who is leading the BBC’s competition at the end of the group stage. We can see that the algorithm has performed similarly well to the best pundit, although the pundit was more successful at predicting exact scorelines.
We have updated the teams’ ratings following the first round results and our predictions for the remaining rounds of the competition are below. See Table 2 for the round of 16 predictions, Table 3 for the quarter final predictions, Table 4 for the semi final predictions and Table 5 for the final prediction. The predictions are for the score at the end of normal time. Where this is predicted to be a draw, we have indicated which team we expect to progress (either in extra time or after a penalty shoot-out); this is simply the team with the higher rating before the match according to the N-Estimates algorithm. In the tables, an asterisk marks the team predicted to progress. In response to our discovery from the first round matches, home advantage has not been included in any of these predictions. Matches for the later rounds will be added when the competing teams are known.
Table 2: Round of 16 predictions.

First Team | Predicted First Team Score | Predicted Second Team Score | Second Team |
---|---|---|---|
Wales | 0 | 2* | Denmark |
Italy | 2* | 0 | Austria |
Netherlands | 1* | 0 | Czech Republic |
Belgium | 2* | 0 | Portugal |
Croatia | 0 | 3* | Spain |
France | 1* | 0 | Switzerland |
England | 0* | 0 | Germany |
Sweden | 1* | 1 | Ukraine |

Table 3: Quarter final predictions.

First Team | Predicted First Team Score | Predicted Second Team Score | Second Team |
---|---|---|---|
Switzerland | 0 | 1* | Spain |
Belgium | 1* | 0 | Italy |
Czech Republic | 0 | 2* | Denmark |
Ukraine | 0 | 1* | England |

Table 4: Semi final predictions.

First Team | Predicted First Team Score | Predicted Second Team Score | Second Team |
---|---|---|---|
Italy | 1* | 1 | Spain |
England | 0 | 0* | Denmark |

Table 5: Final prediction.

First Team | Predicted First Team Score | Predicted Second Team Score | Second Team |
---|---|---|---|
Italy | 0 | 0* | England |
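
The rule used to decide which team progresses from a predicted draw can be written in a few lines. This is a minimal sketch under stated assumptions: the function name, the `ratings` dictionary and its values are hypothetical, and the article does not say what happens when two ratings are exactly equal, so the sketch arbitrarily picks the first team in that case.

```python
def predicted_to_progress(first, second, predicted_score, ratings):
    """Return the team predicted to progress from a knockout match.

    predicted_score is (first team goals, second team goals) at the end
    of normal time. A predicted draw is resolved in favour of the team
    with the higher pre-match rating.
    """
    first_goals, second_goals = predicted_score
    if first_goals > second_goals:
        return first
    if second_goals > first_goals:
        return second
    # Predicted draw: fall back on ratings.
    # Assumption: equal ratings resolve to the first team.
    return first if ratings[first] >= ratings[second] else second


# Hypothetical ratings, not N-Estimates output.
ratings = {"Team A": 1.2, "Team B": 0.9}

print(predicted_to_progress("Team A", "Team B", (0, 0), ratings))  # draw, higher rating
print(predicted_to_progress("Team A", "Team B", (0, 2), ratings))  # outright win
```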
Quintessa is not affiliated in any way with UEFA or the BBC. Its application of the N-Estimates algorithm to the UEFA EURO 2020 competition is an independent and non-commercial endeavour. The UEFA EURO 2020 logo is copyright of UEFA.
Update 1 July 2021: Quarter final predictions added.
Update 4 July 2021: Semi final predictions added.
Update 8 July 2021: Final prediction added.