Jaemin Lee, Juhuhn Kim, Hyunho Kim, Jong-Seok Lee.
Abstract
Since the coronavirus disease 2019 (COVID-19) pandemic, most professional sports events have been held without spectators. It is generally believed that home teams deprived of enthusiastic support from their home fans benefit less from playing on their home fields and thus become less likely to win. This study tests whether this belief holds in four major European football leagues through statistical analysis. It proposes a Bayesian hierarchical Poisson model to estimate parameters reflecting the home advantage and the change in that advantage. These parameters are used to improve the performance of machine-learning-based prediction models for football matches played after the COVID-19 break. The study describes the statistical analysis of the impact of the COVID-19 pandemic on football match results in terms of expected points and goal difference. It also shows that the parameters estimated by the proposed model reflect the changed home advantage. Finally, the study verifies that these parameters, when included as additional features, enhance the performance of various football match prediction models. The home advantage in European football matches has changed because of the behind-closed-doors policy implemented due to the COVID-19 pandemic. Using parameters that reflect the pandemic's impact, it is possible to predict the results of spectator-free matches after the COVID-19 break more precisely.
Keywords: Bayesian hierarchical Poisson model; COVID-19; football; home advantage; match prediction
Year: 2022 PMID: 35327877 PMCID: PMC8947042 DOI: 10.3390/e24030366
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
Welch’s unequal variances t-test on expected points and goal difference.
| Test Statistic | Expected Points | Goal Difference |
|---|---|---|
| t | 2.3451 | 2.3049 |
| df | 7.6454 | 7.7808 |
| p-value | 0.0485 | 0.0510 |
| 95% confidence interval | [0.0011, 0.2730] | [−0.0009, 0.3371] |
| mean_before_COVID-19 | 1.6214 | 0.3671 |
| mean_after_COVID-19 | 1.4843 | 0.1990 |
| effect size (Cohen’s d) | −1.4612 | −1.3731 |
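The quantities in this table can be illustrated with a minimal sketch of Welch's unequal-variances t-test and Cohen's d. The two samples below are hypothetical placeholders, not the study's data, and the helper names `welch_t` and `cohens_d` are introduced only for illustration. The sign convention for Cohen's d (after minus before) is an assumption chosen so that a drop in home advantage gives a negative value, as in the table.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variances t-test of mean(a) - mean(b)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean (sample variance uses n - 1).
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


def cohens_d(sample_a, sample_b):
    """Cohen's d for mean(a) - mean(b), using the pooled standard deviation."""
    n_a, n_b = len(sample_a), len(sample_b)
    pooled_var = ((n_a - 1) * statistics.variance(sample_a)
                  + (n_b - 1) * statistics.variance(sample_b)) / (n_a + n_b - 2)
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(pooled_var)


# Hypothetical seasonal averages of home expected points (NOT the study's data).
before = [1.60, 1.65, 1.58, 1.66, 1.62, 1.59]
after = [1.45, 1.52, 1.40, 1.55]

t_stat, df = welch_t(before, after)
d = cohens_d(after, before)  # after minus before: a drop gives a negative d
print(f"t = {t_stat:.4f}, df = {df:.4f}, Cohen's d = {d:.4f}")
```

The sketch stops at the statistic and degrees of freedom because the p-value needs the Student t distribution, which the Python standard library does not provide. With SciPy installed, `scipy.stats.ttest_ind(before, after, equal_var=False)` returns the same statistic together with the p-value.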
Figure 1. Line graphs of expected points and goal difference for four major European football leagues. (A) Trend of the average expected points of the home team per league in the corresponding season. (B) Trend of the average goal difference of the home team per league in the corresponding season. In each plot, the red dashed line represents the average value of all four leagues per season.
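As a rough illustration of the two quantities plotted in Figure 1, the sketch below computes the home team's average points and average goal difference over a set of matches. It assumes that "expected points" means the average league points (3 for a win, 1 for a draw, 0 for a loss) earned by the home team per match, which the source does not spell out. The match results are hypothetical.

```python
# Hypothetical (home_goals, away_goals) results for one league season.
results = [(2, 1), (0, 0), (1, 3), (3, 0), (1, 1)]


def home_points(home_goals, away_goals):
    """League points for the home team: 3 for a win, 1 for a draw, 0 for a loss."""
    if home_goals > away_goals:
        return 3
    if home_goals == away_goals:
        return 1
    return 0


# Averages over all matches, taken from the home team's perspective.
avg_points = sum(home_points(h, a) for h, a in results) / len(results)
avg_goal_diff = sum(h - a for h, a in results) / len(results)
print(avg_points, avg_goal_diff)
```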
Figure 2. Parameters sampled from the proposed model in Section 3. (A) Caterpillar plot of the per-team parameter in the Premier League for matches after the COVID-19 break. The parameter was adjusted so that its average is zero, because it is a relative parameter within a league. The line length of the caterpillar plot represents a 95% credible interval. (B–E) Caterpillar plots of the home advantage in four major European football leagues over the 10 most recent seasons: (B) English Premier League, (C) Spanish LaLiga, (D) Italian Serie A, and (E) German Bundesliga. “After COVID-19” covers the 2019–2020 and 2020–2021 season matches played after the leagues were suspended because of COVID-19 in March 2020. The blue dashed line represents the average for the 10 most recent seasons before the COVID-19 break. The line length of the caterpillar plot represents a 95% credible interval.
Score prediction results of the exemplary matches.
| Match | Home/Away | Team Name | Parameters: Mean_ | Parameters: Mean_ | Parameters | Simulated Results: Win | Simulated Results: Draw | Simulated Results: Loss | Most Frequent | Actual |
|---|---|---|---|---|---|---|---|---|---|---|
| Match1 | Home | Liverpool FC | 0.301 | 0.272 | 1.42 | 0.425 | 0.273 | 0.302 | 1 | 2 |
| Match1 | Away | Tottenham Hotspur | 0.224 | 0.212 | 1.14 | 0.302 | 0.273 | 0.425 | 1 | 1 |
| Match2 | Home | Schalke 04 | −0.314 | 0.432 | 0.64 | 0.039 | 0.091 | 0.870 | 0 | 0 |
| Match2 | Away | Bayern Munich | 0.561 | 0.300 | 3.23 | 0.870 | 0.091 | 0.039 | 3 | 4 |
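The win, draw, and loss frequencies in this table can be approximated with a simple simulation. The sketch below treats the third parameter value of each team (1.42, 1.14, 0.64, 3.23) as a fixed Poisson scoring rate and draws home and away goals independently. Both choices are assumptions made for illustration: the column label did not survive in the source, and the proposed model presumably simulates from posterior samples rather than from point estimates. The function names are hypothetical.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson variate with Knuth's multiplication method (fine for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_match(home_rate, away_rate, n_sims=10_000, seed=0):
    """Return simulated (win, draw, loss) frequencies from the home team's perspective."""
    rng = random.Random(seed)
    wins = draws = 0
    for _ in range(n_sims):
        home_goals = sample_poisson(home_rate, rng)
        away_goals = sample_poisson(away_rate, rng)
        if home_goals > away_goals:
            wins += 1
        elif home_goals == away_goals:
            draws += 1
    losses = n_sims - wins - draws
    return wins / n_sims, draws / n_sims, losses / n_sims


# Rates assumed to be the per-team scoring rates listed in the table.
liverpool_vs_tottenham = simulate_match(1.42, 1.14)
schalke_vs_bayern = simulate_match(0.64, 3.23)
print(liverpool_vs_tottenham)
print(schalke_vs_bayern)
```

Because the sketch ignores posterior uncertainty in the rates, its frequencies should only land near the tabulated ones, not reproduce them exactly.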
Figure 3. Distribution of simulated scores. The brighter the point, the more frequent the score. The red dashed line represents the set of drawn matches. The red dot shows the actual match result between the two teams during the 2020–2021 season. (A) Liverpool FC vs. Tottenham Hotspur. (B) Schalke 04 vs. Bayern Munich.
Result of various match prediction models.
| Classifier | Feature Set 1: Test Accuracy | Feature Set 1: […] | Feature Set 2: Test Accuracy | Feature Set 2: […] | Feature Set 3: Test Accuracy | Feature Set 3: […] | Hyperparameter |
|---|---|---|---|---|---|---|---|
| […] | 0.5062 | 0.2011 | 0.5208 | 0.2008 | […] | […] | C = 10 (L2 regularization) |
| […] | 0.5076 | […] | 0.5145 | 0.2010 | […] | 0.2009 | hidden layer = 2, hidden node = (3, 3) |
| […] | 0.4695 | 0.2123 | 0.4889 | 0.2100 | […] | […] | max features = 5, n tree = 100 |
| […] | 0.4951 | 0.2050 | 0.5159 | 0.2023 | […] | […] | C = 1 (L2 regularization) |
| […] | 0.4792 | […] | […] | 0.1175 | 0.4778 | 0.1177 | prior = (0.3, 0.24, 0.46) |
| […] | N/A | N/A | 0.5214 | […] | […] | 0.2998 | simulated 10,000 times |
| […] | 0.4915 | 0.1870 | 0.5044 | 0.1863 | […] | […] | |