Fraser Philp1, Ahmad Al-Shallawi2,3, Theocharis Kyriacou4, Dimitra Blana2, Anand Pandyan1.
Abstract
OBJECTIVES: The objective of this study was to evaluate whether combining existing methods of elastic net for zero-inflated Poisson and zero-inflated Poisson regression could improve the real-life applicability of injury prediction models in football.
Keywords: football; injuries; soccer; statistics; validation
Year: 2020 PMID: 32095267 PMCID: PMC7010990 DOI: 10.1136/bmjsem-2019-000634
Source DB: PubMed Journal: BMJ Open Sport Exerc Med ISSN: 2055-7647
Number of injuries for match and training according to injury severity categories
| Injury severity | Severity category | Number of injuries (match) | Number of injuries (training) |
| ≤1 day | Slight | 7 | 4 |
| >1 day and <3 days | Minimal | 5 | 2 |
| >3 days and <7 days | Mild | 5 | 1 |
| >7 days and <28 days | Moderate | 10 | 8 |
| >28 days | Severe | 0 | 2 |
| Career ending | | 0 | 0 |
Variables contained within the dataset
| Category | Number | Input |
| Position | 1 | Attacker |
| | 2 | Midfielder |
| | 3 | Defender |
| | 4 | Goalkeeper |
| Anthropometric | 5 | Kicking leg |
| | 6 | Height |
| | 7 | Weight |
| | 8 | Sum of 4 sites skinfold thickness (biceps, triceps, subscapular, suprailiac) |
| | 9 | Activity duration |
| Activity type | 10 | Match |
| | 11 | Training |
| | 12 | Futsal |
| | 13 | Conditioning |
| Surface type | 14 | Sand Astroturf |
| | 15 | Natural grass |
| | 16 | Artificial Astroturf (3G) |
| | 17 | Wooden |
| Injuries | 18 | Previous injuries |
| | 19 | In-season injuries |
| | 20 | Cumulative number of injuries (to case) |
| Variables related to training/match activities/fitness | 21 | Acute:chronic workload ratio |
| | 22 | Cumulative match load* |
| | 23 | Cumulative match grass load* |
| | 24 | Total match Artificial Astroturf (3G) load* |
| | 25 | Total training (all types) load* |
| | 26 | Total training load* (excluding futsal and conditioning) |
| | 27 | Total training grass load* (excluding futsal and conditioning) |
| | 28 | Total training Sand Astroturf load* (excluding futsal and conditioning) |
| | 29 | Total training Artificial Astroturf (3G) load* (excluding futsal and conditioning) |
| | 30 | Total training futsal load* |
| | 31 | Total training load* (with futsal) excluding conditioning |
| | 32 | Total training conditioning load* |
| | 33 | Cumulative match and training load* (22+23) |
| | 34 | Yo-Yo fitness score |
*Load refers to time in minutes.
Figure 1. Summary of results for the predictor selection and modelling processes.
Vuong’s test for the presence of zero inflation
| | Vuong z-statistic | Model comparison | P value |
| Raw | −4.982125 | model2>model1 | <0.001*** |
| Akaike information criterion-corrected | −4.978704 | model2>model1 | <0.001*** |
| Bayesian information criterion-corrected | −4.968557 | model2>model1 | <0.001*** |
***p<0.001.
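The three statistics in the table above can be illustrated with a short sketch. It assumes the commonly used formulation in which the per-observation log-likelihood differences between the two models are standardised, with the AIC and BIC corrections applied as shifts to the mean difference. The function names and the sign convention (negative values favour model 2) are assumptions made for illustration and are not taken from the source.

```python
import math

def vuong_statistics(loglik1, loglik2, k1, k2):
    """Raw, AIC-corrected and BIC-corrected Vuong z-statistics.

    loglik1, loglik2: per-observation log-likelihoods under model 1 and model 2.
    k1, k2: number of estimated parameters in each model.
    Negative values favour model 2, positive values favour model 1.
    """
    n = len(loglik1)
    if n != len(loglik2) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 observations")
    # Pointwise log-likelihood differences (model 1 minus model 2).
    m = [a - b for a, b in zip(loglik1, loglik2)]
    mean_m = sum(m) / n
    sd_m = math.sqrt(sum((x - mean_m) ** 2 for x in m) / (n - 1))
    # Penalties for the difference in model complexity.
    aic_shift = (k1 - k2) / n
    bic_shift = (k1 - k2) * math.log(n) / (2 * n)
    scale = math.sqrt(n) / sd_m
    return {
        "raw": scale * mean_m,
        "aic": scale * (mean_m - aic_shift),
        "bic": scale * (mean_m - bic_shift),
    }

def one_sided_p(z):
    """Upper-tail standard normal probability of |z|."""
    return 0.5 * math.erfc(abs(z) / math.sqrt(2))
```

Under this convention, a statistic of about −4.98 gives a one-sided p value well below 0.001, which agrees with the direction (model2>model1) and significance reported in the table.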
Variance inflation factor (VIF) testing results for multicollinearity
| Predictor label | VIF value |
| Weight | 1.9922 |
| Sum of 4 sites skinfold thickness (biceps, triceps, subscapular, suprailiac) | 2.2473 |
| Time in activity | 1.3662 |
| Match | 32.2477* |
| Training | 39.6572* |
| Conditioning | 22.1688* |
| Sandastro | 27.3124* |
| Grass | 31.6022* |
| Artificial turf 3G | 14.9991* |
| Previous injuries | 1.3860 |
| In-season injuries | 1.5483 |
| Cumulative match volume | 103.6455* |
| Cumulative match grass volume | 110.9217* |
| Total all training | 8.2872 |
| Total training volume excluding futsal and conditioning | 12.8611* |
| Total training Artificial turf 3G volume excluding futsal and conditioning | 3.4659 |
| Total training futsal volume | 2.6395 |
| Total training grass volume excluding futsal and conditioning | 9.0142 |
*Indicates a high level of multicollinearity (VIF >10).
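As a minimal sketch of how VIF values like those above can be obtained, the example below uses the standard identity that the VIF of predictor j equals the j-th diagonal element of the inverse of the predictor correlation matrix, which is equivalent to 1/(1 − R²) from regressing predictor j on the others. The function names and the toy data are hypothetical, and the source does not state which software routine the authors used.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length predictor columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centred = [[x - mu for x in c] for c, mu in zip(columns, means)]
    norms = [math.sqrt(sum(x * x for x in c)) for c in centred]
    p = len(columns)
    return [[sum(a * b for a, b in zip(centred[i], centred[j])) / (norms[i] * norms[j])
             for j in range(p)] for i in range(p)]

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    p = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(p)]
           for i, row in enumerate(matrix)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(p):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[p:] for row in aug]

def vif(columns):
    """VIF per predictor: diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]

# Hypothetical toy data: two predictors with correlation 0.8.
toy = [[1, 2, 3, 4, 5], [2, 1, 4, 3, 5]]
flagged = [v > 10 for v in vif(toy)]  # same >10 threshold as the table footnote
```

Predictors that are near-linear combinations of each other, such as the activity-type and surface-type indicators in the table, would be expected to produce large diagonal entries and therefore high VIF values.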
Results for comparison of elastic net (ENET) for zero-inflated Poisson and ‘traditional’ predictor selection methods
| | Full model | Backward | ENET without cross-validation | ENET with cross-validation |
| Akaike information criterion (AIC) | 666.6 | 665.23 | 662.31 | – |
| Bayesian information criterion (BIC) | 909 | 793 | 750 | – |
| Log likelihood | −929 | −301.7 | −298.0 | −46.65 |
For AIC and BIC, lower values are indicative of better model performance while for log likelihood, a larger number is indicative of better model performance.
Calculation of AIC and BIC is not possible with ENET with cross-validation.
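For reference, the two information criteria compared above follow the standard definitions AIC = 2k − 2·logL and BIC = k·ln(n) − 2·logL, where k is the number of estimated parameters and n the number of observations. The sketch below applies them to hypothetical inputs. The parameter counts and sample size behind the table are not given in this section, so the example does not attempt to reproduce the reported values.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: lower is better."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical values, for illustration only.
example_aic = aic(log_likelihood=-300.0, n_params=30)
example_bic = bic(log_likelihood=-300.0, n_params=30, n_obs=1000)
```

Because BIC multiplies the parameter count by ln(n) rather than 2, it penalises large models more heavily once n exceeds about 7 observations, which is consistent with the wider spread of BIC values than AIC values across the models in the table.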
Results for the modern elastic net (ENET) for zero-inflated Poisson (ZIP) penalty method and traditional ZIP method
| Modern ENET for ZIP penalty method (n=13) | Traditional ZIP method (n=11) |
| Weight | |
| Sum of 4 sites skinfold thickness | Sum of 4 sites skinfold thickness |
| Time in activity | Time in activity |
| Training | Match |
| Conditioning | Grass |
| Artificial turf 3G | Artificial turf 3G |
| Previous injuries | Previous injuries |
| Acute:chronic workload ratio | Cumulative no of injuries |
| Total time match-play (3G) | Total time match-play (3G) |
| Total time trained (grass) | Total time trained (grass) |
| Total time (futsal) | Total time (futsal) |
| Total time (conditioning) | Total time (conditioning) |
| Yo-Yo fitness score | |
Results for regularised zero-inflated Poisson regression model
| For the count outcome of the model: |
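| Count model: $\log(\lambda_i) = 0.022 \times \text{weight} - 0.008 \times \text{sum of 4 sites skinfold thickness} - 0.006 \times \text{time in activity} + 0.35 \times \text{training} - 0.94 \times \text{conditioning} + 0.94 \times \text{artificial turf 3G} + 0.04 \times \text{previous injuries} - 0.295 \times \text{acute:chronic workload ratio} + 0.001 \times \text{total time match-play (3G)} + 0.002 \times \text{total time trained (grass)} - 0.005 \times \text{total time (futsal)} - 0.0004 \times \text{total time (conditioning)} + 0.182 \times \text{Yo-Yo fitness score}$ |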
| Name of predictor | Estimated coefficient | SD | Calculated value | P value | OR | 2.50% | 97.50% |
| Weight | 0.022 | 0.011 | 1.97 | 0.04* | 1.0223 | 1.0002 | 1.0449 |
| Sum of 4 sites skinfold thickness | −0.008 | 0.004 | −2.33 | 0.01* | 0.9913 | 0.984 | 0.9986 |
| Time in activity | −0.006 | 0.002 | −3.2 | 0.001** | 0.9936 | 0.9897 | 0.9975 |
| Training | 0.349 | 0.1 | 3.1 | 0.001** | 1.4174 | 1.1372 | 1.7667 |
| Conditioning | −0.94 | 0.5 | −1.9 | 0.06 | 0.3905 | 0.148 | 1.0306 |
| Artificial turf 3G | 0.941 | 0.2 | 4.7 | <0.001*** | 2.5636 | 1.7365 | 3.7847 |
| Previous injuries | 0.041 | 0.1 | 0.4 | 0.68 | 1.0418 | 0.856 | 1.268 |
| Acute:chronic workload ratio | −0.295 | 0.08 | −3.6 | <0.001*** | 0.7446 | 0.6337 | 0.8748 |
| Total time match-play (3G) | 0.001 | 0.0005 | 2.1 | 0.02* | 1.001 | 1.0001 | 1.002 |
| Total time trained (grass) | 0.002 | 0.0003 | 6.02 | <0.001*** | 1.0017 | 1.0012 | 1.0023 |
| Total time (futsal) | −0.005 | 0.001 | −5 | <0.001*** | 0.9948 | 0.9928 | 0.9968 |
| Total time (conditioning) | −0.0004 | 0.0001 | −8.2 | <0.001*** | 0.9996 | 0.9995 | 0.9997 |
| Yo-Yo fitness score | 0.182 | 0.08 | 2.3 | 0.01* | 1.1993 | 1.0298 | 1.3968 |
Significance codes: *p<0.05, **p<0.01, ***p<0.001.
Predictors with positive coefficients were positively related to the count outcome of injury, ie, for a one-unit increase in the predictor, the log of the expected number of injuries increases by the coefficient value (the expected number is multiplied by the exponentiated coefficient reported in the OR column), assuming all other variables are held constant.
Predictors with negative coefficients were negatively related to the count outcome of injury, ie, for a one-unit increase in the predictor, the log of the expected number of injuries decreases by the magnitude of the coefficient (the expected number is multiplied by an exponentiated coefficient below 1), assuming all other variables are held constant.
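The OR and interval columns of the count table appear to be the exponentiated coefficient and exponentiated Wald limits. A minimal sketch, assuming a normal-approximation 95% interval (estimate ± 1.96 × SD) and using the rounded values from the Weight row, is given below. The function name is hypothetical, and small differences from the table are expected because the published coefficients are rounded.

```python
import math

def exp_coef_with_ci(estimate, std_error, z=1.96):
    """Exponentiated coefficient with a Wald-type interval (assumed method)."""
    point = math.exp(estimate)
    lower = math.exp(estimate - z * std_error)
    upper = math.exp(estimate + z * std_error)
    return point, lower, upper

# Weight row of the count table: coefficient 0.022, SD 0.011.
point, lower, upper = exp_coef_with_ci(0.022, 0.011)
```

An interval that excludes 1, as here, corresponds to a coefficient that differs from zero at the 5% level, which matches the significance code shown for Weight.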
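| For the zero outcome of the model: |
| Zero-inflated model: $\operatorname{logit}(P_i) = -1.25 \times \text{match} - 0.76 \times \text{in-season injury}$ |
| Name of predictor | Estimated coefficient | SD | Calculated value | P value | OR | 2.50% | 97.50% |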
| Match | −1.25 | 0.35 | −3.62 | <0.001*** | 0.287 | 0.146 | 0.5639 |
| In-season injury | −0.76 | 0.13 | −5.63 | <0.001*** | 0.4694 | 0.3608 | 0.6106 |
Significance codes: *p<0.05, **p<0.01, ***p<0.001.
Predictors with negative coefficients were negatively related to the zero outcome, ie, for a one-unit increase in the predictor, the log-odds of an excess zero (not getting an injury) decrease by the magnitude of the coefficient, assuming all other variables are held constant.
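To show how the two parts of a zero-inflated Poisson model combine, the sketch below evaluates the count and zero components using the coefficients reported above. It is illustrative only: no intercepts are reported in this section, so both intercepts are assumed to be zero, the dictionary keys and function name are hypothetical, and the outputs should not be read as calibrated injury predictions.

```python
import math

# Coefficients as reported in the count table (intercept not reported; assumed 0).
COUNT_COEFS = {
    "weight": 0.022,
    "skinfold_sum": -0.008,
    "time_in_activity": -0.006,
    "training": 0.349,
    "conditioning": -0.94,
    "artificial_turf_3g": 0.941,
    "previous_injuries": 0.041,
    "acwr": -0.295,
    "total_match_3g": 0.001,
    "total_trained_grass": 0.002,
    "total_futsal": -0.005,
    "total_conditioning": -0.0004,
    "yo_yo_score": 0.182,
}
# Coefficients of the zero part (intercept not reported; assumed 0).
ZERO_COEFS = {"match": -1.25, "in_season_injury": -0.76}

def zip_prediction(x):
    """Combine the count and zero parts for one observation x (dict of predictors)."""
    log_lambda = sum(b * x.get(name, 0.0) for name, b in COUNT_COEFS.items())
    lam = math.exp(log_lambda)                       # Poisson mean of the count part
    logit_p = sum(b * x.get(name, 0.0) for name, b in ZERO_COEFS.items())
    p_zero = 1.0 / (1.0 + math.exp(-logit_p))        # probability of an excess zero
    return {
        "lambda": lam,
        "p_excess_zero": p_zero,
        "expected_injuries": (1.0 - p_zero) * lam,
        "p_no_injury": p_zero + (1.0 - p_zero) * math.exp(-lam),
    }
```

For example, calling `zip_prediction({"match": 1})` lowers the excess-zero probability relative to `zip_prediction({})`, which reflects the negative Match coefficient in the zero part of the model.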