Jérémy Briand, Simon Deguire, Sylvain Gaudet, François Bieuzen.
Abstract
Injuries limit the athletes' ability to participate fully in their training and competitive process. They are detrimental to performance, affecting the athletes psychologically while limiting physiological adaptations and long-term development. This study aims to present a framework for developing random forest classifier models, forecasting injuries in the upcoming 1 to 7 days, to assist the performance support staff in reducing injuries and maximizing performance within the Canadian National Female Short-Track Speed Skating Program. Forty different variables monitored daily over two seasons (2018-2019 and 2019-2020) were used to develop two sets of forecasting models. The first set (TL) includes only training load variables, and the second (ALL) combines a wide array of monitored variables (neuromuscular function, heart rate variability, training load, psychological wellbeing, past injury type, and location). The sensitivity (ALL: 0.35 ± 0.19, TL: 0.23 ± 0.03), specificity (ALL: 0.81 ± 0.05, TL: 0.74 ± 0.03) and Matthews Correlation Coefficient (MCC) (ALL: 0.13 ± 0.05, TL: -0.02 ± 0.02) were computed. Paired t-tests on the MCC revealed statistically significant (p < 0.01) and large positive effects (Cohen d > 1) in favor of the ALL forecasting models over every forecasting window (1 to 7 days). These models were highly determined by the athletes' training completion, lower limb and trunk/lumbar injury history, as well as sFatigue, a training load marker. The TL forecasting models' MCC suggests they do not bring any added value to forecast injuries. Combining a wide array of monitored variables and quantifying the injury etiology conceptual components significantly improve the injury forecasting performance of random forest models. The ALL forecasting models' performances are promising, especially for time windows of one or two days, with sensitivities and specificities above 0.5 and 0.7, respectively. They could add value to the decision-making process for the support staff in order to assist the Canadian National Female Team Short-Track Speed Skating program in reducing the number of incomplete training days, which could potentially increase performance. On longer forecasting time windows, the ALL forecasting models' sensitivity and MCC decrease gradually. Further work is needed to determine if such models could be useful for forecasting injuries over three days or longer.
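The three evaluation metrics named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch, not the authors' implementation: the function names and the example labels are hypothetical, and it assumes that 1 denotes an injury and 0 denotes no injury.

```python
from math import sqrt

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, assuming 1 = injury."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def sensitivity(tp, fn):
    """Share of actual injuries that were forecast."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def specificity(tn, fp):
    """Share of injury-free cases that were forecast as injury-free."""
    return tn / (tn + fp) if (tn + fp) else 0.0

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient; 0 corresponds to random guessing."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Made-up labels for illustration only (not data from the study).
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 1, 0, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(sensitivity(tp, fn), specificity(tn, fp), mcc(tp, tn, fp, fn))
```

Because injuries are rare events, the MCC is a useful complement to sensitivity and specificity: it uses all four cells of the confusion matrix and stays near zero for a model that predicts at random.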
Keywords: data mining; high performance; machine learning; modeling; sport injury prevention
Year: 2022 PMID: 35911375 PMCID: PMC9329998 DOI: 10.3389/fspor.2022.896828
Source DB: PubMed Journal: Front Sports Act Living ISSN: 2624-9367
Figure 1. Overview of the data mining framework to develop and evaluate random forest forecasting models.
Athletes' characteristics summary.
| Characteristic | Mean ± SD | Range |
|---|---|---|
| Age (years) | 21 ± 2 | 18–24 |
| World ranking | 27 ± 19 | 2–Not ranked |
| Experience on the national team (years) | 4 ± 2 | 1–9 |
Figure 2. Conceptual framework of the factors influencing sport performance and injuries and their inter-relation.
Description of the different variables, their measurement frequency, and the strategies used to replace missing values.
| Variable | Description | Measurement frequency | Missing values and replacement strategy |
|---|---|---|---|
| External training load | Number of laps on the ice rink performed for each training by the athletes in different intensity zones. | Every training on ice | Main cause: Athletes did not train. Missing values replaced with a zero. |
| Internal training load | Athletes rated their perceived fatigue on a scale of 0 to 10; the rating was multiplied by the session duration, in minutes, to give an internal load score referred to as sFatigue (Dunbar et al., | After every training session | Same as external training load. |
| Psychological wellbeing metrics | Athletes rated, on a scale of 0 to 100, their levels of stress, energy, happiness, mood, motivation, performance stress, and sleep quality over the three previous days (Junge, | 3 times a week | Main cause: non-daily measurement frequency. Replaced with the most recent measurement. |
| Heart rate variability | Resting heart rate variability (HRV) taken using a heart rate monitoring belt (Polar H10, Finland) connected to HRV4Training (Altini et al., | Every 3 days | Same as psychological wellbeing metrics. |
| Neuromuscular function | Countermovement jumps (CMJ) performed on force plates. Variables measured: contraction time, flight time to contraction time ratio, jump height (from flight time and impulse), takeoff velocity, height (from impulse), height (from flight time) and flight velocity (Gathercole et al., | 3 times a week | Same as psychological wellbeing metrics. |
| Injury type and location | Each time athletes reported injuries they specified injury body location (head/neck, trunk, trunk/lumbar, lower limb, upper limb) and the type (bone, muscle and tendon, joint and ligament, skin, brain/spinal cord/peripheral nervous system, other). | Each time an injury was reported | Main cause: Athlete was not injured. Replaced with a zero. |
| Training completion | The athletes rated their level of training completion on a four-level scale: 0: training completed without injury/illness, 1: training completed with injury/illness, 2: training could not be completed because of injury/illness and 3: the athlete could not train at all because of injury/illness. | After every training session | Main cause: Athlete did not train. Missing values replaced with a zero. In the cases where athletes could not train because of injury or illness, the corresponding health status was validated by the medical team. |
| Injury (Target variable: Forecast 1 to 7 days) | The injury status, by definition (Meeuwisse et al., | Every day | Variable derived from the |
For the missing values and replacement strategies, the main cause of missing values is reported in the third column of the table. A close follow-up was made with the performance support staff to verify all information for specific situations, such as athletes omitting to report due to injuries.
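The sFatigue score and the two replacement strategies in the table (zero filling and carrying the most recent measurement forward) can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, `None` is assumed as the missing-value marker, and the forward-looking label in `forecast_labels` (day t is positive if an injury occurs within the next `window` days) is an assumed reading of "forecast 1 to 7 days", not a definition taken from the study.

```python
def s_fatigue(fatigue_rating, duration_min):
    """Internal load: perceived fatigue (0 to 10) times session duration in minutes."""
    return fatigue_rating * duration_min

def fill_with_zero(values):
    """Replace missing entries with 0 (e.g. training load on days without training)."""
    return [0 if v is None else v for v in values]

def fill_with_last(values):
    """Carry the most recent measurement forward; gaps before the first value stay None."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

def forecast_labels(injured, window):
    """Assumed target: 1 on day t if any injury occurs in days t+1 .. t+window."""
    return [int(any(injured[t + 1 : t + 1 + window])) for t in range(len(injured))]

# Made-up daily series for illustration only (not data from the study).
laps = [40, None, 55, None]             # external load, missing on rest days
hrv = [None, 62.0, None, None, 58.5]    # measured every few days
injured = [0, 0, 1, 0, 0, 0]            # daily injury status

print(s_fatigue(6, 90))                 # fatigue rating 6 over a 90-minute session
print(fill_with_zero(laps))
print(fill_with_last(hrv))
print(forecast_labels(injured, window=2))
```

Note that the last `window` days of a series have an incomplete future, so in practice those rows would typically be dropped before training a model.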
Figure 3. Evaluation metrics scores for both the ALL forecasting models (red dots) and TL forecasting models (blue dots). The x-axis displays the model forecasting window (from 1 to 7 days), while the y-axis presents (A) the sensitivity, (B) the Matthews Correlation Coefficient (MCC), and (C) the specificity. The (*) symbol indicates a significant positive difference for the ALL forecasting models compared to the TL forecasting models (p < 0.01), as determined by a paired t-test performed on each individual forecasting window, while the (#) symbol indicates a large positive Cohen effect size for the ALL forecasting models compared to the TL forecasting models (d > 1). The red dotted line on graph (B) indicates an MCC of zero, corresponding to the performance of a model making predictions randomly. Note that the y-axes of the graphs are on different scales to better show the difference between the two types of models for every metric.
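The paired comparison described in this caption can be illustrated with a short sketch. It assumes paired MCC scores for the two model types on one forecasting window; the score lists are made up, the function names are hypothetical, and the effect size is computed as the mean difference divided by the standard deviation of the differences, which is one of several Cohen's d variants and may not be the one the authors used. Only the t statistic is computed here; a p-value would require a t-distribution, for example via `scipy.stats.ttest_rel`.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(a, b):
    """Return (t, degrees of freedom) for paired samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1

def cohen_d_paired(a, b):
    """Effect size as mean difference over the SD of the differences (assumed variant)."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / stdev(diffs)

# Made-up paired MCC scores for illustration only (not results from the study).
all_mcc = [0.15, 0.10, 0.18, 0.12, 0.20]
tl_mcc = [0.00, -0.03, 0.01, -0.02, -0.01]

t_stat, dof = paired_t_statistic(all_mcc, tl_mcc)
d = cohen_d_paired(all_mcc, tl_mcc)
print(t_stat, dof, d)
```

With this variant, the t statistic equals d multiplied by the square root of the number of pairs, which makes the link between significance and effect size explicit.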
Figure 4. The 5 most important variables for each forecasting window (1 to 7 days) of the ALL forecasting models. The selected variables of importance were the ones with the highest minimum GINI decrease coefficient. The variables are numbered and colored according to their corresponding forecasting time period. ACD, Acute Chronic difference. ACR, Acute to Chronic Ratio.
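The ACR and ACD abbreviations refer to features derived from a training load series. The sketch below shows one common way to compute them and is an assumption throughout: the window lengths (7 days acute, 28 days chronic), the use of simple rolling means, and the function names are not taken from the source, which only gives the names of the two features.

```python
def rolling_mean(values, window):
    """Mean of the last `window` values up to each day; None until enough history exists."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window : i + 1]) / window)
    return out

def acute_chronic_features(load, acute_days=7, chronic_days=28):
    """Return (ACR, ACD) series: acute/chronic ratio and acute minus chronic difference."""
    acute = rolling_mean(load, acute_days)
    chronic = rolling_mean(load, chronic_days)
    acr, acd = [], []
    for a, c in zip(acute, chronic):
        if a is None or c is None:
            acr.append(None)
            acd.append(None)
        else:
            acr.append(a / c if c else None)  # ratio undefined when chronic load is 0
            acd.append(a - c)
    return acr, acd

# Made-up daily load values for illustration only (not data from the study).
load = [10, 20, 30, 40]
acr, acd = acute_chronic_features(load, acute_days=2, chronic_days=4)
print(acr[-1], acd[-1])
```

The difference form (ACD) stays defined when the chronic load is zero, for example after a long break, whereas the ratio (ACR) does not, which is why the sketch returns `None` in that case.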