Alessio Rossi1, Luca Pappalardo2, Paolo Cintia1.
Abstract
In the last decade, the number of studies applying machine learning algorithms to sports, e.g., injury forecasting and athlete performance prediction, has rapidly increased. Given the number of works and experiments on machine learning techniques already present in the sport science literature, the aim of this narrative review is to provide a guideline describing a correct approach for training, validating, and testing machine learning models to predict events in sports science. The main contribution of this narrative review is to highlight the possible strengths and limitations of each stage of model development, i.e., training, validation, testing, and interpretation, in order to limit errors that could lead to misleading results. In particular, this paper presents an example of an injury forecaster that describes the features that could be used to predict injuries, the possible pre-processing approaches for time series analysis, how to correctly split the dataset to train and test the predictive models, and the importance of explaining the decision-making approach of white-box and black-box models.
Keywords: artificial intelligence; soccer; sport science; training and testing
Year: 2021; PMID: 35050970; PMCID: PMC8822889; DOI: 10.3390/sports10010005
Source DB: PubMed; Journal: Sports (Basel); ISSN: 2075-4663
Figure 1. Diagram of the injury forecasting validation. The pink leaves (i.e., psychophysiological assessment and training workload) refer to the input variables for the injury prediction algorithm. The red leaf is the injury information used to label each training vector. Orange leaves are the models trained and tested by the injury prediction algorithm. Each of these three leaf types (pink, red, and orange) is used to build the injury prediction algorithm. Blue leaves describe how to train, validate, and test the model developed by the injury prediction algorithm. Green leaves list the metrics used to assess the model’s goodness. Finally, gray leaves describe the data preprocessing in each injury prediction algorithm stage.
Confusion matrix. TP, FP, TN, and FN refer to True Positive, False Positive, True Negative, and False Negative, respectively.
| | | Actual classes | |
| | | Injury | No-Injury |
| Predicted classes | Injury | TP | FP |
| | No-Injury | FN | TN |
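As an illustration of how the four cells of the confusion matrix above can be turned into model-goodness metrics, the following is a minimal sketch in Python. The choice of metrics (precision, recall, specificity, accuracy, F1) is an assumption based on standard classification practice, not a list taken from this record. The function name `injury_metrics` and the example counts are hypothetical.

```python
def _safe_div(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def injury_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute common metrics from confusion-matrix counts.

    "Injury" is treated as the positive class, matching the table layout:
    TP = predicted Injury, actual Injury; FP = predicted Injury, actual No-Injury;
    FN = predicted No-Injury, actual Injury; TN = predicted No-Injury, actual No-Injury.
    """
    precision = _safe_div(tp, tp + fp)      # share of predicted injuries that were real
    recall = _safe_div(tp, tp + fn)         # share of real injuries that were predicted
    specificity = _safe_div(tn, tn + fp)    # share of non-injuries correctly predicted
    accuracy = _safe_div(tp + tn, tp + fp + fn + tn)
    f1 = _safe_div(2 * precision * recall, precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts, chosen only for illustration (not data from the paper).
example = injury_metrics(tp=8, fp=12, fn=4, tn=176)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, accuracy is 0.92 while precision is only 0.40. Accuracy alone can therefore look favorable when the No-Injury class dominates, which is one reason to report several metrics together.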