| Literature DB >> 35923866 |
Yining Lu, Ayoosh Pareek, Ophelie Z Lavoie-Gagne, Enrico M Forlenza, Bhavik H Patel, Anna K Reinholz, Brian Forsythe, Christopher L Camp.
Abstract
Background: In professional sports, injuries resulting in loss of playing time have serious implications for both the athlete and the organization. Efforts to quantify injury probability utilizing machine learning have been met with renewed interest, and the development of effective models has the potential to supplement the decision-making process of team physicians. Purpose/Hypothesis: The purpose of this study was to (1) characterize the epidemiology of time-loss lower extremity muscle strains (LEMSs) in the National Basketball Association (NBA) from 1999 to 2019 and (2) determine the validity of a machine-learning model in predicting injury risk. It was hypothesized that time-loss LEMSs would be infrequent in this cohort and that a machine-learning model would outperform conventional methods in the prediction of injury risk. Study Design: Case-control study; Level of evidence, 3.
Keywords: loss of playing time; lower extremity; machine learning; muscle strain; professional athletes
Year: 2022 PMID: 35923866 PMCID: PMC9340342 DOI: 10.1177/23259671221111742
Source DB: PubMed Journal: Orthop J Sports Med ISSN: 2325-9671
Definition of Machine Learning Concepts and Methods Used
| Term | Definition |
|---|---|
| Multiple imputation | A popular method for handling missing data, which are often a source of bias and error in model output. In this approach, a missing value in the data set is replaced with an imputed value based on a statistical estimation; this process is repeated randomly, resulting in multiple "completed" data sets, each consisting of observed and imputed values. These are combined using a simple formula known as the Rubin rule to give final estimates of target variables. |
| Recursive feature elimination (RFE) | A feature selection algorithm that searches for an optimal subset of features by fitting a given machine learning algorithm (random forest and naïve Bayes in our case) to the predicted outcome, ranking the features by importance, and removing the least important features; this is done repeatedly, in a "recursive" manner, until a specified number of features remains or a threshold value of a designated performance metric has been reached. The features can then be entered as inputs into the candidate models for prediction of the desired outcome. |
| 0.632 bootstrapping | The method for training an algorithm based on the input features selected from RFE. Briefly, model evaluation consists of reiterative partitions of the complete data set into train and test sets. For each combination of train and test set, the model is trained on the train set using 10-fold cross-validation repeated 3 times. The performance of this model is then evaluated on the respective test set; no data points from the training set are included in the test set. This sequence of steps is then repeated for 999 more data partitions. |
| Extreme gradient boosting | Algorithm of choice among stochastic gradient boosting machines, a family in which multiple weak classifiers (a classifier that predicts marginally better than random) are combined (in a process known as boosting) to produce an ensemble classifier with a superior generalized misclassification error rate. |
| Random forest | Algorithm of choice among tree-based algorithms: an ensemble of independent trees, each generating predictions for a new sample chosen from the training data, whose predictions are averaged to give the forest's prediction. The ensembling process is distinct in principle from gradient boosting. |
| Neural network | A nonlinear regression technique based on 1 or more hidden layers consisting of linear combinations of some or all predictor variables, through which the outcome is modeled; these hidden layers are not estimated in a hierarchical fashion. The structure of the network mimics neurons in a brain. |
| Elastic net penalized logistic regression | A penalized linear regression based on a function that minimizes the squared errors of the outputs; it belongs to the family of penalized linear models, which includes ridge regression and the lasso. |
| Support vector machines | A supervised learning algorithm that solves classification problems by representing each data point as a point in abstract space and defining a plane, known as a hyperplane, that separates the points into distinct binary classes with maximal margin. Hyperplanes can be linear or nonlinear, as we have implemented in the presented analysis using a circular kernel. |
| Area under the receiver operating characteristic curve (AUC) | A common metric of model performance, utilizing the receiver operating characteristic curve, which plots calculated sensitivity and specificity given the class probability of an event occurring (instead of using a 50:50 probability). The AUC classically ranges from 0.5 to 1, with 0.5 being a model that is no better than random and 1 being a model that is completely accurate in assigning class labels. |
| Calibration | The ability of a model to output probability estimates that reflect the true event rate in repeat sampling from the population. An ideal model is a straight line with an intercept of 0 and a slope of 1 (ie, perfect concordance of model predictions with observed frequencies within the data). A model can correctly assign a label, as reflected by the AUC, yet output class probabilities of a binary outcome that are dramatically different from its true event rate in the population; such a model is not well calibrated. |
| Brier score | The mean squared difference between the predicted probabilities of models and the observed outcomes in the testing data. The Brier score can generally range from 0 for a perfect model to 0.25 for a noninformative model. |
| Decision curve analysis | A measure of clinical utility whereby a clinical net benefit for 1 or more prediction models or diagnostic tests is calculated in comparison with the default strategies of treating all or no patients. This value is calculated based on a set threshold, defined as the minimum probability of disease at which further intervention would be warranted. The decision curve is constructed by plotting the range of threshold values against the net benefit yielded by the model at each value; as such, a model curve that is farther from the bottom left corner yields more net benefit than one that is closer. |
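The RFE procedure described above can be sketched in a few lines with scikit-learn. This is a minimal illustration on synthetic data, not the study's pipeline: the feature matrix, outcome, and the choice of 4 retained features are invented for the example.

```python
# Illustrative RFE run: a random forest ranks features by importance and
# the least important feature is dropped each round until 4 remain.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))  # 200 samples, 10 candidate features (synthetic)
# Only features 0 and 3 actually drive the outcome in this toy setup
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.5, size=200) > 0).astype(int)

selector = RFE(
    estimator=RandomForestClassifier(n_estimators=100, random_state=0),
    n_features_to_select=4,  # stop once 4 features remain
    step=1,                  # remove one feature per recursive round
)
selector.fit(X, y)
print(selector.support_)     # boolean mask over the 10 candidate features
```

The retained mask (`selector.support_`) would then define the inputs passed to the candidate models.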
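The AUC and Brier score defined in the table can likewise be computed directly. The outcomes and predicted probabilities below are hand-made toy values, not the study's data.

```python
# Toy computation of discrimination (AUC) and the Brier score.
from sklearn.metrics import brier_score_loss, roc_auc_score

y_true = [0, 0, 1, 1, 0, 1]               # observed binary outcomes
y_prob = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]  # model-predicted probabilities

auc = roc_auc_score(y_true, y_prob)       # rank-based discrimination
brier = brier_score_loss(y_true, y_prob)  # mean squared error of probabilities
```

Here the model ranks 8 of the 9 positive/negative pairs correctly (AUC ≈ 0.889), and the Brier score is the mean of the squared gaps between each probability and its 0/1 outcome.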
Inputs Considered for Feature Selection
| Variables |
|---|
| Recent groin injury |
Figure 1. (A) Discrimination and (B) calibration of the extreme gradient boosted machine. AUC, area under the receiver operating characteristic curve.
Baseline Characteristics of the Study Population (N = 2103)
| Variable | Value |
|---|---|
| Age, y | 26 (23-29) |
| BMI, kg/m2 | 24.3 (20.1-26.5) |
| Career length, y | 6 (2-9) |
| Position | |
| Center | 384 (18.2) |
| Power forward | 429 (20.4) |
| Point guard | 424 (20.2) |
| Small forward | 389 (18.5) |
| Shooting guard | 477 (22.7) |
| Injuries (n = 736) | |
| Quadriceps | 85 (11.5) |
| Hamstring | 268 (36.4) |
| Calf | 266 (36.1) |
| Groin | 117 (15.9) |
Values are presented as n (%) or median (interquartile range). BMI, body mass index.
Significant Contributors to Lower Extremity Muscle Strain From Logistic Regression Model
| Variable | OR (95% CI) |
|---|---|
| Previous injury count | 21.0 (2.5-72.5) |
| Recent quadriceps injury | 4.31 (1.21-15.4) |
| Recent groin injury | 2.9 (2.88-2.91) |
| Free throw rate | 2.76 (1.27-6) |
| Recent ankle injury | 2.66 (2.65-2.68) |
| Recent hamstring injury | 2.39 (2.38-2.4) |
| Recent concussion | 2.34 (2.33-2.35) |
| Recent back injury | 1.95 (1.94-1.96) |
| Age | 1.03 (1.01-1.05) |
| Games played | 1.01 (1.01-1.02) |
| 3-point attempt rate | 0.46 (0.27-0.79) |
OR, odds ratio.
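The odds ratios above come from exponentiating logistic regression coefficients (OR = exp(β)). The following sketch shows the mechanic on synthetic data; the two features and their effect sizes are invented for illustration, not taken from the study.

```python
# Hypothetical sketch: fit a logistic regression, then exponentiate each
# coefficient to get the odds ratio per 1-unit increase in that feature.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 2))
# True effects: feature 0 raises risk (beta = 0.8), feature 1 lowers it (-0.5)
p = 1 / (1 + np.exp(-(0.8 * X[:, 0] - 0.5 * X[:, 1])))
y = (rng.random(500) < p).astype(int)

model = LogisticRegression(C=1e9).fit(X, y)  # large C ~ unpenalized fit
odds_ratios = np.exp(model.coef_[0])         # OR > 1: risk factor; < 1: protective
```

As in the table, an OR above 1 (eg, previous injury count) marks a risk factor, while an OR below 1 (eg, 3-point attempt rate) marks a protective association.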
Model Assessment on Internal Validation Using 0.632 Bootstrapping With 1000 Resampled Data Sets (N = 2103)
| Model | Apparent AUC | Internally Validated AUC | Calibration Slope | Calibration Intercept | Brier Score |
|---|---|---|---|---|---|
| Elastic net | 0.834 (0.791-0.877) | 0.819 (0.818-0.820) | 0.999 (0.998-1) | 0.003 (0.001-0.005) | 0.031 (0.027-0.034) |
| Random forest | 0.905 (0.896-0.92) | 0.830 (0.829-0.831) | 1.001 (1-1.002) | 0.002 (0.001-0.007) | 0.029 (0.027-0.032) |
| XGBoost | 0.906 (0.899-0.911) | 0.840 (0.831-0.845) | 1.003 (1.002-1.004) | 0.002 (0.001-0.007) | 0.03 (0.027-0.033) |
| SVM | 0.881 (0.88-0.882) | 0.787 (0.786-0.788) | 0.999 (0.998-1) | 0.007 (0.004-0.009) | 0.031 (0.028-0.034) |
| Neural network | 0.84 (0.839-0.841) | 0.813 (0.812-0.814) | 0.997 (0.996-0.998) | 0.003 (0-0.005) | 0.031 (0.028-0.034) |
| Logistic regression | 0.835 (0.834-0.836) | 0.818 (0.817-0.819) | 0.998 (0.997-0.999) | 0.008 (0.002-0.012) | 0.031 (0.028-0.034) |
| Simple XGBoost | 0.882 (0.880-0.882) | 0.832 (0.818-0.838) | 0.999 (0.998-1.000) | 0.003 (0.002-0.004) | 0.031 (0.027-0.033) |
Null model Brier score = 0.063. Data in parentheses are 95% confidence intervals. AUC, area under the receiver operating characteristic curve; SVM, support vector machine; XGBoost, extreme gradient boosted.
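The calibration slope and intercept reported in the table can be estimated by regressing observed outcomes on the logit of the predicted probabilities; a slope near 1 and intercept near 0 indicate good calibration. The sketch below uses synthetic, deliberately well-calibrated predictions, so it only illustrates the mechanic.

```python
# Rough sketch of calibration slope/intercept estimation: refit a logistic
# regression of outcomes on logit(predicted risk). Data are synthetic and
# drawn so that the predictions are well calibrated by construction.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
p_pred = rng.uniform(0.05, 0.95, size=1000)   # model-predicted risks
y = (rng.random(1000) < p_pred).astype(int)   # outcomes realized at those risks

logit = np.log(p_pred / (1 - p_pred)).reshape(-1, 1)
recal = LogisticRegression(C=1e9).fit(logit, y)  # large C ~ unpenalized
slope, intercept = recal.coef_[0][0], recal.intercept_[0]
```

For the well-calibrated synthetic data above, the recovered slope sits close to 1 and the intercept close to 0, mirroring the near-ideal values in the table.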
Figure 2. (A) Variable importance plot of the extreme gradient boosted (XGBoost) machine model. (B) Summary plot of Shapley (SHAP) values of the XGBoost model. Specifically, the global SHAP values are plotted on the x-axis with variable contributions on the y-axis. Numbers next to each input name indicate the mean global SHAP value, and gradient color indicates feature value. Each point represents a row in the original data set. Three-point attempt rate = percentage of player field goals that are for 3 points; free throw attempt rate = ratio of free throw attempts to field goal attempts. LE, lower extremity.
Figure 3. Decision curve analysis comparing the complete extreme gradient boosted (XGBoost) machine algorithm with the complete logistic regression as well as a simplified model utilizing select parameters. The downsloping line marked by “All” plots the net benefit from the default strategy of changing management for all patients, while the horizontal line marked “none” represents the strategy of changing management for none of the patients (net benefit is zero at all thresholds). The “All” line slopes down because at a threshold of zero, false positives are given no weight relative to true positives; as the threshold increases, false positives gain increased weight relative to true positives and the net benefit for the default strategy of changing management for all patients decreases. LR, logistic regression.
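The net benefit plotted on a decision curve has a simple closed form: net benefit = TP/n − FP/n × pt/(1 − pt) at threshold pt. A minimal sketch, with invented outcomes and probabilities:

```python
# Minimal net-benefit calculation behind a decision curve. The "treat all"
# default assigns everyone a predicted probability of 1.
import numpy as np

def net_benefit(y_true, y_prob, pt):
    pred = np.asarray(y_prob) >= pt      # "intervene" if predicted risk >= pt
    y = np.asarray(y_true)
    n = len(y)
    tp = np.sum(pred & (y == 1))         # true positives
    fp = np.sum(pred & (y == 0))         # false positives
    return tp / n - fp / n * pt / (1 - pt)

y = [0, 0, 1, 1, 0, 1]                   # toy observed outcomes
p = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]      # toy predicted risks
nb_model = net_benefit(y, p, 0.3)        # model at threshold 0.3
nb_all = net_benefit(y, [1.0] * 6, 0.3)  # default: change management for all
```

Sweeping pt over a range and plotting `net_benefit` at each value reproduces the curves in the figure; a model curve above the “All” and “none” lines offers clinical utility at that threshold.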
Figure 4. Example of individual patient-level explanation for the simplified extreme gradient boosted machine algorithm predictions. This athlete had a predicted injury risk of 0.77% at this point during the season. The only feature to support the likelihood of injury was a recent back injury.