Jeffrey Turner, Torrey Wagner, Brent Langhals.
Abstract
Physical fitness is a pillar of U.S. Air Force (USAF) readiness and ensures that Airmen can fulfill their assigned mission and be fit to deploy in any environment. The USAF assesses the fitness of service members on a periodic basis, and discharge can result from failed assessments. This study analyzed a 21-feature dataset collected from 223 active-duty Airmen who participated in a comprehensive mental and social health survey, body composition assessment, and physical performance battery. Graphical analysis revealed pass/fail trends related to body composition and obesity. Logistic regression and limited-capacity neural network algorithms were then applied to predict fitness test performance using these biomechanical and psychological variables. The logistic regression model achieved a high level of significance (p < 0.01) with an accuracy of 0.84 and AUC of 0.89 on the holdout dataset. This model indicated that Airmen with poor sleep quality, a recent history of injury, higher BMI, and low fitness satisfaction tend to be at greater risk of fitness test failure. The neural network model demonstrated the best performance, with 0.93 accuracy and 0.97 AUC on the holdout dataset. This study is the first application of psychological features and neural networks to predict fitness test performance, and it obtained higher predictive accuracy than prior work. Accurate prediction of Airmen at risk of failing the USAF fitness test can enable early intervention and prevent workplace injury, absenteeism, inability to deploy, and attrition.
Keywords: military; neural network; physical fitness test; predictive modeling; risk management
Year: 2022 PMID: 35447864 PMCID: PMC9030411 DOI: 10.3390/sports10040054
Source DB: PubMed Journal: Sports (Basel) ISSN: 2075-4663
Prior machine learning analyses in the prediction of military fitness test failures.
| Description of Work | Method Used | Performance | Ref |
|---|---|---|---|
| Predict Fitness Assessment Failure in Australian Army | Classification, | AUC = 0.70 | [ |
| Predict U.S. Army Fitness Assessment 2-mile run time | Regression, | R2 = 0.55–0.59 | [ |
| Predict U.S. Army Fitness Assessment Failure | Classification, | AUC = 0.61–0.77 (F) | [ |
AUC: Area under the receiver operating characteristic (ROC) curve.
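AUC can also be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as half. The sketch below computes it that way in plain Python. The function name `roc_auc`, the label coding (1 = pass, 0 = fail), and the sample scores are illustrative assumptions, not data or code from the studies in the table.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))

# Illustrative labels and model scores (hypothetical, not study data).
labels = [1, 1, 1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.35, 0.4, 0.7, 0.2, 0.6, 0.65]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

On this scale, 0.5 corresponds to a model that ranks cases no better than chance and 1.0 to a perfect ranking.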
Feature Summary Statistics.
| Variable | Mean | Max | Std Dev | Data Distribution | Notes/Definition |
|---|---|---|---|---|---|
| Age * | 28.15 | 59 | 6.60 | ~Log-Normal | --- |
| Gender | 0.25 | 1 | 0.43 | Binary | 0 = Male (174 members) |
| ORS Total * | 7.63 | 10 | 1.95 | ~Log-Normal | Outcome Rating Scale [ |
| ORS Social | 7.29 | 10 | 2.05 | ~Log-Normal | --- |
| ORS Interpersonal | 7.59 | 10 | 2.08 | ~Log-Normal | --- |
| ORS Individual | 7.35 | 10 | 2.02 | ~Log-Normal | --- |
| PTSD | 9.28 | 68 | 11.75 | Right-skewed | Post-Traumatic Stress Disorder Checklist (PCL-5) [ |
| Sleep | 7.32 | 22 | 3.92 | ~Normal | --- [ |
| Burnout | 2.14 | 7 | 0.81 | ~Normal | --- [ |
| InjuryEval | 0.30 | 1 | 0.46 | Binary | 1 = Recent injury evaluated by provider |
| InjuryNoEval | 0.12 | 1 | 0.33 | Binary | 1 = Recent injury not evaluated by provider |
| DLC | 0.09 | 1 | 0.29 | Binary | 1 = Duty Limiting Condition |
| FitSat | 3.32 | 5 | 0.81 | Categorical | Fitness Satisfaction |
| PhysRestr | 0.27 | 1 | 0.45 | Binary | 1 = Recent injury resulting in physical activity restriction |
| BMI | 27.13 | 42.3 | 4.10 | ~Normal | Body Mass Index [ |
| BodyFatPerc | 0.29 | 0.49 | 0.82 | ~Normal | Body Fat Percentage [ |
| MusclePerc | 0.33 | 0.47 | 0.06 | ~Normal | Muscle Mass Percentage [ |
| FMS_Shldr | 0.12 | 1 | 0.33 | Binary | 1 = Functional Movement Screen (FMS) Shoulder Pain [ |
| FMS_Ext | 0.21 | 1 | 0.41 | Binary | 1 = FMS Low Back Pain [ |
| FMS_Flex | 0.06 | 1 | 0.23 | Binary | 1 = FMS Hip Pain [ |
| FMS Total | 14.28 | 20 | 2.60 | ~Normal | FMS Composite Score [ |
* indicates the variable will be log-transformed prior to modeling. The variable names are defined with references on the right side of the table.
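The log transformation applied to the starred variables can be sketched as follows. This is a minimal illustration that assumes a natural logarithm and strictly positive inputs; the source does not state the log base or whether an offset was added for zero-valued scores, and the helper name `log_transform` and the sample ages are hypothetical.

```python
import math

def log_transform(values):
    """Return the natural log of each value; reject non-positive inputs."""
    out = []
    for v in values:
        if v <= 0:
            raise ValueError(f"log transform needs positive values, got {v}")
        out.append(math.log(v))
    return out

# Illustrative ages only (hypothetical, not rows from the dataset).
ages = [19, 28, 35, 59]
log_ages = log_transform(ages)
print([round(x, 3) for x in log_ages])
```

The transform preserves the ordering of the values while compressing the long right tail, which is the usual reason for applying it to approximately log-normal features.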
Figure 1. Selected numeric and ordinal categorical variable histograms. Acronyms and abbreviations: logAge, log-transformed age variable; logORS, log-transformed outcome rating scale variable; PTSD, post-traumatic stress disorder questionnaire; BMI, body mass index; FMS_Tot, functional movement screen composite score.
Logistic regression model results as measured on the test/holdout dataset. Precision and recall are presented as weighted averages across the pass/fail classes. The features used in each model are listed in the footnotes, and gray text indicates a trivial model.
| Model | p-Value | AUC | Precision | Recall | Accuracy |
|---|---|---|---|---|---|
| Full | <0.01 | 0.82 | 0.79 | 0.79 | 0.79 |
| 5-feature ¹ | <0.01 | 0.86 | 0.82 | 0.82 | 0.82 |
| 4-feature ² | <0.01 | 0.89 | 0.83 | 0.84 | 0.84 |
| Recursive feature elimination (RFE) ³ | <0.01 | 0.87 | 0.75 | 0.75 | 0.75 |
| Select K Best ⁴ | <0.01 | 0.86 | 0.82 | 0.82 | 0.82 |
| Goal | -- | 0.80 | -- | -- | 0.90 |
1: ‘Gender’, ‘Sleep’, ‘BMI’, ‘FitSat’, ‘PhysRestr’. 2: ‘Sleep’, ‘BMI’, ‘FitSat’, ‘PhysRestr’. 3: ‘Gender’, ‘InjuryNoEval’, ‘PhysRestr’, ‘FitSat’, ‘FMS_Flex’. 4: ‘MusclePerc’, ‘BodyFatPerc’, ‘FitSat’, ‘PhysRestr’, ‘DLC’.
Figure 2. Graphical analysis of the ratio of passing/failing the APFT vs. (a) muscle percentage and (b) body fat percentage.
Figure 3. Gender differences in body composition and APFT pass/fail results for (a) male service members and (b) female service members. The figure is overlaid with biological gender-based body fat norms [32].
Figure 4. Performance graphs for the best (four-feature p-value) classical logistic regression model. (a) Confusion matrix from the holdout dataset, where APFT Failure = 0 and APFT Pass = 1. (b) ROC curves from the training dataset, holdout dataset, and trivial Chance model. TPR: True positive rate; FPR: False positive rate.
Neural network model hyperparameter search ranges.
| Hyperparameter | Full Model Range | Limited Model Range |
|---|---|---|
| Neurons | 2, 3, 4, 5, 10 | 1, 2, 3, 4, 5, 7, 9, 12 |
| Hidden layers | 0, 1, 2 | 0, 1, 2, 3 |
| Batch size | 16, 32 | 16, 32 |
| Epochs | 15, 20, 60, 100, 140 | 15, 20, 60, 100, 140 |
| Learning rate | 0.01, 0.001, 0.0005 | 0.01, 0.001, 0.0005 |
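To give a sense of the size of these search spaces, the sketch below expands both ranges into explicit configurations. It assumes an exhaustive grid over every combination, which the source does not confirm; the dictionary keys and the `expand` helper are illustrative names.

```python
from itertools import product

# Search ranges copied from the table above.
full_grid = {
    "neurons": [2, 3, 4, 5, 10],
    "hidden_layers": [0, 1, 2],
    "batch_size": [16, 32],
    "epochs": [15, 20, 60, 100, 140],
    "learning_rate": [0.01, 0.001, 0.0005],
}
limited_grid = {
    "neurons": [1, 2, 3, 4, 5, 7, 9, 12],
    "hidden_layers": [0, 1, 2, 3],
    "batch_size": [16, 32],
    "epochs": [15, 20, 60, 100, 140],
    "learning_rate": [0.01, 0.001, 0.0005],
}

def expand(grid):
    """Return one dict per hyperparameter combination in the grid."""
    keys = list(grid)
    return [dict(zip(keys, combo)) for combo in product(*(grid[k] for k in keys))]

full_configs = expand(full_grid)
limited_configs = expand(limited_grid)
print(len(full_configs), len(limited_configs))
```

Under this assumption the full-model range yields 450 configurations and the limited-model range 960. Configurations with zero hidden layers differ only nominally in their neuron count, so the number of distinct architectures is somewhat smaller.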
Figure 5. Families of models from the multidimensional hyperparameter sweeps on the full dataset (left) and the limited dataset (right). On each subfigure, an arrow indicates the best model.
Hyperparameters and metrics for the five best neural network models for the full 21-feature dataset (top) and the limited four-feature dataset (bottom). Bold text indicates the best model.
| Dataset | Neurons | Layers | Learning Rate | Batch Size | Epochs | Mean Fold | Inter-Fold Variance (%) |
|---|---|---|---|---|---|---|---|
| Full | 10 | 1 | 0.001 | 16 | 140 | 87.2 | 13 |
| Full | 10 | 1 | 0.01 | 32 | 20 | 85.9 | 15 |
| Full | 3 | 2 | 0.01 | 32 | 140 | 84.6 | 8 |
| Full | 5 | 2 | 0.01 | 16 | 20 | 84.0 | 2 |
| Limited | 3 | 1 | 0.01 | 16 | 100 | 91.7 | 10 |
| Limited | 3 | 0 | 0.01 | 16 | 60 | 91.0 | 6 |
| Limited | 40 | 1 | 0.001 | 32 | 60 | 91.0 | 10 |
| Limited | 5 | 0 | 0.01 | 32 | 60 | 91.0 | 8 |
Figure 6. For the full model (left) and limited model (right): (a1,a2) Network structure where green are the input features, blue are neurons, and red is the output neuron; (b1,b2) training curves for the training (blue) and holdout (orange) datasets; (c1,c2) ROC curves and AUC for the training (blue) and holdout (orange) datasets; and (d1,d2) confusion matrix.
Neural network modeling results as measured on the holdout dataset. Precision and recall are presented as weighted averages across the pass/fail classes.
| Model | AUC | Precision | Recall | Accuracy |
|---|---|---|---|---|
| Baseline | 0.94 | 0.89 | 0.90 | 0.90 |
| Full 21-input model | 0.97 | 0.92 | 0.93 | 0.93 |
| Limited 4-input model | 0.96 | 0.93 | 0.93 | 0.93 |
| Goal | 0.80 | -- | -- | 0.90 |
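The class-weighted precision and recall reported in these tables can be derived from a binary confusion matrix, as sketched below. The sketch assumes each class is weighted by its number of true examples (the convention used by common libraries); the function name `weighted_metrics` and the counts are hypothetical and are not taken from the study's confusion matrices.

```python
def weighted_metrics(tn, fp, fn, tp):
    """Support-weighted precision and recall plus accuracy for a 2-class problem.

    Class 0 = fail, class 1 = pass; tn/fp/fn/tp are counted with pass as positive.
    """
    total = tn + fp + fn + tp
    n0, n1 = tn + fp, fn + tp                      # true examples per class
    prec0 = tn / (tn + fn) if (tn + fn) else 0.0   # precision for "fail"
    prec1 = tp / (tp + fp) if (tp + fp) else 0.0   # precision for "pass"
    rec0 = tn / n0 if n0 else 0.0                  # recall for "fail"
    rec1 = tp / n1 if n1 else 0.0                  # recall for "pass"
    return {
        "precision": (n0 * prec0 + n1 * prec1) / total,
        "recall": (n0 * rec0 + n1 * rec1) / total,
        "accuracy": (tn + tp) / total,
    }

# Hypothetical holdout counts for illustration only.
m = weighted_metrics(tn=8, fp=4, fn=2, tp=36)
print({k: round(v, 3) for k, v in m.items()})
```

With support weighting, the weighted recall reduces algebraically to (TN + TP) / N, which is the accuracy. That is consistent with the near-identical Recall and Accuracy columns in the results tables.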