Rana Zia Ur Rehman1, Yuhan Zhou2, Silvia Del Din1, Lisa Alcock1, Clint Hansen3, Yu Guan4, Tibor Hortobágyi2, Walter Maetzler3, Lynn Rochester1,5, Claudine J C Lamoth2.
Abstract
Falls are the leading cause of mortality, morbidity and poor quality of life in older adults with or without neurological conditions. Applying machine learning (ML) models to gait analysis outcomes offers the opportunity to identify individuals at risk of future falls. The aim of this study was to determine how different data pre-processing methods affect the performance of ML models in distinguishing neurological patients who have fallen from those who have not, for future fall risk assessment. Gait was assessed in clinic using wearables while 349 neurological patients (159 fallers, 190 non-fallers) walked 20 m at a self-selected comfortable pace. Six different ML models were trained on data pre-processed with three techniques: standardisation, principal component analysis (PCA) and the path signature method. Fallers walked more slowly, with shorter strides and longer stride duration, than non-fallers. Overall, model accuracy ranged between 48% and 98%, with 43-99% sensitivity and 48-98% specificity. A random forest (RF) classifier trained on data pre-processed with the path signature method gave the best classification accuracy of 98%, with 99% sensitivity and 98% specificity. Data pre-processing directly influences how accurately ML models classify fallers. Gait analysis combined with trained ML models can act as a tool for the proactive assessment of fall risk and support clinical decision-making.
Keywords: classification; data pre-processing; fall; fall risk assessment; gait; inertial measurement unit; machine learning; neurological disorders; path signature; wearables
Year: 2020 PMID: 33297395 PMCID: PMC7729621 DOI: 10.3390/s20236992
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Protocol for gait assessment in the neurology ward.
Figure A1. Piecewise linear path for signature extractions for training ML models.
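The piecewise linear path mentioned in the Figure A1 caption can be illustrated with a minimal sketch of a truncated path signature. The code below assumes a depth-2 truncation and builds it segment by segment with Chen's identity; the function name `signature_depth2` and the example path are hypothetical, and the signature depth and path construction actually used in the study are not stated in this record.

```python
from typing import List, Sequence, Tuple


def signature_depth2(
    path: Sequence[Sequence[float]],
) -> Tuple[List[float], List[List[float]]]:
    """Depth-2 signature of a piecewise linear path (illustrative sketch).

    Level 1 holds the total increment per channel; level 2 holds the
    iterated integrals S[i][j]. Each linear segment with increment d
    contributes d at level 1 and (d outer d) / 2 at level 2, and segments
    are concatenated with Chen's identity.
    """
    dim = len(path[0])
    level1 = [0.0] * dim
    level2 = [[0.0] * dim for _ in range(dim)]
    for prev, curr in zip(path, path[1:]):
        delta = [c - p for p, c in zip(prev, curr)]
        # Chen's identity: cross term with the signature so far, plus the
        # segment's own level-2 term.
        for i in range(dim):
            for j in range(dim):
                level2[i][j] += level1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        for i in range(dim):
            level1[i] += delta[i]
    return level1, level2


# Hypothetical two-channel path: one step right, then one step up.
path = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
level1, level2 = signature_depth2(path)
levy_area = 0.5 * (level2[0][1] - level2[1][0])
```

The flattened entries of `level1` and `level2` would then serve as the feature vector for a classifier. The antisymmetric part of level 2 (`levy_area`) captures the order in which the channels move, which plain summary statistics do not.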
Demographic characteristics of study participants (mean ± SD unless otherwise stated).
| Demographics | Non-Fallers (n = 190) | Fallers (n = 159) | p-value |
|---|---|---|---|
| M/F | 115/75 | 88/71 | 0.330 |
| Age (year) | 61.6 ± 12.2 | 65.0 ± 12.7 | 0.009 |
| Height (m) | 1.73 ± 0.1 | 1.70 ± 0.1 | 0.021 |
| Mass (kg) | 81.89 ± 16.35 | 76.31 ± 14.87 | 0.002 |
| BMI (kg/m2) | 27.22 ± 4.76 | 26.08 ± 4.34 | 0.027 |
SD: standard deviation; M: male; F: female; BMI: body mass index. A p-value < 0.05 was considered statistically significant in the independent t-test (Age, Height, Mass and BMI) and the chi-squared test (M/F).
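As a worked check of the chi-squared comparison for sex, the sketch below computes a Pearson chi-squared statistic for the 2×2 M/F table using only the Python standard library. It assumes no continuity correction, which this record does not specify, and the function name `chi_squared_2x2` is hypothetical. Under that assumption the p-value comes out close to the tabulated 0.330.

```python
import math
from typing import Tuple


def chi_squared_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-squared test for the table [[a, b], [c, d]].

    Assumption: no Yates continuity correction. With 1 degree of freedom
    the upper-tail probability is erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: non-fallers (115 M, 75 F) and fallers (88 M, 71 F), from the table.
chi2, p_value = chi_squared_2x2(115, 75, 88, 71)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```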
Figure 2. Radar plot indicating the difference between fallers and non-fallers in a range of gait characteristics using z-scores (* indicates significant difference between groups (p-value < 0.05)).
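The z-scores used in the radar plot, and the standardisation pre-processing step more generally, follow the usual transformation z = (x − mean) / SD. The sketch below is a minimal illustration that assumes the sample standard deviation and a single pooled reference group; the reference group used for the figure is not stated in this record, and the gait speed values are invented for illustration.

```python
from statistics import mean, stdev
from typing import List, Sequence


def z_scores(values: Sequence[float]) -> List[float]:
    """Standardise values to zero mean and unit (sample) standard deviation."""
    mu = mean(values)
    sd = stdev(values)  # assumption: sample SD (n - 1 denominator)
    return [(v - mu) / sd for v in values]


# Hypothetical gait speeds in m/s, not taken from the study.
gait_speed = [1.10, 0.95, 1.25, 0.80, 1.05]
standardised = z_scores(gait_speed)
```

After this step every gait characteristic is on the same scale, so characteristics measured in different units (m/s, m, s) can be plotted on one radar chart or passed to a scale-sensitive classifier.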
Figure 3. Correlation among the gait characteristics; the larger the circle, the higher the correlation. Blue indicates positive correlations and red indicates negative correlations.
Figure 4. Number of components selected from the principal component analysis (PCA) for training the classifiers.
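ML model results for accuracy in % (sensitivity, specificity), by data pre-processing method. The final column is the mean of the five preceding columns.

Standardisation

| ML model | 1 | 2 | 3 | 4 | 5 | Mean |
|---|---|---|---|---|---|---|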
| LDA | 68.57(0.79, 0.56) | 61.9(0.49, 0.73) | 63.81(0.51, 0.76) | 53.33(0.57, 0.51) | 67.9(0.76, 0.55) | 63.10(0.62, 0.62) |
| LR | 71.43(0.79, 0.63) | 63.8(0.49, 0.77) | 61.9(0.51, 0.72) | 47.62(0.54, 0.42) | 59(0.62, 0.55) | 60.75(0.59, 0.62) |
| NB | 62.86(0.76, 0.49) | 71.43(0.55,0.86) | 62.86(0.47, 0.78) | 72.38(0.57, 0.85) | 69.52(0.86, 0.43) | 67.81(0.64, 0.68) |
| SVM | 70.48(0.87, 0.53) | 68.57(0.53,0.82) | 67.62(0.47, 0.87) | 70.48(0.52, 0.85) | 75.24(0.83, 0.63) | 70.48(0.64, 0.74) |
| KNN | 64.76(0.72,0.57) | 60.95(0.59, 0.63) | 61.9(0.55, 0.69) | 64.76(0.54, 0.73) | 59.05(0.63, 0.53) | 62.28(0.61, 0.63) |
| RF | 63.8(0.72,0.55) | 70.48(0.61, 0.79) | 61.9(0.51, 0.72) | 67.62(0.59, 0.75) | 67.62(0.71, 0.63) | 66.284(0.63, 0.69) |

PCA

| ML model | 1 | 2 | 3 | 4 | 5 | Mean |
|---|---|---|---|---|---|---|
| LDA | 60(0.76, 0.43) | 67.62(0.51, 0.82) | 38.1(0.31,0.44) | 44.76(0.35, 0.53) | 34.29(0.32, 0.38) | 48.95(0.45, 0.52) |
| LR | 58.1(0.7, 0.45) | 67.62(0.51, 0.82) | 38.1(0.33, 0.43) | 42.86(0.39, 0.46) | 32.38(0.29, 0.38) | 47.81(0.44, 0.51) |
| NB | 58.1(0.78, 0.37) | 66.67(0.49, 0.82) | 39.05(0.31, 0.46) | 42.86(0.33, 0.51) | 49.52(0.65, 0.25) | 51.24(0.51, 0.48) |
| SVM | 58.1(0.85, 0.29) | 67.62(0.41, 0.91) | 42.86(0.26, 0.59) | 49.52(0.24, 0.7) | 35.24(0.39, 0.3) | 50.67(0.43, 0.56) |
| KNN | 55.24(0.69, 0.41) | 61.9(0.57, 0.66) | 45.7(0.24, 0.67) | 39.05(0.44, 0.36) | 49.52(0.51, 0.48) | 50.28(0.49, 0.52) |
| RF | 60.95(0.69, 0.53) | 64.76(0.59, 0.7) | 36.19(0.41, 0.32) | 43.81(0.52, 0.37) | 35.23(0.25, 0.53) | 48.19(0.49, 0.49) |

Path signature

| ML model | 1 | 2 | 3 | 4 | 5 | Mean |
|---|---|---|---|---|---|---|
| LDA | 91.43(0.961,0.87) | 91.43(0.94, 0.89) | 90.47(0.88, 0.93) | 92.38(0.83, 1) | 93.33(0.95,0.923) | 91.81(0.91, 0.92) |
| LR | 95.24(0.98, 0.93) | 98.095(1,0.96) | 95.2(0.96, 0.94) | 95.24(0.94, 0.97) | 95.24(0.95,0.953) | 95.80(0.97, 0.95) |
| NB | 54.28(0.137,0.93) | 66.67(0.33, 0.96) | 57.14(0.86, 0.29) | 70.47(0.37, 0.97) | 66.67(0.23, 0.94) | 63.05(0.38, 0.82) |
| SVM | 95.24(0.96, 0.94) | 96.19(1, 0.93) | 94.28(0.94, 0.94) | 93.33(0.85, 1) | 96.19(1, 0.94) | 95.05(0.95, 0.95) |
| KNN | 64.76(0.51, 0.78) | 65.71(0.53, 0.77) | 58.09(0.47, 0.69) | 63.810(0.50,0.75) | 62.857(0.43,0.75) | 63.04(0.49, 0.75) |
| RF | 99.05 (1,0.98) | 98.09 (1,0.96) | 98.09 (0.98, 0.98) | 99.05(0.98, 1) | 99.05 (1, 0.98) | 98.67(0.99, 0.98) |
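
ML model results for F1 score (AUC), by data pre-processing method. The final column is the mean of the five preceding columns.

Standardisation

| ML model | 1 | 2 | 3 | 4 | 5 | Mean |
|---|---|---|---|---|---|---|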
| LDA | 0.68(0.71) | 0.61(0.7) | 0.63(0.63) | 0.54(0.54) | 0.67(0.67) | 0.63(0.65) |
| LR | 0.71(0.7) | 0.63(0.69) | 0.61(0.63) | 0.48(0.49) | 0.6(0.63) | 0.61(0.63) |
| NB | 0.62(0.66) | 0.71(0.76) | 0.62(0.65) | 0.72(0.74) | 0.68(0.67) | 0.67(0.7) |
| SVM | 0.7(0.7) | 0.68(0.68) | 0.66(0.67) | 0.7(0.69) | 0.75(0.73) | 0.7(0.7) |
| KNN | 0.65(0.67) | 0.61(0.59) | 0.62(0.65) | 0.65(0.64) | 0.59(0.62) | 0.62(0.63) |
| RF | 0.64(0.7) | 0.7(0.74) | 0.61(0.64) | 0.67(0.72) | 0.68(0.76) | 0.66(0.71) |

PCA

| ML model | 1 | 2 | 3 | 4 | 5 | Mean |
|---|---|---|---|---|---|---|
| LDA | 0.59(0.62) | 0.67(0.73) | 0.38(0.34) | 0.45(0.41) | 0.35(0.31) | 0.49(0.48) |
| LR | 0.57(0.62) | 0.67(0.73) | 0.38(0.34) | 0.43(0.41) | 0.33(0.30) | 0.48(0.48) |
| NB | 0.56(0.60) | 0.66(0.71) | 0.39(0.35) | 0.43(0.37) | 0.48(0.39) | 0.50(0.48) |
| SVM | 0.55(0.57) | 0.65(0.66) | 0.41(0.42) | 0.47(0.47) | 0.36(0.34) | 0.49(0.49) |
| KNN | 0.54(0.57) | 0.62(0.65) | 0.43(0.47) | 0.57(0.56) | 0.50(0.50) | 0.53(0.55) |
| RF | 0.61(0.65) | 0.65(0.67) | 0.36(0.35) | 0.44(0.42) | 0.34(0.35) | 0.48(0.49) |

Path signature

| ML model | 1 | 2 | 3 | 4 | 5 | Mean |
|---|---|---|---|---|---|---|
| LDA | 0.916(0.916) | 0.911(0.916) | 0.900(0.904) | 0.905(0.913) | 0.916(0.937) | 0.909(0.917) |
| LR | 0.952(0.953) | 0.980(0.982) | 0.951(0.953) | 0.945(0.950) | 0.938(0.952) | 0.953(0.958) |
| NB | 0.226(0.532) | 0.478(0.645) | 0.662(0.580) | 0.523(0.668) | 0.340(0.582) | 0.445(0.601) |
| SVM | 0.951(0.953) | 0.961(0.964) | 0.941(0.943) | 0.918(0.924) | 0.952(0.969) | 0.945(0.951) |
| KNN | 0.584(0.644) | 0.591(0.649) | 0.522(0.578) | 0.548(0.623) | 0.466(0.589) | 0.542(0.616) |
| RF | 0.990(0.991) | 0.980(0.982) | 0.980(0.981) | 0.989(0.989) | 0.988(0.992) | 0.985(0.987) |

Figure 5. Classification performance of the ML models based on the F1 score. LDA: linear discriminant analysis; LR: logistic regression; NB: Naïve Bayes; SVM: support vector machine; KNN: k-nearest neighbour; RF: random forest.
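The reported metrics can all be derived from a confusion matrix. The sketch below shows the standard definitions, assuming fallers are the positive class and that F1 is the plain binary F1 score; whether the study used a weighted or macro-averaged variant is not stated here. The function name `classification_metrics` and the confusion counts are hypothetical.

```python
from typing import Dict


def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Standard binary classification metrics from confusion-matrix counts.

    Assumption: the positive class is "faller".
    """
    sensitivity = tp / (tp + fn)  # fallers correctly identified
    specificity = tn / (tn + fp)  # non-fallers correctly identified
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for a test set of 105 patients, not taken from the study.
metrics = classification_metrics(tp=40, fn=8, tn=50, fp=7)
```

Reporting sensitivity and specificity alongside accuracy matters here: several rows in the tables above pair a moderate accuracy with very unequal sensitivity and specificity, which accuracy alone would hide.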

Accuracy in % (sensitivity in %, specificity in %) of ML models by data pre-processing method.

| ML Models | Standardisation | PCA | Path Signature |
|---|---|---|---|
| Linear Discriminant Analysis (LDA) | 63.10(62, 62) | 48.95(45, 52) | 91.81(91, 92) |
| Logistic Regression (LR) | 60.75(59, 62) | 47.81(44, 51) | 95.80(97, 95) |
| Naïve Bayes (NB) | 67.81(64, 68) | 51.24(51, 48) | 63.05(38, 82) |
| Support Vector Machine (SVM-linear) | 70.48(64, 74) | 50.67(43, 56) | 95.05(95, 95) |
| K-Nearest Neighbour (KNN) | 62.28(61, 63) | 50.28(49, 52) | 63.04(49, 75) |
| Random Forest (RF) | 66.28(63, 69) | 48.19(49, 49) | 98.67(99, 98) |
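
The accuracies in this summary table appear to be the arithmetic means of the five per-column accuracies listed for each model in the detailed accuracy table earlier in this record. The short check below illustrates this for two rows, using values copied from that table; the variable names are illustrative only.

```python
from statistics import mean

# Five per-column accuracies (%) copied from the detailed accuracy table.
rf_path_signature = [99.05, 98.09, 98.09, 99.05, 99.05]
lda_standardisation = [68.57, 61.9, 63.81, 53.33, 67.9]

# Round to two decimals, as in the summary table.
rf_summary = round(mean(rf_path_signature), 2)
lda_summary = round(mean(lda_standardisation), 2)
print(rf_summary, lda_summary)
```

The results agree with the tabulated 98.67 for RF with the path signature and 63.10 for LDA with standardisation.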