Angkoon Phinyomark, Giovanni Petri, Esther Ibáñez-Marcelo, Sean T Osis, Reed Ferber.
Abstract
The increasing amount of data in biomechanics research has greatly increased the importance of developing advanced multivariate analysis and machine learning techniques, which are better able to handle "big data". Consequently, advances in data science methods will expand the knowledge for testing new hypotheses about biomechanical risk factors associated with walking and running gait-related musculoskeletal injury. This paper begins with a brief introduction to an automated three-dimensional (3D) biomechanical gait data collection system: 3D GAIT, followed by how the studies in the field of gait biomechanics fit the quantities in the 5 V's definition of big data: volume, velocity, variety, veracity, and value. Next, we provide a review of recent research and development in multivariate and machine learning methods-based gait analysis that can be applied to big data analytics. These modern biomechanical gait analysis methods include several main modules such as initial input features, dimensionality reduction (feature selection and extraction), and learning algorithms (classification and clustering). Finally, a promising big data exploration tool called "topological data analysis" and directions for future research are outlined and discussed.
Keywords: Biomechanics; Data science; Gait; Kinematics; Principal component analysis; Support vector machine; Topological data analysis
Year: 2017; PMID: 29670502; PMCID: PMC5897457; DOI: 10.1007/s40846-017-0297-2
Source DB: PubMed; Journal: J Med Biol Eng; ISSN: 1609-0985; Impact factor: 1.553
Fig. 1 Marker placement for standard biomechanical gait analysis
Fig. 2 The main components of modern biomechanical gait analysis
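The three modules named in the abstract (initial input features, dimensionality reduction, and a learning algorithm) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic rather than real gait variables, feature selection uses a simple standardized mean-difference score, and a nearest-centroid rule stands in for the SVM and LDA classifiers used in the reviewed studies.

```python
import random
import statistics


def make_synthetic_gait_data(n_per_group=40, n_features=20, seed=0):
    """Synthetic stand-in for gait variables; only features 0 and 1 differ between groups."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_group):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            row[0] += 3.0 * label  # hypothetical group effect
            row[1] -= 3.0 * label  # hypothetical group effect
            X.append(row)
            y.append(label)
    return X, y


def select_features(X, y, k):
    """Feature selection: rank features by |mean difference| / pooled SD, keep the top k."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 0]
        b = [row[j] for row, lab in zip(X, y) if lab == 1]
        pooled_sd = ((statistics.pvariance(a) + statistics.pvariance(b)) / 2) ** 0.5
        scores.append((abs(statistics.mean(a) - statistics.mean(b)) / pooled_sd, j))
    return sorted(j for _, j in sorted(scores, reverse=True)[:k])


def fit_centroids(X, y, features):
    """Learning step: store the mean of each class in the reduced feature space."""
    centroids = {}
    for label in set(y):
        rows = [[row[j] for j in features] for row, lab in zip(X, y) if lab == label]
        centroids[label] = [statistics.mean(col) for col in zip(*rows)]
    return centroids


def predict(row, features, centroids):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    point = [row[j] for j in features]
    return min(
        centroids,
        key=lambda lab: sum((p - c) ** 2 for p, c in zip(point, centroids[lab])),
    )


X, y = make_synthetic_gait_data()
# Simple hold-out split: even-indexed subjects for training, odd-indexed for testing.
train = [i for i in range(len(X)) if i % 2 == 0]
test = [i for i in range(len(X)) if i % 2 == 1]
X_train, y_train = [X[i] for i in train], [y[i] for i in train]

selected = select_features(X_train, y_train, k=2)
centroids = fit_centroids(X_train, y_train, selected)
accuracy = sum(predict(X[i], selected, centroids) == y[i] for i in test) / len(test)
print("selected features:", selected, "hold-out accuracy:", accuracy)
```

Selecting features on the training subjects only, and scoring on held-out subjects, keeps the reported accuracy from being inflated by the selection step.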
Summary of biomechanical gait analysis studies using data science methods and their research question of interest
| Reference | Number of subjects | Initial input features | Dimensionality reduction | Learning algorithms | Research question of interest |
|---|---|---|---|---|---|
| [ | 96 | 900 features (9 running kinematic waveforms) | MR (the best 3 angles/waveform) | – | Differences between male and female runners experiencing ITBS at the time of testing and healthy gender- and age-matched runners |
| [ | 483 | 72 features (running kinematic variables) | MR (the best 8–62 PCs) | SVM (78.4–100%) | Gender- and age-related differences in healthy runners |
| [ | 34 | 31 features (running kinematic variables) | SFS (the best 6 features) | SVM (100%) | Age differences in healthy runners |
| [ | 92 | 51 features (running kinematic and kinetic variables) | Several feature extraction methods → AdaBoost (as part of the classifier) | AdaBoost (84.7–100%) | Differences between gender, shod/barefoot running, and runners with and without PFP |
| [ | 40 | 505 features (5 running kinematic waveforms) | PCA (the first 3 PCs/waveform) | – | Differences between female runners with previous ITBS and female healthy runners |
| [ | 72 | 902 features (9 running kinematic waveforms + 2 clinical variables) | PCA (only kinematic waveforms) → SFS (the best 2 PCs) | LDA (78.1%) | Prediction of the response to exercise treatment for patients with PFP |
| [ | 98 | 604 features (6 walking kinematic waveforms + 4 clinical variables) | PCA (only kinematic waveforms) → SFS (the best 6 PCs and 1 clinical variable) | LDA (85.4%) | Prediction of the response to exercise treatment for patients with knee OA |
| [ | 200 | 4 features (running kinematic variables) or 100 features (a running kinematic waveform) | PCA, Kernel PCA (the first 7 or 10 PCs) | – | Gender and age differences in healthy runners |
| [ | 11 | 3939 features (39 running marker position waveforms) | PCA, PCA with SVM, ICA → MR | – | Differences between movements resulting from wearing shoes with different midsoles |
| [ | 121 | 900 features (9 running kinematic waveforms) | PCA (the first 4 PCs/waveform) | HCA | Defining distinct groups of healthy runners and investigating the practical implications of clustering healthy subjects |
| [ | 88 | 3636 features (36 running marker position waveforms) | SOFM | | Defining functional groups of runners and understanding whether the defined groups require group-specific footwear features |
Fig. 3 Pipeline of topological simplification (the Mapper algorithm)
Fig. 4 Simplicial complex and 0-, 1-, and 2-dimensional simplices
Fig. 5 Clique complex where cliques of size one, two, three, and four are shown as small red disks, black line segments, light blue triangles, and dark blue tetrahedra, respectively
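As a minimal sketch of how a clique complex is obtained from a graph: every set of mutually adjacent vertices (a clique) of size k is treated as a simplex of dimension k − 1. The graph below is a hypothetical example chosen for illustration, not the one drawn in Fig. 5, and `clique_complex` is an illustrative helper name.

```python
from itertools import combinations


def clique_complex(vertices, edges, max_size=4):
    """Return {clique_size: [sorted vertex tuples]} for all cliques up to max_size."""
    adjacent = {frozenset(edge) for edge in edges}
    simplices = {}
    for size in range(1, max_size + 1):
        simplices[size] = [
            combo
            for combo in combinations(sorted(vertices), size)
            # A vertex set is a clique when every pair inside it is an edge.
            if all(frozenset(pair) in adjacent for pair in combinations(combo, 2))
        ]
    return simplices


# Hypothetical graph: a complete graph on {0, 1, 2, 3} plus a triangle {3, 4, 5}.
vertices = range(6)
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5)]

complex_by_size = clique_complex(vertices, edges)
for size, cliques in complex_by_size.items():
    print(f"cliques of size {size}: {len(cliques)}")
```

The brute-force enumeration is only practical for small graphs; it is used here because it mirrors the definition directly.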
Fig. 6 Čech complex where 1-simplices and 2-simplices are shown as red lines and filled blue triangles, respectively
Fig. 7 a Čech complex where a fixed set of points (step 1) can be transformed into different Čech complexes based on a proximity parameter r. b Barcode of H0 (the number of connected components) and H1 (the number of cycles) according to the evolution of the proximity parameter r
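The H0 part of such a barcode can be sketched with a union-find structure. This is a minimal illustration under stated assumptions, not the computation behind Fig. 7: the points are hypothetical, balls are taken to be closed with radius r (so two points become connected once their distance is at most 2r), and only H0 is computed. H0 depends only on which pairs of balls overlap, so no higher-dimensional simplices are needed here; H1 would require them.

```python
from itertools import combinations
from math import dist, inf


def h0_barcode(points):
    """Return H0 bars (birth, death) as the ball radius r grows from 0.

    Assumes closed balls of radius r: points i and j connect when dist(i, j) <= 2r.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the component root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Each pair of points becomes an edge at radius r = distance / 2.
    edges = sorted(
        (dist(points[i], points[j]) / 2, i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    bars = []
    for r, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # two components merge, so one bar ends at r
            bars.append((0.0, r))
    bars.append((0.0, inf))  # one component survives for all r
    return bars


def count_components(bars, r):
    """Number of connected components at radius r, read off the barcode."""
    return sum(1 for birth, death in bars if birth <= r < death)


# Hypothetical point cloud: three nearby points and one distant point.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 0.0)]
bars = h0_barcode(points)
for r in (0.0, 1.0, 3.0):
    print(f"r = {r}: {count_components(bars, r)} connected component(s)")
```

Long bars correspond to components that stay separate over a wide range of r, which is how persistent features are distinguished from noise in a barcode.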