Brian Horsak, Djordje Slijepcevic, Anna-Maria Raberger, Caterine Schwab, Marianne Worisch, Matthias Zeppelzauer.
Abstract
Measuring ground reaction forces (GRF) is a standard way for clinicians to quantify and analyze human locomotion. Such recordings produce a vast amount of complex data and variables that are difficult to comprehend, which makes data interpretation challenging. Machine learning approaches are promising tools to support clinicians in identifying and categorizing specific gait patterns. However, the quality of such approaches strongly depends on the amount of annotated data available to train the underlying models. Therefore, we present GaitRec, a comprehensive and completely annotated large-scale dataset containing bi-lateral GRF walking trials of 2,084 patients with various musculoskeletal impairments and data from 211 healthy controls. The dataset comprises data of patients after joint replacement, fractures, ligament ruptures, and related disorders at the hip, knee, ankle or calcaneus during their entire stay(s) at a rehabilitation center. The data sum up to a total of 75,732 bi-lateral walking trials and enable researchers to classify gait patterns at a large scale as well as to analyze the entire recovery process of patients.
Year: 2020 PMID: 32398644 PMCID: PMC7217853 DOI: 10.1038/s41597-020-0481-z
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Demographic overview of the dataset and the pre-defined classes.
| Class | N | Age (yrs.) Mean (SD) | Body mass (kg) Mean (SD) | Sex (m/f) | Bi-lateral Trials |
|---|---|---|---|---|---|
| Healthy C. | 211 | 34.7 (13.9) | 73.9 (15.6) | 104/107 | 7,755 |
| Hip | 450 | 42.6 (12.8) | 82.4 (15.6) | 373/77 | 12,748 |
| Knee | 625 | 41.6 (12.0) | 84.3 (18.6) | 426/199 | 19,873 |
| Ankle | 627 | 41.6 (11.4) | 87.0 (18.0) | 498/129 | 21,386 |
| Calcaneus | 382 | 43.5 (10.4) | 84.0 (14.5) | 339/43 | 13,970 |
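The per-class counts in the table can be checked against the totals stated in the abstract. The short sketch below only re-adds the numbers copied from the table; the variable names are illustrative.

```python
# Per-class counts copied from the demographic table: (subjects, bi-lateral trials).
classes = {
    "Healthy C.": (211, 7755),
    "Hip": (450, 12748),
    "Knee": (625, 19873),
    "Ankle": (627, 21386),
    "Calcaneus": (382, 13970),
}

# Patients are all subjects except the healthy controls.
patients = sum(n for name, (n, _) in classes.items() if name != "Healthy C.")
# Trials are counted over all classes, healthy controls included.
trials = sum(t for _, t in classes.values())

print(patients)  # 2084
print(trials)    # 75732
```

Both sums match the abstract: 2,084 patients and 75,732 bi-lateral walking trials.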
Fig. 1 Class taxonomy. The class structure and the dependencies between the classes of the GaitRec dataset: Healthy Controls (HC), Gait Disorders (GD), Hip (H), Knee (K), Ankle (A), and Calcaneus (C). Details of the subclasses are described in Section Dataset & Annotation.
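One way to work with this taxonomy in code is a small parent-to-children mapping. The sketch below assumes that GD is the parent of the four impairment classes H, K, A, and C, which is how the caption reads; the subclasses are omitted because they are not listed here. `TAXONOMY` and `top_level` are hypothetical names.

```python
# Parent -> children, using the class abbreviations from the Fig. 1 caption.
# Assumption: GD (Gait Disorders) groups the four impairment classes.
TAXONOMY = {"HC": [], "GD": ["H", "K", "A", "C"]}


def top_level(label):
    """Map a class label to its top-level class (HC or GD)."""
    for parent, children in TAXONOMY.items():
        if label == parent or label in children:
            return parent
    raise ValueError(f"unknown class label: {label!r}")


print(top_level("K"))   # GD
print(top_level("HC"))  # HC
```

Such a mapping could be used to collapse the five-class problem into a binary healthy-versus-impaired task.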
Description of the data stored in the “GRF_*.csv” files. In the associated file names, “*” is a placeholder for “right” or “left”.
| Variables | Associated file | Format | Dimension | Unit | Description |
|---|---|---|---|---|---|
| Vertical GRF | GRF_F_V_RAW_*.csv | double | 1 × n | Newton | Raw vertical ground reaction force |
| Anterior-posterior GRF | GRF_F_AP_RAW_*.csv | double | 1 × n | Newton | Raw braking and propulsive shear force |
| Medio-lateral GRF | GRF_F_ML_RAW_*.csv | double | 1 × n | Newton | Raw medio-lateral shear force |
| COP anterior-posterior | GRF_COP_AP_RAW_*.csv | double | 1 × n | Centimeter | Raw COP coordinate in walking direction |
| COP medio-lateral | GRF_COP_ML_RAW_*.csv | double | 1 × n | Centimeter | Raw COP coordinate in medio-lateral direction |
| Vertical GRF | GRF_F_V_PRO_*.csv | double | 1 × n | Multiple of body weight | Post-processed vertical ground reaction force |
| Anterior-posterior GRF | GRF_F_AP_PRO_*.csv | double | 1 × n | Multiple of body weight | Post-processed braking and propulsive shear force |
| Medio-lateral GRF | GRF_F_ML_PRO_*.csv | double | 1 × n | Multiple of body weight | Post-processed medio-lateral shear force |
| COP anterior-posterior | GRF_COP_AP_PRO_*.csv | double | 1 × n | % stance | Post-processed COP coordinate in walking direction |
| COP medio-lateral | GRF_COP_ML_PRO_*.csv | double | 1 × n | % stance | Post-processed COP coordinate in medio-lateral direction |
n is either the number of frames during one step across the force plate for the unprocessed data (“RAW”) or a time-normalized vector of 101 points for the post-processed (“PRO”) data. Note that the first three columns of each file hold the SUBJECT_ID, SESSION_ID, and TRIAL_ID.
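The row layout described above (three ID columns followed by the signal) and the 101-point time normalization can be illustrated with a minimal sketch. It makes several assumptions: each file has a header row, raw rows may carry empty trailing cells, and time normalization is approximated by plain linear interpolation. This is not necessarily the authors' processing pipeline. The sample values and the helpers `split_row` and `resample` are made up for illustration.

```python
import csv
import io


def split_row(row):
    """Split one CSV row into (subject_id, session_id, trial_id) and the signal."""
    ids = tuple(int(v) for v in row[:3])
    # Assumption: empty cells are padding and can be skipped.
    signal = [float(v) for v in row[3:] if v != ""]
    return ids, signal


def resample(signal, n_points=101):
    """Linearly interpolate a signal onto n_points evenly spaced samples."""
    if len(signal) < 2:
        raise ValueError("need at least two samples")
    last = len(signal) - 1
    out = []
    for i in range(n_points):
        pos = i * last / (n_points - 1)  # fractional index into the signal
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(signal[lo] * (1 - frac) + signal[hi] * frac)
    return out


# Hypothetical file content; the force values are invented.
sample = (
    "SUBJECT_ID,SESSION_ID,TRIAL_ID,F1,F2,F3\n"
    "1,10,1,0.0,400.0,0.0\n"
)
reader = csv.reader(io.StringIO(sample))
header = next(reader)                # skip the assumed header row
ids, raw = split_row(next(reader))
normalized = resample(raw)

print(ids)              # (1, 10, 1)
print(len(normalized))  # 101
```

With real data, `io.StringIO(sample)` would be replaced by an opened “RAW” file; the “PRO” files already hold 101 points per trial and need no resampling.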
Description of the information stored in the metadata file.
| Categories/Variables | Format | Unit | Description |
|---|---|---|---|
| SUBJECT_ID | integer | — | Unique identifier of a subject |
| SESSION_ID | integer | — | Unique identifier of a session |
| CLASS_LABEL | string | — | Annotated class labels |
| CLASS_LABEL_DETAILED | string | — | Annotated class labels for subclasses |
| SEX | binary | — | female = 0, male = 1 |
| AGE | integer | years | Age at recording date |
| HEIGHT | integer | centimeter | Body height in centimeters |
| BODY_WEIGHT | double | Newton | Body weight |
| BODY_MASS | double | kg | Body mass |
| SHOE_SIZE | double | EU | Shoe size in the Continental European System |
| AFFECTED_SIDE | integer | — | left = 0, right = 1, both = 2 |
| SHOD_CONDITION | integer | — | barefoot & socks = 0, normal shoe = 1, orthopedic shoe = 2 |
| ORTHOPEDIC_INSOLE | binary | — | without insole = 0, with insole = 1 |
| SPEED | integer | — | slow = 1, self-selected = 2, fast = 3 walking speed |
| READMISSION | integer | — | number of re-admissions = 0 … n |
| SESSION_TYPE | integer | — | initial measurement = 1, control measurement = 2, initial measurement after readmission = 3 |
| SESSION_DATE | string | — | date of recording session in the format “DD-MM-YYYY” |
| TRAIN | binary | — | is part (=1) or is not part (=0) of TRAIN |
| TRAIN_BALANCED | binary | — | is part (=1) or is not part (=0) of TRAIN_BALANCED |
| TEST | binary | — | is part (=1) or is not part (=0) of TEST |
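The integer codes in the metadata table can be turned into readable labels, and the split flags can be used to select sessions. The sketch below assumes the metadata is a CSV file with a header row whose column names match the table; the two sample rows, the class label strings, and the helpers `decode` and `in_split` are made up for illustration.

```python
import csv
import io

# Code books copied from the metadata table.
CODES = {
    "SEX": {0: "female", 1: "male"},
    "AFFECTED_SIDE": {0: "left", 1: "right", 2: "both"},
    "SHOD_CONDITION": {0: "barefoot & socks", 1: "normal shoe", 2: "orthopedic shoe"},
    "ORTHOPEDIC_INSOLE": {0: "without insole", 1: "with insole"},
    "SPEED": {1: "slow", 2: "self-selected", 3: "fast"},
    "SESSION_TYPE": {
        1: "initial measurement",
        2: "control measurement",
        3: "initial measurement after readmission",
    },
}


def decode(row):
    """Return a copy of a metadata row with integer codes replaced by labels."""
    out = dict(row)
    for column, codes in CODES.items():
        if column not in row:
            continue
        value = row[column]
        # Assumption: missing values are stored as empty cells.
        out[column] = codes[int(float(value))] if value != "" else None
    return out


def in_split(rows, split):
    """Keep the rows flagged as part of a split (TRAIN, TRAIN_BALANCED, or TEST)."""
    return [r for r in rows if r[split] == "1"]


# Hypothetical metadata excerpt with invented values.
sample = (
    "SUBJECT_ID,SESSION_ID,CLASS_LABEL,SEX,AFFECTED_SIDE,SPEED,TRAIN,TEST\n"
    "1,10,K,1,0,2,1,0\n"
    "2,20,HC,0,,2,0,1\n"
)
rows = list(csv.DictReader(io.StringIO(sample)))
train = [decode(r) for r in in_split(rows, "TRAIN")]

print(train[0]["SEX"], train[0]["AFFECTED_SIDE"], train[0]["SPEED"])
# male left self-selected
```

Keeping the code books in one dictionary makes it easy to cross-check them against the table if the coding ever changes.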
Fig. 2 Dataset composition. Configuration of the balanced and unbalanced train/test splits of the GaitRec dataset. The pie charts show the number of trials (absolute and as a percentage) in each class and split.
Fig. 3 Data overview. Visualization of all body-weight-normalized vertical, anterior-posterior, and medio-lateral GRF signals of the affected side available per subject and class. For healthy controls, all available recordings are visualized. The plots also show the mean (solid line) and one standard deviation around it (dotted line). Note that, for easier usage, the orientation of the medio-lateral and anterior-posterior signals was standardized so that medial and anterior forces are always represented as positive values.
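The mean curve and the one-standard-deviation band shown in such plots can be computed per time point across time-normalized trials. The sketch below uses the sample standard deviation from Python's `statistics` module; whether the figure uses the sample or the population estimate is not stated, so that choice is an assumption. The three short trials are invented stand-ins for 101-point signals, and `mean_sd_band` is a hypothetical helper.

```python
from statistics import mean, stdev


def mean_sd_band(trials):
    """Return per-sample mean, mean - SD, and mean + SD across equally long trials."""
    if len({len(t) for t in trials}) != 1:
        raise ValueError("all trials must have the same number of samples")
    columns = list(zip(*trials))       # one tuple per time point
    avg = [mean(c) for c in columns]
    sd = [stdev(c) for c in columns]   # sample SD (n - 1 denominator)
    lower = [m - s for m, s in zip(avg, sd)]
    upper = [m + s for m, s in zip(avg, sd)]
    return avg, lower, upper


# Invented vertical GRF values in multiples of body weight.
trials = [
    [0.0, 1.0, 0.0],
    [0.0, 1.2, 0.0],
    [0.0, 0.8, 0.0],
]
avg, lower, upper = mean_sd_band(trials)
```

For the post-processed files, each trial would be the 101-point signal of one row, so the returned lists would also have 101 entries.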
| Measurement(s) | gait measurement • phenotypic annotation |
| Technology Type(s) | force sensor • visual observation method |
| Factor Type(s) | experimental condition • musculoskeletal impairment • age • sex • shod condition • walking speed |
| Sample Characteristic - Organism | Homo sapiens |
| Sample Characteristic - Environment | laboratory environment |