
A database of human gait performance on irregular and uneven surfaces collected by wearable sensors.

Yue Luo, Sarah M Coppola, Philippe C Dixon, Song Li, Jack T Dennerlein, Boyi Hu.

Abstract

Gait analysis has traditionally relied on laborious and lab-based methods. Data from wearable sensors, such as Inertial Measurement Units (IMU), can be analyzed with machine learning to perform gait analysis in real-world environments. This database provides data from thirty participants (fifteen males and fifteen females, 23.5 ± 4.2 years, 169.3 ± 21.5 cm, 70.9 ± 13.9 kg) who wore six IMUs while walking on nine outdoor surfaces with self-selected speed (16.4 ± 4.2 seconds per trial). This is the first publicly available database focused on capturing gait patterns of typical real-world environments, such as grade (up-, down-, and cross-slopes), regularity (paved, uneven stone, grass), and stair negotiation (up and down). As such, the database contains data with only subtle differences between conditions, allowing for the development of robust analysis techniques capable of detecting small, but significant changes in gait mechanics. With analysis code provided, we anticipate that this database will provide a foundation for research that explores machine learning applications for mobile sensing and real-time recognition of subtle gait adaptations.


Year:  2020        PMID: 32641740      PMCID: PMC7343872          DOI: 10.1038/s41597-020-0563-y

Source DB:  PubMed          Journal:  Sci Data        ISSN: 2052-4463            Impact factor:   6.444


Background & Summary

Gait analysis is the science of functional assessment of human locomotion, and it has been applied in multiple areas such as medicine, sport, and ergonomics with promising results[1-3]. One specific successful application of gait analysis is to assess fall risk exposure and prevent falling injuries[4]. Fall risk is associated with multiple factors including human characteristics, health conditions, and the physical environment[5]. In particular, irregular walking surfaces in the outdoor built and natural environment expose people to potential fall injuries[6]. Unfortunately, traditional gait analysis requires expensive engineering technologies that are time and labor intensive, especially when the analysis involves heuristic hand-crafted feature extraction[7-9]. To overcome this limitation, machine learning methods are increasingly being integrated into gait and posture related investigations[10-12]. This data descriptor aims to contribute to machine learning research of gait performance when walking in different outdoor environments, which has surprisingly been limited in previous literature. Previous work has shown that gait adaptations utilized when walking on irregular surfaces may reflect reduced stability and increased fall risk[13-15]. However, one limitation of such previous studies is that they were conducted in simulated laboratory environments and thus lack real world validity. With the recent development of wearable motion tracking technologies such as Inertial Measurement Units (IMU), we now have the capability to extend gait analysis into outdoor settings to maximize ecological validity. In order to develop accurate, robust and generalizable machine learning algorithms to recognize subtle gait alterations, it is necessary to have sufficient amounts of properly annotated data. Unfortunately, very limited gait related data sets are publicly accessible. 
Among these, most were generated primarily for human activity recognition, so the activity tasks they include cover a very broad spectrum[16-24]. For example, gait is usually one category accompanied by other activities that differ substantially from it (sitting, lying down, climbing stairs, running, etc.). Subtle gait alterations due to internal/external factors have never been considered or properly annotated in existing public data sets. A second category of data sets focuses on using human gait performance as a biometric characteristic for human identification[12,25-30]. The creators of those data sets therefore usually considered only between-subject differences and collected only short gait trials from each participant, which is not sufficient to train advanced machine learning models. Furthermore, the environmental conditions in which these data were collected are not always reported in sufficient detail. In order to advance machine learning for the recognition of human gait changes caused by walking surface characteristics, there is an urgent need to create large data sets that cover an exhaustive set of walking surfaces representative of the real environment outside the laboratory, preferably with wearable and non-intrusive sensors. Therefore, in this descriptor, we present a publicly accessible data set collected with wearable motion sensors where participants walked on different real-world outdoor surfaces. We anticipate that this data set will provide a foundation for subsequent research that explores the application of machine learning to mobile sensing and real-time recognition of subtle gait adaptations.

Methods

Participants

Thirty young participants with no reported neurological or musculoskeletal conditions that affected their gait or posture and no history of falling injuries in the previous two years volunteered for this study. The sample approximates the population of a typical urban US campus. Their anthropometry information is provided in Table 1. The Harvard and Northeastern Institutional Review Boards approved this study and all participants provided written consent.
Table 1

Anthropometry information of participants.

Participant  Age         Sex       Height (cm)   Body mass (kg)
1            28          F         154.5         49.1
2            24          F         158.6         54.1
3            22          F         167           53.6
4            22          F         166           56
5            23          F         168.2         61.4
6            33          M         175           99
7            27          M         184           75.3
8            18          M         187           82.3
9            22          F         162.1         53.6
10           19          F         162           61.7
11           28          M         180           70.4
12           18          M         177.9         81
13           22          F         174.2         58.6
14           19          F         66.5          67.3
15           19          M         181           72.4
16           31          M         176           101.2
17           19          F         173           73.9
18           30          M         165.4         82.9
19           32          F         165           53
20           22          F         167.1         74.8
21           19          M         169           73.9
22           22          M         178.5         80.2
23           24          F         179.6         61.6
24           26          F         174.6         62.5
25           22          F         157           61.6
26           22          M         175.6         66
27           22          M         192.7         85.9
28           22          M         180           91.1
29           26          M         172           78.1
30           22          M         188           84.4
Summary      23.5 (4.2)  15M, 15F  169.3 (21.5)  70.9 (13.9)

Data collection

Participants performed several walking trials over nine different surfaces while wearing six IMU sensors (MTw Awinda, Xsens, Enschede, Netherlands). The sensors were secured to the body using the bands provided by the manufacturer and positioned as follows: 1) centered on the dorsal forearm at the wrist; 2 & 3) centered on the anterior aspect of both thighs; 4 & 5) centered 5 cm above the bony processes of both ankles; and 6) on the posterior trunk at the level of the L5/S1 joint (Fig. 1).
Fig. 1

Sensor placement setup.

Researchers palpated each participant’s bones to place the sensors. Participants were instructed to face southwest and perform a sensor calibration procedure three times prior to the experimental trial collection. The calibration procedure was: 1) line up directly centered with the experiment computer; 2) flex the trunk forward about 30 degrees three times; 3) raise the right arm three times; 4) raise the right leg three times; 5) raise the left leg three times. A researcher performed these movements with the participant. The calibration data are also included in this data set. The nine walking surfaces were: 1) flat even (horizontal, 0 grade, paved); 2) up stairs (cement); 3) down stairs (cement); 4) sloped up (cement); 5) sloped down (cement); 6) grass; 7) banked left (paved); 8) banked right (paved); and 9) uneven stone brick (Fig. 2).
Fig. 2

Measurement sites for walking trials.

Participants were instructed to walk at their normal pace and to let their arms swing naturally. Participants stood still at the starting position and waited for a verbal cue from a researcher to start each walking trial. Each walking trial lasted 16.4 ± 4.2 seconds. Within each trial, participants walked straight, without changes of direction. For the flat even, grass, and uneven stone brick surfaces only, the walking direction was reversed on every other trial (i.e. walking forward for the first trial and walking back for the next trial). Surfaces were presented in a randomized order and adequate rest was provided between trials to prevent fatigue. Participants walked six times on each surface, and a researcher walked next to them with the experimental data capture machine to ensure a strong signal connection. The data collection conditions for each participant, including temperature, wind, weather (‘N/A’ indicates that weather was not recorded), and time of day, are summarized in Table 2.
Table 2

Data collection conditions.

Participant  Temperature (°C)  Wind (m/s)  Weather        Time of day
1            −1.1              11.2        N/A            Morning (9:30 am)
2            4.4               8.9         Sunny          Afternoon (2:30 pm)
3            4.4               8.0         Cloudy         Noon
4            0                 7.6         Sunny          Morning (9:30 am)
5            5.0               5.8         Sunny          Afternoon (2 pm)
6            6.7               3.1         Sunny          Afternoon (6 pm)
7            2.8               2.2         Cloudy         Morning (8 am)
8            2.2               3.1         N/A            Morning (11 am)
9            11.7              6.3         Partly cloudy  Afternoon (2:40 pm)
10           16.7              4.0         Partly cloudy  Morning (10 am)
11           6.1               8.5         Sunny          Morning (9 am)
12           7.2               8.0         Partly cloudy  Morning (10:30 am)
13           9.4               8.0         Cloudy         Afternoon (3:30 pm)
14           7.8               7.6         Cloudy         Afternoon (noon)
15           10.6              7.6         N/A            Afternoon (4 pm)
16           10.0              6.7         N/A            Afternoon (6 pm)
17           8.9               8.0         N/A            Afternoon (1 pm)
18           8.3               5.8         Sunny          Morning (10 am)
19           10.0              5.8         Sunny          Morning (11 am)
20           12.2              4.5         Sunny          Morning (11:30 am)
21           12.8              5.4         Sunny          Afternoon (1 pm)
22           14.4              4.5         Cloudy         Morning (9:30 am)
23           15.0              3.1         Cloudy         Morning (11:30 am)
24           20.0              4.5         Sunny          Afternoon (2 pm)
25           22.8              4.9         Sunny          Afternoon (5:30 pm)
26           20.0              6.7         Partly cloudy  Morning (10:30 am)
27           9.4               0.4         Cloudy         Morning (9:30 am)
28           17.8              3.1         Cloudy         Afternoon (4 pm)
29           10.6              4.0         Partly cloudy  Afternoon (5 pm)
30           15.6              2.2         Cloudy         Morning (9:40 am)

Data processing

Wearable data were collected using the MTw Awinda software (Xsens, Enschede, Netherlands). The sampling frequency was set at 100 Hz. The raw sensor outputs were synchronized by the software and then exported to a standard txt file format. Subsequently, all the data files were imported and processed in MATLAB (R2019a, The MathWorks, Natick, USA). Trajectories were smoothed using a 2nd order Butterworth low-pass filter with a 6 Hz cut-off frequency. Figure 3 gives an example of the filtered signal pattern of the trunk sensor while walking on different surfaces.
Fig. 3

Signal pattern of trunk sensor on different walking surfaces: resultant acceleration amplitude (m/s2, blue solid lines) and resultant angular velocity amplitude (rad/s, red dotted lines) from subject #1.
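
The smoothing step described above can be sketched in plain Python. This is only an illustration, not the authors' MATLAB code: it builds a 2nd order Butterworth low-pass filter through the standard bilinear-transform biquad formulas, using the 6 Hz cut-off and 100 Hz sampling frequency stated in the text. The function names are hypothetical, and the text does not say whether the original filtering was single-pass or zero-phase (forward-backward), so the forward-backward pass is offered here only as an option.

```python
import math


def butter2_lowpass_coeffs(cutoff_hz, fs_hz):
    """Coefficients (b, a) of a 2nd order digital Butterworth low-pass filter
    obtained with the bilinear transform (with frequency pre-warping)."""
    k = math.tan(math.pi * cutoff_hz / fs_hz)
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b0 = k * k * norm
    b = (b0, 2.0 * b0, b0)
    a = (1.0,
         2.0 * (k * k - 1.0) * norm,
         (1.0 - math.sqrt(2.0) * k + k * k) * norm)
    return b, a


def apply_biquad(b, a, x):
    """Run one forward pass of the filter (direct form I).
    The state is seeded with x[0] to limit the start-up transient."""
    if not x:
        return []
    x1 = x2 = y1 = y2 = x[0]
    out = []
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(yn)
        x2, x1 = x1, xn
        y2, y1 = y1, yn
    return out


def lowpass(x, cutoff_hz=6.0, fs_hz=100.0, zero_phase=False):
    """Low-pass filter a sequence of samples.
    zero_phase=True adds a backward pass (an assumption, not stated in the
    source); this removes phase lag but squares the magnitude response."""
    b, a = butter2_lowpass_coeffs(cutoff_hz, fs_hz)
    y = apply_biquad(b, a, list(x))
    if zero_phase:
        y = apply_biquad(b, a, y[::-1])[::-1]
    return y
```

In a NumPy/SciPy environment, `scipy.signal.butter` and `scipy.signal.filtfilt` would be the more usual way to obtain the same kind of filter.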

Data Records

Raw data

All raw data files exported from MTw are stored in .txt format and have been uploaded to figshare[31] to provide free public access. A total of 10,260 files (30 participants × 57 trials × 6 sensors) are available from the database. Files are grouped into folders labeled 1–30 by participant number. Each file is named systematically as ‘#-000_00B432**.txt’, where ‘#’ is the trial number, which identifies the walking surface condition (Table 3), and ‘**’ identifies the sensor location (Table 4). For example, in each participant's folder, the file ‘9-000_00B432CC.txt’ contains the trunk sensor (‘CC’) data for trial 9, which was performed on the flat even surface. Furthermore, each trial has a .mtb file (i.e. binary motion tracker file).
Table 3

Table for walking surface condition and sample duration (across all participants).

Trial number (#)   Walking surface condition  Sample duration (s), mean (standard deviation)
1–3                Calibration (CALIB)        19.29 (3.14)
4–9                Flat even (FE)             13.55 (2.19)
10–15              Cobble stone (CS)          16.12 (1.93)
16,18,20,22,24,26  Upstairs (StrU)            12.48 (1.17)
17,19,21,23,25,27  Downstairs (StrD)          11.84 (1.42)
28,30,32,34,36,38  Slope up (SlpU)            22.70 (1.89)
29,31,33,35,37,39  Slope down (SlpD)          22.77 (2.22)
40,42,44,46,48,50  Bank left (BnkL)           16.06 (1.90)
41,43,45,47,49,51  Bank right (BnkR)          16.29 (1.67)
52–57              Grass (GR)                 14.48 (1.52)
Table 4

Table for sensor locations of each trial based on last 2 digits of filenames.

Orange sensor number (**)  Sensor location
CC.txt                     Trunk
95.txt                     Wrist
93.txt                     Right thigh
8B.txt                     Left thigh
9B.txt                     Right shank
B6.txt                     Left shank
Sensors’ outputs (e.g. 3D acceleration, 3D gyroscope data) as well as the recording information (e.g. start time, update rate, filter profile, and firmware version) are stored in each file with labels. The average duration for each surface condition (across all participants) is summarized in Table 3. A comprehensive description of the data structure and variable labels is given in Table 5.
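
As an illustration of the naming scheme, the following Python sketch decodes a raw file name into its trial number, surface label, and sensor location. The lookup tables are transcribed from Tables 3 and 4; the function and variable names are hypothetical, and the regular expression assumes that the middle part of the name (‘-000_00B432’) is fixed, as in the pattern given in the text.

```python
import re


def _build_trial_map():
    """Trial number -> surface label, transcribed from Table 3."""
    blocks = [
        (range(1, 4), "CALIB"),
        (range(4, 10), "FE"),
        (range(10, 16), "CS"),
        (range(16, 28, 2), "StrU"),  # 16, 18, ..., 26
        (range(17, 28, 2), "StrD"),  # 17, 19, ..., 27
        (range(28, 40, 2), "SlpU"),  # 28, 30, ..., 38
        (range(29, 40, 2), "SlpD"),  # 29, 31, ..., 39
        (range(40, 52, 2), "BnkL"),  # 40, 42, ..., 50
        (range(41, 52, 2), "BnkR"),  # 41, 43, ..., 51
        (range(52, 58), "GR"),
    ]
    mapping = {}
    for trials, label in blocks:
        for trial in trials:
            mapping[trial] = label
    return mapping


SURFACE_BY_TRIAL = _build_trial_map()

# Last two characters of the file name -> sensor location (Table 4).
SENSOR_BY_SUFFIX = {
    "CC": "Trunk",
    "95": "Wrist",
    "93": "Right thigh",
    "8B": "Left thigh",
    "9B": "Right shank",
    "B6": "Left shank",
}

# Assumed fixed pattern: '<trial>-000_00B432<suffix>.txt'.
FILENAME_RE = re.compile(r"^(\d+)-000_00B432([0-9A-F]{2})\.txt$")


def parse_filename(name):
    """Return (trial number, surface label, sensor location) for a file name."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError("unexpected file name: %r" % name)
    trial = int(match.group(1))
    suffix = match.group(2)
    if trial not in SURFACE_BY_TRIAL or suffix not in SENSOR_BY_SUFFIX:
        raise ValueError("unknown trial or sensor in: %r" % name)
    return trial, SURFACE_BY_TRIAL[trial], SENSOR_BY_SUFFIX[suffix]
```

For the example file in the text, `parse_filename("9-000_00B432CC.txt")` yields trial 9, surface `FE`, and sensor location `Trunk`.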
Table 5

Data stored in .txt files (all variables are with dimension n x 1).

Labels          Unit   Description
PacketCounter   N/A    Packet counter, value will be same if data frames were recorded at the same time (increase 1 unit per data frame)
SampleTimeFine  N/A    Not recorded in this study
Acc_X           m/s²   Acceleration in the vertical direction (w/gravity)
Acc_Y           m/s²   Acceleration in the medio-lateral direction (w/gravity)
Acc_Z           m/s²   Acceleration in the anterior-posterior direction (w/gravity)
FreeAcc_X       m/s²   Acceleration in the vertical direction (w/o gravity)
FreeAcc_Y       m/s²   Acceleration in the medio-lateral direction (w/o gravity)
FreeAcc_Z       m/s²   Acceleration in the anterior-posterior direction (w/o gravity)
Gyr_X           rad/s  Rate of turn along the vertical direction
Gyr_Y           rad/s  Rate of turn along the medio-lateral direction
Gyr_Z           rad/s  Rate of turn along the anterior-posterior direction
Mag_X           a.u.   3D magnetic field in the vertical direction
Mag_Y           a.u.   3D magnetic field in the medio-lateral direction
Mag_Z           a.u.   3D magnetic field in the anterior-posterior direction
VelInc_X        m/s    Delta_velocity (dv) in the vertical direction
VelInc_Y        m/s    Delta_velocity (dv) in the medio-lateral direction
VelInc_Z        m/s    Delta_velocity (dv) in the anterior-posterior direction
OriInc_q0       N/A    Delta_quaternion (q0)
OriInc_q1       N/A    Delta_quaternion (q1)
OriInc_q2       N/A    Delta_quaternion (q2)
OriInc_q3       N/A    Delta_quaternion (q3)
Roll            deg    Euler angles in XYZ Earth fixed type (roll)
Pitch           deg    Euler angles in XYZ Earth fixed type (pitch)
Yaw             deg    Euler angles in XYZ Earth fixed type (yaw)
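
A minimal Python sketch of reading one raw file and computing the resultant acceleration amplitude (the quantity plotted in Fig. 3) might look as follows. The exact layout of the exported files is not described in the text, so two assumptions are made and should be checked against a real file: that the recording information lines start with `//`, and that the data columns are tab-delimited under a single header row carrying the labels of Table 5. The function names are hypothetical.

```python
import csv
import math


def read_sensor_txt(stream):
    """Return {column label: list of floats} from an open text stream.
    ASSUMPTIONS (not confirmed by the source): recording-information lines
    start with '//' and data columns are tab-delimited."""
    data_lines = [
        line for line in stream
        if line.strip() and not line.startswith("//")
    ]
    reader = csv.reader(data_lines, delimiter="\t")
    header = [label.strip() for label in next(reader)]
    columns = {label: [] for label in header}
    for row in reader:
        for label, cell in zip(header, row):
            # Empty cells (e.g. an unrecorded variable) become NaN.
            columns[label].append(float(cell) if cell.strip() else math.nan)
    return columns


def resultant(x, y, z):
    """Sample-wise Euclidean norm of three component series."""
    return [math.sqrt(a * a + b * b + c * c) for a, b, c in zip(x, y, z)]
```

With a file opened as `stream`, `cols = read_sensor_txt(stream)` followed by `resultant(cols["Acc_X"], cols["Acc_Y"], cols["Acc_Z"])` gives the resultant acceleration in m/s²; the same call on the `Gyr_*` columns gives the resultant angular velocity in rad/s.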

Processed data

A processed data file is also provided in .mat format (the MATLAB data file format) in the repository. Raw sensor data from the 30 participants were aggregated into a single file, with participant as the first layer and sensor as the second layer. The MATLAB script proceeds as follows: 1. import the raw txt files; 2. apply the Butterworth low-pass filter (2nd order, cutoff frequency: 6 Hz, sampling frequency: 100 Hz); 3. count the missing frames; 4. export the processed data into a .mat file.

Technical Validation

Sensor placement

Participants were required to wear tight clothes during the experiment to prevent sensor movement. As described in the procedures (see Data Collection), the wearable sensor placement followed the instructions available in the manufacturer’s documentation. In addition, before each experiment, the signal quality of each IMU sensor was manually verified through the system’s acquisition software. IMU sensors were positioned by the same researchers (Authors BH and SC) for consistency.

Missing data

The trial-wise missing-data rate is recorded in the database for each participant (under the second layer of the .mat file). Due to transmission errors between the data collection computer and the IMU sensors, some data frames/packages were dropped. However, we have confirmed that missing data is not a major issue for this data set: only a small fraction of data packages were dropped (0.23% ± 0.69%). The missing-data rate is summarized by sensor location in Table 6 and by walking surface in Table 7.
Table 6

Table for data missing rate by sensor locations.

Sensor location  Missing rate, mean (standard deviation)
Trunk            0
Wrist            0.13% (0.13%)
Right thigh      0.19% (0.18%)
Left thigh       0.93% (4.08%)
Right shank      0.08% (0.08%)
Left shank       0.06% (0.05%)
Table 7

Table for data missing rate by walking surfaces.

Walking surface      Missing rate, mean (standard deviation)
Calibration (CALIB)  0
Flat even (FE)       0.17% (0.26%)
Cobble stone (CS)    0.36% (1.67%)
Upstairs (StrU)      0.59% (3.04%)
Downstairs (StrD)    0.66% (3.03%)
Slope up (SlpU)      0.02% (0.05%)
Slope down (SlpD)    0.10% (0.20%)
Bank left (BnkL)     0.16% (0.23%)
Bank right (BnkR)    0.30% (0.42%)
Grass (GR)           0.12% (0.17%)
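
One way to estimate a missing rate from the PacketCounter column of a single raw file is sketched below in Python. This is not the authors' MATLAB routine, and their exact definition of the rate is not given in the text. The sketch counts gaps between consecutive counter values and divides by the number of frames that would have been present without drops. The wrap-around value of 65536 (a 16-bit counter) is an assumption, and drops before the first or after the last recorded frame cannot be detected this way.

```python
def missing_rate(packet_counter, wrap=65536):
    """Fraction of frames missing between the first and last recorded frame.
    `wrap` is the assumed counter modulus (hypothetical default)."""
    counts = [int(c) for c in packet_counter]
    if len(counts) < 2:
        return 0.0
    missing = 0
    for prev, cur in zip(counts, counts[1:]):
        step = (cur - prev) % wrap   # 1 when no frame was dropped
        missing += max(step - 1, 0)  # frames skipped inside this gap
    expected = len(counts) + missing
    return missing / expected
```

For instance, the counter sequence 1, 2, 5, 6 has two frames missing out of six expected, giving a rate of one third.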

Comparison with published data sets

The age of the participants differs from that in previously published data sets, in which ages ranged from 2 to 78 years[18-20,22,24-27,29], whereas this data set includes only young adults. The number of participants in previous data sets also varied widely, from 8 to 744. The number of subjects is an important consideration when selecting a database, given the need for large amounts of data during machine learning model training. Nevertheless, focusing on subject count alone obscures the merit of data sets that have relatively few participants but longer recordings. For example, although Ravi et al.[23] recruited only 10 participants in their study, a total of 30 hours of data were collected using different models of smartphones with an unconstrained phone placement setting. That data set can be treated as a suitable resource for models designed for real-world applications in which the models and placement of smartphones are unspecified. Our data set includes 30 participants, each with a relatively large amount of data collected, and is therefore well aligned with previous similar data sets. When using these data sets for gait-related machine learning model development, users should be aware that the relatively homogeneous samples might restrict generalizability to more heterogeneous data in terms of age distribution. The annotation of the ground truth for recorded activities is also important for publicly accessible data sets because it is needed to validate the predicted outcome. Most of the previous similar data sets have documented the types of activities participants performed. Among them, many include walking records on different surfaces (walking on concrete/grass field, walking upstairs/downstairs, etc.)[16,18-22,24,26,27]. Compared with them, the current data set provides a larger number of irregular walking surfaces. Machine learning algorithm developers could benefit from the diversified walking records contained in the present data set.
Although some parameters about testing sites (e.g. the grade of the slope and the stair dimensions) were not systematically surveyed during the data collection phase, we believe they represent common public architecture features. To further improve the usability of the data, more details about measurement sites will be provided in the GitHub and publicly accessible data description in the future.

Usage Notes

Previous literature has shown that IMUs are a valid tool for measuring subtle changes in gait kinematics and that their performance is as sensitive as the current standard in kinematic tracking (i.e. optical motion capture)[32]. To support a range of users in accessing the data set, processed data are provided in .mat format in the data repository in addition to the raw data. The .mat data file is readable in both Python and MATLAB environments. Existing Python and MATLAB open-source tools focused on gait and human motion kinematics could be used to analyze this data set. GaitPy provides Python functions to read accelerometry data and estimate the clinical characteristics of gait (https://pypi.org/project/gaitpy/). It could be a complementary tool when utilizing this data set. For MATLAB, the Kinematics and Inverse Dynamics toolbox (https://www.mathworks.com/matlabcentral/fileexchange/58021-3d-kinematics-and-inverse-dynamics) can be used to investigate joint kinematics and dynamics. Moreover, biomechZoo, which helps users analyze, process, and visualize motion data from various sensors[33], could support researchers aiming to explore this data set.
Measurement(s): Gait
Technology Type(s): Sensor Device
Factor Type(s): surface • age • sex • height • body mass
Sample Characteristic - Organism: Homo sapiens