Joana Chong (1), Petra Tjurin (2), Maisa Niemelä (3), Timo Jämsä (4), Vahid Farrahi (5). (1) Faculty of Sciences, University of Lisbon, Lisbon, Portugal; Research Unit of Medical Imaging, Physics and Technology, University of Oulu, Oulu, Finland. (2) Research Unit of Medical Imaging, Physics and Technology, University of Oulu, Oulu, Finland. (3) Research Unit of Medical Imaging, Physics and Technology, University of Oulu, Oulu, Finland; Medical Research Center, Oulu University Hospital and University of Oulu, Oulu, Finland. (4) Research Unit of Medical Imaging, Physics and Technology, University of Oulu, Oulu, Finland; Medical Research Center, Oulu University Hospital and University of Oulu, Oulu, Finland; Diagnostic Radiology, Oulu University Hospital, Oulu, Finland. (5) Research Unit of Medical Imaging, Physics and Technology, University of Oulu, Oulu, Finland. Electronic address: Vahid.farrahi@oulu.fi.
Abstract
PURPOSE: Machine-learning (ML) approaches have been repeatedly coupled with raw accelerometry to classify physical activity classes, but the features required to optimize their predictive performance are still unknown. Our aim was to identify appropriate combinations of feature subsets and prediction algorithms for activity class prediction from hip-based raw acceleration data. METHODS: The hip-based raw acceleration data collected from 27 participants were split into training (70%) and validation (30%) subsets. A total of 206 time-domain (TD) and frequency-domain (FD) features were extracted from 6-second non-overlapping windows of the signal. Feature selection was done using seven filter-based, two wrapper-based, and one embedded algorithm, and classification was performed with an artificial neural network (ANN), a support vector machine (SVM), and a random forest (RF). For every combination of feature selection method and classifier, the most appropriate feature subset was found and used for model training within the training set. These models were then validated with the left-out validation set. RESULTS: The appropriate number of features for the ANN, SVM, and RF ranged from 20 to 45. Overall, the accuracy of all three classifiers was higher when they were trained with feature subsets generated by filter-based methods than when they were trained with subsets generated by wrapper-based methods (range: 78.1%-88% vs. 66%-83.5%). TD features that reflect how signals vary around the mean, how they differ from one another, and how much and how often they change were selected more frequently by the feature selection methods. CONCLUSIONS: A subset of TD features from raw accelerometry could be sufficient for ML-based activity classification if properly selected from different axes.
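As a rough illustration of the windowing and TD feature extraction step described in the abstract, the following Python sketch splits a tri-axial signal into 6-second non-overlapping windows and computes a few features of the kinds the abstract highlights (variation around the mean, differences between axes, and how much and how often the signal changes). It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not list the 206 features or give the sampling rate, so the function names, the feature names, the choice of features, and the 100 Hz rate in the demo are all hypothetical.

```python
import math
from statistics import mean, pstdev


def split_windows(samples, fs_hz, window_s=6):
    """Split a list of (x, y, z) samples into non-overlapping windows."""
    size = int(fs_hz * window_s)
    # A trailing partial window is dropped so all windows have equal length.
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def mean_crossings(values):
    """Count how often the signal crosses its own mean (how often it changes)."""
    m = mean(values)
    above = [v > m for v in values]
    return sum(1 for a, b in zip(above, above[1:]) if a != b)


def td_features(window):
    """Compute a few illustrative time-domain features for one window.

    The feature set here is an assumption chosen for illustration only.
    """
    axes = {
        "x": [s[0] for s in window],
        "y": [s[1] for s in window],
        "z": [s[2] for s in window],
    }
    feats = {}
    for name, vals in axes.items():
        # Variation around the mean.
        feats[f"std_{name}"] = pstdev(vals)
        # How much the signal changes from sample to sample.
        feats[f"mad_{name}"] = mean(abs(b - a) for a, b in zip(vals, vals[1:]))
        # How often the signal changes direction relative to its mean.
        feats[f"mcr_{name}"] = mean_crossings(vals)
    # How the axes differ from one another (difference of axis means).
    feats["mean_diff_xy"] = mean(axes["x"]) - mean(axes["y"])
    feats["mean_diff_xz"] = mean(axes["x"]) - mean(axes["z"])
    feats["mean_diff_yz"] = mean(axes["y"]) - mean(axes["z"])
    return feats


# Demo on synthetic data: 20 s of a 2 Hz sine on x, constant y and z,
# sampled at an assumed 100 Hz.
fs = 100
signal = [(math.sin(2 * math.pi * 2 * t / fs), 1.0, 0.0) for t in range(20 * fs)]
feature_rows = [td_features(w) for w in split_windows(signal, fs)]
```

Each entry of `feature_rows` is one feature vector per window; in a pipeline like the one the abstract describes, such vectors would then go through feature selection and be passed to a classifier.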
Authors: Petra Tjurin; Maisa Niemelä; Maarit Kangas; Laura Nauha; Henri Vähä-Ypyä; Harri Sievänen; Raija Korpelainen; Vahid Farrahi; Timo Jämsä Journal: Med Sci Sports Exerc Date: 2022-03-22