Angkoon Phinyomark, Rami N Khushaba, Esther Ibáñez-Marcelo, Alice Patania, Erik Scheme, Giovanni Petri.
Abstract
The success of biological signal pattern recognition depends crucially on the selection of relevant features. Across signal and imaging modalities, a large number of features have been proposed, leading to feature redundancy and the need for optimal feature set identification. A further complication is that, owing to inherent biological variability, even the same classification problem on different datasets can yield different optimal sets, casting doubt on the generalizability of relevant features. Here, we approach this problem by leveraging topological tools to create charts of feature spaces. These charts highlight feature sub-groups that encode similar information (and their respective similarities), allowing for a principled and interpretable choice of features for classification and analysis. Using multiple electromyographic (EMG) datasets as a case study, we use this feature chart to identify functional groups among 58 state-of-the-art EMG features, and to show that they generalize across three different forearm EMG datasets obtained from able-bodied subjects during hand and finger contractions. We find that these groups describe meaningful non-redundant information, succinctly recapitulating information about different regions of feature space. We then recommend representative features from each group based on maximum class separability, robustness and minimum complexity.
Keywords: EMG; electromyogram; feature extraction; feature selection; myoelectric control; topological data analysis; topological simplification
Year: 2017 PMID: 29212759 PMCID: PMC5746577 DOI: 10.1098/rsif.2017.0734
Source DB: PubMed Journal: J R Soc Interface ISSN: 1742-5662 Impact factor: 4.118
A list of EMG feature extraction techniques.
| full names | abbreviations | parameters | dimensions |
|---|---|---|---|
| amplitude of the first burst | AFB | | 1 |
| approximate entropy, sample entropy | ApEn, SampEn | | 1, 1 |
| autoregressive model and its differencing version | AR, DAR | order = 4 | 4, 4 |
| box counting dimension | BC | | 1 |
| cepstrum/cepstral coefficients and its differencing version | CC, DCC | order = 4 | 4, 4 |
| critical exponent analysis | CEA | | 1 |
| difference absolute mean value | DAMV | — | 1 |
| difference absolute standard deviation value | DASDV | — | 1 |
| detrended fluctuation analysis | DFA | | 1 |
| maximum-to-minimum drop in power density ratio | DPR | — | 1 |
| frequency ratio | FR | | 1 |
| Higuchi's fractal dimension | HG | | 1 |
| histogram | HIST | segment = 3 | 3 |
| integrated EMG | IEMG | — | 1 |
| Katz's fractal dimension | KATZ | — | 1 |
| kurtosis, skewness | KURT, SKEW | — | 1, 1 |
| log detector and its differencing version | LD, DLD | — | 1, 1 |
| second-order moment | M2 | — | 1 |
| mean absolute value | MAV | — | 1 |
| modified mean absolute value (type 1, type 2) | MAV1, MAV2 | — | 1, 1 |
| mean absolute value slope | MAVS | segment = 2 | 1 |
| maximum amplitude | MAX | cut-off = 5 Hz, order = 6 | 1 |
| median frequency, mean frequency | MDF, MNF | — | 1, 1 |
| maximum fractal length | MFL | — | 1 |
| multiple hamming/trapezoidal windows | MHW, MTW | — | 3, 3 |
| mean power, total power | MNP, TTP | — | 1, 1 |
| myopulse percentage rate | MYOP | threshold = 20/0.02/5 × 10⁻⁵ | 1 |
| power spectrum deformation | OHM | — | 1 |
| peak frequency | PKF | — | 1 |
| power spectral density fractal dimension | PSDFD | — | 1 |
| power spectrum ratio | PSR | | 1 |
| root mean square | RMS | — | 1 |
| spectral moment | SM | order = 2 | 1 |
| signal-to-motion artefact ratio, signal-to-noise ratio | SMR, SNR | — | 1, 1 |
| slope sign change | SSC | threshold = 16/10⁻⁴/10⁻¹⁰ | 1 |
| simple square integral | SSI | — | 1 |
| time-dependent power spectrum descriptors | TDPSD | — | 6 |
| absolute temporal moment and its differencing version | TM, DTM | order = 3 | 1, 1 |
| variance and its differencing version | VAR, DVARV | — | 1, 1 |
| variance of central frequency | VCF | — | 1 |
| variance fractal dimension | VFD | — | 1 |
| | V, DV | order = 3 | 1, 1 |
| Willison amplitude | WAMP | threshold = 20/0.02/5 × 10⁻⁵ | 1 |
| waveform length | WL | — | 1 |
| zero crossing | ZC | threshold = 10/0.01/10⁻⁵ | 1 |
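
Several of the time-domain entries in the table above have short closed-form definitions. The sketch below is illustrative only: it uses the definitions commonly given in the EMG literature (an assumption here, not quoted from this paper), and the function names and the example window are hypothetical. Thresholds are passed in explicitly because the table lists a different value for each dataset.

```python
import math


def mav(x):
    """Mean absolute value (MAV) of one analysis window."""
    return sum(abs(v) for v in x) / len(x)


def rms(x):
    """Root mean square (RMS) of one analysis window."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def wl(x):
    """Waveform length (WL): summed absolute first difference."""
    return sum(abs(b - a) for a, b in zip(x, x[1:]))


def damv(x):
    """Difference absolute mean value (DAMV): WL averaged over the differences."""
    return wl(x) / (len(x) - 1)


def zc(x, threshold=0.0):
    """Zero crossing (ZC): sign changes whose amplitude step reaches the threshold."""
    return sum(
        1 for a, b in zip(x, x[1:])
        if a * b < 0 and abs(a - b) >= threshold
    )


def wamp(x, threshold):
    """Willison amplitude (WAMP): successive differences exceeding the threshold."""
    return sum(1 for a, b in zip(x, x[1:]) if abs(a - b) > threshold)


# Hypothetical five-sample window, for illustration only.
window = [1.0, -2.0, 3.0, -1.0, 0.5]
features = {
    "MAV": mav(window),
    "RMS": rms(window),
    "WL": wl(window),
    "DAMV": damv(window),
    "ZC": zc(window, threshold=2.0),
    "WAMP": wamp(window, threshold=2.0),
}
```

Under these definitions DAMV is WL divided by the number of successive differences, so the two differ only by a constant factor for a fixed window length. That would be consistent with WL and DAMV receiving identical DBI and FLDI scores in the classification table.
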
Figure 1. The resulting topological network computed using three intervals with a 50% overlap from the 28 PC scores extracted from the first dataset. k-NN distance was used as a filter function. The colours encode the filter values, with blue indicative of low distance, and green of high. The number of features in each node is indicated by the size of the node and the number inside it.
Figure 2. Sub-groups of EMG features (a) derived from the topological network of the first EMG dataset and their corresponding filter values (b) sorted from smallest to largest. k-NN distance was used as a filter function. The colours encode the filter values, with blue indicative of low distance, and green of high.
Figure 3. The resulting topological networks computed using eight intervals with a 50% overlap from (a) the reduced lower-dimensional space (28 PC scores) and (b) the original high-dimensional space (38 400 feature values) for the first dataset. k-NN distance was used as a filter function. The colours encode the filter values, with blue indicative of low distance, and green of high.
Figure 4. The resulting topological networks computed using three intervals with a 50% overlap from the reduced lower-dimensional space for (a) the second dataset (29 PC scores) and (b) the third dataset (24 PC scores). k-NN distance was used as a filter function. The colours encode the filter values, with blue indicative of low distance, and green of high.
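
The figure captions describe networks built from a filter function (k-NN distance) and a cover of overlapping intervals (three or eight intervals, 50% overlap). A minimal sketch of a Mapper-style construction with those ingredients might look as follows. It is not the authors' implementation: the Euclidean metric, the single-linkage clustering step, the `eps` radius, the choice of `k` and all function names are assumptions introduced for illustration.

```python
import math


def euclid(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def knn_filter(points, k):
    """Filter value of each point: distance to its k-th nearest neighbour."""
    values = []
    for i, p in enumerate(points):
        d = sorted(euclid(p, q) for j, q in enumerate(points) if j != i)
        values.append(d[k - 1])
    return values


def cover(lo, hi, n_intervals, overlap):
    """Equal-length intervals spanning [lo, hi]; neighbours share `overlap` of their length."""
    length = (hi - lo) / (1 + (n_intervals - 1) * (1 - overlap))
    step = length * (1 - overlap)
    intervals = [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]
    intervals[-1] = (intervals[-1][0], hi)  # guard against rounding at the top end
    return intervals


def components(indices, points, eps):
    """Single-linkage clusters: connected components of the eps-neighbourhood graph."""
    remaining, clusters = set(indices), []
    while remaining:
        stack = [remaining.pop()]
        cluster = set(stack)
        while stack:
            i = stack.pop()
            near = {j for j in remaining if euclid(points[i], points[j]) <= eps}
            remaining -= near
            cluster |= near
            stack.extend(near)
        clusters.append(frozenset(cluster))
    return clusters


def mapper(points, k=1, n_intervals=3, overlap=0.5, eps=1.0):
    """Return (nodes, edges): nodes are clusters of point indices, edges join nodes sharing a point."""
    f = knn_filter(points, k)
    nodes = []
    for a, b in cover(min(f), max(f), n_intervals, overlap):
        members = [i for i, v in enumerate(f) if a <= v <= b]
        nodes.extend(components(members, points, eps))
    edges = {
        (i, j)
        for i in range(len(nodes))
        for j in range(i + 1, len(nodes))
        if nodes[i] & nodes[j]
    }
    return nodes, edges
```

In the setting of the captions, each point would be one feature described by its PC scores, so a node groups features that are close both in filter value and in PC-score space, and an edge marks two groups that share at least one feature.
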
The classification performance of 81 features using different evaluation methods: DBI, FLDI, SVM and LDA for the three EMG datasets.
| feature | DBI (data 1) | FLDI (data 1) | SVM (data 2) | LDA (data 2) | SVM (data 3) | LDA (data 3) | feature | DBI (data 1) | FLDI (data 1) | SVM (data 2) | LDA (data 2) | SVM (data 3) | LDA (data 3) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AFB | 3.42 | 9.92 | 42.7 | 50.6 | 54.3 | 43.1 | MAX | 3.08 | 8.07 | 31.3 | 40.0 | 42.5 | 30.6 |
| ApEn | 1.63 | 1.74 | 11.7 | 15.8 | 38.8 | 23.1 | MDF | 2.66 | 4.73 | 33.2 | 36.0 | 20.8 | 31.7 |
| SampEn | 1.79 | 1.82 | 14.7 | 17.2 | 39.9 | 26.9 | MNF | 2.37 | 4.21 | 27.0 | 31.0 | 18.0 | 26.5 |
| AR(1) | 2.34 | 4.75 | 28.4 | 32.2 | 15.4 | 22.7 | MFL | 1.69 | 1.86 | 10.6 | 14.2 | 2.7 | 12.9 |
| AR(2) | 2.55 | 6.01 | 34.6 | 37.7 | 15.6 | 22.3 | MHW(1) | 2.51 | 4.66 | 22.5 | 33.7 | 27.9 | 29.6 |
| AR(3) | 2.70 | 6.43 | 40.0 | 40.6 | 21.3 | 28.1 | MHW(2) | 2.50 | 4.74 | 23.6 | 36.5 | 13.1 | 31.5 |
| AR(4) | 3.31 | 8.77 | 48.1 | 48.5 | 33.5 | 32.3 | MHW(3) | 2.69 | 6.72 | 26.6 | 38.2 | 13.0 | 30.0 |
| DAR(1) | 2.34 | 4.44 | 28.2 | 31.9 | 13.0 | 22.1 | MTW(1) | 2.18 | 3.58 | 19.1 | 29.5 | 27.2 | 25.4 |
| DAR(2) | 2.75 | 7.18 | 39.9 | 40.8 | 19.5 | 26.9 | MTW(2) | 2.39 | 4.19 | 21.7 | 33.7 | 12.8 | 25.4 |
| DAR(3) | 2.46 | 4.13 | 33.7 | 35.6 | 16.1 | 25.2 | MTW(3) | 2.56 | 5.36 | 24.6 | 36.4 | 13.5 | 31.0 |
| DAR(4) | 2.94 | 7.20 | 45.2 | 45.3 | 22.3 | 35.2 | MNP | 2.18 | 3.43 | 18.8 | 29.4 | 21.9 | 25.2 |
| BC | 1.83 | 2.32 | 18.4 | 21.2 | 43.0 | 36.9 | TTP | 2.18 | 3.43 | 19.1 | 29.2 | 21.9 | 26.2 |
| CC(1) | 2.34 | 4.75 | 28.6 | 31.9 | 15.1 | 22.3 | MYOP | 1.94 | 2.87 | 15.8 | 20.5 | 8.2 | 13.1 |
| CC(2) | 2.53 | 5.66 | 33.1 | 36.1 | 14.8 | 23.1 | OHM | 3.00 | 8.35 | 39.8 | 44.3 | 22.6 | 27.3 |
| CC(3) | 2.90 | 8.33 | 37.7 | 42.0 | 17.1 | 19.8 | PKF | 5.65 | 21.31 | 66.2 | 66.5 | 46.2 | 62.3 |
| CC(4) | 3.22 | 11.96 | 44.6 | 47.4 | 32.4 | 21.0 | PSDFD | 3.48 | 9.67 | 48.2 | 49.5 | 29.3 | 62.5 |
| DCC(1) | 2.34 | 4.44 | 28.4 | 32.0 | 12.7 | 22.7 | PSR | 4.78 | 16.93 | 59.1 | 60.2 | 40.8 | 52.5 |
| DCC(2) | 2.64 | 6.02 | 37.2 | 38.3 | 15.5 | 22.9 | RMS | 2.03 | 2.60 | 15.6 | 21.5 | 16.6 | 19.2 |
| DCC(3) | 2.52 | 4.16 | 33.8 | 36.0 | 16.3 | 32.1 | SM | 1.93 | 2.43 | 16.0 | 26.5 | 9.9 | 26.9 |
| DCC(4) | 2.94 | 7.02 | 44.2 | 44.7 | 21.4 | 34.6 | SMR | 15.83 | 218.21 | 86.8 | 83.6 | 36.9 | 66.5 |
| CEA | 4.05 | 10.21 | 54.0 | 54.7 | 76.0 | 65.2 | SNR | 3.12 | 10.87 | 39.9 | 45.1 | 30.9 | 22.5 |
| DAMV | 1.87 | 1.89 | 11.9 | 18.7 | 4.1 | 20.6 | SSC | 1.82 | 2.46 | 17.9 | 21.7 | 3.6 | 30.2 |
| DASDV | 1.80 | 1.88 | 11.9 | 18.0 | 5.0 | 22.9 | SSI | 2.18 | 3.43 | 18.6 | 29.3 | 21.7 | 25.0 |
| DFA | 2.28 | 4.25 | 27.9 | 30.4 | 17.4 | 26.9 | TDPSD(1) | 3.61 | 3.27 | 39.9 | 39.0 | 15.3 | 27.1 |
| DPR | 12.13 | 184.14 | 81.7 | 83.0 | 79.2 | 60.6 | TDPSD(2) | 2.48 | 3.07 | 27.2 | 29.0 | 13.7 | 17.1 |
| FR | 2.72 | 4.56 | 36.0 | 39.4 | 17.1 | 36.2 | TDPSD(3) | 2.54 | 2.67 | 31.8 | 36.3 | 18.2 | 38.1 |
| HG | 4.29 | 14.19 | 48.3 | 52.4 | 33.2 | 39.6 | TDPSD(4) | 2.17 | 4.81 | 25.9 | 29.5 | 15.7 | 27.5 |
| HIST(1) | 6.32 | 35.20 | 40.5 | 50.5 | 39.1 | 51.2 | TDPSD(5) | 2.84 | 3.37 | 35.1 | 35.6 | 18.8 | 26.0 |
| HIST(2) | 4.42 | 19.28 | 32.0 | 39.0 | 31.9 | 48.3 | TDPSD(6) | 2.38 | 4.99 | 29.1 | 31.2 | 13.6 | 16.0 |
| HIST(3) | 6.28 | 40.45 | 40.7 | 50.4 | 40.2 | 48.5 | TM | 4.85 | 31.73 | 50.4 | 58.3 | 50.1 | 33.3 |
| IEMG | 2.03 | 2.53 | 14.7 | 21.4 | 10.8 | 18.5 | DTM | 3.64 | 11.32 | 38.0 | 48.6 | 30.4 | 40.4 |
| KATZ | 4.03 | 9.41 | 54.2 | 54.6 | 44.0 | 47.9 | VAR | 2.18 | 3.43 | 19.0 | 29.6 | 22.2 | 26.0 |
| KURT | 5.97 | 31.76 | 62.1 | 66.1 | 53.0 | 46.9 | DVARV | 1.95 | 2.48 | 16.2 | 26.5 | 9.8 | 26.5 |
| SKEW | 11.00 | 212.11 | 80.0 | 78.7 | 62.8 | 30.8 | VCF | 18.25 | 340.56 | 87.8 | 87.6 | 82.2 | 67.9 |
| LD | 2.47 | 3.08 | 19.6 | 24.8 | 8.2 | 20.0 | VFD | 13.08 | 156.55 | 83.4 | 83.0 | 71.5 | 65.8 |
| DLD | 2.26 | 2.59 | 19.2 | 23.9 | 3.7 | 20.2 | V | 2.00 | 2.85 | 17.4 | 22.8 | 20.7 | 19.8 |
| M2 | 1.95 | 2.48 | 16.1 | 26.8 | 9.9 | 26.2 | DV | 1.81 | 1.97 | 13.1 | 18.5 | 6.5 | 20.6 |
| MAV | 2.03 | 2.53 | 14.6 | 21.2 | 10.9 | 17.9 | WAMP | 1.77 | 2.12 | 12.4 | 17.3 | 4.3 | 15.2 |
| MAV1 | 2.05 | 2.61 | 14.9 | 21.7 | 9.6 | 18.3 | WL | 1.87 | 1.89 | 11.9 | 18.7 | 4.2 | 21.0 |
| MAV2 | 2.11 | 2.69 | 15.6 | 22.4 | 8.4 | 19.6 | ZC | 1.77 | 1.89 | 17.6 | 20.9 | 5.1 | 27.5 |
| MAVS | 3.37 | 9.95 | 39.1 | 42.1 | 34.9 | 45.6 | | | | | | | |
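
The table reports DBI as one of its class-separability scores. Assuming DBI denotes the Davies-Bouldin index in its standard form (the exact variant used in the paper is not stated in this record), a small sketch of the computation is given below. The function names and the toy data are hypothetical.

```python
import math


def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def davies_bouldin(samples, labels):
    """Standard Davies-Bouldin index: mean over classes of the worst
    (scatter_i + scatter_j) / centroid_distance_ij ratio.

    Assumes at least two classes with distinct centroids.
    """
    groups = {}
    for row, lab in zip(samples, labels):
        groups.setdefault(lab, []).append(row)
    cents = {lab: centroid(rows) for lab, rows in groups.items()}
    scatter = {
        lab: sum(dist(r, cents[lab]) for r in rows) / len(rows)
        for lab, rows in groups.items()
    }
    total = 0.0
    for i in groups:
        total += max(
            (scatter[i] + scatter[j]) / dist(cents[i], cents[j])
            for j in groups
            if j != i
        )
    return total / len(groups)


# Toy one-dimensional feature with two well-separated classes (made-up values).
toy_samples = [[0.0], [2.0], [10.0], [12.0]]
toy_labels = [0, 0, 1, 1]
toy_dbi = davies_bouldin(toy_samples, toy_labels)
```

In this formulation a smaller value means classes that are compact relative to the distance between their centroids. The ratio is unchanged when a feature is multiplied by a constant, so rescaled versions of the same feature receive the same score.
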
The classification performance of the Mapper selected features, the SFS selected features and all features using the SVM classifier for the first and the third EMG datasets.
| feature set | data 1: test set | data 3 | feature set | data 3: test set | data 1 |
|---|---|---|---|---|---|
| Mapper | 4.95 | 7.62 | Mapper | 7.07 (a) | 4.70 (a) |
| SFS (data 1: training set) | 4.61 | 6.62 | SFS (data 3: training set) | 15.03 (b) | 11.01 (b) |
| all features | 5.24 | 6.15 | all features | 5.66 | 4.76 |

(a) Meaningful difference between Mapper and SFS (d > 0.8).
(b) Meaningful difference between SFS and all features (d > 0.8).
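
The footnotes flag differences with d > 0.8. Assuming d is Cohen's d computed with a pooled standard deviation (this record does not define it), the effect size could be sketched as below. The function name and the sample values are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled = math.sqrt(
        ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    )
    return (mean(a) - mean(b)) / pooled


# Made-up per-subject scores for two feature sets, for illustration only.
set_a = [2.0, 4.0, 6.0]
set_b = [1.0, 3.0, 5.0]
d = cohens_d(set_a, set_b)
is_meaningful = abs(d) > 0.8  # threshold quoted in the table footnotes
```
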