Andrea Bizzego, Giulio Gabrieli, Michelle Jin Yee Neoh, Gianluca Esposito.
Abstract
Deep learning (DL) has greatly contributed to bioelectric signal processing, in particular to the extraction of physiological markers. However, the efficacy and applicability of the results reported in the literature are often limited to the population represented by the data used to train the models. In this study, we investigate the issues that arise when applying a DL model to heterogeneous datasets. Focusing on heartbeat detection from electrocardiogram (ECG) signals, we show that the performance of a model trained on data from healthy subjects decreases when it is applied to patients with cardiac conditions and to signals collected with different devices. We then evaluate the use of transfer learning (TL) to adapt the model to the different datasets, and show that the classification performance improves, even for datasets with a small sample size. These results suggest that greater effort should be devoted to the generalizability of DL models applied to bioelectric signals, in particular by collecting more representative datasets.
Keywords: ECG; deep neural networks; transfer learning
Year: 2021 PMID: 34940346 PMCID: PMC8698903 DOI: 10.3390/bioengineering8120193
Source DB: PubMed Journal: Bioengineering (Basel) ISSN: 2306-5354
Sample sizes of the subsets used in this study for each partition. N: number of subjects; Segments: number of segments; % BEAT: percentage of segments in the BEAT class.
| Dataset Name | Training N | Training Segments | Training % BEAT | Testing N | Testing Segments | Testing % BEAT |
|---|---|---|---|---|---|---|
| NormalSinus+LongTerm | 17 | 240,000 | 7.37 | 8 | 80,000 | 8.47 |
| Arrhythmia | 32 | 230,000 | 6.19 | 16 | 110,000 | 7.07 |
| Baseline FlexComp | 12 | 14,748 | 6.48 | 6 | 7384 | 6.43 |
| Baseline ComfTech | 12 | 14,741 | 6.62 | 6 | 7385 | 6.19 |
| Movement ComfTech | 12 | 14,886 | 7.33 | 6 | 7443 | 6.68 |
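The % BEAT column shows that the two classes are strongly imbalanced. The following minimal sketch uses the training-partition figures from the table above to estimate the number of BEAT segments and the accuracy that a trivial classifier would reach by always predicting NO-BEAT. The function name `class_balance` is hypothetical, and the link to the choice of MCC as a metric is an interpretation, not a statement from the source.

```python
# Training-partition figures copied from the table above: (segments, % BEAT).
TRAINING = {
    "NormalSinus+LongTerm": (240_000, 7.37),
    "Arrhythmia": (230_000, 6.19),
    "Baseline FlexComp": (14_748, 6.48),
    "Baseline ComfTech": (14_741, 6.62),
    "Movement ComfTech": (14_886, 7.33),
}


def class_balance(segments, pct_beat):
    """Return (approximate BEAT count, accuracy of always predicting NO-BEAT)."""
    n_beat = round(segments * pct_beat / 100)
    majority_accuracy = 1 - pct_beat / 100
    return n_beat, majority_accuracy


for name, (segments, pct_beat) in TRAINING.items():
    n_beat, acc = class_balance(segments, pct_beat)
    print(f"{name}: ~{n_beat} BEAT segments, always-NO-BEAT accuracy {acc:.1%}")
```

With only about 6 to 8% of segments in the BEAT class, plain accuracy would exceed 90% without detecting a single beat, which is one plausible reason to prefer a metric such as MCC that accounts for all four confusion-matrix cells.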
Figure 1. Signal processing steps on the ECG signals. (A) Portion of the original ECG data (from the Baseline ComfTech subset). Vertical lines indicate the position of the R peaks, each marking a heartbeat. (B) Examples of four samples belonging to the BEAT class: R peak between 0.1 and 0.15 s (indicated by the vertical red lines). (C) Examples of four samples belonging to the NO-BEAT class: R peak not present or not between 0.1 and 0.15 s (indicated by the vertical red lines).
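The labeling rule described in panels B and C can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name `label_segment`, the representation of R peaks as a sorted list of times in seconds, and the inclusive window bounds are all assumptions. The segment length and sampling rate are not given in this excerpt and are therefore not modeled.

```python
from bisect import bisect_left

# Window (seconds from segment start) in which an R peak makes a segment a BEAT,
# taken from the Figure 1 caption. Treating both bounds as inclusive is an assumption.
WINDOW_START_S = 0.10
WINDOW_END_S = 0.15


def label_segment(segment_start_s, r_peaks_s):
    """Label one segment given the sorted R-peak times (seconds) of the recording."""
    lo = segment_start_s + WINDOW_START_S
    hi = segment_start_s + WINDOW_END_S
    i = bisect_left(r_peaks_s, lo)  # index of the first R peak at or after the window start
    if i < len(r_peaks_s) and r_peaks_s[i] <= hi:
        return "BEAT"
    return "NO-BEAT"


r_peaks = [0.62, 1.45, 2.31]  # hypothetical R-peak times, in seconds
for start in (0.50, 0.60, 1.32):
    print(start, label_segment(start, r_peaks))
```

Because the window is only 0.05 s wide, most segment positions contain no R peak inside it, which is consistent with the low share of BEAT segments reported for every subset.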
Figure 2. Schematic illustration of the network architecture used in this study.
Figure 3. The Matthews correlation coefficient (MCC) of the networks on different datasets and partitions. Vertical bars indicate 90% confidence intervals.
Performance of the Network on the Training and Testing partitions of the NormalSinus+LongTerm subset.
| Metric | Training Partition | Testing Partition |
|---|---|---|
| MCC | 0.860 [0.855, 0.866] | 0.797 [0.751, 0.830] |
| Precision | 86.7% [85.9, 87.6] | 85.3% [81.3, 90.5] |
| Sensitivity | 87.3% [86.6, 88.2] | 78.3% [71.9, 82.3] |
| F-score | 0.870 [0.864, 0.876] | 0.815 [0.772, 0.853] |
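For reference, the metrics in this table can all be derived from the four cells of a binary confusion matrix, with BEAT as the positive class. The sketch below assumes the standard definitions, including the F-score as the harmonic mean of precision and sensitivity (F1). The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute MCC, precision, sensitivity and F-score from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    # MCC uses all four cells; it is set to 0 when any marginal total is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    pr_sum = precision + sensitivity
    f_score = 2 * precision * sensitivity / pr_sum if pr_sum else 0.0
    return {
        "mcc": mcc,
        "precision": precision,
        "sensitivity": sensitivity,
        "f_score": f_score,
    }


# Hypothetical counts for illustration only.
print(binary_metrics(tp=80, fp=10, fn=20, tn=890))
```

Under these definitions, a precision of 86.7% and a sensitivity of 87.3% give an F-score of about 0.870, which matches the Training Partition column.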
Performance (MCC and 90% CI) of the Network before (Experiment 2) and after (Experiment 3) retraining, for each subset and partition under investigation.
| Dataset Name | Experiment 2: Testing Partition | Experiment 3: Training Partition | Experiment 3: Testing Partition |
|---|---|---|---|
| Arrhythmia | 0.690 [0.675, 0.703] | 0.852 [0.844, 0.859] | 0.852 [0.843, 0.861] |
| Baseline FlexComp | 0.706 [0.642, 0.767] | 0.852 [0.864, 0.913] | 0.803 [0.760, 0.847] |
| Baseline ComfTech | 0.861 [0.815, 0.895] | 0.939 [0.917, 0.954] | 0.935 [0.911, 0.960] |
| Movement ComfTech | 0.822 [0.774, 0.865] | 0.874 [0.846, 0.902] | 0.879 [0.830, 0.907] |
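This excerpt does not state how the 90% confidence intervals were obtained. One common way to attach an interval to MCC is a percentile bootstrap, sketched below as an assumption rather than as the authors' procedure. The names `mcc` and `bootstrap_ci`, the number of resamples, and the synthetic labels are all hypothetical. Resampling individual segments also ignores that segments are grouped by subject; resampling whole subjects would be a reasonable alternative.

```python
import math
import random


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = BEAT, 0 = NO-BEAT)."""
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if t and p:
            tp += 1
        elif not t and not p:
            tn += 1
        elif p:
            fp += 1
        else:
            fn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def bootstrap_ci(y_true, y_pred, n_resamples=300, level=0.90, seed=0):
    """Percentile bootstrap interval for MCC, resampling segments with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(mcc([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    alpha = (1 - level) / 2
    # Approximate percentile positions in the sorted bootstrap distribution.
    lo = stats[int(alpha * n_resamples)]
    hi = stats[min(n_resamples - 1, int((1 - alpha) * n_resamples))]
    return lo, hi


# Synthetic labels for illustration: ~7% BEAT, ~3% of predictions flipped.
rng = random.Random(1)
y_true = [1 if rng.random() < 0.07 else 0 for _ in range(500)]
y_pred = [t if rng.random() < 0.97 else 1 - t for t in y_true]
print(mcc(y_true, y_pred), bootstrap_ci(y_true, y_pred))
```

With few positive segments, as in the smaller subsets, the bootstrap distribution widens, which is in line with the broader intervals reported for the FlexComp and ComfTech subsets compared with Arrhythmia.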