Mariusz Pelc, Yuriy Khoma, Volodymyr Khoma.
Abstract
This paper examines whether the ECG signal can serve as an unequivocal biometric marker for authentication and identification. Because the ECG signals were acquired from four sources that differ in measurement equipment, electrode positioning, number of patients, and record duration, we also estimate how much information is available in an ECG record. To assess the credibility of the identification method more objectively, selected machine learning algorithms were used in two configurations: with and without compression. The results confirm that the ECG signal can be regarded as a valid biometric marker that is robust to hardware variations, noise, and artifacts, stable over time, and scalable to a substantial number of users (~100). Our experiments indicate that the most promising algorithms for ECG identification are LDA, KNN, and MLP. Moreover, our results show that PCA compression, used as part of data preprocessing, not only brings no noticeable benefit but in some cases may even reduce accuracy.
Keywords: ECG; Lviv Biometric Dataset; Physionet; biomarker; human identification; machine learning
Year: 2019 PMID: 31121807 PMCID: PMC6566823 DOI: 10.3390/s19102350
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. ECG-based identification process.
Figure 2. (a) Raw ECG signal and (b) ECG signal after filtering and normalization with the detected R peaks.
Figure 3. ECG segments (heart beats) aligned to the R peak (a) before and (b) after the outlier correction.
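The captions above mention cutting the ECG into heartbeats aligned to the R peak and then correcting outliers. A minimal sketch of that idea is given below, under several assumptions that do not come from the paper: the R-peak indices are already known, the window sizes `before` and `after` and the threshold `max_distance` are arbitrary, the function names are hypothetical, and outlier "correction" is approximated by simply rejecting beats that are far from the pointwise median beat. The paper's actual procedure may differ.

```python
from statistics import median


def segment_beats(signal, r_peaks, before, after):
    """Cut one fixed-length window per R peak; skip peaks too close to an edge."""
    beats = []
    for r in r_peaks:
        start, stop = r - before, r + after
        if start >= 0 and stop <= len(signal):
            beats.append(signal[start:stop])
    return beats


def drop_outlier_beats(beats, max_distance):
    """Keep beats whose mean absolute deviation from the median beat is small.

    This is a stand-in for the outlier correction step, not the paper's method.
    """
    if not beats:
        return []
    # The pointwise median across beats serves as a robust template.
    template = [median(column) for column in zip(*beats)]
    kept = []
    for beat in beats:
        distance = sum(abs(x - t) for x, t in zip(beat, template)) / len(beat)
        if distance <= max_distance:
            kept.append(beat)
    return kept


# Toy signal: flat baseline with three unit spikes and one large artifact.
signal = [0.0] * 60
for index in (10, 25, 40):
    signal[index] = 1.0
signal[55] = 5.0  # artifact standing in for a corrupted beat

beats = segment_beats(signal, [10, 25, 40, 55], before=3, after=4)
kept = drop_outlier_beats(beats, max_distance=0.2)
print(len(beats), len(kept))
```

Because every window has the same length and the R peak always sits at offset `before`, the resulting beats can be stacked directly into a feature matrix for a classifier.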
Basic parameters of the ECG datasets.
| Parameter | LBDS | ECG-ID | QT | Normal Sinus Rhythm |
|---|---|---|---|---|
| Lead | modified I-lead (from the fingers of the right and left hand) | I-lead | I-lead | I-lead |
| Number of users | 53 | 90 | 22 | 18 |
| Total number of records | 545 | 310 | 22 | 18 |
| Records per user | from 3 to 15 | from 1 to 22 | 1 | 1 |
| Sampling rate | 277 Hz | 500 Hz | 250 Hz | 125 Hz |
| Average record time | ~10 seconds | 20 seconds | 15 minutes | ~10:20 hours (from 8:00 to 13:50 hours) |
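As a rough illustration of how different the four sources are in size, the sampling rate and average record time from the table can be multiplied to estimate the number of samples per record. The sketch below assumes the approximate durations in the table are exact and reads "~10:20 hours" as 10 hours 20 minutes; both are simplifications, and the names `RECORDS` and `samples_per_record` are hypothetical.

```python
# (sampling rate in Hz, average record time in seconds), taken from the table.
RECORDS = {
    "LBDS": (277, 10),
    "ECG-ID": (500, 20),
    "QT": (250, 15 * 60),
    # Assumption: "~10:20 hours" is read as 10 h 20 min.
    "Normal Sinus Rhythm": (125, 10 * 3600 + 20 * 60),
}


def samples_per_record(rate_hz, seconds):
    """Approximate sample count of one record of the given length."""
    return rate_hz * seconds


sample_counts = {
    name: samples_per_record(rate, seconds)
    for name, (rate, seconds) in RECORDS.items()
}
for name, count in sample_counts.items():
    print(f"{name}: ~{count} samples per record")
```

Under these assumptions, an average record ranges from a few thousand samples (LBDS) to several million (Normal Sinus Rhythm).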
ECG identification results.
| Classifier | Physionet ECG-ID | LBDS | Physionet QT | MIT-BIH Normal Sinus Rhythm |
|---|---|---|---|---|
| Logistic Regression | 0.8286 | 0.9417 | 0.8809 | 0.7492 |
| SVM classifier | 0.8817 | 0.9599 | 0.9174 | 0.7707 |
| LDA classifier | 0.9328 | 0.9831 | 0.9659 | 0.9017 |
| KNN classifier | 0.8903 | 0.9746 | 0.9686 | 0.7967 |
| Naive Bayes | 0.7003 | 0.9587 | 0.9034 | 0.6607 |
| Random Forest | 0.8362 | 0.9546 | 0.9278 | 0.8192 |
| xgboost classifier | 0.7352 | 0.9126 | 0.9191 | 0.8591 |
| MLP (1 hidden layer) | 0.8933 | 0.9711 | 0.9162 | 0.8925 |
| MLP (2 hidden layers) | 0.8976 | 0.9464 | 0.9478 | 0.8744 |
| MLP (3 hidden layers) | 0.8406 | 0.92373 | 0.9294 | 0.8808 |
| PCA + Logistic Regression | 0.8286 | 0.9383 | 0.8465 | 0.7335 |
| PCA + SVM classifier | 0.8865 | 0.9593 | 0.8832 | 0.7472 |
| PCA + LDA classifier | 0.9536 | 0.9833 | 0.9481 | 0.8798 |
| PCA + KNN classifier | 0.8913 | 0.9758 | 0.9675 | 0.7957 |
| PCA + Naive Bayes | 0.6211 | 0.9511 | 0.8915 | 0.6681 |
| PCA + Random Forest | 0.7782 | 0.9199 | 0.8947 | 0.7418 |
| PCA + xgboost classifier | 0.6723 | 0.8911 | 0.9460 | 0.7305 |
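The effect of PCA compression can be checked directly from the table by subtracting each classifier's score without PCA from its score with PCA. The sketch below transcribes the seven classifiers that appear in both variants (the MLP rows have no PCA counterpart) and tallies the sign of each difference. It assumes the four dataset labels in the header apply, in order, to the four numeric columns; the names `pca_deltas`, `WITHOUT_PCA`, and `WITH_PCA` are hypothetical.

```python
# Column order: Physionet ECG-ID, LBDS, Physionet QT, MIT-BIH Normal Sinus Rhythm.
WITHOUT_PCA = {
    "Logistic Regression": (0.8286, 0.9417, 0.8809, 0.7492),
    "SVM classifier": (0.8817, 0.9599, 0.9174, 0.7707),
    "LDA classifier": (0.9328, 0.9831, 0.9659, 0.9017),
    "KNN classifier": (0.8903, 0.9746, 0.9686, 0.7967),
    "Naive Bayes": (0.7003, 0.9587, 0.9034, 0.6607),
    "Random Forest": (0.8362, 0.9546, 0.9278, 0.8192),
    "xgboost classifier": (0.7352, 0.9126, 0.9191, 0.8591),
}
WITH_PCA = {
    "Logistic Regression": (0.8286, 0.9383, 0.8465, 0.7335),
    "SVM classifier": (0.8865, 0.9593, 0.8832, 0.7472),
    "LDA classifier": (0.9536, 0.9833, 0.9481, 0.8798),
    "KNN classifier": (0.8913, 0.9758, 0.9675, 0.7957),
    "Naive Bayes": (0.6211, 0.9511, 0.8915, 0.6681),
    "Random Forest": (0.7782, 0.9199, 0.8947, 0.7418),
    "xgboost classifier": (0.6723, 0.8911, 0.9460, 0.7305),
}


def pca_deltas(without, with_pca):
    """Per-classifier, per-dataset change in score when PCA is added."""
    return {
        name: tuple(round(p - b, 4) for b, p in zip(without[name], with_pca[name]))
        for name in without
    }


deltas = pca_deltas(WITHOUT_PCA, WITH_PCA)
flat = [d for row in deltas.values() for d in row]
improved = sum(d > 0 for d in flat)
worsened = sum(d < 0 for d in flat)
unchanged = sum(d == 0 for d in flat)
print(f"improved={improved} worsened={worsened} unchanged={unchanged}")
```

With the values transcribed above, PCA improves 7 of the 28 classifier and dataset combinations, lowers 20, and leaves 1 unchanged, which is consistent with the abstract's remark that PCA brings no noticeable benefit and sometimes reduces accuracy.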