Shuo Liu, Jing Han, Estela Laporta Puyal, Spyridon Kontaxis, Shaoxiong Sun, Patrick Locatelli, Judith Dineley, Florian B Pokorny, Gloria Dalla Costa, Letizia Leocani, Ana Isabel Guerrero, Carlos Nos, Ana Zabalza, Per Soelberg Sørensen, Mathias Buron, Melinda Magyari, Yatharth Ranjan, Zulqarnain Rashid, Pauline Conde, Callum Stewart, Amos A Folarin, Richard JB Dobson, Raquel Bailón, Srinivasan Vairavan, Nicholas Cummins, Vaibhav A Narayan, Matthew Hotopf, Giancarlo Comi, Björn Schuller, RADAR-CNS Consortium.
Abstract
This study proposes a contrastive convolutional auto-encoder (contrastive CAE), an architecture that combines an auto-encoder with a contrastive loss, to identify individuals with suspected COVID-19 infection using heart-rate data from participants with multiple sclerosis (MS) in the ongoing RADAR-CNS mHealth research project. Heart-rate data were collected remotely using a Fitbit wristband. COVID-19 infection was either confirmed through a positive swab test or inferred from a self-reported set of recognised symptoms of the virus. The contrastive CAE outperforms a conventional convolutional neural network (CNN), a long short-term memory (LSTM) model, and a convolutional auto-encoder without contrastive loss (CAE). On a test set of 19 participants with MS who reported symptoms of COVID-19, each paired with a participant with MS without COVID-19 symptoms, the contrastive CAE achieves an unweighted average recall of 95.3%, a sensitivity of 100%, a specificity of 90.6%, and an area under the receiver operating characteristic curve (AUC-ROC) of 0.944. This indicates that symptoms were detected in every symptomatic case within the given heart rate measurement period while the false alarm rate stayed low.
Keywords: Anomaly detection; COVID-19; Contrastive learning; Convolutional auto-encoder; Respiratory tract infection
Year: 2021 PMID: 34720200 PMCID: PMC8547790 DOI: 10.1016/j.patcog.2021.108403
Source DB: PubMed Journal: Pattern Recognit ISSN: 0031-3203 Impact factor: 7.740
Gender-, age-, and site-related distribution of participants per data subset.
| Positive participants, pre-training | Positive participants, for testing | Health control, for testing |
|---|---|---|
| 14 | 5 | 5 |
| 35 | 14 | 14 |
| 18 | 7 | 7 |
| 19 | 6 | 6 |
| 12 | 6 | 6 |
| 1 | 2 | 2 |
| 10 | 3 | 4 |
| 12 | 6 | 5 |
| 19 | 6 | 6 |
| 6 | 1 | 1 |
| 1 | - | - |
Fig. 1 Segmentation and pre-processing of heart rate data of a participant with reported COVID-19-like symptoms. Top: Heart rate data recorded 24 hours a day, 7 days a week from 21 February to 20 May 2020 (90 days in total). Onset (black vertical bar) marks 00:00 on the reported symptom onset date. Red rectangle: the 7 days of heart rate data before and the 7 days after symptom onset, representing a symptomatic segment; green rectangle: asymptomatic segment. Middle: Symptomatic segment. Blue curve: unprocessed heart rate trajectory of the red rectangle above; red curve: heart rate trajectory averaged over 5-minute intervals. Bottom: Representation of the symptomatic segment as an image whose pixels are 5-minute heart rate values. Each column represents an interval of 2 h; the 168 columns sum up to 14 days.
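The pre-processing described in this caption can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes one heart-rate sample per minute (the raw sampling rate is not stated here), marks gaps with `None`, and uses the hypothetical helper names `five_minute_means` and `to_image`. With 2 h per column and 5-minute pixels, each column holds 24 values, so a 14-day segment becomes a 24 x 168 grid.

```python
from statistics import mean

SAMPLES_PER_BIN = 5          # assumption: one heart-rate sample per minute
BINS_PER_COLUMN = 24         # 2 h per column / 5 min per pixel
N_COLUMNS = 14 * 24 // 2     # 14 days of 2 h columns = 168


def five_minute_means(samples, per_bin=SAMPLES_PER_BIN):
    """Average consecutive samples into 5-minute bins; None marks a gap."""
    bins = []
    for start in range(0, len(samples) - per_bin + 1, per_bin):
        valid = [s for s in samples[start:start + per_bin] if s is not None]
        bins.append(mean(valid) if valid else None)
    return bins


def to_image(bins, rows=BINS_PER_COLUMN, cols=N_COLUMNS):
    """Arrange 5-minute bins column by column: one 2 h interval per column."""
    if len(bins) != rows * cols:
        raise ValueError(f"expected {rows * cols} bins, got {len(bins)}")
    return [[bins[c * rows + r] for c in range(cols)] for r in range(rows)]
```

Filling the grid column by column keeps time running top to bottom within a column and left to right across columns; other layouts would work equally well as long as training and testing use the same one.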
Available symptomatic and asymptomatic segments per data subset. Data completeness [%] of the respective heart rate segments is given in parentheses (mean ± std).
| # (%) | Positive participants, pre-training | Positive participants, for testing | Health control, for testing |
|---|---|---|---|
|  | 49 | 19 | - |
|  | 1470 | 570 | 1140 |
Fig. 2 The convolutional auto-encoder (CAE) architecture, shown here with 4 encoder layers and 4 decoder layers as an example. An encoder layer is a sequence of convolution – batch-normalisation – PReLU – max-pooling. A decoder layer is a sequence of transposed convolution – batch-normalisation – PReLU – transposed max-pooling. The distance between the original and reconstructed image represents the reconstruction error.
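The caption does not say which distance is used for the reconstruction error. As a minimal sketch, assuming a root-mean-square error over all pixels (the function name `reconstruction_error` is illustrative only), it could be computed as follows.

```python
from math import sqrt


def reconstruction_error(original, reconstructed):
    """Root-mean-square error between two equally sized 2-D images.

    Assumption: RMSE is used as the distance; the caption only says "distance".
    """
    if len(original) != len(reconstructed) or any(
        len(a) != len(b) for a, b in zip(original, reconstructed)
    ):
        raise ValueError("images must have the same shape")
    squared = [
        (a - b) ** 2
        for row_o, row_r in zip(original, reconstructed)
        for a, b in zip(row_o, row_r)
    ]
    return sqrt(sum(squared) / len(squared))
```

A segment that the auto-encoder reproduces well yields a value near zero, while a segment it cannot reproduce yields a larger value; that scalar is what a downstream classifier would consume.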
Specifications of our CAE models. Each convolution and pooling layer, as well as de-convolution and de-pooling layer contains its own kernel size, stride, padding size, and number of channels. *=dimensionality depends on the total number of layers, **= dimensionality of latent attributes. fc abbreviates fully-connected layer.
| Blocks | Kernel | Stride | Padding | # Channels |
|---|---|---|---|---|
|  | (5,5) | (1,1) | (2,2) | 32 |
|  | (2,2) | (2,2) | - | 32 |
|  | (5,5) | (1,1) | (2,2) | 64 |
|  | (2,2) | (2,2) | - | 64 |
|  | (5,5) | (1,1) | (2,2) | 128 |
|  | (2,2) | (2,2) | - | 128 |
|  | (5,5) | (1,1) | (2,2) | 256 |
|  | (3,3) | (3,3) | - | 256 |
|  | (3,3) | (1,1) | (1,1) | 512 |
|  | (3,3) | (1,1) | (1,1) | 1024 |
|  | (3,3) | (1,1) | (1,1) | 512 |
|  | (3,3) | (1,1) | (1,1) | 256 |
|  | (3,3) | (1,1) | (1,1) | 128 |
|  | (3,3) | (3,3) | - | 128 |
|  | (5,5) | (1,1) | (2,2) | 64 |
|  | (2,2) | (2,2) | - | 64 |
|  | (5,5) | (1,1) | (2,2) | 32 |
|  | (2,2) | (2,2) | - | 32 |
|  | (5,5) | (1,1) | (2,2) | 1 |
|  | (2,2) | (2,2) | - | 1 |
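To see how these kernel, stride, and padding settings shrink the input, the standard output-size formula floor((n + 2p - k) / s) + 1 can be applied layer by layer. The sketch below rests on two assumptions: the first eight table rows are read as four convolution/pooling encoder blocks (the block labels are missing here), and the input is a 24 x 168 image, which follows from the Fig. 1 caption (2 h columns of 5-minute pixels, 168 columns). The names `out_size`, `ENCODER`, and `trace_shapes` are illustrative.

```python
def out_size(n, kernel, stride, padding=0):
    """Output length of a convolution or pooling layer along one axis."""
    return (n + 2 * padding - kernel) // stride + 1


# (kind, kernel, stride, padding, channels), read off the first eight table rows.
# Assumption: they form four conv/pool encoder blocks with square kernels.
ENCODER = [
    ("conv", 5, 1, 2, 32), ("pool", 2, 2, 0, 32),
    ("conv", 5, 1, 2, 64), ("pool", 2, 2, 0, 64),
    ("conv", 5, 1, 2, 128), ("pool", 2, 2, 0, 128),
    ("conv", 5, 1, 2, 256), ("pool", 3, 3, 0, 256),
]


def trace_shapes(height, width, layers=ENCODER):
    """Return the (channels, height, width) shape after every layer."""
    shapes = []
    for _kind, k, s, p, channels in layers:
        height = out_size(height, k, s, p)
        width = out_size(width, k, s, p)
        shapes.append((channels, height, width))
    return shapes
```

Under these assumptions the 5 x 5 convolutions with padding 2 preserve the spatial size, each 2 x 2 pooling halves it, and the final 3 x 3 pooling reduces 3 x 21 to 1 x 7, so `trace_shapes(24, 168)[-1]` is `(256, 1, 7)`.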
Evaluation results for the binary COVID-19 yes/no (based on the symptom CD1/CD2 definitions above) classification [%] of the baseline methods and contrastive CAE models with a different number of (#) layers. For the contrastive CAE, classification is performed based on reconstruction error using logistic regression.
| # Layers | UAR | Sensitivity | Specificity | AUC-ROC | MCC |
|---|---|---|---|---|---|
|  | 61.0 | 63.2 | 58.8 | 0.542 | 0.046 |
|  | 67.3 | 73.7 | 61.0 | 0.577 | 0.074 |
|  | 72.8 | 73.7 | 71.9 | 0.685 | 0.105 |
| 1 | 58.8 | 70.2 | 47.4 | 0.508 | 0.044 |
| 2 | 83.0 | 84.2 | 81.9 | 0.769 | 0.176 |
| 3 | 90.6 |  | 81.3 | 0.878 | 0.213 |
| 4 |  |  |  |  |  |
| 5 | 93.9 |  | 87.7 | 0.931 | 0.270 |
| 6 | 90.9 |  | 81.9 | 0.883 | 0.217 |
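For a binary task, the unweighted average recall (UAR) is the mean of sensitivity and specificity; for the two-layer row, (84.2 + 81.9) / 2 ≈ 83.0. The sketch below computes the table's count-based metrics from confusion-matrix counts, using the standard definition of the Matthews correlation coefficient (MCC). The function name and the counts in the usage note are hypothetical and are not taken from the study.

```python
from math import sqrt


def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, UAR and MCC from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # recall of the positive class
    specificity = tn / (tn + fp)            # recall of the negative class
    uar = (sensitivity + specificity) / 2   # unweighted average recall
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "uar": uar,
        "mcc": mcc,
    }
```

For example, `binary_metrics(tp=16, fn=3, tn=80, fp=20)` gives a sensitivity of 16/19 and a specificity of 0.8. Because MCC depends on class balance, a classifier can combine a high UAR with a modest MCC when negatives greatly outnumber positives.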
Comparison of results [%] between convolutional auto-encoders (CAEs) with 4 encoder and 4 decoder layers trained with RMSE loss vs contrastive loss. Classification is performed based on the latent attributes. # Attr: dimensionality of latent attributes.
| # Attr | UAR | Sensitivity | Specificity | AUC-ROC | MCC | |
|---|---|---|---|---|---|---|
| 50 | 57.9 | 0.545 | ||||
| 100 | 58.5 | 47.4 | 69.5 | 0.465 | 0.038 | |
| 300 | 63.4 | 63.2 | 63.7 | 0.527 | 0.058 | |
| 500 | 65.8 | 63.2 | 0.068 | |||
| 1000 | 55.3 | 47.4 | 63.2 | 0.448 | 0.023 | |
| 50 | 92.0 | 83.9 | 0.904 | 0.233 | ||
| 100 | 84.3 | 0.236 | ||||
| 300 | 90.9 | 81.9 | 0.890 | 0.217 | ||
| 500 | 90.9 | 94.7 | 0.881 | |||
| 1000 | 71.9 | 68.4 | 75.4 | 0.597 | 0.105 |
Classification results [%] of the contrastive CAE with 4 encoder and 4 decoder layers based on the reconstruction error (rec. error) using logistic regression. # Attr: dimensionality of latent attributes. The last row indicates removing the latent attributes layer.
| # Attr. | UAR | Sensitivity | Specificity | AUC-ROC | MCC | |
|---|---|---|---|---|---|---|
| 50 | 93.9 | 100.0 | 87.7 | 0.927 | 0.270 | |
| 100 | ||||||
| 300 | 91.5 | 100.0 | 83.0 | 0.890 | 0.226 | |
| 500 | 92.4 | 100.0 | 84.8 | 0.895 | 0.240 | |
| 1000 | 94.4 | 100.0 | 88.9 | 0.936 | 0.284 | |
| - | 93.3 | 100.0 | 86.6 | 0.923 | 0.258 |
Fig. 3 Training and testing curves illustrated by the reconstruction errors when using different margin sizes.
Classification results [%] of the contrastive CAE with 4 encoder and 4 decoder layers based on the reconstruction error (rec. error) using different margin sizes.
| (m)argin | UAR | Sensitivity | Specificity | AUC-ROC | MCC | |
|---|---|---|---|---|---|---|
| 2 | 78.9 | 84.2 | 73.6 | 0.753 | 0.136 | |
| 3 | 91.4 | 100.0 | 82.8 | 0.905 | 0.224 | |
| 4 | 94.1 | 100.0 | 88.2 | 0.920 | 0.275 | |
| 5 | ||||||
| 10 | 90.5 | 94.7 | 86.2 | 0.861 | 0.238 | |
| 15 | 90.9 | 94.7 | 87.0 | 0.861 | 0.247 |
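How the margin m enters the loss is not spelled out in this section. One common choice, assumed here purely for illustration, is the classic margin-based contrastive form applied to reconstruction errors: asymptomatic segments are pulled towards zero error, while symptomatic segments are penalised only while their error is still below m. The authors' exact formulation may differ, and `contrastive_loss` is a hypothetical name.

```python
def contrastive_loss(errors, labels, margin):
    """Mean margin-based contrastive loss over a batch of segments.

    errors: non-negative reconstruction error per segment
    labels: 0 = asymptomatic segment, 1 = symptomatic segment
    margin: error level beyond which symptomatic segments incur no loss
    Assumption: classic squared-hinge contrastive form, not confirmed by the source.
    """
    if len(errors) != len(labels) or not errors:
        raise ValueError("errors and labels must be non-empty and equally long")
    total = 0.0
    for error, label in zip(errors, labels):
        if label == 0:
            total += error ** 2                      # pull towards zero error
        else:
            total += max(0.0, margin - error) ** 2   # push beyond the margin
    return total / len(errors)
```

Under this form, a very small margin leaves little separation between the two groups of errors, while a very large one demands that symptomatic segments be reconstructed extremely poorly; the table's sweep over m can be read as a search between those two extremes.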
Classification results [%] of the contrastive CAE with 4 encoder and 4 decoder layers based on the reconstruction error (rec. error) using different numbers of (#) participants for pre-training.
| # Participants | UAR | Sensitivity | Specificity | AUC-ROC | MCC | |
|---|---|---|---|---|---|---|
| 49 | 95.3 | 100.0 | 90.6 | 0.944 | 0.310 | |
| 40 | ||||||
| 30 | 95.2 | 100.0 | 90.3 | 0.940 | 0.305 | |
| 20 | 82.3 | 84.2 | 80.3 | 0.823 | 0.167 | |
| 10 | 79.8 | 84.2 | 75.4 | 0.737 | 0.143 | |
| 0 | 76.4 | 78.9 | 73.8 | 0.696 | 0.124 |
Test results [%] for shifting the sliding window by days.
| # Days | UAR | Sensitivity | Specificity | AUC-ROC | MCC |
|---|---|---|---|---|---|
|  | 57.4 | 52.6 | 62.2 | 0.420 | 0.032 |
|  | 64.7 | 68.4 | 61.0 | 0.558 | 0.063 |
|  | 95.6 | 100.0 | 91.2 | 0.946 | 0.320 |
| 0 |  |  |  |  |  |
| 1 | 95.4 | 100.0 | 90.8 | 0.945 | 0.313 |
| 2 | 96.1 | 100.0 | 92.1 | 0.957 | 0.337 |
| 3 | 94.9 | 100.0 | 89.9 | 0.949 | 0.298 |
| 4 | 87.4 | 94.7 | 80.2 | 0.823 | 0.193 |
| 5 | 61.5 | 68.4 | 54.6 | 0.517 | 0.048 |
Fig. 4 Reconstruction errors for continuous binary COVID-19 yes/no classification on 14-day heart rate windows of an exemplary individual (the same as in Fig. 1, top).
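Continuous classification of this kind can be pictured as sliding a 14-day window over the recording and scoring each position. The sketch below is a minimal illustration under stated assumptions: the data is held as one entry per day, the window moves in whole-day steps, and `score` is a hypothetical stand-in for the model's reconstruction error on one window.

```python
def sliding_windows(n_days, window=14, step=1):
    """Yield (start, end) day indices of every full window; end is exclusive."""
    for start in range(0, n_days - window + 1, step):
        yield start, start + window


def score_windows(daily_data, score, window=14, step=1):
    """Apply `score` to each window and return (start_day, value) pairs.

    `score` is a placeholder for a model-based reconstruction error.
    """
    return [
        (start, score(daily_data[start:end]))
        for start, end in sliding_windows(len(daily_data), window, step)
    ]
```

With the 90-day recording from Fig. 1, one-day steps give 77 window positions, from days 0 to 13 up to days 76 to 89.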