Alican Akman1, Harry Coppock1, Alexander Gaskell1, Panagiotis Tzirakis1, Lyn Jones2, Björn W Schuller1,3.
Abstract
Several machine learning-based COVID-19 classifiers exploiting vocal biomarkers of COVID-19 have been proposed recently as digital mass testing methods. Although these classifiers have shown strong performance on the datasets on which they are trained, their methodological adaptation to new datasets with different modalities has not been explored. We report on cross-running a modified version of the recent COVID-19 Identification ResNet (CIdeR) on the two Interspeech 2021 COVID-19 diagnosis from cough and speech audio challenges: ComParE and DiCOVA. CIdeR is an end-to-end deep learning neural network originally designed to classify whether an individual is COVID-19-positive or COVID-19-negative based on coughing and breathing audio recordings from a published crowdsourced dataset. In the current study, we demonstrate the potential of CIdeR at binary COVID-19 diagnosis from both the COVID-19 Cough and Speech Sub-Challenges of INTERSPEECH 2021, ComParE and DiCOVA. CIdeR achieves significant improvements over several baselines. We also present the results of cross-dataset experiments with CIdeR that show the limitations of using the current COVID-19 datasets jointly to build a collective COVID-19 classifier.
Keywords: COVID-19; audio; computer audition; deep learning; digital health
Year: 2022 PMID: 35873349 PMCID: PMC9302571 DOI: 10.3389/fdgth.2022.789980
Source DB: PubMed Journal: Front Digit Health ISSN: 2673-253X
Figure 1. A schematic of the COVID-19 Identification ResNet (CIdeR). The figure shows a blow-up of a residual block, consisting of convolutional, batch normalization, and Rectified Linear Unit (ReLU) layers.
ComParE sub-challenge dataset splits.
| | CCS Train | CCS Devel | CCS Test | CSS Train | CSS Devel | CSS Test |
|---|---|---|---|---|---|---|
| COVID-19-positive | 71 | 48 | 39 | 72 | 142 | 94 |
| COVID-19-negative | 215 | 183 | 169 | 243 | 153 | 189 |
| Total | 286 | 231 | 208 | 315 | 295 | 283 |
Values specify the number of audio recordings, not the number of participants.
DiCOVA sub-challenge dataset splits.
| | Track 1 Train + Val | Track 1 Test | Track 2 Train + Val | Track 2 Test |
|---|---|---|---|---|
| COVID-19-positive | 75 | blind | 60 | 21 |
| COVID-19-negative | 965 | blind | 930 | 188 |
| Total | 1,040 | 234 | 990 | 209 |
The test set labels were withheld by the DiCOVA team; contestants had to submit predictions for each test case, on which a final AUC was returned.
Results for CIdeR and a range of baseline models for 4 sub-challenges across the DiCOVA and ComParE challenges.
| Challenge | Sub-challenge | | | | |
|---|---|---|---|---|---|
| DiCOVA | Track 1 | – | 0.699 ± 0.068 | – | |
| DiCOVA | Track 2 | 0.786 ± 0.057 | 0.647 ± 0.014 | 0.684 ± 0.072 | 0.776 ± 0.063 |
| ComParE | CCS | 0.732 ± 0.068 | 0.722 ± 0.069 | 0.765 ± 0.065 | 0.753 ± 0.066 |
| ComParE | CSS | 0.583 ± 0.072 | 0.656 ± 0.070 | 0.628 ± 0.070 | |
Testing is performed on the held-out test fold once final model decisions have been made on the validation sets. The area under the receiver operating characteristic curve (AUC-ROC) is displayed, together with a 95% confidence interval.
Track 1: coughing, Track 2: deep breathing + vowel phonation + counting, CCS: coughing, CSS: speech ("I hope my data can help manage the virus pandemic").
As the demographics were not provided for the Track 1 test set, when calculating the AUC confidence intervals, it was assumed that there was an equal number of COVID-19-positive and COVID-19-negative recordings.
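The notes above describe the evaluation protocol: AUC-ROC on a held-out test fold, reported with a 95% confidence interval. As a minimal pure-Python sketch of this kind of evaluation (not the authors' code; the bootstrap percentile method is an assumption, since the paper's exact CI procedure is cited but not reproduced here):

```python
import random

def auc(labels, scores):
    """AUC via the Mann-Whitney U statistic: the probability that a
    randomly chosen positive is scored above a randomly chosen
    negative (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auc_with_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Point AUC plus a bootstrap percentile (1 - alpha) confidence
    interval, resampling recordings with replacement."""
    rng = random.Random(seed)
    point = auc(labels, scores)
    idx = list(range(len(labels)))
    stats = []
    for _ in range(n_boot):
        sample = [rng.choice(idx) for _ in idx]
        ys = [labels[i] for i in sample]
        ss = [scores[i] for i in sample]
        if 0 < sum(ys) < len(ys):  # resample must contain both classes
            stats.append(auc(ys, ss))
    stats.sort()
    lo = stats[int(alpha / 2 * len(stats))]
    hi = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return point, (lo, hi)
```

Note that with heavily imbalanced sets such as DiCOVA (75 positives vs. 965 negatives), bootstrap intervals on the AUC are wide, which is consistent with the ± values in the table.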
The results for cross dataset experiments.
| Trained on \ Tested on | DiCOVA | ComParE | COUGHVID |
|---|---|---|---|
| DiCOVA | 0.799 | 0.554 | 0.464 |
| ComParE | 0.512 | 0.732 | 0.552 |
| COUGHVID | 0.395 | 0.518 | 0.566 |
| All | 0.673 | 0.717 | 0.531 |
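The cross-dataset experiment above has a simple shape: fit one model per training source (plus one on the union of all sources) and score each model on every test set. A hedged sketch of that loop, where `train` and `evaluate` are hypothetical caller-supplied callables standing in for the authors' actual training and AUC-scoring code:

```python
def cross_dataset_grid(datasets, train, evaluate):
    """Build the train-vs-test result grid.

    datasets: {name: (train_split, test_split)}
    train:    callable(train_split) -> model        (hypothetical)
    evaluate: callable(model, test_split) -> score  (hypothetical)
    Returns {(train_name, test_name): score}.
    """
    results = {}
    for train_name, (train_split, _) in datasets.items():
        model = train(train_split)
        for test_name, (_, test_split) in datasets.items():
            results[(train_name, test_name)] = evaluate(model, test_split)
    return results
```

An "All" row as in the table would be produced by adding one more entry whose training split is the concatenation of every dataset's training split; the low off-diagonal scores in the table are what motivate the paper's caution about pooling current COVID-19 audio datasets.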