Gang Yin1,2, Lintao Li1,2, Shun Lu1,2, Yu Yin3, Yuanzhang Su4, Yilan Zeng5, Mei Luo5, Maohua Ma5, Hongyan Zhou1,2, Lucia Orlandini1,2, Dezhong Yao3, Gang Liu6, Jinyi Lang1,2.
Abstract
The outbreak of coronavirus disease 2019 (COVID-19) around the end of 2019 has become a pandemic. The preferred method for COVID-19 detection is the real-time polymerase chain reaction (RT-PCR)-based technique; however, it has certain limitations, such as sample-dependent procedures and a relatively high false-negative rate. We propose a safe and efficient method for screening COVID-19 based on Raman spectroscopy. A total of 177 serum samples were collected from 63 confirmed COVID-19 patients, 59 suspected cases, and 55 healthy individuals as a control group. Raman spectroscopy was used to analyze these samples, and a support-vector machine (SVM) machine learning method was applied to the spectral dataset to build a diagnostic algorithm. Furthermore, 20 independent individuals, comprising 5 asymptomatic COVID-19 patients, 5 symptomatic COVID-19 patients, 5 suspected patients, and 5 healthy individuals, were sampled for external validation. Across the three groups (confirmed COVID-19, suspected, and healthy individuals), the distribution of statistically significant points of difference was highly consistent between groups after repeated sampling. The classification accuracy between the COVID-19 cases and the suspected cases was 0.87 (95% confidence interval [CI]: 0.85-0.88), and the accuracy between the COVID-19 cases and the healthy controls was 0.90 (95% CI: 0.89-0.91), while the accuracy between the suspected cases and the healthy control group was 0.68 (95% CI: 0.67-0.73). When the trained SVM model was applied to the independent test dataset, all samples were correctly classified at the serum level. Our results suggest that Raman spectroscopy could be a safe and efficient technique for COVID-19 screening.
Keywords: COVID‐19; Raman spectroscopy; machine learning; screening; support vector machine
Year: 2021 PMID: 33821082 PMCID: PMC8014023 DOI: 10.1002/jrs.6080
Source DB: PubMed Journal: J Raman Spectrosc ISSN: 0377-0486 Impact factor: 2.727
Clinical characteristics of the investigated individuals
| Characteristic | COVID‐19 (symptomatic) | COVID‐19 (asymptomatic) | Suspected | Healthy control |
|---|---|---|---|---|
| Total | 58 | 5 | 59 | 55 |
| Age (median, range) | 47.6 (20–78), all COVID‐19 | | 45.8 (21–74) | 45.5 (24–65) |
| Gender | | | | |
| Male | 26 | 2 | 36 | 24 |
| Female | 32 | 3 | 23 | 31 |
| Distribution of temperature (blood sampling) | | | | |
| <37.5°C | 27 | 5 | 13 | 5 |
| 37.5–38.0°C | 19 | | 17 | |
| 38.1–39.0°C | 7 | | 21 | |
| >39.0°C | 5 | | 8 | |
| Symptoms | | | | |
| Cough | 36 | | 24 | |
| Fatigue | 21 | | 16 | |
| Myalgia or arthralgia | 9 | | 6 | |
| Headache | 11 | | 20 | |
| Shortness of breath | 6 | | 3 | |
| Disease severity | | | | |
| Nonsevere | 51 | | 58 | |
| Severe | 7 | | 1 | |
| Abnormalities on chest CT | | | | |
| Ground‐glass opacity | 29 | | 19 | |
| Local patchy shadowing | 21 | | 34 | |
| Bilateral patchy shadowing | 28 | | 17 | |
| Interstitial abnormalities | 9 | | 17 | |
Abbreviation: CT, computed tomography.
FIGURE 1. The total average serum Raman spectra of the three groups and the differences between the groups. (a) The total average Raman spectrum of each of the three groups; the color band represents the standard deviation. (b) The Raman difference signal between the groups (black) and the Raman signal of the groups within ±2 standard deviations (red and blue)
FIGURE 2. The result of the ANOVA test. Spectral ranges without a significant difference in the ANOVA test (significance threshold p < 0.05) are indicated in blue, while the others are indicated in yellow. (a) The p value of the difference along the Raman shift axis after 70% random sampling and training repeated 100 times, for the intergroup comparison. (b) The p value of the difference along the Raman shift axis after 70% random sampling and training repeated 100 times, for the intragroup comparison
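The repeated-sampling ANOVA described in this caption can be illustrated with a minimal sketch. It computes a one-way ANOVA F statistic at a single Raman shift on 100 random draws of 70% of each group. The function names `anova_f` and `subsampled_f`, the seed, and the intensities are all hypothetical: the data are synthetic, not taken from the study, and the authors' actual pipeline is not described here. Converting F to a p value needs the F distribution, which is outside the Python standard library, so this sketch stops at F.

```python
import random
from statistics import mean, median


def anova_f(groups):
    """One-way ANOVA F statistic for intensities at a single Raman shift."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    group_means = [mean(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def subsampled_f(groups, fraction=0.7, repeats=100, seed=0):
    """Recompute F on `repeats` random draws of `fraction` of each group."""
    rng = random.Random(seed)
    results = []
    for _ in range(repeats):
        draw = [rng.sample(g, round(fraction * len(g))) for g in groups]
        results.append(anova_f(draw))
    return results


# Synthetic intensities (NOT study data); group sizes follow the abstract.
rng = random.Random(1)
covid = [rng.gauss(1.00, 0.10) for _ in range(63)]
suspected = [rng.gauss(0.95, 0.10) for _ in range(59)]
healthy = [rng.gauss(0.90, 0.10) for _ in range(55)]

f_values = subsampled_f([covid, suspected, healthy])
print(f"median F over {len(f_values)} draws: {median(f_values):.2f}")
```

Running the same loop at every Raman shift and recording how often the test is significant would give a per-wavenumber consistency map of the kind the caption describes.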
Performance parameters of the SVM
| Class | Performance parameter | Value ± SD | 95% CI |
|---|---|---|---|
| COVID‐19 versus suspected | Sensitivity | 0.89 ± 0.08 (0.90 ± 0.08) | 0.87–0.91 (0.87–0.92) |
| | Specificity | 0.86 ± 0.09 (0.88 ± 0.09) | 0.83–0.88 (0.85–0.90) |
| | Accuracy | 0.87 ± 0.05 (0.89 ± 0.06) | 0.86–0.89 (0.88–0.90) |
| COVID‐19 versus healthy control | Sensitivity | 0.89 ± 0.07 (0.89 ± 0.079) | 0.90–0.92 (0.87–0.91) |
| | Specificity | 0.93 ± 0.06 (0.94 ± 0.06) | 0.91–0.94 (0.93–0.96) |
| | Accuracy | 0.91 ± 0.04 (0.91 ± 0.04) | 0.90–0.92 (0.90–0.93) |
| Suspected versus healthy control | Sensitivity | 0.70 ± 0.09 (0.72 ± 0.11) | 0.68–0.73 (0.69–0.75) |
| | Specificity | 0.66 ± 0.09 (0.71 ± 0.11) | 0.64–0.69 (0.68–0.74) |
| | Accuracy | 0.69 ± 0.05 (0.71 ± 0.07) | 0.68–0.70 (0.70–0.73) |
Note: Values in brackets are the serum‐level classification results for each serum sample.
Abbreviations: CI, confidence interval; SVM, support‐vector machine.
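For readers who want to reproduce figures of this kind, the following minimal sketch shows how sensitivity, specificity, and accuracy follow from confusion counts, and how a mean and 95% interval can be summarized over repeated runs. The helper names `binary_metrics` and `mean_ci95` are hypothetical, the per-run values are made up, and the normal-approximation interval is an assumption: the source does not state how its confidence intervals were computed.

```python
import math
from statistics import mean, stdev


def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),   # true positives among all positives
        "specificity": tn / (tn + fp),   # true negatives among all negatives
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


def mean_ci95(values):
    """Mean with a normal-approximation 95% interval for that mean.

    Assumption: half-width = 1.96 * SD / sqrt(n); the source does not
    say which interval construction was actually used.
    """
    m = mean(values)
    half_width = 1.96 * stdev(values) / math.sqrt(len(values))
    return m, m - half_width, m + half_width


# Made-up confusion counts for one train/test split (not study data).
print(binary_metrics(tp=9, fn=1, tn=8, fp=2))

# Made-up per-run accuracies from repeated splits (not study data).
run_accuracies = [0.85, 0.90, 0.88, 0.91, 0.86]
m, lo, hi = mean_ci95(run_accuracies)
print(f"accuracy {m:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

Under this assumption, a standard deviation of 0.08 over 100 repeats gives a half-width of 1.96 × 0.08 / 10 ≈ 0.016, which is in line with the 0.87–0.91 interval reported around a sensitivity of 0.89, though that agreement does not confirm the construction.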
FIGURE 3. The ROC curves of the SVM diagnostic algorithm for the COVID‐19 group versus the suspected group, the COVID‐19 group versus the healthy control group, and the suspected group versus the healthy control group
Results of 20 samples for the independent verification
| Sample # | Individual spectra predicted as COVID‐19 | Individual spectra predicted as suspected | Individual spectra predicted as healthy control | Predicted class (external validation) | True class |
|---|---|---|---|---|---|
| 1 | 2771 | / | 529 | COVID‐19 | COVID‐19 |
| 2 | 2570 | / | 730 | COVID‐19 | COVID‐19 |
| 3 | 2642 | / | 658 | COVID‐19 | COVID‐19 |
| 4 | 2424 | / | 876 | COVID‐19 | COVID‐19 |
| 5 | 2631 | / | 669 | COVID‐19 | COVID‐19 |
| 6 | 3300 | 0 | / | COVID‐19 | COVID‐19 |
| 7 | 3271 | 29 | / | COVID‐19 | COVID‐19 |
| 8 | 3300 | 0 | / | COVID‐19 | COVID‐19 |
| 9 | 2811 | 489 | / | COVID‐19 | COVID‐19 |
| 10 | 3300 | 0 | / | COVID‐19 | COVID‐19 |
| 11 | 76 | 3224 | / | Suspected | Suspected |
| 12 | 196 | 3104 | / | Suspected | Suspected |
| 13 | 891 | 2409 | / | Suspected | Suspected |
| 14 | 93 | 3207 | / | Suspected | Suspected |
| 15 | 1229 | 2071 | / | Suspected | Suspected |
| 16 | / | 5 | 3295 | Healthy controls | Healthy controls |
| 17 | / | 115 | 3185 | Healthy controls | Healthy controls |
| 18 | / | 0 | 3300 | Healthy controls | Healthy controls |
| 19 | / | 0 | 3300 | Healthy controls | Healthy controls |
| 20 | / | 16 | 3284 | Healthy controls | Healthy controls |
Symptomatic.
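In every row of the verification table, the predicted class is the label assigned to the larger share of that sample's 3300 individual spectra predictions. A minimal sketch of such an aggregation is given below. The majority-vote rule and the function name `majority_class` are assumptions that are consistent with the table; the source does not state the aggregation rule explicitly. The counts are copied from samples 1, 15, and 17, with the "/" entries (comparison not made) left out.

```python
def majority_class(spectrum_votes):
    """Return the label that received the most spectrum-level predictions.

    `spectrum_votes` maps a class label to the number of individual
    spectra assigned to it. A tie is reported as an error rather than
    resolved silently.
    """
    best = max(spectrum_votes.values())
    winners = [label for label, n in spectrum_votes.items() if n == best]
    if len(winners) != 1:
        raise ValueError(f"tie between {winners}")
    return winners[0]


# Counts taken from the table above (samples 1, 15, and 17).
samples = {
    1: {"COVID-19": 2771, "Healthy control": 529},
    15: {"COVID-19": 1229, "Suspected": 2071},
    17: {"Suspected": 115, "Healthy control": 3185},
}

for sample_id, votes in samples.items():
    print(sample_id, majority_class(votes))
```

Sample 15 is the closest call among the 20, with about 63% of its spectra (2071 of 3300) assigned to the winning class.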