Katherine Ember1,2, François Daoust1,2, Myriam Mahfoud1,2, Frédérick Dallaire1,2, Esmat Zamani Ahmad1,2,3, Trang Tran1,2, Arthur Plante1,2, Mame-Kany Diop2,3, Tien Nguyen1,2,3, Amélie St-Georges-Robillard1,2, Nassim Ksantini1,2, Julie Lanthier1,2, Antoine Filiatrault1,2, Guillaume Sheehy1,2, Gabriel Beaudoin1,2, Caroline Quach4,5, Dominique Trudel2,3,6, Frédéric Leblond1,2,3.
Abstract
SIGNIFICANCE: The primary method of COVID-19 detection is reverse transcription polymerase chain reaction (RT-PCR) testing. PCR test sensitivity may decrease as more variants of concern arise and reagents may become less specific to the virus. AIM: We aimed to develop a reagent-free way to detect COVID-19 in a real-world setting with minimal constraints on sample acquisition. The machine learning (ML) models involved could be frequently updated to include spectral information about variants without needing to develop new reagents. APPROACH: We present a workflow for collecting, preparing, and imaging dried saliva supernatant droplets using a non-invasive, label-free technique, Raman spectroscopy, to detect changes in the molecular profile of saliva associated with COVID-19 infection.
Keywords: Raman spectroscopy; biofluids; coronavirus disease-19; saliva; screening
Year: 2022 PMID: 35142113 PMCID: PMC8825664 DOI: 10.1117/1.JBO.27.2.025002
Source DB: PubMed Journal: J Biomed Opt ISSN: 1083-3668 Impact factor: 3.758
Fig. 1 Workflow from saliva collection to determination of COVID infection status from Raman spectra. Volunteers donated between 1 and 5 ml of saliva into a 50-ml tube. The liquid was pipetted into a 1.5-ml microcentrifuge tube, which was centrifuged. The saliva supernatant and pellet were then stored separately at but supernatant only is shown here for clarity. Supernatant was thawed, vortexed, and mounted on an aluminum slide. After 45 min of drying, spectra were acquired using a Renishaw InVia Raman spectrometer. This figure was created with BioRender.com.
Fig. 4 Raman spectroscopy from a droplet of model saliva composed of a mix of salts, bovine serum albumin, mucin, and other metabolites. (a)–(c) Brightfield images for (a) whole droplet dried on aluminum slide; (b) crystalline region; and (c) edge, with acquisition points shown with different symbols: diamonds (on crystal), crosses (off crystal), and triangles (edge). (d) Average spectra from one saliva sample for on crystal (top), off crystal (middle), and edge (bottom) regions, respectively, with shaded areas representing the interspectral variance within the specimen. Each spectrum is an average of multiple acquisitions obtained with a 785-nm laser using a Renishaw InVia Raman microscope. The scale bar in (a) is 1 mm long, whereas the scale bars in (b) and (c) are 0.1 mm long.
Fig. 2 ML schematic workflow. The ML workflow consists of a 5-fold cross-validation (CV) embedded in a 5-time repeat loop that creates different splits of the training and validation sets. The 5-time repeated 5-fold CV allowed the variance in the areas under the curve (AUCs) produced by the models to be assessed, reducing the bias induced by any single split of the dataset while keeping computation time reasonable. AUCs were stable using this procedure. Feature selection, bagging, mapping, and training steps are repeated for each fold. Raman spectra are represented by spectral peaks and fitted peaks. Each instance is represented by the relevant features, which are selected using a combination of a variance-based algorithm, acting as a broad skimmer, and a random forest (RF). Each bag is then mapped from a multiple-instance representation to a single-instance representation through an instance similarity measure; the mapping function differs between MILES and MILDM. A linear support vector machine (SVM) is used to train each model and output a classification probability for patients in the validation set. After each CV procedure, a receiver operating characteristic (ROC) curve is computed with a corresponding AUC value to assess model performance. The final output is the ROC and AUC averaged over the five repetitions, ensuring further stability. Computational time for each classification scenario was 30 min or less.
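The repeated cross-validation loop described in the Fig. 2 caption can be sketched in a few lines of standard-library Python. This is an illustrative sketch only, not the authors' pipeline: the nearest-centroid scorer is a deliberately simple stand-in for the feature selection, bag mapping, and linear SVM steps, the data are synthetic, and all function names (`stratified_folds`, `auc`, `centroid_scores`, `repeated_cv_auc`) are hypothetical. Pooling the out-of-fold scores to obtain one AUC per repeat is also an assumption about how "after each CV procedure" is implemented.

```python
import math
import random
from statistics import mean, pstdev


def stratified_folds(labels, k, rng):
    """Split sample indices into k folds, preserving class proportions."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def auc(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def centroid_scores(train_x, train_y, test_x):
    """Toy scorer (stand-in for the SVM): higher = closer to the positive centroid."""
    def centroid(cls):
        rows = [x for x, y in zip(train_x, train_y) if y == cls]
        return [mean(col) for col in zip(*rows)]
    c_neg, c_pos = centroid(0), centroid(1)
    return [math.dist(x, c_neg) - math.dist(x, c_pos) for x in test_x]


def repeated_cv_auc(x, y, k=5, repeats=5, seed=0):
    """Run k-fold CV `repeats` times; return mean and SD of the per-repeat AUCs."""
    rng = random.Random(seed)
    per_repeat = []
    for _ in range(repeats):
        scores = [0.0] * len(y)
        for fold in stratified_folds(y, k, rng):
            held_out = set(fold)
            train = [i for i in range(len(y)) if i not in held_out]
            # Everything that learns from data is refit on the training part only.
            fold_scores = centroid_scores(
                [x[i] for i in train], [y[i] for i in train], [x[i] for i in fold]
            )
            for i, s in zip(fold, fold_scores):
                scores[i] = s
        per_repeat.append(auc(scores, y))  # one AUC per CV procedure
    return mean(per_repeat), pstdev(per_repeat)


# Synthetic data: 38 "negative" and 33 "positive" samples with 4 features each.
data_rng = random.Random(1)
y = [0] * 38 + [1] * 33
x = [[data_rng.gauss(1.0 * label, 1.0) for _ in range(4)] for label in y]
mean_auc, sd_auc = repeated_cv_auc(x, y)
print(f"mean AUC = {mean_auc:.2f}, SD across repeats = {sd_auc:.2f}")
```

The spread of the per-repeat AUCs is what indicates how sensitive the result is to a particular split. In a fuller implementation, feature selection and bag mapping would sit inside the fold loop alongside the classifier, as the caption specifies.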
Clinical characteristics of the total volunteer cohort. Characteristics were taken from questionnaires given to volunteers. There were 513 COVID-19 negative volunteers and 37 COVID-19 positive volunteers. In each column, the number on the left is the number of individuals with each characteristic, and the number in parentheses on the right is the percentage of the total number of COVID-19 negative or positive volunteers. Only 38 COVID-19 negative and 33 COVID-19 positive samples were analyzed because of time constraints and limited access to biosafety level 2 containment facilities. Data from the remaining volunteers were not used in this paper, but samples have been retained for future studies.
| Characteristic | Total COVID-19 negative | Analyzed COVID-19 negative | Analyzed COVID-19 positive |
|---|---|---|---|
| Total number of volunteers | 513 | 38 | 33 |
| Age range, n (%) | — | — | — |
| 0–20 | 58 (11) | 6 (15) | 7 (21) |
| 21–40 | 255 (50) | 17 (45) | 13 (39) |
| 41–60 | 135 (26) | 12 (32) | 11 (33) |
| 61–80 | 60 (12) | 3 (8) | 2 (6) |
| 81+ | 1 (0) | 0 (0) | 0 (0) |
| Not given | 4 (1) | 0 (0) | 0 (0) |
| Sex at birth, n (%) | — | — | — |
| Female | 225 (44) | 18 (47) | 18 (55) |
| Male | 206 (40) | 20 (53) | 15 (45) |
| Prefer not to say | 82 (16) | 0 (0) | 0 (0) |
| Symptoms, n (%) | — | — | — |
| Respiratory symptoms | 187 (36) | 24 (61) | 21 (64) |
| Non-respiratory symptoms | 30 (6) | 1 (3) | 3 (9) |
| None | 279 (54) | 13 (34) | 5 (15) |
| Not reported | 17 (3) | 4 (11) | 4 (12) |
| Disease, n (%) | — | — | — |
| Other disease | 124 (24) | 10 (26) | 3 (9) |
| None | 385 (76) | 18 (74) | 30 (91) |
| Nicotine consumption, n (%) | — | — | — |
| Smoking | 96 (19) | 4 (11) | 3 (9) |
| Vaping | 32 (6) | 2 (5) | 0 (0) |
| Alcohol consumption, n (%) | 323 (63) | 24 (63) | 18 (54) |
| BMI | 25.4 | 27.4 | 24.6 |
| Prescription medication or vitamins taken | 294 (57.3) | 27 (71) | 24 (73) |
Fig. 3 Raman spectroscopy of a representative droplet of human saliva supernatant. (a)–(c) Brightfield images for (a) whole droplet dried on aluminum slide; (b) crystalline region; and (c) edge, with acquisition points shown with different symbols: diamonds (on crystal), crosses (off crystal), and triangles (edge). (d) Average spectra from one saliva sample for on crystal (top), off crystal (middle), and edge (bottom) regions, respectively, with shaded areas representing the interspectral variance within the specimen. Each spectrum is an average of multiple acquisitions obtained with a 785-nm laser using a Renishaw InVia Raman microscope. The scale bar in (a) is 1 mm long, whereas the scale bars in (b) and (c) are 0.1 mm long.
Fig. 5 ML model discriminating between COVID-negative and COVID-positive saliva supernatant from males using (a)–(c) edge and (d)–(f) on crystal Raman spectra from dried droplets. (a) and (d) Upper frame shows SNV-normalized, background-corrected Raman spectra from all volunteers. Main features used in model building are designated by dotted lines. Mean spectra from COVID-negative volunteers (, at least eight spectra per volunteer) are shown in black and mean spectra from COVID-positive volunteers (, at least eight spectra per volunteer) are shown in red. Variance is not shown for clarity. Bottom frame shows the standardized Raman spectra, where each individual feature has zero mean and unit variance. (b) and (e) ROC curves for these models with sensitivity and specificity. (c) and (f) List of features used in model building and their assignments, as determined using compounds in model saliva and from the literature.
Fig. 6 ML model discriminating between COVID-negative and COVID-positive saliva supernatant from females using (a)–(c) edge and (d)–(f) on crystal Raman spectra from dried droplets. (a) and (d) Upper frame shows SNV-normalized, background-corrected Raman spectra from all volunteers. Main features used in model building are designated by dotted lines. Mean spectra from COVID-negative volunteers (, at least nine spectra per volunteer) are shown in black, and mean spectra from COVID-positive volunteers ( for edge, for on crystal, at least nine spectra per volunteer) are shown in red. Variance is not shown for clarity. Bottom frame shows the standardized Raman spectra, where each individual feature has zero mean and unit variance. (b) and (e) ROC curves for these models with sensitivity and specificity. (c) and (f) List of features used in model building and their assignments, as determined using compounds in model saliva and from the literature.
Fig. 7 ML model discriminating between COVID-negative and COVID-positive volunteer saliva supernatant using (a)–(c) edge and (d)–(f) on crystal Raman spectra from dried droplets. (a) and (d) Upper frame shows SNV-normalized, background-corrected Raman spectra from all volunteers. Main features used in model building are designated by dotted lines. Mean spectra from COVID-negative volunteers ( for both edge and on crystal, at least nine spectra per volunteer) are shown in black and mean spectra from COVID-positive volunteers ( for edge, 31 for on crystal, at least nine spectra per volunteer) are shown in red. Variance is not shown for clarity. Bottom frame shows the standardized Raman spectra, where each individual feature has zero mean and unit variance. (b) and (e) ROC curves for these models with sensitivity and specificity. (c) and (f) List of features used in model building and their assignments, as determined using compounds in model saliva and from the literature.
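The captions for Figs. 5 to 7 refer to two different scalings: standard normal variate (SNV) normalization, applied to each spectrum separately, and feature standardization, applied to each spectral feature across all spectra. The sketch below illustrates the difference on made-up intensity values. It assumes the population standard deviation (some chemometrics packages use the sample standard deviation instead), omits background correction, and uses hypothetical function names; it is not the authors' preprocessing code.

```python
from statistics import mean, pstdev


def snv(spectrum):
    """SNV: centre and scale ONE spectrum by its own mean and standard deviation."""
    mu, sd = mean(spectrum), pstdev(spectrum)
    return [(v - mu) / sd for v in spectrum]


def standardize_features(spectra):
    """Scale each feature (column) to zero mean and unit variance across spectra."""
    columns = []
    for col in zip(*spectra):
        mu, sd = mean(col), pstdev(col)  # a constant column (sd == 0) would need special handling
        columns.append([(v - mu) / sd for v in col])
    return [list(row) for row in zip(*columns)]


# Three made-up spectra (rows) sampled at four wavenumbers (columns).
raw = [
    [10.0, 12.0, 30.0, 11.0],
    [20.0, 25.0, 58.0, 22.0],
    [5.0, 6.0, 9.0, 7.0],
]

snv_spectra = [snv(s) for s in raw]                 # row-wise: removes intensity offset and scale
standardized = standardize_features(snv_spectra)    # column-wise: puts features on a common scale

for row in standardized:
    print([round(v, 2) for v in row])
```

SNV makes spectra comparable despite differences in overall signal level, whereas standardization keeps high-intensity peaks from dominating a distance-based or linear model. In a cross-validated workflow, the column means and standard deviations would be estimated on the training folds only.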