Chi-Sing Ho, Neal Jean, Catherine A Hogan, Lena Blackmon, Stefanie S Jeffrey, Mark Holodniy, Niaz Banaei, Amr A E Saleh, Stefano Ermon, Jennifer Dionne.
Abstract
Raman optical spectroscopy promises label-free bacterial detection, identification, and antibiotic susceptibility testing in a single step. However, achieving clinically relevant speeds and accuracies remains challenging due to weak Raman signal from bacterial cells and numerous bacterial species and phenotypes. Here we generate an extensive dataset of bacterial Raman spectra and apply deep learning approaches to accurately identify 30 common bacterial pathogens. Even on low signal-to-noise spectra, we achieve average isolate-level accuracies exceeding 82% and antibiotic treatment identification accuracies of 97.0±0.3%. We also show that this approach distinguishes between methicillin-resistant and -susceptible isolates of Staphylococcus aureus (MRSA and MSSA) with 89±0.1% accuracy. We validate our results on clinical isolates from 50 patients. Using just 10 bacterial spectra from each patient isolate, we achieve treatment identification accuracies of 99.7%. Our approach has potential for culture-free pathogen identification and antibiotic susceptibility testing, and could be readily extended for diagnostics on blood, urine, and sputum.
Year: 2019 PMID: 31666527 PMCID: PMC6960993 DOI: 10.1038/s41467-019-12898-9
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 A convolutional neural network (CNN) can be used to identify bacteria from Raman spectra. a To build a training dataset of Raman spectra, we deposit bacterial cells onto gold-coated silica substrates and collect spectra from 2000 bacteria over monolayer regions for each strain. An SEM cross section of the sample is shown (gold coated to allow for visualization of bacteria under electron beam illumination). Scale bar is 1 µm. b Conceptual measurement schematic: by focusing the excitation laser source to a diffraction-limited spot size, Raman signal from single cells can be acquired. c Using a one-dimensional residual network with 25 total convolutional layers (see Methods for details), low-signal Raman spectra are classified as one of 30 isolates, which are then grouped by empiric antibiotic treatment. d Raman spectra of bacterial species can be difficult to distinguish, and short integration times (1 s) lead to noisy spectra (SNR = 4.1). Averages of 2000 spectra from 30 isolates are shown in bold and overlaid on representative examples of noisy single spectra for each isolate. Spectra are color-grouped according to antibiotic treatment. These reference isolates represent over 94% of the most common infections seen at Stanford Hospital in the years 2016–17[39]
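The one-dimensional residual network named in panel c can be illustrated with a single residual block. This is a minimal sketch only: the function names, kernel sizes, and the exact ordering of activation and shortcut are assumptions for illustration and are not taken from the paper's Methods. It shows the defining idea, that the block's output is the input plus a learned correction.

```python
# Hypothetical sketch of one 1D residual block in plain Python.
# Kernel values, sizes, and layer ordering are illustrative assumptions.

def conv1d_same(signal, kernel):
    """1D cross-correlation with zero padding so the output keeps the input length."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(signal) + [0.0] * (k - 1 - pad)
    return [
        sum(padded[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal))
    ]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def residual_block(signal, kernel_a, kernel_b):
    """Two convolutions plus an identity shortcut: relu(x + conv(relu(conv(x))))."""
    hidden = relu(conv1d_same(signal, kernel_a))
    update = conv1d_same(hidden, kernel_b)
    return relu([x + u for x, u in zip(signal, update)])


if __name__ == "__main__":
    spectrum = [0.2, 1.0, 0.4, 0.1]  # toy intensities, not real Raman data
    print(residual_block(spectrum, [0.1, 0.5, 0.1], [0.0, 0.3, 0.0]))
```

With all-zero kernels the block reduces to the identity on non-negative inputs, which is the property that makes deep stacks of such blocks easier to train.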
Fig. 2 CNN performance breakdown by class. The trained CNN classifies 30 bacterial and yeast isolates with isolate-level accuracy of 82.2±0.3% and antibiotic grouping-level accuracy of 97.0±0.3% (± calculated as standard deviation across 5 train and validation splits). a Confusion matrix for 30 strain classes. Entry i, j represents the percentage out of 100 test spectra that are predicted by the CNN as class j given a ground truth of class i; entries along the diagonal represent the accuracies for each class. Misclassifications are mostly within antibiotic groupings, indicated by colored boxes, and thus do not affect the treatment outcome. Values below 0.5% are not shown, and matrix entries covered by figure insets are all below 0.5% aside from a 2% misclassification of MRSA 2 as P. aeruginosa 1 and 1% misclassification of Group B Strep. as K. aerogenes. b Predictions can be combined into antibiotic groupings to estimate treatment accuracy. TZP = piperacillin-tazobactam. All values below 0.5% are not shown
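The step from isolate-level to grouping-level accuracy in panel b can be sketched as follows. The isolate names, treatment names, and the mapping between them are hypothetical placeholders, not the paper's actual groupings. The point is that a misclassification between two isolates that share a treatment still counts as a correct treatment call.

```python
# Hypothetical sketch: isolate-level vs. treatment-level accuracy.
# Labels and the isolate-to-treatment mapping are illustrative placeholders.

def isolate_accuracy(true_labels, predicted_labels):
    """Fraction of spectra whose predicted isolate equals the true isolate."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


def treatment_accuracy(true_labels, predicted_labels, isolate_to_treatment):
    """Fraction of spectra whose predicted isolate maps to the correct treatment."""
    hits = sum(
        isolate_to_treatment[t] == isolate_to_treatment[p]
        for t, p in zip(true_labels, predicted_labels)
    )
    return hits / len(true_labels)


def confusion_percentages(true_labels, predicted_labels, classes):
    """Row-normalised confusion matrix: entry [i][j] is the percentage of
    class-i spectra predicted as class j."""
    index = {c: k for k, c in enumerate(classes)}
    counts = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        counts[index[t]][index[p]] += 1
    return [
        [100.0 * c / sum(row) if sum(row) else 0.0 for c in row]
        for row in counts
    ]


if __name__ == "__main__":
    mapping = {"isolate_a1": "drug_x", "isolate_a2": "drug_x", "isolate_b1": "drug_y"}
    truth = ["isolate_a1", "isolate_a1", "isolate_a2", "isolate_b1"]
    preds = ["isolate_a2", "isolate_a1", "isolate_a2", "isolate_b1"]
    print(isolate_accuracy(truth, preds))
    print(treatment_accuracy(truth, preds, mapping))
```

In this toy example one spectrum is assigned to the wrong isolate but to the right treatment group, so the treatment-level score is higher than the isolate-level score, mirroring the gap between the two accuracies reported in the caption.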
Fig. 3 Binary MRSA/MSSA classifier. a A binary classifier is used to distinguish between methicillin-resistant and -susceptible S. aureus (MRSA/MSSA), achieving 89.1±0.1% accuracy. b By varying the classification threshold, it is possible to trade off between sensitivity (true positive rate) and specificity (true negative rate). The ROC curve shows sensitivities and specificities significantly higher than random classification, with an AUC of 0.953
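The threshold trade-off in panel b can be made concrete with a short sketch. The scores and labels below are invented toy values, and treating label 1 as the positive (for example MRSA) class is an assumption. The AUC is computed through its rank interpretation: the probability that a randomly chosen positive receives a higher score than a randomly chosen negative, with ties counted as one half.

```python
# Hypothetical sketch: sensitivity/specificity at a threshold, and AUC.
# Scores and labels are toy values; label 1 is assumed to be the positive class.

def sensitivity_specificity(scores, labels, threshold):
    """Return (true positive rate, true negative rate) when a score at or
    above the threshold is called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    toy_scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]
    toy_labels = [1, 1, 1, 0, 0, 0]
    for threshold in (0.25, 0.5, 0.7):
        print(threshold, sensitivity_specificity(toy_scores, toy_labels, threshold))
    print(auc(toy_scores, toy_labels))
```

Lowering the threshold raises sensitivity at the cost of specificity, and raising it does the opposite; the ROC curve is the set of such pairs over all thresholds.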
Fig. 4 Extension to clinical patient isolates. A CNN pre-trained on our reference dataset can be extended to classify clinical patient isolates and further improved by fine-tuning on a small number of clinical spectra. a 5 species of bacterial infections are tested, with 5 patients per infection type. Each patient is classified into one of 8 treatment classes where each species corresponds to a different treatment class. After fine-tuning, species identification accuracy improves from 89.0±3.6% to 99.0±1.9% (± calculated as standard deviation across 10,000 sampling trials). b Binary classification between MRSA and MSSA patient isolates is also performed, with an accuracy of 61.7±7.3% that improves to 65.4±6.3% after fine-tuning. c Dependence of average diagnosis rates for the fine-tuned model on the number of spectra used per patient. With just 10 spectra, the performance of the model reaches 99%, within 1% of the performance with 400 spectra (100%). Error bars are calculated as the standard deviation across 10,000 trials of random selections of n spectra, where n is the number of spectra used per patient. d We perform an additional test on a new clinical dataset gathered from an additional 25 patients with the same distribution across species as the first clinical dataset. We update the model that is pre-trained on the reference dataset and fine-tuned on the first clinical dataset by fine-tuning on the second clinical dataset using the same procedure. e Detailed breakdown by class for the second clinical dataset. Correct pairings between species and treatment group are outlined in the colored boxes. The rate of accurate identification is 99.7±1.1%
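The resampling procedure behind panel c can be sketched as below. The caption states only that n spectra are drawn at random per patient over many trials; combining the n per-spectrum predictions by majority vote is an assumption made here for illustration, as are the function names and the toy predictions.

```python
# Hypothetical sketch: patient-level diagnosis from n randomly chosen spectra.
# Majority voting is an assumed aggregation rule; predictions are toy values.
import random
from collections import Counter


def majority_vote(predictions):
    """Most frequent predicted class; ties go to the class seen first."""
    return Counter(predictions).most_common(1)[0][0]


def diagnosis_rate(per_spectrum_predictions, true_label, n, trials, seed=0):
    """Fraction of trials in which a vote over n randomly sampled spectra
    (without replacement) matches the patient's true label."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = rng.sample(per_spectrum_predictions, n)
        if majority_vote(sample) == true_label:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Toy patient: 70% of single-spectrum predictions are correct.
    toy_predictions = ["species_a"] * 70 + ["species_b"] * 30
    for n in (1, 5, 10):
        print(n, diagnosis_rate(toy_predictions, "species_a", n, trials=1000))
```

Pooling several noisy single-spectrum predictions tends to give a more reliable patient-level call than any one spectrum, which is consistent with the caption's observation that a small number of spectra per patient already approaches the performance obtained with 400.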