Martin Christner, Dirk Dressler, Mark Andrian, Claudia Reule, Orlando Petrini.
Abstract
The fast and reliable characterization of bacterial and fungal pathogens plays an important role in infectious disease control and tracking of outbreak agents. DNA-based methods are the gold standard for epidemiological investigations, but they are still comparatively expensive and time-consuming. Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) is a fast, reliable and cost-effective technique now routinely used to identify clinically relevant human pathogens. It has been used for subspecies differentiation and typing, but its use for epidemiological tasks, e.g. for outbreak investigations, is often hampered by the complexity of data analysis. We have analysed publicly available MALDI-TOF mass spectra from a large outbreak of Shiga-toxigenic Escherichia coli in northern Germany using a general-purpose software tool for the analysis of complex biological data. The software was challenged with depauperate spectra and reduced learning group sizes to mimic poor spectrum quality and scarcity of reference spectra at the onset of an outbreak. With high-quality formic acid extraction spectra, the software's built-in classifier accurately identified outbreak-related strains using as few as 10 reference spectra (99.8% sensitivity, 98.0% specificity). Selective variation of processing parameters showed impaired marker peak detection and reduced classification accuracy in samples with high background noise or artificially reduced peak counts. However, the software consistently identified mass signals suitable for a highly reliable marker-peak-based classification approach (100% sensitivity, 99.5% specificity) even from low-quality direct deposition spectra. The study demonstrates that general-purpose data analysis tools can effectively be used for the analysis of bacterial mass spectra.
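The marker-peak-based classification described in the abstract can be illustrated with a minimal Python sketch. The marker masses are taken from the peak labels in the accuracy table (p3356, p6711, p5442, p10883), read here as m/z values. The mass tolerance, the decision rule (at least one peak per marker protein) and all function names are assumptions made for illustration; they are not the procedure used in the study.

```python
# Hypothetical marker-peak classifier. The tolerance and the decision rule
# are illustrative assumptions, not the method used in the study.
MARKERS = {
    "MP1": (3356.0, 6711.0),   # peak labels p3356 / p6711, read as m/z
    "MP2": (5442.0, 10883.0),  # peak labels p5442 / p10883, read as m/z
}


def has_peak(peaks, target, tol_ppm=800.0):
    """Return True if any observed m/z lies within tol_ppm of target."""
    window = target * tol_ppm / 1e6
    return any(abs(mz - target) <= window for mz in peaks)


def is_outbreak_like(peaks, tol_ppm=800.0):
    """Flag a spectrum when every marker protein shows at least one peak."""
    return all(
        any(has_peak(peaks, mz, tol_ppm) for mz in masses)
        for masses in MARKERS.values()
    )
```

A peak list such as `[3356.5, 5441.0, 7000.0]` would be flagged under these assumptions, whereas a list lacking both MP2 masses would not. A stricter rule (requiring all four peaks) or a looser one (any single marker) changes the trade-off between sensitivity and specificity.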
Year: 2017 PMID: 28877205 PMCID: PMC5587271 DOI: 10.1371/journal.pone.0182962
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Classification accuracy and marker peak detection rates.
| Sample prep.¹ | SNR cut-off² | LG size³ | Peak data⁴ | Sens. [%]⁵ | Spec. [%]⁵ | Acc. [%]⁵ | MSP⁶ | MP1 p3356 [%]⁷ | MP1 p6711 [%]⁷ | MP2 p5442 [%]⁷ | MP2 p10883 [%]⁷ | MP1&2 [%]⁷ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| fae | 2 | 5 | QL | 99.2 | 98.4 | 98.6 | 39 | 70 | 80 | 50 | 100 | 100 |
| fae | 4 | 3 | QL | 99.0 | 93.9 | 95.6 | 42 | 90 | 50 | 60 | 80 | 80 |
| fae | 4 | 5 | QL | 99.8 | 98.0 | 98.6 | 30 | 90 | 100 | 100 | 100 | 100 |
| fae | 4 | 10 | QL | 100 | 98.1 | 98.8 | 24 | 100 | 100 | 100 | 100 | 100 |
| fae | 8 | 5 | QL | 96.5 | 98.1 | 97.6 | 36 | 100 | 100 | 100 | 100 | 100 |
| fae | 16 | 5 | QL | 98.7 | 99.6 | 99.3 | 30 | 70 | 100 | 100 | 100 | 100 |
| fae | 32 | 5 | QL | 91.0 | 89.5 | 89.9 | 29 | 0 | 100 | 30 | 50 | 50 |
| fae | 4 | 5 | QN | 84.9 | 95.6 | 91.7 | 54 | 10 | 10 | 30 | 10 | 10 |
| dsd | 4 | 3 | QL | 80.7 | 94.8 | 89.7 | 48 | 0 | 70 | 100 | 90 | 70 |
| dsd | 4 | 5 | QL | 88.3 | 98.4 | 94.8 | 44 | 0 | 100 | 100 | 100 | 100 |
| dsd | 4 | 10 | QL | 94.0 | 98.3 | 96.8 | 42 | 0 | 100 | 100 | 100 | 100 |
1 Method of sample preparation for MALDI-TOF MS measurement: formic acid extraction (FAE) or direct deposition (DSD).
2 Signal to noise ratio cut-off used for peak detection.
3 Size of learning groups for A.B.O.S. analysis.
4 Analysis of qualitative (QL; peak presence or absence) or quantitative (QN; peak intensity) peaklists.
5 Mean sensitivity, specificity and accuracy from 10 independent runs.
6 Total number of different peaks listed among the top 10 most important peaks in 10 independent A.B.O.S. runs.
7 Presence of peaks representing known outbreak strain marker proteins among the ten most significant peaks in 10 independent runs.
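Footnotes 2, 4 and 5 describe three small computations: filtering peaks by a signal-to-noise cut-off, encoding the result as a qualitative or quantitative peak list, and averaging sensitivity, specificity and accuracy over independent runs. The following Python sketch shows one way these steps could look. The data layout (peaks as `(m/z, intensity, snr)` tuples), the reference bin list, the matching tolerance and all function names are assumptions; this is not the A.B.O.S. implementation.

```python
from statistics import mean


def filter_by_snr(peaks, snr_cutoff):
    """Keep (mz, intensity, snr) peaks whose SNR is at least the cut-off.

    Treating the cut-off as inclusive is an assumption.
    """
    return [p for p in peaks if p[2] >= snr_cutoff]


def encode(peaks, bins, tol=2.0, qualitative=True):
    """Map peaks onto reference m/z bins (bins and tol are hypothetical).

    QL: 1/0 for peak presence/absence.
    QN: highest matching peak intensity, 0.0 if no peak matches.
    """
    row = []
    for b in bins:
        hits = [inten for mz, inten, _ in peaks if abs(mz - b) <= tol]
        if qualitative:
            row.append(1 if hits else 0)
        else:
            row.append(max(hits, default=0.0))
    return row


def performance(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy in percent from confusion counts."""
    sens = 100.0 * tp / (tp + fn)
    spec = 100.0 * tn / (tn + fp)
    acc = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    return sens, spec, acc


def mean_performance(runs):
    """Average each metric over independent runs given as (tp, fn, tn, fp)."""
    per_run = [performance(*r) for r in runs]
    return tuple(mean(col) for col in zip(*per_run))
```

Raising `snr_cutoff` removes weak peaks before encoding, which is one plausible reading of why marker detection drops at the highest cut-off in the table: a genuine but low-intensity marker peak can be filtered out along with the noise.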
Fig 1. NOREC marker peaks detected by A.B.O.S. analysis.