Michal Strejcek1, Tereza Smrhova1, Petra Junkova1, Ondrej Uhlik1.
Abstract
Many ecological experiments are based on the extraction and downstream analyses of microorganisms from different environmental samples. Due to its high throughput, cost-effectiveness and rapid performance, Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry with Time-of-Flight detector (MALDI-TOF MS), which has been proposed as a promising tool for bacterial identification and classification, could be advantageously used for dereplication of recurrent bacterial isolates. In this study, we compared whole-cell MALDI-TOF MS-based analyses of 49 bacterial cultures to two well-established bacterial identification and classification methods based on nearly complete 16S rRNA gene sequence analyses: a phylotype-based approach, using a closest type strain assignment, and a sequence similarity-based approach involving a 98.65% sequence similarity threshold, which has been found to best delineate bacterial species. Culture classification using reference-based MALDI-TOF MS was comparable to that yielded by phylotype assignment up to the genus level. At the species level, agreement between 16S rRNA gene analysis and MALDI-TOF MS was found to be limited, potentially indicating that spectral reference databases need to be improved. We also evaluated the mass spectral similarity technique for species-level delineation, which can be used independently of reference databases. We established optimal mass spectral similarity thresholds which group MALDI-TOF mass spectra of common environmental isolates analogously to phylotype- and sequence similarity-based approaches. When using a mass spectrum similarity approach, we recommend a mass range of 4-10 kDa for analysis, which is populated with stable mass signals and contains the majority of phylotype-determining peaks.
We show that a cosine similarity (CS) threshold of 0.79 differentiates mass spectra analogously to the 98.65% species-level delineation sequence similarity threshold, with corresponding precision and recall values of 0.70 and 0.73, respectively. When matched to species-level phylotype assignment, an optimal CS threshold of 0.92 was calculated, with associated precision and recall values of 0.83 and 0.64, respectively. Overall, our research indicates that a similarity-based MALDI-TOF MS approach can be routinely used for efficient dereplication of isolates for downstream analyses, with minimal loss of unique organisms. In addition, MALDI-TOF MS analysis has further improvement potential, unlike 16S rRNA gene analysis, whose methodological limits have reached a plateau.
Keywords: 16S rRNA gene; MALDI BioTyper; MALDI-TOF mass spectrometry (MS); bacterial identification; bacterial isolation; dereplication of isolates; species delineation
Year: 2018 PMID: 29971049 PMCID: PMC6018384 DOI: 10.3389/fmicb.2018.01294
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Collection of isolates used in this study.
| Rho1 | Compost soil | ||
| Rho2 | Strain collection | ||
| Rho3 | Compost soil | ||
| Art1 | Rhizosphere 1 | ||
| Art2 | Rhizosphere 1 | ||
| Art3 | Rhizosphere 1 | ||
| Glu1 | Rhizosphere 1 | ||
| Mic1 | Strain collection | ||
| Paa1 | Rhizosphere 1 | ||
| Psa1 | Strain collection | ||
| Psa3 | Rhizosphere 1 | ||
| Oer1 | Rhizosphere 1 | ||
| Bac1 | Compost soil | ||
| Bac2 | Compost soil | ||
| Bac3 | Compost soil | ||
| Bac4 | Compost soil | ||
| Bac5 | Compost soil | ||
| Bre1 | Compost soil | ||
| Bre2 | Compost soil | ||
| Bre3 | NA (< 1.7) | Compost soil | |
| Pab1 | Compost soil | ||
| Lys1 | Compost soil | ||
| Lys2 | Compost soil | ||
| Lys3 | Compost soil | ||
| Sol1 | Compost soil | ||
| Spo1 | NA (< 1.7) | Compost soil | |
| Bos1 | NA (< 1.7) | Rhizosphere 1 | |
| Met1 | Strain collection | ||
| Rhi1 | Strain collection | ||
| Ach1 | Strain collection | ||
| Pan1 | Strain collection | ||
| Par1 | Strain collection | ||
| Cup1 | Strain collection | ||
| Psm1 | Strain collection | ||
| Psm3 | Rhizosphere 1 | ||
| Psm4 | Sediment 1 | ||
| Psm6 | Contaminated soil | ||
| Psm10 | NA (< 1.7) | Strain collection |
Identification results based on 16S rRNA gene (EzBioCloud Identify Service) and MALDI BioTyper (v3.1 equipped with MBT 6903, covering 2,226 unique bacterial species) analyses. Entries in bold are known bacterial strains. Cultures in rectangles were grouped by 98.65% similarity of the 16S rRNA gene using the UPGMA clustering method. Entries in square brackets are updated bacterial nomenclature entries (see Materials and Methods section). Origin: compost soil: compost soil for gardening purposes, Central Bohemia, Czech Republic; rhizosphere 1: PCB-contaminated soil with horseradish vegetation, South Bohemia, Czech Republic (Uhlík et al., .
Figure 1. Analysis of 1 kDa mass intervals across all 585 mass spectra. Gray bars: number of detected mass signals per interval; blue bars: number of mass signals identified by shrinkage discriminant analysis as useful for species prediction based on assigning the closest type strain; green area: proportional mass signal intensity; red line: mean value of average cosine similarity between biological and technical replicates of individual cultures (4 × 3 = 12 spectra). All values are normalized to the maximum of the respective variable.
Figure 2. Histogram of calculated molecular masses of bacterial ribosomal proteins. Only 123 (0.01%) of 761,208 proteins had a molecular mass of less than 4 kDa. Molecular masses were calculated and downloaded from the UniProtKB protein database. The histogram uses 100 Da bins, and only proteins with a mass of less than 20 kDa are shown.
Figure 3. Pairwise relationship between the 16S rRNA gene sequence similarity of two cultures and the cosine similarity of their mass spectra. Horizontal error bars represent standard deviations of mass spectra cosine similarities calculated for all technical and biological replicates (n = 3 × 4 = 12). The colors of the data points represent the lowest taxonomic rank that is shared between a pair of microorganisms. The taxonomic classification and species assignment by the closest type strain were carried out using the EzBioCloud Identify Service. Vertical solid line: calculated optimal cosine similarity threshold based on 98.65% 16S rRNA gene similarity; vertical dashed line: calculated optimal cosine similarity threshold based on assigning the closest type strain.
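As an illustration of the cosine similarity plotted in Figure 3, the following minimal sketch compares two peak lists restricted to the recommended 4-10 kDa range. The 10 Da bin width, the function names and the toy peak lists are assumptions made for this example only; the study's actual preprocessing and peak alignment are not described in this record.

```python
import math

def bin_spectrum(peaks, lo=4000.0, hi=10000.0, width=10.0):
    """Sum peak intensities into fixed-width bins covering [lo, hi) Da.

    The 10 Da bin width is an illustrative assumption, not a value
    taken from the study.
    """
    n_bins = int(round((hi - lo) / width))
    vec = [0.0] * n_bins
    for mass, intensity in peaks:
        if lo <= mass < hi:  # drop signals outside the 4-10 kDa window
            idx = min(int((mass - lo) // width), n_bins - 1)
            vec[idx] += intensity
    return vec

def cosine_similarity(a, b):
    """Cosine of the angle between two equally long intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical (mass in Da, intensity) peak lists, invented for the example.
spec_a = [(3500.0, 500.0), (4365.2, 100.0), (5096.7, 80.0), (6255.1, 40.0)]
spec_b = [(4365.9, 90.0), (5096.1, 70.0), (7274.0, 30.0), (12000.0, 300.0)]

cs = cosine_similarity(bin_spectrum(spec_a), bin_spectrum(spec_b))
print(f"cosine similarity: {cs:.3f}")
```

Signals outside the window (here the peaks at 3.5 kDa and 12 kDa) do not contribute, so two spectra are compared only on the mass range that the abstract describes as stable and rich in phylotype-determining peaks.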
Figure 4. Precision, recall and F1 score curves for species classification by mass spectra cosine similarity (x-axis) as compared to the two commonly used 16S rRNA gene species demarcation analyses. Dashed lines: species separation by 98.65% 16S rRNA gene similarity, with an optimal analogous cosine similarity threshold of 0.79; solid lines: species assignment by the closest type strain (EzBioCloud Identify Service), with an optimal analogous cosine similarity threshold of 0.92.
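Curves like those in Figure 4 can be sketched by treating every pair of cultures as a binary decision: the pair is predicted to be the same species when its cosine similarity reaches the threshold, and the 16S rRNA gene result serves as the reference. The sketch below assumes the optimum is the threshold with the highest F1 score, which this record does not state explicitly; the function names and the toy pairs are hypothetical.

```python
def precision_recall_f1(pairs, threshold):
    """Score pairwise same-species calls made at a cosine similarity threshold.

    pairs: iterable of (cosine_similarity, same_species_by_16s) tuples.
    """
    tp = fp = fn = 0
    for cs, same in pairs:
        predicted_same = cs >= threshold
        if predicted_same and same:
            tp += 1
        elif predicted_same:
            fp += 1
        elif same:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

def best_threshold(pairs, candidates):
    """Return the candidate with the highest F1 (assumed selection criterion)."""
    return max(candidates, key=lambda t: precision_recall_f1(pairs, t)[2])

# Toy data invented for the example; not taken from the study.
toy_pairs = [(0.95, True), (0.90, True), (0.85, False),
             (0.80, True), (0.60, False), (0.30, False)]

for t in (0.50, 0.70, 0.90):
    p, r, f = precision_recall_f1(toy_pairs, t)
    print(f"threshold {t:.2f}: precision {p:.2f}, recall {r:.2f}, F1 {f:.2f}")
print("best:", best_threshold(toy_pairs, [0.50, 0.70, 0.90]))
```

For orientation, applying F1 = 2PR / (P + R) to the rounded precision and recall values reported in the abstract gives roughly 0.71 for the 0.79 threshold (0.70, 0.73) and roughly 0.72 for the 0.92 threshold (0.83, 0.64).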
Comparison of MALDI-TOF MS and 16S rRNA gene analysis methods for dereplication of recurrent bacterial isolates.
| 0.79 | 98.65% similarity | 46/37 | 23%/19% | 8 (17%) | 35 | 6 | 8 |
| 0.92 | Closest type strain | 76/43 | 39%/22% | 32 (42%) | 39 | 5 | 5 |
The bacterial sample set comprises 4 biological replicates of each of the 49 cultures (196 samples in total).
Number of cultures that were (a) separated in the same way by both MALDI-TOF MS and 16S rRNA gene analysis, (b) separated into more clusters by 16S rRNA gene analysis and (c) separated into more clusters by MALDI-TOF MS.
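To make the dereplication comparison concrete, the sketch below groups isolates whose pairwise cosine similarity reaches a chosen threshold. It uses a simple single-linkage grouping (union-find) as a stand-in; this is an assumption for illustration and not necessarily the clustering procedure used in the study. The isolate names and similarity values are invented.

```python
def dereplicate(names, similarity, threshold):
    """Group isolates linked by a pairwise similarity >= threshold.

    Single-linkage grouping via union-find; an illustrative stand-in,
    not the study's confirmed clustering method.
    """
    parent = {n: n for n in names}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if similarity(a, b) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return list(groups.values())

# Hypothetical isolates and cosine similarities, invented for the example.
names = ["iso1", "iso2", "iso3", "iso4"]
toy_cs = {
    ("iso1", "iso2"): 0.95, ("iso1", "iso3"): 0.40, ("iso1", "iso4"): 0.35,
    ("iso2", "iso3"): 0.45, ("iso2", "iso4"): 0.30, ("iso3", "iso4"): 0.85,
}

def sim(a, b):
    return toy_cs[(a, b)]

for threshold in (0.79, 0.92):
    print(threshold, dereplicate(names, sim, threshold))
```

In this toy case the stricter threshold splits one of the two groups, which mirrors the general trade-off in the table above: a higher cosine similarity threshold keeps more isolates as separate representatives, while a lower one dereplicates more aggressively.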