Natacha Couto, Leonard Schuele, Erwin C Raangs, Miguel P Machado, Catarina I Mendes, Tiago F Jesus, Monika Chlebowicz, Sigrid Rosema, Mário Ramirez, João A Carriço, Ingo B Autenrieth, Alex W Friedrich, Silke Peter, John W Rossen.
Abstract
High-throughput sequencing has been proposed as a one-stop solution for diagnostics and molecular typing directly from patient samples, allowing timely and appropriate implementation of measures for treatment, infection prevention and control. However, it is unclear how the variety of available methods impacts the end results. We applied shotgun metagenomics to diverse types of patient samples using three different methods to deplete human DNA prior to DNA extraction. Libraries were prepared and sequenced with Illumina chemistry. Data were analyzed using methods likely to be available in clinical microbiology laboratories that use genomics. The results of microbial identification were compared with standard culture-based microbiological methods. On average, 75% of the reads corresponded to human DNA, which was a major determinant of the analysis outcome. None of the kits was clearly superior, suggesting that the initial ratio between host and microbial DNA or other sample characteristics were the major determinants of the proportion of microbial reads. Most pathogens identified by culture were also identified through metagenomics, but substantial differences were noted between the taxonomic classification tools. In two cases, the high number of human reads resulted in insufficient sequencing depth of bacterial DNA for identification. In three samples, we could infer the probable multilocus sequence type of the most abundant species. The tools and databases used for taxonomic classification and antimicrobial resistance identification had a key impact on the results, indicating that efforts should be aimed at standardizing the analysis methods if metagenomics is to be used routinely in clinical microbiology.
Year: 2018 PMID: 30213965 PMCID: PMC6137123 DOI: 10.1038/s41598-018-31873-w
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Characteristics of the samples and mapping of trimmed reads against a human genome hg19 (%) using CLC Genomics Workbench v10.0.1.
| | Sample 1 | Sample 2 | Sample 3 | Sample 4 | Sample 5 | Sample 6 | Sample 7 | Sample 8 | Sample 9 | Sample 10 | Negative control |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Sample type | Peritoneal fluid | Pus (abscess) | Synovial fluid | Synovial fluid | Pus (abscess) | Pus (empyema) | Pus (empyema) | Bone biopsy | Pus (abscess) | Sputum | Water |
| DNA extraction method | Ultra-Deep Microbiome Prep (Molzym) | Ultra-Deep Microbiome Prep (Molzym) | Ultra-Deep Microbiome Prep (Molzym) | Ultra-Deep Microbiome Prep (Molzym) | Ultra-Deep Microbiome Prep (Molzym) | QIAamp DNA Microbiome Kit (Qiagen) | QIAamp DNA Microbiome Kit (Qiagen) | Micro-DX™ (Molzym) | Micro-DX™ (Molzym) | Micro-DX™ (Molzym) | QIAamp DNA Microbiome Kit (Qiagen) |
| Total number of reads | 5,892,978 | 9,603,346 | 8,615,810 | 6,078,166 | 8,368,930 | 2,912,802 | 1,486,700 | 6,534,866 | 6,173,132 | 7,596,836 | 1,730,738 |
| Mapped reads against hg19 | 5,249,063 (89.2%) | 7,828,746 (81.6%) | 8,254,594 (95.9%) | 6,015,945 (99.0%) | 309,588 (3.7%) | 2,877,066 (98.8%) | 922,932 (62.2%) | 229,149 (3.5%) | 6,081,612 (98.5%) | 7,337,832 (96.7%) | 1,706,861 (98.9%) |
| Unmapped reads | 632,951 (10.8%) | 1,770,558 (18.4%) | 355,200 (4.1%) | 61,099 (1.0%) | 8,052,272 (96.3%) | 34,506 (1.1%) | 561,772 (37.8%) | 6,303,803 (96.5%) | 89,922 (1.5%) | 235,520 (3.3%) | 19,805 (1.2%) |
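
The percentages in the table above can be recomputed from the read counts. The following is a minimal sketch, not the authors' workflow: it assumes the denominator is the sum of mapped and unmapped reads (rather than the "Total number of reads" row), because that choice reproduces the listed percentages for the four samples used here. The function name `host_fraction` is illustrative only.

```python
# Read counts copied from the table above: (mapped to hg19, unmapped).
read_counts = {
    "Sample 1": (5_249_063, 632_951),
    "Sample 5": (309_588, 8_052_272),
    "Sample 7": (922_932, 561_772),
    "Sample 8": (229_149, 6_303_803),
}


def host_fraction(mapped: int, unmapped: int) -> float:
    """Percentage of reads that mapped to the human reference.

    Assumption: the denominator is mapped + unmapped reads.
    """
    total = mapped + unmapped
    if total == 0:
        raise ValueError("no reads to evaluate")
    return 100.0 * mapped / total


# Host percentage per sample, rounded to one decimal as in the table.
host_percent = {
    sample: round(host_fraction(mapped, unmapped), 1)
    for sample, (mapped, unmapped) in read_counts.items()
}

for sample, pct in host_percent.items():
    print(f"{sample}: {pct:.1f}% human, {100 - pct:.1f}% non-human")
```

The spread between samples (from under 4% to over 89% human reads in this subset) shows how strongly the host fraction limits the reads left for microbial identification.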
Microorganisms identified by conventional methods, WGS and using shotgun metagenomics and the taxonomic classification methods in Unix.
| Sample number | Culture result (CFU)<sup>a</sup> | Conventional identification (MALDI-TOF) | WGS-based identification | Shotgun metagenomics: Kraken<sup>b</sup> | Shotgun metagenomics: MIDAS<sup>c</sup> | Shotgun metagenomics: MetaPhlAn<sup>c</sup> |
|---|---|---|---|---|---|---|
| 1 | 10<sup>3</sup> | | | | | |
| 2 | 10<sup>3</sup> | | —<sup>#</sup> | Not identified* | Not identified* | Not identified* |
| 3 | 1 | | —<sup>#</sup> | Not identified* | Not identified* | |
| 4 | 10<sup>3</sup> | | | | | |
| 5 | ≥10<sup>5</sup> | | | | | |
| 6 | 10<sup>3</sup> | | | Not identified* | Not identified* | |
| 7 | 10<sup>2</sup> | | —<sup>#</sup> | | | |
| 8 | 10<sup>3</sup> | | | | | |
| 9 | 10<sup>3</sup> | | | | | |
| 10 | 10<sup>3</sup> | | —<sup>#</sup> | | | |

<sup>a</sup>The number of colonies of a given species was estimated from the number of colonies with the same morphology on the same plate; <sup>b</sup>The relative abundance is calculated using total number of reads as denominator; <sup>c</sup>The relative abundance is calculated with the total number of classified reads as denominator; <sup>d</sup>miniKraken database was used; <sup>#</sup>Although there was a laboratory identification, no isolates were available for WGS; *No reads matched that specific pathogen, not even at the genus level.
Microorganisms identified by conventional methods, WGS and using shotgun metagenomics and the taxonomic classification methods in CLC Genomics Workbench.
| Sample number | Culture result (CFU)<sup>a</sup> | Conventional identification (MALDI-TOF) | WGS-based identification | Shotgun metagenomics: Taxonomic Profiling (CLC)<sup>b</sup> | Shotgun metagenomics: Best match with K-mer spectra (CLC)<sup>c</sup> |
|---|---|---|---|---|---|
| 1 | 10<sup>3</sup> | | | | |
| 2 | 10<sup>3</sup> | | —<sup>#</sup> | Not identified* | Not identified* |
| 3 | 1 | | —<sup>#</sup> | Not identified* | |
| 4 | 10<sup>3</sup> | | | Not identified* | |
| 5 | ≥10<sup>5</sup> | | | | |
| 6 | 10<sup>3</sup> | | | Not identified* | |
| 7 | 10<sup>2</sup> | | —<sup>#</sup> | | |
| 8 | 10<sup>3</sup> | | | | |
| 9 | 10<sup>3</sup> | | | | |
| 10 | 10<sup>3</sup> | | —<sup>#</sup> | | |

<sup>a</sup>The number of colonies of a given species was estimated from the number of colonies with the same morphology on the same plate; <sup>b</sup>The relative abundance is calculated with the total number of classified reads as denominator; <sup>c</sup>Based on the Output Quality Report; <sup>#</sup>Although there was a laboratory identification, no isolates were available for WGS; *No reads matched that specific pathogen, not even at the genus level.
Microorganisms identified by conventional methods, WGS and using shotgun metagenomics and the taxonomic classification methods in webpages (BaseSpace, Taxonomer and CosmosID).
| Sample number | Culture result (CFU)<sup>a</sup> | Conventional identification (MALDI-TOF) | WGS-based identification | Shotgun metagenomics: Genius (BaseSpace)<sup>c</sup> | Shotgun metagenomics: Kraken (BaseSpace)<sup>c,d</sup> | Shotgun metagenomics: MetaPhlAn (BaseSpace)<sup>c</sup> | Shotgun metagenomics: Taxonomer (Utah)<sup>b,e</sup> | Shotgun metagenomics: CosmosID<sup>a</sup> |
|---|---|---|---|---|---|---|---|---|
| 1 | 10<sup>3</sup> | | | | | | | |
| 2 | 10<sup>3</sup> | | —<sup>#</sup> | Not identified* | Not identified* | Not identified* | Not identified* | Not identified* |
| 3 | 1 | | —<sup>#</sup> | Not identified* | Not identified* | | | |
| 4 | 10<sup>3</sup> | | | | | | | |
| 5 | ≥10<sup>5</sup> | | | | | | | |
| 6 | 10<sup>3</sup> | | | | | | | |
| 7 | 10<sup>2</sup> | | —<sup>#</sup> | | | | | |
| 8 | 10<sup>3</sup> | | | | | | | |
| 9 | 10<sup>3</sup> | | | | | | | |
| 10 | 10<sup>3</sup> | | —<sup>#</sup> | | | | | |

<sup>a</sup>The number of colonies of a given species was estimated from the number of colonies with the same morphology on the same plate; <sup>b</sup>The relative abundance is calculated using total number of reads as denominator; <sup>c</sup>The relative abundance is calculated with the total number of classified reads as denominator; <sup>d</sup>miniKraken database was used; <sup>e</sup>Full Analysis mode was used; <sup>#</sup>Although there was a laboratory identification, no isolates were available for WGS; *No reads matched that specific pathogen, not even at the genus level.
Figure 1. Scheme of the bioinformatic analysis of the metagenomics samples.
Performance of the different taxonomic classification methods for each sample. Sensitivity and positive predictive value were calculated using culture/MALDI-TOF as standards.
| Method | Total number of bacteria identified<sup>a</sup> | True positives<sup>a</sup> | False positives | False negatives | Sensitivity (%) | PPV (%) |
|---|---|---|---|---|---|---|
| Culture/MALDI-TOF | 9 | 9 | 0 | 0 | 100% | 100% |
| MetaPhlAn (BaseSpace) | 16 | 7 | 9 | 2 | 78% | 44% |
| Genius (BaseSpace) | 35 | 8 | 27 | 1 | 89% | 23% |
| Kraken (BaseSpace) | 959 | 7 | 952 | 2 | 78% | 1% |
| Taxonomer (Full Analysis) | 4649 | 8 | 4641 | 1 | 89% | 0% |
| CosmosID | 35 | 8 | 27 | 1 | 89% | 23% |
| Taxonomic Profiling (CLC Genomics Workbench v10.0.1) | 17 | 6 | 11 | 3 | 67% | 35% |
| Best match K-mer spectra (CLC Genomics Workbench v10.0.1) | 12 | 8 | 4 | 1 | 89% | 67% |
| Kraken (Unix) | 198 | 7 | 191 | 2 | 78% | 4% |
| MetaPhlAn2 (Unix) | 15 | 7 | 6 | 4 | 75% | 75% |
| MIDAS (Unix) | 34 | 7 | 26 | 2 | 88% | 50% |
<sup>a</sup>Excluding the samples with non-identified anaerobic bacteria (Samples 2 and 5).
Abbreviations: PPV, positive predictive value.
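
Sensitivity and PPV follow from the true-positive, false-positive and false-negative counts. The sketch below applies the standard definitions (sensitivity = TP / (TP + FN), PPV = TP / (TP + FP)) to three rows copied from the table above; the function names are illustrative and this is not the authors' script.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Percentage of culture-confirmed organisms that the tool also reported."""
    return 100.0 * tp / (tp + fn)


def ppv(tp: int, fp: int) -> float:
    """Percentage of reported organisms that were confirmed by culture."""
    return 100.0 * tp / (tp + fp)


# (true positives, false positives, false negatives) copied from the table.
counts = {
    "Genius (BaseSpace)": (8, 27, 1),
    "Kraken (BaseSpace)": (7, 952, 2),
    "Best match K-mer spectra (CLC)": (8, 4, 1),
}

# Rounded to whole percentages, as in the table.
results = {
    method: (round(sensitivity(tp, fn)), round(ppv(tp, fp)))
    for method, (tp, fp, fn) in counts.items()
}

for method, (sens, pred) in results.items():
    print(f"{method}: sensitivity {sens}%, PPV {pred}%")
```

A tool can reach high sensitivity while reporting hundreds of additional taxa, which is why the two metrics are read together.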
Antimicrobial resistance phenotypes and antimicrobial resistance genes detected using different approaches.
| Sample number | Conventional identification (MALDI-TOF) | Conventional susceptibility testing (VITEK 2)<sup>b</sup> | WGS: CLC Genomics Workbench | Shotgun metagenomics: ReMatCh (Unix) | Shotgun metagenomics: CLC Genomics Workbench<sup>a</sup> |
|---|---|---|---|---|---|
| 1 | | LEV, ERY, CLI | | | |
| 2 | | DOX, CLI | —<sup>#</sup> | Not detected | Not detected |
| 3 | | OXA, GEN, TEC, FUS, CIP, ERY, CLI | —<sup>#</sup> | Not detected | Not detected |
| 4 | | PEN, ERY | | Not detected | Not detected |
| 5 | | susceptible | —<sup>#</sup> | — | — |
| 6 | | PEN, AMX, CFX, IMP, GENhl, STRhl, LEV, ERY, CLI, AMP/SUL | Not detected | Not detected | |
| 7 | | PEN | | | |
| 8 | | AMX, PIP/TAZ, CFX, CFT, CTZ, IMP, FOX, TOB, FOS, NIT, TMP | | | |
| 9 | | PEN | —<sup>#</sup> | | |
| 10 | | AMX, AMC, CFX, FOX, NIT, POL | —<sup>#</sup> | | |

<sup>a</sup>The analysis aborted when the script tried to connect to NCBI.
<sup>b</sup>Only non-susceptibility is indicated. Abbreviations: AMP/SUL, ampicillin/sulbactam; AMX, amoxicillin; AMC, amoxicillin/clavulanate; CFX, cefuroxime; FOS, fosfomycin; FOX, cefoxitin; CIP, ciprofloxacin; CLI, clindamycin; DOX, doxycycline; ERY, erythromycin; FUS, fusidic acid; GEN, gentamicin; GENhl, gentamicin high-level; LEV, levofloxacin; NIT, nitrofurantoin; PEN, penicillin; POL, polymyxin B; STRhl, streptomycin high-level; TEC, teicoplanin.
Results of MLST by whole genome sequencing and shotgun metagenomics.
| Sample number | Conventional identification (MALDI-TOF) | WGS: CLC Genomics Workbench v10.1.1 | Shotgun metagenomics: CLC Genomics Workbench v10.1.1 | Shotgun metagenomics: metaMLST (Unix-based) |
|---|---|---|---|---|
| 1 | | ST117 | Not detected (6 alleles identified correctly) | ST117 |
| 2 | | —<sup>#</sup> | — | — |
| 3 | | —<sup>#</sup> | Not detected | Not detected |
| 4 | | ST30 | Not detected | Not detected |
| 5 | | ST141 | ST141 | ST4508 |
| 6 | | ST117 | Not detected | Not detected |
| 7 | | ST30 | ST30 | ST667 |
| 8 | | — | — | — |
| 9 | | —<sup>#</sup> | Not detected | Not detected |
| 10 | | —<sup>#</sup> | — | — |
Abbreviations: ST, sequence type.
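
A sequence type is assigned only when the complete allelic profile matches an entry in the typing scheme, which is one way to read a result such as "Not detected (6 alleles identified correctly)". The sketch below illustrates that lookup under loud assumptions: the locus names, allele numbers and ST labels are all hypothetical placeholders, and the code does not reproduce CLC Genomics Workbench or metaMLST.

```python
from typing import Dict, Optional

# Hypothetical seven-locus scheme; real schemes use organism-specific genes.
LOCI = ("locus1", "locus2", "locus3", "locus4", "locus5", "locus6", "locus7")

# Hypothetical profile table: allele numbers per locus -> sequence type label.
PROFILES = {
    (1, 1, 1, 1, 1, 1, 1): "ST-A",
    (1, 1, 1, 1, 1, 1, 2): "ST-B",  # differs from ST-A at a single locus
}


def assign_st(alleles: Dict[str, Optional[int]]) -> Optional[str]:
    """Return the ST label, or None if a locus is missing or no profile matches."""
    if any(alleles.get(locus) is None for locus in LOCI):
        return None  # incomplete profile: no ST can be called
    return PROFILES.get(tuple(alleles[locus] for locus in LOCI))


complete = {locus: 1 for locus in LOCI}
six_of_seven = {**complete, "locus7": None}   # one locus not recovered
one_mismatch = {**complete, "locus7": 2}      # one allele called differently

print(assign_st(complete))      # ST-A
print(assign_st(six_of_seven))  # None
print(assign_st(one_mismatch))  # ST-B
```

The third case shows that a single differing allele call is enough for two tools to report different sequence types from the same sample.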
Figure 2. Minimum-spanning tree based on wgMLST allelic profiles of two S. aureus genomes and two E. coli genomes obtained through SM and WGS in comparison to reference strains 04-02981 (GenBank accession number NC_017340) and 06-00048 (NZ_CP015229), respectively. Each circle represents an allelic profile based on sequence analysis. The numbers on the connecting lines illustrate the numbers of target genes with differing alleles.
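
The edge labels in such a tree are counts of target genes whose alleles differ between two profiles. A minimal sketch of that count follows; the gene names and allele numbers are hypothetical, and skipping genes that are missing in either profile is an assumption made here, not a documented setting of the software used in the study.

```python
from typing import Dict, Optional

Profile = Dict[str, Optional[int]]


def allelic_distance(a: Profile, b: Profile) -> int:
    """Number of shared target genes whose allele numbers differ.

    Assumption: genes absent or uncalled (None) in either profile are skipped.
    """
    distance = 0
    for gene, allele_a in a.items():
        allele_b = b.get(gene)
        if allele_a is None or allele_b is None:
            continue  # gene not comparable between the two profiles
        if allele_a != allele_b:
            distance += 1
    return distance


# Hypothetical profiles for the same organism from two sequencing approaches.
sm_profile = {"geneA": 1, "geneB": 4, "geneC": 2, "geneD": None}
wgs_profile = {"geneA": 1, "geneB": 5, "geneC": 3, "geneD": 7}

print(allelic_distance(sm_profile, wgs_profile))  # 2
```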
Figure 3. (a) Overview of the nodes (representing plasmid sequences) and links between plasmids (connecting similar plasmids) found in Sample 1 (SMg) using the pATLAS tool. (b) A closer look at one of the plasmid clouds. The color gradient in each cloud of plasmids represents the plasmid sequence coverage (SC), varying between 0–0.79 (grey) and 0.80–1 (red gradient).
Figure 4. A heatmap comparing the plasmids identified using bowtie2 in S. haemolyticus WGS (1), E. faecium WGS (2) and in the SMg dataset (3) isolated from sample 1.