Yu Shi, Guoping Wang, Harry Cheuk-Hay Lau, Jun Yu.
Abstract
Whole genome metagenomic sequencing is a powerful platform enabling the simultaneous identification of all genes from entirely different kingdoms of organisms in a complex sample. This technology has revolutionised multiple areas from microbiome research to clinical diagnoses. However, one of the major challenges of a metagenomic study is the overwhelming non-microbial DNA present in most of the host-derived specimens, which can inundate the microbial signals and reduce the sensitivity of microorganism detection. Various host DNA depletion methods to facilitate metagenomic sequencing have been developed and have received considerable attention in this context. In this review, we present an overview of current host DNA depletion approaches along with explanations of their underlying principles, advantages and disadvantages. We also discuss their applications in laboratory microbiome research and clinical diagnoses and, finally, we envisage the direction of the further perfection of metagenomic sequencing in samples with overabundant host DNA.
Keywords: clinical metagenomics; high throughput sequencing; host DNA depletion; human microbiome
Year: 2022 PMID: 35216302 PMCID: PMC8877284 DOI: 10.3390/ijms23042181
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Schematic illustration of untargeted metagenomic sequencing and targeted sequencing in human samples. A variety of human-derived samples can be analysed using untargeted metagenomic sequencing or targeted sequencing. (A) For samples with a low amount of host DNA, such as faecal samples, a high-resolution taxonomic profile can be obtained by performing untargeted metagenomic sequencing directly. Gray represents host DNA; red, yellow and purple represent various bacteria; and blue and green represent viruses and archaea, respectively. (B) For samples with overabundant human DNA, including nasal/oral/skin swabs, body fluids, blood and biopsy tissues, the vast majority of sequencing reads align to the human genome, which can obscure signals from microorganisms when using metagenomic sequencing. As a solution, removing host DNA before sequencing can improve the resolution of microbial DNA. These samples can also be analysed with targeted sequencing, which increases the number and proportion of reads of interest in the sequence data, although it limits the breadth of microorganisms that can be identified. mNGS, metagenomic next-generation sequencing.
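The effect described in panel (B) can be made concrete with simple arithmetic. The sketch below is illustrative only and is not taken from the review: the function names and the example host fractions are assumptions, and it treats every non-host read as microbial, which real data will not satisfy exactly.

```python
def total_reads_needed(target_microbial_reads: int, host_fraction: float) -> float:
    """Total reads required so that the expected number of non-host reads
    reaches the target, assuming a fixed fraction of reads is host-derived."""
    if not 0.0 <= host_fraction < 1.0:
        raise ValueError("host_fraction must be in [0, 1)")
    return target_microbial_reads / (1.0 - host_fraction)


def fold_enrichment(host_fraction_before: float, host_fraction_after: float) -> float:
    """Fold increase in the non-host read fraction after host DNA depletion."""
    if not (0.0 <= host_fraction_before < 1.0 and 0.0 <= host_fraction_after <= 1.0):
        raise ValueError("host fractions must lie in [0, 1), [0, 1] respectively")
    return (1.0 - host_fraction_after) / (1.0 - host_fraction_before)


if __name__ == "__main__":
    # Hypothetical sample in which 99% of reads are host-derived.
    print(total_reads_needed(1_000_000, 0.99))
    # Hypothetical depletion that lowers the host fraction from 99% to 90%.
    print(fold_enrichment(0.99, 0.90))
```

Under these assumed numbers, roughly one hundred times more sequencing is needed to recover the same number of microbial reads as in a host-free sample, and a depletion step that lowers the host fraction from 99% to 90% raises the microbial fraction about tenfold.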
Figure 2. Workflow of typical host DNA depletion approaches. Before metagenomic sequencing, human DNA can be removed by different approaches. (A) The mainstream pre-extraction method to remove host DNA is first treating human cells with a selective lysis buffer followed by DNA nuclease or PMA treatment. (B) The most commonly used post-extraction methods take advantage of the disparity of the cytosine methylation frequency between eukaryotic and prokaryotic DNA. MBD-Fc-bound magnetic beads can capture methylated human DNA sequences, leaving the unmethylated motifs for downstream library preparation. MBD, methyl-CpG binding domain; HTS, high throughput sequencing.
Commercial kits for microorganism enrichment.
| Kit | Principle | Pros | Cons | Hands-On Time Per Sample | Cost Per Sample (USD) | Ref. |
|---|---|---|---|---|---|---|
| QIAamp DNA Microbiome (Qiagen, Hilden, Germany) | Lysis of host cell by saponin, degrade extracellular DNA with Benzonase nuclease | Ultra-clean columns to minimise contamination risk | Requires fresh sample | 160 min | 13 | [ |
| MolYsis™ Complete/Ultra-Deep Microbiome Prep (Molzym, Bremen, Germany) | Chaotropic lysis of host cell, degrade extracellular DNA with MolDNase | Applicable for body fluids, tissue and swab samples. Enrichment of bacterial and fungal DNA | Fresh sample is recommended | 120 min | 11 | [ |
| HostZERO Microbial DNA Kit (Zymo Research, Irvine, CA, USA) | Lysis of host cell, degrade extracellular DNA with microbial selection enzyme | Protocols for both tissue and liquid samples are provided | Requires intact (living) bacteria cells | 30 min | 10 | [ |
| NEBNext Microbiome DNA Enrichment (New England BioLabs, Ipswich, MA, USA) | Capture methylated host DNA | Can retain cell-free DNA from dead organisms to avoid DNA loss | Requires high molecular weight intact DNA. Bias to high CpG-methylated microbes | 30 min * | 39 * | [ |
| LOOXSTER Enrichment Kit (Analytik Jena GmbH, Jena, Germany) | Capture non-methylated CpG dinucleotides | Can retain cell-free DNA from dead organisms to avoid DNA loss | Requires high molecular weight intact DNA. Bias to high CpG-methylated microbes | 75 min * | 34 * | [ |
*: DNA extraction step is excluded.
Case examples of host DNA depletion in clinical metagenomics in the last five years.
| Sample Type | Potential Clinical Indication | Sample Size | Depletion Method | Sequencing Platform | Reads Number | Ref. |
|---|---|---|---|---|---|---|
| Cerebrospinal fluid | Infectious aetiology identification | 13 | Selective lysis by a bead-beater tissue homogeniser followed by a Benzonase nuclease treatment | Ion Torrent PGM | N/A | [ |
| Prosthetic joint sonicate fluid | Pathogen identification | 408 | MolYsis basic kit | Illumina HiSeq | 2.8 million, mean | [ |
| Urine | Pathogen identification | 10 | Differential centrifugation and MolYsis kit | MinION | 0.026 million, median | [ |
| Urine | Antimicrobial resistance marker identification | 13 | NEBNext microbiome kit | Ion Torrent PGM | N/A | [ |
| Sputum | Pathogen detection | 6 | Microfluidic separation followed by DNase digestion | Illumina HiSeq | 36.3 million, mean | [ |
| Sputum, bronchoalveolar lavage and endotracheal aspirates | Diagnosis of known and unknown infections | 40 | Saponin-based differential lysis followed by HL-SAN DNase digestion | MinION | 0.041 million, mean | [ |
| Cerebrospinal fluid | Diagnosis of known and unknown infections | 95 | NEB Microbiome Enrichment Kit | Illumina HiSeq | 5~10 million | [ |
| Endotracheal aspirates | Pathogen identification | 22 | Saponin-based differential lysis followed by HL-SAN DNase digestion | MinION | 6628, median | [ |
| Synovial fluid | Pathogen detection | 168 | MolYsis basic kit | Illumina HiSeq | 30 million, mean | [ |
| Bone and joint infectious tissue | Pathogen detection and antibiotic susceptibility prediction | 24 | Ultra-Deep Microbiome Prep kit | Illumina HiSeq | 20 million, mean | [ |
| Valve tissue | Pathogen identification | 1 | Ultra-Deep Microbiome Prep kit | Illumina MiSeq | 1.4 million, mean | [ |
| Hepatic tissue | Diagnosis of unknown infections | 1 | Ultra-Deep Microbiome Prep kit | Illumina MiSeq | 1.1 million, mean | [ |
| Blood culture bottles inoculated with prosthetic joint tissue | Pathogen identification | 9 | MolYsis basic kit | Illumina MiSeq | 10.3 million, mean | [ |
| Blood | Pathogen detection | 8 | MolYsis complete kit and WGA | Illumina HiSeq | 27.5 million, mean | [ |
| Whole blood | Diagnosis of infection | 101 | MolYsis complete kit | Ion Torrent | N/A | [ |
| Sputum | | 40 | MolYsis basic kit | Illumina MiSeq and MinION | 3.6 million, mean | [ |
| Prosthetic joint sonication fluid | Diagnosis of prosthetic joint infections | 97 | A 5 μm pore size filter | Illumina MiSeq | N/A | [ |
| Urine | Pathogen detection and antimicrobial susceptibility prediction | 40 | NEB Microbiome Enrichment Kit | Ion Proton | N/A | [ |
N/A: data is not publicly available for analysis.
Figure 3. Illustration of strategies for removing unwanted high abundance DNA. (A) A sequencing library with Y-shape adapters is brought into contact with multiple protein-guide RNA (gRNA) complexes of the CRISPR/Cas9 system, in which the gRNAs are complementary to the targeted human sequences to allow cleavage. Cleaved host DNA is then degraded by exonuclease III, starting from the blunt ends generated by Cas9, leaving other sequences intact for subsequent amplification and sequencing [83]. (B) Using a nanopore device and computational approaches, individual double-strand DNA molecules can be selectively sequenced. As a DNA strand is sequenced, its current signal can be rapidly classified with or without base-calling. If the molecule maps to a pre-set reference genome, such as the human genome, the read is ejected from the pore in real time by reversing the voltage polarity; otherwise, sequencing continues. Figures created using BioRender (https://biorender.com, accessed on 3 February 2022).
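The per-read decision in panel (B) can be sketched as a small classification step. This is a toy illustration under stated assumptions, not the method used by any nanopore software: the exact k-mer lookup stands in for real-time mapping of signal or base-called sequence, and the names `build_kmer_index` and `decide` and the threshold `min_hit_fraction` are hypothetical.

```python
from typing import Set


def build_kmer_index(reference: str, k: int) -> Set[str]:
    """Collect every k-mer in the reference (toy stand-in for a real index)."""
    return {reference[i:i + k] for i in range(len(reference) - k + 1)}


def decide(read_prefix: str, host_index: Set[str], k: int,
           min_hit_fraction: float = 0.5) -> str:
    """Return 'eject' if the first part of a read looks host-derived,
    otherwise 'continue' so that sequencing of the molecule proceeds."""
    kmers = [read_prefix[i:i + k] for i in range(len(read_prefix) - k + 1)]
    if not kmers:
        # Too little sequence to classify yet: keep sequencing.
        return "continue"
    hits = sum(1 for kmer in kmers if kmer in host_index)
    return "eject" if hits / len(kmers) >= min_hit_fraction else "continue"


if __name__ == "__main__":
    # Hypothetical miniature "host genome" used only for illustration.
    host_reference = "ACGTACGTTAGCCGATAGCTAGGCTA"
    index = build_kmer_index(host_reference, k=5)
    print(decide(host_reference[3:15], index, k=5))  # prefix drawn from the host
    print(decide("TTTTTTTTTTTT", index, k=5))        # prefix unrelated to the host
```

The design point the sketch captures is that the decision is made on a short prefix of each molecule, so a host read costs only a fraction of its length in sequencing time before it is rejected.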