Simon Roux, Gareth Trubl, Danielle Goudeau, Nandita Nath, Estelle Couradeau, Nathan A Ahlgren, Yuanchao Zhan, David Marsan, Feng Chen, Jed A Fuhrman, Trent R Northen, Matthew B Sullivan, Virginia I Rich, Rex R Malmstrom, Emiley A Eloe-Fadrosh.
Abstract
BACKGROUND: Metagenomics has transformed our understanding of microbial diversity across ecosystems, with recent advances enabling de novo assembly of genomes from metagenomes. These metagenome-assembled genomes are critical to provide ecological, evolutionary, and metabolic context for all the microbes and viruses yet to be cultivated. Metagenomes can now be generated from nanogram to subnanogram amounts of DNA. However, these libraries require several rounds of PCR amplification before sequencing, and recent data suggest these typically yield smaller and more fragmented assemblies than regular metagenomes.
Keywords: Genome assembly; Metagenomics; Microbial ecology; Viral metagenomics
Year: 2019 PMID: 31119088 PMCID: PMC6511391 DOI: 10.7717/peerj.6902
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Figure 1: Coverage bias within individual contigs for unamplified and PCR-amplified libraries.
(A) Example of coverage bias along a single contig from sample 1064195 (contig 1064195_contig_573). Reads from libraries ASXXB, BWNCO, and BWWYG (Table S2) were mapped to the same contig, and read depth along sliding windows of 100 bp is displayed for each library on the y-axis. Windows on the edges of the contig (within 200 bp of the 5′ or 3′ end) were excluded because read depth is less reliable in these end regions. (B) Illustration of the insert size bias associated with high-depth regions in PCR-amplified libraries. For each library, the number of PCR cycles performed is indicated on the x-axis, while the Kolmogorov–Smirnov distance between the insert size distributions of low- versus high-depth regions is indicated on the y-axis. The magnitude of the difference between the means of the two distributions was also estimated using Cohen’s effect size (d) and is indicated by the dot color. For clarity, only libraries for which the mean insert size was lower in high-depth regions are included in the plot; the 22 libraries that showed the opposite trend are not plotted (Table S1). KS: Kolmogorov–Smirnov.
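The quantities named in this caption (windowed read depth, Kolmogorov–Smirnov distance, and Cohen's d) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names, the window step, and the use of the pooled standard deviation for Cohen's d are assumptions made for illustration, and the caption does not say how low- and high-depth regions were delimited, so the sketch simply takes two lists of insert sizes as input.

```python
from bisect import bisect_right
from math import sqrt
from statistics import mean, variance


def windowed_depth(depth, window=100, step=100, edge=200):
    """Mean read depth per window, skipping `edge` bases at each contig end.

    `depth` is a per-base depth list. The 100 bp window and 200 bp edge come
    from the caption; the step size is an assumption.
    """
    inner_end = len(depth) - edge
    return [
        mean(depth[start:start + window])
        for start in range(edge, inner_end - window + 1, step)
    ]


def ks_distance(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    # The maximum gap between two step functions occurs at an observed value.
    return max(
        abs(bisect_right(sa, x) / len(sa) - bisect_right(sb, x) / len(sb))
        for x in sa + sb
    )


def cohens_d(a, b):
    """Cohen's d with the pooled sample standard deviation (an assumption)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


# Toy insert sizes (bp), invented for illustration only.
high_depth_inserts = [1, 2, 3, 4]
low_depth_inserts = [3, 4, 5, 6]
ks = ks_distance(high_depth_inserts, low_depth_inserts)
d = cohens_d(high_depth_inserts, low_depth_inserts)
```

With real data, the two lists would hold insert sizes of read pairs mapped to high- and low-depth windows of one library. A negative d then corresponds to the plotted case, where the mean insert size is lower in high-depth regions. If SciPy is available, `scipy.stats.ks_2samp` reports the same statistic along with a p-value.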
Figure 2: Optimized pipeline for assembly of PCR-amplified metagenomes.
(A) Distribution of the cumulative size of long (≥10 kb) contigs (y-axis) obtained across all PCR-amplified libraries from different assembly pipelines (x-axis; see Table S3). (B) Cumulative size of long (≥10 kb) contigs obtained with a standard (green) or optimized (purple) assembly pipeline for different ranges of library PCR amplification (x-axis). Coloring of the assembly pipelines is the same as in panel A. (C) Estimated error rate (y-axis) from different assembly pipelines (x-axis) across all PCR-amplified libraries. These assembly errors were estimated for the 25 libraries for which an unamplified reference assembly was available (Table S2). Coloring of the assembly pipelines is the same as in panels A and B. Dedup.: Deduplication, Meta: metaSPAdes, SC: single-cell SPAdes.
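The metric plotted in panels A and B, the cumulative size of contigs of at least 10 kb, is simple to compute from an assembly. The sketch below is illustrative only: the function names and the toy FASTA records are hypothetical, and only the 10 kb threshold is taken from the caption.

```python
def fasta_lengths(lines):
    """Return {contig_name: length} from an iterable of FASTA-formatted lines."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def cumulative_long_contig_size(contig_lengths, min_len=10_000):
    """Sum the lengths of all contigs that are at least `min_len` bp long."""
    return sum(n for n in contig_lengths if n >= min_len)


# Toy assembly, invented for illustration: two long contigs and one short one.
toy_fasta = [
    ">contig_1 example",
    "A" * 6000,
    "C" * 6000,
    ">contig_2",
    "G" * 500,
    ">contig_3",
    "T" * 10000,
]
lengths = fasta_lengths(toy_fasta)
total_long = cumulative_long_contig_size(lengths.values())
```

Computing this value once per library and per assembly pipeline would give the kind of distribution summarized in panel A. How the pipelines themselves were run is described in Table S3 and is not reproduced here.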