Nicholas A Bokulich, Sathish Subramanian, Jeremiah J Faith, Dirk Gevers, Jeffrey I Gordon, Rob Knight, David A Mills, J Gregory Caporaso.
Abstract
High-throughput sequencing has revolutionized microbial ecology, but read quality remains a considerable barrier to accurate taxonomy assignment and α-diversity assessment for microbial communities. We demonstrate that high-quality read length and abundance are the primary factors differentiating correct from erroneous reads produced by Illumina GAIIx, HiSeq and MiSeq instruments. We present guidelines for user-defined quality-filtering strategies, enabling efficient extraction of high-quality data and facilitating interpretation of Illumina sequencing results.
Year: 2012 PMID: 23202435 PMCID: PMC3531572 DOI: 10.1038/nmeth.2276
Source DB: PubMed Journal: Nat Methods ISSN: 1548-7091 Impact factor: 28.547
Figure 1. Quality Filtration Process Flow in QIIME v1.5.0.
Figure 2. α- and β-diversity comparisons of mock community reads filtered using selected phred_quality_score (q) settings (dataset 1). A, B: Family-level (A) and genus-level (B) taxon counts for mock communities filtered with varying q values at multiple OTU minimum abundance thresholds (c) (as %). Arrows below the color key indicate expected genus-level (blue) and family-level (red) taxon counts. C, D, E: Procrustes PCoA biplots of GAIIx weighted UniFrac distances comparing variation in q. Each panel compares the q setting listed in its bottom-right corner to q = 3; the top-right corner gives the Bonferroni-corrected p-value for Procrustes goodness of fit. Red, human feces; magenta, mock community; cyan, human skin; dark cyan, human tongue; blue, freshwater; orange, freshwater creek; purple, ocean; yellow, estuary sediment; pink, soil. All other settings were left at their defaults in both α- and β-diversity comparisons.
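To make the two parameters named in the Figure 2 caption concrete, the following is a minimal Python sketch of a Phred-score cutoff (q) and an OTU minimum abundance threshold (c). It is an illustration under stated assumptions, not QIIME's implementation: the function names are hypothetical, q is treated as the highest unacceptable Phred score with reads truncated at the first such base, and c is interpreted as a percentage of all reads.

```python
def truncate_at_low_quality(seq, quals, q):
    """Return the leading part of `seq` before the first base whose
    Phred score is <= q (assumed, simplified truncation rule)."""
    for i, score in enumerate(quals):
        if score <= q:
            return seq[:i]
    return seq


def filter_otus_by_abundance(otu_counts, c_percent):
    """Keep OTUs whose read count is at least c_percent of all reads
    (assumed interpretation of the threshold c)."""
    total = sum(otu_counts.values())
    min_count = total * c_percent / 100.0
    return {otu: n for otu, n in otu_counts.items() if n >= min_count}


# Hypothetical toy data, not taken from the paper.
read = "ACGTACGT"
scores = [30, 30, 30, 2, 30, 30, 30, 30]
trimmed = truncate_at_low_quality(read, scores, q=3)

counts = {"OTU_A": 990, "OTU_B": 9, "OTU_C": 1}
kept = filter_otus_by_abundance(counts, c_percent=0.5)
```

In this toy case the read is cut at the fourth base, and only OTUs holding at least 0.5% of the 1,000 reads are retained. Raising q shortens reads, while raising c removes rare OTUs.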