Matthew G Links, Bonnie Chaban, Sean M Hemmingsen, Kevin Muirhead, Janet E Hill.
Abstract
BACKGROUND: Formation of operational taxonomic units (OTU) is a common approach to data aggregation in microbial ecology studies based on amplification and sequencing of individual gene targets. The de novo assembly of OTU sequences has recently been demonstrated as an alternative to widely used clustering methods, providing robust information from experimental data alone, without any reliance on an external reference database.
Year: 2013 PMID: 24451012 PMCID: PMC3971603 DOI: 10.1186/2049-2618-1-23
Source DB: PubMed Journal: Microbiome ISSN: 2049-2618 Impact factor: 14.650
Figure 1. mPUMA workflow. Programs used at each step in the pipeline are shown in red. A. User-defined protocol options for assembly and read-to-operational taxonomic unit (OTU) tracking include gsAssembler for both processes (green arrows), gsAssembler plus Bowtie 2 for read tracking (blue arrows), and Trinity assembly plus Bowtie 2 for read tracking (purple arrows). B. Post-assembly analysis of OTU and abundance data. Gray boxes indicate possible downstream analysis tools for which input is generated by mPUMA. The horizontal broken line indicates the transition from analysis of nucleotide OTU ((nt)OTU) to analysis of translated peptide OTU ((aa)OTU). Quality of the assembly can be evaluated by assessing the sensitivity and specificity (Sn/Sp) of each OTU as defined in [3]. WateredBLAST is a combination of BLAST and Smith-Waterman alignments, described in detail in [15].
Figure 2. Comparison of methods for both assembly and abundance calculation using a synthetic community of 20 cloned cpn60 universal target sequences. Three different scenarios were investigated for the generation of a microbial profile (left to right): gsAssembler alone, gsAssembler plus Bowtie 2 for abundance, and Trinity plus Bowtie 2 for abundance. The number of community members recovered (out of 20) is shown across the top. The major parameter affecting the accuracy of assembly is varied across the lower x-axis. For gsAssembler, the minimum identity of overlaps was held constant at 90 while the minimum length parameter was varied. In the case of Trinity, the k-mer length was varied from 10 to 31 bp. The upper panel shows the percentage of reads that were untrackable (left ordinate) and the total error associated with each assembly (right ordinate). In the lower panel, microbial profiles are plotted as stacked bars, with each element colored by organism according to the legend. Profiles marked as 'Target' indicate the actual composition of the amplicon library, determined by Bowtie 2 mapping of all reads onto the 20 reference sequences.
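The two quantities plotted in Figure 2, the percentage of untrackable reads and the per-organism profile, can be illustrated with a minimal Python sketch. This is not mPUMA's own code. It assumes that read tracking has already produced one OTU label per read, with `None` standing for a read that could not be assigned. The function name `summarize_profile`, the OTU labels, and the toy input are all hypothetical, and the sketch assumes the profile is normalized over tracked reads only, which the caption does not state.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple


def summarize_profile(
    assignments: Iterable[Optional[str]],
) -> Tuple[float, Dict[str, float]]:
    """Return (percent of untrackable reads, percent abundance per OTU).

    Each element of `assignments` is the OTU a read was mapped to,
    or None if the read could not be tracked to any OTU.
    Abundances are expressed as a percentage of tracked reads (an assumption).
    """
    counts: Counter = Counter()
    untracked = 0
    total = 0
    for otu in assignments:
        total += 1
        if otu is None:
            untracked += 1
        else:
            counts[otu] += 1
    if total == 0:
        raise ValueError("no reads supplied")

    tracked = total - untracked
    pct_untrackable = 100.0 * untracked / total
    # Normalize over tracked reads so the stacked bar sums to 100%.
    profile = (
        {otu: 100.0 * n / tracked for otu, n in counts.items()} if tracked else {}
    )
    return pct_untrackable, profile


# Hypothetical toy input: 8 reads, 2 of which were not assigned to any OTU.
reads = ["OTU_A", "OTU_A", "OTU_B", None, "OTU_A", None, "OTU_B", "OTU_C"]
pct_untrackable, profile = summarize_profile(reads)
print(f"untrackable reads: {pct_untrackable:.1f}%")
for otu, pct in sorted(profile.items()):
    print(f"{otu}: {pct:.1f}%")
```

In a real run the labels would come from parsing the read-mapping output (for example, the reference name recorded for each aligned read), and a 'Target' profile would be obtained the same way by mapping all reads to the 20 reference sequences.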