Falk Hildebrand1,2,3, Raul Tadeo1,2,4,5, Anita Yvonne Voigt3,6, Peer Bork3,7, Jeroen Raes8,9,10,11.
Abstract
BACKGROUND: 16S ribosomal DNA (rDNA) amplicon sequencing is frequently used to analyse the structure of bacterial communities from oceans to the human microbiota. However, computational power is still a major bottleneck in the analysis of continuously enlarging metagenomic data sets. Analysis is further complicated by the technical complexity of current bioinformatics tools.
Keywords: 16S rDNA gene; Demultiplexing; Metagenomics; OTU; Pipeline
Year: 2014 PMID: 27367037 PMCID: PMC4179863 DOI: 10.1186/2049-2618-2-30
Source DB: PubMed Journal: Microbiome ISSN: 2049-2618 Impact factor: 14.650
Figure 1. Overview of the LotuS workflow. Raw reads are demultiplexed and quality filtered; OTUs are clustered from the filtered reads. Mid- and high-quality reads are mapped to OTUs; taxonomy and phylogenetic relatedness are calculated on the extended OTU seeds.
Figure 2. Genus-level compositional comparison. Comparison of mouse caecal composition in the example datasets between the five methodologies used. The y-axis shows individual samples; the x-axis shows the percentage of reads assigned to specific genera.
Richness comparisons between pipelines
| s_obs | 119.2243 | 118.6378 | 243.1324 | 201.8838 | 119.0432 |
| Chao1 | 142.314 | 141.8862 | 473.0673 | 273.9279 | 151.7102 |
| Evenness | 0.778795 | 0.778613 | 0.752087 | 0.807766 | 0.763974 |
| Shannon | 3.698957 | 3.690516 | 4.128862 | 4.253904 | 3.627188 |
Average diversity and richness estimates for the five methods used to derive an OTU matrix, rarefied to 2,000 reads per sample. LB: LotuS BLAST; LR: LotuS RDP; QDN: QIIME de novo OTU creation; QRE: QIIME reference OTU picking; MOT: mothur.
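The estimates in this table (observed richness, Chao1, evenness, Shannon diversity, all after rarefaction) can be illustrated with a minimal Python sketch. The function names are hypothetical, and several choices are assumptions not confirmed by this record: the bias-corrected form of Chao1, the natural logarithm for Shannon diversity, and Pielou's definition of evenness. The pipelines compared here may implement these estimators differently.

```python
import math
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from a vector of OTU counts."""
    pool = [otu for otu, n in enumerate(counts) for _ in range(n)]
    if len(pool) < depth:
        raise ValueError("sample has fewer reads than the rarefaction depth")
    picked = Counter(random.Random(seed).sample(pool, depth))
    return [picked.get(otu, 0) for otu in range(len(counts))]


def observed_richness(counts):
    """Number of OTUs with at least one read (s_obs)."""
    return sum(1 for n in counts if n > 0)


def chao1(counts):
    """Bias-corrected Chao1 (assumed variant): S_obs + f1(f1-1) / (2(f2+1))."""
    f1 = sum(1 for n in counts if n == 1)  # singletons
    f2 = sum(1 for n in counts if n == 2)  # doubletons
    return observed_richness(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))


def shannon(counts):
    """Shannon diversity H' using the natural logarithm (assumed base)."""
    total = sum(counts)
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


def pielou_evenness(counts):
    """Pielou's evenness J' = H' / ln(S_obs) (assumed definition of evenness)."""
    s = observed_richness(counts)
    return shannon(counts) / math.log(s) if s > 1 else 0.0
```

In this sketch, each sample would first be passed through `rarefy(sample, 2000)`, and the per-sample estimates would then be averaged to obtain one value per method.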
Computational efficiency
| 2 × 454 | Demultiplexing/quality filtering | 37 | 37 | 160 | 160 | 235 |
| 2 × 454 | Full run | 177 | 7,317 | 4,325 | 17,081 | 39,660 |
| 2 × MiSeq | Demultiplexing/quality filtering | 820 | 820 | 3,495 | 3,495 | * |
| 2 × MiSeq | Full run | 8,856 | 23,761 | 69,696 | 56,916 | * |
Execution times, in seconds, of the five pipelines run on the same computer. A “full run” outputs OTU abundances, abundances at higher taxonomic levels and a phylogenetic tree of OTUs. Times are reported separately for our 454 mouse faeces test set (2 × 454) and the two human faeces MiSeq test sets (2 × MiSeq). Asterisks denote runs from which mothur was excluded due to an unknown error (see Additional file 1).
Compositional similarity of technical replicates
| Level | Metric | LR | LB | QR | QD |
|---|---|---|---|---|---|
| OTU | Bray-Curtis | 0.117267 | 0.1158 | 0.106167 | 0.153467 |
| OTU | Canberra | 0.037446 | 0.037591 | 0.038509 | 0.036918 |
| OTU | Jensen Shannon | 0.001487 | 0.001572 | 0.001602 | 0.001651 |
| Genus | Bray-Curtis | 0.0496 | 0.048133 | 0.044133 | 0.045933 |
| Genus | Canberra | 0.055303 | 0.055628 | 0.05864 | 0.05339 |
| Genus | Jensen Shannon | 0.002053 | 0.002067 | 0.002236 | 0.001977 |
| Family | Bray-Curtis | 0.046867 | 0.043333 | 0.042833 | 0.044933 |
| Family | Canberra | 0.060573 | 0.057641 | 0.065774 | 0.063415 |
| Family | Jensen Shannon | 0.002129 | 0.002124 | 0.002432 | 0.002502 |
Technical replicates from two MiSeq runs were compared compositionally using the Bray-Curtis, Canberra and Jensen Shannon distance metrics. The table shows the average distance over 38 pairs of replicates for each pipeline execution mode. Smaller distances indicate more similar replicate samples.
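As an illustration of how such replicate distances could be computed, the following Python sketch implements the three metrics for a pair of abundance vectors and averages them over replicate pairs. The function names are hypothetical, and the exact variants are assumptions: Canberra is averaged over the features present in at least one sample, and the Jensen Shannon value is the divergence with natural logarithm (some definitions take its square root as the distance). This record does not state which variants the authors used.

```python
import math


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))


def canberra(a, b):
    """Canberra distance, averaged over features with a_i + b_i > 0 (assumed scaling)."""
    terms = [abs(x - y) / (x + y) for x, y in zip(a, b) if x + y > 0]
    return sum(terms) / len(terms) if terms else 0.0


def jensen_shannon_divergence(a, b):
    """Jensen Shannon divergence between the relative abundances of a and b."""
    p = [x / sum(a) for x in a]
    q = [y / sum(b) for y in b]
    m = [(x + y) / 2 for x, y in zip(p, q)]

    def kl(u, v):
        # Kullback-Leibler divergence; zero-probability terms contribute 0.
        return sum(x * math.log(x / y) for x, y in zip(u, v) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def mean_pairwise(pairs, metric):
    """Average a metric over a list of (replicate_1, replicate_2) vector pairs."""
    return sum(metric(a, b) for a, b in pairs) / len(pairs)
```

For example, `mean_pairwise(replicate_pairs, bray_curtis)` would give one table cell, where `replicate_pairs` is a hypothetical list of the 38 paired abundance vectors produced by one pipeline at one taxonomic level.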