Lu Fan, Kerensa McElroy, Torsten Thomas.
Abstract
Direct sequencing of environmental DNA (metagenomics) has great potential for describing the 16S rRNA gene diversity of microbial communities. However, current approaches that use this 16S rRNA gene information to describe community diversity suffer from low taxonomic resolution or chimera problems. Here we describe a new strategy that involves stringent assembly and data filtering to reconstruct full-length 16S rRNA genes from metagenomic pyrosequencing data. Simulations showed that reconstructed 16S rRNA genes provided a true picture of the community diversity, had minimal rates of chimera formation and gave taxonomic resolution down to genus level. The strategy was furthermore compared to PCR-based methods to determine the microbial diversity in two marine sponges. This showed that about 30% of the abundant phylotypes reconstructed from metagenomic data failed to be amplified by PCR. Our approach is readily applicable to existing metagenomic datasets and is expected to lead to the discovery of new microbial phylotypes.
Year: 2012 PMID: 22761935 PMCID: PMC3384625 DOI: 10.1371/journal.pone.0039948
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Reads, 16S rRNA contigs, OTUs and chimera examination of the simulated communities.
| Sample | HC–A | HC–B | HC–C | MC–A | MC–B | MC–C | LC–A | LC–B | LC–C |
|---|---|---|---|---|---|---|---|---|---|
|  | 999913 | 999909 | 999912 | 999703 | 999775 | 999769 | 999603 | 999606 | 999685 |
|  | 1303 | 1353 | 1376 | 984 | 1112 | 1153 | 874 | 916 | 860 |
|  | 130 (3, 1) | 126 (7, 1) | 125 (4, 3) |  |  |  |  |  |  |
|  | 3733 (85, 15) | 3005 (365, 8) | 2386 (374, 150) |  |  |  |  |  |  |
|  | 73 (0, 0) | 53 (3, 0) | 54 (3, 2) |  |  |  |  |  |  |
|  | 3257 (0, 0) | 2610 (330, 0) | 2004 (364, 140) |  |  |  |  |  |  |
|  | 458, 1548, 1262 | 574, 1529, 1127 | 515, 1532, 1174 |  |  |  |  |  |  |
|  | 81, 0, 0 | 81, 0, 0 | 81, 0, 0 | 75, 1, 1 | 77, 1, 1 | 77, 1, 1 | 80, 0, 0 | 79, 0, 0 | 80, 0, 0 |
|  | 1303, 0, 0 | 1353, 0, 0 | 1376, 0, 0 | 978, 2, 4 | 1106, 2, 2 | 1148, 4, 4 | 870, 0, 0 | 915, 0, 0 | 857, 0, 0 |
|  | 74, 0, 0 | 74, 0, 0 | 74, 0, 0 | 69, 0, 0 | 72, 0, 0 | 72, 0, 0 | 72, 0, 0 | 71, 0, 0 | 72, 0, 0 |
|  | 1303, 0, 0 | 1353, 0, 0 | 1376, 0, 0 | 982, 0, 0 | 1108, 0, 0 | 1150, 0, 0 | 870, 0, 0 | 915, 0, 0 | 857, 0, 0 |
|  | 52, 0, 0 | 53, 0, 0 | 52, 0, 0 | 49, 0, 0 | 50, 0, 0 | 49, 0, 0 | 49, 0, 0 | 48, 0, 0 | 48, 0, 0 |
|  | 1303, 0, 0 | 1353, 0, 0 | 1376, 0, 0 | 982, 0, 0 | 1108, 0, 0 | 1150, 0, 0 | 870, 0, 0 | 915, 0, 0 | 857, 0, 0 |
Figure 1. 16S rRNA gene contigs and chimeric contigs for simulated datasets.
Open circle: non-chimeric contigs; solid circle: chimeric contigs containing one contaminating read; solid triangles: chimeric contigs containing more than one contaminating read. Arrow: chimera detected by UChime. (A) HC. (B) MC. (C) LC.
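The three contig categories in this legend can be illustrated with a small sketch. It assumes that each simulated read carries a label for its source genome and that a "contaminating" read is one whose source differs from the most common source in the contig. The function name `classify_contig` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def classify_contig(read_sources):
    """Classify a contig by the number of contaminating reads it contains.

    read_sources: list of source-genome labels, one per read in the contig.
    Assumption: any read not from the most common source is contaminating.
    """
    if not read_sources:
        raise ValueError("contig has no reads")
    counts = Counter(read_sources)
    majority_count = counts.most_common(1)[0][1]
    contaminating = len(read_sources) - majority_count
    if contaminating == 0:
        return "non-chimeric"
    if contaminating == 1:
        return "chimeric (one contaminating read)"
    return "chimeric (more than one contaminating read)"


# Toy contigs with hypothetical source labels.
contigs = {
    "contig_1": ["genome_A"] * 5,
    "contig_2": ["genome_A"] * 4 + ["genome_B"],
    "contig_3": ["genome_A"] * 3 + ["genome_B", "genome_C"],
}
labels = {name: classify_contig(sources) for name, sources in contigs.items()}
```

This kind of labelling is only possible for simulated data, where the origin of every read is known; for real samples a reference-free detector such as UChime would be needed instead.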
Figure 2. Taxonomic classification of assembled and unassembled shotgun 16S rRNA gene reads for simulated datasets.
(A) HC. (B) MC. (C) LC.
The sponge metagenomic datasets.
| Sample | Cyr–A shotgun | Cyr–B shotgun | Cyr–C shotgun | Cyn–A shotgun | Cyn–B shotgun | Cyn–C shotgun |
|---|---|---|---|---|---|---|
|  | 897408 | 971976 | 888127 | 678263 | 1169872 | 1323699 |
|  | 387.6 | 353.2 | 276.8 | 358.0 | 408.1 | 392.8 |
|  | 859525 | 898161 | 788662 | 660869 | 1004075 | 1111093 |
|  | 282 | 385 | 95 | 237 | 530 | 413 |
|  | 48 (557) | 66 (908) |  |  |  |  |
|  | 13 (445) | 12 (727) |  |  |  |  |
|  | 1218, 1535, 1418 | 493, 1517, 1251 |  |  |  |  |
The sponge tag-sequencing data sets.
| Sample | Cyr–A PCR | Cyr–B PCR | Cyr–C PCR | Cyn–A PCR | Cyn–B PCR | Cyn–C PCR |
|---|---|---|---|---|---|---|
|  | 5989 | 7895 | 13961 | 8257 | 5284 | 12509 |
|  | 301.1 | 302.5 | 305.7 | 306.8 | 317.2 | 314.1 |
|  | 2342 | 3038 | 4988 | 3754 | 2140 | 6130 |
|  | 212 | 179 | 311 | 265 | 155 | 244 |
|  | 269.8 | 268.9 | 272.2 | 267.2 | 271 | 269.2 |
Figure 3. Phylum-level classification of the sponge pyro-tag-sequencing and shotgun sequencing datasets.
(A) 16S rRNA gene PCR approach. (B) Unassembled shotgun 16S rRNA gene reads. (C) Assembled shotgun 16S rRNA gene reads. (D) Single-copy gene analysis.
Figure 4. Shared and unique OTUs of the PCR-based and shotgun-based sponge datasets.
Circle sizes are proportional to OTU number. (A) 0.01 phylogenetic distance OTU. (B) 0.03 phylogenetic distance OTU. (C) 0.05 phylogenetic distance OTU.
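The shared and unique OTU counts behind such a diagram reduce to set operations. The sketch below assumes that OTUs from both datasets were clustered together at one distance cutoff, so that identifiers are directly comparable; the function name `compare_otus` and the OTU identifiers are hypothetical toy values, not data from the study.

```python
def compare_otus(pcr_otus, shotgun_otus):
    """Split OTU identifiers into shared, PCR-only and shotgun-only sets."""
    pcr, shotgun = set(pcr_otus), set(shotgun_otus)
    return {
        "shared": pcr & shotgun,
        "pcr_only": pcr - shotgun,
        "shotgun_only": shotgun - pcr,
    }


# Hypothetical OTU identifiers at a single distance cutoff (e.g. 0.03).
pcr_otus = ["otu1", "otu2", "otu3", "otu4"]
shotgun_otus = ["otu3", "otu4", "otu5"]

result = compare_otus(pcr_otus, shotgun_otus)

# Fraction of shotgun-derived OTUs that the PCR dataset did not recover.
fraction_missed = len(result["shotgun_only"]) / len(set(shotgun_otus))
```

Repeating the comparison at each cutoff (0.01, 0.03, 0.05) would give one set of counts per panel.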
Figure 5. Abundance and primer mismatches in the top OTUs at the 0.01 phylogenetic distance level for the sponge datasets.
Asterisk: primer-mismatch event.
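A primer-mismatch event can be made concrete by counting the positions at which a target site is not covered by the primer, allowing for degenerate IUPAC bases in the primer. The sketch below is illustrative only: the primer and site sequences are made-up toy strings, the function name `count_mismatches` is hypothetical, and the paper's actual primers and mismatch criteria are not reproduced here.

```python
# IUPAC nucleotide codes: each primer symbol maps to the bases it matches.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_mismatches(primer, site):
    """Count positions where the site base is not allowed by the primer.

    Both sequences are read in the same orientation and must be the same
    length; the site is expected to contain only A, C, G or T.
    """
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    return sum(
        1
        for p, s in zip(primer.upper(), site.upper())
        if s not in IUPAC[p]
    )


# Toy primer with two degenerate positions (hypothetical sequence).
primer = "ACGYTRGA"
perfect = count_mismatches(primer, "ACGCTAGA")   # Y covers C, R covers A
one_off = count_mismatches(primer, "ACGATAGA")   # A is not covered by Y
three_off = count_mismatches(primer, "TCGATCGA")
```

An OTU whose representative sequence gives a non-zero count at the primer binding site would be flagged in the same way the asterisks flag OTUs in the figure.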