Heather K Allen1, Darrell O Bayles2, Torey Looft3, Julian Trachsel3,4, Benjamin E Bass3,5, David P Alt2, Shawn M D Bearson3, Tracy Nicholson6, Thomas A Casey3.
Abstract
BACKGROUND: Profiling of 16S rRNA gene sequences is an important tool for testing hypotheses in complex microbial communities, and analysis methods must be updated and validated as sequencing technologies advance. In host-associated bacterial communities, the V1-V3 region of the 16S rRNA gene is a valuable region to profile because it provides a useful level of taxonomic resolution; however, use of Illumina MiSeq data for experiments targeting this region needs validation.
Keywords: 16S rRNA gene; MiSeq; Microbial ecology; Mock community; V1–V3
Year: 2016 PMID: 27485508 PMCID: PMC4970291 DOI: 10.1186/s13104-016-2172-6
Source DB: PubMed Journal: BMC Res Notes ISSN: 1756-0500
16S gene composition of the mock community
| Strain | Genome size (bp)ᵃ | # 16S genesᵃ | Genome copies per microgramᵇ | Reference or DSMZ catalogue numberᶜ |
|---|---|---|---|---|
| 1 | 1,641,481 | 3 | 5.64E+08 | [ |
| 2 | 4,878,012 | 6 | 1.89E+08 | [ |
| 3 | 4,656,144 | 7 | 1.99E+08 | [ |
| 4 | 2,474,718 | 7 | 3.74E+08 | DSM 20460 |
| 5 | 3,585,187 | 3 | 2.57E+08 | [ |
| 6 | 3,050,489 | 1 | 3.04E+08 | [ |
| 7 | 2,224,137 | 2 | 4.17E+08 | [ |
| 8 | 5,207,899 | 3 | 1.78E+08 | [ |
| 9 | 2,872,915 | 5 | 3.22E+08 | [ |
| 10 | 6,293,399 | 5 | 1.47E+08 | DSM 2079 |
| 11 | 3,080,849 | 3 | 3.01E+08 | DSM 17677 |
| 12 | 2,153,652 | 4 | 4.30E+08 | DSM 6778 |
| 13 | 4,431,877 | 3 | 2.09E+08 | DSM 19495 |
| 14 | 4,470,622 | 3 | 2.07E+08 | DSM 18026 |
| 15 | 3,788,225 | 4 | 2.45E+08 | DSM 1382 |
| 16 | 1,864,998 | 9 | 4.97E+08 | DSM 20081 |
| 17 | 2,115,681 | 2 | 4.38E+08 | DSM 20642 |
| 18 | 2,509,362 | 1 | 3.69E+08 | DSM 4420 |
| 19 | 3,592,125 | 4 | 2.58E+08 | DSM 16839 |
| 20 | 1,704,865 | 1 | 5.43E+08 | DSM 2375 |
ᵃ Values for all strains except C. porcorum were obtained from the Joint Genome Institute's Integrated Microbial Genomes Database (https://img.jgi.doe.gov/cgi-bin/w/main.cgi). Values for C. porcorum were calculated manually (BioProject PRJNA335387)
ᵇ Estimates were calculated with the URI Genomics & Sequencing Center calculator (http://cels.uri.edu/gsc/cndna.html)
ᶜ Genomic DNAs were acquired from DSMZ, the Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures (https://www.dsmz.de/home.html)
ᵈ This archaeon was included as a control for non-specific amplification
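The genome copy numbers in the table can be approximated from genome size alone. The following is a minimal sketch, assuming an average mass of 650 g/mol per base pair; that constant and the function names are assumptions for illustration, not details stated in the source. Under this assumption the calculation reproduces most of the tabulated values, although a few rows differ in the last reported digit.

```python
AVOGADRO = 6.02214076e23     # molecules per mole
BP_MASS_G_PER_MOL = 650.0    # assumed average mass of one base pair (not from the source)


def genome_copies_per_microgram(genome_size_bp: int) -> float:
    """Estimate how many genome copies are present in 1 microgram of DNA."""
    grams_per_genome = genome_size_bp * BP_MASS_G_PER_MOL / AVOGADRO
    return 1e-6 / grams_per_genome


def rrna_gene_copies_per_microgram(genome_size_bp: int, n_16s_genes: int) -> float:
    """Scale genome copies by the number of 16S rRNA genes per genome."""
    return genome_copies_per_microgram(genome_size_bp) * n_16s_genes


# (strain number, genome size in bp, number of 16S genes), taken from the table
for strain, size_bp, n_16s in [(1, 1_641_481, 3), (20, 1_704_865, 1)]:
    genomes = genome_copies_per_microgram(size_bp)
    genes = rrna_gene_copies_per_microgram(size_bp, n_16s)
    print(f"strain {strain}: {genomes:.2E} genomes/ug, {genes:.2E} 16S genes/ug")
```

Multiplying by the 16S gene count is an extra illustrative step: it shows why equal genome copies of each strain would not necessarily yield equal 16S read proportions.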
Primers used in this study
| Primer name | Sequence (5′–3′)ᵃ |
|---|---|
| i5+V3 | |
| i7+V1 | |
| For.seq.V3 | TATGGTAATTCAATTACCGCGGCTGCTGG |
| Rev.seq.V1 | AGTCAGTCAGCCGAGTTTGATCMTGGCTCAG |
| Index.V1 | CTGAGCCAKGATCAAACTCGGCTGACTGACT |
ᵃ Illumina's MiSeq adaptor sequences are in bold. The underlined portion denotes the barcode region. Barcodes used in this study are from the Schloss laboratory's MiSeq SOP (http://www.mothur.org/wiki/MiSeq_SOP; [2]). Italics denote the V1 or V3 16S rRNA gene primer [11]
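The Index.V1 primer listed above is the reverse complement of Rev.seq.V1, which can be checked directly. The sketch below is illustrative only; the helper name `reverse_complement` is hypothetical, and the translation table covers the standard IUPAC ambiguity codes (for example, M pairs with K).

```python
# Complement table for DNA, including IUPAC ambiguity codes.
_COMPLEMENT = str.maketrans("ACGTMKRYSWBDHVN", "TGCAKMYRSWVHDBN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (IUPAC-aware)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


REV_SEQ_V1 = "AGTCAGTCAGCCGAGTTTGATCMTGGCTCAG"
INDEX_V1 = "CTGAGCCAKGATCAAACTCGGCTGACTGACT"

# Compare the computed reverse complement with the listed index primer.
print(reverse_complement(REV_SEQ_V1) == INDEX_V1)  # True
```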
Fig. 1 Cluster size versus frequency plot. The log10 of the cluster size (logCt), i.e. the number of sequences in each cluster, is plotted against the log10 of the frequency of cluster membership sizes (i.e. the number of clusters that contained n sequences, where n is the cluster membership size) found among all mock community samples. Plots of all mock community data (a), data with cross-sample singletons removed (b), and data with both cross-sample singletons and doubletons removed (c) are shown. Red line: regression line; blue line: lowess fit line; SE: standard error; r2: coefficient of determination
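The transformation behind a plot of this kind can be sketched as follows: count how many clusters have each size, take log10 of both quantities, and fit an ordinary least-squares line. The cluster sizes below are toy values, not data from the study, and the function names are hypothetical; the lowess fit shown in the figure is not reproduced here.

```python
from collections import Counter
from math import log10


def size_frequency_loglog(cluster_sizes):
    """Return sorted (log10 size, log10 number of clusters of that size) points."""
    frequency = Counter(cluster_sizes)
    return sorted((log10(size), log10(count)) for size, count in frequency.items())


def least_squares(points):
    """Fit y = slope * x + intercept and return (slope, intercept, r_squared)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, sxy ** 2 / (sxx * syy)


# Toy data: many small clusters and few large ones.
toy_sizes = [1] * 1000 + [10] * 100 + [100] * 10 + [1000] * 1
slope, intercept, r_squared = least_squares(size_frequency_loglog(toy_sizes))
print(f"slope={slope:.2f} intercept={intercept:.2f} r2={r_squared:.2f}")
```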
Average diversity estimates of the mock community (n = 12, rarefied to 6654 sequences per sample) with and without removing low-frequency sequences
| Mock community | Actual number of OTUsᵃ | Observed number of OTUs | Estimated total number of OTUsᵇ | Chao diversity index | Shannon diversity index | Inverse Simpson index | Error rate (%) | File size (Gb)ᶜ |
|---|---|---|---|---|---|---|---|---|
| All sequences | 20 | 734 ± 56 | 374,770 ± 214,807 | 21,676 ± 3273 | 3.6 ± 0.1 | 18 ± 0.8 | 3.6 | 41 |
| Singletons removed | 20 | 28 ± 0.8 | 68 ± 13 | 41 ± 3 | 2.7 ± 0.02 | 12 ± 0.3 | 1.4 | 21 |
| Singletons and doubletons removed | 20 | 22 ± 0.3 | 22 ± 0.3 | 23 ± 0.7 | 2.6 ± 0.02 | 12 ± 0.3 | 1.3 | 3 |
Average diversity estimates are given plus or minus (±) the standard error of the mean, where appropriate
ᵃ Haemophilus parasuis has two divergent copies of the 16S rRNA gene that cluster separately
ᵇ The estimated total number of OTUs is the number of OTUs predicted to be in the sample based on the number of OTUs observed in the sequences. The program Catchall was used to make the estimates [14]
ᶜ Size of the distance matrix file
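To illustrate why rare clusters inflate richness estimates much more than they affect Shannon or inverse Simpson values, the sketch below computes these indices from a list of OTU counts before and after dropping small OTUs. It is a simplification under stated assumptions: the counts are toy values, filtering is applied to OTU abundances, the bias-corrected Chao1 formula and the simple 1/Σp² form of inverse Simpson are used, and the function names are hypothetical. It is not the authors' pipeline.

```python
from math import log


def drop_small_otus(counts, min_size):
    """Keep only OTUs that contain at least min_size sequences."""
    return [c for c in counts if c >= min_size]


def diversity(counts):
    """Compute richness and diversity indices from positive OTU counts."""
    total = sum(counts)
    proportions = [c / total for c in counts]
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return {
        "observed": len(counts),
        # Bias-corrected Chao1 richness estimator.
        "chao1": len(counts) + singletons * (singletons - 1) / (2 * (doubletons + 1)),
        "shannon": -sum(p * log(p) for p in proportions),
        "inverse_simpson": 1 / sum(p * p for p in proportions),
    }


# Toy community: 20 abundant OTUs plus a tail of spurious rare OTUs.
toy_counts = [100] * 20 + [1] * 5 + [2] * 2
print(diversity(toy_counts))
print(diversity(drop_small_otus(toy_counts, min_size=3)))
```

In this toy case the observed and Chao1 richness fall back to 20 once OTUs of size 1 and 2 are dropped, while the Shannon and inverse Simpson values change only slightly.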
Fig. 2 Average relative abundance of bacterial genera in 12 replicates of the mock community