Chongqing Wen, Liyou Wu, Yujia Qin, Joy D Van Nostrand, Daliang Ning, Bo Sun, Kai Xue, Feifei Liu, Ye Deng, Yuting Liang, Jizhong Zhou.
Abstract
Illumina's MiSeq has become the dominant platform for gene amplicon sequencing in microbial ecology studies; however, various technical concerns, such as reproducibility, still exist. To assess reproducibility, 16S rRNA gene amplicons from 18 soil samples of a reciprocal transplantation experiment were sequenced on an Illumina MiSeq. The V4 region of the 16S rRNA gene from each sample was sequenced in triplicate, with each replicate having a unique barcode. The average OTU overlap, without considering sequence abundance, at a rarefaction level of 10,323 sequences was 33.4±2.1% and 20.2±1.7% between two and among three technical replicates, respectively. When OTU sequence abundance was considered, the average sequence abundance weighted OTU overlap was 85.6±1.6% and 81.2±2.1% for two and three replicates, respectively. Removing singletons significantly increased the overlap for both (~1-3%, p<0.001). Increasing the sequencing depth to 160,000 reads by deep sequencing increased OTU overlap both when sequence abundance was considered (95%) and when it was not (44%). However, if singletons were not removed, the overlap between two technical replicates (not considering sequence abundance) plateaued at 39% at 30,000 sequences. Diversity measures were not affected by the low overlap, as α-diversities were similar among technical replicates while β-diversities (Bray-Curtis) were much smaller among technical replicates than among treatment replicates (e.g., 0.269 vs. 0.374). Higher diversity coverage, but lower OTU overlap, was observed when replicates were sequenced in separate runs. Detrended correspondence analysis indicated that while there was considerable variation among technical replicates, the reproducibility was sufficient for detecting treatment effects for the samples examined.
These results suggest that although there is variation among technical replicates, amplicon sequencing on MiSeq is useful for analyzing microbial community structure if used appropriately and with caution. For example, including technical replicates, removing spurious sequences and unrepresentative OTUs, using a clustering method with a high stringency for OTU generation, estimating treatment effects at higher taxonomic levels, and adapting the unique molecular identifier (UMI) and other newly developed methods to lower PCR and sequencing error and to identify true low abundance rare species all can increase reproducibility.
Year: 2017 PMID: 28453559 PMCID: PMC5409056 DOI: 10.1371/journal.pone.0176716
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. OTU overlap among technical replicates for Experiment I.
(A) Between two technical replicates, singletons not removed; (B) between two technical replicates, singletons removed; (C) among three technical replicates, singletons not removed; (D) among three technical replicates, singletons removed.
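The overlap calculation behind panels like these can be sketched in a few lines. The definitions below are assumptions, since this page does not give the formulas: unweighted overlap is taken as the number of shared OTUs divided by all OTUs seen in either replicate, and abundance-weighted overlap as the fraction of sequences that belong to shared OTUs. The function names and the toy counts are hypothetical, and singletons are treated per replicate purely for illustration.

```python
def otu_overlap(rep_a, rep_b):
    """Return (unweighted, abundance_weighted) OTU overlap for two replicates.

    rep_a, rep_b: dict mapping OTU id -> sequence count.
    Assumed definitions (not taken from the source):
      unweighted = shared OTUs / OTUs present in either replicate
      weighted   = sequences in shared OTUs / all sequences
    """
    present_a = {otu for otu, n in rep_a.items() if n > 0}
    present_b = {otu for otu, n in rep_b.items() if n > 0}
    shared = present_a & present_b
    union = present_a | present_b
    total = sum(rep_a.values()) + sum(rep_b.values())
    if not union or total == 0:
        return 0.0, 0.0
    unweighted = len(shared) / len(union)
    weighted = sum(rep_a[o] + rep_b[o] for o in shared) / total
    return unweighted, weighted


def drop_singletons(rep):
    """Remove OTUs represented by a single sequence in this replicate."""
    return {otu: n for otu, n in rep.items() if n > 1}


# Toy replicates: two abundant shared OTUs plus one singleton unique to each.
rep_a = {"otu1": 50, "otu2": 30, "otu3": 1}
rep_b = {"otu1": 45, "otu2": 35, "otu4": 1}

print(otu_overlap(rep_a, rep_b))
print(otu_overlap(drop_singletons(rep_a), drop_singletons(rep_b)))
```

In this toy case the rare, unshared OTUs halve the unweighted overlap while barely moving the weighted one, which is the same qualitative pattern the abstract reports.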
Fig 2. Overlap of OTUs generated using Uclust between or among technical replicates at different sequencing depths.
At a sequencing depth of about 30,000 reads, the overlap for both two and three replicates approached a plateau when singletons were not removed.
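Comparing overlap across sequencing depths requires bringing each library to a common number of reads. A minimal sketch of one common way to do that, random subsampling without replacement (rarefaction), is shown below. The authors' exact procedure is not described on this page, so the function name, the seed handling, and the toy library are all assumptions.

```python
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Subsample an OTU count table to `depth` reads without replacement.

    counts: dict mapping OTU id -> sequence count.
    Returns a new dict holding only the OTUs that survive subsampling.
    """
    # Expand the table into one entry per read, then draw `depth` reads.
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the library")
    rng = random.Random(seed)
    return dict(Counter(rng.sample(pool, depth)))


# Hypothetical library of 1,000 reads, rarefied to 100 reads.
library = {"otu1": 700, "otu2": 250, "otu3": 45, "otu4": 5}
subsample = rarefy(library, depth=100, seed=42)
print(sum(subsample.values()), sorted(subsample))
```

Repeating the subsampling at increasing depths and recomputing overlap at each depth would give a curve of the kind summarized in this figure.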
Three-way ANOVA to assess alpha diversities at different levels for Experiment I.
| Source of variation | Df | Shannon H', with singletons, F | Shannon H', with singletons, P | Shannon H', singletons removed, F | Shannon H', singletons removed, P | Number of OTUs, with singletons, F | Number of OTUs, with singletons, P | Number of OTUs, singletons removed, F | Number of OTUs, singletons removed, P | Pielou J, with singletons, F | Pielou J, with singletons, P | Pielou J, singletons removed, F | Pielou J, singletons removed, P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Technical replicate | 2 | 0.028 | 0.973 | 0.014 | 0.986 | 0.232 | 0.794 | 0.132 | 0.877 | 0.017 | 0.983 | 0.005 | 0.995 |
| Location | 2 | 1.913 | 0.159 | 1.470 | 0.240 | 13.976 | 1.7E-05 | 11.273 | 9.7E-05 | 0.698 | 0.502 | 1.439 | 0.247 |
| Treatment | 1 | 14.421 | 4.1E-04 | 13.665 | 0.001 | 14.164 | 4.6E-04 | 10.889 | 0.002 | 14.081 | 4.7E-04 | 18.907 | 7.1E-05 |
| Error | 48 | | | | | | | | | | | | |
a The model used for the three-way ANOVA: $(\alpha\ \text{diversity})_{ijk} = (\text{technical replicate})_i + (\text{location})_j + (\text{treatment})_k + (\text{error})_{ijk}$.
b The Shannon entropy index, number of OTUs, and the Pielou evenness were used to measure the alpha diversity for each tagged PCR library.
c F-value
d P-value (>F)
* p<0.05
** p<0.01
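For readers less familiar with the three alpha-diversity measures in this table, here is a small sketch of how they are conventionally computed from one library's OTU counts: richness $S$ is the number of observed OTUs, Shannon entropy is $H' = -\sum_i p_i \ln p_i$, and Pielou evenness is $J = H' / \ln S$. The use of the natural logarithm and the function name are assumptions; the page does not state which log base the authors used.

```python
import math


def alpha_diversity(counts):
    """Return (richness S, Shannon H', Pielou J) for one library.

    counts: iterable of per-OTU sequence counts.
    Natural log is assumed; J is reported as 0.0 when S <= 1.
    """
    observed = [n for n in counts if n > 0]
    total = sum(observed)
    richness = len(observed)
    if richness == 0:
        return 0, 0.0, 0.0
    shannon = -sum((n / total) * math.log(n / total) for n in observed)
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    return richness, shannon, evenness


# A perfectly even toy community versus a strongly skewed one.
print(alpha_diversity([10, 10, 10, 10]))
print(alpha_diversity([97, 1, 1, 1]))
```

Both toy communities have the same richness, but the skewed one has much lower H' and J, which is why the table reports the three measures separately.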
One-way ANOVA and Duncan grouping to assess β-diversity at different levels based on OTUs from Experiment I.
| Data size (n) | Sorensen β diversity, with singletons | Significance | Bray-Curtis β diversity, with singletons | Significance | Sorensen β diversity, singletons removed | Significance | Bray-Curtis β diversity, singletons removed | Significance |
|---|---|---|---|---|---|---|---|---|
| 54 | 0.499 | e | 0.269 | d | 0.466 | e | 0.258 | d |
| 162 | 0.528 | d | 0.374 | c | 0.497 | d | 0.364 | c |
| 243 | 0.561 | c | 0.422 | b | 0.531 | c | 0.413 | b |
| 243 | 0.617 | b | 0.505 | a | 0.594 | b | 0.500 | a |
| 243 | 0.637 | a | 0.523 | a | 0.609 | a | 0.514 | a |
a Data sizes (n) are the number of data points of the pairwise comparisons within the technical replicates, biological replicates, or treatments.
b We calculated two popular β-diversity dissimilarity measures, Sorensen and Bray-Curtis. Sorensen dissimilarity is based on OTU richness, whereas Bray-Curtis dissimilarity takes OTU abundance into account.
c Significance at [Pr(>F)] < 0.05, using the Duncan grouping method. The letters a, b, c, d and e represent the significance of the β-diversity differences between technical replicates, biological replicates, and the treatments. The letter ‘a’ marks the highest β-diversity and any value that does not differ significantly from it; β-diversities that are significantly lower than the highest are marked b, c, d, or e.
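The two dissimilarities in this table can be sketched with their standard textbook definitions: Sorensen dissimilarity is $1 - 2|A \cap B| / (|A| + |B|)$ on OTU presence or absence, and Bray-Curtis dissimilarity is $\sum_i |a_i - b_i| / \sum_i (a_i + b_i)$ on counts. The function names and toy libraries below are hypothetical, and any normalization the authors applied before computing distances is not covered here.

```python
def sorensen_dissimilarity(a, b):
    """Presence/absence dissimilarity: 1 - 2*shared / (S_a + S_b)."""
    otus_a = {otu for otu, n in a.items() if n > 0}
    otus_b = {otu for otu, n in b.items() if n > 0}
    if not otus_a and not otus_b:
        return 0.0
    return 1 - 2 * len(otus_a & otus_b) / (len(otus_a) + len(otus_b))


def bray_curtis_dissimilarity(a, b):
    """Abundance-based dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 0.0
    diff = sum(abs(a.get(o, 0) - b.get(o, 0)) for o in set(a) | set(b))
    return diff / total


# Two toy libraries (dict: OTU id -> sequence count).
lib_a = {"x": 3, "y": 1}
lib_b = {"x": 1, "z": 1}
print(sorensen_dissimilarity(lib_a, lib_b))     # 0.5
print(bray_curtis_dissimilarity(lib_a, lib_b))  # 4/6
```

As a check on the data sizes, 18 samples sequenced in triplicate give 18 × 3 = 54 within-sample pairs, which matches the first row of the table.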
Fig 3. Detrended correspondence analysis of the microbial communities from Experiment I, with singletons not removed.
Samples were from three experiment locations (H, Hailun; F, Fengqiu; Y, Yingtan) with two treatments at each location: planted (P) and unplanted (C, control), and three field replicates for each treatment. Each soil was tagged three times to create technical replicates. The three technical replicates of each soil were sequenced in the same MiSeq run. Singletons were not removed.