Ilia Korvigo, Anna A Igolkina, Arina A Kichko, Tatiana Aksenova, Evgeny E Andronov.
Abstract
High-throughput sequencing of amplicon libraries is the most widespread and one of the most effective ways to study the taxonomic structure of microbial communities, despite the growing accessibility of whole-metagenome sequencing. Owing to targeted amplification, the method provides unparalleled resolution of communities, but it also perturbs the initial community structure, thereby reducing data robustness and compromising downstream analyses. Experimental research on these perturbations is largely limited to comparative studies of different PCR protocols and does not consider other sources of experimental variation related to the characteristics of the initial microbial composition itself. Here we analyse these sources and demonstrate how dramatically they affect the relative abundances of taxa during the PCR cycles. We developed a mathematical model of PCR amplification that assumes heterogeneous amplification efficiencies and accounts for the compositional nature of the data. We designed an experiment with five consecutive amplicon cycles (22-26) and 12 replicates for one real human stool microbial sample, and estimated the dynamics of the microbial community in line with the model. We found high heterogeneity in the amplicon efficiencies of taxa, which leads to non-linear and substantial (up to fivefold) changes in relative abundances during PCR. The analysis of possible sources of heterogeneity revealed a significant association between amplicon efficiencies and the energy of secondary structures of the DNA templates. Our results highlight non-trivial changes in the dynamics of real-life microbial communities due to their compositional nature. The observed effects apply not only to amplicon libraries but also to any study of metagenome dynamics.
© 2022 Korvigo et al.
Keywords: Amplification bias; Bayesian inference; Compositional data analysis; High-throughput sequencing; Microbiome; PCR
Year: 2022 PMID: 36061756 PMCID: PMC9438772 DOI: 10.7717/peerj.13888
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 3.061
Figure 1. Dynamics of the community over 5 cycles.
(A) Shift of the microbial taxonomic structure as the number of cycles increases. (B) A comparison between high-level temporal community dynamics in experimental and inferred data. Inferred dynamics appear to be a smoothed version of the observed variations. In both cases we observe a rapid expansion of Bacteroides at the expense of Firmicutes. Relative group abundances are estimated from 500 samples from the posterior-predictive distribution.
Figure 2. Estimates of dynamics.
(A) Inferred high-level community dynamics at cycles 0–35. Community composition at cycle 0 corresponds to the approximated initial community profile. PCR bias resulted in a dramatic overrepresentation of Bacteroides and a corresponding shrinkage of Firmicutes. Relative group abundances are estimated from 500 samples from the posterior-predictive distribution. (B) Estimated distributions of log-ratios between relative abundances of individual amplicon sequence variants (ASVs) inferred at cycle 0 (that is, the approximated initial template proportions) and cycle 35. Vertical lines inside the boxes represent medians. Boxes represent interquartile ranges. Log-ratios were estimated from 500 samples from the posterior-predictive distribution. Most relative abundances changed by a factor of 2–8. The plot contains the subset of amplicon sequence variants classified at the genus level. Numbers in parentheses are used to disambiguate ASVs classified into the same genus.
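The kind of cycle 0 versus cycle 35 log-ratio described in this caption can be illustrated with a minimal sketch. It assumes an idealised deterministic model in which each template is multiplied by one plus its own constant efficiency at every cycle. This is a simplification for illustration, not the Bayesian model used in the study. The function name `relative_abundances` and all numeric values are hypothetical.

```python
import numpy as np

def relative_abundances(p0, efficiency, n_cycles):
    """Relative abundances at cycles 0..n_cycles under idealised exponential PCR.

    Assumption (not from the study): template i is multiplied by
    (1 + efficiency[i]) at every cycle, with a constant efficiency in [0, 1].
    Returns an array of shape (n_cycles + 1, n_taxa) whose rows sum to 1.
    """
    p0 = np.asarray(p0, dtype=float)
    growth = 1.0 + np.asarray(efficiency, dtype=float)
    cycles = np.arange(n_cycles + 1)[:, None]
    # Work in log space so that large cycle numbers do not overflow.
    log_abs = np.log(p0) + cycles * np.log(growth)
    log_abs -= log_abs.max(axis=1, keepdims=True)
    absolute = np.exp(log_abs)
    # Closure: only proportions are observable in compositional data.
    return absolute / absolute.sum(axis=1, keepdims=True)

# Made-up illustration values, not data from the study.
p0 = [0.70, 0.25, 0.05]
efficiency = [0.85, 0.90, 0.95]

traj = relative_abundances(p0, efficiency, n_cycles=35)
log2_ratio = np.log2(traj[35] / traj[0])

print(np.round(traj[35], 3))    # composition after 35 cycles
print(np.round(log2_ratio, 2))  # log2 fold change per taxon, cycle 35 vs cycle 0
```

In this sketch, even efficiency differences of a few percentage points compound over 35 cycles, so the least efficient taxon loses relative abundance while the most efficient one gains it.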
Figure 3. Non-linear changes of abundances.
(A) Predicted relative abundance dynamics of five major and minor amplicon sequence variants smoothed by the local polynomial regression method. Grey boundaries denote confidence intervals. (B) Association between amplicon efficiencies and the energy of secondary structures.
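Why relative abundances can change non-linearly even when each template amplifies at a steady rate can be sketched as follows. The example assumes constant per-cycle efficiencies and uses made-up values. It is an illustration of the compositional effect, not a reproduction of the study's model or data. Under these assumptions the proportion of a taxon with intermediate efficiency first rises and then falls, while the pairwise log-ratio between two taxa changes by the same amount every cycle.

```python
import numpy as np

# Hypothetical illustration values, not data from the study.
p0 = np.array([0.60, 0.30, 0.10])          # assumed initial proportions
efficiency = np.array([0.70, 0.90, 0.98])  # assumed constant per-cycle efficiencies
cycles = np.arange(0, 36)

# Idealised model: template i is multiplied by (1 + efficiency[i]) per cycle.
log_abs = np.log(p0) + cycles[:, None] * np.log1p(efficiency)
log_abs -= log_abs.max(axis=1, keepdims=True)  # stabilise before exponentiating
props = np.exp(log_abs)
props /= props.sum(axis=1, keepdims=True)      # closure to proportions

# Proportion of the intermediate taxon: non-monotonic in the cycle number.
middle = props[:, 1]
peak_cycle = int(np.argmax(middle))

# Log-ratio of taxon 1 to taxon 0: a straight line in the cycle number,
# with slope log((1 + e_1) / (1 + e_0)).
log_ratio = np.log(props[:, 1] / props[:, 0])
slopes = np.diff(log_ratio)

print(peak_cycle)
print(np.round(slopes[:3], 4))
```

The design choice here is to compare proportions with log-ratios: the former depend on every other taxon in the sample, whereas the latter depend only on the two taxa being compared, which is one reason log-ratio methods are commonly used for compositional data.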