Léa Siegwald, Ségolène Caboche, Gaël Even, Eric Viscogliosi, Christophe Audebert, Magali Chabé.
Abstract
Targeted metagenomics is the solution of choice to reveal differential microbial profiles (defined by richness, diversity and composition) as part of case-control studies. It is well documented that each data processing step can introduce bias in the results. However, selecting a bioinformatics pipeline to analyze high-throughput sequencing data from A to Z remains one of the critical considerations in a case-control microbiota study design. Consequently, the aim of this study was to assess whether the same biological conclusions regarding human gut microbiota composition and diversity could be reached using different bioinformatics pipelines. In this work, we considered four pipelines (mothur, QIIME, kraken and CLARK) with different versions and databases, and examined their impact on the outcome of metagenetic analysis of Ion Torrent 16S sequencing data. We re-analyzed a case-control study evaluating the impact of colonization by the intestinal protozoan Blastocystis sp. on the human gut microbial profile. Although most pipelines reported the same trends in this case-control study, we demonstrated how the use of different pipelines affects the biological conclusions that can be drawn. Targeted metagenomics must therefore be considered a profiling tool that gives a broad sense of the variations of the microbiota, rather than an accurate identification tool.
Keywords: 16S targeted metagenomics; bioinformatics pipelines; case-control study; human gut microbiota; metagenetics
Year: 2019 PMID: 31561435 PMCID: PMC6843237 DOI: 10.3390/microorganisms7100393
Source DB: PubMed Journal: Microorganisms ISSN: 2076-2607
Figure 1. Boxplots of Chao1 indices at the family level for the two groups of patients (Blastocystis-colonized and Blastocystis-free), for all pipelines. The difference between groups was tested using a Mann-Whitney-Wilcoxon (MWW) test [18,24].
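To make the two quantities in the Figure 1 caption concrete, the following is a minimal sketch of a Chao1 richness estimate and of the MWW U statistic. It is not the code used in the study: the function names `chao1` and `mann_whitney_u` are hypothetical, the bias-corrected form of Chao1 is assumed (individual pipelines may implement a different variant), and only the U statistic is computed, not its p-value.

```python
def chao1(counts):
    """Bias-corrected Chao1 richness estimate from per-taxon read counts.

    Assumption: counts holds one non-negative integer per taxon
    (here, per family) for a single sample.
    """
    observed = [c for c in counts if c > 0]
    s_obs = len(observed)                      # taxa actually observed
    f1 = sum(1 for c in observed if c == 1)    # singletons
    f2 = sum(1 for c in observed if c == 2)    # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def mann_whitney_u(a, b):
    """U statistic of sample a against sample b.

    Counts the pairs (x, y) with x > y; a tie counts as one half.
    """
    return sum((x > y) + 0.5 * (x == y) for x in a for y in b)
```

In a comparison like the one in the caption, `chao1` would be applied to each patient's family-level counts, and the two resulting lists of indices (colonized and Blastocystis-free) would be passed to `mann_whitney_u` or to a statistics library that also returns a p-value.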
Figure 2. Proportions of reads assigned by each pipeline at different taxonomic levels (i.e., order, family and genus).
Figure 3. Families whose mean proportions differ significantly between the two groups. The difference between groups was tested by reproducing the secondary analysis of the original study: a non-parametric Student test with a Benjamini-Hochberg correction. Only families with a p-value < 0.05 and an effect size above 1% are shown. The value on each bar is the difference of means between the two groups.
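The selection rule described in the Figure 3 caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis: the function names and the `family_*` labels used with them are hypothetical, the 0.05 threshold is assumed to apply to the Benjamini-Hochberg adjusted p-value, and the effect size is assumed to be the absolute difference of mean proportions expressed in percentage points. The per-family p-values are taken as given; the test that produces them is not implemented here.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_families(names, p_values, mean_case, mean_control,
                    alpha=0.05, min_effect=1.0):
    """Keep families passing both the adjusted p-value and effect-size cut.

    mean_case and mean_control are mean proportions in percent;
    min_effect is in percentage points (assumed reading of "above 1%").
    """
    adjusted = benjamini_hochberg(p_values)
    selected = []
    for name, q, case, control in zip(names, adjusted, mean_case, mean_control):
        diff = case - control  # difference of means, as shown on each bar
        if q < alpha and abs(diff) > min_effect:
            selected.append((name, diff))
    return selected
```

Applying both filters together means that a family with a small adjusted p-value but a negligible difference of means is not reported, which matches the caption's requirement that both conditions hold.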