
Hypothesis Testing and Statistical Analysis of Microbiome.

Abstract

Since the launch of the Human Microbiome Project in 2008, various biostatistical and bioinformatic tools for data analysis and computational methods have been developed and applied to microbiome studies. In this review and perspective, we discuss the research and statistical hypotheses in gut microbiome studies, focusing on the mechanistic concepts that underlie the complex relationships among host, microbiome, and environment. We review the currently available statistical tools and highlight recent progress in newly developed statistical methods and models. Given the current challenges and limitations of biostatistical approaches and tools, we discuss future directions for developing statistical methods and models for microbiome studies.


Keywords: IBD; Vitamin D receptor; bioinformatics; biostatistics; cancer; diet; dysbiosis; hypothesis testing; inflammation; microbiome; obesity; statistical methods and models

Year: 2017 PMID: 30197908 PMCID: PMC6128532 DOI: 10.1016/j.gendis.2017.06.001

Source DB: PubMed Journal: Genes Dis ISSN: 2352-3042

Introduction

The gut microbiome plays fundamental roles in human health. It can be considered a newly identified organ that interacts with other organs and influences the development of disease.1, 2 The Human Microbiome Project (HMP) was initially funded in 2008 by the National Institutes of Health Roadmap for Biomedical Research and was constructed as a large, genome-scale community research project. The HMP requires data analysis, the development of computational methods, and the public availability of tools and data. The goal of human gut microbiome studies is to understand not only the composition of the microbial community, but also the dynamic interactions among microbiome, host, environment, and disease intervention. Microbiome studies require a multi-disciplinary team effort involving basic, translational, and clinical investigators. The next phase of research on the gut microbiome should be guided by specific biological questions relevant to the clinical aspects and natural history of disease, utilizing the full spectrum of ‘omic’ technologies, bioinformatic analysis, and experimental models. However, the roles of biostatisticians and of bioinformatic and biostatistical methods in gut microbiome studies are underestimated, and the appropriate use of biostatistical tests in particular is largely ignored. Here, we discuss statistical hypothesis tests for both community composition and microbiome-host interactions. We review the utility of various statistical approaches for assessing the diversity of microbiome communities and for analyzing and modeling the association between the community composition of the microbiome and the host. We summarize the currently available statistical tools for microbiome studies and highlight recent progress in new statistical methods and models. In doing so, we provide specific examples of these methods and discuss how to apply them appropriately in microbiome studies. We also raise the limitations and daunting challenges that must be overcome to move the field forward, and we discuss the development of statistical methods, their limits, and future directions.

Research and statistical hypotheses in human microbiome studies

Current microbiome studies have two main themes: 1) to characterize the relationship between microbiome features and biological, genetic, clinical, or experimental conditions; and 2) to identify potential biological and environmental factors that are associated with microbiome composition. The goal of these studies is to understand the mechanisms by which host genetic and environmental factors shape our microbiome. Insights gained from these studies may contribute to the development of therapeutic strategies that modulate microbiome composition in human diseases.6, 7 Dynamic interactions exist among environment, microbiome, and host (Fig. 1). To study the complicated interactions among these factors, three general research hypotheses have been developed and used in the field. Hypothesis 1 tests the association between environment and host. It has no features specific to microbiome research, so it can be tested with the standard statistical methods and models commonly used in other biomedical sciences. Microbiome studies focus on research hypotheses 2 and 3.

Figure 1

Dynamic Interactions among environment, microbiome and host for the research hypotheses in microbiome studies.

Research hypothesis 2 tests the association between microbiome and host: whether the composition of the microbiome, or a “dysbiotic” microbiome, is linked to the health or disease of the host. For example, in inflammatory bowel disease (IBD) research,8, 9 dysbiosis is associated with the progression of the disease. Lack of the vitamin D receptor (VDR) causes dysbiosis and changes the functions of the murine intestinal microbiome. An altered bacterial community is associated with different intestinal epithelial VDR status.

Research hypothesis 3 tests whether the microbiome is associated with environmental or biological covariates, what impact environmental factors have on the microbiome, or whether an intervention affects a specific microbiome composition (diversity) in health and disease. Examples include testing whether dietary interventions shape the gut microbiota8, 14 and testing the impact of a probiotic intervention on the composition of the human microbiota. Longitudinal studies have tested the effects of antibiotics and diet on gut microbial community structure, analyzed whether nutrition influences gut microbiome composition at the level of bacterial species, or hypothesized that antibiotic treatments affect the diversity of strains of gut bacteria. In a recent paper, Bokulich et al showed that antibiotic exposure and delivery mode alter bacterial diversity and delay microbiota maturation, and that infant diet affects the diversity of the intestinal microbiome.

Statisticians usually derive their statistical hypotheses from the research hypotheses. The null statistical hypothesis is then stated as “there is no difference in microbiome composition between health and disease (or between experimental groups or genetic conditions)” or “there is no difference (change) in microbiome composition across environmental factors (or interventions)”. Although these statistical hypotheses share the core theme of exploring the impact of environmental or external factors (e.g. interventions) on the composition and/or richness of the microbiota, they can focus on various quantities, including alpha diversity (species diversity in each individual sample), bacterial richness, the total number of unique operational taxonomic units (OTUs), phylogenetic diversity (the relative amount of diverse phylogenetic lineages), and species evenness in each sample. The statistical hypothesis can concern alpha diversity. For example, in antibiotic studies, the null hypothesis is that antibiotic treatment does not decrease microbial diversity, that is, antibiotic-treated children have an equally diverse gut microbiota; the alternative is that antibiotic treatment decreases microbial diversity,13, 18, 19, 20, 21 so that antibiotic-treated children have a less diverse gut microbiota. The statistical hypothesis can also concern beta diversity, measured for example by the Jaccard index of species or strains or by the UniFrac phylogenetic distance. It can even concern the temporal microbiome community. For example, we can hypothesize that all strains are similar and the microbiome community is stable (does not change over time), or that, compared with non-users, antibiotic treatment makes the strains less similar and the community less stable.
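The diversity quantities named in these hypotheses can be made concrete with a small sketch. The Python below is illustrative only: the function names and the two toy count vectors (`control`, `treated`) are assumptions introduced here, not data or code from any cited study. It computes richness, the Shannon index, and Pielou's evenness (alpha diversity), and the Jaccard distance on presence/absence (beta diversity).

```python
import math


def richness(counts):
    """Number of taxa observed at least once in the sample."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i) over the observed taxa."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def pielou_evenness(counts):
    """Shannon index divided by its maximum, ln(richness)."""
    s = richness(counts)
    return shannon(counts) / math.log(s) if s > 1 else 0.0


def jaccard_distance(a, b):
    """1 - |shared taxa| / |taxa present in either sample|."""
    present_a = {i for i, c in enumerate(a) if c > 0}
    present_b = {i for i, c in enumerate(b) if c > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1 - len(present_a & present_b) / len(union)


# Hypothetical OTU counts for two samples over the same five taxa.
control = [25, 25, 25, 25, 0]
treated = [97, 1, 1, 1, 0]

for name, sample in (("control", control), ("treated", treated)):
    print(name, richness(sample), round(shannon(sample), 3),
          round(pielou_evenness(sample), 3))
print("Jaccard distance:", jaccard_distance(control, treated))
```

In this toy case both samples contain the same four taxa, so richness is equal and the Jaccard distance is zero, yet the Shannon index and evenness of `treated` are much lower. A null hypothesis stated on richness or on a presence/absence distance would therefore not detect a change that a hypothesis stated on Shannon diversity would.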

Statistical methods and models for microbiome studies

An appropriate statistical method is needed to test a scientific hypothesis. In this section, we review the classical statistical tests, multivariate statistical tools, and some newly developed models and methods for analyzing microbiome data.

Classic statistical tests

Many classic statistical tests are available for analyzing the gut microbiome. Hypothesis testing on microbial taxa can be conducted by comparing alpha and beta diversity indices. Depending on whether the data are normally distributed, on the number of experimental groups, and on the experimental conditions, we can use a t-test, analysis of variance, or the corresponding non-parametric test. The standard t-test has been used to compare alpha diversity between two groups or population abundance23, 24, 25, 26 between two sets of relative abundance data. It has even been used to compare the relative abundances of different phyla and genera between healthy volunteers and colorectal cancer (CRC) patients in a gut microbiota study of CRC.27, 28 Its non-parametric analogue, the Wilcoxon rank-sum test (also called the Mann–Whitney test), has been used to compare alpha diversity, such as Shannon diversity, across groups, for example between two clusters defined by bacterial taxonomic composition. It has also been used to identify statistically significant differences in microbial taxa or OTUs and in other non-parametric measures, or in the relative abundances of different phyla and genera. In summary, the two-sample t-test and its non-parametric counterpart, the Wilcoxon rank-sum test, are widely used in microbiome studies to compare continuous variables between two groups. When comparing more than two groups, the one-way analysis of variance (ANOVA) or its non-parametric equivalent, the Kruskal–Wallis test, is appropriate, depending on whether the variables are normally distributed. ANOVA has been used to analyze taxonomic diversity data, including intergroup and intragroup beta diversity, to compare proportional abundance,23, 31 and to assess the performance of risk models of the gut microbiome on BMI or lipids24, 31 and taxonomic and functional-specific biases. ANOVA has also been used to compare the functional capacity of the microbiome among intestinal locations. The Kruskal–Wallis test has been used to compare normalized z-scores of the bacterial and fungal proportions across samples, and for microbiome data with unequal variances. When comparing categorical microbiome data, for example testing whether a single a priori specified taxon is present at different rates across groups, a chi-square test is usually used. Clearly, the classical statistical methods are still widely used in gut microbiome studies and will continue to be. However, the statistical tests and methods must be chosen carefully to suit the microbiome data being analyzed.
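As a hedged illustration of the two-group comparison described in this section, the sketch below implements an exact two-sided Wilcoxon rank-sum test by enumerating every way of assigning the pooled ranks to the first group. The function names and the Shannon diversity values are hypothetical, and the enumeration is practical only for small samples. In real analyses an established statistical library would normally be used instead.

```python
from itertools import combinations


def midranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_exact(x, y):
    """Return (rank sum of x, exact two-sided p-value)."""
    ranks = midranks(list(x) + list(y))
    n, total = len(x), len(x) + len(y)
    expected = n * (total + 1) / 2          # rank sum of x under the null
    observed = sum(ranks[:n])
    extreme = count = 0
    for subset in combinations(range(total), n):
        w = sum(ranks[i] for i in subset)
        count += 1
        # Count labelings at least as far from the null expectation.
        if abs(w - expected) >= abs(observed - expected) - 1e-12:
            extreme += 1
    return observed, extreme / count


# Hypothetical Shannon diversity values for two groups of five subjects.
healthy = [3.1, 3.4, 3.3, 3.6, 3.2]
disease = [2.1, 2.4, 2.0, 2.6, 2.3]

w_stat, p_value = rank_sum_exact(healthy, disease)
print("rank sum:", w_stat, "exact p:", round(p_value, 4))
```

Because the test uses only ranks, it makes no normality assumption, which is the reason given above for preferring it to the t-test when diversity indices or relative abundances are not normally distributed.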

Overdispersion and zero-inflated models in microbiome studies

Taxa count data in microbiome studies, such as taxonomy reads or OTU counts from amplicon sequencing experiments, or differential expression data from RNA-Seq experiments, are often overdispersed and contain many zeros. To fit microbiome count data with overdispersion and excess zeros, negative binomial (NB) and zero-inflated models are typically applied. For example, an NB model was fitted to microbiome abundance data to analyze the gut microbiome in Parkinson's disease. An NB model has also been used to test for differences in sequence tag abundance and to detect differentially abundant features in clinical metagenomic samples. The abundance of bacteria in the human gut is right-skewed and is characterized by an increasing number of zeros at lower taxonomic levels. To capture the excess zeros and model the skewed microbiome data, a zero-inflated model, such as the Zero-Inflated Poisson (ZIP), the Zero-Inflated Negative Binomial (ZINB), or a hurdle model, needs to be chosen. The appropriateness of zero-inflated models in gut microbiome studies was assessed by extensive simulations and by application to a real human microbiome study. More recently, for the same reason, Wang et al used a hurdle model with a negative binomial distribution to analyze bacterial species (OTUs at a 97% similarity threshold). To identify the environmental or biological covariates associated with different bacterial taxa while accounting for overdispersion and many zeros, Xia et al proposed an additive logistic normal multinomial regression model that links covariates to bacterial composition (counts) and applied it to a study of the association between diet and stool microbiome composition.

Motivated by the strong correlation observed between the number of OTUs detected in a sample and the corresponding sequencing depth in high-throughput 16S rDNA studies, Paulson et al proposed the zero-inflated Gaussian (ZIG) mixture model in 2013. The approach has two components: a novel cumulative sum scaling (CSS) normalization technique, which corrects the bias in the assessment of differential abundance introduced by total-sum scaling (TSS) normalization, and a zero-inflated Gaussian mixture model, which accounts for biases in differential abundance testing that result from under-sampling of the microbial community. The model seeks to estimate directly the probability that an observed zero is generated from the detection distribution (due to under-sampling) or from the count distribution (absence of the taxonomic feature in the microbial community). ZIG is implemented in the metagenomeSeq Bioconductor package. The authors evaluated metagenomeSeq using simulated data and compared ZIG with existing tools, such as the Kruskal–Wallis test, using oral microbiota data from the Human Microbiome Project. They concluded that ZIG outperforms approaches widely used in the field and yields a more precise biological interpretation of the data. ZIG methods may have broader applicability to other differential abundance analyses, such as those of the gut microbiome. However, the methods still need to be evaluated in a sufficient number of studies.
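To show what overdispersion and zero inflation mean numerically, here is a minimal sketch of the NB and ZINB probability mass functions. It assumes the common mean/dispersion parameterisation in which the NB variance is mu + mu^2/size. The parameter names (`mu`, `size`, `pi_zero`) and values are illustrative choices, not estimates from any study cited here.

```python
import math


def nb_pmf(k, mu, size):
    """Negative binomial P(X = k) with mean mu and dispersion parameter size."""
    log_p = (math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
             + size * math.log(size / (size + mu))
             + k * math.log(mu / (size + mu)))
    return math.exp(log_p)


def zinb_pmf(k, pi_zero, mu, size):
    """Zero-inflated NB: a structural zero with probability pi_zero,
    otherwise an ordinary NB count."""
    base = (1 - pi_zero) * nb_pmf(k, mu, size)
    return pi_zero + base if k == 0 else base


def moments(pmf, upper=5000):
    """Mean and variance by direct summation over 0 .. upper-1."""
    mean = sum(k * pmf(k) for k in range(upper))
    var = sum((k - mean) ** 2 * pmf(k) for k in range(upper))
    return mean, var


mu, size, pi_zero = 20.0, 0.5, 0.3   # illustrative values only
nb_mean, nb_var = moments(lambda k: nb_pmf(k, mu, size))
zinb_mean, zinb_var = moments(lambda k: zinb_pmf(k, pi_zero, mu, size))

print("NB   mean, variance, P(0):", nb_mean, nb_var, nb_pmf(0, mu, size))
print("ZINB mean, variance, P(0):", zinb_mean, zinb_var,
      zinb_pmf(0, pi_zero, mu, size))
```

A Poisson model with the same mean would have variance equal to the mean, whereas the NB variance here is far larger, and the zero-inflation component raises the probability of a zero count beyond what the NB alone allows.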

Multivariate statistical tools in microbiome studies

Microbiome communities in an environmental context can be analyzed by multivariate statistical methods or models. Many statistical models and methods exist for analyzing the association of microbiome community composition with environmental covariates and outcomes. Most multivariate statistical tools used in microbiome studies are adopted from the well-developed fields of ecological research and environmental science. Because of the high dimensionality, non-normality, and phylogenetic structure of the data, it is difficult to test the association of microbiome composition with potential environmental factors directly using OTU or taxa abundances. Thus, multivariate analyses generally first choose a distance measure, defined between any two microbiome samples, and then analyze the estimated distances. Several tests of among-group differences are available for analyzing microbiome data: permutational multivariate analysis of variance (PERMANOVA), analysis of similarities (ANOSIM), multi-response permutation procedures (MRPP), and Mantel's test. PERMANOVA was proposed by Anderson and McArdle to apply the powerful ANOVA to multivariate ecological datasets.43, 44 It is one of the most widely used non-parametric methods for fitting multivariate models to microbiome data. It is a multivariate analysis of variance based on distance matrices and permutation. Like MRPP and other multivariate analyses, PERMANOVA is generally used with a chosen distance measure. For example, PERMANOVA with the unweighted UniFrac distance has been used to compare the composition of the gut microbiota in omnivores versus vegans, to assess associations with beta diversity measures, and to test for microbial divergence among populations; it has also been used with the Bray–Curtis dissimilarity matrix.48, 49 Like PERMANOVA, ANOSIM is one of the most widely used multivariate methods in microbiome studies. ANOSIM compares within- and between-group similarity through a distance measure and tests the null hypothesis that the average rank similarity between samples within a group is the same as the average rank similarity between samples belonging to different groups. For example, Kelly et al used weighted and unweighted UniFrac distances to test the strength of association with microbiome composition between treatments and among time points within treatments. In the microbiome literature, MRPP on the pairwise weighted UniFrac distance matrix has been used to confirm the significance of clustering and to test the factors influencing microbial communities, and MRPP with Bray–Curtis distances has been used to compare community dissimilarities. Like a correlation analysis, Mantel's test is used to test the association between environmental factors and the host microbiome. For example, it has been used to test whether microbiome variation explains microbiome variation in the host, to test the association between host genetic distance and the variance in community beta-diversity or between donor microbiome and BMI, and even to identify predictors of microbiome composition.
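The distance-then-permutation logic of PERMANOVA can be sketched in a few lines. The code below is a simplified one-factor version written for illustration: it computes Bray–Curtis dissimilarities, forms the pseudo-F statistic from sums of squared distances, and obtains a p-value by shuffling group labels. The function names, the sample counts, and the group labels are hypothetical, and established implementations handle covariates, strata, and other details that are omitted here.

```python
import random


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two count vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0


def pseudo_f(dist, labels):
    """One-way PERMANOVA pseudo-F computed from a distance matrix."""
    n = len(labels)
    groups = sorted(set(labels))
    ss_total = sum(dist[i][j] ** 2
                   for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in groups:
        members = [i for i in range(n) if labels[i] == g]
        ss_within += sum(
            dist[i][j] ** 2
            for pos, i in enumerate(members)
            for j in members[pos + 1:]
        ) / len(members)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))


def permanova(samples, labels, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    n = len(samples)
    dist = [[bray_curtis(samples[i], samples[j]) for j in range(n)]
            for i in range(n)]
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical taxon counts for five omnivore and five vegan samples.
omnivore = [[40, 30, 20, 10], [42, 28, 19, 11], [38, 33, 18, 11],
            [41, 29, 22, 8], [39, 31, 21, 9]]
vegan = [[10, 20, 30, 40], [12, 18, 29, 41], [9, 22, 31, 38],
         [11, 19, 33, 37], [8, 21, 28, 43]]
labels = ["omnivore"] * 5 + ["vegan"] * 5

f_stat, p_value = permanova(omnivore + vegan, labels)
print("pseudo-F:", round(f_stat, 2), "p-value:", p_value)
```

Swapping `bray_curtis` for another distance, such as a UniFrac variant, changes which kind of community difference the test is sensitive to without changing the permutation machinery.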

Newly developed statistical methods

To fit multivariate data, and microbiome data in particular, researchers and statisticians have recently developed several parametric and non-parametric models.

Dirichlet-multinomial model

Among parametric probability models, the multinomial and Dirichlet-multinomial distributions57, 58, 59 are the most popular. For hypothesis testing and power calculations on taxonomy-based human microbiome data, La Rosa and colleagues proposed a multivariate statistical method based on Dirichlet-multinomial mixture models. The authors reparametrized the Dirichlet-multinomial model as Dirichlet-multinomial mixtures to make it suitable for hypothesis testing across groups, based on differences in location (mean comparison) and scale (variance comparison/dispersion). The method is implemented in the R package “HMP”, using data from the NIH Human Microbiome Project (HMP). Its capability to perform power calculations is also attractive to researchers and statisticians when they design microbiome studies and prepare grant applications.
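The following sketch illustrates why the Dirichlet-multinomial is preferred over the plain multinomial for overdispersed taxa counts. It evaluates the Dirichlet-multinomial probability mass function on a deliberately tiny example (6 reads over 3 taxa) so that the mean and variance can be checked by exhaustive enumeration. The function names and the concentration parameters `alpha` are assumptions made for illustration and do not reproduce the HMP package's interface.

```python
import math


def dm_log_pmf(x, alpha):
    """Log Dirichlet-multinomial probability of count vector x."""
    n, a0 = sum(x), sum(alpha)
    out = math.lgamma(n + 1) + math.lgamma(a0) - math.lgamma(n + a0)
    for xj, aj in zip(x, alpha):
        out += math.lgamma(xj + aj) - math.lgamma(aj) - math.lgamma(xj + 1)
    return out


def dm_variance(n, alpha, j):
    """Closed-form variance of taxon j: the multinomial variance
    n*p*(1-p) inflated by the factor (n + a0) / (1 + a0)."""
    a0 = sum(alpha)
    p = alpha[j] / a0
    return n * p * (1 - p) * (n + a0) / (1 + a0)


def compositions(n, k):
    """Yield all k-tuples of non-negative integers that sum to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest


n_reads, alpha = 6, [2.0, 1.0, 0.5]   # illustrative values only
probs = {x: math.exp(dm_log_pmf(x, alpha))
         for x in compositions(n_reads, len(alpha))}

mean0 = sum(x[0] * p for x, p in probs.items())
var0 = sum((x[0] - mean0) ** 2 * p for x, p in probs.items())
p0 = alpha[0] / sum(alpha)
multinomial_var0 = n_reads * p0 * (1 - p0)

print("total probability:", sum(probs.values()))
print("taxon 0 variance, DM vs multinomial:", var0, multinomial_var0)
```

The two models share the same expected proportions, but the Dirichlet-multinomial variance is larger, and that extra dispersion parameter is what allows the scale (dispersion) comparisons mentioned above in addition to location comparisons.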

UniFrac distance metric family

As reviewed above, multivariate analyses that compare microbial communities first need to choose a distance measure. Numerous distance measures have been proposed.61, 62 Among them, phylogenetic distance measures, which account for the phylogenetic relationships among the taxa, are very powerful because they exploit the degree of divergence between different sequences. To capture phylogenetic information when computing differences between microbial communities, Lozupone and Knight proposed the UniFrac distance metric in 2005. UniFrac measures the phylogenetic distance between sets of taxa in a phylogenetic tree. The goal of the UniFrac distance metric was to enable objective comparison between microbiome samples from different conditions. In 2007, Lozupone et al added a proportional weighting to the original UniFrac and distinguished the two versions as unweighted UniFrac and weighted UniFrac.64, 65 Since then, both versions have been available in the microbiome literature and have been applied in thousands of research publications covering almost everything from human disease to general ecology.65, 66 The unweighted UniFrac distance considers only species presence and absence and counts the fraction of branch length unique to either community, whereas the weighted UniFrac distance uses species abundance information and weights each branch length by the abundance difference. Although these two UniFrac distances have become the most widely used phylogenetic distance measures, their limitations have also been noted: they assign too much weight either to rare lineages (unweighted UniFrac) or to the most abundant lineages (weighted UniFrac), and thus may not be very powerful in detecting change in moderately abundant lineages. Building on a variance adjusted weighted UniFrac distance (VAWUniFrac), Chen et al in 2012 developed generalized UniFrac distances, which extend the weighted and unweighted UniFrac distances to detect a much wider range of biologically relevant changes in microbiome composition. Thus, the UniFrac family has expanded from the UniFrac distances to the generalized UniFrac distances. The generalized UniFrac distances were demonstrated with PERMANOVA on two real human microbiome data sets, one linking gut microbiome composition to long-term diet and the other testing for differences in the upper respiratory tract microbiome between smokers and non-smokers. Thus, by combining UniFrac distances with PERMANOVA, the generalized UniFrac distance provides a statistical approach to testing the association between microbiome composition and environmental covariates. Two newly developed tools have been added to the UniFrac toolbox: the micropower R package contributed by Kelly et al and Wong et al's UniFrac R programs. In the micropower package, Kelly et al incorporated unweighted and weighted UniFrac distances into analyses of pairwise distances and PERMANOVA for power and sample-size estimation. In the setting of compositional data analysis, Wong et al introduced two new weightings, information UniFrac and ratio UniFrac, which are less sensitive to rarefaction and allow greater separation of outliers than classic unweighted and weighted UniFrac. The goal is to address the high sensitivity of unweighted UniFrac to the rarefaction instance and to sequencing depth in uniform data sets with no clear structure or separation between groups. To our knowledge, no formal manuals are available for the micropower R package or Wong et al's UniFrac R programs, although the Gregory B Gloor lab hosted a UniFrac workshop on February 29, 2016 to illustrate the use of information UniFrac and ratio UniFrac.
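A toy calculation may help to separate the UniFrac variants described in this section. The sketch below represents a hypothetical four-tip tree as a list of branches, each with a length and the set of tips beneath it. It computes unweighted UniFrac as the fraction of observed branch length leading to only one community, and a generalized UniFrac with exponent `alpha`, following our reading of the Chen et al definition (with `alpha = 1` giving a normalized weighted UniFrac). The tree, the counts, and the function names are all assumptions made for illustration.

```python
# Hypothetical rooted tree: (branch length, tips descending from the branch).
TREE = [
    (1.0, {"t1"}),
    (1.0, {"t2"}),
    (0.5, {"t1", "t2"}),
    (1.0, {"t3"}),
    (1.0, {"t4"}),
    (0.5, {"t3", "t4"}),
]


def branch_proportions(branches, counts):
    """Proportion of a community's reads that descend from each branch."""
    total = sum(counts.values())
    return [sum(counts.get(tip, 0) for tip in tips) / total
            for _, tips in branches]


def unweighted_unifrac(branches, a, b):
    """Branch length unique to one community / branch length seen in either."""
    pa, pb = branch_proportions(branches, a), branch_proportions(branches, b)
    unique = observed = 0.0
    for (length, _), x, y in zip(branches, pa, pb):
        if x > 0 or y > 0:
            observed += length
            if (x > 0) != (y > 0):
                unique += length
    return unique / observed


def generalized_unifrac(branches, a, b, alpha=0.5):
    """Generalized UniFrac; alpha tunes the weight given to abundant lineages."""
    pa, pb = branch_proportions(branches, a), branch_proportions(branches, b)
    num = den = 0.0
    for (length, _), x, y in zip(branches, pa, pb):
        s = x + y
        if s > 0:
            weight = length * s ** alpha
            num += weight * abs(x - y) / s
            den += weight
    return num / den


community_a = {"t1": 10, "t2": 10}
community_b = {"t2": 10, "t3": 10}

print("unweighted:", unweighted_unifrac(TREE, community_a, community_b))
print("alpha=1.0 :", generalized_unifrac(TREE, community_a, community_b, 1.0))
print("alpha=0.5 :", generalized_unifrac(TREE, community_a, community_b, 0.5))
```

Lowering `alpha` from 1 shrinks the influence of the most abundant lineages, which is the mechanism by which the generalized distances aim to recover power for moderately abundant lineages.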

Compositional analysis for microbiome data

As early as 1897, Pearson warned that “spurious correlation” may arise when the ratio of two absolute measurements is used, as in the measurement of organs. Since the second half of the twentieth century, researchers in geology have been aware that using standard statistical approaches to analyze compositional data may make the results uninterpretable. Aitchison, in the 1980s and particularly in his seminal 1986 work,72, 73, 74, 75, 76 realized that every statement about a composition can be stated in terms of ratios of components, and he developed a set of fundamental principles and a variety of methods, operations, and tools for compositional data analysis. Of these, the logratio transformation methodology has been widely accepted by statisticians and researchers in geology, ecology, and other fields,73, 77, 78, 79 because logratio transformations remove the problem of the constrained sample space (the simplex) of compositional data and project the data into multivariate real space. All available standard multivariate techniques can then be used to analyze compositional data. However, although a series of publications have shown that the existing tools for compositional data analysis in geology, ecology, and other fields are readily adapted to, and valid for, microbiome high-throughput sequencing data,76, 77, 78, 79, 80, 81 methods and tools for microbiome compositional data analysis have been developed only recently. Representative works are the Gloor team's ANOVA-like differential expression (ALDEx and ALDEx2)80, 81, 82 and Mandal et al's analysis of composition of microbiomes (ANCOM). Fundamentally, both approaches use logratio transformations to convert microbiome data, thereby removing the constraint so that standard multivariate techniques become suitable for the analysis.

When comparing microbial composition, it is inappropriate to draw inferences about total abundance in the ecosystem from the abundance of OTUs in the specimen. Instead, inferences are drawn about the relative abundance of a taxon in the ecosystem from its relative abundance in the specimen. A compositional constraint therefore exists: all microbial relative abundances within a specimen sum to one, so that compositional data reside in a simplex73, 76 rather than in Euclidean space. Recently, Mandal et al developed a novel statistical framework called ANCOM, which accounts for the compositional constraint to reduce false discoveries when detecting differences in mean taxon abundance at the ecosystem level. It is based on compositional log-ratios. The authors compared ANCOM with the zero-inflated Gaussian (ZIG) method and the t-test in simulation studies and on real data. They concluded that ANCOM outperforms the ZIG method by substantially reducing the false discovery rate (FDR) and increasing power. The attractive features of ANCOM are that it makes no distributional assumptions and that it can be implemented in a linear model framework to adjust for covariates and to model longitudinal data. Compared with ANCOM, ALDEx and ALDEx2 are more comprehensive: they are applicable to nearly any type of data generated by high-throughput sequencing and are suitable for comparing many different experimental designs. The statistical analyses include two-sample and paired t-tests, ANOVA, and tests such as Welch's t-test and the non-parametric Wilcoxon and Kruskal–Wallis tests. They also have an option to adjust p-values using the Benjamini–Hochberg method.
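The logratio idea can be illustrated with the centered log-ratio (CLR) transform, one member of the logratio family. The sketch below is a minimal illustration, not the procedure used by ALDEx2 or ANCOM: in particular, the `replace_zeros` helper, which adds a fixed pseudocount, is a simplifying assumption introduced here, and the count vector is hypothetical.

```python
import math


def replace_zeros(counts, pseudocount=0.5):
    """Crude zero handling (an assumption of this sketch): add a pseudocount
    to every part so that logarithms are defined."""
    return [c + pseudocount for c in counts]


def clr(parts):
    """Centered log-ratio: log of each part minus the mean log of all parts."""
    if any(p <= 0 for p in parts):
        raise ValueError("CLR needs strictly positive parts; handle zeros first")
    logs = [math.log(p) for p in parts]
    center = sum(logs) / len(logs)
    return [value - center for value in logs]


counts = [10, 20, 70]                      # hypothetical read counts
proportions = [c / sum(counts) for c in counts]

print("CLR of counts     :", [round(v, 4) for v in clr(counts)])
print("CLR of proportions:", [round(v, 4) for v in clr(proportions)])
print("CLR with zeros    :", [round(v, 4) for v in clr(replace_zeros([0, 5, 95]))])
```

The transformed values are identical for raw counts and for proportions, because only the ratios between parts carry information, and they are no longer confined to the simplex by a sum-to-one constraint (they sum to zero instead), which is what allows standard multivariate techniques to be applied afterwards.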

Statistical packages that implement hypothesis testing and statistical analysis

Bioinformatics pipelines and R packages play a very important role in the development of statistical methods and models for hypothesis testing and statistical analysis. The two main bioinformatics pipelines are QIIME and mothur. Both are self-contained pipelines that can be used to analyze 16S rRNA gene sequencing data. Because of their comprehensive features and supporting documentation, QIIME and mothur have been reviewed as the two outstanding pipelines.86, 87 They can perform microbiome composition analyses and some statistical analyses, such as alpha and beta diversity, ANOVA, the paired t-test, the two-sample t-test, adonis, ANOSIM, MRPP, PERMANOVA, PERMDISP, db-RDA, and the Mantel test.88, 89 Vegan is a very important and widely used R package. It was initially designed for community ecologists. Vegan is not self-contained: it depends on many other R packages and must be run within the R statistical environment. However, vegan contains the most popular methods of multivariate analysis, tools for diversity analysis, and other potentially useful functions. It is therefore commonly used in analyzing ecological communities and has been adapted to analyze microbiome data. Other R packages that are useful for hypothesis testing and statistical analysis include DESeq, DESeq2, edgeR, limma, metagenomeSeq, microbiome, and phyloseq. Each of these packages has specific capabilities for hypothesis testing and statistical analysis. Both DESeq and DESeq2 use the negative binomial distribution to test for differential expression; edgeR implements the statistical methodology described in refs. 93, 98, 99, 100; limma is used to detect the differential abundance of species in the samples95, 101; and metagenomeSeq includes a non-parametric permutation test on t-statistics, a non-parametric Kruskal–Wallis test, and a mixture model that implements a zero-inflated Gaussian (ZIG) distribution of mean group abundance for each taxonomic feature.

Among these packages, microbiome and phyloseq are the more comprehensive statistical tools. First, microbiome and phyloseq integrate other available statistical packages to perform statistical hypothesis testing and analysis. For example, the microbiome package contains general-purpose tools for microarray-based analysis of microbiome profiling data sets in R and conducts statistical analysis based on the phyloseq class. In addition, phyloseq integrates with or extends other R packages, such as DESeq, DESeq2, and edgeR, to facilitate taxonomic diversity analysis and statistical modeling. Second, they have tools to manage microbiome data sets. For example, phyloseq can import and export data from other packages and even from bioinformatics pipelines such as QIIME and mothur. Third, phyloseq can compute various diversity metrics and perform sophisticated analyses. For example, after importing data into R, one may easily perform beta diversity analysis using any or all of over 40 different ecological distance metrics, compute alpha diversity metrics, and perform more sophisticated analyses, such as k-tables analysis and differential analysis of microbiome data. The microbiome package adds extra functionality for microbiome data sets: microbiota composition analysis, bistability analysis, calculation of diversity indices, fitting of linear models, pairwise comparisons, association studies, and so on. Fourth, both the microbiome and phyloseq packages have functions and tools to visualize microbiome data via barplots, boxplots, density plots, heatmaps, motion charts, networks, ordination, and clustering.

Longitudinal microbiome data analysis and causal inference

The microbiome is inherently dynamic, driven by interactions with the host and the environment, and varies over time. Thus, longitudinal microbiome data analysis provides more information on the profile of microbiome with host and environment interactions. Several computational methods for analyzing longitudinal microbiome data have been developed and applied into microbiome studies. A regression-based time series model can be used to analyze a series of observations (dependent variables) including relative abundances of an OUT over time, ecological diversity of the gut microbiota over time, and a function of time and other covariates (independent variables). For example, a regression model was conducted to evaluate the dependence of the human vaginal microbiome on time in the menstrual cycle and other covariates103, 104; an autoregressive (AR) model was used to assess the tendency of the different taxonomic groups of bacteria. Gerber and his colleagues developed the microbiome counts trajectories infinite mixture model engine, a time-series clustering method for analyzing microbiome data that automatically infers the number of temporal patterns from the data. However, current approach of longitudinal microbiome data analysis cannot provide appropriate statistical tools to model dynamic and complicated microbiome data. First, microbiome may have causative effects on host. 
It was evidenced by the following factors of both human and animal studies: 1) studies in wild type mice107, 108 and zebrafish109, 110 have found a number of similarities in their microbiotic function and host interactions; 2) the microbiota have played a role in maturation of the host immune system and even anatomical development of the intestine.111, 112 Second, the bacterial composition (species members and abundance) of the gut microbiota is personalized.113, 114 Most microbiomes are strikingly divergent between distinct host species.115, 116 During the lifespan, our microbiome varies systematically across body habitats and time, can be dramatically altered, either transiently or long term, by diseases such as infections or medical interventions such as antibiotics,19, 118, 119 such trends may ultimately reveal how changes of microbiome cause or prevent diseases. For example, reduced species diversity has been observed in obese humans107, 116; the abundance of phylum Fusobacteria increased significantly in the colon of colorectal cancer patients.121, 122 Thus, researchers in the microbiome field need understand not only the association, but also the causative functions of bacteria in human diseases.123, 124, 125 Third, the mutual relationship between microbiome and its host suggests a causal inference model, or mediation analysis and longitudinal analysis may be granted. Currently, microbiome researchers shift their emphasis from correlation to causality. The distinguishing feature of longitudinal studies is that the subjects are measured repeatedly during the study, permitting the direct assessment of changes in response variable over time.126, 127 Thereby, longitudinal study captures both between-individual differences (heterogeneity among individuals) and within subject dynamics. It offers the opportunity to study complex biological, psychological, and behavioral hypotheses, especially those involving changes over time. 
These advantages of longitudinal analysis also apply to microbiome data. Longitudinal analysis will enhance our understanding of short- and long-term trends in the microbiome following an intervention, such as diet, and of the development and persistence of disease caused by the microbiome. Mediation analysis gives the researcher an account of the sequence of effects that leads to an outcome.129, 130, 131 It allows scientific investigation of how an outcome comes about. Detecting the dynamic causation among the microbiome, the intervention, and the host is critical. However, to our knowledge, there are limited applications of causal inference and mediation analysis in microbiome studies.
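To make the idea of a "sequence of effects" concrete, the following is a minimal sketch of a classical product-of-coefficients mediation analysis with a single mediator, using ordinary least squares. It is not a method proposed in this review. The data are simulated, and the variable roles (a binary intervention such as diet, one microbial summary feature as the mediator, one host outcome), the effect sizes, and the helpers `ols` and `mediation_effects` are assumptions made for illustration. The sketch assumes linear relationships and no unmeasured confounding, and it ignores the high-dimensional, compositional structure of real microbiome data.

```python
import numpy as np

def ols(y, *predictors):
    """Ordinary least squares; returns coefficients [intercept, slope_1, ...]."""
    design = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def mediation_effects(treatment, mediator, outcome):
    """Product-of-coefficients decomposition for a single mediator."""
    a = ols(mediator, treatment)[1]                   # treatment -> mediator
    _, direct, b = ols(outcome, treatment, mediator)  # direct effect and mediator -> outcome
    total = ols(outcome, treatment)[1]                # treatment -> outcome, mediator omitted
    return {"indirect": float(a * b), "direct": float(direct), "total": float(total)}

rng = np.random.default_rng(1)
n = 500
# Hypothetical simulated study: intervention -> microbial feature -> host outcome.
treatment = rng.integers(0, 2, size=n).astype(float)
mediator = 0.8 * treatment + rng.normal(0.0, 1.0, size=n)
outcome = 0.5 * mediator + 0.2 * treatment + rng.normal(0.0, 1.0, size=n)

effects = mediation_effects(treatment, mediator, outcome)
print(effects)
```

With least-squares fits on the same sample, the total effect equals the direct effect plus the indirect effect, so the decomposition shows how much of the intervention's effect on the host passes through the microbial feature. In practice, inference on the indirect effect would typically rely on bootstrap confidence intervals.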

Limitations of existing methods

The existing statistical approaches have their limitations. First, the existing statistical methods for analyzing microbiome proportional data do not address the constraint problem, and some researchers are not even aware that it exists. Standard statistical methods, such as the Pearson correlation coefficient, the t-test, and ANOVA, are still widely used in the current literature on the analysis of microbiome data23, 24, 25, 26 without checking the data distribution or applying a transformation. One assumption of standard statistical methods is that the compared data are independent. Because the relative abundances sum to unity (or to any other constant), the components are not independent. It is thus not appropriate to apply these methods directly to microbiome relative abundance data. This constraint problem is not limited to standard statistical methods; even the family of probability models requires all pairs of OTUs to be negatively correlated and thus may not be appropriate for microbiome data. When we examine univariate or multivariate differences, correlations, and methods that depend on correlation, any inference made on proportional data can be very misleading if the statistical methods do not account for the constraint. Second, the focus of microbiome research is shifting from correlation to causality. Although longitudinal microbiome data analyses are available, suitable longitudinal and causal inference models are very limited in microbiome studies. The available statistical tools for longitudinal data analysis are far from meeting the needs of modeling dynamic microbiome data.
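The constraint problem can be illustrated with a short simulation. The sketch below draws absolute abundances for four taxa that are independent by construction, converts them to relative abundances, and compares the correlations before and after. It also applies the centered log-ratio (CLR) transform, which is one standard option from the compositional data analysis literature and is shown here only as an example, not as a recommendation of this review. The number of taxa, the log-normal distribution, and the helper names `closure` and `clr` are assumptions made for the example; real data with zero counts would need a pseudocount or another zero-handling step before taking logarithms.

```python
import numpy as np

def closure(abundances):
    """Convert absolute abundances to relative abundances (each row sums to 1)."""
    abundances = np.asarray(abundances, dtype=float)
    return abundances / abundances.sum(axis=1, keepdims=True)

def clr(proportions):
    """Centered log-ratio transform; requires strictly positive proportions."""
    log_p = np.log(proportions)
    return log_p - log_p.mean(axis=1, keepdims=True)

rng = np.random.default_rng(42)
# Hypothetical absolute abundances: 500 samples x 4 taxa, independent by construction.
absolute = rng.lognormal(mean=2.0, sigma=0.5, size=(500, 4))
relative = closure(absolute)

corr_abs = np.corrcoef(absolute, rowvar=False)
corr_rel = np.corrcoef(relative, rowvar=False)
cov_rel = np.cov(relative, rowvar=False)

off_diagonal = ~np.eye(4, dtype=bool)
print("mean correlation, absolute abundances:", corr_abs[off_diagonal].mean())
print("mean correlation, relative abundances:", corr_rel[off_diagonal].mean())
# Because every row of `relative` sums to 1, each row of its covariance matrix sums to 0,
# so every taxon must have negative covariance with at least one other taxon.
print("row sums of covariance matrix:", cov_rel.sum(axis=1))

clr_values = clr(relative)
```

The taxa are uncorrelated on the absolute scale, yet their relative abundances are negatively correlated purely because of the unit-sum constraint. This is why a Pearson correlation computed directly on proportions can be misleading.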

Conclusion and future direction

In summary, three sources of statistical tools are available for analyzing microbiome data. The first and most convenient is to use standard tests and models, because they are familiar; the second is to borrow statistical methods and models from other relevant research fields, such as multivariate methods and techniques from ecology; the third is to develop new statistical methods tailored to specific microbiome data. The field has progressed from choosing standard statistical methods, to borrowing methods from other fields, to developing its own methods. Some of the available statistical approaches for microbiome data appear feasible. There are two challenges in developing statistical methods and models for microbiome studies: one is to fully account for the constraint inherent in microbiome data when modeling and analyzing compositional data; the other is to develop longitudinal and causal models that fit the dynamic and complicated associations among the microbiome, the environment, and the host. For hypothesis testing and statistical analysis of microbiome data, further work is needed to develop methods and models that are more suitable for analyzing microbiome compositional data. Specifically, efforts should focus on three approaches: 1) to analyze microbiome data as compositional and further develop statistical tools that capture the compositional features of microbiome data; 2) to develop statistical tools for longitudinal microbiome data analysis; and 3) to shift from correlation analysis to causal analysis. Currently, both longitudinal and mediation analyses are very limited. To model the causative effects of the microbiome, both an appropriate study design and suitable statistical models are needed. Because microbiome data are very complicated, developing statistical tools for longitudinal and mediation analyses of microbiome data remains a real challenge for statisticians and microbiome researchers. 
Future studies need a team effort involving biomedical researchers, physicians, bioinformatic experts, and biostatisticians. More mechanism-driven studies should be based on appropriate statistical design and should perform analyses using experimental models, human samples, ‘omic’ technologies, bioinformatic analysis, and statistical modeling.

Conflicts of interest

The authors declare no competing financial interests.

107 in total

1. Dynamics and Stabilization of the Human Gut Microbiome during the First Year of Life.

Authors: Fredrik Bäckhed; Josefine Roswall; Yangqing Peng; Qiang Feng; Huijue Jia; Petia Kovatcheva-Datchary; Yin Li; Yan Xia; Hailiang Xie; Huanzi Zhong; Muhammad Tanweer Khan; Jianfeng Zhang; Junhua Li; Liang Xiao; Jumana Al-Aama; Dongya Zhang; Ying Shiuan Lee; Dorota Kotowska; Camilla Colding; Valentina Tremaroli; Ye Yin; Stefan Bergman; Xun Xu; Lise Madsen; Karsten Kristiansen; Jovanna Dahlgren; Jun Wang
Journal: Cell Host Microbe Date: 2015-06-10 Impact factor: 21.023

2. A humanized gnotobiotic mouse model of host-archaeal-bacterial mutualism.

Authors: Buck S Samuel; Jeffrey I Gordon
Journal: Proc Natl Acad Sci U S A Date: 2006-06-16 Impact factor: 11.205

Review 3. The gut microbiota and obesity: from correlation to causality.

Authors: Liping Zhao
Journal: Nat Rev Microbiol Date: 2013-08-05 Impact factor: 60.633

4. Linking long-term dietary patterns with gut microbial enterotypes.

Authors: Gary D Wu; Jun Chen; Christian Hoffmann; Kyle Bittinger; Ying-Yu Chen; Sue A Keilbaugh; Meenakshi Bewtra; Dan Knights; William A Walters; Rob Knight; Rohini Sinha; Erin Gilroy; Kernika Gupta; Robert Baldassano; Lisa Nessel; Hongzhe Li; Frederic D Bushman; James D Lewis
Journal: Science Date: 2011-09-01 Impact factor: 47.728

5. Development of the human infant intestinal microbiota.

Authors: Chana Palmer; Elisabeth M Bik; Daniel B DiGiulio; David A Relman; Patrick O Brown
Journal: PLoS Biol Date: 2007-06-26 Impact factor: 8.029

6. Perturbation and restoration of the fathead minnow gut microbiome after low-level triclosan exposure.

Authors: Adrienne B Narrowe; Munira Albuthi-Lantz; Erin P Smith; Kimberly J Bower; Timberley M Roane; Alan M Vajda; Christopher S Miller
Journal: Microbiome Date: 2015-03-03 Impact factor: 14.650

7. Gut microbiomes of Malawian twin pairs discordant for kwashiorkor.

Authors: Michelle I Smith; Tanya Yatsunenko; Mark J Manary; Indi Trehan; Rajhab Mkakosya; Jiye Cheng; Andrew L Kau; Stephen S Rich; Patrick Concannon; Josyf C Mychaleckyj; Jie Liu; Eric Houpt; Jia V Li; Elaine Holmes; Jeremy Nicholson; Dan Knights; Luke K Ursell; Rob Knight; Jeffrey I Gordon
Journal: Science Date: 2013-01-30 Impact factor: 47.728

8. Hypothesis testing and power calculations for taxonomic-based human microbiome data.

Authors: Patricio S La Rosa; J Paul Brooks; Elena Deych; Edward L Boone; David J Edwards; Qin Wang; Erica Sodergren; George Weinstock; William D Shannon
Journal: PLoS One Date: 2012-12-20 Impact factor: 3.240

9. Chapter 12: Human microbiome analysis.

Authors: Xochitl C Morgan; Curtis Huttenhower
Journal: PLoS Comput Biol Date: 2012-12-27 Impact factor: 4.475

10. Methods for Improving Human Gut Microbiome Data by Reducing Variability through Sample Processing and Storage of Stool.

Authors: Monika A Gorzelak; Sandeep K Gill; Nishat Tasnim; Zahra Ahmadi-Vand; Michael Jay; Deanna L Gibson
Journal: PLoS One Date: 2015-08-07 Impact factor: 3.240
