Analyzing differences between microbiome communities using mixture distributions.

Konstantin Shestopaloff, Michael D Escobar, Wei Xu.

Abstract

In this paper, we present a method to assess differences between microbiome communities that effectively models sparse count data and accounts for presence-absence bias frequently encountered when zeros are present. We assume that the observed data for each operational taxonomic unit is Poisson generated with the rate for each sample originating from an underlying rate distribution. We propose to model this distribution using a mixture model, specifying the components based on the posterior rate distribution of a count and estimating the optimal weights using a least squares objective function. The distribution incorporates varying resolutions of samples, a point mass for differentiating structural and nonstructural zeros, and a truncation point mass to account for high values that are too sparse to model. As mixture component specification is not always straightforward, a method to estimate a joint model from several mixture distributions using minimum distances of bootstrap iterates is proposed. Once the population rate distribution is approximated, we obtain sample-specific distributions by conditioning on the observed operational taxonomic unit count, resolution, and estimated mixture distribution and then use these to estimate pairwise distances for a permutation test. The method gives an accurate estimate of the true proportion of zeros for presence-absence, effectively models the distribution of low counts using the mixture distribution, and achieves good power for detecting differences in a variety of scenarios. The method is tested using a simulation study and applied to two microbiome datasets. In each case, the results are compared with a number of existing methods.
© 2018 John Wiley & Sons, Ltd.
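
The conditioning step described in the abstract (obtaining a sample-specific distribution from the observed count, the resolution, and the estimated mixture) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on simplifying assumptions: the mixture components are discrete rate values rather than the posterior-based rate distributions the paper uses, the weights are made up rather than estimated by least squares, and the names `poisson_pmf`, `component_posterior`, `rates`, `weights`, and `depth` are hypothetical.

```python
import math


def poisson_pmf(k, mean):
    """Poisson probability of count k; a mean of 0 is a point mass at k == 0."""
    if mean == 0:
        return 1.0 if k == 0 else 0.0
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


def component_posterior(count, depth, rates, weights):
    """Posterior weight of each rate component given one observed count.

    depth scales the per-read rate and stands in for the sample's resolution.
    """
    joint = [w * poisson_pmf(count, r * depth) for r, w in zip(rates, weights)]
    total = sum(joint)
    if total == 0:
        raise ValueError("count has zero probability under the mixture")
    return [j / total for j in joint]


# Hypothetical mixture (assumed values, not estimated from data):
# rate 0.0 plays the role of the structural-zero point mass.
rates = [0.0, 0.001, 0.01]
weights = [0.3, 0.5, 0.2]

shallow = component_posterior(0, 100, rates, weights)   # zero count, low depth
deep = component_posterior(0, 5000, rates, weights)     # zero count, high depth
positive = component_posterior(3, 1000, rates, weights)  # nonzero count
```

Under these assumed values, a zero observed at high depth puts more posterior weight on the structural-zero component than a zero observed at low depth, and any positive count rules that component out entirely.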

Keywords:  microbiome; mixture models; sparsity; statistical ecology; zero inflation

Year:  2018        PMID: 30039541     DOI: 10.1002/sim.7896

Source DB:  PubMed          Journal:  Stat Med        ISSN: 0277-6715            Impact factor:   2.373


  1 in total

1.  DCMD: Distance-based classification using mixture distributions on microbiome data.

Authors:  Konstantin Shestopaloff; Mei Dong; Fan Gao; Wei Xu
Journal:  PLoS Comput Biol       Date:  2021-03-12       Impact factor: 4.475

