Emma Schwager, Himel Mallick, Steffen Ventz, Curtis Huttenhower.
Abstract
Compositional data consist of vectors of proportions normalized to a constant sum from a basis of unobserved counts. The sum constraint makes inference on correlations between unconstrained features challenging due to the information loss from normalization. However, such correlations are of long-standing interest in fields including ecology. We propose a novel Bayesian framework (BAnOCC: Bayesian Analysis of Compositional Covariance) to estimate a sparse precision matrix through a LASSO prior. The resulting posterior, generated by MCMC sampling, allows uncertainty quantification of any function of the precision matrix, including the correlation matrix. We also use a first-order Taylor expansion to approximate the transformation from the unobserved counts to the composition in order to investigate what characteristics of the unobserved counts can make the correlations more or less difficult to infer. On simulated datasets, we show that BAnOCC infers the true network as well as previous methods while offering the advantage of posterior inference. Larger and more realistic simulated datasets further show that BAnOCC performs well as measured by type I and type II error rates. Finally, we apply BAnOCC to a microbial ecology dataset from the Human Microbiome Project, which, in addition to reproducing established ecological results, reveals unique, competition-based roles for Proteobacteria in multiple distinct habitats.
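The effect of the sum constraint described in the abstract can be illustrated with a small simulation. The following is a minimal sketch and not part of BAnOCC: it assumes independent log-normal unobserved counts, and the helper names `close` and `offdiag`, the sample size, and the distribution parameters are all illustrative choices. Although the unobserved counts are uncorrelated by construction, normalizing each sample to sum to one induces clearly negative correlations among the proportions.

```python
import numpy as np

def close(counts):
    """Normalize each row (sample) of a counts matrix to sum to 1."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum(axis=1, keepdims=True)

def offdiag(m):
    """Return the off-diagonal entries of a square matrix as a flat array."""
    return m[~np.eye(m.shape[0], dtype=bool)]

# Assumed simulation settings (illustrative only).
rng = np.random.default_rng(0)
n_samples, n_features = 5000, 3

# Unobserved basis: independent, identically distributed log-normal counts.
basis = rng.lognormal(mean=0.0, sigma=0.5, size=(n_samples, n_features))

# Observed composition: each sample normalized to a constant sum of 1.
composition = close(basis)

basis_corr = np.corrcoef(basis, rowvar=False)
comp_corr = np.corrcoef(composition, rowvar=False)

print("basis correlations:      ", np.round(offdiag(basis_corr), 2))
print("composition correlations:", np.round(offdiag(comp_corr), 2))
```

With three exchangeable features, the sum constraint forces the pairwise correlations of the proportions toward -1/2, even though no association exists among the unobserved counts. This is the kind of spurious signal that methods for compositional data aim to account for.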
Year: 2017 PMID: 29140991 PMCID: PMC5706738 DOI: 10.1371/journal.pcbi.1005852
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Methods included in an evaluation on simulated data.
Type I and type II error rates were determined for these methods by the correct or incorrect rejection of $H_0$; for CCLasso and SPIEC-EASI, no inferential methodology was provided and so the correct or incorrect estimation of $w_{jk}$ as zero was used. Note that although SPIEC-EASI infers the precision matrix, construction of the true correlation matrix in the simulated data guarantees that the same elements will be non-zero in the precision and covariance matrices.
| Method | $H_0$ | Error calls | Inference method |
|---|---|---|---|
| Simplicial Variation | | inference | one-sided permutation test |
| SparCC | | inference | authors’ bootstrap-based method |
| CCLasso | | estimation | - |
| SPIEC-EASI | | estimation | - |
| ReBoot | | inference | permutation-based test |
| BAnOCC | | inference | 95% credible interval |
| Spearman (composition) | | inference | two-sided permutation test |
| Spearman (unconstrained counts) | | inference | two-sided permutation test |
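As an illustration of how error calls of this kind might be tallied, the sketch below computes type I and type II error rates over the unique feature pairs of a network. It is a hedged example under stated assumptions, not the authors' evaluation code: the function names `calls_from_interval` and `error_rates`, the rule of calling an association when an interval excludes zero, and the toy interval values are all hypothetical.

```python
import numpy as np

def calls_from_interval(lower, upper):
    """Call an association wherever the interval [lower, upper] excludes zero."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    return (lower > 0) | (upper < 0)

def error_rates(true_nonzero, called_nonzero):
    """Type I and type II error rates over the strict upper triangle.

    Type I:  fraction of truly zero pairs that were called non-zero.
    Type II: fraction of truly non-zero pairs that were not called.
    """
    true_nonzero = np.asarray(true_nonzero, dtype=bool)
    called_nonzero = np.asarray(called_nonzero, dtype=bool)
    iu = np.triu_indices(true_nonzero.shape[0], k=1)  # each pair counted once
    truth, calls = true_nonzero[iu], called_nonzero[iu]
    n_null, n_alt = np.sum(~truth), np.sum(truth)
    type1 = np.sum(calls & ~truth) / n_null if n_null else float("nan")
    type2 = np.sum(~calls & truth) / n_alt if n_alt else float("nan")
    return float(type1), float(type2)

# Toy example with 4 features (all values are made up for illustration).
p = 4
true_nonzero = np.zeros((p, p), dtype=bool)
true_nonzero[0, 1] = true_nonzero[2, 3] = True   # two true associations

lower = np.full((p, p), -0.2)                    # default interval spans zero
upper = np.full((p, p), 0.2)
lower[0, 1], upper[0, 1] = 0.2, 0.6              # true association, detected
lower[2, 3], upper[2, 3] = -0.1, 0.4             # true association, missed
lower[0, 2], upper[0, 2] = 0.05, 0.3             # null pair, falsely called

calls = calls_from_interval(lower, upper)
type1, type2 = error_rates(true_nonzero, calls)
print(type1, type2)  # prints: 0.25 0.5
```

For estimation-only methods, the same `error_rates` helper could be applied to a boolean matrix marking which estimated elements are non-zero, in place of interval-based calls.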