Min Jin Ha (1), Junghi Kim (2), Jessica Galloway-Peña (3), Kim-Anh Do (4), Christine B Peterson (4)
1. Department of Biostatistics, University of Texas MD Anderson Cancer Center, 1400 Pressler St., Houston, TX, USA. MJHa@mdanderson.org.
2. Center for Devices and Radiological Health, U.S. Food and Drug Administration, 10903 New Hampshire Avenue, Silver Sp, MD, USA.
3. Department of Veterinary Pathobiology, Texas A&M University, College Station, TX, USA.
4. Department of Biostatistics, University of Texas MD Anderson Cancer Center, 1400 Pressler St., Houston, TX, USA.
Abstract
BACKGROUND: The estimation of microbial networks can provide important insight into the ecological relationships among the organisms that comprise the microbiome. However, inferring such networks from high-throughput data poses several critical statistical challenges. Because the abundances in each sample are constrained to a fixed sum, and because microbial populations overlap only partially across subjects, the data are both compositional and zero-inflated.

RESULTS: We propose the COmpositional Zero-Inflated Network Estimation (COZINE) method for inference of microbial networks, which addresses these critical aspects of the data while maintaining computational scalability. COZINE relies on the multivariate Hurdle model to infer a sparse set of conditional dependencies that reflect not only relationships among the continuous values, but also relationships among binary indicators of presence or absence and between the binary and continuous representations of the data. Our simulation results show that the proposed method is better able to capture various types of microbial relationships than existing approaches. We demonstrate the utility of the method with an application to understanding the oral microbiome network in a cohort of leukemic patients.

CONCLUSIONS: Our proposed method addresses important challenges in microbiome network estimation and can be effectively applied to discover various types of dependence relationships in microbial communities. COZINE is available online at https://github.com/MinJinHa/COZINE.
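To make the binary and continuous representations of the data concrete, the following is a minimal sketch of how a count table might be split into the two parts. It assumes that the continuous part is obtained by a centered log-ratio transform computed over the nonzero entries of each sample, with zeros left at zero; this preprocessing choice and the function name `hurdle_representation` are illustrative assumptions, not the API of the COZINE package.

```python
import math


def hurdle_representation(counts):
    """Split a samples-by-taxa count table into binary and continuous parts.

    Assumption (not taken from the abstract): the continuous part is a
    centered log-ratio transform restricted to the nonzero entries of each
    sample, and zero counts stay at 0.0.
    """
    binary, continuous = [], []
    for row in counts:
        # Presence/absence indicator for each taxon in this sample.
        present = [1 if x > 0 else 0 for x in row]
        # Log abundances of the taxa that were observed in this sample.
        logs = [math.log(x) for x in row if x > 0]
        # Mean log abundance; 0.0 for a sample with no observed taxa.
        center = sum(logs) / len(logs) if logs else 0.0
        # Center the observed values; absent taxa remain 0.0.
        clr = [math.log(x) - center if x > 0 else 0.0 for x in row]
        binary.append(present)
        continuous.append(clr)
    return binary, continuous


# Hypothetical toy data: two samples, three taxa.
counts = [[10, 0, 90], [0, 0, 5]]
binary, continuous = hurdle_representation(counts)
```

A network method of the kind described would then estimate sparse conditional dependencies from both matrices jointly; that estimation step is not shown here.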