Eric Z Chen1, Hongzhe Li1. 1. Genomics and Computational Biology Graduate Group Department of Biostatistics and Epidemiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA.
Abstract
MOTIVATION: Human microbial communities are associated with many diseases, including obesity, diabetes and inflammatory bowel disease. High-throughput sequencing is widely used to quantify microbial composition in order to understand its impact on human health, and many microbiome studies collect longitudinal measurements of microbial communities. A key question in such studies is which microbes are associated with clinical outcomes or environmental factors. However, microbiome compositional data are highly skewed, bounded in [0,1), and often sparse with many zeros. In addition, repeated measurements in longitudinal studies are correlated. Association analysis of longitudinal microbiome data therefore needs a method that accounts for all of these features.
RESULTS: We propose a two-part zero-inflated Beta regression model with random effects (ZIBR) for testing the association between microbial abundance and clinical covariates in longitudinal microbiome data. The model includes a logistic regression component for the presence or absence of a microbe in a sample and a Beta regression component for its non-zero abundance; each component includes a random effect to account for the correlation among repeated measurements on the same subject. In both simulation studies and an application to real microbiome data, the ZIBR model outperformed previously used methods. The method provides a useful tool for identifying relevant taxa from longitudinal or repeated measures in microbiome research.
AVAILABILITY AND IMPLEMENTATION: https://github.com/chvlyl/ZIBR
CONTACT: hongzhe@upenn.edu
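The two-part structure described in RESULTS can be illustrated with a small simulation. The following is a minimal sketch, not the ZIBR package itself: it assumes a logit link in both parts, a mean/precision parameterization of the Beta distribution (mean `u`, precision `phi`), normally distributed subject-level random intercepts, and a single hypothetical binary covariate `x`. All function and parameter names are illustrative.

```python
import math
import random


def inv_logit(x):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_zibr(n_subjects, n_times, alpha0, alpha1, beta0, beta1,
                  sigma_a, sigma_b, phi, seed=0):
    """Simulate (subject, time, x, y) rows from a two-part model.

    Logistic part: P(y > 0) = inv_logit(alpha0 + alpha1 * x + a_i)
    Beta part:     y | y > 0 ~ Beta(u * phi, (1 - u) * phi),
                   u = inv_logit(beta0 + beta1 * x + b_i)
    a_i and b_i are subject-level random intercepts (assumed normal).
    """
    rng = random.Random(seed)
    rows = []
    for i in range(n_subjects):
        x = i % 2  # hypothetical binary covariate, e.g. treatment group
        a_i = rng.gauss(0.0, sigma_a)  # random intercept, logistic part
        b_i = rng.gauss(0.0, sigma_b)  # random intercept, Beta part
        p = inv_logit(alpha0 + alpha1 * x + a_i)
        u = inv_logit(beta0 + beta1 * x + b_i)
        for t in range(n_times):
            if rng.random() < p:
                y = rng.betavariate(u * phi, (1.0 - u) * phi)
            else:
                y = 0.0  # microbe absent in this sample
            rows.append((i, t, x, y))
    return rows


def zib_logpdf(y, p, u, phi):
    """Log-density of one observation, conditional on the random effects."""
    if y == 0.0:
        return math.log(1.0 - p)
    a, b = u * phi, (1.0 - u) * phi
    log_beta_fn = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (math.log(p) + (a - 1.0) * math.log(y)
            + (b - 1.0) * math.log(1.0 - y) - log_beta_fn)


rows = simulate_zibr(n_subjects=50, n_times=4, alpha0=0.0, alpha1=0.5,
                     beta0=-1.0, beta1=0.5, sigma_a=0.5, sigma_b=0.5, phi=5.0)
```

Because the same `a_i` and `b_i` are shared by every time point of a subject, the simulated measurements are correlated within subject, which is the feature the random effects are meant to capture. Fitting the model would additionally require integrating `zib_logpdf` over the random effects, which this sketch does not attempt.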