Zhenqiu Liu, Fengzhu Sun, Jonathan Braun, Dermot P B McGovern, Steven Piantadosi.
Affiliations: Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA; Molecular and Computational Biology Program, Department of Biological Sciences, USC, Los Angeles, CA 90089, USA; Department of Pathology and Laboratory Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095, USA; and F. Widjaja Foundation - Inflammatory Bowel and Immunobiology Research Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA.
Abstract
MOTIVATION: Identifying disease-associated taxa and constructing networks of bacterial interactions are two important tasks that are usually studied separately. In reality, the differentiation of disease-associated taxa and the correlation among taxa may affect each other: a genus can appear differentiated because it is highly correlated with another, highly differentiated genus. In addition, network structures may vary under different clinical conditions. Permutation tests are commonly used to detect differences between networks in distinct phenotypes, but they are time-consuming.
RESULTS: We propose a multilevel regularized regression method that simultaneously identifies taxa and constructs networks. We also extend the framework to construct a common network and a differentiated network together. An efficient algorithm based on a dual formulation is developed for the large-scale n ≪ m problem, in which the number of taxa (m) is large and the number of samples (n) is small. The proposed method is regularized with a general Lp (p ∈ [0, 2]) penalty and jointly models the effects of taxa abundance differentiation and correlation. We demonstrate that it can identify both true and biologically significant genera and network structures.
AVAILABILITY AND IMPLEMENTATION: The MLRR software, implemented in MATLAB, is available at http://biostatistics.csmc.edu/mlrr/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
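The abstract does not spell out the dual formulation, so the following is only a minimal sketch of why a dual form helps when n ≪ m. It covers the p = 2 (ridge) special case alone, using the standard identity (XᵀX + λI)⁻¹Xᵀy = Xᵀ(XXᵀ + λI)⁻¹y. It is not the MLRR algorithm; the function names, the simulated data, and the value of `lam` are all assumptions made for illustration.

```python
import numpy as np


def ridge_primal(X, y, lam):
    """Ridge solution from the primal normal equations (an m x m system)."""
    m = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(m), X.T @ y)


def ridge_dual(X, y, lam):
    """Same ridge solution from the dual form (an n x n system)."""
    n = X.shape[0]
    # Solve for the n dual coefficients, then map back to the m weights.
    alpha = np.linalg.solve(X @ X.T + lam * np.eye(n), y)
    return X.T @ alpha


# Hypothetical simulated data: n samples (rows), m taxa (columns), n << m.
rng = np.random.default_rng(0)
n, m = 20, 500
X = rng.standard_normal((n, m))
y = rng.standard_normal(n)

w_primal = ridge_primal(X, y, lam=1.0)
w_dual = ridge_dual(X, y, lam=1.0)
```

Both functions return the same m coefficients, but the dual route solves a 20 × 20 linear system instead of a 500 × 500 one. How the authors handle penalties with p < 2 is not described in the abstract and is not covered by this sketch.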