
Predicting genetic regulatory response using classification.

Manuel Middendorf, Anshul Kundaje, Chris Wiggins, Yoav Freund, Christina Leslie.

Abstract

MOTIVATION: Studying gene regulatory mechanisms in simple model organisms through analysis of high-throughput genomic data has emerged as a central problem in computational biology. Most approaches in the literature have focused either on finding a few strong regulatory patterns or on learning descriptive models from training data. However, these approaches are not yet adequate for making accurate predictions about which genes will be up- or down-regulated in new or held-out experiments. By introducing a predictive methodology for this problem, we can use powerful tools from machine learning and assess the statistical significance of our predictions.
RESULTS: We present a novel classification-based method for learning to predict gene regulatory response. Our approach is motivated by the hypothesis that in simple organisms such as Saccharomyces cerevisiae, we can learn a decision rule for predicting whether a gene is up- or down-regulated in a particular experiment based on (1) the presence of binding site subsequences ('motifs') in the gene's regulatory region and (2) the expression levels of regulators such as transcription factors in the experiment ('parents'). Thus, our learning task integrates two qualitatively different data sources: genome-wide cDNA microarray data across multiple perturbation and mutant experiments along with motif profile data from regulatory sequences. We convert the regression task of predicting real-valued gene expression measurements to a classification task of predicting +1 and -1 labels, corresponding to up- and down-regulation beyond the levels of biological and measurement noise in microarray measurements. The learning algorithm employed is boosting with a margin-based generalization of decision trees, alternating decision trees. This large-margin classifier is sufficiently flexible to allow complex logical functions, yet sufficiently simple to give insight into the combinatorial mechanisms of gene regulation. We observe encouraging prediction accuracy on experiments based on the Gasch S. cerevisiae dataset, and we show that we can accurately predict up- and down-regulation on held-out experiments. We also show how to extract significant regulators, motifs and motif-regulator pairs from the learned models for various stress responses. Our method thus provides predictive hypotheses, suggests biological experiments, and provides interpretable insight into the structure of genetic regulatory networks.
AVAILABILITY: The MLJava package is available upon request to the authors.
SUPPLEMENTARY: Additional results are available from http://www.cs.columbia.edu/compbio/geneclass
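
The conversion from real-valued expression measurements to +1 and -1 labels described in RESULTS can be illustrated with a minimal Python sketch. It is not the authors' MLJava implementation. The noise threshold value, the function names, the toy data, and the encoding of each feature as a (motif, parent, parent state) triple are all assumptions made for illustration; the abstract does not specify them. The boosting step with alternating decision trees is not shown.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical noise cutoff on log expression ratios (not given in the abstract).
NOISE_THRESHOLD = 1.0

Feature = Tuple[str, str, int]  # (motif, parent regulator, parent state)


def discretize(log_ratio: float, threshold: float = NOISE_THRESHOLD) -> int:
    """Return +1 (up), -1 (down), or 0 (within the assumed noise band)."""
    if log_ratio > threshold:
        return 1
    if log_ratio < -threshold:
        return -1
    return 0


def build_examples(
    expression: Dict[str, List[float]],
    motifs: Dict[str, Set[str]],
    regulators: List[str],
    threshold: float = NOISE_THRESHOLD,
) -> List[Tuple[Set[Feature], int]]:
    """Build one labeled example per (target gene, experiment) pair.

    expression[g][e] is the log ratio of gene g in experiment e.
    motifs[g] is the set of motifs found in the regulatory region of gene g.
    Pairs whose expression stays within the noise band get no example.
    """
    n_experiments = len(next(iter(expression.values())))
    examples = []
    for gene, gene_motifs in motifs.items():
        for exp in range(n_experiments):
            label = discretize(expression[gene][exp], threshold)
            if label == 0:
                continue  # not clearly up- or down-regulated
            features: Set[Feature] = set()
            for reg in regulators:
                state = discretize(expression[reg][exp], threshold)
                if state == 0:
                    continue  # parent is not informative in this experiment
                for motif in gene_motifs:
                    features.add((motif, reg, state))
            examples.append((features, label))
    return examples


# Toy data with made-up names and values, two experiments per gene.
toy_expression = {
    "TF1": [2.0, -0.2],
    "TF2": [0.1, -1.5],
    "geneA": [1.8, -2.0],
    "geneB": [0.3, 1.2],
}
toy_motifs = {"geneA": {"m1"}, "geneB": {"m2"}}

for features, label in build_examples(toy_expression, toy_motifs, ["TF1", "TF2"]):
    print(sorted(features), label)
```

Each resulting example pairs a set of active motif-parent features with a +1 or -1 label, which is the kind of input a boosted classifier could be trained on under these assumptions.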

Year:  2004        PMID: 15262804     DOI: 10.1093/bioinformatics/bth923

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  29 in total

1.  Lasting impressions: motifs in protein-protein maps may provide footprints of evolutionary events.

Authors:  J Jeremy Rice; Aaron Kershenbaum; Gustavo Stolovitzky
Journal:  Proc Natl Acad Sci U S A       Date:  2005-02-22       Impact factor: 11.205

2.  Inferring network mechanisms: the Drosophila melanogaster protein interaction network.

Authors:  Manuel Middendorf; Etay Ziv; Chris H Wiggins
Journal:  Proc Natl Acad Sci U S A       Date:  2005-02-22       Impact factor: 11.205

3.  Quantitative analysis of binding motifs mediating diverse spatial readouts of the Dorsal gradient in the Drosophila embryo.

Authors:  Dmitri Papatsenko; Michael Levine
Journal:  Proc Natl Acad Sci U S A       Date:  2005-03-28       Impact factor: 11.205

4.  Statistical significance of combinatorial regulations.

Authors:  Aika Terada; Mariko Okada-Hatakeyama; Koji Tsuda; Jun Sese
Journal:  Proc Natl Acad Sci U S A       Date:  2013-07-23       Impact factor: 11.205

5.  Modelling transcriptional regulation with a mixture of factor analyzers and variational Bayesian expectation maximization.

Authors:  Kuang Lin; Dirk Husmeier
Journal:  EURASIP J Bioinform Syst Biol       Date:  2009-06-11

6.  Mining SARS-CoV protease cleavage data using non-orthogonal decision trees: a novel method for decisive template selection.

Authors:  Zheng Rong Yang
Journal:  Bioinformatics       Date:  2005-03-29       Impact factor: 6.937

7.  A top-performing algorithm for the DREAM3 gene expression prediction challenge.

Authors:  Jianhua Ruan
Journal:  PLoS One       Date:  2010-02-04       Impact factor: 3.240

8.  An ensemble learning approach to reverse-engineering transcriptional regulatory networks from time-series gene expression data.

Authors:  Jianhua Ruan; Youping Deng; Edward J Perkins; Weixiong Zhang
Journal:  BMC Genomics       Date:  2009-07-07       Impact factor: 3.969

9.  Reconstructing a network of stress-response regulators via dynamic system modeling of gene regulation.

Authors:  Wei-Sheng Wu; Wen-Hsiung Li; Bor-Sen Chen
Journal:  Gene Regul Syst Bio       Date:  2008-02-10

10.  Automatic policing of biochemical annotations using genomic correlations.

Authors:  Tzu-Lin Hsiao; Olga Revelles; Lifeng Chen; Uwe Sauer; Dennis Vitkup
Journal:  Nat Chem Biol       Date:  2009-11-22       Impact factor: 15.040
