
Analysis of gene expression microarrays for phenotype classification.

A Califano, G Stolovitzky, Y Tu.

Abstract

Several microarray technologies that monitor the level of expression of a large number of genes have recently emerged. Given DNA-microarray data for a set of cells characterized by a given phenotype and for a set of control cells, an important problem is to identify "patterns" of gene expression that can be used to predict cell phenotype. The potential number of such patterns is exponential in the number of genes. In this paper, we propose a solution to this problem based on a supervised learning algorithm, which differs substantially from previous schemes. It couples a complex, non-linear similarity metric, which maximizes the probability of discovering discriminative gene expression patterns, with a pattern discovery algorithm called SPLASH. The latter efficiently and deterministically discovers all statistically significant gene expression patterns in the phenotype set. Statistical significance is evaluated based on the probability that a pattern occurs by chance in the control set. Finally, a greedy set covering algorithm is used to select an optimal subset of statistically significant patterns, which form the basis for a standard likelihood ratio classification scheme. We analyze data from 60 human cancer cell lines using this method, and compare our results with those of other supervised learning schemes. Different phenotypes are studied. These include cancer morphologies (such as melanoma), molecular targets (such as mutations in the p53 gene), and therapeutic targets related to the sensitivity to anticancer compounds. We also analyze a synthetic data set that shows that this technique is especially well suited for the analysis of sub-phenotype mixtures. For complex phenotypes, such as p53, our method produces an encouragingly low rate of false positives and false negatives and seems to outperform the others. Similarly low rates are reported when predicting the efficacy of experimental anticancer compounds. This counts among the first reported studies where drug efficacy has been successfully predicted from large-scale expression data analysis.
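
The greedy set covering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the details of the selection procedure, so everything below is an assumption made for illustration: the function name `greedy_set_cover`, the representation of each pattern by the set of phenotype samples it matches, and the tie-breaking rule are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Hashable, List, Set


def greedy_set_cover(
    universe: Set[Hashable],
    pattern_support: Dict[str, Set[Hashable]],
) -> List[str]:
    """Greedily pick patterns until every coverable sample is covered.

    universe        -- phenotype samples that should be covered (assumed input)
    pattern_support -- hypothetical map: pattern name -> samples it matches
    """
    if not pattern_support:
        return []
    uncovered = set(universe)
    chosen: List[str] = []
    while uncovered:
        # Pick the pattern that covers the most still-uncovered samples.
        # Iterating over sorted names makes tie-breaking deterministic.
        best = max(
            sorted(pattern_support),
            key=lambda name: len(pattern_support[name] & uncovered),
        )
        gain = pattern_support[best] & uncovered
        if not gain:
            break  # the remaining samples are matched by no pattern
        chosen.append(best)
        uncovered -= gain
    return chosen


# Toy example with made-up sample and pattern names.
samples = {"s1", "s2", "s3", "s4", "s5"}
patterns = {
    "A": {"s1", "s2", "s3"},
    "B": {"s3", "s4"},
    "C": {"s4", "s5"},
    "D": {"s1"},
}
selected = greedy_set_cover(samples, patterns)
```

In this toy case the sketch first selects pattern "A" (three samples) and then "C" (the two remaining samples). In general, greedy set covering is a heuristic that yields a small cover but not necessarily the smallest one.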

Year: 2000    PMID: 10977068

Source DB: PubMed    Journal: Proc Int Conf Intell Syst Mol Biol    ISSN: 1553-0833


