A CART-based approach to discover emerging patterns in microarray data.

Anne-Laure Boulesteix, Gerhard Tutz, Korbinian Strimmer.

Abstract

MOTIVATION: Cancer diagnosis using gene expression profiles requires supervised learning and gene selection methods. Of the many suggested approaches, the method of emerging patterns (EPs) has the particular advantage of explicitly modeling interactions among genes, which improves classification accuracy. However, finding useful (i.e. short and statistically significant) EPs is typically very hard.
METHODS: Here we introduce a CART-based approach to discover EPs in microarray data. The method is based on growing decision trees from which the EPs are extracted. This approach combines pattern search with a statistical procedure based on Fisher's exact test to assess the significance of each EP. Subsequently, sample classification based on the inferred EPs is performed using maximum-likelihood linear discriminant analysis.
RESULTS: Using simulated data as well as gene expression data from colon and leukemia cancer experiments, we assessed the performance of our pattern search algorithm and classification procedure. In the simulations, our method recovers a large proportion of known EPs, while for real data it is comparable in classification accuracy with three top-performing alternative classification algorithms. In addition, it assigns statistical significance to the inferred EPs and allows the patterns to be ranked while simultaneously avoiding overfitting of the data. The new approach therefore provides a versatile and computationally fast tool for elucidating local gene interactions as well as for classification.
AVAILABILITY: A computer program written in the statistical language R implementing the new approach is freely available from the web page http://www.stat.uni-muenchen.de/~socher/
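
ILLUSTRATION: The significance step described under METHODS can be sketched roughly as follows. This is a minimal Python sketch, not the authors' R program: the pattern representation (a conjunction of threshold conditions, as one might read off a root-to-leaf path of a decision tree), the helper names pattern_matches, contingency and fisher_exact_greater, the one-sided form of the test, and the toy expression values are all assumptions made for illustration only.

```python
from math import comb


def pattern_matches(sample, pattern):
    """True if the sample satisfies every (gene, op, threshold) condition.

    A pattern is a hypothetical stand-in for an EP: a conjunction of
    conditions such as "g1 > 1.0 and g2 <= 0.5".
    """
    return all(
        (sample[gene] > thr) if op == ">" else (sample[gene] <= thr)
        for gene, op, thr in pattern
    )


def contingency(samples, labels, pattern, target):
    """Count matches/non-matches inside and outside the target class.

    Returns (a, b, c, d) for the 2x2 table
        [[target & match,     target & no match],
         [non-target & match, non-target & no match]].
    """
    a = b = c = d = 0
    for sample, label in zip(samples, labels):
        hit = pattern_matches(sample, pattern)
        if label == target:
            a, b = (a + 1, b) if hit else (a, b + 1)
        else:
            c, d = (c + 1, d) if hit else (c, d + 1)
    return a, b, c, d


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null with fixed margins,
    i.e. the chance of at least `a` matches falling in the target class.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / comb(n, col1)


# Toy data (invented): four "tumor" and four "normal" expression profiles.
samples = [
    {"g1": 2.1, "g2": 0.2}, {"g1": 1.8, "g2": 0.4},
    {"g1": 2.5, "g2": 0.1}, {"g1": 1.3, "g2": 0.3},
    {"g1": 0.4, "g2": 0.3}, {"g1": 1.5, "g2": 0.9},
    {"g1": 0.7, "g2": 0.8}, {"g1": 0.9, "g2": 0.2},
]
labels = ["tumor"] * 4 + ["normal"] * 4

# Hypothetical two-gene pattern: g1 high AND g2 low.
pattern = [("g1", ">", 1.0), ("g2", "<=", 0.5)]

table = contingency(samples, labels, pattern, target="tumor")
p_value = fisher_exact_greater(*table)
print(table, p_value)
```

Patterns scored this way could then be ranked by p-value. Because many candidate patterns are tested, some correction for multiple testing would normally be needed; the abstract does not say how this is handled, so none is shown here.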

Year:  2003        PMID: 14668233     DOI: 10.1093/bioinformatics/btg361

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  10 in total

1.  Genotype and phenotypes of an intestine-adapted Escherichia coli K-12 mutant selected by animal passage for superior colonization.

Authors:  Andrew J Fabich; Mary P Leatham; Joe E Grissom; Graham Wiley; Hongshing Lai; Fares Najar; Bruce A Roe; Paul S Cohen; Tyrrell Conway
Journal:  Infect Immun       Date:  2011-03-21       Impact factor: 3.441

2.  Classification and regression tree (CART) analyses of genomic signatures reveal sets of tetramers that discriminate temperature optima of archaea and bacteria.

Authors:  Betsey Dexter Dyer; Michael J Kahn; Mark D Leblanc
Journal:  Archaea       Date:  2008-12       Impact factor: 3.273

3.  Classifying gene expression profiles from pairwise mRNA comparisons.

Authors:  Donald Geman; Christian d'Avignon; Daniel Q Naiman; Raimond L Winslow
Journal:  Stat Appl Genet Mol Biol       Date:  2004-08-30

Review 4.  Relative expression analysis for molecular cancer diagnosis and prognosis.

Authors:  James A Eddy; Jaeyun Sung; Donald Geman; Nathan D Price
Journal:  Technol Cancer Res Treat       Date:  2010-04

5.  Mining SARS-CoV protease cleavage data using non-orthogonal decision trees: a novel method for decisive template selection.

Authors:  Zheng Rong Yang
Journal:  Bioinformatics       Date:  2005-03-29       Impact factor: 6.937

Review 6.  An argument for mechanism-based statistical inference in cancer.

Authors:  Donald Geman; Michael Ochs; Nathan D Price; Cristian Tomasetti; Laurent Younes
Journal:  Hum Genet       Date:  2014-11-09       Impact factor: 4.132

7.  Breast cancer prognosis by combinatorial analysis of gene expression data.

Authors:  Gabriela Alexe; Sorin Alexe; David E Axelrod; Tibérius O Bonates; Irina I Lozina; Michael Reiss; Peter L Hammer
Journal:  Breast Cancer Res       Date:  2006       Impact factor: 6.466

8.  Using contrast patterns between true complexes and random subgraphs in PPI networks to predict unknown protein complexes.

Authors:  Quanzhong Liu; Jiangning Song; Jinyan Li
Journal:  Sci Rep       Date:  2016-02-12       Impact factor: 4.379

9.  Combining statistical techniques to predict postsurgical risk of 1-year mortality for patients with colon cancer.

Authors:  Inmaculada Arostegui; Nerea Gonzalez; Nerea Fernández-de-Larrea; Santiago Lázaro-Aramburu; Marisa Baré; Maximino Redondo; Cristina Sarasqueta; Susana Garcia-Gutierrez; José M Quintana
Journal:  Clin Epidemiol       Date:  2018-03-06       Impact factor: 4.790

10.  Classification of tumor samples from expression data using decision trunks.

Authors:  Benjamin Ulfenborg; Karin Klinga-Levan; Björn Olsson
Journal:  Cancer Inform       Date:  2013-02-13