A classification-based machine learning approach for the analysis of genome-wide expression data.

James Lyons-Weiler, Satish Patel, Soumyaroop Bhattacharya.

Abstract

Three important tasks in the analysis of global gene expression data are class discovery, class prediction, and the identification of dysregulated genes (biomarkers). The clinical application of microarray data will require marker genes whose expression patterns are sufficiently well understood to allow accurate prediction of disease subclass membership. Commonly used methods of analysis include hierarchical clustering algorithms, t-, F-, and Z-tests, and machine learning approaches. We describe an approach called the maximum difference subset (MDSS) algorithm, which combines classification algorithms, classical statistics, and elements of machine learning in a single coherent framework. By integrating prediction accuracy, the MDSS algorithm learns the critical threshold of statistical significance (the alpha, or P-value cutoff), which eliminates the arbitrariness of setting that threshold by hand and minimizes the effect of the normality assumptions. To reduce the false positive rate and to increase the external validity of the predictive gene set, a jackknife step is used. This step identifies and removes genes in the initial MDSS that have low combined predictive utility. The overall MDSS provides a prediction that is less dependent on arbitrary features of the study design (sample inclusion or exclusion) and should thus have high external validity. We demonstrate that this approach, unlike other published methods, identifies biomarkers capable of predicting the outcome of anthracycline-cytarabine chemotherapy in cases of acute myeloid leukemia. By incorporating two criteria, statistical significance and predictive utility, the approach learns the significance level relevant for a given data set. The MDSS approach can be used with any pairing of a statistical test and a classifier.
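
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a Welch t statistic with a normal-approximation P-value as the test, a nearest-centroid rule as the classifier, leave-one-out accuracy as the measure of predictive utility, and a jackknife over samples that keeps only genes selected in every replicate. All function names, the alpha grid, and the toy data are hypothetical.

```python
import math
from statistics import mean, variance

def welch_p(a, b):
    """Two-sided P-value for a Welch t statistic (normal approximation; an assumption)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return 0.0 if mean(a) != mean(b) else 1.0
    t = (mean(a) - mean(b)) / se
    return math.erfc(abs(t) / math.sqrt(2))

def select_genes(X, y, alpha):
    """Indices of genes whose P-value is at or below alpha."""
    genes = []
    for g in range(len(X[0])):
        a = [row[g] for row, lab in zip(X, y) if lab == 0]
        b = [row[g] for row, lab in zip(X, y) if lab == 1]
        if welch_p(a, b) <= alpha:
            genes.append(g)
    return genes

def predict(X, y, genes, sample):
    """Nearest-centroid prediction restricted to the selected genes."""
    dist = {}
    for lab in (0, 1):
        rows = [row for row, l in zip(X, y) if l == lab]
        dist[lab] = sum((sample[g] - mean([r[g] for r in rows])) ** 2 for g in genes)
    return min(dist, key=dist.get)

def loo_accuracy(X, y, alpha):
    """Leave-one-out accuracy, reselecting genes inside each fold."""
    hits = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        genes = select_genes(X_tr, y_tr, alpha)
        if genes and predict(X_tr, y_tr, genes, X[i]) == y[i]:
            hits += 1
    return hits / len(X)

def mdss_sketch(X, y, alphas):
    """Learn alpha from prediction accuracy, then jackknife the gene set."""
    # Keep the candidate threshold with the best leave-one-out accuracy
    # (ties go to the first candidate in the grid).
    best_alpha = max(alphas, key=lambda a: loo_accuracy(X, y, a))
    # Jackknife over samples: keep genes selected in every replicate.
    stable = set(select_genes(X, y, best_alpha))
    for i in range(len(X)):
        X_jk, y_jk = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        stable &= set(select_genes(X_jk, y_jk, best_alpha))
    return best_alpha, sorted(stable)

# Hypothetical toy data: 8 samples x 3 genes; only gene 0 separates the classes.
X = [
    [1.0, 5.0, 2.0], [1.2, 4.0, 3.0], [0.9, 6.0, 2.5], [1.1, 5.5, 3.5],
    [3.0, 5.2, 2.2], [3.2, 4.1, 3.1], [2.9, 6.1, 2.4], [3.1, 5.4, 3.6],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]
alphas = [0.001, 0.01, 0.05]

alpha, genes = mdss_sketch(X, y, alphas)
print(alpha, genes)
```

Reselecting genes inside each leave-one-out fold keeps the held-out sample from influencing the gene set it is scored against. The sketch needs at least three samples per class so that a variance can still be computed after one sample is left out.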

Year: 2003    PMID: 12618382    PMCID: PMC430281    DOI: 10.1101/gr.104003

Source DB: PubMed    Journal: Genome Res    ISSN: 1088-9051    Impact factor: 9.043


