
Effect of data combination on predictive modeling: a study using gene expression data.

Melanie Osl, Stephan Dreiseitl, Jihoon Kim, Kiltesh Patel, Christian Baumgartner, Lucila Ohno-Machado.

Abstract

BACKGROUND: The quality of predictive modeling in biomedicine depends on the amount of data available for model building.
OBJECTIVE: To study the effect of combining microarray data sets on feature selection and predictive modeling performance.
METHODS: Empirical evaluation of stability of feature selection and discriminatory power of classifiers using three previously published gene expression data sets, analyzed both individually and in combination.
RESULTS: Feature selection was not robust for either the individual or the combined data sets. The classification performance of models built on individual and combined data sets depended heavily on the data set from which the features were extracted.
CONCLUSION: We identified volatility of feature selection as a contributing factor to some of the problems faced by predictive modeling using microarray data.
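
As an illustration of what "stability of feature selection" in METHODS can mean in practice, the following is a minimal sketch that selects the top-k features on repeated resamples of a data set and reports how much the selected sets overlap. The abstract does not state which selection method or stability measure the authors used; the mean-difference ranking, the stratified bootstrap, the Jaccard index, and all function names here are assumptions chosen for illustration.

```python
import random
from itertools import combinations


def top_k_features(samples, labels, k):
    """Rank features by absolute difference of class means; return top-k indices.

    Hypothetical stand-in for a real feature selection method.
    """
    n_features = len(samples[0])
    pos = [s for s, y in zip(samples, labels) if y == 1]
    neg = [s for s, y in zip(samples, labels) if y == 0]
    scores = []
    for j in range(n_features):
        mean_pos = sum(s[j] for s in pos) / len(pos)
        mean_neg = sum(s[j] for s in neg) / len(neg)
        scores.append(abs(mean_pos - mean_neg))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def jaccard(a, b):
    """Overlap of two non-empty feature sets: |a & b| / |a | b|."""
    return len(a & b) / len(a | b)


def selection_stability(samples, labels, k, n_resamples=20, seed=0):
    """Mean pairwise Jaccard overlap of top-k sets across bootstrap resamples."""
    rng = random.Random(seed)
    pos_idx = [i for i, y in enumerate(labels) if y == 1]
    neg_idx = [i for i, y in enumerate(labels) if y == 0]
    selected = []
    for _ in range(n_resamples):
        # Stratified bootstrap: resample within each class so both stay present.
        idx = [rng.choice(pos_idx) for _ in pos_idx]
        idx += [rng.choice(neg_idx) for _ in neg_idx]
        selected.append(
            top_k_features([samples[i] for i in idx], [labels[i] for i in idx], k)
        )
    pairs = list(combinations(selected, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Toy data: rows are samples, columns are features (e.g. expression values).
toy_samples = [[0, 5, 1], [0, 5, 2], [10, 5, 1], [10, 5, 2]]
toy_labels = [0, 0, 1, 1]
stability = selection_stability(toy_samples, toy_labels, k=1)
```

A value near 1 means nearly the same features are chosen on every resample; a value near 0 indicates the kind of volatility the CONCLUSION describes. The same function could be applied to each data set separately and to their combination to compare the two settings.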

Year:  2010        PMID: 21347042      PMCID: PMC3041313     

Source DB:  PubMed          Journal:  AMIA Annu Symp Proc        ISSN: 1559-4076


  7 in total

1.  Selecting cases for whom additional tests can improve prognostication.

Authors:  Xiaoqian Jiang; Jihoon Kim; Yuan Wu; Shuang Wang; Lucila Ohno-Machado
Journal:  AMIA Annu Symp Proc       Date:  2012-11-03

2.  VERTIcal Grid lOgistic regression (VERTIGO).

Authors:  Yong Li; Xiaoqian Jiang; Shuang Wang; Hongkai Xiong; Lucila Ohno-Machado
Journal:  J Am Med Inform Assoc       Date:  2015-11-09       Impact factor: 4.497

3.  Grid Binary LOgistic REgression (GLORE): building shared models without sharing data.

Authors:  Yuan Wu; Xiaoqian Jiang; Jihoon Kim; Lucila Ohno-Machado
Journal:  J Am Med Inform Assoc       Date:  2012-04-17       Impact factor: 4.497

4.  Smooth isotonic regression: a new method to calibrate predictive models.

Authors:  Xiaoqian Jiang; Melanie Osl; Jihoon Kim; Lucila Ohno-Machado
Journal:  AMIA Jt Summits Transl Sci Proc       Date:  2011-03-07

5.  I-spline Smoothing for Calibrating Predictive Models.

Authors:  Yuan Wu; Xiaoqian Jiang; Jihoon Kim; Lucila Ohno-Machado
Journal:  AMIA Jt Summits Transl Sci Proc       Date:  2012-03-19

6.  Doubly Optimized Calibrated Support Vector Machine (DOC-SVM): an algorithm for joint optimization of discrimination and calibration.

Authors:  Xiaoqian Jiang; Aditya Menon; Shuang Wang; Jihoon Kim; Lucila Ohno-Machado
Journal:  PLoS One       Date:  2012-11-06       Impact factor: 3.240

7.  Comparative study of joint analysis of microarray gene expression data in survival prediction and risk assessment of breast cancer patients.

Authors:  Haleh Yasrebi
Journal:  Brief Bioinform       Date:  2015-10-26       Impact factor: 11.622
