Outcome prediction based on microarray analysis: a critical perspective on methods.

Michalis Zervakis, Michalis E Blazadonakis, Georgia Tsiliki, Vasiliki Danilatou, Manolis Tsiknakis, Dimitris Kafetzopoulos.

Abstract

BACKGROUND: Information extraction from microarrays has not yet been widely used in diagnostic or prognostic decision-support systems, owing to the diversity of results produced by the available techniques, their instability across data sets, and the inability to relate statistical significance to biological relevance. There is therefore an urgent need to examine the statistical framework of microarray analysis and identify its drawbacks and limitations, so that methodologies can be compared thoroughly under the same experimental set-up and results can be reported with confidence intervals that are meaningful to clinicians. In this study we consider gene-selection algorithms with the aim of revealing inefficiencies in performance evaluation and addressing aspects that can reduce uncertainty in algorithmic validation.
RESULTS: A computational study is performed on the performance of several gene-selection methodologies on publicly available microarray data. Three basic experimental scenarios are evaluated: the independent test-set, and 10-fold cross-validation (CV) using either maximum or average performance measures. Feature selection methods behave differently under different validation strategies. The performance results from CV do not match those from the independent test-set well, except for the support vector machine (SVM) and least squares SVM methods. However, these wrapper methods achieve variable (often low) performance, whereas the hybrid methods attain consistently higher accuracies. The use of an independent test-set within CV is important for evaluating the predictive power of algorithms. The optimal size of the selected gene-set also appears to depend on the evaluation scheme. The consistency of the selected genes under variation of the training-set is another important aspect in reducing uncertainty in the evaluation of the derived gene signature. In all cases the presence of outlier samples can seriously affect algorithmic performance.
CONCLUSION: Multiple parameters can influence the selection of a gene signature and its predictive power, so possible biases in validation methods must always be accounted for. This paper illustrates that independent test-set evaluation reduces the bias of CV, and that case-specific measures reveal stability characteristics of the gene signature under changes of the training set. Moreover, frequency measures on gene selection address the algorithmic consistency in selecting the same gene signature under different training conditions. These issues contribute to the development of an objective evaluation framework and aid the derivation of statistically consistent gene signatures that could eventually be correlated with biological relevance. The benefits of the proposed framework are supported by the evaluation results and methodological comparisons performed for several gene-selection algorithms on three publicly available datasets.
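
As an illustration only, the frequency measure on gene selection mentioned above could be sketched as follows. This is a minimal sketch, not the authors' implementation: the function name `selection_frequency`, the gene labels, the toy signatures and the 0.75 stability threshold are all hypothetical. It assumes each training run (for example, each CV fold) yields one list of selected genes, and reports the fraction of runs in which each gene appears.

```python
from collections import Counter


def selection_frequency(signatures):
    """Return, for each gene, the fraction of training runs that selected it.

    `signatures` is a list of gene lists, one per training run
    (hypothetical input format, e.g. one list per CV fold).
    """
    if not signatures:
        raise ValueError("at least one signature is required")
    # Count each gene at most once per run, even if a run lists it twice.
    counts = Counter(gene for sig in signatures for gene in set(sig))
    n_runs = len(signatures)
    return {gene: count / n_runs for gene, count in counts.items()}


# Toy, made-up signatures from four training runs (not data from the study).
signatures = [
    ["geneA", "geneB", "geneC"],
    ["geneA", "geneC", "geneD"],
    ["geneA", "geneB", "geneE"],
    ["geneA", "geneC", "geneB"],
]

freq = selection_frequency(signatures)

# Genes selected in at least 75% of runs (the threshold is an arbitrary choice).
stable = sorted(gene for gene, f in freq.items() if f >= 0.75)
print(stable)
```

A gene with a frequency close to 1 is selected almost regardless of which samples form the training set, whereas low frequencies suggest the signature depends strongly on the particular training split.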

Year: 2009    PMID: 19200394    PMCID: PMC2667512    DOI: 10.1186/1471-2105-10-53

Source DB: PubMed    Journal: BMC Bioinformatics    ISSN: 1471-2105    Impact factor: 3.169

