An empirical assessment of validation practices for molecular classifiers.

Peter J Castaldi, Issa J Dahabreh, John P A Ioannidis.

Abstract

Proposed molecular classifiers may be overfit to idiosyncrasies of noisy genomic and proteomic data. Cross-validation methods are often used to obtain estimates of classification accuracy, but both simulations and case studies suggest that, when inappropriate methods are used, bias may ensue. Bias can be bypassed and generalizability can be tested by external (independent) validation. We evaluated 35 studies that have reported on external validation of a molecular classifier. We extracted information on study design and methodological features, and compared the performance of molecular classifiers in internal cross-validation versus external validation for 28 studies where both had been performed. We demonstrate that the majority of studies pursued cross-validation practices that are likely to overestimate classifier performance. Most studies were markedly underpowered to detect a 20% decrease in sensitivity or specificity between internal cross-validation and external validation [median power was 36% (IQR, 21-61%) and 29% (IQR, 15-65%), respectively]. The median reported classification performance for sensitivity and specificity was 94% and 98%, respectively, in cross-validation and 88% and 81% for independent validation. The relative diagnostic odds ratio was 3.26 (95% CI 2.04-5.21) for cross-validation versus independent validation. Finally, we reviewed all studies (n = 758) which cited those in our study sample, and identified only one instance of additional subsequent independent validation of these classifiers. In conclusion, these results document that many cross-validation practices employed in the literature are potentially biased and genuine progress in this field will require adoption of routine external validation of molecular classifiers, preferably in much larger studies than in current practice.
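
The diagnostic odds ratio (DOR) mentioned above combines sensitivity and specificity into one number, DOR = [sens / (1 - sens)] x [spec / (1 - spec)], and a relative DOR is the ratio of two such values. The following is a minimal sketch of that arithmetic, applied to the median sensitivities and specificities quoted in the abstract. The function names are hypothetical and chosen for illustration only. Note that the abstract's relative DOR of 3.26 is a pooled estimate across studies, so a ratio computed from the medians is not expected to reproduce it.

```python
def diagnostic_odds_ratio(sensitivity, specificity):
    """Return the DOR: odds of a positive test in cases over odds in non-cases."""
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        # Values of exactly 0 or 1 make the odds undefined (division by zero).
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    return (sensitivity / (1.0 - sensitivity)) * (specificity / (1.0 - specificity))


def relative_dor(sens_a, spec_a, sens_b, spec_b):
    """Return DOR(a) / DOR(b); values above 1 mean setting a looks more accurate."""
    return diagnostic_odds_ratio(sens_a, spec_a) / diagnostic_odds_ratio(sens_b, spec_b)


# Median values quoted in the abstract (illustrative use only).
dor_cv = diagnostic_odds_ratio(0.94, 0.98)    # internal cross-validation
dor_ext = diagnostic_odds_ratio(0.88, 0.81)   # external (independent) validation
ratio = relative_dor(0.94, 0.98, 0.88, 0.81)

print(f"DOR, cross-validation:    {dor_cv:.1f}")
print(f"DOR, external validation: {dor_ext:.1f}")
print(f"Ratio of the two DORs:    {ratio:.1f}")
```

Because the DOR grows very quickly as sensitivity or specificity approaches 100%, modest drops in either measure between cross-validation and external validation translate into large changes in the ratio.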

Year:  2011        PMID: 21300697      PMCID: PMC3088312          DOI: 10.1093/bib/bbq073

Source DB:  PubMed          Journal:  Brief Bioinform        ISSN: 1467-5463            Impact factor:   11.622


