Sample size considerations of prediction-validation methods in high-dimensional data for survival outcomes.

Herbert Pang, Sin-Ho Jung

Abstract

A variety of prediction methods are used to relate high-dimensional genomic data to a clinical outcome. Once a prediction model has been developed from a data set, it should be validated using a resampling method or an independent data set. Although the existing prediction methods have been intensively evaluated by many investigators, there has been no comprehensive study of the performance of the validation methods, especially for a survival outcome. Understanding the properties of the various validation methods allows researchers to perform more powerful validations while controlling the type I error rate. In addition, a sample size calculation strategy based on these validation methods is lacking. We conduct extensive simulations to examine the statistical properties of these validation strategies. In both the simulations and a real data example, we found that 10-fold cross-validation with permutation gave the best power while keeping the type I error rate close to the nominal level. Based on this, we have also developed a sample size calculation method that can be used to design a validation study with a user-chosen combination of prediction and validation methods. Microarray and genome-wide association study data are used as illustrations. The power calculation method presented here can be used to design any biomedical study involving high-dimensional data and survival outcomes.
© 2013 WILEY PERIODICALS, INC.
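
To make "10-fold cross-validation with permutation" concrete, the following is a minimal sketch and not the authors' implementation. It assumes a simple stand-in prediction method (select the top k features by Cox score statistic at beta = 0 and sum their signed standardized values), pools the cross-validated risk scores, splits them at the median, and uses a two-group log-rank statistic whose null distribution is obtained by permuting the survival outcomes. All function names, the simulated data, and the parameter values are hypothetical.

```python
import numpy as np

def cox_score(X, time, event):
    """Cox partial-likelihood score at beta = 0 for each column (ties ignored)."""
    order = np.argsort(-time)                 # descending time: prefixes are risk sets
    Xs, es = X[order], event[order]
    risk_mean = np.cumsum(Xs, axis=0) / np.arange(1, len(time) + 1)[:, None]
    return ((Xs - risk_mean) * es[:, None]).sum(axis=0)

def fit_predict(X_tr, t_tr, e_tr, X_te, k):
    """Stand-in predictor (an assumption): signed sum of the top-k scored features."""
    mu, sd = X_tr.mean(axis=0), X_tr.std(axis=0) + 1e-12
    u = cox_score((X_tr - mu) / sd, t_tr, e_tr)
    top = np.argsort(-np.abs(u))[:k]
    return ((X_te - mu) / sd)[:, top] @ np.sign(u[top])

def cv_risk_scores(X, time, event, folds, k):
    """Risk score for each subject from a model trained without that subject's fold."""
    scores = np.empty(len(time))
    for f in np.unique(folds):
        te = folds == f
        scores[te] = fit_predict(X[~te], time[~te], event[~te], X[te], k)
    return scores

def logrank_chi2(time, event, group):
    """Two-group log-rank chi-square statistic."""
    o_minus_e, var = 0.0, 0.0
    for t in np.unique(time[event == 1]):
        at_risk = time >= t
        n, n1 = at_risk.sum(), (at_risk & (group == 1)).sum()
        died = (time == t) & (event == 1)
        d, d1 = died.sum(), (died & (group == 1)).sum()
        o_minus_e += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return o_minus_e ** 2 / var if var > 0 else 0.0

def cv_permutation_test(X, time, event, n_folds=10, n_perm=200, k=10, seed=0):
    """Permutation p-value for the cross-validated high/low risk log-rank statistic."""
    rng = np.random.default_rng(seed)
    n = len(time)
    folds = rng.permutation(n) % n_folds      # random, nearly balanced fold labels

    def stat(t, e):
        s = cv_risk_scores(X, t, e, folds, k)
        return logrank_chi2(t, e, (s > np.median(s)).astype(int))

    observed = stat(time, event)
    null = np.empty(n_perm)
    for b in range(n_perm):
        idx = rng.permutation(n)              # break the link between X and outcome
        null[b] = stat(time[idx], event[idx])
    p_value = (1 + np.sum(null >= observed)) / (n_perm + 1)
    return observed, p_value

# Simulated example (hypothetical settings): 5 of 200 features affect the hazard.
rng = np.random.default_rng(1)
n, p = 100, 200
X = rng.normal(size=(n, p))
t_event = rng.exponential(1.0 / np.exp(0.8 * X[:, :5].sum(axis=1)))
t_cens = rng.exponential(2.0, size=n)
time = np.minimum(t_event, t_cens)
event = (t_event <= t_cens).astype(int)
observed, p_value = cv_permutation_test(X, time, event, n_perm=100)
print(f"log-rank chi-square = {observed:.2f}, permutation p = {p_value:.3f}")
```

Because the whole cross-validation loop, including feature selection, is rerun on every permuted data set, the null distribution reflects the same selection process as the observed statistic. Repeating this test over many simulated data sets at a given sample size is one way to estimate power empirically, although the paper's own sample size formula is not reproduced here.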

Year:  2013        PMID: 23471879      PMCID: PMC3763900          DOI: 10.1002/gepi.21721

Source DB:  PubMed          Journal:  Genet Epidemiol        ISSN: 0741-0395            Impact factor:   2.135


