Study design in high-dimensional classification analysis.

Brisa N Sánchez, Meihua Wu, Peter X K Song, Wen Wang.

Abstract

Advances in high-throughput technology have accelerated the use of hundreds to millions of biomarkers to construct classifiers that partition patients into different clinical conditions. Prior to classifier development in actual studies, a critical need is to determine the sample size required to reach a specified classification precision. We develop a systematic approach for sample size determination in high-dimensional (large p, small n) classification analysis. Our method utilizes the probability of correct classification (PCC) as the optimization objective function and incorporates the higher criticism thresholding procedure for classifier development. Further, we derive the theoretical bound of maximal PCC gain from feature augmentation (e.g., when molecular and clinical predictors are combined in classifier development). Our methods are motivated and illustrated by a study using proteomics markers to classify post-kidney transplantation patients into stable and rejecting classes.
© The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
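
The abstract names higher criticism thresholding as the feature selection step but does not spell it out. The following is a minimal sketch of one common form of that procedure (the Donoho and Jin variant listed in the references), not the authors' implementation. It assumes each feature has already been reduced to a z-score; the function names `hc_threshold` and `select_features` and the search fraction `alpha0 = 0.10` are illustrative choices, not taken from the paper.

```python
import math


def hc_threshold(z_scores, alpha0=0.10):
    """Return a higher criticism (HC) threshold on |z| (illustrative sketch).

    Assumes z_scores holds one approximately standard-normal test statistic
    per feature under the null. alpha0 is the fraction of the smallest
    p-values searched; 0.10 is an assumed default, not from the abstract.
    """
    p = len(z_scores)
    if p == 0:
        raise ValueError("z_scores must not be empty")

    # Sort |z| in decreasing order so two-sided p-values come out ascending.
    abs_z = sorted((abs(z) for z in z_scores), reverse=True)
    # Two-sided normal p-value: P(|N(0,1)| > |z|) = erfc(|z| / sqrt(2)).
    pvals = [math.erfc(a / math.sqrt(2.0)) for a in abs_z]

    k_max = max(1, int(alpha0 * p))
    best_k, best_hc = 1, -math.inf
    for k in range(1, k_max + 1):
        frac = k / p
        if frac >= 1.0:  # the HC objective is undefined at k = p
            break
        # HC objective: standardized gap between the expected and the
        # observed k-th smallest p-value.
        hc = math.sqrt(p) * (frac - pvals[k - 1]) / math.sqrt(frac * (1.0 - frac))
        if hc > best_hc:
            best_k, best_hc = k, hc

    # The threshold is the |z| value at the maximizing rank.
    return abs_z[best_k - 1]


def select_features(z_scores, alpha0=0.10):
    """Return indices of features whose |z| meets the HC threshold."""
    t = hc_threshold(z_scores, alpha0)
    return [j for j, z in enumerate(z_scores) if abs(z) >= t]


if __name__ == "__main__":
    # Hypothetical data: 50 features, three of them with strong signal.
    z = [0.2 if j % 2 == 0 else -0.2 for j in range(50)]
    z[3], z[17], z[42] = 5.0, -6.0, 7.0
    print(select_features(z))
```

In a sample size calculation of the kind the abstract describes, a selection step like this would sit inside a loop over candidate sample sizes, with the PCC of the resulting classifier evaluated at each size. That outer loop depends on details the abstract does not give, so it is not sketched here.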

Keywords:  Design; Higher criticism threshold; Large p small n; Linear discrimination; Sample size

Year:  2016        PMID: 27154835      PMCID: PMC5031947          DOI: 10.1093/biostatistics/kxw018

Source DB:  PubMed          Journal:  Biostatistics        ISSN: 1465-4644            Impact factor:   5.899


  3 in total

1.  Selecting Multiple Biomarker Subsets with Similarly Effective Binary Classification Performances.

Authors:  Xin Feng; Shaofei Wang; Quewang Liu; Han Li; Jiamei Liu; Cheng Xu; Weifeng Yang; Yayun Shu; Weiwei Zheng; Bingxin Yu; Mingran Qi; Wenyang Zhou; Fengfeng Zhou
Journal:  J Vis Exp       Date:  2018-10-11       Impact factor: 1.355

2.  Networks of worry - towards a connectivity-based signature of late-life worry using higher criticism.

Authors:  Andrew R Gerlach; Helmet T Karim; Joseph Kazan; Howard J Aizenstein; Robert T Krafty; Carmen Andreescu
Journal:  Transl Psychiatry       Date:  2021-10-28       Impact factor: 7.989

3.  RIFS: a randomly restarted incremental feature selection algorithm.

Authors:  Yuting Ye; Ruochi Zhang; Weiwei Zheng; Shuai Liu; Fengfeng Zhou
Journal:  Sci Rep       Date:  2017-10-12       Impact factor: 4.379
