
How large a training set is needed to develop a classifier for microarray data?

Kevin K Dobbin, Yingdong Zhao, Richard M Simon.

Abstract

PURPOSE: A common goal of gene expression microarray studies is the development of a classifier that can be used to divide patients into groups with different prognoses, or with different expected responses to a therapy. These types of classifiers are developed on a training set, which is the set of samples used to train a classifier. The question of how many samples are needed in the training set to produce a good classifier from high-dimensional microarray data is challenging.
EXPERIMENTAL DESIGN: We present a model-based approach to determining the sample size required to adequately train a classifier.
RESULTS: It is shown that sample size can be determined from three quantities: standardized fold change, class prevalence, and number of genes or features on the arrays. Numerous examples and important experimental design issues are discussed. The method is adapted to address ex post facto determination of whether the size of a training set used to develop a classifier was adequate. An interactive web site for performing the sample size calculations is provided.
CONCLUSION: We showed that sample size calculations for classifier development from high-dimensional microarray data are feasible, discussed numerous important considerations, and presented examples.
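
Illustration (not from the paper): the abstract does not give the authors' formula, so the following is only a simulation sketch of how the three quantities named in RESULTS could interact. It assumes unit-variance normal expression values, a simple linear classifier built on the genes with the largest estimated class difference, and an "adequate" training set defined as one whose expected accuracy is within a chosen tolerance of the best achievable accuracy. The function names, the `tolerance` criterion, and the reading of standardized fold change as the difference in class means divided by the within-class standard deviation are all assumptions of this sketch.

```python
import math
import random


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def optimal_accuracy(effect, prevalence, n_informative):
    """Accuracy of an ideal classifier that knows the informative genes exactly."""
    d = effect * math.sqrt(n_informative)  # distance between the two class means
    c = math.log((1.0 - prevalence) / prevalence) / d  # cutoff shift for unequal classes
    return prevalence * phi(d / 2.0 - c) + (1.0 - prevalence) * phi(d / 2.0 + c)


def trained_accuracy(n, effect, prevalence, n_genes, n_informative, n_selected, rng):
    """True accuracy of one classifier trained on a simulated set of n samples."""
    n1 = min(max(round(n * prevalence), 1), n - 1)  # at least one sample per class
    n0 = n - n1
    genes = []
    for g in range(n_genes):
        # Assumption: informative genes have class means +effect/2 and -effect/2.
        shift = effect / 2.0 if g < n_informative else 0.0
        # The mean of k unit-variance normals is itself normal with sd 1/sqrt(k).
        m1 = rng.gauss(shift, 1.0 / math.sqrt(n1))
        m0 = rng.gauss(-shift, 1.0 / math.sqrt(n0))
        genes.append((m1 - m0, (m1 + m0) / 2.0, shift))
    # Feature selection: keep the genes with the largest estimated class difference.
    genes.sort(key=lambda t: abs(t[0]), reverse=True)
    chosen = genes[:n_selected]
    # Linear rule: predict class 1 when sum(w * (x - mid)) exceeds the threshold.
    threshold = math.log(n0 / n1)
    spread = math.sqrt(sum(w * w for w, _, _ in chosen))
    mean1 = sum(w * (shift - mid) for w, mid, shift in chosen)
    mean0 = sum(w * (-shift - mid) for w, mid, shift in chosen)
    hit1 = phi((mean1 - threshold) / spread)  # class 1 samples called class 1
    hit0 = phi((threshold - mean0) / spread)  # class 0 samples called class 0
    return prevalence * hit1 + (1.0 - prevalence) * hit0


def mean_trained_accuracy(n, effect, prevalence, n_genes, n_informative,
                          n_selected, reps=100, seed=0):
    """Average true accuracy over many simulated training sets of size n."""
    rng = random.Random(seed)
    total = sum(
        trained_accuracy(n, effect, prevalence, n_genes, n_informative, n_selected, rng)
        for _ in range(reps)
    )
    return total / reps


def smallest_adequate_n(effect, prevalence, n_genes, n_informative, n_selected,
                        tolerance=0.05, candidates=range(10, 201, 10), reps=100):
    """First candidate n whose mean accuracy is within tolerance of the optimum."""
    target = optimal_accuracy(effect, prevalence, n_informative) - tolerance
    for n in candidates:
        acc = mean_trained_accuracy(n, effect, prevalence, n_genes,
                                    n_informative, n_selected, reps=reps)
        if acc >= target:
            return n
    return None  # no candidate size was adequate
```

For example, `smallest_adequate_n(1.0, 0.5, 200, 5, 5)` searches training-set sizes from 10 to 200 for a hypothetical array of 200 genes, 5 of them informative. In this sketch, a smaller `effect` or a larger `n_genes` should tend to push the returned size upward, since noise genes are then more likely to displace informative ones during selection.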

Year: 2008 PMID: 18172259 DOI: 10.1158/1078-0432.CCR-07-0443

Source DB: PubMed Journal: Clin Cancer Res ISSN: 1078-0432 Impact factor: 12.531

Cited by

40 in total

1. Module-based prediction approach for robust inter-study predictions in microarray data.

Authors: Zhibao Mi; Kui Shen; Nan Song; Chunrong Cheng; Chi Song; Naftali Kaminski; George C Tseng
Journal: Bioinformatics Date: 2010-08-17 Impact factor: 6.937

2. Prediction of outcome in internet-delivered cognitive behaviour therapy for paediatric obsessive-compulsive disorder: A machine learning approach.

Authors: Fabian Lenhard; Sebastian Sauer; Erik Andersson; Kristoffer Nt Månsson; David Mataix-Cols; Christian Rück; Eva Serlachius
Journal: Int J Methods Psychiatr Res Date: 2017-07-28 Impact factor: 4.035

Review 3. Clinical outcome prediction by microRNAs in human cancer: a systematic review.

Authors: Viswam S Nair; Lauren S Maeda; John P A Ioannidis
Journal: J Natl Cancer Inst Date: 2012-03-06 Impact factor: 13.506

Review 4. Genomic markers for decision making: what is preventing us from using markers?

Authors: Vicky M Coyle; Patrick G Johnston
Journal: Nat Rev Clin Oncol Date: 2009-12-15 Impact factor: 66.675

Review 5. Assessing the human immune system through blood transcriptomics.

Authors: Damien Chaussabel; Virginia Pascual; Jacques Banchereau
Journal: BMC Biol Date: 2010-07-01 Impact factor: 7.431

6. Interpretation of genomic data: questions and answers.

Authors: Richard Simon
Journal: Semin Hematol Date: 2008-07 Impact factor: 3.851

7. A prototype tobacco-associated oral squamous cell carcinoma classifier using RNA from brush cytology.

Authors: Antonia Kolokythas; Mitchell J Bosman; Kristen B Pytynia; Suchismita Panda; Herve Y Sroussi; Yang Dai; Joel L Schwartz; Guy R Adami
Journal: J Oral Pathol Med Date: 2013-04-17 Impact factor: 4.253

Review 8. A decade of genome-wide gene expression profiling in acute myeloid leukemia: flashback and prospects.

Authors: Bas J Wouters; Bob Löwenberg; Ruud Delwel
Journal: Blood Date: 2008-08-14 Impact factor: 22.113

9. Factors influencing the statistical power of complex data analysis protocols for molecular signature development from microarray data.

Authors: Constantin F Aliferis; Alexander Statnikov; Ioannis Tsamardinos; Jonathan S Schildcrout; Bryan E Shepherd; Frank E Harrell
Journal: PLoS One Date: 2009-03-17 Impact factor: 3.240

10. Emerging concepts in biomarker discovery; the US-Japan Workshop on Immunological Molecular Markers in Oncology.

Authors: Hideaki Tahara; Marimo Sato; Magdalena Thurin; Ena Wang; Lisa H Butterfield; Mary L Disis; Bernard A Fox; Peter P Lee; Samir N Khleif; Jon M Wigginton; Stefan Ambs; Yasunori Akutsu; Damien Chaussabel; Yuichiro Doki; Oleg Eremin; Wolf Hervé Fridman; Yoshihiko Hirohashi; Kohzoh Imai; James Jacobson; Masahisa Jinushi; Akira Kanamoto; Mohammed Kashani-Sabet; Kazunori Kato; Yutaka Kawakami; John M Kirkwood; Thomas O Kleen; Paul V Lehmann; Lance Liotta; Michael T Lotze; Michele Maio; Anatoli Malyguine; Giuseppe Masucci; Hisahiro Matsubara; Shawmarie Mayrand-Chung; Kiminori Nakamura; Hiroyoshi Nishikawa; A Karolina Palucka; Emanuel F Petricoin; Zoltan Pos; Antoni Ribas; Licia Rivoltini; Noriyuki Sato; Hiroshi Shiku; Craig L Slingluff; Howard Streicher; David F Stroncek; Hiroya Takeuchi; Minoru Toyota; Hisashi Wada; Xifeng Wu; Julia Wulfkuhle; Tomonori Yaguchi; Benjamin Zeskind; Yingdong Zhao; Mai-Britt Zocca; Francesco M Marincola
Journal: J Transl Med Date: 2009-06-17 Impact factor: 5.531