Learning curves in classification with microarray data.

Abstract

The performance of many repeated tasks improves with experience and practice. This improvement tends to be rapid at first and then slows. The term "learning curve" is often used to describe this phenomenon. In supervised machine learning, the performance of classification algorithms often increases with the number of observations used to train the algorithm. We use progressively larger samples of observations to train the algorithm and then plot performance against the number of training observations. This yields the familiar negatively accelerating learning curve. To quantify the learning curve, we fit inverse power law models to the progressively sampled data. We fit such learning curves to four large clinical cancer genomic datasets, using three classifiers (diagonal linear discriminant analysis, K-nearest-neighbor with three neighbors, and support vector machines) and four values for the number of top genes included (5, 50, 500, and 5,000). The inverse power law models fit the progressively sampled data reasonably well and showed considerable diversity when multiple classifiers were applied to the same data. Some classifiers showed rapid and continued increases in performance as the number of training samples increased, while others showed little if any improvement. Assessing classifier efficiency is particularly important in genomic studies because samples are so expensive to obtain. It is therefore important to employ an algorithm that uses the predictive information efficiently. With a modest number of training samples (>50), learning curves can be used to assess the predictive efficiency of classification algorithms. Copyright 2010 Elsevier Inc. All rights reserved.
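The curve-fitting step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact parametrization, so this example assumes a common three-parameter inverse power law, error(n) = a * n^(-alpha) + b, where n is the number of training samples. The function names, the grid of exponents, and the synthetic data are all hypothetical and are not taken from the paper. For a fixed alpha the model is linear in a and b, so the sketch scans alpha over a grid and solves the remaining least-squares problem in closed form.

```python
def fit_inverse_power_law(sizes, errors, alphas=None):
    """Fit error(n) ~ a * n**(-alpha) + b (assumed form) by scanning alpha.

    For each candidate alpha, a and b come from simple linear regression of
    the errors on n**(-alpha). Returns (a, alpha, b, sse) for the best alpha.
    """
    if alphas is None:
        # Hypothetical search grid: 0.01, 0.02, ..., 3.00
        alphas = [k / 100 for k in range(1, 301)]
    m = len(sizes)
    mean_y = sum(errors) / m
    best = None
    for alpha in alphas:
        x = [n ** (-alpha) for n in sizes]
        mean_x = sum(x) / m
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        if sxx == 0:
            continue  # degenerate design, skip this alpha
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, errors))
        a = sxy / sxx            # decay amplitude
        b = mean_y - a * mean_x  # asymptotic error as n grows
        sse = sum((a * xi + b - yi) ** 2 for xi, yi in zip(x, errors))
        if best is None or sse < best[3]:
            best = (a, alpha, b, sse)
    return best


def predict_error(n, a, alpha, b):
    """Predicted error rate at training-set size n under the fitted curve."""
    return a * n ** (-alpha) + b


# Synthetic, noise-free learning curve (illustrative values only).
sizes = [10, 20, 40, 80, 160, 320]
errors = [predict_error(n, 0.9, 0.5, 0.10) for n in sizes]

a, alpha, b, sse = fit_inverse_power_law(sizes, errors)
print(f"a={a:.3f} alpha={alpha:.2f} b={b:.3f}")
print(f"extrapolated error at n=1000: {predict_error(1000, a, alpha, b):.3f}")
```

In practice the error values would come from repeated train/test splits at each training-set size, and the fitted b gives a rough estimate of the error a classifier approaches as more samples are added.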

Year: 2010 PMID: 20172367 PMCID: PMC4482113 DOI: 10.1053/j.seminoncol.2009.12.002

Source DB: PubMed Journal: Semin Oncol ISSN: 0093-7754 Impact factor: 4.929

2 in total

1. Statistical assessment of the learning curves of health technologies. (Review)

Authors: C R Ramsay; A M Grant; S A Wallace; P H Garthwaite; A F Monk; I T Russell
Journal: Health Technol Assess Date: 2001 Impact factor: 4.014

2. Estimating dataset size requirements for classifying DNA microarray data.

Authors: Sayan Mukherjee; Pablo Tamayo; Simon Rogers; Ryan Rifkin; Anna Engle; Colin Campbell; Todd R Golub; Jill P Mesirov
Journal: J Comput Biol Date: 2003 Impact factor: 1.479

3 in total

1. Addressing the challenge of defining valid proteomic biomarkers and classifiers.

Authors: Mohammed Dakna; Keith Harris; Alexandros Kalousis; Sebastien Carpentier; Walter Kolch; Joost P Schanstra; Marion Haubitz; Antonia Vlahou; Harald Mischak; Mark Girolami
Journal: BMC Bioinformatics Date: 2010-12-10 Impact factor: 3.169

2. Predicting sample size required for classification performance.

Authors: Rosa L Figueroa; Qing Zeng-Treitler; Sasikiran Kandula; Long H Ngo
Journal: BMC Med Inform Decis Mak Date: 2012-02-15 Impact factor: 2.796

3. Radio-pathomic Maps of Epithelium and Lumen Density Predict the Location of High-Grade Prostate Cancer.

Authors: Sean D McGarry; Sarah L Hurrell; Kenneth A Iczkowski; William Hall; Amy L Kaczmarowski; Anjishnu Banerjee; Tucker Keuter; Kenneth Jacobsohn; John D Bukowy; Marja T Nevalainen; Mark D Hohenwalter; William A See; Peter S LaViolette
Journal: Int J Radiat Oncol Biol Phys Date: 2018-04-24 Impact factor: 8.013
