Feature selection and classifier performance in computer-aided diagnosis: the effect of finite sample size.

B Sahiner, H P Chan, N Petrick, R F Wagner, L Hadjiiski.

Abstract

In computer-aided diagnosis (CAD), a frequently used approach for distinguishing normal and abnormal cases is first to extract potentially useful features for the classification task. Effective features are then selected from this entire pool of available features. Finally, a classifier is designed using the selected features. In this study, we investigated the effect of finite sample size on classification accuracy when classifier design involves stepwise feature selection in linear discriminant analysis, which is the most commonly used feature selection algorithm for linear classifiers. The feature selection and the classifier coefficient estimation steps were considered to be cascading stages in the classifier design process. We compared the performance of the classifier when feature selection was performed on the design samples alone and on the entire set of available samples, which consisted of design and test samples. The area Az under the receiver operating characteristic curve was used as our performance measure. After linear classifier coefficient estimation using the design samples, we studied the hold-out and resubstitution performance estimates. The two classes were assumed to have multidimensional Gaussian distributions, with a large number of features available for feature selection. We investigated the dependence of feature selection performance on the covariance matrices and means for the two classes, and examined the effects of sample size, number of available features, and parameters of stepwise feature selection on classifier bias. Our results indicated that the resubstitution estimate was always optimistically biased, except in cases where the parameters of stepwise feature selection were chosen such that too few features were selected by the stepwise procedure. When feature selection was performed using only the design samples, the hold-out estimate was always pessimistically biased. 
When feature selection was performed using the entire finite sample space, the hold-out estimates could be pessimistically or optimistically biased, depending on the number of features available for selection, the number of available samples, and their statistical distribution. For our simulation conditions, these estimates were always pessimistically (conservatively) biased if the ratio of the total number of available samples per class to the number of available features was greater than five.
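The comparison described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation. It assumes two classes drawn from the same unit-variance Gaussian distribution (so the ideal Az is 0.5 and any excess is bias), and it swaps stepwise selection for a simpler univariate ranking that keeps the `k` highest-scoring features. The function names, sample sizes, and seed are all hypothetical choices made for illustration.

```python
import numpy as np


def auc(scores_pos, scores_neg):
    """Area under the ROC curve via the Mann-Whitney statistic (ties count 1/2)."""
    diff = scores_pos[:, None] - scores_neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def top_k_features(x0, x1, k):
    """Simplified stand-in for stepwise selection: keep the k features
    with the largest absolute standardized difference in class means."""
    pooled_var = 0.5 * (x0.var(axis=0, ddof=1) + x1.var(axis=0, ddof=1))
    score = np.abs(x1.mean(axis=0) - x0.mean(axis=0)) / np.sqrt(pooled_var)
    return np.argsort(score)[-k:]


def lda_weights(x0, x1):
    """Fisher linear discriminant: w = S_pooled^{-1} (mean1 - mean0)."""
    n0, n1 = len(x0), len(x1)
    s0 = np.atleast_2d(np.cov(x0, rowvar=False))
    s1 = np.atleast_2d(np.cov(x1, rowvar=False))
    pooled = ((n0 - 1) * s0 + (n1 - 1) * s1) / (n0 + n1 - 2)
    return np.linalg.solve(pooled, x1.mean(axis=0) - x0.mean(axis=0))


def run_trial(rng, n_per_class, n_features, k):
    # Assumption: both classes share one distribution, so the ideal Az is 0.5.
    x0 = rng.standard_normal((n_per_class, n_features))
    x1 = rng.standard_normal((n_per_class, n_features))
    half = n_per_class // 2
    d0, t0 = x0[:half], x0[half:]  # design / test split, class 0
    d1, t1 = x1[:half], x1[half:]  # design / test split, class 1
    selections = {
        "design": top_k_features(d0, d1, k),  # selection sees design samples only
        "all": top_k_features(x0, x1, k),     # selection sees design + test samples
    }
    out = {}
    for name, sel in selections.items():
        # Classifier coefficients are always estimated from design samples.
        w = lda_weights(d0[:, sel], d1[:, sel])
        out["resub_" + name] = auc(d1[:, sel] @ w, d0[:, sel] @ w)
        out["holdout_" + name] = auc(t1[:, sel] @ w, t0[:, sel] @ w)
    return out


def simulate(n_trials=200, n_per_class=20, n_features=100, k=5, seed=0):
    """Average each Az estimate over independent simulated data sets."""
    rng = np.random.default_rng(seed)
    trials = [run_trial(rng, n_per_class, n_features, k) for _ in range(n_trials)]
    return {key: float(np.mean([t[key] for t in trials])) for key in trials[0]}


if __name__ == "__main__":
    for key, value in simulate().items():
        print(f"{key}: mean Az = {value:.3f}")
```

With these assumed settings (20 samples per class against 100 available features, a ratio well below five), the resubstitution averages should sit well above 0.5. The hold-out average should be optimistic when selection used all samples, and close to 0.5 when selection used the design samples alone. Because the classes are identical here, the sketch cannot show the pessimistic bias the abstract reports for design-only selection; that would require classes with a real separation.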

Year:  2000        PMID: 10947254      PMCID: PMC5713476          DOI: 10.1118/1.599017

Source DB:  PubMed          Journal:  Med Phys        ISSN: 0094-2405            Impact factor:   4.071


