Improving the Mann-Whitney statistical test for feature selection: an approach in breast cancer diagnosis on mammography.

Noel Pérez Pérez, Miguel A Guevara López, Augusto Silva, Isabel Ramos.

Abstract

OBJECTIVE: This work presents the theoretical description and experimental evaluation of a new feature selection method named uFilter. uFilter improves on the Mann-Whitney U-test for reducing dimensionality and ranking features in binary classification problems. The work also presents a practical application of uFilter to breast cancer computer-aided diagnosis (CADx).
MATERIALS AND METHODS: A total of 720 datasets (ranked subsets of features) were formed by applying chi-square (CHI2) discretization, information gain (IG), one-rule (1Rule), Relief, uFilter and the method on which uFilter is theoretically based (the U-test). Each resulting dataset was used to train feed-forward backpropagation neural network, support vector machine, linear discriminant analysis and naive Bayes classifiers, producing classification scores for further statistical comparison.
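As an illustration of the baseline that uFilter builds on, the following minimal sketch ranks features by the plain Mann-Whitney U-test in a binary problem. It is not the uFilter method, whose details the abstract does not give. The function names (`midranks`, `u_statistic`, `u_score`, `rank_features`), the toy data and the use of an absolute normal-approximation z-score without tie correction as the ranking score are all assumptions made for this example.

```python
from math import sqrt


def midranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def u_statistic(pos, neg):
    """Mann-Whitney U for the first sample: R1 - n1 * (n1 + 1) / 2."""
    ranks = midranks(list(pos) + list(neg))
    r1 = sum(ranks[:len(pos)])
    return r1 - len(pos) * (len(pos) + 1) / 2


def u_score(pos, neg):
    """Absolute z-score of U under the normal approximation (no tie correction)."""
    n1, n2 = len(pos), len(neg)
    mean = n1 * n2 / 2
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return abs(u_statistic(pos, neg) - mean) / sd


def rank_features(X, y):
    """Return feature indices ordered from most to least class-separating."""
    n_features = len(X[0])
    scores = []
    for f in range(n_features):
        pos = [row[f] for row, label in zip(X, y) if label == 1]
        neg = [row[f] for row, label in zip(X, y) if label == 0]
        scores.append(u_score(pos, neg))
    return sorted(range(n_features), key=lambda f: scores[f], reverse=True)


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [7.0, 2.0], [8.0, 6.0], [9.0, 3.0]]
y = [0, 0, 0, 1, 1, 1]
ranking = rank_features(X, y)
```

In practice a library routine such as `scipy.stats.mannwhitneyu` would normally replace the hand-written statistic; the pure-Python version is used here only to keep the sketch dependency-free.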
RESULTS: A head-to-head comparison based on the mean area under the receiver operating characteristic curve (AUC) showed that the uFilter method significantly outperformed the U-test method for almost all classification schemes (p<0.05): it was superior in 50%, tied in 37.5% and lost in 12.5% of the 24 comparative scenarios. In addition, compared with the CHI2 discretization, IG, 1Rule and Relief methods, uFilter was superior or at least statistically similar on the explored datasets while requiring fewer features.
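The comparison metric above, the area under the receiver operating characteristic curve, can be computed directly from classifier scores. A minimal sketch follows, assuming higher scores indicate the positive class. It uses the standard pairwise formulation, in which the area equals the fraction of (positive, negative) pairs ranked correctly, with ties counted as one half; this is the Mann-Whitney U statistic divided by the number of pairs. The function names and the scores are hypothetical and are not taken from the study.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve as the fraction of correctly ordered pairs."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0   # positive case scored above the negative case
            elif p == n:
                wins += 0.5   # ties count as half a correct ordering
    return wins / (len(scores_pos) * len(scores_neg))


def mean_auc(runs):
    """Average the area over several (scores_pos, scores_neg) runs."""
    return sum(auc(pos, neg) for pos, neg in runs) / len(runs)


# Hypothetical classifier outputs for two evaluation runs.
runs = [
    ([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]),  # 8 of 9 pairs ordered correctly
    ([0.6, 0.5], [0.5, 0.1]),            # 3 correct pairs plus one tie
]
average = mean_auc(runs)
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a sketch; a rank-based computation gives the same value in O(n log n).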
CONCLUSIONS: The experimental results indicated that the uFilter method statistically outperformed the U-test method and demonstrated performance similar, but not superior, to that of traditional feature selection methods (CHI2 discretization, IG, 1Rule and Relief). The uFilter method showed competitive and appealing cost-effectiveness in selecting relevant features as a support tool for breast cancer CADx methods, especially for unbalanced datasets. Finally, redundancy analysis as a complementary step to the uFilter method provided an effective way of finding optimal subsets of features without decreasing classification performance.
Copyright © 2014 Elsevier B.V. All rights reserved.

Keywords:  Breast cancer CADx; Feature selection methods; Machine learning algorithms; Mann–Whitney U-test; Redundancy analysis; uFilter method

Year:  2014        PMID: 25555756     DOI: 10.1016/j.artmed.2014.12.004

Source DB:  PubMed          Journal:  Artif Intell Med        ISSN: 0933-3657            Impact factor:   5.326


  15 in total

Review 1.  Radiological images and machine learning: Trends, perspectives, and prospects.

Authors:  Zhenwei Zhang; Ervin Sejdić
Journal:  Comput Biol Med       Date:  2019-02-27       Impact factor: 4.589

2.  Paying attention to speech: The role of working memory capacity and professional experience.

Authors:  Bar Lambez; Galit Agmon; Paz Har-Shai Yahav; Yuri Rassovsky; Elana Zion Golumbic
Journal:  Atten Percept Psychophys       Date:  2020-10       Impact factor: 2.199

3.  Deep learning modeling using normal mammograms for predicting breast cancer risk.

Authors:  Dooman Arefan; Aly A Mohamed; Wendie A Berg; Margarita L Zuley; Jules H Sumkin; Shandong Wu
Journal:  Med Phys       Date:  2019-11-19       Impact factor: 4.071

4.  Discrimination of Breast Cancer with Microcalcifications on Mammography by Deep Learning.

Authors:  Jinhua Wang; Xi Yang; Hongmin Cai; Wanchang Tan; Cangzheng Jin; Li Li
Journal:  Sci Rep       Date:  2016-06-07       Impact factor: 4.379

5.  Urinary peptidomics analysis reveals proteases involved in diabetic nephropathy.

Authors:  Magdalena Krochmal; Georgia Kontostathi; Pedro Magalhães; Manousos Makridakis; Julie Klein; Holger Husi; Johannes Leierer; Gert Mayer; Jean-Loup Bascands; Colette Denis; Jerome Zoidakis; Petra Zürbig; Christian Delles; Joost P Schanstra; Harald Mischak; Antonia Vlahou
Journal:  Sci Rep       Date:  2017-11-09       Impact factor: 4.379

6.  Modified Bat Algorithm for Feature Selection with the Wisconsin Diagnosis Breast Cancer (WDBC) Dataset

Authors:  Suganthi Jeyasingh; Malathi Veluchamy
Journal:  Asian Pac J Cancer Prev       Date:  2017-05-01

7.  An Enhanced Grey Wolf Optimization Based Feature Selection Wrapped Kernel Extreme Learning Machine for Medical Diagnosis.

Authors:  Qiang Li; Huiling Chen; Hui Huang; Xuehua Zhao; ZhenNao Cai; Changfei Tong; Wenbin Liu; Xin Tian
Journal:  Comput Math Methods Med       Date:  2017-01-26       Impact factor: 2.238

Review 8.  Involvement of Machine Learning for Breast Cancer Image Classification: A Survey.

Authors:  Abdullah-Al Nahid; Yinan Kong
Journal:  Comput Math Methods Med       Date:  2017-12-31       Impact factor: 2.238

9.  Early and accurate detection and diagnosis of heart disease using intelligent computational model.

Authors:  Yar Muhammad; Muhammad Tahir; Maqsood Hayat; Kil To Chong
Journal:  Sci Rep       Date:  2020-11-12       Impact factor: 4.379

10.  Detection of Suicidal Ideation on Social Media: Multimodal, Relational, and Behavioral Analysis.

Authors:  Diana Ramírez-Cifuentes; Ana Freire; Ricardo Baeza-Yates; Joaquim Puntí; Pilar Medina-Bravo; Diego Alejandro Velazquez; Josep Maria Gonfaus; Jordi Gonzàlez
Journal:  J Med Internet Res       Date:  2020-07-07       Impact factor: 5.428
