A parsimonious threshold-independent protein feature selection method through the area under receiver operating characteristic curve.

Zhanfeng Wang, Yuan-chin I Chang, Zhiliang Ying, Liang Zhu, Yaning Yang.

Abstract

MOTIVATION: Protein expression profiling for differences indicative of early cancer holds promise for improving diagnostics. Because proteomic data from mass spectrometers are high-dimensional, their statistical analysis is challenging in many respects, including dimension reduction, feature subset selection and the construction of classification rules. The search for an optimal feature subset, commonly known as the feature subset selection (FSS) problem, is an important step towards disease classification/diagnostics with biomarkers.
METHODS: We develop a parsimonious threshold-independent feature selection (PTIFS) method based on the concept of area under the curve (AUC) of the receiver operating characteristic (ROC). To reduce computational complexity to a manageable level, we use a sigmoid approximation to the empirical AUC as the criterion function. Starting from an anchor feature, the PTIFS method selects a feature subset through an iterative updating algorithm. Highly correlated features that have similar discriminating power are precluded from being selected simultaneously. The classification rule is then determined from the resulting feature subset.
RESULTS: The performance of the proposed approach is investigated by extensive simulation studies, and by applying the method to two mass spectrometry data sets of prostate cancer and of liver cancer. We compare the new approach with the threshold gradient descent regularization (TGDR) method. The results show that our method can achieve comparable performance to that of the TGDR method in terms of disease classification, but with fewer features selected.
AVAILABILITY: Supplementary Material and the PTIFS implementations are available at http://staff.ustc.edu.cn/~ynyang/PTIFS.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
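
The sigmoid approximation to the empirical AUC mentioned under METHODS can be illustrated with a minimal sketch. This is not the authors' PTIFS implementation: the function names, the bandwidth parameter `h` and the toy scores are assumptions made for illustration only. The idea shown is that the step function counting correctly ordered (case, control) pairs is replaced by a smooth logistic function, which gives a differentiable criterion.

```python
import math

def empirical_auc(cases, controls):
    """Empirical AUC in Mann-Whitney form: the fraction of (case, control)
    pairs in which the case scores higher; ties count one half."""
    total = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                total += 1.0
            elif x == y:
                total += 0.5
    return total / (len(cases) * len(controls))

def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def sigmoid_auc(cases, controls, h=0.1):
    """Smooth surrogate for the empirical AUC: the indicator 1[x > y] is
    replaced by sigmoid((x - y) / h). The bandwidth h is a hypothetical
    tuning parameter; smaller h tracks the step function more closely."""
    total = sum(sigmoid((x - y) / h) for x in cases for y in controls)
    return total / (len(cases) * len(controls))

# Toy scores (made up for illustration, not data from the paper).
cases = [2.1, 3.4, 1.9, 2.8]
controls = [1.0, 2.0, 1.5, 2.5]

print(empirical_auc(cases, controls))
print(sigmoid_auc(cases, controls, h=0.5))
print(sigmoid_auc(cases, controls, h=0.01))
```

As `h` shrinks, the surrogate approaches the empirical AUC, while a larger `h` gives a smoother but more biased criterion. In a feature selection setting the scores would typically come from a combination of the selected features, which is where a smooth criterion becomes convenient for iterative updating.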

Year:  2007        PMID: 17878205     DOI: 10.1093/bioinformatics/btm442

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  10 in total

1.  A boosting method for maximizing the partial area under the ROC curve.

Authors:  Osamu Komori; Shinto Eguchi
Journal:  BMC Bioinformatics       Date:  2010-06-10       Impact factor: 3.169

2.  Multi-TGDR: a regularization method for multi-class classification in microarray experiments.

Authors:  Suyan Tian; Mayte Suárez-Fariñas
Journal:  PLoS One       Date:  2013-11-19       Impact factor: 3.240

3.  AucPR: an AUC-based approach using penalized regression for disease prediction with high-dimensional omics data.

Authors:  Wenbao Yu; Taesung Park
Journal:  BMC Genomics       Date:  2014-12-12       Impact factor: 3.969

4.  Visualization-aided classification ensembles discriminate lung adenocarcinoma and squamous cell carcinoma samples using their gene expression profiles.

Authors:  Ao Zhang; Chi Wang; Shiji Wang; Liang Li; Zhongmin Liu; Suyan Tian
Journal:  PLoS One       Date:  2014-10-15       Impact factor: 3.240

5.  Past, present, and future geographic range of the relict Mediterranean and Macaronesian Juniperus phoenicea complex.

Authors:  Montserrat Salvà-Catarineu; Angel Romo; Małgorzata Mazur; Monika Zielińska; Pietro Minissale; Ali A Dönmez; Krystyna Boratyńska; Adam Boratyński
Journal:  Ecol Evol       Date:  2021-03-25       Impact factor: 2.912

6.  The evolutionary heritage and ecological uniqueness of Scots pine in the Caucasus ecoregion is at risk of climate changes.

Authors:  M Dering; M Baranowska; B Beridze; I J Chybicki; I Danelia; G Iszkuło; G Kvartskhava; P Kosiński; G Rączka; P A Thomas; D Tomaszewski; Ł Walas; K Sękiewicz
Journal:  Sci Rep       Date:  2021-11-24       Impact factor: 4.379

7.  A classification for complex imbalanced data in disease screening and early diagnosis.

Authors:  Yiming Li; Wei-Wen Hsu
Journal:  Stat Med       Date:  2022-05-23       Impact factor: 2.497

8.  The future of Viscum album L. in Europe will be shaped by temperature and host availability.

Authors:  Łukasz Walas; Wojciech Kędziora; Marek Ksepko; Mariola Rabska; Dominik Tomaszewski; Peter A Thomas; Roman Wójcik; Grzegorz Iszkuło
Journal:  Sci Rep       Date:  2022-10-12       Impact factor: 4.996

9.  Patterns of genetic diversity in North Africa: Moroccan-Algerian genetic split in Juniperus thurifera subsp. africana.

Authors:  Asma Taib; Abdelkader Morsli; Aleksandra Chojnacka; Łukasz Walas; Katarzyna Sękiewicz; Adam Boratyński; Àngel Romo; Monika Dering
Journal:  Sci Rep       Date:  2020-03-16       Impact factor: 4.379

10.  Spatial genetic structure and diversity of natural populations of Aesculus hippocastanum L. in Greece.

Authors:  Łukasz Walas; Petros Ganatsas; Grzegorz Iszkuło; Peter A Thomas; Monika Dering
Journal:  PLoS One       Date:  2019-12-11       Impact factor: 3.240
