A critical assessment of feature selection methods for biomarker discovery in clinical proteomics.

Christin Christin, Huub C J Hoefsloot, Age K Smilde, B Hoekman, Frank Suits, Rainer Bischoff, Peter Horvatovich.

Abstract

In this paper, we compare the performance of six different feature selection methods for LC-MS-based proteomics and metabolomics biomarker discovery: the t test, the Mann-Whitney-Wilcoxon test (mww test), nearest shrunken centroid (NSC), linear support vector machine-recursive feature elimination (SVM-RFE), principal component discriminant analysis (PCDA), and partial least squares discriminant analysis (PLSDA). The methods were evaluated on human urine and porcine cerebrospinal fluid samples that were spiked with a range of peptides at different concentration levels. The ideal feature selection method should select the complete list of discriminating features that are related to the spiked peptides without selecting unrelated features. Whereas many studies have to rely on classification error to judge the reliability of the selected biomarker candidates, we assessed the accuracy of selection directly from the list of spiked peptides. The feature selection methods were applied to data sets with different sample sizes and extents of sample class separation determined by the concentration level of spiked compounds. For each feature selection method and data set, the performance for selecting a set of features related to spiked compounds was assessed using the harmonic mean of the recall and the precision (f-score) and the geometric mean of the recall and the true negative rate (g-score). We conclude that the univariate t test and the mww test with multiple testing corrections are not applicable to data sets with small sample sizes (n = 6), but their performance improves markedly with increasing sample size up to a point (n > 12) at which they outperform the other methods. PCDA and PLSDA select small feature sets with high precision but miss many true positive features related to the spiked peptides. NSC strikes a reasonable compromise between recall and precision for all data sets independent of spiking level and number of samples. Linear SVM-RFE performs poorly for selecting features related to the spiked compounds, even though the classification error is relatively low.
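The two scores named in the abstract can be illustrated with a minimal Python sketch. It assumes that a selection result is represented as a set of feature identifiers and that the spiked-peptide features are known. The function name `selection_scores` and the toy feature labels are hypothetical and are not taken from the paper.

```python
from math import sqrt


def selection_scores(selected, true_features, all_features):
    """Return (f_score, g_score) for one feature selection result.

    selected      -- features chosen by the selection method
    true_features -- features known to relate to the spiked peptides
    all_features  -- every feature in the data set
    """
    selected = set(selected)
    true_features = set(true_features)
    all_features = set(all_features)

    tp = len(selected & true_features)                 # correctly selected
    fp = len(selected - true_features)                 # selected but unrelated
    fn = len(true_features - selected)                 # related but missed
    tn = len(all_features - selected - true_features)  # correctly left out

    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    true_negative_rate = tn / (tn + fp) if (tn + fp) else 0.0

    # f-score: harmonic mean of recall and precision
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    # g-score: geometric mean of recall and true negative rate
    g_score = sqrt(recall * true_negative_rate)
    return f_score, g_score


# Toy example (made-up numbers): 100 features, 10 related to spiked peptides,
# and a method that selects 6 of those plus 2 unrelated features.
all_features = [f"feat{i}" for i in range(100)]
true_features = all_features[:10]
selected = all_features[:6] + ["feat50", "feat51"]

f_score, g_score = selection_scores(selected, true_features, all_features)
print(f"f-score = {f_score:.3f}, g-score = {g_score:.3f}")
```

In this toy case the recall is 0.6 and the precision is 0.75. Because unrelated features vastly outnumber spiked ones, the true negative rate stays close to 1, so the g-score is driven mostly by recall while the f-score penalizes false positives more directly.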

Year:  2012        PMID: 23115301      PMCID: PMC3536906          DOI: 10.1074/mcp.M112.022566

Source DB:  PubMed          Journal:  Mol Cell Proteomics        ISSN: 1535-9476            Impact factor:   5.911


