Feature Selection for Classification of SELDI-TOF-MS Proteomic Profiles.

Milos Hauskrecht, Richard Pelikan, David E Malehorn, William L Bigbee, Michael T Lotze, Herbert J Zeh, David C Whitcomb, James Lyons-Weiler.

Abstract

BACKGROUND: Proteomic peptide profiling is an emerging technology that is expected to enable early detection, enhance diagnosis and more clearly define the prognosis of many diseases. Although previous research has illustrated the ability of proteomic data to discriminate between cases and controls, significantly less attention has been paid to the analysis of the feature selection strategies that enable learning of such predictive models. Feature selection, in addition to classification, plays an important role in the successful identification of proteomic biomarker panels.
METHODS: We present a new, efficient, multivariate feature selection strategy that extracts useful feature panels directly from the high-throughput spectra. The strategy takes advantage of the characteristics of surface-enhanced laser desorption/ionisation time-of-flight mass spectrometry (SELDI-TOF-MS) profiles and enhances widely used univariate feature selection strategies with a heuristic based on multivariate de-correlation filtering. We analyse and compare two versions of the method: one in which all feature pairs must adhere to a maximum allowed correlation (MAC) threshold, and another in which the feature panel is built greedily by deciding among best univariate features at different MAC levels.
RESULTS: The analysis and comparison of feature selection strategies were carried out experimentally on a pancreatic cancer dataset with 57 cancer cases and 59 controls from the University of Pittsburgh Cancer Institute, Pittsburgh, Pennsylvania, USA. The analysis was conducted in both the whole-profile and peak-only modes. The results clearly show the benefit of the new strategy over univariate feature selection methods in terms of improved classification performance.
CONCLUSION: Understanding the characteristics of the spectra allows us to better assess the relative importance of potential features in the diagnosis of cancer. Incorporation of these characteristics into feature selection strategies often leads to a more efficient data analysis as well as improved classification performance.
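
The first variant described under METHODS, in which every pair of selected features must respect a maximum allowed correlation (MAC) threshold, could be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not name the univariate score, so the class-separation score used here, the function names (`pearson`, `univariate_score`, `select_features`) and the toy data are all assumptions.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def univariate_score(values, labels):
    """Assumed score: |difference of class means| / (sum of class std devs)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    diff = abs(mean(pos) - mean(neg))
    spread = pstdev(pos) + pstdev(neg)
    if spread > 0:
        return diff / spread
    return float("inf") if diff > 0 else 0.0


def select_features(columns, labels, k, mac):
    """Pick up to k feature indices, best univariate score first, skipping any
    feature whose absolute correlation with an already selected one exceeds mac."""
    ranked = sorted(
        range(len(columns)),
        key=lambda j: univariate_score(columns[j], labels),
        reverse=True,
    )
    selected = []
    for j in ranked:
        if len(selected) >= k:
            break
        if all(abs(pearson(columns[j], columns[s])) <= mac for s in selected):
            selected.append(j)
    return selected


# Toy data (hypothetical): 3 cases (label 1) followed by 3 controls (label 0).
labels = [1, 1, 1, 0, 0, 0]
columns = [
    [5.0, 5.2, 4.8, 1.0, 1.2, 0.8],  # strongly discriminative
    [4.0, 5.0, 4.5, 1.5, 0.5, 1.0],  # discriminative, highly correlated with feature 0
    [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],  # nearly uncorrelated with feature 0
]

print(select_features(columns, labels, k=2, mac=0.8))  # [0, 2]
print(select_features(columns, labels, k=2, mac=1.0))  # [0, 1]
```

With `mac=1.0` the filter is effectively disabled and the procedure reduces to plain univariate ranking; lowering the threshold trades individually strong but redundant features for less correlated ones. The second variant, which chooses greedily among the best univariate features at different MAC levels, is not sketched here because the abstract does not give enough detail.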

Year:  2005        PMID: 16309341     DOI: 10.2165/00822942-200504040-00003

Source DB:  PubMed          Journal:  Appl Bioinformatics        ISSN: 1175-5636

