Machine learning methods for predictive proteomics.

Annalisa Barla, Giuseppe Jurman, Samantha Riccadonna, Stefano Merler, Marco Chierici, Cesare Furlanello.

Abstract

The search for predictive biomarkers of disease from high-throughput mass spectrometry (MS) data requires a complex analysis path. Preprocessing and machine-learning modules are pipelined, starting from raw spectra, to set up a predictive classifier based on a shortlist of candidate features. As a machine-learning problem, proteomic profiling on MS data calls for the same caution as the microarray case. The risk of overfitting and of selection bias is pervasive: not only do potential features easily outnumber samples by a factor of 10^3, but it is also easy to neglect information-leakage effects during preprocessing from spectra to peaks. The aim of this review is to explain how to build a general-purpose design analysis protocol (DAP) for predictive proteomic profiling: we show how to limit leakage due to parameter tuning and how to organize classification and ranking on large numbers of replicate versions of the original data to avoid selection bias. The DAP can be used with alternative components, i.e. with different preprocessing methods (peak clustering or wavelet based), classifiers (e.g. Support Vector Machine, SVM) or feature ranking methods (recursive feature elimination or I-Relief). A procedure for assessing the stability and predictive value of the resulting biomarker list is also provided. The approach is exemplified with experiments on synthetic datasets (from the Cromwell MS simulator) and with publicly available datasets from cancer studies.
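
The core idea of ranking and classifying on many replicate splits, with feature selection confined to the training part of each split, can be illustrated with a minimal sketch. This is not the authors' DAP: the function names, the Gaussian toy data (standing in for simulated spectra), the mean-difference ranking (standing in for recursive feature elimination or I-Relief) and the nearest-centroid classifier (standing in for an SVM) are all assumptions made for illustration.

```python
import random
import statistics

def make_data(n_samples=40, n_features=200, n_informative=5, shift=4.0, seed=0):
    """Toy two-class data: only the first n_informative features carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in range(n_informative):
            row[j] += shift * label
        X.append(row)
        y.append(label)
    return X, y

def mean(values):
    return sum(values) / len(values)

def rank_features(X, y):
    """Rank features by standardized mean difference between the two classes."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 0]
        b = [row[j] for row, lab in zip(X, y) if lab == 1]
        pooled = (statistics.pvariance(a) + statistics.pvariance(b)) / 2 + 1e-12
        scores.append(abs(mean(a) - mean(b)) / pooled ** 0.5)
    return sorted(range(len(scores)), key=lambda j: -scores[j])

def predict(X_train, y_train, X_test, features):
    """Nearest-centroid prediction restricted to the selected features."""
    centroids = {}
    for lab in (0, 1):
        rows = [r for r, l in zip(X_train, y_train) if l == lab]
        centroids[lab] = [mean([r[j] for r in rows]) for j in features]
    preds = []
    for r in X_test:
        dist = {lab: sum((r[j] - c) ** 2 for j, c in zip(features, cen))
                for lab, cen in centroids.items()}
        preds.append(min(dist, key=dist.get))
    return preds

def resampling_protocol(X, y, n_splits=20, test_fraction=0.25, k=5, seed=1):
    """Repeat random train/test splits; rank features on the training part only."""
    rng = random.Random(seed)
    n, n_test = len(X), int(len(X) * test_fraction)
    accuracies, counts = [], [0] * len(X[0])
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)
        test_idx, train_idx = idx[:n_test], idx[n_test:]
        X_tr, y_tr = [X[i] for i in train_idx], [y[i] for i in train_idx]
        X_te, y_te = [X[i] for i in test_idx], [y[i] for i in test_idx]
        # Selection never sees the test samples, which avoids selection bias.
        selected = rank_features(X_tr, y_tr)[:k]
        for j in selected:
            counts[j] += 1
        preds = predict(X_tr, y_tr, X_te, selected)
        accuracies.append(mean([int(p == t) for p, t in zip(preds, y_te)]))
    return accuracies, counts

X, y = make_data()
accuracies, counts = resampling_protocol(X, y)
mean_accuracy = mean(accuracies)
# Features selected most often across splits form the candidate shortlist.
top_features = sorted(range(len(counts)), key=lambda j: -counts[j])[:5]
print("mean accuracy:", round(mean_accuracy, 3))
print("most frequently selected features:", sorted(top_features))
```

The per-feature selection counts give a crude view of list stability: a feature chosen in nearly every split is a more credible candidate than one that appears only occasionally. Ranking features once on the full dataset and then cross-validating the classifier would leak test information into the selection step and inflate the accuracy estimate.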

Year:  2008        PMID: 18310105     DOI: 10.1093/bib/bbn008

Source DB:  PubMed          Journal:  Brief Bioinform        ISSN: 1467-5463            Impact factor:   11.622


  17 in total

Review 1.  Systems vaccinology: its promise and challenge for HIV vaccine development.

Authors:  Helder I Nakaya; Bali Pulendran
Journal:  Curr Opin HIV AIDS       Date:  2012-01       Impact factor: 4.283

2.  Signal processing for metagenomics: extracting information from the soup.

Authors:  Gail L Rosen; Bahrad A Sokhansanj; Robi Polikar; Mary Ann Bruns; Jacob Russell; Elaine Garbarine; Steve Essinger; Non Yok
Journal:  Curr Genomics       Date:  2009-11       Impact factor: 2.236

3.  Pitfalls of supervised feature selection.

Authors:  Pawel Smialowski; Dmitrij Frishman; Stefan Kramer
Journal:  Bioinformatics       Date:  2009-10-29       Impact factor: 6.937

4.  Functional analysis of multiple genomic signatures demonstrates that classification algorithms choose phenotype-related genes.

Authors:  W Shi; M Bessarabova; D Dosymbekov; Z Dezso; T Nikolskaya; M Dudoladova; T Serebryiskaya; A Bugrim; A Guryanov; R J Brennan; R Shah; J Dopazo; M Chen; Y Deng; T Shi; G Jurman; C Furlanello; R S Thomas; J C Corton; W Tong; L Shi; Y Nikolsky
Journal:  Pharmacogenomics J       Date:  2010-08       Impact factor: 3.550

5.  Challenges in Biomarker Discovery: Combining Expert Insights with Statistical Analysis of Complex Omics Data.

Authors:  Jason E McDermott; Jing Wang; Hugh Mitchell; Bobbie-Jo Webb-Robertson; Ryan Hafen; John Ramey; Karin D Rodland
Journal:  Expert Opin Med Diagn       Date:  2013-01

Review 6.  Pathway and network analysis in proteomics.

Authors:  Xiaogang Wu; Mohammad Al Hasan; Jake Yue Chen
Journal:  J Theor Biol       Date:  2014-06-06       Impact factor: 2.691

7.  Bioinformatic-driven search for metabolic biomarkers in disease.

Authors:  Christian Baumgartner; Melanie Osl; Michael Netzer; Daniela Baumgartner
Journal:  J Clin Bioinforma       Date:  2011-01-20

8.  An integrated method for cancer classification and rule extraction from microarray data.

Authors:  Liang-Tsung Huang
Journal:  J Biomed Sci       Date:  2009-02-24       Impact factor: 8.410

9.  A scale space approach for unsupervised feature selection in mass spectra classification for ovarian cancer detection.

Authors:  Michele Ceccarelli; Antonio d'Acierno; Angelo Facchiano
Journal:  BMC Bioinformatics       Date:  2009-10-15       Impact factor: 3.169

Review 10.  Knowledge-based analysis of proteomics data.

Authors:  Marina Bessarabova; Alexander Ishkin; Lellean JeBailey; Tatiana Nikolskaya; Yuri Nikolsky
Journal:  BMC Bioinformatics       Date:  2012-11-05       Impact factor: 3.169
