

A data-analytic strategy for protein biomarker discovery: profiling of high-dimensional proteomic data for cancer detection.

Yutaka Yasui, Margaret Pepe, Mary Lou Thompson, Bao-Ling Adam, George L Wright, Yinsheng Qu, John D Potter, Marcy Winget, Mark Thornquist, Ziding Feng.

Abstract

With recent advances in mass spectrometry techniques, it is now possible to investigate proteins over a wide range of molecular weights in small biological specimens. This advance has generated data-analytic challenges in proteomics, similar to those created by microarray technologies in genetics, namely, discovery of 'signature' protein profiles specific to each pathologic state (e.g. normal vs. cancer) or differential profiles between experimental conditions (e.g. treated by a drug of interest vs. untreated) from high-dimensional data. We propose a data-analytic strategy for discovering protein biomarkers based on such high-dimensional mass spectrometry data. A real biomarker-discovery project on prostate cancer is taken as a concrete example throughout the paper: the project aims to identify proteins in serum that distinguish cancer, benign hyperplasia, and normal states of prostate using the Surface Enhanced Laser Desorption/Ionization (SELDI) technology, a recently developed mass spectrometry technique. Our data-analytic strategy takes properties of the SELDI mass spectrometer into account: the SELDI output of a specimen contains about 48,000 (x, y) points where x is the protein mass divided by the number of charges introduced by ionization and y is the protein intensity of the corresponding mass per charge value, x, in that specimen. Given high coefficients of variation and other characteristics of protein intensity measures (y values), we reduce the measures of protein intensities to a set of binary variables that indicate peaks in the y-axis direction in the nearest neighborhoods of each mass per charge point in the x-axis direction. We then account for a shifting (measurement error) problem of the x-axis in SELDI output. After this pre-analysis processing of data, we combine the binary predictors to generate classification rules for cancer, benign hyperplasia, and normal states of prostate. 
Our approach is to apply the boosting algorithm to select binary predictors and construct a summary classifier. We empirically evaluate sensitivity and specificity of the resulting summary classifiers with a test dataset that is independent from the training dataset used to construct the summary classifiers. The proposed method performed nearly perfectly in distinguishing cancer and benign hyperplasia from normal. In the classification of cancer vs. benign hyperplasia, however, an appreciable proportion of the benign specimens were classified incorrectly as cancer. We discuss practical issues associated with our proposed approach to the analysis of SELDI output and its application in cancer biomarker discovery.
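The pre-processing steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the neighborhood half-width, and the relative tolerance used for the x-axis shift are all hypothetical choices made for illustration, and the boosting step that combines the binary predictors is omitted. The sketch shows (1) reducing intensities to binary local-peak indicators, (2) tolerating small m/z shifts when asking whether a specimen has a peak near a target value, and (3) computing sensitivity and specificity on held-out labels.

```python
from typing import List, Sequence, Tuple


def peak_indicators(y: Sequence[float], half_width: int) -> List[int]:
    """Return 1 where y[i] is the maximum of its +/- half_width-point
    neighborhood (and the neighborhood is not flat), else 0.
    half_width is an assumed tuning parameter."""
    n = len(y)
    flags = []
    for i in range(n):
        window = y[max(0, i - half_width):min(n, i + half_width + 1)]
        is_peak = y[i] == max(window) and y[i] > min(window)
        flags.append(1 if is_peak else 0)
    return flags


def peak_near(x: Sequence[float], flags: Sequence[int],
              target: float, rel_tol: float) -> int:
    """Return 1 if any flagged peak lies within target * (1 +/- rel_tol).
    The relative tolerance stands in for the x-axis shift; its value
    is an assumption, not taken from the paper."""
    return int(any(f == 1 and abs(xi - target) <= rel_tol * target
                   for xi, f in zip(x, flags)))


def sensitivity_specificity(truth: Sequence[int],
                            pred: Sequence[int]) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    Labels are 1 (diseased) or 0 (non-diseased)."""
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)
```

Working with binary peak indicators rather than raw intensities sidesteps the high coefficients of variation in the y values, at the cost of discarding peak height. Note that in this sketch every point on a plateau of tied maxima is flagged.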

Entities: Disease

Substances: Biomarkers, Tumor

Year: 2003 PMID: 12925511 DOI: 10.1093/biostatistics/4.3.449

Source DB: PubMed Journal: Biostatistics ISSN: 1465-4644 Impact factor: 5.899

Cited: 52 in total

1. Choosing Therapy on the Basis of Disease Classifications in Inflammatory Bowel Disease.

Authors: Maria T. Abreu
Journal: Curr Treat Options Gastroenterol Date: 2004-06

2. Peptide Peak Detection for Low Resolution MALDI-TOF Mass Spectrometry.

Authors: Jingwen Yao; Shin-Ichi Utsunomiya; Shigeki Kajihara; Tsuyoshi Tabata; Ken Aoshima; Yoshiya Oda; Koichi Tanaka
Journal: Mass Spectrom (Tokyo) Date: 2014-08-23

3. PrepMS: TOF MS data graphical preprocessing tool.

Authors: Yuliya V Karpievitch; Elizabeth G Hill; Adam J Smolka; Jeffrey S Morris; Kevin R Coombes; Keith A Baggerly; Jonas S Almeida
Journal: Bioinformatics Date: 2006-11-22 Impact factor: 6.937

4. SELDI-TOF MS of quadruplicate urine and serum samples to evaluate changes related to storage conditions.

Authors: Avram Z Traum; Meghan P Wells; Manuel Aivado; Towia A Libermann; Marco F Ramoni; Asher D Schachter
Journal: Proteomics Date: 2006-03 Impact factor: 3.984

5. Bayesian analysis of mass spectrometry proteomic data using wavelet-based functional mixed models.

Authors: Jeffrey S Morris; Philip J Brown; Richard C Herrick; Keith A Baggerly; Kevin R Coombes
Journal: Biometrics Date: 2007-09-20 Impact factor: 2.571

6. Identification of a beta-casein-like peptide in breast nipple aspirate fluid that is associated with breast cancer.

Authors: Edward R Sauter; Wade Davis; Wenyi Qin; Sarah Scanlon; Brian Mooney; Karen Bromert; William R Folk
Journal: Biomark Med Date: 2009-10 Impact factor: 2.851

7. LC-MS Based Detection of Differential Protein Expression.

Authors: Leepika Tuli; Habtom W Ressom
Journal: J Proteomics Bioinform Date: 2009-10-02

8. A novel urine peptide biomarker-based algorithm for the prognosis of necrotising enterocolitis in human infants.

Authors: Karl G Sylvester; Xuefeng B Ling; G Y Liu; Zachary J Kastenberg; Jun Ji; Zhongkai Hu; Sihua Peng; Ken Lau; Fizan Abdullah; Mary L Brandt; Richard A Ehrenkranz; Mary Catherine Harris; Timothy C Lee; Joyce Simpson; Corinna Bowers; R Lawrence Moss
Journal: Gut Date: 2013-09-18 Impact factor: 23.059

9. A general-purpose baseline estimation algorithm for spectroscopic data.

Authors: Donald A Barkauskas; David M Rocke
Journal: Anal Chim Acta Date: 2010-01-11 Impact factor: 6.558

10. Current status and prospects of clinical proteomics studies on detection of colorectal cancer: hopes and fears. (Review)

Authors: M E de Noo; R A E M Tollenaar; A M Deelder; L H Bouwman
Journal: World J Gastroenterol Date: 2006-11-07 Impact factor: 5.742