Probability-based pattern recognition and statistical framework for randomization: modeling tandem mass spectrum/peptide sequence false match frequencies.

Jian Feng, Daniel Q Naiman, Bret Cooper.

Abstract

MOTIVATION: In proteomics, reverse database searching is used to control the false match frequency for tandem mass spectrum/peptide sequence matches, but reversal creates sequences devoid of patterns that usually challenge database-search software.
RESULTS: We designed an unsupervised pattern recognition algorithm for detecting patterns with various lengths from large sequence datasets. The patterns found in a protein sequence database were used to create decoy databases using a Monte Carlo sampling algorithm. Searching these decoy databases led to the prediction of false positive rates for spectrum/peptide sequence matches. We show examples where this method, independent of instrumentation, database-search software and samples, provides better estimation of false positive identification rates than a prevailing reverse database searching method. The pattern detection algorithm can also be used to analyze sequences for other purposes in biology or cryptology.
AVAILABILITY: On request from the authors.
SUPPLEMENTARY INFORMATION: http://bioinformatics.psb.ugent.be/.
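
ILLUSTRATION: The abstract contrasts two ways of building a decoy database: reversing the target sequences, and Monte Carlo sampling from patterns learned from the target database. The Python sketch below is not the authors' algorithm, which is available only on request. It substitutes a simple fixed-length k-mer model for their variable-length pattern detector, and every name in it (`reverse_decoys`, `train_kmer_model`, `sample_decoy`, `false_match_frequency`) is hypothetical. It is meant only to show the general shape of the two decoy constructions and of a decoy-based false match estimate, assuming match scores where higher is better.

```python
import random
from collections import defaultdict


def reverse_decoys(proteins):
    """Conventional decoy construction: reverse every target sequence."""
    return [seq[::-1] for seq in proteins]


def train_kmer_model(proteins, k=2):
    """Record which residue follows each length-k context in the targets.

    Assumption: a fixed k stands in for the paper's variable-length patterns.
    """
    model = defaultdict(list)
    for seq in proteins:
        for i in range(len(seq) - k):
            model[seq[i:i + k]].append(seq[i + k])
    return model


def sample_decoy(model, length, k, rng):
    """Monte Carlo sample one decoy of the given length from the k-mer model."""
    contexts = sorted(model)  # sorted so a seeded rng gives repeatable output
    decoy = rng.choice(contexts)
    while len(decoy) < length:
        followers = model.get(decoy[-k:])
        if followers:
            decoy += rng.choice(followers)
        else:
            # Dead end (context never seen with a follower): restart from a random context.
            decoy += rng.choice(contexts)
    return decoy[:length]


def false_match_frequency(target_scores, decoy_scores, threshold):
    """Estimate the false match frequency as decoy hits / target hits at a threshold."""
    target_hits = sum(score >= threshold for score in target_scores)
    decoy_hits = sum(score >= threshold for score in decoy_scores)
    return decoy_hits / target_hits if target_hits else 0.0


# Toy target database (made-up sequences, for illustration only).
targets = ["MKTAYIAKQRQISFVKSHFSRQ", "MKLVINGKTLKGEITVEAPD"]
rng = random.Random(0)
model = train_kmer_model(targets, k=2)
sampled = [sample_decoy(model, len(seq), 2, rng) for seq in targets]
reversed_db = reverse_decoys(targets)
```

In practice the scores passed to `false_match_frequency` would come from running the same database-search software against the target and decoy databases. The sketch leaves that step out because it depends on the search engine used.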

Year:  2007        PMID: 17510167     DOI: 10.1093/bioinformatics/btm267

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  12 in total

1.  Non-parametric estimation of posterior error probabilities associated with peptides identified by tandem mass spectrometry.

Authors:  Lukas Käll; John D Storey; William Stafford Noble
Journal:  Bioinformatics       Date:  2008-08-15       Impact factor: 6.937

Review 2.  Translational informatics: enabling high-throughput research paradigms.

Authors:  Philip R O Payne; Peter J Embi; Chandan K Sen
Journal:  Physiol Genomics       Date:  2009-09-08       Impact factor: 3.107

3.  Mass spectrometry-based protein identification with accurate statistical significance assignment.

Authors:  Gelio Alves; Yi-Kuo Yu
Journal:  Bioinformatics       Date:  2014-10-31       Impact factor: 6.937

4.  A review of statistical methods for protein identification using tandem mass spectrometry.

Authors:  Oliver Serang; William Noble
Journal:  Stat Interface       Date:  2012       Impact factor: 0.582

Review 5.  A survey of computational methods and error rate estimation procedures for peptide and protein identification in shotgun proteomics.

Authors:  Alexey I Nesvizhskii
Journal:  J Proteomics       Date:  2010-09-08       Impact factor: 4.044

6.  Different Cellular Origins and Functions of Extracellular Proteins from Escherichia coli O157:H7 and O104:H4 as Determined by Comparative Proteomic Analysis.

Authors:  Nazrul Islam; Attila Nagy; Wesley M Garrett; Dan Shelton; Bret Cooper; Xiangwu Nou
Journal:  Appl Environ Microbiol       Date:  2016-06-30       Impact factor: 4.792

7.  DNA repair of 8-oxo-7,8-dihydroguanine lesions in Porphyromonas gingivalis.

Authors:  Leroy G Henry; Lawrence Sandberg; Kangling Zhang; Hansel M Fletcher
Journal:  J Bacteriol       Date:  2008-10-10       Impact factor: 3.490

8.  A cross-validation scheme for machine learning algorithms in shotgun proteomics.

Authors:  Viktor Granholm; William Stafford Noble; Lukas Käll
Journal:  BMC Bioinformatics       Date:  2012-11-05       Impact factor: 3.169

Review 9.  Computational and statistical analysis of protein mass spectrometry data.

Authors:  William Stafford Noble; Michael J MacCoss
Journal:  PLoS Comput Biol       Date:  2012-01-26       Impact factor: 4.475

10.  Chapter 1: Biomedical knowledge integration.

Authors:  Philip R O Payne
Journal:  PLoS Comput Biol       Date:  2012-12-27       Impact factor: 4.475
