Deep coverage of the Escherichia coli proteome enables the assessment of false discovery rates in simple proteogenomic experiments.

Karsten Krug, Alejandro Carpy, Gesa Behrends, Katarina Matic, Nelson C Soares, Boris Macek.

Abstract

Recent advances in mass spectrometry (MS) have led to increased applications of shotgun proteomics to the refinement of genome annotation. The typical "proteo-genomic" workflows rely on the mapping of peptide MS/MS spectra onto databases derived via six-frame translation of the genome sequence. These databases contain a large proportion of spurious protein sequences, which make the statistical confidence of the resulting peptide spectrum matches difficult to assess. Here we performed a comprehensive analysis of the Escherichia coli proteome using LTQ-Orbitrap MS and mapped the corresponding MS/MS spectra onto a six-frame translation of the E. coli genome. We hypothesized that the protein-coding part of the E. coli genome approaches complete annotation and that the majority of six-frame-specific (novel) peptide spectrum matches can be considered false positive identifications. We confirm our hypothesis by showing that the posterior error probability distribution of novel hits is almost identical to that of reversed (decoy) hits; this enables us to estimate the sensitivity, specificity, accuracy, and false discovery rate in a typical bacterial proteo-genomic dataset. We use two complementary computational frameworks for processing and statistical assessment of MS/MS data: MaxQuant and Trans-Proteomic Pipeline. We show that MaxQuant achieves a more sensitive six-frame database search with an acceptable false discovery rate and is therefore well suited for global genome reannotation applications, whereas the Trans-Proteomic Pipeline achieves higher specificity and is well suited for high-confidence validation. The use of a small and well-annotated bacterial genome enables us to address genome coverage achieved in state-of-the-art bacterial proteomics: identified peptide sequences mapped to all expressed E. coli proteins but covered 31.7% of the protein-coding genome sequence. Our results show that false discovery rates can be substantially underestimated even in "simple" proteo-genomic experiments obtained by means of high-accuracy MS and point to the necessity of further improvements concerning the coverage of peptide sequences by MS-based methods.
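
The two ideas the abstract relies on, six-frame translation of a genome sequence and error-rate estimation from decoy and novel hits, can be illustrated with a minimal Python sketch. This is not the authors' pipeline (which used MaxQuant and the Trans-Proteomic Pipeline). The function names, the PSM labels ("annotated", "novel", "decoy"), the score threshold, and the toy input are all assumptions made for illustration, and the novel-hit fraction follows the abstract's working hypothesis that novel E. coli hits are mostly false positives.

```python
from itertools import product

# Standard genetic code in NCBI "TCAG" order. Bacterial translation table 11
# assigns the same amino acids; it differs only in the permitted start codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Return translations of the three forward and three reverse frames."""
    forward = dna.upper()
    reverse = forward.translate(COMPLEMENT)[::-1]  # reverse complement
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(forward[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


def error_rates(psms, threshold):
    """Compare a decoy-based FDR estimate with the fraction of novel hits.

    psms: iterable of (score, label) pairs, label being one of the
    hypothetical strings "annotated", "novel", or "decoy".
    Assumption: novel (six-frame-specific) hits are treated as false
    positives, as hypothesized in the abstract for E. coli.
    """
    counts = {"annotated": 0, "novel": 0, "decoy": 0}
    for score, label in psms:
        if score >= threshold:
            counts[label] += 1
    targets = counts["annotated"] + counts["novel"]
    if targets == 0:
        return {"decoy_fdr": 0.0, "novel_fraction": 0.0}
    return {
        "decoy_fdr": counts["decoy"] / targets,      # target-decoy estimate
        "novel_fraction": counts["novel"] / targets,  # novel hits among targets
    }


if __name__ == "__main__":
    for frame, protein in six_frame_translation("ATGGCCTAA").items():
        print(frame, protein)
    toy_psms = [(10, "annotated"), (9, "annotated"), (8, "novel"),
                (7, "decoy"), (1, "novel")]
    print(error_rates(toy_psms, threshold=5))
```

In a real search database, each translated frame would additionally be split at stop codons into candidate open reading frames and digested in silico. A novel-hit fraction that clearly exceeds the decoy-based estimate would be one sign of the underestimation the abstract describes.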

Year: 2013; PMID: 23908556; PMCID: PMC3820952; DOI: 10.1074/mcp.M113.029165

Source DB: PubMed; Journal: Mol Cell Proteomics; ISSN: 1535-9476; Impact factor: 5.911

