Proteogenomics to discover the full coding content of genomes: a computational perspective.

Natalie Castellana, Vineet Bafna.

Abstract

Proteogenomics has emerged as a field at the junction of genomics and proteomics. It is a loose collection of technologies that allow tandem mass spectra to be searched against genomic databases in order to identify and characterize protein-coding genes. Proteogenomic peptides provide invaluable information for gene annotation that is difficult or impossible to obtain with standard annotation methods. Examples include confirmation of translation, reading-frame determination, identification of gene and exon boundaries, evidence for post-translational processing, identification of splice forms (including alternative splicing), and prediction of completely novel genes. For proteogenomics to deliver on its promise, however, it must overcome a number of technological hurdles, including the speed and accuracy of peptide identification, the construction and search of specialized databases, and the correction of sampling bias. This article reviews the state of the art of the field, focusing on current successes and on the role of computation in overcoming these challenges. We describe how technological and algorithmic advances have already enabled large-scale proteogenomic studies in many model organisms, including Arabidopsis, yeast, fly, and human. We also provide a preview of the field going forward, describing early efforts to tackle complex gene structures, searches against the genomes of related species, and immunoglobulin gene reconstruction.
Copyright © 2010 Elsevier B.V. All rights reserved.

Year: 2010; PMID: 20620248; PMCID: PMC2949459; DOI: 10.1016/j.jprot.2010.06.007

Source DB: PubMed; Journal: J Proteomics; ISSN: 1874-3919; Impact factor: 4.044


