Expressed peptide tags: an additional layer of data for genome annotation.

Alon Savidor, Ryan S Donahoo, Oscar Hurtado-Gonzales, Nathan C Verberkmoes, Manesh B Shah, Kurt H Lamour, W Hayes McDonald.

Abstract

While genome sequencing is becoming ever more routine, genome annotation remains a challenging process. Identifying the coding sequences within the genomic milieu is especially difficult for eukaryotes, with their complex gene architectures. Here, we present a method to assist the annotation process through the use of proteomic data and bioinformatics. Mass spectra of digested protein preparations of the organism of interest were acquired and searched against a protein database created by a six-frame translation of the genome. The identified peptides were mapped back to the genome, compared to the current annotation, and then categorized as supporting or extending the current genome annotation. We named the classified peptides Expressed Peptide Tags (EPTs). The well-annotated bacterium Rhodopseudomonas palustris was used as a control for the method and showed a high degree of correlation between EPT mapping and the current annotation, with 86% of the EPTs confirming existing gene calls and less than 1% of the EPTs expanding on the current annotation. The eukaryotic plant pathogens Phytophthora ramorum and Phytophthora sojae, whose genomes have been recently sequenced and are much less well annotated, were also subjected to this method. A series of algorithmic steps was taken to increase the confidence of EPT identification for these organisms, including generation of smaller subdatabases to search against and definition of EPT criteria that accommodate the more complex eukaryotic gene architecture. As expected, the analysis of the Phytophthora species showed less correlation between EPT mapping and their current annotation. While approximately 76% of Phytophthora EPTs supported the current annotation, a portion of them (7.7% and 12.9% for P. ramorum and P. sojae, respectively) suggested modifications to current gene calls or identified novel genes that were missed by the current genome annotation of these organisms.
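The database-construction and mapping steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`six_frame_translation`, `map_peptide`, `classify`), the exact-string peptide matching, the 0-based half-open coordinates, and the simple "inside an annotated gene on the same strand" rule are all assumptions made for illustration. The abstract does not specify the search software or the actual EPT criteria, which for eukaryotes would also need to account for introns.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_translation(genome):
    """Translate both strands in all three reading frames."""
    rc = reverse_complement(genome)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(genome[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


def map_peptide(peptide, genome):
    """Return (start, end, strand) forward-strand coordinates of exact matches."""
    n = len(genome)
    hits = []
    for frame, protein in six_frame_translation(genome).items():
        strand, offset = frame[0], int(frame[1]) - 1
        idx = protein.find(peptide)
        while idx != -1:
            start = offset + 3 * idx
            end = start + 3 * len(peptide)
            if strand == "-":
                # Convert reverse-complement coordinates to forward-strand ones.
                start, end = n - end, n - start
            hits.append((start, end, strand))
            idx = protein.find(peptide, idx + 1)
    return hits


def classify(hit, genes):
    """Hypothetical rule: a hit inside a same-strand gene call supports it."""
    start, end, strand = hit
    for g_start, g_end, g_strand in genes:
        if strand == g_strand and g_start <= start and end <= g_end:
            return "supports"
    return "novel_or_extending"


genome = "ATGGCCAAATTTGGGTAA"      # toy sequence, not real data
genes = [(0, 18, "+")]             # toy annotation: one forward-strand gene
for pep in ("AKFG", "PKFG"):
    for hit in map_peptide(pep, genome):
        print(pep, hit, classify(hit, genes))
```

In this toy case the peptide found in the annotated forward frame is labeled as supporting the gene call, while the peptide found only on the reverse strand falls outside the annotation. A real analysis would start from spectra matched by a database search engine rather than from exact string matches.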

Year:  2006        PMID: 17081056     DOI: 10.1021/pr060134x

Source DB:  PubMed          Journal:  J Proteome Res        ISSN: 1535-3893            Impact factor:   4.466


  13 in total

1.  A proteogenomic survey of the Medicago truncatula genome.

Authors:  Jeremy D Volkening; Derek J Bailey; Christopher M Rose; Paul A Grimsrud; Maegen Howes-Podoll; Muthusubramanian Venkateshwaran; Michael S Westphall; Jean-Michel Ané; Joshua J Coon; Michael R Sussman
Journal:  Mol Cell Proteomics       Date:  2012-07-05       Impact factor: 5.911

Review 2.  Systems biology: Functional analysis of natural microbial consortia using community proteomics.

Authors:  Nathan C VerBerkmoes; Vincent J Denef; Robert L Hettich; Jillian F Banfield
Journal:  Nat Rev Microbiol       Date:  2009-03       Impact factor: 60.633

3.  Spectral dictionaries: Integrating de novo peptide sequencing with database search of tandem mass spectra.

Authors:  Sangtae Kim; Nitin Gupta; Nuno Bandeira; Pavel A Pevzner
Journal:  Mol Cell Proteomics       Date:  2008-08-14       Impact factor: 5.911

4.  Comparative proteogenomics: combining mass spectrometry and comparative genomics to analyze multiple genomes.

Authors:  Nitin Gupta; Jamal Benhamida; Vipul Bhargava; Daniel Goodman; Elisabeth Kain; Ian Kerman; Ngan Nguyen; Noah Ollikainen; Jesse Rodriguez; Jian Wang; Mary S Lipton; Margaret Romine; Vineet Bafna; Richard D Smith; Pavel A Pevzner
Journal:  Genome Res       Date:  2008-04-21       Impact factor: 9.043

5.  Tissue-specific Proteogenomic Analysis of Plutella xylostella Larval Midgut Using a Multialgorithm Pipeline.

Authors:  Xun Zhu; Shangbo Xie; Jean Armengaud; Wen Xie; Zhaojiang Guo; Shi Kang; Qingjun Wu; Shaoli Wang; Jixing Xia; Rongjun He; Youjun Zhang
Journal:  Mol Cell Proteomics       Date:  2016-02-22       Impact factor: 5.911

Review 6.  Proteogenomics to discover the full coding content of genomes: a computational perspective.

Authors:  Natalie Castellana; Vineet Bafna
Journal:  J Proteomics       Date:  2010-07-08       Impact factor: 4.044

7.  Discovery and revision of Arabidopsis genes by proteogenomics.

Authors:  Natalie E Castellana; Samuel H Payne; Zhouxin Shen; Mario Stanke; Vineet Bafna; Steven P Briggs
Journal:  Proc Natl Acad Sci U S A       Date:  2008-12-19       Impact factor: 11.205

8.  Analysis of the zebrafish proteome during embryonic development.

Authors:  Margaret B Lucitt; Thomas S Price; Angel Pizarro; Weichen Wu; Anastasia K Yocum; Christoph Seiler; Michael A Pack; Ian A Blair; Garret A Fitzgerald; Tilo Grosser
Journal:  Mol Cell Proteomics       Date:  2008-01-22       Impact factor: 5.911

9.  Profiling the secretome and extracellular proteome of the potato late blight pathogen Phytophthora infestans.

Authors:  Harold J G Meijer; Francesco M Mancuso; Guadalupe Espadas; Michael F Seidl; Cristina Chiva; Francine Govers; Eduard Sabidó
Journal:  Mol Cell Proteomics       Date:  2014-05-28       Impact factor: 5.911

10.  Investigating protein isoforms via proteomics: a feasibility study.

Authors:  Paul Blakeley; Jennifer A Siepen; Craig Lawless; Simon J Hubbard
Journal:  Proteomics       Date:  2010-03       Impact factor: 3.984
