Discovery and revision of Arabidopsis genes by proteogenomics.

Natalie E Castellana, Samuel H Payne, Zhouxin Shen, Mario Stanke, Vineet Bafna, Steven P Briggs.

Abstract

Gene annotation underpins genome science. Most often, protein-coding sequence is inferred from the genome based on transcript evidence and computational predictions. While generally correct, gene models suffer from errors in reading frame, exon border definition, and exon identification. To ascertain the error rate of Arabidopsis thaliana gene models, we isolated proteins from a sample of Arabidopsis tissues and determined the amino acid sequences of 144,079 distinct peptides by tandem mass spectrometry. The peptides corresponded to 1 or more of 3 different translations of the genome: a 6-frame translation, an exon splice-graph, and the currently annotated proteome. The majority of the peptides (126,055) resided in existing gene models (12,769 confirmed proteins), comprising 40% of annotated genes. Surprisingly, 18,024 novel peptides were found that do not correspond to annotated genes. Using the gene-finding program AUGUSTUS and 5,426 novel peptides that occurred in clusters, we discovered 778 new protein-coding genes and refined the annotation of an additional 695 gene models. The remaining 13,449 novel peptides provide high-quality annotation (>99% correct) for thousands of additional genes. Our observation that 18,024 of 144,079 peptides did not match current gene models suggests that 13% of the Arabidopsis proteome was incomplete due to approximately equal numbers of missing and incorrect gene models.
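
To make the 6-frame translation mentioned in the abstract concrete, the following is a minimal Python sketch, not the authors' pipeline. It translates a DNA sequence in all six reading frames using the standard genetic code and reports the frames in which a peptide occurs. The function names (`six_frame_translations`, `find_peptide`) and the toy sequence are hypothetical, and plain substring matching stands in for the tandem mass spectrometry database search used in the study.

```python
# Illustrative sketch only: six-frame translation of a DNA sequence and a
# naive peptide lookup. Names and the toy sequence are assumptions made for
# this example; the study matched MS/MS spectra, not plain substrings.

BASES = "TCAG"
# Standard genetic code, listed in TCAG codon order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(dna):
    """Map (strand, offset) to the protein string for each of the 6 frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[(strand, offset)] = translate(seq[offset:])
    return frames


def find_peptide(peptide, dna):
    """List the frames whose translation contains the peptide."""
    return [
        frame
        for frame, protein in six_frame_translations(dna).items()
        if peptide in protein
    ]


if __name__ == "__main__":
    toy_dna = "ATGGCCAAATGA"  # hypothetical 12-base sequence
    print(find_peptide("MAK", toy_dna))
```

A peptide that appears only in a frame outside any annotated gene is the kind of "novel peptide" the abstract counts. The closing estimate follows from the reported totals: 18,024 / 144,079 ≈ 0.125, which the abstract rounds to 13%.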

Year:  2008        PMID: 19098097      PMCID: PMC2605632          DOI: 10.1073/pnas.0811066106

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


