Iterative gene prediction and pseudogene removal improves genome annotation.

Marijke J van Baren, Michael R Brent

Abstract

Correct gene prediction is impaired by the presence of processed pseudogenes: nonfunctional, intronless copies of real genes found elsewhere in the genome. Gene prediction programs frequently mistake processed pseudogenes for real genes or exons, leading to biologically irrelevant gene predictions. While methods exist to identify processed pseudogenes in genomes, no attempt has been made to integrate pseudogene removal with gene prediction, or even to provide a freestanding tool that identifies such erroneous gene predictions. We have created PPFINDER (for Processed Pseudogene finder), a program that integrates several methods of processed pseudogene finding in mammalian gene annotations. We used PPFINDER to remove pseudogenes from N-SCAN gene predictions, and show that gene prediction improves substantially when gene prediction and pseudogene masking are interleaved. In addition, we used PPFINDER with gene predictions as a parent database, eliminating the need for libraries of known genes. This allows us to run the gene prediction/PPFINDER procedure on newly sequenced genomes for which few genes are known.
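The interleaving of gene prediction and pseudogene masking described in the abstract can be illustrated with a minimal sketch. This is not PPFINDER or N-SCAN: the function names, the toy predictor, the toy pseudogene filter, the use of `N` as the masking character, and the stopping rule (stop when no new pseudogenes are found, or after `max_rounds`) are all assumptions made for illustration, since the abstract does not specify them.

```python
import re
from typing import Callable, List, Set, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on a single sequence


def mask(sequence: str, regions: List[Interval]) -> str:
    """Replace every base inside the given regions with 'N' (assumed masking scheme)."""
    chars = list(sequence)
    for start, end in regions:
        for i in range(start, end):
            chars[i] = "N"
    return "".join(chars)


def iterate_prediction(
    sequence: str,
    predict_genes: Callable[[str], List[Interval]],
    find_pseudogenes: Callable[[List[Interval]], List[Interval]],
    max_rounds: int = 5,
) -> Tuple[List[Interval], List[Interval]]:
    """Alternate prediction and pseudogene masking until nothing new is masked."""
    masked: Set[Interval] = set()
    predictions = predict_genes(sequence)
    for _ in range(max_rounds):
        new_pseudo = set(find_pseudogenes(predictions)) - masked
        if not new_pseudo:
            break  # no new pseudogene calls: predictions are stable
        masked |= new_pseudo
        # Always mask the original sequence so masks accumulate cleanly.
        predictions = predict_genes(mask(sequence, sorted(masked)))
    return predictions, sorted(masked)


# Hypothetical stand-ins for a gene predictor and a pseudogene finder.
def toy_predict(seq: str) -> List[Interval]:
    """Toy predictor: every maximal run of 'G' counts as a predicted gene."""
    return [m.span() for m in re.finditer(r"G+", seq)]


def toy_find_pseudogenes(preds: List[Interval]) -> List[Interval]:
    """Toy filter: predictions shorter than 4 bases are flagged as pseudogenes."""
    return [(s, e) for s, e in preds if e - s < 4]


genes, masked_regions = iterate_prediction(
    "AGGGGGAGGA", toy_predict, toy_find_pseudogenes
)
```

In a real pipeline the two callables would wrap an actual gene predictor and pseudogene finder; the loop structure is the only part this sketch is meant to convey.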

Year: 2006   PMID: 16651666   PMCID: PMC1457044   DOI: 10.1101/gr.4766206

Source DB: PubMed   Journal: Genome Res   ISSN: 1088-9051   Impact factor: 9.043
