Gene structure conservation aids similarity based gene prediction.

Irmtraud M Meyer, Richard Durbin.

Abstract

One of the primary tasks in deciphering the functional contents of a newly sequenced genome is the identification of its protein coding genes. Existing computational methods for gene prediction include ab initio methods which use the DNA sequence itself as the only source of information, comparative methods using multiple genomic sequences, and similarity based methods which employ the cDNA or protein sequences of related genes to aid the gene prediction. We present here an algorithm implemented in a computer program called Projector which combines comparative and similarity approaches. Projector employs similarity information at the genomic DNA level by directly using known genes annotated on one DNA sequence to predict the corresponding related genes on another DNA sequence. It therefore makes explicit use of the conservation of the exon-intron structure between two related genes in addition to the similarity of their encoded amino acid sequences. We evaluate the performance of Projector by comparing it with the program Genewise on a test set of 491 pairs of independently confirmed mouse and human genes. It is more accurate than Genewise for genes whose proteins are <80% identical, and is suitable for use in a combined gene prediction system where other methods identify well conserved and non-conserved genes, and pseudogenes.

Year: 2004; PMID: 14764925; PMCID: PMC373336; DOI: 10.1093/nar/gkh211

Source DB: PubMed; Journal: Nucleic Acids Res; ISSN: 0305-1048; Impact factor: 16.971


