EGPred: prediction of eukaryotic genes using ab initio methods after combining with sequence similarity approaches.

Biju Issac, Gajendra Pal Singh Raghava.

Abstract

EGPred is a Web-based server that combines ab initio methods and similarity searches to predict genes, particularly exon regions, with high accuracy. The EGPred program proceeds in the following steps: (1) an initial BLASTX search of the genomic sequence against the RefSeq database is used to identify protein hits with an E-value <1; (2) a second BLASTX search of the genomic sequence against the hits from the previous run, with relaxed parameters (E-values <10), helps to retrieve all probable coding exon regions; (3) a BLASTN search of the genomic sequence against the intron database is then used to detect probable intron regions; (4) the probable intron and exon regions are compared to filter out wrong exons; (5) the NNSPLICE program is then used to reassign splicing signal site positions in the remaining probable coding exons; and (6) finally, ab initio predictions are combined with the exons derived from the fifth step, based on the relative strength of start/stop and splice signal sites as obtained from the ab initio and similarity searches. The combination method increases the exon-level performance of five different ab initio programs by 4%-10% when evaluated on the HMR195 data set. A similar improvement is observed when the ab initio programs are evaluated on the Burset/Guigo data set. Finally, EGPred is demonstrated on an approximately 95-Mbp fragment of human chromosome 13. The list of predicted genes from this analysis is available in the supplementary material. The EGPred program is computationally intensive because of the multiple BLAST runs during each analysis. The EGPred server is available at http://www.imtech.res.in/raghava/egpred/.
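The filtering logic of steps (2) and (4) can be illustrated with a minimal Python sketch. This is not the EGPred implementation: the `Region` type, the function names, and the 50% overlap cutoff are hypothetical, since the abstract does not state how intron and exon regions are compared. Only the E-value threshold (<10) is taken from the text above.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    """A hit on the genomic sequence (hypothetical type, for illustration only)."""
    start: int      # 1-based, inclusive
    end: int        # 1-based, inclusive
    evalue: float   # E-value reported by the similarity search


def filter_by_evalue(hits: List[Region], max_evalue: float) -> List[Region]:
    """Keep hits whose E-value is strictly below the threshold (as in step 2)."""
    return [h for h in hits if h.evalue < max_evalue]


def overlap_fraction(exon: Region, intron: Region) -> float:
    """Fraction of the exon's length that is covered by the intron region."""
    overlap = min(exon.end, intron.end) - max(exon.start, intron.start) + 1
    return max(0, overlap) / (exon.end - exon.start + 1)


def drop_exons_in_introns(exons: List[Region], introns: List[Region],
                          max_overlap: float = 0.5) -> List[Region]:
    """Discard candidate exons mostly covered by a probable intron (as in step 4).

    The 0.5 cutoff is an assumption; the abstract gives no criterion.
    """
    return [e for e in exons
            if all(overlap_fraction(e, i) <= max_overlap for i in introns)]


# Toy coordinates, invented for this example.
candidate_exons = [Region(100, 199, 1e-5), Region(300, 399, 5.0), Region(500, 599, 20.0)]
probable_introns = [Region(290, 380, 1e-10)]

relaxed_hits = filter_by_evalue(candidate_exons, max_evalue=10)
kept_exons = drop_exons_in_introns(relaxed_hits, probable_introns)
print(kept_exons)
```

In this toy run the third candidate is rejected by the E-value filter, and the second is rejected because 81% of its length lies inside the probable intron, leaving only the first candidate.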

Year:  2004        PMID: 15342559      PMCID: PMC515322          DOI: 10.1101/gr.2524704

Source DB:  PubMed          Journal:  Genome Res        ISSN: 1088-9051            Impact factor:   9.043

