
GeneMark.hmm: new solutions for gene finding.

A V Lukashin, M Borodovsky.

Abstract

The number of completely sequenced bacterial genomes has been growing fast. Computer methods for finding genes are available, yet there is still a need for more accurate algorithms. The GeneMark.hmm algorithm presented here was designed to improve gene prediction quality in terms of finding exact gene boundaries. The idea was to embed the GeneMark models into a naturally derived hidden Markov model framework, with gene boundaries modeled as transitions between hidden states. We also used a specially derived ribosome binding site pattern to refine predictions of translation initiation codons. The algorithm was evaluated on several test sets, including 10 complete bacterial genomes. It was shown that the new algorithm is significantly more accurate than GeneMark in exact gene prediction. Interestingly, high gene finding accuracy was observed even when Markov models of order zero, one and two were used. We present an analysis of false positive and false negative predictions, with the caution that these categories are not precisely defined if the public database annotation is used as a control.
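The idea of modeling gene boundaries as transitions between hidden states can be illustrated with a minimal sketch. This is not the GeneMark.hmm implementation: it assumes a toy two-state model (coding and non-coding) with zero-order emissions, and every probability, as well as the names `viterbi` and `boundaries`, is hypothetical and chosen only for illustration. Standard Viterbi decoding recovers the most probable state path, and the positions where the path changes state serve as predicted boundaries.

```python
import math

STATES = ("noncoding", "coding")

# Hypothetical toy parameters, NOT taken from the paper.
START = {"noncoding": 0.9, "coding": 0.1}
TRANS = {
    "noncoding": {"noncoding": 0.9, "coding": 0.1},
    "coding": {"noncoding": 0.1, "coding": 0.9},
}
# Zero-order emissions: the toy coding state is GC-rich, the non-coding state AT-rich.
EMIT = {
    "noncoding": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
    "coding": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
}


def viterbi(seq):
    """Return the most probable hidden-state path for seq (log-space Viterbi)."""
    if not seq:
        return []
    # Initialise with the start distribution and the first emission.
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    backpointers = []
    for symbol in seq[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for arriving in state s at this position.
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (
                score[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][symbol])
            )
            pointers[s] = prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def boundaries(path):
    """Return positions where the decoded state changes (predicted boundaries)."""
    return [i for i in range(1, len(path)) if path[i] != path[i - 1]]
```

Calling `boundaries(viterbi(seq))` on a sequence with AT-rich flanks around a GC-rich core reports the two positions where the decoded path switches state. A realistic gene finder would replace the toy emissions with higher-order, frame-dependent Markov models and would handle both strands, none of which is attempted here.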

Year:  1998        PMID: 9461475      PMCID: PMC147337          DOI: 10.1093/nar/26.4.1107

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


  628 in total

1.  Detecting and analyzing DNA sequencing errors: toward a higher quality of the Bacillus subtilis genome sequence.

Authors:  C Médigue; M Rose; A Viari; A Danchin
Journal:  Genome Res       Date:  1999-11       Impact factor: 9.043

2.  Cloning and analysis of the capsid morphogenesis genes of Pseudomonas aeruginosa bacteriophage D3: another example of protein chain mail?

Authors:  Z A Gilakjan; A M Kropinski
Journal:  J Bacteriol       Date:  1999-12       Impact factor: 3.490

3.  The pyrimidine operon pyrRPB-carA from Lactococcus lactis.

Authors:  J Martinussen; J Schallert; B Andersen; K Hammer
Journal:  J Bacteriol       Date:  2001-05       Impact factor: 3.490

4.  DNA sequence and comparison of virulence plasmids from Rhodococcus equi ATCC 33701 and 103.

Authors:  S Takai; S A Hines; T Sekizaki; V M Nicholson; D A Alperin; M Osaki; D Takamatsu; M Nakamura; K Suzuki; N Ogino; T Kakuda; H Dan; J F Prescott
Journal:  Infect Immun       Date:  2000-12       Impact factor: 3.441

5.  Transcription profiling-based identification of Staphylococcus aureus genes regulated by the agr and/or sarA loci.

Authors:  P M Dunman; E Murphy; S Haney; D Palacios; G Tucker-Kellogg; S Wu; E L Brown; R J Zagursky; D Shlaes; S J Projan
Journal:  J Bacteriol       Date:  2001-12       Impact factor: 3.490

6.  Identification and analysis of Arabidopsis expressed sequence tags characteristic of non-coding RNAs.

Authors:  G C MacIntosh; C Wilkerson; P J Green
Journal:  Plant Physiol       Date:  2001-11       Impact factor: 8.340

7.  Massive sequence comparisons as a help in annotating genomic sequences.

Authors:  A Louis; E Ollivier; J C Aude; J L Risler
Journal:  Genome Res       Date:  2001-07       Impact factor: 9.043

8.  GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions.

Authors:  J Besemer; A Lomsadze; M Borodovsky
Journal:  Nucleic Acids Res       Date:  2001-06-15       Impact factor: 16.971

9.  Mining Bacillus subtilis chromosome heterogeneities using hidden Markov models.

Authors:  Pierre Nicolas; Laurent Bize; Florence Muri; Mark Hoebeke; François Rodolphe; S Dusko Ehrlich; Bernard Prum; Philippe Bessières
Journal:  Nucleic Acids Res       Date:  2002-03-15       Impact factor: 16.971

10.  Genome-wide analysis of core cell cycle genes in Arabidopsis.

Authors:  Klaas Vandepoele; Jeroen Raes; Lieven De Veylder; Pierre Rouzé; Stephane Rombauts; Dirk Inzé
Journal:  Plant Cell       Date:  2002-04       Impact factor: 11.277
