Heuristic approach to deriving models for gene finding.

J Besemer, M Borodovsky.

Abstract

Computer methods of accurate gene finding in DNA sequences require models of protein coding and non-coding regions derived either from experimentally validated training sets or from large amounts of anonymous DNA sequence. Here we propose a new, heuristic method producing fairly accurate inhomogeneous Markov models of protein coding regions. The new method needs such a small amount of DNA sequence data that the model can be built 'on the fly' by a web server for any DNA sequence >400 nt. Tests on 10 complete bacterial genomes performed with the GeneMark.hmm program demonstrated the ability of the new models to detect 93.1% of annotated genes on average, while models built by traditional training predict an average of 93.9% of genes. Models built by the heuristic approach could be used to find genes in small fragments of anonymous prokaryotic genomes and in genomes of organelles, viruses, phages and plasmids, as well as in highly inhomogeneous genomes where adjustment of models to local DNA composition is needed. The heuristic method also gives an insight into the mechanism of codon usage pattern evolution.

Year:  1999        PMID: 10481031      PMCID: PMC148655          DOI: 10.1093/nar/27.19.3911

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


  172 in total

1.  GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions.

Authors:  J Besemer; A Lomsadze; M Borodovsky
Journal:  Nucleic Acids Res       Date:  2001-06-15       Impact factor: 16.971

2.  The nucleotide sequence of Shiga toxin (Stx) 2e-encoding phage phiP27 is not related to other Stx phage genomes, but the modular genetic structure is conserved.

Authors:  Jürgen Recktenwald; Herbert Schmidt
Journal:  Infect Immun       Date:  2002-04       Impact factor: 3.441

3.  Complete nucleotide sequence of Klebsiella phage P13 and prediction of an EPS depolymerase gene.

Authors:  Anqi Shang; Yang Liu; Jianlei Wang; Zhaolan Mo; Guiyang Li; Haijin Mou
Journal:  Virus Genes       Date:  2014-11-13       Impact factor: 2.332

4.  Genomic sequence of C1, the first streptococcal phage.

Authors:  Daniel Nelson; Raymond Schuch; Shiwei Zhu; Donna M Tscherne; Vincent A Fischetti
Journal:  J Bacteriol       Date:  2003-06       Impact factor: 3.490

5.  Current methods of gene prediction, their strengths and weaknesses. (Review)

Authors:  Catherine Mathé; Marie-France Sagot; Thomas Schiex; Pierre Rouzé
Journal:  Nucleic Acids Res       Date:  2002-10-01       Impact factor: 16.971

6.  Nucleotide sequence and evolution of the five-plasmid complement of the phytopathogen Pseudomonas syringae pv. maculicola ES4326.

Authors:  John Stavrinides; David S Guttman
Journal:  J Bacteriol       Date:  2004-08       Impact factor: 3.490

7.  Evaluating bacterial gene-finding HMM structures as probabilistic logic programs.

Authors:  Søren Mørk; Ian Holmes
Journal:  Bioinformatics       Date:  2012-01-03       Impact factor: 6.937

8.  Complete genome sequence of the giant virus OBP and comparative genome analysis of the diverse ΦKZ-related phages.

Authors:  Anneleen Cornelissen; Stephen C Hardies; Olga V Shaburova; Victor N Krylov; Wesley Mattheus; Andrew M Kropinski; Rob Lavigne
Journal:  J Virol       Date:  2011-11-30       Impact factor: 5.103

9.  Genomic analysis of bacteriophage PhiJL001: insights into its interaction with a sponge-associated alpha-proteobacterium.

Authors:  Jayme E Lohr; Feng Chen; Russell T Hill
Journal:  Appl Environ Microbiol       Date:  2005-03       Impact factor: 4.792

10.  Comparative genomics of Salmonella enterica serovar Typhi strains Ty2 and CT18.

Authors:  Wen Deng; Shian-Ren Liou; Guy Plunkett; George F Mayhew; Debra J Rose; Valerie Burland; Voula Kodoyianni; David C Schwartz; Frederick R Blattner
Journal:  J Bacteriol       Date:  2003-04       Impact factor: 3.490
