
Effects of choice of DNA sequence model structure on gene identification accuracy.

Rajeev K Azad, Mark Borodovsky.

Abstract

MOTIVATION: Markov chain models of DNA sequences have frequently been used in gene finding algorithms. Performance of the algorithm critically depends on the model structure and parameters. Still, the issue of choosing the model structure has not been studied with sufficient attention.
RESULTS: We have assessed performance of several types of Markov chain models, both fixed order (FO) models and models with interpolation, within the framework of the GeneMark algorithm. The performance was measured in two ways: (i) the accuracy of detection of protein-coding potential in artificial DNA sequences and (ii) the accuracy of identifying genes in real prokaryotic genomes. We observed that the models built by deleted interpolation (DI) slightly outperformed other models in detecting protein-coding potential in artificial DNA sequences with GC content in the medium range and also in detecting genes in real genomes with medium GC content. For artificial and real genomic DNA with high or low GC content, we observed that the models built by DI were in some cases slightly outperformed by FO models.
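The distinction between fixed order models and models with interpolation can be illustrated with a minimal Python sketch. It is not the GeneMark implementation: the function names (`train_fixed_order`, `interpolate`, `log_likelihood`), the additive pseudocount, the toy training sequences, and the mixing weights are all assumptions made for illustration. In particular, deleted interpolation estimates the weights from held-out data, whereas this sketch simply takes fixed weights from the caller.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"

def train_fixed_order(seqs, order, pseudocount=1.0):
    """Fixed order (FO) model: P(base | previous `order` bases), additively smoothed."""
    counts = defaultdict(lambda: defaultdict(float))
    for seq in seqs:
        for i in range(order, len(seq)):
            counts[seq[i - order:i]][seq[i]] += 1.0

    def prob(context, base):
        # Use only the last `order` bases of whatever context is supplied.
        ctx = context[len(context) - order:]
        row = counts.get(ctx, {})
        total = sum(row.values()) + pseudocount * len(ALPHABET)
        return (row.get(base, 0.0) + pseudocount) / total

    return prob

def interpolate(models, weights):
    """Linear mixture of models of different orders (weights are assumed, not estimated)."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")

    def prob(context, base):
        return sum(w * m(context, base) for w, m in zip(weights, models))

    return prob

def log_likelihood(prob, seq, max_order):
    """Log-probability of seq, scored from the first position with a full context."""
    return sum(
        math.log(prob(seq[i - max_order:i], seq[i]))
        for i in range(max_order, len(seq))
    )

# Hypothetical toy training data; real models are trained on annotated genomic sequence.
training = ["ATGGCGGCGATGGCGTAA", "ATGGCCGCGGCGGCGTGA"]
models = [train_fixed_order(training, k) for k in range(3)]  # orders 0, 1, 2
mixed = interpolate(models, [0.2, 0.3, 0.5])                 # assumed weights

print(log_likelihood(models[2], "ATGGCGGCG", 2))  # second-order FO model
print(log_likelihood(mixed, "ATGGCGGCG", 2))      # interpolated model
```

The interpolated model falls back on lower-order statistics when a long context is rarely observed in training, which is one plausible reason its relative performance could vary with the composition of the sequence.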

Year: 2004 | PMID: 14751980 | DOI: 10.1093/bioinformatics/bth028

Source DB: PubMed | Journal: Bioinformatics | ISSN: 1367-4803 | Impact factor: 6.937


