
Novel coding regions in four complete archaeal genomes.

S Raghavan, C A Ouzounis.

Abstract

In the process of analysing the four available complete archaeal genomes, we have noted that certain regions characterised as 'non-coding' exhibit significant sequence similarity to other protein sequences from Archaea and other species. Using established technology, we have identified a number of potential protein coding regions in these putative 'non-coding' regions. We have detected 524 such cases, of which 113 regions appear to code for proteins present in archaeal or other species, while the remaining 411 regions are mostly start/stop definition conflicts. Of the 113 protein coding regions, only 21 code for proteins with homologues of known function. The number of novel coding sequences identified herein amounts to 1.5% of the total genome entries, while the conflicting cases represent an additional 5%. The observed differences between the four complete archaeal genomes seem to reflect disparate approaches to genome annotation. Genome sequence collections should be regularly checked to improve gene prediction by sequence similarity and greater effort is required to make gene definitions consistent across related species.
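
The counts quoted in the abstract can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the original record: the variable names are hypothetical, and the implied totals assume that both quoted percentages are rounded and share the same denominator (the total number of genome entries, which the abstract does not state).

```python
# Counts quoted in the abstract.
candidate_regions = 524
protein_coding = 113
boundary_conflicts = 411
known_function = 21

# Percentages quoted in the abstract (share of total genome entries).
novel_pct = 1.5
conflict_pct = 5.0

# The two categories should partition the candidate regions.
partition_ok = protein_coding + boundary_conflicts == candidate_regions

# Share of the 113 coding regions whose homologues have a known function.
known_share = known_function / protein_coding

# Rough totals implied by each quoted percentage. Assumption: both figures
# are rounded and refer to the same (unstated) number of genome entries.
implied_total_from_novel = protein_coding / (novel_pct / 100)
implied_total_from_conflicts = boundary_conflicts / (conflict_pct / 100)

print("categories sum to 524:", partition_ok)
print(f"known-function share of coding regions: {known_share:.1%}")
print("implied total entries (from 1.5%):", round(implied_total_from_novel))
print("implied total entries (from 5%):", round(implied_total_from_conflicts))
```

The two implied totals differ somewhat, which is what one would expect if the published percentages are rounded; the sketch is only a consistency check and does not recover the actual number of genome entries.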


Year:  1999        PMID: 10536149      PMCID: PMC148723          DOI: 10.1093/nar/27.22.4405

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


