J Kleffe, K Hermann, M Borodovsky.
Abstract
We have explored the performance of the GeneMark gene identification method using cross-validation over learning samples of E. coli DNA sequences. The computations gave more accurate estimations of the error rates in comparison with previous results when a sample of non-coding regions was derived from GenBank sequences with many true coding regions unannotated. The error rate components have been classified and delineated. It was shown that the method performs differently on class I, II and III genes. The most frequent errors come from misinterpreting the coding potential of the complementary sequence in the same frame. The effects of stop-codons present in alternative frames were also studied to understand better the main factors contributing to GeneMark performance.
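To make the notions of "alternative frames" and "complementary sequence in the same frame" concrete, here is a minimal Python sketch that counts stop codons in each of the six reading frames of a DNA fragment. It is not GeneMark and does not reproduce the authors' analysis. The function names, the frame labels (`+1` to `+3`, `-1` to `-3`), and the choice of the standard stop codons TAA, TAG and TGA are assumptions made for illustration.

```python
# Illustrative sketch only: counts stop codons per reading frame.
# Frame labels and helper names are hypothetical, not taken from the paper.

STOPS = {"TAA", "TAG", "TGA"}            # standard stop codons (assumed)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def stop_counts_by_frame(seq):
    """Count stop codons in the three direct and three complementary frames.

    Frames '+1'..'+3' read the given strand starting at offsets 0..2;
    frames '-1'..'-3' read the reverse complement at offsets 0..2.
    """
    seq = seq.upper()
    counts = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            # Only complete codons are considered.
            codons = (s[i:i + 3] for i in range(offset, len(s) - 2, 3))
            counts[f"{strand}{offset + 1}"] = sum(c in STOPS for c in codons)
    return counts
```

For a fragment whose length is a multiple of three, frame `-1` in this labeling uses the same codon boundaries as frame `+1` on the opposite strand, which is one way to read "the complementary sequence in the same frame". A frame with few or no stop codons is an open candidate for coding potential, so counts like these hint at why some alternative frames are easier to rule out than others.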
Year: 1996 PMID: 16749185 DOI: 10.1016/s0097-8485(96)80014-3
Source DB: PubMed Journal: Comput Chem ISSN: 0097-8485