Yonghui Wu, Alan Wee-Chung Liew, Hong Yan, Mengsu Yang.
Abstract
The classification of human gene sequences into exons and introns is a difficult problem in DNA sequence analysis. In this paper, we define a set of features, called the simple Z (SZ) features, which are derived from the Z-curve features, for the recognition of human exons and introns. The classification results show that the SZ features, while fewer in number (three in total), preserve the high recognition rate of the original nine Z-curve features. Because the SZ feature set is one-third the size of the Z-curve feature set, the dimensionality of the feature space is much smaller, and better recognition efficiency is achieved. If the stop codon feature is used together with the three SZ features, a recognition rate of up to 92% can be obtained for short sequences of length <140 bp.
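For orientation, the nine Z-curve features mentioned above are commonly computed as three base-composition contrasts at each of the three codon positions. The following is a minimal sketch of that standard formulation, not the authors' code. The abstract does not define the three SZ features, so they are not reproduced here. The function names `z_curve_features` and `has_in_frame_stop`, the `frame` parameter, and the reading of the stop codon feature as a simple in-frame presence check are all assumptions made for illustration.

```python
from collections import Counter

# Standard stop codons in the universal genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def z_curve_features(seq, frame=0):
    """Return nine Z-curve features: (x, y, z) for each codon position.

    Hypothetical helper: uses the usual phase-specific Z-curve
    definition based on base frequencies at codon positions 1, 2, 3.
    """
    seq = seq.upper()
    features = []
    for pos in range(3):
        bases = seq[frame + pos::3]          # every third base from this position
        n = len(bases)
        counts = Counter(bases)
        a, c, g, t = (counts[b] / n if n else 0.0 for b in "ACGT")
        features.extend([
            (a + g) - (c + t),  # x: purine vs. pyrimidine
            (a + c) - (g + t),  # y: amino vs. keto
            (a + t) - (g + c),  # z: weak vs. strong hydrogen bonding
        ])
    return features


def has_in_frame_stop(seq, frame=0):
    """Assumed stand-in for a stop codon feature: True if any codon
    read in the given frame is a stop codon."""
    seq = seq.upper()
    return any(seq[i:i + 3] in STOP_CODONS
               for i in range(frame, len(seq) - 2, 3))
```

A feature vector built this way could be passed to any standard classifier. The abstract does not state which classifier or which exact stop codon encoding the authors used.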
Year: 2003 PMID: 16241270 DOI: 10.1103/PhysRevE.67.061916
Source DB: PubMed Journal: Phys Rev E Stat Nonlin Soft Matter Phys ISSN: 1539-3755