Abstract
A DNA sequence can be identified with a word over the alphabet N = {A, C, G, T}. Characteristic sequences of a DNA sequence are defined in terms of classifications of the nucleic acid bases. Using the characteristic sequences, we construct a set of 2 × 2 matrices to represent DNA primary sequences; the matrices are based on counting the frequency of occurrence of all (0,1) triplets in the characteristic sequences. Furthermore, the leading eigenvalues of these matrices are computed and considered as invariants of the DNA primary sequences. Similarity and dissimilarity analyses based on the characteristic sequences are given for the exon-1 sequences of the beta-globin gene from eight species.
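The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. Several details are assumptions, since the abstract does not state them: the three base classifications used here (purine/pyrimidine, amino/keto, weak/strong hydrogen bond), the use of overlapping triplet windows, and the way the eight triplet frequencies are arranged into 2 × 2 matrices (one matrix per first symbol, which is a hypothetical layout). All function names are illustrative and do not come from the paper.

```python
from collections import Counter
from itertools import product
from math import sqrt

# Assumed two-class groupings of the bases; the abstract does not list them.
# Each entry names the bases that are mapped to 1; all other bases map to 0.
CLASSIFICATIONS = {
    "purine_pyrimidine": set("AG"),  # 1 = purine (A, G), 0 = pyrimidine (C, T)
    "amino_keto": set("AC"),         # 1 = amino (A, C), 0 = keto (G, T)
    "weak_strong": set("AT"),        # 1 = weak H-bond (A, T), 0 = strong (C, G)
}


def characteristic_sequence(dna, ones):
    """Map a DNA string to a (0,1) sequence under one classification."""
    return [1 if base in ones else 0 for base in dna.upper()]


def triplet_frequencies(bits):
    """Relative frequency of each of the eight (0,1) triplets.

    Assumption: triplets are read from overlapping windows of length 3.
    """
    windows = [tuple(bits[i:i + 3]) for i in range(len(bits) - 2)]
    counts = Counter(windows)
    total = len(windows)
    return {
        t: (counts[t] / total if total else 0.0)
        for t in product((0, 1), repeat=3)
    }


def triplet_matrices(freqs):
    """Arrange the eight frequencies into two 2 x 2 matrices.

    Hypothetical layout: one matrix per first symbol, with rows indexed
    by the second symbol and columns by the third.
    """
    return [
        [[freqs[(a, b, c)] for c in (0, 1)] for b in (0, 1)]
        for a in (0, 1)
    ]


def leading_eigenvalue(m):
    """Largest eigenvalue of a 2 x 2 matrix with non-negative entries."""
    (a, b), (c, d) = m
    # The discriminant equals trace^2 - 4*det; it is non-negative when b*c >= 0.
    disc = (a - d) ** 2 + 4 * b * c
    return ((a + d) + sqrt(disc)) / 2


def invariants(dna):
    """Leading eigenvalues for every classification and every matrix."""
    result = {}
    for name, ones in CLASSIFICATIONS.items():
        bits = characteristic_sequence(dna, ones)
        matrices = triplet_matrices(triplet_frequencies(bits))
        result[name] = [leading_eigenvalue(m) for m in matrices]
    return result
```

Under these assumptions, calling `invariants(seq)` on each sequence gives a short numeric vector per sequence, and the vectors could then be compared with any distance measure to assess similarity. The abstract does not specify which comparison the authors used.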
Year: 2002 PMID: 12376994 DOI: 10.1021/ci010131z
Source DB: PubMed Journal: J Chem Inf Comput Sci ISSN: 0095-2338