S Karlin, G Ghandour, D E Foulser, L J Korn.
Abstract
A method is presented for the analysis and comparison of nucleic acid and protein sequences utilizing all identity blocks (the term "identity block" refers to a set of consecutive matches between two sequences) above a prescribed length. Moreover, such identity blocks are determined for various groupings of amino acids according to chemical, functional, charge, and hydrophobic classifications. Alignment maps based on these classifications and containing all statistically significant identity blocks between two or more sequences are constructed. New theoretical results for determining the expected length of the longest identity block between sequences are also presented and are used, along with permutation procedures, to ascertain the significance of sequence identity blocks. As an example of the type of information that can be obtained, comparison has been made of the complete DNA sequences and the E1, E2, L1, and L2 genes of human and bovine papillomaviruses based on the classification schemes described above.
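
The idea of an identity block and of a permutation check on the longest block can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `CHARGE_CLASS` grouping, and the simple quadratic scan are assumptions made for illustration, and the paper's theoretical results on expected block length are not reproduced here.

```python
import random

# Hypothetical charge grouping, for illustration only (not the paper's scheme).
CHARGE_CLASS = {"K": "+", "R": "+", "H": "+", "D": "-", "E": "-"}


def recode(seq, classes=None, default="0"):
    """Map each residue to its class label; unlisted residues get `default`."""
    if classes is None:
        return seq
    return "".join(classes.get(ch, default) for ch in seq)


def identity_blocks(a, b, min_len):
    """Return (start_a, start_b, length) for every maximal run of
    consecutive matches between a and b with length >= min_len."""
    blocks = []
    prev = [0] * (len(b) + 1)  # run lengths ending at the previous row
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                # The run is maximal if it cannot be extended one more position.
                at_end = i == len(a) or j == len(b) or a[i] != b[j]
                if at_end and cur[j] >= min_len:
                    n = cur[j]
                    blocks.append((i - n, j - n, n))
        prev = cur
    return blocks


def longest_block(a, b):
    """Length of the longest identity block (0 if there are no matches)."""
    return max((n for _, _, n in identity_blocks(a, b, 1)), default=0)


def permutation_p_value(a, b, n_perm=200, seed=0):
    """Fraction of shuffles of b whose longest block with a is at least as
    long as the observed one (with the usual +1 correction)."""
    rng = random.Random(seed)
    observed = longest_block(a, b)
    letters = list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(letters)
        if longest_block(a, "".join(letters)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    print(identity_blocks("ACGTACGT", "TTACGTAA", 4))
    print(permutation_p_value("ACGTACGT", "TTACGTAA"))
```

To compare protein sequences under a grouping, both sequences would first be passed through `recode` with the chosen class table, and the recoded strings handed to `identity_blocks`.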
Year: 1984 PMID: 6100988 DOI: 10.1093/oxfordjournals.molbev.a040318
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240