Abstract
Dinucleotide composition has been recognized as a species-specific characteristic of organisms for more than 20 years. Lang (2000, Bioinformatics, 16, 212-221) found that in Monilinia rRNA a species-specific identity is conserved when dinucleotide counts are compressed into net dinucleotide counts (e.g., 50AC + 20CA = 30nAC) and into clusters of net dinucleotides of equal value (e.g., 30nAC + 30nCT + 30nTA = 30ACTA), which were called circuits. This study evaluates circuit assemblages (CAs)--the collection of all net dinucleotide circuits derived from a sequence--in a diverse set of 110 HIV-1 genomes. The circuit composition, which is often based on ≤ 15% of the total dinucleotides of a sequence, uniquely characterizes each gene and genome, although the pairwise similarity of the sequences is as low as 70%. Variations in net dinucleotide distributions are associated with structural and functional features of the genome and its proteins. Circuit values of the env signal sequence differ between subtypes that have remained localized and those that have become pandemic. CAs of complete genomes of HIV-1 are similar to those of other retro-transcribing viruses, and distinct from those of viroids and single- and double-stranded DNA and RNA viruses. CAs provide a succinct, quantitative, and species-specific description of DNA composition that is consistent with the results of traditional analytic methods at multiple levels of genome organization.
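The net dinucleotide step described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names `dinucleotide_counts` and `net_dinucleotide_counts` are hypothetical, and the sketch assumes overlapping dinucleotides on a single linear strand, with the net count taken as the excess of a dinucleotide over its reversed partner (AC versus CA). Grouping equal-valued net dinucleotides into circuits is not attempted here, since the abstract does not specify that procedure.

```python
from collections import Counter


def dinucleotide_counts(seq):
    """Count overlapping dinucleotides in a sequence (assumed counting scheme)."""
    seq = seq.upper()
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def net_dinucleotide_counts(seq):
    """Return the positive excess of each dinucleotide over its reversed partner.

    For example, 50 AC and 20 CA leave a net count of 30 for AC.
    Homodinucleotides (AA, CC, GG, TT) equal their own reverse and are skipped.
    """
    counts = dinucleotide_counts(seq)
    net = {}
    for pair, n in counts.items():
        first, second = pair
        if first == second:
            continue
        excess = n - counts.get(second + first, 0)
        if excess > 0:
            net[pair] = excess
    return net


# "ACACAC" contains AC three times and CA twice, leaving a net AC count of 1.
print(net_dinucleotide_counts("ACACAC"))  # {'AC': 1}
```

On the toy sequence `ACTA`, the sketch yields net counts of 1 for AC, CT, and TA, which has the same shape as the equal-valued A-C-T-A example given in the abstract.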
Year: 2007 PMID: 17987374 DOI: 10.1007/s11262-007-0128-6
Source DB: PubMed Journal: Virus Genes ISSN: 0920-8569 Impact factor: 2.332