Mark Bauer, Sheldon M Schuster, Khalid Sayood.
Abstract
BACKGROUND: Occult organizational structures in DNA sequences may hold the key to understanding functional and evolutionary aspects of the DNA molecule. Such structures can also provide the means for identifying and discriminating organisms using genomic data. Species-specific genomic signatures are useful in a variety of contexts, such as evolutionary analysis, assembly and classification of genomic sequences from large uncultivated microbial communities, and rapid identification systems in health hazard situations.
Year: 2008 PMID: 18218139 PMCID: PMC2335307 DOI: 10.1186/1471-2105-9-48
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Average Mutual Information Profile for Human Chromosome 1 plotted for k ≥ 5. The x-axis is the distance k between bases, while the y-axis is the value of the average mutual information I.
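The profile plotted above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the standard mutual information formula I(k) = Σ p_k(x, y) log2[p_k(x, y) / (p(x) p(y))] over base pairs k positions apart, uses log base 2, and estimates the marginals from the same pairs. The function name `ami_profile` and its parameters are hypothetical.

```python
import math
from collections import Counter

BASES = "ACGT"

def ami_profile(seq, k_min=5, k_max=50):
    """Return {k: I(k)} in bits for a DNA string (hypothetical helper)."""
    seq = seq.upper()
    profile = {}
    for k in range(k_min, k_max + 1):
        # Count base pairs (x, y) that sit exactly k positions apart,
        # skipping pairs that contain ambiguous symbols such as N.
        pairs = Counter(
            (x, y) for x, y in zip(seq, seq[k:]) if x in BASES and y in BASES
        )
        total = sum(pairs.values())
        if total == 0:
            profile[k] = 0.0
            continue
        # Marginals are taken from the same pairs (an assumption of this sketch).
        left, right = Counter(), Counter()
        for (x, y), c in pairs.items():
            left[x] += c
            right[y] += c
        i_k = 0.0
        for (x, y), c in pairs.items():
            p_xy = c / total
            i_k += p_xy * math.log2(p_xy / ((left[x] / total) * (right[y] / total)))
        profile[k] = i_k
    return profile
```

Calling `ami_profile(sequence)` with the defaults covers k from 5 to 50, the range shown in the figures. A strictly periodic sequence gives a high value at multiples of its period, while a homopolymer gives zero at every k.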
Figure 2. a) Average Mutual Information Profile for the Human Chromosomes plotted for values of k between 5 and 50, b) Average Mutual Information Profile for the Mouse Chromosomes plotted for values of k between 5 and 50.
Figure 3. a) Average Mutual Information Profile for the C. elegans Chromosomes plotted for values of k between 5 and 50, b) Average Mutual Information Profile for the S. cerevisiae Chromosomes plotted for values of k between 5 and 50.
Figure 4. Plot of the first sixteen elements of the average mutual information profile for E. coli using the entire sequence and using 0.5% of the sequence.
Figure 5. Histogram of the correlation between the average mutual information profile of fragments of E. coli and S. aureus and the average mutual information profile of the entire E. coli genome.
Figure 6. Plot of the average correlation between the average mutual information profile of fragments of E. coli and S. aureus and the average mutual information profile of the entire E. coli genome, as a function of fragment length. The average correlation was obtained using 1000 trials with the appropriate fragment length.
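The fragment comparison behind Figures 5 and 6 can be illustrated as follows. This sketch assumes the correlation is the Pearson coefficient between two profiles of equal length; the source does not state which coefficient was used. The names `pearson` and `average_correlation` are hypothetical, and profiles are passed in as plain lists of I(k) values.

```python
import math

def pearson(u, v):
    """Pearson correlation between two equal-length profiles."""
    if len(u) != len(v) or len(u) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (norm_u * norm_v)

def average_correlation(reference, fragment_profiles):
    """Mean correlation of many fragment profiles with one reference profile."""
    values = [pearson(reference, frag) for frag in fragment_profiles]
    return sum(values) / len(values)
```

In the setting of Figure 6, `reference` would be the whole-genome profile and `fragment_profiles` would hold one profile per trial at a given fragment length.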
Table 1. Labels for chromosomes
| Accession | Chromosome |
| NT_002582 | M. musculus chromosome 14 |
| NT_002588 | M. musculus chromosome 17 |
| NT_003030 | M. musculus chromosome X |
| NC_001135 | S. cerevisiae chromosome 3 |
| NC_001137 | S. cerevisiae chromosome 5 |
| NC_001141 | S. cerevisiae chromosome 9 |
| NC_000965 | C. elegans chromosome 1 |
| NC_000966 | C. elegans chromosome 2 |
| NC_000967 | C. elegans chromosome 3 |
| NT_003140 | H. sapiens chromosome 14 |
| NT_002831 | H. sapiens chromosome 17 |
| NT_001374 | H. sapiens chromosome X |
Labels used for the chromosomes of various species in Table 2.
Table 2. Distance between chromosomes.
| 0.000 | 0.018 | 0.017 | 0.512 | 0.485 | 0.513 | 0.549 | 0.539 | 0.536 | 0.377 | 0.373 | 0.355 |
| 0.018 | 0.000 | 0.016 | 0.446 | 0.418 | 0.450 | 0.483 | 0.469 | 0.469 | 0.312 | 0.309 | 0.291 |
| 0.017 | 0.016 | 0.000 | 0.459 | 0.433 | 0.461 | 0.496 | 0.485 | 0.483 | 0.339 | 0.334 | 0.317 |
| 0.512 | 0.446 | 0.459 | 0.000 | 0.009 | 0.015 | 0.046 | 0.055 | 0.056 | 0.205 | 0.197 | 0.205 |
| 0.485 | 0.418 | 0.433 | 0.009 | 0.000 | 0.029 | 0.063 | 0.066 | 0.074 | 0.209 | 0.202 | 0.208 |
| 0.514 | 0.450 | 0.461 | 0.015 | 0.029 | 0.000 | 0.071 | 0.083 | 0.079 | 0.225 | 0.216 | 0.225 |
| 0.549 | 0.483 | 0.496 | 0.046 | 0.063 | 0.071 | 0.000 | 0.003 | 0.002 | 0.197 | 0.186 | 0.199 |
| 0.539 | 0.469 | 0.485 | 0.055 | 0.066 | 0.083 | 0.003 | 0.000 | 0.004 | 0.189 | 0.179 | 0.189 |
| 0.536 | 0.469 | 0.483 | 0.056 | 0.074 | 0.079 | 0.002 | 0.004 | 0.000 | 0.188 | 0.178 | 0.189 |
| 0.377 | 0.312 | 0.339 | 0.205 | 0.209 | 0.225 | 0.197 | 0.189 | 0.188 | 0.000 | 0.002 | 0.003 |
| 0.373 | 0.309 | 0.334 | 0.197 | 0.202 | 0.216 | 0.186 | 0.179 | 0.178 | 0.002 | 0.000 | 0.004 |
| 0.355 | 0.291 | 0.317 | 0.205 | 0.208 | 0.225 | 0.199 | 0.189 | 0.189 | 0.003 | 0.004 | 0.000 |
Distances between the average mutual information profiles of M. musculus, S. cerevisiae, C. elegans, and H. sapiens chromosomes.
Figure 7. Clustering of all chromosomes from S. cerevisiae, M. musculus, H. sapiens and C. elegans. The clustering and visualization approach is described in the Methods section.
Table 3. Labels for HIV subtypes
| Acc. No. | Description |
| AF004885 | HIV-1 isolate from Kenya (Subtype A) |
| AF069671 | HIV-1 isolate from Sweden (Subtype A) |
| U51190 | HIV-1 isolate from Uganda (Subtype A) |
| AF069672 | HIV-1 isolate from Sweden (Subtype A) |
| AF107771 | HIV-1 isolate from Sweden (Subtype A) |
| M62320 | HIV-1 Ugandan isolate (Subtype A) |
| AF069670 | HIV-1 isolate from Somalia (Subtype A) |
| AF042101 | HIV-1 isolate from Australia (Subtype B) |
| U37270 | HIV-1 isolate from Australia (Subtype B) |
| U43096 | HIV-1 isolate from Germany (Subtype B) |
| U43141 | HIV-1 isolate from Germany (Subtype B) |
| AJ006287 | HIV-1 isolate from Spain (Subtype B) |
| AF146728 | HIV-1 from Australia (Subtype B) |
| U71182 | HIV-1 isolate from China (Subtype B) |
| AF110960 | HIV-1 isolate from Botswana (Subtype C) |
| AF110959 | HIV-1 isolate from Botswana (Subtype C) |
| U52953 | HIV-1 isolate from Brazil (Subtype C) |
| AF067157 | HIV-1 isolate from India (Subtype C) |
| AF067155 | HIV-1 isolate 21068 from India (Subtype C) |
| U46016 | HIV-1 (Subtype C) |
| AB023804 | HIV-1 (Subtype C) |
List of accession numbers, descriptions, and labels of HIV-1 sequences used to examine distances between subtypes.
Figure 8. Clustering of HIV-1 subtypes based on the distance between their respective AMI profiles.
Figure 9. Clustering of HIV-1 subtypes evidenced by three coefficients of the singular value decomposition of the AMI profiles.
Figure 10. UPGMA tree for subtypes of the HIV-1 virus. The distances used to construct the UPGMA tree were obtained from their respective AMI profiles. The labels used here are defined in Table 3.
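For readers unfamiliar with UPGMA, the sketch below shows the standard algorithm: repeatedly merge the two closest clusters and replace their distances to every other cluster with a size-weighted average. It is an illustration under stated assumptions, not the authors' code. The function name `upgma` and the nested-tuple tree format are hypothetical, and the example input reuses four rows of the chromosome distance table (rows 1, 2, 10 and 11) with placeholder labels rather than HIV-1 distances, which are not listed here.

```python
def upgma(labels, matrix):
    """Build a UPGMA tree as nested tuples (left, right, height)."""
    # Active clusters are keyed by integer id; ids 0..n-1 are the leaves.
    trees = {i: name for i, name in enumerate(labels)}
    sizes = {i: 1 for i in trees}
    dist = {(i, j): matrix[i][j] for i in trees for j in trees if i < j}
    next_id = len(labels)
    while len(trees) > 1:
        i, j = min(dist, key=dist.get)           # closest pair of clusters
        merged = (trees[i], trees[j], dist[(i, j)] / 2)
        new_size = sizes[i] + sizes[j]
        # Size-weighted average distance from the merged cluster to the rest.
        new_dist = {}
        for k in trees:
            if k in (i, j):
                continue
            d_ik = dist[(min(i, k), max(i, k))]
            d_jk = dist[(min(j, k), max(j, k))]
            new_dist[k] = (sizes[i] * d_ik + sizes[j] * d_jk) / new_size
        # Drop every entry that involves the two merged clusters.
        dist = {p: v for p, v in dist.items() if i not in p and j not in p}
        del trees[i], trees[j], sizes[i], sizes[j]
        for k, v in new_dist.items():
            dist[(k, next_id)] = v               # k is always smaller than next_id
        trees[next_id] = merged
        sizes[next_id] = new_size
        next_id += 1
    return next(iter(trees.values()))

# Placeholder labels; values copied from rows 1, 2, 10 and 11 of the distance table.
example_labels = ["row1", "row2", "row10", "row11"]
example_matrix = [
    [0.000, 0.018, 0.377, 0.373],
    [0.018, 0.000, 0.312, 0.309],
    [0.377, 0.312, 0.000, 0.002],
    [0.373, 0.309, 0.002, 0.000],
]
example_tree = upgma(example_labels, example_matrix)
```

On this input the two pairs with distances 0.002 and 0.018 are joined first, and the two resulting clusters are joined last. Each internal node stores half the merge distance as its height, which is the usual ultrametric convention for UPGMA.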