Haruo Suzuki, Masahiro Sota, Celeste J Brown, Eva M Top.
Abstract
Plasmids are ubiquitous mobile elements that serve as a pool of many host-beneficial traits such as antibiotic resistance in bacterial communities. To understand the importance of plasmids in horizontal gene transfer, we need to gain insight into the 'evolutionary history' of these plasmids, i.e. the range of hosts in which they have evolved. Since extensive data support the proposal that foreign DNA acquires the host's nucleotide composition during long-term residence, comparison of nucleotide composition of plasmids and chromosomes could shed light on a plasmid's evolutionary history. The average absolute dinucleotide relative abundance difference, termed delta-distance, has been commonly used to measure differences in dinucleotide composition, or 'genomic signature', between bacterial chromosomes and plasmids. Here, we introduce the Mahalanobis distance, which takes into account the variance-covariance structure of the chromosome signatures. We demonstrate that the Mahalanobis distance is better than the delta-distance at measuring genomic signature differences between plasmids and chromosomes of potential hosts. We illustrate the usefulness of this metric for proposing candidate long-term hosts for plasmids, focusing on the virulence plasmids pXO1 from Bacillus anthracis, and pO157 from Escherichia coli O157:H7, as well as the broad host range multi-drug resistance plasmid pB10 from an unknown host.
Year: 2008 PMID: 18953039 PMCID: PMC2602791 DOI: 10.1093/nar/gkn753
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
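The two measures contrasted in the abstract can be made concrete with a small sketch. It is illustrative only and is not the authors' pipeline. It assumes the usual definition of dinucleotide relative abundance, ρ_xy = f_xy / (f_x f_y), computed on a single strand with no reverse-complement symmetrization and no scaling factor. It also assumes that the chromosome's variance-covariance structure is estimated from a set of signature vectors, for example one per chromosome segment. The names `signature`, `delta_distance` and `mahalanobis_sq` are hypothetical.

```python
from collections import Counter
from itertools import product

# The 16 dinucleotides in a fixed order: AA, AC, AG, AT, CA, ...
DINUCS = ["".join(p) for p in product("ACGT", repeat=2)]

def signature(seq):
    """Dinucleotide relative abundances rho_xy = f_xy / (f_x * f_y)."""
    seq = seq.upper()
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_mono = sum(mono[b] for b in "ACGT")
    n_di = sum(di[d] for d in DINUCS)
    if n_mono == 0 or n_di == 0:
        raise ValueError("sequence too short or contains no A/C/G/T")
    rho = []
    for d in DINUCS:
        expected = (mono[d[0]] / n_mono) * (mono[d[1]] / n_mono)
        rho.append((di[d] / n_di) / expected if expected > 0 else 0.0)
    return rho

def delta_distance(sig_a, sig_b):
    """Average absolute difference between two signature vectors."""
    return sum(abs(a - b) for a, b in zip(sig_a, sig_b)) / len(sig_a)

def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def mahalanobis_sq(x, samples):
    """Squared Mahalanobis distance of x from the mean of `samples`,
    using the sample variance-covariance matrix of `samples`."""
    n, k = len(samples), len(x)
    mean = [sum(s[j] for s in samples) / n for j in range(k)]
    cov = [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / (n - 1)
            for j in range(k)] for i in range(k)]
    diff = [xi - mi for xi, mi in zip(x, mean)]
    # (x - mean)^T * cov^-1 * (x - mean), without forming the inverse
    return sum(d * y for d, y in zip(diff, solve(cov, diff)))
```

In this sketch the δ-distance compares a plasmid signature with a single chromosome signature, whereas `mahalanobis_sq` compares it with the spread of many chromosome-derived signatures, so directions in which the chromosome itself varies strongly contribute less to the distance. More signature vectors than dimensions are needed for the covariance matrix to be invertible.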
Figure 1. Histograms showing the distribution of ranks of genomic signature similarities between 504 plasmids and their known hosts based on Mahalanobis distance (A) and δ-distance (B).
Figure 2. Histograms showing the distribution of P-values derived from Mahalanobis distances (A) and δ-distances (B) between 504 plasmids and their known hosts.
Figure 3. Plot of Mahalanobis distances between 504 plasmids and their known hosts, against plasmid sizes.
Table 1. Ten highest ranking bacterial strains based on Mahalanobis distance and δ-distance for plasmid pXO1 from B. anthracis str. Ames Ancestor
| Bacterial strain | Phylum | D² | P(D²) | δ | P(δ) |
|---|---|---|---|---|---|
| Sorted by Mahalanobis distance | |||||
| | Firmicutes | 2.17 | 0.994 | 47.5 | 0.517 |
| | Firmicutes | 2.83 | 0.987 | 50.6 | 0.415 |
| | Firmicutes | 2.86 | 0.986 | 50.0 | 0.415 |
| | Firmicutes | 2.91 | 0.989 | 51.0 | 0.423 |
| | Firmicutes | 3.26 | 0.988 | 52.2 | 0.379 |
| | Firmicutes | 3.42 | 0.977 | 50.9 | 0.406 |
| | Firmicutes | 4.44 | 0.960 | 59.2 | 0.226 |
| | Firmicutes | 5.43 | 0.926 | 50.8 | 0.378 |
| | Firmicutes | 5.72 | 0.887 | 55.5 | 0.306 |
| | Firmicutes | 7.03 | 0.858 | 57.3 | 0.216 |
| Sorted by δ-distance | |||||
| | Cyanobacteria | 53.01 | 0.018 | 41.7 | 0.440 |
| | Cyanobacteria | 62.56 | 0.012 | 42.7 | 0.429 |
| | Proteobacteria | 19.64 | 0.211 | 43.1 | 0.523 |
| | Firmicutes | 2.17 | 0.994 | 47.5 | 0.517 |
| | Firmicutes | 2.86 | 0.986 | 50.0 | 0.415 |
| | Firmicutes | 2.83 | 0.987 | 50.6 | 0.415 |
| | Firmicutes | 5.43 | 0.926 | 50.8 | 0.378 |
| | Firmicutes | 3.42 | 0.977 | 50.9 | 0.406 |
| | Firmicutes | 2.91 | 0.989 | 51.0 | 0.423 |
| | Bacteroidetes | 9.91 | 0.711 | 51.2 | 0.297 |
D², Mahalanobis distance; P(D²), P-value based on Mahalanobis distance; δ, δ-distance; P(δ), P-value based on δ-distance.
a Known host.
The P-values are not completely negatively correlated with the distances because they are based on empirical distributions that differ between bacterial chromosomes.
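The note above can be illustrated with a short sketch. It assumes, as a labeled guess rather than the authors' stated procedure, that each P-value is the fraction of a chromosome's own reference distances (for instance, distances of its segments to the whole chromosome) that are at least as large as the observed plasmid distance. The function name `empirical_p_value` and the numbers are hypothetical.

```python
def empirical_p_value(observed, null_distances):
    """Fraction of reference distances that are >= the observed distance."""
    if not null_distances:
        raise ValueError("need at least one reference distance")
    hits = sum(1 for d in null_distances if d >= observed)
    return hits / len(null_distances)

# Hypothetical reference distributions for two different chromosomes.
chromosome_a = [2.0, 4.0, 6.0, 8.0, 10.0]   # broad spread of distances
chromosome_b = [1.0, 1.0, 2.0, 2.0, 3.0]    # tight spread of distances

p_a = empirical_p_value(5.0, chromosome_a)  # larger distance, broad reference
p_b = empirical_p_value(3.0, chromosome_b)  # smaller distance, tight reference
```

Under these assumptions the smaller distance (3.0 against chromosome B) receives the lower P-value, because each distance is judged against a different chromosome-specific distribution. This is why a ranking by distance and a ranking by P-value need not agree exactly.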
Table 2. Ten highest ranking bacterial strains based on Mahalanobis distance and δ-distance for plasmid pO157 from E. coli O157:H7 EDL933
| Bacterial strain | Phylum | D² | P(D²) | δ | P(δ) |
|---|---|---|---|---|---|
| Sorted by Mahalanobis distance | |||||
| | Proteobacteria | 3.79 | 0.972 | 32.8 | 0.724 |
| | Proteobacteria | 4.31 | 0.962 | 33.2 | 0.704 |
| | Proteobacteria | 4.33 | 0.962 | 32.8 | 0.700 |
| | Proteobacteria | 4.41 | 0.966 | 33.3 | 0.696 |
| | Proteobacteria | 4.42 | 0.968 | 33.7 | 0.687 |
| | Proteobacteria | 4.44 | 0.967 | 33.2 | 0.713 |
| | Proteobacteria | 4.49 | 0.963 | 33.6 | 0.673 |
| | Proteobacteria | 4.62 | 0.948 | 34.3 | 0.667 |
| | Proteobacteria | 4.73 | 0.955 | 35.3 | 0.631 |
| | Proteobacteria | 5.51 | 0.926 | 59.3 | 0.238 |
| Sorted by δ-distance | |||||
| | Proteobacteria | 3.79 | 0.972 | 32.8 | 0.724 |
| | Proteobacteria | 4.33 | 0.962 | 32.8 | 0.700 |
| | Proteobacteria | 4.44 | 0.967 | 33.2 | 0.713 |
| | Proteobacteria | 4.31 | 0.962 | 33.2 | 0.704 |
| | Proteobacteria | 4.41 | 0.966 | 33.3 | 0.696 |
| | Proteobacteria | 4.49 | 0.963 | 33.6 | 0.673 |
| | Proteobacteria | 4.42 | 0.968 | 33.7 | 0.687 |
| | Proteobacteria | 4.62 | 0.948 | 34.3 | 0.667 |
| | Proteobacteria | 4.73 | 0.955 | 35.3 | 0.631 |
| | Firmicutes | 338.76 | 0.000 | 40.6 | 0.525 |
D², Mahalanobis distance; P(D²), P-value based on Mahalanobis distance; δ, δ-distance; P(δ), P-value based on δ-distance.
a Known host.
See Table 1 legend for explanation of P-values.
Table 3. Ten highest ranking bacterial strains based on Mahalanobis distance and mean δ-distance for broad host range plasmid pB10 from an unknown host (%GC = 64.2)
| Bacterial strain | Phylum | D² | P(D²) | δ | P(δ) | %GC |
|---|---|---|---|---|---|---|
| Sorted by Mahalanobis distance | ||||||
| | Proteobacteria | 3.07 | 0.984 | 44.1 | 0.604 | 64.8 |
| | Proteobacteria | 6.05 | 0.923 | 69.6 | 0.133 | 66.5 |
| | Proteobacteria | 6.10 | 0.886 | 73.9 | 0.152 | 67.0 |
| | Proteobacteria | 6.32 | 0.922 | 93.6 | 0.061 | 65.1 |
| | Proteobacteria | 6.44 | 0.902 | 66.0 | 0.226 | 65.5 |
| | Proteobacteria | 7.31 | 0.856 | 47.3 | 0.481 | 67.9 |
| | Proteobacteria | 7.45 | 0.863 | 33.9 | 0.687 | 59.2 |
| | Proteobacteria | 7.46 | 0.844 | 64.4 | 0.243 | 68.5 |
| | Proteobacteria | 7.84 | 0.802 | 76.8 | 0.098 | 68.1 |
| | Proteobacteria | 8.09 | 0.823 | 93.9 | 0.062 | 64.7 |
| Sorted by δ-distance | ||||||
| | Proteobacteria | 7.45 | 0.863 | 33.9 | 0.687 | 59.2 |
| | Actinobacteria | 116.15 | 0.000 | 36.4 | 0.471 | 56.3 |
| | Proteobacteria | 13.02 | 0.445 | 39.7 | 0.547 | 60.5 |
| | Proteobacteria | 3.07 | 0.984 | 44.1 | 0.604 | 64.8 |
| | Chlorobi | 44.06 | 0.037 | 44.1 | 0.379 | 55.8 |
| | Proteobacteria | 10.74 | 0.657 | 44.1 | 0.482 | 66.6 |
| | Proteobacteria | 8.28 | 0.790 | 44.4 | 0.486 | 66.3 |
| | Proteobacteria | 30.23 | 0.065 | 46.5 | 0.369 | 59.2 |
| | Proteobacteria | 9.05 | 0.758 | 46.9 | 0.432 | 66.4 |
| | Proteobacteria | 31.08 | 0.077 | 47.0 | 0.353 | 53.9 |
D², Mahalanobis distance; P(D²), P-value based on Mahalanobis distance; δ, δ-distance; P(δ), P-value based on δ-distance; %GC, genome G + C content defined as 100 × (G + C)/(A + T + G + C).
See Table 1 legend for explanation of P-values.
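The %GC definition in the Table 3 legend, 100 × (G + C)/(A + T + G + C), translates directly into code. The sketch below is a minimal illustration with a hypothetical function name, `gc_percent`. It assumes that any character other than A, C, G or T (such as N) is left out of both the numerator and the denominator.

```python
def gc_percent(seq):
    """G + C content as 100 * (G + C) / (A + T + G + C)."""
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * (counts["G"] + counts["C"]) / total
```

Comparing this value for a plasmid and a candidate host chromosome gives a coarse check alongside the distance-based rankings, as the %GC column in the table does for pB10.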