Tom E Roos, Mark W J van Passel.
Abstract
BACKGROUND: Microbial genomes evolve not only through the slow accumulation of mutations but also, and often more dramatically, by taking up new DNA in a process called horizontal gene transfer. These innovative leaps in the acquisition of new traits can take place via the introgression of single genes, but also through the acquisition of large gene clusters, which are termed Genomic Islands. Since only a small proportion of all DNA diversity has been sequenced, it can be hard to find the appropriate donors for acquired genes via sequence alignments against databases. In contrast, relative oligonucleotide frequencies represent a remarkably stable genomic signature in prokaryotes, which makes compositional comparison an alignment-free alternative for assessing phylogenetic relatedness. In this project, we test whether Genomic Islands identified in individual bacterial genomes have a similar genomic signature, in terms of relative dinucleotide frequencies, and can therefore be expected to originate from a common donor species.
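The dinucleotide-based genomic signature mentioned above can be illustrated with a minimal sketch. It assumes the commonly used Karlin-style definition, in which the relative abundance ρ*_XY = f*_XY / (f*_X f*_Y) is computed over a sequence together with its reverse complement, and the dissimilarity δ* is the mean absolute difference of ρ* over the 16 dinucleotides. The function names and the scaling factor of 1000 are assumptions made for illustration; the record does not state the exact implementation or scaling used by the authors.

```python
from collections import Counter
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (non-ACGT symbols are kept)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def dinucleotide_signature(seq):
    """Relative dinucleotide abundances rho*_XY, counted on both strands.

    Assumed definition: rho*_XY = f*_XY / (f*_X * f*_Y), where f* denotes
    frequencies in the sequence concatenated with its reverse complement.
    """
    seq = seq.upper()
    mono, di = Counter(), Counter()
    for strand in (seq, reverse_complement(seq)):
        for base in strand:
            if base in "ACGT":
                mono[base] += 1
        for a, b in zip(strand, strand[1:]):
            if a in "ACGT" and b in "ACGT":
                di[a + b] += 1
    n_mono, n_di = sum(mono.values()), sum(di.values())
    if n_mono == 0 or n_di == 0:
        raise ValueError("sequence needs at least two adjacent A/C/G/T bases")
    signature = {}
    for x, y in product("ACGT", repeat=2):
        fx, fy = mono[x] / n_mono, mono[y] / n_mono
        fxy = di[x + y] / n_di
        # A dinucleotide whose bases never occur gets abundance 0.
        signature[x + y] = fxy / (fx * fy) if fx and fy else 0.0
    return signature


def delta_star(seq_a, seq_b, scale=1000):
    """Mean absolute difference of the two signatures (scale=1000 is an assumption)."""
    sig_a = dinucleotide_signature(seq_a)
    sig_b = dinucleotide_signature(seq_b)
    return scale * sum(abs(sig_a[k] - sig_b[k]) for k in sig_a) / 16
```

Because both strands are counted, a sequence and its reverse complement yield the same signature, so the comparison does not depend on the orientation in which an island is annotated.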
Year: 2011 PMID: 21864345 PMCID: PMC3176501 DOI: 10.1186/1471-2164-12-427
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Size distribution of 1787 Genomic Islands > 10 kb in 246 genome sequences (note the logarithmic scale on the vertical axis). The GIs are binned per 2 kb in size.
Figure 2. Number of Genomic Islands per genome for the 246 genomes tested (genome size > 800 kb, GI size > 10 kb, and no conflicts).
Figure 3. Distribution of the relative compositional similarity and GC similarity of all GIs (1787) with their respective genomes, with 1395 (78%, in red) of the GIs having a relative dissimilarity of 90%.
Figure 4. Number of clustered GIs per genome.
Figure 5. Clustering of the 24 Genomic Islands > 10 kb in . Below the cut-off value (red line; dissimilarity < 1.44, see Additional File 2), seven clusters are identified (six clusters with two GIs, and one with three GIs), with a total of 15 Genomic Islands (indicated with seven colored bars). The GIs and their numbers are identified in Additional File 5.
Compositional comparison of Core Islands e1-e5 (with relative dissimilarities of 10%) of Escherichia coli O157:H7 with each other (underlined), and Genomic Islands with each other (bold). All values are genomic dissimilarity values (δ*).
| Region | Start coordinate | End coordinate | Size (bp) | Genome | e1 | e2 | e3 | e4 | e5 | GI8 | GI24 | GI25 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Genome | 0 | 5498450 | 5498450 | 0 | 18.6 | 18.5 | 18.5 | 18.7 | 18.9 | 47.2 | 58.1 | 49.0 |
| e1 | 4770000 | 4785000 | 15001 | 18.6 | | | | | | 59.8 | 72.7 | 65.8 |
| e2 | 2445000 | 2460000 | 15001 | 18.5 | | | | | | 58.3 | 68.6 | 63.2 |
| e3 | 1695000 | 1710000 | 15001 | 18.5 | | | | | | 53.0 | 60.8 | 57.7 |
| e4 | 135000 | 150000 | 15001 | 18.7 | | | | | | 59.4 | 67.2 | 61.5 |
| e5 | 1455000 | 1470000 | 15001 | 18.9 | | | | | | 49.8 | 54.7 | 47.0 |
| GI8 | 892240 | 903808 | 11568 | 47.2 | 59.8 | 58.3 | 53.0 | 59.4 | 49.8 | | | |
| GI24 | 2924490 | 2936721 | 12231 | 58.1 | 72.7 | 68.6 | 60.8 | 67.2 | 54.7 | | | |
| GI25 | 3193144 | 3204209 | 11065 | 49.0 | 65.8 | 63.2 | 57.7 | 61.5 | 47.0 | | | |
Compositional comparison of Core Islands r1-r5 (with relative dissimilarities of 10%) of Rhodobacter sphaeroides with each other (underlined), and Genomic Islands with each other (bold). All values are genomic dissimilarity values (δ*).
| Region | Start coordinate | End coordinate | Size (bp) | Genome | r1 | r2 | r3 | r4 | r5 | GI1 | GI3 | GI4 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Genome | 0 | 3217726 | 3217726 | 0 | 16.3 | 16.9 | 16.8 | 16.8 | 16.9 | 42.1 | 40.6 | 44.0 |
| r1 | 2400000 | 2415000 | 15001 | 16.3 | | | | | | 41.7 | 43.7 | 45.9 |
| r2 | 990000 | 1005000 | 15001 | 16.9 | | | | | | 48.1 | 46.2 | 47.1 |
| r3 | 1620000 | 1635000 | 15001 | 16.8 | | | | | | 41.8 | 46.4 | 51.1 |
| r4 | 2910000 | 2925000 | 15001 | 16.8 | | | | | | 29.6 | 33.1 | 33.5 |
| r5 | 2310000 | 2325000 | 15001 | 16.9 | | | | | | 43.8 | 40.2 | 40.0 |
| GI1 | 2883355 | 2905795 | 22440 | 42.1 | 41.7 | 48.1 | 41.8 | 29.6 | 43.8 | | | |
| GI3 | 2085503 | 2112636 | 27133 | 40.6 | 43.7 | 46.2 | 46.4 | 33.1 | 40.2 | | | |
| GI4 | 1575936 | 1597159 | 21223 | 44.0 | 45.9 | 47.1 | 51.1 | 33.5 | 40.0 | | | |
Overview of the characteristics of the GI analyses using decreasing similarity thresholds (for all GIs > 10 kb)
| Threshold | Stringency | Total number of GIs | Number of genomes | GI < CI | Clusters | GIs in clusters* | Percentage clustered (%) | Prediction accuracy (%) |
|---|---|---|---|---|---|---|---|---|
| CI-0 | ++++ | 2191 | 267 | 1 | 20** | 40** | 1.8 | 99.9 |
| CI-5 | +++ | 2047 | 260 | 9 | 99 | 202 | 10.0 | 98.6 |
| CI-10 | ++ | 1787 | 246 | 11 | 134 | 271 | 15.3 | 97.5 |
| CI-25 | + | 1370 | 220 | 16 | 185 | 383 | 28.3 | 94.8 |
| Total analyzed | | 2609 | 322 | | | | | |
The totals represent the total numbers in the original data set from IslandViewer.
*) The percentage of clustered GIs (second last column) excludes 17 GIs from the total number of GIs (third column), since 17 genomes contain only a single GI, and with fewer than two GIs there can be no clustering.
**) Six out of 20 clusters contain in fact largely identical Genomic Islands, which explains their high compositional similarity.
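The correction described in the first footnote can be checked with a short calculation. The sketch below assumes that the same 17 single-GI genomes are subtracted at every threshold, which the footnote does not state explicitly; under that assumption the calculation reproduces the "Percentage clustered" column. The function name is hypothetical.

```python
def percentage_clustered(total_gis, clustered_gis, single_gi_genomes=17):
    """Percentage of GIs in clusters, excluding GIs that cannot cluster.

    Assumption: the 17 GIs from single-GI genomes are excluded at every threshold.
    """
    eligible = total_gis - single_gi_genomes
    return round(100 * clustered_gis / eligible, 1)


# (total number of GIs, GIs in clusters) per threshold, taken from the table
rows = {
    "CI-0": (2191, 40),
    "CI-5": (2047, 202),
    "CI-10": (1787, 271),
    "CI-25": (1370, 383),
}

for threshold, (total, clustered) in rows.items():
    print(threshold, percentage_clustered(total, clustered))
```

For CI-10, for example, this gives 271 / (1787 - 17) = 15.3%, matching the table.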