Qianli Huang, Xuanjin Cheng, Man Kit Cheung, Sergey S Kiselev, Olga N Ozoline, Hoi Shan Kwan.
Abstract
Genomic islands (GIs), frequently associated with the pathogenicity of bacteria and having a substantial influence on bacterial evolution, are groups of "alien" elements which probably undergo special temporal-spatial regulation in the host genome. Are there particular hallmark transcriptional signals for these "exotic" regions? Here we explore the potential transcriptional signals that underlie GIs, beyond the conventional views based on basic sequence composition such as codon usage and GC property bias. We show that transcription start positions (TSPs) are significantly enriched in the GI regions compared to the whole genome of Salmonella enterica and Escherichia coli. There was up to a four-fold increase for 70% of the GIs, implying that a high-density TSP profile can potentially differentiate the GI regions. Based on this feature, we developed a new sliding-window method, GIST (Genomic-island Identification by Signals of Transcription), to identify these regions. Subsequently, we compared the known GI-associated features of the GIs detected by GIST and by the existing method Islandviewer to those of the whole genome. Our method demonstrates high sensitivity in detecting GIs harboring genes with biased GI-like function, preferred subcellular localization, skewed GC property, shorter gene length and biased "non-optimal" codon usage. The special transcriptional signals discovered here may contribute to the coordinated regulation of foreign gene expression. Finally, using GIST, we detected many interesting GIs in the 2011 German E. coli O104:H4 outbreak strain TY-2482, including the microcin H47 system and the gene cluster ycgXEFZ-ymgABC, which activates the production of biofilm matrix. These findings highlight the power of GIST to predict GIs whose intrinsic features are distinct from those of the rest of the genome.
The heterogeneity of cumulative TSP profiles may not only be a better identifier for "alien" regions, but also provide hints about the special evolutionary course and transcriptional regulation of GI regions.
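The sliding-window idea described in the abstract can be illustrated with a minimal Python sketch. This is not the published GIST implementation: the function name `tsp_enriched_regions`, the window size, the step, the `min_fold` threshold and the treatment of the genome as linear are all assumptions made for illustration. The sketch only shows how TSP density in a window could be compared with the genome-wide average and how adjacent enriched windows could be merged into candidate regions.

```python
from bisect import bisect_left


def tsp_enriched_regions(tsps, genome_len, window=10_000, step=5_000, min_fold=2.0):
    """Return merged (start, end, max_fold) regions with enriched TSP density.

    Hypothetical parameters (not taken from the paper): `window`, `step`
    and `min_fold`. The genome is treated as linear for simplicity.
    """
    tsps = sorted(tsps)
    genome_density = len(tsps) / genome_len  # average TSPs per bp
    regions = []
    for start in range(0, max(genome_len - window, 0) + 1, step):
        end = start + window
        # Number of TSPs falling in the half-open interval [start, end)
        count = bisect_left(tsps, end) - bisect_left(tsps, start)
        fold = (count / window) / genome_density if genome_density else 0.0
        if fold < min_fold:
            continue
        if regions and start <= regions[-1][1]:
            # Overlaps or touches the previous enriched window: merge them
            prev_start, _, prev_fold = regions[-1]
            regions[-1] = (prev_start, end, max(prev_fold, fold))
        else:
            regions.append((start, end, fold))
    return regions


if __name__ == "__main__":
    # Synthetic data: one TSP per kb plus a dense cluster at 40-45 kb
    background = list(range(0, 100_000, 1_000))
    cluster = list(range(40_000, 45_000, 250))
    for region in tsp_enriched_regions(background + cluster, 100_000):
        print(region)
```

In the synthetic example only the windows covering the dense cluster exceed the threshold, so they are merged into a single candidate region. A real analysis would also need to handle the circular chromosome and choose thresholds empirically.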
Year: 2012 PMID: 22448273 PMCID: PMC3309015 DOI: 10.1371/journal.pone.0033759
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Proportion of GIs with enriched TSPs among ten bacterial genomes.
The y-axis represents the proportion of GI regions, and names of bacterial genomes are shown along the x-axis.
Figure 2. Number of GIs predicted by Islandviewer, GIST and Alien_hunter among ten bacterial genomes.
The y-axis represents the number of GIs detected by the three methods, and names of bacterial genomes are shown along the x-axis.
Figure 3. BRIG diagram showing results of GIST and Islandviewer on Salmonella enterica 62:z4,z23:– RSK2980 (NC_010067).
GIs predicted by Islandviewer are marked as Island1, Island2, etc., and those predicted by GIST are denoted Gist1, Gist2, etc. The three main divergent regions detected by GIST but missed by Islandviewer are labeled with green triangles.
Comparison of GIST and Islandviewer on detection of GIs in different function categories.
| Function category | Genome (avg. %) | GIST (avg. %) | Islandviewer (avg. %) | Genome-GIST | Genome-Islandviewer |
|---|---|---|---|---|---|
|  |  |  |  |  |  |
|  | 0.03 | 0 | 0 | - | - |
|  | 4.13 | 1.24 | 2.16 | ** | - |
|  | 7.01 | 6.68 | 4.58 | - | - |
|  | 4.08 | 3.51 | 4.13 | - | - |
|  |  |  |  |  |  |
|  | 0.78 | 0.28 | 0.32 | ** | * |
|  | 5.41 | 7.48 | 7.48 | ** | ** |
|  | 2.65 | 3.58 | 2.49 | ** | - |
|  | 3.46 | 1.07 | 3.5 | ** | - |
|  | 4.06 | 2.81 | 2.29 | ** | ** |
|  | 3.01 | 4.38 | 4.59 | ** | * |
|  | 1.08 | 1.57 | 0.9 | * | - |
|  | 0.01 | 0.14 | 0 | * | * |
|  |  |  |  |  |  |
|  | 6.24 | 3.37 | 1.79 | ** | ** |
|  | 7.95 | 4.32 | 1.87 | ** | ** |
|  | 2.02 | 1.02 | 0.14 | ** | ** |
|  | 8.12 | 3.41 | 4.15 | ** | ** |
|  | 3.91 | 2 | 0.27 | ** | ** |
|  | 2.05 | 1.43 | 0.65 | * | ** |
|  | 4.54 | 1.63 | 1.12 | ** | ** |
|  | 1.49 | 0.28 | 0.1 | ** | ** |
|  |  |  |  |  |  |
|  | 27.98 | 39.74 | 47.39 | ** | ** |
Based on the t test: * denotes p<0.05, ** denotes p<0.001 and – indicates p>0.05.
Each value is the average percentage (%) of the corresponding function category across the 10 strains.
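As a small illustration of the significance legend above, the following Python sketch maps a p-value to the marker used in the table. The function name `significance_marker` is hypothetical, and the handling of a p-value of exactly 0.05 (treated here as not significant) is an assumption, since the legend does not specify that boundary.

```python
def significance_marker(p_value):
    """Map a t-test p-value to the table's marker.

    Assumption: p == 0.05 is treated as not significant, because the
    legend only defines p<0.05, p<0.001 and p>0.05.
    """
    if p_value < 0.001:
        return "**"
    if p_value < 0.05:
        return "*"
    return "-"


if __name__ == "__main__":
    for p in (0.0005, 0.01, 0.2):
        print(p, significance_marker(p))
```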
Comparison of GIST and Islandviewer on detection of GIs in different subcellular locations.
| Subcellular location | Genome, avg. number (%) | GIST, avg. number (%) | Islandviewer, avg. number (%) | Genome-GIST | Genome-Islandviewer |
|---|---|---|---|---|---|
|  | 2062.1 (42.60) | 128.2 (34.00) | 107 (31.94) | ** | ** |
|  | 1152 (23.80) | 68.9 (18.00) | 56 (16.86) | ** | ** |
|  | 70.5 (1.40) | 15.7 (4.20) | 10.6 (3.21) | ** | * |
|  | 97.5 (2.00) | 13.4 (3.50) | 8.1 (2.51) | ** | - |
|  | 155.2 (3.20) | 12.4 (3.30) | 6.1 (1.92) | - | ** |
|  | 1317.8 (26.70) | 141.6 (36.40) | 152.5 (43.57) | ** | ** |
Based on the t test: * denotes p<0.05, ** denotes p<0.001 and – indicates p>0.05.
The number in parentheses is the average percentage (%) of the corresponding subcellular location across the 10 strains.
Non-conserved genomic islands among three examined E. coli strains.
| GIs | Start–End | Operons or replicons | Annotation or Notes |
|---|---|---|---|
|  | 161735–169690 |  | Activate production of the biofilm matrix |
|  | 1251957–1267105 |  | O-antigen gene cluster |
|  | 1903476–1912915 | B7LDK9 | Integrase; CP4-57 prophage |
|  | 2312266–2319405 | - | Putative uncharacterized protein |
|  | 2334883–2342762 | D3GU29, B7LG82 | Antigen 43 (Ag43) phase-variable biofilm formation autotransporter; CP4-44 prophage |
|  | 2436282–2443184 | B7LGI6 | Putative fimbrial-like adhesin protein |
|  | 2822740–2830307 |  | Conversion of Methionine to Cysteine |
|  | 2894844–2907891 |  | Tn7-like transposition proteins |
|  | 2982406–2992135 |  | Long polar fimbria protein |
|  | 3888114–3894384 | E1U309, B7LDQ7 | Transposase InsAB; Putative Filamentation |
|  | 4186166–4193990 |  | Periplasmic pilin chaperone, fimbria-like adhesion |
|  | 775–9220 |  | Microcin H47 system—An |
|  | 1039464–1046318 | N6-adenine-methyltransferase (Phage), | Phage cdtI |
|  | 4326970–4337621 | B3HAL5 | Tn21 resolvase and many uncharacterized proteins |
Based on the comparison of non-conserved GIs among the three E. coli strains (TY-2482, EAEC strain 55989 and EHEC strain O157:H7), one symbol denotes islands present only in TY-2482 and EAEC strain 55989, and another denotes TY-2482-specific islands. Start–End is defined by the start site of the first gene and the end site of the last gene in the island.