Xi Zhang1, Chong Peng1, Ge Zhang1, Feng Gao2.
Abstract
Essential genes are thought to encode proteins that carry out the basic functions needed to sustain cellular life, whereas genomic islands (GIs) usually contain clusters of horizontally transferred genes. It has been assumed that essential genes are unlikely to be located in GIs, but a systematic analysis of essential genes in GIs has not been carried out before. Here, we analyzed the essential genes of 28 prokaryotes using statistical methods and concluded that there are significantly fewer essential genes inside GIs than outside them. The functions of the 362 essential genes found in GIs were explored further by BLAST searches against the Virulence Factor Database (VFDB) and the phage/prophage sequence database of the PHAge Search Tool (PHAST). Consequently, 64 and 60 eligible essential genes were found to share sequence similarity with virulence factors and phage/prophage-related genes, respectively. We also found several toxin-related proteins and repressors encoded by these essential genes in GIs. The comparative analysis of essential genes in genomic islands will not only shed new light on the development of prediction algorithms for essential genes, but also provide clues about the functions of essential genes in genomic islands.
Year: 2015 PMID: 26223387 PMCID: PMC4519734 DOI: 10.1038/srep12561
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Information on the organisms used in the current study.
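| RefSeq | Group | No.eg | IslandPath-DIMOB X (%) | IslandPath-DIMOB Y (%) | IslandPath-DIMOB Z (%) | SIGI-HMM X (%) | SIGI-HMM Y (%) | SIGI-HMM Z (%) | Integrated method X (%) | Integrated method Y (%) | Integrated method Z (%) | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|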
| NC_005966 | Bacteria(−) | 499 | 0.4 | 5.88 | 15.18 | 0.6 | 7.5 | 15.18 | 1 | 6.41 | 15.3 | |
| NC_000964 | Bacteria(+) | 271 | 0 | 0 | 6.97 | 1.11 | 2.61 | 6.6 | 1.11 | 0.89 | 6.98 | |
| NC_016776 | Bacteria(−) | 547 | 3.84 | 11.67 | 12.8 | 8.59 | 17.03 | 12.46 | 10.06 | 13.96 | 12.63 | |
| NC_004663 | Bacteria(−) | 325 | 2.46 | 3.15 | 7.01 | 2.15 | 2.73 | 7.03 | 5.23 | 3.46 | 7.18 | |
| NC_006350 | Bacteria(−) | 505 | 6.34 | 15.46 | 14.82 | 4.16 | 9.86 | 15.2 | 6.93 | 8.37 | 15.77 | |
| NC_007651 | Bacteria(−) | 406 | 3.2 | 4.98 | 13.03 | 0 | 0 | 12.7 | 3.2 | 4.68 | 13.11 | |
| NC_011916 | Bacteria(−) | 480 | 1.25 | 4.55 | 12.63 | 0.83 | 3.64 | 12.61 | 1.88 | 5.56 | 12.65 | |
| NC_000913 | Bacteria(−) | 609 | 5.26 | 15.53 | 14.67 | 5.09 | 10.44 | 15.04 | 9.03 | 14.55 | 14.73 | |
| NC_008601 | Bacteria(−) | 392 | 1.02 | 12.5 | 23 | 0.26 | 20 | 22.81 | 1.02 | 8.33 | 23.22 | |
| NC_000907 | Bacteria(−) | 642 | 1.25 | 44.44 | 39.82 | 0.62 | 17.39 | 40.2 | 2.34 | 32.61 | 40.09 | |
| NC_000915 | Bacteria(−) | 323 | 1.24 | 8.89 | 22.4 | 0 | − | − | 1.24 | 8.89 | 22.4 | |
| NC_005791 | Archaeon | 519 | 0.58 | 7.69 | 30.66 | 0 | − | − | 0.58 | 7.69 | 30.66 | |
| NC_000962 | Bacteria(−) | 687 | 1.6 | 8.53 | 17.9 | 0.58 | 40 | 17.53 | 2.18 | 11.36 | 17.81 | |
| NC_002771 | Mycoplasmas | 310 | 0.65 | 9.52 | 40.47 | 0 | − | − | 0.65 | 9.52 | 40.47 | |
| NC_010729 | Bacteria(−) | 463 | 2.16 | 6.9 | 23.3 | 0 | 0 | 22.41 | 2.16 | 6.67 | 23.36 | |
| NC_002516 | Bacteria(−) | 117 | 0 | 0 | 2.12 | 0 | 0 | 2.13 | 0 | 0 | 2.16 | |
| NC_008463 | Bacteria(−) | 335 | 0 | 0 | 5.83 | 0 | 0 | 5.7 | 1.19 | 1.69 | 5.85 | |
| NC_004631 | Bacteria(+) | 358 | 0.84 | 0.86 | 8.86 | 0 | − | − | 1.12 | 0.69 | 9.39 | |
| NC_016810 | Bacteria(−) | 353 | 2.83 | 7.3 | 7.96 | 9.63 | 11.6 | 7.68 | 10.48 | 9.54 | 7.79 | |
| NC_016856 | Bacteria(+) | 105 | 1.91 | 0.73 | 2.04 | 13.33 | 4.13 | 1.83 | 13.33 | 2.57 | 1.91 | |
| NC_003197 | Bacteria(+) | 230 | 1.74 | 1.15 | 5.51 | 6.52 | 5 | 5.18 | 7.83 | 4.04 | 5.29 | |
| NC_004347 | Bacteria(−) | 403 | 0 | − | − | 0 | 0 | 10.07 | 0 | 0 | 10.07 | |
| NC_009511 | Bacteria(−) | 535 | 2.24 | 6.9 | 11.18 | 0.19 | 1.59 | 11.16 | 2.24 | 5.74 | 11.27 | |
| NC_002745 | Bacteria(−) | 302 | 0 | 0 | 12.09 | 0 | 0 | 11.84 | 0 | 0 | 12.19 | |
| NC_007795 | Bacteria(−) | 351 | 1.14 | 16.67 | 12.65 | 0 | 0 | 12.69 | 1.14 | 16.67 | 12.65 | |
| NC_003028 | Bacteria(−) | 244 | 0 | 0 | 11.84 | 0 | − | − | 0 | 0 | 11.84 | |
| NC_009009 | Bacteria(+) | 218 | 0 | 0 | 9.66 | 0 | 0 | 9.91 | 0 | 0 | 9.97 | |
| NC_002505 | Bacteria(−) | 779 | 2.44 | 18.81 | 31.24 | 2.31 | 17.82 | 31.28 | 3.08 | 19.05 | 31.35 | |
a Group: Bacteria(+), Gram-positive bacteria; Bacteria(−), Gram-negative bacteria.
b No.eg: number of essential genes of the organism.
c Each method (IslandPath-DIMOB, SIGI-HMM, or Integrated method) has three percentages: X, Y, and Z. X is the percentage of the organism's essential genes that are located in the GIs detected by that method. Y is the percentage of the genes inside GIs that are essential. Z is the percentage of the genes outside GIs that are essential. The character '−' in the IslandPath-DIMOB or SIGI-HMM columns means that no genomic island was detected by the corresponding method.
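The three percentages defined in footnote c can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name, its parameters, and the example counts are hypothetical, and the denominators of Y and Z (all genes inside and outside GIs, respectively) are an interpretation of the footnote.

```python
from typing import Optional, Tuple


def gi_essential_percentages(
    essential_total: int,   # essential genes in the organism (No.eg)
    essential_in_gi: int,   # essential genes located inside predicted GIs
    genes_total: int,       # all genes in the genome
    genes_in_gi: int,       # all genes located inside predicted GIs
) -> Tuple[float, Optional[float], Optional[float]]:
    """Return (X, Y, Z) as percentages.

    Y and Z are None when the method predicts no GI, which mirrors the
    '−' entries in the table. The choice of denominators for Y and Z is
    an assumption (see the lead-in).
    """
    # X: share of the organism's essential genes that fall inside GIs.
    x = 100.0 * essential_in_gi / essential_total
    if genes_in_gi == 0:
        return x, None, None
    # Y: share of the genes inside GIs that are essential.
    y = 100.0 * essential_in_gi / genes_in_gi
    # Z: share of the genes outside GIs that are essential.
    z = 100.0 * (essential_total - essential_in_gi) / (genes_total - genes_in_gi)
    return x, y, z


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
x, y, z = gi_essential_percentages(
    essential_total=200, essential_in_gi=4, genes_total=2000, genes_in_gi=100
)
print(f"X={x:.2f}%  Y={y:.2f}%  Z={z:.2f}%")
```

With these made-up counts, Y is lower than Z, which is the pattern most rows of the table show.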
Figure 1. Average percentages of essential genes located inside and outside GIs.
The three methods used to detect GIs are listed on the vertical axis. The P values from Student's t test are also shown in the figure.
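As a rough illustration of the comparison behind Figure 1, the sketch below computes a two-sample Student's t statistic (pooled variance) for the percentage of essential genes inside versus outside GIs. It is an assumption that the test was unpaired with equal variances; the source only says "Student's t test". The function name is hypothetical, and the input is just the Y and Z values of the Integrated method for the first five rows of the table above, so the result does not reproduce the published P values.

```python
import math
import statistics
from typing import Sequence, Tuple


def student_t_statistic(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Assumes equal variances in both groups.
    """
    na, nb = len(a), len(b)
    dof = na + nb - 2
    # Pooled sample variance, weighted by each group's degrees of freedom.
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / dof
    # Standard error of the difference between the two means.
    std_err = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / std_err, dof


# Integrated method, first five table rows: Y (inside GIs) and Z (outside GIs).
in_gi = [6.41, 0.89, 13.96, 3.46, 8.37]
outside_gi = [15.3, 6.98, 12.63, 7.18, 15.77]

t_stat, dof = student_t_statistic(in_gi, outside_gi)
print(f"t = {t_stat:.3f} with {dof} degrees of freedom")
```

A negative t here means the mean percentage inside GIs is lower than outside. Turning t into a P value requires the t distribution, which the standard library does not provide; `scipy.stats.ttest_ind` is one option if SciPy is available.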
Figure 2. Venn diagrams of the number of essential genes located in GIs.
The three circles represent IslandPath-DIMOB, SIGI-HMM and Integrated method, respectively.
Figure 3. Circos plot of virulence factors (green) and prophages (yellow) that share similar sequences.
Each word in the inner band is the name of an identified organism; each word outside the left half of the band is a gene name (the character '-' means 'unknown gene'). Each number around the circle is the serial number of the selected gene in the virulence factor and prophage datasets.
Figure 4. GC profile of the genome of Shewanella oneidensis MR-1.
The green line segments represent GIs. The blue triangles represent essential genes.