Shannan J Ho Sui, Amber Fedynak, William W L Hsiao, Morgan G I Langille, Fiona S L Brinkman.
Abstract
BACKGROUND: It has been noted that many bacterial virulence factor genes are located within genomic islands (GIs; clusters of genes in a prokaryotic genome of probable horizontal origin). However, such studies have been limited to single genera or isolated observations. We have performed the first large-scale analysis of multiple diverse pathogens to examine this association. We additionally identified genes found predominantly in pathogens, but not non-pathogens, across multiple genera using 631 complete bacterial genomes, and we identified common trends in virulence for genes in GIs. Furthermore, we examined the relationship between GIs and clustered regularly interspaced short palindromic repeats (CRISPRs) proposed to confer resistance to phage. METHODOLOGY/PRINCIPAL
Year: 2009 PMID: 19956607 PMCID: PMC2779486 DOI: 10.1371/journal.pone.0008094
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Genomic islands (GIs) contain higher proportions of virulence factors (VFs) than non-GI regions.
| GI Identification Method | VF Dataset | No. of VFs/Total no. of genes in GIs (%) | No. of VFs/Total no. of genes in non-GIs (%) | p-value |
| IslandPath-DINUC | All VFs | 581/11437 (5.1) | 1054/83161 (1.3) | 1.2E-135 |
| IslandPath-DINUC | Pathogen-associated | 160/10157 (1.6) | 151/72201 (0.2) | 2.8E-63 |
| IslandPath-DINUC | “Common” | 421/11318 (3.7) | 854/81791 (1.0) | 6.4E-86 |
| IslandPath-DIMOB | All VFs | 217/4601 (4.7) | 1246/84832 (1.5) | 1.3E-44 |
| IslandPath-DIMOB | Pathogen-associated | 58/4030 (1.4) | 217/74311 (0.3) | 3.7E-20 |
| IslandPath-DIMOB | “Common” | 159/4559 (3.5) | 979/83391 (1.2) | 1.2E-29 |
| SIGI-HMM | All VFs | 387/7618 (5.1) | 1039/80770 (1.3) | 4.9E-95 |
| SIGI-HMM | Pathogen-associated | 116/7029 (1.6) | 149/71224 (0.2) | 7.0E-51 |
| SIGI-HMM | “Common” | 271/7616 (3.6) | 890/79283 (1.1) | 5.0E-51 |
a VFs are defined as those genes curated as being VFs according to the VFDB. Only VFs in the VFDB where GI predictions were available from IslandPath/SIGI-HMM were included in the analysis.
b Total number of genes in GIs varies according to the number of genomes used that contain pathogen-associated, “Common”, or both types of VFs.
c Fisher's exact test.
d GIs are defined as 8 or more consecutive ORFs with dinucleotide bias as predicted with IslandPath-DINUC.
e GIs are defined as 8 or more consecutive ORFs with dinucleotide bias plus presence of 1 or more mobility genes within the region as predicted with IslandPath-DIMOB.
f GIs are defined based on codon usage (removing regions like ribosomal operons) as predicted with SIGI-HMM. See text regarding the complementarity of the IslandPath-DIMOB and SIGI-HMM methods.
g Pathogen-associated VFs have homologs only in other pathogen genomes, at the similarity cut-off used (see Materials and Methods).
h “Common” VFs have homologs in both pathogens and non-pathogens (e.g. certain iron uptake systems, etc.) at the similarity cutoff used (see Materials and Methods).
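As a rough illustration of footnote c, the first "All VFs" row of the table (581 VFs among 11437 GI genes versus 1054 VFs among 83161 non-GI genes) can be run through a two-sided Fisher's exact test. The sketch below is not the authors' code. The function names are hypothetical, and it assumes the common two-sided definition (summing every table that is no more probable than the observed one), which the source does not specify. Log-probabilities are used because the p-values here are far too small for naive factorial arithmetic.

```python
import math


def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a = VFs in GIs, b = other genes in GIs,
    c = VFs outside GIs, d = other genes outside GIs.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def log_prob(k):
        # Hypergeometric log-probability of seeing k VFs inside GIs.
        return (log_choose(row1, k) + log_choose(row2, col1 - k)
                - log_choose(n, col1))

    lo, hi = max(0, col1 - row2), min(row1, col1)
    log_obs = log_prob(a)
    # Keep every table at least as extreme as the observed one
    # (small tolerance guards against floating-point ties).
    logs = [lp for lp in (log_prob(k) for k in range(lo, hi + 1))
            if lp <= log_obs + 1e-7]
    peak = max(logs)
    return min(1.0, math.exp(peak) * sum(math.exp(lp - peak) for lp in logs))


# "All VFs" counts from the first row of the table (illustrative use only).
vf_gi, genes_gi = 581, 11437
vf_non, genes_non = 1054, 83161
p_value = fisher_exact_two_sided(vf_gi, genes_gi - vf_gi,
                                 vf_non, genes_non - vf_non)
fold = (vf_gi / genes_gi) / (vf_non / genes_non)
print(f"fold enrichment = {fold:.2f}, p = {p_value:.2e}")
```

The fold enrichment is simply the ratio of the two percentages in the row. The exact p-value may differ slightly from the published one depending on the tie-handling convention and software used.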
Figure 1. Enrichment of virulence factors (VFs) in GIs by pathogens.
The proportion (%) of genes that are VFs in GIs (predicted by the IslandPath–DINUC method) for pathogens grouped by genus is shown in red, versus the proportion of genes that are VFs outside of GIs, which is shown in blue.
Classification of virulence factors (VFs) in Genomic Islands (GIs) and non–GIs.
| VFDB Classification | VFs in GIs (#) | Proportion of genes in GIs (%) | VFs in non-GIs (#) | Proportion of genes in non-GIs (%) | p-value |
| Unclassified | 162 | 1.49 | 116 | 0.14 | 1.31E-75* |
| Type IV secretion system | 51 | 0.47 | 24 | 0.03 | 1.03E-28* |
| Type III secretion system | 97 | 0.89 | 154 | 0.19 | 1.12E-26* |
| Adherence | 107 | 0.98 | 195 | 0.24 | 8.17E-26* |
| Iron uptake (NS) | 31 | 0.28 | 60 | 0.07 | 1.81E-07* |
| Intracellular survival | 8 | 0.07 | 4 | 0.00 | 8.15E-05* |
| Toxin^g,h (O) | 25 | 0.23 | 63 | 0.08 | 1.14E-04* |
| Capsule | 4 | 0.04 | 0 | 0.00 | 1.00E-03* |
| Protease | 5 | 0.05 | 4 | 0.00 | 8.77E-03* |
| Antiphagocytosis (D) | 18 | 0.17 | 67 | 0.08 | 3.89E-02* |
| Immune evasion | 3 | 0.03 | 8 | 0.01 | 4.99E-01 |
| Actin-based motility (O) | 1 | 0.01 | 1 | 0.00 | 7.75E-01 |
| Secretion system (other) (NS) | 16 | 0.15 | 98 | 0.12 | 8.22E-01 |
| Invasion (O) | 2 | 0.02 | 7 | 0.01 | 8.22E-01 |
| IgA1 Protease (D) | 1 | 0.01 | 2 | 0.00 | 8.22E-01 |
| Magnesium uptake (NS) | 1 | 0.01 | 2 | 0.00 | 8.22E-01 |
| Motility (NS) | 7 | 0.06 | 67 | 0.08 | 1.00E+00 |
| Exoenzyme (NS) | 2 | 0.02 | 31 | 0.04 | 1.00E+00 |
| Endotoxin (NS) | 3 | 0.03 | 29 | 0.04 | 1.00E+00 |
| Regulation (R) | 3 | 0.03 | 26 | 0.03 | 1.00E+00 |
| Type II secretion system (NS) | 0 | 0.00 | 22 | 0.03 | 1.00E+00 |
| Stress protein (D) | 1 | 0.01 | 11 | 0.01 | 1.00E+00 |
| Cellular metabolism (D) | 0 | 0.00 | 8 | 0.01 | 1.00E+00 |
| Enzyme (NS) | 0 | 0.00 | 8 | 0.01 | 1.00E+00 |
| Cell wall (NS) | 1 | 0.01 | 6 | 0.01 | 1.00E+00 |
| Biofilm formation (D) | 0 | 0.00 | 4 | 0.00 | 1.00E+00 |
| Molecular mimicry (D) | 0 | 0.00 | 4 | 0.00 | 1.00E+00 |
| Intracellular growth (NS) | 0 | 0.00 | 3 | 0.00 | 1.00E+00 |
| Plasminogen activator (O) | 0 | 0.00 | 3 | 0.00 | 1.00E+00 |
| Serum resistance (D) | 0 | 0.00 | 3 | 0.00 | 1.00E+00 |
| Biosurfactant (NS) | 0 | 0.00 | 2 | 0.00 | 1.00E+00 |
| Pigment (O) | 0 | 0.00 | 2 | 0.00 | 1.00E+00 |
| Proinflammatory effect (NS) | 0 | 0.00 | 2 | 0.00 | 1.00E+00 |
| Anti-proteolysis (D) | 0 | 0.00 | 1 | 0.00 | 1.00E+00 |
| Bile resistance (D) | 0 | 0.00 | 1 | 0.00 | 1.00E+00 |
| Complement Protease (D) | 0 | 0.00 | 1 | 0.00 | 1.00E+00 |
| Complement resistance (D) | 0 | 0.00 | 1 | 0.00 | 1.00E+00 |
| Heat-shock protein (NS) | 0 | 0.00 | 1 | 0.00 | 1.00E+00 |
| Manganese uptake (NS) | 0 | 0.00 | 1 | 0.00 | 1.00E+00 |
| Nutrient acquisition (NS) | 0 | 0.00 | 1 | 0.00 | 1.00E+00 |
| Peptidase (D) | 0 | 0.00 | 1 | 0.00 | 1.00E+00 |
| Resistance to antimicrobial peptides (D) | 0 | 0.00 | 1 | 0.00 | 1.00E+00 |
a VFs are defined as those genes curated as being VFs according to the VFDB. Only those VFs in the VFDB where GI predictions were available from IslandPath were included in the analysis. VFs are also categorized, according to the VFDB, as O = Offensive; D = Defensive; NS = Nonspecific; R = Regulation; NA = Not Available.
b Number of VFs in GIs predicted with IslandPath-DINUC (more sensitive method).
c Proportion of genes in GIs that are VFs.
d Proportion of genes in non-GIs that are VFs.
e Fisher's exact test with Benjamini-Hochberg correction for multiple testing. Asterisks indicate statistical significance (p-value<0.05).
e Includes Type III secretion system genes and Type III translocated proteins.
f Includes Type IV secretion system genes and Type IV secretory proteins.
g Categories of VFs that were also statistically significant with the IslandPath-DIMOB dataset.
h Categories of VFs that were also statistically significant with the SIGI-HMM dataset.
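The Benjamini-Hochberg correction named in the footnotes can be sketched as follows. This is a generic, minimal implementation of the standard step-up procedure, not the authors' pipeline; the function name and the example p-values are hypothetical. It also shows why several categories in a table like this one can share the same adjusted value: the adjustment enforces monotonicity, so neighbouring ranks often collapse to a common number.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that each adjusted value is
    # the minimum of p * m / rank over all equal-or-larger ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values, one per VF category.
raw = [0.01, 0.04, 0.03, 0.005]
for p, q in zip(raw, benjamini_hochberg(raw)):
    flag = "*" if q < 0.05 else ""
    print(f"raw = {p:.3f}  adjusted = {q:.3f}{flag}")
```

A category would then be marked with an asterisk when its adjusted value falls below 0.05, as in the table.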
Virulence factors (VFs) in genomic islands (GIs) play more “offensive” roles.
| VF Type | Proportion of genes in DINUC GIs (%) | Proportion of genes in non-DINUC regions (%) | p-value | Proportion of genes in DIMOB GIs (%) | Proportion of genes in non-DIMOB regions (%) | p-value | Proportion of genes in SIGI-HMM GIs (%) | Proportion of genes in non-SIGI-HMM regions (%) | p-value |
| Offensive | 2.53 | 0.97 | 3.50E-37 | 1.64 | 1.15 | 4.98E-03 | 5.51 | 1.01 | 6.49E-144 |
| Defensive | 0.26 | 0.16 | 3.69E-02 | 0.18 | 0.17 | 8.52E-01 | 0.22 | 0.17 | 3.17E-01 |
| Nonspecific | 0.96 | 0.34 | 1.60E-16 | 0.57 | 0.41 | 1.16E-01 | 1.26 | 0.40 | 2.81E-19 |
| Regulation | 0.03 | 0.04 | 7.93E-01 | 0.00 | 0.04 | 1.70E-01 | 0.05 | 0.04 | 5.45E-01 |
a Fisher's exact test.
Statistically significant categories of virulence factors (VFs) that are Pathogen-associated or “Common” to both pathogens and non-pathogens.
| VFDB Classification | Pathogen-associated, no. (%) | “Common”, no. (%) | p-value |
| Categories with a higher percentage of Pathogen-associated VFs | | | |
| Toxin (O) | 79 (15.28) | 58 (3.27) | 1.84E-18* |
| Type III secretion system (O) | 117 (22.63) | 175 (9.87) | 1.02E-11* |
| Type IV secretion system (O) | 32 (6.19) | 51 (2.88) | 4.77E-03* |
| Categories with a higher percentage of “Common” VFs | | | |
| Motility (NA) | 0 (0) | 75 (4.23) | 9.95E-08* |
| Antiphagocytosis (D) | 6 (1.16) | 105 (5.92) | 1.13E-05* |
| Iron uptake (NS) | 5 (0.97) | 92 (5.19) | 2.51E-05* |
| Endotoxin (NS) | 0 (0) | 32 (1.80) | 2.98E-03* |
| Type II secretion system (NS) | 0 (0) | 22 (1.24) | 4.24E-02* |
a VFs are defined as those genes curated as being VFs according to the VFDB. VFs are also categorized, according to the VFDB, as O = Offensive; D = Defensive; NS = Nonspecific; R = Regulation; NA = Not Available.
b Pathogen-associated VFs have homologs only in other pathogen genomes, at the similarity cut-off used (see Materials and Methods).
c “Common” VFs have homologs in both pathogens and non-pathogens (e.g. certain iron uptake systems, etc.) at the similarity cutoff used (see Materials and Methods).
d Fisher's exact test. Only those categories with statistical significance (p-value<0.05) are listed.
Over-representation of CRISPRs in GIs.
| | IslandPath-DINUC | IslandPath-DIMOB | SIGI-HMM |
| Number of bacterial genomes | 245 | 237 | 213 |
| Number of GIs | 23889 | 6158 | 7529 |
| Proportion of genome in GIs (%) | 11.0 | 4.2 | 3.1 |
| Total number of CRISPRs | 684 | 661 | 607 |
| Expected number of CRISPRs in GIs | 75 | 28 | 19 |
| Observed number of CRISPRs in GIs | 145 | 66 | 43 |
| p-value (Chi-squared test) | 1.4E-17 | 6.5E-14 | 1.4E-08 |
a Number of bacterial genomes for which both CRISPRs and GIs could be predicted.
b Chi-squared test includes number of observed and expected CRISPRs outside of islands (data not shown).
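The expected counts and the chi-squared test in the table can be approximated from the figures it reports. The sketch below assumes that the expected number of CRISPRs in GIs is the total number of CRISPRs multiplied by the proportion of the genome in GIs, and that the test has two cells (inside and outside GIs) and therefore one degree of freedom. Both assumptions are inferred from the table and footnote b, not stated procedures, and the function name is hypothetical. Because the published proportions are rounded, the result should only be expected to land near the reported p-values.

```python
import math


def crispr_chi_squared(observed_in_gi, total_crisprs, gi_fraction):
    """Chi-squared goodness-of-fit test with two cells (in GIs, outside GIs).

    Returns (expected_in_gi, chi2_statistic, p_value).
    """
    expected_in = total_crisprs * gi_fraction
    expected_out = total_crisprs - expected_in
    observed_out = total_crisprs - observed_in_gi
    chi2 = ((observed_in_gi - expected_in) ** 2 / expected_in
            + (observed_out - expected_out) ** 2 / expected_out)
    # For one degree of freedom, the chi-squared survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return expected_in, chi2, p_value


# Values taken from the table: observed CRISPRs in GIs, total CRISPRs,
# and the proportion of the genome in GIs (as a fraction).
columns = {
    "IslandPath-DINUC": (145, 684, 0.110),
    "IslandPath-DIMOB": (66, 661, 0.042),
    "SIGI-HMM": (43, 607, 0.031),
}
for method, (obs, total, frac) in columns.items():
    expected, chi2, p = crispr_chi_squared(obs, total, frac)
    print(f"{method}: expected = {expected:.0f}, chi2 = {chi2:.1f}, p = {p:.1e}")
```

Under these assumptions the expected counts round to the values listed in the table (75, 28, and 19).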