Fan Yang, Jian Yang, Xiaobing Zhang, Lihong Chen, Yan Jiang, Yongliang Yan, Xudong Tang, Jing Wang, Zhaohui Xiong, Jie Dong, Ying Xue, Yafang Zhu, Xingye Xu, Lilian Sun, Shuxia Chen, Huan Nie, Junping Peng, Jianguo Xu, Yu Wang, Zhenghong Yuan, Yumei Wen, Zhijian Yao, Yan Shen, Boqin Qiang, Yunde Hou, Jun Yu, Qi Jin.
Abstract
The Shigella bacteria cause bacillary dysentery, which remains a significant threat to public health. The genus status and species classification appear no longer valid, as compelling evidence indicates that Shigella, as well as enteroinvasive Escherichia coli, are derived from multiple origins of E. coli and form a single pathovar. Nevertheless, Shigella dysenteriae serotype 1 causes deadly epidemics but Shigella boydii is restricted to the Indian subcontinent, while Shigella flexneri and Shigella sonnei are prevalent in developing and developed countries, respectively. To begin to explain these distinctive epidemiological and pathological features at the genome level, we have carried out comparative genomics on four representative strains. Each of the Shigella genomes includes a virulence plasmid that encodes conserved primary virulence determinants. The Shigella chromosomes share most of their genes with that of E. coli K12 strain MG1655, but each has over 200 pseudogenes, 300–700 copies of insertion sequence (IS) elements, and numerous deletions, insertions, translocations and inversions. There is extensive diversity of putative virulence genes, mostly acquired via bacteriophage-mediated lateral gene transfer. Hence, via convergent evolution involving gain and loss of functions, through bacteriophage-mediated gene acquisition, IS-mediated DNA rearrangements and formation of pseudogenes, the Shigella spp. became highly specific human pathogens with variable epidemiological and pathological features.
Year: 2005 PMID: 16275786 PMCID: PMC1278947 DOI: 10.1093/nar/gki954
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 2. Comparison of the Shigella chromosomes (a) and the virulence plasmids (b) (to scale). The chromosomes are compared to that of the E. coli K12 strain MG1655 (top). The virulence plasmid comparisons are made with pCP301 from Sf301 (always on top). Each marker length denotes 300 and 30 kb for chromosome and plasmid comparisons, respectively. The colour code denotes the maximal length of the paired segments: red, >10 kb; blue, 5–10 kb; cyan, 1–5 kb. The replication origin, ori, is indicated by an arrow for each plasmid, and the cell-entry regions are marked with horizontal double-arrowhead lines. The arrowhead indicates the locus of the truncated ori sequence in Sd197, and the arched line indicates the corresponding region in pCP301 that is deleted from pSB4_227 near the cell-entry region (see main text).
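The colour code in the Figure 2 legend can be read as a simple length-to-colour mapping. The sketch below is illustrative only: the function name `segment_colour` is hypothetical, and the handling of segments that are exactly 5 kb or 10 kb long, or shorter than 1 kb, is an assumption, since the legend does not say how those cases were drawn.

```python
def segment_colour(length_bp):
    """Map the length of a paired segment (in bp) to the Figure 2 colour.

    Assumptions (not stated in the legend): exactly 10 kb and exactly
    5 kb both fall in the blue band, and segments under 1 kb get no colour.
    """
    kb = length_bp / 1000
    if kb > 10:
        return "red"    # > 10 kb
    if kb >= 5:
        return "blue"   # 5-10 kb
    if kb >= 1:
        return "cyan"   # 1-5 kb
    return None         # below the smallest band in the legend


if __name__ == "__main__":
    for length in (12_000, 7_500, 2_000, 500):
        print(length, segment_colour(length))
```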
General features of the Shigella genomes compared with the genome of E. coli K12 MG1655
| Chromosome | MG1655 | Sd197 | Sf301 | 2457T | Sb227 | Ss046 |
|---|---|---|---|---|---|---|
| Total length (bp) | 4 639 675 | 4 369 232 | 4 607 203 | 4 599 354 | 4 519 823 | 4 825 265 |
| No. of total ORFs | 4254 | 4557 | 4434 | 4456 | 4353 | 4434 |
| No. of pseudogenes | 12 | 285 | 254 | 372 | 217 | 210 |
| Percentage of CDS (%) | 87.3 | 77.2 | 80.4 | 77.2 | 80.5 | 80.5 |
| G+C content (%) | 50.79 | 51.25 | 50.89 | 50.91 | 51.21 | 51.01 |
| No. of ribosomal RNA (16S/23S/5S) | 7/7/8 | 7/7/8 | 7/7/8 | 7/7/8 | 7/7/8 | 7/7/8 |
| No. of transfer RNA | 86 | 85 | 97 | 98 | 91 | 97 |
| Deletions (kb) | – | 955 | 639 | 709 | 746 | 518 |
| Insertions (kb) | – | 411 | 444 | 479 | 441 | 490 |
| Translocations and inversions | – | 43 | 13 | 15 | 23 | 11 |
| IS-elements (percentage) | 44 (1%) | 623 (12%) | 314 (7%) | 280 (7%) | 403 (9%) | 394 (8%) |
| Virulence plasmid | | pSD1_197 | pCP301 | pINV-2457T | pSB4_227 | pSS_046 |
| Total length (bp) | | 182 726 | 221 618 | ∼218 000 | 126 697 | 214 396 |
| No. of total ORFs | | 224 | 267 | ND | 149 | 241 |
| Percentage of CDS (%) | | 76.03 | 76.24 | ND | 74.18 | 79.06 |
| G+C content (%) | | 44.80 | 45.77 | ND | 47.41 | 45.27 |
| IS-elements (percentage) | | 78 (27%) | 88 (32%) | ND | 72 (38%) | 96 (33%) |
a. Data are obtained from a recently updated version of U00096.
b. Data are obtained from Jin et al. (11).
c. Data are obtained from Wei et al. (12); the virulence plasmid pINV-2457T was reported in the communication but the sequence is not yet publicly available.
d. Only those with DNA segments > 5 kb are listed.
Figure 1. Circular representations of the Shigella genomes. The outer scale is marked every 200 kb. Circles range from 1 (outer circle) to 9 (inner circle). Circles 1 and 2, ORFs encoded by leading and lagging strands, respectively, with colour code for functions: salmon, translation, ribosomal structure and biogenesis; light blue, transcription; cyan, DNA replication, recombination and repair; turquoise, cell division; deep pink, posttranslational modification, protein turnover and chaperones; olive drab, cell envelope biogenesis; purple, cell motility and secretion; forest green, inorganic ion transport and metabolism; magenta, signal transduction; red, energy production; sienna, carbohydrate transport and metabolism; yellow, amino acid transport; orange, nucleotide transport and metabolism; gold, co-enzyme transport and metabolism; dark blue, lipid metabolism; blue, secondary metabolites, transport and catabolism; grey, general function prediction only; black, function unclassified or unknown. Circle 3, distribution of pseudogenes. Circles 4 and 5, distribution of IS1/IS1N and other IS-species, respectively. Circles 6 and 7, G+C content and GC skew, (G−C)/(G+C), respectively, with a window size of 10 kb. Circles 8 and 9, distribution of tRNA genes and rrn operons, respectively. The replication origin and terminus are indicated for each. (The circular map for Sf301 was created based on the updated annotation.)
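The G+C content and GC skew tracks in circles 6 and 7 can be illustrated with a short windowed calculation. This is a minimal sketch, not the authors' pipeline: the function name `gc_windows` is hypothetical, and the use of consecutive non-overlapping windows is an assumption, because the legend gives only the window size (10 kb) and not the step.

```python
def gc_windows(seq, window=10_000):
    """Yield (start, gc_content, gc_skew) for consecutive windows of `seq`.

    gc_content = (G + C) / window length
    gc_skew    = (G - C) / (G + C), or 0.0 if the window has no G or C
    Windows are non-overlapping (an assumption); the last one may be shorter.
    """
    seq = seq.upper()
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        gc_content = (g + c) / len(chunk)
        gc_skew = (g - c) / (g + c) if (g + c) else 0.0
        yield start, gc_content, gc_skew


if __name__ == "__main__":
    # Toy sequence with a tiny window, just to show the output shape.
    for start, content, skew in gc_windows("GGGCAAAACCAT", window=8):
        print(start, round(content, 2), round(skew, 2))
```

On a real chromosome, the sign of the skew typically switches near the replication origin and terminus, which is why those positions are marked on the circular maps.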
ORFs in each Shigella genome related to main clinical biochemical reactions
| Reaction | Gene | Product | Sd197 | Sf301 | 2457T | Sb227 | Ss046 |
|---|---|---|---|---|---|---|---|
| Indole | tnaA | Tryptophanase | – | SF3754 | S4017 | SBO3667 | – |
| Ornithine | speC | Ornithine decarboxylase | SDY3107 | SF2962 | S3165 | SBO3024 | SSO3230 |
| Lactose | lacY | Galactoside permease | SDY0376 | – | – | – | SSO0300 |
| | lacZ | Beta-D-galactosidase | SDY0378 | – | – | – | SSO0299 |
| Lysine | cadA | Lysine decarboxylase | SDY4466 | – | – | – | SSO4308 |
| | cadB | Lysine/cadaverine transport protein | SDY4465 | – | – | – | SSO4315 |
| Hydrogen sulfide | phsA | Hydrogen sulfide production: membrane anchoring protein | – | – | – | – | – |
| | phsB | Hydrogen sulfide production: iron–sulfur subunit; electron transfer | – | – | – | – | – |
| | phsC | Hydrogen sulfide production: membrane anchoring protein | – | – | – | – | – |
| Citric acid | citT | Citrate:succinate antiporter | – | SF0530 | S0536 | SBO0477 | SSO0564 |
| | citC | Citrate lyase synthetase | – | SF0535 | S0542 | SBO0483 | SSO0571 |
| | citD | Citrate lyase acyl carrier protein (gamma chain) | – | SF0534 | S0541 | SBO0482 | SSO0569 |
| | citE | Citrate lyase beta chain (acyl lyase subunit) | – | SF0533 | S0540 | SBO0481 | SSO0568 |
| | citF | Citrate lyase alpha chain | – | SF0532 | S0539 | SBO0480 | SSO0567 |
| | citA | Sensory histidine kinase, regulation of citrate fermentation, senses citrate | – | – | – | SBO0484 | SSO0572 |
| | citB | Response regulator, regulation of citrate fermentation | – | SF0660 | S0683 | SBO0485 | SSO0573 |
| Acetate | aceA | Isocitrate lyase | SDY4328 | SF4081 | S3649 | SBO4035 | SSO4187 |
| | aceB | Malate synthase A | SDY4329 | SF4080 | S3650 | SBO4034 | SSO4186 |
| | aceK | Isocitrate dehydrogenase kinase/phosphatase | SDY327 | SF4082 | S3648 | SBO4036 | – |
| | cmtA | PTS system, mannitol permease II, BC component | SDY3144 | – | | SBO3056 | SSO3087 |
| | cmtB | PTS system, mannitol permease II, A component | SDY3143 | – | | SBO3055 | SSO3086 |
| | mtlA | PTS system, mannitol permease II, ABC components | – | SF3633 | S4135 | SBO3597 | SSO3809 |
| | mtlD | Mannitol-1-phosphate dehydrogenase | – | SF3634 | S4134 | SBO3598 | SSO3808 |
| | srlA | PTS system, glucitol/sorbitol-specific II, C component | SDY2898 | SF2725 | S2916 | SBO2816 | SSO2846 |
| | srlE | PTS system, glucitol/sorbitol-specific II, B component | SDY2899 | SF2726 | S2917 | SBO2815 | SSO2847 |
| | srlB | PTS system, glucitol/sorbitol-specific enzyme II, A component | SDY2901 | SF2727 | S2918 | SBO2814 | SSO2848 |
| | srlD | Glucitol (sorbitol)-6-phosphate dehydrogenase | SDY2902 | SF2728 | S2919 | SBO2813 | SSO2849 |
| | xylA | | – | SF3609 | S4160 | SBO3573 | SSO3820 |
| | xylB | Xylulokinase | – | SF3608 | S4161 | SBO3572 | SSO3821 |
| | xylF | | SDY4336 | SF3610 | S4159 | SBO3574 | – |
| | xylG | | – | SF3611 | S4158 | SBO3575 | – |
| | xylH | | – | SF3612 | S4157 | SBO3576 | – |
a. Pseudogenes.
IS-elements identified in Shigella genomes and E. coli K12 MG1655 chromosome
| IS element | Length (bp) | No. of ORFs | Intact: MG1655 | Intact: Sd197 | Intact: Sf301 | Intact: 2457T | Intact: Sb227 | Intact: Ss046 | Intact: pCP301 | Intact: pSD1_197 | Intact: pSB4_227 | Intact: pSS_046 | Partial: MG1655 | Partial: Sd197 | Partial: Sf301 | Partial: 2457T | Partial: Sb227 | Partial: Ss046 | Partial: pCP301 | Partial: pSD1_197 | Partial: pSB4_227 | Partial: pSS_046 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| IS1 | 768 | 2 | 7 | 151 | 108 | 105 | 160 | 167 | 2 | 3 | 6 | 3 | 0 | 10 | 9 | 3 | 14 | 8 | 1 | 0 | 0 | 1 |
| iso-IS1(IS1N) | 803 (766) | 2 | 0 | 273 | 0 | 1 | 1 | 1 | 0 | 8 | 0 | 0 | 0 | 27 | 1 | 0 | 0 | 0 | 5 | 5 | 4 | 5 |
| IS2 | 1331 | 2 | 6 | 25 | 30 | 29 | 33 | 27 | 1 | 2 | 3 | 2 | 1 | 7 | 5 | 4 | 10 | 16 | 2 | 4 | 4 | 4 |
| IS3 | 1258 | 2 | 5 | 0 | 5 | 5 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 3 | 1 | 0 | 1 | 7 | 6 | 4 | 6 |
| IS4 | 1428 | 1 | 1 | 10 | 18 | 19 | 16 | 28 | 1 | 1 | 3 | 1 | 0 | 2 | 3 | 3 | 10 | 5 | 1 | 1 | 0 | 1 |
| IS5 | 1198 | 1 | 11 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| iso-IS10R | 1329 | 1 | 0 | 0 | 13 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| IS21 | 2131 | 2 | 0 | 0 | 0 | 0 | 0 | 17 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 4 | 3 | 3 | 1 | 5 |
| IS30 | 1221 | 1 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| IS91 | 1830 | 1 | 0 | 0 | 3 | 5 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 2 | 1 | 0 | 2 | 6 | 4 | 2 | 6 |
| IS100 | 1963 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 7 | 3 | 3 | 5 |
| IS150 | 1443 | 3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5 | 0 | 0 | 0 | 2 | 0 | 1 | 1 |
| IS186 | 1372 | 1 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| IS600 | 1264 | 2 | 0 | 54 | 35 | 35 | 20 | 51 | 3 | 2 | 1 | 6 | 1 | 28 | 17 | 12 | 17 | 23 | 10 | 8 | 4 | 4 |
| IS629 | 1310 | 2 | 0 | 0 | 10 | 12 | 41 | 3 | 8 | 4 | 5 | 3 | 0 | 2 | 11 | 2 | 0 | 1 | 3 | 5 | 4 | 6 |
| IS630 | 1164 | 1 | 0 | 0 | 0 | 0 | 0 | 16 | 1 | 0 | 1 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 2 | 2 | 4 |
| IS911 | 1250 | 2 | 0 | 12 | 16 | 16 | 26 | 7 | 1 | 0 | 2 | 0 | 4 | 9 | 0 | 4 | 22 | 0 | 0 | 1 | 4 | 1 |
| IS1294 | 1689 | 1 | 0 | 0 | 0 | 0 | 2 | 0 | 1 | 0 | 1 | 2 | 0 | 2 | 3 | 3 | 0 | 0 | 7 | 2 | 7 | 6 |
| IS | 923 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 1 | 5 |
| IS | 1374 | 1 | 0 | 4 | 6 | 5 | 8 | 9 | 2 | 1 | 1 | 2 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | 0 | 1 | 1 |
| IS | 1302 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 2 | 0 | 0 | 1 | 0 | 0 | 0 |
| IS | 2754 | 3 | 0 | 0 | 3 | 6 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 3 | 7 | 5 | 3 | 0 | 2 | 5 | 4 | 4 |
| IS | 2506 | 3 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 2 | 2 | 3 | 0 | 1 | 1 | 1 |
| IS | 2506 | 3 | 0 | 0 | 0 | 0 | 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 11 | 1 | 0 | 1 | 1 | 1 |
| Total | | | 37 | 529 | 247 | 238 | 314 | 327 | 26 | 26 | 24 | 28 | 7 | 94 | 67 | 42 | 89 | 67 | 62 | 52 | 48 | 68 |
a. Only those with IS fragments ≥ 100 bp are listed.
Known and putative virulence genes in the Shigella chromosomes
| Product | Gene | Location | Function | Sd197 | Sf301 | 2457T | Sb227 | Ss046 |
|---|---|---|---|---|---|---|---|---|
| Toxins | | | | | | | | |
| Shiga toxin | stxAB | | N-glycosidase, block protein synthesis | SDY1398, 1390 | – | – | – | – |
| ShET1 | set1A, set1B | SHI-1 | Ion secretion | – | SF2973a, 2973b | ND | – | – |
| ShET2 | senB | | Ion secretion | – | – | – | – | SSO2665 |
| Protease | | | | | | | | |
| Serine protease | pic | SHI-1 | Mucinase | – | SF2973 | S3178 | – | SSO3595 |
| Serine protease | sigA | SHI-1 | Ion secretion | – | SF2968 | S4824 | SBO0233, 4150 | SSO3223 |
| Others | | | | | | | | |
| Aerobactin | iutA, iucABCD | SHI-2/SHI-3 | Iron acquisition | – | SF3719, 3715–3718 | S4052, 4053–4056 | SBO4314, 4337–4340 | SSO3605, 3601–3604 |
| Siderophore receptor | iroN, iroBCDE | | Iron acquisition | SDY1022, 1023–1026 | – | – | – | – |
| ABC transporter | sitABCD | | Iron acquisition | SDY1454–1457 | SF1362–1365 | S1964–1967 | SBO1691–1694 | SSO1750–1753 |
| Hemin receptor | shuA, shuS, shuTWXYUV | | Iron acquisition | SDY3547–3555 | – | – | – | – |
| ABC transporter | | | Iron acquisition | SDY1240–1242 | SF1192–1194 | S1278–1280 | – | – |
| Invasion plasmid antigen | ipaH | | Unknown | SDY0834, 1062, 2001, 2003, 2753 | SF0722, 1383, 1880, 2022 | S0761 | SBO0653, 0953, 1026, 1256, 1619, 2084 | SSO0751, 1272, 1317, 2179, 2646 |
| Putative adhesin | yadA-like | OI#144-like island | Unknown | – | SF3641 | S4127 | SBO3605 | SSO3803 |
| Putative chaperone | clp-like | OI#7-like island | Unknown | – | – | – | – | SSO0242 |
| Inner membrane protein | IcmF-like | OI#7-like island | Unknown | – | – | – | – | SSO0236 |
| Exoprotein | RTX-like | OI#28-like island | Unknown | SDY0420–0424 | – | – | – | – |
| Transport system | | OI#28-like island | Unknown | SDY0416, 0417 | – | – | – | – |
| T2SS | gspC-M | | Unknown | SDY3092–3102 | – | – | SBO3011 | – |
a. Sequences exist in the genome but are not recognized as coding genes by the current annotation.
b. Pseudogenes.
Known and putative virulence genes in the Shigella virulence plasmids
| Product | Gene | Function | pSD1_197 | pCP301 | pSB4_227 | pSS_046 |
|---|---|---|---|---|---|---|
| TTSS | | Invasion and internalization | SDYP174–193 | CP0136–0156 | – | SSOP098–117 |
| TTSS secreted protein | ipaA | Actin depolymerization | SDYP163 | CP0125 | – | SSOP087 |
| | ipaB | Inducing apoptosis | SDYP166 | CP0128 | – | SSOP090 |
| | ipaC | Actin polymerization, activation of Cdc42 and Rac | SDYP165 | CP0127 | – | SSOP089 |
| | ipaD | Forming a complex with IpaB, controlling the flux of proteins through the type III secretion system | SDYP164 | CP0126 | – | SSOP088 |
| | ipgD | Inositol 4-phosphatase, membrane blebbing | SDYP171 | CP0133 | – | SSOP095 |
| | icsB | Camouflaging IcsA from autophagic host defense system | SDYP170 | CP0132 | – | SSOP094 |
| | virA | Microtubule destabilization, membrane ruffling | SDYP211 | CP0181 | – | SSOP142 |
| | ospF/mkaD | Unknown | SDYP013 | CP0010 | SBOP017 | SSOP009 |
| | ipaH7.8 | Facilitating the escape of the bacteria from phagocytic vacuole of macrophages | SDYP038 | CP0078 | SBOP067 | SSOP058 |
| | ipaH9.8 | Transported to the nucleus, function unknown | SDYP099 | CP0226 | SBOP113 | SSOP167 |
| | ipaH4.5 | Unknown | SDYP037 | CP0079 | SBOP066 | SSOP059 |
| | ipgB | Unknown | SDYP168 | CP0130 | – | SSOP092 |
| | ospG | Unknown | SDYP101 | CP0227 | – | SSOP170 |
| Toxins | | | | | | |
| ShET2 | senA | Ion secretion | SDYP056 | CP0093 | SBOP076 | SSOP050 |
| | senB | Homologues of ShET2 | SDYP010 | CP0009 | SBOP016 | SSOP008 |
| Enzymes | icsP/sopA | Cleavage of IcsA | SDYP224 | CP0271 | SBOP149 | SSOP241 |
| | sepA | Tissue invasion | – | CP0070 | – | – |
| | msbB | Fatty acyl modification of O-antigen | SDYP110 | CP0238 | SBOP119 | SSOP182 |
| | apy | ATP-diphosphohydrolase | SDYP004 | CP0004 | SBOP006 | SSOP004 |
| | phoN-Sf | Non-specific acid phosphatase | SDYP067 | CP0190 | – | – |
| | rfbU | O-antigen biosynthesis | SDYP108 | CP0236 | – | SSOP180 |
| | ushA | UDP-sugar hydrolase (5′-nucleotidase) | SDYP064 | CP0185 | – | SSOP147 |
| Regulators | virF | Activating transcription of virB and icsA | – | CP0046 | SBOP052 | SSOP041 |
| | virK | Post-transcriptional regulation of icsA expression | SDYP109 | CP0237 | SBOP118 | SSOP181 |
| | virB | Activating ipa, spa, and mxi operons | SDYP161 | CP0123 | – | SSOP085 |
| Others | icsA/virG | Nucleation of actin filaments | SDYP214 | CP0182 | – | SSOP143 |
Figure 3. Graphic representation of the different T2SS loci in E. coli K-12 MG1655 and Shigella genomes (to scale). (a) The yhe locus at 74.5 min of the MG1655 chromosome and the corresponding regions in the Shigella genomes. (b) The pheV tRNA locus at 67 min of MG1655 and the corresponding loci in Shigella genomes where the gsp genes are located. A strain name followed by a minus sign (−) means the reverse complement strands of the genome sequences were used for the diagram.