Mikihiko Kawai, Yoshikazu Furuta, Koji Yahara, Takeshi Tsuru, Kenshiro Oshima, Naofumi Handa, Noriko Takahashi, Masaru Yoshida, Takeshi Azuma, Masahira Hattori, Ikuo Uchiyama, Ichizo Kobayashi.
Abstract
BACKGROUND: The genome of Helicobacter pylori, an oncogenic bacterium in the human stomach, rapidly evolves and shows wide geographical divergence. The high incidence of stomach cancer in East Asia might be related to bacterial genotype. We used newly developed comparative methods to follow the evolution of East Asian H. pylori genomes using 20 complete genome sequences from Japanese, Korean, Amerind, European, and West African strains.
Year: 2011 PMID: 21575176 PMCID: PMC3120642 DOI: 10.1186/1471-2180-11-104
Source DB: PubMed Journal: BMC Microbiol ISSN: 1471-2180 Impact factor: 3.605
Comparison of hspEAsia to other genomes
| Strain | Disease | Population / subpopulation | Length (bp)(a,b) | % GC content | CDS | Core genes | cagA EPIYA segments(c) | vacA type | homA/homB | Reference |
|---|---|---|---|---|---|---|---|---|---|---|
| F57 | Gastric cancer | hpEastAsia hspEAsia | 1609006 | 38.7 | 1521 | 1402 | ABD | s1a-m1-i1 | -/B | This work |
| F32 | Gastric cancer | hpEastAsia hspEAsia | 1578824, 2637 | 38.9 | 1492 | 1385 | ABD | s1a-m1-i1 | -/E(e) | This work |
| F30 | Duodenal ulcer | hpEastAsia hspEAsia | 1570564, 9129 | 38.8 | 1485 | 1385 | ABD | s1a-m1-i1 | -/B | This work |
| F16 | Gastritis | hpEastAsia hspEAsia | 1575399 | 38.9 | 1500 | 1402 | ABD | s1a-m1-i1 | -/B | This work |
| 51 | Duodenal ulcer | hpEastAsia hspEAsia | 1589954 | 38.8 | 1509 | 1424 | ABD | s1a-m1-i1 | -/B | |
| 52 | ? | hpEastAsia hspEAsia | 1568826 | 38.9 | 1496 | 1383 | (A/B)(D/B)D | (s1a)-m1-i1 (f) | -/B | |
| Shi470 | Gastritis | hpEastAsia hspAmerind | 1608548 | 38.9 | 1517 | 1401 | AB(D/C),CC(g) | s1b-m1-i1 | -/B | [ |
| v225d | Gastritis | hpEastAsia hspAmerind | 1588278, 7326 | 39.0 | 1506 | 1377 | AB(C/D)(C/D), (tr) (g,h) | s1a-m1-i1 | -/B | [ |
| Cuz20 | ? | hpEastAsia hspAmerind | 1635449 | 38.9 | 1527 | 1364 | AB(D/C)×5(tr) (h) | s1a-m2-i2 | -/A | |
| Sat464 | ? | hpEastAsia hspAmerind | 1629557, 8712 | 38.9 | 1465 | 1376 | AB(D/C) | s1b-m1-i1 | -/B | |
| PeCan4 | Gastric cancer | hpEastAsia hspAmerind? | 1560342, 7228 | 39.1 | 1525 | 1388 | A(B/A)BC | s1a-m1-i1 | -/B | |
| 26695 | Gastritis | hpEurope | 1667867 | 38.9 | 1575 | 1411 | ABC | s1a-m1-i1 | A/- | [ |
| HPAG1 | Gastritis | hpEurope | 1596366, 9370 | 39.1 | 1492 | 1394 | A(B/A)C | s1b-m1-i1 | B/- | [ |
| G27 | ? | hpEurope | 1652982, 10031 | 38.9 | 1560 | 1400 | ABCC | s1b-m1-i1 | B/- | [ |
| P12 | Duodenal ulcer | hpEurope | 1673813, 10225 | 38.8 | 1593 | 1396 | ABCC | s1a-m1-i1 | A/- | [ |
| B38 | MALT lymphoma | hpEurope | 1576758 | 39.2 | 1493 | 1388 | - | s2-m1-i2 | A/- | [ |
| B8(i) | Gastric ulcer(i) | hpEurope | 1673997, 6032 | 38.8 | 1578 | 1385 | ABC | s1a-m2-i2 (j) | A/A | [ |
| SJM180 | Gastritis | hpEurope? | 1658051 | 38.9 | 1515 | 1381 | ABC | s1b-m1-i1 | B/B | |
| J99 | Duodenal ulcer | hpAfrica1 hspWAfrica | 1643831 | 39.2 | 1502 | 1383 | (A/B)C | s1b-m1-i1 | A/B | [ |
| 908(k) | Duodenal ulcer | hpAfrica1 hspWAfrica | 1549666 | 39.3 | 1503 | 1393 | ABC | -s1b-(-)-i1 (j,k,l) | -/-(k) | [ |
a) The first number is the length of the chromosome and the second number (when present) is that of the plasmid.
b) Accession numbers are as follows: F57 [DDBJ:AP011945.1 http://getentry.ddbj.nig.ac.jp/cgi-bin/get_entry2.pl?database=ver_ddbj&query=AP011945.1], F32 [DDBJ:AP011943.1 http://getentry.ddbj.nig.ac.jp/cgi-bin/get_entry2.pl?database=ver_ddbj&query=AP011943.1, DDBJ:AP011944.1 http://getentry.ddbj.nig.ac.jp/cgi-bin/get_entry2.pl?database=ver_ddbj&query=AP011944.1], F30 [DDBJ:AP011941.1 http://getentry.ddbj.nig.ac.jp/cgi-bin/get_entry2.pl?database=ver_ddbj&query=AP011941.1, DDBJ: AP011942.1 http://getentry.ddbj.nig.ac.jp/cgi-bin/get_entry2.pl?database=ver_ddbj&query=AP011942.1], F16 [DDBJ:AP011940.1 http://getentry.ddbj.nig.ac.jp/cgi-bin/get_entry2.pl?database=ver_ddbj&query=AP011940.1], 51 [GenBank:CP000012.1], 52 [GenBank:CP001680.1], Shi470 [GenBank:NC_010698.2], v225d [GenBank:CP001582.1, GenBank:CP001583.1], Cuz20 [GenBank:CP002076.1], Sat464 [GenBank:CP002071.1, GenBank:CP002072.1], PeCan4 [GenBank:NC_014555.1, GenBank:NC_014556.1], 26695 [GenBank:NC_000915.1], HPAG1 [GenBank:NC_008086.1, GenBank:NC_008087.1], G27 [GenBank:NC_011333.1, GenBank:NC_011334.1], P12 [GenBank:NC_011498.1, GenBank:NC_011499.1], B38 [GenBank:NC_012973.1], B8 [GenBank:NC_014256.1, GenBank:NC_014257.1], SJM180 [GenBank:NC_014560.1], J99 [GenBank:NC_000921.1], 908 [GenBank:CP002184.1]. Draft sequence of the East Asian strain 98-10 [140]. 98-10, [GenBank:NZ_ABSX01000001.1] - [GenBank:NZ_ABSX01000051.1].
c) Letters in parentheses denote hybrid EPIYA segments. For example, (A/B) is a hybrid of the EPIYA-A and EPIYA-B segments [21,22,141].
d) Reference [142,143].
e) Designated as homE as it was very different from homA or homB.
f) The "s" region is located outside the ORF.
g) A second cagA gene between cagM and cagP.
h) (tr), truncation.
i) Mongolian gerbil-adapted, originally from gastric ulcer.
j) vacA gene is split.
k) According to reference [139], the sequence might not represent a complete genome, although it is deposited as a complete circular genome in GenBank.
l) "m" region was not available because of a deletion in the center of the ORF.
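Footnote a) describes the Length column as a chromosome length followed, when present, by a plasmid length. A minimal sketch of how such a cell might be parsed is given below; the function name `parse_length` is hypothetical and the two example cells are copied from the F57 and F32 rows.

```python
from typing import Optional, Tuple


def parse_length(cell: str) -> Tuple[int, Optional[int]]:
    """Split a 'Length' cell into (chromosome_bp, plasmid_bp or None).

    Assumes the layout described in footnote a): the first number is the
    chromosome length and an optional second number is the plasmid length.
    """
    parts = [p.strip() for p in cell.split(",")]
    chromosome = int(parts[0])
    plasmid = int(parts[1]) if len(parts) > 1 else None
    return chromosome, plasmid


# Example cells taken from the table (strains F57 and F32).
f57 = parse_length("1609006")
f32 = parse_length("1578824, 2637")
print(f57, f32)
```

Returning `None` for a missing plasmid keeps strains without a plasmid distinguishable from a plasmid of length zero.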
Figure 1. Phylogenetic tree of 20 H. pylori strains. Well-defined core OGs were used with the neighbor-joining method (see Methods). Numbers indicate bootstrap values. Scale bar indicates substitutions per nucleic acid residue (change/nucleotide site). The assignment of population/subpopulation was based on a phylogenetic tree constructed from the concatenated alignment of fragments of seven genes used in the H. pylori MLST database (atpA, efp, mutY, ppa, trpC, ureI and yphC) [18]. Classification of population/subpopulation was as described [10,19].
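The concatenation of the seven MLST gene fragments mentioned in the caption can be illustrated with a minimal sketch. The gene list comes from the caption; the function name `concatenate_alignments`, the strain labels and the toy sequences are hypothetical, and the sketch assumes every gene fragment is already aligned across the same set of strains. It is not the pipeline used in the study.

```python
from typing import Dict

# Gene order taken from the figure caption.
MLST_GENES = ["atpA", "efp", "mutY", "ppa", "trpC", "ureI", "yphC"]


def concatenate_alignments(per_gene: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Join per-gene aligned fragments into one sequence per strain.

    per_gene maps gene name -> {strain name -> aligned fragment}.
    Fragments are joined in the fixed order given by MLST_GENES.
    """
    strains = sorted(per_gene[MLST_GENES[0]])
    concatenated = {}
    for strain in strains:
        concatenated[strain] = "".join(per_gene[gene][strain] for gene in MLST_GENES)
    return concatenated


# Toy input: identical 4-bp fragments for two hypothetical strains.
toy = {gene: {"strainA": "ACGT", "strainB": "ACGA"} for gene in MLST_GENES}
result = concatenate_alignments(toy)
print(len(result["strainA"]))
```

A fixed gene order matters because a tree-building step compares columns position by position across strains.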
Characteristic gene contents of East Asian (hspEAsia) H. pylori
| Population type | Strain | Locus of outer membrane proteins | | | | | Periplasmic endonuclease | Molybdenum-related function |
|---|---|---|---|---|---|---|---|---|
| | | oipA/oipA-2 | hopM/hopN locus | babA/babB/babC (a) | sabA/sabB (b) | vacA-2 (c) | nucG (d) | |
| hspEAsia | F57 | A/A(e) | +/- | A/B/- | A/- | x | x | - |
| | F32 | A/x | +/- | A/B(tr)/- | A/- | x | x | - |
| | F30 | A/A | +/- | A/B/- | A/- | x | x | - |
| | F16 | A/A | +/- | A/B/- | A/- | x | x | - |
| | 51 | A/A | +/- | A/B/- | A/- | + | x | - |
| | 52 | A/A | +/- | A/B/- | A/A | x | x | - |
| hspAmerind | Shi470 | A/x | +/- | A/B/- | A/- | + | + | + |
| | v225d | A/x | +/- | A/B/- | A/- | + | + | + |
| | Cuz20 | A/x | +/- | A/B/- | A/- | + | + | + |
| | Sat464 | A/x | +/- | A/B/- | A/- | + | + | + |
| | PeCan4 | A/A | +/- | A/B/- | A/B | + | + | + |
| hpEurope | 26695 | A/- | +/+ | B/A/C | A/A | + | x | + |
| | HPAG1 | A/- | x/+ | A/C/B | A/B | + | + | + |
| | G27 | A/- | +/x | C/B/A | A/B | + | + | + |
| | P12 | A/- | +/+ | A/B/B(tr) | B/B | + | + | +?(f) |
| | B38 | A/- | +/+ | A/A(tr)/- | A/- | + | + | + |
| | B8 | A/- | +/+ | A/A/- | A/Q(g) | + | + | + |
| | SJM180 | A/- | +/+ | B/C/A | A/B | + | + | + |
| hspWAfrica | J99 | A/- | +/+ | A/B(tr)/- | A/B | + | + | + |
| | 908(h) | A/- | -/-(h) | A(tr)/B(tr)/-(h) | -/-(h) | + | + | + |
+, present; x, disrupted (nucleotide sequence partly remained); -, absent. See Additional file 2 (= Table S1) for a detailed list.
a) babA locus corresponds to HP0896; babB locus, HP1243; babC locus, HP0317.
b) sabA locus corresponds to jhp0662; sabB locus, jhp0659.
c) Paralog of vacA (HP0289), but not vacA itself (HP0887). Another paralog vacA-4 (HP0922) is in Table 6.
d) HP1382.
e) "/" separates different loci.
f) One of 12 molybdenum-related genes was truncated.
g) hopQ gene. Two hopQ copies exist, one at sabB locus and the other, as in other strains, at the hopQ locus.
h) According to the description in reference [139], the sequence might not represent a complete genome, although it is deposited as a complete circular genome in GenBank. Hence, care should be taken in interpreting the results.
Relevant information about each family from the draft sequence of the Japanese strain 98-10 (NZ_ABSX01000001.1 - NZ_ABSX01000051.1) [143] is as follows: oipA/oipA-2, at least one copy, although the exact copy number cannot be determined because a short contig encoded only the oipA gene without the flanking region; hopM locus, +? (partial sequence at an end of the contig); hopN locus, not applicable because it was at an end of contigs (a hopN fragment is deposited, but the sequence was partial at both ends of the contig, preventing locus assignment); babA/babB/babC, A?/?/? (babA at the babA locus but partial at an end of the contig; babB and babC loci, not applicable because they were at ends of contigs; the babB sequence was partial at both ends of the contig, preventing locus assignment); sabA/sabB, +/-; vacA-2, x; nucG, split as in the other hspEAsia strains; molybdenum-related function, x.
Genes diverged between East Asian and European H. pylori
| Description | Representative of the gene family(a,b,c) | Distance d_a (d) | Distance d_b (e) | Zone | Reference | |
|---|---|---|---|---|---|---|
| HpaA paralog | HP0492(f) | 0.1608 | 0.0253 | 3 | [ | |
| Cag pathogenicity island protein | HP0547(f) | 0.1009 | 0.0285 | 3 | [ | |
| Bacterial SH3 domain | HP1250 | 0.0901 | 0.0615 | 3 | ||
| α-(1,3)-fucosyltransferase | HP0379, HP0651 | 0.0553 | 0.0436 | 3 | [ | |
| Sugar efflux transporter | HP1185 | 0.0441 | 0.0095 | 2 | ||
| Vacuolating cytotoxin A | HP0887 | 0.0420 | 0.0137 | 2 | [ | |
| tRNA delta(2)-isopentenylpyrophosphate transferase | mHP1415 | 0.0373 | 0.0241 | 3 | [ | |
| Hypothetical protein | HPAG1_0619 | 0.0366 | 0.0540 | 3 | ||
| Cysteine-rich protein, SLR (Sel1-like repeat) protein | HP0160 | 0.0363 | 0.0323 | 3 | [ | |
| Preprotein translocase subunit YajC | HP1551 | 0.0353 | 0.0268 | 3 | [ | |
| β-1,3-N-acetyl-glucosaminyl transferase | HP1105 | 0.0338 | 0.0228 | 2 | ||
| Ribonuclease HII | mHP1323(f) | 0.0337 | 0.0398 | 3 | [ | |
| Flagellar hook length control | HP0906 | 0.0328 | 0.0382 | 3 | [ | |
| Putative outer membrane protein | HP0373 | 0.0325 | 0.1207 | 3 | ||
| Outer membrane protein | HP0477, HP0923 | 0.0313 | 0.0357 | 3 | [ | |
| NAD(P)H-flavin oxidoreductase | HP0642 | 0.0306 | 0.0212 | 2 | [ | |
| Preprotein translocase subunit SecG | mHP1255 | 0.0300 | 0.0226 | 2 | [ | |
| Hypothetical protein | HP0384 | 0.0296 | 0.0302 | 3 | ||
| Tumor necrosis factor alpha-inducing protein | HP0596 | 0.0293 | 0.0145 | 2 | [ | |
| Membrane-bound, nickel containing, hydrogen uptake hydrogenase | HP0635 | 0.0288 | 0.0252 | 3 | [ | |
| tRNA(Ile) lysidine synthase | HP0728 | 0.0286 | 0.0193 | 2 | [ | |
| Periplasmic competence protein | HP1527 | 0.0285 | 0.0194 | 2 | [ | |
| Peptide deformylase | HP0793 | 0.0285 | 0.0065 | 2 | [ | |
| Putative vacuolating cytotoxin-like protein | HP0922 | 0.0284 | 0.0222 | 2 | ||
| Hydrogenase expression/formation protein | HP0898 | 0.0284 | 0.0169 | 2 | [ | |
| Helicase | HP1553 | 0.0283 | 0.0308 | 3 | [ | |
| Type I restriction enzyme, R protein | mHP1402 | 0.0282 | 0.0245 | 3 | ||
| Hypothetical protein | mHP0174 | 0.0268 | 0.0203 | 2 | ||
| Outer membrane protein OipA | HP0638 | 0.0267 | 0.0097 | 2 | [ | |
| Ribosomal protein L11 methyltransferase | HP1068 | 0.0261 | 0.0118 | 2 | [ | |
| Maf family (motility accessory family of flagellin-associated proteins) homolog | HP0465 | 0.0259 | 0.0214 | 2 | [ | |
| Hypothetical protein | HP0097 | 0.0257 | 0.0207 | 2 | ||
| Hypothetical protein | HP1143 | 0.0254 | 0.0146 | 2 | ||
| Membrane protein required for colicin V production and secretion | mHP0181 | 0.0252 | 0.0169 | 2 | [ | |
| 6-phosphogluconolactonase | HP1102 | 0.0250 | 0.0130 | 2 | ||
| Outer membrane protein Horl | HP1113 | 0.0248 | 0.0348 | 3 | ||
| cbb3-type cytochrome c oxidase subunit Q | mHP0146 | 0.0248 | 0.0023 | 1 | ||
| Hypothetical protein | HP0150 | 0.0248 | 0.0154 | 2 | ||
| Chemotaxis effector | HP1067 | 0.0248 | 0.0014 | 1 | [ | |
| Flagellar chaperone | HP0754 | 0.0245 | 0.0138 | 2 | [ | |
| Cell division protein | HP0978 | 0.0244 | 0.0071 | 2 | [ | |
| Ribonuclease H | HP0661 | 0.0243 | 0.0217 | 2 | [ | |
| Branched-chain amino acid aminotransferase | HP1468 | 0.0239 | 0.0136 | 2 | ||
| Cation transport subunit for cbb3-type oxidase | HP1163 | 0.0237 | 0.0250 | 3 | [ | |
| NADH-ubiquinone oxidoreductase chain F | HP1265 | 0.0236 | 0.0202 | 2 | ||
| Putative thiol:disulfide interchange protein | HP0861 | 0.0234 | 0.0185 | 2 | ||
| Hypothetical protein | HP0806 | 0.0233 | 0.0233 | 3 |
(a) m, different assignment of start codon from the RefSeq entry in the GenBank database
(b) All paralogous genes in each orthologous group are counted.
(c) Assignments to gene families are in Additional file 5 (= Table S4).
(d) Distance between the last common ancestor of hspEAsia and the last common ancestor of hpEurope.
(e) Average of distances between the last common ancestor of hspEAsia and each hspEAsia strain.
(f) A homolog in the draft genome sequence of another East Asian strain, 98-10, has been reported to have diverged from four Western strains [143]. The other genes listed as diverged in 98-10 [143], HP0806, HP0061, HP1524, HP0519 and HP1322, did not meet the criteria of this study. HP0806 was below the d threshold; for the others, the hspEAsia genes did not form a subtree separate from hpEurope.
Figure 2. East Asia-specific sequence at the C-terminus of the putative product of hopMN. (A) Four types of hopMN genes. Type c3 of m1-c3 and m2-c3 is composed of parts of c1 and c2. The c1-m1 and c2-m1 types correspond to hopM and hopN, respectively. (B) Phylogenetic network of the whole region of the proteins. Types m1-c3 and m2-c3 cannot be clearly distinguished from m1-c1 and m2-c1 in this figure. (C)-(F) Phylogenetic networks for the four domains. Scale bar indicates substitutions per amino acid residue (change/amino-acid site). Positions are for HP0227 of strain 26695.
Figure 3. Fragmentation of horA. Genes homologous to horA in J99 (jhp0073) are classified by the number of ORFs. Numbers indicate coordinates on the genome sequence. Nucleotide similarity between each pair of strains is indicated by gray parallelograms. The state in strain 98-10 is: two ORFs.
Decay of molybdenum-related genes
| Type | hspEAsia | | | | | | hspAmerind | hpEurope | | hspWAfrica |
|---|---|---|---|---|---|---|---|---|---|---|
| Strain | F57 | F32 | F30 | F16 | 51 | 52 | (a) | (b) | P12 | (c) |
| Molybdenum (MoO₄²⁻) transport | | | | | | | | | | |
| | x | x | x | + | + | x | + | + | + | + |
| | x | + | + | + | x | x | + | + | + | + |
| | x | x | x | x | x | + | + | + | + | + |
| Molybdenum cofactor synthesis | | | | | | | | | | |
| | x | x | x | x | + | x | + | + | + | + |
| | x | + | + | + | + | + | + | + | + | + |
| | x | + | + | + | + | + | + | + | + | + |
| | + | x | + | + | + | + | + | + | x | + |
| | + | + | + | + | + | + | + | + | + | + |
| | x | + | x | x | x | + | + | + | + | + |
| | x | x | x | x | x | x | + | + | + | + |
| | + | + | + | + | + | x | + | + | + | + |
| Molybdenum cofactor-containing enzyme | | | | | | | | | | |
| | x | x | x | x | x | x | + | + | + | + |
+, present; x, disrupted (nucleotide sequence remained).
a) Strains Shi470, v225d, Cuz20, Sat464 and PeCan4.
b) Strains 26695, HPAG1, G27, B38, B8 and SJM180.
c) Strains J99 and 908
The states in strain 98-10 are: x for modA, modB, mobA, moaA, moeB and bisC; + for modC, moaD, moaE, mogA, moaC and moeA.
Figure 4. Decay of Mo-related genes in the hspEAsia strains. Mo-related genes are indicated by color. Homologs are indicated by the same color. See Additional file 3 (= Table S2) for nucleotide sequences.
Figure 5. Variation in genes connecting acetyl-CoA and acetate. (A) Functional states of three genes in two pathways inferred for 20 strains. (B) Reconstruction of pathway evolution. (C) Genome comparison for the pta-ackA region. (D) Genome comparison for the acoE region. Homologs are indicated by the same color in (C) and (D). The states in strain 98-10 are: pta+ ackA+/acoE+, as in F57.
Genomic islands in the four Japanese H. pylori strains
| Strain | GI number | Type | Length | Start-end | Flanking repeat (bp) | Secretion system | Left gene (annotation) | Right gene (annotation) |
|---|---|---|---|---|---|---|---|---|
| F16 | GI_HP_F16_1 | prophage-like | 12245 | 471964 - 484208 (HPF16_0465 - HPF16_0478) | N/D(a) | N/D(a) | HPF16_0464 (IS | HPF16_0479 ( |
| | GI_HP_F16_2 | cagPAI | 36761 | 871413 - 834651 (HPF16_0834 - HPF16_0810) | 22(b) | Type IV | HPF16_0835 (hypothetical protein) | HPF16_0809 (glutamate racemase) |
| F30 | GI_HP_F30_1 (left) | type 1b TnPZ partial | 7246 | 1280406 - 1287651 (HPF30_1205 - HPF30_1211) | N/D(a) | tfs3b partial | HPF30_1204 (outer membrane protein | HPF30_1212 ( |
| | GI_HP_F30_1 (right) | type 1b TnPZ partial | 1655 | 1237267 - 1238921 (HPF30_1166 - HPF30_1167) | N/D(a) | N/D(a) | HPF30_1165 (hypothetical protein) | HPF30_1168 (5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase) |
| | GI_HP_F30_2 | cagPAI | 37153 | 867993 - 830839 (HPF30_0803 - HPF30_0778) | 22(c) | Type IV | HPF30_0804 (hypothetical protein) | HPF30_0777 (glutamate racemase) |
| F32 | GI_HP_F32_1 | type 2 TnPZ partial | 24283 | 1058236 - 1082518 (HPF32_0988 - HPF32_1014) | N/D(a) | tfs3 partial | HPF32_0987 (hypothetical protein) | HPF32_1015 (hypothetical protein) |
| | GI_HP_F32_2 | cagPAI | 36609 | 534488 - 571096 (HPF32_0500 - HPF32_0524) | 44(d) | Type IV | HPF32_0499 (hypothetical protein) | HPF32_0525 (glutamate racemase) |
| F57 | GI_HP_F57_1 (left) | type 1b TnPZ partial | 7246 | 103791 - 111036 (HPF57_0102 - HPF57_0109) | N/D(a) | tfs3b partial | HPF57_0101 (RNA polymerase sigma factor RpoD) | HPF57_0110 (hypothetical protein) |
| | GI_HP_F57_1 (right) | type 1b TnPZ partial | 1625 | 152699 - 154323 (HPF57_0147 - HPF57_0148) | N/D(a) | N/D(a) | HPF57_0146 (5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase) | HPF57_0149 (hypothetical protein) |
| | GI_HP_F57_2 | type 1 TnPZ | 38991 | 284353 - 323343 (HPF57_0279 - HPF57_0311) | 8(e) | tfs3b partial | HPF57_0278 ( | HPF57_0312 (type II DNA modification enzyme) |
| | GI_HP_F57_3 | cagPAI | 36797 | 562215 - 599011 (HPF57_0550 - HPF57_0575) | 22(c) | Type IV | HPF57_0549 (hypothetical protein) | HPF57_0576 (glutamate racemase) |
a) N/D, not detected
b) TTATAATTTGAGCCATTATTTA
c) TTTCAATTTGAGCCATTCTTTA
d) TTATAATTTGAGCCATTCTTTAGCTTGTTTTTCTAGCCAAACCA
e) ACATTCTT
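The repeat lengths reported in the "Flanking repeat (bp)" column can be checked against the sequences given in footnotes b) to e). The sketch below is only a consistency check; the dictionary layout and the function name `repeat_length_matches` are hypothetical, while the sequences and reported lengths are copied from the table and its footnotes.

```python
# Footnote label -> (repeat sequence from the footnote, length reported in the table).
flanking_repeats = {
    "b": ("TTATAATTTGAGCCATTATTTA", 22),
    "c": ("TTTCAATTTGAGCCATTCTTTA", 22),
    "d": ("TTATAATTTGAGCCATTCTTTAGCTTGTTTTTCTAGCCAAACCA", 44),
    "e": ("ACATTCTT", 8),
}


def repeat_length_matches(sequence: str, reported_bp: int) -> bool:
    """Return True when the sequence length equals the reported length in bp."""
    return len(sequence) == reported_bp


# Evaluate every footnote entry and collect the result per label.
checks = {
    label: repeat_length_matches(seq, bp)
    for label, (seq, bp) in flanking_repeats.items()
}
print(checks)
```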
Figure 6. GIs inserted into restriction-modification systems. (A) Insertion of a prophage-like GI (GI_HP_F16_1) into a restriction-modification system. (B) Insertion of a GI into a modification gene. (See Table 4 for detail).
Figure 7. GIs detected in Japanese H. pylori. (A) GIs. (B) Decay of type 2 TnPZ in the F32 strain inferred from comparison to the Shi170 strain. The sequence of type 2 TnPZ of Shi170 is deposited under the accession number [GenBank:EU807988] [48] (Table 4).
Figure 8. Genes diverged between East Asian and European strains. (A) Diagram of phylogenetic tree-based analysis. Black dots, last common ancestors of Eastern and Western strains. d_a, length of the branch separating the two; d_b, average branch length of the Eastern strains. (B) Plot of gene trees based on the two distance values. Large green dot, well-defined core tree; d_a*, d_a for the well-defined core tree; d_b*, d_b for the well-defined core tree; inset box, well-defined core tree; zone 1, d_b < 0.00550; zone 2, 0.00550 ≤ d_b ≤ 0.0231; zone 3, d_b > 0.0231; red dot, genes with positive selection for amino acid change and with d_a > 2 × d_a*, that is, d_a > 0.02324; (a), cheY; (b), fixQ; (c), sotA; (d), vacA; (e), cagA; (f), HP1250. N = 692 genes. (C) Representative trees with high divergence between hspEAsia and hpEurope strains. Lowest common ancestor (LCA) of hspEAsia (red) and hpEurope (cyan).
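The zone boundaries and the divergence cutoff quoted in the caption can be expressed as a small classifier. This is a sketch only: the names `d_a` (branch length separating the two last common ancestors) and `d_b` (average branch length within the East Asian strains) are labels assumed here, and applying the zones to the within-East-Asian distance is inferred from the tabulated distance values rather than stated outright in the caption. The three example genes and their distances are copied from the table of diverged genes.

```python
ZONE1_UPPER = 0.00550   # below this value: zone 1
ZONE2_UPPER = 0.0231    # up to and including this value: zone 2
DA_CUTOFF = 0.02324     # the "2 x core-tree value" cutoff quoted in the caption


def zone(d_b: float) -> int:
    """Classify a gene by its within-East-Asian distance (assumed to be d_b)."""
    if d_b < ZONE1_UPPER:
        return 1
    if d_b <= ZONE2_UPPER:
        return 2
    return 3


def exceeds_divergence_cutoff(d_a: float) -> bool:
    """True when the between-population distance (assumed to be d_a) passes the cutoff."""
    return d_a > DA_CUTOFF


# (d_a, d_b) pairs copied from the table rows for three representatives.
examples = {
    "HP0492": (0.1608, 0.0253),
    "HP1185": (0.0441, 0.0095),
    "mHP0146": (0.0248, 0.0023),
}
zones = {name: zone(d_b) for name, (d_a, d_b) in examples.items()}
print(zones)
```

Under these assumptions the computed zones agree with the third numeric column listed for the same three rows.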
Selected genes diverged between East Asian (hspEAsia) and European (hpEurope) H. pylori
| Function | Genes (classified by divergence within hspEAsia) | ||
|---|---|---|---|
| | Conserved(a) | Average(b) | Diverged(c) |
| Known virulence genes | |||
| Outer membrane proteins | |||
| Lipopolysaccharide synthesis (Lewis antigen mimicry) | |||
| Transport | |||
| Motility and chemotaxis | |||
| Redox | |||
| Nuclease | |||
| Protein synthesis | |||
| Antibiotic-related | |||
Full list and details in Table 6, Additional file 5 (= Table S4) and text. Genes in bold were also extracted in the comparison of 6 hspEAsia vs. 5 hpEurope (Additional file 7 (= Table S5)).
(a) d
Genes with positively selected amino-acid changes between the East Asian and the European H. pylori
| Locus tag | Gene | Description | p-value(a) | Positively selected sites (b,c) |
|---|---|---|---|---|
| HP0547 | | Cag pathogenicity island protein | < 1E-21 | V238R (0.994) |
| | | | | A482Q (0.953) |
| HP0373 | | Putative outer membrane protein | < 1E-14 | E110N (0.978) |
| | | | | K428H (0.986) |
| | | | | T437D (0.979) |
| HP0492 | | HpaA paralog | < 1E-5 | S34V (0.970) |
| | | | | A46Q (0.993) |
| | | | | R122F (0.967) |
| | | | | K127S (0.962) |
| HP1185 | | Sugar efflux transporter protein | 0.00005 | T50S (0.956) |
| | | | | A57L (0.990) |
| | | | | N134G (0.983) |
| | | | | W186Y (0.980) |
| mHP0174 | | Hypothetical protein | 0.0007 | F144W (0.952) |
| mHP1415 | | General tRNA delta(2)-isopentenylpyrophosphate transferase | 0.0002 | H174A (0.992) |
| HP0887 | | Vacuolating cytotoxin A | 0.002(d) | S793A (0.964) (d) |
| | | | | N931A (0.960) (d) |
a) Bonferroni adjusted.
b) Posterior probabilities of dN/dS > 1.
c) Positions are for H. pylori 26695. Residues were aligned at the same site by both Mafft [128] and PRANK [136].
d) Two vacA genes (in B38 and B8) were eliminated because they belonged to different subtypes of the gene.
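Footnote a) states that the p-values are Bonferroni adjusted. As a reminder of what that adjustment does, the sketch below multiplies each raw p-value by the number of tests and caps the result at 1. The raw p-values and the number of tests are hypothetical, chosen only for illustration; the number of tests used in the study is not given in this section.

```python
from typing import List


def bonferroni_adjust(p_values: List[float]) -> List[float]:
    """Bonferroni adjustment: multiply each p-value by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for three tests (not data from the study).
raw = [0.001, 0.02, 0.5]
adjusted = bonferroni_adjust(raw)
print(adjusted)
```

The adjustment is conservative: a gene is reported only if its raw p-value stays below the significance level after multiplication by the number of tests.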
Figure 9. Genes with positively selected amino acid changes between East Asian and non-East Asian strains. (A) Position of the positively selected amino-acid residues in ORF (triangles). In (i), EPIYA segments and CM sequences [138] are marked. (B) Position of positively selected amino acids in the three-dimensional structure. (i) HpaA-2 [PDB:2I9I]. (ii) E. coli MiaA [PDB:3FOZ] [61] with the residue corresponding to H174 of H. pylori MiaA. (iii) p55 fragment of VacA [PDB:2QV3] [61] (Table 7).