Maria Juliana Soto-Girón1, Oscar E Ospina1, Steven Edward Massey2.
Abstract
Helicobacter pylori is a bacterium that lives in the human stomach and is a major risk factor for gastric cancer and ulcers. H.pylori is host dependent and has been carried with human populations around the world since their departure from Africa. We wished to investigate how H.pylori has coevolved with its host during that time, focusing on strains from Japanese and European populations, given that gastric cancer incidence is high in Japanese populations but low in European populations. A positive selection analysis of eight H.pylori genomes was conducted, using maximum likelihood based pairwise comparisons in order to maximize the number of strain-specific genes included in the study. Using the genic Ka/Ks ratio, comparisons of four Japanese H.pylori genomes suggest 25-34 genes under positive selection, while comparisons of four European H.pylori genomes suggest 16-21 genes; few of the identified genes were shared between lineages. Of the identified genes which were annotated, 38% possessed homologs associated with pathogenicity and/or host adaptation, consistent with their involvement in a coevolutionary 'arms race' with the host. Given its efficacy in identifying host interaction factors de novo, in the absence of functionally annotated homologs, our evolutionary approach may have value in identifying novel genes which H.pylori employs to interact with the human gut environment. In addition, the larger number of genes inferred as being under positive selection in Japanese strains compared to European strains implies a stronger overall adaptive pressure, potentially resulting from an elevated immune response which may be linked to increased inflammation, an initial stage in the development of gastric cancer.
Keywords: Helicobacter pylori; Ka/Ks; PAML; pairwise comparison; pathogenic factors; positive selection
Year: 2015 PMID: 25788149 PMCID: PMC4419197 DOI: 10.1093/emph/eov005
Source DB: PubMed Journal: Evol Med Public Health ISSN: 2050-6201
Figure 1. Phylogenetic analysis of 37 H.pylori strains with complete genomes. Bayesian phylogenetic inference of 37 H.pylori strains was conducted as described in Methods. Colors indicate the region from which each strain was isolated. Numbers at nodes represent posterior probabilities.
Figure 2. Pipeline for the inference of genes under positive selection in pairwise genome comparisons. Schematic representation of the selection analysis pipeline showing each step of a pairwise genome comparison, from identification and extraction of the ortholog pairs to the functional categorization of the proteins under inferred positive selection.
Statistics from the four pairwise genome comparisons of H.pylori strains from Japan and Europe
| Pair | Strain (geographic origin in brackets) | Total genes present in genome | Gene pairs remaining after filtering | Average Ks (substitutions per site) | Average Ka/Ks (genes with Ka/Ks < 1) | Number of genes under inferred positive selection | Percentage of total genes |
|---|---|---|---|---|---|---|---|
| Japan1 | F57 (Fukui) | 1563 | 1191 | 0.034 | 0.11 | 31 | 1.98 |
| Japan1 | F16 (Fukui) | 1543 | 1191 | 0.030 | 0.11 | 30 | 1.94 |
| Japan2 | 35a (Kyoto) | 1560 | 1173 | 0.023 | 0.10 | 25 | 1.60 |
| Japan2 | 83 (Kyoto) | 1656 | 1173 | 0.031 | 0.10 | 34 | 2.10 |
| Europe1 | G27 (Italy) | 1581 | 1181 | 0.054 | 0.15 | 17 | 1.08 |
| Europe1 | B38 (France) | 1571 | 1181 | 0.051 | 0.16 | 16 | 1.02 |
| Europe2 | P12 (Germany) | 1634 | 1247 | 0.054 | 0.16 | 21 | 1.29 |
| Europe2 | 26695 (UK) | 1563 | 1247 | 0.058 | 0.17 | 16 | 1.02 |
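
The "Percentage of total genes" column appears to be the number of genes under inferred positive selection divided by the total number of genes in that genome. A minimal Python sketch of that arithmetic follows; the function name is hypothetical, and the formula is an assumption read off the table rather than a calculation stated in the record.

```python
def percent_under_selection(n_selected: int, total_genes: int) -> float:
    """Return positively selected genes as a percentage of all genes in a genome.

    Assumed formula: 100 * n_selected / total_genes.
    """
    if total_genes <= 0:
        raise ValueError("total_genes must be positive")
    return 100.0 * n_selected / total_genes


# Counts taken from the table rows for strains F57 and B38.
f57_pct = round(percent_under_selection(31, 1563), 2)
b38_pct = round(percent_under_selection(16, 1571), 2)
print(f57_pct, b38_pct)
```
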
Figure 3. Gene Ontology enrichment of genes under inferred positive selection. Functional categories for all genes with Ka/Ks > 1 from the European (B38 vs G27 and HPAG1 vs 26695) and Japanese (F16 vs F57 and 35a vs 83) pairwise comparisons, assigned using the Gene Ontology Database and Blast2GO as described in Methods. The numbers on the y-axis represent the relative enrichment, i.e. the proportion of positively selected genes in each GO category relative to the total number of genes in that category in the genome.
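
The relative enrichment described in the caption can be sketched in Python as a per-category ratio. This is only an illustration of the stated definition: the function name is hypothetical and the counts below are invented placeholders, not values from the study.

```python
def relative_enrichment(selected_by_category, genome_by_category):
    """Map each GO category to (selected genes in category) / (all genes in category)."""
    result = {}
    for category, total in genome_by_category.items():
        if total > 0:  # skip categories with no genes in the genome
            result[category] = selected_by_category.get(category, 0) / total
    return result


# Hypothetical counts, for illustration only (not taken from the paper).
genome_counts = {"transport": 40, "motility": 20, "translation": 50}
selected_counts = {"transport": 2, "motility": 3}
enrichment = relative_enrichment(selected_counts, genome_counts)
print(enrichment)
```
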
Genes under inferred positive selection from the pairwise genome comparisons of four H.pylori strains isolated from Japan
| 35a (Kyoto, PUD) | 83 (Kyoto, PUD) | F16 (Fukui, Ga) | F57 (Fukui, GC) |
|---|---|---|---|
| Flagellar exodeoxyribonuclease hook-associated protein 2, fliD (YP_005769952.1) | Lipid A 1-phosphatase (YP_005769383.1) | Urease-enhancing factor (YP_005779079) | Thioredoxin (BAJ55951.1) |
| Probable outer membrane protein (ADU41365.1) | DNA-directed RNA polymerase subunit omega (YP_005769931.1) | 50S ribosomal protein L9 (BAJ55437.1) | Outer membrane protein, HorA (YP_005776806.1) |
| Dethiobiotin synthetase (YP_005770624.1) | 6,7-dimethyl-8-ribityllumazine synthase (riboflavin synthase beta chain) (YP_005769366.1) | 30S ribosomal protein S6 (BAJ55777.1) | Conserved hypothetical ATP-binding protein (BAJ54919.1) |
| Neuraminyllactose-binding hemagglutinin (YP_005769912.1) | ATP-binding protein (YP_005770583.1) | 50S ribosomal protein L3 (BAJ55847.1) | Putative secretion/efflux ABC transporter ATP-binding protein (YP_006227497.1) |
| Urease-enhancing factor (YP_005770181) | Phosphonate metabolism protein PhnI (YP_005769292.1) | ABC transporter permease (BAJ55515.1) | CMP-N-acetylneuraminic acid synthetase (BAJ54930.1) |
| Periplasmic competence protein (YP_005769282.1) | Type II methylase protein (YP_005769308.1) | Sialidase A (YP_003728769.1) | Putative endonuclease (YP_005776737.1) |
| gfo/Idh/MocA family oxidoreductase (YP_005770018.1) | Methyl-accepting chemotaxis protein (YP_005769452.1) | Type II modification enzyme (BAJ55149.1) | D-Amino acid dehydrogenase (BAJ55519.1) |
| Pseudouridine synthase D (YP_005769708.1) | Lipopolysaccharide ABC superfamily ATP binding cassette transporter permease protein (YP_005769347.1) | Adenine-specific DNA methyltransferase (BAJ54643.1) | RNA-binding protein (BAJ55130.1) |
| Ribosomal protein L11 methyltransferase (YP_005769743.1) | Shikimate kinase (YP_005769533.1) | cag pathogenicity island protein (CagG) (BAJ55413.1) | Putative lipopolysaccharide biosynthesis protein (BAJ55644.1) |
| 50S ribosomal protein L22 (YP_005770576.1) | UDP-sugar diphosphatase (YP_005770185.1) | flgM protein (BAJ55661.1) | ABC transporter ATP-binding protein (YP_005777218.1) |
| Holliday junction resolvase family protein (YP_005769700.1) | NADH-quinone oxidoreductase subunit A (YP_005770524.1) | 7-cyano-7-deazaguanine reductase (BAJ55907.1) | Cell filamentation protein (BAJ55696.1) |
| Disulfide interchange protein (YP_005770340.1) | Poly(A) polymerase (YP_005770052.1) | Biotin protein ligase (BAJ55679.1) | Ribose-5-phosphate isomerase B (BAJ55382.1) |
| Flagellar hook-basal body complex protein FliE (YP_005769337.1) | Proteobacterial sortase system OmpA family protein (YP_005769939.1) | RNA-binding protein (BAJ55130.1) | Peptidase M50 (YP_005770113.1) |
| Orotate phosphoribosyltransferase (YP_005770521.1) | DASS family divalent anion:sodium (Na+) symporter (YP_005769519.1) | Type II DNA modification enzyme (BAJ55625.1) | feoA gene product (YP_005770102.1) |
| CDP-diacylglycerol-glycerol-3-phosphate, 3-phosphatidyltransferase (YP_005769796.1) | Phosphoserine phosphatase (YP_005770041.1) | Type II restriction endonuclease (BAJ55626.1) | Riboflavin synthase subunit alpha (YP_005778223.1) |
| Fatty acid/phospholipid synthesis protein PlsX (YP_005769572.1) | Polar flagellin (YP_005769953.1) | Riboflavin synthase subunit alpha (YP_005779723.1) | Type II restriction endonuclease (YP_005777226.1) |
| Pyrroline-5-carboxylate reductase (YP_005770419) | Chorismate mutase (YP_005769657.1) | Secreted protein involved in flagellar motility (YP_006219827.1) | Transcription elongation factor GreA (BAJ55095.1) |
| H. protein HPF30_0649 (ADU41023) | 50S ribosomal protein L15 (YP_005770563) | H. protein HPF16_0633 (YP_005778868) | 50S ribosomal protein L10 (YP_005777971.1) |
| H. protein HPCPY3281_0951 (ADU40935) | Thiamine-phosphate diphosphorylase (YP_005769865) | H. protein HPF16_0710 * (YP_005778945) | Succinyl-CoA-transferase subunit B (BAJ55301.1) |
| H. protein HPHPA14_0570 (ADU40948) | HOP family outer membrane porin (YP_005769995) | H. protein HPF16_0214 (YP_005778449) | Biotin sulfoxide reductase BisC fragment (YP_005777721.1) |
| Conserved hypothetical protein (ADU40534.1) | Integral membrane protein (YP_005769244) | H. protein HPF16_0250 (YP_005778485) | cag pathogenicity island protein (CagU) (YP_005777275.1) |
| H. protein HP9810_881g5 (ADU40463) | Undecaprenyl phosphate N-acetylglucosaminyltransferase (YP_005769359) | H. protein HPF16_0244 (YP_005778479) | H. protein HPF57_1192 (YP_005777906.1) |
| H. protein HP0385 (ADU41339) | Acyl-phosphate glycerol 3-phosphate acyltransferase (YP_005769268) | H. protein HPF16_1359 * (BAJ55956) | H. protein HPF57_0261 (YP_005776975.1) |
| H. protein HPCPY6311_1233 (ADU41477.1) | Conserved hypothetical protein * (ADU41024) | H. protein HPF16_0796 (BAJ55393) | H. protein HPF57_0394 (YP_005777108.1) |
| H. protein HPKB_0158 (ADU40191.1) | Conserved hypothetical protein * (ADU41633) | H. protein HPF16_0423 * (BAJ55020) | H. protein HPF57_0426 (YP_005777140.1) |
| | Conserved hypothetical protein (ADU40713) | H. protein HPF16_0253 (YP_005778488) | H. protein HPF57_1344 * (YP_005778058.1) |
| | H. protein KHP_0683 (ADU41063) | H. protein HPKB_0212 (YP_005761745) | H. protein HPF57_0789 (YP_005777503.1) |
| | H. protein HPHPP23_1030 (ADU41054) | H. protein HPF16_1167 (YP_005779402) | H. protein rpmJ (YP_005777967.1) |
| | H. protein HPKB_1436 (ADU40297) | H. protein HPF16_0878 (YP_005779113) | H. protein HPF57_0703 (YP_005777417.1) |
| | H. protein HPF30_1072 (ADU40599) | H. protein HPF16_1062 (YP_005779297) | H. protein HPF57_0549 (YP_005777263.1) |
| | | | H. protein HPF57_1078 (YP_005777792.1) |
The area of Japan from which each strain was isolated, and the disease status of the individual from whom it was isolated, are indicated in brackets ('PUD' signifies peptic ulcer disease, 'Ga' gastritis, 'GC' gastric cancer). References describing the respective disease status for each strain are as follows: 35a and 83 (Yoshio Yamaoka, Michael E. DeBakey Veterans Affairs Medical Center, Baylor College of Medicine, personal communication), F16 [27] and F57 [27]. Those genes that were inferred as being under positive selection, with a value of Ka/Ks > 1, are listed. P values generated by each respective likelihood ratio test are shown; values of P = 0.00 represent the rounding of small P values by the codeml program. Predicted membrane-localized proteins (according to either PSORTb or CELLO) are denoted with an asterisk. 'H. protein' denotes 'hypothetical protein'. GenBank accession numbers are in brackets.
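
As a rough illustration of how a likelihood ratio test yields such P values, the Python sketch below converts two log-likelihoods into a P value. It assumes a one-degree-of-freedom test (for example, a null model with Ka/Ks fixed at 1 against an alternative with Ka/Ks estimated freely); the record does not specify the models compared, and the function name and log-likelihood values are hypothetical.

```python
import math


def lrt_pvalue(lnl_null: float, lnl_alt: float) -> float:
    """P value of a likelihood ratio test with one degree of freedom (assumed)."""
    statistic = 2.0 * (lnl_alt - lnl_null)
    if statistic <= 0.0:
        return 1.0  # the alternative model fits no better than the null
    # Chi-square survival function for 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical log-likelihoods, for illustration only.
p = lrt_pvalue(-100.0, -98.08)
print(f"{p:.3f}")

# A very small P value prints as 0.00 at two decimal places,
# the kind of rounding the footnote attributes to codeml output.
tiny = lrt_pvalue(-100.0, -80.0)
print(f"{tiny:.2f}")
```
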
Genes inferred as being under positive selection from the pairwise genome comparisons of four H.pylori strains isolated from Europe
| B38 (France, ML) | G27 (Italy, PUD) | 26695 (UK, Ga) | P12 (Germany, PUD) |
|---|---|---|---|
| Two-component response regulator (YP_003057199.1) | 2-oxoglutarate-acceptor oxidoreductase subunit OorD (YP_003057524.1) | trbI protein (NP_206843.1) | sec-independent translocase (NP_207851.1) |
| LPS 1,2-glycosyltransferase (YP_003056980.1) | Diacylglycerol kinase (YP_003057423.1) | ABC transporter permease (YP_006934535) | ABC transporter permease (YP_002301253) |
| 50S ribosomal protein L33 (YP_003057858.1) | Haloacid dehalogenase (YP_003057862.1) | Phosphotransacetylase (pta) (NP_207697.1) | Molybdenum ABC transporter periplasmic molybdate-binding protein (modA) (NP_207271.1) |
| 30S ribosomal protein S6 (YP_003057895.1) | yceI protein (YP_003057934.1) | Exodeoxyribonuclease VII small subunit (NP_208273.1) | Exodeoxyribonuclease VII small subunit (YP_002302088.1) |
| 50S ribosomal protein L28 (YP_003057592) | Flagellar biosynthesis protein (YP_003057130.1) | 30S ribosomal protein S11 (NP_208087.1) | tenA transcriptional regulator (NP_208079.1) |
| Cation transport subunit for cbb3-type oxidase (YP_003057819.1) | sec-independent translocase (YP_003057182.1) | 30S ribosomal protein S19 (NP_208107.1) | Lipoprotein (NP_208229.1) |
| Biopolymer transport protein ExbD (YP_003057987.1) | Acetone carboxylase gamma subunit (YP_003057426.1) | Heat-inducible transcription repressor (NP_206911.1) | Flagellar protein FlaG (NP_207544.1) |
| Type IV restriction-modification enzyme (YP_003057166.1) | rRNA large subunit methyltransferase (YP_003057653.1) | Ribosome maturation factor rimP (NP_207836.1) | NADH dehydrogenase subunit K (NP_208062.1) |
| Dihydroneopterin aldolase (YP_003057321.1) | Chorismate mutase PheA (YP_003057099.1) | Neuraminyllactose-binding hemagglutinin precursor (NLBH) (NP_207289.1) | ATP-binding protein (NP_206967.1) |
| RNA-binding protein (YP_003057806.1) | H. protein HELPY_1408 (YP_003058057.1) | 30S ribosomal protein S4 (NP_208086.1) | Pore-forming cytolysin (YP_003057160.1) |
| H. protein HELPY_1050 (YP_003057739) | H. protein HELPY_0637 (YP_003057396) | Ribonuclease H (NP_207455) | Ribonuclease H (YP_002301307) |
| H. protein HELPY_0735 (YP_003057482) | H. protein HELPY_0294 (YP_003057096.1) | H. protein HP1405 (NP_207861) | H. protein HP1405 (NP_208196) |
| H. protein HELPY_0206 (YP_003057019) | H. protein HELPY_0581 (YP_003057349) | H. protein HP0095 (NP_206895.1) | Chain C, crystal structure of FliS-HP1076 complex in H. pylori (NP_207867.1) |
| H. protein HELPY_1408 (YP_003058057.1) | H. protein HELPY_1405 (YP_003058054) | H. protein HP1579 (NP_208370.1) | H. protein HP1065 (NP_207856.1) |
| H. protein HELPY_0386 (YP_003057178) | H. protein HELPY_0671 (YP_003057425) | H. protein HP0219 (NP_207017) | H. protein HP0203 (NP_207002.1) |
| H. protein HELPY_0261 (YP_003057065) | H. protein HELPY_1039 (YP_003057728) | H. protein HP79_07203 (NP_207845.1) | H. protein HP0444 (NP_207242.1) |
| | H. protein HELPY_0254 (YP_003057060) | | H. protein HP0716 (NP_207510.1) |
| | | | H. protein HP0868 (NP_207662.1) |
| | | | H. protein HP0350 (NP_207148.1) |
| | | | H. protein HP0150 (NP_206949.1) |
| | | | H. protein HP0556 (NP_207856.1) |
The country of origin and disease status of the individuals from whom the H.pylori strains were isolated are indicated in brackets ('ML' signifies MALT lymphoma, 'PUD' peptic ulcer disease, 'Ga' gastritis). References describing the respective disease status for each strain are B38 [58], G27 [59], 26695 [60] and P12 [61]. Those genes that were inferred as being under positive selection, with a value of Ka/Ks > 1, are listed. Predicted membrane-localized proteins (according to either PSORTb or CELLO) are denoted with an asterisk. 'H. protein' denotes 'hypothetical protein'.
Figure 4. Selection map of H.pylori strains F16 and F57. Patterns of selection were plotted around the genomes of H.pylori strains F16 and F57, isolated from patients from Fukui, Japan. Genes inferred as being under positive selection are indicated with arrows. Colors represent the strength of purifying selection, with the scale ranging from Ka/Ks = 0 (red) to Ka/Ks = 0.4 (green).
Genes under inferred positive selection in more than one of the eight genomes examined in the pairwise genome comparisons
| Gene | Strains (gene accession number in brackets) | Biological process | Cellular localization |
|---|---|---|---|
| sec-independent translocase | G27 (YP_003057182.1), P12 (NP_207851.1) | Secretion | Plasma membrane |
| Urease-enhancing factor | F16 (YP_005779079), 35a (YP_005770181) | Metabolism | Cytoplasmic |
| ABC transporter permease | 26695 (YP_006934535), P12 (YP_002301253) | Transport | Periplasmic |
| Ribonuclease H | 26695 (NP_207455), P12 (YP_002301307) | RNA catabolic process | Cytoplasmic |
| Hypothetical protein HP1405 | 26695 (NP_208196.1), P12 (NP_208196) | Unknown | Cytoplasmic |
| Riboflavin synthase alpha | F16 (YP_005779723.1), F57 (YP_005778223.1) | Metabolism | Cytoplasmic |
| Chorismate mutase | G27 (YP_003057099.1), 83 (YP_005769657.1) | Metabolism | Cytoplasmic |
| Exodeoxyribonuclease VII small subunit | 26695 (NP_208273.1), P12 (YP_002302088.1) | DNA repair | Cytoplasmic |
The table shows those genes that are inferred as being under positive selection in more than one genome examined.
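
One simple way to find genes flagged in more than one genome is to invert a strain-to-genes mapping and keep genes listed under at least two strains. The Python sketch below does this for a small excerpt of gene names from this record. Matching genes by annotation name is an assumption made for illustration (a real analysis would more likely rely on orthology), and the function name is hypothetical.

```python
from collections import defaultdict


def genes_in_multiple_genomes(selected):
    """Return {gene: sorted strains} for genes listed under two or more strains."""
    strains_by_gene = defaultdict(set)
    for strain, genes in selected.items():
        for gene in genes:
            strains_by_gene[gene].add(strain)
    return {
        gene: sorted(strains)
        for gene, strains in strains_by_gene.items()
        if len(strains) >= 2
    }


# Small excerpt of gene names from this record; names are simplified so that
# equivalent annotations match (an assumption, not the authors' procedure).
selected = {
    "G27": {"Chorismate mutase", "Diacylglycerol kinase"},
    "83": {"Chorismate mutase", "Shikimate kinase"},
    "26695": {"Ribonuclease H"},
    "P12": {"Ribonuclease H", "Lipoprotein"},
}
shared = genes_in_multiple_genomes(selected)
print(shared)
```
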