Elvire Berthenet1, Koji Yahara2, Kaisa Thorell3, Ben Pascoe4, Guillaume Meric4, Jane M Mikhail1,5, Lars Engstrand3, Helena Enroth6, Alain Burette7, Francis Megraud8,9, Christine Varon9, John C Atherton10, Sinead Smith11, Thomas S Wilkinson1, Matthew D Hitchings1, Daniel Falush12, Samuel K Sheppard13.
Abstract
BACKGROUND: Helicobacter pylori are stomach-dwelling bacteria present in about 50% of the global population. Infection is asymptomatic in most cases, but it has been associated with gastritis, gastric ulcers and gastric cancer. Epidemiological evidence shows that progression to cancer depends on both host and pathogen factors, but questions remain about why cancer phenotypes develop in only a minority of infected people. Here, we use comparative genomics approaches to understand how genetic variation amongst bacterial strains influences disease progression.
Keywords: GWAS; Gastric cancer; Helicobacter pylori
Year: 2018 PMID: 30071832 PMCID: PMC6090961 DOI: 10.1186/s12915-018-0550-3
Source DB: PubMed Journal: BMC Biol ISSN: 1741-7007 Impact factor: 7.431
Fig. 1 Neighbour-joining tree based on whole genome sequence alignment of all 173 strains from hpEurope-derived populations. Branches are shaded according to the population determined by fineSTRUCTURE analysis [17]. Labels reveal the patient disease background grouped into three categories: non-atrophic gastritis, progressive towards gastric cancer and gastric cancer. The scale bar represents a genetic distance of 0.02.
Fig. 2 Location of genetic elements associated with gastric cancer on the ELS37 genome (GCA_000255955.1). GWAS comparing isolates from patients with (a) non-atrophic gastritis to those with gastric cancer and precancerous progression and (b) gastric cancer to those with non-atrophic gastritis and precancerous progression. Two GWAS were performed with bugwas software for each panel, one based on SNPs (upper panels) and the other based on k-mers (lower panels). Positions of the genomic elements are shown on the horizontal axis, and the log10 of the p value for each hit is shown on the vertical axis. The blue line indicates a p value ≤ 10⁻⁵.
Table 1 Summary of the hits obtained in the genome-wide association studies of 173 strains from hpEurope-derived sub-populations, grouped by patient disease phenotype
| GWAS experiment | Number of hits with p ≤ 10⁻⁵ | Number of hits with p ≤ 10⁻⁶ | Number of genes with hits of p ≤ 10⁻⁵ | Number of genes with hits of p ≤ 10⁻⁶ |
|---|---|---|---|---|
| Gastric cancer vs others (k-mer) | 166 | 39 | 20 | 6 |
| Non-atrophic gastritis vs others (k-mer) | 44 | 15 | 10 | 2 |
| Gastric cancer vs others (SNP) | 237 | 33 | 4 | 3 |
| Non-atrophic gastritis vs others (SNP) | 195 | 31 | 4 | 2 |
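The two kinds of counts in the table (hits below a p value threshold, and distinct genes containing such hits) can be illustrated with a minimal sketch. The hit list, the gene labels and the `summarise_hits` helper are hypothetical, and the convention that intergenic hits carry `None` as their gene is an assumption made for illustration; none of this is taken from the study's own pipeline.

```python
def summarise_hits(hits, threshold):
    """Count hits at or below `threshold` and the distinct genes they fall in.

    `hits` is a list of (gene, p_value) pairs; `gene` is None for a hit
    outside any annotated gene (an assumption of this sketch).
    """
    significant = [(gene, p) for gene, p in hits if p <= threshold]
    genes = {gene for gene, _ in significant if gene is not None}
    return len(significant), len(genes)


# Toy, made-up hits (not data from the study).
toy_hits = [
    ("gene_1", 2e-7),
    ("gene_1", 5e-6),
    ("gene_2", 8e-6),
    (None, 3e-6),      # intergenic hit: counted as a hit, not as a gene
    ("gene_3", 1e-4),  # above both thresholds
]

for threshold in (1e-5, 1e-6):
    n_hits, n_genes = summarise_hits(toy_hits, threshold)
    print(f"p <= {threshold:g}: {n_hits} hits in {n_genes} genes")
```

Because several hits can fall in the same gene, the gene count can be much smaller than the hit count, as in the SNP rows of the table.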
Fig. 3 Prevalence of genes highlighted by GWAS in H. pylori genomes. a Prevalence of genes containing a GWAS hit with p value < 10⁻⁵ in three groups of isolates: non-atrophic gastritis isolates (red, n = 55 genomes), progressive toward cancer isolates (grey, n = 49) and gastric cancer isolates (black, n = 39). Prevalence is defined as the ratio of the number of isolates in each group harbouring the gene to the total number of isolates in that group. b Matrix of correlations between pairs of gene prevalence patterns in 143 H. pylori genomes. Red indicates that two genes have a high positive correlation of their patterns of presence and absence in all genomes examined and blue indicates a negative correlation. White indicates core genes that did not vary in prevalence in the dataset and for which correlations could not be calculated.
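The two quantities in this caption can be sketched as follows. This is a minimal illustration on made-up presence/absence vectors. The use of the Pearson coefficient is an assumption (the caption does not name the correlation measure), and the gene names, group indices and function names are hypothetical.

```python
from math import sqrt


def prevalence(presence, group):
    """Fraction of the isolates indexed by `group` that carry the gene."""
    return sum(presence[i] for i in group) / len(group)


def pattern_correlation(a, b):
    """Pearson correlation of two 0/1 presence vectors.

    Returns None when either vector is constant (for example a core gene
    present in every genome), because the correlation is then undefined.
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)


# Toy presence (1) / absence (0) patterns over six hypothetical isolates.
presence = {
    "gene_a": [1, 1, 0, 0, 1, 0],
    "gene_b": [1, 1, 0, 0, 1, 0],  # same pattern as gene_a
    "gene_c": [0, 0, 1, 1, 0, 1],  # opposite pattern
    "core":   [1, 1, 1, 1, 1, 1],  # never varies
}
# Hypothetical assignment of isolate indices to disease groups.
groups = {"NAG": [0, 1], "progressive": [2, 3], "GC": [4, 5]}

for name, members in groups.items():
    print(name, prevalence(presence["gene_a"], members))
```

Returning `None` for a constant vector mirrors the white cells described in panel b, where no correlation could be calculated for core genes.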
Table 2 Cancer risk genotypes identified in genome-wide association studies of 173 hpEurope isolates
| Gene name¹ | p value | Risk genotype | Position² | Safe genotype | Frequency³ | Effect on amino acid sequence⁴ | Function |
|---|---|---|---|---|---|---|---|
| | 1.4 × 10⁻⁹ | A | 798 | C | 0.469/0.125 | S, associated with G to A substitution at position 797: non-synonymous with T in safe, A in risk | Outer membrane protein |
| | 2.24 × 10⁻⁸ | C + T | 325 and 334 | T + G | 0.592/0.181 | NS: L/S in safe, F/A in risk | Neuraminyllactose-binding hemagglutinin (HpaA) [ ] |
| | 3.99 × 10⁻⁸ | Presence | All genes | Absence | 0.94/0.51 | | BabA (outer membrane protein) [ ] |
| | 1.69 × 10⁻⁷ | GGAA | 934 to 937 | AAAA/GGAG | 0.531/0.264 | NS: KA in safe, GT in risk | tRNA (guanine-N(7)-)-methyltransferase |
| | 2.13 × 10⁻⁷ | A | 145 | G | 0.327/0.153 | NS: D in safe, N in risk | Adenosyl-chloride synthase |
| | | A | 159 | G | 0.959/0.792 | S | |
| | 3.62 × 10⁻⁷ | Presence | All genes | Absence | 0.92/0.61 | | CagT protein (Censini, 1996) |
| | 4.59 × 10⁻⁷ | CGCC | 705 to 708 | CACG/TGCG | 0.694/0.514 | NS: T in safe, A in risk | Unknown |
| | | A | 729 | G | 0.796/0.5 | S | |
| | 5.4 × 10⁻⁷ | Presence | All genes | Absence | 0.92/0.61 | | CagU protein (Censini, 1996) |
| | 6.6 × 10⁻⁷ | Presence | All genes | Absence | 0.92/0.61 | | CagH protein (Censini, 1996) |
Risk and safe genotypes are overrepresented amongst isolates from patients with gastric cancer and non-atrophic gastritis respectively, with the p value corresponding to the minimum in each gene (p value ≤ 1 × 10⁻⁶)
¹ Position in ELS37 genome [ ]; + and − strands are denoted in parentheses
² Position in gene
³ Frequency in GC strains/NAG strains
⁴ The effect on the amino acid sequence is indicated as synonymous (S) or non-synonymous (NS)
Fig. 4 Distribution of risk scores across 173 strains from hpEurope-derived sub-populations, according to patient disease background. Each point corresponds to the risk score associated with a single strain. This risk score was calculated based on the presence of the risk or safe genotype for each of the 9 genes considered (Table 2).
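The caption does not give the scoring formula, so the following is only a sketch of one plausible scheme, under a loudly stated assumption: each gene contributes +1 when the strain carries the risk genotype, −1 when it carries the safe genotype, and 0 otherwise. The `risk_score` function, the gene labels and the genotype calls are all hypothetical.

```python
# ASSUMPTION: the +1 / -1 / 0 weighting below is illustrative only; the
# source states that the score depends on risk or safe genotypes per gene
# but does not specify how they are combined.
WEIGHTS = {"risk": 1, "safe": -1}


def risk_score(calls):
    """Sum per-gene contributions for one strain.

    `calls` maps a gene label to "risk", "safe", or any other value for a
    genotype that is neither (which contributes 0).
    """
    return sum(WEIGHTS.get(call, 0) for call in calls.values())


# Toy genotype calls for a single hypothetical strain.
strain_calls = {
    "gene_1": "risk",
    "gene_2": "risk",
    "gene_3": "safe",
    "gene_4": "other",
}
print(risk_score(strain_calls))
```

Under this assumed scheme, a strain scored over 9 genes would range from −9 (all safe genotypes) to +9 (all risk genotypes).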