Comparative genomic analysis of Helicobacter pylori from Malaysia identifies three distinct lineages suggestive of differential evolution.

Narender Kumar1, Vanitha Mariappan2, Ramani Baddam1, Aditya K Lankapalli1, Sabiha Shaik1, Khean-Lee Goh3, Mun Fai Loke2, Tim Perkins4, Mohammed Benghezal4, Seyed E Hasnain5, Jamuna Vadivelu2, Barry J Marshall4, Niyaz Ahmed6.   

Abstract

The discordant prevalence of Helicobacter pylori and its related diseases, for a long time, fostered certain enigmatic situations observed in the countries of the southern world. Variation in H. pylori infection rates and disease outcomes among different populations in multi-ethnic Malaysia provides a unique opportunity to understand dynamics of host-pathogen interaction and genome evolution. In this study, we extensively analyzed and compared genomes of 27 Malaysian H. pylori isolates and identified three major phylogeographic lineages: hspEastAsia, hpEurope and hpSouthIndia. The analysis of the virulence genes within the core genome, however, revealed a comparable pathogenic potential of the strains. In addition, we identified four genes limited to strains of East-Asian lineage. Our analyses identified a few strain-specific genes encoding restriction modification systems and outlined 311 core genes possibly under differential evolutionary constraints, among the strains representing different ethnic groups. The cagA and vacA genes also showed variations in accordance with the host genetic background of the strains. Moreover, restriction modification genes were found to be significantly enriched in East-Asian strains. An understanding of these variations in the genome content would provide significant insights into various adaptive and host modulation strategies harnessed by H. pylori to effectively persist in a host-specific manner.
© The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

Year:  2014        PMID: 25452339      PMCID: PMC4288169          DOI: 10.1093/nar/gku1271

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Helicobacter pylori, the human gastric pathogen, colonizes almost 50% of the world population (∼70% of the population of developing countries and ∼40% of developed countries) (1,2). It is a major etiological agent for a wide range of gastric diseases such as gastritis, peptic ulcers, gastric carcinoma and mucosa-associated lymphoid tissue lymphoma (3,4). Generally acquired during childhood through intra-familial transfer, H. pylori establishes a lifelong persistent infection unless cleared by antibiotics (5). The analysis of H. pylori strains has revealed the existence of populations that are geographically localized (6,7). These populations have been classified based on multi-locus sequence typing (MLST) (7–11) into seven major lineages or genotypes depending on their regional prevalence: hpEurope, hpSahul, hpEastAsia, hpAfrica1, hpAfrica2, hpWestAfrica and hpAsia2. The ability to undergo frequent mutation and recombination is one of the major contributors to the observed genetic heterogeneity among various H. pylori isolates (12–14). It also allows the bacterium to adapt quickly to the changing gastric niches and establish a persistent infection. Certain countries with people of different ethnicities, cultures, lifestyles and religions present a pertinent model to examine the effects of migration and co-evolution on bacteria–host interaction. Studies in such settings would provide a better understanding of the evolutionary and adaptive strategies employed by H. pylori, which might aid in the design of intervention strategies (15). Malaysia is one such multicultural, developing nation with a population comprising four major ethnic groups: Malay/‘Bumiputera’, Chinese, Indians and others (http://www.statistics.gov.my). In general, the Malays are considered natives (Bumiputera) and are in the majority. 
The Malaysian-Chinese comprise the second largest ethnic group and are documented to have migrated from Southern China, while the Malaysian-Indian group comprises migrants from Southern India (16). Apart from these major ethnic groups, there are a number of indigenous groups (‘Orang Asli’), living particularly in East Malaysia (Sabah and Sarawak), who do not share the same ethnic origin as the Malays (17). Previous reports have shown a high prevalence of H. pylori infection among the Malaysian-Indians (69–75%), followed by Malaysian-Chinese (45–60%) and Malays (8–43%); the very small number of inter-racial/inter-community or inter-religion marriages putatively reduces the chance of cross-infection between ethnic groups (18,19). A majority of the H. pylori isolates from Malays and Malaysian-Indians were suggested to be of a recent common origin, while those from Malaysian-Chinese exhibited East-Asian ancestry (6,19). Generally, the H. pylori isolate collections representative of Asian populations are composed of strains from hpEastAsia, hpEurope and hpAsia2 (10,19). A significant proportion of the Malay isolates were found to be similar to their Indian counterparts, suggesting a possible acquisition of H. pylori from Indians (19), although there is not enough genomic evidence to support this interpretation. Further, the reason for the low prevalence of H. pylori infection among the Malay/‘Bumiputera’ population remains unclear and is likely to involve a number of environmental, genetic and host-related factors (20). Recent sequencing efforts have reported multiple genome sequences of H. pylori isolates from Malaysian patients of different ethnicities and with various disease manifestations (21–23). To date, there are 29 Malaysian H. pylori genomes (27 clinical strains and two mouse-adapted strains) available in the NCBI database, a majority of them sequenced and deposited as a part of this work. 
In this study, we carried out an in-depth whole genome comparative analysis of 27 clinical isolates. The comparison of their core and accessory gene pools demonstrated close similarity among the strains according to their respective host genetic backgrounds. The status of various virulence genes and outer membrane proteins (OMPs) was also compared among the strains in order to uncover novel co-ordinates of adaptive evolution. The study aimed at understanding the genomic heterogeneity among these isolates and its possible role in the observed enigmas related to disease outcomes in the region. Further, the analysis of strain-specific genes would allow us to better understand the disease biology and might open avenues for developing effective control strategies.

MATERIALS AND METHODS

Strain collection and ethics approval

Gastric biopsy samples were obtained from five non-ulcer dyspepsia patients of different ethnicities [two Malaysian-Chinese (UM007 and UM034), two Malaysian-Indians (UM018 and UM054) and one Malay (UM045)] at the University of Malaya Medical Centre (UMMC). All biopsies were obtained with the written informed consent of the patients attending the Endoscopy Unit at UMMC. This study was approved by the Human Ethics Committee of the University of Malaya, Kuala Lumpur, Malaysia (Ref. No. 943.2).

Bacterial culture and DNA isolation

The H. pylori isolates were cultured from gastric biopsies by inoculating them on chocolate agar fortified with 4% blood base agar No. 2 (Oxoid) containing defibrinated horse blood (Oxoid), with the antimicrobials trimethoprim, vancomycin and polymyxin B added at standard concentrations. Primary cultures were incubated for up to 10 days (with daily observation) at 37°C in an incubator with 10% CO2. For isolation of pure cultures, a single colony was identified and sub-cultured on chocolate agar for 3–5 days. H. pylori isolates were identified based on microscopic morphology and on characteristic biochemical tests for the detection of enzymes such as urease, oxidase and catalase. A plateful of H. pylori culture was suspended in 500 μl of Tris buffer. The suspension was centrifuged at 5000 rpm for 10 min and the resulting pellet was collected. The H. pylori DNA was isolated using the QIAamp DNA Mini kit (Qiagen) according to the manufacturer's instructions.

Genome sequencing, assembly and annotation

Whole genome sequencing of the collected strains was carried out on an Illumina GAIIx sequencer. The 100-bp paired-end sequencing run generated ∼1 GB of read data per strain with an average insert size of ∼400 bp. The raw reads were then filtered using the NGS QC toolkit (threshold quality >20) and were assembled into contigs using the Velvet de novo assembler (24,25). The contigs were aligned against the NCBI RefSeq database to identify a suitable complete genome as a reference. The contigs were sorted and re-oriented according to the chosen reference using in-house scripts utilizing BLASTn. The sort order was then manually curated based on the paired-end information, which helped us to order most of the unaligned contigs. These ordered contigs were joined together to form a draft genome by inserting, at the contig ends, a linker sequence (NNNNNCACACACTTAATTAATTAAGTGTGTGNNNNN) encoding start and stop codons in all six frames. The draft genomes were submitted for gene prediction and annotation to the RAST annotation server, and the results were validated using GeneMarkS, Easygene and Glimmer (26–31). Genome statistics of the respective strains were obtained through Artemis (32). tRNAs were identified using the tRNAscan-SE program, while rRNA genes were identified using the RNAmmer program (33,34).
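The six-frame property of the linker can be checked with a few lines of Python. The sketch below is not the authors' in-house script; the helper names (`frames_with`, `join_contigs`) are hypothetical, and treating GTG and TTG as alternative bacterial start codons alongside ATG is an assumption made for illustration. Because the linker is its own reverse complement, a check on the forward strand also covers the reverse strand.

```python
# Minimal sketch (assumed helper names): verify that the linker carries stop
# and start codons in all three forward frames, and that it is palindromic
# so the same holds for the three reverse frames.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODONS = {"ATG", "GTG", "TTG"}  # GTG/TTG as alternative starts: an assumption
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

LINKER = "NNNNNCACACACTTAATTAATTAAGTGTGTGNNNNN"


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (N stays N)."""
    return seq.translate(COMPLEMENT)[::-1]


def frames_with(seq, codons):
    """Return the set of reading frames (0, 1, 2) in which any of `codons` occurs."""
    return {i % 3 for i in range(len(seq) - 2) if seq[i:i + 3] in codons}


def join_contigs(contigs, linker=LINKER):
    """Concatenate ordered contigs into one pseudo-molecule separated by the linker."""
    return linker.join(contigs)


if __name__ == "__main__":
    print("stop frames, forward:", sorted(frames_with(LINKER, STOP_CODONS)))
    print("start frames, forward:", sorted(frames_with(LINKER, START_CODONS)))
    print("palindromic:", reverse_complement(LINKER) == LINKER)
```

A stop codon in every frame prevents a gene caller from predicting an open reading frame that runs across a contig boundary.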

Functional annotation, calculation of core and specific content

The predicted genes were scored using BLASTp against a protein database consisting of genes from 36 complete genomes from the NCBI RefSeq database. The output was filtered with identity and query coverage cutoffs of 90 and 70%, respectively. The proteins were then assigned functional categories based on their best hit obtained in the Basic Local Alignment Search Tool (BLAST) alignment. The other 22 draft genomes reported from Malaysia were downloaded and their gene prediction was performed as described in the previous section. The core genome was calculated by identifying orthologs in every genome by applying the Markov cluster (MCL) algorithm included in the OrthoMCL program (35). The parameters for deciding orthologs, identity and e-value cutoff, were set to 80% and 0.00001, respectively. Genes encoding fewer than 50 amino acids were excluded from the analysis. The clusters that contained orthologs in all the strains constituted the core, while those that did not have a corresponding ortholog in any of the other genomes were considered strain specific. The identified gene clusters were assigned functional categories after comparison with the COG database using the rpsBLAST program followed by manual curation of the results (36,37).
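The core and strain-specific definitions above can be illustrated with a short sketch. It assumes the ortholog clusters have already been computed (for example by OrthoMCL) and loaded into a dictionary; the data layout and the function name `classify_clusters` are hypothetical. Strain-specific clusters are reported separately here, although under the definition used later in the text they also belong to the accessory pool.

```python
# Minimal sketch (assumed data layout): classify ortholog clusters as core,
# accessory or strain-specific from the set of strains they occur in.
def classify_clusters(clusters, strains, min_len=50):
    """Split clusters into (core, accessory, strain_specific) lists of cluster ids.

    clusters: dict mapping cluster id -> list of (strain, protein_length) tuples
    strains:  iterable with the names of all strains being compared
    min_len:  proteins shorter than this many amino acids are ignored
    """
    all_strains = set(strains)
    core, accessory, specific = [], [], []
    for cluster_id, members in clusters.items():
        present = {strain for strain, length in members if length >= min_len}
        if not present:
            continue  # every member was below the length cutoff
        if present == all_strains:
            core.append(cluster_id)      # ortholog found in every strain
        elif len(present) == 1:
            specific.append(cluster_id)  # no ortholog in any other strain
        else:
            accessory.append(cluster_id)
    return core, accessory, specific


if __name__ == "__main__":
    toy = {
        "c1": [("A", 100), ("B", 120), ("C", 90)],
        "c2": [("A", 80), ("B", 85)],
        "c3": [("C", 200)],
        "c4": [("A", 30)],
    }
    print(classify_clusters(toy, ["A", "B", "C"]))
```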

Phylogenetic analysis

A whole genome alignment of the 27 genomes of the Malaysian isolates was carried out with 43 other H. pylori genomes (draft or complete) from the NCBI database using the Gegenees tool (38,39). The tool utilizes a fragmented alignment algorithm to calculate average similarity among the compared genomes using BLASTn. The fragment size can be set by the user. The tool was run with the fragment size set to 200 and a step size of 100 using BLASTn. The average similarity was calculated with a BLAST score threshold of 40%, generating a heat plot matrix that was further used to deduce phylogenetic relationships, exported in the form of a .nexus file. This nexus file was supplied as input to the SplitsTree program (40) for building an un-rooted phylogenetic tree employing the Neighbor-Joining algorithm.
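The fragmentation step can be sketched as follows. This is not Gegenees code. The windowing uses the fragment and step sizes quoted above, but the way the 40% threshold enters the average (fragments whose score, normalized to the self-hit score, falls below the threshold are left out) is only one plausible reading and is an assumption. Function names are hypothetical.

```python
# Minimal sketch (not the Gegenees implementation): cut a genome into
# overlapping fragments and average normalized per-fragment BLAST scores.
def fragments(genome, size=200, step=100):
    """Return overlapping fragments of `size` bases, starting every `step` bases."""
    return [genome[i:i + size] for i in range(0, len(genome) - size + 1, step)]


def average_similarity(scores, self_scores, threshold=0.40):
    """Average score/self_score over fragments at or above `threshold`.

    scores:      best BLASTn score of each fragment against the target genome
    self_scores: score of each fragment against its own genome (the maximum)
    Assumption: fragments below the threshold are excluded from the mean.
    """
    normalized = [s / ref for s, ref in zip(scores, self_scores) if ref > 0]
    kept = [x for x in normalized if x >= threshold]
    return sum(kept) / len(kept) if kept else 0.0


if __name__ == "__main__":
    print(len(fragments("A" * 500)))                           # windows at 0, 100, 200, 300
    print(average_similarity([200, 50, 150], [200, 200, 200]))  # mean of 1.0 and 0.75
```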

Virulence genes and phage detection

The available whole genomes were screened for the presence of virulence genes listed in the Virulence Factor Database (VFDB) using BLAST (41). The cutoffs for identity and query coverage were set to 70 and 60%, respectively. Further, comparison of the amino acid sequences of CagA and VacA was carried out using sequence-similarity-based alignments. Phage-related sequences in the genome were identified using the PHAST server, which integrates the analysis against various phage databases and compares key phage attributes to detect similar phage sequences in the query genome sequence (42).
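The identity and coverage cutoffs can be applied to tabular BLAST output with a small filter. The sketch below assumes a custom BLAST+ tabular layout with the columns qseqid, sseqid, pident, qstart, qend and qlen (for example `-outfmt "6 qseqid sseqid pident qstart qend qlen"`), with the VFDB gene as the query; both choices are assumptions, as are the function name and the toy hit names. Coverage is computed from a single alignment, which is a simplification when a hit is split across several high-scoring pairs.

```python
# Minimal sketch (assumed column layout): keep BLAST hits that pass the
# identity and query-coverage cutoffs described in the text.
def filter_hits(lines, min_identity=70.0, min_coverage=60.0):
    """Return (query, subject, identity, coverage) tuples for hits passing both cutoffs.

    Each line is tab-separated: qseqid, sseqid, pident, qstart, qend, qlen.
    """
    kept = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        qseqid, sseqid, pident, qstart, qend, qlen = line.rstrip("\n").split("\t")
        # Percentage of the query covered by this single alignment
        coverage = 100.0 * (int(qend) - int(qstart) + 1) / int(qlen)
        if float(pident) >= min_identity and coverage >= min_coverage:
            kept.append((qseqid, sseqid, float(pident), round(coverage, 1)))
    return kept


if __name__ == "__main__":
    hits = [
        "cagA\tUM045_0001\t92.5\t1\t900\t1000",
        "vacA\tUM045_0002\t65.0\t1\t1000\t1000",
        "oipA\tUM045_0003\t88.0\t1\t500\t1000",
    ]
    for hit in filter_hits(hits):
        print(hit)
```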

RESULTS AND DISCUSSION

Genome statistics and phylogenetic analysis

The whole genome sequencing of five Malaysian isolates (Figure 1) revealed their chromosome sizes ranging from 1.56 to 1.62 Mb. The genomes also revealed a low G+C content of 39%, which is characteristic of H. pylori. The draft genomes were predicted to encode ∼1600 genes with an average coding DNA sequence (CDS) measuring up to 930 bp. All the sequenced genomes harbored three rRNA operons as well as 36 tRNA genes. Two out of five sequenced strains (UM045 and UM054) also harbored phage sequences encoding 136 and 12 putative phage genes, respectively. Detailed genome statistics of these five isolates are given in Table 1, and a comparison with the remaining 22 genomes under study is given in Supplementary Table S1. The genomes were also compared using BLASTn against the reference strain G27, as shown in Figure 1.
Figure 1.

A circular representation of the genomes of Malaysian isolates: the draft genomes of 27 strains were aligned against the genome of reference strain H. pylori G27. Each genome is represented by a ring. The yellow rings represent H. pylori genomes from Malaysian-Chinese, purple represents those from Malays and light blue represents genomes of Malaysian-Indian strains. The G+C content (%) of the reference genome (strain G27) is represented by a ragged inner circle in black (GC). The variable regions such as plasticity zones (PZ) and cagPAI are compared across all the genomes using BRIG image generator (http://brig.sourceforge.net).

Table 1.

Genome statistics of the sequenced Malaysian H. pylori isolates

Strain                    UM018             UM054                 UM007      UM034      UM045
Origin                    Malaysian-Indian  Malaysian-Indian      Chinese    Chinese    Malay
Avg. genome coverage      170X              170X                  150X       180X       200X
No. of contigs            72                89                    80         27         28
Genome size (bp)          1 617 433         1 603 218             1 568 678  1 714 278  1 623 876
G+C (%)                   39.05             39.12                 38.84      38.59      38.96
CDS                       1579              1585                  1557       1669       1595
Avg. CDS length (bp)      937               926                   924        940        933
Coding (%)                91.5              91.6                  91.7       91.5       91.7
rRNA                      3                 4                     3          4          3
tRNA                      36                36                    36         36         36
cagA (EPIYA motif)        AB-C              AB-C                  AB-D       AB-D       AB-C
vacA                      s1m1              s2m2                  s1m2       s1m2       s1m2
Prophage                  Absent            Present (incomplete)  Absent     Absent     Present (intact)

The genomes of the sequenced isolates were pooled together with others from the NCBI database to construct a whole genome based phylogenetic tree. The phylogenetic tree demonstrated a clustering pattern of the various isolates similar to that reported for MLST-based phylogenetic trees (5,7,9). The strains co-clustered according to their genetic relatedness, exhibited by the formation of distinct clusters, and could be grouped according to their geographical affinities as shown in Figure 2. The strains affiliated to European countries formed the hpEurope cluster, whereas those from the African continent clustered into hpAfrica1 and hpAfrica2. The East-Asian genotype (hpEastAsia) has been further subdivided into three subpopulations: hspEastAsia (found in Japan and China), hspAmerind (found among Native Americans) and hspMaori (found among Taiwanese aboriginals, Melanesians and Polynesians) (43). Although studies based on MLST have indicated the existence of three lineages (6,10,19) in South Asia (hpEurope, hpAsia2 and hpEastAsia), a comprehensive understanding of their phylogeny, evolution and adaptation could not be achieved, perhaps because of the scarcity of available genome sequences. The rapid increase in the number of available genome sequences has provided a better opportunity to achieve greater resolution in classifying H. pylori strains by whole genome comparative studies.
Figure 2.

The whole genome phylogenetic analysis: the figure represents a whole genome comparison-based phylogenetic tree of various complete and draft H. pylori genomes from different geographical regions. The tree was constructed based on neighbor joining algorithm using SplitsTree. The Malaysian strains used in the analysis are labeled in black whereas the other genomes are colored to represent their genotypes.

The isolates from South India and those from Malaysian-Indians clustered tightly, forming a group that we named hpSouthIndia. This close phylogenetic association among Indian and Malaysian-Indian strains is in accordance with the findings of Tay et al. based on MLST typing (19). Moreover, UM045 and UM037, isolated from Malay and Malaysian-Indian patients, respectively, clustered with the hpEurope genotype, suggesting their European ancestry (11). Further analysis of only the Malaysian H. pylori genomes also revealed a bipartite clustering that was supported by a similarity score matrix, as shown in Figure 3. All the strains of Malay and Malaysian-Indian (European) origin exhibited more similarity to each other, allowing them to cluster away from the Malaysian-Chinese (East-Asian) strains. These findings also appear to support the hypothesis of ancient human migration entailing ancestral Indians (11) and their subsequent migration to South Asia including Malaysia (6,19). Similarly, the affinity of Malaysian-Chinese strains with hspEastAsia reiterated their common ancestry. On the whole, the phylogenetic analysis indicated a mixed population genetic structure of H. pylori in the Malaysian population. These differential genotypes might explain the observed discrepancy in the colonization rates and disease outcomes among various ethnic groups (44–46).
Figure 3.

The analysis of 27 Malaysian genomes: the Neighbor-Joining phylogenetic tree constructed after the alignment of 27 Malaysian H. pylori isolates representing various ethnic groups. The heat plot shows average similarity values among the strains.


Virulence potential

The observed phylogenetic distinction among the Malaysian isolates was further investigated for the presence of differential virulence gene content. Various comparative studies have reported high polymorphism among different H. pylori lineages. Therefore, we sought to analyze the status of OMPs among the 27 Malaysian H. pylori isolates. Most of the 62 OMPs were conserved across all the Malaysian genomes, with minor exceptions. The BLASTn similarity percentage for these genes varied from 84 to 100, indicating their polymorphic nature. A few genes such as hopZ, hopMN and hopQ (sabB) varied among the strains, but we could not identify a lineage/group-specific pattern among the East-Asian and other strains. Genes such as homA and homB were also found to be variably present among the genomes. The status of genes encoding these OMPs is shown in Supplementary Table S2. Some of these genes correspond to critical virulence determinants induced upon host cell contact. These OMPs play an important role in adhesion and are reported to be associated with increased pro-inflammatory responses (47). OMPs in H. pylori have been categorized into five different families based on their structural composition and are known to carry out various functions ranging from host–surface interactions to non-selective porins for import of ions (48,49). The high recombination capability of H. pylori (50) and its natural competence (51) make it difficult to draw conclusive inferences about its virulence apparatus. Nevertheless, various computational and functional efforts have revealed a number of genes implicated in the pathogenesis of H. pylori. The virulence factor database (VFDB) (41) lists all the reported and predicted virulence markers for pathogenic organisms including H. pylori. We determined the status of all 57 virulence markers in the Malaysian H. pylori genomes using BLASTp (Supplementary Table S3). 
All the strains harbored an intact cagPAI including a conserved cagA gene. Other virulence markers such as oipA, vacA and flgG, which have been associated with severe disease phenotypes, were also consistently present. Further, all the genomes possessed components of the urease cluster, which allows H. pylori to survive under low pH conditions. The analysis thus revealed a high virulence potential encoded by the genomes irrespective of the ethnic groups that they represented. However, analysis of gene polymorphisms in cagA revealed lineage-specific patterns. The CagA protein encoded by the cagPAI is highly correlated with severe gastric outcomes (45,52). The extraordinary virulence potential of CagA has earned it the name of a bacterial ‘oncoprotein’ (53). Phylogenetic analysis of CagA could clearly differentiate East-Asian (Malaysian-Chinese) strains from their non-East-Asian (Malay and Malaysian-Indian) counterparts (Supplementary Figure S1). The analysis of the alignment revealed lineage-specific variation not only at the C-terminal EPIYA motifs but also in the N-terminal region. The Malaysian-Indian and Malay strains possessed AB-C-type EPIYA motifs, whereas all the Malaysian-Chinese strains had AB-D-type motifs. These findings are in line with others and suggest a differential evolution of this protein among isolates of different lineages and its probable role in the observed disparity of the disease outcomes (13).

The core genome of Malaysian H. pylori

The gene content analysis of Malaysian isolates was carried out by calculating the core and accessory genome content. The genes that shared orthologs in all the genomes constituted the core, while the accessory gene pool was constituted by those gene clusters which did not have orthologs in all the genomes. All the genes from the 27 strains formed a total of 1993 orthologous gene clusters. Among them, 1266 clusters comprised orthologs in all the genomes, representing the core gene pool, whereas the remaining 727 gene clusters formed the variable or accessory gene pool, which is in accordance with previous reports (54). Out of the 1266 core gene clusters, 1005 found a significant match in the COG database and were assigned functional categories as shown in Figure 4A, while the remaining 261 gene clusters remained uncategorized. Among these 261 gene clusters, a majority were found to encode putative hypothetical proteins based on their comparison with other H. pylori genes. Out of the 1005 functionally categorized gene clusters, 115 encoded proteins involved in translation, ribosomal structure and biogenesis. Studies have shown that, in addition to performing housekeeping functions, some proteins such as Pol I also aid in generating genome plasticity (55). We found the core genome to be enriched with genes related to cell wall biosynthesis and amino acid/ion transport. The existence of a significant proportion of these transport-related genes may be suggestive of an increased dependence of H. pylori on host metabolites, which could possibly be a result of its long association with the host. Interestingly, 81 core gene clusters were identified as belonging to multiple functional classes. These genes could represent proteins involved in multiple pathways (56). Multifunctional proteins could also be advantageous to H. pylori, with its small genome and limited coding potential, but this requires further functional validation. 
Moreover, a high proportion of hypothetical proteins in the core genome also warrants their functional characterization to ascertain their role in the biology and pathogenesis of this gastric pathogen.
Figure 4.

The functional COG classification of genes: (A) the COG functional classification representing core genome of the Malaysian H. pylori isolates. (B) Core and specific gene content observed among various strains compared in the study.

H. pylori has been reported to possess a high strain-specific gene content largely localized in hypervariable regions known as plasticity zones (57). A few genes from plasticity regions have been reported to be associated with increased pro-inflammatory secretion in cell culture studies. Recent studies on jhp0940 and hp0986 have provided strong evidence for their role in the induction of pathogenic phenotypes (58–60). Of the 27 Malaysian strains, 11 harbored hp0986 and belonged to the non-East-Asian genotype. Further, jhp0940 was found to be present in seven of the East-Asian strains and one non-East-Asian strain. The presence of these genes also reflects their importance in the pathogenesis of this pathogen. The strain-specific content of Malaysian genomes varied from 10 to 12 genes per genome (Figure 4B). A high proportion of these strain-specific genes was predicted to encode hypothetical proteins, while a few of them encoded putative restriction-modification (RM) systems in some strains. A total of 749 strain-specific genes were identified among the Malaysian strains, of which 15 genes were found to encode putative type II restriction or modification related functions. Interestingly, these were mostly prevalent among the Malaysian-Chinese strains, which appear highly virulent, and they therefore warrant further functional characterization of their possible role in the pathogenesis of H. pylori.

Lineage-specific genes

Our phylogenetic analysis classified Malaysian isolates into three distinct genotypes. The strains belonging to hpEurope were more similar to hpSouthIndia strains than to hspEastAsia strains. We therefore divided the strains into two groups in accordance with their phylogenetic clustering and genomic identity. In total, 14 Malaysian-Chinese strains represented the East-Asian group, while 13 Malaysian-Indian and Malay strains constituted the non-East-Asian group. The core genome content was calculated for each group separately from the same orthologous cluster file. The East-Asian core genome possessed 1299 orthologous gene clusters, whereas 1301 gene clusters formed the core content among non-East-Asian genomes. The comparison of these two core genomes revealed 33 clusters conserved among East-Asian but variable among non-East-Asian genomes. Out of these 33 gene clusters, four did not have orthologs in any of the non-East-Asian strains; one of them was predicted to encode a putative lysozyme-like protein (Table 2). Lysozyme-like proteins have been observed to be upregulated during DNA-damage-induced stress in H. pylori (50). The other three encode hypothetical proteins that await further functional characterization. A clear understanding of the proteins encoded by these genes could provide significant insights into the underlying distinction between East-Asian and non-East-Asian genotypes and associated disease outcomes (61).
Table 2.

Genes differentially present among the core of East-Asian (EA) and non-East-Asian (Non-EA) H. pylori

            Status in H. pylori strains
Cluster ID  EA (n = 14)  Non-EA (n = 13)  Predicted function                       Ortholog in 26695
2426        14           0                Lysozyme family protein                  HP0339
2425        14           0                Hypothetical protein                     HP0344
2424        14           0                Hypothetical protein                     HP0346
2423        14           0                Hypothetical protein                     Absent
2403        14           2                Lipopolysaccharide biosynthesis protein  Absent
2376        14           4                Type II restriction endonuclease         HP1537
2375        14           4                Type II methylase                        Absent
2370        14           5                Glycosyltransferase                      Absent
2364        14           4                DNA methyltransferase                    HP0051
2402        1            13               Hypothetical protein                     Absent
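
The per-group counts in Table 2 follow directly from the ortholog cluster membership. A minimal sketch, assuming each cluster has been reduced to the set of strains that carry it; the data layout, function names and toy strain labels are hypothetical and not taken from the authors' pipeline.

```python
# Minimal sketch (assumed data layout): count, for each ortholog cluster, how
# many strains of each group carry it, then pick clusters that are present in
# every strain of one group and absent from the other.
def group_presence(clusters, group_a, group_b):
    """Return {cluster id: (n strains in group_a, n strains in group_b)}."""
    a, b = set(group_a), set(group_b)
    return {cid: (len(strains & a), len(strains & b)) for cid, strains in clusters.items()}


def lineage_specific(table, n_a):
    """Cluster ids found in all `n_a` strains of group A and in none of group B."""
    return sorted(cid for cid, (in_a, in_b) in table.items() if in_a == n_a and in_b == 0)


if __name__ == "__main__":
    east_asian = ["C1", "C2"]      # toy stand-ins for Malaysian-Chinese strains
    non_east_asian = ["I1", "M1"]  # toy stand-ins for Malaysian-Indian and Malay strains
    toy = {
        "x": {"C1", "C2"},
        "y": {"C1", "C2", "I1"},
        "z": {"C1", "C2", "I1", "M1"},
    }
    table = group_presence(toy, east_asian, non_east_asian)
    print(table)
    print(lineage_specific(table, len(east_asian)))
```

With the real data, `lineage_specific` would correspond to the four clusters listed with 14 East-Asian and 0 non-East-Asian strains.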

RM genes

Previous studies on H. pylori genomes revealed that a proportion of their genes encode RM systems (62). In our collection of 27 Malaysian genomes, a total of 1077 genes were predicted to encode RM-related functions. Further, clustering by UCLUST (63) at an identity of 80% arranged these genes into 149 clusters. We then analyzed the RM gene content of East-Asian (Malaysian-Chinese) and non-East-Asian (Malay and Malaysian-Indian) strains separately. It was observed that East-Asian strains together contained 698 genes, whereas non-East-Asian strains had only 379 genes. Thus, East-Asian strains harbored, on average, 52 RM genes per strain, much higher than the 29 RM genes per strain for non-East-Asian strains (Figure 5). In addition, the distribution of RM genes among the compared genomes revealed a higher proportion of genes in East-Asian strains, as shown in Figure 5. This analysis, therefore, clearly outlines the extent of diversity of RM genes present in H. pylori, in terms of both gene numbers and alleles. This might explain the observed strain-to-strain diversity in H. pylori. The higher proportion of RM genes in East-Asian strains is striking and warrants further functional validation. The role of RM genes in regulating gene expression and virulence of H. pylori is being actively pursued. Recent studies have shown that inactivation of RM genes leads to changes in the expression of several genes in H. pylori (64). Moreover, these RM genes have also been shown to exhibit phase variation (65). Therefore, a clear understanding of the roles played by these RM genes in a host/lineage-specific manner would be necessary to better understand the mechanisms of differential host adaptation in H. pylori.
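The per-strain averages can be recomputed from the group totals quoted above. The short calculation below uses only numbers stated in the text (698 and 379 RM genes; 14 East-Asian and 13 non-East-Asian strains); the function name is hypothetical. Dividing the totals by the group sizes gives roughly 49.9 and 29.2 genes per strain. The East-Asian value is slightly below the 52 quoted in the paragraph, and the text does not state how that average was derived.

```python
# Minimal sketch: mean number of RM genes per strain from the group totals.
def mean_per_strain(total_genes, n_strains):
    """Arithmetic mean of RM genes per strain in a group."""
    return total_genes / n_strains


ea_total, non_ea_total = 698, 379    # RM genes per group, from the text
ea_strains, non_ea_strains = 14, 13  # group sizes, from the text

ea_mean = mean_per_strain(ea_total, ea_strains)
non_ea_mean = mean_per_strain(non_ea_total, non_ea_strains)

if __name__ == "__main__":
    print(f"total RM genes: {ea_total + non_ea_total}")
    print(f"East-Asian mean: {ea_mean:.1f}")
    print(f"non-East-Asian mean: {non_ea_mean:.1f}")
```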
Figure 5.

The distribution of RM genes in various strains: the graph shows the number of genes annotated to encode putative RM functions in each strain. The Y-axis represents the number of genes and the X-axis denotes strain names.


Differentially evolving genes

It has been proposed that pathogenic bacteria that resort to long-term adaptation to a particular niche modulate their core gene repertoire in synchrony with their virulence complement to gain a fitness advantage (66). Therefore, we attempted to identify core gene clusters that show some evidence of differential evolution between East-Asian and non-East-Asian strains. The core gene clusters were analyzed by constructing a gene-based phylogeny for each cluster and searching for those clusters whose trees distinguished East-Asian strains from non-East-Asian strains. The analysis identified 311 out of 1266 core gene clusters with possible signs of differential evolution. Out of these 311, only 239 could be assigned to functional categories, while the rest did not find a significant hit in the COG database. Their functional categorization revealed an enrichment of genes with functions related to cell-wall/membrane biogenesis, recombination and repair, plus some others with poorly characterized functions (Figure 6). The latter included various OMPs that have been shown to be differentially evolving among East-Asian and non-East-Asian genomes (67). Even cagA and vacA, which are known to be differentially evolving among East-Asian-type and non-East-Asian-type strains, showed up in our analysis. The core genome thus possesses a significant number of differentially evolving genes. This also mirrors the differential adaptive and evolutionary pressures experienced by these isolates.
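One simple way to decide whether a gene tree "distinguishes" the two groups is to test whether the tree contains a branch with all East-Asian strains on one side and all non-East-Asian strains on the other. The sketch below illustrates that criterion only; the text does not specify the exact test the authors applied, and the nested-tuple tree format, function names and toy strain labels are assumptions.

```python
# Minimal sketch (assumed criterion and tree format): a gene tree given as
# nested tuples of leaf names separates two groups if one of the groups forms
# a complete clade, i.e. a single branch splits the groups.
def leaf_sets(tree):
    """Return (leaves of `tree`, list of the leaf sets of all its subtrees)."""
    if isinstance(tree, str):
        leaf = frozenset([tree])
        return leaf, [leaf]
    leaves, found = frozenset(), []
    for child in tree:
        child_leaves, child_found = leaf_sets(child)
        leaves |= child_leaves
        found.extend(child_found)
    found.append(leaves)
    return leaves, found


def separates(tree, group_a, group_b):
    """True if some branch has exactly group_a on one side and group_b on the other.

    Assumes group_a and group_b together contain every leaf of the tree.
    """
    _, found = leaf_sets(tree)
    return frozenset(group_a) in found or frozenset(group_b) in found


if __name__ == "__main__":
    split_tree = ((("C1", "C2"), "C3"), (("I1", "M1"), "I2"))
    mixed_tree = (("C1", "I1"), ("C2", "M1"))
    print(separates(split_tree, {"C1", "C2", "C3"}, {"I1", "M1", "I2"}))
    print(separates(mixed_tree, {"C1", "C2"}, {"I1", "M1"}))
```

Checking both groups covers trees that happen to be rooted inside one of them, since the tree produced by Neighbor-Joining is un-rooted.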
Figure 6.

The functional classification of differentially evolving genes: the graph shows the COG functional classification of various core genes with signs of differential evolution among East-Asian and non-East-Asian strains. The Y-axis denotes the functional category and the X-axis represents the number of genes in a particular functional category.
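
One way to screen gene trees for clusters that distinguish the two groups is to ask whether some branch of the tree splits the strains exactly into East-Asian versus non-East-Asian. The sketch below illustrates that idea under stated assumptions: the text does not give the exact criterion used, so strict separation is assumed here, and the nested-tuple tree representation, strain labels and function names are all hypothetical. In practice the trees would be parsed from the output of a phylogeny program.

```python
def clade_leaf_sets(tree):
    """Return (all leaves under `tree`, leaf sets of every clade, tips included)."""
    if isinstance(tree, str):  # a tip is represented by its strain name
        tip = frozenset([tree])
        return tip, [tip]
    leaves = frozenset()
    clades = []
    for child in tree:
        child_leaves, child_clades = clade_leaf_sets(child)
        leaves |= child_leaves
        clades.extend(child_clades)
    clades.append(leaves)
    return leaves, clades


def separates_groups(tree, group_a):
    """True if one branch splits the tree into exactly `group_a` versus the rest."""
    all_leaves, clades = clade_leaf_sets(tree)
    group_a = frozenset(group_a) & all_leaves
    group_b = all_leaves - group_a
    if not group_a or not group_b:
        return False
    # Treat the tree as unrooted: either side of the split may appear as the clade.
    return any(clade == group_a or clade == group_b for clade in clades)


# Hypothetical gene trees over six placeholder strains.
east_asian = {"EA1", "EA2", "EA3"}
gene_trees = {
    "geneX": ((("EA1", "EA2"), "EA3"), (("NEA1", "NEA2"), "NEA3")),
    "geneY": ((("EA1", "NEA1"), "EA2"), (("EA3", "NEA2"), "NEA3")),
}

flagged = [name for name, tree in gene_trees.items() if separates_groups(tree, east_asian)]
print(flagged)
```

A strict split is a conservative choice; a more tolerant screen could allow a small number of misplaced strains or require bootstrap support on the separating branch.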


CONCLUSION AND FUTURE PERSPECTIVES

This study aimed to understand the genetic structure of H. pylori in Malaysia and to use the cues obtained to gain insights into the observed variation in disease outcomes. Whole genome phylogenetic analysis resolved the strains into three lineages representing patients/individuals from the various ethnic groups of a multicultural setting such as Malaysia. The conservation of most of the virulence-related genes in the core genome indicated a high pathogenic potential of the strains. A few genes were found to be more prevalent in East-Asian strains than in others, but these await further confirmation given the draft status of the genomes we analyzed. Further investigation of the core gene pool revealed a significant proportion of genes differentially represented/evolving among East-Asian and non-East-Asian strains. These differentially evolving genes included RM genes and OMPs. Given these findings, it is tempting to speculate that H. pylori could harness various mechanisms, such as surface antigen variation and virulence gene regulation, to persist effectively in the inhospitable microenvironment of the host. A careful analysis of these molecular interactions would also open avenues for the development of specific control strategies and drug interventions for H. pylori. A functional-level understanding of the preponderance and interplay of the virulence and core gene complements among different strains/lineages would allow a better understanding of the pathogen's biology and of host–pathogen interactions in different endemic settings.

ACCESSION NUMBERS

The whole genome sequences of the five Malaysian strains sequenced in this study have been submitted to the NCBI genome database under the following accession numbers: UM018 (AONK00000000), UM054 (AONL00000000), UM007 (AONM00000000), UM034 (AONN00000000) and UM045 (AONO00000000).

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.