Amjad Ali1, Anam Naz2, Siomar C Soares3, Marriam Bakhtiar4, Sandeep Tiwari3, Syed S Hassan3, Fazal Hanan5, Rommel Ramos6, Ulisses Pereira7, Debmalya Barh8, Henrique César Pereira Figueiredo9, David W Ussery10, Anderson Miyoshi3, Artur Silva6, Vasco Azevedo3.
Abstract
Helicobacter pylori is a human gastric pathogen implicated as the major cause of peptic ulcer and the second leading cause of gastric cancer (~70%) worldwide. At the same time, resistance to antibiotics is increasing and the development of vaccines against H. pylori remains hindered. This paper presents pan-genome analyses of 39 complete genomes of globally representative H. pylori isolates. Phylogenetic analyses revealed close relationships among geographically diverse strains of H. pylori. Conservation among these genomes was further analyzed by a pan-genome approach; the predicted conserved gene families (1,193) constitute ~77% of the average H. pylori genome and 45% of the global gene repertoire of the species. Reverse vaccinology strategies were adopted to identify and narrow down potential core immunogenic candidates. A total of 28 nonhost homolog proteins were characterized as universal therapeutic targets against H. pylori based on their functional annotation and protein-protein interactions. Finally, pathogenomics and genome plasticity analyses revealed 3 highly conserved and 2 highly variable putative pathogenicity islands across all of the H. pylori genomes analyzed.
Year: 2015 PMID: 25705648 PMCID: PMC4325212 DOI: 10.1155/2015/139580
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Table 1. List of genomes analyzed in this study. Genomic features, statistics, accession numbers (chromosome and plasmid), and pan-genomic data generated by analyzing 39 Helicobacter pylori strains. The numbers of predicted proteins, clusters, and new (unique) genes, and the sizes of the core and pan-genome, are estimated.
| Organisms | Size (bp) | Proteins | Chromosome | Plasmid | %AT | New genes | New | Pan-genome | Core |
|---|---|---|---|---|---|---|---|---|---|
| | 1548238 | 1503 | NC_017374.1 | | 60.698 | 1503 | 1491 | 1491 | 1491 |
| | 1562832 | 1508 | NC_017381.1 | | 60.706 | 29 | 29 | 1520 | 1464 |
| | 1667867 | 1579 | NC_000915.1 | | 61.124 | 207 | 189 | 1702 | 1323 |
| | 1566655 | 1500 | NC_017360.1 | | 61.132 | 106 | 103 | 1800 | 1291 |
| | 1589954 | 1515 | NC_017382.1 | | 61.230 | 80 | 79 | 1871 | 1278 |
| | 1568826 | 1509 | NC_017354.1 | | 61.059 | 63 | 62 | 1926 | 1273 |
| | 1617426 | 1559 | NC_017375.1 | | 61.277 | 87 | 84 | 2004 | 1268 |
| | 1549666 | 1511 | NC_017357.1 | | 60.701 | 24 | 24 | 2023 | 1264 |
| | 1576758 | 1502 | NC_012973.1 | | 60.840 | 61 | 52 | 2065 | 1237 |
| | 1680029 | 1579 | NC_014256.1 | NC_014257.1 | 61.215 | 51 | 50 | 2097 | 1235 |
| | 1635449 | 1536 | NC_017358.1 | | 61.136 | 73 | 72 | 2153 | 1227 |
| | 1669876 | 1564 | NC_017063.1 | NC_017064.1 | 61.123 | 57 | 55 | 2194 | 1224 |
| | 1575399 | 1508 | NC_017368.1 | | 61.115 | 28 | 26 | 2205 | 1225 |
| | 1579693 | 1498 | NC_017365.1 | NC_017369.1 | 61.197 | 19 | 19 | 2210 | 1225 |
| | 1581461 | 1506 | NC_017366.1 | NC_017370.1 | 61.143 | 21 | 21 | 2218 | 1223 |
| | 1609006 | 1530 | NC_017367.1 | | 61.272 | 16 | 16 | 2226 | 1221 |
| | 1663013 | 1577 | NC_011333.1 | NC_011334.1 | 61.130 | 51 | 51 | 2266 | 1221 |
| | 1712468 | 1589 | NC_017371.1 | NC_017364.1 | 60.876 | 28 | 28 | 2286 | 1220 |
| | 1605736 | 1508 | NC_008086.1 | NC_008087.1 | 60.933 | 32 | 32 | 2310 | 1220 |
| | 1607584 | 1507 | NC_017733.1 | NC_017734.1 | 60.957 | 37 | 35 | 2339 | 1219 |
| | 1675918 | 1572 | NC_017372.1 | | 61.102 | 33 | 32 | 2362 | 1217 |
| | 1643831 | 1505 | NC_000921.1 | | 60.811 | 18 | 18 | 2373 | 1216 |
| | 1640673 | 1555 | NC_017362.1 | NC_017363.1 | 61.134 | 42 | 42 | 2401 | 1212 |
| | 1684038 | 1587 | NC_011498.1 | NC_011499.1 | 61.213 | 22 | 22 | 2411 | 1206 |
| | 1660685 | 1543 | NC_017742.1 | | 60.981 | 17 | 17 | 2415 | 1208 |
| | 1638269 | 1532 | NC_014555.1 | NC_014556.1 | 61.089 | 27 | 27 | 2435 | 1208 |
| | 1637762 | 1532 | NC_017378.1 | NC_017377.1 | 61.095 | 26 | 26 | 2452 | 1206 |
| | 1646139 | 1539 | NC_017379.1 | | 61.176 | 14 | 14 | 2463 | 1205 |
| | 1679829 | 1573 | NC_017361.1 | NC_017373.1 | 61.575 | 48 | 47 | 2498 | 1206 |
| | 1644275 | 1579 | ALWV01 | | 61.240 | 32 | 30 | 2513 | 1205 |
| | 1567570 | 1464 | NC_017359.1 | NC_017356.1 | 60.908 | 15 | 15 | 2521 | 1205 |
| | 1663456 | 1561 | NC_017741.1 | | 61.227 | 10 | 10 | 2519 | 1205 |
| | 1616909 | 1516 | NC_017740.1 | | 61.136 | 6 | 6 | 2521 | 1205 |
| | 1665719 | 1545 | NC_017739.1 | | 61.229 | 14 | 14 | 2530 | 1204 |
| | 1608548 | 1518 | NC_010698.2 | | 61.229 | 7 | 7 | 2530 | 1203 |
| | 1658051 | 1524 | NC_014560.1 | | 61.100 | 15 | 15 | 2540 | 1203 |
| | 1610830 | 1514 | NC_017376.1 | NC_017380.1 | 61.004 | 12 | 12 | 2540 | 1202 |
| | 1595604 | 1510 | NC_017355.1 | NC_017383.1 | 61.055 | 15 | 15 | 2549 | 1199 |
| | 1656544 | 1701 | NC_017926.1 | NC_017919.1 | 61.427 | 89 | 89 | 2614 | 1193 |
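The percentages quoted in the abstract can be checked against the table above with a little arithmetic. The sketch below is illustrative only: it takes the final core (1,193) and pan-genome (2,614) sizes from the last row and the per-genome protein counts from the Proteins column. The variable names are ours, and treating the mean protein count as the "average H. pylori genome" is an assumption about how the ~77% figure was obtained.

```python
from statistics import mean

# Protein counts per genome, copied from the "Proteins" column of the table above.
PROTEINS = [
    1503, 1508, 1579, 1500, 1515, 1509, 1559, 1511, 1502, 1579,
    1536, 1564, 1508, 1498, 1506, 1530, 1577, 1589, 1508, 1507,
    1572, 1505, 1555, 1587, 1543, 1532, 1532, 1539, 1573, 1579,
    1464, 1561, 1516, 1545, 1518, 1524, 1514, 1510, 1701,
]
CORE = 1193  # core gene families after the last genome is added
PAN = 2614   # pan-genome gene families after the last genome is added


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Assumption: "average genome" is approximated by the mean protein count.
core_vs_average_genome = percent(CORE, mean(PROTEINS))
core_vs_pan = percent(CORE, PAN)

print(f"genomes: {len(PROTEINS)}")
print(f"core / mean proteome: {core_vs_average_genome:.1f}%")
print(f"core / pan-genome:    {core_vs_pan:.1f}%")
```

Under these assumptions both ratios land close to the ~77% and 45% reported in the abstract.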
Figure 2. Whole genome/proteome pairwise alignment and comparative analysis. The translated genomes of all available H. pylori strains were compared by BLASTp. Pairwise comparisons across H. pylori proteomes are plotted in a BLAST matrix. The proteome shared between any two H. pylori genomes and the percentage similarity are calculated and shown in the corresponding boxes, where color intensity indicates the similarity. The diagonal row of boxes in the matrix shows the internal homology of each proteome against itself.
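The off-diagonal cells of such a matrix can be sketched as a pairwise comparison of gene-family sets. This is a minimal illustration, not the tool used in the paper: the strain names and family IDs are hypothetical, family assignment is assumed to have been done already (for example by clustering BLASTp hits), and expressing the shared fraction relative to the union of both genomes is our assumption. The diagonal (internal homology) is not modeled.

```python
from itertools import combinations

# Hypothetical toy input: genome name -> set of gene-family IDs found in it.
FAMILIES = {
    "strainA": {1, 2, 3, 4, 5},
    "strainB": {1, 2, 3, 6},
    "strainC": {1, 2, 7, 8, 9, 10},
}


def shared_percent(a, b):
    """Families present in both genomes, as a percentage of their union."""
    union = a | b
    return 100.0 * len(a & b) / len(union) if union else 0.0


def similarity_matrix(families):
    """Return {(name_x, name_y): percent shared} for every unordered pair."""
    names = sorted(families)
    return {
        (x, y): shared_percent(families[x], families[y])
        for x, y in combinations(names, 2)
    }


matrix = similarity_matrix(FAMILIES)
for (x, y), pct in matrix.items():
    print(f"{x} vs {y}: {pct:.1f}% shared")
```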
Figure 1. Evolutionary history inferred by the Neighbor-Joining (NJ) method. MEGA6 was used for multiple sequence alignment and construction of the phylogenetic tree of all 39 H. pylori strains. Branch lengths were computed using evolutionary distances generated by the Maximum Composite Likelihood method.
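For readers unfamiliar with Neighbor-Joining, the sketch below shows a single joining step: computing the Q criterion from a distance matrix, picking the pair with the smallest Q, and deriving the two branch lengths to the new internal node. It is not MEGA6's implementation, and the five-taxon distance matrix is an invented toy example rather than Maximum Composite Likelihood distances from the paper.

```python
from itertools import combinations


def nj_first_join(names, dist):
    """Return the first pair joined by NJ and its two branch lengths.

    names: list of taxon labels
    dist:  dict mapping frozenset({a, b}) -> distance between a and b
    """
    n = len(names)
    # Row sums: total distance from each taxon to all the others.
    totals = {
        a: sum(dist[frozenset((a, b))] for b in names if b != a)
        for a in names
    }

    # Q(i, j) = (n - 2) * d(i, j) - r_i - r_j; the smallest Q is joined.
    def q(pair):
        a, b = pair
        return (n - 2) * dist[frozenset(pair)] - totals[a] - totals[b]

    a, b = min(combinations(names, 2), key=q)
    d_ab = dist[frozenset((a, b))]
    # Branch lengths from a and b to the new internal node.
    len_a = d_ab / 2 + (totals[a] - totals[b]) / (2 * (n - 2))
    len_b = d_ab - len_a
    return (a, b), (len_a, len_b)


# Toy symmetric distance matrix (illustrative values only).
NAMES = ["a", "b", "c", "d", "e"]
ROWS = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
DIST = {
    frozenset((NAMES[i], NAMES[j])): ROWS[i][j]
    for i in range(len(NAMES))
    for j in range(i + 1, len(NAMES))
}

pair, lengths = nj_first_join(NAMES, DIST)
print(pair, lengths)
```

A full tree is obtained by replacing the joined pair with the new node, recomputing distances to it, and repeating until only two nodes remain.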
Figure 3. Core and pan-genome estimation for the genus, H. pylori and non-pylori species. The figure shows the distribution of the core genome (1,193) and pan-genome (2,614) of the H. pylori species (Table 1). The pan-genome plot shows the total number of genes and gene clusters (families) for each genome (light gray), new gene families (dark gray), the pan-genome (blue), and the core genome (red). Genome names are given on the x-axis and the number of genes on the y-axis.
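The curves in a plot of this kind can be reproduced with simple set operations as genomes are added one at a time: the pan-genome is the running union of gene families, the core genome is the running intersection, and the new families are those not seen in any earlier genome. The sketch below uses hypothetical toy genomes and assumes gene families have already been assigned; it is not the pipeline used in the paper.

```python
def accumulate(genomes):
    """Accumulate pan- and core-genome sizes over an ordered genome list.

    genomes: list of (name, set_of_family_ids) in the order they are added.
    Returns rows of (name, new_families, pan_size, core_size).
    """
    pan = set()
    core = None
    rows = []
    for name, fams in genomes:
        new = fams - pan  # families not seen in any earlier genome
        pan |= fams       # running union
        core = set(fams) if core is None else core & fams  # running intersection
        rows.append((name, len(new), len(pan), len(core)))
    return rows


# Hypothetical toy data: three genomes with small family sets.
TOY = [
    ("g1", {1, 2, 3, 4}),
    ("g2", {1, 2, 3, 5}),
    ("g3", {1, 2, 6}),
]

rows = accumulate(TOY)
for name, new, pan_size, core_size in rows:
    print(f"{name}: new={new} pan={pan_size} core={core_size}")
```

With this scheme the pan-genome can only grow and the core can only shrink, and both depend on the order in which genomes are added.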
Biological categorization of candidate essential gene families (EGFs). The H. pylori EGFs were analyzed with CELLO and PSORTb to predict surface proteins. Molecular weights were then calculated with the ExPASy tool, and Blast2GO was used for functional annotation.
| Protein | Localization | Mol. weight (Da) | EC number | Sequence description |
|---|---|---|---|---|
| SeqID: /gene_family=“1495” | Cytoplasmic | 76937.16 | | |
| SeqID: /gene_family=“1623” | Outer membrane | 81336.76 | EC 3.6.3.3, EC 3.6.3.4, EC 3.6.3.5 | Copper-exporting ATPase |
| SeqID: /gene_family=“1487” | Extracellular | 51589.98 | EC 3.4.21 | Serine protease |
| SeqID: /gene_family=“1447” | Outer membrane | 24108.16 | | |
| SeqID: /gene_family=“821” | Inner membrane | 26170.79 | | |
| SeqID: /gene_family=“222” | Inner membrane | 61901.64 | | ABC transporter permease |
| SeqID: /gene_family=“197” | Inner membrane | 32817.21 | EC 2.5.1 | Prenyltransferase |
| SeqID: /gene_family=“577” | Inner membrane | 25149.00 | | Membrane protein |
| SeqID: /gene_family=“1486” | Inner membrane | 62563.95 | | Insertase |
| SeqID: /gene_family=“948” | Inner membrane | 52460.09 | | Lysine-specific permease |
| SeqID: /gene_family=“1028” | Cytoplasmic | 33346.74 | EC 1.1.1.8 | Glycerol-3-phosphate dehydrogenase |
| SeqID: /gene_family=“1439” | Inner membrane | 46137.77 | | Potassium transporter |
| SeqID: /gene_family=“1480” | Inner membrane | 53997.01 | | Sodium proline symporter |
| SeqID: /gene_family=“714” | Inner membrane | 49579.70 | | Sodium:neurotransmitter symporter family protein |
| SeqID: /gene_family=“1238” | Inner membrane | 115481.83 | | Cation efflux system protein |
| SeqID: /gene_family=“230” | Outer membrane | 30259.72 | EC 1.3.1.9 | Enoyl-ACP reductase |
| SeqID: /gene_family=“940” | Extracellular | 33739.71 | | |
| SeqID: /gene_family=“792” | Extracellular | 39410.96 | | |
| SeqID: /gene_family=“747” | Cytoplasmic | 32881.19 | | Domain protein |
| SeqID: /gene_family=“30” | Periplasmic | 30271.1 | | 50S ribosomal protein L2 |
| SeqID: /gene_family=“1043” | Periplasmic | 15669.30 | | |
| SeqID: /gene_family=“128” | Periplasmic | 37289.58 | EC 3.5.1.49 | Formamidases |
| SeqID: /gene_family=“428” | Periplasmic | 40467.09 | EC 1.1.1.1 | NADP-dependent alcohol dehydrogenase |
| SeqID: /gene_family=“664” | Cytoplasmic | 26515.58 | EC 3.6.1.3 | ABC transporter ATP-binding protein |
| SeqID: /gene_family=“928” | Periplasmic | 61103.25 | EC 2.3.2.2 | Gamma-glutamyltranspeptidase |
| SeqID: /gene_family=“1097” | Periplasmic | 65613.27 | EC 3.1.4.16 | 5-nucleotidase protein |
| SeqID: /gene_family=“88” | Cytoplasmic membrane | 21533.23 | | Membrane protein |
| SeqID: /gene_family=“1595” | Cytoplasmic membrane | 74214.44 | EC 2.4.1.129 | Penicillin-binding protein 1a |
| SeqID: /gene_family=“387” | Periplasmic | 62296.56 | | Peptide ABC transporter substrate-binding protein |
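The localization column above lends itself to a quick tally, which is how candidates are typically narrowed down in a reverse-vaccinology screen. The sketch below transcribes the predicted localization of each gene family from the table and counts the compartments. The choice of which compartments count as "exposed" (outer membrane, extracellular, periplasmic) is our assumption for illustration, not a criterion stated in the source.

```python
from collections import Counter

# Gene family ID -> predicted localization, transcribed from the table above.
LOCALIZATION = {
    "1495": "Cytoplasmic", "1623": "Outer membrane", "1487": "Extracellular",
    "1447": "Outer membrane", "821": "Inner membrane", "222": "Inner membrane",
    "197": "Inner membrane", "577": "Inner membrane", "1486": "Inner membrane",
    "948": "Inner membrane", "1028": "Cytoplasmic", "1439": "Inner membrane",
    "1480": "Inner membrane", "714": "Inner membrane", "1238": "Inner membrane",
    "230": "Outer membrane", "940": "Extracellular", "792": "Extracellular",
    "747": "Cytoplasmic", "30": "Periplasmic", "1043": "Periplasmic",
    "128": "Periplasmic", "428": "Periplasmic", "664": "Cytoplasmic",
    "928": "Periplasmic", "1097": "Periplasmic", "88": "Cytoplasmic membrane",
    "1595": "Cytoplasmic membrane", "387": "Periplasmic",
}

# Assumption: compartments treated as exposed for this illustration only.
EXPOSED = {"Outer membrane", "Extracellular", "Periplasmic"}

counts = Counter(LOCALIZATION.values())
exposed = sorted(
    (fam for fam, loc in LOCALIZATION.items() if loc in EXPOSED), key=int
)

for loc, n in counts.most_common():
    print(f"{loc}: {n}")
print(f"exposed candidates: {len(exposed)} of {len(LOCALIZATION)}")
```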
Figure 4. Functional characterization of proteins selected as potential vaccine targets. A total of 29 nonhuman homolog proteins were functionally annotated using Blast2GO, and the distribution of their molecular functions is shown as a graph.
Figure 5. Protein-protein interaction (PPI) map of targeted core proteins. An interactome was established between the prioritized core protein targets. Eleven proteins showed interactions with each other, revealing their collaboration in different pathways. The number and color of edges between nodes represent the type of evidence for the associations.
Figure 6. Pathogenicity islands in H. pylori genomes (pan-heatmap). Heatmap analysis shows a high degree of variability in most of the PAIs across all genomes. Among the 22 predicted PAIs, only PiHps 2, 4, 14, and 15 are present in at least 50% of the strains.
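The "present in at least 50% of the strains" criterion amounts to a prevalence filter over a presence/absence matrix. The sketch below is a minimal illustration with hypothetical island names and toy data; how presence is actually called from the heatmap (for example, a similarity cutoff) is not specified in the source and is assumed to have been decided beforehand.

```python
def prevalent_islands(presence, threshold=0.5):
    """Return islands present in at least `threshold` of the strains.

    presence: dict mapping island name -> list of booleans, one per strain.
    """
    kept = []
    for island, flags in presence.items():
        if flags and sum(flags) / len(flags) >= threshold:
            kept.append(island)
    return kept


# Hypothetical toy matrix: three islands scored across four strains.
TOY = {
    "PAI_A": [True, True, True, False],    # present in 75% of strains
    "PAI_B": [True, False, False, False],  # present in 25% of strains
    "PAI_C": [True, True, False, False],   # present in 50% of strains
}

kept = prevalent_islands(TOY)
print(kept)
```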
Figure 7. Conserved and variable pathogenicity islands in H. pylori. Putative pathogenicity islands predicted in H. pylori. H. pylori 26695 was selected as the reference genome (scaffold) for the analysis. All genomes were aligned, and a phylogenetically related nonpathogenic organism, Wolinella succinogenes DSM 1740, was included for comparison. PiHps 2, 4, 14, and 15 were found to be conserved ((a), (b), (c), and (d)), whereas PiHps 8, 13, and 9 ((e), (f), and (g)) were found to be variable among H. pylori genomes.