Zachary N Phillips1, Amy V Jennison2, Paul W Whitby3, Terrence L Stull3, Megan Staples2, John M Atack1,4.
Abstract
Non-typeable Haemophilus influenzae (NTHi) is a major human pathogen for which there is no globally licensed vaccine. NTHi has a strict growth requirement for iron and encodes several systems to scavenge elemental iron and heme from the host. An effective NTHi vaccine would target conserved, essential surface factors, such as those involved in iron acquisition. Haemoglobin-haptoglobin binding proteins (Hgps) are iron-uptake proteins localized on the outer membrane of NTHi. If the Hgps are to be included as components of a rationally designed subunit vaccine against NTHi, it is important to understand their prevalence and diversity. Following analysis of all available Hgp sequences, we propose a standardized grouping method for Hgps, and demonstrate greater diversity of these proteins than previously determined. This analysis demonstrated that genes encoding variants HgpB and HgpC are present in all strains examined, and almost 40% of strains had a duplicate, nonidentical hgpB gene. Hgps are also phase-variably expressed; the encoding genes contain a CCAA(n) simple DNA sequence repeat tract, resulting in biphasic ON-OFF switching of expression. Examination of the ON-OFF state of hgpB and hgpC genes in a collection of invasive NTHi isolates demonstrated that 58% of isolates had at least one of hgpB or hgpC expressed (ON). Varying expression of a diverse repertoire of hgp genes would provide strains with a method of evading an immune response while maintaining the ability to acquire iron via heme. Structural analysis of Hgps also revealed high sequence variability at the sites predicted to be surface exposed, demonstrating a further mechanism to evade the immune system, namely varying the surface, immune-exposed regions of the membrane-anchored protein. This information will direct and inform the choice of candidates to include in a vaccine against NTHi.
Keywords: Hgp; NTHi; invasive disease; iron acquisition; phase variation
Year: 2022 PMID: 35867873 PMCID: PMC9341677 DOI: 10.1093/femsle/fnac064
Source DB: PubMed Journal: FEMS Microbiol Lett ISSN: 0378-1097 Impact factor: 2.820
(a) The expression state (phase-varied ON or OFF) of hgpB (i) and hgpC (ii) in an invasive isolate collection was assessed via fragment length analysis. All strains had at least one hgpB and 96% had at least one hgpC (Figure S3, Supporting Information). (b) A summary of invasive isolates with at least one hgpB gene, at least one hgpC gene, or any hgpB or hgpC gene in-frame (ON). See Figure S3 (Supporting Information) for all data. We were unable to amplify a PCR product for any hgp gene from two of the invasive isolates, so these isolates were not included.
(a)(i) hgpB

| | | | | Total |
|---|---|---|---|---|
| No. | 58 | 28 | 3 | 89 |
| % | 65.2 | 31.5 | 3.4 | 100 |

Gene presence in 72 samples = 123.6%

(a)(ii) hgpC

| | | | | Total |
|---|---|---|---|---|
| No. | 61 | 18 | 2 | 81 |
| % | 75.3 | 22.2 | 2.5 | 100 |

Gene presence in 72 samples = 112.5%

(b)

| | At least one hgpB ON | At least one hgpC ON | Any hgpB or hgpC ON |
|---|---|---|---|
| No. | 32 | 19 | 42 |
| % | 44.4 | 26.4 | 58.3 |
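The ON-OFF switching summarized above follows from simple frame arithmetic: each CCAA unit adds 4 bp, so gaining or losing one repeat shifts the reading frame by one position, and the frame is restored every third repeat count. The sketch below illustrates that logic only; it is not the fragment length analysis used in the study. The function names, the toy sequence and the reference repeat count `REFERENCE_ON_REPEATS` are hypothetical, since the source does not state which repeat counts place hgpB or hgpC in-frame.

```python
import re


def count_ccaa_repeats(seq: str) -> int:
    """Return the number of units in the longest uninterrupted CCAA tract."""
    tracts = re.findall(r"(?:CCAA)+", seq.upper())
    return max((len(t) // 4 for t in tracts), default=0)


def phase_state(n_repeats: int, n_on_reference: int) -> str:
    """Classify a gene as ON (in-frame) or OFF (frameshifted).

    Each CCAA unit is 4 bp, so one repeat changes the frame by 4 mod 3 = 1.
    The gene is in-frame whenever the repeat count differs from a known
    in-frame count by a multiple of 3.
    """
    return "ON" if (n_repeats - n_on_reference) % 3 == 0 else "OFF"


# Hypothetical value: a repeat count assumed to leave the gene in-frame.
REFERENCE_ON_REPEATS = 20

# Toy sequence with 21 CCAA units (not a real hgp sequence).
example = "ATGAAA" + "CCAA" * 21 + "GGTACC"
n = count_ccaa_repeats(example)
print(n, phase_state(n, REFERENCE_ON_REPEATS))  # 21 OFF
```

Under this assumption, only one in three possible repeat counts yields an ON gene, which is consistent with a substantial share of genes being OFF at any one time.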
(a) The number of hgp genes was surveyed in 75 fully annotated genomes from NCBI, with the total number (No.) and the percentage relative to the number of genomes screened (%) shown. (b) A collection of invasive NTHi isolates was also surveyed for hgp genes; the total number of hgp genes detected and their grouping are shown. Further information on each gene, with subgrouping of alleles (e.g. hgpA1 and hgpA2), can be found in Table S2 (Supporting Information).
(a)

| | Genomes | hgpA | hgpB | hgpC | hgpD | hgpE | hgpF | hgpG |
|---|---|---|---|---|---|---|---|---|
| No. | 75 | 21 | 107 | 84 | 4 | 6 | 8 | 7 |
| % | 100 | 28.0 | 142.7 | 112.0 | 5.3 | 8.0 | 10.7 | 9.3 |

(b)

| | Isolates | hgpA | hgpB | hgpC | hgpD | hgpE | hgpF | hgpG |
|---|---|---|---|---|---|---|---|---|
| No. | 74 | 20 | 89 | 81 | 15 | 8 | 6 | 2 |
| % | 100 | 27.0 | 123.6 | 109.5 | 20.3 | 10.8 | 8.1 | 2.7 |
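The percentages in part (a) are gene counts divided by the number of genomes screened, so a value above 100% means that, on average, genomes carry more than one copy of that gene. A minimal sketch reproducing the part (a) row from the counts, in column order, is given below; the function and variable names are illustrative only.

```python
def percent_of_screened(gene_count: int, n_screened: int) -> float:
    """Gene count as a percentage of the genomes screened, to one decimal."""
    return round(100.0 * gene_count / n_screened, 1)


n_genomes = 75                           # fully annotated NCBI genomes, part (a)
gene_counts = [21, 107, 84, 4, 6, 8, 7]  # part (a) counts, in column order

percentages = [percent_of_screened(c, n_genomes) for c in gene_counts]
print(percentages)  # [28.0, 142.7, 112.0, 5.3, 8.0, 10.7, 9.3]
```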
Figure 2. (A) The location of the surface domains and heme-binding core within aligned HgpB protein sequences. The structure of HgpB (from strain NCTC13377) was predicted using AlphaFold (v2.1.2), with (B) side and (C) top-down views provided. The variable domains (VDs) of the Hgps are located in surface-exposed areas (white). The β-barrel structure was highly conserved within Hgp groups (blue). The heme-binding core was surface accessible and highly conserved between Hgp groups (red). (D) The variable (surface) domains (VD1–5) of HgpB contain highly variable sequences. Individual sequences were identified by aligning the sequences of each VD, separately from the whole protein sequence, in CLUSTAL OMEGA (v1.2.4) and viewing them using default settings in JalView overview (v2.1.1.7). A total of 102 HgpB protein sequences were included in the alignments. Amino acids are coloured according to the percentage in each column that agree with the consensus sequence, with % identity shown as blue, ranging from > 80% to > 40% identity. Grey areas represent gaps, and white areas indicate < 40% identity with the consensus sequence. VD1, the largest surface domain, had the highest sequence variability and was not separated into individual conserved sequences. VD2–VD5 were less diverse than VD1, so we were able to identify the individual variants within each of these VDs (numbered 1–6), indicated on the left-hand side of each individual VD alignment.
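The colouring described in the caption is a per-column calculation: for each alignment column, find the most common residue and the percentage of sequences that match it. A minimal sketch of that calculation on a toy alignment follows. It is an illustration, not JalView's implementation. The > 80% and > 40% bounds come from the caption, while the intermediate > 60% tier, the shade names and the choice to count gapped sequences in the denominator are assumptions.

```python
from collections import Counter


def column_identity(alignment):
    """Percentage of sequences matching the most common residue, per column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    result = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]  # gaps never form the consensus
        if not residues:
            result.append(0.0)
            continue
        _, top_count = Counter(residues).most_common(1)[0]
        # Assumption: gapped sequences still count towards the denominator.
        result.append(100.0 * top_count / len(column))
    return result


def shade(pct):
    """Map % identity to a shade; the 60% tier is an assumed intermediate."""
    if pct > 80:
        return "dark blue"
    if pct > 60:
        return "mid blue"
    if pct > 40:
        return "light blue"
    return "white"


# Toy alignment (not real Hgp sequences).
toy = ["MKTAG", "MKSAG", "MRTVG", "MKQL-", "MKTW-"]
identities = column_identity(toy)
print(identities)                      # [100.0, 80.0, 60.0, 40.0, 60.0]
print([shade(p) for p in identities])
```

A fully conserved column falls in the darkest tier, while a column where no residue is shared by more than 40% of sequences stays white, as the variable domains do in panel (D).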
Figure 1. (A) The primary NTHi hgp gene cluster is located immediately downstream of the fucI gene (encoding a fucose isomerase), with variable distance (30–80 kb) between multiple hgp genes located in this region. Our analysis demonstrated that there were at least two hgp genes within this primary cluster, but the hgp genes vary in number and orientation between individual strains. Additionally, a secondary hgp gene can be located between the bioA gene (encoding adenosylmethionine-8-amino-7-oxononanoate aminotransferase) and the pyk gene (encoding pyruvate kinase). This secondary site contains only a single hgp gene and is not present in all strains. (B) Alignment of Hgp amino acid sequences in H. influenzae NCBI fully annotated genomes. Protein sequences were aligned with CLUSTAL OMEGA (v1.2.4) and viewed using default JalView (v2.1.1.7) settings, visually representing % identity (% ID) between sequences. The number of sequences aligned is given in the ‘No.’ column. Amino acids are coloured according to the percentage in each column that agree with the consensus sequence, with % identity shown as blue, ranging from > 80% to > 40% identity. Grey areas represent gaps, and white areas indicate < 40% identity with the consensus sequence. We have categorized the previously broad Hgp groups (HgpA–D) using a > 70% identity cut-off to separate Hgps into groups (HgpA–G) and an 80% identity cut-off to separate alleles (e.g. HgpA1 vs. A2).
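The two cut-offs in the caption can be read as a two-step classification of any pair of aligned Hgp sequences. The sketch below shows one way to express that, under stated assumptions: the caption does not say how % identity was computed or whether the 80% boundary is inclusive, so the identity definition (matches over columns where at least one sequence has a residue), the inclusive 80% boundary and all names are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """% identity between two aligned sequences of equal length.

    Assumption: columns where both sequences have a gap are ignored, and a
    gap opposite a residue counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def classify(pid: float, group_cutoff: float = 70.0, allele_cutoff: float = 80.0) -> str:
    """Apply the > 70% group cut-off, then the 80% allele cut-off."""
    if pid <= group_cutoff:
        return "different group"
    if pid < allele_cutoff:  # assumption: exactly 80% counts as the same allele
        return "same group, different allele"
    return "same group, same allele"


# Toy sequences (not real Hgp sequences): one mismatch in ten positions.
pid = percent_identity("MKTAGLVPEQ", "MKTAGLVPEA")
print(pid, classify(pid))  # 90.0 same group, same allele
```

In practice the identities would come from a full Clustal Omega alignment rather than toy strings, and the boundary handling should be checked against the published grouping before reuse.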