Joseph J Wanford, Luke R Green, Jack Aidley, Christopher D Bayliss.
Abstract
Pathogenic Neisseria are responsible for significantly higher levels of morbidity and mortality than their commensal relatives despite having similar genetic contents. Neisseria possess a disparate arsenal of surface determinants that facilitate host colonisation and evasion of the immune response during persistent carriage. Adaptation to rapid changes in these hostile host environments is enabled by phase variation (PV) involving high-frequency, stochastic switches in expression of surface determinants. In this study, we analysed 89 complete and 79 partial genomes, from the NCBI and Neisseria PubMLST databases, representative of multiple pathogenic and commensal species of Neisseria using PhasomeIt, a new program that identifies putatively phase-variable genes and homology groups by the presence of simple sequence repeats (SSR). We detected a repertoire of 884 putative PV loci with maxima of 54 and 47 per genome in gonococcal and meningococcal isolates, respectively. Most commensal species encoded a lower number of PV genes (between 5 and 30) except N. lactamica, wherein the potential for PV (36–82 loci) was higher, implying that PV is an adaptive mechanism for persistence in this species. We also characterised the repeat types and numbers in both pathogenic and commensal species. Conservation of SSR-mediated PV was frequently observed in outer membrane proteins or modifiers of outer membrane determinants. Intermittent and weak selection for evolution of SSR-mediated PV was suggested by poor conservation of tracts, with novel PV genes often occurring in only one isolate. Finally, we describe core phasomes (the conserved repertoires of phase-variable genes) for each species that identify overlapping but distinctive adaptive strategies for the pathogenic and commensal members of the Neisseria genus.
Year: 2018 PMID: 29763438 PMCID: PMC5953494 DOI: 10.1371/journal.pone.0196675
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Flowchart and visual output of PhasomeIt.
(A) Outputs of PhasomeIt can be viewed visually on the index page. Green bars indicate that there is a homopolymeric tract within the open reading frame; orange bars indicate that there is an SSR close to the gene of interest (for example in a promoter region); grey bars indicate that there is a non-PV gene homologous to a PV gene in that same homology grouping; the remaining coloured bars indicate SSRs other than homopolymers, details of which can be found in the dataset below the visual output. (B) Gene groupings corresponding to the visual output are found in a table below. From here, functions, PV status in each strain and tract entries can be obtained for the grouping of interest. The full dataset from which this figure is derived, containing further phasome information not discussed in this manuscript, is available (https://figshare.com/s/d31b7b0b6ca4aeeb48df). A red outline highlights both the graphical and interactive outputs for the opa loci as an example.
Fig 2. Range of phase variable genes identified in each species.
Data show the median, range, and upper and lower quartiles of the number of PV genes, as indicated by the presence of a repeat tract. These data exclude gene groupings that contain dinucleotide repeat tracts, owing to insufficient evidence, both in the literature and in the loci discussed herein, of phase variation associated with dinucleotide repeats. Statistical analyses were performed with a Kruskal-Wallis test with Dunn’s multiple comparisons. NS, not significant; ***, p-value of <0.0005.
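As an illustration of the omnibus test named in this caption, the following is a minimal sketch of the Kruskal-Wallis H statistic with the standard tie correction, written in plain Python. The per-species PV gene counts are hypothetical placeholders, not data from the study, and neither the p-value (which requires a chi-squared distribution with k − 1 degrees of freedom) nor Dunn's post hoc comparisons are computed here.

```python
def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    rank_of = {}   # value -> average rank among the pooled observations
    tie_term = 0   # sum of (t^3 - t) over groups of tied values
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        t = j - i
        tie_term += t**3 - t
        i = j
    rank_sums = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_sums - 3 * (n + 1)
    correction = 1 - tie_term / (n**3 - n)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction


# Hypothetical PV gene counts per genome for three species (illustrative only).
example_counts = [[41, 45, 47], [50, 52, 54], [12, 18, 25]]
h_statistic = kruskal_wallis_h(example_counts)
```

A larger H indicates that the rank distributions of the groups differ more than expected by chance; pairwise differences would still need a post hoc test such as Dunn's.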
Variety of repeat types found in pathogenic and commensal Neisseria spp.
| SSR | Mean number of SSRs per genome (range of repeat numbers) | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| | 15.4 (9–26) | 18.2 (9–19) | 19.4 (9–39) | 5 (9–14) | 5.8 (9–40) | 8.6 (9–33) | 6.8 (9–36) | 10.2 (9–34) | 5.2 (9–36) |
| | 3.6 (11–30) | 1.7 (11–16) | 4.8 (11–23) | 1.8 (11–16) | 1.5 (11–14) | 4.8 (11–17) | 2.2 (11–17) | 3.4 (11–41) | 2 (11–17) |
| | 4.3 (6) | 4.1 (6) | 9.6 (6–7) | 2.8 (6) | 3.3 (6) | 4.3 (6) | 2.8 (6–7) | 4.6 (6–7) | 7.4 (6–7) |
| | 1.4 (5–34) | 0.9 (6–14) | 1.6 (17–23) | 0.2 (23) | 0 | 0.8 (11–15) | 0 | 0.4 (12–23) | 0.2 (9) |
| | 3.2 (5–24) | 8.1 (5–21) | 1.8 (8) | 0 | 0 | 0 | 0.2 (8) | 0.6 (8) | 0.2 (5) |
| | 1 (3) | 1 (3) | 1.2 (3) | 0.3 (3) | 0 | 0.6 (3) | 0.2 (3) | 0.8 (3) | 0 |
| | 0.8 (3) | 1 (3) | 1.2 (3) | 0.3 (3) | 0 | 1.8 (3) | 0.2 (3) | 0.8 (3) | 1.8 (3) |
| | 0.7 (3) | 0.3 (3) | 0.6 (3) | 0.3 (3) | 0 | 1 (3) | 0 | 0.4 (3) | 0 |
| | 0.6 (3–22) | 1 (10–26) | 0 | 0 | 0 | 0.4 (6–14) | 0 | 0 | 0 |
| | 0.7 (5–25) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
* The minimum repeat numbers required for detection of an SSR of a specific repeat unit length or type are as follows: G, 9; T, 11; dinucleotides and trinucleotides, 6; tetranucleotides and pentanucleotides, 5; and repeat units of between 6 and 9 nucleotides, 3.
Data are not comprehensive but comprise all SSRs that occur at >1% of all the repeat tracts identified in the 118 genomes.
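The detection thresholds in the footnote above can be sketched as a simple tract scanner. This is an illustrative sketch only, not PhasomeIt's implementation: the function names are hypothetical, and it assumes that the "G" and "T" thresholds also apply to their complements (C and A), which the footnote does not state explicitly.

```python
# Minimum repeat numbers taken from the table footnote.
# Assumption: the G and T thresholds are also applied to C and A respectively.
MIN_HOMOPOLYMER = {"G": 9, "C": 9, "T": 11, "A": 11}
MIN_BY_UNIT_LEN = {2: 6, 3: 6, 4: 5, 5: 5, 6: 3, 7: 3, 8: 3, 9: 3}


def min_repeats(unit):
    """Return the detection threshold for a repeat unit, or None if unsupported."""
    if len(unit) == 1:
        return MIN_HOMOPOLYMER.get(unit)
    if set(unit) <= set("ACGT"):
        return MIN_BY_UNIT_LEN.get(len(unit))
    return None


def is_primitive(unit):
    """True if the unit is not itself a repeat of a shorter unit (e.g. not 'GG')."""
    n = len(unit)
    return not any(n % k == 0 and unit == unit[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq):
    """Return sorted (start, unit, repeat_count) tuples for tracts meeting the thresholds."""
    seq = seq.upper()
    hits = []
    for unit_len in range(1, 10):
        i = 0
        while i + unit_len <= len(seq):
            unit = seq[i:i + unit_len]
            minimum = min_repeats(unit)
            count = 1
            # Count consecutive copies of the unit starting at position i.
            while seq[i + count * unit_len:i + (count + 1) * unit_len] == unit:
                count += 1
            if minimum is not None and is_primitive(unit) and count >= minimum:
                hits.append((i, unit, count))
                i += count * unit_len  # skip past the reported tract
            else:
                i += 1
    return sorted(hits)


example = "AT" + "G" * 9 + "TCT" + "GCCA" * 5 + "TT"
example_hits = find_ssrs(example)
```

The scanner reports the leftmost phase of each tract, so a tract flanked by a base matching the end of its unit may be reported as a rotation of the unit (for example AGCC rather than GCCA).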
Fig 3. Tract length distribution in different Neisseria species.
Data are represented by heat maps. Colour intensity represents the percentage that a given tract length comprises of the total number of identified tracts of that type for each species. ‘-’ is indicative of no identified repeats of the given length. Information on the numbers of strains for each species, and numbers of tract lengths analysed can be found in Table 1.
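The quantity encoded by colour intensity in this heat map can be sketched as follows. This is a minimal illustration assuming a flat list of observed tract lengths for one repeat type in one species; the function name and the input values are hypothetical.

```python
from collections import Counter


def tract_length_percentages(lengths):
    """Map each tract length to its percentage of all tracts of that type."""
    counts = Counter(lengths)
    total = sum(counts.values())
    return {length: 100.0 * n / total for length, n in sorted(counts.items())}


# Hypothetical poly-G tract lengths observed in one species (illustrative only).
example_percentages = tract_length_percentages([9, 9, 10, 12])
```

Lengths that never occur are simply absent from the result, which corresponds to the ‘-’ cells described in the caption.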
Distribution of phase variable genes between phase variable modules.
| PV module | Total PV † | PV in gene ‡ | PV intergenic § | Non-PV homologs \|\| | Ratio PV: non PV # |
|---|---|---|---|---|---|
| | 389 | 360 | 29 | 121 | 3.5:1 |
| | 248 | 191 | 57 | 119 | 2:1 |
| | 354 | 320 | 34 | 156 | 2.2:1 |
| | 147 | 142 | 5 | 79 | 1.9:1 |
| | 1498 | 507 | 972 | 2666 | 1:1.8 |
| | 2335 | 1245 | 1109 | 42402 | 1:18 |
| | 633 | 363 | 270 | 15403 | 1:24 |
| Total | 5604 | 3128 | 2476 | 60946 | N/A |
Due to the high copy number, and sporadic nature of PV in these genes, pilS loci were excluded from the pilin analysis.
† Total PV; the total number of phase variable genes—whether in the ORF or intergenic—mapping to each module.
‡ PV in gene; number of PV genes with repeat tracts in the ORF mapping to each module.
§ PV intergenic; number of PV genes with repeat tracts in intergenic regions.
|| Non-PV homologs; number of genes with homology to genes mapping to each module which do not contain a repeat tract.
# Ratio; the ratio of genic SSRs + intergenic SSRs: non-PV homologs.
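The ratio defined in the last footnote can be computed as in the sketch below. The function name is hypothetical, and formatting to one decimal place is an assumption; the table itself reports some ratios as whole numbers.

```python
def pv_ratio(pv_in_gene, pv_intergenic, non_pv_homologs):
    """Format (genic + intergenic PV genes) : non-PV homologs with the smaller side set to 1."""
    pv_total = pv_in_gene + pv_intergenic
    if pv_total <= 0 or non_pv_homologs <= 0:
        raise ValueError("both counts must be positive")
    if pv_total >= non_pv_homologs:
        return f"{pv_total / non_pv_homologs:.1f}:1"
    return f"1:{non_pv_homologs / pv_total:.1f}"


# Counts taken from two rows of the table above.
ratio_mostly_pv = pv_ratio(142, 5, 79)
ratio_mostly_non_pv = pv_ratio(363, 270, 15403)
```

Expressing the ratio with the smaller side normalised to 1 keeps modules dominated by PV genes and modules dominated by non-PV homologs directly comparable.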
Core phasome analysis of a subset of Neisseria species.
| Gene * ‡ | Species § \|\| | Function | Evidence for PV † | Tract (Range) | Location of SSR |
|---|---|---|---|---|---|
| | | Restriction modification | Known [ | GCCA (6–37) | Genic |
| | | Restriction modification | Known [ | CCAATG (7–31) | Genic |
| NMB2030 | | Restriction modification | None | GCCGC (3) | Genic |
| | | Pilin glycosylation | Known [ | G (9–21) | Genic |
| | | Pilin glycosylation | Known [ | CAAACAC (5–34) | Genic |
| | | Pilin glycosylation | Known [ | G (9–15) | Genic |
| | | Pilus retraction | Known [ | G (9–20) | Genic |
| | | Adhesion/invasion | Known [ | CTTCT (7–21) | Genic |
| | | LOS biosynthesis | Known [ | G (9–20) | Genic |
| | | LOS biosynthesis | Known [ | C (10–16) | Genic |
| | | Iron acquisition | Known [ | C (10–14) | Intergenic |
| NMB0032 | | Putative lipoprotein | Alignment | A (11–13) | Intergenic |
| | | Autotransporter | Known [ | C (9–13) | Genic |
| NMB0468 | | Arginine decarboxylase | None | TGTTTG (3) | Genic |
| NMB1460 | | ssDNA binding protein | None | GCCGC (3) | Genic |
| NMB1864 | | Glutamate-1-semialdehyde-2,1-aminomutase | None | CGGTTG (3) | Genic |
| NMB1352 | | YSIRK family signal peptide | None | AAGAA (3) | Genic |
| NMB1595 | | Alanyl-tRNA synthetase | None | CGCGCC (3) | Genic |
| | | Macrolide efflux | Alignment | CAGGG (3–4) | Intergenic |
| | | Topoisomerase IV subunit A | None | GGCGC (3) | Genic |
| | | Periplasmic protein | None | CCGCC (3) | Genic |
| NLA_12310 | | Adhesin | Alignment | C (9–10) | Intergenic |
* Genes found to be present in more than 90% of genomes analysed were considered the ‘core phasome’. Phase variable genes present in a smaller percentage of analysed genomes can be found at https://figshare.com/s/d31b7b0b6ca4aeeb48df.
† ‘Known’ indicates genes with previous evidence for PV; ‘Alignment’ indicates genes for which there is alignment-based evidence in this study; ‘None’ indicates that no switching in repeat length was identified in this analysis.
‡ Where NMB numbers are given, these refer to homologs in the meningococcal type strain MC58; where NGO numbers are given, these refer to the gonococcal reference strain FA 1090; where NLA numbers are given, these refer to the N. lactamica reference strain 020–06.
§ ++; PV copy present in >90% of genomes, +; PV copy present in <90% of genomes, -; gene absent from the genome assembly. Bracketed numbers indicate the maximum copy number of that gene observed in a respective species.
|| In this case, scoring was based on whether a single one of these loci was PV; further information on the SSRs in each of these genes can be found at https://figshare.com/s/d31b7b0b6ca4aeeb48df.
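The 90% criterion in the footnotes above can be sketched as a filter over per-gene presence data. This is a minimal illustration, not the authors' pipeline: the function name, the input layout (gene group mapped to the set of genomes carrying a PV copy) and the gene group labels are all hypothetical.

```python
def core_phasome(pv_presence, n_genomes, threshold=0.9):
    """Return gene groups whose PV copy occurs in more than `threshold` of genomes.

    pv_presence maps a gene group name to the set of genome IDs in which a
    phase-variable copy of that gene was detected.
    """
    return sorted(
        group
        for group, genomes in pv_presence.items()
        if len(genomes) / n_genomes > threshold
    )


# Hypothetical presence data for ten genomes (illustrative only).
genome_ids = [f"genome_{i}" for i in range(10)]
example_presence = {
    "group_A": set(genome_ids),      # PV copy in 10/10 genomes
    "group_B": set(genome_ids[:9]),  # PV copy in 9/10 genomes
    "group_C": set(genome_ids[:5]),  # PV copy in 5/10 genomes
}
example_core = core_phasome(example_presence, n_genomes=len(genome_ids))
```

Because the comparison is strict, a gene group present in exactly 90% of genomes is not counted as core, which follows the "greater than 90%" wording of the footnote.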
Exemplar gene groupings associated with in-frame and read-through phase variation.
| Gene grouping | PV | Non PV homologues | Associated tracts | Function |
|---|---|---|---|---|
| Peptidyl—prolyl cis-trans isomerase | 41 | 104 | GCCAAAGCT ( | Protein folding [ |
| ATP-dependent Clp protease | 14 | 69 | TGAAGA ( | Protein degradation [ |
| | 4 | 75 | G ( | LOS decoration [ |
* Exemplar genes identified with repeat tracts consisting of multiples of 3 nucleotides, indicative of ‘in-frame’ phase variation.
Exemplar genes identified as subject to 3’ ‘read-through’ PV.
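The ‘in-frame’ criterion in the first footnote rests on simple arithmetic: gaining or losing whole repeat units changes the sequence length by a multiple of the unit length, so the reading frame is preserved whenever that change is divisible by 3. A minimal sketch follows, with hypothetical function names and using repeat units listed in the table above as inputs.

```python
def is_in_frame_unit(unit):
    """True if any gain or loss of whole repeat units keeps the reading frame."""
    return len(unit) % 3 == 0


def preserves_frame(unit, delta_units):
    """True if adding or removing `delta_units` copies of `unit` keeps the frame."""
    return (len(unit) * delta_units) % 3 == 0


# Repeat units from the exemplar table.
example_units = {unit: is_in_frame_unit(unit) for unit in ("GCCAAAGCT", "TGAAGA", "G")}
```

For a homopolymeric tract such as G, a change of one or two repeats shifts the frame, whereas a change of three repeats restores it, so `preserves_frame("G", 3)` is true even though the unit itself is not an in-frame unit.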