Leonor Sánchez-Busó1, Daniel Golparian2, Jukka Corander1,3,4, Yonatan H Grad5,6, Makoto Ohnishi7,8, Rebecca Flemming9, Julian Parkhill1, Stephen D Bentley1, Magnus Unemo2, Simon R Harris10. 1. Pathogen Genomics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK. 2. WHO Collaborating Centre for Gonorrhoea and other Sexually Transmitted Infections, National Reference Laboratory for Sexually Transmitted Infections, Department of Laboratory Medicine, Clinical Microbiology, Faculty of Medicine and Health, Örebro University, Örebro, Sweden. 3. Helsinki Institute for Information Technology, Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland. 4. Department of Biostatistics, University of Oslo, Oslo, Norway. 5. Department of Immunology and Infectious Diseases, Harvard TH Chan School of Public Health, Boston, MA, USA. 6. Division of Infectious Diseases, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA. 7. Department of Bacteriology I, National Institute of Infectious Diseases, Tokyo, Japan. 8. Antimicrobial Resistance Research Center, National Institute of Infectious Diseases, Tokyo, Japan. 9. Faculty of Classics, University of Cambridge, Cambridge, UK. 10. Pathogen Genomics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK. sh16@sanger.ac.uk.
Abstract
The sexually transmitted pathogen Neisseria gonorrhoeae is regarded as being on the way to becoming an untreatable superbug. Despite its clinical importance, little is known about its emergence and evolution, and how this corresponds with the introduction of antimicrobials. We present a genome-based phylogeographical analysis of 419 gonococcal isolates from across the globe. Results indicate that modern gonococci originated in Europe or Africa, possibly as late as the sixteenth century and subsequently disseminated globally. We provide evidence that the modern gonococcal population has been shaped by antimicrobial treatment of sexually transmitted infections as well as other infections, leading to the emergence of two major lineages with different evolutionary strategies. The well-described multidrug-resistant lineage is associated with high rates of homologous recombination and infection in high-risk sexual networks. A second, multisusceptible lineage is more associated with heterosexual networks, with potential implications for infection control.
The sexually transmitted pathogen Neisseria gonorrhoeae is regarded as being on the way to becoming an untreatable superbug. Despite its clinical importance, little is known about its emergence and evolution, and how this corresponds with the introduction of antimicrobials. We present a genome-based phylogeographical analysis of 419 gonococcal isolates from across the globe. Results indicate that modern gonococci originated in Europe or Africa, possibly as late as the sixteenth century and subsequently disseminated globally. We provide evidence that the modern gonococcal population has been shaped by antimicrobial treatment of sexually transmitted infections as well as other infections, leading to the emergence of two major lineages with different evolutionary strategies. The well-described multidrug-resistant lineage is associated with high rates of homologous recombination and infection in high-risk sexual networks. A second, multisusceptible lineage is more associated with heterosexual networks, with potential implications for infection control.
Almost 360 million curable sexually transmitted infections (STIs) are estimated
to occur globally each year, with Neisseria gonorrhoeae, the causative
agent of gonorrhoea, infecting approximately 78 million1. The highest gonorrhoea burden is reported among men, although problematic
infections are more common in women for whom urogenital infections are often
asymptomatic. Unresolved urogenital infections can lead to severe complications and
sequelae, such as reproductive problems including infertility, serious eye infections in
newborns, and enhanced transmission of HIV2. The
emergence and proliferation of gonococci with resistance to front-line antimicrobials
such as extended-spectrum cephalosporins (ESCs; cefixime and ceftriaxone) and
azithromycin have contributed to, although do not explain, the increase in incidence of
gonorrhoea. Resistance to dual therapy (injectable ceftriaxone plus oral azithromycin),
the current recommended treatment in many countries, is fortunately rare3, however, decreased susceptibility to ceftriaxone
has been reported from all continents and azithromycin resistance is on the increase
globally4, raising fears that the
effectiveness of this regimen will be short-lived. Much of the focus of gonococcal
control is on particular high-risk sexual networks that often partake in unprotected sex
with multiple partners, particularly sex workers and men who have sex with men (MSM) but
also young heterosexuals. These groups are more frequently exposed to both infection and
antimicrobial treatment, which has led to these networks being the suspected drivers of
antimicrobial resistance (AMR)5. However, AMR is
not the only factor driving the recent success of N. gonorrhoeae. Dual
therapy is effective against the vast majority of infections, yet since its
introduction, gonorrhoea infections have continued to increase in most settings6.Georges Luys famously opened his medical textbook on gonorrhoea with the
statement that ‘Gonorrhoea is as old as mankind’7. However, despite N. gonorrhoeae often being
described as an ancient pathogen, there are no clear descriptions of a disease like
modern gonorrhoea in the ancient sources. Some compatible symptoms do appear in the
medical literature of classical Greece and Rome, but nothing decisive, and the presence
or absence of modern gonorrhoea in the ancient Mediterranean has been much debated as a
result8. Early modern terms like ‘the
clap’, ‘the pox’, or ‘the venereal disease’ also
covered a range of conditions, and it was not until 1879 that Albert Neisser identified
the bacteria that now bears his name9. AMR in
N. gonorrhoeae became apparent soon after antimicrobials were first
introduced for its treatment. One characteristic of N. gonorrhoeae that
has played an important role in its rapid gain and spread of AMR is its ability to
exchange DNA via homologous recombination both within its own species and with other
Neisseria species. For example, mosaic penicillin-binding protein 2
(PBP2; encoded by the penA gene) alleles gained via recombination have
been key in the emergence of resistance to ESCs10,11 which led to the replacement of
cefixime as the first-line treatment for gonorrhoea. The first mosaic
penA allele causing high-level ceftriaxone resistance was seen in
an isolate from a pharyngeal infection in a female sex worker in Japan in 200912, but similar mosaic penA
alleles have been seen worldwide2,11,13,14. In fact, a number of resistances have first
been identified in Japan, leading to the hypothesis that most AMR gonorrhoea originates
there, or elsewhere in the WHO Western Pacific Region2.Whole-genome sequencing has been successfully used to reveal the origins, global
spread and population structure of several human pathogens15. However, gonococcal genome sequencing has mostly targeted
specific populations and outbreaks16–20. Here, we report the findings of a global
genomic study of 419 N. gonorrhoeae isolates spanning five continents
and more than 50 years, including varying susceptibilities to important antimicrobials.
Our aim was to elucidate when and where modern gonococcal populations emerged, evolved
and dispersed, and how antimicrobial usage and transmission in different sexual networks
has influenced their population dynamics.
Results
Modern gonococcus is not ‘as old as mankind’
Our collection spans a period of more than 50 years (1960 – 2013)
and 58 countries from five continents (Supplementary Table 1, Figure
1). A population-level analysis revealed a high level of admixture
among N. gonorrhoeae with no significant differentiation
between continents (Supplementary Table 2), with the exception of Africa (Supplementary Figure 1,
Supplementary Tables 3-5). We estimated the substitution rate for the
non-recombining section of the genomes in the collection (Supplementary Figure 2)
to be 3.74E-06 substitutions/site/year CI (confidence interval)
[3.39E-06 – 4.07E-06], which is similar to
previous reports16,18 and comparable to other bacteria21. The time of the most recent common ancestor (tMRCA) was
estimated to be around the 16th century (1589, CI [1544 –
1623]) (Figure 2). Although high rates of
recombination can lead to underestimation of tMRCAs to some extent, these
results are strongly at odds with the hypothesis that modern gonorrhoea has
existed as long as mankind and cast further doubt on the ascribing of historical
descriptions of gonorrhoea-like symptoms to infection with
‘modern’ gonococci.
Figure 1
Geographic and phylogenetic distribution of Neisseria
gonorrhoeae isolates.
The map shows the countries of isolation of the strains in the collection
coloured by continent. The phylogeny shows the relationship among the strains
(n=419). Coloured strips show (from inside out) the continent of isolation
(CONT), year and further typing information (BAPS clusters, NG-MAST, MLST and
penA types; colours represent different types or alleles).
Mosaic penA types are marked in the outermost black strip.
Figure 2
Global phylogeographic analysis.
The dated maximum likelihood phylogenetic tree shows the posterior probabilities
for each continent in every node (pie charts). Continents of isolation (prior)
are shown as metadata next to the tips (n=419). The top left legend contains
information on the proportion of strains from different continents before
(n=121) and after (n=298) the introduction to Asia.
Despite modern gonococci being globally mixed, we found strong evidence
of historic geographical separation, suggesting rapid mixing of populations is a
relatively recent phenomenon. A phylogeographic analysis ascribed the origin of
our collection to Europe (60.9% inferred ancestry). However, when corrected for
biases in the number of samples from each continent, complementing with isolates
from a US study16, there was support for
an African origin (90.7% inferred ancestry) (Supplementary Figure 3 and
Supplementary Table 6). From this African root, we identified a
number of change-points in the continental distribution of isolates across the
tree (Supplementary Figure
3). Most of these were recent events, but the most significant
change-point separated a basal lineage containing a high proportion of African
isolates (68.2%, 30/44) from a lineage containing a high proportion of Asian
isolates (92.6%, 137/148), despite the temporal sampling from the two continents
being similar (Supplementary
Figure 3). When combined with the dating, this can be interpreted as
an early introduction of the modern gonococcus population into Asia (1617, CI
[1578 – 1649], Figure 2) soon after
its emergence. More recently, many re-introductions into the rest of the world
have occurred from this Asian lineage, contributing to the highly mixed
population observed today.
Emergence of antimicrobial resistant gonorrhoea
Minimum Inhibitory Concentrations (MICs) for six antimicrobials (Supplementary Figure 4)
and the occurrence of genetic AMR determinants were significantly higher among
the isolates belonging to the lineage that arose after the phylogeographic
breakpoint representing the initial introduction into Asia (Wilcoxon test
W=66159, p-value<0.0001) (Figures 3
and 4c). We will therefore refer to the 298
isolates after the breakpoint as lineage A and the 121 isolates before the
breakpoint as lineage B.
Figure 3
Evolution of antimicrobial resistance genetic determinants in
Neisseria gonorrhoeae.
Antimicrobial resistance determinants (chromosomal mutations and presence/absence
of the tetM and bla genes on
plasmids (p)) detected in the new 413 strains included in this study using
ARIBA56 and mapped on the maximum
likelihood dated tree. Purple represents presence of the determinant and orange
its absence. Grey indicates isolates possessing porB1a rather
than porB1b. The two main lineages are marked as A (n=294) and
B (n=119). The left graph shows the proportion of strains with each resistance
determinant for both lineages. Statistical significance from a two-sided test
for equality of proportions is also shown in the graph with asterisks.
****p-value<0.0001, ***p-value<0.001, **p-value<0.01,
*p-value<0.05.
Figure 4
Characterization of the lineages of Neisseria
gonorrhoeae.
Distribution of the proportions of homoplasic sites in all terminal (a) and short
terminal branches (<=100 SNPs) (b) in lineages A (n=298), B (n=121) and
all strains (n=419) represented as boxplots. Each point represents the
proportion of homoplasies in one branch drawn from the total variation found in
that branch. Horizontal box lines represent the first quartile, the median and
the third quartile. Whiskers extend from the first quartile – 1.5x the
interquartile range and the third quartile + 1.5x the interquartile range.
Statistical significance between lineages A and B was assessed using a two-sided
Wilcoxon test and shown as asterisks. (c) Distribution of the total number of
antimicrobial resistance genetic determinants in the strains of each lineage as
detected using ARIBA56 (lineage A n=294;
lineage B n=119). (d) Proportion of strains isolated from female (F, n=114) and
male (M, n=566) patients in each lineage obtained from combining the global
dataset with 376 isolates from two North-American genomic studies17,26 (total n=639; lineage A n=503; lineage B n=136). Asterisks show
the significance level from a two-sided test for equality of proportions with
continuity correction (c-d). ****p-value<0.0001, ***p-value<0.001,
**p-value<0.01.
Two AMR determinants, folP R228S, which reduces
susceptibility to sulfonamides, and rpsJ V57M, which reduces
susceptibility to tetracyclines, were carried by a large proportion of isolates,
especially in lineage A (Supplementary Table 1, Figure
3). Fifty-one isolates contained a mosaic penA
allele22. We identified three
independent gains of mosaic alleles, all in lineage A. In a clade of 59 isolates
with MLST ST1901, a first recombination event replaced the wild type allele with
a mosaic penA10 allele and a subsequent event replaced
penA10 with mosaic penA34. These two
alleles differ by 16 SNPs and a codon insertion in the last 105 bases of the
nucleotide sequence. Two isolates in this clade exhibited high MICs for both
cefixime (3-4 mg/L) and ceftriaxone (2 mg/L) (Supplementary Figure 5),
and these were found to possess penA42, which is a single SNP
(A501P) variant from penA3423,24. In another lineage,
associated with MLST ST7363, most isolates possessed the penA10
alleles, but we again observed a case of replacement with
penA34. Only one isolate carried the A2045G 23S rRNA mutation
(A2059G in Escherichia coli) that confers high-level resistance
to azithromycin. Six isolates carried the low-level azithromycin resistance
C2597T 23S rRNA mutation (C2611T in E. coli). Strikingly, the
plasmids carrying tetM and blaTEM
co-localized far more frequently than expected (Pearson’s
χ2=97.82, df=1, p-value<0.0001), possibly
reflecting the mobilization of pBlaTEM by the pConjugative plasmid25, and were completely absent from
ESC-resistant isolates (Figure 3). The
Gonococcal Genomic Island (GGI) was found in 277 (67%) isolates (Supplementary Figure 6),
but showed no clear association with AMR. The plasmid-encoded resistances showed
no significant difference in prevalence in lineages A or B (two-sided test for
equality of proportions for tetM: χ2=0.01,
95% CI [-0.089, 0.110], df=1, p-value=0.92, and for
blaTEM: χ2=0.88, 95% CI
[-0.046, 0.147], df=1, p-value=0.35). In contrast, of the 29
chromosomally-mediated resistance substitutions examined, 18 were significantly
associated with clade A (Figure 3).
Importantly, based on our phylogenetic dating, the majority of occurrences of
these 29 determinants were estimated to have been acquired after the
introduction of the antimicrobial against which they act (Supplementary Figure
7).
Two strategies for gonococcal success
Overall, our data show far fewer gains of chromosomally-encoded AMR
determinants in lineage B compared to A (Supplementary Figure 8). Since these determinants primarily
spread through the population via homologous recombination, such differences
could be explained by differences in recombination frequency. To assess this, we
compared the proportion of homoplasic sites, an indicator of recombination, in
the terminal branches of the phylogenetic tree in the two lineages. This
confirmed a significantly higher proportion in clade A, particularly for short
branches, which represent very recent evolution (Wilcoxon test W=19416,
p-value<0.001; Figures 4a-b and
Supplementary Figure
9). Note that the distribution of branch lengths in both clades was
similar (Wilcoxon test W=14427, p-value=0.739). Similarly, the proportion of
clustered SNPs, another signal of recombination, was also higher on the terminal
branches in lineage A (Wilcoxon test W=16984, p-value<0.05). The
proportion of recombination-deficient strains (those with no recombination
events detected, r=0) in lineage B was higher than expected, bordering on
statistical significance (one-tailed test of proportions, p-value=0.05184).One explanation for such differences could be opportunity. For
recombination to occur, donor and recipient bacteria must co-localise. Thus,
recombination between gonococci would be expected to occur more frequently in
high-risk host populations where coinfection with other STIs and pharyngeal
infections, which allow access to commensal Neisseria species,
are more common. These risk-groups are also more likely to be exposed to
repeated antimicrobial therapy for gonorrhoea and other STIs5. Unfortunately, due to limitations in
availability of data on patient sexual behaviour, we could not adequately assess
association of the lineages to risk factors in our dataset. However, we could
analyse the distribution of the gender of the patients from which the isolates
were taken. To increase the power of the analysis, we included 376 isolates from
two North-American genomic studies17,26, to give a set of 639
isolates with complete gender information. Strikingly, lineage B included a
significantly higher proportion of women (40/136, 29.4%) than A (69/503, 13.7%)
(two-sided test for equality of proportions χ2=17.54, 95% CI
[0.070, 0.244], df=1, p-value<0.0001) (Figure 4d and Supplementary Figure 8), which would suggest that B is more closely
associated with heterosexuals. Corroborating this, data from a 2013
European-wide structured survey27 showed
a similar pattern. Lineage B isolates were strongly associated with reduced MICs
and female patients (61/214, 28.5% of lineage B isolates were from women vs
100/821, 12% of lineage A; two-sided test for equality of proportions
χ2=33.21, 95% CI [0.096, 0.231], df=1,
p-value<0.0001), and more importantly, of the patients that reported
sexual orientation, 78.3% (94/120) of isolates in lineage B were from
heterosexuals, in contrast to 52.6% (200/380) in A (two-sided test for equality
of proportions χ2=23.82, 95% CI [0.162, 0.352], df=1,
p-value<0.0001) (Supplementary Figure 10). Particular sublineages within lineage B
appeared particularly strongly associated with heterosexuals27. We suspect lineage A, being associated
with higher-risk populations, does have greater opportunity for recombination,
which may explain the observed higher recombination rate. However, transmission
between low and high-risk populations is common within lineage A, so we suspect
opportunity is not the only explanation for the differential recombination rate
in the two lineages. The observation that plasmid-born resistances do not show
the same difference in frequency between the two lineages also supports this
view.
Discussion
Gonorrhoea is one of the most clinically important STIs worldwide. Its rapid
mode of transmission, especially among high-risk groups, and the emergence of
resistance to many antimicrobials, has made the control of N.
gonorrhoeae of primary importance for public health. In recent years
there has been an understandable focus on AMR gonorrhoea, with resistance to all
classes of antimicrobials used to treat the infection having been reported2. However, the increase in prevalence of
gonorrhoea has continued in many settings6
despite resistance to dual therapy being extremely rare.Our genomic analysis revealed a contemporary global population with little
geographical structure, suggesting rapid recent intercontinental transmission is
occurring. In particular, introductions from Asia into the rest of the world appear
common, consistent with the observation that a number of recent resistant gonococcal
clones have emerged from this region2. The one
exception was Africa, where the sampled gonococcus was less diverse. However, our
African sample size was small due to limited availability of isolates, so further
study is required in this area.We estimated an origin of modern gonococci in the 16th century
[1544 - 1623], which contrasts with historical interpretations of modern gonorrhoea
as an ancient disease. Although we are keen to stress that high rates of
recombination make accurate estimates difficult, and our estimated confidence
intervals are probably too narrow, this dating suggests that ancient accounts of
gonorrhoeal-like symptoms may have been caused by other pathogens, or are evidence
of an ancient N. gonorrhoeae population distinct from that observed
today. It certainly disputes the view that the disease we now know as gonorrhoea is
‘as old as mankind’. The 16th century was, nonetheless, an
opportune time for the global dissemination of pathogens. It was a period of early
modern globalization marked by the initiation and intensification of many
intercontinental trade links, particularly by sea28. This period was of utmost importance for globalization due to an
expeditious increase in exchange of goods, including the import of crops from the
Americas to Europe. Increased movement of people around the world also spawned local
epidemics and pandemics29, and may well have
played an important role in the evolution of modern gonorrhoea. A phylogeographic
analysis using several subsampled sets of strains from different continents to avoid
bias placed the origin of the current global gonococcal population in Europe or
Africa. We identified a subsequent introduction into Asia in the early
17th century [1578 – 1649], which expanded rapidly throughout
the continent. Much more recently this lineage has been repeatedly transmitted back
to the rest of the world.A major finding is a strong association between isolates from the lineage
that evolved from this early introduction to Asia and the development of AMR. Nearly
all isolates in this lineage A, but only 50% of those lineage B, harboured
resistance to sulfonamides (folP R228S mutation) and tetracyclines
(rpsJ V57M mutation). Sulfonamides were the first
antimicrobials introduced to treat gonorrhoea in 1935, with initial efficacies of
around 90%. By the mid to late 1940s sulfonamide resistance was common, and it was
discarded as a treatment for gonorrhoea2.
However, sulfonamides are still widely used in combination with trimethoprim
(TMP-SMZ) for prophylaxis in HIV positivepatients and to treat a variety of
bacterial infections30. Doxycycline (a
tetracycline) is still used to treat gonococcal or presumptively non-gonococcal
urethritis/cervicitis and is the recommended treatment for lymphogranuloma
venereum31. We therefore suspect the high
incidence of sulfonamide and tetracycline resistance in modern gonorrhoea is due to
historic treatment of the disease itself followed by continued use of these drug
classes for other purposes. The high proportion of diverse circulating strains
carrying the folP and rspJ mutations could be used
as evidence that they were in the gonococcal population long before the introduction
of antimicrobials. However, this seems unlikely. More plausibly, the use of
sulfonamides and tetracyclines has produced a strong selective pressure over an
extended period of time, which has led to many independent acquisitions of
resistance mutations and convergent gains of resistance via homologous
recombination. In the more recombinogenic lineage A, this has resulted in these
mutations sweeping through the entire clade. Furthermore, other AMR determinants
that have entered the gonococcal population more recently appear to be undergoing
the same process, particularly in lineage A. The DNA gyrase A S91F substitution,
which provides resistance to ciprofloxacin, is one of many resistance mutations that
show extremely high levels of homoplasy in lineage A, consistent with a combination
of de novo mutation and rapid dissemination via recombination. The
mosaic penA alleles, which reduce susceptibility to ESCs are
another example. These elements were first described in N.
gonorrhoeae around the turn of the century, but have already been
independently acquired by a number of A sublineages, clearly showing that these
mutations are transferring en masse via recombination rather than
by repeated de novo mutation. Lineage B, on the other hand, has
remained susceptible to most antimicrobials. More generally, levels of homoplasy and
SNP clustering were found to be significantly higher in clade A, supporting the
hypothesis that higher rates of recombination in this lineage have played a role in
its high levels of AMR.The rise of AMR gonorrhoea is generally assumed to have been facilitated by
particular demographics who partake in high-risk sexual behaviours, particularly
unprotected sex with multiple partners. These groups are also more often treated
with antimicrobials than the general population due to frequent infection32. Concordantly, we found that lineage A is
associated with infection in MSM, one of the predominant risk groups, while isolates
from B are more rarely found in this demographic group. Thus, lineage A isolates
have the means (increased homologous recombination), motive (higher antimicrobial
exposure) and opportunity (higher rates of coinfection with commensal
Neisseria and other STIs) for recombination-driven gain of
AMR.Most recent media attention and gonococcal genomics research has focused on
the increasing levels of AMR in gonorrhoea. However, we have shown that a mostly
susceptible lineage is successfully persisting in lower-risk groups where it is
probably less likely to be exposed to antimicrobials. Notably, this lineage was
associated with heterosexual groups, and with infections in women, where rates of
asymptomatic infection are higher. Turner et al.33 showed, using a modelling approach, that in
a situation where both resistant and susceptible strains are present in a
population, high rates of asymptomatic infection, and therefore under-treatment, can
allow susceptible isolates to survive and thrive. In such circumstances, rates of
susceptible infection can be hugely underestimated, potentially meaning that our
understanding of gonococcal prevalence, and rates of AMR may be biased.
Interestingly, the majority of our African samples were from lineage B, consistent
with epidemiological studies that describe a hidden epidemic of gonococcus in rural
South African women, in which 48% of cases were asymptomatic and another 50% were
symptomatic but not seeking care34. Similarly
in Namibia, prevalence of asymptomatic gonococcal infections in both men and women
in rural villages are high35. This may
suggest that lineage B is associated with asymptomatic infection more fundamentally
than simply being more often found in women. In such a situation, if compensatory
mutations are not developed, gain of AMR determinants may be detrimental as these
elements may come with an associated general cost to fitness. Grad et
al36 reported, for example, that
23S rRNA mutations resulting in azithromycin resistance were associated with reduced
ESC MICs in isolates with mosaic penA alleles. Similarly, we have
observed that the tetM and
bla-containing plasmids are negatively associated
with isolates with mosaic penA alleles.In conclusion, in the first phylogeographic analysis of a global collection
of gonococci we have shown that although the modern gonococcal population is highly
mixed, this mixing is relatively recent. This gonococcal population originated as
late as the 16th century, most likely in Europe or Africa, and an early
single introduction into Asia led to a rapid spread throughout the continent and the
rest of the world. Despite most recent focus being on gonococcal AMR, we have
demonstrated that N. gonorrhoeae has adapted to sexual networks
with different risk profiles and exposures to antimicrobial treatment. Modern global
gonorrhoea can be divided into two lineages, which we term A (after the phylogenetic
breakpoint) and B (before the phylogenetic breakpoint). Lineage A has gained and
proliferated AMR determinants, aided by an increased rate of recombination. We
hypothesise that these isolates are often transmitted in higher-risk networks, e.g.
MSM, where pharyngeal infections are more common and individuals are more frequently
exposed to treatment for gonorrhoea and other STIs. Lineage B, however, has not
gained AMR so rapidly, with 26% of isolates containing no known AMR determinants,
and is potentially being silently transmitted in undertreated groups where levels of
asymptomatic infection are higher. Thus, our results have shown that the effect of
antimicrobial treatment on the gonococcal population has been more complex than
simply initiating an inexorable progression towards AMR.
Methods
Global N. gonorrhoeae strains and antimicrobial
susceptibility testing
A total of 413 N. gonorrhoeae strains without known
epidemiological relatedness were collected from patients suffering gonorrhoea in
58 countries spanning five continents. The strains were selected to represent a
wide geographic, temporal, phenotypic (based on antimicrobial resistance), and
genetic diversity, that is, to represent as much as feasible of the N.
gonorrhoeae species phylogeny (Supplementary Table 1).
Six genome references were also included in the study, spanning a range of
isolation dates between 1960 and 2013 in total. Bacterial isolation from the
corresponding samples, preservation and transportation was performed following
standard microbiological procedures37.
β-lactamase production and minimum inhibitory
concentrations (MIC) were tested for a range of antimicrobials as described
previously38: spectinomycin,
tetracycline, penicillin G, ciprofloxacin, azithromycin, cefixime and
ceftriaxone.
DNA preparation and whole-genome sequencing
All isolates were confirmed to be N. gonorrhoeae and
genomic DNA was extracted from the isolates using the Promega Wizard DNA
purification kit, following the instructions from the manufacturer. Purified
DNAs were multiplexed and sequenced using two lanes of the HiSeq 2500 2x100 bp
platform at the Wellcome Sanger Institute.
Mapping and variant calling
Fastq files from the 413 new gonococcal strains and the N.
meningitidis 10356_1#65 outgroup (ENA accession ERS248641) were
mapped to a common reference, N. gonorrhoeae FA1090 (NCBI
accession NC_002946, 2,153,922 bp) using SMALT v0.7.4 (http://www.sanger.ac.uk/science/tools/smalt-0). Variants were
called using SAMtools and BCFtools v1.239
after indel realignment with GATK v1.5.940 and further filtered as described previously41.Six public reference genomes were obtained from the NCBI and aligned
using progressiveMAUVE v2.3.142
(N. gonorrhoeae FA1090 (NCBI NC_002946.2), FA19 (NCBI
CP012026.1), FA6140 (NCBI CP012027.1), MS11 (NCBI NC_022240.1), 35/02 (NCBI
CP012028.1) and NCCP11945 (NCBI NC_011035.1)). The XMFA output alignment was
converted into a plain fasta format using N. gonorrhoeae FA1090
as reordering reference through a custom Perl script (see “Code availability”). Positions with
gaps in this reference were removed, so that the resulting alignment had
homologous positions to the 2,153,922 bp in the FA1090 genome. This alignment
was added into the one resulting from mapping the 413 isolates, producing a
419-strains alignment containing the core genome and accessory sites from FA1090
that are shared by any other strain in the collection.
Recombination removal and phylogenetic reconstruction
Prophages described in the N. gonorrhoeae FA1090
strain43 were masked in the alignment
before running Gubbins v1.4.1044, which
was used to remove segments that can have undergone recombination. This is done
by detecting regions of the alignment in which single nucleotide polymorphisms
(SNPs) are densely clustered and occur on the same branches of the tree. The
N. meningitidis 10356_1#65 strain was used as outgroup so
that events affecting all N. gonorrhoeae strains were not
excluded from subsequent calculations.The detected recombination events and repeat regions inferred by
repeat-match (MUMMER v3.23)45 using
default options on the N. gonorrhoeae FA1090 strain genome were
masked in order to minimize the occurrence of false-positive SNPs. Gblocks
v0.91b46 was run on the resulting
alignment to further clean poorly aligned regions that may introduce noise to
phylogenetic analyses. Gblocks was run by allowing gap positions in up to 50% of
the sequences, with a minimum block length of 10 and 8 as maximum number of
contiguous non-conserved positions. The resulting 1,211,180 bp clean alignment
included 15,562 variable sites, identified by snp-sites47, and was used for population structure analysis,
phylogenetic inference and divergence estimation. Genetic clusters were obtained
from the non-recombining alignment using hierBAPS v7.348.The final SNP alignment was used for Maximum Likelihood (ML)
phylogenetic tree reconstruction using RAxML v7.8.649 under the GTRGAMMA model of nucleotide substitution and
100 bootstrap replicates. An algorithm called BOOSTER v0.1.250 was also used to obtain an enhanced
estimate of node support values (Supplementary Note). Ancestral states of all SNPs before
recombination removal were reconstructed onto the resulting phylogenetic tree
using ACCTRAN transformation in python (http://scikit-learn.org).
Homoplasic sites in the terminal branches of the tree were detected for the
whole tree and the two main lineages. It is important to note that Gubbins
removed 97% (33,026/34,034) of those homoplasic sites, minimizing their effect
on subsequent analyses.
Genome de novo assembly and in silico typing
In parallel to the mapping process, reads were assembled using the
assembly and improvement iterative pipeline developed at the Wellcome Sanger
Institute51. Multi-locus sequence
typing (MLST)52 and N.
gonorrhoeae multi-antigen sequence typing (NG-MAST)53 typing schemes were retrieved directly
from the sequences using the get_sequence_type script (https://github.com/sanger-pathogens/mlst_check/blob/master/bin/get_sequence_type)
and NGMASTER v0.454, respectively. The
presence of the β-lactamase (blaTEM) and
tetracycline (tetM) genes on plasmids and the Gonococcal
Genomic Island (GGI) were detected using BLAST v2.3.0+55 and ARIBA v2.456.
Typing was performed for the conjugative plasmid and the
blaTEM plasmids using an
in-silico PCR (https://github.com/simonrharris/in_silico_pcr). Primers to
differentiate between the Dutch and the American
tetM-containing plasmids were obtained from Turner et
al, 199957. To type the
blaTEM plasmids, primers described in Dillon
et al, 199958 were
used and the resulting amplicon sizes evaluated to differentiate among the Asia,
Africa and Toronto/Rio types (Supplementary Table 1).
Analysis of population structure
In order to study population structure from the resulting alignment, the
poppr R package v2.5.059 was used to perform an AMOVA test on the non-recombining section
of the genome60 on three geographical
hierarchies: continent, subcontinent and country, to calculate the percentage of
observed variance within and between groups. In order to test if the observed
differentiation between continents was significant, a randomization test (N =
1000 permutations) was performed using the randtest function
from the ade4 R package v1.7-1161, which randomly permutes the population structure to assess the
observed signal of differentiation.To further study population structure, a Discriminant Analysis of
Principal Components (DAPC, adegenet R package v2.1.1)62,63 analysis was applied to the non-recombining 15,562 SNPs alignment
using continent of isolation as population. The procedure followed by this
multivariate discriminant analysis tries to maximize the discrimination between
the predefined groups. To avoid over-fitting and keep enough discrimination
power, the optimal number of principal components (PC) to retain was determined
using the a-score optimization test, which uses randomized groups to calculate
the proportion of successful reassignments corrected by the number of retained
PCs. This methodology resulted in 83 principal components as optimal to keep a
balance between discrimination power and over-fitting. Prior assignment to
continents was randomized and the DAPC analysis repeated to confirm that the
observed separation among clusters does not occur by chance. Four discriminant
functions were kept for the analysis, considering the number of variables was 5
continents. A Multivariate Analysis of Variance (MANOVA) test64 was applied to test whether there were
differences between the means of the different clusters (continents) on the
discriminant clustering. Wilks’ lambda was used to test the significance
of this MANOVA test. Resulting p-values were adjusted for multiple tests using
False Discovery Rate (FDR)65.DAPC derives group membership probabilities from the retained
discriminant functions. These results were used to evaluate the level of
admixture in the dataset under study. Isolates assigned with >80%
posterior probability to a continent different from the prior assignment were
interpreted as intercontinental transmission cases. Isolates with <80% of
posterior assignment to any of the continents were considered as admixed.
Divergence estimation with LSD and BEAST
Year of isolation for all the strains was used to calculate a
root-to-tip distance regression versus time to make an estimate of the temporal
signal in the data. To do this, a “clustered permutation” approach
was used as described66,67, which considers potential confounding
temporal and genetic structure in the data. A total of 1000 permutations were
performed with this method by randomizing the isolation dates in order to get an
estimate of the statistical significance of the results. This procedure was
applied to the whole dataset and to the different BAPS clusters.In order to get an estimate of the substitution rate and the Most Recent
Common Ancestor (tMRCA) for the whole N. gonorrhoeae global
collection, the Least-Square Dating (LSD) v0.3 software68 was used. This approach has been shown to be robust to
uncorrelated changes of the molecular clock and to give similar results to
BEAST68. In order to compare the
performance between LSD and BEAST, individual BAPS clusters were used.
Specifically, Bayesian approximation using BEAST v1.8.269 was run to estimate the tMRCA and the substitution rate
of the genetic clusters determined by hierBAPS48. Three chains were run per cluster up to 100 million generations
by using a GTRGAMMA model of nucleotide substitution with 4 categories, strict
molecular clock with a diffuse gamma distribution (shape 0.001 and scale 1000)
and a constant population size as priors. Default priors were used. For models
using relaxed clocks, the ucld mean prior was set to a gamma distribution with
shape 0.001 and scale 1000. The same configuration was used to run two different
chains with the whole collection, which did not reach proper convergence because
of the complexity of the dataset. LSD was also run for the BAPS clusters that
reached convergence in BEAST and the results compared (Supplementary Note). The
obtained tMRCA was further confirmed using the Wald statistic (Supplementary Note).
Phylogeography with stochastic character mapping
The continent of isolation was used as a discrete trait to study changes
in the distribution over the phylogenetic tree using treeBreaker v1.170 (https://github.com/ansariazim/treeBreaker). This program
calculates the per-branch posterior probability of having a change in the
distribution of a discrete character.Stochastic character mapping71
with a symmetric transition model (SYM) was applied to the phylogenetic tree to
get posterior probabilities for each continent at every node using the
make.simmap function implemented in the
phytools R package v0.6-4472. Given a phylogeny and a set of tip states
(“continent” in this study), this method uses an MCMC approach to
sample character histories from their posterior probability distribution
consistent with those states given a model of evolution for the mapped
character73. This procedure was
applied to the prior and posterior continent assignments excluding the admixed
individuals to reduce noise from the prior metadata.An extra set of 236 isolates from the US16 was added to the global collection and the phylogeographic
analyses repeated to confirm our results. To avoid biases due to different
number of strains from different continents, the combined datasets were down
sampled 100 times to N=41 (the maximum number of strains with a posterior
assignment to the continent with the least number of strains, Africa), except
for Oceania, from which there is not more data in the public databases to
include, generating 100 subtrees. Ten stochastic maps were inferred for each of
those subtrees and posteriorly combined using phytools72, resulting in a total of 1000 evaluated
maps.
Evolution of antimicrobial resistance determinants
Mutations conferring antimicrobial resistance in known genetic
determinants (16S rRNA, 23S rRNA,
rpoB, rpsJ, mtrR,
folP, gyrA, parC,
parE, penA, ponA and
porB) as well as the presence of the β-lactamase
(blaTEM) and tetM genes74 were obtained for the 413 strains
sequenced in this study using ARIBA v2.456 (Supplementary
Table 1) with a custom database created for N.
gonorrhoeae (precomputed version available in https://github.com/martinghunt/ariba-publication/tree/master/N_gonorrhoeae/Ref).
ARIBA searches for the presence of particular AMR genes and associated known
mutations using reference sequences as a subject database and the fastq files of
the strains in the collection as queries.Subsequent analyses were performed using R v3.1.275: the occurrence of different antimicrobial resistance
determinants before and after the change point detected by treeBreaker on the
distribution of continents and the distribution of MIC values for penicillin G,
tetracycline, ciprofloxacin, ceftriaxone, cefixime and azithromycin against
different combinations of the genetic determinants. The average number of
changes from a susceptible to a resistant state was inferred for each of the
resistant determinants under study in both lineages A and B independently using
stochastic mapping (100 simulations) with the make.simmap
function implemented in the phytools R package72. The inferred number was corrected by
the number of edges in each lineage: n = 586 in lineage A and n = 236 in lineage
B.As an approximation of studying the risk groups characterizing the
defined A and B lineages, 263 isolates from the global collection with
information on the gender of the patient were combined with 376 from two North
American studies with this information available17,26. ARIBA v2.4 was run for
the extra isolates and the obtained results joined with the ones from the global
dataset. Metadata on gender and number of total resistance determinants detected
per strain was plotted on a recombination-free phylogenetic tree obtained as
described above and differences between the two lineages evaluated using
two-sided tests for equality of proportions with continuity correction
(prop.test) and two-sided Wilcoxon rank sum tests
(wilcox.test) with R75.In order to confirm our hypothesis on the two lineages being associated
to different risk groups and antimicrobial susceptibilities, we downloaded the
phylogenetic tree of 1,054 European isolates from a 2013 Euro-GASP survey from
the Pathogenwatch N. gonorrhoeae scheme27 (https://pathogen.watch/collection/eurogasp2013). The breakpoint
between lineages A and B was detected by obtaining a combined core genome
alignment of this and our global set (1,473 strains in total) using Roary
v3.11.376 and running a
pseudo-maximum likelihood tree with the resulting SNPs47 with FastTree v2.1.977.
Visualization
Visualization of metadata in phylogenetic trees was performed using iTOL
v478. Mapping of the and
presence/absence of antimicrobial resistance determinants detected with ARIBA
were visualized using Phandango v1.1.079.
Supplementary Material
Supplementary Information is linked to the online version of
the paper at https://www.nature.com/.
Authors: W Demczuk; S Sidhu; M Unemo; D M Whiley; V G Allen; J R Dillon; M Cole; C Seah; E Trembizki; D L Trees; E N Kersh; A J Abrams; H J C de Vries; A P van Dam; I Medina; A Bharat; M R Mulvey; G Van Domselaar; I Martin Journal: J Clin Microbiol Date: 2017-02-22 Impact factor: 5.948
Authors: Andrew J Page; Carla A Cummins; Martin Hunt; Vanessa K Wong; Sandra Reuter; Matthew T G Holden; Maria Fookes; Daniel Falush; Jacqueline A Keane; Julian Parkhill Journal: Bioinformatics Date: 2015-07-20 Impact factor: 6.937
Authors: Sandeep J Joseph; Jesse C Thomas; Matthew W Schmerer; John C Cartee; Sancta St Cyr; Karen Schlanger; Ellen N Kersh; Brian H Raphael; Kim M Gernert Journal: Genome Biol Evol Date: 2022-01-04 Impact factor: 3.416
Authors: Magnus N Osnes; Lucy van Dorp; Ola B Brynildsrud; Kristian Alfsnes; Thamarai Schneiders; Kate E Templeton; Koji Yahara; Francois Balloux; Dominique A Caugant; Vegard Eldholm Journal: Mol Biol Evol Date: 2021-04-13 Impact factor: 16.240
Authors: Jolinda de Korne-Elenbaas; Sylvia M Bruisten; Henry J C de Vries; Alje P Van Dam Journal: J Antimicrob Chemother Date: 2021-06-18 Impact factor: 5.790
Authors: Magnus N Osnes; Xavier Didelot; Jolinda de Korne-Elenbaas; Kristian Alfsnes; Ola B Brynildsrud; Gaute Syversen; Øivind Jul Nilsen; Birgitte Freiesleben De Blasio; Dominique A Caugant; Vegard Eldholm Journal: Microb Genom Date: 2020-11-17
Authors: Manuel C Jamoralin; Silvia Argimón; Marietta L Lagrada; Alfred S Villamin; Melissa L Masim; June M Gayeta; Karis D Boehme; Agnettah M Olorosa; Sonia B Sia; Charmian M Hufano; Victoria Cohen; Lara T Hernandez; Benjamin Jeffrey; Khalil Abudahab; John Stelling; Matthew T G Holden; David M Aanensenb; Celia C Carlosa Journal: Western Pac Surveill Response J Date: 2021-02-26
Authors: Jesse C Thomas; Sandeep J Joseph; John C Cartee; Cau D Pham; Matthew W Schmerer; Karen Schlanger; Sancta B St Cyr; Ellen N Kersh; Brian H Raphael Journal: Nat Commun Date: 2021-06-21 Impact factor: 14.919