Nicole A P Lieberman1, Michelle J Lin1, Hong Xie1, Lasata Shrestha1, Tien Nguyen1, Meei-Li Huang1, Austin M Haynes2, Emily Romeis2, Qian-Qiu Wang3,4, Rui-Li Zhang5, Cai-Xia Kou3,4, Giulia Ciccarese6, Ivano Dal Conte7, Marco Cusini8, Francesco Drago6, Shu-Ichi Nakayama9, Kenichi Lee9, Makoto Ohnishi9, Kelika A Konda10,11, Silver K Vargas10, Maria Eguiluz10, Carlos F Caceres10, Jeffrey D Klausner11, Oriol Mitjà12,13, Anne Rompalo14, Fiona Mulcahy15, Edward W Hook16, Sheila A Lukehart2,17, Amanda M Casto2,18, Pavitra Roychoudhury1,18, Frank DiMaio19, Lorenzo Giacani2,17, Alexander L Greninger1,18.
Abstract
In spite of its immutable susceptibility to penicillin, Treponema pallidum (T. pallidum) subsp. pallidum continues to cause millions of cases of syphilis each year worldwide, resulting in significant morbidity and mortality and underscoring the urgency of developing an effective vaccine to curtail the spread of the infection. Several technical challenges, including the absence of an in vitro culture system until very recently, have hampered efforts to catalog the diversity of strains collected worldwide. Here, we provide near-complete genomes from 196 T. pallidum strains (including 191 T. pallidum subsp. pallidum) sequenced directly from patient samples collected from 8 countries and 6 continents. Maximum likelihood phylogeny revealed that samples from most sites were predominantly SS14 clade. However, 99% (84/85) of the samples from Madagascar formed two of the five distinct Nichols subclades. Although recombination was uncommon in the evolution of modern circulating strains, we found multiple putative recombination events between T. pallidum subsp. pallidum and subsp. endemicum, shaping the genomes of several subclades. Temporal analysis dated the most recent common ancestor of Nichols and SS14 clades to 1717 (95% HPD: 1543-1869), in agreement with other recent studies. Rates of SNP accumulation varied significantly among subclades, particularly among different Nichols subclades, and were associated in the Nichols A subclade with a C394F substitution in TP0380, an ERCC3-like DNA repair helicase. Our data highlight the role played by variation in genes encoding putative surface-exposed outer membrane proteins in defining separate lineages, and provide a critical resource for the design of broadly protective syphilis vaccines targeting surface antigens.
Year: 2021 PMID: 34936652 PMCID: PMC8735616 DOI: 10.1371/journal.pntd.0010063
Source DB: PubMed Journal: PLoS Negl Trop Dis ISSN: 1935-2727
Demographic information of samples sequenced in this study.
| Country | Year(s) of Collection | Number of Samples | Sex: Male (n) | Sex: Female (n) | Sex: Unknown (n) | Stage: Primary (n) | Stage: Secondary (n) | Stage: Unknown (n) | Previous Study |
|---|---|---|---|---|---|---|---|---|---|
| | 2019 | 9 | 5 | 0 | 4 | 5 | 0 | 4 | n/a |
| | 2002 | 11 | 0 | 0 | 11 | 0 | 0 | 11 | [ |
| | 1998–2002 | 15 | 11 | 3 | 1 | 14 | 1 | 0 | [ |
| | 2019 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | n/a |
| Madagascar | 2000–2007 | 85 | 0 | 0 | 85 | 10 | 16 | 59 | [ |
| | 2017 | 10 | 10 | 0 | 0 | 10 | 0 | 0 | n/a |
| | 2019–2020 | 57 | 34 | 22 | 1 | 0 | 0 | 57 | n/a |
| | 2018 | 8 | 0 | 0 | 8 | 0 | 0 | 8 | n/a |
Fig 1. Recombination-masked whole genome phylogeny of T. pallidum patient isolates.
A) Whole genomes were MAFFT-aligned and recombination-masked, and a maximum-likelihood phylogeny was determined. Tips are shown as grey triangles, and nodes with >0.95 support from 1000 ultrafast bootstraps are shown as black circles. B) Subspecies/lineage, subclade, and continent of origin of all samples included in the phylogeny. C) Azithromycin sensitivity/resistance as conferred by the 23S rRNA 2058/2059 alleles. Data represent alleles at both rRNA loci. D) MLST subtypes, including novel sequences, for tp0136, tp0548, and tp0705, as well as whether the three alleles constitute a known or novel MLST. The 6 most abundant sequences at each locus are colored, while other less abundant known and novel sequences are grouped and colored in light and medium grey, respectively. Sequences containing N bases are denoted as indeterminate and shown in dark grey. Expanded metadata for all samples is included in S1 Data.
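The azithromycin call in panel C can be illustrated with a minimal sketch. It assumes the commonly described A2058G and A2059G macrolide-resistance substitutions in 23S rRNA; the function names, the per-sample rule, and the "mixed" label for discordant rRNA copies are hypothetical and are not taken from the paper.

```python
# Hypothetical helpers; how the authors handled discordant or ambiguous
# calls is not stated in the caption, so those rules are assumptions.

def classify_locus(base_2058: str, base_2059: str) -> str:
    """Classify one 23S rRNA copy from the bases called at positions 2058 and 2059."""
    bases = (base_2058.upper(), base_2059.upper())
    if "N" in bases:
        return "indeterminate"          # ambiguous base call
    if "G" in bases:
        return "resistant"              # A2058G or A2059G
    if bases == ("A", "A"):
        return "sensitive"              # wild-type at both positions
    return "indeterminate"              # any other base is left uncalled


def classify_sample(loci) -> str:
    """Combine the calls from both rRNA loci into one per-sample label."""
    calls = {classify_locus(b58, b59) for b58, b59 in loci}
    if calls == {"resistant"}:
        return "resistant"
    if calls == {"sensitive"}:
        return "sensitive"
    if "indeterminate" in calls:
        return "indeterminate"
    return "mixed"                      # one copy resistant, one sensitive


# Invented example: both copies carry A2058G.
print(classify_sample([("G", "A"), ("G", "A")]))
```

Keeping the per-locus and per-sample steps separate makes it easy to swap in a different rule for discordant copies.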
Fig 2. Effect of recombination on T. pallidum subsp. pallidum evolution.
A) Recombination-masked (left) and unmasked (right) phylogenies, with equivalent subclades highlighted. Relative position of each tip is traced between the two panels. B) Putative recombinogenic regions in each clade. Genomic position is relative to the length of the MAFFT alignment. Consensus alignment of all tips is shown on the grey panel, with recombination blocks lettered above. Grey blocks represent recombination that occurred during evolution of the SS14 clade. Red and blue blocks represent recombination events unique to each clade. Mixed grey and colored blocks are regions of ancestral recombination that had a second event unique to that clade. C-F) Two example regions of recombination in SS14 Mexico (C), Nichols A (D), Nichols B (E), and Nichols E (F). The genomic position of the first divergent base in each window is shown with NC_021508.1 numbering.
Fig 3. SS14 and Nichols subclades have different rates of SNP accumulation.
A) Linear regressions for recombination-masked root-to-tip distances from maximum likelihood phylogeny as a function of year of collection, including (left) or not including (right) highly passaged laboratory strains. B) Residuals from linear regression without laboratory strains were plotted per subclade, p < 2e-16, ANOVA. C) Bayesian maximum clade credibility tree showing mean common ancestor heights. Highlighted nodes have a posterior probability of >0.95, and branch colors reveal rate of change (SNPs per genome per year). Ages and 95% highest posterior density are included for nodes of interest including the TPA, SS14, and Nichols ancestral nodes, as well as those of each subclade. Inset: For each tip, mean rates of SNP accumulation along branches with >0.95 posterior probability were plotted per subclade, p < 2e-16, ANOVA.
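The root-to-tip analysis in panels A and B can be sketched as an ordinary least-squares fit of distance against collection year, followed by grouping the residuals by subclade. This is an illustrative sketch only: the function names are hypothetical, the records are invented toy values rather than data from the study, and the ANOVA step is not reproduced.

```python
from collections import defaultdict
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def residuals_by_subclade(records):
    """records: iterable of (subclade, year, root_to_tip_distance) tuples.

    Fits one regression across all tips, then returns the slope, the
    intercept, and each tip's residual grouped by subclade.
    """
    records = list(records)
    slope, intercept = fit_line([r[1] for r in records], [r[2] for r in records])
    grouped = defaultdict(list)
    for subclade, year, dist in records:
        grouped[subclade].append(dist - (slope * year + intercept))
    return slope, intercept, dict(grouped)


# Invented toy values (subclade label, collection year, root-to-tip distance).
toy_records = [
    ("subclade_1", 2002, 10.0),
    ("subclade_1", 2019, 14.5),
    ("subclade_2", 2005, 13.0),
    ("subclade_2", 2019, 18.0),
]
slope, intercept, grouped = residuals_by_subclade(toy_records)
for name, values in grouped.items():
    print(name, [round(v, 3) for v in values])
```

Subclades whose residuals sit consistently above or below zero are accumulating SNPs faster or slower than the pooled fit predicts, which is what the per-subclade comparison in panel B tests.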
Fig 4. Coding mutations in the T. pallidum subsp. pallidum phylogeny.
A) Whole genome ML phylogeny of TPA, with tips collapsed to the subclade node. Open reading frames of inferred ancestral sequences for each node were annotated based on the SS14 reference sequence NC_021508. Coding mutations, including for putative recombinant genes, for each child node were determined relative to its parent node (complete list in S4 Data). Loci with amino acid differences (n = 49 loci, n = 134 individual AA mutation events) in the SS14 ancestral clade node (N101) are shown relative to the Nichols ancestral node (N001). B) Positions are equivalent to those shown in A. Black square represents the Nichols Ancestral Node (N001). Number of antigens with coding mutations on each child node relative to parent node. Color represents the p value for overrepresentation of antigens among all mutated proteins per branch by Fisher's exact test; those in grey have a p value > 0.05. C) Percentage of total individual mutation events per branch. Raw numbers of mutation events in antigens per total mutation events are shown for each branch. D) Tile plot showing mutated proteins in the ancestral node for each subclade relative to its parent node, colored by antigen or not. Proteins are arranged by number of subclades bearing mutations. Data is recapitulated in S3 Data.
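The overrepresentation test named in panel B can be sketched as a one-sided Fisher's exact test, computed here directly from the hypergeometric distribution. The choice of background set (all annotated proteins), the one-sided alternative, and the function name are assumptions for illustration, and the counts in the example are invented.

```python
from math import comb


def fisher_overrepresentation(mut_antigens, mut_total, all_antigens, all_proteins):
    """One-sided Fisher's exact p value, P(X >= mut_antigens).

    X is the number of antigens among `mut_total` mutated proteins drawn
    without replacement from `all_proteins`, of which `all_antigens` are antigens.
    """
    non_antigens = all_proteins - all_antigens
    k_max = min(mut_total, all_antigens)
    # Count the favourable tables as exact integers, then divide once.
    favourable = sum(
        comb(all_antigens, k) * comb(non_antigens, mut_total - k)
        for k in range(mut_antigens, k_max + 1)
    )
    return favourable / comb(all_proteins, mut_total)


# Invented counts: 6 of 10 mutated proteins on a branch are antigens,
# against a hypothetical background of 100 antigens among 1000 proteins.
p_value = fisher_overrepresentation(6, 10, 100, 1000)
print(f"p = {p_value:.3g}")
```

With per-branch counts in hand, the same function can be applied to each branch and the result compared with the 0.05 threshold used for the grey coloring.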