Nicole A P Lieberman1, Thaddeus D Armstrong1, Benjamin Chung1, Daniel Pfalmer1, Christopher M Hennelly2, Austin Haynes3, Emily Romeis3, Qian-Qiu Wang4,5, Rui-Li Zhang6, Cai-Xia Kou4,5, Giulia Ciccarese7, Ivano Dal Conte8, Marco Cusini9, Francesco Drago7, Shu-Ichi Nakayama10, Kenichi Lee10, Makoto Ohnishi10, Kelika A Konda11,12, Silver K Vargas11,13, Maria Eguiluz11, Carlos F Caceres11, Jeffrey D Klausner12, Oriol Mitja14,15, Anne Rompalo16, Fiona Mulcahy17, Edward W Hook18,19,20, Irving F Hoffman2,21, Mitch M Matoga2,21, Heping Zheng22,23, Bin Yang22,23, Eduardo Lopez-Medina24,25, Lady G Ramirez24,26, Justin D Radolf27,28,29,30,31, Kelly L Hawley27,28,30,32, Juan C Salazar28,30,32, Sheila A Lukehart3,33, Arlene C Seña2, Jonathan B Parr2, Lorenzo Giacani3,33, Alexander L Greninger1,34.
Abstract
Sequencing of most Treponema pallidum genomes excludes repeat regions in tp0470 and the tp0433 gene, encoding the acidic repeat protein (arp). As a first step to understanding the evolution and function of these genes and the proteins they encode, we developed a protocol to nanopore sequence tp0470 and arp genes from 212 clinical samples collected from ten countries on six continents. Both tp0470 and arp repeat structures recapitulate the whole genome phylogeny, with subclade-specific patterns emerging. The number of tp0470 repeats is on average higher in Nichols-like clade strains than in SS14-like clade strains. Consistent with previous studies, we found that 14-repeat arp sequences predominate across both major clades, but the combination and order of repeat types vary among subclades, with many arp sequence variants limited to a single subclade. Although strains that were closely related by whole genome sequencing frequently had the same arp repeat length, this was not always the case. Structural modeling of TP0470 suggested that the eight-residue repeats form an extended α-helix, predicted to be periplasmic. Modeling of ARP revealed a C-terminal sporulation-related repeat (SPOR) domain, predicted to bind denuded peptidoglycan, with repeat regions possibly incorporated into a highly charged β-sheet. Outside of the repeats, all TP0470 and ARP amino acid sequences were identical. Together, our data, along with functional considerations, suggest that both TP0470 and ARP proteins may be involved in T. pallidum cell envelope remodeling and homeostasis, with their highly plastic repeat regions playing as-yet-undetermined roles.
Keywords: AlphaFold; SPOR domain; Treponema pallidum; genomics; nanopore; next generation sequencing (NGS); syphilis; trRosetta
Year: 2022 PMID: 36204625 PMCID: PMC9531955 DOI: 10.3389/fmicb.2022.1007056
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 6.064
FIGURE 1. Variation in tp0470 repeat length. (A) Recombination-masked whole genome phylogeny (left) with the number of tp0470 repeats for each strain (right). Sequence variant number is included as text to the right of each length bar. All data are also included in a tabular representation in Supporting Information. Number of tp0470 repeats by subclade (B) or country (C).
FIGURE 2. Variation in arp repeat length. (A) Recombination-masked whole genome phylogeny (left) with the number of arp repeats for each strain (right). Sequence variant number is included as text to the right of each length bar. All data are also included in a tabular representation in Supplementary Table 2. Number of arp repeats by subclade (B) or country (C). (D) Multiple sequence alignment of the nine variants with 14 arp repeats. Variant positions are highlighted and bases colored red, blue, yellow, or green for A, C, G, or T, respectively. The number of strains with each variant sequence is included in the bar graph to the right of the multiple sequence alignment.
FIGURE 3. arp repeat type usage. (A) Nucleotide sequence of the arp repeat module types. Variable positions are highlighted. (B) Amino acid sequence of the arp repeat module types. Variable positions are highlighted. (C) Recombination-masked whole genome phylogeny (left) with the repeat type usage per strain (right). The 60 bp arp repeats are colored by type.
FIGURE 4. Structure predictions of TP0470. (A) trRosetta and (B) AlphaFold predictions of the structure of the 15-repeat TP0470 variant. The N-terminal tetratricopeptide repeat domain is shown in green, repeats are in purple, and the C-terminal region is in gold. (C) APBS electrostatic surface potential (top) and stick model of sidechains (bottom) for a portion of the repeat helix.
FIGURE 5. Structure predictions of ARP14A. (A) trRosetta and (B) AlphaFold predictions of the structure of ARP14A. The C-terminal SPOR domain is shown in magenta, repeats in cyan. (C) APBS electrostatic surface potential for the trRosetta ARP structure from panel (A). Red denotes negative charge (acidic) and blue denotes positive charge (basic).
FIGURE 6. Model showing ARP and TP0470 cellular location and putative interactions. Both ARP and TP0470 are localized to the periplasm. The ARP N terminus may be acylated at cysteine 29. OM, outer membrane; IM, inner membrane; PG, peptidoglycan. Proteins and cellular structures have not been drawn to scale.