Lenka Mikalová, Michal Strouhal, Darina Čejková, Marie Zobaníková, Petra Pospíšilová, Steven J Norris, Erica Sodergren, George M Weinstock, David Šmajs.
Abstract
The genomes of eight treponemes, including T. p. pallidum strains (Nichols, SS14, DAL-1 and Mexico A), T. p. pertenue strains (Samoa D, CDC-2 and Gauthier), and the Fribourg-Blanc isolate, were amplified in 133 overlapping amplicons, and the restriction patterns of these fragments were compared. The approximate genome sizes estimated from this whole genome fingerprinting (WGF) analysis ranged from 1139.3 to 1140.4 kb, with an estimated genome sequence identity of 99.57-99.98% in the homologous genome regions. Restriction target site analysis, which tested for the presence of 1773 individual restriction sites found in the reference Nichols genome, revealed a high similarity of genome structure among all strains. The unclassified simian Fribourg-Blanc isolate was more closely related to T. p. pertenue than to T. p. pallidum strains. Most of the genetic differences between T. p. pallidum and T. p. pertenue strains accumulated in six genomic regions. These genome differences likely contribute to the observed differences in pathogenicity between T. p. pallidum and T. p. pertenue strains. These regions of sequence divergence could be used for the molecular detection and discrimination of syphilis and yaws strains.
Year: 2010 PMID: 21209953 PMCID: PMC3012094 DOI: 10.1371/journal.pone.0015713
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Genome size and differences in restriction target sites (RTS) of T. p. pallidum, T. p. pertenue, T. paraluiscuniculi and Fribourg-Blanc strains.
| Strain | Place and year of isolation | Reference | Source of the material | Estimated genome size (kb) | Number of missing RTS | Number of additional RTS | Total number of different RTS | Estimated genome sequence identity with Nichols (%) |
| Nichols | Washington, DC; 1912 | | Steven J. Norris, UT, Houston, TX, USA | 1138.0 (AE000520); 1139.6 | - | 1 | 1 | 100 |
| DAL-1 | Dallas; 1991 | | David L. Cox, CDC, Atlanta, GA, USA | 1139.9 | 1 | 1 | 2 | 99.98 |
| SS14 | Atlanta; 1977 | | Steven J. Norris, UT, Houston, TX, USA | 1139.5 | 3 | 5 | 8 | 99.92 |
| Mexico A | Mexico; 1953 | | David L. Cox, CDC, Atlanta, GA, USA | 1140.0 | 3 | 4 | 7 | 99.93 |
| Samoa D | Western Samoa; 1953 | | Steven J. Norris, UT, Houston, TX, USA | 1139.3 | 15 | 23 | 38 | 99.64 |
| CDC-2 | Akorabo, Ghana; 1980 | | David L. Cox, CDC, Atlanta, GA, USA | 1139.7 | 17 | 22 | 39 | 99.63 |
| Gauthier | Congo; 1960 | | Steven J. Norris, UT, Houston, TX, USA | 1139.4 | 16 | 22 | 38 | 99.64 |
| Fribourg-Blanc | Guinea; 1966 | | David L. Cox, CDC, Atlanta, GA, USA | 1140.4 | 20 | 26 | 46 | 99.57 |
| Cuniculi A | ? | ? | Steven J. Norris, UT, Houston, TX, USA | 1133.4 | 96 | 94 | 190 | 98.21 |
Altogether, 1773 RTS were tested in the Nichols genome.
The additional AccI RTS present in the Nichols genome resulted from the added tprK-like insertion in the intergenic region between TP0126–TP0127.
The genome size was calculated from the published sequence [25] with the addition of 7 repetitive sequences (60 bp each) in genes TP0433–TP0434 and of the tprK-like insertion present in a part of the Nichols population [22].
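The corrected Nichols genome size can be checked with a few lines of arithmetic. The sketch below is illustrative only: it takes the published size as the rounded 1138.0 kb value from the table, and it assumes that the tprK-like insertion is the 1204-bp insertion reported for the TP0126–TP0127 intergenic region. The function and constant names are hypothetical.

```python
# Rounded published Nichols genome size from the table (kb).
PUBLISHED_KB = 1138.0
# Seven additional 60-bp repeats in TP0433-TP0434 (from the footnote).
ADDED_REPEATS = 7
REPEAT_UNIT_BP = 60
# Assumption: the tprK-like insertion is the 1204-bp insertion
# listed for the TP0126-TP0127 intergenic region.
TPRK_LIKE_INSERTION_BP = 1204


def corrected_genome_size_kb(published_kb=PUBLISHED_KB,
                             added_repeats=ADDED_REPEATS,
                             repeat_unit_bp=REPEAT_UNIT_BP,
                             insertion_bp=TPRK_LIKE_INSERTION_BP):
    """Add the repeat and insertion lengths (bp) to the published size (kb)."""
    added_bp = added_repeats * repeat_unit_bp + insertion_bp
    return published_kb + added_bp / 1000.0


size_kb = corrected_genome_size_kb()
print(f"{size_kb:.1f} kb")  # prints "1139.6 kb"
```

Under these assumptions the result agrees with the second Nichols value in the genome size column.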
Figure 1. An unrooted tree showing the phylogenetic relationship of the investigated genomes.
An unrooted tree (Tree View) constructed from the binary RTS data, illustrating the relatedness of individual genomes. We also incorporated RTS data for T. paraluiscuniculi strain Cuniculi A, taken from the previously published work of Strouhal et al. [21]. The scale bar represents 0.01 restriction target site substitutions per tested RTS. T. p. pallidum strains causing syphilis are shown in bold.
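The RTS counts in the genome size table can be converted into a per-RTS distance from Nichols, the unit used by the scale bar. The record does not state how the sequence identity values were derived, so the identity function below is only an approximation: it assumes that each restriction target site spans 6 bp and that each differing site reflects a single nucleotide change. With that assumption it reproduces the tabulated percentages for the non-reference strains. All names are illustrative.

```python
TESTED_RTS = 1773            # number of RTS tested in the Nichols genome
ASSUMED_SITE_LENGTH_BP = 6   # assumption: not stated in the text

# Total number of different RTS relative to Nichols (from the table).
different_rts = {
    "DAL-1": 2,
    "SS14": 8,
    "Mexico A": 7,
    "Samoa D": 38,
    "CDC-2": 39,
    "Gauthier": 38,
    "Fribourg-Blanc": 46,
    "Cuniculi A": 190,
}


def rts_distance(n_different, tested=TESTED_RTS):
    """Fraction of tested restriction target sites that differ."""
    return n_different / tested


def approx_identity_percent(n_different, tested=TESTED_RTS,
                            site_len=ASSUMED_SITE_LENGTH_BP):
    """Approximate identity, assuming one changed base per differing site."""
    return 100.0 * (1.0 - n_different / (tested * site_len))


for strain, n in different_rts.items():
    print(f"{strain:15s} distance={rts_distance(n):.4f} "
          f"identity~{approx_identity_percent(n):.2f}%")
```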
Figure 2. A schematic representation of genome changes found in T. p. pallidum and T. p. pertenue strains and in the Fribourg-Blanc isolate.
(A) A schematic representation of indels found in all T. p. pertenue strains and the Fribourg-Blanc isolate but not found in any of the investigated T. p. pallidum strains (see also Table 3). Note that the TP0132 gene was not annotated in the pertenue and Fribourg-Blanc strains. (B) Variable genomic regions identified in most of the investigated strains and isolates (see also Table 2). For a more detailed structure of the TP0126–TP0127 region, see Figure 3; for details on the TP0433–TP0434 locus, see [26]. (C) Indels specific for individual strains and isolates (see also Table 4). T. p. pallidum strains causing syphilis are shown in bold. Deletions are shown as vertical lines, insertions as lines with black triangles.
Genome regions showing differences specific for T. p. pertenue strains (Samoa D, CDC-2 and Gauthier) and the simian Fribourg-Blanc isolate.
| TPI interval/affected IGR or gene(s) (coordinates following the Nichols genome) | Strain(s) | Detected indel | Putative gene function or sequence similarity | Characterization of hypothetical protein/predicted cellular localization | GenBank accession no. |
| TPI12B TP0132 (152942–153151) | Samoa D, CDC-2, Gauthier, Fribourg-Blanc | several dispersed deletions (38 bp), 172 nt in this region remained | gene completely deleted | HM151364 Samoa D HM585245 Gauthier HM585244 CDC-2 HM585258 Fribourg-Blanc | |
| TPI13 TP0136 (157368–157430) | Samoa D, Gauthier, CDC-2, Fribourg-Blanc | deletion (63 bp) | gene coding for fibronectin binding protein | HM151364 Samoa D HM585245 Gauthier | |
| (157457–157458) | CDC-2, Fribourg-Blanc | insertion (33 bp) | HM585244 CDC-2 HM585258 Fribourg-Blanc | ||
| TPI21C TP0266 (278334–278366) | Samoa D, CDC-2, Gauthier, Fribourg-Blanc | deletion (33 bp), substitution of 1 nt (278448) leading to cancellation of stop codon | partial deletion (11 aa) and elongation at C-terminus (5 aa) of gene coding for hypothetical protein | bacterial cytoplasm | HM165228 Samoa D HM165229 Gauthier HM165230 CDC-2 HM165231 Fribourg-Blanc |
| TPI25B-A TP0316 (331265–331266) | Samoa D, CDC-2, Gauthier, Fribourg-Blanc | insertion (635 bp) resulting in frameshift mutation | insertion of | HM585230 Samoa D HM585231 Gauthier HM585232 CDC-2 HM585233 Fribourg-Blanc | |
| TPI42A IGR TP0548–TP0549 (593056–593057) | Samoa D, CDC-2, Gauthier, Fribourg-Blanc | insertion (52 bp) | prediction of a new hypothetical gene TP0548.1 (65 aa) | HM245777 Samoa D HM243496 Gauthier HM243495 CDC-2 HM585227 Fribourg-Blanc | |
| TPI77 TP1030–TP1031 (1124020–1124396) | Samoa D, CDC-2, Gauthier, Fribourg-Blanc | deletion (377 bp) resulting in frameshift mutation | 42 aa elongation of | HM623430 Samoa D HM585235 Gauthier HM585236 CDC-2 HM585254 Fribourg-Blanc |
The following algorithms were used for identification of sequence motifs and for prediction of cellular organization: SignalP, LipoP, CDD, Pfam, PSORT, and InterProScan.
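The coordinate intervals and indel sizes in the table above can be cross-checked against each other. The sketch below assumes that the Nichols coordinates are inclusive at both ends, so that a span is `end - start + 1`; this convention is not stated in the record, but it matches the reported deletion sizes. The helper name is hypothetical.

```python
def span_bp(start, end):
    """Length of an interval, assuming inclusive 1-based coordinates."""
    return end - start + 1


# (start, end, reported deletion size in bp) taken from the table.
reported_deletions = {
    "TP0136": (157368, 157430, 63),
    "TP0266": (278334, 278366, 33),
    "TP1030-TP1031": (1124020, 1124396, 377),
}

for locus, (start, end, reported) in reported_deletions.items():
    print(locus, span_bp(start, end), "bp; reported", reported, "bp")

# TP0132: 38 bp of dispersed deletions plus 172 nt remaining
# should add up to the full interval.
tp0132_span = span_bp(152942, 153151)
print("TP0132 interval:", tp0132_span, "bp; 38 + 172 =", 38 + 172)

# A 33-bp in-frame deletion removes 33 / 3 = 11 codons,
# in line with the 11-aa partial deletion reported for TP0266.
print("TP0266 deleted codons:", 33 // 3)
```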
Genome regions showing variability in most of the investigated strains of T. p. pallidum (Nichols, SS14, DAL-1 and Mexico A), T. p. pertenue strains (Samoa D, CDC-2 and Gauthier), and in the Fribourg-Blanc isolate.
| TPI interval/affected IGR or gene(s) (coordinates following the Nichols genome) | Strain(s) | Detected indel | Total no. of repetitions | Putative gene function or sequence similarity | Characterization of hypothetical protein/predicted cellular localization | GenBank accession no. | |
| TPI12A IGR TP0126–TP0127 (148526–148527) | Nichols | insertion (1204 bp) | HM585242, HM585259 Nichols HM585255 DAL-1 | |||
| SS14, Mexico A | insertion (1255 bp) | HM585243 SS14 HM585256, HM585257 Mexico A | |||||
| Samoa D, Gauthier, CDC-2, Fribourg-Blanc | insertion (1269 bp) | HM151364 Samoa D HM585245 Gauthier HM585244 CDC-2 HM585258 Fribourg-Blanc | |||||
| TPI32B TP0433–TP0434 (461079–461499) | Nichols | insertion/deletion of repetitive sequences (60 bp per repetition) | insertion of 7 repetitions | 14 | fusion of TP0433 and TP0434 to | - | |
| DAL-1 | insertion of 7 repetitions | 14 | HM585240 DAL-1 | ||||
| Mexico A | insertion of 9 repetitions | 16 | HM585249 Mexico A | ||||
| SS14 | insertion of 7 repetitions | 14 | - | ||||
| Samoa D | insertion of 5 repetitions | 12 | HM585237 Samoa D | ||||
| Gauthier | insertion of 3 repetitions | 10 | HM585239 Gauthier | ||||
| CDC-2 | deletion of 3 repetitions | 4 | HM585238 CDC-2 | ||||
| Fribourg-Blanc | insertion of 8 repetitions | 15 | - | ||||
| TPI34aa TP0470 (497265–497688) | Nichols | insertion/deletion of repetitive sequences (24 bp per repetition) | - | 17 | gene encoding conserved hypothetical protein | signal sequence, bacterial inner membrane | - |
| DAL-1 | insertion of 10 repetitions | 27 | - | ||||
| Mexico A | insertion of 9 repetitions | 26 | - | ||||
| SS14 | deletion of 7 repetitions | 10 | - | ||||
| Samoa D | deletion of 5 repetitions | 12 | HM585241 Samoa D | ||||
| Gauthier | insertion of 8 repetitions | 25 | - | ||||
| CDC-2 | insertion of 20 repetitions | 37 | - | ||||
| Fribourg-Blanc | insertion of 5 repetitions | 22 | - | ||||
| TPI71A-C TP0967 (1050281–1050282) | Mexico A, SS14 | insertion (9 bp) | gene encoding hypothetical protein | bacterial cytoplasm | HM151373 Mexico A | ||
| Samoa D | deletion (6 bp) | HM151370 Samoa D | |||||
| Gauthier, CDC-2, Fribourg-Blanc | insertion (12 bp) | HM151371 Gauthier HM151372 CDC-2 HM585251 Fribourg-Blanc |
The following algorithms were used for identification of sequence motifs and for prediction of cellular organization: SignalP, LipoP, CDD, Pfam, PSORT, and InterProScan.
In the Nichols genome, the 1204-bp insertion is present only in a subpopulation [22].
In the published Nichols genome sequence [25], only 7 tandem repetitions were described in this region, probably as a result of incorrect automated computer assembly. The correct number of repetitions in the Nichols strain is 14.
In this region, the T. p. pallidum strains contain an additional incomplete repetition (16 bp in length); the T. p. pertenue strains have the same incomplete repetition, but 18 bp in length.
The number of repetitions was estimated from PCR products visualized on agarose gels.
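The "Total no. of repetitions" column follows from the reference repeat count plus the inserted or deleted repetitions. The sketch below takes 7 repetitions as the baseline for TP0433–TP0434 (the published Nichols sequence, per the footnote) and 17 for TP0470 (the Nichols row). Treating the "-" entry for Nichols at TP0470 as no change is an assumption, and the names are illustrative.

```python
# Baseline repeat counts the indels are expressed against.
REFERENCE_REPEATS = {"TP0433-TP0434": 7, "TP0470": 17}
REPEAT_UNIT_BP = {"TP0433-TP0434": 60, "TP0470": 24}

# Inserted (+) or deleted (-) repetitions per strain, from the table.
repeat_changes = {
    "TP0433-TP0434": {
        "Nichols": 7, "DAL-1": 7, "Mexico A": 9, "SS14": 7,
        "Samoa D": 5, "Gauthier": 3, "CDC-2": -3, "Fribourg-Blanc": 8,
    },
    "TP0470": {
        "Nichols": 0,  # assumption: "-" in the table means no change
        "DAL-1": 10, "Mexico A": 9, "SS14": -7,
        "Samoa D": -5, "Gauthier": 8, "CDC-2": 20, "Fribourg-Blanc": 5,
    },
}


def total_repeats(locus, strain):
    """Reference repeat count plus the strain-specific change."""
    return REFERENCE_REPEATS[locus] + repeat_changes[locus][strain]


def repeat_region_bp(locus, strain):
    """Length of the complete repeats only (incomplete repeats ignored)."""
    return total_repeats(locus, strain) * REPEAT_UNIT_BP[locus]


for locus, strains in repeat_changes.items():
    for strain in strains:
        print(locus, strain, total_repeats(locus, strain),
              "repeats,", repeat_region_bp(locus, strain), "bp")
```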
Figure 3. A schematic representation of the chromosomal region between TP0126 and TP0127.
The newly annotated genes and the previously described gene conversion donor sites for the tprK variable (V) sequences [23] in the intergenic region between genes TP0126 and TP0127 are shown for each strain tested. T. p. pallidum strains causing syphilis are shown in bold.
Genome regions with changes specific to individual strains of T. p. pallidum (Nichols, SS14, DAL-1 and Mexico A), of T. p. pertenue strains (Samoa D, CDC-2 and Gauthier), and the Fribourg-Blanc isolate.
| TPI interval/affected IGR or gene(s) (coordinates according to the Nichols genome) | Strain(s) | Detected indel | Putative gene function or sequence similarity | Characterization of hypothetical protein/predicted cellular localization | GenBank accession no. |
| TPI5B TP0067 (73405–73707) | Samoa D | deletion (303 bp) | gene coding for conserved hypothetical protein | TPR domain, bacterial cytoplasm | HM151365 Samoa D |
| TPI13 IGR TP0135–TP0136 (156488–156551) | Nichols | insertion (64 bp) | - | - | |
| TPI13 TP0136 (157949–158017) | DAL-1 | insertion (58 bp) resulting in frameshift mutation, 67 nt in this region remained | gene coding for fibronectin binding protein | HM585255 DAL-1 | |
| TPI21A TP0259 (270357–270365) | Gauthier | deletion (9 bp) | gene coding for hypothetical protein | LysM domain, bacterial inner membrane | HM151366 Gauthier |
| TPI42A TP0548 (591799–591846) | Fribourg-Blanc | deletion (48 bp) | gene coding for hypothetical protein | HM585227 Fribourg-Blanc | |
| TPI49 TP0629 (686998–687299) | Gauthier | deletion (302 bp) resulting in frameshift mutation | gene coding for hypothetical protein (151 aa) | bacterial cytoplasm, signal sequence present in Nichols | HM151367 Gauthier |
| TPI55 IGR TP0696–TP0697 (764890–765321) | Fribourg-Blanc | insertion of repetitive sequence (430 bp) | HM151369 Fribourg-Blanc | |
| TPI65B TP0858 (935500–935578) | Gauthier | continuous deletion (79 bp) resulting in frameshift mutation and small indels | gene coding for hypothetical protein (385 aa) | signal sequence, UPF0164 domain, bacterial inner membrane | HM151368 Gauthier |
The following algorithms were used for identification of sequence motifs and for prediction of cellular organization: SignalP, LipoP, CDD, Pfam, PSORT, and InterProScan.
The GenBank-deposited Nichols genome sequence includes a 64-bp insertion. All other investigated strains, including a subpopulation of the Nichols strain, have a shorter version of this IGR.
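For the indels in the table above that fall within genes, the frameshift annotations are consistent with a simple rule: an indel whose length is not a multiple of 3 shifts the reading frame. The sketch below applies that rule to the reported sizes. It is a simplification that ignores additional small indels and in-frame stop codons, and the names are hypothetical.

```python
def causes_frameshift(indel_bp):
    """True if the indel length is not a multiple of one codon (3 nt)."""
    return indel_bp % 3 != 0


# Indel sizes (bp) for indels located within genes, from the table.
gene_indels = {
    "TP0067 (Samoa D)": 303,
    "TP0136 (DAL-1)": 58,
    "TP0259 (Gauthier)": 9,
    "TP0548 (Fribourg-Blanc)": 48,
    "TP0629 (Gauthier)": 302,
    "TP0858 (Gauthier)": 79,
}

frameshifts = {name: causes_frameshift(bp) for name, bp in gene_indels.items()}
for name, shifted in frameshifts.items():
    print(f"{name}: {'frameshift' if shifted else 'in-frame'}")
```

The three indels flagged by this rule (58, 302 and 79 bp) are the ones the table describes as resulting in a frameshift mutation.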
Genome regions showing frameshifts and/or major sequence changes (MSC) of T. p. pallidum (SS14, DAL-1 and Mexico A), T. p. pertenue strains (Samoa D, CDC-2 and Gauthier), and the Fribourg-Blanc isolate when compared to the reference Nichols genome.
| Gene | Strain | Detected frameshift or MSC (position according to the Nichols genome) | Protein change | Characterization of hypothetical protein/predicted cellular localization | GenBank accession no. |
| TP0126 | DAL-1 | 1 nt deletion resulting in frameshift mutation (148340) | truncated hypothetical protein TP0126 (from 291 to 227 aa) | signal sequence present, bacterial inner membrane or periplasmic space (DAL-1, Mexico A, Gauthier, CDC-2, Fribourg-Blanc), bacterial cytoplasm (Nichols, Samoa D) | HM585255 DAL-1 |
| Mexico A | HM585256 Mexico A | ||||
| Gauthier | HM585245 Gauthier | ||||
| CDC-2 | HM585244 CDC-2 | ||||
| Fribourg-Blanc | HM585258 Fribourg-Blanc | ||||
| TP0127 | Mexico A | 1 nt deletion resulting in frameshift mutation (148945) | truncated hypothetical protein TP0127 (from 229 aa to 222 aa) | DUF2715 domain, bacterial inner membrane (Mexico A) | HM585256 Mexico A |
| DAL-1 | 2 nt deletion resulting in frameshift mutation (148944–148945) | truncated hypothetical protein TP0127 (from 229 aa to 126 aa) | HM585255 DAL-1 | ||
| SS14 | - | ||||
| Samoa D | HM151364 Samoa D | ||||
| Gauthier | HM585245 Gauthier | ||||
| CDC-2 | HM585244 CDC-2 | ||||
| Fribourg-Blanc | HM585258 Fribourg-Blanc | ||||
| TP0129 | Samoa D | 2 nt substitution (149875–149876) | premature stop codon resulting in 26-aa deletion at C-terminus of hypothetical protein TP0129 (from 158 to 132 aa) | bacterial cytoplasm | HM151364 Samoa D |
| Gauthier | HM585245 Gauthier | ||||
| CDC-2 | HM585244 CDC-2 | ||||
| Fribourg-Blanc | HM585258 Fribourg-Blanc | ||||
| TP0131 | Mexico A | MSC and small indels (151122–152890) | truncated TprD (TP0131) protein (from 598 aa to 596 aa) | HM585256 Mexico A | |
| Samoa D | HM151364 Samoa D | ||||
| CDC-2 | HM585244 CDC-2 | ||||
| Fribourg-Blanc | HM585258 Fribourg-Blanc | ||||
| TP0136 | DAL-1 | frameshift mutation (see | truncated fibronectin binding protein TP0136 (from 495 aa to 452 aa) | HM585255 DAL-1 | |
| SS14 | (from 495 aa to 492 aa) | - | |||
| Mexico A | (from 495 aa to 492 aa) | HM585257 Mexico A | |||
| Samoa D | (from 495 aa to 470 aa) | HM151364 Samoa D | |||
| Gauthier | (from 495 aa to 470 aa) | HM585245 Gauthier | |||
| CDC-2 | (from 495 aa to 481 aa) | HM585244 CDC-2 | |||
| Fribourg-Blanc | (from 495 aa to 481 aa) | HM585258 Fribourg-Blanc | |||
| TP0315 | Samoa D | 1 nt deletion resulting in frameshift mutation, MSC (330506) | elongation of conserved hypothetical protein TP0315 at C-terminus (from 215 aa to 270 aa) | DUF2715 domain, bacterial inner membrane ( | HM585230 Samoa D |
| Gauthier | HM585231 Gauthier | ||||
| CDC-2 | HM585232 CDC-2 | ||||
| Fribourg-Blanc | HM585258 Fribourg-Blanc | ||||
| TP0548 | SS14 | MSC and small indels (591822–592917) | elongation of treponemal conserved hypothetical protein TP0548 (from 434 aa to 438 aa) | bacterial inner membrane | - |
| Mexico A | (from 434 aa to 438 aa) | HM585228 Mexico A | |||
| Samoa D | shortening of treponemal conserved hypothetical protein TP0548 (from 434 aa to 432 aa) | HM245777 Samoa D | |||
| Gauthier | (from 434 aa to 432 aa) | HM243496 Gauthier | |||
| CDC-2 | (from 434 aa to 432 aa) | HM243495 CDC-2 | |||
| Fribourg-Blanc | (from 434 aa to 418 aa) | HM585227 Fribourg-Blanc |
Major sequence changes were defined as continuous amino acid replacements comprising 10 or more residues, or 15 or more dispersed amino acid replacements.
The following algorithms were used for identification of sequence motifs and for prediction of cellular organization: SignalP, LipoP, CDD, Pfam, PSORT, and InterProScan.
The reference Nichols strain was not resequenced in this region.
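The MSC definition given in the footnotes above (10 or more continuous replacements, or 15 or more dispersed replacements) can be expressed as a small classifier. The sketch below is not the authors' procedure: it assumes two already aligned, equal-length, ungapped protein sequences, and the function names and toy sequences are hypothetical.

```python
def replacement_stats(ref, alt):
    """Return (longest continuous run, total count) of replaced residues."""
    if len(ref) != len(alt):
        raise ValueError("sequences must be aligned to the same length")
    longest = run = total = 0
    for a, b in zip(ref, alt):
        if a != b:
            run += 1
            total += 1
            longest = max(longest, run)
        else:
            run = 0
    return longest, total


def is_major_sequence_change(ref, alt, min_run=10, min_dispersed=15):
    """Apply the two thresholds from the MSC definition."""
    longest, total = replacement_stats(ref, alt)
    return longest >= min_run or total >= min_dispersed


# Toy sequences for illustration only (not real treponemal proteins).
ref = "A" * 40
continuous_10 = "A" * 5 + "G" * 10 + "A" * 25
continuous_9 = "A" * 5 + "G" * 9 + "A" * 26
dispersed_15 = "".join("G" if i % 2 == 0 and i < 30 else "A" for i in range(40))

print(is_major_sequence_change(ref, continuous_10))  # True
print(is_major_sequence_change(ref, continuous_9))   # False
print(is_major_sequence_change(ref, dispersed_15))   # True
```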
Figure 4. Unrooted trees constructed from sequences of genes showing major differences in strain clustering.
(A) An unrooted tree constructed from the binary RTS data without the Cuniculi A data. The scale bar represents 0.01 restriction target site substitutions per RTS. The unrooted trees constructed from sequences of 4 treponemal genes, TP0131, TP0136, TP0548, and TP1031, are shown in panels B, C, D, and E, respectively. The scale bar represents 0.01 nucleotide substitutions per site. Bootstrap values based on 1,000 replications are shown next to branches. T. p. pallidum strains causing syphilis are shown in bold.