Petra Matejková, Michal Strouhal, David Smajs, Steven J Norris, Timothy Palzkill, Joseph F Petrosino, Erica Sodergren, Jason E Norton, Jaz Singh, Todd A Richmond, Michael N Molla, Thomas J Albert, George M Weinstock.
Abstract
BACKGROUND: The syphilis spirochete Treponema pallidum ssp. pallidum remains an enigmatic pathogen, since no virulence factors have been identified and the pathogenesis of the disease is poorly understood. Increasing rates of new syphilis cases per year have been observed recently.
Year: 2008 PMID: 18482458 PMCID: PMC2408589 DOI: 10.1186/1471-2180-8-76
Source DB: PubMed Journal: BMC Microbiol ISSN: 1471-2180 Impact factor: 3.605
Table 1. DDT sequencing of 38 hypervariable regions where SNPs could not be identified by CGS
| Region no. | ORFᵃ | Region size (nt) | Size of sequenced region (nt) | Left coordinateᵃ | Right coordinateᵃ | Newly found changes in the regions suggested by CGS | Newly found changes not suggested by mapping phase of CGS | Confirmation of SNPs identified by CGS in this regionᵇ |
| 1 | TP0012 | 37 | 390 | 12322 | 12711 | 3 nt deletion | - | - |
| 2 | TP0076 | 29 | 529 | 83788 | 84316 | - | 1 solitary SNP | - |
| 3 | TP0117 | 86 | 699 | 134808 | 135506 | 7 clustered SNPs | - | - |
| 4 | TP0117 | 86 | | | | 3 clustered SNPs | - | - |
| 5 | TP0126 | 29 | 393 | 147948 | 148340 | 1 solitary SNP | - | - |
| 6 | upstream of TP0128 | 29 | 460 | 149103 | 149562 | 2 clustered SNPs; 1 nt + 5 nt insertions | - | - |
| 7 | TP0131 | 421 | 723 | 150925 | 151647 | 3 clustered SNPs, 1 solitary SNP | 2 clustered SNPs | - |
| 8 | upstream of TP0136 | 29 | 404 | 156348 | 156751 | 64 nt deletion | - | - |
| 9 | TP0136 | 1087 | 1609 | 156752 | 158360 | 19 clustered SNPs; 1 nt + 1 nt + 1 nt + 6 nt deletions | 8 clustered SNPs; 2 solitary SNPs | 21 SNPs |
| 10 | TP0272 | 29 | 480 | 288647 | 289126 | - | - | - |
| 11 | TP0304 | 37 | 466 | 318761 | 319226 | 3 nt deletion | - | - |
| 12 | TP0326 | 79 | 452 | 345605 | 346056 | 8 clustered SNPs | - | 1 SNP |
| 13 | TP0352 | 29 | 465 | 376926 | 377390 | - | - | - |
| 14 | TP0394 | 29 | 505 | 420353 | 420857 | - | - | 1 SNP |
| 15 | TP0431 | 29 | 465 | 458973 | 459437 | - | - | 1 SNP |
| 16 | TP0457 | 29 | 465 | 487935 | 488399 | - | - | - |
| 17 | TP0484 | 29 | 468 | 514441 | 514908 | - | - | - |
| 18 | TP0486 | 29 | 494 | 517297 | 517790 | - | 1 nt insertion, 1 nt deletion | - |
| 19 | TP0493 | 29 | 478 | 529146 | 529623 | - | - | - |
| 20 | TP0515 | 44 | 506 | 555754 | 556259 | 3 clustered SNPs | - | 4 SNPs |
| 21 | TP0544 | 29 | 611 | 585940 | 586550 | 6 nt insertion | - | - |
| 22 | TP0548 | 835 | 1189 | 591557 | 592745 | 22 clustered SNPs; 3 nt + 4 nt + 5 nt insertions | 2 clustered SNPs; 1 solitary SNP | 5 SNPs |
| 23 | TP0577 | 37 | 405 | 628247 | 628651 | 1 solitary SNP | - | - |
| 24 | TP0598 | 29 | 550 | 648851 | 649400 | - | four 1 nt insertions | - |
| 25 | TP0620–TP0621 | 51 | 3469 | 670958 | 674426 | - | 4 clustered SNPs | - |
| 26 | TP0668 | 37 | 462 | 730080 | 730541 | 6 nt deletion | - | - |
| 27 | TP0699 | 51 | 469 | 766143 | 766611 | 1 solitary SNP | - | - |
| 28 | TP0785 | 29 | 438 | 851631 | 852068 | - | - | - |
| 29 | TP0814 | 29 | 476 | 882990 | 883465 | - | - | - |
| 30 | TP0865 | 29 | 480 | 943847 | 944326 | 3 nt insertion | - | 1 SNP |
| 31 | TP0866 | 29 | 543 | 944677 | 945219 | - | 1 nt insertion | - |
| 32 | TP0868 | 29 | 454 | 947257 | 947710 | 7 nt deletion | - | - |
| 33 | TP0896–TP0898 | 667 | 3038 | 974053 | 977090 | 4 SNPsᶜ and 7 variable regionsᵈ | 1 SNP | |
| 34 | TP0898 | 27 | 416 | 978349 | 978764 | - | - | - |
| 35 | TP0933 | 29 | 164 | 1014034 | 1014197 | - | - | - |
| 36 | TP0973 | 44 | 396 | 1057660 | 1058055 | 1 solitary SNP | - | 1 SNP (igr) |
| 37 | TP1030–TP1031 | 1507 | 402 | 1123660 | 1124061 | 18 clustered SNPs | 1 nt insertion, 1 solitary SNP | 16 SNPs |
| | | | 1775 | 1124256 | 1126030 | | | |
| 38 | TP1036 | 29 | 550 | 1132558 | 1133107 | - | - | - |
ORF – open reading frame; nt – nucleotide; SNP – single nucleotide polymorphism; igr – intergenic region; ᵃ as described in [3]; ᵇ SNPs identified using CGS in these regions were verified by DDT sequencing; ᶜ two SNPs represent the group of 17 SNPs in non-unique sites, originally excluded from the list of total changes; ᵈ the variable regions identified in TP0897 were identical to the variable regions V1–V7 described previously [22–24].
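The coordinate columns can be cross-checked against the listed sizes. The sketch below is only an illustration: it assumes that the left and right coordinates are 1-based and inclusive, which is inferred from the numbers in the table rather than stated in the source. The names `ROWS`, `span_length`, and `mismatches` are hypothetical, and only a handful of rows are copied in.

```python
# A few (region no., sequenced size in nt, left, right) rows copied from the table.
ROWS = [
    (1, 390, 12322, 12711),
    (2, 529, 83788, 84316),
    (25, 3469, 670958, 674426),
    (33, 3038, 974053, 977090),
    (35, 164, 1014034, 1014197),
]


def span_length(left: int, right: int) -> int:
    """Length of the closed interval [left, right].

    Assumption: coordinates are 1-based and inclusive on both ends.
    """
    return right - left + 1


def mismatches(rows):
    """Return region numbers whose listed size disagrees with their coordinates."""
    return [region for region, size, left, right in rows
            if span_length(left, right) != size]


print(mismatches(ROWS))
```

An empty list means every copied row is consistent with the inclusive-coordinate assumption; any region number that appears would point to a transcription problem in that row.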
Table 2. DDT sequencing of regions selected based on pilot SS14/Nichols comparison using microarray hybridization experiments
| Region no. | ORFᵃ | Size of sequenced region (nt) | Left coordinateᵃ | Right coordinateᵃ | Newly found changes | Confirmation of SNPs identified by CGSᵇ |
| 1 | TP0017 | 848 | 18454 | 19301 | - | - |
| 2 | TP0070 | 339 | 75493 | 75831 | - | - |
| 3 | TP0094 | 1011 | 102879 | 103889 | - | - |
| 4 | TP0123 | 1083 | 143207 | 144289 | - | - |
| 5 | TP0192 | 748 | 206663 | 207410 | - | - |
| 6 | TP0200 | 264 | 210183 | 210446 | - | - |
| 7 | TP0291 | 834 | 304706 | 305539 | - | - |
| 8 | TP0319 | 1014 | 334847 | 335860 | - | 1 SNP |
| 9 | TP0321–TP0322 | 2640 | 336149 | 338788 | - | - |
| 10 | TP0323 | 851 | 338885 | 339735 | - | 1 SNP |
| 11 | TP0376 | 806 | 400903 | 401708 | - | - |
| 12 | TP0377 | 78 | 401851 | 401928 | - | - |
| 13 | TP0516 | 1533 | 556351 | 557883 | - | - |
| 14 | TP0519 | 1277 | 559215 | 560491 | - | - |
| 15 | TP0580 | 1242 | 630328 | 631569 | - | - |
| 16 | TP0587 | 183 | 639620 | 639802 | - | - |
| 17 | TP0633 | 776 | 691437 | 692212 | - | - |
| 18 | TP0683 | 1047 | 746899 | 747945 | - | - |
| 19 | TP0799–TP0800 | 2168 | 866136 | 868303 | - | - |
| 20 | TP0806 | 1397 | 875808 | 877204 | - | - |
| 21 | TP0807 | 165 | 877407 | 877571 | - | - |
| 22 | TP0808 | 187 | 877632 | 877818 | - | - |
| 23 | TP0877 | 998 | 953710 | 954707 | - | 2 SNPs |
| 24 | TP0933 | 2023 | 1013098 | 1015120 | - | - |
| 25 | TP0952 | 1438 | 1032341 | 1033778 | - | 1 SNP |
| 26 | TP0961 | 1216 | 1041973 | 1043188 | - | - |
| 27 | TP0980 | 975 | 1063047 | 1064021 | - | - |
ORF – open reading frame; nt – nucleotide; SNP – single nucleotide polymorphism; ᵃ as described in [3]; ᵇ SNPs identified using CGS in these regions were verified by DDT sequencing.
Table 3. DDT sequencing of regions showing different whole genome fingerprint profiles in SS14 strain
| Region no. | ORFᵃ | Difference from WGF on the gel | Size of sequenced region (nt)ᵇ | Left coordinateᵃ | Right coordinateᵃ | Newly found changes | Confirmation of SNPs identified by CGSᶜ |
| 1 | TP0124–TP0134 | insertion | 3245 + 1255 insertion | 145858 | 149102 | 1255 nt insertion, 2 nt deletion | - |
| | | | 1362 | 149563 | 150924 | - | - |
| | | | 252 | 151648 | 151899 | - | - |
| | | | 3465 | 152043 | 155507 | 1 nt insertion | 2 SNPs |
| 2 | TP0135–TP0138 | deletion | 662 | 155686 | 156347 | - | (+ 64 nt deletion as in Table 1) |
| | | | 894 | 158391 | 159284 | 1 nt deletion | |
| 3 | TP0433–TP0434 | insertion | 481 + 419 insertion | 461058 | 461538 | insertion of 7 repeats of a 60 nt region (14 repetitions altogether); consensus sequence of the repeat: CGTGAGGTGGAGGACGYGCCGRRGGTAGTGGAGCCGGCCTCTGRGCRTGARGGAGGGGAG | - |
| 4 | TP0468–TP0471 | deletion | 3571 | 495308 | 498878 | 2 nt deletion + 1 nt insertion + deletion of seven 24 nt repetitions; consensus sequence of the repeat: CTCCGCCTCCTTGCGCCGGGCTTC | 1 SNP |
nt – nucleotide; SNP – single nucleotide polymorphism; ᵃ as described in [3]; ᵇ regions previously described in Table 1 were excluded; ᶜ SNPs identified using CGS in these regions were verified by DDT sequencing.
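The two repeat consensus sequences in the table use IUPAC ambiguity codes (R = A or G, Y = C or T). As a minimal sketch, such a consensus can be turned into a regular expression and used to count back-to-back copies in a sequence. The helper names below are hypothetical, the space inside the 60 nt consensus is treated as a line-wrap artifact and removed, and the test sequence is synthetic, not SS14 or Nichols data.

```python
import re

# Only the IUPAC codes that occur in the two consensus sequences are mapped here.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]"}

CONSENSUS_60 = "CGTGAGGTGGAGGACGYGCCGRRGGTAGTGGAGCCGGCCTCTGRGCRTGARGGAGGGGAG"
CONSENSUS_24 = "CTCCGCCTCCTTGCGCCGGGCTTC"


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Translate an IUPAC consensus into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus))


def max_tandem_copies(sequence: str, consensus: str) -> int:
    """Largest number of consecutive consensus matches found in `sequence`."""
    unit = consensus_to_regex(consensus).pattern
    runs = re.finditer(f"(?:{unit})+", sequence)
    return max((len(m.group(0)) // len(consensus) for m in runs), default=0)


# Synthetic example: seven copies of the 24 nt unit between arbitrary flanks.
synthetic = "AAA" + CONSENSUS_24 * 7 + "TTT"
print(len(CONSENSUS_60), len(CONSENSUS_24), max_tandem_copies(synthetic, CONSENSUS_24))
```

Seven copies of a 24 nt unit span 168 nt, that is 56 codons, which agrees with the 56 aa shortening listed for TP0470 in the protein-length table.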
Figure 1. Scheme to identify sequence changes in the SS14 genome.
Table 4. Genes with mutations that significantly affect protein length
| ORFᵃ | SNPs | Other changes | Result of mutation | Protein function |
| TP0006 | 1 | read-through stop codon | longer protein (+262 aa), fusion with TP0007 | Tp75 protein (possible surface protein) |
| TP0127 | 0 | 1 deletion (2 nt) | frameshift (-103 aa) | hypothetical protein |
| TP0132 | 0 | 1 insertion (1 nt) | frameshift (-44 aa) | hypothetical protein |
| TP0433–TP0434 | 1 | insertion of tandem repeats | fusion of 2 ORFs (604 aa) | hypothetical proteins (resulting fusion – |
| TP0468–TP0469 | 0 | 1 insertion (2 nt) | fusion of 2 ORFs (650 aa) | conserved hypothetical proteins |
| TP0470 | 0 | deletion of 7 tandem repeats (7 × 24 nt) | shorter protein (-56 aa) | conserved hypothetical protein |
| TP0486 | 0 | 1 deletion (1 nt)ᵇ | frameshift (+9 aa) | antigen, p83/100 (possible surface protein) |
| TP0598 | 1 | 4 insertions (4 nt)ᵇ | frameshift (+81 aa), fusion with TP0597 | hypothetical protein |
| TP0868 | 0 | 1 deletion (7 nt) | frameshift (-168 aa) | flagellar filament 34.5 kDa core protein (FlaB1) |
| TP0924 | 1 | nonsense mutation | shorter protein (-250 aa) | Tex protein |
| TP1030 | 7 | 1 insertion (1 nt) | frameshift (-46 aa) | hypothetical protein |
ORF – open reading frame; SNP – single nucleotide polymorphism; aa – amino acid; ᵃ as described in [3]; ᵇ same sequence change detected in Nichols Houston strain genomic DNA; ᶜ same sequence change described in [19].
Table 5. ORFs with the highest number of detected SNPs (+ indels)
| ORFᵃ | SNPs | aa changes | Other changes | Result of mutation | Protein function |
| TP0117 | 10 | 6 | | | Tpr protein C (TprC) |
| TP0136 | 49 | 38 | 4 deletions (9 nt) | 3 aa missing | hypothetical proteinᵇ |
| TP0326 | 12 | 9 | | | outer membrane protein |
| TP0515 | 10 | 10 | | | conserved hypothetical protein |
| TP0548 | 30 | 21 | 3 insertions (12 nt) | 4 aa inserted | hypothetical protein |
| TP1031 | 31 | 23 | | | Tpr protein L (TprL) |
ORF – open reading frame; SNP – single nucleotide polymorphism; aa – amino acid; nt – nucleotide; ᵃ as described in [3]; ᵇ this protein was described as a fibronectin- and laminin-binding protein [29].
Table 6. Distribution of SNPs in different gene function groups and their effects on protein sequences
| Putative gene function | Whole genomeᵃ | % | Affected ORFsᵇ | % | SNPsᶜ | % | aa changesᵈ | % |
| Hypothetical | 316 | 30.4 | 52 | 38.2 | 199 | 64.2 | 148 | 67.0 |
| Conserved hypothetical | 177 | 17.0 | 21 | 15.4 | 34 | 11.0 | 22 | 10.0 |
| Metabolic functions | 167 | 16.1 | 19 | 14.1 | 23 | 7.4 | 19 | 8.6 |
| Housekeeping genes | 223 | 21.5 | 24 | 17.6 | 25 | 8.0 | 10 | 4.4 |
| Other function | 156 | 15.0 | 20 | 14.7 | 29 | 9.4 | 22 | 10.0 |
| Total | 1039 | 100 | 136 | 100 | 310 | 100 | 221 | 100 |
ᵃ number of genes (ORFs) in the complete genome of the TPA Nichols strain [3]; ᵇ number of all genes with sequence changes in the genome of the SS14 strain; ᶜ number of SNPs identified within ORF groups in the genome (other sequence changes were not included); ᵈ amino acid changes caused by SNPs (changes in the length of the protein molecule are listed in Table 4).
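The percentage columns follow directly from the counts, so they can be recomputed as a sanity check. The sketch below is an illustration only; the names `COUNTS`, `column_totals`, `percentages`, and `snp_share_ratio` are hypothetical. With plain rounding to one decimal, a few recomputed values differ from the printed ones by 0.1 (for example 19/136 gives 14.0 against the printed 14.1), possibly because the printed columns were adjusted to sum to exactly 100.

```python
# Column order: whole genome ORFs, affected ORFs, SNPs, aa changes.
COLUMNS = ("whole genome", "affected ORFs", "SNPs", "aa changes")

COUNTS = {
    "Hypothetical": (316, 52, 199, 148),
    "Conserved hypothetical": (177, 21, 34, 22),
    "Metabolic functions": (167, 19, 23, 19),
    "Housekeeping genes": (223, 24, 25, 10),
    "Other function": (156, 20, 29, 22),
}


def column_totals(counts):
    """Sum each count column over all functional groups."""
    return tuple(sum(row[i] for row in counts.values())
                 for i in range(len(COLUMNS)))


def percentages(counts):
    """Share of each group within every column, rounded to one decimal."""
    totals = column_totals(counts)
    return {group: tuple(round(100.0 * value / total, 1)
                         for value, total in zip(row, totals))
            for group, row in counts.items()}


def snp_share_ratio(counts, group):
    """Ratio of a group's share of SNPs to its share of genes in the genome."""
    totals = column_totals(counts)
    genome, _, snps, _ = counts[group]
    return (snps / totals[2]) / (genome / totals[0])


print(column_totals(COUNTS))
for group, shares in percentages(COUNTS).items():
    print(group, shares, round(snp_share_ratio(COUNTS, group), 2))
```

A ratio above 1 means the group carries more SNPs than its share of genes would suggest. Under this calculation the hypothetical genes come out above 2 and the housekeeping genes below 0.5.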
Table 7. Genetic heterogeneity in the SS14 population isolated from rabbit testes
| ORFᵃ | Genome positionᵃ | [GenBank: | SS14 sequenceᵇ | Position in ORF (Nichols)ᵃ | aa change | Note |
| TP0117 | 135098 | G | G or C (5/6) | 1600 | P534 => A534 | |
| | 135107 | T | T or C (3/4) | 1591 | I531 => V531 | |
| | 135141 | G | G or A (5/2) | 1557 | no change | |
| | 135144 | T | T or C (3/4) | 1554 | no change | |
| | 135149 | C | C or T (5/2) | 1549 | A517 => T517 | |
| | 135220 | G | G or A (5/6) | 1478 | T493 => I493 | |
| | 135227 | G | G or A (6/6) | 1471 | P491 => S491 | |
| | 135235 | G | G or A (2/10) | 1463 | A488 => V488 | |
| | 135239 | C | C or T (2/10) | 1459 | G487 => R487 | |
| | 135251 | A | A or G (6/6) | 1447 | Y483 => H483 | |
| TP0402 | 427435 | C | C or T (NA) | 401 | P134 => L134 | |
| | 427737 | G | G or T (NA) | 703 | A235 => S235 | |
| TP0620 | 671746 | T | T or C (9/3) | 1142 | Q381 => R381 | |
| | 671751 | T | T or G (19/10) | 1137 | R379 => G379 | |
| | 671753 | T | T or C (19/10) | 1135 | R379 => G379 | |
| | 671763 | C | C or T (8/4) | 1125 | no change | |
| | 671982 | G | G or C (12/6) | 906 | S302 => R302 | |
| | 672004 | C | C or T (12/6) | 884 | S295 => N295 | |
| | 672016 | A | G or A (12/6) | 872 | L291 => P291 | |
| | 672025 | T | T or C (11/7) | 863 | N288 => C288 | |
| | 672026 | T | T or A (11/6) | 862 | N288 => C288 | |
| | 672027 | A | A or G (11/6) | 861 | G287 => D287 | |
| | 672028 | C | C or T (12/5) | 860 | G287 => D287 | |
| | 672036 | G | G or T (11/6) | 852 | no change | |
| | 672039 | A | A or G (NA) | 849 | P283 => N283 | |
| | 672040 | G | G or T (NA) | 848 | P283 => N283 | |
| | 672041 | G | G or T (12/6) | 847 | P283 => N283 | |
| | 672042 | G | G or A (NA) | 846 | D282 => S282 | |
| | 672043 | T | T or C (13/6) | 845 | D282 => S282 | |
| | 672044 | C | C or T (10/5) | 844 | D282 => S282 | |
| | 672154 | G | G or T (7/10) | 734 | T245 => K245 | |
| | 672286 | G | G or A (4/12) | 602 | T201 => M201 | |
| Upstream of TP0620 | 672916-7 | (-) | (-) or C (6/6) | position -30 from TP0620 | | homopolymeric stretch |
| | 672944 | A | A or G (14/6) | position -58 from TP0620 | | |
| TP0621 | 673088 | T | T or C (14/4) | 2134 | I712 => V712 | |
| | 673119 | G | G or A (14/4) | 2103 | no change | |
| | 673425 | C | C or T (2/8) | 1797 | no change | |
| | 673428 | A | A or G (2/8) | 1794 | no change | |
| | 673511 | A | A or C (6/6) | 1711 | F571 => V571 | |
| | 673545 | C | C or T (9/4) | 1677 | no change | |
| | 673550 | A | A or G (10/6) | 1672 | F558 => L558 | |
| | 673554 | C | C or T (10/6) | 1668 | no change | |
| TP0971 | 1054447 | T | T or C (NA) | 301 | K101 => E101 | |
| TP1029 | 1123796 | G | G or A (5/6) | 15 | no change | |
ORF – open reading frame; aa – amino acid; NA – not available; ᵃ as described in [3]; ᵇ numbers in parentheses show sequence reads for each alternative sequence.
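The read counts in the SS14 sequence column can be converted into allele fractions. This is a rough sketch under one assumption that the footnote does not spell out: the first count in the parentheses is taken to belong to the first nucleotide listed and the second count to the second. The names `CALL` and `allele_fractions` are hypothetical.

```python
import re
from fractions import Fraction

# Matches cells such as "G or C (5/6)" or "(-) or C (6/6)".
CALL = re.compile(r"(\S+) or (\S+) \((\d+)/(\d+)\)")


def allele_fractions(cell: str):
    """Return {allele: fraction of reads}, or None when counts are missing (NA).

    Assumption: counts are listed in the same order as the alleles.
    """
    match = CALL.fullmatch(cell.strip())
    if match is None:
        return None
    first, second, n_first, n_second = match.groups()
    n_first, n_second = int(n_first), int(n_second)
    total = n_first + n_second
    return {first: Fraction(n_first, total), second: Fraction(n_second, total)}


for cell in ("G or C (5/6)", "T or G (19/10)", "C or T (NA)"):
    print(cell, allele_fractions(cell))
```

With only a handful of reads per site, these fractions are coarse estimates of how the variants are distributed in the population rather than precise frequencies.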