Charles M Thurlow1, Sandeep J Joseph1, Lilia Ganova-Raeva2, Samantha S Katz1, Lara Pereira1, Cheng Chen1, Alyssa Debra1, Kendra Vilfort1, Kimberly Workowski1,3, Stephanie E Cohen4, Hilary Reno5,6, Yongcheng Sun1, Mark Burroughs7, Mili Sheth7, Kai-Hua Chi1, Damien Danavall1, Susan S Philip3, Weiping Cao1, Ellen N Kersh1, Allan Pillay1.
Abstract
Downstream next-generation sequencing (NGS) of the syphilis spirochete Treponema pallidum subspecies pallidum (T. pallidum) is hindered by low bacterial loads and the overwhelming presence of background metagenomic DNA in clinical specimens. In this study, we investigated two enrichment methods to improve the yields of T. pallidum DNA in isolates and lesion specimens from syphilis patients: (i) selective whole-genome amplification (SWGA), which uses multiple displacement amplification (MDA) in conjunction with custom oligonucleotides with an increased specificity for the T. pallidum genome, and (ii) the capture and removal of 5'-C-phosphate-G-3' (CpG) methylated host DNA using the NEBNext Microbiome DNA enrichment kit, followed by MDA with the REPLI-g single cell kit. Sequencing was performed using the Illumina MiSeq v2 500 cycle or NovaSeq 6000 SP platform. These two enrichment methods led to 93 to 98% genome coverage at 5 reads/site in 5 clinical specimens from the United States and rabbit-propagated isolates, containing >14 T. pallidum genomic copies/μL of sample for SWGA and >129 genomic copies/μL for CpG methylation capture with MDA. Variant analysis using sequencing data derived from SWGA-enriched specimens showed that all 5 clinical strains had the A2058G mutation associated with azithromycin resistance. SWGA is a robust method that allows direct whole-genome sequencing (WGS) of specimens containing very low numbers of T. pallidum, which has been challenging until now.
IMPORTANCE Syphilis is a sexually transmitted, disseminated acute and chronic infection caused by the bacterial pathogen Treponema pallidum subspecies pallidum. Primary syphilis typically presents as single or multiple mucocutaneous lesions and, if left untreated, can progress through multiple stages with various clinical manifestations. Molecular studies often rely on direct amplification of DNA sequences from clinical specimens; however, this can be impacted by inadequate samples due to disease progression or timing of patients seeking clinical care. While genotyping has provided important data on circulating strains over the past 2 decades, WGS data are needed to better understand strain diversity, perform evolutionary tracing, and monitor antimicrobial resistance markers. The significance of our research is the development of an SWGA DNA enrichment method that expands the range of clinical specimens that can be directly sequenced to include samples with low numbers of T. pallidum.
Keywords: DNA enrichment; Treponema pallidum; metagenomics; syphilis; whole-genome sequencing
Year: 2022 PMID: 35491834 PMCID: PMC9241506 DOI: 10.1128/msphere.00009-22
Source DB: PubMed Journal: mSphere ISSN: 2379-5042 Impact factor: 5.029
Clinical and laboratory data for specimens and the T. pallidum isolate
| Sample/isolate ID | Collection yr | Gender | Sexual orientation | Syphilis stage | Site of lesion | Antibody titer (assay) | qPCR (T. pallidum polA copies/μL) | RNP | Extraction method | Reference or source |
|---|---|---|---|---|---|---|---|---|---|---|
| CDC-SF003 | 2017 | Male | MSM | Primary | Penis | 1:4 (VDRL) | 9,680 | NA | Standard | |
| EUHM-001 | 2019 | Male | MSM | Secondary | Neck | 1:128 (RPR) | <1 | 29.59 ± 0.20 | Standard | This study |
| EUHM-002 | 2019 | Male | MSM | Secondary | Perianal | 1:256 (RPR) | <1 | 28.15 ± 0.04 | Standard | This study |
| EUHM-003 | 2019 | Male | MSM | Secondary | Penis | 1:32 (RPR) | <1 | 29.35 ± 0.08 | Standard | This study |
| EUHM-004 | 2019 | Male | MSM | Primary | Penis | 1:4 (RPR) | 106.7 ± 6.5 | 25.48 ± 0.04 | Standard | This study |
| EUHM-005 | 2019 | Male | MSM | Secondary | Penis | 1:64 (RPR) | <1 | 33.34 ± 0.03 | Standard | This study |
| EUHM-006 | 2019 | Male | MSM | Primary | Penis | 1:16 (RPR) | <1 | 31.62 ± 0.03 | Standard | This study |
| EUHM-007 | 2019 | Male | MSM | Secondary | Hand | 1:64 (RPR) | <1 | 38.32 ± 0.1 | Standard | This study |
| EUHM-008 | 2019 | Male | MSM | Secondary | Scrotum | 1:64 (RPR) | 0.9 ± 0.1 | 31.00 ± 0.1 | Standard | This study |
| EUHM-009 | 2019 | Male | MSM | Secondary | Scrotum | 1:64 (RPR) | <1 | 33.24 ± 0.14 | Standard | This study |
| EUHM-010 | 2019 | Male | MSM | Secondary | Scrotum | 1:128 (RPR) | <1 | 31.27 ± 0.08 | Standard | This study |
| EUHM-011 | 2019 | Male | MSM | Primary | Penis | 1:32 (RPR) | <1 | 32.87 ± 0.21 | Standard | This study |
| EUHM-012 | 2019 | Male | MSM | Primary | Penis | 1:8 (RPR) | 31.5 ± 0.5 | 22.61 ± 0.08 | Large scale | This study |
| EUHM-013 | 2020 | Male | MSM | Secondary | Penis | 1:64 (RPR) | 122 ± 1.2 | 31.38 ± 0.21 | Large scale | This study |
| EUHM-014 | 2020 | Male | MSM | Secondary | NA | 1:16 (RPR) | 103 ± 6.7 | 24.06 ± 0.1 | Large scale | This study |
| STLC-001 | 2020 | Male | MSW | Primary | Penis | NR (RPR) | 28.8 ± 3.1 | 25.57 ± 0.07 | Standard | This study |
NA, not available; NR, nonreactive.
EUHM, Emory University Hospital, Atlanta, GA; STLC, St. Louis County STD Clinic, St. Louis, MO.
MSM, men who have sex with men; MSW, men who have sex with women.
VDRL, Venereal Disease Research Laboratory test; RPR, rapid plasma reagin test.
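The abstract reports input thresholds of >14 T. pallidum genomic copies/μL for SWGA and >129 copies/μL for CpG methylation capture with MDA (NEB+MDA). As a rough illustration, the sketch below screens the mean qPCR values from the table above against those two thresholds. The function names, the handling of "<1" as below threshold, and the use of mean values without their ± spread are assumptions made for this example; it is not the authors' selection procedure.

```python
# Mean qPCR values (copies/uL) transcribed from the table above.
# "<1" is kept as a string because no numeric value is reported.
QPCR = {
    "CDC-SF003": "9,680", "EUHM-001": "<1", "EUHM-002": "<1",
    "EUHM-003": "<1", "EUHM-004": "106.7", "EUHM-005": "<1",
    "EUHM-006": "<1", "EUHM-007": "<1", "EUHM-008": "0.9",
    "EUHM-009": "<1", "EUHM-010": "<1", "EUHM-011": "<1",
    "EUHM-012": "31.5", "EUHM-013": "122", "EUHM-014": "103",
    "STLC-001": "28.8",
}

SWGA_MIN = 14.0      # copies/uL threshold reported for SWGA
NEB_MDA_MIN = 129.0  # copies/uL threshold reported for NEB+MDA


def parse_copies(value):
    """Return copies/uL as a float, or None for values reported as '<1'."""
    if value.startswith("<"):
        return None
    return float(value.replace(",", ""))


def candidates(table, threshold):
    """Return sorted sample IDs whose mean copies/uL exceed the threshold."""
    selected = []
    for sample_id, raw in table.items():
        copies = parse_copies(raw)
        if copies is not None and copies > threshold:
            selected.append(sample_id)
    return sorted(selected)


swga_ok = candidates(QPCR, SWGA_MIN)
neb_mda_ok = candidates(QPCR, NEB_MDA_MIN)
```

Under these assumptions, `swga_ok` holds the six samples above 14 copies/μL and `neb_mda_ok` holds only CDC-SF003.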
FIG 1 T. pallidum gDNA copies/μL for the 10-fold dilution series spiked samples enriched by NEB+MDA or SWGA. Spiked samples were composed of a 10-fold dilution series of T. pallidum Nichols DNA and a constant concentration of human DNA. T. pallidum genome DNA (copies/μL of DNA extract) in samples pre- and postenrichment was estimated using PCR targeting the polA gene and is shown in the bar graph. The y axis has been log10-scaled for depiction of the nonenriched dilution series. Error bars represent the standard error among three replicate enriched T. pallidum samples.
FIG 2 Relative percent T. pallidum Nichols DNA in total DNA for nonenriched and NEB+MDA- and SWGA-enriched spiked samples. Spiked samples were composed of a 10-fold dilution series of T. pallidum Nichols DNA and a constant concentration of human DNA. The percent T. pallidum DNA in total DNA was calculated based on the input DNA concentration and gDNA copies/μL (nonenriched) and the DNA concentration and gDNA copies/μL for the Nichols-spiked samples postenrichment (NEB+MDA or SWGA). Genome copies were estimated from measured T. pallidum polA copies/μL of DNA extract. The y axis is log10-scaled for depiction of the nonenriched dilution series. Error bars represent the standard error among three replicate samples.
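The percent calculation described in this caption can be sketched as a conversion from genome copies/μL to DNA mass, divided by the total DNA concentration. The exact formula used in the study is not given here, so the constants below are assumptions: an approximate T. pallidum genome length of 1.14 Mb, an average of 650 g/mol per base pair of double-stranded DNA, and hypothetical function names.

```python
AVOGADRO = 6.02214076e23    # molecules per mole
BP_MASS_G_PER_MOL = 650.0   # assumed average mass of one double-stranded base pair
TP_GENOME_BP = 1.14e6       # assumed approximate T. pallidum genome length (bp)


def genome_mass_ng(genome_bp=TP_GENOME_BP):
    """Mass of a single genome copy in nanograms."""
    return genome_bp * BP_MASS_G_PER_MOL / AVOGADRO * 1e9


def percent_tp_dna(copies_per_ul, total_dna_ng_per_ul):
    """Percent of total DNA (by mass) attributable to T. pallidum."""
    if total_dna_ng_per_ul <= 0:
        raise ValueError("total DNA concentration must be positive")
    tp_ng_per_ul = copies_per_ul * genome_mass_ng()
    return 100.0 * tp_ng_per_ul / total_dna_ng_per_ul


# Hypothetical input: 1e6 copies/uL in an extract holding 10 ng/uL of total DNA.
example = percent_tp_dna(1e6, 10.0)
```

With these constants one genome copy weighs a little over one femtogram, so even thousands of copies/μL make up a tiny fraction of a human-dominated extract. That is why both this figure and FIG 1 use a log10-scaled y axis.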
FIG 3 Percent coverage of sequencing reads of enriched T. pallidum Nichols spiked samples. Treponemal reads with at least 1 read mapped per site (1×) against the T. pallidum subsp. pallidum Nichols reference genome (GenBank version number NC_000919.1) and percent coverage of the T. pallidum genome are shown. (A) Sequencing reads of samples enriched using the NEB+MDA method. (B) Sequencing reads of samples enriched using SWGA. All samples were sequenced using the Illumina NovaSeq 6000 platform. Error bars represent the standard error between the mapped reads derived from three replicate enriched Nichols samples.
FIG 4 Percent coverage of a nonenriched T. pallidum Nichols isolate control containing 1,063.1 ± 45.22 T. pallidum genomic copies/μL of DNA extract, NEB+MDA-enriched clinical isolate CDC-SF003, and SWGA-enriched clinical specimens sequenced using the Illumina MiSeq v2 (500-cycle) platform. The percentages of T. pallidum reads are derived from down-selected T. pallidum reads. Prefiltered reads for Nichols-CDC were mapped to the Nichols reference genome (GenBank version number NC_000919.1). The prefiltered reads in all clinical isolates and specimens were mapped against the SS14 reference genome (GenBank version number NC_021508.1).
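Percent coverage at a depth threshold (1×, 5×, or 10×) is the share of reference positions covered by at least that many reads. The sketch below computes it from a list of per-site depths. The function names and the toy depth values are illustrative assumptions; in practice the depths would come from the mapped reads, for example one value per reference position as produced by a tool such as `samtools depth -a`.

```python
def breadth_of_coverage(depths, min_depth):
    """Percent of reference positions with depth >= min_depth."""
    if not depths:
        raise ValueError("depths must contain at least one position")
    covered = sum(1 for d in depths if d >= min_depth)
    return 100.0 * covered / len(depths)


def mean_depth(depths):
    """Mean read depth across all reference positions."""
    if not depths:
        raise ValueError("depths must contain at least one position")
    return sum(depths) / len(depths)


# Toy per-site depths for a 10-position reference (not study data).
toy = [0, 1, 4, 5, 12, 30, 0, 7, 10, 3]
cov_1x = breadth_of_coverage(toy, 1)
cov_5x = breadth_of_coverage(toy, 5)
cov_10x = breadth_of_coverage(toy, 10)
avg = mean_depth(toy)
```

Breadth can only fall or stay level as the threshold rises, which matches the pattern in the ≥1×, ≥5×, and ≥10× columns reported for each sample.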
Sequencing percent coverage for the Nichols isolate, clinical isolate CDC-SF003, and clinical specimens across the T. pallidum reference genome
| Sample | Enrichment method | Clonal complex | T. pallidum copies/μL (polA qPCR) | Raw read pairs | Nonhost read pairs | Total read pairs after QC | Read pairs classified as T. pallidum | Total read pairs classified as T. pallidum (%) | Mean read depth | Genome covered ≥1× (%) | Genome covered ≥5× (%) | Genome covered ≥10× (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Nichols_CDC | Nonenriched | Nichols-like | 1,063.1 ± 45.22 | 4,053,500 | 3,645,649 | 3,588,414 | 70,299 | 1.96 | 6.33 | 86.26 | 60.30 | 22.28 |
| CDC-SF003 | NEB+MDA | SS14-like | 2,394,930 ± 135,210 | 5,798,777 | 3,988,173 | 3,949,036 | 129,998 | 3.29 | 46.44 | 98.87 | 98.60 | 98.01 |
| EUHM-004 | SWGA | Nichols-like | 6,367,089.5 ± 240,811.5 | 6,102,826 | 4,440,618 | 4,280,401 | 1,403,645 | 32.79 | 370.39 | 96.99 | 95.13 | 92.67 |
| EUHM-012 | SWGA | Nichols-like | 2,140,753 ± 28,192 | 10,350,274 | 5,870,287 | 5,716,082 | 2,793,693 | 48.87 | 639.86 | 96.34 | 93.98 | 91.89 |
| EUHM-013 | SWGA | SS14-like | 5,159,716 ± 220,318.5 | 11,975,324 | 11,966,460 | 11,838,431 | 8,308,234 | 70.18 | 2,503.96 | 98.72 | 98.56 | 98.37 |
| EUHM-014 | SWGA | Nichols-like | 2,573,508 ± 221,900.5 | 11,250,518 | 9,266,926 | 9,059,022 | 2,355,426 | 26.00 | 930.87 | 98.79 | 98.49 | 98.04 |
| STLC-001 | SWGA | SS14-like | 7,420,534 ± 719,765 | 11,293,960 | 7,770,834 | 7,721,767 | 3,004,631 | 38.91 | 1,133.43 | 98.32 | 95.94 | 94.10 |
All sequencing was performed using Illumina’s MiSeq v2 (500 cycle) platform.
Nonenriched T. pallidum Nichols isolate used as MiSeq control.
Based on T. pallidum polA calculated copies/μL.
Calculated after quality assessment and T. pallidum selection of reads.
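As a consistency check on the table above, the percentage of read pairs classified as T. pallidum can be recomputed as the classified read pairs divided by the total read pairs after QC. The sketch below does this with counts transcribed from the table. The variable and function names are assumptions for illustration, and rounding to two decimals is assumed to match the reported precision.

```python
# (read pairs after QC, read pairs classified as T. pallidum, reported percent)
READS = {
    "Nichols_CDC": (3_588_414, 70_299, 1.96),
    "CDC-SF003": (3_949_036, 129_998, 3.29),
    "EUHM-004": (4_280_401, 1_403_645, 32.79),
    "EUHM-012": (5_716_082, 2_793_693, 48.87),
    "EUHM-013": (11_838_431, 8_308_234, 70.18),
    "EUHM-014": (9_059_022, 2_355_426, 26.00),
    "STLC-001": (7_721_767, 3_004_631, 38.91),
}


def percent_classified(after_qc, classified):
    """Percent of post-QC read pairs classified as T. pallidum."""
    return 100.0 * classified / after_qc


# Recompute each percentage and flag whether it matches the reported value.
recomputed = {
    sample: round(percent_classified(after_qc, classified), 2)
    for sample, (after_qc, classified, _) in READS.items()
}
matches = {
    sample: recomputed[sample] == reported
    for sample, (_, _, reported) in READS.items()
}
```

Each recomputed value agrees with the reported percentage at two decimal places. This supports reading that column as the classified fraction of post-QC read pairs.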
FIG 5 SWGA primer set validation. (A) T. pallidum gDNA copies/μL for the Nichols spiked sample (1:100 dilution) enriched with each SWGA primer set (1 to 12). (B) Relative percent T. pallidum DNA for the Nichols spiked sample (1:100 dilution) enriched with each SWGA primer set. Percent T. pallidum DNA was calculated based on the input DNA concentration and gDNA copies/μL for the Nichols mock samples post-SWGA enrichment. The spiked sample contained purified human gDNA, and the T. pallidum genome copies were derived from qPCR-measured T. pallidum polA copies/μL of DNA extract. The input T. pallidum gDNA copies/μL of DNA are displayed as nonenriched. The y axis is log10-scaled in each panel for depiction of the relative percent T. pallidum postenrichment with each primer set. Error bars represent the standard error among three replicate Nichols samples.
FIG 6 Maximum likelihood global phylogenetic tree of the 7 T. pallidum strains sequenced in this study along with 122 high-quality (with 5× read depth covering >90% of the genome) publicly available T. pallidum genomes. The two major lineages, Nichols-like and SS14-like, are highlighted along with the presence of the genotypic mutation responsible for macrolide resistance and the country of origin.