Gagan A Pandya, M Catherine McEllistrem, Pratap Venepally, Michael H Holmes, Behnam Jarrahi, Ravi Sanka, Jia Liu, Svetlana A Karamycheva, Yun Bai, Robert D Fleischmann, Scott N Peterson.
Abstract
BACKGROUND: While the pneumococcal protein conjugate vaccines reduce the incidence of invasive pneumococcal disease (IPD), serotype replacement remains a major concern. Thus, serotype-independent protection with vaccines targeting virulence genes, such as pspA, has been pursued. PspA comprises diverse clades that arose through recombination. Therefore, multi-locus sequence typing (MLST)-defined clones could conceivably include strains from multiple PspA clades. As a result, a method is needed that can both monitor the long-term epidemiology of the pneumococcus among a large number of isolates and analyze vaccine-candidate genes, such as pspA, for mutations and recombination events that could result in 'vaccine escape' strains.
Year: 2011 PMID: 21264340 PMCID: PMC3018475 DOI: 10.1371/journal.pone.0015950
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
ST-complex and serotype/serogroup designation of 72 pneumococcal strains.
| ST complex | Total number of strains/ST-complex | Serotype/Serogroup (no. strains) |
| ST81 | 5 | 23F (2), 19F (2), 23B (1) |
| ST180 | 18 | 3 (18) |
| ST199 | 3 | 19A (2), 15 (1) |
| ST377 | 4 | 35B (1), 35 (1), 6B (1), 14 (1) |
| ST690 | 3 | 6A (1), 19A (1), NT (1) |
| ST9 | 1 | 14 (1) |
| ST18 | 1 | 14 (1) |
| ST20 | 1 | 14 (1) |
| ST37 | 1 | 23F (1) |
| ST41 | 1 | 19A (1) |
| ST62 | 1 | 11 (1) |
| ST63 | 1 | 15A (1) |
| ST67 | 1 | 14 (1) |
| ST75 | 1 | 19A (1) |
| ST90 | 2 | 6B (2) |
| ST113 | 1 | 18C (1) |
| ST156 | 2 | 9V (2) |
| ST173 | 1 | 23F (1) |
| ST175 | 1 | 19A (1) |
| ST177 | 1 | 19F (1) |
| ST185 | 1 | 6B (1) |
| ST205 | 1 | 4 (1) |
| ST227 | 1 | 1 (1) |
| ST236 | 1 | 19F (1) |
| ST242 | 1 | 23F (1) |
| ST268 | 1 | 19A (1) |
| ST270 | 1 | 6B (1) |
| ST289 | 1 | 5 (1) |
| ST315 | 1 | 6B (1) |
| ST376 | 1 | 6A (1) |
| ST384 | 1 | 6B (1) |
| ST393 | 1 | 38 (1) |
| ST433 | 1 | 22 (1) |
| ST498 | 1 | 35 (1) |
| ST659 | 1 | 16 (1) |
| ST816 | 1 | 10 (1) |
| ST1201 | 1 | 7 (1) |
| ST1257 | 1 | 20 (1) |
| TIGR4 | 1 | 4 (1) |
| R6 | 1 | 2 (1) |
| G54 | 1 | 19F (1) |
| 670 | 1 | 6B (1) |
Serotype/Serogroup and ST-complex designation of 72 pneumococcal strains.
| Serotype/Serogroup | ST-complexes (no. strains) |
| 1 | 227 (1) |
| 2 | R6 strain |
| 3 | 180 (18) |
| 4 | 205 (1) & TIGR4 strain |
| 5 | 289 (1) |
| 6A | 376 (1), 690 (1) |
| 6B | 90 (2), 185 (1), 270 (1), 315 (1), 377 (1), 384 (1) & 670 strain |
| 7 | 1201 (1) |
| 9V | 156 (2) |
| 10 | 816 (1) |
| 11 | 62 (1) |
| 14 | 9 (1), 18 (1), 20 (1), 67 (1), 377 (1) |
| 15 | 199 (1) |
| 15A | 63 (1) |
| 16 | 659 (1) |
| 18C | 113 (1) |
| 19A | 41 (1), 75 (1), 175 (1), 199 (2), 268 (1), 690 (1) |
| 19F | 81 (2), 177 (1), 236 (1) & G54 strain |
| 20 | 1257 (1) |
| 22 | 433 (1) |
| 23B | 81 (1) |
| 23F | 37 (1), 81 (2), 173 (1), 242 (1) |
| 35 | 377 (1), 498 (1) |
| 35B | 377 (1) |
| 38 | 393 (1) |
| NT | 690 (1) |
The eleven genomic fragments included on the TIGR4 resequencing chip
| Gene Classification | Name/Locus | Gene/Sequence | Length (bp) | Sequenced length (bp) |
| Conserved | 16S_rRNA | 16S rRNA | 1413 | 1389 |
| Conserved | SP_0834 | Hemolysin-related protein | 510 | 486 |
| Conserved | SP_1204 | Hemolysin A - putative | 594 | 570 |
| Conserved | SP_1466 | Hemolysin | 645 | 621 |
| Conserved | SP_1961 | DNA-directed RNA polymerase – β subunit | 3609 | 3585 |
| Variable | SP_0368 | Cell wall surface anchor family protein 1 | 5301 | 5277 |
| Variable | SP_1833 | Cell wall surface anchor family protein 2 | 2124 | 2100 |
| Variable | SP_1992 | Cell wall surface anchor family protein 3 | 663 | 639 |
| Variable | SP_2145 | Antigen, cell wall surface anchor family | 2082 | 2058 |
| Variable | SP_0667 | Pneumococcal surface protein - putative | 996 | 972 |
| Variable | SP_0117 | Pneumococcal surface protein A | 2232 | 2208 |
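As a quick consistency check on the fragment table, the following minimal sketch recomputes the gap between each fragment's full length and its sequenced length, and the two column totals. The values are copied from the table; the function names are illustrative and not from the source. The source does not say why the two lengths differ, so the sketch only reports the difference without interpreting it.

```python
# (locus, length_bp, sequenced_length_bp), copied from the fragment table
FRAGMENTS = [
    ("16S_rRNA", 1413, 1389),
    ("SP_0834", 510, 486),
    ("SP_1204", 594, 570),
    ("SP_1466", 645, 621),
    ("SP_1961", 3609, 3585),
    ("SP_0368", 5301, 5277),
    ("SP_1833", 2124, 2100),
    ("SP_1992", 663, 639),
    ("SP_2145", 2082, 2058),
    ("SP_0667", 996, 972),
    ("SP_0117", 2232, 2208),
]


def length_offsets(fragments):
    """Return the set of distinct (length - sequenced length) values."""
    return {length - sequenced for _, length, sequenced in fragments}


def totals(fragments):
    """Return (total length, total sequenced length) in bp."""
    total_length = sum(length for _, length, _ in fragments)
    total_sequenced = sum(sequenced for _, _, sequenced in fragments)
    return total_length, total_sequenced


if __name__ == "__main__":
    print("distinct offsets (bp):", length_offsets(FRAGMENTS))
    print("total length / sequenced length (bp):", totals(FRAGMENTS))
```

Every fragment is shortened by the same number of bases, so the sequenced total is the full total minus eleven times that offset.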
Analysis of S. pneumoniae DNA sequence polymorphism using DnaSP program†.
| Group of strains | Number of strains | Number of ST-complexes or serotypes | Total Number of Sites/Net Sites | Total Number of Mutations (Eta) | Number of Segregating Polymorphic sites (S) | Nucleotide Diversity (π) | Average Number of Nucleotide Differences (k) | Mutation Rate θG |
| All - genes (a) | 72 | | 21231/9791 | 71 (696) | 66 (649) | 0.0010 (0.0097) | 9.74 (95.39) | 14.67 (143.60) |
| Conserved - genes (b) | 72 | | 6704/5338 | 27 (146) | 27 (145) | 0.0005 (0.0028) | 2.83 (15.09) | 5.64 (30.12) |
| Variable - genes (c) | 72 | | 14465/4451 | 123 (547) | 113 (503) | 0.0040 (0.0179) | 17.94 (79.86) | 25.35 (112.86) |
| Serotype 3 (and ST180 complex) | 18 | 1 | 20322/17816 | 31 (560) | 31 (550) | 0.0004 (0.0077) | 7.66 (136.53) | 9.14 (162.81) |
| Serotype 6B | 8 | 6 | 20359/17814 | 66 (1169) | 58 (1037) | 0.0013 (0.0231) | 23.13 (412.00) | 25.31 (450.85) |
| Serotype 14 | 5 | 5 | 20034/13168 | 53 (695) | 50 (663) | 0.0020 (0.0268) | 26.75 (352.20) | 25.33 (333.60) |
| Serotype 19A | 7 | 6 | 20406/13160 | 75 (983) | 65 (852) | 0.0021 (0.0279) | 27.86 (366.62) | 30.49 (401.22) |
| Serotype 19F | 5 | 3 | 20186/16655 | 35 (586) | 35 (582) | 0.0010 (0.0166) | 16.61 (276.60) | 16.89 (281.28) |
| Serotype 23F | 5 | 4 | 20078/17462 | 33 (576) | 32 (559) | 0.0009 (0.0152) | 15.18 (265.10) | 15.83 (276.48) |
| ST180 (and serotype 3) | 18 | 1 | 20322/17816 | 31 (560) | 31 (550) | 0.0004 (0.0077) | 7.66 (136.53) | 9.14 (162.81) |
| ST81 | 5 | 3 | 20004/16770 | 5 (84) | 5 (80) | 0.0002 (0.0026) | 2.61 (43.80) | 2.40 (40.32) |
| ST199 | 3 | 2 | 19861/18779 | 24 (443) | 24 (442) | 0.0008 (0.0157) | 15.71 (295.00) | 15.73 (295.33) |
| ST377 | 4 | 4 | 19841/18797 | 7 (126) | 7 (126) | 0.0002 (0.0036) | 3.57 (67.17) | 3.66 (68.73) |
| ST690 | 3 | 3 | 19591/19007 | 1 (11) | 1 (11) | 0.0000 (0.0004) | 0.39 (7.33) | 0.39 (7.33) |
a: Sequences were concatenated in the order 16S rRNA, hemolysin-related protein, putative hemolysin A, hemolysin, DNA-directed RNA polymerase – beta subunit, cell wall surface anchor family protein 1, cell wall surface anchor family protein 2, cell wall surface anchor family protein 3, antigen -cell wall surface anchor family protein, pneumococcal surface protein A, putative pneumococcal surface protein.
b: Order of sequence concatenation for 5 conserved genes was 16S rRNA, hemolysin-related protein, putative hemolysin A, hemolysin, DNA-directed RNA polymerase – beta subunit for the analysis.
c: Six variable gene sequences were concatenated for the analysis in the order cell wall surface anchor family protein 1, cell wall surface anchor family protein 2, cell wall surface anchor family protein 3, antigen - cell wall surface anchor family protein, pneumococcal surface protein A, putative pneumococcal surface protein.
*: Data normalized for net number of sites and expressed per 1000 bases. Raw values are shown in the parentheses.
†: Data generated using DnaSP program, version 5.1.
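The normalization described in the footnote (raw values divided by the net number of sites and expressed per 1000 bases) can be sketched as below. This is an illustrative reconstruction, not the authors' script; the function name `per_kb` is an assumption, and the inputs are the raw values and net sites from the "All - genes" row of the table.

```python
def per_kb(raw_value, net_sites):
    """Scale a raw statistic to a per-1000-bases value using net sites."""
    if net_sites <= 0:
        raise ValueError("net_sites must be positive")
    return raw_value * 1000.0 / net_sites


if __name__ == "__main__":
    net_sites = 9791  # net sites, "All - genes" row
    raw = {"Eta": 696, "S": 649, "k": 95.39, "theta": 143.60}
    for name, value in raw.items():
        # Print the normalized value to two decimals for comparison
        print(f"{name}: {per_kb(value, net_sites):.2f}")
```

Rounded to the precision used in the table, the outputs agree with the normalized values reported for that row (71, 66, 9.74 and 14.67).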
Figure 1. Call rate per pneumococcal genomic fragment.
The upper panel shows the resequencing profile of the eleven genomic fragments tiled on the chip and analyzed from 72 strains in duplicate. The variability in call rate increases as the query sequence diverges from the reference on the chip. The vertical bars represent the standard deviation of the results. The lower panel shows how the sequence information for the genomic fragments was supplemented using the complementary ABI Sanger sequencing method. The cumulative data covered ≥95% of the sequence of each fragment. Light blue bars: resequencing array platform; dark red bars: cumulative data from resequencing array and Sanger sequencing platforms.
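To make the two quantities in this caption concrete, here is a minimal sketch under stated assumptions: the call rate is taken to be the fraction of positions with a base call (anything other than `N`), and the cumulative data are formed by filling array no-calls with Sanger calls at the same position. Both definitions, the function names, and the toy sequences are assumptions for illustration; the source does not spell out its exact procedure.

```python
def call_rate(seq, no_call="N"):
    """Fraction of positions in seq that carry a base call."""
    if not seq:
        raise ValueError("sequence must not be empty")
    called = sum(1 for base in seq.upper() if base != no_call)
    return called / len(seq)


def merge_calls(array_seq, sanger_seq, no_call="N"):
    """Fill array no-calls with the Sanger call at the same position."""
    if len(array_seq) != len(sanger_seq):
        raise ValueError("sequences must be aligned to the same length")
    return "".join(
        a if a.upper() != no_call else s
        for a, s in zip(array_seq, sanger_seq)
    )


if __name__ == "__main__":
    array_seq = "ACGTNNGTAC"   # toy array read with two no-calls
    sanger_seq = "NNNNACNNNN"  # toy Sanger read covering the gap
    merged = merge_calls(array_seq, sanger_seq)
    print(call_rate(array_seq), call_rate(merged))
```

A fragment would meet the ≥95% threshold mentioned in the caption when `call_rate` on the merged sequence is at least 0.95.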
Figure 2. Phylogenetic clustering of pneumococcal strains using concatenated genomic sequences of all 11 genes.
Genomic sequence information from all 11 gene fragments of 72 S. pneumoniae strains was used to generate the phylogenetic tree, and the MrBayes program was used to generate the consensus tree as described in Methods. The clustering was viewed and edited in TreeDyn (http://www.treedyn.org/). The posterior probability scores of clusters at their nodes ranged from 0.5 to 1.0: 63% of clusters had a score of 1.0, and only 20% had a score of 0.5. Clusters of strains that were unexpected based on MLST classifications are indicated with shaded backgrounds. A: ST-complex designation, B: serotype/serogroup designation.
Figure 3. In silico mapping of pspA gene deletions to protein domains.
Deletions in the pspA gene identified in silico for S. pneumoniae strains belonging to the ST180, ST81 and ST199 complexes were used for this analysis. Data are shown for eighteen strains of the ST180-complex (upper panel), three strains of the ST81-complex (middle panel), and three strains of the ST199-complex (lower panel). Two strains of the ST81-complex (PA195 and PA189) were excluded because of low sequence coverage. The nucleotide coordinates and protein domains of the pspA gene are shown at the top of each panel. The thickness of the vertical line represents the size of the deletion. SP: signal peptide, A: ∼first 100 residues of mature protein, A*: transition zone between A and B, B: clade-defining region, C: proline rich (PR) and non-proline rich (NPR) regions, CB: choline binding domain.
Quantitative SNP profile of targeted genomic regions among pneumococcal strains.
| Name/Locus | Gene Classification | Gene/Sequence | Length (bp) | No. of strains | Total SNPs | SNPs/Kb/strain | Fold difference* |
| 16S_rRNA | Conserved | 16S rRNA | 1413 | 72 | 124 | 1.22 | 1.00 |
| SP_0834 | Conserved | Hemolysin-related protein | 510 | 72 | 393 | 10.70 | 8.78 |
| SP_1204 | Conserved | Hemolysin A - putative | 594 | 72 | 165 | 3.86 | 3.17 |
| SP_1466 | Conserved | Hemolysin | 645 | 72 | 115 | 2.48 | 2.03 |
| SP_1961 | Conserved | DNA-directed RNA polymerase, beta subunit | 3609 | 72 | 1344 | 5.17 | 4.24 |
| SP_0368 | Variable | Cell wall surface anchor family protein 1 | 5301 | 69 | 6862 | 18.76 | 15.39 |
| SP_1833 | Variable | Cell wall surface anchor family protein 2 | 2124 | 71 | 598 | 3.97 | 3.26 |
| SP_1992 | Variable | Cell wall surface anchor family protein 3 | 663 | 72 | 331 | 6.93 | 5.69 |
| SP_2145 | Variable | Antigen, cell wall surface anchor family | 2082 | 72 | 2733 | 18.23 | 14.96 |
| SP_0667 | Variable | Pneumococcal surface protein - putative | 996 | 69 | 798 | 11.61 | 9.53 |
| SP_0117 | Variable | Pneumococcal surface protein A | 2232 | 68 | 15240 | 100.41 | 82.38 |
No. of strains: strains with ≥95% sequence coverage.
*: compared to 16S rRNA.
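The two derived columns of the SNP table can be reproduced from the other columns. The sketch below assumes that SNPs/Kb/strain is the total SNP count divided by the fragment length in kilobases and by the number of strains, and that the fold difference is the ratio of that density to the 16S rRNA density. These formulas are inferred from the tabulated numbers rather than stated in the source, and the function names are illustrative.

```python
def snps_per_kb_per_strain(total_snps, length_bp, n_strains):
    """SNP density: total SNPs per kilobase of fragment per strain."""
    return total_snps / (length_bp / 1000.0) / n_strains


def fold_difference(density, reference_density):
    """Ratio of a fragment's SNP density to the reference density."""
    return density / reference_density


if __name__ == "__main__":
    # Inputs copied from the 16S rRNA and SP_0117 (pspA) rows of the table
    ref = snps_per_kb_per_strain(124, 1413, 72)
    pspa = snps_per_kb_per_strain(15240, 2232, 68)
    print(f"16S rRNA: {ref:.2f} SNPs/Kb/strain")
    print(f"pspA: {pspa:.2f} SNPs/Kb/strain, "
          f"{fold_difference(pspa, ref):.2f}-fold vs 16S rRNA")
```

With these inputs the sketch reproduces the tabulated values for both rows, which supports the inferred formulas.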
Figure 4. Recombination events detected in the pspA gene of 72 strains using the recombination detection package.
Two recombination events are not shown in the figure because they could not be mapped to the corresponding TIGR4 PspA sequence. One of these events, with nucleotide coordinates 178 to 389, was detected in only one strain (PA179). The second event, with coordinates 1128 to 1178, was detected in five strains. Two strains of the ST81-complex (PA195 and PA189) were excluded because of low sequence coverage. The nucleotide coordinates and protein domains of the pspA gene are shown at the top and bottom of the figure. a Number (percent) of 72 strains with this recombination event. b Among ST-complexes that included ≥3 strains: percent of strains belonging to the specific ST-complex with the recombination event, PspA clade designation(s). c SP: signal peptide, A: ∼first 100 residues of mature protein, A*: transition zone between A and B, B: clade-defining region, C: proline rich (PR) and non-proline rich (NPR) regions, CB: choline binding domain.