Andries J van Tonder, James E Bray, Sigríður J Quirk, Gunnsteinn Haraldsson, Keith A Jolley, Martin C J Maiden, Steen Hoffmann, Stephen D Bentley, Ásgeir Haraldsson, Helga Erlendsdóttir, Karl G Kristinsson, Angela B Brueggemann.
Abstract
The pneumococcus is a leading global pathogen and a key virulence factor possessed by the majority of pneumococci is an antigenic polysaccharide capsule ('serotype'), which is encoded by the capsular (cps) locus. Approximately 100 different serotypes are known, but the extent of sequence diversity within the cps loci of individual serotypes is not well understood. Investigating serotype-specific sequence variation is crucial to the design of sequence-based serotyping methodology, understanding pneumococcal conjugate vaccine (PCV) effectiveness and the design of future PCVs. The availability of large genome datasets makes it possible to assess population-level variation among pneumococcal serotypes and in this study 5405 pneumococcal genomes were used to investigate cps locus diversity among 49 different serotypes. Pneumococci had been recovered between 1916 and 2014 from people of all ages living in 51 countries. Serotypes were deduced bioinformatically, cps locus sequences were extracted and variation was assessed within the cps locus, in the context of pneumococcal genetic lineages. Overall, cps locus sequence diversity varied markedly: low to moderate diversity was revealed among serogroups/types 1, 3, 7, 9, 11 and 22; whereas serogroups/types 6, 19, 23, 14, 15, 18, 33 and 35 displayed high diversity. Putative novel and/or hybrid cps loci were identified among all serogroups/types apart from 1, 3 and 9. This study demonstrated that cps locus sequence diversity varied widely between serogroups/types. Investigation of the biochemical structure of the polysaccharide capsule of major variants, particularly PCV-related serotypes and those that appear to be novel or hybrids, is warranted.
Keywords: molecular epidemiology; pneumococcal capsular locus; sequence-based serotyping; vaccine impact
Year: 2016 PMID: 28133541 PMCID: PMC5266551 DOI: 10.1099/mgen.0.000090
Source DB: PubMed Journal: Microb Genom ISSN: 2057-5858
Pneumococcal genomes included in cps locus analyses
| | | Serogroup/type | | | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Genomes* | n | 1 | 3 | 6 | 7 | 9 | 11 | 14 | 15 | 18 | 19 | 22 | 23 | 33 | 35 |
| Thailand | 1875 | 6 | 35 | 396 | 25 | 42 | 49 | 147 | 161 | 32 | 482 | 30 | 325 | 59 | 86 |
| Iceland | 1536 | 6 | 60 | 338 | 15 | 49 | 78 | 59 | 87 | 22 | 448 | 50 | 255 | 27 | 42 |
| USA (MA) | 540 | 0 | 11 | 97 | 14 | 10 | 49 | 4 | 82 | 5 | 104 | 20 | 73 | 5 | 66 |
| UK | 428 | 2 | 8 | 126 | 6 | 8 | 37 | 7 | 50 | 1 | 50 | 27 | 62 | 16 | 28 |
| Serotype 1 | 317 | 317 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| PMEN2 | 172 | 0 | 0 | 172 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| PMEN1 | 126 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 124 | 0 | 0 |
| Historical | 100 | 5 | 6 | 2 | 11 | 12 | 5 | 8 | 5 | 7 | 15 | 3 | 9 | 6 | 6 |
| Serotype 3 | 75 | 0 | 75 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| PMEN ref | 21 | 4 | 0 | 5 | 1 | 1 | 0 | 3 | 0 | 1 | 3 | 0 | 3 | 0 | 0 |
| GenBank | 215 | 8 | 10 | 44 | 4 | 4 | 4 | 30 | 0 | 3 | 83 | 0 | 23 | 2 | 0 |
| Total n | 5405 | 348 | 205 | 1182 | 76 | 126 | 222 | 258 | 385 | 71 | 1185 | 130 | 874 | 115 | 228 |
MA, Massachusetts; PMEN, Pneumococcal Molecular Epidemiology Network.
*Pneumococcal genome studies from which genome sequences were obtained.
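The totals in the table above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the study: the numbers are copied from the table (the first numeric value in each dataset row, and the "Total n" row), and the variable names are illustrative assumptions.

```python
# Per-dataset genome totals: the first numeric value in each row of the table.
dataset_totals = {
    "Thailand": 1875, "Iceland": 1536, "USA (MA)": 540, "UK": 428,
    "Serotype 1": 317, "PMEN2": 172, "PMEN1": 126, "Historical": 100,
    "Serotype 3": 75, "PMEN ref": 21, "GenBank": 215,
}

# Per-serogroup/type totals: the "Total n" row of the table.
serogroup_totals = {
    "1": 348, "3": 205, "6": 1182, "7": 76, "9": 126, "11": 222, "14": 258,
    "15": 385, "18": 71, "19": 1185, "22": 130, "23": 874, "33": 115, "35": 228,
}

# Both ways of summing should give the overall number of genomes analysed.
total_by_dataset = sum(dataset_totals.values())
total_by_serogroup = sum(serogroup_totals.values())

print(total_by_dataset, total_by_serogroup)
```

Both sums should agree with the 5405 genomes stated in the abstract; a mismatch would point to a transcription error in the table.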
Demographic and epidemiological information for serogroups/types with low to moderate diversity in the cps locus
| Serogroup/type | | Years | Countries (n) | Genomes (n) | Carriage/Disease* | | | PCV status* | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | | | | | Carr | Dis | Unk | Pre | Post | Unk |
| 1 | – | 1943–2012 | 22 | 348 | 16 | 260 | 72 | 20 | 5 | 323 |
| 3 | – | 1961–2014 | 10 | 205 | 88 | 111 | 6 | 167 | 36 | 2 |
| 7 | 7A | 1937 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 |
| | 7B | 1952–2010 | 3 | 13 | 11 | 0 | 2 | 13 | 0 | 0 |
| | 7C | 1939–2011 | 4 | 10 | 8 | 0 | 2 | 7 | 2 | 1 |
| | 7F | 1962–2014 | 5 | 36 | 16 | 17 | 3 | 6 | 25 | 5 |
| | 7Hybrid | 1952–2009 | 4 | 16 | 11 | 2 | 3 | 14 | 2 | 0 |
| 9 | 9A | 1962–2009 | 2 | 3 | 0 | 2 | 1 | 2 | 1 | 0 |
| | 9L | 1941–2010 | 3 | 11 | 8 | 0 | 3 | 11 | 0 | 0 |
| | 9Lii | 2008–2010 | 1 | 7 | 7 | 0 | 0 | 7 | 0 | 0 |
| | 9N | 1938–2014 | 5 | 36 | 24 | 7 | 5 | 18 | 14 | 4 |
| | 9V | 1968–2014 | 6 | 58 | 30 | 25 | 3 | 44 | 11 | 3 |
| | 9Vii | 2008–2011 | 2 | 11 | 11 | 0 | 0 | 10 | 0 | 1 |
| 11 | 11A | 1939–2014 | 6 | 166 | 140 | 25 | 1 | 33 | 94 | 39 |
| | 11B | 1940, 2010 | 2 | 2 | 0 | 1 | 1 | 2 | 0 | 0 |
| | 11C | 1957 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 |
| | 11D | 1986 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 |
| | 11E | 2012 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 0 |
| | 11F | 1952 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 |
| | 11Hybrid | 2001 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 0 |
| | 11X | 2008–2010 | 1 | 49 | 49 | 0 | 0 | 49 | 0 | 0 |
| 22 | 22A | 1939–2009 | 2 | 19 | 17 | 0 | 2 | 19 | 0 | 0 |
| | 22F | 1940–2014 | 4 | 98 | 75 | 22 | 1 | 7 | 64 | 27 |
| | 22Hybrid | 2008–2010 | 1 | 13 | 13 | 0 | 0 | 13 | 0 | 0 |
*Carr, carriage; Dis, disease; Unk, unknown; Pre, recovered pre-PCV; Post, recovered post-PCV.
Fig. 1. Phylogenetic trees illustrating cps locus sequence diversity and genetic relationships among pneumococci of serotypes 1 and 3 and serogroup 7. (a) Phylogenetic trees generated based on 15 cps locus genes amongst 348 serotype 1 pneumococcal genomes (left) and reconstructed using the concatenated sequence of 1000 full-length coding loci found in all 348 serotype 1 genomes (right). (b) Phylogenetic trees generated based on 4 cps locus genes amongst 205 serotype 3 pneumococcal genomes (left) and reconstructed using the concatenated sequence of 1160 full-length coding loci found in all 205 serotype 3 genomes (right). (c) Phylogenetic trees generated based on 10 cps locus genes amongst 76 serogroup 7 pneumococcal genomes (left) and reconstructed using the concatenated sequence of 1378 full-length coding loci found in all 76 serogroup 7 genomes (right). The outer ring of the tree on the right indicates the serotype by colour as detailed in the corresponding tree on the left.
Fig. 2. Phylogenetic trees illustrating cps locus sequence diversity and genetic relationships among pneumococci of serogroups 9, 11 and 22. (a) Phylogenetic trees generated based on 12 cps locus genes amongst 126 serogroup 9 pneumococcal genomes (left) and reconstructed using the concatenated sequence of 1259 full-length coding loci found in all 126 serogroup 9 genomes (right). (b) Phylogenetic trees generated based on 10 cps locus genes amongst 222 serogroup 11 pneumococcal genomes (left) and reconstructed using the concatenated sequence of 1204 full-length coding loci found in all 222 serogroup 11 genomes (right). (c) Phylogenetic trees generated based on 16 cps locus genes amongst 130 serogroup 22 pneumococcal genomes (left) and reconstructed using the concatenated sequence of 1390 full-length coding loci found in all 130 serogroup 22 genomes (right). The outer rings of the trees on the right indicate the serotype by colour as detailed in the corresponding trees on the left.
Fig. 3. Phylogenetic trees illustrating cps locus sequence diversity and genetic relationships among pneumococci of serogroups 6, 19 and 23. (a) Phylogenetic trees generated based on 13 cps locus genes amongst 1182 serogroup 6 pneumococcal genomes (left) and reconstructed using the concatenated sequence of 470 full-length coding loci found in all 1182 serogroup 6 genomes (right). Clonal complexes (CCs) with ten or more isolates were coloured as listed in the key. (b) Phylogenetic trees generated based on 13 cps locus genes amongst 1185 serogroup 19 pneumococcal genomes (left) and reconstructed using the concatenated sequence of 505 full-length coding loci found in all 1185 serogroup 19 genomes (right). CCs with five or more isolates were coloured as listed in the key. (c) Phylogenetic trees generated based on 17 cps locus genes amongst 874 serogroup 23 pneumococcal genomes (left) and reconstructed using the concatenated sequence of 628 full-length coding loci found in all 874 serogroup 23 genomes (right). The outer rings of the trees on the right indicate the serotype by colour as detailed in the corresponding trees on the left.
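The right-hand trees in the figure captions above are built from the concatenated sequence of coding loci found in every genome of a serogroup/type. The sketch below illustrates only that selection-and-concatenation idea on toy data. It is not the authors' pipeline: the function names, locus names and sequences are hypothetical, and real analyses would also check that loci are full-length and align each locus before concatenation, which is omitted here.

```python
from typing import Dict, List

# Hypothetical toy input: genome id -> {locus id -> nucleotide sequence}.
# Real data would come from annotated assemblies; these names are illustrative.
Genome = Dict[str, str]


def shared_loci(genomes: Dict[str, Genome]) -> List[str]:
    """Return the locus ids present in every genome, sorted for a stable order."""
    locus_sets = [set(loci) for loci in genomes.values()]
    if not locus_sets:
        return []
    return sorted(set.intersection(*locus_sets))


def concatenate(genomes: Dict[str, Genome], loci: List[str]) -> Dict[str, str]:
    """Join each genome's sequences for the given loci, in the same locus order."""
    return {
        genome_id: "".join(sequences[locus] for locus in loci)
        for genome_id, sequences in genomes.items()
    }


toy_genomes = {
    "genome_1": {"locus_a": "ATGAAA", "locus_b": "ATGCCC", "locus_c": "ATGTTT"},
    "genome_2": {"locus_a": "ATGAAG", "locus_c": "ATGTTC"},  # lacks locus_b
    "genome_3": {"locus_a": "ATGAAA", "locus_b": "ATGCCA", "locus_c": "ATGTTT"},
}

core = shared_loci(toy_genomes)                # loci found in all three genomes
concatenated = concatenate(toy_genomes, core)  # one sequence per genome
```

Because only loci present in all genomes are kept, the number of usable loci tends to fall as the set of genomes grows, which is consistent with the smaller locus counts reported in the captions for the largest serogroups.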
Demographic and epidemiological information for serogroups/types with high diversity in the cps locus
| Serogroup/type | | Years | Countries (n) | Genomes (n) | Carriage/Disease* | | | PCV status* | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | | | | | Carr | Dis | Unk | Pre | Post | Unk |
| 6 | 6A | 1952–2014 | 5 | 283 | 212 | 68 | 3 | 140 | 110 | 33 |
| | 6Bi | 1939–2014 | 4 | 171 | 136 | 34 | 1 | 79 | 49 | 43 |
| | 6Bii (6E) | 1981–2014 | 16 | 420 | 230 | 164 | 26 | 374 | 37 | 9 |
| | 6C | 2001–2014 | 3 | 106 | 88 | 18 | 0 | 8 | 56 | 42 |
| | 6D | 2006–2011 | 2 | 5 | 5 | 0 | 0 | 4 | 0 | 1 |
| | 6F | 2006–2011 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 |
| | 6Hybrid | 1978–2014 | 5 | 196 | 180 | 13 | 3 | 166 | 21 | 9 |
| 19 | 19A | 1939–2014 | 5 | 210 | 141 | 65 | 4 | 56 | 144 | 10 |
| | 19B | 1971 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 |
| | 19C | 1939 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 |
| | 19F | 1952–2014 | 9 | 745 | 470 | 271 | 4 | 555 | 178 | 12 |
| | 19Hybrid | 1952–2014 | 9 | 222 | 166 | 43 | 13 | 154 | 33 | 35 |
| | 19A/F | 2012–2014 | 1 | 6 | 4 | 2 | 0 | 0 | 6 | 0 |
| 23 | 23A | 1945–2014 | 4 | 84 | 70 | 13 | 1 | 17 | 52 | 15 |
| | 23B | 1941–2014 | 4 | 76 | 64 | 11 | 1 | 2 | 54 | 20 |
| | 23F | 1940–2014 | 22 | 375 | 158 | 171 | 46 | 253 | 82 | 40 |
| | 23Fii | 1983–2010 | 2 | 49 | 48 | 0 | 1 | 48 | 1 | 0 |
| | 23Hybrid | 1997–2011 | 5 | 261 | 256 | 4 | 1 | 248 | 13 | 0 |
| | 23X | 1996–2014 | 5 | 29 | 26 | 2 | 1 | 17 | 10 | 2 |
| 14 | 14i | 1952–2013 | 8 | 58 | 19 | 33 | 6 | 35 | 17 | 6 |
| | 14wciY-fs | 1967–2013 | 9 | 48 | 15 | 30 | 3 | 32 | 11 | 5 |
| | 14ii | 2008–2010 | 1 | 88 | 88 | 0 | 0 | 88 | 0 | 0 |
| | 14Hybrid | 1939–2011 | 4 | 64 | 59 | 2 | 3 | 63 | 1 | 0 |
| 15 | 15A | 1939–2014 | 5 | 62 | 59 | 2 | 1 | 53 | 3 | 6 |
| | 15BC | 1939–2013 | 4 | 20 | 15 | 2 | 3 | 6 | 7 | 7 |
| | 15F | 1963 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 |
| | 15Hybrid | 2001–2014 | 4 | 210 | 183 | 27 | 0 | 41 | 132 | 37 |
| | 15X | 2008–2010 | 1 | 92 | 92 | 0 | 0 | 92 | 0 | 0 |
| 18 | 18A | 1952–2010 | 2 | 4 | 3 | 0 | 1 | 4 | 0 | 0 |
| | 18B | 1941, 2011 | 2 | 2 | 0 | 1 | 1 | 1 | 1 | 0 |
| | 18C | 1939–2013 | 5 | 31 | 20 | 8 | 3 | 21 | 9 | 1 |
| | 18F | 1961 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 |
| | 18Hybrid | 1945–2010 | 4 | 33 | 32 | 0 | 1 | 33 | 0 | 0 |
| 33 | 33A | 1937–2011 | 3 | 4 | 1 | 0 | 3 | 3 | 0 | 1 |
| | 33B | 1962 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 |
| | 33C | 2007–2010 | 1 | 11 | 11 | 0 | 0 | 11 | 0 | 0 |
| | 33D | 1979 | 1 | 2 | 0 | 0 | 2 | 2 | 0 | 0 |
| | 33F | 1999–2014 | 3 | 40 | 23 | 17 | 0 | 11 | 20 | 9 |
| | 33Hybrid | 2008–2014 | 3 | 52 | 52 | 0 | 0 | 43 | 3 | 6 |
| | 33X | 2008–2010 | 1 | 5 | 5 | 0 | 0 | 5 | 0 | 0 |
| 35 | 35A | 1939–2010 | 2 | 11 | 10 | 0 | 1 | 11 | 0 | 0 |
| | 35B | 1939–2014 | 5 | 108 | 97 | 9 | 2 | 42 | 61 | 5 |
| | 35C | 1941–2009 | 2 | 8 | 6 | 0 | 2 | 8 | 0 | 0 |
| | 35F | 1939–2014 | 5 | 73 | 63 | 9 | 1 | 18 | 32 | 23 |
| | 35Hybrid | 2007–2010 | 2 | 28 | 28 | 0 | 0 | 25 | 3 | 0 |
*Carr, carriage; Dis, disease; Unk, unknown; Pre, recovered pre-PCV; Post, recovered post-PCV.
Fig. 4. Phylogenetic trees illustrating cps locus sequence diversity and genetic relationships among pneumococci of serotype 14 and serogroups 15 and 18. (a) Phylogenetic trees generated based on 12 cps locus genes amongst 258 serotype 14 pneumococcal genomes (left) and reconstructed using the concatenated sequence of 911 full-length coding loci found in all 258 serotype 14 genomes (right). (b) Phylogenetic trees generated based on 13 cps locus genes amongst 385 serogroup 15 pneumococcal genomes (left) and reconstructed using the concatenated sequence of 893 full-length coding loci found in all 385 serogroup 15 genomes (right). (c) Phylogenetic trees generated based on 17 cps locus genes amongst 71 serogroup 18 pneumococcal genomes (left) and reconstructed using the concatenated sequence of 1302 full-length coding loci found in all 71 serogroup 18 genomes (right). The outer rings of the trees on the right indicate the serotype by colour as detailed in the corresponding trees on the left.
Fig. 5. Phylogenetic trees and a comparative diagram illustrating cps locus sequence diversity and genetic relationships among pneumococci of serogroups 33 and 35. (a) Phylogenetic trees generated based on five cps locus genes amongst 115 serogroup 33 pneumococcal genomes (left) and reconstructed using the concatenated sequence of 1218 full-length coding loci found in all 115 serogroup 33 genomes (right). (b) Comparison of the putative serogroup 33X cps locus sequence to the serotype 33B, 33F and 10B reference sequences using Easyfig (Sullivan ). The results of pairwise BLAST nucleotide sequence comparisons are shown: darker red highlights greater sequence conservation between the pair of sequences. In order to generate pairwise comparisons between all three reference sequences, the putative serogroup 33X cps locus sequence was included twice. Coding loci in the putative serogroup 33X cps locus sequence were coloured according to the level of sequence similarity to coding loci from one of the reference sequences: light green for serotype 33F; grey for serotype 10B; red for serotype 33B; and disrupted loci are highlighted in pink. (c) Phylogenetic trees generated based on seven cps locus genes amongst 228 serogroup 35 pneumococcal genomes (left) and reconstructed using the concatenated sequence of 993 full-length coding loci found in all 228 serogroup 35 genomes (right). The outer rings of the trees on the right indicate the serotype by colour as detailed in the corresponding trees on the left.