Rebecca A Gladstone1, Stephanie W Lo1, Richard Goater2,3, Corin Yeats2,3, Ben Taylor2,3, James Hadfield4, John A Lees5, Nicholas J Croucher5, Andries J van Tonder6,1, Leon J Bentley1, Fu Xiang Quah1, Anne J Blaschke7, Nicole L Pershing7, Carrie L Byington8, Veeraraghavan Balaji9, Waleria Hryniewicz10, Betuel Sigauque11, K L Ravikumar12, Samanta Cristine Grassi Almeida13, Theresa J Ochoa14, Pak Leung Ho15, Mignon du Plessis16, Kedibone M Ndlangisa16, Jennifer E Cornick17, Brenda Kwambana-Adams18,19, Rachel Benisty20, Susan A Nzenze21,22, Shabir A Madhi21,22, Paulina A Hawkins23, Andrew J Pollard24, Dean B Everett25, Martin Antonio18, Ron Dagan20, Keith P Klugman23, Anne von Gottberg15, Benjamin J Metcalf26, Yuan Li26, Bernard W Beall26, Lesley McGee26, Robert F Breiman27,23, David M Aanensen2,3, Stephen D Bentley1.
Abstract
Knowledge of pneumococcal lineages, their geographic distribution and antibiotic resistance patterns, can give insights into global pneumococcal disease. We provide interactive bioinformatic outputs to explore such topics, aiming to increase dissemination of genomic insights to the wider community, without the need for specialist training. We prepared 12 country-specific phylogenetic snapshots, and international phylogenetic snapshots of 73 common Global Pneumococcal Sequence Clusters (GPSCs) previously defined using PopPUNK, and present them in Microreact. Gene presence and absence defined using Roary, and recombination profiles derived from Gubbins are presented in Phandango for each GPSC. Temporal phylogenetic signal was assessed for each GPSC using BactDating. We provide examples of how such resources can be used. In our example use of a country-specific phylogenetic snapshot we determined that serotype 14 was observed in nine unrelated genetic backgrounds in South Africa. The international phylogenetic snapshot of GPSC9, in which most serotype 14 isolates from South Africa were observed, highlights that there were three independent sub-clusters represented by South African serotype 14 isolates. We estimated from the GPSC9-dated tree that the sub-clusters were each established in South Africa during the 1980s. We show how recombination plots allowed the identification of a 20 kb recombination spanning the capsular polysaccharide locus within GPSC97. This was consistent with a switch from serotype 6A to 19A estimated to have occurred in the 1990s from the GPSC97-dated tree. Plots of gene presence/absence of resistance genes (tet, erm, cat) across the GPSC23 phylogeny were consistent with acquisition of a composite transposon. We estimated from the GPSC23-dated tree that the acquisition occurred between 1953 and 1975. Finally, we demonstrate the assignment of GPSC31 to 17 externally generated pneumococcal serotype 1 assemblies from Utah via Pathogenwatch. Most of the Utah isolates clustered within GPSC31 in a USA-specific clade with the most recent common ancestor estimated between 1958 and 1981. The resources we have provided can be used to explore data, test hypotheses and generate new hypotheses. The accessible assignment of GPSCs allows others to contextualize their own collections beyond the data presented here.
Keywords: Streptococcus pneumoniae; antibiotic resistance; pangenome; phylogenetic dating; pneumococcal; population structure; recombination; whole genome sequencing
Year: 2020 PMID: 32375991 PMCID: PMC7371119 DOI: 10.1099/mgen.0.000357
Source DB: PubMed Journal: Microb Genom ISSN: 2057-5858
Population snapshots
| Country | No. of isolates | Percentage IPD | Sampling years |
|---|---|---|---|
| | 4615 | 63 % | 1991, 2005–2014 |
| | 1647 | 24 % | 1993, 1996–2014 |
| | 1584 | 100 % | 1998–2009 |
| | 1304 | 43 % | 1997–2015 |
| | 1143 | 100 % | 2005–2014 |
| | 607 | 31 % | 2006–2011 |
| | 504 | 42 % | 1995–2001, 2009–2017 |
| | 420 | 97 % | 2008–2009, 2012–2013 |
| | 416 | 16 % | 2005–2009, 2011–2014 |
| | 189 | 100 % | 2007–2013 |
| | 167 | 100 % | 2008–2010 |
| | 114 | 97 % | 2007–2010, 2013–2016 |
IPD, Invasive Pneumococcal Disease.
Fig. 1. Point estimates of the year of the most recent common ancestor (MRCA) of each of the 51 GPSCs that could be reliably estimated. The 95 % confidence intervals (CI) are plotted.
Comparison of key feasible BEAST runs and BactDating models of phylogenetic dating
| Lineage sub-clade | Year range | | BEAST tMRCA | BactDating tMRCA |
|---|---|---|---|---|
| GPSC3 | 18 | 358 | Run infeasible | 1720.54 [1634.01–1780.92] |
| GPSC3-CC53 | 12 | 152 | 1879 [1820.49–1921.87] | 1926.77 [1907.99–1941.56] |
| GPSC3-CC62 | 17 | 56 | 1912.82 [1667.10–1967.00] | 1929.64 [1908.24–1944.03] |
| GPSC3-CC100 | 17 | 67 | 1937.13 [1920.19–1950.82] | 1919.82 [1899.11–1936.45] |
| GPSC3-CC1012 | 17 | 83 | 1878 [1808.91–1925.02] | 1840.70 [1802.97–1870.36] |
CC, Clonal Complex; GPSC, Global Pneumococcal Sequence Cluster; tMRCA, Time to Most Recent Common Ancestor.
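One way to read the comparison table is to ask whether the two bracketed intervals for each sub-clade overlap. The following is a minimal sketch, not the authors' analysis: it assumes the column containing "Run infeasible" holds the BEAST estimates, copies the interval bounds from the table, and uses a hypothetical helper name (`intervals_overlap`).

```python
# Interval bounds copied from the table above.
# Assumption: the first tuple in each pair is the BEAST interval,
# the second is the BactDating interval.
estimates = {
    "GPSC3-CC53": ((1820.49, 1921.87), (1907.99, 1941.56)),
    "GPSC3-CC62": ((1667.10, 1967.00), (1908.24, 1944.03)),
    "GPSC3-CC100": ((1920.19, 1950.82), (1899.11, 1936.45)),
    "GPSC3-CC1012": ((1808.91, 1925.02), (1802.97, 1870.36)),
}


def intervals_overlap(a, b):
    """Return True if closed intervals a=(lo, hi) and b=(lo, hi) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


overlap = {}
shared = {}
for clade, (beast, bactdating) in estimates.items():
    overlap[clade] = intervals_overlap(beast, bactdating)
    if overlap[clade]:
        # The shared range runs from the later lower bound to the earlier upper bound.
        shared[clade] = (max(beast[0], bactdating[0]), min(beast[1], bactdating[1]))

for clade in estimates:
    print(clade, overlap[clade], shared.get(clade))
```

GPSC3 itself is left out because no BEAST interval is reported for it. Overlap of intervals is only a coarse consistency check; it says nothing about whether the point estimates agree.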
Fig. 2. Contextualizing serotype 14 genotypes in South Africa. (a) Phylogeny of South African pneumococcal population structure, with taxa expressing serotype 14 highlighted by a coloured circle representing their GPSC assignment; GPSC9 (blue) is highlighted with a box. (b) Expanded South African GPSC9 subtree where taxa are coloured by serotype: 14 (blue) and 15A (yellow). The left-hand metadata block is coloured by clonal complex: CC63 (pink), CC12576 (green). The right-hand metadata block is coloured by sequence type: ST63 (green), ST2414 (orange). Three sub-clades (A, B, C) that each contain at least one ST63 isolate are highlighted with a box. (c) The international GPSC9 collection, with taxa coloured by country of isolation according to a map-based key; the three South African (purple) sub-clades (A, B, C) are highlighted with labelled boxes.
Fig. 3. GPSC97 capsular polysaccharide locus recombination events. Phandango plot of recombination detected with Gubbins, focused on the cps locus, across the GPSC97 phylogeny. Isolates are annotated with their serotype in a metablock: 6A (pink), 19A (blue). Recombination blocks span the taxa in which they are detected and the region of genes affected in the reference. Red blocks affect more than one isolate (n>1); blue blocks affect a single isolate (n=1). Overlapping blocks increase the density of the colour. A sliding window of the number of recombination events affecting any one position in the reference is plotted underneath. The major recombination block spanning most of the cps locus and common to all 19A isolates is consistent with the serotype switch, and is outlined in blue. The nodes for the most recent common ancestors of the 6A and 19A isolates together, and of the 19A isolates alone, are each marked with a blue circle; estimated dates and confidence intervals are given.
Fig. 4. GPSC23 acquired resistance gene presence and absence. Phandango plot of Roary gene presence and absence, focused on the genes with prevalence and phylogenetic patterns similar to acquired resistance genes: tetM 61 % (110/180), ermB 43 % (77/180) and cat 40 % (72/180), across the GPSC23 phylogeny. Isolates are annotated with their serotype in a metablock: 6B (green), 6A (pink) and 19A (blue). Genes are shown as light blue bricks along the top and are sorted left to right by the proportion of isolates in which they are observed. Presence (blue) and absence (white) of genes is plotted with respect to each isolate's phylogenetic placement. A graph of the proportion of isolates in which each gene is observed is plotted underneath.
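The per-gene prevalence figures quoted in the caption can be reproduced from a binary presence/absence matrix. The sketch below assumes a tab-separated layout like Roary's `gene_presence_absence.Rtab` (genes as rows, isolates as columns, 1 for present and 0 for absent). The toy matrix and the function name `gene_prevalence` are hypothetical; only the counts in `reported` come from the caption.

```python
import csv
import io


def gene_prevalence(rtab_text):
    """Return {gene: (n_present, n_isolates, rounded_percent)} from Rtab-style text."""
    reader = csv.reader(io.StringIO(rtab_text), delimiter="\t")
    header = next(reader)
    n_isolates = len(header) - 1  # first column holds the gene name
    result = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        gene, calls = row[0], row[1:]
        present = sum(int(c) for c in calls)
        result[gene] = (present, n_isolates, round(100 * present / n_isolates))
    return result


# Hypothetical four-isolate matrix, for illustration only.
TOY_RTAB = (
    "Gene\tiso1\tiso2\tiso3\tiso4\n"
    "tetM\t1\t1\t1\t0\n"
    "ermB\t1\t1\t0\t0\n"
    "cat\t1\t0\t0\t0\n"
)
toy = gene_prevalence(TOY_RTAB)

# Counts as reported in the caption (present, total isolates in GPSC23).
reported = {"tetM": (110, 180), "ermB": (77, 180), "cat": (72, 180)}
reported_pct = {gene: round(100 * k / n) for gene, (k, n) in reported.items()}
print(reported_pct)
```

Sorting genes by the resulting proportion gives the left-to-right ordering described for the plot.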
Pathogenwatch output
| Summary | |
|---|---|
| No. assemblies processed | 17 |
| No. analyses performed | 102 |
| Time taken | 3 min |
| No. contigs | 38–58 |
| GC content | 39.5–39.6 % |
| Assembly length | 2.04–2.14 Mb |
| Species | |
| Serotype | 1 (100 %, 17/17) |
| ST | ST227 (59 %, 10/17), ST306 (24 %, 4/17), ST304 (12 %, 2/17), ST4288 (6 %, 1/17) |
| Strain | GPSC31 (100 %, 17/17) |
| AMR determinants | None identified |
AMR, Antimicrobial resistance; Mb, Megabases.
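The ST breakdown in the table is a simple tally over the 17 assemblies. As a minimal sketch, the per-assembly list below is rebuilt from the reported counts (the real per-assembly output is not shown here), and the variable names are illustrative.

```python
from collections import Counter

# Per-assembly ST calls reconstructed from the counts reported in the table.
st_calls = ["ST227"] * 10 + ["ST306"] * 4 + ["ST304"] * 2 + ["ST4288"] * 1

counts = Counter(st_calls)
total = sum(counts.values())

# Percentages rounded to whole numbers, as in the table.
summary = {st: (n, round(100 * n / total)) for st, n in counts.most_common()}

for st, (n, pct) in summary.items():
    print(f"{st} ({pct} %, {n}/{total})")
```

Because each percentage is rounded independently, the four values sum to 101 rather than 100.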
Fig. 5. Giving context to serotype 1 isolates from Utah. (a) Different geographical distribution of serotype 1 genomes belonging to GPSC2 and GPSC31 on Pathogenwatch. (b) Serotype 1 isolates in the GPS project (n=893/13,454) fall exclusively into two lineages (green): GPSC2 (n=782) and GPSC31 (n=111), which are in different parts of the species-wide tree and do not share a recent common ancestor. (c) Utah isolates are highlighted on a and coloured by their ST (see key); the metablock shows the country of isolation across the GPSC31 tree structure, and USA state on the expanded subtree. Triangle A denotes the most recent common ancestor, dated 1973 [1958–1981], of the USA sub-clade in which the majority (10/17) of the Utah serotype 1 isolates were found.