Ruth R Miller, Morgan G I Langille, Vincent Montoya, Anamaria Crisan, Aleksandra Stefanovic, Irene Martin, Linda Hoang, David M Patrick, Marc Romney, Gregory Tyrrell, Steven J M Jones, Fiona S L Brinkman, Patrick Tang.
Abstract
Background. Streptococcus pneumoniae can cause a wide spectrum of disease, including invasive pneumococcal disease (IPD). From 2005 to 2009 an outbreak of IPD occurred in Western Canada, caused by a S. pneumoniae strain with multilocus sequence type (MLST) 289 and serotype 5. We sought to investigate the incidence of IPD due to this S. pneumoniae strain and to characterize the outbreak in British Columbia using whole-genome sequencing. Methods. IPD was defined according to Public Health Agency of Canada guidelines. Two isolates representing the beginning and end of the outbreak were whole-genome sequenced. The sequences were analyzed for single nucleotide variants (SNVs) and putative genomic islands. Results. The peak of the outbreak in British Columbia was in 2006, when 57% of invasive S. pneumoniae isolates were serotype 5. Comparison of two whole-genome sequenced strains showed only 10 SNVs between them. A 15.5 kb genomic island was identified in outbreak strains, allowing the design of a PCR assay to track the spread of the outbreak strain. Discussion. We show that the serotype 5 MLST 289 strain contains a distinguishing genomic island, which remained genetically consistent over time. Whole-genome sequencing holds great promise for real-time characterization of outbreaks in the future and may allow responses tailored to characteristics identified in the genome.
Year: 2016 PMID: 27366170 PMCID: PMC4904568 DOI: 10.1155/2016/5381871
Source DB: PubMed Journal: Can J Infect Dis Med Microbiol ISSN: 1712-9532 Impact factor: 2.471
Glossary of genomics and bioinformatics terms.
| Term | Definition |
|---|---|
| Read | Short fragment of DNA sequence output by a genome sequencer, commonly 50–250 bp in length. Raw reads are reads taken directly from the genome sequencer without any filtering. |
| Paired-end | Pairs of reads that are two ends of the same region of DNA a standard distance apart. |
| Depth | Number of reads mapped to a given position in the reference. |
| Reference-based assembly | Construction of a sequence by mapping/aligning reads to a known reference sequence. |
| De novo assembly | Assembly of reads without a reference. |
| Adapter | Short nucleotide sequences found at the end of reads which are part of the sequencing reaction. |
| Insertion | Short regions of DNA present in our samples but not the reference sequence. |
| Contig | Contiguous regions of DNA generated by joining together raw sequence reads. |
| Genomic island | Regions of the genome with suspected horizontal origins, in that they were likely acquired from other bacteria of the same or similar species. |
| Sequence composition | Proportion of each nucleotide base present. |
| Codon usage bias | Regions with different sequence composition or amino acid composition compared to the rest of the genome. |
| N50 | Length for which all the contigs of that length or longer contain at least half of the total of the lengths of the contigs. |
| Nonsynonymous | Nucleotide substitution that alters the amino acid sequence. |
| Open reading frame (ORF) | The part of a gene that has the potential to code for a protein. |
bp: base pairs.
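
The N50 and sequence composition definitions in the glossary can be made concrete with a short calculation. The following is a minimal sketch, not the pipeline used in the study: the function names `n50` and `base_composition` and the example inputs are assumptions introduced here for illustration.

```python
from collections import Counter


def n50(contig_lengths):
    """Return the N50 of a list of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches at least half of
    the summed length of all contigs.
    """
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


def base_composition(sequence):
    """Return the proportion of each nucleotide base (A, C, G, T)."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    counts = Counter(sequence.upper())
    return {base: counts[base] / len(sequence) for base in "ACGT"}


# Hypothetical contig lengths (bp) and a toy sequence, for illustration only.
example_n50 = n50([2, 3, 4, 5, 6])            # total is 20 bp; 6 + 5 >= 10
example_composition = base_composition("AACG")
```

With the toy contig lengths above, the two longest contigs (6 and 5 bp) together cover at least half of the 20 bp total, so the N50 is 5.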
Figure 1. Quarterly prevalence of serotype 5 pneumococcal IPD in British Columbia. Numbers above bars are % serotype 5 isolates per quarter. ∗ is introduction of PCV13; q: quarterly (i.e., three-month period).
SNVs between BCSP1 and BCSP2.
| Position | 70585 reference base | BCSP1 base | BCSP2 base | Qualifier | Gene product | Synonymous | Amino acid change |
|---|---|---|---|---|---|---|---|
| 384518 | T | A | T | SP70585_0437 | Hypothetical protein | No | K → |
| 425383 | C | C | T | SP70585_0473 | Helicase, RecD/TraA family | No | R → H |
| 457503 | C | C | T | | Not applicable | — | — |
| 568204 | C | C | T | | Not applicable | — | — |
| 923459 | T | T | G | SP70585_1004 | Dihydroorotate dehydrogenase 1B | No | S → A |
| 936681 | A | A | T | SP70585_101 | Competence protein | No | K → N |
| 1389846 | C | A | — | SP70585_1509 | Pyridoxal biosynthesis lyase PdxS | No | E → |
| 1675901 | G | G | A | SP70585_1804 | Preprotein translocase subunit SecY | No | T → I |
| 2044455# | T | G | T | SP70585_2234 | beta-N-Acetylglucosaminidase/beta-glucosidase (3-beta-N-acetyl-D-glucosaminidase/beta-D-glucosidase) (Nag3) | No | E → A |
| 2104679 | T | T | G | SP70585_2289 | alpha-L-Fucosidase | No | S → R |
#SNV is located in the genomic island.
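
To read the SNV table at a glance, it can help to tally which isolate carries the non-reference base at each position. The sketch below transcribes the position and base columns of the table above; the helper name `differing_isolate` and the choice to treat the "—" entry at position 1389846 as undetermined are assumptions made here, not part of the original analysis.

```python
from collections import Counter

VALID_BASES = set("ACGT")

# (position in 70585 reference, reference base, BCSP1 base, BCSP2 base),
# transcribed from the SNV table above.
snvs = [
    (384518, "T", "A", "T"),
    (425383, "C", "C", "T"),
    (457503, "C", "C", "T"),
    (568204, "C", "C", "T"),
    (923459, "T", "T", "G"),
    (936681, "A", "A", "T"),
    (1389846, "C", "A", "—"),  # BCSP2 base not given in the table
    (1675901, "G", "G", "A"),
    (2044455, "T", "G", "T"),
    (2104679, "T", "T", "G"),
]


def differing_isolate(ref, bcsp1, bcsp2):
    """Report which isolate differs from the reference base at one site."""
    if bcsp1 not in VALID_BASES or bcsp2 not in VALID_BASES:
        return "undetermined"  # assumption: a missing call is not classified
    differs = [name for name, base in (("BCSP1", bcsp1), ("BCSP2", bcsp2))
               if base != ref]
    if len(differs) == 2:
        return "both"
    return differs[0] if differs else "neither"


tally = Counter(differing_isolate(ref, b1, b2) for _, ref, b1, b2 in snvs)
```

Under these assumptions the tally is two sites where only BCSP1 differs from the reference, seven where only BCSP2 differs, and one undetermined site, which accounts for all 10 rows of the table.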
Figure 2. Neighbor-joining tree of single nucleotide variants (SNVs) within the genomic island showing the evolutionary relationship between BCSP1 and BCSP2 (sequenced as part of this study), the 70585 reference, and six strains sequenced by Chewapreecha et al. [16] (ERR067846, ERR066355, ERR064008, ERR054237, ERR084189, and ERR052634). The table to the right shows the individual SNVs that separate the isolates, with their position in the 70585 reference at the top.
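
Neighbor joining builds a tree from a matrix of pairwise distances between isolates. A minimal sketch of how such a matrix could be derived from SNV profiles is shown below. The isolate names and base strings are hypothetical toy data, not the SNVs shown in Figure 2, and the distance used here (a plain count of differing sites) is an assumption rather than the study's stated method.

```python
from itertools import combinations


def snv_distance(profile_a, profile_b):
    """Count the sites at which two equal-length SNV profiles differ."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same SNV positions")
    return sum(1 for a, b in zip(profile_a, profile_b) if a != b)


def distance_matrix(profiles):
    """Build a symmetric pairwise distance matrix as nested dictionaries."""
    names = list(profiles)
    matrix = {n: {m: 0 for m in names} for n in names}
    for n, m in combinations(names, 2):
        d = snv_distance(profiles[n], profiles[m])
        matrix[n][m] = matrix[m][n] = d
    return matrix


# Hypothetical profiles: one base per variable site, in the same site order.
toy_profiles = {
    "isolate_A": "ACGTA",
    "isolate_B": "ACGTT",
    "isolate_C": "TCGAT",
}
toy_matrix = distance_matrix(toy_profiles)
```

A matrix like `toy_matrix` is the usual input to a neighbor-joining implementation, which repeatedly joins the pair of taxa that minimizes its selection criterion until the tree is fully resolved.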