Complete Genome Sequences of Nine Streptococcus pneumoniae Serotype 3 Clonal Complex 180 Strains.

Smitha Shambhu, Eleonora Cella, Mohammad Jubair, Taj Azarian.

Abstract

We announce the complete genomes of nine Streptococcus pneumoniae strains belonging to serotype 3 clonal complex 180 (CC180). Each genome consists of a single circularized contig, with an average length of 2.033 Mbp. Pangenome analysis identified 1,762 core genes and 412 accessory genes. These genomes provide a basis for future population genomic studies.

Year:  2022        PMID: 35638828      PMCID: PMC9302067          DOI: 10.1128/mra.00275-22

Source DB:  PubMed          Journal:  Microbiol Resour Announc        ISSN: 2576-098X


ANNOUNCEMENT

Streptococcus pneumoniae is a commensal bacterium of the human nasopharynx that can cause pneumonia, otitis media, meningitis, and bacteremia. Of the 100 identified serotypes, serotype 3 is highly invasive and is associated with a high risk of death (1). Nine serotype 3 strains were obtained from a culture collection of samples collected during a carriage study of Massachusetts children that was conducted between 2000 and 2014 (2, 3). Previous population genomic analysis of draft assemblies showed that the strains belong to two divergent clades of clonal complex 180 (CC180), termed clade Iα and clade II (3, 4). Clade II is of particular interest because its prevalence increased after the introduction of the 13-valent pneumococcal conjugate vaccine (PCV13).

Strains were grown overnight at 37°C in 5% CO2 in Bacto Todd-Hewitt broth (BD, Heidelberg, Germany) containing 0.5% yeast extract (BD). Genomic DNA (gDNA) was extracted and purified using the Qiagen DNeasy blood and tissue kit according to the manufacturer’s instructions. Enzyme lysis buffer for Gram-positive bacteria was prepared according to the instructions, with the addition of 100 mg/mL lysozyme. An overnight 5-mL culture was centrifuged at 5,000 × g for 10 min, and 360 μL of the lysis buffer and lysozyme mixture was added to each cell pellet, which was then incubated at 37°C for 1 h. The quality and concentration of gDNA were assessed using the Agilent 4200 TapeStation system and a Qubit 4 fluorometer.

Using an Oxford Nanopore Technologies (ONT) MinION system, a ligation sequencing kit, and an R9.4.1 flow cell, we produced an average of 330 Mbp of sequencing data for each strain. We performed base calling using Guppy v0.5.1 in fast mode and adapter trimming with Porechop v0.2 (5). Reads were filtered with Filtlong v0.2.0 (https://github.com/rrwick/Filtlong) using the settings --min_length 1000 and --target_bases 84,000,000.
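The read-filtering step can be sketched as a small command builder. This is a minimal sketch under stated assumptions, not the authors' script: the helper name `filtlong_command` and both file names are hypothetical, and only the two settings named in the text are used. Filtlong writes the retained reads to standard output, so the result is piped to a compressor here.

```python
import shlex
from typing import List


def filtlong_command(reads: str, min_length: int = 1000,
                     target_bases: int = 84_000_000) -> List[str]:
    """Build a Filtlong argument list from the two settings named in the text."""
    return [
        "filtlong",
        "--min_length", str(min_length),      # discard reads shorter than this
        "--target_bases", str(target_bases),  # keep the best reads up to this total
        reads,                                # hypothetical adapter-trimmed ONT reads
    ]


cmd = filtlong_command("strain.trimmed.fastq.gz")
# Print the shell pipeline instead of running it, so the sketch works
# on a machine without Filtlong installed.
print(shlex.join(cmd) + " | gzip > strain.filtered.fastq.gz")
```

Building the argument list separately from executing it makes it straightforward to log the exact settings used for each of the nine strains.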
The final long-read data set for the nine strains had mean read lengths that ranged from 4,495 to 11,490 bp (minimum N50, 6,120 bp) and mean read quality scores of at least 10.0 (Table 1). Data quality was assessed using NanoPlot v1.0.0 (6).

Hybrid assemblies were generated from the ONT data and previously published Illumina short-read data (BioProject accession number PRJNA437292; detailed accession numbers are in Table 1) using Unicycler v0.4.8, which resolved a single circularized unitig (7). Three samples required an alternative approach using Trycycler v0.5.1 to obtain circularized assemblies (8). With both approaches, assemblies were error corrected (i.e., polished) using Illumina short reads and reordered to begin at the start position of dnaA. Default parameters were used for all software unless otherwise specified.

The final error-corrected assemblies have an average total length of 2,033,799 bp and a GC content of 39.7%. The genomes were annotated using the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) (9). On average, the genomes contain 2,091 total annotated genes, 2,018 protein-coding sequences, 58 tRNAs, four 5S rRNAs, four 16S rRNAs, and four 23S rRNAs. Pangenome analysis was performed using Roary v3.13.0, which identified 1,762 core genes shared by all nine genomes and 412 accessory genes (10).
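The core/accessory split reported above can be illustrated with a toy presence/absence table. This is a simplified sketch, not Roary's implementation: a gene cluster is treated as core only if it is present in every genome, in line with the description "shared by all nine genomes", and as accessory otherwise. The function name, the three genome labels, and the gene clusters are invented for illustration.

```python
from typing import Dict, List, Set, Tuple


def split_core_accessory(
    presence: Dict[str, Set[str]], genomes: List[str]
) -> Tuple[List[str], List[str]]:
    """Split gene clusters into core (in all genomes) and accessory (in some)."""
    all_genomes = set(genomes)
    core: List[str] = []
    accessory: List[str] = []
    for gene in sorted(presence):
        carriers = presence[gene] & all_genomes
        if carriers == all_genomes:
            core.append(gene)       # present in every genome
        elif carriers:
            accessory.append(gene)  # present in at least one, but not all
    return core, accessory


# Hypothetical three-genome example (the study itself used nine genomes).
genomes = ["A", "B", "C"]
presence = {
    "dnaA": {"A", "B", "C"},
    "gyrB": {"A", "B", "C"},
    "geneX": {"A", "C"},
    "geneY": {"B"},
}
core, accessory = split_core_accessory(presence, genomes)
print(core)       # ['dnaA', 'gyrB']
print(accessory)  # ['geneX', 'geneY']
```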
TABLE 1

Data on the nine Streptococcus pneumoniae serotype 3 CC180 strains

| Strain | BioSample accession no. | SRA accession no. (Illumina reads) | SRA accession no. (ONT reads) | ONT total no. of bases | ONT N50 (bp) | ONT mean read quality score | ONT mean read length (bp) | Hybrid assembly GenBank accession no. | Genome size (bp) | GC content (%) | No. of coding sequences |
|---|---|---|---|---|---|---|---|---|---|---|---|
| PT8465 | SAMN08647902 | SRX3774795 | SRR17486872 | 110,005,738 | 8,137 | 10.7 | 8,324.3 | CP090888 | 2,003,561 | 39.8 | 2,087 |
| LE4448 | SAMN08647548 | SRX3775148 | SRR17486877 | 110,000,850 | 6,994 | 10.1 | 6,455.4 | CP090883 | 2,003,818 | 39.8 | 2,068 |
| CH2439 | SAMN08647378 | SRX3775100 | SRR17486871 | 110,000,620 | 6,120 | 10.0 | 4,495.9 | CP090889 | 2,003,571 | 39.8 | 2,067 |
| CH2241 | SAMN08647361 | SRX3775363 | SRR17486878 | 110,005,162 | 8,353 | 10.4 | 8,292.3 | CP090882 | 2,003,723 | 39.8 | 2,114 |
| NP7536 | SAMN08647838 | SRX3774973 | SRR17486873 | 110,003,273 | 10,791 | 10.8 | 10,796.3 | CP090887 | 2,046,177 | 39.8 | 2,115 |
| ND6401 | SAMN08647706 | SRX3775069 | SRR17486875 | 110,007,739 | 11,320 | 10.6 | 11,490.3 | CP090885 | 2,057,101 | 39.7 | 2,120 |
| MD5403 | SAMN08647626 | SRX3775317 | SRR17486876 | 110,000,583 | 8,689 | 10.6 | 8,589.1 | CP090884 | 2,061,648 | 39.7 | 2,068 |
| NP7513 | SAMN08647831 | SRX3774783 | SRR17486874 | 110,002,691 | 11,243 | 10.9 | 11,248.9 | CP090886 | 2,062,504 | 39.7 | 2,111 |
| BR1268 | SAMN08647280 | SRX3774733 | SRR17486879 | 110,003,733 | 9,599 | 10.5 | 9,777.2 | CP090881 | 2,062,088 | 39.7 | 2,067 |
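As a quick consistency check, the average assembly length quoted in the text can be recomputed from the genome sizes in Table 1. This is a small sketch with the values transcribed by hand from the table; the variable names are illustrative only.

```python
# Genome sizes (bp) transcribed from Table 1, keyed by strain.
genome_sizes_bp = {
    "PT8465": 2_003_561,
    "LE4448": 2_003_818,
    "CH2439": 2_003_571,
    "CH2241": 2_003_723,
    "NP7536": 2_046_177,
    "ND6401": 2_057_101,
    "MD5403": 2_061_648,
    "NP7513": 2_062_504,
    "BR1268": 2_062_088,
}

total_bp = sum(genome_sizes_bp.values())
mean_bp = total_bp / len(genome_sizes_bp)
smallest = min(genome_sizes_bp, key=genome_sizes_bp.get)
largest = max(genome_sizes_bp, key=genome_sizes_bp.get)

print(f"mean genome size: {mean_bp:,.0f} bp")  # mean genome size: 2,033,799 bp
print(f"smallest: {smallest} ({genome_sizes_bp[smallest]:,} bp)")
print(f"largest: {largest} ({genome_sizes_bp[largest]:,} bp)")
```

The recomputed mean matches the 2,033,799 bp stated for the final error-corrected assemblies.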

Data availability.

Whole-genome shotgun projects have been deposited in GenBank under the accession numbers CP090888, CP090883, CP090889, CP090882, CP090887, CP090885, CP090884, CP090886, and CP090881. The versions described in this paper are versions CP090881.1 to CP090889.1. The raw sequence reads are available under BioProject accession number PRJNA437292, with the BioSample accession numbers SAMN08647902, SAMN08647548, SAMN08647378, SAMN08647361, SAMN08647838, SAMN08647706, SAMN08647626, SAMN08647831, and SAMN08647280. An extended version of Table 1 with additional metadata is available at https://doi.org/10.6084/m9.figshare.19654020.v1.
REFERENCES

1.  Serotype-specific mortality from invasive Streptococcus pneumoniae disease revisited.

Authors:  Pernille Martens; Signe Westring Worm; Bettina Lundgren; Helle Bossen Konradsen; Thomas Benfield
Journal:  BMC Infect Dis       Date:  2004-06-30

2.  Population genomics of post-vaccine changes in pneumococcal epidemiology.

Authors:  Nicholas J Croucher; Jonathan A Finkelstein; Stephen I Pelton; Patrick K Mitchell; Grace M Lee; Julian Parkhill; Stephen D Bentley; William P Hanage; Marc Lipsitch
Journal:  Nat Genet       Date:  2013-05-05

3.  Population genomics of pneumococcal carriage in Massachusetts children following introduction of PCV-13.

Authors:  Patrick K Mitchell; Taj Azarian; Nicholas J Croucher; Alanna Callendrello; Claudette M Thompson; Stephen I Pelton; Marc Lipsitch; William P Hanage
Journal:  Microb Genom       Date:  2019-02-19

4.  Global emergence and population dynamics of divergent serotype 3 CC180 pneumococci.

Authors:  Taj Azarian; Patrick K Mitchell; Maria Georgieva; Claudette M Thompson; Amel Ghouila; Andrew J Pollard; Anne von Gottberg; Mignon du Plessis; Martin Antonio; Brenda A Kwambana-Adams; Stuart C Clarke; Dean Everett; Jennifer Cornick; Ewa Sadowy; Waleria Hryniewicz; Anna Skoczynska; Jennifer C Moïsi; Lesley McGee; Bernard Beall; Benjamin J Metcalf; Robert F Breiman; P L Ho; Raymond Reid; Katherine L O'Brien; Rebecca A Gladstone; Stephen D Bentley; William P Hanage
Journal:  PLoS Pathog       Date:  2018-11-26

5.  Completing bacterial genome assemblies with multiplex MinION sequencing.

Authors:  Ryan R Wick; Louise M Judd; Claire L Gorrie; Kathryn E Holt
Journal:  Microb Genom       Date:  2017-09-14

6.  NanoPack: visualizing and processing long-read sequencing data.

Authors:  Wouter De Coster; Svenn D'Hert; Darrin T Schultz; Marc Cruts; Christine Van Broeckhoven
Journal:  Bioinformatics       Date:  2018-08-01

7.  Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads.

Authors:  Ryan R Wick; Louise M Judd; Claire L Gorrie; Kathryn E Holt
Journal:  PLoS Comput Biol       Date:  2017-06-08

8.  Trycycler: consensus long-read assemblies for bacterial genomes.

Authors:  Ryan R Wick; Louise M Judd; Louise T Cerdeira; Jane Hawkey; Guillaume Méric; Ben Vezina; Kelly L Wyres; Kathryn E Holt
Journal:  Genome Biol       Date:  2021-09-14

9.  NCBI prokaryotic genome annotation pipeline.

Authors:  Tatiana Tatusova; Michael DiCuccio; Azat Badretdin; Vyacheslav Chetvernin; Eric P Nawrocki; Leonid Zaslavsky; Alexandre Lomsadze; Kim D Pruitt; Mark Borodovsky; James Ostell
Journal:  Nucleic Acids Res       Date:  2016-06-24

10.  Roary: rapid large-scale prokaryote pan genome analysis.

Authors:  Andrew J Page; Carla A Cummins; Martin Hunt; Vanessa K Wong; Sandra Reuter; Matthew T G Holden; Maria Fookes; Daniel Falush; Jacqueline A Keane; Julian Parkhill
Journal:  Bioinformatics       Date:  2015-07-20
