Areej S Alsheikh-Hussain1,2, Nouri L Ben Zakour1,2,3, Brian M Forde1,2, Oleksandra Silayeva4, Andrew C Barnes4, Scott A Beatson1,2.
Abstract
Fish mortality caused by Streptococcus iniae is a major economic problem in aquaculture in warm and temperate regions globally. There is also risk of zoonotic infection by S. iniae through handling of contaminated fish. In this study, we present the complete genome sequence of S. iniae strain QMA0248, isolated from farmed barramundi in South Australia. The 2.12 Mb genome of S. iniae QMA0248 carries a 32 kb prophage, a 12 kb genomic island and 92 discrete insertion sequence (IS) elements. These include nine novel IS types that belong mostly to the IS3 family. Comparative and phylogenetic analysis between S. iniae QMA0248 and publicly available complete S. iniae genomes revealed discrepancies that are probably due to misassembly in the genomes of isolates ISET0901 and ISNO. Long-range PCR confirmed five rRNA loci in the PacBio assembly of QMA0248, and, unlike S. iniae 89353, no tandemly repeated rRNA loci in the consensus genome. However, we found sequence read evidence that the tandem rRNA repeat existed within a subpopulation of the original QMA0248 culture. Subsequent nanopore sequencing revealed that the tandem rRNA repeat was the most prevalent genotype, suggesting that there is selective pressure to maintain fewer rRNA copies under uncertain laboratory conditions. Our study not only highlights assembly problems in existing genomes, but provides a high-quality reference genome for S. iniae QMA0248, including manually curated mobile genetic elements, that will assist future S. iniae comparative genomic and evolutionary studies.
Keywords: SMRT sequencing; insertion sequence; misassembly; mobile genetic elements; reference-guided assembly
Year: 2022 PMID: 35229712 PMCID: PMC9176272 DOI: 10.1099/mgen.0.000777
Source DB: PubMed Journal: Microb Genom ISSN: 2057-5858
Fig. 1. Circular map of the QMA0248 genome. Genomic features from outer ring to inner ring are described in the key to the left, where the innermost two rings correspond to the GC skew (inner) and GC plot (outer). CDS: coding sequence. GI: genomic island. The circular map was generated using DNAPlotter [70].
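The GC plot and GC skew rings in Fig. 1 are usually computed over windows along the genome. As a minimal sketch, assuming the conventional definitions (GC content = (G+C)/window length, GC skew = (G−C)/(G+C)), the per-window values could be computed as below. The function name `gc_windows` and the window and step sizes are illustrative assumptions; the source does not state the DNAPlotter settings that were used.

```python
def gc_windows(seq, window=10000, step=10000):
    """Return (start, gc_content, gc_skew) tuples for each window.

    Hypothetical helper for illustration only; window and step sizes
    are assumptions, not the settings used for Fig. 1.
    """
    seq = seq.upper()
    results = []
    # Slide over the sequence; a sequence shorter than one window
    # still yields a single (partial) window starting at 0.
    last_start = max(len(seq) - window + 1, 1)
    for start in range(0, last_start, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        # GC content: fraction of G and C bases in the window.
        gc_content = (g + c) / len(chunk) if chunk else 0.0
        # GC skew: (G - C) / (G + C); defined as 0 when no G or C present.
        gc_skew = (g - c) / (g + c) if (g + c) else 0.0
        results.append((start, gc_content, gc_skew))
    return results
```

In bacterial chromosomes the sign of the GC skew commonly changes near the replication origin and terminus, which is why the skew ring is a useful visual check on a circular assembly.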
General features of five complete genomes
| Feature | QMA0248 | SF1 | YSFST01-82 | ISET0901 | ISNO |
|---|---|---|---|---|---|
| Accession number | CP022392 | CP005941 | CP010783 | CP007586 | CP007587 |
| Genome size (bp) | 2 116 570 | 2 149 844 | 2 086 959 | 2 070 822 | 2 070 182 |
| GC content (%) | 36.8 | 36.7 | 36.8 | 36.8 | 36.8 |
| Total CDS number | 1946 | 2125 | 1897 | 1872 | 1865 |
| Total gene number | 2196 | 2196 | 2029 | 1997 | 1996 |
| rRNAs (5S, 16S, 23S) | 15 | 12 | 15 | 12 | 12 |
| tRNAs | 58 | 45 | 58 | 45 | 45 |
| Reference | This study | [ | [ | [ | [ |
| Assembly type | PacBio RSII P2C4 | 454 FLX+/Illumina MiSeq/Sanger | 454 FLX Titanium/Opgen/Sanger | Illumina 1500 HiSeq/Reference guided assembly | Illumina 1500 HiSeq/Reference guided assembly |
Large mobile genetic elements (MGEs) and regions of difference (ROD) identified in the five genomes analysed (QMA0248, SF1, YSFST01-82, ISET0901 and ISNO)
| Characteristic | Genomic island (GI-leu) | ROD1 | Prophage 2 | ROD2 | Prophage 1 (Phi1) |
|---|---|---|---|---|---|
| Coordinates* | 87374–100014 | 177819–206359 (YSFST01-82) | 848479–890501 (SF1) | 1767227–1787619 | 1991661–2023508 |
| Length (kb) | 12.6 | 28.5 | 42.0 | 20.4 | 31.8 |
| GC content (%) | 35.9 | 37.2 | 35.3 | 34.3 | 37.6 |
| Features | MGE; 13 tRNA and 1 rRNA operon (5S, 16S and 23S) upstream; integrase; IS | Integrase; IS | MGE; integrase | IS | MGE; integrase; 1 tRNA-Cys |
| No. of CDSs | 11 | 33 | 63 | 18 | 53 |
| Major CDSs | Cro/CI family transcriptional regulator; ECF subfamily RNA polymerase sigma factor; plasmid replication protein; membrane protein | ESAT-6-like protein; two-component sensor histidine kinase; galactose mutarotase; 3 PTS galactitol transporter subunits (IIA, IIC and IIB) | Phage DNA replication protein; prophage antirepressor, phage capsid and scaffold protein; putative tail protein; holin; endolysin; antigen C; several phage hypothetical proteins | ESAT-6-like protein; | DNA helicase; Cro/CI family transcriptional regulator; tail and capsid proteins; holin; lysin; DNA N-4 cytosine methyltransferase; site-specific recombinase; several phage hypothetical proteins |
| Best hit (% identity, % coverage) | | | | | Bacteriophage PH10 of |
*Coordinates are in the QMA0248 GenBank annotation (GCA_002220115.1) unless otherwise indicated.
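The reported lengths can be cross-checked against the coordinates. The sketch below assumes the coordinates are 1-based and inclusive (the usual GenBank convention, not stated explicitly in the source), so that length = end − start + 1. The names `REGIONS`, `REPORTED_KB` and `region_length_kb` are illustrative.

```python
# Coordinates and reported lengths copied from the table above.
REGIONS = {
    "GI-leu": (87374, 100014),
    "ROD1": (177819, 206359),          # coordinates in YSFST01-82
    "Prophage 2": (848479, 890501),    # coordinates in SF1
    "ROD2": (1767227, 1787619),
    "Prophage 1 (Phi1)": (1991661, 2023508),
}
REPORTED_KB = {
    "GI-leu": 12.6,
    "ROD1": 28.5,
    "Prophage 2": 42.0,
    "ROD2": 20.4,
    "Prophage 1 (Phi1)": 31.8,
}


def region_length_kb(start, end):
    """Length in kb, assuming 1-based inclusive coordinates."""
    return round((end - start + 1) / 1000, 1)


# Compare each computed length with the value reported in the table.
for name, (start, end) in REGIONS.items():
    computed = region_length_kb(start, end)
    status = "ok" if computed == REPORTED_KB[name] else "MISMATCH"
    print(f"{name}: {computed} kb (reported {REPORTED_KB[name]} kb) {status}")
```

Under that assumption, all five computed lengths agree with the table to one decimal place.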
Fig. 2. Whole-genome alignment of the five genomes QMA0248, SF1, YSFST01-82, ISET0901 and ISNO. The genomes are ordered according to their position in the core SNP-based phylogenetic tree. The maximum-likelihood (ML) phylogeny was rooted to QMA0140 (not shown) and built using 1111 SNPs. Bar, the number of substitutions represented by branch lengths. The BLASTn comparison was produced with EasyFig [33] using 2000 bp as the minimum length, 50 % as the minimum identity value and 1×10⁻¹⁷ as the maximum e-value.
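The three thresholds in the caption act as a filter on BLASTn hits. The following is a minimal sketch of such a filter, assuming hits in the standard BLAST tabular layout (outfmt 6: percent identity in column 3, alignment length in column 4, e-value in column 11). It is not EasyFig's own implementation, and the function name `filter_hits` is hypothetical.

```python
def filter_hits(lines, min_length, min_identity, max_evalue):
    """Keep BLAST tabular (outfmt 6) hits that pass all three thresholds.

    Illustrative helper only; not taken from EasyFig.
    """
    kept = []
    for line in lines:
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        pident = float(fields[2])    # percent identity
        length = int(fields[3])      # alignment length (bp)
        evalue = float(fields[10])   # expectation value
        if (length >= min_length
                and pident >= min_identity
                and evalue <= max_evalue):
            kept.append(fields)
    return kept


# Thresholds as stated in the Fig. 2 caption.
FIG2_THRESHOLDS = dict(min_length=2000, min_identity=50.0, max_evalue=1e-17)
```

A call such as `filter_hits(open("hits.tsv"), **FIG2_THRESHOLDS)` would then retain only the long, significant matches that are drawn between genomes; the file name here is a placeholder.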
Summary of all insertion sequences (IS) identified in QMA0248; partial IS are suffixed by -p
| IS family in QMA0248 | No. of IS copies | IS types (copy no., mean % amino acid identity) |
|---|---|---|
| IS | 32 | IS IS IS *IS |
| IS | 22 | IS *IS |
| IS | 17 | IS |
| IS | 13 | *IS |
| IS | 5 | *IS |
| IS | 1 | Unclassified most similar to IS |
| IS | 2 | Unclassified most similar to IS |
*Novel IS element.
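As a quick consistency check, the per-family copy numbers in the table can be summed and compared with the 92 discrete IS elements reported in the abstract. The family names are not recoverable from this record, so the sketch below keys the counts by table row order; the variable names are illustrative.

```python
# Copy numbers per IS family, in the row order of the table above.
IS_COPIES_BY_ROW = [32, 22, 17, 13, 5, 1, 2]

# Total number of IS elements stated in the abstract.
REPORTED_TOTAL = 92

total = sum(IS_COPIES_BY_ROW)
print(f"Sum of per-family copies: {total} (abstract reports {REPORTED_TOTAL})")
```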
Fig. 3. Comparison of the CRISPR/Cas region between QMA0248, SF1, YSFST01-82, ISET0901 and ISNO. Alignment of Cas genes where the genomes are ordered according to their position in the phylogenetic tree (left). The maximum-likelihood (ML) phylogeny was rooted to QMA0140 (not shown) and built using 1111 SNPs. The scale bar indicates the number of substitutions represented by branch lengths. Arrows correspond to Cas genes, which are labelled at the bottom. The figure was produced with EasyFig [33] using 500 bp as the minimum length, 90 % as the minimum identity value and 0.001 as the maximum e-value.