Tanu Saroha1,2, Vasvi Chaudhry1,3, Prabhu B Patil1. 1. Bacterial Genomics and Evolution Laboratory, CSIR - Institute of Microbial Technology, Chandigarh, India. 2. Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India. 3. Present address: Department of Microbial Interactions, Centre for Plant Molecular Biology, Interfaculty Institute of Microbiology and Infection Medicine Tubingen, University of Tubingen, Tubingen, Germany.
Abstract
Staphylococcus haemolyticus is a species of coagulase-negative staphylococci that has primarily been studied as a member of the human skin microbiome and as an emerging nosocomial pathogen. Here, we present the first complete genomes of S. haemolyticus strains reported as endophytes of rice seed (SE3.9, SE3.8 and SE2.14). Detailed investigation of the genome dynamics of strains from diverse origins revealed an expanded genome size in clinical isolates and a role for many insertion sequence (IS) elements in strain diversification. Interestingly, several of the IS elements are unique to, or enriched in, a particular habitat. Comparative studies also revealed the potential movement of mobile elements from rice endophytic S. haemolyticus to strains of other pathogenic species such as Staphylococcus aureus. The study highlights the importance of ecological studies for a systematic understanding of genome plasticity and for the management of medically important Staphylococcus species.
Keywords:
IS element; Staphylococcus haemolyticus; complete genome; mobile elements; plant host; plasmids
All the sequencing data have been deposited in GenBank under BioProject ID nos. PRJNA263225, PRJNA263226 and PRJNA263227. All the supporting data have been provided through supplementary data files.
Staphylococcus haemolyticus is a commensal and an inhabitant of diverse hosts such as humans, animals and plants. As plants are not mobile, bacteria associated with plants face a variety of intense biotic and abiotic stresses. Our systematic, complete genome-based study revealed that ecologically diverse strains can act as reservoirs of stress resistance genes and can transfer them to other Staphylococcus species such as S. aureus. Plasmids are known to evolve at a faster rate and can also change the selective effect of chromosomal mutations through epistatic interactions. Apart from mobile elements, our study indicates a role for several plasmids of a plant isolate in the adaptation and evolution of ecologically diverse S. haemolyticus strains and of other medically important Staphylococcus species.
Introduction
Staphylococcus haemolyticus is a Gram-positive, coagulase-negative bacterium that inhabits human skin and mucous membranes as a commensal organism. It is mainly associated with bloodstream infections and medical device-associated infections [1]. After Staphylococcus epidermidis, it is the most frequent isolate from human bloodstream infections, particularly from sepsis [2-4]. Recent studies have shown that 75 % of analysed S. haemolyticus isolates are multidrug-resistant [5]. Its ability to form biofilms [6, 7] and to produce phenol-soluble modulins [8], and the presence of a large number of insertion sequences leading to frequent genomic rearrangements [9], have been proposed as factors contributing to its multidrug resistance phenotype. Additionally, it contributes to the transfer of resistance genes to the virulent pathogen S. aureus, leading to the emergence of extremely rampant clones of S. aureus [10-12].
Apart from inhabiting humans, S. haemolyticus is also known to be associated with a multitude of other hosts and habitats, such as plants and the environment. In a previous study, we provided insights into the genome-based taxonomy and evolution of S. haemolyticus adapted to different habitats, particularly as a rice endophyte [13]. As that study was based on draft genome sequences, it was challenging to study repetitive elements such as insertion sequence (IS) elements and other mobile elements such as plasmids, integrons, CRISPRs, conjugative transposons and phages. With the advent of third-generation sequencing, it is possible to obtain complete genome sequences and to study intragenomic heterogeneity and inter-strain variation effectively [14]. In the present study, we provide evidence for genome expansion and for the role of specific IS elements, along with plasmids, in the ecological diversification of S. haemolyticus. Interestingly, plasmids from the rice isolate harbour toxin genes, antibiotic resistance genes, lantibiotic genes, insertion elements and metal resistance genes. Comparative genomic analysis revealed a distinct difference in genome size and a role for mobile elements in the diversification of S. haemolyticus strains with diverse lifestyles. We also found evidence for the role of plasmids in the acquisition of novel genes and in their transfer to medically important species such as S. aureus. The study highlights the importance of ecological genomics studies in S. haemolyticus, with particular emphasis on the mobilome of rice or other plant isolates.
Methods
DNA isolation and genome sequencing
The strains were grown in nutrient broth at 28 °C for 24 h. Genomic DNA was isolated using a Quick-DNA Fungal/Bacterial Miniprep Kit (Zymo Research). The quantity and quality of DNA were assessed using a Nanodrop spectrophotometer and a Qubit 2.0 fluorometer. For Nanopore sequencing, 3.5 µg of initial DNA was used for DNA end prep using NEBNext Ultra II End repair/dA-Tailing modules (NEB). The library was prepared using the Ligation Sequencing Kit 1D (SQK-LSK109) and Native Barcoding Kit 1D (EXP-NBD104). The barcode and adapter ligation steps were performed using the NEB ligase master mix module and the NEB T4 DNA ligase module. All bead washing steps in the protocol were performed using AMPure beads (Beckman Coulter). Finally, 12 µl of prepared DNA library was loaded onto the flow cell according to the manufacturer’s instructions and sequenced for 48 h on a MinION with a FLO-MIN-106 (version R9.4) flow cell, using MinKNOW software v.1.13.1 (Oxford Nanopore Technologies; http://community.nanoporetech.com). Raw reads were base called using Guppy v3.4.3 software (http://community.nanoporetech.com).
Genome assembly and annotation
Genome assembly was first performed using Unicycler v0.4.8-beta [15] in bold mode with the ONT long reads. The complete genome obtained from the ONT reads was then polished with multiple rounds of Pilon v1.22 [16] using the Illumina raw reads from our previous study [13]. Each genome assembled into a single closed circular chromosome. Assembly quality in terms of completeness and contamination was assessed using CheckM v1.0.13 [17]. Genome coverage was calculated using BBMap v38.42 [18]. Annotation was performed using the National Center for Biotechnology Information Prokaryotic Genome Annotation Pipeline (NCBI-PGAP) [19].
Genome comparison and identification of mobile elements
Genome comparison between strains SE3.9, SE3.8, SE2.14, S167, SGAir0252, ATCC 29970T and JCSC1435 was performed using the progressiveMauve tool with default parameters [20]. The plasmid maps of SE3.9 were drawn using DNAPlotter [21]. IS elements were identified using the ISsaga tool [22]. The circular genome representation of SE3.9 was generated using the CGView Comparison Tool [23]. Genomic islands of SE3.9 were predicted using the IslandViewer 4 tool [24], with GC content of <29 % or >35 % treated as atypical relative to the GC content of the S. haemolyticus genome (32 %).
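The atypical-GC criterion described above (<29 % or >35 %) can be illustrated with a simple sliding-window scan. This is only a sketch of the thresholding idea and is not IslandViewer 4's own prediction method; the function names, window size and step size are assumptions chosen for illustration.

```python
from typing import List, Tuple


def gc_percent(seq: str) -> float:
    """Return the GC content of a DNA string as a percentage (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def atypical_gc_windows(
    seq: str,
    window: int = 1000,   # assumed window size, not taken from the paper
    step: int = 500,      # assumed step size, not taken from the paper
    low: float = 29.0,    # lower threshold stated in the Methods
    high: float = 35.0,   # upper threshold stated in the Methods
) -> List[Tuple[int, int, float]]:
    """Return (start, end, GC%) for each window with GC% < low or > high."""
    hits = []
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = seq[start:start + window]
        gc = gc_percent(chunk)
        if gc < low or gc > high:
            hits.append((start, start + len(chunk), gc))
    return hits


# Toy sequence: an AT-only block, a block with 30 % GC, and a GC-only block.
toy = "ATATATATAT" + "ATGCATGATA" + "GCGCGCGCGC"
flagged = atypical_gc_windows(toy, window=10, step=10)
```

On a real chromosome, the flagged windows would only be candidates; island prediction tools combine compositional signals with other evidence such as mobility genes.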
Data deposition
The projects have been deposited at the National Center for Biotechnology Information (NCBI) under accession numbers CP049091–CP049096, CP084229–CP084234 and CP084235–CP084240.
Results and discussion
Complete genome sequencing of the rice endophytic strains SE3.9, SE3.8 and SE2.14 was performed using the Oxford Nanopore MinION (see Methods). The strains were taxonomically classified as S. haemolyticus by the average nucleotide identity (ANI) method and showed an identity range of 96.78–97.02 % with the available reference genome of the type strain ATCC 29970T. The GC content of the chromosome was 32.8 %, and that of plasmids 1 to 5 was 29.5, 26.2, 30.0, 33.2 and 30.4 %, respectively. The genomic features of the strains are detailed in Table 1, and a circular genome representation for strain SE3.9 is given in Fig. 1a. The protein-coding genes of SE3.9 were classified into 21 Clusters of Orthologous Groups (COG) categories. COG categories were assigned to a total of 1768 coding sequences (CDS). Among the categories with an assigned function, class E (amino acid transport and metabolism) constitutes the largest proportion, 8.96 % (174 of the 1941 category assignments listed in Table 2), followed by class J (translation, ribosomal structure and biogenesis) with 7.93 % (Fig. 1a, Table 2). Classes R (general function prediction) and S (function unknown) account for 13.24 and 10.35 %, respectively.
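The COG proportions quoted above can be recomputed from the gene counts in Table 2. The following is a minimal sketch: the counts are copied from Table 2, while the variable names and the choice to treat classes R and S as uninformative are assumptions made for illustration.

```python
# Gene counts per COG class for strain SE3.9, copied from Table 2.
cog_counts = {
    "B": 1, "C": 111, "D": 21, "E": 174, "F": 69, "G": 118, "H": 100,
    "I": 54, "J": 154, "K": 135, "L": 102, "M": 105, "N": 7, "O": 64,
    "P": 125, "Q": 28, "R": 257, "S": 201, "T": 57, "U": 28, "V": 30,
}

# Total number of category assignments across all classes.
total = sum(cog_counts.values())

# Percentage of assignments falling into each class.
cog_percent = {cls: 100.0 * n / total for cls, n in cog_counts.items()}

# Classes R (general function prediction) and S (function unknown) carry no
# specific functional information, so they are set aside when ranking.
informative = {cls: n for cls, n in cog_counts.items() if cls not in ("R", "S")}
ranked = sorted(informative, key=informative.get, reverse=True)

for cls in ranked[:3]:
    print(f"{cls}: {cog_counts[cls]} genes, {cog_percent[cls]:.2f} %")
```

The denominator here is the sum over classes, which is larger than the number of CDS with a COG assignment reported in the text; this is consistent with some CDS being counted in more than one class, although the source does not state this explicitly.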
Table 1.
Genomic features of S. haemolyticus strains SE3.9, SE3.8 and SE2.14
Feature                   SE3.9                SE3.8                SE2.14
Chromosome size (bp)      2323296              2323407              2323230
Plasmid1 size (bp)        25444                25444                25444
Plasmid2 size (bp)        21031                21031                21031
Plasmid3 size (bp)        18550                18550                18550
Plasmid4 size (bp)        4101                 4101                 4101
Plasmid5 size (bp)        3674                 3674                 3685
Genome coverage           57×                  46×                  52×
CDS                       2191                 2279                 2283
tRNA                      59                   61                   58
IS elements               15                   14                   15
Completeness (%)          99.62                99.62                99.62
Contamination (%)         0.71                 0.71                 0.71
NCBI accession numbers    CP049091–CP049096    CP084229–CP084234    CP084235–CP084240
Fig. 1.
(a) Circular map of the S. haemolyticus SE3.9 chromosome. From outer to inner: the first circle represents forward strand genes, coloured according to COG classification; the second and third circles represent CDS and non-coding RNAs on the forward and reverse strand, respectively; the fourth circle represents reverse strand genes, coloured according to COG classification. The inner area represents the GC content, GC skew and genome size. (b) Circular map of plasmids pSE3.9-1, pSE3.9-2 and pSE3.9-3 of S. haemolyticus strain SE3.9, with antibiotic, antimicrobial and metal resistance genes highlighted.
Table 2.
Clusters of orthologous groups of S. haemolyticus SE3.9
COG class   Description                                                      Gene count   Percentage
B           Chromatin structure and dynamics                                 1            0.05%
C           Energy production and conversion                                 111          5.71%
D           Cell cycle control, cell division, chromosome partitioning       21           1.08%
E           Amino acid transport and metabolism                              174          8.96%
F           Nucleotide transport and metabolism                              69           3.55%
G           Carbohydrate transport and metabolism                            118          6.07%
H           Coenzyme transport and metabolism                                100          5.15%
I           Lipid transport and metabolism                                   54           2.78%
J           Translation, ribosomal structure and biogenesis                  154          7.93%
K           Transcription                                                    135          6.95%
L           Replication, recombination and repair                            102          5.25%
M           Cell wall/membrane/envelope biogenesis                           105          5.40%
N           Cell motility                                                    7            0.36%
O           Post-translational modification, protein turnover and chaperones 64           3.29%
P           Inorganic ion transport and metabolism                           125          6.43%
Q           Secondary metabolites biosynthesis, transport and catabolism     28           1.44%
R           General function prediction                                      257          13.24%
S           Function unknown                                                 201          10.35%
T           Signal transduction mechanisms                                   57           2.93%
U           Intracellular trafficking, secretion and vesicular transport     28           1.44%
V           Defence mechanisms                                               30           1.54%
We carried out comparative genome analysis using progressiveMauve to understand the genome dynamics of S. haemolyticus variants. The high degree of synteny among the genomes indicated that the arrangement of local collinear blocks (LCBs) is conserved (Fig. 2a). LCBs represent homologous backbone sequences, and differences in the locations of LCBs indicate genomic rearrangements such as inversions and translocations. Additionally, there is a considerable expansion in genome size from rice to clinical isolates, by 361 719 bp. A major driving force in bacterial evolution and adaptation is the horizontal acquisition of accessory genes through mobile genetic elements such as plasmids, genomic islands (GIs) and IS elements. GIs are discrete segments of DNA, varying from 10 to 100 kb in size. Based on the type of accessory genes they carry, GIs are classified as pathogenicity, resistance, metabolic and symbiosis islands [25]. The IslandViewer 4 tool predicted three GIs in rice strain SE3.9 (Fig. 3, Table S1, available in the online version of this article). GI 1 (10 kb) is an acquired resistance island with beta-lactamase genes, which provide potential resistance to antibiotics containing a beta-lactam ring, such as ampicillin and penicillin. GI 2 (9 kb) carries virulence-associated genes, such as pathogenicity island genes, and a terminase small subunit gene, which has a role in phage packaging. GI 3 (14.7 kb) is a metabolic island with genes such as those encoding a PTS galactitol transporter, a type 1 glutamine amidotransferase, a LacI family transcriptional regulator and 3-hexulose-6-phosphate synthase, which are involved in carbohydrate metabolism and the pentose phosphate pathway. Overall, the genes newly acquired by horizontal transfer appear to be largely related to adaptation to a plant host.
Fig. 2.
(a) Complete genome alignment of strains SE3.9, SE3.8, SE2.14, S167, SGAir0252, ATCC 29970T and JCSC1435. The scale represents the coordinates of each genome. Different coloured blocks represent LCBs, which are conserved segments in the genomes. (b) Distribution of IS elements into different IS families in the genomes of S. haemolyticus strains of diverse origin. The green bar represents strains SE3.9, SE3.8 and SE2.14; blue represents strain S167; purple represents strain SGAir0252; yellow represents strain ATCC 29970T; and red represents strain JCSC1435.
Fig. 3.
Schematic representation of the GIs of strain SE3.9. Panels represent the three islands, GI 1, GI 2 and GI 3, in strain SE3.9. Genes are depicted as filled arrows coloured according to their proposed functional category (see enclosed box).
To understand the role of IS elements in genome plasticity, we studied the IS element content of diverse S. haemolyticus strains. As shown in Fig. 2b, the number and type of IS elements vary among S. haemolyticus isolates of diverse origins. A total of 14 or 15 IS elements, classified into six families, were observed in rice strains SE3.9, SE3.8 and SE2.14; 65 IS elements, classified into six families, were found in strain S167, isolated from a leafy vegetable; 73 IS elements, classified into six families, were found in environmental strain SGAir0252; 41 IS elements, classified into four families, were found in commensal strain ATCC 29970T; and 99 IS elements, classified into six families, were found in clinical strain JCSC1435. Three IS groups (ISL3, ISNCY, IS1182) comprised 80 % of the IS elements. ISL3-SH of S. haemolyticus exhibited 94.67 % nucleotide identity to ISL3-SA of S. aureus and 94.58 % identity to ISL3-SE of S. epidermidis. The IS30 and IS3_ssgr_IS150 families are present exclusively in strains isolated from rice seeds. They are associated with MFS transporter and toxic anion resistance genes that are potentially related to stress tolerance in rice seeds. The IS6 family is associated with resistance genes such as tetracycline resistance, arsenate reductase, quaternary ammonium compound efflux (qacA), and magnesium and cadmium transporter genes. Members of this family are known to disseminate antibiotic resistance genes and are present in plant, environmental and clinical isolates. IS200_IS605_ssgr_IS200 is present in all isolates studied except those from rice seed. It is an ancestral IS of bacteria inhabiting a human host [26, 27]. IS256 is present in large numbers in clinical strains and is a well-known molecular marker for clinical Staphylococcus strains [28]. The presence of many IS elements in clinical strain JCSC1435 is probably the reason for the frequent genome rearrangements in S. haemolyticus that contribute to its multidrug resistance.
Plasmids contribute to bacterial ecology and evolution by mobilizing accessory genes through horizontal gene transfer [29]. The genomes of the S. haemolyticus strains completed in this study each contain five plasmids (Fig. 1b). Since all three genomes from rice have plasmids of similar size, we discuss the plasmids of strain SE3.9 in detail. Four of these plasmids encode genes that help the bacterium adapt to and survive in harsh and stressful conditions. pSE3.9-1 harbours the gene qacA, which encodes a multidrug efflux pump reported to export toxic molecules such as quaternary ammonium compounds, some antibiotics and disinfecting agents [30, 31]. pSE3.9-1 also contains copper resistance genes, and hence there is a potential for the transfer of copper resistance genes to isolates of clinical origin. Copper-containing compounds such as the Bordeaux mixture are widely used antibacterial agents for controlling plant diseases such as bacterial leaf blight in rice [32, 33]. Copper is also a promising antibacterial agent for controlling hospital pathogens. Hence there is the worrying possibility of plant isolates being a source of copper resistance in clinical isolates. The presence of alleles of copper and other stress tolerance genes on plasmids provides further scope for diversification and selection. pSE3.9-2 encodes a LanM-type lantibiotic, which provides protection against other bacteria by disrupting their membrane integrity. pSE3.9-3 encodes resistance genes for metals, including cadmium, arsenic, magnesium and chromium, to overcome metal stress in the environment. Additionally, it contains antibiotic resistance genes for tetracycline. pSE3.9-4 mainly contains hypothetical genes. Interestingly, blastn analysis revealed that pSE3.9-5 has 100 % nucleotide identity, at 85 % query coverage, to plasmids pUSA07-1-SUR24 and pUSA07-1-SUR15 of S. aureus USA300 strains isolated from a hospital outbreak [34] (Fig. 4). This result indicates the movement of plasmids across Staphylococcus species.
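Query coverage, as reported for the blastn comparison above, is the fraction of the query sequence covered by at least one alignment. A minimal sketch of that calculation follows, assuming pSE3.9-5 (3674 bp, Table 1) was the query; the function name and the alignment coordinates are hypothetical and chosen only so that the result is close to the reported 85 %.

```python
from typing import Iterable, Tuple


def query_coverage(hsps: Iterable[Tuple[int, int]], query_len: int) -> float:
    """Percentage of query positions covered by at least one alignment.

    hsps holds (start, end) query coordinates, 1-based and inclusive;
    reversed pairs are normalised so that start <= end.
    """
    intervals = sorted((min(s, e), max(s, e)) for s, e in hsps)
    covered = 0
    cur_start = cur_end = None
    for start, end in intervals:
        if cur_end is None or start > cur_end + 1:
            # Gap before this interval: close the previous merged block.
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return 100.0 * covered / query_len


# Hypothetical alignment coordinates on the 3674 bp plasmid pSE3.9-5.
example_hsps = [(1, 2000), (1500, 3123)]
coverage = query_coverage(example_hsps, query_len=3674)
```

Overlapping alignments are merged before counting, so a region matched twice is not counted twice; this is why coverage and identity are reported separately.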
Fig. 4.
Comparison of the genetic organization of S. aureus strain USA300-SUR24 plasmid pUSA07-1-SUR24 with S. haemolyticus strain SE3.9 plasmid pSE3.9-5. Arrows represent predicted CDSs. Highly conserved regions determined by pairwise blastn comparisons with E-values <0.001 were plotted. Regions with forward and reverse matches are indicated by red and blue shading, respectively, with colour intensity indicating nucleotide identity levels (from 80 to 100 %). The absence of red or blue shading denotes no homology.
Conclusion
Mobile and repetitive elements are known to drive the evolution of niche-specific bacteria through horizontal gene transfer events. The complete genome sequence of an ecological variant allowed us to gain insights into distinct evolutionary routes and into the role of mobile elements, particularly IS elements and multispecies plasmids, in the adaptation of S. haemolyticus. At the same time, there is the possibility of transfer of plasmids encoding antimicrobial resistance and fitness genes from plant strains into pathogenic strains of S. haemolyticus and also into other pathogenic species such as S. aureus. In this context, the presence of resistance and fitness genes on plasmids, which are usually present in multiple copies, has implications for the further success and adaptation of clinical species and strains. Overall, these resources and systematic insights should stimulate further ecology-based molecular and functional studies of the pathogenic lifestyle of S. haemolyticus.