Complete genome sequence and genomic characterization of Microcystis panniformis FACHB 1757 by third-generation sequencing.

Jun-Yi Zhang1,2, Rui Guan1, Hu-Jun Zhang2, Hua Li3, Peng Xiao4, Gong-Liang Yu3, Lei Du5, De-Min Cao5, Bing-Chuan Zhu2, Ren-Hui Li3, Zu-Hong Lu1,6.   

Abstract

The cyanobacterial genus Microcystis is well known as the main group that forms harmful blooms in water. A strain of Microcystis, M. panniformis FACHB1757, was isolated from Meiliang Bay of Lake Taihu in August 2011. The whole genome was sequenced using PacBio RS II sequencer with 48-fold coverage. The complete genome sequence with no gaps contained a 5,686,839 bp chromosome and a 38,683 bp plasmid, which coded for 6,519 and 49 proteins, respectively. Comparison with strains of M. aeruginosa and some other water bloom-forming cyanobacterial species revealed large-scale structure rearrangement and length variation at the genome level along with 36 genomic islands annotated genome-wide, which demonstrates high plasticity of the M. panniformis FACHB1757 genome and reveals that Microcystis has a flexible genome evolution.

Keywords:  Comparative genomics; Lake Taihu; Microcystis; Microcystis panniformis FACHB1757; Third-generation sequencing; Water bloom

Year:  2016        PMID: 26823957      PMCID: PMC4730716          DOI: 10.1186/s40793-016-0130-5

Source DB:  PubMed          Journal:  Stand Genomic Sci        ISSN: 1944-3277


Introduction

The massive development of bloom-forming cyanobacteria is causing problems in eutrophic water bodies worldwide. Among the cyanobacteria, Microcystis is perhaps the most notorious. Many Microcystis species have been reported to produce microcystins [1-4], which threaten many aquatic ecosystems and cause serious and occasionally fatal human liver, digestive, neurological, and skin diseases [5-7]. Microcystis is a genus of unicellular colony-forming cyanobacteria whose taxonomy is still unclear [8]. Although morphological criteria have been proposed to distinguish species from field samples, such criteria have long been questioned for use in species identification within the genus [9]. Several studies have attempted to reconcile molecular and morphological taxonomy in Microcystis [9-14], and a morphology-based taxonomic system is still predominantly used. M. panniformis was first reported in 2002 and was morphologically described as having flattened, irregular, monolayer colonies with small holes that later disintegrate into small pieces [15]. Since M. panniformis strain SPC 702 was successfully isolated from Lago das Garças, São Paulo in 1999, studies addressing different aspects of this species have been performed [16-25]. In China, M. panniformis was reported as a newly recorded species in 2012 [26], and one strain (FACHB1757) was isolated from Lake Taihu. M. panniformis was originally thought to be distributed only in tropical regions, but we showed that this species has invaded subtropical regions with a monsoon climate [26]. The global expansion of harmful cyanobacteria is thought to be linked to climate change, particularly rising atmospheric CO2 and surface temperature, which may promote growth and enhance the potential for bloom occurrence [27-29]. Therefore, a deeper understanding of the ecology and physiology of M. panniformis FACHB1757, supported by a robust genome reference, may provide insight into the expansion and invasion mechanisms of Microcystis.

Organism information

Classification and features

A water bloom sample was collected directly from the water surface using a plastic bucket in Meiliang Bay of Lake Taihu in August 2011 (Fig. 1a). Lake Taihu (N 30°56′–31°33′, E 119°54′–120°36′), the third largest freshwater lake in China, is located in the south of the Yangtze River Delta. The total area of the lake is 2338 km², with an average depth of 2 m and a total capacity of 47.6 × 10⁸ m³. Lake Taihu is situated in the subtropical zone with a humid and semi-humid monsoon climate, and has suffered from severe eutrophication over the past three decades. Meiliang Bay, in the northern part of Lake Taihu (Fig. 1a), has a surface area of 100 km² and a depth of 1.8–2.3 m, and is currently hypereutrophic [30].
Fig. 1

Strain collection location and photomicrographs of M. panniformis FACHB1757. The strain was originally isolated from Meiliang Bay of Lake Taihu in August 2011 and deposited in the Freshwater Algae Culture Collection at the Institute of Hydrobiology (FACHB-collection, China) with the unique identifier FACHB1757 in 2012. a The precise position of the isolated sample is indicated by a star; WT means Wutang station in Lake Taihu. b The morphology of the strain colonies in the white disk, which were collected directly from the water surface using a plastic bucket (on September 15, 2013 in Meiliang Bay, photo with a Nikon D7000). c, d Flat colonies with small holes as viewed under an optical microscope

Some colonies in the sample disintegrated during collection; thus, only macroscopic colonies with a distinct monolayer were collected with 3-ml pipets (BD Falcon, USA), transferred into 50-ml centrifuge tubes (Corning, USA), and immediately shipped to the laboratory. Macroscopic colonies that were flattened, irregular, and monolayered, with small holes (in old colonies), were identified as M. panniformis by examination under an optical microscope. M. panniformis FACHB1757 was obtained, and this strain was then stored at the Freshwater Algae Culture Collection at the Institute of Hydrobiology, Chinese Academy of Sciences. The general characteristics of M. panniformis FACHB1757 are summarized in Table 1, and a phylogenetic tree based on 16S rRNA sequences is shown in Fig. 2.

The spherical cells have an estimated diameter of 2.6 to 6.8 μm (mean 4.7 μm), are densely agglomerated, and form irregular colonies with small holes. The young stages formed small clusters of cells, which were flat or circular in outline, sometimes spheroidal, and with or without an internal hollow. The old stages formed colonies with small holes, which later disintegrated into small groups. The mucilage (margin of colonies) was diffuse, and cells did not overlap. The margin of the colonies was smooth or (in old colonies) irregular. Cell density was regular and evenly agglomerated, sometimes in indistinct rows. Diagnostic characteristics included flat colonies with small holes, toxicity, homogeneously arranged cells, and a life cycle characterized by distinct benthic and planktonic phases [15, 31]. The distribution was tropical, and this is likely a pantropical species (e.g., S. Africa, N. Australia, S. America, Africa, China, Vietnam and New Zealand) [13, 15, 16, 26, 31, 32].
Table 1

Classification and general features of M. panniformis FACHB1757 according to the MIGS recommendations [69]

| MIGS ID | Property | Term | Evidence code (a) |
|---|---|---|---|
| | Classification | Domain Bacteria | TAS [70] |
| | | Phylum Cyanobacteria | TAS [71, 72] |
| | | Class Oscillatoriophycideae | TAS [73] |
| | | Order Chroococcales | TAS [73, 74] |
| | | Family Microcystaceae | TAS [74] |
| | | Genus Microcystis | TAS [71, 75] |
| | | Species M. panniformis | TAS [15, 31] |
| | | Strain: M. panniformis FACHB1757 | TAS [26] |
| | Gram stain | Gram-negative | TAS [76] |
| | Cell shape | Spherical cells | TAS [15] |
| | Motility | Non-motile | NAS |
| | Sporulation | None | TAS [76] |
| | Temperature range | Mesophile | NAS |
| | Optimum temperature | 29.5 °C | IDA |
| | pH range; Optimum | pH 7.50-9.21; pH 8.33 | IDA |
| | Carbon source | Autotroph, heterotroph | NAS |
| MIGS-6 | Habitat | Fresh water | NAS |
| MIGS-6.3 | Salinity | 1.0 % (maximum) | IDA |
| MIGS-22 | Oxygen requirement | Aerobic | NAS |
| MIGS-15 | Biotic relationship | Free-living | NAS |
| MIGS-14 | Pathogenicity | Microcystins (MCY) | TAS [25, 77] |
| MIGS-4 | Geographic location | Isolated Lake Taihu, China | IDA |
| MIGS-5 | Sample collection | August, 2011 | IDA |
| MIGS-4.1 | Latitude | 31.421 N | IDA |
| MIGS-4.2 | Longitude | 120.201 E | IDA |
| MIGS-4.3 | Depth | Surface 0.5 m | IDA |
| MIGS-4.3 | Altitude | 11 m | IDA |

(a) Evidence codes: IDA, Inferred from Direct Assay; TAS, Traceable Author Statement (i.e., a direct report exists in the literature); NAS, Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [78]

Fig. 2

Phylogenetic tree showing the position of M. panniformis FACHB1757. The dendrogram is based on the complete 16S ribosomal RNA sequences of M. panniformis FACHB1757, M. aeruginosa NIES843, M. aeruginosa PCC7806, M. aeruginosa NIES 2549, and representatives of other cyanobacterial genera (Synechocystis, Pseudanabaena, Synechococcus, Thermosynechococcus, Planktothrix, Dolichospermum, Anabaena, Cylindrospermopsis, Nodularia, Nostoc, Aphanizomenon, Raphidiopsis) downloaded from NCBI (sequences without accession numbers were extracted from annotation files of the corresponding genomes). The tree was built in MEGA6 using the neighbor-joining algorithm with 100 bootstrap replications, and the bootstrap consensus tree is shown. The two copies of the 16S rRNA sequence of M. panniformis FACHB1757 are labeled in red. The tree shows the relationship between M. panniformis FACHB1757 and other important algal species in Cyanophyceae. Species colored in green have whole genome data available in NCBI


Phylogenetic analysis

Whole genome comparative analysis between M. panniformis FACHB1757 and 13 other cyanobacterial species was performed. General information on the related genome data is shown in Table S1 (Additional file 1), and all data sets were downloaded from NCBI. The main water bloom-forming cyanobacterial species in freshwater and brackish water worldwide, particularly those in the Lake Taihu region, were included. Unicellular colony-forming Microcystis and filamentous heterocystous Dolichospermum (formerly known as the planktonic Anabaena) were the main components of cyanobacterial blooms in Lake Taihu [33]. The , , , Raphidiopsis, , , and species occurred as dominant species or accompanying species in blooms of Lake Taihu (including Lake Wuli) across different seasons. Among the 14 genome sequences, 691 single-copy gene families were annotated by OrthoMCL (version 2.0.9) [34], and MEGA6 [35] was used to construct a phylogenetic tree based on these sequences (Fig. 3).
Fig. 3

Phylogenetic tree of water bloom-forming cyanobacterial species and representative cyanobacteria. The nucleotide divergence tree was constructed using the neighbor-joining algorithm based on 691 sequences of single-copy gene families annotated by OrthoMCL with 100 bootstrap replicates. The representative cyanobacteria that cannot form water blooms are indicated with an asterisk

The phylogenetic tree shows that M. panniformis FACHB1757 and M. aeruginosa NIES843 share high similarity, and there is no clear division between M. panniformis and the M. aeruginosa strains in the tree. The Microcystis lineage is distinct from the lineage that contains the unicellular , , and other multicellular cyanobacteria. Furthermore, the Synechocystis sp. PCC 6803 genome is more closely related to Microcystis than other strains. This result is congruent with previously published results based on 16S rRNA sequences [36-39]. Topological relationships between species in the phylogenetic tree based on single-copy gene families were generally consistent with the phylogenetic tree based on 16S rRNA sequences (Fig. 2). Although Microcystis can be identified at the genus level based on 16S rRNA and single-copy gene family sequences, the taxonomy of Microcystis at the species level has been controversial over the past few decades, and five species have even been unified into a single species [13]. 16S rRNA sequence comparison can be ambiguous when analyzing certain species with distinct morphologies, as occurred when analyzing M. panniformis and M. ichthyoblabe (Fig. 2). Therefore, whole reference genome sequence data are expected to play a crucial role in species classification of Microcystis. However, the currently available cyanobacterial genome sequences are highly limited. Only three Microcystis strains with complete genomic sequences are available: M. aeruginosa NIES843 and NIES2549, and M. panniformis FACHB1757 reported here. Furthermore, further species concepts and more useful molecular approaches should be proposed to classify the species/strain divergences in Microcystis [40, 41].
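Neighbor-joining trees such as those in Figs. 2 and 3 are built from a matrix of pairwise distances between aligned sequences. The following is a minimal sketch of that first step only, assuming the simple p-distance (proportion of differing sites, with gap columns skipped) rather than whichever substitution model was selected in MEGA6. The strain labels and sequences are invented for illustration.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def distance_matrix(aligned: dict[str, str]) -> dict[tuple[str, str], float]:
    """Pairwise p-distances for every unordered pair of sequence names."""
    return {
        (m, n): p_distance(aligned[m], aligned[n])
        for m, n in combinations(sorted(aligned), 2)
    }


# Hypothetical toy alignment, not data from this study.
toy_alignment = {
    "strain_A": "ACGTACGTAC",
    "strain_B": "ACGTACGTAT",
    "strain_C": "ACGAAC-TTT",
}

for pair, dist in distance_matrix(toy_alignment).items():
    print(pair, round(dist, 3))
```

A tree-building method such as neighbor joining would then take this matrix as input; bootstrap support comes from repeating the calculation on resampled alignment columns.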

Genome sequencing information

Genome project history

M. panniformis FACHB1757 was selected for sequencing because of its obvious morphological characteristics; in particular, its macroscopic monolayer colonies can exceed 30 mm during the summer and early autumn in Lake Taihu. More importantly, until recently, only complete genomes of M. aeruginosa strains (NIES843 and NIES2549) had been published. The complete genome sequence of M. panniformis FACHB1757 is only the third for Microcystis. The sample information for M. panniformis FACHB1757 is available in NCBI under BioSample ID SAMN03392520. A DNA library with an insert size of 10 Kb was constructed, and the whole genome was sequenced to 48-fold coverage. The completed genome sequence was assembled and uploaded to GenBank under accession number CP011339. Project details were deposited in NCBI BioProject PRJNA277430. A summary of the project information can be found in Table 2.
Table 2

Project information

| MIGS ID | Property | Term |
|---|---|---|
| MIGS-31 | Finishing quality | Complete |
| MIGS-28 | Libraries used | 2 PacBio SMRT cells |
| MIGS-29 | Sequencing platforms | PacBio RSII |
| MIGS-31.2 | Fold coverage | 43.39 |
| MIGS-30 | Assemblers | HGAP 2.2.3 |
| MIGS-32 | Gene calling method | RAST |
| | Locus Tag | VL20 |
| | GenBank ID | CP011339 |
| | GenBank Date of Release | August 11, 2015 |
| | GOLD ID | Gp0111943 |
| | BIOPROJECT | PRJNA277430 |
| MIGS-13 | Source Material Identifier | FACHB1757 |
| | Project relevance | Environmental |

Growth conditions and genomic DNA preparation

M. panniformis FACHB1757 colonies collected from the field were grown in MA medium [42] and incubated in 24-well culture plates for 4 weeks. Floating colonies were then transferred to capped tubes containing 5 ml of MA medium to finally form a unialgal culture. All cultures were grown at 25 ± 1 °C with a 12 h light/12 h dark cycle under a photon irradiance of 25 μmol photons/(m² · s) provided by daylight fluorescent lamps. Total genomic DNA of M. panniformis FACHB1757 was extracted using a commercial DNA isolation kit (DNeasy® Plant Mini Kit, Qiagen, USA) following the manufacturer’s instructions, and analyzed by micro-volume fluorescence detection (NanoDrop™ 8000 Spectrophotometer, Thermo Scientific, USA) and electrophoresis in 0.8 % agarose gel stained with ethidium bromide. The isolated DNA was eluted with 50 μl of the elution buffer from the commercial kit and then stored at −20 °C until subsequent analyses.

Genome sequencing and assembly

First, the genome was surveyed using an Illumina HiSeq sequencer to check the purity of the cultured unialgal strain. The insert size of the next-generation paired-end library was 100 bp, and 1 Gbp of raw data was produced in total. All reads were mapped to the complete M. aeruginosa NIES843 reference genome, and more than 80 % of reads matched well. Subsequently, the genome was sequenced using PacBio RS II. Genomic DNA was sheared by Covaris S220 g-TUBE. A 10 Kb library was constructed using a PacBio template prep kit and sequenced on the PacBio SMRT platform. In total, two SMRT cells were run, and 303 Mbp of raw data was obtained. After filtering, the mean read length was 7143 bp with a quality of 0.84, and the longest read was 31,225 bp. HGAP (version 2.2.3) was used for genome assembly. Long reads were chosen as seeds, and the other reads were mapped to the seeds using Blasr (version 1.3.1.132871) [43] for error correction. After alignment, the accuracy of the seed sequences was improved to 99 % to meet the requirements of the Sanger assembly software. This yielded a total of 128 Mbp of high-quality long seed reads with an average length of 7898 bp. Celera Assembler (version 8.1) [44] was then used to assemble the seed reads into contigs, and Quiver [45] was used for a second round of error correction. Contigs were assembled into the final complete genome sequence using minimus2 in AMOS (version 3.1.0). The final genome consisted of a complete circular 5,686,839 bp chromosome with a GC content of 42.35 % and a 38,683 bp plasmid with a GC content of 43.97 %. Sequencing depths were 44.85 and 128.42, respectively.
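The link between sequencing yield and fold coverage can be illustrated with a back-of-the-envelope calculation. The sketch below divides a total base count by the combined replicon length, using the sizes and yields quoted above; the function name is illustrative. The result is a nominal figure and will generally differ from depths computed from aligned reads, which depend on filtering and on how reads map to each replicon.

```python
CHROMOSOME_BP = 5_686_839  # chromosome length from the text
PLASMID_BP = 38_683        # plasmid length from the text


def nominal_depth(total_bases: float, genome_bp: int) -> float:
    """Average fold coverage if all bases were spread evenly over the genome."""
    if genome_bp <= 0:
        raise ValueError("genome length must be positive")
    return total_bases / genome_bp


genome_bp = CHROMOSOME_BP + PLASMID_BP

# 303 Mbp of raw data (before filtering) and 128 Mbp of corrected seed reads.
raw_depth = nominal_depth(303e6, genome_bp)
seed_depth = nominal_depth(128e6, genome_bp)

print(f"raw data: {raw_depth:.1f}x, seed reads: {seed_depth:.1f}x")
```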

Genome annotation

Tandem repeats (TRs) were predicted by Tandem Repeats Finder (version 4.07b) [46] and Microsatellite identification tool (version 1.0), which can identify both perfect and compound microsatellites. Prediction and annotation of the genome were done using the RAST server (version 2.0) [47]. RAST integrates tRNAscan-SE, and the search_for_rnas tool was used to call RNA genes across the chromosome. For gene prediction, GLIMMER2 was used to identify putative genes. Subsequently, a similarity search was performed against FIGfams to confirm the predicted genes and annotate their functions. Moreover, all putative protein-coding genes were assigned to categories using databases including Clusters of Orthologous Groups (COG), Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), Swiss-Prot, and the Non-Redundant Protein Database.

Genome properties

The genome assembly contained a complete circular chromosome (5.69 Mbp) and a plasmid (38.68 Kbp). A schematic representation of the circular chromosome of M. panniformis FACHB1757 is shown in Fig. 4. Genome assembly and annotation statistics are given in Table 3. A nucleotide homology search of the M. panniformis FACHB1757 and M. aeruginosa NIES843 genomes was conducted with BLAST, and the similarity between the two was 83.82 % (Additional file 1: Figure S1). A total of 1944 TRs were found in the genome, including 27 microsatellites, 1742 mini-satellites, and 176 satellites. In total, there were 6567 genes, which included 48 RNA genes and 6519 protein-coding genes. Among the 6519 proteins, most contained around 100 amino acids (Additional file 1: Figure S2), and by comparison with the function databases mentioned above, 60.15 % of them were determined to have specific functions. There were 42 tRNA genes, and two copies of the rRNA gene cluster were found in the same direction. Function assignments of the 6519 putative protein-coding genes were searched against several frequently used databases mentioned above; 3260 genes were assigned to COGs, of which 235 participated in signal transduction. A search for Pfam domains detected 3997 candidates. According to the subsystem classification results produced by RAST, 72 % of the determined genes belong to specific subsystems, and the distribution of each category is shown in Additional file 1: Figure S3. The result of COG function annotation is shown in Table 4, and details of each COG cluster can be found in Additional file 2. The genes assigned to GO categories by InterProScan (version 5.4-47.0) [48] were classified into cellular components, molecular functions, and biological processes. Genes distributed in each category and their functions are shown in Additional file 1: Figure S4. In the GO data, 309 signal function-related genes were found. KEGG matched 897 functional genes to related systems, as shown in Additional file 1: Figure S5. Final gross function annotation outcomes are provided in Additional file 1: Table S2.
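Several of the percentages quoted for the chromosome follow directly from the raw counts in Table 3. A short check, using only values stated in the text and in Table 3 (the variable names are illustrative):

```python
def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# Values taken from the text and Table 3.
genome_bp = 5_686_839
coding_bp = 4_616_631
gc_bp = 2_408_639
total_genes = 6_567
protein_genes = 6_519
rna_genes = 48
genes_with_function = 3_921

# Protein-coding and RNA genes should add up to the total gene count.
assert protein_genes + rna_genes == total_genes

print("coding density (%):", percent(coding_bp, genome_bp))
print("GC content (%):", percent(gc_bp, genome_bp))
print("proteins with predicted function (%):", percent(genes_with_function, protein_genes))
```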
Fig. 4

Schematic representation of the circular chromosome of M. panniformis FACHB1757. The scales indicate location in Mbp, starting with the initial coding region. Circos was used to integrate the gene prediction results, COG function annotation, methylation modifications, and other information. From inner to outer circles: the first circle shows the GC skew (in purple and green), plotted as the deviation from the average GC skew of the entire chromosome sequence. The bars in the second circle (in black and red) represent the GC content, plotted using a 10-Kb sliding window. Positions of tRNA and rRNA genes are marked by green bars in the third circle. Bars in the fourth and fifth circles are colored according to the COG function categories of the CDS; the fourth circle shows the backward strand and the fifth the forward strand. The sixth and seventh circles indicate m4C and m6A sites in CDS/rRNA/tRNA regions (blue bars); the sixth circle shows the backward strand and the seventh the forward strand. In the eighth circle, red bars show the m4C and m6A sites in intergenic regions

Table 3

Genome statistics

| Attribute | Value | % of Total |
|---|---|---|
| Genome size (bp) | 5,686,839 | 100.00 |
| DNA coding (bp) | 4,616,631 | 81.18 |
| DNA G + C (bp) | 2,408,639 | 42.35 |
| DNA scaffolds | 1 | 100.00 |
| Total genes | 6,567 | 100.00 |
| Protein coding genes | 6,519 | 99.27 |
| RNA genes | 48 | 0.73 |
| rRNA genes | 6 | 0.09 |
| tRNA genes | 42 | 0.64 |
| Pseudo genes | - | - |
| Genes in internal clusters | - | - |
| Genes with function prediction | 3,921 | 100.00 |
| Genes assigned to COGs | 3,373 | 86.02 |
| Genes with Pfam domains | 2,067 | 52.72 |
| Genes with signal peptides | 309 | 7.88 |
| CRISPR repeats | 3 | - |
| Genes with transmembrane helices | - | - |

Table 4

Number of genes associated with general COG functional categories

| Code | Value | %age | Description |
|---|---|---|---|
| J | 165 | 2.51 | Translation, ribosomal structure and biogenesis |
| A | 0 | 0.00 | RNA processing and modification |
| K | 131 | 1.99 | Transcription |
| L | 620 | 9.44 | Replication, recombination and repair |
| B | 1 | 0.02 | Chromatin structure and dynamics |
| D | 47 | 0.72 | Cell cycle control, cell division, chromosome partitioning |
| V | 67 | 1.02 | Defense mechanisms |
| T | 138 | 2.10 | Signal transduction mechanisms |
| M | 203 | 3.09 | Cell wall/membrane biogenesis |
| N | 30 | 0.46 | Cell motility |
| U | 52 | 0.79 | Intracellular trafficking and secretion |
| O | 152 | 2.31 | Posttranslational modification, protein turnover, chaperones |
| C | 185 | 2.82 | Energy production and conversion |
| G | 131 | 1.99 | Carbohydrate transport and metabolism |
| E | 215 | 3.27 | Amino acid transport and metabolism |
| F | 62 | 0.94 | Nucleotide transport and metabolism |
| H | 146 | 2.22 | Coenzyme transport and metabolism |
| I | 65 | 0.99 | Lipid transport and metabolism |
| P | 188 | 2.86 | Inorganic ion transport and metabolism |
| Q | 126 | 1.92 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 535 | 8.14 | General function prediction only |
| S | 483 | 7.35 | Function unknown |
| - | 2,826 | 43.02 | Not in COGs |

The total is based on the total number of protein-coding genes in the genome


Insights from the genome sequence

Comparative species genomes

Gene ortholog analysis

Genes of the four Microcystis strains were compared (Fig. 5), and 2669 highly conserved orthologous genes were shared, which are representative of the core genome. Moreover, each genome had strain-specific genes, which varied in number from 296 to 1900. The M. aeruginosa NIES2549 genome, which is 1.5 Mbp smaller than that of NIES843, has only 296 unique genes, whereas NIES843 has 1388. M. panniformis FACHB1757 was shown to have 1900 specific genes, the greatest number among the four strains, even though its genome was not the longest.
Fig. 5

Venn diagram of gene numbers of four Microcystis species. Less than half of all genes were found in all four species
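Once each gene has been assigned to an ortholog family, the core and strain-specific counts in a comparison like Fig. 5 reduce to set operations. The toy sketch below shows the idea; the family identifiers and strain labels are invented for illustration and are not the ortholog assignments from this study.

```python
def core_and_unique(families_by_strain: dict[str, set[str]]):
    """Return (families shared by all strains, families found in only one strain)."""
    strains = list(families_by_strain)
    # Core genome: families present in every strain.
    core = set.intersection(*(families_by_strain[s] for s in strains))
    unique = {}
    for s in strains:
        # Everything seen in any other strain.
        others = set().union(*(families_by_strain[t] for t in strains if t != s))
        unique[s] = families_by_strain[s] - others
    return core, unique


# Hypothetical toy data.
toy = {
    "strain_1": {"f1", "f2", "f3", "f7"},
    "strain_2": {"f1", "f2", "f4"},
    "strain_3": {"f1", "f2", "f3", "f5", "f6"},
}

core, unique = core_and_unique(toy)
print("core:", sorted(core))
for strain, fams in unique.items():
    print(strain, "unique:", sorted(fams))
```

Families shared by some but not all strains (f3 in the toy data) fall into neither group and correspond to the intermediate regions of a Venn diagram.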


Secondary metabolite gene clusters

Microcystin was reported to enhance colony formation in Microcystis spp. and plays a key role in the persistence of their colonies and the dominance of Microcystis [49]. As in the M. aeruginosa NIES843 and PCC7806 genomes, the microcystin synthetase gene cluster (mcyA-J) was highly conserved in M. panniformis FACHB1757, at coordinates 3,496,704 to 3,541,027. Additionally, the distinct thioesterase type II coding gene mcyT, which occurs in toxic strains, and 4-PPT transferase (4-PPTase) were both located far from the mcy gene cluster, at coordinates 869,702 to 869,286 and 915,377 to 916,039, respectively, which is similar to the distribution observed in NIES843. Notably, mcnA and mcnB were absent from the M. panniformis FACHB1757 chromosome. mcnA encodes polyketide biosynthesis proteins, and mcnB is the first open reading frame of Mamestra configurata nucleopolyhedrovirus B. Together with mcnC and mcnE, these four genes compose the cyanopeptolin synthesis gene cluster. mcnD was not found in the M. panniformis FACHB1757 genome; thus, the cyanopeptolin produced was non-halogenated and identical to that of NIES843 and PCC7806. Toxins may contribute to the adaptation of this strain to its specific ecological niche in eutrophic waters of tropical and subtropical zones. In addition, a putative polyketide synthase gene cluster, which may encode additional small polypeptides found in NIES843 (coordinates 2,508,556–2,513,289), was detected in M. panniformis FACHB1757 at coordinates 4,425,371 to 4,430,104. The change in location of the genes mentioned above reflects the extensive structural variation between M. panniformis FACHB1757 and NIES843.
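Coordinate pairs like those above can be compared with a few lines of arithmetic. The sketch below assumes 1-based inclusive coordinates and treats a start greater than the end as a feature on the reverse strand. Both are common conventions in genome annotation but are assumptions here, since the text does not state them; the class and labels are illustrative.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    name: str
    start: int
    end: int

    @property
    def strand(self) -> str:
        # Assumption: start > end means the feature lies on the reverse strand.
        return "-" if self.start > self.end else "+"

    @property
    def length(self) -> int:
        # Assumption: coordinates are 1-based and inclusive.
        return abs(self.end - self.start) + 1


# Coordinates as quoted in the text.
features = [
    Feature("mcy cluster", 3_496_704, 3_541_027),
    Feature("mcyT", 869_702, 869_286),
    Feature("4-PPTase", 915_377, 916_039),
    Feature("PKS cluster (FACHB1757)", 4_425_371, 4_430_104),
    Feature("PKS cluster (NIES843)", 2_508_556, 2_513_289),
]

for f in features:
    print(f"{f.name}: {f.length} bp, strand {f.strand}")
```

Under these assumptions the putative polyketide synthase cluster spans the same number of bases in both genomes, which fits the description of a conserved cluster that has changed position.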

Conserved gene clusters

Four functional clusters of conserved genes related to microcystin synthesis, colony formation, photo-regulation, and nutrient assimilation were also compared among the four strains. For microcystin synthesis, neither the mcy nor the mcn gene cluster was found in NIES2549. This is consistent with a previous study showing that NIES2549 is a nontoxic strain [50]. With regard to colony formation, M. aeruginosa, M. wesenbergii, and M. panniformis all form typical macroscopic colonies visible to the naked eye in Lake Taihu during summer and autumn water blooms. M. panniformis seems to form the largest colonies, which can exceed 30 mm. Polysaccharides and microcystin play important roles in colony formation. The maximum extracellular polysaccharide (EPS) content was found in M. wesenbergii and M. aeruginosa, whose colonies are not the largest and are only approximately 100 μm [51], but positive correlations between EPS and colony size in cultures were supported by previous studies [52-54]. mrpC and epsL were absent from all four strains, and only NIES843 contained cpsF, although tagH, capD, csaB, and rfbB were conserved in all four strains. Furthermore, mvn encodes a lectin in M. panniformis FACHB1757 and PCC7806, which specifically binds to a sugar moiety present on the surface of cells. Additionally, a binding partner of MVN was identified in the lipopolysaccharide fraction of PCC7806, which is involved in colony formation [55]. Together, the toxin-, EPS-, and lectin-related genes may explain why M. panniformis FACHB1757 usually aggregates and produces larger colonies in Lake Taihu during water blooms. In the photo-regulation cluster, the psb, apc, and gvp genes were all detected, with the exception of gvpC. It is interesting that gvpC is absent from M. panniformis FACHB1757, because this gene encodes GvpC, a highly conserved, expressed protein in some genera that is closely associated with gas vesicles [56-58]. Genes related to nutrient assimilation include the ntc, pst, and sph clusters. Among the four strains, ntcB, pstA, pstB1, pstB2, and pstC were absent only from PCC7806, which may be accounted for by the incompleteness of that strain’s genome. Detailed information about the function and coordinates of each gene is given in Additional file 1: Table S3.

Genome structure and constitution comparison

The genomes of M. aeruginosa NIES843 and NIES2549 have no plasmids, whereas a 38 Kb plasmid with a 43.97 % GC content was detected in M. panniformis FACHB1757 in this study. The stable presence of plasmids may play an important role in some Microcystis strains obtaining competitive advantages [59-61]. NIES-843 was the first strain of the genus Microcystis to have its complete genome sequenced, using the ABI 3730xl sequencer. The second completed genome (that of NIES-2549) was released on April 29, 2015. Thus, the whole genome was compared at the nucleic acid level between M. panniformis FACHB1757 and NIES-843. Mauve, which was designed for identification and alignment of conserved genomic sequences with rearrangements and horizontal transfer, was used to conduct the comparative genomic sequence analysis [62]. As shown in Additional file 1: Figure S1, M. panniformis FACHB1757 underwent extensive chromosome structure rearrangement, which indicates that Microcystis genomes are highly plastic [36].

Self-defense system

Restriction modification system

Comparison with REBASE [63], a database of restriction enzymes, DNA methyltransferases, and related proteins involved in the biological process of restriction–modification (R–M), revealed 277 restriction enzymes. Detailed classification revealed that 12 and 130 enzymes belonged to type I and type II systems, respectively; type II enzymes alone represented 46.93 % of all enzymes (130 of 277). Both are categories of rapidly evolving genes [64]. Sixty-three, 10, and 2 enzymes belonged to type IIG, type III, and type IV systems, respectively, and one control protein and 58 unknown enzymes were also found.

Methylation modification analysis

It is widely thought that methylation modification is associated with R-M systems and participates in self-defense against invading foreign DNA. Genome methylation modifications and methyltransferase recognition sequence motifs were analyzed using SMRT (version 2.3.0). In the chromosome, 3,204 m4C (N4-methylcytosine), 9,758 m6A (N6-methyladenine), and 31,845 other modified bases were detected (details are available in Additional file 3). The corresponding motif information is given in Table 5.
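
Recognition motifs such as those listed in Table 5 are written with IUPAC degenerate codes (for example GCWGC, where W is A or T). The sketch below shows one way to locate such a motif and its modified base in a sequence. It is an illustration only, not the SMRT analysis procedure: it scans a single strand, the toy sequence is hypothetical, and the "Modified position" column is assumed to be 1-based within the motif.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif: str) -> str:
    """Translate a degenerate motif into a regular expression."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(seq: str, motif: str) -> list:
    """0-based start positions of all matches, including overlapping ones."""
    # A zero-width lookahead lets overlapping occurrences be reported.
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
    return [m.start() for m in pattern.finditer(seq.upper())]


def modified_positions(seq: str, motif: str, modified_position: int) -> list:
    """0-based coordinates of the modified base (1-based offset in the motif)."""
    return [start + modified_position - 1 for start in find_motif(seq, motif)]


toy_seq = "AAGCAGCTTGCTGC"  # hypothetical sequence
print(motif_to_regex("GCWGC"))
print(find_motif(toy_seq, "GCWGC"))
print(modified_positions(toy_seq, "GCWGC", 2))
```

A full count over a genome would also need to scan the reverse complement, which this sketch leaves out for brevity.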
Table 5

Sequence structure and general information of motifs in the whole genome

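Motif | Modified position | Type | Motifs detected | # of motifs detected | # of motifs in genome | Mean QV | Mean motif coverage | Partner motif
GATC | 2 | m6A | 78.55 % | 37,874 | 48,218 | 45.44 | 22.91 | GATC
GAATTC | 3 | m6A | 74.54 % | 1,938 | 2,600 | 44.25 | 22.53 | GAATTC
GCTGDAG | 6 | m6A | 73.70 % | 995 | 1,350 | 43.72 | 22.95 | -
GGTGGA | 6 | m6A | 70.96 % | 1,935 | 2,727 | 43.22 | 22.81 | -
GACGNAC | 6 | m6A | 70.26 % | 723 | 1,029 | 42.58 | 23.11 | -
ACCACC | 4 | m6A | 69.67 % | 2,410 | 3,459 | 41.91 | 22.82 | -
CAAGNNNNNNTTTC | 3 | m6A | 69.02 % | 176 | 255 | 41.45 | 21.48 | -
GATATC | 2 | m6A | 67.42 % | 2,055 | 3,048 | 42.29 | 23.09 | GATATC
MCGRAG | 5 | m6A | 52.23 % | 3,390 | 6,491 | 41.64 | 22.35 | -
GCWGC | 2 | m4C | 24.17 % | 3,911 | 16,184 | 37.52 | 25.13 | GCWGC
RGATCY | 5 | m4C | 19.09 % | 808 | 4,232 | 36.99 | 25.80 | RGATCY
GGCC | 3 | m4C | 18.02 % | 3,721 | 20,654 | 37.67 | 26.39 | GGCC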
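
The "Motifs detected" percentage in Table 5 is the number of motif occurrences detected as modified divided by the number of occurrences in the genome. The sketch below recomputes that column from the two count columns as transcribed from the table; the variable and function names are illustrative assumptions.

```python
# (motif, detected as modified, occurrences in genome, percentage in Table 5)
TABLE5 = [
    ("GATC", 37874, 48218, 78.55),
    ("GAATTC", 1938, 2600, 74.54),
    ("GCTGDAG", 995, 1350, 73.70),
    ("GGTGGA", 1935, 2727, 70.96),
    ("GACGNAC", 723, 1029, 70.26),
    ("ACCACC", 2410, 3459, 69.67),
    ("CAAGNNNNNNTTTC", 176, 255, 69.02),
    ("GATATC", 2055, 3048, 67.42),
    ("MCGRAG", 3390, 6491, 52.23),
    ("GCWGC", 3911, 16184, 24.17),
    ("RGATCY", 808, 4232, 19.09),
    ("GGCC", 3721, 20654, 18.02),
]


def detected_fraction(detected: int, in_genome: int) -> float:
    """Percentage of motif occurrences detected as modified, two decimals."""
    return round(100.0 * detected / in_genome, 2)


for motif, detected, in_genome, reported in TABLE5:
    computed = detected_fraction(detected, in_genome)
    print(f"{motif:<16}{computed:6.2f} %  (table: {reported:.2f} %)")
```

The recomputed values agree with the tabulated percentages for all twelve motifs, with the m6A motifs detected far more completely than the m4C motifs.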

CRISPR system

MinCED, which is derived from CRT [65], was used to predict CRISPR structures. CRISPRs are widely found in prokaryotes and, together with CRISPR-associated proteins, are thought to form a putative immune system based on RNA interference [66]. Under strict parameters, three candidate CRISPR clusters were annotated on the chromosome and one on the plasmid (further information is available in Additional file 4).

Genomic islands

Genomic islands (GEIs) are particularly influential in microorganism genomes with regard to virulence, antibiotic resistance, metabolism, symbiosis, and other important adaptations [67]. GEIs have substantial roles in horizontal gene transfer, which is now widely acknowledged as an important force shaping bacterial genome structure. IslandViewer (version 2.0) [68] was used to predict the GEIs in M. panniformis FACHB1757. IslandViewer integrates SIGI-HMM, IslandPick, and IslandPath-DIMOB with built-in databases, including the Virulence Factor Database and the Antibiotic Resistance Gene Database. Thirty-six GEIs were found, and their positions are shown in Fig. 6. The functions identified are summarized in Table 6. Transposases were identified in most of the GEIs, consistent with their participation in horizontal gene transfer. Toxin-related gene clusters were annotated in six GEIs and probably affect competitiveness and fitness. Some functional genes, such as hat/hatR, were also detected, which indicates enhanced adaptability and metabolic versatility in this strain.
Fig. 6

GEI distribution in the chromosome of M. panniformis FACHB1757. From inside to outside, green bars show the IslandPick predictions, orange bars show the results annotated by SIGI-HMM, and blue bars show the IslandPath-DIMOB predictions. Red bars indicate the integrated GEI candidate positions. The black line plot around the small circle shows the GC content.
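
Fig. 6 overlays the predictions of three tools and marks integrated candidate positions. One simple way to combine per-tool predictions is to take the union of overlapping intervals, sketched below. This is an assumption made for illustration and is not necessarily how IslandViewer integrates its results; all coordinates are hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # (start, end) on the chromosome, inclusive


def merge_predictions(predictions: Dict[str, List[Interval]]) -> List[Interval]:
    """Union of all predicted intervals, merging those that overlap."""
    intervals = sorted(iv for ivs in predictions.values() for iv in ivs)
    merged: List[Interval] = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:
            # Overlaps the previous island: extend it if needed.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical coordinates, used only to show the behaviour.
toy_predictions = {
    "IslandPick": [(100, 500)],
    "SIGI-HMM": [(450, 900), (2000, 2600)],
    "IslandPath-DIMOB": [(2500, 3000), (5000, 5400)],
}
print(merge_predictions(toy_predictions))
```

In this toy input, five per-tool predictions collapse into three integrated candidates, because two pairs of predictions overlap.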

Table 6

Functions and types of all 36 GEIs in chromosome

Function | Advantage conferred | GEI type | Related GEIs
Alkaline phosphatase | Increased metabolic versatility | Metabolic | GEI2, GEI7, GEI10, GEI17, GEI19, GEI23, GEI26, GEI27
Toxin/Antitoxin protein | Competitiveness | Pathogenicity, resistance | GEI1, GEI6, GEI13, GEI31, GEI32, GEI33
Transferase | Increased metabolic versatility | Metabolic | GEI4, GEI9, GEI15, GEI21, GEI24, GEI25, GEI30, GEI36
Transposase | Increased metabolic versatility | Metabolic | GEI1, GEI3, GEI4, GEI11, GEI12, GEI15, GEI16, GEI18, GEI24, GEI29, GEI34
Hat/HatR | Increased metabolic versatility, increased adaptability | Fitness | GEI11, GEI28
Heat shock protein | Increased metabolic versatility, increased adaptability | Synthesis, fitness | GEI31
PsaE | Increased metabolic versatility | Metabolic, fitness | GEI9
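
Table 6 is organized by function, but it can also be read per island. The sketch below inverts the table into a lookup from GEI number to the functions listed for it. The GEI numbers are transcribed from Table 6; the names `TABLE6` and `functions_by_gei` are illustrative assumptions.

```python
from collections import defaultdict

# Function -> GEI numbers, transcribed from Table 6.
TABLE6 = {
    "Alkaline phosphatase": [2, 7, 10, 17, 19, 23, 26, 27],
    "Toxin/antitoxin protein": [1, 6, 13, 31, 32, 33],
    "Transferase": [4, 9, 15, 21, 24, 25, 30, 36],
    "Transposase": [1, 3, 4, 11, 12, 15, 16, 18, 24, 29, 34],
    "Hat/HatR": [11, 28],
    "Heat shock protein": [31],
    "PsaE": [9],
}


def functions_by_gei(table: dict) -> dict:
    """Invert the table: GEI number -> list of functions annotated in it."""
    lookup = defaultdict(list)
    for function, geis in table.items():
        for gei in geis:
            lookup[gei].append(function)
    return dict(sorted(lookup.items()))


lookup = functions_by_gei(TABLE6)
for gei, functions in lookup.items():
    print(f"GEI{gei}: {', '.join(functions)}")
print("distinct GEIs listed:", len(lookup))
```

With the rows as transcribed, the lookup contains 30 distinct GEI numbers, and several islands (for example GEI1, GEI4, and GEI11) carry more than one of the listed functions.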

Conclusions

This study presents the complete genome sequence of M. panniformis, a species newly recorded in China, and examines it from several genomic perspectives, including comparison with nine other water bloom-forming cyanobacterial species. A 5.6 Mbp chromosome and a 38 Kbp plasmid were reported, and gene functions, methylation modifications, CRISPRs, and GEIs throughout the genome were described. Large-scale structural variation was demonstrated by comparison with other Microcystis genomes. A Venn diagram of four strains showed variation in gene number and category as a result of evolutionary divergence and revealed that Microcystis has undergone flexible genome evolution.
  45 in total

1.  A phylogenetic definition of the major eubacterial taxa.

Authors:  C R Woese; E Stackebrandt; T J Macke; G E Fox
Journal:  Syst Appl Microbiol       Date:  1985       Impact factor: 4.022

Review 2.  Genomic islands in pathogenic and environmental microorganisms.

Authors:  Ulrich Dobrindt; Bianca Hochhut; Ute Hentschel; Jörg Hacker
Journal:  Nat Rev Microbiol       Date:  2004-05       Impact factor: 60.633

3.  The protein encoded by gvpC is a minor component of gas vesicles isolated from the cyanobacteria Anabaena flos-aquae and Microcystis sp.

Authors:  P K Hayes; C M Lazarus; A Bees; J E Walker; A E Walsby
Journal:  Mol Microbiol       Date:  1988-09       Impact factor: 3.501

Review 4.  Freshwater harmful algal blooms: toxins and children's health.

Authors:  Chelsea A Weirich; Todd R Miller
Journal:  Curr Probl Pediatr Adolesc Health Care       Date:  2014-01

5.  Changes in cyanoprokaryote populations, Microcystis morphology, and microcystin concentrations in Lake Elphinstone (Central Queensland, Australia).

Authors:  Susan H White; Larelle D Fabbro; Leo J Duivenvoorden
Journal:  Environ Toxicol       Date:  2003-12       Impact factor: 4.119

6.  Aerucyclamides A and B: isolation and synthesis of toxic ribosomal heterocyclic peptides from the cyanobacterium Microcystis aeruginosa PCC 7806.

Authors:  Cyril Portmann; Judith F Blom; Karl Gademann; Friedrich Jüttner
Journal:  J Nat Prod       Date:  2008-06-18       Impact factor: 4.050

7.  Evolution of multicellularity coincided with increased diversification of cyanobacteria and the Great Oxidation Event.

Authors:  Bettina E Schirrmeister; Jurriaan M de Vos; Alexandre Antonelli; Homayoun C Bagheri
Journal:  Proc Natl Acad Sci U S A       Date:  2013-01-14       Impact factor: 11.205

8.  The origin of multicellularity in cyanobacteria.

Authors:  Bettina E Schirrmeister; Alexandre Antonelli; Homayoun C Bagheri
Journal:  BMC Evol Biol       Date:  2011-02-14       Impact factor: 3.260

9.  Aggressive assembly of pyrosequencing reads with mates.

Authors:  Jason R Miller; Arthur L Delcher; Sergey Koren; Eli Venter; Brian P Walenz; Anushka Brownley; Justin Johnson; Kelvin Li; Clark Mobarry; Granger Sutton
Journal:  Bioinformatics       Date:  2008-10-24       Impact factor: 6.937

10.  CRISPR recognition tool (CRT): a tool for automatic detection of clustered regularly interspaced palindromic repeats.

Authors:  Charles Bland; Teresa L Ramsey; Fareedah Sabree; Micheal Lowe; Kyndall Brown; Nikos C Kyrpides; Philip Hugenholtz
Journal:  BMC Bioinformatics       Date:  2007-06-18       Impact factor: 3.169
