Lionel Frangeul, Philippe Quillardet, Anne-Marie Castets, Jean-François Humbert, Hans C P Matthijs, Diego Cortez, Andrew Tolonen, Cheng-Cai Zhang, Simonetta Gribaldo, Jan-Christoph Kehr, Yvonne Zilliges, Nadine Ziemert, Sven Becker, Emmanuel Talla, Amel Latifi, Alain Billault, Anthony Lepelletier, Elke Dittmann, Christiane Bouchier, Nicole Tandeau de Marsac.
Abstract
BACKGROUND: The colonial cyanobacterium Microcystis proliferates in a wide range of freshwater ecosystems and is exposed to changing environmental factors during its life cycle. Microcystis blooms are often toxic, potentially fatal to animals and humans, and may cause environmental problems. There has been little investigation of the genomics of these cyanobacteria.
Year: 2008 PMID: 18534010 PMCID: PMC2442094 DOI: 10.1186/1471-2164-9-274
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Phylogenetic maximum likelihood (ML) tree based on the concatenated 23S-16S rDNA sequences of diverse cyanobacterial lineages. The sequences were taken from public databases. Strain identifiers, and the methods used for the phylogenetic analysis, are described in the Methods section. The scale bar represents the average number of nucleotide substitutions per site. Genome sizes in megabases (Mb) are mentioned in parentheses. Trees were constructed using three methods (ML, Neighbor Joining and Maximum Parsimony). ML bootstrap values are indicated only if the bootstrap values obtained with the three methods are > 500 (1000 resamplings).
Comparison between two Microcystis genomes
| Strain (a) | Mic-PCC7806 | Mic-NIES843 |
| Genome length | 5.17 Mb (116 Contigs) | 5.84 Mb |
| rRNA loci | 2 | 2 |
| tRNA loci | 41 | 42 |
| Number of CDSs | 5292 | 6312 |
| Putative transposases (COG similarity) | 362 (6.8%) | 469 (7.4%) |
| Proteins linked by BDBH | 3322 (63%) | 3322 (53%) |
| Proteins absent in the other (b) | 838 (16%) | 1760 (28%) |
| Strain-specific proteins (c) | 644/838 (76%) | 1484/1760 (84%) |
| Large repeats (d) | 11.7% | 11.7% |
(a) See the Methods section for the strain identifiers.
(b) Proteins that do not share similarity (> 40%) with any proteins in the other Microcystis genome.
(c) Proteins with no similarity (> 40%) with any proteins in the 44 other cyanobacterial genomes.
(d) Proportion of large repeats (> 1000 bases; > 90% identity) in the genome (see Figure 2).
Mb: megabases; CDS: coding sequence; COG: cluster of orthologs; BDBH: bidirectional best hit.
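The bidirectional best hit (BDBH) criterion used in the table above can be illustrated with a minimal sketch. It assumes that the best Blastp hit of every protein has already been extracted in both directions; the dictionaries and protein names below are hypothetical toy data, not values from the study. Because each BDBH pair links exactly one protein in each genome, the count of linked proteins is the same in both columns (3322), while the percentages differ with the number of CDSs.

```python
def bidirectional_best_hits(best_a_to_b, best_b_to_a):
    """Return sorted pairs (a, b) where b is the best hit of a AND a is the best hit of b."""
    return sorted(
        (a, b)
        for a, b in best_a_to_b.items()
        if best_b_to_a.get(b) == a
    )


def percent_linked(n_linked, n_cds):
    """Share of a proteome that is linked by BDBH, rounded to a whole percent."""
    return round(100 * n_linked / n_cds)


# Hypothetical best-hit tables (protein names are invented for illustration).
a_to_b = {"a1": "b1", "a2": "b2", "a3": "b2"}
b_to_a = {"b1": "a1", "b2": "a3", "b3": "a2"}

pairs = bidirectional_best_hits(a_to_b, b_to_a)
print(pairs)  # [('a1', 'b1'), ('a3', 'b2')]

# Counts taken from the table: 3322 linked proteins, 5292 and 6312 CDSs.
print(percent_linked(3322, 5292), percent_linked(3322, 6312))  # 63 53
```

In the toy data, a2 points to b2 but b2 points back to a3, so only a3 and b2 form a reciprocal pair.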
Distribution of the best Blastp of the Mic-PCC7806 proteome against other cyanobacterial proteomes
| No Significant HSP | 14.4% |
| Cth-ATCC51142 | 15.5% |
| Cth-CCY0110 | 13.4% |
| Cwa-WH8501 | 9.9% |
| Mch-PCC7420 | 9% |
| Npu-PCC73102 | 5.8% |
| Syn-PCC6803 | 5.1% |
| Ana-PCC7120 | 4.3% |
| Nsp-CCY9414 | 4.2% |
| Lae-PCC8106 | 4% |
| Ava-ATCC29413 | 3.9% |
| Ama-MBIC11017 | 2.5% |
| Ter-IMS101 | 1.95% |
| Syn-PCC7002 | 1.6% |
| Gvi-PCC7421 | 1.15% |
| Syn-PCC7335 | 1% |
| Syn-PCC7942 | 0.5% |
| Tel-BP1 | 0.4% |
| Syn-WH5701 | 0.2% |
| Syn-JA-2-3B'a | 0.1% |
| Syn-PCC6301 | 0.1% |
| Other genomes | 0% |
See the Methods section for the strain identifiers. HSP: High-Scoring Segment Pair.
Figure 2. Percentage of DNA repeated sequences in the total genome length. This analysis was performed on complete and in-finishing (*) cyanobacterial genomes. The strain identifiers are listed in the Methods section. Only DNA repeats containing more than 1000 bases, and with an identity threshold >90%, are taken into account.
Figure 3. Comparison of the syntenic scores of cyanobacterial genomes (filled square) and other bacterial genomes (empty diamond) according to the maximum likelihood distances of their 23S-16S sequences calculated by Phyml (see Methods). The pairs of cyanobacterial genomes used in this study are listed in the Methods section.
Conserved gene clusters in the genomes of Mic-PCC7806, Cwa-WH8501 and Syn-PCC6803
| Gene name | ||||||||
| Mic-PCC7806 contig328 | ||||||||
| Cwa-WH8501 contig3 | ||||||||
| Syn-PCC6803 | ||||||||
| Gene annotation | periplasmic phosphate-binding protein of ABC transporter | phosphate-binding periplasmic protein precursor | phosphate transport system permease protein | phosphate transport system permease protein | phosphate transport ATP-binding protein | phosphate transport ATP-binding protein | ||
| Gene name | ||||||||
| Mic-PCC7806 contig303 | ||||||||
| Cwa-WH8501 contig2 | ||||||||
| Syn-PCC6803 | ||||||||
| Gene annotation | carbon dioxide concentrating mechanism protein | carbon dioxide concentrating mechanism protein | putative carboxysome assembly protein | putative carboxysome structural protein | putative carboxysome assembly protein | |||
| Gene name | ||||||||
| Mic-PCC7806 contig303 | ||||||||
| Cwa-WH8501 contig2 | ||||||||
| Syn-PCC6803 | ||||||||
| Gene annotation | COG1847 Predicted RNA-binding protein | COG0706 Preprotein translocase subunit | No similarity; highly conserved in cyanobacteria | protein subunit of ribonuclease P | ||||
| Gene name/Alternate gene name | ||||||||
| Mic-PCC7806 contig290 | ||||||||
| Cwa-WH8501 contig1 | ||||||||
| Syn-PCC6803 | ||||||||
| Gene annotation | ATP synthase gamma chain | ATP synthase alpha chain | ATP synthase delta chain of CF(1) | ATP synthase B chain (subunit I) of CF(0) | ATP synthase of B' chain (subunit b') of CF(0) | ATP synthase C chain of CF(0) | ATP synthase A chain | ATP synthase protein I |
For each of the three genomes, the gene identifiers are indicated in italics. See the Methods section for the strain identifiers.
Figure 4. Distribution of the intergenic distances in diverse cyanobacterial genomes. The distances are based on the public syntactic annotation of each genome. Strain identifiers are listed in the Methods section.
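As an illustration of how intergenic distances can be derived from an annotation, the sketch below computes the gaps between consecutive genes on one sequence. The coordinate convention (1-based, inclusive) and the choice to skip overlapping or abutting genes are assumptions made for this example; the source does not state how such cases were handled, and the gene coordinates are invented.

```python
def intergenic_distances(genes):
    """Gaps (in bases) between consecutive genes on a single sequence.

    genes: iterable of (start, end) tuples, 1-based inclusive coordinates.
    Assumption: overlapping or abutting genes contribute no distance.
    """
    distances = []
    prev_end = None
    for start, end in sorted(genes):
        if prev_end is not None:
            gap = start - prev_end - 1
            if gap > 0:
                distances.append(gap)
            # Keep the furthest end seen so far, so nested genes are handled.
            prev_end = max(prev_end, end)
        else:
            prev_end = end
    return distances


# Hypothetical gene coordinates; the second and third genes overlap.
example_genes = [(100, 400), (450, 900), (880, 1200), (1500, 1800)]
print(intergenic_distances(example_genes))  # [49, 299]
```

A distribution such as the one in Figure 4 would then be obtained by binning these values per genome.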
Analysis of the presence of atypical genes in several cyanobacterial genomes
| Mic-PCC7806 | 5292 | 1971 | 1790 (34%) | ||
| Mic-NIES843 | 6364 | 126 | 66% | ||
| Cwa-WH8501 | 5967 | 1004 (17%) | 61 | 523 (9%) | 60% |
| Mch-PCC7420 | 2008 (27%) | 150 | 1403 (19%) | 68% | |
| Syn-PCC6803 | 3314 | 494 (15%) | 32 | 243 (7%) | 75% |
| Npu-PCC73102 | 6182 | 1259 (20%) | 65 | 596 (10%) | 64% |
| Lae-PCC8106 | 6142 | 1549 (25%) | 102 | 1084 (18%) | 67% |
| Ana-PCC7120 | 5430 | 1254 (23%) | 48 | 390 (7%) | 69% |
(a) See the Methods section for the strain identifiers. Higher scores are shown in bold. AG: Atypical gene; CAG: Cluster of atypical genes.
Distribution of rare 6-mers in cyanobacterial genomes
| Ratio (a) Obs/Shuf | Number of 6-mers | | | |
| | Mic-PCC7806 | Mic-NIES843 | Cwa-WH8501 | Syn-PCC6803 |
| < 0.02 | ||||
| < 0.04 | 13 (8) | 11 (5) | 9 (3) | 7 (3) |
| < 0.06 | 10 (4) | 8 (3) | 6 (3) | 5 (2) |
| < 0.08 | 2 (1) | 2 (1) | 3 (2) | 6 (1) |
| < 0.1 | 2 (1) | 3 (0) | 13 (6) | 14 (1) |
(a) Ratio between the frequency observed (Obs) for a given 6-mer and the frequency of the same 6-mer after shuffling (Shuf) of the genome sequence.
Figures in bold represent the number of 6-mers present 50× more in the shuffled sequence than in the original sequence. Figures in parentheses indicate the number of palindromic sites.
See the Methods section for the strain identifiers.
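The Obs/Shuf ratio defined in note (a) can be sketched as follows. This is a minimal illustration, not the procedure used in the study: it assumes a single-strand count of overlapping 6-mers and a simple shuffle of individual nucleotides, and the function names are hypothetical. Since the original and shuffled sequences have the same length, the ratio of counts equals the ratio of frequencies. A site is treated as palindromic when it equals its own reverse complement.

```python
import random
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def count_kmers(seq, k=6):
    """Count every overlapping k-mer on the given strand."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def is_palindromic(kmer):
    """True if the k-mer equals its reverse complement (e.g. GAATTC)."""
    return kmer == kmer.translate(COMPLEMENT)[::-1]


def obs_shuf_ratios(seq, k=6, seed=0):
    """Obs/Shuf ratio for each k-mer present in the shuffled sequence.

    Assumption: the shuffle permutes single nucleotides, which preserves
    base composition only.
    """
    letters = list(seq)
    random.Random(seed).shuffle(letters)
    observed = count_kmers(seq, k)
    shuffled = count_kmers("".join(letters), k)
    # Iterating over shuffled counts guarantees a non-zero denominator.
    return {kmer: observed.get(kmer, 0) / n for kmer, n in shuffled.items()}


def rare_kmers(ratios, threshold):
    """k-mers whose Obs/Shuf ratio is below the threshold, with a palindrome flag."""
    return {kmer: is_palindromic(kmer) for kmer, r in ratios.items() if r < threshold}
```

For a genome string `genome`, `rare_kmers(obs_shuf_ratios(genome), 0.02)` would list the 6-mers that are at least 50× more frequent in the shuffled sequence than in the original, together with whether each one is palindromic.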
Gene clusters involved in the biosynthesis of secondary metabolites
| Strain (a) | Size (Mb) | % SM (b) | PKS: modular type I | PKS: iterative type I/glycolipid synthase (c) | PKS: enediyne type | PKS III | NRPS/PKS | NRPS | Patellamide-like |
| Mic-PCC7806 | 5.2* | 3.5 | 1 | 0 | 1 | 1 (d) | 2 | 1 | 1 |
| Mic-NIES843 | 5.8 | 2.6 | 1 | 0 | 1 | 0 | 2 | 1 | 0 |
| Cwa-WH8501 | 6.2* | 1.6 | 0 | 0 | 0 | 0 | 1 | 6 (e) | 0 |
| Npu-PCC73102 | 8.2 | 4.5 | 2 | 2 | 0 | 0 | 6 | 1 | 0 |
| Syn-PCC6803 | 3.6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
(a) See the Methods section for the strain identifiers.
(b) %SM: percentage of the genome dedicated to secondary metabolites of the non-ribosomal peptide, polyketide or patellamide type family.
(c) Iterative PKS I not including enediyne type.
(d) PKS III is associated with modular PKS I.
(e) NRPS clusters in Cwa-WH8501 are all of a small size with an average of 1–2 genes per cluster.
* in-finishing genome; Mb: megabases; PKS: polyketide synthase; NRPS: non-ribosomal peptide synthetase.
Figure 5. Schematic representation of secondary metabolite gene clusters in Mic-PCC7806. (A) Gene clusters encoding non-ribosomal peptide synthetases (NRPS) and polyketide synthases (PKS). The names assigned to individual genes in Mic-PCC7806, or to genes that were characterized in other cyanobacterial strains, are indicated above the arrows. Products assigned to the respective pathways are shown on the right. (B) Gene cluster encoding enzymes potentially involved in a patellamide-like pathway. Names of patellamide biosynthesis genes are indicated above the arrows. Gene identifiers in the Mic-PCC7806 genome are indicated below the arrows.