Karina Stucken, Uwe John, Allan Cembella, Alejandro A Murillo, Katia Soto-Liebe, Juan J Fuentes-Valdés, Maik Friedel, Alvaro M Plominsky, Mónica Vásquez, Gernot Glöckner.
Abstract
Cyanobacterial morphology is diverse, ranging from unicellular spheres or rods to multicellular structures such as colonies and filaments. Multicellular species represent an evolutionary strategy to differentiate and compartmentalize certain metabolic functions for reproduction and nitrogen (N2) fixation into specialized cell types (e.g. akinetes, heterocysts and diazocytes). Only a few filamentous, differentiated cyanobacterial species, with genome sizes over 5 Mb, have been sequenced. We sequenced the genomes of two strains of closely related filamentous cyanobacterial species to yield further insights into the molecular basis of the traits of N2 fixation, filament formation and cell differentiation. Cylindrospermopsis raciborskii CS-505 is a cylindrospermopsin-producing strain from Australia, whereas Raphidiopsis brookii D9 from Brazil synthesizes neurotoxins associated with paralytic shellfish poisoning (PSP). Despite their different morphology, toxin composition and disjunct geographical distribution, these strains form a monophyletic group. With genome sizes of approximately 3.9 (CS-505) and 3.2 (D9) Mb, these are the smallest genomes described for free-living filamentous cyanobacteria. We observed remarkable gene order conservation (synteny) between these genomes despite the difference in repetitive element content, which accounts for most of the genome size difference between them. We show here that the strains share a specific set of 2539 genes with >90% average nucleotide identity. The fact that the CS-505 and D9 genomes are small and streamlined compared to those of other filamentous cyanobacterial species and the lack of the ability for heterocyst formation in strain D9 allowed us to define a core set of genes responsible for each trait in filamentous species. We presume that in strain D9 the ability to form proper heterocysts was secondarily lost together with N2 fixation capacity. Further comparisons to all available cyanobacterial genomes covering almost the entire evolutionary branch revealed a common minimal gene set for each of these cyanobacterial traits.
Year: 2010 PMID: 20169071 PMCID: PMC2821919 DOI: 10.1371/journal.pone.0009235
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Overview of the main gene clusters involved in nitrogen metabolism and heterocyst development in strains CS-505 and D9.
Transmission electron micrographs in the left panels show the heterocyst of CS-505 and the apically differentiated cell of D9. Optical micrographs on the right panels exhibit the Alcian blue staining characteristic of polysaccharides in the heterocyst.
Sequencing and assembly statistics for the two strains.
| | D9 | CS-505 |
| 454 GS sequence coverage | 27 | 34 |
| Small insert library | 188 | 3909 |
| Fosmid library | 491 | -- |
| Finishing reads | 253 | 161 |
| Total sequencing depth | 27 | 35 |
| Contigs | 157 | 268 |
| Assembled (Mb) | 3.20 | 3.89 |
| Contigs >3.5 kb | 33 | 94 |
| Largest contig (kb) | 543 | 259 |
| Repeats (regions) | 53 | 406 |
| Repeats (bases) | 53,870 | 244,280 |
| Repeats (% total) | 1.7 | 6.3 |
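The "Repeats (% total)" row follows from the two rows above it and the assembled size. As a minimal sketch, assuming that Mb here means 10^6 bases and using only the values printed in the table, the arithmetic can be reproduced as follows (the variable and function names are illustrative, not taken from the paper):

```python
# Values copied from the sequencing and assembly table (D9 and CS-505 columns).
assembled_mb = {"D9": 3.20, "CS-505": 3.89}
repeat_bases = {"D9": 53_870, "CS-505": 244_280}


def repeat_percent(strain: str) -> float:
    """Repeat bases as a percentage of the assembled genome size.

    Assumption: 1 Mb is taken as 1,000,000 bases.
    """
    return 100.0 * repeat_bases[strain] / (assembled_mb[strain] * 1_000_000)


for strain in assembled_mb:
    print(f"{strain}: {repeat_percent(strain):.1f}% of the assembly is repetitive")
```

Rounded to one decimal place, both results agree with the percentages given in the table.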
General features of the genomes of strains CS-505 and D9 in comparison with four other fully sequenced genomes of filamentous cyanobacteria.
| | D9 | CS-505 | Avar | Anab | Tery | Npun |
| Genome size (Mb) | 3.20 | 3.89 | 6.34 | 6.41 | 7.75 | 8.23 |
| G+C content (%) | 40 | 40.2 | 41 | 41 | 40.8 | 41 |
| Genes | 3,088 | 3,968 | 5,134 | 5,432 | 5,542 | 6,501 |
| Total CDS | 3,010 | 3,452 | 5,039 | 5,368 | 4,452 | 6,087 |
| Function assigned | 1,979 | 1,922 | 3,799 | 3,892 | 2,729 | 0 |
| Unclassified | 1,031 | 1,530 | 1,244 | 1,474 | 2,347 | 6,087 |
| rRNA genes | 9 | 9 | 12 | 12 | 5 | 12 |
| tRNA genes | 42 | 42 | 47 | 48 | 38 | 98 |
| Transposases | 9 | 77 | 57 | 145 | 260 | 112 |
| Phage integrases | - | 2 | 10 | - | 3 | 22 |
| Repeated regions | 53 | 406 | | | | |
| Plasmids | ? | ? | 3 | 6 | - | 5 |
| Unique CDS | 394 | 794 | | | | |
| Function assigned | 157 | 291 | | | | |
| Unclassified | 237 | 503 | | | | |
Abbreviations: Avar: Anabaena variabilis ATCC 29413; Anab: Anabaena sp. PCC 7120; Tery: Trichodesmium erythraeum IMS101; Npun: Nostoc punctiforme PCC 73102.
*Function assigned according to COGs.
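For the two newly sequenced strains, the unique-CDS rows can be related to the total CDS counts with simple arithmetic. The sketch below is only an illustration based on the numbers in the table above. It checks that the "Function assigned" and "Unclassified" sub-rows add up to the "Unique CDS" row and expresses unique CDS as a share of all CDS. The dictionary and function names are hypothetical.

```python
# Values copied from the general-features table (D9 and CS-505 columns only).
total_cds = {"D9": 3010, "CS-505": 3452}
unique_cds = {"D9": 394, "CS-505": 794}
unique_assigned = {"D9": 157, "CS-505": 291}
unique_unclassified = {"D9": 237, "CS-505": 503}


def unique_breakdown_is_consistent(strain: str) -> bool:
    """True if assigned + unclassified unique CDS equals the unique CDS total."""
    return unique_assigned[strain] + unique_unclassified[strain] == unique_cds[strain]


def unique_share_percent(strain: str) -> float:
    """Unique CDS as a percentage of the strain's total CDS."""
    return 100.0 * unique_cds[strain] / total_cds[strain]


for strain in total_cds:
    print(
        f"{strain}: breakdown consistent = {unique_breakdown_is_consistent(strain)}, "
        f"unique share = {unique_share_percent(strain):.1f}%"
    )
```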
Figure 2. Distribution of the unique CDS of CS-505 and D9 into Clusters of Orthologous Groups (COGs).
Only COG categories overrepresented by CDS of CS-505 or D9 are shown (see text for more details). Unique CDS were obtained by a Best-Bidirectional Hits (BBHs) search between both genomes using a 30% cutoff.
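The Best-Bidirectional Hits idea mentioned in the caption can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline. It assumes that the 30% cutoff is a minimum percent identity and that hits are ranked by identity, and the caption confirms neither. The hit tuples and the identifiers `A1`, `B1`, and so on are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

# One alignment hit: (query CDS, subject CDS, percent identity).
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit], min_identity: float = 30.0) -> Dict[str, str]:
    """Return the best subject per query, ignoring hits below the cutoff.

    Assumption: "best" means highest percent identity.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, identity in hits:
        if identity < min_identity:
            continue
        if query not in best or identity > best[query][1]:
            best[query] = (subject, identity)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(
    a_vs_b: List[Hit], b_vs_a: List[Hit], min_identity: float = 30.0
) -> Dict[str, str]:
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    best_ab = best_hits(a_vs_b, min_identity)
    best_ba = best_hits(b_vs_a, min_identity)
    return {a: b for a, b in best_ab.items() if best_ba.get(b) == a}


def unique_cds(all_cds: Set[str], paired: Iterable[str]) -> Set[str]:
    """CDS of one genome that have no bidirectional best hit in the other."""
    return all_cds - set(paired)


# Toy example with hypothetical identifiers.
a_vs_b = [("A1", "B1", 85.0), ("A1", "B2", 40.0), ("A2", "B2", 25.0), ("A3", "B1", 60.0)]
b_vs_a = [("B1", "A1", 85.0), ("B2", "A1", 40.0)]
pairs = bidirectional_best_hits(a_vs_b, b_vs_a)
unique_a = unique_cds({"A1", "A2", "A3"}, pairs.keys())
unique_b = unique_cds({"B1", "B2"}, pairs.values())
print(pairs, sorted(unique_a), sorted(unique_b))
```

In the toy data, `A3` finds `B1` as its best hit, but `B1` prefers `A1`, so `A3` is not part of a bidirectional pair and counts as unique.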
Figure 3. Schematic representation of the synteny within the vicinity of the nif gene clusters.
The scheme represents the 15 kb gene cluster containing nifHDK and the other 13 nitrogen fixation-related genes in CS-505, compared with the nif1 and nif2 gene clusters of Anabaena variabilis ATCC 29413 and the syntenic regions between CS-505 and D9. The syntenic regions between CS-505 and D9 are delimited by the arrows. nif genes are represented by light grey and dashed lines. Genes in black correspond to hypothetical proteins and grey genes to proteins with assigned function.
Figure 4. Schematic representation of the syntenic regions within the toxin gene clusters in CS-505 and D9.
A. Location of the CYN gene cluster of CS-505 compared with the syntenic genomic region in D9. B. Gel electrophoresis of the PCR products from the hypF/hupC amplification in R. brookii D9 and in the non-toxic C. raciborskii strains CS-507, CS-508, CS-509 and CS-510. The CYN producers CS-505, CS-506 and CS-511 show no amplification of the hypF/hupC region. C. Location of the STX gene cluster of D9 compared with the syntenic genomic region in CS-505. Genes participating in syntenic regions are depicted in blue and highlighted in the green boxes within the arrows; genes outside the syntenic regions are depicted in white. tRNAs and transposases are shown in red. The grey arrows show the position of the primer pairs HYPa/HUPa and HYPb/HUPb used to amplify the region between the hypF and hupC genes in different strains of C. raciborskii and in R. brookii D9, respectively. Ladder: GeneRuler 1 kb DNA ladder (Fermentas, Ontario, Canada). The strains of C. raciborskii were obtained from the culture collection of the Commonwealth Scientific and Industrial Research Organization (CSIRO), Australia. For more details on DNA isolation, primer synthesis and PCR conditions see Methods S1.
Common genes for the different traits.
| Trait | Hits between species | CS-505 | D9 | Core set present in wider spectrum of species |
| Filament formation | 32 | 23 | 20 | 10 |
| Diazotrophy | 49 | 38 | 6 | 10 |
| Heterocyst development | 149 | 58 | 54 | 41 |
*Paralogs are not removed.
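One way to read the "core set" column is as the genes that remain once presence is required in every genome showing the trait. The sketch below illustrates that reading with plain set operations. It is an assumption about the logic, not the authors' documented procedure. In particular, excluding genes found in genomes that lack the trait is a hypothetical choice, and the genome names and ortholog group IDs are made up.

```python
from typing import Dict, Set


def trait_core_set(presence: Dict[str, Set[str]], has_trait: Dict[str, bool]) -> Set[str]:
    """Ortholog groups present in every trait-positive genome and in no trait-negative one.

    presence:  genome name -> set of ortholog group IDs found in that genome
    has_trait: genome name -> whether the genome shows the trait
    """
    with_trait = [groups for genome, groups in presence.items() if has_trait[genome]]
    without_trait = [groups for genome, groups in presence.items() if not has_trait[genome]]
    if not with_trait:
        return set()
    shared = set.intersection(*with_trait)   # present in all genomes with the trait
    excluded = set().union(*without_trait)   # present in any genome without it
    return shared - excluded


# Toy example with hypothetical genomes and ortholog groups.
presence = {
    "fil_A": {"g1", "g2", "g3"},
    "fil_B": {"g1", "g2", "g4"},
    "uni_C": {"g2", "g4"},
}
has_trait = {"fil_A": True, "fil_B": True, "uni_C": False}
core = trait_core_set(presence, has_trait)
print(sorted(core))
```

Adding more trait-positive genomes to `presence` can only shrink the intersection, which is consistent with the core set column being smaller than the per-strain counts.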
Genes present only in filamentous species.
| Npun | Core set | Gene product description | Anab | D9 | CS-505 |
| 186680616 | | hypothetical protein | all1770 | CRD_00231 | CRC_00822 |
| 186680621 | core set | hypothetical protein | all1765 | CRD_00230 | CRC_00821 |
| 186681198 | core set | hypothetical protein | alr0202 | CRD_00387 | CRC_01215 |
| 186681299 | | hypothetical protein | all1340 | no | no |
| 186681300 | | hypothetical protein | all1339 | no | no |
| 186681350 | | PpiC-type peptidyl-prolyl cis-trans isomerase | alr1613 | no | CRC_02567 |
| 186681409 | | HEAT repeat-containing PBS lyase | alr2986 | CRD_00077 | CRC_02169 |
| 186681476 | | peptidoglycan binding domain-containing protein | alr4984 | CRD_02468 | CRC_02058 |
| 186681631 | | nuclease | all2918 | CRD_01392 | CRC_01535 |
| 186681697 | core set | hypothetical protein | alr2393 | CRD_02002 | CRC_01280 |
| 186681814 | | hypothetical protein | all3643 | CRD_01982 | CRC_01258 |
| 186681958 | core set | hypothetical protein | all1729 | CRD_02583 | CRC_00038 |
| 186682138 | core set | peptidase S48, HetR | alr2339 | CRD_01519 | CRC_03184 |
| 186682240 | core set | PatU3 | alr0101 | CRD_02293 | CRC_02800 |
| 186682241 | core set | HetZ | alr0099 | CRD_02292 | CRC_02801 |
| 186682787 | | hypothetical protein | alr1555 | no | no |
| 186682808 | | peptidoglycan binding domain-containing protein | all1861 | no | no |
| 186683172 | | hypothetical protein | all5122 | CRD_01021 | CRC_02539 |
| 186683174 | | hypothetical protein | all2077 | no | no |
| 186683213 | | hypothetical protein | all1154 | CRD_00512 | CRC_00964 |
| 186683474 | | GDSL family lipase | all0976 | no | no |
| 186683904 | | hypothetical protein | all0215 | CRD_00210 | CRC_03261 |
| 186683953 | | GDSL family lipase | all1288 | no | no |
| 186684054 | | hypothetical protein | asr1049 | no | no |
| 186684093 | core set | hypothetical protein | all2344 | CRD_00085 | CRC_00676 |
| 186684579 | core set | hypothetical protein | all2320 | CRD_01527 | CRC_01389 |
| 186684586 | | hypothetical protein | all5091 | CRD_02655 | CRC_00188 |
| 186685511 | core set | hypothetical protein | alr4863 | CRD_02120 | CRC_01594 |
| 186685539 | | NUDIX hydrolase | alr2015 | CRD_01916 | CRC_01834 |
| 186685973 | | hypothetical protein | all1007 | no | CRC_00879 |
| 186685974 | | hypothetical protein | all1006 | no | CRC_00878 |
| 186686413 | | hypothetical protein | all4622 | no | no |
Genes present in N2-fixing species.
| Npun | Gene | Core set | Gene product | Anab | D9 | CS-505 | Absent in |
| 186680715 | | | glycerophosphoryl diester phosphodiesterase | all1051 | CRD_01538 | CRC_01381 | SynJA3, SynJA2, Cya7425 |
| 186680864 | | core set | 4Fe-4S ferredoxin iron-sulfur binding domain-containing protein | all2512 | no | CRC_01763 | |
| 186680869 | | | hypothetical protein | alr2517 | no | CRC_03082 | SynJA3, SynJA2, Cya8801 |
| 186680870 | | | cupin 2 domain-containing protein | alr2518 | no | CRC_03081 | SynJA3, SynJA2, Mcht |
| 186680871 | | | nitrogenase-associated protein | alr2520 | no | CRC_03080 | Mcht |
| 186680875 | | | hypothetical protein | asr2523 | no | CRC_02152 | SynJA3, SynJA2, Cya7425 |
| 186680876 | | | hypothetical protein | alr2524 | no | CRC_02151 | SynJA3, SynJA2, Cya7425, Mcht |
| 186680892 | | | NHL repeat-containing protein | alr0693 | no | no | SynJA3, SynJA2, Cya7425, Mcht |
| 186680893 | | | Rieske (2Fe-2S) domain-containing protein | alr0692 | no | no | |
| 186680895 | | | hypothetical protein | asr0689 | no | CRC_01692 | SynJA3, SynJA2, Cya7425, Mcht |
| 186680897 | | | Ni Fe-hydrogenase small subunit, HupS | all0688 | no | CRC_02736 | SynJA3, SynJA2, Cya7425, Mcht |
| 186680898 | | | Ni Fe-hydrogenase large subunit, HupL | all0687 | no | CRC_02737 | SynJA3, SynJA2 |
| 186680903 | | | hydrogenase maturation protease | alr1423 | no | CRC_01049 | SynJA3, SynJA2, Cya7425 |
| 186680908 | | | FeoA family protein | asl1429 | no | CRC_02875 | Tery, Lyng, Nspu |
| 186680909 | | | ferredoxin (2Fe-2S) | all1430 | no | CRC_02876 | Mcht |
| 186680910 | | | iron-sulfur cluster assembly accessory protein | all1431 | no | CRC_02877 | SynJA3, SynJA2 |
| 186680911 | | | UBA/THIF-type NAD/FAD binding protein | all1432 | no | CRC_02878 | SynJA3, SynJA2 |
| 186680912 | | | nitrogen fixation protein | all1433 | no | CRC_02879 | Mcht |
| 186680913 | | | protein of unknown function DUF683 | asl1434 | no | CRC_02880 | Tery |
| 186680914 | | | protein of unknown function DUF269 | all1435 | no | CRC_02881 | Mcht |
| 186680915 | | | nitrogen fixation protein | all1436 | no | CRC_02882 | Mcht |
| 186680916 | | core set | nitrogenase molybdenum-iron cofactor biosynthesis protein NifN | all1437 | no | CRC_02883 | |
| 186680917 | | core set | nitrogenase MoFe cofactor biosynthesis protein | all1438 | no | CRC_02884 | |
| 186680918 | | | Mo-dependent nitrogenase family protein | all1439 | no | no | Tery |
| 186680919 | | core set | nitrogenase molybdenum-iron protein beta chain | all1440 | no | CRC_02885 | |
| 186680550 | | core set | nitrogenase molybdenum-iron protein alpha chain | all1454 | no | CRC_02886 | |
| 186680941 | | core set | nitrogenase iron protein NifH | all1455 | no | CRC_02887 | |
| 186680943 | | core set | Fe-S cluster assembly protein NifU | all1456 | no | CRC_02888 | |
| 186680944 | | core set | Nitrogenase metalloclusters biosynthesis protein NifS | all1457 | no | CRC_02889 | |
| 186680946 | | core set | nitrogenase cofactor biosynthesis protein | all1517 | no | CRC_02891 | |
| 186680953 | | | serine acetyltransferase | alr1404 | no | no | SynJA3, SynJA2 |
| 186680954 | | | hypothetical protein | asr1405 | no | no | |
| 186680955 | | core set | hypothetical protein | asr1406 | no | CRC_02071 | |
| 186680956 | | | homocitrate synthase | alr1407 | no | CRC_02070 | Tery, Mcht |
| 186680957 | | | NifZ family protein | asr1408 | no | CRC_02069 | Mcht |
| 186680958 | | | NifT/FixU family protein | asr1409 | no | CRC_02068 | Mcht |
| 186680959 | | | hypothetical protein | alr1410 | no | CRC_02067 | Tery |
| 186682206 | | | hypothetical protein | all0969 | no | no | SynJA3, SynJA2 |
| 186682693 | | | ribokinase-like domain-containing protein | alr4681 | CRD_01205 | CRC_01938 | SynJA3, SynJA2, Cya7425 |
| 186683057 | | | hypothetical protein | alr0857 | no | no | SynJA3, SynJA2 |
| 186683906 | | | pathogenesis related protein-like protein | all0217 | no | CRC_03259 | SynJA3, SynJA2, Cya7425, Mcht |
| 186684105 | | | glycosyl transferase, group 1 | all1345 | CRD_02459 | no | SynJA3, SynJA2, Cya7424 |
| 186684241 | | | hypothetical protein | all4434 | CRD_01931 | CRC_02458 | SynJA3, SynJA2, Cya7425 |
| 186685158 | | | phosphoglycerate mutase | alr2972 | CRD_00352 | CRC_03094 | Tery, SynJA3, SynJA2, Cya7425 |
| 186685476 | | | hypothetical protein | asl0163 | no | no | SynJA3, SynJA2 |
| 186685625 | | | hypothetical protein | all3713 | no | no | SynJA3, SynJA2, Cya7425, Cya8801 |
| 186685845 | | | hypothetical protein | asl0597 | no | no | SynJA3, SynJA2, Cya7425 |
| 186686227 | | | Arginyl tRNA synthetase anticodon binding | all3951 | CRD_01597 | CRC_03274 | |
| 186686347 | | | cytochrome P450 | all1361 | no | no | SynJA3, SynJA2, Cya7425 |
*Genes that show regulation in Anabaena under N2 depletion [38].
Characteristics of the cyanobacterial taxa used for comparative genomic analyses.
| Species | Morphology | Diazotrophy | Accession number | Genome sequence status |
| | Filamentous, heterocystous | N2-fixing | NC_010628 | finished |
| | Filamentous, heterocystous | N2-fixing | NZ_AAVW00000000 | unfinished |
| | Filamentous, heterocystous | N2-fixing | NC_003272 | finished |
| | Filamentous, heterocystous | N2-fixing | NC_007413 | finished |
| | Filamentous, heterocystous | N2-fixing | NZ_ACIR00000000 | unfinished |
| | Filamentous | N2-fixing | NC_008312 | finished |
| | Filamentous | N2-fixing | NZ_AAVU00000000 | unfinished |
| | Filamentous | N2-fixing | NZ_ABRS00000000 | unfinished |
| | Filamentous | non-N2-fixing | NZ_ABYK00000000 | unfinished |
| | Unicellular | N2-fixing | NZ_AADV00000000 | unfinished |
| | Unicellular | N2-fixing | NC_010546 | finished |
| | Unicellular | N2-fixing | NC_011726 | finished |
| | Unicellular | N2-fixing | NC_011729 | finished |
| | Unicellular | N2-fixing | NC_011884 | finished |
| | Unicellular | N2-fixing | NC_007775 | finished |
| | Unicellular | N2-fixing | NC_007776 | finished |
| | Unicellular | non-N2-fixing | NC_008319 | finished |
| | Unicellular | non-N2-fixing | NC_000911 | finished |
| | Unicellular | non-N2-fixing | NC_009925 | finished |
| | Unicellular | non-N2-fixing | NC_005125 | finished |
| | Unicellular | non-N2-fixing | NC_010296 | finished |
| | Unicellular | non-N2-fixing | NC_009091 | finished |
| | Unicellular | non-N2-fixing | NC_007604 | finished |
| | Unicellular | non-N2-fixing | NC_004113 | finished |
*Species included in the second part of the analysis.