Genomic structure of an economically important cyanobacterium, Arthrospira (Spirulina) platensis NIES-39.

Takatomo Fujisawa, Rei Narikawa, Shinobu Okamoto, Shigeki Ehira, Hidehisa Yoshimura, Iwane Suzuki, Tatsuru Masuda, Mari Mochimaru, Shinichi Takaichi, Koichiro Awai, Mitsuo Sekine, Hiroshi Horikawa, Isao Yashiro, Seiha Omata, Hiromi Takarada, Yoko Katano, Hiroki Kosugi, Satoshi Tanikawa, Kazuko Ohmori, Naoki Sato, Masahiko Ikeuchi, Nobuyuki Fujita, Masayuki Ohmori.

Abstract

A filamentous non-N(2)-fixing cyanobacterium, Arthrospira (Spirulina) platensis, is an important organism for industrial applications and as a food supply. The nearly complete genome sequence of A. platensis NIES-39 was determined in this study. The genome structure of A. platensis is estimated to be a single, circular chromosome of 6.8 Mb, based on optical mapping. Annotation of this 6.7 Mb sequence yielded 6630 protein-coding genes as well as two sets of rRNA genes and 40 tRNA genes. Of the protein-coding genes, 78% are similar to those of other organisms; the remaining 22% are currently unknown. A total of 612 kb of the genome comprises group II introns, insertion sequences and some repetitive elements. Group I introns are located in a protein-coding region. Abundant restriction-modification systems were identified. Unique features in the gene composition were noted, particularly a large number of genes for adenylate cyclases, haemolysin-like Ca(2+)-binding proteins and chemotaxis proteins. Filament-specific genes were highlighted by comparative genomic analysis.

Year: 2010; PMID: 20203057; PMCID: PMC2853384; DOI: 10.1093/dnares/dsq004

Source DB: PubMed; Journal: DNA Res; ISSN: 1340-2838; Impact factor: 4.458


Introduction

Cyanobacteria are prokaryotes that perform oxygenic photosynthesis and constitute a large taxonomic group within the domain Eubacteria. They are classified morphologically (unicellular or filamentous) or functionally (N2-fixing or non-N2-fixing). Filamentous species are subdivided into those with and without heterocysts, which are cells differentiated from vegetative cells for fixing nitrogen.[1,2] Arthrospira is a representative filamentous non-N2-fixing cyanobacterium that lacks differentiated cell types such as the heterocyst, akinete or hormogonium, which develop in some filamentous N2-fixing cyanobacteria. This cyanobacterium is also well known as ‘Spirulina’ because of its use as a food. The Aztecs consumed it regularly,[3] and it is thought to be an important dietary element in tropical areas. More recently, Lake Chad has become known for Arthrospira that grows there naturally and is used as a food supply.[4] However, current taxonomy holds that the name ‘Spirulina’ is inappropriate for strains used as food supplements, and there is agreement that Arthrospira is a distinct genus,[5] consisting of over 30 different species including A. platensis and A. maxima. In the present study, we determined the nearly complete genome sequence of A. platensis NIES-39.[6] Arthrospira platensis has distinctive physiological characteristics, and its biology is the subject of a comprehensive book.[7] Arthrospira platensis shows vigorous gliding motility of filamentous cells (trichomes) with rotation along their long axis. Gliding is self-propulsion across a solid or semi-solid material without the aid of any visible flagellum[8]; however, the mechanism of gliding motility is not fully understood. This organism also possesses ecologically valuable characteristics such as alkali and salt tolerance and algal mat production on the periphery of lakes.
Arthrospira platensis is also able to grow at salt concentrations ∼1.5-fold higher than that of sea water.[9] Accordingly, it often dominates in lakes with high carbonate/bicarbonate levels and high pH.[10] Gene expression during adaptation to new environmental conditions such as a high salt concentration has been examined in A. platensis, and the importance of the cAMP signalling cascade in the regulatory response to changes in the external environment has been highlighted.[11] Moreover, molecular analysis of adenylate cyclase genes has shown that A. platensis has developed a number of diverse cAMP-dependent signal cascades to adapt to different severe environmental conditions.[12-14] Arthrospira platensis has become an important industrial organism as a health supplement, a source of beta-carotene and a natural colouring agent. It has been approved in Russia for treating symptoms of radiation sickness following the Chernobyl disaster.[15] The presence of hydrogenase in its cells also makes this cyanobacterium a useful material for clean energy production.[16] Despite its various useful applications, very little is known about the biology, physiology and genetic system of A. platensis. For production of useful products, gene manipulation through genetic engineering should be considered. However, genetic transformation of Arthrospira has had limited success to date,[17] and thus commercial use of this organism has faced barriers due to difficulties in gene manipulation. To overcome these barriers, knowledge of the restriction-modification (RM) systems encoded in its genome sequence may prove useful. In 1996, whole-genome sequencing was achieved for the first time in a cyanobacterium, the unicellular Synechocystis sp. PCC 6803.[18] Since then, whole genomes of 38 strains of cyanobacteria have been sequenced (Table 1). Of these strains, four are N2-fixing filamentous species, whereas the others are unicellular species, both N2-fixing and non-N2-fixing. Thus, this is the first study to sequence the genome of a filamentous non-N2-fixing species.
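Table 1

Strains used in the comparative genomic analysis

| Group | Filament | N2-fixation | Heterocyst | Species | Abbr. | Genes | Habitat | Motility | Total | HisKA | Respons_reg | PAS | GAF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| I | | | | Acaryochloris marina MBIC 11017 | amr | 8383 | Marine | | 8273 | 75 | 161 | 54 | 74 |
| I | | | | Gloeobacter violaceus PCC 7421 | gvi | 4430 | Rock | | 5464 | 39 | 56 | 20 | 15 |
| I | | | | Microcystis aeruginosa NIES 843 | mar | 6312 | Freshwater | | 5151 | 20 | 30 | 1 | 12 |
| I | | | | Prochlorococcus marinus SS120 | pma | 1883 | Marine | | 1967 | 4 | 6 | 1 | 0 |
| I | | | | Prochlorococcus marinus AS9601 | pmb | 1921 | Marine | | 1947 | 4 | 5 | 1 | 1 |
| I | | | | Prochlorococcus marinus MIT 9515 | pmc | 1906 | Marine | | 1892 | 4 | 5 | 1 | 1 |
| I | | | | Prochlorococcus marinus NATL1A | pme | 2193 | Marine | | 2074 | 5 | 6 | 1 | 1 |
| I | | | | Prochlorococcus marinus MIT 9303 | pmf | 2997 | Marine | | 2908 | 7 | 8 | 1 | 0 |
| I | | | | Prochlorococcus marinus MIT 9301 | pmg | 1907 | Marine | | 1928 | 4 | 6 | 1 | 1 |
| I | | | | Prochlorococcus marinus MIT 9215 | pmh | 1983 | Marine | | 1941 | 3 | 5 | 1 | 1 |
| I | | | | Prochlorococcus marinus MIT 9312 | pmi | 1810 | Marine | | 1928 | 5 | 6 | 1 | 1 |
| I | | | | Prochlorococcus marinus MIT 9211 | pmj | 1855 | Marine | | 1963 | 4 | 6 | 1 | 0 |
| I | | | | Prochlorococcus marinus MED4 | pmm | 1717 | Marine | | 1918 | 5 | 6 | 1 | 1 |
| I | | | | Prochlorococcus marinus NATL2A | pmn | 2163 | Marine | | 2110 | 5 | 6 | 1 | 1 |
| I | | | | Prochlorococcus marinus MIT 9313 | pmt | 2269 | Marine | | 2549 | 6 | 9 | 1 | 0 |
| I | | | | Synechococcus elongatus PCC 6301 | syc | 2527 | Freshwater | | 3198 | 15 | 26 | 20 | 19 |
| I | | | | Synechococcus sp. CC9605 | syd | 2645 | Marine | | 2571 | 5 | 9 | 1 | 2 |
| I | | | | Synechococcus sp. CC9902 | sye | 2307 | Freshwater | | 2453 | 5 | 7 | 1 | 1 |
| I | | | | Synechococcus elongatus PCC 7942 | syf | 2662 | Freshwater | | 3266 | 16 | 27 | 20 | 20 |
| I | | | | Synechococcus sp. CC9311 | syg | 2892 | Freshwater | | 2696 | 12 | 15 | 1 | 1 |
| I | | | | Synechococcus sp. WH 5701 | syh | 3346 | Marine | | nd | nd | nd | nd | nd |
| I | | | | Synechococcus sp. RCC307 | syr | 2535 | Marine | | 2633 | 9 | 11 | 1 | 1 |
| I | | | | Synechococcus sp. WH 8102 | syw | 2519 | Marine | + | 2392 | 5 | 9 | 1 | 2 |
| I | | | | Synechococcus sp. WH 7803 | syx | 2533 | Marine | | 2669 | 12 | 14 | 1 | 0 |
| I | | | | Synechocystis sp. PCC 6803 | syn | 3569 | Freshwater | + | 4269 | 41 | 67 | 29 | 33 |
| I | | | | Thermosynechococcus elongatus BP-1 | tel | 2476 | Hot spring | | 2901 | 15 | 32 | 13 | 19 |
| II | | + | | Crocosphaera watsonii WH 8501 | cro | 5958 | Marine | | nd | nd | nd | nd | nd |
| II | | + | | Synechococcus sp. JA-3-3Ab | cya | 2760 | Hot spring | | 3394 | 18 | 37 | 13 | 22 |
| II | | + | | Synechococcus sp. JA-2-3B'a(2–13) | cyb | 2862 | Hot spring | | 3469 | 20 | 36 | 15 | 18 |
| II | | + | | Cyanothece sp. PCC 7424 | cyc | 5710 | Freshwater | | 6407 | 62 | 120 | 67 | 47 |
| II | | + | | Cyanothece sp. PCC 7425 | cyn | 5327 | Freshwater | | 6255 | 70 | 145 | 122 | 68 |
| II | | + | | Cyanothece sp. PCC 8801 | cyp | 4367 | Freshwater | | 5182 | 38 | 81 | 32 | 39 |
| II | | + | | Cyanothece sp. ATCC 51142 | cyt | 5304 | Marine | | 5490 | 44 | 81 | 30 | 33 |
| III | −/+ | | | Synechococcus sp. PCC 7002 | syp | 3186 | Marine | | 3764 | 31 | 54 | 16 | 19 |
| IV | + | | | Arthrospira platensis NIES 39 | apl | 6631 | Freshwater | + | 6631 | 70 | 107 | 76 | 58 |
| V | + | + | | Trichodesmium erythraeum IMS101 | ter | 4451 | Marine | | 6159 | 26 | 47 | 10 | 24 |
| VI | + | + | + | Anabaena sp. PCC 7120 | ana | 6130 | Freshwater | | 7207 | 103 | 153 | 72 | 80 |
| VI | + | + | + | Anabaena variabilis ATCC 29413 | ava | 5661 | Freshwater | | 7234 | 102 | 151 | 73 | 83 |
| VI | + | + | + | Nostoc punctiforme ATCC 29133 | npu | 6690 | Soil | + | 8751 | 125 | 194 | 92 | 121 |

Column groups: Group, Filament, N2-fixation and Heterocyst give the CyanoClust grouping; Habitat and Motility are physiological indexes; Total, HisKA, Respons_reg, PAS and GAF give the number of Pfam domains. Abbr., abbreviation; nd, not determined. Pfam domains: HisKA (PF00512), Respons_reg (PF00072), PAS (PF00989) and GAF (PF01590).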

This study determined the nearly complete genome sequence of A. platensis NIES-39 of ∼6.8 Mb. Some characteristic gene sets particularly in signal transduction and gene RM systems were determined, and additional properties in the genome structure were compared with those of other cyanobacteria.

Materials and methods

Bacterial strain and culture conditions

Arthrospira (Spirulina) platensis strain NIES-39 was obtained from the culture collection at the National Institute for Environmental Studies. The strain was originally isolated from Lake Chad and maintained in the Institute of Applied Microbiology, the University of Tokyo (strain IAM M-135).[6] The cells were grown in SOT medium[19] at 30°C under continuous illumination at 30 µmol photons m−2 s−1 and aerated with 1% (v/v) CO2.

Sequencing and assembly

Genomic DNA was isolated with Genomic-tip 500/G (Qiagen, Valencia, CA, USA) as recommended by the supplier. A DNA shotgun library with 1.5- and 5-kb inserts in pUC118 vector (Takara, Otsu, Japan) was constructed as described previously.[20] Plasmid clones were end-sequenced using dye terminator chemistry on an ABI Prism 3730 sequencer as described previously.[20] Raw sequence data corresponding to 11-fold coverage were assembled using PHRED/PHRAP/CONSED software (http://www.phrap.org).[21,22] For assembly validation, a fosmid library with 40-kb inserts in the pCC1FOS fosmid vector was constructed using the CopyControl Fosmid library production kit (Epicenter, Madison, WI, USA). Fosmid DNA was extracted from Escherichia coli transformants using a Montage BAC96 MiniPrep kit (Millipore, Billerica, MA, USA), and end sequencing was carried out using dye terminator chemistry on an ABI Prism 3730 as described previously.[20] Fosmid end sequences were mapped onto the assembled sequence. Fosmid clones that link two contigs were selected and sequenced by primer walking to close any gaps. The sequencing of difficult templates was performed using a CUGA sequencing kit (Nippon Genetech, Tokyo, Japan). Contig sequences generated by Roche 454 FLX sequencer were also used to fill some gaps.
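
As a rough consistency check on the stated 11-fold coverage, fold coverage can be written as (number of reads × mean read length) / genome size. The sketch below uses the read count reported in the Results (92 878) and the estimated genome size of 6.8 Mb; the mean read length is not given in the text, so the value computed here is only an implied figure under these assumptions, and the function names are illustrative.

```python
def fold_coverage(n_reads: int, mean_read_len_bp: float, genome_size_bp: float) -> float:
    """Fold coverage = total sequenced bases / genome size."""
    return n_reads * mean_read_len_bp / genome_size_bp


def implied_mean_read_length(n_reads: int, coverage: float, genome_size_bp: float) -> float:
    """Rearranged coverage formula: mean read length needed to reach a given coverage."""
    return coverage * genome_size_bp / n_reads


N_READS = 92_878          # random sequences reported in the Results
GENOME_SIZE_BP = 6.8e6    # estimated chromosome size
COVERAGE = 11             # genome equivalents stated in the text

implied_len = implied_mean_read_length(N_READS, COVERAGE, GENOME_SIZE_BP)
print(f"Implied mean read length: {implied_len:.0f} bp")
```

An implied read length of several hundred base pairs is in the range expected for dye-terminator reads, so the figures are mutually consistent under these assumptions.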

Optical mapping

The sequence assemblies were ordered and validated using optical maps (OpGen Technologies, Madison, WI, USA) and PCR.[23,24] Briefly, high molecular weight DNA was immobilized as individual molecules onto optical chips, digested with NcoI (New England Biolabs, Ipswich, MA, USA), fluorescently stained with YOYO-1 (Invitrogen, Carlsbad, CA, USA) and positioned onto an automated fluorescent microscope system for image capture and fragment size measurement to give high-resolution single-molecule restriction maps. Single-molecule maps were collected and then assembled to produce a whole-genome map. Optical maps and sequence contigs were compared as described previously.[23] Sequence FASTA files were converted to in silico restriction maps via MapViewer software (OpGen Technologies) for direct comparison with the optical maps. Comparisons were accomplished by aligning the sequence with the optical maps according to their restriction fragment pattern. Alignments were generated with a dynamic programming algorithm that finds the optimal placement of a sequence contig by global alignment of the sequence contig against the optical map. Local alignment analysis was also performed to compare segments of the sequence contigs to the optical map.
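
The conversion of a sequence into an in silico restriction map can be illustrated with a minimal sketch. It assumes only the NcoI recognition site (CCATGG, cut after the first C); the function name and the toy contig are hypothetical, and this is not the MapViewer implementation or its alignment algorithm. Sites that span the origin of a circular sequence are not detected in this simplified version.

```python
import re

NCOI_SITE = "CCATGG"   # NcoI recognition sequence
NCOI_CUT_OFFSET = 1    # NcoI cuts after the first C: C^CATGG


def in_silico_digest(seq, site=NCOI_SITE, cut_offset=NCOI_CUT_OFFSET, circular=False):
    """Return the ordered list of fragment lengths (bp) produced by cutting seq at every site."""
    seq = seq.upper()
    # A zero-width lookahead finds every occurrence, including overlapping ones.
    cuts = [m.start() + cut_offset for m in re.finditer(f"(?={site})", seq)]
    if not cuts:
        return [len(seq)]
    if circular:
        # On a circle, the last fragment wraps around to the first cut.
        fragments = [b - a for a, b in zip(cuts, cuts[1:])]
        fragments.append(len(seq) - cuts[-1] + cuts[0])
        return fragments
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Toy contig with two NcoI sites (hypothetical sequence, for illustration only).
contig = "AAAACCATGGTTTTTTCCATGGAA"
linear_fragments = in_silico_digest(contig)
circular_fragments = in_silico_digest(contig, circular=True)
print(linear_fragments, circular_fragments)
```

The ordered fragment lengths are what would then be compared with the fragment sizes measured on the optical map.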

Genome analysis and annotation

Putative non-translated genes were identified using the Rfam[25] and tRNAscan-SE[26] programs and rRNA genes using the RNAmmer[27] and BLASTN[28] programs, whereas protein-coding genes were identified using the GLIMMER program by obtaining potential open reading frames >150 bp.[29,30] The genome sequence was translated into potential protein sequences in six frames and compared with sequences in the UniProt database[31] using the BLASTP program[28] for identification of additional genes not predicted by other methods and genes <150 bp, especially in the predicted intergenic regions. The start sites were manually inspected and altered based on predictions of GLIMMER and GeneHacker.[32] For functional annotation, the non-redundant UniProt database and protein signature database, InterPro,[33] were searched to assign the predicted protein sequences based on sequence similarities. Orthologous genes among cyanobacteria were manually curated using Gclust[34] and CYORF.[35] The KEGG database was used for pathway reconstruction.[36] Signal peptides in proteins were predicted using SignalP,[37] and transmembrane helices were predicted using TMHMM.[38]
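
The idea of scanning all six reading frames for open reading frames longer than 150 bp can be sketched as follows. This is a deliberately simplified stand-in, not GLIMMER: it only recognises ATG starts and the three standard stop codons, ignores alternative bacterial start codons, and uses no statistical gene model. All names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOPS = {"TAA", "TAG", "TGA"}
START = "ATG"


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq, min_len_bp):
    """Yield (start, end) half-open coordinates of ATG..stop ORFs in the three frames of one strand."""
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i                      # first in-frame ATG opens the ORF
            elif start is not None and codon in STOPS:
                end = i + 3                    # include the stop codon
                if end - start > min_len_bp:
                    yield start, end
                start = None


def find_orfs(seq, min_len_bp=150):
    """Return sorted (start, end, strand) tuples in forward-strand coordinates."""
    seq = seq.upper()
    n = len(seq)
    found = [(s, e, "+") for s, e in orfs_in_strand(seq, min_len_bp)]
    rc = reverse_complement(seq)
    # Map reverse-strand coordinates back onto the forward strand.
    found += [(n - e, n - s, "-") for s, e in orfs_in_strand(rc, min_len_bp)]
    return sorted(found)


# Toy sequence: a 186-bp ORF flanked by two bases on each side (hypothetical).
toy = "CC" + "ATG" + "GCT" * 60 + "TAA" + "GG"
print(find_orfs(toy))
```

Running the same function on the reverse complement of the toy sequence reports the ORF on the "-" strand, which is the point of scanning six frames rather than three.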

Comparative genomic analysis

We first used the hmmer (v2.3.2) program and the Pfam_ls database (release 23) to analyse the cyanobacterial genomes (Table 1). For comparative genomic analysis to determine genes related to trichome formation and N2-fixation, we applied the CyanoClust database.[39] We classified all putative proteins from the 39 cyanobacterial genomes into nearly orthologous protein clusters; each cluster was statistically defined to minimize its size. This is particularly useful for analysing the many regulatory proteins in cyanobacteria which have multidomains. We divided A. platensis and the other strains of cyanobacteria into six groups (Table 1): unicellular non-N2-fixing cyanobacteria such as Synechocystis sp. PCC 6803 (group I), unicellular N2-fixing cyanobacteria such as Crocosphaera watsonii (group II), Synechococcus sp. PCC 7002 (a unicellular cyanobacterium which forms a short filament near optimal growth temperature; D. Bryant, personal communication) (group III), filamentous non-N2-fixing cyanobacteria (A. platensis, group IV), filamentous N2-fixing non-heterocyst-forming cyanobacteria (Trichodesmium erythraeum, group V) and filamentous N2-fixing heterocyst-forming cyanobacteria such as Anabaena sp. PCC 7120 (group VI). Gene clusters that were shared between groups were automatically extracted, and only those clusters present in all strains of the group were analysed.
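
The extraction rule described above (clusters present in every strain of the chosen groups and absent from all other groups) can be expressed with set operations. The sketch below is an assumption about how such a filter could look, not the CyanoClust implementation; the cluster identifiers and the reduced strain list are hypothetical, with strain abbreviations borrowed from Table 1.

```python
def clusters_specific_to(presence, groups, target_groups):
    """Return clusters found in all strains of the target groups and in no strain of any other group.

    presence: dict mapping cluster id -> set of strain abbreviations containing the cluster
    groups:   dict mapping group name -> set of strain abbreviations in that group
    """
    required = set().union(*(groups[g] for g in target_groups))
    excluded = set().union(*(m for g, m in groups.items() if g not in target_groups))
    return sorted(
        cid for cid, strains in presence.items()
        if required <= strains and not (strains & excluded)
    )


# Reduced, illustrative grouping (one unicellular strain plus the filamentous groups).
groups = {
    "I": {"syn"},
    "IV": {"apl"},
    "V": {"ter"},
    "VI": {"ana", "ava", "npu"},
}

# Hypothetical cluster presence table.
presence = {
    "c_common": {"syn", "apl", "ter", "ana", "ava", "npu"},
    "c_filament": {"apl", "ter", "ana", "ava", "npu"},
    "c_partial": {"apl", "ter", "ana"},   # missing from some group VI strains
    "c_apl_only": {"apl"},
}

filament_specific = clusters_specific_to(presence, groups, {"IV", "V", "VI"})
apl_specific = clusters_specific_to(presence, groups, {"IV"})
common = clusters_specific_to(presence, groups, set(groups))
print(filament_specific, apl_specific, common)
```

Requiring presence in all strains of a group is what removes the partially conserved cluster in this toy example.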

Results and discussion

Sequencing, optical mapping and structural features of the A. platensis genome

The nucleotide sequence of the whole genome of A. platensis NIES-39 was determined using the whole-genome shotgun method. Three different libraries with 1.5-, 5- and 40-kb inserts were prepared and the end sequences of the clones were determined. A total of 92 878 random sequences corresponding to 11 genome equivalents were assembled into 18 supercontigs. Final sequence assembly was carried out by visually editing the draft sequences and by optical mapping (data not shown) to determine the relative order and orientation of each supercontig. The 18 remaining gaps were estimated to total ∼95 kb. Most, if not all, gaps were flanked by repeated clusters of tandem sequences or by phage-like sequences. Gap closing, including long PCR followed by primer walking and transposon-mediated sequencing, was not successful owing to unusually abundant repeats. The total sequence of the 18 supercontigs has a length of 6 692 865 bp with an average G + C content of 44.3%. Taken together, the A. platensis genome is estimated to be composed of a single, circular chromosome of 6.8 Mb (Fig. 1); no plasmid DNA sequences were detected. In GC skew analysis performed to locate the probable origin and terminus of DNA replication, no apparent shift of skew was detected, as is also the case in other cyanobacteria.
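
For readers unfamiliar with the two quantities mentioned here, G + C content and GC skew, (G − C)/(G + C) per window, can be computed as in the sketch below. The window size, function names and toy sequence are assumptions for illustration and do not reproduce the authors' analysis settings.

```python
def gc_content(seq):
    """Fraction of G and C among unambiguous bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return gc / acgt if acgt else 0.0


def gc_skew_windows(seq, window, step=None):
    """GC skew (G - C) / (G + C) for consecutive windows along the sequence."""
    seq = seq.upper()
    step = step or window
    skews = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


def cumulative(values):
    """Running sum, commonly plotted to look for a skew shift."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out


toy = "GGGGCCCC"                       # hypothetical sequence with an abrupt skew shift
skews = gc_skew_windows(toy, window=4)
print(gc_content(toy), skews, cumulative(skews))
```

In genomes with a strong replication-associated strand bias, the cumulative curve typically shows a minimum near the origin and a maximum near the terminus; a flat or noisy curve corresponds to the absence of an apparent shift.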
Figure 1

Schematic representation of the circular chromosome of A. platensis. A scale indicates the coordinates in megabase pairs. From outside to inside: circle 1, the gaps in the genome; circles 2 and 3, predicted protein-coding genes on the forward and reverse strands; circle 4, G+C content; circle 5, GC skew. Eighteen contig gaps (G01-G18) are numbered in the clockwise direction starting from the end of the longest contig. Functional categories were colour-coded according to the standard colours used by COGs. The genome sequence and annotation of A. platensis NIES-39 are available at GenBank/EMBL/DDBJ under accession no. AP011615.

The estimated genome comprises 6630 potential protein-coding genes (average size, 835 bp), as well as 49 RNA genes consisting of 2 sets of rRNA genes, 40 tRNA genes representing 20 tRNA species, tmRNA, the B subunit of RNase P and signal recognition particle RNA. Translated amino acid sequences were compared with sequences in the UniProt database using the BLAST program. Of the 6630 potential protein-coding genes, 5157 (78%) were orthologous or had similarity to genes of known function or hypothetical genes (E-value of <10−3) and the remaining 1473 (22%) showed no significant similarity to any registered genes. On manual curation, 2539 (38%) genes could be assigned to biological roles. Genes for general metabolism in cyanobacteria were detected with no particular difference.

Conserved domain analysis using Pfam

We obtained 2673 kinds of Pfam domains from 37 cyanobacterial genomes and 1537 kinds of Pfam domains from the A. platensis genome. Table 2 presents the top 50 Pfam domains in A. platensis, which contains highly repetitive motifs including TPR_1 (PF00515), TPR_2 (PF07719), WD40 (PF00400), HemolysinCabind (PF00353), Pentapeptide (PF00805), HNH (PF01844) and GIIM (PF08388).
Table 2

Top 50 of Pfam domains of A. platensis

Domain apl amr ana ava cya cyb cyc cyn cyp cyt gvi mar npu pma pmb pmc pme pmf pmg pmh pmi pmj pmm pmn pmt syc syd sye syf syg syn syp syr syw syx tel ter
TPR_14153001641585873202981447597197300812750207131312248868634911351255628181040628
TPR_239830816616376901991151477212320229871674921013131020987764211164312536313211748593
WD402203982302824419772651281539732022232222223000000451101010174
HemolysinCabind191172726000533102710521230603170000013705710140271150156
Pentapeptide160252988940441111061011116850123867813666878152011112012486213131341107
HNH*11191511437471441172332233324221231211422278
Response_reg107161153151373612014581815630194655686566669269727156754119143247
GIIM*1036520050461600000000000000000000000612
HATPase_c829513913526277610247524923168655696566668177718134836107112138
PAS_37747758013175115630311718800000000000020002003418000137
PAS76547273131567122323020192111111111111201120129161111310
PAS_47654738119245712828323038510011000001119111912412111145
DUF8207026564815216753855493896300000000000020020352600017
HisKA7075103102182062703844392012544457435455615551612413195121526
ABC_tran65791061055563788865736866106313228344330313229313444504038513860603647424956
GAF587480832218476839331512121011101110110192120133191201924
Pkinase465849569102830162015175600002000000150151781011142
SBBP*460000107009040000000000000000000000013
HEAT_PBS419553815142019201625272811054111204414571451314836933
PPC41317121020189650000000000000200012070075
Pkinase_Tyr40544652892623162012165100000000000050050680001038
Transposase_354022513476546430729220000000000001001012000324
Transposase_14*393001605110320000000000000000000260000015
Glycos_transf_138485245783456333629285710812121268978910131219131126191514171230
Glycos_transf_2382648458842483533243438910134146766114811121112924111211102038
Methyltransf_1138744241272635543336443254101412121910121110151115231412231525191718191151
RVT_1*379830060471521000000100000000021000511
GGDEF33541414343419233112210000000000001700174231300084
Methyltransf_12336634322322284628343928468111012199979121013201413201424171516171044
BPD_transp_12724343928342530232319193185881465796813221312231121221313151619
Epimerase27543742161430443130352756161820182216171412181617161820161925172119201327
Transposase_22722412426242426528420000000000001001012000326
PIN2610179121018101422311411111111111141153132215211
AAA25343437231828251919252337131313151613131212141515161814161618171416151437
DAO25322937202231282225282034161613152312111414151521232521242725222122211324
MMR_HSR125353228202124302428182431161616161717171716161617211818211923191715172427
PG_binding_1241292800331214260000000000001001017000111
adh_short2347384114122839232843257910989167911107816121316122017121517161116
CBS232421218618161113910193333544343351166115814565713
SLH2324293210991111131184020004000100476476911244810
AAA_52224202312111319161319193010910101399912111013131514131317121514141229
Guanylate_cyc*227651225230180000200000022102131001213
Radical_SAM222425261315313123202421251110108141211101210813171613171422191315151620
SMC_N2013202211112024181618182210971287678611118118961268109913
FHA19381313109152313156713000020000000201211191012014
Abhydrolase_118242427101124261520231429456545554654136513515146361118
GTP_EFTU181819191514151418171213189101111121091010101112121311121117161111111417
Hydrolase182125339919221323141526424562425357187718615169881012
NAD_binding_418252324861721181716182985871178769710118911816789101112
3Beta_HSD17251719551621131419112266377675488787885117867810
Aminotran_1_21712172012131413151517152278571086565891197139131112771116
CbiA17292427131219252219192029788968888886118812919199681118

See Table 1 for abbreviations

*Highest domain numbers in A. platensis

Positive correlation was detected between the number of genes and number of Pfam domains of the 38 cyanobacterial species (r = 0.9X, P < 0.05), including A. platensis (Supplementary Fig. S1). Eighteen domains (DUF1064, BOF, Cytochrom_C_2, DUF1624, DUF1667, DUF2234, DUF2273, DUF268, DUF310, DUF579, Hp0062, JmjC, LAB_N, PhoU_div, PMSR, PPTA, SoxG and YibE_F) are unique to the A. platensis genome. The abundance of signalling domains, such as GAF (guanylate cyclase/adenylate cyclase/FhlA), PAS (Per/Arnt/Sim), HisKA and Respons_reg, depends on the habitat of the cyanobacteria; i.e. soil and freshwater cyanobacteria possess larger numbers of signalling domains than marine cyanobacteria (Supplementary Fig. S2). A. platensis, which lives in high salt lakes, was thus categorized as a soil/freshwater cyanobacterium.
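
The kind of correlation reported here can be illustrated with a Pearson coefficient computed on a subset of the gene counts and total Pfam domain counts transcribed from Table 1. The sketch below covers only ten strains chosen for illustration, so its result is not expected to match the value reported for the full data set; the function name is illustrative.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


# (abbreviation, number of genes, total number of Pfam domains), subset of Table 1.
table1_subset = [
    ("amr", 8383, 8273), ("gvi", 4430, 5464), ("mar", 6312, 5151),
    ("pma", 1883, 1967), ("syn", 3569, 4269), ("apl", 6631, 6631),
    ("ana", 6130, 7207), ("npu", 6690, 8751), ("ter", 4451, 6159),
    ("tel", 2476, 2901),
]

genes = [row[1] for row in table1_subset]
domains = [row[2] for row in table1_subset]
r = pearson_r(genes, domains)
print(f"r = {r:.2f} for {len(table1_subset)} strains")
```

A significance test (the P value quoted in the text) would require an additional step, for example a t-test on r with n − 2 degrees of freedom, which is omitted here.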

Physiological profiling using CyanoClust

Arthrospira platensis is the first strain of filamentous non-N2-fixing cyanobacteria to have its genome sequenced. Whole-genome sequencing allows for comparative genomic analysis especially in regard to trichome formation and N2-fixation. Using the CyanoClust database, we extracted 694 gene clusters (938 genes in A. platensis) that were common to all six cyanobacteria groups (Table 3 and Supplementary Table S1). These clusters may be closely related to housekeeping, photosynthetic or cyanobacteria-specific genes. We also extracted 1066 gene clusters (2056 genes in A. platensis) that were specific to A. platensis (Table 3 and Supplementary Table S2); 71 (94 genes in A. platensis) common to only groups IV (A. platensis) and V (T. erythraeum) (Supplementary Table S3); and 223 common to the heterocyst-forming cyanobacteria. The latter clusters included known heterocyst-related genes such as patN and hetP (Table 3 and Supplementary Table S4). In addition, eight clusters were specifically conserved among N2-fixing cyanobacteria (groups II, V and VI) (Table 3 and Supplementary Table S5); most of these encode nif-related genes.
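Table 3

Summary of comparative genomic analysis using CyanoClust

| Physiological profiling | Common | A. platensis-specific | Heterocyst-specific | N2-fixing-specific | Filament-specific |
|---|---|---|---|---|---|
| Group I | O | X | X | X | X |
| Group II | O | X | X | O | X |
| Group III | O | X | X | X | X/O |
| Group IV | O | O | X | X | O |
| Group V | O | X | X | O | O |
| Group VI | O | X | O | O | O |
| No. of clusters | 694 | 1066 | 223 | 8 | 29/7 |
| No. of genes in A. platensis | 938 | 2056 | 0 | 0 | 31/7 |

For cyanobacteria grouping, see Table 1.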

Of the 29 gene clusters (31 genes in A. platensis, Table 4 and Supplementary Table S6) common to only filamentous cyanobacteria (groups IV, V and VI), fraC and fraG (sepJ) are already known to have an association with trichome formation.[40,41] Notably, seven gene clusters (seven genes in A. platensis, Table 4 and Supplementary Table S7) were found to be common to the filamentous groups and Synechococcus sp. PCC 7002, which may be a primitive filamentous species (group III) and may also be related to trichome formation. Moreover, several genes (patU, hetR and hetF)[42-44] that are required for heterocyst maturation and N2 fixation are conserved in A. platensis, although no nitrogenase genes were detected; this suggests that heterocyst formation could be developmentally coupled with trichome formation. Several contiguous genes (bolded in Table 4) are syntenically conserved among the five trichome-forming cyanobacteria. Particularly, NIES39_A00790 is located just downstream of fraC, implying that it is another candidate required for trichome development.
Table 4

Gene clusters conserved only in filamentous cyanobacteria

Specific for groups IV, V and VI

| Cluster No. | Gene ID | Annotation |
|---|---|---|
| 4981 | NIES39_A00790 | Hypothetical protein |
| 5158 | NIES39_A00800 | Filament integrity protein (fraC) |
| 4909 | NIES39_A01310 | NUDIX hydrolase |
| 3571 | NIES39_A03140 | Hypothetical protein |
| 4548 | NIES39_A04850 | Hypothetical protein |
| 3571 | NIES39_C00130 | Hypothetical protein |
| 4852 | NIES39_C00870 | Hypothetical protein |
| 2616 | NIES39_D00070 | Hypothetical protein |
| 4736 | NIES39_D00920 | Hypothetical protein |
| 4710 | NIES39_E01660 | Nuclease (SNase-like) |
| 4588 | NIES39_E02590 | Serine/threonine protein kinase |
| 4667 | NIES39_F00600 | Probable glycosyl transferase |
| 4881 | NIES39_K02770 | Hypothetical protein |
| 4277 | NIES39_K02780 | Hypothetical protein |
| 4548 | NIES39_L02410 | Hypothetical protein |
| 4747 | NIES39_L03440 | Hypothetical protein |
| 2616 | NIES39_L06030 | Hypothetical protein |
| 5149 | NIES39_L06180 | Hypothetical protein |
| 5020 | NIES39_M00320 | Hypothetical protein |
| 4731 | NIES39_M02510 | DUF6 transmembrane protein (sepJ, fraG) |
| 5143 | NIES39_N00400 | Hypothetical protein (patU) |
| 2616 | NIES39_N00410 | Hypothetical protein |
| 5032 | NIES39_N00980 | Hypothetical protein |
| 4826 | NIES39_O03720 | Hypothetical protein |
| 5058 | NIES39_O03830 | Hypothetical protein |
| 5102 | NIES39_O03850 | Hypothetical protein |
| 4991 | NIES39_O04240 | Hypothetical protein |
| 4590 | NIES39_O06790 | Alpha/beta hydrolase fold |
| 4996 | NIES39_Q00570 | Hypothetical protein |
| 5045 | NIES39_Q00580 | Hypothetical protein |
| 4607 | NIES39_R00660 | Hypothetical protein |

Specific for groups III, IV, V and VI

| Cluster No. | Gene ID | Annotation |
|---|---|---|
| 4298 | NIES39_C02810 | Hypothetical protein |
| 4333 | NIES39_C03480 | Heterocyst differentiation protein (hetR) |
| 4276 | NIES39_D04070 | Hypothetical protein |
| 4505 | NIES39_J00800 | Hypothetical protein |
| 3819 | NIES39_J02400 | Heterocyst differentiation protein (hetF) |
| 4311 | NIES39_O06320 | Hypothetical protein |
| 4255 | NIES39_Q02800 | Hypothetical protein |

Bolded genes are contiguous.

Mobile DNA elements

Group II introns include ribozymes and retroelements in the genomes of organelles, Eubacteria and Archaea. In bacteria, group II introns primarily act as retroelements consisting of six RNA structural domains (DI–DVI) and an intron-encoded protein, whereas in mitochondria and plastids, they frequently lack the intron-encoded protein and are mobile. At least 150 group II introns were identified in the A. platensis genome; 71 of these encode reverse transcriptase/maturase, whereas the other 79 do not. Additionally, 88 sequences were assigned to the group II catalytic intron (RF00029), DV and DVI. This number was much higher than that in previously reported abundant species (27 copies of group II introns in T. elongatus[45] and 28 copies of RF00029 in T. erythraeum). These results suggested that group II introns have been self-propagating extensively in A. platensis cells for a long time. Group I introns include ribozymes that catalyse their own RNA splicing to produce ligated exons (mature RNA) and the excised intron. We detected two group I introns in one gene comprising three ORFs (NIES39_L05970, NIES39_L05980 and NIES39_L05990) which encode class I ribonucleotide reductases (RNRs; Fig. 2). Notably, a stop codon was found just after the insertion position in each intron, indicative of translational regulation. It is extremely rare for a group I intron to be inserted in protein-coding genes in bacteria, although one report has detected this insertion in class II RNR of the cyanobacterium N. punctiforme.[46] At least three different classes of RNRs that differ in primary sequence, substrate and cofactor requirements have been described in cyanobacteria. Since class I RNRs have no homology with class II RNRs, the essential DNA biosynthesis step from RNA could be an evolutionary hot spot for targeting these selfish genes. 
Cyanobacterial genome comparison revealed that classes I and II are distributed in a complementary manner in cyanobacteria, except for Gloeobacter violaceus, in which no known RNR genes have been detected (Supplementary Table S8); class III anaerobic RNRs are also present in some species. Many class I and II RNR genes are interrupted by introns and/or inteins, which are spliced out at the RNA or protein level, respectively. Inteins in A. platensis were also found in DnaB (NIES39_M02320), DnaX (NIES39_D01990) and thymidylate synthase (NIES39_Q01490) genes.
Figure 2

Gene structure of the ribonucleotide reductase (RNR) gene with two group I intron insertions.

Novel phage-like sequences ranging from 12.5 to 25.5 kb in length were found in at least 18 loci in the A. platensis genome. Although each sequence shows a number of variations in deletion, insertion and rearrangement, they consist of phage infection-related genes (e.g. NIES39_F00750) and small conserved putative genes, generating a direct repeat in some cases. Such phage-like sequences account for at least 295 kb of the whole genome. As for transposases of insertion sequences, 139 genes were found, which is comparable to other cyanobacteria.[18] In total, 612 kb of the genome is composed of group II introns, phage-like sequences, insertion sequences and some repetitive elements, which include clustered regularly interspaced short palindromic repeat (CRISPR) sequences (see Section 3.5.2).

Signal transduction proteins

cAMP and c-di-GMP signal transduction

cAMP is an important signalling molecule in cyanobacteria.[47] The A. platensis genome encodes 22 adenylate cyclases, which produce cAMP from ATP (Fig. 3). This number is notably high among cyanobacterial genomes, which usually encode fewer than 10 adenylate cyclases. Even T. erythraeum, the cyanobacterium most closely related to A. platensis, contains only 13 adenylate cyclase genes. In addition to homologues of cyaC,[13,14] cyaG,[13,14] cyaA, cyaB1, cyaB2 and cyaD (Fig. 3),[48] many novel genes were detected. Another unique characteristic of A. platensis is its 10 membrane-associated adenylate cyclases; most cyanobacteria have only one or two. Thus, the cAMP signal transduction response to external stimuli may be highly developed in A. platensis. As for the domain structure of adenylate cyclase, NIES39_A07100 bears a unique domain architecture: a Ser/Thr protein kinase domain in the N-terminus and GAF, PAS and adenylate cyclase domains in the C-terminus (Fig. 3). This domain architecture is quite similar to that of the HstK family,[49] a hybrid of Ser/Thr protein kinase and histidine kinase, except that the C-terminal histidine kinase domain is replaced with the adenylate cyclase domain.
Figure 3

Domain architecture of adenylate cyclases in A. platensis. See Table 1 for abbreviations.

The A. platensis genome also contains 14 genes encoding cNMP-binding domains. Four of them [NIES39_C02730 (NtcA), NIES39_D00910 (CRP), NIES39_M01440 and NIES39_O00090] encode the helix-turn-helix DNA-binding motif of the CRP family, whereas the remaining 10 encode proteins with unique domain architectures. NIES39_A00950 bears a unique domain architecture, namely, an ATP:ADP antiporter-like domain in the N-terminus and a cNMP-binding domain in the C-terminus. The C-terminal cNMP-binding domain contains residues critical for binding cAMP. A previous report showed that A. platensis trichomes rapidly aggregate to form a mat in response to externally added cAMP.[50] This aggregation is accompanied by an increase in the intracellular ATP level and a decrease in the intracellular ADP level,[12] suggesting that NIES39_A00950 may contribute to this cAMP-dependent aggregation. There are 33 GGDEF and 15 EAL domains, which are involved in the synthesis and hydrolysis, respectively, of c-di-GMP,[51] a widely distributed second messenger for biofilm formation in bacteria. One GGDEF (NIES39_C01090) and three GGDEF/EAL (NIES39_C03530, NIES39_L00190 and NIES39_Q02220) proteins contain putative flavin-binding PAS domains, suggestive of light- or redox-responsive c-di-GMP signalling for movement, aggregation or biofilm formation.

Two-component signal transduction systems

Two-component signal transduction systems canonically consist of two proteins, a histidine kinase and a response regulator.[52] There are 84 putative genes for histidine kinases in the A. platensis genome. Among them, 33 encode hybrid histidine kinases containing not only a transmitter domain but also single or multiple receiver domain(s) or histidine phospho-transfer domain(s), which accept the phosphate group via intramolecular and/or intermolecular phospho-transfer. There are also 65 putative genes for response regulators. Orthologous histidine kinases of Synechocystis, Hik2 (slr1147), Hik7 (SphS, sll0337), Hik8 (sasA, sll0750), Hik33 (sll0698) and Hik34 (slr1285), are conserved in almost all cyanobacterial genomes. The A. platensis genome contains orthologues of Hik2 (NIES39_J01640), Hik8 (NIES39_A03740), Hik33 (NIES39_M02760) and Hik34 (NIES39_J00420), but not SphS, a kinase that induces proteins for acquiring inorganic phosphate (Pi) under Pi-deficient conditions. Although some marine cyanobacteria have been found to lack apparent orthologues of SphS, A. platensis is the first such example among freshwater cyanobacteria. Consistent with this, it does not contain a gene for SphU, which regulates SphS.[53] However, the response regulator SphR and its target genes phoA (alkaline phosphatase) and pts (high affinity Pi transporter) are present in the A. platensis genome, suggesting a different regulation system in this cyanobacterium. Some histidine kinases in A. platensis contain multiple domains of PAS/PAC and/or GAF. These structures are also prevalent in other histidine kinases from filamentous cyanobacteria, such as Anabaena sp. PCC 7120[54] and N. punctiforme.[55] As in other cyanobacterial genomes, most genes for histidine kinases and response regulators in the A. platensis genome are not located adjacent to each other on the chromosome, except in 18 cases (Supplementary Table S9). Several genes in the A. platensis genome encode fragments of histidine kinases, suggestive of pseudogenes (NIES39_A02200–NIES39_A02210, NIES39_D02110–NIES39_D02130–NIES39_D02160, NIES39_L02740–NIES39_L02750, NIES39_L05670–NIES39_L05680–NIES39_L05690).

Transcription factors

Cyanobacteria have multiple σ70-type sigma factors of RNA polymerase and no σ54-type sigma factor. In the A. platensis genome, there are seven σ70-type sigma factors. NIES39_L06110 encodes a group 1 sigma factor, which is also called the principal sigma factor and is essential for cell viability. In addition, we identified three group 2 sigma factors (NIES39_A03770, NIES39_A08720 and NIES39_C01040), which show a high degree of sequence similarity with the principal sigma factor but are non-essential for cell growth, and three group 3 sigma factors (NIES39_C05220, NIES39_D05870 and NIES39_O06410), which are divergent from group 1 and 2 sigma factors in amino acid sequence. The B-, C- and D-type group 2 sigma factors and the F- and G-type group 3 sigma factors are conserved among non-marine cyanobacteria,[56,57] and all of them are present in A. platensis. An additional group 3 sigma factor that does not form a clade with previously analysed sigma factors was also found. In the A. platensis genome, there are 66 putative genes for transcriptional regulators. Relative to the genome size, this number is quite small for a non-marine cyanobacterium, being more comparable to marine cyanobacteria.[58] For example, the freshwater cyanobacterium Anabaena sp. PCC 7120, which has a genome size comparable to that of A. platensis, contains more than 100 transcriptional regulators.[54] Some transcriptional regulators that are well conserved among cyanobacteria were found in A. platensis, including genes for NtcA (NIES39_C02730),[59] NtcB (NIES39_C02400),[60] NdhR (NIES39_K02860),[61] HrcA (NIES39_O04340)[62] and CRP (NIES39_D00910).[63] Two adjacent genes (NIES39_D06090 and NIES39_D06100) encode RbcR paralogues.[64] Although A. platensis does not form a heterocyst, a gene for HetR (NIES39_C03480) has recently been found.[65] Arthrospira platensis contains 20 response regulators that have a so-called receiver domain at the N-terminus and a helix-turn-helix type DNA-binding domain at the C-terminus. Of these regulators, 16 are classified in the OmpR family whereas the remaining four belong to the LuxR family. Orthologous proteins of NrrA (NIES39_A06330),[66] ManR (NIES39_G00800),[67,68] RpaA (NIES39_H00910),[69] SphR (NIES39_J04020),[70] RpaB (NIES39_K03840)[69] and NblR (NIES39_O05300)[69] were also identified.

PAS/GAF domains

Most non-marine cyanobacteria contain a large number of PAS and GAF domains.[54] The PAS and GAF domains are important signalling modules that serve as specific sensors for light, redox, oxygen and many other signals or as structural elements for dimerization.[71,72] We detected 131 PAS domains using a custom HMM profile of Anabaena and Nostoc PAS domains.[73] This number is comparable to those of other freshwater and soil cyanobacteria such as Anabaena and Nostoc. Six PAS domains encoded by NIES39_C01090, NIES39_C03530, NIES39_M00920, NIES39_L00190, NIES39_M02160 and NIES39_Q02220 are predicted to bind a flavin; the first three retain the conserved Cys residue that is essential for the photocycle of the LOV-type photoreceptors. One PAS domain encoded by NIES39_A03380 is predicted to bind a heme. The A. platensis genome contains 58 GAF domains. Four GAF domains encoded by NIES39_M01980 (CyaB1) and NIES39_O00080 (CyaB2) are predicted to bind cAMP,[74] whereas 18 are categorized into the phytochrome-type or cyanobacteriochrome-type GAF domain subfamilies that bind linear tetrapyrrole and exhibit reversible photoconversion.[75] We found no orthologues of the cyanobacterial phytochromes Cph1 and AphA, although bacteriophytochrome-type AphB (NIES39_A02210) and AphC (NIES39_C03450) orthologues were detected.[76] In addition, there are two green/red reversible AnPixJ-type (NIES39_J03990_GAF2 and NIES39_C00690_GAF1)[77] and six blue/green reversible TePixJ-type (NIES39_K04670_GAF1, NIES39_E01230_GAF1, NIES39_L03820_GAF1, NIES39_E04120_GAF3, NIES39_D04330_GAF1 and NIES39_M02490_GAF1) GAF domains.[75,78]
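The domain-level identifiers quoted above (e.g. NIES39_E04120_GAF3) combine a gene locus tag with a domain name and index. The Python sketch below shows one way such identifiers could be split and tallied. The tag layout encoded in the regular expression is an assumption inferred only from the identifiers quoted in this text, and the names `TAG_RE`, `parse_tag` and `PHOTORECEPTOR_GAF` are hypothetical.

```python
import re

# Assumed layout, inferred from the identifiers in the text:
# NIES39_<one letter><five digits>, optionally followed by _<DOMAIN><index>.
TAG_RE = re.compile(r"^NIES39_([A-Z])(\d{5})(?:_([A-Z]+)(\d+))?$")


def parse_tag(tag):
    """Split a locus tag into its gene part and optional domain suffix."""
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"unrecognised tag: {tag}")
    letter, number, domain, index = match.groups()
    return {
        "gene": f"NIES39_{letter}{number}",
        "domain": domain,                       # None when no suffix is present
        "index": int(index) if index else None,
    }


# Cyanobacteriochrome-type GAF domains as listed in the paragraph above.
PHOTORECEPTOR_GAF = {
    "AnPixJ-type (green/red)": ["NIES39_J03990_GAF2", "NIES39_C00690_GAF1"],
    "TePixJ-type (blue/green)": [
        "NIES39_K04670_GAF1", "NIES39_E01230_GAF1", "NIES39_L03820_GAF1",
        "NIES39_E04120_GAF3", "NIES39_D04330_GAF1", "NIES39_M02490_GAF1",
    ],
}

counts = {kind: len(tags) for kind, tags in PHOTORECEPTOR_GAF.items()}
parsed = [parse_tag(t) for tags in PHOTORECEPTOR_GAF.values() for t in tags]

print(counts)
print(parse_tag("NIES39_E04120_GAF3"))
```

The suffix index matters because a single gene can carry several GAF domains; NIES39_E04120_GAF3, for instance, refers to the third GAF domain of that gene.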

Chemotaxis regulators

In the A. platensis genome, there are eight methyl-accepting chemotaxis proteins (MCPs), which may serve as sensors for cell motility. In contrast with the photoreceptor MCPs (e.g. PixJ) found in many other cyanobacteria,[79] none of the Arthrospira MCPs harbours a photoreceptor domain. Three of these MCP genes are clustered with other chemotaxis-regulating che genes such as cheY, cheW and cheA (NIES39_A07840–NIES39_A07910, NIES39_E01010–NIES39_E01070 and NIES39_H00230–NIES39_H00290 gene clusters). Two che clusters (NIES39_A07840–NIES39_A07910 and NIES39_H00230–NIES39_H00290) lack the patA genes which are present in the che clusters of other cyanobacteria. These che clusters contain additional che genes (cheB, cheC and cheR in the NIES39_A07840–NIES39_A07910 gene cluster and cheB and cheR in the NIES39_H00230–NIES39_H00290 gene cluster), which have not been detected in che clusters of other cyanobacteria such as Synechocystis, whose motility and phototaxis regulation is well understood. CheR and CheB are known to methylate and demethylate the MCP, respectively, and thereby function as a kind of molecular memory for flagellar regulation in some proteobacteria such as E. coli.[80] CheC is a phosphatase for the CheY response regulator.[81] As the MCPs of these clusters have no photosensory domains, these genes may function in chemotaxis. Molecular memory and a rapid turnover system of phosphorylation can facilitate highly sensitive chemotactic responses.

Ser/Thr protein kinases and protein phosphatases

There are 43 genes that encode Ser/Thr protein kinases in the A. platensis genome. As in other filamentous cyanobacteria such as Anabaena and Nostoc,[54] HstK family (two), WD40-containing (nine) and TPR-containing (five) Ser/Thr kinases were detected. Fourteen genes encoding phospho-Ser, -Thr and -Tyr phosphatases were also identified in the A. platensis genome. Among them, nine belong to the PP2C family including NIES39_O07130 and NIES39_R00570 that have response regulator domains. Other members of this family include NIES39_A03470, NIES39_C00770, NIES39_J01920, NIES39_J03440, NIES39_K03740, NIES39_M01820 and NIES39_M02650 (PphA) which dephosphorylates the nitrogen regulator PII protein.[82] In addition, two phospho-Tyr (NIES39_B01000 and NIES39_N00600) and three PP1/2A/2B-type (NIES39_D01240, NIES39_G01060 and NIES39_G01070) phosphatases were identified.
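The phosphatase counts in this paragraph can be cross-checked against the locus tags it lists. The short Python sketch below is a bookkeeping aid only, with hypothetical variable names: it groups the tags by family as stated above and confirms that they add up to the reported total of 14.

```python
# Locus tags grouped by phosphatase family, copied from the paragraph above.
PHOSPHATASES = {
    "PP2C": [
        "NIES39_O07130", "NIES39_R00570",  # carry response regulator domains
        "NIES39_A03470", "NIES39_C00770", "NIES39_J01920", "NIES39_J03440",
        "NIES39_K03740", "NIES39_M01820",
        "NIES39_M02650",                   # PphA
    ],
    "phospho-Tyr": ["NIES39_B01000", "NIES39_N00600"],
    "PP1/2A/2B": ["NIES39_D01240", "NIES39_G01060", "NIES39_G01070"],
}

family_counts = {family: len(tags) for family, tags in PHOSPHATASES.items()}
all_tags = [tag for tags in PHOSPHATASES.values() for tag in tags]
total = len(all_tags)

print(family_counts)
print("total phosphatase genes:", total)
```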

Genome protection

RM system

We detected three sets of type I RM systems (hsdMSR) in the A. platensis NIES-39 genome (Supplementary Table S10 and Supplementary Fig. S3), which are shared with a closely related strain reported by a Chinese group.[83] Among them, two RM-specific genes (hsdS, NIES39_A06660 and NIES39_C00340) are mosaically conserved between the two strains (Fig. 4), whereas the full-length methylase and restriction genes hsdM and hsdR are highly conserved (>98%). The HsdS protein is composed of two target recognition domains and the N-terminal, central and C-terminal regions. NIES39_A06660 is highly conserved with ABB51216 throughout the entire protein except for the second target recognition domain (Fig. 4A). NIES39_C00340 is highly conserved with ABB51239 at the N-terminal, central and C-terminal regions but not at the target recognition domains (Fig. 4B). These mosaically conserved HsdS proteins may have been generated by one or two homologous recombination events through a mechanism similar to domain swapping of hsdS genes in other bacteria.[84] This enables A. platensis to acquire restriction systems against a wider range of exogenous DNA elements, which interferes with genetic transformation of A. platensis. On the other hand, another hsdS-like reading frame, which is interrupted by an in-frame stop codon, is completely conserved between these two strains.
Figure 4

Domain architecture of HsdS proteins. The target recognition domain roughly corresponds to Methylase_S (PF01420).

NIES-39 harbours eight clusters of type II RM systems, four of which are identical to those of the strain of the Chinese group,[83] although three are frameshifted. The other four clusters are novel in NIES-39. Given this information on the RM systems, methods for genetic transformation of A. platensis require further improvement.

CRISPR system

CRISPRs are widespread in the genomes of many bacteria and almost all archaea, and they confer resistance to phages together with a group of CRISPR-associated (Cas) proteins.[85-87] The A. platensis genome contains three regions of CRISPRs, which are located in close proximity to Cas proteins. CRISPR1 (at coordinates 2897603–2897927 and 2903170–2904079) is organized as 17 almost identical sequences of 35 nt interspaced by non-identical sequences of 34–43 nt. CRISPR2 (at coordinates 3587555–3589797) and CRISPR3 (at coordinates 5189874–5191453) are organized as 29 and 23 identical sequences of 36 and 35 nt interspaced by non-identical sequences of 37–49 and 32–41 nt, respectively. We also found 18 Cas genes. One of the three Cas1 proteins encoded by NIES39_F00150 is a fusion protein, having a C-terminal Cas1 domain and a reverse transcriptase domain similar to that of group II introns.
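The reported coordinates, repeat counts and spacer lengths can be checked against one another with simple arithmetic. The following Python sketch does so under stated assumptions that are not taken from the original analysis: coordinates are treated as inclusive, every contiguous segment is assumed to begin and end with a repeat, and the "almost identical" 35 nt repeats of CRISPR1 are treated as exactly 35 nt. The class and attribute names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CrisprArray:
    name: str
    segments: List[Tuple[int, int]]  # (start, end) coordinates, assumed inclusive
    n_repeats: int
    repeat_len: int
    spacer_min: int
    spacer_max: int

    def observed_length(self) -> int:
        """Total length covered by the reported coordinates."""
        return sum(end - start + 1 for start, end in self.segments)

    def expected_range(self) -> Tuple[int, int]:
        """Length range implied by the repeat and spacer sizes."""
        # Assumption: each segment starts and ends with a repeat, so a
        # segment holding r repeats holds r - 1 spacers.
        n_spacers = self.n_repeats - len(self.segments)
        repeats_total = self.n_repeats * self.repeat_len
        return (repeats_total + n_spacers * self.spacer_min,
                repeats_total + n_spacers * self.spacer_max)

    def is_consistent(self) -> bool:
        low, high = self.expected_range()
        return low <= self.observed_length() <= high


ARRAYS = [
    CrisprArray("CRISPR1", [(2897603, 2897927), (2903170, 2904079)], 17, 35, 34, 43),
    CrisprArray("CRISPR2", [(3587555, 3589797)], 29, 36, 37, 49),
    CrisprArray("CRISPR3", [(5189874, 5191453)], 23, 35, 32, 41),
]

for array in ARRAYS:
    low, high = array.expected_range()
    print(f"{array.name}: observed {array.observed_length()} nt, "
          f"expected {low}-{high} nt, consistent={array.is_consistent()}")
```

Under these assumptions, all three arrays have observed spans (1235, 2243 and 1580 nt) that fall within the ranges implied by their repeat and spacer lengths.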

Membrane transporters

The A. platensis genome contains ∼180 genes encoding putative membrane transporters. NapA-type Na+/H+ antiporters (Nha3 in S. elongatus PCC 7942) are known to be involved in salt tolerance at alkaline pH in some cyanobacteria.[88-90] Arthrospira platensis possesses seven putative Na+/H+ antiporters, one of which (NIES39_C00590) is an orthologue of NapA and Nha3. Accumulation of bicarbonate in the cytoplasm is essential for photosynthesis under high pH conditions.[91] Arthrospira platensis possesses two sets of genes for CO2-uptake modulation, NDH-1 (NdhF3-D3-CupA-CupS for high affinity and NdhF4-D4-CupB for low affinity), which converts CO2 to bicarbonate in the cytoplasm (Supplementary Table S11). The genes sbtA and bicA encode sodium-dependent bicarbonate transporters in A. platensis. Two closely related copies of bicA (NIES39_B00700, NIES39_B00710), which are tandemly arranged on the genome, were also found. There is only one operon for the ABC-type transporter for either nitrate (NrtA-B-C-D) or bicarbonate (CmpA-B-C-D).

Photosynthesis-related genes

Photosystem components

Almost all photosynthesis genes were detected in A. platensis (Supplementary Table S11). For photosystem II, the reaction centre D1 protein is encoded by at least three identical copies and one divergent copy of psbA. Additional variant genes include cytochrome c550-like (NIES39_J05860) and psb28-2 (NIES39_A01290). For photosystem I, all known genes including psaX were detected. Genes for cytochrome b6/f, ATP synthase, and NDH were detected as in other cyanobacteria. We detected two copies of cytochrome c6, but no typical electron carrier proteins in the thylakoid lumen such as plastocyanin and cytochrome cM.

Genes for tetrapyrrole biosynthesis

One set of single-copy genes involved in tetrapyrrole biosynthesis was identified in the A. platensis genome. Almost all genes involved in porphyrin biosynthesis were assigned. For siroheme and heme biosynthesis, uroporphyrinogen-III C-methyltransferase (NIES39_H00180) and two types of ferrochelatase (HemH: NIES39_K00630 and NIES39_M01510) were also identified. The latter encodes the so-called ferrochelatase-like protein possessing a characteristic poly-His sequence at the C-terminus, which is also found in some cyanobacteria. For heme metabolism and subsequent phycobilin biosynthesis, heme oxygenase (HO: NIES39_Q00930), heme o synthase (NIES39_O03120), protoheme IX farnesyltransferase (NIES39_E02030) and biliverdin reductase (BvdR: NIES39_B00110) were assigned. For chlorophyll a biosynthesis, all genes were identified as single-copy genes. As in other cyanobacteria, isoforms catalyzing the same reaction, i.e. aerobic (HemF: NIES39_D01090) and anaerobic (HemN: NIES39_E03410) coproporphyrinogen III oxidase and light-dependent (Por: NIES39_R00680) and dark-operative (ChlL, ChlN and ChlB: NIES39_L05610, NIES39_L05630 and NIES39_L01300, respectively) protochlorophyllide oxidoreductase, were identified, whereas only the aerobic form of Mg-protoporphyrin IX monomethylester cyclase (AcsF/CrdI/CHL27: NIES39_B00770) was found. These isoforms are assumed to function in adaptation to different oxygen environments.

Genes for biosynthesis of carotenoid, lipid and others

All known cyanobacterial genes involved in carotenoid biosynthesis were found except for beta-carotene ketolase (CrtW or CrtO) in the A. platensis genome. However, 3-hydroxyechinenone, which is produced by ketolase, is reported to bind to an orange carotenoid protein (NIES39_N00720) to regulate energy dissipation from phycobilisome to photosystem II.[92,93] Therefore, a yet unknown type of ketolase may be present in the genome. All known cyanobacterial genes involved in biosynthesis of glycerolipids, fatty acids, lipoic acid, vitamin E, lipopolysaccharides and polyhydroxyalkanoates were assigned. A gene for ω-3 fatty acid desaturase (desB) was not found, which is consistent with the biochemical analysis of A. platensis.[94] The biosynthesis genes for cyanotoxins such as non-ribosomal peptide toxins (microcystins), ribosomal depsipeptide toxins (microviridins), alkaloid toxins (anatoxins and saxitoxins) and urea-derived toxins (cylindrospermopsin) were not present in the genome, which concurs with the long-time use of this organism as a food.

Carboxysome, reactive oxygen species protection and gas vesicles

A. platensis has eight genes for the β-type carboxysome, which converts the accumulated bicarbonate ion to CO2 for ribulose 1,5-bisphosphate carboxylase. These genes are split into two operons (ccmK1-K2 and ccmK3-K4-L-M-N-O). Only one gene (ccmM) encodes carbonic anhydrase for conversion of bicarbonate to CO2. We also detected genes for enzymes that protect against reactive oxygen species, but not for catalase. However, the A. platensis genome harbours many genes for thioredoxin peroxidase, peroxiredoxin and other putative peroxidases including five genes for bacterioferritin co-migratory proteins (NIES39_A03490, NIES39_E01510, NIES39_E02230, NIES39_L02380, NIES39_O06760). The genus Arthrospira possesses gas vesicles for buoyancy of its trichomes, in contrast with the morphologically similar genus Spirulina, which has none.[95] Accordingly, A. platensis has six gas vesicle genes (gvpA-C-N-J, gvpV and gvpW); GvpA is the structural protein that forms the skeleton of the gas vesicle.

Cell surface proteins and motility

Cell surface and extracellular proteins

Concerning the cell surface structure and extracellular proteins, 42 genes were found to encode putative extracellular proteins that contain 191 haemolysin-like Ca2+-binding domains in the A. platensis genome (Table 2). This number is much higher than that in many other cyanobacteria. Fifteen of these genes consist solely of haemolysin-like Ca2+-binding domains, whereas 14 possess an additional SBBP-like domain, which is often found in S-layer-related proteins (i.e. NIES39_B00690, NIES39_C00400, NIES39_D06770, NIES39_D06910, NIES39_D06930, NIES39_D06980, NIES39_D07000, NIES39_D07020, NIES39_Q00770, NIES39_Q00840, NIES39_Q02140, NIES39_Q02160, NIES39_Q02170 and NIES39_Q02190).[96] NIES39_D06770 also possesses the CalX-beta domain which is present in the cytoplasmic domains of CalX Na+/Ca2+ exchangers to expel calcium from cells,[97] and NIES39_Q02140, NIES39_Q02160, NIES39_Q02170 and NIES39_Q02190 possess both the PPC domain found at the C-terminus of secreted bacterial peptidases and the CalX-beta domain. Three other genes (NIES39_C03760, NIES39_F00640 and NIES39_J05690) also possess the PPC domain. Other genes possess domains for cadherin (NIES39_C05180), SCP (NIES39_D00540, NIES39_D06960 and NIES39_L04630), Lipase-GDSL (NIES39_E00490), pro-isomerase (NIES39_L02390) and Chlam-PMP (NIES39_L00320 and NIES39_M01680); NIES39_L00320 also possesses the CalX-beta domain. Although pathogenic haemolysins are well known in Enterobacteria,[98,99] it is interesting that non-toxic A. platensis has a large number of genes that contain the haemolysin-like Ca2+-binding domain. The A. platensis genome has no genes for siderophores, which scavenge iron from the environment.

Gliding motility

The molecular mechanism for gliding motility has been poorly understood in cyanobacteria. The outer surface of gliding cyanobacteria in Oscillatoriaceae consists of a parallel, helically arranged protein array (S-layer).[100,101] The S-layer protein, oscillin, required for gliding motility in Phormidium[102] is conserved in A. platensis (NIES39_A01430, 42% identity). NIES39_A01430 consists of 19 haemolysin-like Ca2+-binding domains. This protein is also homologous to SwmA (27% identity) for swimming motility in unicellular Synechococcus sp. WH 8102.[103,104] Gliding motility of cyanobacteria is reported to be driven by a secretion of mucilage from the junctional pore complex (JPC).[105] The JPC consists of a channel through the outer membrane and the peptidoglycan layer. Although the subunits of JPC have not been identified, comparative genomics analysis may reveal their genes.

Type IV pilus-related proteins

Type IV pili are involved in twitching motility in many bacteria[79,106-108] and have recently been reported to function in gliding motility in N. punctiforme[109] and twitching motility in unicellular Synechocystis.[110] A. platensis exhibits vigorous gliding motility and harbours type IV pilus-related genes (Supplementary Table S11), but shows no twitching motility, suggesting that this organism utilizes type IV pili as machinery for gliding motility like N. punctiforme.[109] Typical pil genes for pili biogenesis are the pilMNOQ operon, pilB1T1C operon, pilD (prepilin leader peptidase) and three prepilin subunit genes (NIES39_C03030, NIES39_C00840 and NIES39_C00850). Additionally, A. platensis harbours the rather unusual pilT2-like (NIES39_A08210) and pilC-like (NIES39_A03250) genes. These copies may be involved in novel regulation of the type IV pili or type II secretion system, which has not yet been studied in cyanobacteria.

Microevolution

Arthrospira platensis NIES-39 harbours many pseudogenes, whose coding regions correspond in fragments to known genes or ORFs (Supplementary Table S12). Many of them are not just accidental mutants resulting from single mutations but were generated by at least two events, frameshift and/or nonsense mutations. An extreme case is the nrsS/R/B/A operon, which is inactivated by at least 27 frameshift mutations and 13 nonsense mutations (Fig. 5). NrsS/R constitute a two-component signal transduction system for nickel sensing and transcriptional regulation, whereas NrsB/A form a cation-efflux pump to remove heavy metals.[111] The extensive degradation of this operon strongly suggests that A. platensis NIES-39 is in the process of microevolution to adapt to its specific location in Lake Chad.
Figure 5

Selective mutations in the nrsS/R/B/A region.

Rapid evolution is also noted in the hsdS gene, which determines the RM specificity (see Section 3.5.1). Since many strains of A. platensis and A. maxima are the target of genome projects and economic applications, the microevolution of their genomes and diversity in phenotype are critical themes for post-genome studies.

Supplementary material

Supplementary Material is available at www.dnaresearch.oxfordjournals.org.