
Genome Analysis of Two Novel Synechococcus Phages That Lack Common Auxiliary Metabolic Genes: Possible Reasons and Ecological Insights by Comparative Analysis of Cyanomyoviruses.

Tong Jiang1, Cui Guo1,2,3, Min Wang1,2,3, Meiwen Wang1, Xinran Zhang1, Yundan Liu1, Yantao Liang1,2,3, Yong Jiang1,2,3, Hui He1,2,3, Hongbing Shao1, Andrew McMinn1,4.   

Abstract

The abundant and widespread unicellular cyanobacterium Synechococcus plays an important role in global phytoplankton primary production. In the present study, two novel cyanomyoviruses, S-N03 and S-H34, that infect Synechococcus MW02 were isolated from the coastal waters of the Yellow Sea. S-N03 contained a 167,069-bp genome comprising double-stranded DNA with a G + C content of 50.1%, 247 potential open reading frames and 1 tRNA; S-H34 contained a 167,040-bp genome with a G + C content of 50.1%, 246 potential open reading frames and 5 tRNAs. These two cyanophages contain fewer auxiliary metabolic genes (AMGs) than other previously isolated cyanophages. S-H34, in particular, is currently the only known cyanomyovirus that does not contain any AMGs related to photosynthesis. The absence of such common AMGs in S-N03 and S-H34, together with their distinct evolutionary history and ecological features, implies that the energy for phage production might be obtained from other sources rather than being strictly dependent on the maintenance of photochemical ATP under high light. Phylogenetic analysis showed that the two isolated cyanophages clustered together and had a close relationship with two other cyanophages of low AMG content. A comparison of genomes, habitats and hosts across 81 representative cyanomyoviruses showed that the cyanomyoviruses with lower AMG content were all Synechococcus phages isolated from eutrophic waters. The relatively small genome size and high G + C content may also relate to the lower AMG content, as suggested by the significant correlation between the number of AMGs and G + C%. Therefore, the lower AMG content of S-N03 and S-H34 might be a result of viral evolution that was likely shaped by habitat, host, and genomic context. The genomic content of AMGs in cyanophages may have adaptive significance and provide clues to their evolution.

Keywords:  AMGs; Myoviridae; cyanophage; genome; photosynthesis

Year:  2020        PMID: 32722486      PMCID: PMC7472177          DOI: 10.3390/v12080800

Source DB:  PubMed          Journal:  Viruses        ISSN: 1999-4915            Impact factor:   5.048


1. Introduction

With cell numbers of up to 10⁶ cells mL⁻¹ in the global oceans, unicellular cyanobacteria are amongst the most abundant photosynthetic organisms on Earth and make major contributions to marine phytoplankton primary production [1,2]. Synechococcus and Prochlorococcus, the two most common marine cyanobacteria, are responsible for roughly one half of marine photosynthesis and are key players in marine biogeochemical cycles [2,3,4]. Cyanophages are viruses that infect cyanobacteria and make up an extremely abundant and genetically diverse component of marine planktonic communities. All known marine cyanophages are tailed double-stranded DNA viruses belonging to three well-defined bacteriophage families, Myoviridae, Podoviridae, and Siphoviridae [5]. Cyanophage infection is responsible for the mortality of a significant proportion of all cyanobacteria [6], regulating both their abundance and diversity. In coastal surface waters, more than 80% of Synechococcus cells were estimated to encounter infectious cyanophages on a daily basis, and ~5% to 7% of cells would become infected by viruses [6]. Viral lysis of host cells channels, or “shunts”, the photosynthetically fixed carbon (particulate organic matter, POM) to the dissolved organic matter (DOM) pool, which can be reused by cyanobacteria, driving recycling [7]. The substances released into the natural environment due to phage infection may further change the function of the environment [8,9]. Some cyanophages can influence marine biogeochemistry by manipulating host cell metabolism during infection. This is accomplished by the expression of auxiliary metabolic genes (AMGs) carried by phages but originating from bacterial cells. The AMGs provide supplemental support to boost host metabolic processes beneficial to phage reproduction and thereby allow an increase in energy production and a more efficient replication of the phage [10,11,12,13].
The proteins encoded by AMGs are found to participate in the host’s photosynthesis [14,15], carbon metabolism [16], nucleotide biosynthesis [10,17] and stress tolerance [18]. Cyanophage AMGs may be a special adaptive physiological mechanism that endows the phages with unique biological characteristics and a close relationship with cyanobacteria. In recent years, some AMGs have been used as marker genes for the detection of phage molecules, genetic diversity and relationships between cyanobacteria and cyanophages [19,20,21]. Among them, the photosynthesis-related genes, such as hli (encoding high-light-induced proteins), psbA (encoding the photosystem II reaction center protein D1) and psbD (encoding the photosystem II reaction center protein D2), are commonly found in the genomes of cyanophages. Currently, all of the isolated cyanomyoviruses with complete genomes carry photosynthesis-related AMGs [11], suggesting that those genes are probably functional and essential under some conditions [22]. Besides their roles during phage infection, the genomic content of AMGs in cyanophages may also have adaptive significance and provide clues to their evolution [23]. Investigations of viral distribution in different sea areas around the world revealed high viral abundances, ranging from ~3 × 10⁶ viruses mL⁻¹ in the deep sea to ~10⁸ viruses mL⁻¹ in productive coastal waters [24]. Meanwhile, viral metagenomic studies have succeeded in discovering a large number of new viral genomes from the marine environment, far exceeding the number of existing viral genomes obtained in pure culture [25]. Compared to the large number of widespread marine viruses, there are still very few pure, cultured phages that have been isolated for research. In this study, two novel cyanophage strains belonging to the Myoviridae (cyanomyoviruses) that infect Synechococcus were successfully isolated and their complete genomes were sequenced.
Their genomes contain only 3–4 AMGs, far fewer than most cyanomyovirus genomes (>10 AMGs) [11]. Moreover, the two new phages do not contain the highly prevalent photosynthesis-related genes (i.e., hli, psbA, and psbD).

2. Materials and Methods

2.1. Cyanophage Isolation

The cyanophages S-N03 and S-H34 were isolated by liquid serial dilution from concentrated surface seawater samples collected from coastal sites (N03, 37°30.025′ N, 123°02.315′ E and H34, 32°59.98′ N, 123°59.77′ E) in the Yellow Sea. The seawater was collected and sequentially filtered through polycarbonate membranes of 3 μm (Isopore™ 3.0 μm TSTP; Merck, Ireland) and 0.2 μm (Isopore™ 0.2 μm GTTP; Merck, Ireland). The filtrates were then passed through a 50 kDa cartridge (Pellicon® XL Cassette, Biomax® 50 kDa; polyethersulfone, Millipore Corporation, Billerica, MA, USA) and concentrated by tangential flow filtration to 300 times the initial viral concentration. The viral concentrate was stored at 4 °C in the dark [26,27]. The host of the cyanophages is Synechococcus sp. strain MW02 (NCBI accession number KP113680). The algal culture was grown in conical flasks with f/2 seawater medium under a constant illumination of approximately 25 µmol m⁻² s⁻¹ at 25 °C in a 12-h/12-h light-dark cycle [27]. The phage enrichment was performed by adding the viral-concentrated seawater to the exponentially growing host Synechococcus at a ratio of 1:9. The phage-host suspension was incubated under a constant irradiance of 25 µmol m⁻² s⁻¹ at 25 °C for about 1 week until lysed host cells were observed according to the color and turbidity of the lysate. A control group was set up in parallel by replacing the viral solution with the medium [28]. The cyanophage lysates were then filtered through a 0.22 μm pore size membrane (Millex®-GP 0.22 μm PES; Merck, Ireland). The infection was repeated three times. The filtrate was stored at 4 °C in the dark for further tests [29].

2.2. Phage Purification

Phage purification was performed using the serial dilution method, as described previously [27]. Briefly, infectivity was tested across serially diluted phage samples (10-fold dilutions over 7 orders of magnitude). The most diluted phage sample that induced host lysis was used for another round of serial dilution and infection tests. After three rounds of purification, a pure lysate with a single phage strain was theoretically produced [28]. The cyanophage was then concentrated using Amicon® Ultra 15 with a 30-kDa ultra-PL membrane (Merck, Ireland) [30]. Further purification was performed by sucrose density gradient centrifugation [27].

2.3. Host Range

The infectivity of cyanophages S-N03 and S-H34 was tested using nine Synechococcus strains: WH7803, WH8102, MW02, MW03, LTWRed, LTWGreen, PSHK05, CCMP1333 and PCC7002 (Table S1). The viral solution was added to each host Synechococcus culture in logarithmic growth phase at a volume ratio of 1:9, in triplicate. The viral solution was replaced by the medium in the control group. The mixtures were incubated under the same conditions described above. Cell lysis was monitored and compared in the control and viral solution groups every day for two weeks to examine the infectivity.

2.4. Morphological Study by Transmission Electron Microscopy

A 20 μL aliquot of each purified phage suspension was placed onto a 200-mesh copper grid and stained by adding a drop of 1% (w/v) phosphotungstic acid (pH 7.2) for 10 min [29]. The grids were examined using a transmission electron microscope (JEOL JEM-1200EX, Japan) at 100 kV to reveal the cyanophage structural characteristics and dimensions [31].

2.5. Genome Sequencing and Assembly

Phage DNA was extracted from the sucrose density gradient-purified phages using a TIANamp Virus DNA Kit (TIANGEN) [27]. A total of 1 µg DNA per sample was used as input for the DNA sample preparations. Sequencing libraries were generated using NEBNext® Ultra™ DNA Library Prep Kit for Illumina (NEB, USA). The whole genomes of S-N03 and S-H34 were sequenced using Illumina NovaSeq PE150 by an ABI 3730 automated DNA sequencer. Reads containing >40% low-quality bases (quality value ≤20), >10% N content, or an adapter overlap of >15 bp with fewer than 3 mismatches were removed. The reads were assembled with the SOAPdenovo [32], SPAdes [33], and ABySS [34] software packages. The assemblies from the three software packages were then integrated with CISA software to select the one with the fewest scaffolds. GapCloser and GapFiller were used to fill the assembly gaps [27,35].
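
The read-filtering criteria above can be illustrated with a minimal sketch. It is not the pipeline used in the study: the function names are hypothetical, Phred+33 quality encoding is assumed, and the adapter-overlap criterion is left out because the adapter sequences are not given in the text.

```python
def low_quality_fraction(quality: str, threshold: int = 20, offset: int = 33) -> float:
    """Fraction of bases with Phred score <= threshold (Phred+33 encoding is an assumption)."""
    if not quality:
        return 0.0
    low = sum(1 for ch in quality if ord(ch) - offset <= threshold)
    return low / len(quality)


def n_fraction(sequence: str) -> float:
    """Fraction of bases called as N."""
    if not sequence:
        return 0.0
    return sequence.upper().count("N") / len(sequence)


def keep_read(sequence: str, quality: str,
              max_low_quality: float = 0.40, max_n: float = 0.10) -> bool:
    """Keep a read unless it has >40% low-quality bases or >10% N content.

    The adapter-overlap criterion from the text is deliberately not implemented here.
    """
    if not sequence:
        return False  # empty reads carry no information (choice made for this sketch)
    return (low_quality_fraction(quality) <= max_low_quality
            and n_fraction(sequence) <= max_n)
```

For example, a 10-base read with five bases of quality 0 would be discarded, while the same read with four such bases sits exactly at the 40% limit and is kept.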

2.6. Genome Annotation and Analysis

The open reading frames (ORFs) in the genomes of cyanophages S-N03 and S-H34 were predicted using GeneMarkS [36], GLIMMER [37] and RAST (Rapid Annotation using Subsystem Technology) [38]. The predicted ORFs were translated into amino acid sequences and their homologous genes were searched in the NCBI (National Center for Biotechnology Information) non-redundant protein database by BLASTp [39,40]. The protein domains were predicted and analyzed by InterPro [41] and CDD [42]. tRNAscan-SE was used to identify transfer RNA (tRNA) genes [43], and RNAmmer was used to predict ribosomal RNA in the full genome sequence [44]. Genome mapping was performed using DNAPlotter (version 17.0.1). An AMG database was created, which summarized protein sequences from 33 AMGs that were chosen based on prior recognition and extracted from various cyanomyovirus genomes [11,45]. A total of 337 genomes were downloaded from the NCBI database, which included all isolated cyanomyoviruses with complete genomes available at the time of analysis (Table S2). When phages of the same name shared an average nucleotide identity (ANI) greater than 95%, only one representative genome was kept. Finally, 81 representative genomes of cyanophages were selected. An ORF was assigned to the corresponding AMG when the BLASTp E-value was ≤10⁻⁵, the sequence identity ≥35%, and the query coverage ≥60% [11]. The genome sequences of S-N03 and S-H34 were deposited in the GenBank database under accession numbers MT162466 and MT162467, respectively.
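
The three AMG assignment thresholds can be expressed as a simple filter over BLASTp hits. The sketch below is illustrative only: the `BlastHit` record, its field names, and the rule of keeping the lowest-E-value hit per ORF are assumptions made for this example, not details taken from the study.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class BlastHit:
    """One BLASTp hit of a phage ORF against the AMG protein database (hypothetical record)."""
    orf_id: str
    amg_name: str
    evalue: float
    percent_identity: float   # 0-100
    query_coverage: float     # 0-100


def passes_amg_thresholds(hit: BlastHit, max_evalue: float = 1e-5,
                          min_identity: float = 35.0,
                          min_query_coverage: float = 60.0) -> bool:
    """Apply the thresholds stated in the text: E-value, identity, query coverage."""
    return (hit.evalue <= max_evalue
            and hit.percent_identity >= min_identity
            and hit.query_coverage >= min_query_coverage)


def assign_amgs(hits: Iterable[BlastHit]) -> Dict[str, str]:
    """Map each ORF to one AMG, keeping the passing hit with the lowest E-value.

    The tie-breaking rule is an assumption of this sketch.
    """
    best: Dict[str, BlastHit] = {}
    for hit in hits:
        if not passes_amg_thresholds(hit):
            continue
        current = best.get(hit.orf_id)
        if current is None or hit.evalue < current.evalue:
            best[hit.orf_id] = hit
    return {orf_id: hit.amg_name for orf_id, hit in best.items()}
```

A hit with 30% identity, for instance, is rejected even if its E-value is very small, because all three conditions must hold at once.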

2.7. Comparative Analysis of Cyanophage Genomes

The ViPTree server was used to generate a proteomic tree based on the genome-wide sequence similarities computed by tBLASTx [46,47]. All related viruses contained in the Virus-Host Database were used to establish a circular tree [48]. The 37 closest phages in the circular tree were then selected to establish a rectangular tree with phages S-N03 and S-H34 for subsequent comparison and analysis. The genome sequences of S-N03 and S-H34 were compared with that of phage S-B68 by tBLASTx using ViPTree. Meanwhile, phylogenetic analysis with other related phages was carried out using the amino acid sequences of DNA polymerase and terminase large subunit, aligned with the ClustalW program. The maximum-likelihood (ML) phylogenetic tree was constructed with the genetic analysis software MEGA (Version 7.0.18) [49,50]. The bootstrap values were based on 1000 replicates. The average nucleotide identity (ANI) was calculated using OrthoANI (Average Nucleotide Identity by Orthology) [51] and JSpeciesWS Online Service [52].

3. Results and Discussion

3.1. Host Range and Phage Morphology

The host of cyanophages S-N03 and S-H34 is the PE-type (phycoerythrin-only) Synechococcus sp. strain MW02, which belongs to subcluster 5.1 clade IX and was originally isolated from Hong Kong estuarine waters [53]. The cross-infectivity test showed that both S-N03 and S-H34 infected three other PE-type Synechococcus strains belonging to subcluster 5.1 clades II and V and subcluster 5.2 (Table S1). Transmission electron microscopy showed that S-N03 and S-H34 displayed icosahedral heads of 97 and 88 nm in diameter and contractile tails of 138 and 129 nm in length, respectively. Their sizes are within the range of the previously isolated cyanomyoviruses (Figure S1).

3.2. General Genomic Features

Cyanophages S-N03 and S-H34 both contain a circular double-stranded DNA genome, as revealed by terminal analysis, which showed no protruding cohesive ends. The genome sizes of S-N03 and S-H34 are 167,069 bp and 167,040 bp, which are the sixth and seventh smallest genomes among the 81 representative cyanomyoviruses (Table 1). The G + C content of both the S-N03 and S-H34 genomes is 50.1%, making them two of only four published cyanomyovirus genomes (S-B68, S-CBWM1, S-N03 and S-H34) with G + C contents close to 50%. The G + C content of Prochlorococcus phages (34.3–40.7%) is generally lower than that of Synechococcus phages (35.4–51.7%). In the 81 representative cyanomyoviruses, about 71.4% of Prochlorococcus phages have G + C contents of less than 38.1%, while most of the Synechococcus phages have G + C contents between 38% and 45% and only 11.1% have G + C contents of less than 38.1% (Table 1). Studies have shown that the genomes of some entities that depend on a host for survival, such as bacteria, phages, and plasmids, are often rich in A + T, that is, their G + C content is low. This may be due to the differential cost of related metabolites in the cell and the limited availability of G and C relative to A and T/U [54]. Neutral bias can also explain the higher A + T content of phages, because the depletion of host bacterial resources may result in the systematic insertion of the more abundant A and T nucleotides [54]. Moreover, the G + C content of a phage may also be affected by that of its host. A positive correlation has been observed between the G + C content of bacteriophages and that of their hosts [55]. In this study, the G + C contents of Synechococcus marinus WH8102 and WH7803, which can be infected by S-H34 and S-N03, were 59.2% and 60.2%, respectively (the whole genome of their host MW02 has not been published), which is at the high end of the G + C content range of marine Synechococcus (~50–60% G + C content) [56].
Such high G + C content of a host might be associated with the high G + C content of the cyanophages, although more evidence is needed to prove this point. Thus, we inferred that the higher G + C content implied that S-N03 and S-H34 may have experienced independent evolutionary routes compared to the cyanophages of lower G + C values and evolved specific genomic traits that adapted to their hosts and their surrounding environments.
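
Since G + C content is the central quantity in this comparison, a minimal sketch of how it can be computed from a nucleotide sequence may help. The function name is hypothetical, and excluding ambiguous bases (such as N) from the denominator is a choice made for this example, not a convention stated in the study.

```python
def gc_content(sequence: str) -> float:
    """Return G + C content as a percentage of unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are ignored; this is an assumption of the sketch.
    """
    seq = sequence.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (counts["G"] + counts["C"]) / total
```

Applied to a complete genome sequence, the same calculation would yield genome-wide values of the kind reported in Table 1.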
Table 1

General genomic features of 81 representative cyanomyoviruses.

Host | Phage | Genome Size (bp) | GC (%) | AMG | tRNA | NCBI Taxonomy ID | No. Isolates | Accession | RefSeq Accession | Ref. | Submission Date
Synechococcus S-PM2196,28037.810252388542AJ630128.1NC_006820.1[58]2004
Syn9177,30040.61863823591DQ149023.2NC_008296.2[59]2005
S-RSM4194,45441.118125553871FM207411.1NC_013085.1[60]2008
Syn1191,19540.61464448611GU071105.1NC_015288.1[45]2009
Syn19175,23041.31864456841GU071106.1NC_015286.1[45]2009
Syn33174,28539.61654448781GU071108.1NC_015285.1[45]2009
S-SM1174,07941.12264448591GU071094.1NC_015282.1[45]2009
S-SM2190,78940.422114448601GU071095.1NC_015279.1[45]2009
S-ShM2179,56341.11414456831GU071096.1NC_015281.1[45]2009
S-SSM5176,184402144456851GU071097.1NC_015289.1[45]2009
S-SSM7232,87839.12154456861GU071098.1NC_015287.1[45]2009
Syn2175,59641.31865364731HQ634190.1--2010
Syn10177,10340.61765364721HQ634191.1--2010
Syn30178,80739.92065364741HQ634189.1NC_021072.1-2010
S-SSM2179,98041.11415364641JF974292.1--2010
S-SSM4182,80139.41935364661HQ316583.1NC_020875.1-2010
S-SSM6a232,88339.12056826501HQ317391.1--2010
S-SSM6b182,36839.41936826511HQ316603.1--2010
S-CAM1198,013431787540376HQ634177.1NC_020837.1-2010
S-CAM8171,40739.31957540382HQ634178.1NC_021530.1-2010
S-CRM01178,56339.773410269551HQ615693.1NC_015569.1[61]2010
S-RIM8171,21140.616886972413JF974288.1NC_020486.1[62]2010
S-SKS1208,0073616127540421HQ633071.1NC_020851.1-2010
KBS-M-1A171,74440.61688899501JF974293.1--2010
S-IOM18171,79740.61577540391HQ317383.1NC_021536.1-2010
metaG-MbCM1172,87939.818210799991JN371769.1NC_019443.1-2010
S-TIM5161,44040.5151011377451JQ245707.1NC_019516.1[63]2011
ACG-2014c176,04339.120410799985JN371768.1NC_019444.1[64]2011
ACG-2014a171,28239.4185149350724KJ019026.1-[65]2013
ACG-2014b172,68839.1195149350818KJ019134.1NC_027130.1[65]2013
ACG-2014d179,11040.3183149350945KJ019136.1NC_026923.1[65]2013
ACG-2014e189,41838.917814935103KJ019156.1NC_026928.1[65]2013
ACG-2014f228,14341.6132149351141KJ019059.1NC_026927.1[65]2013
ACG-2014g174,88539.317514935121KJ019071.1NC_026924.1-2013
ACG-2014h189,31140.516713408101KF156338.1NC_023587.1[64]2013
ACG-2014i190,7683916814935131KJ019082.1NC_027132.1[65]2013
ACG-2014j192,10838.616714935142KJ019069.1NC_026926.1[65]2013
S-MbCM25176,04439.120413408111KF156339.1-[64]2013
S-MbCM100170,43839.417513408121KF156340.1NC_023584.1[64]2013
S-RIM2175,43042.214686966262HQ317292.1NC_020859.1[66]2016
S-RIM12173,82139.6175127840221KX349307.1-[66]2016
S-RIM14179,55841.114112784239KX349298.1-[66]2016
S-RIM32194,43739.9141112784791KU594606.1NC_031235.1[11]2016
S-RIM44197,62940.319512784858KX349291.1-[66]2016
S-RIM50174,30740.31686878031KU594605.1NC_031242.1[11]2016
S-CAM3198,19041.6141018833663KU686199.1NC_031906.1[11]2016
S-CAM4191,98338.616818833673KU686201.1NC_031900.1[11]2016
S-CAM7216,12141.25418833682KU686212.1NC_031927.1[11]2016
S-CAM9174,8303916818833693KU686206.1NC_031922.1[11]2016
S-CAM22172,34539.920518833653KU686209.1NC_031903.1[11]2016
S-WAM1185,10244.715418155211KU686210.1NC_031944.1[11]2016
S-WAM2186,38641.3151218155221KU686211.1NC_031935.1[11]2016
S-CBWM1139,06951.633620536531MG450654.1-[67]2017
Bellamy204,93041.1201020239961MF351863.1--2017
S-H35174,23141.215819835721KY945241.1--2017
S-B68163,98251.74425454371MK016664.1--2018
S-B64151,86741.315821639011MH107246.1--2018
S-PRM1144,31140.716821001301MH629685.1--2018
S-T4181,08238.914722685781MH412654.1--2018
S-E7177,62239.919624846391MH920640.1--2018
B3244,93035.442026749781MN695334.1--2019
B23243,63335.442026749771MN695335.1--2019
S-RIM4175,46241.215925301691MK493321.1--2019
S-B43213,99339.4141113408121MN018232.1--2019
S-B05208,85739.9141124846371MK799832.1-[27]2019
S-H34 167,04050.13527189421MT162467.2-This study2020
S-N03 167,06950.14127189431MT162466.1-This study2020
Prochlorococcus P-SSM2252,40135.52212687462AY939844.2NC_006883.2[45]2005
P-SSM4178,24936.71902687471AY940168.2NC_006884.2[45]2005
P-SSM7182,18037.11944456881GU071103.1NC_015290.1[45]2005
P-RSM4176,42837.61934448621GU071099.1NC_015283.1[45]2009
P-HM1181,04437.81704457001GU071101.1NC_015280.1[45]2009
P-HM2183,80638.11504456961GU075905.1NC_015284.1[45]2009
P-RSM1177,21140.21825364441HQ634175.1NC_021071.1-2010
P-RSM3178,75036.71705364461HQ634176.1--2010
P-RSM6192,49739.31839298321HQ634193.1NC_020855.1-2010
P-SSM3179,06336.71605364531HQ337021.1NC_021559.1-2010
P-SSM5252,01335.51915364541HQ632825.1--2010
MED4–213180,97737.81508899561HQ634174.1NC_020845.1-2010
P-TIM40188,63240.717115897331KP211958.1NC_028663.1-2014
P-TIM68197,36134.314015424771KM359505.1NC_028955.1[68]2014
To understand whether there is a relationship between genome size and G + C content in the viral genome, a Spearman correlation analysis was performed based on the 81 representative genomes of cyanophages. The relationship between genome size and G + C content has been studied for bacteria but seldom investigated in viruses [57]. Interestingly, we found a significant negative correlation between genome size and G + C content of cyanomyoviruses (r = −0.34, p < 0.01, Table 2). This is in contrast to bacteria and archaea, which were shown to have positive correlations between G + C values and genome size, but consistent with the result obtained in bacteriophages [55,57]. Such a negative relationship in cyanomyoviruses is still thought to be related to their adaptive evolution. If a phage genome is large and enriched with G + C at the same time, higher energy cost and limited availability of G/C could constrain phage-DNA replication, which does not comply with the life strategy of viruses.
Table 2

Spearman’s rank correlation coefficient between genome size, G + C content and AMGs. *: p < 0.05; **: p < 0.01; n = 81.

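Variable | Statistic | Genome Size | GC% | AMGs
Genome size | Correlation coefficient | 1 | −0.340 ** | 0.001
Genome size | Sig. (p-value) | - | 0.002 | 0.992
GC% | Correlation coefficient | −0.340 ** | 1 | −0.272 *
GC% | Sig. (p-value) | 0.002 | - | 0.014
AMGs | Correlation coefficient | 0.001 | −0.272 * | 1
AMGs | Sig. (p-value) | 0.992 | 0.014 | -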
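
As an illustration of the statistic in Table 2, Spearman's rank correlation can be computed as the Pearson correlation of average ranks. The sketch below uses only the Python standard library and hypothetical function names; the eight (genome size, G + C%) pairs are a small subset of Table 1 chosen for the example, so the resulting coefficient is not expected to match the value reported for all 81 genomes.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation computed on average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Subset of Table 1 (genome size in bp, G + C %): S-PM2, Syn9, S-H34, S-N03,
# P-SSM2, P-TIM68, S-CBWM1, B3. Illustrative only, not the full data set.
genome_size = [196280, 177300, 167040, 167069, 252401, 197361, 139069, 244930]
gc_percent = [37.8, 40.6, 50.1, 50.1, 35.5, 34.3, 51.6, 35.4]
rho_subset = spearman_rho(genome_size, gc_percent)
```

On this subset the coefficient is negative, in the same direction as the trend reported in Table 2; a significance test would require the full data set and is not part of this sketch.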
A total of 247 and 246 potential ORFs were identified in the S-N03 and S-H34 genomes, respectively (Tables S3 and S4). Functional annotation of predicted ORFs against the NCBI non-redundant protein database showed that only 72 (29.15%) were assigned to specific functions in S-N03 (E-value < 10⁻⁵), while the remaining 175 (70.85%) were predicted to encode hypothetical proteins, owing to the incomplete genomic information on cyanophages in the database [69]. Similarly, 70 (28.46%) predicted ORFs were assigned to specific functions in S-H34, while the remaining 176 (71.54%) were predicted to encode hypothetical proteins. All predicted ORFs can be divided into five functional groups: structure (S-N03, 31 ORFs; S-H34, 28 ORFs), packaging (S-N03, 3 ORFs; S-H34, 3 ORFs), DNA replication and regulation (S-N03, 26 ORFs; S-H34, 29 ORFs), hypothetical proteins, and additional functions related to physiological activity (S-N03, 12 ORFs; S-H34, 10 ORFs) (Figure 1A,B).
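
The ORF counts and percentages in this paragraph are internally consistent, which can be checked with a short calculation. The sketch below simply recomputes the reported figures from the counts given in the text; the dictionary layout and function name are hypothetical.

```python
# ORF counts as reported in the text for each phage.
orf_counts = {
    "S-N03": {"total": 247, "annotated": 72,
              "groups": {"structure": 31, "packaging": 3,
                         "DNA replication and regulation": 26,
                         "additional functions": 12}},
    "S-H34": {"total": 246, "annotated": 70,
              "groups": {"structure": 28, "packaging": 3,
                         "DNA replication and regulation": 29,
                         "additional functions": 10}},
}


def summarize(entry: dict) -> dict:
    """Recompute hypothetical-protein counts and percentages from the raw counts."""
    total, annotated = entry["total"], entry["annotated"]
    hypothetical = total - annotated
    return {
        "annotated_pct": round(100 * annotated / total, 2),
        "hypothetical": hypothetical,
        "hypothetical_pct": round(100 * hypothetical / total, 2),
        "group_sum": sum(entry["groups"].values()),
    }


summary = {phage: summarize(entry) for phage, entry in orf_counts.items()}
```

For both phages the four annotated functional groups sum to the number of ORFs with assigned functions (72 and 70), and the remainder corresponds to the hypothetical proteins.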
Figure 1

Genome map and functional annotation of the predicted proteins of cyanophage (A) S-N03 and (B) S-H34. The number next to the arrow indicates the ORF number. (C) Genome-wide comparison of phages S-N03, S-H34 and S-B68.

The functional annotation of phage structural proteins is highly dependent on the sequence similarity to proteins of other phages that were detected in the respective viral particles [70,71]. The putative structural proteins in S-N03 and S-H34 were the baseplate, the tail tube, the tail sheath, the tail fibers, the tail completion proteins and neck proteins. The packaging modules of both S-N03 and S-H34 contain three ORFs, encoding the terminase large subunit, the terminase small subunit and the major capsid protein. The DNA replication and regulation module contained a wide variety of categories, including DNA primase, RNA polymerase, single-stranded DNA binding protein UvsY, endonuclease, DNA polymerase, exonuclease, ribonuclease H, ribonucleoside-diphosphate reductase alpha subunit (NrdA), and ribonucleotide diphosphate reductase beta subunit (NrdB). Among these, NrdA (S-N03: ORF202, S-H34: ORF4) and NrdB (S-N03: ORF201, S-H34: ORF3) are involved in DNA synthesis by converting ribonucleotides into deoxyribonucleotides, and can be found in all organisms [72,73]. In a marine environment with limited phosphorus content, obtaining sufficient free nucleotides is critical for DNA synthesis [72,74,75]. With ribonucleotide reductase (NrdA, NrdB) and thymidylate synthase (S-N03: ORF171, S-H34: ORF215), the rate of DNA synthesis of T4-like phages could be increased 10-fold compared to a system without these enzymes [76]. Additional protein modules are mainly related to metabolism and regulation. In addition to AMGs (detailed results and discussions are shown in Section 3.5), S-N03 and S-H34 also have regulatory genes, such as genes encoding the serine/threonine kinase (PSK) PknB (S-N03: ORF136, S-H34: ORF180), a serine/threonine phosphatase (S-N03: ORF100, S-H34: ORF147) and endolysins. PknB is a typical Ser/Thr kinase, which catalyzes the transfer of the gamma-phosphoryl group of ATP to a Ser/Thr residue of the protein substrate.
It is involved in regulating many biological processes, including purine and pyrimidine biosynthesis, cell wall metabolism, antibiotic resistance, peptidoglycan synthesis, cell division, transcription, stress response and metabolic regulation [77,78,79]. Ser/Thr phosphatases (PSPs) are responsible for the dephosphorylation of phosphoprotein substrates, which is the reverse process of Ser/Thr kinase catalysis. They participate in many cell pathways that regulate cell reproduction and programmed death [80]. The reversible phosphorylation of proteins is accomplished by the opposing activities of kinases and phosphatases [80]. S-N03 and S-H34 also contain an endolysin with an amino acid identity of 51.75% (97% coverage) to that of cyanophage S-B68. Endolysins are enzymes produced by phages. They are responsible for catalyzing the hydrolysis of the peptidoglycan in the bacterial cell wall and rupturing the cell at the end of the virulence cycle [81].

3.3. tRNA Genes

Apart from host-like genes, cyanomyoviruses have also incorporated tRNA genes into their genomes. In this study, only one tRNA gene (Asn) was identified in the genome of S-N03 and five tRNA genes (Tyr, Asp, Val, 2× Asn) were identified in the genome of S-H34 (Table 3). The number and types of tRNA genes are variable in different cyanomyoviruses (Table 1), which is a result of phage-host co-evolution, driven by optimal codon usage [82]. The tRNAs carried in phage genomes match codons highly used by the phage and poorly used by the bacterial host during the infection [83]. They may augment the expression of late phage genes encoding structural proteins, such as phage capsid and tail proteins [56,84,85]. Therefore, the tRNA genes carried by S-N03 and S-H34 may contribute to phage protein synthesis and help the phage to adapt to a particular host or environment. Among the 81 published cyanomyoviruses, the number of tRNA genes in Prochlorococcus phages (0–4) is significantly lower than that in Synechococcus phages (1–36); S-N03 is one of the Synechococcus phages with the fewest tRNA genes. It has been proposed that the number of tRNA genes is closely associated with differences in G + C content between phage and host: more tRNAs may increase the translation efficiency when infecting a host with higher G + C content, and potentially expand the host range while maintaining a relatively lower G + C content in the phage genome [56]. As such, the tradeoff between the G + C content and the occurrence of tRNA genes may result in the relatively low number of tRNAs and the wide host range of S-N03 and S-H34, given their relatively high G + C content (50.1%; Table 1), which is close to that of their hosts (marine Synechococcus of ∼50–60% G + C) [56].
Table 3

tRNA-related information obtained by tRNAscan-SE.

Sequence Name | Position | Length (nt) | tRNA Type | Anticodon | Isotype Model | Isotype Score
Synechococcus phage S-N03
tRNA1 | 101128–101057 | 72 | Asn | GTT | Asn | 98.7
Synechococcus phage S-H34
tRNA1 | 157447–157366 | 82 | Tyr | GTA | Tyr | 70.1
tRNA2 | 155156–155085 | 72 | Asn | GTT | Asn | 98.7
tRNA3 | 155081–155009 | 73 | Asp | GTC | Asp | 66.5
tRNA4 | 154982–154911 | 72 | Asn | GTT | Asn | 98.7
tRNA5 | 154812–154741 | 72 | Val | TAC | Phe | 77.6
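
The link between the anticodons in Table 3 and the codons they read can be sketched as a reverse-complement operation. This is a simplified illustration with hypothetical names: it assumes strict Watson-Crick pairing (wobble pairing is ignored), uses the DNA alphabet as in Table 3, and includes only the four codons needed here from the standard genetic code.

```python
# Complement table for the DNA alphabet used in Table 3.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Only the standard-code codons relevant to Table 3 (partial table, for illustration).
CODON_TO_AMINO_ACID = {"AAC": "Asn", "TAC": "Tyr", "GAC": "Asp", "GTA": "Val"}


def codon_for_anticodon(anticodon: str) -> str:
    """Return the codon read by an anticodon under strict Watson-Crick pairing.

    Codon and anticodon pair antiparallel, so the codon is the reverse
    complement of the anticodon. Wobble pairing is not modeled.
    """
    return anticodon.upper().translate(COMPLEMENT)[::-1]


# Anticodons listed in Table 3 and the amino acid each matching codon encodes.
table3_anticodons = ["GTT", "GTA", "GTC", "TAC"]
decoded = {a: CODON_TO_AMINO_ACID[codon_for_anticodon(a)] for a in table3_anticodons}
```

Under these assumptions the four anticodons map to Asn, Tyr, Asp and Val, in agreement with the tRNA Type column of Table 3.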

3.4. Genome Comparison and Phylogenetic Analysis

Comparative genome analysis was undertaken to reveal the divergence of the nucleotide sequences of S-N03 and S-H34 from other cyanophages. The “proteomic tree”, based on the genome-wide similarities, showed that the phages with the closest phylogenetic relationship to S-N03 and S-H34 are all cyanomyoviruses (Figure 2A,B). S-N03 and S-H34 have the closest genetic relationship with each other, group into the branch that also contains S-B68 and S-CRM01, and are distant from other cyanophages. The phylogenetic analyses of S-N03, S-H34 and other selected dsDNA viruses, based on the DNA polymerase and terminase large subunit sequences using the maximum likelihood (ML) method, showed similar branching positions of S-N03 and S-H34 (Figure 2C,D). They share the highest similarity with each other, with a nucleotide identity of 84.21% (coverage 67.81%). Within this cluster, S-N03 and S-H34 showed lower similarity to S-B68 (identity of 70.54% and 72.18%, respectively) and S-CRM01 (identity of 63.71% and 64.76%, respectively), indicating that S-N03 and S-H34 are novel cyanophages.
Figure 2

Phylogenetic analysis. (A,B) Phylogenetic analysis with other related phages using the genome-wide sequence similarities computed by tBLASTx. These evolutionary trees have no roots. (C,D) Phylogenetic ML tree with other related phages based on the amino acid sequences of (C) DNA polymerase and (D) terminase large subunit. The trees were constructed in MEGA version 7 by the ML method with 1000 bootstrap replicates. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test are shown next to the branches. The scale bar represents 0.1 (C) or 0.05 (D) amino acid substitution per site.

Among these four cyanophage isolates, S-H34, S-N03 and S-B68 are all marine lytic phages and have higher similarities, while S-CRM01 is a freshwater strain. Therefore, the genomes of S-N03 and S-H34 were compared with that of S-B68 (Figure 1C). The three genomes were found to share a high similarity in some proteins encoded by conserved genes, including DNA polymerase, terminase large subunit, and the major capsid protein, with 75.31–93.65% identity at the amino acid level (Figure 1C). The differences among the genomes come mainly from a number of hypothetical proteins. Moreover, S-B68 is distinguished from S-H34 and S-N03 by the metabolic genes it encodes. It could be speculated that these differences arise from the difference in host species, because S-B68 has a different host from S-H34 and S-N03 [86].

3.5. Auxiliary Metabolic Genes (AMGs)

AMGs are commonly found in the genomes of cyanophages. Among 81 cyanomyoviruses with available complete genomes, 92.6% are found to contain more than 5 AMGs (Figure 3). However, the newly isolated cyanophage S-N03 contains only 4 AMGs (hsp, MazG, ptoX, phoH) and S-H34 contains only 3 AMGs (hsp, MazG, phoH). The latter is the isolated phage with the fewest AMGs so far (Figure 3). All of the AMGs found in the S-N03 and S-H34 genomes are highly conserved genes among cyanophages.
Figure 3

A heat map of gene copy number matrix for 33 auxiliary metabolic genes (AMGs) across 81 representative genomes of isolated cyanomyoviruses with available complete genome in NCBI. The AMG content was listed in an ascending order. The names of the cyanophages are colored separately by the originally isolated genus of the host: Synechococcus is red and Prochlorococcus is blue.
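A gene-by-genome matrix of this kind can be tallied as in the following minimal Python sketch. It uses only the AMG sets stated in the text for S-N03 and S-H34, and the set of photosynthesis-related genes named in Section 3.5.5. Copy numbers are not given in this section, so each listed gene is assumed to be present in a single copy, and all variable names are illustrative.

```python
# AMG sets as stated in the text for the two new isolates.
amg_sets = {
    "S-N03": {"hsp", "MazG", "ptoX", "phoH"},
    "S-H34": {"hsp", "MazG", "phoH"},
}

# Photosynthesis-related AMGs named in the text.
photosynthesis_amgs = {"psbA", "psbD", "hli", "ptoX"}

# Column order of the matrix: every gene seen in any genome or in the
# photosynthesis set, sorted for a stable layout.
genes = sorted(set().union(*amg_sets.values()) | photosynthesis_amgs)

# Presence/absence matrix (1 = present, assumed single copy; 0 = absent).
matrix = {
    phage: [1 if gene in present else 0 for gene in genes]
    for phage, present in amg_sets.items()
}

# Row sums give the AMG content used to order the heat map.
amg_content = {phage: sum(row) for phage, row in matrix.items()}

# Photosynthesis-related AMGs carried by each genome.
photo_content = {
    phage: present & photosynthesis_amgs for phage, present in amg_sets.items()
}

# List genomes in ascending order of AMG content, as in the figure.
for phage in sorted(amg_content, key=amg_content.get):
    print(phage, amg_content[phage], sorted(photo_content[phage]))
```

With a full matrix of 81 genomes and 33 genes, the same row sums and column sums would give the per-genome AMG content and the per-gene prevalence discussed in the following subsections.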

3.5.1. MazG Gene (Pyrophosphatase)

MazG protein, a pyrophosphatase, is known as a regulator of nutrient stress and programmed cell death in E. coli [20]. The phage-encoded MazG was proposed to regulate the cellular level of ppGpp and, therefore, to affect transcription and translation in the host and extend the period of cell survival under the stress of phage infection [59,87]. However, a recent study showed that the purified cyanophage S-PM2 MazG has no binding or hydrolysis activity towards (p)ppGpp [88]. Instead, dGTP and dCTP seem to be the preferred substrates for this protein, and the affinity of the viral MazG for dGTP and dCTP is higher than that of its host counterparts. This may partially explain why the G + C content of cyanophage genomes (37.7%) is lower than that of the Synechococcus host genomes (60.2%), and it is consistent with preferential hydrolysis of deoxyribonucleotides in the host Synechococcus genome of high G + C content [88]. However, whether such a mechanism applies to cyanophages whose genomes have a G + C content similar to that of their hosts, such as S-H34 and S-N03 in this study, has yet to be determined. MazG is a highly conserved gene in cyanopodoviruses and cyanomyoviruses that infect Synechococcus. Only S-TIM5 and S-CBWM1 lack MazG among the 81 cyanomyoviruses examined (Figure 3). Previous research used the pyrophosphate nucleotide hydrolase gene MazG to show that cyanophages are globally distributed. Despite the widespread presence of the MazG gene in cyanophages, they have a small effective population size, indicative of rapid lateral gene transfer [20]. The phylogenetic trees based on the MazG gene from previous studies showed that Prochlorococcus and Synechococcus phage MazG genes do not cluster with their hosts’ MazG, suggesting that this gene may not have been obtained from the host but acquired by lateral gene transfer from other sources [20,88].

3.5.2. phoH Gene (P-Starvation Inducible Protein)

As the most prevalent phosphate-regulating gene in the genomes of cyanophages, phoH is present in 80 of the 81 cyanomyoviruses examined (all except S-SKS1, Figure 3). Although its function is still unclear, it has been used as a molecular marker for describing viral diversity because of its universality [21,89]. In this study, the host of cyanophages S-N03 and S-H34, Synechococcus sp. strain MW02, was isolated from a Hong Kong estuarine site (affected by the Pearl River outflow) where phosphorus limitation is usually present [53,90]. As such, these genes may play a role in regulating the phosphate uptake of the hosts from the environment.
Figure 4

(A) A histogram showing the distribution of AMG content in 81 cyanomyoviruses from different types of habitat (coastal areas, open oceans and lakes). (B) The isolation sites of the published cyanophages with fewer than 10 AMGs.

3.5.3. Hsp Gene (Heat Shock Protein)

Heat shock proteins (Hsps) are groups of proteins that are induced in response to physical and chemical environmental stresses. They facilitate cellular recovery from stress-induced damage by participating in protein translocation, refolding and degradation, and are known as “molecular chaperones” [91]. Most of the heat shock proteins found in bacteriophages are small (sHsps), which can suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits [92,93]. Specifically, the heat shock protein family in phages might be important for scaffolding during maturation of the capsid [45]. Only 4 of the 81 cyanophages (S-CBWM1, B2, B23 and Syn10) do not contain the Hsp gene (Figure 3). A previous study has shown that cyanophage sHsps form a monophyletic clade that is phylogenetically closer to bacteria than to cyanobacteria, while the host cyanobacterial sHsp sequences form a monophyletic clade closer to plants [93]. Such phylogenetic relationships point to horizontal gene transfer events that probably occurred millions of years ago. This means that the cyanophage sHsp gene has evolved independently of and differently from that of its actual host cyanobacteria, but it still co-evolved with the host cyanobacteria in the pseudolysogenic or lysogenic stage [93].

3.5.4. ptoX Gene (Plastoquinol Terminal Oxidase)

PTOX is an enzyme that mediates the electron flow from plastoquinol to oxygen. It exerts a variety of effects on the development and functioning of plant chloroplasts, including carotenoid biosynthesis, photoprotection and chlororespiration [60,94]. It does not exist in all photosynthetic organisms, but it is widely distributed among different strains of cyanobacteria [95]. The ptoX genes are also widespread among marine cyanomyoviruses (Figure 3). By carrying the ptoX gene, the cyanophage may have another way of preventing photodamage in addition to the psbA route [60]. PTOX can oxidize the plastoquinol produced by chloroplast NAD(P)H quinone oxidoreductase, a process called chlororespiration. In the phage genome, PTOX is often arranged adjacent to NAD(P)H quinone oxidoreductase [95]. In the genome of S-N03, NAD(P)H oxidoreductase appears upstream of PTOX (Figure 1A, Table S3). This indicates that NAD(P)H oxidoreductase and PTOX represent a functional unit in these cyanobacteria and may form a transcription unit. The phylogeny of the PTOX protein of the host and the cyanophage implies that, although the ptoX genes of the Synechococcus and Prochlorococcus hosts and of the cyanophages may share a common ancestor, they have evolved independently since then [60].

3.5.5. Lack of Photosynthetic AMGs

The prevalence of AMGs in the 81 sequenced cyanomyoviruses of Synechococcus and Prochlorococcus (including the 2 strains in this study) is shown in Figure 3. It shows that all the sequenced cyanomyoviruses except S-H34 carry at least one photosynthesis-related AMG. Some of the genes, such as psbA, psbD, and hli, are common in the genomes of cyanomyoviruses. For example, 99% of the phage genomes contain hli, 95% contain psbA, and 76% contain both psbA and psbD. However, unlike all other known cyanomyoviruses, the genome of the novel strain S-H34 does not contain any photosynthesis-related AMGs. The novel strain S-N03 contains only one photosynthesis-related AMG, ptoX (plastoquinol terminal oxidase), and lacks the common psbA, psbD, and hli. The D1 and D2 proteins encoded by the psbA and psbD genes are core reaction-center proteins in photosystem II and participate in photochemical reactions. The D1 protein produced by the host turns over rapidly under high light and declines during phage infection [12,86,96]. Therefore, the expression of the phage psbA gene during infection, as confirmed in previous studies, can bolster the host’s photosynthesis [12]. The high light-induced protein encoded by hli plays an important role in preventing cellular light damage by redirecting excessive light energy and protecting the photosynthetic apparatus [97]. By supplementing host photosynthesis, the phage photosynthetic AMGs ensure the energy required for maximum phage production and thus enhance phage fitness [14]. It has been suggested that some phage photosynthetic AMGs (i.e., psbA and hli) have become an integral part of the phage genome, as they are co-transcribed with the essential, highly expressed phage capsid genes surrounding the photosynthesis genes [12]. As such, the absence of the common photosynthetic AMGs in S-N03 and S-H34 implies a distinct evolutionary route.
It also suggests that the energy for morphogenesis during phage production might be obtained from sources other than those strictly dependent on the maintenance of photochemical ATP under high light. It has been suggested that the number of photosynthesis AMGs (i.e., psbA and hli) and the optimal combination required by the phage may be determined by the light level [92]. The cyanophage fitness enhancement conferred by the phage photosynthesis genes only occurs under a certain range of high light [86,98]. For example, a novel agent-based model shows that the phage photosynthesis genes are not necessary at a depth of 30 m, and the optimal photosynthesis gene combination in the phage was simplified to 0 psbA and 1 hli at a depth of 120 m [98]. In addition, the length of the latent period of infection has also been speculated to determine the presence or absence of psbA in a phage genome [99]. Therefore, the distinct genomic feature of S-H34, the absence of all photosynthetic AMGs, might be the result of environmental adaptation and/or its own physiological characteristics.

3.5.6. Low AMG Contents in S-H34 and S-N03

Of the known cyanomyoviruses, fewer than 10% carry ≤10 AMGs in their genomes (Figure 3). S-H34 is one of the strains with by far the fewest AMGs (3 AMGs, equal to that of S-CBWM1). Intriguingly, the phylogenetic analysis showed that S-N03 and S-H34 have the closest relationships with S-B68 and S-CRM01 (Figure 2B), which also have few AMGs (4 AMGs in S-B68 and 7 AMGs in S-CRM01). This suggests that AMG content may be related to the genetic relationship. However, large variations in AMG content among phylogenetically closely related cyanomyovirus genomes have also been demonstrated in previous studies [11,27]. It has been suggested that both vertical and horizontal evolution determine AMG content: the AMGs that are highly conserved across cyanomyoviruses are likely maintained by vertical inheritance, while the occasional AMGs may result from horizontal evolution under different selection pressures such as environmental conditions and host type. To investigate the distribution of AMG content in different environments, the cyanophages were divided according to their location type (coastal, open ocean and lake) and plotted against their corresponding number of AMGs (Figure 4A). Although no significant difference in AMG number was identified between coastal and open ocean regions, the cyanophages with low AMG content were all isolated from relatively eutrophic areas such as coastal regions and lakes (Figure 4A). Further analysis of the six published cyanomyoviruses with fewer than 10 AMGs showed that these phages were all isolated from mid-latitude regions of 30–40°N (Figure 4B), and their hosts were all Synechococcus (Table 4). Compared with cyanomyoviruses isolated from Prochlorococcus, the genomes of cyanomyoviruses isolated from Synechococcus had a lower AMG content (Figure 3, Table 1). The data presented here are consistent with previous studies that also proposed an association of AMG content with location and host genus [11].
Moreover, neither S-N03 nor S-H34 showed strict host specificity, and both were able to infect other Synechococcus strains besides their host Synechococcus MW02, which is consistent with the speculation that an expansion of the host range may also be accompanied by a decrease in AMG content in some cases [11].
Table 4

Information of cyanophages with AMG content less than 10.

| Phage | AMGs | Genome Size (bp) | GC (%) | Isolation Location | Host Name (Syn) | Host Isolation | Accession |
|---|---|---|---|---|---|---|---|
| S-N03 | 3 | 167,069 | 50.1 | Yellow Sea, China | MW02 | estuary | MT162466 |
| S-H34 | 3 | 167,040 | 50.1 | Yellow Sea, China | MW02 | estuary | MT162467.2 |
| S-B68 | 4 | 163,982 | 51.7 | Bohai Sea, China | WH7803 | marine | MK016664.1 |
| S-CBWM1 | 4 | 139,069 | 51.6 | Chesapeake Bay, USA | CBW1002 | estuary | MG450654.1 |
| S-CAM7 | 5 | 216,121 | 41.2 | Crystal Cove, CA | WH7803 | marine | NC_031927.1 |
| S-CRM01 | 7 | 178,563 | 39.7 | Copco Reservoir, Klamath River, CA | LC16 | freshwater | NC_015569.1 |

The low number of AMGs in the two phages may also be related to their genomic features of small genome size and high G + C content. Correlation analyses of the number of AMGs, genome size and G + C values of the 81 representative cyanomyoviruses yielded a significant negative correlation between the number of AMGs and GC% (r = −0.272, p < 0.05, Table 2). Although the AMG content did not correlate with genome size in our dataset, the fixation of newly acquired genes in a viral genome usually comes at the cost of a larger genome size [10]. Collectively, the lower AMG content of S-N03 and S-H34 might be a result of viral evolution that was likely shaped by the habitat (eutrophic seawater), host type and range (Synechococcus phages with a relatively wide host range), and genomic features (small genome size and high G + C content). However, more evidence is still needed to elucidate the regulation of AMG type and content in cyanophages.
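The kind of correlation analysis described above can be sketched in a few lines of Python. The example assumes a Pearson coefficient (the coefficient type is not stated in this section) and uses only the six low-AMG phages listed in Table 4, because the full 81-genome dataset is not reproduced here. The resulting value is therefore not comparable with the reported r = −0.272 and is shown only to illustrate the calculation.

```python
import math

# (phage, number of AMGs, G + C %) as listed in Table 4.
table4 = [
    ("S-N03", 3, 50.1),
    ("S-H34", 3, 50.1),
    ("S-B68", 4, 51.7),
    ("S-CBWM1", 4, 51.6),
    ("S-CAM7", 5, 41.2),
    ("S-CRM01", 7, 39.7),
]


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


amg_counts = [row[1] for row in table4]
gc_percent = [row[2] for row in table4]

r = pearson_r(amg_counts, gc_percent)
print(f"Pearson r for the six Table 4 phages = {r:.3f}")
```

A significance test on so few points would carry little weight, which is why the sketch stops at the coefficient itself.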

4. Conclusions

In this study, two novel Synechococcus phages, S-N03 and S-H34, were isolated, and their complete genomes were sequenced and analyzed. Both phages have relatively small genomes with high G + C content. They carry fewer AMGs than other cyanomyoviruses and lack the common photosynthesis-related genes, which implies that their distinct evolutionary routes were shaped by habitat type and host preference, and gives clues to their likely ecological functions. Because of the limited information on genes and proteins in the cyanobacterial gene database, isolation and sequencing of more cyanophages from different environments are urgently needed. Cyanophage genomic information can contribute to further research on the interactions between cyanophages and their hosts in aquatic environments, and provide insights into viral adaptive evolution and ecological functions.