Matthew B Sullivan, Bryan Krastins, Jennifer L Hughes, Libusha Kelly, Michael Chase, David Sarracino, Sallie W Chisholm.
Abstract
Prochlorococcus, an abundant phototroph in the oceans, is infected by members of three families of viruses: myo-, podo- and siphoviruses. Genomes of myo- and podoviruses isolated on Prochlorococcus contain DNA replication machinery and virion structural genes homologous to those from coliphages T4 and T7 respectively. They also contain a suite of genes of cyanobacterial origin, most notably photosynthesis genes, which are expressed during infection and appear integral to the evolutionary trajectory of both host and phage. Here we present the first genome of a cyanobacterial siphovirus, P-SS2, which was isolated from Atlantic slope waters using a Prochlorococcus host (MIT9313). The P-SS2 genome is larger than, and considerably divergent from, previously sequenced siphoviruses. It appears most closely related to lambdoid siphoviruses, with which it shares 13 functional homologues. The approximately 108 kb P-SS2 genome encodes 131 predicted proteins and notably lacks the photosynthesis genes that have consistently been found in other marine cyanophages, but does contain 14 other cyanobacterial homologues. While only six structural proteins were identified from the genome sequence, 35 proteins were detected experimentally; these mapped onto capsid and tail structural modules in the genome. P-SS2 is potentially capable of integration into its host, as inferred from bioinformatically identified genetic machinery (int, bet and exo) and a 53 bp attachment site. The host attachment site appears to be a genomic island that is tied to insertion sequence (IS) activity that could facilitate mobility of a gene involved in the nitrogen-stress response. The homologous region and a secondary IS-element hot-spot in Synechococcus RS9917 are further evidence of IS-mediated genome evolution coincident with a probable relic prophage integration event.
This siphovirus genome provides a glimpse into the biology of a deep-photic zone phage as well as the ocean cyanobacterial prophage and IS element 'mobilome'.
Year: 2009 PMID: 19840100 PMCID: PMC2784084 DOI: 10.1111/j.1462-2920.2009.02081.x
Source DB: PubMed Journal: Environ Microbiol ISSN: 1462-2912 Impact factor: 5.491
Fig. 1. The morphology (A) and genome/structural proteome (B) of Prochlorococcus siphovirus P-SS2. A. Electron micrograph of a uranyl acetate negative-stained, purified P-SS2 viral particle. B. The open reading frames (ORFs) are indicated on either the positive (above grey line) or negative (below grey line) DNA strand. Bioinformatically determined promoters and terminators are indicated, as is a putative host integration site (see Fig. 2). Structural proteins detected using mass-spectrometry are indicated by the diagonal lines in the corresponding ORFs, with structural modules indicated by the red lines and the text underneath the genome. For further detail, the number of virion structural peptides detected per ORF is provided in Table 2. The genome sequence is deposited in GenBank under accession #GQ334450.
Genome-wide characteristics of marine siphoviruses P-SS2 (this study) and phi-JL001 (Lohr ) relative to other recognized phage groups within the Siphoviridae. Siphoviruses are all non-enveloped and contain double-stranded DNA genomes, non-contractile, flexible tails, and are distinguished by different combinations of alleles of structural and DNA replication proteins
| Phage genus | Genome size (kb) | # ORFs | %G+C | Capsid diameter (nm) | Tail length × width (nm) |
|---|---|---|---|---|---|
| Marine, non-classified siphoviruses | |||||
| cyanophage P-SS2 | 108 | 131 | 52.3 | 75 | 325 × 12 |
| Alpha-proteobacteria φJL001 | 63 | 91 | 62 | 75 | 125 × N.D. |
| Lambda-likes | |||||
| Enterobacteria phage λ | 48.5 | 92 | 49 | 60 | 150 × 8 |
| Enterobacteria phage HK022 | 40.8 | 57 | 49 | 51 | 106 × N.D. |
| Enterobacteria phage HK97 | 39.7 | 62 | 49 | 54 | 179 × N.D. |
| T1-likes | |||||
| Enterobacteria phage T1 | 48.8 | 78 | 45 | 60 | 150 × 8 |
| Enterobacteria phage TLS | 49.9 | 87 | 42 | 50 | N.D. |
| Enterobacteria phage RTP | 46.2 | 75 | 44 | 60 | 160 × N.D. |
| L5-likes | |||||
| Mycobacterium phage L5 | 52.3 | 88 | 62 | 60 | 135 × 8 |
| Mycobacterium phage D29 | 49.1 | 84 | 63 | N.D. | N.D. |
| Mycobacterium phage Bxb1 | 50.6 | 86 | 63 | 60 | 135 × N.D. |
| φC31-likes | |||||
| Streptomyces phage φC31 | 41.5 | 54 | 63 | 53 | 100 × 5 |
| Streptomyces phage φBT1 | 41.8 | 56 | 62 | N.D. | N.D. |
| N15-likes | |||||
| Enterobacteria phage N15 | 46.4 | 60 | 51 | 60 | 140 × 8 |
| T5-likes | |||||
| Enterobacteria phage T5 | 121.7 | 195 | 39 | 80 | 180 × 9 |
| c2-likes | |||||
| Lactococcus phage bIL67 | 22.2 | 37 | 35 | 41 | 98 × 9 |
| Lactococcus phage c2 | 22.2 | 41 | 36 | N.D. | N.D. |
| ψM1-likes | |||||
| Methanobacterium phage | 26.1 | 31 | 46 | 55 | 210 × 10 |
Genus as recognized by the International Committee on the Taxonomy of Viruses (van Regenmortel ) and recently described Sfi21-like siphovirus families (Proux ). There are 19 sequenced genomes currently recognized as part of the lambda supergroup. Here we present a representative genome from each major group.
Genome sizes are from the classified siphovirus genomes from the NCBI TaxBrowser database.
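As a rough way to read the genome columns of the table side by side, the snippet below computes ORFs per kilobase for a few rows. This is only an illustrative sketch: the ratio is not a statistic reported in the source, the helper name `orfs_per_kb` is hypothetical, and the input values are copied from the table above.

```python
# (genome size in kb, number of ORFs), copied from the table above.
phages = {
    "P-SS2": (108, 131),
    "phi-JL001": (63, 91),
    "lambda": (48.5, 92),
    "T5": (121.7, 195),
}


def orfs_per_kb(size_kb, n_orfs):
    """Hypothetical helper: predicted ORFs per kilobase of genome."""
    return n_orfs / size_kb


for name, (size_kb, n_orfs) in phages.items():
    print(f"{name}: {orfs_per_kb(size_kb, n_orfs):.2f} ORFs per kb")
```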
Summary table of P-SS2 predicted proteins that contained relevant annotation information as determined from (a) significant blastp hits (e-value < e−3) against the GenBank non-redundant database, (b) experimental proteomics on the virus particle, or (c) detection in viral metagenomes.
| P-SS2 ORF # | Strand | Left end | Right end | Size (aa) | Gene | Putative function | E-value | Average peptides detected |
|---|---|---|---|---|---|---|---|---|
| 001 | + | 1 | 528 | 176 | terS | Terminase – small subunit | e−8 | 0 |
| 002 | + | 525 | 2018 | 498 | terL | Terminase – large subunit | e−63 | 0.5 |
| 003 | + | 2091 | 2732 | 214 | Type III rpoS | Cyanobacterial type III RNAP sigma factor | e−13 | 0 |
| 005 | + | 3417 | 3608 | 64 | | Unknown protein in metagenomes | No hits | 0 |
| 009 | + | 6004 | 6192 | 63 | | Structural protein | No hits | 1.5 |
| 010 | + | 6423 | 6713 | 97 | Thioredoxin | Thioredoxin | e−3 | 0 |
| 011 | + | 6853 | 9423 | 857 | nrd | Cyanobacterial class II ribonucleotide reductase | e = 0 | 0 |
| 014 | + | 10792 | 11481 | 230 | | Hypothetical protein | e−5 | 0 |
| 020 | + | 13316 | 13579 | 88 | | Unknown structural protein | No hits | 2.5 |
| 025 | + | 15118 | 16674 | 519 | | Structural prophage protein | e−46 | 33 |
| 028 | + | 17092 | 17280 | 63 | | Conserved T4-like protein in metagenomes | e−5 | 0 |
| 030 | + | 17648 | 22147 | 1500 | | Major capsid protein | e−18 | 96.5 |
| 031 | – | 22144 | 22347 | 68 | | Unknown structural protein, also in metagenomes | No hits | 1.5 |
| 032 | + | 22350 | 22535 | 62 | | Unknown structural protein | No hits | 1 |
| 033 | + | 22567 | 22767 | 67 | | Unknown structural protein | No hits | 2.5 |
| 036 | + | 23802 | 24836 | 345 | cobO | Cyanobacterial | e−98 | 0 |
| 038 | + | 25557 | 25730 | 58 | | Conserved marine cyanobacterial protein | e−12 | 0 |
| 045 | + | 27442 | 27702 | 87 | Syn5_026 | Cyanopodophage Syn5 ORFan protein (gp26) | e−13 | 0 |
| 049 | + | 28259 | 28486 | 76 | | Conserved marine | e−4 | 0 |
| 053 | + | 29861 | 30058 | 66 | 9313_1008 | | e−3 | 0 |
| 058 | + | 31797 | 32363 | 189 | Kinase | Possible phage kinase | e−5 | 0.5 |
| 061 | + | 32988 | 33524 | 179 | | Unknown structural protein | No hits | 16 |
| 062 | + | 33526 | 33993 | 156 | | Unknown structural protein | No hits | 3 |
| 063 | + | 33993 | 34499 | 169 | | Unknown structural protein | No hits | 5 |
| 066 | + | 36202 | 36831 | 210 | | Unknown structural protein | No hits | 2.5 |
| 067 | + | 36831 | 37625 | 265 | Fibre | Cyanophage T4-like fibre | e−7 | 9.5 |
| 068 | + | 37635 | 42854 | 1740 | Fibre | Unknown structural protein, tail fibre | e = 0.015 | 6 |
| 069 | + | 42886 | 43926 | 347 | | Unknown structural protein | No hits | 13 |
| 071 | + | 44422 | 44988 | 189 | | Unknown structural protein | No hits | 1 |
| 072 | + | 44988 | 46352 | 455 | Fibre | Lambdoid phage tail fibre | e−11 | 3 |
| 073 | + | 46354 | 51234 | 1627 | Fibre | Tail fibre with low %G+C | e−35 | 5 |
| 074 | + | 51512 | 52852 | 447 | Capsid decoration protein | Lambdoid tail collar/fibre decoration protein (gpH) | e−45 | 2 |
| 076 | + | 53164 | 53838 | 225 | | Unknown structural protein | No hits | 9.5 |
| 077 | + | 54104 | 59761 | 1886 | Tail tape measure | Lambdoid tail tape measure protein | e−36 | 102 |
| 078 | + | 59795 | 60211 | 139 | | Unknown structural protein | No hits | 3 |
| 079 | + | 60216 | 63218 | 1001 | | Unknown structural protein | No hits | 43 |
| 080 | + | 63255 | 63593 | 113 | | Cyanophage T4-like hypotheticals | e−5 | 9 |
| 081 | + | 63603 | 64040 | 146 | | Unknown structural protein | No hits | 7.5 |
| 082 | + | 64040 | 64306 | 89 | M2_082 | Cyanophage P-SSM2 ORFan protein (gp082) | e−6 | 3.5 |
| 083 | + | 64460 | 64657 | 66 | | Unknown structural protein | No hits | 2 |
| 084 | + | 64656 | 72495 | 2613 | | Unknown structural protein, also in metagenomes | No hits | 43 |
| 085 | + | 72530 | 73087 | 186 | | Unknown protein in metagenomes | No hits | 0 |
| 086 | + | 73129 | 78666 | 1846 | | Structural protein similar to marine siphophage JL001 ORFan protein (gp88) | e−4 | 45 |
| 087 | + | 78666 | 80936 | 757 | | Structural protein similar to cyanophage MaTMM01 ORFan protein (gp105) | e−10 | 11 |
| 088 | + | 80960 | 82057 | 366 | | Unknown structural protein | No hits | 17.5 |
| 089 | + | 82057 | 82920 | 288 | | Unknown structural protein | No hits | 6 |
| 090 | + | 82920 | 83348 | 143 | | Unknown structural protein | No hits | 4 |
| 091 | + | 83401 | 84141 | 247 | Host specificity | Lambdoid host specificity protein (gpJ) | e−4 | 14 |
| 092 | + | 84331 | 85251 | 307 | | Structural cyanobacterial prophage protein | e−65 | 43.5 |
| 093 | + | 85317 | 85643 | 109 | | Unknown structural protein | No hits | 12.5 |
| 095 | + | 86065 | 86304 | 80 | | Unknown protein in metagenomes | No hits | 0 |
| 097 | – | 86560 | 87231 | 224 | hyp_Syn | | e−23 | 0 |
| 098 | + | 87302 | 88528 | 409 | Lysozyme | Lysozyme | e−11 | 0 |
| 101 | – | 89241 | 90614 | 458 | int | Site-specific integrase (int) | e−12 | 0 |
| 102 | – | 91145 | 92104 | 320 | bet | Recombination protein (bet) | e−15 | 0 |
| 103 | – | 92216 | 92956 | 247 | | Conserved cyanobacterial protein | e−5 | 0 |
| 108 | + | 94460 | 96085 | 542 | Helicase | DNA helicase | e−8 | 0 |
| 109 | + | 96089 | 97663 | 525 | Primase | Cyanobacterial DNA primase | e−58 | 0.5 |
| 111 | + | 98065 | 98655 | 197 | dcd | Cyanobacterial dCTP deaminase (dcd) | e−19 | 0 |
| 113 | + | 98987 | 99841 | 285 | Type II rpoS | Type II RNAP sigma factor (rpoS) | e−17 | 0 |
| 114 | + | 99889 | 100242 | 118 | ssb | Cyanobacterial single-stranded DNA binding protein (ssb) | e−22 | 0 |
| 123 | + | 103253 | 104206 | 318 | exo | 5′-3′ exonuclease recombination protein (exo) | e−11 | 0 |
| 126 | + | 105166 | 106119 | 318 | thy1 | Cyanobacterial thymidylate synthase | e−57 | 0 |
For each protein, the genome locus information is paired with our annotations, as well as the top e-value and the average number of peptides detected from three biological replicate proteomic analyses (see text and methods).
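The locus columns can be cross-checked against the size column with simple arithmetic. The sketch below assumes the coordinates are 1-based and inclusive, and that the reported size equals the nucleotide span divided by three; both are inferences from the table rows rather than something the source states. The function name `orf_size_from_coords` is hypothetical.

```python
def orf_size_from_coords(left_end, right_end):
    """Return (codons, leftover_nt) for a 1-based, inclusive ORF span.

    Assumption: the table's "Size (aa)" equals span length // 3.
    """
    span_nt = right_end - left_end + 1
    return divmod(span_nt, 3)


# Rows copied from the table: (ORF, left end, right end, reported size in aa).
rows = [
    ("001", 1, 528, 176),
    ("002", 525, 2018, 498),
    ("077", 54104, 59761, 1886),
]

for orf, left, right, reported in rows:
    codons, leftover = orf_size_from_coords(left, right)
    status = "consistent" if (codons, leftover) == (reported, 0) else "check row"
    print(f"ORF {orf}: {codons} codons, {leftover} nt left over -> {status}")
```

A non-zero remainder or a mismatch with the reported size would flag a row whose coordinates are worth re-checking against the GenBank record.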
Fig. 2. Phylogenetic relationships of the large terminase protein across diverse phage types. This protein is diagnostic of phage DNA packaging mechanisms (Casjens ) and was used here to initially characterize the P-SS2 large terminase protein relative to known phage terminases. The asterisk (*) denotes marine phage and cyanobacterial host genomes. Notably, the terminase from the other marine siphovirus whose genome is sequenced (phi-JL001) clusters separately from known terminases, while that from cyanophage P-SS2 clusters with terminases from marine cyanobacterial host genomes (likely remnant prophages, see text). The tree shown is a maximum likelihood tree constructed from 1513 positions (significantly divergent protein and gapped alignment) as described in Experimental procedures. Numbers above and below branches represent bootstrap values over 75 from maximum likelihood and distance analyses respectively. Numbers in parentheses with taxa labels represent the number of taxa in collapsed nodes.
Fig. 3. Schematic representation of genome regions surrounding the putative phage (P-SS2) and host (Prochlorococcus MIT9313, GenBank ID: NC_005071) integration sites. This site consists of a 53 bp exact match between the phage sequence downstream of its integrase gene at position 90,836–90,888, and the non-coding sequence in the host genome at position 912,261–912,313. This general region of the host genome is a genomic island, and thus hypervariable (see text). Numbers at the genome ends represent the nucleotide position in the respective genomes.
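The idea of an exact-match attachment site can be illustrated with a small longest-common-substring search. This is a minimal sketch, not the method used in the study: the sequences are invented toy strings, the function names are hypothetical, only the forward strand is compared, and a genome-scale search would normally rely on a dedicated alignment tool rather than this quadratic-time loop. The `span_length` helper simply confirms that both coordinate ranges in the caption are 53 bp long when read as inclusive.

```python
def span_length(start, end):
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


def longest_exact_match(a, b):
    """Return (length, start_in_a, start_in_b) of the longest shared substring.

    Dynamic programming over suffix-match lengths; starts are 0-based.
    """
    best = (0, 0, 0)
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best[0]:
                    best = (curr[j], i - curr[j], j - curr[j])
        prev = curr
    return best


# Coordinates quoted in the caption above.
print(span_length(90836, 90888), span_length(912261, 912313))

# Toy sequences (not real phage or host DNA).
toy_phage = "TTGACCGTAGGCTAACGT"
toy_host = "CCAGGCTAACGGA"
length, i, j = longest_exact_match(toy_phage, toy_host)
print(length, toy_phage[i:i + length], toy_host[j:j + length])
```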
Fig. 4. Genome arrangement at the tRNA-Met + ansA locus across (A) Prochlorococcus and Synechococcus genomes, and (B) detailed for Synechococcus RS9917. A. Comparative genomics of marine Prochlorococcus and Synechococcus at the tRNA-Met + ansA locus identified as the putative P-SS2 integration site in ProMIT9313. Across the marine cyanobacteria, this region is highly syntenic with four basic genome patterns observed, denoted as ProI, ProII, ProIII and Synechococcus in the figure. However, some strains lack synteny and have hypervariable regions, or ‘genomic islands’, indicated by the red boxes in the figure. MED4 has a small ∼8 kb island with phage high-light inducible genes (this is equivalent to ISL2, Coleman ), while MIT9515 has a slightly larger island similar to that of MED4, followed by a large genome rearrangement (red dashed line) that is syntenic to another region of the MED4 genome (647,805–687,505). The MIT9313 variability in this region is detailed in Fig. 3. B. The tRNA-Met + ansA region in Synechococcus RS9917 that is homologous to the putative attB integration site in ProMIT9313 from Fig. 3. This ∼41 kb ‘island’ region contains four transposases, an antitoxin gene, and two P-SS2-like genes: lysozyme and structural protein ORF97. Genomic synteny to all the other marine cyanobacteria then continues for ∼27 kb until reaching a second ∼42 kb ‘island’ that is bounded on the other side by tRNA-Ser, and contains a transposase as well as numerous prophage-related genes, including a possible repressor, antirepressor, prophage maintenance protein, RNAP sigma factor, and four P-SS2-like genes: large terminase, integrase, ORF25 structural gene and lysozyme. The COG categories refer to those at Microbes Online.
Fig. 5. Characterization of insertion sequence (IS) elements in Synechococcus RS9917. The 22 transposase genes and surrounding regions (the IS elements) revealed four groups of multicopy IS elements, and five unique or degraded IS elements in the SynRS9917 genome. Using the ACLAME database, we classified these IS elements as follows. The two multicopy groups ‘A1’ and ‘A2’ are IS3-like mobile elements and have identical inverted repeats, identical lengths and > 87% sequence identity. IS group ‘B’ is also an IS3-like element but has a shorter inverted repeat. IS group ‘C’ is longer and most similar to IS21-like elements. A. Location and orientation of IS elements, represented by coloured arrows, in relation to the proposed P-SS2-like phage integration sites (regions represented by black bars). B. Diagrams of IS elements including size of inverted repeat (size in bp indicated above the 5′-end of the yellow box), position and orientation of ORFs, and size of flanking non-coding regions (shown in yellow, with size in bp indicated below the yellow box).
Fig. 6. Genomic arrangement of the tRNA-Met + pyrE site in Synechococcus RS9917 identified as a secondary hot-spot for insertion sequence elements. A. Schematic of the highly syntenic tRNA-Met + pyrE region from representative Prochlorococcus and Synechococcus genomes. Minor insertions in ProMIT9303 and ProMIT9313 (rffM insertion, small hypothetical ORFs) and the marine Synechococcus (rffM + large hypothetical ORF) are the only deviations from complete synteny in this region, except for the genomic island detailed for Synechococcus RS9917 (see Fig. 6B). Gene names are listed for the top genome only, and homologues across the genomes are similarly coloured. Genes named in red have been previously observed in myovirus cyanophage genomes (Sullivan ). Nine other genomes share the Prochlorococcus MED4 arrangement, one other shares the MIT9313 arrangement, and eight others share the Synechococcus WH8102 arrangement (details in Table S2). B. In contrast to the genome conservation observed in other Prochlorococcus and Synechococcus genomes, SynRS9917 contains a ∼65 kb genomic island region that contains nine transposases, seven P-SS2-like genes and seven phage-like genes. This is the most intact prophage in any marine Prochlorococcus or Synechococcus genome, but it is still significantly degraded.