Christopher M Bellas, Alexandre M Anesio, Gary Barker.
Abstract
Microbial communities in glacial ecosystems are diverse, active, and subject to strong viral pressures and infection rates. In this study we analyse putative virus genomes assembled from three dsDNA viromes from cryoconite hole ecosystems of Svalbard and the Greenland Ice Sheet to assess the potential hosts and functional roles of viruses in these habitats. We assembled 208 million reads from the virus-size fraction and developed a procedure to separate genuine virus scaffolds from cellular contamination. Our curated virus library contained 546 scaffolds up to 230 kb in length, 54 of which were circular virus consensus genomes. Analysis of virus marker genes revealed that a wide range of viruses had been assembled, including bacteriophages, cyanophages, nucleocytoplasmic large DNA viruses and a virophage, with putative hosts identified as Cyanobacteria, Alphaproteobacteria, Gammaproteobacteria, Actinobacteria, Firmicutes, eukaryotic algae and amoebae. Whole genome comparisons revealed that the majority of circular genome scaffolds (CGS) formed 12 novel groups, two of which contained multiple phage members with plasmid-like properties: a group of phage-plasmids possessing plasmid-like partition genes and toxin-antitoxin addiction modules to ensure their replication, and a satellite phage-plasmid group. Surprisingly, we also assembled a phage that encoded not only plasmid partition genes but also a clustered regularly interspaced short palindromic repeat (CRISPR)/Cas adaptive bacterial immune system. One of the spacers was an exact match for another phage in our virome, indicating that, in a novel use of the system, the lysogen was potentially capable of conferring immunity on its bacterial host against other phages.
Together these results suggest that highly novel and diverse groups of viruses are present in glacial environments, some of which utilize very unusual life strategies and genes to control their replication and maintain a long-term relationship with their hosts.
Keywords: CRISPR; cryoconite; cryosphere; lysogeny; phage plasmid; virus ecology; virus genomes
Year: 2015 PMID: 26191051 PMCID: PMC4490671 DOI: 10.3389/fmicb.2015.00656
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Searches run on all scaffolds to determine whether a scaffold is viral.
| Search | Query | Database | E-value | Criterion for a viral scaffold |
| --- | --- | --- | --- | --- |
| HMMER | Predicted ORFs | Pfam-A | <10⁻⁵ | – |
| blastp | Predicted ORFs | Refseq virus | <10⁻⁵ | ≥50% of Pfam hits |
| tblastx | Nucleotide (scaffold) | Refseq mitochondria | <10⁻⁵ | – |
| blastp | Predicted ORFs | ACLAME plasmids | <10⁻⁵ | <50% of gene hits |
| blastp | Predicted ORFs | POGs-10 | <10⁻⁵ | ≥10% of Pfam hits |
| blastp | Predicted ORFs | POGs-7 infPQ | <10⁻⁵ | ≥0 |
| tblastx | Nucleotide (scaffold) | Silva SSU and LSU | <10⁻⁵ | 0 |
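The thresholds in this table can be read as a per-scaffold filter. The sketch below is one possible reading, not the authors' pipeline. It assumes that the criteria are combined with a logical AND, that "of Pfam hits" means the number of ORFs with a Pfam-A hit on the same scaffold, and that "of gene hits" means the number of predicted ORFs. The class `ScaffoldHits` and the function `passes_table_thresholds` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScaffoldHits:
    """Per-scaffold hit counts at E-value < 1e-5 (hypothetical container)."""
    genes: int           # predicted ORFs on the scaffold
    pfam: int            # ORFs with a Pfam-A hit
    refseq_virus: int    # ORFs with a Refseq virus hit
    aclame_plasmid: int  # ORFs with an ACLAME plasmid hit
    pogs10: int          # ORFs with a POGs-10 hit
    pogs7: int           # ORFs with a POGs-7 hit
    silva_rrna: int      # tblastx hits to Silva SSU/LSU


def passes_table_thresholds(h: ScaffoldHits) -> bool:
    """Apply the table's thresholds, ANDed together (an assumption)."""
    if h.genes == 0:
        return False
    checks = [
        h.refseq_virus >= 0.5 * h.pfam,     # >=50% of Pfam hits
        h.aclame_plasmid < 0.5 * h.genes,   # <50% of gene hits
        h.pogs10 >= 0.1 * h.pfam,           # >=10% of Pfam hits
        h.pogs7 >= 0,                       # >=0, as written in the table
        h.silva_rrna == 0,                  # no rRNA hits allowed
    ]
    return all(checks)


# Made-up counts, for illustration only.
example = ScaffoldHits(genes=40, pfam=20, refseq_virus=12,
                       aclame_plasmid=3, pogs10=4, pogs7=0, silva_rrna=0)
print(passes_table_thresholds(example))
```

The HMMER and mitochondrial searches carry no threshold in the table, so they appear here only as the Pfam denominator or not at all.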
Illumina read summary and quality control.
| Read pairs | 41,205,412 | 47,705,988 | 34,093,025 |
| Reads | 82,410,824 | 95,411,976 | 68,186,050 |
| Passed QC | 71,444,796 | 78,956,224 | 58,304,468 |
| Gb data | 7.22 | 7.97 | 5.89 |
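As a quick consistency check on the read summary, the sketch below recomputes the QC pass rate for each virome and the pooled number of reads that passed QC. The values are copied from the table in column order. The variable and function names are illustrative, and the virome labels are omitted because the extracted table does not carry them.

```python
# Values copied from the read summary table, in column order.
read_pairs = [41_205_412, 47_705_988, 34_093_025]
reads = [82_410_824, 95_411_976, 68_186_050]
passed_qc = [71_444_796, 78_956_224, 58_304_468]


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Each read pair contributes two reads.
pairs_consistent = all(2 * p == r for p, r in zip(read_pairs, reads))

# Share of reads in each virome that passed quality control.
pass_percent = [percent(p, r) for p, r in zip(passed_qc, reads)]

# Pooled reads that passed QC across the three viromes.
total_passed = sum(passed_qc)

print(pairs_consistent, pass_percent, total_passed)
```

The pooled total of 208,705,488 reads agrees with the 208 million assembled reads reported in the abstract.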
Figure 1. Virome composition based on BLASTX hits of unassembled reads to the Refseq virus database (E-value < 10. Based on a 0.1% random subsample of each virome. Hits are normalized by genome length of the virotype by GAAS.
Gene prediction and virus scaffold detection from the pooled assemblies.
| GeneMark gene predictions | 59739 | | 114914 | | 55209 | |
| PfamA gene hits (HMMER scan E-value < 10⁻⁵) | 29228 | 48.9% | 72103 | 62.7% | 31721 | 57.5% |
| RefseqVirus gene hits (blastp E-value < 10⁻⁵) | 10520 | 17.6% | 17609 | 15.3% | 8899 | 16.1% |
| POGs10 gene hits (blastp E-value < 10⁻⁵) | 4402 | 7.4% | 5632 | 4.9% | 2909 | 5.3% |
| POGs7 gene hits (blastp E-value < 10⁻⁵) | 135 | 0.2% | 98 | 0.1% | 82 | 0.1% |
| Silva gene hits (tblastx E-value < 10⁻⁵) | 244 | | 385 | | 274 | |
| Scaffolds ≥15 kb | 865 | | 1855 | | 659 | |
| Virus scaffolds ≥15 kb (confirmed) | 262 | 30.3% | 152 | 8.2% | 128 | 19.4% |
| Reads mapped to all scaffolds ≥15 kb | 8,363,805 | 11.7% | 20,979,299 | 26.6% | 4,834,392 | 8.3% |
| Reads mapped to virus scaffolds ≥15 kb | 4,019,865 | 5.6% | 1,677,841 | 2.1% | 1,077,717 | 1.8% |
| Reads mapped to other scaffolds ≥15 kb | 4,343,940 | 6.1% | 19,301,458 | 24.4% | 3,756,675 | 6.4% |
| Percentage of reads in scaffolds ≥15 kb which are viral | – | 48.1% | – | 8.0% | – | 22.3% |
| Circular scaffolds ≥10 kb | 53 | | 25 | | 16 | |
| Confirmed phage circular scaffolds ≥10 kb | 35 | | 12 | | 10 | |
| Circular mitochondrial scaffolds (MITOS) | 5 | | 1 | | 0 | |
| Other cellular origin, plasmid etc. | 13 | | 12 | | 6 | |
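The row "Percentage of reads in scaffolds ≥15 kb which are viral" follows from the mapped-read rows of this table. The sketch below recomputes it, using values copied from the table in column order. It also checks that the virus and other read counts sum to the total mapped reads. The variable names are illustrative only.

```python
# Mapped-read counts copied from the table, in column order.
mapped_all = [8_363_805, 20_979_299, 4_834_392]
mapped_virus = [4_019_865, 1_677_841, 1_077_717]
mapped_other = [4_343_940, 19_301_458, 3_756_675]

# Virus and non-virus reads should partition the reads mapped to all scaffolds.
rows_consistent = all(
    v + o == a for v, o, a in zip(mapped_virus, mapped_other, mapped_all)
)

# Share of reads in scaffolds >=15 kb that map to confirmed virus scaffolds.
viral_percent = [
    round(100 * v / a, 1) for v, a in zip(mapped_virus, mapped_all)
]

print(rows_consistent, viral_percent)
```

The recomputed values match the table, so in the second virome more than nine in ten reads mapped to scaffolds ≥15 kb fall on non-viral scaffolds.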
Figure 2. Functional predictions of all viral genes based on homology to the subsystems database. The number of hits from our assembled, curated virus database is displayed based on an E-value cut-off <10⁻⁵ with a minimum of 50% identity.
Figure 3. Whole genome comparison of circular phage scaffolds against known phage genomes generated by TBLASTX comparisons. The tree is constructed with equal branch lengths. Red branches and boxes represent glacial viruses from this study. Red labels represent group number (Supplementary Data 1). Dark green text represents Myoviridae; orange, Podoviridae; blue, Siphoviridae. Light green boxes represent a virus scaffold where a putative host (light green text) has been assigned by attP homology to the tRNA database.
Putative host assignment for linear and circular virus scaffolds.
| Scaffold | Length (bp) | Topology | Putative host | Evidence | Reference |
| --- | --- | --- | --- | --- | --- |
| CY1_33_1470 | 32760 | Circular | Actinobacteria | attP-attB | Supplementary Data |
| ML_33_9434 | 17132 | | Actinobacteria | attP-attB | Supplementary Data |
| ML_53_4264 | 40814 | Circular | Alphaproteobacteria | attP-attB | Supplementary Data |
| CY1_53_17 | 37329 | Circular | Firmicutes | attP-attB | Supplementary Data |
| CY1_43_2205 | 69796 | | Gammaproteobacteria | attP-attB | Supplementary Data |
| CY1_43_9289 | 15483 | | Gammaproteobacteria | attP-attB | Supplementary Data |
| Ml_53_6570 | 42514 | | Cyanobacteria | CGS comparisons | Figure |
| CY1_24_10438 | 40111 | | Cyanobacteria | TER_L phylogeny | Figure |
| CY1_24_17307 | 16195 | | Cyanobacteria | DNA polB phylogeny | Figure |
| CY1_24_11777 | 23597 | | Cyanobacteria | DNA polB phylogeny | Figure |
| AB_33_3099 | 45797 | | Algae | DNA pol Phycodnaviridae | Figure |
| AB_43_3071 | 23682 | | Algae | DNA pol Phycodnaviridae | Figure |
| ML_43_2588 | 16549 | | Algae | DNA pol Phycodnaviridae | Figure |
| ML_24_2669 | 60149 | | Algae, Haptophyceae | attP-attB | Supplementary Data |
| AB_33_3238 | 53511 | | Eukaryote | MCP of NCLDV | |
| ML_43_16378 | 15849 | | Eukaryote | MCP of NCLDV | |
| AB_43_3337 | 15446 | | Eukaryote | MCP of NCLDV | |
| Ab_53_508 | 80133 | | Eukaryote | 24/99 genes match NCLDV | |
| CY1_24_6609 | 12595 | | Virus (NCLDV) | CGS comparisons | Figure |
attP-attB, phage and bacterial tRNA attachment sites; MCP, major capsid protein; NCLDV, nucleocytoplasmic large DNA viruses; TER_L, large terminase subunit; CGS, circular genome scaffold.
Figure 4. Phylogeny of the major capsid protein (MCP) gene of the cryoconite virophage. CY1_24_6609 (#25 Figure 3) is compared with all known virophages. Maximum likelihood tree with 100 bootstrap replicates. Zam, Zamilon virophage; OLV, Organic Lake virophage; YLV, Yellowstone Lake virophages.
Circular genome scaffold (CGS) groups with multiple phage-plasmid-like genes.
| | VirE (Virulence), DNA pol A | CY1_24_716 | 37683 | 225.8 |
| | VirE (Virulence), Terminase, DNA pol A | CY1_24_11561 | 38498 | 12.6 |
| Putative phage-plasmid | | CY1_63_1964 | 80578 | 30.2 |
| Lysogenic phage | IbrA, IbrB/ | CY1_24_2481 | 54228 | 61.2 |
| Putative satellite phage | Photolyase, Phage Integrase, | CY1_53_534 | 14105 | 18.2 |
| Putative satellite phage | | CY1_53_1886 | 15192 | 22.9 |
| Putative satellite phage | | ML_43_3791 | 15770 | 48.6 |
| Lysogenic phage | Integrase, Terminase | CY1_53_17 | 37329 | 92.6 |
| Putative phage-plasmid | | ML_43_489 | 37582 | 69.7 |
| Putative phage-plasmid | | ML_33_2393 | 38190 | 40.0 |
| Putative phage-plasmid | | CY1_53_144 | 15551 | 20.2 |
Bold text denotes phage-plasmid-like genes. Cro/CI, part of phage repressor/antirepressor; VirR, virulence associated protein; IbrA/B, co-activators of prophage gene expression; CRISPR, clustered regularly interspaced short palindromic repeat; P2, phage-plasmid P2; P4, satellite phage P4; ParA/B, plasmid partition genes; Prim-pol, bifunctional phage-like primase polymerase.
Figure 5. Representative putative phage genome scaffolds in our assemblies. Genes were predicted using GeneMark heuristic models and displayed as forward (outer) and reverse (inner) coding. Gray genes are hypothetical proteins only; green genes hit Refseq virus and GenBank NR (BLASTX E-value cut-off 10⁻⁵); orange genes hit GenBank NR only; red features are CRISPR arrays. Inner plot shows GC content and numbers denote kilobase pairs. (A) Greenland Ice Sheet phage encoding a CRISPR/Cas system and plasmid partition genes. (B) Svalbard phage representative of Group 10 (G10), encoding plasmid partition genes and a toxin-antitoxin system. (C) Greenland virophage with limited homology to Sputnik virophage. (D) Group 8 (G8) phage with satellite phage-plasmid genome arrangement. Full annotations are given in Supplementary Data 2.
Figure 6. CRISPR/Cas system found on tailed, lysogenic phage CY1_64_1964. Cas/Cse are CRISPR-associated genes from the type I-E CRISPR/Cas subtype. Spacer 12 shows the location of the phage-matching spacer. DR, direct repeat consensus for each CRISPR. Red genes are CRISPR arrays; orange, Cas genes; gray, hypothetical proteins.
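The abstract reports that one spacer was an exact match for another phage in the virome. The sketch below shows one simple way to look for such matches: an exact string search on both strands. It is illustrative only and does not reproduce the authors' search. The function names and the toy sequences are hypothetical, not data from the study.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_spacer_targets(spacer: str, scaffolds: dict) -> list:
    """List (scaffold, strand, 0-based start) for every exact spacer match."""
    spacer = spacer.upper()
    queries = [("+", spacer)]
    rc = reverse_complement(spacer)
    if rc != spacer:  # avoid double-counting a palindromic spacer
        queries.append(("-", rc))
    hits = []
    for name, seq in scaffolds.items():
        seq = seq.upper()
        for strand, query in queries:
            pos = seq.find(query)
            while pos != -1:
                hits.append((name, strand, pos))
                pos = seq.find(query, pos + 1)  # allow overlapping matches
    return hits


# Toy sequences, made up for illustration.
toy_spacer = "ACGGTCATTGCA"
toy_scaffolds = {
    "scaffold_a": "TTT" + toy_spacer + "GG",
    "scaffold_b": "CC" + reverse_complement(toy_spacer) + "A",
    "scaffold_c": "AAAA",
}
print(find_spacer_targets(toy_spacer, toy_scaffolds))
```

An exact search misses targets with mismatches, so a real screen would more likely use an alignment tool with a tolerance for a few mismatches.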