Pascal Hingamp, Nigel Grimsley, Silvia G Acinas, Camille Clerissi, Lucie Subirana, Julie Poulain, Isabel Ferrera, Hugo Sarmento, Emilie Villar, Gipsi Lima-Mendez, Karoline Faust, Shinichi Sunagawa, Jean-Michel Claverie, Hervé Moreau, Yves Desdevises, Peer Bork, Jeroen Raes, Colomban de Vargas, Eric Karsenti, Stefanie Kandels-Lewis, Olivier Jaillon, Fabrice Not, Stéphane Pesant, Patrick Wincker, Hiroyuki Ogata.
Abstract
Nucleo-cytoplasmic large DNA viruses (NCLDVs) constitute a group of eukaryotic viruses that can have crucial ecological roles in the sea by accelerating the turnover of their unicellular hosts or by causing diseases in animals. To better characterize the diversity, abundance and biogeography of marine NCLDVs, we analyzed 17 metagenomes derived from microbial samples (0.2-1.6 μm size range) collected during the Tara Oceans Expedition. The sample set includes ecosystems under-represented in previous studies, such as the Arabian Sea oxygen minimum zone (OMZ) and Indian Ocean lagoons. By combining computationally derived relative abundance and direct prokaryote cell counts, the abundance of NCLDVs was found to be in the order of 10⁴-10⁵ genomes ml⁻¹ for the samples from the photic zone and 10²-10³ genomes ml⁻¹ for the OMZ. The Megaviridae and Phycodnaviridae dominated the NCLDV populations in the metagenomes, although most of the reads classified in these families showed large divergence from known viral genomes. Our taxon co-occurrence analysis revealed a potential association between viruses of the Megaviridae family and eukaryotes related to oomycetes. In support of this predicted association, we identified six cases of lateral gene transfer between Megaviridae and oomycetes. Our results suggest that marine NCLDVs probably outnumber eukaryotic organisms in the photic layer (per given water mass) and that metagenomic sequence analyses promise to shed new light on the biodiversity of marine viruses and their interactions with potential hosts.
Year: 2013 PMID: 23575371 PMCID: PMC3749498 DOI: 10.1038/ismej.2013.59
Source DB: PubMed Journal: ISME J ISSN: 1751-7362 Impact factor: 10.302
General description of the samples analyzed in this study
| 3_S | 3 | Atlantic Ocean | Open ocean | SRF | 36°43.520'N 10°28.250'W | NA | NA | NA | 2009/09/13 10:40 | TARA-Y200000001 (A6.1) |
| 4_S | 4 | Atlantic Ocean | Open ocean | SRF | 36°33.200'N 6°34.010'W | NA | NA | NA | 2009/09/15 10:15 | TARA-Y200000002 (A11) |
| 6_S | 6 | Mediterranean Sea | Enclosed sea | SRF | 36°31.239'N 4°0.443'W | 17.0 | 37.35 | 3.121 | 2009/09/21 14:49 | TARA-Y200000003 (A32) |
| 7_S | 7 | Mediterranean Sea | Enclosed sea | SRF | 37°2.321'N 1°56.99'W | 23.8 | 37.48 | 0.075 | 2009/09/23 17:05 | TARA-A200000113 |
| 7_D | 7 | Mediterranean Sea | Enclosed sea | DCM (42 m) | 37°2.321'N 1°56.99'W | 17.8 | 37.09 | 0.296 | 2009/09/23 17:05 | TARA-A200000159 |
| 23_S | 23 | Mediterranean Sea | Enclosed sea | SRF | 42°10.462'N 17°43.163'E | 17.1 | 38.22 | 0.036 | 2009/11/18 12:44 | TARA-E500000066 |
| 23_D | 23 | Mediterranean Sea | Enclosed sea | DCM (56 m) | 42°10.462'N 17°43.163'E | 16.0 | 38.30 | 0.119 | 2009/11/18 12:44 | TARA-E500000081 |
| 30_S | 30 | Mediterranean Sea | Enclosed sea | SRF | 33°55.077'N 32°53.622'E | 20.4 | 39.42 | 0.025 | 2009/12/14 12:44 | TARA-A100001568 |
| 31_S | 31 | Red Sea | Enclosed sea | SRF | 27°8.100'N 34°48.400'E | 25.0 | 39.91 | 0.005 | 2010/01/09 10:03 | TARA-A100001568 |
| 36_S | 36 | Arabian Sea | Semi-enclosed sea | SRF | 20°49.053'N 63°30.727'E | 26.0 | 36.53 | 0.047 | 2010/03/12 10:36 | TARA-Y100000022 |
| 38_S | 38 | Arabian Sea | Semi-enclosed sea | SRF | 19°2.318'N 64°29.620'E | 26.3 | 36.62 | 0.052 | 2010/03/15 03:45 | TARA-Y100000288 |
| 38_Z | 38 | Arabian Sea | Semi-enclosed sea | OMZ (350 m) | 19°2.103'N 64°33.825'E | 14.7 | 36.00 | 0.002 | 2010/03/16 06:14 | TARA-Y100000294 |
| 39_S | 39 | Arabian Sea | Semi-enclosed sea | SRF | 18°34.213'N 66°29.167'E | 27.4 | 36.29 | 0.026 | 2010/03/18 09:56 | TARA-Y100000029 |
| 39_Z | 39 | Arabian Sea | Semi-enclosed sea | OMZ (270 m) | 18°44.043'N 66°23.375'E | 15.6 | 35.91 | 0.003 | 2010/03/20 08:17 | TARA-Y100000031 |
| 43_S | 43 | Indian Ocean | Lagoon | SRF | 4°39.582'N 73°29.128'E | 30.0 | 34.49 | 0.075 | 2010/04/05 08:50 | TARA-Y100000074 |
| 46_S | 46 | Indian Ocean | Lagoon | SRF | 0°39.748'S 73°9.664'E | 30.1 | 35.11 | 0.050 | 2010/04/15 02:40 | TARA-Y100000100 |
| 49_S | 49 | Indian Ocean | Open ocean | SRF | 16°48.497'S 59°30.257'E | 28.3 | 34.49 | 0.024 | 2010/04/23 10:29 | TARA-Y100000120 |
Abbreviations: DCM, deep chlorophyll maximum; NA, not applicable; OMZ, oxygen minimum zone; SRF, surface; UTC, Coordinated Universal Time.
Locations, date and time correspond to events for the collection of contextual physicochemical data. Events for water sampling could slightly differ from these values.
Quality-controlled Tara Oceans pyrosequence data
| 3_S | 21 533 646 | 63 994 | 37 | 336 | 65 656 | 99 |
| 4_S | 52 953 075 | 140 754 | 38 | 376 | 149 018 | 108 |
| 6_S | 36 129 806 | 95 255 | 48 | 379 | 98 996 | 111 |
| 7_S | 98 750 180 | 332 049 | 38 | 297 | 335 408 | 90 |
| 7_D | 279 389 388 | 1 117 888 | 37 | 250 | 1 013 853 | 81 |
| 23_S | 67 695 268 | 196 190 | 39 | 345 | 201 447 | 101 |
| 23_D | 83 539 478 | 239 447 | 38 | 349 | 246 948 | 102 |
| 30_S | 89 180 466 | 256 028 | 37 | 348 | 268 616 | 101 |
| 31_S | 245 463 121 | 614 743 | 39 | 399 | 660 949 | 114 |
| 36_S | 245 945 064 | 737 506 | 39 | 333 | 757 448 | 100 |
| 38_S | 214 253 370 | 601 110 | 39 | 356 | 631 351 | 103 |
| 38_Z | 223 188 575 | 638 843 | 45 | 349 | 659 041 | 104 |
| 39_S | 233 273 851 | 590 664 | 43 | 395 | 629 501 | 114 |
| 39_Z | 249 558 778 | 679 589 | 46 | 367 | 708 056 | 108 |
| 43_S | 167 515 516 | 529 506 | 37 | 316 | 545 641 | 93 |
| 46_S | 251 310 870 | 648 425 | 41 | 388 | 689 641 | 112 |
| 49_S | 222 417 021 | 680 573 | 43 | 327 | 696 974 | 98 |
Abbreviation: ORF, open reading frame.
Figure 1. Metagenome-based relative abundance of NCLDV and cellular genomes in the TOP data set. Seventeen TOP metagenomes (0.2–1.6 μm size fraction) were pooled and analyzed as a single data set to generate this plot. Each dot in the plot represents the density of one of the marker genes used in this study (16 markers for NCLDVs and 35 markers for cellular genomes). The estimated abundance of NCLDV genomes is slightly lower than that of Archaea genomes and amounts to approximately 3% of bacterial genomes.
Figure 2. NCLDV genome abundance in the TOP data set. (a) Proportion of the average marker gene density for NCLDVs relative to that of prokaryotes (Bacteria and Archaea) for each of the 17 TOP metagenomes. (b) Experimentally measured prokaryotic cell densities (gray circles; 16 samples by microscopy and 13 samples by flow cytometry (FC)) were used to estimate the absolute abundances of NCLDV genomes (black squares) by rescaling the metagenome-based relative abundances. ‘S’, ‘D’ and ‘Z’ in the sample names indicate the depths from which the samples were collected: ‘S’ for surface, ‘D’ for deep chlorophyll maximum and ‘Z’ for oxygen minimum zone.
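The rescaling described for panel (b) can be illustrated with a minimal sketch. It assumes that the NCLDV-to-prokaryote marker density ratio approximates the genome ratio and that each counted prokaryotic cell carries one genome. The function name and the input values are hypothetical and are not taken from the study.

```python
def absolute_ncldv_abundance(ncldv_marker_density,
                             prokaryote_marker_density,
                             prokaryote_cells_per_ml):
    """Rescale a metagenome-based relative abundance to genomes per ml.

    Assumption (not from the source): one genome per counted prokaryotic
    cell, so the marker density ratio equals the genome ratio.
    """
    if prokaryote_marker_density <= 0:
        raise ValueError("prokaryote marker density must be positive")
    ratio = ncldv_marker_density / prokaryote_marker_density
    return ratio * prokaryote_cells_per_ml


# Hypothetical inputs: NCLDV markers at 3% of the prokaryote marker
# density, and 5e5 prokaryotic cells per ml from direct counts.
estimate = absolute_ncldv_abundance(0.03, 1.0, 5.0e5)
print(f"{estimate:.1e} NCLDV genomes per ml")
```

With these illustrative inputs the estimate falls in the 10⁴ genomes per ml range, the same order of magnitude the abstract reports for photic-zone samples.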
Figure 3. Metagenome-based relative abundance of NCLDV families. (a) Representation of different viral groups in the whole TOP metagenomic data set as measured by the NCLDV marker gene density. The number of marker reads taxonomically assigned to each viral group is shown in parentheses in the legend. (b) Representation of different viral groups in the 17 TOP metagenomic samples. ‘S’, ‘D’ and ‘Z’ in the sample names indicate the depths from which the samples were collected: ‘S’ for surface, ‘D’ for deep chlorophyll maximum and ‘Z’ for oxygen minimum zone. In both (a) and (b), three reads and one read assigned to Asfarviridae and Poxviridae, respectively, were omitted for presentation purposes.
Figure 4. Phylogenetic positions of metagenomic reads closely related to NCLDV DNA polymerase sequences. An HMM search with a PolB profile detected 2028 PolB-like peptide sequences in the TOP metagenomes. Each of these peptides was placed within a large reference phylogenetic tree containing diverse viral and cellular homologs (Supplementary Figure S1) with the use of Pplacer. Of these peptides, 264 were mapped on the branches leading to NCLDV sequences and are shown in this figure. The numbers of mapped metagenomic reads are shown on the branches and are reflected by branch widths. This result is consistent with the preponderance of the Phycodnaviridae and Megaviridae families seen in our BLAST-based marker gene analysis. Only the NCLDV part of the reference tree is shown.
Figure 5. Classification of NCLDV marker genes in the TOP data based on the level of sequence similarity to database sequences. Metagenomic reads showing ≥80% amino-acid sequence identity to database sequences were classified as ‘known (or seen)’, otherwise as ‘novel (or unseen)’. (a) BLAST result against UniProt. (b) BLAST result against the GOS data. The large proportions of ‘novel (or unseen)’ genes suggest that current environmental surveys are far from reaching saturation and that diverse yet unknown NCLDVs exist in the sea.
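The threshold rule in this caption can be sketched as follows. The 80% cutoff comes from the caption; the function name, the handling of reads without a hit, and the sample identities are assumptions made for illustration.

```python
from collections import Counter

IDENTITY_THRESHOLD = 80.0  # percent amino-acid identity, as in the caption


def classify_read(best_hit_identity):
    """Label a marker read 'known' or 'novel' from its best-hit identity.

    Assumption (not from the source): a read with no database hit
    (None) is treated as 'novel'.
    """
    if best_hit_identity is not None and best_hit_identity >= IDENTITY_THRESHOLD:
        return "known"
    return "novel"


# Hypothetical best-hit identities (percent) for five marker reads.
identities = [95.2, 80.0, 79.9, 41.0, None]
counts = Counter(classify_read(value) for value in identities)
print(dict(counts))
```

Because the comparison is inclusive, a read at exactly 80% identity is counted as ‘known’, which matches the ≥ sign in the caption.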
Examples of positive and negative viral-cell associations
| Viral taxon | Cellular taxon | ρ | q | ρ' | q' |
| Viruses; dsDNA viruses, no RNA stage; Mimiviridae | Eukaryota; stramenopiles; Oomycetes | 0.949 | 2.22E-05 | 0.939 | 1.7E-02 |
| Viruses; dsDNA viruses, no RNA stage; Iridoviridae; Lymphocystivirus; unclassified Lymphocystivirus | Bacteria; Tenericutes; Mollicutes; Mycoplasmataceae | 0.883 | 1.44E-03 | — | — |
| Viruses; unclassified phages; environmental samples | Bacteria; Cyanobacteria; environmental samples | 0.864 | 2.92E-03 | — | — |
| Viruses; dsDNA viruses, no RNA stage; Caudovirales; Siphoviridae | Eukaryota; Alveolata; Apicomplexa; Aconoidasida; Piroplasmida | 0.861 | 3.26E-03 | — | — |
| Viruses; dsDNA viruses, no RNA stage; Herpesvirales; Herpesviridae; Gammaherpesvirinae | Bacteria; Proteobacteria; Gammaproteobacteria; Thiotrichales; Thiotrichaceae | 0.853 | 4.20E-03 | — | — |
| Viruses; dsDNA viruses, no RNA stage; Phycodnaviridae | Bacteria; Proteobacteria; Gammaproteobacteria; Alteromonadales; Alteromonadales genera incertae sedis | 0.838 | 6.30E-03 | — | — |
| Viruses; dsRNA viruses; Reoviridae; Sedoreovirinae; Mimoreovirus | Eukaryota; Metazoa; Chordata; Craniata | 0.834 | 6.98E-03 | — | — |
| Viruses; dsDNA viruses, no RNA stage; Herpesvirales; Herpesviridae; Gammaherpesvirinae | Bacteria; Chloroflexi; Thermomicrobiales; Thermomicrobiaceae; Thermomicrobium | 0.830 | 7.61E-03 | — | — |
| Viruses; dsDNA viruses, no RNA stage; Herpesvirales; Herpesviridae; Gammaherpesvirinae | Bacteria; Proteobacteria; Magnetococcus | 0.825 | 8.53E-03 | — | — |
| Viruses; dsDNA viruses, no RNA stage; Phycodnaviridae; unclassified Phycodnaviridae | Eukaryota; Viridiplantae; Chlorophyta; Prasinophyceae; Mamiellales | 0.821 | 9.36E-03 | — | — |
| Viruses; dsDNA viruses, no RNA stage; Herpesvirales; Herpesviridae; Gammaherpesvirinae | Bacteria; Acidobacteria; Solibacteres; Solibacterales; Solibacteraceae | 0.820 | 9.51E-03 | — | — |
| Viruses; dsDNA viruses, no RNA stage; Herpesvirales; Herpesviridae; Gammaherpesvirinae | Bacteria; Proteobacteria; Deltaproteobacteria; Desulfobacterales; Desulfobacteraceae | 0.820 | 9.51E-03 | — | — |
| Viruses; dsDNA viruses, no RNA stage; Caudovirales; Myoviridae; T4-like viruses | Bacteria; Cyanobacteria; environmental samples | 0.819 | 9.71E-03 | — | — |
| Viruses; dsDNA viruses, no RNA stage; Caudovirales; Podoviridae; Autographivirinae | Bacteria; Cyanobacteria; environmental samples | 0.817 | 1.02E-02 | — | — |
| Viruses; dsDNA viruses, no RNA stage | Eukaryota; Alveolata; Ciliophora; Intramacronucleata; Spirotrichea | 0.803 | 1.36E-02 | — | — |
| Viruses; dsDNA viruses, no RNA stage; Caudovirales; Podoviridae; N4-like viruses | Bacteria; Firmicutes; Clostridia; Clostridiales; Peptococcaceae | 0.802 | 1.38E-02 | — | — |
| Viruses; dsDNA viruses, no RNA stage; Caudovirales | Eukaryota; Alveolata; Apicomplexa; Aconoidasida; Piroplasmida | 0.802 | 1.39E-02 | — | — |
| Viruses; dsDNA viruses, no RNA stage; unclassified dsDNA viruses | Bacteria; Proteobacteria; Alphaproteobacteria; Rickettsiales; SAR11 cluster | 0.801 | 1.39E-02 | — | — |
| Viruses; dsDNA viruses, no RNA stage; Phycodnaviridae; Phaeovirus | Eukaryota; stramenopiles; Actinophryidae; Actinophrys | 0.801 | 1.39E-02 | — | — |
| Viruses; dsDNA viruses, no RNA stage; Phycodnaviridae; unclassified Phycodnaviridae | Eukaryota; Viridiplantae; Chlorophyta; Prasinophyceae; environmental samples | 0.800 | 1.42E-02 | — | — |
| Viruses; dsDNA viruses, no RNA stage; Caudovirales; Myoviridae; phiKZ-like viruses | Eukaryota; Euglenozoa; Kinetoplastida; Trypanosomatidae; Leishmania | −0.742 | 3.32E-02 | −0.804 | 1.72E-02 |
| Viruses; dsDNA viruses, no RNA stage; Iridoviridae; Ranavirus | Bacteria; candidate division OP8; environmental samples | −0.751 | 2.95E-02 | −0.695 | 3.83E-02 |
| Viruses; dsDNA viruses, no RNA stage; Caudovirales; Myoviridae; phiKZ-like viruses | Eukaryota; Rhodophyta; Bangiophyceae; Cyanidiales; Cyanidiaceae | — | — | −0.659 | 2.95E-02 |
| Viruses; dsDNA viruses, no RNA stage; Caudovirales; Myoviridae; phiKZ-like viruses | Bacteria; Spirochaetes; Spirochaetales; Spirochaetaceae | — | — | −0.715 | 3.95E-02 |
Abbreviation: dsDNA, double-stranded DNA.
Statistical significance of taxon associations was assessed by two methods. ρ (Spearman's correlation coefficient) and q (false discovery rate) were calculated by the first method and ρ' (Spearman's correlation coefficient) and q' (false discovery rate) were calculated by a more stringent second method. See Materials and methods for details.
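As a rough illustration of the two statistics named in this note, the sketch below computes Spearman's ρ as the Pearson correlation of average ranks and converts a list of P-values into false discovery rates. The Benjamini-Hochberg procedure is an assumption here, since this section does not state which correction was used, and all input values are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    if sd_x == 0 or sd_y == 0:
        raise ValueError("rho is undefined for constant input")
    return cov / (sd_x * sd_y)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg q-values (an assumed choice of FDR method)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical read densities (reads per Mbp) for two taxa in six samples.
virus_density = [0.2, 0.5, 0.9, 1.4, 2.0, 3.1]
host_density = [0.1, 0.4, 0.3, 1.0, 1.8, 2.5]
print(round(spearman_rho(virus_density, host_density), 3))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5]))
```

In a co-occurrence screen of this kind, ρ would be computed for every viral and cellular taxon pair across the samples, and the correction would then be applied to the full list of resulting P-values rather than to a single pair.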
Figure 6. Taxon associations inferred from co-occurrence analysis. (a) Distribution of P-values for Spearman's correlation coefficients for taxon associations observed in the TOP metagenomic data. Colored (red and green) areas of the histogram represent taxon pairs showing statistically significant correlations. The position of the P-value for the hypothetical positive association between the ‘Megaviridae’ and ‘oomycetes’ taxonomic groups is indicated by a red triangle. (b) Correlated occurrence of 454 reads taxonomically assigned to the ‘Megaviridae’ and the ‘oomycetes’ groups by the BLAST-based 2bLCA method. Each dot corresponds to one of the 17 TOP samples analyzed. Axes represent the density of these reads (number of reads per Mbp) for each of the ‘Megaviridae’ and the ‘oomycetes’ groups.
Figure 7. Evidence of horizontal gene transfer between viruses and eukaryotes related to oomycetes. The displayed maximum likelihood tree was generated with PhyML from sequences of the Mimivirus hypothetical vWFA domain-containing protein (gi: 311978223) and its homologs. The numbers on the branches indicate bootstrap percentages from 100 bootstrap replicates. The tree was mid-point rooted for visualization purposes. The grouping of the Megaviridae and oomycete sequences suggests a gene exchange between the lineage leading to Megaviridae and the lineage leading to oomycetes. Phylogenetic trees for the remaining five putative cases of horizontal gene transfer between these lineages are provided in Supplementary Figure S9.