Lucie Gallot-Lavallée, Guillaume Blanc.
Abstract
The nucleocytoplasmic large DNA viruses (NCLDV) are a group of extremely complex double-stranded DNA viruses that are major parasites of a variety of eukaryotes. Recent studies showed that certain eukaryotes contain fragments of NCLDV DNA integrated in their genomes, although, surprisingly, many of these organisms were not previously known to be infected by NCLDVs. We performed an updated survey of NCLDV genes hidden in eukaryotic sequences to measure the incidence of this phenomenon in common public sequence databases. A total of 66 eukaryotic genomic or transcriptomic datasets, many of which are from algae and aquatic protists, contained at least one of the five most consistently conserved NCLDV core genes. Phylogenetic study of the eukaryotic NCLDV-like sequences identified putative new members of already recognized viral families, as well as members of as yet unknown viral clades. Genomic evidence suggested that most of these sequences resulted from viral DNA integrations rather than from contaminating viruses. Furthermore, the nature of the inserted viral genes helped predict original functional capacities of the donor viruses. These insights confirm that genomic insertions of NCLDV DNA are common in eukaryotes and can be exploited to delineate the contours of NCLDV biodiversity.
Keywords: lateral gene transfer; nucleo-cytoplasmic large DNA virus; virus insertion
Year: 2017 PMID: 28117696 PMCID: PMC5294986 DOI: 10.3390/v9010017
Source DB: PubMed Journal: Viruses ISSN: 1999-4915 Impact factor: 5.048
Nucleocytoplasmic large DNA viruses (NCLDV) core protein homologs in eukaryotic sequence datasets.
| Eukaryotic Clade | Species | Habitat | Database | DNAP * | MCP * | ATPase * | D5 * | VLTF3 * |
|---|---|---|---|---|---|---|---|---|
| Amoebozoa (Discosea) | terrestrial and aquatic | Assembly | √ | √ | √ | |||
| RefSeq | √ | √ | √ | √ | ||||
| Assembly | √ | √ | √ | |||||
| Assembly | √ | |||||||
| Assembly | √ | √ | √ | √ | ||||
| Assembly | √ | √ | ||||||
| Assembly | √ | √ | √ | √ | ||||
| Assembly | √ | √ | √ | |||||
| Assembly | √ | √ | √ | |||||
| Assembly | √ | √ | √ | √ | ||||
| Assembly | √ | |||||||
| Cryptophyta (Pyrenomonadales) | sea | RefSeq | Mi | |||||
| Euglenozoa | freshwater | Assembly | √ | |||||
| Fungi (Chytridiomycota) | freshwater | RefSeq | Phy | √ | √ | Phy | √ | |
| Fungi (Glomeromycota) | terrestrial | Assembly | Asf | |||||
| Fungi (Blastocladiomycota) | freshwater | RefSeq | Phy | |||||
| Metazoa (Arthropoda) | freshwater | RefSeq | √ | |||||
| Metazoa (Cnidaria) | sea | RefSeq | Asf | |||||
| freshwater | RefSeq | Mi | √ | Mi | √ | |||
| Opisthokonta (Ichthyosporea) | sea | RefSeq | IrMa | IrMa | ||||
| sea | Assembly | IrMa | IrMa | IrMa | √ | IrMa | ||
| Rhizaria (Cercozoa) | sea | RefSeq | Phy + √ | √ | √ | Phy | √ | |
| Stramenopiles (Bicosoecida) | saltern pond | Assembly | √ | |||||
| Stramenopiles (Eustigmatophyceae) | freshwater | Assembly | Pha | Pha | ||||
| Stramenopiles (Hyphochytriomycetes) | terrestrial | Assembly | Asf | Asf | Asf | Asf | Asf | |
| Stramenopiles (Oomycetes) | soilborne plant pathogen | Assembly | Asf | |||||
| Assembly | Asf | |||||||
| Assembly | Asf | |||||||
| Assembly | Asf | |||||||
| Assembly | Asf | |||||||
| Assembly | Asf | Asf | ||||||
| RefSeq | Asf | Asf | ||||||
| Assembly | Asf | |||||||
| Assembly | Asf | |||||||
| Assembly | Asf | Asf | ||||||
| Stramenopiles (Phaeophyceae) | sea | Assembly | Pha | Pha | Pha | Pha | Pha | |
| RefSeq | Pha | Pha | Pha | Pha | Pha | |||
| RefSeq | Pha | |||||||
| Viridiplantae (Chlorophyta) | lichen photobiont | Assembly | Phy | Phy | Phy | Phy | ||
| terrestrial | Assembly | Mi | ||||||
| freshwater | Assembly | Mi | Mi | Mi | Mi + Phy | Mi | ||
| freshwater | Assembly | Mi | Mi | Mi | Mi + Phy | Mi + √ | ||
| freshwater | Assembly | √ | ||||||
| unknown | Assembly | Mi | Mi | Mi | Mi | Mi | ||
| sea | Assembly | Phy | Phy | Phy | Phy | Phy | ||
| freshwater | Assembly | Mi | Mi | Mi | Mi + Phy | Mi | ||
| Viridiplantae (Streptophyta) | terrestrial | RefSeq | Phy | Phy | Phy | √ | Phy | |
| Viridiplantae (Streptophyta) | terrestrial | RefSeq | Pitho | Pitho | ||||
| Cryptophyta (Cryptomonadales) | sea | MMETSP | √ | |||||
| Cryptophyta (Pyrenomonadales) | sea | MMETSP | √ | |||||
| Haptophyceae (Coccolithales) | sea | MMETSP | √ | √ | √ | |||
| Haptophyceae (Isochrysidales) | sea | MMETSP | √ | |||||
| sea | MMETSP | √ | ||||||
| Haptophyceae (Phaeocystales) | sea | MMETSP | √ | |||||
| Haptophyceae (Prymnesiales) | sea | MMETSP | Coc | Coc | Coc | |||
| Rhizaria (Cercozoa) | sea | MMETSP | Phy | |||||
| Stramenopiles (Labyrinthulomycetes) | sea | MMETSP | Pha | |||||
| sea | MMETSP | √ | √ | |||||
| sea | MMETSP | √ | √ | |||||
| Undescribed Strain | sea | MMETSP | √ | |||||
| Undescribed Strain | sea | MMETSP | √ | |||||
| Viridiplantae (Chlorophyta) | freshwater | 1KP | Mi | Mi | ||||
| freshwater | 1KP | Mi | ||||||
| Viridiplantae (Streptophyta) | freshwater | 1KP | Phy | |||||
| terrestrial | 1KP | Phy | ||||||
| terrestrial | 1KP | Phy | ||||||
* Putative phylogenetic grouping of the NCLDV core protein homologs, based on the phylogenetic trees presented in Figure 1 and Figures S1–S4: √ = unknown clade, Phy = Phycodnaviridae, Mi = Mimiviridae, Pha = phaeoviruses, Coc = coccolithoviruses, Pitho = putative Pithoviridae, Asf = Asfarviridae, and IrMa = Iridoviridae/Marseilleviridae cluster. Column names: DNAP, DNA polymerase; MCP, major capsid protein; ATPase, DNA packaging ATPase; D5, D5 helicase; VLTF3, very late transcription factor 3.
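For readers who want to process the table programmatically, the legend codes can be expanded with a small lookup. This is only a minimal sketch: the code-to-clade mapping is taken from the footnote above, while the names `CLADE_CODES` and `decode_cell` are hypothetical and not part of the article or its supplementary data.

```python
# Legend codes as defined in the table footnote.
# CLADE_CODES and decode_cell are illustrative names, not from the article.
CLADE_CODES = {
    "√": "unknown clade",
    "Phy": "Phycodnaviridae",
    "Mi": "Mimiviridae",
    "Pha": "phaeoviruses",
    "Coc": "coccolithoviruses",
    "Pitho": "putative Pithoviridae",
    "Asf": "Asfarviridae",
    "IrMa": "Iridoviridae/Marseilleviridae cluster",
}

# Case-insensitive lookup, to tolerate capitalization variants of the codes.
_LOOKUP = {code.lower(): name for code, name in CLADE_CODES.items()}


def decode_cell(cell: str) -> list[str]:
    """Expand one table cell such as 'Mi + Phy' into a list of clade names.

    An empty cell yields an empty list; an unrecognized code raises KeyError.
    """
    parts = [part.strip() for part in cell.split("+")]
    return [_LOOKUP[part.lower()] for part in parts if part]


if __name__ == "__main__":
    for cell in ("Mi + Phy", "Phy + √", "Asf", ""):
        print(repr(cell), "->", decode_cell(cell))
```

Cells that combine two codes with `+` indicate homologs from the same dataset that fall into more than one grouping, so the helper returns a list rather than a single name.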
Figure 1. Maximum likelihood phylogenetic tree of DNA polymerase proteins. Statistical support values for branches (SH-like local support test) are given above or below nodes in percent. Branches with support below 50% were collapsed. Species names with colored background indicate transcribed genes: green, 1KP transcriptomes; blue, MMETSP transcriptomes. Red and black question marks show potential extensions of recognized viral groups or new viral clades, respectively. The scale bar indicates the number of substitutions per site. Sequences, alignments, and phylogenetic trees are available in Dataset S1.