Eugene V Koonin, Natalya Yutin.
Abstract
The nucleocytoplasmic large DNA viruses (NCLDVs) are a monophyletic group of diverse eukaryotic viruses that reproduce primarily in the cytoplasm of the infected cells and include the largest viruses currently known: the giant mimiviruses, pandoraviruses, and pithoviruses. With virions measuring up to 1.5 μm and genomes of up to 2.5 Mb, the giant viruses break the now-outdated definition of a virus and extend deep into the genome size range typical of bacteria and archaea. Additionally, giant viruses encode multiple proteins that are universal among cellular life forms, particularly components of the translation system, the signature cellular molecular machinery. These findings triggered hypotheses on the origin of giant viruses from cells, likely of an extinct fourth domain of cellular life, via reductive evolution. However, phylogenomic analyses reveal a different picture, namely multiple origins of giant viruses from smaller NCLDVs via acquisition of multiple genes from the eukaryotic hosts and bacteria, along with gene duplication. Thus, with regard to their origin, the giant viruses do not appear to qualitatively differ from the rest of the virosphere. However, the evolutionary forces that led to the emergence of virus gigantism remain enigmatic.
Keywords: gene gain; gene loss; giant viruses; phagocytosis; virus evolution; virus-host interaction
Year: 2018 PMID: 30542614 PMCID: PMC6259494 DOI: 10.12688/f1000research.16248.1
Source DB: PubMed Journal: F1000Res ISSN: 2046-1402
Figure 1. Phylogenetic tree of five genes that are nearly universal in nucleocytoplasmic large DNA viruses (NCLDVs).
The tree was constructed from a concatenated multiple alignment of five (nearly) universally conserved NCLDV proteins: DNA polymerase, major capsid protein, packaging ATPase, A18-like helicase, and poxvirus late transcription factor VLTF3. The branch color denotes confirmed or likely hosts: red, Amoebozoa; green, other protists; blue, Metazoa. The tree was constructed by using the FastTree software [60] with default parameters. The numbers at the internal branches indicate local likelihood-based support (percentage points); the branches with support below 50% were collapsed. Scale bars represent the number of amino acid (aa) substitutions per site. The middle panel shows genome length, on the scale shown at the bottom of the figure. The right panel shows the distribution of translation-related genes among the NCLDVs: 1–19, aminoacyl tRNA synthetases (1, Ala; 2, Arg; 3, Asn; 4, Asp; 5, Cys; 6, Gln; 7, Gly; 8, His; 9, Ile; 10, Leu; 11, Lys; 12, Met; 13, Pro; 14, Phe; 15, Thr; 16, Ser; 17, Trp; 18, Tyr; 19, Val); 20–33, translation factors (20, eIF-1/SUI1; 21, eIF1a; 22, eIF2a; 23, eIF2b; 24, eIF2g; 25, eIF4a; 26, eIF4e; 27, eIF4g; 28, eIF5a; 29, eIF5b; 30, EF1a; 31, aEF2; 32, eEF3; 33, eRF1). Green, blue, and orange circles represent one, two, or three proteins of the respective family encoded in a genome.
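As an illustration of what concatenating alignments of "nearly" universal proteins can involve, the sketch below joins per-protein alignments column-wise and pads a virus that lacks one of the proteins with gap characters. It is a minimal sketch, not the procedure used for the figure: the function name `concatenate_alignments`, the gap-padding choice, and the toy sequences and virus names are all assumptions made for the example.

```python
from typing import Dict, List


def concatenate_alignments(alignments: List[Dict[str, str]]) -> Dict[str, str]:
    """Join per-protein alignments column-wise into one supermatrix.

    Each alignment maps a taxon name to its aligned sequence. A taxon that is
    absent from an alignment is padded with '-' (an assumption of this sketch).
    """
    taxa = sorted({name for aln in alignments for name in aln})
    concatenated = {name: "" for name in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one alignment must share one length")
        width = lengths.pop()
        for name in taxa:
            concatenated[name] += aln.get(name, "-" * width)
    return concatenated


# Hypothetical toy alignments for two proteins; virusB lacks the second one.
protein_1 = {"virusA": "MKT-L", "virusB": "MRTAL", "virusC": "MKTAL"}
protein_2 = {"virusA": "GG-S", "virusC": "GGTS"}

supermatrix = concatenate_alignments([protein_1, protein_2])
for name, seq in supermatrix.items():
    print(name, seq)
```

The resulting supermatrix would then be passed to a tree-building program; how missing proteins were actually handled for Figure 1 is not stated in the caption.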
Figure 2. Gene gain and loss in the evolution of the nucleocytoplasmic large DNA viruses (NCLDVs) and multiple origins of giant viruses.
The tree topology is from the phylogeny of five nearly universal genes (Figure 1). The maximum likelihood reconstruction was produced by using the GLOOME software [61], from the mapping of 5284 clusters of homologous NCLDV genes onto the tree leaves (extant viruses). Red triangles show gene gains, and green triangles show gene losses. The size of a triangle is roughly proportional to the maximum likelihood estimate of the number of gains or losses.
Figure 3. Translation-related gene gain and loss in the evolution of the nucleocytoplasmic large DNA viruses (NCLDVs).
Inferred gains of translation-related genes are shown by red circles, and inferred losses are shown by green circles. The inferences are based on previously analyzed phylogenetic trees [37, 44]. The translation-related genes are numbered as in Figure 1.
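Because Figures 1 and 3 share one numbering of translation-related genes, the key can be kept as a small lookup table when reading either figure. The sketch below only transcribes the key and the copy-number color code from the Figure 1 caption; the names `AARS`, `FACTORS`, `COPY_COLOR`, and `describe` are hypothetical helpers introduced for illustration.

```python
# Numbers 1-19: aminoacyl tRNA synthetases, in the order given in the caption.
AARS = [
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Gly", "His", "Ile", "Leu",
    "Lys", "Met", "Pro", "Phe", "Thr", "Ser", "Trp", "Tyr", "Val",
]

# Numbers 20-33: translation factors, in the order given in the caption.
FACTORS = [
    "eIF-1/SUI1", "eIF1a", "eIF2a", "eIF2b", "eIF2g", "eIF4a", "eIF4e",
    "eIF4g", "eIF5a", "eIF5b", "EF1a", "aEF2", "eEF3", "eRF1",
]

# Circle color in the right panel of Figure 1, keyed by proteins per family.
COPY_COLOR = {1: "green", 2: "blue", 3: "orange"}


def describe(number: int) -> str:
    """Translate a gene number from the figure key into a readable label."""
    if 1 <= number <= len(AARS):
        return f"aminoacyl tRNA synthetase ({AARS[number - 1]})"
    offset = number - len(AARS)
    if 1 <= offset <= len(FACTORS):
        return f"translation factor {FACTORS[offset - 1]}"
    raise ValueError(f"gene number {number} is outside the 1-33 key")


print(describe(12))
print(describe(33))
```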