
The number of genes encoding repeat domain-containing proteins positively correlates with genome size in amoebal giant viruses.

Avi Shukla, Anirvan Chatterjee, Kiran Kondabagil.

Abstract

Curiously, in viruses, the virion volume appears to be predominantly driven by genome length rather than the number of proteins it encodes or geometric constraints. With their large genome and giant particle size, amoebal viruses (AVs) are ideally suited to study the relationship between genome and virion size and explore the role of genome plasticity in their evolutionary success. Different genomic regions of AVs exhibit distinct genealogies. Although the vertically transferred core genes and their functions are universally conserved across the nucleocytoplasmic large DNA virus (NCLDV) families and are essential for their replication, the horizontally acquired genes are variable across families and are lineage-specific. When compared with other giant virus families, we observed a near-linear increase in the number of genes encoding repeat domain-containing proteins (RDCPs) with the increase in the genome size of AVs. From what is known about the functions of RDCPs in bacteria and eukaryotes and their prevalence in the AV genomes, we envisage important roles for RDCPs in the life cycle of AVs, their genome expansion, and plasticity. This observation also supports the evolution of AVs from a smaller viral ancestor by the acquisition of diverse gene families from the environment including RDCPs that might have helped in host adaption.


Keywords:  genome expansion; genome plasticity; giant virus; repeat domain-containing proteins

Year:  2018        PMID: 29308275      PMCID: PMC5753266          DOI: 10.1093/ve/vex039

Source DB:  PubMed          Journal:  Virus Evol        ISSN: 2057-1577


1. Introduction

Allometry, the study of the relationship between biological size and function, is considered an important readout of evolutionary processes (Klingenberg, 2016). In the case of viruses, an allometric exponent of 1.5 between the length of the viral genome and the volume of the virion particle suggests a significant positive correlation between virion and genome size (Cui, Schlub and Holmes 2014). The increase in virion volume was attributed chiefly to an increase in genome length rather than to protein content or capsid morphology (Cui, Schlub and Holmes 2014). Consistent with this observation, genomes of giant viruses that infect amoeba [amoebal viruses (AVs)] are large, despite these viruses being intracellular parasites (Koonin and Wolf 2010; Colson and Raoult 2012; Yutin, Wolf and Koonin 2014). If the amount of DNA is assumed to be a predominant factor in the virion volume (Cui, Schlub and Holmes 2014), amoeba-infecting megaviruses emerge as the bellwethers of large genomes driving the size of the virion. Interestingly, amoeba-resistant bacteria (ARBs) adapted to an intra-amoebal lifestyle, such as Legionella pneumophila and Rickettsia bellii, also harbor unusually large genomes (Moliner, Fournier, and Raoult 2010). This seemingly contradicts the evolution of intracellular organisms from their free-living ancestors by genome reduction (Andersson and Kurland 1998; Sakharkar, Kumar, and Chow 2004; Merhej et al. 2009; Darmon and Leach 2014; McNally et al. 2016). In ARBs, genome expansion has been linked to the horizontal acquisition of mobile elements and genes encoding repeat domain-containing proteins (RDCPs) with functions analogous to the immune system and anti-host secretory system (Moliner, Fournier, and Raoult 2010). The genomes of AVs also harbor genes encoding RDCPs such as ankyrin, FNIP, and WD40 repeat domain-containing proteins (Suhre 2005).
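The scaling relationship cited above can be made concrete with a small sketch. It assumes that the reported exponent describes a pure power law, virion volume ∝ (genome length)^1.5, with an unspecified proportionality constant, so only fold changes between two genomes are computed. The function name `volume_fold_change` and the example ratio are illustrative and are not taken from Cui, Schlub and Holmes (2014).

```python
ALLOMETRIC_EXPONENT = 1.5  # exponent quoted in the text (Cui, Schlub and Holmes 2014)


def volume_fold_change(genome_length_ratio, exponent=ALLOMETRIC_EXPONENT):
    """Fold change in virion volume implied by a fold change in genome length.

    Assumes volume = c * length**exponent, so the unknown constant c
    cancels when two genomes are compared.
    """
    if genome_length_ratio <= 0:
        raise ValueError("genome_length_ratio must be positive")
    return genome_length_ratio ** exponent


# Hypothetical comparison: a genome four times longer than a reference genome.
print(volume_fold_change(4.0))  # 4 ** 1.5 = 8.0
```

Under this assumption, an exponent above 1 means that the predicted virion volume grows faster than genome length.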
Both ARBs and AVs are internalized via phagocytosis, resist digestion, and exhibit many similar genomic features (Moliner, Fournier, and Raoult 2010). Unlike other intracellular pathogens that are known to undergo genome reduction, ARBs and AVs maintain large genomes and acquire genes via horizontal gene transfer (HGT) (Boyer et al. 2009; Colson and Raoult 2010). In a complex evolutionary path, AVs and ARBs emerge as competitors (Slimani et al. 2013) for an amoebal host that also facilitates the horizontal transfer of genes. The cytoplasmic life cycle within amoeba emerges as a key evolutionary force driving the genomic content of both ARBs and AVs. The shared ‘mobilome’ among ARBs and AVs enables both to succeed in subverting the host predation/immune system. Here, we have identified an association between lineage-specific genome size expansion and acquisition and duplication of repeat domain proteins/multigene family in AVs.

Box 1. HGT and the mobilome of AVs

Polintons (also known as Mavericks) are large DNA transposons (9–22 kb long) that are widely distributed in eukaryotes (Kapitonov and Jurka 2006; Fischer and Suttle 2011; Krupovic and Koonin, 2015). Recently, it was shown that virophages (parasitic viruses of large DNA viruses) and polintons, in addition to encoding several key homologous proteins including major and minor capsid proteins, FtsK-type packaging ATPase, protein-primed DNA polymerase B, retroviral-like family integrase and cysteine protease, exhibit similar genomic architecture (Fig. A). These observations imply that polintons and virophages are evolutionarily linked (Filee, Pouget and Chandler 2008). Although polintons encode two capsid proteins, their ability to form virions has not been demonstrated. Although an earlier study suggested the evolution of polintons from a virus (Benson et al. 1999), more recently, polintons were hypothesized to have evolved from bacteriophages to become the first eukaryotic DNA viruses from which most of the extant NCLDVs have evolved (Krupovic and Koonin, 2015). Mavirus, a virophage of the Cafeteria roenbergensis virus (CroV) that infects the marine flagellate C. roenbergensis, possesses terminal inverted repeats that are characteristic of polintons and other transposons (Filee, Pouget and Chandler 2008), can integrate at multiple sites within the host (C. roenbergensis) genome, and is reactivated in a CroV-infection dependent manner (Fischer and Hackl 2016). Furthermore, polintons are thought to be one of the major components of the complex genetic network that includes NCLDVs, adenoviruses, virophages, bacteriophages, and naked DNA elements (Koonin, Krupovic, and Yutin 2015; Krupovic and Koonin, 2015). Similar to Class 2 DNA transposons, polintons transfer genetic material by a replicative or a cut-paste mechanism (Wicker et al. 2007) (Fig. B) and augment the number of shared genes across the network in the mobilome (Desnues et al. 2012; Colson et al. 2017). Other key members of this mobilome are the transpovirons found in Mimiviridae (Desnues et al. 2012; Yutin, Raoult, and Koonin 2013; Yutin et al. 2013). ORFs found in transpovirons have diverse evolutionary histories (Desnues et al. 2012) with origins in bacteria and their phages, and in eukaryotes such as Tetrahymena thermophila (Yutin, Raoult, and Koonin 2013; Yutin et al. 2013). With the ability to integrate non-specifically into any part of the host (Mimiviridae family) chromosome (Desnues et al. 2012), transpovirons, along with virophages and polintons, are speculated to drive gene transfer within the mobilome (Boyer et al. 2011). Consequently, homologs of several hallmark genes of AVs have been found in the polintons, virophages, and transpovirons (Fig. A), along with genetic elements (integrases and terminal repeats) reminiscent of transposable elements (TEs) (Fig. A). Thus, polintons and transpovirons frequently introduce genetic material from other branches of life (bacteria and eukarya) into the mobilome, which is then transferred to AVs by virophages (Fig. B).

Insertion sequences, a major component of HGT, are also commonly found in giant viruses, specifically in Mimiviridae and Phycodnaviridae, with two overlapping ORFs (Filee, Siguier, and Chandler 2007). Interestingly, identical elements are also found in the A. castellanii genome, suggesting a route for gene transfer either from prokaryotes via giant viruses or from proto-eukaryotic ancestors (Gilbert and Cordaux 2013). These elements can manipulate downstream gene expression (Siguier, Gourbeyre, and Chandler 2014) and play a major role in gene inactivation, deletion, duplication and genetic rearrangement in the genome via homologous/illegitimate recombination (Filee, Siguier, and Chandler 2007). In an extreme case, about 30 non-autonomous transposable elements commonly known as MITEs (10 of which are integrated into coding regions) have ‘colonized’ the genome of Pandoravirus salinus but were undetectable in Pandoravirus dulcis (Sun et al. 2015). Akin to their role in prokaryotes, they promote gene deletion and genetic rearrangement (Feschotte, Zhang, and Wessler, 2002). A conceivable outcome of such genome plasticity would be the loss and/or gain of function, accelerating host-switching and adaptation. Apart from these family-specific mobile elements, the genomes of NCLDVs also contain self-splicing introns (Azza et al. 2009) and inteins along with HNH endonuclease, which might aid in the mobility of genetic elements (Filee and Chandler 2010). All three are known to influence genome evolution in all forms of life through their splicing and nuclease activity (Darmon and Leach 2014).

1.1 Classes of RDCPs in AVs and their functions in cellular homologs

Amoebal giant viruses are replete with proteins containing repeating amino acid sequences, which are classified as RDCPs. These include ankyrin (ANK) repeat (Boyer et al. 2011; Herbert, Squire and Mercer 2015), Kelch repeat (Suhre 2005), leucine-rich repeat (LRR) (Suhre 2005), tetratricopeptide repeat (TPR) (Sobhy et al. 2015), membrane occupation and recognition nexus (MORN) repeat (Boyer et al., 2009), phenylalanine-asparagine-isoleucine-proline (FNIP/IP22) repeat (Suhre 2005), tryptophan-aspartic acid (WD40) repeat (Suhre 2005), and Sel1 repeat proteins. Proteins containing these repeat motifs regulate various intracellular processes through protein–protein interactions (Kobe and Deisenhofer, 1994; Sedgwick et al. 1999; Adams, Kelso, and Cooley 2000; Voronin and Kiseleva 2007; Catalano et al. 2010; Zeytuni and Zarivach, 2012). In plants, genes encoding RDCPs and their duplication have been associated with adaptation to rapid environmental variations (Richard, Kerrest, and Dujon 2008; Sharma and Pandey 2015). These proteins are thought to be the result of intragenic tandem duplication via recombination and are more commonly found in eukaryotes and metazoans than in prokaryotes (Marcotte et al. 1999; Andrade, Perez-Iratxeta, and Ponting 2001). AVs encode many of these RDCPs, which are either integrated into functional genes or present as stand-alone repeats. The motif length and structure of the RDCPs found in AVs and their known functions in prokaryotes and eukaryotes are summarized in Table 1.
Table 1.

The basic composition, structure, and functions of different repeat domain proteins in diverse forms of life excluding Megavirales

| Multigene repeat families | Composition | Structural unit | Tertiary structure | Participates in | Commonly found in | References |
|---|---|---|---|---|---|---|
| Ankyrin repeats (ANK) | 33 aa | Two antiparallel α-helices joined by β-hairpin at 90° forming L-shaped structure | Cupped hand shape solvent accessible groove formed by repeating protomers | Cell cycle regulation, cytoskeletal binding, protein trafficking across membrane, acquired resistance | Prokaryotes and eukaryotes | Nguyen, Liu and Thomas 2014; Al-khodor et al. 2009; Voronin and Kiseleva 2007; Sedgwick and Smerdon 1999; Cao et al. 1997; Shchelkunov, Blinov and Sandakhchiev 1993 |
| Leucine rich repeats (LRR) | 20 to 29 aa | A β-sheet and an α-helix arranged in an anti-parallel manner | Multiple repeats are oriented parallel to the axis forming horse-shoe like structure | Protein–protein interaction, signal transduction and formation of protein complexes | Prokaryotes and eukaryotes | Kobe and Deisenhofer 1994; Sharma and Pandey 2015 |
| FNIP/IP22 repeats | 22 aa | A β-sheet and an α-helix arranged in an anti-parallel manner | Horse shoe like structure (like LRR) | Interaction of calmodulin binding proteins, increases cell motility and chemotaxis | Dictyostelium and NCLDV | Catalano et al. 2010; O’Day et al. 2006 |
| Tetratricopeptide repeats (TPR) | 34 aa | Multiple array of α-helix turn α-helix unit packaged in parallel | A right-handed super-helix that provides a concave groove for molecule binding | Cell cycle regulation, chaperone functioning, protein translocation, bacterial pathogenesis, and biogenesis of multi-functional pili | Prokaryotes and eukaryotes including humans | Cerveny et al. 2013; Zeytuni and Zarivach 2012 |
| Sel1 repeats | 33 to 44 aa | Multiple array of α-helix turn α-helix unit packaged in parallel | A right-handed super-helix | ER-associated protein ubiquitination, regulation of mitosis and septum formation, host-pathogen interaction | Bacteria and eukaryotes | Newton et al. 2007; Mittl and Schneider-Brachert 2007 |
| WD40 repeats | 40 aa | Four anti-parallel β-sheet arranged radially with flanking dipeptide | Propeller like structure | Gene regulation, chromatin modelling, transmembrane signalling, mRNA modification, vesicle fusion and adhesion complex of malarial parasites | Eukaryotes | Suganuma, Pattenden, and Workman 2016; von Bohl et al. 2015; Neer et al. 1994 |
| Kelch repeats | 44 to 56 aa | Four anti-parallel β-sheet arranged radially with flanking dipeptide | Propeller like structure | Actin binding, manipulates cell organization and morphology | Prokaryotes, eukaryotes and viruses | Prag and Adams 2003; Adams, Kelso and Cooley 2000 |
| MORN repeats | 23 aa | Not known | Not known | Parasites' budding, protein translocation, flagellum biogenesis, form junctional complex between plasma membrane to endoplasmic reticulum, promotes phagocytosis of bacterium | Prokaryotes and eukaryotes | Morriswood and Schmidt 2015; Abnave et al. 2014; Cuttel et al. 2008; Gubbels et al. 2006; Hui Ma et al. 2006; Takeshima et al. 2000 |

1.2 Effect of genes encoding RDCPs on AV genome size

We compared the frequency of occurrence of genes encoding RDCPs and of genes encoding core viral functions, their association with genome size (Fig. 1A and B), and their genomic location in representative genomes from the 13 giant virus families (Fig. 1C). A near-linear relationship was observed between the number of genes encoding RDCPs and the genome size of most large viruses (Fig. 1A). The trend is most evident in AVs, where the number of genes encoding RDCPs correlated with an increase in the genome size (r² = 0.87). No such correlation was observed between the genome size and the number of genes encoding core viral functions (r² = 0.11) (Fig. 1B). The correlation was less evident in other giant viruses, viz. Asfarviridae, Poxviridae, Iridoviridae, and Phycodnaviridae, which are not known to infect amoeba, suggesting that genome expansion via the acquisition of RDCPs is specific to AVs (Moliner, Fournier and Raoult 2010). Interestingly, genes encoding RDCPs are concentrated towards the termini on either side of the core genes (Fig. 1C). This arrangement is most apparent in Mimiviridae family members. AVs with significantly smaller genomes (Mollivirus and Faustovirus) have fewer RDCPs, spread across the genome, and in AVs with larger genomes (Pandoravirus), RDCPs appear to have spread throughout the genome. Proteins with repeat domains play important roles in protein–protein interactions (Table 1; Brüggemann, Cazalet, and Buchrieser, 2006). Interestingly, when Mimivirus was propagated repeatedly under a competition-free axenic environment, genes present in the termini region were lost (Boyer et al. 2011; Colson and Raoult 2012). The lost patches include the genes encoding proteins participating in fiber formation and glycosylation, and ANK repeat proteins (Boyer et al. 2011). In a competitive environment, however, the presence of fibers increases the virion size and may facilitate efficient phagocytosis. In addition, the genomic termini populated with RDCPs might aid survival in a sympatric environment but are under low selection pressure in an allopatric environment and can afford deletions (Boyer et al. 2011). This ensures the protection of the centrally located core genome, which is also thought to be less recombinogenic than the termini (Filee, Siguier, and Chandler 2007; Boyer et al. 2011). We suggest that in a competitive environment, accumulation of RDCPs in the termini provides a selective advantage over other viruses and bacteria.
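The r² values quoted in this section are coefficients of determination from a linear regression of gene counts on genome size. A minimal sketch of that calculation is given here, assuming an ordinary least-squares fit of one variable on another. The function name `linear_fit_r2` and the numbers in the example are hypothetical placeholders; they are not the curated data behind Fig. 1 and will not reproduce the reported values.

```python
def linear_fit_r2(x, y):
    """Ordinary least-squares fit y = slope * x + intercept.

    Returns (slope, intercept, r2), where r2 is the coefficient of
    determination of the fitted line.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain more than one distinct value")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = (sxy * sxy) / (sxx * syy)  # equals the squared Pearson correlation
    return slope, intercept, r2


# Hypothetical placeholder values, NOT the data plotted in Fig. 1.
genome_size_kb = [170, 350, 620, 1180, 2470]
rdcp_gene_count = [3, 12, 35, 80, 150]

slope, intercept, r2 = linear_fit_r2(genome_size_kb, rdcp_gene_count)
print(f"slope = {slope:.4f} genes per kb, r2 = {r2:.2f}")
```

Running the same function on the counts of core genes would give the comparison shown in Fig. 1B, provided the per-genome counts are available.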
Figure 1.

A near-linear relationship between the genome size and the number of genes encoding RDCPs in AVs. Core genes and RDCPs were manually curated from 13 published genomes. Core function definitions were chosen as per previous reports (Raoult et al. 2004; Yutin, et al. 2009; Yutin, Raoult, and Koonin 2013; Yutin et al. 2013). These included genes encoding DNA replication, recombination and repair, transcription and RNA processing, translation and post-translational modifications, nucleotide metabolism, virion packaging, and morphogenesis. Genes encoding these functions in the 13 representative NCLDV families were retrieved as per annotations in the public databases. Genes for which annotations were not updated, but which yielded significant alignment matches in the InterPro, CDD, Pfam and SMART servers, were also included. (A) Scatterplot of the number of repeat protein families plotted against genome size. A high correlation between the number of repeat protein families and genome size (r² = 0.87) was observed. (B) Scatterplot of the number of genes encoding core viral functions plotted against genome size, which shows a poor association between the two (r² = 0.11). In (A) and (B), the shaded area indicates the standard error as per a linear regression model. The size of the data label (solid dot) representing each genome is proportional to the genome size. The number alongside each data label corresponds to the number inside the ideograms shown in Figure 1C. (C) Circos-generated ideograms of giant viral genomes. The outer concentric ring represents the clusters of repeat domain proteins/multigene families, which include proteins containing ANK repeats = red, FNIP repeats = green, MORN repeats = blue, Sel1 repeats = yellow, TPR = purple, WD40 repeats = black, LRR and Kelch repeats = gray, and the inner concentric ring denotes the core genome. APMV, Acanthamoeba polyphaga mimivirus; APMoV, Acanthamoeba polyphaga moumouvirus; PBCV 1, Paramecium bursaria chlorella virus 1; ASFV, African swine fever virus.


1.3 HGT of RDCPs in AVs

AVs might have acquired genes encoding RDCPs from amoeba and bacteria by various HGT mechanisms, resulting in genome expansion. Virophages, polintoviruses, and transpovirons, associated with AVs, facilitate HGT between AVs and their host environment (Desnues et al. 2012; Yutin, Raoult, and Koonin 2013; Yutin et al. 2013). Akin to mobile genetic elements (MGEs), these three drive HGT in AVs and have contributed to a shared gene pool, consisting of a variety of genes encoding essential functions and transposition, giving rise to the mobilome of AVs (Yutin, Raoult, and Koonin 2013; Yutin et al. 2013) as discussed in Box 1. The presence of multiple MGEs in nucleocytoplasmic large DNA virus (NCLDV) genomes, along with proteins known for DNA transport, points to a genome populated with agents for large-scale genomic insertion, deletion, and rearrangement. The presence of self-splicing intronic regions in several genes, including those encoding capsid proteins and DNA polymerase, in some AVs but not in other viruses also suggests their acquisition from eukaryotic genomes (Arslan et al. 2011). This is analogous to HGT in bacteria and eukaryotes, which facilitates the development of drug resistance (Novais et al. 2010), defense systems (Makarova et al. 2011; Krupovic et al. 2014), regulatory roles in transcriptional and signaling mechanisms (Negi, Rai, and Suprasanna 2016), and immunological variation (Huang et al., 2016). The genome expansion ensuing from this plasticity could be crucial for enabling the evolutionary success of AVs, as seen in ARBs with similar MGE architecture.

1.4 RDCPs and the lineage-specific genome expansion in AVs

Initial studies indicated the apparent monophyly of AVs (Yutin and Koonin 2012; Zade, Sengupta, and Kondabagil 2015), and recent comparative genomics of diverse AVs has provided more robust phylogenies suggesting a probable lineage-specific expansion in AVs (Iyer et al. 2006; Filée 2009). Tracing the genome size over a phylogeny based on the B family DNA polymerase amino acid sequence, which is conserved across NCLDVs, suggests the presence of larger genomes in the AV lineages (Fig. 2A). This expansion in AV lineages could be primarily attributed to the acquisition of RDCPs, the number of which shows a positive correlation with genome size (Fig. 1A). However, in the case of Pandoravirus, the genome expansion may be free from geometrical constraints, unlike in other AVs such as Mimivirus and Faustovirus, where the viral morphology may limit genome size expansion.
Figure 2.

A speculative hypothesis on the RDCP-driven lineage-specific genome expansion in AVs. (A) Genome size distribution and B family DNA polymerase phylogeny. A maximum-likelihood (ML) tree of B family DNA polymerase amino acid sequences was constructed using FastTree with default settings and a representative sequence from each of 13 NCLDV families. A large red circle on the internal node of the AV lineage indicates a more recent ancestor from which we believe genome expansion has ensued, especially in the amoebal milieu. Smaller red circles indicate a much more recent ancestor from which an independent genome expansion strategy might have led to larger genomes in Faustoviruses and Pithovirus. Black and purple circles indicate ancestors of unknown genome size and nature. More genome sequences are needed to resolve the genome size distribution pattern and its evolutionary link to the nature of the ancestor in large DNA viruses. (B) Circos ideogram of the Mimivirus genome. Three concentric rings, labeled 1, 2, and 3, represent RDCPs, core and hypothetical genes, and mobile elements, respectively. The bipartite AV genome consists of a conserved core region derived from a common ancestor and the RDCPs that are clustered in the genomic termini of the AVs. In addition to aiding in genome expansion, the RDCPs may also help in survival in a competitive environment (see Fig. 3 for details). In an allopatric condition, most of these RDCPs are lost, causing a reduced genome size (Boyer et al., 2011; Colson and Raoult 2012).

Figure 3.

Putative roles of various RDCPs in the AV infection cycle. The giant capsid mimics the size of bacteria, promoting phagocytosis in a sympatric environment and prohibiting host encystment. Once inside, the virus suppresses the host immune system by interfering with host defense mechanisms, interacting with various host proteins via repeat domain-containing proteins (that also mimic some of the host proteins) and/or diverting them to ubiquitination. The distinct phases of the intra-amoebal life cycle of a virus involve: (1) Particle size plays an important role in the mode of entry of viruses. As seen in other viruses (Cui et al. 2014), the large particle size may be driven by genome expansion, caused by accumulation of RDCPs. (2) Once the virus is phagocytosed, the encystment of the trophozoite is arrested and the fusion of the phagosome with the lysosome is inhibited by ankyrin, TPR, WD40, and Sel1 repeat domain proteins, as has been reported in intra-amoebal parasitic bacteria (Shchelkunov, Blinov, and Sandakhchiev 1993; Newton et al. 2007; Cerveny et al. 2013; Nguyen, Liu, and Thomas 2014). Some of the RDCPs have been reported to be packaged in the virion, indicating their role in the initiation of the viral replication cycle (Renesto et al. 2006). (3) The viral genome is released into the cytoplasm from the phagosome and the formation of a replication center is initiated by the recruitment of various cytoplasmic membranes, mitochondria, and cytoskeletal components. This formation requires a number of complex interactions and signaling pathways that are probably mediated by FNIP, ANK, Sel1, WD40, and/or MORN repeat domain proteins. (4) During infection, RDCPs such as LRR, FNIP, IP22, WD40, ANK repeat, and F-box proteins might interfere with host defense mechanisms. They have been shown to modify/regulate host gene expression and to divert host proteins to ubiquitination or mimic some of the inhibitory molecules to suppress the immune pathways (Sharma and Pandey 2015). (5) During the infection cycle, host cell morphology changes to avoid superinfection. This morphological change is brought about by MORN, Kelch, FNIP and ANK repeat domain proteins (Table 1). In addition, MORN repeat-containing proteins might also promote the degradation of other internalized microorganisms. (6) Unlike AVs, bacteria are unable to interfere with the formation of the phagolysosome and are consequently digested by the hydrolytic enzymes in the lysosome (Cosson and Soldati 2008; Akya, Pointon and Thomas 2009). Although phagocytosis of AVs and bacteria is primarily driven by particle size, they have distinct fates. The RDCPs emerge as crucial drivers of both the particle size and a successful viral life cycle.

As seen in plants and some pathogenic bacteria, RDCPs are characterized by frequent duplications and deletions (Siozios et al. 2013; Sharma and Pandey 2015), which confer plasticity to their genomes. Genome plasticity imparted by genes encoding RDCPs in AVs could be a major contributor to their ‘accordion’-like evolution (Filee 2013; Filee 2015). An accretion scenario considers a smaller virus as an ancestor of giant viruses (Yutin, Wolf, and Koonin 2014; Koonin, Krupovic, and Yutin 2015) that grew larger in some lineages by gene acquisition, leading to both genome and particle size expansion (Rodrigues et al. 2016). On the other hand, a genome reduction scenario considers evolution from an ancestor with a larger genome (Claverie and Abergel 2013; Filee 2013). Although the presence of HGT-derived genes and MGEs (Filee, Siguier, and Chandler 2007) has been used as evidence for the former argument, the presence of some translation-related genes and the lack of cellular homologs of giant viral genes (Jeudy et al. 2012; Abrahão et al. 2017) have been used to support the latter. Genes related to key processes such as transcription, nucleotide metabolism, translation, virion assembly, and DNA packaging are part of the Nucleo-Cytoplasmic Virus Orthologous Genes (NCVOGs) (Yutin et al. 2009) and are believed to be vertically transferred from a common ancestor (Raoult et al. 2004; Iyer et al. 2005, 2006; Chelikani et al. 2014; Zade, Sengupta, and Kondabagil 2015). Isolation of several novel NCLDVs and their genomic characterization have reduced the number of conserved genes to nine (Iyer, Aravind, and Koonin 2001; Yutin et al. 2009), with a conceivable diversity in other core genes arising from replacement of essential genes by unrelated ones with similar function (Forterre 2006; Iyer et al., 2006; Filee, Pouget and Chandler 2008). Further, it was also suggested that the common ancestor encoded several genes in addition to the basal machinery, indicating that the NCLDV ancestor was relatively complex (Yutin et al. 2009; Koonin and Yutin 2010). A majority of the other (non-core) NCVOGs are encoded by two or more of the NCLDV families. The core genomic landscape of the vertically transferred genes with lineage-specific diversification is reminiscent of gene reservoirs of pathogenic bacteria, which facilitate rapid adaptation to the host (Hannan 2012; Andam and Hanage 2015; McNally et al. 2016). Based on the location of the genes encoding RDCPs and genes with known viral functions, the AV genomes could be thought of as bipartite, with the central core genome flanked by the genomic termini (Fig. 2B). The core genes, which are under high selection pressure, predominate in the central part. The peripheral segments on either side harbor genes encoding RDCPs, which confer plasticity and are under relatively less selection pressure. This bipartite genome may undergo lineage-specific expansion, primarily through accumulation and duplication of genes encoding RDCPs, resulting in a large genome size. This is consistent with the view that the members of Mimiviridae might have undergone genomic expansion from a common ancestor, as against a probable genome reduction scenario in some members of the Phycodnaviridae family, which infect algae (Maruyama and Ueki 2016). Although the list of sequenced large DNA viral genomes from wider geographies is growing (Hingamp et al. 2013; Aherfi et al. 2016; Chatterjee et al. 2016a,b), isolation and sequencing of more large DNA viruses would enable the description of phylogenetic intermediates that are critical for a parsimonious explanation of particle and genome size evolution. Despite missing the probable clade- and lineage-specific ancestors, we observed genomic arrangement patterns in AVs which may enable their intra-amoebal lifestyle (Figs. 2A and 3).

1.5 Competitive advantage of large particle size driven by gene accretion including RDCPs

The capsid that harbors the giant genome plays a major role in the entry of these viruses into their respective hosts (Rodrigues et al. 2016). The mode of entry of metazoan and algal viruses differs from that of the AVs. Asfarvirus, Iridovirus, and Poxvirus enter the multicellular host by actin-dependent macropinocytosis or receptor-mediated endocytosis (Rodrigues et al. 2016). Poxviruses also enter the host by fusion of their membrane with the plasma membrane (Moss 2012; Rizopoulos et al. 2015). Phycodnaviruses generally enter their algal host by degradation of the host cell membrane (Wilson, Van Etten, and Allen 2009). Giant viruses such as Mimivirus, Pandoravirus, Pithovirus, and Mollivirus undergo phagocytosis (Fig. 3) (Rodrigues et al. 2016), which is predominantly a function of the size of the particle; the threshold size for entry is ∼500 nm (Korn and Weisman 1967). The importance of particle size in the mode of entry is further exemplified in the case of Marseillevirus, which is phagocytosed when present as a ‘parcel’ (many particles) in a vesicle (>1 µm). However, when present as a solitary particle of ∼220 nm, Marseillevirus undergoes endocytosis or macropinocytosis (Arantes et al. 2016). Amoebae generally graze on particles of the general size of a bacterium (Korn and Weisman 1967) and digest them via the phagolysosome pathway (Fig. 3; Khan 2001; Akya, Pointon, and Thomas 2009; Raoult and Boyer 2010). Thus, the giant size, largely driven by the acquisition and duplication of RDCPs, is critical for infecting amoeba via phagocytosis (Rodrigues et al. 2016). Once phagocytosed, giant viruses must subvert encystment and hijack the host cellular machinery to initiate the formation of the viral replication center (Fig. 3, see figure legend for details). 
The rapidity of the hijack necessitates a multipronged approach: naturalization into the host via gene products adapted to the host pathways, and infectiousness, which directs cellular processes towards the synthesis of viral proteins. Many of these steps are mediated by RDCPs. Some of these, such as WD40 and ANK repeat-containing proteins, are packaged in the Mimivirus particle, indicating their imminent role in initiating the viral replication cycle (Renesto et al. 2006).

Figure 3. Putative roles of various RDCPs in the AV infection cycle. The giant capsid mimics the size of bacteria, promoting phagocytosis in a sympatric environment while prohibiting host encystment. Once inside, the virus suppresses the host immune system by interfering with host defense mechanisms, interacting with various host proteins via repeat domain-containing proteins (which also mimic some of the host proteins) and/or diverting them to ubiquitination. The distinct phases of the intra-amoebal life cycle of a virus are as follows.
(1) Particle size plays an important role in the mode of entry of viruses. As seen in other viruses (Cui et al. 2014), the large particle size may be driven by genome expansion caused by accumulation of RDCPs.
(2) Once the virus is phagocytosed, the encystment of the trophozoite is arrested and the fusion of the phagosome to the lysosome is inhibited by ankyrin, TPR, WD40, and Sel1 repeat domain proteins, as has been reported in intra-amoebal parasitic bacteria (Shchelkunov, Blinov, and Sandakhchiev 1993; Newton et al. 2007; Cerveny et al. 2013; Nguyen, Liu, and Thomas 2014). Some of the RDCPs have been reported to be packaged in the virion, indicating their role in the initiation of the viral replication cycle (Renesto et al. 2006).
(3) The viral genome is released into the cytoplasm from the phagosome, and the formation of a replication center is initiated by the recruitment of various cytoplasmic membranes, mitochondria, and cytoskeletal components. This formation requires a number of complex interactions and signaling pathways that are probably mediated by FNIP, ANK, Sel1, WD40, and/or MORN repeat domain proteins.
(4) During infection, RDCPs such as LRR, FNIP, IP22, WD40, ANK repeat, and F-box proteins might interfere with host defense mechanisms. They have been shown to modify or regulate host gene expression and to subvert host proteins to ubiquitination, or to mimic some of the inhibitory molecules to suppress the immune pathways (Sharma and Pandey 2015).
(5) During the infection cycle, host cell morphology changes to avoid superinfection. This morphological change is brought about by MORN, Kelch, FNIP, and ANK repeat domain proteins (Table 1). In addition, MORN repeat-containing proteins might also promote the degradation of other internalized microorganisms.
(6) Unlike AVs, bacteria are unable to interfere with the formation of the phagolysosome and are consequently digested by the hydrolytic enzymes in the lysosome (Cosson and Soldati 2008; Akya, Pointon, and Thomas 2009). Although phagocytosis of AVs and bacteria is primarily driven by particle size, they have distinct fates. The RDCPs emerge as crucial drivers of both particle size and a successful viral life cycle.

Box Figure A. Organization of various genes and their homologs in AVs, virophages, Polintons, transpovirons, and IS elements constituting the predicted mobilome network (Desnues et al. 2012; Yutin, Raoult, and Koonin 2013; Yutin et al. 2013). Despite limited synteny, the mobilome exhibits genetic and functional conservation. AVs, virophages, and Polintons encode four core genes, viz. packaging ATPase, major and minor capsid proteins, and cysteine protease. The presence of different types of helicases across the mobilome illustrates functional conservation. All the members of the mobilome have one or more genes encoding transposase, integrase, and endonuclease, which facilitate genetic exchange. Although inverted repeats are encoded in all the genomes, they have not been reported in the terminal regions of CroV and Mamavirus.

Box Figure B. Probable evolutionary routes for the exchange of mobile elements in AVs. The closest homologs of the various domains of the mobilome are from non-viral systems, suggesting their acquisition from different microbial sources sharing the same niche. Dotted lines show probable transmission, whereas solid blue lines show classification. +CP, with capsid protein; −CP, without capsid protein.
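
The size dependence of the entry route described in this section can be illustrated with a minimal sketch. It is a deliberate simplification, not a model proposed by the authors: it treats the ∼500 nm phagocytosis threshold (Korn and Weisman 1967) as a hard cutoff, and the function name, the constant name, and the 1000 nm value used for the Marseillevirus ‘parcel’ (reported only as >1 µm) are assumptions made for illustration.

```python
# Illustrative sketch only. The hard cutoff and all names are assumptions;
# real entry routes also depend on host and particle properties.

# Approximate size threshold for phagocytosis cited in the text (~500 nm).
PHAGOCYTOSIS_THRESHOLD_NM = 500.0


def predicted_entry_route(particle_diameter_nm: float) -> str:
    """Return the entry route suggested by particle size alone."""
    if particle_diameter_nm <= 0:
        raise ValueError("particle diameter must be positive")
    if particle_diameter_nm >= PHAGOCYTOSIS_THRESHOLD_NM:
        return "phagocytosis"
    return "endocytosis or macropinocytosis"


if __name__ == "__main__":
    # Sizes taken from the text; 1000 nm stands in for a parcel of ">1 µm".
    examples = {
        "Marseillevirus, solitary particle": 220.0,
        "Marseillevirus, vesicle 'parcel'": 1000.0,
    }
    for label, diameter_nm in examples.items():
        print(f"{label} ({diameter_nm:.0f} nm): {predicted_entry_route(diameter_nm)}")
```

Under these assumptions, the same virus is assigned different routes depending only on whether it is presented as a solitary particle or as a parcel, which mirrors the Marseillevirus observation (Arantes et al. 2016).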

2. Conclusion: repeat domain proteins are essential for intra-amoebal adaptation

Whether acquired vertically or horizontally, the genomic composition of AVs exhibits exceptional variability. Genomes of AVs could be thought of as bipartite, with genes encoding core functions populating the center and genes encoding RDCPs frequenting the termini. Unsurprisingly, RDCPs, which are considered to be hotspots of protein evolution (Persi et al. 2016), emerge as one of the key genetic elements responsible for the lineage-specific genome expansion of AVs. With most genes in AVs found to be under purifying selection (Doutre et al. 2014), RDCPs are also expected to contribute to virus fitness. However, as in Ohno’s dilemma (Bergthorsson, Andersson, and Roth 2007), strong purifying selection on RDCPs would reduce diversity. Consequently, as seen in repeat domain proteins across cellular organisms, RDCPs of AVs might undergo cycles of relaxed and strong purifying selection (Persi et al. 2016) to provide increased fitness in a competitive host environment, such as amoeba. This is expected to lead to the evolution of new functions and/or establishment of existing functions. We suggest that the acquisition of RDCPs in AVs facilitated both genome expansion and host adaptation. The latter probably led to an allometric increase in the particle size. Finally, similar to a ‘telomeric strategy’, these elements are concentrated towards the termini, protecting the core genes. This genomic arrangement of RDCPs in the termini may be crucial for AVs to adapt to a wide variety of hosts and outcompete prokaryotes and other viruses in the prokaryote-grazing protozoan milieu.

Data availability

Data are available through Dryad. Conflict of interest: None declared.
References (10 of 110 shown)

1.  The 1.2-megabase genome sequence of Mimivirus.

Authors:  Didier Raoult; Stéphane Audic; Catherine Robert; Chantal Abergel; Patricia Renesto; Hiroyuki Ogata; Bernard La Scola; Marie Suzan; Jean-Michel Claverie
Journal:  Science       Date:  2004-10-14       Impact factor: 47.728

Review 2.  Giants among larges: how gigantism impacts giant virus entry into amoebae.

Authors:  Rodrigo Araújo Lima Rodrigues; Jônatas Santos Abrahão; Betânia Paiva Drumond; Erna Geessien Kroon
Journal:  Curr Opin Microbiol       Date:  2016-04-01       Impact factor: 7.934

Review 3.  Mechanisms of genome evolution of Streptococcus.

Authors:  Cheryl P Andam; William P Hanage
Journal:  Infect Genet Evol       Date:  2014-11-13       Impact factor: 3.342

4.  Ankyrin-like proteins of variola and vaccinia viruses.

Authors:  S N Shchelkunov; V M Blinov; L S Sandakhchiev
Journal:  FEBS Lett       Date:  1993-03-15       Impact factor: 4.124

5.  Translation in giant viruses: a unique mixture of bacterial and eukaryotic termination schemes.

Authors:  Sandra Jeudy; Chantal Abergel; Jean-Michel Claverie; Matthieu Legendre
Journal:  PLoS Genet       Date:  2012-12-13       Impact factor: 5.917

6.  Genomic comparison of closely related Giant Viruses supports an accordion-like model of evolution.

Authors:  Jonathan Filée
Journal:  Front Microbiol       Date:  2015-06-16       Impact factor: 5.640

7.  Identification of giant Mimivirus protein functions using RNA interference.

Authors:  Haitham Sobhy; Bernard La Scola; Isabelle Pagnier; Didier Raoult; Philippe Colson
Journal:  Front Microbiol       Date:  2015-04-28       Impact factor: 5.640

8.  Positive and strongly relaxed purifying selection drive the evolution of repeats in proteins.

Authors:  Erez Persi; Yuri I Wolf; Eugene V Koonin
Journal:  Nat Commun       Date:  2016-11-18       Impact factor: 14.919

9.  Discovery of an Active RAG Transposon Illuminates the Origins of V(D)J Recombination.

Authors:  Shengfeng Huang; Xin Tao; Shaochun Yuan; Yuhang Zhang; Peiyi Li; Helen A Beilinson; Ya Zhang; Wenjuan Yu; Pierre Pontarotti; Hector Escriva; Yann Le Petillon; Xiaolong Liu; Shangwu Chen; David G Schatz; Anlong Xu
Journal:  Cell       Date:  2016-06-09       Impact factor: 41.582

10.  Exploring nucleo-cytoplasmic large DNA viruses in Tara Oceans microbial metagenomes.

Authors:  Pascal Hingamp; Nigel Grimsley; Silvia G Acinas; Camille Clerissi; Lucie Subirana; Julie Poulain; Isabel Ferrera; Hugo Sarmento; Emilie Villar; Gipsi Lima-Mendez; Karoline Faust; Shinichi Sunagawa; Jean-Michel Claverie; Hervé Moreau; Yves Desdevises; Peer Bork; Jeroen Raes; Colomban de Vargas; Eric Karsenti; Stefanie Kandels-Lewis; Olivier Jaillon; Fabrice Not; Stéphane Pesant; Patrick Wincker; Hiroyuki Ogata
Journal:  ISME J       Date:  2013-04-11       Impact factor: 10.302
