Anderson F Brito, John W Pinney.
Abstract
Herpesviruses (HVs, Family: Herpesviridae) have large genomes that encode hundreds of proteins. Apart from amino acid mutations, protein domain acquisitions, duplications and losses are also common modes of evolution. HV domain repertoires differ across species, and only a core set is shared among all species, which raises a question: how have HV domain repertoires diverged while keeping some similarities? To answer this question, we used profile Hidden Markov Models (HMMs) to search for domains in all possible translated open reading frames (ORFs) of fully sequenced HV genomes. Having identified at least 274 domains, we built a matrix of domain counts per species and applied a parsimony method to reconstruct the ancestral states of these domains along the HV phylogeny. The reconstruction revealed events of domain gain, duplication, and loss over more than 400 million years, during which Alpha-, Beta-, and GammaHVs expanded and condensed their domain repertoires at distinct rates. Most of the acquired domains perform 'Modulation and Control', 'Envelope', or 'Auxiliary' functions, categories that showed high flexibility (number of domains) and redundancy (number of copies). Conversely, few gains and duplications were observed for domains involved in 'Capsid assembly and structure' and 'DNA Replication, recombination and metabolism'. Among the forty-one primordial domains encoded by Herpesviridae ancestors, twenty-eight are still found in all present-day HVs. Because of their distinct evolutionary strategies, HV domain repertoires are very specific at the subfamily, genus and species levels. Differences in domain composition may not only explain HV host range and tissue tropism, but also provide hints to the origins of HVs.
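The two computational steps named in the abstract, a matrix of domain counts per species and a parsimony reconstruction along a phylogeny, can be illustrated with a minimal sketch. This is not the authors' pipeline: the species names, the hit list, and the three-taxon tree are hypothetical, and Fitch parsimony on presence/absence is used here only as one simple parsimony variant (the abstract does not say which method was applied).

```python
from collections import Counter, defaultdict

# Hypothetical HMM hits: one (species, domain) pair per detected domain copy.
hits = [
    ("virusA", "PF00136"), ("virusA", "PF00606"), ("virusA", "PF00606"),
    ("virusB", "PF00136"), ("virusB", "PF00606"),
    ("virusC", "PF00136"),
]

def count_matrix(hits):
    """Return {species: Counter(domain -> number of copies)}."""
    matrix = defaultdict(Counter)
    for species, domain in hits:
        matrix[species][domain] += 1
    return matrix

def fitch(tree, leaf_states):
    """Bottom-up pass of Fitch parsimony on a binary tree.

    `tree` is a leaf name (str) or a pair (left, right) of subtrees.
    Returns (candidate state set at the root, minimum number of changes).
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, leaf_states)
    right_set, right_cost = fitch(right, leaf_states)
    shared = left_set & right_set
    if shared:
        return shared, left_cost + right_cost
    # No shared state: at least one change is needed on this node's branches.
    return left_set | right_set, left_cost + right_cost + 1

matrix = count_matrix(hits)
tree = (("virusA", "virusB"), "virusC")  # hypothetical topology

def presence(domain):
    """Collapse copy numbers to presence (1) or absence (0) per species."""
    return {sp: int(matrix[sp][domain] > 0) for sp in ("virusA", "virusB", "virusC")}

root_606 = fitch(tree, presence("PF00606"))
root_136 = fitch(tree, presence("PF00136"))
```

In this toy case the domain present in every species needs no changes, while the domain missing from one species needs a single gain or loss, and the root state is left ambiguous. Reconstructing copy numbers rather than presence/absence would require a parsimony variant for ordered states.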
Keywords: ancestral reconstruction; gene duplication; gene loss; horizontal gene transfer; phylogenetics
Year: 2020 PMID: 32042448 PMCID: PMC7000910 DOI: 10.1093/ve/veaa001
Source DB: PubMed Journal: Virus Evol ISSN: 2057-1577
Figure 1. Time-calibrated MCC tree of HVs (Family: Herpesviridae), inferred using amino acid sequences from UL15, UL27, and UL30 as *Beast partitions. Posterior probabilities are shown at the nodes. HV subfamilies are shown as α (Alphaherpesvirinae), β (Betaherpesvirinae), and γ (Gammaherpesvirinae). HV genera are shown as: CY, Cytomegalovirus; IL, Iltovirus; LY, Lymphocryptovirus; MC, Macavirus; MD, Mardivirus; MU, Muromegalovirus; PB, Proboscivirus; PE, Percavirus; RH, Rhadinovirus; RO, Roseolovirus; SC, Scutavirus; SP, Simplexvirus; and VA, Varicellovirus. The geologic time scale is set according to Gradstein et al. (2012), where D, Devonian period; C, Carboniferous; P, Permian; T, Triassic; J, Jurassic; K, Cretaceous; Pε, Paleogene; N, Neogene; and *, Quaternary period (Q). The MCC tree is available at the GitHub repository (see Data availability section).
Figure 2. The domain repertoire. A total of 274 protein domains were identified from ORFs encoded by genomes of members of the Herpesviridae family. In this circular map, viral species are positioned in the same vertical order as in Fig. 1, with the heatmaps of distinct subfamilies clearly separated. Domains with similar functions were grouped, and the horizontal order of domains (clockwise) was defined by their absolute frequency (i.e. the most conserved domains are shown in the first columns of each group). Note that the lightest color shade of each group indicates domains that are absent in some species. Only a small set of domains (n = 28, see arrowheads) was found across all species, and most domains (n = 203) are restricted to a single subfamily. Supplementary Table S1 presents a matrix of domain counts per viral species. The circular map was designed using Circos (Krzywinski et al. 2009).
Figure 3. Distribution of protein domains across distinct HV subfamilies (α, β, and γ). A set of forty-six domains was found in all subfamilies, but not in all species. As highlighted, the large majority of the protein domains of HVs are highly subfamily-specific. Domains depicted in this diagram are listed in Supplementary Table S2, including their distribution in each HV subfamily.
Figure 4. The evolution of the domain repertoire of the family Herpesviridae. Using the same tree shown in Fig. 1, events of domain gain, loss, and duplication (in this order) were mapped along the branches where they most likely took place according to their ancestral character reconstruction. The color scheme shown here is the same used in Fig. 2. Readers can access an interactive version of this tree on iToL (Letunic and Bork 2016): https://itol.embl.de/tree/12931243206234041528053583.
Figure 5. Frequency of gains (A), duplications (B), and losses (C) of domains and molecular functions across four time intervals. Over 1,000 simulations, each event had its time of occurrence sampled randomly along the branch it belongs to. This allowed the events shown in Fig. 4 to be split into four chronological bins, highlighting their frequency in each: Devonian–Carboniferous–Permian (D–C–P); Triassic–Jurassic (T–J); Cretaceous (K); and Pε–N–Q. In this way, the scenario observed here summarizes the one shown in Fig. 1. Since the number of branches increases exponentially, absolute values at each bin in the subplots of Fig. 5 cannot be directly compared due to differences in scale. Therefore, the purpose of these plots is to highlight which events and domain functions were predominant in each interval.
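The sampling procedure described in this caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: event times are drawn uniformly along each branch (the caption only says "sampled randomly"), the period boundaries are approximate values in Myr before present, the bin labels are ASCII stand-ins, and the example events are hypothetical.

```python
import random
from collections import Counter

# Approximate younger bounds of each bin, in Myr before present (assumed values).
BINS = [("D-C-P", 252.2), ("T-J", 145.0), ("K", 66.0), ("Pe-N-Q", 0.0)]

def time_bin(age):
    """Return the label of the interval that contains an age given in Myr ago."""
    for label, younger_bound in BINS:
        if age >= younger_bound:
            return label
    raise ValueError("age must be non-negative")

def simulate(events, n_sims=1000, seed=0):
    """Average number of events per (kind, bin) over n_sims random placements.

    `events` is a list of (kind, branch_start_age, branch_end_age), with ages
    in Myr ago and branch_start_age >= branch_end_age.
    """
    rng = random.Random(seed)
    totals = Counter()
    for _ in range(n_sims):
        for kind, start, end in events:
            age = rng.uniform(end, start)  # uniform placement: an assumption
            totals[(kind, time_bin(age))] += 1
    return {key: n / n_sims for key, n in totals.items()}

# Hypothetical events: one gain on an old branch, one loss on a branch
# that spans the Cretaceous/Paleogene boundary.
events = [("gain", 400.0, 300.0), ("loss", 100.0, 50.0)]
frequencies = simulate(events)
```

An event on a branch that lies entirely inside one interval always lands in that bin, whereas an event on a branch crossing a boundary is split between bins in proportion to how often its sampled time falls on each side.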
Figure 6. The impact of gains, losses, and duplications shaping the domain repertoires of HV subfamilies. (A) Number of events (×10⁻³) normalized by tMRCA (in Myr) and number of taxa (spp). (B) Absolute frequency of gains, losses and duplication of domains from distinct functional categories.
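One plausible reading of the normalization in panel A is events divided by both the tMRCA and the number of species, reported in units of 10⁻³. The sketch below encodes that reading only; the exact formula is an assumption, and the input numbers are hypothetical rather than values from the paper.

```python
def normalized_rate(n_events, tmrca_myr, n_taxa):
    """Events per Myr per species, expressed in units of 10^-3 (assumed formula)."""
    if tmrca_myr <= 0 or n_taxa <= 0:
        raise ValueError("tMRCA and number of taxa must be positive")
    return n_events / (tmrca_myr * n_taxa) * 1e3

# Hypothetical subfamily: 30 events, tMRCA of 200 Myr, 20 species.
example_rate = normalized_rate(30, 200.0, 20)
```

Dividing by both quantities lets subfamilies of different ages and sizes be compared on the same scale.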
Minimal set of forty-one domains encoded by the MRCA of Herpesviridae.
| Domain | Gene | Absolute frequency | Distribution | Function |
|---|---|---|---|---|
| PF01771 (Herpes_alk_exo) | UL12 | 75 | α + β + γ | Auxiliary |
| PF01712 (dNK) | – | 26 | α + β + γ | Auxiliary |
| PF02499 (DNA_pack_C) | UL15 | 75 | α + β + γ | Capsid assembly and structure |
| PF02500 (DNA_pack_N) | UL15 | 75 | α + β + γ | Capsid assembly and structure |
| PF04559 (Herpes_UL17) | UL17 | 75 | α + β + γ | Capsid assembly and structure |
| PF01802 (Herpes_V23) | UL18 | 75 | α + β + γ | Capsid assembly and structure |
| PF03122 (Herpes_MCP) | UL19 | 75 | α + β + γ | Capsid assembly and structure |
| PF01499 (Herpes_UL25) | UL25 | 75 | α + β + γ | Capsid assembly and structure |
| PF00716 (Peptidase_S21) | UL26.5 | 75 | α + β + γ | Capsid assembly and structure |
| PF01366 (PRTP) | UL28 | 75 | α + β + γ | Capsid assembly and structure |
| PF02718 (Herpes_UL31) | UL31 | 75 | α + β + γ | Capsid assembly and structure |
| PF01673 (Herpes_env) | UL32 | 75 | α + β + γ | Capsid assembly and structure |
| PF03581 (Herpes_UL33) | UL33 | 75 | α + β + γ | Capsid assembly and structure |
| PF03327 (Herpes_VP19C) | UL38 | 75 | α + β + γ | Capsid assembly and structure |
| PF01763 (Herpes_UL6) | UL6 | 75 | α + β + γ | Capsid assembly and structure |
| PF04541 (Herpes_U34) | UL34 | 73 | α + β + γ | Capsid assembly and structure |
| PF03167 (UDG) | UL2 | 75 | α + β + γ | DNA Replication, recombination and metabolism |
| PF00747 (Viral_DNA_bp) | UL29 | 75 | α + β + γ | DNA Replication, recombination and metabolism |
| PF00136 (DNA_pol_B) | UL30 | 75 | α + β + γ | DNA Replication, recombination and metabolism |
| PF03104 (DNA_pol_B_exo1) | UL30 | 75 | α + β + γ | DNA Replication, recombination and metabolism |
| PF02689 (Herpes_Helicase) | UL5 | 75 | α + β + γ | DNA Replication, recombination and metabolism |
| PF03121 (Herpes_UL52) | UL52 | 75 | α + β + γ | DNA Replication, recombination and metabolism |
| PF02867 (Ribonuc_red_lgC) | UL39 | 73 | α + β + γ | DNA Replication, recombination and metabolism |
| PF03324 (Herpes_HEPA) | UL8 | 73 | α + β + γ | DNA Replication, recombination and metabolism |
| PF00692 (dUTPase) | UL50 | 69 | α + β + γ | DNA Replication, recombination and metabolism |
| PF00693 (Herpes_TK) | UL23 | 56 | α + β + γ | DNA Replication, recombination and metabolism |
| PF00268 (Ribonuc_red_sm) | UL40 | 55 | α + β + γ | DNA Replication, recombination and metabolism |
| PF00317 (Ribonuc_red_lgN) | UL39 | 52 | α + γ | DNA Replication, recombination and metabolism |
| PF02399 (Herpes_ori_bp) | UL9 | 42 | α + β | DNA Replication, recombination and metabolism |
| PF01528 (Herpes_glycop) | UL10 | 75 | α + β + γ | Envelope |
| PF17488 (Herpes_glycoH_C) | UL22 | 75 | α + β + γ | Envelope |
| PF00606 (Glycoprotein_B) | UL27 | 75 | α + β + γ | Envelope |
| PF17416 (Glycoprot_B_PH1) | UL27 | 75 | α + β + γ | Envelope |
| PF17417 (Glycoprot_B_PH2) | UL27 | 75 | α + β + γ | Envelope |
| PF02489 (Herpes_glycop_H) | UL22 | 72 | α + β + γ | Envelope |
| PF03554 (Herpes_UL73) | – | 62 | α + β + γ | Envelope |
| PF05459 (Herpes_UL69) | UL54 | 75 | α + β + γ | Modulation and Control |
| PF01646 (Herpes_UL24) | UL24 | 75 | α + β + γ | Tegument |
| PF04843 (Herpes_teg_N) | UL36 | 75 | α + β + γ | Tegument |
| PF03044 (Herpes_UL16) | UL16 | 74 | α + β + γ | Tegument |
| PF01677 (Herpes_UL7) | UL7 | 72 | α + β + γ | Tegument |
The domains are shown with their absolute frequency among the fully sequenced genomes of members of the Herpesviridae family (75 = shared by all viruses), their distribution among the three subfamilies (α, β, and γ), and their functional categories. Genes are named as found in the HHV1 genome annotation.
–, genes with no homologs in HHV1.
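As a quick consistency check, the absolute frequencies in the table can be tallied to confirm the counts quoted in the abstract (forty-one ancestral domains, twenty-eight of them present in all 75 genomes). The frequencies below are transcribed from the table, grouped by functional category; the variable names are illustrative.

```python
N_GENOMES = 75  # a frequency of 75 means the domain is shared by all viruses

# Absolute frequencies transcribed from the table, per functional category.
frequencies_by_function = {
    "Auxiliary": [75, 26],
    "Capsid assembly and structure": [75] * 13 + [73],
    "DNA Replication, recombination and metabolism":
        [75] * 6 + [73, 73, 69, 56, 55, 52, 42],
    "Envelope": [75] * 5 + [72, 62],
    "Modulation and Control": [75],
    "Tegument": [75, 75, 74, 72],
}

# Total number of ancestral domains listed in the table.
total_domains = sum(len(freqs) for freqs in frequencies_by_function.values())

# Number of domains per category that are found in every genome.
universal_by_function = {
    function: sum(freq == N_GENOMES for freq in freqs)
    for function, freqs in frequencies_by_function.items()
}
universal_total = sum(universal_by_function.values())
```

The tally shows that 'Capsid assembly and structure' contributes the largest share of universally conserved domains (13 of 28), in line with the abstract's observation that this category saw few gains and duplications.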