Evolutionary origins of human herpes simplex viruses 1 and 2.

Joel O Wertheim, Martin D Smith, Davey M Smith, Konrad Scheffler, Sergei L Kosakovsky Pond.

Abstract

Herpesviruses have been infecting and codiverging with their vertebrate hosts for hundreds of millions of years. The primate simplex viruses exemplify this pattern of virus-host codivergence, at a minimum, as far back as the most recent common ancestor of New World monkeys, Old World monkeys, and apes. Humans are the only primate species known to be infected with two distinct herpes simplex viruses: HSV-1 and HSV-2. Human herpes simplex viruses are ubiquitous, with over two-thirds of the human population infected by at least one virus. Here, we investigated whether the additional human simplex virus is the result of ancient viral lineage duplication or cross-species transmission. We found that standard phylogenetic models of nucleotide substitution are inadequate for distinguishing among these competing hypotheses; the extent of synonymous substitutions causes a substantial underestimation of the lengths of some of the branches in the phylogeny, consistent with observations in other viruses (e.g., avian influenza, Ebola, and coronaviruses). To more accurately estimate ancient viral divergence times, we applied a branch-site random effects likelihood model of molecular evolution that allows the strength of natural selection to vary across both the viral phylogeny and the gene alignment. This selection-informed model favored a scenario in which HSV-1 is the result of ancient codivergence and HSV-2 arose from a cross-species transmission event from the ancestor of modern chimpanzees to an extinct Homo precursor of modern humans, around 1.6 Ma. These results provide a new framework for understanding human herpes simplex virus evolution and demonstrate the importance of using selection-informed models of sequence evolution when investigating viral origin hypotheses.
© The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Keywords:  co-divergence; cross-species transmission; homo; molecular clock; selection; zoonosis

Year:  2014        PMID: 24916030      PMCID: PMC4137711          DOI: 10.1093/molbev/msu185

Source DB:  PubMed          Journal:  Mol Biol Evol        ISSN: 0737-4038            Impact factor:   16.240


Introduction

Herpesviridae is a family of DNA viruses, which epitomize the pattern of viral codivergence with their vertebrate hosts, dating back hundreds of millions of years (McGeoch and Cook 1994; McGeoch et al. 1995). Across the three subfamilies (alpha-, beta-, and gammaherpesvirinae), there have been multitudes of within-host viral lineage duplications (sympatric divergence events [Kitchen et al. 2011]) in which viral descendants follow the phylogenetic history of their host species. Herpes simplex viruses in primates exemplify this phenomenon and evidence of codivergence dates back at least to the ancestor of New World and Old World primates (Simiiformes) 44.2 Ma (Eberle and Black 1993; Luebcke et al. 2006; Steiper and Young 2009) (fig. 1). These viruses have been characterized in monkeys, where each host species is infected with a single, species-specific virus. Humans are the only primate species in which more than one herpes simplex virus has been characterized: HSV-1 and HSV-2 (table 1).
Fig. 1.

General pattern of codivergence for primate herpes simplex viruses and their Simiiforme hosts. Underlined viral taxa indicate phylogenetic incongruence, implying cross-species transmission events. Dashed lines connect virus to host species.

Table 1.

Known Primate Herpes Simplex Viruses.

| Virus | Virus Abbreviation | Host Latin Name | Host Common Name |
| --- | --- | --- | --- |
| Baboon herpes virus 2 | HVP-2 | Papio spp. | Baboons |
| Cercopithecus herpes virus 2 | CeHV-2 | Chlorocebus pygerythrus^a | African green monkey^a |
| Chimpanzee herpes virus | ChHV | Pan troglodytes | Chimpanzee |
| Herpes simplex virus 1 | HSV-1 | Homo sapiens | Human |
| Herpes simplex virus 2 | HSV-2 | H. sapiens | Human |
| Macacine herpes virus 1 | MHV-1 | Macaca spp. | Macaques |
| Saimiriine herpes virus | HVS-1 | Saimiri sciureus | Squirrel monkey |
| Spider monkey herpes virus | HVA-1 | Ateles geoffroyi | Spider monkey |

^a Species is likely not the natural host.

The discovery of the first nonhuman ape simplex virus, chimpanzee herpes simplex virus (ChHV), shed light on how two distinct human simplex viruses arose (Luebcke et al. 2006; Severini et al. 2013). HSV-2 is more closely related to ChHV than it is to HSV-1, which suggests that ChHV and at least one of the human herpes simplex viruses arose via host–virus codivergence. Therefore, there are ten parsimonious scenarios explaining the phylogenetic relationship among HSV-1, HSV-2, and ChHV that require only a single viral lineage duplication or cross-species transmission event (figs. 2 and 3). All of these scenarios predict extant or extinct undiscovered simplex viruses in apes: bonobos (Pan paniscus), gorillas, orangutans, and gibbons (shown in gray in figs. 2 and 3). Primate simplex viruses could have undergone within-host viral lineage duplication after apes diverged from Old World monkeys but before humans diverged from the Pan genus (fig. 2). Alternatively, HSV-1 could be the result of a cross-species transmission event from gibbons, orangutans, or gorillas (fig. 3A–C). Finally, HSV-2 could be the result of a cross-species transmission from Pan troglodytes, P. paniscus, or their common ancestor (fig. 3D–F).
Fig. 2.

Evolutionary scenarios that could produce the human and chimpanzee herpes simplex virus phylogeny via viral duplication within a host lineage. Hypothetical unobserved viruses are shown in gray. Nodes representing the common ancestor of humans and chimpanzees, around 6 Ma, are indicated with asterisks. All scenarios imply a 6 Ma tMRCA for HSV-2/ChHV and a tMRCA >6 Ma for HSV-1/ChHV. (A) Viral duplication prior to the diversification of apes. (B) Viral duplication prior to the diversification of the great apes. (C) Viral duplication prior to the diversification of the African apes. (D) Viral duplication prior to the split between the Homo and Pan genera.

Fig. 3.

Evolutionary scenarios that could produce the human and chimpanzee herpes simplex virus phylogeny via viral cross-species transmission. Hypothetical unobserved viruses are shown in gray. Nodes representing the common ancestor of humans and chimpanzees, around 6 Ma, are indicated with asterisks. Scenarios A–C imply a 6 Ma tMRCA for HSV-2/ChHV and a tMRCA >6 Ma for HSV-1/ChHV, whereas scenarios D–F imply a 6 Ma tMRCA for HSV-1/ChHV and a tMRCA <6 Ma for HSV-2/ChHV. (A) Cross-species transmission of a gibbon virus to a human ancestor, giving rise to HSV-1. (B) Cross-species transmission of an orangutan virus to a human ancestor, giving rise to HSV-1. (C) Cross-species transmission of a gorilla virus to a human ancestor, giving rise to HSV-1. (D) Cross-species transmission of a virus infecting a Pan ancestor to a human ancestor, giving rise to HSV-2. (E) Cross-species transmission of a chimpanzee virus to a human ancestor, giving rise to HSV-2. (F) Cross-species transmission of a bonobo virus to a human ancestor, giving rise to HSV-2.

Although host–virus codivergence is the primary mode of evolution for primate simplex viruses, zoonotic transmission can occur.
For example, humans can be infected by a macaque simplex virus (MHV-1, formerly known as B virus) (Elmore and Eberle 2008), resulting in severe illness, though human-to-human transmission is exceptionally rare (Centers for Disease Control and Prevention 1987). Moreover, CeHV-2 (previously known as SA8) was discovered in an African green monkey (Malherbe and Harwin 1958), though it has become clear that CeHV-2 is likely a baboon virus (Malherbe and Strickland-Cholmley 1969a, 1969b; Kalter et al. 1978; Hilliard et al. 1989; Tyler and Severini 2006). In the case of human herpes simplex viruses, molecular sequence dating could be used to identify which divergence event (i.e., HSV-1/ChHV or HSV-2/ChHV) corresponds to the speciation between Homo sapiens and P. troglodytes around 6 Ma (Kumar et al. 2005). Previous dating analysis accompanying the discovery of ChHV used pairwise genetic distance regression analysis to suggest that HSV-2 was the result of codivergence 6 Ma and that HSV-1 originated from an orangutan cross-species transmission event (fig. 3B) (Luebcke et al. 2006; Severini et al. 2013). However, such analyses may be misleading because, unlike phylogeny-based molecular clocks, they do not account for shared evolutionary history (Drummond et al. 2003). Recent methodological developments in the dating of RNA viruses have suggested that standard evolutionary models used in molecular dating can underestimate the time to most recent common ancestor (tMRCA) (e.g., in measles virus, Ebola virus, avian influenza virus, and coronaviruses) (Wertheim and Kosakovsky Pond 2011; Wertheim et al. 2013). This bias has been attributed to the action of strong purifying selection over long evolutionary time scales, and selection-informed models have been shown to improve branch length estimation (Wertheim and Kosakovsky Pond 2011); even under these models, however, many viruses are likely too old to produce reliable tMRCA estimates.
For viruses such as HIV and influenza A virus, whose evolutionary rate is on the order of 10^-3 substitutions/site/year, it is standard to rely on tMRCA estimates around 100 years old (Korber et al. 2000; Worobey et al. 2008, 2014; Smith et al. 2009). The herpes simplex virus substitution rate is estimated to be around 10^-8 substitutions/site/year (Sakaoka et al. 1994; Norberg et al. 2011), which suggests that tMRCAs on the order of tens of millions of years could be reliably inferred. To resolve the origin of HSV-1 and HSV-2, we propose a novel computational hypothesis testing approach that combines realistic codon-substitution evolutionary models that allow for selection strength heterogeneity with a penalized likelihood molecular clock estimation procedure. The standard evolutionary models produce dating estimates that are not consistent with any of the ten evolutionary origin scenarios. In contrast, our selection-informed model suggests a new explanation for the origin of human herpes simplex viruses, in which HSV-2 was acquired by an extinct Homo species from an ancestor of modern chimpanzees.
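The rate comparison above can be read as a back-of-the-envelope scaling argument: a slower-evolving virus reaches the same expected number of substitutions per site over a proportionally longer time. The sketch below is illustrative only; the function name `comparable_timescale` is hypothetical, and the rates are the order-of-magnitude values quoted in the text, not fitted estimates.

```python
def comparable_timescale(ref_rate, ref_age_years, target_rate):
    """Age at which a lineage evolving at target_rate accumulates the same
    expected substitutions per site as a reference lineage evolving at
    ref_rate over ref_age_years (simple strict-clock arithmetic)."""
    expected_subs_per_site = ref_rate * ref_age_years
    return expected_subs_per_site / target_rate

# Order-of-magnitude rates quoted in the text (substitutions/site/year).
rna_virus_rate = 1e-3   # e.g., HIV, influenza A virus
hsv_rate = 1e-8         # herpes simplex virus

hsv_age_years = comparable_timescale(rna_virus_rate, 100, hsv_rate)
print(f"Comparable HSV time depth: about {hsv_age_years / 1e6:.0f} million years")
```

This ignores saturation and rate variation among lineages, which is exactly where the selection-informed models discussed later become relevant.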

Results

Bayesian Markov Chain Monte Carlo Dating Analysis

To estimate the tMRCAs of HSV-1, HSV-2, and ChHV, we performed molecular dating analysis on a genome-wide data set comprising 12 concatenated glycoprotein sequences. This analysis was carried out in a Bayesian Markov chain Monte Carlo (BMCMC) framework using a standard nucleotide substitution model (GTR + Γ4) under a relaxed molecular clock in the BEAST software package (Drummond and Rambaut 2007; Drummond et al. 2012). The molecular clock was calibrated assuming a general pattern of viral–host codivergence across the phylogeny, using three previously estimated internal node ages (see Materials and Methods for details). Under both general sets of hypotheses (viral lineage duplication [fig. 2] or cross-species transmission [fig. 3]), the tMRCA of either HSV-1/ChHV or HSV-2/ChHV should correspond to the divergence between their human and chimpanzee hosts around 6 Ma (fig. 1). We also performed dating analysis using only glycoprotein B (gB) sequences, which have been sampled in a greater number of taxa, permitting the inclusion of an additional calibration point in New World monkeys (see Materials and Methods). The phylogenetic relationships among the primate simplex viruses were all well supported in the BMCMC analyses in both data sets (posterior probability = 1.0; supplementary fig. S1, Supplementary Material online); the topologies are in agreement with previous viral analyses (Luebcke et al. 2006; Severini et al. 2013) and generally congruent with the host phylogeny (fig. 1). Surprisingly, in the concatenated glycoprotein analysis, neither the HSV-1/ChHV tMRCA (mean = 10.2 Ma, 95% highest posterior density [HPD] = 6.2–14.5 Ma) nor the HSV-2/ChHV tMRCA (mean = 3.2 Ma, 95% HPD = 1.5–5.4) corresponded to the speciation between their human and chimpanzee hosts around 6 Ma (fig. 4A). The same pattern recurred in the gB data set: HSV-1/ChHV tMRCA (mean = 10.4 Ma, 95% HPD = 5.3–16.0 Ma) and HSV-2/ChHV tMRCA (mean = 3.4 Ma, 95% HPD = 1.1–5.9) (fig. 4B). 
Although the split between humans and chimpanzees is likely better reflected by a continuous process whereby alleles segregate over a period of time between 5 and 7 Ma (Kumar et al. 2005; Patterson et al. 2006; Yamamichi et al. 2012), BMCMC dating results support neither the lineage duplication nor the cross-species transmission hypotheses for the origin of HSV-1 and HSV-2.
Fig. 4.

Posterior distributions of the tMRCAs for HSV-1/ChHV and HSV-2/ChHV in BMCMC analysis. (A) Concatenated glycoprotein tMRCA estimates. (B) gB tMRCA estimates. Shaded regions depict the 95% highest posterior densities. The vertical dashed line represents the divergence between humans and chimpanzees around 6 Ma.

Correcting for Selection-Induced Bias in Branch Length Estimates

We continued with an in-depth analysis of the gB data set, because it contains three additional host species for molecular clock calibration and validation. We inferred a maximum likelihood phylogeny for the primate simplex viruses using this gB data set (fig. 5). Again, the phylogenetic relationships among the primate simplex viruses were well supported (approximate likelihood ratio test [aLRT] = 1.0). Standard nucleotide models (e.g., the ubiquitous GTR + Γ4) are known to underestimate branch lengths in RNA viruses, resulting in biased tMRCA inference (Wertheim and Kosakovsky Pond 2011; Wertheim et al. 2013). Therefore, we re-estimated the phylogenetic branch lengths using a model of molecular evolution that accounts for variation in selection pressures both across sites and across the phylogeny: branch-site random effects likelihood (BSREL; Kosakovsky Pond et al. 2011).
Fig. 5.

Branch length expansion under BSREL relative to GTR + Γ4 substitution model in the gB phylogeny. (A) Maximum likelihood tree with branch lengths estimated under GTR + Γ4. HSV-1 and HSV-2 clades are collapsed. (B) Maximum likelihood tree with branch lengths re-estimated under BSREL. Branches determined by cAIC to support multiple dN/dS classes are colored, and the HSV-1 and HSV-2 clades are collapsed. Both trees are shown on the same scale. (C) Comparison of branches inferred under BSREL and GTR + Γ4. Branches determined by cAIC to support multiple dN/dS classes are filled. All other branches supported only a single dN/dS class. The dashed line depicts x = y.

A comparison of the branch length estimates under the standard model (GTR + Γ4) and BSREL shows how the nucleotide model underestimates branch lengths for long branches (fig. 5). Nine branches had statistical evidence of multiple selection regimes (different dN/dS ratio classes), indicating that a complex pattern of selection has likely acted across the phylogeny. All but two of these branches had weights of 0.95 or greater assigned to the dN/dS = 0 class (i.e., the class with no nonsynonymous substitutions). These nine branches tended to be long internal branches and experienced the greatest expansion under BSREL compared with GTR + Γ4. For instance, the longest internal branch, which separates Old World primate viruses from New World monkey viruses, was 3.15 times longer under BSREL than GTR + Γ4. These findings suggest that tMRCAs obtained using a standard model are likely biased. One of the length expansions (involving a short terminal branch leading to an HVP-2 isolate) appears to be an outlier (fig. 5C), likely caused by low precision point estimates of dN/dS. However, when we used the original (GTR + Γ4) branch length instead of the expanded length for this branch in our subsequent analyses, the hypothesis testing results and tMRCA inference remained unaffected (not shown).

Comparison of Evolutionary Scenarios

We inferred the tMRCAs of HSV-1/ChHV and HSV-2/ChHV under the BSREL substitution model and investigated which of the ten alternative hypotheses was most consistent with these estimates. Because codon-substitution models that explicitly include variation in the strength of natural selection cannot be readily implemented in BEAST or similar packages, we employed a penalized likelihood framework (r8s) (Sanderson 2003) to apply a molecular clock using branch lengths previously estimated using BSREL. We calibrated the molecular clock using the same four internal node ages used in the BMCMC analysis. We constructed two constrained models describing the ten evolutionary scenarios (figs. 2 and 3). The first model, corresponding to viral lineage duplication or HSV-1 cross-species transmission, forces the tMRCA of HSV-2/ChHV to coincide with human–chimpanzee speciation 6 Ma (figs. 2 and 3A–C). The second model, corresponding to HSV-2 cross-species transmission, forces the tMRCA of HSV-1/ChHV to coincide with human–chimpanzee speciation 6 Ma (fig. 3D–F). We then performed likelihood ratio tests comparing the fit of these constrained models to the fit of an unconstrained model in which the tMRCAs of HSV-1/ChHV and HSV-2/ChHV were free to vary. Under the BSREL model, one set of origin scenarios is clearly favored (table 2). A tMRCA for HSV-2 and ChHV at 6 Ma is rejected in favor of an unconstrained model (P < 0.0001). Therefore, viral duplication and HSV-1 cross-species transmission scenarios (figs. 2 and 3A–C) can be rejected. However, a tMRCA for HSV-1 and ChHV at 6 Ma (fig. 3D–F) cannot be rejected in favor of an unconstrained model (P = 0.506).
Table 2.

Maximum Penalized Likelihood Values for gB Phylogenies and Likelihood Ratio Tests for Human and ChHV Codivergence Scenarios under Different Evolutionary Models: GTR + Γ4 and BSREL.

| Homo–Pan Codivergence Event | GTR + Γ4: −ln L | GTR + Γ4: Δln L^a | GTR + Γ4: P^b | BSREL: −ln L | BSREL: Δln L^a | BSREL: P^b |
| --- | --- | --- | --- | --- | --- | --- |
| Unconstrained | −307.517 | | | −628.711 | | |
| HSV-1/ChHV | −326.648 | 19.131 | <0.0001 | −628.932 | 0.221 | 0.506 |
| HSV-2/ChHV | −337.543 | 30.026 | <0.0001 | −740.962 | 112.251 | <0.0001 |

^a Difference in ln L between constrained (null) model and unconstrained (alternative) model.

^b Likelihood ratio test with 1 degree of freedom comparing the constrained and unconstrained models.
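As a worked illustration of the likelihood ratio tests in table 2, the sketch below recomputes P values from the tabulated Δln L values, assuming the usual test statistic 2Δln L compared against a chi-square distribution with 1 degree of freedom. The function name `lrt_p_value_1df` is hypothetical; the closed form via `math.erfc` is a standard identity for the 1-degree-of-freedom case and is not taken from the paper's own scripts.

```python
import math

def lrt_p_value_1df(delta_lnl):
    """P value of a likelihood ratio test with 1 degree of freedom.

    The statistic is 2 * delta_lnl; for a chi-square variable with 1 df,
    P(X > x) = erfc(sqrt(x / 2)).
    """
    statistic = 2.0 * delta_lnl
    return math.erfc(math.sqrt(statistic / 2.0))

# Delta ln L values copied from table 2 (constrained vs. unconstrained).
delta_lnl = {
    ("GTR+G4", "HSV-1/ChHV"): 19.131,
    ("GTR+G4", "HSV-2/ChHV"): 30.026,
    ("BSREL", "HSV-1/ChHV"): 0.221,
    ("BSREL", "HSV-2/ChHV"): 112.251,
}

p_values = {key: lrt_p_value_1df(d) for key, d in delta_lnl.items()}
for (model, constraint), p in p_values.items():
    print(f"{model:7s} {constraint}: P = {p:.3g}")
```

Only the BSREL constraint placing the HSV-1/ChHV tMRCA at 6 Ma yields a P value above conventional significance thresholds, matching the pattern reported in the table.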

Under the GTR + Γ4 substitution model, a 6 Ma tMRCA for HSV-1/ChHV also provided a better fit than a 6 Ma tMRCA for HSV-2/ChHV (table 2). However, both constrained scenarios were rejected when compared with the unconstrained scenario (P < 0.0001). Therefore, if GTR + Γ4 is an appropriate evolutionary model for the primate herpes simplex viruses, then neither the divergence of HSV-1 nor that of HSV-2 from ChHV corresponded to a host speciation event. Given that prior studies of ancient viral evolution have cast doubt on the suitability of GTR + Γ4 in this context (Wertheim and Kosakovsky Pond 2011; Wertheim et al. 2013), the problem likely lies with the substitution model and not with the codivergence scenarios. The divergence between HSV-1 and ChHV could have occurred at any time during the host speciation process, which may or may not have involved substantial gene flow between nascent species (Patterson et al. 2006; Yamamichi et al. 2012). However, the results of our hypothesis tests are qualitatively similar if we allow the divergence between humans and P. troglodytes to vary between 5 and 7 Ma (by replacing the fixed constraint with a range); both scenarios are rejected under GTR + Γ4 (P < 0.0001), and only the HSV-2/ChHV codivergence scenario can be rejected under BSREL (P < 0.0001). The maximum likelihood estimate for the HSV-1/ChHV tMRCA using branch lengths inferred under BSREL was 6.2 (5.6–7.4) Ma (table 3). This date is in line with expected divergence between their primate host species.
Moreover, the inferred substitution rate of 1 × 10^-8 (8.1 × 10^-9 to 1.2 × 10^-8) substitutions per site per year agrees with previous estimates for herpes simplex virus evolution in humans (Sakaoka et al. 1994; Norberg et al. 2011). The tMRCA of HSV-2/ChHV was 1.6 (1.4–2.1) Ma, suggesting a more recent viral cross-species transmission event. The tMRCAs of HSV-1/ChHV and HSV-2/ChHV are robust to inclusion of specific internal calibration points. When the penalized likelihood analysis is performed with only three of the four calibrations (all four possible combinations were explored), the tMRCA for HSV-1/ChHV ranges between 5.7 and 6.5 Ma, and the tMRCA for HSV-2/ChHV ranges between 1.5 and 1.7 Ma. Although this test for robustness is potentially sensitive to dependencies that may exist among the nonfossil-derived calibration points, divergence dates within the primate phylogeny have been extensively studied and are internally consistent (Steiper and Young 2009).
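The leave-one-out robustness check described above can be enumerated as follows. This is a minimal sketch: the node ages are the four calibrations listed in Materials and Methods, the helper name `leave_one_out` is hypothetical, and the actual dating step (performed in r8s in the paper) is not reproduced here.

```python
from itertools import combinations

# Calibration node ages (Ma) as listed in Materials and Methods.
calibrations_ma = {
    "Simiiformes": 44.2,
    "Catarrhini": 23.0,
    "Macaca/Papio": 9.8,
    "Saimiri/Ateles": 14.4,
}

def leave_one_out(calibs):
    """Return every calibration set obtained by dropping exactly one node."""
    names = sorted(calibs)
    return [
        {name: calibs[name] for name in subset}
        for subset in combinations(names, len(names) - 1)
    ]

subsets = leave_one_out(calibrations_ma)
for subset in subsets:
    dropped = (set(calibrations_ma) - set(subset)).pop()
    print(f"dropped {dropped}; kept {sorted(subset)}")
```

Each of the four subsets would then be passed to the penalized likelihood analysis in place of the full calibration set.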
Table 3.

Simplex Virus Time to Most Recent Common Ancestor (tMRCA) Estimates from Relaxed Molecular Clock Analyses in r8s and Their Corresponding Host Divergence Dates Inferred Using gB.

| Taxa | tMRCA under GTR + Γ4^a (Ma) | tMRCA under BSREL^a (Ma) | Published Host Divergence^b (Ma) |
| --- | --- | --- | --- |
| HSV-1/ChHV | 9.1 (8.6–9.5) | 6.2 (5.6–7.4) | 5–7 |
| HSV-2/ChHV | 2.7 (2.9–2.9) | 1.6 (1.4–2.1) | 5–7 |
| MHV-1 | 3.7 (3.5–4.1) | 2.1 (1.8–2.8) | 2.2–2.5 |
| MHV-1 (Macaca fuscata/M. mulatta) | 3.3 (3.1–3.7) | 2.0 (1.6–2.5) | 1.7 |
| CeHV-2/HVP-2 | 3.1 (2.9–3.4) | 1.9 (1.6–2.5) | 11.6 |

^a Variance estimates obtained via LHC scheme (see Materials and Methods).

^b See text for corresponding citations.


tMRCA Inference for Other Primate Simplex Viruses

We also investigated whether BSREL provided consistent tMRCA estimates for MHV-1. Divergence times within Macaca have been well characterized. The three host species whose viruses were included here (i.e., Macaca mulatta, M. fuscata, and M. fascicularis) share an MRCA around 2.2–2.5 Ma (Tosi et al. 2003). The tMRCA for all three viral lineages estimated using BSREL branch lengths, 2.1 (1.8–2.8) Ma, was closer to the host tMRCA, compared with branch lengths inferred under GTR + Γ4, which yielded an older tMRCA, 3.7 (3.5–4.1) Ma (table 3). The same pattern holds for the M. fuscata and M. mulatta tMRCA around 1.7 Ma (Fabre et al. 2009). The BSREL analysis estimated a viral tMRCA at 2.0 (1.6–2.5) Ma, whereas the GTR + Γ4 model placed the tMRCA at 3.3 (3.1–3.7) Ma (table 3). Therefore, BSREL provides more internally consistent tMRCAs than the GTR + Γ4 model and supports a general pattern of codivergence throughout the primate simplex viruses. An exception to this general pattern of codivergence is found by examining the tMRCA of CeHV-2 and HVP-2. Both substitution models indicate that this tMRCA is too recent to be the result of codivergence with Chlorocebus pygerythrus and Papio spp. around 11.6 Ma (Raaum et al. 2005) (table 3). The genus Papio started diverging around 2 Ma (Sithaldeen et al. 2009; Zinner et al. 2009). Therefore, the CeHV-2/HVP-2 tMRCA inferred with BSREL branch lengths of 1.9 (1.6–2.5) Ma confirms that CeHV-2 is, evolutionarily, a baboon virus, as has been suggested previously (Malherbe and Strickland-Cholmley 1969a, 1969b; Kalter et al. 1978; Hilliard et al. 1989; Tyler and Severini 2006).
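The macaque comparison above amounts to asking which model's point estimate lies closer to the published host divergence. A minimal sketch of that comparison, using only the point estimates and host dates from table 3, is given below; the helper `distance_to_range` and the dictionary layout are illustrative assumptions rather than part of the original analysis.

```python
def distance_to_range(value, low, high):
    """Distance (Ma) from a point estimate to the nearest edge of [low, high];
    zero if the estimate falls inside the range."""
    if value < low:
        return low - value
    if value > high:
        return value - high
    return 0.0

# Point estimates (Ma) and host divergence ranges (Ma) from table 3.
macaque_nodes = {
    "MHV-1 (all three hosts)": {"GTR+G4": 3.7, "BSREL": 2.1, "host": (2.2, 2.5)},
    "MHV-1 (M. fuscata/M. mulatta)": {"GTR+G4": 3.3, "BSREL": 2.0, "host": (1.7, 1.7)},
}

deviations = {}
for node, est in macaque_nodes.items():
    low, high = est["host"]
    deviations[node] = {
        model: distance_to_range(est[model], low, high)
        for model in ("GTR+G4", "BSREL")
    }
    print(node, {m: round(d, 2) for m, d in deviations[node].items()})
```

For both macaque nodes the BSREL estimate sits closer to the host date than the GTR + Γ4 estimate, which is the sense in which the text calls BSREL more internally consistent. The comparison uses point estimates only and ignores the reported uncertainty intervals.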

Discussion

The evolutionary origins of human herpes simplex viruses can be resolved using phylogenetic and molecular dating analyses. The discovery of ChHV and its placement in the phylogenetic tree yielded several evolutionary scenarios that could explain the origins of HSV-1 and HSV-2. We were able to reject 1) scenarios in which HSV-1 and HSV-2 arose due to viral lineage duplication in apes and 2) scenarios in which HSV-1 is the result of cross-species transmission. Instead, our results suggest 3) a scenario in which HSV-2 is the result of cross-species transmission and HSV-1 is the result of host–virus codivergence. Specifically, the molecular clock analysis indicates that after HSV-1 and ChHV codiverged around 6 Ma, ChHV was transmitted to an ancestor of modern humans around 1.6 Ma, giving rise to HSV-2. Dating estimates within the Pan genus provide guidance for distinguishing among the possible HSV-2 origin scenarios (fig. 3D–F). Pan troglodytes diverged from P. paniscus around 2.2 Ma and split into four subspecies starting around 1.0 Ma (Stone et al. 2010; Bjork et al. 2011). Therefore, the HSV-2/ChHV tMRCA points to transmission of the precursor of HSV-2 from the common ancestor of P. troglodytes to a now extinct Homo species (e.g., H. habilis, H. erectus, and H. ergaster [Anton 2003; Severini et al. 2013]) that preceded modern humans (fig. 3E). The specific identity of this Homo ancestor cannot be determined based solely on the molecular clock. Moreover, recent fossil evidence suggests that these taxa may be best classified as a single Homo species (Lordkipanidze et al. 2013). Finally, because both human herpes simplex viruses are transmitted via oral and sexual routes (Brugha et al. 1997; Langenberg et al. 1999), the route of viral transmission between P. troglodytes and the extinct Homo species (e.g., physical or sexual contact) remains unknown.
Primate herpes simplex viruses have experienced synonymous substitutions to a degree that causes standard evolutionary models (e.g., GTR + Γ4) to produce biased tMRCA estimates. However, unlike other viruses such as Ebola, avian influenza, and coronaviruses, where the temporal signal has been lost due to levels of sequence evolution that saturate even the selection-aware models, the branch lengths in the primate simplex virus phylogeny can still be estimated reliably if selection-informed models of evolution (e.g., BSREL) are implemented. Our results suggest that primate simplex viruses are young enough to contain sufficient evolutionary signal for molecular dating but too old for this signal to be extracted by standard evolutionary models; BSREL produces internally consistent dating estimates across the primate simplex virus phylogeny. Therefore, selection-informed models should be employed when investigating ancient evolution in both RNA and DNA viruses. The phylogenetic history of HSV-1, like that of varicella zoster virus (Grose 2012) and Helicobacter pylori (Linz et al. 2007), recapitulates human migration patterns dating back tens of thousands of years (Norberg et al. 2011; Kolb et al. 2013). However, molecular clock dating estimates for the tMRCA of HSV-1 may need to be revisited in light of the findings presented here. Specifically, Norberg et al. (2011) inferred a tMRCA for major HSV-1 clades at 710,000 years ago using a calibration based on the tMRCA of HSV-1/HSV-2 at 8.45 Ma, which would correspond to either the African ape duplication (fig. 2C) or gorilla cross-species transmission (fig. 3C) scenarios. In contrast, Kolb et al. (2013) inferred a tMRCA of major HSV-1 clades around 50,000 years ago and a tMRCA for HSV-1/HSV-2 at 2.2 Ma.
Based on this latter age estimate for HSV-1/HSV-2, they postulated a viral duplication event corresponding to the rise of the genus Homo; however, the close phylogenetic relationship between HSV-2 and ChHV would necessitate a less parsimonious explanation in which HSV underwent a duplication event followed by cross-species transmission to P. troglodytes. More accurate molecular dating within HSV-1 and HSV-2 may be possible using the divergence between humans and P. troglodytes as a calibration, though it is possible that standard substitution models may slightly underestimate the evolutionary distance between HSV-1 and HSV-2 (fig. 5B). Our results strongly suggest a scenario (fig. 3E) in which African and Asian apes are infected with a single herpes simplex virus (barring genus/species specific extinction or duplication events). Serological evidence suggests that wild mountain gorillas (Gorilla beringei beringei) are infected with a herpes simplex virus that is distinct from human herpes simplex viruses (Eberle 1992). The legitimacy of our preferred scenario could be confirmed by the identification of putative gorilla or bonobo herpes simplex viruses. Specifically, if the phylogenetic placement of these putative viruses is found to be in agreement with our predicted scenario, it would serve as a confirmation of the promise held by selection-informed models in the study of viral origins. This work highlights the need for a method that can simultaneously model variable selection pressures (as in BSREL) and estimate tMRCAs (as in BEAST). Selection-informed models could be incorporated into a Bayesian relaxed molecular clock framework and take advantage of recent computational advances for evaluating codon substitution models (Suchard and Rambaut 2009). Moreover, the appropriate number of dN/dS classes could be sampled as part of the BMCMC to reflect the level of confidence that exists for multiple classes. 
If, as this and other studies suggest, selection-informed models are necessary for accurate dating of ancient viral divergence events, then their incorporation into relaxed molecular clock methodology will be an important step towards understanding viral evolutionary history.

Materials and Methods

Data Set

Full-length genomes for primate simplex viruses from six host species were downloaded from GenBank. The 12 conserved glycoproteins spanning the genome (UL1, UL10, UL22, UL27, UL44, UL49A, UL53, US4, US5, US6, US7, and US8) were concatenated and aligned using MUSCLE v2.0, based on translated amino acid sequences (Edgar 2004). Regions containing gaps, indicating ambiguity in homology, were then excised. The final concatenated alignment comprised 7,566 nucleotides. We screened for recombination using GARD (Kosakovsky Pond et al. 2006), which failed to detect any significant breakpoints among the concatenated glycoproteins. In addition, all available full-length UL27 (gB) genes for primate herpes simplex viruses, representing at least nine host species, were downloaded from GenBank. Alignment was performed using the same protocol, resulting in an alignment of 2,283 nucleotide sites. Identical sequences were replaced with a single representative, because they do not add information for fitting substitution models in the phylogenetic likelihood framework, resulting in a final alignment of 74 sequences (all alignments are available as supplementary material, Supplementary Material online).
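As an illustration of the two filtering steps described above (excising gapped regions and collapsing identical sequences), a minimal Python sketch might look like the following. The function names and the toy alignment are hypothetical and are not taken from the authors' pipeline. The sketch removes individual gapped columns, whereas a codon-aware workflow would more likely remove whole codons.

```python
def drop_gapped_columns(alignment):
    """Remove every alignment column in which any sequence has a gap ('-')."""
    names = list(alignment)
    seqs = [alignment[n] for n in names]
    keep = [i for i in range(len(seqs[0])) if all(s[i] != "-" for s in seqs)]
    return {n: "".join(s[i] for i in keep) for n, s in zip(names, seqs)}


def collapse_identical(alignment):
    """Keep only the first-seen name for each distinct sequence."""
    first_name = {}
    for name, seq in alignment.items():
        first_name.setdefault(seq, name)
    return {name: seq for seq, name in first_name.items()}


# Hypothetical toy alignment (not real viral data).
toy = {
    "virus_A": "ATG-CCGTA",
    "virus_B": "ATGACCGTA",
    "virus_C": "ATGTCCGTA",
    "virus_D": "ATGACCGTT",
}
ungapped = drop_gapped_columns(toy)
unique = collapse_identical(ungapped)
print(unique)
```

Once the gapped column is removed, the first three toy sequences become identical and collapse to one representative, which mirrors the reasoning that duplicate sequences add no information to the likelihood fit.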

BMCMC Molecular Clock Analysis

The tMRCA of HSV-1/ChHV and HSV-2/ChHV was inferred for both the concatenated glycoprotein and gB data sets under an uncorrelated lognormal relaxed molecular clock (Drummond et al. 2006) using a GTR + Γ4 substitution model in BEAST v1.8.0 (Drummond and Rambaut 2007; Drummond et al. 2012). Four independent chains were run for 25 million generations, sampling every 2,500 generations. The effective sample size for all parameters was greater than 200. These chains were combined using LogCombiner, and the maximum clade credibility trees were summarized using TreeAnnotator. The XML input files are available in supplementary material, Supplementary Material online. The molecular clock was calibrated assuming a pattern of viral–host codivergence using ages from internal nodes assembled from the literature: 1) tMRCA of Old and New World primates, Simiiformes, at 44.2 Ma (Steiper and Young 2009); 2) tMRCA of Old World monkeys and apes, Catarrhini, at 23 Ma (Raaum et al. 2005); 3) tMRCA of Macaca spp. and Papio spp. at 9.8 Ma (Raaum et al. 2005); and 4) tMRCA of Saimiri sciureus and Ateles geoffroyi at 14.4 Ma (Fabre et al. 2009). The last of these nodes was relevant only for the gB analyses, as this gene is the only sequence available for the A. geoffroyi simplex virus, HVA-1. The tMRCAs of Simiiformes and Catarrhini were estimated based on fossil data, whereas the other two tMRCAs were inferred by previous studies using relaxed molecular clocks. Because the Macaca/Papio tMRCA was the only date published with associated uncertainty, we placed lognormal prior distributions (mean 0; standard deviation 0.56; similar to a previous study of P. troglodytes tMRCAs [Bjork et al. 2011]) at the nodes, offset so that the median value of the distribution corresponded to the tMRCA, to allow for a reasonable degree of uncertainty.
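The offset calibration priors described above can be made concrete with a short calculation. The sketch below assumes that "mean 0" refers to the mean in log space, so the unshifted lognormal has median exp(0) = 1 Ma and the offset is the calibration age minus 1. This reading of the parameterization, and the helper names, are assumptions for illustration rather than details confirmed by the text or the XML files.

```python
import math
from statistics import NormalDist

LOG_MEAN = 0.0   # assumed to be the mean in log space
LOG_SD = 0.56    # standard deviation in log space

# Calibration ages (Ma) as listed in the text.
CALIBRATIONS_MA = {
    "Simiiformes": 44.2,
    "Catarrhini": 23.0,
    "Macaca/Papio": 9.8,
    "Saimiri/Ateles": 14.4,
}


def offset_for_median(age_ma, log_mean=LOG_MEAN):
    """Offset that places the median of the shifted lognormal at age_ma."""
    return age_ma - math.exp(log_mean)


def prior_quantile(age_ma, p, log_mean=LOG_MEAN, log_sd=LOG_SD):
    """Quantile p of the offset lognormal prior for one calibration node."""
    z = NormalDist().inv_cdf(p)
    return offset_for_median(age_ma, log_mean) + math.exp(log_mean + log_sd * z)


for node, age in CALIBRATIONS_MA.items():
    low, high = prior_quantile(age, 0.025), prior_quantile(age, 0.975)
    print(f"{node}: offset {offset_for_median(age):.1f} Ma, "
          f"central 95% of prior {low:.2f} to {high:.2f} Ma")
```

Under this reading the prior is asymmetric around each calibration age, with a short tail toward younger dates (bounded by the offset) and a longer tail toward older dates.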

Re-Estimating Branch Lengths

For the gB data set, a maximum likelihood phylogeny was inferred under a GTR + Γ4 model with a subtree pruning and regrafting algorithm in PhyML 3.0 (Guindon and Gascuel 2003; Guindon et al. 2009), available in SeaView4 (Gouy et al. 2010). Branch support was established using the aLRT (Anisimova and Gascuel 2006). To ensure consistency in later comparisons, branch lengths were reoptimized using HyPhy (Kosakovsky Pond et al. 2005); these branch lengths were indistinguishable from those estimated by PhyML. To estimate branch lengths under a selection-informed model, we modified the BSREL algorithm. In its original form, BSREL assumed three dN/dS classes along each branch in the phylogeny, each class representing a proportion of sites evolving with a particular dN/dS value, inferred from the sequence alignment (Kosakovsky Pond et al. 2011). However, this model is generally overparameterized, because short branches rarely contain enough information to support more than one dN/dS class (Wertheim et al. 2013). Although overparameterization is not a substantial problem when the goal of an analysis is to perform a statistical test for selection (Scheffler et al. 2014), it does become a problem when point estimates of model parameters are used for downstream inference. Therefore, we modified the BSREL model via a step-up parameter selection procedure. Initially, each branch is assigned one dN/dS class. Then, starting with the longest branch, branch-specific dN/dS classes are added and retained only if there is an improvement in the small-sample corrected AIC (c-AIC) (fig. 5). Once the optimal number of dN/dS classes has been inferred, the likelihood model is reoptimized, and branch lengths are estimated.
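The step-up procedure described above can be sketched as a greedy search. The code below is a simplified illustration, not the HyPhy implementation: `fit` is a hypothetical callable standing in for model reoptimization, the toy likelihood gains are invented, the cost of two parameters per added class is an assumption, and using the number of codons in the gB alignment (2,283 / 3 = 761) as the sample size is also an assumption.

```python
def aicc(log_lik, n_params, n_obs):
    """Small-sample corrected AIC; lower values are better."""
    aic = -2.0 * log_lik + 2.0 * n_params
    return aic + (2.0 * n_params * (n_params + 1)) / (n_obs - n_params - 1)


def step_up(branch_lengths, fit, n_obs, max_classes=3):
    """Greedy step-up choice of the number of dN/dS classes per branch.

    `fit` is a hypothetical callable: given {branch: n_classes}, it returns
    (log_likelihood, n_params) for the refitted model.
    """
    classes = {branch: 1 for branch in branch_lengths}
    best = aicc(*fit(classes), n_obs)
    # Visit branches from longest to shortest.
    for branch in sorted(branch_lengths, key=branch_lengths.get, reverse=True):
        while classes[branch] < max_classes:
            trial = dict(classes)
            trial[branch] += 1
            score = aicc(*fit(trial), n_obs)
            if score >= best:  # no improvement: keep the current model
                break
            classes, best = trial, score
    return classes, best


def toy_fit(classes):
    """Invented stand-in for likelihood optimization (illustration only)."""
    gain = {"long": [0.0, 12.0, 12.5], "mid": [0.0, 1.0, 1.1],
            "short": [0.0, 0.1, 0.1]}
    log_lik = -5000.0 + sum(gain[b][k - 1] for b, k in classes.items())
    # Assumed cost: one rate per branch, plus a rate and a weight per extra class.
    n_params = 10 + sum(1 + 2 * (k - 1) for k in classes.values())
    return log_lik, n_params


lengths = {"long": 0.9, "mid": 0.2, "short": 0.01}
selected, score = step_up(lengths, toy_fit, n_obs=761)
print(selected, round(score, 2))
```

In this toy case only the longest branch gains enough likelihood to justify a second class, which matches the observation that short branches rarely support more than one dN/dS class.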

Penalized Likelihood Molecular Clock Analysis

Inferring tMRCAs on a tree with fixed branch lengths cannot be accomplished using existing relaxed molecular clock packages such as BEAST. Furthermore, forcing an ultrametric tree for dating analysis in the BSREL framework would entail the assumption of a strict molecular clock (which is not realistic) and is not feasible in the current implementation. Therefore, we employed a semiparametric penalized likelihood approach, implemented in r8s (Sanderson 2002, 2003), to smooth the GTR + Γ4 and BSREL trees under a relaxed molecular clock. Inference in r8s requires trees with fixed branch lengths, allowing us to fit a molecular clock and infer tMRCAs on the gB trees with branch lengths previously optimized under both GTR + Γ4 and BSREL. The penalized likelihood algorithm in r8s employs a smoothing parameter, which controls the degree to which the assumption of a strict molecular clock is relaxed; lower values permit more rate variation among branches, whereas higher values push the fit toward a strict clock. To estimate this parameter for each tree, the same four internal node calibrations were used as in the BEAST dating analysis. These ages were treated as fixed points in r8s, rather than lognormal distributions, because r8s does not perform optimally with narrow calibration windows. Using these node calibrations, r8s estimated the optimal smoothing parameters for the GTR + Γ4 (smoothing = 3.2) and BSREL (smoothing = 100) trees. The r8s input file is available in supplementary material, Supplementary Material online.

Model Comparison via the Likelihood Ratio Test

Statistical significance was assessed using a likelihood ratio test in which the fixed tMRCA is the null model and the unconstrained tMRCA is the alternative model, with one degree of freedom (as the unconstrained model contains one additional parameter to be estimated). This comparison was performed using gB trees with branch lengths estimated under BSREL and GTR + Γ4. The four internal calibration points (described above) were used.
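The likelihood ratio test described above reduces to a short calculation. The sketch below uses only the Python standard library and relies on the identity that, for a chi-square distribution with one degree of freedom, P(X > x) = erfc(sqrt(x / 2)). The log-likelihood values are hypothetical placeholders, not results from this study.

```python
import math


def lrt_one_df(log_lik_null, log_lik_alt):
    """Likelihood ratio test with one degree of freedom.

    Null model: tMRCA fixed at the hypothesized age.
    Alternative model: tMRCA unconstrained (one extra free parameter).
    """
    stat = max(2.0 * (log_lik_alt - log_lik_null), 0.0)
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods for illustration only.
stat, p = lrt_one_df(log_lik_null=-1234.50, log_lik_alt=-1232.58)
print(f"LRT statistic = {stat:.2f}, p = {p:.4f}")
```

A small p-value indicates that fixing the tMRCA at the hypothesized age fits significantly worse than leaving it free, so the corresponding origin scenario would be rejected.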

Variance Estimates Using Latin Hypercube Sampling

To estimate confidence in our dating estimates, we employed a Latin hypercube (LHC) sampling importance resampling scheme, described in detail previously, to draw 500 samples of scaled trees and estimate branch length variance (Wertheim and Kosakovsky Pond 2011; Wertheim et al. 2013). Briefly, the sampling distribution of each parameter is approximated by a normal distribution centered on the maximum likelihood estimate, with variance determined by profile likelihood. This distribution is discretized into 100,000 bins, which are used to define the LHC in the parameter space. The likelihood is then evaluated for each of the 100,000 parameter vectors defined by the LHC sampling procedure, and the vectors are resampled using a procedure described previously (Kosakovsky Pond et al. 2010). We took these 500 trees from the LHC analysis (for both GTR + Γ4 and BSREL) and ran them through r8s. The upper and lower 95% bounds are reported as confidence intervals.
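A generic version of Latin hypercube sampling followed by sampling importance resampling can be sketched as follows. This is an illustration under stated assumptions, not the procedure from the cited papers: the toy likelihood, the use of bin midpoints, the importance weights (likelihood divided by the normal proposal density), the smaller bin count, and the use of the 2.5th and 97.5th percentiles as interval bounds are all choices made here for clarity.

```python
import math
import random
from statistics import NormalDist


def latin_hypercube(mles, sds, n_bins, rng):
    """One value per equal-probability bin for each parameter, with bins
    shuffled independently so each bin is used exactly once per parameter."""
    columns = []
    for mle, sd in zip(mles, sds):
        dist = NormalDist(mle, sd)
        values = [dist.inv_cdf((i + 0.5) / n_bins) for i in range(n_bins)]
        rng.shuffle(values)
        columns.append(values)
    return [list(vec) for vec in zip(*columns)]


def resample(vectors, log_lik, mles, sds, n_keep, rng):
    """Sampling importance resampling: weight = likelihood / proposal density."""
    dists = [NormalDist(m, s) for m, s in zip(mles, sds)]
    log_w = []
    for vec in vectors:
        log_q = sum(math.log(d.pdf(x)) for d, x in zip(dists, vec))
        log_w.append(log_lik(vec) - log_q)
    top = max(log_w)  # subtract the maximum for numerical stability
    weights = [math.exp(lw - top) for lw in log_w]
    return rng.choices(vectors, weights=weights, k=n_keep)


def percentile_interval(values, lower=0.025, upper=0.975):
    """Empirical interval from sorted values (nearest lower rank)."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[int(lower * (n - 1))], ordered[int(upper * (n - 1))]


rng = random.Random(1)
mles, sds = [0.10, 0.35], [0.01, 0.03]  # hypothetical branch lengths


def toy_log_lik(vec):
    """Invented stand-in for the phylogenetic log-likelihood."""
    return -sum((x - m) ** 2 / (2 * (0.8 * s) ** 2)
                for x, m, s in zip(vec, mles, sds))


vectors = latin_hypercube(mles, sds, n_bins=2000, rng=rng)
sample = resample(vectors, toy_log_lik, mles, sds, n_keep=500, rng=rng)
low, high = percentile_interval([v[0] + v[1] for v in sample])
print(f"interval for summed length: {low:.3f} to {high:.3f}")
```

In the analysis described above, each resampled parameter vector corresponds to a tree with rescaled branch lengths, and the interval is taken over the dates that r8s infers from those trees rather than over a simple sum of branch lengths.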

Supplementary Material

Supplementary material and figure S1 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
  55 in total

1.  Timing the ancestor of the HIV-1 pandemic strains.

Authors:  B Korber; M Muldoon; J Theiler; F Gao; R Gupta; A Lapedes; B H Hahn; S Wolinsky; T Bhattacharya
Journal:  Science       Date:  2000-06-09       Impact factor: 47.728

2.  Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach.

Authors:  Michael J Sanderson
Journal:  Mol Biol Evol       Date:  2002-01       Impact factor: 16.240

3.  r8s: inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock.

Authors:  Michael J Sanderson
Journal:  Bioinformatics       Date:  2003-01-22       Impact factor: 6.937

4.  A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood.

Authors:  Stéphane Guindon; Olivier Gascuel
Journal:  Syst Biol       Date:  2003-10       Impact factor: 15.683

5.  Using HSV-1 genome phylogenetics to track past human migrations.

Authors:  Aaron W Kolb; Cécile Ané; Curtis R Brandt
Journal:  PLoS One       Date:  2013-10-16       Impact factor: 3.240

6.  A prospective study of new infections with herpes simplex virus type 1 and type 2. Chiron HSV Vaccine Study Group.

Authors:  A G Langenberg; L Corey; R L Ashley; W P Leong; S E Straus
Journal:  N Engl J Med       Date:  1999-11-04       Impact factor: 91.245

7.  The isolation of herpesvirus from trigeminal ganglia of normal baboons (Papio cynocephalus).

Authors:  S S Kalter; S A Weiss; R L Heberling; J E Guajardo; G C Smith
Journal:  Lab Anim Sci       Date:  1978-12

8.  Cercopithecine Y-chromosome data provide a test of competing morphological evolutionary hypotheses.

Authors:  Anthony J Tosi; Todd R Disotell; Juan Carlos Morales; Don J Melnick
Journal:  Mol Phylogenet Evol       Date:  2003-06       Impact factor: 4.286

9.  On the validity of evolutionary models with site-specific parameters.

Authors:  Konrad Scheffler; Ben Murrell; Sergei L Kosakovsky Pond
Journal:  PLoS One       Date:  2014-04-10       Impact factor: 3.240

10.  A synchronized global sweep of the internal genes of modern avian influenza virus.

Authors:  Michael Worobey; Guan-Zhu Han; Andrew Rambaut
Journal:  Nature       Date:  2014-02-16       Impact factor: 49.962
