James R Otieno1, Joshua L Cherry1,2, David J Spiro1, Martha I Nelson1, Nídia S Trovão1.
Abstract
Four seasonal human coronaviruses (sHCoVs) are endemic globally (229E, NL63, OC43, and HKU1), accounting for 5-30% of human respiratory infections. However, the epidemiology and evolution of these CoVs remain understudied due to their association with mild symptomatology. Using a multigene and complete genome analysis approach, we find the evolutionary histories of sHCoVs to be highly complex, owing to frequent recombination of CoVs including within and between sHCoVs, and uncertain, due to the undersampling of non-human viruses. The recombination rate was highest for 229E and OC43 whereas substitutions per recombination event were highest in NL63 and HKU1. Depending on the gene studied, OC43 may have ungulate, canine, or rabbit CoV ancestors. 229E may have origins in a bat, camel, or an unsampled intermediate host. HKU1 had the earliest common ancestor (1809-1899) but fell into two distinct clades (genotypes A and B), possibly representing two independent transmission events from murine-origin CoVs that appear to be a single introduction due to large gaps in the sampling of CoVs in animals. In fact, genotype B was genetically more diverse than all the other sHCoVs. Finally, we found shared amino acid substitutions in multiple proteins along the non-human to sHCoV host-jump branches. The complex evolution of CoVs and their frequent host switches could benefit from continued surveillance of CoVs across non-human hosts.
Keywords: 229E; HKU1; NL63; OC43; evolution; recombination; seasonal coronaviruses; zoonosis
Year: 2022 PMID: 35891531 PMCID: PMC9320361 DOI: 10.3390/v14071551
Source DB: PubMed Journal: Viruses ISSN: 1999-4915 Impact factor: 5.818
Figure 1. An illustration of the sHCoV genomes, not drawn to scale. The ORFs analyzed in this study are indicated in orange.
Figure 2. Maximum clade credibility (MCC) trees inferred from dataset D5 for full genomes (WGS), and the spike, nucleocapsid, membrane, and envelope ORFs, with the branches color-coded by the inferred coronavirus host. The upper panel shows MCC trees from alphacoronaviruses while the lower panel shows MCC trees from betacoronaviruses. Human, camel, and porcine coronavirus clades have been collapsed to increase readability. Human * is a lone human CoV (FJ415324) that clusters with ungulate and canine CoVs. Individual and more detailed MCC trees can be found in File S2.
The estimated zoonotic origins of the four species of human seasonal coronaviruses inferred from BEAST Markov Jumps (MJs) for whole genomes and the envelope, membrane, nucleocapsid, and spike ORFs. For 229E and OC43, we included the percentage of MJs of the presumptive ancestors, other potential ancestors with a percentage higher than the presumptive ancestor, or CoVs whose cumulative percentage of MJs was ≥70%.
| Genus | Species | WGS/ORF | Zoonotic Origins {% of MJs} |
|---|---|---|---|
| alphaCoVs | 229E | WGS | Bat {65}, Camelid {34} |
| | | Spike | Bat {73}, Camelid {27} |
| | | Nucleocapsid | Bat {99} |
| | | Membrane | Bat {98} |
| | | Envelope | Bat {99} |
| | NL63 | WGS | Bat {97} |
| | | Spike | Bat {99} |
| | | Nucleocapsid | Bat {95} |
| | | Membrane | Bat {97} |
| | | Envelope | Bat {98} |
| betaCoVs | HKU1 | WGS | Murine {83} |
| | | Spike | Murine {92} |
| | | Nucleocapsid | Murine {72} |
| | | Membrane | Murine {78} |
| | | Envelope | Murine {71} |
| | OC43 | WGS | Bovine {36}, Murine {34} |
| | | Spike | Bovine {35}, Murine {19} |
| | | Nucleocapsid | Porcine {25}, Bovine {23} |
| | | Membrane | Camel {75}, Murine {6} |
| | | Envelope | Murine {30}, Porcine {24} |
Figure 3. Estimates of the evolutionary rate (A) and MRCA age (B) for full genomes and four open reading frames (dataset D5) of the seasonal human coronavirus species. The black horizontal lines in (B) mark the dates of first isolation of 229E (1966), OC43 (1967), NL63 (2004), and HKU1 (2005). The WGS estimates lack data points for HKU1_all (both genotypes combined) and HKU1_genotype B because all HKU1_genotype B sequences were removed when generating the recombination-free WGS dataset D5.
Figure 4. Summarized within- and between-host/species recombination patterns identified by RDP4 for alphaCoVs (A) and betaCoVs (B). For each sHCoV species, the recombining CoVs are shown: between non-human CoVs and sHCoVs (black arrows), within an sHCoV species (blue arrows), and between sHCoV species (green arrows). Shown in orange is a lone human CoV (FJ415324) that clusters with ungulate and canine CoVs. Figure generated using BioRender.
Estimates of the ratio of recombination rate to point mutation rate (R/theta), substitutions by recombination relative to point mutation (r/m), number of substitutions per recombination event (δν), and mean length of DNA imported by homologous recombination (δ) from dataset D4. For each genome segment and analysis, the highest values are shown in bold. These analyses were conducted for the two HKU1 genotypes collectively (HKU1_all) and independently.
| Genome Segment | Coronavirus | R/theta | r/m | δν | δ |
|---|---|---|---|---|---|
| (a) Spike | alphaCoVs | 1.103 | | | |
| | betaCoVs | | 28.494 | 12.994 | 208.896 |
| (b) WGS | alphaCoVs | 0.014 | 0.840 | | |
| | betaCoVs | | | 35.369 | 780.250 |
| (c) WGS: Interspecies analysis (between one sHCoV species and all other non-human CoVs) | 229E | 0.013 | 0.741 | | |
| | NL63 | 0.016 | | | |
| | OC43 | | 0.888 | 32.291 | 687.266 |
| | HKU1_all | | | 35.890 | 760.242 |
| | HKU1 Genotype A | 0.024 | 0.840 | 35.170 | 747.680 |
| | HKU1 Genotype B | 0.024 | 0.865 | 36.718 | 787.036 |
| (d) WGS: Intraspecies analysis (within one sHCoV species) | 229E | | | 1.722 | 649.287 |
| | NL63 | 0.138 | 0.885 | | 106.019 |
| | OC43 | | 1.636 | 4.967 | |
| | HKU1_all | 0.029 | | | |
| | HKU1 Genotype A | 0.216 | 0.537 | 2.493 | 176.197 |
| | HKU1 Genotype B | 0.056 | 2.123 | 37.595 | 811.754 |
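The four recombination quantities named in the caption are not independent. Under the parameterisation commonly used by tools such as ClonalFrameML (an assumption here, since the record does not name the software), r/m is the product of R/theta and δν, and the per-site divergence of imported DNA, ν, is δν divided by δ. The minimal Python sketch below checks that relationship against the four HKU1 genotype rows whose values are all present in the table; the function names are illustrative, not taken from the paper.

```python
# Rows with all four values present in the table:
# (label, R/theta, r/m, delta*nu, delta)
rows = [
    ("interspecies HKU1 Genotype A", 0.024, 0.840, 35.170, 747.680),
    ("interspecies HKU1 Genotype B", 0.024, 0.865, 36.718, 787.036),
    ("intraspecies HKU1 Genotype A", 0.216, 0.537, 2.493, 176.197),
    ("intraspecies HKU1 Genotype B", 0.056, 2.123, 37.595, 811.754),
]


def implied_r_over_m(r_over_theta, delta_nu):
    """r/m implied by R/theta and substitutions per recombination event.

    Assumes r/m = (R/theta) * delta * nu, the usual ClonalFrameML-style relation.
    """
    return r_over_theta * delta_nu


def implied_nu(delta_nu, delta):
    """Per-site divergence of imported DNA, nu = (delta * nu) / delta."""
    return delta_nu / delta


for label, r_theta, r_m, delta_nu, delta in rows:
    est = implied_r_over_m(r_theta, delta_nu)
    nu = implied_nu(delta_nu, delta)
    print(f"{label}: reported r/m={r_m:.3f}, implied r/m={est:.3f}, nu={nu:.4f}")
```

The implied and reported r/m values agree to within the rounding of the three-decimal R/theta entries, which suggests the columns follow this relationship.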
Mean pairwise genetic distances for each sHCoV species derived from WGS and four ORFs from dataset D4. In bold is the highest pairwise distance for each ORF/WGS and underlined is the highest pairwise distance for each sHCoV species.
| sHCoV Species | WGS | Spike | Envelope | Membrane | Nucleocapsid |
|---|---|---|---|---|---|
| 229E | 0.0048 | | 0.0040 | 0.0079 | 0.0121 |
| NL63 | 0.0081 | | 0.0077 | 0.0081 | 0.0076 |
| OC43 | 0.0076 | | 0.0085 | 0.0082 | 0.0077 |
| HKU1_all | | | | | |
| HKU1 Genotype A | 0.0028 | 0.0049 | 0.0010 | 0.0020 | |
| HKU1 Genotype B | 0.0122 | | 0.0072 | 0.0039 | 0.0049 |
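As a rough illustration of what a mean pairwise genetic distance measures, the Python sketch below averages the uncorrected p-distance (proportion of differing sites) over all pairs of aligned sequences. This is an assumption for illustration only: the record does not state which distance model or software produced the table, and the toy alignment is invented.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') or an ambiguous 'N' are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def mean_pairwise_distance(seqs):
    """Average p-distance over every unordered pair of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy alignment, not data from the study.
toy_alignment = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
print(round(mean_pairwise_distance(toy_alignment), 4))
```

On this scale, a value such as 0.0122 corresponds to roughly 12 differences per 1000 compared sites between an average pair of genomes.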
Figure 5. The number of inferred amino acid changes (AA) associated with the sHCoVs for AA positions in the envelope, membrane, nucleocapsid, and spike proteins from datasets D4 and D5. Panel (A) represents the aggregated AA changes from the alphaCoVs 229E and NL63 while (B) represents the aggregated changes from the betaCoVs OC43 and HKU1. At the top of each plot, the functional domains or regions of the respective proteins are shown; NTD = N-terminal domain, TM = transmembrane domain, CTD = C-terminal domain, RBD = receptor binding domain, LINK = central linker domain, LINK-Dimer = dimerization domain, S1 subunit, S2 subunit, FP = fusion peptide, IFP = internal fusion peptide, HR1 = heptad repeat 1, and HR2 = heptad repeat 2.
A list of positions with an AA change in at least two host-jump branches leading to the sHCoVs. Where there was an AA insertion in the sHCoVs relative to the Wuhan-Hu-1 SARS-CoV-2 reference genome, we used the X.Y positional notation, where X is the reference genome position and Y is the nth sHCoV AA insertion. The positions at which there were no AA changes in branches other than the host-jump branches leading to the sHCoVs are in bold.
| ORF | Positions Shared by Alpha sHCoVs 229E and NL63 | Positions Shared by Beta sHCoVs OC43 and HKU1 | Positions Shared between Alpha and Beta sHCoVs |
|---|---|---|---|
| Spike | 8.21, 8.27, 8.31, 146, 154, 257, | 60, 74, 152, 162, 208, | |
| Nucleocapsid | 158, 160, 205, 216, 236, 252.7, 366, and 402 | 32 and 373.3 | 37, 62, 125, 127, 166, 182, 208, 210, 212, 240, 241, 249, 256, 297, 321, 323, 354, 364, 365, 384, 389, 390, 392, and 394 |
| Membrane | 8, 11, | - | 10, |
| Envelope | - | - | 3, 26, 27, 66, |
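The X.Y notation is easy to mishandle if labels are read as decimal numbers, because 8.1 and 8.10 would then collapse to the same value. The Python sketch below shows one way to parse the labels as strings and to list positions that change in at least two host-jump branches. The branch names and per-branch position lists are hypothetical inputs chosen for illustration, not data from the study.

```python
from collections import Counter


def parse_position(label):
    """Split an 'X.Y' label into (reference position X, insertion index Y).

    Y is 0 when the label has no insertion part, e.g. '146' -> (146, 0).
    Parsing as strings keeps '8.1' and '8.10' distinct.
    """
    ref, _, ins = label.strip().partition(".")
    return int(ref), int(ins) if ins else 0


def shared_positions(branch_changes, min_branches=2):
    """Positions with an AA change in at least `min_branches` branches."""
    counts = Counter()
    for positions in branch_changes.values():
        # Use a set so a position is counted once per branch.
        counts.update({parse_position(p) for p in positions})
    return sorted(pos for pos, n in counts.items() if n >= min_branches)


# Hypothetical per-branch change lists (illustrative only).
toy_branches = {
    "branch_to_229E": ["8.21", "146", "300"],
    "branch_to_NL63": ["8.21", "8.3", "146"],
}
print(shared_positions(toy_branches))  # [(8, 21), (146, 0)]
```

Sorting on the (X, Y) tuple keeps insertions in order immediately after their reference position.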