Lili Ren, Yue Zhang, Jianguo Li, Yan Xiao, Jing Zhang, Ying Wang, Lan Chen, Gláucia Paranhos-Baccalà, Jianwei Wang.
Abstract
Coronaviruses (CoVs) continuously threaten human health. However, the evolutionary mechanisms that govern CoV strain persistence in human populations are not yet fully understood. In this study, we characterized the evolution of the gene encoding the major antigen, the spike (S) protein, in the most prevalent human coronavirus (HCoV), OC43, using phylogenetic and phylodynamic analysis. Among the five known HCoV-OC43 genotypes (A to E), higher substitution rates and dN/dS values, as well as more positive selection sites, were detected in the S gene of genotype D, which corresponds to the most dominant HCoV epidemic in recent years. Further analysis showed that the majority of substitutions were located in the S1 subunit. Among them, seven positive selection sites were chronologically traced in the temporal evolution routes of genotype D, and six were located around the critical sugar-binding region in the N-terminal domain (NTD) of the S protein. These findings suggest that the genetic drift of the S gene may play an important role in genotype persistence in human populations, providing insights into the mechanisms of HCoV-OC43 adaptive evolution.
Year: 2015 PMID: 26099036 PMCID: PMC4476415 DOI: 10.1038/srep11451
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Relative genetic diversity dynamics of OC43 and each genotype.
Population size was estimated from 96 OC43 S gene sequences sampled from 2003 to 2012. Black lines show the median estimates (g), and colored regions show the 95% highest posterior density intervals.
Evolution rate of the OC43 S gene in different genotypes.
| Genotype | Substitution rate (CR), constant size | Substitution rate (CR), exponential growth |
|---|---|---|
| All | 8.48 (6.46–10.58) | 8.63 (6.60–10.65) |
| B | 9.85 (3.62–18.41) | 9.67 (4.10–16.77) |
| C | 4.85 (1.20–8.77) | 1.61 (0.04–4.05) |
| D | 8.83 (5.56–11.66) | 8.55 (5.97–11.83) |
| E | 6.01 (1.02–11.77) | 6.51 (0.07–12.65) |
Substitution rates are expressed as 10⁻⁴ substitutions per site per year. CR, credible range.
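To make the unit concrete, the following minimal sketch converts a tabulated rate into an expected number of substitutions. The function name `expected_substitutions` and the alignment length of 1,000 sites are illustrative assumptions, not values from the study; only the rates themselves are taken from the table (constant-size model).

```python
# Mean substitution rates from the table above (constant-size model),
# in units of 1e-4 substitutions per site per year.
RATES_CONSTANT_SIZE = {
    "All": 8.48,
    "B": 9.85,
    "C": 4.85,
    "D": 8.83,
    "E": 6.01,
}

UNIT = 1e-4  # the table reports rates as multiples of 10^-4


def expected_substitutions(rate_in_table_units, n_sites, years):
    """Expected substitutions across n_sites over the given number of years.

    Hypothetical helper for illustration: it simply multiplies the
    per-site, per-year rate by the number of sites and the time span.
    """
    return rate_in_table_units * UNIT * n_sites * years


if __name__ == "__main__":
    # Assumed example: 1,000 sites observed for one year (not from the paper).
    for genotype, rate in RATES_CONSTANT_SIZE.items():
        value = expected_substitutions(rate, n_sites=1000, years=1)
        print(f"{genotype}: {value:.3f} expected substitutions")
```

Under these assumptions, a tabulated rate of 8.83 corresponds to a little under one expected substitution per 1,000 sites per year.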
Selection analysis of the S gene of OC43 genotypes.
| Genotype | Mean dN/dS | Site | Amino acid | dN/dS ± S.E. | Pr | Subunit/Domain |
|---|---|---|---|---|---|---|
| B | 0.29 | 131 | T | 2.27 ± 1.50 | 0.658 | NTD |
| | | 192 | L | 1.93 ± 1.58 | 0.536 | NTD |
| | | 263 | S | 2.28 ± 1.50 | 0.663 | NTD |
| | | 421 | G | 2.28 ± 1.50 | 0.648 | RBD |
| | | 627 | I | 2.11 ± 1.55 | 0.603 | S1 |
| | | 951 | Q | 2.41 ± 1.51 | 0.710 | S2 |
| C | 0.20 | 1001 | T | 2.43 ± 1.42 | 0.768 | S2 |
| D | 0.31 | 33 | N | 6.15 ± 1.78 | 0.94 | NTD |
| | | 38 | P | 5.56 ± 2.35 | 0.836 | NTD |
| | | 90 | K | 5.94 ± 2.04 | 0.902 | NTD |
| | | 93 | T | 5.90 ± 2.08 | 0.895 | NTD |
| | | 115 | T | 3.72 ± 3.11 | 0.535 | NTD |
| | | 120 | D | 6.17 ± 1.76 | 0.943 | NTD |
| | | 176 | Y | 3.85 ± 3.11 | 0.556 | NTD |
| | | 184 | K | 6.35 ± 1.50 | 0.976 | NTD |
| | | 195 | L | 3.80 ± 3.11 | 0.548 | NTD |
| | | 265 | L | 5.45 ± 2.42 | 0.818 | NTD |
| | | 266 | D | 4.93 ± 2.66 | 0.731 | NTD |
| | | 267 | I | 5.08 ± 2.60 | 0.755 | NTD |
| | | 354 | S | 3.95 ± 3.10 | 0.572 | RBD |
| | | 395 | I | 5.81 ± 2.16 | 0.878 | RBD |
| | | 521 | Y | 6.47 ± 1.27 | 0.999 | RBD |
| | | 535 | F | 5.32 ± 2.49 | 0.796 | RBD |
| | | 716 | Q | 3.61 ± 3.11 | 0.518 | S2 |
| | | 741 | Q | 3.61 ± 3.11 | 0.517 | S2 |
| | | 763 | R | 6.47 ± 1.26 | 1.000 | S2 |
| | | 768 | G | 4.77 ± 2.78 | 0.706 | S2 |
| | | 782 | V | 6.38 ± 1.45 | 0.982 | S2 |
| | | 813 | S | 3.59 ± 3.11 | 0.515 | S2 |
| | | 989 | L | 5.67 ± 2.34 | 0.855 | S2 |
| | | 1255 | D | 4.97 ± 2.64 | 0.737 | S2 |
| | | 1310 | C | 3.87 ± 3.11 | 0.559 | S2 |
| E | 0.15 | NA | NA | NA | NA | |
dN/dS, the ratio of nonsynonymous (dN) to synonymous (dS) substitution rates.
Amino acid positions were numbered according to strain 87309 Belgium 2003 (AY903459) for genotype B and strain 5240/07 (KF572844) for genotypes C and D.
S.E., standard error.
Pr, probability.
NTD, N-terminal domain; RBD, receptor-binding domain; S, spike gene.
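As a quick way to read the genotype D rows of the table above, the sketch below tallies the listed sites by subunit/domain, optionally keeping only sites at or above a probability cutoff. The site numbers, Pr values, and domain labels are copied from the table; the helper name `count_by_domain` and the cutoff of 0.9 are illustrative assumptions and do not reproduce the authors' criteria.

```python
from collections import Counter

# Genotype D rows from the table above: (site, Pr, subunit/domain).
GENOTYPE_D_SITES = [
    (33, 0.94, "NTD"), (38, 0.836, "NTD"), (90, 0.902, "NTD"),
    (93, 0.895, "NTD"), (115, 0.535, "NTD"), (120, 0.943, "NTD"),
    (176, 0.556, "NTD"), (184, 0.976, "NTD"), (195, 0.548, "NTD"),
    (265, 0.818, "NTD"), (266, 0.731, "NTD"), (267, 0.755, "NTD"),
    (354, 0.572, "RBD"), (395, 0.878, "RBD"), (521, 0.999, "RBD"),
    (535, 0.796, "RBD"), (716, 0.518, "S2"), (741, 0.517, "S2"),
    (763, 1.000, "S2"), (768, 0.706, "S2"), (782, 0.982, "S2"),
    (813, 0.515, "S2"), (989, 0.855, "S2"), (1255, 0.737, "S2"),
    (1310, 0.559, "S2"),
]


def count_by_domain(sites, min_pr=0.0):
    """Count sites per domain, keeping only those with Pr >= min_pr.

    Hypothetical helper for illustration; min_pr is an arbitrary cutoff.
    """
    return Counter(domain for _, pr, domain in sites if pr >= min_pr)


if __name__ == "__main__":
    all_sites = count_by_domain(GENOTYPE_D_SITES)
    print("All listed sites:", dict(all_sites))

    # Assumed illustrative cutoff, not a threshold stated in the source.
    strong = count_by_domain(GENOTYPE_D_SITES, min_pr=0.9)
    print("Sites with Pr >= 0.9:", dict(strong))
```

Without a cutoff, 16 of the 25 listed genotype D sites fall in the NTD or RBD and 9 in S2, so most of the listed sites are in the N-terminal part of the protein.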
Figure 2. Molecular clock analysis of OC43 genotype D.
Complete S gene sequences sampled from 2007 to 2012 were used to reconstruct the phylogeny. The positive selection sites calculated for each node are mapped onto the maximum clade credibility (MCC) tree.