Shanjun Deng, Ke Xing, Xionglei He.
Year: 2021; PMID: 35211321; PMCID: PMC8690307; DOI: 10.1093/nsr/nwab220
Source DB: PubMed; Journal: Natl Sci Rev; ISSN: 2053-714X; Impact factor: 17.275
Figure 1. Evolution of the mutation spectrum in the SARS-CoV-2 lineage. (a) The phylogenetic relationships of the seven coronaviruses included in the analysis. Two separate phylogenetic trees are used to resolve the ambiguity caused by recombination, which produces different genealogical histories in different genomic regions of the ancestral branch of Bat-CoV-ZXC21 and Bat-CoV-ZC45. The nine major evolutionary branches examined in this study, X, B1–B7 and the human branch, are shown. Branches X and B1 also appear (as X′ and B1′) in the tree with B6 and B7 to help infer the ancestor of B6 and B7. Bat-CoV-Rc-o319 is used as an outgroup in both trees. (b) The relative mutation rate of the 12 mutation types on each of the nine evolutionary branches. (c) The similarity of mutation spectra between branch X and each of the other eight branches. The similarity of two branches is measured by the identity score (i-score), which is the proportion of total rate variation explained by the x = y dimension in the plot of the two spectra. (d) The sensitivity of the i-score between branches X and B1 to potential perturbations on X. Each histogram represents the result of 1000 replicates. The rate of replacement by random mutations is shown in each panel; the red line marks the original i-score between X and B1, and p gives the frequency of replicates with an i-score larger than the original. (e) The sensitivity of the i-score under different replacement rates. Each hollow point represents the median of 1000 replicates, and the error bars span the lower and upper quartiles.
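The i-score described in panel (c) can be illustrated with a minimal sketch. The caption gives only the verbal definition, so the formula below is an assumption about how it might be computed: each mutation type is treated as a point (x, y) in the plot of the two spectra, the points are centred on their mean, and the score is the share of total squared variation that lies along the x = y direction. The function name `i_score` and the centring step are hypothetical choices, not taken from the source.

```python
def i_score(x, y):
    """Share of total variation lying along the x = y direction.

    x, y: relative mutation rates of the same mutation types on two branches.
    Assumption: variation is measured around the centroid of the points.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("spectra must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    total = 0.0  # total squared variation of the centred points
    along = 0.0  # squared variation projected onto the unit vector (1, 1) / sqrt(2)
    for xi, yi in zip(x, y):
        dx, dy = xi - mean_x, yi - mean_y
        total += dx * dx + dy * dy
        along += (dx + dy) ** 2 / 2.0

    if total == 0.0:
        raise ValueError("spectra have no variation")
    return along / total
```

Under this reading, two identical spectra give a score of 1, and two spectra that vary in exactly opposite directions give a score of 0.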
Figure 2. Host signatures inferred from the viral mutation spectrum. (a) A diagram showing the major sources of viral mutations: replication errors (by the viral replication-transcription complex, RTC) and lesions caused by host factors. Because the replication process is the same for nucleotides G and C (or A and T), only in the opposite order, replication errors would produce equal rates for complementary mutations such as C > A and G > T. Host factors, however, would distort this equal-rate pattern of complementary mutation pairs. The positive-sense RNA is often single-stranded and therefore sensitive to ROS and the APOBEC family, while the negative-sense RNA tends to be double-stranded and is thus more affected by the ADAR family. (b) The rate difference of each complementary mutation pair serves as a signature of host factors. There are thus six host signatures, each corresponding to a complementary mutation pair, inferred from the viral mutation spectrum. Among the three major host signatures, S1 is likely associated with the APOBEC family, S2 with the ADAR family and S3 with ROS. (c) The similarity of host signatures between branch X and each of the other eight branches. Branch X is highly similar to B1, B6 and B7, the three bat coronavirus branches. (d) A multidimensional scaling (MDS) plot of the host signatures shows that branches X and B1 occupy nearly the same position. (e) Estimation of the likelihood that an arbitrary laboratory condition happens to match the host signatures of B1 (the branch of RaTG13). The grey rectangular area is defined by the empirical ranges of S1 (APOBEC-associated) and S2 (ADAR-associated), based on the data in panel (b). The probability of approaching B1 as closely as X is the area of the circle divided by the area of the whole rectangle, which is ∼2.0%. The positions of the other seven branches are also shown in the rectangular area. (f) The probability that an arbitrary condition approaches B1 as closely as X, computed for different combinations of S1, S2 and S3.
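The two calculations described in panels (b) and (e) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the direction of each subtraction, the function names `host_signatures` and `match_probability`, and the assumption that the circle lies entirely inside the rectangle are all hypothetical. The sketch deliberately does not label which pair corresponds to S1, S2 or S3, because the caption does not state that mapping.

```python
import math

# The six complementary mutation pairs (T is used for U, as in the caption).
COMPLEMENTARY_PAIRS = [
    ("C>A", "G>T"), ("C>G", "G>C"), ("C>T", "G>A"),
    ("A>C", "T>G"), ("A>G", "T>C"), ("A>T", "T>A"),
]


def host_signatures(rates):
    """Rate difference of each complementary pair on one branch.

    rates: dict mapping each of the 12 mutation types to its relative rate.
    Assumption: the difference is taken as first-minus-second in each pair.
    """
    return {f"{a} - {b}": rates[a] - rates[b] for a, b in COMPLEMENTARY_PAIRS}


def match_probability(target, query, s1_range, s2_range):
    """Area of the circle around `target` that passes through `query`,
    divided by the area of the rectangle spanned by the two ranges.

    Assumption: the circle lies inside the rectangle; the ratio is capped
    at 1.0 but the circle is not otherwise clipped to the rectangle.
    """
    radius = math.dist(target, query)
    rect_area = (s1_range[1] - s1_range[0]) * (s2_range[1] - s2_range[0])
    if rect_area <= 0:
        raise ValueError("ranges must have positive width")
    return min(1.0, math.pi * radius ** 2 / rect_area)


# Hypothetical, made-up coordinates purely to show the call shape.
p = match_probability(target=(0.0, 0.0), query=(1.0, 0.0),
                      s1_range=(-5.0, 5.0), s2_range=(-5.0, 5.0))
```

Extending `match_probability` to three signatures, as panel (f) suggests, would replace the circle and rectangle with a sphere and a box; that variant is not shown here.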