Ayan Roy, Fucheng Guo, Bhupender Singh, Shelly Gupta, Karan Paul, Xiaoyuan Chen, Neeta Raj Sharma, Nishika Jaishee, David M Irwin, Yongyi Shen.
Abstract
The novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been spreading rapidly all over the world and has raised grave concern globally. The present research aims to conduct a robust base compositional analysis of SARS-CoV-2 to reveal adaptive intricacies to the human host. Multivariate statistical analysis revealed a complex interplay of factors that shape codon usage patterns, including compositional constraint, natural selection, length of viral coding sequences, and hydropathicity and aromaticity of the viral gene products, with compositional bias being the most crucial determinant. UpG and CpA dinucleotides were found to be highly preferred, whereas the CpG dinucleotide was mostly avoided in SARS-CoV-2, a pattern consistent with the human host. Strict avoidance of the CpG dinucleotide might be attributed to a strategy for evading a human immune response. A lower degree of adaptation of SARS-CoV-2 to the human host, compared to Middle East respiratory syndrome (MERS) coronavirus and SARS-CoV, might be indicative of its milder clinical severity and progression compared with SARS and MERS. Similar patterns of enhanced adaptation between viral isolates from intermediate and human hosts, contrasted with those isolated from the natural bat reservoir, signify an indispensable role of the intermediate host in transmission dynamics and spillover events of the virus to human populations. The information regarding avoided codon pairs in SARS-CoV-2, as conferred by the present analysis, promises to be useful for the design of vaccines employing codon pair deoptimization-based synthetic attenuated virus engineering.
Keywords: SARS-CoV-2; base composition; codon adaptation index; codon pair usage; codon usage; host adaptation
Year: 2021 PMID: 33889134 PMCID: PMC8057303 DOI: 10.3389/fmicb.2021.548275
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Relative synonymous codon usage (RSCU) patterns of SARS-CoV-2 in comparison with its host Homo sapiens.
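RSCU is not defined in this excerpt. As a minimal sketch, assuming the standard definition (observed count of a codon divided by the mean count of all synonymous codons for the same amino acid), it could be computed as follows. The function name `rscu`, the two-family `FAMILIES` table, and the toy sequence are illustrative assumptions, not taken from the paper.

```python
from collections import Counter

# Illustrative subset of the standard genetic code (an assumption for brevity);
# a full analysis would list every degenerate amino acid family.
FAMILIES = {
    "Ala": ["GCU", "GCC", "GCA", "GCG"],
    "Lys": ["AAA", "AAG"],
}

def rscu(cds: str) -> dict:
    """Return RSCU per codon: observed count / mean count within its family."""
    seq = cds.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    values = {}
    for family in FAMILIES.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent, so RSCU is undefined for this family
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values

# Hypothetical toy sequence: GCU GCU GCA AAA AAA AAG
example = rscu("GCTGCTGCAAAAAAAAAG")
```

Under this definition a value above 1 marks a codon used more often than expected under equal synonymous usage, and a value below 1 marks one used less often.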
FIGURE 1 (A) GC3-ENC plot for SARS-CoV-2. Viral genes analyzed are marked as orange colored circles. ENC plot curve is indicated by the bell-shaped solid line. (B) Neutrality plot for SARS-CoV-2. Viral genes analyzed are marked as orange colored circles.
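The bell-shaped curve in a GC3-ENC plot is conventionally the ENC expected when codon usage is determined by GC3 composition alone. The sketch below assumes Wright's formula, ENC = 2 + s + 29 / (s² + (1 − s)²) with s the GC3 fraction. The excerpt does not state which formula the authors used, and the name `expected_enc` is hypothetical.

```python
def expected_enc(gc3: float) -> float:
    """Expected ENC under purely compositional constraint (Wright's formula).

    gc3 is the fraction (0..1) of G or C at third codon positions.
    """
    if not 0.0 <= gc3 <= 1.0:
        raise ValueError("gc3 must be a fraction between 0 and 1")
    return 2.0 + gc3 + 29.0 / (gc3 ** 2 + (1.0 - gc3) ** 2)

# Points along the reference curve, sampled at GC3 = 0.0, 0.1, ..., 1.0
curve = [(s / 10, expected_enc(s / 10)) for s in range(11)]
```

Genes lying well below this curve have stronger codon bias than composition alone predicts, which is usually read as a sign of additional pressures such as selection.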
Correlation analysis (Spearman’s rank correlation) of various codon usage indices of SARS-CoV-2 with Axes 1 and 2, the principal axes of separation of the genes in the RSCU data.
| 0.14** | −0.41** | |
| −0.32** | 0.42** | |
| −0.44** | 0.36** | |
| 0.21** | 0.12** | |
| −0.59** | −0.33** | |
| −0.49** | 0.42** | |
| −0.58** | 0.37** | |
| 0.13** | 0.11** | |
| −0.32** | 0.05 | |
| −0.16** | −0.08 | |
| 0.74** | −0.60** |
FIGURE 2 (A) Relative dinucleotide abundance for SARS-CoV-2 in comparison with its host Homo sapiens. (B) Dinucleotide bias at the codon–codon interface for SARS-CoV-2 in comparison with its host Homo sapiens.
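Relative dinucleotide abundance is commonly expressed as the odds ratio ρ(xy) = f(xy) / (f(x) · f(y)), where f denotes observed frequencies. The following is a minimal sketch under that assumption. The excerpt does not give the authors' exact formula, and `dinucleotide_odds` and the toy sequence are hypothetical.

```python
from collections import Counter
from itertools import product

def dinucleotide_odds(seq: str) -> dict:
    """Return the observed/expected odds ratio for each of the 16 dinucleotides."""
    seq = seq.upper().replace("T", "U")
    n = len(seq)
    mono = Counter(seq)                               # single-base counts
    di = Counter(seq[i:i + 2] for i in range(n - 1))  # overlapping pairs
    odds = {}
    for x, y in product("ACGU", repeat=2):
        fx, fy = mono[x] / n, mono[y] / n
        if fx == 0 or fy == 0:
            continue  # expected frequency is zero, so the ratio is undefined
        odds[x + y] = (di[x + y] / (n - 1)) / (fx * fy)
    return odds

# Hypothetical toy sequence containing UpG and CpA but no CpG
toy = dinucleotide_odds("UGUGUGCACACAUGCA")
```

A ratio well above 1 indicates an over-represented dinucleotide and a ratio well below 1 an avoided one, which is the sense in which CpG is described as avoided in the abstract.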
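The most preferred codon, for each amino acid, in SARS-CoV-2 and iso-acceptor tRNAs in Homo sapiens.

| Amino acid | Most preferred codon in SARS-CoV-2 | Iso-acceptor tRNAs in Homo sapiens |
| --- | --- | --- |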
| Ala | GCU | |
| Gly | GGU | ACC (0), GCC (14), CCC (5), UCC (9) |
| Pro | CCU | |
| Thr | ACU | |
| Val | GUU | AAC (9), GAC (0), CAC (11), UAC (5) |
| Ser | UCU | |
| Arg | AGA | ACG (7), GCG (0), CCG (4), UCG (6), CCU (5), UCU (6) |
| Leu | CUU | |
| Phe | UUU | AAA (0), GAA (10) |
| Asn | AAU | AUU (0), GUU (20) |
| Lys | AAA | CUU (15), UUU (12) |
| Asp | GAU | AUC (0), GUC (13) |
| Glu | GAA | CUC (8), UUC (7) |
| His | CAU | AUG (0), GUG (10) |
| Gln | CAA | CUG (13), UUG (6) |
| Ile | AUU | |
| Tyr | UAU | AUA (0), GUA (13) |
| Cys | UGU | ACA (0), GCA (29) |
FIGURE 3 (A) Correspondence analysis depicting Axis 1 and Axis 2 of the RSCU data for MERS-CoV, SARS-CoV, SARS-CoV-2, and their human host. Dots representing MERS-CoV, SARS-CoV, and SARS-CoV-2 are marked in green, blue, and purple, respectively. Dots signifying human coding sequences are marked in red. (B) Codon adaptation index (CAI) values for MERS-CoV, SARS-CoV, and SARS-CoV-2. Mann–Whitney U Rank Sum Test was used to compare the average of the CAI values pertaining to the different sets of viruses. *P < 0.05; **P < 0.01; ***P < 0.001.
FIGURE 4 Codon adaptation index (CAI) values for MERS-CoV, SARS-CoV, and SARS-CoV-2, isolated from different hosts, calculated with the human coding sequences as reference. Red, green, and blue boxes represent viral isolates from the bat, corresponding intermediate host (dromedary camel for MERS-CoV, civet for SARS-CoV, and pangolin for SARS-CoV-2) and human host, respectively. Mann–Whitney U Rank Sum Test was employed to compare the mean of the CAI values pertaining to the different sets of viruses. *P < 0.05; **P < 0.01; ***P < 0.001. ns, not significant.
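The CAI values compared in the figures are computed against human coding sequences as reference, but the calculation itself is not shown in this excerpt. The sketch below assumes the standard Sharp and Li definition: each codon receives a relative adaptiveness w (its reference count divided by the count of the most used synonymous codon), and CAI is the geometric mean of w over the codons of a gene. The helper names, the two-family `FAMILIES` table, the 0.5 pseudocount for unobserved codons, and the toy reference are all assumptions made for illustration.

```python
import math
from collections import Counter

# Illustrative subset of synonymous families (an assumption for brevity)
FAMILIES = {
    "Ala": ["GCU", "GCC", "GCA", "GCG"],
    "Lys": ["AAA", "AAG"],
}

def split_codons(cds: str) -> list:
    seq = cds.upper().replace("T", "U")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]

def relative_adaptiveness(reference_cds: list) -> dict:
    """w = reference count / count of the most used synonymous codon."""
    counts = Counter(c for cds in reference_cds for c in split_codons(cds))
    w = {}
    for family in FAMILIES.values():
        best = max(counts[c] for c in family)
        if best == 0:
            continue  # family absent from the reference set
        for codon in family:
            # Assumption: unobserved codons get a 0.5 pseudocount to avoid log(0)
            w[codon] = max(counts[codon], 0.5) / best
    return w

def cai(cds: str, w: dict) -> float:
    """Geometric mean of w over the codons of cds that have a defined w."""
    logs = [math.log(w[c]) for c in split_codons(cds) if c in w]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))

# Hypothetical toy reference: GCC x3, GCU x1, AAG x2, AAA x1
w = relative_adaptiveness(["GCCGCCGCCGCUAAGAAGAAA"])
optimal = cai("GCCAAG", w)
suboptimal = cai("GCTAAA", w)
```

CAI ranges from just above 0 to 1, with 1 meaning the gene uses only the reference set's most frequent codon for every amino acid. Comparing CAI distributions between groups of isolates, as the figures do with the Mann–Whitney U test, would be a separate step on top of these per-gene values.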