Zigui Chen, Siaw S Boon, Maggie H Wang, Renee W Y Chan, Paul K S Chan.
Abstract
Three highly pathogenic human coronaviruses can cause
Keywords: COVID-19; Codon usage; Dinucleotide suppression; MERS-CoV; Phylogeny; SARS-CoV; SARS-CoV-2
Year: 2020 PMID: 33290786 PMCID: PMC7718587 DOI: 10.1016/j.jviromet.2020.114032
Source DB: PubMed Journal: J Virol Methods ISSN: 0166-0934 Impact factor: 2.014
Fig. 1. Phylogeny of the family Coronaviridae. A maximum likelihood (ML) tree was inferred with RAxML MPI v8.2.12 from the concatenated nucleotide sequence alignments of 6 open reading frames (1a-1b-S-E-M-N) of 55 reference genomes. The dot size on the nodes is proportional to the bootstrap support values. The HCoV clusters associated with severe acute respiratory syndrome (SARS-CoV, SARS-CoV-2 and MERS-CoV) and common cold (HCoV-OC43, HCoV-HKU1, HCoV-229E and HCoV-NL63) are highlighted in red and orange, respectively. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article).
Fig. 2. Phylogeny of the subgenus Sarbecovirus in the genus Betacoronavirus. (A) A maximum likelihood (ML) tree was inferred with RAxML MPI v8.2.12 from the concatenated nucleotide sequence alignments of 12 open reading frames (1a-1b-S-3a-E-M-6-7a-7b-8-N-10) of 114 genomes. The percent nucleotide differences are shown in the panel to the right of the phylogeny. Values for each comparison of a given isolate are connected by lines, and the comparison to self is indicated by the 0.0 % difference point. Coloured lines are used to distinguish SARS-CoV-1 and SARS-CoV-2 clusters. (B) Tanglegram comparing the tree topologies obtained by hierarchical clustering of the trimer spectrum and by maximum likelihood for 114 Sarbecovirus genomes, both inferred from the concatenated nucleotide sequences of 12 ORFs/genes. The bar to the side of each panel indicates the subgenus assignment, coloured according to the key in the figure.
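The percent nucleotide difference shown beside the tree in panel (A) can be illustrated with a minimal sketch. It assumes two sequences taken from the same alignment and skips any column with a gap in either sequence; the function name `percent_difference` and the gap handling are assumptions for illustration, not the procedure reported by the authors.

```python
def percent_difference(a: str, b: str) -> float:
    """Percent of mismatching sites between two aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    mismatches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * mismatches / compared


if __name__ == "__main__":
    # One mismatch over five ungapped columns.
    print(percent_difference("ACGT-A", "ACGA-A"))
    # Comparison to self gives the 0.0 % point mentioned in the legend.
    print(percent_difference("ACGT-A", "ACGT-A"))
```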
Fig. 3. Synonymous codon usage of coronavirus genomes based on concatenated nucleotide sequences of 6 ORFs (ORF1a-1b-S-E-M-N). (A) Boxplot of the Effective Number of Codons (ENC) among HCoV clusters. ENC values range from 20, when a gene effectively uses only a single codon for each amino acid (strongest bias), to 61, when a gene tends to use all codons with equal frequency (no bias). (B) Plot of ENC against the GC content at the synonymous third codon position (GC3s). The red curve indicates the expected ENC* if the codon usage pattern is affected only by GC3s. (C) Boxplot of differences between the observed and expected ENC values among HCoV clusters. (D) Mean values of Relative Synonymous Codon Usage (RSCU) for 59 codons (excluding Met, Trp, and stop codons) among HCoV clusters. Preferred and suppressed codons were defined as those with RSCU values > 1.6 or < 0.6, respectively. (E) Scatter biplot of RSCU of HCoV clusters. The clustering was performed using redundancy analysis (RDA), with colours assigned to different human betacoronavirus clusters. The x-axis and the y-axis represent the first two principal coordinate component (PCoA) axes. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article).
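The expected ENC* curve in panel (B) can be sketched as follows. The legend does not state which formula was used; this sketch assumes the standard expression from Wright (1990), ENC* = 2 + s + 29 / (s^2 + (1 - s)^2), where s is the GC3s fraction. The function name `expected_enc` is hypothetical.

```python
def expected_enc(gc3s: float) -> float:
    """Expected ENC when codon usage is shaped only by GC3s.

    Assumption: Wright's (1990) formula; gc3s is a fraction in [0, 1].
    """
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("gc3s must be a fraction between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


if __name__ == "__main__":
    # Print the curve at a few GC3s values; an observed ENC below the
    # curve suggests factors other than GC3s contribute to the bias.
    for s in (0.2, 0.3, 0.4, 0.5):
        print(f"GC3s={s:.1f}  expected ENC={expected_enc(s):.2f}")
```

Subtracting this expected value from an observed ENC gives the kind of difference plotted in panel (C).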
Relative synonymous codon usage (RSCU) patterns of the surveyed human betacoronaviruses inferred from the concatenated 6 ORFs (ORF1a, 1b, S, E, M, N). (For interpretation of the references to colour in this Table legend, the reader is referred to the web version of this article).
^ the preferred codons (RSCU > 1.6) and the suppressed codons (RSCU < 0.6) are highlighted in red and green, respectively.
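As a minimal sketch of how RSCU values and the preferred/suppressed thresholds above can be computed, the following assumes the standard genetic code and an in-frame coding sequence. RSCU is taken as the observed count of a codon divided by the mean count of all synonymous codons for the same amino acid. The names `rscu` and `classify` are hypothetical, and this is not the authors' pipeline.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds: str) -> dict[str, float]:
    """RSCU for the 59 codons with synonyms (Met, Trp and stops excluded)."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))

    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa not in "*MW":
            families[aa].append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the sequence
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


def classify(value: float, preferred: float = 1.6, suppressed: float = 0.6) -> str:
    """Label an RSCU value using the thresholds given in the legend."""
    if value > preferred:
        return "preferred"
    if value < suppressed:
        return "suppressed"
    return "neutral"


if __name__ == "__main__":
    for codon, value in sorted(rscu("GCTGCTGCTGCC").items()):
        print(codon, round(value, 2), classify(value))
```

An RSCU of 1 means the codon is used as often as expected under equal usage of its synonyms.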
Fig. 4. Scatter biplot of Relative Synonymous Codon Usage (RSCU) of HCoV clusters inferred from distinct ORFs/genes. The clustering was performed based on RSCU patterns for individual genes using redundancy analysis (RDA), with colours assigned to different coronavirus clusters. The x-axis and the y-axis represent the first two principal coordinate component (PCoA) axes.
Fig. 5. Dinucleotide suppression of HCoV genomes inferred from the concatenated nucleotide sequences of 6 ORFs (ORF1a-1b-S-E-M-N). (A) Boxplot of dinucleotide observed/expected (O/E) ratios. A dinucleotide XY is considered suppressed if its O/E ratio (ρXY) is less than 1. (B) Scatter biplot of relative abundance of dinucleotides of HCoV genomes. The clustering was performed using redundancy analysis (RDA), with colours assigned to different clusters. The x-axis and the y-axis represent the first two principal coordinate component (PCoA) axes. (C) Boxplot of the O/E ratios of each dinucleotide among HCoV clusters.
Dinucleotide depletion of the surveyed human betacoronaviruses inferred from the concatenated 6 ORFs (ORF1a, 1b, S, E, M, N).
| Dinucleotide | SARS-CoV-2 | SARS-CoV | MERS-CoV | HCoV-OC43 | HCoV-HKU1 | HCoV-229E | HCoV-NL63 | Animal-CoV |
|---|---|---|---|---|---|---|---|---|
| pAA | 1.07 | 1.04 | 1.06 | 1.05 | 1.09 | 1.15 | 1.16 | 1.05 |
| pAC | 1.23 | 1.17 | 1.12 | 1.05 | 1.10 | 1.27 | 1.30 | 1.17 |
| pAG | 0.99 | 0.99 | 1.00 | 0.99 | 0.96 | 0.88 | 0.88 | 0.97 |
| pAT | 0.81 | 0.86 | 0.88 | 0.94 | 0.93 | 0.83 | 0.84 | 0.88 |
| pCA | 1.28 | 1.31 | 1.21 | 1.26 | 1.09 | 1.34 | 1.30 | 1.28 |
| pCC | 0.89 | 0.83 | 0.94 | 1.15 | 1.17 | 0.93 | 1.04 | 0.97 |
| pCG | ||||||||
| pCT | 1.18 | 1.20 | 1.16 | 1.07 | 1.15 | 1.09 | 1.09 | 1.10 |
| pGA | 0.90 | 0.95 | 0.89 | 0.88 | 0.89 | 0.87 | 0.82 | 0.87 |
| pGC | 1.09 | 1.15 | 1.16 | 1.32 | 1.19 | 1.18 | 1.11 | 1.18 |
| pGG | 0.96 | 0.94 | 0.93 | 0.88 | 0.89 | 0.86 | 0.92 | 0.91 |
| pGT | 1.07 | 0.99 | 1.03 | 1.03 | 1.07 | 1.11 | 1.12 | 1.07 |
| pTA | 0.83 | 0.79 | 0.89 | 0.92 | 0.96 | 0.80 | 1.12 | 0.88 |
| pTC | 0.79 | 0.85 | 0.84 | 0.70 | 0.79 | 0.71 | 1.12 | 0.76 |
| pTG | 1.39 | 1.41 | 1.33 | 1.31 | 1.26 | 1.43 | 1.12 | 1.35 |
| pTT | 1.04 | 1.00 | 0.97 | 1.00 | 0.97 | 1.03 | 1.12 | 1.00 |
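Dinucleotide O/E ratios of the kind tabulated above can be computed along these lines. This sketch assumes the common odds-ratio definition, the dinucleotide frequency f(XY) divided by the product of the mononucleotide frequencies f(X) and f(Y), calculated on a single strand over overlapping dinucleotides. The function name `dinucleotide_odds` is hypothetical, and the authors' exact procedure may differ.

```python
from collections import Counter
from itertools import product


def dinucleotide_odds(sequence: str) -> dict[str, float]:
    """Observed/expected ratio for each of the 16 dinucleotides.

    Assumption: the input contains only A, C, G and T; ambiguous bases
    are not handled specially in this sketch.
    """
    seq = sequence.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(n - 1))  # overlapping pairs

    ratios = {}
    for x, y in product("ACGT", repeat=2):
        expected = (mono[x] / n) * (mono[y] / n)
        if expected == 0:
            continue  # one of the bases never occurs
        observed = di[x + y] / (n - 1)
        ratios[x + y] = observed / expected
    return ratios


if __name__ == "__main__":
    ratios = dinucleotide_odds("ACGTACGTACGTACGT")
    for pair, value in sorted(ratios.items()):
        flag = "suppressed" if value < 1 else ""
        print(pair, round(value, 2), flag)
```

A ratio below 1 marks a dinucleotide that occurs less often than its base composition alone would predict.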