Chandra Mohan Dasari, Raju Bhukya.
Abstract
The genetic code contains information that affects the efficiency and rate of translation. Translation elongation plays a crucial role in determining the composition of the proteome, and errors within a protein contribute to disease processes. It is important to analyze the novel coronavirus (2019-nCoV) at the codon level to find similarities and variations in hosts and to compare it with other human coronaviruses (CoVs). This requires a comprehensive comparative study of various human and zoonotic CoVs covering codon usage bias, relative synonymous codon usage (RSCU), the proportions of slow codons and slow di-codons, the effective number of codons (ENC), mutation bias, the codon adaptation index (CAI), and codon frequencies. In this work, seven different CoVs were analyzed to determine the protein synthesis rate and the adaptation of these viruses to the host cell. The results reveal that the proportions of slow codons and slow di-codons in the human host are similar for 2019-nCoV and SARS-CoV and much lower than for the other five coronavirus types, which suggests that 2019-nCoV and SARS-CoV have a faster protein synthesis rate. Zoonotic CoVs have higher RSCU and CAI values than human CoVs, which implies a high translation rate in zoonotic viruses. All CoVs have a higher AT% than GC% in their codon composition. The average ENC values of the seven CoVs ranged between 38.36 and 49.55, which implies that the CoVs are highly conserved and adapt easily to host cells. The mutation rate of 2019-nCoV is lower than that of MERS-CoV and NL63, which provides evidence of genetic diversity. Host-specific codon composition analysis portrays the relation between viral host sequences and the capability of a novel virus to replicate in host cells. Moreover, the analysis provides useful measures for evaluating virus-host adaptation and the transmission potential of novel viruses, and thus contributes to strategies for anti-viral drug design.
Keywords: 2019-nCoV; Amino acid; COVID-19; Codon bias; Coronavirus; ENC; RSCU; Slow codons
Year: 2020 PMID: 32592845 PMCID: PMC7314694 DOI: 10.1016/j.meegid.2020.104432
Source DB: PubMed Journal: Infect Genet Evol ISSN: 1567-1348 Impact factor: 3.342
Number of coding sequences (CDSs), their length ranges, and the discovery years of the different coronaviruses.
| Coronavirus Type | Year | No. of CDSs | Range (bp) |
|---|---|---|---|
| 229E | 1965 | 113 | 51–20,292 |
| OC43 | 1980 | 1130 | 239–21,288 |
| SARS-CoV | 2002 | 1863 | 80–21,291 |
| NL63 | 2004 | 270 | 51–20,190 |
| HKU1 | 2005 | 238 | 249–21,654 |
| MERS-CoV | 2012 | 27 | 42–21,237 |
| 2019-nCoV | 2019 | 502 | 107–21,291 |
RSCU values of different types of coronaviruses.
| Amino Acid | Codon | 229E | HKU1 | NL63 | OC43 | MERS-CoV | SARS-CoV | 2019-nCoV |
|---|---|---|---|---|---|---|---|---|
| Ala | GCT | |||||||
| GCC | 0.717 | 0.694 | 0.586 | 0.602 | 0.705 | 0.718 | 0.652 | |
| GCA | 1.132 | 0.876 | 1.159 | 1.159 | 0.942 | 1.545 | 1.556 | |
| GCG | 0.304 | 0.359 | 0.192 | 0.313 | 0.511 | 0.561 | 0.533 | |
| Arg | CGA | 0.818 | 0.553 | 0.763 | 0.761 | 0.935 | 0.922 | 0.915 |
| CGC | 0.697 | 1.051 | 0.761 | 0.995 | 1.284 | 0.762 | 0.716 | |
| CGT | ||||||||
| CGG | 0.561 | 0.517 | 0.463 | 0.528 | 0.856 | 0.334 | 0.3 | |
| AGA | ||||||||
| AGG | 0.909 | 0.857 | 1.172 | 1.16 | 1.467 | |||
| Asn | AAC | 0.699 | 0.339 | 0.513 | 0.497 | 0.655 | 0.809 | 0.807 |
| AAT | 1.37 | 1.59 | 1.37 | 1.281 | 1.274 | |||
| Asp | GAC | 0.778 | 0.493 | 0.577 | 0.502 | 0.796 | 0.895 | 0.895 |
| GAT | 1.244 | 1.486 | 1.551 | 1.281 | 1.28 | 1.289 | ||
| Cys | TGC | 0.689 | 0.361 | 0.435 | 0.641 | 0.827 | 0.993 | 0.999 |
| TGT | 1.438 | 1.462 | 1.243 | 1.296 | 1.287 | |||
| Gln | CAA | 1.288 | 1.32 | 1.324 | 1.167 | 1.135 | 1.394 | 1.384 |
| CAG | 0.827 | 0.767 | 0.773 | 0.881 | 0.865 | 0.695 | 0.694 | |
| Glu | GAA | 1.195 | 1.411 | 1.402 | 1.237 | 1.005 | 1.286 | 1.278 |
| GAG | 0.871 | 0.716 | 0.869 | 0.879 | 1.107 | 0.733 | 0.723 | |
| Gly | GGA | 0.542 | 0.577 | 0.488 | 0.741 | 0.87 | 1.169 | 1.176 |
| GGC | 1.012 | 0.709 | 0.652 | 0.701 | 1.04 | 1.482 | 1.491 | |
| GGT | ||||||||
| GGG | 0.453 | 0.591 | 0.546 | 0.431 | 0.662 | 0.266 | 0.208 | |
| His | CAT | 1.518 | 1.584 | 1.55 | 1.248 | 1.328 | 1.337 | |
| CAC | 0.777 | 0.514 | 0.512 | 0.567 | 0.874 | 0.794 | 0.755 | |
| Leu | TTA | 1.452 | 1.477 | 1.277 | 1.356 | 1.341 | ||
| TTG | 1.322 | 1.184 | 1.171 | |||||
| CTT | 1.272 | 1.47 | ||||||
| CTC | 0.516 | 0.36 | 0.449 | 0.403 | 0.871 | 1.087 | 1.104 | |
| CTA | 0.563 | 0.494 | 0.441 | 0.558 | 0.693 | 0.791 | 0.795 | |
| CTG | 0.362 | 0.346 | 0.223 | 0.551 | 0.628 | 0.621 | 0.601 | |
| Ile | ATT | 1.524 | 1.509 | |||||
| ATC | 0.46 | 0.382 | 0.369 | 0.371 | 0.724 | 0.75 | 0.75 | |
| ATA | 0.788 | 1.044 | 0.744 | 1.021 | 0.856 | 1.073 | 1.068 | |
| Lys | AAA | 1.092 | 1.199 | 1.291 | 1.039 | 1.185 | 1.491 | 1.502 |
| AAG | 0.909 | 0.801 | 0.777 | 1.002 | 0.957 | 0.657 | 0.639 | |
| Phe | TTT | 1.25 | 1.22 | 1.22 | ||||
| TTC | 0.553 | 0.338 | 0.301 | 0.376 | 0.781 | 0.872 | 0.866 | |
| Pro | CCT | |||||||
| CCC | 0.897 | 0.564 | 0.717 | 0.633 | 0.755 | 0.543 | 0.519 | |
| CCA | 1.573 | 1.297 | 1.377 | 1.518 | 1.224 | |||
| CCG | 0.569 | 0.404 | 0.216 | 0.463 | 0.366 | 1.064 | 1.06 | |
| Ser | TCT | |||||||
| TCC | 0.538 | 0.392 | 0.457 | 0.724 | 1.1 | 0.851 | 0.851 | |
| TCA | 1.307 | 0.933 | 1.153 | 0.963 | 1.39 | |||
| TCG | 0.411 | 0.417 | 0.244 | 0.36 | 0.395 | 0.399 | 0.389 | |
| AGT | 1.494 | 1.388 | 1.507 | 1.503 | ||||
| AGC | 0.7 | 0.439 | 0.527 | 0.655 | 0.592 | 0.662 | 0.643 | |
| Thr | ACT | |||||||
| ACC | 0.573 | 0.568 | 0.541 | 0.716 | 0.737 | 0.577 | 0.565 | |
| ACA | 1.119 | 1.287 | 1.302 | 1.155 | ||||
| ACG | 0.324 | 0.364 | 0.335 | 0.314 | 0.28 | 0.663 | 0.658 | |
| Tyr | TAT | 1.461 | 1.587 | 1.251 | 1.101 | 1.091 | ||
| TAC | 0.585 | 0.435 | 0.493 | 0.486 | 0.782 | 1.029 | 1.03 | |
| Val | GTT | |||||||
| GTC | 0.584 | 0.346 | 0.454 | 0.445 | 0.819 | 0.753 | 0.722 | |
| GTA | 0.537 | 0.878 | 0.496 | 0.788 | 0.852 | 1.141 | 1.152 | |
| GTG | 0.919 | 0.433 | 0.455 | 0.832 | 0.829 | 0.665 | 0.641 |
Bold values represent the over-preferred codons.
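The RSCU values tabulated above follow the usual definition: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The following is a minimal Python sketch of that calculation, not the authors' pipeline. The `FAMILIES` dictionary (restricted to two amino acids for brevity), the helper names, and the toy sequence are all illustrative assumptions.

```python
from collections import Counter

# Synonymous codon families for two amino acids (standard genetic code).
# Illustrative subset only; a full analysis would list every degenerate amino acid.
FAMILIES = {
    "Ala": ["GCT", "GCC", "GCA", "GCG"],
    "Lys": ["AAA", "AAG"],
}


def split_codons(cds):
    """Split a coding sequence into complete codons, dropping trailing bases."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def rscu(cds, families=FAMILIES):
    """Return {codon: RSCU} for every codon in the given families.

    RSCU = observed count / (family total / family size).
    Families that never occur in the sequence are skipped.
    """
    counts = Counter(split_codons(cds))
    result = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        expected = total / len(family)
        for codon in family:
            result[codon] = counts[codon] / expected
    return result


# Made-up sequence: 2x GCT, 1x GCA, 1x GCC, 2x AAA, 1x AAG.
toy = "GCT" * 2 + "GCA" + "GCC" + "AAA" * 2 + "AAG"
values = rscu(toy)
```

In this toy case GCT is used twice as often as equal usage would predict, so its RSCU is 2, while the unused GCG gets 0. Values above 1 indicate a codon used more often than expected, and values below 1 a codon used less often.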
The average proportions of slow codons and slow di-codons.
| Corona Type | Non-repeated Slow Codon Proportion | Repeated Slow Codon Proportion | Non-overlapped Slow Di-codon Proportion | Overlapped Slow Di-codon Proportion |
|---|---|---|---|---|
| 229E | 0.283612 | 0.220118 | 0.062016 | 0.049627 |
| HKU1 | 0.263711 | 0.245577 | 0.07943 | 0.061318 |
| NL63 | 0.270957 | 0.24975 | 0.089925 | 0.071582 |
| OC43 | 0.269984 | 0.23228 | 0.067392 | 0.056349 |
| MERS-CoV | 0.279569 | 0.215358 | 0.059871 | 0.049781 |
| SARS-CoV | 0.237613 | 0.176295 | 0.034371 | 0.029054 |
| 2019-nCoV | 0.235117 | 0.17425 | 0.033134 | 0.027704 |
Fig. 1. The average proportions of human slow codons and slow di-codons.
Mutations found in various DNA sequences of 2019-nCoV, MERS-CoV, and NL63.
| CoV Type | Number of Transition: Transversion Mutations | Silent Mutations | Missense Mutations | Nonsense Mutations | CoVs-Strain |
|---|---|---|---|---|---|
| 2019-nCoV | 1: 1 | – | GTA(11082)➔CTA, | – | CHN/Yunnan-01/2020 MT049951 |
| | 5: 0 | CTC(18059)➔CTT | CCT(17746)➔CTT, | – | USA/WA3-UW1/2020 MT163719 |
| | 1: 0 | – | CCT(14407)➔CTT | – | ZAF/R03006/2020 MT324062 |
| | 3: 0 | GTT(18125)➔GTC | CCT(14407)➔CTT, | – | GRC/10/2020 MT328032 |
| | 2: 2 | TAC(14804)➔TAT, | GTA(11082)➔TTA, | – | BRA/SP02cc/2020 MT350282 |
| | 1: 0 | – | CCT(14407)➔CTT | – | IND/GBRC1/2020 MT358637 |
| | 2: 1 | – | AGT(1396)➔AAT, | – | TWN/CGMH-CGU-05/2020 MT370518 |
| | 2: 3 | – | AGT(1396)➔AAT, | – | LKA/COV38/2020 MT371047 |
| MERS-CoV | 46: 10 | CTT(776)➔CTG, | CAT(749)➔CAG, | – | HCoV-EMC MH306207 |
| | 43: 5 | AGA(3275)➔AGG, | TGT(541)➔TAT, | – | HCoV-EMC MH013216 |
| | 57: 14 | CCC(1832)➔CCA, | CAT(749)➔CAG, | CAG(13395)➔TAG, | HCoV-EMC MH454272 |
| | 57: 14 | CCC(1832)➔CCA, | CTA(652)➔CAA, | CAG(13395)➔TAG, | 2366 MH432120 |
| | 57: 14 | CTG(7554)➔TTG, | CTA(1903)➔CCA, | GAG(23553)➔TAG, | 2363 MH395139 |
| NL63 | 48: 6 | TGC(12974)➔TGT, | ATT(17433)➔GTT, | GAA(20799)➔TAA | Haiti-1/2015 KT266906 |
| | 67: 12 | TGT(14591)➔TGC, | TTT(414)➔CTT, | GAA(20799)➔TAA, | UF-1/2015 KT381875 |
| | 70: 12 | GAA(12902)➔GAG, | CTC(7740)➔TTC, | GAA(20799)➔TAA, | UF-2/2015 KU521535 |
| | 57: 14 | GAA(12902)➔GAG, | GAA(12902)➔GAG, | GAA(20799)➔TAA, | UF-2/2015 KX179500 |
| | 21: 46 | CTT(16560)➔TTG, | AGT(13293)➔TGT, | – | UNKNOWNCS124012 CS124012 |
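The columns of the mutation table rest on two standard classifications. A substitution is a transition if it stays within purines (A, G) or within pyrimidines (C, T), and a transversion otherwise. A codon change is silent, missense, or nonsense depending on whether the encoded amino acid is unchanged, changed, or replaced by a stop. The Python sketch below illustrates both rules with the standard genetic code. The function names are hypothetical and this is not the authors' tooling.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code listed in TCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

PURINES = {"A", "G"}


def substitution_types(ref, alt):
    """Count (transitions, transversions) between two equal-length codons."""
    transitions = transversions = 0
    for a, b in zip(ref, alt):
        if a == b:
            continue
        # Same chemical class (purine<->purine or pyrimidine<->pyrimidine) is a transition.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


def effect(ref, alt):
    """Classify a codon change as 'silent', 'missense', or 'nonsense'."""
    before, after = CODON_TABLE[ref], CODON_TABLE[alt]
    if before == after:
        return "silent"
    if after == "*":
        return "nonsense"
    return "missense"


# Three codon changes taken from the table above.
examples = [("CTC", "CTT"), ("GTA", "CTA"), ("CAG", "TAG")]
summary = [(ref, alt, substitution_types(ref, alt), effect(ref, alt))
           for ref, alt in examples]
```

Applied to entries from the table, CTC➔CTT is a silent transition (Leu to Leu), GTA➔CTA is a missense transversion (Val to Leu), and CAG➔TAG is a nonsense transition (Gln to stop), which matches the columns in which those changes are listed.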
The average values of different parameters of seven coronaviruses.
| Parameter | 229E | HKU1 | NL63 | OC43 | MERS-CoV | SARS-CoV | 2019-nCoV |
|---|---|---|---|---|---|---|---|
| ENCs | 44.63 | 39.26 | 38.36 | 45.24 | 49.56 | 43.75 | 43.91 |
| GC3s | 0.326 | 0.225 | 0.241 | 0.297 | 0.369 | 0.326 | 0.322 |
| GC2s | 0.382 | 0.355 | 0.366 | 0.371 | 0.397 | 0.365 | 0.363 |
| GC1s | 0.451 | 0.414 | 0.455 | 0.455 | 0.484 | 0.456 | 0.457 |
| AT3 | 0.674 | 0.775 | 0.759 | 0.703 | 0.631 | 0.674 | 0.678 |
| AT2 | 0.618 | 0.645 | 0.634 | 0.629 | 0.603 | 0.635 | 0.637 |
| AT1 | 0.549 | 0.586 | 0.545 | 0.545 | 0.516 | 0.544 | 0.543 |
Fig. 2. ENC plot of seven different coronaviruses representing the relation between GC3s and ENC.
Fig. 3. Comparison of genetic codon compositions of seven coronaviruses that infect human hosts.
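An ENC plot such as Fig. 2 is conventionally read against Wright's expected curve, ENC = 2 + s + 29 / (s² + (1 − s)²), where s is GC3s. Points well below the curve are usually taken to suggest that factors beyond base composition shape codon usage. The Python sketch below computes position-wise GC fractions and the expected ENC. It is a simplified illustration: `gc_by_position` counts every complete codon, whereas GC3s is normally restricted to synonymous third positions, and both function names are assumptions rather than anything taken from the paper.

```python
def gc_by_position(cds):
    """Return GC fractions at codon positions 1, 2 and 3.

    Simplification: every complete codon is counted, including Met, Trp
    and stop codons, which a strict GC3s calculation would exclude.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    if usable == 0:
        raise ValueError("sequence contains no complete codon")
    fractions = []
    for pos in range(3):
        bases = cds[pos:usable:3]  # every third base starting at this position
        fractions.append(sum(b in "GC" for b in bases) / len(bases))
    return tuple(fractions)


def expected_enc(gc3s):
    """Wright's expected ENC when codon choice is shaped by GC3s alone."""
    return 2 + gc3s + 29 / (gc3s ** 2 + (1 - gc3s) ** 2)


# GC3s and ENC for 2019-nCoV as reported in the parameter table above.
observed_enc = 43.91
expected = expected_enc(0.322)
```

With the tabulated GC3s of 0.322 for 2019-nCoV, the expected ENC comes out at roughly 53.8, so the reported average of 43.91 lies below the curve.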
The minimum, mean, and maximum average CAI values of seven coronaviruses.
| Parameter | 229E | HKU1 | NL63 | OC43 | MERS-CoV | SARS-CoV | 2019-nCoV |
|---|---|---|---|---|---|---|---|
| Minimum | 0.379 | 0.338 | 0.394 | 0.417 | 0.491 | 0.449 | 0.511 |
| Mean | 0.661 | 0.666 | 0.665 | 0.672 | 0.687 | 0.674 | 0.670 |
| Maximum | 0.781 | 0.787 | 0.785 | 0.817 | 0.781 | 0.777 | 0.756 |
The CAI values for each codon of seven coronaviruses.
| Codon | 229E | HKU1 | NL63 | OC43 | MERS-CoV | SARS-CoV | 2019-nCoV |
|---|---|---|---|---|---|---|---|
| AAA | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| AAC | 0.799 | 0.601 | 1 | 0.728 | 1 | 1 | 1 |
| AAT | 1 | 1 | 0.829 | 1 | 0.985 | 0.999 | 0.98 |
| AAG | 0.296 | 0.377 | 0.245 | 0.463 | 0.215 | 0.315 | 0.292 |
| ACA | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| ACC | 0.8 | 0.774 | 0.802 | 0.629 | 0.864 | 0.819 | 0.85 |
| ACT | 0.716 | 0.667 | 0.676 | 0.504 | 0.886 | 0.811 | 0.802 |
| ACG | 0.45 | 0.329 | 0.399 | 0.349 | 0.451 | 0.24 | 0.246 |
| ATA | 0.353 | 0.343 | 0.349 | 0.403 | 0.437 | 0.33 | 0.307 |
| ATC | 0.457 | 0.36 | 0.383 | 0.317 | 0.456 | 0.477 | 0.477 |
| ATT | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| AGA | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| AGC | 0.856 | 0.702 | 0.791 | 0.875 | 0.932 | 0.864 | 0.881 |
| AGT | 1 | 1 | 0.822 | 0.893 | 0.92 | 1 | 1 |
| AGG | 0.64 | 0.478 | 0.703 | 0.597 | 0.596 | 0.486 | 0.488 |
| CAA | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| CAC | 0.841 | 1 | 0.798 | 0.983 | 1 | 1 | 1 |
| CAT | 1 | 0.964 | 1 | 1 | 0.991 | 0.977 | 0.966 |
| CAG | 0.333 | 0.3 | 0.213 | 0.401 | 0.294 | 0.393 | 0.382 |
| CCA | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| CCC | 0.707 | 0.494 | 0.647 | 0.702 | 0.808 | 0.671 | 0.717 |
| CCT | 0.737 | 0.661 | 0.712 | 0.819 | 0.864 | 0.949 | 0.941 |
| CCG | 0.483 | 0.358 | 0.349 | 0.426 | 0.449 | 0.325 | 0.339 |
| CTA | 0.407 | 0.146 | 0.136 | 0.22 | 0.579 | 0.503 | 0.493 |
| CTC | 0.286 | 0.118 | 0.141 | 0.18 | 0.496 | 0.349 | 0.352 |
| CTT | 0.694 | 0.299 | 0.454 | 0.474 | 1 | 0.866 | 0.869 |
| CTG | 0.303 | 0.128 | 0.167 | 0.208 | 0.391 | 0.37 | 0.361 |
| CGA | 0.321 | 0.165 | 0.201 | 0.183 | 0.364 | 0.15 | 0.152 |
| CGC | 0.25 | 0.066 | 0.121 | 0.133 | 0.305 | 0.107 | 0.104 |
| CGT | 0.424 | 0.151 | 0.149 | 0.194 | 0.545 | 0.166 | 0.161 |
| CGG | 0.225 | 0.076 | 0.076 | 0.132 | 0.27 | 0.136 | 0.135 |
| GAA | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| GAC | 1 | 0.491 | 0.67 | 0.654 | 1 | 1 | 1 |
| GAT | 0.835 | 1 | 1 | 1 | 0.735 | 0.866 | 0.809 |
| GAG | 0.475 | 0.482 | 0.445 | 0.543 | 0.399 | 0.49 | 0.495 |
| GCA | 0.975 | 1 | 1 | 1 | 1 | 0.885 | 0.927 |
| GCC | 0.666 | 0.51 | 0.693 | 0.687 | 0.507 | 0.627 | 0.681 |
| GCT | 1 | 0.668 | 0.916 | 0.917 | 0.78 | 1 | 1 |
| GCG | 0.312 | 0.325 | 0.585 | 0.492 | 0.404 | 0.283 | 0.308 |
| GTA | 0.354 | 0.477 | 0.344 | 0.511 | 0.425 | 0.546 | 0.544 |
| GTC | 0.411 | 0.308 | 0.319 | 0.291 | 0.474 | 0.489 | 0.508 |
| GTT | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| GTG | 0.427 | 0.349 | 0.333 | 0.406 | 0.48 | 0.562 | 0.558 |
| GGA | 1 | 0.762 | 1 | 1 | 1 | 1 | 1 |
| GGC | 0.661 | 0.811 | 0.688 | 0.588 | 0.924 | 0.771 | 0.77 |
| GGT | 0.583 | 1 | 0.788 | 0.643 | 0.753 | 0.644 | 0.559 |
| GGG | 0.493 | 0.538 | 0.679 | 0.365 | 0.54 | 0.632 | 0.644 |
| TAC | 0.753 | 0.568 | 0.698 | 0.755 | 0.85 | 0.954 | 0.956 |
| TAT | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| TCA | 0.925 | 0.958 | 1 | 1 | 0.859 | 0.819 | 0.809 |
| TCC | 0.437 | 0.586 | 0.565 | 0.569 | 0.678 | 0.421 | 0.426 |
| TCT | 0.828 | 0.787 | 0.896 | 0.82 | 1 | 0.814 | 0.8 |
| TCG | 0.268 | 0.354 | 0.354 | 0.321 | 0.31 | 0.189 | 0.19 |
| TTA | 0.921 | 1 | 0.999 | 1 | 0.786 | 1 | 1 |
| TTC | 0.469 | 0.349 | 0.422 | 0.398 | 0.724 | 0.573 | 0.578 |
| TTT | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| TTG | 1 | 0.885 | 1 | 0.853 | 0.72 | 0.863 | 0.866 |
| TGC | 0.586 | 0.43 | 0.457 | 0.546 | 0.702 | 0.77 | 0.777 |
| TGT | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
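In the usual formulation, the CAI of a sequence is the geometric mean of per-codon relative adaptiveness weights. Assuming the per-codon values tabulated above can be treated as such weights (the source does not state this explicitly), a gene-level CAI could be sketched in Python as follows. The weights dictionary holds only a handful of entries copied from the 2019-nCoV column, and the input sequences are made up.

```python
import math

# A few per-codon values copied from the 2019-nCoV column of the table above.
# Treating them as relative adaptiveness weights is an assumption.
W_2019_NCOV = {
    "AAA": 1.0, "AAG": 0.292,
    "GAA": 1.0, "GAG": 0.495,
    "CTT": 0.869, "CTG": 0.361,
}


def cai(cds, weights):
    """Geometric mean of the weights of the codons in `cds`.

    Codons without a positive weight (for example ATG, TGG, stop codons,
    or anything missing from `weights`) are skipped.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if weights.get(c, 0) > 0]
    if not logs:
        raise ValueError("no codon with a known weight")
    return math.exp(sum(logs) / len(logs))


# Made-up two-codon sequences: one optimal, one mixing AAA with the weaker GAG.
best = cai("AAAGAA", W_2019_NCOV)
mixed = cai("AAAGAG", W_2019_NCOV)
```

A sequence built only from weight-1 codons scores 1, and swapping GAA for GAG pulls the score down to the square root of 0.495. Working in log space keeps the product of many small weights numerically stable for long sequences.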
Fig. 4. Phylogenetic tree of human host genome sequences of the representative seven coronaviruses.