Betacoronaviruses genome analysis reveals evolution toward specific codons usage: Implications for SARS-CoV-2 mitigation strategies.

Elisson N Lopes1, Vagner Fonseca2,3, Diego Frias4, Stephane Tosta1, Álvaro Salgado1, Ricardo Assunção Vialle5, Toscano S Paulo Eduardo6, Fernanda K Barreto7, Vasco Ariston de Azevedo1, Michele Guarino8, Silvia Angeletti9, Massimo Ciccozzi10, Luiz C Junior Alcantara1,11, Marta Giovanetti1,11.   

Abstract

Since the start of the coronavirus disease 2019 (COVID-19) pandemic, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has spread rapidly worldwide, becoming one of the major global public health issues of the last centuries. COVID-19 vaccine rollouts are now under way, carrying the hope of herd immunity once a sufficient proportion of the population has been vaccinated or infected. However, the emergence of SARS-CoV-2 variants has raised concerns because, as the virus is exposed to environmental selection pressures, it can mutate and evolve, generating variants that may possess enhanced virulence. Codon usage analysis is a strategy to elucidate the evolutionary pressure exerted on the viral genome by different hosts, a possible cause of the emergence of new variants. Therefore, to get a better picture of SARS-CoV-2 codon bias, we first identified the relative codon usage rate of all Betacoronavirus lineages. Subsequently, we correlated putative cognate transfer ribonucleic acids (tRNAs) to reveal how those viruses adapt to hosts in relation to their preferred codon usage. Our analysis revealed seven preferred codons, located in three different open reading frames, that appear to be preferentially used by SARS-CoV-2. In addition, the tRNA adaptation analysis indicates a broad strategy of competition between the virus and its principal mammalian hosts, highlighting the importance of reinforcing genomic monitoring to promptly identify any potential adaptation of the virus to new hosts, which appears crucial to prevent and mitigate the pandemic.
© 2021 The Authors. Journal of Medical Virology Published by Wiley Periodicals LLC.

Keywords:  COVID-19; SARS-CoV-2; codon deoptimization; codon usage; coronaviruses

Year:  2021        PMID: 33934387      PMCID: PMC8242727          DOI: 10.1002/jmv.27056

Source DB:  PubMed          Journal:  J Med Virol        ISSN: 0146-6615            Impact factor:   20.693


INTRODUCTION

Members of the Coronaviridae family are enveloped, single‐stranded, positive‐sense RNA viruses. Their genome encodes four structural proteins, envelope (E), nucleocapsid (N), membrane (M), and spike (S), together with nonstructural or accessory proteins that may differ according to the species. The arrangement of spike proteins across the viral envelope is responsible for the crown shape that gives the family its name. The coronaviruses (CoVs) are organized into four genera: Alphacoronavirus and Betacoronavirus, which have bats and rodents as natural hosts, and Deltacoronavirus and Gammacoronavirus, which are more frequently found in avian species. After the emergence of severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS) [1, 3, 4], severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2), the etiological agent of coronavirus disease 2019 (COVID‐19), represents the third major coronavirus outbreak in the last 20 years. COVID‐19 may cause symptoms such as fever, cough, and fatigue, as well as severe complications leading to death. According to the World Health Organization (WHO), as of April 2021 more than 137 million people had been infected and more than 2.9 million had died worldwide. SARS‐CoV‐2 has a natural host, bats, and a secondary one, probably a mammal, which was key to the mutations needed for the species jump to humans. Viruses with multiple host species, such as the coronaviruses, evolve to thrive in different host environments and with the resources available there. Therefore, the virus may be better suited when its codons match its hosts' codon usage. This selective pressure may drive the emergence of SARS‐CoV‐2 mutations adapted to its hosts; one example is the variant identified in a mink farm (MT396266).
Since then, thanks to advances in whole‐genome sequencing technologies, an unprecedented number of genomes have been generated, providing invaluable insights into the ongoing evolution and epidemiology of the virus and allowing the identification of hundreds of genetic variants circulating during the pandemic. Currently, three variants (B.1.1.7 or VOC202012/01, B.1.351 or 20H/501Y.V2, and P.1) carrying several mutations in the receptor‐binding domain (RBD) of the spike (S) protein raise concerns about their potential to shift the dynamics and public health impact of the pandemic [6, 7, 8, 9]. Those variants of concern (VOCs) appear to share a common aspect: viral adaptation to the human host, resulting in variable effects on COVID‐19 and complicating attempts to control the pandemic [10, 11]. In addition, effective adaptation of CoVs to a new host requires not only mutations affecting receptor binding but also a complete set of positive gene mutations that improve the reproduction and transmission of the virus in the new host. In this respect, using codon usage bias (CUB), the unequal frequency of use of synonymous codons, we shed light on how SARS‐CoV‐2 acquired its adaptation to the human host and provide insight into how other mammalian hosts might be ideal environments to promote viral infection.

MATERIALS AND METHODS

Data collection

We collected all fourteen (14) reference Betacoronavirus sequences from the National Center for Biotechnology Information (NCBI) GenBank. An in‐house R script was used to split the sequences into open reading frames (ORFs) (Table S1) and to check for different reading frames. We then built a data set representing all ORFs of the Betacoronaviruses. After that, we collected the codon frequencies of eight mammalian hosts from the Codon Usage Database (http://www.kazusa.or.jp/codon/) and the transfer ribonucleic acid (tRNA) counts from the tRNA database (http://gtrnadb.ucsc.edu/). The hosts were Homo sapiens, Bos taurus, Canis familiaris, Equus caballus, Felis catus, Mus musculus, Mustela putorius furo, and Rattus sp. We then calculated the RSCU of the hosts using the relative synonymous codon usage (RSCU) formula.

RSCU

RSCU is a measure of nonuniform usage of synonymous codons in a sequence, and it has been found to have causes and implications in RNA viruses. A higher RSCU value indicates a stronger bias toward a codon at the expense of its synonymous codons. To calculate RSCU, the observed count of a codon is divided by the count expected if all synonymous codons for that amino acid were used equally. Hence, the maximum possible RSCU value for a codon equals the number of synonymous codons for its amino acid. To compare the codon bias of all Betacoronaviruses and their hosts, we normalized the RSCU values to the range 0 to 1, where 1 is the highest and 0 the lowest RSCU value. In addition, to identify codons that are more represented in viruses and less in humans (target codons), we compared host and virus RSCU values, selecting codons with (i) an RSCU value close to 1 for the coronavirus and (ii) a lower than expected value for the human host.
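The RSCU calculation and the 0 to 1 normalization described above can be sketched in Python as follows. This is only an illustration, not the authors' in‐house script: the function names `rscu` and `min_max` are hypothetical, the standard genetic code is assumed, and the normalization is assumed to be a min‐max rescaling over all codons of one sequence, which the text does not specify.

```python
from collections import Counter

# Standard genetic code in the conventional TCAG ordering ("*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(orf: str) -> dict[str, float]:
    """Return the RSCU of every sense codon whose amino acid occurs in the ORF."""
    seq = orf.upper().replace("U", "T")
    # Read the sequence in frame, ignoring any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    counts = Counter(c for c in codons if CODON_TABLE.get(c, "*") != "*")

    # Group synonymous codons by the amino acid they encode.
    families: dict[str, list[str]] = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    values: dict[str, float] = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        expected = total / len(family)  # count under uniform synonymous usage
        for c in family:
            values[c] = counts[c] / expected
    return values


def min_max(values: dict[str, float]) -> dict[str, float]:
    """Rescale values to [0, 1] (assumed normalization, see lead-in)."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        return {k: 0.0 for k in values}
    return {k: (v - lo) / (hi - lo) for k, v in values.items()}
```

For a toy sequence such as `TTTTTTTTCGAAGAAGAAGAG`, GAA is observed three times out of four glutamate codons, so its RSCU is 3 / 2 = 1.5, while GAG scores 0.5.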

Euclidean distance

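We used the Euclidean distance to identify putative relationships between the virus sequences and hosts. The Euclidean distance matrix was built from the RSCU values of each host and virus, calculated as described in the previous section. The standard form of the Euclidean distance between a virus profile v and a host profile h is:

$$d(v, h) = \sqrt{\sum_{i=1}^{n} \left(\mathrm{RSCU}_{v,i} - \mathrm{RSCU}_{h,i}\right)^{2}}$$

where i indexes the n codons compared.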
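A minimal Python sketch of a Euclidean distance matrix over RSCU profiles is given below. The names `euclidean` and `distance_matrix` are hypothetical, and comparing only the codons present in both profiles is an assumption rather than something stated in the text.

```python
import math


def euclidean(a: dict[str, float], b: dict[str, float]) -> float:
    """Euclidean distance between two RSCU profiles over their shared codons."""
    shared = sorted(set(a) & set(b))  # assumption: compare shared codons only
    return math.sqrt(sum((a[c] - b[c]) ** 2 for c in shared))


def distance_matrix(profiles: dict[str, dict[str, float]]) -> dict[str, dict[str, float]]:
    """Pairwise distances between all named profiles (viruses and hosts)."""
    names = list(profiles)
    return {p: {q: euclidean(profiles[p], profiles[q]) for q in names} for p in names}
```

The resulting matrix is symmetric with zeros on the diagonal, so smaller off‐diagonal values indicate more similar codon usage.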

Translational adaptation estimation method

The availability of tRNA was inferred from the hosts' genomes by counting the number of genes that encode each type of tRNA, taking into account the mechanism of tRNA sharing between synonymous codons ending with pyrimidines. We compared the hosts' tRNA distribution with the Betacoronavirus RSCU values and then calculated a ratio for each host and virus: each codon's RSCU value is divided by the tRNA frequency, yielding a group of codons with greater availability in each host's tRNA pool. After that, we calculated the relative distance between each host and virus based on the frequencies of codons and tRNAs. Finally, a translational adaptation index (TAI), varying between 0% and 100%, was calculated.
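The ratio step, dividing each codon's RSCU value by the tRNA frequency, might be sketched as follows. Everything beyond that division is an assumption: the names `trna_frequencies` and `rscu_to_trna_ratio` are hypothetical, tRNA gene counts are assumed to be already assigned to codons (including any sharing between pyrimidine‐ending codons), frequencies are taken as each count over the total, and codons without a decoding tRNA gene are skipped. The relative distance and the TAI itself are not sketched because their formulas are not given in the text.

```python
def trna_frequencies(gene_counts: dict[str, int]) -> dict[str, float]:
    """Convert per-codon tRNA gene counts into relative frequencies."""
    total = sum(gene_counts.values())
    if total == 0:
        raise ValueError("no tRNA genes counted")
    return {codon: n / total for codon, n in gene_counts.items()}


def rscu_to_trna_ratio(
    rscu_values: dict[str, float], gene_counts: dict[str, int]
) -> dict[str, float]:
    """Divide each codon's RSCU by the host tRNA frequency for that codon."""
    freqs = trna_frequencies(gene_counts)
    ratios: dict[str, float] = {}
    for codon, value in rscu_values.items():
        freq = freqs.get(codon, 0.0)
        if freq > 0:  # assumption: skip codons with no matching tRNA gene
            ratios[codon] = value / freq
    return ratios
```

With toy values, an RSCU of 1.5 for GAA and a tRNA frequency of 0.75 give a ratio of 2.0.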

RESULTS

Betacoronavirus codon usage

To investigate Betacoronavirus codon usage biases, we calculated RSCU values for each of the 14 genomes, considering each ORF individually. All viral genomes showed a similar pattern, with codons ending in A or T having higher RSCU values and codons ending in G or C having lower values, as previously reported for RNA viruses. Additionally, the top five codons with the highest average RSCU values represented ≥50% of the amino acids used in translation, and this feature was shared across almost all Betacoronaviruses, suggesting possible coevolution and persistence of specific codon groups. This could indicate a clade bias, probably connected to a successful survival strategy (Table S2). In addition, we created a matrix of RSCU values and calculated the Euclidean distance across all Betacoronaviruses; we noticed a correlation between SARS‐CoV‐2 and SARS coronavirus (Table S3), which was expected given their similarity, as already described. After the Betacoronavirus codon analysis, we focused on SARS‐CoV‐2 and found that the codons with the highest RSCU values (optimal codons) [TGT (Cys), GAA (Glu), TTT (Phe), CAA (Gln), and AAT (Asn)] were all AT‐rich, whereas the five most used codons in humans [CAG (Gln), CAC (His), GAG (Glu), AAG (Lys), and TAC (Tyr)] were, in the majority of cases, GC‐rich (for more details see Table S2). We then compared SARS‐CoV‐2 RSCU values with those of the mammalian hosts (Figure 1), searching for target codons that appear to be more important for the virus and less so for humans, to elucidate the adaptive mechanism. Figure 1 shows SARS‐CoV‐2 RSCU values close to 1 in red and human RSCU values close to 0 in blue. We found seven codons that appear to be used preferentially by the virus compared with the human host, located in three distinct ORFs: CCG, ACG, CTC located in ORF10; GGC, TAC located in E; and GAC in M.
Figure 1

Graphical representation of the synonymous codon usage pattern of each amino acid among SARS‐CoV‐2 and mammalian hosts. Representation of the open reading frames of the SARS‐CoV‐2 genome; heatmap of observed RSCU values, with more used codons in red and less used codons in blue. Rows are SARS‐CoV‐2 ORFs and hosts; columns are codons organized by amino acid. The values are normalized between 0 and 1 for comparison purposes. RSCU, relative synonymous codon usage; SARS‐CoV‐2, severe acute respiratory syndrome coronavirus 2


Codon identification from a relative frequency point of view

We searched the hosts' tRNA pools for tRNAs corresponding to the Betacoronavirus codons with higher RSCU values; our goal was to find codons that appear to be more crucial to virus translation than to the hosts. Correlating the RSCU results for humans and SARS‐CoV‐2, we found four of the target codons to be tRNA abundant and three tRNA scarce (Table S4). After that, we used the TAI to measure the adaptation scenario, which may explain how this newly emergent virus was able to adapt to the hosts in relation to their preferred codon usage (Table 1). These data compare the tRNA distribution and codon frequencies (for all virus TAI values see Table S5).
Table 1

Translational metrics for all hosts to SARS‐CoV‐2

Hosts                    SARS‐CoV‐2 TAI, %
Equus caballus           80.24
Bos taurus               77.00
Canis familiaris         76.83
Felis catus              74.71
Mus musculus             74.71
Homo sapiens*            74.40
Mustela putorius furo    73.87
Rattus sp.               73.00

Note: These data present the Euclidean distance between the observed values and the ideal values for the virus and each host.

Abbreviations: SARS‐COV‐2, severe acute respiratory syndrome coronavirus 2; TAI, translational adaptation index.

Our results point to a high SARS‐CoV‐2 adaptation, with TAI values over 70% for all mammalian hosts tested, suggesting that other mammalian hosts might also be ideal environments to promote viral infection and highlighting the importance of strengthening active monitoring to further elucidate adaptation of the virus to new potential hosts.

DISCUSSION

The global death toll from COVID‐19 topped 2.9 million as of April 2021, crossing that threshold amid a vaccine rollout so immense yet so uneven that in some countries there is real hope of vanquishing the outbreak, while in other, less‐developed parts of the world it seems a far‐off dream. In this view, identifying viral adaptation, as well as the unbridled spread of the virus in a new host that leads to the accumulation of mutations, remains a challenge in preventing the emergence of new SARS‐CoV‐2 variants of international concern. In this context, we analyzed the codon usage of diverse endemic and epidemic CoV sequences to investigate the selective pressure that may drive the emergence of SARS‐CoV‐2 mutations adapted to its hosts. Codon usage offers a way of comparing sequences that reflects the evolutionary pressure experienced by the virus, genetic drift, and natural selection for translational optimization [12, 15]. We analyzed the codon usage patterns of Betacoronaviruses and their hosts, focusing on humans and SARS‐CoV‐2. We found a group of codons and classified them as targets, representing SARS‐CoV‐2 codons used more than expected compared with the human host. Our work brought to light seven codons and three ORFs as preferred in SARS‐CoV‐2 selection. These codons present a nucleotide preference, A‐ and T‐ending codons, which is in line with previous findings. Using translational adaptation models, we further inferred the capability of SARS‐CoV‐2 to survive in mammalian hosts, highlighting the importance of reinforcing genomic monitoring to promptly identify any potential adaptation of the virus to new hosts, which appears crucial to prevent and mitigate the pandemic.

CONFLICT OF INTERESTS

The authors declare that there are no conflicts of interest.

AUTHOR CONTRIBUTIONS

Conception and design: Elisson Nogueira Lopes, Vagner Fonseca, Diego Frias, and MG. Performed the experiments: Elisson Nogueira Lopes, Vagner Fonseca, Diego Frias, Stephane Tosta, Álvaro Salgado, Ricardo Assunção Vialle, and Paulo Eduardo Toscano Soares. Data analysis: Elisson Nogueira Lopes, Vagner Fonseca, Diego Frias, Stephane Tosta, Álvaro Salgado, Ricardo Assunção Vialle, Paulo Eduardo Toscano Soares, and MG. Writing and revision: Elisson Nogueira Lopes, Vagner Fonseca, Diego Frias, Fernanda Khouri Barreto, Vasco Ariston de Azevedo, Michele Guarino, Silvia Angeletti, Massimo Ciccozzi, Luiz Carlos Junior Alcantara, and Marta Giovanetti.
  12 in total

Review 1.  Middle East respiratory syndrome coronavirus: another zoonotic betacoronavirus causing SARS-like disease.

Authors:  Jasper F W Chan; Susanna K P Lau; Kelvin K W To; Vincent C C Cheng; Patrick C Y Woo; Kwok-Yung Yuen
Journal:  Clin Microbiol Rev       Date:  2015-04       Impact factor: 26.132

2.  Sixteen novel lineages of SARS-CoV-2 in South Africa.

Authors:  Houriiyah Tegally; Eduan Wilkinson; Richard J Lessells; Jennifer Giandhari; Sureshnee Pillay; Nokukhanya Msomi; Koleka Mlisana; Jinal N Bhiman; Anne von Gottberg; Sibongile Walaza; Vagner Fonseca; Mushal Allam; Arshad Ismail; Allison J Glass; Susan Engelbrecht; Gert Van Zyl; Wolfgang Preiser; Carolyn Williamson; Francesco Petruccione; Alex Sigal; Inbal Gazy; Diana Hardie; Nei-Yuan Hsiao; Darren Martin; Denis York; Dominique Goedhals; Emmanuel James San; Marta Giovanetti; José Lourenço; Luiz Carlos Junior Alcantara; Tulio de Oliveira
Journal:  Nat Med       Date:  2021-02-02       Impact factor: 53.440

3.  Genomics and epidemiology of the P.1 SARS-CoV-2 lineage in Manaus, Brazil.

Authors:  Nuno R Faria; Thomas A Mellan; Charles Whittaker; Ingra M Claro; Darlan da S Candido; Swapnil Mishra; Oliver G Pybus; Seth Flaxman; Samir Bhatt; Ester C Sabino; Myuki A E Crispim; Flavia C S Sales; Iwona Hawryluk; John T McCrone; Ruben J G Hulswit; Lucas A M Franco; Mariana S Ramundo; Jaqueline G de Jesus; Pamela S Andrade; Thais M Coletti; Giulia M Ferreira; Camila A M Silva; Erika R Manuli; Rafael H M Pereira; Pedro S Peixoto; Moritz U G Kraemer; Nelson Gaburo; Cecilia da C Camilo; Henrique Hoeltgebaum; William M Souza; Esmenia C Rocha; Leandro M de Souza; Mariana C de Pinho; Leonardo J T Araujo; Frederico S V Malta; Aline B de Lima; Joice do P Silva; Danielle A G Zauli; Alessandro C de S Ferreira; Ricardo P Schnekenberg; Daniel J Laydon; Patrick G T Walker; Hannah M Schlüter; Ana L P Dos Santos; Maria S Vidal; Valentina S Del Caro; Rosinaldo M F Filho; Helem M Dos Santos; Renato S Aguiar; José L Proença-Modena; Bruce Nelson; James A Hay; Mélodie Monod; Xenia Miscouridou; Helen Coupland; Raphael Sonabend; Michaela Vollmer; Axel Gandy; Carlos A Prete; Vitor H Nascimento; Marc A Suchard; Thomas A Bowden; Sergei L K Pond; Chieh-Hsi Wu; Oliver Ratmann; Neil M Ferguson; Christopher Dye; Nick J Loman; Philippe Lemey; Andrew Rambaut; Nelson A Fraiji; Maria do P S S Carvalho
Journal:  Science       Date:  2021-04-14       Impact factor: 47.728

4.  SARS-CoV-2 B.1.1.7 and B.1.351 Spike variants bind human ACE2 with increased affinity.

Authors:  Muthukumar Ramanathan; Ian D Ferguson; Weili Miao; Paul A Khavari
Journal:  bioRxiv       Date:  2021-02-22

5.  Genomic characterization of the 2019 novel human-pathogenic coronavirus isolated from a patient with atypical pneumonia after visiting Wuhan.

Authors:  Jasper Fuk-Woo Chan; Kin-Hang Kok; Zheng Zhu; Hin Chu; Kelvin Kai-Wang To; Shuofeng Yuan; Kwok-Yung Yuen
Journal:  Emerg Microbes Infect       Date:  2020-01-28       Impact factor: 7.163

6.  Evolution and epidemic spread of SARS-CoV-2 in Brazil.

Authors:  Darlan S Candido; Ingra M Claro; Jaqueline G de Jesus; William M Souza; Filipe R R Moreira; Simon Dellicour; Thomas A Mellan; Louis du Plessis; Rafael H M Pereira; Flavia C S Sales; Erika R Manuli; Julien Thézé; Luiz Almeida; Mariane T Menezes; Carolina M Voloch; Marcilio J Fumagalli; Thaís M Coletti; Camila A M da Silva; Mariana S Ramundo; Mariene R Amorim; Henrique H Hoeltgebaum; Swapnil Mishra; Mandev S Gill; Luiz M Carvalho; Lewis F Buss; Carlos A Prete; Jordan Ashworth; Helder I Nakaya; Pedro S Peixoto; Oliver J Brady; Samuel M Nicholls; Amilcar Tanuri; Átila D Rossi; Carlos K V Braga; Alexandra L Gerber; Ana Paula de C Guimarães; Nelson Gaburo; Cecila Salete Alencar; Alessandro C S Ferreira; Cristiano X Lima; José Eduardo Levi; Celso Granato; Giulia M Ferreira; Ronaldo S Francisco; Fabiana Granja; Marcia T Garcia; Maria Luiza Moretti; Mauricio W Perroud; Terezinha M P P Castiñeiras; Carolina S Lazari; Sarah C Hill; Andreza Aruska de Souza Santos; Camila L Simeoni; Julia Forato; Andrei C Sposito; Angelica Z Schreiber; Magnun N N Santos; Camila Zolini de Sá; Renan P Souza; Luciana C Resende-Moreira; Mauro M Teixeira; Josy Hubner; Patricia A F Leme; Rennan G Moreira; Maurício L Nogueira; Neil M Ferguson; Silvia F Costa; José Luiz Proenca-Modena; Ana Tereza R Vasconcelos; Samir Bhatt; Philippe Lemey; Chieh-Hsi Wu; Andrew Rambaut; Nick J Loman; Renato S Aguiar; Oliver G Pybus; Ester C Sabino; Nuno Rodrigues Faria
Journal:  Science       Date:  2020-07-23       Impact factor: 47.728

7.  SARS-CoV-2 infection in farmed minks, the Netherlands, April and May 2020.

Authors:  Nadia Oreshkova; Robert Jan Molenaar; Sandra Vreman; Frank Harders; Bas B Oude Munnink; Renate W Hakze-van der Honing; Nora Gerhards; Paulien Tolsma; Ruth Bouwstra; Reina S Sikkema; Mirriam Gj Tacken; Myrna Mt de Rooij; Eefke Weesendorp; Marc Y Engelsma; Christianne Jm Bruschke; Lidwien Am Smit; Marion Koopmans; Wim Hm van der Poel; Arjan Stegeman
Journal:  Euro Surveill       Date:  2020-06

8.  Genomic and evolutionary comparison between SARS-CoV-2 and other human coronaviruses.

Authors:  Zigui Chen; Siaw S Boon; Maggie H Wang; Renee W Y Chan; Paul K S Chan
Journal:  J Virol Methods       Date:  2020-12-05       Impact factor: 2.014

9.  Betacoronaviruses genome analysis reveals evolution toward specific codons usage: Implications for SARS-CoV-2 mitigation strategies.

Authors:  Elisson N Lopes; Vagner Fonseca; Diego Frias; Stephane Tosta; Álvaro Salgado; Ricardo Assunção Vialle; Toscano S Paulo Eduardo; Fernanda K Barreto; Vasco Ariston de Azevedo; Michele Guarino; Silvia Angeletti; Massimo Ciccozzi; Luiz C Junior Alcantara; Marta Giovanetti
Journal:  J Med Virol       Date:  2021-05-24       Impact factor: 20.693

10.  Genomic characterization of a novel SARS-CoV-2.

Authors:  Rozhgar A Khailany; Muhamad Safdar; Mehmet Ozaslan
Journal:  Gene Rep       Date:  2020-04-16