
Genome composition and genetic characterization of SARS-CoV-2.

Ayman M Al-Qaaneh1,2, Thamer Alshammari1, Razan Aldahhan1, Hanan Aldossary1,3, Zahra Abduljaleel Alkhalifah1, J Francis Borgio1,3.   

Abstract

SARS-CoV-2 is a betacoronavirus responsible for the COVID-19 pandemic, with more than 1.745 million fatalities globally as of December 2020. Genetically, it has what is considered the second-largest genome of all RNA viruses, with a 5′ cap and a 3′ poly-A tail. Phylogenetic analyses of coronaviruses reveal that SARS-CoV-2 is closely related to the bat SARS-like coronavirus (Bat-SL-CoV), with 96% whole-genome identity. The SARS-CoV-2 genome consists of 15 ORFs encoding 29 proteins. At the 5′ terminal of the genome are ORF1ab and ORF1a, which encode the 1ab and 1a polypeptides that are proteolytically cleaved into 16 nonstructural proteins (NSPs). The 3′ terminal of the genome encodes four structural proteins (spike, envelope, matrix, and nucleocapsid) and nine accessory proteins (3a, 3b, 6, 7a, 7b, 8b, 9a, 9b, and orf10). As the number of COVID-19 patients increases dramatically worldwide, there is an urgent need for a quick and sensitive diagnostic tool to control the outbreak of SARS-CoV-2 in the community. Today, molecular testing methods that use viral genetic material (e.g., PCR) are the crucial diagnostic tool for SARS-CoV-2, despite their low sensitivity in the early stage of viral infection. This review summarizes the genome composition and genetic characterization of SARS-CoV-2.
© 2021 The Author(s).

Keywords:  COVID-19; Coronaviruses; Diagnosis; Genome; SARS-CoV-2

Year:  2021        PMID: 33519278      PMCID: PMC7834485          DOI: 10.1016/j.sjbs.2020.12.053

Source DB:  PubMed          Journal:  Saudi J Biol Sci        ISSN: 2213-7106            Impact factor:   4.219


Introduction

Coronaviruses (CoVs) have drawn considerable attention as human pathogens and as a global health threat. They can attack many systems in humans and other vertebrates, including the respiratory, digestive, and nervous systems and the liver (Chen et al., 2020). They belong to the Coronavirinae subfamily of the Coronaviridae family, in the order Nidovirales (Chan et al., 2020). Genetically, they are categorized into four genera: Alpha-, Beta-, Gamma-, and Deltacoronavirus (Wu et al., 2020). Bats and rodents are considered the primary sources of alpha- and betacoronaviruses, whereas avian species are the primary sources of gamma- and deltacoronaviruses (Z. W. Ye et al., 2020). Among the four betacoronaviruses previously known to infect humans, severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV) are considered the most serious pathogens, able to cause severe lower respiratory tract infection. They were responsible for the 2003 severe acute respiratory syndrome (SARS) epidemic and the 2012 Middle East respiratory syndrome (MERS) epidemic, respectively (Ahmed et al., 2020, Cui et al., 2019, Shereen et al., 2020). Recently, a new member joined the betacoronaviruses: the 2019 novel coronavirus (2019-nCoV), which was named SARS-CoV-2 by the World Health Organization (WHO) on February 12, 2020 and has been responsible for a pneumonia-like disease since December 2019 (Guo et al., 2020, Gurwitz, 2020). It is well documented that these three betacoronaviruses are capable of animal-to-human and human-to-human transmission (J. Y. Li et al., 2020). As of December 25, 2020, around 79 million patients had been diagnosed with SARS-CoV-2 infection and 1.745 million had died worldwide (“COVID-19 Map - Johns Hopkins Coronavirus Resource Center,” n.d.). Reported case numbers vary geographically, a pattern that has been related to differences in local host responses. If, however, the biological consequences in hosts differ because of alterations in the SARS-CoV-2 genome, this could seriously complicate efforts to combat the pandemic (Garvin et al., 2020). As with other betacoronaviruses, bats are the primary host of SARS-CoV-2, which then passes to a secondary amplification host (e.g., palm civets and raccoon dogs) before it is able to infect humans (C. Li et al., 2020, Shereen et al., 2020). This review discusses the genome composition and genetic characteristics of SARS-CoV-2, the mechanism of SARS-CoV-2 genome replication and translation, the functions of the proteins expressed from its genes, and the use of molecular genetic techniques in diagnosing SARS-CoV-2.

Different types of Coronaviruses, including human coronaviruses

Since the 1960s, when human coronaviruses were first recognized, seven types of human coronaviruses have been identified (Seah and Agrawal, 2020). Coronaviruses are classified by protein sequence into Alphacoronavirus, Betacoronavirus, Gammacoronavirus, and Deltacoronavirus. Alpha- and betacoronaviruses infect mammals, whereas gamma- and deltacoronaviruses infect birds (Wu et al., 2020). In general, RNA viruses encode the proteins required to control the host cell, for instance those needed to assemble new viral particles. RNA viruses are also well known for their high mutation rate, which helps them adapt rapidly to host cells (Manfredonia et al., 2020). Human coronaviruses share core genomic features: (I) they have large and highly conserved RNA genomes; (II) the genome is organized as 5′-untranslated region (UTR)-replicase-S-M-N-UTR-3′, with the genomic RNA serving as the message for the replicase gene; (III) all family members have two envelope (E) protein species, a helical nucleocapsid (N) protein, and spike (S) and membrane (M) proteins. They also all express various nonstructural genes (Snijder et al., 2016). In addition, all of these are required to complete an infectious replication cycle, from entry through assembly and packaging to the release of new virus particles from the host cell (Hassan et al., 2020). The two overlapping open reading frames (ORFs), ORF1a and ORF1ab, occupy the upstream two-thirds of the genome (Fig. 1). The genetic characteristic that distinguishes alphacoronaviruses from other coronaviruses is their nonstructural protein NSP1, which differs in sequence and size from betacoronavirus NSP1 (~9 kDa for alphacoronaviruses and ~20 kDa for betacoronaviruses). In contrast, gamma- and deltacoronaviruses lack NSP1 (Jaimes et al., 2020) (“Virus Taxonomy − 1st Edition,” n.d.). NSP1 plays a crucial role in inhibiting host gene expression, and its location is essential for virus virulence (Shen et al., 2019). [Table 1] shows the different genomic structures of the human coronavirus genera (Fehr and Perlman, 2015, Jansson, 2013, Payne, 2017, Snijder et al., 2016).
Fig. 1

Sequence-region of SARS-CoV-2. GenBank accession: NC_045512.2. Resources: The 2019 Novel Coronavirus Resource (Zhou et al., 2020) and National Center for Biotechnology Information.

Table 1

Different genomic structures for human coronavirus genera.

Genomic structure / Genus | Alphacoronavirus | Betacoronavirus | Gammacoronavirus | Deltacoronavirus
Spike glycoproteins | Present | Present | Present | Present
ORFs | 7–10 additional ORFs | 7–10 additional ORFs | 7–10 additional ORFs | 7–10 additional ORFs
NSP1 | ~110 amino acids | ~180–250 amino acids | Absent | Absent
Papain-like protease (PLpro) | Two PLpro domains | Two PLpro domains | Single PLpro domain | Single PLpro domain
Hemagglutinin esterase (HE) | Absent | Present | Absent | Absent
Envelope proteins | Present | Present | Present | Present
Nucleocapsid protein | Present | Present | Present | Present
Membrane glycoproteins | Present | Present | Present | Present
Scientists have identified seven coronavirus strains able to infect humans (HCoV-229E, HCoV-HKU1, HCoV-NL63, HCoV-OC43, SARS-CoV, MERS-CoV, and SARS-CoV-2) (Arabi et al., 2020, Gaunt et al., 2010, Ye et al., 2020a). HCoV-229E and HCoV-NL63 belong to the alphacoronaviruses and use different host proteins as receptors: HCoV-229E binds host aminopeptidase N (hAPN), whereas HCoV-NL63 binds angiotensin-converting enzyme 2 (ACE2) in addition to hAPN. The five betacoronaviruses also bind different host receptors: HCoV-OC43 and HCoV-HKU1 use 9-O-acetylsialic acids, SARS-CoV and SARS-CoV-2 bind ACE2, and MERS-CoV uses dipeptidyl peptidase-4 (DPP4) (Li et al., 2019, Ye et al., 2020b). Clinically, HCoV-229E, HCoV-NL63, HCoV-OC43, and HCoV-HKU1 cause common cold-like symptoms (e.g., sneezing, fever, dry cough, sore throat, dyspnea, myalgia, and diarrhea), which may progress to pneumonia in cardiopulmonary and immunocompromised patients. However, SARS-CoV, MERS-CoV, and SARS-CoV-2 are highly pathogenic and can cause respiratory system failure in immunocompromised or cardiopulmonary patients (Seah and Agrawal, 2020). [Table 2] presents the classification of the human coronaviruses with their animal reservoirs, intermediate hosts, cellular receptors, and main signs and symptoms.
Table 2

Classification of the different types of human coronaviruses.

Genus | Coronavirus type | Animal reservoir | Intermediary host | First identified | Symptoms | Cellular receptor | Epidemiology | References
Alphacoronavirus | HCoV-229E | Bats | Bats | 1966 | Common cold, sneezing, fever, dry cough, sore throat, myalgia, diarrhea, and pneumonia | Human aminopeptidase N (APN, CD13) | Endemic | (Pene et al., 2003, Vijgen et al., 2005, Weiss and Leibowitz, 2011)
Alphacoronavirus | HCoV-NL63 | Bats | Palm civets | 2004 | Sneezing, fever, dry cough, sore throat, myalgia, diarrhea, and febrile convulsion | ACE2 | Endemic | (Gaunt et al., 2010, Lim et al., 2016, Weiss and Leibowitz, 2011)
Betacoronavirus | HCoV-OC43 | Mice | Cattle | 1967 | Common cold, sneezing, fever, dry cough, sore throat, myalgia, diarrhea, and pneumonia | 9-O-acetylsialic acids | Endemic | (Li et al., 2019, Lim et al., 2016, Weiss and Leibowitz, 2011)
Betacoronavirus | HCoV-HKU1 | Mice | Mice | 2005 | Common cold, sneezing, fever, dry cough, sore throat, myalgia, febrile convulsion, diarrhea, and pneumonia | 9-O-acetylsialic acids | Endemic | (Gaunt et al., 2010, Li et al., 2019, Lim et al., 2016, Seah and Agrawal, 2020, Weiss and Leibowitz, 2011)
Betacoronavirus | SARS-CoV | Bats | Civet cats | 2003 | Common respiratory symptoms, fever, cough, shortness of breath, breathing difficulties, and pneumonia | ACE2 | Epidemic | (Lee and Treanor, 2016, Lim et al., 2016, Weiss and Leibowitz, 2011)
Betacoronavirus | MERS-CoV | Bats | Dromedary camels | 2012 | Cough, shortness of breath, and, on occasion, pneumonia; gastrointestinal symptoms (diarrhea) | Dipeptidyl peptidase-4 (DPP4) | Epidemic | (Weiss and Leibowitz, 2011, Ye et al., 2020a)
Betacoronavirus | SARS-CoV-2 | Bats | Bats | 2019 | Common respiratory symptoms, fever, cough, shortness of breath and breathing difficulties, myalgia, dyspnea, vomiting, pneumonia, respiratory system failure, and even death | ACE2 | Pandemic | (Z.-W. Ye et al., 2020)

Common genomic features and characterizations of coronaviruses

The coronavirus (CoV) genome is the second largest of all RNA virus genomes. It is a single-stranded, positive-sense RNA of 26 to 32 kb with a 5′ cap and a 3′ poly-A tail (Chan et al., 2020, Wu et al., 2020). In general, the CoV genome contains between 6 and 15 ORFs (Song et al., 2019). ORF1a and ORF1ab are the largest genes in the coronavirus genome and cover almost two-thirds of it. These two genes encode different NSPs, including those of the replication-transcription complex (RTC), which is responsible for synthesizing and transcribing the subgenomic RNA (sgRNA) (Gorbalenya et al., 2000, Hussain et al., 2005, Snijder et al., 2006). The transcription regulatory sequences that mediate transcription are located between the ORFs, and the sgRNA serves as a template for mRNA synthesis (Sawicki et al., 2007). A −1 ribosomal frameshift between ORF1a and ORF1b (a translational event rather than a mutation) results in the synthesis of two polypeptides (pp1a and pp1ab), which are then processed into 16 NSPs by the chymotrypsin-like protease (3CLpro, also called the main protease, Mpro) and papain-like proteases (PLpro) (Masters, 2006). The remaining one-third of the genome encodes at least four structural proteins, namely the spike (S), nucleocapsid (N), envelope (E), and membrane (M) proteins, along with accessory proteins such as 3a/b, 4a/b, and hemagglutinin-esterase (HE) (Hussain et al., 2005). Sequence alignment of CoV genomes shows 43% identity in the structural-protein coding regions and 58% in the nonstructural-protein coding regions, while identity across the entire genome is about 54%. These results suggest that the structural proteins are more diverse than the nonstructural proteins (Chen et al., 2020). Nonstructural protein 14 (NSP14) is a 3′-to-5′ exoribonuclease characteristic of all CoVs. Its function is related to maintaining the integrity of the CoV RNA genome through proofreading by the replication-transcription complex (Eckerle et al., 2010, Smith et al., 2013). The amino acid sequences of both the N and E proteins in SARS-CoV-2 were found to have ~92% sequence identity (Azeez et al., 2020).
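The relationship between the ORF1a and ORF1b reading frames can be checked with modular arithmetic on the coordinates listed for orf1a and orf1ab in Table 3. The following is a minimal sketch, not part of the original review; the function names are illustrative, and it assumes 1-based inclusive coordinates on NC_045512.2.

```python
# Coordinates (1-based, inclusive) as listed for orf1a and orf1ab in Table 3.
ORF1A_START = 266
ORF1A_END = 13483
ORF1B_START = 13468  # first nucleotide read in the shifted frame


def frame_offset(frame_start: int, position: int) -> int:
    """Return the index (0, 1 or 2) of `position` within the codons of a
    reading frame whose first codon begins at `frame_start`."""
    return (position - frame_start) % 3


def describe_shift(orf1a_start: int, orf1b_start: int) -> str:
    """Classify the frame of ORF1b relative to the ORF1a frame."""
    offset = frame_offset(orf1a_start, orf1b_start)
    # Offset 2 means ORF1b starts on the third base of an ORF1a codon, so
    # that base is read twice: the ribosome has slipped back by one (-1).
    labels = {0: "in frame (no shift)", 1: "+1 frameshift", 2: "-1 frameshift"}
    return labels[offset]


if __name__ == "__main__":
    print(describe_shift(ORF1A_START, ORF1B_START))
    # Without the slip, ORF1a keeps its own frame and ends on a codon boundary.
    print(frame_offset(ORF1A_START, ORF1A_END))
```

Under these assumptions, nucleotide 13468 sits at the third position of an ORF1a codon, which is why the same position appears as both the end of the first segment and the start of the second in the orf1ab coding region.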

Genome composition of SARS-CoV-2

The SARS-CoV-2 genome consists of 15 ORFs encoding 29 proteins (Srinivasan et al., 2020). At the 5′ terminal of the genome, ORF1ab and ORF1a encode the 1ab and 1a polypeptides, respectively [Fig. 1]. The 3′ terminal of the genome encodes four structural proteins (spike, envelope, matrix, and nucleocapsid) and nine accessory proteins (3a, 3b, 6, 7a, 7b, 8b, 9a, 9b, and orf10). [Table 3] lists the coding region, base length, amino acid length, and functions of the different ORFs in SARS-CoV-2, and [Table 4] lists the coding region, amino acid length, and functions of the different NSPs in the SARS-CoV-2 genome.
Table 3

Coding region, base length, amino acid length, and functions of different ORFs in SARS-CoV-2.

Category | Gene name | Coding region (nt) | Base length (nt) | Amino acid length (aa) | Function | Reference
 | 5′ UTR | | 265 | | |
Nonstructural protein | orf1a | 266–13483 | 13,217 | 4405 | Encodes nonstructural proteins (NSP1 to NSP11), essential for viral replication, viral assembly, immune response modulation, etc. | (Dinan et al., 2019, Sawicki et al., 2007)
Nonstructural protein | orf1ab | 266–13468, 13468–21555 | 21,289 | 7096 | Encodes nonstructural proteins (NSP12 to NSP16), essential for viral replication | (Dinan et al., 2019, Sawicki et al., 2007)
Structural protein | S (spike protein) | 21563–25384 | 3822 | 1273 | Spike protein; binds the cell receptor and mediates virus-cell fusion | (Pillay and Pillay, 2020, Zhang et al., 2020)
Accessory protein | ORF 3a | 25393–26220 | 828 | 275 | Ion-channel activity, cell cycle arrest, apoptosis | (Schaecher and Pekosz, 2010)
Accessory protein | ORF 3b | 25814–25882 | 68 | 22 | Inhibition of type I IFN production and signaling, cell cycle arrest, apoptosis | (Kopecky-Bromberg et al., 2007, Michel et al., 2020)
Structural protein | E (envelope protein) | 26245–26472 | 228 | 75 | Morphogenesis and assembly of virions | (Duart et al., 2020, Singh Tomar and Arkin, 2020)
Structural protein | M (matrix protein) | 26523–27191 | 669 | 222 | Membrane protein, virus assembly | (Bianchi et al., 2020, Mukherjee et al., 2020)
Accessory protein | ORF 6 | 27202–27387 | 186 | 61 | Inhibition of type I IFN production and signaling | (Kopecky-Bromberg et al., 2007)
Accessory protein | ORF 7a | 27394–27759 | 366 | 121 | Inhibition of host translation, cell cycle arrest, apoptosis | (Schaecher and Pekosz, 2010, Tan et al., 2005)
Accessory protein | ORF 7b | 27756–27887 | 132 | 43 | Still unknown | (Schaecher and Pekosz, 2010, Tan et al., 2005)
Accessory protein | ORF 8b | 27894–28259 | 366 | 121 | Still unknown | (Mohammad et al., 2020)
Structural protein | N (nucleocapsid) | 28274–29533 | 1260 | 419 | Replication, transcription, virion structure, and viral assembly | (Kopecky-Bromberg et al., 2007, Zeng et al., 2020)
Accessory protein | ORF 9a | 28284–28577 | 294 | 97 | Still unknown | (Schaecher and Pekosz, 2010, Yoshimoto, 2020)
Accessory protein | ORF 9b | 28734–28955 | 222 | 73 | Still unknown | (Schaecher and Pekosz, 2010, Yoshimoto, 2020)
Accessory protein | ORF 10 | 29558–29674 | 117 | 38 | Still unknown | (Weiss and Leibowitz, 2011)
 | 3′ UTR | | 299 | | |
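The base and amino acid lengths in Table 3 follow from the coding coordinates by simple arithmetic: a region spanning start to end (inclusive) has end − start + 1 bases, and the protein length is one third of that minus the stop codon. The sketch below is an illustration under those assumptions, not the authors' method; the dictionary holds only a few rows copied from the table, and the helper names are hypothetical.

```python
from typing import Dict, List, Tuple

Segment = Tuple[int, int]  # (start, end) genome coordinates, 1-based inclusive

# A few coding regions copied from Table 3 (NC_045512.2).
ORF_SEGMENTS: Dict[str, List[Segment]] = {
    "orf1ab": [(266, 13468), (13468, 21555)],  # two segments joined at the frameshift
    "S": [(21563, 25384)],
    "E": [(26245, 26472)],
    "M": [(26523, 27191)],
    "N": [(28274, 29533)],
}


def base_length(segments: List[Segment]) -> int:
    """Total number of nucleotides covered by the coding segments."""
    return sum(end - start + 1 for start, end in segments)


def protein_length(segments: List[Segment]) -> int:
    """Amino acid count, assuming the last codon is a stop codon."""
    nt = base_length(segments)
    if nt % 3 != 0:
        raise ValueError("coding length is not a multiple of three")
    return nt // 3 - 1


if __name__ == "__main__":
    for name, segments in ORF_SEGMENTS.items():
        print(name, base_length(segments), protein_length(segments))
```

For the rows shown, this arithmetic reproduces the amino acid lengths in the table. For orf1ab it gives 21,291 bases, two more than the tabulated 21,289, so the coordinates are best treated as the primary data when the two disagree.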
Table 4

Coding region, amino acid length, and functions of different NSPs in SARS-CoV-2 virus genome.

NSP | Coding region (aa) | Amino acid length (aa) | Function | Reference
NSP1 (leader protein) | 1–180 | 180 | mRNA degradation, interferon (IFN) signaling inhibition | (Huang et al., 2011, Tanaka et al., 2012)
NSP2 | 181–818 | 638 | Still unknown | (Gadlage et al., 2008, Graham et al., 2005)
NSP3 | 819–2763 | 1945 | ADRP, polypeptide cleavage, blocking the innate immunity of the host, enhancing expression of cytokines | (Lei et al., 2018, Serrano et al., 2009)
NSP4 | 2764–3263 | 500 | Double-membrane vesicle (DMV) formation | (Beachboard et al., 2015, Gadlage et al., 2010, Lei et al., 2018)
NSP5 (3C-like proteinase) | 3264–3569 | 306 | Mpro, 3CLpro, polypeptide cleavage, and IFN signaling inhibition | (Beachboard et al., 2015, Stobart et al., 2013)
NSP6 | 3570–3859 | 290 | DMV formation and restriction of autophagosome expansion | (Angelini et al., 2013, Cotten et al., 2014)
NSP7 | 3860–3942 | 83 | Cofactor with NSP8 and NSP12 | (Kirchdoerfer and Ward, 2019, Zhai et al., 2005)
NSP8 | 3943–4140 | 198 | Primase, cofactor with NSP7 and NSP12 | (Kirchdoerfer and Ward, 2019, Te Velthuis et al., 2012)
NSP9 | 4141–4253 | 113 | RNA binding, dimerization | (Egloff et al., 2004, Zeng et al., 2018)
NSP10 (growth-factor-like protein) | 4254–4392 | 139 | Scaffold protein for NSP14 and NSP16 | (Bouvet et al., 2014, Fang et al., 2008, Ma et al., 2015)
NSP11 | 4393–4405 | 13 | Still unknown | (Bouvet et al., 2014, Fang et al., 2008)
NSP12 (RNA-dependent RNA polymerase) | 4393–5324 | 932 | Primer-dependent RdRp | (Ahn et al., 2012, Kirchdoerfer and Ward, 2019, te Velthuis et al., 2009)
NSP13 (RNA 5′-triphosphatase) | 5325–5925 | 601 | RNA helicase and 5′ triphosphatase | (Adedeji and Lazarus, 2016, Jia et al., 2019)
NSP14 (3′-to-5′ exonuclease) | 5926–6452 | 527 | 3′-to-5′ exonuclease | (Bouvet et al., 2012, Eckerle et al., 2010, Minskaia et al., 2006)
NSP15 (endoRNAse) | 6453–6798 | 346 | Evasion of dsRNA sensors, NendoU | (Bhardwaj et al., 2006, Zhang et al., 2018)
NSP16 (2′-O-ribose methyltransferase) | 6799–7096 | 298 | OMT, negatively regulates the innate immunity | (Chen et al., 2011, Decroly et al., 2011, Shi et al., 2019)

ADRP: adenosine diphosphate-ribose 1″-phosphatase; 3CLpro: 3C-like cysteine proteinase; RdRp: RNA-dependent RNA polymerase; NendoU: nidoviral endoribonuclease specific for U; OMT: S-adenosylmethionine-dependent ribose 2′-O-methyltransferase.
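The NSP coordinates in Table 4 can be read as two cleavage maps that share NSP1 to NSP10: pp1a ends with NSP11, while pp1ab continues from the same position (residue 4393) with NSP12 to NSP16. The sketch below checks that each chain tiles its polyprotein without gaps or overlaps. It is an illustrative consistency check with hypothetical helper names, not a method described in the review.

```python
from typing import Dict, List, Tuple

# (first residue, last residue) of each NSP, copied from Table 4.
NSP_COORDS: Dict[str, Tuple[int, int]] = {
    "NSP1": (1, 180), "NSP2": (181, 818), "NSP3": (819, 2763),
    "NSP4": (2764, 3263), "NSP5": (3264, 3569), "NSP6": (3570, 3859),
    "NSP7": (3860, 3942), "NSP8": (3943, 4140), "NSP9": (4141, 4253),
    "NSP10": (4254, 4392), "NSP11": (4393, 4405), "NSP12": (4393, 5324),
    "NSP13": (5325, 5925), "NSP14": (5926, 6452), "NSP15": (6453, 6798),
    "NSP16": (6799, 7096),
}

SHARED = [f"NSP{i}" for i in range(1, 11)]
PP1A: List[str] = SHARED + ["NSP11"]
PP1AB: List[str] = SHARED + [f"NSP{i}" for i in range(12, 17)]


def is_contiguous(chain: List[str]) -> bool:
    """True if every NSP starts one residue after the previous one ends."""
    expected_start = 1
    for name in chain:
        start, end = NSP_COORDS[name]
        if start != expected_start:
            return False
        expected_start = end + 1
    return True


def chain_length(chain: List[str]) -> int:
    """Sum of the individual NSP lengths in the chain."""
    return sum(end - start + 1 for start, end in (NSP_COORDS[n] for n in chain))


if __name__ == "__main__":
    print("pp1a:", is_contiguous(PP1A), chain_length(PP1A))
    print("pp1ab:", is_contiguous(PP1AB), chain_length(PP1AB))
```

The two totals match the amino acid lengths given for orf1a (4405) and orf1ab (7096) in Table 3.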

Origin of SARS-CoV-2 based on gene sequencing

Phylogenetic analyses of CoVs [Fig. 2] reveal that SARS-CoV-2 is related to SARS-CoV and very closely related to both bat-SL-CoV ZC45 and bat-SL-CoV ZXC21. As previously reported, SARS-CoV and MERS-CoV originated in bats (Cui et al., 2019), and bats remain the prime suspected origin of SARS-CoV-2 because its whole genome is 96% identical to that of the bat coronavirus RaTG13, isolated from Rhinolophus affinis (Zhou et al., 2020). However, there are important genomic differences between SARS-CoV-2 and RaTG13, including the furin cleavage site near the junction of the spike (S) protein subunits S1 and S2, created by the insertion of the amino acid residues PRRA (Coutard et al., 2020). This insertion, which is absent in other related betacoronaviruses, may be the reason for the increased infectivity of the virus (Zhang and Holmes, 2020). SARS-CoV-2 and RaTG13 are also less similar in the receptor-binding domain (RBD) encoded by the S gene (Wrapp et al., 2020).
Fig. 2

The phylogenetic analysis of different Coronaviruses.

This divergent region within the RBD corresponds to the receptor-binding motif (RBM), which determines receptor-binding specificity. All this variability, together with the fact that SARS-CoV and MERS-CoV generally pass through animal hosts such as civets and camels before infecting humans, suggests the existence of an intermediate host between bats and humans for SARS-CoV-2 (Cui et al., 2019, Wong et al., 2020). This suggestion contrasts with the early indication that SARS-CoV-2 can leap directly from bats to humans (Zhou et al., 2020). Full genome sequencing of coronaviruses isolated from Malayan pangolins (Liu et al., 2019) revealed ~90% identity to RaTG13 and ~91% to SARS-CoV-2 (Zhang and Holmes, 2020). Although the identity between SARS-CoV-2 and Pangolin-CoV is lower than that between SARS-CoV-2 and RaTG13, the S1 protein of Pangolin-CoV is more similar to that of SARS-CoV-2 than the S1 protein of RaTG13 is. Furthermore, the RBM segment is highly conserved between SARS-CoV-2 and Pangolin-CoV, which share all five key residues, whereas RaTG13 shares only one of the five (Zhang and Holmes, 2020, Zhou et al., 2020). Accordingly, the RBM of SARS-CoV-2 might have resulted from a recombination event between Pangolin-CoV and RaTG13, since it is considered unlikely that the same mutations found in the Pangolin-CoV genome were acquired by random chance. The low nucleic acid identity between Pangolin-CoV and SARS-CoV-2, compared with the high conservation of their amino acid sequences, suggests that the recombination events occurred long ago, allowing genetic drift to accumulate over time (Wong et al., 2020). In addition, the SARS-CoV-2 helicase amino acid sequence was found to be similar to those of bat and SARS-like coronaviruses (Borgio et al., 2020). [Table 5] presents the percent similarity between different betacoronaviruses (Jaimes et al., 2020).
Table 5

Genome sequence similarity among beta-coronaviruses.

Betacoronavirus pair | Percent genome sequence identity
SARS-CoV-2 vs. Bat-SL-CoV-ZC45 | 88%
SARS-CoV-2 vs. Bat-SL-CoV-ZXC21 | 88%
SARS-CoV-2 vs. SARS-CoV | 79.6%
SARS-CoV-2 vs. MERS-CoV | 50%
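Identity percentages such as those in Table 5 are computed from aligned sequences. The sketch below shows one common convention, counting matching positions over the aligned columns in which neither sequence has a gap. It is a minimal illustration: the two short sequences are invented toy data, not viral sequences, and other tools may treat gaps differently, so values are only comparable when the same convention is used.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical positions between two pre-aligned sequences.

    Columns where either sequence has a gap are skipped (an assumed
    convention; some tools count gaps as mismatches instead).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment: 9 comparable columns, 8 of them identical.
    seq_a = "ATGGCTA-CG"
    seq_b = "ATGACTAACG"
    print(round(percent_identity(seq_a, seq_b), 1))
```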
Once a virus is inside the body, it encounters strong immune responses that may drive it to develop mutations that evade the immune system. Consequently, these mutations alter the virus's virulence, infectiousness, and transmission (Berngruber et al., 2013, Lucas et al., 2001). A recent study of COVID-19 patients found a considerable level of viral diversity, with a median number of 4, suggesting rapid evolution of SARS-CoV-2 and a high frequency of mutations (Shen et al., 2020). Nucleotide substitution has been proposed as a critical mechanism of viral evolution in nature (Lauring and Andino, 2010). Phan reported three deletions and ninety-three missense mutations across the whole genome of SARS-CoV-2 isolated from infected patients. The deletions affected the ORF1ab polyprotein and the 3′ end of the genome. Missense mutations were observed in the nonstructural and structural proteins, with the exception of the envelope protein. Three of these missense mutations were in the RBD of the spike surface glycoprotein (Phan, 2020). Moreover, Tang et al.'s analysis of 103 SARS-CoV-2 genome sequences indicated that the virus has evolved into two types: type S, which accounts for ~30% of the strains, and type L, which accounts for ~70%. Type L is derived from type S and is suggested to be more aggressive and contagious. However, the frequency of the L type was high at the beginning of the outbreak in Wuhan, China, and decreased later (Tang et al., 2020).

Mutations in SARS-CoV-2 genome

Mutations drive viral evolution and genome variability, facilitating viral escape from host immune systems and promoting antiviral drug resistance. In general, RNA viruses are characterized by a high mutation rate, which can be up to a million times higher than that of their hosts (Pachetti et al., 2020). For SARS-CoV-2, more than 10,000 single nucleotide polymorphisms (SNPs) have been reported globally, with expected mutation rates of 0.0001–0.01 substitutions per nucleotide site per cell infection (Cotten et al., 2014, Peck and Lauring, 2018). These mutations are distributed across different genomic regions of the virus, e.g., the spike protein, nucleocapsid, ORFs 7b, 8, 9b, and 14, and RdRp (Badua et al., 2020). Among the discovered mutations, two located in the spike protein have been reported to confer higher transmissibility: D614G and N501Y (the latter discovered in the UK) (Daniloski et al., 2020, Leung et al., 2020). The newly discovered South African variant ‘501.V2’ is characterized by the N501Y, E484K, and K417N mutations in the S protein. It shares N501Y with the UK variant, whereas the other two mutations are not found in the UK variant; the impact of this variant is still under investigation (Tegally et al., 2020).
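Substitution labels such as D614G follow a compact notation: the reference amino acid, the residue position, and the replacing amino acid. The sketch below parses that notation and compares the spike mutations named above with set operations. It is illustrative only: the variant sets contain just the mutations mentioned in this section, not the full mutation profiles of either variant, and all names in the code are hypothetical.

```python
import re
from typing import NamedTuple, Set


class Substitution(NamedTuple):
    reference: str  # amino acid in the reference sequence
    position: int   # residue number in the protein
    variant: str    # amino acid observed in the variant


_LABEL = re.compile(r"([A-Z])(\d+)([A-Z])")


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'D614G' into its three components."""
    match = _LABEL.fullmatch(label.strip().upper())
    if match is None:
        raise ValueError(f"not a single-residue substitution: {label!r}")
    reference, position, variant = match.groups()
    return Substitution(reference, int(position), variant)


# Only the spike mutations named in the text; real variants carry more.
UK_VARIANT_SPIKE: Set[str] = {"N501Y"}
SOUTH_AFRICA_501V2_SPIKE: Set[str] = {"N501Y", "E484K", "K417N"}

shared = UK_VARIANT_SPIKE & SOUTH_AFRICA_501V2_SPIKE
only_501v2 = SOUTH_AFRICA_501V2_SPIKE - UK_VARIANT_SPIKE

if __name__ == "__main__":
    print(parse_substitution("D614G"))
    print(sorted(shared), sorted(only_501v2))
```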

Mechanism of SARS-CoV-2 genome replication and translation

Because the SARS-CoV-2 genome has a 5′ methylated cap (at the end where the nonstructural proteins are encoded) and a 3′ polyadenylated tail (at the end where the structural proteins are encoded), it can be translated directly after infection without an intermediate transcription stage. The genome also includes multiple ORFs whose transcription is controlled by several transcription-regulating sequences (TRSs). The viral life cycle begins when the virion enters the human respiratory tract and depends on the interaction between the viral and epithelial cell membranes. Infection is initiated at the cell surface when the S1 subunit recognizes angiotensin-converting enzyme 2 (ACE2), or exopeptidases, as the key receptor for entry into the host cell. Entry depends on cellular proteases such as cathepsins, transmembrane protease serine 2 (TMPRSS2), and human airway trypsin-like protease (HAT) (Hoffmann et al., 2020). The S1 subunit consists of two subdomains that can act as receptor-binding domains (RBDs): an N-terminal domain (NTD) that mediates sugar binding and a C-terminal domain (CTD) that mediates receptor recognition (Xia et al., 2020). Mutations in the RBD of the S1 subunit are needed for cross-species transmission of SARS-CoV-2. Once binding occurs, events such as a reduction in pH and/or proteolysis of the S protein trigger conformational changes that promote the separation of S1 from S2. Membrane fusion is then carried out by the S2 region (Han et al., 2020). The S2 subunit has two heptad repeat (HR) domains, HR1 and HR2, which interact to form a six-helix bundle (6-HB) fusion core that brings the viral and cellular membranes into close proximity (Xia et al., 2020). The replicase/transcriptase genes ORF1a/1ab are translated into two polyproteins (pp1a and pp1ab), which are cleaved into 16 individual nonstructural proteins (nsp1 to nsp16) by two enzymes: the chymotrypsin-like cysteine protease (3CLpro, nsp5), which controls viral replication and is crucial for the SARS-CoV-2 life cycle, and the papain-like protease (PLpro, nsp3), which has a crucial role in suppressing the immune system by deubiquitylating host cell proteins (Morse et al., 2020, Tahir ul Qamar et al., 2020). The RNA-dependent RNA polymerase (RdRp, NSP12) plays a central role in synthesizing new viral genomes (replication) and the subgenomic templates for messenger RNA production (transcription) (Gao et al., 2020). Genome replication and transcription take place at cytoplasmic membranes. The translated proteins are inserted into the endoplasmic reticulum and the Golgi apparatus, and the RNA is then packed inside the newly formed capsid. The N protein forms the capsid, while the envelope, membrane, and spike are formed by the E, M, and S proteins, respectively. Finally, the virus-containing vesicles are released at the cell membrane by exocytosis (Shereen et al., 2020).

Diagnosis of SARS-CoV-2 based on molecular genetic technique

There is no doubt that the pandemic's reported morbidity is alarming (Simmonds, 2020). As the number of COVID-19 patients increased worldwide, a quick and sensitive diagnostic tool became urgently needed to control the outbreak of SARS-CoV-2 in the community. The World Health Organization (WHO) has recommended molecular testing methods such as polymerase chain reaction (PCR) on respiratory tract samples for the identification and laboratory confirmation of COVID-19 cases (World Health Organization, 2020). Compared with syndromic testing and computed tomography (CT), molecular techniques are more suitable because they can target and identify particular infectious agents. Developing molecular techniques requires knowledge of the pathogen's genomic and proteomic composition and an understanding of gene expression changes in the host during and after infection (Udugama et al., 2020). Real-time reverse transcription-polymerase chain reaction (rRT-PCR), the most widely used diagnostic method for COVID-19 (“Real-time RT-PCR Primers and Probes for COVID-19 | CDC,” n.d.), allows the genetic detection of SARS-CoV-2. Various kits have therefore been designed to reverse-transcribe the viral RNA genome into complementary DNA (cDNA) and amplify specific regions of the cDNA (Freeman et al., 1999, Kageyama et al., 2003). Corman et al. identified three regions of the SARS-CoV-2 genome with conserved sequences: the RNA-dependent RNA polymerase (RdRP) gene in the ORF1ab region, the E gene, and the N gene. Analytically, the RdRP and E gene assays provided high detection sensitivity, while the N gene assay showed poorer sensitivity (Corman et al., 2020). [Table 6] lists some probes and primers used for SARS-CoV-2 detection in real-time PCR tests. Besides the design of the primers and probes, other factors must be optimized to achieve a proper and accurate RT-PCR assay, including reagent conditions, incubation time, and temperature (Wong and Medrano, 2005). Selecting appropriate controls is also essential to ensure reliable results (Yan et al., 2020). In general, these controls help identify detrimental factors such as contamination and confirm pathogen recognition in the tested samples, thus allowing the detection of any false-negative or false-positive amplification. There are two main ways to perform RT-PCR. In the one-step assay, reverse transcription and amplification are combined in a single reaction. This method can yield fast and reproducible results; however, it is challenging because both the reverse transcription and the amplification must be optimized so that they do not compete with each other. In the two-step assay, the reactions take place in two separate tubes. This format is more precise than the one-step assay, but it takes longer and requires the optimization of additional parameters (Bustin, 2002, Wong and Medrano, 2005). Detection of SARS-CoV-2 by rRT-PCR requires respiratory samples. Upper respiratory tract specimens (nasopharyngeal swabs, oropharyngeal swabs, and nasal aspirates) are recommended for initial diagnostic testing, whereas lower respiratory tract specimens (sputum, bronchoalveolar lavage fluid (BALF), and tracheal aspirates) are recommended when the patient has a productive cough (“Interim Guidelines for Clinical Specimens for COVID-19 | CDC,” n.d.). Yang et al. reported that sputum and nasal swabs are more accurate for SARS-CoV-2 diagnosis.
Table 6

Probes and Primers for SARS-CoV-2 Polymerase Chain Reaction (PCR) Tests.

| Name of the gene | Institution | Forward primer (5′-3′) | Reverse primer (5′-3′) | Probe (5′-3′) |
|---|---|---|---|---|
| Nucleocapsid fragment-1 (N1) | US CDC | GACCCCAAAATCAGCGAAAT | TCTGGTTACTGCCAGTTGAATCTG | FAM-ACCCCGCATTACGTTTGGTGGACC-BHQ1 |
| Nucleocapsid fragment-2 (N2) | US CDC | TTACAAACATTGGCCGCAAA | GCGCGACATTCCGAAGAA | FAM-ACAATTTGCCCCCAGCGCTTCAG-BHQ1 |
| Nucleocapsid fragment-3 (N3) | US CDC | GGGAGCCTTGAATACACCAAAA | TGTAGCACGATTGCAGCATTG | FAM-AYCACATTGGCACCCGCAATCCTG-BHQ1 |
| Nucleocapsid (N) | China CDC | GGGGAACTTCTCCTGCTAGAAT | CAGATGTTAAASACACTATTAGCATA | FAM-TTGCTGCTGCTTGACAGATT-TAMRA |
| Nucleocapsid (N) | Hong Kong University | TAATCAGACAAGGAACTGATTA | CGAAGGTGTGACTTCCATG | FAM-GCAAATTGTGCAATTTGCGG-TAMRA |
| Nucleocapsid (N) | National Institute of Infectious Diseases, Japan | AAATTTTGGGGACCAGGAAC | TGGCAGCTGTGTAGGTCAAC | FAM-ATGTCGCGCATTGGCATGGA-BHQ |
| Nucleocapsid (N) | National Institute of Health, Thailand | CGTTTGGTGGACCCTCAGAT | CCCCACTGCGTTCTCCATT | FAM-CAACTGGCAGTAACCA-BQH1 |
| Envelope (E) | Charité, Germany | ACAGGTACGTTAATAGTTAATAGCGT | ATATTGCAGCAGTACGCACACA | FAM-ACACTAGCCATCCTTACTGCGCTTCG-BBQ |
| RNase P (RP) | US CDC | AGATTTGGACCTGCGAGCG | GAGCGGCTGTCTCCACAAGT | FAM-TTCTGACCTGAAGGCTCTGCGCG-BHQ-1 |
| Open Reading Frame-1ab (ORF1ab) | China CDC | CCCTGTGGGTTTTACACTTAA | ACGATTGTGCATCAGCTGA | FAM-CCGTCTGCGGTATGTGGAAAGGTTATGG-BHQ1 |
| Open Reading Frame-1b (ORF1b) | Hong Kong University | TGGGGYTTTACRGGTAACCT | AACRCGCTTAACAAAGCACTC | FAM-TAGTTGTGATGCWATCATGACTAG-TAMRA |
| RNA-dependent RNA polymerase (RdRP) | Charité, Germany | GTGARATGGTCATGTGTGGCGG | CARATGTTAAASACACTATTAGCATA | RdRp 1: FAM-CAGGTGGAACCTCATCAGGAGATGC-BBQ; RdRp 2: FAM-CCAGGTGGWACRTCATCMGGTGATGC-BBQ |
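Several primers and probes in Table 6 contain letters other than A, C, G, and T (for example Y, R, S, W, and M). These are standard IUPAC ambiguity codes, each standing for a mixture of bases at that position. The following sketch, with hypothetical helper names, expands such a degenerate primer into the concrete sequences it represents and computes two rough primer properties. The melting temperature uses the Wallace rule of thumb (2 °C per A/T, 4 °C per G/C), which is only a coarse estimate and is not the method used by the assay designers.

```python
from itertools import product
from typing import List

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand_degenerate(primer: str) -> List[str]:
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a non-degenerate sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    # ORF1b forward primer from Table 6 (contains Y and R).
    for variant in expand_degenerate("TGGGGYTTTACRGGTAACCT"):
        print(variant, round(gc_fraction(variant), 2), wallace_tm(variant))
```

A primer with two two-fold ambiguity positions, such as the ORF1b forward primer, expands to four concrete oligonucleotides, which is why degenerate primers can tolerate sequence variation at those positions.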
Further, in severe cases, BALF is necessary for detecting viral RNA to diagnose and monitor the virus (Yang et al., 2020). Negative results from respiratory samples do not rule out the disease; they may result from poor sampling technique or from a low viral load in the sampled area (Ai et al., 2020, Winichakoon et al., 2020). Like any other diagnostic tool, RT-PCR for SARS-CoV-2 diagnosis has some drawbacks, such as kit shortages under increased demand, its inability to detect prior infection in recovered patients who showed no symptoms of the disease, and its need for advanced equipment that is often not available in resource‐limited regions. However, alternative nucleic acid tests may overcome these limitations (X. Li et al., 2020, Udugama et al., 2020), such as the loop-mediated isothermal amplification (LAMP) test, which is used to amplify DNA and RNA. The assay uses five primer sets designed to target the 5′ region of the open reading frame 1a gene and Gene N in the SARS-CoV-2 RNA. Each target was tested with both synthetic DNA substrates and RNA transcripts. Amplified DNA can be detected by turbidity (caused by by-products of the reaction), fluorescence (a fluorescent dye that binds double-stranded DNA), or color (a pH-sensitive dye), and the result can be read by colorimetric detection or agarose gel electrophoresis. Comparison across the two genes showed that the synthetic DNA substrates were amplified and detected more slowly than the RNA templates, confirming that RNA is efficiently converted to cDNA by the reverse transcriptase and then amplified by the DNA-dependent DNA polymerase. The results were fully comparable with real-time detection. Cell lysate was also tested, and the detection sensitivity matched that of synthetic RNA alone (Zhang and Holmes, 2020).
The loop-mediated isothermal amplification (LAMP) assay is rapid, is conducted at a single temperature, and does not require expensive reagents or devices (Udugama et al., 2020). Its disadvantages include the challenge of optimizing the reaction conditions, the difficulty of designing primers, and the difficulty of designing the readout device. The RT-LAMP assay has been improved by using a quenching probe (QProbe) to generate the signal, and it performs as well as an RT-PCR assay (Ozma et al., 2020). Microarray tests have also been used broadly to detect SARS-CoV-2. In this approach, viral RNA is reverse-transcribed into cDNA that is labeled with specific probes. The labeled cDNAs are loaded onto the microarray, where they hybridize to solid-phase oligonucleotides fixed on the array, and washing steps then remove unbound DNA. The viral RNA is identified through the recognition of specific probes. De Souza Luna et al. designed a nonfluorescent, low-cost, low-density oligonucleotide array that can detect the whole genome, with sensitivity equal to that of RT-PCR (De Souza Luna et al., 2007, Hardick et al., 2018). Furthermore, considering the fast mutation of SARS-CoV-2, Guo et al. extended the microarray test to detect 24 single nucleotide polymorphisms (SNPs), as well as a few mutations in the spike (S) gene (Guo et al., 2014, Shi et al., 2003). Next-generation sequencing (NGS) and electron microscopy also have a role in initial diagnosis; NGS has advanced in recent years and made considerable progress in the rapid identification of SARS-CoV-2 via RNA sequencing (RNA-Seq). In RNA-Seq, random primers are used and NGS sequences millions of DNA fragments. Most RNA-Seq reads derive from cellular RNA, but some may derive from the viral genome. Consequently, RNA-Seq can be used to identify RNA viruses. NGS is currently an ideal method for virus detection because it permits unbiased sequencing, for example of bat CoVs, given their high genetic diversity.
This method is highly effective, reducing sequencing cost and increasing detection sensitivity (Shi et al., 2003). Moreover, the SHERLOCK method uses the Cas13a ribonuclease for RNA sensing. Cas13a is activated when its guide RNA sequence binds to an amplified RNA product; reporter probes are then cleaved to generate a fluorescent signal (Guo et al., 2014).
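The guide-dependent activation described for Cas13a can be illustrated with a toy model. The sketch below assumes perfect Watson-Crick pairing between the guide spacer and the amplified RNA; real enzymes tolerate some mismatches, and the sequences and function names here are invented for illustration and do not correspond to any published guide.

```python
# Complement table for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def guide_binding_site(spacer: str, target_rna: str) -> int:
    """Return the 0-based position in target_rna where the spacer pairs
    perfectly, or -1 if there is no such site."""
    site = reverse_complement_rna(spacer)
    return target_rna.upper().find(site)


def reporter_cleaved(spacer: str, target_rna: str) -> bool:
    """In this toy model the nuclease is active, and the fluorescent
    reporter is cleaved, only if the guide finds its site."""
    return guide_binding_site(spacer, target_rna) >= 0


if __name__ == "__main__":
    amplified_rna = "AUGGCUAGCUUACGGAUCCGAUAGC"  # made-up amplicon
    spacer = "CGGAUCCGUAAGCUA"                   # made-up guide spacer
    print(guide_binding_site(spacer, amplified_rna))
    print(reporter_cleaved(spacer, "AUGGCUAGCAAAAAAAUCCGAUAGC"))
```

The model captures only the logic of the readout: signal appears when, and only when, the amplified product contains the sequence complementary to the guide.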

Conclusion

Molecular techniques based on its genetic material show that SARS-CoV-2 is a new addition to the betacoronavirus family. It is genetically related, with more than 96% identity, to bat-SARS-like coronavirus. Although its genome is highly conserved, new strains and mutations have appeared in different parts of the world, revealing the virus's capability to adapt to different environmental conditions in order to survive. As of this writing, no curative treatment or prophylactic vaccine is available for this virus, so success in resolving the pandemic depends on public awareness of proper safety measures to stop viral dissemination and on the availability of new diagnostic tools able to detect the virus at an early stage more quickly and efficiently.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

CRediT authorship contribution statement

Ayman M. Al-Qaaneh: Conceptualization, Methodology, Formal analysis, Data curation, Writing - original draft, Writing - review & editing. Thamer Alshammari: Conceptualization, Methodology, Formal analysis, Data curation, Writing - original draft, Writing - review & editing. Razan Aldahhan: Conceptualization, Methodology, Formal analysis, Data curation, Writing - original draft, Writing - review & editing. Hanan Aldossary: Conceptualization, Methodology, Formal analysis, Data curation, Writing - original draft, Writing - review & editing. Zahra Abduljaleel Alkhalifah: Conceptualization, Methodology, Formal analysis, Data curation, Writing - original draft, Writing - review & editing. J. Francis Borgio: Conceptualization, Writing - review & editing, Supervision, Project administration.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.