
Deletion in the C-terminal region of the envelope glycoprotein in some of the Indian SARS-CoV-2 genome.

Ballamoole Krishna Kumar1, Anusha Rohit2, Kattapuni Suresh Prithvisagar1, Praveen Rai1, Indrani Karunasagar1, Iddya Karunasagar3.   

Abstract

The envelope glycoprotein (E) is the smallest structural component of SARS-CoVs and plays an essential role in viral replication, from envelope formation to assembly. The in silico analysis of 2086 whole genome sequences from India performed in this study provides the first observation of an extensive deletion of amino acid residues in the C-terminal region of the envelope glycoprotein in 34 Indian SARS-CoV-2 genomes. These amino acid deletions map to the homopentameric interface and PDZ binding motif (PBM) present in the C-terminal region of the E protein. In 26 of these genomes the deletion lies immediately after the reverse primer binding region of the Charité protocol, so E gene-based RT-qPCR should still detect these isolates. Eight genomes from the State of Odisha had a deletion extending into the primer binding site. It is possible that the deletions in the C-terminal region of the E protein of these genomes are a result of adaptation to a newer geographical area and host. Information on clinical status was available for only 9 of the 34 cases, and these were asymptomatic. Further studies are needed to understand the functional consequences of amino acid deletion in the C-terminal region of the SARS-CoV-2 envelope protein for viral pathogenesis and host adaptation.
Copyright © 2020. Published by Elsevier B.V.

Keywords:  COVID-19; E protein; Genome analysis; SARS-CoV-2; Viral assembly

Year:  2020        PMID: 33166565      PMCID: PMC7645280          DOI: 10.1016/j.virusres.2020.198222

Source DB:  PubMed          Journal:  Virus Res        ISSN: 0168-1702            Impact factor:   3.303


The coronavirus disease 2019 (COVID-19) caused by SARS-CoV-2 has become a pandemic affecting over 200 countries, causing more than 33.8 million cases and over 1.01 million deaths worldwide. The disease spreads mostly through droplets released when an infected person coughs, speaks, or sneezes, and it spreads rapidly among contacts. SARS-CoV-2 is an enveloped, single-stranded positive-sense RNA virus belonging to the genus β-coronavirus (Lu et al., 2020; Malik et al., 2020). Similar to SARS-CoV and MERS-CoV, the 29.9 kb genome of SARS-CoV-2 encodes four major structural proteins, namely spike (S), envelope (E), membrane (M), and nucleocapsid (N), along with 16 non-structural proteins (nsp1–16) and five to eight accessory proteins (Wu et al., 2020). Based on the currently available evidence, the severity of COVID-19 differs significantly between populations and geographical locations. Asymptomatic infection was also reported in more than 80 % of the cases that tested positive for COVID-19 in India (Acharya and Porwal, 2020). Genomic data on Indian isolates of SARS-CoV-2 are coming from different laboratories, and interim analysis shows that the virus was introduced to India from multiple sources such as China, Europe, USA, Canada and the Middle East (Potdar et al., 2020; Singh and Sharma, 2020; Somasundaram et al., 2020). Analysis of 361 genomes of SARS-CoV-2 from India revealed 5 clusters, 4 of which are known clades identified by Nextstrain: A2a, A3, B, and B4. Of these genomes, 62 % belonged to the A2a clade, but 29 % belonged to a distinct cluster designated clade I/A3i, not reported outside India (Banu et al., 2020). It has been suggested that the evolution of clade I/A3i is primarily determined by changes in the structural proteins N and E, while the evolution of the globally predominant A2a clade is determined by changes in the spike (S) and membrane (M) proteins.
Since the E gene is the target for amplification in the Reverse Transcription Real-Time PCR (RT-qPCR) used for diagnostic purposes in India (Alagarasu et al., 2020), we analyzed the sequence of E genes to see whether variations affected the primer binding sites. The envelope (E) protein of SARS-CoV is the smallest structural component and plays an essential role in viral replication, from envelope formation to assembly (Schoeman and Fielding, 2019). Studies have reported that mutations or deletions in the E protein significantly affect viral maturation and yield incompetent viral progeny, leading to reduced viral titer upon infection (Cohen et al., 2011; Schoeman and Fielding, 2019). In this study, the nucleotide sequences of E genes from Indian isolates were examined for changes in the primer binding sites. Further, in silico analysis of the Indian SARS-CoV-2 E proteins was performed to investigate the amino acid composition and the domains involved in pathogenicity, to obtain better insight into the epidemiology and pathogenicity of Indian SARS-CoV-2. E gene nucleotide and protein sequences available for the 2086 Indian SARS-CoV-2 genomes as of 11 September 2020, together with the reference sequence, were downloaded from GISAID and the NCBI Virus database and aligned using Geneious Prime. These genome sequences were derived from both Illumina sequencing technology, with coverage of more than 47X (47X–764X), and the ARTIC Network Primal Scheme on MinION, with coverage of more than 673X (673X–4600X) (Table 1). Detailed information on the sequences used and their metadata is given in Supplementary Table S1. As the success of PCR-based molecular diagnostics depends mainly on primers and/or probes that specifically and efficiently amplify the target gene, any genetic variation, especially in the 3' primer or probe binding regions, can result in potential mismatches and false-negative results.
The primers and probes described in the Charité E gene protocol (WHO, 2020) were mapped to the multiple sequence alignment of the E gene to see whether there were any changes in the primer binding sites (data not shown).
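The primer-mapping step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Geneious Prime workflow used in the study: the reference, the two primers, and the two aligned query rows are toy sequences invented for the example (they are not the Charité primers), and the reference row is assumed to contain no alignment gaps.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def primer_site_status(ref_aln, query_aln, primer, reverse=False):
    """Compare the query row with the reference at a primer binding site.

    Assumes the reference row of the alignment is ungapped, so string
    indices in the reference are also alignment columns.
    """
    # A reverse primer anneals to the plus strand as its reverse complement.
    target = reverse_complement(primer) if reverse else primer
    start = ref_aln.find(target)
    if start == -1:
        raise ValueError("primer site not found in reference row")
    site = query_aln[start:start + len(target)]
    gaps = site.count("-")
    mismatches = sum(1 for r, q in zip(target, site) if q not in ("-", r))
    return {"start": start, "gaps": gaps, "mismatches": mismatches,
            "intact": gaps == 0 and mismatches == 0}


# Toy data (hypothetical, for illustration only).
REF = "ATGGCTTACGTTAGCCGATTGACCTGAAGT"
FWD_PRIMER = "GCTTACG"    # matches REF[3:10]
REV_PRIMER = "GTCAATCG"   # reverse complement of REF[15:23]

# Deletion immediately downstream of the reverse primer site.
downstream_del = REF[:23] + "-" * 6 + REF[29:]
# Deletion that runs into the reverse primer site.
overlapping_del = REF[:18] + "-" * 9 + REF[27:]

print(primer_site_status(REF, downstream_del, REV_PRIMER, reverse=True))
print(primer_site_status(REF, overlapping_del, REV_PRIMER, reverse=True))
```

In the first toy query the reverse primer site is reported as intact, while in the second the site contains gap columns and would be flagged for possible amplification failure.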
Table 1

Patient characteristics of the COVID-19 cases used for the genome analysis of SARS-CoV-2.

| Sl. No | Accession No. (NCBI/GISAID) | Isolate name | Sequencing Technology | Coverage | Geographical Location | Age | Gender | Patient status |
|---|---|---|---|---|---|---|---|---|
| 1 | NC_045512 - WU | Wuhan-Hu-1 | Illumina | NA | China/Wuhan | NA | NA | Ref. seq. |
| 2 | MT012098 - KL | SARS-CoV-2/human/IND/29/2020 | Illumina | NA | India/Kerala | NA | NA | NA |
| 3 | EPI_ISL_428482 | hCoV-19/India/nimh-0182/2020 | Oxford nanopore technology MinION | 1,900x | India/Karnataka | 26 | Male | NA |
| 4 | EPI_ISL_436447 | hCoV-19/India/KA-NCDC-4055/2020 | Oxford nanopore technology MinION | 1,294.84x | India/Karnataka | 18 | Male | NA |
| 5 | EPI_ISL_428485 | hCoV-19/India/KA-nimh-0834/2020 | Oxford nanopore technology MinION | 700x | India/Karnataka | 32 | Male | Asymptomatic |
| 6 | EPI_ISL_486405 | hCoV-19/India/KA-nimh-14709/2020 | Oxford nanopore technology MinION | NA | India/Karnataka | 34 | Male | Asymptomatic |
| 7 | EPI_ISL_516078 | hCoV-19/India/KA-nimh-1600/2020 | Oxford nanopore technology MinION | 1814.56x | India/Karnataka | 21 | Female | Released |
| 8 | EPI_ISL_428484 | hCoV-19/India/KA-nimh-0351/2020 | Oxford nanopore technology MinION | 1200x | India/Karnataka | 55 | Male | NA |
| 9 | EPI_ISL_486402 | hCoV-19/India/KA-nimh-11076/2020 | Oxford nanopore technology MinION | NA | India/Karnataka | 30 | Female | Asymptomatic |
| 10 | EPI_ISL_428480 | hCoV-19/India/KA-nimh-0116/2020 | Oxford nanopore technology MinION | 900x | India/Karnataka | 28 | Female | NA |
| 11 | EPI_ISL_436139 | hCoV-19/India/KA-nimh-3970/2020 | Oxford nanopore technology MinION | NA | India/Karnataka | 20 | Male | Symptomatic |
| 12 | EPI_ISL_436140 | hCoV-19/India/KA-nimh-4376/2020 | Oxford nanopore technology MinION | NA | India/Karnataka | 52 | Female | Asymptomatic |
| 13 | EPI_ISL_436138 | hCoV-19/India/KA-nimh-3952/2020 | Oxford nanopore technology MinION | NA | India/Karnataka | 38 | Male | Asymptomatic |
| 14 | EPI_ISL_428481 | hCoV-19/India/KA-nimh-0130/2020 | Oxford nanopore technology MinION | 3000x | India/Karnataka | 55 | Male | NA |
| 15 | EPI_ISL_428487 | hCoV-19/India/KA-nimh-1071/2020 | Oxford nanopore technology MinION | 1600x | India/Karnataka | 25 | Male | NA |
| 16 | EPI_ISL_428486 | hCoV-19/India/KA-nimh-0996/2020 | Oxford nanopore technology MinION | 2000X | India/Karnataka | 43 | Male | NA |
| 17 | EPI_ISL_428483 | hCoV-19/India/KA-nimh-0318/2020 | Oxford nanopore technology MinION | 4600x | India/Karnataka | 19 | Male | NA |
| 18 | EPI_ISL_516077 | hCoV-19/India/KA-nimh-1453/2020 | Oxford nanopore technology MinION | 1000.31x | India/Karnataka | 23 | Male | Released |
| 19 | EPI_ISL_515948 | hCoV-19/India/KA-nimh-11074/2020 | Oxford nanopore technology MinION | 1,510x (average) | India/Karnataka | 10 | Male | Released |
| 20 | EPI_ISL_515953 | hCoV-19/India/KA-nimh-12171/2020 | Oxford nanopore technology MinION | 726x (average) | India/Karnataka | 35 | Female | Released |
| 21 | EPI_ISL_436156 | hCoV-19/India/KA-nimh-1596/2020 | Oxford nanopore technology MinION | NA | India/Karnataka | 55 | Female | Asymptomatic |
| 22 | EPI_ISL_436141 | hCoV-19/India/KA-nimh-4378/2020 | Oxford nanopore technology MinION | NA | India/Karnataka | 27 | Female | Asymptomatic |
| 23 | EPI_ISL_436137 | hCoV-19/India/KA-nimh-2873/2020 | Oxford nanopore technology MinION | NA | India/Karnataka | 28 | Female | Asymptomatic |
| 24 | EPI_ISL_436157 | hCoV-19/India/KA-nimh-1598/2020 | Oxford nanopore technology MinION | NA | India/Karnataka | 50 | Female | Asymptomatic |
| 25 | EPI_ISL_428479 | hCoV-19/India/KA-nimh-0113/2020 | Oxford nanopore technology MinION | 3000x | India/Karnataka | 68 | Male | NA |
| 26 | EPI_ISL_528426 | hCoV-19/India/MH-IGIB-D2/2020 | Oxford nanopore technology MinION | NA | India/Maharashtra | 24 | Female | NA |
| 27 | EPI_ISL_436438 | hCoV-19/India/MH-NCDC-3835/2020 | Oxford nanopore technology MinION | 1,467.2x | India/Maharashtra | NA | Male | NA |
| 28 | EPI_ISL_436446 | hCoV-19/India/MH-NCDC-3985/2020 | Oxford nanopore technology MinION | 673.53x | India/Maharashtra | 25 | Male | NA |
| 29 | EPI_ISL_436442 | hCoV-19/India/MH-NCDC-3950/2020 | Oxford nanopore technology MinION | 1,345.3x | India/Maharashtra | 31 | Male | NA |
| 30 | EPI_ISL_436437 | hCoV-19/India/DL-NCDC-3831/2020 | Oxford nanopore technology MinION | 1,531.56x | India/Delhi | 56 | Male | NA |
| 31 | EPI_ISL_436443 | hCoV-19/India/MP-NCDC-3961/2020 | Oxford nanopore technology MinION | 1,346.03x | India/Madhya Pradesh | 33 | Male | NA |
| 32 | EPI_ISL_455773 | hCoV-19/India/OR-RMRC22/2020 | Illumina | 179x | India/Odisha | 29 | Male | NA |
| 33 | EPI_ISL_463046 | hCoV-19/India/OR-ILSCV16495/2020 | Illumina | 355x | India/Odisha | 37 | Male | NA |
| 34 | EPI_ISL_463080 | hCoV-19/India/OR-ILSCV20598/2020 | Illumina | 403x | India/Odisha | 58 | Male | NA |
| 35 | EPI_ISL_463091 | hCoV-19/India/OR-ILSCV27412/2020 | Illumina | 47x | India/Odisha | 20 | Female | NA |
| 36 | EPI_ISL_463048 | hCoV-19/India/OR-ILSCV16508/2020 | Illumina | 275x | India/Odisha | 35 | Male | NA |
| 37 | EPI_ISL_463021 | hCoV-19/India/OR-ILSCV13695/2020 | Illumina | 764x | India/Odisha | 29 | Male | NA |
| 38 | EPI_ISL_463041 | hCoV-19/India/OR-ILSCV15937/2020 | Illumina | 638x | India/Odisha | 50 | Male | NA |
| 39 | EPI_ISL_463093 | hCoV-19/India/OR-ILSCV28955/2020 | Illumina | 84x | India/Odisha | 55 | Male | NA |

NA = Not Available.

The analysis revealed that none of the E gene sequences analysed had variations in the primer/probe binding regions, except for 8 out of 168 sequences from the State of Odisha, in which the reverse primer binding site was found to be abolished. Hence such strains may be negative for the E gene in RT-PCR. This may explain observations from some diagnostic laboratories in India using GeneXpert and Truenat systems that occasional samples can be negative for the E gene but positive for the N gene or ORF1 gene (Rohit et al., unpublished observations). The SARS-CoV-2 E protein has 75 amino acid residues with a predicted molecular weight of 8.3 kDa, and has an N-terminal cytoplasmic segment and a C-terminal non-cytoplasmic region separated by a hydrophobic transmembrane domain (aa 12 to aa 35), as predicted using the XtalPred Server (http://xtalpred.godziklab.org/) and an InterProScan search (http://www.ebi.ac.uk/interpro/). Multiple sequence alignment revealed that the amino acid residues of the E protein are highly conserved in all the sequences, except for a characteristic in-frame deletion of 25–55 amino acid residues in the C-terminal region of the E protein in 34 SARS-CoV-2 genomes from Karnataka (20/128), Maharashtra (4/325), Delhi (1/121), Madhya Pradesh (1/41) and Odisha (8/168) (Fig. 1). These amino acid deletions map to the homopentameric interface and PDZ binding motif (PBM) present in the C-terminal region. In 26 genomes (all except the ones from the State of Odisha discussed above) the deletion lies immediately after the reverse primer binding region of the Charité protocol (WHO, 2020), so E gene-based RT-qPCR should still detect these isolates. Among the 34 cases, 9 individuals were asymptomatic (Table 1), and the clinical condition of the other individuals was not available in the GISAID database.
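As a quick arithmetic check on the per-state counts quoted above, the following Python sketch tallies the genomes carrying the deletion and expresses each state's count as a percentage of the genomes analysed from that state. The numbers are taken from the text; the variable and function names are illustrative only.

```python
# (genomes with the C-terminal deletion, genomes analysed) per state, as quoted in the text.
deletion_counts = {
    "Karnataka": (20, 128),
    "Maharashtra": (4, 325),
    "Delhi": (1, 121),
    "Madhya Pradesh": (1, 41),
    "Odisha": (8, 168),
}


def summarize(counts):
    """Return the total number of deleted genomes and the per-state percentage."""
    total_deleted = sum(deleted for deleted, _ in counts.values())
    per_state_pct = {
        state: round(100 * deleted / analysed, 1)
        for state, (deleted, analysed) in counts.items()
    }
    return total_deleted, per_state_pct


total_deleted, per_state_pct = summarize(deletion_counts)
# Genomes whose reverse primer site is unaffected: all except those from Odisha.
primer_site_intact = total_deleted - deletion_counts["Odisha"][0]

print(total_deleted, primer_site_intact)
print(per_state_pct)
```

The totals reproduce the 34 genomes with the deletion and the 26 in which the reverse primer binding region is unaffected.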
When we analyzed the patient data for these cases, none of them had a history of travel to countries affected by COVID-19; hence the exact source of infection cannot be adequately traced. We also examined the lineage of these sequences using the methodology of Pattabiraman et al. (2020). As indicated in Table S1, of the 34 sequences with E gene deletions, 15 belonged to 19A, one belonged to 19B, 4 each belonged to 20A and 20B, and the remaining sequences could not be categorised into any of the clades defined by Nextstrain. Our data suggest that the C-terminal deletion in the E gene of SARS-CoV-2 is spread across different lineages and that this deletion event probably occurred independently in different lineages and geographical locations.
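The clade breakdown above leaves the number of unclassified sequences implicit. A minimal Python tally, using only the counts stated in the text, makes it explicit:

```python
# Clade assignments of the 34 sequences with E gene deletions, as stated in the text.
clade_counts = {"19A": 15, "19B": 1, "20A": 4, "20B": 4}
total_with_deletion = 34

assigned = sum(clade_counts.values())
unassigned = total_with_deletion - assigned

print(f"assigned to a Nextstrain clade: {assigned}; not categorised: {unassigned}")
```

On these figures, 24 sequences fall into a named clade and 10 could not be categorised.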
Fig. 1

Representative image showing the amino acid residues and domains in the envelope (E) protein of SARS-CoV-2. The homopentameric interface region and PDZ binding domains are indicated with star marks coloured in blue and orange. The locations of the N-glycosylation sites are marked with a straight line at the top of the residue. The predicted antigenic determinant regions are highlighted in the box.

Further, sequential B cell epitopes on the E protein were predicted using BepiPred-2.0 (http://www.cbs.dtu.dk/services/BepiPred/index.php). This showed the presence of highly conserved antigenic determinant regions in the N-terminal region (SEET) and C-terminal region (YVYSRVKNLNSSRVP) of the E protein (Fig. 1). It is also important to highlight that we were unable to map the C-terminal antigenic determinant region in the E proteins of those 34 SARS-CoV-2 isolates, as there was a deletion of 25–55 amino acid residues in the C-terminal region. Based on PROSITE analyses (https://prosite.expasy.org/), the SARS-CoV-2 E protein was predicted to have two N-glycosylation sites, at N48 and N66, but these were lacking in all 34 isolates that had extensive deletions/gaps in the C-terminal region. A recent study by Sun et al. (2020) reported a 12 bp deletion in the E gene of the SARS-CoV-2 genome at positions 26320–26331; the mutant strain had more spike protein than the wild type, without any difference in viral titer. In SARS-CoV, the PDZ binding motif of the E protein is considered to be the major determinant of virulence and is known to participate in interactions with several host proteins, such as syntenin and PALS1, during viral infection (Jimenez-Guardeño et al., 2014; Teoh et al., 2010). SARS-CoVs lacking the PBM domain of the E protein showed abrogated expression of inflammatory cytokines and were attenuated, causing the least damage to the lungs and no mortality in the infected animals (Jimenez-Guardeño et al., 2014). DeDiego et al. (2008) showed that elimination of the SARS-CoV E protein has a deleterious effect on viral replication: the virus grew to titers about 100-fold lower than the wild-type virus in lung cells of infected mice. In many coronaviruses, the E protein is targeted to the Golgi complex, where it participates in viral assembly and budding. Studies have shown that the Golgi complex targeting information is located in the C-terminal region of the E protein; therefore, either truncation of the C-terminal region or mutation of conserved amino acid residues in the C-terminus would disrupt Golgi complex targeting of the SARS-CoV E protein, leading to crippled viral maturation and incompetent viral progeny (Cohen et al., 2011). It has also been shown that deletion of N-glycosylation sites in the envelope protein of many viruses has significant consequences for infectivity and for antibody-mediated neutralization by the host immune system. It was also observed that mutation of the glycosylation site at N66 resulted in the formation of E protein oligomers of higher molecular weight, thereby affecting the function of the E protein in viral replication (Liao et al., 2006; Schoeman and Fielding, 2019).
The in silico analysis performed in this study provides the first observation of an extensive deletion of amino acid residues in the C-terminal region of the envelope glycoprotein in some of the Indian SARS-CoV-2 genomes. It is possible that the deletions in the C-terminal region of the E protein of these genomes are a result of adaptation to a newer geographical area and host. Sequencing of samples with low viral titers can sometimes lead to assemblies with low coverage and/or spurious gaps in the genome, but this is unlikely here, since the gaps occur in the same region in 34 isolates from different geographical locations that were sequenced by different laboratories.
It is also possible that these isolates have reduced virulence, since nine of the thirty-four individuals from whom they were obtained were asymptomatic; clinical information for the others was not available. The lack of travel history in the infected individuals suggests that these virus isolates might have been circulating in this region for some time. However, further studies are needed to understand the functional consequences of amino acid deletion in the C-terminal region of the SARS-CoV-2 envelope protein for viral pathogenesis and host adaptation.
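The N-glycosylation and epitope observations reported above can be illustrated with a short Python sketch. It is not the PROSITE or BepiPred-2.0 analysis itself: it applies the PROSITE-style sequon N-{P}-[ST]-{P} as a regular expression and checks for the two epitope strings quoted in the text. The E protein sequence is reproduced from memory of the Wuhan-Hu-1 reference (NC_045512) and should be verified against the database record, and the truncated variant is a hypothetical example, not one of the 34 genomes.

```python
import re

# SARS-CoV-2 E protein (75 aa). Assumption: reproduced from memory of the
# Wuhan-Hu-1 reference; verify against the NCBI record before reuse.
E_REF = (
    "MYSFVSEETGTLIVNSVLLFLAFVVFLLVTLAILTALRLC"
    "AYCCNIVNVSLVKPSFYVYSRVKNLNSSRVPDLLV"
)

# PROSITE-style N-glycosylation sequon N-{P}-[ST]-{P}.
# The lookahead lets overlapping candidate sites be reported.
SEQUON = re.compile(r"(?=N[^P][ST][^P])")

# Antigenic determinant regions quoted in the text.
EPITOPES = {"N-terminal": "SEET", "C-terminal": "YVYSRVKNLNSSRVP"}


def n_glyc_sites(protein):
    """Return 1-based positions of Asn residues that match the sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(protein)]


def epitopes_present(protein):
    """Report which of the quoted epitope strings occur in the protein."""
    return {name: motif in protein for name, motif in EPITOPES.items()}


# Hypothetical variant lacking the last 30 residues (illustration only).
truncated = E_REF[:45]

print(n_glyc_sites(E_REF), epitopes_present(E_REF))
print(n_glyc_sites(truncated), epitopes_present(truncated))
```

For the full-length sequence the scan reports the two sites at N48 and N66 and both epitopes; for the hypothetical truncated variant it reports no sequon match and only the N-terminal epitope, which mirrors the pattern described for the isolates with C-terminal deletions.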

Declaration of Competing Interest

The authors report no declarations of interest.