
Probing SARS-CoV-2 sequence diversity of Pakistani isolates.

Zaira Rehman1, Massab Umair2, Aamer Ikram2, Afreenish Amir2, Muhammad Salman2.   

Keywords:  Mutations; SARS-CoV-2

Year:  2021        PMID: 33545393      PMCID: PMC8035043          DOI: 10.1016/j.meegid.2021.104752

Source DB:  PubMed          Journal:  Infect Genet Evol        ISSN: 1567-1348            Impact factor:   3.342


To the Editor,

With the continuing spread of the COVID-19 pandemic around the world, 223,983 whole genome sequences of SARS-CoV-2 had been submitted to the GISAID database as of November 2, 2020. This wealth of sequences is useful for probing variations in the viral genome that can potentially affect its transmissibility and virulence. SARS-CoV-2 is an RNA virus comprising six major open reading frames (ORFs) that encode structural and non-structural proteins. Sixteen non-structural proteins (nsp1–16) are encoded by ORF1a and ORF1b, while the accessory proteins are encoded by ORF3a, ORF6, ORF7a, ORF7b, and ORF8 (Shimamoto et al., 2015; Zhou et al., 2020). RNA viruses such as influenza virus and HIV tend to accumulate nucleotide variations because their RNA polymerases lack proofreading activity, which results in a high mutation rate. SARS-CoV viruses, however, have evolved a proofreading protein, nsp14, that keeps rapid mutational change in the genome in check (Denison et al., 2011). In numerical terms, reports up to September 2020 suggested 12,000 mutations in SARS-CoV-2 genomes (Callaway, 2020). Pakistan is a populous country with 403,311 positive cases and 166 deaths as of December 1, 2020 (https://covid.gov.pk/stats/pakistan). Genomic surveillance of SARS-CoV-2 is therefore imperative. As of October 16, 2020, only 14 whole genome sequences of SARS-CoV-2 had been reported from Pakistan. Nevertheless, studying how these sequences diverge from those reported worldwide is crucial to get a lead on the genetic variants of SARS-CoV-2 that might be circulating in the Pakistani population.
All 14 whole genome sequences of SARS-CoV-2 reported from Pakistan up to October 16, 2020 were downloaded from the GISAID database (https://www.epicov.org/epi3/frontend#efa72), and the first SARS-CoV-2 sequence from Wuhan was used as the reference sequence (accession number: NC_045512.2). Multiple sequence alignment was performed using Clustal X (Larkin et al., 2007), and the alignment was visualized and analyzed for mutations using Jalview (Waterhouse et al., 2009). Phylogenetic analysis of the 14 Pakistani isolates was performed on the Galaxy server: the sequences were aligned with MAFFT, and a maximum likelihood tree was constructed with IQ-TREE (Nguyen et al., 2015). The tree was visualized and edited using FigTree (http://tree.bio.ed.ac.uk/software/figtree/). The amino acid variations were compared with the SARS-CoV-2 sequences reported around the world in the COVIDCG database (Chen et al., 2020) as of October 16, 2020. The effect of mutations on protein stability was studied with I-Mutant 3.0, which predicts the change in Gibbs free energy (ΔΔG, reported as DDG) between the native and the mutated protein. A mutation is classified as increasing stability (DDG > 0.5 kcal/mol), decreasing stability (DDG < −0.5 kcal/mol), or neutral (−0.5 ≤ DDG ≤ 0.5 kcal/mol) (Capriotti et al., 2005).

Phylogenetic analysis revealed that the SARS-CoV-2 sequences from Pakistan correspond to the GH, S, O, GR, and L clades. The initial sequences (March 2020) belonged to the L and O clades. The L clade sequences appeared to be closely related to strains reported from the United Kingdom and the United Arab Emirates, while the O clade sequences were closely related to SARS-CoV-2 sequences from Japan. The samples collected in May 2020 belong to the GR clade and are closely related to isolates from the USA, Sweden, and Germany. The GH and S clades were observed in the June 2020 sequences, which appear to be closely related to strains reported from the United Arab Emirates (Fig. S1).
Fig. S1

Maximum-Likelihood phylogenetic tree of 14 SARS-CoV-2 whole genome sequences from Pakistan. The Pakistani sequences are highlighted in red color. The L clade is highlighted as blue, S clade as purple, GH clade as sea green, GR clade as green, and O clade as yellow color.

In total, 28 amino acid variations in the structural and non-structural proteins of SARS-CoV-2 were identified in patient isolates from across Pakistan. Nine non-structural proteins in ORF1ab (nsp1, nsp7–11, and nsp14–16) were conserved in the Pakistani isolates. Amino acid changes were observed in nsp2, nsp3, nsp4, nsp5, nsp6, nsp12, and nsp13 (Fig. 1).
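The amino acid variations discussed in this letter are substitutions read off an alignment against the Wuhan reference. The letter does this with Clustal X and Jalview; the following is only a minimal sketch of the same idea, assuming two protein sequences that have already been aligned to equal length. The function name `call_substitutions` and the toy sequences are hypothetical and are not taken from the study.

```python
def call_substitutions(ref_aln: str, query_aln: str) -> list[str]:
    """List substitutions such as 'D614G' between two aligned protein sequences.

    Positions are numbered along the ungapped reference. Gaps ('-') and
    ambiguous residues ('X') in the query are skipped, not reported.
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")
    substitutions = []
    ref_pos = 0
    for ref_aa, query_aa in zip(ref_aln, query_aln):
        if ref_aa == "-":
            continue  # insertion relative to the reference: no reference coordinate
        ref_pos += 1
        if query_aa in "-X":
            continue  # deletion or ambiguous call in the query
        if ref_aa != query_aa:
            substitutions.append(f"{ref_aa}{ref_pos}{query_aa}")
    return substitutions


# Toy aligned pair (hypothetical, not real SARS-CoV-2 sequence).
reference = "MKD-LQ"
isolate = "MRDALH"
print(call_substitutions(reference, isolate))  # ['K2R', 'Q5H']
```

Numbering by the ungapped reference keeps the coordinates comparable across isolates even when an isolate carries an insertion.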
Fig. 1

Multiple sequence alignment of Pakistani SARS-CoV-2 sequences with the reference Wuhan-1 strain. (A) nsp2; (B) nsp3; (C) nsp4; (D) nsp5; (E) nsp6; (F) nsp13; (G) Spike; (H) ORF3; (I) ORF8; (J) Nucleocapsid; (K) ORF10. The amino acid variations are shown in white.

In nsp2, three changes (R207C, V378I, and D448N) were observed in the sequences collected in March and one change (L450F) in the samples from June. Interestingly, these changes have not been observed in any of the sequences reported around the world. Nsp2 is an important viral protein that, along with nsp8, is involved in viral replication (Angeletti et al., 2020). Hence, any change in this protein may impair viral replication and requires further investigation. In nsp3, four changes (L944S, T1246I, K1305N, and Q2702H) were detected in the isolates from May 2020. The T1246I variant has been reported in only 0.2% of sequences from different countries (Table 1). K1305N has been observed in only 0.1% of the Asian sequences and 0.5% of the European isolates. Q2702H was observed in 5 sequences recovered in May and June 2020; worldwide, this mutation has been reported in less than 1% of the isolates. Another nsp3 mutation, T2016K, was reported earlier by Ghanchi et al. (2020). This mutation has been reported in 15% of the isolates from Asia and was still prevalent in October 2020 isolates. Functionally, nsp3 contributes to the suppression of the host innate immune response (Lei et al., 2018). Hence, changes in nsp3 may enhance the ability of the virus to evade innate immune defenses.
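The ORF1ab changes in this letter appear to be numbered along the whole polyprotein rather than within each nsp (for example, P4715L is assigned to nsp12, which is far shorter than 4,715 residues). The sketch below converts such a label to nsp-local numbering. The boundary table is an assumption taken from the NC_045512.2 annotation and is not given in the letter; `NSP_BOUNDS` and `to_nsp_numbering` are hypothetical names.

```python
# Assumed nsp boundaries on the ORF1ab polyprotein (1-based, inclusive),
# taken from the NC_045512.2 annotation; they are not listed in this letter.
# nsp11 is omitted: it is a short ORF1a-only peptide overlapping the start of nsp12.
NSP_BOUNDS = [
    ("nsp1", 1, 180), ("nsp2", 181, 818), ("nsp3", 819, 2763),
    ("nsp4", 2764, 3263), ("nsp5", 3264, 3569), ("nsp6", 3570, 3859),
    ("nsp7", 3860, 3942), ("nsp8", 3943, 4140), ("nsp9", 4141, 4253),
    ("nsp10", 4254, 4392), ("nsp12", 4393, 5324), ("nsp13", 5325, 5925),
    ("nsp14", 5926, 6452), ("nsp15", 6453, 6798), ("nsp16", 6799, 7096),
]


def to_nsp_numbering(mutation: str) -> tuple[str, str]:
    """Map a polyprotein-numbered substitution (e.g. 'P4715L') to (nsp, local label)."""
    ref_aa, alt_aa = mutation[0], mutation[-1]
    position = int(mutation[1:-1])
    for name, start, end in NSP_BOUNDS:
        if start <= position <= end:
            return name, f"{ref_aa}{position - start + 1}{alt_aa}"
    raise ValueError(f"position {position} is outside the assumed ORF1ab range")


print(to_nsp_numbering("P4715L"))  # ('nsp12', 'P323L')
print(to_nsp_numbering("L3606F"))  # ('nsp6', 'L37F')
```

Under these assumed boundaries, every ORF1ab change in Table 1 falls inside the nsp that the letter assigns it to, which is consistent with polyprotein numbering.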
Table 1

Details of mutations reported in Pakistani Sequences in comparison with the worldwide reported sequences and effect of mutations on stability of protein. DDG > 0.5Kcal/mol = protein stability increases, DDG < -0.5 Kcal/mol = protein stability decreases.

| Sample ID | Gene | Mutation | Worldwide prevalence | DDG (kcal/mol) | Effect on protein stability |
|---|---|---|---|---|---|
| EPI_ISL_417444 | nsp2 | R207C, V378I | | −0.82654688, −0.47956396 | Decrease |
| EPI_ISL_451958 | nsp2 | D448N | | −0.91305103 | Decrease |
| EPI_ISL_45579 | nsp2 | L450F | | −1.6603629 | Decrease |
| EPI_ISL_548942, EPI_ISL_548943, EPI_ISL_548944, EPI_ISL_548945 | nsp3 | L944S | | −0.61244224, −0.16539386 | Decrease |
| | | T1246I | Asia = 0.2%, Europe = 2.5%, South America = 7.6%, USA = 0.1%, Canada = 0.2%, Africa = 9.5% | | |
| | | K1305N | Asia = 0.1%, Europe = 0.6%, Africa = 0.04% | | |
| EPI_ISL_513925, EPI_ISL_45143, EPI_ISL_45090, EPI_ISL_45579 | nsp3 | Q2702H | Asia = 0.5%, Europe = 0.9%, Canada = 0.2%, Africa = 1.7% | −1.4611661 | Decrease |
| EPI_ISL_417444 | nsp4 | P2965L | | 0.433947 | Increase |
| EPI_ISL_548942, EPI_ISL_548943, EPI_ISL_548944, EPI_ISL_548945 | nsp5 | G3278S | Asia = 0.2%, Europe = 6.4%, South America = 7.6%, Canada = 0.4%, Africa = 9.6% | −0.39907465 | Decrease |
| EPI_ISL_468160 | nsp5 | N3491K | | −1.3556076 | Decrease |
| EPI_ISL_417444, EPI_ISL_513925, EPI_ISL_45143, EPI_ISL_45090, EPI_ISL_45579 | nsp6 | L3606F | Asia = 21.0%, Europe = 9.6%, South America = 2.6%, USA = 3.3%, Canada = 6.2%, Africa = 5.1% | −1.2923268 | Decrease |
| EPI_ISL_468159, EPI_ISL_468163 | nsp6 | M3655I | Asia = 0.6%, Europe = 0.8%, Africa = 1.1% | −0.74222727 | Decrease |
| EPI_ISL_548942, EPI_ISL_548943, EPI_ISL_548944, EPI_ISL_548945, EPI_ISL_513925, EPI_ISL_468161, EPI_ISL_548946, EPI_ISL_468160, EPI_ISL_468162 | nsp12 | P4715L | Asia = 62.7%, Europe = 90%, South America = 94%, USA = 89%, Canada = 85%, Africa = 91% | | |
| EPI_ISL_513925, EPI_ISL_417444, EPI_ISL_419313 | nsp13 | A5561T | | −1.8647808 | Decrease |
| EPI_ISL_548946 | S | N74K | | 0.82838446 | Increase |
| EPI_ISL_548946, EPI_ISL_468160, EPI_ISL_468161, EPI_ISL_548942, EPI_ISL_548943, EPI_ISL_548944, EPI_ISL_548945, EPI_ISL_513925, EPI_ISL_468162 | S | D614G | Asia = 62.4%, Europe = 87.9%, South America = 94.3%, USA = 89.2%, Canada = 82.2%, Africa = 95% | −1.4867818 | |
| EPI_ISL_468159, EPI_ISL_468163 | S | D830A | | −0.59789075 | Decrease |
| EPI_ISL_513925, EPI_ISL_468161, EPI_ISL_468160, EPI_ISL_468162 | ORF3a | Q57H | Asia = 25.2%, Europe = 10.8%, South America = 12.0%, USA = 62.0%, Canada = 36.6%, Africa = 9.3% | −0.91817829 | Decrease |
| EPI_ISL_468160 | ORF3a | F105S | | −0.94734895 | Decrease |
| EPI_ISL_468161 | ORF8 | W45L | Europe = 0.1%, USA = 0.2% | −0.57599897 | Decrease |
| EPI_ISL_468159, EPI_ISL_468163 | ORF8 | L84S | Asia = 8.3%, Europe = 2.2%, South America = 3.7%, USA = 8.8%, Canada = 12.8%, Africa = 4.2% | −1.9053577 | Decrease |
| | | E92K | Asia = 0.4%, Africa = 1.7% | −0.75628271 | |
| EPI_ISL_468159, EPI_ISL_468163, EPI_ISL_468162 | N | S202N | Asia = 2.7%, Europe = 0.1%, South America = 0.0%, USA = 0.3%, Africa = 4.2% | 0.6542265 | Increase |
| EPI_ISL_548942, EPI_ISL_548943, EPI_ISL_548944, EPI_ISL_548945, EPI_ISL_548946 | N | R203K, G204R | Asia = 29.3%, Europe = 46.7%, South America = 62.7%, USA = 12.9%, Canada = 16.9%, Africa = 53.7% | −1.0587625 | Decrease |
| EPI_ISL_513925 | N | S327L | Asia = 0.1%, Europe = 0.1%, South America = 0.1%, USA = 0.1%, Africa = 0.3% | 0.55558303 | Increase |
| EPI_ISL_513925 | ORF10 | V30L | Asia = 0.2%, Europe = 6.9%, USA = 0.1%, Africa = 0.2% | −1.4376902 | Decrease |
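The stability classes quoted for Table 1 follow a simple threshold rule on the predicted DDG. A minimal sketch of that rule is given here, assuming the ±0.5 kcal/mol cut-offs stated in the letter; `classify_ddg` is a hypothetical helper. It is not I-Mutant itself, and it may not reproduce every label in Table 1.

```python
def classify_ddg(ddg: float, threshold: float = 0.5) -> str:
    """Classify a predicted DDG value (kcal/mol) with the three-class rule.

    DDG > threshold   -> 'increase' (mutation predicted to stabilize)
    DDG < -threshold  -> 'decrease' (mutation predicted to destabilize)
    otherwise         -> 'neutral'
    """
    if ddg > threshold:
        return "increase"
    if ddg < -threshold:
        return "decrease"
    return "neutral"


# Two DDG values from Table 1 and one made-up value inside the neutral band.
print(classify_ddg(-1.6603629))  # nsp2 L450F -> 'decrease'
print(classify_ddg(0.82838446))  # S N74K -> 'increase'
print(classify_ddg(0.1))         # hypothetical value -> 'neutral'
```

Keeping the threshold as a parameter makes it easy to check how sensitive the labels are to the chosen cut-off.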
In nsp4, only one novel change (P2965L) was observed, in a single isolate. To our knowledge, this change has not been found in any of the sequences reported worldwide. In nsp5, the G3278S and N3491K changes were observed in isolates from May and June, respectively. N3491K appears to be a novel mutation, while G3278S was reported with a high prevalence (9.6%) in the initial months of the pandemic and has not been observed since August 2020. Nsp5, also known as the 3C-like protease, is involved in viral replication (Macchiagodena et al., 2020) and also interferes with interferon signaling (Zhu et al., 2017a; Zhu et al., 2017b). Hence, any mutation in it can potentially affect the replication capability of the virus. In nsp6, the L3606F change is present in 5 isolates from March and June 2020. This mutation has been found in 21% of isolates from Asia and 9.6% of isolates from Europe. The nsp6 protein of other coronaviruses has been reported to interfere with cellular autophagy signaling by affecting the PIK3C3 and ATG5 proteins, thereby inducing autophagosome formation but blocking autophagosome maturation (Benvenuto et al., 2020; Gassen et al., 2019; Yang and Shen, 2020). Among the structural proteins of SARS-CoV-2, the spike protein (S) is the outermost protein and is involved in the entry of the virus into the host cell. The D614G change in the S protein was observed in 9 Pakistani isolates from May and June 2020. The D614G mutation is considered to increase the infectivity of SARS-CoV-2 (Korber et al., 2020) by increasing the transmissibility of the virus. D830A is another novel change, observed in 2 Pakistani isolates from June 2020.
This change is important because D830 lies near the TMPRSS2 binding site and may affect viral fusion. In ORF3a, an important change, Q57H, was observed in isolates from May and June 2020. This mutation has been reported to be prevalent worldwide and to have an impact on protein structure (Banoun, 2020). Q57H is present in 62% of the isolates from the USA, 25% of the isolates from Asia, and 10.8% of the isolates from Europe. ORF3a and ORF8 may have a role in host immune responses (Banoun, 2020). Three changes (W45L, L84S, and E92K) were observed in ORF8 in isolates from May and June 2020. L84S is present in Canada (12.8% of isolates), the USA (8.8% of isolates), and Asia (8.3% of isolates). E92K has been reported in less than 0.5% of worldwide isolates and has not been observed since July 2020. ORF8 is important in downregulating MHC-I molecules, thus protecting infected cells from cytotoxic T-cell killing (Banoun, 2020). In the N gene, the important mutations observed in Pakistani isolates are S202N, R203K, G204R, and S327L. Among the sequences reported worldwide, S202N is present in 2.7% of Asian and 4.2% of African isolates, but it has disappeared since July 2020. R203K and G204R co-occur in the sequences reported worldwide. These two mutations are also among the most prevalent changes in the SARS-CoV-2 genome, present in 62% of isolates in America, 53% of isolates in Africa, 46% of isolates in Europe, and 29% of isolates in Asia. The N protein contributes to viral genome assembly and serves as a viral suppressor of RNAi to antagonize the host immune defense system (Liu et al., 2020). In ORF10, only one change, V30L, was observed, in one isolate from Pakistan.
V30L has not been observed in sequences from the USA and Asia since August 2020, while it is still appearing in European sequences. Since ORF10 is a protein novel to SARS-CoV-2, few data are available on its role in viral pathogenesis (Cagliani et al., 2020). The I-Mutant analysis suggests that all the reported variations decrease protein stability, with the exception of P2965L, N75K, S202N, and S327L, which are predicted to increase the stability of the nsp4, S, and N proteins, respectively (Table 1). Pakistan is now experiencing a second wave of COVID-19. Therefore, more SARS-CoV-2 sequences are required for effective genomic surveillance and for identifying sequence divergence in Pakistan. The novel mutations in nsp2, nsp4, nsp5, nsp13, S, and ORF3a should be further evaluated in more SARS-CoV-2 genomes from Pakistan. The novel mutation in nsp4 (P2965L) and similar variants reported around the world should be further investigated for their impact on protein stability, as they may affect the development of efficient vaccines and therapeutic solutions.

Data availability

The annotated genomes of SARS-CoV-2 from Pakistan and the sequences used for phylogenetic analysis were retrieved from the Global Initiative on Sharing All Influenza Data (GISAID) (https://www.gisaid.org/). A full list of accession numbers, along with the acknowledgment table, is provided as supplementary file 1.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
  19 in total

1.  Porcine Deltacoronavirus nsp5 Antagonizes Type I Interferon Signaling by Cleaving STAT2.

Authors:  Xinyu Zhu; Dang Wang; Junwei Zhou; Ting Pan; Jiyao Chen; Yuting Yang; Mengting Lv; Xu Ye; Guiqing Peng; Liurong Fang; Shaobo Xiao
Journal:  J Virol       Date:  2017-04-28       Impact factor: 5.103

2.  Coronaviruses: an RNA proofreading machine regulates replication fidelity and diversity. (Review)

Authors:  Mark R Denison; Rachel L Graham; Eric F Donaldson; Lance D Eckerle; Ralph S Baric
Journal:  RNA Biol       Date:  2011-03-01       Impact factor: 4.652

3.  A pneumonia outbreak associated with a new coronavirus of probable bat origin.

Authors:  Peng Zhou; Xing-Lou Yang; Xian-Guang Wang; Ben Hu; Lei Zhang; Wei Zhang; Hao-Rui Si; Yan Zhu; Bei Li; Chao-Lin Huang; Hui-Dong Chen; Jing Chen; Yun Luo; Hua Guo; Ren-Di Jiang; Mei-Qin Liu; Ying Chen; Xu-Rui Shen; Xi Wang; Xiao-Shuang Zheng; Kai Zhao; Quan-Jiao Chen; Fei Deng; Lin-Lin Liu; Bing Yan; Fa-Xian Zhan; Yan-Yi Wang; Geng-Fu Xiao; Zheng-Li Shi
Journal:  Nature       Date:  2020-02-03       Impact factor: 69.504

4.  Coding potential and sequence conservation of SARS-CoV-2 and related animal viruses.

Authors:  Rachele Cagliani; Diego Forni; Mario Clerici; Manuela Sironi
Journal:  Infect Genet Evol       Date:  2020-05-05       Impact factor: 3.342

5.  Identification of potential binders of the main protease 3CLpro of the COVID-19 via structure-based ligand design and molecular modeling.

Authors:  Marina Macchiagodena; Marco Pagliai; Piero Procacci
Journal:  Chem Phys Lett       Date:  2020-04-18       Impact factor: 2.328

6.  SKP2 attenuates autophagy through Beclin1-ubiquitination and its inhibition reduces MERS-Coronavirus infection.

Authors:  Nils C Gassen; Daniela Niemeyer; Doreen Muth; Victor M Corman; Silvia Martinelli; Alwine Gassen; Kathrin Hafner; Jan Papies; Kirstin Mösbauer; Andreas Zellner; Anthony S Zannas; Alexander Herrmann; Florian Holsboer; Ruth Brack-Werner; Michael Boshart; Bertram Müller-Myhsok; Christian Drosten; Marcel A Müller; Theo Rein
Journal:  Nat Commun       Date:  2019-12-18       Impact factor: 14.919

7.  Targeting the Endocytic Pathway and Autophagy Process as a Novel Therapeutic Strategy in COVID-19. (Review)

Authors:  Naidi Yang; Han-Ming Shen
Journal:  Int J Biol Sci       Date:  2020-03-15       Impact factor: 6.580

8.  Evolutionary analysis of SARS-CoV-2: how mutation of Non-Structural Protein 6 (NSP6) could affect viral autophagy.

Authors:  Domenico Benvenuto; Silvia Angeletti; Marta Giovanetti; Martina Bianchi; Stefano Pascarella; Roberto Cauda; Massimo Ciccozzi; Antonio Cassone
Journal:  J Infect       Date:  2020-04-10       Impact factor: 6.072

9.  COVID-2019: The role of the nsp2 and nsp3 in its pathogenesis.

Authors:  Silvia Angeletti; Domenico Benvenuto; Martina Bianchi; Marta Giovanetti; Stefano Pascarella; Massimo Ciccozzi
Journal:  J Med Virol       Date:  2020-02-28       Impact factor: 2.327

10.  Tracking Changes in SARS-CoV-2 Spike: Evidence that D614G Increases Infectivity of the COVID-19 Virus.

Authors:  Bette Korber; Will M Fischer; Sandrasegaram Gnanakaran; Hyejin Yoon; James Theiler; Werner Abfalterer; Nick Hengartner; Elena E Giorgi; Tanmoy Bhattacharya; Brian Foley; Kathryn M Hastie; Matthew D Parker; David G Partridge; Cariad M Evans; Timothy M Freeman; Thushan I de Silva; Charlene McDanal; Lautaro G Perez; Haili Tang; Alex Moon-Walker; Sean P Whelan; Celia C LaBranche; Erica O Saphire; David C Montefiori
Journal:  Cell       Date:  2020-07-03       Impact factor: 66.850
