Understanding the possible origin and genotyping of the first Bangladeshi SARS-CoV-2 strain.

A S M Rubayet Ul Alam¹, M Rafiul Islam², M Shaminur Rahman², Ovinu Kibria Islam¹, M Anwar Hossain¹.

Abstract

The novel coronavirus, SARS-CoV-2, has caused the most unfathomable pandemic in the history of humankind. Bangladesh is also a victim of this critical situation. To investigate the genomic features of the pathogen from Bangladesh, the first complete genome of the virus has very recently been published. Therefore, long-awaited questions regarding the possible origin and typing of the strain(s) can now be answered. Here, we mainly discuss the published reports and online-accessed data (results) on those issues and present a comprehensive picture of the typing of the virus alongside the probable origin of the subclade containing the Bangladeshi strain. Our observations suggest that this strain might have originated from the United Kingdom or from other European countries epidemiologically linked to the United Kingdom. According to different genotyping classification schemes, this strain belongs to the A2a clade under the G major clade, is of B and/or L type, and is a SARS-CoV-2a substrain. In the future, randomized genomic data will certainly increase in Bangladesh; however, because of globalization and immigrant movement, we urgently need a mass regional sequencing approach targeting the partial or complete genome that can link the epidemiological data and may help in further clinical intervention.

Keywords: Bangladesh; Evolution; Genome; SARS-CoV-2; Unique Mutation; Virus Classification

Year: 2020 PMID: 32492206 PMCID: PMC7300542 DOI: 10.1002/jmv.26115

Source DB: PubMed Journal: J Med Virol ISSN: 0146-6615 Impact factor: 20.693

The very first complete genome of SARS‐CoV‐2 from Bangladesh (hCoV‐19/Bangladesh/CHRF_nCOV19_0001/2020, GISAID accession no. EPI_ISL_437912) was published by the Child Health Research Foundation (CHRF) on 12 May 2020. Since then, the sequence has been warmly received by the national bioscience community, and many researchers are conducting bioinformatics analyses to gain insight into the characteristics of the viruses circulating in Bangladesh. This opinion first sets out our scientific thoughts on this genome and then reviews, point by point, the related bioinformatics reports published or openly accessible elsewhere, together with future prospects.

First, Bangladesh reported its first case of COVID‐19, linked to travel from Italy, on 8 March 2020. The first complete sequence is of a virus strain sampled from a 22‐year‐old female patient on 18 April 2020; although it is not stated, the patient possibly had no recent travel history. Phylodynamic data from Nextstrain (https://nextstrain.org/) showed that the time to most recent common ancestor (tMRCA) of this Bangladeshi strain was 14 March 2020, with an interval spanning 7 March to 23 March 2020 (Figure 1). In the maximum clade credibility (MCC) tree, the closest relative of hCoV‐19/Bangladesh/CHRF_nCOV19_0001/2020 was Wales/PHWC‐277D1/2020 until 14th April. The other closely related strains within the respective subclade are also mostly from European countries (Figure 1).

Figure 1

MCC tree showing the tMRCA and possible origin of first Bangladeshi strain. Image was taken from the Nextstrain, link https://nextstrain.org/ncov/global?f_country=Bangladesh

It should be noted that the nearest origin and tMRCA of the Bangladeshi strain can change, because this is a dynamic, evolving tree that is continuously updated as sequence numbers increase and that, to give a more equitable global sequence distribution, hides some of the samples available from regions that are generating large amounts of genomic data. Nevertheless, the origin of this subclade points toward the United Kingdom, indicating that the Bangladeshi strain may have come from what was once the epicenter of this pandemic. Very likely, the sequence is not directly related to the first three cases reported in Bangladesh: among those infected persons, two men had returned from Italy and the third, a female, had close contact with them. It can also be noted that on 14th March the Bangladesh Government banned flights carrying passengers from all European countries except the United Kingdom. Even on 10th April, the Civil Aviation Authority reported that all passenger flights on domestic and international routes remained suspended, except a few with China and the United Kingdom.

Secondly, the sequence shows 99.99% identity to strains from European, Arabian, and Asian countries (Table 1 lists the representative countries with the strain IDs), based on a good number of aligned sequences from the NCBI and GISAID databases. On this point, we should keep in mind that distinct phylogenetic analyses using different parameters within the selected models can give ambiguous monophyletic results, because a considerable number of sequences share a high level of identity with the Bangladeshi strain. For example, one neighbor‐joining tree may predict a close relation of the virus to strain(s) from Greece, whereas another tree may cluster it with other European, or even Asian or American, strains.
Besides this, maximum likelihood can give a misleading tree, because the same mutations at identical genomic positions are present among multiple closely related viruses. As Nextstrain or GISAID deals with a large data set (25,246 as of 14th April 2020) and generates an updated picture of the evolutionary timeframe using the maximum clade credibility (MCC) tree, these results can, at this stage, be considered more reliable. However, research studies based on interesting hypotheses, focused targets, and solid rationales are highly encouraged.
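To make the 99.99% identity figure concrete, the following is a minimal sketch, not the method used for the comparison reported here. It computes percent identity over an existing pairwise alignment and converts an identity percentage into an approximate number of differing sites. The function names and the genome length of 29,903 nt (the Wuhan-Hu-1 reference, NC_045512.2) are assumptions introduced for illustration.

```python
# Minimal sketch (assumption: both sequences are already aligned to equal length).
# Gap ("-") and ambiguous ("N") columns are skipped rather than counted as mismatches.

GENOME_LENGTH = 29903  # assumption: length of the Wuhan-Hu-1 reference genome


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity over columns where both sequences have a called base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a in "-N" or b in "-N":
            continue  # uninformative column
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def expected_differences(identity_percent: float, length: int = GENOME_LENGTH) -> int:
    """Approximate number of differing sites implied by an identity percentage."""
    return round(length * (1 - identity_percent / 100))


toy_identity = percent_identity("ACGTACGTAC", "ACGTACGTAT")  # toy data, one mismatch in ten
differences_at_9999 = expected_differences(99.99)
print(toy_identity, differences_at_9999)
```

Under these assumptions, 99.99% identity corresponds to only about three differing nucleotides across the genome, which is consistent with the point above that trees built from such similar sequences can cluster the strain in different ways.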

Table 1

Representative sequence information for the countries

Middle‐East	Asia	Europe	North America
United Arab Emirates (EPI_ISL_443182)	Russia (EPI_ISL_428913)	Latvia (EPI_ISL_437090)	USA (EPI_ISL_444740)
Saudi Arabia (EPI_ISL_435132)	India (MT415323)	Sweden (EPI_ISL_429119)	Mexico (EPI_ISL_412972)
	Sri Lanka (EPI_ISL_428671)	Greece (MT328035)	
	Taiwan (EPI_ISL_426631)	Belgium (EPI_ISL_420433)	

Many researchers have modeled the evolutionary landscape based on the genome‐wide pattern of mutational variation. Considering this literature and online resources, we found that the Bangladeshi strain falls within clade A2a, which carries a mutation at the 614th position of the spike protein changing aspartate to glycine (synonymous with the G clade in the GISAID phylogenetic tree) and another mutation at the 323rd position of NSP12 (proline to leucine). In addition, Brufsky suggested that this type of mutation can alter the heavily glycosylated spike protein, with a possible impact on membrane fusion in tissues and a resulting change in virulence. Another related study showed that this mutation generates a serine protease cleavage site at the S1‐S2 junction of the spike protein that may facilitate the entry of SARS‐CoV‐2 into the host cell. According to Forster et al, the strain can be characterized as a B type virus originating from type A subtype 29095C, which has a linked outward branch toward Europe. This derived B type has particular changes within the genome (NSP4: 8,782C; ORF3a: 26,144G; ORF8: 28,144T) that separate it from the other two types (A and C), and it is usually linked to immunological or environmental resistance against this type outside Asia, especially East Asia. Concomitantly, the Bangladeshi strain belongs to the L type, which is probably more aggressive and can spread quickly, although human interventions have been decreasing the relative frequency of the L type. Notably, under another classification scheme, this is a SARS‐CoV‐2a substrain because of a unique trinucleotide‐bloc mutation in the nucleoprotein (N) coding region (28,881‐28,883: AAC), and the translated G204R (glycine to arginine) mutation places the strain within the GR clade under the major G clade.
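The effect of the trinucleotide‐bloc mutation on the N protein can be checked with a short translation sketch. The block below is illustrative only: the N ORF start (28,274) and the six reference bases at positions 28,880 to 28,885 (AGGGGA) are taken from the Wuhan-Hu-1 reference (NC_045512.2) rather than from this article, and the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

N_ORF_START = 28274        # assumption: first base of the N ORF in NC_045512.2
WINDOW_START = 28880       # assumption: first base of N codon 203
REFERENCE_WINDOW = "AGGGGA"  # assumption: reference bases 28880-28885 (codons 203-204)


def apply_block(window: str, window_start: int, position: int, alt: str) -> str:
    """Replace len(alt) bases starting at genomic `position` inside the window."""
    offset = position - window_start
    return window[:offset] + alt + window[offset + len(alt):]


def codon_changes(ref: str, alt: str, window_start: int, orf_start: int) -> list:
    """List amino acid substitutions between two in-frame windows."""
    changes = []
    for i in range(0, len(ref), 3):
        ref_aa = CODON_TABLE[ref[i:i + 3]]
        alt_aa = CODON_TABLE[alt[i:i + 3]]
        if ref_aa != alt_aa:
            codon_number = (window_start + i - orf_start) // 3 + 1
            changes.append(f"{ref_aa}{codon_number}{alt_aa}")
    return changes


# 28,881-28,883 becomes AAC, as described in the text above.
mutated_window = apply_block(REFERENCE_WINDOW, WINDOW_START, 28881, "AAC")
n_changes = codon_changes(REFERENCE_WINDOW, mutated_window, WINDOW_START, N_ORF_START)
print(mutated_window, n_changes)
```

Because the three changed bases straddle two codons, the single block mutation alters two adjacent residues of N: arginine to lysine at position 203 and glycine to arginine at position 204.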
The resulting amino acid substitutions (lysine and arginine) might reduce the pathogenicity of SARS‐CoV‐2. With respect to ORF3a, the Bangladeshi strain has an identical nucleotide (base “G”) at two mutually exclusive sites (25,563 and 26,144), compared to the “wild‐type” 2g substrain (base “T”), wherein G25563T and G26144T are linked with the emergence of the GH and V (or synonymously type C7) clades, respectively. The resulting nonsynonymous mutations of the 3a protein (Q57H: glutamine→histidine and G251V: glycine→valine), albeit not within any of its six functional domains (I to VI), might have a positive or negative impact on structural stability and binding affinity, possibly influencing disease pathogenesis (i.e. modulating the immunological reaction, notably the 'cytokine storm' in the host) and drug resistance capacity. It was also claimed in the report of Ayub, as of 9th April 2020, that the predominant presence of the SARS‐CoV‐2a substrain in a particular city or country, such as the United Kingdom (26%), Belgium (31%), the Netherlands (50%), and Portugal (60%), can be a cause of reduced cases of COVID‐19. Strikingly, the Bangladeshi strain also has two unique mutations in the ORF1ab region. Among them, mutation E261D (glutamic acid to aspartic acid) in the NSP13 protein (RNA helicase and/or 5′ triphosphatase activity) was found only in one Austrian strain (hCoV‐19/Austria/CeMM0004/2020) collected on 3rd March 2020. Remarkably, the other mutation (I120F: isoleucine to phenylalanine) in NSP2 (predicted role in viral pathogenicity) is not found in any other sequence available in the GISAID database. Both of the unique mutations may not be very significant, considering the similar chemical properties of the amino acids, and will need further molecular investigation. Overall, it is speculated that Bangladesh can be an important source of mixed virus strains.
This may be explained by the return of many immigrants to Bangladesh from other countries, some of which were declared COVID‐19 epicenters. In this context, extensive sequencing across representative districts or zones will give a concrete and comprehensive basis for extracting significant information, as expected in an ongoing outbreak. Moreover, the sequence data should also be correlated with patient history. It is important to note here that complete or partial genome sequencing may not always be needed, because scientists might soon identify crucial epidemiological marker(s) of clinical importance from the existing sequence information, which can also be used to track the presence of viral types alongside important mutations. Targeted sequencing of only those genomic region(s), or part of the segment(s), would then be sufficient to obtain the relevant information and would make it easier to match clinical history and spreading pattern. To conclude, a comprehensive and rational analysis that considers the related virus strains or relevant countries, rather than taking all the strains into the computational run, will bring about a clearer picture, although more genomic data along with patient history from Bangladesh is a prerequisite here.
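The marker‐based typing discussed above can be sketched as a small rule table. This is an illustrative sketch only, built from the markers named in this text (spike D614G for G, N G204R for GR, ORF3a Q57H for GH, ORF3a G251V for V, and the 8,782C / 26,144G / 28,144T combination for the Forster B type). The function names, the "GENE:substitution" string format, and the assumption that GH is nested under G (as in the GISAID nomenclature) are introduced here for illustration and do not represent an official classifier.

```python
# Rule-based sketch; rules are limited to the markers named in the text.

def assign_gisaid_clade(substitutions) -> str:
    """Assign a GISAID-style clade from amino acid substitutions ("GENE:REFposALT")."""
    subs = set(substitutions)
    if "S:D614G" in subs:
        if "N:G204R" in subs:
            return "GR"   # G204R places the strain in GR under the major G clade
        if "ORF3a:Q57H" in subs:
            return "GH"   # assumption: GH is treated as nested under G
        return "G"
    if "ORF3a:G251V" in subs:
        return "V"
    return "unassigned"   # other clades are outside the scope of this sketch


def is_forster_b_type(nt_8782: str, nt_26144: str, nt_28144: str) -> bool:
    """True when the three positions carry the B-type bases given in the text."""
    return (nt_8782.upper(), nt_26144.upper(), nt_28144.upper()) == ("C", "G", "T")


# Substitutions reported above for the first Bangladeshi genome.
bangladesh_substitutions = {
    "S:D614G", "NSP12:P323L", "N:G204R", "NSP13:E261D", "NSP2:I120F",
}
bangladesh_clade = assign_gisaid_clade(bangladesh_substitutions)
bangladesh_is_b_type = is_forster_b_type("C", "G", "T")
print(bangladesh_clade, bangladesh_is_b_type)
```

A lookup of this kind only needs the bases at a handful of positions, which is the practical appeal of the targeted sequencing approach suggested above: a few short amplicons covering the marker sites would be enough to place a sample in these schemes.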

11 in total

1. GISAID: Global initiative on sharing all influenza data - from vision to reality.

Authors: Yuelong Shu; John McCauley
Journal: Euro Surveill Date: 2017-03-30

2. Nextstrain: real-time tracking of pathogen evolution.

Authors: James Hadfield; Colin Megill; Sidney M Bell; John Huddleston; Barney Potter; Charlton Callender; Pavel Sagulenko; Trevor Bedford; Richard A Neher
Journal: Bioinformatics Date: 2018-12-01 Impact factor: 6.931

3. Delicate structural coordination of the Severe Acute Respiratory Syndrome coronavirus Nsp13 upon ATP hydrolysis.

Authors: Zhihui Jia; Liming Yan; Zhilin Ren; Lijie Wu; Jin Wang; Jing Guo; Litao Zheng; Zhenhua Ming; Lianqi Zhang; Zhiyong Lou; Zihe Rao
Journal: Nucleic Acids Res Date: 2019-07-09 Impact factor: 16.971

4. Genome-Wide Identification and Characterization of Point Mutations in the SARS-CoV-2 Genome.

Authors: Jun-Sub Kim; Jun-Hyeong Jang; Jeong-Min Kim; Yoon-Seok Chung; Cheon-Kwon Yoo; Myung-Guk Han
Journal: Osong Public Health Res Perspect Date: 2020-06

5. Understanding the possible origin and genotyping of the first Bangladeshi SARS-CoV-2 strain.

Authors: A S M Rubayet Ul Alam; M Rafiul Islam; M Shaminur Rahman; Ovinu Kibria Islam; M Anwar Hossain
Journal: J Med Virol Date: 2020-09-28 Impact factor: 20.693

6. Phylogenetic network analysis of SARS-CoV-2 genomes.

Authors: Peter Forster; Lucy Forster; Colin Renfrew; Michael Forster
Journal: Proc Natl Acad Sci U S A Date: 2020-04-08 Impact factor: 11.205

7. COVID-2019: The role of the nsp2 and nsp3 in its pathogenesis.

Authors: Silvia Angeletti; Domenico Benvenuto; Martina Bianchi; Marta Giovanetti; Stefano Pascarella; Massimo Ciccozzi
Journal: J Med Virol Date: 2020-02-28 Impact factor: 2.327

8. Distinct viral clades of SARS-CoV-2: Implications for modeling of viral spread.

Authors: Adam Brufsky
Journal: J Med Virol Date: 2020-06-24 Impact factor: 20.693

9. Failure in initial stage containment of global COVID-19 epicenters.

Authors: Veria Khosrawipour; Hien Lau; Tanja Khosrawipour; Piotr Kocbach; Hirohito Ichii; Jacek Bania; Agata Mikolajczyk
Journal: J Med Virol Date: 2020-04-28 Impact factor: 20.693
