Complete Genome Sequences of SARS-CoV-2 Strains Detected in Malaysia.

Yoong Min Chong, I-Ching Sam, Sasheela Ponnampalavanar, Sharifah Faridah Syed Omar, Adeeba Kamarulzaman, Vijayan Munusamy, Chee Kuan Wong, Fadhil Hadi Jamaluddin, Han Ming Gan, Jennifer Chong, Cindy Shuan Ju Teh, Yoke Fun Chan.

Abstract

We sequenced four severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomes from Malaysia during the second wave of infection and found unique mutations which suggest local evolution. Circulating Malaysian strains represent introductions from different countries, particularly during the first wave of infection. Genome sequencing is important for understanding local epidemiology.
Copyright © 2020 Chong et al.

Year:  2020        PMID: 32409547      PMCID: PMC7225546          DOI: 10.1128/MRA.00383-20

Source DB:  PubMed          Journal:  Microbiol Resour Announc        ISSN: 2576-098X


ANNOUNCEMENT

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) belongs to the family Coronaviridae and the genus Betacoronavirus and has caused a pandemic of coronavirus disease (COVID-19). As of 31 March 2020, Malaysia had 2,766 confirmed cases with 43 deaths (1). To obtain a preliminary understanding of SARS-CoV-2 molecular epidemiology in Malaysia, we performed complete genome sequencing of SARS-CoV-2 strains collected directly from nasopharyngeal swabs from four patients (186197, 188407, 189332, and 190300) in Kuala Lumpur, Malaysia, between 14 and 22 March 2020 (Fig. 1A).
FIG 1

(A) Cumulative cases in Malaysia until 31 March 2020. There have been two waves of infection, separated by 11 days during which no cases were reported. Cases of the first wave were associated mainly with travel from China and Singapore. About 48% of the cases in the second wave were associated with a religious mass gathering. (B) Maximum likelihood phylogenetic tree built with FastTree and based on the complete genomes of 2,364 sequences available at GISAID on 29 March 2020. The 10 Malaysian genomes are named and comprise six genomes from the first wave (dark blue) and the four genomes sequenced in this study (second wave, blue). The date of sample collection is recorded at the end of each strain name. The tree was midpoint rooted.

Viral RNA was extracted from the samples using a QIAamp viral RNA minikit (Qiagen, Germany) and amplified according to the ARTIC nCoV-2019 protocol (2). Briefly, cDNA was synthesized using a SuperScript IV first-strand synthesis system (Invitrogen, USA) with random hexamers. The ARTIC v1 primers were divided into two pools of 49 primer sets for PCR using Q5 high-fidelity DNA polymerase (NEB, USA). Overlapping amplicons of 400 bp were combined and purified using sample purification beads (SPB) (Illumina, USA), quantified with a Qubit 3.0 fluorometer, and used for library preparation. Nextera DNA Flex libraries were sequenced using iSeq 100 reagent (Illumina) on the iSeq 100 system (Illumina) with output of 2 × 100-bp paired-end reads and 4 million expected paired reads. Sequencing of one strain (186197) was also performed using a Nanopore MinION and ligation sequencing kit (SQK-LSK109) according to the Oxford Nanopore Technologies (ONT) standard protocol (ONT, UK). Briefly, purified amplicons were sequenced in an R9.4 flow cell and run for 30 min. The iSeq raw FastQ files were analyzed using Geneious Prime 2020 (Biomatters, New Zealand). The average number of raw paired reads obtained from iSeq was 1.8 million (Table 1).
Paired reads were trimmed for quality using default parameters and mapped to reference strain Wuhan-Hu-1 (GenBank accession number MN908947) with the Geneious mapper. About 94% of the reads were mapped for each strain except strain 186197 (25.2%). The average depth of coverage for iSeq was about 5,000×, except for strain 186197, which had only about 1,000× coverage (Table 1). Therefore, the consensus sequence for strain 186197 was generated by mapping a combination of iSeq (359,453 paired reads) and MinION (23,390 reads) data with the Geneious mapper using default parameters. The four genome sizes ranged from 29,486 to 29,898 bp, with GC contents of 36.6 to 37.9% (Table 1). Multiple sequence alignment was performed with MAFFT with default parameters (3). Phylogenetic analysis was conducted with FastTree 2.1.11 (4) implemented in Geneious with default parameters using whole genomes available at GISAID (www.gisaid.org), including six other previously deposited Malaysian strains (EPI_ISL_416829, EPI_ISL_416866, EPI_ISL_416884, EPI_ISL_416885, EPI_ISL_416886, and EPI_ISL_416907).
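The mapping figures quoted above can be cross-checked against the read counts in Table 1. The following is a minimal sketch in Python; it is not part of the authors' Geneious workflow, the names `READ_COUNTS`, `percent_mapped`, and `mean` are illustrative, and the counts are copied from Table 1.

```python
# Read counts per strain, copied from Table 1: (raw paired reads, mapped reads).
READ_COUNTS = {
    "186197": (1_467_222, 369_427),
    "188407": (2_057_020, 1_952_563),
    "190300": (1_796_760, 1_647_233),
    "189332": (1_895_160, 1_821_267),
}


def percent_mapped(raw: int, mapped: int) -> float:
    """Return mapped reads as a percentage of raw reads."""
    return 100.0 * mapped / raw


def mean(values) -> float:
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


if __name__ == "__main__":
    for strain, (raw, mapped) in READ_COUNTS.items():
        print(f"{strain}: {percent_mapped(raw, mapped):.1f}% mapped")
    mean_raw = mean(raw for raw, _ in READ_COUNTS.values())
    print(f"mean raw paired reads: {mean_raw / 1e6:.1f} million")
```

Under these inputs, strain 186197 stands out with roughly a quarter of its reads mapped, while the other three strains average about 94%, in line with the values reported in the text.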
TABLE 1

Comparison of nucleotide and amino acid differences among Malaysian strains

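NGS^a (iSeq)

| Malaysian strain | GISAID accession | No. of raw reads | No. of mapped reads | Maximum coverage (×) | Avg coverage (×) | Genome size (bp) | GC content (%) |
|---|---|---|---|---|---|---|---|
| Reference SARS-CoV-2 (MN908947) | | | | | | 29,903 | 38.0 |
| hCoV-19/Malaysia/186197/2020 | EPI_ISL_417919 | 1,467,222 | 369,427 | 73,335 | 1,087 | 29,486 | 36.6 |
| hCoV-19/Malaysia/188407/2020 | EPI_ISL_417918 | 2,057,020 | 1,952,563 | 20,231 | 5,696 | 29,898 | 37.9 |
| hCoV-19/Malaysia/190300/2020 | EPI_ISL_417920 | 1,796,760 | 1,647,233 | 30,677 | 4,986 | 29,865 | 37.6 |
| hCoV-19/Malaysia/189332/2020 | EPI_ISL_417917 | 1,895,160 | 1,821,267 | 17,208 | 5,289 | 29,868 | 37.9 |

Mutation in gene at indicated position^b

| Malaysian strain | GISAID accession | ORF1a | ORF1a | ORF1a | ORF1a | ORF1a | ORF1b | ORF1b | ORF1b | S | M | ORF8 | N | 3′ UTR^c | 3′ UTR^c | 3′ UTR^c | 3′ UTR^c |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Nucleotide position (in the genome) | | 2737 | 6310 | 6312 | 8782 | 11083 | 13730 | 13975 | 19524 | 23929 | 27147 | 28144 | 28311 | 29862 | 29868 | 29869 | 29871 |
| Codon position (in the protein) | | 824 | 2015 | 2016 | 2839 | 3606 | 88 | | 2019 | 789 | 209 | 84 | 13 | | | | |
| Reference SARS-CoV-2 (MN908947) | | T (T) | C (S) | C (T) | C (S) | G (L) | C (A) | G (G) | C (L) | C (Y) | G (D) | T (L) | C (P) | G | T | G | C |
| hCoV-19/Malaysia/MKAK-CL-2020-5045/2020 | EPI_ISL_416829 | T (T) | C (S) | C (T) | C (S) | G (L) | C (A) | G (G) | C (S) | C (Y) | C (H) | T (L) | C (P) | G | T | G | C |
| hCoV-19/Malaysia/MKAK-CL-2020-5049/2020 | EPI_ISL_416884 | T (T) | C (S) | C (T) | C (S) | G (L) | C (A) | G (G) | C (S) | C (Y) | C (H) | T (L) | C (P) | G | T | G | C |
| hCoV-19/Malaysia/MKAK-CL-2020-5047/2020 | EPI_ISL_416866 | T (T) | C (S) | C (T) | C (S) | G (L) | C (A) | G (G) | C (S) | C (Y) | C (H) | T (L) | C (P) | G | T | G | C |
| hCoV-19/Malaysia/MKAK-CL-2020-6430/2020 | EPI_ISL_416886 | T (T) | C (S) | C (T) | T (S) | G (L) | C (A) | G (G) | C (S) | C (Y) | G (D) | C (S) | T (L) | G | T | G | C |
| hCoV-19/Malaysia/MKAK-CL-2020-5096/2020 | EPI_ISL_416885 | T (T) | C (S) | C (T) | T (S) | G (L) | C (A) | G (G) | C (S) | C (Y) | G (D) | C (S) | T (L) | G | T | G | C |
| hCoV-19/Malaysia/MKAK-CL-2020-7554/2020 | EPI_ISL_416907 | T (T) | C (S) | C (T) | C (S) | G (L) | C (A) | G (G) | C (S) | C (Y) | G (D) | T (L) | T (L) | G | T | G | C |
| hCoV-19/Malaysia/186197/2020 | EPI_ISL_417919 | A (T) | C (S) | C (T) | C (S) | G (L) | C (A) | G (G) | C (S) | C (Y) | G (D) | T (L) | T (L) | G | A | A | A |
| hCoV-19/Malaysia/188407/2020 | EPI_ISL_417918 | T (T) | A (R) | A (K) | C (S) | T (F) | T (V) | G (G) | T (S) | T (Y) | G (D) | T (L) | T (L) | G | A | A | A |
| hCoV-19/Malaysia/190300/2020 | EPI_ISL_417920 | T (T) | C (S) | A (K) | C (S) | T (F) | T (V) | G (G) | T (S) | T (Y) | G (D) | T (L) | T (L) | A | A | A | A |
| hCoV-19/Malaysia/189332/2020 | EPI_ISL_417917 | A (T) | A (R) | A (K) | C (S) | T (F) | T (V) | A (S) | T (S) | T (Y) | G (D) | T (L) | T (L) | A | A | A | A |

^a NGS, next-generation sequencing.

^b Amino acids are denoted in parentheses; the four genomes of the second wave (sequenced in the present study) are italicized. Unique mutations found only in Malaysian strains (as of 29 March 2020) are bold.

^c UTR, untranslated region.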
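The relationship between the nucleotide positions and the codon positions in Table 1 can be illustrated with a short Python sketch. It assumes the 1-based ORF start coordinates from the GenBank annotation of the reference genome (MN908947); these coordinates are not given in the article, and the names `ORF_START` and `codon_position` are hypothetical.

```python
# 1-based start coordinate of each coding region on the reference genome.
# Assumption: values taken from the GenBank annotation of MN908947,
# not from the article itself.
ORF_START = {
    "ORF1a": 266,
    "ORF1b": 13468,  # reading frame used after the ribosomal frameshift
    "S": 21563,
    "M": 26523,
    "ORF8": 27894,
    "N": 28274,
}


def codon_position(gene: str, nt_position: int) -> int:
    """Return the 1-based codon number that contains a genomic position."""
    offset = nt_position - ORF_START[gene]
    if offset < 0:
        raise ValueError(f"position {nt_position} lies upstream of {gene}")
    return offset // 3 + 1


if __name__ == "__main__":
    # (gene, nucleotide position) pairs listed in Table 1.
    for gene, pos in [("ORF1a", 2737), ("ORF1b", 19524), ("S", 23929), ("N", 28311)]:
        print(f"{gene} nt {pos} -> codon {codon_position(gene, pos)}")
```

With these assumed coordinates, the computed codon numbers agree with the codon positions listed in Table 1 for the same nucleotide positions.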

The four complete genome sequences reported here date from the main second wave of infections in Malaysia (Fig. 1A). Strain 188407 was linked to a religious mass gathering, which has been associated with 48% of national cases, and it clusters with strains from Japan, Australia, and Saudi Arabia (Fig. 1B). Strain 189332 clusters with strain 188407, but the patient from whom it was isolated had no clear link to the gathering. This suggests that the strains associated with the gathering have established community transmission. The person with strain 186197 had travelled to Vietnam, while strain 190300, from a patient with no history of travelling or attending gatherings, clustered with strains from Europe (Fig. 1B).

Compared to reference strain Wuhan-Hu-1, the Malaysian sequences have 16 nucleotide substitutions (Table 1). Four substitutions in the nonstructural region (ORF1a-T2737A, ORF1a-C6310A, ORF1b-G13975A, and ORF1b-C19524T) are unique to Malaysia, suggesting a degree of local evolution. Our data show that the currently circulating strains in Malaysia represent both introductions from different countries and local evolution. More genomic data will clarify virus spread in Malaysia, particularly with respect to the role played by the mass gathering.
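The substitution count can be reproduced from the nucleotides listed in Table 1. The sketch below is illustrative only: it compares just the 16 tabulated positions rather than full genome alignments, and the names `POSITIONS`, `REFERENCE`, `STRAINS`, `substitutions`, and `variable_positions` are hypothetical.

```python
# Genomic positions listed in Table 1, in column order.
POSITIONS = [2737, 6310, 6312, 8782, 11083, 13730, 13975, 19524,
             23929, 27147, 28144, 28311, 29862, 29868, 29869, 29871]

# Nucleotide at each listed position, one character per column of Table 1.
REFERENCE = "TCCCGCGCCGTCGTGC"  # Wuhan-Hu-1 (MN908947)
STRAINS = {
    "EPI_ISL_416829": "TCCCGCGCCCTCGTGC",
    "EPI_ISL_416884": "TCCCGCGCCCTCGTGC",
    "EPI_ISL_416866": "TCCCGCGCCCTCGTGC",
    "EPI_ISL_416886": "TCCTGCGCCGCTGTGC",
    "EPI_ISL_416885": "TCCTGCGCCGCTGTGC",
    "EPI_ISL_416907": "TCCCGCGCCGTTGTGC",
    "EPI_ISL_417919": "ACCCGCGCCGTTGAAA",  # strain 186197
    "EPI_ISL_417918": "TAACTTGTTGTTGAAA",  # strain 188407
    "EPI_ISL_417920": "TCACTTGTTGTTAAAA",  # strain 190300
    "EPI_ISL_417917": "AAACTTATTGTTAAAA",  # strain 189332
}


def substitutions(sequence: str, reference: str = REFERENCE) -> list:
    """List substitutions relative to the reference, e.g. 'T2737A'."""
    return [f"{ref}{pos}{alt}"
            for pos, ref, alt in zip(POSITIONS, reference, sequence)
            if ref != alt]


def variable_positions(strains: dict) -> list:
    """Sorted positions at which at least one strain differs from the reference."""
    return sorted({pos
                   for seq in strains.values()
                   for pos, ref, alt in zip(POSITIONS, REFERENCE, seq)
                   if ref != alt})


if __name__ == "__main__":
    for accession, seq in STRAINS.items():
        print(accession, ", ".join(substitutions(seq)))
    print("variable positions:", len(variable_positions(STRAINS)))
```

Across the ten Malaysian genomes, every one of the 16 tabulated positions differs from the reference in at least one strain, which matches the 16 substitutions stated in the text.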

Data availability.

These sequences have been deposited in the GISAID EpiCoV newly emerging coronavirus SARS-CoV-2 platform under identifiers EPI_ISL_417917 to EPI_ISL_417920. The sequences were also deposited in the following NCBI databases: GenBank (accession numbers MT372480 to MT372483), BioProject (PRJNA616147), BioSample (SAMN14483189, SAMN14483190, SAMN14596408, and SAMN14596409), and SRA (SRR11514750, SRR11514749, SRR11542244, and SRR11542243 [Illumina raw reads] and SRR11547279 [Nanopore raw reads]).
REFERENCES

3.  MAFFT multiple sequence alignment software version 7: improvements in performance and usability.

Authors:  Kazutaka Katoh; Daron M Standley
Journal:  Mol Biol Evol       Date:  2013-01-16

4.  FastTree 2--approximately maximum-likelihood trees for large alignments.

Authors:  Morgan N Price; Paramvir S Dehal; Adam P Arkin
Journal:  PLoS One       Date:  2010-03-10