Abozar Ghorbani, Samira Samarfard, Amin Ramezani, Keramatollah Izadpanah, Alireza Afsharifar, Mohammad Hadi Eskandari, Thomas P Karbanowicz, Jonathan R Peters.
Abstract
A novel coronavirus related to severe acute respiratory syndrome virus (SARS-CoV-2) is the causal agent of the COVID-19 pandemic. Although genetic mutations across the SARS-CoV-2 genome have recently been investigated, its transcriptomic genetic polymorphisms at the inter-host level and the viral gene expression level of each Open Reading Frame (ORF) remain unclear. Using available High Throughput Sequencing (HTS) data from SARS-CoV-2-infected human transcriptomes, this study presents a high-resolution map of SARS-CoV-2 single nucleotide polymorphism (SNP) hotspots in a viral population at the inter-host level. Four throat swab samples from COVID-19 patients, with RNA-Seq reads retrieved from the NCBI SRA, were pooled to detect 21 SNPs and one replacement across the SARS-CoV-2 genomic population. Twenty-two RNA modification sites on viral transcripts were identified that may cause inter-host genetic diversity of this virus. In addition, the canonical genomic RNAs of the N ORF showed higher expression in transcriptomic data and reverse transcriptase quantitative PCR compared to other SARS-CoV-2 ORFs, indicating the importance of this ORF in virus replication or other major functions in the virus life cycle. Phylogenetic and ancestral sequence analyses based on the entire genome revealed that SARS-CoV-2 is possibly derived from a recombination event between SARS-CoV and Bat SARS-like CoV. Ancestor analysis of the isolates from different locations, including Iran, suggests shared Chinese ancestry. These results point to the importance of potential inter-host genetic variations for the evolution of SARS-CoV-2 and the formation of viral quasi-species. The RNA modifications discovered in this study may cause amino acid sequence changes in the polyprotein, spike protein, product of ORF8 and nucleocapsid (N) protein, offering further insight into the functional impacts of mutations on the life cycle and pathogenicity of SARS-CoV-2.
Keywords: Inter-host; Nucleocapsid; Open reading frame; Phylogenetic; SARS-CoV-2; Single nucleotide polymorphism
Year: 2020 PMID: 32937193 PMCID: PMC7487081 DOI: 10.1016/j.meegid.2020.104556
Source DB: PubMed Journal: Infect Genet Evol ISSN: 1567-1348 Impact factor: 3.342
Raw data statistics of SARS-CoV-2 human infected libraries.
| Accession number | Total reads | Reads mapped to virus | Percentage of mapped-reads |
|---|---|---|---|
| SRR11454606 | 11,336,944 | 3616 | 0.03 |
| SRR11454609 | 17,121,629 | 66,420 | 0.39 |
| SRR11454610 | 14,337,950 | 126,390 | 0.88 |
| SRR11454611 | 1,405,599 | 3383 | 0.24 |
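The percentage column follows directly from the two count columns. As a minimal sketch, the calculation can be reproduced in Python with the counts copied from the table above; the names `libraries` and `mapped_percentage` are illustrative and do not come from the study.

```python
# Read counts copied from the table above:
# accession -> (total reads, reads mapped to virus)
libraries = {
    "SRR11454606": (11_336_944, 3_616),
    "SRR11454609": (17_121_629, 66_420),
    "SRR11454610": (14_337_950, 126_390),
    "SRR11454611": (1_405_599, 3_383),
}


def mapped_percentage(total_reads: int, mapped_reads: int) -> float:
    """Percentage of a library's reads that mapped to the viral genome."""
    return 100.0 * mapped_reads / total_reads


# Round to two decimals, matching the precision used in the table.
percentages = {
    accession: round(mapped_percentage(total, mapped), 2)
    for accession, (total, mapped) in libraries.items()
}

for accession, pct in percentages.items():
    print(f"{accession}: {pct:.2f}%")
```

In every library fewer than 1% of reads map to the virus, which is why the viral read counts are small relative to library size.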
Primers and probes that were used for RT-qPCR.
| Primers and Probes | Target | Sequence | Amplicon length (bp) |
|---|---|---|---|
| RdRP_F | RNA-dependent RNA polymerase | GTCTCTATAGAAATAGAGATGTTGACACA | 134 |
| RdRP_R | | ACCTTGAGATGCATAAGTGCTATTGA | |
| RdRP_P | | FAM-AATGATGATACTCTCTGACGATGCT-BHQ | |
| N gene_F | Nucleoprotein ORF | GACCCCAAAATCAGCGAAAT | 72 |
| N gene_R | | TCTGGTTACTGCCAGTTGAATCTG | |
| N gene_P | | FAM-ACCCCGCATTACGTTTGGTGGACC-BHQ | |
| RNase P_F | Human Ribonuclease P (Internal control) | AGATTTGGACCTGCGAGCG | 65 |
| RNase P_R | | GAGCGGCTGTCTCCACAAGT | |
| RNase P_P | | ROX-TTCTGACCTGAAGGCTCTGCGCG-BHQ | |
Single-nucleotide polymorphisms (SNPs) among a population of human SARS-CoV-2 genome RNA-Seq reads.
| Gene | Reference Position | Type of Variation | Reference | Allele | Frequency (%) | Amino acid Change |
|---|---|---|---|---|---|---|
| Poly_protein | 885 | SNP | C | T | 13.17 | − |
| Poly_protein | 2910 | SNP | A | G | 5.05 | − |
| Poly_protein | 5122 | SNP | G | A | 5.33 | + |
| Poly_protein | 6645 | SNP | T | C | 5.06 | − |
| Poly_protein | 7002 | SNP | C | T | 6.95 | − |
| Poly_protein | 8094 | SNP | T | C | 8.86 | − |
| Poly_protein | 8517 | SNP | C | T | 99.80 | − |
| Poly_protein | 9512 | SNP | C | T | 5.74 | + |
| Poly_protein | 10,843 | SNP | G | A | 5.06 | + |
| Poly_protein | 11,206 | SNP | C | T | 5.23 | + |
| Poly_protein | 11,876 | SNP | C | A | 25.25 | + |
| Poly_protein | 21,199 | SNP | G | C | 6.12 | − |
| Poly_protein | 21,209 | SNP | T | G | 5.76 | − |
| Poly_protein | 21,220 | Replacement | AG | T | 6.45 | − |
| Poly_protein | 21,258 | SNP | T | A | 5.47 | − |
| Spike_protein | 24 | SNP | G | C | 6.25 | + |
| Spike_protein | 2254 | SNP | C | T | 38.54 | + |
| Spike_protein | 2464 | SNP | C | T | 7.96 | + |
| Spike_protein | 3017 | SNP | C | T | 13.77 | + |
| ORF8 | 184 | SNP | G | C | 6.58 | + |
| ORF8 | 251 | SNP | T | C | 98.92 | + |
| N_protein | 610 | SNP | G | A | 6.43 | + |
SARS-CoV-2: severe acute respiratory syndrome coronavirus 2.
ORF: open reading frame.
"+": amino acid changed; "−": amino acid unchanged.
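The variant table can be summarized per ORF with a short tally. The sketch below copies the 22 rows from the table and counts variants per gene, amino-acid-changing variants, and variants that are close to fixed in the pooled reads. The 95% cutoff for "near-fixed" is an assumption made for illustration, not a threshold stated in the study, and the variable names are hypothetical.

```python
from collections import Counter

# Rows copied from the table above:
# (gene, reference position, frequency in %, amino acid changed?)
variants = [
    ("Poly_protein", 885, 13.17, False),
    ("Poly_protein", 2910, 5.05, False),
    ("Poly_protein", 5122, 5.33, True),
    ("Poly_protein", 6645, 5.06, False),
    ("Poly_protein", 7002, 6.95, False),
    ("Poly_protein", 8094, 8.86, False),
    ("Poly_protein", 8517, 99.80, False),
    ("Poly_protein", 9512, 5.74, True),
    ("Poly_protein", 10843, 5.06, True),
    ("Poly_protein", 11206, 5.23, True),
    ("Poly_protein", 11876, 25.25, True),
    ("Poly_protein", 21199, 6.12, False),
    ("Poly_protein", 21209, 5.76, False),
    ("Poly_protein", 21220, 6.45, False),  # the single "Replacement" (AG -> T)
    ("Poly_protein", 21258, 5.47, False),
    ("Spike_protein", 24, 6.25, True),
    ("Spike_protein", 2254, 38.54, True),
    ("Spike_protein", 2464, 7.96, True),
    ("Spike_protein", 3017, 13.77, True),
    ("ORF8", 184, 6.58, True),
    ("ORF8", 251, 98.92, True),
    ("N_protein", 610, 6.43, True),
]

# Number of variant sites per gene.
per_gene = Counter(gene for gene, _, _, _ in variants)

# Number of amino-acid-changing variant sites per gene.
changing = Counter(gene for gene, _, _, aa_change in variants if aa_change)

# Assumed cutoff (not from the study): treat >= 95% frequency as near-fixed.
NEAR_FIXED_PCT = 95.0
near_fixed = [(gene, pos) for gene, pos, freq, _ in variants if freq >= NEAR_FIXED_PCT]

print(dict(per_gene))
print(dict(changing))
print(near_fixed)
```

Under these assumptions, 12 of the 22 sites change an amino acid, and two sites (polyprotein position 8517 and ORF8 position 251) are present in nearly all reads, while the remaining sites occur at minority frequencies.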
Fig. 1. Frequency of the single-nucleotide polymorphism (SNP) positions on the ORFs of the SARS-CoV-2 genome in human RNA-Seq reads.
Fig. 2. Expression levels (reads per kilobase of transcript per million mapped reads, RPKM) of SARS-CoV-2 ORFs in RNA-Seq data from infected patients. The mean and standard deviation of four biological replicates are shown. One-way ANOVA (p < 0.01) was used for statistical analysis.
Fig. 3. Expression levels of the RdRP and N ORFs, measured by RT-qPCR, in 70 patients infected with SARS-CoV-2. Data were compared using a t-test.
Fig. 4. Phylogenetic analysis of full-length genomes of the SARS-CoV-2 Iranian isolate and isolates from different locations. □: European isolates. ■: South American isolates. ●: Asian isolates. ♦: African isolates. ▲: North American isolates. SARS and bat-SARS-like are out-groups.
Fig. 5. Ancestor analysis of full-length genomes of the SARS-CoV-2 Iranian isolate and isolates from different locations. SARS and bat-SARS-like are out-groups.
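The RPKM unit named in the Fig. 2 caption normalizes a read count by both feature length and sequencing depth. The following is a minimal sketch of the standard RPKM formula; the function name and the example numbers are hypothetical and are not values from the study.

```python
def rpkm(reads_on_feature: int, feature_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = reads_on_feature * 1e9 / (feature_length_bp * total_mapped_reads)
    """
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("feature length and total mapped reads must be positive")
    return reads_on_feature * 1e9 / (feature_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 1,000 bp ORF in a library
# with 1,000,000 mapped reads.
example = rpkm(500, 1_000, 1_000_000)
print(example)
```

Dividing by feature length keeps long ORFs from appearing more highly expressed simply because they collect more reads, and dividing by the number of mapped reads makes libraries of different depth comparable.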