Ahmed M El-Shehawi, Saqer S Alotaibi, Mona M Elseehy.
Abstract
The COVID-19 coronavirus has caused a global pandemic that started in December 2019 in Wuhan, China, with no confirmed biological source. Various countries have reported the genomic sequences of different isolates obtained from infected patients, which allowed us to obtain full genomic sequences of 38 isolates. Nucleotide (nt) sequences were aligned using the Clustal Omega multiple alignment service at the EBI website. The nt sequence alignment and phylogenetic relationships revealed that COVID-19 is a new viral strain whose biological source has not yet been detected. The expected orf pattern differed among isolates obtained from the same country or from different countries, as well as from SARS-CoV isolates and bat CoV, suggesting different possibilities for virus-human interaction during infection and for severity. All isolates had the five main orfs (1ab, S, M, N, E), whereas they differed in the expected accessory orfs. With the biological source of COVID-19 still undetected, the role of human endogenous retroviruses (HERVs) in regulating host cell gene expression, or in encoding products that could modulate COVID-19 infection and the spectrum of its symptoms, is discussed. © Allerton Press, Inc. 2020, ISSN 0095-4527, Cytology and Genetics, 2020, Vol. 54, No. 6, pp. 588–604.
Keywords: COVID-19; Human endogenous retroviruses (HERVs); genome; nucleotide sequence alignment
Year: 2021 PMID: 33487779 PMCID: PMC7810191 DOI: 10.3103/S0095452720060031
Source DB: PubMed Journal: Cytol Genet ISSN: 0095-4527 Impact factor: 0.579
Nucleotide sequence identity to the first reported case from China, isolate HZ-1 (Accession no. MT039873.1)
| No | Accession | Isolate | Country | Total score | Query cover, % | Ident, % |
|---|---|---|---|---|---|---|
| 1 | MT019532.1 | IPBCAMS-WH-04 | China | 55 092 | 100 | 100 |
| 2 | MN996528.1 | WIV04 | China | 55 092 | 100 | 100 |
| 3 | MN988668.1 | WHU01 | China | 55 092 | 100 | 100 |
| 4 | NC_045512.2 | Wuhan-Hu-1 | China | 55 092 | 100 | 100 |
| 5 | MT019533.1 | IPBCAMS-WH-05 | China | 55 086 | 100 | 100 |
| 6 | MT019531.1 | IPBCAMS-WH-03 | China | 55 086 | 100 | 100 |
| 7 | MT066176.1 | NTU02 | Taiwan | 55 081 | 100 | 99.99 |
| 8 | MT066175.1 | NTU01 | Taiwan | 55 081 | 100 | 99.99 |
| 9 | MT027064.1 | USA-CA5 | USA | 55 081 | 100 | 99.99 |
| 10 | MN994468.1 | USA-CA2 | USA | 55 081 | 100 | 99.99 |
| 11 | MT027062.1 | USA-CA3 | USA | 55 075 | 100 | 99.99 |
| 12 | MT019529.1 | IPBCAMS-WH-01 | China | 55 075 | 100 | 99.99 |
| 13 | MN985325.1 | USA-WA1 | USA | 55 075 | 100 | 99.99 |
| 14 | MN996530.1 | WIV06 | China | 55 071 | 99 | 100 |
| 15 | LC522974.1 | TY/WK-501 | Japan | 55 070 | 100 | 99.99 |
| 16 | LC522972.1 | KY/V-029 | Japan | 55 070 | 100 | 99.99 |
| 17 | MN997409.1 | USA-AZ1 | USA | 55 070 | 100 | 99.99 |
| 18 | MT039888.1 | USA-MA1 | USA | 55 066 | 100 | 99.98 |
| 19 | MT039887.1 | USA-WI1 | USA | 55 066 | 100 | 99.99 |
| 20 | MT049951.1 | Yunnan-01 | China | 55 064 | 100 | 99.98 |
| 21 | LC522975.1 | TY/WK-521 | Japan | 55 064 | 100 | 99.98 |
| 22 | LC522973.1 | TY/WK-012 | Japan | 55 064 | 100 | 99.98 |
| 23 | MN996529.1 | WIV05 | China | 55 064 | 99 | 99.99 |
| 24 | MN975262.1 | HKU-SZ-005b | H Kong | 55 064 | 100 | 99.98 |
| 25 | MN996531.1 | WIV07 | China | 55 062 | 99 | 99.99 |
| 26 | MN988713.1 | USA-IL1 | USA | 55 062 | 100 | 99.97 |
| 27 | LR757996.1 | Wuhan, genome assembly | China | 55 060 | 99 | 100 |
| 28 | MT019530.1 | IPBCAMS-WH-02 | China | 55 058 | 100 | 99.98 |
| 29 | LR757995.1 | Wuhan, genome assembly | China | 55 057 | 99 | 99.99 |
| 30 | MT044257.1 | USA-IL2 | USA | 55 053 | 100 | 99.98 |
| 31 | MN994467.1 | USA-CA1 | USA | 55 053 | 100 | 99.98 |
| 32 | MT039890.1 | SNU01 | Korea | 55 042 | 100 | 99.97 |
| 33 | LR757998.1 | Wuhan, genome assembly | China | 55 040 | 99 | 99.99 |
| 34 | MN996527.1 | WIV02 | China | 55 025 | 99 | 99.99 |
| 35 | MN938384.1 | HKU-SZ-002a | H Kong | 55 022 | 99 | 99.99 |
| 36 | MT007544.1 | VIC01 | Australia | 55 010 | 100 | 99.96 |
| 37 | MT044258.1 | USA-CA6 | USA | 54 937 | 100 | 99.92 |
| 38 | LC521925.1 | Japan/AI/I-004 | Japan | 54 926 | 100 | 99.91 |
| 39 | MN996532.1 | BatCoV-RaTG13 | China | 48 630 | 99 | 96.11 |
| 40 | AY395003.1 | SARS coronavirus ZS-C | China (2004) | 15 213 | 88 | 82.34 |
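The identity column above is a percentage of matching positions between two aligned sequences. As a minimal sketch of that idea, the hypothetical helper `percent_identity` below counts matches over the alignment columns where neither sequence has a gap. Alignment tools differ in how they treat gaps and ambiguous bases, so this will not necessarily reproduce the values in the table; the input fragments are toy strings, not viral sequence.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns (an assumption; some tools count them)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments: 9 ungapped columns, 8 of them identical.
query = "ATGGCT-ACG"
subject = "ATGGTTAACG"
print(round(percent_identity(query, subject), 2))  # 88.89
```

At genome scale the same arithmetic explains why a handful of substitutions in roughly 29 900 nt still yields identities of 99.9% or higher.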
Fig. 1. Phylogenetic relationship among COVID-19 isolates from different countries.
Summary of detected nucleotide SNPs among COVID-19 isolates
| No | AC # | Country | Isolate | Length, nt | Nt SNPs | No. of SNPs |
|---|---|---|---|---|---|---|
| 1 | LR757995.1 | China | Whole genome | 29 872 | T28129C, C8767T | 2 |
| 2 | LR757996.1 | China | Whole genome | 29 868 | ||
| 3 | LR757998.1 | China | Whole genome | 29 866 | C6943A, T11739A | 2 |
| 4 | MN988668.1 | China | WHU01 | 29 881 | ||
| 5 | MN996527.1 | China | WIV02 | 29 825 | G21292A, A24292G | 2 |
| 6 | MN996528.1 | China | WIV04 | 29 891 | ||
| 7 | MN996529.1 | China | WIV05 | 29 852 | G7004A, A21125G | 2 |
| 8 | MN996530.1 | China | WIV06 | 29 854 | ||
| 9 | MN996531.1 | China | WIV07 | 29 857 | A7988C, C9521T | 2 |
| 10 | MT019529.1 | China | IPBCAMS-WH-01 | 29 899 | A3778G, A8388G, T8987A | 3 |
| 11 | MT019530.1 | China | IPBCAMS-WH-02 | 29 889 | T104A, T111C, T112G, C119G, T120C, G124A | 6 |
| 12 | MT019531.1 | China | IPBCAMS-WH-03 | 29 899 | T6996C | 1 |
| 13 | MT019532.1 | China | IPBCAMS-WH-04 | 29 890 | ||
| 14 | MT019533.1 | China | IPBCAMS-WH-05 | 29 883 | G7866T | 1 |
| 15 | MT039873.1 | China | HZ-1, 1st case | 29 833 | ||
| 16 | NC_045512.2 | China | Wuhan-Hu-1 | 29 903 | ||
| 17 | MT049951.1 | China | Yunnan-01 | 29 903 | C75A, C8782T, G11083C, T21644A, T28144C | 5 |
| 18 | MN985325.1 | USA | USA-WA1 | 29 882 | C8782T, T28144C | 2 |
| 19 | MN994468.1 | USA | USA-CA2 | 29 883 | C17000T, G26144T | 2 |
| 20 | MT027062.1 | USA | USA-CA3 | 29 882 | G614A, A5084G, C28854T | 3 |
| 21 | MT027064.1 | USA | USA-CA5 | 29 882 | C2091T, C21707T | 2 |
| 22 | MT044258.1 | USA | USA-CA6 | 29 858 | Del 508-523, del 671-679 | 2 |
| 23 | MT039888.1 | USA | USA-MA1 | 29 882 | G3518T, C8782T, A17423G, C24034T, C28854T | 5 |
| 24 | MT039887.1 | USA | USA-WI1 | 29 879 | C17373T, del 20298-20300 | 2 |
| 25 | MN988713.1 | USA | USA-IL1 | 29 882 | T490W, C3177Y, C8782Y, C24034Y, T26729Y, G28077S, T28144Y, C28854Y | 8 |
| 26 | MT044257.1 | USA | USA-IL2 | 29 882 | T490A, C3177T, C8782T, C24034T, T26729C, G28077C, T28144C | 7 |
| 27 | MN997409.1 | USA | USA-AZ1 | 29 882 | C8782T, G11083T, T28144C, C29095T | 4 |
| 28 | LC521925.1 | Japan | AI-I-004 | 29 848 | Del 351-374, C18485T, C18485T | 3 |
| 29 | LC522972.1 | Japan | KY-V-029 | 29 878 | G11554T, C15321T, C25807G, C29300T | 4 |
| 30 | LC522973.1 | Japan | TY-WK-012 | 29 878 | C2659T, C8779T, C3789T, C29092T, T28141C | 5 |
| 31 | LC522974.1 | Japan | TY-WK-501 | 29 878 | C2659T, C8779T, C29092T, T28141C | 4 |
| 32 | LC522975.1 | Japan | TY-WK-521 | 29 878 | C2659T, C8779T, C29092T, G29702T, T28141C | 5 |
| 33 | MN938384.1 | H Kong | HKU-SZ-002a | 29 838 | C8750T, C29063T | 2 |
| 34 | MN975262.1 | H Kong | HKU-SZ-005b | 29 891 | C8782T, C9561T, T15607C, C29095T, T28144C | 5 |
| 35 | MT066175.1 | Taiwan | NTU01 | 29 870 | C8782T, T28144C | 2 |
| 36 | MT066176.1 | Taiwan | NTU02 | 29 870 | A9034G, C9491T | 2 |
| 37 | MT007544.1 | Australia | Australia-VIC01 | 29 893 | T19065C, T22303G, G26144T, del 2974029950 | 4 |
| 38 | MT039890.1 | Korea | SNU01 | 29 903 | G2969T, C6031T, C12115T, T15597C, C20936G, C22224G, G25775T, G26144T, T26354A | 9 |
| Total | | | | 1 135 284 | | 108 |
| SARS-CoV-1 | ||||||
| 39 | AY310120.1 | Germany | SARS-CoV-1-FRA | 29 740 | T18965A, C19084T, C24933T, C26660T, C28268T | 5 |
| 40 | AY323977.2 | Italy | SARS-CoV-1-HSR1 | 29 751 | G27254R | 1 |
| 41 | DQ182595.1 | China | SARS-CoV-1ZJ0301 | 29 706 | Del 1-16, A12965C, T14022A, A14976T, C17478G, T17518A, C22573T | 7 |
| Total | | | | 89 197 | | 13 |
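The SNP column uses a compact notation: reference base, position, alternate base (for example `C8782T`), with deletions written as `del start-end`. The sketch below shows one way such cells could be parsed and tallied to find variants shared between isolates. The function name `parse_variants`, the regular expressions, and the tuple layout are illustrative assumptions, not part of the original analysis; IUPAC ambiguity codes such as `Y` and `W` are accepted because they appear in the table.

```python
import re
from collections import Counter

# Substitution: reference base, 1-based position, alternate base (IUPAC codes allowed).
SUBSTITUTION = re.compile(r"^([ACGTRYSWKMBDHVN])(\d+)([ACGTRYSWKMBDHVN])$")
# Deletion written as "del 508-523" or "Del1-16" (case-insensitive).
DELETION = re.compile(r"^del\s*(\d+)-(\d+)$", re.IGNORECASE)


def parse_variants(cell: str):
    """Split one comma-separated table cell into structured variant tuples."""
    variants = []
    for token in cell.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate trailing commas
        m = SUBSTITUTION.match(token)
        if m:
            variants.append(("sub", int(m.group(2)), m.group(1), m.group(3)))
            continue
        m = DELETION.match(token)
        if m:
            variants.append(("del", int(m.group(1)), int(m.group(2))))
            continue
        raise ValueError(f"unrecognized variant: {token!r}")
    return variants


# Three cells copied from the table above, keyed by accession number.
rows = {
    "MN985325.1": "C8782T, T28144C",
    "MT066175.1": "C8782T, T28144C",
    "MT044258.1": "Del 508-523, del 671-679",
}
counts = Counter(v for cell in rows.values() for v in parse_variants(cell))
shared = [v for v, n in counts.items() if n > 1]
print(shared)  # [('sub', 8782, 'C', 'T'), ('sub', 28144, 'T', 'C')]
```

Raising on unrecognized tokens, rather than skipping them, makes extraction damage in a cell visible instead of silently lowering the count.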
Common coronavirus orfs
| Accession # | orf | Start, nt | End, nt | Length, aa | Function |
|---|---|---|---|---|---|
| YP_009724389.1 | orf1ab | 266 | 21 555 | 7,096 | Polyprotein |
| YP_009725295.1 | orf1a | 266 | 13 483 | 4,405 | Polyprotein |
| YP_009724390.1 | orfS | 21 563 | 25 384 | 1,273 | Surface glycoprotein |
| YP_009724392.1 | orfE | 26 245 | 26 472 | 75 | Envelope protein |
| YP_009724397.2 | orfN (orf9) | 28 274 | 29 533 | 419 | Nucleocapsid phosphoprotein |
| YP_009724393.1 | orfM (orf5) | 26 523 | 27 191 | 222 | Membrane glycoprotein |
| YP_009724391.1 | orf3a | 25 393 | 26 220 | 275 | ORF3a protein |
| YP_009724394.1 | orf6 | 27 202 | 27 387 | 61 | ORF6 protein |
| YP_009724395.1 | orf7a | 27 394 | 27 759 | 121 | ORF7a protein |
| YP_009725296.1 | orf7b | 27 756 | 27 887 | 43 | ORF7b protein |
| YP_009724396.1 | orf8 | 27 894 | 28 259 | 121 | ORF8 protein |
| YP_009725255.1 | orf10 | 29 558 | 29 674 | 38 | ORF10 protein |
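The genomic coordinates and amino-acid lengths in this table are tied together by simple arithmetic: an orf spanning `start` to `end` (inclusive) covers `end - start + 1` nucleotides, which is a whole number of codons, and the final codon is a stop that is not translated. The sketch below checks a few rows with a hypothetical helper, `orf_length_aa`. It deliberately leaves out orf1ab, whose translation involves a ribosomal frameshift, so the simple formula does not apply to it.

```python
def orf_length_aa(start_nt: int, end_nt: int) -> int:
    """Protein length for an orf given 1-based inclusive genomic coordinates."""
    span = end_nt - start_nt + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError("coordinates do not span a whole number of codons")
    return span // 3 - 1  # the last codon is the stop codon


# (start, end, length listed in the table) for four rows of the table above.
orfs = {
    "orfS": (21563, 25384, 1273),
    "orfM": (26523, 27191, 222),
    "orfN": (28274, 29533, 419),
    "orf10": (29558, 29674, 38),
}
for name, (start, end, listed) in orfs.items():
    print(name, orf_length_aa(start, end), listed)
```

Each computed length agrees with the listed one, which makes the same check a quick way to spot a mistyped coordinate.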
Accessory orfs produced from polyprotein orf1ab and orf1a
| Accession# | Protein name | Length (aa) | Source orf (1ab or 1a) | Function |
|---|---|---|---|---|
| YP_009725297.1 | nsp1 | 180 | orf1ab, orf1a | Leader protein |
| YP_009725298.1 | nsp2 | 638 | orf1ab, orf1a | – |
| YP_009725299.1 | nsp3 | 1,945 | orf1ab, orf1a | – |
| YP_009725300.1 | nsp4 | 500 | orf1ab, orf1a | – |
| YP_009725301.1 | nsp5 | 306 | orf1ab, orf1a | 3C-like proteinase |
| YP_009725302.1 | nsp6 | 290 | orf1ab, orf1a | – |
| YP_009725303.1 | nsp7 | 83 | orf1ab, orf1a | – |
| YP_009725304.1 | nsp8 | 198 | orf1ab, orf1a | – |
| YP_009725305.1 | nsp9 | 113 | orf1ab, orf1a | – |
| YP_009725306.1 | nsp10 | 139 | orf1ab, orf1a | – |
| YP_009725312.1 | nsp11 | 13 | orf1a | – |
| YP_009725307.1 | nsp12 | 932 | orf1ab | RNA-dependent RNA polymerase |
| YP_009725308.1 | nsp13 | 601 | orf1ab | Helicase |
| YP_009725309.1 | nsp14 | 527 | orf1ab | 3'-to-5' exonuclease |
| YP_009725310.1 | nsp15 | 346 | orf1ab | EndoRNAse |
| YP_009725311.1 | nsp16 | 298 | orf1ab | 2'-O-ribose methyltransferase |
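The nsp lengths in this table can be cross-checked against the polyprotein lengths listed for orf1a and orf1ab (4405 and 7096 aa). The sketch below sums the cleavage products under one stated assumption: nsp1 to nsp11 make up the orf1a polyprotein, while nsp1 to nsp10 plus nsp12 to nsp16 make up the orf1ab polyprotein. The variable names are illustrative.

```python
# Lengths in amino acids, copied from the table above.
nsp_lengths = {
    "nsp1": 180, "nsp2": 638, "nsp3": 1945, "nsp4": 500,
    "nsp5": 306, "nsp6": 290, "nsp7": 83, "nsp8": 198,
    "nsp9": 113, "nsp10": 139, "nsp11": 13, "nsp12": 932,
    "nsp13": 601, "nsp14": 527, "nsp15": 346, "nsp16": 298,
}

# Assumption: orf1a yields nsp1..nsp11; orf1ab yields nsp1..nsp10 and nsp12..nsp16.
orf1a_products = [f"nsp{i}" for i in range(1, 12)]
orf1ab_products = [f"nsp{i}" for i in range(1, 11)] + [f"nsp{i}" for i in range(12, 17)]

pp1a_length = sum(nsp_lengths[name] for name in orf1a_products)
pp1ab_length = sum(nsp_lengths[name] for name in orf1ab_products)
print(pp1a_length, pp1ab_length)  # 4405 7096
```

Both sums match the polyprotein lengths in the common-orf table, which is consistent with nsp12 to nsp16 being derived from orf1ab rather than orf1a.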
Summary of predicted ORFs in reported nCoV-2 isolates (+ indicates the presence of orf, – indicates the absence of orf)
| No | AC # | Country | Isolate | bp | orf1ab | orf1a | orfS | orf3a | orfE | orfM | orf6 | orf7a | orf7b | orf8 | orfN | orf10 | Extra: 3b | Extra: 8a | Extra: 8b | Extra: 9b |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | LR757995.1* | China | Whole genome | 29 872 | | | | | | | | | | | | | | | | |
| 2 | LR757996.1* | China | Whole genome | 29 868 | | | | | | | | | | | | | | | | |
| 3 | LR757998.1* | China | Whole genome | 29 866 | | | | | | | | | | | | | | | | |
| 4 | MN988668.1 | China | WHU01 | 29 881 | | | | | | | | | | | | | | | | |
| 5 | MN996527.1 | China | WIV02 | 29 825 | | | | | | | | | | | | | | | | |
| 6 | MN996528.1 | China | WIV04 | 29 891 | | | | | | | | | | | | | | | | |
| 7 | MN996529.1 | China | WIV05 | 29 852 | | | | | | | | | | | | | | | | |
| 8 | MN996530.1 | China | WIV06 | 29 854 | | | | | | | | | | | | | | | | |
| 9 | MN996531.1 | China | WIV07 | 29 857 | | | | | | | | | | | | | | | | |
| 10 | MT019529.1 | China | IPBCAMS-WH-01 | 29 899 | | | | | | | | | | | | | | | | |
| 11 | MT019530.1 | China | IPBCAMS-WH-02 | 29 889 | | | | | | | | | | | | | | | | |
| 12 | MT019531.1 | China | IPBCAMS-WH-03 | 29 899 | | | | | | | | | | | | | | | | |
| 13 | MT019532.1 | China | IPBCAMS-WH-04 | 29 890 | | | | | | | | | | | | | | | | |
| 14 | MT019533.1 | China | IPBCAMS-WH-05 | 29 883 | | | | | | | | | | | | | | | | |
| 15 | MT039873.1 | China | HZ-1, 1st case | 29 833 | | | | | | | | | | | | | | | | |
| 16 | MT039890.1 | S. Korea | SNU01 | 29 903 | | | | | | | | | | | | | | | | |
| 17 | NC_045512.2 | China | Wuhan-Hu-1 | 29 903 | + | + | + | + | + | + | + | + | + | + | + | + | – | – | – | – |
| 18 | MT049951.1 | China | Yunnan-01 | 29 903 | + | + | + | + | + | + | + | + | + | + | + | + | – | – | – | – |
| 19 | MN985325.1 | USA | USA-WA1 | 29 882 | | | | | | | | | | | | | | | | |
| 20 | MN994468.1 | USA | USA-CA2 | 29 883 | | | | | | | | | | | | | | | | |
| 21 | MT027062.1 | USA | USA-CA3 | 29 882 | | | | | | | | | | | | | | | | |
| 22 | MT027064.1 | USA | USA-CA5 | 29 882 | | | | | | | | | | | | | | | | |
| 23 | MT044258.1 | USA | USA-CA6 | 29 858 | | | | | | | | | | | | | | | | |
| 24 | MT039888.1 | USA | USA-MA1 | 29 882 | | | | | | | | | | | | | | | | |
| 25 | MT039887.1 | USA | USA-WI1 | 29 879 | | | | | | | | | | | | | | | | |
| 26 | MN988713.1 | USA | USA-IL1 | 29 882 | | | | | | | | | | | | | | | | |
| 27 | MT044257.1 | USA | USA-IL2 | 29 882 | | | | | | | | | | | | | | | | |
| 28 | MN997409.1 | USA | USA-AZ1 | 29 882 | | | | | | | | | | | | | | | | |
| 29 | LC521925.1 | Japan | AI-I-004 | 29 848 | | | | | | | | | | | | | | | | |
| 30 | LC522972.1 | Japan | KY-V-029 | 29 878 | | | | | | | | | | | | | | | | |
| 31 | LC522973.1 | Japan | TY-WK-012 | 29 878 | | | | | | | | | | | | | | | | |
| 32 | LC522974.1 | Japan | TY-WK-501 | 29 878 | | | | | | | | | | | | | | | | |
| 33 | LC522975.1 | Japan | TY-WK-521 | 29 878 | | | | | | | | | | | | | | | | |
| 34 | MN938384.1 | H Kong | HKU-SZ-002a | 29 838 | | | | | | | | | | | | | | | | |
| 35 | MN975262.1 | H Kong | HKU-SZ-005b | 29 891 | | | | | | | | | | | | | | | | |
| 36 | MT066175.1 | Taiwan | NTU01 | 29 870 | | | | | | | | | | | | | | | | |
| 37 | MT066176.1 | Taiwan | NTU02 | 29 870 | | | | | | | | | | | | | | | | |
| 38 | MT007544.1 | Australia | Australia-VIC01 | 29 893 | | | | | | | | | | | | | | | | |
| 39 | MN996532.1 | China | BatCoV-RaTG13 | 29 855 | | | | | | | | | | | | | | | | |
| 40 | AY310120.1 | Germany | SARS-CoV-1-FRA | 29 740 | | | | | | | | | | | | | | | | |
| 41 | AY323977.2 | Italy | SARS-CoV-1-HSR1 | 29 751 | | | | | | | | | | | | | | | | |
| 42 | DQ182595.1 | China | SARS-CoV-1ZJ0301 | 29 706 | | | | | | | | | | | | | | | | |
*Isolates 1, 2, and 3 have their nt sequences in the nucleotide database without their expected orfs annotated.
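For sequences deposited without annotation, such as isolates 1 to 3, candidate orfs have to be predicted from the nucleotide sequence itself. This section does not state which tool was used for the predictions, so the following is only a minimal sketch of the general idea: scan the three forward reading frames for an ATG followed in frame by a stop codon. The function name `find_orfs` and the `min_codons` threshold are assumptions, and the input is a toy string rather than viral sequence. Real annotation also has to handle features such as ribosomal frameshifting, which this sketch ignores.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 3):
    """Return 1-based inclusive (start, end) pairs for ATG..stop orfs on the forward strand."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the first ATG of the currently open orf, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i + 3 - start) // 3  # includes the stop codon
                if n_codons >= min_codons:
                    orfs.append((start + 1, i + 3))
                start = None
    return sorted(orfs)


# Toy sequence with one orf: ATG AAA TTT GGG TAA at positions 3..17.
toy = "CCATGAAATTTGGGTAACC"
print(find_orfs(toy))  # [(3, 17)]
```

The coordinates returned use the same 1-based inclusive convention as the start and end columns in the common-orf table.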
Fig. 2. Map of the expected orf pattern of four selected COVID-19 isolates and one SARS-CoV isolate compared with the bat BatCoV-RaTG13 isolate. The accession number and isolate name are shown in each map panel.