Pooi Ling Mok1,2,3, Avin Ee-Hwan Koh2, Aisha Farhana1, Abdullah Alsrhani1, Mohammad Khursheed Alam4, Subbiah Suresh Kumar3,5,6,7.
Abstract
COVID-19 is a rapidly emerging infectious disease caused by the SARS-CoV-2 virus, which is currently spreading throughout the world. To date, there are no specific drugs formulated for it, and researchers around the globe are racing against the clock to investigate potential drug candidates. Repurposing existing drugs on the market is an effective and economical strategy commonly used in such investigations. In this study, we used a multiple sequence alignment approach for preliminary screening of commercially available drugs against SARS-CoV-2 sequences from Kingdom of Saudi Arabia (KSA) isolates. The viral genomic sequences from KSA isolates were obtained from GISAID, an open-access repository housing a wide variety of epidemic and pandemic virus data. A phylogenetic analysis of the 164 sequences available from the KSA provinces was carried out using the MEGA X software and showed high similarity among the isolates (around 98%). The sequence was then analyzed using the VIGOR4 genome annotator to construct its genomic structure. Screening of existing drugs was carried out by mining data based on viral gene expressions from the ZINC database. A total of 73 hits were generated. The viral target orthologs were mapped to the SARS-CoV-2 KSA isolate sequence by multiple sequence alignment using CLUSTAL OMEGA, and a list of 29 orthologs with purchasable drug information was generated. The results showed that the SARS-CoV replicase polyprotein 1a had the highest sequence similarity at 79.91%. Through ZINC data mining, tanshinones were found to have high binding affinities to this target. These compounds could be ideal candidates for SARS-CoV-2. Other matches ranged between 27 and 52%. The results of this study are a step towards drug discovery that could increase our chances of finding an effective treatment or prevention for COVID-19.
Keywords: COVID-19; Coronavirus; Multiple sequence alignment; Saudi Arabia; Tanshinones
Year: 2021 PMID: 33551661 PMCID: PMC7845492 DOI: 10.1016/j.sjbs.2021.01.051
Source DB: PubMed Journal: Saudi J Biol Sci ISSN: 2213-7106 Impact factor: 4.219
Fig. 1. The phylogenetic tree of SARS-CoV-2 isolates from Saudi Arabia. The tree shows the evolutionary relationship of all 164 SARS-CoV-2-SA isolates obtained from the GISAID database. The relationship shows a high similarity between the tested samples (97–98%). The figure was generated using the Maximum Likelihood method and Tamura-Nei model in MEGA X.
SARS-CoV-2-SA genomic construction using the Viral Genome ORF Reader, VIGOR4 annotator. The isolate sequence was previously submitted to GISAID database by King Abdullah International Medical Research Center (KAIMRC). The genome consists of ORF1a, ORF1b, Spike, Envelope, Membrane, ORF6, ORF7a, ORF7b, ORF8, Nucleocapsid, and ORF10, respectively. The ORF regions code for the accessory proteins and non-structural proteins, which are involved in virus replication and assembly.
| Location | Gene | Product Name | Reference | Peptide Length | Ref. Length | % Identity | % Similarity | % Coverage |
|---|---|---|---|---|---|---|---|---|
| 272..13489 | ORF1a | ORF1a polyprotein | YP_009725295.1 | 4405 | 4405 | 99.95 | 99.95 | 100.00 |
| 13448..13486 | ORF1a | nsp11 | YP_009725312.1 | 13 | 13 | 100.00 | 100.00 | 100.00 |
| 272..13474, 13474..21561 | ORF1ab | ORF1ab polyprotein | YP_009724389.1 | 7096 | 7096 | 99.96 | 99.96 | 100.00 |
| 272..811 | ORF1ab | leader protein | YP_009725297.1 | 180 | 180 | 100.00 | 100.00 | 100.00 |
| 812..2725 | ORF1ab | nsp2 | YP_009725298.1 | 638 | 638 | 100.00 | 100.00 | 100.00 |
| 2726..8560 | ORF1ab | nsp3 | YP_009725299.1 | 1945 | 1945 | 99.95 | 99.95 | 100.00 |
| 8561..10060 | ORF1ab | nsp4 | YP_009725300.1 | 500 | 500 | 100.00 | 100.00 | 100.00 |
| 10061..10978 | ORF1ab | 3C-like proteinase | YP_009725301.1 | 306 | 306 | 100.00 | 100.00 | 100.00 |
| 10979..11848 | ORF1ab | nsp6 | YP_009725302.1 | 290 | 290 | 99.66 | 100.00 | 100.00 |
| 11849..12097 | ORF1ab | nsp7 | YP_009725303.1 | 83 | 83 | 100.00 | 100.00 | 100.00 |
| 12098..12691 | ORF1ab | nsp8 | YP_009725304.1 | 198 | 198 | 100.00 | 100.00 | 100.00 |
| 12692..13030 | ORF1ab | nsp9 | YP_009725305.1 | 113 | 113 | 100.00 | 100.00 | 100.00 |
| 13031..13447 | ORF1ab | nsp10 | YP_009725306.1 | 139 | 139 | 100.00 | 100.00 | 100.00 |
| 13448..16243 | ORF1ab | RNA-dependent RNA polymerase | YP_009725307.1 | 932 | 932 | 99.89 | 99.89 | 100.00 |
| 16244..18046 | ORF1ab | helicase | YP_009725308.1 | 601 | 601 | 100.00 | 100.00 | 100.00 |
| 18047..19627 | ORF1ab | 3′-to-5′ exonuclease | YP_009725309.1 | 527 | 527 | 100.00 | 100.00 | 100.00 |
| 19628..20665 | ORF1ab | endoRNAse | YP_009725310.1 | 346 | 346 | 100.00 | 100.00 | 100.00 |
| 20666..21559 | ORF1ab | 2′-O-ribose methyltransferase | YP_009725311.1 | 298 | 298 | 100.00 | 100.00 | 100.00 |
| 21569..25390 | S | surface glycoprotein | YP_009724390.1 | 1273 | 1273 | 100.00 | 100.00 | 100.00 |
| 25399..26226 | ORF3a | ORF3a protein | YP_009724391.1 | 275 | 275 | 99.64 | 100.00 | 100.00 |
| 26251..26478 | E | envelope protein | YP_009724392.1 | 75 | 75 | 100.00 | 100.00 | 100.00 |
| 26529..27197 | M | membrane glycoprotein | YP_009724393.1 | 222 | 222 | 100.00 | 100.00 | 100.00 |
| 27208..27393 | ORF6 | ORF6 protein | YP_009724394.1 | 61 | 61 | 100.00 | 100.00 | 100.00 |
| 27400..27765 | ORF7a | ORF7a protein | YP_009724395.1 | 121 | 121 | 100.00 | 100.00 | 100.00 |
| 27762..27893 | ORF7b | ORF7b protein | YP_009725296.1 | 43 | 43 | 100.00 | 100.00 | 100.00 |
| 27900..28265 | ORF8 | ORF8 protein | YP_009724396.1 | 121 | 121 | 100.00 | 100.00 | 100.00 |
| 28280..29539 | N | nucleocapsid phosphoprotein | YP_009724397.2 | 419 | 419 | 99.76 | 99.76 | 100.00 |
| 29564..29680 | ORF10 | ORF10 protein | YP_009725255.1 | 38 | 38 | 100.00 | 100.00 | 100.00 |
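The peptide lengths in the table above follow from the nucleotide coordinates. The following is a minimal sketch of that arithmetic, not VIGOR4's own code. It assumes the coordinates are 1-based and inclusive, that full coding sequences (e.g. ORF1a, S) end in a stop codon, and that mature peptides cleaved from the polyprotein (e.g. nsp11) do not. The function name `peptide_length` is hypothetical.

```python
def peptide_length(location: str, has_stop_codon: bool) -> int:
    """Peptide length in amino acids for a span such as '272..13489'.

    Joined spans ('272..13474, 13474..21561') are summed, which is how the
    ORF1ab ribosomal frameshift appears in the table. Coordinates are assumed
    to be 1-based and inclusive.
    """
    total_nt = 0
    for segment in location.split(","):
        start, end = (int(x) for x in segment.strip().split(".."))
        total_nt += end - start + 1
    if total_nt % 3 != 0:
        raise ValueError(f"span {location!r} is not a whole number of codons")
    codons = total_nt // 3
    # A terminal stop codon is part of the span but encodes no residue.
    return codons - 1 if has_stop_codon else codons


print(peptide_length("272..13489", has_stop_codon=True))     # ORF1a polyprotein
print(peptide_length("13448..13486", has_stop_codon=False))  # nsp11
```

Under these assumptions the two calls reproduce the Peptide Length values listed for the ORF1a polyprotein (4405) and nsp11 (13).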
List of targeted viral genes and the total number of compounds acting on each. Using the ZINC database, a total of 73 viral therapeutic targets were found. Each target entry includes its protein function, subclass, and orthologs, as well as other information. ‘Observations’ refers to the number of individual reports on compounds that were tested on the respective target. ‘Substances’ refers to the number of compounds for the target. ‘Purchasable’ refers to commercially available compounds, and ‘Predicted’ shows the number of candidate compounds predicted by the similarity ensemble approach (SEA) based on ChEMBL 20.
| Name | Description | Sub Class | Orthologs | Observations | Substances | Purchasable | Predicted |
|---|---|---|---|---|---|---|---|
| NA | Neuraminidase | hydrolase | 14 | 977 | 480 | 20 | 23,041 |
| M | Matrix protein 2 | IC-other | 1 | 32 | 32 | 12 | 1025 |
| DPOL_HHV11 | DNA polymerase catalytic subunit | transferase | 1 | 6 | 6 | 4 | 31,646 |
| TK | Thymidine kinase | enzyme-other | 5 | 302 | 147 | 35 | 20,453 |
| TAT_HV112 | Protein Tat | TF-other | 1 | 58 | 49 | 3 | 3374 |
| NS4A | Non-structural protein 4A | protease | 1 | 1 | 1 | 0 | 4879 |
| UL80 | Capsid scaffolding protein | protease | 1 | 208 | 182 | 4 | 176,053 |
| RIR1_HHV11 | Ribonucleoside-diphosphate reductase large subunit | enzyme-other | 1 | 30 | 26 | 0 | 88,152 |
| E2 | Regulatory protein E2 | TF-other | 3 | 1 | 1 | 1 | 2429 |
| UL54 | DNA polymerase catalytic subunit | enzyme-other | 1 | 87 | 80 | 4 | 148,906 |
| KITH_VZVD | Thymidine kinase | enzyme-other | 1 | 9 | 9 | 2 | 697 |
| NS5B | NS5B protein | transferase | 2 | 1186 | 925 | 56 | 213,055 |
| NS3 | Genome polyprotein | protease | 1 | 980 | 927 | 47 | 64,711 |
| UL26 | Capsid scaffolding protein | protease | 2 | 55 | 55 | 1 | 82,085 |
| POLG_HCV1 | Genome polyprotein | protease | 1 | 96 | 95 | 2 | 75,703 |
| PROTEASE | Protease | protease | 2 | 953 | 818 | 51 | 158,148 |
| TAT | Protein Tat | TF-other | 2 | 12 | 12 | 0 | 7728 |
| REV | Protein Rev | cytosolic-other | 1 | 19 | 15 | 1 | 3001 |
| ENV | Envelope glycoprotein gp160 | surface-antigen | 5 | 55 | 51 | 21 | 1,590,740 |
| ORF_36 | Thymidine kinase | enzyme-other | 1 | 0 | 0 | 0 | 0 |
| U38 | DNA polymerase catalytic subunit | enzyme-other | 1 | 12 | 12 | 2 | 243,208 |
| U53 | Capsid scaffolding protein | protease | 1 | 33 | 33 | 0 | 28,153 |
| SCAF_EBVB9 | Capsid scaffolding protein | protease | 1 | 28 | 28 | 0 | 72,426 |
| POL | Pol polyprotein | enzyme-other | 6 | 12,880 | 9001 | 876 | 799,012 |
| GAG-PRO-POL | Gag-Pro-Pol polyprotein | enzyme-other | 1 | 3 | 3 | 2 | 494 |
| POLG_POL1M | Genome polyprotein | protease | 1 | 1 | 1 | 0 | 3037 |
| R1A_CVHSA | Replicase polyprotein 1a | enzyme-other | 1 | 40 | 35 | 13 | 11,327 |
| GAG-PRO | Gag-Pro polyprotein | protease | 1 | 33 | 33 | 0 | 62,841 |
| HA | Hemagglutinin | surface-antigen | 3 | 8 | 6 | 0 | 1051 |
| ABL | Tyrosine-protein kinase transforming protein Abl | kinase | 1 | 1 | 1 | 0 | 10,324 |
| E | Endolysin | enzyme-other | 1 | 0 | 0 | 0 | 0 |
| E1 | Replication protein E1 | enzyme-other | 1 | 32 | 31 | 27 | 233,548 |
| PRIM_HHV11 | DNA primase | nuclear-other | 1 | 7 | 7 | 3 | 1929 |
| VP16_HHV11 | Tegument protein VP16 | nuclear-other | 1 | 2 | 2 | 1 | 55 |
| US28 | G-protein coupled receptor homolog US28 | GPCR-A | 2 | 46 | 14 | 3 | 4752 |
| DPOL_VZVD | DNA polymerase catalytic subunit | enzyme-other | 1 | 8 | 7 | 5 | 64,879 |
| MC087R | MC087R | enzyme-other | 1 | 0 | 0 | 0 | 0 |
| POLG_BVDVC | Genome polyprotein | enzyme-other | 1 | 0 | 0 | 0 | 0 |
| Q86831_AVIMB | Polyprotein II | enzyme-other | 1 | 0 | 0 | 0 | 0 |
| POLG_GBVB | Genome polyprotein | enzyme-other | 1 | 0 | 0 | 0 | 0 |
| POLG_DEN26 | Genome polyprotein | enzyme-other | 1 | 11 | 10 | 0 | 13,876 |
| PSET | Polynucleotide kinase | enzyme-other | 1 | 0 | 0 | 0 | 0 |
| POLG_WNV | Genome polyprotein | enzyme-other | 1 | 71 | 52 | 5 | 191,900 |
| POLG_HCVCO | Genome polyprotein | enzyme-other | 1 | 34 | 34 | 2 | 707 |
| REP | Replicase polyprotein 1ab | enzyme-other | 1 | 9 | 9 | 3 | 9189 |
| POLG_HRV16 | Genome polyprotein | enzyme-other | 1 | 5 | 5 | 1 | 4352 |
| Q82323_9DELA | Protease | unclassified | 1 | 20 | 20 | 0 | 76,070 |
| POLG_HCVBK | Genome polyprotein | enzyme-other | 1 | 7 | 7 | 0 | 2591 |
| V-FPS | Tyrosine-protein kinase transforming protein Fps | kinase | 1 | 1 | 1 | 1 | 0 |
| 30 | DNA ligase | enzyme-other | 1 | 0 | 0 | 0 | 0 |
| GAG | Gag polyprotein | unclassified | 2 | 15 | 15 | 3 | 1050 |
| DPOL_HHV1K | DNA polymerase catalytic subunit | enzyme-other | 1 | 4 | 4 | 2 | 869 |
| 43 | DNA polymerase | enzyme-other | 1 | 0 | 0 | 0 | 0 |
| G | Glycoprotein G | unclassified | 1 | 4 | 4 | 0 | 17,091 |
| UL23 | Thymidine kinase | unclassified | 1 | 8 | 8 | 6 | 4926 |
| RPOL_BPT7 | T7 RNA polymerase | enzyme-other | 1 | 0 | 0 | 0 | 0 |
| HBCAG | External core antigen | unclassified | 1 | 0 | 0 | 0 | 0 |
| THYX_PBCV1 | Probable thymidylate synthase | enzyme-other | 1 | 10 | 10 | 0 | 4316 |
| N1L | Virokine/NFkB inhibitor | unclassified | 1 | 22 | 12 | 8 | 31,528 |
| PA | Polymerase acidic protein | unclassified | 2 | 94 | 51 | 9 | 36,557 |
| Q3ZDS5_9HEPC | NS5B | unclassified | 1 | 6 | 2 | 0 | 2097 |
| UL97 | Phosphotransferase pUL97 | unclassified | 1 | 1 | 1 | 0 | 557 |
| HIVRT | Reverse transcriptase | unclassified | 2 | 294 | 159 | 23 | 24,831 |
| M2 | Matrix protein 2 | unclassified | 1 | 13 | 13 | 3 | 533 |
| UNG | Uracil-DNA glycosylase | enzyme-other | 2 | 4 | 4 | 1 | 1259 |
| E6 | Protein E6 | unclassified | 1 | 2 | 2 | 2 | 0 |
| Q76353_9HIV1 | Integrase | unclassified | 1 | 153 | 146 | 2 | 63,112 |
| POLG_CXB3N | Genome polyprotein | enzyme-other | 1 | 1 | 1 | 0 | 171 |
| R1A_CVHNL | Replicase polyprotein 1a | enzyme-other | 1 | 0 | 0 | 0 | 0 |
| PB2 | Polymerase basic protein 2 | unclassified | 1 | 31 | 31 | 0 | 1789 |
| A0A0K1CY61_9HEPC | Nonstructural protein NS3-4A | unclassified | 1 | 59 | 47 | 6 | 996 |
| HIV1_ENV | GP41 | unclassified | 1 | 1 | 1 | 1 | 35,089 |
| Q91H74_9FLAV | Genome polyprotein | unclassified | 1 | 30 | 25 | 5 | 24,580 |
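Only targets with at least one purchasable compound are carried forward to the alignment step. The filter can be sketched as below, using a handful of rows copied from the table above. The tuple layout and the helper name `with_purchasable` are illustrative assumptions, not part of the ZINC interface.

```python
# (name, description, purchasable) for a few rows of the target table.
targets = [
    ("NA", "Neuraminidase", 20),
    ("NS4A", "Non-structural protein 4A", 0),
    ("R1A_CVHSA", "Replicase polyprotein 1a", 13),
    ("POL", "Pol polyprotein", 876),
    ("HA", "Hemagglutinin", 0),
]


def with_purchasable(rows):
    """Keep targets that have at least one purchasable compound,
    ordered from most to fewest purchasable compounds."""
    kept = [row for row in rows if row[2] > 0]
    return sorted(kept, key=lambda row: row[2], reverse=True)


for name, description, purchasable in with_purchasable(targets):
    print(f"{name}\t{description}\t{purchasable}")
```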
Percent identity matrix of therapeutic targets mapped to the SARS-CoV-2-SA isolate sequence. Of the 73 genes, the orthologs with known purchasable drugs were mapped to the SARS-CoV-2-SA isolate sequence by multiple sequence alignment using CLUSTAL OMEGA. A list of 29 orthologs, their respective similarity indices, and purchasable drug information was then generated. Compounds binding to targets with higher similarity to SARS-CoV-2 could be potential drug candidates for further study.
| Name | Target description | Orthologs | Ortholog ID | Virus | % | Purchasable | Examples |
|---|---|---|---|---|---|---|---|
| REP | Replicase polyprotein 1a | 1 | R1A_CVHSA | SARS-CoV | 79.91 | 13 | Tanshinone |
| E6 | Protein E6 | 1 | VE6_HPV16 | Human papilloma virus | 51.99 | 2 | Myricetin, Morin |
| NA | Neuraminidase | 14 | NRAM_I34A1 | Influenza A virus | 47.76 | 20 | Oseltamivir, Zanamivir, Rapivab |
| E2 | Regulatory protein E2 | 3 | VE2_HPV16 | Human papilloma virus | 47.61 | 1 | Podofilox |
| M | Matrix protein 2 | 1 | M2_I72A2 | Influenza A virus | 45.09 | 12 | Amantadine, Rimantadine |
| NS3 | Genome polyprotein | 1 | A3EZI9_9HEPC | Hepatitis C virus | 40.55 | 47 | Ciluprevir, Victrelis |
| POLG_WNV | Genome polyprotein | 1 | POLG_WNV | West Nile virus | 40.09 | 5 | ZINC3249673 |
| Q76353_9HIV1 | Integrase | 1 | Q76353_9HIV1 | Human immunodeficiency virus 1 | 39.19 | 2 | Raltegravir |
| UL26 | DNA polymerase catalytic subunit | 2 | SCAF_HHV11 | Human herpesvirus 1 | 38.26 | 1 | ZINC3625576 |
| TAT_HV112 | Tat protein | 1 | TAT_HV112 | Human immunodeficiency virus 1 | 37.9 | 1 | ZINC5155 |
| ENV | Envelope polyprotein GP160 | 5 | ENV_HV1H2 | Human immunodeficiency virus 1 | 37.7 | 21 | ZINC1780082 |
| UL54 | DNA polymerase | 1 | DPOL_HCMVA | Human cytomegalovirus | 35.79 | 4 | Foscarnet |
| UL80 | Capsid scaffolding protein | 1 | SCAF_HCMVA | Human cytomegalovirus | 35.69 | 4 | ZINC901466 |
| POL | Pol polyprotein | 6 | P88142_9HIV2 | Human immunodeficiency virus 2 | 35.67 | 876 | Elvucitabine, Trovirdine |
| HIVRT | Reverse transcriptase | 2 | Q06347_9HIV2 | Human immunodeficiency virus 2 | 35.48 | 23 | Efavirenz, Intelence |
| PROTEASE | Protease | 2 | Q4U254_9ENTO | Human rhinovirus | 35.28 | 51 | Pleconaril, Pirodavir |
| PRIM_HHV11 | DNA primase | 1 | PRIM_HHV11 | Human herpesvirus 1 | 35.23 | 3 | ZINC1675992 |
| UL26 | Capsid scaffolding protein | 2 | SCAF_HHV11 | Human herpesvirus 1 | 35.05 | 1 | ZINC3625576 |
| DPOL_VZVD | DNA polymerase catalytic subunit | 1 | DPOL_VZVD | Varicella-zoster virus | 34.45 | 5 | Aphidicolin, Foscarnet |
| E1 | Replication protein E1 | 1 | VE1_HPV11 | Human papilloma virus | 34.39 | 6 | ZINC3600349 |
| DPOL_HHV11 | DNA polymerase catalytic subunit | 1 | DPOL_HHV11 | Human herpesvirus 1 | 34.33 | 4 | Aphidicolin |
| GAG | Gag polyprotein | 2 | GAG_AVIER | Avian erythroblastosis virus | 34.33 | 3 | ZINC6584476 |
| PA | Polymerase acidic protein | 2 | PA_I34A1 | Influenza A virus | 34.02 | 9 | ZINC3626195 |
| POLG_HRV16 | Genome polyprotein | 1 | POLG_HRV16 | Human rhinovirus | 33.87 | 1 | ZINC40975895 |
| UNG | Uracil-DNA glycosylase | 2 | UNG_VACCW | Vaccinia virus | 33.82 | 1 | ZINC359756 |
| US28 | G-protein coupled receptor homolog US28 | 2 | US28_HCMVA | Human cytomegalovirus | 30.89 | 3 | Metitepine |
| VP16_HHV11 | Tegument protein VP16 | 1 | VP16_HHV11 | Human herpesvirus 1 | 30.84 | 1 | ZINC3831128 |
| TK | Thymidine kinase | 5 | KITH_HHV1S | Human herpesvirus 1 | 28.11 | 35 | Brivudine, Sorivudine |
| N1L | Virokine/NFkB inhibitor | 1 | Q49PX0_9POXV | Vaccinia virus | 27.78 | 8 | ZINC1557545 |
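The percentages in the table above are pairwise identities taken from a multiple sequence alignment. As a minimal sketch of one common convention, the function below counts identical residues over the aligned columns where neither sequence has a gap. This is an illustrative assumption and may not match the exact definition CLUSTAL OMEGA uses for its percent identity matrix. The sequences in the usage lines are made-up toy strings, not real viral proteins.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of an alignment.

    Columns where either row has a gap are skipped; the denominator is
    the number of remaining (gap-free) columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy alignment: 6 gap-free columns, 5 of them identical.
print(round(percent_identity("MKT-AYLL", "MRTGA-LL"), 2))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so values computed this way are only comparable when the same denominator is used throughout.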