Computational drug screening against the SARS-CoV-2 Saudi Arabia isolates through a multiple-sequence alignment approach.

Pooi Ling Mok1,2,3, Avin Ee-Hwan Koh2, Aisha Farhana1, Abdullah Alsrhani1, Mohammad Khursheed Alam4, Subbiah Suresh Kumar3,5,6,7.   

Abstract

COVID-19 is a rapidly emerging infectious disease caused by the SARS-CoV-2 virus, which is currently spreading throughout the world. To date, no specific drugs have been formulated for it, and researchers around the globe are racing against the clock to investigate potential drug candidates. Repurposing drugs already on the market is an effective and economical strategy commonly used in such investigations. In this study, we used a multiple-sequence alignment approach for preliminary screening of commercially available drugs against SARS-CoV-2 sequences from Kingdom of Saudi Arabia (KSA) isolates. The viral genomic sequences from KSA isolates were obtained from GISAID, an open-access repository housing a wide variety of epidemic and pandemic virus data. A phylogenetic analysis of the 164 sequences available from the KSA provinces was carried out using the MEGA X software and showed high similarity (around 98%). An isolate sequence was then analyzed using the VIGOR4 genome annotator to construct its genomic structure. Existing drugs were screened by mining the ZINC database for annotated viral gene targets, which generated a total of 73 hits. The viral target orthologs were mapped to the SARS-CoV-2 KSA isolate sequence by multiple sequence alignment using CLUSTAL OMEGA, and a list of 29 orthologs with purchasable drug information was generated. The SARS-CoV replicase polyprotein 1a had the highest sequence similarity at 79.91%; other matches ranged between 27 and 52%. Through ZINC data mining, tanshinones were found to have high binding affinities to this target, so these compounds could be candidates for SARS-CoV-2. The results of this study may contribute to drug discovery efforts and increase our chances of finding an effective treatment or prevention for COVID-19.
© 2021 The Author(s).

Keywords:  COVID-19; Coronavirus; Multiple sequence alignment; Saudi Arabia; Tanshinones

Year:  2021        PMID: 33551661      PMCID: PMC7845492          DOI: 10.1016/j.sjbs.2021.01.051

Source DB:  PubMed          Journal:  Saudi J Biol Sci        ISSN: 2213-7106            Impact factor:   4.219


Introduction

COVID-19 began as a small outbreak that was first reported in Wuhan, Hubei, China, towards the end of 2019. Although the source of the first outbreak could not be pinpointed, the disease began spreading throughout the world. In January 2020, the World Health Organization (WHO) declared COVID-19 a global health emergency, a status shared by only a few other outbreaks, namely H1N1 swine flu (2009), polio (2014), and Ebola (2014). Countries in the Middle East have repeatedly reported infectious disease outbreaks over the past decade. In 2012, the first presumed case of MERS-CoV, a relative of the current SARS-CoV-2 virus, was reported in Saudi Arabia (Chan et al., 2012). The patient developed pneumonia followed by renal failure. The virus, known as HCoV-EMC/2012 at the time, was categorized as a betacoronavirus. Later on, this infection resulted in the second coronavirus (CoV) epidemic (Kandeel et al., 2020, van Boheemen et al., 2012). SARS-CoV-2 is the newest member of this genus and was found to share about 50% similarity with MERS-CoV and about 79% with SARS-CoV (Pal et al., 2020). During the early phases of the COVID-19 epidemic, three initial spreading or seeding patterns were observed based on population movements into and within Saudi Arabia (Memish et al., 2020). Transmission among international pilgrims heading towards the holy sites of Mecca and Medina accounted for the first infection pattern (Ebrahim and Memish, 2020, Memish et al., 2020). The second pattern involved Saudi Shiite pilgrims returning to the Eastern Province. Finally, the third pattern involved general travelers moving in and out of Saudi Arabia (Ebrahim and Memish, 2020, Memish et al., 2020). As of September 2020, Saudi Arabia had reported more than 300,000 cases and almost 4000 deaths (Worldometer, 2020). 
Proactive and intensive responses, especially during the early stages of COVID-19, as well as previous experience in handling MERS-CoV, have largely helped Saudi Arabia to manage local outbreaks (Algaissi et al., 2020, Barry et al., 2020, Obied et al., 2020). In addition, there have been ongoing efforts in the country to curb viral transmission, such as the study of hydroxychloroquine and ivermectin as candidate drug treatments (Kelleni, 2020, Meo et al., 2020). At present, pharmaceutical companies are constantly devising new drugs in the laboratory before proceeding with clinical trials. However, such drug studies are costly and lengthy before new compounds can be brought from bench to bedside. Moreover, more than half of these investigational drugs fail phase 3 clinical trials, resulting in huge losses (Fogel, 2018, Hwang et al., 2016). This process can be expedited economically through computational drug discovery, which uses bioinformatics processes and data mining on large, already available datasets to identify new drug targets and screen existing drugs for pharmaceutical research (Bharath et al., 2020, Ou-Yang et al., 2012). This facilitates the repurposing of de-risked drugs and shortens development timelines immensely, which makes it a highly attractive approach, especially during a sudden viral outbreak (Pushpakom et al., 2018). Hence, many paid and open-access software tools have been developed over the years for this purpose. A comprehensive list of databases for computer-aided drug design and screening can be found on click2drug.org, a directory under the ExPASy bioinformatics resource portal (Artimo et al., 2012). In this study, we employed a multiple-sequence alignment approach to identify potential drug candidates for COVID-19 (March-Vila et al., 2017), with an emphasis on isolates obtained from KSA. 
The genomic sequences of SARS-CoV-2 isolates from KSA (SARS-CoV-2-SA) were obtained from the GISAID database (gisaid.org) (Shu and McCauley, 2017). At the time of this study, 164 sequences had been uploaded from KSA, including from Jeddah, Madinah, Makkah, and Riyadh. Phylogenetic analysis was carried out on the isolates using MEGA X. The VIGOR4 annotator was then used to construct a genomic structure using a SARS-CoV-2-SA isolate sequence. The ZINC database was mined to obtain a list of potential compounds against the therapeutic targets of SARS-CoV-2. ZINC is a free resource available at zinc.docking.org that gives researchers access to ligand discovery, annotated compounds, purchasable drug information, and more (Irwin et al., 2012). A total of 73 hits were obtained from the database for specific viral proteins, including the ORF1ab region of SARS-CoV, which encodes some of the essential proteins for coronaviral replication (e.g. the 3C-like protease 3CLPRO and the papain-like protease PLPRO). The targets with annotated drug compounds were then mapped to the SARS-CoV-2 KSA isolate sequence by multiple sequence alignment using CLUSTAL OMEGA. A list of 29 orthologs, including their similarity indexes, was tabulated. Through data mining in ZINC, drugs with high binding affinity to the viral targets were identified.

Methods and materials

Genomic data information and collection

The complete genomic sequences of SARS-CoV-2 isolates from Saudi Arabia (SARS-CoV-2-SA) were downloaded from the GISAID database at gisaid.org. All 164 downloaded sequences were obtained from highly affected regions in Jeddah, Madinah, Makkah, and Riyadh. The download also included a list of accession IDs, submission dates, originating labs, and other relevant information (supplementary file gisaid_hcov-19).

Nucleotide sequence alignment and phylogenetic tree analysis

The sequences were first aligned using the alignment explorer feature of MEGA X (Khan, 2017, Kumar et al., 2018). The ClustalW (codons) feature was used for the alignment. The gap opening and gap extension penalties were set to 10.00 and 0.20 respectively, as recommended by Newman et al. (2016). The aligned sequences were exported and then analyzed to construct the phylogenetic tree. The tree was inferred by the maximum likelihood method under the Tamura-Nei model, with the partial deletion option for gaps (Kumar et al., 2018, Tamura and Nei, 1993).

Construction of SARS-CoV-2-SA isolate genomic structure

The hCoV-19/Saudi Arabia/KAIMRC-Alghoribi/2020 isolate, one of the first known sequenced SARS-CoV-2 isolates in the region, obtained from a 68-year-old local male, was used to construct the viral genome structure. The genomic sequence of the SARS-CoV-2-SA isolate was annotated using the Viral Genome ORF Reader (VIGOR4) annotator. The source code for this tool is available at github.com/VirusBRC/VIGOR4, and the tool can be used online at viprbrc.org (Wang et al., 2010). In brief, the sequence (in FASTA format) was loaded into the tool. The reference genome used was the SARS-CoV-2 Wuhan-Hu-1 isolate (GenBank sequence accession NC_045512.2). A total of 28 annotations in the sequence were analyzed, and the results were then tabulated.

Database screening and multiple sequence alignment

A list of candidate viral drug targets was generated by screening annotated entries from the ZINC database. The targets with purchasable drug information were then mapped to the SARS-CoV-2-SA isolate sequence by multiple sequence alignment using CLUSTAL OMEGA. The similarity index was generated, and the list of drugs for high-similarity orthologs was then screened and tabulated.
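
The similarity index described above can be illustrated with a minimal Python sketch that computes percent identity between two rows of an existing alignment. The definition used here (matching residues divided by the columns in which neither sequence has a gap) is an assumption made for illustration; the exact formula CLUSTAL OMEGA uses for its percent identity matrix may differ. The function names and the toy sequences are hypothetical and are not taken from the study.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither row has a gap.

    Assumption: gap-containing columns are ignored; this is one common
    definition, not necessarily the one used by CLUSTAL OMEGA.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def identity_matrix(aligned: dict[str, str]) -> dict[tuple[str, str], float]:
    """All-against-all percent identity for the rows of one alignment."""
    names = list(aligned)
    return {
        (x, y): percent_identity(aligned[x], aligned[y])
        for x in names
        for y in names
    }


if __name__ == "__main__":
    # Toy aligned peptides (hypothetical, for illustration only).
    toy = {"query": "MKT-AL", "target": "MKTGAV"}
    print(identity_matrix(toy)[("query", "target")])
```

In practice the aligned rows would come from the alignment file written by the aligner, and each viral target would be compared against the SARS-CoV-2-SA row.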

Statistical method

The phylogenetic tree was inferred using the Maximum Likelihood method based on the Tamura-Nei model, and the tree with the highest log likelihood (-1396374.75) was used. The Neighbor-joining algorithm was applied to a matrix of pairwise distances to obtain the initial tree, and the topology with the best log likelihood value was selected. The proportion of sites where at least one unambiguous base is present in at least one sequence for each descendent clade is shown in the tree beside each internal node. The analysis involved 164 nucleotide sequences, and there were a total of 30,643 positions in the final dataset.
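
To make the "matrix of pairwise distances" concrete, the following Python sketch builds such a matrix from aligned nucleotide sequences. It deliberately uses the simple p-distance (the proportion of differing sites) rather than the Tamura-Nei model used in this study, and it ignores any column in which either sequence has a gap or an ambiguous base; both choices are simplifying assumptions. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites among unambiguous, ungapped columns.

    Note: this is the plain p-distance, NOT the Tamura-Nei distance.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    sites = 0
    diffs = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in VALID_BASES or b not in VALID_BASES:
            continue  # pairwise deletion of gaps and ambiguous bases
        sites += 1
        if a != b:
            diffs += 1
    if sites == 0:
        raise ValueError("no comparable sites between the two sequences")
    return diffs / sites


def distance_matrix(aligned: dict[str, str]) -> dict[str, dict[str, float]]:
    """Symmetric matrix of pairwise p-distances, keyed by sequence name."""
    names = list(aligned)
    dist = {name: {name: 0.0} for name in names}
    for x, y in combinations(names, 2):
        d = p_distance(aligned[x], aligned[y])
        dist[x][y] = d
        dist[y][x] = d
    return dist


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical, for illustration only).
    toy = {"iso1": "ACGTACGT", "iso2": "ACGTACGA", "iso3": "ACGNAC-T"}
    for name, row in distance_matrix(toy).items():
        print(name, row)
```

A matrix of this shape is what a neighbor-joining implementation takes as input; a model-based distance would replace `p_distance` without changing the surrounding structure.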

Results

High homology between all SARS-CoV-2 isolates from Saudi Arabia

The purpose of constructing a phylogenetic tree of the SARS-CoV-2 isolates from Saudi Arabia (SARS-CoV-2-SA) was to determine the homology and evolutionary relationships between these sequences. The data was obtained from GISAID.org and aligned using MEGA X. The phylogenetic tree was then generated using the whole genome sequences compiled in the software. The results suggest that all isolates from Saudi Arabia demonstrated 97–98% similarity in the genome sequences (Fig. 1).
Fig. 1

The phylogenetic tree of SARS-CoV-2 isolates from Saudi Arabia. The tree shows the evolutionary relationship of all 164 SARS-CoV-2-SA isolates obtained from the GISAID database. The relationship shows a high similarity between the tested samples (97–98%). The figure was generated using the Maximum Likelihood method and Tamura-Nei model in MEGA X.

High similarity of open reading frame (ORF) sequences in the SARS-CoV-2-SA isolates to the Wuhan isolate

Based on the phylogenetic analysis, the genomic sequences were found to be highly similar in all SARS-CoV-2-SA isolates. Thus, to determine the sequence structure and function of the viral ORFs in relation to the Wuhan isolate, the first isolate uploaded to GISAID.org by King Abdullah International Medical Research Center (KAIMRC) was used to build the genomic structure using VIGOR4. The data show that the ORFs from the input KAIMRC genome share 99–100% similarity with the Wuhan reference (Table 1). These include the ORF1ab, ORF3a, ORF6, ORF7a, ORF7b, ORF8, and ORF10 genes, which encode non-structural and accessory proteins essential for viral replication. In addition, highly similar structural ORFs for the S, M, E, and N proteins are present.
Table 1

SARS-CoV-2-SA genomic construction using the Viral Genome ORF Reader, VIGOR4 annotator. The isolate sequence was previously submitted to GISAID database by King Abdullah International Medical Research Center (KAIMRC). The genome consists of ORF1a, ORF1b, Spike, Envelope, Membrane, ORF6, ORF7a, ORF7b, ORF8, Nucleocapsid, and ORF10, respectively. The ORF regions code for the accessory proteins and non-structural proteins, which are involved in virus replication and assembly.

Start..Stop | Gene | Gene Product Name | Reference | Peptide Length | Ref. Length | % Identity | % Similarity | % Coverage
272..13489 | ORF1a | ORF1a polyprotein | YP_009725295.1 | 4405 | 4405 | 99.95 | 99.95 | 100.00
13448..13486 | ORF1a | nsp11 | YP_009725312.1 | 13 | 13 | 100.00 | 100.00 | 100.00
272..13474, 13474..21561 | ORF1ab | ORF1ab polyprotein | YP_009724389.1 | 7096 | 7096 | 99.96 | 99.96 | 100.00
272..811 | ORF1ab | leader protein | YP_009725297.1 | 180 | 180 | 100.00 | 100.00 | 100.00
812..2725 | ORF1ab | nsp2 | YP_009725298.1 | 638 | 638 | 100.00 | 100.00 | 100.00
2726..8560 | ORF1ab | nsp3 | YP_009725299.1 | 1945 | 1945 | 99.95 | 99.95 | 100.00
8561..10060 | ORF1ab | nsp4 | YP_009725300.1 | 500 | 500 | 100.00 | 100.00 | 100.00
10061..10978 | ORF1ab | 3C-like proteinase | YP_009725301.1 | 306 | 306 | 100.00 | 100.00 | 100.00
10979..11848 | ORF1ab | nsp6 | YP_009725302.1 | 290 | 290 | 99.66 | 100.00 | 100.00
11849..12097 | ORF1ab | nsp7 | YP_009725303.1 | 83 | 83 | 100.00 | 100.00 | 100.00
12098..12691 | ORF1ab | nsp8 | YP_009725304.1 | 198 | 198 | 100.00 | 100.00 | 100.00
12692..13030 | ORF1ab | nsp9 | YP_009725305.1 | 113 | 113 | 100.00 | 100.00 | 100.00
13031..13447 | ORF1ab | nsp10 | YP_009725306.1 | 139 | 139 | 100.00 | 100.00 | 100.00
13448..16243 | ORF1ab | RNA-dependent RNA polymerase | YP_009725307.1 | 932 | 932 | 99.89 | 99.89 | 100.00
16244..18046 | ORF1ab | helicase | YP_009725308.1 | 601 | 601 | 100.00 | 100.00 | 100.00
18047..19627 | ORF1ab | 3′-to-5′ exonuclease | YP_009725309.1 | 527 | 527 | 100.00 | 100.00 | 100.00
19628..20665 | ORF1ab | endoRNAse | YP_009725310.1 | 346 | 346 | 100.00 | 100.00 | 100.00
20666..21559 | ORF1ab | 2′-O-ribose methyltransferase | YP_009725311.1 | 298 | 298 | 100.00 | 100.00 | 100.00
21569..25390 | S | surface glycoprotein | YP_009724390.1 | 1273 | 1273 | 100.00 | 100.00 | 100.00
25399..26226 | ORF3a | ORF3a protein | YP_009724391.1 | 275 | 275 | 99.64 | 100.00 | 100.00
26251..26478 | E | envelope protein | YP_009724392.1 | 75 | 75 | 100.00 | 100.00 | 100.00
26529..27197 | M | membrane glycoprotein | YP_009724393.1 | 222 | 222 | 100.00 | 100.00 | 100.00
27208..27393 | ORF6 | ORF6 protein | YP_009724394.1 | 61 | 61 | 100.00 | 100.00 | 100.00
27400..27765 | ORF7a | ORF7a protein | YP_009724395.1 | 121 | 121 | 100.00 | 100.00 | 100.00
27762..27893 | ORF7b | ORF7b protein | YP_009725296.1 | 43 | 43 | 100.00 | 100.00 | 100.00
27900..28265 | ORF8 | ORF8 protein | YP_009724396.1 | 121 | 121 | 100.00 | 100.00 | 100.00
28280..29539 | N | nucleocapsid phosphoprotein | YP_009724397.2 | 419 | 419 | 99.76 | 99.76 | 100.00
29564..29680 | ORF10 | ORF10 protein | YP_009725255.1 | 38 | 38 | 100.00 | 100.00 | 100.00
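
The coordinates and peptide lengths in Table 1 can be cross-checked with simple arithmetic: a nucleotide span of n bases holds n / 3 codons. The Python sketch below is illustrative only. It assumes that the spans of complete ORFs (for example S or N) include the terminal stop codon, whereas the spans of mature peptides cleaved from the polyprotein (for example the leader protein or nsp3) do not. This assumption is inferred from the numbers in Table 1 and is not stated by VIGOR4 or the authors; the function name is hypothetical.

```python
def peptide_length(start: int, stop: int, has_stop_codon: bool) -> int:
    """Number of amino acids encoded by an inclusive, 1-based nucleotide span.

    Assumption: the caller states whether the span ends with a stop codon,
    which is translated into no residue and is therefore subtracted.
    """
    span = stop - start + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError("span must be a positive multiple of three")
    codons = span // 3
    return codons - 1 if has_stop_codon else codons


if __name__ == "__main__":
    # Coordinates taken from Table 1; the stop-codon flags are assumptions.
    print("leader protein:", peptide_length(272, 811, has_stop_codon=False))
    print("nsp3:", peptide_length(2726, 8560, has_stop_codon=False))
    print("S:", peptide_length(21569, 25390, has_stop_codon=True))
    print("N:", peptide_length(28280, 29539, has_stop_codon=True))
```

Under this assumption the computed values agree with the peptide lengths listed in Table 1 for these four rows (180, 1945, 1273, and 419).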

Tanshinones show highest binding affinity to replicase 1a in SARS-CoV-2

Drug repurposing remains the most practical method for rapidly developing new treatments using existing drugs. In this study, data mining was performed by screening the ZINC database for known anti-viral targets. A list of targeted viral genes along with their associated compounds was generated (Table 2). According to the database, there are currently 73 viral therapeutic targets, about half of which contain information on known purchasable drugs against these targets. A multiple sequence alignment approach was then adopted to align these viral target sequences with that of the SARS-CoV-2-SA isolate using CLUSTAL OMEGA. This generated a list of 29 possible viral target groups with associated drugs that may be effective against SARS-CoV-2 (Table 3). The replicase polyprotein 1a (REP) of SARS-CoV had the highest similarity (79.91%) to SARS-CoV-2. Based on ChEMBL 20, tanshinones had among the highest binding affinities to REP. The E6 protein of human papilloma virus had the second highest similarity (52%). The flavonoids myricetin and morin were among the listed drugs that targeted the E6 protein. The other alignments showed similarities below 50%, although the associated drugs might still be active against SARS-CoV-2.
Table 2

List of targeted viral genes and the total number of their acting compounds. Using the ZINC database, a total of 73 viral therapeutic targets were found. Each target includes its protein function, subclass, orthologs, as well as other information. ‘Observations’ refers to the number of individual reports on compounds that were tested on their respective target. ‘Substances’ refers to the number of compounds for the target. ‘Purchasable’ refers to commercially available compounds, and finally ‘Predicted’ shows the candidate compounds predicted by the similarity ensemble approach (SEA) based on ChEMBL 20.

NameDescriptionSub ClassOrthologsObservationsSubstancesPurchasablePredicted
NANeuraminidasehydrolase149774802023,041
MMatrix protein 2IC-other13232121025
DPOL_HHV11DNA polymerase catalytic subunittransferase166431,646
TKThymidine kinaseenzyme-other53021473520,453
TAT_HV112Protein TatTF-other1584933374
NS4ANon-structural protein 4Aprotease11104879
UL80Capsid scaffolding proteinprotease12081824176,053
RIR1_HHV11Ribonucleoside-diphosphate reductase large subunitenzyme-other13026088,152
E2Regulatory protein E2TF-other31112429
UL54DNA polymerase catalytic subunitenzyme-other187804148,906
KITH_VZVDThymidine kinaseenzyme-other1992697
NS5BNS5B proteintransferase2118692556213,055
NS3Genome polyproteinprotease19809274764,711
UL26Capsid scaffolding proteinprotease25555182,085
POLG_HCV1Genome polyproteinprotease19695275,703
PROTEASEProteaseprotease295381851158,148
TATProtein TatTF-other2121207728
REVProtein Revcytosolic-other1191513001
ENVEnvelope glycoprotein gp160surface-antigen55551211,590,740
ORF_36Thymidine kinaseenzyme-other10000
U38DNA polymerase catalytic subunitenzyme-other112122243,208
U53Capsid scaffolding proteinprotease13333028,153
SCAF_EBVB9Capsid scaffolding proteinprotease12828072,426
POLPol polyproteinenzyme-other612,8809001876799,012
GAG-PRO-POLGag-Pro-Pol polyproteinenzyme-other1332494
POLG_POL1MGenome polyproteinprotease11103037
R1A_CVHSAReplicase polyprotein 1aenzyme-other140351311,327
GAG-PROGag-Pro polyproteinprotease13333062,841
HAHemagglutininsurface-antigen38601051
ABLTyrosine-protein kinase transforming protein Ablkinase111010,324
EEndolysinenzyme-other10000
E1Replication protein E1enzyme-other1323127233,548
PRIM_HHV11DNA primasenuclear-other17731929
VP16_HHV11Tegument protein VP16nuclear-other122155
US28G-protein coupled receptor homolog US28GPCR-A2461434752
DPOL_VZVDDNA polymerase catalytic subunitenzyme-other187564,879
MC087RMC087Renzyme-other10000
POLG_BVDVCGenome polyproteinenzyme-other10000
Q86831_AVIMBPolyprotein IIenzyme-other10000
POLG_GBVBGenome polyproteinenzyme-other10000
POLG_DEN26Genome polyproteinenzyme-other11110013,876
PSETPolynucleotide kinaseenzyme-other10000
POLG_WNVGenome polyproteinenzyme-other171525191,900
POLG_HCVCOGenome polyproteinenzyme-other134342707
REPReplicase polyprotein 1abenzyme-other19939189
POLG_HRV16Genome polyproteinenzyme-other15514352
Q82323_9DELAProteaseunclassified12020076,070
POLG_HCVBKGenome polyproteinenzyme-other17702591
V-FPSTyrosine-protein kinase transforming protein Fpskinase11110
30DNA ligaseenzyme-other10000
GAGGag polyproteinunclassified2151531050
DPOL_HHV1KDNA polymerase catalytic subunitenzyme-other1442869
43DNA polymeraseenzyme-other10000
GGlycoprotein Gunclassified144017,091
UL23Thymidine kinaseunclassified18864926
RPOL_BPT7T7 RNA polymeraseenzyme-other10000
HBCAGExternal core antigenunclassified10000
THYX_PBCV1Probable thymidylate synthaseenzyme-other1101004316
N1LVirokine/NFkB inhibitorunclassified12212831,528
PAPolymerase acidic proteinunclassified29451936,557
Q3ZDS5_9HEPCNS5Bunclassified16202097
UL97Phosphotransferase pUL97unclassified1110557
HIVRTReverse transcriptaseunclassified22941592324,831
M2Matrix protein 2unclassified113133533
UNGUracil-DNA glycosylaseenzyme-other24411259
E6Protein E6unclassified12220
Q76353_9HIV1Integraseunclassified1153146263,112
POLG_CXB3NGenome polyproteinenzyme-other1110171
R1A_CVHNLReplicase polyprotein 1aenzyme-other10000
PB2Polymerase basic protein 2unclassified1313101789
A0A0K1CY61_9HEPCNonstructural protein NS3-4Aunclassified159476996
HIV1_ENVGP41unclassified111135,089
Q91H74_9FLAVGenome polyproteinunclassified13025524,580
Table 3

Percent identity matrix of therapeutic targets mapped to the SARS-CoV-2-SA isolate sequence. Of the 73 genes, the orthologs with known purchasable drugs were mapped to the SARS-CoV-2-SA isolate sequence by multiple sequence alignment using CLUSTAL OMEGA. A list of 29 orthologs, their respective similarity indexes, and purchasable drug information was then generated. Compounds binding to targets with higher similarity to SARS-CoV-2 could be potential drug candidates for further study.

Name | Target description | Orthologs | Ortholog ID | Virus | % | Purchasable | Examples
REP | Replicase polyprotein 1a | 1 | R1A_CVHSA | SARS-CoV | 79.91 | 13 | Tanshinone
E6 | Protein E6 | 1 | VE6_HPV16 | Human papilloma virus | 51.99 | 2 | Myricetin, Morin
NA | Neuraminidase | 14 | NRAM_I34A1 | Influenza A virus | 47.76 | 20 | Oseltamivir, Zanamivir, Rapivab
E2 | Regulatory protein E2 | 3 | VE2_HPV16 | Human papilloma virus | 47.61 | 1 | Podofilox
M | Matrix protein 2 | 1 | M2_I72A2 | Influenza A virus | 45.09 | 12 | Amantadine, Rimantadine
NS3 | Genome polyprotein | 1 | A3EZI9_9HEPC | Hepatitis C virus | 40.55 | 47 | Ciluprevir, Victrelis
POLG_WNV | Genome polyprotein | 1 | POLG_WNV | West Nile virus | 40.09 | 5 | ZINC3249673
Q76353_9HIV1 | Integrase | 1 | Q76353_9HIV1 | Human immunodeficiency virus 1 | 39.19 | 2 | Raltegravir
UL26 | DNA polymerase catalytic subunit | 2 | SCAF_HHV11 | Human herpesvirus 1 | 38.26 | 1 | ZINC3625576
TAT_HV112 | Tat protein | 1 | TAT_HV112 | Human immunodeficiency virus 1 | 37.9 | 1 | ZINC5155
ENV | Envelope polyprotein GP160 | 5 | ENV_HV1H2 | Human immunodeficiency virus 1 | 37.72 | 1 | ZINC1780082
UL54 | DNA polymerase | 1 | DPOL_HCMVA | Human cytomegalovirus | 35.79 | 4 | Foscarnet
UL80 | Capsid scaffolding protein | 1 | SCAF_HCMVA | Human cytomegalovirus | 35.69 | 4 | ZINC901466
POL | Pol polyprotein | 6 | P88142_9HIV2 | Human immunodeficiency virus 2 | 35.67 | 876 | Elvucitabine, Trovirdine
HIVRT | Reverse transcriptase | 2 | Q06347_9HIV2 | Human immunodeficiency virus 2 | 35.48 | 23 | Efavirenz, Intelence
PROTEASE | Protease | 2 | Q4U254_9ENTO | Human rhinovirus | 35.28 | 51 | Pleconaril, Pirodavir
PRIM_HHV11 | DNA primase | 1 | PRIM_HHV11 | Human herpesvirus 1 | 35.23 | 3 | ZINC1675992
UL26 | Capsid scaffolding protein | 2 | SCAF_HHV11 | Human herpesvirus 1 | 35.05 | 1 | ZINC3625576
DPOL_VZVD | DNA polymerase catalytic subunit | 1 | DPOL_VZVD | Varicella-zoster virus | 34.45 | 5 | Aphidicolin, Foscarnet
E1 | Replication protein E1 | 1 | VE1_HPV11 | Human papilloma virus | 34.39 | 6 | ZINC3600349
DPOL_HHV11 | DNA polymerase catalytic subunit | 1 | DPOL_HHV11 | Human herpesvirus 1 | 34.33 | 4 | Aphidicolin
GAG | Gag polyprotein | 2 | GAG_AVIER | Avian erythroblastosis virus | 34.33 | 3 | ZINC6584476
PA | Polymerase acidic protein | 2 | PA_I34A1 | Influenza A virus | 34.02 | 9 | ZINC3626195
POLG_HRV16 | Genome polyprotein | 1 | POLG_HRV16 | Human rhinovirus | 33.87 | 1 | ZINC40975895
UNG | Uracil-DNA | 2 | UNG_VACCW | Vaccinia virus | 33.82 | 1 | ZINC359756
US28 | G-protein coupled receptor homolog US28 | 2 | US28_HCMVA | Human cytomegalovirus | 30.89 | 3 | Metitepine
VP16_HHV11 | Tegument protein VP16 | 1 | VP16_HHV11 | Human herpesvirus 1 | 30.84 | 1 | ZINC3831128
TK | Thymidine kinase | 5 | KITH_HHV1S | Human herpesvirus 1 | 28.11 | 35 | Brivudine, Sorivudine
N1L | Virokine/NFkB inhibitor | 1 | Q49PX0_9POXV | Vaccinia virus | 27.78 | 8 | ZINC1557545
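
The prioritization implied by Table 3 (targets with higher similarity to SARS-CoV-2 and at least one purchasable compound are examined first) can be sketched in a few lines of Python. The four records below are copied from Table 3; the `Target` class, the `shortlist` function, and the 50% cutoff are hypothetical and serve only to illustrate the filtering step, not to reproduce the procedure used in the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    name: str
    virus: str
    identity: float   # percent identity to the SARS-CoV-2-SA isolate
    purchasable: int  # number of purchasable compounds listed in ZINC


def shortlist(targets: list[Target], min_identity: float) -> list[Target]:
    """Keep targets at or above the cutoff that have purchasable compounds,
    ordered from highest to lowest identity."""
    kept = [
        t for t in targets
        if t.identity >= min_identity and t.purchasable > 0
    ]
    return sorted(kept, key=lambda t: t.identity, reverse=True)


if __name__ == "__main__":
    # A subset of Table 3 rows, for illustration.
    table3_subset = [
        Target("TK", "Human herpesvirus 1", 28.11, 35),
        Target("E6", "Human papilloma virus", 51.99, 2),
        Target("REP", "SARS-CoV", 79.91, 13),
        Target("NA", "Influenza A virus", 47.76, 20),
    ]
    # Hypothetical cutoff of 50%; the study does not define a threshold.
    for target in shortlist(table3_subset, min_identity=50.0):
        print(target.name, target.virus, target.identity)
```

With this illustrative cutoff, only REP and E6 remain, which matches the two targets singled out in the text above.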

Discussion

COVID-19 has resulted in an ongoing pandemic with no specific treatment to date; only supportive care has been endorsed by the World Health Organization (WHO) (Song et al., 2020). Hence, finding a functional cure or even an effective therapy is paramount in stopping the global SARS-CoV-2 outbreaks. At present, based on the analysis of over 48,000 SARS-CoV-2 genomes worldwide from the GISAID database, there are 7 clades (G, GH, GR, L, O, S, V) (Mercatelli and Giorgi, 2020). Each possesses characteristic variants, such as the spike protein variant S-D614G (clade G) and the ORF3a variant NS3-G251 (clade V). Although sequence variation can give rise to contrasting phenotypes and hence possible differences in treatment regimes, its implications in SARS-CoV-2 are still unclear and widely debated (Young et al., 2020). This is further complicated by its relationship with host factors in different countries, which have given rise to varied infection and mortality rates (Toyoshima et al., 2020). Since April 2020, the variants of SARS-CoV-2 genomes have been found to be unevenly distributed and highly diversified across the continents, which has raised the importance of adaptive measures for containing its outbreak (Jones and Manrique, 2020, Mercatelli and Giorgi, 2020). Interestingly, it has also been shown that the mutation rate of SARS-CoV-2 is low, at least in comparison to SARS-CoV (Jia et al., 2020, Rausch et al., 2020). In our study of the phylogeny of SARS-CoV-2 isolates in Saudi Arabia (Fig. 1), the genome sequences show very high similarity (between 97 and 98%). This may be beneficial because a uniform drug treatment may be applicable to the current isolates. The genomic structure reflects that of the reference genome isolated from Wuhan (GenBank sequence accession NC_045512.2) (Table 1). 
Currently, more genomic sequences are being uploaded to GISAID, and this may increase the number of diversified sequences and hence mutations that can render many available drugs ineffective (Naqvi et al., 2020, Pachetti et al., 2020). Despite this, a recent study by De Vries et al. (2020) has shown that clade differences do not have an impact on drugs that target highly conserved SARS-CoV-2 regions, such as 3CLPRO. Developing drugs to neutralize the infectivity of a novel pathogen during an outbreak is itself a lengthy endeavor. During a rapid outbreak that shows a high rate of infectivity and fatality, the window for treatment becomes dangerously short. Hence, drug repurposing is an effective and economical strategy to quickly deliver the desired treatment outcomes in affected individuals with fewer side effects. This is because the use of de-risked drugs bypasses most of the hurdles of drug development. There are several approaches to this method. Apart from the traditional experimental approaches through in vitro and in vivo studies, an in silico approach is commonly used because of its high-throughput nature (Kumar et al., 2019). With continuous advancements in big data and computing technology, investigators now have an arsenal of analytical and machine learning tools to sift through large numbers of genomes, chemical structures, phenomes, and more to discover ideal drugs for novel targets (Jarada et al., 2020). In our study, we used multiple-sequence alignment, which is a rapid and simple structure-based approach for identifying potential drug candidates (March-Vila et al., 2017). Based on the data mined from ZINC, we have tabulated a list of 73 known viral therapeutic targets (Table 2). This includes targets from viruses that share some similarities with SARS-CoV-2, for instance the rhinovirus, which is a positive-sense, single-stranded RNA virus (+ssRNA) much like SARS-CoV-2 (Pal et al., 2020). 
A large number of independent and clinical studies are testing antiviral drugs for COVID-19 (e.g. remdesivir, favipiravir, and lopinavir) that are used commercially for their specificity to these known viral therapeutic targets (Gordon et al., 2020, Kang et al., 2020) (NCT04401579, NCT04358549, NCT04386876). From these targets, we have generated a list of 29 viral orthologs that share sequence similarities with the SARS-CoV-2-SA isolates (Table 3). Each of these targets comes with information on purchasable drugs. The replicase polyprotein 1a of SARS-CoV, which also contains the conserved 3CLPRO and PLPRO sequences, had the highest similarity with SARS-CoV-2-SA (79.91%). A ZINC search revealed that tanshinones were among the substances with the highest binding affinity based on ChEMBL 20. Tanshinones are a class of lipophilic phenanthrene compounds and are the main terpenoid bioactive components of Salvia miltiorrhiza, whose dried root is valued for its therapeutic properties in traditional Chinese medicine (Jiang et al., 2019). They have been shown to activate the AMPK/mTOR signaling pathway, which led to apoptotic inhibition and autophagy induction in heart cells (X. Zhang et al., 2019). This involved the modulation of Bcl-2, Bax, caspase-3 and caspase-7. A study by Park et al. (2012) has shown that these compounds are selective inhibitors of the SARS-CoV 3CLPRO and PLPRO cysteine proteases, and this has been a recent subject of interest for its application in treating COVID-19 (Shahrajabian et al., 2020). These important viral proteins, along with RNA-dependent RNA polymerase (RdRp) and several others, are known to be highly conserved between the two human coronaviruses, including in their functional regions (C. Wu et al., 2020). Because of that, finding available drugs that target these regions has garnered significant attention (Báez-Santos et al., 2015). 
Indeed, several studies have already been done to identify potential drug candidates that can bind to these viral proteins (Jo et al., 2020a, Jo et al., 2020b, Virdi et al., 2020). Tanshinones are currently being investigated in clinical trials as a potential treatment for acute myocardial infarction (NCT02524964), pulmonary hypertension (NCT01637675), and polycystic ovary syndrome (NCT01452477). Their therapeutic role in vascular diseases is partly due to their modulatory effect on vascular smooth muscle cell proliferation, which is a key contributor to arterial remodeling and hypertension (Wu et al., 2019). To date, there are no related clinical studies involving their use in coronavirus infections. However, a study by Zhang et al. (2020) has shown that cryptotanshinone, a tanshinone derivative, elicited a dose-dependent anti-viral effect on cells infected with SARS-CoV-2 in vitro. This suggests that the compound holds considerable potential. Meanwhile, myricetin and morin are flavonoids, which have been shown by several studies to be chemical inhibitors of the SARS-CoV 3CLPRO and nsp3 proteins (Jo et al., 2020a, Jo et al., 2020b, Yu et al., 2012). Currently, flavonoids are being recommended as potential phytochemical-based medicines for COVID-19 treatment (Ngwa et al., 2020, Russo et al., 2020). Podofilox (podophyllotoxin), which is a human papilloma virus drug, may also be a potential candidate. However, a separate virtual screening by Jordaan et al. (2020) showed a low binding affinity of this drug to the SARS-CoV-2 protease. Podofilox has a molecular scaffold similar to that of efavirenz, which is an HIV drug (Jordaan et al., 2020). Several of these anti-HIV proteinase drugs (e.g. atazanavir, ritonavir, darunavir, and dolutegravir) have been predicted to possess inhibitory potency against SARS-CoV-2 3CLPRO in silico. Examples of anti-HIV drugs can also be found in Table 3. 
Other notable findings from this study are drugs targeting the neuraminidase and matrix protein of the influenza A virus (a negative-sense RNA virus). Drugs against this virus (e.g. favipiravir and oseltamivir) are currently in clinical trials and in experimental use for COVID-19 patients (R. Wu et al., 2020) (NCT04464408, NCT04516915). Other influenza drugs, such as amantadine, are also being recommended as plausible treatments (Abreu et al., 2020, Araújo et al., 2020). Despite the lower similarity indexes of the other viral targets in our study, further studies should investigate the potential effects of the listed drugs against these viruses, perhaps through other in silico methods such as molecular docking.
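The similarity values discussed here (for example, 79.91% for replicase polyprotein 1a) are percent-identity figures derived from aligned sequences. The following is a minimal sketch of how such a figure can be computed from two already-aligned sequences. It assumes identity is defined as the number of matching residues divided by the number of columns in which neither sequence has a gap. This is one common convention and is not necessarily the exact formula applied by CLUSTAL OMEGA in this study. The function name and the toy sequences are hypothetical and are not taken from the KSA isolates.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two rows of an alignment.

    Assumption: columns where either sequence has a gap ('-') are
    excluded from both the numerator and the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns where both sequences have a residue
    matches = 0   # compared columns where the residues are identical
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1

    # No comparable columns: report zero identity rather than divide by zero.
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical toy alignment (not real viral sequence data).
query = "MESLVPGFNE-KTH"
target = "MESLVLGVNEAKTH"
print(f"{percent_identity(query, target):.2f}% identity")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which gives lower values when gaps are present, so the denominator should be stated whenever similarity percentages are compared across tools.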

Conclusion

The COVID-19 outbreak will not disappear in a short time and will add further to mortality and morbidity if appropriate interventions are not applied correctly. Due to the rapid human-to-human transmission of infection worldwide, finding an effective treatment or a functional cure is paramount. Here, we employed multiple sequence alignment, a fast and simple approach to drug repurposing, to identify candidate drugs for COVID-19, with an emphasis on the isolates obtained from Saudi Arabia. Our study showed that these isolates share high sequence similarity (around 98%) and that their genomic structure reflects that of the Wuhan reference isolate genome. By using multiple sequence alignment, we showed that among the list of viral target orthologs, the SARS CoV replicase polyprotein 1a sequence had the highest similarity (79.91%). Based on the ZINC database, tanshinones were among the compounds with the highest binding affinity to this target, and they have been reported in the literature to selectively inhibit the SARS CoV 3CLPRO and PLPRO cysteine proteases. These proteins have garnered significant attention in the discovery of compounds against SARS-CoV-2. However, tanshinones have yet to be investigated in clinical trials for COVID-19, and they therefore remain untested potential drug candidates.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
  12 in total

1.  MEGA X: Molecular Evolutionary Genetics Analysis across Computing Platforms.

Authors:  Sudhir Kumar; Glen Stecher; Michael Li; Christina Knyaz; Koichiro Tamura
Journal:  Mol Biol Evol       Date:  2018-06-01       Impact factor: 16.240

2.  Tanshinone I inhibits vascular smooth muscle cell proliferation by targeting insulin-like growth factor-1 receptor/phosphatidylinositol-3-kinase signaling pathway.

Authors:  Yu-Ting Wu; Yi-Ming Bi; Zhang-Bin Tan; Ling-Peng Xie; Hong-Lin Xu; Hui-Jie Fan; Hong-Mei Chen; Jun Li; Bin Liu; Ying-Chun Zhou
Journal:  Eur J Pharmacol       Date:  2019-03-13       Impact factor: 4.432

3.  Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees.

Authors:  K Tamura; M Nei
Journal:  Mol Biol Evol       Date:  1993-05       Impact factor: 16.240

4.  On the Integration of In Silico Drug Design Methods for Drug Repurposing.

Authors:  Eric March-Vila; Luca Pinzi; Noé Sturm; Annachiara Tinivella; Ola Engkvist; Hongming Chen; Giulio Rastelli
Journal:  Front Pharmacol       Date:  2017-05-23       Impact factor: 5.810

5.  In vitro activity of lopinavir/ritonavir and hydroxychloroquine against severe acute respiratory syndrome coronavirus 2 at concentrations achievable by usual doses.

Authors:  Chang Kyung Kang; Moon-Woo Seong; Su-Jin Choi; Taek Soo Kim; Pyoeng Gyun Choe; Sang Hoon Song; Nam-Joong Kim; Wan Beom Park; Myoung-Don Oh
Journal:  Korean J Intern Med       Date:  2020-05-29       Impact factor: 2.884

6.  Identification of myricetin and scutellarein as novel chemical inhibitors of the SARS coronavirus helicase, nsP13.

Authors:  Mi-Sun Yu; June Lee; Jin Moo Lee; Younggyu Kim; Young-Won Chin; Jun-Goo Jee; Young-Sam Keum; Yong-Joo Jeong
Journal:  Bioorg Med Chem Lett       Date:  2012-04-25       Impact factor: 2.823

7.  Analysis of therapeutic targets for SARS-CoV-2 and discovery of potential drugs by computational methods.

Authors:  Canrong Wu; Yang Liu; Yueying Yang; Peng Zhang; Wu Zhong; Yali Wang; Qiqi Wang; Yang Xu; Mingxue Li; Xingzhou Li; Mengzhu Zheng; Lixia Chen; Hua Li
Journal:  Acta Pharm Sin B       Date:  2020-02-27       Impact factor: 11.413

8.  From SARS and MERS CoVs to SARS-CoV-2: Moving toward more biased codon usage in viral structural and nonstructural genes.

Authors:  Mahmoud Kandeel; Abdelazim Ibrahim; Mahmoud Fayez; Mohammed Al-Nazawi
Journal:  J Med Virol       Date:  2020-03-16       Impact factor: 2.327

9.  Quantitative phylogenomic evidence reveals a spatially structured SARS-CoV-2 diversity.

Authors:  Leandro R Jones; Julieta M Manrique
Journal:  Virology       Date:  2020-08-26       Impact factor: 3.616
