Austė Kanapeckaitė1, Asta Mažeikienė2, Liesbet Geris3, Neringa Burokienė4, Graeme S Cottrell5, Darius Widera5. 1. AK Consulting, Laisvės g. 7, LT 12007 Vilnius, Lithuania. Electronic address: auste.kanapeckaite14@alumni.imperial.ac.uk. 2. Department of Physiology, Biochemistry, Microbiology and Laboratory Medicine, Institute of Biomedical Sciences, Faculty of Medicine, Vilnius University, M. K. Čiurlionio g. 21, LT-03101 Vilnius, Lithuania. 3. Biomechanics Research Unit, GIGA In Silico Medicine, University of Liège, Quartier Hôpital, Avenue de l'Hôpital 11 (B34), Liège 4000, Belgium; Biomechanics Section, Department of Mechanical Engineering, KU Leuven, Celestijnenlaan 300C (2419), Leuven 3001, Belgium; Skeletal Biology and Engineering Research Center, Department of Development and Regeneration, KU Leuven, Herestraat 49 (813), Leuven 3000, Belgium. 4. Clinics of Internal Diseases, Family Medicine and Oncology, Institute of Clinical Medicine, Faculty of Medicine, Vilnius University, M. K. Čiurlionio str. 21/27, LT-03101 Vilnius, Lithuania. 5. University of Reading, School of Pharmacy, Hopkins Building, Reading RG6 6UB, United Kingdom.
Abstract
The COVID-19 pandemic created an unprecedented global healthcare emergency prompting the exploration of new therapeutic avenues, including drug repurposing. A large number of ongoing studies revealed pervasive issues in clinical research, such as the lack of accessible and organised data. Moreover, current shortcomings in clinical studies highlighted the need for a multi-faceted approach to tackle this health crisis. Thus, we set out to explore and develop new strategies for drug repositioning by employing computational pharmacology, data mining, systems biology, and computational chemistry to advance shared efforts in identifying key targets, affected networks, and potential pharmaceutical intervention options. Our study revealed that formulating pharmacological strategies should rely on both therapeutic targets and their networks. We showed how data mining can reveal regulatory patterns, capture novel targets, alert about side-effects, and help identify new therapeutic avenues. We also highlighted the importance of the miRNA regulatory layer and how this information could be used to monitor disease progression or devise treatment strategies. Importantly, our work bridged the interactome with the chemical compound space to better understand the complex landscape of COVID-19 drugs. Machine and deep learning allowed us to showcase limitations in current chemical libraries for COVID-19 suggesting that both in silico and experimental analyses should be combined to retrieve therapeutically valuable compounds. Based on the gathered data, we strongly advocate for taking this opportunity to establish robust practices for treating today's and future infectious diseases by preparing solid analytical frameworks.
The COVID-19 pandemic created an unprecedented challenge for healthcare systems worldwide. This infectious disease, caused by the SARS-CoV-2 virus, first emerged in December 2019 in Wuhan, China, and rapidly spread around the world. While the initial response of the World Health Organisation (WHO) was limited to declaring a ‘health emergency’, the status was soon recategorised as a ‘pandemic’ [1,2]. COVID-19 patients presented with varying symptoms, which were classified into three categories: mild, moderate, and severe. Disease progression, associated complications, and mortality showed age- and gender-dependent differences, with comorbidities also playing a significant role [3,4]. The initial therapeutic management of COVID-19 was limited, and the urgent nature of the pandemic prompted clinicians to try different approaches [3]. A variety of protocols were developed to provide treatment and aid patient recovery, with therapeutic options including antiviral drugs, anti-inflammatory drugs, immunomodulators, and other types of intervention [3,5,6]. As more information was collected and emergency approvals for clinical trials produced data, the clinical strategies were further refined [3,5,6]. This also presented an opportunity to explore drug repurposing and search for effective clinical management options capturing both infection prevention and the alleviation of symptoms [2,7,8].
Repurposing drugs offers the advantage of reduced clinical development time and working with an already well-established pharmacological profile [9,10]. Moreover, many researchers took full advantage of various in silico approaches, such as machine learning, bioinformatics, or computational chemistry, to test new candidate molecules against various COVID-19 targets [11,12,13].
While computational methods undoubtedly help to refine the complex chemical space, the field still requires a structured approach and established practices to ensure that outcomes are comparable and properly benchmarked [14,15,16]. Unsurprisingly, most of the molecules that were selected for repurposing were not effective in treating severe COVID-19 cases, leading to various controversies [10,17,18]. The less-than-optimal results may also indicate that the research in drug repurposing focuses too narrowly on either the targets or the chemical entities without considering broader systemic implications [10,13,19,20,21].
Seeing the challenges and urgent need for new research and discovery strategies, we set out to explore the COVID-19 infection from the perspectives of computational pharmacology, systems biology, and cheminformatics, bridging the system-wide effects with compound chemical features. Mining the data of available COVID-19 clinical trials allowed us to capture the links between compounds under investigation and their targets, spanning both direct and expanded interactions (i.e., additional degrees of separation between interacting proteins). Since in-depth information on the COVID-19 clinical profile is limited and study-dependent [10,22], we used tested chemical entities as a proxy to understand the disease network and what pathways might be relevant if a drug proves to be beneficial. We employed several techniques, ranging from direct to extended interactome reconstruction, which enabled us to identify a broader scope of affected signalling networks under the influence of a single drug [23,24]. Our analysis of COVID-19 drug targets demonstrated that there is an overlap across certain target groups and that the diversity of pharmaceuticals provides more options to adjust medications based on their systemic effects. Intriguingly, looking into the extended networks of drug targets, we found rich clusters of shared features.
This allowed us to hypothesise that a direct drug-target interaction will have an effect on other associated interactors and that this information could be used to assess drugs under investigation that share a predicted network [5,6,10,23,24,25]. Moreover, this approach could be used to select combination therapy regimens. Under these assumptions, we extracted the most noticeable clusters of drugs that have multiple shared targets through their signalling networks and used ontology-based enrichment analysis to identify cellular processes, functional signalling networks, and linked pathways. Indeed, we were able to verify that compounds modulate a number of shared processes that are engaged through different network nodes. These findings have several important implications for future repurposing studies, highlighting that focusing on a single target risks missing a number of other relevant systemic effects [26]. Selecting treatment options based on a network-centric perspective can provide insights into short- and long-term effects, and gathering such data could significantly improve how medical practitioners prescribe therapeutics and/or mitigate unwanted outcomes [5,27,28]. Such methods could help establish robust practices to collect and organise data for drug repositioning and clinical studies [10,22,29].
We also explored the regulatory miRNA space and found that many target genes/proteins share a large number of regulatory miRNAs. These findings also agree with the clinical miRNA observations in COVID-19; however, such data are still very limited [30,31].
The enrichment we identified highlights that processes affected by the drugs used to treat viral infections have a complex regulatory interplay, and analysing them further could help refine treatment options from biomarkers to RNA interference therapeutics [30,32,33].
While understanding the biological space is extremely important when optimising drug selection or identifying new antiviral and/or anti-inflammatory treatment methods, computational pharmacology can aid in bridging compound chemical features with observed effects [34,35,36]. Thus, we explored the chemical profiles of the drugs that are undergoing or have undergone clinical trials for COVID-19. Such information can be a valuable guide when designing a screening library, either by narrowing down or by expanding the selection of drugs to diversify the compound set [34,35,36]. In our study, we demonstrated that drugs used to treat COVID-19 do not have clear structural patterns, aside from a steroid subgroup and certain smaller subclusters. Chemical space mining (>2.1 M compounds) and the quantitative structure-activity relationship (QSAR) models we built via machine learning (ML) and deep learning (DL) enabled us to extract the most relevant features of drugs predicted to have antiviral properties against the SARS-CoV-2 virus. We applied the models to investigate which chemical entities in the COVID-19 clinical trial set could have antiviral properties. Both models had similar accuracy (ML: 96.77%; DL: 96.48%) in differentiating between antiviral and control compounds when they were tested on the said drug set. Some of the identified drugs were already known antivirals tested for COVID-19 (e.g., ritonavir). To our surprise, when we used the trained models on 95 small fragment-like molecules that were experimentally verified to bind the Mpro protein of the SARS-CoV-2 virus [37], none were selected as antivirals. These results underscored that current libraries used to test compounds might be too narrow or biased.
Thus, experimental activity data on targets of interest are crucial in establishing structure-activity relationships [38].
Drug repurposing and new compound development need to be built on diverse compound screening libraries with a strong understanding of their interactome and regulome. Such integrative approaches can prevent early failures in discovery pipelines or ineffective treatment regimens in repurposing studies [36,39]. As can be seen from our in-depth analysis, evaluating compounds for COVID-19 should expand beyond direct drug-target interactions and consider a more complex space of affected networks in order to develop more robust combination therapies (Fig. 1). Thus, we should take this opportunity to establish research and discovery practices for today's and future infectious diseases by preparing solid analytical frameworks.
Fig. 1
An outline for COVID-19 investigational drug analysis summarising key analytical steps and outcomes.
Methods
Data collection and mining
Data for COVID-19 associated clinical trials and drugs involved in treatment and/or clinical investigation protocols were primarily retrieved from the Open Targets platform, which curates information on clinical testing, known targets, and compounds [40,41]. Mining (November 2021) returned 1375 target-drug pairs comprising 230 unique drugs and 356 unique targets (i.e., some drugs have multiple main targets or different drug formulations). In addition, Open Targets was searched for compound and known target associations to extract the relevant chemical data (e.g., SMILES, InChIKey, etc.); this provided information on 18,376 compounds. To expand and verify the data sets, the information was cross-referenced against PubChem COVID-19 records (1625 compound records) [42,43] and the STITCH database containing compound-protein interaction data (15,473,939 interaction points) [44]. Additional interactome data were retrieved by mining the STRING database (135,660 interactions, 5922 new targets for the expanded interactome network) [45,46]. The Reactome database was used to extract information on relevant pathways [47]. The miRNA database from the repository associated with the OmicInt package was used to mine non-coding RNA interactions [48]. The ChEMBL compound database (>2.1 M chemical entities) was used to search for similar and control compounds when investigating COVID-19 clinical trial drugs [49]. The COVID-19 CAS and Diamond/XChem Mpro compound repositories were used to extract predicted and experimentally tested antiviral drugs [37,50].
Computational pharmacology and bioinformatics analysis
COVID-19 clinical trial data mining, cleaning, and analysis were performed in the R programming environment (v4.1.2) with RStudio [51]. Specific libraries used for enrichment, clustering, and ontology analyses include STRINGdb (v2.6.0), ClusterProfiler (v4.2.0), EnrichGO (v4.2.0), EnrichPathway (v4.2.0), and BioMart (v2.50.1) [52,53,54,55,56].
Cheminformatics analysis
The Python programming environment (v3.9.7) [57] was used for chemical descriptor extraction, Morgan fingerprinting, Mol2vec fingerprinting [58], compound similarity assessment, substructure search, and image generation. The packages and analytical frameworks used include RDKit (v2021.9.4), NumPy (v1.22.1), Pandas (v1.3.5), Seaborn (v0.11.2), Matplotlib (v3.5.1), and Chemexpy (v1.0.10) [59,60,61,62,63,64]. Custom algorithmic assessments, comparative analyses, and data mining were performed using RDKit (v2021.9.4) [59].
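As a rough illustration of the compound similarity assessment mentioned above, the sketch below computes a Tanimoto coefficient between two binary fingerprints represented as sets of on-bit indices. The text does not state which similarity metric was used, so Tanimoto is an assumption here. The function names and the toy bit indices are hypothetical and are not real Morgan fingerprints; in practice RDKit would supply both the fingerprints and the similarity calculation.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient: shared on-bits divided by all on-bits of two fingerprints."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        # Two empty fingerprints: similarity is undefined; 0.0 is a convention used here.
        return 0.0
    return len(a & b) / len(union)


def most_similar(query_bits, library):
    """Rank library entries (name -> on-bit set) by similarity to the query, best first."""
    scored = [(name, tanimoto(query_bits, bits)) for name, bits in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy fingerprints (indices into a 2048-bit vector), not real compounds.
toy_library = {
    "compound_1": {3, 17, 256, 1024},
    "compound_2": {3, 17, 300},
    "compound_3": {900, 1500},
}
query = {3, 17, 256}
ranking = most_similar(query, toy_library)
```

Working on sets of on-bits keeps the example dependency-free; a bit-vector representation would give the same coefficient.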
Machine and deep learning
The Python programming environment (v3.9.7) [57] was used for machine and deep learning. The machine learning framework was implemented via the Scikit-learn library [65], with LGBMClassifier [66] used as the classifier with default parameters and a train-test split of 0.2; the features comprised vectorised and normalised Morgan fingerprints (radius = 3, nBits = 2048). Deep learning neural networks were built for Mol2vec-encoded [58] chemical features using the following set-up, facilitated by the TensorFlow and Keras libraries [67,68]: layers were added sequentially, starting with a Dense layer (hidden units = 200, activation = ‘relu’, input shape = (300,)), followed by Dense layers with 128, 100, and 50 hidden units and a dropout of 0.25 after each. All layers except the last were activated with the ‘relu’ function; the last Dense layer had only 1 hidden unit and sigmoid activation. The model was compiled with binary cross-entropy loss, the adam optimiser, and accuracy as the metric. Training was run for 200 epochs with a batch size of 256 and a 0.2 split of the original data for validation. Deep learning was performed using a Python 3 Google Compute Engine backend (Tensor processing units, TPU) with 12 GB of RAM and a 107 GB HDD.
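The deep learning set-up described above can be sketched roughly as follows. This is a reconstruction from the text, not the authors' code: the helper names are hypothetical, and placing a dropout layer after every hidden layer, including the first, is an assumption because the description is ambiguous on that point. TensorFlow is imported only inside build_model, so the parameter-count helper runs without it.

```python
# Layer widths as described in the text: 300-dimensional Mol2vec input,
# four hidden Dense layers, and a single sigmoid output unit.
INPUT_DIM = 300
HIDDEN_UNITS = [200, 128, 100, 50]
DROPOUT_RATE = 0.25


def count_dense_parameters(input_dim, hidden_units, output_units=1):
    """Count weights and biases of a fully connected stack (dropout adds none)."""
    total, previous = 0, input_dim
    for units in list(hidden_units) + [output_units]:
        total += previous * units + units  # weight matrix plus bias vector
        previous = units
    return total


def build_model():
    """Assemble and compile the network with Keras (requires TensorFlow)."""
    from tensorflow import keras  # imported lazily; not needed for the helper above

    model = keras.Sequential()
    model.add(keras.Input(shape=(INPUT_DIM,)))
    for units in HIDDEN_UNITS:
        model.add(keras.layers.Dense(units, activation="relu"))
        # Assumption: dropout follows every hidden layer, including the first.
        model.add(keras.layers.Dropout(DROPOUT_RATE))
    model.add(keras.layers.Dense(1, activation="sigmoid"))
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


n_parameters = count_dense_parameters(INPUT_DIM, HIDDEN_UNITS)
```

With TensorFlow installed, training under the stated settings would look like build_model().fit(X, y, epochs=200, batch_size=256, validation_split=0.2), where X is an array of 300-dimensional Mol2vec vectors and y holds the binary antiviral/control labels.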
Results
COVID-19 clinical trials represent a broad spectrum investigation of potential therapeutics to modify the disease course
We opted to use mined data from the Open Targets platform on referenced COVID-19 clinical trials (230 unique drugs with a known target status) so that our analyses were focused on a consistent set of compounds [40,41]. Assessing clinical phase distributions for drugs that are undergoing or have undergone clinical trials revealed that the largest share of the therapeutics is in phase 2 (46.52%), with phases 3 and 4 being the other predominant categories at 28.26% and 20.43%, respectively (Fig. 2, A). Based on the available information, the largest proportion of clinical studies (50.87%) is still recruiting patients, while other categories, such as ‘Not yet recruiting’ or ‘Completed’, each fall below the 20% mark (Fig. 2, B). Supplementary Fig. 1 captures the clinical phase and status of every drug profile, with a clear shift towards advanced clinical phases underscoring the emergency status of the disease and the large number of therapeutics under investigation to capture population-based effects [40,41]. Drug targets and types of drugs used to combat the COVID-19 infection and the associated complications (Fig. 2, C) further exemplify a broad therapeutic engagement. The largest group of drugs in clinical trials is inhibitors (nearly half of all treatment options), with agonists (20.87%) and antagonists (17.83%) comprising the other two main categories (Fig. 2, D). About two thirds of the pharmacological intervention options belong to the small-molecule category (Fig. 2, F). Antibodies (14.35%) and other proteins (10%) are two additional drug classes that are important in treating this viral infection (Fig. 2, F). However, it is necessary to note that an important evidence gap has been reported for the safety of the drugs tested for COVID-19. As reported in January 2021, 40.4% of completed trials did not post results on ClinicalTrials.gov or in the academic literature [22].
Fig. 2
Summary plots for COVID-19 clinical trials data (n = 230; November 2021). Information was mined from the Open Targets COVID-19 database where duplicate entries for the same chemical entity were removed when preparing summary plots. A – a clinical phase distribution plot for COVID-19 clinical trials; B - a bar plot for the status of COVID-19 clinical trials; C – a pie chart for drug activity types; D – a pie chart for drug molecule types.
COVID-19 treatment strategies highlight the need for a more in-depth understanding of drug pharmacological action
The original set of compounds (Open Targets, 230 unique drugs with a known target status) used to treat, or investigated for the treatment of, COVID-19 was searched for any shared proteins across the main known drug targets (Suppl. Fig. 2). Since a compound might be listed to engage one or more proteins, we aimed to cross-reference drugs and their targets to get insights into shared action [23]. This analysis returned several smaller clusters with denser sets in the top left corner of the heatmap (Suppl. Fig. 2). For example, dipyridamole and pentoxifylline engage a broad spectrum of phosphodiesterases (PDEs), such as PDE3B, PDE1A, or PDE5A. Comparing dipyridamole and pentoxifylline action allowed us to identify some additional main targets where these compounds differ. Specifically, dipyridamole inhibits the equilibrative nucleoside transporter-1 (ENT1 or SLC29A1), which serves as a sodium-independent transporter for purine and pyrimidine nucleosides, and pentoxifylline is believed to downregulate adenosine A2A receptor (A2AR)-mediated pathways [69,70,71]. Propofol, sevoflurane, and isoflurane also form a noticeable group in the heatmap (Suppl. Fig. 2). These drugs are used in anaesthesia protocols for patients requiring mechanical ventilation and prolonged, deep sedation to optimise oxygenation and ventilation during respiratory failure from COVID-19 [72,73,74,75]. Analgesia drugs also form a cluster through their shared action on PTGS1 and PTGS2, also known as cyclooxygenase 1 and cyclooxygenase 2 [76]. The corticosteroid section (bottom left of the heatmap, Suppl. Fig. 2) has one shared target that stands out: nuclear receptor subfamily 3 group C member 1 (NR3C1).
This receptor was implicated in the progression of the COVID-19 infection, where a single-cell transcriptome study revealed that the NR3C1-CXCL8-Neutrophil axis determines the severity of the COVID-19 disease [77].
Based on the drug-target network analysis, we concluded that, in order to better understand pharmacological interaction processes, we needed to expand the drug interactome space.
Interactome investigation revealed new opportunities for drug repurposing and combination therapies
Because targets are part of complex signalling networks, we explored how large the interactome overlap is between drugs and whether it could be used to find alternative therapies, repurpose existing ones, or help develop new drugs faster. In other words, the main known targets for each drug were expanded by mining protein-protein interaction networks to extract an estimated network that has links to the drug through its main target. These networks were searched for overlaps in pairwise drug comparisons, where the overall protein set consisted of 5922 unique targets. On average, each main drug target had about 97.6 interactors (Suppl. Fig. 3), ranging from no known interactors to a maximum of 558 other proteins. This search allowed us to identify emerging patterns in the data on a systemic level (Fig. 3), where three larger and two smaller clusters with relatively large, shared networks were found for different drugs. Clusters 1, 2, and 3 were found to be the least diverse considering the number of seed proteins (i.e., the main drug targets) (Fig. 3; Suppl. Table 1 and Suppl. Fig. 4). By contrast, clusters 4 and 5 showed the most diversity, with 17 and 29 unique seed proteins, respectively.
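The pairwise network comparison described above can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the pipeline used in the study: all drug and protein names and all scores are invented placeholders, a network is assumed to contain the seed targets plus their first-degree interactors, and the STRING threshold of 700 quoted in the Fig. 3 legend is assumed to mean "combined score of at least 700".

```python
from itertools import combinations

SCORE_THRESHOLD = 700  # cut-off quoted in the Fig. 3 legend; ">=" is an assumption


def expand_network(seed_targets, scored_interactions, threshold=SCORE_THRESHOLD):
    """Return the seed targets plus their first-degree interactors above the threshold."""
    seeds = set(seed_targets)
    network = set(seeds)
    for protein_a, protein_b, score in scored_interactions:
        if score < threshold:
            continue
        if protein_a in seeds:
            network.add(protein_b)
        if protein_b in seeds:
            network.add(protein_a)
    return network


def shared_network_sizes(drug_targets, scored_interactions):
    """Pairwise overlap sizes; (drug, drug) entries hold the drug's own network size."""
    networks = {
        drug: expand_network(targets, scored_interactions)
        for drug, targets in drug_targets.items()
    }
    sizes = {(drug, drug): len(net) for drug, net in networks.items()}
    for drug_a, drug_b in combinations(sorted(networks), 2):
        overlap = len(networks[drug_a] & networks[drug_b])
        sizes[(drug_a, drug_b)] = sizes[(drug_b, drug_a)] = overlap
    return sizes


# Hypothetical toy input: placeholder drugs, proteins, and scores.
toy_targets = {"drug_A": ["P1"], "drug_B": ["P2"], "drug_C": ["P5"]}
toy_interactions = [
    ("P1", "P3", 900),
    ("P2", "P3", 850),
    ("P2", "P4", 720),
    ("P1", "P4", 400),  # below the threshold, so ignored
    ("P5", "P6", 950),
]
sizes = shared_network_sizes(toy_targets, toy_interactions)
```

The resulting dictionary corresponds to a symmetric matrix with network sizes on the diagonal, which is the layout the Fig. 3 heatmap legend describes; clustering that matrix would then group drugs with large shared networks.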
Fig. 3
A heatmap showing the size of shared gene networks for drugs (n = 230) that were used to treat or investigated for the treatment of COVID-19. Every drug used for the treatment had a main target and an extended network that consisted of protein-protein interactions or associations which were mined based on experimental, text mining, and analysis-based evidence (STRING database; threshold = 700). Each drug and its associated interactor network were cross-referenced against other drugs to establish a shared network, i.e., overlapping drug sets. This information is shown via the heatmap where diagonal entries represent the network size for a selected drug. Clusters selected for the downstream analysis are highlighted in green squares with the cluster IDs next to them. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 4
COVID-19 drug network enrichment for drug interactor cluster 5 (29 unique main targets, 23 drugs). A - a network plot showing genes for the top five enriched groups. B - a dot plot depicting specific functional enrichment. The size of the dots indicates the size of the cluster for the particular functional group, with the probability provided through p.adj values.
Exploring the two most diverse clusters, clusters 4 and 5, we can see that cluster 4 contains drugs used to treat hypertension, namely losartan, valsartan, telmisartan, candesartan, and ambrisentan. Losartan is an angiotensin II receptor blocker (ARB) used to treat hypertension, and it has been proposed that this drug, acting as a selective antagonist of the angiotensin II type 1 (AT1) receptor, may offer some protection from lung damage induced by COVID-19 [78,79,80]. As can be seen, cluster 4 drugs are predicted to mostly engage networks of the same size; however, comparing and exploring drug-specific networks can help identify diverging biochemical processes that could be therapeutically relevant.
One example of such an approach can be found in cluster 4 for ambrisentan, a selective type A endothelin (ET-A) receptor antagonist. This selective antagonist is used primarily to treat pulmonary arterial hypertension and has been applied in COVID-19 combination therapy with dapagliflozin, which inhibits sodium glucose co-transporter-2 (SGLT-2) [81]. While dapagliflozin has not been listed in the Open Targets COVID-19 clinical trial data (November 2021), we explored the extended interactome for SGLT-2 using the same principles as for other compounds. We then compared dapagliflozin with ambrisentan for any overlapping targets in their networks. The dapagliflozin network is relatively small, with only 11 targets, while ambrisentan's network comprises 183 proteins. The networks shared only two targets, namely adenylate cyclase 7 (ADCY7) and glucagon (GCG), highlighting that the two drugs engage different cellular processes with a two-gene convergence point. Similarly, cluster 5 combines several different drug classes, e.g., aviptadil, prasugrel, chlorpromazine, and naltrexone, that converge through a shared interactor network (Fig. 3; Suppl. Table 1 and Suppl. Fig. 4). Ticagrelor and prasugrel (P2Y12 platelet inhibitors) have been employed to manage acute coronary syndrome (ACS) and as thromboprophylaxis in patients with COVID-19. Numerous studies explored combinations of enhanced prophylactic doses to correct the parameters of viscoelastic coagulation [82,83,84]. Other drugs, such as chlorpromazine (phenothiazine antipsychotic), naltrexone (opioid receptor antagonist), and fingolimod (a sphingosine 1-phosphate receptor modulator), have been recognised for their multi-modulation potential and have also been tested in various clinical settings [85,86,87,88]. Considering our findings, it is important to highlight the need for the consolidation and further exploration of the pleiotropic effects of the drugs used to treat COVID-19.
Consequently, we selected cluster 5, the most diverse cluster, to explore the occurring interactions in more depth.
Interactome analyses accentuated diverse process networks that could be used to advance therapeutics development
Rather than exploring genes in isolation, we examined which signalling networks and functional processes are enriched for the identified clusters. Fig. 4 highlights that cluster 5, identified during the interactome analysis (Fig. 3; Suppl. Table 1 and 2), also shows varied functional enrichment patterns ranging from calcium ion homeostasis to neutrophil migration. A proportion of genes from cluster 5 also belongs to the extracellular signal-regulated protein kinase (ERK) cascade, which has previously been reported as a potential therapeutic target in coronavirus infections, where cascade inhibition was observed to lead to infection resolution [89]. Other clusters had different profiles, with clusters 3 and 4 sharing several functional themes involving haemostasis (Suppl. Fig. 5). Haemostatic aspects of COVID-19 have been reported as a serious concern in stabilising patients and reducing tissue/organ damage [82,90,91]. Furthermore, clusters 1 and 2 are good examples demonstrating how even a few genes/proteins (i.e., cluster drug main targets) can impact multiple different cellular functions through the extended network; such considerations can be useful when predicting drug effects or exploring alternative uses (Suppl. Fig. 5) [5,7,10]. To better understand specific functions in a gene cluster, it is helpful to explore whether any of the genes in the over-represented functional groups belong to the same pathway. Cluster 5 has several clear themes, where some genes are shown to play a role in multiple pathways and others are much more pathway-specific (Fig. 5). As an example, we can inspect one drug from cluster 5, cenicriviroc (an experimental drug and inhibitor of the C-C motif chemokine receptors CCR2 and CCR5), which has several targets, e.g., interleukin 10 (IL-10), CCR5, C-C motif chemokine ligand 20 (CCL20), C-X-C motif chemokine ligand 10 (CXCL10), CD86, and formyl peptide receptor 1 (FPR1), belonging to the IL-10 signalling pathway (Fig. 5; Suppl. Table 1 and 2). Due to its apparent broad engagement spectrum, cenicriviroc has been included in clinical trials to assess its anti-inflammatory and immunomodulatory effects [92,93,94,95]. Drugs with different targets, such as icatibant (bradykinin 2 receptor antagonist) or ozanimod (sphingosine 1-phosphate receptor agonist), show a potential pathway overlap through shared network targets, and this understanding could be useful in managing their COVID-19 clinical trials (Fig. 5; Suppl. Table 2) [96,97].
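For readers unfamiliar with functional enrichment, the calculation behind an over-representation result and its adjusted p-value can be sketched as below. The hypergeometric test and the Benjamini-Hochberg correction are what enrichment tools of this kind commonly use, but the text does not report the exact settings, so treat both as assumptions; the function names and the numbers in the toy example are invented.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, term_size, sample_size, overlap):
    """P(X >= overlap) when sample_size genes are drawn without replacement
    from a universe in which term_size genes carry the annotation."""
    upper = min(term_size, sample_size)
    total = comb(universe_size, sample_size)
    tail = sum(
        comb(term_size, k) * comb(universe_size - term_size, sample_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that monotonicity can be enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Invented toy numbers: 20 genes in the universe, 5 annotated to a term,
# a cluster of 4 genes of which 3 carry the annotation.
p_term = hypergeom_enrichment_p(universe_size=20, term_size=5, sample_size=4, overlap=3)
p_adjusted = benjamini_hochberg([0.01, 0.04, 0.03])
```

In a real analysis the universe would be the set of annotated background genes and one p-value would be computed per ontology term before adjustment.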
Fig. 5
Pathway annotations for genes in cluster 5 (29 unique main targets, 23 drugs). Genes from cluster 5 (x-axis) were analysed based on their associations with specific pathways (y-axis) and the different memberships were visualised using a heatmap. Pathway data was mapped based on the Reactome database.
The identified enrichment clusters as well as individual drugs could be used to compare and match different combination therapy regimens, such as haemodynamics-modulating and anti-inflammatory ones. In the case of anti-inflammatory action, it is possible to compare drugs with immediate vs long-term effects in order to reduce tissue damage occurring in acute and chronic disease progression.
miRNAs represent a potentially new biomarker and therapeutic modulation space linking the drug interactome
Seeing the complexity of the drug networks, we also analysed the non-coding regulatory layer for the most diverse cluster 5 (Fig. 3). We used the mined data of validated miRNAs and their regulome genes to explore the dynamics of miRNAs in the selected gene cluster [48]. We found a rich network of miRNAs known to be involved in the regulation of multiple genes (Suppl. Fig. 6). We identified that some genes from cluster 5 are linked to the miR-320 family, whose downregulation has been associated with the progression of disease severity; miR-320 family members have been suggested as potential biomarkers for SARS-CoV-2 [98,99]. Other miRNAs from cluster 5 have also been reported in other studies as prognostic markers. For example, circulating miRNAs from ten COVID-19 patients (sampling done longitudinally with ten age and gender matched healthy donors) allowed profiling of the alteration of 55 miRNAs in COVID-19 patients during early-stage disease [30]. Our enrichment recovered miR-31-5p (marked upregulation in COVID-19 patients) and other strongly associated biomarker miRNAs, namely miR-423-5p and miR-23a-3p [30].
Fig. 6
Compound property distributions for COVID-19 drugs (n = 158) where density plots and linear regression plots are also provided with pairwise scatter plots. Abbreviations: AP - Aromatic proportion = number of aromatic atoms / number of heavy atoms; MW – molecular weight, TPSA - topological polar surface area; MolLogP – log of a partition coefficient for a molecule.
Compound chemical profiles capture certain interactome features and also reveal a highly heterogeneous chemical space
While various drugs showed interesting interactome overlaps, we assessed the chemical and pharmacological characteristics and investigated if network analyses can be superimposed on chemical data. Only 158 compounds out of 230 unique drugs used in COVID-19 investigational studies were selected for chemical profiling after filtering out similar drugs (i.e., only differences in formulation) and antibodies or small peptides. To analyse the chemical feature space, we employed chemical descriptors, structural analysis, and fingerprint-based approaches. We started compound analysis from a medicinal chemistry perspective (e.g., calculated partition coefficient - CLogP, molecular weight - MW, topological polar surface area - TPSA, etc.) to gain important insights about any biases in the data, such as lipophilicity or hydrophilicity. As can be seen, compounds showed diverse characteristics (Fig. 6); however, no specific correlation patterns could be identified for pairwise comparisons aside from the expected physicochemical relationships, e.g., MW vs C atom count. Such analyses provide initial glimpses into any emerging patterns that could be explained by linear dependencies and also help to evaluate any outliers or composition biases. We continued this analysis by performing cross-compound similarity evaluation using Morgan fingerprints (nBits = 2048, radius = 2) and Tanimoto similarity scores (Fig. 7) [58,100]. Surprisingly, most of the compounds showed only borderline similarity fluctuating around 0.2 and just the steroid group stood out with higher similarity scores forming a cluster. Other smaller groups can also be identified, e.g., angiotensin receptor blockers (Fig. 7). However, only cluster 2 (Fig. 3) drugs show clear links between network and chemical features (Fig. 7). Other categories are not only more dispersed (with partitioned clusters) but also show very little overall similarity. Overall, the assessment underlined specific fragments or structural elements, such as heterocycles, fused ring structures, and/or amphipathic groups, as a few chemical features influencing the observed similarity across the analysed compounds.
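The pairwise comparison described above can be illustrated with a minimal sketch. It assumes each fingerprint is already available as a set of on-bit indices; in practice the 2048-bit Morgan fingerprints would be generated with a cheminformatics toolkit. The function names and the toy fingerprints are hypothetical and are not taken from the study.

```python
from itertools import combinations


def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not bits_a and not bits_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(bits_a & bits_b)
    return shared / (len(bits_a) + len(bits_b) - shared)


def similarity_matrix(fingerprints):
    """Symmetric all-vs-all similarity table (dict of dicts), diagonal set to 1.0."""
    names = list(fingerprints)
    matrix = {a: {b: 1.0 if a == b else 0.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        score = tanimoto(fingerprints[a], fingerprints[b])
        matrix[a][b] = matrix[b][a] = score
    return matrix


# Toy fingerprints: hypothetical bit indices, not real compounds.
toy = {
    "compound_A": {1, 5, 9, 42, 100},
    "compound_B": {1, 5, 9, 77, 300},
    "compound_C": {2, 600, 1500},
}
sim = similarity_matrix(toy)
for name, row in sim.items():
    print(name, {other: round(score, 2) for other, score in row.items()})
```

A table of this kind is what a similarity heatmap visualises; scores near 0.2 across most pairs would correspond to the weak overall similarity reported for the compound set.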
Fig. 7
A compound similarity heatmap for COVID-19 drugs (n = 158) where the legend provides information on the clusters identified through the gene network analysis for COVID-19 drugs. Similarity was assessed using the Tanimoto similarity method and compound fingerprints were calculated as Morgan fingerprints (nbits = 2048, radius = 2).
The network-based representation of the SARS-CoV-2 infection has also found support in other studies searching for a framework to evaluate specific clinical outcomes [[101], [102], [103]]. We, however, add to this proposition by bridging biological, pharmacological, and chemical spaces where searching for privileged structures to treat COVID-19 might involve a significantly larger chemical space and more variation than is currently considered [[101], [102], [103]]. We used cluster 5 as our case study to further evaluate what chemical features exist for these compounds and if we could use that information to search for new drugs or design alternative compounds.
To explore in depth the cluster 5 compounds (Fig. 3, Suppl. Table 1 and 2) that were split into smaller groups based on chemical similarity (Fig. 7), we isolated these compounds and assessed them as a separate set (Suppl. Fig. 7). Cluster 5 compounds share little similarity with the highest similarity score being 0.43 (Tanimoto similarity) between prasugrel and cenicriviroc. These drugs have several features contributing to the similarity where, for example, heterocycles and substituted benzenes are among the major elements linking these two drugs. On the other hand, most other compounds from the cluster have marginal similarity. Another example of low similarity and different chemical structures can be found in the comparison between dexmedetomidine (used as an anxiolytic, sedative, and pain management drug with the ability to provide sedation without risk of respiratory depression) and melatonin (sleep-wake cycle regulating hormone) (Tanimoto similarity = 0.09) (Suppl. Fig. 7) [104,105]. 
These observations underline why selecting pharmacological management options cannot rely on the chemical similarity of drugs alone or just main known targets because pharmacological engagement depends on multiple direct and indirect effects contributing to the cellular and organism level response (Suppl. Table 2).
Machine and deep learning based QSAR models underlined the limitations in current chemical libraries used for drug design
We began building a QSAR model by first mining the ChEMBL database (>2.1 M compounds) to investigate if cluster 5 compounds had any matches based on similarity [49]. To mine the database, we set the Tanimoto similarity threshold to >0.4 based on earlier observations of the existing heterogeneity within COVID-19 compounds (Fig. 7). This assessment returned various compounds that matched specific reference drugs from cluster 5 (Suppl. Table 3). Such findings demonstrate that while similarity-based search might lead to identifying more drugs, this may not be the optimal strategy as more complex chemical relationships need to be established when searching for active compounds. Specifically, the earlier chemical space analysis motivated us to explore compound chemical characteristics not limited to similarity but relying on the drug's ‘architecture’ features [58,100,106].
We first built a machine learning model (LightGBM) with a gradient boosting framework to take advantage of the tree-based learning algorithm for complex classification tasks [7,21,106,107,108]. In order to develop this model, we needed a balanced dataset representing active and inactive compounds. A curated set of known antivirals and/or compounds resembling antivirals (COVID-19 CAS) was used to build a reference compound set for the expected actives (n = 48,876) [50]. To prepare the inactives (n = 50,000), we opted to randomly search the ChEMBL database and select the least similar compounds when compared to the actives (<0.2 Tanimoto similarity score) [49]. Chemical characteristics analysis (Suppl. Fig. 8 and Suppl. Fig. 9) confirmed that the selected groups are diverse without any noticeable biases in composition. We set aside 20% of the combined data for the model evaluation. Each compound was prepared as a vectorised representation of Morgan fingerprints (nBits = 2048, radius = 3) to represent the chemical features which we reasoned captured both structural and composition elements. 
The model showed 96.77% accuracy without any marked overfitting and successfully classified the test compounds as active and inactive. Applying the model to the original dataset of 158 compounds in COVID-19 clinical trials, 13 drugs were predicted to have antiviral activity (Suppl. Fig. 10). Some known antiviral drugs, such as ritonavir or maraviroc, were also included in the returned group. Other interesting therapeutic options included menthol which has been suggested to have anti-inflammatory and antiviral properties [109] and amantadine which was shown to block the ion channel activity of Protein E from SARS-CoV-2 (a conserved viroporin among coronaviruses) [110]. Surprisingly, when we used the developed QSAR model to test experimentally validated antivirals against the COVID-19 Mpro protein [37], none of the compounds were classified as possessing any activity against COVID-19.
We developed a deep learning neural net to perform the same classification using a different type of compound feature encoding (Mol2vec) to mitigate any effects stemming from the model itself or compound preparation [58]. The training set-up was again a randomised selection of compounds setting aside 20% to monitor the performance of the neural network and inspect for under-/over-fitting. The model reached 96.48% accuracy after 200 epochs and returned 10 compounds from the COVID-19 clinical trial drug set as active against the virus. The compounds matched the ML model for decitabine, maraviroc, ritonavir, ribavirin, amantadine, baricitinib, hydroxychloroquine, etoposide, and cobicistat (Suppl. Fig. 11). Again, the Mpro experimental dataset was returned as inactive.
These interesting findings point to the fact that compounds expected to be effective against the SARS-CoV-2 virus based on searching known antivirals and/or similar compounds do not represent a complete and relevant chemical space. 
This might also explain why identifying new therapeutic strategies for COVID-19 drug repurposing has been difficult [111]. Such observations highlight the need to combine both in silico analytics and experimental data to retrieve valuable compounds.
Finally, we also investigated the Mpro experimental dataset which included a range of compounds with varying similarity (Suppl. Fig. 12). We mined the ChEMBL database to search for compounds that showed a Tanimoto similarity higher than 0.6 for Mpro molecules based on fingerprint comparisons. This returned several compounds and we then inspected their maximum common substructures (MCS) to evaluate any shared features (Suppl. Fig. 12). Interestingly, the experimental compound, namely Z1220452176, contained a functionalised indole ring which matched melatonin. Melatonin has been included in COVID-19 clinical trials and while two different QSAR models did not predict it to be an antiviral compound, it clearly shares a large substructure with the experimental compound [112].
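The dataset preparation described for the QSAR models (dissimilar inactives below a 0.2 Tanimoto score, a 20% hold-out, and accuracy as the readout) can be sketched as follows. This is an illustration under stated assumptions and not the pipeline used in the study: fingerprints are represented as sets of on-bit indices, "least similar" is interpreted here as the maximum similarity to any active being below the threshold, and all function names and toy data are hypothetical.

```python
import random


def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    shared = len(bits_a & bits_b)
    union = len(bits_a) + len(bits_b) - shared
    return shared / union if union else 0.0


def select_inactives(candidates, actives, threshold=0.2):
    """Keep candidates whose highest similarity to any active is below the threshold.

    Assumption: the cut-off is applied to the maximum similarity across all actives.
    """
    return [
        fp for fp in candidates
        if max((tanimoto(fp, act) for act in actives), default=0.0) < threshold
    ]


def hold_out_split(records, test_fraction=0.2, seed=0):
    """Shuffle reproducibly and set aside a fraction of the records for evaluation."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


# Toy fingerprints: hypothetical bit indices, not real compounds.
actives = [frozenset({1, 2, 3, 4}), frozenset({3, 4, 5, 6})]
candidates = [
    frozenset({1, 2, 3, 9}),             # too similar to the first active
    frozenset({10, 11, 12}),             # shares no bits with any active
    frozenset({4, 20, 21, 22, 23, 24}),  # weak overlap, below the cut-off
]
inactives = select_inactives(candidates, actives)
labelled = [(fp, 1) for fp in actives] + [(fp, 0) for fp in inactives]
train, test = hold_out_split(labelled)
print(len(inactives), len(train), len(test))
```

Because the inactives are chosen to be dissimilar to the actives by construction, a classifier trained on such a set can reach high accuracy while still failing on compounds outside both groups, which is consistent with the behaviour reported for the Mpro experimental set.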
Discussion
The global spread of the SARS-CoV-2 virus resulted in a fast-evolving pandemic which prompted researchers and clinicians to investigate many different therapeutic avenues to combat the emerging healthcare crisis. While this created an opportunity to take advantage of drug repurposing strategies, the large number of ongoing studies revealed pervasive issues in clinical research [1,10,18,22]. Specifically, the lack of accessible and organised data to effectively compare clinical protocols and experimental studies as well as missing updates on clinical trial outcomes create transparency problems. Consequently, this may negatively impact the research and meta-analyses in this field [22]. Such trends also highlight that we need to have a better preparedness for future infectious diseases where rapid information sharing might be critical.
With our study we aimed not only to assess the existing clinical trial data but also to expand the available information so that new perspectives could be applied in clinical studies and therapeutic decision making. We based our analyses on computational pharmacology and systems biology principles to understand the networks that drugs modulate beyond a single main target (Fig. 1) [24,113]. Moreover, we reasoned that focusing on a single target limits our understanding about broader systemic effects and how to effectively select combination therapeutic regimens. We also wanted to bridge the interactome with the chemical compound space and explore shared similarities and emerging patterns as combined information could prove to be useful when developing repurposing strategies further [6,10].
Clinical study and data organisation issues complicate drug repurposing
We began our analysis by exploring the current status of the COVID-19 clinical trial data based on drug clinical phases, drug mode of action, and known main targets. COVID-19 infection management options span multiple clinical phases and drug pharmacological action (Fig. 2). In combination, the results suggest that many different therapeutic avenues have been and are currently being explored to combat the infection as well as the associated complications. However, the overall clinical trial organisation and monitoring come with evidence gaps as there is no systematic data collection and verification, with some reports missing or not harmonised across different databases [22]. Moreover, across academic reports and various databases, drug profiling might reflect different formulations, while in other cases this information is not explicitly reported. Consequently, the lack of an organised approach to monitor clinical trials and systematically collect data hinders repurposing and/or new compound development. These shortcomings not only reflect the current challenges but also call for action to improve our data collection approaches. Specifically, by creating a unified method to combine clinical and academic data, we can significantly improve our forecasting and analytical capabilities which might be very important in various clinical areas, such as drug repurposing, new drug development, and preparing for other epidemics [5,6,10,13].
From a single target to network-centric pharmacology
As we focused on a selected set of compounds (Open Targets, 230 unique drugs with a known target status) used to treat or investigated for the treatment of COVID-19, we explored how their known main targets can be used to get insights into shared action and/or help predict side effects (Suppl. Fig. 2) [6,9]. This led to several interesting findings.
First, our target-focused comparison revealed that while some drugs, e.g., dipyridamole and pentoxifylline, share a number of key targets, their differences in other targets might point to potential side effects. We used dipyridamole and pentoxifylline as a case study since these pharmaceuticals engage a broad spectrum of PDEs. Moreover, multiple studies have demonstrated the importance of phosphodiesterases in regulating various cellular processes [25,69,71,114,115,116]. PDE1A stands out from the rest of the PDE family with earlier research suggesting that this enzyme plays a role in myofibroblast formation [117]. Since PDE1A preferentially hydrolyses cyclic guanosine monophosphate (cGMP), which regulates a variety of cellular responses, including proliferation, transformation, extracellular matrix expression, apoptosis, and vascular tone, it makes sense that under a severe immune challenge curtailing abnormal pro-fibrotic processes might help preserve multiple tissue functions [117]. Support for this also comes from a PDE5A inhibition study (with sildenafil) where the inhibition has been shown to reduce cardiac hypertrophy, adverse remodelling, as well as cardiac inflammation and apoptosis in the hypertensive heart [118]. When comparing dipyridamole and pentoxifylline action, there are some additional main targets where these compounds differ and which could be used as therapeutic guidance. Dipyridamole inhibits ENT1, which serves as a sodium-independent transporter for purine and pyrimidine nucleosides [70,115]. 
Since adenosine is known to contribute to the pathophysiology of respiratory disease where adenosine challenge can lead to bronchospasm and dyspnoea, adenosine clearance can be therapeutically beneficial [115]. ENT1 facilitates the removal of this nucleoside from the extracellular environment, thus terminating its action. Consequently, inhibition of ENT1 can lead to a rapid spike in extracellular adenosine concentration and increased adenosine receptor signalling. While dipyridamole has been suggested as a therapeutic option for COVID-19 with multiple beneficial properties and clinical trials are on-going [70,71], it might prove to be advantageous to consider the side effects when selecting broad spectrum therapeutics. Specifically, higher adenosine concentration has been documented to lead to bronchoconstriction and dyspnoea in asthmatic or chronic pulmonary disease (COPD) patients [119]. Pentoxifylline is also known to act as an immunomodulator with anti-inflammatory properties. The 5′-nucleotidase inhibition of pentoxifylline leads to the reduced production of adenosine and inosine from their monophosphate forms. Through a nonselective phosphodiesterase inhibition as well as A2AR-mediated pathways, this drug downregulates the expression of tumor necrosis factor alpha (TNFα), IL-1, IL-6 and interferon gamma (IFNγ) [25,120]. Current evidence also suggests that pentoxifylline downregulates the A2AR pathway where this therapeutic multi-modulatory action could protect from adenosine receptor overactivity [25,119,120]. Thus, pentoxifylline has been suggested as a repurposing candidate to reduce tissue damage during the cytokine storm resulting from the SARS-CoV-2 infection [25]. Considering the clinical data and based on the target-centric analysis, this drug appears to have more clinical benefits in comparison to dipyridamole. 
Overall, PDE modulators can prove to be useful in regulating multiple cellular functions and minimising tissue damage during uncontrolled or prolonged immune responses. Varying drug specificity towards PDEs also allows more flexibility in clinical approaches; for example, in contrast to other PDE inhibitors, apremilast offers a specific inhibition of PDE4 [114] (Suppl. Fig. 2). The lessons learnt could be applied to other similar infections, especially employing network-based assessments for drug actions.
Another key observation was that drugs with limited sets of known main targets, such as propofol, sevoflurane, isoflurane, cyclooxygenase inhibitors, or corticosteroids, reduce our ability to infer therapeutic and off-target effects. This likely explains why it has been difficult to draw conclusions about the efficacy of some of these drugs in COVID-19 treatment, e.g., in the case of corticosteroid or acetaminophen (paracetamol) use [121,122]. The majority of the studies for non-steroidal anti-inflammatory drugs (NSAIDs) did not indicate any associations between their use and increased mortality rates or an increased risk for respiratory failure during COVID-19 and thus, NSAID use is supported to manage COVID-19 symptoms, such as fever or muscle pain [121,123,124]. There are reports where acetaminophen (paracetamol) was found to be linked with worse outcomes; yet other case studies do not report any significant differences between clinical outcomes for paracetamol or ibuprofen users [125,126]. Corticosteroid treatment has also been linked to IL-6 levels where one study showed that alveolar macrophages, endothelial cells, and smooth muscle cells co-express NR3C1 and IL-6, implicating a potential corticosteroid role in the cytokine storm [127]. 
However, corticosteroid use has diverging support as systemic studies are lacking and data collected from meta-analyses do not allow all patient subgroups to be accounted for [122,128].
We appreciate that the more information we have about known drug targets, the better differentiation and selection of drugs can be achieved [24]. Thus, until the compounds used in treating COVID-19 have extensive studies on their other potential targets, we must rely on data mining to understand the larger modulation potential of each and every drug.
To compensate for these limitations, we propose to use an extended interactome analysis where a drug's main target or targets are used to search for close interactors and establish network-centric characteristics of the drug. For drugs in COVID-19 clinical trials, we identified several clusters that have multiple interactions where clusters 1, 2, and 3 were found to be the least diverse and the two remaining clusters had the most diversity considering the number of seed proteins (or the main drug targets) (Fig. 3; Suppl. Table 1 and Suppl. Fig. 4). One of the most diverse clusters, cluster 4, contains drugs used to treat hypertension where clustered pharmaceuticals mostly belong to networks of the same size. One drug from this cluster, namely losartan, acts through angiotensin-converting enzyme (ACE), angiotensin II (Ang II), and AT1 or ACE–Ang II–AT1 axis in the renin–angiotensin system (RAS) which is a known molecular pathway for end-organ fibrosis [78]. However, clinical studies focusing on another blocker, namely valsartan, did not report any significant benefits [80]. Thus, comparing drugs based on their network similarity could aid in planning clinical trials. A good example of identifying diverging biochemical processes can also be found by analysing the extended network targets for ARBs and cetirizine (an antihistamine drug), where one interesting gene uniquely belongs to the cetirizine cluster. 
Histidine decarboxylase (HDC) is a member of the group II decarboxylase family and converts L-histidine to histamine in a pyridoxal phosphate dependent manner. This histamine producing enzyme is known to be induced at inflammatory sites of both allergic and non-allergic reactions. Since histamine regulates various physiologic processes, including neurotransmission, gastric acid secretion, smooth muscle tone, and inflammation, its regulatory pathways can prove to be therapeutically valuable [129,130]. In addition, cetirizine and other antihistamines have been tested in COVID-19 clinical trials and showed beneficial outcomes [129,131]. Consequently, exploring therapeutic use from a network-centric perspective can reveal new therapeutic targets based on known drug efficacy and comparative studies. This information could be used either to repurpose or develop new drugs.
Integrating clinical studies, patient evaluation, and advanced omics analyses can enable pooling and assessing clinical readouts which could make not only mono-therapies but also combination treatment more effective [[5], [6], [7]]. Moreover, the latter therapeutic strategy might offer a better pharmacological intervention ensuring a targeted pathway modulation across multiple effectors to avoid imbalanced responses. Such views are gaining more support; that is, even though certain therapies have shown benefit in subsets of the treatment population, the complexity of the viral infection underscores the usefulness of combination therapy in increasing treatment efficacy [10]. We discussed an example of ambrisentan, a selective ET-A receptor antagonist that has been used in combination with SGLT-2 inhibition to treat COVID-19 [81]. Since endothelin is a potent vasoconstrictor with pro-inflammatory and atherosclerotic action, selectively inhibiting the ET-A receptor can be expected to improve pulmonary haemodynamics and oxygenation as well as reduce tissue injury. 
Ambrisentan's high potency and selectivity towards ET-A (4000 times greater affinity for the ET-A versus ET-B receptors) have been hypothesised to mitigate adverse effects through ET-A receptors, while preserving the potential beneficial vasodilatation of NO and prostacyclin whose release is mediated by ET-B receptors on vascular endothelial cells [81]. To attenuate the injurious effects of COVID-19, concomitant SGLT-2 inhibition with dapagliflozin may also prove effective to reduce inflammatory cytokines and improve endothelial function as well as cardiovascular haemodynamics [81]. As a drug class, SGLT-2 inhibitors depend on blood glucose concentration and kidney function since their action of lowering blood glucose levels is achieved via the kidney independently of insulin secretion and sensitivity status. Thus, SGLT-2 inhibition would be expected to reduce inflammation and improve glucose homeostasis, cellular metabolism, endothelial function, and cardiovascular haemodynamics. Dapagliflozin and ambrisentan networks demonstrate well that while classical pharmacological assessment depends on searching protocols and comparing data on a single target (or several known main targets), we can benefit from computational pharmacology-oriented data mining and algorithmic evaluations. In other words, network-centric approaches allow simultaneous exploration of multiple targets and their associations across the network of interest [19,24,103]. Seeing the value in the extended network analyses encouraged us to explore further if we could associate specific clusters with cellular processes and pathways as a way to capture specific cluster features and enrich the analytical space.
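The idea of an extended interactome, in which a drug's main targets act as seeds for retrieving close interactors, can be illustrated with a short sketch. It assumes that "close interactors" means first neighbours in an undirected interaction list and that network overlap is scored with the Jaccard index; both choices, along with the placeholder drug, target, and interactor names, are hypothetical and not taken from the study.

```python
def extended_network(seeds, interactions):
    """Return the seed targets plus their first neighbours in an undirected edge list."""
    network = set(seeds)
    for node_a, node_b in interactions:
        if node_a in seeds:
            network.add(node_b)
        if node_b in seeds:
            network.add(node_a)
    return network


def jaccard(nodes_a, nodes_b):
    """Shared nodes divided by all nodes present in either network."""
    union = nodes_a | nodes_b
    return len(nodes_a & nodes_b) / len(union) if union else 0.0


# Toy interaction list and main targets: placeholder names, not real proteins.
interactions = [
    ("T1", "X1"), ("T1", "X2"),
    ("T2", "X2"), ("T2", "X3"),
    ("T3", "X4"),
]
drug_targets = {"drug_A": {"T1"}, "drug_B": {"T2"}, "drug_C": {"T3"}}

networks = {
    drug: extended_network(targets, interactions)
    for drug, targets in drug_targets.items()
}
print(sorted(networks["drug_A"] & networks["drug_B"]))
print(jaccard(networks["drug_A"], networks["drug_B"]))
```

In this toy case drug_A and drug_B have no main target in common, yet their extended networks share a node, which is the kind of overlap a single-target comparison would miss.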
Enrichment analysis offers new pharmacological insights
Cluster enrichment analysis not only revealed specific cellular processes based on the cluster's gene composition but also uncovered shared similarities (Fig. 4; Suppl. Fig. 5). For example, some genes from cluster 5 are linked to the ERK cascade which has been suggested as a potential therapeutic target in coronavirus infections [89]. Calcium homeostasis also appears to be affected by the extended network genes in cluster 5 (Fig. 4) and since calcium-linked cellular processes have been implicated in various COVID-19 outcomes, this could be helpful in narrowing down specific clinical strategies [132,133]. Less diverse clusters in terms of their seed proteins, namely clusters 1 and 2, also help to illustrate that even a small number of main targets can be very important in their process modulation because of their interactome size (Fig. 3). We investigated whether the ontology-based process exploration can be mapped onto pathways. Using cluster 5 as a case study we identified diverse patterns for genes in the cluster. Some genes were mostly shared between several pathways, while others dominated across many signalling events. Cenicriviroc served as a good example for a broad modulator where several members of the extended network belong to the IL-10 signalling pathway. The extensive regulatory potential of cenicriviroc has been taken advantage of in clinical management of COVID-19 [93]. A different drug from the same cluster, namely HuMax-IL8 (experimental antibody inhibiting human IL-8), shares quite a few network members with cenicriviroc. IL-8 is a pro-inflammatory cytokine involved in neutrophil activation and has been linked to the COVID-19 pathogenesis. In addition, as SARS CoV-2 has led to an increase in complications including Acute Respiratory Distress Syndrome (ARDS), the crucial role of IL-8 in lung inflammation has been suggested as a possible new therapeutic target to modulate the hyper-inflammatory response in ARDS [134,135]. 
Expanding this analysis with computational pharmacology could greatly increase research translational potential and help identify new therapeutic regimens, especially since ARDS has limited therapeutic options [135]. Furthermore, some drugs, e.g., icatibant or ozanimod, appear to show a potential pathway overlap through shared network targets and such information could also be very helpful in clinical decisions. Specifically, such comparative analyses can help with not only off-target prediction but also finding new therapeutic combinations to manage acute and chronic disease progression. As a result, our exploration of the drug associated interactome reveals how critical it is to understand the broader pharmacological network of a drug. Such integrative analyses could help prioritise therapeutic repurposing and even predict unwanted outcomes. For example, hydroxychloroquine, after various clinical trials, was found not to show beneficial action and strong recommendations were issued against the drug's inclusion in clinical protocols [136]. Our analysis revealed that hydroxychloroquine did not have many shared targets in the extended network (Fig. 3) and exploring common targets could have helped predict some off-target effects or optimize treatment for the most suitable patient groups. For example, hydroxychloroquine shares several extended network nodes with celecoxib (a NSAID) which has known cardiotoxic effects and similar issues were found for hydroxychloroquine in clinical trials [137]. Understanding drug combination use is also integral in the intensive care settings where patients are treated with many pharmaceuticals at once. It has been reported that between 46 and 90% of patients admitted to the intensive care unit (ICU) are exposed to potential drug-drug interactions [138].
miRNAs open new possibilities for therapeutics investigation
The interactome is only one aspect of the complex cellular features. We suggest a new clinically valuable avenue of miRNAs as biomarkers or even therapeutic targets in COVID-19. We used the most diverse cluster 5 (Fig. 3) to explore the non-coding regulatory layer by mining the data of validated miRNAs and their regulated genes. We identified a rich network of miRNAs known to be involved in the regulation of multiple genes from cluster 5 (Suppl. Fig. 6). Evidently, miRNAs have multiple pleiotropic effects and could be used as therapeutic targets. Specifically, a drug or combination therapy could be used to influence this regulome layer by inducing expression or suppression of miRNAs to achieve a clinical effect [139]. While reports are limited on the miRNA role in COVID-19, it is possible to appreciate their potentially significant function [31,99,140]. As a result, using compound target and/or interactome data we can extrapolate miRNA involvement and use that to guide therapeutic decisions or disease monitoring. It is also necessary to stress that seeking only significantly changed expression levels of miRNAs might lead to missing important clues for the overall pathways and processes that may be linked through miRNAs. This is also exemplified by miR-150 which lacks noticeable changes in people with COVID-19 but still regulates and interlinks a number of genes (Suppl. Fig. 6) [141].
Computational pharmacology allows bridging the chemical and biological space
Computational pharmacology analyses and disease-associated process modelling/investigation can be further enriched by exploring the medicinal chemistry space and linking functional parameters with drug chemical characteristics. Such analyses could significantly improve our therapeutic evaluation strategies and also aid in uncovering broader trends leading to specific pharmaceutical action. As we profiled chemical features of compounds in COVID-19 investigational studies, we immediately identified a high heterogeneity across all pharmaceuticals (Fig. 7). Moreover, there was very little overlap between pharmacological action and physicochemical features with the exceptions being the steroid and angiotensin receptor blocker groups. This prompted us to theorise that a high similarity is not necessary for the COVID-19 drugs as the modulation of signalling pathways and/or biological processes is most likely achieved through different interactome nodes. This not only means that a broader chemical space can be considered as a therapeutic option but also that repurposing strategies should take into account the interactome of a drug [[101], [102], [103]]. Importantly, such comparative analysis could offer a better combination therapy selection and a more robust approach towards clinical and repurposing studies [5,6].
Machine and deep learning have been shown to be highly effective when identifying lead compounds. Compound fingerprinting and graph convolution principles are employed to build neural networks; however, these methods do come with shortcomings depending on the neural architecture itself and the available data [6,12]. We also saw the dependence on chemical training data in our QSAR model designed to predict antiviral compound properties that could guide the selection of COVID-19 antivirals [142]. 
While we built two different QSAR models (machine learning: LightGBM and a deep learning neural network) both of which relied on different compound feature encoding, none of the experimentally tested antivirals against COVID-19 Mpro protein were identified to have antiviral properties. This contrasted with the results for the COVID-19 clinical trial dataset as both QSAR models were successful in selecting drugs with predicted antiviral properties against COVID-19. Some of these identified drugs are known antiviral therapeutics. We reasoned that these discrepancies underscore the still limited chemical space for COVID-19. As modelling required a large number of molecules to train, we relied on COVID-19 CAS data (containing known and/or predicted antivirals) [50]. Consequently, it is likely that enriching the compound set and also including experimental data would significantly improve our ability to identify new compounds and avoid biases. Moreover, exploring a wider chemical feature space (not limited to similar compounds target- and structure-wise) could potentially lead to discovering broad action compounds where multi-action profile could help in managing more aspects of inflammation and reduce tissue damage.Overall, compared to de novo drug discovery, repurposing can be an attractive option. This is because of the significantly lower development risks since drugs have established safety and pharmacological profiles allowing a direct entry into phase II clinical trials. Accelerated therapeutic translation, cost reduction, and possibility to explore combination therapies make therapeutic repurposing especially desirable [21,27]. Despite the obvious advantages of drug repurposing, there are several major pitfalls. For example, the target population might differ significantly from the one that was involved in the original drug clinical trials. 
In addition, our understanding about targets and the interactome might be limited which further complicates repurposing [6]. Seeing this and some recent controversies, e.g., hydroxychloroquine use, we propose an integrative approach that relies on computational pharmacology, systems biology, medicinal as well as computational chemistry to efficiently evaluate potential therapeutic candidates. A comprehensive strategy is critical for the identification of COVID-19 therapeutic solutions and future drug repurposing [20,143]. Chakraborty and colleagues provide an excellent summary of the conflicting results in COVID-19 clinical trials which accentuates the need for better global strategies in clinical trials and epidemiological meta-analyses [10,22]. Finally, the issue is not just COVID-19, we need to have a better preparedness for future pandemics and also learn how to use the existing data to advance treatments for other diseases.
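The chemical-space argument made in this section (high heterogeneity among investigational compounds and the dependence of QSAR predictions on training data) can be illustrated with a minimal sketch based on Tanimoto similarity. This is not the study's implementation: fingerprints are represented here as plain sets of "on" bit indices, the example fingerprints are invented, and the 0.5 similarity threshold and all function names are assumptions chosen only for illustration.

```python
from itertools import combinations


def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two fingerprints given as sets of on-bits."""
    union = fp_a | fp_b
    if not union:
        return 0.0  # two empty fingerprints: treated as dissimilar in this sketch
    return len(fp_a & fp_b) / len(union)


def mean_pairwise_similarity(fingerprints):
    """Average Tanimoto similarity over all compound pairs (low value = heterogeneous set)."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        raise ValueError("at least two fingerprints are required")
    return sum(tanimoto(a, b) for a, b in pairs) / len(pairs)


def nearest_training_similarity(query_fp, training_fps):
    """Similarity of a query compound to its closest neighbour in the training set."""
    return max(tanimoto(query_fp, fp) for fp in training_fps)


def in_applicability_domain(query_fp, training_fps, threshold=0.5):
    """Flag whether a query lies close enough to the training set (threshold is an assumption)."""
    return nearest_training_similarity(query_fp, training_fps) >= threshold


if __name__ == "__main__":
    # Hypothetical fingerprints; real ones would be derived from compound structures.
    training = [{1, 2, 3, 4}, {3, 4, 5, 6}, {7, 8}]
    print(f"mean pairwise similarity: {mean_pairwise_similarity(training):.3f}")
    for name, query in [("query_1", {1, 2, 3}), ("query_2", {9, 10})]:
        nearest = nearest_training_similarity(query, training)
        inside = in_applicability_domain(query, training)
        print(f"{name}: nearest similarity {nearest:.2f}, in domain: {inside}")
```

A query compound that falls outside such a similarity-defined domain is one for which a model trained on a narrow compound set is less likely to give a reliable prediction, which is one way to interpret discrepancies between experimentally tested compounds and model output.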
Conclusion
New compound development and drug repurposing need to combine diverse compound screening libraries with a strong understanding of their interactome and regulome. Importantly, employing computational pharmacology, data mining, systems biology methods, and computational chemistry can greatly advance our efforts in identifying the key targets and their affected networks. Our study revealed that formulating optimal pharmacological intervention options should rely on integrative approaches. We not only explored the current trends and shortcomings in COVID-19 drug repurposing, but also demonstrated the value of new perspectives using computational pharmacology and cheminformatics principles. Our in-depth analysis revealed the importance of expanding clinical studies beyond direct drug-target interactions and considering the more complex space of the affected networks. We also showed a number of interactions and pathways that could be exploited when considering combination therapies. The miRNA network findings offer a new strategy for finding valuable biomarkers or therapeutic management options. Even though computational modelling is a powerful tool for prioritising compounds, it is important to include their biological action, experimental results, and extended network data to build better predictors. We demonstrated that the chemical space for COVID-19 investigational compounds might not be broad enough and could benefit from additional experimental evidence to create more robust models. Despite limitations in the available data, it is still possible to extract valuable information that could potentially save time and resources by helping to better prioritise compounds for in vitro screening. Finally, we strongly advocate for taking this opportunity to establish comprehensive practices for today's and future infectious diseases by preparing solid analytical frameworks.
Data availability statement
Public repositories used for the analysis are clearly referenced and reported in the manuscript. The analytical framework and code are hosted at GitHub: https://github.com/AusteKan/Computational-Pharmacology
Funding statement
The research was not supported by grants.
Author contribution statement
AK devised the methodology, developed analytical concepts, performed the analysis, and wrote the manuscript. GSC, DW, and LG critically reviewed the manuscript and provided suggestions and advice. AM and NB provided a critical review. All authors read and approved the final manuscript.
Ethics approval statement
The study and analysis did not involve any animal or human testing.
Permission to reproduce material from other sources, if relevant
The analytical framework and code can be used with appropriate referencing. The code is not intended for commercial use without explicit permission from the code author.