Sepehr Golriz Khatami, Sarah Mubeen, Vinay Srinivas Bharadhwaj, Alpha Tom Kodamullil, Martin Hofmann-Apitius, Daniel Domingo-Fernández.
Abstract
The utility of pathway signatures lies in their capability to determine whether a specific pathway or biological process is dysregulated in a given patient. These signatures have been widely used in machine learning (ML) methods for a variety of applications including precision medicine, drug repurposing, and drug discovery. In this work, we leverage highly predictive ML models for drug response simulation in individual patients by calibrating the pathway activity scores of disease samples. Using these ML models and an intuitive scoring algorithm to modify the signatures of patients, we evaluate whether a given sample that was formerly classified as diseased could be predicted as normal following drug treatment simulation. We then use this technique as a proxy for the identification of potential drug candidates. Furthermore, we demonstrate the ability of our methodology to successfully identify approved and clinically investigated drugs for four different cancers, outperforming six comparable state-of-the-art methods. We also show how this approach can deconvolute a drug's mechanism of action and propose combination therapies. Taken together, our methodology shows promise for supporting clinical decision-making in personalized medicine by simulating a drug's effect on a given patient.
Year: 2021 PMID: 34707117 PMCID: PMC8551267 DOI: 10.1038/s41540-021-00199-1
Source DB: PubMed Journal: NPJ Syst Biol Appl ISSN: 2056-7189
Fig. 1 Conceptual overview of the drug simulation workflow and case scenario on multiple datasets.
(a) Pathway activity scores are used to train a highly predictive ML model that differentiates between normal and disease samples, labeled green and red on the heatmap, respectively. (b) Next, pathway scores of disease samples are modified by using drug-target information and applying a scoring algorithm that simulates the effect of a given drug at the pathway level. Using the modified pathway scores of disease samples, the trained ML classifier is then used to evaluate whether these modified disease samples, previously classified as “diseased”, could now be classified as “normal”. (c) Finally, we use the proportion of disease samples now classified as normal (i.e., % responders) as a proxy to identify candidate drugs and propose combination therapies. (d) To demonstrate the methodology in a case scenario, we first performed ssGSEA using pathways from KEGG and the BRCA, LIHC, PRAD, and KIRC TCGA datasets to acquire sample-wise pathway activity scores. (e) Next, we obtained known drug-target interactions from DrugBank and DrugCentral, and drug-disease pairs (i.e., FDA-approved drugs and drugs under clinical trials for a given condition) from ClinicalTrials.gov and lists of FDA-approved drugs; the latter two sources were used as a ground-truth list of true positives (TP). (f) To simulate drug treatments of patients from the aforementioned TCGA datasets using their pathway activity scores (i.e., Fig. 1d), we applied the methodology described in Fig. 1a–c to acquire a ranking of drugs based on the proportion of disease samples that were treated. Finally, we identified the proportion of drugs ranked by our methodology that were true positives for the four TCGA datasets and compared this proportion to random chance.
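The workflow in panels (a) to (c) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: a toy nearest-centroid classifier stands in for the trained ML model, and the scoring rule in `simulate_drug` (resetting each targeted pathway's score to the mean of the normal samples) is hypothetical, since the caption does not specify the actual algorithm. All scores, pathway indices, and function names are made up for the example.

```python
from statistics import mean

def centroid(samples):
    """Column-wise mean of a list of equal-length pathway score vectors."""
    return [mean(col) for col in zip(*samples)]

def predict(sample, normal_c, disease_c):
    """Toy nearest-centroid classifier (stand-in for the trained ML model)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    if sq_dist(sample, normal_c) <= sq_dist(sample, disease_c):
        return "normal"
    return "disease"

def simulate_drug(sample, targeted_pathways, normal_c):
    """Hypothetical scoring rule: set targeted pathway scores to the normal mean."""
    treated = list(sample)
    for i in targeted_pathways:
        treated[i] = normal_c[i]
    return treated

def percent_responders(disease_samples, targeted_pathways, normal_c, disease_c):
    """Share of disease-classified samples predicted 'normal' after simulation."""
    classified = [s for s in disease_samples
                  if predict(s, normal_c, disease_c) == "disease"]
    if not classified:
        return 0.0
    responders = sum(
        predict(simulate_drug(s, targeted_pathways, normal_c),
                normal_c, disease_c) == "normal"
        for s in classified
    )
    return 100.0 * responders / len(classified)

# Made-up activity scores for three pathways (columns) per sample (rows).
normal_samples = [[0.1, 0.2, 0.1], [0.2, 0.1, 0.2]]
disease_samples = [[0.9, 0.8, 0.2], [0.8, 0.9, 0.1]]
normal_c = centroid(normal_samples)
disease_c = centroid(disease_samples)

# A drug hitting both dysregulated pathways vs. one hitting an unaffected pathway.
pct_drug_a = percent_responders(disease_samples, [0, 1], normal_c, disease_c)
pct_drug_b = percent_responders(disease_samples, [2], normal_c, disease_c)
print(pct_drug_a, pct_drug_b)
```

In this toy setup, a drug that targets the dysregulated pathways flips the predictions, whereas a drug that targets only an unaffected pathway does not; ranking drugs by this percentage mirrors the ranking step described in panel (f).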
Number of FDA-approved and clinically tested drugs recovered for both drug-target datasets (i.e., DrugBank (DB) and DrugCentral (DC)) across the four investigated cancers.
| Dataset | DB Prioritized | DB Approved (total) | DB Clinical trials (total) | DB Proportion of true positives (%) | DC Prioritized | DC Approved (total) | DC Clinical trials (total) | DC Proportion of true positives (%) |
|---|---|---|---|---|---|---|---|---|
| BRCA | 129 | 8 (26) | 23 (182) | 31/129 (24.03%) | 19 | 2 (14) | 4 (115) | 6/19 (31.57%) |
| LIHC | 74 | 2 (5) | 11 (50) | 13/74 (17.56%) | 19 | 1 (1) | 2 (35) | 3/19 (15.78%) |
| PRAD | 68 | 2 (13) | 18 (134) | 20/68 (29.41%) | 19 | 1 (7) | 3 (84) | 4/19 (21.05%) |
| KIRC | 88 | 2 (8) | 10 (44) | 12/88 (13.63%) | 26 | 3 (3) | 2 (25) | 5/26 (19.23%) |
In the first column for each drug-target dataset (“Prioritized”), we report the number of drugs that changed the predictions for at least 80% of the patients for each cancer type. The second column (“Approved”) reports the number of FDA-approved drugs among these prioritized drugs, with the total number of FDA-approved drugs present in each dataset in parentheses. Similarly, the third column (“Clinical trials”) reports the number of drugs tested in clinical trials among the prioritized drugs, with the total number of clinically tested drugs present in each dataset in parentheses. Finally, the last column (“Proportion of true positives”) reports the proportion of true positives (both FDA-approved and clinically tested drugs) among the prioritized drugs.
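The arithmetic behind these columns can be illustrated with a short Python sketch. The drug names and responder percentages below are hypothetical placeholders, and the helper names (`prioritize`, `true_positive_proportion`) are introduced only for this example; the 80% threshold and the BRCA/DrugBank counts are taken from the table and its note.

```python
def prioritize(responder_pct, threshold=80.0):
    """Drugs that changed the prediction for at least `threshold` % of patients."""
    return sorted(d for d, pct in responder_pct.items() if pct >= threshold)

def true_positive_proportion(prioritized, approved, clinical_trials):
    """Count and percentage of prioritized drugs that are approved or in trials."""
    if not prioritized:
        return 0, 0.0
    hits = [d for d in prioritized if d in approved or d in clinical_trials]
    return len(hits), 100.0 * len(hits) / len(prioritized)

# Hypothetical simulation output: drug -> % of patients whose prediction changed.
responder_pct = {"drug_a": 92.0, "drug_b": 81.5, "drug_c": 40.0, "drug_d": 80.0}
approved = {"drug_a"}                   # hypothetical ground truth
clinical_trials = {"drug_c", "drug_d"}  # hypothetical ground truth

prioritized = prioritize(responder_pct)
n_tp, tp_pct = true_positive_proportion(prioritized, approved, clinical_trials)
print(prioritized, n_tp, round(tp_pct, 2))

# The same arithmetic applied to the BRCA / DrugBank row of the table:
# 8 approved + 23 clinically tested drugs among 129 prioritized drugs.
brca_db_pct = round(100.0 * (8 + 23) / 129, 2)
print(brca_db_pct)
```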
Fig. 2 Pathways targeted by prioritized drugs in DrugCentral for each of the three cancer test datasets.
The X axis corresponds to pathways targeted by any of the prioritized drugs (i.e., pathways not targeted by any prioritized drug are omitted for better visualization). Prioritized drugs for each cancer dataset have been clustered based on the pathways they target and are reported on the Y axis. Of the prioritized drugs, those that correspond to true positives are highlighted in bold. If a set of three or more similar pathways was clustered together, we manually assigned these pathways into distinct classes (Y axis). Pathway names and cluster information are available as a Supplementary File, and the equivalent figures for DrugBank are available as Supplementary Figs. 2–4.
Examples of predicted combination therapies.
| Cancer type | Drug 1 | Drug 2 | Proportion of responders (%) | Reference |
|---|---|---|---|---|
| Liver cancer | Sorafenib | Trametinib | 87% | [ |
| Liver cancer | Erlotinib | Sorafenib | 87% | [ |
| Breast cancer | Vorinostat | Capecitabine | 88% | [ |
Number of FDA-approved and clinically tested drugs present in both drug-target datasets across the four investigated cancers.
| Dataset | DrugBank Approved | DrugBank Clinical trials | DrugCentral Approved | DrugCentral Clinical trials |
|---|---|---|---|---|
| BRCA | 26/1346 (1.93%) | 182/1346 (13.52%) | 14/638 (2.19%) | 115/638 (18.02%) |
| LIHC | 5/1346 (0.37%) | 50/1346 (3.71%) | 1/638 (0.16%) | 35/638 (5.49%) |
| PRAD | 13/1346 (0.97%) | 134/1346 (9.96%) | 7/638 (1.10%) | 84/638 (13.17%) |
| KIRC | 8/1346 (0.60%) | 44/1346 (3.26%) | 3/638 (0.47%) | 25/638 (3.91%) |
The percentage of FDA-approved/clinically investigated drugs for each cancer type, relative to the total number of drugs present in the drug-target dataset, is reported in parentheses.
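These baseline percentages give a sense of what random chance would look like when compared with the proportion of true positives among prioritized drugs. The sketch below works through the BRCA/DrugBank figures using only the counts reported in the two tables. It rests on two assumptions that the text does not confirm: that the approved and clinical-trial drug sets are disjoint (so their counts can be summed), and that a hypergeometric tail probability is an acceptable way to quantify "better than chance". The latter is one common choice, not necessarily the test used by the authors.

```python
from math import comb

def fold_enrichment(tp_prioritized, n_prioritized, tp_total, n_total):
    """Ratio of the true-positive rate among prioritized drugs to the baseline rate."""
    return (tp_prioritized / n_prioritized) / (tp_total / n_total)

def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n drugs are drawn without replacement from N drugs,
    K of which are true positives."""
    upper = min(n, K)
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)

# BRCA / DrugBank counts taken from the tables.
n_total = 1346           # drugs in the drug-target dataset
tp_total = 26 + 182      # approved + clinical trials (assumed disjoint)
n_prioritized = 129      # drugs with at least 80% responders
tp_prioritized = 8 + 23  # true positives among prioritized drugs

baseline_pct = 100.0 * tp_total / n_total
fold = fold_enrichment(tp_prioritized, n_prioritized, tp_total, n_total)
p_value = hypergeom_tail(tp_prioritized, n_prioritized, tp_total, n_total)
print(f"baseline={baseline_pct:.2f}% fold={fold:.2f} p={p_value:.4g}")
```

A fold value above 1 indicates that true positives are over-represented among the prioritized drugs relative to the dataset as a whole.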