
Pathway analysis for drug repositioning based on public database mining.

Yongmei Pan, Tiejun Cheng, Yanli Wang, Stephen H Bryant.

Abstract

Sixteen FDA-approved drugs were investigated to elucidate their mechanisms of action (MOAs) and clinical functions by pathway analysis based on retrieved drug targets interacting with or affected by the investigated drugs. Protein and gene targets and associated pathways were obtained by data-mining of public databases including the MMDB, PubChem BioAssay, GEO DataSets, and the BioSystems databases. Entrez E-Utilities were applied, and in-house Ruby scripts were developed for data retrieval and pathway analysis to identify and evaluate relevant pathways common to the retrieved drug targets. Pathways pertinent to clinical uses or MOAs were obtained for most drugs. Interestingly, some drugs identified pathways responsible for other diseases than their current therapeutic uses, and these pathways were verified retrospectively by in vitro tests, in vivo tests, or clinical trials. The pathway enrichment analysis based on drug target information from public databases could provide a novel approach for elucidating drug MOAs and repositioning, therefore benefiting the discovery of new therapeutic treatments for diseases.

Year:  2014        PMID: 24460210      PMCID: PMC3956470          DOI: 10.1021/ci4005354

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956


Introduction

Understanding the mechanisms of action (MOAs) of drugs is critical for drug development. The identification of drug MOAs has been primarily based on pharmacological experiments, but bioinformatics studies by data-mining of public databases[1−9] have become an alternative and important approach. Previous bioinformatics efforts have been focused on the establishment of compound–target associations based on drug similarity that could be used for the prediction of drug targets responsible for MOAs,[2,3,8] but recent studies have extended to the identification and analysis of biological pathways retrieved based on drug bioactivity and targets.[1,4,5,9] A biological pathway involving a drug and its targets may point to the MOA, the biological function, or a disease associated with this drug. Identification of biological pathways may be helpful for the elucidation of the MOAs underlying drug effects. For instance, Covell et al. explored the drug-gene-pathway relationships by the data mining of drug profiles from the NCI60 anticancer screen[4,5,9] and even discussed MOAs indicated by the pathways with coexistence of gene expression change upon drug treatments.[4] However, these studies were either limited to certain data sets with gene expression data[4,5,9] or focused on the overall relationship among drugs, their targets, and their involved pathways.[1] The application of pathway analysis in identifying drug MOAs by integrating large-scale biomedical databases with drug target information is still lacking. 
While understanding MOAs is essential for the development of new drugs, drug repurposing or repositioning, i.e., finding novel uses of existing drugs, has become an important strategy for developing new treatments for diseases.[10−14] Computational methods, e.g., virtual screening of commercially available drug databases based on small-molecule similarity and Structure–Activity Relationship (SAR) models, have been employed for predicting new uses of existing drugs.[15−18] The most widely discussed bioinformatics approach for drug repositioning is to integrate the network among drugs, genes, and diseases, mostly by taking advantage of high-throughput gene expression data.[11−14,19] For example, Dudley et al. predicted new uses of existing drugs based on drug–disease pairs identified by comparing gene expression signatures of drugs with those in the Gene Expression Omnibus (GEO).[12,13,20] Biological pathways involving drug targets may point to diseases not currently treated by a drug and therefore possibly indicate new clinical uses. Although pathway enrichment analysis was recently applied in combination with gene expression profiles to explore new clinical uses among existing drugs,[21,22] its application in drug repurposing based on large-scale biomedical database mining has yet to be explored. The vast biological and biomedical information available in public databases has greatly facilitated bioinformatics studies in recent years. In this work, 16 FDA-approved drugs obtained from the DrugBank database were investigated to elucidate their MOAs and clinical functions with pathway analysis based on the drug target information retrieved from various public databases available at NCBI, including the MMDB, GEO, and PubChem Compound and BioAssay databases. Biological pathways involving drug molecules and their targets were identified by querying NCBI’s BioSystems database.
For most drugs, the retrieved biological pathways were shared by their drug targets and pertinent to their clinical uses or MOAs, indicating that identifying relevant pathways can aid the elucidation of the clinical functions and MOAs of drugs. For some drugs, the retrieved pathways are associated with diseases other than those the drugs are currently used for, indicating that pathway identification could provide an alternative approach to drug repurposing or repositioning. The pharmacology of individual drugs is discussed to demonstrate how pathway analysis can facilitate the elucidation of MOAs and reveal new clinical functions of existing drugs, thereby benefiting the discovery of new therapeutic treatments for diseases.

Methods

Overall Methodology

Pathways comprising component genes and proteins may account for biological processes or diseases that could be affected by a drug interacting with its pharmacological targets. Therefore, the pharmacological or clinical effects of a drug may be elucidated by analyzing the pathways enriched in targets that bind to or are affected by the investigated drug. In this study, pathway enrichment analysis was applied to elucidate drug MOAs and potential repositioning based on the integration of protein and gene expression information from public databases. The process has three steps: (1) Drug selection. (2) Drug target retrieval: protein targets with small-molecule bioactivity data and crystal structures in complex with drugs were obtained; gene targets whose expression levels were affected by drug treatment were also retrieved. (3) Pathway retrieval and enrichment analysis: pathways in which identified drug targets co-occur were obtained and ranked according to the number of identified drug targets involved, and the P value associated with each pathway was calculated. A number of in-house scripts were developed for data retrieval and analysis.
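
The ranking in step 3 can be illustrated with a minimal sketch. It is written in Python purely for illustration (the in-house scripts were written in Ruby), and the function name `rank_pathways` as well as the toy gene and pathway names are hypothetical, not taken from the study.

```python
from collections import defaultdict


def rank_pathways(target_to_pathways):
    """Rank pathways by the number of a drug's targets that occur in them."""
    pathway_to_targets = defaultdict(set)
    for target, pathways in target_to_pathways.items():
        for pathway in pathways:
            pathway_to_targets[pathway].add(target)
    # Sort by descending target count, then by pathway name for a stable order.
    return sorted(
        ((pathway, sorted(targets)) for pathway, targets in pathway_to_targets.items()),
        key=lambda item: (-len(item[1]), item[0]),
    )


# Hypothetical toy input: target gene symbol -> pathways retrieved for that target.
toy_targets = {
    "GENE_A": ["pathway X", "pathway Y"],
    "GENE_B": ["pathway X"],
    "GENE_C": ["pathway X", "pathway Z"],
}

ranked = rank_pathways(toy_targets)
for pathway, targets in ranked:
    print(pathway, len(targets), targets)
```

Inverting the target-to-pathway mapping first keeps the co-occurrence count a simple set size, which is all the ranking criterion described above requires.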

Public Databases Used for Data Retrieval

Several public databases at NCBI were utilized for retrieving the various types of data used in this work. These databases were selected because of the comprehensive, cross-linked information they provide. The MMDB[23] contains resolved protein structure data from the Protein Data Bank (PDB),[24] augmented with information on small molecules (i.e., ligands in resolved structures), domains, and sequence similarity. PubChem,[25] a suite of three Entrez[26] databases, i.e., Compound, Substance, and BioAssay, is a public repository for chemical structures and biological properties of small molecules and RNAi reagents. With drug bioactivity and target information obtained from multiple depositors, PubChem BioAssay[27] is a useful resource for biomedical and pharmaceutical research. The GEO[20,28] of NCBI is a repository of functional genomics data generated by high-throughput gene expression and genomic hybridization experiments. As one of the GEO databases, GEO DataSets[20] contains entire experiments of gene expression measurements for all involved genes under certain conditions, such as upon small-molecule or drug treatment of a sample. The BioSystems database[29] contains genes, proteins, and small molecules that are involved in biological systems, such as biological pathways and diseases. Currently, it comprises records from several external databases, including the Kyoto Encyclopedia of Genes and Genomes (KEGG),[30] BioCyc,[31] Reactome,[32] the Pathway Interaction Database,[33] WikiPathways,[34,35] and Gene Ontology,[36] which are the major resources containing biological pathway data. Therefore, the BioSystems database is an informative and integrated platform for exploring combinations of genes and proteins that are related by biological systems.
All the above databases belong to Entrez,[26] an integrated retrieval system supported by NCBI with access to 37 databases, covering a wide variety of categories including literature search, DNA and RNA, proteins, genes and expression, genomes, genetics and medicine, chemicals and bioactivities, and domains and structures. Furthermore, the whole Entrez system is comprehensively integrated and powered by the cross-links among records available at various databases.[26]
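
Entrez databases can also be queried programmatically through NCBI's E-Utilities. As a minimal sketch, the Python helpers below only assemble ESearch and ESummary request URLs and do not send them. The helper names are hypothetical, and the database names `pccompound` and `gene` and the example queries are illustrative assumptions, not the exact queries used in this work.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-Utilities web service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Build an ESearch URL that looks up record IDs matching `term` in `db`."""
    query = urlencode({"db": db, "term": term, "retmax": retmax})
    return f"{EUTILS_BASE}/esearch.fcgi?{query}"


def esummary_url(db, ids):
    """Build an ESummary URL that fetches document summaries for the given IDs."""
    query = urlencode({"db": db, "id": ",".join(str(i) for i in ids)})
    return f"{EUTILS_BASE}/esummary.fcgi?{query}"


# Illustrative queries (assumed, not from the study): a drug name in PubChem
# Compound, and the summary of one record in the Gene database.
compound_search = esearch_url("pccompound", "gefitinib")
gene_summary = esummary_url("gene", [1956])
print(compound_search)
print(gene_summary)
```

Keeping URL construction separate from the network call makes the query logic easy to inspect before any request is sent to NCBI.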

Drug Selection and Target Retrieval

Drugs were selected from the DrugBank database,[37,38] initially from among the drugs with clear MOAs. The information about targets and MOAs responsible for the drugs’ clinical functions was obtained from DrugBank and Wikipedia (http://en.wikipedia.org/wiki). Two categories of targets were defined in this work: primary targets, which are responsible for the clinical effects of a drug, and other targets (denoted secondary targets in this paper), which have shown affinity for or are affected by a drug. The drugs were further narrowed down by the availability of protein target information in the MMDB[23] and PubChem BioAssay[25,27] databases. Each drug was searched in the PubChem Compound database[25] for its compound identifier (CID, the PubChem chemical structure accession number). Structures in complex with each drug were retrieved via the cross-links in the PubChem Compound database to MMDB, which contains resolved protein structures derived from the Protein Data Bank (PDB);[24] additional protein targets were obtained from PubChem BioAssay by using the CID of each drug. Only protein targets with the taxonomy Homo sapiens retrieved from MMDB and BioAssay were retained for further study. Drugs without any target information in MMDB and BioAssay were excluded from further consideration. Drug target information was also retrieved from gene expression data affected by drug administration. For drugs for which no relevant pathways were obtained using only the protein target information from the MMDB and PubChem BioAssay databases, additional gene targets were obtained by checking PubMed literature cross-linked with the GEO DataSets[20,28] records associated with each drug.

Pathway Retrieval and Enrichment Analysis

The Entrez programming utilities (E-Utilities)[26,39] provide an Application Programming Interface (API) for searching, linking, and downloading data from the databases available at NCBI (http://www.ncbi.nlm.nih.gov/books/NBK25501/). The Gene IDs associated with each drug target were obtained from the Gene[40] database of NCBI. Human biological pathways involving every drug target were retrieved from the BioSystems database by using the Gene IDs via the ESearch utility of E-Utilities. The title of each pathway was obtained by using the ESummary utility of E-Utilities, with the BioSystems IDs (BSIDs) of pathways as queries. The significance of retrieved pathways was evaluated with enrichment P values calculated in a similar way to DAVID,[41] based on a modified Fisher’s Exact Test against a background of all human pathways available in the BioSystems database. In-house Ruby scripts were developed for large-scale retrieval of drug target Gene IDs and pathways by using the E-Utilities, for identification of pathways common to the retrieved drug targets, and for calculation of P values. All public databases were accessed in June–July 2013. Extra gene targets for celecoxib were obtained by accessing PubMed in December 2013.
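
A DAVID-style modified Fisher's Exact Test can be sketched as below. This is an illustrative Python sketch, not the authors' Ruby implementation: the function names are hypothetical, the contingency-table layout (list hits reduced by one, compared against population counts) is an assumption based on the commonly described EASE score, and the example counts are made up.

```python
from math import comb


def fisher_right_tail(a, b, c, d):
    """One-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]:
    the probability of observing a top-left count of at least `a`."""
    n = a + b + c + d
    row1 = a + b  # total of the first row
    col1 = a + c  # total of the first column
    upper = min(row1, col1)
    favorable = sum(
        comb(col1, x) * comb(n - col1, row1 - x) for x in range(a, upper + 1)
    )
    return favorable / comb(n, row1)


def ease_score(list_hits, list_total, pop_hits, pop_total):
    """Modified (more conservative) Fisher P value: one hit is removed from the
    list before testing. Table layout is an assumption, see the lead-in."""
    if list_hits <= 1:
        return 1.0  # a single hit can never be called enriched
    return fisher_right_tail(
        list_hits - 1, pop_hits, list_total - list_hits, pop_total - pop_hits
    )


# Made-up counts: 3 of 300 drug targets fall in a pathway that holds
# 40 of the 30000 background genes.
p_plain = fisher_right_tail(3, 40, 297, 29960)
p_ease = ease_score(3, 300, 40, 30000)
print(p_plain, p_ease)
```

Removing one hit penalizes pathways supported by very few targets, so the modified P value is larger than the plain Fisher P value for the same counts.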

Results

Drugs were searched in the order of their DrugBank primary accession numbers (DrugBank IDs) until a certain number of drugs was reached. Specifically, the drugs were initially selected from among the 240 drugs with DrugBank IDs preceding DB00860. Antibiotics, antifungals, and other drugs targeting nonhuman organisms were removed. The remaining drugs were then selected based on the availability of clear primary targets and MOAs and of retrieved protein and gene targets that make pathway analysis feasible. Finally, 16 drugs with various targets and clinical functions were obtained for subsequent pathway analysis (Table 1). Primary targets were mainly obtained from DrugBank, whereas secondary targets were obtained from the MMDB, BioAssay, and GEO databases. All drug targets used to search for biological pathways are shown in Supporting Information, Table S1. The drug CID numbers and links used for obtaining the protein targets available in the MMDB and BioAssay databases are shown in Supporting Information, Table S2. For seven drugs, crystal structures of target proteins were obtained from MMDB, some of which are the primary targets. The corresponding gene symbols and PDB IDs of these protein structure complexes are shown in Supporting Information, Table S3. The numbers of retrieved protein and gene targets via the MMDB, BioAssay, and GEO databases ranged from 0 to 7, 2 to 68, and 0 to 433, respectively, and are summarized in Table 1 together with the drug therapeutic information.
Table 1

Profiles of Drugs and Retrieved Proteins, Genes, and Relevant Pathways

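| ID | drug | therapeutic category | primary target(s) | no. targets: MMDB (a) / BioAssay (b) / GEO (c) |
|---|---|---|---|---|
| 1 | Actinomycin | cancer | DNA | 0 / 2 / 14 |
| 2 | Adenosine | antiarrhythmic | ADORs | 7* / 6 / NA (d) |
| 3 | Bortezomib | relapsed multiple myeloma, mantle cell lymphoma | protease | 0 / 11 / 28 |
| 4 | Celecoxib | anti-inflammatory | PTGS2 | 1 / 19 / 56 |
| 5 | Disulfiram | chronic alcoholism addiction | ALDHs | 0590 |
| 6 | Doxorubicin | cancer | DNA | 0 / 9 / 28 |
| 7 | Fluorouracil | cancer | TYMS | 1 / 5 / 80 |
| 8 | Gefitinib | cancer | EGFR | 1* / 68 / NA |
| 9 | Mifepristone | abortifacient | PGR | 1 / 14 / 433 |
| 10 | Propofol | hypnotic | GABRs | 0 / 11 / 88 |
| 11 | Rosiglitazone | antidiabetic | PPARs | 0514 |
| 12 | Tamoxifen | breast cancer | ESR1 | 0 / 57 / NA |
| 13 | Tretinoin | topical retinoid | RARs | 2* / 38 / 89 |
| 14 | Triiodothyronine | multiple functions | THRs | 3* / 11 / 275 |
| 15 | Valproic Acid | anticonvulsant, mood-stabilizing | ABAT | 03196 |
| 16 | Vincristine | cancer | tubulin | 0 / 2 / 11 |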

(a) Number of target proteins retrieved from the links to protein structures at the PubChem Compound database. Numbers with ∗ indicate that the crystal structure of the primary target in complex with the drug is included.

(b) Number of total target proteins retrieved from PubChem BioAssay (excluding the targets with crystal structures).

(c) Number of total target genes retrieved via GEO DataSets (excluding the targets with crystal structures and those obtained from BioAssay).

(d) Drug targets were not retrieved via GEO because relevant pathways were already identified based on MMDB and PubChem BioAssay alone.


Pathway Enrichment Analysis: Identified Pathways with Participating Drug Targets

Ruby scripts were developed for pathway retrieval and data processing and are provided as Supporting Information. Multiple pathways were usually retrieved for each drug target. We hypothesized that biological pathways common to the primary and secondary targets may indicate biological systems that are important for the therapeutic functions of drugs. In this study, biological pathways in which primary and secondary drug targets co-occur were found for all drugs. The full list of these pathways for each drug is given in Supporting Information, Table S4. The pathways were ranked by the number of retrieved drug targets they involve. The official full names of participating genes in the identified pathways are given in Supporting Information, Table S5. For most drugs, the pathways shared by the primary and secondary targets included ones related to the MOAs, biological functions, or clinical uses (denoted relevant pathways in this paper). Examples of these pathways are shown in Table 2. For four drugs, i.e., adenosine, doxorubicin, gefitinib, and tamoxifen, pathways relevant to their clinical uses or MOAs were common to at least four protein targets (including the primary target) retrieved from MMDB and BioAssay. For the remaining drugs, additional gene targets were obtained via GEO DataSets records with gene information indicated by PubMed. For seven drugs, i.e., actinomycin, bortezomib, celecoxib, fluorouracil, rosiglitazone, tretinoin, and vincristine, relevant pathways were found among at least four target proteins and genes. Interestingly, for four drugs, i.e., gefitinib, celecoxib, tamoxifen, and tretinoin, the identified pathways are associated with diseases other than those currently treated by the drugs, and most of these associations are supported by previous in vitro tests, in vivo tests, or clinical trials. See the Supporting Information for detailed discussions of the relevant pathways identified for each drug.
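
The counting criterion mentioned above (a pathway shared by at least four targets, including the primary target) can be sketched as a simple filter. This Python sketch is illustrative only: the function name `shared_pathways`, the `min_targets` parameter, and the toy data are hypothetical, and judging whether a pathway is actually relevant to an MOA or clinical use remains a manual step that the code does not cover.

```python
def shared_pathways(pathway_members, primary_targets, min_targets=4):
    """Keep pathways that contain a primary target and at least
    `min_targets` of the drug's retrieved targets overall."""
    primary = set(primary_targets)
    kept = {}
    for pathway, members in pathway_members.items():
        member_set = set(members)
        if member_set & primary and len(member_set) >= min_targets:
            kept[pathway] = sorted(member_set)
    return kept


# Hypothetical toy input: pathway -> retrieved drug targets found in it.
toy_members = {
    "pathway X": ["PRIMARY_1", "GENE_A", "GENE_B", "GENE_C"],
    "pathway Y": ["GENE_A", "GENE_B", "GENE_C", "GENE_D"],  # lacks the primary target
    "pathway Z": ["PRIMARY_1", "GENE_A"],                   # too few targets
}

relevant_candidates = shared_pathways(toy_members, ["PRIMARY_1"])
print(relevant_candidates)
```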
Table 2

Relevant Pathways Shared by Multiple Drug Targets

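| ID (a) | drug | primary target(s) | relevant pathway | pathway BSID | targets M (b) | targets B (c) | targets G (d) | P value (e) |
|---|---|---|---|---|---|---|---|---|
| 1 | Actinomycin | DNA | integrated pancreatic cancer pathway | 711360 | NA (f) | 0 | 8 | 2.42e-10 |
| | | | apoptosis | 198797 | NA | 0 | 7 | 2.48e-11 |
| | | | apoptosis | 105648 | NA | 0 | 5 | 5.35e-06 |
| | | | apoptosis | 83060 | NA | 0 | 5 | 1.14e-07 |
| | | | integrated breast cancer pathway | 219801 | NA | 0 | 7 | 9.04e-08 |
| | | | pathways in cancer | 83105 | NA | 0 | 7 | 3.62e-07 |
| | | | acute myeloid leukemia | 83117 | NA | 0 | 6 | 4.36e-08 |
| | | | apoptosis modulation and signaling | 198822 | NA | 0 | 5 | 1.14e-07 |
| | | | prostate cancer | 755440 | NA | 0 | 5 | 6.04e-07 |
| | | | integrated cancer pathway | 672450 | NA | 0 | 3 | 2.15e-05 |
| | | | pancreatic cancer | 83108 | NA | 0 | 3 | 8.92e-03 |
| | | | small cell lung cancer | 83118 | NA | 0 | 2 | 8.11e-03 |
| | | | prostate cancer | 83111 | NA | 0 | 2 | 8.67e-03 |
| | | | endometrial cancer | 83109 | NA | 0 | 2 | 3.04e-03 |
| | | | colorectal cancer | 83106 | NA | 0 | 2 | 4.30e-03 |
| 2 | Adenosine | ADORs | vascular smooth muscle contraction | 96530 | 1* | 3 | NA | 3.77e-04 |
| 3 | Bortezomib | protease | protein processing in endoplasmic reticulum | 167325 | NA | 2 | 4 | 2.08e-06 |
| | | | parkin-ubiquitin proteasomal system pathway | 700638 | NA | 1 | 4 | 1.72e-03 |
| 4 | Celecoxib | PTGS2 | pathways in cancer | 83105 | 0 | 1 | 11 | 1.87e-05 |
| | | | small cell lung cancer | 83118 | 0 | 1 | 8 | 1.58e-07 |
| | | | integrated pancreatic cancer pathway | 711360 | 0 | 1 | 7 | 4.82e-04 |
| | | | eicosanoid synthesis | 198888 | 0 | 2 | 2 | 4.24e-04 |
| | | | arachidonic acid metabolism | 82991 | 0 | 2 | 2 | 1.07e-02 |
| | | | arachidonic acid metabolism | 685553 | 0 | 2 | 2 | 6.02e-03 |
| 6 | Doxorubicin | DNA | integrated pancreatic cancer pathway | 711360 | NA | 4 | 14 | 6.57e-22 |
| | | | pathways in cancer | 83105 | NA | 3 | 14 | 2.06e-16 |
| | | | apoptosis | 198797 | NA | 2 | 14 | 7.89e-25 |
| | | | apoptosis | 105648 | NA | 2 | 11 | 1.15e-14 |
| | | | apoptosis | 83060 | NA | 2 | 11 | 1.12e-18 |
| | | | apoptosis modulation and signaling | 198822 | NA | 2 | 11 | 3.45e-19 |
| | | | prostate cancer | 755440 | NA | 1 | 11 | 1.74e-15 |
| | | | integrated breast cancer pathway | 219801 | NA | 1 | 10 | 3.50e-12 |
| | | | pancreatic cancer | 83108 | NA | 2 | 6 | 5.97e-11 |
| | | | prostate cancer | 83111 | NA | 2 | 6 | 6.92e-10 |
| | | | small cell lung cancer | 83118 | NA | 2 | 5 | 1.81e-08 |
| | | | integrated cancer pathway | 672450 | NA | 1 | 6 | 2.58e-11 |
| | | | endometrial cancer | 83109 | NA | 1 | 6 | 4.92e-10 |
| | | | colorectal cancer | 83106 | NA | 1 | 6 | 1.76e-09 |
| | | | chronic myeloid leukemia | 83116 | NA | 1 | 6 | 1.76e-09 |
| | | | acute myeloid leukemia | 83117 | NA | 0 | 6 | 4.36e-08 |
| | | | bladder cancer | 83115 | NA | 1 | 4 | 2.04e-07 |
| | | | glioma | 83110 | NA | 0 | 4 | 8.00e-05 |
| | | | melanoma | 83114 | NA | 1 | 2 | 2.03e-03 |
| | | | gastric cancer network 2 | 760637 | NA | 1 | 2 | 2.33e-04 |
| 7 | Fluorouracil | TYMS | cell cycle | 530733 | 0 | 2 | 19 | 1.01e-09 |
| | | | cell cycle, mitotic | 105765 | 0 | 1 | 17 | 7.80e-09 |
| | | | integrated pancreatic cancer pathway | 711360 | 0 | 2 | 16 | 2.10e-13 |
| | | | mitotic G1-G1/S phases | 160941 | 0 | 1 | 11 | 1.96e-09 |
| | | | G1/S transition | 105769 | 0 | 1 | 8 | 4.50e-07 |
| | | | pyrimidine metabolism | 82946 | 1 | 1 | 4 | 1.57e-03 |
| | | | pyrimidine deoxyribonucleotides de novo biosynthesis I | 782380 | 0 | 1 | 3 | 2.16e-03 |
| | | | pyrimidine metabolism | 106281 | 1 | 1 | 2 | 9.12e-04 |
| 8 | Gefitinib | EGFR | integrated pancreatic cancer pathway | 711360 | 1* | 8 | NA | 3.67e-05 |
| | | | pathways in cancer | 83105 | 1* | 7 | NA | 5.11e-03 |
| | | | signaling by FGFR in disease | 645274 | 1* | 6 | NA | 1.13e-03 |
| | | | signaling by FGFR | 106343 | 1* | 6 | NA | 7.00e-04 |
| | | | integrated breast cancer pathway | 219801 | 1* | 4 | NA | 1.76e-02 |
| | | | signaling pathways in glioblastoma | 672458 | 1* | 4 | NA | 1.77e-03 |
| | | | pancreatic cancer | 83108 | 1* | 3 | NA | 8.93e-03 |
| | | | bladder cancer | 83115 | 1* | 2 | NA | 2.51e-02 |
| 11 | Rosiglitazone | PPARs | PPAR signaling pathway | 83042 | NA | 1 | 4 | 5.83e-06 |
| | | | insulin signaling pathway | 83090 | NA | 1 | 3 | 1.89e-03 |
| 12 | Tamoxifen | ESR1 | integrated pancreatic cancer pathway | 711360 | NA | 6 | NA | 3.88e-03 |
| | | | integrated breast cancer pathway | 219801 | NA | 4 | NA | 4.93e-02 |
| 13 | Tretinoin | RARs | pathways in cancer | 83105 | 0 | 7 | 4 | 6.36e-03 |
| | | | transcriptional misregulation in cancer | 523016 | 0 | 4 | 7 | 6.16e-05 |
| | | | integrated pancreatic cancer pathway | 711360 | 0 | 6 | 2 | 6.36e-03 |
| | | | retinoic acid receptors-mediated signaling | 138039 | 0 | 5 | 0 | 2.14e-04 |
| | | | non-small cell lung cancer | 83119 | 0 | 5 | 0 | 4.65e-03 |
| | | | small cell lung cancer | 83118 | 0 | 4 | 1 | 2.04e-02 |
| | | | acute myeloid leukemia | 83117 | 0 | 3 | 2 | 4.96e-03 |
| 16 | Vincristine | tubulin | cell cycle | 530733 | NA | 2 | 2 | 4.25e-03 |
| | | | cell cycle, mitotic | 105765 | NA | 2 | 1 | 1.74e-02 |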

(a) Numbers are consistent with those in Table 1.

(b) Number of target proteins retrieved from the links to the protein structures at the PubChem Compound database (shortcut to MMDB). Numbers with ∗ indicate that the crystal structure of the primary target in complex with the drug is included.

(c) Number of target proteins retrieved from PubChem BioAssay (excluding the targets with crystal structures) involved in the particular pathway.

(d) Number of target genes retrieved via GEO DataSets (excluding the targets with crystal structures and those obtained from BioAssay) involved in the particular pathway.

(e) The enrichment P values were calculated based on a modified Fisher’s Exact Test against a background of all human pathways in the BioSystems database. Only pathways with P < 0.05 are shown as enriched for the investigated drugs in the table.

(f) Not available. No targets retrieved via PubChem Compound (link to MMDB) or GEO.


Case A: Drugs with Pathways Indicating Potential New Therapeutic Uses

For four drugs, the relevant pathways identified might indicate new uses (Table 2). Gefitinib (Iressa) is a selective inhibitor of the tyrosine kinase domain of epidermal growth factor receptor (EGFR) and is used for the treatment of lung, breast, and other cancers.[42−44] The crystal structure of its primary target, human EGFR, in complex with gefitinib was retrieved from MMDB (PDB ID: 2ITY); 68 more protein targets (most of them tyrosine kinases) with moderate affinity for gefitinib were retrieved from the BioAssay database (Table 1). Pathways pertinent to EGFR signaling were identified, such as (1) Signaling by FGFR in Disease (BSID: 645274, shared by EGFR, ERBB2–4, LCK, MKNK1, SLK, and SRC) and (2) Signaling by FGFR (BSID: 106343, shared by EGFR, ERBB2–4, LCK, MKNK1, and SRC). A few cancer-related pathways were also identified, including (1) Integrated Breast Cancer Pathway (BSID: 219801), (2) Pancreatic Cancer Pathways (BSID: 711360 and 83108), (3) Pathways in Cancer (BSID: 83105), (4) Signaling Pathways in Glioblastoma (BSID: 672458), and (5) Bladder Cancer (BSID: 83115).
The retrieved breast cancer pathway is consistent with the use of gefitinib as a chemotherapy agent for this disease; other pathways, such as those for pancreatic and bladder cancers and glioblastoma, were supported retrospectively by previous in vivo or in vitro studies on the potential uses of gefitinib against these cancers. For instance, gefitinib was found to prevent progression of pancreatic carcinoma in a mouse model[45] and to inhibit pancreatic cancer cell growth and invasion in vitro;[46] gefitinib was able to inhibit tumor cell migration in EGFR-overexpressing human glioblastoma;[47] gefitinib also suppressed bladder cancer cell growth[48] and decreased the occurrence of bladder tumors in murine models.[49,50] Misregulation of protein kinases, especially EGFR, is widely believed to be a cause of uncontrolled cell proliferation.[51−55] Therefore, EGFR has become a popular target for anticancer therapy.[56,57] Figure 1A shows the relevant pathways identified for gefitinib, indicating that the primary target EGFR and the corresponding EGFR signaling pathways may account for the current and potential clinical uses of gefitinib as an anticancer agent.
Figure 1

Relevant pathways identified for gefitinib (A), celecoxib (B), tamoxifen (C), and tretinoin (D). Yellow octagons and red circles represent the drugs and their primary targets. The orange, rose, green, and magenta squares represent relevant pathways responsible for the MOAs, biological functions, and current and potential clinical uses of drugs, respectively. Black solid lines are between a drug and the retrieved relevant pathways. Edge labels indicate numbers of target proteins and genes shared by each pathway. Red solid lines with an arrow represent a drug and its primary target. Purple dashed lines with an arrow represent that the primary targets are responsible for the MOAs or biological functions indicated by pathways. Cyan dashed lines with an arrow represent the primary targets or pathways of MOAs responsible for those indicating current or potential clinical uses.

Celecoxib is a well-known anti-inflammatory agent that selectively inhibits prostaglandin-endoperoxide synthase 2 (PTGS2 or COX2),[58,59] an enzyme that converts arachidonic acid (AA) to prostaglandin H2 (PGH2). The crystal structure of the target protein carbonic anhydrase II (CA2) bound to celecoxib (PDB ID: 1OQ5) was retrieved from MMDB. Nineteen additional protein targets and 56 affected genes were retrieved. Two biological pathways relevant to its biological functions were retrieved, i.e., Arachidonic Acid Metabolism (BSID: 82991 and 685553, shared by PTGS2, CYP2J2, GGT1, and PTGES) and Eicosanoid Synthesis (BSID: 198888, shared by PTGS2, GGT1, GGT2, and PTGES). Although celecoxib is not a chemotherapy drug, some cancer pathways were also identified, including Pathways in Cancer (BSID: 83105), Integrated Pancreatic Cancer Pathway (BSID: 711360), and Small Cell Lung Cancer (BSID: 83118). These results are consistent with numerous publications reporting the efficacy of celecoxib against cancer cells, used alone or in combination.[60−67] In particular, celecoxib alone or in combination with other drugs prevented progression of pancreatic cancer cells in vitro and in vivo.[68−72] While celecoxib was mostly reported to have inhibitory effects on non-small-cell lung cancer cells,[63,73−76] clinical trials have shown that coadministration of celecoxib with chemotherapy agents improved treatment of small cell lung cancer.[77,78] As discussed in previous studies, the effects of celecoxib on cancer cells may be attributed to its inhibition of COX-2, which in turn regulates multiple cancer pathways.[79−84] The relevant pathways are shown in Figure 1B, indicating that the primary target of celecoxib is responsible for its biological functions and its potential repositioning in cancer treatment.
Tamoxifen is an antagonist of estrogen receptor 1 (ESR1) that is used for the treatment of breast cancer.[85] No resolved structure was available for tamoxifen according to the link with MMDB. Fifty-seven protein targets that bind to or are inhibited by tamoxifen were retrieved from the BioAssay database. Among them, ESR1, AR, EGFR, and TP53 shared the Integrated Breast Cancer Pathway (BSID: 219801) based on the search of the BioSystems database, supporting the use of tamoxifen as an anti-breast-cancer drug. Another cancer pathway, the Integrated Pancreatic Cancer Pathway, was shared by ESR1, AR, EGFR, ERBB2, ESR2, and TP53, which is consistent with in vivo tests and clinical trials of tamoxifen, administered alone or in combination with other chemotherapy drugs, for the treatment of pancreatic cancer.[86−90] Figure 1C shows the relevant pathways identified for tamoxifen that could be attributed to its primary target ESR1. A few pathways regarding signal transduction and gene expression were also identified (Supporting Information, Table S4), possibly due to the many nuclear receptor and tyrosine kinase targets retrieved from the BioAssay database.
Tretinoin (all-trans retinoic acid, ATRA) can be used for the treatment of acne[91] and acute myeloid leukemia.[92,93] Although tretinoin can activate retinoic acid nuclear receptors (RARs),[94] the underlying mechanisms responsible for its clinical effects are not entirely clear.[95] The crystal structures of the primary target retinoic acid nuclear receptor gamma (RARG, PDB ID: 2LBD) and cellular retinoic acid binding protein type II (CRABP2, PDB ID: 1CBS) in complex with ATRA were retrieved. On the basis of these structures and the additional 38 proteins and 89 genes obtained from the BioAssay and GEO DataSets databases, the pathway Retinoic Acid Receptors-mediated Signaling (BSID: 138039, shared by RARA/G, MAPK1, and VDR) was retrieved, supporting RARs as the primary targets of tretinoin. In addition, the pathway Acute Myeloid Leukemia (BSID: 83117, shared by RARA, EIF4EBP1, MAPK1, MYC, and PPARD) was identified, consistent with the use of tretinoin to treat this disease. Additional cancer pathways were also retrieved (Table 2), such as Transcriptional Misregulation in Cancer (BSID: 523016), Pathways in Cancer (BSID: 83105), the Integrated Pancreatic Cancer Pathway (BSID: 711360), Small Cell Lung Cancer (BSID: 83118), and Non-small Cell Lung Cancer (BSID: 83119), indicating that tretinoin may have potential therapeutic effects on these cancers. The cancer pathways agree with previous discussions of potential roles of tretinoin in cancer treatment.[96−98] The pathways relevant to pancreatic and lung cancers retrieved in this analysis were supported by previous in vitro or in vivo tests, which demonstrated that ATRA inhibited growth of human pancreatic cancer cells in vitro[99,100] and decreased metastasis of lung tumors in cancer-bearing mice.[101] The related pathways are shown in Figure 1D, suggesting that the primary targets RARs and their signaling pathways might be responsible for the clinical uses of tretinoin in the treatment of cancers.

Case B: Antineoplastic Drugs with Identified Pathways Responsible for Their Clinical Uses

Besides gefitinib and tamoxifen, several of the investigated drugs are antineoplastic drugs that kill cancer cells by targeting enzymes and receptors (such as fluorouracil and bortezomib) or by directly interacting with DNA or tubulin (such as doxorubicin, actinomycin, and vincristine, Table 1). For all of these drugs except bortezomib, biological pathways regarding the particular cancers they are used for were identified, supporting their clinical functions. For the drugs targeting DNA and tubulin, pathways regarding apoptosis and cell mitosis were retrieved, which represent biological functions responsible for their drug effects. These relevant pathways are shown in Table 2. Doxorubicin and actinomycin act as anticancer agents by directly targeting DNA.[102−104] Both of them identified pathways responsible for apoptosis and a variety of cancers (Table 2). The participation of the target genes and proteins in various cancer-related pathways supported the broad-spectrum usage of doxorubicin in cancer therapy;[105] the relevant apoptosis pathways accounted for the biological and clinical functions of both drugs. Vincristine exerts its antineoplastic effects by binding to tubulin.[106] Pathways regarding cell mitosis were identified, consistent with the biological functions behind its drug effects. Fluorouracil and bortezomib kill cancer cells by targeting thymidylate synthase (TYMS)[107] and the proteasome,[108,109] respectively. The crystal structure of uridine phosphorylase 1 (UPP1) bound with fluorouracil (PDB ID: 3NBQ) was obtained. Pathways regarding the MOA of fluorouracil were retrieved, including those of Pyrimidine Metabolism (BSIDs: 82946 and 106281) shared by TYMS, UPP1, and other targets (Supporting Information, Table S4). 
The identified Integrated Pancreatic Cancer Pathway (BSID: 711360) supported the usage of fluorouracil for pancreatic cancer.[110] For bortezomib, pathways underlying its MOA were retrieved, such as Protein Processing in Endoplasmic Reticulum (BSID: 167325) and the Parkin-Ubiquitin Proteasomal System (BSID: 700638, Table 2).

Case C: Drugs Targeting Nuclear Receptors and Other Proteins

Some drugs exert their clinical effects by targeting nuclear receptors or other receptor proteins, such as mifepristone at PGR (progesterone receptor),[111,112] rosiglitazone at PPAR (peroxisome proliferator-activated receptor),[113] propofol at GABR (γ-aminobutyric acid (GABA) A receptors),[114,115] and triiodothyronine at THRs (thyroid hormone receptors).[116] For some of them, pathways specific to their current drug effects were found. For example, rosiglitazone identified the Insulin (BSID: 83090) and PPAR (BSID: 83042) signaling pathways, which indicate the underlying MOAs accounting for its clinical function as an antidiabetic drug.[113] Adenosine identified the pathway of Vascular Smooth Muscle Contraction (BSID: 96530), shared by its primary target ADOR (adenosine receptor) and secondary targets, supporting its role as an antiarrhythmic agent for the treatment of heart disorders.[117]

Discussion

Pathway Enrichment Analysis: A Novel Approach for Drug Repositioning

Pathways common to genes with significant expression changes upon drug treatment were previously used to elucidate drug MOAs and repositioning by using gene expression profiles across multiple cell types and data sets.[4,5,21,22] However, pathway enrichment analysis based on protein targets retrieved from large-scale public databases is still lacking. In this study, biological pathways shared by target proteins and genes retrieved via MMDB, PubChem BioAssay, and GEO were identified and analyzed in an attempt to examine and predict drug MOAs and functions. For eleven of the investigated drugs, enriched pathways relevant to their clinical uses and MOAs were identified among their drug targets, indicating that identifying biological pathways may provide useful information on drug uses and MOAs, especially for drugs with unknown underlying mechanisms. Considering the tremendous cost of developing new drugs, drug repositioning or repurposing has become a promising strategy for developing new therapeutic treatments for diseases.[10−14] Previous bioinformatics efforts focused on molecular modeling based on virtual screening of existing drug databases,[15−18] as well as computational biology approaches that predicted new drug uses by establishing relationships among drugs, genes, and diseases.[11−14,19,21] Most recently, pathway enrichment analysis was applied to drug repositioning based on large-scale analysis of gene expression data sets.[21,22] In this study, some drugs identified relevant pathways responsible for diseases other than those they currently treat. Many of these pathways were supported by previous in vitro tests, in vivo tests, or clinical trials, indicating that obtaining biological pathways shared by drug targets might be a novel approach for predicting new clinical uses of existing drugs. 
Notably, the Integrated Pancreatic Cancer Pathway (BSID: 711360) was identified for many of the investigated drugs (especially those used for cancer therapy) and even indicated potential drug repositioning for drugs such as gefitinib, celecoxib, tamoxifen, and tretinoin. The involvement of this pathway with these drugs might be due to its integrated collection of multiple proteins from different mechanistic pathways relevant to pancreatic cancer (see the description of this pathway in the BioSystems database: http://www.ncbi.nlm.nih.gov/biosystems/?term=711360), with many of the participating proteins also responsible for other cancer pathways. Another point is that pathways pointing to diseases other than the current uses of a drug do not necessarily indicate a novel therapeutic treatment. Some drugs were tested on diseases identified by such pathways but failed to become a treatment because of their pharmacokinetic properties or side effects. For example, although actinomycin identified a few pathways for cancers that it is not currently used to treat, these pathways do not indicate new clinical uses because of its high cytotoxicity. It should also be pointed out that the pathways shown in Table 2 are examples of relevant pathways shared by the primary and secondary targets and might not include all the pathways that contribute to the pharmacological effects underlying the clinical uses of the drugs. For example, doxorubicin is an antineoplastic drug that induces cell apoptosis by targeting DNA.[118] Besides the 20 relevant pathways shown in Table 2, several pathways of molecules participating in apoptosis signal transduction were identified, such as those of TNF (tumor necrosis factor), FAS (Fas cell surface death receptor), and caspases. 
Causing apoptosis is an important function of the immune system.[119] There were several retrieved pathways responsible for the immune system and the participating molecules, such as cytokines and interleukins. See Supporting Information Table S4 for more pathways that could contribute to the pharmacological effects of the drugs.
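
The idea of an "enriched" pathway shared by a drug's targets can be illustrated with a minimal sketch. It assumes a one-sided hypergeometric test, a common choice for over-representation analysis; this section does not state which statistic the in-house scripts used, so the test, the background size of 20000 genes, and the `PAD_*` pathway members are all assumptions made for illustration. Only the gene symbols and BSIDs named in the tretinoin example come from the text.

```python
from math import comb

def hypergeom_tail(k, N, K, n):
    """P(X >= k) when n targets are drawn from N genes, K of which lie in the pathway."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total

def enriched_pathways(targets, pathways, background_size):
    """Return (bsid, shared targets, p-value) tuples, most significant first."""
    targets = set(targets)
    rows = []
    for bsid, members in pathways.items():
        shared = targets & set(members)
        if shared:
            p = hypergeom_tail(len(shared), background_size, len(members), len(targets))
            rows.append((bsid, sorted(shared), p))
    return sorted(rows, key=lambda row: row[2])

# Targets named in the tretinoin example; the full retrieved target list is not reproduced here.
tretinoin_targets = {"RARA", "RARG", "CRABP2", "MAPK1", "VDR", "EIF4EBP1", "MYC", "PPARD"}

# Toy pathway membership: PAD_* entries are hypothetical placeholders, not real members.
toy_pathways = {
    "138039": {"RARA", "RARG", "MAPK1", "VDR", "PAD_1"},
    "83117": {"RARA", "EIF4EBP1", "MAPK1", "MYC", "PPARD", "PAD_2", "PAD_3"},
}

# background_size=20000 is an assumed number of annotated human genes.
results = enriched_pathways(tretinoin_targets, toy_pathways, background_size=20000)
for bsid, shared, p in results:
    print(bsid, shared, f"p={p:.2e}")
```

In practice the p-values would also need correction for the number of pathways tested, which this sketch omits.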

Drug Targets: Direct or Indirect Interaction with Drugs

The retrieved proteins and genes in this study can be divided into two categories: those that physically bind a drug, and those that do not interact directly with a drug but whose activities or expression are affected because they act upstream or downstream of primary proteins or through synergistic effects with them. Primary target proteins interact physically with a drug, which causes its biological effects on the target. Many proteins retrieved from the BioAssay database also interact physically with drugs, as suggested by affinities reported as Kd, IC50, Ki, or EC50 values; for cell-based assays in BioAssay, however, an identified drug target might not be the exact pharmacological target, and the drug effects on this target might be attributable to an upstream protein that is directly targeted by the drug. The genes retrieved as drug targets via GEO DataSets were selected mainly on the basis of gene expression data, indicating that they might be regulated by the primary target within the same signal cascade. The identified pathways could involve targets with both direct and indirect interaction with drugs. It is reasonable that, in a retrieved pathway, targets that physically bind a drug coexist with other proteins belonging to the same signal cascade that lack direct drug interaction. Some identified pathways even involve multiple proteins that physically interact with a drug. For example, TYMS is the primary target of fluorouracil; uridine phosphorylase 1 (UPP1) is an enzyme catalyzing the metabolism of fluorouracil, and its physical binding to this drug is supported by the crystal structure of UPP1 bound with fluorouracil (PDB ID: 3NBQ) retrieved from MMDB.[120] Both targets participated in the retrieved pathway of Pyrimidine Metabolism (BSID: 82946), which is responsible for the antineoplastic effect of fluorouracil (Table 2 and Supporting Information, Table S4). 
The involvement of targets with both direct and indirect interaction with drugs indicates that both categories of targets are significant for identification of biological pathways relevant to drug effects.
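
The two target categories described above can be made concrete with a small sketch. It is an illustration under stated assumptions, not the authors' implementation: the `Target` record, the evidence labels (`structure`, `affinity`, `cell_based`, `expression`), and the rule that only structure or affinity evidence counts as direct binding are hypothetical simplifications of the text. The placeholder genes `GENE_X` and `GENE_Y` and the BioAssay source assigned to TYMS are likewise assumed; only TYMS, UPP1, and BSID 82946 come from the fluorouracil example.

```python
from dataclasses import dataclass

# Assumed rule: a crystal structure or a measured affinity (Kd, IC50, Ki, EC50)
# suggests physical binding; cell-based readouts and expression changes do not.
DIRECT_EVIDENCE = {"structure", "affinity"}

@dataclass(frozen=True)
class Target:
    symbol: str
    source: str    # "MMDB", "BioAssay", or "GEO"
    evidence: str  # "structure", "affinity", "cell_based", or "expression"

    @property
    def is_direct(self):
        return self.evidence in DIRECT_EVIDENCE

def split_pathway_targets(targets, pathway_members):
    """Split the targets that fall in a pathway into (direct, indirect) symbol lists."""
    direct, indirect = [], []
    for target in targets:
        if target.symbol in pathway_members:
            (direct if target.is_direct else indirect).append(target.symbol)
    return sorted(direct), sorted(indirect)

fluorouracil_targets = [
    Target("TYMS", "BioAssay", "affinity"),    # primary target; source is assumed
    Target("UPP1", "MMDB", "structure"),       # PDB ID: 3NBQ
    Target("GENE_X", "GEO", "expression"),     # hypothetical expression-derived gene
    Target("GENE_Y", "BioAssay", "cell_based"),  # hypothetical, outside the pathway
]

# Toy membership for Pyrimidine Metabolism (BSID: 82946); GENE_X is a placeholder.
pyrimidine_metabolism = {"TYMS", "UPP1", "GENE_X"}

direct, indirect = split_pathway_targets(fluorouracil_targets, pyrimidine_metabolism)
print("direct:", direct, "indirect:", indirect)
```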

Drugs That Failed to Identify Responsible Pathways for Their Clinical Uses

For disulfiram, mifepristone, propofol, triiodothyronine, and valproic acid, no significantly enriched pathways common to the drug targets were found that are relevant to their biological functions or clinical uses. The reasons might be that (1) the retrieved gene or protein targets were responsible for a variety of cellular functions that were not particular to the pathways involving the drug's primary target protein; (2) the retrieved target proteins or genes were not the direct targets of the drug but interacted with it indirectly, as upstream or downstream proteins or by cross-talk with the primary pathways; or (3) the retrieved drug targets were too few to yield relevant pathways. See the Supporting Information for a detailed discussion of these drugs.

Public Databases: A Resource for Biomedical Research on Small Molecules, Bioactivities, Target Proteins and Genes, and Responsible Biological Systems

The enormous amount of biological information in the MMDB, BioAssay, GEO, and BioSystems databases made this study possible. The cross-links to protein structures in the PubChem Compound database provide a convenient way to retrieve the crystal structures of drug targets associated with each drug in the MMDB database. However, only a few resolved structures were retrieved for the investigated drugs, and for some drugs no crystal structures with human proteins were found. The limited number of structures retrieved from MMDB may be attributed to the lack of experimental data deposited in the PDB. BioAssay is one of the public databases with the largest collections of drug target information. One advantage of the BioAssay database is its straightforward listing of drug bioactivities and targets, which presents users with an outline of drug target information. As a public repository of gene expression data, GEO DataSets is a valuable complement to the target list in this study when the protein targets retrieved from MMDB and BioAssay are too few to yield biological pathways relevant to drug effects. It should be pointed out that the usage of GEO DataSets in this study might be limited because it is time-consuming to explore the specific gene expression data by checking the literature associated with the GEO DataSets records retrieved for each drug, partly owing to the lack of cross-linking between GEO and PubChem.

Further Application of Pathway Analysis in Drug Repositioning

The current study identifies drug repositioning opportunities by investigating prioritized pathways based on available drug targets. Pathway analysis can be applied to drug repositioning from multiple perspectives. Novel therapeutic treatments can also be discovered by retrospectively identifying repurposed drug targets from pathway analysis results. Therefore, drugs with similar target profiles common to the same pathway might be lead candidates for a disease for which they have no previous therapeutic indication. In addition, only pathways common to both primary and secondary targets are discussed in this study. Nevertheless, even without the involvement of a primary target, statistically enriched pathways shared by secondary targets may also have interesting and valuable pharmacological and clinical significance that is helpful for future drug design. The aforementioned applications of pathway analysis are underway in our group.
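
One way to picture the suggestion that drugs sharing a pathway might be candidates for the same disease is to invert a drug-to-pathway map and list the pathways reached by more than one drug. The sketch below is only an illustration: the function name and the `min_drugs` threshold are hypothetical, and the map contains only a handful of drug and BSID pairs named in this article, not the full contents of Table 2.

```python
from collections import defaultdict

def drugs_by_pathway(drug_pathways, min_drugs=2):
    """Invert a drug -> BSIDs map and keep pathways shared by at least min_drugs drugs."""
    index = defaultdict(set)
    for drug, bsids in drug_pathways.items():
        for bsid in bsids:
            index[bsid].add(drug)
    return {bsid: sorted(drugs) for bsid, drugs in index.items() if len(drugs) >= min_drugs}

# Partial, illustrative map built from pathways mentioned in the text.
drug_pathways = {
    "tretinoin": {"138039", "83117", "711360"},
    "fluorouracil": {"82946", "711360"},
    "gefitinib": {"711360"},
    "celecoxib": {"711360"},
    "tamoxifen": {"711360"},
    "rosiglitazone": {"83090", "83042"},
}

shared = drugs_by_pathway(drug_pathways)
for bsid, drugs in shared.items():
    print(bsid, drugs)
```

With this partial map, only the Integrated Pancreatic Cancer Pathway (BSID: 711360) is shared, which mirrors the observation in the Discussion that this pathway recurs across many of the investigated drugs.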

Conclusion

In conclusion, the pathway analysis based on drug target information outlined in this study might present a new approach to investigating the MOAs and additional clinical uses of drugs. Eleven of the investigated drugs identified relevant pathways based on gene and protein targets retrieved from MMDB, BioAssay, and GEO DataSets; the remaining five drugs failed to identify enriched relevant pathways, possibly because of the limited data available in the public databases. Interestingly, for the drugs gefitinib, celecoxib, tamoxifen, and tretinoin, biological pathways responsible for diseases other than their current uses were retrieved, indicating that identifying biological pathways could be a new approach to drug repositioning or repurposing. Overall, the data mining and pathway analysis applied in this study provide a new approach for identifying drug MOAs and repurposing opportunities, thereby possibly benefiting the development of new therapeutic treatments for diseases.
  116 in total

Review 1.  Molecular similarity analysis in virtual screening: foundations, limitations and novel approaches.

Authors:  Hanna Eckert; Jürgen Bajorath
Journal:  Drug Discov Today       Date:  2007-02-07       Impact factor: 7.851

2.  The development of biologic end points in patients treated with differentiation agents: an experience of retinoids in prostate cancer.

Authors:  W K Kelly; I Osman; V E Reuter; T Curley; W D Heston; D M Nanus; H I Scher
Journal:  Clin Cancer Res       Date:  2000-03       Impact factor: 12.531

3.  Inhibition of metastatic lung cancer in C57BL/6 mice by liposome encapsulated all trans retinoic acid (ATRA).

Authors:  V M Berlin Grace
Journal:  Int Immunopharmacol       Date:  2012-09-27       Impact factor: 4.932

4.  Identification of novel breast cancer resistance protein (BCRP) inhibitors by virtual screening.

Authors:  Yongmei Pan; Paresh P Chothe; Peter W Swaan
Journal:  Mol Pharm       Date:  2013-03-21       Impact factor: 4.939

5.  Celecoxib synergizes human pancreatic ductal adenocarcinoma cells to sorafenib-induced growth inhibition.

Authors:  Ann H Rosendahl; Chinmay Gundewar; Katarzyna Said; Emelie Karnevi; Roland Andersson
Journal:  Pancreatology       Date:  2012-04-22       Impact factor: 3.996

6.  Enhancing antitumor effects in pancreatic cancer cells by combined use of COX-2 and 5-LOX inhibitors.

Authors:  Xiaoling Ding; Chen Zhu; Hui Qiang; Xiaorong Zhou; Guoxiong Zhou
Journal:  Biomed Pharmacother       Date:  2011-08-27       Impact factor: 6.529

7.  Effects of celecoxib on cycle kinetics of gastric cancer cells and protein expression of cytochrome C and caspase-9.

Authors:  Yu-Jie Wang; Xiao-Ping Niu; Li Yang; Zhen Han; Ying-Jie Ma
Journal:  Asian Pac J Cancer Prev       Date:  2013

Review 8.  Chemoprevention of cancers in gastrointestinal tract with cyclooxygenase 2 inhibitors.

Authors:  Rui Wang; Linjie Guo; Pu Wang; Wenjuan Yang; Yaoyao Lu; Zhiyin Huang; Chengwei Tang
Journal:  Curr Pharm Des       Date:  2013       Impact factor: 3.116

9.  Reactome knowledgebase of human biological pathways and processes.

Authors:  Lisa Matthews; Gopal Gopinath; Marc Gillespie; Michael Caudy; David Croft; Bernard de Bono; Phani Garapati; Jill Hemish; Henning Hermjakob; Bijay Jassal; Alex Kanapin; Suzanna Lewis; Shahana Mahajan; Bruce May; Esther Schmidt; Imre Vastrik; Guanming Wu; Ewan Birney; Lincoln Stein; Peter D'Eustachio
Journal:  Nucleic Acids Res       Date:  2008-11-03       Impact factor: 16.971

10.  PID: the Pathway Interaction Database.

Authors:  Carl F Schaefer; Kira Anthony; Shiva Krupa; Jeffrey Buchoff; Matthew Day; Timo Hannay; Kenneth H Buetow
Journal:  Nucleic Acids Res       Date:  2008-10-02       Impact factor: 16.971
