Emmanuel Bresso, Renaud Grisoni, Gino Marchetti, Arnaud Sinan Karaboga, Michel Souchet, Marie-Dominique Devignes, Malika Smaïl-Tabbone.
Abstract
BACKGROUND: Drug side effects represent a common reason for stopping drug development during clinical trials. Improving our ability to understand drug side effects is necessary to reduce attrition rates during drug development as well as the risk of discovering novel side effects in available drugs. Today, most investigations deal with isolated side effects and overlook possible redundancy and their frequent co-occurrence.
Year: 2013 PMID: 23802887 PMCID: PMC3710241 DOI: 10.1186/1471-2105-14-207
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Overview of our approach for characterizing drug-SEP associations. Terms used for describing side effects in SIDER DB are grouped using a semantic similarity measure in order to build Term Clusters or TCs (A). Drugs are mapped to DrugBank in order to retrieve information about drugs themselves and their targets (B). TCs are associated to drugs to represent each drug by a side-effect fingerprint (C). SEPs are extracted as maximal frequent itemsets from side effect fingerprints (D). Two machine-learning methods are used to characterize each SEP in terms of drug and target properties (E).
Figure 2. NetworkDB conceptual model. In this entity-relationship schema, entities are in boxes and relationships in ellipses.
Figure 3. Drug side-effect binary table. This table is presented as a heatmap (produced with R) where rows and columns are grouped by distribution similarity. Each row represents the side-effect fingerprint of a drug and each column is a side-effect term cluster.
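The binary table described in Figure 3 can be illustrated with a minimal sketch. The function name `build_fingerprints` and the toy drug names are hypothetical; the associations below are invented for illustration and are not taken from SIDER.

```python
from collections import defaultdict


def build_fingerprints(drug_tc_pairs):
    """Turn (drug, term cluster) pairs into a binary drug x TC table.

    Returns (drugs, clusters, matrix), where matrix[i][j] is 1 when
    drug i is associated with term cluster j, and 0 otherwise.
    """
    tcs_by_drug = defaultdict(set)
    for drug, tc in drug_tc_pairs:
        tcs_by_drug[drug].add(tc)
    drugs = sorted(tcs_by_drug)
    clusters = sorted({tc for tcs in tcs_by_drug.values() for tc in tcs})
    matrix = [[1 if tc in tcs_by_drug[d] else 0 for tc in clusters]
              for d in drugs]
    return drugs, clusters, matrix


# Hypothetical toy associations (assumption: not real SIDER data).
pairs = [
    ("drug_a", "99_Headache"),
    ("drug_a", "41_Leukopenia"),
    ("drug_b", "99_Headache"),
    ("drug_c", "58_Gout"),
]
drugs, clusters, matrix = build_fingerprints(pairs)
for drug, row in zip(drugs, matrix):
    print(drug, row)  # each row is one side-effect fingerprint
```

Each row of `matrix` plays the role of one drug's side-effect fingerprint; clustering rows and columns for the heatmap is a separate step not shown here.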
Maximal frequent itemsets covering 20% of drugs (support) extracted from the drugTC table
| SEP | Term clusters (TCs) | Support | Avg overlap |
|---|---|---|---|
| SEP_1 | 41_Leukopenia, 90_Feeling_abnormal, 99_Headache | 123 | 69 |
| SEP_2 | 90_Feeling_abnormal, 99_Headache, 110_Shock | 123 | 73 |
| SEP_3 | 58_Gout | 120 | 60 |
| SEP_4 | 70_Pneumonia, 99_Headache | 117 | 71 |
| SEP_5 | 110_Shock, 111_Infection | 117 | 68 |
| SEP_6 | 76_Asthma, 90_Feeling_abnormal, 99_Headache | 117 | 68 |
| SEP_7 | 65_Dermatitis | 116 | 53 |
| SEP_8 | 2_Haemorrhage, 76_Asthma | 115 | 65 |
| SEP_9 | 41_Leukopenia, 76_Asthma | 115 | 62 |
| SEP_10 | 48_Rhinitis, 99_Headache, 111_Infection | 115 | 69 |
| SEP_11 | 41_Leukopenia, 110_Shock | 114 | 66 |
| SEP_12 | 39_Stevens-Johnson_syndrome, 41_Leukopenia, 100_Erythema_multiforme | 114 | 52 |
| SEP_13 | 41_Leukopenia, 48_Rhinitis | 113 | 67 |
| SEP_14 | 99_Headache, 100_Erythema_multiforme | 113 | 56 |
| SEP_15 | 31_Lymphadenopathy | 112 | 59 |
| SEP_16 | 70_Pneumonia, 90_Feeling_abnormal | 112 | 71 |
| SEP_17 | 41_Leukopenia, 70_Pneumonia | 112 | 64 |
| SEP_18 | 76_Asthma, 111_Infection | 112 | 64 |
| SEP_19 | 80_Jaundice, 100_Erythema_multiforme | 112 | 45 |
| SEP_20 | 41_Leukopenia, 111_Infection | 111 | 63 |
| SEP_21 | 8_Haematuria, 90_Feeling_abnormal, 99_Headache | 111 | 68 |
| SEP_22 | 13_Pyrexia, 33_Musculoskeletal_discomfort, 48_Rhinitis, 99_Headache | 111 | 69 |
| SEP_23 | 13_Pyrexia, 70_Pneumonia | 110 | 69 |
| SEP_24 | 48_Rhinitis, 90_Feeling_abnormal, 110_Shock | 110 | 70 |
| SEP_25 | 13_Pyrexia, 90_Feeling_abnormal, 110_Shock | 110 | 70 |
| SEP_26 | 48_Rhinitis, 90_Feeling_abnormal, 111_Infection | 110 | 69 |
Avg overlap: average of overlap size between the SEP and other SEPs.
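As a rough illustration of how maximal frequent itemsets can be extracted from fingerprints, here is a small level-wise sketch. It is not the mining tool used in the study; all function names and the toy data are hypothetical. The `avg_overlap` helper assumes that overlap is counted as the number of drugs shared by two SEPs, which is one possible reading of the footnote above.

```python
from itertools import combinations


def support(itemset, fingerprints):
    """Number of drugs whose fingerprint contains every item of itemset."""
    return sum(1 for items in fingerprints.values() if itemset <= items)


def frequent_itemsets(fingerprints, min_count):
    """Level-wise enumeration of all itemsets with support >= min_count."""
    items = sorted({i for s in fingerprints.values() for i in s})
    level = [frozenset([i]) for i in items
             if support(frozenset([i]), fingerprints) >= min_count]
    found = []
    while level:
        found.extend(level)
        # Join frequent k-itemsets into (k+1)-candidates, keep frequent ones.
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == len(a) + 1}
        level = [c for c in candidates
                 if support(c, fingerprints) >= min_count]
    return found


def maximal(itemsets):
    """Keep only itemsets that have no frequent proper superset."""
    return [s for s in itemsets if not any(s < t for t in itemsets)]


def covered(itemset, fingerprints):
    """Set of drugs whose fingerprint contains the itemset."""
    return {d for d, items in fingerprints.items() if itemset <= items}


def avg_overlap(sep, seps, fingerprints):
    """Assumed definition: mean number of drugs shared with each other SEP."""
    others = [s for s in seps if s != sep]
    if not others:
        return 0.0
    mine = covered(sep, fingerprints)
    return sum(len(mine & covered(o, fingerprints)) for o in others) / len(others)


# Hypothetical toy fingerprints (assumption: not real data).
fingerprints = {
    "d1": {"Headache", "Leukopenia", "Gout"},
    "d2": {"Headache", "Leukopenia"},
    "d3": {"Headache", "Gout"},
    "d4": {"Asthma"},
}
seps = maximal(frequent_itemsets(fingerprints, min_count=2))
for sep in seps:
    print(sorted(sep), support(sep, fingerprints),
          avg_overlap(sep, seps, fingerprints))
```

On the toy data, the single items are frequent but not maximal, because each is contained in a frequent pair; only the two pairs survive as SEP-like patterns.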
Evaluation of learning results by 10 × 10 stratified cross-validation of DT and ILP programs
| SEP | DT Acc | DT Spec | DT Sens | ILP Acc | ILP Spec | ILP Sens |
|---|---|---|---|---|---|---|
| SEP_1 | 0.65 | 0.86 | 0.39 | 0.61 | 0.63 | 0.6 |
| SEP_2 | 0.69 | 0.88 | 0.4 | 0.63 | 0.69 | 0.54 |
| SEP_3 | 0.71 | 0.88 | 0.47 | 0.71 | 0.77 | 0.63 |
| SEP_4 | 0.66 | 0.89 | 0.32 | 0.62 | 0.7 | 0.51 |
| SEP_5 | 0.68 | 0.88 | 0.38 | 0.64 | 0.7 | 0.54 |
| SEP_6 | 0.68 | 0.87 | 0.39 | 0.61 | 0.69 | 0.49 |
| SEP_7 | 0.65 | 0.86 | 0.32 | 0.6 | 0.67 | 0.49 |
| SEP_8 | 0.7 | 0.87 | 0.44 | 0.67 | 0.73 | 0.57 |
| SEP_9 | 0.69 | 0.84 | 0.46 | 0.69 | 0.75 | 0.59 |
| SEP_10 | 0.7 | 0.89 | 0.4 | 0.65 | 0.76 | 0.47 |
| SEP_11 | 0.7 | 0.88 | 0.44 | 0.7 | 0.82 | 0.45 |
| SEP_12 | 0.71 | 0.88 | 0.45 | 0.7 | 0.76 | 0.61 |
| SEP_13 | 0.67 | 0.88 | 0.35 | 0.66 | 0.74 | 0.54 |
| SEP_14 | 0.69 | 0.89 | 0.39 | 0.63 | 0.71 | 0.51 |
| SEP_15 | 0.71 | 0.9 | 0.43 | 0.69 | 0.76 | 0.6 |
| SEP_16 | 0.69 | 0.89 | 0.39 | 0.66 | 0.72 | 0.57 |
| SEP_17 | 0.74 | 0.89 | 0.52 | 0.65 | 0.74 | 0.51 |
| SEP_18 | 0.65 | 0.87 | 0.34 | 0.61 | 0.69 | 0.5 |
| SEP_19 | 0.74 | 0.91 | 0.47 | 0.72 | 0.77 | 0.64 |
| SEP_20 | 0.71 | 0.89 | 0.44 | 0.64 | 0.73 | 0.51 |
| SEP_21 | 0.72 | 0.9 | 0.46 | 0.64 | 0.72 | 0.54 |
| SEP_22 | 0.65 | 0.88 | 0.32 | 0.61 | 0.69 | 0.48 |
| SEP_23 | 0.71 | 0.89 | 0.43 | 0.63 | 0.7 | 0.51 |
| SEP_24 | 0.68 | 0.87 | 0.4 | 0.62 | 0.71 | 0.5 |
| SEP_25 | 0.71 | 0.9 | 0.43 | 0.65 | 0.72 | 0.56 |
| SEP_26 | 0.69 | 0.88 | 0.4 | 0.62 | 0.69 | 0.52 |
| Average | 0.67 | 0.83 | 0.43 | 0.65 | 0.72 | 0.54 |
Acc: accuracy, Spec: specificity, Sens: sensitivity.
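For reference, the three measures can be computed from confusion-matrix counts as in the sketch below. The function name and the example counts are hypothetical and do not correspond to any row of the table.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, specificity and sensitivity from confusion-matrix counts."""
    return {
        "Acc": (tp + tn) / (tp + fp + tn + fn),  # fraction of correct predictions
        "Spec": tn / (tn + fp),                  # true-negative rate
        "Sens": tp / (tp + fn),                  # true-positive rate
    }


# Hypothetical counts for one fold (assumption: illustrative only).
m = classification_metrics(tp=40, fp=12, tn=88, fn=60)
print(m)
```

When positives and negatives are equally numerous, as in this toy example, accuracy is the mean of specificity and sensitivity; in general it always lies between the two.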
Direct testing results with 20 new molecules
| SEP | Positives | DT TP | DT FP | DT FAERS | ILP TP | ILP FP | ILP FAERS |
|---|---|---|---|---|---|---|---|
| SEP_1 | 4 | 0 | 5 | 1 | 2 | 3 | 1 |
| SEP_2 | 11 | 1 | 2 | 1 | 0 | 1 | 1 |
| SEP_3 | 2 | 0 | 3 | 1 | 0 | 5 | 1 |
| SEP_4 | 3 | 0 | 3 | 1 | 1 | 2 | 1 |
| SEP_5 | 5 | 0 | 2 | 1 | 1 | 2 | 1 |
| SEP_6 | 5 | 1 | 5 | 3 | 2 | 3 | 1 |
| SEP_7 | 15 | 2 | 2 | 1 | 2 | 1 | 1 |
| SEP_8 | 4 | 1 | 1 | 1 | 1 | 3 | 1 |
| SEP_9 | 3 | 0 | 3 | 2 | 0 | 3 | 1 |
| SEP_10 | 5 | 0 | 1 | 0 | 0 | 3 | 1 |
| SEP_11 | 4 | 1 | 3 | 2 | 1 | 3 | 1 |
| SEP_12 | 0 | 0 | 5 | 1 | 0 | 4 | 1 |
| SEP_13 | 4 | 0 | 6 | 2 | 1 | 4 | 1 |
| SEP_14 | 4 | 1 | 0 | 0 | 1 | 5 | 2 |
| SEP_15 | 1 | 0 | 2 | 1 | 0 | 5 | 2 |
| SEP_16 | 3 | 0 | 6 | 2 | 2 | 7 | 3 |
| SEP_17 | 1 | 0 | 5 | 3 | 0 | 2 | 1 |
| SEP_18 | 2 | 0 | 3 | 1 | 0 | 2 | 1 |
| SEP_19 | 1 | 0 | 4 | 2 | 0 | 5 | 2 |
| SEP_20 | 3 | 0 | 3 | 1 | 1 | 4 | 1 |
| SEP_21 | 8 | 1 | 2 | 1 | 1 | 2 | 1 |
| SEP_22 | 5 | 0 | 3 | 1 | 1 | 5 | 1 |
| SEP_23 | 3 | 0 | 3 | 1 | 1 | 4 | 1 |
| SEP_24 | 5 | 1 | 4 | 2 | 2 | 5 | 2 |
| SEP_25 | 8 | 0 | 3 | 2 | 2 | 5 | 2 |
| SEP_26 | 4 | 0 | 6 | 3 | 0 | 3 | 1 |
Positives: number of positive examples in the test set according to SIDER, TP/FP: number of predicted true/false positives, FAERS: number of fished out molecules based on FAERS data.
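The counts defined above can be sketched with plain set operations. This assumes that the FAERS count refers to predicted molecules that are not positives in SIDER but are supported by FAERS reports, which is an interpretation of the footnote rather than something it states outright. The function name and molecule identifiers are hypothetical.

```python
def direct_test(predicted, sider_positives, faers_reported):
    """Summarize predictions for one SEP on a set of new molecules.

    predicted       -- molecules the model predicts to show the SEP
    sider_positives -- molecules showing the SEP according to SIDER
    faers_reported  -- molecules for which FAERS supports the SEP (assumption)
    """
    false_positives = predicted - sider_positives
    return {
        "Positives": len(sider_positives),
        "TP": len(predicted & sider_positives),
        "FP": len(false_positives),
        # Assumed reading: false positives that FAERS data "fish out".
        "FAERS": len(false_positives & faers_reported),
    }


# Hypothetical molecules (assumption: illustrative only).
result = direct_test(
    predicted={"m1", "m2", "m3"},
    sider_positives={"m1", "m6"},
    faers_reported={"m2", "m9"},
)
print(result)
```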
Quantitative characteristics of DT models and ILP theories
| | DT Avg (range) | DT % | ILP Avg (range) | ILP % |
|---|---|---|---|---|
| Model coverage (%) | 58 (32–67) | - | 83 (77–88) | - |
| Model size | 11 (6–15) | - | 33 (16–40) | - |
| | | | | |
| Categories | 4 (1–7) | 34 | 6 (2–13) | 19 |
| Targets | 3 (0–5) | 26 | 30 (23–39) | 90 |
| Clusters | 4 (1–9) | 40 | 9 (4–14) | 27 |
| | | | | |
| GO terms | NA | NA | 24 (16–31) | 73 |
| Domains | NA | NA | 1 (0–2) | 1 |
| Interactions | NA | NA | 8 (2–16) | 24 |
| Pathways | NA | NA | 4 (1–8) | 12 |
| | NA | NA | 6 (3–9) | 19 |
Model coverage is the percentage of positive examples covered, averaged over the 26 DT models and 26 ILP theories. Avg: average. Model size corresponds to the average number of nodes in a DT model or of rules in an ILP theory. Occurrence of each type of descriptor is estimated by counting the number of nodes (or rules, respectively) involving them (NA: not applicable).
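The two quantities defined above, model coverage and descriptor occurrence, can be sketched for a single rule-based model as follows. The rule representation (a dict with `descriptors` and `covers` keys), the function names, and the toy values are all hypothetical.

```python
from collections import Counter


def model_coverage(rules, positives):
    """Percentage of positive examples covered by at least one rule."""
    covered = set().union(*(r["covers"] for r in rules)) & positives
    return 100.0 * len(covered) / len(positives)


def descriptor_occurrence(rules):
    """For each descriptor type: (number of rules using it, % of rules)."""
    counts = Counter(d for r in rules for d in r["descriptors"])
    return {d: (n, round(100.0 * n / len(rules))) for d, n in counts.items()}


# Hypothetical toy theory with three rules (assumption: illustrative only).
rules = [
    {"descriptors": {"Targets", "GO terms"}, "covers": {"d1", "d2"}},
    {"descriptors": {"Categories"}, "covers": {"d2", "d3"}},
    {"descriptors": {"Targets", "Interactions"}, "covers": {"d9"}},
]
positives = {"d1", "d2", "d3", "d4"}

print(model_coverage(rules, positives))
print(descriptor_occurrence(rules))
```

Because one rule can involve several descriptor types, the occurrence percentages need not sum to 100.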
Theory obtained for 65_Dermatitis SEP (SEP_7)
| Rule | Condition | P | N |
|---|---|---|---|
| 3 | drug_has_target(A,B,inhibitor), goterm(B,’cellular response to insulin stimulus’) | 15 | 1 |
| 18 | drug_has_target(A,B,inhibitor), goterm(B,C), go_relation(C,part_of,go:21543) | 13 | 1 |
| 1 | drug_has_target(A,B,activator), interact(B,C), goterm(C,’central nervous system development’) | 12 | 1 |
| 30 | drug_has_target(A,B,inhibitor), interact(B,C), pathway(C,’BCR signaling pathway’,pid), drug_cluster(A,’17_quinine’,hpcc) | 12 | 0 |
| 24 | drug_has_target(A,B,inhibitor), interact(B,C), goterm(C,’translation’), interact(C,D) | 10 | 1 |
| 20 | drug_has_target(A,B,inhibitor), interact(B,C), pathway(C,’BCR signaling pathway’,pid), pathway(C,’EPO signaling pathway’,pid) | 9 | 1 |
| 25 | drug_has_target(A,B,activator), goterm(B,’lipid binding’), goterm(B,’ligand-dependent nuclear receptor activity’) | 9 | 1 |
| 35 | drug_has_target(A,B,activator), interact(B,C), goterm(C,’identical protein binding’), goterm(C,’DNA binding’) | 9 | 1 |
| 6 | drug_has_target(A,B,inhibitor), goterm(B,’protein homodimerization activity’), drug_cluster(A,’16_gliclazide’,hpcc) | 8 | 0 |
| 8 | drug_has_target(A,B,activator), interact(B,C), interact(C,’Serine/threonine-protein phosphatase 2A 55 kDa regulatory subunit B beta isoform’) | 8 | 1 |
| 15 | drug_has_target(A,B,inhibitor), goterm(B,’response to ethanol’), goterm(B,’signal transduction’) | 8 | 1 |
| 19 | drug_has_target(A,B,inhibitor), goterm(B,C), go_relation(C,is_a,go:8227), drug_cluster(A,’16_Flavoxate’,hpcombo) | 8 | 0 |
| 31 | drug_has_target(A,B,inhibitor), interact(B,C), interact(C,’Dedicator of cytokinesis protein 1’) | 8 | 0 |
| 5 | drug_has_target(A,B,activator), goterm(B,’receptor activity’), interact(B,C), goterm(C,’mitosis’) | 7 | 1 |
| 10 | drug_has_target(A,B,inhibitor), goterm(B,C), go_relation(C,is_a,’cation channel activity’), goterm(B,’serotonin receptor activity’) | 7 | 1 |
| 21 | drug_has_target(A,B,activator), interact(B,C), interact(C,’RNA polymerase-associated protein CTR9 homolog’) | 7 | 1 |
| 22 | drug_has_target(A,B,inhibitor), pathway(B,’Role of Calcineurin-dependent NFAT signaling in lymphocytes’,pid), goterm(B,’signal transduction’) | 7 | 1 |
| 23 | drug_has_target(A,B,inhibitor), interact(B,C), domain(C,’ Protein synthesis factor, GTP-binding’) | 7 | 1 |
| 28 | drug_cluster(A,’7_marinol’,hpcombo) | 7 | 1 |
| 7 | category(A,’Topoisomerase Inhibitors’), drug_has_target(A,B,inhibitor), goterm(B,’transferase activity’) | 6 | 1 |
| 12 | drug_cluster(A,’29_norfloxacin’,hpcf) | 6 | 1 |
| 17 | category(A,’Cyclooxygenase 2 Inhibitors’), drug_cluster(A,’2_estazolam’,hpcc) | 6 | 0 |
| 32 | drug_has_target(A,B,activator), goterm(B,’inflammatory response’), goterm(B,’protein binding’) | 6 | 0 |
| 2 | category(A,’Serotonin Uptake Inhibitors’) | 5 | 0 |
| 4 | drug_has_target(A,B,inhibitor), goterm(B,’synapse assembly’), drug_cluster(A,’14_fentanyl’,hpcombo) | 5 | 1 |
| 9 | drug_has_target(A,B,activator), goterm(B,’protein heterodimerization activity’), goterm(B,’cell-cell signaling’) | 5 | 1 |
| 13 | category(A,’HIV Protease Inhibitors’), drug_has_target(A,B,inhibitor), goterm(B,C), go_relation(C,is_a,D), go_relation(D,is_a,’catalytic activity’) | 5 | 1 |
| 26 | drug_has_target(A,B,inhibitor), goterm(B,’heart development’) | 5 | 1 |
| 27 | drug_has_target(A,B,inhibitor), goterm(B,C), go_relation(C,is_a,go:65008), drug_cluster(A,’55_thiothixene’,tanimoto) | 5 | 0 |
| 29 | category(A,’HIV Protease Inhibitors’), drug_has_target(A,B,inhibitor), goterm(B,’oxidation reduction’) | 5 | 0 |
| 33 | drug_has_target(A,B,other), goterm(B,C), go_relation(C,is_a,go:51240) | 5 | 1 |
| 34 | drug_has_target(A,B,inhibitor), goterm(B,C), go_relation(C,is_a,’binding’), drug_cluster(A,’27_quinine’,hpcombo) | 5 | 1 |
The condition parts of the 35 rules contained in the SEP_7 theory are given with the number of positive (P) and negative (N) covered examples. The 3 rules confirmed using peer-reviewed literature are in bold. Rules are ordered by number of positive covered examples. The 8 predicates are defined as follows: drug_has_target(A,B,inhibitor/activator): drug A inhibits/activates protein B; goterm(B,G): protein B is annotated by GO term G; go_relation(G1,R,G2): the relationship between GO terms G1 and G2 is R; interact(B,C): protein B interacts with protein C; pathway(B,P): protein B is involved in pathway P; drug_cluster(A,K,M): drug A is a member of cluster K obtained using method M; category(A,T): drug A belongs to category T; domain(B,D): protein B is composed of domain D.
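To make the reading of a rule concrete, the sketch below evaluates the condition of rule 3, drug_has_target(A,B,inhibitor), goterm(B,’cellular response to insulin stimulus’), against a tiny fact base and counts the covered positive and negative examples. This is a plain Python emulation, not the ILP system used to learn the theory; the facts, drug names, protein names, and function names are hypothetical.

```python
def rule_3(drug, kb):
    """True if the drug inhibits some protein annotated with the GO term."""
    return any(
        (target, "cellular response to insulin stimulus") in kb["goterm"]
        for (d, target, action) in kb["drug_has_target"]
        if d == drug and action == "inhibitor"
    )


def count_covered(rule, kb, positives, negatives):
    """Return (P, N): numbers of positive and negative examples covered."""
    p = sum(1 for d in positives if rule(d, kb))
    n = sum(1 for d in negatives if rule(d, kb))
    return p, n


# Hypothetical fact base (assumption: not taken from NetworkDB).
kb = {
    "drug_has_target": {
        ("drugA", "p1", "inhibitor"),
        ("drugB", "p2", "activator"),
        ("drugC", "p1", "inhibitor"),
    },
    "goterm": {
        ("p1", "cellular response to insulin stimulus"),
        ("p2", "lipid binding"),
    },
}
positives = {"drugA", "drugB"}   # drugs showing the SEP
negatives = {"drugC", "drugD"}   # drugs not showing the SEP

p, n = count_covered(rule_3, kb, positives, negatives)
print(p, n)
```

The variable B in the rule is existentially quantified: the rule fires as soon as one target of the drug satisfies both conditions, which is what `any` expresses here.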