| Literature DB >> 35513784 |
Yongsun Shim1, Munhwan Lee1, Pil-Jong Kim2, Hong-Gee Kim3,4.
Abstract
BACKGROUND: Drug combination research, which combines two or more drugs, is highly important for reducing drug side effects and enhancing therapeutic effect compared with single drugs. Conducting in-vivo and in-vitro experiments on the vast number of possible drug combinations incurs enormous time and cost. To narrow the candidate set, researchers use in-silico methods to classify whether drug combinations are synergistic. Since unstructured data, such as biomedical documents, describe experimental types, methods, and results, extracting features from documents can be beneficial for predicting anti-cancer drug combination synergy. However, few studies predict anti-cancer drug combination synergy using document-extracted features.
Keywords: Anti-cancer drug combination; Deep learning; Document-based feature extraction; Drug synergy; Machine learning; Natural language processing; Text mining; Word2vec
Year: 2022 PMID: 35513784 PMCID: PMC9069794 DOI: 10.1186/s12859-022-04698-8
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.307
Fig. 1 Overview of our approach
Hyperparameter setting of Word2vec
| Hyperparameter name | Value |
|---|---|
| Vector size | 256 |
| Window | 5 |
| Min count | 1 |
| sg | 0 |
| Epochs | 200 |
Examples of NCI-ALMANAC
| Drug1 | Drug2 | Cell line | ComboScore |
|---|---|---|---|
| Methotrexate | Hydroxyurea | SF-295 | 14.22 |
| Busulfan | 2-Fluoro Ara-A | CAKI-1 | 14.33 |
| Azacitidine | Thiotepa | NCI-H460 | 20.44 |
| Methotrexate | Dactinomycin | 786-0 | −7.22 |
| Busulfan | Mercaptopurine | A498 | −6.11 |
| Azacitidine | Thiotepa | CAKI-1 | −16.33 |
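The NCI-ALMANAC rows above pair two drugs with a cell line and a ComboScore. A common way to turn these into classification targets is to binarize the score; the sketch below uses a threshold of 0 (an assumption for illustration — the paper's exact labeling rule may differ):

```python
# Illustrative sketch: binarizing ComboScore into synergy labels.
# Threshold of 0 is an assumption, not necessarily the paper's rule.
rows = [
    ("Methotrexate", "Hydroxyurea", "SF-295", 14.22),
    ("Busulfan", "2-Fluoro Ara-A", "CAKI-1", 14.33),
    ("Azacitidine", "Thiotepa", "NCI-H460", 20.44),
    ("Methotrexate", "Dactinomycin", "786-0", -7.22),
    ("Busulfan", "Mercaptopurine", "A498", -6.11),
    ("Azacitidine", "Thiotepa", "CAKI-1", -16.33),
]

# Label 1 = synergistic (positive ComboScore), 0 = non-synergistic
labeled = [(d1, d2, cell, int(score > 0)) for d1, d2, cell, score in rows]
```

Note that the same drug pair (e.g. Azacitidine + Thiotepa) can receive different labels on different cell lines, so the cell line is part of each instance.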
Performance comparison of prediction models
| Algorithms | FFNN | AE | XGB | ERT | LR | | | |
|---|---|---|---|---|---|---|---|---|
| (a) Performance comparison using ROC-AUC | ||||||||
| Baseline | 0.912 ± 0.004 | 0.91 ± 0.005 | 0.914 ± 0.006 | 0.895 ± 0.007 | 0.885 ± 0.01 | 0.895 ± 0.005 | 0.843 ± 0.007 | 0.847 ± 0.006 |
| Ours | 0.915 ± 0.006 | 0.923 ± 0.003 | 0.92 ± 0.004 | 0.889 ± 0.003 | 0.892 ± 0.004 | 0.881 ± 0.004 | 0.854 ± 0.007 | |
| (b) Performance comparison using AUPR | ||||||||
| Baseline | 0.402 ± 0.017 | 0.417 ± 0.032 | 0.408 ± 0.025 | 0.339 ± 0.026 | 0.349 ± 0.016 | 0.381 ± 0.026 | 0.24 ± 0.015 | 0.192 ± 0.011 |
| Ours | 0.434 ± 0.014 | 0.427 ± 0.025 | 0.424 ± 0.011 | 0.371 ± 0.013 | 0.381 ± 0.016 | 0.326 ± 0.012 | 0.196 ± 0.004 | |
| (c) Performance comparison using F1 score | ||||||||
| Baseline | 0.359 ± 0.027 | 0.262 ± 0.045 | 0.325 ± 0.039 | 0.224 ± 0.05 | 0.227 ± 0.01 | 0.24 ± 0.019 | 0.272 ± 0.018 | 0.059 ± 0.01 |
| Ours | 0.392 ± 0.017 | 0.313 ± 0.052 | 0.296 ± 0.021 | 0.271 ± 0.016 | 0.259 ± 0.02 | 0.263 ± 0.015 | 0.064 ± 0.013 | |
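The three metrics in panels (a)-(c) can be computed with scikit-learn (an assumed toolkit for illustration). ROC-AUC and AUPR are evaluated on predicted probabilities, while the F1 score requires thresholded labels; the toy data below is invented for the sketch:

```python
# Sketch of the table's three evaluation metrics on toy predictions.
from sklearn.metrics import roc_auc_score, average_precision_score, f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]               # ground-truth synergy labels
y_prob = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1, 0.8, 0.3]  # model probabilities
y_pred = [int(p >= 0.5) for p in y_prob]         # thresholded at 0.5

roc_auc = roc_auc_score(y_true, y_prob)          # panel (a)
aupr = average_precision_score(y_true, y_prob)   # panel (b): AUPR
f1 = f1_score(y_true, y_pred)                    # panel (c)
```

AUPR (average precision) is more informative than ROC-AUC under class imbalance, which is why the absolute AUPR values in panel (b) are much lower than the ROC-AUC values in panel (a).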
Fig. 2 ROC curve of the highest-performing prediction model