Kohei Fukuto, Tatsuya Takagi, Yu-Shi Tian.
Abstract
The severe side effects of some drugs can threaten patients' lives and financially jeopardize pharmaceutical companies. Computational methods that use chemical, biological, and phenotypic features have been applied to this problem by predicting side effects. Among these, matrix factorization, which exploits the side-effect history of different drugs, has yielded promising results. However, approaches that encapsulate all the characteristics of side-effect prediction have not been investigated to date. To address this gap, we applied the logistic matrix factorization algorithm to a database of spontaneous reports to construct a prediction model with higher accuracy. We expressed differences in the importance of drug-side effect pairs through a weighting strategy and addressed the cold-start problem via an attribute-to-feature mapping method. Consequently, our proposed model improved the prediction accuracy by 2.5% and efficiently handled the cold-start problem. The proposed methodology is expected to benefit applications such as warning systems in clinical settings.
Year: 2021; PMID: 34907245; PMCID: PMC8671428; DOI: 10.1038/s41598-021-03348-y
Source DB: PubMed; Journal: Sci Rep; ISSN: 2045-2322; Impact factor: 4.379
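To make the abstract's "logistic matrix factorization with a weighting strategy" concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a binary drug-by-side-effect matrix `R`, a caller-supplied nonnegative weight matrix `W` (hypothetical; the record does not define how the weights are built), an L2 penalty `lam`, and plain full-batch gradient descent without bias terms. The function names, rank, learning rate, and epoch count are all illustrative choices.

```python
import numpy as np

def predict_proba(U, V):
    """Probability of each drug/side-effect pair: sigmoid of the factor dot product."""
    return 1.0 / (1.0 + np.exp(-(U @ V.T)))

def weighted_log_loss(R, W, U, V, lam):
    """Weighted negative log-likelihood plus an L2 penalty on both factor matrices."""
    P = np.clip(predict_proba(U, V), 1e-12, 1.0 - 1e-12)
    nll = -np.sum(W * (R * np.log(P) + (1.0 - R) * np.log(1.0 - P)))
    return nll + 0.5 * lam * (np.sum(U ** 2) + np.sum(V ** 2))

def fit_logistic_mf(R, W, rank=2, lam=1e-3, lr=0.05, epochs=500, seed=0):
    """Fit drug factors U and side-effect factors V by full-batch gradient descent.

    R: binary matrix (drugs x side effects).
    W: nonnegative per-pair weights, same shape as R (assumed, user-supplied).
    """
    rng = np.random.default_rng(seed)
    n_drugs, n_effects = R.shape
    U = 0.1 * rng.standard_normal((n_drugs, rank))
    V = 0.1 * rng.standard_normal((n_effects, rank))
    for _ in range(epochs):
        G = W * (predict_proba(U, V) - R)  # gradient of the weighted loss w.r.t. the logits
        grad_U = G @ V + lam * U
        grad_V = G.T @ U + lam * V
        U -= lr * grad_U
        V -= lr * grad_V
    return U, V

# Toy example: two groups of drugs with disjoint side-effect profiles.
R = np.array([[1, 1, 0], [1, 1, 0], [0, 0, 1], [0, 0, 1]], dtype=float)
W = 1.0 + 2.0 * R  # hypothetical scheme: observed pairs count three times as much
U, V = fit_logistic_mf(R, W)
P = predict_proba(U, V)
```

Passing `W` in explicitly keeps the sketch neutral about the weighting scheme; any rule that yields a nonnegative matrix of the same shape as `R` can be plugged in.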
Figure 1. Flow chart of this study.
Figure 2. Method for splitting the data into training and test sets.
List of hyperparameters and their range in the grid search.
| Model | Hyperparameter | Range |
|---|---|---|
| Logistic MF | λ | [1.0 × 10⁻⁴, 5.0 × 10⁻⁴, 1.0 × 10⁻³, 5.0 × 10⁻³, 1.0 × 10⁻²] |
| Logistic MF | α | [0, 1, 2, 5, 10, 15] |
| Logistic MF | β | [0.2, 0.4, 0.6, 0.8, 1.0] |
| MF | λ | [1.0 × 10⁻⁴, 5.0 × 10⁻⁴, 1.0 × 10⁻³, 5.0 × 10⁻³, 1.0 × 10⁻²] |
| FGRMF | λ | [1.0 × 10⁻⁵, 5.0 × 10⁻⁵, 1.0 × 10⁻⁴, 5.0 × 10⁻⁴, 1.0 × 10⁻³] |
| FGRMF | μ | [1.0 × 10⁻⁴, 5.0 × 10⁻⁴, 1.0 × 10⁻³, 5.0 × 10⁻³, 1.0 × 10⁻²] |
| SVM | C | [1.0 × 10⁻⁵, 1.0 × 10⁻⁴, 1.0 × 10⁻³, 1.0 × 10⁻², 1.0 × 10⁰, 1.0 × 10¹, 1.0 × 10²] |
| SVM | kernel | ["linear", "poly", "rbf"] |
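As an illustration of how a grid like the one above can be enumerated, the sketch below walks every combination of the Logistic MF ranges and keeps the best-scoring one. The ranges are copied from the table. The helper names (`iter_grid`, `grid_search`) and the `evaluate` callback are hypothetical, and the record does not say which validation metric or split drove the authors' selection.

```python
from itertools import product

# Ranges copied from the hyperparameter table for Logistic MF.
LOGISTIC_MF_GRID = {
    "lam": [1.0e-4, 5.0e-4, 1.0e-3, 5.0e-3, 1.0e-2],
    "alpha": [0, 1, 2, 5, 10, 15],
    "beta": [0.2, 0.4, 0.6, 0.8, 1.0],
}

def iter_grid(grid):
    """Yield one dict per combination of hyperparameter values."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))

def grid_search(grid, evaluate):
    """Return (best_params, best_score); `evaluate` maps a params dict to a score to maximize."""
    best_params, best_score = None, float("-inf")
    for params in iter_grid(grid):
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Placeholder scorer, used only to exercise the search loop.
# In practice this would train a model and return a validation metric.
def toy_score(params):
    return -abs(params["alpha"] - 5) - abs(params["beta"] - 0.6) - params["lam"]

n_combinations = sum(1 for _ in iter_grid(LOGISTIC_MF_GRID))
best_params, best_score = grid_search(LOGISTIC_MF_GRID, toy_score)
```

With 5 values of λ, 6 of α, and 5 of β, the Logistic MF grid contains 5 × 6 × 5 = 150 combinations.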
PR-AUC of test sets for Logistic MF and other models.
| Model | Mean (68 ADRs) | SJS | LPT | NMS |
|---|---|---|---|---|
| Logistic MF | 0.812 | 0.865 | 0.948 | 0.771 |
| MF | 0.787 | 0.877 | 0.941 | 0.685 |
| FGRMF | 0.752 | 0.800 | 0.821 | 0.699 |
| SVM | 0.763 | 0.794 | 0.938 | 0.755 |
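For readers unfamiliar with the metric, the sketch below shows one common way to estimate the area under the precision-recall curve: average precision, computed separately for each side effect and then averaged. This is an assumption about the procedure. The record does not state which PR-AUC estimator the authors used, and reading "Mean (68 ADRs)" as a per-ADR average is an interpretation of the column header. The function names are illustrative.

```python
def average_precision(labels, scores):
    """Average precision for one side effect.

    labels: 0/1 ground truth per drug; scores: predicted scores per drug.
    Ties in `scores` are broken by input order, which is a simplification.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positive labels")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at the rank of each positive
    return total / n_pos

def mean_average_precision(label_columns, score_columns):
    """Unweighted mean of the per-side-effect average precisions."""
    values = [average_precision(l, s) for l, s in zip(label_columns, score_columns)]
    return sum(values) / len(values)

# Positives at ranks 1 and 3: (1/1 + 2/3) / 2 = 5/6.
ap_example = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

Because average precision depends on the share of positives, values are not directly comparable across datasets with different positive rates.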
Figure 3. Change in PR-AUC when using different thresholds.
PR-AUC of the external tests for Logistic MF and other methods.
| Model | Mean (68 ADRs) | SJS | LPT | NMS |
|---|---|---|---|---|
| Logistic MF | 0.297 | 0.293 | 0.243 | 0.234 |
| MF | 0.291 | 0.275 | 0.246 | 0.226 |
| FGRMF | 0.293 | 0.277 | 0.264 | 0.251 |
| SVM | 0.195 | 0.133 | 0.149 | 0.078 |
PR-AUC of test set in SIDER for Logistic MF and other methods.
| Model | Mean | SJS | LPT | NMS |
|---|---|---|---|---|
| Logistic MF | 0.462 | 0.540 | 0.777 | 0.658 |
| MF | 0.445 | 0.453 | 0.742 | 0.701 |
| FGRMF | 0.481 | 0.494 | 0.769 | 0.722 |
| SVM | 0.551 | 0.689 | 0.869 | 0.681 |
Figure 4. PR-AUC of test sets with varying number of known side effects.
Test PR-AUC for Logistic MF and Map-LMF with varying number of known side effects.
| Test_delete_ratio | 0.80 | 0.90 | 0.95 | 0.99 |
|---|---|---|---|---|
| Logistic MF | 0.550 | 0.377 | 0.286 | 0.235 |
| Map-LMF (RDKit) | 0.309 | 0.310 | 0.308 | 0.308 |
| Map-LMF (ECFP) | 0.357 | 0.357 | 0.358 | 0.359 |
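The nearly constant Map-LMF scores are what one would expect if a new drug's latent factors are computed from its attributes alone and not from its side-effect history. The sketch below illustrates that idea with a ridge-regression map from attribute vectors to latent factors. This is only one simple way to realize an attribute-to-feature mapping and is not confirmed as the authors' method. The function names, the `ridge` parameter, and the random stand-ins for descriptor vectors are all assumptions.

```python
import numpy as np

def fit_attribute_map(A, U, ridge=1e-3):
    """Ridge-regression map M such that A @ M approximates the trained drug factors U.

    A: drug attributes (drugs x attributes), e.g. descriptors or fingerprint bits.
    U: latent drug factors from a trained factorization (drugs x rank).
    """
    n_attr = A.shape[1]
    return np.linalg.solve(A.T @ A + ridge * np.eye(n_attr), A.T @ U)

def predict_cold_start(a_new, M, V):
    """Side-effect probabilities for a drug with attributes a_new and no known side effects."""
    u_new = a_new @ M  # latent factors inferred from attributes only
    return 1.0 / (1.0 + np.exp(-(u_new @ V.T)))

# Synthetic check: when the factors are exactly linear in the attributes,
# the fitted map recovers the true one.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))       # stand-in attribute vectors
M_true = rng.standard_normal((5, 2))
U = A @ M_true                         # stand-in trained drug factors
V = rng.standard_normal((3, 2))        # stand-in side-effect factors
M = fit_attribute_map(A, U, ridge=1e-10)
p_new = predict_cold_start(rng.standard_normal(5), M, V)
```

Since `predict_cold_start` never reads the new drug's row of the report matrix, deleting more of that row cannot change its output, which is consistent with the flat Map-LMF rows in the table.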