Omid Bazgir1, Souparno Ghosh2, Ranadip Pal1.
Abstract
MOTIVATION: Anti-cancer drug sensitivity prediction using deep learning models for individual cell lines is a significant challenge in personalized medicine. Recently developed REFINED (REpresentation of Features as Images with NEighborhood Dependencies) CNN (Convolutional Neural Network)-based models have shown promising results in improving drug sensitivity prediction. The primary idea behind REFINED-CNN is to represent high-dimensional vectors as compact images with spatial correlations that can benefit from CNN architectures. However, the mapping from a high-dimensional vector to a compact 2D image depends on the a priori choice of the distance metric and projection scheme, with limited empirical procedures guiding these choices.
Year: 2021 PMID: 34252971 PMCID: PMC8275339 DOI: 10.1093/bioinformatics/btab336
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Density of distances. Kernel density estimates of the observed (Euclidean) between-feature distances versus the distances of their projections in 2D space by the four DR techniques, in regular and log scale.
Fig. 2. Illustration of the three ensemble learning approaches in this study. (I) REFINED-CNN model stacking combines four different REFINED-CNN models to produce the final prediction. (II) REFINED-CNN image stacking stacks the images in the z-direction prior to CNN modeling. (III) The integrated REFINED-CNN model integrates all the created REFINED images into one image and then trains a CNN model.
Fig. 3. Correlation between distances. Kendall’s τ among the distances estimated in 2D by each DR technique and their geometric and arithmetic means.
Fig. 4. Different REFINED images. The first row shows REFINED images created using the four DR techniques (MDS, Isomap, LLE and LE) and their arithmetic and geometric averages as the initialization step, before applying hill climbing. The second row shows the REFINED images after applying the hill climbing algorithm to each initialization.
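As a rough illustration of approach (II), the sketch below stacks several same-sized 2D images along a third (channel) axis, so that a CNN would receive one multi-channel input per sample. The function name `stack_images_z` and the toy 2×2 images are hypothetical and chosen only for illustration; the image sizes and pipeline used in the study are not given here.

```python
from typing import List

Image = List[List[float]]  # one 2D image: rows of pixel values


def stack_images_z(images: List[Image]) -> List[List[List[float]]]:
    """Stack same-sized 2D images along the z (channel) axis.

    The result is indexed as [row][col][channel].
    """
    if not images:
        raise ValueError("need at least one image")
    n_rows, n_cols = len(images[0]), len(images[0][0])
    for img in images:
        if len(img) != n_rows or any(len(row) != n_cols for row in img):
            raise ValueError("all images must share the same shape")
    return [
        [[img[r][c] for img in images] for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Toy example: four 2x2 images with made-up values, one per DR technique.
mds = [[1.0, 2.0], [3.0, 4.0]]
isomap = [[2.0, 1.0], [4.0, 3.0]]
lle = [[3.0, 4.0], [1.0, 2.0]]
le = [[4.0, 3.0], [2.0, 1.0]]
stacked = stack_images_z([mds, isomap, lle, le])
```

Each pixel position then carries four values, one per projection, which is what distinguishes image stacking from training four separate models (I) or building a single integrated image (III).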
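The comparison in Fig. 3 can be sketched on toy data as follows. This is a minimal example, assuming each DR technique yields a flat list of pairwise 2D distances in the same pair order. The helper names are hypothetical, and the Kendall's τ shown is the simple tau-a variant without tie correction, which may differ from the variant used in the study.

```python
import math
from itertools import combinations
from typing import List, Sequence


def arithmetic_mean(dists: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise arithmetic mean across several distance lists."""
    return [sum(vals) / len(vals) for vals in zip(*dists)]


def geometric_mean(dists: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise geometric mean across several distance lists."""
    return [math.prod(vals) ** (1.0 / len(vals)) for vals in zip(*dists)]


def kendall_tau_a(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length >= 2")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1  # ties (sign == 0) count toward neither
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Made-up pairwise distances from two hypothetical projections.
d_mds = [1.0, 2.0, 3.0, 4.0]
d_isomap = [1.0, 3.0, 2.0, 4.0]

am = arithmetic_mean([d_mds, d_isomap])
gm = geometric_mean([d_mds, d_isomap])
tau = kendall_tau_a(d_mds, d_isomap)
```

A τ close to 1 indicates that two projections rank the between-feature distances almost identically; lower values indicate that the projections disagree on which features are close to each other.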
NCI60 results
| Model | NRMSE | NMAE | PCC | Bias |
|---|---|---|---|---|
| REFINED-CNN model stacking | | 0.653 | | 0.489 |
| iREFINED-CNN-AM | 0.715 | | 0.706 | 0.461 |
| iREFINED-CNN-GM | 0.722 | 0.635 | 0.705 | |
| REFINED-CNN image stacking | 0.775 | 0.679 | 0.655 | 0.509 |
| sREFINED with Isomap | 0.787 | 0.716 | 0.644 | 0.509 |
| sREFINED with LE | 0.788 | 0.720 | 0.644 | 0.504 |
| sREFINED with LLE | 0.795 | 0.759 | 0.625 | 0.511 |
| sREFINED with MDS | 0.778 | 0.709 | 0.650 | 0.488 |
| KBMTL | 0.856 | 0.768 | 0.547 | 0.733 |
| XGBoost | 0.842 | 0.806 | 0.513 | 0.781 |
| SVR | 0.870 | 0.806 | 0.525 | 0.755 |
| RF | 0.880 | 0.846 | 0.486 | 0.816 |
| EN | 0.976 | 0.942 | 0.287 | 0.968 |
Note: Comparison of the performance of the proposed approaches, single-projection-based REFINED (sREFINED) and state-of-the-art methods on the NCI60 dataset. The bold values indicate best performance.
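For readers who want to reproduce error measures of this kind, here is a minimal sketch of NRMSE, NMAE and PCC. The normalization is an assumption: errors are divided by those of a baseline that always predicts the mean of the observed responses, so a value near 1 means no improvement over that baseline. The exact definitions used for the table, including the Bias column, are not given in this record, so Bias is not sketched.

```python
import math
from typing import Sequence


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def nrmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """RMSE normalized by the RMSE of a mean-only predictor (assumed)."""
    y_bar = _mean(y_true)
    num = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    den = sum((t - y_bar) ** 2 for t in y_true)
    return math.sqrt(num / den)


def nmae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """MAE normalized by the MAE of a mean-only predictor (assumed)."""
    y_bar = _mean(y_true)
    num = sum(abs(t - p) for t, p in zip(y_true, y_pred))
    den = sum(abs(t - y_bar) for t in y_true)
    return num / den


def pcc(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Pearson correlation coefficient between observed and predicted."""
    mt, mp = _mean(y_true), _mean(y_pred)
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Made-up responses and predictions, for illustration only.
y_obs = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.5, 2.5, 3.5, 4.5]
scores = (nrmse(y_obs, y_hat), nmae(y_obs, y_hat), pcc(y_obs, y_hat))
```

Under this assumed normalization, lower NRMSE and NMAE and higher PCC indicate better predictions.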
NCI-ALMANAC results
| Model | NRMSE | NMAE | PCC | Bias |
|---|---|---|---|---|
| REFINED-CNN model stacking | | | | |
| iREFINED-CNN-AM | 0.479 | 0.431 | 0.893 | 0.275 |
| iREFINED-CNN-GM | 0.474 | 0.427 | 0.892 | 0.248 |
| REFINED-CNN image stacking | 0.561 | 0.524 | 0.856 | 0.362 |
| sREFINED with Isomap | 0.508 | 0.470 | 0.887 | 0.227 |
| sREFINED with LE | 0.489 | 0.443 | 0.884 | 0.238 |
| sREFINED with LLE | 0.522 | 0.486 | 0.884 | 0.284 |
| sREFINED with MDS | 0.514 | 0.474 | 0.877 | 0.259 |
| Xie | 1.574 | 1.295 | 0.435 | 0.991 |
| DeepSynergy | 1.109 | 1.058 | 0.176 | 0.929 |
| XGBoost | 0.518 | 0.680 | 0.859 | 0.327 |
| RF | 0.525 | 0.679 | 0.851 | 0.290 |
| SVR | 0.561 | 0.675 | 0.830 | 0.255 |
| EN | 0.618 | 0.758 | 0.789 | 0.428 |
Note: Comparison of the performance of the proposed approaches, single-projection-based REFINED (sREFINED) and state-of-the-art methods on the NCI-ALMANAC dataset. The bold values indicate best performance.
Execution time comparison
| Steps | iREFINED-CNN | REFINED-CNN model stacking |
|---|---|---|
| MDS | 7 s | 7 s |
| Isomap | 21 s | 21 s |
| LE | 23 s | 23 s |
| LLE | 28 s | 28 s |
| NMDS + DA | 47 s | – |
| Hill climbing | 8 min and 23 s | 33 min and 32 s |
| CNN | 2 h and 17 min and 36 s | 9 h and 10 min and 24 s |
| LR | – | 1 s |
| Total | 2 h and 28 min and 25 s | 9 h and 45 min and 19 s |
Note: Comparison of the execution time of each step of the integrated REFINED-CNN model and REFINED-CNN model stacking, trained on HCC_2998 cell line data from the NCI60 dataset.