Minjae Joo1, Aron Park1, Kyungdoc Kim2, Won-Joon Son3, Hyo Sug Lee3, GyuTae Lim4, Jinhyuk Lee4,5, Dae Ho Lee1,6, Jungseok An7, Jung Ho Kim6, TaeJin Ahn8, Seungyoon Nam1,9,10,11.
Abstract
Intratumoral heterogeneity in cancers leads to discrepancies in drug responsiveness, owing to diverse genomic profiles. Prediction of drug responsiveness is therefore critical in precision medicine. So far, drug responsiveness prediction has not considered drugs' molecular "fingerprints" together with mutation statuses. Here, we constructed a 1-dimensional convolutional neural network model, DeepIC50, to predict three drug responsiveness classes from 27,756 features, including mutation statuses and various drug molecular fingerprints. DeepIC50 showed better cell viability IC50 prediction accuracy in pan-cancer cell lines across two independent cancer cell line datasets. Gastric cancer (GC) is not only one of the lethal cancer types in East Asia, but also a heterogeneous cancer type. The only targeted therapies currently approved for GC are trastuzumab and ramucirumab. Few GC patients respond to these drugs, and more drugs should be developed for GC. Given the importance of GC, we applied DeepIC50 to a real GC patient dataset. Compared with the other models, the drug responsiveness predicted by DeepIC50 in the patient dataset was comparable to the responsiveness observed in GC cell lines. DeepIC50 could potentially predict responsiveness to new compounds accurately, in diverse cancer cell lines, during the drug discovery process.
Keywords: artificial intelligence; drug discovery; drug responsiveness prediction
Year: 2019 PMID: 31842404 PMCID: PMC6941066 DOI: 10.3390/ijms20246276
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Overview of input data. DeepIC50 (equivalently, the 1D convolutional neural network (CNN)) and four baseline models predicted the logarithms of half-maximal drug concentrations, ln(IC50), using genomic profiles of cancer cell lines and drugs’ molecular fingerprints as inputs. Drug molecular descriptors, including molecular weights, polarity, and molecular fingerprints, were calculated by PaDEL [29]. Genomic profiles were represented by mutation statuses (presence or absence) at genomic positions of protein-coding genes. The ln(IC50) values of drugs used to treat cancer cell lines were grouped into three classes (see the sensitivity column): high responsiveness (class 0), intermediate responsiveness (class 1), and low responsiveness (class 2).
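The input construction described in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two ln(IC50) cut-offs, the function names, and the toy feature values are hypothetical and are not taken from the paper, which does not state its class boundaries in this section.

```python
from typing import List, Sequence

# Hypothetical cut-offs on ln(IC50); the paper's actual class
# boundaries are not given in this section.
HIGH_CUTOFF = -1.0   # below this: class 0 (high responsiveness)
LOW_CUTOFF = 1.0     # at or above this: class 2 (low responsiveness)


def responsiveness_class(ln_ic50: float,
                         high_cutoff: float = HIGH_CUTOFF,
                         low_cutoff: float = LOW_CUTOFF) -> int:
    """Map an ln(IC50) value to class 0, 1, or 2.

    A lower IC50 means less drug is needed to halve viability,
    so lower values correspond to higher responsiveness.
    """
    if ln_ic50 < high_cutoff:
        return 0  # high responsiveness
    if ln_ic50 < low_cutoff:
        return 1  # intermediate responsiveness
    return 2      # low responsiveness


def build_input(mutation_status: Sequence[int],
                drug_descriptors: Sequence[float]) -> List[float]:
    """Concatenate a binary mutation profile with drug descriptors
    into one flat feature vector (assumed layout)."""
    if any(m not in (0, 1) for m in mutation_status):
        raise ValueError("mutation statuses must be 0 (absent) or 1 (present)")
    return [float(m) for m in mutation_status] + [float(d) for d in drug_descriptors]


# Toy usage: 3 mutation positions and 2 drug descriptor values.
features = build_input([1, 0, 1], [0.5, 1])
label = responsiveness_class(-2.0)
```

In the paper the concatenated vector has 27,756 features; the toy vector here has five.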
Figure 2. Architecture of DeepIC50 and the 2D CNN. The figure shows our network structure (see Section 4.2 of the Materials and Methods for details).
Figure 3. Performance of the five methods (DeepIC50, 2D CNN, support vector machine (SVM), ridge classifier, and XGBoost) on the Genomics of Drug Sensitivity in Cancer (GDSC) test set. Receiver operating characteristic (ROC) curves are shown with micro-average (A) and macro-average (B) areas under the curve (AUCs).
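The difference between micro- and macro-average AUCs for a three-class problem can be illustrated with a small standard-library sketch. It assumes one common convention, which may differ from the paper's exact procedure: each class is scored one-vs-rest, the macro average is the unweighted mean of the per-class AUCs, and the micro average pools all (sample, class) pairs into a single binary problem. The function names and toy scores are hypothetical.

```python
from typing import List, Sequence, Tuple


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random
    negative (ties count one half). Quadratic time, fine for a sketch."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def micro_macro_auc(y_true: Sequence[int],
                    y_score: Sequence[Sequence[float]],
                    n_classes: int = 3) -> Tuple[float, float]:
    """Return (micro_auc, macro_auc) from one-vs-rest class scores."""
    per_class: List[float] = []
    flat_labels: List[int] = []
    flat_scores: List[float] = []
    for c in range(n_classes):
        labels_c = [1 if y == c else 0 for y in y_true]
        scores_c = [row[c] for row in y_score]
        per_class.append(binary_auc(labels_c, scores_c))
        # Pool every (sample, class) decision for the micro average.
        flat_labels.extend(labels_c)
        flat_scores.extend(scores_c)
    macro = sum(per_class) / n_classes
    micro = binary_auc(flat_labels, flat_scores)
    return micro, macro


# Toy usage: four samples with per-class scores for classes 0, 1, 2.
y_true = [0, 1, 2, 0]
y_score = [[0.8, 0.1, 0.1],
           [0.1, 0.7, 0.2],
           [0.2, 0.2, 0.6],
           [0.9, 0.05, 0.05]]
micro, macro = micro_macro_auc(y_true, y_score)
```

Because the micro average pools all decisions, frequent classes dominate it, whereas the macro average weights each class equally regardless of how many samples it contains.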
Figure 4. Performance of DeepIC50, 2D CNN, SVM, ridge classifier, and XGBoost on the Cancer Cell Line Encyclopedia (CCLE) dataset. ROC curves are shown with micro-average (A) and macro-average (B) AUCs for the five models.
Figure 5. Application of DeepIC50 and SVM to The Cancer Genome Atlas (TCGA) gastric cancer (GC) patient dataset (n = 441), to inspect whether the drug effectiveness observed in GC cell lines aligned with the responsiveness predicted for TCGA GC patients. (A) Distribution of the observed responsiveness for the three selected potent drugs in GDSC GC cell lines. We selected three effective drugs (see the Methods section) from the GDSC GC cell line ln(IC50) experiments. The drugs showed “high responsiveness” in most of the cell lines and “intermediate responsiveness” in a minority of them. (B) Distribution of predicted drug responsiveness in TCGA GC patients. The patient drug responsiveness predicted by DeepIC50 was more similar to the responsiveness pattern in GC cell lines than that predicted by the SVM model. DeepIC50 predicted a minor population of the patients to be of “intermediate responsiveness”, whereas SVM assigned all patients to the “high responsiveness” class. Considering that a single drug cannot fit all patients because of GC heterogeneity, the DeepIC50 prediction was assumed to be reasonable.