Abstract
The computational prediction of drug responses based on the analysis of multiple types of genome-wide molecular data is vital for accomplishing the promise of precision medicine in oncology. This will benefit cancer patients by matching their tumor characteristics to the most effective therapy available. As larger and more diverse layers of patient-related data become available, further demands for new bioinformatics approaches and expertise will arise. This article reviews key strategies, resources and techniques for the prediction of drug sensitivity in cell lines and patient-derived samples. It discusses major advances and challenges associated with the different model development steps. This review highlights major trends in this area, and will assist researchers in the assessment of recent progress and in the selection of approaches to emerging applications in oncology.
Keywords: cancer; computational prediction models; drug sensitivity; precision medicine; translational bioinformatics
Year: 2017 PMID: 27444372 PMCID: PMC5862310 DOI: 10.1093/bib/bbw065
Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622
Figure 1. Key steps in the development of computational models for predicting drug response. Data obtained from cell lines, animals or humans are stored in different data repositories, including public databases. These resources also include drug response information. Data sets are retrieved for use as training data sets, and may contain one or more types of ‘omics’ data, e.g. transcriptomics and DNA sequence. Such data are used as inputs to statistical or machine learning techniques. The prediction problem may be defined as either a classification or a regression problem, and a variety of techniques may be applied. The predictive performance of the models is assessed with cross-validation sampling techniques. The most promising models are selected and evaluated using testing data sets, which were not used during the training phase. The model and its predictions then undergo human expert interpretation and are reported to stakeholders. Further independent validations using clinically relevant data are required to continue bridging the gap between the laboratory and the clinic.
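The training, cross-validation and testing steps described in this caption can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not a method taken from the review: the data are synthetic, the k-nearest-neighbour regressor stands in for any statistical or machine learning technique, and all names (`make_synthetic_data`, `knn_predict`, `cross_validated_mse`, `run_pipeline`) are hypothetical.

```python
import random
import statistics


def make_synthetic_data(n_samples=120, n_features=5, seed=0):
    """Hypothetical stand-in for omics features (X) and a drug response (y)."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
    # Assumption: the response depends on the first two features plus noise.
    y = [2.0 * row[0] - 1.0 * row[1] + rng.gauss(0, 0.3) for row in X]
    return X, y


def knn_predict(train_X, train_y, query, k):
    """Predict the response as the mean of the k nearest training samples."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, query)), target)
        for row, target in zip(train_X, train_y)
    )
    return statistics.fmean(target for _, target in dists[:k])


def mse(train_X, train_y, eval_X, eval_y, k):
    """Mean squared error of k-NN predictions on an evaluation set."""
    preds = [knn_predict(train_X, train_y, q, k) for q in eval_X]
    return statistics.fmean((p - t) ** 2 for p, t in zip(preds, eval_y))


def cross_validated_mse(X, y, k, n_folds=5):
    """Average validation error over n_folds disjoint folds of the training data."""
    fold_errors = []
    for fold in range(n_folds):
        val_idx = set(range(fold, len(X), n_folds))
        tr_X = [X[i] for i in range(len(X)) if i not in val_idx]
        tr_y = [y[i] for i in range(len(X)) if i not in val_idx]
        va_X = [X[i] for i in sorted(val_idx)]
        va_y = [y[i] for i in sorted(val_idx)]
        fold_errors.append(mse(tr_X, tr_y, va_X, va_y, k))
    return statistics.fmean(fold_errors)


def run_pipeline(seed=0):
    X, y = make_synthetic_data(seed=seed)
    # Hold out the last 25% as a testing set that is never used for selection.
    # The synthetic samples are i.i.d.; real data would need shuffling first.
    n_test = len(X) // 4
    X_train, y_train = X[:-n_test], y[:-n_test]
    X_test, y_test = X[-n_test:], y[-n_test:]
    # Model selection: choose the neighbourhood size by cross-validation.
    candidates = [1, 3, 5, 9]
    cv_scores = {k: cross_validated_mse(X_train, y_train, k) for k in candidates}
    best_k = min(cv_scores, key=cv_scores.get)
    # Final evaluation of the selected model on the held-out testing set.
    test_error = mse(X_train, y_train, X_test, y_test, best_k)
    return best_k, cv_scores, test_error


if __name__ == "__main__":
    best_k, cv_scores, test_error = run_pipeline()
    print("selected k:", best_k)
    print("held-out test MSE:", round(test_error, 3))
```

The point of the design is the separation of roles: cross-validation on the training data drives model selection, while the testing set is touched only once, for the final estimate. Independent validation on clinically relevant data would be a further step outside this sketch.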
Summary of key public resources for enabling the development of computational models for predicting drug response
| Attribute | CCLE | GDSC | NCI-60 |
|---|---|---|---|
| # cell lines | >1000 | >1000 | 60 |
| # compounds | 24 | 138 | >15 K |
| # drug tests | >11 K | >75 K | >100 K |
| Main omics data sets | Mut, Gcn, Gexp | Mut, Gcn, Gexp | Mut, Gcn, Gexp, Pexp |
| # cancers | 36 | >15 | 9 |
Note. CCLE = Cancer Cell Line Encyclopedia; GDSC = Genomics of Drug Sensitivity in Cancer; NCI-60 = the US National Cancer Institute 60 human tumor cell line drug screen database; # = number of; Mut = mutations; Gcn = gene copy numbers; Gexp = gene expression; Pexp = protein expression.
Figure 2. A graphical synthesis of the diversity of computational models available for the prediction of drug responses. (A) List of data types most commonly used. (B) Categorization of models on the basis of the prediction problems addressed. (C) General hierarchy of statistical and machine learning techniques most commonly investigated. (D) Fundamental data sampling strategies for assessing model prediction capability.
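To make the distinction in panel (B) concrete, the same continuous drug response can serve either as a regression target or, after thresholding, as a classification label. The sketch below is illustrative only: the response values are invented, the function name `to_classification_labels` is hypothetical, and using the median as a cut-off (with lower values meaning higher sensitivity, as for an IC50-like measure) is an assumption, not a recommendation from the review.

```python
import statistics


def to_classification_labels(responses, threshold=None):
    """Turn continuous responses into 'sensitive'/'resistant' labels.

    Assumption: lower values mean higher sensitivity (IC50-like measure).
    If no threshold is given, the median of the responses is used.
    """
    if threshold is None:
        threshold = statistics.median(responses)
    return ["sensitive" if r <= threshold else "resistant" for r in responses]


# Invented response values for four hypothetical cell lines.
responses = [0.1, 0.5, 2.0, 8.0]

# Regression framing: the model would predict the values themselves.
regression_targets = list(responses)

# Classification framing: the model would predict a discrete label.
class_labels = to_classification_labels(responses)
```

The choice of threshold changes the class balance and therefore the difficulty of the classification problem, which is one reason the two framings are usually evaluated with different performance measures.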