Iiris Sundin1, Tomi Peltola1, Luana Micallef1, Homayun Afrabandpey1, Marta Soare1, Muntasir Mamun Majumder2, Pedram Daee1, Chen He3, Baris Serim3, Aki Havulinna2,4, Caroline Heckman2, Giulio Jacucci3, Pekka Marttinen1, Samuel Kaski1.
Abstract
Motivation: Precision medicine requires the ability to predict the efficacies of different treatments for a given individual using high-dimensional genomic measurements. However, identifying predictive features remains a challenge when the sample size is small. Incorporating expert knowledge offers a promising approach to improve predictions, but collecting such knowledge is laborious if the number of candidate features is very large.
Year: 2018 PMID: 29949984 PMCID: PMC6022689 DOI: 10.1093/bioinformatics/bty257
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Overview. Predictions in small-sample-size problems are improved by asking experts in an elicitation loop. The system presents questions to the expert sequentially to maximize performance with a minimal number of questions, i.e. on a budget. The expert answers the questions by indicating whether a feature is relevant in predicting quantitative traits, such as a cancer cell’s sensitivity to a drug. The expert can also indicate in which direction the effect is likely to be.
Fig. 2. Plate notation of the quantitative trait prediction model (right) and feedback observations (left) as introduced in Section 2.1. The feedbacks f rel and f dir are sequentially queried from the expert based on an expert knowledge elicitation method.
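For orientation, a generic spike-and-slab sparse linear model with a noisy relevance-feedback observation can be written as below. This is a textbook-style sketch, not the specification from Section 2.1, which is not reproduced in this record. The symbols $\gamma_j$, $\rho$, $\psi$, $\sigma$ and $\pi$, the superscript notation for the feedback, and the particular form of the feedback likelihood are all assumptions made for illustration.

```latex
\begin{align*}
\gamma_j &\sim \mathrm{Bernoulli}(\rho)
  && \text{(is feature $j$ relevant?)} \\
w_j \mid \gamma_j &\sim \gamma_j\,\mathcal{N}(0, \psi^2) + (1-\gamma_j)\,\delta_0
  && \text{(slab if relevant, spike at zero otherwise)} \\
y_i \mid \mathbf{x}_i, \mathbf{w} &\sim \mathcal{N}(\mathbf{w}^\top \mathbf{x}_i, \sigma^2)
  && \text{(quantitative trait)} \\
f^{\mathrm{rel}}_j \mid \gamma_j &\sim \gamma_j\,\mathrm{Bernoulli}(\pi) + (1-\gamma_j)\,\mathrm{Bernoulli}(1-\pi)
  && \text{(expert is correct with probability $\pi$)}
\end{align*}
```

Under these assumptions, an answer of "relevant" raises the posterior inclusion probability of feature $j$, and more strongly so the closer $\pi$ is to 1.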
Performance in metabolite concentration prediction
| | Data mean | Elastic net | SnS no fb | SnS all fb | SnS rel. fb |
|---|---|---|---|---|---|
| C-index | 0.500 | 0.519 | 0.540 | 0.556 | |
| MSE | 1.017 | 1.010 | 0.999 | 0.988 | |
| PVE | 0.000 | 0.007 | 0.018 | 0.028 | |
Note: Values are averages over the four target metabolites. Best result on each row has been boldfaced. SnS = spike and slab sparse linear model; fb = feedback; Rel. fb = Only relevance feedback; MSE = mean squared error; PVE = proportion of variance explained.
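The three metrics named in the note can be sketched in a few lines of Python. The record does not state the exact definitions used, so the following are assumptions: PVE is taken as one minus the ratio of the model's MSE to the MSE of a constant baseline prediction, and the C-index is the fraction of comparable pairs that are ranked in the correct order, with tied predictions counted as half. The function names are hypothetical.

```python
from itertools import combinations


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def pve(y_true, y_pred, baseline):
    """Proportion of variance explained (assumed definition):
    1 - MSE(model) / MSE(constant baseline prediction)."""
    baseline_mse = mse(y_true, [baseline] * len(y_true))
    return 1.0 - mse(y_true, y_pred) / baseline_mse


def c_index(y_true, y_pred):
    """Concordance index over all pairs with distinct observed values.
    Tied predictions count as half-concordant (assumed convention)."""
    concordant = 0.0
    comparable = 0
    for (t1, p1), (t2, p2) in combinations(zip(y_true, y_pred), 2):
        if t1 == t2:
            continue  # pairs tied in the observed value are not comparable
        comparable += 1
        if p1 == p2:
            concordant += 0.5
        elif (t1 < t2) == (p1 < p2):
            concordant += 1.0
    return concordant / comparable


y_true = [1.0, 2.0, 3.0, 4.0]
y_model = [1.1, 1.9, 3.2, 3.8]
y_const = [2.5, 2.5, 2.5, 2.5]  # constant prediction, e.g. a data mean
```

With these definitions a constant prediction has a C-index of 0.5 and a PVE of 0 relative to itself, which is in line with the "Data mean" column of the table.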
Fig. 3. Sequential experimental design performance in metabolite concentration prediction comparing random querying, information gain-based sequential experimental design and its targeted version. First 1000 iterations of feedback are shown and the result with all feedbacks is included for reference. For the targeted sequential experimental design, each individual in the test set was the target separately and the predictions in the resulting feedback sequence were used for that individual. The curve is a mean over all these sequences.
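The contrast between random querying and information gain-based querying can be illustrated with a deliberately simplified toy. It is not the method behind the figure: here the gain is measured on a single binary relevance indicator per feature rather than on the predictive distribution, the expert is assumed to answer correctly with a fixed probability `accuracy`, and all function names are hypothetical.

```python
import math
import random


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) variable."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def expected_information_gain(p_relevant, accuracy):
    """Mutual information between a feature's relevance and the expert's answer."""
    p_yes = accuracy * p_relevant + (1 - accuracy) * (1 - p_relevant)
    return binary_entropy(p_yes) - binary_entropy(accuracy)


def update(p_relevant, said_relevant, accuracy):
    """Bayes update of the relevance probability after one answer."""
    like_rel = accuracy if said_relevant else 1 - accuracy
    like_irr = 1 - accuracy if said_relevant else accuracy
    numerator = like_rel * p_relevant
    return numerator / (numerator + like_irr * (1 - p_relevant))


def elicit(priors, ask_expert, budget, accuracy=0.9, active=True, rng=None):
    """Query up to `budget` features once each, actively or at random."""
    rng = rng or random.Random(0)
    posterior = list(priors)
    unasked = set(range(len(priors)))
    for _ in range(min(budget, len(priors))):
        candidates = sorted(unasked)
        if active:
            # Ask about the feature whose answer is expected to be most informative.
            j = max(candidates,
                    key=lambda k: expected_information_gain(posterior[k], accuracy))
        else:
            j = rng.choice(candidates)
        unasked.remove(j)
        posterior[j] = update(posterior[j], ask_expert(j), accuracy)
    return posterior


priors = [0.5, 0.05, 0.3]
posterior = elicit(priors, ask_expert=lambda j: j == 0, budget=1)
```

In this toy the active strategy spends its single question on the feature whose relevance is most uncertain (prior 0.5), whereas random querying may spend it on a feature whose status is already fairly clear.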
Feedback type and count, given to the 1944 (drug, feature) pairs by the experts
| Answer | SR | DC |
|---|---|---|
| Relevant, positive correlation | 192 | 47 |
| Relevant, negative correlation | 14 | 34 |
| Relevant, unknown correlation direction | 26 | 358 |
| Not relevant | 13 | 0 |
| I don’t know | 1699 | 1505 |
| Total | 1944 | 1944 |
Note: SR = Senior researcher, DC = Doctoral candidate.
Performance of drug sensitivity prediction without expert feedback
| | Data mean | Elastic net | Spike-and-slab |
|---|---|---|---|
| C-index | 0.500 | 0.505 | |
| MSE | 1.079 | 1.153 | |
Note: Values are averaged over the 12 drugs. Best result on each row has been boldfaced.
Predictive performance of spike-and-slab regression with and without expert feedback
| | No feedback | Doctoral candidate | Senior researcher |
|---|---|---|---|
| C-index | 0.577 | 0.582 | |
| MSE | 1.050 | 1.040 | |
Note: Values are averaged over the 12 drugs.
Performance of drug sensitivity prediction with only relevance feedback and with relevance and directional feedback
| | Doctoral candidate | | Senior researcher | |
|---|---|---|---|---|
| | Relevance fb | All fb | Relevance fb | All fb |
| C-index | 0.582 | 0.578 | ||
| MSE | 1.048 | 1.048 | ||
Note: Values are averaged over the 12 drugs.
Fig. 4. Performance improves faster with the active elicitation methods than with randomly selected feedback queries. The curves show MSEs as a function of the number of iterations for the three query methods, with feedback of the doctoral candidate (left) and senior researcher (right). In each iteration, a (drug, feature) pair is queried from the expert.