Sebastian Briesemeister, Jörg Rahnenführer, Oliver Kohlbacher.
Abstract
Quantitative predictions in computational life sciences are often based on regression models. The advent of machine learning has led to highly accurate regression models that have gained widespread acceptance. While statistical methods are available to estimate the global performance of regression models on a test or training dataset, it is often unclear how well this performance transfers to other datasets or how reliable an individual prediction is, a fact that often reduces a user's trust in a computational method. In analogy to the concept of an experimental error, we sketch how estimators for individual prediction errors can be used to provide confidence intervals for individual predictions. Two novel statistical methods, named CONFINE and CONFIVE, estimate the reliability of an individual prediction based on the local properties of nearby training data. The methods can be applied equally to linear and non-linear regression methods with very little computational overhead. We compare our confidence estimators with other existing confidence and applicability domain estimators on two biologically relevant problems: MHC-peptide binding prediction and quantitative structure-activity relationship (QSAR) modeling. Our results suggest that the proposed confidence estimators perform comparably to or better than previously proposed estimation methods. Given a sufficient amount of training data, the estimators yield error estimates of high quality. In addition, we observed that the quality of the estimated confidence intervals is predictable. We discuss how confidence estimation is influenced by noise, the number of features, and the dataset size. Estimating the confidence in individual predictions in terms of error intervals represents an important step from plain, non-informative predictions towards transparent and interpretable predictions that will help to improve the acceptance of computational methods in the biological community.
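The abstract states only that the estimators use "local properties of nearby training data" and does not define them. As a rough illustration of that idea, and not the authors' definitions of CONFINE or CONFIVE, the sketch below computes two local scores for a query point: the mean squared training error of its k nearest training neighbors, and the variance of those neighbors' observed labels. All function names, the Euclidean distance, the value of k, and the normal-error assumption used to turn a score into an interval are assumptions made for this example.

```python
import math

def k_nearest(train_x, query, k):
    """Indices of the k training points closest to `query` (Euclidean distance)."""
    dists = [(math.dist(x, query), i) for i, x in enumerate(train_x)]
    return [i for _, i in sorted(dists)[:k]]

def neighbor_error_score(train_x, train_y, train_pred, query, k=3):
    """Mean squared training error among the k nearest neighbors of `query`.

    A small score means the model was accurate in this region of feature space.
    """
    idx = k_nearest(train_x, query, k)
    return sum((train_y[i] - train_pred[i]) ** 2 for i in idx) / len(idx)

def neighbor_label_variance(train_x, train_y, query, k=3):
    """Variance of the observed labels among the k nearest neighbors of `query`."""
    idx = k_nearest(train_x, query, k)
    mean = sum(train_y[i] for i in idx) / len(idx)
    return sum((train_y[i] - mean) ** 2 for i in idx) / len(idx)

def interval(prediction, score, z=1.96):
    """Symmetric interval around `prediction`.

    Assumption: `score` is treated as a local error variance and errors are
    roughly normal, so z=1.96 corresponds to about 95% coverage.
    """
    half_width = z * math.sqrt(score)
    return prediction - half_width, prediction + half_width

# Toy one-dimensional data (hypothetical): the model fits the left cluster
# well and the right cluster poorly.
train_x = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
train_y = [0.0, 1.0, 2.0, 10.0, 14.0, 9.0]
train_pred = [0.1, 0.9, 2.1, 10.5, 11.5, 12.5]

well_fit = neighbor_error_score(train_x, train_y, train_pred, (1.0,))
poorly_fit = neighbor_error_score(train_x, train_y, train_pred, (11.0,))
low, high = interval(11.0, poorly_fit)
```

Here a query near the poorly fitted cluster receives a larger score and therefore a wider interval than a query near the well-fitted cluster. Because the training predictions are computed once, each estimate costs only a neighbor search, which is in line with the low overhead the abstract mentions.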
Year: 2012 PMID: 23166592 PMCID: PMC3499506 DOI: 10.1371/journal.pone.0048723
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Example of estimating confidence intervals.
In this example, we estimated the confidence intervals of a set of instances. The left-hand plot shows the confidence interval widths and the corresponding absolute errors. Although the corresponding CEC is not very large, an increased number of small confidence intervals is visible for predictions with a low error. In the right-hand plot, the estimated confidence interval borders are displayed. In addition, every prediction, defined by its prediction error and its normalized confidence score, is depicted by a red circle. On average, the absolute error is smaller for predictions with a high normalized confidence score and a small confidence interval.
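The caption relates confidence interval widths to absolute errors and summarizes that relationship with the CEC, which this excerpt does not define. One plausible way to quantify such a relationship is a Pearson correlation between the two quantities; the sketch below shows that computation under the explicit assumption that it only approximates the idea, and it may differ from the paper's CEC. The function name and the toy numbers are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)

# Hypothetical interval widths and absolute prediction errors for five instances.
widths = [0.5, 1.0, 1.5, 2.0, 2.5]
abs_errors = [0.2, 0.1, 0.9, 0.6, 1.2]

# A positive value means wider intervals tend to accompany larger errors.
r = pearson(widths, abs_errors)
```

A value near zero would indicate that interval width carries little information about the actual error, while a clearly positive value indicates that narrow intervals tend to coincide with accurate predictions.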
Performance of confidence estimators on artificial data with different properties.
| Confidence estimator | | | | | σ<1.0 | σ≥1.0 | best |
| CONFINE | 0.05 | 0.22 | 0.19 | 0.05 | 0.21 | 0.15 | 0.30 |
| CONFIVE | −0.02 | 0.05 | 0.03 | −0.01 | 0.04 | 0.02 | 0.07 |
| AvgDist | 0.02 | 0.12 | 0.10 | 0.03 | 0.11 | 0.08 | 0.16 |
| Bagging | 0.11 | 0.20 | 0.18 | 0.11 | 0.19 | 0.16 | 0.25 |
| Diff5NN | 0.01 | 0.17 | 0.14 | 0.02 | 0.14 | 0.11 | 0.29 |
| LocalCV | 0.01 | 0.05 | 0.04 | 0.02 | 0.04 | 0.03 | 0.05 |
| LocalVar | 0.00 | 0.12 | 0.10 | −0.00 | 0.09 | 0.08 | 0.16 |
| NoNN | 0.05 | 0.12 | 0.12 | 0.03 | 0.11 | 0.09 | 0.16 |
For every confidence estimator, we calculated the average CEC over datasets with different numbers of instances, different numbers of selected features, and different noise levels σ. The last column shows the average CEC for the best parameter combination.
Performance of confidence estimators on biological datasets.
| Regression model | Confidence estimator | MHC CEC | MHC CAPI | MHC runtime [ms] | QSAR CEC | QSAR CAPI | QSAR runtime [ms] |
| LR | CONFINE | 0.27 | 0.39 | 2 | 0.08 | 0.09 | 1 |
| LR | CONFIVE | 0.24 | 0.35 | 2 | 0.09 | 0.13 | 1 |
| LR | AvgDist | 0.11 | 0.18 | 2 | −0.02 | −0.10 | 1 |
| LR | Bagging | 0.13 | 0.18 | 1 | 0.20 | 0.35 | 1 |
| LR | DiffNN | 0.24 | 0.32 | 2 | −0.00 | −0.14 | 1 |
| LR | LocalCV | 0.16 | 0.27 | 214 | 0.08 | 0.10 | 353 |
| LR | LocalVar | 0.10 | 0.17 | 482 | −0.08 | −0.22 | 430 |
| LR | NoNN | 0.10 | 0.17 | 2 | −0.03 | −0.09 | 1 |
| SVR | CONFINE | 0.23 | 0.41 | 9 | 0.23 | 0.32 | 9 |
| SVR | CONFIVE | 0.21 | 0.34 | 10 | 0.16 | 0.21 | 10 |
| SVR | AvgDist | 0.12 | 0.23 | 9 | 0.02 | 0.03 | 12 |
| SVR | Bagging | 0.21 | 0.50 | 374 | 0.15 | 0.17 | 3064 |
| SVR | DiffNN | 0.24 | 0.35 | 9 | 0.10 | 0.20 | 10 |
| SVR | NoNN | 0.22 | 0.18 | 9 | 0.12 | 0.14 | 44 |
For every confidence estimator, the average CEC (avgCEC), the confidence-associated prediction improvement (CAPI), and the time for an individual estimation in milliseconds are shown for the MHC datasets and the QSAR datasets. In the upper part of the table, the estimators were applied together with linear regression (LR), whereas the numbers in the lower part were obtained using support vector regression with an RBF kernel (SVR).