Katja Hansen, David Baehrens, Timon Schroeter, Matthias Rupp, Klaus-Robert Müller.
Abstract
Statistical models are frequently used to estimate molecular properties, e.g., to establish quantitative structure-activity and structure-property relationships. For such models, interpretability, knowledge of the domain of applicability, and an estimate of confidence in the predictions are essential. We develop and validate a method for the interpretation of kernel-based prediction models. As a consequence of interpretability, the method helps to assess the domain of applicability of a model, to judge the reliability of a prediction, and to determine relevant molecular features. Increased interpretability also facilitates the acceptance of such models. Our method is based on visualization: for each prediction, the training samples that contribute most are computed and visualized. We quantitatively show the effectiveness of our approach in a questionnaire study with 71 participants, which yielded significant improvements in the participants' ability to distinguish between correct and incorrect predictions of a Gaussian process model for Ames mutagenicity.
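The idea of ranking training samples by their contribution to a single prediction can be illustrated with a minimal sketch. It assumes a generic kernel model whose prediction is a weighted sum of kernel evaluations, f(x) = sum_i alpha_i k(x, x_i), with a Gaussian (RBF) kernel. The function names, the toy descriptors, and the weights `alpha` are hypothetical and are not taken from the paper, and the authors' actual definition of a sample's contribution may differ.

```python
import math

def rbf_kernel(a, b, length_scale=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))

def contributions(x_query, X_train, alpha, length_scale=1.0):
    """Per-sample terms alpha_i * k(x_query, x_i); their sum is the prediction."""
    return [a * rbf_kernel(x_query, x, length_scale)
            for a, x in zip(alpha, X_train)]

def top_contributors(x_query, X_train, alpha, k=3, length_scale=1.0):
    """Return the prediction and the k (index, term) pairs with largest |term|."""
    terms = contributions(x_query, X_train, alpha, length_scale)
    order = sorted(range(len(terms)), key=lambda i: abs(terms[i]), reverse=True)[:k]
    return sum(terms), [(i, terms[i]) for i in order]

# Toy data (hypothetical): four 2-D descriptors and already-fitted weights.
X_train = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [3.0, 3.0]]
alpha = [0.5, -1.0, 0.8, 2.0]

prediction, top = top_contributors([0.9, 0.1], X_train, alpha, k=2)
for index, term in top:
    print(f"training sample {index}: contribution {term:+.3f}")
print(f"prediction: {prediction:+.3f}")
```

In this toy case the nearby sample 1 ranks first, while sample 3 contributes almost nothing despite having the largest weight, because it lies far from the query. Inspecting such a ranking is one way a user might judge whether a prediction rests on training compounds that resemble the query.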
Keywords: Confidence estimation; Domain of applicability; Kernel-based learning; QSAR; QSPR
Year: 2011; PMID: 27467414; DOI: 10.1002/minf.201100059
Source DB: PubMed; Journal: Mol Inform; ISSN: 1868-1743; Impact factor: 3.353