Yi-Han Sheu
Abstract
Psychiatric research is often confronted with complex abstractions and dynamics that are not readily accessible or well-defined to our perception and measurements, making data-driven methods an appealing approach. Deep neural networks (DNNs) are capable of automatically learning abstractions in the data that can be entirely novel and have demonstrated superior performance over classical machine learning models across a range of tasks and, therefore, serve as a promising tool for making new discoveries in psychiatry. A key concern for the wider application of DNNs is their reputation as a "black box" approach; that is, they are said to lack transparency or interpretability of how input data are transformed into model outputs. In fact, several existing and emerging tools are providing improvements in interpretability. However, most reviews of interpretability for DNNs focus on theoretical and/or engineering perspectives. This article reviews approaches to DNN interpretability issues that may be relevant to their application in psychiatric research and practice. It describes a framework for understanding these methods, reviews the conceptual basis of specific methods and their potential limitations, and discusses prospects for their implementation and future directions.
Keywords: deep learning; deep neural networks; explainable AI; machine learning; model interpretability; psychiatry
Year: 2020 PMID: 33192663 PMCID: PMC7658441 DOI: 10.3389/fpsyt.2020.551299
Source DB: PubMed Journal: Front Psychiatry ISSN: 1664-0640 Impact factor: 4.157
Figure 1. Conceptual flow chart connecting the ideas and articles reviewed and discussed in this paper. Each block between a set of arrows corresponds to a particular section of the paper. Numbers in parentheses indicate the relevant referenced articles.
Figure 2. Schematic diagrams of three common DNN architectures. (A) A three-layer, feed-forward NN (with one hidden layer). Each circle represents one artificial neuron. All neurons in one layer are connected to all neurons in the adjacent layers but not to neurons in the same layer. (B) A simple RNN. Here, each circle represents a layer of neurons. Information from the hidden layer of the previous step is allowed to enter the following step. X: input layer; H: hidden layer; Y: output layer. (C) A convolutional NN with two convolutional layers, two pooling layers, and two feed-forward layers. In the first step, the convolutional filter transforms the input image into six "feature maps" (i.e., images transformed by the filter). The feature maps are then summarized by pooling, which reduces the dimension of each feature map, usually by taking the maximum values of smaller regions (e.g., 3 × 3) that cover the whole feature map and combining them with spatial relations preserved to produce a new feature map. This procedure is repeated twice, and the network is then connected to a two-layer feed-forward network to derive the final output.
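As a concrete companion to Figure 2, the following is a minimal PyTorch sketch of the three architectures. All layer sizes, channel counts, and the 36 × 36 input are illustrative assumptions, since the figure does not specify dimensions; only the six feature maps, 3 × 3 pooling, and two feed-forward layers of panel (C) come from the caption.

```python
# Minimal sketches of the Figure 2 architectures (sizes are assumed).
import torch
import torch.nn as nn

# (A) Three-layer feed-forward NN with one hidden layer.
feed_forward = nn.Sequential(
    nn.Linear(8, 16),   # input layer -> hidden layer
    nn.ReLU(),
    nn.Linear(16, 2),   # hidden layer -> output layer
)

# (B) Simple RNN: the hidden state of the previous step enters the next step.
rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)

# (C) CNN with two convolutional layers (six feature maps after the first),
# two 3x3 max-pooling layers, and a two-layer feed-forward head.
cnn = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5), nn.ReLU(), nn.MaxPool2d(3),
    nn.Conv2d(6, 12, kernel_size=5), nn.ReLU(), nn.MaxPool2d(3),
    nn.Flatten(),
    nn.Linear(12 * 2 * 2, 32), nn.ReLU(), nn.Linear(32, 2),
)

x = torch.randn(1, 1, 36, 36)   # one toy 36x36 single-channel "image"
print(cnn(x).shape)             # torch.Size([1, 2])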
Summary of approaches to DNN interpretation. Illustrative code sketches for most of these methods follow the table.
| Method | Applicability | Scope | Mechanism | Limitations |
| --- | --- | --- | --- | --- |
| Permutation importance | Model agnostic | Global | Permutes values within each predictor and scores importance by the resulting performance drop | Results may be biased if the predictor of interest is correlated with other predictors |
| Partial Dependence Plot (PDP) | Model agnostic | Global | Plots values of the predictor of interest against the outcome with all other predictors averaged out | May be biased when predictors are correlated; difficult to visualize when the number of predictors of interest is large |
| Individual Conditional Expectation (ICE) | Model agnostic | Global | Similar to PDP, but plotted for individual examples | May be biased when predictors are correlated; difficult to visualize when the number of predictors of interest is large |
| Local Interpretable Model-agnostic Explanations (LIME) | Model agnostic | Local | Approximates the model locally with another interpretable model and data representation | The procedure for finding neighboring sample points may produce unrealistic data points; results may not be robust |
| Deep Learning Important FeaTures (DeepLIFT) | DNN specific | Local | Computes average gradients at the input value of interest versus a reference value; the calculation is facilitated by the compositional DNN structure | Results may be inaccurate in the presence of multiplicative interactions between predictors |
| Shapley Additive Explanations (SHAP) | Model agnostic or DNN specific | Local | Calculates Shapley values through various approaches for interpretation with linear additive models, such as LIME and DeepLIFT | Robustness issues |
| Perturbation-based methods | Model agnostic or DNN/CNN specific | Local | Perturbs input values of a specific example and observes the change in the model's prediction | Computationally expensive |
| Gradient-based methods | Mostly DNN/CNN specific | Local | Scores each feature at the input of interest based on gradient values with respect to the model's prediction | The meaning of the interpretation itself is unclear; some methods have shown insensitivity to data or weight permutations |
| Attention weight visualization | DNN models with attention mechanisms | Local | Visualizes attention weights as a heat map over the corresponding text | Intuitive, but attention weights may not be fully causal to model decisions |
| Attention saliency | DNN models with attention mechanisms | Local | Visualizes scores based on the absolute value of the derivative of the model output with respect to the unnormalized attention weights | Properties not yet investigated in depth |
| Word token analysis | DNN models with attention mechanisms; position-preserving models | Local | Analyzes spatial relationships of tokens transformed across attention layers using dimension reduction | Works only with model architectures with positional alignment between input and output sequences |
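To make the permutation importance row concrete, here is a minimal scikit-learn sketch: each predictor is shuffled in turn and importance is scored by the drop in test performance. The synthetic dataset and random-forest model are illustrative assumptions, not from the article.

```python
# Permutation importance: shuffle each predictor, score the performance drop.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

result = permutation_importance(model, X_test, y_test, n_repeats=20,
                                random_state=0)
for i in np.argsort(result.importances_mean)[::-1]:
    print(f"feature {i}: {result.importances_mean[i]:.3f} "
          f"+/- {result.importances_std[i]:.3f}")
```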
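The PDP and ICE rows can be sketched together with scikit-learn's display helper, which overlays the averaged PDP curve on the per-example ICE curves; again the data and model are illustrative assumptions.

```python
# PDP and ICE curves for one predictor of interest (feature 0 here).
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import PartialDependenceDisplay

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

# kind="both" draws the PDP (average) on top of the individual ICE curves.
PartialDependenceDisplay.from_estimator(model, X, features=[0], kind="both")
plt.show()
```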
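For the LIME row, a minimal sketch with the `lime` package (assumed installed) fits a local interpretable surrogate around one example of interest; the tabular setup mirrors the sketches above and is purely illustrative.

```python
# LIME: approximate the model locally around one example.
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

explainer = LimeTabularExplainer(X, mode="classification")
exp = explainer.explain_instance(X[0], model.predict_proba, num_features=5)
print(exp.as_list())   # (feature condition, local weight) pairs
```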
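The DeepLIFT row can be sketched with the Captum library's implementation (an assumption; the article does not prescribe a toolkit): attributions for one input are computed relative to an explicit reference value, here an all-zeros baseline.

```python
# DeepLIFT via Captum: attribute an output to inputs vs. a reference value.
import torch
import torch.nn as nn
from captum.attr import DeepLift

net = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
x = torch.randn(1, 8)          # input of interest (toy values)
baseline = torch.zeros(1, 8)   # reference value

# Per-feature attribution of the class-0 output relative to the baseline.
attributions = DeepLift(net).attribute(x, baselines=baseline, target=0)
print(attributions)
```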
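For the SHAP row, a minimal model-agnostic sketch with the `shap` package (assumed installed) estimates Shapley values against a small background sample; the model and data are the same illustrative assumptions as above.

```python
# Model-agnostic SHAP: Shapley values estimated via KernelExplainer.
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

explainer = shap.KernelExplainer(model.predict_proba, shap.sample(X, 50))
shap_values = explainer.shap_values(X[:1])   # one example of interest
print(shap_values)
```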
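The perturbation-based row reduces to a short loop in the tabular case: alter each input value of one example and record the change in the model's prediction. The zero perturbation and setup here are illustrative choices; the table's noted cost comes from repeating this over many features and perturbations.

```python
# Perturbation-based interpretation: perturb each feature, watch the output.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

x0 = X[0:1]                                   # one example of interest
base_prob = model.predict_proba(x0)[0, 1]
for j in range(x0.shape[1]):
    perturbed = x0.copy()
    perturbed[0, j] = 0.0                     # a simple zero perturbation
    delta = base_prob - model.predict_proba(perturbed)[0, 1]
    print(f"feature {j}: prediction change {delta:+.3f}")
```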
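A minimal sketch of the gradient-based row: score each feature by the absolute gradient of the winning class score with respect to the input. The small network and input are illustrative assumptions; this is plain input-gradient saliency, the simplest member of the family.

```python
# Gradient-based saliency: |d(output) / d(input)| at one example.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
x = torch.randn(1, 8, requires_grad=True)   # input of interest

logits = net(x)
logits[0, logits.argmax()].backward()       # gradient of the top class score
saliency = x.grad.abs().squeeze()           # per-feature saliency scores
print(saliency)
```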
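Finally, a minimal sketch of the attention weight visualization row: a single self-attention layer over a toy token sequence, with the weight matrix rendered as a heat map. The tokens, random embeddings, and layer sizes are all illustrative assumptions; in practice the weights would come from a trained model.

```python
# Attention weight visualization: heat map of self-attention weights.
import torch
import torch.nn as nn
import matplotlib.pyplot as plt

tokens = ["patient", "denies", "suicidal", "ideation"]   # toy example
attn = nn.MultiheadAttention(embed_dim=8, num_heads=1, batch_first=True)
x = torch.randn(1, len(tokens), 8)   # toy embeddings for the tokens

_, weights = attn(x, x, x)           # weights: (1, seq_len, seq_len)
plt.imshow(weights[0].detach().numpy(), cmap="viridis")
plt.xticks(range(len(tokens)), tokens, rotation=45)
plt.yticks(range(len(tokens)), tokens)
plt.colorbar(label="attention weight")
plt.show()
```

As the table notes, such heat maps are intuitive, but the displayed weights may not be fully causal to the model's decisions.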