Zhoufan Jiang, Yanming Wang, ChenWei Shi, Yueyang Wu, Rongjie Hu, Shishuo Chen, Sheng Hu, Xiaoxiao Wang, Bensheng Qiu.
Abstract
Decoding brain cognitive states from neuroimaging signals is an important topic in neuroscience. In recent years, deep neural networks (DNNs) have been applied to decoding multiple brain states and have achieved good performance. However, the open question of how to interpret the DNN black box remains unanswered. Capitalizing on advances in machine learning, we integrated attention modules into brain decoders to facilitate an in-depth interpretation of DNN channels. A four-dimensional (4D) convolution operation was also included to extract the temporo-spatial interactions within the fMRI signal. The experiments showed that the proposed model achieves very high accuracy (97.4%) and outperforms previous studies on the seven task benchmarks from the Human Connectome Project (HCP) dataset. Visualization analysis further illustrated the hierarchical emergence of task-specific masks with depth. Finally, the model was retrained to regress individual traits within the HCP and to classify viewed images from the BOLD5000 dataset, respectively. Transfer learning also achieved good performance. Further visualization analysis showed that, after transfer learning, low-level attention masks remained similar to those of the source domain, whereas high-level attention masks changed adaptively. In conclusion, the proposed 4D model with attention modules performed well and facilitated the interpretation of DNNs, which is helpful for subsequent research.
Keywords: attention module; brain decoding; deep learning; functional magnetic resonance imaging; neuroimaging
Year: 2022 PMID: 35212436 PMCID: PMC9057093 DOI: 10.1002/hbm.25813
Source DB: PubMed Journal: Hum Brain Mapp ISSN: 1065-9471 Impact factor: 5.399
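The abstract describes a 4D convolution that captures temporo-spatial interactions in the fMRI signal, i.e., a kernel that slides jointly over time and the three spatial axes. The paper's implementation is not reproduced here; as a rough illustration only, a naive "valid" 4D convolution over a hypothetical single-channel `(time, depth, height, width)` volume could be sketched in numpy as follows (no channels, batching, stride, or padding):

```python
import numpy as np

def conv4d(x, k):
    """Naive 'valid' 4D convolution of a (t, d, h, w) volume with a 4D kernel.

    Illustrative sketch only: single channel, stride 1, no padding.
    """
    T, D, H, W = x.shape
    t, d, h, w = k.shape
    out = np.zeros((T - t + 1, D - d + 1, H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for m in range(out.shape[2]):
                for n in range(out.shape[3]):
                    # correlate the kernel with the local temporo-spatial patch
                    out[i, j, m, n] = np.sum(x[i:i+t, j:j+d, m:m+h, n:n+w] * k)
    return out
```

Because the kernel spans the temporal axis as well, each output value mixes information from neighbouring frames, which is the property the abstract credits for the 4D model's advantage over frame-wise 3D convolution.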
Details of the selected HCP time series
| Task | Selected condition | Frames of the block |
|---|---|---|
| Emotion | Fear | 26 |
| Gambling | Loss | 39 |
| Language | Present story | 29 |
| Motor | Right hand | 17 |
| Relational | Relational | 23 |
| Social | Mental | 32 |
| Working memory (WM) | 2‐Back places | 39 |
FIGURE 1 The proposed neural network. (a) The model consists of a 4D convolution layer, four 3D attention modules, and a fully‐connected layer that outputs the task-class labels. (b) The attention module includes a main branch and an attention branch composed of down‐sample and up‐sample paths, connected by a shortcut skip connection.
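The attention branch in Figure 1b (down-sample path, up-sample path, and a shortcut skip) can be sketched minimally. This is an assumption-laden illustration, not the authors' implementation: a single 3D feature map, 2× average pooling as the down-sample, nearest-neighbour up-sampling, a sigmoid to squash the mask into (0, 1), and residual gating `out = feat * (1 + mask)` in the style of residual attention networks:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_module(feat):
    """Apply a residual attention mask to one 3D feature map (hypothetical sketch).

    Down-sample (2x average pooling), up-sample (nearest neighbour),
    squash to a (0, 1) mask, then re-weight the main branch through
    the shortcut: out = feat * (1 + mask).
    """
    d, h, w = feat.shape  # spatial dims must be even for the 2x pooling
    pooled = feat.reshape(d // 2, 2, h // 2, 2, w // 2, 2).mean(axis=(1, 3, 5))
    up = pooled.repeat(2, axis=0).repeat(2, axis=1).repeat(2, axis=2)
    mask = sigmoid(up)           # per-voxel weight in (0, 1)
    return feat * (1.0 + mask)   # shortcut preserves the original features
```

The `1 + mask` form means the branch can only enhance or leave features unchanged relative to the identity path, which keeps the mask interpretable as the "focused regions" visualized in Figures 3 to 5.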
Comparisons with previous methods on the HCP dataset
| Authors | Model | Accuracy ± SD |
|---|---|---|
| X. Wang et al. | 3DResNet | 93.7 ± 1.9% |
| Nguyen et al. | 3DResNet‐TF | 95.1 ± 0.6% |
| | 3DResNet‐LSTM++ | 97.0 ± 0.4% |
| | 3DResNet‐TF++ | 97.2 ± 0.6% |
| Ours | 3DResNet‐Att | 96.3 ± 1.1% |
| | 4DResNet | 96.1 ± 0.8% |
| | 4DResNet‐Att | **97.4%** |
The bolded value indicates the highest accuracy among the models.
FIGURE 2 Performance evaluation on the HCP dataset. (a) The average confusion matrix shows a clear block-diagonal structure. (b) Comparison of 3DCNN and 4DCNN with different numbers of input frames (frames = 7, 11, and 15); 4DCNN performed better at capturing dynamic changes over longer ranges. (c) Classification performance with and without the attention module (frames = 15). Decoders with attention and a relatively longer 4D kernel performed better.
FIGURE 3Visualization of attention masks on the HCP dataset. (a)–(d) Examples show the average focused regions on four attention stages (from low‐level to high‐level) of different tasks (language, motor, and relational). Each of the attention masks was color‐coded with a color gradient indicating the enhancement (positive with red) or diminishment (negative with blue) of the feature maps. [Correction added on March 11, 2022, after first online publication: Figure 3 has been updated to correct the task labels in 3c.]
FIGURE 4 Prediction of individual traits. (a) An example showing that the transfer learning model yielded significant predictions of gF. (b) The attention masks from low level to high level after transfer learning. The focused regions at the high level change adaptively.
Prediction of individual traits between different models
| Model | Initial training | Transfer learning |
|---|---|---|
| 3DResNet‐Att | | |
| 4DResNet | | |
| 4DResNet‐Att | | |
FIGURE 5 Visualization of attention masks on the BOLD5000 dataset. (a)–(d) Attention masks from low level to high level after transfer learning. The examples show the attention masks of four participants, evaluated with leave‐one‐subject‐out (LOSO) cross‐validation. The masks adaptively change to fit different subjects' brain structures.