Bokai Cao, Xiangnan Kong, Jingyuan Zhang, Philip S Yu, Ann B Ragin.
Abstract
Investigating brain connectivity networks for neurological disorder identification has attracted great interest in recent years, but most studies focus on the graph representation alone. However, in addition to brain networks derived from the neuroimaging data, hundreds of clinical, immunologic, serologic, and cognitive measures may also be documented for each subject. These measures compose multiple side views that encode a tremendous amount of supplemental information for diagnostic purposes, yet they are often ignored. In this paper, we study the problem of subgraph selection from brain networks with side information guidance and propose a novel solution to find an optimal set of subgraph patterns for graph classification by exploring a plurality of side views. We derive a feature evaluation criterion, named gSide, to estimate the usefulness of subgraph patterns based upon side views. Then we develop a branch-and-bound algorithm, called gMSV, to efficiently search for optimal subgraph patterns by integrating the subgraph mining process and the procedure of discriminative feature selection. Empirical studies on graph classification tasks for neurological disorders using brain networks demonstrate that subgraph patterns selected by the multi-side-view-guided subgraph selection approach can effectively boost graph classification performance and are relevant to disease diagnosis.
Keywords: Brain network; Graph mining; Side information; Subgraph pattern
Year: 2015 PMID: 27747563 PMCID: PMC4737668 DOI: 10.1007/s40708-015-0023-1
Source DB: PubMed Journal: Brain Inform ISSN: 2198-4026
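The abstract describes scoring subgraph patterns against side views and keeping the most useful ones. The following is a minimal Python sketch of that general idea, not the paper's gSide criterion or gMSV algorithm, whose exact definitions are not given in this record. It assumes each subgraph pattern is a binary vector over graphs, each side view is summarized by a graph-by-graph similarity kernel, and a pattern is penalized when two graphs that look similar in the side views disagree on whether they contain it. The names `side_view_penalty` and `select_top_k` are hypothetical, and the sketch enumerates all patterns rather than using branch-and-bound pruning.

```python
import numpy as np

def side_view_penalty(f, kernel):
    """Hypothetical penalty: sum_ij kernel[i, j] * (f[i] - f[j])**2.

    f      : binary indicator vector (1 if graph i contains the pattern)
    kernel : graph-by-graph similarity derived from side views (assumed given)
    A lower value means the pattern agrees better with the side views.
    """
    f = np.asarray(f, dtype=float)
    diff = f[:, None] - f[None, :]
    return float(np.sum(kernel * diff ** 2))

def select_top_k(F, kernels, weights, k):
    """Rank patterns (rows of F) by penalty under a weighted sum of kernels."""
    combined = sum(w * K for w, K in zip(weights, kernels))
    scores = np.array([side_view_penalty(row, combined) for row in F])
    order = np.argsort(scores, kind="stable")  # ascending: best first
    return order[:k].tolist(), scores

# Toy data: 4 graphs; graphs 0 and 1 are similar, graphs 2 and 3 are similar.
K = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 1]], dtype=float)
# Three candidate patterns (rows), each a binary vector over the 4 graphs.
F = np.array([[1, 1, 0, 0],
              [1, 0, 1, 0],
              [1, 1, 1, 0]])
selected, scores = select_top_k(F, [K], [1.0], k=2)
print(selected, scores)
```

Note that a pattern present in every graph would trivially get a penalty of zero under this toy score, so any practical variant would also need label information and a frequency threshold such as min_sup.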
Fig. 2 Two strategies of leveraging side views in the feature selection process for graph classification: late fusion and early fusion [6]
Important notations
| Symbol | Definition and description |
|---|---|
| \|.\| | Cardinality of a set |
| \|\|.\|\| | Norm of a vector |
| | Given graph dataset, |
| | Class label vector for graphs in |
| | Set of all subgraph patterns in the graph dataset |
| | Binary vector for subgraph pattern |
| | Binary vector for |
| | Matrix of all binary vectors in the dataset, |
| | Set of selected subgraph patterns, |
| | Diagonal matrix indicating which subgraph patterns are selected from |
| min_sup | Minimum frequency threshold; frequent subgraphs are contained by at least |
| | Number of subgraph patterns to be selected |
| | Weight of the |
| | Kernel function on the |
Demographic characteristics
| Characteristic | HIV | Control | p value |
|---|---|---|---|
| Age (mean years) | 33.3 | 31.4 | 0.45 |
| Gender (% male) | 89 % | 76 % | 0.22 |
| Race (% white) | 62 % | 76 % | 0.22 |
| Education (% college) | 81 % | 90 % | 0.29 |
Hypothesis testing results (p values) to verify side information consistency
| Side views | fMRI dataset | DTI dataset |
|---|---|---|
| Neuropsychological tests | 1.3220e−20 | 3.6015e−12 |
| Flow cytometry | 5.9497e−57 | 5.0346e−75 |
| Plasma luminex | 9.8102e−06 | 7.6090e−06 |
| Freesurfer | 2.9823e−06 | 1.5116e−03 |
| Overall brain microstructure | 1.0403e−02 | 8.1027e−03 |
| Localized brain microstructure | 3.1108e−04 | 5.7040e−04 |
| Brain volumetry | 2.0024e−04 | 1.2660e−02 |
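This record does not state which statistical test produced the p values above. As one illustration of how side information consistency could be checked, the Python sketch below runs a permutation test under an explicit assumption: a side view is "consistent" if graphs with the same class label are more similar in that view than graphs with different labels. The statistic, the function names, and the toy data are all hypothetical and are not taken from the paper.

```python
import numpy as np

def consistency_statistic(K, y):
    """Mean side-view similarity of same-label pairs minus different-label pairs."""
    y = np.asarray(y)
    same = y[:, None] == y[None, :]
    off_diag = ~np.eye(len(y), dtype=bool)  # ignore self-similarity
    return K[same & off_diag].mean() - K[~same].mean()

def permutation_p_value(K, y, n_perm=1000, seed=0):
    """One-sided permutation p value for the statistic above (assumed test)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    observed = consistency_statistic(K, y)
    hits = 0
    for _ in range(n_perm):
        # Shuffling labels breaks any link between labels and the side view.
        if consistency_statistic(K, rng.permutation(y)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy side view: one numeric measure per subject, well separated by class.
x = np.concatenate([np.linspace(0.0, 0.9, 10), np.linspace(5.0, 5.9, 10)])
y = np.array([1] * 10 + [0] * 10)
K = np.exp(-(x[:, None] - x[None, :]) ** 2)  # RBF similarity (assumed kernel)
p = permutation_p_value(K, y)
print(p)
```

A small p value under this setup would suggest that the side view carries label-relevant information. The resolution of a permutation p value is limited by `n_perm`, so values as small as those in the table would require a different (for example, parametric) test.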
Fig. 3 Classification performance on the fMRI dataset with different numbers of features
Fig. 4 Classification performance on the DTI dataset with different numbers of features
Fig. 5 Average CPU time for pruning versus unpruning with varying min_sup
Fig. 6 Average number of subgraph patterns explored in the mining procedure for pruning versus unpruning with varying min_sup
Average classification performances of gMSV on the fMRI dataset with different single-side views
| Side views | Precision | Recall | F1 |
|---|---|---|---|
| Neuropsychological tests | 0.851 | 0.679 | 0.734 |
| Flow cytometry | 0.919 | 0.872 | 0.892 |
| Plasma luminex | 0.769 | 0.682 | 0.710 |
| Freesurfer | 0.851 | 0.737 | 0.785 |
| Overall brain microstructure | 0.824 | 0.500 | 0.618 |
| Localized brain microstructure | 0.686 | 0.605 | 0.637 |
| Brain volumetry | 0.739 | 0.737 | 0.731 |
| All side views | 1.000 | 0.949 | 0.973 |
Average classification performances of gMSV on the DTI dataset with different single-side views
| Side views | Precision | Recall | F1 |
|---|---|---|---|
| Neuropsychological tests | 0.630 | 0.705 | 0.662 |
| Flow cytometry | 0.847 | 0.808 | 0.822 |
| Plasma luminex | 0.801 | 0.705 | 0.744 |
| Freesurfer | 0.664 | 0.632 | 0.644 |
| Overall brain microstructure | 0.626 | 0.679 | 0.647 |
| Localized brain microstructure | 0.717 | 0.775 | 0.741 |
| Brain volumetry | 0.616 | 0.679 | 0.644 |
| All side views | 1.000 | 0.951 | 0.974 |
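For readers checking the two tables above, the following Python sketch shows how precision, recall, and F1 are conventionally computed from binary predictions. It also illustrates why a reported average F1 need not equal the harmonic mean of the reported average precision and recall: the captions describe the values as averages, so the sketch assumes (without confirmation from this record) that each metric was averaged separately across evaluation folds. The function names are illustrative.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Standard precision, recall, and F1 for one set of binary predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = harmonic_f1(precision, recall)
    return precision, recall, f1

def harmonic_f1(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

# One toy fold: 3 positives, 2 negatives.
p, r, f = precision_recall_f1([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])

# Harmonic mean of the averaged precision and recall in the first fMRI row.
f1_from_averages = harmonic_f1(0.851, 0.679)
print(p, r, f, round(f1_from_averages, 3))
```

For the neuropsychological tests row of the fMRI table, the harmonic mean of 0.851 and 0.679 is about 0.755, while the table reports 0.734. That gap is what one would expect if F1 was computed per fold and then averaged.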
Fig. 7 Discriminative subgraph patterns that are associated with HIV, selected from the fMRI dataset
Fig. 8 Discriminative subgraph patterns that are associated with HIV, selected from the DTI dataset