Betty Wutzl, Kenji Leibnitz, Frank Rattay, Martin Kronbichler, Masayuki Murata, Stefan Martin Golaszewski.
Abstract
The diagnosis and prognosis of patients with severe chronic disorders of consciousness remain challenging, and the rate of misdiagnosis is high. Hence, new tools are needed for an accurate diagnosis, which will also have an impact on the prognosis. In recent years, functional magnetic resonance imaging (fMRI) has become increasingly important in diagnosing this patient group. Resting state scans in particular, i.e., examinations in which the patient does not perform any specific task, seem promising for these patients. After preprocessing the resting state fMRI data with a standard pipeline, we extracted the correlation matrices of 132 regions of interest. The aim was to find the regions of interest that contributed most to the distinction between the different patient groups and healthy controls. We performed feature selection using a genetic algorithm and a support vector machine. Moreover, we show that classifying with only those regions of interest that are most often selected by our algorithm yields a much better classifier performance than classifying with all regions of interest.
Year: 2019 PMID: 31295332 PMCID: PMC6622536 DOI: 10.1371/journal.pone.0219683
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Graphical representation of crossover and mutation in GA.
(A) shows two parent chromosomes. The second row shows the children’s chromosomes after crossover (B) and after mutation (C).
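
The two operators in Fig 1 can be sketched on binary chromosomes as follows. This is a minimal illustration that assumes single-point crossover and independent bit-flip mutation; the caption does not state which variants the authors used, and the function names, the crossover point, and the mutation rate are hypothetical.

```python
import random


def single_point_crossover(parent_a, parent_b, point):
    """Swap the tails of two equal-length binary chromosomes at `point`."""
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


def bit_flip_mutation(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in chromosome]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Two toy parents (assumed values, 8 genes instead of 132 ROIs).
parent_a = [1, 1, 1, 1, 0, 0, 0, 0]
parent_b = [0, 0, 1, 1, 0, 0, 1, 1]

child_a, child_b = single_point_crossover(parent_a, parent_b, point=3)
mutated_a = bit_flip_mutation(child_a, rate=0.1, rng=rng)
```

In the feature selection setting, each bit would stand for one ROI, so crossover recombines two ROI subsets and mutation adds or drops individual ROIs.
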
Fig 2. Representation of the feature selection algorithm.
The binary vector (A) is applied to the correlation matrix (B) (for graphical representation this is only a 9x9 matrix; in our actual analysis it was a 132x132 matrix), selecting only those combinations of ROIs that both have a “1” in the binary mask. This results in a smaller truncated correlation matrix (C). Because this matrix is symmetric, we use only its upper triangular part for further calculation. This triangular matrix (D) is then flattened into a feature vector (E). Each integer tuple shown in the matrix represents its row and column indices, i.e., i,j corresponds to row i, column j.
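
The masking and flattening steps of Fig 2 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' code: the function name is hypothetical, the 4x4 matrix holds made-up values, and excluding the diagonal (which is always 1 in a correlation matrix) is an assumption that the caption does not confirm.

```python
def masked_upper_triangle(corr, mask, include_diagonal=False):
    """Keep rows/columns whose mask bit is 1, then flatten the upper triangle.

    corr: square symmetric matrix as a list of lists.
    mask: list of 0/1 values, one per ROI.
    """
    selected = [i for i, bit in enumerate(mask) if bit == 1]
    offset = 0 if include_diagonal else 1
    features = []
    for pos, i in enumerate(selected):
        # Only pair ROI i with ROIs at or after it, so each pair appears once.
        for j in selected[pos + offset:]:
            features.append(corr[i][j])
    return features


# Toy symmetric correlation matrix for 4 ROIs (assumed values).
corr = [
    [1.0, 0.2, 0.5, -0.1],
    [0.2, 1.0, 0.3, 0.4],
    [0.5, 0.3, 1.0, 0.7],
    [-0.1, 0.4, 0.7, 1.0],
]
mask = [1, 0, 1, 1]  # select ROIs 0, 2, and 3

features = masked_upper_triangle(corr, mask)
```

With the diagonal excluded, k selected ROIs give k(k-1)/2 features, so the three ROIs selected here yield three correlation values.
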
Fig 3. Schematic view of the overall algorithm.
We start with the correlation matrices (A). The classes A and B stand for any two classes of healthy controls, patients, UWS, or MCS. The correlation matrices are split into a train-test set (B). The training set is first used for the GA. We find a binary mask as the best solution of the GA (C). This binary mask is then applied to the correlation matrices (D) (see also Fig 2). Then the fitness is calculated from the test set for each genome (E) and a fitness function is plotted (F). This plot shows the fitness of the best genome of each generation. After the GA is repeated 100 times, the best genome is saved which results in 1000 best binary ROI masks (G). In order to find the best ROIs, we plot histograms of these results (H). These best results are used for further calculations.
Fig 4. Averaged AUC values over 1000 repetitions when using a certain number of ROIs for differentiating healthy subjects from UWS patients.
The maximum is reached at 10 ROIs. The x-axis gives the number of ROIs and the y-axis gives the value of the averaged AUC.
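
Choosing the number of ROIs from a curve like the one in Fig 4 amounts to averaging the AUC over repetitions for each candidate count and taking the maximum. The sketch below assumes the per-repetition AUC values are already available; the function name and all numbers are made up for illustration and are not taken from the study.

```python
from statistics import mean


def best_roi_count(auc_by_count):
    """Average the AUC per ROI count and return the count with the highest mean."""
    averaged = {count: mean(values) for count, values in auc_by_count.items()}
    best_count = max(averaged, key=averaged.get)
    return best_count, averaged


# Hypothetical AUC values: two repetitions for each candidate number of ROIs.
auc_by_count = {
    5: [0.80, 0.84],
    10: [0.90, 0.88],
    15: [0.85, 0.83],
}

best_count, averaged = best_roi_count(auc_by_count)
```
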
Fig 5. Example of a single run of the GA for feature selection.
The ROIs selected as features are shown in yellow in subfigure (A) (corresponds to Fig 3C). Subfigure (A) shows the entire population at the 100th generation, where each row represents one individual and rows are sorted by fitness in descending order. Subfigure (B) shows the chosen ROIs of the solution with the highest fitness for each generation (corresponds to Fig 3G), ending with a solution that has just 6 ROIs, namely 6, 58, 67, 88, 89, and 91. Subfigure (C) shows the best fitness of the population (corresponds to Fig 3F), starting at around 0.85 and ending at a fitness of 1, where the fitness is the AUC of the precision-recall curve.
Fig 6. Histograms of the different GA results (corresponds to Fig 3H) when comparing the four different combinations (healthy controls versus patients, healthy controls versus MCS, healthy controls versus UWS, and MCS versus UWS).
The x-axis shows the ID of the ROI in the CONN atlas and the y-axis shows its frequency, i.e., how often the GA chose this ROI to be important.
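
The histograms in Fig 6 count how often each ROI appears in the best mask of a GA run. A minimal sketch of that counting step is given below. It assumes that the best masks are stored as lists of 0/1 values and that ROI IDs are 1-based positions in the mask; the function name and the toy masks are hypothetical.

```python
from collections import Counter


def roi_selection_counts(best_masks):
    """Count, per ROI ID, in how many best masks that ROI is switched on."""
    counts = Counter()
    for mask in best_masks:
        # enumerate(..., start=1) maps mask positions to 1-based ROI IDs.
        counts.update(roi for roi, bit in enumerate(mask, start=1) if bit == 1)
    return counts


# Toy example: best masks from four GA runs over five ROIs (assumed values).
best_masks = [
    [1, 0, 1, 0, 0],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 0, 1],
]

counts = roi_selection_counts(best_masks)

# Rank by frequency (ties broken by ROI ID) and keep the two most frequent.
ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
top_rois = [roi for roi, _ in ranked[:2]]
```
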
Most frequent ROIs when comparing healthy controls and patients, healthy controls and MCS patients, healthy controls and UWS, or MCS and UWS patients.
The columns are the name of the ROI, its ID in the CONN atlas, and its selection frequency (how often this ROI was chosen by the GA as one of the most important ones). Bold ROIs show up in more than one comparison.

| Rank | ROI (healthy vs patients) | Nr. | Count | ROI (MCS vs UWS) | Nr. | Count |
|---|---|---|---|---|---|---|
| 1 | | | 253 | inferior temporal gyrus, temporooccipital part left | 32 | 231 |
| 2 | | | 234 | | | 185 |
| 3 | | | 189 | accumbens right | 104 | 182 |
| 4 | precentral gyrus left | 14 | 182 | | | 173 |
| 5 | | | 167 | lateral occipital cortex, superior division left | 44 | 136 |
| 6 | | | 166 | | | 131 |
| 7 | | | 161 | angular gyrus right | 41 | 128 |
| 8 | parahippocampal gyrus, posterior division right | 64 | 161 | frontal orbital cortex left | 61 | 119 |
| 9 | inferior frontal gyrus, pars triangularis right | 9 | 160 | supramarginal gyrus, posterior division right | 39 | 116 |
| 10 | | | 119 | temporal occipital fusiform cortex left | 73 | 115 |
| 11 | precentral gyrus right | 13 | 119 | superior temporal gyrus, anterior division left | 18 | 110 |
| 12 | parietal operculum cortex left | 81 | 111 | inferior frontal gyrus, pars opercularis right | 11 | 107 |
| 13 | | | 104 | planum temporale left | 87 | 98 |
| 14 | superior frontal gyrus left | 6 | 96 | superior temporal gyrus, posterior division left | 20 | 97 |
| 15 | middle temporal gyrus, posterior division left | 24 | 95 | insular cortex right | 3 | 91 |
| 16 | temporal fusiform cortex, posterior division left | 71 | 94 | hippocampus right | 100 | 82 |
| 17 | inferior frontal gyrus, pars triangularis right | 10 | 93 | vermis 8 | 130 | 81 |
| 18 | inferior temporal gyrus, temporooccipital part right | 31 | 79 | | | |
| 19 | cerebelum crus2 left | 109 | 78 | | | |
| 20 | juxtapositional lobule cortex -formerly supplementary motor cortex- right | 50 | 77 | | | |
| 21 | cerebelum 7b right | 118 | 76 | | | |
| 22 | putamen right | 96 | 76 | | | |

| Rank | ROI (healthy vs UWS) | Nr. | Count | ROI (healthy vs MCS) | Nr. | Count |
|---|---|---|---|---|---|---|
| 1 | | | 211 | | | 367 |
| 2 | | | 102 | insular cortex left | 4 | 212 |
| 3 | | | 179 | | | 142 |
| 4 | vermis 9 | 131 | 160 | middle frontal gyrus left | 8 | 138 |
| 5 | | | 153 | cerebelum 3 left | 111 | 128 |
| 6 | | | 151 | | | 118 |
| 7 | | | 150 | | | |
| 8 | | | 145 | | | |
| 9 | cerebelum 10 left | 123 | 139 | | | |
| 10 | cerebelum crus1 right | 108 | 136 | | | |

Table of averaged results over 100 train-test splits with a ratio of 0.33.
The AUC of the precision-recall curve is shown for training and testing with only the most important ROIs (AUC with feature selection) and with all ROIs (AUC without feature selection).

| | Healthy vs patients | Healthy vs MCS | Healthy vs UWS | MCS vs UWS |
|---|---|---|---|---|
| AUC with feature selection | 0.8239 | 0.7950 | 0.8871 | 0.8628 |
| AUC without feature selection | 0.5018 | 0.3886 | 0.4116 | 0.6159 |
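
For readers unfamiliar with the metric, the area under a precision-recall curve can be summarized as average precision, i.e., the mean of the precision values at each rank where a positive example is retrieved. The sketch below is one common way to compute it and is not necessarily the exact variant used in the study; it assumes binary labels (1 for the positive class) and classifier scores without ties, and the function name and toy data are hypothetical.

```python
def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision).

    labels: list of 0/1 ground-truth labels (1 = positive class).
    scores: classifier scores, higher means more likely positive (no ties assumed).
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank examples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    true_pos = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_pos += 1
            total += true_pos / rank  # precision at this recall step
    return total / n_pos


# Toy example (assumed values): positives are ranked 1st, 2nd, and 4th.
labels = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.1]
ap = average_precision(labels, scores)

# A classifier that ranks every positive above every negative reaches 1.0.
perfect = average_precision([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])
```

A value of 1 means that every positive example is ranked above every negative one.
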