L Torlay, M Perrone-Bertolotti, E Thomas, M Baciu.
Abstract
Our goal was to apply a statistical approach to identify atypical language patterns and to differentiate patients with epilepsy from healthy subjects on the basis of their cerebral activity, as assessed by functional MRI (fMRI). Patients with focal epilepsy show reorganization, or plasticity, of the brain networks involved in cognitive functions, resulting in 'atypical' brain profiles (as opposed to the 'typical' profiles of healthy people). Moreover, some of these patients suffer from drug-resistant epilepsy and undergo surgery to stop seizures. The neurosurgeon should remove only the zone generating seizures and must preserve cognitive functions to avoid deficits. Preserving these functions requires knowing how they are represented in the patient's brain, which generally differs from their representation in healthy subjects. For this purpose, robust and efficient methods are required in the pre-surgical stage to distinguish atypical from typical representations. Because the regions generating seizures are frequently located in the vicinity of language networks, language is one important function to consider. The risk of language impairment after surgery is determined pre-surgically by mapping language networks. In clinical settings, cognitive mapping is classically performed with fMRI. However, the fMRI analyses used to identify atypical patterns of language networks in patients are not sufficiently robust and require additional statistical approaches. In this study, we report the use of a nonlinear statistical machine learning classifier, the Extreme Gradient Boosting (XGBoost) algorithm, to identify atypical patterns and classify 55 participants as healthy subjects or patients with epilepsy. XGBoost analyses were based on neurophysiological features in five language regions (three frontal and two temporal) in both hemispheres, activated during fMRI by a phonological (PHONO) and a semantic (SEM) language task.
These features were combined into 135 cognitively plausible subsets and then submitted to selection and binary classification. Classification performance was scored with the area under the receiver operating characteristic curve (AUC). Our results showed that the subset SEM_LH BA_47-21 (left fronto-temporal activation induced by the SEM task) provided the best discrimination between the two groups (AUC of 91 ± 5%). The results are discussed in the framework of current debates on language reorganization in focal epilepsy.
Keywords: Atypical; Epilepsy; Extreme Gradient Boosting; Language; ML; Machine learning; XGBoost
Year: 2017 PMID: 28434153 PMCID: PMC5563301 DOI: 10.1007/s40708-017-0065-7
Source DB: PubMed Journal: Brain Inform ISSN: 2198-4026
Demographic information of participants: patients with left temporal lobe epilepsy (TLE) and healthy volunteers (controls)
| Group | N | Age mean (SD) | Gender | Handedness |
|---|---|---|---|---|
| TLE | 16 | 35.3 ± 11.1 | 9M–7F | 1L–15R |
| Controls | 39 | 26.5 ± 3.7 | 18M–21F | 15L–24R |
For each group, we report the number of participants (N), the mean age with standard deviation (SD), gender (F, female; M, male) and handedness (R, right-handed; L, left-handed)
A total of 135 subsets were evaluated
| Task | Subset | FT subset type (P, partial; T, total) | Left hemisphere | Right hemisphere | Bilateral |
|---|---|---|---|---|---|
| SEM only | 1 | P | BA 47LH | BA 47RH | BA 47LH, BA 47RH |
| SEM only | 2 | P | BA 44LH, BA 45LH | BA 44RH, BA 45RH | BA 44LH, BA 45LH, BA 44RH, BA 45RH |
| SEM only | 3 | T | BA 44LH, BA 45LH, BA 47LH | BA 44RH, BA 45RH, BA 47RH | BA 44LH, BA 45LH, BA 47LH, BA 44RH, BA 45RH, BA 47RH |
| SEM only | 4 | P | BA 21LH | BA 21RH | BA 21LH, BA 21RH |
| SEM only | 5 | P | BA 22LH | BA 22RH | BA 22LH, BA 22RH |
| SEM only | 6 | T | BA 21LH, BA 22LH | BA 21RH, BA 22RH | BA 21LH, BA 22LH, BA 21RH, BA 22RH |
| SEM only | 7 | P | BA 21LH, BA 47LH | BA 21RH, BA 47RH | BA 21LH, BA 47LH, BA 21RH, BA 47RH |
| SEM only | 8 | P | BA 22LH, BA 47LH | BA 22RH, BA 47RH | BA 22LH, BA 47LH, BA 22RH, BA 47RH |
| SEM only | 9 | P | BA 21LH, BA 22LH, BA 47LH | BA 21RH, BA 22RH, BA 47RH | BA 21LH, BA 22LH, BA 47LH, BA 21RH, BA 22RH, BA 47RH |
| SEM only | 10 | P | BA 21LH, BA 44LH, BA 45LH | BA 21RH, BA 44RH, BA 45RH | BA 21LH, BA 44LH, BA 45LH, BA 21RH, BA 44RH, BA 45RH |
| SEM only | 11 | P | BA 22LH, BA 44LH, BA 45LH | BA 22RH, BA 44RH, BA 45RH | BA 22LH, BA 44LH, BA 45LH, BA 22RH, BA 44RH, BA 45RH |
| SEM only | 12 | P | BA 21LH, BA 22LH, BA 44LH, BA 45LH | BA 21RH, BA 22RH, BA 44RH, BA 45RH | BA 21LH, BA 22LH, BA 44LH, BA 45LH, BA 21RH, BA 22RH, BA 44RH, BA 45RH |
| SEM only | 13 | P | BA 21LH, BA 44LH, BA 45LH, BA 47LH | BA 21RH, BA 44RH, BA 45RH, BA 47RH | BA 21LH, BA 44LH, BA 45LH, BA 47LH, BA 21RH, BA 44RH, BA 45RH, BA 47RH |
| SEM only | 14 | P | BA 22LH, BA 44LH, BA 45LH, BA 47LH | BA 22RH, BA 44RH, BA 45RH, BA 47RH | BA 22LH, BA 44LH, BA 45LH, BA 47LH, BA 22RH, BA 44RH, BA 45RH, BA 47RH |
| SEM only | 15 | T | BA 21LH, BA 22LH, BA 44LH, BA 45LH, BA 47LH | BA 21RH, BA 22RH, BA 44RH, BA 45RH, BA 47RH | BA 21LH, BA 22LH, BA 44LH, BA 45LH, BA 47LH, BA 21RH, BA 22RH, BA 44RH, BA 45RH, BA 47RH |
Fifteen subsets were based on combinations of fronto-temporal (FT) regions and defined as follows: (a) only frontal regions (partial subsets 1–2 and total subset 3); (b) only temporal regions (partial subsets 4–5 and total subset 6); and (c) combinations of frontal and temporal regions (partial subsets 7–14 and total subset 15). Each of these fifteen subsets was evaluated for three task sets (semantic only, SEM only; phonological only, PHONO only; semantic and phonological combined, SEM + PHONO) and three hemisphere sets (left hemisphere, right hemisphere, and bilateral, i.e. both hemispheres), giving 15 × 3 × 3 = 135 subsets
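The count of 135 can be checked by enumerating the combinations described above. The following is a minimal sketch, not the authors' code: the region lists are copied from the table, while the feature naming scheme (`SEM_BA21LH` and so on), the helper names, and the assumption of one feature per task, area, and hemisphere are hypothetical and used only for illustration.

```python
from itertools import product

# Brodmann areas for subsets 1-15, copied from the table above
# (hemisphere suffix omitted; it is added per hemisphere set below).
REGION_SUBSETS = {
    1: (47,),
    2: (44, 45),
    3: (44, 45, 47),
    4: (21,),
    5: (22,),
    6: (21, 22),
    7: (21, 47),
    8: (22, 47),
    9: (21, 22, 47),
    10: (21, 44, 45),
    11: (22, 44, 45),
    12: (21, 22, 44, 45),
    13: (21, 44, 45, 47),
    14: (22, 44, 45, 47),
    15: (21, 22, 44, 45, 47),
}
TASK_SETS = {"SEM": ("SEM",), "PHONO": ("PHONO",), "SEM+PHONO": ("SEM", "PHONO")}
HEMI_SETS = {"LH": ("LH",), "RH": ("RH",), "BILATERAL": ("LH", "RH")}


def feature_names(tasks, hemis, areas):
    """Hypothetical naming: one feature per task, hemisphere and area."""
    return [f"{t}_BA{a}{h}" for t in tasks for h in hemis for a in areas]


def enumerate_subsets():
    """Map (task set, hemisphere set, subset number) to a list of feature names."""
    out = {}
    for (task, tasks), (hemi, hemis), (num, areas) in product(
        TASK_SETS.items(), HEMI_SETS.items(), REGION_SUBSETS.items()
    ):
        out[(task, hemi, num)] = feature_names(tasks, hemis, areas)
    return out


subsets = enumerate_subsets()
print(len(subsets))               # 135
print(subsets[("SEM", "LH", 7)])  # ['SEM_BA21LH', 'SEM_BA47LH']
```

Under this naming, the key `("SEM", "LH", 7)` corresponds to the subset reported as best in the abstract (SEM task, left hemisphere, BA 21 and BA 47).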
AUC obtained at each iteration of the outer MCCV for the selected subset SEM (semantic) LH (left hemisphere) BA 21 and BA 47, using the XGBoost algorithm (n_estimators = 1200, learning rate = 0.01, subsample = 0.7, max_depth = 3)
| Iteration number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Subset selected | SEM L21 L47 | SEM L21 L47 | SEM L21 L47 | SEM L21 L47 | SEM L21 L47 | SEM L21 L47 | SEM L21 L47 | SEM L21 L47 | SEM L21 L47 | SEM L21 L47 | SEM L21 L47 | SEM L21 L47 |
| AUC (%) | 93.75 | 87.50 | 87.50 | 93.75 | 93.75 | 93.75 | 100 | 83.33 | 83.33 | 93.75 | 87.50 | 100 |
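The summary figure of 91 ± 5% quoted in the abstract can be compared against the twelve values in this table. The sketch below uses only the Python standard library; the source does not state which standard deviation estimator was used, so both the population and the sample versions are computed as an assumption-free comparison.

```python
from statistics import mean, pstdev, stdev

# AUC (%) for the 12 outer MCCV iterations, copied from the table above.
auc = [93.75, 87.50, 87.50, 93.75, 93.75, 93.75,
       100, 83.33, 83.33, 93.75, 87.50, 100]

auc_mean = mean(auc)
auc_sd_population = pstdev(auc)  # divides by n
auc_sd_sample = stdev(auc)       # divides by n - 1

print(f"mean AUC       = {auc_mean:.2f}%")
print(f"SD (population) = {auc_sd_population:.2f}")
print(f"SD (sample)     = {auc_sd_sample:.2f}")
```

The mean comes out at roughly 91.5%, with a standard deviation between about 5.4 and 5.7 percentage points depending on the estimator, which is in line with the reported 91 ± 5%.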
Fig. 1 Illustration of the validation schema, using outer Monte Carlo cross-validation (MCCV)
Fig. 2 Distribution of the 12 AUC scores measured on the outer validation set of the Monte Carlo cross-validation (MCCV) around the mean score of 91%
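The outer MCCV loop and the AUC score can be sketched in a few lines of standard-library Python. This is an illustrative sketch only: the 12 iterations and the group sizes (16 patients, 39 controls) come from the tables above, but the validation fraction of 0.2, the stratification by group, the random seed, and the function names `mccv_splits` and `auc` are assumptions that the record does not specify. The inner loop used for subset selection and the XGBoost model itself are not shown.

```python
import random


def mccv_splits(labels, n_iter, val_fraction, seed=0):
    """Yield (train_idx, val_idx) for repeated random splits, stratified by label."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    for _ in range(n_iter):
        train, val = [], []
        for idx in by_class.values():
            shuffled = idx[:]
            rng.shuffle(shuffled)
            # Keep at least one member of each class in the validation set.
            n_val = max(1, round(len(shuffled) * val_fraction))
            val.extend(shuffled[:n_val])
            train.extend(shuffled[n_val:])
        yield sorted(train), sorted(val)


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# 16 patients (label 1) and 39 controls (label 0), as in the demographic table.
labels = [1] * 16 + [0] * 39
splits = list(mccv_splits(labels, n_iter=12, val_fraction=0.2))

for train_idx, val_idx in splits[:2]:
    print(len(train_idx), len(val_idx))

# Toy scores: one patient is ranked below one control, so 3 of 4 pairs are correct.
print(auc([0.9, 0.2, 0.3, 0.1], [1, 1, 0, 0]))  # 0.75
```

Unlike k-fold cross-validation, each MCCV iteration draws a fresh random split, so a participant may appear in several validation sets or in none. With small validation sets, the AUC can take only a few distinct values, which is one plausible reason for the repeated values in the table of per-iteration scores.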