Jorge Bosch-Bayard1, Lídice Galán-García2, Thalia Fernandez1, Rolando B Lirio3, Maria L Bringas-Vega4, Milene Roca-Stappung1, Josefina Ricardo-Garcell1, Thalía Harmony1, Pedro A Valdes-Sosa2,4.
Abstract
In this paper, we present a novel methodology for solving the classification problem, based on sparse (data-driven) regressions combined with techniques for ensuring stability, which is especially useful for high-dimensional datasets with small sample sizes. The sensitivity and specificity of the classifiers are assessed by a stable ROC procedure, which uses a non-parametric algorithm to estimate the area under the ROC curve (AUC). This method allows the ROC technique to assess classification performance when more than two groups are involved, i.e., when the gold standard is not binary. We apply this methodology to EEG spectral signatures to find biomarkers that discriminate between (and predict membership in) different subgroups of children diagnosed with Learning Disabilities Not Otherwise Specified (LD-NOS). Children with LD-NOS have notable learning difficulties that affect their education but cannot be assigned to a specific category such as Reading (Dyslexia), Mathematics (Dyscalculia), or Writing (Dysgraphia). By using the EEG spectra, we aim to identify EEG patterns that may be related to specific learning disabilities in an individual case. This could be useful for developing subject-based methods of therapy, based on information provided by the EEG. Here we study 85 LD-NOS children, divided into three subgroups previously selected by a clustering technique applied to the scores of cognitive tests. The classification equation produced stable marginal areas under the ROC curve of 0.71 for discrimination between Group 1 vs. Group 2; 0.91 for Group 1 vs. Group 3; and 0.75 for Group 2 vs. Group 3. A discussion of the EEG characteristics of each group in relation to the cognitive scores is also presented.
Keywords: EEG classification; LD-NOS classification; elastic-net; non-parametric ROC; sparse classifiers; stability based biomarkers
Year: 2018 PMID: 29379411 PMCID: PMC5775224 DOI: 10.3389/fnins.2017.00749
Source DB: PubMed Journal: Front Neurosci ISSN: 1662-453X Impact factor: 4.677
Figure 1. Schematic representation of the Robust Sparse classification algorithm with stable ROC assessment. LOO-CV, leave-one-out cross-validation; RRS-CV, repeated random subsampling cross-validation.
Table 1. Selected biomarkers during the classification step.
| Lead | Frequency (Hz) | Selection rate (%) | Beta coefficient | Band |
| --- | --- | --- | --- | --- |
| C4 | 0.39 | 86.5 | 0.12 | Delta |
| T6 | 0.39 | 58.89 | 0.12 | Delta |
| T4 | 1.17 | 59.46 | 0.07 | Delta |
| P4 | 3.52 | 59.55 | −0.18 | Low Theta |
| P4 | 3.91 | 61.4 | −0.3 | Low Theta |
| P3 | 5.08 | 69.6 | 0.23 | High Theta |
| F8 | 5.08 | 57.5 | 0.16 | High Theta |
| Fp1 | 5.08 | 54.3 | 0.06 | High Theta |
| Fp2 | 5.47 | 58.6 | 0.03 | High Theta |
| Fp1 | 5.47 | 57.5 | 0.06 | High Theta |
| Fp2 | 5.86 | 56.5 | 0.03 | High Theta |
| Fz | 6.64 | 69.8 | 0.16 | High Theta |
| C3 | 7.81 | 60.8 | −0.11 | Alpha |
| T6 | 8.59 | 54.5 | −0.11 | Alpha |
| T6 | 10.16 | 56.44 | −0.04 | Alpha |
| C3 | 10.94 | 53.7 | −0.16 | Alpha |
| P3 | 14.06 | 56.7 | 0.21 | Beta |
| P3 | 15.23 | 62.3 | 0.34 | Beta |
| F3 | 15.63 | 57 | −0.25 | Beta |
| O1 | 18.36 | 52 | 0.27 | Beta |
Columns 1 and 2 contain the lead name and the selected frequency; Column 3 contains the percentage of times that the variable was identified as a biomarker; Column 4 contains the Beta coefficient of each variable in the classification equation. The last column gives the broad band to which each frequency belongs.
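The selection percentage in Column 3 can be illustrated with a minimal sketch: fit a sparse regression on many random subsamples and count how often each variable receives a non-zero coefficient. Everything below is an assumption made for illustration, not the authors' implementation: the pure-Python coordinate-descent elastic-net, the helper names (`elastic_net`, `selection_frequency`), the penalty settings, the subsample fraction, and the synthetic data, in which only the first two of eight variables carry signal.

```python
import random

def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def elastic_net(X, y, lam, alpha, n_iter=20):
    """Coordinate descent for
    (1/2n)*||y - Xb||^2 + lam*(alpha*|b|_1 + (1 - alpha)/2*|b|_2^2).
    Assumes centered data (no intercept is fitted)."""
    n = len(X)
    cols = list(zip(*X))
    norms = [sum(c * c for c in col) / n for col in cols]
    beta = [0.0] * len(cols)
    resid = list(y)  # residual y - X @ beta, starting from beta = 0
    for _ in range(n_iter):
        for j, col in enumerate(cols):
            # Correlation of feature j with the partial residual.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam * alpha) / (norms[j] + lam * (1.0 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for c, r in zip(col, resid)]
                beta[j] = new
    return beta

def selection_frequency(X, y, lam, alpha, n_sub=50, frac=0.75, seed=0):
    """Percentage of random subsamples in which each variable is selected."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    size = int(frac * n)
    counts = [0] * p
    for _ in range(n_sub):
        idx = rng.sample(range(n), size)
        beta = elastic_net([X[i] for i in idx], [y[i] for i in idx], lam, alpha)
        for j, b in enumerate(beta):
            if b != 0.0:
                counts[j] += 1
    return [100.0 * c / n_sub for c in counts]

# Synthetic data (hypothetical): variables 0 and 1 are informative, the rest are noise.
rng = random.Random(0)
n, p = 80, 8
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [3.0 * row[0] - 3.0 * row[1] + rng.gauss(0.0, 0.5) for row in X]
freq = selection_frequency(X, y, lam=1.0, alpha=0.9)
print([round(f, 1) for f in freq])
```

In this toy setting the informative variables are selected in nearly every subsample and the noise variables rarely, which is the kind of contrast a selection-percentage column is meant to expose.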
Figure 2. Color plate showing the φ coefficients of the classification equation (A) and the differences of the mean EEG spectra between each pair of groups at the leads selected by the biomarker procedure (B). Results are summarized by the broad bands shown in Table 1. The φ coefficients have been normalized to show only the sign of the coefficient.
A comparison between the EEG patterns and the cognitive findings for each group.
| Group | EEG pattern | Cognitive findings |
| --- | --- | --- |
| Group 1 | Highest Low-Theta in P4; highest Alpha in C3; highest Beta in F3, P3 | Highest scores in Reading (Accuracy, Comprehension, and Speed); Writing Accuracy and Arithmetic Calculation and Numeric Management. Significantly best in Reading and Writing Accuracy. |
| Group 2 | Highest Alpha in T6 | Highest scores in Writing Narrative Composition; and Arithmetic Counting. Significantly best in Writing Narrative Composition. |
| Group 3 | Highest Delta in C4 and T6; highest High-Theta in Fp1, Fp2, Fz, P3, and P8; highest Beta in P3, O1 | Poorest scores in all areas, especially in Arithmetic (Calculation and Numeric Management); Writing Narrative Composition; and Reading Accuracy. |
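Marginal and global AUC after the stability-based procedure for ROC estimation.
| Group 1 vs. Group 2 | Group 1 vs. Group 3 | Group 2 vs. Group 3 | Global |
| --- | --- | --- | --- |
| 0.71 | 0.91 | 0.75 | 0.89 |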
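As a minimal sketch of what a marginal (pairwise) non-parametric AUC is, the example below uses the Mann-Whitney estimator: the fraction of cross-group pairs in which the higher-ranked group has the larger score, with ties counted as one half. It does not reproduce the stability procedure or the global AUC of the paper; the function names and the toy scores are hypothetical.

```python
from itertools import combinations

def auc_mann_whitney(low, high):
    """Estimate P(high score > low score) + 0.5 * P(tie) over all cross-group pairs."""
    wins = sum(1.0 if h > l else 0.5 if h == l else 0.0
               for l in low for h in high)
    return wins / (len(low) * len(high))

def marginal_aucs(scores_by_group):
    """Pairwise AUC for every pair of groups.
    Groups are assumed to be listed in order of increasing expected score."""
    names = list(scores_by_group)
    return {(a, b): auc_mann_whitney(scores_by_group[a], scores_by_group[b])
            for a, b in combinations(names, 2)}

# Toy classification scores (hypothetical, not the study's data).
scores = {
    "Group 1": [2.1, 2.5, 2.8, 3.0],
    "Group 2": [2.6, 3.1, 3.3, 3.4],
    "Group 3": [3.2, 3.6, 3.9, 4.1],
}
aucs = marginal_aucs(scores)
for pair, value in aucs.items():
    print(pair, value)
```

Because the estimator depends only on the ordering of the scores, it needs no distributional assumption, which is what makes it non-parametric.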
Figure 3. Performance of the classification method applied to the 85 LD-NOS children. (A) Boxplot of the individual classification according to the groups. As defined, the boxplot shows the mean, percentiles, and dispersion of the groups. Note that Group 3 is almost perfectly separated from Groups 1 and 2. (B) ROC curve for the global performance of the algorithm, before applying the ROC stability procedure. The true positive rate at false positive rates of 10 and 20 percent is very high. (C) ROC curves under the stability procedure. Note the stable ROC estimate for the global classification as well as the marginal estimates for each pair of groups.
Youden Index.
| Group | N | Mean | SD |
| --- | --- | --- | --- |
| D− (Group 1) | 24 | 2.77 | 0.32 |
| D0 (Group 2) | 26 | 3.19 | 0.38 |
| D+ (Group 3) | 35 | 3.68 | 0.33 |
Raw Data Summary
Group correct classification probabilities for the best Youden cut-points.
| Group 1 | Group 2 | Group 3 |
| --- | --- | --- |
| 0.77 | 0.41 | 0.79 |
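To make the link between cut-points and correct classification probabilities concrete, here is a minimal sketch of one common three-group generalization of the Youden index: choose two cut-points c1 < c2 that maximize the sum of the three correct classification rates, and report J = (sum − 1) / 2. The sketch rests on assumptions that the source does not state: that the last two columns of the Youden Index table above are the mean and standard deviation of the classification score in each group, that the scores are normally distributed, and that a simple grid search is adequate. The function name is hypothetical.

```python
from statistics import NormalDist

def best_youden_cutpoints(low, mid, high, grid):
    """Grid-search c1 < c2 maximizing the sum of correct classification rates.

    low, mid, high: score distributions of the groups, ordered by mean.
    A subject is assigned to low if score <= c1, high if score > c2, else mid.
    """
    best = None
    for i, c1 in enumerate(grid):
        p_low = low.cdf(c1)
        mid_below_c1 = mid.cdf(c1)
        for c2 in grid[i + 1:]:
            p_mid = mid.cdf(c2) - mid_below_c1
            p_high = 1.0 - high.cdf(c2)
            total = p_low + p_mid + p_high
            if best is None or total > best[0]:
                best = (total, c1, c2, (p_low, p_mid, p_high))
    total, c1, c2, rates = best
    return {"J": (total - 1.0) / 2.0, "cutpoints": (c1, c2), "rates": rates}

# Assumed reading of the Youden Index table: (mean, SD) per group.
group1 = NormalDist(2.77, 0.32)
group2 = NormalDist(3.19, 0.38)
group3 = NormalDist(3.68, 0.33)

grid = [2.0 + 0.01 * k for k in range(251)]  # candidate cut-points from 2.00 to 4.50
result = best_youden_cutpoints(group1, group2, group3, grid)
print(result)
```

Under these assumptions the three rates should come out close to the probabilities listed in the table above, with the middle group hardest to classify because it overlaps both neighbours.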
Figure 4. Distribution of the AUC values obtained from 1000 random realizations of the classification algorithm. The left panel shows the probability function of the AUC in the range 0 to 1, and the right panel shows its density distribution function. Note that the probability of obtaining by chance an AUC value of 0.91 (like ours) is smaller than 0.1e-10, which in practice is an impossible event. Also, the density distribution is centered at 0.5 (random classification), which coincides with the mean value of the AUC in our random realizations.
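The idea of a chance distribution for the AUC can be sketched with a simple label-permutation experiment. This is a simplification and an assumption: the random realizations described in the caption rerun the whole classification algorithm, whereas the sketch below only shuffles group labels over a fixed set of scores. The data are synthetic and the function names are hypothetical.

```python
import random

def auc(neg, pos):
    """Mann-Whitney estimate of P(pos score > neg score), ties counted as 0.5."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for n in neg for p in pos)
    return wins / (len(neg) * len(pos))

def null_auc_distribution(scores, labels, n_perm=1000, seed=0):
    """AUC values obtained after randomly shuffling the group labels."""
    rng = random.Random(seed)
    shuffled = list(labels)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        neg = [s for s, l in zip(scores, shuffled) if l == 0]
        pos = [s for s, l in zip(scores, shuffled) if l == 1]
        null.append(auc(neg, pos))
    return null

# Synthetic two-group scores (hypothetical): group means differ by two SDs.
rng = random.Random(1)
scores = ([rng.gauss(0.0, 1.0) for _ in range(20)]
          + [rng.gauss(2.0, 1.0) for _ in range(20)])
labels = [0] * 20 + [1] * 20

observed = auc(scores[:20], scores[20:])
null = null_auc_distribution(scores, labels)
mean_null = sum(null) / len(null)
# Permutation p-value with the usual +1 correction.
p_value = (1 + sum(a >= observed for a in null)) / (1 + len(null))
print(observed, mean_null, p_value)
```

The shuffled-label AUCs cluster around 0.5, so an observed AUC far in the upper tail is unlikely to arise by chance, which is the argument the figure makes.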