Sandeep K Mody1, Govindan Rangarajan2,3.
Abstract
Conventional Vector Autoregressive (VAR) modelling methods applied to high dimensional neural time series data result in noisy solutions that are dense or have a large number of spurious coefficients. This reduces the speed and accuracy of auxiliary computations downstream and inflates the time required to compute functional connectivity networks by a factor that is at least inversely proportional to the true network density. As these noisy solutions have distorted coefficients, thresholding them as per some criterion, statistical or otherwise, does not alleviate the problem. Thus obtaining a sparse representation of such data is important since it provides an efficient representation of the data and facilitates its further analysis. We propose a fast Sparse Vector Autoregressive Greedy Search (SVARGS) method that works well for high dimensional data, even when the number of time points is relatively low, by incorporating only statistically significant coefficients. In numerical experiments, our method shows high accuracy in recovering the true sparse model. The relative absence of spurious coefficients permits accurate, stable and fast evaluation of derived quantities such as power spectrum, coherence and Granger causality. Consequently, sparse functional connectivity networks can be computed, in a reasonable time, from data comprising tens of thousands of channels/voxels. This enables a much higher resolution analysis of functional connectivity patterns and community structures in such large networks than is possible using existing time series methods. We apply our method to EEG data, where computed network measures and community structures are used to distinguish emotional states, as well as to ADHD fMRI data, where it is used to distinguish children with ADHD from typically developing children.
Year: 2022 PMID: 35508638 PMCID: PMC9068763 DOI: 10.1038/s41598-022-10459-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Network measures computed. See [38] for an explanation of these measures.
| Measure type | Measure |
|---|---|
| Scalar | (1) Edge Count, (2) Matrix Density, (3) Characteristic Path Length, (4) Global Efficiency, (5) Transitivity, (6) Assortativity-in-in, (7) Assortativity-in-out, (8) Assortativity-out-in, (9) Assortativity-out-out |
| Node-wise weighted | (1) Strengths-in, (2) Strengths-out, (3) Strengths-total, (4) Local Efficiencies, (5) Participation-in, (6) Participation-out, (7) Clustering Coefficients, (8) Closeness Centralities-in, (9) Closeness Centralities-out, (10) Closeness Centralities-total, (11) Eigenvector Centralities-in, (12) Eigenvector Centralities-out, (13) Sub-graph Centralities, (14) Node-Betweenness Centralities |
| Binary weighted | (1) Degrees-in, (2) Degrees-out, (3) Degrees-total, (4) Flow Coeffs, (5) Flow Path Counts, (6) Coreness Centralities |
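To make a few of the scalar measures in the table concrete, here is a minimal sketch for a directed network stored as a 0/1 adjacency matrix. It is an illustration only: the function names, the toy network, and the use of binary (unweighted) definitions are assumptions, not the implementation used in the paper or in [38].

```python
from collections import deque

def edge_count(adj):
    """Number of directed edges (non-zero off-diagonal entries)."""
    n = len(adj)
    return sum(1 for i in range(n) for j in range(n) if i != j and adj[i][j])

def matrix_density(adj):
    """Fraction of the n*(n-1) possible directed edges that are present."""
    n = len(adj)
    return edge_count(adj) / (n * (n - 1))

def global_efficiency(adj):
    """Mean inverse shortest-path length over all ordered node pairs.

    Assumes a binary directed graph; unreachable pairs contribute 0.
    """
    n = len(adj)
    total = 0.0
    for src in range(n):
        # Breadth-first search gives hop counts from src.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if u != v and adj[u][v] and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for node, d in dist.items() if node != src)
    return total / (n * (n - 1))

# Hypothetical toy network with edges 0->1, 1->2, 2->0, 2->3.
toy = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
]

print(edge_count(toy), matrix_density(toy), global_efficiency(toy))
```

The weighted variants listed in the table would replace hop counts with weight-derived path lengths; that step is omitted here.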
Percent Spurious = number of spurious predictions per 100 true non-zero coefficients, averaged over all simulations for each system size.
| Method / number of variables | 35 | 100 | 300 | 500 |
|---|---|---|---|---|
| SVARGS | 12.8 | 7.0 | 5.6 | 5.4 |
| Lasso | 450.6 | 586.0 | 763.1 | 836.0 |
Figure 1. (a,c) (left column) Percent Unpredicted (false negative rate) = (Number of true non-zero coefficients not in the fitted model)/(Number of true non-zero coefficients) for and respectively, averaged over 50 samples per data length. (b,d) (right column) Time taken in seconds (single processor core) to complete the fit for and respectively, averaged over 50 samples per data length.
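The two recovery metrics reported here, Percent Spurious and Percent Unpredicted, can be sketched as set operations on the supports (positions of non-zero coefficients) of the true and fitted models. The function names, the `(lag, row, column)` indexing, and the toy supports below are assumptions made for illustration; both metrics are expressed per 100 true non-zero coefficients.

```python
def percent_spurious(true_support, fitted_support):
    """Spurious (false positive) coefficients per 100 true non-zero coefficients."""
    spurious = fitted_support - true_support
    return 100.0 * len(spurious) / len(true_support)

def percent_unpredicted(true_support, fitted_support):
    """True non-zero coefficients missing from the fit, per 100 true ones."""
    missed = true_support - fitted_support
    return 100.0 * len(missed) / len(true_support)

# Hypothetical supports: each tuple is (lag, row, column) of a non-zero coefficient.
true_support = {(1, 0, 0), (1, 0, 1), (1, 1, 1), (2, 1, 0)}
fitted_support = {(1, 0, 0), (1, 0, 1), (1, 1, 1), (1, 1, 0)}

print(percent_spurious(true_support, fitted_support))     # 25.0
print(percent_unpredicted(true_support, fitted_support))  # 25.0
```

Because both metrics share the same denominator, Percent Spurious can exceed 100 when a dense fit contains more false positives than there are true coefficients, as in the Lasso row of the table.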
Binary cross-validated classification accuracy on the DEAP data set. The first row lists the proportion of the majority class for each emotion and sets the baseline accuracy. The second row shows, for each emotion, the cross-validated accuracy obtained by Koelstra et al. [49]. The third row shows the accuracies obtained using all the time-domain features obtained from our SVARGS method, listed in section “DEAP features via SVARGS and CGC”. The fourth row is the same as the third except with the addition of the peripheral features. The fifth row shows the accuracy obtained with the SVARGS network scalar (NS-spect) and network edge band spectral features (NE-spect), and the last row shows the accuracy with NS-spect, NE-spect and the peripheral features. The feature group abbreviations are listed in section “DEAP features via SVARGS and CGC”. We used the AdaBoost-SAMME algorithm for classification, with 8-fold cross-validation aggregated over 8 runs. The accuracies shown for our methods are the average over 10 such meta runs of the modal accuracy in the last 75 out of 225 rounds of AdaBoost-SAMME. Standard deviations, where available, are indicated in brackets.
| Method | Valence | Arousal | Liking | Dominance | Familiarity |
|---|---|---|---|---|---|
| 1. Baseline | 56.56 | 58.91 | 66.95 | 62.11 | 56.64 |
| 2. Koelstra et al. | 57.60 | 62.00 | 55.40 | NaN | NaN |
| 3. SVARGS: time domain | 74.76 (0.53) | 66.47 (0.74) | 70.92 (0.43) | 70.41 (0.55) | 69.70 (0.40) |
| 4. SVARGS: time domain and PERI | 75.13 (0.64) | 66.55 (0.66) | 69.75 (0.61) | 70.60 (0.55) | 69.88 (0.42) |
| 5. SVARGS: NS-spect+NE-spect | 75.15 (0.52) | 67.46 (0.55) | 70.40 (0.45) | 72.01 (0.41) | 69.94 (0.67) |
| 6. SVARGS: NS-spect+NE-spect and PERI | 75.71 (0.57) | 67.82 (0.53) | 70.47 (0.63) | 71.60 (0.32) | 69.73 (0.50) |
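Two of the quantities in this table can be sketched in a few lines: the majority-class baseline (row 1) and the "modal accuracy in the last 75 out of 225 rounds". The sketch reads "modal accuracy" as the most frequent accuracy value among the final boosting rounds; that reading, the rounding to two decimals, the function names, and the toy data are all assumptions rather than the paper's code.

```python
from collections import Counter

def majority_baseline(labels):
    """Accuracy (in percent) of always predicting the most common class."""
    counts = Counter(labels)
    return 100.0 * max(counts.values()) / len(labels)

def modal_accuracy(per_round_accuracy, last=75, ndigits=2):
    """Most frequent accuracy value among the final `last` boosting rounds.

    Values are rounded to `ndigits` so that near-identical floats are
    counted together; ties go to the value seen first.
    """
    tail = [round(a, ndigits) for a in per_round_accuracy[-last:]]
    return Counter(tail).most_common(1)[0][0]

# Hypothetical binary labels: 3 of 5 trials belong to the majority class.
labels = [1, 1, 1, 0, 0]
print(majority_baseline(labels))  # 60.0

# Hypothetical accuracy trace over 225 boosting rounds.
trace = [50.0] * 150 + [70.0] * 40 + [71.0] * 35
print(modal_accuracy(trace))  # 70.0
```

Taking the mode over a window of late rounds, rather than the value at a single round, makes the reported figure less sensitive to round-to-round fluctuation in the boosted classifier.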
Number of subjects per site.
| Site | BEIJING | KKI | NEURO. | NYU | OHSU | WASHU | PITTSBUR. | BROWN |
|---|---|---|---|---|---|---|---|---|
| TDC | 116 | 58 | 22 | 91 | 40 | 66 | 33 | 0 |
| ADHD | 78 | 20 | 17 | 97 | 30 | 0 | 0 | 0 |
| Total | 194 | 78 | 39 | 188 | 70 | 66 | 33 | 0 |
Binary (TDC/ADHD) classification accuracy of different methods on the ADHD200 data set. The table shows cross-validated accuracies on the original training set and the accuracies obtained on the original holdout set. Standard deviations are shown in brackets. The first column shows the best accuracies obtained in the ADHD-200 Global Competition [68–70] using fMRI only. The second column shows the ADHD-200 Global Competition accuracies using personal characteristic data (PCD) only; the best accuracies in the competition were obtained using only the PCD. The third and fourth columns show the accuracy using SVARGS+CGC features (4.2.4) and the AdaBoost-SAMME algorithm for classification, with fMRI only and fMRI+PCD respectively. The last column lists the proportion of the majority class in each set and sets the baseline accuracy. The accuracies shown for our methods are the average over 10 meta runs. The accuracy on each meta run is the aggregate over 8 runs of the modal accuracy in the last 75 out of 225 rounds of 12-fold cross-validation. The accuracy on the holdout set is the average over 20 runs at the 50th round of AdaBoost-SAMME.
| | ADHD200 Comp. (fMRI only) | ADHD200 Comp. (PCD only) | SVARGS (fMRI only) | SVARGS (fMRI + PCD) | Baseline |
|---|---|---|---|---|---|
| ADHD Training Set | 70.7 (6.2) | 75.0 (4.5) | 74.2 (1.2) | 78.2 (0.37) | 64.2 |
| ADHD Holdout Set | 60.5 | 69.0 | 65.8 (0.8) | 73.4 (0.85) | 55.0 |