Literature DB >> 35601598

Brain simulation augments machine-learning-based classification of dementia.

Paul Triebkorn1,2,3, Leon Stefanovski1,2, Kiret Dhindsa1,2, Margarita-Arimatea Diaz-Cortes1,2, Patrik Bey1,2, Konstantin Bülau1,2, Roopa Pai1,2,4, Andreas Spiegler1,2,5, Ana Solodkin6, Viktor Jirsa3, Anthony Randal McIntosh7, Petra Ritter1,2,4.   

Abstract

Introduction: Computational brain network modeling using The Virtual Brain (TVB) simulation platform acts synergistically with machine learning (ML) and multi-modal neuroimaging to reveal mechanisms and improve diagnostics in Alzheimer's disease (AD).
Methods: We enhance large-scale whole-brain simulation in TVB with a cause-and-effect model linking local amyloid beta (Aβ) positron emission tomography (PET) with altered excitability. We use PET and magnetic resonance imaging (MRI) data from 33 participants of the Alzheimer's Disease Neuroimaging Initiative (ADNI3) combined with frequency compositions of TVB-simulated local field potentials (LFP) for ML classification.
Results: The combination of empirical neuroimaging features and simulated LFPs significantly outperformed the classification accuracy of empirical data alone by about 10% (weighted F1‐score: empirical 64.34% vs. combined 74.28%). Informative features showed high biological plausibility regarding the AD‐typical spatial distribution.
Discussion: The cause‐and‐effect implementation of local hyperexcitation caused by Aβ can improve the ML‐driven classification of AD and demonstrates TVB's ability to decode information in empirical data using connectivity‐based brain simulation.
© 2022 The Authors. Alzheimer's & Dementia: Diagnosis, Assessment & Disease Monitoring published by Wiley Periodicals, LLC on behalf of Alzheimer's Association.

Keywords:  Alzheimer's disease; The Virtual Brain; machine learning; multi‐scale brain simulation; positron emission tomography

Year:  2022        PMID: 35601598      PMCID: PMC9107774          DOI: 10.1002/trc2.12303

Source DB:  PubMed          Journal:  Alzheimers Dement (N Y)        ISSN: 2352-8737


INTRODUCTION

Alzheimer's disease (AD) is a health problem with broad impact on a patient's personal life, as well as on our aging society. However, early diagnosis remains a challenge, and knowledge of the underlying disease mechanisms is still incomplete. Besides the two hallmark proteins amyloid beta (Aβ) and tau, other involved factors have been identified, such as impairment of the blood–brain barrier, synaptic dysfunction, network disruption, mitochondrial dysfunction, neuroinflammation, and genetic risk factors. While Aβ and tau are widely accepted as core features, their mutual interaction and their interaction with other factors are incompletely understood. Comprehensive knowledge of this multifactorial interaction in the pathogenesis of AD is crucial for further therapeutic strategies, including recent developments of potentially disease‐modifying anti‐Aβ therapy with aducanumab.

The Virtual Brain (TVB, www.thevirtualbrain.org) is an open‐source platform for modeling and simulating large‐scale brain networks using personalized structural connectivity models. TVB enables model‐based inference of the neurophysiological mechanisms across different brain scales that are involved in the generation of macroscopic neuroimaging signals, including functional magnetic resonance imaging (MRI), electroencephalography (EEG), and magnetoencephalography. Moreover, TVB facilitates the reproduction and evaluation of individual brain configurations by using subject‐specific data. In this study, we make use of virtual local field potentials (LFPs) from simulated brain data from a recent experiment with TVB. In our previous work, we integrated individual Aβ patterns obtained from positron emission tomography (PET) with the Aβ‐binding tracer 18F‐AV‐45 into the brain model. Subsequently, distinct spectral patterns in simulated LFPs and EEG could be observed for patients with AD, mild cognitive impairment (MCI), and healthy control (HC) subjects (Figure 1).
This integration was done by transferring the local concentration of Aβ to a variation in the brain model's local excitation–inhibition balance. The result was a shift from alpha to theta rhythms in AD patients, whose spatial pattern resembled that of local hyperexcitation in core structures of the brain network. The frequency shift was reversible by applying “virtual memantine,” that is, virtual N‐methyl‐D‐aspartate (NMDA) antagonistic drug therapy. An overview of the study results is provided in Figure 1.
FIGURE 1

Modified from Stefanovski et al. Aβ‐PET‐driven brain simulation model of AD. (A) Regional PET intensity constrains regional parameters: a sigmoidal transfer function translates the regional Aβ load to changes in the excitation‐inhibition balance. (B) Virtual AD patient brains exhibited significantly slower simulated LFPs than MCI and HC virtual brains and showed a shift from the alpha to the theta frequency range. While the AD group is dominated solely by two clusters in the alpha and theta bands, the HC and MCI groups have an additional strong cluster exhibiting no oscillations (frequency of zero), called a stable focus. This phenomenon is absent in the AD group; the stable focus in HC and MCI virtual brains thus provides an additional, simulation‐inferred, distinctive criterion between groups. Although a correlation between high Aβ burden and lower LFP frequencies was observed only in the AD group, the spatial distribution of this LFP slowing is additionally determined by network characteristics. Moreover, the observed slowing was spatially associated with local hyperexcitation. The graph in (C) represents the SC, wherein node size reflects the degree, while color corresponds to the relative postsynaptic potential (relative to the mean postsynaptic potential of the simulation). The graph indicates that local hyperexcitation occurs in central parts of the network. Aβ, amyloid beta; AD, Alzheimer's disease; HC, healthy controls; LFP, local field potential; MCI, mild cognitive impairment; MRI, magnetic resonance imaging; PET, positron emission tomography; PSP, postsynaptic potential; SC, structural connectivity

AD‐specific pathologies, such as deposition of Aβ in neuritic plaques, tau deposition in neurofibrillary tangles, and atrophy of neural tissue, have been widely studied with machine learning (ML) approaches.
The major advantage of using ML‐based classification algorithms on neuroimaging data is the potential for recognizing complex, high‐dimensional, previously unknown disease patterns in the data, potentially identifying AD before clinical manifestation or predicting a disease trajectory. We further argue that the current sample size of 33 subjects is sufficient for a reliable proof of concept, considering the following three main aspects: (1) This study aims to show an information gain provided by TVB with regard to differential classification among HC, MCI, and AD populations; we do not aim to push the generalizability performance of state‐of‐the‐art ML methodologies with this sample size. This leads to a primary focus on the group‐level significance of the decoding accuracy rather than the accuracies themselves. (2) This information gain and the significance of the model performances are validated by comparing the distributions of model accuracies between feature sets and against null distributions of accuracies approximated using permutation testing. (3) As implemented in our approach, nested cross‐validation still represents the best way to estimate generalizability in the given context. In combination with the previous points, this leads to a feasible and robust estimation of the information gain. We show that TVB simulations provide additional unique diagnostic information that is not readily available from the empirical data alone. This supports the idea that TVB provides value and real‐world applicability above and beyond merely reorganizing empirical data.

MATERIALS AND METHODS

Alzheimer's Disease Neuroimaging Initiative database

Data used in the preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). The ADNI was launched in 2003 as a public–private partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial MRI, PET, other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of MCI and early AD. For up‐to‐date information, see www.adni‐info.org.

Data acquisition, processing, and brain simulation

Detailed methodology of data acquisition, selection, processing, and simulation is described in a previous study. We included 33 ADNI‐3 participants: 10 AD patients, 15 HC participants, and 8 MCI patients. The selection criteria included availability of both Aβ and tau PET, diffusion‐weighted MRI, and all MRI sequences necessary to fulfill the standards of the Human Connectome Project minimal preprocessing pipeline. The number of participants was limited by the restricted availability of all data modalities at once and of comparable scanners (only the largest subcohort, 3T Siemens scanner models, was included). In addition to the data used in our previous study, we also used the distribution of tau in 18F‐AV‐1451 PET for our analyses to obtain the best available empirical data basis. The nuclear signal intensity for both Aβ and tau PET is related to a reference volume in the cerebellum. For the subcortical volumetrics used in this study, we obtained the volumetry statistics provided by the ‐autorecon2 command. The segmentation is performed with the modified Fischl parcellation of subcortical regions in FreeSurfer. A detailed description of image processing can be found in Appendix A in supporting information. Whole‐brain simulations with TVB are based on a structural connectivity (SC) matrix derived from diffusion‐weighted MRI. After processing the empirical imaging data, we used the SC of the HC population to generate an averaged standard SC for all participants. For the simulations, we made use of the Jansen‐Rit neural mass model. Neural mass models use a mean‐field simplification to compute electrical potentials on a regional level by means of systems of oscillatory equations. The variables, parameters, and model equations can be found in Stefanovski et al. Parameter settings were chosen based on theoretical considerations in previous studies.
We explored a range of the global scaling factor G, a coefficient that scales the connections between distant brain regions, to capture different dynamic states of the simulation. The novelty in our recent simulation study was the introduction of a mechanistic model for Aβ‐driven effects. We linked local Aβ concentrations, measured by Aβ PET in 379 regions of the Glasser and Fischl parcellations, to the excitation–inhibition balance in the model by defining the inhibitory time constant τi as a sigmoidal function of local Aβ burden. The simulation models electrical potentials in the whole brain, here measured on the region level as LFPs using the same 379 regions as above. In addition, we calculate the EEG signal as a projection of the LFP from within the brain to the surface of the head, using a lead‐field matrix simplification with three compartment borders: brain–skull, skull–scalp, and scalp–air. A detailed description of the simulations can be found in Appendix B in supporting information.
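The sigmoidal mapping from regional Aβ burden to the inhibitory time constant τi can be sketched as follows. All parameter values (bounds, slope, midpoint) and the function name are illustrative assumptions, not the study's published calibration:

```python
import numpy as np

def tau_i_from_abeta(suvr, tau_max=50.0, tau_min=20.0, slope=8.0, midpoint=1.4):
    """Sigmoidal transfer of regional Abeta PET SUVR to the Jansen-Rit
    inhibitory time constant tau_i (ms). Bounds, slope, and midpoint are
    illustrative placeholders, not the published calibration."""
    suvr = np.asarray(suvr, dtype=float)
    # Monotone mapping: higher Abeta burden shifts tau_i between its bounds,
    # altering the regional excitation-inhibition balance.
    return tau_min + (tau_max - tau_min) / (1.0 + np.exp(slope * (suvr - midpoint)))

# One tau_i value per region, e.g., for the 379 regions of the parcellation.
tau_i = tau_i_from_abeta(np.linspace(0.8, 2.2, 379))
```

In the actual model these regional τi values parameterize the Jansen‐Rit equations of each node before the whole‐brain simulation is run.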

Machine learning approach

Our primary objective is to determine whether extracted features from TVB add to the classifiers’ predictive power. To achieve this, we repeated the ML procedure with three different feature sets: (1) using empirical features alone, (2) using simulated features alone, and (3) using both types of features to create a combined model.

RESEARCH IN CONTEXT

Systematic Review: Machine learning has been proven to augment diagnostics of dementia in several ways. Imaging‐based approaches enable early diagnostic predictions. However, individual projections of long‐term outcome as well as differential diagnosis remain difficult, as the mechanisms behind the classifying features often remain unclear. Mechanistic whole‐brain models in synergy with powerful machine learning aim to close this gap.

Interpretation: Our work demonstrates that multi‐scale brain simulations considering amyloid beta distributions and cause‐and‐effect regulatory cascades reveal hidden electrophysiological processes that are not readily accessible through measurements in humans. We demonstrate that these simulation‐inferred features hold the potential to improve diagnostic classification of Alzheimer's disease.

Future Directions: The simulation‐based classification model needs to be tested for clinical usability in a larger cohort with an independent test set, either with another imaging database or in a prospective study, to assess its capability for predicting long‐term disease trajectories.

As simulated features, we used the 379 regional LFP frequencies from the simulations of our previous study. As empirical features, we used the global average and the corresponding 379 regional values in the Glasser and Fischl parcellations for each of Aβ PET standardized uptake value ratio (SUVR) and tau PET SUVR, plus 40 subcortical volumes, leading to 800 empirical features. The combined feature space contains all of the above, with 1179 features (see the supporting information Data section for a list of all features). Given this large feature space relative to the sample size, we developed a methodology using extensive feature reduction to minimize overfitting. Two types of ML classifiers that are suitable for small‐sample classification problems were used: the kernel‐based support vector machine (SVM) and the decision‐tree–based random forest (RF).
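The feature counts reported above can be reproduced with simple arithmetic (region and feature numbers taken from the text):

```python
import math

n_regions = 379                       # Glasser cortical + FreeSurfer subcortical regions
simulated = n_regions                 # regional simulated LFP frequencies
abeta = n_regions + 1                 # regional Abeta SUVRs plus the global average
tau = n_regions + 1                   # regional tau SUVRs plus the global average
volumes = 40                          # subcortical volumes
empirical = abeta + tau + volumes     # 800 empirical features
combined = empirical + simulated      # 1179 features in the combined space
max_selected = math.isqrt(combined)   # RF selection cap: floor(sqrt(P)) = 34
```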
By training two classifiers based on different underlying ML mechanisms, we provide more robust evidence that the pattern in classification performance when combining simulated and empirical features is reliable and clinically relevant, and that this pattern is driven by a reliably recurring subset of the features themselves rather than by the particular mechanisms underlying one classification algorithm. Our main results use a hybrid classification approach in which an RF is used for feature selection, to take advantage of its ability to select features based on interactions among many features in an interpretable way, and an SVM is used for classification, due to its relative reliability in small‐sample non‐linear classification problems. The number of features selected by the RF is restricted to a maximum of 34, the square root of the total feature number (P = 1179). To validate our hybrid classification approach, we ran experiments using either the RF or the SVM alone as comparisons. These results, along with additional details of the methodology, are presented in Appendix C in supporting information. In summary, they show a significant improvement in classification performance using the hybrid approach over either individual classifier. Our ML approach is primarily designed to satisfy two goals: (1) providing a robust, reproducible, and accurate evaluation of classification performance with the data; and (2) facilitating exploration of the empirical and simulated features that are most important for achieving optimal separation between the AD, MCI, and HC groups. To satisfy the first goal, we implemented a strict nested cross‐validation scheme that allows us to obtain statistically reliable classification performance metrics while minimizing overfitting in a P >> N setting (i.e., a small sample size N but a very large number of features P).
Our cross‐validation method is adapted from earlier work in ML for clinical neuroscience, and is described in greater detail in Figure 2.
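A minimal sketch of one outer‐loop pass of the hybrid scheme, using scikit‐learn and random toy data in place of the study's 33‐subject feature matrix (all hyperparameters are illustrative; the real pipeline nests a hyperparameter search inside 100 such outer iterations):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import RobustScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(33, 1179))   # toy stand-in: 33 subjects x 1179 features
y = rng.integers(0, 3, size=33)   # toy labels: 0=HC, 1=MCI, 2=AD

# Outer-loop split: 25% of subjects held out for testing.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Median/IQR scaling fit on the training set only, then applied to the test set.
scaler = RobustScaler().fit(X_tr)
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

# RF (entropy criterion) ranks features; keep the top sqrt(1179) = 34.
rf = RandomForestClassifier(criterion="entropy", n_estimators=100,
                            random_state=0).fit(X_tr, y_tr)
top = np.argsort(rf.feature_importances_)[::-1][:34]

# SVM classifies on the reduced feature set.
svm = SVC(kernel="rbf").fit(X_tr[:, top], y_tr)
score = svm.score(X_te[:, top], y_te)
```

On random data a score near chance is expected; the point of the sketch is the order of operations (split, scale on train only, select, classify), not the number it produces.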
FIGURE 2

Nested cross‐validation loop design. In this hybrid classification scheme, random forest (RF; feature selection) and support vector machine (SVM; classification) are used jointly in both the inner and outer loop of a nested cross‐validation design. Starting in the outer loop: stochastic cross‐validation runs 100 iterations, each using 25% of the data (randomly selected per iteration without taking age or sex into account) for testing. After the train–test split, the training subset goes to the inner loop. In the inner loop: the data are split again, as in the outer loop, to obtain a training set and a validation set for 10 inner cross‐validation iterations per hyperparameter setting (192 combinations for the RF and 384 for the SVM, that is, 73,728 joint combinations, each evaluated over 10 iterations). Next, we scale the training features by subtracting the median and dividing by the interquartile range (making them robust to the outliers identified above). The scaling statistics calculated from the training set are also applied to the test set. Then, we iterate through the hyperparameters (Tables SC.1 and SC.2 in supporting information). The RF is used for feature selection; afterward, the remaining features are used to train the SVM classifier with the specific hyperparameter setting. We track the selected features for each run and compute the frequency with which they are selected across iterations of the outer loop. The SVM classifications are validated with the validation subset (inner cross‐validation). This provides optimized hyperparameter settings from the inner cross‐validation loop. Back in the outer loop, we recombine the training and validation data (which were separated in the inner loop), still keeping the test data separate. We set the hyperparameters to the best settings obtained in the inner loop. Then, we train the model and record the results: the RF is again used for feature selection, which yields the feature importance (FI) statistics used for the results.
Afterward, the SVM classifies on the remaining features and is validated with the test set (outer cross‐validation). After this, the next iteration of the outer loop begins.
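The combinatorics of the hyperparameter search work out as follows (grid sizes taken from the caption; Tables SC.1 and SC.2 hold the actual grids):

```python
rf_grid = 192                 # RF hyperparameter combinations (Table SC.1)
svm_grid = 384                # SVM hyperparameter combinations (Table SC.2)
joint = rf_grid * svm_grid    # 73,728 joint RF x SVM combinations
inner_evals = joint * 10      # each joint setting scored over 10 inner CV iterations
```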

We satisfy the second goal in two ways. First, our cross‐validation scheme provides a natural metric for feature relevance, that is, feature selection frequency across cross‐validation runs. Additionally, we use feature importance metrics inherent to each feature selection method explored. In our case, the F‐statistic and the entropy criterion were the two metrics used for feature selection for the SVM and the RF, respectively. Currently, the most reliable method for statistical control of prediction accuracy is permutation testing. To this end, we performed the same classification pipeline, including all feature preprocessing, feature selection, and cross‐validation steps, using randomly shuffled class labels. This was repeated 750 times to achieve a robust estimate of the null model as an approximation for the inherent prediction error of the model and chance classification results. A detailed technical description of the ML methodology can be found in Appendix C.
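The permutation test can be sketched as below; `run_pipeline` is a hypothetical stand‐in for the full classification pipeline (preprocessing, feature selection, cross‐validation), which the study reruns on label‐shuffled data:

```python
import numpy as np

def permutation_null(run_pipeline, X, y, n_perm=750, seed=0):
    """Null distribution of scores from label-shuffled reruns of the
    entire pipeline. run_pipeline(X, y) -> score is a hypothetical hook."""
    rng = np.random.default_rng(seed)
    return np.array([run_pipeline(X, rng.permutation(y)) for _ in range(n_perm)])

def permutation_p(observed, null_scores):
    # One-sided p-value with the +1 correction so p can never be exactly zero.
    return (1 + np.sum(null_scores >= observed)) / (1 + len(null_scores))
```

Crucially, the labels are shuffled *before* feature selection, so the null model absorbs any selection bias of the pipeline itself.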

RESULTS

Data properties

We used basic descriptive statistics to assess data quality prior to the ML analysis. The distributions of simulated LFP frequencies, Aβ PET SUVR, tau SUVR, and regional volumes, and their interdependencies, are shown in Figure 3. Aβ (P = 0.002) and tau SUVR (P < 0.001) are significantly different between AD and HC after Bonferroni correction. LFP frequency differs significantly between AD and MCI (P = 0.032) but not after Bonferroni correction. We do not see significant differences in overall brain volume (AD vs. MCI [P = 0.706], AD vs. HC [P = 0.510], HC vs. MCI [P = 0.141]), but we observe a tendency toward ventricle enlargement and significant hippocampal atrophy in AD.
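The groupwise tests can be reproduced in outline with SciPy; the values below are synthetic SUVRs for illustration, not the ADNI data:

```python
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(1)
# Synthetic per-subject mean tau SUVRs (group sizes as in the study).
ad = rng.normal(1.6, 0.2, 10)
hc = rng.normal(1.2, 0.1, 15)

stat, p = kruskal(ad, hc)          # pairwise Kruskal-Wallis test
bonferroni_alpha = 0.05 / 15       # 15 tests -> threshold ~0.003, as in Figure 3
significant = p < bonferroni_alpha
```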
FIGURE 3

Characteristics of the empirical feature space. A, Regional distributions of Aβ, tau, and LFP frequency are shown for all groups in a 3D scatterplot. Red data points symbolize regions of AD patients, green points MCI patients, and blue points HC. Each scatter point represents one region of one subject. Color density is normalized between groups. A kernel density estimate of the corresponding histograms is shown (projection of the 3D plot onto one axis). In particular, a string of outliers with very high tau values can be seen in the AD group and partly in the MCI group, which does not appear for HC. Moreover, AD participants' regions show higher Aβ values, in particular at lower frequencies. In addition, boxplots are presented for groupwise comparisons of the features mean Aβ per subject, mean tau per subject, mean simulated LFP frequency per subject, and mean volume per subject. A Kruskal–Wallis test was performed to assess significance: * marks significance with P < 0.05; ** marks significance after Bonferroni correction with P < 0.003 (for 15 tests). B, Aβ SUVR is significantly different between AD and HC (P = 0.002) and between AD and MCI (P = 0.045), but not between HC and MCI (P = 0.811). C, Tau SUVR is significantly different only between AD and HC (P < 0.001), but not between AD and MCI (P = 0.174) or HC and MCI (P = 0.267). D, LFP frequency is significantly different only between AD and MCI (P = 0.032), but not between AD and HC (P = 0.216) or HC and MCI (P = 0.472). E, The mean volume of all regions (including, e.g., ventricles and white matter) does not show significant differences (as expected, because of volume shifts between parenchyma and CSF); we therefore explored the data regarding ventricle enlargement and hippocampal atrophy. Although we see a tendency for both in the AD group, only the difference in hippocampal volume reaches significance, between AD and HC. Ventricle volumes: HC vs. MCI (P = 0.056), HC vs. AD (P = 0.116), MCI vs. AD (P = 0.910).
Hippocampal volumes: HC vs. MCI (P = 0.556), HC vs. AD (P = 0.003), MCI vs. AD (P = 0.144). Aβ, amyloid beta; AD, Alzheimer's disease; CSF, cerebrospinal fluid; HC, healthy controls; LFP, local field potential; MCI, mild cognitive impairment; MRI, magnetic resonance imaging; PET, positron emission tomography; PSP, postsynaptic potential; SC, structural connectivity; SUVR, standardized uptake value ratio


Classification performance

Overall, we performed nine experiments spanning three classification schemes and three feature sets (see Appendix D in supporting information). The hybrid classification scheme with SVM and RF performed best. For all schemes, the combined feature space outperformed both the empirical and the simulated feature spaces (Table SD.1 in supporting information). The results of the hybrid classification approach are given below. Weighted F1‐scores (wF1) and normalized confusion matrices are given in Figure 4. The combined approach (wF1 = 0.743) outperformed the empirical one (wF1 = 0.643) by about 0.1 (Figure 4D), mainly because of an improvement in the classification of the MCI group (Figure 4A–C). We used the Wilcoxon signed rank test on the 100 cross‐validation runs to assess significance (the Shapiro–Wilk test of normality for the wF1 distributions yielded P < 0.001 for the empirical and combined approaches and P = 0.070 for the simulated approach, motivating a nonparametric test). The differences between the combined approach and both individual approaches (empirical and simulated) were highly significant with P < 0.001, while there was no significant difference between the empirical and simulated approaches (P = 0.340). Additionally, the hybrid classification approach outperformed the SVM‐only approach (wF1 = 0.718) and the RF‐only approach (wF1 = 0.670) on the combined features.
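The paired comparison of wF1 distributions can be sketched as follows, with synthetic scores standing in for the 100 recorded cross‐validation runs:

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(2)
# Synthetic paired wF1 scores over 100 CV runs (illustrative values only,
# constructed so the combined approach scores ~0.1 higher per run).
wf1_combined = rng.normal(0.74, 0.05, 100)
wf1_empirical = wf1_combined - rng.normal(0.10, 0.03, 100)

# Wilcoxon signed rank test on the paired per-run differences.
stat, p = wilcoxon(wf1_combined, wf1_empirical)
```

Because the same 100 train–test splits underlie both feature sets, a paired (signed rank) test is the appropriate choice rather than an unpaired comparison.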
FIGURE 4

Results of the nested cross‐validation classification approach. A–C, Confusion matrices are computed by summing the confusion matrices across all 100 cross‐validation runs and normalizing per class. In particular, the combined approach improved the prediction of MCI participants, as AD and HC were already quite well distinguishable by the empirical features. D, Boxplots of mean weighted F1‐scores for the three feature spaces. The combined approach (wF1 = 0.743) outperformed the empirical one (wF1 = 0.643) by about 0.1. Significance was assessed with the Wilcoxon signed rank test on the 100 cross‐validation runs: combined versus empirical, P < 0.001; combined versus simulated, P < 0.001; empirical versus simulated, P = 0.340. AD, Alzheimer's disease; HC, healthy controls; MCI, mild cognitive impairment


Classification validity

As a further analysis to understand this classification improvement, we calculated the feature importance. Figure 5A shows the mean entropy‐based feature importance given by the RF classifier for 100 outer cross‐validation runs. This is used to show that there is a decreasing curve, as we would expect if meaningful features were found (as opposed to a more uniform distribution). Many of the more important features seem to be biologically plausible in the context of AD (Figures 5B and 6, full list in supporting information Data).
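The averaged, normalized importance curve in Figure 5A corresponds to a computation like the following (the function name is our own):

```python
import numpy as np

def mean_feature_importance(importances_per_run):
    """Average RF feature importances over cross-validation runs and
    renormalize so that all features sum to one."""
    imp = np.asarray(importances_per_run, dtype=float).mean(axis=0)
    return imp / imp.sum()

# Toy example: importances from two CV runs over three features.
fi = mean_feature_importance([[0.6, 0.3, 0.1],
                              [0.4, 0.4, 0.2]])
```

Sorting `fi` in descending order gives the decreasing curve of Figure 5A; a roughly uniform curve would instead indicate that no informative subset of features was found.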
FIGURE 5

Feature importance (FI) distribution. A, Mean random forest (RF)‐derived feature importance from 100 outer cross‐validation runs. Entropy criterion with combined feature types shown here. Feature importance values are normalized, so all features sum to one. In shaded blue, half standard deviation is displayed for each feature. B, Top 50 features across all cross‐validation runs. Both empirical (tau in dark blue, amyloid beta [Aβ] in green, volume in light blue), as well as simulated frequencies (red), contributed to the improved classification. Many features moreover seem biologically plausible in the context of Alzheimer's disease (AD), for example, tau in entorhinal cortex (Braak stage 1), thalamic dysfunction (as significant rhythm generator), and volumes in hippocampus (as signs of atrophy). C, Visualization of the structural connectivity (SC) graph with color indicating FI of the regional local field potential (LFP) frequencies, while vertex diameter reflects the structural degree. It shows a network dependency of the LFP FI. Only edges with connection strength above the 95th percentile are shown. D, The distributions of weighted F1 scores for the permutation‐based null model (left box) and corresponding true model (right box). All models significantly outperform the null model, with the combined model showing the greatest average distance to its null model, indicating the gain in differentiating information.

FIGURE 6

Anatomical representation of feature importance (FI) distribution. Displayed are cortical regions from left, right, and inferior views as well as subcortical regions. The color indicates the FI. A, Aβ FI. The anatomical patterns reveal high importance of left temporal regions, as well as the left dorsal stream in the parietal and occipital cortex. The Aβ top features showed a more disseminated allocation, mostly in the temporal, occipital, frontal, and insular cortices, which is in line with typical amyloid deposition and locations of increased AV‐45 uptake in AD. B, Tau features show a similar pattern to Aβ, but with a stronger focus on typical Braak stage 1 regions (such as the entorhinal cortex). Most of the tau top features can be allocated to the temporal lobe, which is also the location of early tau deposition according to the neuropathological Braak and Braak stages I–III, and the location of increased in vivo binding of 18F‐AV‐1451 in AD. In particular, the entorhinal cortex is a consistent starting point of the sequential spread of tau through the brain, and also showed the most robust relationship between flortaucipir and memory scores in a recent machine learning study. C, Simulated frequencies do not show as strong a laterality as the empirical features but seem to have a focus in both occipital lobes, where alpha oscillations typically occur. The occipito‐temporal and occipito‐parietal regions are typical alpha‐rhythm generators in the resting‐state electroencephalogram. Alteration of these posterior alpha sources is a typical phenomenon in AD and MCI compared to HC. The ventral or “what” stream and the dorsal or “where” stream have been implicated in object recognition and spatial localization and are essential for accurate visuospatial navigation. Impairment in visuospatial navigation is a potential cognitive marker in early AD/MCI that could be more specific than episodic memory or attention deficits.
Besides this, subcortical areas such as the thalami play a more crucial role here than for Aβ and tau. Aβ, amyloid beta; AD, Alzheimer's disease; FI, feature importance; HC, healthy controls; MCI, mild cognitive impairment.

We also showed that feature relevance is dependent on the structural degree of the regions in the underlying SC network (Figure 5C).
This is an indicator of network effects contributing to the improved classification and another indicator of meaningful classification results. Using the Wilcoxon signed rank test, we could further show that the classification performance was significantly higher than the null model (P < 0.001 for all three approaches). The average performance of the combined approach showed the greatest distance to its corresponding null model, lying outside the full range of the null distribution (Figure 5D).
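The paired Wilcoxon comparison across cross-validation runs can be sketched with SciPy. The score vectors below are illustrative random draws centered on the reported means, not the actual per-run results:

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(42)
# Hypothetical per-run weighted F1 scores for 100 CV runs; the real
# values come from the trained classifiers, these are illustrative.
wf1_combined = rng.normal(0.74, 0.03, size=100)
wf1_empirical = rng.normal(0.64, 0.03, size=100)
wf1_null = rng.normal(0.33, 0.03, size=100)  # ~chance for 3 classes

# Paired non-parametric test across the same cross-validation runs.
_, p_comb_vs_emp = wilcoxon(wf1_combined, wf1_empirical)
_, p_comb_vs_null = wilcoxon(wf1_combined, wf1_null)
```

The signed rank test is appropriate here because the same outer cross-validation splits are used for each feature space, making the per-run scores paired observations.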

DISCUSSION

In this study, we show that the inclusion of virtual, simulated TVB features into ML classification can lead to an improved classification among HC, MCI, and AD. The diagnostic value of the underlying empirical features can be improved by integrating them into a multi‐scale brain simulation framework in TVB. We showed an improvement in classification performance when combining the empirical and the simulation‐derived features; the absolute gain in accuracy was about 10%. Keeping in mind that all differences between the subjects have to be derived from their Aβ PET signal (because all other factors, e.g., the underlying SC, are the same), this provides evidence that TVB is able to decode the information that is contained in empirical data such as amyloid PET. More specifically for PET and its usage in diagnostics, it highlights the relevance of spatial distribution, which is often not considered in its analysis.

The main reason for this improvement seems to be a better classification of MCI subjects. Without the simulated features, the models frequently misclassify MCI subjects as HC. In contrast, the simulated features alone result in more misclassification of HCs as either MCI or AD subjects compared to using the empirical features alone. However, combining the empirical features with the simulated features appears to complement their strengths in a clinically useful way; these models retain all or most of the ability to correctly classify healthy controls with the empirical features and retain much of the simulated features’ ability to classify MCI patients. The processing inside TVB seems to reorganize the existing data beneficially. In theory, a larger number of available features could provide an ML algorithm greater flexibility in finding useful combinations, simply due to a higher degree of freedom during feature selection and weighting.
However, the equal empirical data foundation (only PET as individual features) in combination with a nested cross‐validation method protects against overfitting bias due to the larger feature space, with additional evidence of this provided by the chance‐level performance of the null distributions. If the explanation for the improvement in classification accuracy were simply the presence of additional noisy features, we would see a flatter feature importance distribution than shown in Figure 5, and therefore a more random distribution of selected features across the 100 cross‐validation iterations. Instead, we see that only a few features with high importance are consistently guiding classification, indicating that they in fact provide useful discriminative information. Preventing this kind of overfitting via feature selection is a key motivation behind our use of the nested cross‐validation approach (Figure 2): because the features are selected on the training and validation (test) set in the inner loop, any overfitting due to feature selection should not be transferred to the test set in the outer loop. We have shown that only a few selected features seem to play a crucial role in classification throughout the cross‐validation iterations and that these features play a biologically plausible role in the context of AD (Figures 5 and 6). As a limitation of our study, we see that the simulated feature used, the mean simulated LFP frequency (averaged across a wide range of the large‐scale coupling parameter G), is not directly equivalent to a biophysical measurement such as an empirically measured LFP. G scales the strength of long‐range connections in the brain network model and is a crucial factor in the simulation. Many different dynamics can develop across the dimension of G, of which some are similar to empirically observed phenomena, but others are not.
Our previous work found that particular ranges of G with non‐plausible frequency patterns hold the potential to differentiate between diagnosis groups. This is mainly because of the underlying mathematics of the Jansen‐Rit model: besides two limit cycles that produce alpha‐like and theta‐like activity, the local dynamic model has a region of stable focus wherein no oscillations are produced in the absence of noise. Technically, this stable focus is represented as a zero‐line artifact that appears mainly in the HC group, because only Aβ values above a critical value led to the presence of the slower theta limit cycle. By averaging LFP frequencies across the whole spectrum of G, we incorporate this zero‐line information, which leads to apparently higher mean LFP frequencies for the AD group compared to non‐AD groups. In contrast, in the region of biologically plausible results, AD has lower frequencies, as would be expected. This can be seen as another advantage of TVB: it does not just reproduce data that could also be obtained with EEG or intracranial electrodes, but delivers “artificial” data that are still informative. While particular parameter ranges deliver biologically plausible results, other (less plausible) parameter settings provide unique individual patterns and can contribute to the classification. This work's primary aim is not to develop a ready‐to‐use ML classifier for AD, but to show the potential of brain simulation to enhance empirical datasets in clinically relevant ways. While the limited sample size used in this study would be problematic in a more traditional ML study aimed at providing an ML‐based diagnostic aid, combined with our careful cross‐validation methodology it does not detract from our primary conclusion. Future studies will have to reproduce these results using a more extensive cohort before clinical usage of this work.
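The averaging effect of the stable-focus "zero line" described above can be illustrated with a minimal NumPy sketch. The signals below are hypothetical stand-ins for simulated LFPs at three values of G (alpha-like, theta-like, and a flat stable-focus trace), not TVB output:

```python
import numpy as np

fs = 1000  # Hz, assumed sampling rate
t = np.arange(0, 2, 1 / fs)

def dominant_frequency(signal, fs):
    """Peak of the amplitude spectrum; 0 Hz for a flat (stable-focus) trace."""
    sig = signal - signal.mean()
    if np.allclose(sig, 0):
        return 0.0  # zero-line artifact: no oscillation present
    spec = np.abs(np.fft.rfft(sig))
    freqs = np.fft.rfftfreq(len(sig), 1 / fs)
    return freqs[np.argmax(spec)]

# Hypothetical LFPs across three G values.
lfps = [np.sin(2 * np.pi * 10 * t),  # alpha-like (10 Hz)
        np.sin(2 * np.pi * 5 * t),   # theta-like (5 Hz)
        np.zeros_like(t)]            # stable focus (zero line)

# Averaging across G folds the 0 Hz entries into the mean frequency.
mean_freq = np.mean([dominant_frequency(x, fs) for x in lfps])
```

Because the zero-line traces enter the mean as 0 Hz, a group with more stable-focus regimes (here, HC) ends up with a lower mean frequency than a group dominated by theta-like oscillations, reproducing the apparent inversion described above.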
Ideally, external validation with a dataset outside of ADNI would be performed. We used ML as an approach for comparing classifier performance with empirical data against simulated data, which is wholly derived from the empirical data. Improvement in classification is then strong evidence for successful processing of the empirical data in TVB: TVB decodes information embedded within the empirical data which cannot be detected by statistics or ML classifiers alone. We showed in ADNI data that TVB can derive additional information from the spatial distribution pattern in PET images. Our work provides novel evidence that TVB can act as a biophysical brain model and not just as a black box. Complex multi‐scale brain simulation in TVB can lead to additional information that goes beyond the implemented empirical data. Our analysis of feature importance supports this hypothesis, as the features with the highest relevance are already well‐known AD factors and hence biologically plausible surrogates for clinically relevant information in the data. Moreover, in this pilot study, we demonstrate that TVB simulation can improve the diagnostic value of empirical data and might become a clinically relevant tool.
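The nested cross-validation with inner-loop feature selection discussed above can be sketched with scikit-learn. Placing selection inside a `Pipeline` ensures it is refit on each training fold, so outer test folds never influence which features are chosen. Data shapes and selector choice (`SelectKBest` with `f_classif`) are illustrative assumptions, not the study's exact configuration:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(1)
# Hypothetical data: 60 subjects x 30 features, three classes.
X = rng.normal(size=(60, 30))
y = rng.integers(0, 3, size=60)

# Feature selection inside the pipeline is refit per inner training
# fold, preventing selection-induced leakage into the outer test set.
pipe = Pipeline([
    ("select", SelectKBest(f_classif, k=10)),
    ("rf", RandomForestClassifier(n_estimators=50, criterion="entropy",
                                  random_state=0)),
])
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(pipe, X, y, cv=outer, scoring="f1_weighted")
```

With purely random labels as here, the weighted F1 scores should hover around chance, which is exactly the behavior the permutation-based null models verify.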

CONFLICTS OF INTEREST

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The disclosures are based on the disclosure form of the International Committee of Medical Journal Editors (ICMJE). PR, ARM, and VJ report the following patent application: McIntosh AR, Mersmann J, Jirsa VK, Ritter P. Method and Computing System for Modeling a Primate Brain. Patent Application 137PCT1754. VJ reports stock or stock options in Virtual Brain Technologies (VB‐Tech). VB‐Tech performs activities in the domain of brain simulation; there is no relation to the field of dementia, nor to the content of the manuscript. All other authors, namely PT, LS, KD, MD, PB, KB, RP, ASp, and ASo, have nothing to declare.

AUTHOR CONTRIBUTIONS

All authors have made substantial intellectual contributions to this work and approved it for publication. PT and LS contributed equally to this work. Particular roles according to CRediT: Paul Triebkorn: conceptualization, data curation, investigation, methodology, visualization, writing – original draft. Leon Stefanovski: conceptualization, formal analysis, investigation, methodology, visualization, writing – original draft. Kiret Dhindsa: formal analysis, methodology, software, writing – review and editing. Margarita‐Arimatea Diaz‐Cortes: methodology, software, writing – review and editing. Patrik Bey: methodology, software, writing – review and editing. Konstantin Bülau: validation, writing – review and editing. Roopa Pai: data curation, writing – review and editing. Andreas Spiegler: methodology, writing – review and editing. Ana Solodkin: writing – review and editing. Viktor Jirsa: writing – review and editing. Anthony Randal McIntosh: writing – review and editing. Petra Ritter: conceptualization, funding acquisition, methodology, project administration, supervision, writing – review and editing. SUPPORTING INFORMATION (APPENDICES A–D). SUPPORTING DATA.