
Data driven diagnostic classification in Alzheimer's disease based on different reference regions for normalization of PiB-PET images and correlation with CSF concentrations of Aβ species.

Francisco Oliveira¹, Antoine Leuzy², João Castelhano¹, Konstantinos Chiotis², Steen Gregers Hasselbalch³, Juha Rinne⁴, Alexandre Mendonça⁵, Markus Otto⁶, Alberto Lleó⁷, Isabel Santana⁸, Jarkko Johansson⁹, Sarah Anderl-Straub⁶, Christine Arnim⁶, Ambros Beer¹⁰, Rafael Blesa⁷, Juan Fortea⁷, Herukka Sanna-Kaisa¹¹, Erik Portelius¹², Josef Pannee¹², Henrik Zetterberg¹³, Kaj Blennow¹², Ana P Moreira¹, Antero Abrunhosa¹, Agneta Nordberg¹⁴, Miguel Castelo-Branco¹⁵.

Abstract

Positron emission tomography (PET) neuroimaging with Pittsburgh Compound B (PiB) is widely used to assess amyloid plaque burden. Standard quantification approaches normalize PiB-PET by mean cerebellar gray matter uptake. Previous studies suggested similar pons and white-matter uptake in Alzheimer's disease (AD) and healthy controls (HC), but lacked an exhaustive comparison of normalization across the three regions with data-driven diagnostic classification. We aimed to compare the impact of distinct reference regions in normalization, measured by data-driven statistical analysis, and correlation with cerebrospinal fluid (CSF) amyloid β (Aβ) species concentrations. 243 individuals with clinical diagnosis of AD, HC, mild cognitive impairment (MCI) and other dementias, from the Biomarkers for Alzheimer's/Parkinson's Disease (BIOMARKAPD) initiative were included. PiB-PET images and CSF concentrations of Aβ38, Aβ40 and Aβ42 were submitted to classification using support vector machines. Voxel-wise group differences and correlations between normalized PiB-PET images and CSF Aβ concentrations were calculated. Normalization by cerebellar gray matter and pons yielded identical classification accuracy of AD (accuracy 96%, sensitivity 96%, specificity 95%), significantly higher than that of Aβ concentrations (best accuracy 91%). Normalization by the white matter showed decreased extent of statistically significant multivoxel patterns and was the only method not outperforming CSF biomarkers, suggesting statistical inferiority. Aβ38 and Aβ40 correlated negatively with PiB-PET images normalized by the white matter, corroborating previous observations of correlations with non-AD-specific subcortical changes in white matter. In general, when using the pons as reference region, higher voxel-wise group differences and stronger correlations with Aβ42 and the Aβ42/Aβ40 or Aβ42/Aβ38 ratios were found compared to normalization based on cerebellar gray matter.

Year: 2018 PMID: 30186764 PMCID: PMC6120605 DOI: 10.1016/j.nicl.2018.08.023

Source DB: PubMed Journal: Neuroimage Clin ISSN: 2213-1582 Impact factor: 4.881

Introduction

Positron emission tomography (PET) imaging with the 11C-Pittsburgh Compound B (PiB) tracer is currently used in many nuclear medicine imaging centers to visualize in vivo amyloid plaques in the brain, which represent a core molecular feature of Alzheimer's disease (AD) (Hardy & Selkoe, 2002). The binary assessment of PiB-PET images, abnormal (amyloid-positive) versus normal (amyloid-negative), can be done by examining tracer uptake in cortical regions of interest. While the most commonly used approach, from a clinical standpoint, is the visual assessment of summated concentration images, quantitative approaches can also be applied; the most common of these is the standardized uptake value ratio (SUVR), which consists of normalizing uptake within target regions to that within a reference region. A global cut-off can then be applied to determine whether the PiB image is positive or negative. Quantitative assessment increases the accuracy and confidence of the visual readings and is also useful for longitudinal studies and clinical trials. The cerebellar gray matter has been widely used as reference region since its amyloid accumulation has been shown not to differ significantly between healthy controls (HC) and AD patients (Price et al., 2005; Klunk et al., 2004). Another biomarker extensively used in the clinical diagnosis of AD is the cerebrospinal fluid (CSF) concentration of amyloid-β (Blennow et al., 2012; Rosén et al., 2013). It is well known that, in AD patients, the concentration of amyloid-β42 (Aβ42) in the CSF is generally decreased (Olsson et al., 2016), concurrently with elevated brain retention of amyloid tracers such as PiB or 18F-florbetapir (Leuzy et al., 2016; Johnson et al., 2013; Mattsson et al., 2014).

Mild cognitive impairment (MCI) may often represent a prodromal stage of AD, with a conversion rate to dementia due to AD of about 10% to 25% per year, while healthy elderly progress at a rate of approximately 1% to 2% per year (Grand et al., 2011). MCI patients who are PiB amyloid-positive are very likely cases of prodromal AD, while patients with MCI who are PiB amyloid-negative are less likely to represent a prodromal stage and to undergo conversion to AD (Wolk et al., 2009; Okello et al., 2009; Jack et al., 2010).

Our main goal was to assess, using multivariate approaches, whether the cerebellar gray matter is the best choice, from a clinical point of view, as reference region when compared with the pons or subcortical white matter, since these two areas were also found to have similar PiB retention in AD patients and HC subjects (Klunk et al., 2004). To decide which approach would be the best option, we investigated: 1) the ability to discriminate clinically diagnosed patients using voxel-wise statistical analysis, 2) the voxel-wise correlation with the CSF Aβ concentrations (providing both clinical and biological agreement), and 3) the accuracy in data-driven classification between clinically defined AD patients and HC or patients with other non-AD dementias. A secondary goal was to compare, using data-driven classification methods, the classification accuracy of PiB using the SUVR against the accuracy achieved using the CSF concentrations of Aβ38, Aβ40 and Aβ42 and their normalized values as assessed by the Aβ42/Aβ38 and Aβ42/Aβ40 ratios determined in the same central laboratory.

Methods

Dataset

The dataset used in this study has been described elsewhere (Leuzy et al., 2016) and is summarized in Table 1. It consists of 243 subjects from seven European academic centers belonging to the Biomarkers for Alzheimer's and Parkinson's Disease (BIOMARKAPD) initiative. It contains five groups of subjects: HC, patients with AD, patients with MCI, patients with frontotemporal dementia (FTD) and patients with vascular dementia (VaD). PiB-PET acquisition protocols varied across sites. In all cases a late summation image was considered, with post-injection intervals of 40 to 60 min (n = 101), 40 to 70 min (n = 31), 50 to 70 min (n = 24) and 60 to 90 min (n = 87). PiB-PET images were classified locally by a nuclear medicine physician as either positive (abnormal) if there was high binding in cortical regions, or negative (normal) if there was predominantly white matter binding. All PiB-PET images had an isotropic voxel size of 2 mm. Local Aβ42 values were classified as positive (abnormal) or negative (normal) using an optimal cut-off of 557 pg/ml (Zwan et al., 2016). Local Aβ42 concentrations were measured using a commercially available sandwich ELISA (INNOTEST, Fujirebio-Europe) with a similar protocol across sites. The centrally harmonized measures used in this study are described below.

Table 1

Summary of demographics, clinical and locally measured biomarkers according to the diagnostic group.

	AD (n = 122)	MCI (n = 81)	FTD (n = 20)	VaD (n = 7)	HC (n = 13)
Age, years	65 (59, 72)	64 (58, 71)	64 (59, 73)	61 (52, 74)	67 (58, 71)
Sex, M:F	50:72	37:44	9:11	3:4	6:7
MMSE, points	23 (20, 26)	27 (26, 28)	23 (20, 27)	26 (20, 29)	29 (28, 30)
PiB visual, positive	113	50	3	0	1
Aβ42, positive	96	46	8	5	1
CSF-PiB, months	2.4 (0.7, 5.2)	4.0 (1.8, 8.4)	2.0 (1.1, 4.0)	3.5 (2.8, 6.1)	1.8 (1.3, 7.4)

Age, MMSE and CSF-PiB are reported as median (quartile 1, quartile 3), CSF-PiB is the time between the CSF collection and the PiB-PET exam.

Patients were assessed according to standard local clinical routines, and all diagnoses were made by a multidisciplinary team using a consensus-based approach. Patients with AD fulfilled the 1984 National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer's Disease and Related Disorders Association (NINCDS-ADRDA) criteria for probable AD dementia (McKhann et al., 1984). MCI patients were diagnosed according to the criteria of Petersen et al. (1999), FTD patients according to the criteria of Neary et al. (1998), and VaD patients according to the National Institute of Neurological Disorders and Stroke - Association Internationale pour la Recherche et l'Enseignement en Neurosciences (NINDS-AIREN) criteria for vascular dementia (Román et al., 1993). The HC subjects were recruited from relatives and caregivers of patients. Inclusion criteria were absence of memory or other cognitive complaints; independence in basic and instrumental daily life activities; and no discernible neurological or psychiatric disease.

All participants, or caregivers when appropriate, gave written informed consent to participate in the research, which was conducted according to the Declaration of Helsinki and subsequent revisions. Ethical approval was obtained from local regional ethics committees.

CSF concentration values used in this study were centrally obtained. Aβ42 concentrations were obtained using the reference measurement procedure (RMP) by liquid chromatography (LC) tandem mass spectrometry (MS) (MS-RMP), while Aβ38 and Aβ40 concentrations were obtained by a fully validated LC-MS method (Leinenbach et al., 2014). Aβ38, Aβ40 and Aβ42 were also analyzed using the MSD V-PLEX Aβ Peptide Panel 1 (4G8) kit (Meso Scale Diagnostics, Rockville, MD, USA), following the manufacturer's protocol. Samples from the local centers were sent for analysis to the Clinical Neurochemistry Laboratory, Gothenburg University, Mölndal, Sweden. Technical measurement protocols are described elsewhere (Leuzy et al., 2016).

PiB-PET image pre-processing

Before further processing, all images were non-linearly spatially normalized to the Montreal Neurological Institute (MNI) T1 MRI template using Statistical Parametric Mapping 8 (SPM8), as described elsewhere (Leuzy et al., 2016). The spatial normalization was based solely on the PiB-PET images. All spatially normalized images were visually inspected and the registration was fine-tuned when necessary. The SUVR was computed at the voxel level for all images using three different reference regions: cerebellar gray matter, pons and subcortical white matter, yielding what we refer to as SUVRCER, SUVRPONS and SUVRWM, respectively. All three masks were defined on the ICBM152 T1 MRI template and then shrunk by at least 4 mm all around to reduce the influence of partial volume effects and of imperfections in the image registration process. The cerebellar gray matter mask is essentially the cerebellum without the cerebellar peduncles. Fig. 1 illustrates the masks used as reference regions. Note that, since the PiB-PET images were spatially normalized to MNI space, the pons, cerebellum and white matter are also in MNI space, and the masks defined on the T1 MRI template can be applied directly to the spatially normalized PiB-PET images.
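
The voxel-level SUVR described above amounts to dividing every voxel by the mean uptake inside the chosen reference mask. The following is a minimal sketch of that step, assuming a flattened image and a boolean mask; the function name `suvr_image` and the toy values are hypothetical and are not taken from the study's pipeline.

```python
def suvr_image(uptake, reference_mask):
    """Divide every voxel by the mean uptake inside the reference region.

    uptake: flat list of voxel uptake values (hypothetical flattened image).
    reference_mask: flat list of booleans, True for reference-region voxels.
    """
    ref_values = [u for u, in_ref in zip(uptake, reference_mask) if in_ref]
    if not ref_values:
        raise ValueError("reference mask selects no voxels")
    ref_mean = sum(ref_values) / len(ref_values)
    if ref_mean <= 0:
        raise ValueError("mean reference uptake must be positive")
    return [u / ref_mean for u in uptake]


# Toy example (made-up numbers): six voxels, the last two are reference voxels.
uptake = [3.0, 2.4, 1.8, 1.2, 1.0, 1.4]
pons_mask = [False, False, False, False, True, True]
suvr_pons = suvr_image(uptake, pons_mask)
```

Swapping the mask (cerebellar gray matter, pons or white matter) is the only change needed to obtain the three SUVR variants from the same spatially normalized image.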

Fig. 1

Illustration of the reference regions used. Subcortical white matter is painted red, cerebellar gray is painted green and pons is painted blue.

Voxel-wise assessment of the SUVR differences

Voxel-wise group differences were evaluated using analysis of variance (ANOVA) using Statistical Parametric Mapping 12 (SPM12) following smoothing with a Gaussian kernel with a full width at half maximum (FWHM) of 12 mm. Post hoc pairwise comparisons were made using the Student t-test. To address the multiple comparisons issue, significance was only ascribed to regions with voxel-level p < .001.
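
For readers who want to reproduce the smoothing outside SPM, the 12 mm FWHM can be converted to the standard deviation of the Gaussian kernel with the usual relation sigma = FWHM / (2 * sqrt(2 * ln 2)). The sketch below assumes the kernel is applied on the 2 mm isotropic grid mentioned in the Dataset section; the helper name `fwhm_to_sigma` is hypothetical.

```python
import math


def fwhm_to_sigma(fwhm):
    """Convert a Gaussian full width at half maximum to a standard deviation."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


sigma_mm = fwhm_to_sigma(12.0)   # about 5.1 mm
sigma_voxels = sigma_mm / 2.0    # assuming 2 mm isotropic voxels
```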

Voxel-wise correlation between SUVR and CSF Aβ concentrations

Correlations between the CSF measures (Aβ38, Aβ40 and Aβ42 concentrations and the Aβ42/Aβ38 and Aβ42/Aβ40 ratios, measured with MSD and MS-RMP) and voxel-wise SUVRCER, SUVRPONS and SUVRWM were computed after smoothing the SUVR images with a Gaussian kernel with a FWHM of 12 mm. Parametric correlation maps (positive and negative) were computed on the full cohort of subjects together. Multiple comparisons were handled as in the previous section.
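
A voxel-wise correlation map of this kind can be sketched as one correlation coefficient per voxel, computed across subjects. The example below assumes that "parametric" refers to Pearson's r, which the text does not state explicitly; the function names and the toy values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when one variable is constant
    return sxy / math.sqrt(sxx * syy)


def voxelwise_correlation(suvr_by_subject, csf_values):
    """Return one r value per voxel; suvr_by_subject[s][v] is subject s, voxel v."""
    n_voxels = len(suvr_by_subject[0])
    return [
        pearson_r([subject[v] for subject in suvr_by_subject], csf_values)
        for v in range(n_voxels)
    ]


# Toy data (made-up numbers): four subjects, two voxels each.
suvr_by_subject = [[1.0, 1.1], [1.5, 1.3], [2.0, 1.0], [2.5, 1.2]]
csf_abeta42 = [900.0, 700.0, 500.0, 300.0]
r_map = voxelwise_correlation(suvr_by_subject, csf_abeta42)
```

In the toy data the first voxel rises as the CSF value falls, giving a perfectly negative correlation, while the second voxel is uncorrelated with the CSF value.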

Comparison of the automatic classification accuracies

To assess the ability to differentiate clinically defined AD (a postmortem neuropathological gold standard was not available across sites) from HC or other dementias (OD), five sets of features were extracted from the data: (1) Aβ38, Aβ40, Aβ42, Aβ42/Aβ38 and Aβ42/Aβ40 based on MSD; (2) Aβ38, Aβ40, Aβ42, Aβ42/Aβ38 and Aβ42/Aβ40 based on MS-RMP; (3) voxel-wise SUVRCER; (4) voxel-wise SUVRPONS; and (5) voxel-wise SUVRWM. The goal was to set up an automatic classification approach to decide whether a subject's data belong to the AD group or not. Since the HC, FTD and VaD groups included a small number of individuals compared with the AD group, and accumulation of amyloid plaques in the brain is not a feature of these three groups, we opted to pool them into a single dataset referred to as HC/OD. Thus, the binary classification is AD versus HC/OD.

We used support vector machines (SVM) (Chang & Lin, 2011) as the classification technique. This technique can be divided into two steps: in the first (learning/training step), a mathematical model, i.e. a decision function, that best separates the training dataset is built using optimization techniques; in the second (test step), the model built in the first step is used to classify new data. Based on the patient's features, the decision function assigns a score to the patient. Leave-one-out cross-validation (LOOCV) was used to estimate the performance of the classifiers: one subject's data are held out to be classified while the remaining subjects' data are used to train the classifier. This procedure is repeated until every subject's data have been classified once by a classifier built from the remaining subjects' data. The accuracy, sensitivity and specificity are then computed from the results of these successive classification tests.

Since the two groups of subjects are unbalanced, more weight is given to the HC/OD cases than to the AD cases in the optimization process: the weight of the HC/OD cases is 122/40 times the weight of the AD cases. Thus, instead of converging to the maximal accuracy, the optimizer tends to converge to the maximal balanced accuracy. Since there is a high correlation among neighboring voxels, the SUVR images were resampled into 8 mm isotropic voxels before the voxel-wise SUVR was used in the classifier. For classification, we considered only the uptake of PiB in the brain cortex, defining an anatomical mask to select only the voxels that belong to this region. The resampled voxels were then used as features (voxel-as-feature approach) (Oliveira & Castelo-Branco, 2015). Statistical comparison of the classifiers' accuracies was done using Cochran's Q test followed by the McNemar test as a post hoc procedure (IBM SPSS Statistics 20).
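
The leave-one-out loop and the class weighting can be illustrated with a small dependency-free sketch. To keep it self-contained, the SVM is replaced here by a deliberately simple nearest-class-mean rule on a single feature; this stand-in is an assumption made for illustration and is not the classifier used in the study. All function names and the toy feature values are hypothetical.

```python
def nearest_mean_classifier(train_x, train_y):
    """Stand-in for the SVM: assign the label whose class mean is closest."""
    means = {}
    for label in set(train_y):
        values = [x for x, y in zip(train_x, train_y) if y == label]
        means[label] = sum(values) / len(values)
    return lambda x: min(means, key=lambda label: abs(x - means[label]))


def leave_one_out(features, labels, fit):
    """Classify each subject with a model trained on all other subjects."""
    predictions = []
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        predict = fit(train_x, train_y)
        predictions.append(predict(features[i]))
    return predictions


def balanced_accuracy(labels, predictions, positive="AD"):
    """Mean of sensitivity and specificity."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == positive and p == positive)
    tn = sum(1 for y, p in pairs if y != positive and p != positive)
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = len(labels) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)


# Class weight from the group sizes given in the text (122 AD, 40 HC/OD).
weight_hcod = 122 / 40  # 3.05 times the weight of an AD case

# Toy one-feature data (made-up mean cortical SUVR values).
features = [2.1, 1.9, 2.3, 2.0, 1.4, 1.1, 1.2, 1.5]
labels = ["AD", "AD", "AD", "AD", "AD", "HC/OD", "HC/OD", "HC/OD"]
predictions = leave_one_out(features, labels, nearest_mean_classifier)
score = balanced_accuracy(labels, predictions)
```

In the toy data the fifth subject (clinically AD, low uptake) is misclassified, so sensitivity is 4/5, specificity is 3/3 and the balanced accuracy is 0.9. Weighting the smaller class by the ratio of group sizes makes an error in either group cost the same share of the objective, which is why the optimizer is pushed toward balanced accuracy rather than raw accuracy.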

Results

Voxel-wise differences of the PiB SUVR among groups

Voxel-wise ANOVA showed a statistically significant difference (voxel-level p < .001) in PiB uptake among the four defined groups of subjects in all cortical areas using the SUVRCER and SUVRPONS, and in most of the cortical mantle using the SUVRWM; some regions, however, were not detected with the latter. Post hoc t-tests showed that, using any of the three reference regions, there was a statistically significant difference between the AD and HC groups, and between AD and OD, across almost all cortical voxels. In the comparison of AD vs MCI, the differences were slightly higher using SUVRCER and SUVRPONS than using SUVRWM. In the comparison of MCI and HC, statistically significant differences were observed in a larger cluster of areas in the statistical maps using the SUVRPONS than using the SUVRCER and SUVRWM, which failed to capture occipitoparietal regions (Fig. 2). Although the brain areas with significant differences are smaller using the SUVRWM than using the SUVRPONS, there are clusters with higher t-values using the SUVRWM. Also in Fig. 2, a small cluster can be observed in the pons using the SUVRPONS for group comparison. Since it lies on the border between regions with very different uptake and the t-values are not very high, this cluster is very likely a false positive, representing a typical border effect.

Fig. 2

Regions where the SUVR of the MCI patients is significantly higher than the SUVR of the HC subjects. From the left to the right, voxel-wise t-value obtained using the SUVRCER, SUVRPONS and SUVRWM. Note that the latter misses large clusters of cortical regions, in particular in occipitoparietal and temporal regions, with a similar pattern for SUVRCER. Only the SUVRPONS captures the whole cortical mantle.

Voxel-wise correlation between CSF Aβ concentrations and PiB SUVR

We performed a voxel-wise correlation analysis between the SUVR images of all subjects and the CSF Aβ concentrations and their ratios. Table 2 presents a summary of the observed patterns of correlation. In general, the correlations were slightly stronger using the concentrations measured by the MSD than measured by the MS-RMP methods.

Table 2

		SUVR_cer	SUVR_pons	SUVR_wm
MSD	Aβ₃₈	(+)WC: ventricles and brainstem	(+)WC: ventricles	(+)WC: ventricles
	Aβ₃₈	(−)NS	(−)NS	(−)WC: parietal lobe
	Aβ₄₀	(+)WC: ventricles and brainstem	(+)WC: part of ventricles	(+)WC: part of ventricles
	Aβ₄₀	(−)NS	(−)NS	(−)WC: part of parietal lobe
	Aβ₄₂	(+)WC-MC: brainstem	(+)NS	(+)WC-MC: brainstem
	Aβ₄₂	(−)WC: all brain cortex	(−)MC: all brain cortex	(−)WC-MC: all brain cortex
	Aβ₄₂/Aβ₃₈	(+)WC: brainstem	NS	(+)MC: brainstem
	Aβ₄₂/Aβ₃₈	(−)MC: all brain cortex	(−)MC-SC: all brain cortex	(−)MC: all brain cortex
	Aβ₄₂/Aβ₄₀	(+)WC: brainstem	(+)NS	(+)MC: brainstem
	Aβ₄₂/Aβ₄₀	(−)MC: all brain cortex	(−)MC-SC: all brain cortex	(−)MC-SC: all brain cortex
MS-RMP	Aβ₃₈	(+)WC: ventricles and brainstem	(+)WC: ventricles	(+)WC: part of ventricles
	Aβ₃₈	(−)NS	(−)NS	(−)WC: part of parietal lobe
	Aβ₄₀	(+)WC: ventricles and brainstem	(+)NS	(+)NS
	Aβ₄₀	(−)NS	(−)NS	(−)WC: part of parietal lobe
	Aβ₄₂	(+)WC: brainstem	(+)NS	(+)WC: brainstem
	Aβ₄₂	(−)WC: all brain cortex	(−)MC: all brain cortex	(−)WC-MC: all brain cortex
	Aβ₄₂/Aβ₃₈	(+)WC: brainstem	(+)NS	(+)MC: brainstem
	Aβ₄₂/Aβ₃₈	(−)MC: all brain cortex	(−)MC: all brain cortex	(−)WC-MC: all brain cortex
	Aβ₄₂/Aβ₄₀	(+)WC: brainstem	(+)NS	(+)MC: brainstem
	Aβ₄₂/Aβ₄₀	(−)MC: all brain cortex	(−)MC-SC: all brain cortex	(−)WC-MC: all brain cortex

Summary of the statistically significant correlation patterns found between the CSF Aβ concentrations and the PiB SUVR normalized by the three reference regions. NS - no significant correlation, or significant only in a small cluster (fewer than 100 voxels); WC - weak correlation (0.2 < |r| ≤ 0.4); MC - moderate correlation (0.4 < |r| ≤ 0.7); SC - strong correlation (0.7 < |r| ≤ 0.9); (+) - positive correlation; (−) - negative correlation.

When comparing the whole-brain correlations as a function of the reference region used for normalization of the PiB-PET images, we found a weak positive correlation between the CSF Aβ concentration and the SUVRCER and SUVRWM in the ventricles and/or brainstem but not with the SUVRPONS. Aβ42 and the Aβ42/Aβ38 and Aβ42/Aβ40 ratios showed a moderate to strong negative (as expected) correlation with SUVRCER and SUVRPONS in all cortical regions, and in most but not all of the cortical mantle with the SUVRWM, suggesting that the latter is indeed less sensitive. Aβ38 and Aβ40 correlated significantly and negatively (albeit with a small effect size) with the SUVRWM in part of the parietal lobe, while they did not significantly correlate with the SUVRCER and SUVRPONS. Fig. 3 shows a comparison of the voxel-wise statistically significant correlation between Aβ42/Aβ40 measured by the MSD and the SUVR for all three reference regions. Images for the other correlations are available as supplementary figures.

Fig. 3

Voxel-wise statistically significant correlation between MSD Aβ42/Aβ40 and SUVRCER, SUVRPONS and SUVRWM, respectively. Correlation was computed for the entire dataset. Note that parts of the SUVRWM maps lack a correlation pattern.

Classification accuracy

Results from the assessment of the classification accuracies using the LOOCV (bearing in mind the limitation that only the clinical diagnosis, not a neuropathological gold standard, was available) are given in Table 3 and Fig. 4. Cochran's Q test showed a statistically significant difference (p < .001) among the accuracies achieved in differentiating clinical AD from HC/OD using all sets of features. Post hoc tests were made using the McNemar test; the resulting p-values are shown in Table 4. The classification accuracies obtained using the SUVRCER or SUVRPONS are significantly higher than the accuracies obtained using the CSF concentration features. The classification accuracy obtained using the SUVRWM was inferior only at a trend level to the classification accuracies obtained using the SUVRPONS or SUVRCER.

Table 3

	Accuracy	Sensitivity	Specificity	Balanced accuracy
CSF measured with MS-RMP	88.3	91.8	77.5	84.7
CSF measured with MSD	90.7	93.4	82.5	88.0
SUVR_WM	93.8	95.1	90.0	92.5
SUVR_CER	95.7	95.9	95.0	95.5
SUVR_PONS	95.7	95.9	95.0	95.5

Fig. 4

Values of the decision functions obtained from the SVM classifiers. Values were obtained during the accuracy assessment using the LOOCV strategy. In all these cases, a positive value means that the case is more compatible with the AD patients than with the other conditions. A negative value means the opposite.

Table 4

P-values for the post hoc pairwise accuracies comparison using the McNemar test. Please note that all CSF measures were taken into account as classification features.

	CSF measured with MSD	SUVR_WM	SUVR_CER	SUVR_PONS
CSF measured with MS-RMP	0.289	0.035	0.002	0.002
CSF measured with MSD		0.227	0.021	0.021
SUVR_WM			0.250	0.250
SUVR_CER				1
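
The post hoc comparison can be sketched with an exact (binomial) two-sided McNemar test, which depends only on the discordant cases, i.e. subjects classified correctly by one classifier and incorrectly by the other. The study ran the test in SPSS and does not report the discordant counts; the counts below are assumptions chosen to be consistent with the reported accuracies (155 versus 143 and 155 versus 147 correct out of 162) and with the single case in which the CSF classifiers were right and the SUVR classifiers wrong. The function name `mcnemar_exact` is hypothetical.

```python
from math import comb


def mcnemar_exact(b, c):
    """Exact two-sided McNemar p-value.

    b: cases classifier A got right and classifier B got wrong.
    c: cases classifier B got right and classifier A got wrong.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant cases, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Assumed discordant counts (not reported in the paper).
p_cer_vs_msrmp = mcnemar_exact(13, 1)  # SUVR_CER vs CSF measured with MS-RMP
p_cer_vs_msd = mcnemar_exact(9, 1)     # SUVR_CER vs CSF measured with MSD
p_cer_vs_wm = mcnemar_exact(3, 0)      # SUVR_CER vs SUVR_WM
```

Under these assumed counts the p-values round to 0.002, 0.021 and 0.250, matching the corresponding entries of Table 4, although other combinations of discordant counts cannot be ruled out from the published figures alone.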

Table 3 gives the cross-validation classification results for the differentiation between clinically defined AD and HC/OD using the SVM classifiers; accuracy, sensitivity, specificity and balanced accuracy are given in percentages, and all CSF measures were used as classification features. SUVRPONS and SUVRCER provided exactly the same accuracies, correctly classifying 155 out of 162 cases (95.7%) (Fig. 4). Of the seven misclassified cases, four were clinically diagnosed as AD but all five classifiers indicated that they are not AD patients, suggesting that future work should focus on neuropathological validation. In two cases, the patients were diagnosed as FTD but all five classifiers indicated that they are more likely to have AD, again suggesting that gold standard and clinical discrimination issues remain to be solved. Finally, the last case was clinically diagnosed as AD and classified as AD by the classifiers based on the CSF concentrations, but classified as non-AD by the classifiers based on SUVR. This was the only case in which the classifiers based on CSF concentrations agreed with the clinical diagnosis while the classifiers based on the SUVR did not, suggesting that the latter are usually more consistent with clinical assessment.

In this dataset of AD and HC/OD, 113 of the 122 AD patients were visually classified as PiB positive and 36 of the 40 HC/OD subjects were visually classified as PiB negative (Table 1). This represents a sensitivity of 92.6%, a specificity of 90.0% and an accuracy of 92.0%, which is inferior to the accuracy found using the cerebellar gray matter or pons as reference region (one-tailed McNemar test, p = .035).
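
The percentages above follow directly from the confusion counts. The sketch below recomputes them for the visual reads (113 of 122 AD positive, 36 of 40 HC/OD negative, as stated in the text) and for the SUVRCER and SUVRPONS classifiers; for the latter, the counts of 117 true positives and 38 true negatives are inferred from the reported 155 of 162 correct cases and the description of the seven misclassified cases, so they should be read as an assumption. The function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, specificity and balanced accuracy as fractions."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Visual reads: counts stated in the text.
visual = classification_metrics(tp=113, fn=9, tn=36, fp=4)

# SUVR_CER / SUVR_PONS: counts inferred from the reported results (assumption).
suvr = classification_metrics(tp=117, fn=5, tn=38, fp=2)

for name, result in (("visual", visual), ("SUVR_CER/PONS", suvr)):
    print(name, {key: round(100 * value, 1) for key, value in result.items()})
```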

Comparison of amyloid burden and CSF data in MCI patients

The five classifiers were applied to the data from the MCI patients to assess whether each patient's data are closer to AD than to HC/OD. The rate of MCI patients classified as AD-like was similar for all classifiers and varied between 63% and 65%. The best agreement was between the classifiers based on SUVRCER and SUVRPONS (agreement 80/81, Cohen's kappa 0.973), and the worst agreement was between the classifier based on CSF concentrations measured with the MS-RMP and the classifier based on SUVRPONS (agreement 75/81, Cohen's kappa 0.839). Comparing the classification made by the classifier based on SUVRPONS with the classifications based on the PiB visual assessment and on locally measured Aβ42 using the optimal cut-off (Zwan et al., 2016), agreements of 75/81 and 61/81 were obtained, respectively. Similar results were obtained in the comparison with the SUVRCER-based classifier.
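
Cohen's kappa corrects the raw agreement for the agreement expected by chance given each classifier's rate of AD-like calls. The sketch below shows the computation for a 2 x 2 agreement table. The individual cell counts are not reported in the text; the values used here (52 patients AD-like for both classifiers, 28 non-AD-like for both, 1 discordant) are an assumption chosen to match the reported 80/81 agreement and an AD-like rate of roughly 64%. The function name `cohens_kappa` is hypothetical.

```python
def cohens_kappa(both_pos, only_a_pos, only_b_pos, both_neg):
    """Cohen's kappa for two binary raters from the four agreement-table cells."""
    n = both_pos + only_a_pos + only_b_pos + both_neg
    observed = (both_pos + both_neg) / n
    a_pos = (both_pos + only_a_pos) / n
    b_pos = (both_pos + only_b_pos) / n
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    if expected == 1:
        return 1.0  # both raters always give the same single label
    return (observed - expected) / (1 - expected)


# Assumed cell counts for SUVR_CER vs SUVR_PONS on the 81 MCI patients.
kappa = cohens_kappa(both_pos=52, only_a_pos=0, only_b_pos=1, both_neg=28)
```

With these assumed counts the function returns a kappa of about 0.973, in line with the value quoted above.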

Discussion

In this data-driven multivariate study we investigated the impact of PiB SUVR normalization (cerebellar gray matter, pons or white matter) on the overall statistical classification of clinical diagnostic categories, and compared it with CSF Aβ measures. To test the relative value of these options for differentiating patient groups, we used an automatic data classification framework based on a dataset acquired in multiple European centers. Importantly, we also tested which of the Aβ38, Aβ40 and Aβ42 individual values or ratios correlated best with PiB-PET SUVR images. The results showed that the classification accuracies for clinically defined AD versus HC or OD based on the SUVRCER and SUVRPONS images are equal, and significantly higher than the accuracies obtained using the CSF concentrations. This allows us to conclude that the PiB-PET SUVR seems to be a promising solution for multivariate classification when compared with the CSF concentrations of multiple Aβ species, although this needs future confirmation with the neuropathological gold standard. It is, however, possible that adding Tau levels might increase CSF performance. In fact, in only one case did the classifiers based on CSF concentrations agree with the clinical diagnosis while the classifiers based on SUVRCER or SUVRPONS did not. This means that in clinical practice the use of the CSF concentrations needs reappraisal when compared to classification using the SUVRCER or SUVRPONS alone. Our finding does not mean that the CSF concentrations should not be measured; we only conclude that, for the differential diagnosis of AD alone, the CSF concentrations may be redundant if a PiB-PET acquisition is available. The CSF concentrations nevertheless contain complementary biological information, for instance Tau biomarkers, that may be relevant to the physician (Rosén et al., 2013). This tenet will, however, remain controversial without neuropathological validation.
Although the qualitative PiB-PET visual evaluation is often used to help physicians in the diagnosis of patients, the accuracies found using the cerebellar gray matter or pons as reference regions are higher than the accuracy found using the PiB-PET visual evaluation. It is important to stress that the PiB-PET images were acquired using different platforms and scanning windows, which we view as a strength, given the positive results identified in this study. On the other hand, while the CSF Aβ concentrations were measured centrally with the same assay procedures, the samples were collected at seven different clinical centers, which may have introduced variability due to differences in pre-analytical protocols (Bjerke et al., 2010). This suggests that both PiB-PET imaging and CSF biomarkers are robust in the multicenter setting as well. Our finding partially contradicts the results of Mattsson et al. (2014), who found that CSF Aβ42 and florbetapir-PET did not differ in terms of area under the curve (AUC) in the classification of AD versus HC. In this study we used more than one thousand features (the resampled SUVR voxels) to represent the PiB-PET image, which contains more information than a single value (global or regional PiB), as used in Mattsson et al. (2014). Moreover, we used a set of CSF biomarkers as features, which increases the classification accuracy compared with using a single Aβ feature at a time. Previous studies (Leuzy et al., 2016; Janelidze et al., 2016) have shown that the Aβ42/Aβ40 and Aβ42/Aβ38 ratios yield higher classification accuracy than Aβ42 alone. When we compared the classification results obtained using the CSF biomarkers from the MSD and MS-RMP methods, we found no significant difference. When the classifiers were applied to the MCI patients, good agreement among all classifiers was found.
In the worst case (CSF MS-RMP based classifier versus SUVRPONS based classifier) there was a disagreement in 6 out of 81 patients. Depending on the classifier, 63% to 65% of the MCI patients were classified as AD-like, which may lead to a different diagnosis or prognosis for these patients compared with those classified as non-AD-like. Which classifier best predicts conversion from MCI to AD is a question that only a subsequent follow-up study can answer. It is important to stress that the classifications based on the SUVRPONS and on locally measured Aβ42 disagree in 20 out of 81 cases, which is a very substantial difference. The Aβ42/Aβ38 and Aβ42/Aβ40 ratios gave higher (negative) voxel-wise correlations with PiB-PET SUVR than the Aβ42 concentration alone. These higher correlations with PiB-PET SUVR may explain why the Aβ42 ratios provided better classification results than the Aβ42 concentration (Leuzy et al., 2016; Janelidze et al., 2016). Our findings suggest that the voxel-wise correlation of the SUVR with the Aβ42/Aβ38 and Aβ42/Aβ40 ratios is slightly stronger if the pons is used to normalize the uptake than if one uses the cerebellar gray matter or the brain white matter (Fig. 3). This provides a strong biological argument in favor of this reference region. We found that Aβ38 and Aβ40 correlate weakly and negatively with the SUVRWM in part of the parietal lobe, while they do not significantly correlate with the SUVRCER and SUVRPONS. These findings may be of particular biological significance in terms of specificity. Future studies should examine how they relate to the observations of Janelidze et al. (2016), who found that Aβ38 and Aβ40 (as well as Aβ42) correlate with non-AD-specific subcortical changes such as larger lateral ventricles and white matter lesions.
We also found that, in general, the voxel-wise SUVR differences between groups of patients are higher (greater F value and larger areas) using the cerebellar gray matter and pons as reference region than using the white matter. This suggests that the latter has less power in detecting the cortical extent of early damage. Also, the difference between MCI and HC is higher (larger areas) using SUVRPONS than using the other two SUVRs. The results we obtained are consistent with those obtained using other amyloid ligands. For instance, using 18F-florbetapir, Habert et al. (2017) found that they obtained the best discrimination between HC and AD when they used a combination of the whole cerebellum and pons as reference region. Unfortunately, they did not compare the pons against the cerebellum, which precludes direct comparisons. Using the amyloid ligand 18F-flutemetamol, Thurfjell et al. (2014) found the best discrimination accuracy using the pons as reference region, compared with the whole cerebellum or only the cerebellar gray matter, on a dataset of autopsy-confirmed AD. This slight superiority of the pons over the cerebellum or cerebellar gray matter is also in agreement with the results of Klunk et al. (2004), who found that the relative difference in PiB uptake between AD and HC is smaller in the pons than in the cerebellum, which means that the PiB uptake is more stable in the pons than in the cerebellum. The main limitation of this study is the lack of an anatomical brain image per patient. Thus, the image registration process, i.e. normalization to MNI space, was done based on the PiB-PET image only. Consequently, the accuracy of the registration process is inferior to what could be achieved if a structural image such as MRI were available.
For this reason, we reduced the size of the masks to ensure, as far as possible, that for each patient each mask contains only voxels of the target brain area. Another consequence was our decision to exclude the striatal region from the mask used to extract the SUVR values for the automated classification process. In elderly patients, in whom dilatation of the ventricles is common, including the striatal region in the classification mask could mean that, for some patients, SUVR values would be collected from the ventricles rather than from the striatal region.
We used a linear SVM as the classifier model because of its simplicity, wide acceptance and proven ability to handle many common classification problems involving multivariate medical data (Oliveira & Castelo-Branco, 2015; Oliveira et al., 2018; Duarte et al., 2014; Moradi et al., 2015).
As final remarks, both PiB SUVRPONS and SUVRCER are well suited for the differential diagnosis of AD, although further studies that include a postmortem neuropathological gold standard will be important for final validation of diagnostic accuracy. Although SUVRPONS and SUVRCER led to similar classification accuracies, SUVRPONS generally showed higher t-values and a larger extent of voxel-wise differences between patient groups. This suggests that normalizing the PiB-PET uptake images by the pons may be a better option than normalizing by the cerebellar gray matter, as corroborated by studies using other ligands.
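The normalization compared throughout this discussion can be illustrated with a minimal sketch, assuming an uptake image flattened to a list of voxel values and a boolean mask per reference region. The SUVR is taken here as each voxel's uptake divided by the mean uptake inside the reference mask; the function names (`mean_uptake`, `suvr`), the masks and the numbers are hypothetical and do not reproduce the study's processing.

```python
def mean_uptake(image, mask):
    """Mean uptake over the voxels selected by a boolean reference mask."""
    values = [value for value, inside in zip(image, mask) if inside]
    if not values:
        raise ValueError("reference mask selects no voxels")
    return sum(values) / len(values)


def suvr(image, reference_mask):
    """Divide every voxel by the mean uptake in the reference region."""
    reference = mean_uptake(image, reference_mask)
    return [value / reference for value in image]


# Toy uptake image (hypothetical): 2 cortical voxels, 2 cerebellar gray matter
# voxels and 2 pons voxels, in that order.
uptake = [3.0, 2.0, 1.0, 1.0, 2.0, 2.0]
cerebellar_gm_mask = [False, False, True, True, False, False]
pons_mask = [False, False, False, False, True, True]

suvr_cer = suvr(uptake, cerebellar_gm_mask)  # normalized by cerebellar gray matter
suvr_pons = suvr(uptake, pons_mask)          # normalized by pons
```

The same uptake image yields different SUVR values depending on the reference region, which is why the choice of region can change both the voxel-wise group differences and the features passed to a classifier.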

26 in total

1. Mild cognitive impairment: clinical characterization and outcome.

Authors: R C Petersen; G E Smith; S C Waring; R J Ivnik; E G Tangalos; E Kokmen
Journal: Arch Neurol Date: 1999-03

Review 2. Frontotemporal lobar degeneration: a consensus on clinical diagnostic criteria.

Authors: D Neary; J S Snowden; L Gustafson; U Passant; D Stuss; S Black; M Freedman; A Kertesz; P H Robert; M Albert; K Boone; B L Miller; J Cummings; D F Benson
Journal: Neurology Date: 1998-12 Impact factor: 9.910

3. Machine learning framework for early MRI-based Alzheimer's conversion prediction in MCI subjects.

Authors: Elaheh Moradi; Antonietta Pepe; Christian Gaser; Heikki Huttunen; Jussi Tohka
Journal: Neuroimage Date: 2014-10-12 Impact factor: 6.556

4. Use of amyloid-PET to determine cutpoints for CSF markers: A multicenter study.

Authors: Marissa D Zwan; Juha O Rinne; Steen G Hasselbalch; Agneta Nordberg; Alberto Lleó; Sanna-Kaisa Herukka; Hilkka Soininen; Ian Law; Justyna M C Bahl; Stephen F Carter; Juan Fortea; Rafael Blesa; Charlotte E Teunissen; Femke H Bouwman; Bart N M van Berckel; Pieter J Visser
Journal: Neurology Date: 2015-10-14 Impact factor: 9.910

5. Clinical diagnosis of Alzheimer's disease: report of the NINCDS-ADRDA Work Group under the auspices of Department of Health and Human Services Task Force on Alzheimer's Disease.

Authors: G McKhann; D Drachman; M Folstein; R Katzman; D Price; E M Stadlan
Journal: Neurology Date: 1984-07 Impact factor: 9.910

6. Vascular dementia: diagnostic criteria for research studies. Report of the NINDS-AIREN International Workshop.

Authors: G C Román; T K Tatemichi; T Erkinjuntti; J L Cummings; J C Masdeu; J H Garcia; L Amaducci; J M Orgogozo; A Brun; A Hofman
Journal: Neurology Date: 1993-02 Impact factor: 9.910

7. Florbetapir (F18-AV-45) PET to assess amyloid burden in Alzheimer's disease dementia, mild cognitive impairment, and normal aging.

Authors: Keith A Johnson; Reisa A Sperling; Christopher M Gidicsin; Jeremy S Carmasin; Jacqueline E Maye; Ralph E Coleman; Eric M Reiman; Marwan N Sabbagh; Carl H Sadowsky; Adam S Fleisher; P Murali Doraiswamy; Alan P Carpenter; Christopher M Clark; Abhinay D Joshi; Ming Lu; Michel Grundman; Mark A Mintun; Michel J Pontecorvo; Daniel M Skovronsky
Journal: Alzheimers Dement Date: 2013-01-30 Impact factor: 21.566

8. Amyloid imaging in mild cognitive impairment subtypes.

Authors: David A Wolk; Julie C Price; Judy A Saxton; Beth E Snitz; Jeffrey A James; Oscar L Lopez; Howard J Aizenstein; Ann D Cohen; Lisa A Weissfeld; Chester A Mathis; William E Klunk; Steven T DeKosky
Journal: Ann Neurol Date: 2009-05 Impact factor: 10.422

Review 9. Fluid biomarkers in Alzheimer disease.

Authors: Kaj Blennow; Henrik Zetterberg; Anne M Fagan
Journal: Cold Spring Harb Perspect Med Date: 2012-09-01 Impact factor: 6.915

10. Brain beta-amyloid measures and magnetic resonance imaging atrophy both predict time-to-progression from mild cognitive impairment to Alzheimer's disease.

Authors: Clifford R Jack; Heather J Wiste; Prashanthi Vemuri; Stephen D Weigand; Matthew L Senjem; Guang Zeng; Matt A Bernstein; Jeffrey L Gunter; Vernon S Pankratz; Paul S Aisen; Michael W Weiner; Ronald C Petersen; Leslie M Shaw; John Q Trojanowski; David S Knopman
Journal: Brain Date: 2010-10-08 Impact factor: 13.501
