Agitation in Alzheimer's disease: Novel outcome measures reflecting the International Psychogeriatric Association (IPA) agitation criteria.

Adelaide De Mauleon¹, Zahinoor Ismail², Paul Rosenberg³, David Miller⁴, Christelle Cantet¹,⁵, Cedric O'Gorman⁶, Bruno Vellas¹, Constantine Lyketsos³, Maria Soto¹.

Abstract

INTRODUCTION: The 2017 European Union-North American Clinical Trials in Alzheimer's Disease Task Force recommended development of clinician-rated primary outcome measures for Alzheimer's disease (AD) agitation trials, incorporating International Psychogeriatric Association (IPA) criteria.
METHODS: In a modified Delphi process, Cohen-Mansfield Agitation Inventory (CMAI) and Neuropsychiatric Inventory-Clinician (NPI-C) items were mapped to IPA agitation domains generating novel instruments, CMAI-IPA and NPI-C-IPA. Validation in the Agitation and Aggression AD Cohort (A3C) assessed minimal clinically important differences (MCIDs), change sensitivity, and predictive validity.
RESULTS: MCID was -17 (odds ratio [OR] = 14.9, 95% confidence interval [CI] = 6.8-32.6) for CMAI; -5 (OR = 9.3, 95% CI = 4.0-21.2) for CMAI-IPA; -3 (OR = 11.9, 95% CI = 4.1-34.8) for NPI-C-A+A; and -5 (OR = 7.8, 95% CI = 3.4-17.9) for NPI-C-IPA at 3 months. Areas under the curve suggested no scale better predicted global clinician ratings. Sensitivity to change for all measures was high.
CONCLUSION: Internal consistency and reliability analyses demonstrated better accuracy for the NPI-C-IPA than for the CMAI-IPA. The novel measures can be used for agitation clinical trial inclusion and for assessing response to intervention.

Keywords: Alzheimer's disease; aggression; agitation; dementia; efficacy; gold standard; measure; neuropsychiatric symptoms; outcome; research; scales; trials; validation

Year: 2021 PMID: 34132461 PMCID: PMC9292260 DOI: 10.1002/alz.12335

Source DB: PubMed Journal: Alzheimers Dement ISSN: 1552-5260 Impact factor: 16.655

INTRODUCTION

Agitation frequently affects patients with Alzheimer's disease (AD) and is among the most disruptive clinical features of the disease. Agitation is associated with highly impactful negative outcomes affecting patients, caregivers, and health systems. Given its impact, AD agitation is an important target for effective treatment development. Although progress has been made in implementing randomized clinical trials (RCTs) for agitation, the choice of optimal outcome measures to demonstrate treatment effectiveness is still unresolved.

HIGHLIGHTS

Optimal clinical rating outcome measures to evaluate treatment effectiveness for agitation in Alzheimer's disease (AD) clinical trials are lacking, and no International Psychogeriatric Association (IPA)‐specific measures exist. Using a modified Delphi process, Cohen‐Mansfield Agitation Inventory (CMAI)‐IPA and Neuropsychiatric Inventory‐Clinician (NPI‐C)‐IPA were derived to better reflect the IPA agitation criteria. In the Agitation and Aggression AD Cohort (A3C) study, all four measures demonstrated high sensitivity to change and the ability to predict clinician ratings; NPI‐C‐IPA showed better accuracy than CMAI‐IPA. The novel measures CMAI‐IPA and NPI‐C‐IPA, especially NPI‐C‐IPA, were more efficient at measuring agitation than CMAI and NPI‐C‐A+A.

RESEARCH IN CONTEXT

Systematic review: This research responds to the European Union‐North American Clinical Trials in Alzheimer's Disease Task Force recommendation to develop an optimized, clinician‐rated, primary outcome measure for clinical trials targeting agitation in Alzheimer's disease (AD), consistent with the International Psychogeriatric Association (IPA) agitation criteria. A literature search discovered no such validated measures, supporting the need to develop an IPA agitation‐specific measure.
Interpretation: A modified Delphi process was implemented to derive IPA domain‐specific measures from the Cohen‐Mansfield Agitation Inventory (CMAI) and the Neuropsychiatric Inventory‐Clinician scale (NPI‐C). Original and novel measures were validated in the Agitation and Aggression AD Cohort (A3C). The IPA‐informed novel scales, the CMAI‐IPA and NPI‐C‐IPA, performed as well as the original scales in a naturalistic study of AD patients.
Future directions: As these novel measures accurately represent the IPA Agitation Syndrome, future studies incorporating IPA Agitation Criteria can use the novel scales for participant inclusion, and as outcome measures.

Two approaches have been used to assess treatment response in RCTs targeting agitation in AD: (1) global ratings based on the judgment of experienced clinicians; and (2) severity and/or frequency ratings of items on scales reflecting components of the agitation syndrome. Examples of the latter approach include the Cohen‐Mansfield Agitation Inventory (CMAI) and the Neuropsychiatric Inventory (NPI), the most widely used measures across trials. A shortcoming of this approach is that deriving ratings exclusively from caregiver input leads to multiple biases. The NPI‐Clinician (NPI‐C) version was developed to address such limitations; however, to date there is no gold standard scale rating for these trials.
In 2015 the International Psychogeriatric Association (IPA) published provisional criteria for agitation in cognitive disorders to facilitate research in the field. Unfortunately, no data have reported on how to measure clinically meaningful change using these criteria. In 2018, the European Union‐North American Clinical Trials in Alzheimer's Disease (EU‐US CTAD) Task Force on Agitation encouraged the evidence‐based development of a single rating scale tailored to the IPA agitation criteria, using existing datasets, and using items from existing scales. The desired instrument should reflect the syndromic agitation criteria developed by the IPA, incorporate information from both patient and caregiver, define clinically meaningful effects, demonstrate sensitivity to change, and form the basis for powering studies. In this study we respond to this EU‐US CTAD Task Force recommendation by constructing novel outcome measures from existing instruments, with items mapped onto the IPA agitation criteria, and then choosing the best measure based on performance characteristics, including sensitivity to change and predictive validity, as assessed in the Agitation and Aggression AD Cohort (A3C) database.

METHODS

Design overview

In a modified Delphi process, items from the CMAI and the NPI‐C were mapped by an expert panel onto the IPA agitation criteria domains to generate derivative “novel” measures, CMAI‐IPA and NPI‐C‐IPA. Original existing measures were the CMAI and the NPI‐C‐Agitation and Aggression domains (NPI‐C‐A+A). The original and novel instruments were examined in the A3C study database to assess their performance in terms of minimal clinically important difference (MCID), sensitivity to change, and predictive validity. Internal consistency and reliability were also assessed for the novel measures. For these purposes, the modified Alzheimer's Disease Cooperative Study‐Clinical Global Impression of Change (mADCS‐CGIC) was considered the gold standard.

Mapping CMAI and NPI‐C items onto IPA agitation criteria: modified Delphi process

The IPA definition for agitation describes three domains: excessive motor activity (EMA), verbal aggression (VA), and physical aggression (PA). The modified Delphi process included six clinicians with expertise in agitation in dementia. All were directly or indirectly involved with the CTAD Agitation Task Force. All Delphi raters are co‐authors. We used a multi‐step iterative process that we have previously used for scale generation. As a first step, items from the CMAI and the NPI‐C were reviewed for relevance to any of the three domains: EMA, VA, or PA. For the CMAI, all items were included, and for the NPI‐C, all items from the agitation, aggression, aberrant motor activity, abnormal vocalizations, disinhibition, and irritability/lability domains were included. As a next step, all relevant items were incorporated into an online survey and rated by the Delphi Panel as 1 (none), 2 (weak), or 3 (strong) for association with each of the three IPA definition domains. For each item, if the mean score was ≥2.5, the item was included and assigned to the corresponding domain, and if it was <1.5, the item was discarded. Items with mean scores from 1.5 to 2.5 were retained for further discussion. These residual items were discussed via teleconference and assigned to a domain if 80% consensus was reached. Items that did not distinctly map onto one domain were discarded. Thus, using this Delphi process, all items from the CMAI and the items from six domains of the original NPI‐C (agitation, aggression, aberrant motor activity, disinhibition, irritability/lability, and abnormal vocalizations) were mapped onto the IPA agitation domains. The results were the derivative novel measures CMAI‐IPA and NPI‐C‐IPA. The CMAI‐IPA is a scale of 19 items rated by the caregiver for symptom frequency; the NPI‐C‐IPA is a scale of 25 items rated for frequency and severity in a clinician‐rated composite score based on information from patient and caregiver (see supporting information).
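
The survey triage rule described above can be sketched in a few lines of Python. This is only an illustration: the function names `triage_item` and `consensus_reached` are hypothetical, the ratings are invented, and treating "80% consensus" as "at least 80% of panel votes" is an assumption, since the exact rule is not spelled out here.

```python
from statistics import mean

def triage_item(ratings):
    """Triage one item for one IPA domain from panel ratings.

    ratings: list of scores, 1 (none), 2 (weak), or 3 (strong).
    Mean >= 2.5 -> include; mean < 1.5 -> discard; otherwise discuss.
    """
    m = mean(ratings)
    if m >= 2.5:
        return "include"
    if m < 1.5:
        return "discard"
    return "discuss"

def consensus_reached(votes, domain, threshold=0.8):
    """Assumed rule: the share of votes for `domain` is at least `threshold`."""
    return votes.count(domain) / len(votes) >= threshold

# Invented ratings from a six-member panel, for illustration only.
print(triage_item([3, 3, 3, 2, 3, 3]))  # mean about 2.83
print(triage_item([2, 2, 3, 2, 1, 2]))  # mean 2.0
print(consensus_reached(["EMA"] * 5 + ["VA"], "EMA"))  # 5 of 6 votes
```

With six raters, an 80% threshold under this reading means that at least five of the six must agree on the domain.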

Original clinical trials outcome measures for agitation in AD

We considered the CMAI and the NPI‐C agitation and aggression domains (A+A) to be original measures. The CMAI has been used more frequently across trials in AD agitation while the NPI‐C‐A+A has been used more recently in trials. The CMAI is a caregiver‐rated questionnaire that quantifies the frequency of 29 behaviors. The NPI‐C‐A+A measures clinician‐rated severity of agitation by summing scores on the agitation (13 items) and aggression (eight items) domains of the NPI‐C.

mADCS‐CGIC as a gold standard outcome measure in clinical trials for AD agitation

The Alzheimer's Disease Cooperative Study‐Clinical Global Impression of Change (ADCS‐CGIC) is a global rating of change developed to assess clinically significant change in symptoms over time in AD clinical trials. In the A3C study, mADCS‐CGIC was rated by experienced clinicians, during face‐to‐face interviews, and recorded on a standardized case record form. All rating clinicians were previously trained in a standardized training session based on the protocol. The clinicians rated the mADCS‐CGIC after assessing the other two scales: CMAI and NPI‐C‐A+A. The mADCS‐CGIC for agitation rates five areas: mood lability, emotional distress, physical agitation, VA, and PA. At follow‐up visits in A3C, clinicians rated patient agitation on the mADCS‐CGIC as: very much improved (1), much improved (2), minimally improved (3), no change (4), minimally worse (5), much worse (6), and very much worse (7) compared to baseline symptoms. Here we considered the mADCS‐CGIC as the gold standard to differentiate two groups of patients, those whose agitation improved significantly (scores of 1 or 2) and those who were minimally improved or unimproved (defined as scores ≥3) at 1 and at 3 months of follow‐up.

Agitation and aggression in AD patients cohort (A3C) study

The A3C is a longitudinal, prospective, multicenter observational study performed at eight memory clinics in France, and their associated long‐term care facilities (LTCFs). Clinical visits were scheduled at baseline, monthly during the first 3 months, and at 6, 9, and 12 months. This frequent schedule of assessments during the first 3 months was intended to simulate a 12‐week randomized controlled treatment trial design. The main objective of A3C was to study, over 12 months, the course of clinically significant agitation based on validated measures (NPI‐C, CMAI). An additional goal was to assess relationships between agitation symptoms, and to estimate clinical meaningfulness via comparison to global ratings of agitation (i.e., the mADCS‐CGIC) to optimize future trials. A3C enrolled 262 patients: methods and preliminary results are described in de Mauleon et al. Briefly, participants were aged ≥60 years and had a diagnosis of probable AD according to 2011 National Institute on Aging‐Alzheimer's Association criteria, with or without cerebrovascular components. Participants exhibited clinically significant agitation at baseline, defined by the presence of significant symptoms on at least one of the following agitation symptoms as rated by NPI: agitation/aggression, disinhibition, aberrant motor behavior, and/or irritability, with an NPI item score ≥4 and an NPI frequency score ≥2 at entry. Participants also met criteria for the IPA provisional definition of agitation in cognitive disorders. To be included, patients could live at home with an identified primary caregiver or reside in an LTCF for at least 2 months before inclusion. Participants and their caregivers took part in the study voluntarily: written informed consent was obtained from all patients (or legal representatives) and caregivers (for the community‐dwelling population).

Statistical analyses

All analyses were performed using SAS version 9.4 (SAS Institute Inc.). Baseline characteristics are reported as frequencies and percentages for nominal variables and mean ± standard deviation (SD) for quantitative variables. To estimate MCID between mADCS‐CGIC (very/much improved: 1 or 2 vs. others: ≥3) and the agitation measures (CMAI and NPI‐C‐A+A as original and CMAI‐IPA and NPI‐C‐IPA as novel) at 1 and 3 months follow‐up, receiver operating characteristic (ROC) analyses were conducted. Youden's index (YI, calculated as sensitivity + specificity − 1) was used to calculate the optimal cut‐offs in the ROC curves. Areas under ROC curves (AUC) were also calculated. To assess sensitivity to change of the four measures (CMAI, CMAI‐IPA, NPI‐C‐A+A, and NPI‐C‐IPA) for participants with significant improvement (mADCS‐CGIC = 1 or 2), five indices were calculated at 1‐ and at 3‐month follow‐up: (1) effect size (ES) = change/SD [T0]; (2) standardized response mean (SRM) = change/SD [change]; (3) Guyatt response index (GRI) = change/SD [change for stable subjects]; (4) reliable change index (RCI) = change/SE [change]; and (5) change from baseline, estimated with linear mixed models, of the four measures converted to a Z‐score (reduced centered variable). The RCI expresses the amount of change between pre‐ and post‐treatment scores that would be statistically reliable; numerically, it represents the number of scale points needed on a given psychometric measure to determine whether a change in score from pre‐ to post‐treatment is due to real change or chance variation. In the mixed models, the dependent variables were CMAI or CMAI‐IPA or NPI‐C‐A+A or NPI‐C‐IPA and the independent fixed effects were: baseline value of the outcome, grouped according to the mADCS‐CGIC (1 or 2; = 3; = 4; ≥5), time, and interaction between group and time. For all indices, a high absolute value meant a high sensitivity to change.
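
As a rough illustration of the first three indices defined above, the following Python sketch computes ES, SRM, and GRI from paired scores. It is not the SAS code used in the study: the function names and all scores are invented, "change" is taken as the mean of follow‐up minus baseline, and sample (n − 1) standard deviations are assumed. The RCI is left out because the text does not say how the standard error of change was estimated.

```python
from statistics import mean, stdev

def change_scores(baseline, followup):
    """Per-participant change, follow-up minus baseline (negative = improvement)."""
    return [f - b for b, f in zip(baseline, followup)]

def effect_size(baseline, followup):
    """ES = mean change / SD of baseline (T0) scores."""
    return mean(change_scores(baseline, followup)) / stdev(baseline)

def standardized_response_mean(baseline, followup):
    """SRM = mean change / SD of the change scores."""
    changes = change_scores(baseline, followup)
    return mean(changes) / stdev(changes)

def guyatt_response_index(improved_changes, stable_changes):
    """GRI = mean change in improved participants / SD of change in stable participants."""
    return mean(improved_changes) / stdev(stable_changes)

# Invented scores for four "improved" participants and four "stable" ones.
baseline = [60, 70, 50, 80]
followup = [40, 55, 45, 50]
stable_changes = [2, -2, 1, -1]

es = effect_size(baseline, followup)
srm = standardized_response_mean(baseline, followup)
gri = guyatt_response_index(change_scores(baseline, followup), stable_changes)
print(round(es, 2), round(srm, 2), round(gri, 2))
```

Because improvement is a decrease in score, all three indices come out negative here; the comparison across scales uses their absolute values.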
To assess predictive validity of the four measures (CMAI, CMAI‐IPA, NPI‐C‐A+A, and NPI‐C‐IPA), linear mixed models were estimated to assess whether an improvement at 1 or 3 months on each measure, using cut‐offs from the ROC analysis, predicted subsequent change, at 11 or 9 months, respectively, in cognitive function (Mini‐Mental State Examination [MMSE] scale) or functional autonomy (Activities of Daily Living [ADL] scale). Predictive validity of the mADCS‐CGIC gold standard was similarly assessed. In each model, we included subject‐specific random effects to consider intra‐subject correlations, a random intercept to consider the heterogeneity of outcomes at the first timepoint (here 1 or 3 months), and a random slope to take into account heterogeneity of slopes between participants. In the models, we included the following fixed effects: value of the outcome at the first timepoint, grouped according to improvement in the four agitation measures (improvement = yes vs. no) or mADCS‐CGIC scale (1 or 2 vs. ≥3), time, and interaction between group and time. In all models the following potential confounders were tested: sex, age, agitation level at baseline, and the change from the first time of agitation level. None of these variables were associated with both the outcomes and the variable of interest (improvement of agitation). To check the accuracy of the novel scales, Cronbach's alpha was estimated to examine the internal consistency of items for each IPA domain (EMA, VA, PA). To assess test–retest reliability (in terms of stability) we estimated intraclass correlation coefficients (ICC) and their 95% confidence intervals (CIs) at 1 and 3 months for stable participants (mADCS‐CGIC = 4). In accordance with the guidelines of Koo and Li, ICC estimates and their 95% CIs were calculated using the SAS ICC macro developed by Robert Hamer (https://support.sas.com/kb/25/031.html) based on a single rater/measurement, absolute‐agreement, two‐way mixed‐effects model.
ICC results can be interpreted as follows: values <0.5 indicate poor reliability, 0.5 to 0.75 moderate reliability, 0.75 to 0.9 good reliability, and >0.90 excellent reliability. Finally, we examined the association between the novel scales and the inclusion criteria with binary responses for each IPA domain to study the internal validity of novel measures.
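
For readers unfamiliar with the internal‐consistency statistic, a minimal Python sketch of Cronbach's alpha and of the ICC interpretation bands follows. It is an illustration under stated assumptions rather than the study's SAS analysis: the names `cronbach_alpha` and `icc_band` are hypothetical, the item scores are invented, sample variances are used, and assigning the shared boundary values (0.75 and 0.9) to the "good" band is an arbitrary choice, because the bands as stated overlap at their edges.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (participants) by columns (items).

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    k = len(scores[0])
    item_variances = [variance(column) for column in zip(*scores)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

def icc_band(icc):
    """Map an ICC estimate to the reliability bands quoted in the text."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.9:  # assumption: boundary values fall in the "good" band
        return "good"
    return "excellent"

# Invented two-item scores for four participants.
toy_scores = [[1, 2], [2, 1], [3, 4], [4, 3]]
alpha = cronbach_alpha(toy_scores)
print(alpha, icc_band(0.8))
```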

RESULTS

A3C study participants

Table 1 reports baseline characteristics of the A3C population. Mean age was 82.4 (±7.2) years and 153 (58.4%) were female. Mean MMSE score was 10.0 (±8.0).

TABLE 1

Characteristics at baseline of A3C population

Characteristics at baseline	N = 262
Sex (female), n (%)	153 (58.4%)
Age, years ^a	82.4 (±7.2)
Level of education, n (%):
No diploma	15 (6.1%)
Primary school without certificate	75 (30.4%)
Primary school certificate	59 (23.9%)
Secondary education, without high‐school diploma	54 (21.9%)
High‐school diploma (Baccalaureat) or higher	44 (17.8%)
Living at home, n (%)	183 (69.9%)
MMSE, total ^a	10.0 (±8.0)
Neuropsychiatric symptoms measurements
CMAI, total ^a	62.0 (±15.8)
CMAI‐IPA ^a	38.5 (±12.2)
NPI‐C‐A+A ^a	15.8 (±10.8)
NPI‐C‐IPA ^a	15.2 (±10.8)
IPA, definition of agitation, n (%):
Excessive motor activity	199 (76.3%)
Verbal aggression	199 (76.3%)
Physical aggression	115 (44.1%)
CGI‐S, severity n (%):
Normal	0 (0.0%)
Borderline mentally ill	0 (0.0%)
Mildly ill	32 (12.2%)
Moderately ill	89 (34.0%)
Markedly ill	93 (35.5%)
Severely ill	39 (14.9%)
Among the most extremely ill	9 (3.4%)

Abbreviations: A+A, Agitation+Aggression; CGI‐S, Clinical Global Impression of Severity; CMAI, Cohen‐Mansfield Agitation Inventory; IPA, International Psychogeriatric Association; MMSE, Mini‐Mental State Examination; NPI‐C, Neuropsychiatric Inventory Clinician Rating.

^a Mean (± standard deviation).

Minimal clinically important differences of CMAI, CMAI‐IPA, NPI‐C‐A+A, and NPI‐C‐IPA

At 1 month, 31 patients (13.0%) were much or very much improved by mADCS‐CGIC. For the whole cohort, CMAI mean score was 56.7 (SD ± 16.2), CMAI‐IPA was 35.0 (SD ± 10.9), NPI‐C‐A+A was 12.6 (SD ± 10.9), and NPI‐C‐IPA was 12.4 (SD ± 10.9). At 3 months, 44 patients (20.5%) were much or very much improved by mADCS‐CGIC. CMAI mean score was 52.5 (SD ± 15.9), CMAI‐IPA was 32.6 (SD ± 11.0), NPI‐C‐A+A was 9.9 (SD ± 9.0), and NPI‐C‐IPA was 10.1 (SD ± 10.0). Based on ROC analysis and the YI, at 1 month, the estimated MCID (defined as 1 or 2 on the mADCS‐CGIC vs. ≥3) for the original CMAI was –5 (odds ratio [OR] = 18.9, 95% CI = 6.3–56.4). For the derived CMAI‐IPA the MCID was –2 (OR = 21.2; 95% CI = 6.2–72.3). For the NPI‐C‐A+A the estimated MCID was –3 (OR = 15.5, 95% CI = 5.2–46.2). For the novel NPI‐C‐IPA the MCID was –5 (OR = 13.5, 95% CI = 5.4–33.4). At 3 months, the estimated MCID for CMAI was –17 (OR = 14.9, 95% CI = 6.8–32.6); for CMAI‐IPA was –5 (OR = 9.3, 95% CI = 4.0–21.2); for NPI‐C‐A+A was –3 (OR = 11.9, 95% CI = 4.1–34.8); and for NPI‐C‐IPA was –5 (OR = 7.8, 95% CI = 3.4–17.9). The estimated MCIDs were similar at 1 and 3 months for both NPI‐C instruments. AUCs of all four scales were comparable (Figure 1).
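
The cut‐off selection behind these MCID estimates can be sketched as follows. This is a simplified Python illustration, not the ROC procedure run in SAS: `youden_cutoff` is a hypothetical helper, the change scores and improvement labels are invented rather than taken from A3C, and ties are resolved in favor of the first (most negative) candidate.

```python
def youden_cutoff(changes, improved):
    """Return (cutoff, J) maximizing Youden's index J = Se + Sp - 1.

    changes:  change scores, follow-up minus baseline (negative = improvement)
    improved: True where the global rating was 1 or 2, False where it was >= 3
    A participant is classified as improved when change <= cutoff.
    """
    n_pos = sum(improved)
    n_neg = len(improved) - n_pos
    best = None
    for cutoff in sorted(set(changes)):
        true_pos = sum(1 for c, y in zip(changes, improved) if y and c <= cutoff)
        true_neg = sum(1 for c, y in zip(changes, improved) if not y and c > cutoff)
        j = true_pos / n_pos + true_neg / n_neg - 1
        if best is None or j > best[1]:
            best = (cutoff, j)
    return best

# Invented data: eight participants, three rated much or very much improved.
toy_changes = [-20, -18, -10, -6, -4, -2, 0, 3]
toy_improved = [True, True, False, True, False, False, False, False]
cutoff, j = youden_cutoff(toy_changes, toy_improved)
print(cutoff, round(j, 2))
```

In this toy example the selected threshold is a change of −6 points or less, which plays the role that values such as −5 or −17 play in the estimates reported above.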

FIGURE 1

Receiver operating characteristic (ROC) curve for model: modified Alzheimer's Disease Cooperative Study‐Clinical Global Impression of Change (mADCS‐CGIC) 1–2 versus ≥3 and Cohen‐Mansfield Agitation Inventory (CMAI), Neuropsychiatric Inventory Clinician Rating (NPI‐C)‐Agitation+Aggression (A+A), CMAI‐International Psychogeriatric Association (IPA), and NPI‐C‐IPA at 1 and at 3 months

TABLE 2

	mADCS‐CGIC change in agitation, M1 compared to M0: great improvement (1–2; N = 31, 13.14%) versus others (≥3; N = 205, 86.86%)
	N	Unit	OR	Lower OR	Upper OR	P	AUC	Lower AUC	Upper AUC	Se (%)	Sp (%)	PLR	PPV (%)	NPV (%)
CMAI M1‐M0	236	1.00	0.92	0.90	0.95	<.0001	0.82	0.73	0.91	–	–	–	–	–
CMAI M1‐M0 (≤ –5 vs > –5)	236	–	18.87	6.31	56.40	<.0001	–	–	–	87.10	73.66	3.31	33.33	97.42

Abbreviations: A+A, Agitation+Aggression; AUC, area under the curve; CMAI, Cohen‐Mansfield Agitation Inventory; IPA, International Psychogeriatric Association; M, month; mADCS‐CGIC, modified Alzheimer's Disease Cooperative Study‐Clinical Global Impression of Change; NPI‐C, Neuropsychiatric Inventory Clinician Rating; NPV, negative predictive value; PLR, positive likelihood ratio; PPV, positive predictive value; SD, standard deviation; Se, sensitivity; Sp, specificity.

Estimated minimal clinically important differences (MCID) of CMAI, CMAI‐IPA, NPI‐C‐A+A, and NPI‐C‐IPA by comparing the mADCS‐CGIC change in agitation in two groups (very much improved and much improved vs. the others) at 1 and at 3 months
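
The accuracy columns of Table 2 are linked by simple arithmetic, which the Python sketch below works through for the dichotomized CMAI row at 1 month. The four cell counts (27, 4, 54, 151) are not reported in the table; they are back‐calculated here from N = 31 and N = 205 and the reported sensitivity and specificity, and should be read as an assumption. The Wald interval for the odds ratio is likewise an assumed method, and `two_by_two_summary` is a hypothetical name.

```python
from math import exp, log, sqrt

def two_by_two_summary(tp, fn, fp, tn, z=1.96):
    """Accuracy measures and odds ratio (with Wald CI) from a 2x2 table."""
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    odds_ratio = (tp * tn) / (fp * fn)
    se_log_or = sqrt(1 / tp + 1 / fn + 1 / fp + 1 / tn)
    return {
        "Se": se,
        "Sp": sp,
        "PLR": se / (1 - sp),        # positive likelihood ratio (not a percentage)
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
        "OR": odds_ratio,
        "OR_low": exp(log(odds_ratio) - z * se_log_or),
        "OR_high": exp(log(odds_ratio) + z * se_log_or),
    }

# Assumed counts: 27 of 31 improved and 54 of 205 others had a CMAI change <= -5.
summary = two_by_two_summary(tp=27, fn=4, fp=54, tn=151)
for name, value in summary.items():
    print(name, round(value, 4))
```

Under these assumed counts the sketch gives values that agree with the row to rounding, including an odds ratio close to the reported 18.87.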

Sensitivity to change of CMAI, CMAI‐IPA, NPI‐C‐A+A, and NPI‐C‐IPA for subjects very much improved

Figure 2 displays sensitivity to change properties of the four measures between baseline and 3 months and between baseline and 1 month according to five indices. All four measures had high sensitivity to change at 1 and at 3 months according to ES and SRM indices. All measures also showed a very high GRI.

FIGURE 2

The sensitivity to change of Cohen‐Mansfield Agitation Inventory (CMAI), Neuropsychiatric Inventory Clinician rating (NPI‐C)‐Agitation+Aggression (A+A), CMAI‐International Psychogeriatric Association (IPA), and NPI‐C‐IPA between M0 and M3 and between M0 and M1 according to five indices

According to the RCI, CMAI should vary by –9.48 points and the CMAI‐IPA by –8.04 points between baseline and 3 months to ensure that the change between pre‐treatment and post‐treatment is statistically reliable. Similarly, the NPI‐C‐A+A should vary by –8.08 points and the NPI‐C‐IPA by –8.96 points, respectively, to ensure statistically reliable change. According to the mixed models the four scales had high sensitivity to change. At 1 month, for CMAI the estimated change was –19.11 (95% CI = [–22.32 to –15.90], P < .0001), for CMAI‐IPA –12.54 (95% CI = [–14.85 to –10.23], P < .0001), for NPI‐C‐A+A –11.68 (95% CI = [–13.76 to –9.60], P < .0001), and for NPI‐C‐IPA –11.17 (95% CI = [–13.14 to –9.20], P < .0001). At 3 months, for CMAI the change was –20.96 (95% CI = [–23.68 to –18.25], P < .0001), for CMAI‐IPA –13.31 (95% CI = [–15.26 to –11.37], P < .0001), for NPI‐C‐A+A –12.73 (95% CI = [–14.37 to –11.09], P < .0001), and for NPI‐C‐IPA –11.65 (95% CI = [–13.26 to –10.05], P < .0001). To compare the changes observed, the four measures were converted to a Z‐score. All four measures demonstrated comparably high sensitivity to change, especially when comparing the original and novel measures. Figure 2 shows a slightly better performance in sensitivity to change for CMAI than NPI‐C‐A+A at 3 months, but not at 1 month, considering that CMAI absolute values of different indices were higher than NPI‐C‐A+A. However, no significant differences in terms of performance were found between CMAI‐IPA and NPI‐C‐IPA at 1 and at 3 months (see Table S1 in supporting information).

Predictive validity

Table 3 shows results from linear mixed models assessing whether a significant improvement, at 1 or 3 months, on the four measures and on the mADCS‐CGIC, was predictive of subsequent changes, at 11 or 9 months, in cognitive function (MMSE) or functional autonomy (ADL). Improvement at 3 months on NPI‐C‐A+A was significantly predictive of less ADL and MMSE change at 9 months of follow‐up. Improvement at 1 month on CMAI‐IPA was predictive of less ADL change at 11 months of follow‐up. Changes in the other agitation measures were not predictive of MMSE or ADL at any timepoint.

TABLE 3

Predictive validity of CMAI, NPI‐C‐A+A, CMAI‐IPA, NPI‐C‐IPA, and mADCS‐CGIC between M0 and M3 and between M0 and M1 according to MMSE and ADL

		Change MMSE/ADL at 11 months (M12‐M1)
Outcome	Improvement agitation at 1 month	Yes mean [95% CI]	No mean [95% CI]	Yes versus no mean [95% CI]	P
Total MMSE	CMAI M1–M0 (≤ –5 vs > –5)	–2.32 [–3.57 to –1.08]	–2.07 [–2.93 to –1.20]	–0.26 [–1.77 to 1.26]	.7375
Total MMSE	NPI‐C‐A+A M1–M0 (≤ –3 vs > –3)	–2.05 [–3.23 to –0.88]	–2.20 [–3.07 to –1.33]	0.15 [–1.32 to 1.61]	.8419
Total MMSE	CMAI‐IPA M1–M0 (≤ –2 vs > –2)	–2.34 [–3.49 to –1.18]	–2.02 [–2.92 to –1.12]	–0.31 [–1.78 to 1.15]	.6728
Total MMSE	NPI‐C‐IPA M1–M0 (≤ –5 vs > –5)	–2.20 [–3.54 to –0.86]	–2.13 [–2.96 to –1.31]	–0.07 [–1.64 to 1.51]	.9336
Total MMSE	mADCS‐CGIC M1–M0 (1–2 vs ≥3)	–3.21 [–5.16 to –1.26]	–2.00 [–2.75 to –1.25]	–1.21 [–3.30 to 0.88]	.2546
Total ADL	CMAI M1–M0 (≤ –5 vs > –5)	–0.76 [–1.10 to –0.43]	–0.93 [–1.16 to –0.70]	0.17 [–0.24 to 0.57]	.4154
Total ADL	NPI‐C‐A+A M1–M0 (≤ –3 vs > –3)	–0.75 [–1.06 to –0.44]	–0.94 [–1.17 to –0.71]	0.19 [–0.19 to 0.58]	.3273
Total ADL	CMAI‐IPA M1–M0 (≤ –2 vs > –2)	–0.61 [–0.92 to –0.30]	–1.02 [–1.25 to –0.79]	0.41 [0.03 to 0.80]	.0360
Total ADL	NPI‐C‐IPA M1–M0 (≤ –5 vs > –5)	–0.66 [–1.02 to –0.30]	–0.95 [–1.17 to –0.73]	0.29 [–0.13 to 0.72]	.1706
Total ADL	mADCS‐CGIC M1–M0 (1–2 vs ≥3)	–1.03 [–1.56 to –0.50]	–0.86 [–1.06 to –0.66]	–0.18 [–0.74 to 0.39]	.5381

		Change MMSE/ADL at 9 months (M12‐M3)
Outcome	Improvement agitation at 3 months	Yes versus no mean [95% CI]	P
Total MMSE	CMAI M3–M0 (≤ –17 vs > –17)	0.12 [–1.48 to 1.71]	.8863
Total MMSE	NPI‐C‐A+A M3–M0 (≤ –3 vs > –3)	1.38 [0.01 to 2.76]	.0483
Total MMSE	CMAI‐IPA M3–M0 (≤ –5 vs > –5)	0.04 [–1.37 to 1.45]	.9534
Total MMSE	NPI‐C‐IPA M3–M0 (≤ –5 vs > –5)	1.17 [–0.21 to 2.55]	.0973
Total MMSE	mADCS‐CGIC M3–M0 (1–2 vs ≥3)	0.07 [–1.60 to 1.73]	.9366
Total ADL	CMAI M3–M0 (≤ –17 vs > –17)	0.27 [–0.17 to 0.71]	.2315
Total ADL	NPI‐C‐A+A M3–M0 (≤ –3 vs > –3)	0.38 [0.00 to 0.75]	.0480
Total ADL	CMAI‐IPA M3–M0 (≤ –5 vs > –5)	0.29 [–0.10 to 0.67]	.1427
Total ADL	NPI‐C‐IPA M3–M0 (≤ –5 vs > –5)	0.21 [–0.17 to 0.59]	.2681
Total ADL	mADCS‐CGIC M3–M0 (1–2 vs ≥3)	0.02 [–0.44 to 0.48]	.9377

Abbreviations: A+A, Agitation+Aggression; ADL, Activities of Daily Living; CI, confidence interval; CMAI, Cohen‐Mansfield Agitation Inventory; IPA, International Psychogeriatric Association; M, month; mADCS‐CGIC, modified Alzheimer's Disease Cooperative Study‐Clinical Global Impression of Change; MMSE, Mini‐Mental State Examination; NPI‐C, Neuropsychiatric Inventory Clinician Rating; SD, standard deviation.

Note: Results from mixed models.

The accuracy of CMAI‐IPA and NPI‐C‐IPA

The analysis of the internal consistency of the NPI‐C‐IPA showed a Cronbach's alpha of 0.8 for EMA, 0.7 for VA, and 0.7 for PA. For the CMAI‐IPA, Cronbach's alpha was 0.8 for PA, 0.6 for EMA, and 0.3 for VA (see Table S2 in supporting information). Concerning test–retest reliability, the mean ICC for the CMAI‐IPA was 0.8 at 1 month and 0.7 at 3 months; for the NPI‐C‐IPA it was 0.9 at 1 month and 0.8 at 3 months (see Table S3 in supporting information). Table S4 in supporting information shows statistically significant ORs for the association between the CMAI‐IPA and NPI‐C‐IPA and each IPA domain (EMA, VA, and PA; P < .0001).

DISCUSSION

Accurate and meaningful measurement of agitation is central to developing and testing the efficacy of treatments for agitation in AD. However, no gold standard exists. Little is known about the longitudinal course of clinically significant agitation in AD, about the variability in different outcome measures over time, or the definition of a clinically meaningful improvement. In this study of A3C participants we assessed the most widely used measures of agitation in clinical trials, the CMAI and NPI‐C‐A+A, as well as Delphi panel–derived measures informed by the IPA agitation criteria, the CMAI‐IPA and the NPI‐C‐IPA. At 3‐month follow‐up, one fifth of patients with clinically significant agitation were much or very much improved globally. Estimated MCIDs were the same at 1 and 3 months for both NPI‐C instruments, indicating that to differentiate trial participants who are very much improved, a period of 1 month may be sufficient. Similar findings were reported for the placebo arm of a recent RCT. AUCs of all four scales were similar, suggesting no scale had an advantage in predicting clinician ratings. The sensitivity to change properties of the four measurements at 1 and at 3 months follow‐up according to five different indices were high, and comparable among the scales, especially between original measures and novel versions. Cronbach's alpha and ICC results showed better accuracy for the NPI‐C‐IPA than for the CMAI‐IPA.
In 2015, the field significantly progressed with publication of the IPA agitation provisional criteria in cognitive disorders, helping to define agitation better as a syndrome in both clinical and research settings. This was an important step forward, given that agitation was interchangeably or confusedly described as a symptom or a syndrome, with unclear definitions of which behaviors constituted agitation, debates about the differences between agitation and aggression, and the role of other emergent dementia‐related behaviors.
The EU‐US CTAD Task Force focused its 2017 meeting on finding the best outcome measure for agitation in dementia trials. The Task Force advocated the development of a single rating scale that reflects agitation as a unitary phenomenon, and that best reflects the IPA criteria and the situations in which agitation occurs. We explored the MCID of the four different measures. MCID is crucial for calculating sample size in clinical trials. Information about MCID is particularly important for newer scales, for less widely used measures such as the NPI‐C‐A+A, and for the novel, IPA criteria‐informed measures. Regarding the MCID of the NPI‐C‐A+A and NPI‐C‐IPA, follow‐up of 1 month seemed sufficient to detect the group of patients who would be very much improved by 3 months. This finding has important implications for trial design: while symptom response in clinical trials may have a different timeframe than in observational studies such as ours, what is important is that natural variation in symptoms can be observed in a short timeframe (1 month) and predicts later response (3 months). This suggests that the widely used trial durations of 6 to 12 weeks are long enough to capture initial clinical response, and that substantially longer timeframes are not needed. Moreover, this may have important implications for clinical practice. In addition, these results, combined with the report by Rosenberg et al., are helpful in considering the use of a “run‐in period” of 3 to 4 weeks before active treatment to reduce the placebo effect often observed in such trials. Findings from this study also provide a better estimation of placebo group variability in trials, thus allowing for more precise power estimates.
We used global ratings as a gold standard indicator of meaningful clinical change against which we compared scale performance. Notably, CMAI‐IPA and CMAI scores were high at follow‐up, even for participants who were “much” better clinically. This was not the case for the NPI‐C measures, on which participants who were clinically “much” better had scores approaching zero at follow‐up. The NPI‐C scales thus appear to better approximate meaningful clinical improvement. This may be because the CMAI contains items less clinically relevant to this population (e.g., the verbal nonaggressive items). In fact, the CMAI has many items that are rarely rated, considering that it was originally developed in nursing homes for residents with advanced dementia. This assertion is corroborated by the fact that mapping the original CMAI onto the IPA criteria through the expert Delphi process resulted in a shorter scale, suggesting that the original one was unnecessarily broad. Another explanation could be that the CMAI is a frequency scale, whereas the NPI‐C captures both frequency and severity, resulting in a more robust assessment of the clinically relevant behaviors. Further, CMAI ratings rely solely on subjective caregiver input, resulting in potential bias, in contrast to the NPI‐C, in which trained clinicians use input from caregiver and patient to produce clinician severity scores. With respect to the novel NPI‐C‐IPA, not every one of its 25 items has been assessed for “utility” in clinical and research settings; this is work for future studies.
Comparing the original to the novel measures, all AUCs were similar, suggesting that no measure had a clear advantage in predicting clinician ratings. In addition, the sensitivity to change properties of all four measures at 1 and 3 months were similar, supporting the conclusion that the four measures have comparable psychometric properties. Thus, for the field to incorporate the IPA agitation definition into measurement, the novel CMAI‐IPA and NPI‐C‐IPA are preferred as they are more efficient.
Furthermore, the original NPI‐C‐A+A included only the agitation and aggression domains, whereas the new NPI‐C‐IPA includes six domains from the original NPI‐C related to agitation symptoms (agitation, aggression, aberrant motor activity, abnormal vocalizations, disinhibition, and irritability/lability), therefore better capturing the breadth of the IPA agitation domains. Thus, the novel CMAI‐IPA and NPI‐C‐IPA scales perform at least as well as the original scales and, in addition, reflect the IPA agitation criteria. The Food and Drug Administration recognized “agitation as a syndrome for clinical trials targeting,” and the IPA criteria aim to define this syndrome given the historical heterogeneity in the definitions used.
Concerning the accuracy of the novel scales, the internal consistency of the NPI‐C‐IPA was good for all three IPA domains (EMA, VA, and PA). However, the internal consistency of the CMAI‐IPA is more questionable: Cronbach's alpha was good for PA but low for EMA and very low for VA. One explanation could be that Cronbach's alpha depends on the number of items. Indeed, the EMA and VA domains of the CMAI‐IPA contain only four items each. This could explain the slightly lower coefficient in the EMA domain but not the very low coefficient for VA. For test–retest reliability, the ICC of the NPI‐C‐IPA indicated excellent reliability at 1 month and good reliability at 3 months. In contrast, the ICC of the CMAI‐IPA indicated good reliability at 1 month and moderate reliability at 3 months. In summary, internal consistency and reliability analyses demonstrated better accuracy for the NPI‐C‐IPA than for the CMAI‐IPA. The internal validity of the novel scales appeared good, based on the analyses of the association between the CMAI‐IPA and NPI‐C‐IPA items and each IPA domain (EMA, VA, and PA).
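The dependence of Cronbach's alpha on the number of items can be made concrete with the standardized form of the coefficient. This is a textbook approximation, not a formula taken from this study: here $k$ is the number of items in a domain and $\bar{r}$ is the mean inter-item correlation. The values of $\bar{r}$ below are back-calculated for illustration from the rounded alphas reported in the Results and are not reported by the authors.

```latex
\alpha_{\text{std}} = \frac{k\,\bar{r}}{1 + (k-1)\,\bar{r}}
\qquad\Longleftrightarrow\qquad
\bar{r} = \frac{\alpha_{\text{std}}}{k - (k-1)\,\alpha_{\text{std}}}

% Same mean correlation, different item counts (illustrative):
%   k = 4,  \bar{r} = 0.3:  \alpha = 1.2 / 1.9 \approx 0.63
%   k = 10, \bar{r} = 0.3:  \alpha = 3.0 / 3.7 \approx 0.81

% Mean correlation implied by a four-item domain (illustrative):
%   \alpha = 0.6:  \bar{r} = 0.6 / (4 - 1.8) \approx 0.27
%   \alpha = 0.3:  \bar{r} = 0.3 / (4 - 0.9) \approx 0.10
```

Under this approximation, a four-item domain with moderately correlated items can plausibly yield an alpha near 0.6, whereas an alpha of 0.3 implies items that are nearly uncorrelated. This is consistent with the observation that the small item count may account for the EMA coefficient but not for the VA coefficient.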
The major strength of this study is that it uses data from A3C, which mimics a clinical trial in terms of methodology and measurement while being conducted in a naturalistic setting that includes both community and nursing home participants. However, there are notable limitations to this work. A3C was an observational, usual care study whose population received close follow‐up and was frequently treated with psychotropics or non‐pharmacological approaches for agitation. Even so, we believe it is an appropriate dataset for the validation of outcome measures: because no trial can ethically eliminate other efforts at treating agitation, development of novel treatments for agitation will by necessity occur against this background. Concerning other statistical metrics to assess the novel scales, we were unable to assess inter‐rater reliability because these data were not available: during the A3C study, the same scale was not administered to each patient by two different raters. However, the novel scales are derived from existing scales; we therefore hypothesize that their inter‐rater variability is comparable to that of the existing ones (CMAI and NPI‐C). Nevertheless, a deeper statistical validation of the novel measures will be the subject of further studies.
Another limitation was the substantial attrition in the number of participants: 30% at 1 year. Although this attrition rate is high, it is consistent with studies in the literature. In a 2013 study conducted in Norway in subjects with dementia living in nursing homes, the attrition rate in the first year of follow‐up was ≈32%. Steinberg et al. followed patients with neuropsychiatric symptoms in dementia at home for 5 years; the attrition rate was ≈42% at 1.5 years of follow‐up. Attrition is common in cohorts of older adults with dementia, and the study by Burke et al. highlighted that the presence of neuropsychiatric symptoms is a factor influencing attrition. This is why the A3C sample size was calculated considering 15% attrition. Finally, this attrition is consistent with that seen in “real‐world clinical settings” for this vulnerable population, as A3C is a naturalistic study in a usual care setting. Moreover, attrition during the first 3 months, the critical period of the A3C study, was much lower (13%). For all these reasons, we believe that our data are generalizable to the population of patients with AD presenting with agitation symptoms. The attrition data we report will also help in power calculations for sample sizes in future trials.
These results also have relevance to clinical practice. IPA agitation domain‐specific measures are an important advance not only in measurement but also in the management of agitation in routine care. Use of these scales in clinical settings will allow for better definition of agitation symptoms, optimization of non‐pharmacological or medication options, and better assessment of efficacy.
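The roles of MCID and attrition in trial planning noted in this discussion can be illustrated with a standard normal-approximation sample size calculation. This is a minimal sketch under stated assumptions, not a calculation performed in this study: the function names are hypothetical, the standard deviation of change scores (10 points) is an invented placeholder, and treating the MCID as the target between-arm difference is a simplification. The MCID of -5 for the NPI‐C‐IPA and the 13% attrition over the first 3 months are the values reported in this article.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(delta, sd, alpha=0.05, power=0.80):
    """Participants per arm for a two-arm comparison of mean change scores.

    Normal approximation: n = 2 * ((z_(1-alpha/2) + z_power) * sd / delta)^2.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / abs(delta)) ** 2)


def inflate_for_attrition(n, attrition):
    """Enrolment target so that about n participants remain after dropout."""
    if not 0 <= attrition < 1:
        raise ValueError("attrition must be in [0, 1)")
    return ceil(n / (1 - attrition))


mcid = -5          # NPI-C-IPA MCID at 3 months, as reported in this article
sd_change = 10.0   # ASSUMPTION: hypothetical SD of change scores, not from the study
dropout = 0.13     # attrition over the first 3 months, as reported in this article

completers = n_per_arm(mcid, sd_change)
enrolled = inflate_for_attrition(completers, dropout)
print(completers, enrolled)  # 63 73 under these assumptions
```

Because the required sample size scales with the square of the ratio of standard deviation to target difference, a smaller MCID or a more variable scale raises enrolment targets quickly, which is why scale-specific MCID and variability estimates matter for trial design.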

CONCLUSION

In summary, in a naturalistic study of AD patients with agitation, the novel CMAI‐IPA and NPI‐C‐IPA scales, both designed to reflect the IPA Agitation Criteria, performed at least as well as the original scales. We found better statistical accuracy and clinical relevance for the NPI‐C‐IPA than for the CMAI‐IPA. As these novel measures accurately represent the IPA Agitation Syndrome, we propose that future agitation clinical trials using IPA criteria for inclusion use these novel scales, notably the NPI‐C‐IPA, to capture clinical effects of treatments. In the current absence of optimal outcome measures to demonstrate treatment effectiveness, these novel scales represent a step forward in treatment development for agitation in AD.

CONFLICTS OF INTEREST

David Miller is a full‐time employee of Signant Health. Cedric O'Gorman is a full‐time employee of Axsome Therapeutics. Constantine Lyketsos declares: (1) grant support (research or CME) from NIH, Functional Neuromodulation, Bright Focus Foundation and (2) payment as consultant or advisor from Avanir, Astellas, Roche, Karuna, SVB Leerink, Maplight, Axsome, Global Institute on Addictions. Maria Soto declares: payment as consultant or advisor from Avanir, Acadia. Zahinoor Ismail declares: (1) payment as consultant or advisor from Lundbeck/Otsuka outside the submitted work; (2) payment to his institution from Acadia, Biogen, Roche, and Sunovion outside the submitted work. No conflicts are declared for the other authors.

