Psychometric evaluation of the Care Transition Measure in TRACE-CORE: do we need a better measure?

Milena D Anatchkova¹, Constance M Barysauskas², Rebecca L Kinney¹, Catarina I Kiefe¹, Arlene S Ash¹, Lisa Lombardini¹, Jeroan J Allison¹.

Abstract

BACKGROUND: The quality of transitional care is associated with important health outcomes such as rehospitalization and costs. The widely used Care Transitions Measure (CTM-15) was developed with a classic test theory approach; its short version (CTM-3) was included in the CAHPS Hospital Survey. We conducted a psychometric evaluation of both measures and explored whether item response theory (IRT) could produce a more precise measure.
METHODS AND RESULTS: As part of the Transitions, Risks, and Actions in Coronary Events Center for Outcomes Research and Education, 1545 participants were interviewed during an acute coronary syndrome hospitalization, providing information on general health status (Short Form-36), CTM-15, health utilization, and care process questions at 1 month postdischarge. We used classic and IRT analyses and compared the measurement precision of CTM-15-, CTM-3-, and CTM-IRT-based scores using relative validity. Participants were 79% non-Hispanic white and 67% male, with an average age of 62 years. The CTM-15 had good internal consistency (Cronbach's α=0.95) but demonstrated acquiescence bias (8.7% of participants responded "Strongly agree" and 19% responded "Agree" to all items) and limited score variability. These problems were more pronounced for the CTM-3. The CTM-15 differentiated between patient groups defined by self-reported health status, health care utilization, and care transition process indicators. Differences between groups were small (2 to 3 points). There was no gain in measurement precision from IRT scoring. The CTM-3 was not significantly lower for patients reporting rehospitalization or emergency department visits.
CONCLUSION: We identified psychometric challenges of the CTM, which may limit its value in research and practice. These results are in line with emerging evidence of gaps in the validity of the measure.

Keywords: IRT scoring; acute coronary syndromes; care transitions measure; validity

Year: 2014 PMID： 24901109 PMCID： PMC4309102 DOI： 10.1161/JAHA.114.001053

Source DB: PubMed Journal: J Am Heart Assoc ISSN： 2047-9980 Impact factor: 5.501

Introduction

Transitional care refers to actions to ensure the coordination and continuity of care for patients as they transfer between locations of care. Care transitions are particularly important for the 145 million Americans who live with chronic illness and receive care from multiple providers in various settings, often with noncommunicating medical record systems.[1] Poor transitions can fragment care, resulting in conflicting recommendations, increased medical error and duplication, and inadequate information to both patients and caregivers.[2] An effective measure of care transition quality is a key tool for reducing care fragmentation. The most widely used measure of care transition quality is the Care Transitions Measure (CTM‐15).[2] The 4 CTM domains, derived from patient focus groups, are (1) Information Transfer, (2) Patient and Caregiver Preparation, (3) Support for Self‐Management, and (4) Empowerment to Assert Preferences. A 4‐factor structure was retained for the final measure using confirmatory factor analysis. The labels for the retained domains differ a little from these: (1) critical understanding, (2) preferences important, (3) management preparation, and (4) care plan. All 15 items in the CTM‐15 use a 4‐point scale with responses ranging from “Strongly Disagree” to “Strongly Agree.” The items are scored by summing the responses (between 1 and 4) followed by linear transformation to a 0‐to‐100 range. 
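The scoring rule described above can be sketched in a few lines. This is a minimal illustration, not the developers' official scoring code: the function name `ctm_score` is hypothetical, and averaging over the answered items (rather than summing all 15) is an assumption made here so that respondents with missing items can still be scored. When all items are answered, the result is the same as rescaling the sum.

```python
from typing import Optional, Sequence


def ctm_score(responses: Sequence[Optional[int]]) -> Optional[float]:
    """Score CTM items coded 1 (Strongly Disagree) to 4 (Strongly Agree).

    None marks an unanswered item ("Don't know" or "Refused").
    Returns a score on the 0-to-100 range, or None if nothing was answered.
    """
    answered = [r for r in responses if r is not None]
    if not answered:
        return None
    if any(r not in (1, 2, 3, 4) for r in answered):
        raise ValueError("responses must be coded 1-4 or None")
    # Assumption: average the answered items, then map 1..4 linearly onto 0..100.
    mean_response = sum(answered) / len(answered)
    return (mean_response - 1) / 3 * 100


# A respondent who answers "Agree" (3) to every item lands at two-thirds of the range.
all_agree = ctm_score([3] * 15)
```

Under this rule a respondent who selects "Agree" for every item receives a score of about 66.7, which is why uniform responding produces a spike at that value.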
The CTM‐15 has been shown to discriminate between patients who did and did not have a subsequent emergency department visit or rehospitalization for their index condition and between health care facilities with different levels of system integration.[3] A 3‐item version of the CTM was found to explain 88% of the variance in the full measure; the CTM‐3 has been demonstrated to have the same ability to detect group differences as the longer version in earlier studies.[4] The short form includes items from 2 of the 4 originally identified domains (critical understanding and preferences are important). The developers argue that the information lost by using the CTM‐3 measure is small compared with the reduction in response burden. The CTM‐3 measures the extent to which the hospital staff accomplished essential care processes in preparing the patient for discharge and participating in posthospital self‐care activities. Therefore, it is advised that survey data collection be administered between 48 hours and 6 weeks postdischarge. The CTM‐3 was endorsed by the National Quality Forum and included in the Consumer Assessment of Healthcare Providers and Systems (CAHPS) Hospital Survey in 2010. As with most self‐report questionnaires currently used in health research, the CTM is scored using a classic sum score transformed on a 0‐to‐100 scale. 
While this approach is widely used, more‐sophisticated approaches to scale scoring based on item response theory (IRT) are gaining popularity because IRT uses more of the available information in each item. Thus, theoretically, IRT can produce a more discriminating score than a sum score from the same underlying data.[5] In previous studies, comparisons of sum‐ versus IRT‐based scoring of patient‐reported measures have produced mixed results; that is, the benefits of IRT scoring are substantial for some scales but not for others.[5] To the best of our knowledge, the benefits of IRT‐based scoring for the CTM‐15 have not been explored. The development and initial validation of the CTM‐15 and CTM‐3 are well documented[3,6]; however, these studies were conducted on relatively small samples of fewer than 250 people. Given that the CTM‐3 is now included among the core questions designed “to provide a standardized survey instrument and data collection methodology for measuring patients' perspectives on hospital care” as part of a national quality of care initiative, examining CTM measure performance in an independent large‐scale study is warranted. The aim of this report is to conduct a psychometric analysis of CTM items and evaluate the dimensionality, internal consistency, and construct validity of the CTM‐15 and CTM‐3. We also explore the effect of IRT‐based scale scoring on measurement ability and compare the simple summation and IRT‐based scoring approaches head to head.

Methods

Sample

Transitions, Risks, and Actions in Coronary Events Center for Outcomes Research and Education (TRACE‐CORE) is a large, multiracial cohort of adult patients hospitalized with acute coronary syndromes (ACS) from 6 hospitals in Massachusetts and Georgia.[7] A total of 2300 patients were enrolled in the study and were approached for computer‐assisted telephone follow‐up interviews at 1, 3, 6, and 12 months post ACS hospitalization. Interviews focus on individual patient quality of life, rehospitalization, behavioral, and psychosocial characteristics. This study relies on data from the 1545 patients who completed both baseline and 1‐month follow‐up interviews, including the CTM‐15. The study has been approved by the University of Massachusetts Medical School Institutional Review Board and all participants gave informed consent for participation.

Measures

We used CTM‐15 data from the 1‐month follow‐up interview and demographic characteristics and a self‐reported health status measure collected at the baseline interview. Other data collected at 1 month include self‐reported questions on rehospitalization, emergency department visits and symptom development postdischarge along with 3 independent questions evaluating transition quality (access to medical records, prescheduled follow‐up appointments, and knowing who to contact if symptoms worsen).

Analytic Plan

Classic Test Theory Psychometric Analyses

Classic test theory psychometric analyses were performed to provide an overall understanding of item and scale quality. We computed item‐level statistics, including mean scores, proportions of response categories, and item‐total correlations. Items with a negative or low (<0.2) item–total correlation have undesirable discriminating power and should be flagged for further study.[8-9] We also examined scale‐level score distribution, ceiling and floor effects, and internal consistency (Cronbach's α).[10]
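As an illustration of the item‐ and scale‐level statistics named above, the following sketch computes Cronbach's α and item–total correlations with NumPy. It is not the authors' analysis code. The function names and the small `demo` matrix are hypothetical, and the use of the "corrected" item–total correlation (each item against the sum of the remaining items) is an assumption; the text does not say which variant was used.

```python
import numpy as np


def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a respondents-by-items matrix of complete responses."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances.sum() / total_variance)


def corrected_item_total(items: np.ndarray) -> np.ndarray:
    """Correlation of each item with the sum of all other items."""
    items = np.asarray(items, dtype=float)
    total = items.sum(axis=1)
    return np.array([
        np.corrcoef(items[:, j], total - items[:, j])[0, 1]
        for j in range(items.shape[1])
    ])


# Hypothetical responses (rows: respondents, columns: items), coded 1 to 4.
demo = np.array([
    [3, 3, 4],
    [4, 4, 4],
    [2, 3, 3],
    [3, 3, 3],
    [4, 3, 4],
    [1, 2, 2],
])
alpha = cronbach_alpha(demo)
# Flag items whose item-total correlation is negative or below 0.2.
flagged = np.where(corrected_item_total(demo) < 0.2)[0]
```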

Dimensionality Analyses

Dimensionality analyses were conducted to examine the factor structure of the CTM items in the TRACE‐CORE cohort. Evidence of factor structure (how many latent factors are needed to explain the variance of item responses) can be used both to justify the use of a single scale score, as currently reported for the CTM, and to lay the groundwork for satisfying the unidimensionality assumption of further advanced psychometric modeling (eg, IRT).[11-12] Confirmatory factor analyses were conducted in the framework of structural equation modeling using the software Mplus.[13] The polychoric correlation matrix and weighted least squares estimation with mean and variance adjustments for ordered categorical data were used.[14] In addition to a model replicating the CTM's reported 4‐factor structure, we tested a unidimensional model and a bifactor model (items loading on 1 general factor and 4 secondary factors). The model–data fit was evaluated by examining the magnitude and patterns of item factor loadings, as well as the commonly used fit statistics compared with established thresholds (eg, the comparative fit index (CFI) ≥0.95, root mean square error of approximation (RMSEA) ≤0.08).[15] The residual matrix and modification indices were used to detect any local item dependency, which can potentially arise when a subset of items have the same phrasing or format and thus are correlated with each other after the primary factor is controlled for.[16] It was hypothesized that the CTM items present essential unidimensionality with a dominant factor, namely, that the general factor has uniformly high loadings on all items and explains the majority of total item covariance.

IRT Modeling

Modern measurement theory assumes that each response option has a specific relationship with the underlying construct, so a unique response can be selected for a different interval of the scale. If this assumption is violated, individual response options do not provide unique information and therefore can be collapsed before fitting an IRT model.[17] We inspected item characteristic curves for out‐of‐order or nondiscriminating response options by using the TestGraf program. Good items should have response choice categories with unequivocal and unique relationships to the latent trait appearing in rank order. Visual inspection of these item characteristic curve graphs provides insight into whether an item's response choice categories overlap and require rescaling (or collapsing across response choice categories) to achieve distinct categories.[18] To conduct the IRT analyses with the CTM, we used the polytomous generalized partial credit model[19-22] in PARSCALE.[23] The model fit for all items was evaluated using IRTFIT macro for SAS (SAS Institute).[24]
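To make the model concrete, the sketch below evaluates category probabilities for a single item under a textbook form of the generalized partial credit model. It is an illustration only: the function name `gpcm_probs` is hypothetical, and the parameterization (one slope `a`, step parameters `b`, no scaling constant) is an assumption that may differ from the parameterization PARSCALE uses internally. The example values are the slope and thresholds listed for Item 9 in Table 5.

```python
from typing import Sequence

import numpy as np


def gpcm_probs(theta: float, a: float, b: Sequence[float]) -> np.ndarray:
    """Category probabilities for one item under the generalized partial credit model.

    theta: latent trait level; a: item slope; b: step parameters b_1..b_m.
    Returns m + 1 probabilities for response categories 0..m.
    """
    steps = a * (theta - np.asarray(b, dtype=float))
    # Category 0 has an empty sum (0); category k accumulates the first k steps.
    z = np.concatenate(([0.0], np.cumsum(steps)))
    z -= z.max()  # subtract the maximum for numerical stability
    expz = np.exp(z)
    return expz / expz.sum()


# Item 9 after collapsing to 3 categories: a = 4.17, b1 = -1.53, b2 = 0.32.
low = gpcm_probs(-3.0, 4.17, [-1.53, 0.32])
high = gpcm_probs(3.0, 4.17, [-1.53, 0.32])
```

In this form each step parameter is the trait level at which two adjacent categories are equally likely, so an item whose steps are out of order has a category that is never the most probable response. That is the pattern the visual inspection of item characteristic curves is meant to catch.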

Validity Analyses

Validity analyses were conducted to evaluate the ability of CTM scores (CTM‐15, CTM‐3, CTM‐IRT) to discriminate patients of different transitional care quality and health status using previously reported variables. Specifically, the known‐groups method of construct validation was used to compare groups using 1‐way ANOVA and t tests.[25] Groups compared were formed on the basis of demographic characteristics and patients' responses to non‐CTM questions assessing care transition quality processes (eg, access to medical records at discharge, scheduled follow‐up visits with a health care provider), health care use at 1 month postdischarge (emergency department visits, rehospitalization), health status scores, and symptom trajectories postdischarge. We compared the 3 scores by using relative validity coefficients.[26]
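The known‐groups comparison can be sketched from summary statistics alone. The code below is a hedged illustration rather than the authors' procedure: it computes a 1‐way ANOVA F statistic from per‐group counts, means, and SDs, and takes the relative validity (RV) of a comparison score as the ratio of its F statistic to that of the reference score. Treating RV as a simple F ratio is an assumption, although it matches the values in Table 6 (eg, 50.5/53.6≈0.94). Function names are hypothetical, and the group values are the rounded "access to medical records" rows from Table 6, so the recomputed F values differ slightly from the published ones.

```python
from typing import Sequence, Tuple

Group = Tuple[int, float, float]  # (n, mean, SD)


def anova_f(groups: Sequence[Group]) -> float:
    """One-way ANOVA F statistic computed from per-group n, mean, and SD."""
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / n_total
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def relative_validity(f_comparison: float, f_reference: float) -> float:
    """RV of a comparison score against the reference score (assumed F ratio)."""
    return f_comparison / f_reference


# Rounded values from Table 6, "Had access to medical records" (Yes, No).
f_ctm15 = anova_f([(930, 76.5, 14.9), (531, 70.1, 17.7)])
f_ctm3 = anova_f([(930, 76.7, 15.7), (531, 70.3, 18.4)])
rv_ctm3 = relative_validity(f_ctm3, f_ctm15)
```

An RV below 1 means the comparison score separates the groups less sharply than the reference score; an RV above 1 means it separates them more sharply.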

Results

Study participants were predominantly white (79%) and male (67%), reporting a mean age of 62 years. Relatively few subjects reported having a college education or higher (27%) or annual household income greater than $75 000 (29%). Self‐reported health status was normally distributed across the 5 response categories from poor to excellent (Table 1).

Table 1.

Baseline Characteristics of TRACE‐CORE Participants

	CTM‐15 Completers (n=1545), % (n)	CTM‐15 Noncompleters (n=750), % (n)	P Value
Age, mean (SD) y	61.9 (11)	62.7 (12)	NS
Female	33.5 (518)	33.6 (252)	NS
Hispanic or Latino	3.0 (47)	4.2 (30)	NS
White	79.0 (1219)	72.8 (518)	<0.001
Education
Less than high school	13.2 (204)	26.4 (190)	<0.001
High school graduate/some college	59.8 (924)	54.0 (389)
College or higher	26.9 (416)	19.7 (142)
Family income, $
<34 999	39.8 (472)	51.2 (285)	<0.001
35 000 to 74 999	31.2 (370)	28.9 (161)
75+	28.9 (343)	19.9 (111)
Self‐reported health status
Excellent	6.0 (93)	3.6 (26)	<0.001
Very good	20.1 (310)	14.6 (105)
Good	41.1 (633)	35.1 (252)
Fair	22.4 (346)	30.9 (222)
Poor	10.3 (159)	15.9 (114)

CTM indicates care transitions measure; TRACE‐CORE, Transitions, Risks, and Actions in Coronary Events Center for Outcomes Research and Education.

Classic Test Theory Psychometric Analyses

Item‐level review (Table 2) revealed sample means that were slightly higher (average item mean 3.2, range 2.9 to 3.3) than item means reported in earlier development and validation reports (average item mean 3.1, range 2.9 to 3.2),[3-4] yet, as reported in Table 3, these means were lower than those reported in recent publications (eg, average item mean 3.7, range 3.6 to 3.9). In our study, the CTM‐15 scale demonstrated good internal consistency (Cronbach's α 0.95) that is comparable to previous studies (Cronbach's α 0.90 to 0.95); however, the CTM‐15 had a ceiling effect of 8.7%, substantially higher than the 1.1% reported in the original reports and similar to the 10% ceiling effect levels reported in some later studies. None of our participants scored at the floor (at the lowest possible score); indeed, very few (5.5%) scored in the lower half of the scale's theoretical range (<50).
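The floor and ceiling percentages discussed above are simple proportions of respondents at the scale's extremes. A minimal sketch follows; the function name and the example scores are hypothetical.

```python
from typing import Sequence, Tuple


def floor_ceiling_pct(
    scores: Sequence[float], lo: float = 0.0, hi: float = 100.0
) -> Tuple[float, float]:
    """Percentage of respondents at the lowest and highest possible scores."""
    n = len(scores)
    if n == 0:
        raise ValueError("scores must not be empty")
    floor_pct = sum(1 for s in scores if s <= lo) / n * 100
    ceiling_pct = sum(1 for s in scores if s >= hi) / n * 100
    return floor_pct, ceiling_pct


# Hypothetical scores on the 0-to-100 scale.
floor_pct, ceiling_pct = floor_ceiling_pct([0, 50, 100, 100])
```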

Table 2.

CTM‐15 Item Means, SDs, and Response Frequencies

	N*	Mean	SD	Frequencies	Strongly Disagree	Disagree	Agree	Strongly Agree	Don't Know	Refused
Before I left the hospital, the staff and I agreed about clear health goals for me and how these would be reached.	1471	3.2	0.7	n	25	115	861	470	74	1
	1471	3.2	0.7	%	1.6	7.4	55.7	30.4	4.8	0.1
The hospital staff took my preferences and those of my family or caregiver into account in deciding what my health care needs would be when I left the hospital.	1435	3.2	0.7	n	27	137	835	436	105	6
	1435	3.2	0.7	%	1.8	8.9	54.0	28.2	6.8	0.4
The hospital staff took my preferences and those of my family or caregiver into account in deciding where my health care needs would be met when I left the hospital.	1426	3.2	0.7	n	21	134	838	433	118	2
	1426	3.2	0.7	%	1.4	8.7	54.2	28.0	7.6	0.1
When I left the hospital, I had all the information I needed to be able to take care of myself.	1532	3.3	0.6	n	21	70	864	577	13	1
	1532	3.3	0.6	%	1.4	4.5	55.9	37.3	0.8	0.1
When I left the hospital, I clearly understood how to manage my health.	1529	3.2	0.6	n	20	108	911	490	16	1
	1529	3.2	0.6	%	1.3	7.0	58.9	31.7	1.0	0.1
When I left the hospital, I clearly understood the warning signs and symptoms I should watch for to monitor my health condition.	1523	3.3	0.6	n	16	86	854	567	23	0
	1523	3.3	0.6	%	1.0	5.6	55.2	36.7	1.5	0.0
When I left the hospital, I had a readable and easily understood written plan that described how all of my health care needs were going to be met.	1495	3.2	0.7	n	34	145	855	461	50	1
	1495	3.2	0.7	%	2.2	9.4	55.3	29.8	3.2	0.1
When I left the hospital, I had a good understanding of my health condition and what makes it better or worse.	1518	3.3	0.6	n	17	84	902	515	28	0
	1518	3.3	0.6	%	1.1	5.4	58.3	33.3	1.8	0.0
When I left the hospital, I had a good understanding of the things I was responsible for in managing my health.	1532	3.3	0.6	n	11	70	926	525	13	1
	1532	3.3	0.6	%	0.7	4.5	59.9	34.0	0.8	0.1
When I left the hospital, I was confident that I knew what to do to manage my health.	1527	3.2	0.6	n	11	94	942	480	18	1
	1527	3.2	0.6	%	0.7	6.1	60.9	31.1	1.2	0.1
When I left the hospital, I was confident I could actually do the things I needed to do to take care of my health.	1529	3.2	0.6	n	9	104	965	451	16	1
	1529	3.2	0.6	%	0.6	6.7	62.4	29.2	1.0	0.1
When I left the hospital, I had a readable and easily understood written list of the appointments or tests I needed to complete within the next several weeks.	1510	3.3	0.6	n	20	72	880	538	35	1
	1510	3.3	0.6	%	1.3	4.7	56.9	34.8	2.3	0.1
When I left the hospital, I clearly understood the purpose for taking each of my medications.	1533	3.2	0.6	n	12	105	918	498	12	1
	1533	3.2	0.6	%	0.8	6.8	59.4	32.2	0.8	0.1
When I left the hospital, I clearly understood how to take each of my medications, including how much I should take and when.	1536	3.3	0.6	n	7	45	900	584	9	1
	1536	3.3	0.6	%	0.5	2.9	58.2	37.8	0.6	0.1
When I left the hospital, I clearly understood the possible side effects of each of my medications.	1495	2.9	0.7	n	48	320	801	326	49	2
	1495	2.9	0.7	%	3.1	20.7	51.8	21.1	3.2	0.1

CTM indicates care transitions measure.

*Response rates varied by item. The CTM‐15 allows scoring with missing item data.

Table 3.

CTM‐15 Summary Data: Development and Validation Studies

Study	Average Item Score	Item Score Range	CTM Mean	CTM SD	CTM Median	% Patients With Max Score	Cronbach's α
TRACE‐CORE (n=1545)	3.2	2.9 to 3.3	73.9	16.17	66.6*	8.7	0.92
Coleman et al[3] (n=200)	3.0	2.9 to 3.2	67.34	14.67	66.7	1.1	0.93
Coleman et al[27] (n=242)	3.1	2.9 to 3.3	70.1
Parry et al[4] (n=225)			71.21	16.48			0.93 to 0.95
Shadmi et al[6] (Hebrew, n=217)	3.2	3.0 to 3.4	73.1	19.7	71.1	10.3	0.94
Shadmi et al[6] (Arabic, n=100)	3.5	3.2 to 3.7	81.8	16.5	86.1	10.9	0.9
Ryvicker et al[28] (n=495)			63.7	12.1	66.7*

CTM indicates care transitions measure; TRACE‐CORE, Transitions, Risks, and Actions in Coronary Events Center for Outcomes Research and Education.

Of the sample, 26% had a value of 66.7.

Of the sample, 40% had a value of 66.7.

Among the 15 CTM item responses (Table 2), the most frequently selected was “Agree,” followed by “Strongly Agree” and “Disagree.” Nineteen percent (N=288) of all participants selected “Agree” as a response to all 15 questions, while “Strongly disagree,” “Don't know,” and “Refused” were the least selected across all items.

Dimensionality analyses of the CTM in our study population supported use of the single score and provided evidence that the scale is sufficiently unidimensional for IRT analysis. The bifactor model fit the data best (CFI=0.981, Tucker‐Lewis Index (TLI)=0.995, RMSEA=0.10), with the general factor having higher loadings on most of the items and explaining most of the total item covariance (see Table 4). Review of the residual matrix did not detect any items with local dependency (had the lowest score), further supporting the essential unidimensionality of the data and the applicability of IRT methods.

Table 4.

Confirmatory Factor Analyses Model Results

	Unidimensional Model	Bifactor Model: General	Preferences Important	Management Preparation	Critical Understanding	Care Plan
Item 1	0.754	0.729	0.292
Item 2	0.756	0.685	0.694
Item 3	0.741	0.685	0.428
Item 4	0.850	0.838		0.379
Item 5	0.881	0.873		0.265
Item 6	0.821	0.825		0.055
Item 7	0.790	0.797				0.200
Item 8	0.874	0.888		0.109
Item 9	0.947	0.955			−0.099
Item 10	0.922	0.927			−0.071
Item 11	0.821	0.825			0.057
Item 12	0.700	0.706				0.121
Item 13	0.785	0.755			0.610
Item 14	0.796	0.771			0.340
Item 15	0.657	0.642			0.246
Fit statistics
CFI	0.931		0.981
TLI	0.980		0.995
RMSEA	0.19		0.10

CFI indicates Comparative Fit Index; RMSEA, Root Mean Square Error of Approximation; TLI, Tucker‐Lewis Index.
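One way to quantify how strongly the general factor dominates the bifactor solution is the explained common variance (ECV): the sum of squared general‐factor loadings divided by the sum of all squared loadings. The article does not state which statistic the authors used, so the choice of ECV here is an assumption, and the function name is hypothetical. The loadings are copied from the bifactor columns of Table 4.

```python
from typing import Sequence

# Bifactor loadings from Table 4, Items 1 to 15.
general = [0.729, 0.685, 0.685, 0.838, 0.873, 0.825, 0.797, 0.888,
           0.955, 0.927, 0.825, 0.706, 0.755, 0.771, 0.642]
# Each item loads on exactly one secondary (group) factor.
group = [0.292, 0.694, 0.428, 0.379, 0.265, 0.055, 0.200, 0.109,
         -0.099, -0.071, 0.057, 0.121, 0.610, 0.340, 0.246]


def explained_common_variance(
    general_loadings: Sequence[float], group_loadings: Sequence[float]
) -> float:
    """Share of common variance attributable to the general factor."""
    general_ss = sum(x * x for x in general_loadings)
    group_ss = sum(x * x for x in group_loadings)
    return general_ss / (general_ss + group_ss)


ecv = explained_common_variance(general, group)
```

With these loadings the general factor accounts for well over half of the common variance, which is consistent with the reading that the scale is essentially unidimensional.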

IRT Modeling

Review of the item characteristic curves revealed that all 15 items had some nondiscriminating response options that should be collapsed. For 3 items, we collapsed the 4 response categories into 2 (Agree versus Disagree), and for the rest of the items, the response options of “Disagree” and “Strongly Disagree” were collapsed into 1. The results of our IRT parameter estimates and item fit are presented in Table 5. Item slopes ranged between 1.53 and 4.17. The fit index suggested that 5 items did not fit the model well, but the violations were minor. These item parameters were used to calculate an IRT‐based score for the 15 items of the CTM (denoted CTM‐IRT). The correlation between the CTM‐15 and CTM‐IRT scores was very high (r=0.98). A scatterplot of the 2 measures (Figure 1) revealed that the relationship followed the expected S‐shaped curve for higher scores, but for scores below 40, the scatter was rather wide, reflecting the scarcity of information available for estimating the IRT parameters in this range.

Table 5.

IRT Parameters and Item Fit for CTM Items

Item	a	b1	b2	χ²	P (χ²)
Item 1	2.02	−1.56	0.63	87.73	0.00
Item 2	2.15	−1.40	0.68	131.79	0.00
Item 3	2.29	−1.39	0.66	108.33	0.00
Item 4	3.09	−1.60	0.28	37.51	0.11
Item 5	4.07	−1.32	0.41	47.00	0.01
Item 6	3.28	−1.51	0.29	42.36	0.02
Item 7	2.82	−1.23	0.58	128.29	0.00
Item 8	3.53	−1.49	0.36	42.19	0.02
Item 9	4.17	−1.53	0.32	34.55	0.12
Item 10	3.73	−1.43	0.44	29.95	0.23
Item 11	2.71	−1.56	0.64	37.80	0.08
Item 12	2.54	0.40		30.52	0.05
Item 13	2.71	0.52		18.77	0.54
Item 14	2.79	0.29		34.95	0.01
Item 15	1.53	−0.93	1.17	216.45	0.00

CTM indicates care transitions measure; IRT, item response theory.

Figure 1.

Scatter plot of CTM‐15 sum scores and CTM‐15 IRT scores (r=0.98). IRT, item response theory.

Validity Analyses

An ANOVA (Table 6) was used to evaluate the ability of CTM to differentiate scores among different patient groups. This analysis had mixed results. As in previous studies, the CTM‐15 score was usually lower for patients who reported poor general health, problems with independent care transition indicators, and postdischarge rehospitalization and emergency department visits. In our study population, in contrast to the original reports, men and younger patients reported better CTM scores. For most indicators, the differences in the group mean scores were small (2 to 3 points; less than one‐fifth of an SD).

Table 6.

Known Group Validity Results for the Long (CTM‐15), Short (CTM‐3), and IRT‐Based (CTM‐15 IRT) Scores

Variable	Groups (n)	CTM‐15				CTM‐3				CTM‐15 IRT*
		Mean	SD	F	R²	Mean	SD	F	RV (95% CI)	Mean	SD	F	RV (95% CI)
Age	<64 (948)	74.8	16.5			75.2	17.1	4.5^†	1.03 (0.6 to 1.8)	74.8	15.5	5.7^†	1.31 (0.9 to 1.9)
	64 to 75 (412)	73.3	15.8	4.3^†	0.01	73.3	17.0			73.3	14.3
	75+ (185)	71.2	14.8			71.7	15.0			71.0	13.0
Sex	Male (1027)	74.9	15.3	10.7^†	0.006	75.2	16.1	9.2^†	0.86 (0.4 to 1.4)	74.7	14.4	7.7^†	0.71 (0.4 to 1.0)
Sex	Female (518)	72.1	17.6	10.7^†	0.006	72.4	18.3	9.2^†	0.86 (0.4 to 1.4)	72.5	15.9	7.7^†	0.71 (0.4 to 1.0)
Race	Asian/Pacific Islander (11)	76.5	18.1	0.4	NS	77.3	18.7	0.4	NS	75.9	17.5	0.5	NS
	Black (206)	73.2	15.6			73.3	16.5			73.3	15.3
	Native American/Alaska (15)	70.3	11.1			71.5	12.7			70.6	9.0
	White (1219)	74.1	16.3			74.5	17.0			74.1	15.0
	More than one race (71)	74.6	16.9			74.7	18.4			75.2	15.1
Hispanic	Yes (47)	77.1	16.1	1.8	NS	76.8	16.7	1.1	NS	77.0	15.7	2.1	NS
Hispanic	No (1494)	73.9	16.2	1.8	NS	74.2	16.9	1.1	NS	73.9	14.9	2.1	NS
Education	<High school (204)	72.8	14.9	5.8^†	0.007	72.8	16.4	7.1	1.2 (0.8 to 2.1)	72.7	16.4	6.4^†	1.1 (0.8 to 1.7)
	High school/some college (924)	73.2	15.6			73.4	16.4			73.2	16.4
	College graduate or higher (416)	76.3	17.6			76.9	17.9			76.1	17.9
Health status (baseline)	Excellent (93)	77.6	15.8	3.5^†	0.009	78.6	16.0	2.5^†	0.72 (0.5 to 1.7)	77.3	15.2	3.4^†	0.97 (0.7 to 1.2)
	Very good (310)	75.7	16.0			75.3	17.1			75.5	15.2
	Good (633)	74.1	15.8			74.2	16.3			74.0	14.5
	Fair (346)	72.2	16.1			72.8	17.2			72.4	14.9
	Poor (159)	72.2	17.5			73.3	18.3			72.3	15.9
Self‐report health utilization indicators
ED visit since hospital discharge (1 m)	Yes (224)	72.0	17.5	3.8^†	0.002	72.7	17.6	2.3	0.61 (0.1 to 3.8)	72.7	15.6	1.8	0.48 (0.1 to 5.7)
ED visit since hospital discharge (1 m)	No (1317)	74.3	15.9	3.8^†	0.002	74.5	16.8	2.3	0.61 (0.1 to 3.8)	74.2	14.9	1.8	0.48 (0.1 to 5.7)
Hospital readmission for any reason (1 m)	Yes (208)	71.9	17.8	3.9^†	0.002	72.7	18.0	2.1	0.54 (0.1 to 2.9)	72.4	14.8	2.7	0.69 (0.1 to 2.6)
Hospital readmission for any reason (1 m)	No (1337)	74.3	15.9	3.9^†	0.002	74.5	16.7	2.1	0.54 (0.1 to 2.9)	74.2	15.9	2.7	0.69 (0.1 to 2.6)
Symptoms gotten better or worse after discharge (1 m)	A lot worse (30)	71.7	19.0	8.2^*	0.020	71.5	19.1	6.1^*	0.74 (0.5 to 0.9)	72.9	17.8	9.2^*	1.13 (0.9 to 1.3)
	A little worse (45)	66.7	16.1			68.8	16.4			67.4	14.4
	Stayed the same (252)	72.1	16.3			72.5	17.8			71.9	14.8
	A little better (317)	72.1	14.7			72.4	15.6			71.9	14.0
	A lot better (867)	76.1	16.1			76.2	16.7			76.0	14.9
Self‐report care transition indicators
Had access to medical records	Yes (930)	76.5	14.9	53.6^*	0.04	76.7	15.7	50.5^*	0.94 (0.8 to 1.1)	76.0	14.4	39.9^*	0.74 (0.6 to 0.9)
Had access to medical records	No (531)	70.1	17.7	53.6^*	0.04	70.3	18.4	50.5^*	0.94 (0.8 to 1.1)	70.9	15.6	39.9^*	0.74 (0.6 to 0.9)
Prescheduled follow up visits	Yes (1178)	74.9	15.5	13.5^a	0.01	75.1	16.3	10.1^†	0.75 (0.4 to 1.2)	74.6	14.7	7.6^b	0.56 (0.2 to 0.8)
Prescheduled follow up visits	No (338)	71.3	17.6	13.5^a	0.01	71.8	18.5	10.1^†	0.75 (0.4 to 1.2)	72.1	15.5	7.6^b	0.56 (0.2 to 0.8)
Know who to contact if symptoms get worse	Yes (1478)	74.3	15.9	13.1^*	0.01	74.5	16.7	8.1^†	0.62 (0.1 to 1.0)	74.2	14.8	10.9^a	0.83 (0.5 to 1.3)
Know who to contact if symptoms get worse	No (57)	66.4	20.5	13.1^*	0.01	68.0	20.1	8.1^†	0.62 (0.1 to 1.0)	67.5	17.2	10.9^a	0.83 (0.5 to 1.3)

CTM indicates care transitions measure; ED, emergency department; IRT, item response theory; RV, relative validity.

CTM‐IRT score linearly transformed to mean of 74 (SD 14).

*P<0.001; †P<0.05.

Overall, the CTM‐3 detected differences between selected groups as well as the CTM‐15 did for many patient characteristics, self‐reported health, and self‐reported care transition indicator subgroups. However, the CTM‐3 slightly inflated patient care transition scores compared with the CTM‐15; this difference was observed primarily for small subgroups. In this sample, the short measure failed to detect differences between patients with and without self‐reported rehospitalization and emergency department visits. Relative validity coefficients suggest that this underperformance is not statistically significant. As expected, the SD of item responses was larger for the CTM‐3 than for the CTM‐15. All relative validity coefficients for comparing the CTM‐IRT with the traditional CTM‐15 were close to, and not significantly different from, 1.0, suggesting that the IRT scoring approach did not improve measurement precision.

Discussion

Major Findings

The major findings of this study relate to the psychometric characteristics of the CTM and the application of IRT methods to improve measurement precision. Each of these areas is discussed below in the context of previous evidence. The basic psychometric characteristics of the CTM‐15 in our data were similar to those previously reported; item‐level means were comparable, and the measure had good internal consistency and reliability. However, we also identified some undesirable characteristics of the CTM‐15 scale score, which may influence the measurement of care transitions as a performance measure and future multivariable analyses. The distribution of the CTM‐15 summary score was severely left‐skewed, due to a substantial ceiling effect and clustering of high summary transition scores. Moreover, the 4 response options do not provide unique information; 3 CTM items may be better assessed using a binary response. These problems can be partially explained by strong acquiescence bias (the tendency of respondents to agree with statements of opinion regardless of content),[29-30] which is common with the “agree/disagree” format used. Understanding the source does not fix the problem, however, since the highly skewed, clustered responses lead to scores with little variance, making it hard, even with rescaling, to discriminate levels of care transition quality. In our known‐groups validity analyses, the CTM‐15 found statistically significant hypothesized differences supporting findings from earlier reports; however, the magnitude of differences (2 to 3 points on a 0‐to‐100 scale, with an SD of 16) observed for many of these tests was small. In the absence of established guidelines for minimally important differences, we would typically only consider differences to be practically meaningful if they are ≈0.5 SD or greater.
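The comparison against the 0.5 SD benchmark is a one‐line calculation. The sketch below expresses a group difference in SD units; the function name is hypothetical, and dividing by the overall sample SD (16.17, from Table 3) rather than a pooled within‐group SD is a simplifying assumption. The example means are the CTM‐15 values for patients with and without a hospital readmission in Table 6.

```python
def standardized_difference(mean_a: float, mean_b: float, sd: float) -> float:
    """Absolute difference between two group means, expressed in SD units."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return abs(mean_a - mean_b) / sd


# CTM-15 means for no readmission (74.3) vs readmission (71.9); overall SD 16.17.
d_readmission = standardized_difference(74.3, 71.9, 16.17)
meaningful = d_readmission >= 0.5  # benchmark of roughly 0.5 SD
```

A 2.4‐point gap on this scale is about 0.15 SD, well short of the 0.5 SD benchmark.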
The problems discussed here with the CTM‐15 apply equally, if not more, to its short form, the CTM‐3. This is of particular concern given that the CTM‐3 is included in the CAHPS Hospital Survey, which is used to judge hospital quality. Explorations of the validity of the CTM‐3 based on the known‐groups validity method had mixed results that raise questions about its ability to detect differences in care transition quality. Our results complement those of a recent study of the CTM‐15, which also identified some gaps in measurement performance for assessing the quality of care transitions in a complex population of older rehabilitation patients.[31] While the CTM was found to be reliable, the authors noted that the construct validity and utility of the measure could be improved. Qualitative data in their study revealed the arbitrary nature of the choice that some patients make when selecting a response category between “agree” and “strongly agree.” Comments included: “Either you agree or you don't!” and “I don't know why I keep saying ‘agree’ and not ‘strongly agree’; I guess I just don't want you to think I'm not listening by choosing ‘strongly agree’ all the time!” In addition, focus‐group respondents identified some aspects of transitional care that are not included in the CTM, such as building a relationship and communicating effectively with one's clinician, raising questions as to the CTM's content validity. The authors also noted problems associated with the large proportion of participants responding in “agreement,”[31] which aligns with reports from the CTM's developers that most patients agreed with each CTM question (range 69% to 94%).[32] In this study, we also explored the possibility that the application of IRT scoring may improve the measurement properties of the CTM‐15. Theoretically, IRT scoring can often improve measurement precision[5]; however, as indicated by nonsignificant relative validity coefficients, IRT scoring did not improve the CTM‐15.
To our knowledge, this is the first attempt to apply IRT scoring to the CTM. Reports on applying IRT or Rasch scoring approaches to patient‐reported measures developed within the classical test theory framework have produced mixed results. When IRT was used to score the Short Form‐36 Physical Function scale, modest gains were observed; the strongest gains were demonstrated in the most clinically dissimilar groups.[33] In sensitivity studies of the same measure, IRT scores were more sensitive for the general population across 7 countries, while results for the Short Form‐36 in patients with epilepsy were mixed depending on external criteria.[34] Considerable gains in precision were reported for the Rasch scoring of the Oxford Hip Score Questionnaire.[35-36] Precision was also improved using IRT‐based scoring for the upper limb subscale of the Motor Assessment Scale, particularly in the scale's extreme ranges.[37] An IRT‐based scoring approach improved the sensitivity of the Visual Function Index, but produced no gains for cross‐sectional comparisons.[38] No improvement in precision was observed for the Health Assessment Questionnaire[39] and the EORTC QLQ‐C30 scales.[5] Similarly, we could not improve the CTM‐15's measurement precision with IRT‐based scoring. Possible causes for this have been discussed previously[5] and could be related to the misfit of some items to the IRT model, the fact that the measure was developed within the classical test theory framework and not specifically for IRT, and the existence of several parallel items with highly similar content. Unique to the CTM was acquiescence bias, in which up to 30% of respondents selected the same response option (“agree” or “strongly agree”) for all 15 items, leading to extreme clustering of the scale. For these respondents, IRT scoring cannot improve on a traditional summed‐score approach.
With the tight clustering of scale scores in the upper end of the scale, it is possible that IRT scoring failed to capture any additional information for the respondents. These findings are also in line with previous reports that IRT gains are potentially largest at the extremes of a construct's range.[33,37] Finally, it is worth noting that while the CTM‐15 is a patient‐reported measure, it aims to evaluate the patient's experience with a care process; it does not measure health, quality of life, or physical functioning.
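The response patterns discussed above (identical “agree” or “strongly agree” answers to every item, and scores piled at the top of the scale) can be screened for with a few lines of code. The sketch below is illustrative only: it assumes items coded 1 (“strongly disagree”) to 4 (“strongly agree”), a linear 0‐to‐100 transformation of the mean item response, and no missing answers; the toy respondents are invented.

```python
def uniform_agreement(responses, agree_codes=(3, 4)):
    """True if every item got the same response and that response is an agree code."""
    distinct = set(responses)
    return len(distinct) == 1 and next(iter(distinct)) in agree_codes


def scale_score(responses, low=1, high=4):
    """Assumed scoring: linear 0-to-100 transformation of the mean item response."""
    item_mean = sum(responses) / len(responses)
    return (item_mean - low) / (high - low) * 100


def share(flags):
    """Proportion of True values in an iterable of booleans."""
    flags = list(flags)
    return sum(flags) / len(flags)


# Invented toy data (not study data): 5 respondents, 15 items each.
respondents = [
    [4] * 15,                                        # "strongly agree" throughout
    [3] * 15,                                        # "agree" throughout
    [3, 4] * 7 + [3],                                # mixed agreement
    [2, 3, 3, 4, 3, 2, 3, 3, 4, 3, 3, 2, 3, 3, 3],   # some disagreement
    [4] * 14 + [3],                                  # near ceiling
]

acquiescent_share = share(uniform_agreement(r) for r in respondents)
ceiling_share = share(scale_score(r) == 100 for r in respondents)
print(f"Uniform agreement: {acquiescent_share:.0%}, at ceiling: {ceiling_share:.0%}")
```

Respondents flagged by `uniform_agreement` contribute no item‐level variation, which is why any scoring method, IRT‐based or summed, places them at the same point on the scale.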

Strengths and Limitations

Our study has important strengths and several limitations. We used a large and diverse sample to conduct an in‐depth psychometric evaluation using both classic and modern analytic methods. To the best of our knowledge, this is the most comprehensive psychometric evaluation of the CTM. Our rich data set will also allow for future evaluation of the relationship of the CTM with important clinical variables. While our sample is the largest one on which CTM‐15 psychometric analyses have been reported, it is still limited to a particular patient population and particular geographic regions. Psychometric evaluation in a population with different characteristics may yield different results. Good psychometric practice would require validation in samples that differ substantially from the ones previously evaluated. Our study participants who completed the measure reported, on average, a higher level of education, better health, and higher income than all patients enrolled at baseline in the study. These differences may have biased the average CTM‐15 scores, leading to an underestimation of the proportion of patients who report poor transitions and an overestimation of the ceiling effect in the measure. Given the diverse sample on which the analyses were completed, however, the general conclusions of the psychometric evaluation are most likely robust.

Conclusions and Implications

Our findings have important implications for analyzing and interpreting CTM scores. To the extent that CTM score distributions are both clustered around certain values and skewed, it is important that appropriate analytic techniques are used in the analyses of CTM scores. If the linear scoring of the CTM‐15 is not supported by available data, it becomes even more important to use analytic approaches for categorical data. Results from the CTM short form may have an even higher ceiling effect; thus, conclusions based on it may not always correspond to those based on the CTM‐15, as previously suggested. The strong acquiescence bias of the measure also suggests that CTM results may represent an overly optimistic view of care transition quality. To increase the utility of the CTM, it will be important to determine, using recommended triangulation approaches,[40-42] whether the observed magnitudes of difference between groups (2 to 3 points on the CTM scale) are clinically important. Future work should also aim to improve care transition quality assessments by constructing a measure that fully reflects what patients care about in the area of care transition and that is less prone to acquiescence bias. In summary, we identified some psychometric challenges in both the long and short forms of the CTM that could not be resolved with IRT‐based scoring. Combined with accumulating evidence of gaps in the CTM's content validity, this study suggests that CTM scores should be interpreted with caution and that the CTM may not be a sufficiently sensitive tool for detecting meaningful improvements in the quality of transitional care. A new patient‐reported measure of care transition quality that addresses these problems is needed.

25 in total

Review 1. Common method biases in behavioral research: a critical review of the literature and recommended remedies.

Authors: Philip M Podsakoff; Scott B MacKenzie; Jeong-Yeon Lee; Nathan P Podsakoff
Journal: J Appl Psychol Date: 2003-10

Review 2. Understanding the minimum clinically important difference: a review of concepts and methods.

Authors: Anne G Copay; Brian R Subach; Steven D Glassman; David W Polly; Thomas C Schuler
Journal: Spine J Date: 2007-04-02 Impact factor: 4.166

3. Psychometric evaluation and calibration of health-related quality of life item banks: plans for the Patient-Reported Outcomes Measurement Information System (PROMIS).

Authors: Bryce B Reeve; Ron D Hays; Jakob B Bjorner; Karon F Cook; Paul K Crane; Jeanne A Teresi; David Thissen; Dennis A Revicki; David J Weiss; Ronald K Hambleton; Honghu Liu; Richard Gershon; Steven P Reise; Jin-shei Lai; David Cella
Journal: Med Care Date: 2007-05 Impact factor: 2.983

Review 4. Recommended methods for determining responsiveness and minimally important differences for patient-reported outcomes.

Authors: Dennis Revicki; Ron D Hays; David Cella; Jeff Sloan
Journal: J Clin Epidemiol Date: 2007-08-03 Impact factor: 6.437

5. The minimal detectable change cannot reliably replace the minimal important difference.

Authors: Dan Turner; Holger J Schünemann; Lauren E Griffith; Dorcas E Beaton; Anne M Griffiths; Jeffrey N Critch; Gordon H Guyatt
Journal: J Clin Epidemiol Date: 2009-10-01 Impact factor: 6.437

6. Can the care transitions measure predict rehospitalization risk or home health nursing use of home healthcare patients?

Authors: Miriam Ryvicker; Margaret V McDonald; Melissa Trachtenberg; Timothy R Peng; Sridevi Sridharan; Penny H Feldman
Journal: J Healthc Qual Date: 2013 Sep-Oct Impact factor: 1.095

7. Rasch analysis of the Western Ontario MacMaster questionnaire (WOMAC) in 2205 patients with osteoarthritis, rheumatoid arthritis, and fibromyalgia.

Authors: F Wolfe; S X Kong
Journal: Ann Rheum Dis Date: 1999-09 Impact factor: 19.103

8. The feasibility of applying item response theory to measures of migraine impact: a re-analysis of three clinical studies.

Authors: Jakob B Bjorner; Mark Kosinski; John E Ware
Journal: Qual Life Res Date: 2003-12 Impact factor: 4.147

9. Rasch-based scoring offered more precision in differentiating patient groups in measuring upper limb function.

Authors: Asaduzzaman Khan; Chi-Wen Chien; Sandra G Brauer
Journal: J Clin Epidemiol Date: 2013-03-22 Impact factor: 6.437

10. Using the bootstrap to establish statistical significance for relative validity comparisons among patient-reported outcome measures.

Authors: Nina Deng; Jeroan J Allison; Hua Julia Fang; Arlene S Ash; John E Ware
Journal: Health Qual Life Outcomes Date: 2013-05-31 Impact factor: 3.186
