Literature DB >> 27349641

The COPD-SIB: a newly developed disease-specific item bank to measure health-related quality of life in patients with chronic obstructive pulmonary disease.

Muirne C S Paap1,2, Lonneke I M Lenferink3, Nadine Herzog4, Karel A Kroeze5, Job van der Palen5,6.   

Abstract

BACKGROUND: Health-related quality of life (HRQoL) is widely used as an outcome measure in the evaluation of treatment interventions in patients with chronic obstructive pulmonary disease (COPD). In order to address challenges associated with existing fixed-length measures (e.g., too long to be used routinely, too short to ensure both content validity and reliability), a COPD-specific item bank (COPD-SIB) was developed.
METHODS: Items were selected based on literature review and interviews with Dutch COPD patients, with a strong focus on both content validity and item comprehension. The psychometric quality of the item bank was evaluated using Mokken Scale Analysis and parametric Item Response Theory, using data of 666 COPD patients.
RESULTS: The final item bank contains 46 items that form a strong scale, tapping into eight important themes that were identified based on literature review and patient interviews: Coping with disease/symptoms, adaptability; Autonomy; Anxiety about the course/end-state of the disease, hopelessness; Positive psychological functioning; Situations triggering or enhancing breathing problems; Symptoms; Activity; Impact.
CONCLUSIONS: The 46-item COPD-SIB has good psychometric properties and content validity. Items are available in Dutch and English. The COPD-SIB can be used as a stand-alone instrument, or to inform computerised adaptive testing.

Entities:  

Keywords:  COPD; IRT; Item bank; Item response theory; MRF-26; Patient perspective; QoL-RIQ; SGRQ-C; VQ11

Mesh:

Year:  2016        PMID: 27349641      PMCID: PMC4924274          DOI: 10.1186/s12955-016-0500-0

Source DB:  PubMed          Journal:  Health Qual Life Outcomes        ISSN: 1477-7525            Impact factor:   3.186


Background

In the last few decades, it has been recognised that it is imperative to include health-related quality of life (HRQoL) as an outcome measure in the evaluation of treatment interventions in patients with chronic obstructive pulmonary disease (COPD) [1, 2]. COPD is a chronic respiratory condition that cannot be cured; therefore, many COPD treatment programmes focus on the self-management of symptoms and their effect on the patient’s HRQoL [3]. Currently, HRQoL in patients with COPD is typically measured by means of standardised self-report questionnaires that were developed using Classical Test Theory (CTT) [4]. Although most HRQoL questionnaires have been extensively validated, their use is not without limitations; many of these limitations stem directly from the static nature of the current generation of questionnaires [5]. To facilitate the comparison of scores within and among patients, the same questions need to be administered to each patient at each time-point. This means that a single set of questions should be suitable to assess the entire underlying range of HRQoL (from very good to very poor) and should provide sufficient measurement precision at all levels in between. Consequently, a large number of questions are typically required to achieve both sufficient measurement width (content validity) and precision (reliability). This places a considerable burden on patients, who have to complete numerous items, many of which seem irrelevant or redundant to their specific situation. Ideally, each questionnaire should be tailored to the individual patient, resulting in each item (question) soliciting valuable information. However, this should not result in a lack of comparability across patients. This flexibility can be achieved using modern techniques: computerised adaptive testing (CAT). CAT [6] is a specific type of computer based testing that uses an Item Response Theory (IRT) [7] measurement model for item selection during test taking. IRT and CAT were first used in the field of educational measurement. In the last few decades, both techniques have become increasingly popular in health research. Item selection in a CAT is dependent on a patient’s estimated score on one or more latent traits. The estimate of the score on the latent trait (here: HRQoL) of the patient is continuously adjusted (each time an answer to an additional item is given) until a specific pre-defined criterion is reached [8]. This procedure permits a higher degree of precision with fewer items than a procedure using static scales [8]. CAT is scored in real-time; results can be displayed to the physician and/or patient almost instantly in written and graphic reports. A CAT selects items from a pool of items: an item bank. An item bank ideally consists of a large number of items covering all relevant aspects of the construct under study. An item bank can be developed from scratch, or built on the foundations of previous work (e.g., using items from existing questionnaires as a starting point) [5, 8]. Item bank development usually includes both quantitative and qualitative methods; i.e., respectively, evaluating the item performance using an IRT model, and conducting cognitive interviews or focus groups in order to obtain in-depth understanding of the way the construct is perceived by members of the target population and cognitive interviews to improve item formulation (see e.g., [9-15]). It is paramount that the items be of good quality, both in terms of content validity and psychometric properties: a CAT can only be as good as the item bank it is based on [8]. After the key concepts to be included in the bank have been identified, the formulation and presentation of the items has been found adequate, and the psychometric properties of the items favourable (acceptable coverage of latent trait values, adequate measurement precision where it is needed) a final calibration of the item bank is performed. From this point onward the item parameters are considered “known” and can be used for item selection in CAT. There is a need for flexible, accurate, and efficient assessment of quality of life in COPD. Currently, there is no gold standard. The SGRQ and SGRQ-C are two of the best-known legacy measures and have been shown to be of high quality; however, they might be viewed as problematic or unsuitable for use in (routine) practice, due to their length. The purpose of this paper is to describe the development of the COPD-SIB: a COPD-specific HRQoL item bank that can be used to inform CAT, covering topics that are relevant to COPD patients. We report on both qualitative (item selection and generation) and quantitative (psychometric analysis using IRT) aspects of this process.

Methods

Item selection and development

A predefined structured item generation methodology was used to select and design items for the COPD-SIB. This procedure consisted of three steps (which are illustrated in Fig. 1). First, it was determined which topics should be covered. Topics were identified by conducting a literature review and by re-analysing interviews with patients conducted previously [9]. This task was performed by LL under the supervision of MP. Second, relevant items were selected from existing instruments based on the findings of step 1, and new items were written to fill gaps (defined as topics that were not sufficiently covered). This task was jointly performed by LL and MP, and reviewed by JP. Third, the items selected and developed in step two were evaluated for relevance and clarity in several sets of cognitive interviews (see Additional files 1 and 2); the results from these interviews were used to further improve the items and fill newly identified gaps (defined as topics that had not been identified in a previous step but emerged as highly relevant based on the interviews conducted in step 3). This task was primarily performed by MP, with contributions from LL and under the supervision of JP.
Fig. 1

Flowchart of the development process of the COPD-specific item bank (COPD-SIB)

Flowchart of the development process of the COPD-specific item bank (COPD-SIB) The St. George Respiratory Questionnaire for COPD patients (SGRQ-C) was taken as a starting point, since it is widely used and contains many items of high quality [16, 17]. Items from other instruments were considered for inclusion if a) they pertained to themes considered important by COPD patients (importance was deduced from interviews and literature review), and b) they did not show too much overlap with SGRQ-C items. Permission from the developers of the questionnaire for use of these items was a requirement. We included items from five existing questionnaires in our initial item pool: the SGRQ-C, the Quality of Life for Respiratory Illness Questionnaire (QoL-RIQ), the COPD Assessment Test, the Maugeri Respiratory Failure Questionnaire Reduced Form (MRF26), and the VQ11 [18-22]. After items had been selected from existing instruments, the topics covered by these items were compared to the ones most frequently mentioned in the patient interviews. Gaps were identified, and new items were written using statements made by patients as a starting point. For the SGRQ-C and the COPD Assessment Test, official Dutch translations were available. The items selected from the QoL-RIQ, MRF26, and VQ11 were translated into Dutch by an expert; a native Dutch speaker who holds a university degree in English Language and Culture and has ample experience in English-Dutch and Dutch-English translation. She also translated all newly developed Dutch items into English. All items in the initial item pool were subjected to cognitive debriefing, using the Three Step Test Interview (TSTI) [23]. In this study, only the Dutch items underwent the process of cognitive debriefing and validation. We plan to repeat this process for the English items in a future study, in collaboration with colleagues from Canada [24]. See Additional file 1 for a detailed explanation of this procedure along with example probes.

Patients

Data from three Dutch COPD patient samples were used for the analyses (see Fig. 1). Purposive sampling was used for samples 1 and 2 (interview data); inclusion stopped when saturation was reached. The inclusion criteria were: a medical diagnosis of COPD; sufficient mastery of the Dutch language; being able to answer questions in a face-to-face interview (samples 1 and 2); being able to complete a questionnaire (samples 1-3). All patients in samples 1 and 2 were recruited through pulmonary clinics in the Netherlands. The patients in sample 3 (questionnaire data) were recruited through healthcare professionals in JP’s professional network. See Additional file 2 for detailed information about the samples.

Psychometric evaluation of the item bank

Test design

In addition to evaluating the psychometric properties of the COPD-SIB items, we wanted to establish the measurement properties of three generic HRQoL domains in a Dutch COPD sample. The results for these three domains will be presented in a separate paper. We did not want to create one long questionnaire including all four domains, since this would be very burdensome for patients; therefore we decided to divide the total number of items1 over three so-called booklets (questionnaire versions), each containing around 100 items.2 The booklets contained between 23 and 32 COPD-SIB items each, of which 10 were anchor items. Anchor items are items that are present in every booklet and which are thought to have stable measurement properties. They can be used to link the items in the different booklets to form a common scale, when using parametric IRT (this procedure is also known as equating) [25]. A widely used guideline to selecting anchor items is that this item set should be a mini-version of the whole item bank, implying that the anchor set should cover the same content (but with fewer items) as the total item bank [25]. The anchor item set used in this study was selected by a content expert (JP) to ensure it adequately reflected the original spread in topics. The other COPD-SIB items were divided randomly over the three booklets. See Fig. 2 for a visual impression of the booklet design, and Table 2 for more information regarding which item was included in which booklet.
Fig. 2

Visual representation of booklet design with number of items on the y-axis and booklet number on the x-axis. Note that the items are ordered according to their booklet assignment to illustrate the design

Table 2

Item properties

Item nritem content (key words) \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \widehat{\boldsymbol{\alpha}} $$\end{document}α^ \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \widehat{{\boldsymbol{\beta}}_1} $$\end{document}β1^ \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \widehat{{\boldsymbol{\beta}}_2} $$\end{document}β2^ \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \widehat{{\boldsymbol{\beta}}_3} $$\end{document}β3^ \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \widehat{{\boldsymbol{\beta}}_4} $$\end{document}β4^ bookletflagged in MSA (booklet)
1air-conditioning1.06−2.64−1.080.271
2cold1.15−3.1−0.920.321.541, 2, 3
3fog1.4−2.08−0.280.711.831
4humidity1.39−2.32−0.330.771.641
5wind1.25−2.09−0.380.621.881, 2
9achieving objectives1.8−1.040.831.382.311, 2, 3
10confidence0.31−11.43−5.73−2.027.991, 2, 31, 2, 3
11life worth living0.76−2.280.562.23.7222
12asking for help0.81−1.781.392.244.161, 2, 32, 3
13friendship−0.594.972.50.57−3.633
14hopelessness1.69−2.51−0.840.151.341
15fatigue1.21−2.270.010.942.71, 2, 3
16living consciously−0.525.692.590.23−5.7722
17accepting help1.52−1.050.721.462.291
18life worth living 21.32−2.71−1.57−0.381.183
19feeling content0.26−11.08−6.75−3.344.811
20feeling disappointed1.35−2.06−0.40.682.472
21fear of being alone1.69−1.65−0.440.752
22talking about anxiety1.1−1.02−0.041.851
23feeling restricted2.33−1.080.51.161.873
24valuing life0.71−4.84−2.22−0.932.6433
25shunning activities1.43−2.280.160.852.262
26avoidance0.65−4.09−0.441.173.6811
27fear of suffocation1.94−2.18−0.87−0.141.121, 2, 3
28not accepting restrictions1.61−1.320.561.312.411
29dependence2.08−2.12−0.440.031.22
30talking1.72−2.34−0.65−0.061.223
31bowel problems0.86−2.91−1.180.6611
32friendship 21.91−2.06−0.75−0.180.941, 2, 3
33alone1.74−2.1−0.75−0.131.183
34fear of being alone 21.67−1.39−0.630.952
35cough hurts1.41−1.68−0.50.981
36cough tired1.79−1.78−0.050.581.431
37breathless talk1.48−0.530.241.692
38breathless bend1.52−1.870.070.721.93
39sleep disturbed1.44−2.63−0.750.091.541, 2, 32
40exhausted2.04−1.130.961.433
41cough embarrassing1.11−3.35−1.19−0.181.941, 2, 33
42nuisance to others1.97−2.21−0.86−0.151.222
43panic2.4−1.86−0.630.041.192
44not in control1.64−1.41−0.391.591
45frail, invalid2.33−1.75−0.280.31.282
46effort2.22−1.99−0.560.161.543
47activities (gardening)2.34−1.130.110.621.641
48activities (exercise)1.55−0.820.921.5433
49activities (dancing)2.12−0.880.350.811.81
50frustration2.21−2.07−0.70.461.23
51being fed up1.7−2.41−1.37−0.661
52wanting to stay in bed1.35−2.14−0.9403
53acceptance−0.324.932.4−2.92−8.1722
54adapting0.12−14.8−7.533.9420.3533
55panic 21.88−2.06−0.880.252
56coping1.08−3.1−1.080.973
57control1.39−1.050.741
58cough0.74−3.97−0.921.513.3211
59phlegm0.8−3.27−1.430.522.1711
60short of breath1.95−1.58−0.221.551
61wheezing0.81−3.47−1.510.492.0211
62breathless wash2.28−1.47−0.640.10.663
63breathless walk 12.49−2.07−1.25−0.20.473
64breathless walk 20.98−1.98−0.511.242.311, 2, 31, 2
65breathless stairs0.77−1.52−0.191.352.9222
66breathless hills0.04−17.486.5422.3236.1411, 2

Note: the reported parameter estimates were calculated using the GRM; the last two columns indicate in which booklet the item was included, and whether or not the item was flagged for removal in the Mokken Scale Analysis (MSA)

Visual representation of booklet design with number of items on the y-axis and booklet number on the x-axis. Note that the items are ordered according to their booklet assignment to illustrate the design

Assessing item quality and calibrating the item bank

The main purpose of the current study was to develop a unidimensional disease-specific item bank: the COPD-SIB. We wanted to retain only items of sufficient psychometric quality. The Graded Response Model (GRM; an IRT model suitable for Likert scale data) [26, 27] was estimated to obtain item parameters needed for the CAT. Several item fit statistics are currently available for the GRM, such as the S-X; however, these only have adequate power in very large samples [28]. Unsurprisingly, this statistic did not flag any item for misfit in our analysis. Rather than relying on these outcomes, we used two complementary procedures providing outcomes that are not dependent on the IRT model under evaluation: Mokken Scale Analysis (MSA) [29, 30] and parametric smoothed regression lines based on a generalised additive model (GAM) [31]. MSA was used to identify items that formed a strong unidimensional scale. Items that were flagged for removal by the MSA were further evaluated by visually inspecting the response curves estimated using GAM plots to determine the nature of the misfit. A GAM model is a generalised linear model based on a set of smooth functions; the model does not require a detailed specification of parametric relationships, thus allowing for relatively flexible modelling of statistical relationships (typically involving regression splines) [32]. MSA was performed using the R [33] package Mokken [34]. The model used was the monotone homogeneity model (MHM), which is a nonparametric IRT model. In recent years, MSA has been increasing in popularity in health research (e.g., [16, 35–42]). MSA is a scaling method that identifies scales that allow an ordering of individuals on an underlying one-dimensional scale using the unweighted sum of item scores. In order to establish which items co-vary and form a scale, scalability coefficients are calculated on three levels: item-pairs (H), items (H), and scale (H). H is based on H and reflects the degree to which the scale can be used to reliably order persons on the latent trait using their sum score. Similar to the item-rest correlation, H also expresses the degree to which an item is related to other items in the scale. A scale is considered acceptable if 0.3 ≤ H < 0.4, good if 0.4 ≤ H < 0.5, and strong if H ≥ 0.5 [12; 13]. The MSA analyses were performed for each booklet separately (since MSA cannot account for the type of test design we used). We first performed confirmatory analyses, using H ≥ 0.3 as a cut-point for an acceptable scale. Since the H-value for one of the booklets fell below the cut-point, the confirmatory analyses were followed by exploratory analyses, again using H ≥ 0.3 as a cut-off. In an exploratory MSA scales are formed in an iterative manner; the selection algorithm starts with two good items, adding one item at a time using certain criteria (H ≥ user-specified cut-off; the item under consideration does shows a positive relationship in terms of H with other items in the scale). Two selection algorithms are currently available; we chose to use the newer one, the genetic algorithm [43]. The GRM was fitted and parameters were estimated using the R package mirt [44]. Metropolis-Hastings Robbins-Monro (MH-RM) estimation was used with a tolerance threshold of 0.001. The algorithm converged after 602 iterations. The GAM plots were also produced using the mirt package (function itemGAM).

Results

Domain definition

Eight important themes not covered by PROMIS domains were identified based on literature review and patient interviews: Coping with disease/symptoms, adaptability Autonomy Anxiety about the course/end-state of the disease, hopelessness Positive psychological functioning Situations triggering or enhancing breathing problems Symptoms Activity Impact Items that pertained to these eight themes were selected/written to be included in the COPD-SIB item bank.

Item generation and revision

The items that were selected for psychometric evaluation are listed in Table 1 (English version). Note that the items were coded in such a way that a higher score on the latent trait is indicative of better quality of life. We decided not to include the COPD Assessment Test items, since patients were confused by the format (most patients only read/paid attention to the left half of the items). The SGRQ-C items, on the other hand, were generally very well-received by patients. We used the findings reported by Paap et al. [17] to inform item revision for the SGRQ-C items that were included in the initial item pool.
Table 1

Overview over items selected for psychometric evaluation

Item nrSourcea Original ItemRevised ItemRevised response format (RF) and instruction (In)
1QoL-RIQBeing in air-conditioned buildingsBeing in air-conditioned buildings (for instance, in hospitals)RF1, In1
2QoL-RIQOn cold daysRF1, In1
3QoL-RIQOn foggy daysRF1, In1
4QoL-RIQOn humid daysRF1, In1
5On windy daysRF1, In1
6QoL-RIQBeing outside during the polling seasonRF1, In1
7QoL-RIQDue to domestic animals or petsRF1, In1
8QoL-RIQBy flowers, trees, plantsRF1, In1
9VQ11I feel unable to achieve my objectivesBecause of my COPD, I feel unable to achieve all of my objectives.RF2, In2
10I am confident I will be able to cope with my COPD, even if the complaints get worse.RF2b, In2
11I can imagine that there are people with severe COPD complaints, who feel that life is not worth living anymore.RF2, In2
12I don’t like having to ask somebody to help me, when I cannot do something myself.RF2, In2
13Because of my COPD, I appreciate my friends more.Because of my COPD, I appreciate my social contacts (e.g., friends, partner, relatives) more.RF2b, In2
14When I think about my COPD, I have a feeling of hopelessness.RF2, In2
15I shun activities I know will cause fatigue and breathlessness.I shun activities I know will cause fatigue.RF2, In2
16Since being diagnosed with COPD, I have lived more consciously.RF2b, In2
17I find it frustrating that I have to accept help for things I was used to doing myself.RF2, In2
18If my COPD symptoms get worse, I don’t care about life anymore.RF2, In2
19I am content with the things I can still do.RF2b, In2
20I feel disappointed, when I’m not able to do something because of my COPD.RF2, In2
21Because of my COPD I’m afraid of being alone.RF2, In2
22When I worry about my COPD, I find it hard to talk about it.RF2, In2
23I don’t feel restricted, due to my COPD.I feel restricted, due to my COPD.RF2, In2
24I value my life just as much as I did before I was diagnosed with COPD.RF2b, In2
25I shun activities I know will cause fatigue and breathlessness.I shun activities I know will cause breathlessness.RF2, In2
26I avoid thinking about how my COPD could get worse in the future.RF2, In2
27Once in a while I have such shortness of breath that I fear I will suffocate.Once in a while I have such shortness of breath/wheezy chest that I fear I will suffocate.RF2, In2
28I find it very hard to accept that I cannot do everything I would like to do, due to my COPD.I find it hard to accept that I cannot do everything I would like to do, due to my COPD.RF2, In2
29QoL-RIQFeeling dependent upon othersI don’t like the feeling of being dependent upon others.RF2, In2
30MRF-26Because of my lung disease, I cannot talk as much as I would like to.Because of my COPD, I cannot talk as much as I would like to.RF2, In2
31Because of my COPD, I am sometimes unable to control my bowel movements.RF2, In2
32MRF-26Because of my COPD, I visit friends and acquaintances less frequently than I used to.Because of my COPD, I go out to see friends or acquaintances less than usual.RF2, In2
33MRF-26Because of my COPD, I spend much more time alone.Because of my COPD, I spend much more time alone.RF2, In2
34MRF-26Because of my COPD, I would like somebody to accompany me, when I go outBecause of my COPD, when I am outside I feel I need to have someone with me.RF2, In2
35SGRQ-CMy cough hurtsRF2, In2
36SGRQ-CMy cough makes me tiredRF2, In2
37SGRQ-CI am breathless when I talkI get breathless when I talkRF2, In2
38SGRQ-CI am breathless when I bend overI get breathless when I bend overRF2, In2
39SGRQ-CMy cough or breathing disturbs my sleepRF2, In2
40SGRQ-CI get exhausted easilyI get tired easilyRF2, In2
41SGRQ-CMy cough or breathing is embarrassing in publicI feel ashamed when I have to cough or when I have difficulty breathing in the presence of other peopleRF2, In2
42SGRQ-CMy chest trouble is a nuisance to my family, friends or neighboursI feel that my chest trouble is a nuisance to my environment (e.g. family, friends or neighbours)RF2, In2
43SGRQ-CI get afraid or panic when I cannot get my breathRF2, In2
44SGRQ-CI feel that I am not in control of my chest problemI have the feeling that I am not in control of my chest problemRF2, In2
45SGRQ-CI have become frail or an invalid because of my chestI have become frail or an invalid because of my chest problemRF2, In2
46SGRQ-CEverything seems too much of an effortRF2, In2
47SGRQ-CMy breathing makes it difficult to do things such as walk up hills, carrying things up stairs, light gardening such as weeding, dance, play bowls or play golfMy breathing problems make it difficult to do light gardening, such as weeding.RF2, In2
48SGRQ-CMy breathing makes it difficult to do things such as carry heavy loads, dig the garden or shovel snow, jog or walk at 5 miles per hour, play tennis or swim.My breathing problems make it difficult to exercise (e.g., jogging, playing tennis, or swimming).RF2, In2
49SGRQ-CMy breathing makes it difficult to do things such as walk up hills, carrying things up stairs, light gardening such as weeding, dance, play bowls or play golf.My breathing problems make it difficult to do things such as dancing, playing golf, or playing bowls.RF2, In2
50It frustrated me that I couldn’t do everything I wanted to do anymore.RF3, In3
51I thought sometimes, I’m really fed up with everything.RF3, In3
52I wanted to stay in bed/lie down on the couch all day, when I had a “bad” day.RF3, In3
53I resigned myself to the fact that I was not able to do certain things anymore, due to my COPD.I could accept it, when I was not able to do something anymore, due to my COPD.RF3b, In3
54I tried to find an alternative when I could not perform a certain activity due to my COPD.I persevered until I had finished an activity, despite the fact that I couldn’t perform that activity well, due to my COPD.RF3b, In3
55I panicked, when I had trouble breathingRF3, In3
56I could cope with my COPD.RF3b, In3
57I got my breathing problems under control.RF3b, In3
58SGRQ-CI coughI coughed.RF3, In3
59SGRQ-CI bring up phlegm (sputum)I brought up phlegm (sputum).RF3, In3
60SGRQ-CI have shortness of breathI had shortness of breath.RF3, In3
61SGRQ-CI have attacks of wheezingI had attacks of wheezing.RF3, In3
62SGRQ-CGetting washed or dressedRF3, In4
63SGRQ-CWalking around the homeRF3, In4
64SGRQ-CWalking outside on the levelGoing for a walkRF3, In4
65SGRQ-CWalking up a flight of stairsWalking up a flight of stairs (one floor)RF3, In4
66SGRQ-CWalking up hillsWalking up a steep hillRF3, In4

RF1: 4 = Not at all; 3 = A little bit; 2 = Somewhat; 1 = Quite a bit; 0 = Very much

RF2: 4 = Strongly disagree; 3 = Disagree; 2 = Neither agree nor disagree; 1 = Agree; 0 = Strongly agree

RF3: 4 = Never; 3 = Rarely; 2 = Sometimes; 1 = Often; 0 = Always

In1 = “How much have you been troubled by breathing problems due to the following circumstance?”

In2 = “Please, indicate the degree to which you agree or disagree with the following statement”

In3 = “In the past 7 days…”

In4 = “Please, indicate whether the following activity causes shortness of breath. If the weather influences your complaints, assume the weather conditions are favourable, when you answer this question”

aIf the source is not given, it concerns a newly written item

bThe item scores for these items need to be reversed prior to analysis due to positive wording

Overview over items selected for psychometric evaluation RF1: 4 = Not at all; 3 = A little bit; 2 = Somewhat; 1 = Quite a bit; 0 = Very much RF2: 4 = Strongly disagree; 3 = Disagree; 2 = Neither agree nor disagree; 1 = Agree; 0 = Strongly agree RF3: 4 = Never; 3 = Rarely; 2 = Sometimes; 1 = Often; 0 = Always In1 = “How much have you been troubled by breathing problems due to the following circumstance?” In2 = “Please, indicate the degree to which you agree or disagree with the following statement” In3 = “In the past 7 days…” In4 = “Please, indicate whether the following activity causes shortness of breath. If the weather influences your complaints, assume the weather conditions are favourable, when you answer this question” aIf the source is not given, it concerns a newly written item bThe item scores for these items need to be reversed prior to analysis due to positive wording We followed an iterative procedure (three revision rounds) for the remaining items, since this subset of the item pool included newly written items. Patients clearly had trouble switching back and forth between different response formats, and strongly objected to dichotomous response options. Therefore, we decided to standardise the response format for all items in the item bank to a 5-point Likert-scale reflecting magnitude (“not at all” to “very much”), frequency (“never” to “always”), and agreement (“strongly disagree” to “strongly agree”). Composite items were split into separate ones, double negations were rephrased, and the expression “lung disease” was changed to “COPD”. See Table 1 for the original and revised item texts.

Preparing the data for psychometric analysis

A large number of items had low endorsement (n < 10) for at least one response option/category. This can cause problems in psychometric analyses; hence, the problematic categories were merged with adjacent categories for these items. Note that items having different numbers of response categories due to merging does not constitute a problem for the GRM, nor does it hamper the comparison of item discrimination parameters (estimated with the GRM) among items. See Additional file 3 for the R code used to merge item categories. Three items were removed at this stage, due to a large number of missing values (>20 %) per booklet: items 6, 7 and 8.

Assessing item quality: results of the MSA and visual inspection of GAM plots

MSA requires a complete data-set. Therefore the MSA analysis was repeated for each booklet separately and two-way imputation was used to create a complete data-set for each booklet (2-4 % missing values per booklet) [45, 46]. The confirmatory analyses resulted in acceptable H-values for booklets 1 (.30) and 3 (.31), but a low H-value for booklet 2 (.26). Taking the results of the three exploratory MSA’s together, 19 items (see Table 2) were flagged as problematic (most of them had very low or even negative H values and were not assigned to any scale). If these items would have been excluded from the analyses, the H-values would have equalled .43, .40, and .43 for booklets 1, 2, and 3, respectively. Item properties Note: the reported parameter estimates were calculated using the GRM; the last two columns indicate in which booklet the item was included, and whether or not the item was flagged for removal in the Mokken Scale Analysis (MSA) Visual inspection of the GAM plots (smoothed regression lines) for the items flagged for removal in the MSA revealed substantial differences between one or more response curves as estimated under the GAM as compared to their counterparts estimated under the GRM, for most items. In some cases, one or more of the response curves was hard to estimate (very erratic, with multiple peaks). For five items (10, 19, 24, 53, 54), a very striking type of misfit was identified: the GAM plots showed that one or more response curves were U-shaped, indicating that both patients with very high and very low θ-scores scores were likely to endorse these response categories (see Fig. 3 for example plots).
Fig. 3

Option response curves as estimated using the GRM (on the right), and parametric smoothed regression lines based on a GAM (on the left) for an item with good fit to the GRM (item 27), and one with bad fit (item 10)

Option response curves as estimated using the GRM (on the right), and parametric smoothed regression lines based on a GAM (on the left) for an item with good fit to the GRM (item 27), and one with bad fit (item 10)

Calibrating the item bank: results of the parametric IRT analysis

Table 2 shows the estimated parameters based on the GRM for 63 out of 66 items.3 Up to five parameters are calculated in this model: the slope (denoted α) and the thresholds (denoted β). The slope of an item expresses its ability to discriminate among persons with low and high HRQoL; it is also indicative of how strongly this item is related to the latent trait (denoted θ). The threshold parameters indicate the point on the latent trait scale at which 50 % of the patients would choose the response category in question or higher. Since the probability is always 100 % for choosing the lowest category or higher, there is no threshold for the lowest category. Originally, all items were scored on a 5-point Likert scale ranging 0-4; however, since we had to collapse some response categories due to data sparseness, not all items in Table 2 have four thresholds. For example, for item 21 (“Because of my COPD I’m afraid of being alone.”), the categories 0 (strongly agree) and 1 (agree) were merged. Thus, the probability of choosing at most neither agree nor disagree is 50 % for patients with a θ-score of -2.79; the probability of choosing at most disagree is 50 % for patients with a θ-score of -0.736; and the probability of choosing strongly disagree is 50 % for patients with a θ-score of 1.267. The metric of the threshold values is determined by the distribution of θ. A standard normal distribution (mean = 0, SD = 1) was assumed when estimating the model (this is done to identify the model, similar to confirmatory factor analysis; in Bayesian terms this can be considered as a prior distribution). The threshold values as well as θ-scores may be interpreted relative to this distribution. Bayesian expected a-posteriori (EAP) scoring was used to estimate the θ-scores. The EAP estimator uses prior information (in this case the estimated population distribution in the fitted model) in calculating θ-scores. When this method is used, extreme scores are pulled in toward more realistic values. This is especially useful in cases where patients endorse either the lowest or highest response category on all items, in which case the maximum likelihood estimate is undefined. Figure 4 depicts the distribution of estimated θ-scores as well as the estimated threshold parameters. Both distributions look reasonably normal, and the threshold parameters cover the entire range of relevant θ-values (see Fig. 4).
Fig. 4

Distribution of estimated theta-values (solid line) and of estimated beta parameters (dashed line); both estimated using the Graded Response Model

Distribution of estimated theta-values (solid line) and of estimated beta parameters (dashed line); both estimated using the Graded Response Model The information function (Fig. 5) shows that the item bank covers all relevant θ-values (>99 % of θ-values fall in the range of -3 and +3). This figure depicts the measurement precision as a function of θ. An information value of 5 corresponds with a reliability of 0.8. The information function is the sum of the item information functions; each item gives most information close to its thresholds, and items with higher slopes give more information.
Fig. 5

Information Function for the full item bank (solid line) and the shortened item bank (dashed; problematic items removed)

Information Function for the full item bank (solid line) and the shortened item bank (dashed; problematic items removed)

Selecting items for the final item bank

As can be seen from Table 2, 17 out of 20 items flagged by the MSA had low (<1) or even negative α values. For three flagged items (39, 41, 48), no clear reason for misfit could be identified (acceptable item parameters, no obvious difference between GAM and GRM plots). These three items were therefore retained in the item bank. The GRM was estimated again after removal of the 17 problematic items. The resulting item parameters can be found in Additional file 4. This set of 46 items can be considered as the final item bank. Removing problematic items did not have a substantial effect on the information function (Fig. 5).

Discussion

This paper describes the development of an item bank that measures disease-specific quality of life in patients with COPD: the COPD-SIB. We started out with 66 items (including SGRQ-C items) covering content described as highly relevant by patients, healthcare professionals, and the literature. These items were assessed using complementary psychometric techniques and the data of 666 Dutch COPD patients. The final item bank contains 46 items that form a strong scale. This item bank could be used as a stand-alone instrument, either in full-bank form; better yet, it could be used as the basis for CAT. Seven items stood out among misfitting items: they had negative slope parameters and/or one or more response curves were U-shaped. Negative slope parameters were found for four items (item 13: “Because of my COPD, I appreciate my social contacts (e.g., friends, partner, relatives) more”; item 16: “Since being diagnosed with COPD, I have lived more consciously”; item 53: “I could accept it, when I was not able to do something anymore, due to my COPD”; item 54: “I persevered until I had finished an activity, despite the fact that I couldn’t perform that activity well, due to my COPD”), while U-shaped response curves for one or more categories were found for four items (item 10: “I am confident I will be able to cope with my COPD, even if the complaints get worse”; item 19: “I am content with the things I can still do”; item 24: “I value my life just as much as I did before I was diagnosed with COPD”; and item 53). When comparing the content of these items to other items in the bank, it is apparent that these items are all worded in a positive way whereas most items in the bank are not. Only one positively worded item showed good fit (item 57: “I got my breathing problems under control.”). The reason we included items with a more positive item formulation, was that several patients indicated that they felt it did not do their situation justice if the item bank would only consist of negative items. Patient quotes were used to inform the formulation of these items. Our results illustrate that it can be difficult to optimise content validity while simultaneously maintaining the same level of construct validity (under a given model); in this case, adding items to improve content validity resulted in multidimensionality. It has been previously suggested that including reversed worded items in a questionnaire might affect reliability and aspects of validity [47, 48]. Patients may not notice that some items are formulated in a reversed way, or they might be confused by this reversal in meaning. As an effect, there may be an increase in measurement error and/or a method/artifical second factor may be found in dimensionality analyses caused by response bias [49]. To prevent response bias caused by inattention or confusion, it may be advisable to present positively and negatively worded items separately in a future study, as suggested by Roszkowski and Soven [50]. Another possibility would be to create separte item banks for positively and negatively worded items; PROMIS follows this strategy for a number of domains (e.g., [51, 52]). If these strategies do not solve the issue of U-shaped response curves, it may be worth while to re-analyse the data with a different IRT model, which allows for peaked/dipped response curves (a so-called “unfolding model”) [53]. We developed 29 new items that were subjected to cognitive debriefing along with a selection of items from existing questionnaires. Initially, the answering categories provided for the newly developed items were dichotomous: agree/do not agree. A substantial number of patients indicated that they were unhappy with only having two options, and asked for Likert scales. We made adjustments accordingly, and decided to harmonise the answering categories of all items following PROMIS guidelines. Patients were happy with the 5-point Likert scales. Our findings illustrate, however, that this not necessarily means that patients will use the entire scale. The resulting data sparseness poses challenges when modelling the data. A widely used solution is to merge adjacent categories, which is also what we did for a number of items. This solution is not popular with everyone; but since having a very low cell count for certain item-category combinations leads to problematic parameter estimates (very high or low, large standard error) it is unavoidable in practice. In such cases, it may be advisable to use a model that is unsensitive to differences in the number of categories per item after merging, such as the GRM we used in this study. We suggest that this approach (providing the patient with the response scale of their preference, subsequently merging categories prior to analysis, and finally using an appropriate model) is to be preferred to avoiding dealing with the field of tension between patient perspective and psychometric considerations.

Conclusions

In the development of the COPD-SIB, the patient perspective has taken a central role. The item bank contains items tapping into several topics described as highly relevant by patients and the literature. We used complementary psychometric techniques to evaluate the candidate items, and the final selection forms a strong unidimensional scale. The COPD-SIB is a promising candidate to measure COPD-specific HRQoL in routine practice; especially when used to build a CAT (time efficient, while not compromising measurement precision). The COPD-SIB was developed using a large Dutch sample of COPD patients. The Dutch version of the item bank is ready for use, and available upon request (contact MP or JP). First steps toward cross-cultural validation are currently underway [9, 24].
  29 in total

1.  Influence of Imputation and EM Methods on Factor Analysis when Item Nonresponse in Questionnaire Data is Nonignorable.

Authors:  C A Bernaards; K Sijtsma
Journal:  Multivariate Behav Res       Date:  2000-07-01       Impact factor: 5.923

Review 2.  Evaluation of Quality of Life instruments for use in COPD care and research: a systematic review.

Authors:  Saskia W M Weldam; Marieke J Schuurmans; Rani Liu; Jan-Willem J Lammers
Journal:  Int J Nurs Stud       Date:  2012-08-24       Impact factor: 5.837

3.  Using multidimensional modeling to combine self-report symptoms with clinical judgment of schizotypy.

Authors:  Stéphanie M van den Berg; Muirne C S Paap; Eske M Derks
Journal:  Psychiatry Res       Date:  2012-09-27       Impact factor: 3.222

Review 4.  Pulmonary rehabilitation in chronic respiratory insufficiency. 7. Health-related quality of life among patients with chronic obstructive pulmonary disease.

Authors:  J R Curtis; R A Deyo; L D Hudson
Journal:  Thorax       Date:  1994-02       Impact factor: 9.139

5.  Have a little faith: measuring the impact of illness on positive and negative aspects of faith.

Authors:  John M Salsman; Sofia F Garcia; Jin-Shei Lai; David Cella
Journal:  Psychooncology       Date:  2011-09-09       Impact factor: 3.894

6.  Maugeri Respiratory Failure questionnaire reduced form: a method for improving the questionnaire using the Rasch model.

Authors:  G Vidotto; M Carone; P W Jones; S Salini; G Bertolotti
Journal:  Disabil Rehabil       Date:  2007-07-15       Impact factor: 3.033

7.  The St George's Respiratory Questionnaire revisited: a psychometric evaluation.

Authors:  Muirne C S Paap; Danny Brouwer; Cees A W Glas; Evelyn M Monninkhof; Benjamin Forstreuter; Marcel E Pieterse; Job van der Palen
Journal:  Qual Life Res       Date:  2013-11-16       Impact factor: 4.147

8.  Development and first validation of the COPD Assessment Test.

Authors:  P W Jones; G Harding; P Berry; I Wiklund; W-H Chen; N Kline Leidy
Journal:  Eur Respir J       Date:  2009-09       Impact factor: 16.671

9.  Rumination and age: some things get better.

Authors:  Stefan Sütterlin; Muirne C S Paap; Stana Babic; Andrea Kübler; Claus Vögele
Journal:  J Aging Res       Date:  2012-02-22

10.  Using the Three-Step Test Interview to understand how patients perceive the St. George's Respiratory Questionnaire for COPD patients (SGRQ-C).

Authors:  Muirne C S Paap; Lukas Lange; Job van der Palen; Christina Bode
Journal:  Qual Life Res       Date:  2015-11-28       Impact factor: 4.147

View more
  5 in total

1.  The TBI-CareQOL Measurement System: Development and Preliminary Validation of Health-Related Quality of Life Measures for Caregivers of Civilians and Service Members/Veterans With Traumatic Brain Injury.

Authors:  Noelle E Carlozzi; Michael A Kallen; Robin Hanks; Elizabeth A Hahn; Tracey A Brickell; Rael T Lange; Louis M French; Anna L Kratz; David S Tulsky; David Cella; Jennifer A Miner; Phillip A Ianni; Angelle M Sander
Journal:  Arch Phys Med Rehabil       Date:  2018-09-07       Impact factor: 3.966

2.  Emotional Suppression and Hypervigilance in Military Caregivers: Relationship to Negative and Positive Affect.

Authors:  Angelle M Sander; Nicholas R Boileau; Robin A Hanks; David S Tulsky; Noelle E Carlozzi
Journal:  J Head Trauma Rehabil       Date:  2020 Jan/Feb       Impact factor: 3.117

3.  The effectiveness of a nurse-led illness perception intervention in COPD patients: a cluster randomised trial in primary care.

Authors:  Saskia W M Weldam; Marieke J Schuurmans; Pieter Zanen; Monique J W M Heijmans; Alfred P E Sachs; Jan-Willem J Lammers
Journal:  ERJ Open Res       Date:  2017-12-08

4.  Item usage in a multidimensional computerized adaptive test (MCAT) measuring health-related quality of life.

Authors:  Muirne C S Paap; Karel A Kroeze; Caroline B Terwee; Job van der Palen; Bernard P Veldkamp
Journal:  Qual Life Res       Date:  2017-06-23       Impact factor: 4.147

5.  Measuring Patient-Reported Outcomes Adaptively: Multidimensionality Matters!

Authors:  Muirne C S Paap; Karel A Kroeze; Cees A W Glas; Caroline B Terwee; Job van der Palen; Bernard P Veldkamp
Journal:  Appl Psychol Meas       Date:  2017-10-24
  5 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.