Sequential Multiple Imputation for Real-World Health-Related Quality of Life Missing Data after Bariatric Surgery.

Sun Sun1,2, Nan Luo3, Erik Stenberg4, Lars Lindholm1, Klas-Göran Sahlén1, Karl A Franklin5, Yang Cao6.   

Abstract

One of the main challenges for the successful implementation of health-related quality of life (HRQoL) assessments is missing data. The current study examined the feasibility and validity of a sequential multiple imputation (MI) method to deal with missing values in the longitudinal HRQoL data from the Scandinavian Obesity Surgery Registry (SOReg). All patients in the SOReg who received bariatric surgery between 1 January 2011 and 31 March 2019 (n = 47,653) were included for the descriptive analysis and missingness pattern exploration. The patients who had completed the short-form 36 (SF-36) at baseline (year 0) and at the one-, two-, and five-year follow-ups were included (n = 3957) for the missingness pattern simulation and the sequential MI analysis. Eleven items of the SF-36 were selected to create the six domains of the SF-6D, and the SF-6D utility index of each patient was calculated accordingly. The multiply-imputed variables of the previous year were used as input to impute the missing values in later years. The performance of the sequential MI was evaluated by comparing the actual values with the imputed values of the selected SF-36 items and the index at all four time points. At baseline and year 1, where missing proportions were about 20% and 40%, respectively, there were no statistically significant discrepancies between the distributions of the actual and imputed responses (all p-values > 0.05). In year 2, where the missing proportion was about 60%, distributions of the actual and imputed responses were consistent in 9 of the 11 SF-36 items. However, in year 5, where the missing proportion was about 80%, no consistency was found between the actual and imputed responses in any of the SF-36 items. Relatively high missing proportions in HRQoL data are common in clinical registries, which poses a challenge for analyzing the HRQoL of longitudinal cohorts. The experimental sequential multiple imputation method adopted in the current study might be an ideal strategy for handling missing data, even when a follow-up survey has a missing proportion of about 60%, because it avoids substantial information loss in multivariate analysis. However, imputation for data with higher missing proportions warrants more research.

Keywords:  SF-36; health utility; health-related quality of life; multiple imputations; real-world data

Year:  2022        PMID: 36078543      PMCID: PMC9518315          DOI: 10.3390/ijerph191710827

Source DB:  PubMed          Journal:  Int J Environ Res Public Health        ISSN: 1660-4601            Impact factor:   4.614


1. Introduction

Health-related quality of life (HRQoL) represents the subjective evaluation of a patient’s health status, providing complementary information to survival, cures, and biological responses to treatment [1]. HRQoL data have been increasingly collected in clinical trials, population health surveys, and clinical registers in many countries [2,3,4,5]. However, one of the main challenges for the successful implementation of HRQoL assessments is missing data, which can occur at the item level, i.e., respondents do not provide answers to certain items in an HRQoL questionnaire, or at the form level, i.e., entire forms are missing due to loss to follow-up [1]. Missing data may lead to biased conclusions if not addressed. Therefore, it is important to understand missingness patterns and handle missing data properly when analyzing HRQoL data. In economic evaluations, HRQoL instruments that can be used to generate health utility data, also known as preference-based measures, are applied. The most commonly applied preference-based measures are EQ-5D [6], short-form-6D (SF-6D) [7,8], and the health utilities index [9]. Missing data are also an important issue for health utility calculations, since these measures require complete answers to all of the relevant items [1,10]. Both missing items and missing forms in HRQoL are rather common in clinical trials and observational studies, which may reduce statistical power and present a challenge for research in this field [11,12]. During the data collection phase, strategies can be integrated into the study design to minimize the incidence of missing data. However, once the trial or registry has started, an analyst has little influence on how data are collected and must rely primarily on analytical methods, including imputation, to account for missing data [13].
The current practices of handling missing data in HRQoL studies include list-wise deletion, single imputation (replacing the missing value with a previously observed value or the mean value), multiple imputation (MI), and model-based approaches [14,15]. Among them, MI methods are widely recommended because they can incorporate the uncertainty around the missing values [16]. However, MI is often poorly applied in practice [17]. Managing missing real-world data, including HRQoL data, has become a challenging issue as the use of real-world data has increased rapidly in recent years. Real-world data are derived from a number of sources that document outcomes in a heterogeneous patient population in real-world settings, including (but not limited to) electronic health records, health insurance claims, and patient surveys [18]. Real-world data provide insights beyond those that can be derived from clinical trials, as they follow patients with different characteristics in real-life situations, often for longer periods than clinical trials [19]. Compared with well-conducted randomized controlled trials (RCTs), missing data are more pronounced in real-world data because data can be missing for exposures, known confounders, and outcomes [13]. However, existing guidance and standards for handling missing data mostly concern RCTs only. Currently, there are no standards or formal guidelines on how to deal with missing real-world HRQoL data [13]. In this study, we explored and simulated the missingness of real-world HRQoL data from the Scandinavian Obesity Surgery Registry (SOReg) [20], and examined whether a sequential MI procedure is a practical strategy for handling missing values in the short-form 36 (SF-36) and SF-6D forms.
Because the HRQoL data were repeatedly collected at four time points, the sequential MI procedure imputed the missing values chronologically, i.e., the missing data in a later follow-up were imputed using the multiply-imputed datasets of a prior follow-up.

2. Materials and Methods

This research used data from existing registers in Sweden. Data retrieval, analyses, and presentation of results were performed in accordance with the Declaration of Helsinki. The research work was approved by the Swedish Ethical Review Agency (Etikprövningsmyndigheten; approval numbers: 2019-03666 and 2019-05713).

2.1. Data Sources

The Scandinavian Obesity Surgery Registry (SOReg) is a Swedish national quality registry for bariatric surgery management and research. It has a nationwide coverage of >98%, its internal validity is evaluated regularly, and it has high data quality [21]. Data on patients’ sociodemographic information, hospital characteristics, and detailed information regarding the surgeries and post-surgery outcomes, including HRQoL assessed by SF-36 and the Obesity Problem Scale [22,23], were obtained from the SOReg. Patients included in the study reported their HRQoL data at baseline (i.e., prior to surgery) and at years 1, 2, and 5 postoperatively by filling out a questionnaire. Specialized nurses collected anthropometric data and completed questionnaires. Data entry was performed by trained persons (participating surgeons, plus dedicated nurses in each center). In the current study, all patients who received bariatric surgery between 1 January 2011 and 31 March 2019 (n = 47,653) were included for descriptive analysis and missingness pattern exploration, while only patients who had completed the SF-36 at baseline (year 0) and at the one-, two-, and five-year follow-ups were included (n = 3957) in the analytical dataset used to evaluate the multiple imputation process.

2.2. SF-36 and SF-6D

SF-36 measures HRQoL with 36 items, which can be grouped into 8 domains (physical function, role-physical, bodily pain, general health, vitality, social function, role-emotional, and mental health), and each item contains 2–6 severity levels [24]. In order to elicit the health utility, 11 items of the SF-36 (Supplementary Material Table S1) were selected to create the SF-6D, which comprises 6 domains (pain, mental health, physical functioning, social functioning, role participation, and vitality). Each domain has four to six severity levels (Supplementary Material Table S2) [7]. The SF-6D utility index of each patient in the current study was calculated using Formula (1) below:

$$\text{SF-6D index} = c + \sum_{i=1}^{6} \sum_{m \ge 2} \beta_{y_i = m}\, X_{y_i = m} + \beta_{\text{Most}}\, \text{Most} \qquad (1)$$

where $y_i$ (i = 1, 2, …, 6) indicates the SF-6D domains, each of which can take m levels (m = 2, 3, …, 5 or 6); $X_{y_i = m}$ represents dummy variables that indicate levels 2 to 5 or 6, and $\beta_{y_i = m}$ is the associated coefficient; $c$ corresponds to the constant deviating from full health; and Most is a dummy variable indicating that at least one dimension is at level 5 or 6. Missing information on any of the 11 items leads to missingness in the SF-6D domains (the right-hand side of the equation), which in turn leads to missingness in the SF-6D index score. There are two possible methods for index imputation: (1) impute the items first and then calculate the index based on the above formula; or (2) impute the index directly. The two methods might give different results. In the current study, we applied the second method, as it remains usable even when information from the items is missing.
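The additive structure described for Formula (1) can be illustrated with a minimal Python sketch. Everything numeric here is an assumption made for illustration: the coefficients, the constant, and the domain labels are hypothetical placeholders and not a published SF-6D value set, and every domain is allowed six levels for simplicity. The sketch only shows how level dummies, the Most term, and missing answers interact.

```python
from typing import Dict, Optional

# Illustrative labels for the six SF-6D domains (assumed names).
DOMAINS = ("PF", "RP", "SF", "PAIN", "MH", "VT")

# HYPOTHETICAL placeholder coefficients, not a published value set:
# level 1 is the reference level (no term), each further level costs 0.01.
COEF: Dict[tuple, float] = {
    (d, m): -0.01 * (m - 1) for d in DOMAINS for m in range(2, 7)
}
CONSTANT = 1.0     # hypothetical constant term
COEF_MOST = -0.05  # hypothetical coefficient of the "Most" dummy


def sf6d_index(levels: Dict[str, Optional[int]]) -> Optional[float]:
    """Additive index: constant + level coefficients + Most term.

    Returns None when any domain level is missing, because the index
    cannot be computed from incomplete answers.
    """
    if any(levels.get(d) is None for d in DOMAINS):
        return None
    score = CONSTANT
    for d in DOMAINS:
        level = levels[d]
        if not 1 <= level <= 6:
            raise ValueError(f"invalid level {level} for domain {d}")
        score += COEF.get((d, level), 0.0)  # level 1 contributes nothing
    # Most = 1 if at least one domain is at level 5 or 6.
    most = any(levels[d] >= 5 for d in DOMAINS)
    return score + (COEF_MOST if most else 0.0)
```

With these placeholder values, a respondent at level 1 on every domain scores 1.0, and a single missing domain makes the index undefined, which is why item-level missingness propagates to the index.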

2.3. Missingness Mechanism and Missingness Pattern Simulation

The missingness mechanisms widely used in simulation studies on multiple imputation are missing completely at random, missing at random (MAR), and missing not at random [16]. In the current study, we simulated missingness in the analytical dataset according to a MAR mechanism, which assumes that the probability that a value is missing does not depend on the unobserved data but may depend on the observed data. To ensure that the missingness patterns of the analytical dataset used for multiple imputation reflect the patterns found in the real-world data, we explored the missingness patterns of the selected 11 SF-36 items and the SF-6D index at baseline and at the one-, two-, and five-year follow-ups in the real-world data. The missingness pattern explorations were conducted using the package mice in the statistical software R 4.1.1 (R Foundation for Statistical Computing, Vienna, Austria). The overall missing proportion of the SF-36 items at baseline was 19.6%. The missing proportions of the 11 SF-36 items at baseline, from the highest to the lowest, are shown in Figure 1 (left), and a total of 163 missingness patterns were found (Figure 1, right). Each row in the right panel of Figure 1 is a missingness pattern that indicates where the missing values (red colored) are located in the 11 SF-36 items.
Figure 1

Missing proportions (left) and missingness patterns (right) of the selected 11 SF-36 items at baseline in the real-world data (red cells indicating missing). BP, bodily pain; MH, mental health; PF, physical function; RE, role-emotional; RP, role participation; SF, social function; VT, vitality.
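The kind of tabulation behind Figure 1 (per-item missing proportions and counts of distinct missingness patterns) can be sketched in a few lines of Python. This is not the mice routine used in the study; the function names and the toy rows are hypothetical, and `None` is assumed to mark a missing answer.

```python
from collections import Counter
from typing import Dict, List, Optional, Sequence, Tuple

Row = Dict[str, Optional[int]]


def missing_proportions(rows: Sequence[Row], items: Sequence[str]) -> Dict[str, float]:
    """Share of rows in which each item is missing (None)."""
    n = len(rows)
    return {item: sum(row[item] is None for row in rows) / n for item in items}


def missingness_patterns(
    rows: Sequence[Row], items: Sequence[str]
) -> List[Tuple[Tuple[int, ...], int]]:
    """Distinct patterns with their counts, most frequent first.

    A pattern is a tuple of flags in item order (1 = missing, 0 = observed).
    """
    counts = Counter(
        tuple(int(row[item] is None) for item in items) for row in rows
    )
    return counts.most_common()


# Toy data with three of the eleven items (values are made up).
ITEMS = ["PF1", "BP1", "VT2"]
toy_rows: List[Row] = [
    {"PF1": 1, "BP1": 4, "VT2": 3},
    {"PF1": None, "BP1": 4, "VT2": 3},
    {"PF1": None, "BP1": None, "VT2": None},
    {"PF1": 2, "BP1": 5, "VT2": 2},
]
```

Calling `missingness_patterns(toy_rows, ITEMS)` lists the fully observed pattern first, followed by the two incomplete patterns, which is the information a pattern plot displays row by row.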

The overall missing proportions of the 11 SF-36 items in the one-, two-, and five-year follow-ups were 40.7%, 62.9%, and 83.8%, respectively, and the missingness patterns are shown in Supplemental Figures S1–S3. In total, 150, 105, and 64 missingness patterns of the 11 SF-36 items were found in the one-, two-, and five-year follow-ups, respectively. To evaluate the performance of the proposed MI procedure using the analytical dataset, there was a need to simulate the missingness in the data by masking some known values in the analytical dataset. The missingness patterns (right panels of Figure 1, Supplemental Figures S1–S3) detected in the real-world data were applied to mask the values in the 11 SF-36 items at the four time points of the analytical dataset. The masking of the known values was conducted using the package mice as well. The simulated missingness of the analytical dataset for the selected 11 SF-36 items at baseline is shown in Figure 2, with an overall missing proportion of 20.0%. The missing proportions of the 11 items in the analytical dataset (Figure 2, left) were similar to those in the real-world data (Figure 1, left). The simulated missingness of the analytical dataset for the selected 11 SF-36 items in the one-, two-, and five-year follow-ups are shown in Supplemental Figures S4–S6, with the overall missing proportions of 40.4%, 63.8%, and 83.7%, respectively.
Figure 2

Simulated missing proportions (left) and missingness patterns (right) of the selected 11 SF-36 items at baseline in the analytical dataset (red cells indicating missing).
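Masking known values according to observed patterns can be sketched as follows. This is a simplified stand-in, not the mice procedure used in the study: the function `mask_rows`, the logistic form, and the parameters `driver`, `intercept`, and `slope` are assumptions introduced here. A row becomes incomplete with a probability that depends only on a fully observed covariate, which makes the simulated mechanism MAR, and an incomplete row then receives one of the supplied patterns in proportion to its weight.

```python
import math
import random
from typing import Dict, List, Optional, Sequence, Tuple

Row = Dict[str, Optional[float]]


def mask_rows(
    rows: Sequence[Row],
    items: Sequence[str],
    patterns: Sequence[Tuple[int, ...]],
    weights: Sequence[float],
    driver: str,
    intercept: float,
    slope: float,
    rng: random.Random,
) -> List[Row]:
    """Return a masked copy of `rows`; the input rows are left unchanged.

    patterns: incomplete patterns only (1 = mask the item, 0 = keep it).
    weights:  relative frequency of each pattern, e.g. as observed.
    driver:   fully observed covariate that drives missingness (MAR).
    """
    masked: List[Row] = []
    for row in rows:
        new_row = dict(row)
        # Probability of being incomplete depends on observed data only.
        p = 1.0 / (1.0 + math.exp(-(intercept + slope * row[driver])))
        if rng.random() < p:
            pattern = rng.choices(patterns, weights=weights, k=1)[0]
            for item, flag in zip(items, pattern):
                if flag:
                    new_row[item] = None
        masked.append(new_row)
    return masked
```

Because the original values are retained in the unmasked rows, the imputed values can later be compared with the known truth.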

In general, the number of missingness patterns decreased as the number of observations decreased. Because the analytical dataset has far fewer observations than the real-world data (3957 vs. 47,653), it contained fewer missingness patterns than the real-world data; however, the percentages of the top missingness patterns were similar in both datasets. The missingness patterns of the SF-6D index at the four time points were also obtained from the real-world data, and the known SF-6D index values in the analytical dataset were masked according to these patterns. The missing proportions and missingness patterns of the SF-6D index at baseline (year 0) and at the one-, two-, and five-year follow-ups in the real-world data and the analytical dataset are shown in Figure 3 and Figure 4. The missing proportions and the percentages of the top missingness patterns were similar in both datasets (Figure 3 and Figure 4).
Figure 3

Missing proportions (left) and missingness patterns (right) of the SF-6D index at four time points in the real-world data (red cells indicating missing).

Figure 4

Simulated missingness patterns of SF-6D index at four time points in the analytical dataset (red cells indicating missing).

2.4. Process of the Sequential Multiple Imputation

We applied a sequential method to impute the missing values at baseline (year 0), and years 1, 2, and 5 in order. The process of the sequential multiple imputation is shown in Figure 5, the “Sequential multiple imputation” step, and described in detail as follows:
Figure 5

Flowchart of simulation, sequential multiple imputation, and assessment. MAPE, mean absolute percentage error.

Firstly, the missing values of the selected 11 SF-36 items at baseline (year 0) were multiply-imputed (five imputations were used in the current study) using all baseline variables, including age, sex, BMI, pregnancy, and comorbidities (sleep apnea, hypertension, diabetes, dyslipidemia, dyspepsia, diarrhea, depression, and other illnesses that may have contributed to the surgical decision). Five imputed datasets were generated for the baseline data. Secondly, for each imputed baseline dataset, the missing values of the selected 11 SF-36 items and comorbidities at the one-year follow-up were multiply-imputed based on all the baseline variables as well as the previously imputed SF-36 items. Five imputed datasets were generated for each imputed baseline (year 0) dataset; therefore, in total, 5 × 5 = 25 imputed datasets were generated for the one-year follow-up. Similarly, the missing values in the two- and five-year follow-ups were imputed five times for each previously imputed dataset, based on all the variables in the previous years. Therefore, in total, 125 (5 × 25 imputed datasets of year 1) and 625 (5 × 125 imputed datasets of year 2) imputed datasets were generated for the two- and five-year follow-ups, respectively. Within each year, multivariate imputation by chained equations was used, with predictive mean matching, logistic regression, and proportional odds regression for continuous, binary, and ordered variables, respectively [25].
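The branching of the sequential procedure (5, 25, 125, and 625 datasets) can be made concrete with a small Python sketch. The imputation step here is deliberately crude and is an assumption of this sketch: a nearest-neighbour hot-deck draw on the previous wave replaces the chained-equations models used in the study, a single variable per wave replaces the 11 items, and the names `impute_once` and `sequential_mi` are hypothetical. Only the chronological, dataset-by-dataset structure is meant to be representative.

```python
import random
from typing import Dict, List, Optional, Sequence

Record = Dict[str, Optional[float]]
Dataset = List[Record]


def impute_once(data: Dataset, target: str, predictor: Optional[str],
                rng: random.Random, k: int = 3) -> Dataset:
    """Fill missing `target` values by drawing from observed donors.

    With a predictor (the previous, already completed wave), the donor is
    drawn from the k records closest on that predictor; without one, it is
    drawn from all donors. A stand-in for a proper imputation model.
    """
    donors = [r for r in data if r[target] is not None]
    completed: Dataset = []
    for r in data:
        if r[target] is not None:
            completed.append(dict(r))
            continue
        if predictor is None:
            pool = donors
        else:
            pool = sorted(donors, key=lambda d: abs(d[predictor] - r[predictor]))[:k]
        completed.append({**r, target: rng.choice(pool)[target]})
    return completed


def sequential_mi(data: Dataset, waves: Sequence[str], m: int = 5,
                  seed: int = 0) -> Dict[str, List[Dataset]]:
    """Impute waves in chronological order, m times per parent dataset."""
    rng = random.Random(seed)
    current: List[Dataset] = [data]
    by_wave: Dict[str, List[Dataset]] = {}
    previous: Optional[str] = None
    for wave in waves:
        # Each dataset completed so far spawns m datasets for this wave.
        current = [impute_once(d, wave, previous, rng)
                   for d in current for _ in range(m)]
        by_wave[wave] = current
        previous = wave
    return by_wave
```

With four waves and m = 5, the number of datasets grows as 5, 25, 125, and 625, matching the counts given above; the later waves inherit whatever was imputed in the earlier ones.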

2.5. Assessment of Performance

The performance of the sequential multiple imputation approach was evaluated by comparing the actual values with the imputed values of the selected 11 SF-36 items and the index at all four time points (baseline and the one-, two-, and five-year follow-ups). The SF-36 items were compared using frequency distributions of the actual and imputed item scores, and the agreement of the distributions was tested using the chi-squared test, controlling for the false discovery rate [26,27]. The mean absolute percentage error (MAPE), one of the most common metrics used to measure accuracy for continuous variables, was calculated to assess the agreement between the actual and imputed values of the SF-6D index [28,29,30]. MAPE is the mean of the absolute differences between the actual and imputed values divided by the actual values [31]. A MAPE < 10% is considered excellent, <20% good, 20–50% fine, and >50% poor [32]. The intraclass correlation coefficients (ICCs) were also provided to indicate the agreement between the actual values and the imputed values. An ICC value below 0.50, between 0.50 and 0.75, between 0.75 and 0.90, or above 0.90 indicates poor, moderate, good, or excellent agreement, respectively [33]. In the current study, MAPE and ICC were averaged across the imputations. All statistical analyses were conducted in R 4.1.1 (R Foundation for Statistical Computing, Vienna, Austria) and Stata 17.0 (College Station, Texas, USA). A two-sided p-value < 0.05 was considered statistically significant.
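The MAPE definition and the interpretation bands quoted above translate directly into code. The following Python sketch uses hypothetical function names; how values that fall exactly on a band boundary (for example, exactly 20% or exactly 0.75) are classified is an assumption of the sketch, since the text does not specify it.

```python
from typing import Sequence


def mape(actual: Sequence[float], imputed: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero (true for a positive utility index).
    """
    if len(actual) != len(imputed) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return 100.0 * sum(
        abs(a - b) / abs(a) for a, b in zip(actual, imputed)
    ) / len(actual)


def rate_mape(value: float) -> str:
    """Bands from the text: <10 excellent, <20 good, 20-50 fine, >50 poor."""
    if value < 10:
        return "excellent"
    if value < 20:
        return "good"
    if value <= 50:
        return "fine"
    return "poor"


def rate_icc(value: float) -> str:
    """Bands from the text: <0.50 poor, 0.50-0.75 moderate,
    0.75-0.90 good, >0.90 excellent."""
    if value < 0.50:
        return "poor"
    if value < 0.75:
        return "moderate"
    if value <= 0.90:
        return "good"
    return "excellent"
```

For example, `rate_mape(4.16)` and `rate_icc(0.814)` classify the baseline SF-6D results reported in Table 4 as excellent and good, respectively.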

3. Results

3.1. Characteristics of the Patients

The demographic characteristics of the patients at baseline are shown in Table 1. Statistically significant differences were found in most variables between the patients included in the analytical dataset and those excluded (with at least one missing form). In general, the patients in the analytical dataset were older and fewer (in proportion) of them had comorbidities.
Table 1

Demographic characteristics of the patients at baseline.

Variable | Level | All | Excluded | Analytical Dataset | p-Value *
N | | 46,753 | 42,796 | 3957 |
Age (mean (SD)) | | 41.06 (11.30) | 40.75 (11.28) | 43.82 (11.15) | <0.001
BMI (mean (SD)) | | 41.57 (5.65) | 41.53 (5.66) | 42.01 (5.48) | <0.001
Sex (%) | Man | 10,933 (23.4) | 10,058 (23.5) | 875 (22.1) | 0.050
 | Woman | 35,820 (76.6) | 32,738 (76.5) | 3082 (77.9) |
Smoking (%) | No | 28,781 (61.6) | 26,402 (61.7) | 2379 (60.1) | 0.067
 | Yes | 4765 (10.2) | 4381 (10.2) | 384 (9.7) |
 | Quit | 8075 (17.3) | 7355 (17.2) | 720 (18.2) |
 | Missing | 5132 (11.0) | 4658 (10.9) | 474 (12.0) |
Pregnancy (%) | No | 39,862 (85.3) | 35,905 (83.9) | 3957 (100.0) | <0.001
 | Missing | 6891 (14.7) | 6891 (16.1) | 0 (0.0) |
Comorbidity (%) | No | 17,620 (37.7) | 15,869 (37.1) | 1751 (44.3) | <0.001
 | Yes | 22,240 (47.6) | 20,034 (46.8) | 2206 (55.7) |
 | Missing | 6893 (14.7) | 6893 (16.1) | 0 (0.0) |
Sleep apnea (%) | No | 35,749 (76.5) | 32,245 (75.3) | 3504 (88.6) | <0.001
 | Yes | 4111 (8.8) | 3658 (8.5) | 453 (11.4) |
 | Missing | 6893 (14.7) | 6893 (16.1) | 0 (0.0) |
Hypertension (%) | No | 29,890 (63.9) | 27,161 (63.5) | 2729 (69.0) | <0.001
 | Yes | 9970 (21.3) | 8742 (20.4) | 1228 (31.0) |
 | Missing | 6893 (14.7) | 6893 (16.1) | 0 (0.0) |
Diabetes (%) | No | 34,668 (74.2) | 31,305 (73.1) | 3363 (85.0) | <0.001
 | Yes | 5192 (11.1) | 4598 (10.7) | 594 (15.0) |
 | Missing | 6893 (14.7) | 6893 (16.1) | 0 (0.0) |
Dyslipidemia (%) | No | 36,018 (77.0) | 32,555 (76.1) | 3463 (87.5) | <0.001
 | Yes | 3842 (8.2) | 3348 (7.8) | 494 (12.5) |
 | Missing | 6893 (14.7) | 6893 (16.1) | 0 (0.0) |
Dyspepsia (%) | No | 35,564 (76.1) | 32,037 (74.9) | 3527 (89.1) | <0.001
 | Yes | 4296 (9.2) | 3866 (9.0) | 430 (10.9) |
 | Missing | 6893 (14.7) | 6893 (16.1) | 0 (0.0) |
Diarrhea (%) | No | 39,213 (83.9) | 35,339 (82.6) | 3874 (97.9) | <0.001
 | Yes | 647 (1.4) | 564 (1.3) | 83 (2.1) |
 | Missing | 6893 (14.7) | 6893 (16.1) | 0 (0.0) |
Depression (%) | No | 33,355 (71.3) | 29,904 (69.9) | 3451 (87.2) | <0.001
 | Yes | 6505 (13.9) | 5999 (14.0) | 506 (12.8) |
 | Missing | 6893 (14.7) | 6893 (16.1) | 0 (0.0) |
Other illness (%) | No | 35,519 (76.0) | 31,967 (74.7) | 3552 (89.8) | <0.001
 | Yes | 4343 (9.3) | 3938 (9.2) | 405 (10.2) |
 | Missing | 6891 (14.7) | 6891 (16.1) | 0 (0.0) |
Obesity problem summary score (mean (SD)) | | 65.06 (26.11) | 65.62 (25.98) | 60.05 (26.73) | <0.001

* Student’s t-test was used to compare the means and the chi-squared test was used to compare percentages.

Descriptive statistics of the selected 11 SF-36 items and the SF-6D index at baseline are shown in Table 2. Similarly, statistically significant differences in the proportions of the SF-36 item scores and in the mean values of the SF-6D index were found between the patients included and those excluded. In general, the respondents in the analytical dataset had far fewer missing items and a slightly higher SF-6D index than those excluded.
Table 2

Scores for the selected SF-36 items and SF-6D index at baseline.

SF-6D Item | Level | All (N = 46,753) | Excluded (N = 42,796) | Analytical Dataset (N = 3957) | p-Value *
PF1 (%) | 1 | 26,464 (56.6) | 23,794 (55.6) | 2670 (67.5) | <0.001
 | 2 | 11,523 (24.6) | 10,388 (24.3) | 1135 (28.7) |
 | 3 | 1712 (3.7) | 1576 (3.7) | 136 (3.4) |
 | Missing | 7054 (15.1) | 7038 (16.4) | 16 (0.4) |
PF2 (%) | 1 | 4878 (10.4) | 4445 (10.4) | 433 (10.9) | <0.001
 | 2 | 22,027 (47.1) | 19,875 (46.4) | 2152 (54.4) |
 | 3 | 12,767 (27.3) | 11,417 (26.7) | 1350 (34.1) |
 | Missing | 7081 (15.1) | 7059 (16.5) | 22 (0.6) |
PF10 (%) | 1 | 2853 (6.1) | 2656 (6.2) | 197 (5.0) | <0.001
 | 2 | 13,910 (29.8) | 12,576 (29.4) | 1334 (33.7) |
 | 3 | 22,917 (49.0) | 20,507 (47.9) | 2410 (60.9) |
 | Missing | 7073 (15.1) | 7057 (16.5) | 16 (0.4) |
RP3 (%) | 1 | 17,956 (38.4) | 16,392 (38.3) | 1564 (39.5) | <0.001
 | 2 | 21,344 (45.7) | 18,999 (44.4) | 2345 (59.3) |
 | Missing | 7453 (15.9) | 7405 (17.3) | 48 (1.2) |
RE2 (%) | 1 | 14,899 (31.9) | 13,706 (32.0) | 1193 (30.1) | <0.001
 | 2 | 24,297 (52.0) | 21,595 (50.5) | 2702 (68.3) |
 | Missing | 7557 (16.2) | 7495 (17.5) | 62 (1.6) |
SF2 (%) | 1 | 1503 (3.2) | 1421 (3.3) | 82 (2.1) | <0.001
 | 2 | 4393 (9.4) | 4077 (9.5) | 316 (8.0) |
 | 3 | 8624 (18.4) | 7890 (18.4) | 734 (18.5) |
 | 4 | 9278 (19.8) | 8382 (19.6) | 896 (22.6) |
 | 5 | 15,447 (33.0) | 13,574 (31.7) | 1873 (47.3) |
 | Missing | 7508 (16.1) | 7452 (17.4) | 56 (1.4) |
BP1 (%) | 1 | 5669 (12.1) | 5068 (11.8) | 601 (15.2) | <0.001
 | 2 | 4777 (10.2) | 4232 (9.9) | 545 (13.8) |
 | 3 | 6042 (12.9) | 5416 (12.7) | 626 (15.8) |
 | 4 | 14,001 (29.9) | 12,606 (29.5) | 1395 (35.3) |
 | 5 | 7103 (15.2) | 6472 (15.1) | 631 (15.9) |
 | 6 | 1927 (4.1) | 1801 (4.2) | 126 (3.2) |
 | Missing | 7234 (15.5) | 7201 (16.8) | 33 (0.8) |
BP2 (%) | 1 | 10,196 (21.8) | 9045 (21.1) | 1151 (29.1) | <0.001
 | 2 | 9781 (20.9) | 8767 (20.5) | 1014 (25.6) |
 | 3 | 10,232 (21.9) | 9233 (21.6) | 999 (25.2) |
 | 4 | 6772 (14.5) | 6202 (14.5) | 570 (14.4) |
 | 5 | 2526 (5.4) | 2343 (5.5) | 183 (4.6) |
 | Missing | 7246 (15.5) | 7206 (16.8) | 40 (1.0) |
MH1 (%) | 1 | 779 (1.7) | 731 (1.7) | 48 (1.2) | <0.001
 | 2 | 1788 (3.8) | 1684 (3.9) | 104 (2.6) |
 | 3 | 3688 (7.9) | 3411 (8.0) | 277 (7.0) |
 | 4 | 6184 (13.2) | 5644 (13.2) | 540 (13.6) |
 | 5 | 11,644 (24.9) | 10,466 (24.5) | 1178 (29.8) |
 | 6 | 15,548 (33.3) | 13,764 (32.2) | 1784 (45.1) |
 | Missing | 7122 (15.2) | 7096 (16.6) | 26 (0.7) |
MH4 (%) | 1 | 797 (1.7) | 747 (1.7) | 50 (1.3) | <0.001
 | 2 | 2025 (4.3) | 1903 (4.4) | 122 (3.1) |
 | 3 | 3554 (7.6) | 3284 (7.7) | 270 (6.8) |
 | 4 | 6072 (13.0) | 5553 (13.0) | 519 (13.1) |
 | 5 | 13,164 (28.2) | 11,822 (27.6) | 1342 (33.9) |
 | 6 | 13,942 (29.8) | 12,323 (28.8) | 1619 (40.9) |
 | Missing | 7199 (15.4) | 7164 (16.7) | 35 (0.9) |
VT2 (%) | 1 | 844 (1.8) | 738 (1.7) | 106 (2.7) | <0.001
 | 2 | 3573 (7.6) | 3138 (7.3) | 435 (11.0) |
 | 3 | 6083 (13.0) | 5408 (12.6) | 675 (17.1) |
 | 4 | 9448 (20.2) | 8409 (19.6) | 1039 (26.3) |
 | 5 | 11,734 (25.1) | 10,697 (25.0) | 1037 (26.2) |
 | 6 | 7903 (16.9) | 7265 (17.0) | 638 (16.1) |
 | Missing | 7168 (15.3) | 7141 (16.7) | 27 (0.7) |
Index (mean (SD)) | | 0.66 (0.13) | 0.66 (0.13) | 0.69 (0.13) | <0.001

* Student’s t-test was used to compare the means and the chi-squared test was used to compare percentages. BP, bodily pain; MH, mental health; PF, physical function; RE, role-emotional; RP, role participation; SF, social function; VT, vitality.

Demographics and comorbidities, SF-36 item scores, and SF-6D indices at the one-, two-, and five-year follow-ups are shown in Supplemental Tables S3–S8. At all three time points, the included respondents had far fewer missing values on characteristics and SF-36 items, and were relatively healthier, compared to those excluded.

3.2. Imputation Results for the Selected SF-36 Items

The comparisons of distributions between the actual and imputed responses of the patients in the analytical dataset are shown in Table 3. At baseline and year 1, where missing proportions were about 20% and 40%, respectively, there were no statistically significant discrepancies between the distributions of the actual and imputed responses (all p-values > 0.05). In year 2, where the missing proportion was about 60%, distributions of the actual and imputed responses were consistent in most SF-36 items, except for PF2 and PF10. However, in year 5, where the missing proportion rose to about 80%, no consistency was found between the actual and imputed responses in any of the SF-36 items. The results indicate that the imputation based on the previous demographic and comorbidity information works well for the SF-36 items even when the missing proportion is as high as 60%. According to the ICC values presented in Table 3, the agreement between the actual and imputed values of the SF-36 items was good at baseline (ICC 0.848–0.884), moderate to good in year 1 (ICC 0.724–0.782), moderate in year 2 (ICC 0.558–0.673), and poor in year 5 (ICC 0.048–0.435).
Table 3

Accuracy of multiple imputations for the selected SF-36 item scores.

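Item | Score | Year 0 Actual | Year 0 Imputed | Year 1 Actual | Year 1 Imputed | Year 2 Actual | Year 2 Imputed | Year 5 Actual | Year 5 Imputed
PF1 | 1 | 2584 | 2543 | 460 | 455 | 502 | 543 | 751 | 1341
 | 2 | 1106 | 1148 | 1573 | 1560 | 1412 | 1458 | 1511 | 1267
 | 3 | 132 | 144 | 1778 | 1809 | 1892 | 1833 | 1559 | 1226
 | χ2 (p) | 1.610 (p = 0.447) | | 0.229 (p = 0.892) | | 3.178 (p = 0.204) | | 226.622 (p < 0.001) |
 | ICC (p) | 0.868 (p < 0.001) | | 0.782 (p < 0.001) | | 0.671 (p < 0.001) | | 0.391 (p < 0.001) |
PF2 | 1 | 416 | 390 | 113 | 121 | 109 | 147 | 162 | 940
 | 2 | 2086 | 2088 | 489 | 490 | 534 | 556 | 709 | 546
 | 3 | 1315 | 1356 | 3229 | 3223 | 3175 | 3131 | 2952 | 2348
 | χ2 (p) | 1.431 (p = 0.489) | | 0.279 (p = 0.870) | | 6.358 (p = 0.042) | | 639.249 (p < 0.001) |
 | ICC (p) | 0.868 (p < 0.001) | | 0.772 (p < 0.001) | | 0.623 (p < 0.001) | | 0.101 (p < 0.001) |
PF10 | 1 | 190 | 170 | 53 | 68 | 73 | 142 | 85 | 754
 | 2 | 1295 | 1278 | 215 | 236 | 293 | 357 | 371 | 421
 | 3 | 2334 | 2386 | 3558 | 3530 | 3460 | 3335 | 3368 | 2659
 | χ2 (p) | 1.767 (p = 0.413) | | 2.940 (p = 0.230) | | 30.737 (p < 0.001) | | 619.995 (p < 0.001) |
 | ICC (p) | 0.848 (p < 0.001) | | 0.745 (p < 0.001) | | 0.558 (p < 0.001) | | 0.048 (p = 0.002) |
RP3 | 1 | 1521 | 1516 | 462 | 465 | 547 | 614 | 775 | 1600
 | 2 | 2279 | 2318 | 3344 | 3369 | 3255 | 3220 | 3022 | 2334
 | χ2 (p) | 0.168 (p = 0.682) | | 0.000 (p = 1.000) | | 3.796 (p = 0.051) | | 403.550 (p < 0.001) |
 | ICC (p) | 0.853 (p < 0.001) | | 0.730 (p < 0.001) | | 0.562 (p < 0.001) | | 0.206 (p < 0.001) |
RE2 | 1 | 1162 | 1157 | 556 | 566 | 709 | 720 | 951 | 1758
 | 2 | 2629 | 2677 | 3234 | 3268 | 3085 | 3114 | 2843 | 2076
 | χ2 (p) | 0.181 (p = 0.671) | | 0.007 (p = 0.935) | | 0.005 (p = 0.941) | | 358.890 (p < 0.001) |
 | ICC (p) | 0.863 (p < 0.001) | | 0.724 (p < 0.001) | | 0.591 (p < 0.001) | | 0.289 (p < 0.001) |
SF2 | 1 | 80 | 86 | 38 | 46 | 42 | 66 | 74 | 670
 | 2 | 309 | 308 | 119 | 134 | 161 | 176 | 231 | 416
 | 3 | 709 | 720 | 304 | 288 | 402 | 437 | 533 | 575
 | 4 | 872 | 854 | 558 | 576 | 600 | 621 | 677 | 586
 | 5 | 1824 | 1865 | 2771 | 2790 | 2575 | 2534 | 2255 | 1588
 | χ2 (p) | 0.747 (p = 0.945) | | 2.180 (p = 0.703) | | 7.769 (p = 0.100) | | 653.746 (p < 0.001) |
 | ICC (p) | 0.880 (p < 0.001) | | 0.771 (p < 0.001) | | 0.661 (p < 0.001) | | 0.183 (p < 0.001) |
BP1 | 1 | 587 | 603 | 1655 | 1667 | 1668 | 1642 | 1349 | 1081
 | 2 | 528 | 556 | 701 | 744 | 527 | 576 | 524 | 404
 | 3 | 610 | 602 | 425 | 443 | 427 | 421 | 417 | 416
 | 4 | 1353 | 1349 | 677 | 645 | 784 | 790 | 923 | 1022
 | 5 | 610 | 606 | 269 | 263 | 307 | 293 | 445 | 660
 | 6 | 121 | 118 | 70 | 82 | 92 | 111 | 138 | 251
 | χ2 (p) | 0.966 (p = 0.965) | | 3.264 (p = 0.659) | | 4.449 (p = 0.487) | | 124.586 (p < 0.001) |
 | ICC (p) | 0.868 (p < 0.001) | | 0.760 (p < 0.001) | | 0.639 (p < 0.001) | | 0.354 (p < 0.001) |
BP2 | 1 | 1122 | 1172 | 2327 | 2366 | 2224 | 2252 | 1902 | 1381
 | 2 | 980 | 955 | 724 | 720 | 701 | 670 | 731 | 614
 | 3 | 962 | 986 | 444 | 438 | 525 | 543 | 621 | 569
 | 4 | 559 | 547 | 215 | 210 | 252 | 244 | 382 | 779
 | 5 | 178 | 174 | 87 | 100 | 98 | 124 | 164 | 491
 | χ2 (p) | 1.742 (p = 0.783) | | 1.159 (p = 0.885) | | 4.211 (p = 0.378) | | 393.990 (p < 0.001) |
 | ICC (p) | 0.876 (p < 0.001) | | 0.762 (p < 0.001) | | 0.623 (p < 0.001) | | 0.270 (p < 0.001) |
MH1 | 1 | 45 | 48 | 40 | 56 | 39 | 36 | 56 | 325
 | 2 | 100 | 108 | 78 | 88 | 105 | 126 | 128 | 300
 | 3 | 272 | 269 | 138 | 150 | 194 | 176 | 236 | 436
 | 4 | 521 | 529 | 230 | 236 | 281 | 299 | 361 | 381
 | 5 | 1140 | 1118 | 736 | 749 | 770 | 749 | 782 | 786
 | 6 | 1739 | 1762 | 2599 | 2555 | 2429 | 2449 | 2251 | 1606
 | χ2 (p) | 0.810 (p = 0.976) | | 4.314 (p = 0.505) | | 3.798 (p = 0.579) | | 426.931 (p < 0.001) |
 | ICC (p) | 0.849 (p < 0.001) | | 0.775 (p < 0.001) | | 0.635 (p < 0.001) | | 0.273 (p < 0.001) |
MH4 | 1 | 48 | 48 | 53 | 72 | 56 | 73 | 85 | 330
 | 2 | 120 | 126 | 104 | 104 | 142 | 174 | 188 | 408
 | 3 | 268 | 268 | 167 | 181 | 223 | 209 | 281 | 690
 | 4 | 495 | 487 | 316 | 296 | 400 | 374 | 495 | 438
 | 5 | 1306 | 1326 | 999 | 1037 | 1047 | 1090 | 1065 | 813
 | 6 | 1575 | 1580 | 2176 | 2145 | 1952 | 1913 | 1704 | 1155
 | χ2 (p) | 0.302 (p = 0.998) | | 4.984 (p = 0.418) | | 8.045 (p = 0.154) | | 540.811 (p < 0.001) |
 | ICC (p) | 0.851 (p < 0.001) | | 0.780 (p < 0.001) | | 0.673 (p < 0.001) | | 0.277 (p < 0.001) |
VT2 | 1 | 100 | 96 | 604 | 615 | 459 | 422 | 322 | 267
 | 2 | 417 | 432 | 1440 | 1481 | 1245 | 1300 | 949 | 602
 | 3 | 660 | 673 | 792 | 769 | 857 | 867 | 804 | 732
 | 4 | 1003 | 1008 | 458 | 471 | 505 | 487 | 663 | 637
 | 5 | 1014 | 992 | 340 | 302 | 472 | 474 | 617 | 780
 | 6 | 622 | 632 | 189 | 196 | 280 | 284 | 460 | 816
 | χ2 (p) | 0.769 (p = 0.979) | | 3.556 (p = 0.615) | | 3.126 (p = 0.681) | | 204.960 (p < 0.001) |
 | ICC (p) | 0.884 (p < 0.001) | | 0.781 (p < 0.001) | | 0.671 (p < 0.001) | | 0.435 (p < 0.001) |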

Frequencies in the cells were estimated by averaging frequencies for each item across the imputations and then rounded to whole numbers. ICC, intraclass correlation coefficient.
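The χ2 comparisons in Table 3 can be sketched as a chi-squared test of homogeneity on the 2 × k table of actual versus imputed frequencies. That this is the exact form of the test used in the study is an assumption of the sketch, although applied to the PF1 frequencies at year 0 it gives a statistic close to the reported χ2 = 1.610. The function names are hypothetical, the p-value helper uses the closed form that holds only for two degrees of freedom (three score levels), and no false discovery rate adjustment is applied.

```python
import math
from typing import Sequence, Tuple


def chi2_homogeneity(actual: Sequence[float],
                     imputed: Sequence[float]) -> Tuple[float, int]:
    """Pearson chi-squared statistic and degrees of freedom for a 2 x k table.

    Assumes every score level has a non-zero combined frequency.
    """
    if len(actual) != len(imputed):
        raise ValueError("both rows need the same number of score levels")
    n_actual, n_imputed = sum(actual), sum(imputed)
    total = n_actual + n_imputed
    stat = 0.0
    for a, b in zip(actual, imputed):
        column_total = a + b
        for observed, row_total in ((a, n_actual), (b, n_imputed)):
            expected = column_total * row_total / total
            stat += (observed - expected) ** 2 / expected
    return stat, len(actual) - 1


def p_value_df2(stat: float) -> float:
    """Upper-tail p-value of a chi-squared statistic with exactly 2 df."""
    return math.exp(-stat / 2.0)


# PF1 at year 0 (scores 1-3), frequencies taken from Table 3.
pf1_actual = [2584, 1106, 132]
pf1_imputed = [2543, 1148, 144]
stat, df = chi2_homogeneity(pf1_actual, pf1_imputed)
```

For items with more than three score levels, the p-value would require the general chi-squared distribution function rather than the two-degree-of-freedom shortcut used here.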

3.3. Imputation Results for SF-6D Index

In general, the imputed SF-6D index values had means and standard errors similar to those of the actual values (Table 4). At baseline, the MAPE of the imputed index was smaller than 5%, which means that, on average, the imputed index values deviated from the actual values by less than 5%. Even for the follow-up in postoperative year 2, the deviation of the imputed index values from the actual values was smaller than 10% (Table 4). The results indicate that the imputation method may provide relatively accurate values for the continuous index, in terms of the mean absolute percentage error, when the missing proportion is around 60%. According to the ICC values presented in Table 4, the agreement between the actual and imputed values of the SF-6D index was good at baseline but moderate at the one-, two-, and five-year follow-ups.
Table 4

Comparison of actual and imputed SF-6D indices at baseline and follow-ups.

Time Point | Actual Mean | Actual SE | Imputed Mean | Imputed SE | MAPE (%) | ICC (95% CI)
Baseline | 0.688 | 0.0021 | 0.698 | 0.0019 | 4.16 | 0.814 (0.811, 0.816)
One-year follow-up | 0.813 | 0.0022 | 0.812 | 0.0019 | 6.14 | 0.682 (0.678, 0.686)
Two-year follow-up | 0.796 | 0.0023 | 0.797 | 0.0018 | 8.15 | 0.598 (0.592, 0.693)
Five-year follow-up | 0.762 | 0.0025 | 0.766 | 0.0018 | 11.62 | 0.516 (0.510, 0.522)

SE, standard error; MAPE, mean absolute percentage error; ICC, intraclass correlation coefficient; CI, confidence interval.

4. Discussion

Real-world data collection faces more challenges than data collection in RCTs. Firstly, a high proportion of missing forms is likely because of long follow-ups, especially when the follow-up time extends beyond two years [11,12]. Moreover, the follow-up time points of most clinical registers are based on the need for care and are not standardized, which brings additional challenges in handling data missingness and analysis [34]. Secondly, missing or incomplete confounder measures are more common in real-world data than in clinical trials, which might distort the inference. For example, loss to follow-up might be due to characteristics that cannot be randomized, such as deteriorating health, which would introduce bias in MI and in later inferential statistics. However, almost all of the guidelines on how to handle missingness in HRQoL data are for clinical trials only [13,14,35]. Therefore, our study might contribute to the development of guidance for good practices for the prevention and handling of missing data in real-world HRQoL data.

4.1. Main Findings

In the current study, we explored the missing data problem in the HRQoL data of a clinical register (SOReg) and examined a sequential multiple imputation method as a potential solution for the missing item and form problem in repeated data collection using the SF-36 questionnaire. To the best of our knowledge, this was the first time that the sequential multiple imputation method was applied to impute HRQoL data. This method is preferred because it takes into consideration the longitudinal nature of HRQoL data collected in clinical studies; that is, a patient’s HRQoL at a follow-up is determined by his/her previous HRQoL, i.e., at baseline and at previous follow-ups. Although the missing proportion was high for the self-reported HRQoL questionnaire, the sequential multiple imputation method may still provide distributions quite similar to the actual ones even when the missing proportion is as high as 60%. The method offers a potential solution for handling missing data in multivariable analyses in HRQoL studies when missingness is substantial. Estimation of the SF-6D index requires complete answers on all 11 items from SF-36 or SF-12 [7,36]. Knowledge of missingness patterns, especially of items associated with high missing proportions, is crucial to prevent missing data and select appropriate imputation methods. Knowledge of missingness patterns (in terms of which characteristics of patients and providers are associated with missing data) might enable one to use appropriate strategies to reduce missing data during the process of data collection [12]. In the current study, although we found many different combinations in the missingness patterns for SF-6D, there was no dominating pattern, suggesting that the missingness on the 11 items used for SF-6D was independent across items.
The overall missing proportions of the 11 SF-6D items increased over time, which might suggest that missingness is associated with the length of the follow-up. Under the assumption of a missing at random mechanism for the 11 items, the sequential multiple imputation method achieved satisfactory agreement between the actual data and the imputed data even when the missing proportion was as high as 63% at the two-year follow-up. However, as expected, the imputation could not approximate the actual data at the five-year follow-up, where the missing proportion was >80%. One reason for the worse performance of the sequential MI procedure in the two- and five-year follow-ups might be the propagation of uncertainty (or propagation of error) embedded in the procedure [37]. Because the MI for a particular year already incorporates uncertainty about that year's missing data, the sequential MI for later years propagates this uncertainty through the parameters of the imputation function, which are estimated from the previously multiply-imputed data [38].
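The wave-by-wave logic, and the way uncertainty carries forward, can be sketched as follows. This is a simplified illustration under stated assumptions, not the imputation model used in the study: it treats the outcome as a single continuous score per wave (for example, a utility index), uses a simple linear regression on the previous wave only, draws baseline values from the observed baseline values, and approximates parameter uncertainty by bootstrapping the donor cases. All function names and the toy data are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx > 0 else 0.0
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    dof = max(len(xs) - 2, 1)
    sd = (sum(r * r for r in resid) / dof) ** 0.5
    return a, b, sd


def sequential_impute_once(data, rng):
    """Return one completed copy of `data` (rows of wave values, None = missing)."""
    completed = [list(row) for row in data]
    n_waves = len(data[0])
    # Wave 0 has no earlier wave: draw from its observed values (simple hot deck).
    observed0 = [row[0] for row in data if row[0] is not None]
    for row in completed:
        if row[0] is None:
            row[0] = rng.choice(observed0)
    # Later waves: regress wave t on the already completed wave t-1, so imputed
    # values from earlier waves feed into the model for later waves.
    for t in range(1, n_waves):
        donors = [(c[t - 1], d[t]) for c, d in zip(completed, data) if d[t] is not None]
        # Bootstrap the donors so that parameter uncertainty enters each imputation.
        boot = [rng.choice(donors) for _ in donors]
        a, b, sd = fit_line([x for x, _ in boot], [y for _, y in boot])
        for row, orig in zip(completed, data):
            if orig[t] is None:
                row[t] = a + b * row[t - 1] + rng.gauss(0.0, sd)
    return completed


def sequential_mi(data, m=5, seed=0):
    """Create m completed datasets by repeating the sequential imputation."""
    rng = random.Random(seed)
    return [sequential_impute_once(data, rng) for _ in range(m)]


# Hypothetical toy data: six patients, four waves (years 0, 1, 2, and 5).
data = [
    [0.60, 0.70, 0.72, 0.70],
    [0.55, 0.68, None, None],
    [None, 0.75, 0.74, None],
    [0.62, None, 0.70, 0.66],
    [0.58, 0.66, 0.69, 0.65],
    [0.50, 0.61, None, 0.60],
]
imputed_sets = sequential_mi(data, m=5, seed=1)
```

In this sketch, the regression for each later wave is fitted partly on values that were themselves imputed, so the number of truly observed donors shrinks while the share of imputed predictors grows as the waves progress. This is one way to see why agreement with the actual data may deteriorate at later follow-ups.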

4.2. Strengths and Limitations

In the current study, we adopted and evaluated the sequential multiple imputation method for four waves of HRQoL data collected from around four thousand patients in a registry. Although the performance of the item-level and index-level imputations cannot be compared directly in our study, both presented a high agreement between the imputed data and the real-world data when the missing proportion was < 50%, showing great application potential. We hope our study may stimulate more research on missingness in real-world HRQoL data. Efficient imputation methods would help improve the translation of HRQoL data into complete, accurate, and reliable evidence for healthcare decision-making [13].
There are limitations in the current study. Firstly, because the actual values of the missing SF-36 items and SF-6D index in the real-world data were unknown, it was impossible to identify the mechanism of the missingness in the current study; therefore, we assumed a missing at random mechanism for the missing values. However, if the probability of missingness for an item depended on its unobserved true value, i.e., the item’s non-response was ‘missing not at random’ (for example, if patients with worse health were more likely to have missing HRQoL items and/or forms), the current multiple imputation method would not be sufficient and other missing data models with different assumptions should be investigated [39]. Secondly, the proposed sequential MI and its performance were evaluated based on the simulation using the complete respondents, i.e., those who had all four HRQoL forms during the 5-year follow-up. However, we observed statistically significant differences between the patients with missing forms and those with complete forms in the current study. Although the differences were minor and the statistical significance might have been due to the large sample size, a possible bias introduced by the imputation based on the data of the complete respondents cannot be ruled out.
Thirdly, in the current study, we investigated missing data concerning SF-36 and SF-6D in a clinical registry for obese patients. Further investigations on missingness based on other HRQoL instruments, such as EQ-5D and the Health Utilities Index, and in other patient groups are also needed. It might be that HRQoL instruments with more items are more likely to be associated with higher missing proportions, which should be considered when designing the data collection strategy.
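The simulation-based evaluation on complete respondents can be illustrated with the following minimal Python sketch. It is not the procedure used in the study: the responses are simulated, the entries are hidden completely at random (a stronger assumption than missing at random), the imputation step is a placeholder that draws from the observed responses, and the Pearson chi-square statistic is used here only as one possible way to compare the distributions of actual and imputed responses. All names are hypothetical.

```python
import random
from collections import Counter


def chi_square_homogeneity(actual, imputed):
    """Pearson chi-square statistic comparing two samples of categorical responses."""
    categories = sorted(set(actual) | set(imputed))
    counts = [Counter(actual), Counter(imputed)]
    totals = [len(actual), len(imputed)]
    grand = sum(totals)
    stat = 0.0
    for cat in categories:
        col_total = counts[0][cat] + counts[1][cat]
        for k in range(2):
            expected = totals[k] * col_total / grand
            stat += (counts[k][cat] - expected) ** 2 / expected
    dof = len(categories) - 1
    return stat, dof


def mask_at_random(values, proportion, rng):
    """Hide a share of entries completely at random; return (masked, hidden_indices)."""
    n_mask = round(len(values) * proportion)
    idx = sorted(rng.sample(range(len(values)), n_mask))
    hidden = set(idx)
    masked = [None if i in hidden else v for i, v in enumerate(values)]
    return masked, idx


rng = random.Random(42)
# Hypothetical complete responses to one five-level item.
complete = [rng.choice([1, 2, 3, 4, 5]) for _ in range(500)]
masked, idx = mask_at_random(complete, 0.4, rng)

observed = [v for v in masked if v is not None]
imputed = [rng.choice(observed) for _ in idx]  # placeholder for a real imputation model
actual = [complete[i] for i in idx]            # the values that were hidden

stat, dof = chi_square_homogeneity(actual, imputed)
```

The statistic would then be compared with the chi-square distribution with `dof` degrees of freedom. Replacing `mask_at_random` with a mechanism in which the probability of being hidden depends on the response itself would give a rough way to probe the missing not at random scenario described in the first limitation.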

5. Conclusions

Relatively high missing proportions in HRQoL data, especially after long-term follow-ups, are common in clinical registries, which brings challenges to analyzing the HRQoL of longitudinal cohorts. The sequential multiple imputation method adopted in the current study might provide an adequate imputation for the missing data, even when the follow-up survey has a missing proportion of about 60%, avoiding significant information waste in the multivariable analysis. However, the performance of imputation for data with higher missing proportions (above 60%) remains unclear. To prevent and handle missing data in HRQoL studies, researchers should apply rigorous methodology and practices. Guidance for preventing and handling missing data in observational studies is needed, and studies that use real-world data should be prioritized.
  32 in total

Review 1.  One thousand health-related quality-of-life estimates.

Authors:  T O Tengs; A Wallace
Journal:  Med Care       Date:  2000-06       Impact factor: 2.983

2.  Estimating health state utility values for comorbid health conditions using SF-6D data.

Authors:  Roberta Ara; John Brazier
Journal:  Value Health       Date:  2011-05-31       Impact factor: 5.725

3.  High acquisition rate and internal validity in the Scandinavian Obesity Surgery Registry.

Authors:  Magnus Sundbom; Ingmar Näslund; Erik Näslund; Johan Ottosson
Journal:  Surg Obes Relat Dis       Date:  2020-10-22       Impact factor: 4.734

4.  Measurement tools of resource use and quality of life in clinical trials for dementia or cognitive impairment interventions: A systematically conducted narrative review.

Authors:  Fan Yang; Piers Dawes; Iracema Leroi; Brenda Gannon
Journal:  Int J Geriatr Psychiatry       Date:  2017-08-10       Impact factor: 3.485

Review 5.  Population-based cancer registries for quality-of-life research: a work-in-progress resource for survivorship studies?

Authors:  Melissa S Y Thong; Floortje Mols; Kevin D Stein; Tenbroeck Smith; Jan-Willem W Coebergh; Lonneke V van de Poll-Franse
Journal:  Cancer       Date:  2013-06-01       Impact factor: 6.860

6.  PsoReg--the Swedish registry for systemic psoriasis treatment. The registry's design and objectives.

Authors:  Marcus Schmitt-Egenolf
Journal:  Dermatology       Date:  2007       Impact factor: 5.366

Review 7.  The measurement of health-related quality of life (QOL) in paediatric clinical trials: a systematic review.

Authors:  Sally-Ann Clarke; Christine Eiser
Journal:  Health Qual Life Outcomes       Date:  2004-11-22       Impact factor: 3.186

8.  A guide to handling missing data in cost-effectiveness analysis conducted within randomised controlled trials.

Authors:  Rita Faria; Manuel Gomes; David Epstein; Ian R White
Journal:  Pharmacoeconomics       Date:  2014-12       Impact factor: 4.981

9.  When and how should multiple imputation be used for handling missing data in randomised clinical trials - a practical guide with flowcharts.

Authors:  Janus Christian Jakobsen; Christian Gluud; Jørn Wetterslev; Per Winkel
Journal:  BMC Med Res Methodol       Date:  2017-12-06       Impact factor: 4.615

10.  Multiple imputation for patient reported outcome measures in randomised controlled trials: advantages and disadvantages of imputing at the item, subscale or composite score level.

Authors:  Ines Rombach; Alastair M Gray; Crispin Jenkinson; David W Murray; Oliver Rivero-Arias
Journal:  BMC Med Res Methodol       Date:  2018-08-28       Impact factor: 4.615
