Paul J Novotny, Darrell Schroeder, Jeff A Sloan, Gina L Mazza, David Williams, David Bradley, Irina V Haller, Steven M Bradley, Ivana Croghan.
Abstract
OBJECTIVE: To determine the effects of missing and inconsistent data on the results of a weight management mail survey. PATIENTS AND METHODS: Weight management surveys were sent to 5000 overweight and obese individuals in the Learning Health System Network. Survey information was collected between October 27, 2017, and March 1, 2018. Some participants reported body mass index (BMI) values inconsistent with the intended overweight and obese sampling cohort. Analyses were performed after excluding these surveys and were repeated after setting these low BMI values to missing. Models were run after imputing missing values using expectation-maximization, Markov chain Monte Carlo, random forest imputation, multivariate imputation by chained equations, and multiple imputation, and after replacing missing BMI values with the minimum, maximum, mean, or median of the known BMI values.
Keywords: BMI, body mass index; MAR, missing at random; MCAR, missing completely at random; MCMC, Markov chain Monte Carlo; MNAR, missing not at random; OR, odds ratio
Year: 2021 PMID: 33718787 PMCID: PMC7930870 DOI: 10.1016/j.mayocpiqo.2020.09.006
Source DB: PubMed Journal: Mayo Clin Proc Innov Qual Outcomes ISSN: 2542-4548
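The simplest strategies named in the abstract replace every missing BMI with a single summary of the observed values (minimum, maximum, mean, or median). A minimal Python sketch of that idea follows; the function name `impute_single_value` and the sample values are hypothetical and are not taken from the study's code or data.

```python
from statistics import mean, median

# Map each strategy name from the abstract to a summary function.
SUMMARIES = {"minimum": min, "maximum": max, "mean": mean, "median": median}

def impute_single_value(values, strategy):
    """Replace None entries with one summary statistic of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = SUMMARIES[strategy](observed)
    return [fill if v is None else v for v in values]

# Hypothetical BMI values (kg/m2); None marks a missing response.
bmi = [27.5, None, 32.4, 46.1, None, 37.3]
for strategy in SUMMARIES:
    print(strategy, impute_single_value(bmi, strategy))
```

Every missing entry receives the same value, so these fills are cheap to apply but carry no information about the uncertainty of the imputed values.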
Demographic Characteristics by BMI Group
| Characteristic | Missing BMI (n=222) | BMI <25.0 kg/m² (n=155) | BMI 25.0-29.9 kg/m² (n=703) | BMI 30.0-34.9 kg/m² (n=665) | BMI 35.0-39.9 kg/m² (n=503) | BMI ≥40.0 kg/m² (n=551) |
|---|---|---|---|---|---|---|
| Reported BMI | ||||||
| n | 0 | 155 | 703 | 665 | 503 | 551 |
| Mean (kg/m²) | | 23.8 | 27.5 | 32.4 | 37.3 | 46.1 |
| Age | ||||||
| n | 211 | 150 | 685 | 656 | 498 | 542 |
| Mean (y) | 65.7 | 61.0 | 62.1 | 60.9 | 58.8 | 54.3 |
| Sex | ||||||
| Missing | 8 (4) | 2 (1) | 17 (2) | 7 (1) | 4 (1) | 8 (1) |
| Female | 128 (58) | 88 (57) | 360 (51) | 359 (54) | 321 (64) | 404 (73) |
| Male | 86 (39) | 65 (42) | 326 (46) | 299 (45) | 178 (35) | 138 (25) |
| Other | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 1 (0) |
| Race | ||||||
| Missing | 11 (5) | 2 (1) | 22 (3) | 9 (1) | 6 (1) | 8 (1) |
| Asian | 0 (0) | 4 (3) | 8 (1) | 1 (0) | 0 (0) | 1 (0) |
| Black | 14 (6) | 2 (1) | 14 (2) | 22 (3) | 14 (3) | 28 (5) |
| Other | 6 (3) | 9 (6) | 21 (3) | 18 (3) | 24 (5) | 29 (5) |
| White | 191 (86) | 138 (89) | 638 (91) | 615 (92) | 459 (91) | 485 (88) |
| Marital status | ||||||
| Missing | 120 (54) | 2 (1) | 7 (1) | 6 (1) | 10 (2) | 10 (2) |
| Married | 63 (28) | 117 (75) | 521 (74) | 497 (75) | 358 (71) | 330 (60) |
| Never married | 7 (3) | 14 (9) | 47 (7) | 48 (7) | 41 (8) | 89 (16) |
| Separated/divorced | 11 (5) | 8 (5) | 75 (11) | 74 (11) | 58 (12) | 89 (16) |
| Widowed | 21 (9) | 14 (9) | 53 (8) | 40 (6) | 36 (7) | 33 (6) |
| Education | ||||||
| Missing | 125 (56) | 2 (1) | 10 (1) | 12 (2) | 10 (2) | 12 (2) |
| Less than HS graduate | 6 (3) | 5 (3) | 18 (3) | 16 (2) | 13 (3) | 9 (2) |
| HS graduate | 28 (13) | 25 (16) | 117 (17) | 117 (18) | 91 (18) | 115 (21) |
| Some college | 32 (14) | 35 (23) | 223 (32) | 224 (34) | 183 (36) | 239 (43) |
| 4-Y college degree | 10 (5) | 29 (19) | 151 (21) | 130 (20) | 104 (21) | 86 (16) |
| Some postgraduate | 6 (3) | 9 (6) | 34 (5) | 37 (6) | 22 (4) | 25 (5) |
| Postgraduate or professional degree | 15 (7) | 50 (32) | 150 (21) | 129 (19) | 80 (16) | 65 (12) |
| Multiple comorbidities | ||||||
| No | 158 (71) | 91 (59) | 328 (47) | 211 (32) | 145 (29) | 124 (23) |
| Yes | 64 (29) | 64 (41) | 375 (53) | 454 (68) | 358 (71) | 427 (77) |
| Judged because of weight | ||||||
| Missing | 32 (14) | 9 (6) | 50 (7) | 57 (9) | 48 (10) | 69 (13) |
| No | 174 (78) | 141 (91) | 638 (91) | 577 (87) | 407 (81) | 417 (76) |
| Yes | 16 (7) | 5 (3) | 15 (2) | 31 (5) | 48 (10) | 65 (12) |
| Not always treated with respect | ||||||
| Missing | 20 (9) | 6 (4) | 35 (5) | 22 (3) | 17 (3) | 19 (3) |
| No | 163 (73) | 134 (86) | 586 (83) | 548 (82) | 411 (82) | 422 (77) |
| Yes | 39 (18) | 15 (10) | 82 (12) | 95 (14) | 75 (15) | 110 (20) |
| Not always treated as an equal | ||||||
| Missing | 19 (9) | 6 (4) | 40 (6) | 26 (4) | 20 (4) | 23 (4) |
| No | 152 (68) | 111 (72) | 506 (72) | 476 (72) | 353 (70) | 354 (64) |
| Yes | 51 (23) | 38 (25) | 157 (22) | 163 (25) | 130 (26) | 174 (32) |
| General health | ||||||
| Missing | 120 (54) | 0 (0) | 3 (0) | 8 (1) | 2 (0) | 5 (1) |
| Excellent | 5 (2) | 38 (25) | 64 (9) | 23 (3) | 3 (1) | 6 (1) |
| Very good | 29 (13) | 62 (40) | 285 (41) | 195 (29) | 117 (23) | 65 (12) |
| Good | 43 (19) | 43 (28) | 266 (38) | 295 (44) | 246 (49) | 235 (43) |
| Fair | 19 (9) | 9 (6) | 73 (10) | 123 (18) | 112 (22) | 193 (35) |
| Poor | 6 (3) | 3 (2) | 12 (2) | 21 (3) | 23 (5) | 47 (9) |
| Positive screen result for current depression (PHQ-2) | ||||||
| Missing | 30 (14) | 10 (6) | 61 (9) | 68 (10) | 37 (7) | 40 (7) |
| No | 156 (70) | 140 (90) | 584 (83) | 513 (77) | 392 (78) | 363 (66) |
| Yes | 36 (16) | 5 (3) | 58 (8) | 84 (13) | 74 (15) | 148 (27) |
| Currently smoke cigarettes | ||||||
| Missing | 119 (54) | 1 (1) | 3 (0) | 2 (0) | 3 (1) | 1 (0) |
| Yes | 9 (4) | 9 (6) | 39 (6) | 44 (7) | 26 (5) | 36 (7) |
| No | 94 (42) | 145 (94) | 661 (94) | 619 (93) | 474 (94) | 514 (93) |
| Currently use alcohol products | ||||||
| Missing | 125 (56) | 9 (6) | 26 (4) | 24 (4) | 22 (4) | 10 (2) |
| Yes | 41 (18) | 86 (55) | 458 (65) | 381 (57) | 269 (53) | 251 (46) |
| No | 56 (25) | 60 (39) | 219 (31) | 260 (39) | 212 (42) | 290 (53) |
| Considered overweight as a child | ||||||
| Missing | 123 (55) | 0 (0) | 3 (0) | 6 (1) | 3 (1) | 7 (1) |
| Yes | 19 (9) | 20 (13) | 92 (13) | 141 (21) | 177 (35) | 268 (49) |
| No | 80 (36) | 135 (87) | 608 (86) | 518 (78) | 323 (64) | 276 (50) |
| Current opinion of their weight | ||||||
| Missing | 28 (13) | 11 (7) | 63 (9) | 71 (11) | 43 (9) | 44 (8) |
| Underweight | 1 (0) | 0 (0) | 1 (0) | 2 (0) | 1 (0) | 0 (0) |
| Average or normal weight | 45 (20) | 111 (72) | 249 (35) | 40 (6) | 4 (1) | 2 (0) |
| Overweight | 95 (43) | 31 (20) | 376 (53) | 429 (65) | 214 (43) | 100 (18) |
| Obese | 42 (19) | 2 (1) | 13 (2) | 118 (18) | 207 (41) | 204 (37) |
| Very obese | 11 (5) | 0 (0) | 1 (0) | 5 (1) | 34 (7) | 201 (36) |
| Physical violence while growing up | ||||||
| Missing | 25 (11) | 9 (6) | 60 (9) | 61 (9) | 32 (6) | 34 (6) |
| Yes | 28 (13) | 17 (11) | 98 (14) | 103 (15) | 104 (21) | 129 (23) |
| No | 169 (76) | 129 (83) | 545 (78) | 501 (75) | 367 (73) | 388 (70) |
BMI = body mass index; HS = high school; PHQ-2 = Patient Health Questionnaire-2.
Data are expressed as No. (percentage) unless indicated otherwise.
Extent of Missing Data
| Cohort | All surveys | Excluding surveys with BMI <25.0 kg/m² |
|---|---|---|
| Total n | 2799 | 2644 |
| Missing BMI | 377 (14) | 222 (8) |
| Missing judged because of weight | 265 (10) | 256 (10) |
| Missing always treated with respect | 119 (4) | 113 (4) |
| Missing always treated as an equal | 134 (5) | 128 (5) |
| Missing age | 57 (2) | 52 (2) |
| Missing sex | 46 (2) | 44 (2) |
| Missing race | 58 (2) | 56 (2) |
| Missing marital status | 155 (6) | 153 (6) |
| Missing education | 171 (6) | 169 (6) |
| Missing multiple comorbidities | 0 (0) | 0 (0) |
| Missing any of these variables | 725 (26) | 570 (22) |
BMI = body mass index.
Data are expressed as No. (percentage) unless indicated otherwise.
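The two columns correspond to the two ways of handling reported BMI values below 25.0 kg/m² described in the abstract: dropping those surveys, or keeping them with BMI set to missing. The sketch below reproduces the cohort sizes and missing-BMI counts from the group sizes in the demographic table (222 missing, 155 below 25.0, and 2422 at or above 25.0). The function names and the placeholder BMI values are illustrative assumptions.

```python
LOW_BMI_CUTOFF = 25.0  # kg/m2, lower bound of the intended sampling cohort

def exclude_low_bmi(bmis):
    """Drop surveys whose reported BMI is below the cutoff; keep missing ones."""
    return [b for b in bmis if b is None or b >= LOW_BMI_CUTOFF]

def set_low_bmi_missing(bmis):
    """Keep every survey but treat a BMI below the cutoff as missing (None)."""
    return [None if b is not None and b < LOW_BMI_CUTOFF else b for b in bmis]

def count_missing(bmis):
    """Number of surveys without a usable BMI."""
    return sum(b is None for b in bmis)

# Placeholder BMI values; only the group sizes come from the tables.
surveys = [None] * 222 + [23.8] * 155 + [32.0] * 2422

excluded = exclude_low_bmi(surveys)
recoded = set_low_bmi_missing(surveys)
print(len(excluded), count_missing(excluded))  # 2644 222
print(len(recoded), count_missing(recoded))    # 2799 377
```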
P Values for Associations With BMI Using Different Imputation Methods Excluding Participants With a BMI of <25.0 kg/m²
| Imputation method | Age | Female | Non-Hispanic white | Single | High school education | Some college |
|---|---|---|---|---|---|---|
| Original results | <.001 | <.001 | .018 | <.001 | .49 | <.001 |
| Minimum | <.001 | <.001 | .028 | <.001 | .74 | <.001 |
| Maximum | <.001 | <.001 | .033 | <.001 | .09 | <.001 |
| Mean | <.001 | <.001 | .022 | <.001 | .44 | <.001 |
| Median | <.001 | <.001 | .022 | <.001 | .44 | <.001 |
| EM algorithm | <.001 | <.001 | .005 | <.001 | .29 | <.001 |
| MCMC algorithm | <.001 | <.001 | .008 | <.001 | .34 | <.001 |
| Random forest imputation | <.001 | <.001 | <.001 | <.001 | .47 | <.001 |
| MICE imputation | <.001 | <.001 | <.001 | <.001 | .07 | <.001 |
P values are based on univariate logistic regression models.
BMI = body mass index; EM = expectation-maximization; MCMC = Markov chain Monte Carlo; MICE = multivariate imputation by chained equations.
Logistic Model for Feeling Judged by BMI Excluding Patients With Low BMI
| Imputation | Type III P value | BMI 30.0-34.9 kg/m²: odds ratio (95% CI) | P value | BMI 35.0-39.9 kg/m²: odds ratio (95% CI) | P value | BMI ≥40.0 kg/m²: odds ratio (95% CI) | P value |
|---|---|---|---|---|---|---|---|
| Original results | <.001 | 2.38 (1.22-4.63) | .011 | 4.62 (2.45-8.74) | <.001 | 5.26 (2.78-9.96) | <.001 |
| Minimum | <.001 | 2.13 (1.14-3.98) | .017 | 4.14 (2.29-7.48) | <.001 | 4.68 (2.59-8.46) | <.001 |
| Maximum | <.001 | 2.35 (1.21-4.58) | .012 | 4.55 (2.41-8.61) | <.001 | 4.76 (2.54-8.95) | <.001 |
| Mean | <.001 | 2.34 (1.21-4.51) | .011 | 4.59 (2.43-8.69) | <.001 | 5.20 (2.75-9.84) | <.001 |
| Median | <.001 | 2.34 (1.21-4.51) | .011 | 4.59 (2.43-8.69) | <.001 | 5.20 (2.75-9.84) | <.001 |
| EM algorithm | <.001 | 2.38 (1.22-4.61) | .011 | 4.60 (2.44-8.67) | <.001 | 5.18 (2.74-9.82) | <.001 |
| MCMC algorithm | <.001 | 2.35 (1.21-4.56) | .012 | 4.54 (2.40-8.57) | <.001 | 5.28 (2.79-10.00) | <.001 |
| Random forest imputation | <.001 | 2.32 (1.20-4.50) | .013 | 4.51 (2.39-8.50) | <.001 | 5.28 (2.79-9.99) | <.001 |
| Multiple imputation | <.001 | 2.00 (1.07-3.75) | .030 | 3.82 (2.07-7.03) | <.001 | 4.70 (2.62-8.44) | <.001 |
| MICE | <.001 | 2.32 (1.22-4.39) | .010 | 4.30 (2.39-7.74) | <.001 | 6.12 (3.35-11.18) | <.001 |
BMI = body mass index; EM = expectation-maximization; MCMC = Markov chain Monte Carlo; MICE = multivariate imputation by chained equations.
Models were adjusted for age, sex, race, marital status, education, and presence of multiple comorbidities as covariates.
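Each odds ratio in this table is the exponentiated coefficient of a BMI category from the adjusted logistic model. Assuming the intervals are Wald intervals (a common default, but an assumption, because the source does not say how they were computed), the coefficient and its standard error can be recovered from a reported row. The helper names below are hypothetical; the input is the "Original results" entry for BMI 30.0-34.9.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def odds_ratio_ci(beta, se, z=Z_95):
    """Return (OR, lower, upper) for a logistic coefficient and its SE."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def coefficient_from_or(odds_ratio, lower, upper, z=Z_95):
    """Recover (beta, SE) from a reported OR, assuming a Wald 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se

# Reported row: 2.38 (1.22-4.63)
beta, se = coefficient_from_or(2.38, 1.22, 4.63)
odds_ratio, lower, upper = odds_ratio_ci(beta, se)
print(f"beta={beta:.3f} SE={se:.3f}")
print(f"OR={odds_ratio:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

The reconstructed interval is symmetric on the log scale, so any small difference from the printed limits reflects rounding in the table.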
Logistic Model for Not Always Treated With Respect by BMI Excluding Patients With Low BMI
| Imputation | Type III P value | BMI 30.0-34.9 kg/m²: odds ratio (95% CI) | P value | BMI 35.0-39.9 kg/m²: odds ratio (95% CI) | P value | BMI ≥40.0 kg/m²: odds ratio (95% CI) | P value |
|---|---|---|---|---|---|---|---|
| Original results | .11 | 1.24 (0.89-1.74) | .20 | 1.10 (0.76-1.57) | .62 | 1.51 (1.07-2.14) | .021 |
| Minimum | .13 | 1.20 (0.87-1.66) | .26 | 1.06 (0.75-1.51) | .74 | 1.46 (1.04-2.04) | .028 |
| Maximum | .13 | 1.24 (0.89-1.72) | .21 | 1.09 (0.76-1.56) | .65 | 1.46 (1.04-2.04) | .027 |
| Mean | .11 | 1.24 (0.90-1.71) | .20 | 1.09 (0.76-1.57) | .64 | 1.50 (1.06-2.12) | .023 |
| Median | .11 | 1.24 (0.90-1.71) | .20 | 1.09 (0.76-1.57) | .64 | 1.50 (1.06-2.12) | .023 |
| EM algorithm | .12 | 1.27 (0.91-1.76) | .15 | 1.14 (0.80-1.63) | .47 | 1.51 (1.06-2.13) | .021 |
| MCMC algorithm | .07 | 1.27 (0.92-1.76) | .15 | 1.08 (0.76-1.55) | .66 | 1.53 (1.08-2.17) | .016 |
| Random forest imputation | .10 | 1.22 (0.88-1.69) | .24 | 1.07 (0.75-1.53) | .70 | 1.50 (1.06-2.12) | .022 |
| Multiple imputation | .25 | 1.19 (0.86-1.64) | .29 | 1.19 (0.85-1.66) | .31 | 1.41 (1.00-1.99) | .048 |
| MICE | .07 | 1.18 (0.84-1.66) | .33 | 1.15 (0.81-1.64) | .42 | 1.42 (1.02-1.98) | .037 |
BMI = body mass index; EM = expectation-maximization; MCMC = Markov chain Monte Carlo; MICE = multivariate imputation by chained equations.
Models were adjusted for age, sex, race, marital status, education, and presence of multiple comorbidities as covariates.
Logistic Model for Not Always Treated as an Equal by BMI Excluding Patients With Low BMI
| Imputation | Type III P value | BMI 30.0-34.9 kg/m²: odds ratio (95% CI) | P value | BMI 35.0-39.9 kg/m²: odds ratio (95% CI) | P value | BMI ≥40.0 kg/m²: odds ratio (95% CI) | P value |
|---|---|---|---|---|---|---|---|
| Original results | .12 | 1.06 (0.81-1.38) | .68 | 1.05 (0.79-1.40) | .74 | 1.37 (1.03-1.82) | .030 |
| Minimum | .12 | 1.07 (0.83-1.39) | .58 | 1.06 (0.81-1.41) | .66 | 1.38 (1.05-1.82) | .022 |
| Maximum | .34 | 1.05 (0.80-1.36) | .73 | 1.03 (0.78-1.37) | .82 | 1.26 (0.96-1.65) | .100 |
| Mean | .13 | 1.02 (0.79-1.32) | .87 | 1.04 (0.78-1.38) | .79 | 1.35 (1.01-1.79) | .039 |
| Median | .13 | 1.02 (0.79-1.32) | .87 | 1.04 (0.78-1.38) | .79 | 1.35 (1.01-1.79) | .039 |
| EM algorithm | .14 | 1.04 (0.80-1.35) | .76 | 1.05 (0.80-1.39) | .72 | 1.35 (1.02-1.79) | .036 |
| MCMC algorithm | .10 | 1.04 (0.80-1.35) | .76 | 1.05 (0.79-1.39) | .74 | 1.38 (1.04-1.83) | .025 |
| Random forest imputation | .07 | 1.02 (0.78-1.32) | .90 | 1.01 (0.76-1.34) | .93 | 1.38 (1.04-1.82) | .027 |
| Multiple imputation | .16 | 1.08 (0.85-1.38) | .52 | 1.12 (0.86-1.46) | .41 | 1.35 (1.03-1.77) | .029 |
| MICE | .06 | 1.11 (0.86-1.44) | .43 | 1.14 (0.87-1.50) | .35 | 1.41 (1.08-1.85) | .012 |
BMI = body mass index; EM = expectation-maximization; MCMC = Markov chain Monte Carlo; MICE = multivariate imputation by chained equations.
Models were adjusted for age, sex, race, marital status, education, and presence of multiple comorbidities as covariates.
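The "Multiple imputation" rows summarize models fitted to several imputed data sets. The source does not state how the per-data-set estimates were combined. A common approach is Rubin's rules, sketched below under that assumption; the function name `pool_rubin` and the input numbers are hypothetical.

```python
import math
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Pool m log-odds estimates and their squared SEs with Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least two imputations")
    pooled = mean(estimates)        # pooled point estimate
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)

# Hypothetical log odds ratios and squared standard errors from m = 3 imputations.
log_or, se = pool_rubin([0.8, 0.9, 1.0], [0.1, 0.1, 0.1])
print(f"pooled OR={math.exp(log_or):.2f}, SE(log OR)={se:.3f}")
```

The between-imputation term widens the pooled standard error relative to any single imputed data set, which is how multiple imputation carries the uncertainty of the imputed values into the final interval.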