Tonya Moen Hansen, Knut Stavem, Kim Rand.
Abstract
Background. National valuation studies are costly, with ∼1000 face-to-face interviews recommended, and some countries may deem such studies infeasible. Building on previous studies exploring sample size, we determined the effect of sample size and alternative model specifications on the prediction accuracy of modeled coefficients in the regression analyses used to generate EQ-5D-5L value sets.

Methods. Data sets (n = 50 to ∼1000) were simulated from 3 valuation studies by resampling at the respondent level, drawing randomly with replacement 1000 times. We estimated utilities for each subsample with leave-one-out cross-validation at the block level using regression models (8 or 20 parameter; with or without a random intercept; time tradeoff [TTO] data only or TTO + discrete choice experiment [DCE] data). Prediction accuracy, measured as root mean square error (RMSE), was calculated by comparing predicted values with the censored mean observed values for the left-out block in the full data set. Linear regression was used to estimate the relative effect of changes in sample size and each model specification.

Results. Doubling the sample size decreased RMSE by 0.012 on average. Effects of other model specifications were smaller but, when combined, can compensate for the loss in prediction accuracy from a small sample size. For models using TTO data only, 8-parameter models clearly outperformed 20-parameter models. Adding a random intercept, or including DCE responses, also improved mean RMSE, most prominently for variants of the 20-parameter models.

Conclusions. The impact on prediction accuracy of further increases in sample size beyond 300 to 500 was smaller than the impact of combining alternative modeling choices. Hybrid modeling, use of constrained models, and inclusion of random intercepts all substantially improve the expected prediction accuracy. Beyond a minimum of 300 to 500 respondents, the sample size may be better informed by other considerations, such as legitimacy and representativeness, than by the technical prediction accuracy achievable.

Highlights:
- Increases in sample size beyond a minimum in the range of 300 to 500 respondents provide smaller gains in expected prediction accuracy than alternative modeling approaches.
- Constrained, nonlinear models; time tradeoff + discrete choice experiment hybrid modeling; and including a random intercept all improved the prediction accuracy of models estimating values for the EQ-5D-5L based on data from 3 different valuation studies.
- The tested modeling choices can compensate for smaller sample sizes.
Keywords: EQ-5D; cross validation; model misspecification; regression models; sample size; valuation study
Year: 2022 PMID: 35281553 PMCID: PMC8905070 DOI: 10.1177/23814683221083839
Source DB: PubMed Journal: MDM Policy Pract ISSN: 2381-4683
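The resampling and scoring steps described in the Methods can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data layout (`responses` maps a respondent ID to a list of `(state, value)` pairs), the function names, and the toy values are all hypothetical, the subsample mean stands in for the regression prediction, and the block-level leave-one-out step and the censoring of observed values are omitted.

```python
import math
import random
from collections import defaultdict


def draw_respondents(respondent_ids, n, rng):
    """Draw n respondent IDs with replacement (respondent-level resampling)."""
    return [rng.choice(respondent_ids) for _ in range(n)]


def mean_observed(responses, respondent_ids):
    """Mean observed value per health state; repeated IDs count repeatedly."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for rid in respondent_ids:
        for state, value in responses[rid]:
            totals[state] += value
            counts[state] += 1
    return {state: totals[state] / counts[state] for state in totals}


def rmse(predicted, target):
    """Root mean square error over the states present in `target`."""
    squared = [(predicted[state] - target[state]) ** 2 for state in target]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy data: three respondents, two health states each.
responses = {
    1: [("11112", 0.9), ("55555", -0.2)],
    2: [("11112", 0.8), ("55555", -0.4)],
    3: [("11112", 1.0), ("55555", 0.0)],
}

rng = random.Random(0)
target = mean_observed(responses, list(responses))  # full-data means

# Repeat the draw many times and average the RMSE, as in a simulation study.
scores = []
for _ in range(1000):
    sample = draw_respondents(list(responses), 2, rng)
    stand_in_prediction = mean_observed(responses, sample)  # not a regression
    scores.append(rmse(stand_in_prediction, target))

print(sum(scores) / len(scores))
```

In a fuller version, `stand_in_prediction` would be replaced by predictions from a model fitted to the subsample with the target block held out.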
Mean RMSE per Study and Sample Size (ss) for TTO-Only (T)/Hybrid (H) 20-/8-Parameter Models, with/without Random Intercept
| Study | ss | T20 | T20r | T8 | T8r | H20 | H20r | H8 | H8r |
|---|---|---|---|---|---|---|---|---|---|
| Netherlands | 50 | 0.178 | 0.164 | 0.130 | 0.132 | 0.127 | 0.138 | 0.114 | 0.126 |
|  | 100 | 0.136 | 0.128 | 0.105 | 0.108 | 0.103 | 0.109 | 0.096 | 0.103 |
|  | 200 | 0.112 | 0.106 | 0.091 | 0.092 | 0.089 | 0.092 |  | 0.089 |
|  | 350 | 0.100 | 0.095 |  |  |  |  |  |  |
|  | 500 | 0.095 | 0.091 |  |  |  |  |  |  |
|  | 750 | 0.090 |  |  |  |  |  |  |  |
|  | 950 | 0.088 |  |  |  |  |  |  |  |
|  | ∼990 |  |  |  |  |  |  |  |  |
| United States | 50 | 0.209 | 0.189 | 0.154 | 0.157 | 0.146 | 0.169 | 0.132 | 0.154 |
|  | 100 | 0.159 | 0.146 | 0.124 | 0.126 | 0.120 | 0.130 | 0.113 | 0.122 |
|  | 200 | 0.128 | 0.119 | 0.108 | 0.107 | 0.105 | 0.110 | 0.102 | 0.104 |
|  | 350 | 0.113 | 0.104 | 0.101 |  | 0.099 |  |  |  |
|  | 500 | 0.107 |  | 0.098 |  |  |  |  |  |
|  | 750 | 0.101 |  |  |  |  |  |  |  |
|  | 950 | 0.099 |  |  |  |  |  |  |  |
|  | 1000 | 0.099 |  |  |  |  |  |  |  |
|  | 1050 | 0.098 |  |  |  |  |  |  |  |
|  | 1100 | 0.098 |  |  |  |  |  |  |  |
|  | 1134 |  |  |  |  |  |  |  |  |
| Norway | 50 | 0.174 | 0.163 | 0.136 | 0.138 | 0.137 | 0.151 | 0.126 | 0.141 |
|  | 100 | 0.144 | 0.137 | 0.120 | 0.120 | 0.121 | 0.127 |  | 0.119 |
|  | 200 | 0.128 | 0.120 |  |  |  |  |  |  |
|  | 350 | 0.120 |  |  |  |  |  |  |  |
|  | 500 |  |  |  |  |  |  |  |  |
RMSE, root mean square error; TTO, time tradeoff.
r indicates a model with a random intercept. Values in bold indicate a mean RMSE ≤ the mean RMSE for the 20-parameter TTO-only (T20) model at the maximum sample size.
For the Netherlands at the maximum sample size (∼990), n = 989 for TTO-only models and n = 992 for hybrid models.
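The diminishing return from each doubling of the sample size can be read directly off the T20 column. The snippet below is an illustrative calculation on the tabulated means for sample sizes 50, 100, and 200 only; the variable and function names are hypothetical, and the differences are computed from rounded table values.

```python
# Mean RMSE for the T20 model (TTO only, 20 parameters, no random intercept),
# copied from the table rows for sample sizes 50, 100, and 200.
T20_RMSE = {
    "Netherlands": {50: 0.178, 100: 0.136, 200: 0.112},
    "United States": {50: 0.209, 100: 0.159, 200: 0.128},
    "Norway": {50: 0.174, 100: 0.144, 200: 0.128},
}


def change_per_doubling(rmse_by_n):
    """RMSE change between consecutive sample sizes that differ by a factor of 2."""
    sizes = sorted(rmse_by_n)
    return {
        (small, large): round(rmse_by_n[large] - rmse_by_n[small], 3)
        for small, large in zip(sizes, sizes[1:])
        if large == 2 * small
    }


for study, rmse_by_n in T20_RMSE.items():
    print(study, change_per_doubling(rmse_by_n))
```

In each study the drop from 100 to 200 respondents is smaller than the drop from 50 to 100, which is the pattern the conclusions describe for larger samples.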
Figure 1. Root mean square error (RMSE) by sample size for variants of time tradeoff (TTO) data only and TTO + discrete choice experiment hybrid models. RMSE calculated from the predicted values compared with censored mean observed values for states included for direct valuation. The black dashed line indicates the mean RMSE for the 20-parameter model without a random intercept at the maximum sample size.
Figure 2. Root mean square error (RMSE) by sample size for variants of the 20-parameter additive and 8-parameter multiplicative models. RMSE calculated from predicted values compared with the censored mean observed values for states included for direct valuation. The black dashed line indicates the mean RMSE for the time tradeoff–only model without a random intercept at the maximum sample size.
Results from the Linear Regression Model Estimating the Effect of Doubling Sample Size (Binary Logarithm [log2] of Sample Size), using an 8-Parameter Model (8 Parameter), Adding a Random Intercept (Random), and including Discrete Choice Experiment Responses in a Hybrid Model (Hybrid) on Root Mean Square Error (RMSE), Overall and Per Study
| Dependent variable: RMSE, estimate (SE) | Overall | Netherlands | United States | Norway |
|---|---|---|---|---|
| log2(sample size) | −0.012 | −0.011 | −0.013 | −0.011 |
| 8 parameter | −0.007 | −0.008 | −0.005 | −0.010 |
| Random | −0.003 | −0.001 | −0.006 | −0.002 |
| Hybrid | −0.008 | −0.009 | −0.007 | −0.007 |
| Netherlands | 0.204 |  |  |  |
| Norway | 0.219 |  |  |  |
| United States | 0.218 |  |  |  |
| Constant |  | 0.196 | 0.227 | 0.212 |
| Observations | 432,000 | 160,000 | 184,000 | 88,000 |
| R² | 0.975 | 0.532 | 0.440 | 0.421 |
| Adjusted R² | 0.975 | 0.532 | 0.440 | 0.421 |
| Residual standard error | 0.016 |  |  |  |
| F statistic | 2,380,826.000 |  |  |  |
*P < 0.1; **P < 0.05; ***P < 0.01.
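As a rough reading aid, the overall-model coefficients can be turned into fitted RMSE values and into a sample-size equivalent for the modeling choices. The sketch below assumes the overall model is additive in the listed terms with the three study terms acting as intercepts; the function names are hypothetical, and the outputs are arithmetic on the rounded coefficients rather than results reported by the authors.

```python
import math

# Rounded coefficients from the overall column of the regression table.
STUDY_INTERCEPT = {"Netherlands": 0.204, "Norway": 0.219, "United States": 0.218}
B_LOG2_N = -0.012   # effect of doubling the sample size
B_EIGHT = -0.007    # 8-parameter instead of 20-parameter model
B_RANDOM = -0.003   # adding a random intercept
B_HYBRID = -0.008   # adding DCE responses (hybrid model)


def fitted_rmse(study, n, eight_param=False, random_intercept=False, hybrid=False):
    """Fitted mean RMSE under the additive linear model (assumed form)."""
    return (
        STUDY_INTERCEPT[study]
        + B_LOG2_N * math.log2(n)
        + B_EIGHT * eight_param
        + B_RANDOM * random_intercept
        + B_HYBRID * hybrid
    )


def equivalent_sample_factor(effect):
    """Sample-size multiplier with the same fitted RMSE change as `effect`."""
    return 2 ** (effect / B_LOG2_N)


combined = B_EIGHT + B_RANDOM + B_HYBRID
print(round(fitted_rmse("Netherlands", 500), 3))
print(round(equivalent_sample_factor(combined), 2))
```

Under these assumptions, the three modeling choices together shift the fitted RMSE by −0.018, the same change as 1.5 doublings of the sample size, or a multiplier of about 2.8.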