Mona Aghdaee, Bonny Parkinson, Kompal Sinha, Yuanyuan Gu, Rajan Sharma, Emma Olin, Henry Cutler.
Abstract
Non-preference-based patient-reported outcome measures (PROMs) are popular in health outcomes research. These measures, however, cannot be used to estimate health state utilities, limiting their usefulness for economic evaluations. Mapping PROMs to a multi-attribute utility instrument is one solution. While mapping is commonly conducted using econometric techniques, failing to specify the complex interactions between variables may lead to inaccurate prediction of utilities, resulting in inaccurate estimates of cost-effectiveness and suboptimal funding decisions. These issues can be addressed using machine learning. This paper evaluates the use of machine learning as a mapping tool. We adopt a comprehensive approach to compare six machine learning techniques with eight econometric techniques for mapping the Patient-Reported Outcomes Measurement Information System Global Health 10 (PROMIS-GH10) to the EuroQol five-dimension, five-level instrument (EQ-5D-5L). Using data collected from 2015 Australian respondents, we find the least absolute shrinkage and selection operator (LASSO) model outperformed all other machine learning techniques and the adjusted limited dependent variable mixture model (ALDVMM) outperformed all other econometric techniques, with the LASSO performing better than the ALDVMM. The variable selection feature of the LASSO was then used to enhance the performance of the ALDVMM in a hybrid model. Our analysis identifies the potential benefits and challenges of using machine learning techniques for mapping and offers important insights for future research.
Keywords: EQ-5D; PROMIS; econometrics; machine learning; mapping; utility
Year: 2022 PMID: 35704682 PMCID: PMC9545032 DOI: 10.1002/hec.4503
Source DB: PubMed Journal: Health Econ ISSN: 1057-9230 Impact factor: 2.395
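As a rough illustration of the direct-mapping pipeline the tables below report, a LASSO regression predicting EQ-5D-5L utilities from PROMIS-GH10 responses, scored by 10-fold cross-validation with predictions above the utility ceiling truncated at 1, the following Python sketch uses synthetic data. The data-generating process and variable names are illustrative assumptions, not the study's dataset or code.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import KFold, cross_val_predict

rng = np.random.default_rng(0)

# Synthetic stand-in for the survey data: ten PROMIS-GH10 item scores per
# respondent and an EQ-5D-5L utility bounded on [-0.43, 1].
n = 500
X = rng.integers(1, 6, size=(n, 10)).astype(float)
w = rng.normal(0.0, 0.02, size=10)
y = np.clip(0.4 + X @ w + rng.normal(0.0, 0.05, size=n), -0.43, 1.0)

# LASSO with its penalty tuned internally, evaluated out-of-fold
# by 10-fold cross-validation.
model = LassoCV(cv=10, random_state=0)
pred = cross_val_predict(model, X, y, cv=KFold(10, shuffle=True, random_state=0))

# Predictions above the utility ceiling are truncated at 1,
# mirroring the "after truncation" columns in Table 2.
pred_trunc = np.minimum(pred, 1.0)
mae = mean_absolute_error(y, pred_trunc)
mse = mean_squared_error(y, pred_trunc)
print(f"MAE={mae:.4f}  MSE={mse:.4f}")
```

The MAE and MSE computed here correspond to the summary statistics reported in the mapping tables, though their magnitudes depend entirely on the synthetic data above.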
Descriptive statistics
| Variables | General population survey |
|---|---|
| Age (years) | |
| Mean (SD) | 48.31 (17.79) |
| Range | 18–89 |
| Female (%) | 53.40% |
| EQ‐5D‐5L utilities | |
| Mean (SD) | 0.82 (0.25) |
| Range | −0.43 to 1 |
| Utilities <0, n (%) | 38 (1.89%) |
| Utilities = 1, n (%) | 440 (21.84%) |
| Utilities >0.9, n (%) | 1120 (55.58%) |
| PROMIS‐GH10 | |
| Physical score, mean (SD) | 14.21 (2.87) |
| Mental score, mean (SD) | 13.22 (3.45) |
| No. of observations | 2015 |
Abbreviation: SD, standard deviation.
Predicted statistics summary mapping PROMIS‐GH10 to EQ‐5D‐5L
| Models | MAE (before truncation) | MAE (after truncation) | MSE (before truncation) | MSE (after truncation) | Mean (after truncation) | Minimum | Maximum (before truncation) | Maximum (after truncation) | % of observations predicted >1 before truncation |
|---|---|---|---|---|---|---|---|---|---|
| Actual | | | | | 0.820901 | −0.426230 | 1 | 1 | |
| Econometric models, direct mapping | |||||||||
| Explanatory variable set 1 | |||||||||
| Linear regression | 0.142246 | 0.137432 | 0.044354 | 0.042543 | 0.832334 | 0.267356 | 1.198189 | 1 | 5.11% |
| Tobit | 0.131474 | 0.131474 | 0.040423 | 0.040423 | 0.829745 | 0.203345 | 0.964564 | 0.964564 | |
| Median regression | 0.126732 | 0.126144 | 0.040256 | 0.039332 | 0.838288 | 0.181222 | 1.076223 | 1 | 5.26% |
| GLM | 0.135323 | 0.135323 | 0.042167 | 0.042167 | 0.839760 | 0.325465 | 0.985476 | 0.985476 | |
| CLAD | 0.139422 | 0.136421 | 0.044545 | 0.042567 | 0.835377 | 0.173245 | 1.377532 | 1 | 5.46% |
| Betamix | 0.137780 | 0.137780 | 0.040465 | 0.040465 | 0.830632 | 0.123434 | 0.956323 | 0.956323 | |
| ALDVMM | 0.135323 | 0.135323 | 0.038232 | 0.038232 | 0.830053 | 0.111389 | 0.968134 | 0.968134 | |
| Explanatory variable set 2 | |||||||||
| Linear regression | 0.138100 | 0.130477 | 0.043210 | 0.042005 | 0.832442 | 0.253322 | 1.143212 | 1 | 5.06% |
| Tobit | 0.126243 | 0.126243 | 0.042901 | 0.042901 | 0.833654 | 0.196564 | 0.973114 | 0.973114 | |
| Median regression | 0.125325 | 0.124466 | 0.042132 | 0.038965 | 0.835564 | 0.186231 | 1.032231 | 1 | 5.11% |
| GLM | 0.129445 | 0.129445 | 0.041543 | 0.041543 | 0.834412 | 0.294223 | 0.985234 | 0.985234 | |
| CLAD | 0.135165 | 0.129321 | 0.044532 | 0.041345 | 0.834117 | 0.165564 | 1.144556 | 1 | 5.16% |
| Betamix | 0.121943 | 0.121943 | 0.037553 | 0.037553 | 0.830987 | 0.117326 | 0.975344 | 0.975344 | |
| ALDVMM | 0.120387 | 0.120387 | 0.036890 | 0.036890 | 0.829922 | 0.116745 | 0.977111 | 0.977111 | |
| Explanatory variable set 3 | |||||||||
| Linear regression | 0.105861 | 0.105061 | 0.034974 | 0.034195 | 0.819438 | −0.283661 | 1.013015 | 1 | 4.96% |
| Tobit | 0.103923 | 0.103923 | 0.030912 | 0.030912 | 0.817443 | −0.243432 | 0.986097 | 0.986097 | |
| Median regression | 0.101734 | 0.099122 | 0.029041 | 0.028455 | 0.829874 | −0.410107 | 1.020771 | 1 | 5.01% |
| GLM | 0.106531 | 0.106531 | 0.031326 | 0.031326 | 0.817477 | −0.296354 | 0.988065 | 0.988065 | |
| CLAD | 0.108825 | 0.107047 | 0.035533 | 0.033462 | 0.829588 | −0.333890 | 1.030432 | 1 | 5.31% |
| Betamix | 0.096645 | 0.096645 | 0.026508 | 0.026508 | 0.820799 | −0.353980 | 0.988395 | 0.988395 | |
| ALDVMM | 0.095826 | 0.095826 | 0.025877 | 0.025877 | 0.820902 | −0.367103 | 0.988465 | 0.988465 | |
| Explanatory variable set 4 | |||||||||
| Linear regression | 0.109855 | 0.107442 | 0.036302 | 0.035441 | 0.820402 | −0.285332 | 1.039458 | 1 | 5.01% |
| Tobit | 0.105336 | 0.105336 | 0.033271 | 0.033271 | 0.814437 | −0.242088 | 0.986098 | 0.986098 | |
| Median regression | 0.103902 | 0.101391 | 0.031441 | 0.030102 | 0.830179 | −0.375063 | 1.021416 | 1 | 5.21% |
| GLM | 0.107401 | 0.107401 | 0.032052 | 0.032052 | 0.816418 | −0.287088 | 0.988033 | 0.988033 | |
| CLAD | 0.110184 | 0.108371 | 0.035336 | 0.034298 | 0.829330 | −0.318164 | 1.089408 | 1 | 5.26% |
| Betamix | 0.100066 | 0.100066 | 0.029044 | 0.029044 | 0.819360 | −0.355600 | 0.988022 | 0.988022 | |
| ALDVMM | 0.987012 | 0.987012 | 0.027421 | 0.027421 | 0.819057 | −0.366211 | 0.988195 | 0.988195 | |
| Machine learning, direct mapping | |||||||||
| CART (regression trees) | 0.126756 | 0.126756 | 0.048433 | 0.048433 | 0.812054 | −0.111331 | 0.981242 | 0.981242 | |
| Random forests | 0.111418 | 0.111418 | 0.037371 | 0.037371 | 0.818166 | −0.202419 | 0.998012 | 0.998012 | |
| Bagged CART | 0.112339 | 0.112339 | 0.041446 | 0.041446 | 0.817192 | −0.196299 | 0.991002 | 0.991002 | |
| NN | 0.107195 | 0.107195 | 0.033278 | 0.033278 | 0.818389 | −0.245290 | 0.992866 | 0.992866 | |
| QRNN | 0.104027 | 0.104027 | 0.031190 | 0.031190 | 0.819744 | −0.300812 | 0.997521 | 0.997521 | |
| LASSO 1 | 0.095523 | 0.095523 | 0.025323 | 0.025323 | 0.820901 | −0.399345 | 0.998733 | 0.998733 | |
| LASSO 2 | 0.101939 | 0.101939 | 0.029339 | 0.029339 | 0.810058 | −0.432911 | 0.964977 | 0.964977 | |
| Econometric models, indirect mapping | |||||||||
| GLOGIT | 0.107066 | 0.107066 | 0.029267 | 0.029267 | 0.836044 | −0.281108 | 1 | 1 | |
| Machine learning, indirect mapping | |||||||||
| CART (classification trees) | 0.118269 | 0.118269 | 0.041493 | 0.041493 | 0.860133 | −0.190286 | 1 | 1 | |
| Random forests | 0.107251 | 0.107251 | 0.031279 | 0.031279 | 0.843662 | −0.235079 | 1 | 1 | |
| Bagged CART | 0.111491 | 0.111491 | 0.032466 | 0.032466 | 0.846118 | −0.222931 | 1 | 1 | |
| NN | 0.104729 | 0.104729 | 0.030422 | 0.030422 | 0.831362 | −0.260450 | 1 | 1 | |
| LASSO 1 | 0.104419 | 0.104419 | 0.030680 | 0.030680 | 0.830096 | −0.355210 | 1 | 1 | |
Note: Results were obtained from 10‐fold cross‐validation. Explanatory variables for set 1: the physical and mental health summary scores of the PROMIS‐GH10 (as continuous variables), age, age squared, and sex; set 2: the PROMIS‐GH10 items, age, age squared, and sex; set 3: the PROMIS‐GH10 items (as categorical variables), age, age squared, and sex; set 4: the PROMIS‐GH10 items, age, and sex, all as categorical variables. LASSO 1: the LASSO technique is used for prediction; only the explanatory variables (without interactions) are considered. LASSO 2: the LASSO technique is used for prediction; the explanatory variables and their two‐way interactions are considered.
Abbreviations: ALDVMM, adjusted limited dependent variable mixture model; Betamix, mixture beta regression model; CLAD, censored least absolute deviation; GLM, generalized linear model; GLOGIT, generalized logistic regression; LASSO, least absolute shrinkage and selection operator; MAE, mean absolute error; MSE, mean squared error; NN, neural networks; PROMIS‐GH10, PROMIS short form Global Health 10; QRNN, quantile (median) regression neural networks.
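The distinction between LASSO 1 and LASSO 2 is the design matrix: LASSO 2 augments the explanatory variables with all of their two-way interactions before fitting. A minimal sketch of building such an expanded design, with an illustrative 12-column base matrix (ten item responses plus age and a female indicator) that is an assumption rather than the study's exact specification:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)

# Illustrative base design: ten PROMIS-GH10 item responses, age, female dummy.
X = np.column_stack([
    rng.integers(1, 6, size=(200, 10)),   # item responses (levels 1-5)
    rng.integers(18, 90, size=(200, 1)),  # age in years
    rng.integers(0, 2, size=(200, 1)),    # female indicator
]).astype(float)

# interaction_only=True keeps the 12 main effects plus every pairwise
# product, without adding squared terms of individual variables.
poly = PolynomialFeatures(degree=2, interaction_only=True, include_bias=False)
X2 = poly.fit_transform(X)

# 12 main effects + C(12, 2) = 66 two-way interactions = 78 columns.
print(X2.shape)
```

The L1 penalty then decides which of these 78 candidate terms survive, which is why LASSO 2's predictions can differ noticeably from LASSO 1's in the table above.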
FIGURE 1 Distribution of the observed versus predicted utilities using the econometric techniques. ALDVMM, adjusted limited dependent variable mixture model; Betamix, mixture beta regression model; CLAD, censored least absolute deviation; GLM, generalized linear model; GLOGIT, generalized logistic regression; MR, median regression
FIGURE 2 Distribution of the observed versus predicted utilities using direct mapping with machine learning techniques. LASSO 1: the LASSO technique is used for prediction; only the explanatory variables (without interactions) are considered. LASSO 2: the LASSO technique is used for prediction; the explanatory variables and their two‐way interactions are considered. LASSO, least absolute shrinkage and selection operator; NN, neural networks; QRNN, quantile (median) regression neural networks
FIGURE 3 Distribution of the observed versus predicted utilities using indirect mapping with machine learning techniques. LASSO 1: the LASSO technique is used for prediction; only the explanatory variables (without interactions) are considered. LASSO, least absolute shrinkage and selection operator; NN, neural networks; QRNN, quantile (median) regression neural networks
Performance of hybrid models
| Models | MAE | Rank in MAE | MSE | Rank in MSE | Mean | Rank in mean | Minimum | Rank in minimum | Maximum | Rank in maximum |
|---|---|---|---|---|---|---|---|---|---|---|
| Hybrid 1 | 0.096125 | 2 | 0.026310 | 3 | 0.826410 | 3 | −0.322932 | 3 | 0.998757 | 2 |
| Hybrid 2 | 0.098943 | 4 | 0.298421 | 4 | 0.815864 | 4 | −0.405733 | 1 | 0.979543 | 4 |
| LASSO 1 | 0.095993 | 1 | 0.025773 | 1 | 0.826159 | 1 | −0.347765 | 2 | 0.998831 | 1 |
| LASSO 2 | 0.995188 | 5 | 0.029542 | 5 | 0.810641 | 5 | −0.449753 | 5 | 0.969521 | 5 |
| ALDVMM | 0.096341 | 3 | 0.026052 | 2 | 0.826335 | 2 | −0.306850 | 4 | 0.988367 | 3 |
| Actual observations in the validation sample (50% of dataset) | | | | | 0.826099 | | −0.426230 | | 1 | |
Note: Hybrid 1: explanatory variables (without interactions) are selected by LASSO and the ALDVMM is re‐estimated with the selected variables. Hybrid 2: explanatory variables (variables and their two‐way interactions) are selected by LASSO and the ALDVMM is re‐estimated with the selected variables.
Abbreviations: ALDVMM, adjusted limited dependent variable mixture model; LASSO, least absolute shrinkage and selection operator.
These three models were re‐estimated using a 50% estimation and 50% validation sample to be comparable with the hybrid models; thus the statistics differ from those previously reported in Table 2.
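The hybrid approach described in the note above has two stages: LASSO selects the explanatory variables that keep nonzero coefficients, and the ALDVMM is then re-estimated on only those variables. The ALDVMM itself is typically fitted in Stata rather than Python, so the sketch below covers only the selection stage, on illustrative synthetic data, and simply prints the column indices that would be handed to the mixture model:

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(2)

# Illustrative data: 40 candidate regressors, only columns 0 and 3 predictive.
X = rng.normal(size=(300, 40))
y = 0.8 + 0.10 * X[:, 0] - 0.05 * X[:, 3] + rng.normal(0.0, 0.05, size=300)

# Stage 1: LASSO with cross-validated penalty shrinks irrelevant
# coefficients exactly to zero.
lasso = LassoCV(cv=10, random_state=0).fit(X, y)

# Variables surviving the L1 penalty; in the hybrid models these columns
# would be passed to the ALDVMM for re-estimation (stage 2, not shown).
selected = np.flatnonzero(lasso.coef_)
print(selected)
```

With the signal concentrated in a few columns, the selected set is far smaller than the candidate set, which is the point of the hybrid: the mixture model is refitted on a parsimonious specification.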
Mapping onto Preference‐based measures reporting Standards (MAPS) checklist
| Section/topic | Item no. | Recommendation | Reported on page no. |
|---|---|---|---|
| Title and abstract | |||
| Title | 1 | Identify the report as a study mapping between outcome measures. State the source measure(s) and generic, preference‐based target measure(s) used in the study. | 1 |
| Abstract | 2 | Provide a structured abstract including, as applicable: Objectives; methods, including data sources and their key characteristics, outcome measures used and estimation and validation strategies; results, including indicators of model performance; conclusions; and implications of key findings. | 1 |
| Introduction | |||
| Study rationale | 3 | Describe the rationale for the mapping study in the context of the broader evidence base. | 2–4 |
| Study objective | 4 | Specify the research question with reference to the source and target measures used and the disease or population context of the study. | 3–4 |
| Methods | |||
| Estimation sample | 5 | Describe how the estimation sample was identified, why it was selected, the methods of recruitment and data collection, and its location(s) or setting(s). | 4 |
| External validation sample | 6 | If an external validation sample was used, the rationale for selection, the methods of recruitment and data collection, and its location(s) or setting(s) should be described. | NA |
| Source and target measures | 7 | Describe the source and target measures and the methods by which they were applied in the mapping study. | 4 |
| Exploratory data analysis | 8 | Describe the methods used to assess the degree of conceptual overlap between the source and target measures. | 8 |
| Missing data | 9 | State how much data were missing and how missing data were managed in the sample(s) used for the analyses. | NA |
| Modeling approaches | 10 | Describe and justify the statistical model(s) used to develop the mapping algorithm. | 5–8 |
| Estimation of predicted scores or utilities | 11 | Describe how predicted scores or utilities are estimated for each model specification. | 5–8 |
| Validation methods | 12 | Describe and justify the methods used to validate the mapping algorithm. | 5–8 |
| Measures of model performance | 13 | State and justify the measure(s) of model performance that determine the choice of the preferred model(s) and describe how these measures were estimated and applied. | 4 |
| Results | |||
| Final sample size(s) | 14 | State the size of the estimation sample and any validation sample(s) used in the analyses (including both number of individuals and number of observations). | 8 |
| Descriptive information | 15 | Describe the characteristics of individuals in the sample(s) (or refer back to previous publications giving such information). Provide summary scores for source and target measures, and summarize results of analyses used to assess overlap between the source and target measures. | 8–9 |
| Model selection | 16 | State which model(s) is(are) preferred and justify why this(these) model(s) was(were) chosen. | 9–15 |
| Model coefficients | 17 | Provide all model coefficients and standard errors for the selected model(s). Provide clear guidance on how a user can calculate utility scores based on the outputs of the selected model(s). | Appendix |
| Uncertainty | 18 | Report information that enables users to estimate standard errors around mean utility predictions and individual‐level variability. | Appendix |
| Model performance and face validity | 19 | Present results of model performance, such as measures of prediction accuracy and fit statistics for the selected model(s) in a table or in the text. Provide an assessment of face validity of the selected model(s). | Tables |
| Discussion | |||
| Comparisons with previous studies | 20 | Report details of previously published studies developing mapping algorithms between the same source and target measures and describe differences between the algorithms, in terms of model performance, predictions and coefficients, if applicable. | 15–16 |
| Study limitations | 21 | Outline the potential limitations of the mapping algorithm. | 16–17 |
| Scope of applications | 22 | Outline the clinical and research settings in which the mapping algorithm could be used. | 15–17 |
| Other | |||
| Additional information | 23 | Describe the source(s) of funding and non‐monetary support for the study, and the role of the funder(s) in its design, conduct and report. Report any conflicts of interest surrounding the roles of authors and funders. | 17 |
Abbreviation: NA, not applicable.
Mapping to estimate health‐state utility from non‐preference‐based outcome measures: An ISPOR good practices for outcomes research task force report
| Recommendation | Reported |
|---|---|
| 1. Describe relevant differences between data sets that are candidates for mapping estimation. | Only one dataset was used, which was collected for the purpose of this mapping study. |
| 2. Give full details of the selected data set. Describe how the study was run and patients were sampled. Provide baseline and follow‐up characteristics including the distribution of patients' disease severity. Missingness in the longitudinal pattern of responses should be described. | How the study was conducted and how patients were sampled is provided in Section. Data were cross‐sectional, with all questions mandatory except for the Charlson comorbidity index (CCI), which was not used in the mapping study; hence there were no missing data. |
| 3. Plot the distribution of the utility data. | Distribution of the observed versus predicted utilities presented in Figures |
| 4. Justify the type of model(s) selected with reference to the characteristics of the target utility distribution and the proposed use of the mapping function. | Justification of models selected presented in Sections |
| 5. Compare the dimensions of health covered by the target utility instrument and those covered by the explanatory clinical measure(s). | Description of instrument dimensions provided in Section |
| 6. Describe the approach to determining the final model. Include tests conducted and judgments made. | Described in Section |
| 7. Summary measures of fit are of limited value for the total sample. Provide information on fit conditional on disease severity as measured by the clinical outcome measure(s). A plot of mean predicted versus mean observed utility conditional on the clinical variable(s) should be included. | A range of summary measures are presented in Table |
| 8. Coefficient values, error term(s) distributions(s), variances, and covariances are required. | Presented in Appendix |
| 9. Provide an example predicted value for some sets of covariates. Consider providing a program that calculates predictions for user‐defined inputs. | Examples of machine learning presented in Appendix |
| 10. Parameter uncertainty in a mapping regression should be reflected using standard methods for Probabilistic Sensitivity Analysis (PSA). Assessment of model suitability for use in cost‐effectiveness analysis should also consider the distribution of utility values for PSA, with particular focus on whether these lie outside the feasible utility range for the preference based measure (PBM). | Table |
| 11. When imputing data from a mapping function, individual‐level variability should be incorporated using simulation methods and information about the distribution of the error term(s). These simulated data can be compared with the raw observed data, including an assessment of the range of values compared with the feasible range for the PBM. | Not applicable – no imputation conducted. |
| 12. Re‐estimation of mapping results in a separate data set or other forms of validation are not routinely required. | Due to the lack of data on the five‐level EQ‐5D (EQ‐5D‐5L), no external dataset was available, and only internal cross‐validation was applied in this study (mentioned in Section |
Note: Summary of reporting of mapping studies recommendations.
Coefficients and standard errors from the best performing econometric model (adjusted limited dependent variable mixture model [ALDVMM])
| Predictor variables | Component one coefficients | Standard errors | Component two coefficients | Standard errors |
|---|---|---|---|---|
| PROMIS‐GH10 Q1 | ||||
| Level‐1 | −0.385959 | 0.0207294 | 0.0925135 | 0.0794249 |
| Level‐2 | −0.0009395 | 0.0111615 | 0.018756 | 0.0642693 |
| Level‐3 | 0.0027596 | 0.0095323 | 0.0402084 | 0.0565958 |
| Level‐4 | −0.00748 | 0.0082035 | 0.0325671 | 0.0520424 |
| PROMIS‐GH10 Q2 | ||||
| Level‐1 | −0.0585196 | 0.0139972 | −0.0794948 | 0.0714324 |
| Level‐2 | −0.0070622 | 0.0093945 | −0.0291088 | 0.0569722 |
| Level‐3 | −0.0066033 | 0.0082136 | −0.0320562 | 0.0509567 |
| Level‐4 | −0.0058776 | 0.0072301 | −0.0638011 | 0.0457145 |
| PROMIS‐GH10 Q3 | ||||
| Level‐1 | −0.0493412 | 0.0190513 | −0.0260497 | 0.0764026 |
| Level‐2 | −0.0097279 | 0.0104587 | 0.002096 | 0.0630417 |
| Level‐3 | 0.0061863 | 0.0095075 | 0.0626543 | 0.0583958 |
| Level‐4 | 0.009719 | 0.0082286 | 0.1288012 | 0.0550993 |
| PROMIS‐GH10 Q4 | ||||
| Level‐1 | −0.0192356 | 0.0109768 | −0.1073274 | 0.0579867 |
| Level‐2 | −0.0156022 | 0.0076756 | 0.0167262 | 0.0484853 |
| Level‐3 | −0.0012259 | 0.0065005 | 0.0983137 | 0.0424271 |
| Level‐4 | −0.0038548 | 0.0056822 | 0.0472543 | 0.0368971 |
| PROMIS‐GH10 Q5 | ||||
| Level‐1 | −0.0055637 | 0.0098399 | −0.0332379 | 0.059465 |
| Level‐2 | −0.0088215 | 0.0081152 | 0.0007317 | 0.0532768 |
| Level‐3 | −0.0104855 | 0.0074152 | 0.0262982 | 0.0486364 |
| Level‐4 | −0.0085666 | 0.0066697 | 0.0076209 | 0.0428807 |
| PROMIS‐GH10 Q6 | ||||
| Level‐1 | 0.5729949 | 0.051003 | −0.4978277 | 0.0734452 |
| Level‐2 | −0.0321034 | 0.0131572 | −0.2048257 | 0.0383186 |
| Level‐3 | −0.0295483 | 0.0063002 | −0.0573564 | 0.029585 |
| Level‐4 | −0.0185259 | 0.0045397 | −0.0534656 | 0.0284769 |
| PROMIS‐GH10 Q7 | ||||
| Level‐1 | −0.0359491 | 0.0058377 | −0.0286145 | 0.048754 |
| Level‐2 | −0.0489133 | 0.006043 | −0.1483418 | 0.0453403 |
| Level‐3 | −0.0557849 | 0.0062876 | −0.2193954 | 0.0464523 |
| Level‐4 | −0.0648232 | 0.0073438 | −0.2431705 | 0.0518035 |
| Level‐5 | −0.0737154 | 0.0080798 | −0.2560148 | 0.0444581 |
| Level‐6 | −0.0957117 | 0.0088171 | −0.2982205 | 0.046128 |
| Level‐7 | −0.1336154 | 0.0096407 | −0.3547439 | 0.04672 |
| Level‐8 | −0.583629 | 0.0197676 | −0.4905812 | 0.0504853 |
| Level‐9 | −0.1927814 | 0.0229887 | −0.7341895 | 0.0704952 |
| Level‐10 | −1.099365 | 0.0245806 | −0.4552999 | 0.0904497 |
| PROMIS‐GH10 Q8 | ||||
| Level‐1 | −0.0309681 | 0.0143363 | −0.13651 | 0.075945 |
| Level‐2 | −0.0336167 | 0.0089216 | −0.2188279 | 0.0544924 |
| Level‐3 | −0.0107356 | 0.0065049 | −0.169407 | 0.0489464 |
| Level‐4 | −0.0088674 | 0.0061089 | −0.1161397 | 0.0479181 |
| PROMIS‐GH10 Q9 | ||||
| Level‐1 | −0.013529 | 0.0160572 | −0.0730294 | 0.0648042 |
| Level‐2 | 0.0015722 | 0.0087684 | 0.01400000 | 0.0536675 |
| Level‐3 | 0.0013122 | 0.0072035 | −0.0279388 | 0.0471441 |
| Level‐4 | 0.0017452 | 0.0063561 | 0.0180966 | 0.0431105 |
| PROMIS‐GH10 Q10 | ||||
| Level‐1 | −0.0165344 | 0.0130433 | −0.2807613 | 0.0587058 |
| Level‐2 | −0.0495842 | 0.0073591 | −0.2297149 | 0.0439222 |
| Level‐3 | −0.0330347 | 0.0059642 | −0.0321068 | 0.0374949 |
| Level‐4 | −0.0139751 | 0.0054585 | 0.0172828 | 0.0368554 |
| Age | 0.0003956 | 0.0005682 | 0.0043447 | 0.0033483 |
| Age squared | −9.99E‐06 | 5.71E‐06 | −0.0000673 | 0.0000338 |
| Female | 0.0009506 | 0.0035428 | −0.0446998 | 0.0201271 |
| Constant | 0.9745076 | 0.0168771 | 0.97559610 | 0.0993999 |
| Probability ‐Component 1 | ||||
| Constant | 0.1232367 | 0.0777262 | ||
| /lns_1 | −3.240571 | 0.0435271 | ||
| /lns_2 | −1.403876 | 0.0336056 | ||
| sigma1 | 0.0391416 | 0.0017037 | ||
| sigma2 | 0.2456429 | 0.008255 | ||
Note: PROMIS‐GH10 Q n = nth question of the PROMIS‐GH10. The algorithm is based on the ALDVMM with explanatory variable set (3), which included the PROMIS‐GH10 questions as items, age, age squared, and sex (Female = 1). For PROMIS‐GH10 Q1, Q2, Q3, Q4, Q5, Q6, Q8, Q9, and Q10 the reference level is level 5; for PROMIS‐GH10 Q7 the reference level is level 0.
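To turn the coefficient table above into a utility prediction, the ALDVMM combines two component linear indices, weighted by a membership probability obtained from a logistic transform of the component-1 equation. The sketch below is a deliberately simplified illustration of that weighting step: the component indices are hypothetical stand-ins rather than full sums over the table's dummies, and the ALDVMM's censoring adjustments at the utility bounds are omitted.

```python
import math

def aldvmm_predict(xb1: float, xb2: float, prob_const: float) -> float:
    """Probability-weighted mean of the two component linear indices.

    Simplified: ignores the ALDVMM's limited-dependent-variable
    (censoring) adjustments at the feasible utility bounds.
    """
    p1 = 1.0 / (1.0 + math.exp(-prob_const))  # logit membership model
    return p1 * xb1 + (1.0 - p1) * xb2

# Example with hypothetical component indices; the membership constant
# 0.1232367 is taken from the 'Probability - Component 1' row above.
u = aldvmm_predict(xb1=0.95, xb2=0.60, prob_const=0.1232367)
print(round(u, 4))
```

In the full algorithm each component index is its constant plus the dummy coefficients matching a respondent's item levels, age terms, and sex, so this function would be called with those computed sums.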
Coefficients and standard errors from the Hybrid 1 model
| Predictor variables | Component one coefficients | Standard errors | Component two coefficients | Standard errors |
|---|---|---|---|---|
| PROMIS‐GH10 Q1 | ||||
| Level‐1 | −0.3433060 | 0.0148387 | 0.0514095 | 0.0570408 |
| Level‐2 | −0.0293574 | 0.0090414 | 0.0184637 | 0.0470782 |
| Level‐3 | −0.0093709 | 0.0075675 | 0.0806698 | 0.0432715 |
| Level‐4 | −0.0137952 | 0.0073393 | 0.1219705 | 0.0431204 |
| PROMIS‐GH10 Q4 | ||||
| Level‐1 | −0.0216585 | 0.0121445 | −0.0596429 | 0.0550486 |
| Level‐2 | −0.0216042 | 0.0078642 | 0.0436002 | 0.0464942 |
| Level‐3 | −0.0017283 | 0.0066908 | 0.1149595 | 0.0405404 |
| Level‐4 | −0.0067162 | 0.0058649 | 0.0598039 | 0.034641 |
| PROMIS‐GH10 Q5 | ||||
| Level‐1 | −0.0112022 | 0.0100856 | −1.18E‐02 | 5.52E‐02 |
| Level‐2 | −0.0150583 | 0.0078866 | 0.0234542 | 0.0493677 |
| Level‐3 | −0.0168738 | 0.0072386 | 0.0576217 | 0.0450475 |
| Level‐4 | −0.0127415 | 0.006478 | 0.0196911 | 0.0400461 |
| PROMIS‐GH10 Q6 | ||||
| Level‐1 | −0.8875440 | 0.0476422 | −0.2675420 | 0.0653272 |
| Level‐2 | −0.0503005 | 0.0135963 | −0.2063312 | 0.0361192 |
| Level‐3 | −0.032574 | 0.0069206 | −0.0737328 | 0.0281568 |
| Level‐4 | −0.0193999 | 0.004604 | −0.0544085 | 0.0275175 |
| PROMIS‐GH10 Q7 | ||||
| Level‐1 | −0.0373753 | 0.0060229 | −0.0454425 | 0.0457808 |
| Level‐2 | −0.050612 | 0.0061428 | −0.1630828 | 0.0425605 |
| Level‐3 | −0.0583187 | 0.0063182 | −0.2345283 | 0.0439714 |
| Level‐4 | −0.0679572 | 0.0074092 | −0.2694051 | 0.0491968 |
| Level‐5 | −0.0763415 | 0.0080467 | −0.2772628 | 0.0416786 |
| Level‐6 | −0.0977318 | 0.0086558 | −0.3293773 | 0.0435022 |
| Level‐7 | −0.1341808 | 0.0097872 | −0.3876107 | 0.0447105 |
| Level‐8 | −0.1393537 | 0.0265474 | −0.6160806 | 0.0480539 |
| Level‐9 | −0.2173441 | 0.0232659 | −0.7665818 | 0.0676771 |
| Level‐10 | −0.0343796 | 0.0450647 | −0.6329282 | 0.0739894 |
| PROMIS‐GH10 Q9 | ||||
| Level‐1 | −0.0179659 | 0.0155147 | −0.0987504 | 0.0611147 |
| Level‐2 | −0.0016861 | 0.0093691 | −0.0312816 | 0.050512 |
| Level‐3 | 0.0066662 | 0.0072761 | −0.0893566 | 0.0449948 |
| Level‐4 | 0.0039562 | 0.0062173 | −0.0171913 | 0.0399247 |
| PROMIS‐GH10 Q10 | ||||
| Level‐1 | −0.0350141 | 0.01417 | −0.4081824 | 0.0529913 |
| Level‐2 | −0.0561646 | 0.0071169 | −0.3029859 | 0.0403542 |
| Level‐3 | −0.0360541 | 0.0060193 | −0.0912567 | 0.0349469 |
| Level‐4 | −0.0150028 | 0.0055922 | −0.0184979 | 0.0343078 |
| Age | 0.0004218 | 0.0005884 | 0.003671 | 0.0032549 |
| Age squared | −0.00001 | 5.89E‐06 | −0.0000534 | 0.0000328 |
| Female | 0.0001425 | 0.0036659 | −0.0395839 | 0.0194046 |
| Constant | 0.9989450 | 0.0167456 | 0.98656421 | 0.0882295 |
| Probability ‐Component 1 | ||||
| Constant | 0.0898711 | 0.0818972 | ||
| /lns_1 | −3.217258 | 0.0491279 | ||
| /lns_2 | −1.423268 | 0.0335654 | ||
| sigma1 | 0.0400648 | 0.0019683 | ||
| sigma2 | 0.2409255 | 0.0080868 | ||
Note: PROMIS‐GH10 Q n = nth question of the PROMIS‐GH10. The algorithm is based on the ALDVMM with explanatory variable set (3), which included the PROMIS‐GH10 questions as items, age, age squared, and sex (Female = 1). For PROMIS‐GH10 Q1, Q4, Q5, Q6, Q9, and Q10 the reference level is level 5; for PROMIS‐GH10 Q7 the reference level is level 0.
Goodness of fit for indirect approaches
| Mobility | Self‐care | Usual activity | Pain and discomfort | Anxiety and depression | |
|---|---|---|---|---|---|
| Indirect mapping approaches | |||||
| Glogit | 66.35% | 75.96% | 71.83% | 85.30% | 54.87% |
| CART (classification trees) | 61.70% | 72.15% | 68.54% | 82.15% | 53.52% |
| Random forests | 66.67% | 75.92% | 72.53% | 85.92% | 55.87% |
| Bagged CART | 63.91% | 73.49% | 72.49% | 83.35% | 55.87% |
| NN | 68.08% | 76.53% | 73.24% | 86.35% | 57.75% |
| LASSO 1 | 69.48% | 76.85% | 73.24% | 86.38% | 58.22% |
Note: The table presents the percentage of correctly predicted levels for each dimension of the EQ‐5D‐5L. LASSO 1: the LASSO technique is used for prediction; only the explanatory variables (without interactions) are considered.
Abbreviations: GLOGIT, generalized logistic regression; LASSO, least absolute shrinkage and selection operator; NN, neural networks.
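The "% correctly predicted" statistics above are per-dimension classification accuracies: the indirect approach predicts a level (1 to 5) for each EQ-5D-5L dimension, and a cell counts as correct when the predicted level equals the observed one. A short sketch of that calculation on illustrative data (the corruption rate and sample size are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
dims = ["Mobility", "Self-care", "Usual activity",
        "Pain and discomfort", "Anxiety and depression"]

# Illustrative observed levels (1-5) for 100 respondents x 5 dimensions,
# with predictions disagreeing on roughly 30% of cells.
observed = rng.integers(1, 6, size=(100, 5))
predicted = observed.copy()
flip = rng.random(size=(100, 5)) < 0.3
predicted[flip] = rng.integers(1, 6, size=flip.sum())

# Share of exactly matched levels, computed per dimension (column).
accuracy = (observed == predicted).mean(axis=0)
for name, acc in zip(dims, accuracy):
    print(f"{name}: {acc:.2%}")
```

After the level of each dimension is predicted this way, a value set converts the five-level profile into the utility, which is how the indirect-mapping rows in Table 2 were produced.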