| Literature DB >> 20799344 |
Borislava Mihaylova, Andrew Briggs, Anthony O'Hagan, Simon G Thompson.
Abstract
We review statistical methods for analysing healthcare resource use and costs, their ability to address skewness, excess zeros, multimodality and heavy right tails, and their ease for general use. We aim to provide guidance on analysing resource use and costs focusing on randomised trials, although methods often have wider applicability. Twelve broad categories of methods were identified: (I) methods based on the normal distribution, (II) methods following transformation of data, (III) single-distribution generalized linear models (GLMs), (IV) parametric models based on skewed distributions outside the GLM family, (V) models based on mixtures of parametric distributions, (VI) two (or multi)-part and Tobit models, (VII) survival methods, (VIII) non-parametric methods, (IX) methods based on truncation or trimming of data, (X) data components models, (XI) methods based on averaging across models, and (XII) Markov chain methods. Based on this review, our recommendations are that, first, simple methods are preferred in large samples where the near-normality of sample means is assured. Second, in somewhat smaller samples, relatively simple methods, able to deal with one or two of the above data characteristics, may be preferable, but checking sensitivity to assumptions is necessary. Finally, some more complex methods hold promise, but are relatively untried; their implementation requires substantial expertise and they are not currently recommended for wider applied work.
Year: 2010 PMID: 20799344 PMCID: PMC3470917 DOI: 10.1002/hec.1653
Source DB: PubMed Journal: Health Econ ISSN: 1057-9230 Impact factor: 3.046
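As an illustration of the two-part approach (category VI in the abstract), which models the probability of incurring any cost separately from the level of cost among users, the following is a minimal numpy sketch on simulated data. The simulated distribution, sample size, and all variable names are our own illustrative assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated trial-arm costs: ~30% zeros (non-users), positives with a
# heavy right tail (lognormal) -- illustrative assumptions only.
n = 500
is_user = rng.random(n) < 0.7
costs = np.where(is_user, rng.lognormal(mean=7.0, sigma=1.2, size=n), 0.0)

# Part 1: probability of incurring any cost.
p_any = (costs > 0).mean()

# Part 2: mean cost among participants with positive costs.
mean_positive = costs[costs > 0].mean()

# Recombine: E[cost] = Pr(cost > 0) * E[cost | cost > 0].
overall_mean = p_any * mean_positive
```

With these simple (unadjusted) estimators, the recombined estimate equals the overall sample mean exactly; in practice each part is typically a regression model (e.g. logistic for part 1, a GLM for part 2), which is where the footnoted caveats about back-transformation and sample size apply.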
Figure 1. Flow chart of selection of papers into the review.
Summary characteristics of the reviewed methods for modelling cost and resource use data in moderate-size data sets
(Columns 2–5 rate how well each approach handles features of the data; columns 6–10 rate features of the method itself.)

| Analytical approach | Skewness | Heavy tails | Excess zeros | Multimodality | Testing for cost difference | Covariate adjustment | Analysis on original scale (no back-transformation needed) | Works with small samples | Ease of implementation |
|---|---|---|---|---|---|---|---|---|---|
| I. Methods based on the normal distribution | L | L | L | L | H | H | H | L | H |
| II. Methods following transformation of data | H | H | L | L | P | H | L | P | P |
| III. Single-distribution generalized linear models (GLMs) | H | P | L | L | H | H | H | P–L | H |
| IV. Parametric models based on skewed distributions outside the GLM family | H | P | L | L | H | H | H | P–L | P |
| V. Models based on mixture of parametric distributions | H | H | P | H | P | H | H | P | L |
| VI. Two-part and hurdle models | H | P–H | H | L | P | H | L–H | P–L | P–H |
| VII. Survival (or duration) methods | |||||||||
| (i) Semi-parametric Cox and parametric Weibull proportional hazards model | H | H | H | P | P | H | H | P | H |
| (ii) Aalen additive hazard model | H | H | L | P | P | P | H | L | L |
| VIII. Non-parametric methods: | |||||||||
| (i) Central Limit Theorem and Bootstrap methods | P | P | L | L | H | H | H | P | H |
| (ii) Non-parametric modified estimators based on pivotal statistic or Edgeworth expansion | H | P | L | L | P | L | H | P | H |
| (iii) Non-parametric density approximation | H | H | H | H | P | H | H | L | P–L |
| (iv) Quantile-based smoothing | H | H | H | H | H | H | P | L | L |
| IX. Methods based on data trimming | L | L | L | L | L | L | H | P–L | H |
| X. Data components models | H | P | P | H | P | H | H | P–L | L |
| XI. Model averaging | H | H | P | H | P | H | H | L | L |
| XII. Markov chain methods | H | H | P | H | P | P | H | L | L |
H, High applicability; P, Possible applicability; L, Low applicability. Where two ratings are given for one cell, applicability depends on the specific model or sample size (see notes below).
'Small sample' refers to a few tens to a few hundred participants.
More complex GLMs require large sample sizes, as does checking parametric modelling assumptions.
The ability to model heavy tails depends on the model used in the second part. If the second-part model is from Category II ('Methods following transformation of data'), back-transformation will be needed. Checking modelling assumptions needs a reasonable sample size. Ease of implementation depends on the models used in the two parts.
These approaches are not available in standard statistical software.
Depends on the data; in small samples, opportunities to check parametric model assumptions are restricted.
Models beyond those relying on multivariate normality will need large data sets.
More detail is provided in the review templates in the web appendix at http://www.herc.ox.ac.uk/downloads/support_pub.
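The non-parametric bootstrap (category VIII(i)), which the table rates as highly applicable for testing cost differences, can be sketched as a percentile confidence interval for the difference in arithmetic mean costs between two trial arms. This is a minimal numpy sketch on simulated data, not code from the review; the lognormal distributions, sample sizes, and function name are our own illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated right-skewed costs for two trial arms (illustrative only).
arm_a = rng.lognormal(mean=7.0, sigma=1.5, size=200)
arm_b = rng.lognormal(mean=7.2, sigma=1.5, size=200)

def bootstrap_mean_diff_ci(x, y, n_boot=5000, alpha=0.05, rng=rng):
    """Percentile bootstrap CI for the difference in arithmetic mean costs."""
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        # Resample each arm with replacement and record the mean difference.
        xb = rng.choice(x, size=x.size, replace=True)
        yb = rng.choice(y, size=y.size, replace=True)
        diffs[b] = yb.mean() - xb.mean()
    lo, hi = np.quantile(diffs, [alpha / 2, 1 - alpha / 2])
    return lo, hi

lo, hi = bootstrap_mean_diff_ci(arm_a, arm_b)
```

Inference stays on the original cost scale with no parametric distributional assumption, which is why the table rates this approach H for testing cost differences and for analysis on the original scale, while its handling of excess zeros and multimodality remains limited.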