The median hazard ratio: a useful measure of variance and general contextual effects in multilevel survival analysis.

Peter C Austin^1,2,3, Philippe Wagner^4,5, Juan Merlo^4,6.

Abstract

Multilevel data occurs frequently in many research areas like health services research and epidemiology. A suitable way to analyze such data is through the use of multilevel regression models (MLRM). MLRM incorporate cluster-specific random effects which allow one to partition the total individual variance into between-cluster variation and between-individual variation. Statistically, MLRM account for the dependency of the data within clusters and provide correct estimates of uncertainty around regression coefficients. Substantively, the magnitude of the effect of clustering provides a measure of the General Contextual Effect (GCE). When outcomes are binary, the GCE can also be quantified by measures of heterogeneity like the Median Odds Ratio (MOR) calculated from a multilevel logistic regression model. Time-to-event outcomes within a multilevel structure occur commonly in epidemiological and medical research. However, the Median Hazard Ratio (MHR) that corresponds to the MOR in multilevel (i.e., 'frailty') Cox proportional hazards regression is rarely used. Analogously to the MOR, the MHR is the median relative change in the hazard of the occurrence of the outcome when comparing identical subjects from two randomly selected different clusters that are ordered by risk. We illustrate the application and interpretation of the MHR in a case study analyzing the hazard of mortality in patients hospitalized for acute myocardial infarction at hospitals in Ontario, Canada. We provide R code for computing the MHR. The MHR is a useful and intuitive measure for expressing cluster heterogeneity in the outcome and, thereby, estimating general contextual effects in multilevel survival analysis.

Keywords: Median Hazard Ratio; Median Odds Ratio; clustered data; frailty models; multilevel analysis; survival analysis

Year: 2016 PMID: 27885709 PMCID: PMC5299617 DOI: 10.1002/sim.7188

Source DB: PubMed Journal: Stat Med ISSN: 0277-6715 Impact factor: 2.373

Introduction

Data with a multilevel nature occur frequently in public health, health services research, behavioral research, and in epidemiology. Examples include residents nested or clustered within neighborhoods or regions, patients nested within hospitals, and students nested within schools. The existence of multilevel information is relevant in the practice of epidemiology for both formal statistical and substantive epidemiological reasons, and a suitable way of analyzing these kinds of data is by using multilevel regression models (MLRM) 1, 2, 3, 4. From a statistical perspective, a condition for performing conventional regression analyses is the independence of the observations. Therefore, conventional analyses are inappropriate in the presence of clustering because subjects within the same cluster are more likely to have similar outcomes compared to randomly selected subjects from different clusters, and thus subjects within the same cluster are not statistically independent of one another. This lack of independence decreases the effective sample size, so that failure to account for clustering falsely increases the precision of the estimates. MLRM have been developed to fit regression models to data that have a multilevel structure 1, 2, 4, 5. Such models incorporate cluster‐specific random effects that account for the dependency of the data by partitioning the total individual variance into variation due to the clusters or ‘higher‐level units’ and the individual‐level variation that remains 6. It is important to remember that MLRM have been used for the analysis of repeated measurements within individuals. In this case, the between‐cluster (i.e., between individuals) component of variance is typically a large proportion of the total observed variation in the response variable. 
This is to be expected as the human body is a natural and homogenous system with well delimited boundaries 7, so the intra‐individual correlation of the repeated measurement is anticipated to be high. Therefore, when performing regression analyses of repeated measurements in a sample of individuals, the existence of a strong intra‐individual residual correlation can be seen as a nuisance that needs be taken into account for correct estimation of uncertainty around fixed effects. However, the situation is very different when MLRM are applied for investigating contextual effects on individual health. From an epidemiological perspective, knowing the share of the total individual variance that is attributable to the cluster level or the ‘variance partition coefficient’ (VPC) is relevant information on its own. Obviously, the higher the VPC, the more relevant the context appears to be for understanding individual health or disease outcomes 5, 6. The VPC informs on the existence of a general contextual effect (GCE), which is called ‘general’ because it reflects the influence of the cluster context as a whole, without specifying any contextual characteristic other than the very boundaries that delimit the cluster. As we have discussed elsewhere 8, quantifying the GCE is actually an implicit goal in all studies evaluating hospital performance, even in those studies that do not apply MLRM. Many such studies are based on single‐level models analyzing individual patient data with dummy variables for the hospitals or using information aggregated at the hospital level. Even when information is available at the patient, institutional, and geographical levels, assessments are typically performed at a single level by using, for instance, funnel plots, health league tables, or similar methods to compare hospitals or small geographical area averages. 
This single‐level approach is used extensively, but may provide misleading information for decision makers as it does not provide information on the size of the GCE. For evaluating hospital performance, what matters most is not the existence of differences between hospital averages in some quality indicator, but rather the share of the differences in patient outcomes that are due to the hospital level. That is, the GCE. Obviously, the higher that this share is, the more relevant the hospital context is. In health services research, and particularly within publicly funded systems, a fundamental task is the assessment of equity in health care. Equity means that health care is provided on equal terms and according to needs, regardless of the location of the patient or of the hospital in which they are treated. In other words, after adjusting for possible differences in case‐mix between health care providers, an equitable and well‐functioning healthcare system should result in comparable outcomes across different healthcare providers following provision of medical or surgical care. Using this perspective, we have recently suggested 8, 9 that for the evaluation of health care performance in a health care system (e.g., Canada or Sweden) it is necessary to consider at least two fundamental parameters: the national average of the quality indicator under analysis (e.g., mortality after hospitalization for acute myocardial infarction (AMI or heart attack)), and the size of the GCE. The use of these two parameters allows us to consider four different categories of health system performance. The ideal scenario (A) is when the national average denotes high quality (e.g., low mortality after hospitalization for AMI) and the GCE is very small (that is, a very low VPC). Under this scenario, all hospitals are performing homogenously well. That is, there is a well‐functioning healthcare system without disparities between hospitals. 
The worst scenario (B) is when the national average denotes low quality (e.g., high mortality after hospitalization for AMI) and the GCE is also small (that is, a very low VPC). Under this scenario, all hospitals are performing homogenously poorly. That is, there is an overall poorly functioning healthcare system. In both scenarios A and B, any intervention to improve quality (scenario B) or to maintain the high quality of the health care system (scenario A) should not be directed at specific hospitals, but to all hospitals in the state, province, or country. The next scenario (C) is when the national average denotes high quality (e.g., low mortality after hospitalization for AMI) but the GCE is large (that is, a high VPC). This scenario indicates that even if the quality of the whole health care system is, on average, good, there is heterogeneity in hospital performance, so that the prognosis after AMI is consistently poor for the patients in some hospitals but consistently good for patients in some other hospitals. In this scenario, an intervention to improve health care quality should be focused on the hospitals with poor quality (i.e., those with high average mortality after hospitalization for AMI). The final scenario (D) is when the national average denotes poor quality (e.g., high average mortality after hospitalization for AMI in the country) and the GCE is large (that is, a high VPC). This scenario indicates that even if the quality of the health care system is poor on average, there is heterogeneity in hospital performance so that the prognosis after myocardial infarction is consistently bad for the patients in some hospitals but consistently better for patients in other hospitals. In scenario D, an intervention to improve health care quality should be focused on all hospitals in the country and especially on those with the highest average mortality after hospitalization for AMI. 
This simplified approach of four scenarios can be applied together with classical funnel plots, health league tables, or similar methods commonly used to compare hospitals or small geographical area averages 8, 9, 10, 11. It can also be used for the non‐randomized evaluation of interventions in health care 12, 13. When using MLRM with more than two levels (e.g., patient, wards, and hospitals 8, or patient, physicians, and hospitals 13), the level‐specific GCE may provide information on the relative importance or relevance of the different levels. This approach may be used to complement the information provided by initiatives like ‘hospital report cards’, in which patients' outcomes are compared across hospitals within a given health care system. Such report cards have been published comparing mortality rates across hospitals for patients hospitalized with AMI in the Canadian province of Ontario 14, and in the American states of California 15 and Pennsylvania 16. Similarly, outcomes have been compared across hospitals or surgeons for patients undergoing coronary artery bypass graft (CABG) surgery in Ontario 17 and in the American states of New York 18, New Jersey 19, Pennsylvania 20, and Massachusetts 21. Similar analyses may be conducted in education (e.g., examining variation in student test scores across schools) and other areas in which it is important to quantify the GCE of the providers. Any researcher examining variation in outcomes across clusters in which subjects are nested (e.g., hospitals, schools, geographic regions) should be interested in formally describing and interpreting the observed variation in outcomes across clusters and the GCE. A classical concern in health services research is the issue of how to interpret the magnitude of the estimated between‐cluster variation. In doing so, one must consider Diehr et al.'s classical question, ‘What is too much variation?’ 22. 
The simplest approach of directly interpreting the estimated cluster variance, as is classically done in single‐level studies of hospital or small area variation, is insufficient. Our analytical framework, however, facilitates the answering of this classical question. Rather than focusing on cluster variance in isolation, we consider the existence of a multilevel continuum of individual variance that can be decomposed into between‐ and within‐cluster components. Therefore, the cluster variance is large when it is a large share of the total individual variance. The advantage of using the VPC to quantify the GCE is that it provides a measure with clearly defined limits as it can be expressed as a percentage that extends from 0% to 100%. The VPC is easy to calculate and interpret in multilevel linear regression models with continuous outcomes 23. In the case of discrete responses, the calculation and interpretation of the VPC are more complicated because, among other issues, the individual and cluster components of variance are on different scales. There are, however, several alternative approaches for computing the VPC with discrete outcomes. These include using a normal response approximation, the simulation method, Taylor series linearization, and the latent response method 6. Also, one could use random effects‐based predictions and their corresponding area under the ROC curve 8, 9. A problem is that many epidemiological practitioners and physicians are not familiar with concepts like underlying latent variables and constant individual variances of $\pi^2/3$, which may explain the reluctance to use this approach. To avoid the interpretative technicalities of the VPC, Larsen et al. introduced the concept of the median odds ratio (MOR) that can be used in the interpretation of GCE when fitting a random effects logistic regression model 24, 25. 
The MOR indicates the median value of the odds ratios obtained when comparing the odds of the occurrence of the outcome in an individual from a randomly selected cluster with another individual with identical covariates but randomly selected from a different cluster when the clusters are ordered by risk. In other words, to calculate the MOR we should first measure the odds of the occurrence of the outcome for all randomly taken pairs of individuals from different clusters and, thereafter, compute the odds ratio for each pair of individuals having the individual from the cluster with the higher odds in the numerator and the individual from the cluster with the lower odds in the denominator. This would produce a distribution of odds ratios that are always equal to or higher than 1 and the MOR is the median OR of this distribution 26. The MOR can be thought of as the median increase in the odds of the occurrence of the outcome that would arise when an individual moves from a lower‐risk cluster to a higher‐risk cluster. An advantage of the MOR is that it permits the analyst to present the between‐cluster variation as a measure of association (i.e., an odds ratio) and thereby allows the comparison of the GCE with the fixed effect of the covariates in the model. Unlike the VPC which is a measure of homogeneity and intra‐cluster correlation within hierarchical data structures, the MOR is strictly a measure of heterogeneity between clusters. Also, the MOR is a probabilistic measure of association that extends from 1 to +∞ rather than from 0 to 1 as does the VPC. When interpreting the MOR, we need to consider that both the MOR and the VPC measures are simply functions of the cluster variance and both express the same GCE. For instance, a VPC as low as 2% corresponds to a MOR of 1.28. Some epidemiologists may interpret this as a 28% increased risk, yet it is actually a low MOR. 
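The correspondence between a VPC of 2% and a MOR of 1.28 can be checked numerically. The sketch below assumes the latent response method, in which the individual-level variance of the logistic model is fixed at π²/3, and the usual closed-form expression for the MOR under normally distributed random effects. The function names are illustrative only and do not come from the paper.

```python
import math
from statistics import NormalDist

# Individual-level variance of the standard logistic distribution
# (latent response method; this choice of method is an assumption here).
LOGISTIC_RESIDUAL_VAR = math.pi ** 2 / 3


def variance_from_vpc(vpc):
    """Cluster variance sigma^2 implied by VPC = sigma^2 / (sigma^2 + pi^2/3)."""
    return vpc * LOGISTIC_RESIDUAL_VAR / (1.0 - vpc)


def mor_from_variance(sigma2):
    """MOR = exp(sqrt(2 * sigma^2) * Phi^{-1}(0.75)) for normal random effects."""
    return math.exp(math.sqrt(2.0 * sigma2) * NormalDist().inv_cdf(0.75))


sigma2 = variance_from_vpc(0.02)
mor = mor_from_variance(sigma2)
print(f"cluster variance = {sigma2:.4f}, MOR = {mor:.2f}")
```

Because both quantities are monotone functions of the same cluster variance, either one can be recovered from the other once the individual-level variance is fixed.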
An important, but neglected, issue concerns the calculation of the GCE for survival or time‐to‐event outcomes that occur frequently in the medical and epidemiological literature 27. In fact, the quantification of the GCE is infrequent with time‐to‐event outcomes even if the Median Hazard Ratio (MHR), as an extension of the MOR, is available for this purpose. While the MHR was empirically applied in 2007 to examine geographic variation in ischemic heart disease mortality in Sweden 28, a formal derivation and interpretation of the MHR was first published by Lanke in 2010 as a short appendix written in Historical Methods 29 to quantify the impact of the family‐specific frailty in southern Sweden during the period 1766–1895 30. The MOR, described by Larsen in the year 2000 in Biometrics 24, only started being applied regularly in the medical and epidemiological research after 2005 when the concept was introduced and its utility demonstrated for an epidemiological audience 25, 26. This highlights the relevance and importance of translational studies such as this one to introduce to an epidemiologic audience methods developed elsewhere. Introducing advanced statistical techniques and concepts like multilevel regression analysis to everyday epidemiological practice is a relevant task, not only for improving the validity of the epidemiological analysis, but also because statistical ideas may transform the way medical epidemiologists interpret information. The objective of the current paper was two‐fold. First, to introduce the MHR to researchers in epidemiology and biostatistics. Second, to illustrate its utility for measuring the general impact of the hospital context (i.e., general contextual effects) on the hazard of death in patients hospitalized for an AMI in Canada. The paper is structured as follows. In Section 2, we briefly review frailty models for analyzing clustered survival data and define the MHR. 
In Section 3, we provide a case study in which we illustrate the utility of this metric for assessing the magnitude of the general contextual (i.e., hospital) effects when analyzing survival data. Finally, in Section 4, we summarize our report and place it in the context of the existing literature.

The median hazard ratio

In this section we first formally define the MOR. We then define the MHR.

The median odds ratio

The MOR was defined by Larsen et al. for use with a random effects logistic regression model 24, and was subsequently popularized in the epidemiological literature 25, 26. Assume that the following random effects logistic regression model had been fit: $\operatorname{logit}(p_{ij}) = X_{ij}\beta + \alpha_j$, where $p_{ij}$ is the probability of the occurrence of the binary outcome for the ith subject in the jth cluster, $X_{ij}$ denotes a vector of explanatory variables, $\beta$ denotes the vector of associated regression coefficients, and $\alpha_j$ denotes the cluster‐specific random effects. The assumption is typically made that the random effects follow a normal distribution: $\alpha_j \sim N(0, \sigma^2)$. For this random effects logistic regression model, the MOR may be calculated as $\mathrm{MOR} = \exp\left(\sqrt{2\sigma^2}\,\Phi^{-1}(0.75)\right)$, where $\Phi^{-1}$ denotes the inverse of the standard normal cumulative distribution function.
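The verbal definition of the MOR (the median odds ratio over randomly drawn pairs of clusters, ordered by risk) can be compared with the closed-form expression by simulation. The sketch below is only an illustration: the variance of 0.25, the number of pairs, and the seed are hypothetical values chosen for the example.

```python
import math
import random
from statistics import NormalDist, median


def mor_closed_form(sigma2):
    """exp(sqrt(2 * sigma^2) * Phi^{-1}(0.75)) for alpha_j ~ N(0, sigma^2)."""
    return math.exp(math.sqrt(2.0 * sigma2) * NormalDist().inv_cdf(0.75))


def mor_by_simulation(sigma2, n_pairs=200_000, seed=1):
    """Median of exp(|alpha_j - alpha_k|) over randomly drawn cluster pairs.

    Taking the absolute difference places the higher-risk cluster in the
    numerator, so every simulated odds ratio is at least 1.
    """
    rng = random.Random(seed)
    sd = math.sqrt(sigma2)
    ratios = []
    for _ in range(n_pairs):
        alpha_j = rng.gauss(0.0, sd)
        alpha_k = rng.gauss(0.0, sd)
        ratios.append(math.exp(abs(alpha_j - alpha_k)))
    return median(ratios)


sigma2 = 0.25  # hypothetical cluster variance, for illustration only
closed = mor_closed_form(sigma2)
simulated = mor_by_simulation(sigma2)
print(f"closed form: {closed:.3f}, simulated: {simulated:.3f}")
```

The covariates cancel in the ratio because the two subjects are assumed identical, which is why only the random effects need to be simulated.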

The median hazard ratio

Lanke, in an appendix 29 to a paper 30 published in the history literature, extended the concept of the MOR for use with survival or time‐to‐event outcomes when Cox frailty models are fit to account for the clustered nature of the data. We assume a Cox proportional hazards regression model that has incorporated cluster‐specific random effects: $h_{ij}(t) = h_0(t)\exp(X_{ij}\beta + \alpha_j)$, where $h_{ij}(t)$ denotes the hazard function for the ith subject within the jth cluster, while $h_0(t)$ denotes the baseline hazard function (i.e., the hazard function for a subject whose covariates are all equal to zero). Furthermore, the vector $X_{ij}$ denotes a vector of predictor or explanatory variables, while $\beta$ denotes the vector of associated regression coefficients. The $\alpha_j$ denotes the cluster‐specific random effects. The model can also be written in multiplicative form: $h_{ij}(t) = h_0(t)\exp(\alpha_j)\exp(X_{ij}\beta)$. When using the multiplicative formulation, the term $\exp(\alpha_j)$ is referred to as a frailty term 31, 32. The frailty terms have a multiplicative effect on the hazard function. Frailty models are described by the distribution of the frailty terms. When the distribution of the random effects is normal, the frailty terms will have a log‐normal distribution. We refer to such a model as a Cox log‐normal frailty model. When the distribution of the frailty terms follows a Gamma distribution, we refer to the resultant model as a Cox Gamma frailty model. These are the most common Cox frailty models, and both can be implemented in popular statistical software such as R or SAS (while Stata only permits estimation of the Cox Gamma frailty model). Lanke demonstrated that when the frailty terms followed a log‐normal distribution, then the MHR is evaluated as $\mathrm{MHR} = \exp\left(\sqrt{2\sigma^2}\,\Phi^{-1}(0.75)\right)$, where $\sigma^2$ is the variance of the random effects (i.e., $\alpha_j \sim N(0, \sigma^2)$) and $\Phi^{-1}$ denotes the inverse of the standard normal cumulative distribution function 29. 
Thus, in the case of normally distributed random effects, the MHR is evaluated in a method that is identical to that which is used for evaluating the MOR for the logistic‐normal hierarchical regression model. The above result arises from the fact that when the random effects follow a normal distribution, the difference $\alpha_j - \alpha_k$ between the random effects of two clusters is normal with variance $2\sigma^2$, so that $|\alpha_j - \alpha_k|$ follows a half‐normal distribution. The median of this half‐normal distribution is given by $\sqrt{2\sigma^2}\,\Phi^{-1}(0.75)$. When the frailty terms follow a Gamma distribution with variance $\sigma^2$ (i.e., $\exp(\alpha_j) \sim \Gamma(\sigma^{-2}, \sigma^2)$, under the convention that $E[\exp(\alpha_j)] = 1$), then the MHR is evaluated as the upper quartile of an $F(2\sigma^{-2}, 2\sigma^{-2})$ distribution 30. R code for computing the MHR and MOR is provided in the appendix.
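The two ways of evaluating the MHR can be sketched in a few lines. This is not the R code from the paper's appendix: it is a Python illustration that uses the closed form for the log‐normal case and, for the Gamma case, a Monte Carlo approximation of the definition (the median ratio of the larger to the smaller frailty of two randomly drawn clusters) in place of an F‐distribution quantile function. The variance of 0.05, the number of pairs, and the seed are hypothetical.

```python
import math
import random
from statistics import NormalDist, median


def mhr_lognormal(sigma2):
    """MHR when the random effects alpha_j are N(0, sigma2)."""
    return math.exp(math.sqrt(2.0 * sigma2) * NormalDist().inv_cdf(0.75))


def mhr_gamma_simulated(theta, n_pairs=200_000, seed=1):
    """Monte Carlo MHR when the frailties are Gamma with mean 1, variance theta.

    random.gammavariate(shape, scale) has mean shape * scale and variance
    shape * scale**2, so shape = 1/theta and scale = theta give mean 1 and
    variance theta.
    """
    rng = random.Random(seed)
    shape, scale = 1.0 / theta, theta
    ratios = []
    for _ in range(n_pairs):
        frailty_j = rng.gammavariate(shape, scale)
        frailty_k = rng.gammavariate(shape, scale)
        # Higher-risk cluster in the numerator, so every ratio is >= 1.
        ratios.append(max(frailty_j, frailty_k) / min(frailty_j, frailty_k))
    return median(ratios)


variance = 0.05  # hypothetical value, for illustration only
mhr_ln = mhr_lognormal(variance)
mhr_g = mhr_gamma_simulated(variance)
print(f"log-normal MHR: {mhr_ln:.3f}, Gamma MHR (simulated): {mhr_g:.3f}")
```

Note that the variance parameter has a different meaning in the two models: it is the variance of the log‐frailty in the log‐normal model and the variance of the frailty itself in the Gamma model, so the two MHRs need not coincide for the same numeric value.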

Case study

We provide a case study to illustrate the utility of the MHR for evaluating the hospital GCE on the hazard of death subsequent to hospitalization for AMI. Typically, studies assessing hospital performance in survival after hospitalization for AMI assume that over and above patient characteristics, the hospital context exerts a general, shared effect on all patients at the same hospital. This concept is analogous to the frailty effect in multilevel survival regression. However, as explained in Section 1, this frailty effect cannot be properly assessed by interpreting between‐hospital variation only, but rather by a measure that, like the MHR, explicitly operationalizes hospital general contextual effects.

Data

We used data from the Ontario Myocardial Infarction Database, which contains data on patients hospitalized with an AMI at Ontario hospitals between 1992 and 2013 33. For this case study, we used hospital separations (occurring because of patient discharge or of in‐hospital death) that occurred in the 12‐month period between April 1, 2006 and March 31, 2007. The data have a multilevel structure, with patients nested within hospitals. The study sample consisted of 17 243 patients treated at 157 hospitals. Due to the study inclusion and exclusion criteria, no patient had more than one hospital discharge during the one year time frame of the study. Eleven variables, consisting of the variables in the Ontario AMI Mortality Prediction model (age, sex, congestive heart failure, cardiogenic shock, arrhythmia, pulmonary edema, diabetes mellitus with complications, stroke, acute renal disease, chronic renal disease, and malignancy), were measured on each patient 34. The single continuous explanatory variable (age) was centered around the sample average. The outcome for the case study was the time from hospital admission to the occurrence of death due to any cause. Patients were followed for up to one year from the time of hospital admission, and were censored after 365 days of follow‐up if they were still alive. Death within one year of hospital admission occurred for 3758 (21.8%) patients in the sample.
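The censoring rule described above can be expressed as a small helper that turns a raw time to death into the (time, event) pair expected by survival software. This is a hedged sketch: the function name and the representation of surviving patients as `None` are hypothetical, and treating a death on exactly day 365 as an observed event is an assumption not stated in the text.

```python
from typing import Optional, Tuple

FOLLOW_UP_DAYS = 365  # one year of follow-up from hospital admission


def censor(days_to_death: Optional[int],
           follow_up: int = FOLLOW_UP_DAYS) -> Tuple[int, int]:
    """Return (time, event); event is 1 for an observed death, 0 if censored.

    days_to_death is None when no death was recorded (hypothetical encoding).
    A death on the last follow-up day is counted as an event (assumption).
    """
    if days_to_death is not None and days_to_death <= follow_up:
        return days_to_death, 1
    return follow_up, 0


print(censor(30))    # death observed within follow-up
print(censor(400))   # death after follow-up ends: censored at 365 days
print(censor(None))  # alive at end of follow-up: censored at 365 days
```

Passing a different `follow_up` value (for example, five years expressed in days) gives the administrative censoring for a longer follow-up window.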

Statistical analysis

We fit two different Cox frailty models. First, we fit the null model, which included only hospital‐specific random effects: $h_{ij}(t) = h_0(t)\exp(\alpha_j)$. Second, we fit a frailty model which comprised the 11 variables in the Ontario AMI mortality prediction model and the hospital‐specific random effects: $h_{ij}(t) = h_0(t)\exp(X_{ij}\beta_{\text{patient}} + \alpha_j)$, where the vector $X_{ij}$ denotes the vector of patient‐level characteristics, and $\beta_{\text{patient}}$ denotes the vector of associated regression coefficients. We first fit these two models assuming that the distribution of the random effects was normal: $\alpha_j \sim N(0, \sigma^2)$ (equivalent to assuming that the frailties follow a log‐normal distribution). We then fit these two models assuming that the distribution of the frailty terms followed a Gamma distribution: $\exp(\alpha_j) \sim \Gamma(\theta^{-1}, \theta)$, so that $E[\exp(\alpha_j)] = 1$ and $\mathrm{Var}[\exp(\alpha_j)] = \theta$. For each of the sets of two models, we computed the MHR using the methods described in Section 2. Statistical analyses were conducted using PROC PHREG in SAS (SAS/STAT version 13.1) (Cary, NC) unless otherwise noted (in which case, the stcox function in Stata (version 13.1) (College Station, TX) was used).

Results

Estimated hazard ratios and associated 95% confidence intervals obtained from the two frailty models are reported in Table 1. The estimated hazard ratios and associated 95% confidence intervals were essentially identical between the two frailty models.

Table 1

Hazard ratios and 95% confidence intervals for the two frailty models

Variable	Cox log‐normal model	Cox Gamma model
Age (per 10‐year increase)	1.85 (1.79,1.91)	1.85 (1.79,1.91)
Female (vs. male)	0.96 (0.90,1.03)	0.96 (0.90,1.03)
Congestive heart failure	1.60 (1.49,1.72)	1.60 (1.49,1.71)
Cerebrovascular disease	1.57 (1.35,1.82)	1.57 (1.35,1.82)
Pulmonary edema	1.70 (1.26,2.31)	1.70 (1.26,2.31)
Diabetes with complications	1.27 (1.18,1.37)	1.27 (1.18,1.37)
Malignancy	2.88 (2.54,3.27)	2.88 (2.54,3.26)
Chronic renal failure	1.38 (1.25,1.52)	1.38 (1.25,1.52)
Acute renal failure	1.51 (1.35,1.69)	1.51 (1.35,1.69)
Cardiogenic shock	6.37 (5.54,7.32)	6.36 (5.53,7.31)
Cardiac arrhythmia	1.21 (1.12,1.31)	1.21 (1.12,1.31)

Each cell contains the estimated hazard ratio and the associated 95% confidence interval.

For the two log‐normal frailty models, the estimated variances of the distribution of the random effects (i.e., the variance of the underlying normal distribution) were 0.06218 (null model) and 0.02391 (model with patient characteristics). The MHRs for these two models were 1.27 and 1.16, respectively. For the null model, the estimated standard error of this estimated variance was 0.01360. A modified Wald test, which is equal to the estimated variance divided by an estimate of its standard error, allows one to test whether the variance is significantly different from zero 35. The modified Wald test statistic for the null model is 4.57, which can be compared to the critical value of 1.64 for a normal one‐sided test. Thus, we would reject the null hypothesis of no between‐hospital variation with a highly significant p‐value (P < 0.001). For the adjusted model, the estimated standard error of the estimated variance was 0.007666, resulting in a modified Wald test statistic of 3.11. Thus, even after adjustment for patient characteristics, we would reject the null hypothesis of no between‐hospital variation in the hazard of death (P < 0.001). For the two gamma frailty models, the estimated variances of the distribution of the frailty terms (i.e., the variance of the Gamma distribution) were 0.05592 (null model) and 0.02105 (model with patient characteristics). Using a modified Wald test, we rejected the null hypothesis of no between‐hospital variation in both the null model and the model that adjusted for patient characteristics (P < 0.001) (the components for computing the Wald test for the gamma frailty models were obtained from models fit using Stata, as SAS does not provide an estimate of the standard error of the estimated variance of the frailty distribution for the gamma frailty model). 
The MHRs for these two models were 1.26 and 1.15, respectively. Thus, when using the gamma frailty model, prior to adjusting for patient characteristics, the median increase in the hazard of mortality when comparing a patient at a hospital with higher mortality to a patient at a hospital with lower mortality was 26%. After accounting for patient characteristics, the median increase in the hazard of mortality when comparing a patient at a hospital with higher mortality to a patient at a hospital with lower mortality was 15%. Comparable interpretations are drawn from the log‐normal frailty model. One can better understand the magnitude of between‐hospital variation in mortality by comparing the MHR for the full model with the hazard ratios for the patient‐level characteristics. The MHR for the gamma frailty model was 1.15 (the reciprocal of this MHR is 1/1.15 = 0.87). Only one patient‐level characteristic (female sex) had a hazard ratio that lay between 0.87 and 1.15. Thus, the median effect of clustering on mortality was less than the effect of 10 of the 11 patient characteristics. The estimated distributions of the frailty terms are described in Figure 1. The left panel contains the two distributions of the frailty terms under the null models, while the right panel contains the two distributions under the model that adjusted for the 11 patient characteristics. For a given model (null vs. adjusted), the choice of distribution had only a marginal effect on the shape of the distribution. The right tail was slightly heavier under the log‐normal distribution than under the Gamma distribution. Thus, under the lognormal model, the proportion of hospitals that have a very elevated risk of death compared to the average hospital is higher than under the Gamma model. 
Through a comparison of the estimated hazard ratios (Table 1), the MHRs, and the distribution of the frailty terms (Figure 1), one notes that the choice of frailty distribution had at most a minor impact on the conclusions that would be drawn from the data.
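Several of the quantities reported in this section can be recomputed from the published estimates. The sketch below is an illustration rather than the authors' code: it evaluates the modified Wald statistic (estimated variance divided by its standard error, referred to a one‐sided normal test), the log‐normal MHRs, and the set of patient characteristics whose hazard ratio lies between 1/MHR and MHR, using the Gamma‐model column of Table 1. Because the inputs are the rounded values printed in the text, small last‐digit differences from the reported statistics can arise.

```python
import math
from statistics import NormalDist


def modified_wald(variance, std_error):
    """Return (statistic, one-sided p-value) for H0: variance = 0."""
    z = variance / std_error
    return z, 1.0 - NormalDist().cdf(z)


def mhr_lognormal(sigma2):
    """MHR for normally distributed random effects with variance sigma2."""
    return math.exp(math.sqrt(2.0 * sigma2) * NormalDist().inv_cdf(0.75))


# Log-normal frailty models: variance estimates and standard errors from the text.
z_null, p_null = modified_wald(0.06218, 0.01360)
z_adj, p_adj = modified_wald(0.02391, 0.007666)
mhr_null = mhr_lognormal(0.06218)
mhr_adj = mhr_lognormal(0.02391)

# Hazard ratios from the Cox Gamma model column of Table 1.
hazard_ratios = {
    "Age (per 10-year increase)": 1.85,
    "Female (vs. male)": 0.96,
    "Congestive heart failure": 1.60,
    "Cerebrovascular disease": 1.57,
    "Pulmonary edema": 1.70,
    "Diabetes with complications": 1.27,
    "Malignancy": 2.88,
    "Chronic renal failure": 1.38,
    "Acute renal failure": 1.51,
    "Cardiogenic shock": 6.36,
    "Cardiac arrhythmia": 1.21,
}
mhr_gamma_adj = 1.15  # reported MHR for the adjusted Gamma frailty model

# Covariates whose effect is smaller than the median effect of clustering.
weaker_than_clustering = [
    name for name, hr in hazard_ratios.items()
    if 1.0 / mhr_gamma_adj < hr < mhr_gamma_adj
]

print(f"Wald (null): {z_null:.2f}, p = {p_null:.2g}")
print(f"Wald (adjusted): {z_adj:.2f}, p = {p_adj:.2g}")
print(f"MHR (null): {mhr_null:.2f}, MHR (adjusted): {mhr_adj:.2f}")
print("Hazard ratio within (1/MHR, MHR):", weaker_than_clustering)
```

Comparing each hazard ratio against both the MHR and its reciprocal handles protective covariates (hazard ratios below 1) on the same footing as harmful ones.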

Figure 1

Distribution of frailty terms.

As a sensitivity analysis to examine the effect of duration of follow‐up and number of events on the MHR, we repeated the above analyses allowing each patient to be followed for up to five years from the time of hospital admission. In this secondary analysis, 6904 (40.0%) patients died within five years of hospital admission. The MHRs for the models with normally distributed random effects were 1.23 (null model) and 1.15 (model with patient characteristics). The MHRs for the models with the gamma‐distributed frailty terms were 1.22 (null model) and 1.14 (model with patient characteristics). These MHRs were qualitatively comparable to those obtained when subjects were followed for one year after hospital admission.

Other measures of dependence

For comparative purposes, we examined other measures of dependence for use with frailty models. These measures provide alternative methods to quantify the effect of clustering. These measures were derived in the context of bivariate survival data (i.e., when the clusters consist of two observed survival times) 36. However, they can be used in the general setting with multiple survival times observed per cluster. In that case, these measures refer to the bivariate marginal distribution. Closed‐form expressions exist for these measures under the Gamma frailty model, but not under the log‐normal frailty model. Kendall's τ denotes the correlation of subjects' outcomes within groups 31, 36, 37. A closed‐form expression exists for Kendall's τ under the Gamma frailty model, but not under the log‐normal frailty model 31, 32. Under the Gamma frailty model, $\tau = \theta/(\theta + 2)$, where θ denotes the variance of the frailty distribution. Under the Gamma frailty model estimated above, τ is equal to 0.027 for the null model and 0.010 for the model with patient characteristics. Thus, prior to adjustment for patient characteristics, 2.7% of the variation in survival times is due to variation between hospitals, while after adjustment for these 11 covariates, 1.0% of the variation in survival times is due to variation between hospitals. Spearman's correlation coefficient for bivariate survival data can be defined as $\rho = 12\int_0^\infty\!\int_0^\infty S(t_1, t_2)\,dS_1(t_1)\,dS_2(t_2) - 3$, where $S(t_1, t_2)$ is the bivariate survival function and $S_1$ and $S_2$ are its marginal survival functions 36. Under the Gamma frailty model, this can be evaluated in closed form in terms of the generalized hypergeometric function ${}_3F_2$. Under the Gamma frailty model estimated above, Spearman's correlation coefficient is 1.014 for the null model and 1.005 for the model with patient characteristics. Median concordance is the concordance of a single observation $(T_1, T_2)$ in relation to a fixed point $(\mathrm{median}(T_1), \mathrm{median}(T_2))$. It is defined by $\kappa = E[\operatorname{sign}\{(T_1 - \mathrm{median}(T_1))(T_2 - \mathrm{median}(T_2))\}]$ 36. Under the Gamma frailty model, it can be evaluated as $\kappa = 4\,(2^{\theta + 1} - 1)^{-1/\theta} - 1$. 
Under the Gamma frailty model estimated above, median concordance was 0.026 for the null model and 0.010 for the model with patient characteristics. Using each of these measures of dependence, one would conclude that the magnitude of the effect of clustering was weak to modest, both before and after adjustment for patient characteristics. However, unlike the MHR, each of these measures lacks the ability to compare the magnitude of the contextual effect to that of the effects of individual patient characteristics. Furthermore, closed‐form expressions for these dependence measures exist for the Gamma frailty model, but not the log‐normal frailty model. In contrast to this, the MHR can be evaluated for both of these families of frailty models.
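The closed forms for the Gamma frailty model are simple enough to check numerically. The sketch below is illustrative and is not the authors' code: it assumes the standard Gamma frailty (Clayton copula) identities τ = θ/(θ + 2) and κ = 4(2^(1+θ) − 1)^(−1/θ) − 1, and, because the estimated frailty variances are not restated here, it back-solves θ from the rounded τ values reported above. The function names are hypothetical.

```python
import math


def kendall_tau_gamma(theta: float) -> float:
    """Kendall's tau under a Gamma frailty with variance theta."""
    return theta / (theta + 2.0)


def theta_from_kendall_tau(tau: float) -> float:
    """Invert tau = theta / (theta + 2) to recover the frailty variance."""
    return 2.0 * tau / (1.0 - tau)


def median_concordance_gamma(theta: float) -> float:
    """Median concordance under a Gamma frailty with variance theta."""
    return 4.0 * (2.0 ** (1.0 + theta) - 1.0) ** (-1.0 / theta) - 1.0


# Assumption: theta is back-solved from the rounded tau values in the text,
# so the results are only approximate reconstructions of the fitted models.
for label, tau in [("null model", 0.027), ("adjusted model", 0.010)]:
    theta = theta_from_kendall_tau(tau)
    kappa = median_concordance_gamma(theta)
    print(f"{label}: theta ~ {theta:.4f}, median concordance ~ {kappa:.3f}")
```

Under these assumptions, the median concordance values obtained from the back-solved θ should land close to the figures reported for the two Gamma frailty models.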

Discussion

The objective of the current paper was to introduce researchers in epidemiology and medical research to the concept of the MHR for use with multilevel analysis of clustered survival data. The MHR allows one to quantify the magnitude of the general contextual effects (i.e., the ‘frailty’) on the hazard of the occurrence of the outcome on the hazard ratio scale. Furthermore, it permits a comparison of the magnitude of this general contextual effect with that of model covariates.

After an empirical application of the MHR in epidemiology in 2007 28 and its subsequent formal mathematical derivation in 2010 in the history literature 29, only two papers have applied the MHR in the peer‐reviewed literature (Source: Science Citation Index, April 28, 2016). The first used the MHR to quantify the family effect on infant and child mortality in Sweden during the period 1766–1895 30 and the second examined the effect of the mother on fertility in a nineteenth century alpine village 38. Therefore, despite the frequency with which multilevel survival or time‐to‐event outcomes occur in the medical, epidemiological, and health services research literature, investigators in these fields appear to be unaware of the MHR. The purpose of our brief article was to introduce researchers in these fields to the existence and utility of the MHR.

In our case study, we used the MHR to quantify the magnitude of the effect of clustering within hospitals, that is, the ‘frailty’ or general contextual effect, on the hazard of mortality subsequent to hospitalization for an AMI. We found that the MHR for the gamma frailty model that contained 11 patient characteristics was 1.15, indicating that, in 50% of possible pair‐wise comparisons of hospitals, the hazard of death for a reference patient at the hospital with higher mortality was less than 15% greater than at the hospital with lower mortality. Furthermore, the MHR, which measures the median effect of ‘frailty’ on the hazard ratio scale, was smaller in magnitude than the hazard ratios for 10 of the 11 patient characteristics.

Reporting the MHR complements the reporting of the variance of the frailty distribution (and possibly a plot of the density function of the frailty distribution). The MHR provides a characterization of the magnitude of the effect of clustering that would not have been possible had we simply reported the variances of the frailty distributions. From a description of the frailty distribution on its own, it is difficult to summarize the effect of context on the hazard of outcomes. In contrast, the MHR provides a summary measure of the contextual effect on the hazard of the outcome.

In conclusion, the MHR allows one to determine the median relative change in the hazard of the outcome between a subject in a cluster at a higher risk for the outcome and an identical subject in a cluster at a lower risk for the outcome. Such a measure permits an intuitive description of the magnitude of the impact of the hospital ‘frailty’ or general contextual effects when analyzing clustered survival data.
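The definition of the MHR as a median over pairs of clusters can be illustrated with a short simulation. The sketch below is not the authors' R code. It assumes normally distributed random effects on the log-hazard scale with variance σ², for which the MHR has the closed form exp(√(2σ²) · Φ⁻¹(0.75)), analogous to the MOR. The function names are hypothetical, and σ² is back-solved from an MHR of 1.15 purely for illustration.

```python
import math
import random
from statistics import NormalDist, median


def mhr_lognormal(sigma2: float) -> float:
    """Closed-form MHR for normal random effects with variance sigma2."""
    return math.exp(math.sqrt(2.0 * sigma2) * NormalDist().inv_cdf(0.75))


def sigma2_from_mhr(mhr: float) -> float:
    """Invert the closed form to get the random-effect variance."""
    return (math.log(mhr) / NormalDist().inv_cdf(0.75)) ** 2 / 2.0


def simulate_mhr(sigma2: float, n_pairs: int = 100_000, seed: int = 1) -> float:
    """Median hazard ratio over random pairs of clusters, ordered by risk."""
    rng = random.Random(seed)
    sd = math.sqrt(sigma2)
    # For each pair, compare the higher-risk cluster to the lower-risk one,
    # so every simulated hazard ratio is at least 1.
    ratios = [
        math.exp(abs(rng.gauss(0.0, sd) - rng.gauss(0.0, sd)))
        for _ in range(n_pairs)
    ]
    return median(ratios)


# Assumption: an illustrative variance chosen so that the closed-form MHR is 1.15.
sigma2 = sigma2_from_mhr(1.15)
print(f"sigma^2 ~ {sigma2:.4f}")
print(f"closed-form MHR ~ {mhr_lognormal(sigma2):.3f}")
print(f"simulated MHR   ~ {simulate_mhr(sigma2):.3f}")
```

Because the simulated quantity is a hazard ratio, it can be set directly beside the hazard ratios of individual covariates, which is the comparison the MHR is meant to support.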

The median hazard ratio: a useful measure of variance and general contextual effects in multilevel survival analysis.

Authors: Peter C Austin; Philippe Wagner; Juan Merlo
Journal: Stat Med Date: 2016-11-25 Impact factor: 2.373
