Literature DB >> 25803659

A review and comparison of methods for recreating individual patient data from published Kaplan-Meier survival curves for economic evaluations: a simulation study.

Xiaomin Wan1, Liubao Peng1, Yuanjian Li2.   

Abstract

BACKGROUND: In general, the individual patient-level data (IPD) collected in clinical trials are not available to independent researchers to conduct economic evaluations; researchers only have access to published survival curves and summary statistics. Thus, methods that use published survival curves and summary statistics to reproduce statistics for economic evaluations are essential. Four methods have been identified: two traditional methods 1) least squares method, 2) graphical method; and two recently proposed methods by 3) Hoyle and Henley, 4) Guyot et al. The four methods were first individually reviewed and subsequently assessed regarding their abilities to estimate mean survival through a simulation study.
METHODS: A number of different scenarios were developed that comprised combinations of various sample sizes, censoring rates and parametric survival distributions. One thousand simulated survival datasets were generated for each scenario, and all methods were applied to actual IPD. The uncertainty in the estimate of mean survival time was also captured.
RESULTS: All methods provided accurate estimates of the mean survival time when the sample size was 500 and a Weibull distribution was used. When the sample size was 100 and the Weibull distribution was used, the Guyot et al. method was almost as accurate as the Hoyle and Henley method; however, more biases were identified in the traditional methods. When a lognormal distribution was used, the Guyot et al. method generated noticeably less bias and a more accurate uncertainty compared with the Hoyle and Henley method.
CONCLUSIONS: The traditional methods should not be preferred because of their remarkable overestimation. When the Weibull distribution was used for a fitted model, the Guyot et al. method was almost as accurate as the Hoyle and Henley method. However, if the lognormal distribution was used, the Guyot et al. method was less biased compared with the Hoyle and Henley method.

Entities:  

Mesh:

Year:  2015        PMID: 25803659      PMCID: PMC4372344          DOI: 10.1371/journal.pone.0121353

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Decision analytic modeling is often applied as a vehicle for economic evaluations. [1] A commonly used approach in decision analytic modeling is the Markov model. Transition probabilities between Markov states are one of the most important features of the stochastic model. There are several available methods for the estimation of transition probabilities. Currently, survival analysis has been frequently used to estimate transition probabilities. Recently, an algorithm was developed by Latimer that aimed to improve the quality of survival analyses included within economic evaluations. [2] The framework proposed by Latimer was founded on the assumption that the researchers who conduct economic evaluations have access to individual patient-level data (IPD). Typically, IPD originate from clinical trials and observational studies. However, censoring frequently occurs during the follow-up period, and the key feature of the survival analysis is its ability to facilitate censoring inclusion, which makes it an appropriate method for survival data with censoring. When IPD are available, the relationship between transition probabilities and time can be estimated from IPD using a survival analysis. The commonly used approach is to fit parametric survival functions to IPD by applying the maximum likelihood estimation (MLE) to estimate the parameters of the distributions that are used for the estimation of the transition probabilities. An estimate of the mean survival time is essential for cost-effectiveness analyses (CEAs); the mean survival time is a measure of the effectiveness in a CEA and can be estimated from parametric survival functions. [3] In addition to the mean survival time, other parameters can also be estimated from survival distributions. The uncertainty of parameters can be captured with a variance-covariance matrix from parametric regression models. [1] However, IPD are often only available to clinical trial sponsors because of confidentiality, and the independent economic evaluation researchers lack access to these data. Independent researchers can typically obtain published Kaplan–Meier curves and summary statistics, e.g., numbers at risk and total number of events, for economic evaluations. For these reasons, methods that use published survival curves and reported summary statistics to reproduce statistics for economic evaluations are essential for independent researchers to conduct these evaluations. However, there are few methods available to attain information for economic evaluations in the absence of IPD, and only four methods have been identified to recreate IPD for economic evaluations from published KM curves. These four methods can be used to estimate the parameters of the fitted parametric survival function. The general equation of transition probabilities is as follows: tp(t) = 1 − exp{H(t − u) − H(t)}, where u indicates the length of the Markov cycle, t indicates the survival time, tp(t) indicates the transition probability between time point t-u and t, and H(t) indicates the cumulative hazard function of the parametric distribution. [4] In the following sections, the four identified methods were described briefly and then assessed by using a simulation study.

Methods

Method 1: Least squares method

The least squares method involves the estimation of parameters by minimizing the sum of the squares of the residuals; the residuals represent the differences between the actual data and the estimated data from a model. The least squares method is a common method of statistical estimation and is included in almost every statistical package, e.g., the R package, which makes it accessible to researchers in experimental sciences. Suppose 20 data points, (x1, y1), (x2, y2),…, (x20,y20), were extracted from a survival curve that followed a distribution with a function of y = f(x, β), in which β = (β1, β2). The β vectors of the parameters were estimated using the least squares method, which minimizes the sum of squares.

Method 2: Graphical method

The graphical method used to estimate the parameters of parametric models involves transforming the survivor function to a linear function and fitting a straight line through a series of points that were extracted from a published Kaplan-Meier survival curve [5]. For example, the survival function of the Weibull distribution is provided as follows: . We can transform this distribution by taking the logarithm as follows: log(S(t)) = − λt, where λ is a scale parameter and γ is a shape parameter. We can subsequently take the logarithm again: log(− log(S(t))) = log(γ) + γ*log(t). From this equation, we can plot log(log(S(t))) versus log(t), and the parameters can then be estimated. The graphical method typically provides rough estimates of the parameters, but it has been used because of its simplicity.

Method 3: Estimation of interval-censored data (Hoyle and Henley, 2011)

Hoyle and Henley [6] proposed a method for the estimation of IPD. Numbers at risk, the total number of patients and survival probabilities were used to estimate IPD. The method made an assumption that censoring is constant within each time interval. The number of events and censorships in each time interval of 1/4 length were estimated, which improved the curve fits considerably. When numbers of patients at risk are not reported, the numbers of events and censorship can be estimated by the method proposed by Tierney et al. [7] The method will be more accurate with more reported intervals. The estimated IPD according to this method are interval-censored. A spreadsheet that implements the method is available at http://medicine.exeter.ac.uk/media/universityofexeter/medicalschool/profiles/Hoyle_and_Henley_Version_1.1.xls… Hoyle and Henley conducted a comparison of their method with two transitional methods, e.g., the least squares and graphical methods, using the Monte Carlo method for comparison. The parameters used in the scenarios in their simulation included sample size, combinations of parameters of the Weibull distribution and censoring type. The results of the simulation demonstrated that the method proposed by Hoyle and Henley was more accurate in both fitting the curve and estimating the mean survival time compared with the traditional methods. Furthermore, the estimation of the mean survival time using the proposed method was as accurate as the estimation obtained by applying the MLE directly to actual IPD. One important feature of this proposed method that distinguishes it from traditional methods is that uncertainty can be captured with this method. However, alternative survival models and variations in the proportion of censoring were not included in the scenarios in the simulation study. The authors only used the Weibull distribution in their simulation study; however, many candidate survival models exist, such as Weibull, gamma, Gompertz, log-normal, and log-logistic. The lognormal distribution has typically been used to estimate the mean survival in NICE technology appraisals (TAs) instead of the Weibull distribution. [2] The lognormal distribution has commonly been used in the medical field, particularly for fitting cancer site data. [8] In a realistic trial setting, the censoring rate is often very high. [9] For example, a censoring rate as high as 70% is often identified in real clinical data. The MLE method is known to provide biased estimates when the data are heavily censored. [10]

Method 4: Estimation of precise survival data: survival data with left or right censoring (Guyot et al, 2012)

In 2012, an iterative algorithm developed by Guyot et al. was designed to obtain the reproduced IPD; consequently, survival curves could be reconstructed with the recreated IPD. [11]. The estimated IPD by the method was precise survival data rather than interval censored data. It was assumed that censoring rate is constant within time interval. An initial estimates for the number censored on interval i were obtained. Given this initial estimates, the number censored between extracted KM co-ordinates k and k + 1 was calculated. Then the number of events at each extracted KM co-ordinate and number of patients at risk at the next co-ordinate can be estimated. If the estimated number at risk at the beginning of interval i not equaled to the number at risk reported at the start of i, the process was repeated until the two figures matched from the previous iteration. If the reported total number of events was more than the estimated, the process was repeated until the two figures matched from the previous iteration. Further explanation of the method can be found in the original paper.[11] Regarding reproducibility and accuracy, the simulation was not performed to test the accuracy of the proposed method, which was different from the method created by Hoyle. Six pairs of survival curves were created from four published articles. The authors assessed the accuracy of the proposed method by comparing the actual 22 survival probabilities, 7 median survival times, 6 hazard ratios and 4 standard errors of the log hazard ratios that were reported in the four publications with the variables reconstructed using the proposed method. The results demonstrated that the accuracy was great for survival probabilities and median survival times, and the accuracy was reasonable when the number of individuals at risk or the total number of events was reported. However, because the method proposed by Guyot et al. was not compared with other traditional methods, it remains unknown whether the accuracy of the proposed method is greater than the other methods previously described. Because the Monte Carlo simulation was not performed to test the accuracy of the proposed method, we were not able to model different combinations of values for different inputs to observe the effects of truly different scenarios. There are similarities and differences in the methods proposed by Hoyle and Henley and Guyot et al. Both methods include the use of the survival curve and the number at risk, as well as the assumption that censoring occurs at a constant rate in each time interval for both methods. The differences are summarized in Table 1.
Table 1

Differences between the method by Hoyle and Henley and that by Guyot et al.

Method by Guyot et al.Method by Hoyle and Henley
Precise IPD was estimatedInterval-censored IPD was estimated
Total number of events was used (if available)No such data was used
Iterative algorithm was usedClosed-form was used
No comparison with traditional methodsComparison with traditional methods was made
All computing was done in RMost parts of computing was done in Excel and some in R
More data preparation work requiredLess data preparation work required
Taylor et al. [12] reviewed the non-linear, Guyot et al. and Hoyle and Henley methods and assessed their impacts on the estimates of survival parameters by extracting data points from published survival curves and applying the three methods to the two case studies obtained. Because IPD were unavailable, the accuracy of these three methods was not compared. The findings indicated the methods used to estimate survival parameters to obtain transition probabilities between transition states can affect cost-effectiveness results. Although the accuracy between Hoyle and Henley’s method and traditional methods has been compared, this comparison was performed in the absence of the method proposed by Guyot et al., which is an important method to help researchers recreate survival data that have been cited and used frequently. [2, 4, 13–15] Additionally, it is unclear how the four methods perform in different situations in which alternative survival models and various censoring rates are assumed to reflect realistic clinical data. Therefore, we evaluated the four methods using a simulation study.

Simulation settings

To compare the four methods previously described, a simulation study that could generate survival data with censoring was designed and conducted. This study enabled us to compare the performance of the four methods. The simulated survival data were designed to reflect real clinical trial data. This section contains the details of the simulation study design. The simulation settings in this study are similar to Hoyle and Henley’s study, with the exceptions that (1) the Guyot et al. method was included, (2) variations in the proportion of censoring were simulated, and (3) the lognormal distribution was used in addition to the Weibull distribution.

Survival times

To simulate data, a number of patients with an underling survival time must be generated. Two sample sizes were chosen: 100 and 500 individuals. These sample sizes reflect the sample sizes typically included in clinical trials. [16-18] To simulate survival times with censoring, two survival distributions were required, one distribution for the time-to-event (Ti) and a second distribution for time to censoring (Ci). The Weibull and lognormal distributions are most commonly used for time-to-event data. [19] In this simulation, we considered these two survival distributions for the uncensored survival time Ti: (1) Weibull distribution; and (2) lognormal distribution. The survival function of the Weibull distribution is S(t) = exp(− λt), in which λ is a scale parameter and γ is a shape parameter. The mean survival time of the Weibull distribution can be estimated as follows: E(X) = λΓ(1 + 1/γ), in which Γ is the gamma function. In the simulations, the mean survival time was set to 10 time units, which is common in time-to-event data. To ensure the mean survival time remained at 10 units, different combinations of the scale and shape parameters were chosen. First, three shape parameters were chosen that corresponded to the parameters used by Hoyle and Henley; these parameters cover most situations that would be encountered in the Weibull distribution. Second, the scale parameter was set to make the mean survival time equal to 10 units. The combinations of the parameters were as follows: Decreasing failure rate (scale parameter<1): γ = 0.6, λ = 6.65 Constant failure rate (scale parameter<1): γ = 1, λ = 10 Increasing failure rate (scale parameter>1): λ = 2, λ = 11.28. The survival function of the lognormal distribution is , in which Φ is the cumulative distribution function of the normal distribution,, and μ and σ are the mean and standard deviation, respectively, of the variable’s natural logarithm. The mean survival time of the lognormal distribution can be expressed as:. The mean survival time of the lognormal distribution was also set to 10 units. The hazard function of the lognormal distribution first increases from zero to a maximum value and then decreases back to zero. The amount that the hazard function increases or decreases primarily depends on the value of σ. The combinations of the parameters of the lognormal distribution were as follows: σ = 1, μ = 1.802585: The hazard rate first increases from zero to a maximum value and then decreases to zero; σ = 2, μ = 0.3025851: The hazard function essentially decreases over most time values. The type of censoring in the simulation included type I censoring (because of pre-assigned fixed censoring times) and random censoring (because of loss to follow-up). The common censoring distributions considered in the literature are typically uniform and have exponential distributions [20]; in the simulation study, the random censoring was generated from an exponential distribution. The mean values of the exponential distributions were set to attain the desired censoring rates of 26, 42, and 76% to simulate what occurs in the real world (Table 2).
Table 2

The mean of the exponential distribution in simulations.

Censoring rateDistribution
Weibull (γ,λ)Lognormal(σ, μ)
(0.6, 6.65)(1, 10)(2, 11.28)(1, 1.802585)(2, 0.3025851)
26% a b b c 13
42%1033200204
76%13.573.50.5

a: The mean of exponential distribution was set to a large number(e.g. 10000000) in order to attain the target censoring rate.

b: Not only the mean of exponential distribution was set, but the time when the study ends was set to 15 in order to attain the target censoring rate.

c: Not only the mean of exponential distribution was set, but the time when the study ends was set to 13 in order to attain the target censoring rate.

a: The mean of exponential distribution was set to a large number(e.g. 10000000) in order to attain the target censoring rate. b: Not only the mean of exponential distribution was set, but the time when the study ends was set to 15 in order to attain the target censoring rate. c: Not only the mean of exponential distribution was set, but the time when the study ends was set to 13 in order to attain the target censoring rate.

Entry and follow-up times

The subjects were assumed to enter the study at different times during three time units periods. The entry times greater than 0 were generated from a uniform distribution between times 0 and 3 units. The maximum follow-up time was 12 units, which represents what often occurs in real data. [21-22] The subjects who were censored at time 12 units represented the subjects who remained at the end of the follow-up period. Thus, all subjects were followed up for a period of time that ranged between 3 units to 12 units, depending on their entry time. The follow-up time was set to 12 units, with the exception of the situations described in Table 2.

Application of the methods

Considering all possible parameter selections, we ended up with several scenarios: 18 scenarios for the Weibull distribution and 12 scenarios for the lognormal distribution (Table 3).
Table 3

Summary of simulation variables.

VariableScenariosDetails
Sample size2100 or 500
Weibull:
γ = 0.6, λ = 6.65
γ = 1, λ = 10
Parametric distribution2γ = 2, λ = 11.28
lognormal:
σ = 1, μ = 1.802585
σ = 2, μ = 0.3025851
Censoring rate30.76, 0.42 or 0.26
Although the sample size 100–500 covered the range of trials typically encountered well and reflect what was often seen in large trials, [16-18] additional work was performed by simulating lager trials (sample size set to 1000). The simulation were run with γ set to 0.6 and λ set to 6.65 and censoring rate set to 0.76 for Weibull distribution and σ set to 2 and μ set to 0.30 and censoring rate set to 0.76 for lognormal distribution. For the Guyot et al. method, the number of events was used; however, this information is often not provided. For the Hoyle and Henley method, 6 time intervals were used in most cases. For each scenario, 1000 datasets were generated. The four methods previously described were subsequently applied to each of the 1000 datasets, and the MLE was applied directly to the simulated IPD to estimate the parameters of the survival distribution, which commonly occurs in cost-effectiveness analyses. For each method, the mean survival time was estimated for each of the 1000 datasets. All simulations were conducted using the R for Statistical Computing version 3.1.1 (R Foundation for Statistical Computing, Vienna, Austria).

Performance evaluation

The measurements used to evaluate the performance of the different methods have been summarized by Burton et al. [23] Bias was calculated as follows: in which β is the true value for the estimate of interest, and is the mean estimate of interest over the 1000 simulations performed. Mean square error (MSE) MSE is considered a useful measure of the overall accuracy because it incorporates both measures of bias and variability. The MSE was calculated as follows: in which was the empirical SE of the estimate of interest over all simulations.

Uncertainty

In CEAs, the measure of effectiveness is the mean survival time over the duration of interest. [24] Thus, consideration of the uncertainty regarding the mean survival time is essential to conduct CEAs. [3] One of the advantages of using the methods proposed by Guyot et al. and Hoyle and Henley to conduct a CEA based on a published KM curve is that these methods can capture uncertainty regarding the mean survival time, unlike traditional methods, which are unable to reconstruct IPD. To assess the performance of capturing the uncertainty reported by the two proposed methods, the standard errors of the mean survival time estimated by directly applying the MLE to actual IPD and by the two proposed methods were estimated. We analyzed the impact of uncertainty on the estimate of mean time by applying a bootstrap resampling technique on the parameters of the survival distributions (Weibull and lognormal distributions). For every 1000 simulations, the standard deviation was estimated by applying a bootstrap resampling technique. For each bootstrap resample, the correlated variables were generated by applying the formula x = y+Tz, where x is the vector of the correlated variables, z is the vector of the independent standard normal variables, y is the vector of the parameter mean values and T is a cholesky decomposition of a variance-covariance matrix that was recorded from the simulations. The parameters of the survival distribution were acquired from the distribution, and the mean survival times were estimated. This three-step procedure was then repeated 10,000 times. The standard deviation of the mean survival time over the 10,000 bootstrap simulations was estimated. Thus, the standard error of the mean survival time for every 1000 simulations was obtained. In the bootstrap simulations, uncertainty was captured as follows: Weibull distribution: (1) γ = 0.6, censoring rate = 0.76; (2) γ = 1, censoring rate = 0.76; (3) γ = 1, censoring rate = 0.26, and lognormal distribution: (4) σ = 2, censoring rate = 0.26.

Results

For the tables and figures in this section, the methods are referred to as follows: the MLE was applied to the actual IPD (referred to as Actual IPD), the method proposed by Guyot et al. (referred to as Guyot et al.), the method proposed by Hoyle and Henley (referred to as Hoyle and Henley), the least squares method and the graphical method.

Simulation results

Table 4 and Fig. 1 presents the results when the survival time followed a Weibull distribution and the sample size was 100. When the censoring rate was as low as 0.26, all four methods performed well in the estimation of the mean survival time, and their estimates were as accurate as the values estimated using actual IPD.
Table 4

Simulation results of mean survival time, bias, and MSE at 0.26, 0.42 and 0.76 censoring rate with sample size = 100 from three shapes of hazard function (weibull distribution).

Weibull parameter (γ,λ)Censoring rateMethodMean survival timeBiasMSE
0.6, 6.650.76Actual IPD15.93365.934137.1899
Guyot et al.17.27547.276053.9205
Hoyle and Henley16.87126.871847.6772
Least Squares3416340615462811
Graphical Method357.54347.54160506
0.42Actual IPD10.54700.54760.3125
Guyot et al.10.60750.60810.3834
Hoyle and Henley10.97340.97400.9687
Least Squares11.43881.43942.3522
Graphical Method11.18261.18321.5023
0.26Actual IPD10.36710.36770.1432
Guyot et al.10.28060.28120.0869
Hoyle and Henley10.51700.51760.2769
Least Squares10.58910.58970.3611
Graphical Method10.59180.59240.3640
1, 100.76Actual IPD11.17901.17891.4209
Guyot et al.11.89041.89043.6299
Hoyle and Henley12.90222.90228.5909
Least Squares17.06207.061962.0269
Graphical Method14.42694.426922.7738
0.42Actual IPD10.14330.14330.0239
Guyot et al.10.18010.18010.0364
Hoyle and Henley10.25340.25340.0700
Least Squares10.36300.36300.1572
Graphical Method10.47290.47290.2463
0.26Actual IPD10.09390.09390.0107
Guyot et al.10.17930.17930.0349
Hoyle and Henley10.33460.33460.1150
Least Squares10.04750.04750.0056
Graphical Method10.42130.42130.1831
2, 11.280.76Actual IPD10.16990.17330.0319
Guyot et al.10.47400.47740.2334
Hoyle and Henley10.58820.59150.3555
Least Squares10.36040.36370.1397
Graphical Method11.22201.22541.5166
0.42Actual IPD10.08180.08520.0078
Guyot et al.10.12020.12360.0160
Hoyle and Henley10.10140.10480.0117
Least Squares10.13460.13800.0199
Graphical Method11.23301.23631.5351
0.26Actual IPD10.00600.00940.0004
Guyot et al.10.12960.13290.0182
Hoyle and Henley10.13120.13460.0187
Least Squares10.02440.02780.0015
Graphical Method11.40721.41051.9993
Fig 1

Simulation results for the five methods: 100 patients per trial.

The values of .6, 1 and 2 represent the shape values of the Weibull distribution and .76, .42 and .26 represent the censoring rates.

Simulation results for the five methods: 100 patients per trial.

The values of .6, 1 and 2 represent the shape values of the Weibull distribution and .76, .42 and .26 represent the censoring rates. When the censoring rate was 0.42, all four methods provided satisfactory estimates of the mean survival time, with the least squares and graphical methods exhibiting small biases in the scenarios with a decreasing hazard rate. When the censoring rate increased to 0.76, the least squares and graphical methods noticeably overestimated the mean survival time in the situations in which the hazard rate decreased or was constant. Because of the bias created from the application of MLE to actual IPD, the Guyot et al. and Hoyle and Henley methods provided biased estimates; however, these methods were still similar to the estimates obtained using actual IPD compared with the biases that resulted from the least squares and graphical methods. When the sample size was 100 and the data followed a Weibull distribution, the accuracy of the estimates of mean survival time was somewhat dependent on the shape of the hazard function and the level of censoring. For decreasing and constant hazard patterns, the biases observed were greater in scenarios with higher compared with lower censoring rates, particularly for a decreasing hazard pattern; these biases resulted from the use of MLE when the data were heavily censored. There may be a substantially greater amount of bias with decreasing compared with constant hazard rates. One potential explanation is that the bias magnitude from the MLE is more affected by decreasing compared with constant hazard rates. When the censoring rate was the same, e.g., 0.76 or 0.42, the magnitude of the bias was affected by the shape of hazard function of the survival models. There was a greater bias with decreasing compared with increasing or constant hazard rates. When the shape parameter was equal to 0.6, which corresponds to a decreasing hazard function, the percentage bias in the Actual IPD, the method of Guyot et al. and the Hoyle and Henley method was 59.34, 72.76 and 68.72%, respectively, whereas the percentage bias was 11.79, 18.90 and 29.02%, respectively, with a constant hazard rate and 0.09, 1.33 and 1.35%, respectively, with an increasing hazard rate. One potential explanation is that the amount of random censoring in the increasing hazard case was less than the decreasing and constant hazard cases. The results from Table 5 and Fig. 2 demonstrate that all methods performed well when the sample size was 500.
Table 5

Simulation results of mean survival time, bias, and MSE at 0.26, 0.42 and 0.76 censoring rate with sample size = 500 from three shapes of hazard function (weibull distribution).

Weibull parameter (γ,λ)Censoring rateMethodMean estimateBiasMSE
0.6, 6.650.76Actual IPD10.50270.50330.2645
Guyot et al.10.88780.88840.8038
Hoyle and Henley9.8664-0.13310.0311
Least Squares11.84711.84773.4665
Graphical Method11.38321.38381.9476
0.42Actual IPD10.12980.13040.0189
Guyot et al.10.19620.19680.0408
Hoyle and Henley9.9430-0.05640.0055
Least Squares10.22780.22840.0554
Graphical Method10.21390.21450.0491
0.26Actual IPD10.05850.05910.0048
Guyot et al.10.08990.09050.0095
Hoyle and Henley10.01230.01290.0015
Least Squares10.08020.08080.0082
Graphical Method10.07510.07570.0073
1, 100.76Actual IPD10.17640.17640.0330
Guyot et al.10.29650.29650.0906
Hoyle and Henley10.41900.41900.1785
Least Squares10.37850.37850.1477
Graphical Method10.26080.26080.0715
0.42Actual IPD10.00500.00500.0005
Guyot et al.10.02260.02260.0011
Hoyle and Henley10.03660.03660.0019
Least Squares9.9990-0.00100.0006
Graphical Method10.01650.01650.0010
0.26Actual IPD9.9738-0.02620.0010
Guyot et al.10.01950.01950.0008
Hoyle and Henley10.00790.00790.0005
Least Squares9.9587-0.04130.0023
Graphical Method10.01060.01060.0007
2, 11.280.76Actual IPD10.02020.02360.0009
Guyot et al.10.10550.10880.0124
Hoyle and Henley10.05760.06090.0042
Least Squares10.03480.03810.0020
Graphical Method10.24870.25210.0648
0.42Actual IPD10.02910.03250.0012
Guyot et al.10.06340.06680.0046
Hoyle and Henley10.03210.03540.0014
Least Squares10.03660.04000.0017
Graphical Method10.47390.47720.2287
0.26Actual IPD9.99960.00300.0001
Guyot et al.10.06890.07220.0053
Hoyle and Henley10.02320.02660.0008
Least Squares9.9957-0.00100.0001
Graphical Method10.72780.73120.5361
Fig 2

Simulation results for the five methods: 500 patients per trial.

The values of .6, 1 and 2 represent the shape values of the Weibull distribution and .76, .42 and .26 represent the censoring rates.

Simulation results for the five methods: 500 patients per trial.

The values of .6, 1 and 2 represent the shape values of the Weibull distribution and .76, .42 and .26 represent the censoring rates. Table 6 and Fig. 3 presents the results when the survival time followed a lognormal distribution and the sample size was 100. When σ was 1 and μ was 1.80, all methods performed well at all three levels of censoring. When σ was 2 and μ was 0.30, the magnitude of the bias was strongly affected by the level of censoring. The greater the degree of censoring, the larger the bias. The least squares and graphical methods clearly overestimated the estimation of the mean survival time. As a result of the MLE bias, the Guyot et al. and Hoyle and Henley methods also overestimated the estimates. The Guyot et al. method performed noticeably better than the Hoyle and Henley method when σ = 2 and μ = 0.30 and the censoring rates = 0.76 and 0.42.
Table 6

Simulation results of mean survival time, bias, and MSE at 0.26, 0.42 and 0.76 censoring rate with sample size = 100 from two shapes of hazard function (lognormal distribution).

lognormal parameter (σ, μ)Censoring rateMethodMean estimateBiasMSE
1, 1.800.76Actual IPD10.43270.43270.2016
Guyot et al.10.81300.81300.6828
Hoyle and Henley11.02611.02611.0759
Least Squares11.60331.60332.6679
Graphical Method11.32561.32561.8004
0.42Actual IPD10.22510.22510.0543
Guyot et al.10.24240.24240.0627
Hoyle and Henley10.43220.43220.1911
Least Squares10.29690.29690.0944
Graphical Method10.60830.60830.3757
0.26Actual IPD10.06510.06510.0064
Guyot et al.10.14530.14530.0238
Hoyle and Henley10.34880.34880.1246
Least Squares10.06120.06120.0073
Graphical Method10.67280.67280.4574
2, 0.300.76Actual IPD17.54187.541858.1389
Guyot et al.21.686011.6860139.4465
Hoyle and Henley26.513516.5135274.3040
Least Squares6936692690206329
Graphical Method221.712211.71255210
0.42Actual IPD11.35901.35901.9006
Guyot et al.11.68691.68692.9204
Hoyle and Henley15.38915.389129.3930
Least Squares26.824616.8246415.537
Graphical Method23.534513.5345267.124
0.26Actual IPD11.08471.08471.2089
Guyot et al.11.22151.22151.5269
Hoyle and Henley12.71592.71597.5270
Least Squares13.26153.261510.9698
Graphical Method13.16013.160110.3016
Fig 3

Simulation results for the five methods: 100 patients per trial.

The values of 2 and 1 represent the sigma values of the lognormal distribution and .76, .42 and .26 represent the censoring rates.

The values of 2 and 1 represent the sigma values of the lognormal distribution and .76, .42 and .26 represent the censoring rates. Table 7 and Fig. 4 presents the results when the survival time followed a lognormal distribution and the sample size was 500. When σ was 1 and μ was 1.80, all methods performed well. When σ was 2 and μ was 0.30, a greater bias was observed in the Hoyle and Henley method.
Table 7

Simulation results of mean survival time, bias, and MSE at 0.26, 0.42 and 0.76 censoring rate with sample size = 500 from two shapes of hazard function (lognormal distribution).

lognormal parameter (σ, μ)Censoring rateMethodMean estimateBiasMSE
1, 1.800.76Actual IPD10.13370.13370.0197
Guyot et al.10.19410.19410.0399
Hoyle and Henley10.28610.28610.0842
Least Squares10.13590.13590.0216
Graphical Method10.06610.06610.0067
0.42Actual IPD10.01240.01240.0007
Guyot et al.10.01800.01800.0009
Hoyle and Henley10.04340.04340.0025
Least Squares10.00540.00540.0008
Graphical Method10.02390.02390.0013
0.26Actual IPD10.01710.01710.0008
Guyot et al.10.05310.05310.0033
Hoyle and Henley10.05430.05430.0035
Least Squares9.9953-0.00470.0006
Graphical Method10.21330.21330.0462
2, 0.300.76Actual IPD11.00171.00171.0305
Guyot et al.11.18881.18881.4487
Hoyle and Henley13.30283.302810.9742
Least Squares13.11463.11469.8380
Graphical Method11.99601.99604.0446
0.42Actual IPD10.16530.16530.0329
Guyot et al.10.28600.28600.0881
Hoyle and Henley10.85180.85180.7367
Least Squares10.66400.66400.4567
Graphical Method10.57800.57800.3479
0.26Actual IPD10.28290.28290.0846
Guyot et al.10.30120.30110.0955
Hoyle and Henley10.45340.45340.2142
Least Squares10.50750.50750.2675
Graphical Method10.49200.49200.2520
Fig 4

Simulation results for the five methods: 500 patients per trial.

The values of 2 and 1 represent the sigma values of the lognormal distribution and .76, .42 and .26 represent the censoring rates.

The values of 2 and 1 represent the sigma values of the lognormal distribution and .76, .42 and .26 represent the censoring rates. The results demonstrated that the biases were also affected by the shape of the hazard function and the level of censoring in both the lognormal and Weibull distributions, regardless of the sample size. For the same high censoring rate, a greater bias will be produced in the decreasing hazard pattern compared with the other hazard patterns. The Fig. 5 and Fig. 6 presents the results when sample size was 1000 and survival time followed a Weibull distribution and lognormal distribution, respectively. The results showed that the performances of all methods in the cases of the larger trial were consistent with those in the cases of the sample size 500.
Fig 5

Simulation results for the five methods: 1000 patients per trial.

The simulation were run with γ set to 0.6 and λ set to 6.65 and censoring rate set to 0.76 for Weibull distribution.

Fig 6

Simulation results for the five methods: 1000 patients per trial.

The simulation were run with σ set to 2 and μ set to 0.30 and censoring rate set to 0.76 for lognormal distribution.

Simulation results for the five methods: 1000 patients per trial.

The simulation were run with γ set to 0.6 and λ set to 6.65 and censoring rate set to 0.76 for Weibull distribution. The simulation were run with σ set to 2 and μ set to 0.30 and censoring rate set to 0.76 for lognormal distribution.

Uncertainty results

The results of Fig. 7 indicate that in situations that use the Weibull distribution, the Guyot et al. method provided slightly better uncertainty measures of the mean survival time than the Hoyle and Henley method. When the lognormal distribution was used, the mean difference between the standard error of the mean survival time from actual IPD and the corresponding standard error reported by the Guyot et al. method was substantially less than the Hoyle and Henley method.
Fig 7

The results of uncertainty regarding the estimate of mean survival time.

The uncertainty was captured in the following situations: Weibull distribution: A. γ = 0.6, censoring rate = 0.76; B. γ = 1, censoring rate = 0.76; C. γ = 1, censoring rate = 0.26, and lognormal distribution: D. σ = 2, censoring rate = 0.26.

The results of uncertainty regarding the estimate of mean survival time.

The uncertainty was captured in the following situations: Weibull distribution: A. γ = 0.6, censoring rate = 0.76; B. γ = 1, censoring rate = 0.76; C. γ = 1, censoring rate = 0.26, and lognormal distribution: D. σ = 2, censoring rate = 0.26.

Discussion

As expected, the application of the MLE to actual IPD was the most accurate method compared with the other methods in the simulation study; however, it provided noticeable biases in situations in which the level of censoring was high and the hazard function decreased over time. The biases occurred because the MLE can be quite biased when the sample size is small or when the survival data are heavily censored. [25-26] To correct this problem, a modified MLE (MMLE) was proposed to reduce bias in the estimates of the Weibull shape parameter through the modification of the profile likelihood [26]. The MLE and MMLE work well for complete, Type I, or Type II censored data, but they are not useful for random censored data. More recently, Shen [10] proposed a method that can be applied not only to censored data as previously discussed but also to other data, such as random censoring, progressive Type II censoring, and adaptive Type II progressive censoring. If the MMLE or the method proposed by Shen is applied to IPD, the bias will be reduced. However, the most commonly used statistical software packages do not include programs for MMLE or the method proposed by Shen. This debate is beyond our study’s scope, and we believe that bias will be reduced as long as the programs for the corrected-MLE method are included in statistical software. In general, the least squares and graphic methods provided biased estimates; the estimates were heavily biased in some scenarios. For the Weibull distribution, the bias differences between the two proposed methods were not significant. Therefore, both of the two proposed methods provided satisfactory estimates of the mean survival time; these estimates were almost as accurate as the estimates obtained using actual IPD. However, the number of events was used in the simulation for the Guyot et al. method, and this information is often not provided in clinical trials. Therefore, the simulation design favored the Guyot et al. method. In addition, the Guyot et al. method required substantially more data preparation work compared with the Hoyle and Henley method. The data prepared for the Guyot et al. method should be sufficient, and every step in the KM curves should be captured. Hundreds of data points were required for the Guyot et al. method, whereas the number of data points required for the Hoyle and Henley method was substantially less; for example, when 6 time points were reported, 24 data points were required for Hoyle and Henley method. There is a larger probability for bias to occur from data extraction when it is necessary to extract more data. In the simulation study, the actual survival probabilities were used instead of the probabilities obtained from the published survival curves. Therefore, the results from the simulation study were the most accurate results that could be obtained from the four methods. Six time intervals were used for most scenarios in the simulation. The Hoyle and Henley method worked better when more time intervals were reported in the clinical trial. Published data typically include a minimum of 6 time intervals. Therefore, the accuracy of the Hoyle and Henley method assessed here is likely the minimum accuracy that can be obtained from this method. Given that both the Guyot et al. and Hoyle and Henley methods provided satisfactory estimates of the mean survival time when the Weibull distribution was fitted, we believe that the Hoyle and Henley method would not perform worse than the Guyot et al. method in the real world. For the lognormal distribution, the Hoyle and Henley method resulted in more noticeable biases compared with the other methods potentially because the Hoyle and Henley method is affected to a greater degree by the long tail of the lognormal distribution. In the simulation, the sample sizes of 100 and 500 were chosen, because the sample size 100–500 covers the range of trials typically encountered well and reflect what is often seen in large trials. [16-18] However, there are trials in which sample size is less than 100 or more than 500. The results showed that the performances of all methods in the cases of the larger trial are consistent with those in the cases of the sample size 500. For small trials, it’s possible to see the exact drops in the Kaplan Meier curve and the tick marks for censorships. In these situations we don’t need to use the two proposed methods.

Conclusions

The objective of this article was to review and compare the methods used to recreate individual patient data for economic evaluations from published Kaplan-Meier survival curves using a Monte Carlo simulation. Of the methods assessed in this study, the least squares and graphical methods were the least preferred methods because of their remarkable overestimation in some situations. Because the simulation design favored the Guyot et al. method and the accuracy of the Hoyle and Henley method assessed in the simulation is likely the minimum accuracy that can be obtained from this method, both the Hoyle and Henley and Guyot et al. methods provided satisfactory estimates and uncertainty values of the mean survival time for the Weibull distribution; these estimates were as accurate as the values estimated using actual IPD. If the lognormal distribution was utilized, the Guyot et al. method performed noticeably better when sigma equaled 2 and the censoring rate was high, e.g., 0.76 and 0.42, compared with the Hoyle and Henley method.
  18 in total

1.  Survival modeling for the estimation of transition probabilities in model-based economic evaluations in the absence of individual patient data: a tutorial.

Authors:  Vakaramoko Diaby; Georges Adunlin; Alberto J Montero
Journal:  Pharmacoeconomics       Date:  2014-02       Impact factor: 4.981

2.  Survival analysis for economic evaluations alongside clinical trials--extrapolation with patient-level data: inconsistencies, limitations, and a practical guide.

Authors:  Nicholas R Latimer
Journal:  Med Decis Making       Date:  2013-01-22       Impact factor: 2.583

Review 3.  Overview of parametric survival analysis for health-economic applications.

Authors:  K Jack Ishak; Noemi Kreif; Agnes Benedict; Noemi Muszbek
Journal:  Pharmacoeconomics       Date:  2013-08       Impact factor: 4.981

4.  Progression-free survival with fulvestrant 500 mg and alternative endocrine therapies as second-line treatment for advanced breast cancer: a network meta-analysis with parametric survival models.

Authors:  Shannon Cope; Mario J N M Ouwens; Jeroen P Jansen; Peter Schmid
Journal:  Value Health       Date:  2013-01-26       Impact factor: 5.725

5.  Clinical and cost-effectiveness of compression hosiery versus compression bandages in treatment of venous leg ulcers (Venous leg Ulcer Study IV, VenUS IV): a randomised controlled trial.

Authors:  Rebecca L Ashby; Rhian Gabe; Shehzad Ali; Una Adderley; J Martin Bland; Nicky A Cullum; Jo C Dumville; Cynthia P Iglesias; Arthur R Kang'ombe; Marta O Soares; Nikki C Stubbs; David J Torgerson
Journal:  Lancet       Date:  2013-12-06       Impact factor: 79.321

6.  Multicenter randomized controlled trial of conventional versus laparoscopic surgery for colorectal cancer within an enhanced recovery programme: EnROL.

Authors:  Robin H Kennedy; E Anne Francis; Rose Wharton; Jane M Blazeby; Philip Quirke; Nicholas P West; Susan J Dutton
Journal:  J Clin Oncol       Date:  2014-05-05       Impact factor: 44.544

7.  Improved curve fits to summary survival data: application to economic evaluation of health technologies.

Authors:  Martin W Hoyle; William Henley
Journal:  BMC Med Res Methodol       Date:  2011-10-10       Impact factor: 4.615

8.  Adjusting survival time estimates to account for treatment switching in randomized controlled trials--an economic evaluation context: methods, limitations, and recommendations.

Authors:  Nicholas R Latimer; Keith R Abrams; Paul C Lambert; Michael J Crowther; Allan J Wailoo; James P Morden; Ron L Akehurst; Michael J Campbell
Journal:  Med Decis Making       Date:  2014-01-21       Impact factor: 2.583

9.  Incidence and associations of poststroke epilepsy: the prospective South London Stroke Register.

Authors:  Neil S N Graham; Siobhan Crichton; Michael Koutroumanidis; Charles D A Wolfe; Anthony G Rudd
Journal:  Stroke       Date:  2013-01-31       Impact factor: 7.914

10.  A method for analyzing censored survival phenotype with gene expression data.

Authors:  Tongtong Wu; Wei Sun; Shinsheng Yuan; Chun-Houh Chen; Ker-Chau Li
Journal:  BMC Bioinformatics       Date:  2008-10-06       Impact factor: 3.169

View more
  13 in total

1.  A Comparison of Different Analysis Methods for Reconstructed Survival Data to Inform Cost‑Effectiveness Analysis.

Authors:  Sandjar Djalalov; Jaclyn Beca; Emmanuel M Ewara; Jeffrey S Hoch
Journal:  Pharmacoeconomics       Date:  2019-12       Impact factor: 4.981

2.  Estimation of indirect effect when the mediator is a censored variable.

Authors:  Jian Wang; Sanjay Shete
Journal:  Stat Methods Med Res       Date:  2017-01-30       Impact factor: 3.021

Review 3.  Influence of glioblastoma contact with the lateral ventricle on survival: a meta-analysis.

Authors:  Akshitkumar M Mistry; Andrew T Hale; Lola B Chambless; Kyle D Weaver; Reid C Thompson; Rebecca A Ihrie
Journal:  J Neurooncol       Date:  2016-09-19       Impact factor: 4.130

4.  Cost-effectiveness Analysis of Pembrolizumab Plus Axitinib Versus Sunitinib in First-line Advanced Renal Cell Carcinoma in China.

Authors:  Jun Chen; Gaoyun Hu; Zhuo Chen; Xiaomin Wan; Chongqing Tan; Xiaohui Zeng; Zeneng Cheng
Journal:  Clin Drug Investig       Date:  2019-10       Impact factor: 2.859

5.  Integrating real world data and clinical trial results using survival data reconstruction and marginal moment-balancing weights.

Authors:  Kylie Getz; Ronac Mamtani; Rebecca A Hubbard
Journal:  J Biopharm Stat       Date:  2021-11-10       Impact factor: 1.503

6.  First-line Nivolumab Plus Ipilimumab vs Sunitinib for Metastatic Renal Cell Carcinoma: A Cost-effectiveness Analysis.

Authors:  XiaoMin Wan; YuCong Zhang; ChongQing Tan; XiaoHui Zeng; LiuBao Peng
Journal:  JAMA Oncol       Date:  2019-04-01       Impact factor: 31.777

7.  A technique for approximating transition rates from published survival analyses.

Authors:  Markian A Pahuta; Joel Werier; Eugene K Wai; Roy A Patchell; Doug Coyle
Journal:  Cost Eff Resour Alloc       Date:  2019-07-01

8.  Phase I/II Clinical Trial-Based Early Economic Evaluation of Acalabrutinib for Relapsed Chronic Lymphocytic Leukaemia.

Authors:  Rick A Vreman; Joost W Geenen; Anke M Hövels; Wim G Goettsch; Hubert G M Leufkens; Maiwenn J Al
Journal:  Appl Health Econ Health Policy       Date:  2019-12       Impact factor: 2.561

9.  Long term surgical outcomes for infective endocarditis in people who inject drugs: a systematic review and meta-analysis.

Authors:  David Goodman-Meza; Robert E Weiss; Sebastián Gamboa; Abel Gallegos; Alex A T Bui; Matthew B Goetz; Steven Shoptaw; Raphael J Landovitz
Journal:  BMC Infect Dis       Date:  2019-11-08       Impact factor: 3.090

Review 10.  Is monitoring of plasma 5-fluorouracil levels in metastatic / advanced colorectal cancer clinically effective? A systematic review.

Authors:  Karoline Freeman; Mark P Saunders; Olalekan A Uthman; Sian Taylor-Phillips; Martin Connock; Rachel Court; Tara Gurung; Paul Sutcliffe; Aileen Clarke
Journal:  BMC Cancer       Date:  2016-07-25       Impact factor: 4.430

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.