Na Liu, Yanhong Zhou, J Jack Lee.
Abstract
BACKGROUND: In secondary analyses of published survival data, it is critical to obtain each patient's raw data, because the individual patient data (IPD) approach is considered the gold standard of data analysis. However, researchers often lack access to IPD. We aim to propose a straightforward and robust approach to obtain IPD from published survival curves with a user-friendly software platform.
Keywords: Individual patient data (IPD); Kaplan-Meier curve; Meta-analysis; R package; Shiny application; Survival analysis
Year: 2021 PMID: 34074267 PMCID: PMC8168323 DOI: 10.1186/s12874-021-01308-8
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Fig. 1 The flowchart of IPD reconstruction from published K-M curves
Table 1 Overview of the user-visible functions in IPDfromKM
| Function | Description | Object returned |
|---|---|---|
| getpoints | Extract raw data coordinates from published K-M curves. | A data frame containing the x- and y- coordinates of the K-M curve of interest. |
| preprocess | Preprocess the read-in data coordinates. | A list including cleaned data ready for reconstruction and a “riskmat” table displaying the index of read-in points within each time interval. |
| getIPD | Estimate the IPD. | A list including the reconstructed IPD. |
| survreport | Perform survival analysis on reconstructed IPD. | K-M curve, cumulative hazard, times for targeted survival probabilities. |
| plot | Plot the object returned by getIPD(). | K-M curves and number at risk for both reconstructed IPD and read-in data. |
| summary | Summarize objects returned by getIPD(). | Descriptive results for accuracy assessment and survival analysis on reconstructed IPD. |
Please consult the documentation (e.g., help("preprocess")) for function arguments and detailed return types.
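To make concrete what a survival analysis on reconstructed IPD computes, here is a minimal Python sketch of the Kaplan-Meier estimator with a median-survival lookup. It is not the IPDfromKM implementation (the package is written in R). The function names `kaplan_meier` and `median_survival` and the toy data are assumptions made for illustration only.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if the event was observed, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the conditional survival at t.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= leaving[t]
    return curve


def median_survival(curve):
    """First event time at which S(t) is at or below 0.5; None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical IPD: (time, event indicator) for eight patients.
ipd_times = [2, 3, 3, 5, 8, 8, 9, 12]
ipd_events = [1, 1, 0, 1, 1, 0, 1, 0]
curve = kaplan_meier(ipd_times, ipd_events)
median = median_survival(curve)
```

The median convention used here (first time the curve reaches 0.5 or below) is one common choice; statistical software may handle the case where the curve equals 0.5 exactly in a different way.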
Fig. 2 IPD reconstruction of two treatments using the Shiny application
Table 2 Estimates of median overall survival (OS) and hazard ratio using the modified-iKM algorithm based on data extracted using different software (R: R package IPDfromKM, D: DigitizeIt, S: ScanIt), in comparison to published results (Report) in the POLAR trial
| Group | Arm | n | Median OS: Report | Median OS: R | Median OS: D | Median OS: S | Hazard ratio: Report | Hazard ratio: R | Hazard ratio: D | Hazard ratio: S |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Atezolizumab | 24 | 15.5 | 15.5 | 15.5 | 15.5 | 0.49 | 0.48 | 0.46 | 0.45 |
| | Docetaxel | 23 | 11.1 | 11.1 | 11.1 | 11.1 | | | | |
| 2 | Atezolizumab | 50 | 15.1 | 15.3 | 15.3 | 15.3 | 0.54 | 0.54 | 0.56 | 0.53 |
| | Docetaxel | 55 | 7.4 | 7.6 | 7.4 | 8.1 | | | | |
| 3 | Atezolizumab | 93 | 15.5 | 15.3 | 15.5 | 15.7 | 0.59 | 0.59 | 0.58 | 0.59 |
| | Docetaxel | 102 | 9.2 | 9.3 | 9.2 | 9.6 | | | | |
| 4 | Atezolizumab | 51 | 9.7 | 9.7 | 9.7 | 9.5 | 1.04 | 1.06 | 0.99 | 1.03 |
| | Docetaxel | 41 | 9.7 | 9.7 | 9.7 | 9.8 | | | | |
| 5 | Atezolizumab | 144 | 12.6 | 13.3 | 12.4 | 12.3 | 0.73 | 0.72 | 0.70 | 0.72 |
| | Docetaxel | 143 | 9.7 | 9.7 | 9.7 | 9.8 | | | | |
Group 1: TC3 or IC3; 2: TC2/3 or IC2/3; 3: TC1/2/3 or IC1/2/3; 4: TC0 or IC0; 5: all patients. The value of n refers to sample size
Fig. 3 Accuracy assessment in terms of the number of patients at risk
Fig. 4 Survival analysis on the reconstructed IPD and the true data
Table 3 Hypothesis testing using log-rank test with true IPD versus with reconstructed IPD based on data points extracted using different software
| Curve/Group | True IPD: HR(SE) | True IPD: p-value | IPDfromKM: HR(SE) | IPDfromKM: p-value | DigitizeIt: HR(SE) | DigitizeIt: p-value | ScanIt: HR(SE) | ScanIt: p-value |
|---|---|---|---|---|---|---|---|---|
| Simulated curves: | | | | | | | | |
| 1 | 0.47(0.12) | <0.001 | 0.48(0.13) | <0.001 | 0.47(0.12) | <0.001 | 0.47(0.12) | <0.001 |
| 2 | 0.40(0.18) | <0.001 | 0.40(0.18) | <0.001 | 0.40(0.18) | <0.001 | 0.40(0.18) | <0.001 |
| 3 | 0.70(0.12) | 0.002 | 0.70(0.12) | 0.003 | 0.71(0.12) | 0.003 | 0.70(0.12) | 0.002 |
| 4 | 1.06(0.16) | 0.735 | 1.06(0.16) | 0.727 | 1.05(0.16) | 0.761 | 1.05(0.16) | 0.780 |
| 5 | 0.76(0.13) | 0.028 | 0.77(0.13) | 0.044 | 0.77(0.13) | 0.039 | 0.77(0.13) | 0.037 |
| 6 | 0.48(0.16) | <0.001 | 0.48(0.16) | <0.001 | 0.48(0.16) | <0.001 | 0.48(0.16) | <0.001 |
| Real trial example: | | | | | | | | |
| 1 | 0.49(0.22) | 0.068 | 0.48(0.41) | 0.064 | 0.46(0.38) | 0.041 | 0.45(0.41) | 0.041 |
| 2 | 0.54(0.14) | 0.014 | 0.54(0.26) | 0.017 | 0.56(0.26) | 0.020 | 0.53(0.26) | 0.013 |
| 3 | 0.59(0.11) | 0.005 | 0.58(0.19) | 0.005 | 0.58(0.19) | 0.005 | 0.59(0.19) | 0.007 |
| 4 | 1.04(0.29) | 0.871 | 1.06(0.26) | 0.830 | 0.99(0.25) | 0.959 | 1.03(0.26) | 0.917 |
| 5 | 0.73(0.12) | 0.040 | 0.72(0.15) | 0.031 | 0.70(0.14) | 0.019 | 0.72(0.15) | 0.036 |
HR hazard ratio, SE standard error
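For readers who want to see how a log-rank comparison of two arms is computed from IPD (true or reconstructed), the following is a minimal Python sketch of the standard two-sample log-rank statistic. It is an illustration under stated assumptions, not the code used to produce the table above: the function name `logrank_test` and the toy arms are hypothetical, and the p-value uses the chi-square approximation with one degree of freedom.

```python
import math
from collections import Counter


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-sample log-rank test. Returns (chi_square, p_value)."""
    deaths_a = Counter(t for t, e in zip(times_a, events_a) if e)
    deaths_b = Counter(t for t, e in zip(times_b, events_b) if e)
    event_times = sorted(set(deaths_a) | set(deaths_b))

    obs_minus_exp = 0.0  # observed minus expected events in arm A
    variance = 0.0       # hypergeometric variance summed over event times
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in arm A just before t
        n_b = sum(1 for x in times_b if x >= t)  # at risk in arm B just before t
        n = n_a + n_b
        d_a = deaths_a.get(t, 0)
        d = d_a + deaths_b.get(t, 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0  # no information to separate the arms
    chi_sq = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_sq / 2.0))
    return chi_sq, p_value


# Hypothetical arms: every patient has an observed event.
chi_sq, p_value = logrank_test([1, 2, 3, 4, 5], [1] * 5,
                               [6, 7, 8, 9, 10], [1] * 5)
```

Applying the same function to true and to reconstructed IPD for the same trial gives a direct comparison of the kind reported in the table.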
Table 4 Hazard ratio and 95% Bootstrap confidence interval (BCI) for the six simulated trials in Table 3
| Curve | True IPD | Reconstructed IPD: IPDfromKM | Reconstructed IPD: DigitizeIt | Reconstructed IPD: ScanIt |
|---|---|---|---|---|
| 1 | 0.469 [0.369, 0.594] | 0.487 [0.379, 0.627] | 0.470 [0.366, 0.597] | 0.474 [0.368, 0.593] |
| 2 | 0.408 [0.288, 0.554] | 0.410 [0.281, 0.570] | 0.403 [0.278, 0.566] | 0.407 [0.277, 0.546] |
| 3 | 0.699 [0.557, 0.876] | 0.706 [0.560, 0.873] | 0.713 [0.561, 0.882] | 0.707 [0.563, 0.871] |
| 4 | 1.081 [0.768, 1.481] | 1.072 [0.764, 1.491] | 1.076 [0.758, 1.474] | 1.064 [0.768, 1.470] |
| 5 | 0.767 [0.602, 0.971] | 0.780 [0.596, 0.996] | 0.778 [0.589, 0.991] | 0.773 [0.592, 0.990] |
| 6 | 0.486 [0.345, 0.653] | 0.485 [0.343, 0.665] | 0.486 [0.349, 0.657] | 0.483 [0.340, 0.653] |
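The idea behind a bootstrap confidence interval such as those in the table above can be sketched in a few lines of Python. This is an illustration only and makes two loud assumptions: the statistic is a simple ratio of event rates (events divided by total follow-up time, which estimates the hazard ratio only under constant hazards) swapped in for a Cox-model hazard ratio, and the interval is a plain percentile interval. The source does not state which estimator or bootstrap variant was used, and the names `rate_ratio` and `bootstrap_ci` are hypothetical.

```python
import random


def rate_ratio(arm_a, arm_b):
    """Ratio of event rates (events / total follow-up), arm_a over arm_b.

    Each arm is a list of (time, event) pairs with positive times.
    """
    rate_a = sum(e for _, e in arm_a) / sum(t for t, _ in arm_a)
    rate_b = sum(e for _, e in arm_b) / sum(t for t, _ in arm_b)
    return rate_a / rate_b


def bootstrap_ci(arm_a, arm_b, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap interval, resampling patients within each arm."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        res_a = rng.choices(arm_a, k=len(arm_a))
        res_b = rng.choices(arm_b, k=len(arm_b))
        if sum(e for _, e in res_b) == 0:
            continue  # ratio is undefined without events in the reference arm
        stats.append(rate_ratio(res_a, res_b))
    stats.sort()
    alpha = 1.0 - level
    lower = stats[int((alpha / 2) * (len(stats) - 1))]
    upper = stats[int((1 - alpha / 2) * (len(stats) - 1))]
    return lower, upper


# Hypothetical IPD: the treatment arm has longer follow-up and fewer events.
treatment = [(10 + i, i % 2) for i in range(20)]
control = [(5 + i, 1) for i in range(20)]
point = rate_ratio(treatment, control)
ci_low, ci_high = bootstrap_ci(treatment, control)
```

Fixing the seed makes the interval reproducible; a different seed or number of resamples will shift the endpoints slightly.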
Table 5 Stability of modified-iKM in comparison to the original iKM method
| Interval | Lower | Upper | trisk | nrisk | Modified-iKM | | Original iKM | |
|---|---|---|---|---|---|---|---|---|
| When 6-interval information is used: | | | | | | | | |
| 1 | 1 | 19 | 0 | 200 | 200 | 6 | 200 | 6 |
| 2 | 20 | 43 | 10 | 177 | 177 | 3 | 177 | 3 |
| 3 | 44 | 63 | 20 | 133 | 133 | 41 | 133 | 41 |
| 4 | 64 | 87 | 30 | 64 | 64 | 15 | 64 | 17 |
| 5 | 88 | 117 | 40 | 24 | 24 | 3 | 24 | 2 |
| 6 | 118 | 132 | 50 | 7 | 7 | 5 | 7 | 7 |
| When 11-interval information is used: | | | | | | | | |
| 1 | 1 | 10 | 0 | 200 | 200 | 0 | 200 | 0 |
| 2 | 11 | 19 | 5 | 191 | 191 | 6 | 191 | 6 |
| 3 | 20 | 32 | 10 | 177 | 177 | 0 | 177 | -3 |
| 4 | 33 | 43 | 15 | 153 | 153 | 3 | 153 | 3 |
| 5 | 44 | 53 | 20 | 133 | 133 | 20 | 133 | 19 |
| 6 | 54 | 63 | 25 | 100 | 100 | 21 | 100 | 22 |
| 7 | 64 | 77 | 30 | 64 | 64 | 3 | 64 | 6 |
| 8 | 78 | 87 | 35 | 39 | 39 | 11 | 39 | 11 |
| 9 | 88 | 103 | 40 | 24 | 24 | 0 | 24 | -2 |
| 10 | 104 | 117 | 45 | 12 | 12 | 2 | 11 | 2 |
| 11 | 118 | 132 | 50 | 7 | 7 | 5 | 7 | 7 |
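The Interval, Lower, Upper, trisk and nrisk columns above follow the layout of the "riskmat" table described for preprocess: for each reporting interval, the indices of the first and last read-in points that fall inside it. A minimal Python sketch of how such an index table could be built is given below. It assumes a point belongs to an interval when its time is at or after the interval start and strictly before the next start. That boundary rule and the function name `interval_index_table` are assumptions for illustration, not confirmed behaviour of the package.

```python
from bisect import bisect_left


def interval_index_table(point_times, trisk, nrisk):
    """Rows of (interval, lower, upper, trisk, nrisk) with 1-based point indices.

    point_times : sorted times of the points read off the K-M curve
    trisk       : start time of each reporting interval
    nrisk       : reported number at risk at each start time
    """
    rows = []
    for i, (t_start, n) in enumerate(zip(trisk, nrisk)):
        t_end = trisk[i + 1] if i + 1 < len(trisk) else float("inf")
        lower = bisect_left(point_times, t_start) + 1  # first point with time >= t_start
        upper = bisect_left(point_times, t_end)        # last point with time < t_end
        if lower <= upper:                             # skip intervals with no points
            rows.append((i + 1, lower, upper, t_start, n))
    return rows


# Hypothetical read-in times and a three-interval risk table.
times = [0, 2, 4.5, 9.9, 10, 12, 19, 20, 25]
table = interval_index_table(times, trisk=[0, 10, 20], nrisk=[200, 177, 133])
```

With more reporting intervals, as in the 11-interval case above, each interval covers fewer read-in points, so each reported number at risk constrains a shorter stretch of the curve.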