Sarah F McGough1, Devin Incerti1, Svetlana Lyalina1, Ryan Copping1, Balasubramanian Narasimhan2,3, Robert Tibshirani2,3.
Abstract
High-dimensional data are becoming increasingly common in the medical field as large volumes of patient information are collected and processed by high-throughput screening, electronic health records, and comprehensive genomic testing. Statistical models that attempt to study the effects of many predictors on survival typically implement feature selection or penalized methods to mitigate the undesirable consequences of overfitting. In some cases survival data are also left-truncated, which can give rise to immortal time bias, but penalized survival methods that adjust for left truncation are not commonly implemented. To address these challenges, we apply a penalized Cox proportional hazards model for left-truncated and right-censored survival data and assess the implications of left truncation adjustment for bias and interpretation. We use simulation studies and a high-dimensional, real-world clinico-genomic database to highlight the pitfalls of failing to account for left truncation in survival modeling.
Keywords: Cox model; high-dimensional data; lasso; left truncation; penalized regression; survival analysis
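To make the left truncation adjustment concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes the standard delayed-entry formulation, in which a patient counts as at risk at an event time only after cohort entry, with Breslow-style handling of ties. The function name, the argument names, and the 1/n scaling of the likelihood are illustrative assumptions.

```python
import math


def neg_log_partial_likelihood(beta, X, entry, time, event, lam=0.0, adjust=True):
    """Lasso-penalized negative log partial likelihood for a Cox model.

    Illustrative sketch (hypothetical helper, not from the paper).
    beta  : list of coefficients
    X     : list of covariate rows, one per patient
    entry : left truncation (entry) times, assumed to satisfy entry < time
    time  : event or censoring times
    event : 1 if the event was observed, 0 if censored
    lam   : lasso penalty weight
    adjust: if True, patient j is at risk at t only when entry[j] < t <= time[j];
            if False, entry times are ignored (no left truncation adjustment)
    """
    n = len(time)
    # Linear predictor x_j' beta for every patient.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]

    loglik = 0.0
    for i in range(n):
        if not event[i]:
            continue  # censored patients contribute only through risk sets
        t = time[i]
        risk_set = [
            j for j in range(n)
            if time[j] >= t and (not adjust or entry[j] < t)
        ]
        loglik += eta[i] - math.log(sum(math.exp(eta[j]) for j in risk_set))

    penalty = lam * sum(abs(b) for b in beta)
    return -loglik / n + penalty
```

With `adjust=False`, every patient is treated as at risk from time zero, including the interval before cohort entry during which, by construction, the patient could not have been observed to die. That interval is the source of the immortal time bias mentioned in the abstract.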
Year: 2021 PMID: 34302373 PMCID: PMC9290657 DOI: 10.1002/sim.9136
Source DB: PubMed Journal: Stat Med ISSN: 0277-6715 Impact factor: 2.497
FIGURE 1 Left‐truncated and right‐censored patient follow‐up in a hypothetical study cohort, ordered chronologically by event time. Patients who receive a diagnosis (closed circle) become eligible to enter the cohort after reaching a milestone (black triangle), for example a genomic test. Patients are followed until death (open circle) or censoring (cross). However, patients who die or are censored before reaching the milestone are left‐truncated (in red), and only those who have survived until eligibility (in black) are observed. Left truncation time, or the time between diagnosis and cohort entry, is shown with a dashed line. Left truncation time is also referred to as “entry time” [Colour figure can be viewed at wileyonlinelibrary.com]
FIGURE 2 Distribution of left truncation time (days) in nonsmall cell lung cancer patients in the clinico‐genomic database
FIGURE 3 Calibration of survival predictions for the lasso model in the simulation. Notes: The Cox model with lasso penalty was fit using the training data and was subsequently used to predict the survival function for each patient in the test set. The small and large p models contained 21 and 1011 predictors, respectively. Patients were divided into deciles at each time point based on their predicted survival probabilities. Each point in the plot represents patients within a decile. The “Predicted survival probability” is the average of the predicted survival probabilities from the Cox model across patients within each decile and the “Observed survival probability” is the Kaplan‐Meier estimate of the proportion surviving within each decile. A perfect prediction lies on the black 45‐degree line [Colour figure can be viewed at wileyonlinelibrary.com]
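The decile-based calibration procedure described in the Figure 3 notes can be sketched as follows. This is a hedged illustration in plain Python, not the authors' code: the function names are hypothetical, and the Kaplan‐Meier helper handles right censoring only (no delayed entry), which may differ from how the paper computes observed survival.

```python
def kaplan_meier_at(times, events, t):
    """Kaplan-Meier estimate of S(t) for right-censored data.

    Assumption: no left truncation handling; every patient is at risk from time 0.
    """
    surv = 1.0
    # Loop over the distinct observed event times up to and including t.
    for u in sorted({ti for ti, e in zip(times, events) if e and ti <= t}):
        at_risk = sum(1 for ti in times if ti >= u)
        deaths = sum(1 for ti, e in zip(times, events) if e and ti == u)
        surv *= 1.0 - deaths / at_risk
    return surv


def calibration_points(pred_surv, times, events, t, n_groups=10):
    """Return one (mean predicted, Kaplan-Meier observed) pair per group.

    pred_surv: predicted survival probability at time t for each patient
    n_groups : number of quantile groups (10 gives deciles)
    """
    order = sorted(range(len(pred_surv)), key=lambda i: pred_surv[i])
    n = len(order)
    points = []
    for g in range(n_groups):
        idx = order[g * n // n_groups:(g + 1) * n // n_groups]
        if not idx:
            continue  # skip empty groups when there are fewer patients than groups
        mean_pred = sum(pred_surv[i] for i in idx) / len(idx)
        observed = kaplan_meier_at(
            [times[i] for i in idx], [events[i] for i in idx], t
        )
        points.append((mean_pred, observed))
    return points
```

Plotting the returned pairs against the 45‐degree line gives a calibration plot for time `t`; points below the line indicate predicted survival that is higher than observed.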
Comparison of model discrimination in the simulation
| Left truncation adjustment | Test sample | C‐index |
|---|---|---|
| No | Observed | 0.72 |
| Yes | Observed | 0.67 |
| No | Complete | 0.69 |
| Yes | Complete | 0.69 |
Note: Cox models with lasso penalty were fit using the high‐dimensional simulated data.
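For readers unfamiliar with the C‐index reported in the table, the sketch below computes Harrell's concordance index for right-censored data. The exact estimator used in the paper is not stated in this record, so this is an assumption chosen for illustration, and the function name is hypothetical.

```python
def concordance_index(risk, times, events):
    """Harrell's C-index for right-censored data (illustrative sketch).

    risk  : predicted risk scores (higher means shorter expected survival)
    times : event or censoring times
    events: 1 if the event was observed, 0 if censored

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's event or censoring time.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[j] > times[i]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. Because the index only measures ranking, a model can score well on discrimination while being poorly calibrated.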
FIGURE 4 Hazard ratios from the Cox lasso model. Notes: The figure for the “Large” p model only includes variables ranked in the top 10 by the absolute value of the hazard ratio in either the left truncation adjusted or nonadjusted model [Colour figure can be viewed at wileyonlinelibrary.com]
Comparison of model discrimination in the clinico‐genomic database
| Model | Covariate size | C‐index (no adjustment) | C‐index (adjustment) |
|---|---|---|---|
| Cox | Small | 0.649 | 0.580 |
| Cox (lasso) | Small | 0.648 | 0.580 |
| Cox | Large | 0.626 | 0.556 |
| Cox (lasso) | Large | 0.663 | 0.600 |
Note: “Adjustment” indicates models with a left truncation adjustment.
FIGURE 5 Calibration of survival predictions in the clinico‐genomic database from the Cox lasso model [Colour figure can be viewed at wileyonlinelibrary.com]