Lucy M Bull, Mark Lunt, Glen P Martin, Kimme Hyrich, Jamie C Sergeant.
Abstract
BACKGROUND: Clinical prediction models (CPMs) predict the risk of health outcomes for individual patients. The majority of existing CPMs only harness cross-sectional patient information. Incorporating repeated measurements, such as those stored in electronic health records, into CPMs may provide an opportunity to enhance their performance. However, the number and complexity of methodological approaches available could make it difficult for researchers to explore this opportunity. Our objective was to review the literature and summarise existing approaches for harnessing repeated measurements of predictor variables in CPMs, primarily to make this field more accessible for applied researchers.
Keywords: Clinical risk prediction; Dynamic prediction; Electronic health records; Joint models; Longitudinal data; Personalised medicine; Prediction models; Repeated observations; Survival analysis; Time-dependent covariates
Year: 2020 PMID: 32671229 PMCID: PMC7346415 DOI: 10.1186/s41512-020-00078-z
Source DB: PubMed Journal: Diagn Progn Res ISSN: 2397-7523
Search strategy in Ovid format, as entered into MEDLINE and Embase
| Line | Search terms |
|---|---|
| 1 | ((repeat* adj1 measure*) OR "repeatedly-measured" OR (repeat* adj1 observ*) OR "repeatedly-observed").ti,ab |
| 2 | ("time-series" OR "time-series" OR ("longitudinal" adj2 "data")).ti,ab |
| 3 | (("longitudinal" adj3 ("survival" OR "binary")) OR (("longitudinal" OR "repeat" OR "discrete") adj2 ("time-to-event" OR (event* adj2 time*)))).ti,ab |
| 4 | ((time-depend* OR "time-varying" OR "longitudinal") adj1 (coefficient* OR variable* OR covariate* OR marker* OR factor* OR observ* OR measure* OR biomarker* OR model* OR predictor*)).ti,ab |
| 5 | ((predict* adj1 (accurac* OR "power" OR individual* OR "future" OR "time-to-event" OR (event* adj2 time*) OR "binary")) OR "predictive ability" OR "predictive performance").ti,ab |
| 6 | ((("predictive" OR "prediction" OR "prognostic") adj1 (tool* OR scor* OR "algorithm" OR model* OR rule*))).ti,ab |
| 7 | (("predict" or "predicts" or "prediction" or "predicting") adj2 (risk* OR "outcome" OR "incidence" OR "time" OR development* OR event* OR "disease" OR recurrence* OR "progression" OR "severity" OR "achievement" OR "status" OR "application")).ti,ab |
| 8 | (("predict" or "predicts" or "prediction" or "predicting") adj3 ("mortality" OR "survival")).ti,ab |
| 9 | ("dynamic prediction" OR "dynamic predictions" OR "dynamic prognostic" OR "dynamic clinical prediction").ti,ab |
| 10 | 1 OR 2 OR 3 OR 4 |
| 11 | 5 OR 6 OR 7 OR 8 |
| 12 | 10 AND 11 |
| 13 | 12 OR 9 |
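The way lines 10 to 13 combine the earlier search lines can be sketched with set operations. This is only an illustration: the record IDs in `hits` are hypothetical placeholders, not results from the review, and `combine_search_lines` is a name introduced here.

```python
# Hypothetical record IDs matched by each numbered search line (1-9).
# These sets are illustrative only; they are not results from the review.
hits = {
    1: {"r1", "r2"},
    2: {"r3"},
    3: set(),
    4: {"r2", "r4"},
    5: {"r1", "r5"},
    6: {"r4"},
    7: set(),
    8: {"r6"},
    9: {"r7"},
}


def combine_search_lines(hits):
    """Combine per-line hits following lines 10-13 of the search strategy."""
    line10 = hits[1] | hits[2] | hits[3] | hits[4]  # 1 OR 2 OR 3 OR 4
    line11 = hits[5] | hits[6] | hits[7] | hits[8]  # 5 OR 6 OR 7 OR 8
    line12 = line10 & line11                        # 10 AND 11
    line13 = line12 | hits[9]                       # 12 OR 9
    return line13


final_set = combine_search_lines(hits)
print(sorted(final_set))
```

In this sketch a record is retrieved if it matches at least one term from lines 1 to 4 and at least one from lines 5 to 8, or if it matches a dynamic-prediction phrase from line 9 on its own.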
Search limits specific to MEDLINE, Embase and Web of Science
| Database | Ovid MEDLINE(R) and Epub Ahead of Print, In-process and Other Non-indexed Citations, Daily and Versions(R) | Ovid Embase | Web of Science |
|---|---|---|---|
| Search | Title and Abstract (.ti,ab) | Title and Abstract (.ti,ab) | Title, abstract, and keywords (TS=()) |
| Dates | 1946 to November 30, 2018 | 1974 to December 3, 2018 | 1900–2018 |
| Publication status | No limit | Article-in-press, Embase status, In-Process status | No limit |
| Document type | Journal Article | Article, Article in press | Article |
| Language | English | English | English |
| Categories | No limit | No limit | Categories covering computer science, biology, healthcare, pharmacy, mathematics, statistics, biomedical engineering, psychology and probability. |
| Citation index | No limit | No limit | Social science, Science, Emerging sources. |
Inclusion criteria used for the title, abstract and full-text screening
| Criterion | Inclusion criteria set A | Inclusion criteria set B |
|---|---|---|
| 1 | Development of a CPM. | Development of a multivariable CPM, which predicts a binary or time-to-event outcome. |
| 2 | Modelling techniques for longitudinal and survival/binary data. | CPM accounts for repeated measurements over time of at least one predictor variable. |
| 3 | Clinical application described or article published in a medical or biometric journal. | CPM has been developed for a binary or time-to-event outcome for an individual. |
| Inclusion rule | (1 OR 2) AND 3 | 1 AND 2 AND 3 |
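The two inclusion rules can be written as boolean functions. The sketch below assumes each criterion has already been judged as true or false for an article; the function names are introduced here for illustration only.

```python
def meets_set_a(c1: bool, c2: bool, c3: bool) -> bool:
    """Inclusion rule for criteria set A: (1 OR 2) AND 3."""
    return (c1 or c2) and c3


def meets_set_b(c1: bool, c2: bool, c3: bool) -> bool:
    """Inclusion rule for criteria set B: 1 AND 2 AND 3."""
    return c1 and c2 and c3


# Example: an article satisfying criteria 2 and 3 but not criterion 1.
print(meets_set_a(False, True, True))  # passes set A
print(meets_set_b(False, True, True))  # fails set B
```

Set A is the more permissive rule, since either of the first two criteria is enough alongside the third, whereas set B requires all three.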
Fig. 1 PRISMA flow diagram to illustrate the screening process
Fig. 2 Temporal terminology for harnessing repeated measurements of predictors in clinical prediction models
Fig. 3 Number of publications per year for each framework (n = 182), excluding methods in comparative reviews
Methodological frameworks available to enhance clinical prediction models using longitudinal information
| Framework | Aim | Advantages | Limitations | Software | Extensions/variations | Examples |
|---|---|---|---|---|---|---|
| 1. Time-dependent covariate modelling (TDCM) | A1 | Allows for updated predictions over time, simple to apply in available software. | Assumes no measurement error, cannot predict the future, correlationsᵃ ignored, measurements assumed constant between time-points, requires complete predictors at event times. | Widely available (e.g. R, Stata, SAS). | Time-varying effects [ | Applied to assess the prognosis of patients with hepatocellular carcinoma, allowing for prediction at any stage of disease using their most recent information [ |
| 2. Generalised estimating equations (GEE) | A1 | Allows for updated predictions over time, accounts for correlationᵃ, can adjust for patient clustering. | Ignores underlying trajectory, does not account for changes in at-risk population, and ignores time-dependency. | Widely available (e.g. R, Stata, SAS). | | Employed to identify patients at high risk of adverse events after cancer therapy [ |
| 3. Landmark analysis (LA) | A1 | Avoids misspecification of underlying trajectory, only uses patient information prior to landmark time. | Ignores underlying covariate trajectory, correlationsᵃ often ignored, requires complete follow-up, and LOCF approach induces bias. | | Competing risks [ | Employed to predict relapse/death for those in leukaemia remission after transplant [ |
| 4. Two-stage modelling (TSM) | A2 or A3 | Simple to apply, flexible, can account for correlationsᵃ, can handle irregularly spaced measurements. | Ignores model-specification error in the first stage, first model cannot account for drop-out bias. | | Extends to TDCM [ | In conjunction with LA, TSM used to predict adverse events following endovascular abdominal aortic aneurysm repair [ |
| 5. Joint modelling (JM) | A1 and (A2 or A3) | Addresses limitations of TSM framework, allows updated predictions over time, flexible. | Complex to implement, strong parametric assumptions, computationally intensive. | | Time-varying effects [ | Shared random effects JM employed for real-time predictions of prostate cancer recurrence [ |
| 6. Trajectory classification (TC) | A1, A2 and A3 | Accounts for correlationᵃ, irregularly spaced measurements, informative processes, updated predictions, underlying trajectory. | Complex and computationally intensive for multivariate applications, parametric assumptions required for covariate trajectory. | | Multivariate modelling using Gaussian processes [ | Employed to classify repeated measurements of hormone levels in early pregnancy to predict pregnancy success in the context of in vitro fertilization [ |
| 7. Machine learning (ML) | A1 and A3 | Few assumptions, handles high-dimensional data, can identify optimal trajectory characteristics. | Often predicts binary outcome, ignores right-censoring, large datasets required to avoid overfitting, often 'black box' algorithms. | | Recurrent Neural Networks (RNNs) [ | RNNs employed to predict heart failure based on EHR data [ |
ᵃ Correlations between and within individuals
Abbreviations: LOCF, last observation carried forward; ME, mixed effect; SVM, support vector machine; MCMC, Markov chain Monte Carlo; EHR, electronic health record; JLCM, joint latent class model
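As a rough illustration of the landmark analysis row, the sketch below builds a landmark dataset using LOCF. It is a minimal example under stated assumptions, not the reviewed authors' implementation: the patient records, the field names, and the helper names `locf_at_landmark` and `build_landmark_dataset` are all hypothetical, and follow-up is administratively censored at the end of the prediction window.

```python
def locf_at_landmark(measurements, landmark):
    """Return the most recent value measured at or before the landmark time.

    measurements: list of (time, value) tuples. Returns None if nothing
    was measured by the landmark time.
    """
    prior = [(t, v) for t, v in measurements if t <= landmark]
    if not prior:
        return None
    return max(prior, key=lambda tv: tv[0])[1]


def build_landmark_dataset(patients, landmark, horizon):
    """Keep patients still at risk at the landmark and summarise each one
    by their LOCF predictor value and their outcome within the window."""
    rows = []
    for p in patients:
        if p["event_time"] <= landmark:
            continue  # follow-up ended (event or censoring) before the landmark
        value = locf_at_landmark(p["measurements"], landmark)
        if value is None:
            continue  # no measurement available to carry forward
        window_end = landmark + horizon
        rows.append({
            "id": p["id"],
            "predictor": value,
            # Follow-up is truncated at the end of the prediction window.
            "time": min(p["event_time"], window_end),
            "status": p["event"] and p["event_time"] <= window_end,
        })
    return rows


# Hypothetical patients: event_time is the end of follow-up (event or censoring).
patients = [
    {"id": "A", "measurements": [(0, 5.0), (6, 4.2), (14, 3.9)], "event_time": 20, "event": True},
    {"id": "B", "measurements": [(0, 6.1)], "event_time": 10, "event": True},
    {"id": "C", "measurements": [(2, 7.0), (11, 6.5)], "event_time": 40, "event": False},
]

rows = build_landmark_dataset(patients, landmark=12, horizon=12)
for row in rows:
    print(row)
```

Only measurements taken up to the landmark time are used, which matches the advantage listed in the table. Carrying the last value forward treats it as current at the landmark, which is the source of the LOCF bias noted under limitations. The resulting rows could then be passed to a survival or binary-outcome model of the reader's choice.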