Shanpeng Li, Ning Li, Hong Wang, Jin Zhou, Hua Zhou, Gang Li.
Abstract
Semiparametric joint models of longitudinal and competing risk data are computationally costly, and their current implementations do not scale well to massive biobank data. This paper identifies and addresses some key computational barriers in a semiparametric joint model for longitudinal and competing risk survival data. By developing and implementing customized linear scan algorithms, we reduce the computational complexities from O(n²) or O(n³) to O(n) in various steps including numerical integration, risk set calculation, and standard error estimation, where n is the number of subjects. Using both simulated and real-world biobank data, we demonstrate that these linear scan algorithms can speed up the existing methods by a factor of up to hundreds of thousands when n > 10⁴, often reducing the runtime from days to minutes. We have developed an R package, FastJM, based on the proposed algorithms for joint modeling of longitudinal and competing risk time-to-event data and made it publicly available on the Comprehensive R Archive Network (CRAN).
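To illustrate the kind of saving a linear scan gives in the risk set calculation, the following is a minimal Python sketch, not the FastJM implementation. It assumes each subject has an observed time and a positive weight (for example, an exponentiated linear predictor) and computes, for every subject, the sum of weights over all subjects still at risk at that subject's time. The function names `risk_set_sums_naive` and `risk_set_sums_scan` are hypothetical and chosen only for this example.

```python
def risk_set_sums_naive(times, weights):
    """O(n^2): for each subject, re-sum the weights of everyone with time >= own time."""
    return [
        sum(w for t, w in zip(times, weights) if t >= ti)
        for ti in times
    ]


def risk_set_sums_scan(times, weights):
    """One sort, then a single O(n) pass with a running total.

    Subjects are visited from the largest time to the smallest, so the
    running total always equals the weight of the current risk set.
    Tied times are accumulated as a block before being written out.
    """
    n = len(times)
    order = sorted(range(n), key=lambda i: times[i], reverse=True)
    out = [0.0] * n
    running = 0.0
    k = 0
    while k < n:
        # Add every subject sharing the current time to the running total.
        m = k
        while m < n and times[order[m]] == times[order[k]]:
            running += weights[order[m]]
            m += 1
        # All tied subjects share the same risk set sum.
        for idx in order[k:m]:
            out[idx] = running
        k = m
    return out


if __name__ == "__main__":
    times = [5.0, 2.0, 8.0, 2.0, 6.0]
    weights = [1.0, 2.0, 0.5, 1.5, 3.0]
    print(risk_set_sums_naive(times, weights))
    print(risk_set_sums_scan(times, weights))
```

The sort costs O(n log n) but only needs to happen once, since the ordering of event times does not change between iterations of a fitting algorithm; each later evaluation is then a single pass over the data.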
Year: 2022 PMID: 35178111 PMCID: PMC8846996 DOI: 10.1155/2022/1362913
Source DB: PubMed Journal: Comput Math Methods Med ISSN: 1748-670X Impact factor: 2.238
Figure 1. Runtime (seconds) comparison between three different implementations of the EM algorithm for fitting the joint models (1) and (2) and the joineR package. The details of methods 1-3 are given in Section 3. joineR is an established R package which fits a similar semiparametric joint model with a slightly different latent association structure in the competing risk submodel [29]. Fold change is calculated as the ratio of runtime between two methods.
Figure 2. Runtime (seconds) comparison between two implementations of standard error estimation for fitting the joint models (1) and (2), linear scan and no linear scan as described in Section 2.2.3, and the bootstrap method employed by the joineR package [29]. Fold change is calculated as the ratio of runtime between two methods.
Runtime comparison between different R packages for joint modeling of a longitudinal and a single event time on the lung health study data.
| Package | FastJM | joineR | JSM | JSM | JM | JM | JM | JMBayes | JMBayes |
|---|---|---|---|---|---|---|---|---|---|
| Model class | Semiparametric | Semiparametric | Semiparametric | Semiparametric | Semiparametric | Parametric | Parametric | Parametric | Parametric |
| Baseline hazard | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Weibull | B-spline | B-spline | B-spline |
| Runtime | 0.3 min | 20.4 min | 1 h 36 min | 1 h 51 min | ∗ | 0.9 min | 1 min | 19.8 min | 43 min |
∗Failed to produce any result due to a convergence issue.
Runtime comparison between different R packages for semiparametric joint modeling of longitudinal SBP trajectory and competing risk event time on the UK Biobank primary care (UKB-PC) data.
| Package | FastJM | joineR |
|---|---|---|
| UKB-PC subset ( | 1 min | 3.3 h |
| UKB-PC subset ( | 4.4 min | 33 h |
| UKB-PC full data ( | 1 h | ∗ |
∗Failed to produce any result due to a computational failure.
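As a worked example of the fold change defined in the figure captions (the ratio of runtime between two methods), the short Python sketch below applies that ratio to the FastJM and joineR runtimes reported in the two tables. The helper names `to_minutes` and `fold_change` are hypothetical; the only inputs are the runtimes listed above, converted to minutes.

```python
def to_minutes(value, unit):
    """Convert a runtime given in hours ("h") or minutes ("min") to minutes."""
    if unit == "h":
        return value * 60.0
    if unit == "min":
        return float(value)
    raise ValueError(f"unknown unit: {unit}")


def fold_change(slow_minutes, fast_minutes):
    """Fold change as the ratio of the slower runtime to the faster runtime."""
    return slow_minutes / fast_minutes


# (joineR runtime, FastJM runtime) pairs taken from the tables above.
comparisons = {
    "Lung health study": ((20.4, "min"), (0.3, "min")),
    "UKB-PC subset, first row": ((3.3, "h"), (1.0, "min")),
    "UKB-PC subset, second row": ((33.0, "h"), (4.4, "min")),
}

if __name__ == "__main__":
    for label, (slow, fast) in comparisons.items():
        ratio = fold_change(to_minutes(*slow), to_minutes(*fast))
        print(f"{label}: joineR / FastJM = {ratio:.0f}-fold")
```

On these reported runtimes the ratios come out to roughly 68-fold, 198-fold, and 450-fold respectively; no ratio can be computed for the full UKB-PC data because joineR did not produce a result.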