Kadri Künnapuu1, Solomon Ioannou2, Kadri Ligi1,3, Raivo Kolde3, Sven Laur1,3, Jaak Vilo1,3,4, Peter R Rijnbeek2, Sulev Reisberg1,3,4.
Abstract
Objective: To develop a framework for identifying temporal clinical event trajectories from Observational Medical Outcomes Partnership (OMOP)-formatted observational healthcare data.
Keywords: OHDSI; OMOP; R package; observational data; trajectory
Year: 2022 PMID: 35571357 PMCID: PMC9097714 DOI: 10.1093/jamiaopen/ooac021
Source DB: PubMed Journal: JAMIA Open ISSN: 2574-2531
Summary of the general principles used in recent large-scale disease trajectory studies
| Publications | Jensen et al, | Giannoula et al | Paik and Kim | Han et al | Our “Trajectories” framework |
|---|---|---|---|---|---|
| General principle of the trajectory analysis | First, identify event pairs, then build longer trajectories from these. In Jensen et al, cluster trajectories. | First, identify event pairs, then cluster these based on dynamic time warping | First, identify event pairs, then build longer trajectories from these and cluster these | First, identify event pairs, then build longer trajectories from these | First, identify event pairs, then build longer trajectories from these |
| Step 1: Study cohort | Entire Danish population, all events (whole dataset). Hu et al investigated events prior to the cancer diagnosis. | Spanish health registry (whole dataset) | Hospital deaths, events before deaths | Depression patients, events after the depression | Whole dataset or any subset of the data based on OHDSI/OMOP cohort definition principles |
| Step 2a: Event types in trajectories | ICD-10 level 3 diseases | ICD-9 disease diagnoses | 3-Digit ICD-9-CM codes | ICD-10 level 3 diseases | Any binary condition, observation, drug era, or procedure as recorded by using OMOP vocabulary |
| Step 2b: Handling repeated events | Only the first event is used | Only the first event is used | Only the first event is used | Only the first event is used | Only the first event is used |
| Step 2c: Maximum allowed temporal distance between events | 5 y | NA | 1 y | NA | Any positive number of days |
| Step 2d: Minimum number of occurrences of event pair | 10 | 10 | NA | 125 (∼0.5% of the cohort) | Any positive number or percentage of the cohort |
| Step 3a: Identification of significant event pairs | Sampling from matched exposed/background group by exact gender, age group, type of hospital encounter, week of the | Matched by gender and age. Fisher’s exact test used for association testing and binomial test for assessing temporal order. Multiple testing correction. | Binomial test for association and temporal order testing. Multiple testing correction. | Matched by gender, sex, Townsend deprivation index, year of birth, year of depression diagnosis. Cox regression analysis for association testing. Binomial test for temporal order testing. Multiple testing correction. | Exposed/background group matching by using exact covariates (gender, age group and a calendar year of |
| Step 3b: Measurement to describe the strength of the association of event pair | Relative risk | Relative risk | Relative risk | Hazard ratio | Relative risk |
| Step 4: Count trajectory patterns | Allow intermediate events between trajectory elements | Allow intermediate events between trajectory elements | Allow intermediate events between trajectory elements | Not specified | Allow intermediate events between trajectory elements |
| Method of further clustering of the results | Clustering of the longer trajectories based on shared diagnoses | Novel clustering method based on ICD-9 hierarchy and dynamic time warping | Clustering of the longer trajectories based on shared diagnoses | Clustering based on similarity of their underlying affected systems or their etiologies | Not used |
| Exact software code for analysis shared | No | Partially | No | Can be requested | Yes |
Note: In the last column, the methods used in our framework are described.
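The pair-based steps summarized in the table can be illustrated with a minimal Python sketch. It is not the code of the "Trajectories" R package: the function names, the toy records, and the thresholds are hypothetical, and the matched exposed/background comparison of Step 3a and the multiple testing correction are deliberately left out. The sketch only shows first-event filtering (Step 2b), the maximum temporal distance (Step 2c), the minimum pair count (Step 2d), a one-sided binomial test for temporal order, and trajectory counting that allows intermediate events (Step 4).

```python
from collections import Counter
from itertools import combinations
from math import comb


def first_events(records):
    """Step 2b: keep only the first occurrence of each event per patient.

    records: iterable of (patient_id, event_code, day) tuples (hypothetical layout).
    Returns {patient_id: [(day, event_code), ...]} sorted by day.
    """
    first = {}
    for patient, event, day in records:
        key = (patient, event)
        if key not in first or day < first[key]:
            first[key] = day
    timelines = {}
    for (patient, event), day in first.items():
        timelines.setdefault(patient, []).append((day, event))
    return {p: sorted(evs) for p, evs in timelines.items()}


def count_directional_pairs(timelines, max_days):
    """Step 2c: count patients with event a strictly before b, at most max_days apart."""
    counts = Counter()
    for events in timelines.values():
        for (d1, a), (d2, b) in combinations(events, 2):
            if d1 < d2 and d2 - d1 <= max_days:  # same-day pairs have no direction
                counts[(a, b)] += 1
    return counts


def direction_p_value(n_ab, n_ba):
    """One-sided binomial test: P(X >= n_ab) for X ~ Binomial(n_ab + n_ba, 0.5)."""
    n = n_ab + n_ba
    return sum(comb(n, k) for k in range(n_ab, n + 1)) / 2 ** n


def count_trajectory(timelines, trajectory):
    """Step 4: count patients whose first events follow the trajectory order.

    Other events may occur in between; no maximum distance is enforced here.
    """
    total = 0
    for events in timelines.values():
        day_of = {name: day for day, name in events}
        days = [day_of.get(step) for step in trajectory]
        if None not in days and all(a < b for a, b in zip(days, days[1:])):
            total += 1
    return total


# Toy records, invented for illustration only: (patient, ICD-10 code, day).
records = [
    (1, "E11", 0), (1, "I10", 30), (1, "N18", 200), (1, "E11", 400),
    (2, "E11", 10), (2, "I10", 100),
    (3, "I10", 5), (3, "E11", 50),
    (4, "E11", 0), (4, "I10", 3000),
]

timelines = first_events(records)
pair_counts = count_directional_pairs(timelines, max_days=5 * 365)

# Step 2d: keep pairs seen in at least min_count patients (threshold is arbitrary here).
min_count = 2
frequent_pairs = {pair: n for pair, n in pair_counts.items() if n >= min_count}

p_direction = direction_p_value(pair_counts[("E11", "I10")], pair_counts[("I10", "E11")])
n_e11_n18 = count_trajectory(timelines, ["E11", "N18"])
```

On this toy data, patient 4 does not contribute to the E11 → I10 pair because the two events are more than five years apart, and patient 1 counts toward the E11 → N18 trajectory even though I10 occurs in between. A real analysis would replace the plain counts with the matched relative risk described in Step 3b.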
Figure 1. Illustration of the framework.
Figure 2. Twenty most frequent event pairs in the type 2 diabetes cohort in the Estonian dataset. Node size indicates the number of patients with that event; the relative risk of the future event is shown on the edges. All pairs were also validated as significant in the IPCI database.
Figure 3. Process flow of testing Danish directional event pairs in the Estonian dataset.
Figure 4. Attrition diagram of identifying directional event pairs in the Estonian dataset.
Figure 5. Event frequency is correlated with the number of identified significant event pairs (example on Estonian data). Diagnosis codes mentioned in this article are highlighted.