Alexia Giannoula, Alba Gutierrez-Sacristán, Álex Bravo, Ferran Sanz, Laura I Furlong.
Abstract
Time is a crucial parameter in the assessment of comorbidities in population-based studies, as it makes it possible to identify more complex disease patterns beyond pairwise disease associations. So far, it has either been completely ignored or been taken into account only by assessing the temporal directionality of identified comorbidity pairs. In this work, a novel time-analysis framework is presented for large-scale comorbidity studies. The disease-history vectors of patients in a regional Spanish health dataset are represented as time sequences of ordered disease diagnoses. Statistically significant pairwise disease associations are identified and their temporal directionality is assessed. Subsequently, an unsupervised clustering algorithm based on Dynamic Time Warping (DTW) is applied to the common disease trajectories in order to group them according to the temporal patterns that they share. The proposed methodology for the temporal assessment of such trajectories could serve as the preliminary basis of a disease prediction system.
Year: 2018 PMID: 29523868 PMCID: PMC5844976 DOI: 10.1038/s41598-018-22578-1
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Table 1. Statistically significant comorbidity pairs in men. The thirty most frequent statistically significant pairwise comorbidities encountered in the male sub-population and their p-values. The arrow indicates preferred directionality (double arrow implies no preferred directionality). The total number of patients (#pat) sharing the diseases with either directionality is also shown. The comorbidity pairs are ordered according to the total number of patients.
| Disease Association | #pat | p-value |
|---|---|---|
| Chronic bronchitis (491) → Other diseases of lung (518) | 9,087 | <4.9E-324 |
| Cataract (366) → Chronic bronchitis (491) | 7,749 | 1.9E-36 |
| Chronic bronchitis (491) → Pneumonia (486) | 7,091 | <4.9E-324 |
| Inguinal hernia (550) → Cataract (366) | 6,692 | 3.0E-38 |
| Acute myoc infarction (410) → Ischemic heart disease (414) | 6,101 | <4.9E-324 |
| Chronic bronchitis (491) → Heart failure (428) | 5,720 | <4.9E-324 |
| Cataract (366) → Other diseases of lung (518) | 5,043 | 3.9E-9 |
| Pneumonia (486) → Other diseases of lung (518) | 4,546 | <4.9E-324 |
| Osteoarthrosis (715) → Cataract (366) | 4,212 | 1.1E-66 |
| Heart failure (428) → Other diseases of lung (518) | 4,162 | <4.9E-324 |
| Other acute isch heart dis (411) → Ischemic heart disease (414) | 4,026 | <4.9E-324 |
| Cataract (366) → Occl of cerebral arteries (434) | 3,942 | 9.3E-18 |
| Cataract (366) ↔ Ischemic heart disease (414) | 3,915 | 1.6E-10 |
| Cataract (366) → Acute myoc infarction (410) | 3,902 | 1.2E-13 |
| Hyperplasia of prostate (600) → Cataract (366) | 3,820 | 4.0E-61 |
| Cardiac dysrhythmias (427) → Heart failure (428) | 3,792 | <4.9E-324 |
| Pneumonia (486) → Heart failure (428) | 3,752 | 3.1E-258 |
| Cataract (366) → Other dis urethra/urin tract (599) | 3,578 | 7.6E-47 |
| Acute myocardial infarction (410) → Heart failure (428) | 3,421 | <4.9E-324 |
| Cataract (366) → Bladder cancer (188) | 3,193 | 8.7E-13 |
| Bladder cancer (188) → Other dis urethra/urin tract (599) | 3,140 | <4.9E-324 |
| Ischemic heart disease (414) → Heart failure (428) | 2,879 | 9.9E-190 |
| Acute myocardial infarction (410) → Other acute isch heart dis (411) | 2,870 | <4.9E-324 |
| Ac bronch (466) → Chronic bronchitis (491) | 2,868 | 9.8E-161 |
| Ac bronch (466) → Pneumonia (486) | 2,854 | 2.7E-301 |
| Cataract (366) → Ac bronch (466) | 2,749 | 1.7E-120 |
| Ac bronch (466) → Other diseases of lung (518) | 2,728 | 2.6E-270 |
| Diseases of pancreas (577) → Cholelithiasis (574) | 2,660 | <4.9E-324 |
| Cataract (366) → Diabetes mellitus (250) | 2,636 | 1.1E-22 |
| Chronic bronchitis (491) → Emphysema (492) | 2,619 | <4.9E-324 |
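The arrows in the table summarize whether one ordering of a disease pair is observed significantly more often than the reverse. This record does not say which test was used, so the following is only a minimal sketch of one common approach: an exact two-sided binomial test of the two orderings against an even split. The function names, the significance level `alpha`, and the example counts are all hypothetical and are not taken from the study.

```python
from math import exp, lgamma, log


def two_sided_binomial_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: total probability of all outcomes
    that are no more likely than the observed count k (assumed test)."""
    def prob(i):
        # Work in log space so that large n does not overflow.
        log_q = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                 + i * log(p) + (n - i) * log(1 - p))
        return exp(log_q)

    observed = prob(k)
    total = sum(q for q in map(prob, range(n + 1)) if q <= observed * (1 + 1e-7))
    return min(1.0, total)


def preferred_direction(n_ab, n_ba, alpha=0.05):
    """Label a pair from the number of patients diagnosed A before B (n_ab)
    and B before A (n_ba). alpha is an assumed significance level."""
    n = n_ab + n_ba
    if n == 0 or two_sided_binomial_p(n_ab, n) >= alpha:
        return "A ↔ B"  # no preferred directionality
    return "A → B" if n_ab > n_ba else "B → A"


# Hypothetical counts, for illustration only.
print(preferred_direction(80, 20))  # clearly skewed towards A first
print(preferred_direction(52, 48))  # close to an even split
```

Under these assumptions, a strongly skewed split gets a single arrow and a near-even split gets the double arrow, which mirrors the notation used in the table.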
Table 2. Statistically significant comorbidity pairs in women. The thirty most frequent statistically significant pairwise comorbidities encountered in the female sub-population and their p-values. The arrow indicates preferred directionality (double arrow implies no preferred directionality). The total number of patients (#pat) sharing the diseases with either directionality is also shown. The comorbidity pairs are ordered according to the total number of patients.
| Disease Association | #pat | p-value |
|---|---|---|
| Osteoarthrosis (715) → Cataract (366) | 10,665 | <4.9E-324 |
| Ac bronch (466) → Heart failure (428) | 5,528 | <4.9E-324 |
| Heart failure (428) → Other diseases of lung (518) | 5,067 | <4.9E-324 |
| Acquired deformities of toe (735) → Cataract (366) | 4,943 | 9.1E-22 |
| Cardiac dysrhythmias (427) → Heart failure (428) | 4,860 | <4.9E-324 |
| Cataract (366) → Cholelithiasis (574) | 4,810 | 4.5E-11 |
| Cataract (366) → Ac bronch (466) | 4,321 | 2.1E-34 |
| Cataract (366) → Cardiac dysrhythmias (427) | 4,141 | 1.4E-22 |
| Cataract (366) → Occlusion of cerebral arteries (434) | 3,798 | 6.2E-36 |
| Ac bronch (466) → Other diseases of lung (518) | 3,781 | <4.9E-324 |
| Cataract (366) → Other diseases of lung (518) | 3,681 | 4.E-16 |
| Cataract (366) → Other dis urethra and urinary tract (599) | 3,342 | 8.9E-40 |
| Pneumonia (486) ↔ Heart failure (428) | 3,199 | <4.9E-324 |
| Diseases of pancreas (577) → Cholelithiasis (574) | 3,156 | <4.9E-324 |
| Cataract (366) → Pneumonia (486) | 3,058 | 1.1E-14 |
| Mononeuritis upp. limb/multiplex (354) → Cataract (366) | 3,002 | 3.8E-9 |
| Ac bronch (466) → Pneumonia (486) | 2,940 | <4.9E-324 |
| Acute myocardial infarction (410) → Heart failure (428) | 2,721 | <4.9E-324 |
| Heart failure (428) → Hypertensive heart disease (402) | 2,710 | <4.9E-324 |
| Heart failure (428) → Other dis urethra and urinary tract (599) | 2,689 | 4.4E-167 |
| Acquired deformities of toe (735) → Osteoarthrosis (715) | 2,648 | 9.3E-157 |
| Chronic bronchitis (491) ↔ Heart failure (428) | 2,607 | <4.9E-324 |
| Chronic bronchitis (491) → Other diseases of lung (518) | 2,601 | <4.9E-324 |
| Varicose veins lower extrem (454) → Cataract (366) | 2,581 | 7.1E-131 |
| Cataract (366) → Diabetes mellitus (250) | 2,546 | 9.0E-24 |
| Other hernia abdom (no obstr/gangr) (553) → Cataract (366) | 2,474 | 5.0E-14 |
| Cataract (366) ↔ Malignant neoplasm of female breast (174) | 2,338 | 1.1E-53 |
| Genital prolapse (618) → Cataract (366) | 2,309 | 2.8E-53 |
| Heart failure (428) → Occlusion of cerebral arteries (434) | 2,273 | 2.6E-176 |
| Pneumonia (486) → Other diseases of lung (518) | 2,260 | <4.9E-324 |
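Many p-values in the two tables above are reported as `<4.9E-324`. The value 4.9E-324 is the smallest positive number representable in IEEE 754 double precision, so a plausible reading (an assumption, not stated in this record) is that these p-values underflowed to zero in floating-point arithmetic and were reported as lying below that floor. The short sketch below illustrates the floor; the helper `format_p_value` is hypothetical.

```python
import sys

# Smallest positive (subnormal) double: 2**-1074, about 4.9E-324.
SMALLEST_POSITIVE_DOUBLE = sys.float_info.min * sys.float_info.epsilon


def format_p_value(p):
    """Report p in the tables' style, flagging values that fall below
    the double-precision floor (hypothetical helper)."""
    if p < SMALLEST_POSITIVE_DOUBLE:
        return "<4.9E-324"
    return f"{p:.1E}"


print(SMALLEST_POSITIVE_DOUBLE / 2 == 0.0)  # halving the floor underflows to zero
print(format_p_value(0.0))
print(format_p_value(1.9e-36))
```

Read this way, the `<4.9E-324` entries indicate only that the association is extremely significant; they do not allow such pairs to be ranked against each other.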
Table 3. The twenty most populated clusters extracted using DTW for the male sub-population. A high-level (ICD-9 coding) description of the involved disease groups is provided for each cluster. The number of trajectories (#traj) and the total number of patients (#pat) of each cluster are also listed. The clusters are ordered according to the total number of patients.
| #traj | #pat | High-level disease group distribution |
|---|---|---|
| 304 | 48,874 | Dis Respir Sys (99.7%) Dis Circul Sys (0.3%) |
| 162 | 40,196 | Dis Circul Sys (100.0%) |
| 132 | 22,437 | Dis Genitour Sys (63.2%) Dis Digest Sys (36.8%) |
| 238 | 18,648 | Dis Respir Sys (51.5%) Dis Circul Sys (48.5%) |
| 155 | 17,732 | Dis Circul Sys (60.7%) Dis Respir Sys (39.3%) |
| 221 | 16,961 | Dis Nerv Sys & Sense Org (34.1%) Dis Circul Sys (65.9%) |
| 192 | 13,557 | Dis Circul Sys (58.4%) Dis Genitour Sys (27.1%) Dis Digest Sys (14.5%) |
| 142 | 12,700 | Dis Nerv Sys & Sense Org (31.7%) Dis Respir Sys (67.9%) Dis Circul Sys (0.4%) |
| 45 | 11,379 | Dis Nerv Sys & Sense Org (45.3%) Dis Digest Sys (38.7%) Dis Genitour Sys (16.0%) |
| 61 | 10,191 | Dis Digest Sys (46.0%) Dis Respir Sys (54.0%) |
| 233 | 10,096 | Dis Respir Sys (98.5%) Dis Digest Sys (1.4%) Dis Circul Sys (0.1%) |
| 223 | 9,732 | Dis Circul Sys (99.3%) Dis Respir Sys (0.7%) |
| 51 | 9,405 | Dis Nerv Sys & Sense Org (100.0%) |
| 37 | 8,298 | Dis Nerv Sys & Sense Org (41.7%) Dis Respir Sys (55.2%) Dis Digest Sys (3.1%) |
| 63 | 8,243 | Neoplasms (38.4%) Dis Genitour Sys (54.1%) Dis Digest Sys (7.6%) |
| 57 | 7,871 | Dis Digest Sys (90.3%) Dis Genitour Sys (8.9%) Dis Respir Sys (0.8%) |
| 120 | 7,292 | Dis Circul Sys (68.3%) Dis Nerv Sys & Sense Org (31.7%) |
| 64 | 7,249 | Dis Digest Sys (92.5%) Dis Genitour Sys (7.5%) |
| 23 | 6,936 | Dis Nerv Sys & Sense Org (43.4%) Neoplasms (56.6%) |
| 56 | 6,748 | Dis Genitour Sys (43.9%) Dis Respir Sys (55.4%) Dis Digest Sys (0.7%) |
Figure 1. Schematic representation of two highly populated clusters (respiratory/circulatory). The (a) fifth and (b) fourth most populated clusters extracted for the male sub-population, associated with Table 3 (and Supplementary Table 4). The description of diseases is provided in each node. The nodes are drawn at a size relative to the frequency of appearance of the disease or group of diseases (a minimum node size corresponding to a frequency of 5% has been arbitrarily considered). Shorter and longer trajectories formed by connected nodes are contained in each cluster. Cyclic arrows indicate additional distinct diagnoses belonging to the same group of diseases (repetitions of the same disease are not permitted in a single trajectory). Examples of disease trajectories contained in each cluster are also provided at the bottom right of the figure panels, together with the corresponding average times (indicated in years below each arrow) and the average number of patients involved (shown in parentheses at the end of each trajectory).
Table 4. The twenty most populated clusters extracted using DTW for the female sub-population. A high-level (ICD-9 coding) description of the involved disease groups is provided for each cluster. The number of trajectories (#traj) and the total number of patients (#pat) of each cluster are also listed. The clusters are ordered according to the total number of patients.
| #traj | #pat | High-level disease group distribution |
|---|---|---|
| 374 | 58,672 | Compl Pregn Birth Puerp (98.6%) Dis Genitour Sys (1.4%) |
| 220 | 29,781 | Dis Respir Sys (100.0%) |
| 301 | 28,588 | Dis Circul Sys (99.9%) Dis Nerv Sys & Sense Org (0.1%) |
| 160 | 20,430 | Dis Nerv Sys & Sense Org (36.3%) Dis Circul Sys (62.9%) Dis Respir Sys (0.8%) |
| 93 | 17,273 | Dis Circul Sys (56.8%) Dis Respir Sys (42.0%) Dis Digest Sys (1.2%) |
| 97 | 16,473 | Dis Circul Sys (97.0%) Dis Respir Sys (3.0%) |
| 51 | 14,427 | Dis Musculosk Sys & Conn Tiss (51.4%) Dis Nerv Sys & Sense Org (47.8%) Dis Skin & Subcut Tis (0.7%) |
| 75 | 13,234 | Dis Musculosk Sys & Conn Tiss (100.0%) |
| 68 | 12,598 | Dis Digest Sys (95.9%) Dis Genitour Sys (4.1%) |
| 130 | 12,373 | Dis Respir Sys (43.4%) Dis Circul Sys (56.6%) |
| 74 | 12,056 | Dis Circul Sys (55.7%) Dis Genitour Sys (17.5%) Dis Digest Sys (26.8%) |
| 42 | 11,133 | Dis Nerv Sys & Sense Org (98.9%) Dis Circul Sys (1.1%) |
| 62 | 7,843 | Dis Musculosk Sys & Conn Tiss (41.1%) Dis Circul Sys (55.6%) Dis Respir Sys (2.6%) Dis Skin & Subcut Tis (0.7%) |
| 41 | 7,652 | Dis Nerv Sys & Sense Org (40.7%) Dis Digest Sys (49.1%) Dis Genitour Sys (10.2%) |
| 45 | 7,010 | Dis Genitour Sys (82.4%) Dis Digest Sys (17.6%) |
| 135 | 6,464 | Dis Circul Sys (58.0%) Dis Respir Sys (42.0%) |
| 24 | 5,796 | Dis Respir Sys (21.6%) Dis Genitour Sys (9.8%) Dis Circul Sys (25.5%) Dis Digest Sys (43.1%) |
| 49 | 5,754 | Dis Nerv Sys & Sense Org (44.4%) Dis Musculosk Sys & Conn Tiss (54.9%) Dis Skin & Subcut Tis (0.8%) |
| 22 | 5,359 | Dis Nerv Sys & Sense Org (47.1%) Dis Genitour Sys (51.0%) Compl Pregn Birth Puerp (2.0%) |
| 63 | 5,332 | Dis Nerv Sys & Sense Org (32.4%) Dis Respir Sys (67.6%) |
Figure 2. Clusters associated with bladder cancer. Six clusters containing disease trajectories associated with bladder cancer (ICD-9 code 188), extracted by the DTW clustering algorithm on the male sub-population. In each cluster, bladder cancer represented more than 20% of the total diagnoses. The nodes are drawn at a size relative to the frequency of appearance of the disease or group of diseases (a minimum node size corresponding to a frequency of 5% has been arbitrarily considered). Shorter and longer trajectories formed by connected nodes are contained in each cluster. Cyclic arrows indicate additional distinct diagnoses belonging to the same group of diseases (repetitions of the same disease are not permitted in a single trajectory).
Figure 3. Clusters associated with osteoarthrosis. Four clusters containing disease trajectories associated with osteoarthrosis (ICD-9 code 715), extracted by the DTW clustering algorithm on the female sub-population. In each cluster, osteoarthrosis represented more than 20% of the total diagnoses. The nodes are drawn at a size relative to the frequency of appearance of the disease or group of diseases (a minimum node size corresponding to a frequency of 5% has been arbitrarily considered). Shorter and longer trajectories formed by connected nodes are contained in each cluster. Cyclic arrows indicate additional distinct diagnoses belonging to the same group of diseases (repetitions of the same disease are not permitted in a single trajectory).
Figure 4. Application of DTW on two disease trajectories. (a) A numerical example of the local distance matrix D and (b) an image of the accumulated distance matrix A for two disease trajectories: s1 = {491, 466, 519, 494, 162, 280, 410, 428, 402} and s2 = {491, 519, 162, 428}. The optimal path is superimposed in (b) in white. (c) The original disease trajectories s1 and s2 and (d) the warped ones after applying the DTW algorithm.
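The DTW computation described in the Figure 4 caption can be sketched on the same two trajectories. This is a minimal illustration, not the authors' implementation: the local distance used here (0 when two ICD-9 codes are equal, 1 otherwise) is an assumption, as this record does not define it, so the numbers need not match those in the figure. The names `local_distance` and `dtw` are illustrative.

```python
def local_distance(a, b):
    # Assumed local distance: 0 for identical ICD-9 codes, 1 otherwise.
    return 0 if a == b else 1


def dtw(s1, s2, dist=local_distance):
    """Return the local distance matrix D, the accumulated distance matrix A,
    and one optimal warping path as a list of (i, j) index pairs."""
    n, m = len(s1), len(s2)
    inf = float("inf")
    D = [[dist(a, b) for b in s2] for a in s1]
    A = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                best = 0
            else:
                best = min(
                    A[i - 1][j] if i > 0 else inf,
                    A[i][j - 1] if j > 0 else inf,
                    A[i - 1][j - 1] if i > 0 and j > 0 else inf,
                )
            A[i][j] = D[i][j] + best

    # Backtrack from the end, always stepping to the cheapest predecessor.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        steps = []
        if i > 0 and j > 0:
            steps.append((A[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            steps.append((A[i - 1][j], i - 1, j))
        if j > 0:
            steps.append((A[i][j - 1], i, j - 1))
        _, i, j = min(steps)
        path.append((i, j))
    path.reverse()
    return D, A, path


s1 = [491, 466, 519, 494, 162, 280, 410, 428, 402]
s2 = [491, 519, 162, 428]
D, A, path = dtw(s1, s2)

# Warped trajectories: both sequences stretched to the length of the path.
warped_s1 = [s1[i] for i, _ in path]
warped_s2 = [s2[j] for _, j in path]
print(A[-1][-1])  # global cost A(N, M); 5 under the assumed 0/1 distance
```

With the assumed 0/1 distance, the global cost equals the number of diagnoses in s1 that have no exact counterpart in s2, which is why the four shared codes (491, 519, 162, 428) contribute nothing to the total.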
Figure 5. Flow-charts of the proposed methodology. (a) A flow-chart of the proposed methodology for the extraction of time-dependent disease associations and (b) the unsupervised clustering method of the common disease trajectories using the DTW algorithm. A(N,M) denotes the final accumulated distance or global cost between the new incoming trajectory (new_traj) and each trajectory traj_i (i = 1, 2, …) of the existing clusters, according to equation (2).
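The clustering loop in panel (b) can be sketched as follows. The caption states only that the global cost A(N,M) is computed between each incoming trajectory and the trajectories of the existing clusters; the assignment rule used below (join the cluster holding the closest trajectory if its cost does not exceed a threshold, otherwise open a new cluster), the `threshold` parameter, the 0/1 local distance, and the toy trajectories are all assumptions made for illustration.

```python
def dtw_cost(s1, s2):
    """Global DTW cost A(N, M) under an assumed 0/1 local distance."""
    n, m = len(s1), len(s2)
    inf = float("inf")
    A = [[inf] * (m + 1) for _ in range(n + 1)]
    A[0][0] = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = 0 if s1[i - 1] == s2[j - 1] else 1
            A[i][j] = d + min(A[i - 1][j], A[i][j - 1], A[i - 1][j - 1])
    return A[n][m]


def cluster_trajectories(trajectories, threshold):
    """Assign each incoming trajectory to the existing cluster containing its
    nearest trajectory, or start a new cluster (assumed rule)."""
    clusters = []
    for new_traj in trajectories:
        best_cost, best_cluster = float("inf"), None
        for cluster in clusters:
            for traj in cluster:
                cost = dtw_cost(new_traj, traj)
                if cost < best_cost:
                    best_cost, best_cluster = cost, cluster
        if best_cluster is not None and best_cost <= threshold:
            best_cluster.append(new_traj)
        else:
            clusters.append([new_traj])
    return clusters


# Toy trajectories of ICD-9 codes, for illustration only.
toy = [[491, 518], [491, 486, 518], [410, 414], [410, 414, 428], [366, 491]]
clusters = cluster_trajectories(toy, threshold=1)
print(len(clusters))  # 3 clusters for this toy input
```

Because trajectories are processed one at a time, the result of such a scheme can depend on the input order and on the chosen threshold; both would need to be fixed according to the full method description.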