Yutong Liu, Feng-Chang Lin, Jessica T Lin, Quefeng Li.
Abstract
A standard competing risks set-up requires both time to event and cause of failure to be fully observable for all subjects. However, in application, the cause of failure may not always be observable, thus impeding the risk assessment. In some extreme cases, none of the causes of failure is observable. In the case of a recurrent episode of Plasmodium vivax malaria following treatment, the patient may have suffered a relapse from a previous infection or acquired a new infection from a mosquito bite. In this case, the time to relapse cannot be modeled when a competing risk, a new infection, is present. The efficacy of a treatment for preventing relapse from a previous infection may be underestimated when the true cause of infection cannot be classified. In this paper, we developed a novel method for classifying the latent cause of failure under a competing risks set-up, which uses not only time to event information but also transition likelihoods between covariates at the baseline and at the time of event occurrence. Our classifier shows superior performance under various scenarios in simulation experiments. The method was applied to Plasmodium vivax infection data to classify recurrent infections of malaria.
Keywords: Markov transition model; malaria relapse; quadratic approximation; two-stage estimation
Year: 2021 PMID: 35928784 PMCID: PMC9347664 DOI: 10.6339/21-jds1026
Source DB: PubMed Journal: J Data Sci ISSN: 1680-743X
Figure 1: Kaplan-Meier curve for the first recurrent infection.
Figure 2: Heatmap for presence/absence of haplotypes.
Classification performance of proposed classifiers with low-dimensional binary covariates.
| Scenario | | ( | Sensitivity | Specificity | Overall | Sensitivity | Specificity | Overall |
|---|---|---|---|---|---|---|---|---|
| 1 | −0.5 | (400, 10) | 50.3 (20.2) | 59.0 (19.2) | 53.6 (5.1) | 90.7 (4.2) | 94.3 (3.2) | 92.1 (2.2) |
| | | (800, 20) | 50.1 (19.9) | 59.6 (19.2) | 54.0 (4.5) | 97.8 (0.9) | 98.7 (0.8) | 98.0 (0.5) |
| | 0 | (400, 10) | 49.1 (18.4) | 60.1 (17.6) | 53.6 (4.6) | 89.3 (10.8) | 93.2 (10.8) | 90.9 (10.4) |
| | | (800, 20) | 49.6 (18.2) | 59.7 (17.8) | 53.8 (3.9) | 97.9 (0.9) | 98.2 (0.8) | 98.0 (0.6) |
| | 0.5 | (400, 10) | 48.3 (18.7) | 61.9 (17.3) | 53.9 (4.8) | 88.9 (12.2) | 92.4 (12.0) | 90.3 (11.8) |
| | | (800, 20) | 50.2 (17.8) | 59.5 (17.2) | 54.1 (3.9) | 97.9 (0.8) | 98.1 (0.8) | 98.0 (0.5) |
| 2 | −0.5 | (400, 10) | 48.7 (19.7) | 60.6 (18.8) | 53.6 (5.0) | 66.3 (16.9) | 72.5 (30.2) | 68.8 (21.2) |
| | | (800, 20) | 50.7 (18.7) | 58.8 (17.9) | 54.0 (4.1) | 66.2 (14.6) | 71.9 (13.4) | 68.6 (11.9) |
| | 0 | (400, 10) | 49.3 (19.7) | 59.6 (18.5) | 53.6 (5.1) | 64.4 (18.2) | 69.2 (32.3) | 66.3 (23.1) |
| | | (800, 20) | 51.6 (17.9) | 58.5 (17.4) | 54.5 (3.9) | 66.2 (14.7) | 72.1 (13.3) | 68.6 (11.9) |
| | 0.5 | (400, 10) | 49.2 (18.6) | 60.6 (17.6) | 53.7 (4.5) | 68.7 (16.6) | 74.9 (27.5) | 71.1 (20.4) |
| | | (800, 20) | 50.8 (18.1) | 58.8 (17.5) | 54.0 (4.1) | 66.3 (14.5) | 72.3 (12.9) | 68.7 (11.8) |
Sensitivity, specificity and overall accuracy are given as percentages.
Reported values are means and standard deviations over 500 simulations.
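
The summary statistics reported in the simulation tables can be reproduced in outline as follows. This is a minimal sketch, not the authors' code: it assumes that the positive class is coded as 1 (for example, relapse) and the negative class as 0, and that the reported standard deviation is the sample standard deviation across replicates. The function names `classification_metrics` and `summarize` are hypothetical.

```python
from statistics import mean, stdev


def classification_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, overall accuracy) as percentages.

    Assumption: labels are coded 1 (positive class) and 0 (negative class).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    sensitivity = 100.0 * tp / n_pos
    specificity = 100.0 * tn / n_neg
    overall = 100.0 * (tp + tn) / len(y_true)
    return sensitivity, specificity, overall


def summarize(replicates):
    """Mean and sample SD of each metric over simulation replicates.

    `replicates` is a list of (y_true, y_pred) pairs, one per simulated data set.
    Returns [(mean, sd) for sensitivity, specificity, overall accuracy].
    """
    per_replicate = [classification_metrics(t, p) for t, p in replicates]
    return [(mean(col), stdev(col)) for col in zip(*per_replicate)]


# Toy illustration with two replicates (made-up labels, not study data).
toy = [
    ([1, 1, 0, 0], [1, 0, 0, 0]),
    ([1, 1, 0, 0], [1, 1, 0, 1]),
]
summary = summarize(toy)
```

With 500 replicates in place of the two toy ones, each `(mean, sd)` pair corresponds to one "mean (SD)" cell in the tables.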
Classification and variable selection performance of proposed classifiers with high-dimensional binary covariates.
| Scenario | | ( | Sensitivity | Specificity | Overall | Sensitivity | Specificity | Overall | Bias | Sensitivity | Specificity | Overall |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | −0.5 | (100, 200) | 98.1 (3.0) | 4.4 (6.8) | 76.1 (4.6) | 100 (0) | 96.3 (6.0) | 99.5 (0.7) | 0.48 (0.05) | 75.2 (12.8) | 57.1 (2.0) | 58.0 (2.1) |
| | | (200, 400) | 96.7 (3.5) | 8.4 (7.1) | 75.3 (3.1) | 100 (0) | 100 (0) | 100 (0) | 0.51 (0.01) | 87.2 (10.6) | 65.7 (1.3) | 66.7 (1.4) |
| | 0 | (100, 200) | 97.8 (3.4) | 5.3 (7.7) | 74.8 (4.5) | 100 (0) | 97.4 (4.5) | 99.6 (0.6) | 0.49 (0.04) | 72.9 (13.0) | 58.2 (2.1) | 59.0 (2.2) |
| | | (200, 400) | 95.7 (3.7) | 9.2 (7.5) | 75.1 (3.5) | 100 (0) | 100 (0) | 100 (0) | 0.51 (0.01) | 87.0 (10.6) | 65.7 (1.6) | 65.9 (1.4) |
| | 0.5 | (100, 200) | 97.6 (2.8) | 4.6 (6.8) | 75.3 (4.5) | 100 (0) | 96.7 (6.0) | 99.5 (0.7) | 0.49 (0.04) | 72.9 (13.6) | 58.0 (2.3) | 58.7 (2.4) |
| | | (200, 400) | 95.8 (3.8) | 9.8 (7.1) | 75.0 (3.3) | 100 (0) | 99.9 (0.8) | 100 (0) | 0.51 (0.02) | 87.1 (11.2) | 65.4 (1.4) | 66.0 (1.5) |
| 2 | −0.5 | (100, 200) | 97.9 (2.9) | 4.9 (5.8) | 75.8 (4.6) | 91.8 (4.5) | 13.0 (8.2) | 73.1 (5.0) | 0.49 (0.05) | 78.3 (14.3) | 62.2 (2.5) | 61.1 (2.3) |
| | | (200, 400) | 96.2 (3.8) | 8.8 (7.9) | 75.3 (3.3) | 90.7 (5.6) | 14.9 (9.1) | 72.7 (4.0) | 0.50 (0.02) | 73.8 (14.1) | 67.2 (1.7) | 66.3 (1.7) |
| | 0 | (100, 200) | 97.5 (3.1) | 6.4 (6.9) | 74.9 (4.3) | 91.9 (4.8) | 14.3 (9.2) | 72.7 (4.2) | 0.50 (0.04) | 79.2 (15.8) | 62.6 (2.2) | 61.4 (2.4) |
| | | (200, 400) | 95.8 (3.8) | 8.8 (7.4) | 74.8 (3.5) | 90.6 (5.1) | 15.4 (8.3) | 72.5 (3.9) | 0.51 (0.02) | 75.3 (14.5) | 67.5 (1.9) | 66.5 (1.4) |
| | 0.5 | (100, 200) | 97.4 (2.6) | 5.7 (6.1) | 75.4 (4.4) | 91.5 (4.8) | 13.6 (8.7) | 72.8 (4.8) | 0.51 (0.04) | 79.0 (15.8) | 61.6 (2.3) | 60.5 (2.3) |
| | | (200, 400) | 95.6 (3.7) | 9.5 (7.7) | 74.7 (3.1) | 90.3 (5.5) | 16.3 (8.3) | 72.2 (3.8) | 0.51 (0.02) | 73.1 (14.9) | 66.2 (1.5) | 65.4 (1.5) |
Sensitivity, specificity and overall accuracy are given as percentages.
Reported values are means and standard deviations over 500 simulations.
Classification of proposed classifiers with low-dimensional continuous covariates.
| Scenario | | ( | Sensitivity | Specificity | Overall | Sensitivity | Specificity | Overall |
|---|---|---|---|---|---|---|---|---|
| 1 | −0.5 | (400, 10) | 67.8 (4.1) | 54.8 (4.5) | 61.6 (3.2) | 97.5 (1.4) | 97.5 (1.6) | 97.6 (1.0) |
| | | (800, 20) | 64.9 (2.8) | 58.6 (2.5) | 61.8 (1.9) | 99.8 (0.2) | 99.8 (0.3) | 99.7 (0.1) |
| | 0 | (400, 10) | 65.9 (4.4) | 57.1 (4.3) | 61.6 (3.1) | 97.6 (1.3) | 97.5 (1.3) | 97.5 (0.9) |
| | | (800, 20) | 63.6 (2.4) | 59.5 (2.6) | 61.6 (1.9) | 99.7 (0.3) | 99.7 (0.3) | 99.7 (0.2) |
| | 0.5 | (400, 10) | 64.7 (3.5) | 60.2 (3.9) | 62.4 (3.0) | 97.6 (1.2) | 97.4 (1.1) | 97.5 (0.7) |
| | | (800, 20) | 62.5 (2.5) | 60.4 (2.2) | 61.5 (1.7) | 99.7 (0.3) | 99.7 (0.3) | 99.7 (0.2) |
| 2 | −0.5 | (400, 10) | 67.6 (4.3) | 54.9 (4.3) | 61.5 (3.1) | 68.5 (4.3) | 56.2 (4.8) | 62.5 (3.1) |
| | | (800, 20) | 64.6 (2.7) | 58.4 (2.8) | 61.8 (2.5) | 67.9 (2.4) | 62.7 (3.8) | 65.4 (2.0) |
| | 0 | (400, 10) | 65.7 (3.9) | 57.4 (4.2) | 61.7 (3.0) | 67.0 (3.9) | 59.1 (4.4) | 63.1 (3.0) |
| | | (800, 20) | 63.6 (2.6) | 59.9 (2.7) | 61.8 (1.8) | 67.3 (2.4) | 64.0 (2.6) | 65.6 (1.8) |
| | 0.5 | (400, 10) | 63.9 (3.6) | 59.6 (4.0) | 61.8 (2.7) | 65.5 (3.2) | 61.1 (4.0) | 63.5 (2.5) |
| | | (800, 20) | 62.8 (2.6) | 60.6 (2.5) | 61.7 (1.8) | 66.4 (2.5) | 64.6 (2.6) | 65.5 (1.7) |
Sensitivity, specificity and overall accuracy are given as percentages.
Reported values are means and standard deviations over 500 simulations.
Classification performance of proposed classifiers with high-dimensional continuous covariates.
| Scenario | | ( | Sensitivity | Specificity | Overall | Sensitivity | Specificity | Overall | Bias | Sensitivity | Specificity | Overall |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | −0.5 | (100, 200) | 85.8 (5.8) | 29.8 (8.7) | 59.2 (5.5) | 98.7 (11.4) | 99.7 (5.7) | 99.5 (6.7) | 0.44 (0.02) | 69.5 (14.8) | 57.3 (2.3) | 58.5 (2.9) |
| | | (200, 400) | 88.7 (3.5) | 27.1 (5.9) | 60.0 (4.1) | 100 (0) | 100 (0) | 100 (0) | 0.47 (0.01) | 82.0 (11.9) | 60.4 (1.9) | 60.9 (1.9) |
| | 0 | (100, 200) | 83.4 (5.2) | 33.6 (7.4) | 59.0 (5.4) | 99.0 (10.5) | 99.6 (5.7) | 99.1 (7.2) | 0.45 (0.02) | 70.8 (15.4) | 57.3 (3.0) | 57.9 (3.0) |
| | | (200, 400) | 85.2 (4.5) | 31.9 (5.6) | 59.6 (3.9) | 100 (0) | 100 (0) | 100 (0) | 0.47 (0.01) | 82.7 (12.5) | 59.3 (1.9) | 59.9 (1.9) |
| | 0.5 | (100, 200) | 81.9 (5.3) | 37.5 (7.2) | 60.1 (5.2) | 98.3 (12.8) | 99.7 (5.8) | 99.0 (8.1) | 0.44 (0.02) | 71.4 (14.5) | 56.1 (2.7) | 56.9 (2.9) |
| | | (200, 400) | 84.5 (3.7) | 34.0 (5.1) | 59.6 (3.9) | 100 (0) | 100 (0) | 100 (0) | 0.47 (0.01) | 85.0 (11.0) | 58.6 (1.7) | 59.2 (1.6) |
| 2 | −0.5 | (100, 200) | 85.5 (5.4) | 29.3 (7.9) | 58.9 (5.7) | 94.2 (3.6) | 23.4 (7.9) | 60.8 (6.3) | 0.43 (0.02) | 62.3 (15.4) | 64.8 (2.8) | 64.6 (2.9) |
| | | (200, 400) | 84.0 (4.3) | 32.0 (5.8) | 59.5 (4.1) | 96.3 (2.2) | 31.7 (6.9) | 65.8 (4.7) | 0.47 (0.01) | 75.6 (14.4) | 68.0 (1.8) | 68.2 (1.9) |
| | 0 | (100, 200) | 82.9 (6.0) | 34.2 (7.2) | 59.5 (5.4) | 92.9 (4.0) | 27.7 (7.6) | 61.7 (5.6) | 0.44 (0.02) | 64.8 (15.6) | 64.1 (2.7) | 64.1 (2.9) |
| | | (200, 400) | 81.3 (4.2) | 36.5 (5.9) | 59.6 (3.9) | 95.7 (2.2) | 35.7 (6.7) | 66.5 (4.7) | 0.47 (0.01) | 76.8 (14.1) | 67.4 (2.0) | 67.7 (2.0) |
| | 0.5 | (100, 200) | 82.0 (5.7) | 37.8 (7.1) | 60.1 (5.5) | 92.5 (4.1) | 31.0 (8.1) | 62.1 (6.1) | 0.45 (0.02) | 63.8 (15.9) | 63.7 (2.8) | 63.7 (2.9) |
| | | (200, 400) | 79.9 (4.0) | 38.5 (5.5) | 59.5 (3.7) | 95.1 (2.3) | 37.8 (6.6) | 66.9 (4.5) | 0.46 (0.01) | 77.1 (13.6) | 66.7 (2.1) | 67.4 (2.1) |
Sensitivity, specificity and overall accuracy are given as percentages.
Reported values are means and standard deviations over 500 simulations.
Classification of the first recurrent infection (ν = 2.05).
| Recurrence Pair | Days to Recurrence | Baseline Variants | | | Recurrence Variants | Variant Prevalence | | Class by our method | Class by Lin et al. |
|---|---|---|---|---|---|---|---|---|---|
| 10 → 10R | 84 | CAM.00 | 0.907 | 0.783 | CAM.00 | 0.590 | 0.995 | Relapse | Relapse |
| | | CAM.11 | 0 | | CAM.11 | 0.077 | | | |
| | | | | | CAM.15 | 0.013 | | | |
| 31 → 31R | 84 | CAM.00 | 0.907 | 0.910 | CAM.16 | 0.006 | 0.988 | Relapse | Relapse |
| | | CAM.02 | 0 | | | | | | |
| | | CAM.04 | 1.026 | | | | | | |
| | | CAM.31 | 0 | | | | | | |
| 36 → 36R | 99 | CAM.00 | 0.907 | 0.910 | CAM.01 | 0.269 | 0.645 | Relapse | Relapse |
| | | CAM.01 | 0 | | CAM.02 | 0.41 | | | |
| | | CAM.02 | 0 | | CAM.07 | 0.192 | | | |
| | | CAM.03 | 0 | | CAM.17 | 0.064 | | | |
| | | CAM.04 | 1.026 | | | | | | |
| | | CAM.05 | 0 | | | | | | |
| | | CAM.06 | 0 | | | | | | |
| | | CAM.07 | 0 | | | | | | |
| | | CAM.09 | 0 | | | | | | |
| | | CAM.11 | 0 | | | | | | |
| 68 → 68R | 99 | CAM.00 | 0.907 | 0.910 | CAM.10 | 0.077 | 0.997 | Relapse | Relapse |
| | | CAM.02 | 0 | | | | | | |
| | | CAM.04 | 1.026 | | | | | | |
| | | CAM.10 | 0 | | | | | | |
| 80 → 80R | 56 | CAM.00 | 0.907 | 0.910 | CAM.00 | 0.590 | 0.000 | Reinfection | Reinfection |
| | | CAM.04 | 1.026 | | CAM.01 | 0.269 | | | |
| | | CAM.05 | 0 | | CAM.02 | 0.410 | | | |
| | | CAM.08 | 0 | | CAM.03 | 0.295 | | | |
| | | CAM.09 | 0 | | CAM.05 | 0.231 | | | |
| | | CAM.24 | 0 | | CAM.06 | 0.231 | | | |
| | | CAM.27 | 0 | | CAM.07 | 0.192 | | | |
| | | | | | CAM.08 | 0.154 | | | |
| | | | | | CAM.12 | 0.064 | | | |
| | | | | | CAM.41 | 0.013 | | | |
| 81 → 81R | 35 | CAM.00 | 0.907 | 0.783 | CAM.00 | 0.590 | 0.974 | Relapse | Relapse |
| | | CAM.01 | 0 | | CAM.01 | 0.269 | | | |
| | | CAM.51 | 0 | | | | | | |
| 82 → 82R | 56 | CAM.00 | 0.907 | 0.910 | CAM.00 | 0.590 | 0.674 | Relapse | Relapse |
| | | CAM.03 | 0 | | CAM.01 | 0.269 | | | |
| | | CAM.04 | 1.026 | | CAM.03 | 0.295 | | | |
| | | CAM.10 | 0 | | CAM.46 | 0.006 | | | |
| 87 → 87R | 81 | CAM.00 | 0.907 | 0.783 | CAM.00 | 0.590 | 0.424 | Reinfection | Relapse |
| | | CAM.01 | 0 | | CAM.07 | 0.192 | | | |
| | | CAM.02 | 0 | | CAM.08 | 0.154 | | | |
| | | CAM.08 | 0 | | CAM.53 | 0.013 | | | |
| | | CAM.24 | 0 | | | | | | |
| 89 → 89R | 14 | CAM.00 | 0.907 | 0.910 | CAM.01 | 0.269 | 0.052 | Reinfection | Reinfection |
| | | CAM.04 | 1.026 | | CAM.09 | 0.077 | | | |
| | | CAM.06 | 0 | | CAM.20 | 0.026 | | | |
| | | CAM.08 | 0 | | CAM.27 | 0.038 | | | |
| | | CAM.10 | 0 | | | | | | |
| | | CAM.12 | 0 | | | | | | |
| 96 → 96R | 71 | CAM.00 | 0.907 | 0.910 | CAM.00 | 0.590 | 0.983 | Relapse | Relapse |
| | | CAM.02 | 0 | | CAM.30 | 0.013 | | | |
| | | CAM.04 | 1.026 | | | | | | |
| | | CAM.08 | 0 | | | | | | |
| 112 → 112R | 67 | CAM.00 | 0.907 | 0.910 | CAM.00 | 0.590 | 0.670 | Relapse | Relapse |
| | | CAM.01 | 0 | | CAM.01 | 0.269 | | | |
| | | CAM.02 | 0 | | CAM.02 | 0.410 | | | |
| | | CAM.04 | 1.026 | | | | | | |
| | | CAM.07 | 0 | | | | | | |
| | | CAM.12 | 0 | | | | | | |
| | | CAM.40 | 0 | | | | | | |
| | | CAM.42 | 0 | | | | | | |
| | | CAM.60 | 0 | | | | | | |
| 118 → 118R | 89 | CAM.08 | 0 | 0.593 | CAM.01 | 0.269 | 0.008 | Reinfection | Reinfection |
| | | | | | CAM.02 | 0.410 | | | |
| | | | | | CAM.25 | 0.006 | | | |
| | | | | | CAM.39 | 0.006 | | | |
| 123 → 123R | 26 | CAM.00 | 0.907 | 0.783 | CAM.00 | 0.590 | 0.700 | Relapse | Reinfection |
| | | CAM.02 | 0 | | CAM.01 | 0.269 | | | |
| 125 → 125R | 82 | CAM.02 | 0 | 0.593 | CAM.00 | 0.590 | 0.000 | Reinfection | Reinfection |
| | | | | | CAM.01 | 0.269 | | | |
| | | | | | CAM.02 | 0.410 | | | |
| | | | | | CAM.04 | 0.346 | | | |
| | | | | | CAM.09 | 0.077 | | | |
| | | | | | CAM.13 | 0.006 | | | |
| | | | | | CAM.14 | 0.026 | | | |
| | | | | | CAM.38 | 0.006 | | | |
| | | | | | CAM.45 | 0.006 | | | |
| 126 → 126R | 85 | CAM.00 | 0.907 | 0.910 | CAM.01 | 0.269 | 0.975 | Relapse | Relapse |
| | | CAM.01 | 0 | | CAM.07 | 0.192 | | | |
| | | CAM.02 | 0 | | CAM.33 | 0.006 | | | |
| | | CAM.03 | 0 | | | | | | |
| | | CAM.04 | 1.026 | | | | | | |
| | | CAM.05 | 0 | | | | | | |
| | | CAM.06 | 0 | | | | | | |
| | | CAM.07 | 0 | | | | | | |
| | | CAM.22 | 0 | | | | | | |
| | | CAM.50 | 0 | | | | | | |
| 130 → 130R | 68 | CAM.00 | 0.907 | 0.910 | CAM.00 | 0.590 | 0.997 | Relapse | Relapse |
| | | CAM.02 | 0 | | CAM.04 | 0.346 | | | |
| | | CAM.03 | 0 | | CAM.12 | 0.064 | | | |
| | | CAM.04 | 1.026 | | | | | | |
| | | CAM.12 | 0 | | | | | | |
| 151 → 151R | 126 | CAM.03 | 0 | 0.593 | CAM.00 | 0.590 | 0.325 | Reinfection | Reinfection |
| | | CAM.05 | 0 | | CAM.08 | 0.154 | | | |
| | | CAM.08 | 0 | | CAM.14 | 0.026 | | | |
| | | | | | CAM.64 | 0.006 | | | |
| 152 → 152R | 94 | CAM.00 | 0.907 | 0.783 | CAM.00 | 0.590 | 0.153 | Reinfection | Reinfection |
| | | CAM.01 | 0 | | CAM.01 | 0.269 | | | |
| | | | | | CAM.05 | 0.231 | | | |
| | | | | | CAM.07 | 0.192 | | | |
| 153 → 153R | 115 | CAM.00 | 0.907 | 0.910 | CAM.02 | 0.410 | 0.425 | Reinfection | Relapse |
| | | CAM.04 | 1.026 | | CAM.20 | 0.026 | | | |
| | | CAM.07 | 0 | | | | | | |
| | | CAM.55 | 0 | | | | | | |
| 154 → 154R | 64 | CAM.00 | 0.907 | 0.783 | CAM.03 | 0.295 | 0.116 | Reinfection | Reinfection |
| | | CAM.06 | 0 | | CAM.05 | 0.231 | | | |
| | | CAM.57 | 0 | | CAM.06 | 0.231 | | | |
| 160 → 160R | 17 | CAM.02 | 0 | 0.803 | CAM.00 | 0.590 | 0.000 | Reinfection | Reinfection |
| | | CAM.04 | 1.026 | | CAM.03 | 0.295 | | | |
| | | CAM.07 | 0 | | CAM.05 | 0.231 | | | |
| | | | | | CAM.10 | 0.077 | | | |
| | | | | | CAM.61 | 0.006 | | | |
| 177 → 177R | 84 | CAM.00 | 0.907 | 0.910 | CAM.01 | 0.269 | 0.773 | Relapse | Relapse |
| | | CAM.04 | 1.026 | | | | | | |
| | | CAM.07 | 0 | | | | | | |
| 179 → 179R | 84 | CAM.03 | 0 | 0.593 | CAM.01 | 0.269 | 0.234 | Reinfection | Reinfection |
| | | CAM.05 | 0 | | CAM.13 | 0.006 | | | |
| | | CAM.07 | 0 | | | | | | |
| | | CAM.09 | 0 | | | | | | |
| | | CAM.17 | 0 | | | | | | |
| | | CAM.22 | 0 | | | | | | |
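
As a reading aid for the table above, the two classification columns can be compared directly. The sketch below simply tallies how often the two calls coincide; the labels are transcribed from the table, and the variable names (`calls`, `n_agree`, `discordant`) are illustrative, not part of the original analysis.

```python
# (recurrence pair, class by the proposed method, class by Lin et al.),
# transcribed from the last two columns of the table.
calls = [
    ("10", "Relapse", "Relapse"),
    ("31", "Relapse", "Relapse"),
    ("36", "Relapse", "Relapse"),
    ("68", "Relapse", "Relapse"),
    ("80", "Reinfection", "Reinfection"),
    ("81", "Relapse", "Relapse"),
    ("82", "Relapse", "Relapse"),
    ("87", "Reinfection", "Relapse"),
    ("89", "Reinfection", "Reinfection"),
    ("96", "Relapse", "Relapse"),
    ("112", "Relapse", "Relapse"),
    ("118", "Reinfection", "Reinfection"),
    ("123", "Relapse", "Reinfection"),
    ("125", "Reinfection", "Reinfection"),
    ("126", "Relapse", "Relapse"),
    ("130", "Relapse", "Relapse"),
    ("151", "Reinfection", "Reinfection"),
    ("152", "Reinfection", "Reinfection"),
    ("153", "Reinfection", "Relapse"),
    ("154", "Reinfection", "Reinfection"),
    ("160", "Reinfection", "Reinfection"),
    ("177", "Relapse", "Relapse"),
    ("179", "Reinfection", "Reinfection"),
]

# Count pairs on which both methods give the same class.
n_agree = sum(1 for _, ours, lin in calls if ours == lin)

# Collect the pairs on which the two methods differ.
discordant = [pair for pair, ours, lin in calls if ours != lin]

print(f"{n_agree}/{len(calls)} concordant; discordant pairs: {discordant}")
# prints: 20/23 concordant; discordant pairs: ['87', '123', '153']
```

The tally is descriptive only: neither column is a ground truth, so it measures agreement between the two methods rather than the accuracy of either one.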
Figure 3: Goodness-of-fit model diagnosis for the P. vivax malaria data using ν = 2.05.