Xiangrui Li, Dongxiao Zhu, Phillip Levy.
Abstract
BACKGROUND: Accurate predictive modeling in clinical research enables effective early intervention for the patients most likely to benefit from it. However, because disease progression is biologically complex, capturing highly non-linear information from low-level input features is challenging and requires high-capacity predictive models. In practice, clinical datasets are often small, which puts high-capacity models at risk of overfitting. To address these two challenges, we propose a deep multi-task neural network for predictive modeling.
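The multi-task idea can be illustrated with a minimal sketch: one shared hidden layer feeds two output heads, and the auxiliary target contributes a weighted term to the training loss. This is not the authors' GATAN architecture; the layer sizes, the `aux_weight` parameter, the function names, and the toy data are all assumptions made for illustration.

```python
import random

def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]

def linear(weights, bias, x):
    """Affine map: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (hypothetical initialization)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(params, x):
    """Shared representation feeds a primary head and an auxiliary head."""
    hidden = relu(linear(*params["shared"], x))
    primary = linear(*params["primary_head"], hidden)[0]
    auxiliary = linear(*params["aux_head"], hidden)[0]
    return primary, auxiliary

def joint_loss(params, X, y_primary, y_aux, aux_weight=0.5):
    """Mean of primary squared error plus weighted auxiliary squared error."""
    total = 0.0
    for x, yp, ya in zip(X, y_primary, y_aux):
        p, a = forward(params, x)
        total += (p - yp) ** 2 + aux_weight * (a - ya) ** 2
    return total / len(X)

rng = random.Random(0)
params = {
    "shared": init_layer(4, 3, rng),        # 3 inputs -> 4 shared hidden units
    "primary_head": init_layer(1, 4, rng),  # e.g. the primary clinical target
    "aux_head": init_layer(1, 4, rng),      # e.g. one related auxiliary measure
}

# Toy data, invented for the example.
X = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
y_primary = [1.0, 2.0]
y_aux = [0.5, 1.5]

loss_joint = joint_loss(params, X, y_primary, y_aux, aux_weight=0.5)
loss_primary_only = joint_loss(params, X, y_primary, y_aux, aux_weight=0.0)
```

Because both heads read the same hidden layer, gradients from the auxiliary term would also update the shared weights during training, which is the usual argument for why auxiliary tasks can act as a regularizer on small datasets. Setting `aux_weight=0.0` recovers a single-task loss.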
Keywords: Auxiliary task; Deep neural network; Multi-task learning; Predictive modeling
Year: 2018 PMID: 30537954 PMCID: PMC6290511 DOI: 10.1186/s12911-018-0676-9
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Fig. 1 Motivating example for GATAN. Left ventricular mass index to body surface area (LVMI) is the primary target. The labeling process also produces other measures that are clinically related to LVMI. We predict these measures as auxiliary tasks in our model
Fig. 2 Structure of GATAN with one auxiliary task
Fig. 3 An example of calculating the contribution of hidden neurons using weight back-propagation
Descriptive statistics of LVMI and other CMR measures
| | Min | 1st Qtl | Median | 3rd Qtl | Max | Mean |
|---|---|---|---|---|---|---|
| LVMI | 51.06 | 80.06 | 89.72 | 100.83 | 155.66 | 90.81 |
| LVSVI | 9.93 | 22.23 | 28.37 | 33.88 | 53.38 | 28.10 |
| LVEDVI | 18.39 | 33.29 | 41.42 | 50.63 | 106.73 | 42.81 |
| Septal | 4.8 | 9.7 | 11.60 | 13.60 | 26.5 | 11.96 |
| Posterior | 2.23 | 9.60 | 11.90 | 14.20 | 22.50 | 12.02 |
| Anterior | 5.70 | 10.60 | 12.40 | 14.50 | 20.40 | 12.66 |
Fig. 4 Histogram of LVMI in hypertension data (left) and time to recur for WPBC (right)
Predictive performance on hypertension dataset
| Dataset | Metric | KNN | RF | SVR | Ridge | Lasso | MTLasso | MLP-4 | GATAN-1 | GATAN-2 |
|---|---|---|---|---|---|---|---|---|---|---|
| Full feature set | MSE | 248.06 | 214.68 | 299.03 | 261.52 | 205.67 | 217.34 | 209.43 | | 203.50 |
| | | (60.73) | (25.18) | (82.16) | (23.26) | (36.07) | (39.35) | (28.36) | (33.48) | (29.98) |
| | EVS | 0.26 | 0.29 | 0.08 | 0.10 | 0.33 | 0.30 | 0.32 | | 0.34 |
| | | (0.18) | (0.12) | (0.02) | (0.37) | (0.11) | (0.14) | (0.14) | (0.10) | (0.14) |
| | MAE | 10.91 | 11.29 | 11.66 | 12.41 | 11.40 | 11.65 | 10.43 | | 10.77 |
| | | (2.05) | (1.97) | (1.93) | (1.65) | (2.58) | (2.53) | (2.02) | (1.71) | (2.10) |
| Lab and demo | MSE | 282.06 | 261.27 | 284.05 | 278.80 | 250.754 | 253.59 | 243.41 | 237.97 | 237.66 |
| | | (39.58) | (20.56) | (58.15) | (18.88) | (26.01) | (33.79) | (31.87) | (33.59) | (34.09) |
| | EVS | 0.06 | 0.08 | 0.06 | 0.03 | 0.15 | 0.14 | 0.17 | 0.19 | 0.19 |
| | | (0.17) | (0.25) | (0.01) | (0.22) | (0.11) | (0.11) | (0.10) | (0.09) | (0.10) |
| | MAE | 10.54 | 10.42 | 9.90 | 10.24 | 9.59 | 9.43 | 8.84 | 8.67 | 8.54 |
| | | (2.38) | (0.95) | (1.24) | (1.78) | (1.26) | (0.94) | (1.96) | (2.05) | (2.01) |
The first section uses a full set of features; the second only uses lab results and demographic information
The best performance is bolded
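The three metrics reported in the table above can be computed as in the following minimal sketch. It assumes that EVS denotes the explained variance score, defined as 1 - Var(y - ŷ) / Var(y); the source does not spell out the definition. The function names and sample values are invented for illustration and are not taken from the hypertension data.

```python
from statistics import fmean, pvariance

def mse(y_true, y_pred):
    """Mean squared error: lower is better."""
    return fmean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])

def mae(y_true, y_pred):
    """Mean absolute error: lower is better."""
    return fmean([abs(t - p) for t, p in zip(y_true, y_pred)])

def evs(y_true, y_pred):
    """Explained variance score (assumed definition): higher is better.

    Uses population variances; a constant offset in the predictions
    does not lower the score, unlike MSE and MAE.
    """
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1.0 - pvariance(residuals) / pvariance(y_true)

# Hypothetical targets and predictions.
y_true = [80.0, 90.0, 100.0, 110.0]
y_pred = [85.0, 88.0, 97.0, 118.0]

print(mse(y_true, y_pred))  # 25.5
print(mae(y_true, y_pred))  # 4.5
print(round(evs(y_true, y_pred), 3))  # 0.828
```

Under this definition EVS can be negative when the residuals vary more than the target itself, and a model whose predictions are off by a constant still scores 1.0, so EVS is best read alongside MSE and MAE rather than on its own.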
Fig. 5 Top-20 for the complete set of features. Auxiliary target: (a) LVEDVI (b) posterior wall thickness
Fig. 6 Top-15 for only lab results as features. Auxiliary target: (a) LVEDVI (b) posterior wall thickness
Predictive performance on WPBC dataset
| Dataset | Metric | KNN | RF | SVR | Ridge | Lasso | MTLasso | MLP-4 | GATAN |
|---|---|---|---|---|---|---|---|---|---|
| WPBC | MSE | 1139.06 | 1189.69 | 1007.50 | 1184.38 | 1000.94 | 990.63 | 941.88 | |
| | | (200.05) | (273.34) | (153.95) | (253.02) | (144.57) | (163.17) | (145.68) | (65.49) |
| | EVS | -0.22 | -0.17 | 0.00 | -0.21 | -0.01 | 0.01 | 0.00 | -0.01 |
| | | (0.15) | (0.17) | (0.01) | (0.28) | (0.15) | (0.14) | (0.01) | (0.02) |
| | MAE | 25.48 | 28.00 | 27.16 | 27.78 | 24.58 | 24.16 | 27.09 | |
| | | (4.94) | (3.31) | (6.08) | (4.27) | (1.32) | (3.30) | (4.88) | (0.79) |
The best performance is bolded
Fig. 7 Feature contribution for WPBC dataset