Diego Alvarez-Estevez, Roselyne M Rijsman.
Year: 2021 PMID: 34398931 PMCID: PMC8366993 DOI: 10.1371/journal.pone.0256111
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Preprocessing steps and general CNN-LSTM neural network architecture.
Performance results of each individual model on the local validation scenario.
| Local dataset | Model configuration | Training iterations | TR | VAL | TS |
|---|---|---|---|---|---|
| HMC | CNN_1 | 15 | 0.79 | 0.73 | 0.74 |
| | CNN_3 | 7 | 0.83 | 0.72 | 0.71 |
| | CNN_5 | 7 | 0.87 | 0.71 | 0.7 |
| | CNN_7 | 5 | 0.83 | 0.69 | 0.69 |
| | CNN_LSTM_3 | 7 | 0.81 | 0.78 | 0.78 |
| | CNN_LSTM_5 | 17 | 0.84 | 0.79 | 0.79 |
| | CNN_LSTM_7 | 27 | 0.83 | 0.77 | 0.77 |
| | CNN_F_1 | 14 | 0.78 | 0.73 | 0.74 |
| | CNN_F_3 | 8 | 0.84 | 0.71 | 0.71 |
| | CNN_F_5 | 6 | 0.84 | 0.7 | 0.7 |
| | CNN_F_7 | 5 | 0.85 | 0.69 | 0.69 |
| | CNN_LSTM_F_3 | 7 | 0.79 | 0.77 | 0.77 |
| | CNN_LSTM_F_5 | 8 | 0.77 | 0.75 | 0.75 |
| | CNN_LSTM_F_7 | 10 | 0.76 | 0.74 | 0.74 |
| Dublin | CNN_1 | 10 | 0.76 | 0.68 | 0.68 |
| | CNN_3 | 7 | 0.85 | 0.66 | 0.66 |
| | CNN_5 | 6 | 0.89 | 0.62 | 0.64 |
| | CNN_7 | 7 | 0.89 | 0.65 | 0.67 |
| | CNN_LSTM_3 | 8 | 0.82 | 0.76 | 0.77 |
| | CNN_LSTM_5 | 9 | 0.84 | 0.78 | 0.79 |
| | CNN_LSTM_7 | 9 | 0.84 | 0.77 | 0.77 |
| | CNN_F_1 | 14 | 0.77 | 0.65 | 0.65 |
| | CNN_F_3 | 8 | 0.83 | 0.65 | 0.64 |
| | CNN_F_5 | 8 | 0.88 | 0.6 | 0.61 |
| | CNN_F_7 | 13 | 0.9 | 0.67 | 0.66 |
| | CNN_LSTM_F_3 | 8 | 0.81 | 0.76 | 0.77 |
| | CNN_LSTM_F_5 | 8 | 0.82 | 0.78 | 0.79 |
| | CNN_LSTM_F_7 | 9 | 0.84 | 0.77 | 0.78 |
| SHHS | CNN_1 | 9 | 0.8 | 0.76 | 0.75 |
| | CNN_3 | 7 | 0.89 | 0.79 | 0.79 |
| | CNN_5 | 7 | 0.95 | 0.78 | 0.79 |
| | CNN_7 | 6 | 0.92 | 0.76 | 0.76 |
| | CNN_LSTM_3 | 17 | 0.87 | 0.83 | 0.84 |
| | CNN_LSTM_5 | 18 | 0.86 | 0.83 | 0.82 |
| | CNN_LSTM_7 | 7 | 0.79 | 0.78 | 0.77 |
| | CNN_F_1 | 9 | 0.8 | 0.77 | 0.76 |
| | CNN_F_3 | 8 | 0.89 | 0.79 | 0.79 |
| | CNN_F_5 | 5 | 0.9 | 0.78 | 0.78 |
| | CNN_F_7 | 6 | 0.94 | 0.78 | 0.77 |
| | CNN_LSTM_F_3 | 10 | 0.85 | 0.82 | 0.83 |
| | CNN_LSTM_F_5 | 18 | 0.86 | 0.83 | 0.82 |
| | CNN_LSTM_F_7 | 5 | 0.8 | 0.79 | 0.79 |
| Telemetry | CNN_1 | 14 | 0.81 | 0.76 | 0.76 |
| | CNN_3 | 10 | 0.88 | 0.73 | 0.75 |
| | CNN_5 | 8 | 0.9 | 0.73 | 0.72 |
| | CNN_7 | 6 | 0.88 | 0.72 | 0.7 |
| | CNN_LSTM_3 | 10 | 0.85 | 0.8 | 0.81 |
| | CNN_LSTM_5 | 9 | 0.85 | 0.79 | 0.8 |
| | CNN_LSTM_7 | 8 | 0.84 | 0.8 | 0.8 |
| | CNN_F_1 | 14 | 0.82 | 0.77 | 0.77 |
| | CNN_F_3 | 8 | 0.87 | 0.71 | 0.71 |
| | CNN_F_5 | 9 | 0.88 | 0.73 | 0.73 |
| | CNN_F_7 | 9 | 0.91 | 0.73 | 0.71 |
| | CNN_LSTM_F_3 | 10 | 0.84 | 0.81 | 0.81 |
| | CNN_LSTM_F_5 | 15 | 0.89 | 0.82 | 0.83 |
| | CNN_LSTM_F_7 | 16 | 0.89 | 0.82 | 0.82 |
| DREAMS | CNN_1 | 9 | 0.81 | 0.75 | 0.76 |
| | CNN_3 | 8 | 0.92 | 0.76 | 0.76 |
| | CNN_5 | 7 | 0.91 | 0.75 | 0.75 |
| | CNN_7 | 5 | 0.85 | 0.73 | 0.73 |
| | CNN_LSTM_3 | 17 | 0.88 | 0.84 | 0.83 |
| | CNN_LSTM_5 | 20 | 0.9 | 0.83 | 0.83 |
| | CNN_LSTM_7 | 5 | 0.81 | 0.78 | 0.78 |
| | CNN_F_1 | 9 | 0.82 | 0.76 | 0.77 |
| | CNN_F_3 | 7 | 0.91 | 0.77 | 0.78 |
| | CNN_F_5 | 7 | 0.92 | 0.74 | 0.75 |
| | CNN_F_7 | 6 | 0.88 | 0.73 | 0.72 |
| | CNN_LSTM_F_3 | 20 | 0.89 | 0.84 | 0.83 |
| | CNN_LSTM_F_5 | 28 | 0.9 | 0.84 | 0.84 |
| | CNN_LSTM_F_7 | 10 | 0.85 | 0.81 | 0.8 |
| ISRUC | CNN_1 | 17 | 0.81 | 0.77 | 0.76 |
| | CNN_3 | 6 | 0.83 | 0.74 | 0.75 |
| | CNN_5 | 7 | 0.9 | 0.73 | 0.73 |
| | CNN_7 | 6 | 0.86 | 0.72 | 0.73 |
| | CNN_LSTM_3 | 10 | 0.81 | 0.8 | 0.8 |
| | CNN_LSTM_5 | 10 | 0.8 | 0.78 | 0.78 |
| | CNN_LSTM_7 | 6 | 0.75 | 0.75 | 0.75 |
| | CNN_F_1 | 9 | 0.79 | 0.76 | 0.75 |
| | CNN_F_3 | 7 | 0.84 | 0.75 | 0.75 |
| | CNN_F_5 | 7 | 0.9 | 0.73 | 0.73 |
| | CNN_F_7 | 6 | 0.86 | 0.71 | 0.72 |
| | CNN_LSTM_F_3 | 10 | 0.81 | 0.79 | 0.79 |
| | CNN_LSTM_F_5 | 9 | 0.78 | 0.77 | 0.76 |
| | CNN_LSTM_F_7 | 9 | 0.75 | 0.74 | 0.74 |
Results report agreement in terms of kappa index with respect to the corresponding human clinical scorings for each dataset. Agreement is reported separately for each corresponding training (TR), validation (VAL) and testing (TS) dataset partitions. The number of effective training iterations is indicated in the third column. Rows within each dataset correspond to the different tested neural network configurations as described in the experimental design.
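For readers unfamiliar with the metric, the following is a minimal sketch of how a kappa index can be computed from two sequences of per-epoch sleep stage labels. It assumes the "kappa index" here is Cohen's kappa; the function name `cohen_kappa` and the toy label sequences are illustrative and not taken from the paper.

```python
from collections import Counter


def cohen_kappa(reference, predicted):
    """Cohen's kappa between two equal-length label sequences."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(reference)
    # Observed agreement: fraction of epochs where both scorings coincide.
    p_o = sum(r == p for r, p in zip(reference, predicted)) / n
    # Chance agreement: sum over labels of the product of marginal frequencies.
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    p_e = sum(ref_counts[label] * pred_counts[label] for label in ref_counts) / (n * n)
    if p_e == 1.0:
        return 1.0  # degenerate case: both scorings use a single identical label
    return (p_o - p_e) / (1.0 - p_e)


# Toy example (hypothetical labels): human scoring vs. model output for 8 epochs.
human = ["W", "W", "N1", "N2", "N2", "N3", "REM", "REM"]
model = ["W", "W", "N2", "N2", "N2", "N3", "REM", "N2"]
kappa = cohen_kappa(human, model)
print(round(kappa, 4))
```

In this toy case the raw agreement is 6 of 8 epochs (0.75), while kappa is lower because part of that agreement is expected by chance.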
Performance results of the individual local models on the external validation scenario.
| Predicted dataset | Model configuration | M(HMC) | M(Dublin) | M(SHHS) | M(Telemetry) | M(DREAMS) | M(ISRUC) |
|---|---|---|---|---|---|---|---|
| HMC | CNN_1 | 0.77 | 0.51 | 0.56 | 0.53 | 0.52 | 0.6 |
| | CNN_3 | 0.79 | 0.46 | 0.6 | 0.42 | 0.47 | 0.56 |
| | CNN_5 | 0.81 | 0.37 | 0.57 | 0.39 | 0.44 | 0.54 |
| | CNN_7 | 0.78 | 0.4 | 0.5 | 0.34 | 0.43 | 0.55 |
| | CNN_LSTM_3 | 0.8 | 0.54 | 0.58 | 0.51 | 0.5 | 0.61 |
| | CNN_LSTM_5 | 0.82 | 0.53 | 0.6 | 0.5 | 0.49 | 0.62 |
| | CNN_LSTM_7 | 0.81 | 0.52 | 0.59 | 0.52 | 0.51 | 0.61 |
| | CNN_F_1 | 0.76 | 0.39 | 0.58 | 0.49 | 0.56 | 0.62 |
| | CNN_F_3 | 0.79 | 0.37 | 0.6 | 0.48 | 0.48 | 0.63 |
| | CNN_F_5 | 0.79 | 0.35 | 0.57 | 0.42 | 0.46 | 0.61 |
| | CNN_F_7 | 0.79 | 0.35 | 0.54 | 0.37 | 0.5 | 0.58 |
| | CNN_LSTM_F_3 | 0.78 | 0.38 | 0.62 | 0.45 | 0.55 | 0.64 |
| | CNN_LSTM_F_5 | 0.76 | 0.37 | 0.63 | 0.48 | 0.55 | 0.65 |
| | CNN_LSTM_F_7 | 0.75 | 0.34 | 0.62 | 0.47 | 0.55 | 0.62 |
| Dublin | CNN_1 | 0.53 | 0.73 | 0.44 | 0.41 | 0.53 | 0.51 |
| | CNN_3 | 0.57 | 0.78 | 0.5 | 0.34 | 0.49 | 0.57 |
| | CNN_5 | 0.52 | 0.79 | 0.51 | 0.32 | 0.51 | 0.59 |
| | CNN_7 | 0.53 | 0.81 | 0.39 | 0.31 | 0.49 | 0.57 |
| | CNN_LSTM_3 | 0.54 | 0.8 | 0.48 | 0.38 | 0.58 | 0.55 |
| | CNN_LSTM_5 | 0.54 | 0.82 | 0.5 | 0.39 | 0.58 | 0.55 |
| | CNN_LSTM_7 | 0.5 | 0.81 | 0.5 | 0.42 | 0.57 | 0.55 |
| | CNN_F_1 | 0.2 | 0.73 | 0.13 | 0.03 | 0.01 | 0.07 |
| | CNN_F_3 | 0.15 | 0.77 | 0.1 | 0.01 | 0.02 | 0.05 |
| | CNN_F_5 | 0.04 | 0.78 | 0.02 | 0.01 | 0.03 | 0.04 |
| | CNN_F_7 | 0.04 | 0.81 | 0.13 | 0.01 | 0.02 | 0.14 |
| | CNN_LSTM_F_3 | 0.24 | 0.8 | 0.07 | 0.02 | 0.01 | 0.05 |
| | CNN_LSTM_F_5 | 0.22 | 0.81 | 0.07 | 0.01 | 0.01 | 0.05 |
| | CNN_LSTM_F_7 | 0.17 | 0.81 | 0.06 | 0.01 | 0.01 | 0.04 |
| SHHS | CNN_1 | 0.57 | 0.5 | 0.78 | 0.42 | 0.59 | 0.64 |
| | CNN_3 | 0.59 | 0.52 | 0.86 | 0.3 | 0.6 | 0.63 |
| | CNN_5 | 0.55 | 0.42 | 0.89 | 0.27 | 0.6 | 0.63 |
| | CNN_7 | 0.61 | 0.4 | 0.86 | 0.26 | 0.6 | 0.65 |
| | CNN_LSTM_3 | 0.54 | 0.57 | 0.86 | 0.42 | 0.54 | 0.62 |
| | CNN_LSTM_5 | 0.5 | 0.56 | 0.85 | 0.46 | 0.52 | 0.67 |
| | CNN_LSTM_7 | 0.47 | 0.53 | 0.78 | 0.46 | 0.56 | 0.66 |
| | CNN_F_1 | 0.68 | 0.35 | 0.78 | 0.43 | 0.6 | 0.65 |
| | CNN_F_3 | 0.53 | 0.29 | 0.85 | 0.38 | 0.59 | 0.65 |
| | CNN_F_5 | 0.52 | 0.32 | 0.86 | 0.39 | 0.58 | 0.68 |
| | CNN_F_7 | 0.52 | 0.28 | 0.88 | 0.31 | 0.63 | 0.67 |
| | CNN_LSTM_F_3 | 0.68 | 0.29 | 0.84 | 0.4 | 0.57 | 0.63 |
| | CNN_LSTM_F_5 | 0.66 | 0.31 | 0.85 | 0.39 | 0.53 | 0.65 |
| | CNN_LSTM_F_7 | 0.67 | 0.22 | 0.79 | 0.41 | 0.57 | 0.62 |
| Telemetry | CNN_1 | 0.67 | 0.53 | 0.51 | 0.79 | 0.48 | 0.63 |
| | CNN_3 | 0.6 | 0.42 | 0.55 | 0.81 | 0.43 | 0.53 |
| | CNN_5 | 0.6 | 0.43 | 0.54 | 0.83 | 0.48 | 0.57 |
| | CNN_7 | 0.5 | 0.45 | 0.38 | 0.82 | 0.45 | 0.52 |
| | CNN_LSTM_3 | 0.68 | 0.59 | 0.49 | 0.83 | 0.45 | 0.62 |
| | CNN_LSTM_5 | 0.69 | 0.59 | 0.5 | 0.83 | 0.43 | 0.64 |
| | CNN_LSTM_7 | 0.67 | 0.61 | 0.5 | 0.82 | 0.48 | 0.65 |
| | CNN_F_1 | 0.7 | 0.39 | 0.61 | 0.8 | 0.44 | 0.61 |
| | CNN_F_3 | 0.67 | 0.46 | 0.57 | 0.83 | 0.46 | 0.62 |
| | CNN_F_5 | 0.63 | 0.43 | 0.41 | 0.83 | 0.42 | 0.55 |
| | CNN_F_7 | 0.64 | 0.44 | 0.33 | 0.84 | 0.46 | 0.54 |
| | CNN_LSTM_F_3 | 0.72 | 0.44 | 0.6 | 0.83 | 0.43 | 0.6 |
| | CNN_LSTM_F_5 | 0.71 | 0.48 | 0.62 | 0.87 | 0.44 | 0.63 |
| | CNN_LSTM_F_7 | 0.68 | 0.44 | 0.6 | 0.86 | 0.44 | 0.61 |
| DREAMS | CNN_1 | 0.5 | 0.56 | 0.58 | 0.34 | 0.79 | 0.71 |
| | CNN_3 | 0.46 | 0.52 | 0.59 | 0.34 | 0.86 | 0.71 |
| | CNN_5 | 0.42 | 0.33 | 0.36 | 0.31 | 0.85 | 0.68 |
| | CNN_7 | 0.61 | 0.51 | 0.58 | 0.27 | 0.81 | 0.67 |
| | CNN_LSTM_3 | 0.5 | 0.56 | 0.54 | 0.43 | 0.87 | 0.73 |
| | CNN_LSTM_5 | 0.41 | 0.56 | 0.54 | 0.47 | 0.87 | 0.75 |
| | CNN_LSTM_7 | 0.45 | 0.5 | 0.51 | 0.42 | 0.8 | 0.74 |
| | CNN_F_1 | 0.52 | 0.39 | 0.66 | 0.42 | 0.8 | 0.7 |
| | CNN_F_3 | 0.59 | 0.25 | 0.56 | 0.46 | 0.86 | 0.67 |
| | CNN_F_5 | 0.55 | 0.36 | 0.55 | 0.4 | 0.86 | 0.65 |
| | CNN_F_7 | 0.52 | 0.32 | 0.58 | 0.35 | 0.82 | 0.71 |
| | CNN_LSTM_F_3 | 0.53 | 0.32 | 0.6 | 0.43 | 0.87 | 0.71 |
| | CNN_LSTM_F_5 | 0.55 | 0.31 | 0.63 | 0.46 | 0.88 | 0.72 |
| | CNN_LSTM_F_7 | 0.51 | 0.16 | 0.6 | 0.43 | 0.83 | 0.71 |
| ISRUC | CNN_1 | 0.56 | 0.57 | 0.6 | 0.29 | 0.63 | 0.79 |
| | CNN_3 | 0.57 | 0.54 | 0.64 | 0.29 | 0.56 | 0.8 |
| | CNN_5 | 0.51 | 0.46 | 0.63 | 0.26 | 0.57 | 0.84 |
| | CNN_7 | 0.57 | 0.48 | 0.54 | 0.24 | 0.52 | 0.81 |
| | CNN_LSTM_3 | 0.54 | 0.55 | 0.65 | 0.36 | 0.61 | 0.81 |
| | CNN_LSTM_5 | 0.51 | 0.55 | 0.66 | 0.42 | 0.58 | 0.79 |
| | CNN_LSTM_7 | 0.43 | 0.53 | 0.6 | 0.38 | 0.6 | 0.75 |
| | CNN_F_1 | 0.68 | 0.42 | 0.63 | 0.42 | 0.65 | 0.77 |
| | CNN_F_3 | 0.59 | 0.35 | 0.65 | 0.41 | 0.61 | 0.81 |
| | CNN_F_5 | 0.57 | 0.41 | 0.66 | 0.37 | 0.57 | 0.84 |
| | CNN_F_7 | 0.55 | 0.38 | 0.62 | 0.35 | 0.56 | 0.81 |
| | CNN_LSTM_F_3 | 0.68 | 0.41 | 0.68 | 0.4 | 0.63 | 0.8 |
| | CNN_LSTM_F_5 | 0.67 | 0.43 | 0.69 | 0.44 | 0.61 | 0.78 |
| | CNN_LSTM_F_7 | 0.66 | 0.29 | 0.66 | 0.43 | 0.61 | 0.75 |
Results report agreement in terms of kappa index with respect to the corresponding human clinical scorings for each dataset. The notation M(X) indicates that the model was trained on data from dataset X. Rows within each dataset correspond to the different tested neural network configurations as described in the experimental design. The main diagonal (shown with a greyed background in the original table), i.e. the cells where the predicted dataset equals X, shows the results when the model is predicting its own complete local dataset (biased prediction).
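As an illustration of how one row of this table can be read, the sketch below separates the biased diagonal entry from the external entries and summarizes the latter. The helper `external_summary` and the dictionary layout are hypothetical conveniences; only the kappa values are taken from the HMC / CNN_1 row.

```python
def external_summary(predicted_dataset, kappa_by_training_set):
    """Separate the biased own-dataset score from the external scores of one row."""
    own = kappa_by_training_set[predicted_dataset]
    # External scores: every model M(X) with X different from the predicted dataset.
    external = [
        kappa
        for trained_on, kappa in kappa_by_training_set.items()
        if trained_on != predicted_dataset
    ]
    return {
        "own": own,
        "min": min(external),
        "max": max(external),
        "mean": sum(external) / len(external),
    }


# Row "HMC / CNN_1", keyed by the training dataset X of each model M(X).
hmc_cnn_1 = {
    "HMC": 0.77,
    "Dublin": 0.51,
    "SHHS": 0.56,
    "Telemetry": 0.53,
    "DREAMS": 0.52,
    "ISRUC": 0.6,
}
summary = external_summary("HMC", hmc_cnn_1)
print(summary)
```

For this row the external scores span 0.51 to 0.60 with a mean of 0.544, which agrees with the range and average listed for HMC / CNN_1 in the comparison table that follows.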
Performance comparison between individual models and the ensemble approach in the local and external validation scenarios.
| Predicted dataset | Model configuration | Individual local models: local performance | Individual local models: external performance (range) | Individual local models: external performance (average) | Ensemble: external performance |
|---|---|---|---|---|---|
| HMC | CNN_1 | 0.74 | 0.51–0.60 | 0.54 | 0.61 |
| | CNN_3 | 0.71 | 0.42–0.60 | 0.5 | 0.58 |
| | CNN_5 | 0.7 | 0.37–0.57 | 0.46 | 0.55 |
| | CNN_7 | 0.69 | 0.34–0.55 | 0.44 | 0.53 |
| | CNN_LSTM_3 | 0.78 | 0.50–0.61 | 0.55 | 0.62 |
| | CNN_LSTM_5 | 0.79 | 0.49–0.62 | 0.55 | 0.63 |
| | CNN_LSTM_7 | 0.77 | 0.51–0.61 | 0.55 | 0.62 |
| | CNN_F_1 | 0.74 | 0.39–0.62 | 0.53 | 0.61 |
| | CNN_F_3 | 0.71 | 0.37–0.63 | 0.51 | 0.6 |
| | CNN_F_5 | 0.7 | 0.35–0.61 | 0.48 | 0.58 |
| | CNN_F_7 | 0.69 | 0.35–0.58 | 0.47 | 0.56 |
| | CNN_LSTM_F_3 | 0.77 | 0.38–0.64 | 0.53 | 0.62 |
| | CNN_LSTM_F_5 | 0.75 | 0.37–0.65 | 0.54 | 0.64 |
| | CNN_LSTM_F_7 | 0.74 | 0.34–0.62 | 0.52 | 0.63 |
| Dublin | CNN_1 | 0.68 | 0.41–0.53 | 0.49 | 0.6 |
| | CNN_3 | 0.66 | 0.34–0.57 | 0.49 | 0.62 |
| | CNN_5 | 0.64 | 0.32–0.59 | 0.49 | 0.6 |
| | CNN_7 | 0.67 | 0.31–0.57 | 0.46 | 0.59 |
| | CNN_LSTM_3 | 0.77 | 0.38–0.58 | 0.51 | 0.63 |
| | CNN_LSTM_5 | 0.79 | 0.39–0.58 | 0.51 | 0.63 |
| | CNN_LSTM_7 | 0.77 | 0.42–0.57 | 0.51 | 0.62 |
| | CNN_F_1 | 0.65 | 0.01–0.20 | 0.09 | 0.08 |
| | CNN_F_3 | 0.64 | 0.01–0.15 | 0.07 | 0.04 |
| | CNN_F_5 | 0.61 | 0.01–0.04 | 0.03 | 0.01 |
| | CNN_F_7 | 0.66 | 0.01–0.14 | 0.07 | 0.03 |
| | CNN_LSTM_F_3 | 0.77 | 0.01–0.24 | 0.08 | 0.06 |
| | CNN_LSTM_F_5 | 0.79 | 0.01–0.22 | 0.07 | 0.05 |
| | CNN_LSTM_F_7 | 0.78 | 0.01–0.17 | 0.06 | 0.04 |
| SHHS | CNN_1 | 0.75 | 0.42–0.64 | 0.54 | 0.62 |
| | CNN_3 | 0.79 | 0.30–0.63 | 0.53 | 0.65 |
| | CNN_5 | 0.79 | 0.27–0.63 | 0.49 | 0.61 |
| | CNN_7 | 0.76 | 0.26–0.65 | 0.5 | 0.65 |
| | CNN_LSTM_3 | 0.84 | 0.42–0.62 | 0.54 | 0.62 |
| | CNN_LSTM_5 | 0.82 | 0.46–0.67 | 0.54 | 0.61 |
| | CNN_LSTM_7 | 0.77 | 0.46–0.66 | 0.54 | 0.61 |
| | CNN_F_1 | 0.76 | 0.35–0.68 | 0.54 | 0.66 |
| | CNN_F_3 | 0.79 | 0.29–0.65 | 0.49 | 0.62 |
| | CNN_F_5 | 0.78 | 0.32–0.68 | 0.5 | 0.62 |
| | CNN_F_7 | 0.77 | 0.28–0.67 | 0.48 | 0.62 |
| | CNN_LSTM_F_3 | 0.83 | 0.29–0.68 | 0.52 | 0.62 |
| | CNN_LSTM_F_5 | 0.82 | 0.31–0.66 | 0.51 | 0.62 |
| | CNN_LSTM_F_7 | 0.79 | 0.22–0.67 | 0.5 | 0.62 |
| Telemetry | CNN_1 | 0.76 | 0.48–0.67 | 0.56 | 0.67 |
| | CNN_3 | 0.75 | 0.42–0.60 | 0.51 | 0.61 |
| | CNN_5 | 0.72 | 0.43–0.60 | 0.53 | 0.62 |
| | CNN_7 | 0.7 | 0.38–0.52 | 0.46 | 0.58 |
| | CNN_LSTM_3 | 0.81 | 0.45–0.68 | 0.57 | 0.67 |
| | CNN_LSTM_5 | 0.8 | 0.43–0.69 | 0.57 | 0.69 |
| | CNN_LSTM_7 | 0.8 | 0.48–0.67 | 0.58 | 0.68 |
| | CNN_F_1 | 0.77 | 0.39–0.70 | 0.55 | 0.66 |
| | CNN_F_3 | 0.71 | 0.46–0.67 | 0.56 | 0.66 |
| | CNN_F_5 | 0.73 | 0.41–0.63 | 0.49 | 0.62 |
| | CNN_F_7 | 0.71 | 0.33–0.64 | 0.48 | 0.63 |
| | CNN_LSTM_F_3 | 0.81 | 0.43–0.72 | 0.56 | 0.69 |
| | CNN_LSTM_F_5 | 0.83 | 0.44–0.71 | 0.58 | 0.7 |
| | CNN_LSTM_F_7 | 0.82 | 0.44–0.68 | 0.56 | 0.68 |
| DREAMS | CNN_1 | 0.76 | 0.34–0.71 | 0.54 | 0.61 |
| | CNN_3 | 0.76 | 0.34–0.71 | 0.52 | 0.61 |
| | CNN_5 | 0.75 | 0.31–0.68 | 0.42 | 0.56 |
| | CNN_7 | 0.73 | 0.27–0.67 | 0.53 | 0.63 |
| | CNN_LSTM_3 | 0.83 | 0.43–0.73 | 0.55 | 0.61 |
| | CNN_LSTM_5 | 0.83 | 0.41–0.75 | 0.55 | 0.59 |
| | CNN_LSTM_7 | 0.78 | 0.42–0.74 | 0.52 | 0.58 |
| | CNN_F_1 | 0.77 | 0.39–0.70 | 0.54 | 0.62 |
| | CNN_F_3 | 0.78 | 0.25–0.67 | 0.51 | 0.66 |
| | CNN_F_5 | 0.75 | 0.36–0.65 | 0.5 | 0.64 |
| | CNN_F_7 | 0.72 | 0.32–0.71 | 0.5 | 0.63 |
| | CNN_LSTM_F_3 | 0.83 | 0.32–0.71 | 0.52 | 0.61 |
| | CNN_LSTM_F_5 | 0.84 | 0.31–0.72 | 0.54 | 0.64 |
| | CNN_LSTM_F_7 | 0.8 | 0.16–0.71 | 0.48 | 0.59 |
| ISRUC | CNN_1 | 0.76 | 0.29–0.63 | 0.53 | 0.59 |
| | CNN_3 | 0.75 | 0.29–0.64 | 0.52 | 0.61 |
| | CNN_5 | 0.73 | 0.26–0.63 | 0.48 | 0.58 |
| | CNN_7 | 0.73 | 0.24–0.57 | 0.47 | 0.56 |
| | CNN_LSTM_3 | 0.8 | 0.36–0.65 | 0.54 | 0.6 |
| | CNN_LSTM_5 | 0.78 | 0.42–0.66 | 0.54 | 0.61 |
| | CNN_LSTM_7 | 0.75 | 0.38–0.60 | 0.51 | 0.57 |
| | CNN_F_1 | 0.75 | 0.42–0.68 | 0.56 | 0.64 |
| | CNN_F_3 | 0.75 | 0.35–0.65 | 0.52 | 0.63 |
| | CNN_F_5 | 0.73 | 0.37–0.66 | 0.51 | 0.6 |
| | CNN_F_7 | 0.72 | 0.35–0.62 | 0.49 | 0.58 |
| | CNN_LSTM_F_3 | 0.79 | 0.40–0.68 | 0.56 | 0.65 |
| | CNN_LSTM_F_5 | 0.76 | 0.43–0.69 | 0.57 | 0.66 |
| | CNN_LSTM_F_7 | 0.74 | 0.29–0.66 | 0.53 | 0.64 |
Results report agreement in terms of kappa index with respect to the corresponding human clinical scorings for each dataset. Rows within each dataset correspond to the different tested neural network configurations as described in the experimental design.
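This excerpt does not state how the ensemble combines the individual local models. Purely as an illustration, the sketch below assumes a plurality (majority) vote over the per-epoch labels predicted by each model; the function `majority_vote`, the tie-breaking rule, and the toy labels are assumptions and should not be read as the authors' procedure.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-epoch labels from several models by plurality vote."""
    lengths = {len(p) for p in predictions_per_model}
    if len(lengths) != 1:
        raise ValueError("all models must score the same number of epochs")
    combined = []
    for epoch_labels in zip(*predictions_per_model):
        # most_common(1) returns the most frequent label; on a tie it keeps the
        # label seen first, i.e. the one from the earliest model in the list.
        combined.append(Counter(epoch_labels).most_common(1)[0][0])
    return combined


# Hypothetical outputs of three locally trained models on the same four epochs.
model_a = ["W", "N1", "N2", "REM"]
model_b = ["W", "N2", "N2", "REM"]
model_c = ["N1", "N2", "N3", "W"]
ensemble = majority_vote([model_a, model_b, model_c])
print(ensemble)
```

The combined scoring can then be compared against the human reference with the same kappa index used for the individual models.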
Global performance comparison by aggregating results across all datasets.
| Model configuration | Individual models—local dataset (I) | Individual models—external datasets (II) | Ensemble—external dataset (III) | I vs II differences | I vs III differences | II vs III differences |
|---|---|---|---|---|---|---|
| CNN_1 | 0.7417 | 0.5333 | 0.6167 | -0.2083 | -0.125 | 0.0833 |
| CNN_3 | 0.7367 | 0.5117 | 0.6133 | -0.225 | -0.1233 | 0.1017 |
| CNN_5 | 0.7217 | 0.4783 | 0.5833 | -0.2433 | -0.1383 | **0.105** |
| CNN_7 | 0.7133 | 0.485 | 0.59 | -0.2283 | -0.1233 | **0.105** |
| CNN_LSTM_3 | 0.7967 | **0.5433** | 0.625 | -0.2533 | -0.1717 | 0.0817 |
| CNN_LSTM_5 | **0.8017** | **0.5433** | **0.6267** | -0.2583 | -0.175 | 0.0833 |
| CNN_LSTM_7 | 0.7733 | 0.535 | 0.6133 | -0.2383 | -0.16 | 0.0783 |
| CNN_F_1 | 0.74 | 0.4683 | 0.545 | -0.2717 | -0.195 | 0.0767 |
| CNN_F_3 | 0.73 | 0.4433 | 0.535 | -0.2867 | -0.195 | 0.0917 |
| CNN_F_5 | 0.7167 | 0.4183 | 0.5117 | -0.2983 | -0.205 | 0.0933 |
| CNN_F_7 | 0.7117 | 0.415 | 0.5083 | -0.2967 | -0.2033 | 0.0933 |
| CNN_LSTM_F_3 | 0.8 | 0.4617 | 0.5417 | **-0.3383** | **-0.2583** | 0.08 |
| CNN_LSTM_F_5 | 0.7983 | 0.4683 | 0.5517 | -0.33 | -0.2467 | 0.0833 |
| CNN_LSTM_F_7 | 0.7783 | 0.4417 | 0.5333 | -0.3367 | -0.245 | 0.0917 |
Results report average agreement in terms of kappa index with respect to the corresponding human clinical scorings for each dataset: Local testing sets using individual models (I), external datasets using individual models (II), and external datasets using an ensemble of individual models (III). Each row corresponds to the different tested neural network configurations as described in the experimental design. The highest absolute values on each column are highlighted in bold.
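The three difference columns follow from columns I, II and III by subtraction. The sketch below reproduces them for one row; the sign convention (later scenario minus earlier scenario, so a drop in agreement is negative) is inferred from the tabulated values, and the function name `performance_gaps` is illustrative.

```python
def performance_gaps(local, external, ensemble):
    """Differences between the three aggregated scenarios (I, II, III)."""
    return {
        "I_vs_II": round(external - local, 4),     # loss when moving to external data
        "I_vs_III": round(ensemble - local, 4),    # loss that remains with the ensemble
        "II_vs_III": round(ensemble - external, 4),  # gain of ensemble over single models
    }


# Row CNN_LSTM_7: I = 0.7733, II = 0.535, III = 0.6133.
gaps = performance_gaps(0.7733, 0.535, 0.6133)
print(gaps)
```

For this row the computed gaps are -0.2383, -0.16 and 0.0783, matching the values tabulated for CNN_LSTM_7.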
Indices of human inter-rater agreement reported in the literature compared with the performance achieved by our proposed deep-learning approach.
| Dataset | Inter-rater agreement (same center/database) | Our results (local validation) | Inter-rater agreement (different center/database) | Our results (external validation) |
|---|---|---|---|---|
| HMC | 0.74 | 0.79 | --- | 0.63 |
| Dublin | --- | 0.79 | --- | 0.63 |
| SHHS | 0.81–0.83 | 0.82 | --- | 0.61 |
| Telemetry | --- | 0.80 | --- | 0.69 |
| DREAMS | --- | 0.83 | --- | 0.59 |
| ISRUC | 0.87 | 0.78 | --- | 0.61 |
| Other databases | 0.73 | --- | 0.46–0.89 | --- |
Results report agreement in terms of kappa index. The CNN_LSTM_5 model is taken as reference for the results regarding our automatic approach. Overall results across databases are highlighted in bold.
Indices of automatic scoring agreement reported in the literature in comparison with the results achieved by the proposed deep-learning approach.
| Dataset | Local dataset prediction scenario | Our results (local dataset) | External dataset prediction scenario | Our results (external dataset) |
|---|---|---|---|---|
| HMC | 0.62 | 0.79 | 0.60 | 0.63 |
| Dublin | 0.44; 0.84; 0.74; 0.66 | 0.79 | 0.19 | 0.63 |
| SHHS | 0.65; 0.82; 0.73; 0.83; 0.81 | 0.82 | 0.62; 0.53–0.56; 0.73; 0.52–0.73 | 0.61 |
| Telemetry | 0.58 | 0.80 | 0.53 | 0.69 |
| DREAMS | 0.62 | 0.83 | 0.43 | 0.59 |
| ISRUC | 0.68; 0.65 | 0.78 | 0.63; 0.57–0.68 | 0.61 |
| Other databases | 0.86; 0.76–0.80; 0.84; 0.80; 0.68; 0.81; 0.73–0.76; 0.82; 0.77; 0.66; 0.70–0.79 | --- | 0.42–0.63; 0.68–0.70; 0.69; 0.72–0.77; 0.45–0.70; 0.61; 0.50–0.76 | --- |
Results report agreement in terms of kappa index. The CNN_LSTM_5 model is taken as reference for the results regarding our automatic approach. Overall results across databases are highlighted in bold.