David Cuesta-Frau, Pau Miró-Martínez, Sandra Oltra-Crespo, Jorge Jordán-Núñez, Borja Vargas, Paula González, Manuel Varela-Entrecanales.
Abstract
Many entropy-related methods for signal classification have been proposed and exploited successfully in the last several decades. However, it is sometimes difficult to find the optimal measure and the optimal parameter configuration for a specific purpose or context. Suboptimal settings may therefore produce subpar results that do not even reach the desired level of significance. To increase the signal classification accuracy in these suboptimal situations, this paper proposes statistical models created with uncorrelated measures that exploit the possible synergies between them. The methods employed are permutation entropy (PE), approximate entropy (ApEn), and sample entropy (SampEn). Since PE is based on subpattern ordinal differences, whereas ApEn and SampEn are based on subpattern amplitude differences, we hypothesized that combining PE with one of the other two methods would improve on the individual performance of either. The dataset was composed of body temperature records, for which we did not obtain a classification accuracy above 80% with a single measure, in this study or even in previous studies. The results confirmed that the classification accuracy rose up to 90% when combining PE and ApEn with a logistic model.
Keywords: approximate entropy; body temperature; logistic regression; permutation entropy; sample entropy; signal classification
Year: 2018; PMID: 33266577; PMCID: PMC7512415; DOI: 10.3390/e20110853
Source DB: PubMed; Journal: Entropy (Basel); ISSN: 1099-4300; Impact factor: 2.524
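The abstract contrasts PE, which works on the ordinal patterns of subsequences, with ApEn and SampEn, which work on amplitude differences. The ordinal idea can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `permutation_entropy`, the delay parameter `tau`, the use of the natural logarithm, and the way ties are resolved (by position) are all assumptions made for illustration.

```python
import math
from collections import Counter

def permutation_entropy(x, m=3, tau=1):
    """Shannon entropy (natural log) of the ordinal patterns of length m in x."""
    n = len(x) - (m - 1) * tau          # number of subsequences available
    if n <= 0:
        raise ValueError("series too short for the chosen m and tau")
    # Each subsequence is mapped to the index order that sorts its samples.
    # sorted() is stable, so equal samples keep their original order.
    patterns = Counter(
        tuple(sorted(range(m), key=lambda k: x[i + k * tau]))
        for i in range(n)
    )
    return -sum((c / n) * math.log(c / n) for c in patterns.values())

# A monotonic series shows a single ordinal pattern, so its entropy is zero.
print(permutation_entropy(list(range(10)), m=3))
```

With m = 8 there are 8! = 40320 possible patterns, so under the natural-log assumption the measure is bounded by ln(40320), about 10.6. Which logarithm base and normalization the authors used is not stated in this record.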
Classification results using permutation entropy (PE) with m ranging from 3 up to 8.
| m | Sensitivity (Class 0) | Specificity (Class 1) | Accuracy (Correct) |
|---|---|---|---|
| 8 | 0.875 | 0.786 | 0.833 |
| 7 | 0.9375 | 0.6428 | 0.8 |
| 6 | 0.75 | 0.7142 | 0.733 |
| 5 | 0.8125 | 0.5714 | 0.7 |
| 4 | 0.6875 | 0.5714 | 0.633 |
| 3 | 0.5625 | 0.6428 | 0.6 |
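The three columns of this table follow directly from per-class counts of correctly classified records. The sketch below assumes the class sizes of 16 (Class 0) and 14 (Class 1) given in the table of individual results, and uses the counts 14 and 11 from the confusion matrix of the PE model, which is assumed here to correspond to the m = 8 row. The function name `class_rates` is hypothetical.

```python
def class_rates(correct0, total0, correct1, total1):
    """Return (sensitivity, specificity, accuracy) as used in the table:
    sensitivity = fraction of Class 0 records classified correctly,
    specificity = fraction of Class 1 records classified correctly."""
    sensitivity = correct0 / total0
    specificity = correct1 / total1
    accuracy = (correct0 + correct1) / (total0 + total1)
    return sensitivity, specificity, accuracy

sens, spec, acc = class_rates(14, 16, 11, 14)
print(round(sens, 3), round(spec, 3), round(acc, 3))  # 0.875 0.786 0.833
```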
Individual results for each of the measures employed.
| Record | ApEn | PE | SampEn | Class |
|---|---|---|---|---|
| 1 | 0.375369 | 8.173881 | 0.259592 | 0 |
| 2 | 0.355621 | 7.354433 | 0.166436 | 0 |
| 3 | 0.427184 | 8.233846 | 0.305818 | 0 |
| 4 | 0.328920 | 8.437123 | 0.133882 | 0 |
| 5 | 0.742144 | 8.523112 | 0.589070 | 0 |
| 6 | 0.444839 | 8.458743 | 0.312369 | 0 |
| 7 | 0.444839 | 8.253519 | 0.331899 | 0 |
| 8 | 0.465783 | 8.422755 | 0.327447 | 0 |
| 9 | 0.649292 | 8.562641 | 0.411671 | 0 |
| 10 | 0.334577 | 8.737973 | 0.238648 | 0 |
| 11 | 0.404367 | 7.944453 | 0.282284 | 0 |
| 12 | 0.686112 | 8.683103 | 0.500797 | 0 |
| 13 | 0.211678 | 7.045437 | 0.125658 | 0 |
| 14 | 0.635363 | 8.702552 | 0.488307 | 0 |
| 15 | 0.375369 | 8.173881 | 0.259592 | 0 |
| 16 | 0.646044 | 8.023478 | 0.333306 | 0 |
| 17 | 0.132585 | 9.096728 | 0.102007 | 1 |
| 18 | 0.409274 | 9.480602 | 0.292390 | 1 |
| 19 | 0.083815 | 8.160992 | 0.074664 | 1 |
| 20 | 0.457642 | 8.042545 | 0.290163 | 1 |
| 21 | 0.143293 | 9.393343 | 0.090606 | 1 |
| 22 | 0.354052 | 9.858080 | 0.256855 | 1 |
| 23 | 0.407061 | 9.594236 | 0.275580 | 1 |
| 24 | 0.314689 | 9.703521 | 0.243019 | 1 |
| 25 | 0.269602 | 8.792517 | 0.161793 | 1 |
| 26 | 0.027989 | 9.439494 | 0.006332 | 1 |
| 27 | 0.376410 | 8.697850 | 0.230783 | 1 |
| 28 | 0.241392 | 9.959548 | 0.172143 | 1 |
| 29 | 0.053292 | 9.182114 | 0.011505 | 1 |
| 30 | 0.279302 | 8.660942 | 0.201991 | 1 |
Correlation results obtained for the three measures. Clearly, approximate entropy (ApEn) and sample entropy (SampEn) are strongly correlated.
| | ApEn | PE | SampEn |
|---|---|---|---|
| ApEn | 1.0000 | −0.2374 | 0.9604 |
| PE | −0.2374 | 1.0000 | −0.1342 |
| SampEn | 0.9604 | −0.1342 | 1.0000 |
Significance of the correlation analysis between measures.
| | ApEn | PE | SampEn |
|---|---|---|---|
| ApEn | - | 0.2066 | <0.0001 |
| PE | 0.2066 | - | 0.4795 |
| SampEn | <0.0001 | 0.4795 | - |
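The significance values above are consistent with the usual two-sided t-test for a Pearson coefficient, with t = r·sqrt((n − 2)/(1 − r²)) and n − 2 degrees of freedom. That this is the test the authors applied is an assumption. The sketch below uses only the standard library and integrates the Student t density numerically; the function names and the `steps` parameter are hypothetical, and the function is not defined for |r| = 1.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def correlation_p_value(r, n, steps=2000):
    """Two-sided p-value for a Pearson correlation r computed from n pairs."""
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule (steps must be even) for the density integral over [0, t].
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area = total * h / 3
    return 1 - 2 * area

# ApEn vs PE and PE vs SampEn, with n = 30 records.
print(round(correlation_p_value(-0.2374, 30), 3))
print(round(correlation_p_value(-0.1342, 30), 3))
```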
Variables in the equation for PE.
| Variable | Coefficient | Standard Error | Wald | df | Significance | Exp(B) |
|---|---|---|---|---|---|---|
| PE | 3.704 | 1.395 | 7.050 | 1 | 0.008 | 40.598 |
| Constant | −32.202 | 12.025 | 7.171 | 1 | 0.007 | 0.000 |
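Two columns of this table can be recomputed from the others: the Wald statistic is the squared ratio of a coefficient to its standard error, and Exp(B) is the exponential of the coefficient, read as the odds ratio per unit increase of the variable. A minimal sketch, with a hypothetical function name, is given below. Because the published coefficients are rounded, the recomputed Exp(B) agrees with the table only to about two decimal places.

```python
import math

def wald_and_odds_ratio(coef, std_err):
    """Return the Wald statistic (coef / SE)^2 and the odds ratio exp(coef)."""
    wald = (coef / std_err) ** 2
    odds_ratio = math.exp(coef)
    return wald, odds_ratio

# PE row of the table: coefficient 3.704, standard error 1.395.
wald, odds_ratio = wald_and_odds_ratio(3.704, 1.395)
print(round(wald, 2), round(odds_ratio, 1))
```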
Individual model summary for PE. It includes some measures to assess the model’s predictive power, the area under ROC curve (AUC), and the leave-one-out (LOO) average classification results.
| Step | −2 Log Likelihood | Cox–Snell | Nagelkerke | AUC | LOO |
|---|---|---|---|---|---|
| 1 | 24.031 | 0.441 | 0.588 | 0.87 | 77.6% |
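The Cox–Snell and Nagelkerke values can be recovered from the −2 log likelihood alone, given the class sizes. The sketch below assumes the standard definitions and a null model that predicts the class proportions 16/30 and 14/30; the function name `pseudo_r2` is hypothetical.

```python
import math

def pseudo_r2(deviance, n0, n1):
    """Cox-Snell and Nagelkerke pseudo R^2 from the model's -2 log likelihood.

    n0 and n1 are the numbers of records in Class 0 and Class 1.
    """
    n = n0 + n1
    # -2 log likelihood of the intercept-only (null) model.
    null_deviance = -2 * (n0 * math.log(n0 / n) + n1 * math.log(n1 / n))
    cox_snell = 1 - math.exp(-(null_deviance - deviance) / n)
    # Nagelkerke rescales Cox-Snell by its maximum attainable value.
    nagelkerke = cox_snell / (1 - math.exp(-null_deviance / n))
    return cox_snell, nagelkerke

cs, nk = pseudo_r2(24.031, 16, 14)
print(round(cs, 3), round(nk, 3))
```

Applied to the PE model, this gives values that round to the 0.441 and 0.588 reported in the table.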
Percentage agreement between observed and predicted classifications for temperature records using an individual model based on PE.
| Observed | Predicted Class 0 | Predicted Class 1 | Percentage Correct |
|---|---|---|---|
| Class 0 | 14 | 2 | 87.5 |
| Class 1 | 3 | 11 | 78.6 |
Quantitative model probability results. Classification errors based on the computed value are in bold.
| Record | PE | Class | Probability |
|---|---|---|---|
| 1 | 8.173881 | 0 | 0.1270 |
| 2 | 7.354433 | 0 | 0.0069 |
| 3 | 8.233846 | 0 | 0.1537 |
| 4 | 8.437123 | 0 | 0.2783 |
| 5 | 8.523112 | 0 | 0.3465 |
| 6 | 8.458743 | 0 | 0.2946 |
| 7 | 8.253519 | 0 | 0.1634 |
| 8 | 8.422755 | 0 | 0.2677 |
| 9 | 8.562641 | 0 | 0.3803 |
| **10** | **8.737973** | **0** | |
| 11 | 7.944453 | 0 | 0.0585 |
| 12 | 8.683103 | 0 | 0.4895 |
| 13 | 7.045437 | 0 | 0.0022 |
| **14** | **8.702552** | **0** | |
| 15 | 8.173881 | 0 | 0.1270 |
| 16 | 8.023478 | 0 | 0.0769 |
| 17 | 9.096728 | 1 | 0.8161 |
| 18 | 9.480602 | 1 | 0.9484 |
| **19** | **8.160992** | **1** | |
| **20** | **8.042545** | **1** | |
| 21 | 9.393343 | 1 | 0.9301 |
| 22 | 9.858080 | 1 | 0.9867 |
| 23 | 9.594236 | 1 | 0.9655 |
| 24 | 9.703521 | 1 | 0.9767 |
| 25 | 8.792517 | 1 | 0.5898 |
| 26 | 9.439494 | 1 | 0.9404 |
| 27 | 8.697850 | 1 | 0.5032 |
| 28 | 9.959548 | 1 | 0.9909 |
| 29 | 9.182114 | 1 | 0.8589 |
| **30** | **8.660942** | **1** | |
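The probabilities in this table follow from the logistic function applied to the PE model, using the coefficient 3.704 and the constant −32.202 reported for it. The sketch below assumes a classification threshold of 0.5, which is consistent with the table but not stated explicitly; the function names are hypothetical. Since the published coefficients are rounded to three decimals, the recomputed probabilities match the table only to about three decimal places.

```python
import math

B0, B1 = -32.202, 3.704   # constant and PE coefficient of the PE-only model

def predicted_probability(pe):
    """Probability of Class 1 given a PE value, under the logistic model."""
    return 1 / (1 + math.exp(-(B0 + B1 * pe)))

def predicted_class(pe, threshold=0.5):
    """Assign Class 1 when the probability reaches the (assumed) threshold."""
    return int(predicted_probability(pe) >= threshold)

# Record 1 (Class 0) and record 28 (Class 1).
for pe in (8.173881, 9.959548):
    print(pe, round(predicted_probability(pe), 3), predicted_class(pe))
```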
Variables in the equation for SampEn.
| Variable | Coefficient | Standard Error | Wald | df | Significance | Exp(B) |
|---|---|---|---|---|---|---|
| SampEn | −12.463 | 5.057 | 6.075 | 1 | 0.014 | 0.000 |
| Constant | 2.861 | 1.276 | 5.026 | 1 | 0.025 | 17.481 |
Individual model summary for SampEn. It includes some measures to assess the model’s predictive power, the area under ROC curve (AUC), and the leave-one-out (LOO) average classification results.
| Step | −2 Log Likelihood | Cox–Snell | Nagelkerke | AUC | LOO |
|---|---|---|---|---|---|
| 1 | 30.798 | 0.299 | 0.399 | 0.82 | 68.7% |
Percentage agreement between observed and predicted classifications for temperature records using an individual model based on SampEn.
| Observed | Predicted Class 0 | Predicted Class 1 | Percentage Correct |
|---|---|---|---|
| Class 0 | 13 | 3 | 81.3 |
| Class 1 | 6 | 8 | 57.1 |
Variables in the equation for ApEn.
| Variable | Coefficient | Standard Error | Wald | df | Significance | Exp(B) |
|---|---|---|---|---|---|---|
| ApEn | −11.744 | 4.761 | 6.083 | 1 | 0.014 | 0.000 |
| Constant | 4.096 | 1.770 | 5.354 | 1 | 0.021 | 60.077 |
Individual model summary for ApEn. It includes some measures to assess the model’s predictive power, the area under ROC curve (AUC), and the leave-one-out (LOO) average classification results.
| Step | −2 Log Likelihood | Cox–Snell | Nagelkerke | AUC | LOO |
|---|---|---|---|---|---|
| 1 | 27.636 | 0.369 | 0.493 | 0.83 | 69.7% |
Percentage agreement between observed and predicted classifications for temperature records using an individual model based on ApEn.
| Observed | Predicted Class 0 | Predicted Class 1 | Percentage Correct |
|---|---|---|---|
| Class 0 | 13 | 3 | 81.3 |
| Class 1 | 5 | 9 | 64.3 |
Results for the logistic regression model using SampEn and PE.
| Variable | Coefficient | Standard Error | Wald | df | Significance | Exp(B) |
|---|---|---|---|---|---|---|
| SampEn | −15.814 | 7.621 | 4.306 | 1 | 0.038 | 0.000 |
| PE | 3.564 | 1.532 | 5.409 | 1 | 0.020 | 35.293 |
| Constant | −26.899 | 12.955 | 4.311 | 1 | 0.038 | 0.000 |
Summary for the joint model using SampEn and PE. It includes some measures to assess the model’s predictive power, the area under ROC curve (AUC), and the leave-one-out (LOO) average classification results.
| Step | −2 Log Likelihood | Cox–Snell | Nagelkerke | AUC | LOO |
|---|---|---|---|---|---|
| 1 | 15.396 | 0.580 | 0.775 | 0.95 | 87.2% |
Percentage agreement between observed and predicted classifications for temperature records using the joint model with SampEn and PE.
| Observed | Predicted Class 0 | Predicted Class 1 | Percentage Correct |
|---|---|---|---|
| Class 0 | 14 | 2 | 87.5 |
| Class 1 | 1 | 13 | 92.9 |
Figure 1. ROC plots of all the models studied. The best classification performance is achieved with the PE+ApEn model (bold dotted line).
Figure 2. Clouds of points for temperature records using PE and SampEn as coordinates. The two classes are visibly separable: Class 1 objects (triangles) lie mainly in the lower right zone of the plot, whereas Class 0 objects (circles) lie in the upper left zone. Only two circles and one triangle are clearly misplaced, accounting for the errors in Table 17.
Results for the logistic regression model using ApEn and PE.
| Variable | Coefficient | Standard Error | Wald | df | Significance | Exp(B) |
|---|---|---|---|---|---|---|
| ApEn | −12.806 | 6.457 | 3.934 | 1 | 0.047 | 0.000 |
| PE | 3.433 | 1.602 | 4.590 | 1 | 0.032 | 30.974 |
| Constant | −24.940 | 13.711 | 3.309 | 1 | 0.069 | 0.000 |
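The joint model combines both measures in a single linear predictor, so a record with high PE and low ApEn is pushed towards Class 1. The sketch below applies the coefficients reported in this table to two records from the table of individual results. The function name and the 0.5 threshold are assumptions. Records close to the decision boundary may be classified differently with these rounded coefficients than with the authors' full-precision model, so only clear-cut records are shown.

```python
import math

# Coefficients of the joint ApEn + PE model as reported in the table.
CONSTANT, B_APEN, B_PE = -24.940, -12.806, 3.433

def joint_probability(apen, pe):
    """Probability of Class 1 under the two-variable logistic model."""
    z = CONSTANT + B_APEN * apen + B_PE * pe
    return 1 / (1 + math.exp(-z))

# Record 13 (observed Class 0) and record 26 (observed Class 1).
p13 = joint_probability(0.211678, 7.045437)
p26 = joint_probability(0.027989, 9.439494)
print(round(p13, 3), round(p26, 3))
```

Setting the linear predictor to zero gives the decision boundary ApEn = (3.433·PE − 24.940)/12.806. Records below that line in a PE-versus-ApEn plot are assigned to Class 1.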
Summary for the joint model using ApEn and PE. It includes some measures to assess the model’s predictive power, the area under ROC curve (AUC), and the leave-one-out (LOO) average classification results.
| Step | −2 Log Likelihood | Cox–Snell | Nagelkerke | AUC | LOO |
|---|---|---|---|---|---|
| 1 | 14.813 | 0.589 | 0.786 | 0.94 | 90.1% |
Percentage agreement between observed and predicted classifications for temperature records using the joint model with ApEn and PE. Results from previous experiments are included for comparative purposes.
| Observed | Predicted Class 0 | Predicted Class 1 | Percentage Correct |
|---|---|---|---|
| Class 0 | 15 | 1 | 93.8 |
| Class 1 | 1 | 13 | 92.9 |
Figure 3. Clouds of points for temperature records using PE and ApEn as coordinates. As in the PE-SampEn case, the two classes are visibly separable: Class 1 objects (triangles) lie mainly in the lower right zone of the plot, whereas Class 0 objects (circles) lie in the upper left zone. Only one circle and one triangle are clearly misplaced, accounting for the errors in Table 20.
Best model fit based on the Akaike information criterion (AIC) for each case.
| Explanatory Variable | Residual Variance | AIC |
|---|---|---|
| PE | 24.03126 | 28.03126 |
| ApEn | 27.63616 | 31.63616 |
| SampEn | 30.79841 | 34.79841 |
| PE+ApEn | 14.81316 | 20.81316 |
| PE+SampEn | 15.39607 | 21.39607 |
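The AIC column equals the residual column plus twice the number of fitted parameters, where the residual values coincide with the −2 log likelihood of each model. Counting the constant as a parameter gives k = 2 for the single-measure models and k = 3 for the joint ones. A short sketch, with a hypothetical function name, reproduces the column and the resulting ranking.

```python
def aic(deviance, n_params):
    """Akaike information criterion from -2 log likelihood and parameter count."""
    return deviance + 2 * n_params

# (residual value from the table, number of parameters including the constant)
models = {
    "PE": (24.03126, 2),
    "ApEn": (27.63616, 2),
    "SampEn": (30.79841, 2),
    "PE+ApEn": (14.81316, 3),
    "PE+SampEn": (15.39607, 3),
}

scores = {name: aic(dev, k) for name, (dev, k) in models.items()}
best = min(scores, key=scores.get)   # lower AIC indicates the preferred model
print(best)  # PE+ApEn
```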