Noura Mohammed Abdelwahed, Gh S El-Tawel, M A Makhlouf.
Abstract
BACKGROUND: Machine learning can be used to predict the onset of different human cancers. High-dimensional data pose substantial problems, including an excessive number of genes, over-fitting, long fitting time, and reduced classification accuracy. Recursive Feature Elimination (RFE) is a wrapper method that selects the subset of features yielding the best accuracy. Despite its high performance, RFE has two disadvantages: computation time and over-fitting. Random forest for selection (RFS) has proved effective at selecting informative features and mitigating the over-fitting problem.
Keywords: High dimensional data; Learning algorithms; Machine Learning; Over-fitting; Random Forest feature importance; Recursive feature elimination and its disadvantages
Year: 2022 PMID: 36175944 PMCID: PMC9523996 DOI: 10.1186/s13040-022-00304-y
Source DB: PubMed Journal: BioData Min ISSN: 1756-0381 Impact factor: 4.079
The meanings of the symbols
| Symbol | Meaning |
|---|---|
| PPV | Positive predictive value |
| TP | True positive (cancer type diagnosed correctly as a cancer type) |
| TN | True negative (non-cancer type diagnosed correctly as non-cancer type) |
| FN | False-negative (cancer type diagnosed incorrectly as non-cancer type) |
| FP | False-positive (non-cancer type diagnosed incorrectly as a cancer type) |
| SF | The size of the selected features after applying the algorithm |
| TF | The total size of features |
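The symbols above can be turned into the metrics reported in the later tables. The following is a minimal sketch, assuming the standard textbook definitions of PPV (precision), recall, F1-score and accuracy; the function names, the example counts, and the use of SF/TF as a "fraction of features kept" are illustrative assumptions, not taken from the paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard metrics from confusion-matrix counts (assumed definitions)."""
    ppv = tp / (tp + fp)                      # positive predictive value (precision)
    recall = tp / (tp + fn)                   # sensitivity
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * ppv * recall / (ppv + recall)    # harmonic mean of PPV and recall
    return {"ppv": ppv, "recall": recall, "accuracy": accuracy, "f1": f1}


def selected_fraction(sf, tf):
    """Fraction of features kept: SF selected out of TF total (hypothetical helper)."""
    return sf / tf


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=90, tn=80, fp=10, fn=20)
fraction = selected_fraction(sf=50, tf=1000)
```

With these made-up counts, PPV is 0.9 and accuracy is 0.85; the F1-score is equivalent to 2·TP / (2·TP + FP + FN).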
The meaning of the parameter settings
| Parameter | Value | Definition |
|---|---|---|
| NRuns | 20 | Number of runs. |
| Problem Dimensions | – | Number of features in the dataset. |
| X* | 2916 | The number of data points produced by the bootstrap resampling method. |
| M | 100 | The number of trees used in the Random Forest algorithm. |
| Criterion | – | The function that measures the quality of a split; entropy is applied. |
| min_samples_leaf | 100 | The minimum number of samples required to be at a leaf node. |
| RFE estimators | – | The supervised learning algorithm used as the RFE estimator; LR is applied. |
| C | 0.05 | Regularization parameter. |
| Max-iteration | 100 | Maximum number of iterations in the LR classifier. |
| Tol | 0.0001 | Tolerance for the stopping criterion in the LR classifier. |
| CV | 10 | Number of folds in cross-validation. |
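To make the settings above concrete, here is a hedged sketch of how a random-forest selection step followed by RFE with logistic regression might be wired together in scikit-learn using the listed values. This is not the authors' implementation: the synthetic dataset, the use of `SelectFromModel` with a median threshold as a stand-in for RFS, the target of 10 features for RFE, and the omission of the bootstrap step are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

# Synthetic stand-in data (assumption; the paper uses real biomedical datasets).
X, y = make_classification(
    n_samples=600, n_features=50, n_informative=8, random_state=0
)

# Random-forest importance filter with the table's M, Criterion, min_samples_leaf.
rfs = SelectFromModel(
    RandomForestClassifier(
        n_estimators=100, criterion="entropy", min_samples_leaf=100, random_state=0
    ),
    threshold="median",  # assumption: keep features at or above median importance
)

# RFE with logistic regression, using the table's C, Max-iteration and Tol.
# Note: in scikit-learn, C is the inverse of the regularization strength.
rfe = RFE(
    LogisticRegression(C=0.05, max_iter=100, tol=1e-4),
    n_features_to_select=10,  # assumption: the table does not give this value
)

pipeline = Pipeline(
    [
        ("rfs", rfs),
        ("rfe", rfe),
        ("clf", LogisticRegression(C=0.05, max_iter=100, tol=1e-4)),
    ]
)

# 10-fold cross-validation, matching CV = 10 in the table.
scores = cross_val_score(pipeline, X, y, cv=10)
mean_accuracy = scores.mean()

# Fit once on all data to inspect how many features survive RFE.
pipeline.fit(X, y)
n_selected = pipeline.named_steps["rfe"].n_features_
```

Placing both selection steps inside the pipeline means they are refit within each cross-validation fold, which avoids leaking test-fold information into feature selection.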
Performance of original algorithms before applying the proposed methods
| Algo. | Train | Test | Over-fitting | Pre | Rec | F1-score | No.F | F-Time | C-Time | AUC | Var. | ACC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 100.000 | 99.800 | 0.999 | 0.998 | 0.998 | 60.000 | 1.000 | 0.00002 | 99.800 | |||
| | 100.000 | 99.800 | 0.999 | 0.998 | 0.998 | 0.275 | 1.000 | 0.00002 | 99.800 | |||
| | 97.500 | 87.000 | 0.741 | 0.706 | 0.709 | 40.000 | 0.960 | 0.023125 | 87.000 | |||
| | 89.803 | 84.054 | 0.819 | 0.764 | 5.085 | 0.955 | 0.000193 | 84.054 | ||||
| | 76.441 | 75.133 | 0.384 | 0.480 | 144.783 | 0.177 | 0.689 | 0.00145 | 75.133 | |||
| | 76.484 | 75.000 | 0.629 | 0.557 | 0.158 | 0.706 | 0.00108 | 75.000 | ||||
| | 95.000 | 94.000 | 0.948 | 0.937 | 0.941 | 0.142 | 0.099 | 94.000 | ||||
| | 95.000 | 93.000 | 2.000 | 0.938 | 0.928 | 0.932 | 27.000 | 0.090 | 0.008 | 0.989 | 0.00583 | 93.000 |
| | 97.541 | 96.997 | 0.972 | 0.960 | 0.963 | 0.062 | 0.001 | 0.997 | 0.001073 | 96.997 | ||
| | 89.860 | 87.725 | 2.135 | 0.837 | 0.821 | 0.815 | 0.094 | 0.002 | 0.985 | 0.002819 | 87.725 | |
Average results of OFBS-RFS-RFE over 20 runs
| Algo. | Train | Test | Over-fitting | Pre | Rec | F1-score | No.F | F-Time (sec) | C-Time (sec) | AUC | Var. | ACC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 100.000 | 99.944 | 0.056 | 1.000 | 0.999 | 1.000 | 379.100 | 9.537 | 0.296 | 1.000 | 0.000003 | 99.944 |
| | 100.000 | 99.981 | 0.019 | 1.000 | 1.000 | 1.000 | 142.500 | 189.35 | 0.445 | 1.000 | ||
| | 100.000 | 99.945 | 0.055 | 1.000 | 0.999 | 1.000 | 379.100 | 9.537 | 0.296 | 1.000 | 0.000003 | 99.945 |
| | 100.000 | 95.038 | 4.962 | 1.000 | 0.961 | 1.000 | 142.500 | 189.35 | 0.192 | 1.000 | 95.038 | |
| | 100.000 | 99.875 | 0.125 | 0.999 | 0.999 | 0.999 | 379.100 | 9.537 | 1.007 | 0.999 | 0.000013 | 99.875 |
| | 100.000 | 99.925 | 0.075 | 1.000 | 0.999 | 0.999 | 142.500 | 189.35 | 0.807 | 0.999 | 0.000005 | 99.925 |
| | 99.967 | 99.439 | 0.528 | 0.995 | 0.994 | 0.994 | 379.100 | 9.537 | 0.912 | 0.999 | 0.000074 | 99.439 |
| | 99.972 | 99.513 | 0.459 | 0.996 | 0.995 | 0.995 | 142.500 | 189.35 | 0.4482 | 0.999 | 0.000063 | 99.513 |
| | 93.838 | 90.857 | 2.981 | 0.914 | 0.875 | 0.888 | 5.435 | 5.620 | 0.981 | 0.00138 | 90.857 | |
| | 93.462 | 0.875 | 0.887 | 2755 | ||||||||
| | 94.248 | 90.980 | 3.268 | 0.923 | 0.873 | 0.890 | 5.435 | 27.316 | 0.981 | 0.00012 | 90.980 | |
| | 94.248 | 90.980 | 3.268 | 0.923 | 0.873 | 0.890 | 2755 | 27.316 | 0.981 | 0.00048 | 90.980 | |
| | 90.966 | 86.613 | 4.353 | 0.918 | 0.834 | 0.888 | 5.435 | 2.934 | 0.985 | 0.00021 | 86.613 | |
| | 4.421 | 0.921 | 0.875 | 0.890 | 2755 | 2.147 | 0.986 | 0.00074 | 91.265 | |||
| | 98.622 | 92.971 | 5.651 | 0.927 | 0.907 | 0.916 | 5.435 | 9.080 | 0.929 | 0.00024 | 92.971 | |
| | 2755 | 4.502 | ||||||||||
| | 78.453 | 77.864 | 0.589 | 0.737 | 0.602 | 0.605 | 228.150 | 1.704 | 0.128 | 0.736 | 0.00103 | 77.864 |
| | 77.050 | 72.740 | 4.310 | 0.700 | 0.556 | 0.579 | 113.850 | 11.213 | 0.149 | 0.705 | 0.00093 | 72.740 |
| | 76.109 | 75.624 | 0.485 | 0.623 | 0.543 | 0.512 | 228.150 | 1.704 | 0.644 | 0.643 | 0.00043 | 75.624 |
| | 76.003 | 75.499 | 0.504 | 0.617 | 0.541 | 0.509 | 113.850 | 11.214 | 0.496 | 0.638 | 0.00041 | 75.499 |
| | 94.634 | 0.948 | 0.910 | 0.926 | 228.150 | 1.704 | 1.434 | 0.986 | 0.00064 | 94.634 | ||
| | 95.000 | |||||||||||
| | 99.719 | 93.163 | 6.556 | 0.917 | 0.904 | 0.908 | 228.150 | 1.704 | 1.790 | 0.966 | 0.00092 | 93.163 |
| | 99.735 | 93.008 | 6.727 | 0.916 | 0.901 | 0.906 | 113.850 | 11.214 | 0.820 | 0.966 | 0.00078 | 93.008 |
| | 96.023 | 95.037 | 0.986 | 0.926 | 0.912 | 0.907 | 18.625 | 0.216 | 0.014 | 0.995 | 0.000190 | 95.037 |
| | 97.241 | 96.481 | 0.760 | 0.932 | 0.934 | 0.926 | 16.000 | 0.203 | 0.517 | 0.997 | 0.000730 | 96.481 |
| | 79.269 | 78.375 | 0.894 | 0.672 | 0.708 | 0.668 | 18.625 | 0.216 | 0.169 | 0.973 | 0.002590 | 78.375 |
| | 99.484 | 98.940 | 0.544 | 0.988 | 0.986 | 0.984 | 16.000 | 0.203 | 0.064 | 0.998 | 0.000368 | 98.940 |
| | ||||||||||||
| | ||||||||||||
| | 99.970 | 99.730 | 0.240 | 0.997 | 0.995 | 0.996 | 18.625 | 0.216 | 0.077 | 0.997 | 0.000057 | 99.730 |
| | 99.966 | 99.796 | 0.170 | 0.998 | 0.995 | 0.996 | 16.000 | 0.203 | 0.242 | 0.998 | 0.000055 | 99.796 |
| | 94.776 | 94.218 | 0.558 | 0.922 | 0.933 | 0.937 | 27.100 | 0.298 | 0.988 | 0.000001 | 94.218 | |
| | 95.069 | 94.587 | 0.482 | 0.947 | 0.939 | 0.941 | 13.316 | 0.130 | 0.989 | 0.000810 | 94.587 | |
| | 92.167 | 91.902 | 0.266 | 0.934 | 0.897 | 0.909 | 27.100 | 0.298 | 0.076 | 0.978 | 0.000986 | 91.902 |
| | 93.301 | 93.114 | 0.187 | 0.913 | 0.914 | 0.927 | 13.316 | 0.130 | 0.982 | 0.001115 | 93.114 | |
| | 100.000 | 97.864 | 2.136 | 0.984 | 0.981 | 0.979 | 27.100 | 0.298 | 0.506 | 0.997 | 0.000270 | 97.864 |
| | 100.000 | 98.000 | 0.983 | 0.979 | 0.982 | 13.316 | 0.130 | 0.428 | ||||
| | 99.889 | 97.548 | 2.341 | 0.977 | 0.972 | 0.974 | 27.100 | 0.298 | 0.101 | 0.949 | 0.000280 | 97.548 |
| | 99.888 | 97.724 | 2.164 | 0.978 | 0.974 | 0.976 | 13.316 | 0.130 | 0.104 | 0.948 | 0.000430 | 97.724 |
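In the complete rows of the table above, the Over-fitting column appears to equal Train minus Test (for example, 100.000 − 99.944 = 0.056). The sketch below assumes that reading and checks it against three rows copied from the table; the helper name and the tolerance are illustrative choices.

```python
def overfitting_gap(train_acc, test_acc):
    """Gap between training and test accuracy, in percentage points (assumed definition)."""
    return round(train_acc - test_acc, 3)


# (Train, Test, reported Over-fitting) taken from complete rows of the table above.
rows = [
    (100.000, 99.944, 0.056),
    (99.967, 99.439, 0.528),
    (78.453, 77.864, 0.589),
]

# True where the reported value matches Train - Test within rounding.
consistent = [
    abs(overfitting_gap(train, test) - reported) < 5e-4
    for train, test, reported in rows
]
```

A check like this can also be used to flag rows whose reported gap does not match the Train and Test columns.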
Average results of IFBS-RFS-RFE over 20 runs
| Algo. | Train | Test | Over-fitting | Pre | Rec | F1-score | No.F | F-Time (sec) | C-Time (sec) | AUC | Var. | ACC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 100.000 | 99.925 | 0.075 | 0.999 | 0.999 | 0.999 | 239.000 | 5.421 | 0.193 | 1.000 | 0.000004 | 99.925 |
| | 100.000 | 99.975 | 0.025 | 0.999 | 0.999 | 0.999 | 125.250 | 15.201 | 0.357 | 1.000 | ||
| | 99.999 | 99.906 | 0.093 | 0.999 | 0.998 | 0.999 | 239.000 | 5.421 | 0.225 | 1.000 | 0.000005 | 99.906 |
| | 100.000 | 99.988 | 0.012 | 0.999 | 0.999 | 0.999 | 125.250 | 15.201 | 0.153 | 1.000 | ||
| | 100.000 | 99.694 | 0.306 | 0.998 | 0.997 | 0.997 | 239.000 | 5.421 | 0.901 | 1.000 | 0.000030 | 99.694 |
| | 100.000 | 99.807 | 0.193 | 0.999 | 0.998 | 0.998 | 125.250 | 15.201 | 0.737 | 0.999 | 0.000019 | 99.807 |
| | 99.947 | 99.002 | 0.945 | 0.991 | 0.989 | 0.989 | 239.000 | 5.421 | 0.635 | 0.999 | 0.000075 | 99.002 |
| | 99.955 | 99.027 | 0.928 | 0.992 | 0.989 | 0.990 | 125.250 | 15.201 | 0.327 | 0.999 | 0.000075 | 99.027 |
| | 88.525 | 82.889 | 5.636 | 0.812 | 0.752 | 0.763 | 966.000 | 6.250 | 3.876 | 0.942 | 0.000370 | 82.889 |
| | 88.000 | 84.000 | 0.831 | 0.764 | 0.804 | 482.000 | 1545 | 0.959 | ||||
| | 88.341 | 81.637 | 6.704 | 0.815 | 0.729 | 0.745 | 966.000 | 6.050 | 29.078 | 0.955 | 0.000580 | 81.637 |
| | 89.668 | 82.268 | 7.400 | 0.827 | 0.738 | 0.753 | 482.000 | 1545 | 15.721 | 0.960 | 0.000880 | 82.268 |
| | 89.660 | 80.089 | 9.571 | 0.768 | 0.7025 | 0.709 | 966.000 | 6.050 | 3.150 | 0.938 | 0.000410 | 80.089 |
| | 89.935 | 80.138 | 9.797 | 0.770 | 0.703 | 0.719 | 482.000 | 1545 | 2.487 | 0.941 | 0.000470 | 80.138 |
| | 97.697 | 78.316 | 19.381 | 0.733 | 0.722 | 0.702 | 966.000 | 6.050 | 7.850 | 0.867 | 0.000450 | 78.316 |
| | 89.035 | 78.309 | 10.726 | 0.730 | 0.695 | 0.702 | 482.000 | 1545 | 4.023 | 0.910 | 0.000480 | 78.309 |
| | 78.122 | 76.718 | 1.404 | 0.664 | 0.588 | 0.582 | 154.263 | 1.071 | 0.082 | 0.732 | 0.002210 | 76.718 |
| | 76.693 | 73.998 | 2.695 | 0.667 | 0.588 | 0.581 | 80.050 | 4.594 | 0.164 | 0.722 | 0.001990 | 73.998 |
| | 75.684 | 72.248 | 3.436 | 0.468 | 0.498 | 0.448 | 154.263 | 1.071 | 0.482 | 0.621 | 0.000770 | 72.248 |
| | 75.697 | 72.228 | 3.469 | 0.464 | 0.497 | 0.448 | 80.050 | 4.594 | 0.407 | 0.619 | 0.000760 | 72.228 |
| | 99.999 | 83.912 | 16.087 | 0.811 | 0.738 | 0.760 | 154.263 | 1.071 | 1.230 | 0.866 | 0.003480 | 83.912 |
| | 100.000 | 81.485 | 18.515 | 0.773 | 0.700 | 0.719 | 80.050 | 4.594 | 0.964 | 0.834 | 0.003500 | 81.485 |
| | 99.590 | 80.810 | 17.78 | 0.754 | 0.731 | 0.737 | 154.263 | 1.071 | 1.225 | 0.826 | 0.003460 | 80.810 |
| | 99.580 | 79.191 | 20.389 | 0.729 | 0.707 | 0.713 | 80.050 | 4.594 | 0.546 | 0.804 | 0.003440 | 79.191 |
| | 91.531 | 91.000 | 0.531 | 0.771 | 0.796 | 0.777 | 13.000 | 0.515 | 0.002 | 0.988 | 0.000881 | 91.000 |
| | 92.198 | 91.801 | 0.397 | 0.773 | 0.799 | 0.780 | 12.000 | 0.016 | 0.020 | 0.988 | 0.000975 | 91.801 |
| | 94.870 | 93.979 | 0.891 | 0.888 | 0.878 | 0.875 | 13.000 | 0.515 | 0.023 | 0.988 | 0.001285 | 93.979 |
| | 94.869 | 93.979 | 0.890 | 0.888 | 0.878 | 0.875 | 12.000 | 0.016 | 0.075 | 0.989 | 0.001285 | 93.979 |
| | 97.025 | 93.183 | 3.482 | 0.900 | 0.892 | 0.889 | 13.000 | 0.515 | 0.142 | 0.984 | 0.001493 | 93.183 |
| | 97.000 | 93.500 | 3.500 | 0.900 | 0.892 | 0.889 | 12.000 | 0.016 | 0.140 | 0.980 | 0.001490 | 93.500 |
| | 96.903 | 92.102 | 4.801 | 0.895 | 0.884 | 0.881 | 13.000 | 0.515 | 0.016 | 0.989 | 0.003297 | 92.102 |
| | 97.177 | 81.194 | 15.983 | 0.789 | 0.764 | 0.760 | 12.000 | 0.016 | 0.014 | 0.970 | 0.081251 | 81.194 |
| | 94.394 | 93.678 | 0.461 | 0.938 | 0.938 | 0.938 | 23.100 | 0.410 | 0.012 | 0.988 | 0.000690 | 93.678 |
| | 94.855 | 94.403 | 0.946 | 0.946 | 0.946 | 11.900 | 0.103 | 0.091 | 0.992 | 0.000520 | 94.403 | |
| | 92.010 | 91.563 | 0.447 | 0.929 | 0.929 | 0.929 | 23.100 | 0.410 | 0.069 | 0.976 | 0.001010 | 91.563 |
| | 93.888 | 93.503 | 0.385 | 0.944 | 0.944 | 0.944 | 11.900 | 0.103 | 0.059 | 0.983 | 0.000550 | 93.503 |
| | 100.000 | 96.411 | 3.589 | 0.965 | 0.965 | 0.965 | 23.100 | 0.410 | 0.452 | 0.991 | 0.000980 | 96.411 |
| | 100.000 | 95.277 | 4.723 | 0.952 | 0.952 | 0.952 | 11.900 | 0.103 | 0.433 | 0.989 | 0.000930 | 95.277 |
| | 99.625 | 95.302 | 4.323 | 0.954 | 0.954 | 0.954 | 23.100 | 0.410 | 0.099 | 0.985 | 0.000920 | 95.302 |
| | 99.610 | 94.416 | 5.194 | 0.944 | 0.944 | 0.944 | 11.900 | 0.103 | 0.085 | 0.981 | 0.001170 | 94.416 |
Average results of O/IFBS-RFS-RFE over 20 runs
| Algo. | Train | Test | Over-fitting | Pre | Rec | F1-score | NO.F | F-Time (sec) | C-Time (sec) | AUC | Var. | ACC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 100.000 | 99.975 | 0.025 | 0.999 | 0.999 | 0.999 | 238.800 | 4.220 | 0.176 | 1.000 | 0.0000006 | 99.975 |
| | 100.000 | 99.994 | 0.006 | 0.999 | 0.999 | 0.999 | 119.200 | 13.726 | 0.307 | 1.000 | 0.0000004 | |
| | 100.000 | 99.950 | 0.05 | 0.999 | 0.999 | 0.999 | 238.800 | 4.220 | 0.197 | 1.000 | 0.0000025 | 99.950 |
| | 100.000 | 99.981 | 0.019 | 0.999 | 0.999 | 0.999 | 119.200 | 13.726 | 0.125 | 1.000 | 0.0000004 | 99.981 |
| | 100.000 | 99.888 | 0.112 | 0.999 | 0.999 | 0.999 | 238.800 | 4.220 | 0.755 | 1.000 | 0.0000076 | 99.888 |
| | 100.000 | 99.913 | 0.087 | 0.999 | 0.999 | 0.999 | 119.200 | 13.726 | 0.596 | 0.999 | 0.0000054 | 99.913 |
| | 99.974 | 99.357 | 0.617 | 0.994 | 0.992 | 0.993 | 238.800 | 4.220 | 0.513 | 0.999 | 0.0000828 | 99.357 |
| | 99.972 | 99.363 | 0.609 | 0.994 | 0.993 | 0.993 | 119.200 | 13.726 | 0.266 | 0.999 | 0.000083 | 99.363 |
| | 92.581 | 89.818 | 2.763 | 0.904 | 0.861 | 0.877 | 973.000 | 3.650 | 3.850 | 0.975 | 0.00031 | 89.818 |
| | 91.878 | 89.601 | 2.277 | 0.906 | 0.857 | 0.885 | 485.000 | 1460 | 1.950 | 0.936 | 0.00035 | 89.601 |
| | 93.361 | 90.253 | 3.108 | 0.917 | 0.860 | 0.878 | 973.000 | 3.650 | 22.00 | 0.980 | 0.00065 | 90.253 |
| | 94.241 | 90.979 | 3.262 | 0.925 | 0.873 | 0.891 | 485.000 | 1460 | 11.700 | 0.985 | 0.00028 | 90.979 |
| | 95.527 | 90.764 | 4.763 | 0.914 | 0.868 | 0.882 | 973.000 | 3.650 | 2.650 | 0.984 | 0.00027 | 90.764 |
| | 95.681 | 90.954 | 4.727 | 0.919 | 0.872 | 0.890 | 485.000 | 1460 | 1.750 | 0.941 | 0.00027 | 90.954 |
| | 97.958 | 92.712 | 5.246 | 0.926 | 0.906 | 0.913 | 973.000 | 3.650 | 6.550 | 0.980 | 0.00027 | 92.712 |
| | 1460 | 3.150 | ||||||||||
| | 79.050 | 78.482 | 0.568 | 0.742 | 0.619 | 0.626 | 155.50 | 1.058 | 0.093 | 0.764 | 0.00123 | 78.482 |
| | 77.744 | 77.427 | 0.317 | 0.712 | 0.597 | 0.598 | 77.550 | 5.551 | 0.118 | 0.731 | 0.00092 | 77.427 |
| | 76.009 | 75.442 | 0.567 | 0.612 | 0.539 | 0.508 | 155.500 | 1.058 | 0.511 | 0.637 | 0.00041 | 75.442 |
| | 77.500 | 76.672 | 0.828 | 0.653 | 0.566 | 0.542 | 77.550 | 5.551 | 0.420 | 0.669 | 0.00051 | 76.672 |
| | 100.000 | 94.494 | 5.506 | 0.945 | 0.909 | 0.924 | 155.500 | 1.058 | 1.122 | 0.985 | 0.00064 | 94.494 |
| | 94.082 | 0.943 | 0.901 | 0.917 | 77.550 | 5.551 | 0.911 | 0.983 | 0.00070 | |||
| | 99.720 | 93.196 | 0.916 | 0.906 | 0.909 | 155.500 | 1.058 | 1.091 | 0.965 | 0.00093 | ||
| | 99.719 | 92.917 | 0.914 | 0.900 | 0.905 | 77.550 | 5.550 | 0.511 | 0.966 | 0.00084 | ||
| | 96.691 | 96.441 | 0.250 | 0.649 | 0.624 | 0.630 | 11.000 | 0.167 | 0.025 | 0.998 | 0.000848 | 96.441 |
| | 92.532 | 92.350 | 0.212 | 0.801 | 0.751 | 0.766 | 10.000 | 0.500 | 0.128 | 0.999 | 0.000790 | 92.350 |
| | 95.082 | 95.000 | 0.082 | 0.638 | 0.608 | 0.613 | 11.000 | 0.167 | 0.025 | 0.977 | 0.000632 | 95.000 |
| | 98.361 | 98.356 | 0.005 | 0.892 | 0.900 | 0.895 | 10.000 | 0.500 | 0.047 | 0.999 | 0.001040 | 98.356 |
| | ||||||||||||
| | 0.500 | 0.500 | ||||||||||
| | 0.167 | 0.520 | 0.999 | 0.0 | ||||||||
| | 0.500 | 0.500 | 0.991 | 0.0 | ||||||||
| | 94.647 | 94.148 | 0.499 | 0.944 | 0.932 | 0.936 | 22.900 | 0.399 | 0.010 | 0.988 | 0.00095 | 94.148 |
| | 95.305 | 94.842 | 0.463 | 0.949 | 0.942 | 0.944 | 11.300 | 0.1033 | 0.091 | 0.992 | 0.00086 | 94.842 |
| | 92.110 | 91.889 | 0.221 | 0.934 | 0.897 | 0.909 | 22.900 | 0.399 | 0.067 | 0.978 | 0.00098 | 91.889 |
| | 93.515 | 93.400 | 0.115 | 0.943 | 0.918 | 0.927 | 11.300 | 0.103 | 0.058 | 0.983 | 0.00094 | 93.400 |
| | 99.563 | 97.500 | 2.063 | 0.981 | 0.976 | 0.977 | 22.900 | 0.3994 | 0.411 | 0.996 | 0.00031 | 97.500 |
| | 2.000 | 0.979 | 0.977 | 0.978 | 11.300 | 0.103 | 0.404 | 0.997 | 0.00031 | |||
| | 99.819 | 97.618 | 2.201 | 0.977 | 0.973 | 0.974 | 22.900 | 0.399 | 0.089 | 0.994 | 0.00038 | 97.618 |
| | 99.803 | 97.505 | 2.298 | 0.976 | 0.972 | 0.973 | 11.300 | 0.103 | 0.065 | 0.993 | 0.00034 | 97.505 |
Fig. 1 Comparison between proposed methods on all datasets using classification accuracy
Fig. 2 Number of the selected features using all datasets
Fig. 3 Variance of the proposed methods using all bootstrap positions
The comparison between results before and after PFBS-RFS-RFE
| Datasets | ACC (before) | Overfitting (before) | No.F (before) | C-time (before) | Var. (before) | ACC (after) | Overfitting (after) | No.F (after) | C-time (after) | Var. (after) |
|---|---|---|---|---|---|---|---|---|---|---|
| RNA gene | 99.800 | 0.200 | 20,531 | 16.547 | 0.000015 | 99.994 | 0.006 | 119.200 | 0.307 s | 0.0000004 |
| DNA CNV | 85.000 | 12.600 | 16,381 | 170 s | 0.000580 | 92.762 | 2.763 | 675.000 | 0.981 s | 0.000230 |
| Parkinson’s disease | 93.677 | 0.800 | 753 | 2.000 s | 0.000634 | 95.000 | 5.000 | 113.850 | 1.134 s | 0.000620 |
| Dermatology diseases | 97.807 | 0.493 | 34 | 0.003 s | 0.000810 | | | | | |
| BreastEW | 75.928 | 2.072 | 30 | 0.500 s | 0.002092 | 98.000 | 2.000 | 13.300 | 0.428 s | 0.000300 |
Accuracy achieved in other studies of cancer classification using the same datasets [7–9, 12, 36, 37]
| Reference | Dataset | FS Approach | No of selected features | Var. | AUC | ACC |
|---|---|---|---|---|---|---|
| García-Díaz et al. [ | RNA gene | GGA | 49 | 0.000303 | – | 98.810 |
| Zhang et al. [ | DNA CNV | mRMR & IFS | 19 | 0.000580 | 0.973 | 75.000 |
| Sanaa et al. [ | DNA CNV | PSO & GA | 2050 | – | 0.961 | 84.600 |
| Sanaa et al. [ | DNA CNV | IG | 16,381 | – | 0.965 | 85.900 |
| Sakar et al. [ | Parkinson’s disease | mRMR | 50 | – | – | 85.000 |
| Hegazy et al. [ | BreastEW | CSSA | 5.200 | – | – | 97.080 |
The proposed methods compared with the MIFS method
| Datasets | Train | Test | Over-fitting | Pre | Rec | F1-score | NO.F | F-Time | C-Time | AUC | Var. | ACC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RNA gene | 100.000 | 99.875 | 0.125 | 0.999 | 0.998 | 0.988 | 10,000 | 192.552 | 2.896 | 1.000 | 0.000016 | 99.875 |
| DNA CNV | 96.597 | 84.978 | 11.619 | 0.817 | 0.782 | 0.788 | 9000 | 173.955 | 25.195 | 0.954 | 0.000416 | 84.978 |
| Parkinson’s disease | 77.058 | 75.525 | 1.533 | 0.620 | 0.556 | 0.538 | 300 | 0.377 | 0.037 | 0.682 | 0.001001 | 75.525 |
| Dermatology diseases | 97.845 | 96.989 | 0.856 | 0.971 | 0.965 | 0.966 | 25 | 0.203 | 0.003 | 0.997 | 0.000585 | 96.989 |
| BreastEW | 94.396 | 93.678 | 0.718 | 0.938 | 0.928 | 0.932 | 20 | 0.067 | 0.002 | 0.988 | 0.000694 | 93.678 |
| RNA gene | 100.000 | 99.750 | 0.250 | 0.998 | 0.997 | 0.997 | 10,000 | 192.552 | 2.534 | 1.000 | 0.000028 | 99.750 |
| DNA CNV | 91.606 | 84.122 | 7.484 | 0.860 | 0.756 | 0.775 | 9000 | 173.955 | 75.394 | 0.949 | 0.000668 | 84.122 |
| Parkinson’s disease | 75.676 | 72.228 | 3.448 | 0.472 | 0.498 | 0.448 | 300 | 0.203 | 0.138 | 0.627 | 0.000814 | 72.228 |
| Dermatology diseases | 98.421 | 97.523 | 0.898 | 0.976 | 0.967 | 0.969 | 25 | 0.203 | 0.028 | 0.998 | 0.000924 | 97.523 |
| BreastEW | 92.013 | 91.563 | 0.450 | 0.929 | 0.895 | 0.906 | 20 | 0.067 | 0.017 | 0.976 | 0.001014 | 91.563 |
| RNA gene | 100.000 | 99.627 | 0.373 | 0.998 | 0.996 | 0.997 | 10,000 | 192.552 | 1.252 | 1.000 | 0.000036 | 99.627 |
| DNA CNV | 92.962 | 80.623 | 12.339 | 0.771 | 0.719 | 0.718 | 9000 | 173.955 | 3.528 | 0.942 | 0.000614 | 80.623 |
| Parkinson’s disease | 100.000 | 84.782 | 15.218 | 0.827 | 0.748 | 0.773 | 300 | 0.377 | 0.376 | 0.876 | 0.002303 | 84.782 |
| Dermatology diseases | 100.000 | 96.456 | 3.544 | 0.972 | 0.950 | 0.955 | 25 | 0.203 | 0.148 | 0.999 | 0.001473 | 96.456 |
| BreastEW | 100.000 | 96.140 | 3.860 | 0.963 | 0.956 | 0.958 | 20 | 0.067 | 0.110 | 0.990 | 0.000944 | 96.140 |
| RNA gene | 99.847 | 98.628 | 1.219 | 0.989 | 0.985 | 0.987 | 10,000 | 192.552 | 7.322 | 0.999 | 0.000036 | 98.628 |
| DNA CNV | 98.960 | 78.806 | 20.154 | 0.733 | 0.699 | 0.707 | 9000 | 173.955 | 25.309 | 0.912 | 0.000613 | 78.806 |
| Parkinson’s disease | 99.574 | 79.239 | 20.335 | 0.729 | 0.729 | 0.727 | 300 | 0.377 | 0.673 | 0.794 | 0.002002 | 79.239 |
| Dermatology diseases | 99.696 | 95.105 | 4.591 | 0.955 | 0.940 | 0.939 | 25 | 0.203 | 0.021 | 0.995 | 0.001473 | 95.105 |
| BreastEW | 99.492 | 95.435 | 4.057 | 0.957 | 0.947 | 0.950 | 20 | 0.067 | 0.022 | 0.986 | 0.000901 | 95.435 |
The proposed methods compared with the IGF method
| Datasets | Train | Test | Over-fitting | Pre | Rec | F1-score | NO.F | F-Time | C-Time | AUC | Var. | ACC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RNA gene | 100.000 | 99.875 | 0.125 | 0.999 | 0.999 | 0.998 | 3576 | 1.182 | 2.121 | 1.000 | 0.000016 | 99.875 |
| DNA CNV | 93.115 | 81.310 | 11.805 | 0.782 | 0.706 | 0.705 | 3315 | 5.651 | 0.595 | 0.951 | 0.000576 | 81.310 |
| Parkinson’s disease | 77.822 | 76.984 | 0.838 | 0.680 | 0.576 | 0.566 | 396 | 0.093 | 0.057 | 0.710 | 0.001445 | 76.984 |
| Dermatology diseases | 97.784 | 97.260 | 0.524 | 0.973 | 0.968 | 0.969 | 25 | 0.032 | 0.0009 | 0.998 | 0.000677 | 97.260 |
| BreastEW | 94.170 | 93.674 | 0.496 | 0.942 | 0.928 | 0.931 | 22 | 0.064 | 0.001 | 0.989 | 0.002176 | 93.674 |
| RNA gene | 100.000 | 99.750 | 0.250 | 0.999 | 0.997 | 0.998 | 3576 | 1.182 | 2.272 | 1.000 | 0.000028 | 99.750 |
| DNA CNV | 94.273 | 85.872 | 8.401 | 0.873 | 0.780 | 0.801 | 3315 | 5.651 | 3.142 | 0.969 | 0.000486 | 85.872 |
| Parkinson’s disease | 75.666 | 72.379 | 3.287 | 0.434 | 0.497 | 0.443 | 396 | 0.093 | 0.204 | 0.640 | 0.004378 | 72.379 |
| Dermatology diseases | 98.269 | 97.530 | 0.739 | 0.975 | 0.972 | 0.972 | 25 | 0.032 | 0.014 | 0.999 | 0.000752 | 97.530 |
| BreastEW | 92.007 | 91.569 | 0.438 | 0.930 | 0.895 | 0.904 | 22 | 0.064 | 0.021 | 0.979 | 0.004502 | 91.569 |
| RNA gene | 100.000 | 99.502 | 0.498 | 0.997 | 0.994 | 0.996 | 3576 | 1.182 | 0.826 | 0.999 | 0.000410 | 99.502 |
| DNA CNV | 92.558 | 81.139 | 11.419 | 0.773 | 0.714 | 0.721 | 3315 | 5.651 | 1.584 | 0.944 | 0.000531 | 81.139 |
| Parkinson’s disease | 100.000 | 83.733 | 16.267 | 0.793 | 0.726 | 0.734 | 396 | 0.093 | 0.719 | 0.860 | 0.009057 | 83.733 |
| Dermatology diseases | 100.000 | 96.997 | 3.003 | 0.973 | 0.962 | 0.964 | 25 | 0.032 | 0.098 | 0.999 | 0.000567 | 96.997 |
| BreastEW | 99.982 | 96.140 | 3.842 | 0.961 | 0.959 | 0.958 | 22 | 0.064 | 0.118 | 0.986 | 0.002280 | 96.140 |
| RNA gene | 99.940 | 99.126 | 0.814 | 0.996 | 0.990 | 0.992 | 3576 | 1.182 | 3.040 | 0.999 | 0.000260 | 99.126 |
| DNA CNV | 98.701 | 79.045 | 19.656 | 0.741 | 0.698 | 0.708 | 3315 | 5.651 | 1.095 | 0.911 | 0.000553 | 79.045 |
| Parkinson’s disease | 99.653 | 82.297 | 17.356 | 0.790 | 0.752 | 0.754 | 396 | 0.093 | 1.079 | 0.830 | 0.011270 | 82.297 |
| Dermatology diseases | 99.696 | 95.375 | 4.321 | 0.958 | 0.950 | 0.949 | 25 | 0.032 | 0.012 | 0.993 | 0.001466 | 95.375 |
| BreastEW | 99.636 | 95.253 | 4.383 | 0.957 | 0.946 | 0.948 | 22 | 0.064 | 0.029 | 0.987 | 0.001788 | 95.253 |
The proposed methods compared with the mRMR method
| Datasets | Train | Test | Over-fitting | Pre | Rec | F1-score | NO.F | F-Time | C-Time | AUC | Var. | ACC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RNA gene | 100.000 | 99.750 | 0.250 | 0.999 | 0.997 | 0.998 | 650 | 1200.011 | 0.251 | 1.000 | 0.000028 | 99.750 |
| DNA CNV | 91.819 | 79.699 | 12.120 | 0.746 | 0.688 | 0.689 | 505 | 2296.409 | 0.686 | 0.940 | 0.000529 | 79.699 |
| Parkinson’s disease | 74.617 | 73.011 | 1.606 | 0.500 | 0.515 | 0.479 | 145 | 61.005 | 0.017 | 0.659 | 0.002502 | 73.011 |
| Dermatology diseases | 95.508 | 95.075 | 0.433 | 0.950 | 0.908 | 0.919 | 15 | 3.996 | 0.002 | 0.995 | 0.000796 | 95.075 |
| BreastEW | 93.085 | 92.620 | 0.465 | 0.936 | 0.910 | 0.917 | 19 | 4.181 | 0.002 | 0.981 | 0.003358 | 92.620 |
| RNA gene | 100.000 | 99.748 | 0.252 | 0.999 | 0.997 | 0.998 | 650 | 1200.011 | 0.382 | 1.000 | 0.000028 | 99.748 |
| DNA CNV | 92.486 | 83.848 | 8.638 | 0.845 | 0.747 | 0.766 | 505 | 2296.409 | 3.609 | 0.961 | 0.000559 | 83.848 |
| Parkinson’s disease | 75.661 | 72.379 | 3.282 | 0.435 | 0.497 | 0.443 | 145 | 61.005 | 0.142 | 0.639 | 0.004378 | 72.379 |
| Dermatology diseases | 52.793 | 52.185 | 0.608 | 0.325 | 0.463 | 0.363 | 15 | 3.996 | 0.053 | 0.948 | 0.002302 | 52.185 |
| BreastEW | 89.049 | 88.938 | 0.111 | 0.915 | 0.860 | 0.870 | 19 | 4.181 | 0.044 | 0.945 | 0.005405 | 88.938 |
| RNA gene | 100.000 | 99.627 | 0.373 | 0.998 | 0.996 | 0.997 | 650 | 1200.011 | 0.398 | 1.000 | 0.000036 | 99.627 |
| DNA CNV | 90.959 | 79.935 | 11.024 | 0.727 | 0.690 | 0.689 | 505 | 2296.409 | 0.534 | 0.942 | 0.001249 | 79.935 |
| Parkinson’s disease | 100.000 | 81.918 | 18.082 | 0.767 | 0.703 | 0.709 | 145 | 61.005 | 0.467 | 0.833 | 0.011138 | 81.918 |
| Dermatology diseases | 100.000 | 97.553 | 2.447 | 0.981 | 0.968 | 0.972 | 15 | 3.996 | 0.1000 | 0.999 | 0.000561 | 97.553 |
| BreastEW | 100.00 | 95.604 | 4.396 | 0.960 | 0.950 | 0.952 | 19 | 4.181 | 0.183 | 0.991 | 0.002693 | 95.604 |
| RNA gene | 99.961 | 98.746 | 1.215 | 0.991 | 0.984 | 0.986 | 650 | 1200.011 | 1.680 | 0.999 | 0.000430 | 98.746 |
| DNA CNV | 97.817 | 77.468 | 20.349 | 0.731 | 0.682 | 0.687 | 505 | 2296.409 | 1.135 | 0.910 | 0.000937 | 77.468 |
| Parkinson’s disease | 99.498 | 79.369 | 20.129 | 0.725 | 0.712 | 0.705 | 145 | 61.005 | 0.561 | 0.799 | 0.010620 | 79.369 |
| Dermatology diseases | 99.545 | 94.017 | 5.528 | 0.951 | 0.936 | 0.937 | 15 | 3.996 | 0.009 | 0.982 | 0.002412 | 94.017 |
| BreastEW | 99.642 | 93.684 | 5.958 | 0.943 | 0.928 | 0.931 | 19 | 4.181 | 0.042 | 0.980 | 0.002369 | 93.684 |
The proposed methods compared with the CfsSubsetEval method
| Datasets | Train | Test | Over-fitting | Pre | Rec | F1-score | NO.F | F-Time | C-Time | AUC | Var. | ACC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RNA gene | 99.154 | 97.125 | 2.029 | 0.973 | 0.974 | 0.972 | 4083 | 3.860 | 1.030 | 0.987 | 0.000627 | 97.125 |
| DNA CNV | 64.034 | 63.682 | 0.352 | 0.539 | 0.537 | 0.533 | 41 | 2.950 | 0.009 | 0.815 | 0.000997 | 63.682 |
| Parkinson’s disease | 87.683 | 78.421 | 9.262 | 0.730 | 0.673 | 0.691 | 119 | 0.180 | 0.016 | 0.713 | 0.005994 | 78.421 |
| Dermatology diseases | 74.438 | 73.776 | 0.662 | 0.533 | 0.598 | 0.547 | 9 | 0.150 | 0.0002 | 0.884 | 0.002967 | 73.776 |
| BreastEW | 94.083 | 90.865 | 3.218 | 0.905 | 0.902 | 0.902 | 3 | 0.050 | 0.0002 | 0.950 | 0.000596 | 90.865 |
| RNA gene | 99.861 | 98.503 | 1.358 | 0.986 | 0.979 | 0.981 | 4083 | 3.860 | 0.048 | 0.987 | 0.000131 | 98.503 |
| DNA CNV | 64.864 | 64.196 | 0.668 | 0.601 | 0.620 | 0.598 | 41 | 2.950 | 0.001 | 0.884 | 0.001219 | 64.196 |
| Parkinson’s disease | 80.904 | 79.915 | 0.989 | 0.753 | 0.681 | 0.695 | 119 | 0.180 | 0.001 | 0.762 | 0.005410 | 79.915 |
| Dermatology diseases | 62.360 | 61.441 | 0.919 | 0.573 | 0.628 | 0.561 | 9 | 0.150 | 0.0007 | 0.935 | 0.005554 | 61.441 |
| BreastEW | 93.556 | 92.973 | 0.583 | 0.932 | 0.921 | 0.924 | 3 | 0.050 | 0.0008 | 0.979 | 0.000886 | 92.973 |
| RNA gene | 99.792 | 99.627 | 0.165 | 0.998 | 0.996 | 0.997 | 4083 | 3.860 | 0.007 | 1.000 | 0.000036 | 99.627 |
| DNA CNV | 78.986 | 72.599 | 6.387 | 0.691 | 0.626 | 0.630 | 41 | 2.950 | 0.00009 | 0.862 | 0.000289 | 72.599 |
| Parkinson’s disease | 89.021 | 80.282 | 8.739 | 0.751 | 0.694 | 0.708 | 119 | 0.180 | 0.0008 | 0.743 | 0.002007 | 80.282 |
| Dermatology diseases | 85.247 | 79.767 | 5.48 | 0.823 | 0.783 | 0.780 | 9 | 0.150 | 0.0004 | 0.943 | 0.006681 | 79.767 |
| BreastEW | 87.503 | 81.172 | 6.331 | 0.815 | 0.793 | 0.797 | 3 | 0.050 | 0.0008 | 0.857 | 0.005639 | 81.172 |
The proposed methods compared with the ReliefAttributeEval method
| Datasets | Train | Test | Over-fitting | Pre | Rec | F1-score | NO.F | F-Time | C-Time | AUC | Var. | ACC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RNA gene | 99.154 | 97.625 | 1.529 | 0.980 | 0.979 | 0.979 | 10,000 | 1.950 | 2.887 | 0.992 | 0.000432 | 97.625 |
| DNA CNV | 65.322 | 63.922 | 1.4 | 0.546 | 0.548 | 0.540 | 8000 | 1.550 | 0.994 | 0.816 | 0.001225 | 63.922 |
| Parkinson’s disease | 83.858 | 74.759 | 9.099 | 0.635 | 0.599 | 0.593 | 300 | 0.950 | 0.060 | 0.710 | 0.013702 | 74.759 |
| Dermatology diseases | 79.356 | 78.701 | 0.655 | 0.570 | 0.647 | 0.591 | 20 | 0.500 | 0.0007 | 0.924 | 0.000719 | 78.701 |
| BreastEW | 96.485 | 93.499 | 2.986 | 0.467 | 0.448 | 0.455 | 16 | 0.350 | 0.003 | 0.959 | 0.002994 | 93.499 |
| RNA gene | 99.855 | 96.881 | 2.974 | 0.962 | 0.955 | 0.955 | 10,000 | 1.950 | 0.115 | 0.973 | 0.000976 | 96.881 |
| DNA CNV | 65.678 | 64.679 | 0.999 | 0.629 | 0.668 | 0.620 | 8000 | 1.550 | 0.297 | 0.849 | 0.001359 | 64.679 |
| Parkinson’s disease | 81.550 | 79.338 | 2.212 | 0.742 | 0.729 | 0.722 | 300 | 0.950 | 0.003 | 0.775 | 0.014727 | 79.338 |
| Dermatology diseases | 87.403 | 85.570 | 1.833 | 0.808 | 0.852 | 0.806 | 20 | 0.500 | 0.0007 | 0.979 | 0.002454 | 85.570 |
| BreastEW | 94.681 | 94.415 | 0.266 | 0.481 | 0.444 | 0.460 | 16 | 0.350 | 0.0005 | 0.989 | 0.002167 | 94.415 |
| RNA gene | 99.875 | 99.873 | 0.002 | 0.999 | 0.999 | 0.999 | 10,000 | 1.950 | 0.016 | 1.000 | 0.000031 | 99.873 |
| DNA CNV | 80.735 | 74.246 | 6.489 | 0.708 | 0.655 | 0.654 | 8000 | 1.550 | 0.013 | 0.874 | 0.000378 | 74.246 |
| Parkinson’s disease | 85.801 | 71.708 | 14.093 | 0.595 | 0.586 | 0.579 | 300 | 0.950 | 0.0007 | 0.608 | 0.007203 | 71.708 |
| Dermatology diseases | 92.289 | 86.059 | 6.23 | 0.872 | 0.848 | 0.840 | 20 | 0.500 | 0.001 | 0.961 | 0.004704 | 86.059 |
| BreastEW | 94.005 | 91.739 | 2.266 | 0.462 | 0.427 | 0.442 | 16 | 0.350 | 0.0002 | 0.964 | 0.000553 | 91.739 |
The proposed methods compared with the OneRAttributeEval method
| Datasets | Train | Test | Over-fitting | Pre | Rec | F1-score | NO.F | F-Time | C-Time | AUC | Var. | ACC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RNA gene | 99.154 | 97.625 | 1.529 | 0.978 | 0.980 | 0.978 | 7000 | 2.021 | 1.779 | 0.991 | 0.000502 | 97.625 |
| DNA CNV | 65.367 | 64.059 | 1.308 | 0.543 | 0.557 | 0.542 | 5000 | 1.020 | 0.644 | 0.813 | 0.001261 | 64.059 |
| Parkinson’s disease | 85.627 | 78.744 | 6.883 | 0.741 | 0.670 | 0.673 | 200 | 0.150 | 0.038 | 0.764 | 0.012796 | 78.744 |
| Dermatology diseases | 79.448 | 77.080 | 2.368 | 0.563 | 0.633 | 0.578 | 15 | 0.120 | 0.0006 | 0.909 | 0.001517 | 77.080 |
| BreastEW | 96.641 | 92.095 | 4.546 | 0.919 | 0.912 | 0.914 | 17 | 0.105 | 0.002 | 0.962 | 0.001380 | 92.095 |
| RNA gene | 99.917 | 93.631 | 6.286 | 0.940 | 0.912 | 0.916 | 7000 | 2.021 | 0.076 | 0.949 | 0.000920 | 93.631 |
| DNA CNV | 67.368 | 66.528 | 0.840 | 0.617 | 0.625 | 0.610 | 5000 | 1.020 | 0.195 | 0.850 | 0.001035 | 66.528 |
| Parkinson’s disease | 74.262 | 73.784 | 0.478 | 0.400 | 0.434 | 0.415 | 200 | 0.150 | 0.001 | 0.715 | 0.009473 | 73.784 |
| Dermatology diseases | 86.004 | 83.589 | 2.415 | 0.812 | 0.791 | 0.758 | 15 | 0.120 | 0.0009 | 0.958 | 0.001873 | 83.589 |
| BreastEW | 93.451 | 93.196 | 0.250 | 0.469 | 0.460 | 0.462 | 17 | 0.105 | 0.001 | 0.986 | 0.003089 | 93.196 |
| RNA gene | 99.723 | 99.627 | 0.096 | 0.998 | 0.996 | 0.997 | 7000 | 2.021 | 0.010 | 0.999 | 0.000036 | 99.627 |
| DNA CNV | 78.155 | 71.777 | 6.378 | 0.651 | 0.603 | 0.597 | 5000 | 1.020 | 0.009 | 0.842 | 0.000189 | 71.777 |
| Parkinson’s disease | 80.688 | 72.481 | 8.207 | 0.388 | 0.443 | 0.414 | 200 | 0.150 | 0.0001 | 0.623 | 0.002529 | 72.481 |
| Dermatology diseases | 87.431 | 84.700 | 2.731 | 0.787 | 0.787 | 0.780 | 15 | 0.120 | 0.002 | 0.963 | 0.003370 | 84.700 |
| BreastEW | 94.728 | 92.976 | 1.752 | 0.930 | 0.921 | 0.924 | 17 | 0.105 | 0.001 | 0.961 | 0.000953 | 92.976 |
The proposed methods compared with the ConsistencySubsetEval method
| Datasets | Train | Test | Over-fitting | Pre | Rec | F1-score | NO.F | F-Time | C-Time | AUC | Var. | ACC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RNA gene | 93.106 | 91.136 | 1.97 | 0.880 | 0.869 | 0.869 | 3 | 1.850 | 0.001 | 0.963 | 0.000641 | 91.136 |
| DNA CNV | 64.842 | 63.921 | 0.921 | 0.550 | 0.549 | 0.543 | 42 | 1.600 | 0.012 | 0.816 | 0.001172 | 63.921 |
| Parkinson’s disease | 86.346 | 80.279 | 6.067 | 0.765 | 0.688 | 0.707 | 11 | 1.100 | 0.003 | 0.738 | 0.002407 | 80.279 |
| Dermatology diseases | 87.918 | 87.740 | 0.178 | 0.751 | 0.755 | 0.742 | 12 | 0.102 | 0.0006 | 0.946 | 0.002407 | 87.740 |
| BreastEW | 96.993 | 94.380 | 2.613 | 0.950 | 0.933 | 0.939 | 8 | 0.090 | 0.0008 | 0.971 | 0.000873 | 94.380 |
| RNA gene | 97.545 | 97.380 | 0.165 | 0.972 | 0.970 | 0.970 | 3 | 1.850 | 0.001 | 0.994 | 0.000188 | 97.380 |
| DNA CNV | 73.015 | 71.606 | 1.409 | 0.682 | 0.698 | 0.680 | 42 | 1.600 | 0.006 | 0.923 | 0.001276 | 71.606 |
| Parkinson’s disease | 76.940 | 75.516 | 1.424 | 0.666 | 0.599 | 0.605 | 11 | 1.100 | 0.0005 | 0.729 | 0.003051 | 75.516 |
| Dermatology diseases | 90.265 | 89.839 | 0.426 | 0.878 | 0.901 | 0.867 | 12 | 0.102 | 0.0007 | 0.995 | 0.002740 | 89.839 |
| BreastEW | 94.630 | 94.201 | 0.429 | 0.943 | 0.934 | 0.937 | 8 | 0.090 | 0.002 | 0.988 | 0.000686 | 94.201 |
| RNA gene | 97.517 | 97.131 | 0.386 | 0.963 | 0.964 | 0.962 | 3 | 1.850 | 0.001 | 0.993 | 0.000311 | 97.131 |
| DNA CNV | 82.735 | 77.059 | 5.676 | 0.750 | 0.680 | 0.685 | 42 | 1.600 | 0.0001 | 0.888 | 0.000465 | 77.059 |
| Parkinson’s disease | 79.100 | 69.698 | 9.402 | 0.557 | 0.524 | 0.517 | 11 | 1.100 | 0.002 | 0.583 | 0.002962 | 69.698 |
| Dermatology diseases | 97.632 | 95.916 | 1.716 | 0.962 | 0.943 | 0.947 | 12 | 0.102 | 0.0008 | 0.994 | 0.001339 | 95.916 |
| BreastEW | 95.704 | 93.496 | 2.208 | 0.937 | 0.925 | 0.929 | 8 | 0.090 | 0.0009 | 0.965 | 0.000621 | 93.496 |
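In the rows above, the Over-fitting column matches the gap between the Train and Test accuracies. The following is a minimal sketch of that arithmetic, assuming the column is simply Train minus Test in percentage points; the function name `overfitting_gap` is illustrative and does not come from the paper.

```python
def overfitting_gap(train_acc: float, test_acc: float) -> float:
    """Gap between training and test accuracy, in percentage points.

    Assumption: the tables report Over-fitting as Train - Test,
    rounded to three decimals.
    """
    return round(train_acc - test_acc, 3)


# (Train, Test) pairs copied from the first block of the
# ConsistencySubsetEval table.
rows = {
    "RNA gene": (93.106, 91.136),
    "DNA CNV": (64.842, 63.921),
    "Dermatology diseases": (87.918, 87.740),
}

gaps = {name: overfitting_gap(train, test) for name, (train, test) in rows.items()}

for name, gap in gaps.items():
    print(f"{name}: {gap}")
```

A small gap means the classifier performs about as well on unseen data as on the training data; a large gap points to over-fitting.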
The proposed methods compared with the PCA method
| Datasets | Train | Test | Over-fitting | Pre | Rec | F1-score | NO.F | F-Time | C-Time | AUC | Var. | ACC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RNA gene | 96.823 | 94.884 | 1.939 | 0.943 | 0.954 | 0.942 | 700 | 1.027 | 0.174 | 0.985 | 0.000740 | 94.884 |
| DNA CNV | 62.334 | 60.665 | 1.669 | 0.526 | 0.516 | 0.505 | 2800 | 49.795 | 1.549 | 0.827 | 0.001724 | 60.665 |
| Parkinson’s disease | 83.745 | 73.816 | 9.929 | 0.625 | 0.607 | 0.602 | 250 | 0.079 | 0.028 | 0.645 | 0.002418 | 73.816 |
| Dermatology diseases | 81.148 | 80.878 | 0.270 | 0.576 | 0.663 | 0.604 | 18 | 0.016 | 0.003 | 0.925 | 0.000142 | 80.878 |
| BreastEW | 95.723 | 93.493 | 2.230 | 0.942 | 0.925 | 0.929 | 20 | 0.016 | 0.002 | 0.959 | 0.000832 | 93.493 |
| RNA gene | 87.072 | 79.403 | 7.669 | 0.794 | 0.806 | 0.794 | 700 | 1.027 | 0.005 | 0.954 | 0.002116 | 79.403 |
| DNA CNV | 29.336 | 27.641 | 1.695 | 0.255 | 0.352 | 0.233 | 2800 | 49.795 | 0.077 | 0.680 | 0.000319 | 27.641 |
| Parkinson’s disease | 74.471 | 73.821 | 0.65 | 0.604 | 0.558 | 0.545 | 250 | 0.079 | 0.009 | 0.698 | 0.002826 | 73.821 |
| Dermatology diseases | 98.361 | 96.179 | 2.182 | 0.961 | 0.952 | 0.953 | 18 | 0.016 | 0.0001 | 0.997 | 0.001025 | 96.179 |
| BreastEW | 90.041 | 89.803 | 0.238 | 0.896 | 0.886 | 0.889 | 20 | 0.016 | 3.057 | 0.962 | 0.001707 | 89.803 |
| RNA gene | 99.750 | 99.740 | 0.010 | 0.999 | 0.997 | 0.998 | 700 | 1.027 | 0.002 | 0.999 | 0.000059 | 99.740 |
| DNA CNV | 81.200 | 74.348 | 6.852 | 0.663 | 0.639 | 0.634 | 2800 | 49.795 | 0.010 | 0.867 | 0.000273 | 74.348 |
| Parkinson’s disease | 81.158 | 72.612 | 8.546 | 0.612 | 0.571 | 0.575 | 250 | 0.079 | 0.001 | 0.627 | 0.002308 | 72.612 |
| Dermatology diseases | 92.380 | 87.162 | 5.218 | 0.862 | 0.860 | 0.844 | 18 | 0.050 | 0.00002 | 0.969 | 0.001984 | 87.162 |
| BreastEW | 94.728 | 92.976 | 1.752 | 0.930 | 0.921 | 0.924 | 20 | 0.030 | 0.0003 | 0.961 | 0.000953 | 92.976 |
The proposed methods compared with the MIFS, CBF and FCBF methods
| Datasets | Train | Test | Over-fitting | Pre | Rec | F1-score | NO.F | F-Time | C-Time | AUC | Var. | ACC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RNA gene | 99.736 | 99.627 | 0.109 | 0.998 | 0.996 | 0.997 | 10,000 | 258.902 | 0.008 | 1.000 | 0.000036 | 99.627 |
| DNA CNV | 82.686 | 76.097 | 6.589 | 0.745 | 0.663 | 0.667 | 9000 | 180.314 | 0.011 | 0.854 | 0.000368 | 76.097 |
| Parkinson’s disease | 80.879 | 72.479 | 8.400 | 0.610 | 0.568 | 0.572 | 300 | 2.121 | 0.0001 | 0.624 | 0.002344 | 72.479 |
| Dermatology diseases | 97.966 | 97.267 | 0.699 | 0.975 | 0.969 | 0.969 | 25 | 0.351 | 0.002 | 0.963 | 0.000839 | 97.267 |
| BreastEW | 94.435 | 92.628 | 1.807 | 0.927 | 0.917 | 0.920 | 20 | 0.083 | 0.00002 | 0.958 | 0.001419 | 92.628 |
| RNA gene | 99.867 | 99.748 | 0.119 | 0.999 | 0.997 | 0.998 | 900 | 2.600 | 0.003 | 1.000 | 0.000092 | 99.748 |
| DNA CNV | 52.831 | 49.073 | 3.758 | 0.447 | 0.402 | 0.369 | 750 | 1.850 | 0.003 | 0.669 | 0.000490 | 49.073 |
| Parkinson’s disease | 81.158 | 72.612 | 8.546 | 0.612 | 0.571 | 0.575 | 320 | 0.255 | 0.002 | 0.627 | 0.002308 | 72.612 |
| Dermatology diseases | 94.171 | 90.953 | 3.218 | 0.871 | 0.855 | 0.846 | 20 | 0.202 | 0.002 | 0.947 | 0.002243 | 90.953 |
| BreastEW | 94.747 | 92.976 | 1.771 | 0.931 | 0.920 | 0.924 | 17 | 0.105 | 0.002 | 0.961 | 0.000953 | 92.976 |
| RNA gene | 99.742 | 99.625 | 0.117 | 0.998 | 0.996 | 0.997 | 400 | 1.750 | 0.001 | 1.000 | 0.000131 | 99.625 |
| DNA CNV | 81.390 | 76.236 | 5.154 | 0.721 | 0.671 | 0.676 | 13 | 0.800 | 0.007 | 0.905 | 0.001131 | 76.236 |
| Parkinson’s disease | 82.657 | 73.270 | 9.387 | 73.271 | 0.585 | 0.587 | 16 | 1.500 | 0.002 | 0.675 | 0.001767 | 73.270 |
| Dermatology diseases | 97.936 | 97.005 | 0.931 | 0.970 | 0.967 | 0.966 | 14 | 0.101 | 0.002 | 0.961 | 0.001217 | 97.005 |
| BreastEW | 95.333 | 95.078 | 0.255 | 0.953 | 0.945 | 0.947 | 7 | 0.006 | 0.002 | 0.953 | 0.000261 | 95.078 |
The proposed methods compared with the Chi-square method
| Datasets | Train | Test | Over-fitting | Pre | Rec | F1-score | NO.F | F-Time | C-Time | AUC | Var. | ACC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RNA gene | 100.000 | 99.625 | 0.375 | 0.997 | 0.995 | 0.996 | 7555 | 0.0801 | 2.379 | 1.000 | 0.000036 | 99.625 |
| DNA CNV | 79.862 | 70.130 | 9.732 | 0.592 | 0.586 | 0.584 | 5555 | 0.528 | 3.050 | 0.901 | 0.000369 | 70.130 |
| Parkinson’s disease | 75.661 | 72.228 | 3.433 | 0.471 | 0.497 | 0.448 | 398 | 0.016 | 0.210 | 0.628 | 0.000814 | 72.228 |
| Dermatology diseases | 71.220 | 70.488 | 0.732 | 0.556 | 0.653 | 0.565 | 24 | 0.094 | 0.093 | 0.653 | 0.001305 | 70.488 |
| BreastEW | 91.994 | 91.563 | 0.431 | 0.929 | 0.895 | 0.906 | 21 | 0.016 | 0.016 | 0.976 | 0.001014 | 91.563 |
| RNA gene | 100.000 | 99.502 | 0.498 | 0.997 | 0.995 | 0.996 | 7555 | 0.0801 | 1.009 | 1.000 | 0.000041 | 99.502 |
| DNA CNV | 86.934 | 68.552 | 18.382 | 0.585 | 0.572 | 0.570 | 5555 | 0.528 | 2.817 | 0.891 | 0.000240 | 68.552 |
| Parkinson’s disease | 100.000 | 81.087 | 18.913 | 0.755 | 0.701 | 0.704 | 398 | 0.016 | 0.471 | 0.836 | 0.008783 | 81.087 |
| Dermatology diseases | 100.000 | 98.355 | 1.645 | 0.984 | 0.981 | 0.982 | 24 | 0.094 | 0.229 | 0.998 | 0.000363 | 98.355 |
| BreastEW | 100.000 | 96.832 | 3.168 | 0.973 | 0.962 | 0.965 | 21 | 0.016 | 0.104 | 0.990 | 0.001265 | 96.832 |
The proposed methods compared with the IGF, Chi-square and Bat algorithm methods
| Datasets | Train | Test | Over-fitting | Pre | Rec | F1-score | NO.F | F-Time | C-Time | AUC | Var. | ACC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RNA gene | 99.723 | 99.627 | 0.096 | 0.998 | 0.996 | 0.997 | 3576 | 1.182 | 0.005 | 1.000 | 0.000036 | 99.627 |
| DNA CNV | 81.310 | 74.448 | 6.862 | 0.671 | 0.640 | 0.636 | 3315 | 5.651 | 0.014 | 0.850 | 0.001361 | 74.448 |
| Parkinson’s disease | 81.026 | 72.479 | 8.547 | 0.777 | 0.885 | 0.827 | 396 | 0.093 | 0.0008 | 0.452 | 0.002384 | 72.479 |
| Dermatology diseases | 97.996 | 97.267 | 0.729 | 0.974 | 0.970 | 0.969 | 25 | 0.032 | 0.0002 | 0.964 | 0.000496 | 97.267 |
| BreastEW | 94.728 | 92.976 | 1.752 | 0.924 | 0.888 | 0.904 | 22 | 0.064 | 0.002 | 0.969 | 0.000953 | 92.976 |
| RNA gene | 100.000 | 96.380 | 3.620 | 0.9685 | 0.946 | 0.952 | 3576 | 1.182 | 0.029 | 0.970 | 0.000710 | 96.380 |
| DNA CNV | 66.994 | 65.637 | 1.357 | 0.647 | 0.657 | 0.626 | 3315 | 5.651 | 0.435 | 0.832 | 0.001336 | 65.637 |
| Parkinson’s disease | 74.618 | 74.070 | 0.548 | 0.803 | 0.867 | 0.833 | 396 | 0.093 | 0.004 | 0.721 | 0.006235 | 74.070 |
| Dermatology diseases | 86.947 | 85.781 | 1.166 | 0.828 | 0.856 | 0.802 | 25 | 0.032 | 0.0007 | 0.984 | 0.001143 | 85.781 |
| BreastEW | 94.259 | 93.853 | 0.406 | 0.946 | 0.887 | 0.914 | 22 | 0.064 | 0.002 | 0.988 | 0.000766 | 93.853 |
| RNA gene | 99.154 | 97.250 | 1.904 | 0.975 | 0.977 | 0.975 | 3576 | 1.182 | 0.836 | 0.989 | 0.000444 | 97.250 |
| DNA CNV | 65.626 | 64.574 | 1.052 | 0.553 | 0.559 | 0.551 | 3315 | 5.651 | 2.067 | 0.820 | 0.000900 | 64.574 |
| Parkinson’s disease | 86.449 | 75.528 | 10.921 | 0.810 | 0.878 | 0.841 | 396 | 0.093 | 0.077 | 0.707 | 0.005645 | 75.528 |
| Dermatology diseases | 88.494 | 85.015 | 3.479 | 0.725 | 0.745 | 0.721 | 25 | 0.032 | 0.0008 | 0.934 | 0.003209 | 85.015 |
| BreastEW | 96.466 | 94.029 | 2.437 | 0.924 | 0.920 | 0.918 | 22 | 0.064 | 0.003 | 0.967 | 0.001310 | 94.029 |
| RNA gene | 99.847 | 99.750 | 0.097 | 0.999 | 0.997 | 0.998 | 7555 | 0.0801 | 0.010 | 1.000 | 0.000028 | 99.750 |
| DNA CNV | 70.283 | 59.635 | 10.648 | 0.526 | 0.498 | 0.492 | 5555 | 0.528 | 0.005 | 0.753 | 0.002142 | 59.635 |
| Parkinson’s disease | 81.158 | 72.612 | 8.546 | 0.778 | 0.886 | 0.828 | 398 | 0.016 | 0.001 | 0.452 | 0.002308 | 72.612 |
| Dermatology diseases | 92.622 | 88.498 | 4.124 | 0.889 | 0.875 | 0.865 | 24 | 0.094 | 0.0009 | 0.967 | 0.002641 | 88.498 |
| BreastEW | 94.728 | 92.976 | 1.752 | 0.924 | 0.888 | 0.904 | 21 | 0.016 | 0.002 | 0.969 | 0.000953 | 92.976 |
| RNA gene | 100.000 | 75.787 | 24.213 | 0.746 | 0.683 | 0.680 | 7555 | 0.0801 | 0.184 | 0.809 | 0.004833 | 75.787 |
| DNA CNV | 49.600 | 48.765 | 0.835 | 0.512 | 0.506 | 0.468 | 5555 | 0.528 | 0.552 | 0.734 | 0.000893 | 48.765 |
| Parkinson’s disease | 74.691 | 74.207 | 0.484 | 0.797 | 0.879 | 0.835 | 398 | 0.016 | 0.006 | 0.708 | 0.008073 | 74.207 |
| Dermatology diseases | 89.445 | 87.164 | 2.281 | 0.816 | 0.860 | 0.817 | 24 | 0.094 | 0.0004 | 0.975 | 0.001954 | 87.164 |
| BreastEW | 94.220 | 93.678 | 0.542 | 0.946 | 0.882 | 0.912 | 21 | 0.016 | 0.0007 | 0.988 | 0.000967 | 93.678 |
| RNA gene | 99.154 | 97.250 | 1.904 | 0.974 | 0.976 | 0.974 | 7555 | 0.0801 | 5.221 | 0.990 | 0.000410 | 97.250 |
| DNA CNV | 58.451 | 55.863 | 2.588 | 0.464 | 0.458 | 0.452 | 5555 | 0.528 | 2.155 | 0.776 | 0.000253 | 55.863 |
| Parkinson’s disease | 83.127 | 76.721 | 6.406 | 0.823 | 0.878 | 0.849 | 398 | 0.016 | 0.117 | 0.729 | 0.005338 | 76.721 |
| Dermatology diseases | 88.494 | 85.015 | 3.479 | 0.725 | 0.745 | 0.721 | 24 | 0.094 | 0.0008 | 0.934 | 0.003209 | 85.015 |
| BreastEW | 96.466 | 93.327 | 3.139 | 0.915 | 0.911 | 0.909 | 21 | 0.016 | 0.010 | 0.966 | 0.002513 | 93.327 |
| RNA gene | 99.861 | 99.752 | 0.109 | 0.999 | 0.997 | 0.998 | 6483 | 1350 | 0.012 | 1.000 | 0.000027 | 99.752 |
| DNA CNV | 81.786 | 75.309 | 6.477 | 0.685 | 0.644 | 0.639 | 5301 | 1280 | 0.008 | 0.864 | 0.000235 | 75.309 |
| Parkinson’s disease | 80.277 | 69.326 | 10.951 | 0.757 | 0.869 | 0.809 | 35 | 0.305 | 9.422 | 0.507 | 0.003475 | 69.326 |
| Dermatology diseases | 98.027 | 97.260 | 0.767 | 0.971 | 0.970 | 0.969 | 19 | 0.255 | 0.002 | 0.974 | 0.001678 | 97.260 |
| BreastEW | 94.728 | 92.976 | 1.752 | 0.924 | 0.888 | 0.904 | 14 | 0.200 | 0.001 | 0.969 | 0.000953 | 92.976 |
| RNA gene | 99.882 | 83.777 | 16.105 | 0.858 | 0.787 | 0.792 | 6483 | 1350 | 0.084 | 0.875 | 0.002498 | 83.777 |
| DNA CNV | 67.463 | 66.290 | 1.173 | 0.654 | 0.663 | 0.632 | 5301 | 1280 | 0.186 | 0.873 | 0.002012 | 66.290 |
| Parkinson’s disease | 75.617 | 74.744 | 0.873 | 0.792 | 0.899 | 0.841 | 35 | 0.305 | 0.0008 | 0.706 | 0.005885 | 74.744 |
| Dermatology diseases | 86.109 | 85.540 | 0.569 | 0.801 | 0.850 | 0.799 | 19 | 0.255 | 0.001 | 0.979 | 0.001915 | 85.540 |
| BreastEW | 95.477 | 95.099 | 0.378 | 0.962 | 0.906 | 0.931 | 14 | 0.200 | 0.0005 | 0.990 | 0.001183 | 95.099 |
| RNA gene | 99.320 | 98.750 | 0.570 | 0.988 | 0.990 | 0.988 | 6483 | 1350 | 1.782 | 0.994 | 0.000139 | 98.750 |
| DNA CNV | 65.592 | 64.608 | 0.984 | 0.553 | 0.560 | 0.551 | 5301 | 1280 | 0.578 | 0.822 | 0.000894 | 64.608 |
| Parkinson’s disease | 81.820 | 74.609 | 7.211 | 0.783 | 0.915 | 0.843 | 35 | 0.305 | 0.010 | 0.699 | 0.003413 | 74.609 |
| Dermatology diseases | 89.011 | 87.177 | 1.834 | 0.747 | 0.761 | 0.742 | 19 | 0.255 | 0.002 | 0.942 | 0.002909 | 87.177 |
| BreastEW | 96.466 | 94.029 | 2.437 | 0.924 | 0.920 | 0.918 | 14 | 0.200 | 0.003 | 0.968 | 0.001310 | 94.029 |
Comparison between PFBS-RFS-RFE and other filter methods
| Algorithm | ACC% | NO.F | Pre | Rec | F1-score | AUC | Var. |
|---|---|---|---|---|---|---|---|
| MIFS | 99.875 | 10,000 | 0.999 | 0.998 | 0.988 | 1.000 | 0.000016 |
| IGF | 99.875 | 3576 | 0.999 | 0.999 | 0.998 | 1.000 | 0.000016 |
| mRMR | 99.750 | 650 | 0.999 | 0.997 | 0.998 | 1.000 | 0.000028 |
| CfsSubsetEval | 99.627 | 4083 | 0.998 | 0.996 | 0.997 | 1.000 | 0.000036 |
| ReliefAttributeEval | 99.873 | 10,000 | 0.999 | 0.999 | 0.999 | 1.000 | 0.000031 |
| OneRAttributeEval | 99.627 | 7000 | 0.998 | 0.996 | 0.997 | 0.999 | 0.000036 |
| ConsistencySubsetEval | 97.380 | 3 | 0.972 | 0.970 | 0.970 | 0.994 | 0.000188 |
| PCA | 99.740 | 700 | 0.999 | 0.997 | 0.998 | 0.999 | 0.000059 |
| MIFS, CBF and FCBF | 99.748 | 900 | 0.999 | 0.997 | 0.998 | 1.000 | 0.000092 |
| Chi-square | 99.625 | 7555 | 0.997 | 0.995 | 0.996 | 1.000 | 0.000036 |
| IGF, Chi-square and Bat algorithm | 99.752 | 6483 | 0.999 | 0.997 | 0.998 | 1.000 | 0.000027 |
| Proposed method (PFBS-RFS-RFE) | 100.000 | 10.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.0 |
The proposed methods compared with the hybrid of MIFS and RFE
| Datasets | Train | Test | Over-fitting | Pre | Rec | F1-score | NO.F | F-Time | C-Time | AUC | Var. | ACC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RNA gene | 100.000 | 99.501 | 0.499 | 0.794 | 0.715 | 0.723 | 5000 | 10,227.579 | 1.199 | 1.000 | 0.000041 | 99.501 |
| DNA CNV | 92.908 | 85.034 | 7.874 | 0.770 | 0.716 | 0.717 | 4500 | 88,434.627 | 3.411 | 0.946 | 0.000698 | 85.034 |
| Parkinson’s disease | 100.000 | 83.861 | 16.139 | 0.809 | 0.737 | 0.759 | 150 | 74.445 | 0.411 | 0.876 | 0.002867 | 83.861 |
| Dermatology diseases | 99.727 | 94.819 | 4.908 | 0.941 | 0.930 | 0.930 | 12 | 1.113 | 0.079 | 0.996 | 0.001528 | 94.819 |
| BreastEW | 100.000 | 95.965 | 4.035 | 0.961 | 0.953 | 0.956 | 10 | 1.592 | 0.133 | 0.988 | 0.000756 | 95.965 |
The proposed methods compared with the hybrid of GA and RFE
| Datasets | Train | Test | Over-fitting | Pre | Rec | F1-score | NO.F | F-Time | C-Time | AUC | Var. | ACC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RNA gene | 99.791 | 99.750 | 0.221 | 0.999 | 0.997 | 0.998 | 3123 | 15,746.043 | 0.727 | 1.000 | 0.000028 | 99.750 |
| DNA CNV | 93.271 | 84.980 | 8.291 | 0.860 | 0.770 | 0.790 | 2940 | 62,405.810 | 35.118 | 0.965 | 0.000620 | 84.980 |
| Parkinson’s disease | 75.529 | 74.996 | 0.533 | 0.530 | 0.523 | 0.474 | 149 | 55.114 | 0.071 | 0.768 | 0.000652 | 74.996 |
| Dermatology diseases | 84.800 | 84.722 | 0.078 | 0.854 | 0.835 | 0.830 | 5 | 0.651 | 0.016 | 0.960 | 0.000052 | 84.722 |
| BreastEW | 91.799 | 91.394 | 0.405 | 0.463 | 0.420 | 0.439 | 5 | 0.656 | 0.016 | 0.977 | 0.000776 | 91.394 |
The proposed methods compared with the hybrid of Ridge regression and RFE
| Datasets | Train | Test | Over-fitting | Pre | Rec | F1-score | NO.F | F-Time | C-Time | AUC | Var. | ACC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RNA gene | 100.000 | 99.627 | 0.373 | 0.998 | 0.830 | 0.831 | 10,265 | 10,160.720 | 2.962 | 1.000 | 0.000036 | 99.627 |
| DNA CNV | 93.446 | 80.761 | 12.685 | 0.772 | 0.707 | 0.710 | 8190 | 37,302.17 | 3.216 | 0.944 | 0.000527 | 80.761 |
| Parkinson’s disease | 100.000 | 82.930 | 17.070 | 0.810 | 0.714 | 0.738 | 376 | 5.482 | 1.195 | 0.855 | 0.003410 | 82.930 |
| Dermatology diseases | 99.727 | 94.805 | 4.922 | 0.950 | 0.947 | 0.944 | 13 | 0.016 | 0.080 | 0.994 | 0.001556 | 94.805 |
| BreastEW | 100.000 | 93.675 | 6.325 | 0.941 | 0.926 | 0.932 | 15 | 0.0159 | 0.101 | 0.984 | 0.000969 | 93.675 |
The comparison between the PFBS-RFS-RFE and other RFE hybrid methods
| Algorithm | ACC% | NO.F | Pre | Rec | F1-score | AUC | Var. |
|---|---|---|---|---|---|---|---|
| MIFS and RFE | 99.501 | 4500 | 0.794 | 0.715 | 0.723 | 1.000 | 0.000041 |
| GA and RFE | 99.750 | 3123 | 0.999 | 0.997 | 0.998 | 1.000 | 0.000028 |
| Ridge regression and RFE | 99.627 | 10,265 | 0.998 | 0.830 | 0.831 | 1.000 | 0.000036 |
| Proposed method (PFBS-RFS-RFE) | 100.000 | 10.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.0 |
The selected features after intersection [38–58]
| No. of features | Feature | Description | Ref. |
|---|---|---|---|
| 1 | G110 | – | – |
| 12 | PPP1R8 | Through alternative splicing, this gene encodes three different isoforms [ | [ |
| | SCARNA1 | Small Cajal body-specific RNA 1 [ | [ |
| | RPA2 | This gene encodes part of the replication protein A (RPA) complex [ | [ |
| | SMPDL3B | Sphingomyelin phosphodiesterase acid like 3B [ | [ |
| | XKR8 | Promotes phosphatidylserine exposure on the apoptotic cell surface, possibly by mediating phospholipid scrambling [ | [ |
| | PHACTR4 | This gene encodes a member of the phosphatase and actin regulator (PHACTR) family [ | [ |
| | RCC1 | Regulator of chromosome condensation 1 [ | [ |
| | SNHG3 | Small nucleolar RNA host gene 3 [ | [ |
| | SNORD99 | Small nucleolar RNA, C/D box 99 [ | [ |
| | SNORA16A | Small nucleolar RNA, H/ACA box 16A [ | [ |
| | RAB42 | Member RAS oncogene family [ | – |
| | TFA12 | This gene is involved in the control of transcription by RNA polymerase II [ | [ |
| 7 | IMF_SNR_TKEO | – | – |
| | IMF_NSR_TKEO | – | – |
| | mean_MFCC_1st_coef | – | – |
| | mean_4th_delta_delta | – | – |
| | mean_5th_delta_delta | – | – |
| | mean_6th_delta_delta | – | – |
| | mean_7th_delta_delta | – | – |
| 1 | Radius | Mean of the distances from the center to points on the perimeter [ | [ |
| 5 | Borders | The border of the lesion, which is important for diagnosis and for other features [ | [ |
| | Parakeratosis | Nucleated keratinocytes are present in the stratum corneum owing to accelerated keratinocyte turnover [ | [ |
| | Spongiosis | Intraepidermal eosinophils are present in spongiotic zones [ | [ |
| | Itching | An unpleasant, persistent sensation that affects the patient's psychological well-being [ | [ |
| | Age | The age at disease onset [ | [ |
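The table caption refers to features kept "after intersection". As a rough illustration only, the sketch below intersects feature subsets produced by separate selection runs. Which runs the authors intersect is not stated in this section, so `run_a`, `run_b`, the function `intersect_features`, and the placeholder names `feature_x` and `feature_y` are all hypothetical; only `Borders`, `Itching` and `Age` are taken from the table.

```python
def intersect_features(*subsets):
    """Return the features present in every subset.

    The result keeps the order of the first subset so that repeated
    runs give a stable, reproducible list.
    """
    if not subsets:
        return []
    common = set(subsets[0]).intersection(*(set(s) for s in subsets[1:]))
    return [feature for feature in subsets[0] if feature in common]


# Hypothetical outputs of two selection runs on the dermatology data;
# "feature_x" and "feature_y" are placeholders, not real attributes.
run_a = ["Borders", "Itching", "feature_x", "Age"]
run_b = ["Age", "feature_y", "Borders"]

selected = intersect_features(run_a, run_b)
print(selected)
```

Keeping only the features that every run agrees on is a conservative choice: it shrinks the subset at the cost of possibly discarding a feature that a single run ranked highly.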
Datasets Description
| Category Type | DS No. | Datasets | #Features | #Samples | #Classes |
|---|---|---|---|---|---|
| Small (D < 100) | D1 | BreastEW | 30 | 569 | 2 |
| Small (D < 100) | D2 | Dermatology erythemato-squamous diseases | 34 | 366 | 6 |
| Medium (100 < D < 1000) | D3 | Parkinson’s disease | 753 | 756 | 2 |
| Large (1000 < D < 21,000) | D4 | DNA CNV | 16,381 | 2916 | 6 |
| Large (1000 < D < 21,000) | D5 | RNA gene | 20,531 | 801 | 5 |
Fig. 4 Hybrid proposed methods for feature selection
Algorithm 1 of the first hybrid proposed method using OFBS-RFS-RFE
Algorithm 2 of the second hybrid proposed method using IFBS-RFS-RFE
Algorithm 3 of the third hybrid proposed method using O/IFBS-RFS-RFE