Loris Nanni, Sheryl Brahnam, Stefano Ghidoni, Alessandra Lumini.
Abstract
We perform an extensive study of the performance of different classification approaches on twenty-five datasets (fourteen image datasets and eleven UCI data mining datasets). The aim is to find General-Purpose (GP) heterogeneous ensembles (requiring little to no parameter tuning) that perform competitively across multiple datasets. The state-of-the-art classifiers examined in this study include the support vector machine, Gaussian process classifiers, random subspace of AdaBoost, random subspace of rotation boosting, and deep learning classifiers. We demonstrate that a heterogeneous ensemble based on the simple fusion by sum rule of different classifiers performs consistently well across all twenty-five datasets. The most important result of our investigation is demonstrating that some very recent approaches, including the heterogeneous ensemble we propose in this paper, are capable of outperforming an SVM classifier (implemented with LibSVM), even when both kernel selection and SVM parameters are carefully tuned for each dataset.
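The sum-rule fusion mentioned in the abstract can be illustrated with a minimal sketch. It assumes each classifier outputs one score per class for every sample and that the scores are already on a comparable scale (for example, normalized to sum to 1 per sample). The function names `sum_rule` and `predict` and the toy scores are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def sum_rule(score_sets: Sequence[Sequence[Sequence[float]]]) -> List[List[float]]:
    """Fuse classifiers by adding their per-class scores.

    score_sets[k][i][c] is the score classifier k gives sample i for class c.
    Assumption: scores from different classifiers are on a comparable scale.
    """
    n_samples = len(score_sets[0])
    n_classes = len(score_sets[0][0])
    fused = [[0.0] * n_classes for _ in range(n_samples)]
    for scores in score_sets:
        for i, row in enumerate(scores):
            for c, s in enumerate(row):
                fused[i][c] += s
    return fused


def predict(fused: Sequence[Sequence[float]]) -> List[int]:
    """Pick, for each sample, the class with the largest fused score."""
    return [max(range(len(row)), key=row.__getitem__) for row in fused]


# Toy scores (hypothetical): three classifiers, two samples, two classes.
clf_a = [[0.6, 0.4], [0.3, 0.7]]
clf_b = [[0.4, 0.6], [0.2, 0.8]]
clf_c = [[0.7, 0.3], [0.6, 0.4]]

fused_scores = sum_rule([clf_a, clf_b, clf_c])
labels = predict(fused_scores)
```

In this toy case classifier `clf_b` disagrees with the other two on the first sample, and `clf_c` disagrees on the second; the summed scores follow whichever side has the larger total score.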
Year: 2015 PMID: 26413089 PMCID: PMC4564633 DOI: 10.1155/2015/909123
Source DB: PubMed Journal: Comput Intell Neurosci
Descriptive summary of the image datasets.
| Dataset | Number of classes | Number of samples | Sample size | URL for download |
|---|---|---|---|---|
| PS | 2 | 917 | Various | |
| VI | 15 | 1500 | 41 × 41 | |
| CH | 5 | 327 | 512 × 382 | |
| SM | 2 | 2868 | 100 × 100 | |
| HI | 4 | 2828 | Various | Upon request to Loris Nanni [ |
| BR | 2 | 584 | Various | Upon request to Geraldo Braz Junior [ |
| PR | 2 | 349 | Various | Upon request to Loris Nanni [ |
| HE | 10 | 862 | 512 × 382 | |
| LO | 10 | 502 | 768 × 512 | |
| TR | 11 | 553 | 768 × 512 | |
| PI | 13 | 903 | Various | |
| RN | 10 | 200 | 1024 × 1024 | |
| PA | 13 | 2338 | Various | |
| LE | 20 | 1200 | 128 × 128 | Upon request to |
UCI datasets and their features: number of attributes (#A), number of samples (#S), and number of classes (#C).
| Dataset | Acronym | #A | #S | #C | Brief description |
|---|---|---|---|---|---|
| BREAST | BR | 9 | 699 | 2 | For breast tumor diagnosis |
| HEART | HE | 13 | 303 | 2 | For detecting heart disease; the “goal” field refers to the presence of heart disease in the patient |
| PIMA | PI | 8 | 768 | 2 | For forecasting the onset of diabetes mellitus |
| Spam | SP | 57 | 4601 | 2 | For classifying E-mail as spam or nonspam |
| SONAR | SO | 60 | 208 | 2 | For discriminating between sonar signals bounced off a metal cylinder and those bounced off a rough cylindrical rock |
| IONOSPHERE | IO | 34 | 351 | 2 | For classifying radar returns from the ionosphere |
| Liver | LI | 7 | 345 | 2 | For classifying liver disorders that might arise from excessive alcohol consumption |
| Haberman | HA | 3 | 306 | 2 | A dataset that contains cases on the survival of patients who had undergone surgery for breast cancer |
| Vote | VO | 16 | 435 | 2 | For classifying Republican versus Democrat US representatives (this dataset includes votes for each member of the US House of Representatives on 16 key votes) |
| Australian | AU | 14 | 690 | 2 | For credit card applications |
| Transfusion | TR | 5 | 748 | 2 | This study adopted the donor database of Blood Transfusion Service Center; the aim is to predict whether a person donated blood in March, 2007 |
Performance (AUC) obtained in different image datasets using LTP as texture descriptor.
| LTP | PS | VI | CH | SM | HI | BR | PR | HE | LO | TR | PI | RN | PA | LE | Av | RA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SVM | 0.9144 | 0.9349 | | 0.9975 | 0.9156 | 0.9692 | 0.8968 | 0.9814 | 0.9949 | 0.9926 | 0.9286 | 0.9696 | 0.8903 | 0.9792 | 0.9546 | 7.0 |
| RS-SVM | 0.9071 | 0.9352 | | | 0.9195 | 0.9763 | 0.9030 | 0.9826 | 0.9950 | 0.9924 | 0.9316 | 0.9713 | 0.8944 | 0.9807 | 0.9562 | 5.9 |
| GPC | 0.9086 | 0.9131 | 0.9997 | 0.9971 | 0.9198 | 0.9789 | 0.8865 | 0.9816 | 0.9964 | 0.9930 | 0.9090 | 0.9769 | 0.8968 | 0.9752 | 0.9523 | 8.1 |
| RS_AB | 0.9121 | 0.9254 | 0.9998 | 0.9974 | 0.8924 | 0.9810 | 0.9079 | 0.9813 | 0.9965 | 0.9953 | 0.9242 | 0.9771 | 0.8959 | 0.9738 | 0.9543 | 7.0 |
| RS_RB | 0.9110 | 0.9293 | 0.9999 | 0.9953 | 0.9136 | 0.9739 | 0.8886 | 0.9806 | 0.9969 | 0.9955 | 0.9178 | 0.9900 | 0.8940 | 0.9738 | 0.9543 | 7.6 |
| DL1 | 0.8927 | 0.9173 | 0.9999 | 0.9952 | 0.9072 | 0.9811 | 0.8486 | 0.9801 | 0.9962 | 0.9965 | 0.9147 | 0.9837 | 0.8865 | | 0.9486 | 8.7 |
| DL2 | 0.8965 | 0.9220 | 0.9998 | 0.9956 | 0.8945 | 0.9815 | 0.8780 | 0.9806 | 0.9956 | 0.9959 | 0.9061 | 0.9878 | 0.8869 | 0.7900 | 0.9365 | 9.4 |
| DL3 | 0.7802 | 0.9239 | 0.9999 | 0.9963 | 0.9082 | 0.9815 | 0.8779 | 0.9812 | 0.9958 | 0.9962 | 0.9014 | 0.9916 | 0.8895 | 0.9525 | 0.9412 | 8.5 |
| S_D | 0.8985 | 0.9244 | 1.000 | 0.9958 | 0.9143 | | 0.8783 | 0.9818 | 0.9958 | 0.9966 | 0.9151 | | 0.8960 | 0.9806 | 0.9537 | 6.0 |
| E1 | 0.9130 | 0.9196 | 0.9997 | 0.9974 | 0.9162 | 0.9812 | 0.8999 | 0.9816 | 0.9962 | 0.9945 | 0.9191 | 0.9798 | 0.8983 | 0.9748 | 0.9551 | 7.2 |
| E2 | | 0.9337 | 0.9998 | 0.9973 | 0.9184 | 0.9809 | 0.9007 | 0.9837 | 0.9969 | 0.9960 | 0.9238 | 0.9884 | 0.9030 | 0.9768 | 0.9583 | 5.1 |
| E3 | | 0.9361 | | 0.9975 | | 0.9816 | | 0.9843 | 0.9970 | 0.9968 | 0.9313 | 0.9835 | 0.9080 | 0.9796 | 0.9603 | 2.8 |
| E4 | 0.9164 | | | 0.9975 | 0.9235 | 0.9824 | 0.9059 | | | | | 0.9864 | | 0.9808 | | |
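The RA column is not defined in this excerpt. Assuming it denotes the average rank of each method across datasets (rank 1 for the highest AUC on a dataset, with tied methods sharing the mean of their positions), it could be computed as in the sketch below. The helper names `ranks_desc` and `average_ranks` and the toy AUC values are hypothetical and are not values from the tables.

```python
from typing import Dict, List, Sequence


def ranks_desc(values: Sequence[float]) -> List[float]:
    """Rank values so the largest gets rank 1; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def average_ranks(auc_by_method: Dict[str, Sequence[float]]) -> Dict[str, float]:
    """Average each method's per-dataset rank (assumed meaning of RA)."""
    methods = list(auc_by_method)
    n_datasets = len(auc_by_method[methods[0]])
    totals = {m: 0.0 for m in methods}
    for d in range(n_datasets):
        column = [auc_by_method[m][d] for m in methods]
        for m, r in zip(methods, ranks_desc(column)):
            totals[m] += r
    return {m: totals[m] / n_datasets for m in methods}


# Toy AUC values (hypothetical): three methods on three datasets.
toy_auc = {
    "A": [0.90, 0.80, 0.70],
    "B": [0.85, 0.80, 0.75],
    "C": [0.80, 0.70, 0.72],
}
toy_ra = average_ranks(toy_auc)
```

Under this assumption a lower RA is better, which would explain how a method can have a lower average AUC than another and still a better RA: it only needs to place well on most datasets.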
Performance (AUC) obtained on the different image datasets using LPQ as texture descriptor.
| LPQ | PS | VI | CH | SM | HI | BR | PR | HE | LO | TR | PI | RN | PA | LE | Av | RA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SVM | 0.9039 | | 0.9999 | 0.9986 | 0.9138 | 0.9565 | 0.8618 | 0.9757 | 0.9764 | 0.9767 | 0.9071 | 0.9532 | 0.8834 | | 0.9461 | 8.4 |
| RS- | 0.8951 | 0.9485 | 0.9999 | 0.9988 | 0.9251 | 0.9568 | 0.8727 | 0.9786 | 0.9809 | 0.9817 | 0.9128 | 0.9531 | 0.8854 | 0.9891 | 0.9485 | 7.5 |
| GPC | 0.9020 | 0.9282 | 0.9991 | 0.9985 | 0.9199 | 0.9720 | 0.8883 | 0.9793 | 0.9891 | | 0.9073 | 0.9439 | 0.8867 | 0.9782 | 0.9490 | 7.4 |
| RS_AB | 0.9013 | 0.9417 | 0.9998 | 0.9989 | 0.8783 | 0.9671 | 0.8843 | 0.9781 | 0.9868 | 0.9907 | 0.9255 | 0.9478 | 0.8777 | 0.9826 | 0.9472 | 7.9 |
| RS_RB | 0.8994 | 0.9393 | 0.9992 | 0.9978 | 0.9120 | 0.9711 | 0.8999 | 0.9741 | 0.9800 | 0.9889 | 0.9116 | 0.9562 | 0.8806 | 0.9799 | 0.9493 | 8.6 |
| DL1 | 0.8701 | 0.9382 | 0.9994 | 0.9982 | 0.9083 | 0.9684 | 0.8758 | 0.9815 | 0.9847 | 0.9873 | 0.9110 | 0.9537 | 0.8858 | 0.9819 | 0.9460 | 9.0 |
| DL2 | 0.8081 | 0.9379 | 0.9989 | 0.9979 | 0.9025 | 0.9682 | 0.8745 | 0.9813 | 0.9851 | 0.9852 | 0.9033 | 0.9550 | 0.8783 | 0.9853 | 0.9401 | 10.2 |
| DL3 | 0.8717 | 0.9401 | 0.9990 | 0.9983 | 0.9097 | 0.9647 | 0.8694 | 0.9813 | 0.9861 | 0.9854 | 0.9038 | | 0.8785 | 0.9833 | 0.9456 | 9.3 |
| S_D | 0.8864 | 0.9415 | 0.9997 | 0.9982 | 0.9165 | 0.9687 | 0.8807 | | 0.9871 | 0.9885 | 0.9118 | 0.9594 | 0.8894 | 0.9848 | 0.9497 | 6.3 |
| E1 | 0.9045 | 0.9345 | 0.9994 | 0.9989 | 0.9137 | 0.9726 | 0.8884 | 0.9794 | 0.9899 | 0.9931 | 0.9202 | 0.9469 | 0.8860 | 0.9807 | 0.9506 | 6.3 |
| E2 | 0.9065 | 0.9441 | 0.9995 | | 0.9168 | | 0.8942 | 0.9793 | 0.9883 | 0.9932 | 0.9219 | 0.9574 | 0.8910 | 0.9834 | 0.9535 | 4.2 |
| E3 | | 0.9467 | 0.9999 | 0.9990 | 0.9238 | 0.9716 | 0.8968 | 0.9805 | 0.9891 | 0.9927 | 0.9228 | 0.9581 | 0.8981 | 0.9867 | 0.9554 | 3.3 |
| E4 | 0.9097 | | | 0.9990 | | 0.9714 | | 0.9825 | | 0.9926 | | 0.9635 | | 0.9870 | | |
Comparisons between all the pairs of tested methods.
| | SVM | RS-SVM | GPC | RS_AB | RS_RB | DL1 | DL2 | DL3 | S_D | E1 | E2 | E3 | E4 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SVM | — | L | ND | L | ND | ND | ND | ND | ND | L | L | L | L |
| RS-SVM | — | — | ND | L | ND | ND | ND | ND | ND | L | L | L | L |
| GPC | — | — | — | ND | ND | ND | W | W | ND | L | L | L | L |
| RS_AB | — | — | — | — | ND | W | W | W | ND | ND | L | L | L |
| RS_RB | — | — | — | — | — | W | W | ND | ND | L | L | L | L |
| DL1 | — | — | — | — | — | — | ND | ND | L | L | L | L | L |
| DL2 | — | — | — | — | — | — | — | ND | L | L | L | L | L |
| DL3 | — | — | — | — | — | — | — | — | ND | L | L | L | L |
| S_D | — | — | — | — | — | — | — | — | — | ND | L | L | L |
| E1 | — | — | — | — | — | — | — | — | — | — | L | L | L |
| E2 | — | — | — | — | — | — | — | — | — | — | — | L | L |
| E3 | — | — | — | — | — | — | — | — | — | — | — | — | L |
Performance (AUC) on the different data mining datasets.
| | BR | HE | PI | SP | SO | IO | LI | HA | VO | AU | TR | Av | RA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SVM | 0.9941 | 0.8809 | 0.824 | 0.9708 | 0.9517 | 0.9814 | 0.7558 | | 0.9855 | 0.9164 | 0.714 | 0.8796 | 7.3077 |
| RS-SVM | 0.9931 | 0.9076 | 0.8221 | 0.9771 | | 0.9795 | 0.7411 | 0.6399 | 0.9853 | 0.9221 | 0.6931 | 0.8745 | 8.6923 |
| GPC | 0.9924 | 0.9024 | 0.827 | 0.979 | 0.9409 | 0.9713 | 0.729 | 0.6804 | 0.9882 | 0.9267 | 0.7295 | 0.8788 | 8.000 |
| RS_AB | 0.991 | 0.9101 | 0.8229 | | 0.9371 | 0.9788 | 0.7581 | 0.6727 | 0.9887 | 0.9313 | 0.735 | 0.8831 | 7.000 |
| RS_RB | 0.9925 | | 0.8208 | 0.9873 | 0.9334 | | 0.7664 | 0.6071 | 0.9884 | 0.9326 | 0.674 | 0.8731 | 7.3846 |
| DL1 | | 0.8852 | 0.8252 | 0.966 | 0.8794 | 0.9222 | 0.7541 | 0.6751 | 0.9795 | 0.9155 | 0.7338 | 0.8664 | 8.7692 |
| DL2 | 0.9941 | 0.8754 | 0.8149 | 0.9691 | 0.8789 | 0.9242 | 0.7478 | 0.6679 | 0.9808 | 0.9088 | 0.7318 | 0.8631 | 10.3077 |
| DL3 | | 0.8941 | 0.8193 | 0.9684 | 0.8501 | 0.9022 | 0.6966 | 0.6537 | 0.9787 | 0.9154 | 0.7351 | 0.8553 | 10.2308 |
| S_D | 0.9942 | 0.883 | 0.8238 | 0.9683 | 0.8781 | 0.9297 | 0.751 | 0.6772 | 0.9813 | 0.9186 | 0.7357 | 0.8674 | 8.6154 |
| E1 | 0.992 | 0.9096 | 0.8277 | 0.9856 | 0.9426 | 0.9772 | 0.7532 | 0.6868 | | 0.9331 | | 0.885 | 6.000 |
| E2 | 0.9924 | 0.9124 | 0.8285 | | 0.9426 | 0.9817 | | 0.6724 | 0.9897 | | 0.7257 | 0.8856 | 5.3846 |
| E3 | 0.9933 | 0.9141 | 0.8288 | 0.9873 | 0.9508 | 0.9819 | 0.7723 | 0.6726 | 0.989 | 0.9343 | 0.7258 | | |
| E4 | 0.9934 | 0.9113 | | 0.9862 | 0.942 | 0.9805 | 0.7717 | 0.6794 | 0.9895 | 0.9339 | 0.7297 | 0.8861 | 5.2308 |
Comparisons between all the pairs of methods tested in Table 6.
| | SVM | RS-SVM | GPC | RS_AB | RS_RB | DL1 | DL2 | DL3 | S_D | E1 | E2 | E3 | E4 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SVM | — | ND | ND | ND | ND | W | W | W | W | ND | L | L | L |
| RS-SVM | — | — | ND | ND | W | ND | ND | W | ND | L | L | L | L |
| GPC | — | — | — | L | ND | W | W | W | W | L | L | L | L |
| RS_AB | — | — | — | — | ND | W | W | W | W | ND | L | L | L |
| RS_RB | — | — | — | — | — | ND | ND | W | ND | ND | L | L | L |
| DL1 | — | — | — | — | — | — | W | W | ND | L | L | L | L |
| DL2 | — | — | — | — | — | — | — | W | ND | L | L | L | L |
| DL3 | — | — | — | — | — | — | — | — | L | L | L | L | L |
| S_D | — | — | — | — | — | — | — | — | — | L | L | L | L |
| E1 | — | — | — | — | — | — | — | — | — | — | ND | ND | ND |
| E2 | — | — | — | — | — | — | — | — | — | — | — | ND | ND |
| E3 | — | — | — | — | — | — | — | — | — | — | — | — | ND |