Abstract
One-class classification (OCC) addresses classification problems in which the training data contain points from the target class only. In this paper, we present a one-class classification algorithm, One-Class Classification by Ensembles of Random Plane (OCCERP), that uses random planes to address OCC problems. OCCERP creates many random planes, each with a pivot point. A data point is projected onto a random plane, and the distance of the projection from the pivot point is used to compute the outlier score of the data point. The outlier scores computed for a point across the random planes are combined to give its final outlier score. The OCCERP algorithm was compared extensively with state-of-the-art OCC algorithms on several datasets to show the effectiveness of the proposed approach. The effect of the ensemble size on the performance of the OCCERP algorithm is also studied.
Year: 2022 PMID: 35832244 PMCID: PMC9273347 DOI: 10.1155/2022/4264393
Source DB: PubMed Journal: Comput Intell Neurosci
Figure 1: A plane H created from two data points R and S. This random plane passes through the midpoint Z of R and S, and its normal is the line through these two data points.
Figure 2: P1′ and O1′ are the projections of P1 and O1 onto the plane H. Z is the pivot point. The distance P1′Z is the outlier score of point P1′, and the distance O1′Z is the outlier score of point O1′. The outlier score of the positive point is larger.
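The geometry in Figures 1 and 2 can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the function names `fit_random_planes` and `outlier_scores` are hypothetical, the pairs R and S are assumed to be drawn uniformly at random from the training data, the raw distance to the pivot is used directly as the per-plane score, and the per-plane scores are combined by a simple mean (the abstract only says they are "combined").

```python
import numpy as np

def fit_random_planes(X_train, n_planes=500, seed=0):
    """Build random planes, each from a pair of distinct training points R and S.

    Each plane passes through the midpoint Z of R and S (the pivot) and has
    the direction R - S as its normal, as in Figure 1.
    """
    X_train = np.asarray(X_train, dtype=float)
    n = X_train.shape[0]
    if n < 2 or np.all(X_train == X_train[0]):
        raise ValueError("need at least two distinct training points")
    rng = np.random.default_rng(seed)
    pivots, normals = [], []
    while len(pivots) < n_planes:
        i, j = rng.choice(n, size=2, replace=False)
        r, s = X_train[i], X_train[j]
        normal = r - s
        length = np.linalg.norm(normal)
        if length == 0.0:  # duplicate points define no plane; draw again
            continue
        pivots.append((r + s) / 2.0)
        normals.append(normal / length)
    return np.array(pivots), np.array(normals)

def outlier_scores(X, pivots, normals):
    """Mean, over all planes, of the distance between a projected point and the pivot.

    Assumption: averaging is used to combine the per-plane scores.
    """
    X = np.asarray(X, dtype=float)
    scores = np.zeros(X.shape[0])
    for z, n_hat in zip(pivots, normals):
        diff = X - z                              # vectors from pivot Z to each point
        along = diff @ n_hat                      # signed distance of each point from the plane
        in_plane = diff - np.outer(along, n_hat)  # projected point minus Z
        scores += np.linalg.norm(in_plane, axis=1)
    return scores / len(pivots)
```

In use, the planes would be built from the target-class training data only, and `outlier_scores` would then be applied to unseen points. How the scores are thresholded, or which direction counts as outlying, is left open here.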
Table 1: Information on the datasets that were taken from [28, 29]. The datasets presented before the separating line in the table are taken from [28] whereas the datasets presented after the separating line are taken from [29].
| Dataset | Number of features | Number of data points in negative class | Number of data points in positive class |
|---|---|---|---|
| Pima | 8 | 500 | 268 |
| segment0 | 19 | 1979 | 329 |
| yeast1 | 8 | 1055 | 429 |
| yeast3 | 8 | 1321 | 163 |
| yeast4 | 8 | 1433 | 51 |
| Winequality-red-4 | 11 | 1546 | 53 |
| Winequality-red-8_vs_6 | 11 | 638 | 18 |
| Winequality-white-3_vs_7 | 11 | 880 | 20 |
| Aloi-unsupervised | 27 | 48492 | 1508 |
| Annthyroid-unsupervised | 21 | 6666 | 250 |
| Breast-cancer-unsupervised | 30 | 357 | 10 |
| Letter-unsupervised | 32 | 1500 | 100 |
| Satellite-unsupervised | 36 | 5025 | 75 |
| Shuttle-unsupervised | 9 | 45586 | 878 |
| Speech-unsupervised | 400 | 3625 | 61 |
| Pen-local-unsupervised | 16 | 6714 | 10 |
| Pen-global-unsupervised | 16 | 719 | 90 |
Table 2: Information on the domain datasets.
| Dataset | Number of features | Number of data points in negative class | Number of data points in positive class |
|---|---|---|---|
| MF | 31 | 488 | 5430 |
| COV | 31 | 908 | 12392 |
| DLR | 31 | 84 | 26576 |
| Class-level-kc1-defectornot | 94 | 60 | 85 |
| kc2 | 21 | 105 | 415 |
| kc1 | 21 | 326 | 1783 |
| cm1 | 21 | 49 | 449 |
| Datatrieve | 8 | 11 | 119 |
| pc1 | 21 | 77 | 1032 |
| Class-level-kc1-defect-count-ranking | 94 | 8 | 137 |
Average AUCROC of various OCC algorithms against the OCCERP (500) algorithm on various datasets [28, 29] presented in Table 1. Bold numbers indicate the best performance.
| Dataset | IF | LOF | OCSVM | Autoencoder | OCCERP (500) |
|---|---|---|---|---|---|
| Pima | 0.731 | 0.709 | 0.700 | 0.648 | |
| segment0 | 0.474 | 0.815 | 0.294 | 0.342 | |
| Winequality-red-4 | 0.584 | 0.651 | 0.615 | 0.609 | |
| Winequality-red-8_vs_6 | 0.667 | 0.592 | 0.647 | 0.681 | |
| Winequality-white-3_vs_7 | 0.849 | 0.866 | 0.853 | 0.851 | |
| yeast1 | 0.543 | | 0.548 | 0.534 | 0.589 |
| yeast3 | 0.673 | | 0.725 | 0.728 | 0.788 |
| yeast4 | 0.734 | 0.665 | 0.733 | | |
| Aloi-unsupervised | 0.539 | | 0.549 | 0.549 | 0.556 |
| Annthyroid-unsupervised | 0.737 | | 0.727 | 0.702 | 0.766 |
| Breast-cancer-unsupervised | 0.982 | | 0.985 | 0.982 | |
| Letter-unsupervised | 0.627 | 0.862 | 0.615 | 0.526 | |
| Satellite-unsupervised | 0.949 | | 0.937 | 0.895 | |
| Shuttle-unsupervised | 0.995 | | 0.996 | 0.993 | |
| Pen-global-unsupervised | 0.947 | 0.957 | 0.972 | 0.869 | |
| Pen-local-unsupervised | 0.778 | | 0.589 | 0.440 | 0.966 |
| Best performance | 0 | 8 | 0 | 1 | 11 |
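For reference, an AUCROC value of the kind reported in these tables can be computed from outlier scores with the rank-based definition: the probability that a randomly chosen point of one class receives a higher score than a randomly chosen point of the other, with ties counted as one half. The sketch below is illustrative only. The function name `auc_roc` is hypothetical, label 1 is assumed to mark the points expected to receive the higher scores, and the evaluation protocol behind the averages in the tables is not given in this record.

```python
import numpy as np

def auc_roc(labels, scores):
    """Rank-based AUCROC; ties between a positive and a negative count as 0.5.

    Builds a full positive-by-negative comparison matrix, so it is meant
    for small illustrative inputs rather than large datasets.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one point of each class")
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)

# An average over repeated runs is then a plain mean of the per-run values.
run_aucs = [auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
            auc_roc([0, 0, 1, 1], [0.1, 0.2, 0.3, 0.4])]
average_auc = float(np.mean(run_aucs))
```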
Average AUCROC of various OCC algorithms against the OCCERP (500) algorithm on various domain datasets presented in Table 2. Bold numbers indicate the best performance.
| Dataset | IF | LOF | OCSVM | Autoencoder | OCCERP (500) |
|---|---|---|---|---|---|
| MF | 0.969 | 0.890 | 0.978 | 0.941 | |
| COV | 0.831 | | 0.804 | 0.769 | 0.883 |
| DLR | 0.947 | 0.988 | 0.955 | 0.978 | |
| Class-level-kc1-defectornot | 0.797 | 0.762 | 0.705 | 0.607 | |
| kc2 | | 0.632 | 0.806 | 0.754 | 0.827 |
| kc1 | 0.792 | 0.634 | 0.708 | 0.634 | |
| cm1 | 0.704 | 0.661 | 0.636 | 0.518 | |
| Datatrieve | 0.728 | 0.690 | 0.692 | 0.572 | 0.753 |
| pc1 | 0.697 | 0.689 | 0.676 | 0.599 | |
| Class-level-kc1-defect-count-ranking | | 0.884 | 0.864 | 0.780 | 0.891 |
| Best performance | 2 | 1 | 0 | 0 | 7 |
Wins/losses/ties of OCCERP (500) against other OCC algorithms. A tie is split evenly between the two algorithms.
| | IF | LOF | OCSVM | Autoencoder |
|---|---|---|---|---|
| Wins/Losses/Ties | 24/2/0 | 17/6/3 | 26/0/0 | 25/0/1 |
| Effective number of wins | 24 | 18 | 26 | 25 |
Average AUCROC of OCCERP(500) and OCCERP(200) on various datasets [28, 29] presented in Table 1. Bold numbers indicate the best performance.
| Dataset | OCCERP (200) | OCCERP (500) |
|---|---|---|
| Pima | 0.736 | |
| segment0 | 0.851 | |
| Winequality-red-4 | 0.592 | |
| Winequality-red-8_vs_6 | 0.686 | |
| Winequality-white-3_vs_7 | 0.901 | |
| yeast1 | 0.572 | 0.589 |
| yeast3 | 0.755 | 0.788 |
| yeast4 | 0.723 | |
| Aloi-unsupervised | 0.554 | 0.556 |
| Annthyroid-unsupervised | 0.662 | 0.766 |
| Breast-cancer-unsupervised | 0.953 | |
| Letter-unsupervised | 0.857 | |
| Satellite-unsupervised | 0.963 | |
| Shuttle-unsupervised | 0.997 | |
| Pen-global-unsupervised | 0.996 | |
| Pen-local-unsupervised | 0.953 | 0.966 |
Average AUCROC of OCCERP(500) and OCCERP(200) on various domain datasets presented in Table 2. Bold numbers indicate the best performance.
| Dataset | OCCERP (200) | OCCERP (500) |
|---|---|---|
| MF | 0.991 | |
| COV | 0.852 | 0.883 |
| DLR | 0.978 | |
| Class-level-kc1-defectornot | 0.758 | |
| kc2 | | 0.827 |
| kc1 | 0.791 | |
| cm1 | 0.757 | |
| Datatrieve | | 0.753 |
| pc1 | 0.717 | |
| Class-level-kc1-defect-count-ranking | 0.889 | 0.891 |