Abstract
The global k-means algorithm is an incremental approach to clustering that dynamically adds one cluster center at a time through a deterministic global search procedure from suitable initial positions, and employs k-means to minimize the sum of the intra-cluster variances. However, the global k-means algorithm sometimes produces singleton clusters, and its initial positions are sometimes poor; after a bad initialization, k-means can easily converge to a poor local optimum. In this paper, we first modify the global k-means algorithm to eliminate singleton clusters, and then apply the MinMax k-means clustering error method to the global k-means algorithm to overcome the effect of bad initialization, yielding the proposed global Minmax k-means algorithm. The proposed clustering method is tested on several popular data sets and compared with the k-means algorithm, the global k-means algorithm and the MinMax k-means algorithm. The experimental results show that the proposed algorithm outperforms the other algorithms considered in the paper.
Keywords: Clustering; Global k-means; MinMax k-means; k-Means
Year: 2016 PMID: 27733969 PMCID: PMC5039165 DOI: 10.1186/s40064-016-3329-4
Source DB: PubMed Journal: Springerplus ISSN: 2193-1801
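The incremental procedure summarized in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration of the basic global k-means idea only, under stated assumptions: it starts from the centroid of all points, tries every data point as the candidate position for each new center, and keeps the k-means result with the lowest sum of intra-cluster squared distances. It does not include the singleton elimination or the MinMax weighting proposed in the paper, and the function names `lloyd` and `global_kmeans` are hypothetical.

```python
import numpy as np


def lloyd(X, centers, n_iter=100):
    """Plain k-means (Lloyd) refinement starting from the given centers."""
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        # Squared distances between every point and every center: (n, k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = X[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old center
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    error = d2[np.arange(len(X)), labels].sum()
    return centers, labels, error


def global_kmeans(X, k):
    """Add one center at a time; assumed sketch, not the paper's exact method."""
    X = np.asarray(X, dtype=float)
    centers = X.mean(axis=0, keepdims=True)  # best single-cluster solution
    for _ in range(2, k + 1):
        best = None
        for x in X:  # every data point is a candidate for the new center
            result = lloyd(X, np.vstack([centers, x]))
            if best is None or result[2] < best[2]:
                best = result
        centers = best[0]
    return lloyd(X, centers)
```

Calling `global_kmeans(X, k)` returns the final centers, the cluster label of each point, and the total clustering error. The exhaustive candidate search costs one k-means run per data point for each added center, so this sketch is only practical for small data sets.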
Comparative results
| Method | Clusters | | Number of each cluster |
|---|---|---|---|
| Global | 4 | 1.0e+04 | (25, 14, 1, 1) |
| Modified global | 4 | 1.0e+04 | (12, 14, 13, 2) |
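The singleton clusters visible in the cluster sizes above can be detected with a short check such as the one below. This is only an illustrative sketch: the helper names `cluster_sizes` and `singleton_clusters` are hypothetical, and the label vectors are synthetic, built solely to reproduce the cluster sizes listed in the table. It shows detection only, not the paper's procedure for eliminating singletons.

```python
import numpy as np


def cluster_sizes(labels, k):
    """Number of points assigned to each of the k clusters."""
    return np.bincount(labels, minlength=k)


def singleton_clusters(labels, k):
    """Indices of clusters that contain exactly one point."""
    sizes = cluster_sizes(labels, k)
    return [j for j, size in enumerate(sizes) if size == 1]


# Synthetic labels that reproduce the sizes reported in the table.
labels_global = np.repeat(np.arange(4), [25, 14, 1, 1])
labels_modified = np.repeat(np.arange(4), [12, 14, 13, 2])

print(singleton_clusters(labels_global, 4))    # [2, 3]
print(singleton_clusters(labels_modified, 4))  # []
```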
Fig. 1 Example a is the initial point used by the global algorithm, and it is clearly a bad initial point. Example b shows a better initial point
Comparative results on data set
| Method | | |
|---|---|---|
| | 28.4856 | 96.6753 |
| Global | 25.3388 | 93.7457 |
| MinMax | 25.3388 | 93.7457 |
| MinMax | 25.3388 | 93.7457 |
| MinMax | 25.3388 | 93.7457 |
| Global Minmax | 25.3388 | 93.7457 |
| Global Minmax | 25.3388 | 93.7457 |
| Global Minmax | 25.3388 | 93.7457 |
Comparative results on data set
| Method | | |
|---|---|---|
| | 52.0518 | 197.4535 |
| Global | 52.0518 | 197.4535 |
| MinMax | 52.0518 | 197.4535 |
| MinMax | 52.0518 | 197.4535 |
| MinMax | 52.0518 | 197.4535 |
| Global Minmax | 52.0518 | 197.4535 |
| Global Minmax | 52.0518 | 197.4535 |
| Global Minmax | 52.0518 | 197.4535 |
Comparative results on the Iris data set
| Method | | |
|---|---|---|
| | 67.3007 | 147.2335 |
| Global | 57.1672 | 139.9622 |
| MinMax | 47.4502 | 138.8884 |
| MinMax | 47.4502 | 138.8884 |
| MinMax | 47.4502 | 138.8884 |
| Global Minmax | 47.4502 | 138.8884 |
| Global Minmax | 47.4502 | 138.8884 |
| Global Minmax | 47.4502 | 138.8884 |
Fig. 2 The sketch of four typical synthetic data sets: a, b, c, d
Comparative results on data set
| Method | | |
|---|---|---|
| | 90.8431 | 329.4181 |
| Global | 90.8431 | |
| MinMax | | 329.6677 |
| MinMax | | 329.6677 |
| MinMax | | 329.6352 |
| MinMax | 88.4824 | 329.4766 |
| MinMax | 88.4824 | 329.4766 |
| MinMax | 88.5052 | 329.4761 |
| MinMax | 89.6205 | 329.4349 |
| MinMax | 89.5976 | 329.4351 |
| MinMax | 89.6203 | 329.4346 |
| MinMax | 90.8430 | 329.4181 |
| Global Minmax | | 329.6677 |
| Global Minmax | | 329.6677 |
| Global Minmax | | 329.6352 |
| Global Minmax | | 329.5055 |
| Global Minmax | | 329.5055 |
| Global Minmax | | 329.5055 |
| Global Minmax | 88.5673 | 329.4616 |
| Global Minmax | 88.5673 | 329.4616 |
| Global Minmax | 88.5673 | 329.4616 |
| Global Minmax | 90.8431 | |
Italic values indicate the best results in all the present results
Comparative results on data set
| Method | | |
|---|---|---|
| | 68.0815 | 110.6536 |
| Global | 62.5878 | |
| MinMax | | 109.0927 |
| MinMax | | 109.0927 |
| MinMax | 54.0464 | 109.1226 |
| MinMax | 57.3660 | 106.6937 |
| MinMax | 57.3660 | 106.6937 |
| MinMax | 57.3660 | 106.6937 |
| MinMax | 61.0903 | 105.6490 |
| MinMax | 61.0903 | 105.6490 |
| MinMax | 61.0903 | 105.6490 |
| MinMax | 68.0815 | 110.6536 |
| Global Minmax | | 109.0927 |
| Global Minmax | 54.0464 | 109.1226 |
| Global Minmax | 54.0464 | 109.1226 |
| Global Minmax | 57.3660 | 106.6937 |
| Global Minmax | 57.3660 | 106.6937 |
| Global Minmax | 57.3660 | 106.6937 |
| Global Minmax | 61.0903 | 105.6490 |
| Global Minmax | 61.0903 | 105.6490 |
| Global Minmax | 61.0903 | 105.6490 |
| Global Minmax | 62.5878 | |
Italic values indicate the best results in all the present results
The brief description of the real data sets
| Data set | Instances | Attributes | Classes | Balanced |
|---|---|---|---|---|
| Coil2 | 216 | 1000 | 3 | Yes |
| Iris | 150 | 4 | 3 | Yes |
| Seeds | 210 | 7 | 3 | Yes |
| Yeast | 1350 | 8 | 5 | No |
| Pendigits | 10,992 | 16 | 10 | Almost |
| User knowledge modeling | 403 | 6 | 4 | No |
Comparative results on the Coil2 data set
| Method | | |
|---|---|---|
| | 79.0141 | 155.6635 |
| Global | 105.2087 | 154.8112 |
| MinMax | 58.7115 | 154.6850 |
| MinMax | 57.1880 | 155.1839 |
| MinMax | 58.7317 | 154.5164 |
| MinMax | 58.8274 | 154.5812 |
| MinMax | 58.8519 | 154.5189 |
| MinMax | 58.8205 | 154.4097 |
| MinMax | 58.9824 | 154.5769 |
| MinMax | 58.9544 | 154.5170 |
| MinMax | 58.9147 | 154.4083 |
| MinMax | 59.1028 | |
| MinMax | 68.6188 | 154.6814 |
| Global Minmax | | 157.7988 |
| Global Minmax | | 157.7988 |
| Global Minmax | 57.7296 | 157.4811 |
| Global Minmax | 60.5913 | 157.1706 |
| Global Minmax | 60.8388 | 157.3204 |
| Global Minmax | 60.8388 | 157.3204 |
| Global Minmax | 102.5301 | 154.7850 |
| Global Minmax | 102.5301 | 154.7850 |
| Global Minmax | 102.5301 | 154.7850 |
| Global Minmax | 103.4904 | 154.7737 |
| Global Minmax | 103.4904 | 154.7737 |
Italic values indicate the best results in all the present results
Comparative results on the Seeds data set
| Method | | |
|---|---|---|
| | 151.0572 | 428.7954 |
| Global | | |
| MinMax | | |
| MinMax | 144.6353 | 428.7769 |
| MinMax | 144.6353 | 428.7769 |
| MinMax | 145.3806 | 428.6408 |
| MinMax | 145.3806 | 428.6408 |
| MinMax | 145.3806 | 428.6408 |
| MinMax | 145.3806 | 428.6408 |
| MinMax | 145.3806 | 428.6408 |
| MinMax | 145.3806 | 428.6408 |
| Global Minmax | | |
| Global Minmax | 144.6880 | 429.0006 |
| Global Minmax | 144.6880 | 429.0006 |
| Global Minmax | 146.4214 | 428.6840 |
| Global Minmax | 146.4214 | 428.6840 |
| Global Minmax | 146.4214 | 428.6840 |
| Global Minmax | 146.4214 | 428.6840 |
| Global Minmax | 146.4214 | 428.6840 |
| Global Minmax | 146.4214 | 428.6840 |
Italic values indicate the best results in all the present results
Comparative results on the Yeast data set
| Method | | |
|---|---|---|
| | 13.5325 | 51.4444 |
| Global | 13.4129 | |
| MinMax | 14.2165 | 52.7943 |
| MinMax | 22.6182 | 59.2278 |
| MinMax | 12.6324 | 51.7455 |
| MinMax | 11.1771 | 51.4789 |
| MinMax | 17.5689 | 54.6692 |
| MinMax | 12.6495 | 51.7366 |
| MinMax | 11.3333 | 51.3884 |
| MinMax | 11.6825 | 51.4354 |
| MinMax | 12.5912 | 51.7159 |
| MinMax | 12.6833 | 51.4565 |
| MinMax | 12.6655 | 51.4575 |
| MinMax | 12.6351 | 51.4379 |
| Global Minmax | 11.1427 | 51.3872 |
| Global Minmax | 21.2196 | 64.6526 |
| Global Minmax | 17.1350 | 53.5700 |
| Global Minmax | 11.3387 | 51.3334 |
| Global Minmax | | 51.3190 |
| Global Minmax | 22.5238 | 53.2086 |
| Global Minmax | 11.8178 | 51.2643 |
| Global Minmax | 11.8837 | 51.2450 |
| Global Minmax | 22.5238 | 53.2086 |
| Global Minmax | 12.2198 | 51.1261 |
| Global Minmax | 12.2198 | 51.1261 |
| Global Minmax | 12.1166 | 51.1379 |
| Global Minmax | 16.0342 | 53.6899 |
| Global Minmax | 16.0342 | 53.6899 |
| Global Minmax | 16.0179 | 53.6955 |
Italic values indicate the best results in all the present results
Comparative results on the Pendigits data set
| Method | | |
|---|---|---|
| | 11,540 | 60,963 |
| Global | 12,549 | |
| MinMax | 8510 | 62,094 |
| MinMax | 16,826 | 71,546 |
| MinMax | 7744 | 61,116 |
| MinMax | 7609 | 61,184 |
| MinMax | 10,394 | 63,285 |
| MinMax | 7740 | 61,100 |
| MinMax | 7948 | 60,993 |
| MinMax | 7918 | 60,993 |
| MinMax | 7924 | 60,994 |
| MinMax | 8854 | 60,825 |
| MinMax | 8824 | 60,823 |
| MinMax | 8854 | 60,825 |
| MinMax | 9630 | 60,753 |
| MinMax | 9611 | 60,759 |
| MinMax | 9630 | 60,753 |
| MinMax | 10,920 | 60,805 |
| MinMax | 10,919 | 60,805 |
| MinMax | 10,915 | 60,805 |
| MinMax | 11,539 | 60,962 |
| Global Minmax | | 60,394 |
| Global Minmax | 19,143 | 70,402 |
| Global Minmax | 6891 | 60,234 |
| Global Minmax | 6853 | 60,305 |
| Global Minmax | 6828 | 60,300 |
| Global Minmax | 6891 | 60,234 |
| Global Minmax | 6994 | 60,181 |
| Global Minmax | 6994 | 60,181 |
| Global Minmax | 6994 | 60,179 |
| Global Minmax | 10,860 | 59,918 |
| Global Minmax | 10,860 | 59,918 |
| Global Minmax | 10,860 | 59,918 |
| Global Minmax | 11,601 | 59,710 |
| Global Minmax | 12,330 | 59,645 |
| Global Minmax | 12,523 | |
Italic values indicate the best results in all the present results
Comparative results on the user knowledge modeling data set
| Method | | |
|---|---|---|
| | 13.9469 | 41.6798 |
| Global | 16.7506 | 41.2257 |
| MinMax | 11.1298 | 41.5906 |
| MinMax | 12.2885 | 42.2599 |
| MinMax | 11.3447 | 41.6220 |
| MinMax | 11.4587 | 41.5912 |
| MinMax | 11.4362 | 41.5951 |
| MinMax | 11.4776 | 41.5757 |
| MinMax | 11.8978 | 41.5361 |
| MinMax | 11.8994 | 41.5463 |
| MinMax | 11.9395 | 41.5356 |
| MinMax | 12.5516 | 41.5503 |
| MinMax | 12.5544 | 41.5626 |
| MinMax | 12.5672 | 41.5508 |
| Global Minmax | | 41.2507 |
| Global Minmax | | 41.2507 |
| Global Minmax | | 41.2507 |
| Global Minmax | 11.0574 | 41.1979 |
| Global Minmax | 11.0574 | 41.1979 |
| Global Minmax | 11.0574 | 41.1979 |
| Global Minmax | 11.6460 | 41.0866 |
| Global Minmax | 11.6460 | 41.0866 |
| Global Minmax | 11.6460 | 41.0866 |
| Global Minmax | 11.8169 | |
| Global Minmax | 11.8169 | |
| Global Minmax | 11.8169 | |
| Global Minmax | 11.8169 | |
| Global Minmax | 14.9083 | 41.4720 |
Italic values indicate the best results in all the present results