Tribhuvan Singh1, Nitin Saxena2, Manju Khurana2, Dilbag Singh3, Mohamed Abdalla4,5, Hammam Alshazly6.
Abstract
The k-means algorithm is a widely accepted clustering method. However, its performance depends heavily on the initial cluster centers, and because of its weak exploration capability it is easily trapped in local optima. Recently, a new metaheuristic called the Moth Flame Optimizer (MFO) was proposed to handle complex problems. MFO simulates the navigation mechanism that moths use in nature, known as transverse orientation. In various research works, the performance of MFO has been found quite satisfactory. This paper suggests a novel heuristic approach based on MFO to solve data clustering problems. To validate the competitiveness of the proposed approach, various experiments were conducted using Shape and UCI benchmark datasets. The proposed approach is compared with five state-of-the-art algorithms over twelve datasets. The mean performance of the proposed algorithm is superior on 10 datasets and comparable on the remaining two. The analysis of the experimental results confirms the efficacy of the suggested approach.
Keywords: data clustering; data mining; k-means; meta-heuristic; moth flame optimization
Year: 2021 PMID: 34198501 PMCID: PMC8231885 DOI: 10.3390/s21124086
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Shape datasets.
| Name | #Instances | #Features | #Classes | Year of Publication | Constructor | Dataset Objective |
|---|---|---|---|---|---|---|
| Flame | 240 | 2 | 2 | 2007 | L. Fu and E. Medico | DNA microarray data |
| Jain | 373 | 2 | 2 | 2005 | A. Jain and M. Law | Consensus function |
| R15 | 600 | 2 | 15 | 2002 | C.J. Veenman et al. | Maximum variance clustering |
| D31 | 3100 | 2 | 31 | 2002 | C.J. Veenman et al. | Maximum variance clustering |
| Aggregation | 788 | 2 | 7 | 2007 | A. Gionis et al. | Aggregating set of clusterings |
| Compound | 399 | 2 | 6 | 1971 | C.T. Zahn | Detecting and describing gestalt clusters |
| Pathbased | 300 | 2 | 3 | 2008 | H. Chang and D.Y. Yeung | Robust path-based spectral clustering |
| Spiral | 312 | 2 | 3 | 2008 | H. Chang and D.Y. Yeung | Robust path-based spectral clustering |
UCI datasets.
| Name | #Instances | #Features | #Classes | Year of Publication | Constructor | Dataset Objective |
|---|---|---|---|---|---|---|
| Iris | 150 | 4 | 3 | 1936 | R.A. Fisher | To predict class of iris plant |
| Glass | 214 | 9 | 7 | 1987 | B. German | To define the glass in terms of their oxide content |
| Yeast | 1484 | 8 | 10 | 1991 | Kenta Nakai | Predicting the cellular localization sites of proteins |
| Wine | 178 | 13 | 3 | 1988 | M. Forina et al. | Using chemical analysis to determine the origin of wines |
Comparison of objective values of MFO, BHA, MVO, HHO, GWO, k-means algorithms.
| Dataset | Criteria | MFO | BHA | MVO | HHO | GWO | K-Means |
|---|---|---|---|---|---|---|---|
| Flame | Best | 770.09978 | 769.9661518 | 770.4754577 | 769.9927543 | 770.132897 | 778.2235737 |
| Flame | Worst | 790.112976 | 799.8706082 | 883.7379944 | 881.5972455 | 862.9405109 | 882.2962778 |
| Flame | Mean | 770.312682 | 770.0151324 | 820.6166847 | 773.1715904 | 774.9724323 | 825.0039174 |
| Flame | Std | 0.934345 | 0.048796151 | 25.9021804 | 2.197184147 | 3.456581526 | 32.9362996 |
| Jain | Best | 2574.2421 | 2574.241619 | 2587.729382 | 2574.24163 | 2574.596821 | 2649.716145 |
| Jain | Worst | 2895.455517 | 2872.057675 | 3317.743133 | 3243.435326 | 3351.013971 | 3348.696543 |
| Jain | Mean | 2578.583781 | 2575.625939 | 2783.852076 | 2609.24115 | 2604.748372 | 2898.773998 |
| Jain | Std | 5.434534 | 1.216412722 | 152.227768 | 50.99513097 | 46.7066178 | 190.2690739 |
| R15 | Best | 281.130101 | 587.7144266 | 692.2279482 | 518.9798792 | 555.9717927 | 766.9066841 |
| R15 | Worst | 838.491757 | 882.9244343 | 914.6624615 | 912.7932725 | 933.2892028 | 901.9060829 |
| R15 | Mean | 334.6612324 | 686.732183 | 830.124701 | 680.9312354 | 676.6880723 | 839.092725 |
| R15 | Std | 23.321267 | 34.76318932 | 59.42568736 | 54.95447158 | 56.28049001 | 38.30256467 |
| D31 | Best | 3736.584896 | 5242.218307 | 5896.654083 | 4882.938027 | 5136.104753 | 5894.744809 |
| D31 | Worst | 6637.059685 | 6420.08449 | 6606.096812 | 6675.235841 | 6768.271523 | 6706.157344 |
| D31 | Mean | 4133.73861 | 5658.97124 | 6215.426144 | 5440.848598 | 5600.169046 | 6411.356336 |
| D31 | Std | 109.343697 | 121.1404795 | 172.050326 | 210.6444026 | 213.5167965 | 201.3093209 |
| Aggregation | Best | 2715.302689 | 2953.63615 | 3290.011686 | 2800.375925 | 2876.078555 | 3309.472801 |
| Aggregation | Worst | 3718.291098 | 3840.375256 | 3939.087978 | 3952.609942 | 3959.489207 | 3995.872968 |
| Aggregation | Mean | 2789.291202 | 3158.484101 | 3672.354272 | 3080.247639 | 3112.108684 | 3731.786921 |
| Aggregation | Std | 2.53496107 | 89.73403431 | 165.7279226 | 146.4400151 | 159.5026523 | 183.3215657 |
| Compound | Best | 1060.674781 | 1150.328041 | 1279.985246 | 1104.072942 | 1120.609246 | 1361.339487 |
| Compound | Worst | 1541.948974 | 1575.296587 | 1604.72384 | 1664.861515 | 1654.681553 | 1678.228393 |
| Compound | Mean | 1094.9423 | 1248.529445 | 1423.281446 | 1246.770747 | 1273.87772 | 1493.276887 |
| Compound | Std | 13.2355642 | 35.63566319 | 79.71609294 | 66.55744532 | 71.96053364 | 86.13103684 |
| Pathbased | Best | 1424.899542 | 1427.872936 | 1492.322506 | 1425.176917 | 1429.842419 | 1553.128473 |
| Pathbased | Worst | 1723.311224 | 1676.139045 | 1901.140798 | 1857.587734 | 1897.234909 | 1893.710862 |
| Pathbased | Mean | 1430.903602 | 1447.009762 | 1683.491592 | 1497.152685 | 1477.009539 | 1703.054894 |
| Pathbased | Std | 1.6570813 | 7.767526694 | 109.5463984 | 44.80756305 | 38.18020827 | 83.51041425 |
| Spiral | Best | 1807.54755 | 1807.510795 | 1832.06375 | 1807.595765 | 1808.281132 | 1896.181926 |
| Spiral | Worst | 2015.011175 | 1926.563714 | 2163.452999 | 2094.070221 | 2107.31257 | 2149.720749 |
| Spiral | Mean | 1810.02073 | 1809.074549 | 1963.454005 | 1820.774656 | 1824.186315 | 1996.155056 |
| Spiral | Std | 2.168093216 | 0.663986887 | 70.079482 | 10.71573663 | 17.47221358 | 73.8703224 |
Comparison of objective values of MFO, BHA, MVO, HHO, GWO, k-means algorithms.
| Dataset | Criteria | MFO | BHA | MVO | HHO | GWO | K-Means |
|---|---|---|---|---|---|---|---|
| Glass | Best | 254.5686207 | 344.1858768 | 427.2765574 | 302.6048772 | 360.4325397 | 482.794362 |
| Glass | Worst | 607.015981 | 579.4491593 | 657.6790272 | 653.2463069 | 682.8121634 | 668.037993 |
| Glass | Mean | 286.3971108 | 394.6702904 | 563.6985645 | 375.0501591 | 441.7389961 | 592.7121853 |
| Glass | Std | 8.5864965 | 14.97574454 | 37.57437297 | 29.06851845 | 44.90562621 | 50.8694328 |
| Iris | Best | 96.6566922 | 102.1609776 | 141.6280996 | 105.4454434 | 91.06876813 | 155.9380716 |
| Iris | Worst | 187.7141075 | 196.0131392 | 231.7066358 | 220.9449828 | 186.6739426 | 215.8188002 |
| Iris | Mean | 99.54558066 | 111.6727822 | 177.7656738 | 128.8472893 | 104.0780971 | 189.2905571 |
| Iris | Std | 0.04642567 | 2.418165686 | 18.17189821 | 9.40506885 | 9.580750806 | 19.36588562 |
| Wine | Best | 6176852.759 | 6877262.007 | 9811505.667 | 7416306.523 | 7788077.075 | 10335482.5 |
| Wine | Worst | 10526429.84 | 9127679.781 | 11418117.43 | 11754852.48 | 11731706.85 | 11731057.25 |
| Wine | Mean | 6569678.631 | 7404560.759 | 10694275.29 | 8018085.743 | 8206163.788 | 10942626.63 |
| Wine | Std | 103291.3436 | 116859.092 | 411754.8747 | 253702.4043 | 229142.0161 | 351269.2404 |
| Yeast | Best | 297.404773 | 399.60419 | 472.7558453 | 344.6453467 | 368.171845 | 528.3446203 |
| Yeast | Worst | 642.528356 | 627.3754381 | 772.8618998 | 757.8158265 | 730.7458876 | 753.5334223 |
| Yeast | Mean | 346.0571754 | 421.1546863 | 577.9115552 | 380.0538835 | 414.0104515 | 634.6032704 |
| Yeast | Std | 1.325687567 | 4.639095791 | 53.82131146 | 13.81732081 | 35.1425062 | 55.9518587 |
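This excerpt does not define the objective whose values are tabulated above. A common choice in the metaheuristic clustering literature is the sum of Euclidean distances from each point to its nearest centroid, and the sketch below assumes that definition. The function name `clustering_objective` and the toy data are hypothetical and are not taken from the paper.

```python
import math


def clustering_objective(points, centroids):
    """Sum of Euclidean distances from each point to its nearest centroid.

    Assumed objective: the excerpt does not state the exact formula.
    Lower values mean tighter clusters.
    """
    total = 0.0
    for p in points:
        total += min(math.dist(p, c) for c in centroids)
    return total


# Toy 2-D data (hypothetical, not one of the benchmark datasets).
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0)]
centroids = [(0.0, 1.0), (10.0, 0.0)]

print(clustering_objective(points, centroids))  # 2.0
```

Under this assumption, a candidate solution for a search algorithm is simply the list of centroids, and the objective is what the algorithm minimizes.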
Average ranking of MFO, BHA, MVO, HHO, GWO, k-means algorithms based on mean of objective values.
| Dataset Group | MFO | BHA | MVO | HHO | GWO | K-Means |
|---|---|---|---|---|---|---|
| Shape Dataset | 1.375 | 2.5 | 4.875 | 3 | 3.25 | 6 |
| UCI Dataset | 1 | 3 | 5 | 2.75 | 3.25 | 6 |
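The UCI row of the ranking table can be reproduced from the mean objective values reported for Glass, Iris, Wine, and Yeast. The following is a minimal sketch, assuming that each algorithm is ranked per dataset by its mean (rank 1 is the lowest mean) and that the ranks are then averaged across datasets. The helper `ranks` is illustrative.

```python
# Mean objective values copied from the UCI comparison table (lower is better).
algorithms = ["MFO", "BHA", "MVO", "HHO", "GWO", "K-Means"]
uci_means = {
    "Glass": [286.3971108, 394.6702904, 563.6985645, 375.0501591, 441.7389961, 592.7121853],
    "Iris": [99.54558066, 111.6727822, 177.7656738, 128.8472893, 104.0780971, 189.2905571],
    "Wine": [6569678.631, 7404560.759, 10694275.29, 8018085.743, 8206163.788, 10942626.63],
    "Yeast": [346.0571754, 421.1546863, 577.9115552, 380.0538835, 414.0104515, 634.6032704],
}


def ranks(values):
    """Rank 1 = smallest value; tied values share the average of their ranks."""
    ordered = sorted(values)
    result = []
    for v in values:
        first = ordered.index(v)   # 0-based position of the first equal value
        count = ordered.count(v)   # number of tied values
        result.append((first + 1 + first + count) / 2)
    return result


per_dataset = [ranks(means) for means in uci_means.values()]
average_ranks = {
    name: sum(row[j] for row in per_dataset) / len(per_dataset)
    for j, name in enumerate(algorithms)
}

print(average_ranks)
# {'MFO': 1.0, 'BHA': 3.0, 'MVO': 5.0, 'HHO': 2.75, 'GWO': 3.25, 'K-Means': 6.0}
```

The printed values agree with the UCI row of the table.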
Statistical results based on mean of objective values for Shape datasets.
| Test Name | Statistical Value | p-Value | Hypothesis |
|---|---|---|---|
| Iman-Davenport | 27.69026 | <0.00001 | Rejected |
| Friedman | 31.92857 | <0.00001 | Rejected |
Statistical results based on mean of objective values for UCI datasets.
| Test Name | Statistical Value | p-Value | Hypothesis |
|---|---|---|---|
| Iman-Davenport | 24.99996 | <0.00001 | Rejected |
| Friedman | 17.85714 | 0.003131 | Rejected |
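The Friedman and Iman-Davenport statistics in the two tables above can be recomputed from the average ranks using the standard textbook formulas. The sketch below assumes k = 6 algorithms, N = 8 Shape datasets, and N = 4 UCI datasets. The function names are illustrative, and p-values are omitted because they need chi-square and F distribution functions that the standard library does not provide.

```python
def friedman_statistic(avg_ranks, n_datasets):
    """Friedman chi-square: 12N / (k(k+1)) * (sum of R_j^2 - k(k+1)^2 / 4)."""
    k = len(avg_ranks)
    sum_sq = sum(r * r for r in avg_ranks)
    return 12.0 * n_datasets / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4.0)


def iman_davenport_statistic(chi2, n_datasets, k):
    """Iman-Davenport F: (N - 1) * chi2 / (N(k - 1) - chi2)."""
    return (n_datasets - 1) * chi2 / (n_datasets * (k - 1) - chi2)


# Average ranks copied from the ranking table (MFO, BHA, MVO, HHO, GWO, K-Means).
shape_ranks = [1.375, 2.5, 4.875, 3.0, 3.25, 6.0]
uci_ranks = [1.0, 3.0, 5.0, 2.75, 3.25, 6.0]

chi2_shape = friedman_statistic(shape_ranks, n_datasets=8)
chi2_uci = friedman_statistic(uci_ranks, n_datasets=4)
f_shape = iman_davenport_statistic(chi2_shape, n_datasets=8, k=6)
f_uci = iman_davenport_statistic(chi2_uci, n_datasets=4, k=6)

print(round(chi2_shape, 5), round(f_shape, 5))  # 31.92857 27.69027
print(round(chi2_uci, 5), round(f_uci, 5))      # 17.85714 25.0
```

The small differences from the tabulated Iman-Davenport values (27.69026 and 24.99996) are consistent with rounding of intermediate results.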
Holm’s test statistical results based on mean of objective values for Shape datasets.
| i | Algorithms | Statistical Value | p-Value | α/i | Hypothesis |
|---|---|---|---|---|---|
| 5 | K-Means | 4.94433 | <0.00001 | 0.01 | Rejected |
| 4 | MVO | 3.74165 | 0.000183 | 0.0125 | Rejected |
| 3 | GWO | 2.00446 | 0.045027 | 0.0167 | Not Rejected |
| 2 | HHO | 1.73719 | 0.08237 | 0.025 | Not Rejected |
| 1 | BHA | 1.20267 | 0.22913 | 0.05 | Not Rejected |
Holm’s test statistical results based on mean of objective values for UCI datasets.
| i | Algorithms | Statistical Value | p-Value | α/i | Hypothesis |
|---|---|---|---|---|---|
| 5 | K-Means | 3.77964 | 0.000157 | 0.01 | Rejected |
| 4 | MVO | 3.02371 | 0.002497 | 0.0125 | Rejected |
| 3 | GWO | 1.70084 | 0.088981 | 0.0167 | Not Rejected |
| 2 | BHA | 1.51186 | 0.130585 | 0.025 | Not Rejected |
| 1 | HHO | 1.32287 | 0.185902 | 0.05 | Not Rejected |
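The rows of the Holm tables can be recomputed from the average ranks. This is a minimal sketch, assuming MFO is the control method, α = 0.05, the usual test statistic z = (R_i − R_control) / sqrt(k(k+1) / (6N)), and a two-sided normal p-value. The function name `holm_test` and the tuple layout are illustrative.

```python
import math


def holm_test(avg_ranks, control, n_datasets, alpha=0.05):
    """Holm step-down comparison of every method against a control method.

    Returns tuples (i, name, z, p_value, alpha / i, rejected), ordered from
    the smallest p-value (largest i) to the largest p-value (i = 1).
    """
    k = len(avg_ranks)
    std_err = math.sqrt(k * (k + 1) / (6.0 * n_datasets))
    rows = []
    for name, rank in avg_ranks.items():
        if name == control:
            continue
        z = (rank - avg_ranks[control]) / std_err
        p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
        rows.append((name, z, p_value))
    rows.sort(key=lambda row: row[2])  # most significant comparison first

    results = []
    rejecting = True  # once one hypothesis is retained, all later ones are too
    for step, (name, z, p_value) in enumerate(rows):
        i = len(rows) - step
        threshold = alpha / i
        rejecting = rejecting and p_value < threshold
        results.append((i, name, z, p_value, threshold, rejecting))
    return results


# Average ranks for the UCI datasets, copied from the ranking table.
uci_ranks = {"MFO": 1.0, "BHA": 3.0, "MVO": 5.0, "HHO": 2.75, "GWO": 3.25, "K-Means": 6.0}

for i, name, z, p_value, threshold, rejected in holm_test(uci_ranks, "MFO", n_datasets=4):
    verdict = "Rejected" if rejected else "Not Rejected"
    print(i, name, round(z, 5), round(p_value, 6), round(threshold, 4), verdict)
```

For the UCI ranks this reproduces the ordering and the decisions of the table above: K-Means and MVO are rejected, while GWO, BHA, and HHO are not.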
Figure 1. Variation in the best fitness values of algorithms for datasets (A): Flame, (B): Jain with respect to iterations.
Figure 2. Variation in the best fitness values of algorithms for datasets (A): R15, (B): D31 with respect to iterations.
Figure 3. Variation in the best fitness values of algorithms for datasets (A): Aggregation, (B): Compound with respect to iterations.
Figure 4. Variation in the best fitness values of algorithms for datasets (A): Pathbased, (B): Spiral with respect to iterations.
Figure 5. Variation in the best fitness values of algorithms for datasets (A): Glass, (B): Iris with respect to iterations.
Figure 6. Variation in the best fitness values of algorithms for datasets (A): Wine, (B): Yeast with respect to iterations.
The best centroids for D31 obtained by proposed approach.
| Sr No. | F1 | F2 |
|---|---|---|
| C1 | 20.74266809 | 27.59365568 |
| C2 | 25.50196489 | 24.19312765 |
| C3 | 11.57011301 | 8.50840516 |
| C4 | 25.82211536 | 26.17793719 |
| C5 | 27.37201232 | 10.57384902 |
| C6 | 22.08486665 | 5.496210514 |
| C7 | 23.58523731 | 8.888237338 |
| C8 | 22.37594806 | 11.79535569 |
| C9 | 4.83205804 | 26.81225277 |
| C10 | 27.50193421 | 17.28098473 |
| C11 | 15.01686978 | 27.19744896 |
| C12 | 6.353870768 | 16.21830889 |
| C13 | 16.35650612 | 9.106767944 |
| C14 | 9.968810869 | 23.65566343 |
| C15 | 9.153853041 | 14.9149635 |
| C16 | 23.13295757 | 16.05797592 |
| C17 | 8.101549272 | 10.37341231 |
| C18 | 20.47807037 | 18.998876 |
| C19 | 4.965093478 | 20.47535923 |
| C20 | 26.53577694 | 17.86530094 |
| C21 | 26.03937471 | 14.99664186 |
| C22 | 25.47861108 | 6.28135661 |
| C23 | 12.82474767 | 19.1136306 |
| C24 | 15.19151476 | 22.86896706 |
| C25 | 17.80680556 | 12.9098126 |
| C26 | 19.90521872 | 23.37912391 |
| C27 | 17.72660498 | 25.58120323 |
| C28 | 11.71645567 | 14.69915113 |
| C29 | 4.624749983 | 10.32233599 |
| C30 | 27.65379495 | 21.47346273 |
| C31 | 15.7736913 | 21.06158524 |
The best centroids for R15 obtained by proposed approach.
| Sr No. | F1 | F2 |
|---|---|---|
| C1 | 4.189631608 | 12.80375838 |
| C2 | 14.09450165 | 5.001272186 |
| C3 | 8.337048918 | 9.062858908 |
| C4 | 4.101436934 | 7.52179159 |
| C5 | 13.97254731 | 14.93207276 |
| C6 | 12.79155218 | 8.05529297 |
| C7 | 8.230614736 | 10.92315677 |
| C8 | 16.41253705 | 9.985521142 |
| C9 | 8.646224944 | 16.24662551 |
| C10 | 11.02097643 | 11.58322744 |
| C11 | 9.551563967 | 12.06489806 |
| C12 | 11.92041063 | 9.712070237 |
| C13 | 9.967326937 | 10.10242535 |
| C14 | 9.645716964 | 7.980621354 |
| C15 | 8.663770617 | 3.772581562 |
The best centroids for Jain obtained by proposed approach.
| Sr No. | F1 | F2 |
|---|---|---|
| C1 | 17.03102423 | 15.16831711 |
| C2 | 32.58459725 | 7.124899903 |
The best centroids for Flame obtained by proposed approach.
| Sr No. | F1 | F2 |
|---|---|---|
| C1 | 7.206597929 | 24.16493517 |
| C2 | 7.301802789 | 17.84894502 |
The best centroids for Aggregation obtained by proposed approach.
| Sr No. | F1 | F2 |
|---|---|---|
| C1 | 21.42567886 | 22.85728939 |
| C2 | 7.716573617 | 8.772216185 |
| C3 | 32.40196366 | 22.05208852 |
| C4 | 33.15470428 | 8.782254392 |
| C5 | 8.938930788 | 22.91640128 |
| C6 | 14.65416199 | 7.059473024 |
| C7 | 20.82265142 | 7.249080316 |
The best centroids for Compound obtained by proposed approach.
| Sr No. | F1 | F2 |
|---|---|---|
| C1 | 18.77723869 | 18.83342046 |
| C2 | 32.64318475 | 16.28179213 |
| C3 | 37.48781021 | 17.33548448 |
| C4 | 10.65769689 | 19.33852537 |
| C5 | 18.67265227 | 9.510696233 |
| C6 | 12.61754072 | 9.616177793 |
The best centroids for Pathbased obtained by proposed approach.
| Sr No. | F1 | F2 |
|---|---|---|
| C1 | 18.82903757 | 30.45142379 |
| C2 | 11.48394236 | 15.73097 |
| C3 | 26.16808047 | 16.08878767 |
The best centroids for Spiral obtained by proposed approach.
| Sr No. | F1 | F2 |
|---|---|---|
| C1 | 22.64471503 | 22.66591643 |
| C2 | 11.172831 | 16.53101706 |
| C3 | 22.08495457 | 10.76472807 |
The best centroids for Glass obtained by proposed approach.
| Sr No. | F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8 | F9 |
|---|---|---|---|---|---|---|---|---|---|
| C1 | 1.531719668 | 13.06173613 | 3.510979859 | 1.394173337 | 72.84637382 | 0.162494133 | 8.41076102 | 0.025666476 | 0.007523229 |
| C2 | 1.52797292 | 12.80840956 | 0.246399681 | 1.609315064 | 73.83969663 | 0.245748967 | 11.78973298 | 0.462253331 | 0.257117154 |
| C3 | 1.52040748 | 13.35918127 | 0.219397152 | 2.308129393 | 70.18963569 | 6.207528249 | 6.479935975 | 0.152869685 | 0.03330514 |
| C4 | 1.533244544 | 13.8560578 | 3.047071044 | 1.202271091 | 70.60025867 | 3.494911842 | 7.093112782 | 0.306091421 | 0.059719952 |
| C5 | 1.512538966 | 13.84305467 | 2.912665802 | 0.875799374 | 72.00128777 | 0.047687008 | 9.335062282 | 0.08408769 | 0.032376928 |
| C6 | 1.5112 | 14.43925402 | 0.008206 | 2.085299146 | 73.35680382 | 0.457194235 | 8.521081118 | 1.11995061 | 0.005501446 |
| C7 | 1.513266442 | 12.92439889 | 2.072428469 | 0.29 | 72.17879752 | 0.585345503 | 9.906258882 | 0.045962136 | 0.026599321 |
The best centroids for Iris obtained by proposed approach.
| Sr No. | F1 | F2 | F3 | F4 |
|---|---|---|---|---|
| C1 | 5.01229979 | 3.40333071 | 1.471677299 | 0.235472045 |
| C2 | 6.732802141 | 3.067395056 | 5.623784792 | 2.106790702 |
| C3 | 5.934098654 | 2.797688794 | 4.417324546 | 1.41492155 |
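Reported centroids such as those above can be used to label a sample by assigning it to the nearest centroid. The sketch below assumes Euclidean distance and assumes that F1 to F4 follow the usual UCI Iris feature order (sepal length, sepal width, petal length, petal width). The two samples are illustrative measurements and the function name `nearest_centroid` is hypothetical.

```python
import math

# Centroids copied from the Iris table above (features F1..F4).
iris_centroids = {
    "C1": (5.01229979, 3.40333071, 1.471677299, 0.235472045),
    "C2": (6.732802141, 3.067395056, 5.623784792, 2.106790702),
    "C3": (5.934098654, 2.797688794, 4.417324546, 1.41492155),
}


def nearest_centroid(sample, centroids):
    """Return the label of the centroid closest to the sample (Euclidean)."""
    return min(centroids, key=lambda name: math.dist(sample, centroids[name]))


# Illustrative samples; the feature order is an assumption (see above).
print(nearest_centroid((5.1, 3.5, 1.4, 0.2), iris_centroids))  # C1
print(nearest_centroid((6.5, 3.0, 5.5, 2.0), iris_centroids))  # C2
```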
The best centroids for Wine obtained by proposed approach.
| Feature | C1 | C2 | C3 |
|---|---|---|---|
| F1 | 39,986.76285 | 43,544.94447 | 20,030.90947 |
| F2 | 28,115.519 | 15,541.15111 | 13,971.82923 |
| F3 | 45,777.07237 | 35,143.40404 | 31,390.39269 |
| F4 | 28,154.45346 | 21,489.64815 | 33,270.71013 |
| F5 | 21,025.39322 | 25,555.71232 | 19,697.48292 |
| F6 | 16,405.36654 | 46,363.61618 | 27,124.14348 |
| F7 | 16,940.6724 | 35,341.31586 | 22,796.4139 |
| F8 | 37,050.85547 | 18,628.032 | 29,821.29914 |
| F9 | 19,508.26413 | 31,543.63104 | 24,125.44547 |
| F10 | 32,628.78338 | 23,408.28137 | 15,972.93531 |
| F11 | 10,576.51405 | 31,095.9809 | 29,682.27216 |
| F12 | 14,613.20707 | 45,340.52047 | 34,586.81203 |
| F13 | 16,507.21954 | 37,817.65194 | 10,303.62763 |
The best centroids for Yeast obtained by proposed approach.
| Sr No. | F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8 |
|---|---|---|---|---|---|---|---|---|
| C1 | 0.757337919 | 0.142268616 | 0.827461959 | 0.001450393 | 0.527740868 | 0.771304322 | 0.630304293 | 0.383528402 |
| C2 | 0.781314193 | 0.71779793 | 0.419456881 | 0.377730495 | 0.560817461 | 0.015464843 | 0.511164619 | 0.170202938 |
| C3 | 0.496325357 | 0.491261885 | 0.499102561 | 0.234178288 | 0.500528038 | 0 | 0.504793757 | 0.25014915 |
| C4 | 0.131413932 | 0.34929326 | 0.393064657 | 0.841116915 | 0.704378018 | 0.212988264 | 0.518108105 | 0.444142521 |
| C5 | 0.957824847 | 0.549712612 | 0.456841891 | 0.964282073 | 0.540272009 | 0.393364094 | 0.288371843 | 0.448492104 |
| C6 | 0.147129502 | 0.724553473 | 0.474471507 | 0.175108699 | 0.571043788 | 0.746038613 | 0.534680335 | 0.185879771 |
| C7 | 0.430257651 | 0.47424918 | 0.534401249 | 0.225056925 | 0.500017048 | 0 | 0.478658653 | 0.655020513 |
| C8 | 0.371314927 | 0.342973839 | 0.518372939 | 0.135213842 | 0.521916885 | 0.016021841 | 0.545742633 | 0.275096267 |
| C9 | 0.292646344 | 0.132663231 | 0.270567884 | 0.035813911 | 0.505437432 | 0.366876625 | 0.08142005 | 0.187474957 |
| C10 | 0.411909662 | 0.491403883 | 0.541493781 | 0.519251596 | 0.546134059 | 0.000446405 | 0.4844054 | 0.113730494 |