Peng Zhou, Fan Ye, Liang Du.
Abstract
Since multi-view data are available in many real-world clustering problems, multi-view clustering has received considerable attention in recent years. Most existing multi-view clustering methods learn a consensus clustering result but do not make full use of the distinct knowledge in each view, so they cannot reliably guarantee complementarity across views. In this paper, we propose Distinction based Consensus Spectral Clustering (DCSC), which not only learns a consensus clustering result but also explicitly captures the distinct variance of each view. By exploiting the distinct variance of each view, DCSC can learn a clearer consensus clustering result. To solve the resulting optimization problem effectively, we develop a block coordinate descent algorithm that is theoretically guaranteed to converge. Experimental results on real-world data sets demonstrate the effectiveness of our method.
Year: 2018 PMID: 30521611 PMCID: PMC6283548 DOI: 10.1371/journal.pone.0208494
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. An example of the two parts of the embedding.
(a) Consensus embedding and distinct variances satisfy our constraints. (b) Consensus embedding obtained by averaging all views without any constraints on the distinct variances.
Details of the multi-view data sets used in our experiments (feature type (dimensionality)).
| Feature type | Cornell | Texas | Washington | Wisconsin |
|---|---|---|---|---|
| 1 | Cont(1703) | Cont(1703) | Cont(1703) | Cont(1703) |
| 2 | Cite(195) | Cite(187) | Cite(230) | Cite(265) |
| #Instances | 195 | 187 | 230 | 265 |
| #Clusters | 5 | 5 | 5 | 5 |

| Feature type | UCI Digit | Advertisements | Corel | Flower17 |
|---|---|---|---|---|
| 1 | FourierCoef(76) | ImageURL(457) | ColorHistogram(64) | ColorVocabulary |
| 2 | ProfileCorrelations(216) | BaseURL(495) | ColorMoment(9) | ShapeVocabulary |
| 3 | Pixel(240) | DestinationURL(472) | ColorCoherence(128) | TextureVocabulary |
| 4 | KarhunenLoeveCoef(64) | Alt(111) | CoarsnessTamuraTexture(10) | HSV |
| 5 | ZernikeMoments(47) | Caption(19) | DirectionalityTamuraTexture(8) | HOG |
| 6 | Morphological(6) | - | WaveletTexture(104) | ForegroundSIFT |
| 7 | - | - | MASARTexture(15) | BoundarySIFT |
| #Instances | 2000 | 3279 | 3400 | 1360 |
| #Clusters | 10 | 2 | 34 | 17 |
ACC results on all the data sets.
| Data | FeaConcat | RMKMC | MultiNMF | Co-reg SC | RMSC | AMGL | SwMC | RAMC | DCSC- | DCSC |
|---|---|---|---|---|---|---|---|---|---|---|
| Cornell | 0.4513 | 0.4492 | 0.4256 | 0.4697 | 0.4270 | 0.4451 | 0.4821 | 0.4256 | 0.5026 | |
| Texas | 0.3797 | 0.5599 | 0.5561 | 0.5775 | 0.5892 | 0.5968 | 0.5508 | 0.5508 | 0.6524 | |
| Washington | 0.4783 | 0.5613 | 0.4739 | 0.5691 | 0.5536 | 0.5217 | 0.5130 | 0.5500 | 0.6870 | |
| Wisconsin | 0.4528 | 0.5102 | 0.4717 | 0.5391 | 0.5155 | 0.5725 | 0.4712 | 0.4717 | 0.5736 | |
| UCI Digit | 0.7235 | 0.7579 | 0.7991 | 0.8048 | 0.7493 | 0.8450 | 0.8325 | 0.8595 | | |
| Advertisements | 0.7091 | 0.8618 | 0.8603 | 0.7995 | 0.6908 | 0.8446 | 0.8134 | 0.8600 | 0.8713 | |
| Corel | 0.2362 | 0.1094 | 0.0297 | 0.2842 | 0.2846 | 0.2937 | 0.1485 | 0.2756 | 0.3209 | |
| Flower17 | - | - | - | 0.5325 | 0.5641 | 0.5734 | 0.4860 | 0.5713 | 0.5647 | |
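
The record does not define ACC. Assuming it is the usual best-match clustering accuracy (the fraction of samples labelled correctly under the best one-to-one mapping from predicted clusters to true classes), a minimal sketch could look like the following. The function name `clustering_accuracy` is hypothetical, and the brute-force search over permutations is an illustrative choice that is only practical for a handful of clusters.

```python
from itertools import permutations


def clustering_accuracy(true_labels, pred_labels):
    """Best-match accuracy between a predicted clustering and true classes.

    Assumption: ACC is defined via the best one-to-one mapping of predicted
    clusters to true classes. Brute force over permutations; small k only.
    """
    if len(true_labels) != len(pred_labels) or len(true_labels) == 0:
        raise ValueError("label lists must be non-empty and of equal length")

    # Map arbitrary label values to consecutive indices.
    class_index = {c: i for i, c in enumerate(dict.fromkeys(true_labels))}
    cluster_index = {c: i for i, c in enumerate(dict.fromkeys(pred_labels))}

    # Square contingency table; padding rows/columns stay at zero.
    size = max(len(class_index), len(cluster_index))
    counts = [[0] * size for _ in range(size)]
    for t, p in zip(true_labels, pred_labels):
        counts[cluster_index[p]][class_index[t]] += 1

    # Try every one-to-one assignment of clusters to classes.
    best = max(
        sum(counts[cluster][cls] for cluster, cls in enumerate(perm))
        for perm in permutations(range(size))
    )
    return best / len(true_labels)
```

For data sets with many clusters (Corel has 34), the permutation search is infeasible; a Hungarian-algorithm solver such as `scipy.optimize.linear_sum_assignment` applied to the same contingency table would be the usual substitute.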
Purity results on all the data sets.
| Data | FeaConcat | RMKMC | MultiNMF | Co-reg SC | RMSC | AMGL | SwMC | RAMC | DCSC- | DCSC |
|---|---|---|---|---|---|---|---|---|---|---|
| Cornell | 0.4821 | 0.5046 | 0.4359 | 0.5117 | 0.4872 | 0.5385 | 0.4256 | 0.5179 | | |
| Texas | 0.5580 | 0.6219 | 0.5668 | 0.6574 | 0.6043 | 0.5936 | 0.5668 | 0.6417 | | |
| Washington | 0.6174 | 0.6739 | 0.4826 | 0.6817 | 0.6730 | 0.6517 | 0.5435 | 0.6652 | 0.6783 | |
| Wisconsin | 0.6038 | 0.6589 | 0.4755 | 0.6894 | 0.6729 | 0.6019 | 0.6004 | 0.6604 | | |
| UCI Digit | 0.7435 | 0.7892 | 0.7996 | 0.8166 | 0.7704 | 0.8675 | 0.8620 | | | |
| Advertisements | 0.8600 | 0.8625 | 0.8603 | 0.8667 | 0.8600 | 0.8631 | 0.8750 | 0.8731 | 0.8731 | |
| Corel | 0.2700 | 0.1163 | 0.0391 | 0.3069 | 0.3105 | 0.3330 | 0.1874 | 0.3262 | 0.3712 | |
| Flower17 | - | - | - | 0.5604 | 0.5768 | 0.5000 | 0.5868 | 0.5875 | | |
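
Purity is not defined in this record either. Under the common definition (each predicted cluster is credited with its most frequent true class, and the credited counts are summed and divided by the number of samples), a short sketch might be written as below. The function name `purity` is an assumption for illustration.

```python
from collections import Counter, defaultdict


def purity(true_labels, pred_labels):
    """Purity of a clustering: share of samples in their cluster's majority class.

    Assumption: the standard definition of purity is intended.
    """
    if len(true_labels) != len(pred_labels) or len(true_labels) == 0:
        raise ValueError("label lists must be non-empty and of equal length")

    # For each predicted cluster, count how often each true class occurs.
    members = defaultdict(Counter)
    for t, p in zip(true_labels, pred_labels):
        members[p][t] += 1

    # Credit each cluster with the size of its majority class.
    majority_total = sum(max(c.values()) for c in members.values())
    return majority_total / len(true_labels)
```

Unlike best-match accuracy, purity lets several clusters map to the same class, so it is never lower than ACC for the same labelling.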
Fig 2. Convergence curves of our method.
(a) UCI Digit data set. (b) Corel data set. (c) Advertisements data set. (d) Flower17 data set.
Fig 3. Clustering results w.r.t. λ1 and λ2 on the UCI Digit and Corel data sets.
(a) ACC on UCI Digit data set. (b) NMI on UCI Digit data set. (c) Purity on UCI Digit data set. (d) ACC on Corel data set. (e) NMI on Corel data set. (f) Purity on Corel data set.
NMI results on all the data sets.
| Data | FeaConcat | RMKMC | MultiNMF | Co-reg SC | RMSC | AMGL | SwMC | RAMC | DCSC- | DCSC |
|---|---|---|---|---|---|---|---|---|---|---|
| Cornell | 0.1628 | 0.1704 | 0.0199 | 0.1623 | 0.1352 | 0.1349 | 0.1361 | 0.1594 | | |
| Texas | 0.2002 | 0.1910 | 0.0288 | 0.2495 | 0.2317 | 0.1821 | 0.2569 | 0.2413 | 0.2323 | |
| Washington | 0.2321 | 0.2550 | 0.0206 | 0.2739 | 0.2729 | 0.2182 | 0.2215 | 0.2416 | 0.2881 | |
| Wisconsin | 0.2476 | 0.2498 | 0.0202 | 0.2834 | 0.2780 | 0.2412 | 0.2359 | 0.2773 | 0.2819 | |
| UCI Digit | 0.6925 | 0.7245 | 0.7186 | 0.8221 | 0.7467 | 0.7166 | 0.8735 | | | |
| Advertisements | 0.0189 | 0.0319 | 0.0015 | 0.0725 | 0.0196 | 0.1619 | 0.2286 | 0.2216 | 0.2247 | |
| Corel | 0.3172 | 0.1156 | 0.0102 | 0.3741 | 0.3603 | 0.3644 | 0.1966 | 0.3645 | 0.3777 | |
| Flower17 | - | - | - | 0.5502 | 0.4491 | 0.5570 | 0.5593 | | | |
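
NMI (normalized mutual information) has several normalization variants, and this record does not say which one was used. The sketch below assumes the geometric-mean form, NMI = I(U; V) / sqrt(H(U) H(V)); the function name `normalized_mutual_information` and the handling of zero-entropy labellings are illustrative choices, not taken from the source.

```python
import math
from collections import Counter


def normalized_mutual_information(true_labels, pred_labels):
    """NMI between two labellings, normalized by sqrt(H(true) * H(pred)).

    Assumption: geometric-mean normalization; other variants divide by the
    arithmetic mean or the maximum of the two entropies.
    """
    n = len(true_labels)
    if n != len(pred_labels) or n == 0:
        raise ValueError("label lists must be non-empty and of equal length")

    true_counts = Counter(true_labels)
    pred_counts = Counter(pred_labels)
    joint_counts = Counter(zip(true_labels, pred_labels))

    def entropy(counts):
        return -sum((c / n) * math.log(c / n) for c in counts.values())

    h_true, h_pred = entropy(true_counts), entropy(pred_counts)
    if h_true == 0.0 and h_pred == 0.0:
        return 1.0  # both labellings are a single group: treated as identical
    if h_true == 0.0 or h_pred == 0.0:
        return 0.0  # one side carries no information

    # Mutual information from the joint and marginal counts.
    mi = sum(
        (c / n) * math.log(c * n / (true_counts[t] * pred_counts[p]))
        for (t, p), c in joint_counts.items()
    )
    return mi / math.sqrt(h_true * h_pred)
```

The score is invariant to relabelling clusters, so no cluster-to-class matching step is needed before computing it.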