Xiulan Yu, Hongyu Li, Zufan Zhang, Chenquan Gan.
Abstract
Clustering analysis of massive data in wireless multimedia sensor networks (WMSN) has become a hot topic. However, most data clustering algorithms have difficulty capturing latent nonlinear correlations among data features, which results in low clustering accuracy. In addition, it is difficult to extract features from missing or corrupted data, even though incomplete data are common in practical work. In this paper, an optimally designed variational autoencoder (VAE) network is proposed for extracting features of incomplete data, and a high-order fuzzy c-means algorithm (HOFCM) is used to improve clustering performance on incomplete data. Specifically, the feature extraction model is improved by using a variational autoencoder to learn the features of incomplete data. To capture nonlinear correlations in different heterogeneous data patterns, a tensor-based fuzzy c-means algorithm is used to cluster the low-dimensional features. The tensor distance is used as the distance measure to capture as many of the unknown correlations in the data as possible. Finally, once the clustering results are obtained, the missing data can be restored from the low-dimensional features. Experiments on real datasets show that the proposed algorithm not only improves clustering performance on incomplete data effectively, but also fills in missing features and yields better data reconstruction results.
Keywords: feature learning; fuzzy c-means; incomplete multimedia data; variational autoencoder
Year: 2019; PMID: 30781499; PMCID: PMC6413117; DOI: 10.3390/s19040809
Source DB: PubMed; Journal: Sensors (Basel); ISSN: 1424-8220; Impact factor: 3.576
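As a rough illustration of the clustering step described in the abstract, the sketch below implements standard fuzzy c-means in NumPy. It is a simplification under stated assumptions: it uses squared Euclidean distance rather than the tensor distance of HOFCM, and it clusters raw vectors rather than VAE features. The function name `fuzzy_c_means` and its parameters (`m`, `n_iter`, `tol`, `seed`) are illustrative and do not come from the paper.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Standard fuzzy c-means with Euclidean distance (not the tensor distance)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships; each row is normalized to sum to 1.
    U = rng.random((n, n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m                                     # fuzzified memberships
        centers = (W.T @ X) / W.sum(axis=0)[:, None]   # weighted cluster centers
        # Squared distance from every sample to every center, shape (n, c).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)                     # guard against division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)   # membership update
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers

# Toy data (assumed for illustration): two well-separated 2-D blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)), rng.normal(5.0, 0.3, (50, 2))])
U, centers = fuzzy_c_means(X, n_clusters=2)
labels = U.argmax(axis=1)  # hard assignment from the soft memberships
```

Each row of `U` holds one sample's membership in every cluster. A tensor-distance variant would replace only the `d2` computation.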
Figure 1. Architecture of the proposed method.
Figure 2. The improved VAE model.
Clustering accuracy (ACC).
| Algorithm/Dataset | MNIST | STL-10 | NUS-WIDE |
|---|---|---|---|
| k-means | 53.49% | 28.40% | 81.51% |
| HOPCM | 80.34% | 33.12% | 92.75% |
| VAE | 84.20% | 35.48% | 93.32% |
| DEC | 84.31% | 35.90% | 93.75% |
| VAE-HOFCM | 85.54% | 36.44% | 95.14% |
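The record does not say how ACC is computed. A common convention for unsupervised clustering accuracy, assumed here, is to find the one-to-one mapping between predicted clusters and true classes that maximizes the number of matched samples, then report the matched fraction. The sketch below follows that convention using SciPy's Hungarian solver. The function name `clustering_accuracy` and the toy labels are illustrative.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(y_true, y_pred):
    """Best-match clustering accuracy (assumed convention, not taken from the paper)."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    true_ids, t = np.unique(y_true, return_inverse=True)
    pred_ids, p = np.unique(y_pred, return_inverse=True)
    # Contingency table: counts[i, j] = samples in predicted cluster i with true class j.
    counts = np.zeros((pred_ids.size, true_ids.size), dtype=np.int64)
    np.add.at(counts, (p, t), 1)
    # Hungarian algorithm: choose the cluster-to-class mapping with the most matches.
    rows, cols = linear_sum_assignment(counts, maximize=True)
    return counts[rows, cols].sum() / y_true.size

# Toy labels: cluster ids are permuted and one sample is misassigned.
acc = clustering_accuracy([0, 0, 1, 1, 2, 2], [1, 1, 0, 0, 2, 0])
print(f"ACC = {acc:.4f}")  # 5 of 6 samples are matched under the best mapping
```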
Clustering performance measured by the adjusted Rand index (ARI).
| Algorithm/Dataset | MNIST | STL-10 | NUS-WIDE |
|---|---|---|---|
| k-means | 0.41 | - | 0.74 |
| HOPCM | 0.69 | - | 0.89 |
| VAE | 0.75 | - | 0.90 |
| DEC | 0.76 | - | 0.90 |
| VAE-HOFCM | 0.78 | - | 0.92 |
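For reference, the adjusted Rand index can be computed from pair counts in the contingency table between true classes and predicted clusters. The sketch below uses the standard pair-counting formula in plain Python. The function name `adjusted_rand_index` and the toy labels are illustrative, and the handling of degenerate cases (returning 1.0) is a choice made here, not something the record specifies.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(y_true, y_pred):
    """Adjusted Rand index from pair counts (standard formula)."""
    n = len(y_true)
    if n < 2:
        return 1.0  # degenerate case: no pairs to compare
    pairs = Counter(zip(y_true, y_pred))   # contingency-table cells
    a = Counter(y_true)                    # true class sizes
    b = Counter(y_pred)                    # predicted cluster sizes
    sum_ij = sum(comb(c, 2) for c in pairs.values())
    sum_a = sum(comb(c, 2) for c in a.values())
    sum_b = sum(comb(c, 2) for c in b.values())
    expected = sum_a * sum_b / comb(n, 2)  # expected index under random labeling
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (sum_ij - expected) / (max_index - expected)

ari_same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])   # same partition, relabeled
ari_cross = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])  # partitions cut across each other
```

Unlike ACC, ARI is corrected for chance: it is 1 for identical partitions, close to 0 for random labelings, and can be negative.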
Figure 3. Visual analysis of MNIST datasets.
Figure 4. Reconstruction quality for different dimensionalities.
Figure 5. Cluster category sampling.
Figure 6. Clustering accuracy (ACC).
Figure 7. Clustering performance measured by the adjusted Rand index (ARI).
Figure 8. Reconstruction quality for different dimensionalities.
Figure 9. Reconstruction quality for noise data.