Tianxu He, Shukui Zhang, Jie Xin, Pengpeng Zhao, Jian Wu, Xuefeng Xian, Chunhua Li, Zhiming Cui.
Abstract
Big data from the Internet of Things may create big challenges for data classification. Most active learning approaches select either uncertain or representative unlabeled instances to query their labels. Although several active learning algorithms have been proposed to combine the two criteria for query selection, they are usually ad hoc in finding unlabeled instances that are both informative and representative, and they fail to take the diversity of instances into account. We address this challenge by presenting a new active learning framework that considers uncertainty, representativeness, and diversity. The proposed approach provides a systematic way of measuring and combining the uncertainty, representativeness, and diversity of an instance. First, the uncertainty and representativeness of the instances are used to constitute the most informative set. Then, the kernel k-means clustering algorithm is used to filter out redundant samples, and the remaining samples are queried for labels. Extensive experimental results show that the proposed approach outperforms several state-of-the-art active learning approaches.
Year: 2014 PMID: 25180208 PMCID: PMC4144157 DOI: 10.1155/2014/827586
Source DB: PubMed Journal: ScientificWorldJournal ISSN: 1537-744X
Algorithm 1: Incorporating uncertainty, representativeness, and diversity for active learning.
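The body of Algorithm 1 is not reproduced in this record, so the following is only a minimal sketch of the two-stage selection described in the abstract, not the authors' implementation. Several details are assumptions made for illustration: uncertainty is measured as the entropy of predicted class probabilities, representativeness as the mean RBF kernel similarity to the other unlabeled instances, the two are combined by a simple product, and one top-scoring instance is taken from each kernel k-means cluster. The names `select_queries`, `pool_size`, and `gamma` are hypothetical.

```python
import numpy as np


def rbf_kernel(X, gamma=1.0):
    """Pairwise RBF kernel matrix exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def kernel_kmeans(K, k, n_iter=50, seed=0):
    """Kernel k-means on a precomputed kernel matrix K; returns cluster labels."""
    rng = np.random.default_rng(seed)
    m = K.shape[0]
    # Balanced random initialisation so every cluster starts non-empty.
    labels = rng.permutation(np.arange(m) % k)
    for _ in range(n_iter):
        dist = np.full((m, k), np.inf)  # empty clusters stay at infinity
        for c in range(k):
            mask = labels == c
            size = mask.sum()
            if size == 0:
                continue
            # Squared feature-space distance from each point to the cluster mean.
            dist[:, c] = (np.diag(K)
                          - 2.0 * K[:, mask].sum(axis=1) / size
                          + K[np.ix_(mask, mask)].sum() / size ** 2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels


def select_queries(X, probs, k, pool_size, gamma=1.0, seed=0):
    """Pick up to k unlabeled instances to query (illustrative assumptions only).

    X     : (n, d) features of the unlabeled instances
    probs : (n, n_classes) class probabilities predicted by the current model
    """
    # Assumed uncertainty measure: predictive entropy.
    uncertainty = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    # Assumed representativeness measure: mean kernel similarity to the pool.
    K = rbf_kernel(X, gamma)
    representativeness = K.mean(axis=1)
    # Assumed combination rule: product of the two scores.
    score = uncertainty * representativeness
    # Stage 1: the most informative set is the top-scoring pool_size instances.
    pool = np.argsort(score)[::-1][:pool_size]
    # Stage 2: cluster that set and keep one instance per cluster for diversity.
    labels = kernel_kmeans(K[np.ix_(pool, pool)], k, seed=seed)
    chosen = []
    for c in range(k):
        members = pool[labels == c]
        if members.size:
            chosen.append(int(members[np.argmax(score[members])]))
    return chosen
```

In an active learning loop, the returned indices would be sent for labeling, moved from the unlabeled set to the labeled set, and the classifier retrained before the next round. Setting `k` to the number of instances labeled per round matches the idea of filtering redundant samples by clustering, although the exact scoring and selection rules used in the paper may differ.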
Dataset properties and the corresponding sizes used.
| Dataset | Classes | Features | Initial set size | Unlabeled set size | Test set size |
|---|---|---|---|---|---|
| USPS | 10 | 256 | 30 | 5000 | 2000 |
| Letters | 26 | 16 | 30 | 5000 | 3000 |
| Pendigits | 10 | 16 | 30 | 7000 | 3498 |
Cluster number for each dataset.
| Dataset | Labeled numbers at each round |
|---|---|
| USPS | 10 |
| Pendigits | 10 |
| Letters | 26 |
Figure 1: Results on USPS dataset.
Figure 2: Results on Pendigits dataset.
Figure 3: Results on Letters dataset.
Figure 4: Comparison of diversity on 10 classes.
Figure 5: Comparison of diversity on 26 classes.