Chao Wang1, Jianchuan Feng1, Linfang Liu1, Sihang Jiang1, Wei Wang1.
Abstract
Recently, an experimental conclusion in Li et al. (Knowl Inf Syst 62(2):611-637, 1) about measures of uncertainty for knowledge bases has attracted considerable research interest. However, these efforts lack a solid theoretical interpretation of the experimental conclusion. The main limitation of that research is that the final conclusion is derived from experiments on only three datasets, so it remains unknown whether the conclusion is universal. In our work, we first review the mathematical theories, definitions, and tools for measuring the uncertainty of knowledge bases. We then provide a series of rigorous theoretical proofs that reveal why the knowledge amount of a knowledge structure is superior for measuring the uncertainty of knowledge bases. Combined with experimental results, we verify that knowledge amount performs much better at measuring the uncertainty of knowledge bases. Hence, we prove from a mathematical point of view an empirical conclusion that was previously established only through experiments. In addition, we find that our conclusion still applies to some knowledge bases that cannot be classified by entity attributes, such as ProBase (a probabilistic taxonomy). Our conclusions therefore have a degree of universality and interpretability, and they provide a theoretical basis for measuring the uncertainty of many different types of knowledge bases. The findings also have implications for future practice.
Keywords: Concept structure; Knowledge base; Knowledge structure; ProBase; Rough set theory; Uncertainty
Year: 2022 PMID: 35756085 PMCID: PMC9207183 DOI: 10.1007/s10489-022-03726-7
Source DB: PubMed Journal: Appl Intell (Dordr) ISSN: 0924-669X Impact factor: 5.019
Key Notations and Descriptions
| Notation | Description |
|---|---|
| | the empty set |
| | the set of real numbers |
| | the set of positive integers |
| | a non-empty finite set, named |
| | the family of all subsets of |
| | the binary relation between |
| | the set of all binary relations |
| | the set of all binary relations |
| | the set of all binary relations |
| | the set of all binary relations |
| | the family of all equivalence relations on |
| | the cardinality of |
| | the simplified form of |
| | the knowledge base |
| | the knowledge base induced by ProBase |
| | the measure set on |
Candies are divided according to color, shape and taste
| Attribute | ||||||||
|---|---|---|---|---|---|---|---|---|
| Red | ✓ | ✓ | ✓ | |||||
| Blue | ✓ | ✓ | ||||||
| Yellow | ✓ | ✓ | ✓ | |||||
| Square | ✓ | ✓ | ||||||
| Round | ✓ | ✓ | ||||||
| Triangular | ✓ | ✓ | ✓ | ✓ | ||||
| Lemony | ✓ | ✓ | ✓ | |||||
| Sweet | ✓ | ✓ | ✓ | ✓ | ✓ |
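
In rough set theory, each attribute (here color, shape, or taste) induces an equivalence relation on the universe of objects, and therefore a partition of it. The following is a minimal sketch of that idea. The candy labels `x1`..`x8` and their attribute values are hypothetical: the table above no longer shows which candy has which attribute, so the assignment below was chosen only to match the number of check marks in each row. The helper `partition_by` is likewise an illustrative name, not something taken from the paper.

```python
from collections import defaultdict

# Hypothetical universe of eight candies: label -> (color, shape, taste).
# The individual assignments are assumptions made for illustration only.
candies = {
    "x1": ("red", "round", "sweet"),
    "x2": ("red", "square", "sweet"),
    "x3": ("red", "triangular", "sweet"),
    "x4": ("blue", "square", "sweet"),
    "x5": ("blue", "triangular", "lemony"),
    "x6": ("yellow", "triangular", "sweet"),
    "x7": ("yellow", "round", "lemony"),
    "x8": ("yellow", "triangular", "lemony"),
}


def partition_by(universe, index):
    """Group objects that share the same value of the attribute at `index`.

    Returns a dict mapping each attribute value to its equivalence class.
    """
    blocks = defaultdict(set)
    for label, attributes in universe.items():
        blocks[attributes[index]].add(label)
    return dict(blocks)


if __name__ == "__main__":
    for position, name in enumerate(("color", "shape", "taste")):
        print(name, partition_by(candies, position))
```

Each call yields one partition of the same universe; a family of such partitions is the kind of object whose uncertainty the measures discussed in this record are meant to quantify.
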
C-values of measure sets M(KGR), M(REN), M(KEN) and M(KAM)
| Data set | M(KGR) | M(REN) | M(KEN) | M(KAM) |
|---|---|---|---|---|
| Nursery | 2.0431 | 0.6978 | 0.4750 | 0.1141 |
| Solar Flare | 0.9857 | 0.3219 | 0.2806 | 0.0615 |
| Tic-Tac-Toe Endgame | 1.7882 | 0.9015 | 0.4340 | 0.1186 |
Fig. 1 A visualization of the different evaluation functions x, , , and at k = 16
Fig. 2 A visualization of the different evaluation functions x, , , and at k = 25
Fig. 3 Comparison of the measure values of the four measurement functions
Fig. 4 Comparison of the outputs in λ(⋅) corresponding to the four different inputs
Data sets from UCI (footnote a); “#X” represents the number of “X”
| Datasets | Area | #Attributes | #Instances |
|---|---|---|---|
| Tic-Tac-Toe Endgame | Game | 9 | 958 |
| Chess | Game | 36 | 3,196 |
| Dota2 Games | Game | 116 | 102,944 |
| Lymphography | Life Science | 18 | 148 |
| Mushroom | Life Science | 22 | 8,124 |
| SPECT Heart | Life Science | 22 | 267 |
| Abalone | Life Science | 8 | 4,177 |
| Estimation of obesity levels | Life Science | 17 | 2,111 |
| Primary Tumor | Life Science | 17 | 339 |
| Breast Cancer | Life Science | 10 | 116 |
| Congressional Voting Records | Social Science | 16 | 435 |
| Balance Scale | Social Science | 4 | 625 |
| Nursery | Social Science | 8 | 12,960 |
| Student Performance | Social Science | 33 | 649 |
| Letter Recognition | Computer | 16 | 20,000 |
| Solar Flare | Physical | 10 | 1,389 |
| Car Evaluation | Other | 6 | 1,728 |
| MONK’s Problems | Other | 7 | 432 |
a https://archive.ics.uci.edu/ml/index.php
Coefficient of variation values of measure sets M(KGR), M(REN), M(KEN), and M(KAM)
| Index | Datasets | M(KGR) | M(REN) | M(KEN) | M(KAM) |
|---|---|---|---|---|---|
| (a) | Tic-Tac-Toe Endgame | 1.7879 | 0.9015 | 0.4340 | 0.1186 |
| (b) | Chess | 1.5765 | 0.6719 | 0.5865 | 0.1276 |
| (c) | Dota2 Games | 4.6868 | 2.5229 | 1.5775 | 0.6037 |
| (d) | Lymphography | 1.5971 | 0.7135 | 0.4518 | 0.0946 |
| (e) | Mushroom | 2.8592 | 0.6501 | 0.3279 | 0.0807 |
| (f) | SPECT Heart | 0.8096 | 0.5593 | 0.2969 | 0.1384 |
| (g) | Abalone | 2.0676 | 1.7854 | 0.6837 | 0.1041 |
| (h) | Estimation of obesity levels | 3.6314 | 3.0076 | 0.2442 | 0.1288 |
| (i) | Primary Tumor | 1.8870 | 0.8839 | 0.3288 | 0.1289 |
| (j) | Breast Cancer | 1.5247 | 0.9560 | 0.3517 | 0.0980 |
| (k) | Congressional Voting Records | 1.5189 | 0.6481 | 0.3574 | 0.1253 |
| (l) | Balance Scale | 1.2943 | 0.7453 | 0.4472 | 0.0861 |
| (m) | Nursery | 2.0431 | 0.6978 | 0.4750 | 0.1141 |
| (n) | Student Performance | 3.1088 | 1.9325 | 0.2946 | 0.1643 |
| (o) | Letter Recognition | 3.1032 | 1.3883 | 0.2953 | 0.0380 |
| (p) | Solar Flare | 0.9537 | 0.4224 | 0.1988 | 0.0204 |
| (q) | Car Evaluation | 1.5439 | 1.0686 | 0.2148 | 0.0556 |
| (r) | MONK’s Problems | 1.3650 | 0.9847 | 0.3916 | 0.1201 |
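
The coefficient of variation reported in the table above is the ratio of a standard deviation to a mean. The sketch below shows one way to compute it and to check which measure has the smallest value in a given row. It rests on two assumptions: the population standard deviation is used (the record does not say whether the authors used the population or the sample form), and the four columns are ordered KGR, REN, KEN, KAM as in the caption. The function names are illustrative, and only three rows of the table are copied in.

```python
import statistics

MEASURES = ("KGR", "REN", "KEN", "KAM")  # assumed column order

# Three rows copied from the table above (dataset -> four CV values).
cv_table = {
    "Mushroom": (2.8592, 0.6501, 0.3279, 0.0807),
    "Nursery": (2.0431, 0.6978, 0.4750, 0.1141),
    "Solar Flare": (0.9537, 0.4224, 0.1988, 0.0204),
}


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (an assumption)."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for mean 0")
    return statistics.pstdev(values) / mean


def smallest_measure(row):
    """Name of the measure with the smallest coefficient of variation."""
    best = min(range(len(row)), key=lambda i: row[i])
    return MEASURES[best]


if __name__ == "__main__":
    for dataset, row in cv_table.items():
        print(dataset, smallest_measure(row))
```

For the three rows copied here, the smallest value falls in the KAM column each time, which matches the pattern visible across the full table.
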
Fig. 5 Coefficient of variation values of four measure sets on datasets (a)–(r)
Statistical information of D1, D2 and D3
| Datasets | #concepts | #Instances |
|---|---|---|
| D1 | 3 | 72 |
| D2 | 3 | 123 |
| D3 | 3 | 1290 |
Coefficient of variation values of measure sets MKGR(D), MREN(D), MKEN(D), and MKAM(D) on dataset D
| Datasets | MKGR(D) | MREN(D) | MKEN(D) | MKAM(D) |
|---|---|---|---|---|
| D1 | 0.6217 | 0.3554 | 0.4246 | 0.2498 |
| D2 | 0.8889 | 0.5106 | 0.4073 | 0.1239 |
| D3 | 0.2705 | 0.0891 | 0.2397 | 0.0658 |
Fig. 6 Coefficient of variation values of four measure sets on datasets D1, D2 and D3
Statistical information of D4
| Fruits | Hard | Soft | Non-citrus | Citrus |
|---|---|---|---|---|
| Apple | ✓ | ✓ | ||
| Apricot | ✓ | ✓ | ||
| Banana | ✓ | ✓ | ||
| Berry | ✓ | ✓ | ||
| Cherry | ✓ | ✓ | ||
| Gooseberry | ✓ | ✓ | ||
| Grape | ✓ | ✓ | ||
| Grapefruit | ✓ | ✓ | ||
| Kiwi | ✓ | ✓ | ||
| Melon | ✓ | ✓ | ||
| Orange | ✓ | ✓ | ||
| Papaya | ✓ | ✓ | ||
| Peach | ✓ | ✓ | ||
| Pear | ✓ | ✓ | ||
| Pineapple | ✓ | ✓ | ||
| Plum | ✓ | ✓ | ||
| Raspberry | ✓ | ✓ | ||
| Tomato | ✓ | ✓ |
Fig. 7 Coefficient of variation values of measure sets on dataset D4