Piotr A Kowalski, Piotr Kulczycki.
Abstract
Automated classification systems have enabled the rapid development of exploratory data analysis. Such systems reduce the need for human intervention in obtaining analysis results, especially when imprecise information is under consideration. The aim of this paper is to present a novel neural network approach for classifying interval information. The presented methodology generalizes the probabilistic neural network to interval data processing. The simple structure of this neural classification algorithm makes it applicable for research purposes. The procedure is based on the Bayes approach, which minimizes the potential losses arising from classification errors. The topological structure of the network and the learning process are described in detail. The correctness of the proposed procedure has been verified through numerical tests on both synthetic data and benchmark instances. The results of the numerical verification, carried out for data sets of different shapes, together with a comparative analysis against other methods of similar conditioning, have validated both the concept presented here and its positive features.
Keywords: Classification; Data analysis; Imprecise information; Interval data; Interval probabilistic neural network; Neural network; Numerical simulation
Year: 2015 PMID: 28386161 PMCID: PMC5362677 DOI: 10.1007/s00521-015-2109-3
Source DB: PubMed Journal: Neural Comput Appl ISSN: 0941-0643 Impact factor: 5.606
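The abstract describes a probabilistic neural network generalized to interval data and grounded in the Bayes approach. As a rough illustration only, the sketch below scores each class by a Gaussian kernel density estimate averaged over the tested interval and picks the class with the largest prior-weighted score. It is not the authors' algorithm: the one-dimensional setting, the fixed smoothing parameter `h`, the priors taken from the pattern sizes, and all function names are assumptions made for this example.

```python
import math


def gauss_pdf(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def gauss_cdf(u):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def class_score(lo, hi, pattern, h):
    """Kernel density estimate of one class, averaged over [lo, hi].

    For lo == hi this is the ordinary point-data kernel estimate.
    """
    total = 0.0
    for x in pattern:
        if hi > lo:
            # Mean of the Gaussian kernel over the interval (assumed weighting).
            total += (gauss_cdf((hi - x) / h) - gauss_cdf((lo - x) / h)) / (hi - lo)
        else:
            total += gauss_pdf((lo - x) / h) / h
    return total / len(pattern)


def classify(lo, hi, patterns, h=0.5):
    """Index of the class with the largest prior-weighted score."""
    n = sum(len(p) for p in patterns)
    # Priors estimated from the pattern sizes (an assumption of this sketch).
    scores = [len(p) / n * class_score(lo, hi, p, h) for p in patterns]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical toy patterns for two classes.
patterns = [[-0.5, 0.0, 0.4], [1.6, 2.1, 2.5]]
print(classify(-0.2, 0.2, patterns))  # interval element -> 0
print(classify(2.0, 2.0, patterns))   # precise (zero-length) element -> 1
```

A zero-length interval falls back to the point kernel, so the same routine covers both precise and interval inputs.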
Fig. 1 The KDE function
Fig. 2 The basic architecture of the PNN
Fig. 3 Structure of the interval probabilistic neural network
Fig. 4 Flow chart of algorithm for the neural classification of interval information
Results of the numerical verification for the patterns N(0, 1) and N(2, 1)
| Pattern size \ Interval length | 0.00 | 0.10 | 0.25 | 0.50 | 1.00 | 2.00 | 5.00 |
|---|---|---|---|---|---|---|---|
| 10 | 0.1713 | 0.1720 | 0.1720 | 0.1723 | 0.1729 | 0.1761 | 0.1944 |
| 20 | 0.1655 | 0.1669 | 0.1669 | 0.1672 | 0.1680 | 0.1713 | 0.1888 |
| 50 | 0.1602 | 0.1605 | 0.1606 | 0.1609 | 0.1617 | 0.1652 | 0.1848 |
| 100 | 0.1596 | 0.1601 | 0.1602 | 0.1604 | 0.1615 | 0.1650 | 0.1827 |
| 200 | 0.1596 | 0.1602 | 0.1604 | 0.1609 | 0.1618 | 0.1650 | 0.1840 |
| 500 | 0.1591 | 0.1595 | 0.1596 | 0.1602 | 0.1613 | 0.1647 | 0.1844 |
| 1000 | 0.1579 | 0.1584 | 0.1588 | 0.1591 | 0.1603 | 0.1637 | 0.1833 |
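As a point of reference for the table above, the smallest achievable error for precise (zero-length) data drawn from N(0, 1) and N(2, 1) can be computed in closed form. The sketch below assumes equal class priors and a decision threshold midway between the two means; the function names are illustrative. The resulting value, about 0.1587, is close to the 0.1579 reported in the last row (1000) of the 0.00 column.

```python
import math


def normal_cdf(u):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def bayes_error_equal_priors(mu1, mu2, sigma=1.0):
    """Bayes error for two equally likely normal classes with a common sigma.

    The optimal threshold is the midpoint of the means, so each class
    contributes the tail mass beyond half the standardized distance.
    """
    d = abs(mu2 - mu1) / sigma
    return normal_cdf(-d / 2.0)


print(round(bayes_error_equal_priors(0.0, 2.0), 4))  # 0.1587
```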
The neural classification—basic concept
The results of the numerical verification for normal distributions with expected values , and unit covariance matrices
| Pattern size \ Interval length | 0.1 | 0.5 | 2.0 |
|---|---|---|---|
| 10 | 0.2082 | 0.2086 | 0.2109 |
| 20 | 0.1881 | 0.1882 | 0.1910 |
| 50 | 0.1725 | 0.1728 | 0.1762 |
| 100 | 0.1676 | 0.1677 | 0.1710 |
| 200 | 0.1630 | 0.1634 | 0.1679 |
| 500 | 0.1610 | 0.1615 | 0.1656 |
| 1000 | 0.1590 | 0.1593 | 0.1640 |
The results of the numerical verification for normal distributions with expected values , and unit covariance matrices
| Pattern size \ Interval length | 0.1 × 0.1 × 0.1 | 0.5 × 0.5 × 0.5 | 2.0 × 2.0 × 2.0 |
|---|---|---|---|
| 10 | 0.2295 | 0.2294 | 0.2302 |
| 20 | 0.2137 | 0.2140 | 0.2159 |
| 50 | 0.1906 | 0.1908 | 0.1911 |
| 100 | 0.1826 | 0.1824 | 0.1830 |
| 200 | 0.1760 | 0.1760 | 0.1767 |
| 500 | 0.1684 | 0.1687 | 0.1698 |
| 1000 | 0.1629 | 0.1631 | 0.1644 |
The numerical simulation results for equinumerous pattern sets, i.e.,
| Pattern size | Interval length 0.1 | | Interval length 0.5 | | Interval length 2.0 | |
|---|---|---|---|---|---|---|
| 10 | 0.1720 | 0.0870 | 0.1723 | 0.0870 | 0.1761 | 0.0841 |
| | | 0.0850 | | 0.0853 | | 0.0878 |
| 20 | 0.1669 | 0.0823 | 0.1672 | 0.0823 | 0.1713 | 0.0841 |
| | | 0.0846 | | 0.0849 | | 0.0872 |
| 50 | 0.1605 | 0.0814 | 0.1609 | 0.0814 | 0.1652 | 0.0836 |
| | | 0.0791 | | 0.0795 | | 0.0815 |
| 100 | 0.1601 | 0.0806 | 0.1604 | 0.0805 | 0.1650 | 0.0827 |
| | | 0.0795 | | 0.0799 | | 0.0823 |
| 200 | 0.1602 | 0.0797 | 0.1609 | 0.0801 | 0.1650 | 0.0822 |
| | | 0.0805 | | 0.0808 | | 0.0828 |
| 500 | 0.1595 | 0.0776 | 0.1602 | 0.0778 | 0.1647 | 0.0797 |
| | | 0.0819 | | 0.0824 | | 0.0850 |
| 1000 | 0.1584 | 0.0769 | 0.1591 | 0.0770 | 0.1637 | 0.0784 |
| | | 0.0814 | | 0.0821 | | 0.0853 |
The numerical simulation results for imbalanced pattern sets, i.e.,
| Pattern size | Interval length 0.1 | | Interval length 0.5 | | Interval length 2.0 | |
|---|---|---|---|---|---|---|
| 10 | 0.1703 | 0.0960 | 0.1705 | 0.0958 | 0.1740 | 0.0942 |
| | | 0.0743 | | 0.0747 | | 0.0768 |
| 20 | 0.1660 | 0.0923 | 0.1662 | 0.0924 | 0.1698 | 0.0942 |
| | | 0.0737 | | 0.0738 | | 0.0756 |
| 50 | 0.1611 | 0.0896 | 0.1617 | 0.0898 | 0.1661 | 0.0920 |
| | | 0.0715 | | 0.0719 | | 0.0741 |
| 100 | 0.1605 | 0.0873 | 0.1608 | 0.0873 | 0.1649 | 0.0899 |
| | | 0.0733 | | 0.0734 | | 0.0750 |
| 200 | 0.1605 | 0.0874 | 0.1610 | 0.0876 | 0.1658 | 0.0900 |
| | | 0.0731 | | 0.0735 | | 0.0758 |
| 500 | 0.1597 | 0.0862 | 0.1603 | 0.0862 | 0.1646 | 0.0880 |
| | | 0.0736 | | 0.0741 | | 0.0766 |
| 1000 | 0.1596 | 0.0870 | 0.1599 | 0.0869 | 0.1641 | 0.0882 |
| | | 0.0727 | | 0.0731 | | 0.0759 |
The numerical simulation results for imbalanced pattern sets, i.e.,
| Pattern size | Interval length 0.1 | | Interval length 0.5 | | Interval length 2.0 | |
|---|---|---|---|---|---|---|
| 10 | 0.1753 | 0.1184 | 0.1759 | 0.1186 | 0.1791 | 0.1176 |
| | | 0.0569 | | 0.0572 | | 0.0588 |
| 20 | 0.1718 | 0.1155 | 0.1722 | 0.1157 | 0.1754 | 0.1176 |
| | | 0.0563 | | 0.0565 | | 0.0578 |
| 50 | 0.1691 | 0.1143 | 0.1697 | 0.1146 | 0.1740 | 0.1171 |
| | | 0.0548 | | 0.0552 | | 0.0568 |
| 100 | 0.1675 | 0.1125 | 0.1682 | 0.1129 | 0.1723 | 0.1156 |
| | | 0.0550 | | 0.0553 | | 0.0568 |
| 200 | 0.1680 | 0.1125 | 0.1684 | 0.1126 | 0.1726 | 0.1152 |
| | | 0.0556 | | 0.0557 | | 0.0574 |
| 500 | 0.1665 | 0.1103 | 0.1671 | 0.1105 | 0.1717 | 0.1131 |
| | | 0.0562 | | 0.0566 | | 0.0585 |
| 1000 | 0.1646 | 0.1100 | 0.1662 | 0.1095 | 0.1697 | 0.1126 |
| | | 0.0546 | | 0.0567 | | 0.0571 |
Results of the numerical verification for a linear combination of normal distributions: the expected values , , and unit standard deviations and factors of the combination 1/3, 1/3, 1/3 for the first class, and for the second class, with the expected values , and unit standard deviations and factors of the combination 1/2, 1/2
| Pattern size \ Interval length | 0.1 | 0.5 | 2.0 |
|---|---|---|---|
| 10 | 0.2438 | 0.2442 | 0.2509 |
| 20 | 0.2251 | 0.2257 | 0.2339 |
| 50 | 0.2067 | 0.2077 | 0.2166 |
| 100 | 0.1984 | 0.1992 | 0.2092 |
| 200 | 0.1929 | 0.1943 | 0.2041 |
| 500 | 0.1886 | 0.1899 | 0.2015 |
| 1000 | 0.1859 | 0.1869 | 0.1994 |
The results of the numerical verification for the data Toy 2D
| Interval length | 0.00 | 0.10 | 0.25 | 0.50 | 1.00 | 2.00 |
|---|---|---|---|---|---|---|
| Classification error | 0.0681 | 0.0737 | 0.0750 | 0.0766 | 0.0828 | 0.1097 |
The results of the numerical verification for the data Synthetic Two-Class Problem
| Interval length | 0.00 | 0.10 | 0.25 | 0.50 | 1.00 | 2.00 |
|---|---|---|---|---|---|---|
| Classification error | 0.142 | 0.141 | 0.148 | 0.147 | 0.181 | 0.258 |
The results of the numerical verification for the Iris Data Set
| Interval length | 0.00 | 0.10 | 0.25 | 0.50 | 1.00 | 2.00 |
|---|---|---|---|---|---|---|
| Classification error | 0.041 | 0.045 | 0.047 | 0.048 | 0.049 | 0.066 |
The results of the numerical verification for the Seeds Data set
| Interval length | 0.00 | 0.10 | 0.25 | 0.50 | 1.00 | 2.00 |
|---|---|---|---|---|---|---|
| Classification error | 0.0777 | 0.0772 | 0.0798 | 0.0828 | 0.0869 | 0.1018 |
The results of the numerical verification for the Breast Cancer Wisconsin
| Interval length | 0.00 | 0.10 | 0.25 | 0.50 | 1.00 | 2.00 |
|---|---|---|---|---|---|---|
| Classification error | 0.0296 | 0.0301 | 0.0365 | 0.0404 | 0.0486 | 0.0692 |
The results of verification by way of using SVM for normal distributions N(0, 1) and N(2, 1)
| Pattern size | Length 0.1: Error | Length 0.1: No decision | Length 0.1: Full error | Length 0.5: Error | Length 0.5: No decision | Length 0.5: Full error | Length 2.0: Error | Length 2.0: No decision | Length 2.0: Full error |
|---|---|---|---|---|---|---|---|---|---|
| 10 | 0.2202 | 0.0460 | 0.2432 | 0.1478 | 0.1892 | 0.2424 | 0.0989 | 0.3197 | 0.2588 |
| 20 | 0.2020 | 0.0652 | 0.2346 | 0.1166 | 0.2355 | 0.2343 | 0.0965 | 0.3146 | 0.2537 |
| 50 | 0.1678 | 0.0437 | 0.1897 | 0.1152 | 0.1518 | 0.1911 | 0.0789 | 0.2799 | 0.2189 |
| 100 | 0.1572 | 0.0225 | 0.1685 | 0.1243 | 0.0920 | 0.1703 | 0.0698 | 0.2666 | 0.2031 |
| 200 | 0.1538 | 0.0148 | 0.1612 | 0.1273 | 0.0692 | 0.1620 | 0.0669 | 0.2582 | 0.1960 |
| 500 | 0.1530 | 0.0129 | 0.1595 | 0.1286 | 0.0627 | 0.1599 | 0.0646 | 0.2587 | 0.1940 |
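The three columns reported per interval length in the SVM table are numerically related: the tabulated "Full error" values are consistent with the "Error" rate plus half of the "No decision" rate, which is what one would obtain if every undecided case in a two-class problem were assigned a class at random. This reading is inferred from the numbers and is not stated in the table itself; the helper name below is illustrative.

```python
def full_error(error, *undecided):
    """Error rate when each undecided case is guessed at random (two classes)."""
    return error + 0.5 * sum(undecided)


# First row (10), interval length 0.1: Error 0.2202, No decision 0.0460.
print(round(full_error(0.2202, 0.0460), 4))  # 0.2432, as tabulated
# First row (10), interval length 2.0: Error 0.0989, No decision 0.3197.
print(round(full_error(0.0989, 0.3197), 4))  # 0.2588, as tabulated
```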
The results of verification by way of counting the elements contained in the interval pattern data for normal distributions N(0, 1) and N(2, 1)
| Pattern size | Length 0.1: Error | Length 0.1: Equal el. | Length 0.1: No el. | Length 0.1: Full error | Length 0.5: Error | Length 0.5: Equal el. | Length 0.5: No el. | Length 0.5: Full error | Length 2.0: Error | Length 2.0: Equal el. | Length 2.0: No el. | Length 2.0: Full error |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 0.0216 | 0.0017 | 0.9086 | 0.4768 | 0.0751 | 0.0223 | 0.6440 | 0.4083 | 0.1296 | 0.0573 | 0.2761 | 0.2962 |
| 20 | 0.0400 | 0.0055 | 0.8291 | 0.4572 | 0.1063 | 0.0435 | 0.4546 | 0.3553 | 0.1469 | 0.0483 | 0.1558 | 0.2489 |
| 50 | 0.0775 | 0.0227 | 0.6461 | 0.4120 | 0.1345 | 0.0555 | 0.2377 | 0.2810 | 0.1543 | 0.0296 | 0.0722 | 0.2052 |
| 100 | 0.1085 | 0.0426 | 0.4592 | 0.3594 | 0.1474 | 0.0422 | 0.1297 | 0.2333 | 0.1588 | 0.0177 | 0.0387 | 0.1870 |
| 200 | 0.1299 | 0.0557 | 0.2834 | 0.2994 | 0.1525 | 0.0272 | 0.0717 | 0.2020 | 0.1617 | 0.0104 | 0.0211 | 0.1774 |
| 500 | 0.1473 | 0.0454 | 0.1313 | 0.2356 | 0.1563 | 0.0152 | 0.0319 | 0.1799 | 0.1632 | 0.0048 | 0.0093 | 0.1703 |
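The counting-based comparison above distinguishes a decided class from two undecided outcomes, "Equal el." and "No el.". The source does not spell out the rule, so the following is only a minimal sketch of how such a counting classifier could work, assuming point-valued pattern elements, a closed tested interval, and two classes; the function name and the returned labels are hypothetical.

```python
def count_classify(lo, hi, pattern_a, pattern_b):
    """Classify the interval [lo, hi] by counting pattern elements inside it."""
    in_a = sum(lo <= x <= hi for x in pattern_a)
    in_b = sum(lo <= x <= hi for x in pattern_b)
    if in_a == 0 and in_b == 0:
        return "no elements"      # nothing to count: undecided
    if in_a == in_b:
        return "equal elements"   # tie between the classes: undecided
    return "A" if in_a > in_b else "B"


# Hypothetical toy patterns.
pattern_a = [0.2, 0.7, 1.5]
pattern_b = [0.9, 2.4]
print(count_classify(0.0, 1.0, pattern_a, pattern_b))  # A
print(count_classify(0.5, 1.0, pattern_a, pattern_b))  # equal elements
print(count_classify(5.0, 6.0, pattern_a, pattern_b))  # no elements
```

The tabulated "Full error" values are consistent with counting half of both undecided rates as errors: for the first row at interval length 0.1, 0.0216 + (0.0017 + 0.9086)/2 ≈ 0.4768.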