| Literature DB >> 33266882 |
Ivo Bukovsky, Witold Kinsner, Noriyasu Homma.
Abstract
Recently, a novel concept of a non-probabilistic novelty detection measure, based on a multi-scale quantification of unusually large learning efforts of machine learning systems, was introduced as learning entropy (LE). The key finding with LE is that the learning effort of learning systems is quantifiable as a novelty measure for each individually observed data point of otherwise complex dynamic systems, while the model accuracy is not a necessary requirement for novelty detection. This brief paper extends the explanation of LE from the point of an informatics approach towards a cognitive (learning-based) information measure emphasizing the distinction from Shannon's concept of probabilistic information. Fundamental derivations of learning entropy and of its practical estimations are recalled and further extended. The potentials, limitations, and, thus, the current challenges of LE are discussed.Entities:
Keywords: information; learning; learning systems; non-probabilistic entropy; novelty detection
Year: 2019 PMID: 33266882 PMCID: PMC7514648 DOI: 10.3390/e21020166
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
Figure 1. (Top) Chaotic (deterministic) time series with a sudden occurrence of white noise (k > 400), superimposed on the output of its real-time, sample-by-sample learning predictor. (Middle) The weight updates, which cannot converge while the target is noise. (Bottom) Approximate learning entropies (of various orders) via (19) detect the noise as novelty immediately at its occurrence at k > 400; LE then decreases as the large variance of the learning increments becomes the new usual learning pattern (details on LE and its orders are in Section 4.1 and Section 4.2).
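To show where the monitored weight increments come from, here is a minimal sketch of a real-time, sample-by-sample learning predictor; the linear model structure, the number of inputs n, and the learning rate mu are assumptions for illustration, not the paper's exact predictor:

```python
import numpy as np

def predictor_weight_increments(y, n=5, mu=0.1):
    """One-step-ahead linear predictor trained by normalized gradient
    descent, sample by sample; returns the per-sample weight-increment
    vectors dW that learning entropy monitors (model choice illustrative)."""
    y = np.asarray(y, dtype=float)
    w = np.zeros(n)
    dW = np.zeros((len(y), n))
    for k in range(n, len(y)):
        x = y[k - n:k][::-1]                 # n most recent samples
        e = y[k] - w @ x                     # one-step prediction error
        dw = mu * e * x / (1.0 + x @ x)      # normalized GD update
        w += dw
        dW[k] = dw                           # learning effort at sample k
    return dW
```

Feeding these increments to the sketch after the abstract, e.g. approximate_learning_entropy(predictor_weight_increments(y)), would give a qualitative analogue of the bottom panel of Figure 1.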
The order of learning entropy (OLE) is determined by the order of the weight-increment differences in (12)–(14); a sketch of these higher-order increments follows the table below.
| Detection Function Modifications for Varying Orders of LE (table body not recovered in extraction) |
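Following the note above, a higher-order detection simply replaces the first-order weight increments with repeated time differences of the increment sequence before applying the same multi-scale test. A minimal sketch, assuming the increments are stacked as an (N, n_w) array as in the sketch after the abstract; the helper name is illustrative:

```python
import numpy as np

def higher_order_increments(dW, order):
    """Repeatedly difference the weight-increment sequence in time:
    order 1 keeps |delta w|, order 2 yields |delta^2 w|, and so on,
    matching the idea that OLE is set by the increment-difference order."""
    d = np.asarray(dW, dtype=float)
    for _ in range(order - 1):
        d = np.diff(d, axis=0)   # one further time difference per order
    return np.abs(d)
```

For example, approximate_learning_entropy(higher_order_increments(dW, 2)) would evaluate a second-order LE with the earlier sketch.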
Figure 2. Performance of the direct algorithm (23) for estimating learning entropy of various orders, for a non-pretrained adaptive predictor with a learning rate that is too low (left graphs) and with a reasonable learning rate (right graphs); the normally distributed noise occupies the same interval as in Figure 1 and Figure 3.
Figure 3. Limitations and challenges: the alternative LE estimation (21) can capture both unusually large and unusually small learning effort, whereas the currently proposed LE algorithms (18), (19), (23) capture only unusually large effort, so novelty detection when the noise changes back to a deterministic signal is still a challenge. So far, we have found the direct algorithm (23) (bottom axes) to be practically comparable to the original LE estimation (19) (see Figure 1 with a similar type of data).
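To make the stated limitation concrete, a hypothetical two-sided variant of the earlier multi-scale test is sketched below; it flags increments that are unusually small as well as unusually large. It only illustrates the idea attributed to estimation (21) and is not that formula:

```python
import numpy as np

def two_sided_detection(dW, M=100, alphas=(2, 4, 8, 16)):
    """Hypothetical two-sided test: flag a weight increment when it is
    either alpha times larger or alpha times smaller than its recent
    mean magnitude, so a return from noise to a deterministic signal
    (unusually small learning effort) can also be detected."""
    A = np.abs(np.asarray(dW, dtype=float))
    N, n_w = A.shape
    le = np.zeros(N)
    for k in range(M, N):
        m = A[k - M:k].mean(axis=0)
        unusual = sum(((A[k] > a * m) | (A[k] < m / a)).sum() for a in alphas)
        le[k] = unusual / (n_w * len(alphas))
    return le
```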