| Literature DB >> 26352998 |
Andrew R Cohen, Paul M B Vitányi.
Abstract
Pairwise normalized compression distance (NCD) is a parameter-free, feature-free, alignment-free, similarity metric based on compression. We propose an NCD of multisets that is also metric. Previously, attempts to obtain such an NCD failed. For classification purposes it is superior to the pairwise NCD in accuracy and implementation complexity. We cover the entire trajectory from theoretical underpinning to feasible practice. It is applied to biological (stem cell, organelle transport) and OCR classification questions that were earlier treated with the pairwise NCD. With the new method we achieved significantly better results. The theoretic foundation is Kolmogorov complexity.Entities:
Mesh:
Year: 2015 PMID: 26352998 PMCID: PMC4566858 DOI: 10.1109/TPAMI.2014.2375175
Source DB: PubMed Journal: IEEE Trans Pattern Anal Mach Intell ISSN: 0098-5589 Impact factor: 6.226