Dinu Coltuc, Mihai Datcu, Daniela Coltuc.
Abstract
This paper investigates the usefulness of the normalized compression distance (NCD) for image similarity detection. Instead of the direct NCD between images, the paper considers the correlation between NCD-based feature vectors extracted for each image. The vectors are derived by computing the NCD between the original image and sequences of translated (or rotated) versions of it. Feature vectors for simple transforms (circular translations in the horizontal, vertical and diagonal directions, and rotations around the image center) and several standard compressors are generated and tested in a very simple similarity detection experiment between the original image and two filtered versions (median and moving average). The promising vector configurations (geometric transform, lossless compressor) are further tested for similarity detection on the 24 images of the Kodak set subjected to some common image processing operations. While the direct computation of NCD fails to detect image similarity even in the case of simple median and moving average filtering in 3 × 3 windows, for certain transforms and compressors the proposed approach appears to detect similarity robustly under smoothing, lossy compression, contrast enhancement and noise addition, and with some robustness under geometrical transforms (scaling, cropping and rotation).
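The feature-vector idea in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: `zlib` (DEFLATE) stands in for the compressors, images are represented as lists of equal-length `bytes` rows, only horizontal circular translation with horizontal concatenation is shown, and Pearson correlation is assumed as the correlation measure. All function names are hypothetical. The NCD itself follows the standard definition, NCD(x, y) = (C(xy) − min(C(x), C(y))) / max(C(x), C(y)), where C(·) is the compressed size.

```python
import zlib


def compressed_size(rows):
    """Size in bytes of the zlib-compressed image (rows: list of bytes objects)."""
    return len(zlib.compress(b"".join(rows), 9))


def ncd(a, b):
    """NCD of two equally sized images, using horizontal concatenation."""
    ca, cb = compressed_size(a), compressed_size(b)
    # Horizontal concatenation: each row of b is appended to the matching row of a.
    cab = compressed_size([ra + rb for ra, rb in zip(a, b)])
    return (cab - min(ca, cb)) / max(ca, cb)


def shift_horizontal(img, k):
    """Circular translation of every row by k pixels."""
    k %= len(img[0])
    return [row[k:] + row[:k] for row in img]


def feature_vector(img, shifts):
    """NCD between the image and each of its circularly shifted versions."""
    return [ncd(img, shift_horizontal(img, k)) for k in shifts]


def pearson(u, v):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv) if su and sv else 0.0


if __name__ == "__main__":
    # Synthetic 32 x 32 test image (assumption: any 8-bit gray-level image would do).
    img = [bytes((x * y + x) % 256 for x in range(32)) for y in range(32)]
    # Crude 3-tap circular moving average along rows, standing in for a filtered version.
    smoothed = [
        bytes((row[i - 1] + row[i] + row[(i + 1) % len(row)]) // 3 for i in range(len(row)))
        for row in img
    ]
    shifts = range(1, 17)
    score = pearson(feature_vector(img, shifts), feature_vector(smoothed, shifts))
    print("correlation between feature vectors:", score)
```

In this sketch, similarity is judged from the correlation between the two feature vectors rather than from `ncd(img, smoothed)` directly, which mirrors the comparison described in the abstract; the decision threshold is left open because the text above does not state one.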
Keywords: NCD feature vectors; image similarity; lossless compression; normalized compression distance; normalized information distance; robust similarity
Year: 2018 PMID: 33265190 PMCID: PMC7512663 DOI: 10.3390/e20020099
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
Classification results: P (promising), L (low robustness) and B (bad robustness).
| Compressor | Horizontal concatenation | | | | | Vertical concatenation | | | | | NCD (H) | NCD (V) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| JPEG2000 | L | P | L | L | P | P | L | P | P | P | B | B |
| JPEG | L | L | L | L | L | L | L | L | L | L | B | B |
| PNG | B | B | B | B | B | B | L | B | B | L | B | B |
| TIFF | B | B | B | B | B | B | B | B | B | L | B | B |
| GZIP | B | B | B | B | B | B | L | B | B | P | B | B |
| WinZIP | L | P | L | L | B | P | P | L | L | B | B | B |
| WinRAR | L | L | L | L | P | L | L | L | L | P | B | B |
| 7Z | L | L | L | L | L | L | P | L | P | B | B | B |
| PAQ8 | P | L | L | L | B | P | L | P | P | L | B | B |
| PAQ9 | P | P | L | L | B | P | P | L | L | B | B | B |
| LPAQ1 | P | L | L | L | B | P | L | L | L | B | B | B |
| FPAQ0f | L | P | L | L | P | L | L | L | L | L | B | B |
Figure 1. GZIP-based vectors for Lena and Barbara with horizontal translation (left) and vertical translation (right).
Figure 2. Gray-level versions of Kodak test images 1–24: from left to right and from top to bottom.
Configurations with at least 20/24 correct classification results for histogram equalization.
| Compressor | Transform | Concatenation | Results |
|---|---|---|---|
| FPAQ0f | Rotation | Horizontal | 24/24 |
| FPAQ0f | Rotation | Vertical | 23/24 |
| JPEG2000 | Vertical translation | Horizontal | 21/24 |
| JPEG2000 | Inverse diagonal | Vertical | 20/24 |
More results for the selected configurations on Kodak set.
| Processing | Parameter | FPAQ0f | | | JPEG2000 | | |
|---|---|---|---|---|---|---|---|
| Moving average | Window size | | | | | | |
| | Results | 22/24 | 18/24 | 17/24 | 21/24 | 21/24 | 19/24 |
| Gaussian filtering | Window size | | | | | | |
| | Results | 24/24 | 20/24 | 18/24 | 23/24 | 22/24 | 20/24 |
| Median filtering | Window size | | | | | | |
| | Results | 23/24 | 21/24 | 18/24 | 22/24 | 18/24 | 16/24 |
| Gaussian noise | | 0.0001 | 0.0005 | 0.001 | 0.0001 | 0.0005 | 0.001 |
| | Results | 24/24 | 21/24 | 21/24 | 24/24 | 19/24 | 19/24 |
| “Salt and pepper” noise | Density | | | | | | |
| | Results | 24/24 | 24/24 | 24/24 | 24/24 | 17/24 | 11/24 |
| Lossy compression | Quality (QF) | 80 | 40 | 20 | 80 | 40 | 20 |
| | Results | 24/22 | 22/24 | 18/24 | 24/24 | 22/24 | 20/24 |
| Downscaling | Scale factor | 3/4 | 1/2 | 1/4 | 3/4 | 1/2 | 1/4 |
| | Results | 24/24 | 23/24 | 22/24 | 2/24 | 23/24 | 18/24 |
| Upscaling | Scale factor | 3/2 | 7/4 | 2 | 3/2 | 7/4 | 2 |
| | Results | 24/24 | 24/24 | 24/24 | 13/24 | 7/24 | 22/24 |
| Cropping | Rows & columns | 16 | 32 | 48 | 2 | 4 | - |
| | Results | 24/24 | 23/24 | 18/24 | 11/24 | 4/24 | - |
| Rotation | Degrees | | | | | - | - |
| | Results | 24/24 | 23/24 | 20/24 | 9/24 | - | - |
Figure 3. Details of test image Kodim5 corrupted with “salt and pepper” noise (left) and sequence of FPAQ0f signatures for the original and two noisy versions with and (right).
Figure 4. Kodim01–Kodim24: feature vectors for FPAQ0f and rotation.
Figure 5. Four data sets: Overpass (rows 1–2); Denseresidential (rows 3–4); Harbor (rows 5–6); and Faces (rows 7–8).