Kalthoum Adam, Somaya Al-Maadeed, Younes Akbari.
Abstract
Automatic dating tools for historical documents can save paleographers considerable time and effort. This paper describes a novel method for estimating the date of historical Arabic documents that employs hierarchical fusion of multiple features. A set of traditional features and features extracted by a residual network (ResNet) are fused hierarchically using joint sparse representation. To address noise during the fusion process, a new approach based on subsets of the multiple features is proposed. Supervised and unsupervised classifiers are then used for classification. We show that hierarchical fusion based on subsets of multiple features produces promising results on the KERTAS dataset and significantly improves dating accuracy.
Keywords: deep features; handwriting style-based features; hierarchical fusion; historical Arabic manuscript dating; sparse representation-based features
Year: 2022 PMID: 35324615 PMCID: PMC8954291 DOI: 10.3390/jimaging8030060
Source DB: PubMed Journal: J Imaging ISSN: 2313-433X
Figure 1. Overview of our proposed system.
Figure 2. Different structures for multi-feature fusion: (a) multiple views are sent to the classifier without being reduced (our approach); (b) views are reduced into one map.
Summary of the numerical distribution of documents in the KERTAS dataset.
| Key Century | Number of Documents | Training | Testing |
|---|---|---|---|
| 1 | 60 | 48 | 12 |
| 2 | 47 | 37 | 10 |
| 3 | 144 | 116 | 28 |
| 4 | 592 | 474 | 118 |
| 5 | 164 | 132 | 32 |
| 6 | 119 | 95 | 24 |
| 7 | 184 | 147 | 37 |
| 8 | 110 | 88 | 22 |
| 9 | 153 | 123 | 30 |
| 10 | 73 | 59 | 14 |
| 11 | 169 | 135 | 34 |
| 12 | 147 | 118 | 29 |
| 13 | 119 | 95 | 24 |
| 14 | 17 | 14 | 3 |
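The split sizes in the table above can be sanity-checked with a few lines of Python. This is only a minimal sketch: the row values are copied from the table, while the variable names and the choice to report the overall training fraction and the largest class share are illustrative assumptions, not part of the paper.

```python
# Rows copied from the KERTAS distribution table:
# (century key, number of documents, training, testing)
rows = [
    (1, 60, 48, 12), (2, 47, 37, 10), (3, 144, 116, 28),
    (4, 592, 474, 118), (5, 164, 132, 32), (6, 119, 95, 24),
    (7, 184, 147, 37), (8, 110, 88, 22), (9, 153, 123, 30),
    (10, 73, 59, 14), (11, 169, 135, 34), (12, 147, 118, 29),
    (13, 119, 95, 24), (14, 17, 14, 3),
]

# Every row should satisfy: training + testing == number of documents.
inconsistent = [key for key, total, train, test in rows if train + test != total]

total_docs = sum(total for _, total, _, _ in rows)
total_train = sum(train for _, _, train, _ in rows)
total_test = sum(test for _, _, _, test in rows)

# Overall share of documents used for training.
train_fraction = total_train / total_docs

# The class with the most documents, and its share of the whole dataset.
largest_key, largest_count = max(((key, total) for key, total, _, _ in rows),
                                 key=lambda pair: pair[1])
largest_share = largest_count / total_docs

print(f"documents={total_docs} train={total_train} test={total_test}")
print(f"train fraction={train_fraction:.3f}")
print(f"largest class: key {largest_key} with share {largest_share:.3f}")
```

The check suggests a roughly 80/20 split per class, and it makes the class imbalance visible: one century key holds well over a quarter of all documents, which is worth keeping in mind when reading the accuracy figures.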
Figure 3. Samples of KERTAS dataset images: (a) from the 3rd Islamic century, and (b) from the 7th Islamic century.
Results of different feature extraction methods and the best results of our fusion approach on KERTAS dataset (test set). (Best values are highlighted in bold).
| Methods | Unsupervised | Accuracy (%) | Supervised | Accuracy (%) |
|---|---|---|---|---|
| Gabor | 50.40 | 45.71 | 35.65 | 66.66 |
| Hinge | 49.21 | 47.61 | 37.31 | 61.90 |
| Hog | 52.80 | 43.80 | 37.35 | 61.90 |
| ResNet | 43.80 | 55.23 | 33.30 | 69.52 |
| Concatenated features | 39.35 | 61.90 | 31.50 | 71.42 |
| Ours | | | | |
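For orientation, the "Concatenated features" baseline in the table above can be pictured as simple feature-level fusion: each view (Gabor, Hinge, HOG, ResNet) is flattened into one long vector. The sketch below is an assumption-laden illustration, not the authors' implementation: the per-view L2 normalization, the function names `l2_normalize` and `concatenate_views`, and the toy feature values are all hypothetical, and it does not reproduce the joint sparse representation used by the proposed method.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def concatenate_views(views):
    """Fuse several feature views into one vector by concatenation.

    Each view is normalized first so that no single descriptor dominates
    purely because of its scale (an assumption made for this sketch).
    Views are joined in sorted name order to keep the layout deterministic.
    """
    fused = []
    for name in sorted(views):
        fused.extend(l2_normalize(views[name]))
    return fused


# Toy placeholder values, NOT real descriptors from the KERTAS dataset.
views = {
    "gabor": [0.2, 1.5, 0.7],
    "hinge": [3.0, 4.0],
    "hog": [0.0, 2.0, 0.0, 1.0],
    "resnet": [0.5, 0.5, 0.5, 0.5],
}

fused = concatenate_views(views)
print(len(fused))
```

A single fused vector like this is what a downstream classifier would receive in the concatenation setting, in contrast to the structure where multiple views are passed on without being reduced.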
Figure 4. (a) Concatenated original data; (b) our approach using t-SNE on the KERTAS dataset (based on four views).
Figure 5. Different setups of our approach.
Comparison between different setups of our approach in terms of accuracy (%). (Best values highlighted in bold).
| State | Unsupervised | Supervised |
|---|---|---|
| A1 | 64.28 | 75.47 |
| A2 | 64.95 | 75.45 |
| A3 | 67.65 | 76.85 |
| A4 | 69.22 | 80.95 |
| A5 | | |