Tajinder Pal Singh, Sheifali Gupta, Meenu Garg, Deepali Gupta, Abdullah Alharbi, Hashem Alyami, Divya Anand, Arturo Ortega-Mansilla, Nitin Goyal.
Abstract
For analytical, segmentation-based word recognition techniques, splitting a word into its individual characters is a significant challenge, particularly for cursive handwriting. A holistic approach, in which the entire word is passed to a suitable recognizer, can therefore be a better option. Gurumukhi is a complex script for which a holistic approach can be adopted for offline handwritten word recognition. In this paper, the authors propose a Convolutional Neural Network (CNN)-based architecture for recognizing handwritten Gurumukhi month names. The architecture consists of five convolutional layers and three pooling layers. The authors also prepared a dataset of 24,000 images, each of size 50 × 50, collected from 500 distinct writers of different age groups and professions. On this dataset, the proposed method achieved training and validation accuracies of about 97.03% and 99.50%, respectively.
Keywords: Gurumukhi script; convolutional neural network; performance analysis; word recognition
Year: 2022 PMID: 35458866 PMCID: PMC9026827 DOI: 10.3390/s22082881
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1 Various steps of dataset preparation.
Figure 2 (a) Sample sheet from writer 1; (b) sample sheet from writer 2.
Detailed overview of the dataset.
| Sr No. | Class Name in English | Class Name in Gurumukhi | Time Duration | Type of Month |
|---|---|---|---|---|
| 1. | Vaisakh | ਵਿਸਾਖ | 14 April to 14 May | Desi Months |
| 2. | Jeth | ਜੇਠ | 15 May to 14 June | |
| 3. | Harh | ਹਾੜ੍ਹ | 15 June to 15 July | |
| 4. | Sawan | ਸਾਉਣ | 16 July to 15 August | |
| 5. | Bhado | ਭਾਦੋਂ | 16 August to 14 September | |
| 6. | Assu | ਅੱਸੂ | 15 September to 14 October | |
| 7. | Katak | ਕੱਤਕ | 15 October to 13 November | |
| 8. | Magar | ਮੱਘਰ | 14 November to 13 December | |
| 9. | Poh | ਪੋਹ | 14 December to 12 January | |
| 10. | Magh | ਮਾਘ | 13 January to 11 February | |
| 11. | Phagun | ਫੱਗਣ | 12 February to 13 March | |
| 12. | Chet | ਚੇਤ | 14 March to 13 April | |
| 13. | January | ਜਨਵਰੀ | 1 January to 31 January | English Months |
| 14. | February | ਫਰਵਰੀ | 1 February to 28/29 February | |
| 15. | March | ਮਾਰਚ | 1 March to 31 March | |
| 16. | April | ਅਪ੍ਰੈਲ | 1 April to 30 April | |
| 17. | May | ਮਈ | 1 May to 31 May | |
| 18. | June | ਜੂਨ | 1 June to 30 June | |
| 19. | July | ਜੁਲਾਈ | 1 July to 31 July | |
| 20. | August | ਅਗਸਤ | 1 August to 31 August | |
| 21. | September | ਸਤੰਬਰ | 1 September to 30 September | |
| 22. | October | ਅਕਤੂਬਰ | 1 October to 31 October | |
| 23. | November | ਨਵੰਬਰ | 1 November to 30 November | |
| 24. | December | ਦਸੰਬਰ | 1 December to 31 December | |
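For the 24-class setup implied by the table, the English class names can be encoded as integer labels. A minimal sketch follows; the actual label order used by the authors is not stated, so the table's order is an assumption:

```python
# Class names in the order listed in the table (assumed label order).
CLASS_NAMES = [
    "Vaisakh", "Jeth", "Harh", "Sawan", "Bhado", "Assu",
    "Katak", "Magar", "Poh", "Magh", "Phagun", "Chet",      # desi months
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

LABEL_OF = {name: i for i, name in enumerate(CLASS_NAMES)}

def one_hot(name):
    """One-hot target vector for a class name, matching the 24-way softmax output."""
    vec = [0] * len(CLASS_NAMES)
    vec[LABEL_OF[name]] = 1
    return vec
```

With categorical cross-entropy (as in the training-parameter table), each training image would be paired with such a one-hot vector.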
Figure 3 (a) Image converted from RGB to grayscale; (b) eroded image; (c) cropped images.
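The preprocessing steps shown in Figure 3 can be sketched in plain Python. This is a minimal illustration only: the paper does not specify the exact operators, so the luminosity formula, a 3 × 3 minimum-filter erosion, and a bounding-box crop are assumptions:

```python
# Sketch of the Figure 3 pipeline (assumed operators):
# (a) RGB -> grayscale, (b) 3x3 erosion (min filter), (c) crop to the ink bounding box.

def to_grayscale(rgb):
    """Convert rows of (R, G, B) pixels to grayscale via the luminosity formula."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]

def erode(img):
    """3x3 grayscale erosion: each pixel becomes the minimum of its neighborhood.
    This thickens dark (low-valued) strokes on a light background."""
    h, w = len(img), len(img[0])
    out = [[255.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = min(
                img[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            )
    return out

def crop_to_content(img, ink_threshold=128):
    """Crop to the bounding box of pixels darker than ink_threshold."""
    ys = [y for y, row in enumerate(img) if any(p < ink_threshold for p in row)]
    xs = [x for x in range(len(img[0])) if any(row[x] < ink_threshold for row in img)]
    if not ys:  # blank image: nothing to crop
        return img
    return [row[min(xs):max(xs) + 1] for row in img[min(ys):max(ys) + 1]]
```

After cropping, images would be resized to the fixed 50 × 50 input size reported in the dataset description.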
Figure 4 Architecture of the proposed CNN model.
Details of the layers of the proposed CNN model.
| S. No. | Layer | Input Size | Filter Size | No. of Filters | Activation Function | Output Size | Parameters |
|---|---|---|---|---|---|---|---|
| 1 | Input Image | 50 × 50 × 1 | ----- | ----- | ----- | ----- | ----- |
| 2 | Convolutional | 50 × 50 × 1 | 3 × 3 | 32 | ReLU | 50 × 50 × 32 | 320 |
| 3 | Maxpooling | 50 × 50 × 32 | Poolsize (3 × 3) | ------ | ------ | 16 × 16 × 32 | 0 |
| 4 | Convolutional | 16 × 16 × 32 | 3 × 3 | 64 | ReLU | 16 × 16 × 64 | 18,496 |
| 5 | Convolutional | 16 × 16 × 64 | 3 × 3 | 64 | ReLU | 16 × 16 × 64 | 36,928 |
| 6 | Maxpooling | 16 × 16 × 64 | Pool size 2 × 2 | ------ | ------ | 8 × 8 × 64 | 0 |
| 7 | Convolutional | 8 × 8 × 64 | 3 × 3 | 128 | ReLU | 8 × 8 × 128 | 73,856 |
| 8 | Convolutional | 8 × 8 × 128 | 3 × 3 | 128 | ReLU | 8 × 8 × 128 | 147,584 |
| 9 | Maxpooling | 8 × 8 × 128 | Pool size 2 × 2 | ------ | ------ | 4 × 4 × 128 | 0 |
| 10 | Flatten | 4 × 4 × 128 | ---- | ----- | ----- | 2048 | 0 |
| 11 | Dense | 2048 | ---- | ----- | ReLU | 1024 | 2,098,176 |
| 12 | Dense | 1024 | ---- | ----- | Softmax | 24 | 24,600 |
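The layer table can be cross-checked with a short script that recomputes each layer's output size and parameter count from standard CNN arithmetic. "Same" padding for the convolutions and non-overlapping pooling are assumptions, chosen because they reproduce the reported shapes:

```python
def conv_params(k, c_in, c_out):
    """Conv parameter count: k*k*c_in weights per filter, plus one bias each."""
    return k * k * c_in * c_out + c_out

def pool_out(size, pool):
    """Non-overlapping pooling: floor division of the spatial size."""
    return size // pool

# Walk the table, tracking (spatial size s, channels c) after each layer.
s, c = 50, 1
params = []
params.append(conv_params(3, c, 32)); c = 32          # conv1: 50 x 50 x 32
s = pool_out(s, 3)                                    # maxpool 3x3 -> 16 x 16
params.append(conv_params(3, c, 64)); c = 64          # conv2
params.append(conv_params(3, c, 64))                  # conv3
s = pool_out(s, 2)                                    # maxpool 2x2 -> 8 x 8
params.append(conv_params(3, c, 128)); c = 128        # conv4
params.append(conv_params(3, c, 128))                 # conv5
s = pool_out(s, 2)                                    # maxpool 2x2 -> 4 x 4
flat = s * s * c                                      # flatten -> 2048
params.append(flat * 1024 + 1024)                     # dense, ReLU
params.append(1024 * 24 + 24)                         # dense, 24-way softmax
```

The computed counts (320; 18,496; 36,928; 73,856; 147,584; 2,098,176; 24,600) match the table exactly, for a total of 2,399,960 trainable parameters, with the fully connected layers holding the large majority.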
Figure 5 Architecture of the bilinear CNN.
Proposed model’s training parameters.
| Adam Optimizer Settings | Learning Rate (LR) | Loss Function | Metric | Number of Epochs | Batch Size (BS) |
|---|---|---|---|---|---|
| learning rate = 1.0 × 10−3, beta1 = 0.9, beta2 = 0.999, epsilon = 1.0 × 10−7, decay = learning rate/epochs | 0.0001 | Categorical cross-entropy | Accuracy | 100 | 20 |
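The "decay = learning rate/epochs" entry matches the legacy time-based decay used by some frameworks (e.g. Keras's old `decay` argument), where the learning rate at optimizer step t is lr / (1 + decay · t). The following sketch assumes that schedule and the 1.0 × 10−3 base rate from the optimizer column (note the table also reports 0.0001 in the LR column, a discrepancy the source does not resolve):

```python
def decayed_lr(base_lr, decay, step):
    """Legacy time-based decay: lr_t = base_lr / (1 + decay * t)."""
    return base_lr / (1.0 + decay * step)

base_lr = 1e-3
epochs = 100
decay = base_lr / epochs            # decay = learning rate / epochs, as in the table

lr_start = decayed_lr(base_lr, decay, 0)        # full base rate at the first step
lr_late = decayed_lr(base_lr, decay, 100_000)   # shrinks as training proceeds
```

Under this schedule the learning rate falls smoothly with the optimizer step count, halving after 1/decay steps.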
Figure 6 Analysis of the proposed model at 100 epochs and a batch size of 20: (a) precision, recall and F1 score for the 24 classes; (b) confusion matrix; (c) accuracy and loss curve.
Figure 7 Analysis of the proposed model at 100 epochs and a batch size of 30: (a) precision, recall and F1 score for the 24 classes; (b) confusion matrix; (c) accuracy and loss curve.
Figure 8 Analysis of the proposed model at 100 epochs and a batch size of 40: (a) precision, recall and F1 score for the 24 classes; (b) confusion matrix; (c) accuracy and loss curve.
Figure 9 Analysis of the proposed model at 40 epochs and a batch size of 20: (a) precision, recall and F1 score for the 24 classes; (b) confusion matrix; (c) accuracy and loss curve.
Figure 10 Analysis of the proposed model at 40 epochs and a batch size of 30: (a) precision, recall and F1 score for the 24 classes; (b) confusion matrix; (c) accuracy and loss curve.
Figure 11 Analysis of the proposed model at 40 epochs and a batch size of 40: (a) precision, recall and F1 score for the 24 classes; (b) confusion matrix; (c) accuracy and loss curve.
Figure 12 Performance of the proposed model at 100 and 40 epochs with batch sizes of 20, 30 and 40: (a) validation accuracy graph; (b) validation loss graph.
Comparison of models.
| Model | Training Accuracy | Validation Accuracy | Training Loss | Validation Loss | Overall Precision | Overall Recall | Overall F1 Score |
|---|---|---|---|---|---|---|---|
| — | 0.3299 | 0.3929 | 2.1693 | 1.9268 | 0.4482 | 0.3937 | 0.3892 |
| — | 0.7530 | 0.7771 | 0.7560 | 0.6647 | 0.7929 | 0.7767 | 0.7756 |
| — | 0.7925 | 0.8138 | 0.6274 | 0.5484 | 0.8223 | 0.8135 | 0.8115 |
Comparison of the proposed model with existing text recognition systems.
| The Authors (Year) | Feature Extraction Method | Classifier | Dataset Used | Accuracy |
|---|---|---|---|---|
| [ | Parabola curve fitting and power curve fitting | SVM and k-NN | 3500 offline handwritten Gurumukhi characters | 98.10% |
| [ | Discrete wavelet transforms, discrete cosine transforms, fast Fourier transforms and fan-beam transforms | SVM | 10,500 samples of isolated offline handwritten Gurumukhi characters | 95.8% |
| [ | Local binary pattern (LBP) features, directional features, and regional features | Deep neural network | 2700 images of Gurumukhi text | 99.3% |
| [ | Histogram of oriented gradients (HOG) and pyramid histogram of oriented gradients (PHOG) features | SVM | 3500 handwritten Gurumukhi characters | 99.1% |
| [ | Zoning, diagonal, transition, intersection and open-end-point, centroid, horizontal peak extent, vertical peak extent, parabola curve fitting, and power curve fitting-based features | Naive Bayes (NB), decision tree (DT), random forest (RF) and AdaBoostM1 | 49,000 samples of Gurumukhi handwritten text | 89.85% |
| [ | Zoning, discrete cosine transforms and gradient features | k-NN, SVM, decision tree (DT), random forest (RF) | Medieval handwritten Gurumukhi manuscripts | 95.91% |
| [ | Zoning, diagonal, peak extent-based features (horizontal and vertical) and shadow features | k-NN, decision tree (DT) and random forest (RF) | 8960 samples of Gurumukhi handwritten text | 96.03% |
| [ | Vertical peak extent, diagonal, and centroid features | k-NN, linear SVM, RBF-SVM, naive Bayes, decision tree (DT), CNN, and random forest (RF) | 13,000 samples, comprising 7000 characters and 6000 numerals | 87.9% |
| [ | Automatic feature extraction | Convolutional neural network | 3500 Gurumukhi characters | 98.32% |