
Handwriting Moroccan regions recognition using Tifinagh character.

B El Kessab, C Daoui, B Bouikhalene, R Salouan.

Abstract

The territorial organization of Morocco under the administrative division of 2009 is based on 16 regions. In this work we build a system for recognizing handwritten words (the names of the regions) written in the Amazigh language, an official language according to the Moroccan Royal Institute of Amazigh Culture (IRCAM) (2003a) [1]. This language has received little attention from researchers in the pattern recognition field, which is why we decided to study it (El Kessab et al., 2013 [3]; El Kessab et al., 2014 [4]), knowing that the state has decided to computerize the various public sectors in this language. In this context we propose a data set of handwritten Tifinagh region names composed of 1600 images (100 images for each region). The dataset can be used, on the one hand, to test the efficiency of a Tifinagh region recognition system in extracting significant features and, on the other hand, to test the correct identification of each region in the classification phase.

Year:  2015        PMID: 26966718      PMCID: PMC4783523          DOI: 10.1016/j.dib.2015.07.018

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Value of the data

The region is currently the highest administrative division of Morocco. The regions are subdivided into a total of 63 second-order administrative divisions, which are prefectures and provinces [2]. A Moroccan region is governed by a Wali, nominated by the King. The Wali is also governor of the province (or prefecture) where he resides. As part of a 1997 decentralization and regionalization law passed by the legislature, 16 new regions of Morocco were created.

We chose a word database containing 1000 words written with a marker and representing the 16 regions of Morocco.

Optical Character Recognition (OCR) can be applied to both printed and handwritten text. In this work we use several efficient techniques in each of the three principal phases of the recognition system: pre-processing, feature extraction, and learning-classification. Several studies have been carried out on the recognition of handwritten Tifinagh regions, using the square and triangular zoning methods in the feature extraction phase and support vector machines (SVM) and neural networks in the learning-classification phase.

Amazigh has been considered a national language since the new constitution of 2011, and its alphabet is a creative field of research [3], [4], [5], [6], [7]; a system for recognizing handwritten Tifinagh words representing the regions is therefore very useful.
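The square zoning step named above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a binary 30×30 image stored as nested lists, uses the fraction of ink pixels per zone as the feature (the article does not specify the per-zone feature), and assumes a 2×3 grid for the 6-zone case. The function name `square_zoning` and the test image are hypothetical.

```python
from typing import List

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def square_zoning(image: Image, rows: int, cols: int) -> List[float]:
    """Split the image into rows x cols rectangular zones and return the
    fraction of ink pixels in each zone, in row-major order.

    Assumption: ink density is used as the per-zone feature.
    """
    height, width = len(image), len(image[0])
    features = []
    for r in range(rows):
        # Integer zone boundaries cover the whole image even when its
        # size is not an exact multiple of the grid.
        top, bottom = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            ink = sum(
                image[y][x]
                for y in range(top, bottom)
                for x in range(left, right)
            )
            area = (bottom - top) * (right - left)
            features.append(ink / area)
    return features


# Hypothetical 30x30 image: a filled 15x15 block in the top-left corner.
image = [[1 if x < 15 and y < 15 else 0 for x in range(30)] for y in range(30)]

features_4 = square_zoning(image, 2, 2)  # 4 zones
features_6 = square_zoning(image, 2, 3)  # 6 zones (assumed 2x3 layout)
features_9 = square_zoning(image, 3, 3)  # 9 zones
```

Each call yields one feature vector per image, with as many components as there are zones. A triangular variant would follow the same pattern with triangular masks instead of rectangles.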

Experimental design, materials and methods

For several years, on-line and off-line handwritten character recognition has been considered a very dynamic field, given its applicability in many different domains such as bank check processing, automatic data entry, postal sorting and automation, and automatic processing of administrative files. The steps of the recognition system are presented in Fig. 1.
Fig. 1

The proposed system for handwritten Tifinagh words recognition.

We chose a word database containing 1000 words written with a marker and representing the sixteen regions of Morocco (Table 1).
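Table 1

The obtained recognition rates τr and τg by each hybrid method and each classifier.

| Region (row order) | Neural networks, square zoning | Neural networks, triangular zoning | Support vector machines, square zoning | Support vector machines, triangular zoning |
|---|---|---|---|---|
| 1 | 70.00 | 67.57 | 82.00 | 74.49 |
| 2 | 79.13 | 73.4 | 80.00 | 75.18 |
| 3 | 60.00 | 60.74 | 83.00 | 80.34 |
| 4 | 55.09 | 53.48 | 76.67 | 70.61 |
| 5 | 65.21 | 64.00 | 74.00 | 70.78 |
| 6 | 63.25 | 65.85 | 69.00 | 66.60 |
| 7 | 50.18 | 50.97 | 68.67 | 65.00 |
| 8 | 69.66 | 65.93 | 71.67 | 70.56 |
| 9 | 64.46 | 61.71 | 67.00 | 63.60 |
| 10 | 71.31 | 67.40 | 72.00 | 70.96 |
| 11 | 73.00 | 71.43 | 74.00 | 71.73 |
| 12 | 66.11 | 64.84 | 67.57 | 69.41 |
| 13 | 68.67 | 62.69 | 70.40 | 69.00 |
| 14 | 73.37 | 69.29 | 72.74 | 70.44 |
| 15 | 67.45 | 66.04 | 69.48 | 61.33 |
| 16 | 69.26 | 68.34 | 81.4 | 74.18 |
| τg | 66.64 | 64.61 | 73.73 | 70.26 |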

All values of the recognition rate for each region τr (given in %), and those of the global recognition rate τg over all 16 regions (given in %), are reported in the table.
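The article does not state how the global rate τg is derived from the per-region rates τr. One plausible reading, assumed here, is the unweighted mean over the 16 regions. The Python sketch below checks that reading against the values of Table 1; the variable and function names are illustrative only.

```python
# Per-region recognition rates (in %) copied from Table 1, in row order.
rates = {
    "NN, square zoning": [
        70.00, 79.13, 60.00, 55.09, 65.21, 63.25, 50.18, 69.66,
        64.46, 71.31, 73.00, 66.11, 68.67, 73.37, 67.45, 69.26,
    ],
    "NN, triangular zoning": [
        67.57, 73.4, 60.74, 53.48, 64.00, 65.85, 50.97, 65.93,
        61.71, 67.40, 71.43, 64.84, 62.69, 69.29, 66.04, 68.34,
    ],
    "SVM, square zoning": [
        82.00, 80.00, 83.00, 76.67, 74.00, 69.00, 68.67, 71.67,
        67.00, 72.00, 74.00, 67.57, 70.40, 72.74, 69.48, 81.4,
    ],
    "SVM, triangular zoning": [
        74.49, 75.18, 80.34, 70.61, 70.78, 66.60, 65.00, 70.56,
        63.60, 70.96, 71.73, 69.41, 69.00, 70.44, 61.33, 74.18,
    ],
}

# Global rates as reported in the last row of Table 1.
reported_global = {
    "NN, square zoning": 66.64,
    "NN, triangular zoning": 64.61,
    "SVM, square zoning": 73.73,
    "SVM, triangular zoning": 70.26,
}


def global_rate(region_rates):
    """Unweighted mean of the per-region rates (assumed definition)."""
    return sum(region_rates) / len(region_rates)


computed_global = {name: global_rate(values) for name, values in rates.items()}

for name, value in computed_global.items():
    print(f"{name}: computed {value:.2f}, reported {reported_global[name]:.2f}")
```

Under this assumption the computed means agree with the reported τg values to within about 0.01 percentage points, which is consistent with rounding.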

The data were acquired as follows: we asked 70 students (in the Laboratory of Information Processing and Decision Support) to write the 16 regions in Tifinagh characters (Fig. 3).
Fig. 3

Sixteen regions with Amazigh language.

Tifinagh is written from left to right in horizontal lines, and the characters are written separately in the text (see Fig. 2, Fig. 3).
Fig. 2

Comparison between the Tifinagh, Arabic and Latin characters.

Each original region image has a size of 30×30 pixels (Fig. 4, Fig. 5, Fig. 6, Fig. 7, Fig. 8, Fig. 9, Fig. 10, Fig. 11, Fig. 12, Fig. 13, Fig. 14, Fig. 15, Fig. 16, Fig. 17).
Fig. 4

Example of a handwritten Tifinagh region from the proposed database.

Fig. 5

The graphical representation of recognition rate τr for each region.

Fig. 6

The original image.

Fig. 7

Graphical representation of segmentation column.

Fig. 8

The segmentation in columns.

Fig. 9

The square zoning method.

Fig. 10

Processes of feature extraction by square zoning.

Fig. 11

The triangular zoning method.

Fig. 12

Processes of feature extraction by triangular zoning.

Fig. 13

Determination of the optimal hyperplane, support vectors, maximum margin and valid hyperplanes.

Fig. 14

The multi-layer perceptron.

Fig. 15

The graphical representation of recognition rate τr of each region with all methods.

Fig. 16

The graphical representation of recognition rate τr of all feature extraction methods.

Fig. 17

The graphical representation of recognition rate τr of square zoning method and SVM classifier.

The number of square zones used in feature extraction is 4, 6 or 9, and the number of triangular zones is 4, 6 or 8. Each image is therefore transformed into a vector of 4, 6 or 9 components for square zoning and into a vector of 4, 6 or 8 components for triangular zoning. In the classification phase with support vector machines, the standard deviation of the Gaussian radial basis function (GRBF) kernel is equal to 0.1, and the degree of the polynomial (POL) kernel is equal to 10 with parameters a = b = 1. We varied the size of the zones in feature extraction to find the best-performing method. For the neural network, we tested the values {5, 10, 15} for the number of hidden-layer neurons. The graphical representation of the recognition rate τr of each region is shown in Fig. 5.
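The two SVM kernels can be written out with the stated parameter values as a short Python sketch. The article gives only the parameter names, so the exact functional forms are assumptions: the common Gaussian form exp(-||x - y||² / (2σ²)) for GRBF and (a·⟨x, y⟩ + b)^d for POL. The function names are hypothetical.

```python
import math
from typing import Sequence


def grbf_kernel(x: Sequence[float], y: Sequence[float], sigma: float = 0.1) -> float:
    """Gaussian RBF kernel, assumed form exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def pol_kernel(
    x: Sequence[float],
    y: Sequence[float],
    a: float = 1.0,
    b: float = 1.0,
    degree: int = 10,
) -> float:
    """Polynomial kernel, assumed form (a * <x, y> + b) ** degree."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    return (a * dot + b) ** degree


# Two hypothetical 4-component zoning feature vectors.
u = [0.10, 0.25, 0.00, 0.40]
v = [0.15, 0.20, 0.05, 0.35]

k_grbf = grbf_kernel(u, v)
k_pol = pol_kernel(u, v)
```

With σ = 0.1 the Gaussian kernel decays quickly: two vectors at Euclidean distance 0.1 already have a similarity of exp(-0.5), so only very close feature vectors influence each other. With degree 10 the polynomial kernel grows steeply with the inner product.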
Specifications table

| Item | Description |
|---|---|
| Subject area | Computer science |
| More specific subject area | Image processing, handwritten Tifinagh region, the Amazigh language |
| Type of data | Image |
| How data was acquired | Handwritten, scanner, marker |
| Data format | JPEG image |
| Experimental factors | We asked 70 students to write the 16 regions in Tifinagh characters; the pages were scanned with an HP G3110 scanner (maximum resolution 4800×9600 dpi), and a marker was used for writing the characters |
| Experimental features | 1376 images with a size of 30×30 pixels (100 images/region) |
| Data source location | Béni Mellal, Morocco |
| Data accessibility | Within this article |