
ArASL: Arabic Alphabets Sign Language Dataset.

Ghazanfar Latif, Nazeeruddin Mohammad, Jaafar Alghazo, Roaa AlKhalaf, Rawan AlKhalaf.

Abstract

A fully labelled dataset of Arabic Sign Language (ArSL) images was developed for research on sign language recognition. The dataset gives researchers the opportunity to investigate and develop automated systems for deaf and hard of hearing people using machine learning, computer vision and deep learning algorithms. The contribution is a large, fully labelled dataset for Arabic Sign Language (ArSL) that is publicly available and free for all researchers. The dataset, named ArSL2018, consists of 54,049 images of the 32 Arabic sign language signs and alphabets, collected from 40 participants in different age groups. The images vary in dimensions and other characteristics; pre-processing techniques can be used to remove noise, center the image, etc. The dataset is publicly available at https://data.mendeley.com/datasets/y7pckrw6z2/1.


Year:  2019        PMID: 31372425      PMCID: PMC6661066          DOI: 10.1016/j.dib.2019.103777

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409



Data

ArSL2018 is a new, comprehensive, fully labelled dataset of Arabic Sign Language images launched at Prince Mohammad Bin Fahd University, Al Khobar, Saudi Arabia, and made available to researchers in machine learning and deep learning. It is useful for application and device development in the assistive technology field for the benefit of deaf and hard of hearing individuals. Examples of related datasets can be found in Refs. [3], [4], [5]. To the authors' knowledge, ArSL2018 is the first large, comprehensive dataset for Arabic Sign Language. Researchers can use it both to improve classification and recognition accuracy and to develop prototypes useful for the deaf community. The dataset comprises 54,049 grayscale images of 64 × 64 pixels. Variation was introduced through different lighting and backgrounds. Fig. 1 shows a sample of the Arabic Sign Language signs and alphabets in the dataset. To give researchers access to the data for classification and recognition, we collected, labelled, generated and published the ArSL2018 dataset [1]. Table 1 lists the Arabic alphabet sign classes with their labels and number of images. The dataset has been found sufficient for both training and classification, and has been tested as such. It can be used as is, and may be extended with more variations in a second version.
Fig. 1. Representation of the Arabic Sign Language for Arabic Alphabets.

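Table 1. Input Arabic Alphabet Sign classes with their labels and number of images.

| # | Letter name in English script | Letter name in Arabic script | # of images |
|---|---|---|---|
| 1 | Alif | أَلِف (أ) | 1672 |
| 2 | | بَاء (ب) | 1791 |
| 3 | | تَاء (ت) | 1838 |
| 4 | Thā | ثَاء (ث) | 1766 |
| 5 | Jīm | جِيمْ (ج) | 1552 |
| 6 | | حَاء (ح) | 1526 |
| 7 | Khā | خَاء (خ) | 1607 |
| 8 | Dāl | دَالْ (د) | 1634 |
| 9 | Dhāl | ذَال (ذ) | 1582 |
| 10 | | رَاء (ر) | 1659 |
| 11 | Zāy | زَاي (ز) | 1374 |
| 12 | Sīn | سِينْ (س) | 1638 |
| 13 | Shīn | شِينْ (ش) | 1507 |
| 14 | Sād | صَادْ (ص) | 1895 |
| 15 | Dād | ضَاد (ض) | 1670 |
| 16 | | طَاء (ط) | 1816 |
| 17 | | ظَاء (ظ) | 1723 |
| 18 | Ayn | عَين (ع) | 2114 |
| 19 | Ghayn | غَين (غ) | 1977 |
| 20 | | فَاء (ف) | 1955 |
| 21 | Qāf | قَاف (ق) | 1705 |
| 22 | Kāf | كَاف (ك) | 1774 |
| 23 | Lām | لاَمْ (ل) | 1832 |
| 24 | Mīm | مِيمْ (م) | 1765 |
| 25 | Nūn | نُون (ن) | 1819 |
| 26 | | هَاء (ه) | 1592 |
| 27 | Wāw | وَاو (و) | 1371 |
| 28 | | يَا (ئ) | 1722 |
| 29 | Tāa | ة (ة) | 1791 |
| 30 | Al | ال (ال) | 1343 |
| 31 | Laa | ﻻ (ﻻ) | 1746 |
| 32 | Yāa | يَاء (يَاء) | 1293 |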
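
The per-class counts in Table 1 can be cross-checked against the stated total of 54,049 images. The following is a minimal sketch of that arithmetic; the function name `summarize_counts` and the "imbalance ratio" measure (largest class divided by smallest class) are illustrative choices, not part of the original article.

```python
# Image counts per class, copied from Table 1 in class-number order (1..32).
TABLE_1_COUNTS = [
    1672, 1791, 1838, 1766, 1552, 1526, 1607, 1634,
    1582, 1659, 1374, 1638, 1507, 1895, 1670, 1816,
    1723, 2114, 1977, 1955, 1705, 1774, 1832, 1765,
    1819, 1592, 1371, 1722, 1791, 1343, 1746, 1293,
]


def summarize_counts(counts):
    """Return basic statistics for a list of per-class image counts.

    The imbalance ratio (largest / smallest class) is a hypothetical
    summary measure chosen for this sketch.
    """
    return {
        "classes": len(counts),
        "total": sum(counts),
        "smallest": min(counts),
        "largest": max(counts),
        "mean": sum(counts) / len(counts),
        "imbalance_ratio": max(counts) / min(counts),
    }


summary = summarize_counts(TABLE_1_COUNTS)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The 32 counts sum to 54,049, which agrees with the total given in the text. The classes are not perfectly balanced, so a stratified split may be worth considering when the dataset is divided for training and testing.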
The ArSL2018 dataset still has some limitations: 1) it was collected in one location, 2) not enough lighting and noise variations were introduced, and 3) only 40 participants provided samples. These limitations are minor and could be addressed in the second version of the dataset.

Experimental design, materials, and methods

The ArSL2018 images were taken at Prince Mohammad Bin Fahd University and in the Khobar area, Kingdom of Saudi Arabia, from volunteers of different age groups. A smart camera attached to a tripod was used to capture the images. Volunteers stood around 1 m away from the camera. Variation was introduced through different lighting, angles, timings and backgrounds. The number of images per alphabet varies; the dataset contains 54,049 images in total. The images were taken in RGB format with different dimensions and variations, which required pre-processing to make them suitable for classification and recognition. The collected images were resized to a fixed dimension of 64 × 64 and converted to grayscale, with pixel values in the range 0 to 255.
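
The two pre-processing steps described above (grayscale conversion and resizing to 64 × 64) can be sketched in plain Python as follows. The article does not state which grayscale formula or interpolation method was used, so the ITU-R BT.601 luma weights and nearest-neighbour sampling below are assumptions made for illustration, and all function names are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # one RGB pixel, each channel in 0..255
RGBImage = List[List[Pixel]]   # image stored as rows of RGB pixels
GrayImage = List[List[int]]    # image stored as rows of 0..255 intensities

TARGET_SIZE = 64  # fixed side length stated in the text


def to_grayscale(image: RGBImage) -> GrayImage:
    """Convert RGB to 8-bit grayscale.

    Uses BT.601 luma weights (0.299, 0.587, 0.114); this choice is an
    assumption, the source does not name a conversion formula.
    """
    return [
        [min(255, round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
        for row in image
    ]


def resize_nearest(image: GrayImage, size: int = TARGET_SIZE) -> GrayImage:
    """Resize to size x size by nearest-neighbour sampling.

    The interpolation method is an assumption; the source only gives
    the target dimension.
    """
    height, width = len(image), len(image[0])
    return [
        [image[y * height // size][x * width // size] for x in range(size)]
        for y in range(size)
    ]


def preprocess(image: RGBImage) -> GrayImage:
    """Apply both steps: grayscale conversion, then resizing to 64 x 64."""
    return resize_nearest(to_grayscale(image))


# Usage: a synthetic 128 x 96 white image stands in for a captured photo.
white_photo = [[(255, 255, 255)] * 96 for _ in range(128)]
processed = preprocess(white_photo)
print(len(processed), len(processed[0]))
```

In practice an imaging library would normally handle both steps; the pure-Python version is only meant to make the operations explicit.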

Specifications table

Subject area: Computer Science, Machine Learning, Computer Vision, Deep Learning
More specific subject area: Sign Language Recognition for the deaf and hard of hearing
Type of data: Images (64 × 64 pixels, JPG format)
How data was acquired: Smart camera (iPhone 6S) used to capture images
Data format: Labelled grayscale images
Experimental factors: Preprocessing to resize and convert to grayscale
Experimental features: None
Data source location: Al Khobar, Eastern Province, Saudi Arabia
Data accessibility: The dataset is accessible at https://data.mendeley.com/datasets/y7pckrw6z2/1 [1] and is free and publicly available for any research, academic and educational purposes.
Related research article: The accuracy was stated in the paper and could serve as a benchmark for research to increase recognition accuracy. The modified version of the paper for the journal is already accepted (in press) [2].

Value of the data

Machine learning and deep learning are increasingly used to build applications that help in daily life, such as fingerprint and face recognition, as well as applications in healthcare, assistive technology and other fields. At the core of these applications are image pre-processing, classification and recognition, which automate tasks usually done by humans. The ArSL2018 dataset is a valuable resource for researchers in the machine learning and deep learning community developing assistive technology applications for persons with disabilities.

The ArSL2018 dataset, collected in Al Khobar, Saudi Arabia, contains 54,049 images of the 32 Arabic Sign Language signs and alphabets.

ArSL2018 is a comprehensive, fully labelled Arabic Sign Language image repository for classification and recognition, and for applications that automate the recognition of sign language for Arabic deaf and hard of hearing individuals.

The ArSL2018 dataset would assist researchers and allow faster application development and faster prototyping of applications and devices in the assistive technology field.

ArSL2018 is a base on which the research community can build to produce a dataset with more image variations.
