Optical character recognition system for Baybayin scripts using support vector machine.

Rodney Pino, Renier Mendoza, Rachelle Sambayan.

Abstract

In 2018, the Philippine Congress signed House Bill 1022, declaring the Baybayin script the national writing system of the Philippines. It is therefore highly probable that the Baybayin and Latin scripts will appear together in a single document. In this work, we propose a system that discriminates between the characters of the two scripts. The proposed system uses the normalized form of an individual character to identify whether it belongs to the Baybayin or the Latin script, and then classifies which unit the character represents. This gives four classification problems: (1) Baybayin and Latin script recognition, (2) Baybayin character classification, (3) Latin character classification, and (4) Baybayin diacritical mark classification. To the best of our knowledge, this is the first study to use a Support Vector Machine (SVM) for Baybayin script recognition. This work also provides a new dataset of Baybayin characters, Baybayin diacritics, and Latin characters. Problems (1) and (4) use binary SVM classification, while problems (2) and (3) use multiclass SVM classification. On average, our numerical experiments yield satisfactory results: (1) has 98.5% accuracy, 98.5% precision, 98.49% recall, and 98.5% F1 score; (2) has 96.51% accuracy, 95.62% precision, 95.61% recall, and 95.62% F1 score; (3) has 95.8% accuracy, 95.85% precision, 95.8% recall, and 95.83% F1 score; and (4) has 100% accuracy, 100% precision, 100% recall, and 100% F1 score. ©2021 Pino et al.
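The routing implied by problems (1) to (3) can be illustrated with a minimal sketch. It is not the authors' implementation: the function `recognize_character`, the label strings, and the toy threshold classifiers are all hypothetical stand-ins for trained SVMs, and the feature vectors are invented for illustration. Problem (4), the diacritic classifier, would be a separate binary step and is omitted here.

```python
from typing import Callable, Sequence, Tuple

Features = Sequence[float]
Classifier = Callable[[Features], str]


def recognize_character(
    features: Features,
    script_clf: Classifier,
    baybayin_clf: Classifier,
    latin_clf: Classifier,
) -> Tuple[str, str]:
    """Route one normalized character through a two-stage cascade.

    Stage 1 is a binary decision (Baybayin vs. Latin); stage 2 is the
    multiclass classifier that matches the detected script.
    """
    script = script_clf(features)  # problem (1): binary
    if script == "baybayin":
        return script, baybayin_clf(features)  # problem (2): multiclass
    if script == "latin":
        return script, latin_clf(features)  # problem (3): multiclass
    raise ValueError(f"unexpected script label: {script!r}")


# Toy stand-ins for trained SVMs (assumptions, for illustration only).
def toy_script_clf(x: Features) -> str:
    return "baybayin" if x[0] > 0.5 else "latin"


def toy_baybayin_clf(x: Features) -> str:
    return "ba" if x[1] > 0.5 else "ka"


def toy_latin_clf(x: Features) -> str:
    return "A" if x[1] > 0.5 else "B"


first = recognize_character([0.9, 0.2], toy_script_clf, toy_baybayin_clf, toy_latin_clf)
second = recognize_character([0.1, 0.8], toy_script_clf, toy_baybayin_clf, toy_latin_clf)
print(first, second)
```

In a cascade of this kind, an error in the first stage sends the character to the wrong multiclass classifier, which is one reason the script-recognition accuracy matters to the overall result.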

Keywords:  Baybayin; Baybayin script identification; Latin script identification; Optical character recognition; Support vector machine

Year: 2021; PMID: 33817010; PMCID: PMC7959605; DOI: 10.7717/peerj-cs.360

Source DB: PubMed; Journal: PeerJ Comput Sci; ISSN: 2376-5992


