Optical character recognition system for Baybayin scripts using support vector machine.

Rodney Pino¹, Renier Mendoza¹, Rachelle Sambayan¹.

Abstract

In 2018, the Philippine Congress signed House Bill 1022, declaring the Baybayin script the national writing system of the Philippines. Baybayin and Latin scripts are therefore likely to appear together in a single document. In this work, we propose a system that discriminates between the characters of the two scripts. The proposed system normalizes each individual character, identifies whether it belongs to the Baybayin or the Latin script, and then classifies which unit it represents. This gives four classification problems: (1) Baybayin and Latin script recognition, (2) Baybayin character classification, (3) Latin character classification, and (4) Baybayin diacritical mark classification. To the best of our knowledge, this is the first study to use a Support Vector Machine (SVM) for Baybayin script recognition. This work also provides a new dataset of Baybayin characters, Baybayin diacritics, and Latin characters. Classification problems (1) and (4) use binary SVM, while (2) and (3) use multiclass SVM. On average, our numerical experiments yield the following results: (1) has 98.5% accuracy, 98.5% precision, 98.49% recall, and 98.5% F1 score; (2) has 96.51% accuracy, 95.62% precision, 95.61% recall, and 95.62% F1 score; (3) has 95.8% accuracy, 95.85% precision, 95.8% recall, and 95.83% F1 score; and (4) has 100% accuracy, 100% precision, 100% recall, and 100% F1 score. ©2021 Pino et al.
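The four figures reported for each problem (accuracy, precision, recall, F1 score) can be computed from true and predicted labels as in the minimal sketch below. The abstract does not say how per-class scores were averaged, so macro averaging is an assumption here. The function name `macro_metrics` and the toy labels are hypothetical and are not taken from the paper's dataset.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall and F1 as fractions.

    Macro averaging (unweighted mean over classes) is an assumption; the
    abstract does not state which averaging scheme the authors used.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    true_count = Counter(y_true)   # samples that really belong to each class
    pred_count = Counter(y_pred)   # samples predicted as each class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # per-class correct

    precisions, recalls, f1s = [], [], []
    for label in labels:
        precision = hits[label] / pred_count[label] if pred_count[label] else 0.0
        recall = hits[label] / true_count[label] if true_count[label] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(hits.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels for problem (1), script recognition; illustrative only.
y_true = ["baybayin"] * 4 + ["latin"] * 4
y_pred = ["baybayin"] * 3 + ["latin"] * 5

scores = macro_metrics(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.2%}")
```

The same function applies unchanged to the multiclass problems (2) and (3), since it derives the set of classes from the labels it is given.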

Keywords: Baybayin; Baybayin script identification; Latin script identification; Optical character recognition; Support vector machine

Year: 2021 PMID: 33817010 PMCID: PMC7959605 DOI: 10.7717/peerj-cs.360

Source DB: PubMed Journal: PeerJ Comput Sci ISSN: 2376-5992
