ASTER: An Attentional Scene Text Recognizer with Flexible Rectification.

Baoguang Shi, Mingkun Yang, Xinggang Wang, Pengyuan Lyu, Cong Yao, Xiang Bai.   

Abstract

A challenging aspect of scene text recognition is to handle text with distortions or irregular layout. In particular, perspective text and curved text are common in natural scenes and are difficult to recognize. In this work, we introduce ASTER, an end-to-end neural network model that comprises a rectification network and a recognition network. The rectification network adaptively transforms an input image into a new one, rectifying the text in it. It is powered by a flexible Thin-Plate Spline transformation which handles a variety of text irregularities and is trained without human annotations. The recognition network is an attentional sequence-to-sequence model that predicts a character sequence directly from the rectified image. The whole model is trained end to end, requiring only images and their groundtruth text. Through extensive experiments, we verify the effectiveness of the rectification and demonstrate the state-of-the-art recognition performance of ASTER. Furthermore, we demonstrate that ASTER is a powerful component in end-to-end recognition systems, for its ability to enhance the detector.
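
The abstract names a Thin-Plate Spline (TPS) transformation as the core of the rectification network. The sketch below is a minimal NumPy illustration of fitting and applying a 2D TPS from control-point pairs. It is not the authors' implementation. The function names (`tps_kernel`, `fit_tps`, `apply_tps`), the kernel form U(r) = r^2 log r^2, and the placement of control points along the top and bottom edges are assumptions made for this example. In a learned rectifier the target control points would be predicted by a network rather than written by hand.

```python
import numpy as np

def tps_kernel(r2):
    # Radial basis U(r) = r^2 * log(r^2), with U(0) = 0 (assumed kernel form).
    return r2 * np.log(np.where(r2 == 0, 1.0, r2))

def fit_tps(src, dst):
    # Solve the standard TPS linear system for K control-point pairs.
    # src, dst: arrays of shape (K, 2); src points must not all be collinear.
    k = src.shape[0]
    d2 = np.sum((src[:, None, :] - src[None, :, :]) ** 2, axis=-1)
    P = np.hstack([np.ones((k, 1)), src])   # affine part: [1, x, y]
    L = np.zeros((k + 3, k + 3))
    L[:k, :k] = tps_kernel(d2)
    L[:k, k:] = P
    L[k:, :k] = P.T
    rhs = np.zeros((k + 3, 2))
    rhs[:k] = dst
    # Rows 0..k-1 are the radial weights, the last 3 rows the affine terms.
    return np.linalg.solve(L, rhs)

def apply_tps(params, src, pts):
    # Map arbitrary points (N, 2) through the fitted spline.
    d2 = np.sum((pts[:, None, :] - src[None, :, :]) ** 2, axis=-1)
    P = np.hstack([np.ones((pts.shape[0], 1)), pts])
    return tps_kernel(d2) @ params[:-3] + P @ params[-3:]

# Hypothetical control points: three along the top edge, three along the bottom
# edge of a unit-square output image, paired with hand-written input locations
# that bend the text region.
out_ctrl = np.array([[0.0, 0.0], [0.5, 0.0], [1.0, 0.0],
                     [0.0, 1.0], [0.5, 1.0], [1.0, 1.0]])
in_ctrl = out_ctrl + np.array([[0.0, 0.1], [0.0, 0.0], [0.0, 0.1],
                               [0.0, -0.1], [0.0, -0.2], [0.0, -0.1]])

params = fit_tps(out_ctrl, in_ctrl)
# Sampling locations in the input image for two output pixels.
sample_locations = apply_tps(params, out_ctrl, np.array([[0.25, 0.5], [0.75, 0.5]]))
```

Mapping every pixel coordinate of the output image through `apply_tps` would give a sampling grid. Reading the input image at those locations with a differentiable sampler is one common way to keep such a warp trainable from the recognition loss alone, which is consistent with the abstract's statement that no extra annotations are required.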

Year:  2018        PMID: 29994467     DOI: 10.1109/TPAMI.2018.2848939

Source DB:  PubMed          Journal:  IEEE Trans Pattern Anal Mach Intell        ISSN: 0098-5589            Impact factor:   6.226


  5 in total

1.  DeepCIN: Attention-Based Cervical histology Image Classification with Sequential Feature Modeling for Pathologist-Level Accuracy.

Authors:  Sudhir Sornapudi; R Joe Stanley; William V Stoecker; Rodney Long; Zhiyun Xue; Rosemary Zuna; Shellaine R Frazier; Sameer Antani
Journal:  J Pathol Inform       Date:  2020-12-24

2.  Interactive video retrieval evaluation at a distance: comparing sixteen interactive video search systems in a remote setting at the 10th Video Browser Showdown.

Authors:  Silvan Heller; Viktor Gsteiger; Werner Bailer; Cathal Gurrin; Björn Þór Jónsson; Jakub Lokoč; Andreas Leibetseder; František Mejzlík; Ladislav Peška; Luca Rossetto; Konstantin Schall; Klaus Schoeffmann; Heiko Schuldt; Florian Spiess; Ly-Duyen Tran; Lucia Vadicamo; Patrik Veselý; Stefanos Vrochidis; Jiaxin Wu
Journal:  Int J Multimed Inf Retr       Date:  2022-01-26

3.  Improving Scene Text Recognition for Indian Languages with Transfer Learning and Font Diversity.

Authors:  Sanjana Gunna; Rohit Saluja; Cheerakkuzhi Veluthemana Jawahar
Journal:  J Imaging       Date:  2022-03-23

4.  A Smart Visual Sensing Concept Involving Deep Learning for a Robust Optical Character Recognition under Hard Real-World Conditions.

Authors:  Kabeh Mohsenzadegan; Vahid Tavakkoli; Kyandoghere Kyamakya
Journal:  Sensors (Basel)       Date:  2022-08-12       Impact factor: 3.847

5.  MA-CharNet: Multi-angle fusion character recognition network.

Authors:  Qingyu Wang; Jing Liu; Ziqi Zhu; Chunhua Deng
Journal:  PLoS One       Date:  2022-08-29       Impact factor: 3.752
