Optimal Boxes: Boosting End-to-End Scene Text Recognition by Adjusting Annotated Bounding Boxes via Reinforcement Learning

07/25/2022
by Jingqun Tang, et al.

Text detection and recognition are essential components of a modern OCR system. Most OCR approaches attempt to obtain accurate bounding boxes of text at the detection stage, which are then used as the input to the text recognition stage. We observe that when tight text bounding boxes are used as input, a text recognizer frequently fails to achieve optimal performance because of the inconsistency between the bounding boxes and the deep representations used for text recognition. In this paper, we propose Box Adjuster, a reinforcement learning-based method that adjusts the shape of each text bounding box to make it more compatible with text recognition models. Additionally, when dealing with cross-domain problems such as synthetic-to-real, the proposed method significantly reduces mismatches in domain distribution between the source and target domains. Experiments demonstrate that the performance of end-to-end text recognition systems can be improved when the adjusted bounding boxes are used as the ground truths for training. Specifically, on several benchmark datasets for scene text understanding, the proposed method outperforms state-of-the-art text spotters by an average of 2.0 4.6
