Scene Text Recognition via Transformer

03/18/2020
by Xinjie Feng, et al.

Scene text recognition with arbitrary shape is very challenging due to large variations in text shapes, fonts, colors, backgrounds, etc. Most state-of-the-art algorithms rectify the input image into a normalized image, then treat recognition as a sequence prediction task. The bottleneck of such methods is the rectification step, which introduces errors due to perspective distortion. In this paper, we find that rectification is completely unnecessary; all we need is spatial attention. We therefore propose a simple but extremely effective scene text recognition method based on the transformer [50]. Unlike previous transformer-based models [56,34], which use only the decoder of the transformer to decode the convolutional attention, the proposed method uses convolutional feature maps as the word embedding input to the transformer. In this way, our method is able to make full use of the powerful attention mechanism of the transformer. Extensive experimental results show that the proposed method outperforms state-of-the-art methods by a very large margin on both regular and irregular text datasets. On CUTE, one of the most challenging datasets, where the state-of-the-art prediction accuracy is 89.6, the result is pretty surprising. We will release our source code and believe that our method will be a new benchmark for scene text recognition with arbitrary shapes.
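The central idea in the abstract, treating each spatial position of a convolutional feature map as a token and attending over all positions, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy feature map values, and the single-query, single-head attention without learned projections or positional encoding are all assumptions made for illustration.

```python
import math

def feature_map_to_tokens(fmap):
    """Flatten an H x W x C feature map (nested lists) into H*W tokens of size C."""
    return [list(cell) for row in fmap for cell in row]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, tokens):
    """Scaled dot-product attention of one query over all spatial tokens."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, tok)) / scale for tok in tokens]
    weights = softmax(scores)
    dim = len(tokens[0])
    # Context vector: attention-weighted sum of the tokens.
    context = [sum(w * tok[d] for w, tok in zip(weights, tokens)) for d in range(dim)]
    return weights, context

# Hypothetical 2 x 3 feature map with 2 channels (made-up values).
fmap = [
    [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]],
    [[0.0, 0.0], [4.0, 0.0], [0.0, 2.0]],
]
tokens = feature_map_to_tokens(fmap)           # 6 tokens, one per spatial cell
weights, context = attend([1.0, 0.0], tokens)  # one weight per spatial cell
```

Because the weights range over every cell of the 2D map, the query can focus on any position regardless of how the text is laid out, which is the intuition behind dropping the rectification step. A full transformer would add learned query, key, and value projections, multiple heads, and positional information.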

research
11/09/2022

Portmanteauing Features for Scene Text Recognition

Scene text images have different shapes and are subjected to various dis...
research
10/10/2019

On Recognizing Texts of Arbitrary Shapes with 2D Self-Attention

Scene text recognition (STR) is the task of recognizing character sequen...
research
04/20/2019

FACLSTM: ConvLSTM with Focused Attention for Scene Text Recognition

Scene text recognition has recently been widely treated as a sequence-to...
research
07/31/2022

Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition

Artistic text recognition is an extremely challenging task with a wide r...
research
11/09/2022

Pure Transformer with Integrated Experts for Scene Text Recognition

Scene text recognition (STR) involves the task of reading text in croppe...
research
07/23/2019

2D-CTC for Scene Text Recognition

Scene text recognition has been an important, active research topic in c...
research
11/19/2019

KISS: Keeping It Simple for Scene Text Recognition

Over the past few years, several new methods for scene text recognition ...
