3D Rendering Framework for Data Augmentation in Optical Character Recognition

by Andreas Spruck, et al.

In this paper, we propose a data augmentation framework for Optical Character Recognition (OCR). The proposed framework synthesizes new viewing angles and illumination scenarios, effectively enriching any available OCR dataset. Its modular structure allows it to be adapted to individual user requirements, and the enlargement factor of the available dataset can be scaled easily. Furthermore, the proposed method is not restricted to single-frame OCR but can also be applied to video OCR. We demonstrate the performance of our framework by augmenting a 15% subset of the Brno Mobile OCR dataset. The framework improves the performance of OCR applications, especially for small datasets. Applying the proposed method, improvements of up to 2.79 percentage points in terms of Character Error Rate (CER) and up to 7.88 percentage points in terms of Word Error Rate (WER) are achieved on the subset. The recognition of challenging text lines benefits in particular: for this class, the CER decreases by up to 14.92 percentage points and the WER by up to 18.19 percentage points. Moreover, we achieve smaller error rates when training on the 15% subset augmented with the proposed method than when training on the original, non-augmented full dataset.
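The results above are reported as CER and WER in percentage points. As a reminder of how such rates are commonly computed, the following is a minimal sketch based on the Levenshtein edit distance between reference and recognized text. The function names (`edit_distance`, `error_rate`, `cer`, `wer`) are illustrative assumptions and are not taken from the paper; the authors' exact evaluation protocol (text normalization, aggregation over lines) may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (characters or words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (cost 0 on a match)
            ))
        prev = curr
    return prev[-1]


def error_rate(refs, hyps, tokenize):
    """Total edit errors divided by total reference length, in percent."""
    errors = sum(edit_distance(tokenize(r), tokenize(h))
                 for r, h in zip(refs, hyps))
    total = sum(len(tokenize(r)) for r in refs)
    return 100.0 * errors / total


def cer(refs, hyps):
    """Character Error Rate in percent (assumed definition)."""
    return error_rate(refs, hyps, list)


def wer(refs, hyps):
    """Word Error Rate in percent, splitting on whitespace (assumed definition)."""
    return error_rate(refs, hyps, str.split)


if __name__ == "__main__":
    refs = ["data augmentation"]
    hyps = ["data augmentaton"]  # one character missing
    print(f"CER: {cer(refs, hyps):.2f} %")
    print(f"WER: {wer(refs, hyps):.2f} %")
```

An improvement in percentage points is then the plain difference between two such rates, for example the CER of a model trained without augmentation minus the CER of a model trained with it.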


Synthesizing Annotated Image and Video Data Using a Rendering-Based Pipeline for Improved License Plate Recognition

An insufficient number of training samples is a common problem in neural...

Data Augmentation in a Hybrid Approach for Aspect-Based Sentiment Analysis

Data augmentation is a way to increase the diversity of available data b...

CNN-BiLSTM model for English Handwriting Recognition: Comprehensive Evaluation on the IAM Dataset

We present a CNN-BiLSTM system for the problem of offline English handwr...

Data-Efficient Augmentation for Training Neural Networks

Data augmentation is essential to achieve state-of-the-art performance i...

Mixed Model OCR Training on Historical Latin Script for Out-of-the-Box Recognition and Finetuning

In order to apply Optical Character Recognition (OCR) to historical prin...

Text Augmentation for Language Models in High Error Recognition Scenario

We examine the effect of data augmentation for training of language mode...

Digitizing Historical Balance Sheet Data: A Practitioner's Guide

This paper discusses how to successfully digitize large-scale historical...
