DocEnTr: An End-to-End Document Image Enhancement Transformer

01/25/2022
by Mohamed Ali Souibgui, et al.

Document images can be affected by many degradation scenarios, which cause recognition and processing difficulties. In this age of digitization, it is important to denoise them for proper usage. To address this challenge, we present a new encoder-decoder architecture based on vision transformers to enhance both machine-printed and handwritten document images in an end-to-end fashion. The encoder operates directly on the pixel patches with their positional information, without any convolutional layers, while the decoder reconstructs a clean image from the encoded patches. Experiments show that the proposed model outperforms state-of-the-art methods on several DIBCO benchmarks. Code and models will be publicly available at: <https://github.com/dali92002/DocEnTR>.
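
The patch-based input and output described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `patchify` and `unpatchify`, the 4x4 toy image, and the 2x2 patch size are all assumptions chosen for illustration, and the transformer encoder and decoder themselves are omitted. The sketch only shows how an image might be split into flattened pixel patches whose list index serves as positional information, and how a clean image could be reassembled from per-patch outputs.

```python
def patchify(image, patch):
    """Split an H x W image (list of rows) into flattened patches.

    Patches are returned in row-major order, so the list index of each
    patch doubles as its position (hypothetical stand-in for the
    positional information mentioned in the abstract).
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be multiples of the patch size")
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            patches.append([image[top + r][left + c]
                            for r in range(patch)
                            for c in range(patch)])
    return patches


def unpatchify(patches, height, width, patch):
    """Reassemble an H x W image from flattened patches in row-major order."""
    image = [[0] * width for _ in range(height)]
    patches_per_row = width // patch
    for index, vector in enumerate(patches):
        top = (index // patches_per_row) * patch
        left = (index % patches_per_row) * patch
        for r in range(patch):
            for c in range(patch):
                image[top + r][left + c] = vector[r * patch + c]
    return image


# Toy 4x4 grayscale "image" with pixel values 0..15 (illustrative data only).
image = [[row * 4 + col for col in range(4)] for row in range(4)]
patches = patchify(image, 2)              # four patches of four pixels each
restored = unpatchify(patches, 4, 4, 2)   # identity round trip, no model applied
```

In an actual enhancement model, each flattened patch would be transformed by the network between the two steps; here the round trip simply reproduces the input, which confirms that the patch ordering is consistent.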

research
05/08/2023

SwinDocSegmenter: An End-to-End Unified Domain Adaptive Transformer for Document Instance Segmentation

Instance-level segmentation of documents consists in assigning a class-a...
research
04/18/2023

Deep Unrestricted Document Image Rectification

In recent years, tremendous efforts have been made on document image rec...
research
03/31/2022

Deep Hyperspectral Unmixing using Transformer Network

Currently, this paper is under review in IEEE. Transformers have intrigu...
research
05/26/2021

Enhance to Read Better: An Improved Generative Adversarial Network for Handwritten Document Image Enhancement

Handwritten document images can be highly affected by degradation for di...
research
02/01/2021

RectiNet-v2: A stacked network architecture for document image dewarping

With the advent of mobile and hand-held cameras, document images have fo...
research
06/15/2023

Fast Training of Diffusion Models with Masked Transformers

We propose an efficient approach to train large diffusion models with ma...
research
11/23/2022

Completing point cloud from few points by Wasserstein GAN and Transformers

In many vision and robotics applications, it is common that the captured...
