SauvolaNet: Learning Adaptive Sauvola Network for Degraded Document Binarization

05/12/2021
by Deng Li, et al.

Inspired by the classic Sauvola local image thresholding approach, we systematically study it from the deep neural network (DNN) perspective and propose a new solution called SauvolaNet for degraded document binarization (DDB). It is composed of three explainable modules: Multi-Window Sauvola (MWS), Pixelwise Window Attention (PWA), and Adaptive Sauvola Threshold (AST). The MWS module faithfully reflects the classic Sauvola method but with trainable parameters and multi-window settings. The PWA module estimates the preferred window sizes for each pixel location. The AST module consolidates the outputs from MWS and PWA and predicts the final adaptive threshold for each pixel location. As a result, SauvolaNet is end-to-end trainable and reduces the number of required network parameters to 40K, only 1% of MobileNetV2. At the same time, it achieves state-of-the-art (SoTA) performance on the DDB task: SauvolaNet is at least comparable to, if not better than, SoTA binarization solutions in our extensive studies on 13 public document binarization datasets. Our source code is available at https://github.com/Leedeng/SauvolaNet.
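For readers unfamiliar with the classic method, the following is a minimal NumPy sketch of Sauvola thresholding and of one way to combine several window sizes per pixel. It is not the authors' implementation. The classic Sauvola threshold is T = m * (1 + k * (s / R - 1)), where m and s are the local mean and standard deviation in a window. The function names, the fixed values of k and R, the choice of window sizes, and the softmax-weighted sum used to merge thresholds are all assumptions made for illustration; in SauvolaNet the corresponding parameters and the per-pixel window attention are learned.

```python
import numpy as np

def box_mean(img, w):
    """Mean over a w x w window centred on each pixel (w odd), reflect-padded."""
    r = w // 2
    padded = np.pad(img, r, mode="reflect")
    # Integral image with a leading row and column of zeros.
    ii = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    ii[1:, 1:] = padded.cumsum(0).cumsum(1)
    h, wd = img.shape
    total = (ii[w:w + h, w:w + wd] - ii[:h, w:w + wd]
             - ii[w:w + h, :wd] + ii[:h, :wd])
    return total / (w * w)

def sauvola_threshold(img, w, k=0.2, R=128.0):
    """Classic Sauvola threshold T = m * (1 + k * (s / R - 1))."""
    img = img.astype(np.float64)
    m = box_mean(img, w)
    var = np.maximum(box_mean(img * img, w) - m * m, 0.0)  # guard rounding
    s = np.sqrt(var)
    return m * (1.0 + k * (s / R - 1.0))

def multi_window_threshold(img, windows, logits, k=0.2, R=128.0):
    """Assumed simplification: softmax-weighted sum of per-window thresholds.

    logits has shape (len(windows), H, W) and stands in for a learned
    per-pixel window attention; here it is simply supplied by the caller.
    """
    ts = np.stack([sauvola_threshold(img, w, k, R) for w in windows])
    e = np.exp(logits - logits.max(axis=0, keepdims=True))
    att = e / e.sum(axis=0, keepdims=True)
    return (att * ts).sum(axis=0)

# Example: uniform attention over three hypothetical window sizes.
demo = np.random.default_rng(0).uniform(0, 255, (32, 32))
demo_windows = [7, 15, 31]
demo_logits = np.zeros((len(demo_windows),) + demo.shape)
demo_T = multi_window_threshold(demo, demo_windows, demo_logits)
demo_binary = demo > demo_T  # True = background, False = ink (dark text)
```

With all-zero logits every window gets equal weight, so the result is the plain average of the single-window thresholds. Replacing the logits with a per-pixel prediction lets each location favour the window size that suits the local stroke width.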


