ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images

06/05/2023
by Wenwen Yu, et al.

Structured text extraction is one of the most valuable and challenging application directions in the field of Document AI. However, past benchmarks cover a limited range of scenarios, and their evaluation protocols usually focus on individual submodules of the structured text extraction pipeline rather than on the whole. To address these problems, we organized the ICDAR 2023 competition on Structured text extraction from Visually-Rich Document images (SVRD). SVRD has two tracks, Track 1: HUST-CELL and Track 2: Baidu-FEST. HUST-CELL evaluates the end-to-end performance of Complex Entity Linking and Labeling, while Baidu-FEST evaluates the performance and generalization of Zero-shot / Few-shot Structured Text extraction from an end-to-end perspective. Compared to current document benchmarks, the two tracks greatly enrich the range of scenarios and contain more than 50 types of visually-rich document images, mainly drawn from actual enterprise applications. The competition opened on 30th December, 2022 and closed on 24th March, 2023. Track 1 received 91 valid submissions from 35 participants, and Track 2 received 26 valid submissions from 15 participants. This report presents the motivation, competition datasets, task definition, evaluation protocol, and submission summaries. Based on the performance of the submissions, we believe there is still a large gap between current and expected information extraction performance in complex and zero-shot scenarios. We hope that this competition will attract many researchers in the fields of CV and NLP and bring new ideas to the field of Document AI.

