Document understanding and information extraction include different task...
Document-based Visual Question Answering examines the document understan...
Most TextVQA approaches focus on the integration of objects, scene texts...
We propose a PiggyBack, a Visual Question Answering platform that allows...
Recognizing the layout of unstructured digital documents is crucial when...
As the use of deep learning techniques has grown across various fields o...
Automatic table detection in PDF documents has achieved a great success ...
Text-to-image multimodal tasks, generating/retrieving an image from a gi...
Visual question answering (VQA) is a challenging multi-modal task that
r...