TQ-Net: Mixed Contrastive Representation Learning For Heterogeneous Test Questions

by He Zhu, et al.

Recently, more and more people study online because of the convenient access to massive learning materials (e.g. test questions and notes), so accurately understanding learning materials has become a crucial issue that is essential for many educational applications. Previous studies focus on using language models to represent question data. However, test questions (TQ) are usually heterogeneous and multi-modal: some contain only text, while others contain images carrying information beyond their literal description. In this context, it is difficult for both supervised and unsupervised methods to learn a fused representation of questions. Meanwhile, this problem cannot be solved by conventional methods such as image captioning, because the images may contain information that is complementary to, rather than duplicative of, the text. In this paper, we first improve previous text-only representations with a two-stage unsupervised instance-level contrastive pre-training method (MCL: Mixture Unsupervised Contrastive Learning). Then, we propose TQ-Net to fuse image content into the representation of heterogeneous data. Finally, we conduct supervised contrastive learning on relevance-prediction-related downstream tasks, which helps the model learn effective question representations. We conducted extensive experiments on question-based tasks on large-scale, real-world datasets, which demonstrated the effectiveness of TQ-Net and improved the precision of downstream applications (e.g. similar questions +2.02%, knowledge point prediction +7.20%). We open-source a subset of our data to promote the development of related studies.
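The abstract does not spell out the contrastive objective used in MCL or the supervised stage. As a rough illustration only, here is a minimal NumPy sketch of an instance-level NT-Xent (InfoNCE-style) loss of the kind such contrastive pre-training methods typically optimize; the function name, array shapes, and temperature value are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """Illustrative NT-Xent contrastive loss over two views.

    z1, z2: (N, d) arrays of L2-normalized embeddings. Row i of z1 and
    row i of z2 are two views of the same question (a positive pair);
    all other rows in the batch act as negatives.
    """
    z = np.concatenate([z1, z2], axis=0)        # (2N, d) stacked views
    sim = z @ z.T / temperature                  # scaled cosine similarities
    np.fill_diagonal(sim, -np.inf)               # exclude self-similarity
    n = z1.shape[0]
    # Each row's positive sits N rows away: i <-> i + N.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    # Cross-entropy of each row against its positive: -log softmax(sim)[i, pos[i]]
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    return (logsumexp - sim[np.arange(2 * n), pos]).mean()
```

A correctly paired batch should score a lower loss than a mispaired one, since the objective pulls positive pairs together and pushes all other in-batch pairs apart.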


