ZhichunRoad at Amazon KDD Cup 2022: MultiTask Pre-Training for E-Commerce Product Search

01/31/2023
by   Xuange Cui, et al.

In this paper, we propose a robust multilingual model to improve the quality of search results. Our model not only leverages a processed, class-balanced dataset but also benefits from multitask pre-training that yields more general representations. In the pre-training stage, we adopt a masked language modeling (MLM) task, a classification task, and a contrastive learning task to achieve considerable performance. In the fine-tuning stage, we use confident learning, the exponential moving average method (EMA), adversarial training (FGM), and the regularized dropout strategy (R-Drop) to improve the model's generalization and robustness. Moreover, we use multi-granular semantic units to mine the textual metadata of queries and products, enhancing the model's representations. Our approach obtained competitive results and ranked in the top 8 in three tasks. We release the source code and pre-trained models associated with this work.
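To make the fine-tuning recipe concrete, below is a minimal PyTorch sketch of the three generic techniques the abstract names: FGM adversarial training, R-Drop regularization, and EMA weight averaging. It assumes a classification model whose forward pass returns logits; all names and hyperparameters (`model`, `emb_name`, `epsilon`, `alpha`, `decay`) are illustrative defaults, not values taken from the paper's released code.

```python
import torch
import torch.nn.functional as F


class FGM:
    """Fast Gradient Method: perturb the embedding weights along the gradient
    direction, take one adversarial forward/backward pass, then restore."""

    def __init__(self, model, epsilon=1.0, emb_name="embeddings"):
        self.model = model
        self.epsilon = epsilon
        self.emb_name = emb_name  # assumed substring of the embedding params' names
        self.backup = {}

    def attack(self):
        for name, param in self.model.named_parameters():
            if param.requires_grad and self.emb_name in name and param.grad is not None:
                self.backup[name] = param.data.clone()
                norm = torch.norm(param.grad)
                if norm != 0:
                    param.data.add_(self.epsilon * param.grad / norm)

    def restore(self):
        for name, param in self.model.named_parameters():
            if name in self.backup:
                param.data = self.backup[name]
        self.backup = {}


class EMA:
    """Keep an exponential moving average of the weights for evaluation."""

    def __init__(self, model, decay=0.999):
        self.decay = decay
        self.shadow = {n: p.data.clone()
                       for n, p in model.named_parameters() if p.requires_grad}

    def update(self, model):
        for n, p in model.named_parameters():
            if n in self.shadow:
                self.shadow[n].mul_(self.decay).add_(p.data, alpha=1 - self.decay)


def r_drop_loss(logits1, logits2, labels, alpha=4.0):
    """Cross-entropy on two dropout-perturbed passes plus a symmetric KL term."""
    ce = 0.5 * (F.cross_entropy(logits1, labels) + F.cross_entropy(logits2, labels))
    kl = 0.5 * (
        F.kl_div(F.log_softmax(logits1, dim=-1), F.softmax(logits2, dim=-1),
                 reduction="batchmean")
        + F.kl_div(F.log_softmax(logits2, dim=-1), F.softmax(logits1, dim=-1),
                   reduction="batchmean")
    )
    return ce + alpha * kl


def train_step(model, batch, optimizer, fgm, ema, alpha=4.0):
    input_ids, attention_mask, labels = batch  # assumed batch layout
    # R-Drop: two forward passes; dropout makes them two sub-models.
    logits1 = model(input_ids, attention_mask)
    logits2 = model(input_ids, attention_mask)
    loss = r_drop_loss(logits1, logits2, labels, alpha)
    loss.backward()
    # FGM: adversarial pass on perturbed embeddings; gradients accumulate.
    fgm.attack()
    F.cross_entropy(model(input_ids, attention_mask), labels).backward()
    fgm.restore()
    optimizer.step()
    optimizer.zero_grad()
    ema.update(model)
    return loss.item()
```

FGM perturbs only the embedding matrix rather than every weight, which keeps the attack cheap for text models, and the EMA shadow weights would be copied into the model at evaluation time rather than during training.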

Related research

04/21/2020 · Train No Evil: Selective Masking for Task-guided Pre-training
Recently, pre-trained language models mostly follow the pre-training-the...

08/23/2022 · Predicting Query-Item Relationship using Adversarial Training and Robust Modeling Techniques
We present an effective way to predict search query-item relationship. W...

11/07/2021 · How does a Pre-Trained Transformer Integrate Contextual Keywords? Application to Humanitarian Computing
In a classification task, dealing with text snippets and metadata usuall...

04/11/2022 · Towards Generalizable Semantic Product Search by Text Similarity Pre-training on Search Click Logs
Recently, semantic search has been successfully applied to e-commerce pr...

04/10/2023 · Delving into E-Commerce Product Retrieval with Vision-Language Pre-training
E-commerce search engines comprise a retrieval phase and a ranking phase...

02/28/2023 · Towards Better Web Search Performance: Pre-training, Fine-tuning and Learning to Rank
This paper describes the approach of the THUIR team at the WSDM Cup 2023...

05/09/2022 · EigenNoise: A Contrastive Prior to Warm-Start Representations
In this work, we present a naive initialization scheme for word vectors ...
