Distilling Transformers for Neural Cross-Domain Search

08/06/2021
by Colin B. Clement, et al.

Pre-trained transformers have recently clinched top spots across the gamut of natural language tasks and pioneered solutions to software engineering tasks. Even information retrieval has not been immune to the charm of the transformer, though the large size and cost of these models are generally a barrier to deployment. While there has been much work in streamlining, caching, and modifying transformer architectures for production, here we explore a new direction: distilling a large pre-trained translation model into a lightweight bi-encoder which can be efficiently cached and queried. We argue from a probabilistic perspective that sequence-to-sequence models are a conceptually ideal, albeit highly impractical, retriever. We derive a new distillation objective, implementing it as a data augmentation scheme. Using natural language source code search as a case study for cross-domain search, we demonstrate the validity of this idea by significantly improving upon the current leader of the CodeSearchNet challenge, a recent natural language code search benchmark.
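The abstract does not spell out the architecture or the distillation objective, but the core idea it describes, a bi-encoder whose document vectors can be cached, trained with extra pairs synthesized by a large translation (code-to-docstring) model, can be sketched roughly as below. This is a minimal illustration under assumptions, not the paper's implementation: the bag-of-tokens encoders, the in-batch contrastive loss, and the `teacher_generate` helper are all hypothetical stand-ins.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BiEncoder(nn.Module):
    """Toy bi-encoder: separate encoders for natural-language queries and
    code snippets, scored by cosine similarity so code vectors can be
    pre-computed and cached for efficient retrieval."""
    def __init__(self, vocab_size=30000, dim=256):
        super().__init__()
        self.query_emb = nn.EmbeddingBag(vocab_size, dim)  # stand-in query encoder
        self.code_emb = nn.EmbeddingBag(vocab_size, dim)   # stand-in code encoder

    def encode_query(self, token_ids):
        return F.normalize(self.query_emb(token_ids), dim=-1)

    def encode_code(self, token_ids):
        return F.normalize(self.code_emb(token_ids), dim=-1)

def in_batch_contrastive_loss(q_vecs, c_vecs, temperature=0.05):
    """Each query in the batch should score highest against its own code."""
    logits = q_vecs @ c_vecs.t() / temperature
    labels = torch.arange(q_vecs.size(0), device=q_vecs.device)
    return F.cross_entropy(logits, labels)

def augment_with_teacher(code_batch, teacher_generate):
    """Distillation cast as data augmentation (hypothetical helper): a large
    sequence-to-sequence teacher synthesizes a natural-language query for each
    unlabeled code snippet, yielding extra (query, code) training pairs."""
    return [(teacher_generate(code), code) for code in code_batch]
```

At query time only `encode_query` runs online; every code snippet in the index is encoded once offline and stored, which is what makes the bi-encoder cheap to deploy compared with scoring each query-code pair through a full sequence-to-sequence model.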


