Fast Cross-domain Data Augmentation through Neural Sentence Editing

03/23/2020
by Guillaume Raille, et al.

Data augmentation promises to alleviate data scarcity. It matters most when the initial data is in short supply, which is also where existing methods struggle the most, since learning the full data distribution is then impossible. For natural language, sentence editing offers a solution: rather than generating sentences from scratch, it relies on small but meaningful changes to the original sentences. However, learning which changes are meaningful itself requires large amounts of training data. We therefore aim to learn this in a source domain where data is abundant and to apply it in a different, target domain where data is scarce: cross-domain augmentation. We create the Edit-transformer, a Transformer-based sentence editor that is significantly faster than the state of the art and also works cross-domain. We argue that, due to its structure, the Edit-transformer is better suited to cross-domain settings than its edit-based predecessors, and we demonstrate this performance gap on Yelp-Wikipedia domain pairs. Finally, we show that this cross-domain advantage translates into meaningful performance gains on several downstream tasks.
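To make the edit-based augmentation idea concrete, the following is a minimal, hypothetical sketch in Python. It applies small random token edits (deletions and adjacent swaps) to produce variants of each training sentence. This is a crude, non-learned stand-in: the Edit-transformer described above instead *learns* which edits preserve meaning from a data-rich source domain. All function names and probabilities here are illustrative assumptions, not the paper's implementation.

```python
import random


def edit_sentence(sentence, delete_prob=0.1, swap_prob=0.1, rng=None):
    """Apply small random edits (token deletions and adjacent swaps).

    A toy surrogate for a learned sentence editor: it makes small
    changes to the original sentence rather than generating from scratch.
    """
    rng = rng or random.Random(0)
    tokens = sentence.split()
    out = []
    i = 0
    while i < len(tokens):
        r = rng.random()
        if r < delete_prob and len(tokens) > 3:
            i += 1  # drop this token
            continue
        if r < delete_prob + swap_prob and i + 1 < len(tokens):
            out.extend([tokens[i + 1], tokens[i]])  # swap adjacent tokens
            i += 2
            continue
        out.append(tokens[i])
        i += 1
    return " ".join(out)


def augment(corpus, n_edits=3):
    """Return the original sentences plus n_edits edited variants of each."""
    rng = random.Random(42)
    data = []
    for s in corpus:
        data.append(s)
        data.extend(edit_sentence(s, rng=rng) for _ in range(n_edits))
    return data
```

A learned editor would replace the random edit choices with a model trained to rank or sample meaning-preserving edits, which is precisely the knowledge the paper transfers from the source to the target domain.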


