Accelerated Reinforcement Learning for Sentence Generation by Vocabulary Prediction

09/05/2018
by Kazuma Hashimoto, et al.

A major obstacle in reinforcement learning-based sentence generation is the large action space, whose size equals the vocabulary size of the target language. To improve the efficiency of reinforcement learning, we present a novel approach that reduces the action space through dynamic vocabulary prediction. Our method first predicts a small, fixed-size vocabulary for each input and uses it to generate the target sentence. These input-specific vocabularies are then used in the supervised and reinforcement learning steps, and also at test time. In our experiments on six machine translation and two image captioning datasets, our method achieves faster reinforcement learning (∼2.7x faster) with much less GPU memory (∼10x less) than its full-vocabulary counterpart. Reinforcement learning with our method consistently leads to significant improvements in BLEU scores, and the scores are equal to or better than those of baselines using the full vocabularies, with faster decoding (∼3x faster) on CPUs.
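The idea of restricting the action space to an input-specific vocabulary can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not describe how the vocabulary predictor scores words, so `word_scores` simply stands in for some per-input relevance score over the full vocabulary, and the function names `predict_small_vocab`, `restricted_softmax`, and `sample_action` are hypothetical. The sketch only shows that, once K word ids are selected, the policy's softmax and sampling step run over K entries instead of the full vocabulary.

```python
import math
import random


def predict_small_vocab(word_scores, k):
    """Return the ids of the k highest-scoring target words (hypothetical predictor output)."""
    ranked = sorted(range(len(word_scores)), key=lambda i: word_scores[i], reverse=True)
    return ranked[:k]


def restricted_softmax(logits, vocab_ids):
    """Softmax computed only over the logits of the selected word ids."""
    selected = [logits[i] for i in vocab_ids]
    peak = max(selected)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in selected]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(logits, vocab_ids, rng):
    """Sample one word id from the reduced action space; return (word_id, probability)."""
    probs = restricted_softmax(logits, vocab_ids)
    local_index = rng.choices(range(len(vocab_ids)), weights=probs, k=1)[0]
    return vocab_ids[local_index], probs[local_index]


if __name__ == "__main__":
    # Toy full vocabulary of 5 words; keep only K = 2 for this input.
    scores = [0.1, 0.9, 0.3, 0.8, 0.05]   # assumed per-input word relevance scores
    logits = [2.0, 1.0, 0.0, 1.0, -1.0]   # assumed decoder logits at one time step
    small_vocab = predict_small_vocab(scores, k=2)
    word_id, prob = sample_action(logits, small_vocab, random.Random(0))
    print(small_vocab, word_id, prob)
```

In this toy setup the sampling distribution has two entries rather than five; with a realistic vocabulary the same restriction is what would shrink the per-step output computation, although the actual speed and memory savings reported above come from the authors' implementation, not from this sketch.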

