Learning When to Concentrate or Divert Attention: Self-Adaptive Attention Temperature for Neural Machine Translation

08/22/2018
by Junyang Lin, et al.

Most Neural Machine Translation (NMT) models are based on the sequence-to-sequence (Seq2Seq) encoder-decoder framework equipped with an attention mechanism. However, the conventional attention mechanism applies the same computation at every decoding time step, which is problematic because the softness of the attention distribution should differ across word types (e.g., content words versus function words). We therefore propose a new model with a mechanism called Self-Adaptive Control of Temperature (SACT), which controls the softness of attention through a learned attention temperature. Experimental results on Chinese-English and English-Vietnamese translation demonstrate that our model outperforms the baseline models, and our analysis and case study show that the model attends to the most relevant elements of the source-side context and generates high-quality translations.
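The core idea is a temperature-scaled softmax: at each decoding step the model predicts a temperature that sharpens the attention distribution when it should concentrate (e.g., on content words) and flattens it when it should divert (e.g., on function words). Below is a minimal PyTorch sketch of this idea; the module name TemperatureAttention, the linear predictor temp_proj, and the base lam are illustrative assumptions rather than the paper's exact parameterization.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemperatureAttention(nn.Module):
    """Dot-product attention with a per-step learned temperature
    (an illustrative sketch of SACT-style temperature control)."""

    def __init__(self, hidden_size: int, lam: float = 2.0):
        super().__init__()
        self.lam = lam                              # temperature base (hyperparameter, assumed)
        self.temp_proj = nn.Linear(hidden_size, 1)  # predicts the temperature exponent (assumed)

    def forward(self, query, keys, values, mask=None):
        # query: (batch, hidden); keys, values: (batch, src_len, hidden)
        scores = torch.bmm(keys, query.unsqueeze(2)).squeeze(2)  # (batch, src_len)
        # tanh bounds the exponent in (-1, 1), so tau lies in (1/lam, lam):
        # tau > 1 flattens (diverts) the distribution, tau < 1 sharpens (concentrates) it.
        tau = self.lam ** torch.tanh(self.temp_proj(query))      # (batch, 1)
        if mask is not None:
            scores = scores.masked_fill(~mask, float("-inf"))
        weights = F.softmax(scores / tau, dim=-1)                # temperature-scaled softmax
        context = torch.bmm(weights.unsqueeze(1), values).squeeze(1)  # (batch, hidden)
        return context, weights
```

Dividing the scores by tau > 1 softens the resulting distribution, while tau < 1 sharpens it, matching the concentrate-or-divert intuition described in the abstract.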


Related research

02/06/2018
Decoding-History-Based Adaptive Control of Attention for Neural Machine Translation
Attention-based sequence-to-sequence model has proved successful in Neur...

05/31/2017
Learning When to Attend for Neural Machine Translation
In the past few years, attention mechanisms have become an indispensable...

09/02/2018
Future-Prediction-Based Model for Neural Machine Translation
We propose a novel model for Neural Machine Translation (NMT). Different...

12/19/2016
An Empirical Study of Adequate Vision Span for Attention-Based Neural Machine Translation
Recently, the attention mechanism plays a key role to achieve high perfo...

11/10/2019
Modelling Bahdanau Attention using Election methods aided by Q-Learning
Neural Machine Translation has lately gained a lot of "attention" with t...

04/07/2020
Salience Estimation with Multi-Attention Learning for Abstractive Text Summarization
Attention mechanism plays a dominant role in the sequence generation mod...

07/17/2017
Towards Bidirectional Hierarchical Representations for Attention-Based Neural Machine Translation
This paper proposes a hierarchical attentional neural translation model ...
