Neural Machine Translation (NMT) models usually use large target vocabul...
Knowledge distillation describes a method for training a student network...
Attention-based Neural Machine Translation (NMT) models suffer from atte...
In this paper, we propose a novel finetuning algorithm for the recently...
In this paper, we enhance the attention-based neural machine translation...