Mask-Align: Self-Supervised Neural Word Alignment
Neural word alignment methods have received increasing attention recently. These methods usually extract word alignment from a machine translation model. However, there is a gap between translation and alignment tasks, since the target future context is available in the latter. In this paper, we propose Mask-Align, a self-supervised model specifically designed for the word alignment task. Our model parallelly masks and predicts each target token, and extracts high-quality alignments without any supervised loss. In addition, we introduce leaky attention to alleviate the problem of unexpected high attention weights on special tokens. Experiments on four language pairs show that our model significantly outperforms all existing unsupervised neural baselines and obtains new state-of-the-art results.
READ FULL TEXT