Implicit Memory Transformer for Computationally Efficient Simultaneous Speech Translation

07/03/2023
by Matthew Raffel, et al.

Simultaneous speech translation is an essential communication task, difficult even for humans, in which a translation is generated concurrently with incoming speech. For such a streaming task, transformers that use block processing to break an input sequence into segments have achieved state-of-the-art performance at reduced computational cost. Existing methods for propagating information across segments, including left context and memory banks, fall short: they are both insufficient representations and unnecessarily expensive to compute. In this paper, we propose an Implicit Memory Transformer that implicitly retains memory through a new left-context method, removing the need to explicitly represent memory with memory banks. We generate the left context from the attention output of the previous segment and include it in the keys and values of the current segment's attention calculation. Experiments on the MuST-C dataset show that the Implicit Memory Transformer provides a substantial speedup on the encoder forward pass with nearly identical translation quality compared to the state-of-the-art approach that employs both left context and memory banks.
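The core mechanism described in the abstract lends itself to a short sketch. Below is a minimal PyTorch illustration, not the authors' implementation, of segment-wise attention in which the previous segment's attention output is reused as left context for the current segment's keys and values; the class name, hyperparameters, and caching details are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class ImplicitMemoryAttention(nn.Module):
    """Minimal sketch of segment-wise attention with an implicit left context.

    The attention *output* of the previous segment is cached and prepended to
    the keys/values of the current segment, so past information propagates
    without an explicit memory bank. Names and hyperparameters here are
    illustrative assumptions, not the paper's actual code.
    """

    def __init__(self, d_model: int = 256, n_heads: int = 4, left_ctx: int = 16):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.left_ctx = left_ctx  # number of cached positions reused as context
        self.prev_out = None      # attention output of the previous segment

    def forward(self, segment: torch.Tensor) -> torch.Tensor:
        # segment: (batch, seg_len, d_model)
        if self.prev_out is None:
            keys_values = segment
        else:
            # The left context is taken from the *output* of the previous
            # segment's attention, rather than recomputed over past inputs.
            context = self.prev_out[:, -self.left_ctx:]
            keys_values = torch.cat([context, segment], dim=1)
        # Queries come only from the current segment; keys/values include context.
        out, _ = self.attn(query=segment, key=keys_values, value=keys_values)
        # Detached here for simplicity; whether gradients flow through the
        # cached context is a training design choice not specified above.
        self.prev_out = out.detach()
        return out

# Example: stream five segments through the layer.
layer = ImplicitMemoryAttention()
for seg in torch.randn(5, 2, 32, 256):  # 5 segments, batch 2, length 32
    out = layer(seg)                     # (2, 32, 256)
```

Caching the attention output rather than raw past inputs is what makes the memory "implicit" in this sketch: each segment's output already encodes information from earlier segments, so no separate memory-bank computation is needed.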


Related research

Streaming Simultaneous Speech Translation with Augmented Memory Transformer (10/30/2020)
Transformer-based models have achieved state-of-the-art performance on s...

Shiftable Context: Addressing Training-Inference Context Mismatch in Simultaneous Speech Translation (07/03/2023)
Transformer models using segment-based processing have been an effective...

Blockwise Streaming Transformer for Spoken Language Understanding and Simultaneous Speech Translation (04/19/2022)
Although Transformers have gained success in several speech processing t...

AtMan: Understanding Transformer Predictions Through Memory Efficient Attention Manipulation (01/19/2023)
Generative transformer models have become increasingly complex, with lar...

End-to-End Simultaneous Speech Translation with Differentiable Segmentation (05/25/2023)
End-to-end simultaneous speech translation (SimulST) outputs translation...

RedApt: An Adaptor for wav2vec 2 Encoding Faster and Smaller Speech Translation without Quality Compromise (10/16/2022)
Pre-trained speech Transformers in speech translation (ST) have facilita...

SimulLR: Simultaneous Lip Reading Transducer with Attention-Guided Adaptive Memory (08/31/2021)
Lip reading, aiming to recognize spoken sentences according to the given...
