From the perspective of the layer normalization (LN) position, the architect...
Ensembling is a popular method used to improve performance as a last res...
Position representation is crucial for building position-aware
represent...
The use of pretrained masked language models (MLMs) has drastically impr...
We propose a parameter sharing method for Transformers (Vaswani et al.,
...
We often use perturbations to regularize neural models. For neural
encod...
One critical issue of zero anaphora resolution (ZAR) is the scarcity of
...
Existing approaches for grammatical error correction (GEC) largely rely ...
This paper investigates how to effectively incorporate a pre-trained mas...
We present ESPnet-ST, which is designed for the quick development of
spe...
Constructive feedback is an effective method for improving critical thin...
The incorporation of pseudo data in the training of grammatical error
co...
The current success of deep neural networks (DNNs) in an increasingly br...
The encoder-decoder model is widely used in natural language generation
...