Shun Kiyono

research

∙ 06/01/2022

On Layer Normalizations and Residual Connections in Transformers

In the perspective of a layer normalization (LN) position, the architect...

21 Sho Takase, et al. ∙

research

∙ 05/24/2022

Diverse Lottery Tickets Boost Ensemble from a Single Pretrained Model

Ensembling is a popular method used to improve performance as a last res...

20 Sosuke Kobayashi, et al. ∙

research

∙ 09/13/2021

SHAPE: Shifted Absolute Position Embedding for Transformers

Position representation is crucial for building position-aware represent...

8 Shun Kiyono, et al. ∙

research

∙ 04/15/2021

Pseudo Zero Pronoun Resolution Improves Zero Anaphora Resolution

The use of pretrained masked language models (MLMs) has drastically impr...

0 Ryuto Konno, et al. ∙

research

∙ 04/13/2021

Lessons on Parameter Sharing across Layers in Transformers

We propose a parameter sharing method for Transformers (Vaswani et al., ...

0 Sho Takase, et al. ∙

research

∙ 04/05/2021

Rethinking Perturbations in Encoder-Decoders for Fast Training

We often use perturbations to regularize neural models. For neural encod...

0 Sho Takase, et al. ∙

research

∙ 11/02/2020

An Empirical Study of Contextual Data Augmentation for Japanese Zero Anaphora Resolution

One critical issue of zero anaphora resolution (ZAR) is the scarcity of ...

0 Ryuto Konno, et al. ∙

research

∙ 10/07/2020

A Self-Refinement Strategy for Noise Reduction in Grammatical Error Correction

Existing approaches for grammatical error correction (GEC) largely rely ...

0 Masato Mita, et al. ∙

research

∙ 05/03/2020

Encoder-Decoder Models Can Benefit from Pre-trained Masked Language Models in Grammatical Error Correction

This paper investigates how to effectively incorporate a pre-trained mas...

0 Masahiro Kaneko, et al. ∙

research

∙ 04/21/2020

ESPnet-ST: All-in-One Speech Translation Toolkit

We present ESPnet-ST, which is designed for the quick development of spe...

0 Hirofumi Inaguma, et al. ∙

research

∙ 10/08/2019

Riposte! A Large Corpus of Counter-Arguments

Constructive feedback is an effective method for improving critical thin...

0 Paul Reisert, et al. ∙

research

∙ 09/02/2019

An Empirical Study of Incorporating Pseudo Data into Grammatical Error Correction

The incorporation of pseudo data in the training of grammatical error co...

0 Shun Kiyono, et al. ∙

research

∙ 10/13/2018

Mixture of Expert/Imitator Networks: Scalable Semi-supervised Learning Framework

The current success of deep neural networks (DNNs) in an increasingly br...

0 Shun Kiyono, et al. ∙

research

∙ 12/22/2017

Source-side Prediction for Neural Headline Generation

The encoder-decoder model is widely used in natural language generation ...

0 Shun Kiyono, et al. ∙


