Despite the power of Large Language Models (LLMs) like GPT-4, they still...
Despite achieving remarkable performance on various vision-language task...
The fixed-size context of Transformer makes GPT models incapable of gene...
Interactive Natural Language Processing (iNLP) has emerged as a novel pa...
The primary way of building AI applications is shifting from training sp...
Large language models generate fluent texts and can follow natural langu...
Machine translation quality estimation (QE) predicts human judgements of...
Vision language pre-training aims to learn alignments between vision and...
Intermediate-task transfer can benefit a wide range of NLP tasks with pr...
Pre-trained vision-language models (VLMs) have achieved impressive resul...
With the success of vision-language pre-training, we have witnessed the ...
In this paper, we introduce Cross-View Language Modeling, a simple and e...
Recent advances in vision-language pre-training (VLP) have demonstrated ...
How do masked language models (MLMs) such as BERT learn contextual repre...
Personalizing dialogue agents is important for dialogue systems to gener...
In recent years, larger and deeper models are springing up and continuou...
Recent studies on compression of pretrained language models (e.g., BERT)...
We present Meta Learning for Knowledge Distillation (MetaDistil), a simp...
In this paper, we propose Inverse Adversarial Training (IAT) algorithm f...
Cant is important for understanding advertising, comedies and dog-whistl...
In this paper, we generalize text infilling (e.g., masked language model...
Fact verification models have enjoyed a fast advancement in the last two...
In this paper, we propose Patience-based Early Exit, a straightforward y...
In this paper, we introduce DropHead, a structured dropout method specif...
Automated evaluation of open domain natural language generation (NLG) mo...
In this paper, we propose a novel model compression approach to effectiv...
Local sequence transduction (LST) tasks are sequence transduction tasks ...
Conventional Generative Adversarial Networks (GANs) for text generation ...
We propose a novel data synthesis method to generate diverse error-corre...