Quantity doesn't buy quality syntax with neural language models

08/31/2019
by Marten van Schijndel, et al.

Recurrent neural networks can learn to predict upcoming words remarkably well on average; in syntactically complex contexts, however, they often assign unexpectedly high probabilities to ungrammatical words. We investigate to what extent these shortcomings can be mitigated by increasing the size of the network and the corpus on which it is trained. We find that gains from increasing network size are minimal beyond a certain point. Likewise, expanding the training corpus yields diminishing returns; we estimate that the training corpus would need to be unrealistically large for the models to match human performance. A comparison to GPT and BERT, Transformer-based models trained on billions of words, reveals that these models perform even more poorly than our LSTMs in some constructions. Our results make the case for more data-efficient architectures.
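
To make the kind of evaluation the abstract describes concrete, below is a minimal, hypothetical sketch of a targeted syntactic test: comparing the probability a language model assigns to a grammatical versus an ungrammatical verb after a syntactically complex prefix with an intervening "attractor" noun. This is not the authors' code; it assumes the HuggingFace transformers library, uses GPT-2 as a stand-in model, and the agreement item is an illustrative example only.

```python
# Minimal sketch of a targeted syntactic evaluation (not the paper's code):
# compare the probability a language model assigns to a grammatical vs. an
# ungrammatical continuation after an agreement-attractor prefix.
# Assumes the HuggingFace `transformers` library and GPT-2 as a stand-in LM.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def continuation_logprob(prefix: str, continuation: str) -> float:
    """Sum of log-probabilities the model assigns to `continuation` given `prefix`."""
    prefix_ids = tokenizer(prefix, return_tensors="pt").input_ids
    cont_ids = tokenizer(continuation, return_tensors="pt").input_ids
    input_ids = torch.cat([prefix_ids, cont_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Log-probability of each token conditioned on all preceding tokens.
    log_probs = torch.log_softmax(logits[:, :-1, :], dim=-1)
    targets = input_ids[:, 1:]
    token_logprobs = log_probs.gather(2, targets.unsqueeze(-1)).squeeze(-1)
    # Keep only the continuation tokens' log-probabilities.
    return token_logprobs[0, prefix_ids.size(1) - 1:].sum().item()

# Subject-verb agreement item with an intervening attractor noun phrase.
prefix = "The keys to the cabinet"
grammatical = " are on the table."
ungrammatical = " is on the table."

good = continuation_logprob(prefix, grammatical)
bad = continuation_logprob(prefix, ungrammatical)
print(f"log P(grammatical)   = {good:.2f}")
print(f"log P(ungrammatical) = {bad:.2f}")
print("model prefers grammatical form" if good > bad else "agreement error preferred")
```

Under this setup, a model with robust knowledge of agreement should assign the grammatical continuation the higher log-probability; the abstract's claim is that in complex constructions, neither larger LSTMs and corpora nor billion-word Transformer models make this reliably the case.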


