Word Acquisition in Neural Language Models

by Tyler A. Chang et al.

We investigate how neural language models acquire individual words during training, extracting learning curves and ages of acquisition for over 600 words on the MacArthur-Bates Communicative Development Inventory (Fenson et al., 2007). Drawing on studies of word acquisition in children, we evaluate multiple predictors of words' ages of acquisition in LSTMs, BERT, and GPT-2. We find that the effects of concreteness, word length, and lexical class are pointedly different in children and language models, reinforcing the importance of interaction and sensorimotor experience in child language acquisition. Language models rely far more on word frequency than children do, but like children, they exhibit slower learning of words in longer utterances. Interestingly, training patterns are consistent across unidirectional and bidirectional models, and across LSTM and Transformer architectures. Models predict based on unigram token frequencies early in training, then loosely approximate bigram probabilities, before eventually converging on more nuanced predictions. These results shed light on the role of distributional learning mechanisms in children, while also providing insights for more human-like language acquisition in language models.
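The age-of-acquisition extraction described above can be sketched minimally. The sketch below assumes a word's surprisal has been measured at a series of training checkpoints; where the paper fits sigmoid curves and takes the midpoint, this simplification finds the first checkpoint (interpolated in log-steps) where surprisal crosses the midpoint between its worst and best values. The function name and threshold rule are illustrative assumptions, not the paper's exact procedure.

```python
import math

def age_of_acquisition(steps, surprisals):
    """Estimate a word's age of acquisition from its learning curve.

    steps: training steps at which the word's surprisal was measured.
    surprisals: the word's mean surprisal at each of those checkpoints.

    Returns the (log-interpolated) training step at which surprisal first
    drops below the midpoint between its maximum and minimum values, or
    None if the curve never crosses that threshold.
    """
    threshold = (max(surprisals) + min(surprisals)) / 2.0
    points = list(zip(steps, surprisals))
    for (s0, y0), (s1, y1) in zip(points, points[1:]):
        if y0 >= threshold > y1:  # first downward crossing
            # Interpolate linearly in log10(steps), since learning curves
            # are conventionally plotted against log training time.
            frac = (y0 - threshold) / (y0 - y1)
            log_aoa = math.log10(s0) + frac * (math.log10(s1) - math.log10(s0))
            return 10 ** log_aoa
    return None
```

Applied to each of the 600+ CDI words, this yields one age of acquisition per word, which can then be regressed on predictors such as frequency, concreteness, and utterance length.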
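The unigram-to-bigram progression can likewise be probed with a simple divergence measure. A minimal sketch, assuming access to the model's predictive distribution at each checkpoint: build maximum-likelihood unigram (and, analogously, bigram) reference distributions from the training corpus, then track the KL divergence from the model's predictions to each reference over training. A divergence that is lowest to the unigram distribution early and to the bigram distribution later would reproduce the reported progression. The helper names here are illustrative, not the paper's code.

```python
import math
from collections import Counter

def kl_divergence(p, q):
    """KL(p || q) for distributions over a shared vocabulary (dicts token -> prob)."""
    return sum(p[t] * math.log(p[t] / q[t]) for t in p if p[t] > 0)

def unigram_dist(tokens):
    """Maximum-likelihood unigram distribution from a list of corpus tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}
```

At each checkpoint, averaging `kl_divergence(model_dist, unigram)` and `kl_divergence(model_dist, bigram)` over evaluation contexts gives two curves whose crossover marks the transition between the unigram and bigram stages.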
