Contrastive Entropy: A new evaluation metric for unnormalized language models

01/03/2016
by Kushal Arora, et al.

Perplexity (per word) is the most widely used metric for evaluating language models. Despite this, there has been no dearth of criticism of the metric. Most of these criticisms center on its lack of correlation with extrinsic metrics like word error rate (WER), its dependence on a shared vocabulary for model comparison, and its unsuitability for evaluating unnormalized language models. In this paper, we address the last problem and propose a new discriminative, entropy-based intrinsic metric that works both for traditional word-level models and for unnormalized language models such as sentence-level models. We also propose a discriminatively trained, sentence-level interpretation of the recurrent neural network based language model (RNN) as an example of an unnormalized sentence-level model. We demonstrate that, for word-level models, contrastive entropy shows a strong correlation with perplexity. We also observe that, when trained at lower distortion levels, the sentence-level RNN considerably outperforms traditional RNNs on this new metric.
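For intuition, below is a minimal sketch of how a contrastive-entropy-style score could be computed: measure the model's per-word entropy on the original test sentences and on copies distorted at a given level, then report the gap between the two. The `model.logprob` interface, the random-substitution distortion, and the per-word bit normalization are illustrative assumptions for this sketch, not the paper's exact definition.

```python
import math
import random


def per_word_entropy(model, sentences):
    """Per-word entropy (bits) of a set of sentences under `model`.

    Assumes a hypothetical interface `model.logprob(word, context)` that
    returns a natural-log score; the score need not come from a normalized
    distribution, which is the setting contrastive entropy targets.
    """
    total_logprob = 0.0
    total_words = 0
    for sent in sentences:
        for i, word in enumerate(sent):
            total_logprob += model.logprob(word, sent[:i])
            total_words += 1
    return -total_logprob / (total_words * math.log(2))


def distort(sentence, vocab, level, rng):
    """Replace roughly a fraction `level` of words with random vocabulary items."""
    return [rng.choice(vocab) if rng.random() < level else w for w in sentence]


def contrastive_entropy(model, test_set, vocab, level=0.3, seed=0):
    """Gap between per-word entropy on distorted and on original sentences.

    A model that discriminates well assigns much higher entropy to the
    distorted sentences, so a larger gap indicates a better model.
    """
    rng = random.Random(seed)
    distorted = [distort(s, vocab, level, rng) for s in test_set]
    return per_word_entropy(model, distorted) - per_word_entropy(model, test_set)
```

Because only the relative scores of original versus distorted sentences are compared, the model's outputs need not form a normalized probability distribution, which is what makes such a metric applicable to unnormalized sentence-level models as well as traditional word-level ones.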

Related research

04/01/2016
A Compositional Approach to Language Modeling
Traditional language models treat language as a finite state automaton o...

03/14/2021
Learning a Word-Level Language Model with Sentence-Level Noise Contrastive Estimation for Contextual Sentence Probability Estimation
Inferring the probability distribution of sentences or word sequences is...

04/23/2018
Spell Once, Summon Anywhere: A Two-Level Open-Vocabulary Language Model
We show how to deploy recurrent neural networks within a hierarchical Ba...

07/05/2015
Dependency Recurrent Neural Language Models for Sentence Completion
Recent work on language modelling has shifted focus from count-based mod...

11/26/2020
Unigram-Normalized Perplexity as a Language Model Performance Measure with Different Vocabulary Sizes
Although Perplexity is a widely used performance metric for language mod...

05/08/2023
ANALOGICAL - A New Benchmark for Analogy of Long Text for Large Language Models
Over the past decade, analogies, in the form of word-level analogies, ha...

02/02/2015
Scaling Recurrent Neural Network Language Models
This paper investigates the scaling properties of Recurrent Neural Netwo...
