"Average" Approximates "First Principal Component"? An Empirical Analysis on Representations from Neural Language Models

04/18/2021
by Zihan Wang et al.

Contextualized representations based on neural language models have furthered the state of the art in various NLP tasks. Despite their great success, the nature of such representations remains a mystery. In this paper, we present an empirical property of these representations: "average" approximates "first principal component". Specifically, experiments show that the average of these representations shares almost the same direction as the first principal component of the matrix whose columns are these representations. We believe this explains why the average representation is always a simple yet strong baseline. Our further examinations show that this property also holds in more challenging scenarios, for example, when the representations are from a model right after its random initialization. Therefore, we conjecture that this property is intrinsic to the distribution of representations and not necessarily related to the input structure. We observe that these representations empirically follow a normal distribution in each dimension, and by assuming this holds, we show that the empirical property can in fact be derived mathematically.
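The claimed property is easy to check numerically. The sketch below (not the authors' code; the embedding dimension, sample count, and noise scale are illustrative assumptions) draws column vectors that are per-dimension normal with a shared nonzero mean, as the abstract's normality assumption suggests, and compares the average vector with the first principal component of the uncentered matrix:

```python
# Empirical sketch: for a matrix whose columns are i.i.d. Gaussian
# "representations" with a nonzero mean, the average of the columns and the
# first principal component point in nearly the same direction.
import numpy as np

rng = np.random.default_rng(0)
dim, n = 768, 2000  # hypothetical embedding dimension and sample count

# Each column is the shared mean vector plus per-dimension Gaussian noise.
mean = rng.normal(0.0, 1.0, size=dim)
X = mean[:, None] + rng.normal(0.0, 0.5, size=(dim, n))

avg = X.mean(axis=1)  # the "average" representation

# First principal component of the (uncentered) matrix = top left singular vector.
u, _, _ = np.linalg.svd(X, full_matrices=False)
pc1 = u[:, 0]  # unit-norm

# Cosine similarity between the two directions (sign of pc1 is arbitrary).
cos = abs(avg @ pc1) / np.linalg.norm(avg)
print(round(cos, 4))
```

With these settings the cosine comes out very close to 1, matching the paper's observation; shrinking the mean relative to the noise weakens the alignment, which is consistent with the property depending on the distribution of the representations rather than on any input structure.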


Related research

12/09/2022 - A note on the prediction error of principal component regression in high dimensions
We analyze the prediction error of principal component regression (PCR) ...

03/08/2023 - Principal Component Analysis of Two-dimensional Functional Data with Serial Correlation
In this paper, we propose a novel model to analyze serially correlated t...

10/22/2012 - Initialization of Self-Organizing Maps: Principal Components Versus Random Initialization. A Case Study
The performance of the Self-Organizing Map (SOM) algorithm is dependent ...

12/20/2013 - The Sparse Principal Component of a Constant-rank Matrix
The computation of the sparse principal component of a matrix is equival...

06/09/2021 - Low-Dimensional Structure in the Space of Language Representations is Reflected in Brain Responses
How related are the representations learned by neural language models, t...

08/23/2022 - Neural PCA for Flow-Based Representation Learning
Of particular interest is to discover useful representations solely from...

09/14/2023 - The Dynamical Principles of Storytelling
When considering the opening part of 1800 short stories, we find that th...
