On Orderings of Probability Vectors and Unsupervised Performance Estimation

06/16/2023
by Muhammad Maaz, et al.

Unsupervised performance estimation, that is, evaluating how well models perform on unlabeled data, is a difficult task. Recently, Garg et al. [2022] proposed a method that performs much better than previous methods. Their method relies on a score function, satisfying certain properties, that maps the probability vectors output by the classifier to the reals, but which score function is best remains an open problem. We explore this problem by first showing that their method fundamentally relies on the ordering induced by this score function. Thus, under monotone transformations of the score function, their method yields the same estimate. Next, we show that in the binary classification setting, nearly all common score functions (the L^∞ norm; the L^2 norm; negative entropy; and the L^2, L^1, and Jensen-Shannon distances to the uniform vector) induce the same ordering over probability vectors. However, this does not hold in higher-dimensional settings. We conduct numerous experiments on well-known NLP data sets and rigorously explore the performance of different score functions. We conclude that the L^∞ norm is the most appropriate.
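
The ordering claims in the abstract can be checked numerically on a few hand-picked vectors. The following is a minimal sketch, not the authors' code: the function names, the sample probability vectors, and the `ranking` helper are all illustrative assumptions, and the Jensen-Shannon distance is omitted for brevity. It ranks a handful of binary probability vectors under several score functions, applies an increasing transformation to one score, and then compares two three-class vectors on which the L^∞ and L^2 norms disagree.

```python
import math

# Hypothetical helper names; each maps a probability vector to a real score.
def linf(p):
    return max(p)

def l2(p):
    return math.sqrt(sum(x * x for x in p))

def neg_entropy(p):
    # Convention: 0 * log(0) = 0, so zero entries are skipped.
    return sum(x * math.log(x) for x in p if x > 0)

def l1_to_uniform(p):
    k = len(p)
    return sum(abs(x - 1 / k) for x in p)

def l2_to_uniform(p):
    k = len(p)
    return math.sqrt(sum((x - 1 / k) ** 2 for x in p))

SCORES = {
    "linf": linf,
    "l2": l2,
    "neg_entropy": neg_entropy,
    "l1_to_uniform": l1_to_uniform,
    "l2_to_uniform": l2_to_uniform,
}

def ranking(score, vectors):
    """Indices of `vectors` sorted from lowest to highest score."""
    return sorted(range(len(vectors)), key=lambda i: score(vectors[i]))

# Binary case: illustrative vectors (q, 1 - q), listed in no particular order.
binary = [(q, 1 - q) for q in (0.9, 0.55, 0.7, 0.5, 0.99, 0.6)]
binary_rankings = {name: ranking(f, binary) for name, f in SCORES.items()}
same_binary_order = len({tuple(r) for r in binary_rankings.values()}) == 1

# A strictly increasing transformation of a score leaves the ordering unchanged.
monotone_rank = ranking(lambda p: math.exp(3 * linf(p)), binary)

# Three classes: L^inf ranks q3 above p3, while L^2 ranks p3 above q3.
p3 = (0.5, 0.5, 0.0)
q3 = (0.6, 0.2, 0.2)
disagree = linf(p3) < linf(q3) and l2(p3) > l2(q3)

print("binary rankings agree:", same_binary_order)
print("monotone transform keeps ordering:", monotone_rank == binary_rankings["linf"])
print("L^inf and L^2 disagree for 3 classes:", disagree)
```

This only illustrates the ordering statements on a few points; it does not implement the estimation method of Garg et al. [2022] itself.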

research
05/10/2019

Optimal rates for F-score binary classification

We study the minimax settings of binary classification with F-score unde...
research
09/30/2019

Tutorial on Implied Posterior Probability for SVMs

Implied posterior probability of a given model (say, Support Vector Mach...
research
03/01/2018

Re-examination of Bregman functions and new properties of their divergences

The Bregman divergence (Bregman distance, Bregman measure of distance) i...
research
12/17/2018

Information theoretical clustering is hard to approximate

An impurity measures I: R^d → R^+ is a function that assigns a d-dimension...
research
11/14/2018

SCORE+ for Network Community Detection

SCORE is a recent approach to network community detection proposed by Ji...
research
12/10/2021

Boosting Active Learning via Improving Test Performance

Central to active learning (AL) is what data should be selected for anno...
research
09/16/2022

Joint estimation of posterior probability and propensity score function for positive and unlabelled data

Positive and unlabelled learning is an important problem which arises na...
