Pareto optimal proxy metrics

by Lee Richardson, et al.

North star metrics and online experimentation play a central role in how technology companies improve their products. In many practical settings, however, evaluating experiments directly on the north star metric can be difficult. The two most significant issues are (1) the low sensitivity of the north star metric and (2) differences between its short-term and long-term response to a change. A common solution is to rely on proxy metrics rather than the north star in experiment evaluation and launch decisions. Existing literature on proxy metrics concentrates mainly on estimating the long-term impact from short-term experimental data. In this paper, we focus instead on the trade-off between estimating the long-term impact and short-term sensitivity. In particular, we propose the Pareto optimal proxy metrics method, which simultaneously optimizes prediction accuracy and sensitivity. In addition, we give an efficient multi-objective optimization algorithm that outperforms standard methods. We applied our methodology to experiments from a large industrial recommendation system and found proxy metrics that are eight times more sensitive than the north star and consistently move in the same direction as it, increasing both the velocity and the quality of decisions to launch new features.
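The notion of Pareto optimality used above can be illustrated with a small sketch. The abstract does not describe the authors' optimization algorithm, so the code below is only a generic non-dominated filter, not their method. It assumes each candidate proxy metric has already been scored on two objectives, both to be maximized: prediction accuracy and sensitivity. The function name `pareto_front` and the example scores are hypothetical.

```python
from typing import List, Sequence, Tuple


def pareto_front(points: Sequence[Tuple[float, float]]) -> List[int]:
    """Return indices of non-dominated points (both objectives maximized).

    A point is dominated if some other point is at least as good on both
    objectives and strictly better on at least one.
    """
    front = []
    for i, (acc_i, sens_i) in enumerate(points):
        dominated = any(
            acc_j >= acc_i and sens_j >= sens_i
            and (acc_j > acc_i or sens_j > sens_i)
            for j, (acc_j, sens_j) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


# Hypothetical (prediction accuracy, sensitivity) scores for four candidate
# proxy metrics; these numbers are illustrative, not from the paper.
candidates = [(0.9, 0.2), (0.7, 0.6), (0.6, 0.5), (0.4, 0.9)]
front = pareto_front(candidates)
print(front)
```

In this toy example the third candidate is dropped because the second beats it on both objectives. The remaining candidates each represent a different trade-off between accuracy and sensitivity, and choosing among them is a separate decision.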


∙ 09/14/2023

Choosing a Proxy Metric from Past Experiments

In many randomized experiments, the treatment effect of the long-term me...
∙ 06/13/2019

Early Detection of Long Term Evaluation Criteria in Online Controlled Experiments

A common dilemma encountered by many upon implementing an optimization m...
∙ 11/30/2016

Unit Commitment using Nearest Neighbor as a Short-Term Proxy

We devise the Unit Commitment Nearest Neighbor (UCNN) algorithm to be us...
∙ 10/29/2020

Targeting for long-term outcomes

Decision-makers often want to target interventions (e.g., marketing camp...
∙ 07/10/2023

Ranking with Long-Term Constraints

The feedback that users provide through their choices (e.g., clicks, pur...
∙ 09/12/2023

Pump, Dump, and then What? The Long-Term Impact of Cryptocurrency Pump-and-Dump Schemes

The pump and dump scheme is a form of market manipulation attack in whic...
∙ 06/02/2021

Online Experimentation with Surrogate Metrics: Guidelines and a Case Study

A/B tests have been widely adopted across industries as the golden rule ...
