Statistical Learning with Sublinear Regret of Propagator Models

01/12/2023
by   Eyal Neuman, et al.

We consider a class of learning problems in which an agent liquidates a risky asset while creating both transient price impact, driven by an unknown convolution propagator, and linear temporary price impact with an unknown parameter. We formulate the trader's objective as the maximization of a revenue-risk functional, in which the trader also exploits available information on a price-predicting signal. We present a trading algorithm that alternates between exploration and exploitation phases and achieves sublinear regret with high probability. For the exploration phase, we propose a novel approach for non-parametric estimation of the price impact kernel by observing only the visible price process, and we derive sharp bounds on the convergence rate, which are characterized by the singularity of the propagator. These kernel estimation methods extend existing methods from the area of Tikhonov regularization for inverse problems and are of independent interest. The bound on the regret in the exploitation phase is obtained by deriving stability results for the optimizer and value function of the associated class of infinite-dimensional stochastic control problems. As a complementary result, we propose a regression-based algorithm to estimate the conditional expectation of non-Markovian signals and derive its convergence rate.
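
To make the kernel-estimation step more concrete, here is a minimal sketch of Tikhonov-regularized estimation of a convolution kernel on a time grid. It is not the estimator from the paper: it is a generic ridge-type least-squares fit, and the function names, the grid, the constant trading rate, the exponential test kernel, the noise level and the value of `lam` are all assumptions made for illustration.

```python
import numpy as np


def convolution_matrix(u, dt):
    """Lower-triangular Toeplitz matrix U with U[i, j] = u[i - j] * dt,
    so that (U @ g)[i] approximates the transient impact at time i."""
    n = len(u)
    U = np.zeros((n, n))
    for i in range(n):
        # u[i::-1] is u[i], u[i-1], ..., u[0]
        U[i, : i + 1] = u[i::-1]
    return U * dt


def estimate_kernel(u, y, lam, dt):
    """Tikhonov (ridge) estimate: argmin_g ||U g - y||^2 + lam * ||g||^2."""
    U = convolution_matrix(u, dt)
    n = len(u)
    # Normal equations of the regularized least-squares problem.
    return np.linalg.solve(U.T @ U + lam * np.eye(n), U.T @ y)


# Hypothetical demo: constant trading rate, exponential kernel, noisy prices.
n, dt = 50, 0.02
t = np.arange(n) * dt
g_true = np.exp(-t)                      # assumed kernel, for illustration only
u = np.ones(n)                           # assumed trading rate
rng = np.random.default_rng(0)
y = convolution_matrix(u, dt) @ g_true + 0.001 * rng.standard_normal(n)

g_hat = estimate_kernel(u, y, lam=1e-3, dt=dt)
print("max abs error:", np.max(np.abs(g_hat - g_true)))
```

The regularization weight `lam` trades bias against noise amplification: deconvolution is ill-posed, so a very small `lam` fits the noise, while a large one shrinks the estimate toward zero.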

research
09/06/2023

An Offline Learning Approach to Propagator Models

We consider an offline learning problem for an agent who first estimates...
research
04/18/2022

On Parametric Optimal Execution and Machine Learning Surrogates

We investigate optimal execution problems with instantaneous price impac...
research
04/12/2020

Regret Bounds for Kernel-Based Reinforcement Learning

We consider the exploration-exploitation dilemma in finite-horizon reinf...
research
03/22/2020

Optimal No-regret Learning in Repeated First-price Auctions

We study online learning in repeated first-price auctions with censored ...
research
11/03/2022

Phase Transitions in Learning and Earning under Price Protection Guarantee

Motivated by the prevalence of “price protection guarantee”, which allow...
research
03/04/2020

Exploration-Exploitation in Constrained MDPs

In many sequential decision-making problems, the goal is to optimize a u...
research
06/26/2011

Decision-Theoretic Bidding Based on Learned Density Models in Simultaneous, Interacting Auctions

Auctions are becoming an increasingly popular method for transacting bus...
