Near Optimal Memory-Regret Tradeoff for Online Learning

03/03/2023
by Binghui Peng, et al.

In the experts problem, on each of T days, an agent needs to follow the advice of one of n "experts". After each day, the loss associated with each expert's advice is revealed. A fundamental result in learning theory says that the agent can achieve vanishing regret, i.e., its cumulative loss is within o(T) of the cumulative loss of the best-in-hindsight expert. Can the agent perform well without sufficient space to remember all the experts? We extend a nascent line of research on this question in two directions:

∙ We give a new algorithm against the oblivious adversary, improving over the memory-regret tradeoff obtained by [PZ23] and nearly matching the lower bound of [SWXZ22].

∙ We also consider an adaptive adversary, who can observe the past experts chosen by the agent. In this setting we give both a new algorithm and a novel lower bound, proving that roughly √n memory is both necessary and sufficient for obtaining o(T) regret.
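For context, here is a minimal sketch of the classical full-memory baseline for the experts problem: the multiplicative-weights (Hedge) algorithm, which keeps one weight per expert (Θ(n) memory) and achieves O(√(T log n)) regret. This is the benchmark that low-memory algorithms like the paper's trade off against; it is not the paper's algorithm. The function name, learning-rate choice, and random-loss example below are illustrative assumptions.

```python
import numpy as np

def hedge(losses, eta):
    """Multiplicative-weights (Hedge) baseline; uses Theta(n) memory.

    losses: (T, n) array; losses[t, i] is expert i's loss on day t, in [0, 1].
    eta:    learning rate; eta = sqrt(2 * log(n) / T) yields O(sqrt(T log n)) regret.
    Returns the algorithm's total expected loss over T days.
    """
    T, n = losses.shape
    weights = np.ones(n)                      # one weight per expert: the O(n) memory
    total_loss = 0.0
    for t in range(T):
        probs = weights / weights.sum()       # follow an expert with prob. proportional to its weight
        total_loss += probs @ losses[t]       # expected loss of the randomized choice
        weights *= np.exp(-eta * losses[t])   # exponentially downweight experts that did badly
    return total_loss

# Illustrative run: T = 10000 days, n = 50 experts, i.i.d. uniform losses.
rng = np.random.default_rng(0)
T, n = 10_000, 50
losses = rng.random((T, n))
eta = np.sqrt(2 * np.log(n) / T)
alg_loss = hedge(losses, eta)
best_loss = losses.sum(axis=0).min()          # cumulative loss of the best-in-hindsight expert
print(f"regret = {alg_loss - best_loss:.1f}  (o(T) for T = {T})")
```

The memory question the paper studies is what happens when the agent cannot afford the `weights` vector above, i.e., when only o(n) words of state may be carried between days.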
