Zhihan Xiong


Featured Co-authors

Simon S. Du: 95 publications
Kevin Jamieson: 50 publications
Maryam Fazel: 32 publications
Lalit Jain: 20 publications
Ruoqi Shen: 16 publications
Qiwen Cui: 11 publications
Tian Tan: 9 publications
Romain Camilleri: 5 publications
Haozhe Jiang: 5 publications
Vikranth R. Dwaracherla: 2 publications

research ∙ 07/27/2023

A/B Testing and Best-arm Identification for Linear Bandits with Robustness to Non-stationarity

We investigate the fixed-budget best-arm identification (BAI) problem fo...

Zhihan Xiong, et al.

research ∙ 06/12/2023

A Black-box Approach for Non-stationary Multi-agent Reinforcement Learning

We investigate learning the equilibria in non-stationary multi-agent sys...

Haozhe Jiang, et al.

research ∙ 10/24/2022

Offline congestion games: How feedback type affects data coverage requirement

This paper investigates when one can efficiently recover an approximate ...

Haozhe Jiang, et al.

research ∙ 06/04/2022

Learning in Congestion Games with Bandit Feedback

Learning Nash equilibria is a central problem in multi-agent systems. In...

Qiwen Cui, et al.

research ∙ 10/28/2021

Selective Sampling for Online Best-arm Identification

This work considers the problem of selective-sampling for best-arm ident...

Romain Camilleri, et al.

research ∙ 02/19/2021

Randomized Exploration is Near-Optimal for Tabular MDP

We study exploration using randomized value functions in Thompson Sampli...

Zhihan Xiong, et al.

research ∙ 12/23/2019

Parameterized Indexed Value Function for Efficient Exploration in Reinforcement Learning

It is well known that quantifying uncertainty in the action-value estima...

Tian Tan, et al.
