Many real-world offline reinforcement learning (RL) problems involve con...
We study the problem of adaptively identifying patient subpopulations th...
Modeling the preferences of agents over a set of alternatives is a princ...
Understanding an agent's priorities by observing their behavior is criti...
Understanding decision-making in clinical environments is of paramount i...
We consider a multiobjective multiarmed bandit problem with lexicographi...
Influence maximization, item recommendation, adaptive routing and dynami...
We analyze the regret of combinatorial Thompson sampling (CTS) for the c...