We study how to learn ϵ-optimal strategies in zero-sum imperfect
informa...
We consider the problem of online allocation subject to a long-term fair...
In online advertisement, ad campaigns are sequentially displayed to user...
Many machine learning problems require performing dataset valuation, i.e...
Consider a hiring process with candidates coming from different universi...
Imperfect information games (IIG) are games in which each player only
pa...
Due mostly to its application to cognitive radio networks, multiplayer
b...
In this paper we discuss an application of Stochastic Approximation to
s...
The workhorse of machine learning is stochastic gradient descent. To acc...
We describe an efficient algorithm to compute solutions for the general
...
Contextual bandit algorithms are widely used in domains where it is desi...
We consider the problem of online linear regression in the stochastic
se...
In the fixed budget thresholding bandit problem, an algorithm sequential...
Finding an optimal matching in a weighted graph is a standard combinator...
Motivated by sequential budgeted allocation problems, we investigate onl...
We introduce a new procedure to neuralize unsupervised Hidden Markov Mod...
The objective of offline RL is to learn optimal policies when a fixed
ex...
Motivated by packet routing in computer networks, online queuing systems...
The global objective of inverse Reinforcement Learning (IRL) is to esti...
Contextual bandit is a general framework for online learning in sequenti...
We study online learning for optimal allocation when the resource to be
...
The Greedy algorithm is the simplest heuristic in sequential decision pr...
Continuously learning and leveraging the knowledge accumulated from prio...
Auction theory historically focused on the question of designing the bes...
We consider the stochastic block model where connection between vertices...
Reinforcement learning algorithms are widely used in domains where it is...
Potential buyers of a product or service tend to first browse feedback f...
We investigate stochastic combinatorial multi-armed bandit with semi-ban...
We introduce a new stochastic multi-armed bandit setting where arms are
...
Motivated by cognitive radios, stochastic multi-player multi-armed bandi...
We introduce a new numerical framework to learn optimal bidding strategi...
Studies on massive open online courses (MOOCs) users discuss the existen...
We consider the problem of active linear regression where a decision mak...
We consider the practical and classical setting where the seller is usin...
We study a setting in which a learner faces a sequence of A/B tests and ...
Private data are valuable either by remaining private (for instance if t...
We consider the problem of the optimization of bidding strategies in
pri...
We consider a sequential stochastic resource allocation problem under th...
We improve the efficiency of algorithms for stochastic combinatorial
sem...
Studying the continuous-time counterpart of some discrete-time dynamics is n...
We consider the stochastic contextual bandit problem with additional
reg...
State-of-the-art online learning procedures focus either on selecting th...
We consider the stochastic multiplayer multi-armed bandit problem, where...
Second price auctions with reserve price are widely used by the main Int...
We consider the classical stochastic multi-armed bandit but where, from ...
Motivated by posted price auctions where buyers are grouped in an unknow...
We consider the problem where an agent wants to find a hidden object tha...
With the increasing use of auctions in online advertising, there has bee...
We provide a comparative study of several widely used off-policy estimat...
We consider the problem of bandit optimization, inspired by stochastic
o...