In this work, we present a scalable reinforcement learning method for tr...
Large transformer models trained on diverse datasets have shown a remark...
In recent years, domains such as natural language processing and image r...
Animals have evolved various agile locomotion strategies, such as sprint...
The progress of autonomous web navigation has been hindered by the depen...
Foundation models pretrained on diverse data at scale have demonstrated ...
A goal of artificial intelligence is to construct an agent that can solv...
By transferring knowledge from large, diverse, task-agnostic datasets, m...
Using massive datasets to train large-scale models has emerged as a domi...
Model-based reinforcement learning (RL) methods are appealing in the off...
In offline reinforcement learning (RL), a learner leverages prior logged...
Future- or return-conditioned supervised learning is an emerging paradig...
Large language models (LLMs) have shown exceptional performance on a var...
Evolution Strategy (ES) algorithms have shown promising results in train...
Classical theory in reinforcement learning (RL) predominantly focuses on...
Despite recent advancements in language models (LMs), their application ...
A longstanding goal of the field of AI is a strategy for compiling diver...
Motivated by the success of ensembles for uncertainty estimation in supe...
Imitation learning aims to extract high-performance policies from logged...
In this work, we study the use of the Bellman equation as a surrogate ob...
We study the problem of model selection in batch policy optimization: gi...
Reinforcement learning (RL) agents are widely used for solving complex s...
The aim in imitation learning is to learn effective policies by utilizin...
Reasoning about the future – understanding how decisions in the present ...
In imitation learning, it is common to learn a behavior policy to match ...
Standard dynamics models for continuous control make use of feedforward ...
Off-policy evaluation (OPE) holds the promise of being able to leverage ...
Progress in deep reinforcement learning (RL) research is largely enabled...
Since its introduction a decade ago, relative entropy policy search (REP...
Many modern approaches to offline Reinforcement Learning (RL) utilize be...
The recent success of supervised learning methods on ever larger offline...
The presence of uncertainty in policy evaluation significantly complicat...
Reinforcement learning (RL) has achieved impressive performance in a var...
We study high-confidence behavior-agnostic off-policy evaluation in rein...
In reinforcement learning, it is typical to use the empirically observed...
The recently proposed distribution correction estimation (DICE) family o...
Offline methods for reinforcement learning have the potential to help br...
Most reinforcement learning (RL) algorithms assume online access to the ...
The offline reinforcement learning (RL) problem, also referred to as bat...
The offline reinforcement learning (RL) problem, also referred to as bat...
In batch reinforcement learning (RL), one often constrains a learned pol...
We review basic concepts of convex duality, focusing on the very general...
When performing imitation learning from expert demonstrations, distribut...
In many real-world applications of reinforcement learning (RL), interact...
In reinforcement learning (RL) research, it is common to assume access t...
A number of machine learning (ML) methods have been proposed recently to...
Hierarchical reinforcement learning has demonstrated significant success...
Manipulation and locomotion are closely related problems that are often ...
In many real-world reinforcement learning applications, access to the en...
Many reinforcement learning (RL) tasks provide the agent with high-dimen...