AIXIjs: A Software Demo for General Reinforcement Learning

05/22/2017
by John Aslanides, et al.

Reinforcement learning (RL) is a general and powerful framework with which to study and implement artificial intelligence. Recent advances in deep learning have enabled RL algorithms to achieve impressive performance in restricted domains such as playing Atari video games (Mnih et al., 2015) and the board game Go (Silver et al., 2016). However, we are still far from constructing a generally intelligent agent. Many of the obstacles and open questions are conceptual: What does it mean to be intelligent? How does one explore and learn optimally in general, unknown environments? What, in fact, does it mean to be optimal in the general sense? The universal Bayesian agent AIXI (Hutter, 2005) is a model of a maximally intelligent agent, and plays a central role in the sub-field of general reinforcement learning (GRL). AIXI has recently been shown to be flawed in important ways: it does not explore enough to be asymptotically optimal (Orseau, 2010), and it can perform poorly with certain priors (Leike and Hutter, 2015). Several variants of AIXI have been proposed to address these shortfalls, among them entropy-seeking agents (Orseau, 2011), knowledge-seeking agents (Orseau et al., 2013), Bayes with bursts of exploration (Lattimore, 2013), MDL agents (Leike, 2016a), Thompson sampling (Leike et al., 2016), and optimism (Sunehag and Hutter, 2015). We present AIXIjs, a JavaScript implementation of these GRL agents. This implementation is accompanied by a framework for running experiments against various environments, similar to OpenAI Gym (Brockman et al., 2016), and a suite of interactive demos that explore different properties of the agents, similar to REINFORCEjs (Karpathy, 2015). We use AIXIjs to present numerous experiments illustrating fundamental properties of, and differences between, these agents.
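
To make the Bayesian ingredients mentioned above concrete, here is a minimal JavaScript sketch of a posterior update over a finite model class, followed by the model-sampling step used in Thompson sampling. It is an illustration under stated assumptions, not the AIXIjs API: the two-model bandit, the names `models`, `updatePosterior`, and `sampleModel`, and the reward sequence are all hypothetical.

```javascript
// Hypothetical sketch: none of these names come from AIXIjs.
// Two candidate environment models for a one-armed bandit whose single
// action pays reward 1 with an unknown probability.
const models = [
  { name: "stingy", pReward: 0.2 },
  { name: "generous", pReward: 0.8 },
];

// Bayesian update: multiply each weight by the likelihood of the
// observed reward under that model, then renormalize.
function updatePosterior(weights, reward) {
  const unnormalized = weights.map((w, i) => {
    const p = models[i].pReward;
    return w * (reward === 1 ? p : 1 - p);
  });
  const total = unnormalized.reduce((a, b) => a + b, 0);
  return unnormalized.map((w) => w / total);
}

// Thompson sampling step: pick one model index in proportion to its
// posterior weight. `u` is a uniform draw in [0, 1), passed in by the
// caller so that the function is deterministic and easy to test.
function sampleModel(weights, u) {
  let cumulative = 0;
  for (let i = 0; i < weights.length; i++) {
    cumulative += weights[i];
    if (u < cumulative) return i;
  }
  return weights.length - 1; // guard against floating-point round-off
}

// Start from a uniform prior and condition on an assumed reward history.
let posterior = [0.5, 0.5];
for (const reward of [1, 1, 0, 1]) {
  posterior = updatePosterior(posterior, reward);
}
```

In use, an agent would call `sampleModel(posterior, Math.random())` and then act optimally with respect to the sampled model for some period before resampling. A plain Bayes-optimal agent would instead plan against the full weighted mixture.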

