Rigorous Agent Evaluation: An Adversarial Approach to Uncover Catastrophic Failures

by Jonathan Uesato, et al.

This paper addresses the problem of evaluating learning systems in safety-critical domains such as autonomous driving, where failures can have catastrophic consequences. We focus on two problems: searching for scenarios in which learned agents fail and assessing their probability of failure. The standard method for agent evaluation in reinforcement learning, Vanilla Monte Carlo, can miss failures entirely, leading to the deployment of unsafe agents. We demonstrate this is an issue for current agents, where even matching the compute used for training is sometimes insufficient for evaluation. To address this shortcoming, we draw upon the rare event probability estimation literature and propose an adversarial evaluation approach. Our approach focuses evaluation on adversarially chosen situations, while still providing unbiased estimates of failure probabilities. The key difficulty is in identifying these adversarial situations: since failures are rare, there is little signal to drive optimization. To solve this, we propose a continuation approach that learns failure modes in related but less robust agents. Our approach also allows reuse of data already collected for training the agent. We demonstrate the efficacy of adversarial evaluation on two standard domains: humanoid control and simulated driving. Experimental results show that our methods can find catastrophic failures and estimate failure rates of agents multiple orders of magnitude faster than standard evaluation schemes, in minutes to hours rather than days.
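
To illustrate why focusing evaluation on adversarially chosen situations can still give unbiased failure estimates, here is a minimal sketch of plain importance sampling on a toy problem. It is not the paper's method: the toy agent `failure_prob`, the Gaussian distribution over initial conditions, the hand-picked shifted proposal, and all function names are assumptions made for illustration. The paper's approach learns where failures are likely, whereas this sketch simply assumes that region is known.

```python
import math
import random


def failure_prob(x):
    """Hypothetical toy agent: fails whenever the initial condition x exceeds 4."""
    return 1.0 if x > 4.0 else 0.0


def normal_pdf(x, mu, sigma=1.0):
    """Density of a normal distribution with mean mu and std sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def vanilla_monte_carlo(n, rng):
    """Sample initial conditions from the true distribution N(0, 1)
    and average the observed failures."""
    fails = sum(failure_prob(rng.gauss(0.0, 1.0)) for _ in range(n))
    return fails / n


def importance_sampling(n, rng, shift=4.0):
    """Sample from a proposal N(shift, 1) concentrated near the failure
    region, then reweight each sample by p(x) / q(x) so the estimate
    remains unbiased for the failure probability under N(0, 1)."""
    total = 0.0
    for _ in range(n):
        x = rng.gauss(shift, 1.0)
        weight = normal_pdf(x, 0.0) / normal_pdf(x, shift)
        total += weight * failure_prob(x)
    return total / n


# Exact failure probability for the toy problem: P(X > 4) with X ~ N(0, 1).
TRUE_P = 0.5 * math.erfc(4.0 / math.sqrt(2.0))

rng = random.Random(0)
mc_estimate = vanilla_monte_carlo(10_000, rng)
is_estimate = importance_sampling(10_000, rng)

print(f"true failure probability: {TRUE_P:.3e}")
print(f"vanilla Monte Carlo:      {mc_estimate:.3e}")
print(f"importance sampling:      {is_estimate:.3e}")
```

With a true failure probability of roughly 3e-5, 10,000 vanilla samples are expected to contain fewer than one failure, so the vanilla estimate is usually exactly zero. The reweighted estimate draws about half of its samples from the failure region and so lands close to the true value with the same budget.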



Neural Bridge Sampling for Evaluating Safety-Critical Autonomous Systems

Learning-based methodologies increasingly find applications in safety-cr...

Accelerated Policy Evaluation: Learning Adversarial Environments with Adaptive Importance Sampling

The evaluation of rare but high-stakes events remains one of the main di...

A Method for Estimating the Probability of Extremely Rare Accidents in Complex Systems

Estimating the probability of failures or accidents with aerospace syste...

Failure-Scenario Maker for Rule-Based Agent using Multi-agent Adversarial Reinforcement Learning and its Application to Autonomous Driving

We examine the problem of adversarial reinforcement learning for multi-a...

A Versatile Approach to Evaluating and Testing Automated Vehicles based on Kernel Methods

Evaluation and validation of complicated control systems are crucial to ...

Failure Prediction for Autonomous Driving

The primary focus of autonomous driving research is to improve driving a...

Gradient Optimization for Single-State RMDPs

As modern problems such as autonomous driving, control of robotic compon...
