On Assessing the Usefulness of Proxy Domains for Developing and Evaluating Embodied Agents

by Anthony Courchesne, et al.

In many situations it is either impossible or impractical to develop and evaluate agents entirely on the target domain in which they will be deployed. This is particularly true in robotics, where running experiments on hardware is much more arduous than running them in simulation, and arguably even more so for learning-based agents. For this reason, considerable recent effort has been devoted to developing increasingly realistic, higher-fidelity simulators. However, we lack a principled way to evaluate how good a "proxy domain" is, specifically in terms of how useful it is in helping us achieve our end objective of building an agent that performs well in the target domain. In this work, we investigate methods to address this need. We begin by clearly separating two uses of proxy domains that are often conflated: 1) their ability to be a faithful predictor of agent performance and 2) their ability to be a useful tool for learning. We then attempt to clarify the role of proxy domains and establish new proxy usefulness (PU) metrics for comparing different proxy domains. We propose the relative predictive PU to assess the predictive ability of a proxy domain and the learning PU to quantify the usefulness of a proxy as a tool for generating learning data. Furthermore, we argue that the value of a proxy is conditioned on the task that it is being used to help solve. Finally, we demonstrate how these new metrics can be used to optimize parameters of the proxy domain for which obtaining ground truth via system identification is not trivial.




