Incentivizing the Emergence of Grounded Discrete Communication Between General Agents
We converted the recently developed BabyAI grid world platform to a sender/receiver setup in order to test the hypothesis that established deep reinforcement learning techniques are sufficient to incentivize the emergence of a grounded discrete communication protocol between general agents. This is in contrast to previous experiments that employed straight-through estimation or tailored inductive biases. Our results show that such mechanisms can indeed be avoided by instead providing appropriate environmental incentives. Moreover, they show that a longer interval between communications incentivized more abstract semantics. In some cases, the communicating agents adapted to new environments more quickly than monolithic agents, showcasing the potential of emergent discrete communication for transfer learning.
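To illustrate the core idea, here is a minimal sketch of how a discrete communication channel can be trained with plain reinforcement learning (REINFORCE) rather than straight-through estimation: the sampled symbol is non-differentiable, so reward-weighted policy-gradient updates are applied on both sides of the channel. All details are illustrative assumptions, not the paper's setup: a tabular referential game with hypothetical sizes `N_CONCEPTS` and `VOCAB` stands in for the BabyAI sender/receiver environment.

```python
import math
import random

random.seed(0)

N_CONCEPTS = 4   # hypothetical toy sizes, not taken from the paper
VOCAB = 4
LR = 0.5
EPISODES = 3000

# Tabular policy logits: sender maps a concept to a symbol,
# receiver maps a symbol to a guessed concept.
sender = [[0.0] * VOCAB for _ in range(N_CONCEPTS)]
receiver = [[0.0] * N_CONCEPTS for _ in range(VOCAB)]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def sample(probs):
    r, acc = random.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r <= acc:
            return i
    return len(probs) - 1

baseline = 0.0  # running reward baseline to reduce gradient variance
for _ in range(EPISODES):
    concept = random.randrange(N_CONCEPTS)
    s_probs = softmax(sender[concept])
    symbol = sample(s_probs)            # discrete, non-differentiable channel
    r_probs = softmax(receiver[symbol])
    guess = sample(r_probs)
    reward = 1.0 if guess == concept else 0.0  # environmental incentive only
    adv = reward - baseline
    baseline += 0.01 * (reward - baseline)
    # REINFORCE: move the sampled action's logit in proportion to the advantage.
    for k in range(VOCAB):
        grad = (1.0 if k == symbol else 0.0) - s_probs[k]
        sender[concept][k] += LR * adv * grad
    for k in range(N_CONCEPTS):
        grad = (1.0 if k == guess else 0.0) - r_probs[k]
        receiver[symbol][k] += LR * adv * grad

# Evaluate the greedy (argmax) protocol that emerged.
correct = sum(
    max(range(N_CONCEPTS),
        key=lambda g: receiver[max(range(VOCAB),
                                   key=lambda s: sender[c][s])][g]) == c
    for c in range(N_CONCEPTS)
)
print(f"greedy protocol accuracy: {correct}/{N_CONCEPTS}")
```

Because the only learning signal is task reward, any protocol the agents converge on is grounded in the environment by construction; this toy game omits the grid world, partial observability, and the communication-interval schedule studied in the paper.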