Automatic curriculum learning (ACL) is a family of mechanisms that adapt the distribution of training data to the current capabilities of deep reinforcement learning (RL) agents by learning which learning situations to select. ACL can improve not only training sample efficiency but also an RL agent's ability to solve harder tasks, by means of an incrementally challenging and diverse curriculum. In this work, we focus on scenario generation, or procedural content generation, to create a curriculum automatically. Each scenario is modelled and generated using a probabilistic programming language, SCENIC, whose syntax and semantics are designed for modelling scenarios intuitively. Here, a scenario defines distributions over initial states and over the dynamic, interactive behaviors (i.e., policies) of the environment agents. With SCENIC as a formal construct for representing scenarios, the key novel aspect of our research is the synthesis of environments by composing scenarios, which automatically yields incrementally challenging and diverse scenarios. We first formalize this notion of scenario composition. We then use the formalization to develop an ACL algorithm that, given a set of SCENIC programs, trains RL agents not only in all the provided scenarios but also in combinatorial compositions of those scenarios, with the aim of producing more robust and generalizable agents. We evaluate our algorithm in a multi-agent real-time strategy environment, namely soccer.
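To make the idea of training over combinatorial compositions concrete, the following is a minimal Python sketch, not the project's algorithm and not SCENIC code. It assumes scenarios can be identified by name, treats a composition as an unordered combination of base scenarios, and uses an illustrative sampling heuristic that favors compositions the agent solves about half the time. The names `compose_scenarios`, `CurriculumSampler`, the scenario labels, and the parameters `step` and `floor` are all hypothetical.

```python
import itertools
import random


def compose_scenarios(base_scenarios, max_size):
    """Enumerate every non-empty combination of up to max_size base scenarios."""
    compositions = []
    for k in range(1, max_size + 1):
        compositions.extend(itertools.combinations(base_scenarios, k))
    return compositions


class CurriculumSampler:
    """Samples compositions, favoring those of intermediate difficulty.

    Assumption: difficulty is tracked as a running success-rate estimate p,
    and the sampling weight is p * (1 - p) + floor. This is an illustrative
    heuristic, not the method described in the text.
    """

    def __init__(self, compositions, step=0.2, floor=0.05, seed=0):
        self.compositions = list(compositions)
        self.step = step      # update rate of the running success estimate
        self.floor = floor    # keeps every composition reachable
        self.success = {c: 0.0 for c in self.compositions}
        self.rng = random.Random(seed)

    def weights(self):
        return [self.success[c] * (1.0 - self.success[c]) + self.floor
                for c in self.compositions]

    def sample(self):
        return self.rng.choices(self.compositions, weights=self.weights(), k=1)[0]

    def update(self, composition, solved):
        # Exponential moving average of the episode outcome (1.0 if solved).
        old = self.success[composition]
        self.success[composition] = old + self.step * (float(solved) - old)


if __name__ == "__main__":
    base = ["pass_and_shoot", "counterattack", "corner_kick"]  # hypothetical names
    sampler = CurriculumSampler(compose_scenarios(base, max_size=2))
    for _ in range(5):
        scenario = sampler.sample()
        solved = sampler.rng.random() < 0.5  # stand-in for a real training episode
        sampler.update(scenario, solved)
    print(sampler.weights())
```

In an actual training loop, the random stand-in for `solved` would be replaced by the outcome of an RL episode run in the sampled composition, and each composition would correspond to a composed scenario program rather than a tuple of names.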
- Edward Kim, UC Berkeley
- Abdus Azad, UC Berkeley
- Aditya Grover, Facebook
- Prof. Sanjit Seshia, UC Berkeley
This project is in part based upon work sponsored by Facebook.