Formal Skill Representation and Composition for Task Generalization

Abstract

In this work, we develop a methodology for systematically generating challenging scenarios that help RL agents generalize their ability to solve tasks in the autonomous driving domain. We derive our scenarios from autonomous vehicle (AV) crash reports filed with the California Department of Motor Vehicles (DMV). These scenarios serve as realistic training environments for RL agents, and learning to drive in them could help the agents generalize their driving ability. More specifically, we use natural language processing to parse structured information about each accident from crash reports written in English. We then convert the structured information into a formal scenario model using SCENIC, a scenario specification language that is also a probabilistic programming language. The probabilistic aspect allows one to model an abstract scenario, consisting of distributions over the initial states and the interactive behaviors (i.e., policies) of the environment's traffic participants. From an RL training perspective, an abstract scenario represents structured variation in the training environments, which can be helpful for generalization.

One key issue with directly using the SCENIC programs synthesized from the AV crash reports is that they are under-specified. A crash report contains high-level information about (i) the entities involved in the crash, (ii) distributions over their initial and goal positions (e.g., the truck was headed northbound on Fourth Street at the intersection of Fourth and Tenth Street), (iii) the nature of the accident (e.g., a T-bone collision), and (iv) optionally, the cause of the accident (e.g., the AV did not yield). The reports do not contain more detailed information, such as the speeds, acceleration, or braking of these vehicles.
Hence, this underspecified information, which determines the interactions among traffic participants and the composition of their behaviors, needs to be searched for by sampling from the distributions specified in the SCENIC programs, which is likely to be sample-inefficient. The crux of the research is how to generate scenarios that satisfy given constraints on the initial/end positions and the nature of the collision from the wide distributions of an underspecified scenario model. We formalize this problem as control improvisation [1][2], which, given a specification language, synthesizes a probabilistic algorithm that generates concrete scenarios satisfying both hard constraints (which must always be satisfied) and randomness constraints (which require the generated scenarios to be as random, and therefore as diverse, as possible).
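
To make the sample-inefficiency concern concrete, the following is a minimal toy sketch in plain Python. It is neither SCENIC code nor a control improvisation algorithm. It assumes a hypothetical two-vehicle intersection model in which speeds and starting distances are the underspecified parameters, and it treats "both vehicles reach the intersection at nearly the same time" as a stand-in for the hard T-bone collision constraint. All names, ranges, and the tolerance are illustrative assumptions, not values taken from the crash reports.

```python
import random

# Hypothetical ranges for parameters that a crash report leaves unspecified.
SPEED_RANGE = (5.0, 20.0)      # metres per second
DISTANCE_RANGE = (20.0, 60.0)  # metres from the intersection centre
TIME_TOLERANCE = 0.2           # seconds; assumed window for a collision


def sample_scenario(rng):
    """Draw one concrete scenario from the (assumed) uniform distributions."""
    return {
        "ego_speed": rng.uniform(*SPEED_RANGE),
        "truck_speed": rng.uniform(*SPEED_RANGE),
        "ego_distance": rng.uniform(*DISTANCE_RANGE),
        "truck_distance": rng.uniform(*DISTANCE_RANGE),
    }


def is_t_bone(scenario):
    """Toy hard constraint: both vehicles, driving at constant speed on
    perpendicular roads, arrive at the intersection within TIME_TOLERANCE."""
    ego_arrival = scenario["ego_distance"] / scenario["ego_speed"]
    truck_arrival = scenario["truck_distance"] / scenario["truck_speed"]
    return abs(ego_arrival - truck_arrival) < TIME_TOLERANCE


def rejection_sample(rng, num_wanted, max_draws=100_000):
    """Keep drawing scenarios until num_wanted satisfy the hard constraint.

    Returns the accepted scenarios and the total number of draws used.
    """
    accepted = []
    draws = 0
    while len(accepted) < num_wanted and draws < max_draws:
        draws += 1
        scenario = sample_scenario(rng)
        if is_t_bone(scenario):
            accepted.append(scenario)
    return accepted, draws


if __name__ == "__main__":
    rng = random.Random(0)
    scenarios, draws = rejection_sample(rng, num_wanted=20)
    print(f"accepted {len(scenarios)} of {draws} sampled scenarios")
```

Every accepted scenario satisfies the hard constraint, and the accepted set stays diverse because nothing beyond that constraint biases the draw. The cost is that most draws are discarded. In this toy model the waste grows as the tolerance shrinks or as more underspecified parameters are added, which is the inefficiency that motivates synthesizing a generator instead of filtering samples.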

References

[1] Daniel J. Fremont, Alexandre Donzé, Sanjit A. Seshia, David Wessel, "Control Improvisation," Foundations of Software Technology and Theoretical Computer Science (FSTTCS), 2015

[2] Marcell Vazquez-Chanlatte, Sebastian Junges, Daniel J. Fremont, Sanjit A. Seshia, "Entropy-Guided Control Improvisation," Robotics: Science and Systems (RSS), 2021

Researchers

  • Edward Kim, UC Berkeley
  • Abdus Azad, UC Berkeley
  • Aleksandra Faust, Google
  • George Tucker, Google
  • Izzeddin Gur, Google
  • Prof. Sanjit Seshia, UC Berkeley

Acknowledgements

This project is in part based upon work sponsored by Google.