Leveraging Demonstrations with Goal-Directed Multi-Task Bisimulation

Existing methods for visual goal-conditioned RL rely on a latent representation learned with pixel-wise reconstruction, which limits their ability to generalize. Other recent work has developed methods for learning theory-backed lossy representations based on bisimulation, which capture only what is relevant for a task in high-dimensional observation spaces. However, bisimulation is constrained to the single-task setting, as it relies on reward information to determine what is relevant and what is irrelevant. Can we learn representations that are well suited for control, discarding irrelevant distractors but keeping relevant information, while at the same time providing a general feature set suitable for multi-task settings?


  • Amy Zhang, Facebook
  • Ashvin Nair, UC Berkeley
  • Sergey Levine, UC Berkeley