Sergey Levine

Learning Successor Affordances as Temporal Abstractions

Successor features (SF) provide a convenient representation for value functions: value functions under new reward functions can be obtained by simply recombining the features via a linear combination. However, successor features, by construction, require the underlying policy of the value function to be fixed. This can be undesirable when the goal is to find the optimal value function for each different reward function, since the successor features of different policies can be different.
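As a concrete illustration of the recombination property described above, here is a minimal sketch in Python (hypothetical shapes and random placeholders, not code from this project):

import numpy as np

# Successor features psi_i(s, a) = E_{pi_i}[ sum_t gamma^t phi(s_t, a_t) | s_0 = s, a_0 = a ]
# for a set of previously learned policies (assumed precomputed here).
n_policies, n_states, n_actions, d = 3, 10, 4, 8
rng = np.random.default_rng(0)
psi = rng.normal(size=(n_policies, n_states, n_actions, d))

# A new task is specified only by reward weights w, with r(s, a) = phi(s, a) @ w.
w_new = rng.normal(size=d)

# Value functions of the existing policies under the new reward are just linear
# recombinations of their successor features; no further learning is needed.
q_new = psi @ w_new                              # shape: (n_policies, n_states, n_actions)

# Generalized policy improvement: act greedily with respect to the best estimate.
# Each q_new[i] is still tied to the fixed policy pi_i, which is exactly the
# limitation described above.
gpi_actions = q_new.max(axis=0).argmax(axis=-1)  # shape: (n_states,)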

In this project, we explore successor affordances (SA) that can provide a basis for...

ML-Based Robotic Manipulation via the Use of Diverse Datasets

With today's methods, having a robot solve complex multi-step manipulation tasks would require either numerous lengthy demonstrations or a sequence of carefully choreographed motion plans, often rendering such an approach impractical. We instead build a two-level hierarchical system that can be trained using short snippets of robotic interaction data collected via teleoperation, together with a natural-language dataset of high-level instructions paired with low-level tasks, which is significantly easier to create. Our proposed system features a high-level controller which accepts...
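A rough sketch of such a two-level decomposition is given below; the interfaces (instruction format, skill set, and environment API) are illustrative assumptions, not this project's actual design:

from typing import Any, Callable, Dict

Observation = Dict[str, Any]
Action = Any

class HighLevelController:
    # Maps a natural-language instruction and the current observation to the
    # name of a low-level task; trained on instruction / low-level-task pairs.
    def propose_task(self, instruction: str, obs: Observation) -> str:
        raise NotImplementedError

class LowLevelPolicy:
    # Short-horizon skills trained from teleoperated interaction snippets.
    def __init__(self, skills: Dict[str, Callable[[Observation], Action]]):
        self.skills = skills

    def act(self, task: str, obs: Observation) -> Action:
        return self.skills[task](obs)

def run(hl: HighLevelController, ll: LowLevelPolicy, instruction: str, env, max_steps: int = 200):
    obs = env.reset()
    for _ in range(max_steps):
        task = hl.propose_task(instruction, obs)  # high level selects a sub-task
        obs, done = env.step(ll.act(task, obs))   # low level executes one step of it
        if done:
            break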

Leveraging Demonstrations with Goal-Directed Multi-Task Bisimulation

Existing methods for visual goal-conditioned RL rely on a latent representation learned with pixel-wise reconstruction, which limits their ability to generalize. Other recent work has developed methods for learning theory-backed lossy representations based on bisimulation, which capture only what is relevant for a task in high-dimensional observation spaces. However, bisimulation is constrained to the single-task setting, as it relies on reward information to...
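For context, a minimal sketch of the reward-dependent bisimulation objective this refers to (the names and the diagonal-Gaussian dynamics model are illustrative assumptions in the spirit of deep bisimulation for control, not this project's code):

import numpy as np

def w2_diag_gaussian(mu_i, sigma_i, mu_j, sigma_j):
    # 2-Wasserstein distance between diagonal-Gaussian predictions of the next latent state.
    return np.sqrt(np.sum((mu_i - mu_j) ** 2) + np.sum((sigma_i - sigma_j) ** 2))

def bisim_loss(z_i, z_j, r_i, r_j, next_i, next_j, gamma=0.99):
    # Bisimulation-style target: d(s_i, s_j) ~ |r_i - r_j| + gamma * W2(P(.|s_i), P(.|s_j)).
    # The explicit dependence on the rewards r_i and r_j is what ties the learned
    # representation to a single task.
    target = np.abs(r_i - r_j) + gamma * w2_diag_gaussian(*next_i, *next_j)
    return (np.sum(np.abs(z_i - z_j)) - target) ** 2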

Long-Horizon Decision-Making with Energy-Based Models

We introduce the γ-model, a predictive model of environment dynamics with an infinite probabilistic horizon. Replacing standard single-step models with γ-models leads to generalizations of the procedures that form the foundation of model-based control, including the model rollout and model-based value estimation.
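To make the value-estimation connection concrete (assuming the reward depends only on the state reached): the γ-model predicts samples from the discounted state-occupancy distribution, under which the value is a single expectation rather than a sum over rollout steps:

\mu_\gamma(s_e \mid s_t, a_t) = (1 - \gamma) \sum_{\Delta t = 1}^{\infty} \gamma^{\Delta t - 1}\, p(s_{t + \Delta t} = s_e \mid s_t, a_t)

Q^\pi(s_t, a_t) = \frac{1}{1 - \gamma}\, \mathbb{E}_{s_e \sim \mu_\gamma(\cdot \mid s_t, a_t)}\left[ r(s_e) \right]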

The γ-model,...

Learning Legged Locomotion

More information on this project coming soon, please check back.

Multi-agent Social Learning

Project Goals: Develop an algorithm for iteratively constructing recursive hierarchies of options. The hypothesis is that such a method could achieve an exponential improvement in learning efficiency over flat reinforcement learning policies by exploring with high...
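As background on the options abstraction these hierarchies would be built from, a minimal sketch of the standard initiation-set / policy / termination formulation (not this project's code):

from dataclasses import dataclass
from typing import Any, Callable

State = Any
Action = Any

@dataclass
class Option:
    # A temporally extended action: where it may start, how it acts, when it stops.
    can_initiate: Callable[[State], bool]
    policy: Callable[[State], Action]         # in a recursive hierarchy, this may
                                              # itself choose among other Options
    terminate_prob: Callable[[State], float]

In a recursive hierarchy, the policy of a higher-level option would select among lower-level options rather than primitive actions, which is what would allow exploration at progressively longer time scales.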