Learning Successor Affordances as Temporal Abstractions

Successor features (SF) provide a convenient representation for value functions: the value function under a new reward function can be obtained by simply recombining the features via a linear combination. However, successor features, by construction, require the underlying policy of the value function to be fixed. This is undesirable when the goal is to find the optimal value function for each different reward function, since the successor features for different policies can differ.

In this project, we explore successor affordances (SA) that can provide a basis for...
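To make the recombination idea concrete, here is a minimal sketch in plain Python (with hypothetical numbers): if the reward is linear in features, r(s, a) = φ(s, a)·w, then Q^π(s, a) = ψ^π(s, a)·w, so evaluating the same fixed policy under a new reward only requires swapping the weight vector w.

```python
# Sketch: successor features under a fixed policy pi (toy, hypothetical numbers).
# If r(s, a) = phi(s, a) . w, then Q_pi(s, a) = psi_pi(s, a) . w, where psi_pi
# is the expected discounted sum of future features under pi.

def q_value(psi, w):
    """Recombine successor features psi with reward weights w via a dot product."""
    return sum(p * wi for p, wi in zip(psi, w))

# Successor features for two (state, action) pairs under some fixed policy pi.
psi = {
    ("s0", "a0"): [1.0, 0.5],
    ("s0", "a1"): [0.2, 2.0],
}

w_task_A = [1.0, 0.0]  # task A rewards only the first feature
w_task_B = [0.0, 1.0]  # task B rewards only the second feature

# Same psi, different w: value functions for both tasks with no re-learning.
q_A = {sa: q_value(p, w_task_A) for sa, p in psi.items()}
q_B = {sa: q_value(p, w_task_B) for sa, p in psi.items()}
```

Note the limitation the abstract points out: both q_A and q_B evaluate the *same* policy π; the ψ of a policy that is optimal for task B may differ.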

Learning Dexterous In-Hand Manipulation with Vision and Touch


Consider the task of stacking LEGO bricks or assembling IKEA furniture, shown in the figure. Given a goal image configuration, humans can rapidly form a plan to accurately manipulate the LEGO bricks or furniture parts to achieve the goal. This is mainly because: 1) humans are already equipped with a good mental dynamics model through daily interaction with objects,...

Ashera: Neural Optimization Modulo Theories


Ashera is a state-of-the-art Optimization Modulo Theories (OMT) solver that explicitly targets a rising class of optimization problems such as the multi-agent traveling salesman problem (mTSP) and multi-resource DAG scheduling. It excels at disjunctive problems, which trigger well-studied failure modes in ILP solvers, by exploiting both Logical Neighborhood Search and Neural Diving.

Logical Neighborhood Search decouples combinatorial search from convex optimization. For any feasible solution, Ashera performs convex optimization within...
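As an illustration of this decoupling, here is a toy sketch (a hypothetical problem, not Ashera's actual algorithm): the discrete choice is searched over a logical neighborhood, while for each candidate the remaining convex subproblem is solved in closed form.

```python
# Toy sketch of a Logical-Neighborhood-Search-style loop (hypothetical problem,
# not Ashera's actual algorithm): discrete local search, convex inner solve.
TARGET = 5.0
SETUP = [3.0, 1.0, 2.0]   # per-feature setup costs (discrete choice b)
PULL  = [4.0, 1.0, -2.0]  # each enabled feature shifts the operating point

def convex_solve(b):
    """Inner convex solve: min_x (c - x)^2 + x^2 with c = TARGET - ideal(b).
    Closed form: x* = c / 2, objective = setup(b) + c^2 / 2."""
    ideal = sum(p for p, on in zip(PULL, b) if on)
    setup = sum(s for s, on in zip(SETUP, b) if on)
    c = TARGET - ideal
    return c / 2.0, setup + c * c / 2.0

def neighborhood(b):
    """Logical neighborhood: flip a single feature on or off."""
    for i in range(len(b)):
        yield b[:i] + (1 - b[i],) + b[i + 1:]

# Local search: repeatedly take an improving discrete move, re-solving the
# convex subproblem at each candidate.
b = (0, 0, 0)
x, best = convex_solve(b)
improved = True
while improved:
    improved = False
    for nb in neighborhood(b):
        nx, val = convex_solve(nb)
        if val < best:          # take the first improving discrete move
            b, x, best = nb, nx, val
            improved = True
            break
```

The discrete search never touches the continuous variable directly; it only queries the convex solver for the value of each neighbor.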

Alpa: A Distributed System for Training and Serving Large Models

Alpa is a system for training and serving large-scale neural networks.

Scaling neural networks to hundreds of billions of parameters has enabled dramatic breakthroughs such as GPT-3, but training and serving these large-scale neural networks require complicated distributed system techniques. Alpa aims to automate large-scale distributed training and serving with just a few lines of code.


Fate of Snow

Northstar: “Develop iterative, meaningful benchmarks for AI researchers that enable substantial progress on problems related to climate change as well as impactful AI methodology.”

Summary: Learning from Observational, Multimodal, Multiscale, Spatiotemporal (OMMS) data sources is critical for researchers and practitioners working on problems related to climate change. AI methods for handling these types of data – and the many associated problems – remain largely undeveloped, and...

Statistically Efficient Offline RL with General Function Approximation


Offline reinforcement learning (RL) aims to learn effective policies from a previously collected dataset of interactions, without access to further interactions with the environment. To handle datasets with partial coverage, conservatism has recently been shown to be necessary for offline RL, both in practice and in theory. Existing offline RL algorithms, however, either do not offer theoretical guarantees or are impractical due to strong assumptions (such as tabular or linear parameterization) or computational intractability. We propose...
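To illustrate what conservatism means here, a minimal tabular sketch (hypothetical data, not the proposed algorithm): penalize value estimates for state-action pairs that are poorly covered by the dataset, e.g., by subtracting a bonus that shrinks with the visit count.

```python
# Minimal tabular sketch of conservatism in offline RL (hypothetical data,
# not this project's algorithm): penalize Q-estimates for poorly covered
# (state, action) pairs so the chosen action stays close to the data.
import math

# Empirical Q-estimates and dataset visit counts for a single state.
q_hat  = {"a0": 1.0, "a1": 1.2}   # a1 looks slightly better...
counts = {"a0": 500, "a1": 2}     # ...but is barely present in the dataset
beta = 1.0                         # pessimism coefficient

def pessimistic_q(a):
    """Lower-confidence-bound style penalty: shrinks as coverage grows."""
    return q_hat[a] - beta / math.sqrt(counts[a])

greedy = max(q_hat, key=q_hat.get)            # naive argmax: picks "a1"
conservative = max(q_hat, key=pessimistic_q)  # pessimistic argmax: picks "a0"
```

The naive argmax trusts the barely observed action; the pessimistic one prefers the well-covered action, which is the behavior partial-coverage datasets require.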

Combating Hallucination in Conditional Sequence Generation


In recent years, large-scale pre-trained language models (e.g., BERT, BART, GPT-3) have been widely adopted in text generation applications such as machine translation, document summarization, and question answering. However, as previous work [1] has analyzed, powerful language models tend to dominate the predictions of conditional generation, and the model is likely to hallucinate based only on the target history. For example, in summarization tasks, a conditional generation model may ignore the source text and generate summaries whose content does not exist in...

Automated Collision Prediction in Autonomous Systems with Monocular Camera

This project aims to improve real-world, wide field-of-view depth estimation using monocular sensors. In doing so, we will experiment with a variety of indoor and outdoor scene geometries using large deep learning models. A focus will be placed on data representation in order to investigate and identify the most efficient pipelines.

Researchers: Jerome Quenum (University of California, Berkeley), Brent Yi (University of California, Berkeley), Avideh Zakhor (University of California, Berkeley), Austin Stone (Google), Rico Jonschkowski,...

Learning From Play in Children and Robots: Who Can Train a Robot Better?

Lynch et al. (2019) introduced “Play-LMP”, a self-supervised method that learns to organize play behaviors in a latent space, then reuses them at test time to achieve specific goals. Combining self-supervised control with a diverse play dataset shifts the focus of skill learning from a narrow, discrete set of tasks to the full continuum of behaviors available in an environment. They found that this combination generalizes well empirically: after self-supervising on unlabeled play, their method substantially outperforms individual expert-trained policies on 18 difficult user-...

Distributed Learning: Privacy and Data Summarization

Machine learning is increasingly used in applications involving sensitive data, such as healthcare and finance. This necessitates approaches that make secure and private use of data. Differential privacy is the main framework for addressing these needs. However, its adoption has faced barriers, especially for distributed data. One reason is that theoretical guarantees often consider extreme cases where the data is fully distributed across agents (one data point per agent). This has led to impractical privacy guarantees, e.g., some methods...
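As a concrete instance of the differential privacy framework (the standard textbook Laplace mechanism, not the distributed methods this project studies): a statistic is released with noise calibrated to its sensitivity and the privacy budget ε.

```python
# The Laplace mechanism, a textbook building block of differential privacy
# (illustrative only; not this project's distributed methods).
import math
import random

def private_mean(data, lower, upper, epsilon, rng):
    """Release the mean of values clipped to [lower, upper], with Laplace
    noise of scale sensitivity / epsilon."""
    n = len(data)
    clipped = [min(max(x, lower), upper) for x in data]
    true_mean = sum(clipped) / n
    # Changing one record moves the clipped mean by at most this amount.
    sensitivity = (upper - lower) / n
    scale = sensitivity / epsilon
    # Sample Laplace(0, scale) via the inverse CDF.
    u = rng.random() - 0.5
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_mean + noise

data = [23, 45, 12, 67, 34]            # hypothetical sensitive values
release = private_mean(data, 0, 100, epsilon=1.0, rng=random.Random(0))
```

Smaller ε means stronger privacy and larger noise; the note above about one data point per agent corresponds to n = 1, where the noise scale (upper − lower)/ε becomes impractically large.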