
ML-Based Robotic Manipulation via the Use of Diverse Datasets

With today's methods, having a robot solve a complex multi-step manipulation task requires either numerous lengthy demonstrations or a sequence of carefully choreographed motion plans, often rendering such approaches impractical. We instead build a two-level hierarchical system that can be trained using short snippets of robotic interaction data collected via teleoperation, together with a natural language dataset pairing high-level instructions with low-level tasks, which is significantly easier to create. Our proposed system features a high-level controller which accepts...
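To make the two-level structure concrete, here is a minimal sketch of how a hierarchical controller could be organized. All names (`high_level_plan`, `Skill`, the toy skills) are illustrative stand-ins, not the project's actual interfaces: the high-level policy maps an instruction to a sequence of short low-level skills, each of which would be trained on teleoperated snippets.

```python
# Hypothetical sketch of a two-level hierarchical controller.
# The high-level policy decomposes an instruction into low-level skills;
# each skill is a short closed-loop behavior trainable from teleop data.
from typing import Callable, Dict, List

Skill = Callable[[dict], dict]

def grasp(state: dict) -> dict:
    # Toy skill: pick up the current target object.
    return dict(state, holding=state["target"])

def place(state: dict) -> dict:
    # Toy skill: put down whatever is held.
    return dict(state, holding=None, placed=True)

SKILLS: Dict[str, Skill] = {"grasp": grasp, "place": place}

def high_level_plan(instruction: str) -> List[str]:
    """Stand-in for a learned instruction -> skill-sequence model."""
    if "put" in instruction:
        return ["grasp", "place"]
    return []

def execute(instruction: str, state: dict) -> dict:
    for name in high_level_plan(instruction):
        state = SKILLS[name](state)
    return state

state = execute("put the block in the bin",
                {"target": "block", "holding": None})
```

The key point of the decomposition is that only the short skills need robot data; the instruction-to-skill mapping can be learned from the cheaper language dataset.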

Formal Skill Representation and Composition for Task Generalization

Abstract

In this work, we focus on developing a methodology to systematically generate challenging scenarios that help RL agents generalize their ability to solve tasks in the autonomous driving domain. First, we derive our scenarios from autonomous vehicle (AV) crash reports filed with the California Department of Motor Vehicles (DMV). These scenarios serve as realistic training environments for RL agents, and learning to drive in these challenging scenarios can help the agents generalize their driving ability. More specifically, using natural language processing...

Automating Multi-Agent Curriculum Learning with Probabilistic Programs

Abstract

Automatic curriculum learning (ACL) is a family of mechanisms that automatically adapt the distribution of training data by learning to adjust the selection of learning situations to the capabilities of deep reinforcement learning (RL) agents. ACL can be practically beneficial, enhancing not only training sample efficiency but also RL agents' capabilities to achieve harder tasks via an incrementally challenging and diverse curriculum. In this work, we focus on scenario generation, or procedural content generation, to automatically create a...
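One common way to adapt the training distribution in ACL is to weight tasks by measured learning progress, as in teacher-student curriculum methods. The sketch below is a minimal, assumption-laden illustration of that idea (task indices, scalar success rates, and the absolute-change progress signal are all simplifications), not the scenario-generation mechanism this project proposes.

```python
# Minimal learning-progress curriculum sampler (illustrative only).
# Tasks with larger recent change in success rate are sampled more often,
# shifting training toward situations at the agent's frontier of ability.
import random

class CurriculumSampler:
    def __init__(self, n_tasks: int):
        self.prev = [0.0] * n_tasks       # last observed success rate per task
        self.progress = [1.0] * n_tasks   # optimistic init: explore all tasks

    def update(self, task: int, success_rate: float) -> None:
        # Learning progress measured as absolute change in success rate.
        self.progress[task] = abs(success_rate - self.prev[task])
        self.prev[task] = success_rate

    def sample(self) -> int:
        total = sum(self.progress)
        if total == 0.0:
            return random.randrange(len(self.progress))  # fall back to uniform
        return random.choices(range(len(self.progress)),
                              weights=self.progress)[0]
```

In a scenario-generation setting, the discrete task index would be replaced by parameters of a procedural generator, but the adapt-to-capability loop is the same.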

Investigations into the Complexity of Nonconvex Optimization

Machine learning based systems are set to play an increasing role in everyday life due to the abundance of large-scale training data and the use of sophisticated statistical models such as neural networks. In light of these trends, much recent attention has been devoted to understanding both how robust and reliable these methods are when deployed in the real world, and the computational complexity of actually learning them from data. In our collaboration so far, we have adopted a theoretical perspective on each of these questions and have made partial progress towards...

Leveraging Demonstrations with Goal-Directed Multi-Task Bisimulation

Existing methods for visual goal-conditioned RL rely on a latent representation learned with pixel-wise reconstruction, which limits the generalization ability of these methods. Other recent work has developed methods for learning theory-backed lossy representations based on bisimulation that capture only what is relevant for a task in high-dimensional observation spaces. However, bisimulation is constrained to the single-task setting, as it relies on reward information to...
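The reward dependence mentioned above is visible directly in the bisimulation objective. The sketch below shows a simplified, DBC-style bisimulation target (in the spirit of Zhang et al.'s deep bisimulation for control): the learned distance between two states should match their immediate reward difference plus the discounted distance between successor embeddings. It is a plain-Python stand-in; a real implementation would use learned encoders and distances between transition distributions.

```python
# Simplified bisimulation distance target (illustrative sketch).
# The |r_i - r_j| term is exactly what ties plain bisimulation to a
# single task's reward function, motivating the multi-task extension.

def l1(a, b):
    """L1 distance between two embedding vectors (lists of floats)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def bisim_target(r_i, r_j, z_next_i, z_next_j, gamma=0.99):
    """Target distance for a pair of states with rewards r_i, r_j and
    successor embeddings z_next_i, z_next_j."""
    return abs(r_i - r_j) + gamma * l1(z_next_i, z_next_j)
```

Training would regress the distance between the two states' embeddings toward this target; removing or generalizing the reward term is what a goal-directed multi-task variant must address.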

Enabling Non-Experts to Annotate Complex Logical Forms at Scale

The goal of semantic parsing is to map natural language utterances into logical forms, which are then executed to fulfill the users' needs. For example, a user might seek information by asking "What's the height of the highest mountain in the U.S.?", and the semantic parser will produce an SQL query, SELECT MAX(altitude) FROM Mountain WHERE country = 'U.S.', and execute it against a database to produce the answer. Semantic parsers can also be used to formally represent intended actions, track dialogue states, or process...
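The example in the text can be run end to end: the logical form the parser emits is itself an executable program. Below, the SQL from the text is executed against a toy in-memory SQLite database; the schema and the rows are invented for illustration.

```python
# Executing a semantic parser's output (the SQL query from the text)
# against a toy database to produce the final answer.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Mountain (name TEXT, altitude REAL, country TEXT)")
conn.executemany(
    "INSERT INTO Mountain VALUES (?, ?, ?)",
    [("Denali", 6190.0, "U.S."),
     ("Mount Whitney", 4421.0, "U.S."),
     ("Everest", 8849.0, "Nepal")],
)

# Logical form for: "What's the height of the highest mountain in the U.S.?"
(answer,) = conn.execute(
    "SELECT MAX(altitude) FROM Mountain WHERE country = 'U.S.'"
).fetchone()
```

The annotation challenge the project addresses is producing such logical forms for training data, which normally requires annotators who know the formal language.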

Active Visual Planning: Handling Uncertainty in Perception, Prediction, and Planning Pipelines

Abstract

When navigating complex multi-agent scenarios, humans not only reason about the uncertainty of their own perception, but also reason about the effect of their actions on their own perception. For example, a human driver at a blind intersection may inch forward to improve their visibility of oncoming traffic before deciding whether to proceed. This...

Towards Human-like Attention

Overview

Convolutional Neural Networks (CNNs) can already match human performance on clean images, but they are not as robust. Recently proposed self-attention mechanisms appear to help robustness, but still fail in certain cases. However, previous studies do show a close relationship between attention and robustness in the human visual system. We hypothesize that attention is the key to robustness, but that self-attention is not the right formulation for it. We propose to study the neuronal foundations of human visual attention, and to develop a human-like attention mechanism that achieves higher robustness....

Video Representation Learning for Global and Local Features

This ongoing project uses self-supervised learning to learn video representations that are useful both for coarser tasks involving global information and for finer-grained tasks involving local information.

Researchers

Franklin Wang, UC Berkeley
Avideh Zakhor, UC Berkeley
Yale Song, Microsoft
Du Tran, Facebook
Aravind Kalaiah, Facebook

Overview

Self-supervised video representation learning provides new opportunities for computer vision: it can take full advantage of the wealth of unlabeled video data available, and, when successful, it can improve...
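A common self-supervised objective in this area, and a plausible building block for learning clip-level features, is the InfoNCE contrastive loss: embeddings of two clips from the same video form a positive pair, while other videos in the batch serve as negatives. The sketch below is an illustrative, dependency-free version of that loss, not the project's actual objective.

```python
# InfoNCE contrastive loss (illustrative sketch, plain Python).
# z_a and z_b are lists of embedding vectors; row i of each is a
# positive pair (e.g. two clips sampled from the same video).
import math

def info_nce(z_a, z_b, temperature=0.1):
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    losses = []
    for i, anchor in enumerate(z_a):
        # Similarity of this anchor to every candidate in the batch.
        logits = [dot(anchor, z) / temperature for z in z_b]
        m = max(logits)                                  # numerical stability
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(-(logits[i] - log_denom))          # cross-entropy vs. i
    return sum(losses) / len(losses)
```

Minimizing this loss pulls positive pairs together relative to the in-batch negatives; a global/local variant would apply it at both the video level and the spatio-temporal patch level.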

Personalized Federated Learning: New Algorithms and Statistical Rates

Background: Federated Learning (FL) has emerged as a powerful paradigm for distributed, privacy-preserving machine learning over a large network of devices [1]. Most existing work on FL focuses on learning a single model that is deployed to all devices. Given the diverse characteristics of users and application scenarios, personalization is highly desirable and inevitable in the near future. Personalized Federated Learning (PFL) aims to improve the experience of individual users by training personalized on-device models that overcome the limitations of a common...
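One widely used PFL recipe, and a useful baseline for the algorithms this project studies, is to train a shared model with FedAvg and then fine-tune it locally on each device. The sketch below is a hedged illustration of that recipe on a deliberately tiny model (scalar mean estimation), so the federated structure stays readable; the project's actual algorithms and models may differ.

```python
# FedAvg + local fine-tuning as a minimal personalization baseline.
# The "model" is a single scalar w fit to each client's data by SGD
# on the squared loss 0.5 * (w - x)^2.

def local_sgd(w, data, lr=0.1, steps=10):
    """Run `steps` passes of SGD over one client's data."""
    for _ in range(steps):
        for x in data:
            w -= lr * (w - x)     # gradient of 0.5 * (w - x)^2
    return w

def fed_avg(w_global, client_data, rounds=5):
    """Each round: clients update the global model locally,
    the server averages the updates."""
    for _ in range(rounds):
        updates = [local_sgd(w_global, d) for d in client_data]
        w_global = sum(updates) / len(updates)
    return w_global

def personalize(w_global, local_data):
    """Personalization step: fine-tune the shared model on-device."""
    return local_sgd(w_global, local_data, steps=20)
```

With heterogeneous clients, the averaged global model sits between the clients' optima, and the fine-tuning step recovers each client's local fit; characterizing when and how much this helps is the kind of statistical question the project targets.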