Writing with Speech — Using LLMs for Gist-Level Manipulation of Spoken Text

Dictation provides a more efficient text input method for mobile devices. However, using speech for writing can lead to verbose, incoherent, and inconsistent text, necessitating substantial editing. Our project creates Rambler, an LLM-integrated user interface designed for conceptual-level editing of dictated content through two main sets of functions: gist extraction and macro revision. Gist extraction...

Automated State and Action Space Design for Multi-Objective Reinforcement Learning


While Reinforcement Learning (RL) has shown impressive performance in games (e.g., Go and chess [14,13], Dota 2 [2], StarCraft II [18,17]), it remains challenging to use RL in real-world scenarios due to its high sample complexity and limited ability to generalize to unknown environments. To train an RL agent with traditional online methods, millions of interactions with the environment are often needed, which is definitely...

Self-supervised Open-World Segmentation


Standard benchmarks in image segmentation assume a "closed-world" setting, in which a pre-determined set of non-overlapping object categories is exhaustively segmented and labeled in all training and evaluation images. This significantly increases the difficulty of data collection, requiring either complex quality control and post-processing schemes if using crowd-sourced labeling or...

Multiscale Modeling for Control


A long-standing goal of AI research is to build/learn representations that lead to generalization in real-world sequential decision settings. Object-level representation that enables abstract reasoning in visual environments is a good candidate for this goal. Indeed, RL agents equipped with this inductive bias are able to generalize to tasks that involve a different...

Towards a Unified Understanding of Privacy and Generalization for Better Algorithm Design


Machine learning and deep learning have emerged as important technologies that enable a wide range of applications, including computer vision, natural language processing, healthcare, and recommendation. However, to deploy these machine learning algorithms responsibly in society, it is critical to design them to conform to ethical values such as privacy, safety, and fairness. For instance, researchers have found that information about training data can be extracted from a released machine learning model, which raises important privacy concerns, and adversarial attacks or...

GuBERT: Grounded units for Self-Supervised Pre-training of Speech

Self-Supervised Learning (SSL) techniques have proved to be quite effective for representation learning in multiple modalities such as text, image, and more recently speech. In the speech domain, SSL approaches for pretraining have resulted in state-of-the-art demonstrations in several downstream applications such as speech recognition (WAV2VEC, WAV2VEC2.0), spoken language modeling (GSLM), and speech resynthesis (HuBERT). As such, these approaches require massive amounts of speech data (thousands of hours of speech) and computational resources to train such large models. Also, while...

Unsupervised Environment Design for Multi-task Reinforcement Learning

We are interested in designing a method to improve learning efficiency and generalization in a single-agent multi-task reinforcement learning (RL) setting by leveraging unsupervised environment design techniques.

Researchers: Yuqing Du, UC Berkeley,...

Coherent and Consistent Long Story Generation

This is a continuation of our previous Year 3 collaboration, Learning-Driven Exploration For Search.


Berkeley Advisor: Dan Klein,

Emergent Collaboration for Heterogeneous Multi-Robot Rearrangement

Collaboration among different species is common in nature. For instance, ostriches have sharp eyesight but poor hearing and a weak sense of smell, while zebras have exceptional hearing and a great sense of smell but poor eyesight. They form a symbiotic relationship to protect themselves from predators on the African savanna.

Drawing inspiration from symbiotic collaboration in nature, heterogeneous multi-robot teams can also work...