Active

Data Curation for Web-Scale Datasets

Abstract

Data curation is a promising direction for improving the efficiency and performance of large-scale models. Current efforts toward curation are ad hoc and disconnected. We propose to develop new, principled approaches to data curation inspired by Sorscher et al...
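
As a point of reference for the score-based pruning studied by Sorscher et al., here is a minimal sketch of difficulty-based data pruning; the difficulty scores, the keep fraction, and the keep-hard heuristic are illustrative assumptions, not the proposal's method.

    import numpy as np

    def prune_by_difficulty(scores: np.ndarray, keep_fraction: float, keep_hard: bool = True) -> np.ndarray:
        """Return indices of training examples to keep after score-based pruning.

        scores: one difficulty score per example (higher = harder).
        keep_hard: with abundant data, keeping the hardest examples tends to help;
        with scarce data, keeping the easier examples can be preferable.
        """
        n_keep = int(len(scores) * keep_fraction)
        order = np.argsort(scores)                    # easy -> hard
        return order[-n_keep:] if keep_hard else order[:n_keep]

    # Illustrative usage with random scores standing in for real per-example metrics.
    scores = np.random.rand(10_000)
    kept_indices = prune_by_difficulty(scores, keep_fraction=0.3)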

Dynamic Compression Techniques for Efficient Transformers

Abstract

Transformers are a class of deep neural networks that have achieved state-of-the-art results across a wide range of domains, including natural language processing, computer vision, and computational biology. The widespread success of these models has been attributed to the attention mechanism, which identifies complex dependencies between elements of each input sequence. While the attention mechanism is incredibly...
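
For reference, a minimal NumPy sketch of the scaled dot-product attention the abstract refers to; the toy shapes are illustrative, and the compression techniques themselves are not shown here.

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.

        Each output position is a weighted sum over all value vectors, which is how
        attention captures dependencies between elements of the input sequence.
        """
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)                       # (seq_len, seq_len)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)        # row-wise softmax
        return weights @ V

    # Toy self-attention over a sequence of 4 tokens with 8-dimensional embeddings.
    x = np.random.randn(4, 8)
    out = scaled_dot_product_attention(x, x, x)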

Learning Large Touch-Vision-Language Models Using Self-Supervised Robot Learning

Abstract

Humans depend on the integration of multiple sensory inputs, including but not limited to vision, language, audio, and touch, to successfully carry out daily tasks. Giving robots an analogous ability to perceive and process information from different sensory modalities enables a richer understanding of the physical...

Writing with Speech — Using LLMs for Gist-Level Manipulation of Spoken Text

Dictation enables efficient text input on mobile devices. However, writing with speech can produce verbose, incoherent, and inconsistent text that requires substantial editing. Our project creates Rambler, an LLM-integrated user interface designed for conceptual-level editing of dictated content through two main sets of functions: gist extraction and macro revision. Gist extraction...
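
Purely as an illustration of what a gist-extraction call might look like, a hypothetical sketch; call_llm, extract_gists, and the prompt wording are placeholders introduced here, not Rambler's actual implementation.

    def call_llm(prompt: str) -> str:
        """Placeholder for whichever LLM backend the interface uses."""
        raise NotImplementedError

    def extract_gists(dictated_text: str, max_gists: int = 5) -> list[str]:
        """Condense rambling dictation into a handful of short gist phrases."""
        prompt = (
            f"Summarize the following dictated text as at most {max_gists} "
            f"short gist phrases, one per line:\n\n{dictated_text}"
        )
        return [line.strip() for line in call_llm(prompt).splitlines() if line.strip()]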

Automated State and Action Space Design for Multi-Objective Reinforcement Learning

Introduction

While Reinforcement Learning (RL) has shown impressive performance in games (e.g., Go and chess [14,13], Dota 2 [2], StarCraft II [18,17], etc.), using RL in real-world scenarios remains challenging due to its enormous sample complexity and limited ability to generalize to unseen environments. To train an RL agent with traditional online methods, millions of interactions with the environment are often needed, which is definitely...
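
For concreteness, a minimal sketch of the standard online interaction loop whose step count drives this sample complexity, written against the Gymnasium API; the environment, the random policy, and the one-million-step budget are illustrative assumptions.

    import gymnasium as gym

    env = gym.make("CartPole-v1")
    total_steps, budget = 0, 1_000_000      # online RL budgets are often of this order

    obs, info = env.reset()
    while total_steps < budget:
        action = env.action_space.sample()  # stand-in for the agent's current policy
        obs, reward, terminated, truncated, info = env.step(action)
        total_steps += 1
        # ...update the agent from (obs, action, reward) here...
        if terminated or truncated:
            obs, info = env.reset()
    env.close()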

Self-supervised Open-World Segmentation

Overview

Standard benchmarks in image segmentation assume a "closed-world" setting, in which a pre-determined set of non-overlapping object categories is exhaustively segmented and labeled in all training and evaluation images. This significantly increases the difficulty of data collection, requiring either complex quality control and post-processing schemes if using crowd-sourced labeling or...

Multiscale Modeling for Control

Abstract

A long-standing goal of AI research is to build or learn representations that lead to generalization in real-world sequential decision-making settings. Object-level representations that enable abstract reasoning in visual environments are a good candidate for this goal. Indeed, RL agents equipped with this inductive bias are able to generalize to tasks that involve a different...

Towards a Unified Understanding of Privacy and Generalization for Better Algorithm Design

Abstract

Machine learning and deep learning have emerged as important technologies that enable a wide range of applications, including computer vision, natural language processing, healthcare, and recommendation systems. However, to deploy these machine learning algorithms responsibly in society, it is critical to design them to conform to ethical values such as privacy, safety, and fairness. For instance, researchers have found that information about training data can be extracted from a released machine learning model, which raises important privacy concerns, and adversarial attacks or...
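
One concrete instance of the training-data leakage mentioned above is membership inference; below is a minimal sketch of a loss-thresholding attack, where the per-example losses, the threshold, and the synthetic data are assumptions made for illustration.

    import numpy as np

    def membership_inference(per_example_losses: np.ndarray, threshold: float) -> np.ndarray:
        """Guess that examples with unusually low loss were in the training set.

        per_example_losses: losses of the released model on candidate examples.
        Returns a boolean array; True means "predicted training-set member".
        A model that memorizes its training data makes these guesses accurate,
        which is exactly the privacy concern described above.
        """
        return per_example_losses < threshold

    # Illustrative usage: members tend to receive lower loss than non-members.
    member_losses = np.random.exponential(0.2, size=100)
    nonmember_losses = np.random.exponential(1.0, size=100)
    guesses = membership_inference(np.concatenate([member_losses, nonmember_losses]), threshold=0.5)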

GuBERT: Grounded units for Self-Supervised Pre-training of Speech

Self-Supervised Learning (SSL) techniques have proved to be highly effective for representation learning in multiple modalities such as text, images, and, more recently, speech. In the speech domain, SSL approaches to pretraining have achieved state-of-the-art results in several downstream applications such as speech recognition (wav2vec, wav2vec 2.0), spoken language modeling (GSLM), and speech resynthesis (HuBERT). However, this approach requires massive amounts of speech data (thousands of hours of speech) and computational resources to train such large models. Also, while...
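
As a reference for how pretrained SSL speech models of this kind are typically consumed downstream, a minimal sketch using the publicly released wav2vec 2.0 checkpoint from the Hugging Face transformers library; the checkpoint name and the random waveform are illustrative, and this shows feature extraction rather than GuBERT's own pretraining.

    import torch
    from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

    # Publicly released SSL speech model, used here purely as an example checkpoint.
    extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
    model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")

    # One second of 16 kHz audio standing in for real speech.
    waveform = torch.randn(16_000).numpy()
    inputs = extractor(waveform, sampling_rate=16_000, return_tensors="pt")

    with torch.no_grad():
        features = model(**inputs).last_hidden_state   # (1, frames, 768) SSL representations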