Peter Bartlett

Investigations into the Complexity of Nonconvex Optimization

Machine-learning-based systems are set to play an increasing role in everyday life, owing to the growing abundance of large-scale training data and the use of sophisticated statistical models such as neural networks. In light of these trends, much attention has recently been devoted to understanding both how robust and reliable these methods are when deployed in the real world and the computational complexity of actually learning them from data. In our collaboration so far, we have adopted a theoretical perspective on each of these questions and have made partial progress towards...

Reinforcement Learning in High Dimensional Systems

The goal of this collaboration is to explore the limits and possibilities of sequential decision making in complex, high-dimensional environments. Compared with more classical settings such as supervised learning, relatively little is known regarding the minimal assumptions, representational conditions, and algorithmic principles needed to enable sample-efficient learning in complex control systems with rich sets of actions and observations. Given recent empirical breakthroughs in robotics and game playing (...

Mitigating Emergent Biases in Online Learning

The field of online learning and bandits deals with sequential decision-making problems, in which a learner makes a series of decisions aimed at minimizing (or maximizing) a loss (or reward) signal. Online learning algorithms form the basis of many data-driven systems used to drive consequential decisions in internet commerce, finance, and even policing. There has...
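
As a minimal illustration of this setting (the example is ours, not from the text above), a classic multiplicative-weights (Hedge) learner choosing among K actions maintains one weight per action and shrinks it exponentially in the observed losses:

```python
import numpy as np

def hedge(loss_matrix, eta=0.5):
    """Multiplicative-weights (Hedge) learner.

    loss_matrix: (T, K) array of per-round losses in [0, 1].
    Returns the learner's cumulative expected loss.
    """
    T, K = loss_matrix.shape
    weights = np.ones(K)
    total = 0.0
    for t in range(T):
        p = weights / weights.sum()               # play distribution over actions
        total += p @ loss_matrix[t]               # expected loss this round
        weights *= np.exp(-eta * loss_matrix[t])  # exponential weight update
    return total

# Toy instance: action 0 is consistently better, so Hedge should track it.
rng = np.random.default_rng(0)
losses = rng.uniform(0, 1, size=(500, 3))
losses[:, 0] *= 0.2                               # action 0 has low loss
learner_loss = hedge(losses)
best_action_loss = losses.sum(axis=0).min()       # best fixed action in hindsight
```

The standard guarantee for Hedge bounds the regret `learner_loss - best_action_loss` by `log(K)/eta + eta*T/8`, about 33 in this toy instance.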

LP-based Algorithms for Reinforcement Learning

Since its introduction a decade ago, relative entropy policy search (REPS) has demonstrated successful policy learning in a number of simulated and real-world robotic domains, and has provided algorithmic components used by many recently proposed reinforcement learning (RL) algorithms...
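
The core REPS reweighting step can be sketched as follows. This is an illustrative simplification, not the authors' implementation: in particular, the temperature `eta` is fixed by hand here, whereas REPS obtains it by minimizing a convex dual under a KL trust-region constraint.

```python
import numpy as np

def reps_weights(rewards, eta=1.0):
    """REPS-style sample reweighting: w_i proportional to exp(R_i / eta).

    Full REPS chooses eta by minimizing a convex dual subject to a KL
    trust-region constraint; it is fixed here for illustration.
    """
    r = np.asarray(rewards, dtype=float)
    z = (r - r.max()) / eta      # subtract the max to stabilize the exponent
    w = np.exp(z)
    return w / w.sum()           # normalized weights over the samples

# Higher-reward samples receive exponentially larger weight.
rewards = [1.0, 2.0, 4.0]
w = reps_weights(rewards, eta=1.0)
```

The subsequent policy update is then a weighted maximum-likelihood fit of the policy to the sampled actions under these weights.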

Generalization Bounds for Interpolating Deep Neural Networks

We study the training of finite-width two-layer smoothed ReLU networks for binary classification using the logistic loss. We show that gradient descent drives the training loss to zero if the initial loss is small enough. When the data satisfies certain cluster and separation conditions and the network is wide enough, we show that one step of gradient descent reduces the loss sufficiently that the first result applies. In contrast, all past analyses of fixed-width networks that we know of do not guarantee that the training loss goes to zero.
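
The training setup described above can be sketched as follows. This is an illustrative reconstruction, not the paper's construction: softplus stands in for the smoothed ReLU, the data, width, and initialization are arbitrary choices of ours, and the small-initial-loss condition of the theorem is not enforced.

```python
import numpy as np

def softplus(z):
    """A smoothed ReLU (numerically stable softplus)."""
    return np.log1p(np.exp(-np.abs(z))) + np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_loss(m):
    """Logistic loss as a function of the margin m = y * f(x)."""
    return np.log1p(np.exp(-m))

rng = np.random.default_rng(1)
n, d, width = 40, 5, 64
X = rng.normal(size=(n, d))
y = np.sign(X[:, 0] + 0.1)                  # a simple separable labeling
W = rng.normal(size=(width, d)) / np.sqrt(d)                # first layer
a = rng.choice([-1.0, 1.0], size=width) / np.sqrt(width)    # fixed output layer

def loss():
    return logistic_loss(y * (softplus(X @ W.T) @ a)).mean()

# Gradient descent on the first-layer weights only (output layer held fixed).
lr = 0.5
initial = loss()
for _ in range(200):
    pre = X @ W.T                           # (n, width) pre-activations
    m = y * (softplus(pre) @ a)             # margins
    g = -sigmoid(-m) * y                    # d(loss)/d f(x_i), times y_i
    # chain rule: softplus' = sigmoid, output layer contributes a_j
    grad_W = ((g[:, None] * sigmoid(pre)) * a[None, :]).T @ X / n
    W -= lr * grad_W
final = loss()
```

On this toy separable data, the training loss decreases monotonically toward zero, consistent with the regime the result describes.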

Update...

Agnostic Reinforcement Learning

Our goal is to understand the possibility of performing online control of an unknown dynamical system while making minimal assumptions regarding the underlying dynamics. In particular, we consider the problem of adaptively controlling a linear quadratic regulator whose state transitions lie inside a rich, nonlinear function space, such as an infinite-dimensional RKHS.

Update

Completion Report...