Jiantao Jiao

Automated State and Action Space Design for Multi-Objective Reinforcement Learning


While Reinforcement Learning (RL) has shown impressive performance in games (e.g., Go and chess [14,13], Dota 2 [2], and StarCraft II [18,17]), it remains challenging to use RL in real-world scenarios due to its high sample complexity and limited ability to generalize to unknown environments. To train an RL agent with traditional online methods, millions of interactions with the environment are often needed, which is definitely...

Statistically Efficient Offline RL with General Function Approximation


Offline reinforcement learning (RL) aims at learning effective policies from only a previously collected dataset of interactions, without access to further interactions with the environment. To handle datasets with partial coverage, conservatism has recently been shown to be necessary for offline RL, both in practice and in theory. Existing offline RL algorithms, however, either offer no theoretical guarantees or are impractical because of strong assumptions (such as tabular or linear parameterization) or computational intractability. We propose...