Using Deep Reinforcement Learning to Generalize Search in Games

Search methods have been instrumental in computing superhuman strategies for large-scale games [1,2,3]. However, existing search techniques are tabular and can therefore have trouble searching far into the future. This is particularly a problem in games with high stochasticity and/or imperfect information. For example, existing search techniques in Hanabi, which the AI community considers an interesting research problem [4], are only able to search one move ahead. Even searching two moves ahead is considered intractable for existing techniques. Since real-world situations are often highly stochastic and commonly involve hidden information, it is important to develop more scalable online search methods for real-world applications.

Researchers

Arnaud Fickinger, UC Berkeley
Stuart Russell, UC Berkeley
Noam Brown, FAIR

Overview

We propose using deep reinforcement learning to search further ahead than existing tabular techniques can. We would accomplish this by first training a policy network for the entire game. Then, at test time, whenever a decision must be made, we would fine-tune the policy for the particular situation the agent is in.
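The test-time fine-tuning step can be sketched as follows. This is a minimal illustration under stated assumptions, not the proposed implementation: a three-action softmax policy stands in for the policy network, a REINFORCE update with a running baseline stands in for the deep RL algorithm, and `rollout_return`, `finetune`, the learning rate, and the step count are hypothetical names and values chosen for the example.

```python
import math
import random

N_ACTIONS = 3


def softmax(logits):
    """Convert logits to action probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw an action index from a categorical distribution."""
    r, acc = rng.random(), 0.0
    for action, p in enumerate(probs):
        acc += p
        if r < acc:
            return action
    return len(probs) - 1


def rollout_return(action, situation, rng):
    """Hypothetical stochastic simulator (an assumption for this sketch):
    the action matching the current situation has mean return 1, all
    others have mean return 0, and every return is noisy."""
    mean = 1.0 if action == situation else 0.0
    return mean + rng.gauss(0.0, 0.5)


def finetune(blueprint_logits, situation, rng, steps=2000, lr=0.05):
    """Fine-tune a copy of the pre-trained policy for one situation.

    The blueprint is left untouched, so the next decision can start
    again from the policy trained for the entire game.
    """
    logits = list(blueprint_logits)  # work on a copy
    baseline = 0.0                   # running average of sampled returns
    for _ in range(steps):
        probs = softmax(logits)
        action = sample(probs, rng)
        ret = rollout_return(action, situation, rng)
        baseline += 0.05 * (ret - baseline)
        advantage = ret - baseline
        # REINFORCE: d log pi(action) / d logit_i = 1[i == action] - pi(i)
        for i in range(N_ACTIONS):
            indicator = 1.0 if i == action else 0.0
            logits[i] += lr * advantage * (indicator - probs[i])
    return logits


if __name__ == "__main__":
    rng = random.Random(0)
    blueprint = [0.5, 0.0, 0.0]  # pre-trained policy slightly prefers action 0
    tuned = finetune(blueprint, situation=2, rng=rng)
    print("blueprint policy: ", softmax(blueprint))
    print("fine-tuned policy:", softmax(tuned))
```

In this toy setting the rollouts come from a one-step simulator. In a game such as Hanabi they would instead be multi-step simulations starting from states consistent with the agent's current observations, which is where searching further ahead would come from.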

We aim to develop a method that outperforms existing search techniques in the benchmark game of Hanabi, and potentially in other benchmark games as well. Specifically, we aim to show that two-ply search in Hanabi is intractable with existing state-of-the-art techniques [3], but that our deep RL approach makes it both tractable and effective.

References

[1] Brown, Noam, and Tuomas Sandholm. "Superhuman AI for multiplayer poker." Science 365.6456 (2019): 885-890.

[2] Silver, David, et al. "A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play." Science 362.6419 (2018): 1140-1144.

[3] Lerer, Adam, et al. "Improving Policies via Search in Cooperative Partially Observable Games." AAAI. 2020.

[4] Bard, Nolan, et al. "The Hanabi challenge: A new frontier for AI research." Artificial Intelligence 280 (2020): 103216.

Contact

arnaud.fickinger@berkeley.edu
