Learning How to Play Bomberman with Deep Reinforcement and Imitation Learning

Ícaro Goulart; Aline Paes; Esteban Clua

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2019) 11863 LNCS 121-133

DOI: 10.1007/978-3-030-34644-7_10

Citations: 1

Readers: 17

Abstract

Making artificial agents that learn how to play is a long-standing goal in the area of Game AI. Recently, several successful cases have emerged, driven by Reinforcement Learning (RL) and neural network-based approaches. However, in most cases the results have been achieved by training directly from pixel frames with substantial computational resources. In this paper, we devise agents that learn how to play the popular game of Bomberman by relying on state representations and RL-based algorithms, without looking at the pixel level. To that end, we designed five vector-based state representations and implemented Bomberman on top of the Unity game engine through the ML-agents toolkit. We enhance the ML-agents algorithms by developing an Imitation-based learner (IL) that improves its model with the Actor-Critic Proximal Policy Optimization (PPO) method. We compared this approach with a PPO-only learner that uses either a Multi-Layer Perceptron or a Long Short-Term Memory network (LSTM). We conducted several training and tournament experiments in which the agents play against each other. The hybrid state representation and our IL-followed-by-PPO learning algorithm achieve the best overall quantitative results, and we also observed that the resulting agents learn correct Bomberman behavior.
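To make the idea of a vector-based (non-pixel) state concrete, the following is a minimal sketch of one possible encoding: each grid cell is turned into a one-hot slot and the slots are concatenated into a flat vector. The cell labels in CELL_TYPES, the function name one_hot_state, and the grid layout are all hypothetical and chosen for illustration; the abstract does not specify the five representations used in the paper, and this is not claimed to be one of them.

```python
from typing import List

# Hypothetical cell labels (an assumption for illustration only);
# the paper's actual state representations are not given in the abstract.
CELL_TYPES = ["empty", "wall", "block", "bomb", "agent", "enemy"]


def one_hot_state(grid: List[List[str]]) -> List[float]:
    """Flatten a grid of cell labels into a one-hot vector, row by row."""
    index = {name: i for i, name in enumerate(CELL_TYPES)}
    vector: List[float] = []
    for row in grid:
        for cell in row:
            slot = [0.0] * len(CELL_TYPES)
            slot[index[cell]] = 1.0  # mark the type occupying this cell
            vector.extend(slot)
    return vector


if __name__ == "__main__":
    # A made-up 3x3 board, not taken from the paper.
    grid = [
        ["wall", "wall", "wall"],
        ["wall", "agent", "bomb"],
        ["wall", "block", "empty"],
    ]
    state = one_hot_state(grid)
    print(len(state))  # 3 * 3 cells * 6 types = 54
```

A vector like this can be fed directly to a Multi-Layer Perceptron or an LSTM policy, which is the kind of input the abstract contrasts with raw pixel frames.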

Citation (APA)

Goulart, Í., Paes, A., & Clua, E. (2019). Learning How to Play Bomberman with Deep Reinforcement and Imitation Learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11863 LNCS, pp. 121–133). Springer. https://doi.org/10.1007/978-3-030-34644-7_10

Readers' Seniority

PhD / Post grad / Masters / Doc: 4 (44%)
Professor / Associate Prof.: 3 (33%)
Lecturer / Post doc: 2 (22%)

Readers' Discipline

Computer Science: 5 (56%)
Engineering: 2 (22%)
Decision Sciences: 1 (11%)
Physics and Astronomy: 1 (11%)