Reinforcement Learning with Information-Theoretic Actuation

Abstract

Reinforcement Learning formalises an embodied agent’s interaction with the environment through observations, rewards, and actions. But where do the actions come from? Actions are often taken to represent something external, such as the movement of a limb, a chess piece, or, more generally, the output of an actuator. In this work we explore and formalise a contrasting view, namely that actions are best thought of as the output of a sequence of internal choices with respect to an action model. This view is particularly well suited to leveraging recent advances in large sequence models as prior knowledge for multi-task reinforcement learning problems. Our main contribution is to show how to augment the standard MDP formalism with a sequential notion of internal action using information-theoretic techniques, and that this leads to self-consistent definitions of both internal and external action value functions.

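To make the internal/external distinction concrete, below is a minimal Python sketch of one way such self-consistency can be arranged. It is not the authors' code: the names (`action_model`, `q_external`, `q_internal`, `ALPHABET`, `ACTION_LENGTH`) and the toy uniform model are illustrative assumptions, and the expectation-based definition of internal values stands in for the paper's information-theoretic construction. The point it illustrates is that the value of a partial sequence of internal choices can be defined from the external action values and the action model, so that a completed sequence recovers the ordinary MDP action value exactly.

```python
# Sketch: external actions as sequences of internal choices under an
# action model, with internal values defined consistently with Q(s, a).
# All names and the toy environment are hypothetical, for illustration.

ALPHABET = ["0", "1"]
ACTION_LENGTH = 3  # every external action is a length-3 string


def action_model(symbol, prefix):
    """P(symbol | prefix): a toy uniform model standing in for a large
    pretrained sequence model used as prior knowledge."""
    return 1.0 / len(ALPHABET)


def q_external(state, action):
    """Standard MDP action value Q(s, a) for a complete external action.
    Toy stand-in: reward the number of '1' symbols in the action."""
    return float(action.count("1"))


def q_internal(state, prefix):
    """Internal action value of a partial sequence of internal choices:
    the expected external value over all completions of `prefix`,
    weighted by the action model. By construction, a full-length prefix
    recovers q_external exactly, giving self-consistent definitions."""
    if len(prefix) == ACTION_LENGTH:
        return q_external(state, prefix)
    return sum(action_model(s, prefix) * q_internal(state, prefix + s)
               for s in ALPHABET)


state = None  # single-state toy example
print(q_internal(state, ""))     # before any internal choice: 1.5
print(q_internal(state, "11"))   # after two '1' choices: 2.5
print(q_internal(state, "111"))  # equals q_external(state, "111") = 3.0
```

Under this definition, an agent can act greedily one internal choice at a time (picking the symbol maximising `q_internal(state, prefix + symbol)`), which is how token-level decisions of a sequence model can be treated as actions in their own right.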
Cite

APA

Catt, E., Hutter, M., & Veness, J. (2023). Reinforcement Learning with Information-Theoretic Actuation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13539 LNAI, pp. 188–198). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-19907-3_18
