Reinforcement Learning with Information-Theoretic Actuation

Abstract

Reinforcement Learning formalises an embodied agent’s interaction with the environment through observations, rewards, and actions. But where do the actions come from? Actions are often taken to represent something external, such as the movement of a limb, a chess piece, or, more generally, the output of an actuator. In this work we explore and formalise a contrasting view, namely that actions are best thought of as the output of a sequence of internal choices with respect to an action model. This view is particularly well suited to leveraging recent advances in large sequence models as prior knowledge for multi-task reinforcement learning problems. Our main contribution is to show how to augment the standard MDP formalism with a sequential notion of internal action using information-theoretic techniques, and that this leads to self-consistent definitions of both internal and external action value functions.

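To make the internal/external distinction concrete, below is a minimal Python sketch of one way such self-consistency can be arranged. It is not the authors' code: the names (`action_model`, `q_external`, `q_internal`, `ALPHABET`, `ACTION_LENGTH`) and the toy uniform model are illustrative assumptions, and the expectation-based definition of internal values stands in for the paper's information-theoretic construction. The point it illustrates is that the value of a partial sequence of internal choices can be defined from the external action values and the action model, so that a completed sequence recovers the ordinary MDP action value exactly.

```python
# Sketch: external actions as sequences of internal choices under an
# action model, with internal values defined consistently with Q(s, a).
# All names and the toy environment are hypothetical, for illustration.

ALPHABET = ["0", "1"]
ACTION_LENGTH = 3  # every external action is a length-3 string


def action_model(symbol, prefix):
    """P(symbol | prefix): a toy uniform model standing in for a large
    pretrained sequence model used as prior knowledge."""
    return 1.0 / len(ALPHABET)


def q_external(state, action):
    """Standard MDP action value Q(s, a) for a complete external action.
    Toy stand-in: reward the number of '1' symbols in the action."""
    return float(action.count("1"))


def q_internal(state, prefix):
    """Internal action value of a partial sequence of internal choices:
    the expected external value over all completions of `prefix`,
    weighted by the action model. By construction, a full-length prefix
    recovers q_external exactly, giving self-consistent definitions."""
    if len(prefix) == ACTION_LENGTH:
        return q_external(state, prefix)
    return sum(action_model(s, prefix) * q_internal(state, prefix + s)
               for s in ALPHABET)


state = None  # single-state toy example
print(q_internal(state, ""))     # before any internal choice: 1.5
print(q_internal(state, "11"))   # after two '1' choices: 2.5
print(q_internal(state, "111"))  # equals q_external(state, "111") = 3.0
```

Under this definition, an agent can act greedily one internal choice at a time (picking the symbol maximising `q_internal(state, prefix + symbol)`), which is how token-level decisions of a sequence model can be treated as actions in their own right.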
Cite

APA

Catt, E., Hutter, M., & Veness, J. (2023). Reinforcement Learning with Information-Theoretic Actuation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13539 LNAI, pp. 188–198). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-19907-3_18
