In reinforcement learning of long-term tasks, learning efficiency can deteriorate when an agent's probabilistic action selection causes too many mistakes before learning reaches its goal. We propose a new type of state, the fixed mode state, to which a normal state shifts once it has received sufficient reward; a fixed mode state chooses actions greedily, eliminating the randomness of action selection and increasing efficiency. We first propose an algorithm that combines penalty-avoiding rational policy making and online profit sharing with fixed mode states. We then describe the target system and the design of the learning controller. In simulation, the learning task is to stabilize biped walking by using the learning controller to modify a robot's waist trajectory. Finally, we discuss the simulation results and the effectiveness of our proposal. © 2012 Springer-Verlag.
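The fixed-mode mechanism described in the abstract can be illustrated with a minimal sketch. Here, a state accumulates reward and, once it exceeds a threshold, switches from weight-proportional (roulette) selection, as used in profit sharing, to greedy selection. The class name, the threshold `FIXED_THRESHOLD`, and the weight structure are illustrative assumptions, not details from the paper.

```python
import random

FIXED_THRESHOLD = 10.0  # assumed "sufficient reward" threshold (hypothetical value)

class State:
    """Sketch of a state that shifts into fixed mode after sufficient reward."""

    def __init__(self, action_weights):
        self.weights = dict(action_weights)  # action -> profit-sharing weight
        self.accumulated_reward = 0.0

    @property
    def fixed(self):
        # A normal state becomes a fixed mode state once it has
        # received sufficient reward.
        return self.accumulated_reward >= FIXED_THRESHOLD

    def select_action(self, rng=random):
        if self.fixed:
            # Fixed mode: greedy choice, no randomness.
            return max(self.weights, key=self.weights.get)
        # Normal mode: roulette selection proportional to weights.
        total = sum(self.weights.values())
        r = rng.uniform(0.0, total)
        cum = 0.0
        for action, w in self.weights.items():
            cum += w
            if r <= cum:
                return action
        return action  # floating-point edge case: return last action

s = State({"left": 1.0, "right": 3.0})
s.accumulated_reward = 12.0   # above threshold, so the state is fixed
print(s.select_action())      # greedy: "right"
```

In fixed mode the selection is deterministic, which is what removes the exploratory mistakes the abstract attributes to probabilistic action selection late in learning.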
Kuroda, S., Miyazaki, K., & Kobayashi, H. (2012). Introduction of fixed mode states into online profit sharing and its application to waist trajectory generation of biped robot. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7188 LNAI, pp. 297–308). https://doi.org/10.1007/978-3-642-29946-9_29