Configuration Path Control

Sergey Pankov
International Journal of Control, Automation, and Systems, vol. 21, no. 1, pp. 306-317, 2023

Abstract: Reinforcement learning methods often produce brittle policies – policies that perform well during training, but generalize poorly beyond their direct training experience, thus becoming unstable under small disturbances. To address this issue, we propose a method for stabilizing a control policy in the space of configuration paths. It is applied post-training and relies purely on the data produced during training, as well as on an instantaneous control-matrix estimation. The approach is evaluated empirically on a planar bipedal walker subjected to a variety of perturbations. The control policies obtained via reinforcement learning are compared against their stabilized counterparts. Across different experiments, we find a two- to four-fold increase in stability, measured in terms of the perturbation amplitudes. We also provide a zero-dynamics interpretation of our approach.

Keywords: Biped, configuration path control, reinforcement learning, stability, virtual constraints, zero dynamics.
