In this paper, a deep reinforcement learning (DRL) approach based on generative adversarial imitation learning (GAIL) and long short-term memory (LSTM) is proposed to solve tracking control problems for robotic manipulators subject to saturation constraints and random disturbances, without learning the manipulator's dynamic or kinematic model. Specifically, the torque and joint angle are limited to prescribed ranges. First, to cope with instability during training and obtain a stable policy, soft actor-critic (SAC) is combined with LSTM. An LSTM architecture designed for robotic manipulator systems captures the temporal trends of the joint positions more comprehensively, thereby reducing instability when training manipulators for tracking control tasks. Second, the policy obtained by SAC-LSTM serves as expert data for GAIL, which learns an improved control policy. The resulting SAC-LSTM-GAIL (SL-GAIL) algorithm does not spend time exploring the unknown environment; instead, it learns the control strategy directly from stable expert demonstrations. Finally, simulation results demonstrate that the proposed SL-GAIL algorithm effectively accomplishes the end-effector tracking task and exhibits superior stability, compared with other algorithms, in a test environment with disturbances.
Keywords: deep reinforcement learning; end effector; generative adversarial imitation learning; long short-term memory; robotics tracking control.
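The imitation step summarized above rests on GAIL's discriminator: a classifier is trained to distinguish expert (here, SAC-LSTM) state-action pairs from the learner's, and the learner is rewarded with -log(1 - D(s, a)), which grows as its behavior becomes indistinguishable from the expert's. The following is a minimal NumPy sketch of that step on synthetic data; the dimensions, distributions, learning rate, and the logistic form of the discriminator are all illustrative assumptions, not details from the paper.

```python
import numpy as np

# Minimal sketch of the GAIL discriminator step (illustrative only:
# synthetic data stands in for real (state, action) pairs).
rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Stand-ins: "expert" pairs (e.g. from a trained SAC-LSTM policy) cluster
# in one region of the 4-D (state, action) space; the learner starts elsewhere.
expert = rng.normal(loc=1.0, scale=0.3, size=(256, 4))
learner = rng.normal(loc=-1.0, scale=0.3, size=(256, 4))

# Logistic discriminator D(s, a) = sigmoid(w.x + b), trained with binary
# cross-entropy to output 1 on expert pairs and 0 on learner pairs.
w = np.zeros(4)
b = 0.0
lr = 0.1
for _ in range(200):
    d_exp = sigmoid(expert @ w + b)
    d_lrn = sigmoid(learner @ w + b)
    # BCE gradients w.r.t. the logits: (D - 1) on expert data, D on learner data.
    grad_w = expert.T @ (d_exp - 1.0) / len(expert) + learner.T @ d_lrn / len(learner)
    grad_b = np.mean(d_exp - 1.0) + np.mean(d_lrn)
    w -= lr * grad_w
    b -= lr * grad_b

# GAIL surrogate reward r(s, a) = -log(1 - D(s, a)): large for pairs the
# discriminator judges expert-like, near zero for clearly non-expert pairs.
reward_expert_like = -np.log(1.0 - sigmoid(expert @ w + b) + 1e-8).mean()
reward_learner = -np.log(1.0 - sigmoid(learner @ w + b) + 1e-8).mean()
```

In the full algorithm this reward would drive a policy-gradient update of the learner while the discriminator is retrained on fresh rollouts; the sketch only shows why expert-like behavior earns higher imitation reward.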