In this paper, a deep reinforcement learning (DRL) approach based on generative adversarial imitation learning (GAIL) and long short-term memory (LSTM) is proposed to solve tracking control problems for robotic manipulators subject to saturation constraints and random disturbances, without learning the manipulator's dynamic or kinematic model. Specifically, the torque and joint angle are limited to prescribed ranges. First, to cope with instability during training and obtain a stable policy, soft actor-critic (SAC) is combined with LSTM. An LSTM architecture designed for robotic manipulator systems captures the temporal trends of the joint positions more comprehensively, thereby reducing instability when training manipulators for tracking control tasks. Second, the policy obtained by SAC-LSTM serves as expert data for GAIL, which learns an improved control policy. The resulting SAC-LSTM-GAIL (SL-GAIL) algorithm does not spend time exploring the unknown environment; instead, it learns the control strategy directly from stable expert demonstrations. Finally, simulation results demonstrate that the proposed SL-GAIL algorithm effectively accomplishes the end-effector tracking task and exhibits superior stability, compared with other algorithms, in a test environment with disturbances.
Keywords: deep reinforcement learning; end effector; generative adversarial imitation learning; long short-term memory; robotics tracking control.
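The imitation step summarized above rests on GAIL's discriminator: a classifier is trained to distinguish expert (here, SAC-LSTM) state-action pairs from the learner's, and the learner is rewarded with -log(1 - D(s, a)), which grows as its behavior becomes indistinguishable from the expert's. The following is a minimal NumPy sketch of that step on synthetic data; the dimensions, distributions, learning rate, and the logistic form of the discriminator are all illustrative assumptions, not details from the paper.

```python
import numpy as np

# Minimal sketch of the GAIL discriminator step (illustrative only:
# synthetic data stands in for real (state, action) pairs).
rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Stand-ins: "expert" pairs (e.g. from a trained SAC-LSTM policy) cluster
# in one region of the 4-D (state, action) space; the learner starts elsewhere.
expert = rng.normal(loc=1.0, scale=0.3, size=(256, 4))
learner = rng.normal(loc=-1.0, scale=0.3, size=(256, 4))

# Logistic discriminator D(s, a) = sigmoid(w.x + b), trained with binary
# cross-entropy to output 1 on expert pairs and 0 on learner pairs.
w = np.zeros(4)
b = 0.0
lr = 0.1
for _ in range(200):
    d_exp = sigmoid(expert @ w + b)
    d_lrn = sigmoid(learner @ w + b)
    # BCE gradients w.r.t. the logits: (D - 1) on expert data, D on learner data.
    grad_w = expert.T @ (d_exp - 1.0) / len(expert) + learner.T @ d_lrn / len(learner)
    grad_b = np.mean(d_exp - 1.0) + np.mean(d_lrn)
    w -= lr * grad_w
    b -= lr * grad_b

# GAIL surrogate reward r(s, a) = -log(1 - D(s, a)): large for pairs the
# discriminator judges expert-like, near zero for clearly non-expert pairs.
reward_expert_like = -np.log(1.0 - sigmoid(expert @ w + b) + 1e-8).mean()
reward_learner = -np.log(1.0 - sigmoid(learner @ w + b) + 1e-8).mean()
```

In the full algorithm this reward would drive a policy-gradient update of the learner while the discriminator is retrained on fresh rollouts; the sketch only shows why expert-like behavior earns higher imitation reward.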