Optimal Control Design of Rotation Floating Space Robot During Grasping Operation: An RL Approach

Roshan Sah, Tata Consultancy Services (TCS)-Research
Kaushik Das, Tata Consultancy Services (TCS)-Research

Abstract

This paper introduces a model-free, learning-based controller for an orbiting space robot engaged in proximity and grasping operations. Proximity control techniques are crucial for space robots, specifically robotic manipulators mounted on a floating satellite base, because they enable tasks such as in-orbit servicing and debris grasping. Traditional controllers struggle to manage the coupled motion of these robots because of the floating nature of the satellite base. Conventional controllers have been used for coupled control of nonlinear systems, but their complexity grows with the robot’s degrees of freedom. In contrast, model-free Reinforcement Learning (RL) has successfully learned intricate policies for robotic manipulation. However, existing research has predominantly focused on controlling the space robotic arm and has neglected the satellite base. This paper addresses that gap by proposing a coupled controller for the space robotic arm and the satellite base orientation. This simultaneous control is essential for the proper functioning of onboard sensors and equipment with specific pointing requirements. The Proximal Policy Optimization (PPO) algorithm is employed to control the position (3 DOF) and orientation (3 DOF) of the end-effector while also controlling the orientation of the satellite base (3 DOF). To the best of the authors’ knowledge, this paper marks the first application of a model-free RL method to the simultaneous 9 DOF control of a floating space robot. Furthermore, the paper proposes modifications to standard RL reward functions to improve the learning algorithm’s performance. Policy training is carried out in a PyBullet environment, and the paper presents the performance of a trained policy for a rotation-floating space robot with standard reward functions.
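As a rough illustration of how a 9 DOF tracking objective could be folded into a single scalar reward, the sketch below combines the three error terms named in the abstract: end-effector position, end-effector orientation, and satellite base orientation. It is not the reward used in the paper, which the abstract does not specify. The function names, the weights `w_pos`, `w_ee`, `w_base`, and the choice of a negative weighted sum are all assumptions made for illustration. Quaternions are taken in (x, y, z, w) order, the convention PyBullet uses.

```python
import numpy as np


def quat_angle(q_a, q_b):
    """Smallest rotation angle (rad) between two unit quaternions (x, y, z, w)."""
    # abs() treats q and -q as the same rotation.
    dot = abs(float(np.dot(q_a, q_b)))
    return 2.0 * float(np.arccos(np.clip(dot, 0.0, 1.0)))


def tracking_reward(ee_pos, ee_quat, base_quat,
                    goal_pos, goal_quat, base_goal_quat,
                    w_pos=1.0, w_ee=0.5, w_base=0.5):
    """Hypothetical dense reward: negative weighted sum of the three errors.

    The weights are illustrative placeholders, not values from the paper.
    """
    # 3 DOF: end-effector position error (Euclidean distance).
    pos_err = float(np.linalg.norm(np.asarray(ee_pos) - np.asarray(goal_pos)))
    # 3 DOF: end-effector orientation error (rotation angle).
    ee_err = quat_angle(np.asarray(ee_quat), np.asarray(goal_quat))
    # 3 DOF: satellite base orientation error (rotation angle).
    base_err = quat_angle(np.asarray(base_quat), np.asarray(base_goal_quat))
    return -(w_pos * pos_err + w_ee * ee_err + w_base * base_err)
```

In a sketch like this, the reward is zero only when all nine controlled degrees of freedom match their targets, so a policy cannot score well by positioning the arm while letting the base drift. The relative weights would need tuning for any real setup.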

 
Aug 8th, 9:00 AM


Utah State University, Logan, UT
