Session
Pre-Conference Workshop Session 3: Year in Review - Research & Academia
Location
Utah State University, Logan, UT
Abstract
As Earth observation satellites, the Diwata microsatellites require highly accurate target pointing. Because they fly in low orbit, they can also experience strong external disturbances. Current attitude control methods have proven effective, but their performance is sensitive to changes in control and mass parameters. In this paper, we explore deep reinforcement learning (RL) for attitude control. We train the RL agent using Diwata’s simulator, the Mission, Attitude, and Telemetry Analysis (MATA) software. We implemented two RL algorithms: Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC). We then simulated different scenarios and compared the performance of these algorithms with that of Diwata’s current attitude controller, a Proportional-Integral-Derivative (PID) controller. Our results show that reinforcement learning can outperform traditional controllers in terms of settling time, overshoot, and stability. These results can help address the limitations of conventional attitude controllers and enable satellite engineers to design a better Attitude Determination and Control System (ADCS).
MATA-RL: Continuous Reaction Wheel Attitude Control Using the MATA Simulation Software and Reinforcement Learning