Title

MATA-RL: Continuous Reaction Wheel Attitude Control Using the MATA Simulation Software and Reinforcement Learning

Session

Pre-Conference Workshop Session 3: Year in Review - Research & Academia

Location

Utah State University, Logan, UT

Abstract

As Earth observation satellites, the Diwata microsatellites need a high degree of target-pointing accuracy. Additionally, operating in low Earth orbit, they are exposed to strong external disturbances. Current attitude control methods have proven effective, but they are sensitive to changes in control and mass parameters. In this paper, we explore the use of Deep Reinforcement Learning (RL) for attitude control. We leverage Diwata's simulator, the Mission, Attitude, and Telemetry Analysis (MATA) software, to train the RL agent. We implemented two RL algorithms, Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC), then simulated different scenarios and compared their performance to that of Diwata's current attitude controller, a Proportional-Integral-Derivative (PID) controller. Our results show that reinforcement learning can outperform traditional controllers in terms of settling time, overshoot, and stability. These results will help address the shortcomings of conventional attitude controllers and enable satellite engineers to design a better Attitude Determination and Control System (ADCS).
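As a rough illustration of the setup the abstract describes, the sketch below trains PPO and SAC agents on a toy quaternion-pointing environment. Everything in it is an assumption for illustration: the environment AttitudeEnv, its inertia values, torque limits, episode length, reward, and timestep budget are invented stand-ins for the MATA simulator (which is not reproduced here), and Stable-Baselines3 is used as one common PPO/SAC implementation, not necessarily the authors' choice.

```python
# Minimal sketch, assuming Gymnasium and Stable-Baselines3 are installed.
# The dynamics, reward, and all numeric constants are illustrative only.
import gymnasium as gym
import numpy as np
from gymnasium import spaces
from stable_baselines3 import PPO, SAC


class AttitudeEnv(gym.Env):
    """Toy rigid-body pointing environment standing in for MATA.

    Observation: attitude quaternion (4) + body rates (3).
    Action: reaction-wheel torque about each body axis (N*m).
    """

    def __init__(self, dt=0.1, inertia=(0.05, 0.05, 0.06)):
        self.dt = dt
        self.I = np.asarray(inertia)  # principal inertias, kg*m^2 (assumed)
        self.observation_space = spaces.Box(
            -np.inf, np.inf, shape=(7,), dtype=np.float32)
        self.action_space = spaces.Box(
            -0.005, 0.005, shape=(3,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        # Random initial attitude error and small tumbling rates.
        q = self.np_random.normal(size=4)
        self.q = q / np.linalg.norm(q)
        self.w = self.np_random.uniform(-0.05, 0.05, size=3)
        self.t = 0
        return self._obs(), {}

    def step(self, action):
        tau = np.clip(action, self.action_space.low, self.action_space.high)
        # Euler's rotational equation: I*dw/dt = tau - w x (I*w)
        dw = (tau - np.cross(self.w, self.I * self.w)) / self.I
        self.w = self.w + dw * self.dt
        # Quaternion kinematics: dq/dt = 0.5 * q ⊗ [0, w]
        qw, qx, qy, qz = self.q
        wx, wy, wz = self.w
        dq = 0.5 * np.array([
            -qx * wx - qy * wy - qz * wz,
             qw * wx + qy * wz - qz * wy,
             qw * wy - qx * wz + qz * wx,
             qw * wz + qx * wy - qy * wx,
        ])
        self.q = self.q + dq * self.dt
        self.q /= np.linalg.norm(self.q)
        self.t += 1
        # Penalize pointing error (1 - |q_w|) and residual body rates.
        err = 1.0 - abs(self.q[0])
        reward = -(err + 0.1 * np.linalg.norm(self.w))
        return self._obs(), float(reward), False, self.t >= 600, {}

    def _obs(self):
        return np.concatenate([self.q, self.w]).astype(np.float32)


# Train PPO and SAC on the same environment, mirroring the comparison
# in the abstract (the timestep budget here is purely illustrative).
for Algo in (PPO, SAC):
    model = Algo("MlpPolicy", AttitudeEnv(), verbose=0)
    model.learn(total_timesteps=20_000)
```

In the paper's actual setup, the dynamics, disturbances, and reward come from the MATA simulator rather than this toy model, and the trained policies are evaluated against the PID baseline on settling time, overshoot, and stability.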

SSC21-WKIII-04.pdf (2850 kB)

Date

Aug 7th, 12:00 AM
