Session

Session I: Advanced Technologies 1 - Research & Academia

Location

Salt Palace Convention Center, Salt Lake City, UT

Abstract

As space missions generate increasingly vast and complex datasets, efficient data management and processing solutions are critical. At the Michigan eXploration Lab (MXL) at the University of Michigan, where we design, build, and operate CubeSats, we have adopted big data technologies, specifically a Hadoop- and Spark-based cluster, to address challenges in handling and analyzing large-scale data. These challenges include processing raw I/Q sample recordings from CubeSat passes and suborbital missions, as well as analyzing mission-long spacecraft telemetry. The cluster leverages the distributed storage capabilities of the Hadoop Distributed File System (HDFS) and the in-memory data processing power of Apache Spark, supported by Apache Hive for structured data management. Additionally, we developed a custom Out-of-Tree (OOT) module for GNU Radio that enables seamless read and write operations to and from HDFS directly within the GNU Radio environment. This module introduces two blocks, HDFS Sink and HDFS Source, which mimic the built-in File Sink and File Source blocks but interact with HDFS. Published on our lab’s GitHub as a contribution to the community, the module extends GNU Radio’s versatility in handling large-scale data workflows. This paper explores the motivation behind adopting big data tools, details the cluster’s hardware and software setup, and presents current applications, including processing radio signal data, analyzing telemetry, and serving as a data lake for an on-premises machine learning platform. These advancements demonstrate the transformative potential of scalable, data-driven engineering in space systems. The paper concludes with a discussion of the challenges encountered in implementing these technologies and an outlook on future directions for the cluster.
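The HDFS Sink and HDFS Source blocks are described as drop-in analogues of GNU Radio’s built-in File Sink and File Source. A minimal sketch of how such blocks might be wired into a Python flowgraph follows; the import name gr_hdfs, the constructors hdfs_source and hdfs_sink, and the HDFS URIs are illustrative assumptions, not the module’s published API.

    # Sketch only: gr_hdfs, hdfs_source, hdfs_sink, and the HDFS paths are
    # assumed names, not confirmed against the MXL module on GitHub.
    from gnuradio import gr

    import gr_hdfs  # MXL's OOT module (import name assumed)

    class ReplayPass(gr.top_block):
        def __init__(self):
            gr.top_block.__init__(self, "Replay recorded I/Q from HDFS")
            # Mirrors blocks.file_source(itemsize, path): streams complex64
            # I/Q samples out of HDFS instead of the local filesystem.
            self.src = gr_hdfs.hdfs_source(
                gr.sizeof_gr_complex, "hdfs://namenode:8020/iq/pass_0042.iq")
            # Mirrors blocks.file_sink(itemsize, path); a demodulation chain
            # could be inserted between the source and this sink.
            self.snk = gr_hdfs.hdfs_sink(
                gr.sizeof_gr_complex, "hdfs://namenode:8020/iq/pass_0042_copy.iq")
            self.connect(self.src, self.snk)

    if __name__ == "__main__":
        ReplayPass().run()

On the analysis side, a Spark session with Hive support can query the structured telemetry tables the abstract mentions. The sketch below assumes an illustrative Hive table telemetry.beacons with received_at and batt_voltage columns; actual schemas on the MXL cluster will differ.

    # Sketch only: the database, table, and column names are assumptions.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = (SparkSession.builder
             .appName("telemetry-summary")
             .enableHiveSupport()  # Hive manages the structured telemetry tables
             .getOrCreate())

    # Daily battery-voltage summary over the whole mission archive.
    daily = (spark.table("telemetry.beacons")
             .groupBy(F.to_date("received_at").alias("day"))
             .agg(F.avg("batt_voltage").alias("avg_batt_v"),
                  F.count("*").alias("beacon_count"))
             .orderBy("day"))
    daily.show()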

Document Type

Event

Presentation Date

Aug 12th, 9:00 AM

Title

Big Data Solutions for CubeSat Mission Operations: A Case Study From the Michigan eXploration Lab
