Date of Award:


Document Type:


Degree Name:

Master of Science (MS)


Computer Science

Committee Chair(s)

Shah Muhammad Hamdi


Shah Muhammad Hamdi


Soukaina Filali Boubrahimi


Steve Petruzza


Solar flares are sudden bursts of electromagnetic radiation from the Sun's surface, caused by changes in the magnetic field states of solar active regions. Solar flares can harm Earth and its surrounding space environment in ways that range from disruption of electronic communication to radiation-exposure health risks for astronauts. In this work, we address the solar flare prediction problem on magnetic field parameter-based multivariate time series (MVTS) data using multiple state-of-the-art machine learning classifiers: MINImally RandOm Convolutional KErnel Transform (MINIROCKET), Support Vector Machine (SVM), Canonical Interval Forest (CIF), Multiple Representations SEQuence Learner (MR-SEQL), a Long Short-Term Memory (LSTM)-based deep learning model, and the Transformer model. We report results on the Space Weather ANalytics for Solar Flares (SWAN-SF) benchmark dataset, a partitioned collection of MVTS data of active region magnetic field parameters spanning more than 9 years of operation of the Solar Dynamics Observatory (SDO). The MVTS instances of the SWAN-SF dataset are labeled with GOES X-ray flux-based flare classes and exhibit extreme class imbalance because major flaring events (e.g., X- and M-class) are rare. To reduce the dimensionality of the data, we also applied data preprocessing steps such as statistical summarization. We used the true skill statistic (TSS) and realizations of the Heidke Skill Score (HSS; HSS2) as performance validation metrics on this class-imbalanced dataset. Finally, we demonstrate the advantages of the MVTS learning algorithm MINIROCKET, which produces better results than the other classifiers without requiring preprocessing steps that are otherwise essential, such as normalization, statistical summarization, and class imbalance handling heuristics.
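The two validation metrics named in the abstract can be illustrated with a minimal Python sketch. It assumes the binary (flare vs. no-flare) definitions of TSS and HSS2 commonly used in the flare forecasting literature; the abstract does not give the formulas, so this is not necessarily the exact implementation used in the work. The function names and the confusion-matrix counts are hypothetical and are not results from the thesis.

```python
def true_skill_statistic(tp: int, fn: int, fp: int, tn: int) -> float:
    """TSS = recall - false alarm rate, assuming a binary confusion matrix."""
    recall = tp / (tp + fn)            # fraction of flares correctly predicted
    false_alarm_rate = fp / (fp + tn)  # fraction of non-flares predicted as flares
    return recall - false_alarm_rate


def heidke_skill_score_2(tp: int, fn: int, fp: int, tn: int) -> float:
    """HSS2: improvement over random chance, assuming the common binary form."""
    positives = tp + fn  # actual flaring instances
    negatives = fp + tn  # actual non-flaring instances
    numerator = 2 * (tp * tn - fn * fp)
    denominator = positives * (fn + tn) + (tp + fp) * negatives
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical, illustrative counts for an imbalanced test set.
    tp, fn, fp, tn = 40, 10, 50, 900
    print(f"TSS  = {true_skill_statistic(tp, fn, fp, tn):.3f}")
    print(f"HSS2 = {heidke_skill_score_2(tp, fn, fp, tn):.3f}")
```

Under these definitions, a classifier that always predicts the majority (non-flaring) class scores 0 on both metrics even though its accuracy is high, which is why such metrics suit class-imbalanced data better than accuracy does.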