Classification of Major Solar Flares from Extremely Imbalanced Multivariate Time Series Data Using Minimally Random Convolutional Kernel Transform

Abstract: Solar flares are sudden bursts of electromagnetic radiation from the Sun's surface, caused by changes in the magnetic field states of solar active regions. Solar flares can harm Earth and its surrounding space environment in various ways, ranging from disruption of electronic communication to radiation exposure-based health risks to astronauts. In this paper, we address the solar flare prediction problem from magnetic field parameter-based multivariate time series (MVTS) data using multiple state-of-the-art machine learning classifiers, including the MINImally RandOm Convolutional KErnel Transform (MiniRocket), Support Vector Machine (SVM), Canonical Interval Forest (CIF), Multiple Representations Sequence Learner (Mr-SEQL), and a Long Short-Term Memory (LSTM)-based deep learning model. Our experiments are conducted on the Space Weather Analytics for Solar Flares (SWAN-SF) benchmark dataset, a partitioned collection of MVTS data of active region magnetic field parameters spanning over nine years of operation of the Solar Dynamics Observatory (SDO). The MVTS instances of the SWAN-SF dataset are labeled with GOES X-ray flux-based flare classes and exhibit extreme class imbalance because of the rarity of major flaring events (e.g., X and M). As the performance validation metric for this class-imbalanced dataset, we use the True Skill Statistic (TSS) score. Finally, we demonstrate the advantages of the MVTS learning algorithm MiniRocket, which outperformed the aforementioned classifiers without the need for otherwise essential data preprocessing steps such as normalization, statistical summarization, and class imbalance handling heuristics.


Introduction
Solar flares are strong outbursts of radiation that result from the sudden release of the Sun's stored magnetic energy. The duration of these flares ranges from a few minutes to several hours. Since 1974, the National Oceanic and Atmospheric Administration (NOAA) has been monitoring and classifying the X-ray output of these flares within the 1-8 Å wavelength range using Geostationary Operational Environmental Satellites (GOES). Based on their peak soft X-ray emissions, flares are grouped logarithmically into classes A, B, C, M, and X, ascending from less powerful to more powerful and starting at $10^{-8}$ W m$^{-2}$ [1]. The most intense flares are classified as X-class; they are roughly 100 times stronger than C-class flares and 10 times stronger than M-class flares. Each class is further divided into nine subclasses that scale the intensity. When background X-ray levels are high, lower-intensity A- and B-class flares can be difficult to identify, whereas flares above the C2 threshold, in particular, are typically identified as C-class and above. Because they can cause damage, M- and X-class flares are the most severe flares and are typically the focus of space weather forecasting.
X-class and M-class flares have the potential to cause serious hazards, such as global radio blackouts and long-lasting radiation storms in the upper atmosphere. Astronauts, flight attendants, and passengers could be exposed to significant risks. Solar flare damage could result in trillions of dollars in repair and replacement costs, as mentioned in the study by [2]. However, by implementing appropriate safety measures and deploying a reliable system for predicting solar flares, it is possible to significantly reduce the extent of the damage.
Classifying major solar flare events is a challenging task due to their rarity. According to NASA, the frequency of solar flares is determined by the solar cycle, which lasts around 11 years [3]. Flares can occur multiple times a day during periods of maximum solar activity, and fewer than once a week during quieter times. Furthermore, M1-class flares can occur up to two thousand times per cycle, whereas more severe flares, like those in the X10 class, are extremely infrequent, occurring on average only eight times per cycle [4]. This imbalance in the class distribution makes it challenging for traditional classifiers to predict the minority class with high accuracy. Solar flare data do not follow the balanced distribution of samples across classes that is assumed by most classification-based machine learning algorithms [5,6].
The dataset, which contains a range of time series parameters generated from solar photospheric magnetograms and NOAA's records of flares in active regions, makes solar flare forecasting even more challenging. The dataset also includes physics-based magnetic field parameters, originally acquired through the Space Weather HMI Active Region Patches (SHARP) data product [7]. The high dimensionality of the time series introduces another challenge because of the curse of dimensionality and the possible noisiness of multiple feature vectors. Classifiers that are capable of providing reliable accuracy on such imbalanced time series data are typically computationally expensive and require significant training time, even with relatively small datasets.
The field of astrophysics lacks a specific physical theory that explains the mechanism behind the occurrence of solar flares, which limits the ability to forecast and classify them [8,9]. Although several physics research teams are working to develop a theory for flare prediction, it remains uncertain whether these efforts will succeed. Given the rapid advancements in AI and machine learning, the most promising approach is to adopt a data-driven strategy using the active region (AR) parameters observed by the Solar Dynamics Observatory. The goal is to develop a model capable of demonstrating an empirical connection between AR parameters and flare occurrences.
Reference [1] provided a detailed analysis of the challenges associated with the SWAN-SF dataset, the most extensive dataset on solar flares, which includes MVTS-based photospheric magnetic field parameters of solar active regions. They addressed the extreme class imbalance and temporal coherence of the data, proposing several solutions. Initially, they extracted statistical features from each magnetic field parameter time series (median, standard deviation, skewness, and kurtosis, along with the last value of each series), which also helped to reduce data dimensionality and enhance scalability. They employed an SVM classifier to evaluate their flare prediction model and used undersampling and oversampling techniques to balance the class distribution in the preprocessing phase. At the classifier level, they adjusted the weighting of misclassification penalties to minimize both false positives and false negatives. To maintain temporal coherence, they avoided overlaps in the MVTS sequence by using 20 distinct pairs of testing and training data partitions. They assessed the robustness of their SVM model using the True Skill Statistic (TSS) and different forms of the Heidke Skill Score (HSS). However, the experimental settings utilized by [1] had certain limitations, such as the use of only five statistical features, which might not fully capture the complexities of the time series data, potentially leading to less accurate predictions. Additionally, they implemented a critical preprocessing step involving either undersampling or oversampling of the training data. While the previous methods relied on preprocessing through normalization and balancing, the proposed algorithm MiniRocket can achieve higher performance without these preprocessing steps.
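The statistical summarization described above can be sketched as follows. This is a minimal illustration rather than the implementation of [1]: the function names `summarize_series` and `summarize_mvts` are hypothetical, population (biased) moment estimators are assumed for the standard deviation, skewness, and kurtosis, and the toy values are made up.

```python
import statistics

def summarize_series(values):
    """Return (median, std, skewness, kurtosis, last value) for one parameter's series."""
    n = len(values)
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation (assumption)
    if std == 0:
        # Constant series: higher moments are set to 0 by convention in this sketch.
        skew, kurt = 0.0, 0.0
    else:
        skew = sum(((v - mean) / std) ** 3 for v in values) / n
        kurt = sum(((v - mean) / std) ** 4 for v in values) / n - 3.0  # excess kurtosis
    return (statistics.median(values), std, skew, kurt, values[-1])

def summarize_mvts(mvts):
    """mvts: list of N parameter series of length T -> flat vector of 5 * N features."""
    features = []
    for series in mvts:
        features.extend(summarize_series(series))
    return features

# Toy instance with N = 2 parameters and T = 5 time steps (made-up values).
toy = [[1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 2.0, 2.0, 2.0, 2.0]]
vector = summarize_mvts(toy)
```

Under these assumptions, an instance with 24 parameters is reduced from 60 × 24 values to a vector of 5 × 24 = 120 features, which illustrates both the dimensionality reduction and the information that such a summary may discard.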
In this paper, we refine our work in [10] with the following contributions:

• This study aims to assess the effectiveness of the MiniRocket [11] time series classifier, based on the MINImally RandOm Convolutional KErnel Transform, for real-time prediction of solar flares with minimal data manipulation. MiniRocket, an efficient variant of the ROCKET algorithm [12], attains high accuracy at lower computational cost by employing convolutional kernels to transform the input time series. The transformed features are then used to train a linear classifier. MiniRocket, being an almost deterministic version of ROCKET, performs much faster on bigger datasets while keeping an equivalent level of accuracy.

• The evaluation metrics, TSS and HSS₂ [14], are selected for comparison because they are the most commonly used performance metrics for flare prediction with class-imbalanced data. To address data overlap, we implement the 20-partition-pair strategy proposed by [1].

Related Work
Theo [15] was one of the first expert systems for solar flare prediction and relied on human input. It forecast various sorts of flares by combining sunspot and magnetic field parameters. Theo's rule-based flare prediction technique was adopted by the National Oceanic and Atmospheric Administration's (NOAA's) Space Environment Center (SEC) in 1987. There are two primary types of data-driven flare prediction systems used today: linear statistical methods and nonlinear statistical methods. They can be further divided into line-of-sight magnetogram-based models and vector magnetogram-based models. Vector magnetograms, which offer thorough full-disk magnetic field data, are usually thought to be more useful for parameterizing active regions, as mentioned in [16]. However, because of the lack of vector magnetograms, solar physicists primarily relied on line-of-sight magnetic data for flare predictions until NASA's Solar Dynamics Observatory was launched in 2010.
The primary goal of linear statistical research is to determine the magnetic characteristics of active regions (ARs) connected to solar flares. Reference [17] parameterized ARs using line-of-sight magnetograms and investigated the relationship between AR parameters and flare events. Three physical properties were measured by analyzing multiple SOHO/MDI longitudinal magnetograms: the number of singular points, the length of the neutral line, and the maximum horizontal gradient. These measurements, which indicate the complexity and non-potentiality of the photospheric magnetic field, were highly correlated with solar flare activity. According to their analysis, solar flare productivity increases with non-potentiality and complexity. A similar study [18] used Solar Geophysical Data (SGD) flare reports and line-of-sight Michelson Doppler Imager (MDI) magnetograms of 89 active regions to study relationships between magnetic field characteristics and flare productivity. They concentrated on the total magnetic energy, the length of strong-gradient magnetic neutral lines (LGNL), and the mean value of spatial magnetic gradients at strong-gradient magnetic neutral lines (NL). Their results showed that these parameters correlate strongly and positively with both the probability of more flares in the future and the overall flare productivity. Vector magnetograms were first used by Leka et al. [19] to define AR parameters. They used discriminant analysis to identify the photospheric magnetic characteristics that are essential for producing intense events, such as solar flares. Their findings demonstrated that, whereas individual parameters had little discriminative power, combinations of several variables were able to distinguish between regions that were flaring and those that were not.
Nonlinear statistical models frequently use conventional machine learning classifiers while utilizing different strategies. Several approaches have been explored in the context of classification models: ref. [20] employed logistic regression, ref. [21] utilized a C4.5 decision tree, ref. [22] employed a relevance vector machine, and ref. [23] used an artificial neural network. Moreover, ref. [24] used vector magnetograms as well as line-of-sight magnetograms to evaluate the performance of three classifiers: k-NN, SVM, and Extremely Randomized Trees.
The pioneering work by [25] was the first to apply machine learning techniques to HMI vector magnetograms. They employed a Support Vector Machine (SVM) classifier to forecast M- and X-class solar flares using four years of data from the Helioseismic and Magnetic Imager (HMI) aboard the Solar Dynamics Observatory. For flare forecasting, this novel technique made use of a large dataset of vector magnetograms. The researchers used a database of 2071 active regions and 1.5 million active region patches of vector magnetic field data to create a catalog of flaring and non-flaring active regions. Each active region was described using 25 parameters, after which a feature selection technique was used to identify the features that best distinguish flaring from non-flaring regions. To address the problem of class imbalance, a cost formula that minimized false negatives was devised.
Reference [26] framed solar flare prediction as a binary classification problem, distinguishing between flaring and non-flaring active regions. They used k-NN classification on univariate time series to create a prediction system after carefully extracting time series samples of active region parameters. Their results showed that employing all active region parameters at once was less effective than applying a statistical summary technique to the "total unsigned current helicity" parameter. After identifying the most important parameter by examining the time series features of the AR parameters, the researchers reduced the problem to a single-variable time series classification. By applying a statistical summarization method to the time series, they presented a novel strategy that allowed the key AR parameter to represent flaring and non-flaring active regions in a vector space. They obtained significant computational and temporal efficiencies by using the k-nearest neighbors (k-NN) classifier in this reduced vector space. They also found that adding C-class flares to the positive class did not improve classification performance.
Angryk and colleagues [27] published a comprehensive multivariate time series (MVTS) dataset collected from solar photospheric vector magnetograms. The dataset comprised 4098 MVTS entries with over 10,000 flare reports and 51 flare-predictive parameters that were collected from active regions between May 2010 and December 2018. It offered a cleaned, integrated, and easily available dataset with numerous sources of verification as a comprehensive resource for solar physicists and machine learning specialists. The GOES flare catalog, SSW and XRT flares, and NOAA AR locations were used in the data compilation process to improve the quality and cleanliness of the dataset. The authors recalculated the magnetic field parameters from specific region patches and then transformed them into multivariate time series spanning the entire length of a given HARP series. Additionally, they handled missing values, applied location-based filtering, and accounted for empty SHARPs in order to clean the dataset. The dataset was categorized into target classes based on flare intensity thresholds. The concepts of observation window, latency, and prediction window were used for customized slicing and labeling.
Ahmadzadeh and colleagues [1] discussed the challenges posed by the SWAN-SF dataset introduced by [27]. They highlighted the extreme class imbalance ratio within the data and its temporal coherence. To tackle these issues, the researchers first reduced the dimensionality of the dataset by extracting statistical features from the time series. For their experiments, they employed SVM classifiers and used a combination of data-level undersampling and oversampling strategies to address class imbalance. To lower false positives and false negatives, they also adjusted the misclassification weighting at the classifier level. They drew training and testing data from separate partitions to prevent data point overlap and preserve temporal coherence.

Dataset
The Space Weather Analytics for Solar Flares (SWAN-SF) benchmark dataset by [27] is a multivariate time series dataset designed to support unbiased classification and forecasting of solar flares. The MVTS instances of the SWAN-SF benchmark dataset are labeled with five different flare classes, namely GOES-based X, M, C, and B, and a non-flaring class denoted by Q. Class Q includes flare-quiet events and GOES A-class events. In this paper, we consider major flaring events (M and X) as positive class events and minor events (Q, B, C) as negative class events. Treating B- and C-class flares as non-flaring is motivated by the experimental findings of multiple previous studies [1,25,26,28,29]. Solar flares are classified logarithmically into A, B, C, M, and X categories, with each category representing a tenfold increase in intensity starting from $10^{-8}$ W m$^{-2}$ [1,27]. An X-class flare's peak X-ray flux is roughly 100 times greater than that of a C-class flare and 10 times greater than that of an M-class flare. Each class is subdivided into nine further levels to provide more granularity. Detection capabilities for flares vary with their intensity. High X-ray levels can render A- and B-class flares difficult or even impossible to detect reliably. Due to their severe geomagnetic effects, M- and X-class flares are of particular interest in space weather forecasting [30]. Their significant potential for damage makes them critical targets for continuous monitoring and analysis. This hierarchical and intensity-based classification system not only aids in the systematic study of solar phenomena, but also prioritizes monitoring efforts toward those flares most likely to affect space and terrestrial environments. The dataset has been divided into five temporally segmented partitions, each with approximately equal numbers of X- and M-class flares (Table 1). Time series data from solar photospheric magnetograms and NOAA's record of active region flares are included in the dataset. For magnetograms, it makes use of the HMI Active Region Patches (HARP) [31] data product from the Solar Dynamics Observatory [32]. Magnetic field parameters are first obtained from the Space Weather HMI Active Region Patches (SHARP) data product [7,33]. However, for improved validation, these parameters are recalculated and supplemented with new variables, including some that were not originally in SHARPs (as shown in Table 2 in [27]). Table 2 lists the 24 physical magnetic field parameters that each sliding time series slice in the dataset represents. The references in Table 2 indicate the initial use of these parameters for flare prediction using machine learning algorithms [25]. These time series instances are logged at 12 min intervals over a total of 12 h (60 time steps).
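The labeling scheme described above can be sketched as a short example. This is a hedged illustration, not the SWAN-SF labeling code: the function names `goes_class` and `binary_label` are hypothetical, the class thresholds follow the one-decade-per-class GOES scale starting at 10^-8 W m^-2, and events below the A-class threshold are assumed to be flare-quiet (Q).

```python
# Lower bound of the peak soft X-ray flux (W m^-2) for each GOES class;
# each class spans one decade of flux.
GOES_THRESHOLDS = [("X", 1e-4), ("M", 1e-5), ("C", 1e-6), ("B", 1e-7), ("A", 1e-8)]

def goes_class(peak_flux):
    """Map a peak flux in W m^-2 to a class letter plus multiplier, e.g. 'M2.5'."""
    for letter, lower in GOES_THRESHOLDS:
        if peak_flux >= lower:
            return f"{letter}{peak_flux / lower:.1f}"
    return "Q"  # below the A-class threshold: treated as flare-quiet (assumption)

def binary_label(flare_class):
    """1 for major flares (M, X); 0 for minor or quiet events (Q, A, B, C)."""
    return 1 if flare_class[0] in ("M", "X") else 0

label_major = binary_label(goes_class(2.5e-5))  # an M2.5 flare -> positive class
label_minor = binary_label(goes_class(3.2e-6))  # a C-class flare -> negative class
```

The binary mapping mirrors the choice made in this paper: only M- and X-class events form the positive class, while C-class and weaker events are grouped with the quiet class.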

Table 2. Magnetic field parameters: abbreviation and description.
ABSNJZH [19]: Absolute value of the net current helicity
Mean photospheric magnetic free energy [36]
Mean shear angle
Sum of flux near polarity inversion line
Sum of the modulus of the net current per polarity
Fraction of area with shear > 45°
TOTPOT [19]: Total photospheric magnetic free energy density
TOTUSJH [19]: Total unsigned current helicity

During a specific prediction window, every solar active region either exhibits a range of flare classes or stays quiet. To capture this variability, we represent a solar event $i$ by a multivariate time series instance $mvts_i$ and a class label $y_i$ that specifies the flare category. Each instance $mvts_i \in \mathbb{R}^{T \times N}$ is made up of $N$ magnetic field parameters, each observed periodically at $T$ time steps. The time series of the $j$-th parameter is represented as $P_j \in \mathbb{R}^{T}$, and the vector of parameter values at the $t$-th timestamp is represented as $x^{<t>} \in \mathbb{R}^{N}$. The active region's state at the end of the observation period $T$ and during the subsequent prediction interval $L$ determines the event's classification. NOAA records of flare events are used to determine the state at a given timestamp.
A dataset is considered highly imbalanced when the instances of one or more classes are significantly fewer than those of the remaining classes. The under-represented classes are referred to as the minority classes, and the remaining classes as the majority classes. Table 1 illustrates the substantial class imbalance ratio present in the SWAN-SF benchmark dataset. Traditional machine learning classifiers tend to favor the majority class, as highlighted by [38]. This is especially concerning in solar flare classification, where the focus lies on the minority of cases. The class imbalance ratio can significantly distort various performance metrics, including accuracy, precision, and the F1 score, primarily because these metrics do not account for how misclassifications are distributed across the classes. For example, a model that assigns all instances to the majority class may achieve high accuracy yet fail to capture any meaningful information about the minority class. In the following sections, we will discuss the TSS and HSS₂ evaluation metrics, which are better suited to assessing model performance in scenarios with significant class imbalance.
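The majority-class example above can be made concrete with a few lines of code. The sketch below uses a made-up test set (20 major-flare instances among 1000) rather than SWAN-SF data, and the helper names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def recall(y_true, y_pred):
    """Fraction of positive (flaring) instances that were predicted as positive."""
    positives = sum(y_true)
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return hits / positives

# Made-up test set: 20 major-flare instances (label 1) among 1000 instances.
y_true = [1] * 20 + [0] * 980
y_pred = [0] * 1000  # a trivial model that always predicts the majority class

acc = accuracy(y_true, y_pred)  # high, although no flare is ever predicted
rec = recall(y_true, y_pred)    # zero: every flaring instance is missed
```

Here the trivial model reaches 98% accuracy with a recall of zero, which is why accuracy alone is not a meaningful criterion for flare prediction.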

Methodology
Although machine learning and deep learning classifiers achieve outstanding time series classification accuracy, they are generally associated with considerable computational complexity. Larger datasets exacerbate this problem, as they can require training times long enough to render these techniques impractical. In addition, many existing methods concentrate on a single type of data feature, such as shape or frequency, at the expense of a more comprehensive view. Reference [12] presented the RandOm Convolutional KErnel Transform (ROCKET) technique as a solution to these problems. By using random convolutional kernels to extract features and training a linear classifier on them, this approach brings the strengths of convolutional neural networks to time series classification. Reference [11] proposed a refined variant of ROCKET with faster processing times and nearly deterministic behavior, called the MINImally RandOm Convolutional KErnel Transform (MiniRocket).
The ROCKET approach transforms time series data by applying a collection of random convolutional kernels to each series. These kernels, similar to those in convolutional neural networks, have randomly assigned length, weights, bias, dilation, and padding. The kernels are therefore able to extract a wide range of patterns at various frequencies and scales. Two types of pooling are applied to each kernel's output: proportion of positive values (PPV) pooling and global max pooling. Global max pooling extracts the maximum value from the convolution output, while PPV pooling computes the proportion of positive outputs as $\mathrm{PPV} = \frac{1}{n}\sum_{i=0}^{n-1}[z_i > 0]$, where $z_i$ is the $i$-th output of the convolution operation, $n$ is the number of outputs, and $[z_i > 0]$ equals 1 if the condition holds and 0 otherwise. By capturing how prevalent the pattern matched by each kernel is within the series, PPV pooling greatly improves the method's accuracy. Each kernel produces two features, resulting in a total of 20,000 features per input time series when using 10,000 random convolutional kernels. A linear classifier is then trained using these features.
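The transform described above can be sketched in a few lines. This is a simplified, univariate illustration and not the reference ROCKET implementation: the function names are hypothetical, padding is omitted, and the sampling choices (kernel lengths of 7, 9, or 11, mean-centred Gaussian weights, a uniform bias in [-1, 1], and exponentially sampled dilation) are assumptions taken from the ROCKET paper [12] rather than from this text.

```python
import math
import random

def dilated_convolution(series, weights, bias, dilation):
    """Apply one kernel to a 1-D series without padding and return the output z."""
    span = (len(weights) - 1) * dilation
    return [
        bias + sum(w * series[i + j * dilation] for j, w in enumerate(weights))
        for i in range(len(series) - span)
    ]

def ppv(z):
    """Proportion of positive values: (1/n) * sum([z_i > 0])."""
    return sum(1 for v in z if v > 0) / len(z)

def rocket_like_features(series, n_kernels=100, seed=0):
    """Two features (PPV, global max) per random kernel: 2 * n_kernels in total."""
    rng = random.Random(seed)
    features = []
    for _ in range(n_kernels):
        length = rng.choice([7, 9, 11])
        weights = [rng.gauss(0.0, 1.0) for _ in range(length)]
        mean_w = sum(weights) / length
        weights = [w - mean_w for w in weights]  # mean-centred weights
        bias = rng.uniform(-1.0, 1.0)
        max_exponent = math.log2((len(series) - 1) / (length - 1))
        dilation = int(2 ** rng.uniform(0.0, max_exponent))  # always >= 1
        z = dilated_convolution(series, weights, bias, dilation)
        features.extend([ppv(z), max(z)])
    return features

# Toy series with 60 time steps, matching the length of a SWAN-SF slice.
series = [math.sin(0.3 * t) for t in range(60)]
features = rocket_like_features(series)  # 100 kernels -> 200 features
```

With 10,000 kernels, the same procedure would yield the 20,000 features per series mentioned above; the resulting feature vectors would then be passed to a linear classifier.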
PPV pooling is used by both the ROCKET and MiniRocket algorithms to summarize convolution outputs. Table 3 provides more information on how MiniRocket improves computing efficiency using a predetermined set of kernels with particular hyper-parameter settings. Notable modifications include fixing the kernel length to nine, restricting the weights to a predefined set of values, deriving the bias values from convolution outputs, restricting the dilation hyper-parameter, and utilizing PPV pooling only, instead of both global max pooling and PPV. With these modifications, MiniRocket generates half as many features as ROCKET with comparable accuracy. Several further optimizations lead to MiniRocket's outstanding computing efficiency. First, it essentially doubles the kernel utilization without adding computational overhead, by exploiting the mathematical properties of fixed kernels and PPV pooling to compute PPV for both positive and negative weights at the same time. Second, by replacing multiplications with additions, it reduces processing demands and maximizes the reuse of convolution output. Third, MiniRocket processes all kernels for each dilation simultaneously, which further optimizes computation and output reuse. These enhancements greatly increase computational efficiency without lowering accuracy relative to ROCKET. In our experiments using the SWAN-SF dataset, MiniRocket outperformed the other classifiers in terms of both computational efficiency and accuracy, making it a highly effective model for time series classification tasks. In the next section, we explore the results of our experimentation in more detail and present a brief summary of the other classifiers we evaluated.
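To illustrate the fixed kernel set and the reuse of PPV for negated weights, the following sketch enumerates length-9 kernels and shows that the output of a negated kernel can be obtained without a second convolution. The specific weight values (-1 and 2, with exactly three weights equal to 2, giving 84 kernels) are taken from the MiniRocket paper [11] and are not stated in this text, so they should be read as assumptions; bias and dilation are omitted, and the function names are hypothetical.

```python
from itertools import combinations

def minirocket_kernels(length=9, n_high=3, low=-1, high=2):
    """Enumerate all kernels with n_high weights equal to `high` and the rest `low`."""
    return [
        [high if i in chosen else low for i in range(length)]
        for chosen in combinations(range(length), n_high)
    ]

def convolve(series, weights):
    """Plain (dilation 1, no padding, no bias) convolution output."""
    k = len(weights)
    return [
        sum(w * series[i + j] for j, w in enumerate(weights))
        for i in range(len(series) - k + 1)
    ]

def ppv(z):
    """Proportion of positive values in a convolution output."""
    return sum(1 for v in z if v > 0) / len(z)

kernels = minirocket_kernels()  # C(9, 3) = 84 kernels, each summing to zero

# Arbitrary toy series of 60 values (made up for illustration).
series = [((7 * t) % 11) + 0.5 * (t % 3) for t in range(60)]
z = convolve(series, kernels[0])
z_negated = [-v for v in z]  # output of the negated kernel, no second convolution

ppv_pos = ppv(z)
ppv_neg = ppv(z_negated)  # equals 1 - ppv_pos whenever z contains no exact zeros
```

Because the negated kernel's output is simply the sign-flipped output of the original kernel, its PPV follows from the proportion of negative values already available, which is the sense in which kernel utilization is doubled at no extra convolution cost.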

Experiments
In this section, we present an overview of the baseline models that we evaluated and compared to the state-of-the-art MiniRocket (MR) algorithm. We evaluated the performance of each model under different data configurations and compared the results with those of MiniRocket. To ensure the reliability of the results, we employed a cross-validation scheme over the five SWAN-SF partitions: in each run, one partition was used for training, and each of the remaining four partitions was used separately for testing. For instance, when partition 1 was used for model training, partitions 2, 3, 4, and 5 were each employed for separate testing. This method produced a total of 20 distinct pairings of training and testing sets. We followed the methodology used by [1] to prevent data overlap and address temporal coherence. The performance of the models was evaluated using the True Skill Statistic (TSS) and the Heidke Skill Score (HSS₂), which are the primary metrics for evaluating flare prediction in datasets with class imbalance.
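The 20 training and testing pairings can be enumerated as ordered pairs of distinct partitions. The snippet below is a small sketch of that bookkeeping only; the variable names are hypothetical, and model training and evaluation are omitted.

```python
from itertools import permutations

PARTITIONS = [1, 2, 3, 4, 5]

# Ordered (train, test) pairs with train != test: 5 * 4 = 20 pairings.
pairs = list(permutations(PARTITIONS, 2))

# Group the test partitions by training partition, e.g. 1 -> [2, 3, 4, 5].
tests_by_train = {}
for train, test in pairs:
    tests_by_train.setdefault(train, []).append(test)
```

Each pairing trains on one whole partition and tests on a different one, so no training instance shares a partition with any test instance.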

Performance Metrics: TSS Score and HSS 2 Score
A valuable approach to evaluating the effectiveness of a classifier is to measure its performance against a benchmark classifier using a skill score. This score is calculated by subtracting the standard forecast's score from the classifier's prediction score and dividing this difference by the difference between a perfect score and the standard forecast's score. This computation helps in evaluating the classifier's performance in comparison to both the baseline forecast and the ideal result. Such a skill score is especially important in solar flare prediction, since non-flaring regions greatly outnumber flaring ones. We used forecast verification metrics to assess how well different classifiers performed in forecasting flares on the SWAN-SF dataset, with a particular emphasis on the True Skill Statistic (TSS) and the Heidke Skill Score (HSS₂) [39]. The Heidke Skill Score (HSS) as defined by the Space Weather Prediction Center, also known as HSS₂, is used by [40]. This measure expresses how much better the forecast is than a random one. The following formula is used to calculate HSS₂:

$$\mathrm{HSS}_2 = \frac{TP + TN - E}{P + N - E},$$

where $E$ represents the expected number of correct predictions due to chance alone:

$$E = \frac{P\,(TP + FP) + N\,(FN + TN)}{P + N}.$$

HSS₂ can be calculated from the True Positive (TP), True Negative (TN), False Negative (FN), and False Positive (FP) classification outcomes, as well as the total number of Positive ($P = TP + FN$) and Negative ($N = FP + TN$) instances:

$$\mathrm{HSS}_2 = \frac{2\,[(TP \times TN) - (FN \times FP)]}{P\,(FN + TN) + N\,(TP + FP)}.$$

Because the class-imbalance ratio of the testing set may have an impact on HSS₂, TSS is suggested by [14] as a more suitable metric in these situations: it is thought to be more equitable and is known to be unbiased with regard to the class-imbalance ratio. The TSS is defined as follows:

$$\mathrm{TSS} = \frac{TP}{TP + FN} - \frac{FP}{FP + TN}.$$

TSS, sometimes referred to as the Peirce skill score or the Hansen-Kuiper skill score [41], measures the difference between recall and the false alarm rate. The score ranges from −1 to 1: a score of 1 indicates a perfect forecast, a score of 0 indicates a random or constant forecast, and a score of −1 indicates a forecast that is always incorrect. TSS is highly regarded for comparing the performance of different classifiers in solar flare prediction because it takes into account both false negatives and false positives in a balanced manner. Importantly, it remains unaffected by the imbalance in the testing set, which makes it a very helpful indicator in situations where there is a class imbalance.
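The behavior of the two skill scores under class imbalance can be checked numerically. The sketch below implements TSS as recall minus false alarm rate and HSS₂ in its chance-corrected form, (TP + TN − E) / (P + N − E); the function names, the fixed recall of 0.8, the false alarm rate of 0.1, and the test-set sizes are hypothetical choices for illustration, not results from this study.

```python
def tss(tp, fn, fp, tn):
    """True Skill Statistic: recall minus false alarm rate."""
    return tp / (tp + fn) - fp / (fp + tn)

def hss2(tp, fn, fp, tn):
    """Heidke Skill Score: improvement over the number of correct chance forecasts."""
    p, n = tp + fn, fp + tn
    expected = (p * (tp + fp) + n * (fn + tn)) / (p + n)  # correct by chance
    return (tp + tn - expected) / (p + n - expected)

def scores_at_imbalance(n_negative, n_positive=100, recall=0.8, false_alarm_rate=0.1):
    """Return (TSS, HSS2) for a classifier with fixed per-class rates."""
    tp = round(recall * n_positive)
    fp = round(false_alarm_rate * n_negative)
    fn, tn = n_positive - tp, n_negative - fp
    return tss(tp, fn, fp, tn), hss2(tp, fn, fp, tn)

# Same per-class behaviour, increasingly imbalanced test sets (made-up sizes).
results = {n: scores_at_imbalance(n) for n in (100, 1_000, 10_000, 100_000)}
```

With the per-class rates held fixed, TSS stays at 0.7 for every test-set size, whereas HSS₂ falls from 0.7 on the balanced set toward zero as the number of negative instances grows.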
A potential limitation of the True Skill Statistic (TSS) is that it gives equal weight to False Positive (FP) and False Negative (FN) outcomes, despite the fact that the consequences of these misclassifications can differ greatly. For example, in the forecasting of solar flares, the consequences of a False Negative (failing to predict a flare that actually happens) can be more severe than those of a False Positive (forecasting a flare that never happens). This is especially important when preemptive actions are required, such as rotating a satellite to protect it from energetic particles. As a result, the costs associated with False Positives and False Negatives differ. The Heidke Skill Score (HSS₂), in turn, is sensitive to the class imbalance in the testing set; as the imbalance increases, its value may approach zero, whereas TSS remains unaffected by it.
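The equal weighting of FP and FN in TSS can be illustrated with two hypothetical classifiers that obtain the same TSS but incur different operational costs. The confusion-matrix counts and the cost values below (a missed flare assumed to be 50 times as costly as a false alarm) are made up for illustration only.

```python
def tss(tp, fn, fp, tn):
    """True Skill Statistic: recall minus false alarm rate."""
    return tp / (tp + fn) - fp / (fp + tn)

def total_cost(fn, fp, cost_fn, cost_fp):
    """Total misclassification cost under asymmetric per-error costs."""
    return fn * cost_fn + fp * cost_fp

# Hypothetical test set: 100 flaring and 1000 non-flaring instances.
classifier_a = dict(tp=90, fn=10, fp=200, tn=800)  # recall 0.9, false alarm rate 0.2
classifier_b = dict(tp=80, fn=20, fp=100, tn=900)  # recall 0.8, false alarm rate 0.1

COST_FN, COST_FP = 50.0, 1.0  # made-up costs: a missed flare is far more expensive

tss_a, tss_b = tss(**classifier_a), tss(**classifier_b)
cost_a = total_cost(classifier_a["fn"], classifier_a["fp"], COST_FN, COST_FP)
cost_b = total_cost(classifier_b["fn"], classifier_b["fp"], COST_FN, COST_FP)
```

Both classifiers score a TSS of 0.7, yet under these assumed costs classifier A (fewer missed flares) is considerably cheaper than classifier B, a distinction that TSS alone cannot express.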

Baseline Models
We used TSS and HSS₂ scores to evaluate the performance of several baseline time series classifiers, including LSTM, SVM, Mr-SEQL, and CIF. Our analysis shows that MiniRocket outperforms these classifiers, achieving the highest TSS score in both binary classification and all-class classification. This result highlights the effectiveness of MiniRocket as a powerful tool for flare classification. In the following sections, we provide a brief overview of each classifier and then compare the results obtained.

Long Short-Term Memory (LSTM)
In this research, we utilized Long Short-Term Memory (LSTM) networks to learn representations of Multivariate Time Series (MVTS) instances without requiring statistical features to be hand-engineered. The LSTM network was trained by sequentially feeding magnetic field parameter vectors into LSTM cells and adjusting cell weights using gradient descent and backpropagation. This approach successfully revealed underlying patterns in the data, enabling trustworthy forecasts of flare occurrences through automated feature extraction [42]. LSTM networks excel at processing and classifying time series data due to their ability to capture order dependence and long-term dependencies that regular RNNs cannot. Deep LSTM networks are produced by stacking multiple LSTM layers to identify increasingly intricate patterns in sequential data. This study's use of LSTM networks demonstrates both their wide range of application across several fields and their ability to represent time series data.

Support Vector Machine (SVM)
The Support Vector Machine (SVM) classifier works by identifying a hyperplane in N-dimensional space that can accurately classify input points. Finding an ideal hyperplane involves finding the plane with the greatest margin, that is, the maximum distance between data instances of different classes. This margin is crucial, as it enables effective generalization and improves the prediction accuracy. Hyperplanes act as decision boundaries, separating data points. The dimensionality of the hyperplane is determined by the number of features in the data. Support vectors, which are the data points closest to the hyperplane, greatly influence its placement and orientation. They play a critical role in optimizing the classifier's margin. The SVM classifier finds the best hyperplane by utilizing these support vectors and thereby achieves high prediction accuracy [43].
In the class-imbalanced flare dataset, the optimal hyperplane that defines the decision boundary is pushed further toward the domain of the minority class. This shift minimizes the overall number of incorrect classifications, which leads to an increase in True Negatives (i.e., accurate classification of CBF-class flares) and a decrease in True Positives (i.e., accurate classification of XM-class flares). Under class imbalance, models tend to be biased in favor of the dominant class, which is problematic because flare-forecasting research is more concerned with minority events than majority ones. Support vectors and transformation functions (kernels) enable the SVM classifier to learn nonlinear decision surfaces effectively, which has contributed to its popularity. A variety of kernels can be applied to transform the data into new feature spaces and enable a more precise separation of instances. Like any other function, a kernel requires one or more parameters to be specified beforehand.
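As an illustration of a kernel and its parameter, the sketch below implements the widely used radial basis function (RBF) kernel and the generic kernel SVM decision function in plain Python. The RBF kernel is chosen here only as a common example; the kernel and parameter values used in our experiments are not implied. The names `rbf_kernel`, `svm_decision`, and `gamma` are illustrative assumptions.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2); gamma must be chosen beforehand."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def svm_decision(x, support_vectors, dual_coefs, bias, gamma):
    """Kernel SVM decision value f(x) = sum_i coef_i * k(sv_i, x) + bias.

    dual_coefs[i] is the product of the learned multiplier and the class
    label (+1 or -1) of support vector i. A positive value of f(x) predicts
    the positive (flaring) class, a negative value the negative class.
    """
    return sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    ) + bias
```

A larger `gamma` makes the kernel more local, so each support vector influences only nearby points; this is one reason the kernel parameter is typically tuned on validation data.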

Canonical Interval Forest (CIF)
The time series forest (TSF) classifier is widely considered a powerful interval-based approach due to its excellent performance and its rapid training and prediction. However, it has lagged behind the latest developments in alternative methods. Initially, TSF used just three basic summary statistics to summarize intervals. To make the analysis of large time series easier, the 'catch22' feature set [44] was designed as a concise and practical set of 22 time series features. Building on these developments, ref. [13] presented the Canonical Interval Forest (CIF) classifier, which combines the strengths of TSF and catch22 to improve time series classification performance and accuracy.
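The interval-based idea can be sketched as follows: a random interval is drawn from a series and summarized by the mean, standard deviation, and slope, which are the three summary statistics used by TSF. This is a minimal illustration, not the CIF implementation; the catch22 features are not reproduced, and the function names, the minimum interval length, and the series length of 60 in the usage comment are assumptions.

```python
import math
import random


def interval_summary(series, start, end):
    """Mean, (population) standard deviation, and least-squares slope of series[start:end]."""
    window = series[start:end]
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / n)
    t_mean = (n - 1) / 2
    denom = sum((t - t_mean) ** 2 for t in range(n))
    if denom == 0:
        slope = 0.0  # a single-point interval has no defined slope
    else:
        slope = sum((t - t_mean) * (v - mean) for t, v in enumerate(window)) / denom
    return mean, std, slope


def random_interval(length, min_len=3, rng=random):
    """Draw a random interval [start, end) with at least min_len points (assumed minimum)."""
    size = rng.randint(min_len, length)
    start = rng.randint(0, length - size)
    return start, start + size


# Hypothetical usage on one parameter of an MVTS instance with 60 time steps:
# start, end = random_interval(60)
# features = interval_summary(parameter_series, start, end)
```

In an interval forest, each tree is built on features computed from its own set of random intervals, so the ensemble covers many positions and scales of the series.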

Multiple Representations Sequence Learner (Mr-SEQL)
A robust univariate time series classifier called Mr-SEQL was introduced by [45]. It is trained on features obtained from several symbolic representations of time series, such as Symbolic Aggregate Approximation (SAX) and Symbolic Fourier Approximation (SFA), which are used with linear classification models (logistic regression). Mr-SEQL uses SEQL [46] to extract features and rests on three main concepts. First, Mr-SEQL combines numerous symbolic representations derived from different parameters, such as several SAX representations, as opposed to depending on a single fixed representation. Second, it is robust to a broad spectrum of problems because it integrates representations from both the time domain (such as SAX) and the frequency domain (such as SFA). Third, to explore the relevant space of symbolic words efficiently, Mr-SEQL enhances a symbolic sequence classifier (SEQL) and uses an effective greedy feature selection technique to identify optimal features for each representation. These properties make Mr-SEQL an effective time series classifier that is suitable for a variety of applications.
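To make the notion of a symbolic representation concrete, the following is a minimal sketch of the basic SAX transform: z-normalization, piecewise aggregate approximation, and discretization using breakpoints that split the standard normal distribution into equiprobable bins. It illustrates SAX only; it is not the Mr-SEQL feature extraction, and the function name `sax_word` and the default four-letter alphabet are assumptions.

```python
from statistics import NormalDist, mean, pstdev


def sax_word(series, n_segments, alphabet="abcd"):
    """Convert a univariate series into a SAX word of n_segments symbols.

    Assumes n_segments <= len(series).
    """
    # 1. Z-normalize the series (a constant series maps to all zeros).
    mu, sigma = mean(series), pstdev(series)
    z = [(v - mu) / sigma if sigma > 0 else 0.0 for v in series]

    # 2. Piecewise aggregate approximation: average each segment.
    n = len(z)
    paa = []
    for s in range(n_segments):
        lo = s * n // n_segments
        hi = (s + 1) * n // n_segments
        paa.append(sum(z[lo:hi]) / (hi - lo))

    # 3. Breakpoints dividing the standard normal into equiprobable bins.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]

    # 4. Map each segment average to the symbol of the bin it falls into.
    return "".join(alphabet[sum(v > bp for bp in breakpoints)] for v in paa)
```

Varying `n_segments` and the alphabet size yields different SAX representations of the same series, which is the kind of multiplicity that Mr-SEQL exploits.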

Binary Classification
In the preliminary experiments, we transformed the original data labels into binary labels to simplify the classification task. The positive class, denoted as flaring, contains M- and X-class flares, while the negative class, referred to as non-flaring, contains Q-, B-, and C-class flares.
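The label transformation can be expressed as a simple mapping. The sketch below assumes that each instance carries a single-letter class label (Q, B, C, M, or X); the function name and the integer encoding (1 for flaring, 0 for non-flaring) are illustrative choices.

```python
FLARING = {"M", "X"}           # positive class
NON_FLARING = {"Q", "B", "C"}  # negative class


def to_binary_label(flare_class):
    """Map a five-class label to 1 (flaring) or 0 (non-flaring)."""
    if flare_class in FLARING:
        return 1
    if flare_class in NON_FLARING:
        return 0
    raise ValueError(f"Unknown flare class: {flare_class!r}")


# Example: convert a list of original labels.
binary = [to_binary_label(c) for c in ["Q", "B", "C", "M", "X"]]
```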
We trained five different models, namely MiniRocket, CIF, Mr-SEQL, LSTM, and SVM, and compared their performances in terms of TSS and HSS 2 scores. The results of the experiments are shown in the line plots in Figures 1 and 2, which present the obtained TSS and HSS 2 scores, respectively. Our analysis demonstrated that, on the SWAN-SF dataset, the MiniRocket classifier outperformed the baseline classifiers, improving TSS and HSS 2 scores by an average of 19.4% and 23.9%, respectively. Furthermore, the box plots illustrate the distribution of TSS and HSS 2 scores from the various classifiers across 20 distinct partition pairs. These plots offer insights into the variability and distribution of the scores. A longer box signifies increased variability, as observed with the SVM and MiniRocket classifiers. Notably, the MiniRocket classifier exhibited the best performance, followed by the SVM, LSTM, CIF, and Mr-SEQL models.

Multi-Class: All Class Classification
In this section, we classify instances into the five classes Q, B, C, M, and X. The experimental setup stays the same: 20 distinct partition pairs are used for training and testing, and the TSS and HSS 2 scores are used to compare the performance of the selected models, MiniRocket and SVM. For a comparison of TSS and HSS 2 scores, see the line plots shown in Figures 3 and 4.
The analysis of the all-class classification showed that the MiniRocket classifier outperformed the baseline classifiers with a 9.61% higher TSS score and a 10.36% higher HSS 2 score. The box plots of TSS and HSS 2 scores for multi-class classification show that the MiniRocket model again demonstrated superior performance, followed by SVM, LSTM, CIF, and Mr-SEQL, in that order. The SVM model's box plot also exhibited the highest variability.

Analysis with the Exclusion of B- and C-Class Flares
In this part of the experiment, B- and C-class flares are excluded. This decision follows the research method of [14], which indicated that the inclusion of C-class flares may have a negative impact on performance metrics. In our experiment, we observed an improvement in the TSS score for all models after the B- and C-class flares were removed. This underscores the significance of this exclusion in achieving optimal model performance.
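The exclusion step amounts to filtering the instances by label before training and testing. The following minimal sketch assumes the instances and their single-letter class labels are held in two parallel lists; the function name `drop_classes` is hypothetical.

```python
def drop_classes(instances, labels, excluded=("B", "C")):
    """Remove every instance whose label is in `excluded`, keeping order."""
    kept = [(x, y) for x, y in zip(instances, labels) if y not in excluded]
    kept_instances = [x for x, _ in kept]
    kept_labels = [y for _, y in kept]
    return kept_instances, kept_labels
```

After this filtering, only Q-, M-, and X-class instances remain for both the binary and the all-class settings.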
After B- and C-class flares were eliminated, the experiment was divided into two categories: binary classification (Figures 5 and 6) and all-class classification (Figures 7 and 8).
After B- and C-class flares were removed, the experimental results for binary classification showed that MiniRocket performed remarkably well, increasing the TSS score by 30.06% and the HSS 2 score by 30.55% compared to the baseline classifiers.
After the B- and C-class flares were removed, we analyzed the all-class classification and found that MiniRocket again outperformed the aforementioned classifiers, by 18.94% in terms of HSS 2 score and 20.13% in terms of TSS score. A notable observation arises from the box plots of these experiments: the TSS and HSS 2 scores for all classifiers increased when B- and C-class flares were excluded, compared to the scenario where these classes were included. This result reinforces the findings of [14]. Moreover, the recurring trend showcased the superior performance of the MiniRocket classifier over the other classifiers, followed by the SVM, LSTM, CIF, and Mr-SEQL models.

Conclusions
In this study, we explored the use of the MiniRocket classifier for analyzing the SWAN-SF dataset.
We compared our model with multiple classifiers, namely LSTM, Mr-SEQL, SVM, and CIF. We utilized the True Skill Statistic (TSS) score and the Heidke Skill Score (HSS 2) to evaluate the classification performance. The experimental findings indicated that MiniRocket outperformed the baseline classifiers on the SWAN-SF dataset, demonstrating a consistent average improvement of 20.92% in HSS 2 and 19.8% in TSS score across all experimental settings. This insight can help solar physicists choose the right algorithm to classify flaring and non-flaring instances. We also found that after excluding B- and C-class flares, the trained models exhibited a significant improvement, resulting in a substantial increase in TSS and HSS 2 scores. The removal of B- and C-class flares for maximizing flare prediction performance was also suggested by the experimental findings of multiple previous studies [25,26]. These findings demonstrate the potential of our approach to improve space weather forecasting accuracy. The power of MiniRocket in handling the complexities of MVTS data is evident, and it can greatly advance the goal of classifying solar flares in real time. There is potential for these contributions to enhance space weather forecasting [47].
For future research, we propose exploring a Transformer/attention-based model on the SWAN-SF dataset. Such a model could address long-range dependencies and enable a comparative analysis against the benchmark MiniRocket classifier, further advancing our understanding and capabilities in solar physics and space weather prediction.
TSS and HSS 2 are calculated based on the confusion matrix of the model, which shows the frequencies of the actual and predicted values. True Negatives (TNs) are instances in which the model correctly classified negative examples. True Positives (TPs) are instances in which the model correctly classified positive examples. False Positives (FPs) occur when real negative examples are mislabeled as positive. False Negatives (FNs) are instances where real positive examples are misclassified as negative. An example of a confusion matrix for binary classification is shown in Table
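As a worked illustration, the sketch below counts the four confusion matrix entries from binary labels and computes both scores, assuming the standard definitions used in the flare forecasting literature: TSS = TP/(TP + FN) − FP/(FP + TN) and HSS 2 = 2(TP·TN − FN·FP) / [(TP + FN)(FN + TN) + (FP + TN)(TP + FP)]. The function names are illustrative, and the counts in the example are made up for demonstration and are not experimental results.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FN, FP, TN) for binary labels, with 1 = flaring and 0 = non-flaring."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fn, fp, tn


def tss(tp, fn, fp, tn):
    """True Skill Statistic: recall minus false alarm rate, in [-1, 1]."""
    return tp / (tp + fn) - fp / (fp + tn)


def hss2(tp, fn, fp, tn):
    """Heidke Skill Score (HSS 2): improvement over a random forecast."""
    numerator = 2 * (tp * tn - fn * fp)
    denominator = (tp + fn) * (fn + tn) + (fp + tn) * (tp + fp)
    return numerator / denominator


# Made-up example: 10 flaring and 90 non-flaring instances.
example_tss = tss(tp=8, fn=2, fp=9, tn=81)    # 0.8 - 0.1
example_hss2 = hss2(tp=8, fn=2, fp=9, tn=81)  # 1260 / 2360
```

Because TSS is the difference of two rates computed separately on the positive and negative classes, it is not affected by the class imbalance ratio, which is why it is a suitable metric for this dataset.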

Figure 5. TSS score comparison of binary classification after removing B- and C-class flares.

Figure 6. HSS 2 score comparison of binary classification after removing B- and C-class flares.

Figure 7. TSS score comparison of all-class classification after removing B- and C-class flares.

Figure 8. HSS 2 score comparison of all-class classification after removing B- and C-class flares.

Table 1. Event type statistics of each partition of the SWAN-SF dataset. Class Q represents flare-quiet events and GOES A-class events.

Table 2. AR magnetic field parameters list.

Table 3. Difference between the ROCKET and MiniRocket kernels' hyper-parameters.

Table 4. Confusion matrix for binary classification.