Solar Active Region Detection Using Deep Learning

Abstract: Solar eruptive events can affect radio communication, global positioning systems, and some high-tech equipment in space. Active regions on the Sun are the main source regions of solar eruptive events. Therefore, the automatic detection of active regions is important not only for routine observation, but also for solar activity forecasting. At present, active regions are extracted manually or automatically by using traditional image processing techniques. Because active regions evolve dynamically, it is not easy to design a suitable feature extractor. In this paper, we first review the methods currently used for active region detection. Then, two representative object detection models, faster R-CNN and YOLO V3, are trained to learn the characteristics of active regions, yielding deep learning-based detection models of active regions. The performance evaluation demonstrates that both models achieve high accuracy in active region detection. In addition, YOLO V3 is 4 and 1 percentage points better than faster R-CNN in terms of the true positive (TP) and true negative (TN) rates, respectively; meanwhile, the former is eight times faster than the latter.


Introduction
A solar active region is an area with a strong magnetic field on the Sun. It is considered the major source region of solar eruptive events. Solar eruptive events can cause severe space weather effects, which may affect the safety of satellites, the precision of global positioning systems and so on. Therefore, the routine monitoring and automatic extraction of active regions are of great importance.
Some efforts have been made towards automatically identifying solar active regions. Benkhalil et al. [1] determined the thresholds for an active region to obtain the initial seeds of active regions. The noise is removed by median filtering and morphological operations. Based on these initial seeds, a region growing algorithm is used to detect active regions. Zhang et al. [2] designed an active region detection system by using an intensity threshold and a morphological analysis algorithm. McAteer et al. [3] combined a region growing algorithm and a boundary extraction technique to detect active regions. Caballero et al. [4] proposed a two-step method to detect active regions from full-disk images. In the first step, the region growing algorithm is applied to segment the bright parts in active regions. In the second step, partition-based clustering and hierarchical clustering are each used to group these bright parts together. The hierarchical clustering method was recommended because of its good performance. Higgins et al. [5] proposed the solar monitor active region tracking (SMART) algorithm to detect and track active regions throughout their lifetime. In this algorithm, the quiet Sun and some transient magnetic features are removed, and then the region growing technique is applied to determine the active regions. Some magnetic properties of an active region, such as region size, magnetic flux emergence rate, and non-potentiality measurements, are calculated. Colak et al. [6] proposed automated solar activity prediction (ASAP) to automatically detect, group and classify sunspots. The intensity threshold, morphological algorithms, region growing algorithms and neural networks are applied to determine the boundaries of sunspots. Watson et al. [7] proposed the sunspot tracking and recognition algorithm (STARA) to detect and track sunspots from solar white light images. Barra et al. [8] proposed a fuzzy clustering algorithm, the spatial possibilistic clustering algorithm (SPoCA), to automatically segment the full-disk solar images into coronal holes, quiet Sun and active regions. This unsupervised method can overcome the imprecision of the regions' definition. The SPoCA algorithm was improved in [9], and the automatic tracking of active regions was further developed. The performance of SMART, ASAP, STARA and SPoCA was analyzed and compared in [10]. That comparison found that ASAP tends to detect very small sunspots, while STARA has a higher threshold for sunspot detection, and SMART and SPoCA detect more active regions than the National Oceanic and Atmospheric Administration (NOAA). The existing detection methods are mainly based on the intensity threshold, morphological operations, region growing algorithms and clustering methods. These methods rely on pre-defined parameters [11], and it is difficult to settle on the optimal values.
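To make the threshold-plus-region-growing idea behind several of the methods above concrete, the following is a minimal sketch and not the implementation of any cited work. The function name `grow_regions`, the two thresholds and the 4-connected neighbourhood are assumptions chosen for illustration; the cited methods add filtering and morphological steps that are omitted here.

```python
from collections import deque


def grow_regions(image, seed_threshold, grow_threshold):
    """Label regions grown from bright seed pixels (illustrative sketch).

    image: 2D list of non-negative intensities.
    A pixel >= seed_threshold starts a region; 4-connected neighbours
    with intensity >= grow_threshold are added to that region.
    Returns a 2D list of integer labels (0 means background).
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            # Only unlabeled pixels above the seed threshold start a region.
            if image[r][c] < seed_threshold or labels[r][c]:
                continue
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not labels[ny][nx]
                            and image[ny][nx] >= grow_threshold):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
    return labels
```

The sketch exposes the difficulty noted above: the result depends directly on the two hand-chosen thresholds, which have to be tuned for each type of image.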
Deep learning algorithms can automatically extract distinguishing features and realize end-to-end object detection. Many deep learning algorithms, for example, the convolutional neural network (CNN) [12][13][14][15], long short-term memory (LSTM) [16,17] and generative adversarial network (GAN) [18][19][20], have been widely used in solar activity forecasting. However, so far, deep learning algorithms have not been widely applied to detect solar active regions. Here, the faster R-CNN (regions with convolutional neural networks) and YOLO V3 (you only look once, version 3) algorithms are used to detect active regions from full-disk solar images, and their performance is compared. This paper is organized as follows: Section 2 summarizes the related works, Section 3 describes the data, Section 4 describes the deep learning-based object detection algorithms, Section 5 discusses the performance, and Section 6 presents the concluding remarks and suggestions for future work.

Related Work
There have been many studies on the automatic detection of active regions; the main algorithms include the intensity threshold, morphological operations, region growing algorithms, clustering methods and combinations of them, as shown in Table 1. In these algorithms, it is difficult to select parameters that suit different solar images. There are two main branches of object detection models based on deep learning. One is the two-stage model, including R-CNN, fast R-CNN, faster R-CNN and mask R-CNN. The other is the one-stage model, including the YOLO (you only look once) series.
In the R-CNN family of detectors, region proposals are first generated. Then, a CNN is trained to classify the proposed regions and to regress their bounding boxes. Unlike the R-CNN algorithm, the fast R-CNN algorithm computes the CNN features once for the whole image and maps each generated region proposal onto them; hence, the fast R-CNN detector is more efficient than the R-CNN detector. In the faster R-CNN detector, the region proposal network (RPN) is used to generate region proposals. The RPN is faster and better than the earlier methods of generating region proposals.
Faster R-CNN is a two-stage detection algorithm, which usually limits its detection speed. One-stage algorithms, for example, the YOLO series, instead learn a single network that detects the object bounding boxes. This detection network is an end-to-end regression-based model; hence, the detection speed of the network is improved. However, the YOLO algorithm is not very good at detecting small targets. Fortunately, we usually do not focus on small active regions in solar activity prediction.
In this paper, we apply these two object detection models (the one-stage YOLO algorithm and the two-stage faster R-CNN algorithm) to solar active region detection and compare their performance.

Data
The Helioseismic and Magnetic Imager (HMI) onboard the Solar Dynamics Observatory (SDO) provides routine full-disk magnetic observations of the Sun. The National Oceanic and Atmospheric Administration (NOAA) provides the location and extension of active regions in the full-disk images day by day; one example is shown in Figure 1. We downloaded the solar full-disk images and the active region information from the Joint Science Operations Center (JSOC) database [21]. The location and extension of active regions are considered the ground truth.
A total of 4645 full-disk images labeled with active regions were obtained from 2010 to 2017, at an interval of 24 h. The data from 2010 to 2015 are used as the training dataset, and the remaining data are used as the testing dataset.
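The chronological split described above can be sketched as follows. This is an illustrative helper, not the authors' code; the function name `chronological_split` and the `(date, record)` sample layout are assumptions.

```python
from datetime import date


def chronological_split(samples, last_train_year=2015):
    """Split (observation_date, record) pairs by year.

    Samples up to and including last_train_year go to the training set;
    later samples go to the testing set.
    """
    train = [s for s in samples if s[0].year <= last_train_year]
    test = [s for s in samples if s[0].year > last_train_year]
    return train, test


# Hypothetical records standing in for labeled full-disk images.
samples = [
    (date(2010, 5, 1), "image_a"),
    (date(2015, 12, 31), "image_b"),
    (date(2016, 1, 1), "image_c"),
    (date(2017, 6, 3), "image_d"),
]
train, test = chronological_split(samples)
```

Splitting by time rather than at random keeps images of the same active region on consecutive days from appearing in both sets.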

Faster R-CNN for Active Region Detection
Traditional object detection techniques include three major steps: (1) Region proposal generation. A large number of region proposals are generated by the selective search algorithm [22]. (2) Feature extraction. Feature extractors, for example, HOG or SIFT [23,24], are applied to obtain a fixed-length feature vector. (3) Classification. Based on the fixed-length feature vector, a classification model is learned to judge whether the region proposal contains the object.
Feature extraction is critical to the success of object detection techniques. In traditional object detection techniques, it is difficult to design feature extractors for different tasks. In region-based CNN (R-CNN), a convolutional neural network is used to learn the features from data. To speed up the process of object detection, faster R-CNN uses an end-to-end network to detect different objects. A novel region proposal network (RPN) is applied to generate region proposals, which saves time compared to traditional algorithms such as selective search.
As shown in Figure 2, a faster R-CNN model is composed of three neural networks [25]: (1) A convolutional neural network is used to extract the features from the solar full-disk magnetograms. A pre-trained CNN (VGG-16 [26]) is selected. (2) A region proposal network (RPN) generates the region proposals. (3) A classification network determines the class and bounding box of each proposal. The whole faster R-CNN is thus composed of a feature extraction network followed by an RPN and a classification network.
The RPN aims to obtain a set of rectangular object proposals and their objectness scores. As shown in Figure 3, the RPN is mainly composed of a convolutional layer, a classification layer and a regression layer. Each sliding window over the output feature map of the last layer in VGG-16 is mapped into a 512-dimensional feature vector, which represents the k anchor boxes centered at this position in the original image. This vector is then fed into a box-regression layer and a box-classification layer to obtain, respectively, the position coordinates and the category (positive sample containing a detection object, or negative sample for background) of the corresponding anchors.
VGG-16 includes shareable convolutional layers and pooling layers. The convolutional layers are applied to extract features, and the pooling layers are used to reduce the spatial dimension of the feature maps. The VGG-16 feature extractor used here includes 13 convolutional layers, each followed by a ReLU, and 4 max pooling layers. The pre-trained VGG-16 model is commonly used as a feature extractor in object detection tasks and can be downloaded through the PyTorch platform [27]. In the preprocessing step, the full-disk images are resized to 1024 × 1024. An image of dimension 1024 × 1024 is thus reduced to a 64 × 64 feature map.
Following the feature extraction network, an RPN is trained to generate object proposals, replacing the time-consuming selective search algorithm. In order to train the RPN, a binary class label is assigned to each anchor. The positive class means that the anchor contains an object, while the negative class means that the anchor is background. The intersection over union (IoU) between the ground truth and the proposal is used to assign these labels, which in turn enter the loss function.
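The reduction from a 1024 × 1024 image to a 64 × 64 feature map follows from the four max pooling layers, assuming each one halves the spatial size (2 × 2 pooling with stride 2) and the convolutions preserve it, as in the standard VGG-16. A minimal sketch of this arithmetic is given below; the function names are illustrative, and the number of anchors per position, k, is left as a parameter because the text does not give its value.

```python
def feature_map_size(input_size, num_pool_layers, pool_stride=2):
    """Side length of the feature map after repeated stride-2 pooling."""
    size = input_size
    for _ in range(num_pool_layers):
        size //= pool_stride
    return size


def num_anchors(input_size, num_pool_layers, k):
    """Total anchors when k anchor boxes are placed at every feature map position."""
    side = feature_map_size(input_size, num_pool_layers)
    return side * side * k


side = feature_map_size(1024, 4)  # 1024 / 2**4 = 64
```

Each position of the 64 × 64 map therefore corresponds to a 16 × 16 pixel stride in the resized image.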
The IoU is calculated by dividing the area of overlap between the bounding box and the ground truth by the area of their union. The higher the IoU, the better the prediction. A proposal with an IoU higher than 0.7 with a ground-truth box is considered a positive sample, while a proposal with an IoU less than 0.3 with a ground-truth box is considered a negative sample. Proposals that are neither positive nor negative do not contribute to the training process. After generating the binary samples, the RPN is trained by using backpropagation and stochastic gradient descent. In the training step, random transformations are used to augment the training data. Data augmentation can improve the network accuracy by adding variety to the training data without actually increasing the number of labeled samples. The batch size is set to 128, with a 1:1 ratio of positive to negative samples.
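The IoU computation and the labeling rule with the 0.7 and 0.3 thresholds can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: boxes are assumed to be `(x_min, y_min, x_max, y_max)` tuples, the function names are hypothetical, and the label -1 is an arbitrary marker for samples that are ignored during training.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Overlap is zero when the boxes do not intersect.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_anchor(anchor, gt_boxes, pos_thresh=0.7, neg_thresh=0.3):
    """Return 1 (positive), 0 (negative) or -1 (ignored in training)."""
    best = max((iou(anchor, gt) for gt in gt_boxes), default=0.0)
    if best > pos_thresh:
        return 1
    if best < neg_thresh:
        return 0
    return -1
```

For example, an anchor shifted by half its width relative to a ground-truth box of the same size has an IoU of 1/3, so it falls between the two thresholds and is ignored.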
Once the object proposals are determined, the region of interest (RoI) pooling layer is used to obtain fixed-size features for each region proposal; then, the softmax classifier and the bounding box regressor are trained to determine the class of the object and its bounding box.
The proposed model is optimized by using the SGD algorithm with a momentum of 0.9. The initial learning rate is 0.001 and is divided by 10 every 6 epochs. The model is trained on a single NVIDIA Tesla P100 with a batch size of 4 and a maximum of 100 epochs.
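The step learning rate schedule stated above (initial value 0.001, divided by 10 every 6 epochs) can be written as a small function. This is a sketch only; the function name and the convention that epochs are counted from 0 are assumptions.

```python
def learning_rate(epoch, base_lr=0.001, step=6, factor=10.0):
    """Step decay: divide base_lr by `factor` once per `step` completed epochs.

    `epoch` is assumed to be 0-indexed, so epochs 0-5 use base_lr,
    epochs 6-11 use base_lr / 10, and so on.
    """
    return base_lr / (factor ** (epoch // step))
```

In PyTorch, the same behavior is available through `torch.optim.lr_scheduler.StepLR` with `step_size=6` and `gamma=0.1`.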

YOLO V3 for Active Region Detection
Different from faster R-CNN, YOLO (you only look once) [28] regards detection as a pure regression problem, without the two stages of RPN and regression/classification used in faster R-CNN. YOLO directly predicts bounding box coordinates and classification labels from the image pixels. The YOLO series has been continually optimized, applying more advanced techniques for better performance and real-time applications. For example, YOLO V2 proposes a method to jointly train on object detection and classification; residual connections are used in YOLO V3; and YOLO V4 develops a more efficient and powerful model, as does YOLO V5. However, YOLO V4 and YOLO V5 may be somewhat over-engineered for our task, which is a simple binary classification. Thus, we employ the light-weight YOLO V3 as the baseline of our model.
The flow chart of YOLO V3 darknet-19 is shown in Figure 4, where block-1 is composed of conv-batchnorm-LeakyReLU-maxpool modules, and block-2 consists of conv-batchnorm-LeakyReLU modules. The route layer concatenates features from previous layers. In our task, we predict the bounding box coordinates and the classification confidence directly from the extracted features. The extracted features are divided into N × N grid cells, and the YOLO detection block predicts five values for each bounding box: $t_x$, $t_y$, $t_w$, $t_h$ and the confidence $t_o$. If the cell is offset by $(c_x, c_y)$ from the top left corner of the image, and the bounding box prior has width $p_w$ and height $p_h$, then the predicted bounding box is given by the following:
$$b_x = \sigma(t_x) + c_x, \quad b_y = \sigma(t_y) + c_y, \quad b_w = p_w e^{t_w}, \quad b_h = p_h e^{t_h},$$
where $(b_x, b_y)$ is the center of the box, and $b_w$ and $b_h$ are its width and height. Here, the two nonlinear functions, the sigmoid $\sigma$ and exp, make the prediction more efficient. In our work, we take six bounding box priors of YOLO V3: (10 × 14), (23 × 27), (37 × 58), (81 × 82), (135 × 169) and (344 × 319).
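The decoding of one prediction into a bounding box can be sketched as below. It assumes the standard YOLO V3 parameterization, in which the sigmoid σ is applied to the center offsets and to the confidence, and exp scales the prior width and height; the function names and the tuple layout are illustrative assumptions, and the cell offset used in the example is hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(t, cell, prior):
    """Decode raw predictions (t_x, t_y, t_w, t_h, t_o) into a box.

    cell:  (c_x, c_y) offset of the grid cell from the top left corner,
           in grid units.
    prior: (p_w, p_h) width and height of the bounding box prior.
    Returns (b_x, b_y, b_w, b_h, confidence); the center is in grid units.
    """
    t_x, t_y, t_w, t_h, t_o = t
    c_x, c_y = cell
    p_w, p_h = prior
    b_x = sigmoid(t_x) + c_x      # center stays inside its own grid cell
    b_y = sigmoid(t_y) + c_y
    b_w = p_w * math.exp(t_w)     # width is a positive rescaling of the prior
    b_h = p_h * math.exp(t_h)
    return b_x, b_y, b_w, b_h, sigmoid(t_o)


# All-zero predictions return the cell center and the unscaled prior.
box = decode_box((0.0, 0.0, 0.0, 0.0, 0.0), cell=(3, 4), prior=(81, 82))
```

Because the sigmoid bounds the center offset to the interval (0, 1) and exp keeps the size positive, every raw output decodes to a valid box anchored in its grid cell.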
We use the sum of squared errors and a threshold of 0.5 for bounding box regression. For classification, we simply use logistic classifiers. From Figure 4, we can see that the model predicts boxes at two different scales. The YOLO prediction module predicts the bounding box and the objectness (the confidence in being an active region or not). In this work, we predict three boxes at each scale. Thus, the output tensor is N × N × [3 × (4 + 1)] for the four bounding box offsets and one objectness prediction (being an active region or not). We take the feature map from two layers previous and upsample it by 2×. We also take a feature map from earlier in the network and concatenate it with our upsampled features. We then add more convolutional layers to process this combined feature map, and finally predict a similar tensor for the final outputs. We follow the original loss function of YOLO V3 to optimize our model. We train the network for 300 epochs on the training and validation data sets. The batch size is 64 throughout training. During training, we use standard data augmentation tricks, including random crops, rotations, and hue, saturation, and exposure shifts.
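The size of the output tensor at each scale follows directly from the N × N × [3 × (4 + 1)] expression above. The sketch below only evaluates that expression; the function name is illustrative, and the grid sizes 13 and 26 are hypothetical values, since the text does not state the grid sizes used.

```python
def yolo_output_shape(grid_size, boxes_per_scale=3, num_offsets=4, num_objectness=1):
    """Shape of the prediction tensor at one scale: N x N x [3 x (4 + 1)]."""
    channels = boxes_per_scale * (num_offsets + num_objectness)
    return (grid_size, grid_size, channels)


# Hypothetical grid sizes for the two prediction scales.
coarse = yolo_output_shape(13)
fine = yolo_output_shape(26)
```

Each grid cell thus carries 15 numbers: four offsets and one objectness score for each of its three boxes.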

Evaluation Index
There are two types of regions on the Sun: active regions and quiet regions. The active regions should be detected from the solar full-disk images.
Four possible outcomes are defined in the contingency table (Table 2). The active region is considered to be a positive sample, and the quiet region is considered to be a negative sample. The output of the object detector is the detected bounding box. Depending on the amount of overlap between the bounding box and the ground truth object, the bounding box is determined to be true or false. Two metrics, the true positive rate (TP rate) and the true negative rate (TN rate), are defined to measure the performance on the positive and negative classes, respectively.
The TP rate is the percentage of active regions correctly detected. The TN rate is the percentage of quiet regions correctly detected.
Figure 5 shows an example of faster R-CNN detection of active regions on 4 May 2016. In this example, two active regions are missed, and one quiet region is falsely detected as an active region.
A total of 572 magnetograms from 2016 to 2017, labeled with 601 active regions, are used to test each of the two detection models. For the faster R-CNN model, the TP rate is 90%, while the TN rate is 98%. Usually, small active regions or active regions at the edge of the solar disk are more likely to be missed, because active regions at the edge of the solar disk can be influenced by the projection effect of the Sun. For the YOLO V3 model, the TP rate and TN rate are both improved, with a TP rate of 94% and a TN rate of 99%.
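The two rates can be computed from the entries of the contingency table using their standard definitions, TP rate = TP / (TP + FN) and TN rate = TN / (TN + FP). The sketch below assumes these definitions; the function names are illustrative, and the counts in the example are hypothetical, not the counts of this study.

```python
def tp_rate(tp, fn):
    """Fraction of active regions (positives) that are correctly detected."""
    return tp / (tp + fn)


def tn_rate(tn, fp):
    """Fraction of quiet regions (negatives) that are correctly identified."""
    return tn / (tn + fp)


# Hypothetical counts for illustration only.
example_tp_rate = tp_rate(tp=90, fn=10)
example_tn_rate = tn_rate(tn=49, fp=1)
```

A missed active region lowers the TP rate, while a quiet region falsely detected as an active region lowers the TN rate.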

Performance
The faster R-CNN algorithm has a two-stage training process, so it is slow and hard to optimize. YOLO V3 is extremely fast because it treats detection as a one-stage regression problem. The average time to process a single image is 10 ms for YOLO V3 and 80 ms for the faster R-CNN model.

Conclusions
An active region detection dataset covering 2010 to 2017 is built. The dataset consists of solar full-disk images and the bounding boxes of active regions. The dataset is divided chronologically: the data from 2010 to 2015 are used for training, and the data from 2016 to 2017 are used for testing. Based on the training data, two deep learning detection models (faster R-CNN and YOLO V3) are trained, and their performance is evaluated and compared by using the same testing data. We find that YOLO V3 performs better than faster R-CNN in not only detection accuracy, but also computing speed. The TP rate and TN rate are 4 and 1 percentage points higher, respectively, and YOLO V3 is on average 8 times faster than faster R-CNN. From the above analysis, active regions at the edge of the solar disk are the most likely to be missed. Thus, in the future, we will collect more active regions located at the edge of the solar disk to update the dataset.