1. Introduction
The ionosphere is the region 60–1000 km above the Earth’s surface, where neutral gas molecules are partially or completely ionized by solar ultraviolet radiation, X-rays, and higher-energy cosmic particles. As a barrier, the ionosphere shields life on Earth from direct exposure to solar radiation and energetic particles from space. Additionally, its reflective properties facilitate long-range radio communication. However, ionospheric disturbances caused by external factors can pose substantial risks to radio systems, including aerospace, satellite, and telecommunication systems. With the increasing demand for precise satellite positioning, accurate and efficient prediction of the ionosphere, especially under disturbed conditions, is of great significance for improving satellite navigation precision. A large number of observations provided by the Global Navigation Satellite System (GNSS) have been used in various studies that considered broadcast or empirical models for forecasting the ionospheric total electron content (TEC), such as the Klobuchar model [1], the Bent model [2], the International Reference Ionosphere (IRI) model [3], the NeQuick model [4], and BDGIM [5]. Unfortunately, these models cannot reach the desired level of accuracy because of their statistical methodologies. The accuracy of daily TEC predictions from these models cannot exceed 75% [4,6].
Since the establishment of the International GNSS Service (IGS) working group in 1998, a number of analysis centers, including the Center for Orbit Determination in Europe (CODE), the European Space Agency (ESA), the Jet Propulsion Laboratory (JPL) in the United States, and the Universitat Politècnica de Catalunya (UPC), have been providing Global Ionospheric Maps (GIMs) to users. These GIMs facilitate research on highly accurate TEC forecasting models. CODE, ESA, and UPC additionally provide users with TEC maps one day in advance [7,8,9]. Compared with conventional empirical ionospheric models, the GIMs provided by these centers are more accurate, which benefits many associated studies and has inspired many scholars to build high-accuracy TEC forecasting models. Autoregressive models, such as the Autoregressive Moving Average (ARMA) model and the Autoregressive Integrated Moving Average (ARIMA) model, are popular because of their simplicity and data efficiency [10,11,12,13]. However, their linearity limits their application in TEC modelling. Machine learning has emerged as a potential solution, as its algorithms can automatically learn the implicit non-linear relationships between TEC values and external metrics, thereby improving prediction performance in extreme conditions. Li et al. [
14] introduced an improved model called pix2pixhd, built upon Generative Adversarial Networks (GAN), to specifically forecast TEC over China. The pix2pixhd model forecasts TEC with a 2 h lead time, and its accuracy is higher than the IRI-2016 model in diverse geomagnetic and solar activity scenarios. Liu et al. [
15] utilized the convolutional long short-term memory (ConvLSTM) model to develop two strategies for the global TEC forecasting model. One strategy directly forecasts the TEC of the following day, while the other one models the difference between two TEC maps with 24 h interval. Results indicate that the latter approach has a higher accuracy, better than the 1-day forecast product from the CODE. Chen et al. [
16] proposed a multi-step auxiliary forecast model based on the LSTM framework to mitigate the accumulation of computational errors inherent in a single-step sliding forecast. This model is able to forecast IGS GIM 6 days in advance, outperforming the IRI empirical model with more days in advance as well as giving a more stable estimation. Tang et al. [
17] improved the conventional LSTM model by including convolutional and attention layers. They further incorporated various geomagnetic indices to establish the CNN-LSTM-Attention model to predict TEC at 24 GNSS stations in the Crustal Movement Observation Network of China (CMONOC). Comparing with the NeQuick model and the traditional LSTM model, the improved LSTM model showed more stable forecasting over different months and in various geomagnetic conditions, with a remarkable reduction in root mean square error (RMSE) to 1.87 TECU. Ren et al. [
18] investigated the impact of different input time spans on TEC forecasting. They utilized the LSTM network to construct an indirect forecasting model, which forecasts spherical harmonic coefficients and subsequently calculates TEC, alongside a direct forecasting model that forecasts TEC using VTEC. Both models outperformed CODE’s forecast products and the traditional statistical models, in terms of forecasting accuracy. The IRI model is a classical empirical ionospheric model, which is often used to compare with new proposed models for accuracy verification. Xia et al. [
19] presented an ED-ConvLSTM model to forecast global ionospheric TEC and compared it with the 1-day forecast model of Beijing University of Aeronautics and Astronautics (BUAA) and the IRI-2016 model. Rajana et al. [
20] compared TEC derived from the IRI-2016 model with TEC calculated by Global Positioning System (GPS) observations. The results highlighted that the IRI-2016 model tends to underestimate TEC during periods of intense solar activity in the equatorial and low-latitude regions, especially in spring.
The LSTM model, as a recurrent neural network (RNN) model, has proven highly applicable to time-series forecasting. The aforementioned researchers have successfully developed TEC forecasting models based on LSTM networks and obtained good forecasting results. However, LSTM models suffer from slow training and high computational demand because of their large number of parameters. In response, the Gated Recurrent Unit (GRU) network has emerged to overcome these limitations of the LSTM network. Many studies have built TEC forecasting models using GRU networks and achieved promising results. The experiments conducted by Chung et al. [
21] showed that the total number of GRU units stacked in the model was more than the number of LSTM units when the number of model parameters was similar, and that GRU trains faster than the LSTM model. Iluore et al. [
22] employed GPS data from the MAL2 station in Kenya between 2010 and 2018 to construct a GRU-based TEC forecasting model, which was compared to LSTM, IRI-Plas 2017, GIM_TEC, and traditional MLP models. The result verified that the GRU model exhibited higher accuracy than the others, while its training speed was faster than the LSTM model. Lei et al. [
23] identified a limitation in traditional RNN models, which is that data from each step contribute equally to the forecast. To overcome this problem, they proposed a bidirectional GRU model incorporating an attention mechanism. The model was verified by comparing its forecast at nine TEC positions with those obtained from LSTM, bidirectional LSTM, and four other models. The result showed that the improved GRU model outperformed the other models in terms of forecasting accuracy at all nine positions.
Various global ionospheric TEC forecasting models have been built based on deep learning algorithms and have achieved high forecast accuracy [
17,
18,
19,
24]. However, the detailed features of the regional ionosphere are difficult to learn due to the limitation of low accuracy and resolution of GIM. Therefore, constructing TEC forecasting models with higher accuracy based on the high-precision Regional Ionospheric Maps (RIMs) has become a research hotspot in recent years. The RIMs have been established in different regions, including China [
25], the Korean Peninsula [
26], Japan [
27], India [
28,
29], Thailand [
30], South Africa [
31], and so on. Harsha et al. [
28] proposed a model which assimilated ground GPS TEC data obtained from the GPS Aided GEO Augmented Navigation (GAGAN) network over the low-latitude Indian subcontinent into the Thermosphere Ionosphere Electrodynamics General Circulation Model (TIEGCM). High-latitude electric field models (Weimer and Heelis) were also utilized to construct the TIEGCM. The results showed that the proposed models successfully presented a negative storm over the Indian region. The correlation between the forecasts and observations was 93% during quiet days and 82% during the main phase of the storm. Sivakrishna et al. [
29] used TEC data from 26 GPS stations data over the Indian region to forecast VTEC maps. They proposed an ionospheric TEC forecast model based on a bidirectional LSTM (bi-LSTM) algorithm to generate RIMs, which can capture the downward shift of depleted equatorial ionization anomaly (EIA) TEC structures during storm periods and which outperforms the ANN model and LSTM model. Furthermore, a number of studies have been conducted on forecasting high-precision TEC for single station or single grid point, effectively improving the accuracy of single-point positioning [
6,
32,
33,
34,
35]. Kaselimi et al. [
34] exploited the advantages of LSTM RNN for time series modeling and forecasted the TEC from a station by using a causal, supervised deep-learning method. Natras et al. [
35] developed learning algorithms of Decision Tree and ensemble learning of Random Forest, Adaptive Boosting (AdaBoost), and eXtreme Gradient Boosting (XGBoost) and found the developed models performed well in both quiet and storm conditions. The global optimal strategy was not used in the single station models, hence a higher accuracy was obtained than with the regional forecasting model. Currently, studies on forecasting ionospheric TEC over China usually use TEC interpolated from the GIM provided by CODE. In contrast to the sites used by CODE to model the GIM, CMONOC stations are more evenly distributed over China, thereby making it more feasible to build high-precision China Regional Ionospheric Maps (CRIMs), and then to build high-precision ionospheric forecasting over China.
Conventional LSTM and GRU networks take 1-D vectors as input, which limits their ability to forecast regional TEC. In this paper, we therefore propose a bidirectional convolutional gated recurrent unit (BiConvGRU) network-based model to forecast regional ionospheric TEC over China. We use gridded TEC maps from the CRIMs from 2015 to 2018 with a 1 h interval as the dataset, covering both quiet and storm periods of ionospheric TEC. All fully connected layers in the GRU network are replaced by convolutional operations, enabling the model to extract spatial features. The BiConvGRU model can capture correlations in both the forward and backward directions of the time series, which makes it perform better than the state-of-the-art (SOTA) ConvLSTM model.
3. Experiment and Analysis
Previous studies have shown that incorporating geomagnetic and solar activity indices as supplementary parameters significantly improves forecasting models. Geomagnetic activity results in higher peak values of the TEC background than during geomagnetically quiet periods, leading to lower accuracy of the forecasted TEC during magnetic storms. Therefore, evaluating the model’s ability to forecast TEC during magnetic storms is a crucial measure of its resilience and an important part of assessing its accuracy.
The variations of the geomagnetic and solar activity indices during Day of Year (DOY) 231–240 in 2018 [47] are illustrated in Figure 4a–d, which show the Kp, ap, Dst, and F10.7 indices, respectively. Table 1 and Table 2 show the classifications of magnetic storms and solar activity corresponding to specific indices.
In this study, the Dst index was selected as the primary parameter to assess the intensity of magnetic storms. According to the classification criteria in
Table 1, a day with Dst index below −30 nT can be defined as a magnetic storm day. Thus, there are 214 days with weak magnetic storms, 105 days with moderate magnetic storms, 16 days with strong magnetic storms, and 3 days with severe magnetic storms among the total 985 days of the training dataset. Taking into account all indices in
Figure 4 and classification criteria in
Table 1 and
Table 2, DOY 231–235 corresponds to quiet days as the Dst index exceeds −30 nT, the ap index is less than 32 nT, and the maximum Kp index is 4. However, during DOY 236–240, the ap and Kp indices increased and the Dst index decreased sharply; the peak values of the ap and Kp indices and the lowest Dst index are observed on DOY 238 with the minimum Dst index of −175 nT, indicating a severe magnetic storm occurred during the period. In addition, the monthly average of the F10.7 index in this month is 71 sfu, suggesting relatively weak solar activity, and hence little solar influence on the variation of ionospheric TEC. This study focuses on analyzing the forecasting model’s accuracy for both quiet days (DOY 231–235) and magnetic storm days (DOY 236–240).
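As an illustration of the storm-day bookkeeping described above, the following Python sketch flags a day as a storm day when its Dst falls below −30 nT and tallies the class counts quoted in the text. Using the lowest Dst value of the day is an assumption made here, as are the function names and the sample values; the class boundaries of Table 1 are not reproduced.

```python
def is_storm_day(dst_values_nt, threshold_nt=-30.0):
    """True when the lowest Dst value of the day is below the threshold (nT)."""
    return min(dst_values_nt) < threshold_nt


def count_storm_days(daily_dst):
    """Count storm days in a mapping {day label: list of Dst values in nT}."""
    return sum(1 for values in daily_dst.values() if is_storm_day(values))


# Storm-day counts quoted in the text for the 985-day training dataset.
storm_counts = {"weak": 214, "moderate": 105, "strong": 16, "severe": 3}
total_days = 985
storm_days = sum(storm_counts.values())
storm_share = 100.0 * storm_days / total_days  # percentage of storm days

# Hypothetical Dst samples for two days, for illustration only.
example = {"day_a": [-5.0, -12.0, -20.0], "day_b": [-10.0, -80.0, -175.0]}
n_storm = count_storm_days(example)
```

With the quoted counts, storm days of any class make up 338 of the 985 training days, or about 34%.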
3.1. The Evaluation Metrics for the Model
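To evaluate the accuracy of the models during quiet and storm periods, the RMSE, MAE, and correlation coefficient $\rho$ are calculated by the following formulas:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(\mathrm{TEC}_{i,t}^{\mathrm{CRIM}} - \mathrm{TEC}_{i,t}^{\mathrm{pre}}\right)^{2}}$$

$$\mathrm{MAE} = \frac{1}{N}\sum_{t=1}^{N}\left|\mathrm{TEC}_{i,t}^{\mathrm{CRIM}} - \mathrm{TEC}_{i,t}^{\mathrm{pre}}\right|$$

$$\rho = \frac{\mathrm{Cov}\left(X_{t}, Y_{t}\right)}{\sigma_{X_{t}}\,\sigma_{Y_{t}}}$$

where $N$ signifies the total number of TEC data; $i$ denotes a specific grid point, and there are $M$ grid points in total; $\mathrm{TEC}_{i,t}^{\mathrm{CRIM}}$ and $\mathrm{TEC}_{i,t}^{\mathrm{pre}}$, respectively, indicate the CRIM TEC and the forecasted TEC at time $t$ for a given grid point; and $X_{t}$ and $Y_{t}$ represent the sequences of CRIM TEC and forecasted TEC, respectively, for all grid points at time $t$. The accuracy of each grid point is first calculated, and then averaged over the $M$ grid points to obtain the error of the forecasting model for the entire region.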
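The evaluation procedure described above (compute each metric per grid point, then average over the region) can be sketched in Python as follows. This is a minimal sketch: the function names and the toy series are hypothetical, and the Pearson form is assumed for the correlation coefficient.

```python
import math


def rmse(obs, pred):
    """Root mean square error between two equal-length series."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error between two equal-length series."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def pearson(obs, pred):
    """Pearson correlation coefficient (undefined for constant series)."""
    n = len(obs)
    mean_o, mean_p = sum(obs) / n, sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs))
    std_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    return cov / (std_o * std_p)


def regional_mean(metric, obs_by_point, pred_by_point):
    """Evaluate the metric at every grid point, then average over the points."""
    scores = [metric(o, p) for o, p in zip(obs_by_point, pred_by_point)]
    return sum(scores) / len(scores)


# Hypothetical toy data in TECU: two grid points, three epochs each.
obs = [[10.0, 12.0, 14.0], [5.0, 6.0, 7.0]]
pred = [[11.0, 12.0, 13.0], [5.0, 7.0, 7.0]]

region_rmse = regional_mean(rmse, obs, pred)
region_mae = regional_mean(mae, obs, pred)
```

Averaging per-point scores, rather than pooling all errors into one sum, gives every grid point the same weight in the regional figure.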
3.2. Analysis of Forecast Accuracy on Quiet Days
3.2.1. Analysis of Forecast Accuracy at Different Time Points
This paper takes DOY 233 as an example to analyze forecasting accuracy for quiet days.
Figure 5 presents the TEC modeled by CRIM and IRI-2016, and forecasted by the BiConvLSTM-A, BiConvLSTM-B, ConvLSTM-B, ConvGRU-B, BiConvGRU-A, and BiConvGRU-B models at UT 0, UT 6, UT 12, and UT 18 on DOY 233. The corresponding MAE of each model is labeled in the figure.
Figure 6 presents the error histogram at UT 6 and UT 12 to analyze the error distribution of different models. The X-coordinate shows the MAE between the forecasted TEC and the CRIM TEC on DOY 233. Y-coordinate shows the percentage of grid points.
Figure 7 shows the correlation analysis between the forecasted TEC of each model and the CRIM TEC throughout the 24 h in DOY 233. The scatter density plots in
Figure 7a–g represent the IRI-2016, BiConvLSTM-A, ConvLSTM-B, BiConvLSTM-B, BiConvGRU-A, ConvGRU-B and BiConvGRU-B models with CRIM TEC.
From
Figure 5, it can be seen that all of the seven forecasting models effectively capture the ionospheric equatorial anomaly in the region of China. Moreover, areas with higher TEC generally exhibit good agreement with CRIM TEC. Notably, the IRI-2016 model demonstrates significantly higher MAE at UT 12 and UT 18 compared to deep learning models. The integration of additional indices in the BiConvLSTM and BiConvGRU models improves their fitting performance, verified by smaller MAEs in BiConvLSTM-B and BiConvGRU-B models in contrast with MAEs in BiConvLSTM-A and BiConvGRU-A, respectively. Except for UT 12, the BiConvGRU model shows lower MAE compared to the BiConvLSTM model. We found that the MAE of BiConvGRU-B model and that of the ConvGRU-B model were very similar, meaning the bidirectional structure is not always effective for feature extraction.
From Figure 6A(a–g) at UT 6, it can be seen that the ConvGRU-B, BiConvGRU-A, and BiConvGRU-B models perform better than the ConvLSTM-B, BiConvLSTM-A, and BiConvLSTM-B models, respectively, because a higher percentage of their grid points have low MAE. IRI-2016 performs better than the deep-learning models without additional indices at UT 6, since these models follow a global optimization strategy in regional forecasting, which involves a trade-off. The BiConvGRU-B model has more than 50% of grid points with an MAE of 0–1 TECU. However, its maximum MAE also increases to about 10 TECU. A higher percentage of low MAE and a lower percentage of high MAE lower the average MAE. In most cases, models with higher accuracy have a smaller maximum MAE. From Figure 6B(a–g) at UT 12, it can be seen that the MAE of the IRI-2016 model is poorly distributed, which leads to the highest average MAE shown in Figure 5. The BiConvGRU-B model has over 60% of grid points with an MAE within 1 TECU. In addition, the models with lower average MAE in Figure 5 are those with a high percentage of low MAE. The bidirectional models with additional indices perform better.
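The error histograms discussed above report the percentage of grid points falling in each error bin. A minimal Python sketch of that binning is given below; the 1 TECU bin width, the number of bins, the function name, and the sample errors are all assumptions for illustration.

```python
def error_histogram(abs_errors, bin_width=1.0, n_bins=10):
    """Percentage of grid points per absolute-error bin.

    Bin k covers [k * bin_width, (k + 1) * bin_width); the last bin also
    collects every error beyond the histogram range.
    """
    counts = [0] * n_bins
    for err in abs_errors:
        index = min(int(err // bin_width), n_bins - 1)
        counts[index] += 1
    total = len(abs_errors)
    return [100.0 * c / total for c in counts]


# Hypothetical absolute errors (TECU) at eight grid points.
abs_errors = [0.2, 0.7, 0.9, 1.4, 2.6, 0.1, 3.3, 0.5]
hist = error_histogram(abs_errors)
share_below_1_tecu = hist[0]
```

In this toy case five of the eight points fall in the first bin, so the share within 0–1 TECU is 62.5%; a distribution concentrated in the first bins yields a low average MAE.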
In the correlation analysis shown in
Figure 7, it can be seen that the majority of TEC values on DOY 233 fall within the range of 0–10 TECU. The correlation coefficient of the IRI-2016 model is notably lower than that of the deep-learning models, while the other models exhibit correlation coefficients around 0.98. The correlation coefficients of the ConvGRU-B model are higher than that of the widely used ConvLSTM-B model. Additionally, it is worth noting that the peak TEC values predicted by the BiConvLSTM-A and BiConvLSTM-B models are higher compared to those predicted by the BiConvGRU-A and BiConvGRU-B models.
For the purpose of providing a more comprehensible comparison of the forecast accuracy among different models,
Table 3 is presented, displaying the 24 h average values of various metrics for the models on DOY 233. As per the findings from
Table 3, it is evident that the IRI-2016 model exhibits considerably higher RMSE and MAE values compared to the other models. The ConvGRU-B model achieves a reduction of 11.9% in RMSE and 11.1% in MAE in comparison to the ConvLSTM-B model. The BiConvLSTM-B model achieves a noteworthy reduction of 34.7% in both RMSE and MAE when compared to the BiConvLSTM-A model. Similarly, the BiConvGRU-B model showcases a significant reduction of 28.3% in RMSE and 31% in MAE in comparison to the BiConvGRU-A model, as well as a reduction of 26.4% in RMSE and 27.5% in MAE when compared to the ConvGRU-B model. Notably, the BiConvGRU-B model demonstrates a substantial enhancement in forecast accuracy, with a remarkable reduction of 58.1% in RMSE and 60.6% in MAE when compared to the IRI-2016 model, and 35.1% in RMSE and 35.6% in MAE when compared to the classical ConvLSTM-B model.
During the magnetically quiet day, the BiConvGRU-B model exhibits a notable improvement in forecast accuracy when contrasted with the empirical IRI-2016 model. The higher forecast accuracy observed in the BiConvGRU-B model compared to the ConvGRU-B model indicates the effectiveness of the bidirectional mechanism. In addition, the prediction accuracy is somewhat improved compared to the BiConvLSTM-B model.
3.2.2. Analysis of Model Forecast Accuracy Variation with Longitude at Different Latitudes
Figure 8 showcases the RMSE variation of the IRI-2016, ConvLSTM-B, ConvGRU-B, BiConvLSTM-A, BiConvGRU-A, BiConvLSTM-B, and BiConvGRU-B models with longitude across various latitudes on DOY 233.
Figure 8a–e correspond to latitudes 15°N, 25°N, 35°N, 45°N, and 55°N, respectively.
Table 4 summarizes the RMSE values for these seven models at different latitudes.
Based on the analysis of
Figure 8 and
Table 4, it can be observed that overall, the RMSE values for all seven models demonstrate a declining trend as latitude increases. The IRI-2016 model consistently exhibits higher RMSE values compared to other deep learning models across various latitudes. The RMSE curves of the BiConvGRU-B model consistently lie below those of the BiConvLSTM-B model. Furthermore, according to
Table 4, the RMSE values of the BiConvGRU-B model remain below 2 TECU for all latitude ranges, except at 25°N where it slightly exceeds that of the BiConvLSTM-B model. The RMSE of the ConvLSTM-B model is higher than the BiConvGRU-B across various latitudes. Furthermore, it can be seen that there are small peaks in RMSE for different models near 70°E and 140°E, and an elevation in RMSE at 55°N. This can be attributed to the limited availability of CMONOC stations, which impacts the quality of extracted TEC data in the peripheral region of China. We will add some nearby IGS stations to assist in modeling in a subsequent study and study the variation of RMSE among the models under this condition.
In conclusion, the analysis suggests that the BiConvGRU-B model outperforms the BiConvLSTM-B model and IRI-2016 model in terms of RMSE at different latitudes during quiet days, highlighting its better forecasting performance.
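The longitude profiles of RMSE at a fixed latitude can be obtained by computing the RMSE over time separately for every longitude column of one latitude row. The sketch below assumes maps stored as nested lists indexed [time][latitude][longitude]; this layout, the function name, and the toy values are hypothetical.

```python
import math


def rmse_along_longitude(obs_maps, pred_maps, lat_index):
    """RMSE over time for each longitude on one latitude row."""
    n_times = len(obs_maps)
    n_lons = len(obs_maps[0][lat_index])
    profile = []
    for j in range(n_lons):
        squared = sum(
            (obs_maps[t][lat_index][j] - pred_maps[t][lat_index][j]) ** 2
            for t in range(n_times)
        )
        profile.append(math.sqrt(squared / n_times))
    return profile


# Hypothetical maps in TECU: 2 epochs x 2 latitudes x 3 longitudes.
obs_maps = [
    [[10.0, 11.0, 12.0], [5.0, 5.0, 5.0]],
    [[12.0, 13.0, 14.0], [6.0, 6.0, 6.0]],
]
pred_maps = [
    [[10.0, 12.0, 12.0], [5.0, 5.0, 6.0]],
    [[12.0, 12.0, 16.0], [6.0, 6.0, 6.0]],
]
profile = rmse_along_longitude(obs_maps, pred_maps, lat_index=0)
```

Plotting one such profile per latitude row gives curves of the kind compared in the text.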
3.2.3. Analysis of Model Forecast Accuracy at Individual Grid Points
To further assess the forecast accuracy of the proposed models for quiet-day TEC, interpolation was conducted using the 1-day Predicted GIM (C1PG) data provided by the CODE center. This enabled TEC predictions for nine specific grid points at various longitudes and latitudes within the Chinese region for DOY 231–235. A comparative analysis was then performed using the TEC predictions of the IRI-2016, BiConvLSTM-B, and BiConvGRU-B models at these corresponding grid points.
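The text does not state which interpolation scheme was applied to the C1PG maps. As one plausible option, the sketch below uses bilinear interpolation between the four surrounding GIM nodes; the scheme, the function name, and the node values are assumptions for illustration.

```python
def bilinear(lat, lon, lat0, lat1, lon0, lon1, v00, v01, v10, v11):
    """Bilinear interpolation inside one grid cell.

    v00 is the value at (lat0, lon0), v01 at (lat0, lon1),
    v10 at (lat1, lon0), and v11 at (lat1, lon1).
    """
    p = (lon - lon0) / (lon1 - lon0)  # fractional position in longitude
    q = (lat - lat0) / (lat1 - lat0)  # fractional position in latitude
    return (
        (1 - p) * (1 - q) * v00
        + p * (1 - q) * v01
        + (1 - p) * q * v10
        + p * q * v11
    )


# Hypothetical TEC values (TECU) at the corners of one cell.
tec = bilinear(25.0, 102.5, 25.0, 27.5, 100.0, 105.0, 10.0, 20.0, 30.0, 40.0)
```

When a target grid point coincides with a GIM node, the weights collapse onto that node and the interpolation returns its value unchanged.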
Figure 9 showcases the CRIM TEC values at the nine grid points within the Chinese region during quiet days, alongside the 5-day TEC predictions from the four models. Meanwhile,
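Table 5 presents the RMSE values for the 5-day predictions of the four models. The RMSE in Table 5 is calculated by Equation (5):

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(\mathrm{TEC}_{t}^{\mathrm{CRIM}} - \mathrm{TEC}_{t}^{\mathrm{model}}\right)^{2}} \qquad (5)$$

where $\mathrm{TEC}_{t}^{\mathrm{CRIM}}$ represents the CRIM TEC, $\mathrm{TEC}_{t}^{\mathrm{model}}$ represents the value given by each model, and the sum runs over the $N$ epochs of the 5-day period at the grid point.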
Based on the analysis of
Figure 9 and
Table 5, it can be observed that the TEC predictions of the IRI-2016 model differ notably from those of the other three models in the 25°N region, indicating inadequate fitting of the TEC minima. Similarly, the C1PG model yields higher TEC predictions than the other three models in the 45°N region. The BiConvLSTM-B model performs poorly for DOY 233–235: it overestimates TEC at 35°N and 45°N during ionization periods, resulting in large errors. In this region, the IRI-2016 model is second only to the BiConvGRU-B model. On the other hand, the TEC predictions of the BiConvGRU-B model align most closely with the CRIM TEC values, with the lowest RMSE among the four models, except at (25°N, 120°E).
The average RMSE values over the nine grid points are 2.42 TECU for IRI-2016, 2.73 TECU for C1PG, 2.04 TECU for BiConvLSTM-B, and 1.39 TECU for BiConvGRU-B. Compared with the IRI-2016, C1PG, and BiConvLSTM-B models, the BiConvGRU-B model reduces the RMSE by 42.6%, 49.1%, and 31.9%, respectively.
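The quoted reductions follow from the relative difference between each baseline RMSE and the BiConvGRU-B RMSE. The short Python check below reproduces them from the four averages given in the text; the function and variable names are illustrative.

```python
def percent_reduction(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


# Average quiet-day RMSE over the nine grid points, as quoted in the text (TECU).
baseline_rmse = {"IRI-2016": 2.42, "C1PG": 2.73, "BiConvLSTM-B": 2.04}
biconvgru_b_rmse = 1.39

reductions = {
    name: round(percent_reduction(value, biconvgru_b_rmse), 1)
    for name, value in baseline_rmse.items()
}
# reductions -> {'IRI-2016': 42.6, 'C1PG': 49.1, 'BiConvLSTM-B': 31.9}
```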
3.3. Analysis of Model Forecast Accuracy during Geomagnetic Storm Days
3.3.1. Analysis of Model Forecast Accuracy at Different Time Intervals
Figure 10 shows the TEC values of the CRIM, IRI-2016, BiConvLSTM-A, BiConvLSTM-B, ConvLSTM-B, ConvGRU-B, BiConvGRU-A, and BiConvGRU-B models at different time points (UT 0, UT 6, UT 12, UT 18) on DOY 238, providing a specific illustration of the models’ forecasting accuracy during days of geomagnetic storms. In
Figure 10, the CRIM TEC data are presented in a comparison to investigate the feature extraction capability of the model on DOY 238.
Figure 11 presents the error histogram for UT 6 and UT 12 to analyze the error distribution of different models. The X-coordinate indicates the MAE between the model value and the CRIM TEC, and the Y-coordinate indicates the percentage of grid points.
Figure 12 presents a correlation analysis between the 24 h TEC predictions of the various models and the CRIM TEC values for DOY 238.
Figure 12a–g display the scatter density plots of the IRI-2016, BiConvLSTM-A, BiConvLSTM-B, ConvLSTM-B, ConvGRU-B, BiConvGRU-A, and BiConvGRU-B models against the CRIM TEC, respectively.
Based on
Figure 10, it is evident that TEC values during geomagnetic storms are considerably higher than on quiet days, resulting in lower forecasting accuracy for all seven models under storm conditions. Among the models, BiConvGRU-B demonstrates relatively better forecasting performance during geomagnetic storms, although its predicted TEC values remain notably lower than the CRIM TEC values. At UT 6, the BiConvLSTM and BiConvGRU models without additional indices (BiConvLSTM-A and BiConvGRU-A) both exhibit higher MAE than the IRI-2016 model. However, incorporating the indices reduces the MAE by 19% and 23.9%, respectively, a substantial improvement. Moreover, at this time point, the MAE of the models with additional indices is lower than that of the IRI-2016 model, suggesting that including the indices enhances forecast accuracy.
From
Figure 11A(a–g) at UT 6, we can learn that the ConvGRU-B, BiConvGRU-A, and BiConvGRU-B models perform better than the ConvLSTM-B, BiConvLSTM-A, and BiConvLSTM-B models, respectively, which is the same as on quiet days. The IRI-2016 model performs better than the deep-learning models without additional indices at UT 6 except for the ConvLSTM-B model, because the ConvLSTM-B model has less than 10% of MAE within 1 TECU, which is the smallest among all models. The maximum MAE of the BiConvGRU-B model at UT 6 is approximately 5 TECU lower than that of the other models. The BiConvGRU-B model has more than 50% of MAE in 0–1 TECU, which is the highest among all the models. From Figure 11B(a–g) at UT 12, it can be seen that the MAE distribution of the IRI-2016 model is the worst, leading to the highest average MAE in Figure 10. The BiConvLSTM-B model performs better than the BiConvGRU-B model at UT 12, as its MAE within 1 TECU exceeds 30%. A higher percentage of MAE in the range of 10–15 TECU causes the IRI-2016, ConvLSTM-B, BiConvLSTM-A, ConvGRU-B, and BiConvGRU-A models to perform worse than the BiConvLSTM-B and BiConvGRU-B models.
The correlation analysis presented in
Figure 12 indicates that the correlation coefficient of the IRI-2016 model remains lower than those of the deep-learning models during geomagnetic storms. Despite the storm conditions, the majority of TEC values still fall within the range of 0–10 TECU. While the correlation coefficients of the models decrease slightly compared with quiet days, incorporating the additional indices improves the correlation coefficients of both the BiConvLSTM and BiConvGRU models, with BiConvGRU-B exhibiting the highest correlation coefficient among the seven models.
Table 6 provides the average values of various metrics for the different models on DOY 238. The results reveal that during magnetic storms, the IRI-2016 model has higher RMSE and MAE than the other models. The BiConvLSTM-B model achieves a reduction of 17.7% in RMSE and 21.9% in MAE compared with the BiConvLSTM-A model. Similarly, the BiConvGRU-B model shows a decrease of 27.4% in RMSE and 24.1% in MAE compared with the BiConvGRU-A model. Compared with the ConvGRU-B model, the BiConvGRU-B model exhibits a reduction of 22.3% in RMSE and 21.9% in MAE. The BiConvGRU-B model shows a decrease of 24.1% in both RMSE and MAE compared with the ConvLSTM-B model. Furthermore, compared with the IRI-2016 model, the BiConvGRU-B model demonstrates a substantial decrease of 41.5% in RMSE and 49.2% in MAE. Notably, during magnetic storms, only the BiConvGRU-B model achieves an RMSE below 3 TECU, indicating higher forecast accuracy than the other models.
3.3.2. Analysis of the Variation of Model Forecast Accuracy with Longitude at Different Latitudes
Figure 13 presents the variation of RMSE with longitude for the IRI-2016, ConvLSTM-B, ConvGRU-B, BiConvLSTM-A, BiConvGRU-A, BiConvLSTM-B, and BiConvGRU-B models at different latitudes on DOY 238.
Table 7 summarizes the RMSE values of the models at various latitudes. Analysis of Figure 13 and Table 7 reveals that during magnetic storms, the RMSE values of all seven models tend to be higher at lower latitudes than at higher latitudes, reflecting a distinct equatorial anomaly. The RMSE curves of the IRI-2016 model consistently lie above those of the other models, with the peak RMSE reaching 12 TECU. The BiConvLSTM-B and BiConvGRU-B models exhibit peak RMSE values of approximately 9 TECU and 7 TECU, respectively. Moreover, Table 7 indicates that, of all the latitudes, only at 35°N is the RMSE of the BiConvGRU-B model higher than that of the BiConvLSTM-B model. The most notable improvement in RMSE is observed at 25°N, with a reduction of 62.7% compared with the IRI-2016 model, 30.5% compared with the BiConvLSTM-B model, and 45.9% compared with the ConvLSTM-B model. The differences among the models are not significant at high latitudes and are mainly reflected in the forecast accuracy at low latitudes. Based on these findings, it can be concluded that during magnetic storms, the BiConvGRU-B model generally exhibits better RMSE performance than the other models across latitudes, the only exception being 35°N relative to the BiConvLSTM-B model.
3.3.3. The Forecast Accuracy Analysis of a Single-Grid Point Model
Figure 14 shows the CRIM TEC values at nine grid points within the Chinese region, alongside the predicted TEC values derived from the remaining four models during the period of DOY 236–240, amidst the occurrence of a geomagnetic storm.
Table 8 provides the average RMSE values for the 5-day predictions offered by the four models. Upon scrutinizing
Figure 14 and
Table 8, it becomes evident that the IRI-2016 model fits the TEC troughs poorly at 25°N and 35°N. In contrast, the C1PG model noticeably overestimates TEC at 35°N and 45°N compared with the other three models. DOY 238 saw heightened geomagnetic activity, producing a markedly higher peak in the TEC background than on the other four days and, consequently, poorer fitting by all four forecast models on that day. The TEC prediction curve of the BiConvGRU-B model aligns closely with the CRIM TEC values overall, with lower RMSE than the other three models, except at (25°N, 100°E). The average RMSE values of the IRI-2016, C1PG, BiConvLSTM-B, and BiConvGRU-B models at the nine grid points are 2.91 TECU, 3.08 TECU, 2.38 TECU, and 2.02 TECU, respectively. The BiConvGRU-B model therefore reduces the RMSE by 30.6%, 34.4%, and 15.1% relative to the IRI-2016, C1PG, and BiConvLSTM-B models, respectively. The improvement is notable relative to the IRI-2016 and C1PG models and more modest relative to the BiConvLSTM-B model.
While the BiConvGRU-B model demonstrates superior overall forecast accuracy compared to the other three models, it is noteworthy that the C1PG model successfully captured the occurrence of the geomagnetic storm on DOY 238, exhibiting significantly higher forecast values in comparison to the other four days. However, both the BiConvGRU-B and BiConvLSTM-B models tend to overestimate the geomagnetic disturbance at certain grid points. This phenomenon can be attributed to the fact that only 12.8% of the training dataset comprised samples with moderate-to-severe geomagnetic storms. Despite the inclusion of auxiliary variables such as geomagnetic activity and solar activity indices during model training, the learned features specific to geomagnetic storm days gradually weaken as the number of training samples increases. Consequently, this can lead to false alarms in the predictions.
3.4. Analysis of Forecast Accuracy throughout 2018
To assess the long-term predictive capability of the models,
Figure 15 presents the cumulative distribution of the Mean Absolute Error (MAE) for the BiConvLSTM-B and BiConvGRU-B models in spring (March–May), summer (June–August), autumn (September–November), and winter (December–February) of 2018. The proportion of MAE values within 0–1 TECU is significantly higher for the BiConvGRU-B model than for the BiConvLSTM-B model, with respective increases of 50%, 47.6%, 61%, and 58.8% in spring, summer, autumn, and winter. The percentage of MAE values exceeding 2.5 TECU is also consistently lower for the BiConvGRU-B model in all four seasons, at 4.8%, 0%, 10.8%, and 1.9%, respectively. By raising the share of MAE values within 0–1 TECU and reducing the share above 2.5 TECU, the BiConvGRU-B model shows superior predictive performance.
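The two summary figures used above, the share of MAE values within 0–1 TECU and the share above 2.5 TECU, can be read off a cumulative distribution or counted directly. The sketch below assumes one MAE value per forecast day within a season. The function name `mae_shares` and the sample values are hypothetical and are not data from the study.

```python
def mae_shares(daily_mae, low=1.0, high=2.5):
    """Return the percentage of values with MAE <= low and with MAE > high."""
    n = len(daily_mae)
    within_low = sum(1 for m in daily_mae if m <= low)
    above_high = sum(1 for m in daily_mae if m > high)
    return 100.0 * within_low / n, 100.0 * above_high / n

# Hypothetical daily MAE values (TECU) for one season, for illustration only.
sample_mae = [0.6, 0.8, 0.9, 1.1, 1.3, 1.4, 1.8, 2.2, 2.6, 3.0]

share_low, share_high = mae_shares(sample_mae)
print(f"MAE <= 1 TECU: {share_low:.1f}%, MAE > 2.5 TECU: {share_high:.1f}%")
```

A model with better predictive performance raises the first percentage and lowers the second, which is the comparison made between the two models for each season.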
In accordance with the classification criteria outlined in
Table 1, the 365 days of the 2018 validation set were divided into storm days and quiet days. A total of 46 days (12.6% of the year) were categorized as storm days: 36 days with weak storms, 9 days with moderate storms, and 1 day with a severe storm. The remaining 319 days were classified as quiet days.
Table 9 lists the RMSE values of the BiConvLSTM-B and BiConvGRU-B models under different geomagnetic conditions. The RMSE values on storm days are notably higher than those on quiet days, and the BiConvGRU-B model is more accurate than the BiConvLSTM-B model under all geomagnetic conditions. The mean RMSE values of the BiConvLSTM-B and BiConvGRU-B models on storm days are 2.13 TECU and 1.91 TECU, respectively.
Other studies also show low accuracy in forecasting TEC during magnetic storm periods in the same region. For example, Xia et al. [
48] proposed a support vector machine (SVM) model to forecast TEC in the China region, and the RMSE of the model was 2.02 TECU under storm conditions. Shi et al. [
49] proposed a bidirectional LSTM model for the China region, and the RMSE reached 2.07 TECU at low latitudes under storm conditions. With a storm-day RMSE of 1.91 TECU, the BiConvGRU-B model in our study performs better than these models. Our research is most similar to the work by Gao et al. [
41], who proposed a storm-time ionospheric TEC model for the China region with multichannel features based on the ConvLSTM network. However, the RIMs they used were generated from CODE's GIM. Their MAE and RMSE values reached minima of 0.98/1.31 TECU in quiet periods and 1.44/1.88 TECU in storm periods. Furthermore, the accuracy of the ConvLSTM models would decrease if the number of output steps increased.
Figure 16 compares the RMSE of the TEC predictions of the BiConvLSTM-B and BiConvGRU-B models for the twelve months of 2018. The BiConvGRU-B model consistently outperforms the BiConvLSTM-B model, with lower RMSE and MAE values in all months. The largest improvement in RMSE occurs in July, with a decrease of 0.34 TECU, while the largest improvement in MAE occurs in November, with a decrease of 0.46 TECU. Over the whole of 2018, the BiConvLSTM-B model had an RMSE of 2.00 TECU and an MAE of 1.62 TECU, whereas the BiConvGRU-B model achieved an RMSE of 1.71 TECU and an MAE of 1.22 TECU. The BiConvGRU-B model thus reduced the RMSE by 14.5% and the MAE by 24.7% relative to the BiConvLSTM-B model, which indicates better accuracy and stability for long-term forecasting.
Xia et al. [
50] proposed calculating the RMSE and MAE between the truth maps of the previous day and the future day as a baseline for analyzing a model's ability to learn the relationship between past TEC maps and future TEC maps. Using this method, we calculated the RMSE and MAE between the CRIMs of the previous day and the future day: they were 1.81 TECU and 1.17 TECU, respectively. The RMSE of the BiConvGRU-B model was 5.6% lower than this CRIM baseline, but its MAE was 4.3% higher. By contrast, the RMSE and MAE of the BiConvLSTM-B model were 10.5% and 38.5% higher than the baseline RMSE and MAE, respectively. The BiConvGRU-B model therefore learned more of the relationship between past TEC maps and future TEC maps than the BiConvLSTM-B model.
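The previous-day baseline described above amounts to using each day's truth map as the forecast for the following day. The sketch below illustrates this under stated assumptions: each map is stored as a flat list of grid values, and errors are pooled over all grid points and day pairs. The source does not state whether errors were pooled or averaged per day. The function name `persistence_errors` and the toy maps are hypothetical and are not data from the study.

```python
import math

def persistence_errors(maps):
    """RMSE and MAE when each day's map is predicted by the previous day's map.

    `maps` is a sequence of daily TEC maps, each a flat list of grid values (TECU).
    Errors are pooled over all grid points and all consecutive day pairs.
    """
    sq_sum = 0.0
    abs_sum = 0.0
    count = 0
    for previous, current in zip(maps, maps[1:]):
        for p, c in zip(previous, current):
            diff = c - p
            sq_sum += diff * diff
            abs_sum += abs(diff)
            count += 1
    return math.sqrt(sq_sum / count), abs_sum / count

# Hypothetical three-day series on a four-point grid, for illustration only.
toy_maps = [
    [10.0, 12.0, 14.0, 16.0],
    [11.0, 12.0, 13.0, 18.0],
    [11.0, 14.0, 13.0, 18.0],
]

rmse, mae = persistence_errors(toy_maps)
print(f"persistence RMSE = {rmse:.2f} TECU, MAE = {mae:.2f} TECU")
```

A forecast model whose RMSE or MAE is below the value obtained this way has learned something beyond simply repeating the previous day's map.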