1. Introduction
The Earth’s ionosphere is a medium that contains electrically charged particles. It is ionized by solar radiation, and its state varies constantly with the ever-changing space weather conditions. During daytime, increased solar radiation drives photoionization and hence produces more ionized particles; during nighttime, photoionization ceases in the absence of sunlight, leading to reduced electron density in the ionosphere. The ionosphere contains enough ionization to affect the propagation of radio waves [1] (p. 1). Global Navigation Satellite Systems (GNSS) use radio waves to estimate positions on Earth and in space, and depending on the state of the ionosphere, these radio signals experience a delay. The ionospheric state can be described by the Total Electron Content (TEC), commonly measured in TEC Units (TECU), where one TECU corresponds to 10¹⁶ free electrons in a column with a cross-section of one square meter. The propagation delay is directly proportional to the ionospheric TEC, which increases with increasing electron density. The ionosphere is a dispersive medium, meaning that the delay also depends on the frequency of the radio waves. Utilizing this property, the ionospheric delay can be corrected by combining two or more GNSS signals; only higher-order terms, amounting to less than 1% of the total delay, remain in such combinations [2]. However, single-frequency users need external information or ionospheric models to correct for this propagation effect.
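For reference, the first-order ionospheric group delay is d = 40.3·TEC/f² (d in meters, TEC in electrons/m², f in Hz), so one TECU on the GPS L1 frequency corresponds to roughly 0.16 m of delay. A minimal sketch:

```python
def iono_delay_m(tec_tecu: float, freq_hz: float) -> float:
    """First-order ionospheric group delay in meters: 40.3 * TEC / f**2."""
    return 40.3 * (tec_tecu * 1e16) / freq_hz ** 2

# One TECU on GPS L1 (1575.42 MHz) gives ~0.162 m of delay
print(iono_delay_m(1.0, 1575.42e6))
```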
There are models available that contain corrections for the ionosphere, such as the Klobuchar model [3,4], the NeQuickG model [5,6], or different versions of the Neustrelitz TEC Model (NTCM) [7,8,9,10,11] applicable to the GPS and Galileo systems. GNSS satellites transmit the coefficients of these broadcast ionospheric models as part of the navigation message, and the accuracy of position estimates improves when the broadcast models are applied. The International GNSS Service (IGS) releases Global Ionosphere Maps (GIMs) that contain Vertical Total Electron Content (VTEC) data. The GIMs are available in the IONosphere EXchange Format (IONEX) as rapid and final solutions, both released with a latency: the rapid solution is released within 24 h, whereas the final solution has a latency of approximately 11 days. The GIMs are more accurate than the broadcast models [12] and have been available since their official start in 1998. Besides empirical ionosphere models, there are first-principle physics models that have the potential to provide ionospheric forecasts. Examples of such models are the Thermosphere-Ionosphere-Electrodynamics General Circulation Model (TIE-GCM) developed at the National Center for Atmospheric Research (NCAR), or the Coupled Thermosphere Ionosphere Plasmasphere Electrodynamics Model (CTIPe) developed at the Space Weather Prediction Center of the National Oceanic and Atmospheric Administration (NOAA). However, Shim et al. [13] have suggested that errors in electron density can be very large due to errors in initialization and boundary conditions.
Different versions of the NTCM developed at the German Aerospace Center (DLR) have shown that the computationally very fast 12-coefficient model is comparable in its simplicity to the Klobuchar model and achieves a performance similar to that of the NeQuick2/NeQuickG model [7,9,10,11,14]. The NTCM approach describes the TEC dependencies on local time, geographic/geomagnetic location, and solar irradiance and activity. The local time dependency is explicitly described by diurnal, semi-diurnal, and ter-diurnal harmonic components. The two ionization crests at the low-latitude anomaly regions on both sides of the geomagnetic equator are modelled by Gaussian functions. The 12 model coefficients are fitted to TEC data to describe the broad spectrum of TEC variation at all levels of solar activity. The driving parameter of the model is either the daily solar radio flux index, F10.7, which is a proxy for the EUV radiation of the Sun, or the ionosphere correction coefficients transmitted via the GPS or Galileo satellite navigation messages.
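To illustrate the kind of basis functions involved (not the published NTCM formulation itself; all constants below are illustrative placeholders rather than fitted coefficients), a local-time harmonic term and a Gaussian crest term can be sketched as follows:

```python
import numpy as np

def local_time_harmonics(lt_hours):
    """Diurnal (24 h), semi-diurnal (12 h), and ter-diurnal (8 h) harmonics
    of local time; the 14 h phase shift is an illustrative placeholder,
    not a fitted NTCM coefficient."""
    diurnal = np.cos(2 * np.pi * (lt_hours - 14) / 24)
    semidiurnal = np.cos(2 * np.pi * lt_hours / 12)
    terdiurnal = np.cos(2 * np.pi * lt_hours / 8)
    return diurnal, semidiurnal, terdiurnal

def crest(maglat_deg, crest_lat_deg=16.0, sigma_deg=12.0):
    """Gaussian term for one equatorial ionization crest; the crest latitude
    and width are placeholder values."""
    return np.exp(-((maglat_deg - crest_lat_deg) ** 2) / (2 * sigma_deg ** 2))
```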
The NTCM models were developed based on TEC data provided by the Center for Orbit Determination in Europe (CODE) [15]. The models can predict the mean TEC behavior and reduce the ionospheric propagation errors by up to 80% in GNSS applications [8]. The persistent anomaly features that exist during both high and low solar activity times, such as the equatorial ionization anomaly (EIA), are modelled by the NTCM approaches. However, the anomalies that become visible only during low solar activity (LSA) times, such as the Nighttime Winter Anomaly (NWA), the Weddell Sea anomaly, and the midsummer nighttime anomaly (MSNA), are not explicitly modelled. The NWA arises because, during LSA periods, the mean ionization level is higher in winter nights than in summer nights [16]. During the day, the winter ionization is equal to or even lower than that in summer, and thus the NWA is visible only at night. Modelling the NWA and MSNA features is out of scope when developing empirical TEC models with a limited number of model coefficients. However, with the availability of fast computing machines, as well as the advancement of machine learning techniques and Big Data algorithms, the development of a more sophisticated TEC model featuring the NWA and other effects is possible and is presented here.
The use of neural networks (NNs) for predicting TEC is not new; various types of networks capable of predicting TEC have already been developed. Some of these networks focus on regional or short-term predictions, or are trained with data from a short period that does not cover all solar activity levels. A few examples of previously developed networks are the Long Short-Term Memory (LSTM) network proposed by Xiong et al. [17], which can make accurate short-term TEC predictions over China, and the model proposed by Machado and Fonseca Jr. [18], which forecasts the VTEC of the subsequent 72 h in the Brazilian region. A combination of Convolutional Neural Networks (CNNs) and LSTM networks has been proposed by Cherrier et al. [19]; this approach is able to predict global TEC maps 2 to 48 h in advance. The TensorFlow-based, fully connected NN prediction model proposed by Orus Perez [12], the IONONet, was trained with GIM data from LSA periods and was also tested with LSA data. The IONONet makes global predictions one or more days ahead but is unable to represent the equatorial anomalies; however, the IONONet model adjusted with NeQuick2 is able to represent them. The NN approach introduced by Cesaroni et al. [20] trains a neural network with data from one solar cycle on a selection of grid points and uses the NeQuick2 model to extend it to a global scale. The approach proposed here is a fully connected neural network trained with global GIM data from almost two solar cycles.
Most of the previously mentioned machine learning models are already able to predict VTEC maps containing large-scale features, but a prediction that reproduces the NWA feature is newly developed here. The paper is organized as follows: it begins with brief descriptions of the ionospheric TEC models, the subsequent section discusses the database and sources used for the model development and testing datasets, and the final section provides a performance evaluation of the proposed NN model compared with the NTCM model. The model is further evaluated with respect to its capability of reproducing the NWA feature during low solar activity times.
2. Database
The database used to train and test the proposed neural network consists of GIMs provided by CODE [15]. The daily IONEX files were downloaded from the Crustal Dynamics Data Information System (CDDIS) archive at https://cddis.nasa.gov/archive/gnss/products/ionex/ (accessed on 30 March 2021). Each IONEX file contains bi-hourly or, since the end of 2014, hourly VTEC maps with a resolution of 2.5° in latitude and 5° in longitude. We used CODE data from 2001 to 2020, which include high and low solar activity conditions covering the last two solar cycles, 23 and 24.
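As an illustration of how the VTEC grids can be extracted from the downloaded files, the sketch below implements a minimal IONEX reader. It assumes the standard IONEX record labels in columns 61–80 and the default TEC scaling exponent of −1 (i.e., values stored in units of 0.1 TECU); it is a sketch, not a full-format parser.

```python
import numpy as np

def read_ionex_vtec(path):
    """Minimal IONEX VTEC reader (sketch). Returns a list of
    (epoch, latitudes, vtec_grid) tuples, with VTEC in TECU."""
    maps, exponent = [], -1  # -1 is the IONEX default (0.1 TECU)
    with open(path) as f:
        lines = f.readlines()
    i = 0
    while i < len(lines):
        label = lines[i][60:].strip()
        if label == "EXPONENT":
            exponent = int(lines[i][:60].split()[0])
        elif label == "START OF TEC MAP":
            epoch, lats, rows = None, [], []
            i += 1
            while lines[i][60:].strip() != "END OF TEC MAP":
                label = lines[i][60:].strip()
                if label == "EPOCH OF CURRENT MAP":
                    epoch = tuple(int(v) for v in lines[i][:60].split())
                elif label == "LAT/LON1/LON2/DLON/H":
                    # Fixed-width F6.1 fields after a 2-character pad
                    lat, lon1, lon2, dlon = (float(lines[i][k:k + 6])
                                             for k in (2, 8, 14, 20))
                    lats.append(lat)
                    n_lon = int(round((lon2 - lon1) / dlon)) + 1
                    vals = []
                    while len(vals) < n_lon:  # TEC rows in 16I5 format
                        i += 1
                        vals += [int(lines[i][k:k + 5])
                                 for k in range(0, 80, 5)
                                 if lines[i][k:k + 5].strip()]
                    rows.append(np.array(vals[:n_lon]) * 10.0 ** exponent)
                i += 1
            maps.append((epoch, np.array(lats), np.vstack(rows)))
        i += 1
    return maps
```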
The solar activity level was represented by the solar radio flux index, F10.7, measured in solar flux units (sfu, 1 sfu = 10⁻²² W·m⁻²·Hz⁻¹), which describes the solar emission at a wavelength of 10.7 cm [21]. The daily F10.7 data were obtained from NASA’s (National Aeronautics and Space Administration) OMNIWeb interface, available at https://omniweb.gsfc.nasa.gov/form/dx1.html (accessed on 12 May 2021). A time series of F10.7 is plotted in Figure 1; it varies from about 70 sfu at LSA to more than 200 sfu at high solar activity (HSA) conditions, causing high dynamics in the VTEC.
In order to reduce computational complexity, the datasets (e.g., F10.7, VTEC maps) were downscaled by taking Carrington rotation averages at each hour. One Carrington rotation period is approximately 27 days, the time it takes a fixed feature on the Sun to return to the same apparent position as viewed from Earth. Each rotation of the Sun is given a unique number, called the Carrington Rotation Number, counted from 9 November 1853 [22]. The average F10.7 for each Carrington rotation is plotted in Figure 1 (orange curve) for the considered period; it closely follows the trend of the solar activity level.
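A minimal sketch of this downscaling step is shown below; the rotation-1 epoch is only approximate, and the F10.7 series is a random placeholder for the OMNIWeb data.

```python
import numpy as np
import pandas as pd

CARRINGTON_DAYS = 27.2753           # mean synodic rotation period
EPOCH = pd.Timestamp("1853-11-09")  # approximate start of rotation 1

def carrington_number(times):
    """Approximate Carrington rotation number for each timestamp."""
    days = (times - EPOCH) / pd.Timedelta(days=1)
    return (days // CARRINGTON_DAYS).astype(int) + 1

daily = pd.DataFrame({"time": pd.date_range("2001-01-01", "2020-12-31", freq="D")})
daily["f107"] = np.random.uniform(70, 200, len(daily))  # placeholder values
daily["rotation"] = carrington_number(daily["time"])
rotation_mean = daily.groupby("rotation")["f107"].mean()  # orange curve in Figure 1
```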
First, we divided the data into two main categories: training and test datasets. The training dataset consisted of data from the years 2001 to 2019, excluding the year 2015. The test dataset consisted of data from the years 2015 and 2020, which correspond to high and low solar activity conditions, respectively. Data from December 2019 were also excluded from the training dataset, since they were used for analyzing the model performance under NWA conditions.
The spatial resolution of the VTEC maps was reduced to speed up the model training: a resolution of 2.5° latitude by 15° longitude was used for the training dataset. For model testing, however, the original data (i.e., individual hourly maps with 2.5° latitude and 5° longitude resolution) as well as the Carrington rotation averaged data were used.
Since a validation dataset is needed to tune the free parameters of the network, known as hyperparameters, part of the training data was set aside. The process of finding the optimal hyperparameters is called hyperparameter tuning. The hyperparameters define not only the structure of the network (e.g., the number of neurons or layers) but also how the network is trained (e.g., the learning rate or the number of epochs). If the cost function (mean squared error) decreases for the training set only and not for the validation set, the model is learning noise in the training data [23]. The chosen hyperparameters are discussed in the method section. After the hyperparameters were fixed, the model was trained one last time with the validation dataset included in the training dataset.
The data were normalized and scaled to the interval [0,1] using the MinMaxScaler function from the Scikit-Learn Python library [24] in order to make the model converge faster. The scaler was fitted to the training and validation data only; the test data were left out because the scaler could otherwise extract information from the test set. After the scaler was fitted, the test data were transformed by it in order to make predictions. If information from the test datasets were used before or during training, the model’s performance could not be evaluated fairly, as information about the test dataset may leak into the training of the model.
3. Method
Our previous investigation and modelling activities (e.g., [7,8,9,10,11]) show that the ionospheric TEC can be successfully modelled by describing TEC dependencies on local time, season, geographic/geomagnetic location, solar irradiance, and activity. Our current investigation shows that the same geophysical conditions can be selected as features for training a neural network to accurately describe the variations of the ionospheric TEC. Therefore, we used the following six features: the F10.7 index, the solar zenith angle, the day of year (DOY), Universal Time (UT), the geographic longitude, and the geomagnetic latitude. UT was selected instead of Local Time (LT), although in our previous modelling activities LT was always used; we found that, with neural network-based modelling, we could use UT directly without any performance degradation. The solar zenith angle is the angle between the Sun’s rays and the vertical direction at a geographic location. The solar zenith angle dependency (Φ) is considered by the following expression [25]:
Φ = cos(ϕ − δ) − (2ϕ/π)·sin δ,

where ϕ is the geographic latitude and δ is the solar declination.
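A sketch of how such a feature can be computed is given below. The declination formula is a common approximation, and the dependency shown follows the expression reconstructed above, which may differ in detail from the paper’s exact formulation.

```python
import numpy as np

def solar_declination(doy):
    """Approximate solar declination (radians) from the day of year."""
    return np.deg2rad(-23.44) * np.cos(2 * np.pi * (doy + 10) / 365.25)

def zenith_dependency(lat_deg, doy):
    """Zenith-angle feature Phi = cos(phi - delta) - (2*phi/pi)*sin(delta)."""
    phi = np.deg2rad(lat_deg)
    delta = solar_declination(doy)
    return np.cos(phi - delta) - (2 * phi / np.pi) * np.sin(delta)
```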
The International Geomagnetic Reference Field (IGRF) model [26] was used for the conversion from geographic to geomagnetic latitude.
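The paper uses the full IGRF model for this conversion; for illustration only, a centered-dipole approximation can be sketched as follows (the pole coordinates are approximate values for the ~2020 epoch):

```python
import numpy as np

# Approximate geomagnetic (dipole) north pole for the ~2020 epoch
POLE_LAT, POLE_LON = np.deg2rad(80.7), np.deg2rad(287.3)

def geomagnetic_latitude(lat_deg, lon_deg):
    """Convert geographic to (dipole) geomagnetic latitude, in degrees."""
    lat, lon = np.deg2rad(lat_deg), np.deg2rad(lon_deg)
    s = (np.sin(lat) * np.sin(POLE_LAT)
         + np.cos(lat) * np.cos(POLE_LAT) * np.cos(lon - POLE_LON))
    return np.rad2deg(np.arcsin(s))
```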
The extent to which the selected features correlate with the TEC can be assessed by estimating a Pearson correlation matrix [27], shown in Figure 2. We found that the F10.7 index and the solar zenith angle had a moderate to high correlation with the VTEC, whereas the other features showed relatively low correlation values.
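Such a matrix is straightforward to compute with pandas; in the sketch below, the DataFrame contents are random placeholders for the actual feature and label data.

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame holding the six input features and the VTEC label
df = pd.DataFrame(np.random.rand(500, 7),
                  columns=["F10.7", "zenith", "DOY", "UT", "lon", "maglat", "VTEC"])
corr = df.corr(method="pearson")   # Pearson correlation matrix, as in Figure 2
print(corr["VTEC"].sort_values())  # feature-VTEC correlations
```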
The prediction of VTEC is a regression problem, for which a fully connected neural network is used. The architecture was built as a sequential model using the open-source TensorFlow [28] and Keras [29] libraries for Python; Keras is built on top of TensorFlow and is designed to make the implementation easier. A 6-256-128-64-32-16-1 network architecture was chosen, where the numbers represent the number of neurons in each layer. This architecture is similar to one of the model architectures investigated by Orus Perez [12]. A schematic view of the network architecture is displayed in Figure 3.
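A sketch of this architecture in Keras (the activation functions are those described in the next paragraph):

```python
from tensorflow import keras

# 6-256-128-64-32-16-1 fully connected architecture
model = keras.Sequential([
    keras.layers.Dense(256, activation="relu", input_shape=(6,)),  # six features
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="linear"),  # single VTEC output
])
```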
To map non-linearities, the Rectified Linear Unit (ReLU) activation function was used. ReLU returns zero for negative inputs and returns the input itself for positive inputs. It is a widely used activation function due to its computational efficiency [30]. All the hidden layers of the network use the ReLU activation function, while the output layer uses a linear (identity) activation function.
During the training of a network, it can occur that the model remembers the training data too well, which is called overfitting. This means that the model has also learnt the noise of the training dataset, so its performance on the validation and test datasets may degrade. One way to avoid overfitting is to add a regularization term that penalizes the model for fitting the data too closely. As already mentioned, the model’s training data consisted of Carrington averaged values (approximately 27-day means) rather than daily data, making the model less vulnerable to noise. No regularization term was added because, otherwise, the model would not be able to detect small-scale features such as the NWA.
The model training was optimized using the Adam optimization method [31] with a learning rate of 0.0001 and the Mean Squared Error (MSE) as the loss function. The Adam optimizer minimizes the MSE and is a computationally efficient algorithm with small memory requirements [31]. The loss function is defined as the MSE between the model prediction and the VTEC from the GIMs, also known as the label. The training process took several hours, which can be considered short since no special hardware was used other than a personal computer with an Intel Core i7-8665U CPU and 16 GB RAM. The number of epochs used for the model’s training was 150, where an epoch is one complete pass of the training dataset through the network. The model was fitted using a batch size of 128, the number of samples fed to the network per gradient update.
The first consideration was to use two models, one for HSA conditions (e.g., F10.7 > 100) and another for LSA conditions. During high solar activity periods, the equatorial anomaly crests are more prominent than during low activity periods. The specialized LSA and HSA models had difficulties producing predictions that contained the equatorial anomalies; the models predicted the anomalies as one blob instead of two crest features. After increasing the complexity of the network, such as by increasing the number of neurons and layers, the HSA model was able to produce predictions containing the two crests. The anomalies were still not present in predictions made with the specialized LSA model because the crests are less prominent there; after lowering the regularization term, the two crests became visible. However, the final model was trained on data from both solar activity periods and could also predict the equatorial anomalies during the LSA period. In terms of root mean square (RMS) errors, the combined model performed slightly worse than the specialized models, but the difference was not significant. Moreover, a combined model is preferable for a smooth transition of the TEC prediction when the main driving parameter, F10.7, experiences fluctuations due to space weather events. The NWA was more difficult to model because it is a short-lived feature; after reducing the regularization term, the model was able to make predictions containing the NWA effect.
After training, the model had to be tested with unseen test data. The model’s results were evaluated using performance metrics such as the standard deviation (STD), mean, and RMS of the errors, as well as the presence of features in the VTEC predictions. The performance of the model was also compared to that of the NTCM model [7].
The model was validated against the test data from the HSA year 2015 and the LSA year 2020. The test dataset contained the Carrington averaged data as well as the original daily data. Model values were calculated for the same geophysical conditions as the test datasets, the differences between the model values and the test datasets were computed, and statistical estimates of the differences were derived. The performance for the daily data in terms of mean, STD, and RMS differences is provided in Table 1. We found that the values were lower for the Carrington averaged data than for the daily data, which was expected because the model was trained with Carrington rotation averaged data. The model’s performance was worse during the HSA in 2015 than during the LSA in 2020. The VTEC values had a larger spread and were higher during 2015, as shown in Table 2; therefore, the model was expected to have lower accuracy during the HSA in 2015. Table 2 also shows that the model was underpredicting during the HSA period, because the model’s mean, STD, and maximum values were lower than those of the GIMs, whereas during LSA conditions the model was overpredicting, because its mean, STD, and maximum values were higher than those of the GIMs. This behavior could result from training the model with data from both HSA and LSA periods and from using Carrington rotation averaged data instead of daily data.
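The difference statistics of Table 1 can be computed from the model and GIM grids with a few lines (a sketch; the array names are placeholders):

```python
import numpy as np

def difference_stats(model_vtec, gim_vtec):
    """Mean, STD, and RMS of the (model - GIM) VTEC differences, in TECU."""
    d = np.asarray(model_vtec) - np.asarray(gim_vtec)
    return d.mean(), d.std(), np.sqrt(np.mean(d ** 2))
```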
5. Conclusions
In this paper, a fully connected neural network (NN)-based TEC model was proposed for global Vertical TEC (VTEC) predictions that can reproduce the Nighttime Winter Anomaly (NWA). The NWA is a unique feature that occurs only at nighttime during low solar activity conditions. The model was trained with a large dataset containing IGS Global Ionosphere Maps (GIMs) from almost two solar cycles, covering the years 2001 to 2019, with 2015 excluded. The NN model was tested with unseen data from a high and a low solar activity year (2015 and 2020, respectively). The day of year, Universal Time, geographic longitude, geomagnetic latitude, solar zenith angle, and the solar activity proxy, F10.7, were used as input parameters for the network in order to predict the VTEC.
Our investigation shows that the model clearly reproduces the NWA feature in the global VTEC predictions. The Northern Hemisphere’s NWA in the American sector is clearly seen at 1 Local Time (LT), whereas the Southern Hemisphere’s NWA becomes more visible during 3–5 LT in the NN model prediction. The NWA in the Asian sector is more prominent in the GIM data at 3 LT, but is also present at 1 LT. The reason why the NWA in the Asian sector is not clearly visible in the VTEC prediction is that the NWA effect there is not as strong as in the American sector.
To validate the neural network model, a comparison with the Neustrelitz TEC Model (NTCM) was made. The NN model outperforms the NTCM by approximately 1 TEC unit for both high and low solar activity periods. The NTCM is not able to make predictions containing the NWA feature, whereas the NN model is. Even though the model was trained on both high and low solar activity data, it can still predict features, such as the NWA, that occur only at night during low solar activity periods.