An Approach for Predicting Global Ionospheric TEC Using Machine Learning

Abstract: Accurate corrections for ionospheric total electron content (TEC) and early warning information are crucial for global navigation satellite system (GNSS) applications under the influence of space weather. In this study, we propose using a machine learning model, the Prophet model, to predict the global ionospheric TEC by establishing a short-term ionospheric prediction model. We use 15th-order spherical harmonic coefficients provided by the Center for Orbit Determination in Europe (CODE) as the data set. Historical spherical harmonic coefficient data from 7 days, 15 days, and 30 days are used as training sets to model and predict the 256 spherical harmonic coefficients. We use the predicted coefficients to generate a global ionospheric TEC forecast map based on the spherical harmonic function model, and we select a year with low solar activity (63.4 < F10.7 < 81.8) and a year with high solar activity (79.5 < F10.7 < 255.0) to carry out a sliding 2-day forecast experiment. We verify the model performance by comparing the forecasting results with the CODE forecast product (COPG) and final product (CODG). The results show that the best predictions are obtained by using 15 days of historical data as the training set. Compared with CODE's 1-day (C1PG) and 2-day (C2PG) predicted products, the Prophet forecast has a lower RMSE than COPG on 151 and 158 days for the first and second forecast day, respectively, of the low-solar-activity year. The corresponding figures for the high-solar-activity year are 183 and 135 days.


Introduction
The state of the Earth's ionosphere always affects the accuracy and reliability of global navigation satellite system (GNSS) positioning, navigation, and timing [1][2][3]. At low latitudes, the equatorial ionization anomaly and equatorial plasma bubbles are the two major phenomena [4,5]. At mid-latitudes, the mid-latitude enhancement is a typical phenomenon [6]. At high latitudes, the tongue of ionization is a typical phenomenon [7]. Earthquakes and other natural disasters can also cause ionospheric anomalies [8,9]. These ionospheric phenomena play major roles in GNSS positioning, navigation, and timing. The ionospheric total electron content (TEC) is the total number of electrons in the column between the satellite and the receiver, from the top to the bottom of the ionosphere. This number directly determines the time delay experienced by a satellite signal passing through the ionosphere. In real-time precise positioning, highly accurate ionospheric prediction models play a vital role in improving positioning accuracy and accelerating positioning convergence [10,11]. Monitoring and inversion of the ionosphere using GNSS is a feasible method [12,13]. Therefore, monitoring and forecasting the ionosphere and building a high-precision ionospheric prediction model have important research value.
The ionosphere is the largest naturally occurring error source for GNSS [14]. Broadcast or empirical models of the ionosphere are most commonly used to correct for ionospheric delay, such as the Klobuchar model [15][16][17], the NeQuick model [18], the BDGIM [19] and the International Reference Ionosphere (IRI) model [20,21]. These models can reach a forecast accuracy of approximately 50-75% [15,18] and meet user needs in terms of single-frequency high-precision positioning [22,23]. However, their accuracy is limited by factors such as solar activity and sudden changes in the Earth's environment. In 1998, the International GNSS Service (IGS) established an ionospheric working group to regularly publish precise GNSS products, which provide a wealth of data for ionospheric research [24][25][26]. At present, the main institutions providing ionospheric prediction products are the Center for Orbit Determination in Europe (CODE), the European Space Agency (ESA), and the Universitat Politècnica de Catalunya (UPC) in Spain. CODE and ESA provide ionospheric prediction products one day and two days in advance, while UPC provides ionospheric prediction products two days in advance [27,28].
In recent years, many researchers have conducted short-term forecasting studies that make full use of historical data to analyze and extract the characteristics of ionospheric TEC changes, and have developed ionospheric products with higher accuracy than empirical ionospheric models. Tulunay et al. [29] developed the Middle East Technical University Neural Network (METU-NN) model, composed of a hidden layer and multiple neurons, to predict ionospheric TEC maps over Europe. Xia et al. [30] used a support vector machine (SVM) and graphics acceleration technology to establish an ionospheric forecasting model for China. Zhukov et al. [31,32] established a regional ionospheric forecasting model based on machine learning methods such as random forest, support vector regression, and gradient boosting. They then created a global ionospheric TEC model driven by geomagnetic and solar activity indices; both models perform better than the NeQuick2, Klobuchar and GEMTEC models. Huang et al. [33] developed a single short-term ionospheric TEC prediction model based on a radial basis function (RBF) neural network. Lee et al. [34] generated global ionospheric TEC forecasts based on a generative adversarial network. Srivani et al. [35] established a long short-term memory (LSTM) neural network prediction model using observation data from Bengaluru Station in India. Ruwali et al. [36] introduced the application of LSTM, gated recurrent unit (GRU), and convolutional neural network (CNN)-LSTM networks in the field of ionospheric prediction, and their experiments showed that the CNN-LSTM network had the best prediction effect.
Li et al. [37] used an indirect forecasting method based on spherical harmonic function coefficients provided by the IGS and an autoregressive moving average (ARMA) model to realize global ionospheric TEC forecasting. Wang et al. [38] used an adaptive autoregressive model and a spherical harmonic function model to predict the global ionospheric map (GIM). Wang et al. [39] and Qiu et al. [40] used a semiautomatic kernel estimation method and an improved variant thereof, respectively, to predict the spherical harmonic function coefficients provided by CODE. They introduced a 15th-order spherical harmonic function coefficient model to establish a version of CODE's 1-Day Predicted GIM (C1PG) with higher model longitudes. Liu et al. [41] input four factors (the historical sequence of the spherical harmonic coefficients, the solar extreme ultraviolet (EUV) flux, the disturbance storm time (Dst) index, and the hour of the day) into an LSTM neural network to predict the spherical harmonic coefficients during magnetic storms and magnetically quiet periods. The resulting GIM forecasts had higher accuracy than IRI-2016 or NeQuick-2.
However, improving the forecasting accuracy of global ionospheric models is still a key issue worthy of in-depth discussion, because the ionospheric state is changeable and sensitive to many factors, such as solar activity, the Earth's magnetic field, and electric fields. In this study, we propose using a machine learning model, the Prophet model, to predict the global ionospheric TEC. We use 15th-order spherical harmonic function coefficients to forecast and generate global ionospheric TEC maps under different solar activity conditions, and we assess the accuracy by comparing the values forecast by the proposed model with the CODE forecast product (COPG).

Prophet Model
The Prophet model [42] is a time series forecasting model based on machine learning. Its advantages are automatic handling of missing data, fast processing, simple parameter adjustment and complete automation. Ionospheric TEC series exhibit annual variations, weekly variations and sudden changes caused by geomagnetic disturbances; the Prophet model can automatically extract and separate periodic terms and identify sudden changes, and therefore adapts well to forecasting such time series. The Prophet model is based on the principle of decomposition followed by recombination. It decomposes the time series into four parts: trend terms, periodic terms, holiday effects, and random noise. The forecast is then recombined through an additive or multiplicative model. The additive model is expressed as
$$y(t) = g(t) + s(t) + h(t) + \varepsilon_t$$
where t is the time; y(t) is the time series, which varies with time t; g(t) is the decomposed trend component, with no periodicity; s(t) is the periodic component; h(t) is the holiday effect or sudden change factor; and $\varepsilon_t$ is random noise that obeys a normal distribution.

Trend Term Extraction
The Prophet model provides two methods for extracting the trend term for different time series: a piecewise linear model and a logistic growth model. The piecewise linear model is suitable for time series without saturated growth, and its calculation formula is
$$g(t) = \left(k + \alpha(t)^{T}\delta\right)t + \left(b + \alpha(t)^{T}\gamma\right)$$
where k is the initial growth rate; $\alpha(t)^{T}\delta$ is the slope correction value; b is the initial offset; and $\alpha(t)^{T}\gamma$ is the offset correction value, with $\gamma_j = -T_j\delta_j$ for a changepoint at time $T_j$, as in the standard Prophet formulation [42], so that the trend remains continuous. The logistic growth model is suitable for nonlinear time series with saturated growth, and its calculation formula is
$$g(t) = \frac{c}{1 + \exp\left(-k\,(t - b)\right)}$$
where c is the saturated growth value, and k and b are the initial growth rate and initial offset, respectively. As the time t increases, g approaches the natural saturated state c.
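As an illustration of the piecewise linear trend, the following minimal sketch evaluates g(t) for a given set of changepoints. It assumes the standard Prophet formulation, in which each changepoint adds a slope increment and a compensating offset; the function and argument names are hypothetical and are not taken from the Prophet library.

```python
from typing import Sequence


def piecewise_linear_trend(t: float, k: float, b: float,
                           changepoints: Sequence[float],
                           deltas: Sequence[float]) -> float:
    """Evaluate g(t) = (k + a(t)^T delta) * t + (b + a(t)^T gamma).

    a_j(t) is 1 when t >= changepoints[j] and 0 otherwise, and
    gamma_j = -changepoints[j] * deltas[j] keeps the trend continuous
    (an assumption taken from the standard Prophet formulation).
    """
    slope, offset = k, b
    for t_j, d_j in zip(changepoints, deltas):
        if t >= t_j:
            slope += d_j         # slope correction
            offset -= t_j * d_j  # offset correction
    return slope * t + offset
```

With k = 1, b = 0 and a single changepoint at t = 5 whose slope increment is 2, the trend follows g(t) = t before the changepoint and g(t) = 3t - 10 after it, so both branches give 5 at t = 5.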

Periodic Term Extraction
The ionospheric TEC sequence exhibits obvious periodic characteristics. The Prophet model uses a Fourier series to model and extract the periodic terms of the time series. The calculation formula is
$$s(t) = \sum_{n=-N}^{N} c_n e^{i 2\pi n t / P}$$
where $c_n$ are the coefficient parameters to be estimated, which satisfy $c_n \sim N(0, \sigma_c^2)$; P is the period of the time series; and N is the number of terms used to approximate the periodic component. For the annual periodicity, P is generally set to 365.25 and N to 10. For the weekly periodicity, P is set to 7 and N to 3. With the notation
$$x(t) = \left[\cos\frac{2\pi (1) t}{P}, \ldots, \sin\frac{2\pi (N) t}{P}\right]$$
s(t) can be expressed as the dot product of x(t) with a parameter vector β:
$$s(t) = x(t)\beta$$
where β is a periodic smoothing parameter that has a regularization effect and follows a normal distribution, $\beta \sim N(0, \sigma^2)$. The parameter σ can be increased or decreased to adjust the influence of the periodicity on the forecast; specifically, its value is positively correlated with the influence of the periodicity on the time series.
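The periodic term can be sketched in a few lines. The example below builds the feature vector x(t) from N cosine and sine pairs and forms s(t) as its dot product with β. The interleaved cosine/sine ordering and the function names are assumptions made for illustration only.

```python
import math
from typing import List, Sequence


def fourier_features(t: float, period: float, n_terms: int) -> List[float]:
    """Return x(t) as cosine/sine pairs for harmonics 1..n_terms of the period."""
    feats: List[float] = []
    for n in range(1, n_terms + 1):
        angle = 2.0 * math.pi * n * t / period
        feats.extend((math.cos(angle), math.sin(angle)))
    return feats


def seasonal_component(t: float, period: float, beta: Sequence[float]) -> float:
    """Evaluate s(t) = x(t) . beta, where beta holds 2 * N coefficients."""
    x = fourier_features(t, period, len(beta) // 2)
    return sum(xi * bi for xi, bi in zip(x, beta))
```

For the weekly setting quoted above (P = 7, N = 3), x(t) has six entries, and s(t) repeats exactly every 7 days regardless of the values in β.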

Spherical Harmonic Function Model
Schaer first proposed using a spherical harmonic function model to fit GIMs in 1999 [43]. With the development of GNSS technology, the number of Earth observation satellites has increased. At present, CODE provides data based on more than 200 GNSS observation stations around the world. The data are fit using 15th-order spherical harmonic functions to obtain and publish GIMs. The calculation formula for the spherical harmonic functions is
$$\mathrm{TEC}(\beta, \lambda) = \sum_{n=0}^{n_{\max}} \sum_{m=0}^{n} P_{nm}(\sin\beta)\left(\tilde{A}_{nm}\cos(m\lambda) + \tilde{B}_{nm}\sin(m\lambda)\right) \tag{8}$$
where m and n represent the order and degree of the spherical harmonic function, with maximum degree $n_{\max} = 15$; $\tilde{A}_{nm}$ and $\tilde{B}_{nm}$ represent the coefficients of the associated Legendre function $P_{nm}$ in the spherical harmonic function; and β and λ are the latitude and longitude, respectively, of the ionospheric pierce point in the geocentric solar geomagnetic coordinate system. CODE provides 15th-order historical data for 256 spherical harmonic function coefficients.
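The following sketch shows how a set of coefficients can be turned into a TEC value at one location, and why a 15th-order expansion has 256 coefficients (there are 2n + 1 coefficients at degree n, since no sine term exists for m = 0). The normalization factor used here is an assumption (the common 4π normalization) and should be checked against the convention of the coefficient product actually used; all names are hypothetical.

```python
import math
from typing import Dict, Tuple

# (n, m) -> (A_nm, B_nm); B_n0 is unused because sin(0) = 0.
Coeffs = Dict[Tuple[int, int], Tuple[float, float]]


def n_coefficients(n_max: int) -> int:
    """Total number of A_nm and B_nm coefficients up to degree n_max."""
    return (n_max + 1) ** 2


def norm_factor(n: int, m: int) -> float:
    """Assumed normalization of the associated Legendre functions."""
    k = 1.0 if m == 0 else 2.0
    return math.sqrt(k * (2 * n + 1) * math.factorial(n - m) / math.factorial(n + m))


def legendre(n: int, m: int, x: float) -> float:
    """Unnormalized associated Legendre function P_nm(x), without the (-1)^m phase."""
    p_mm = 1.0
    for i in range(1, m + 1):
        p_mm *= (2 * i - 1) * math.sqrt(max(0.0, 1.0 - x * x))
    if n == m:
        return p_mm
    p_prev, p_curr = p_mm, x * (2 * m + 1) * p_mm
    for k in range(m + 2, n + 1):
        p_prev, p_curr = p_curr, ((2 * k - 1) * x * p_curr - (k + m - 1) * p_prev) / (k - m)
    return p_curr


def tec(lat_rad: float, lon_rad: float, coeffs: Coeffs) -> float:
    """Sum the spherical harmonic expansion at one latitude/longitude (radians)."""
    total = 0.0
    for (n, m), (a, b) in coeffs.items():
        p = norm_factor(n, m) * legendre(n, m, math.sin(lat_rad))
        total += p * (a * math.cos(m * lon_rad) + b * math.sin(m * lon_rad))
    return total
```

A map is obtained by calling `tec` on every grid node. With only the degree-0 coefficient set, the function returns that value everywhere, which is a quick sanity check of the implementation.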

Data Source and Parameter Setting
For forecast analysis using the spherical harmonic function coefficient products released by CODE, we select 2019 as a year with lower solar activity (63.4 < F10.7 < 81.8) and 2015 as a year with higher solar activity (79.5 < F10.7 < 255.0). We select training samples with different time scales and perform sliding prediction of the spherical harmonic function coefficients for 2 specific days. Thus, we generate a global ionospheric TEC map with a resolution of 2.5° × 5°. We compare the results with C1PG and CODE's 2-Day Predicted GIM (C2PG) to verify the accuracy and reliability of the model presented in this paper.
According to the temporal and spatial changes in the ionosphere and the characteristics of the Prophet model, the short-term forecasting of the ionospheric TEC is mainly concerned with daily periodic changes, and the trend term can be described by a model with piecewise linear growth. Therefore, we set the growth term parameter to 'linear', the daily periodicity parameter to 'true', and the seasonality parameter to 'additive'. The parameter settings obtained after several rounds of search are shown in Table 1. For changepoint selection, the automatic selection mode of the Prophet model is adopted. For the hyperparameter 'changepoint_prior_scale', which controls the flexibility of the trend, we use the grid search method [44]. According to the extrapolation results of parameter tuning, the parameter search range is set to 0.1-0.9, and the search step is set to 0.1.
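The grid search described above can be sketched as follows. The fixed settings mirror the values quoted in the text, using the keyword names of the public Prophet constructor. The `score` callable is a hypothetical placeholder for whatever routine fits a model with the given settings and returns a validation error; it is not part of this study's published code.

```python
from typing import Callable, Dict, List, Tuple

# Fixed settings quoted in the text (keyword names as in the Prophet constructor).
BASE_SETTINGS: Dict[str, object] = {
    "growth": "linear",
    "daily_seasonality": True,
    "seasonality_mode": "additive",
}


def candidate_scales(start: float = 0.1, stop: float = 0.9,
                     step: float = 0.1) -> List[float]:
    """Return the search grid, rounded to avoid floating-point drift."""
    n = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 10) for i in range(n)]


def grid_search(score: Callable[[Dict[str, object]], float]) -> Tuple[float, float]:
    """Return (best_scale, best_error) for the lowest validation error."""
    best_scale, best_err = None, float("inf")
    for scale in candidate_scales():
        settings = dict(BASE_SETTINGS, changepoint_prior_scale=scale)
        err = score(settings)  # hypothetical: fit a model, return e.g. RMSE
        if err < best_err:
            best_scale, best_err = scale, err
    return best_scale, best_err
```

In practice `score` would build the model from `settings`, fit it on the training window, and evaluate it on the validation day. Keeping it as a callable makes the search loop independent of the forecasting library.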

Accuracy Evaluation
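To evaluate the forecasting results, we calculate the root mean square error (RMSE), mean absolute error (MAE) and correlation coefficient (R) between the model forecasted values and the CODG values. These metrics are calculated as
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|$$
$$R = \frac{\sum_{i=1}^{n}\left(\hat{y}_i - \bar{\hat{y}}\right)\left(y_i - \bar{y}\right)}{\sqrt{\sum_{i=1}^{n}\left(\hat{y}_i - \bar{\hat{y}}\right)^2}\sqrt{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}}$$
where $\hat{y}_i$ represents the TEC forecasted value, $y_i$ represents the CODG value, $\bar{\hat{y}}$ and $\bar{y}$ are their respective means, and n represents the data length.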
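For reference, the three metrics can be written directly from their definitions. This is a minimal standard-library sketch; the function names are illustrative, and the correlation coefficient is assumed to be the Pearson coefficient.

```python
import math
from typing import Sequence


def rmse(forecast: Sequence[float], reference: Sequence[float]) -> float:
    """Root mean square error between forecast and reference values."""
    n = len(forecast)
    return math.sqrt(sum((f - r) ** 2 for f, r in zip(forecast, reference)) / n)


def mae(forecast: Sequence[float], reference: Sequence[float]) -> float:
    """Mean absolute error between forecast and reference values."""
    n = len(forecast)
    return sum(abs(f - r) for f, r in zip(forecast, reference)) / n


def pearson_r(forecast: Sequence[float], reference: Sequence[float]) -> float:
    """Pearson correlation coefficient (assumes neither series is constant)."""
    n = len(forecast)
    mf = sum(forecast) / n
    mr = sum(reference) / n
    cov = sum((f - mf) * (r - mr) for f, r in zip(forecast, reference))
    var_f = sum((f - mf) ** 2 for f in forecast)
    var_r = sum((r - mr) ** 2 for r in reference)
    return cov / math.sqrt(var_f * var_r)
```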
The full-year data of 2015 and 2019 were selected for the experiments to validate the model reliability. We designed 3 sets of experiments to find the optimal time-scale training set. The spherical harmonic function coefficients with time scales of 7, 15 and 30 days were used as training sets. For comparison with CODG, the 2-day data after the training set was used as the test set. The specific experimental steps are as follows:
1. The 0th-order spherical harmonic function coefficients were entered into the model. The last day of data from the test set was used as the validation set to determine the hyperparameter 'changepoint_prior_scale'.
2. Using the hyperparameter determined in the first step, the 256 spherical harmonic function coefficients were predicted to obtain the 2-day forecast values.
3. The predicted values were substituted into Equation (8) to calculate the GIM. TEC values can be interpolated at any latitude and longitude on demand to develop a higher-resolution ionospheric forecast product.
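The sliding scheme behind these steps can be sketched as an index generator: each window takes a fixed number of training days followed by a 2-day forecast horizon. Advancing the window by one day per step is an assumption made here for illustration, and the function name is hypothetical.

```python
from typing import Iterator, Tuple


def sliding_windows(n_days: int, train_days: int = 15,
                    horizon: int = 2) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) day-index ranges, advancing one day at a time."""
    for start in range(0, n_days - train_days - horizon + 1):
        train = range(start, start + train_days)
        test = range(start + train_days, start + train_days + horizon)
        yield train, test
```

Each of the 256 coefficient series would be fitted on the `train` indices and forecast over the `test` indices before the coefficients are recombined into a map. Changing `train_days` to 7 or 30 gives the other two experimental configurations.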

Model Performance under Low Solar Activity
We select the spherical harmonic function products released by CODE for 2019 for predictive analysis, using training data for sliding predictions to generate GIMs. At the same time, the COPG values are used to analyze the prediction accuracy of the model presented in this paper. Figure 1a,b show the 1 June forecast obtained by the Prophet model with 15 days of training data and the corresponding COPG values, respectively. The horizontal axis is the longitude, the vertical axis is the latitude, and the time interval is 6 h. With this interval, forecasting results are produced 4 times a day. If the TEC forecasted values in some areas are negative, then we set these values to 0.
Figure 2a shows the differences between the values predicted by the Prophet model and the CODG values. Similarly, Figure 2b shows the differences between the COPG values and the CODG values. The prediction residuals of the Prophet model are mostly within ±2 TECU. Regions with large errors occur mainly near the equator, which can be attributed to the equatorial ionization anomaly. Compared with the Prophet model results, the regions with large residual errors in the COPG predictions are significantly larger. Table 2 shows the proportions of residuals between the forecast and CODG values that fall within given thresholds, for 5183 grid points at 24 time points on 1 June and 2 June. For more than 90% of the Prophet model prediction results on the second day, the absolute value of the residual error is within 2 TECU, and this absolute error is less than 3 TECU for more than 97% of the results. In contrast, the residual errors of the COPG predictions are larger, with only approximately 70% of the residual errors being less than 1 TECU, and the proportions of errors smaller than 2 TECU and 3 TECU are also lower than those of the Prophet model.
Figures 3 and 4 show the correlation analysis between the global ionospheric TEC forecasted values and the CODG values on 1 and 2 June, respectively. The horizontal axis represents the CODG value, and the vertical axis represents the forecasted value. The grid points are divided with a step size of 0.3 TECU, the number of points within a 0.2 TECU radius of each grid point is counted, and this number is represented by the corresponding color. Among Figure 3a-c, the correlation in Figure 3b is the best: the distribution in Figure 3a is relatively dispersed, with the forecasted value being generally greater than the CODG value, while the forecasted values in Figure 3c are all less than 25 TECU, indicating that this model does not predict the maximum value well. Figure 4 shows that the correlation between the forecast and CODG values on 2 June is similar to that on 1 June. In summary, the forecasted values obtained by using 15-day historical data as the training set are the most consistent with the CODG values, indicating that the forecasting effect is the best.
To analyze the prediction effects at different latitudes and longitudes, we plot the variations in the CODG value, the Prophet forecasted value and the COPG value with latitude in Figure 6. As seen from this figure, the Prophet forecasted value is closer to the CODG value in most regions, indicating that the Prophet model can accurately predict the TEC trend, and both the maximum and minimum values are consistent with the real values. In contrast, the COPG prediction is not accurate at the maximum value, and this deficiency is more obvious in the high-latitude areas of the Northern and Southern Hemispheres.
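The residual proportions of the kind reported in Table 2 can be computed with a short helper. The sketch below returns, for each threshold, the fraction of grid points whose absolute residual is below that threshold. The function name and the use of a strict inequality are illustrative assumptions.

```python
from typing import Dict, Sequence


def residual_fractions(forecast: Sequence[float], reference: Sequence[float],
                       thresholds: Sequence[float] = (1.0, 2.0, 3.0)) -> Dict[float, float]:
    """Fraction of points with |forecast - reference| below each threshold (TECU)."""
    residuals = [abs(f - r) for f, r in zip(forecast, reference)]
    n = len(residuals)
    return {th: sum(1 for res in residuals if res < th) / n for th in thresholds}
```

Applied to the flattened forecast and CODG grids of one day, the returned dictionary gives the cumulative proportions for the 1, 2 and 3 TECU thresholds.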

Model Performance under High Solar Activity
To further verify the reliability of the model, 2015 is selected as a year with high solar activity for forecast analysis. The global ionospheric TEC maps from 1 June to 30 June are again predicted with 7 days, 15 days and 30 days of training data to assess the forecasting accuracy and reliability. Figures 8 and 9 show the correlation analysis between the global ionospheric TEC forecasted values and the CODG values on 1 June and 2 June 2015, respectively. It can be seen that the TEC values in a year of high solar activity are significantly larger. The horizontal axis represents the CODG value, and the vertical axis represents the model forecasted value, as in Figure 3. Figure 8a-d [...]
Figure 11 shows the variation of the CODG value, the Prophet forecasted value and the COPG value with latitude. As seen from the figure, the 2-day forecasts from both models accurately predicted the trend of TEC changes. Of the two models, the Prophet model is closer to the CODG value, as in the low-solar-activity year. The prediction error of the COPG value at the maximum value is clearly larger, especially in high-latitude areas. Figure 12 shows the prediction results of the Prophet and COPG models at different interpolated longitudes and latitudes. Both models can accurately predict the TEC trend at a single point. The Prophet model is more accurate in predicting the overall trend, and the COPG model is more precise. Table 3 shows a forecast assessment for different latitudes at a longitude of 120° E.
The prediction error of the Prophet model in high-latitude areas is small, but the correlation is lower than that of the COPG model. This reflects the irregular [...]
Figure 13 shows the RMSE statistics of the two models in June for the low-solar-activity year and the high-solar-activity year. Compared with the results of C1PG and C2PG, the RMSEs of the first-day forecasts for the considered days of both the low-solar-activity year and the high-solar-activity year are significantly reduced. For the low-solar-activity year, we calculated the total RMSE of the 720 hourly forecasted values against the CODG values over the 30 forecast days; the RMSE of the Prophet model is 1.1 TECU for both the first-day and the second-day forecasts. For the high-solar-activity year, the RMSE of the Prophet model is 3.4 TECU for both the first and second day. The forecast accuracy does not decrease with time.
Figure 14 shows the correlation coefficient statistics of the two models for the low-solar-activity year and the high-solar-activity year. In the low-solar-activity year, the correlation coefficients between the Prophet model and CODG values are all above 0.93. In terms of the correlation coefficients with the CODG values, the Prophet forecast outperforms COPG on 24 days for the first-day forecasts and on 20 days for the second-day forecasts. In the high-solar-activity year, the Prophet forecast outperforms COPG on 17 days for the first-day forecasts and on 19 days for the second-day forecasts. The average correlation coefficient over the 30 predicted days between the predicted results and the CODG values is 0.96 for both the first and second day in the low-solar-activity year, and 0.93 for both the first and second day in the high-solar-activity year. These results indicate that the predicted values are highly correlated with the actual values and that the forecasting results are reliable. The correlation for both model predictions is poor during the extremely large magnetic storm on days 173 and 174.
Figure 15 shows the RMSE statistics of the two models for the low-solar-activity year and the high-solar-activity year. As seen from the figure, the forecast error is low in June, July and August. In 2015, a year of intense solar activity, the error in the Prophet model forecasts for March, influenced by magnetic storms, was slightly larger. We calculated the RMSE for both models for a full year. The RMSEs of the Prophet model for the first and second day of the low-solar-activity year are both 1.5 TECU. The RMSEs of the Prophet model for the first and second day of the high-solar-activity year are both 4.1 TECU. Therefore, the Prophet model performs well in global forecasting during geomagnetically quiet periods, and the difference in performance between the two days is not significant.
Figure 16 shows the variation of RMSE with the Kp index and the F10.7 index. As can be seen from the top left and top right panels, the RMSE increases with increasing Kp index. The overall performance of the Prophet model for the low-solar-activity year is better than that of COPG. In the high-solar-activity year, the Prophet model performs better when Kp < 3. The Prophet forecast error is greater than that of C1PG and less than that of C2PG when Kp > 3. In terms of solar activity, the RMSE of the two models is larger when the F10.7 index is less than 65 in the low-solar-activity year, and the forecast accuracy remains stable as the F10.7 index increases. The RMSE of the high-solar-activity year has a minimum value at an F10.7 index of 95 and gradually increases with the intensity of solar activity. Prophet's two-day forecast accuracy remained stable, while the C1PG and C2PG values differed considerably.
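One simple way to examine how the error varies with geomagnetic activity is to group daily RMSE values by Kp level and average within each group. The sketch below is an illustrative assumption about such a binning (integer Kp bins, plain averaging) and does not claim to reproduce the procedure behind Figure 16.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence


def mean_rmse_by_kp(kp_values: Sequence[float],
                    daily_rmse: Sequence[float]) -> Dict[int, float]:
    """Average daily RMSE within integer Kp bins (bin = floor of Kp)."""
    groups: Dict[int, List[float]] = defaultdict(list)
    for kp, err in zip(kp_values, daily_rmse):
        groups[int(math.floor(kp))].append(err)
    return {kp_bin: sum(v) / len(v) for kp_bin, v in sorted(groups.items())}
```

The same grouping can be applied to F10.7 by replacing the bin function with one suited to its range, for example bins of 10 solar flux units.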

Discussion
Machine learning is well suited to finding patterns and relationships in historical data to solve problems. Computer systems learn by example and extract useful information from a large set of historical data using a mathematical approach. The data set is usually large and spans as many parameters as possible [45]. In this paper, the Prophet model was used to take into account multiple factors affecting forecasting when modeling and forecasting the 15th-order spherical harmonic function coefficients of the ionosphere and generating a global ionospheric TEC map. Training samples with three time scales of 7 days, 15 days, and 30 days were selected for experimental analysis based on the distinct ionospheric activity occurring in years of quiet solar activity and high solar activity. The RMSEs, MAEs and correlation coefficients were used for evaluation.
We used the Prophet model to perform sliding prediction of the spherical harmonic coefficients for a month. The results show that the model exhibits good performance in both the year with low solar activity and the year with high solar activity. The average correlation coefficient over the 30 predicted days between the predicted results and the CODG values is 0.96 for both the first and second day in the low-solar-activity year, and 0.93 for both the first and second day in the high-solar-activity year. These results indicate that the predicted values are highly correlated with the actual values and that the forecasting results are reliable. The prediction accuracy of this model is affected by the number of days of training data used. We calculated the correlation coefficients for the three sets of experiments for the predicted days of the low-solar-activity year and of the high-solar-activity year. In the low-solar-activity year, the correlation coefficients between the CODG values and the forecasts derived from the 7-day, 15-day, and 30-day training sets are 0.97, 0.98, and 0.97, respectively. In the high-solar-activity year, these values were also 0.97, 0.98 and 0.97. Specifically, the accuracy of the prediction results obtained using a 15-day training set is the best. The RMSEs for the two forecasted days of the low-solar-activity year were 1.1 TECU and 0.9 TECU, and the MAEs were 0.8 TECU and 0.6 TECU, respectively. The RMSEs for the two forecasted days in the high-solar-activity year were 2.4 TECU and 2.3 TECU, and the MAEs were 1.8 TECU and 1.6 TECU, respectively. Overall, 15 days of data is an appropriate training set for short-term forecasting.
With the model presented in this paper, the prediction effects at middle and low latitudes were the best, and the prediction accuracy for the Northern Hemisphere was better than that for the Southern Hemisphere. The prediction error was lower than that of the COPG model, possibly because of the higher density of GNSS stations in the Northern Hemisphere, the more accurate historical data collected there, and the resulting better quality of the training set. The low number of GNSS receivers in the Southern Hemisphere led to missing ionospheric modeling details, which affects forecast accuracy. The trend of TEC changes could be accurately predicted in high-latitude areas, but the prediction of details was not precise, leading to a relatively low correlation. This may be because the ionosphere is calmer at high latitudes, with smaller and less cyclical TEC changes, resulting in a low correlation between forecasted values and CODG values. In the future, we will analyze and discuss spherical harmonic function coefficients of various orders, incorporate geographical factors into the prediction model, and further optimize and improve the model.

Conclusions
This paper addresses several problems faced by traditional forecasting models, such as the consideration of relatively few factors, complex parameter settings, and unstable forecasting performance, by proposing a global ionospheric TEC prediction model based on machine learning. The model combines the indirect forecasting methods of previous studies with machine learning to forecast global TEC maps for the next two days. The results show that the Prophet model achieves good accuracy on both forecast days. The Prophet model is robust and, when compared with the COPG products, outperforms C2PG. In general, the model can be considered applicable and can provide a solid data basis for GNSS error correction, which supports more stable and accurate positioning services for single-frequency GNSS users. In the future, we will consider more influencing factors and train models under more characteristic conditions to achieve better prediction results.

Figure 1.The forecasted global ionospheric TEC maps on 1 June 2019, as obtained (a) using the Prophet model and (b) from the COPG values, with a time interval of 6 h, longitude on the horizontal axis, and latitude on the vertical axis.

Figure 2. The difference between the CODG values and the forecasted values on 1 June 2019, with a time interval of 6 h, longitude on the horizontal axis, and latitude on the vertical axis. (a) Difference between the Prophet values and the CODG values. (b) Difference between the COPG values and the CODG values.
Figures 3 and 4 show the correlation analysis between the forecasted global ionospheric TEC values and the CODG values on 1 and 2 June, respectively. The horizontal axis represents the CODG value, and the vertical axis represents the forecasted value. The grid points are divided with a step size of 0.3 TECU, the number of points within a 0.2 TECU radius of each grid point is counted, and this number is represented by the corresponding color. Figure 3a-c show the scatter density diagrams of the correlations between the Prophet model forecasted values and the CODG values at training time scales of 7, 15 and 30 days, respectively. Figure 3d shows the scatter density diagram of the correlation between the COPG and CODG values. It can be seen from Figure 3 that the forecasted values and the CODG values are mostly concentrated in the range of 0-10 TECU, and all four forecasting methods yield reasonably accurate results in this range. However, the distribution of the COPG values vs. the CODG values is the most dispersed, and the COPG value is generally larger than the CODG value, indicating a poorer forecast. When the Prophet model takes 7 days, 15 days and 30 days of training data, the R² values between the predicted values and the CODG values are 0.9592, 0.9618 and 0.9541, respectively. These findings indicate that 15 days is the best time scale of the training data for forecasting. It can also be seen that the correlation between the forecasted values and the CODG values is best in Figure 3b. The distribution in Figure 3a is relatively dispersed, with the forecasted value generally greater than the CODG value, while the forecasted values in Figure 3c are all less than 25 TECU, indicating that this model does not predict the maximum value well. Figure 4 shows that the correlation between the forecasted and CODG values on 2 June is similar to that on 1 June. In summary, the forecasted values obtained by using 15 days of historical data as the training set are the most consistent with the CODG values, indicating that this forecast is the best.
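The scatter density counting and the R² statistic described above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the input values are synthetic, and R² is computed here as the coefficient of determination, which is an assumption because the text does not state which definition was used.

```python
import math


def scatter_density(reference, forecast, step=0.3, radius=0.2):
    """Count the (reference, forecast) pairs within `radius` of each grid node.

    Grid nodes lie at integer multiples of `step` on both axes. The result
    maps grid indices (ix, iy) to the number of points near that node.
    """
    counts = {}
    reach = math.ceil(radius / step)  # how many neighbouring nodes to check
    for x, y in zip(reference, forecast):
        ix, iy = round(x / step), round(y / step)
        for gx in range(ix - reach, ix + reach + 1):
            for gy in range(iy - reach, iy + reach + 1):
                if math.hypot(x - gx * step, y - gy * step) <= radius:
                    counts[(gx, gy)] = counts.get((gx, gy), 0) + 1
    return counts


def r_squared(reference, forecast):
    """Coefficient of determination of the forecast against the reference."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - f) ** 2 for r, f in zip(reference, forecast))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


# Synthetic TEC values in TECU, for illustration only.
reference = [1.0, 0.95, 5.0]
forecast = [1.0, 0.9, 5.2]
counts = scatter_density(reference, forecast)
print(counts)
print(r_squared(reference, forecast))
```

In a plot, each grid node would be colored by its count, so that dense regions of the scatter diagram stand out.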

Figure 3. Correlation analysis between forecasted values and CODG values on 1 June in a year of low solar activity.(a-c) Scatter density plots of the correlations between the 7-day, 15-day and 30-day Prophet model forecasted values, respectively, and the CODG values.(d) Scatter density plot of the correlation between the COPG and CODG values.

Figure 5 shows histograms of the forecast residual distributions for the Prophet model and the COPG values in this low-solar-activity year. As seen from the figure, the forecast residuals for the two investigated days are mostly within 2 TECU, and forecasted values with residuals greater than 2 TECU account for only a small proportion. The Prophet model forecasts for the first and second days have MAEs of 1.02 TECU and 1.02 TECU, respectively, whereas the MAEs of the COPG model are 1.02 TECU and 1.00 TECU, respectively. These findings show that the two models have similar accuracy. To analyze the prediction performance at different latitudes and longitudes, we plot the variations in the CODG value, the Prophet forecasted value and the COPG value with latitude in Figure 6. As seen from this figure, the Prophet forecasted value is closer to the CODG value in most regions, indicating that the Prophet model can accurately predict the TEC trend, and both the maximum and minimum values are consistent with the real values. In contrast, the COPG model does not predict the maximum value accurately, and this deficiency is more obvious in the high-latitude areas of the Northern and Southern Hemispheres.
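The residual statistics quoted for Figure 5 (the MAE and the proportion of residuals within a given bound) can be computed along the following lines. The function name `residual_summary` and the sample values are hypothetical, and treating "within 2 TECU" as an absolute residual of at most 2 TECU is an assumption.

```python
def residual_summary(reference, forecast, threshold=2.0):
    """Return (MAE, share of absolute residuals <= threshold) in TECU."""
    abs_residuals = [abs(f - r) for r, f in zip(reference, forecast)]
    mae = sum(abs_residuals) / len(abs_residuals)
    share = sum(1 for a in abs_residuals if a <= threshold) / len(abs_residuals)
    return mae, share


# Synthetic TEC values in TECU, for illustration only.
reference = [10.0, 12.0, 8.0, 15.0]
forecast = [10.5, 11.0, 8.0, 18.0]
mae, share = residual_summary(reference, forecast)
print(f"MAE = {mae:.3f} TECU, share within 2 TECU = {share:.0%}")
```

Calling the function once per model and per forecast day gives the figures needed for a side-by-side comparison of the two forecasts.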

Figure 4. Correlation analysis between forecasted values and CODG values on 2 June in a year of low solar activity.(a-c) Scatter density plots of the correlations between the 7-day, 15-day and 30-day Prophet model forecasted values, respectively, and the CODG values.(d) Scatter density plot of the correlation between the COPG and CODG values.

Remote Sens. 2022, 14, x FOR PEER REVIEW 11 of 24

Figure 5. Forecasted residual distributions in a low-solar-activity year. Blue represents the Prophet model, and orange represents the COPG value.

Figure 6. CODG, Prophet forecasted, and COPG values as functions of latitude under low solar activity. The blue solid lines represent the CODG values, the red solid lines represent the Prophet forecasted values, and the green dashed lines represent the COPG values.

Figure 7 compares the forecasting results of the Prophet model with the COPG and CODG values on 1 June 2015. It can be seen that the maximum annual TEC in this year of high solar activity is 65 TECU. The Prophet model predicts the TEC with reasonable accuracy, although its maximum predicted value is approximately 5 TECU smaller. The global distribution is close to that of the CODG values.

Figure 7. Comparison of Prophet model forecasted, COPG, and CODG values on 1 June 2015, at 4-h intervals under high solar activity. The first column shows the Prophet model forecasted values, the second column shows the COPG values, and the third column shows the CODG values. The horizontal axis is longitude, and the vertical axis is latitude.
Figures 8 and 9 show the correlation analysis between the forecasted global ionospheric TEC values and the CODG values on 1 June and 2 June 2015, respectively. The horizontal axis represents the CODG value, and the vertical axis represents the model forecasted value, as in Figure 3. Figure 8a-c present the scatter density diagrams of the correlations between the CODG values and the Prophet model forecasted values at the 7-day, 15-day and 30-day training scales, respectively, and Figure 8d presents the corresponding diagram for the COPG values. We divide the grid points with a step size of 0.5 TECU and count the number of points within a radius of 0.2 TECU of each grid point. As seen in this figure, the annual TEC values in this year of high solar activity are significantly larger than those in the previous analysis, reaching a maximum of 60 TECU. The 7-day Prophet model forecasted values and the COPG values are dispersed relative to the CODG values and correlate only weakly with them. The 30-day Prophet model does not predict the maximum TEC value well, which may be related to the long duration of the training data. In general, the 15-day Prophet model shows the best prediction performance and the highest correlation with the CODG values, with R² values of 0.96 and 0.97 for the two investigated days.

Figure 8. Correlation analysis between forecasted values and CODG values on 1 June in a year of high solar activity. (a-c) Scatter density plots of the correlations between the 7-day, 15-day and 30-day Prophet model forecasted values, respectively, and the CODG values. (d) Scatter density plot of the correlation between the COPG and CODG values.

Figure 9. Correlation analysis between forecasted values and CODG values on 2 June in a year of high solar activity. (a-c) Scatter density plots of the correlations between the 7-day, 15-day and 30-day Prophet model forecasted values, respectively, and the CODG values. (d) Scatter density plot of the correlation between the COPG and CODG values.

Figure 10 shows histograms of the prediction residuals of the Prophet model and the COPG values. Owing to the strong influence of solar activity, the TEC values are relatively large, leading to large forecast errors. In this figure, only 30% of the values predicted by the Prophet model on the first day have residuals within 1 TECU; however, residuals of less than 10 TECU still account for a high proportion, with an average residual of 2.9 TECU on the first day. By comparison, residuals of less than 1 TECU account for a smaller proportion for the COPG model, with an average residual of 2.8 TECU on the first day. On the second day, the average residuals of the Prophet and COPG models are 2.9 TECU and 3.0 TECU, respectively, and the residuals of the Prophet model are generally closer to 0. This shows that the Prophet model has smaller prediction residuals and higher accuracy. Figure 11 shows the variations in the CODG value, the Prophet forecasted value and the COPG value with latitude. As seen from the figure, the 2-day forecasts from both models accurately predict the trend of TEC changes. Of the two models, the Prophet model is closer to the CODG value, as in the low-solar-activity year. The prediction error of the COPG value at the maximum value is clearly larger, especially in high-latitude areas.

Figure 10. Forecasted residual distributions in a high-solar-activity year. Blue represents the Prophet model, and orange represents the COPG value.

Figure 11. CODG, Prophet forecasted, and COPG values as functions of latitude under high solar activity. The blue solid lines represent the CODG values, the red solid lines represent the Prophet forecasted values, and the green dashed lines represent the COPG values.

Figure 12 shows the prediction results of the Prophet and COPG models at interpolated points of different longitudes and latitudes. Both models can accurately predict the TEC trend at a single point. The Prophet model is more accurate in predicting the overall trend, and the COPG model is more precise. Table 3 shows a forecast assessment for different latitudes at a longitude of 120° E. The prediction error of the Prophet model in high-latitude areas is small, but the correlation is lower than that of the COPG model. This reflects the irregular change in the TEC in high-latitude regions, for which the Prophet model is not ideally suited. Nevertheless, the prediction performance of the Prophet model is significantly better than that of the COPG model at middle and low latitudes. Compared with the COPG model, the RMSE of the Prophet model at middle and low latitudes is reduced by 2.6 TECU and 2.1 TECU, respectively, and the MAE is lower by 1.4 TECU and

Figure 12. Comparison of how CODG, Prophet forecasted, and COPG values at different latitudes and longitudes change over time. The blue solid lines represent the CODG values, the red solid lines represent the Prophet forecasted values, and the green dashed lines represent the COPG values.

Figure 13. Daily RMSE statistics of the Prophet model and the COPG values. C1PG and C2PG refer to the COPG values forecast for the first and second day, respectively. Blue represents the Prophet model, and orange represents the COPG value.

Figure 14 shows the correlation coefficient statistics of the two models in the low-solar-activity year and the high-solar-activity year. In the low-solar-activity year, the correlation coefficients between the Prophet model and CODG values are all above 0.93. Comparing the correlation coefficients between each of the two models and the CODG values over the considered days, the Prophet model outperforms the COPG values on 24 days for the first-day forecasts and on 20 days for the second-day forecasts. In the high-solar-activity year, the Prophet model outperforms the COPG values on 17 days for the first-day forecasts and on 19 days for the second-day forecasts.

Figure 14. Daily correlation coefficient statistics of the Prophet model and the COPG values. C1PG and C2PG refer to the COPG values forecast for the first and second day, respectively. Blue represents the Prophet model, and orange represents the COPG value.

Figure 15. Monthly RMSE statistics of the Prophet model and the COPG values. C1PG and C2PG refer to the COPG values forecast for the first and second day, respectively. Blue represents the Prophet model, and orange represents the COPG value.

Figure 16. Variation in RMSE with the Kp index and the F10.7 index. Blue and red lines represent the model. Yellow and purple lines represent the COPG value.

Table 1. Prophet model parameter settings.

Table 2. Residual percentage analysis of the Prophet model and the COPG model.

Table 3. Accuracy comparison of the Prophet and COPG models at different latitudes at longitude 120° E.
The correlation between the two model predictions is poor during extremely large magnetic storm days 173 and 174.