Predicting the Effects of Solar Storms on the Ionosphere Based on a Comparison of Real-Time Solar Wind Data with the Best-Fitting Historical Storm Event

: This study presents a new modeling approach that aims for long time predictions (more than 12 h) of ionospheric disturbances driven by solar storm events. The proposed model shall run in an operational framework to deliver fast and precise localized warnings for these disturbances in the future. The solar wind data driven approach uses a data base of historical solar storm impacts covering two solar cycles to reconstruct future events and resulting ionospheric disturbances. The basic components of the model are presented and discussed in this study, and the strengths of the reconstruction based on historical events are presented by showing the good correlations for predicted and observed geomagnetic activity. Initial results on the ionospheric response are discussed for all historical events using global total electron content (GTEC) and in more detail using total electron content (TEC) maps for two speciﬁc case studies (including the St. Patrick’s Day geomagnetic storm during the 17 March 2015). Average root mean square error (RMSE) values of 3.90 and 5.21 TECU are calculated for these cases conﬁrming good results for the current conﬁguration of the model. Possible future improvements of the individual model parts, as well as the planned extensions and applications are discussed in detail.


Introduction
The complex interaction between solar wind and magnetosphere is a crucial driver for the upper atmosphere variability [1]. The characteristics of these variations change considerably when the solar wind is modified by interplanetary coronal mass ejections (ICME) or co-rotating interaction regions (CIR). Thus, ionospheric perturbations can be observed on different time scales and of different strength [2][3][4]. Furthermore, also spatial variations characterized by different processes (e.g., solar wind driven plasma drifts) are crucial [5][6][7]. The ionospheric disturbances can cause various problems for communication and navigation systems (e.g., reduced positioning accuracy for global navigation satellite system). Such disturbances need to be understood and predicted to assure the operation of infrastructure and services that are crucial for most users [8][9][10].
Forecasting ionospheric disturbances due to geomagnetic storms driven by ICME or even CIR is a major challenge that is being pursued and already carried out by various institutions. In this study, the idea of an improved global forecast model is presented and the basic concepts are validated. The model is able to provide 24 h forecast of ionospheric disturbances even during significant solar storms and describes ionospheric perturbations in high local and spatial resolution based on the observed solar wind conditions and information from historical storm events.
The model presented here is a significant extension of the previous work completed in the EU FP7 as project Advanced Forecast For Ensuring Communication Through Space (AFFECTS), where a forecast for ionospheric disturbances based on historical storm information was implemented [11][12][13]. The proposed total electron content (TEC) perturbation forecast model was realized based on empirical information and the extensive data archive of regional TEC maps for Europe. Unlike in the approach of this study, the forecast was based on geomagnetic indices rather than solar wind data. The approach followed a series of similar studies on modeling ionospheric parameters at the German Aerospace Center (DLR) Neustrelitz [14][15][16]. The efforts to enhance the model further are due the great need for the improvement of empirical ionospheric forecast models, which are able to run under real-time conditions [17][18][19].
In this study, an extended approach of the previous work is presented, aiming at two crucial changes. The model is extended to global maps and the forecast leading time is improved by prediction of TEC disturbances based on solar wind data instead of geomagnetic indices. The basic idea for the presented model is the forecast of ionospheric disturbances during solar storms by applying impact information from historical ICME or CIR events with similar features. This is realized with the use of data over approximately two solar cycles.
Another example of existing ionospheric disturbance forecast is the solar wind driven autoregression model for ionospheric short-term forecast (SWIF), which provides predictions for the critical frequency of the ionospheric F2 layer [20][21][22]. The model has been validated [22] and is operational on the European Digital upper Atmosphere Server (DIAS) providing ionospheric forecasts. However, the SWIF approach is performed only for certain stations and differs fundamentally in the calculation of predictions. Nevertheless, SWIF is an option for future comparisons with the model proposed in this study. Other approaches based on machine learning or neural networks (e.g., [23,24]) may also be considered for the model evaluation compared to other approaches.
For figures and analyses, whenever possible in this study, the St. Patrick's Day geomagnetic storm during the 17 March 2015 is used as an example because the behavior of this event has been discussed in various studies [25][26][27][28] and thus the underlying processes are well understood. Other storms are used especially for the analyses of the event reconstruction to avoid larger data gaps.

Data
Solar wind data, geomagnetic indices, and total electron content (TEC) maps are applied in this study and are required for the model. A brief introduction to the different measurements is given in this section.

Solar Wind Measurements
Features of the solar wind are measured by the National Aeronautics and Space Administration (NASA) satellite missions Advanced Composition Explorer (ACE) and Deep Space Climate Observatory (DSCOVR). The data of the instruments on board these satellites are provided in real-time and are used by different institutions to monitor and forecast space weather events [29,30]. ACE and DSCOVR were launched 25 August 1997 and 11 February 2015 orbiting the L1 Lagrangian point since then.
The solar wind electron, proton, and alpha monitor (SWEPAM) on board ACE measures the solar wind plasma electron and ion fluxes at minutely resolution [31]. The same parameters are measured by the Faraday cup and electron spectrometer on board DSCOVR [32]. In addition, the magnetic field experiment (MAG) on board ACE and the magnetometer on board DSCOVR measure the six magnetic field vectors of the interplanetary magnetic field (IMF). ACE data are provided through the NASA ACE Science Center [33,34] and DSCOVR data are provided through the National Oceanic and Atmospheric Administration's National Centers for Environmental Information [35]. Both datasets are available in real-time.
For this study, in particular, the proton density n p , proton speed v p and the North-South IMF direction B z will be analyzed as they describe the variations during ICME, as well as CIR. Although n p and v p describe the general strength of an event, the strength of the southward magnetic field (negative B z ) is an important measure for the geoeffectiveness of ICME [36,37]. This has to be considered especially during strong geomagnetic activity [38]. In addition, two derived parameters will be used. The dynamic pressure p d will be calculated as combining the two solar wind features. B z is then included by introducing the effective pressure p e with a relative scale of B z according to This scale is combined with p d according to This logistic function (or Fermi function), which was proposed in a previous study [12], is parameterized with the upper boundary B z 0 and a scaling factor k, which have been set to −10 nT and 0.01 nT in this study. By introducing this parameter, it is possible to evaluate the three measured values as one time series. This alternative allows a simplified version of the model approach.

Geomagnetic Indices
The disturbance storm time (Dst) describes the strength of the ring current and provides a reliable measurement for the impact of geomagnetic storms [39]. The ring current occurs when charged particles of the solar wind enter the inner magnetosphere and move along the geomagnetic field lines in a westward direction around Earth. The magnetic field of this ring current impacts the field strength on Earth's surface. This impact is described with Dst and provided at hourly resolution through the World Data Center for Geomagnetism in Kyoto [40][41][42]. Geomagnetic activity described by Dst is correlated with the strength of solar wind features and, thus, can be predicted based on their measurements [43][44][45][46][47].
The K index describes the disturbances in the horizontal component of Earth's magnetic field and is therefore another measurement for geomagnetic activity [48][49][50]. The global Kp index is provided at 3-hourly resolution by the German Research Centre for Geosciences in Potsdam [51]. Similar to Dst, geomagnetic activity described by Kp is correlated with the strength of solar wind features and can be predicted based on their measurement [52,53].

Global TEC Maps
Global TEC maps are provided by the International Global Navigation Satellite System (GNSS) Service (IGS) for different analysis centers [54]. Since maps with high sampling rate of at least 1 h are required, the Universitat Politècnica de Catalunya (UPC) rapid high-rate solution (UHR) maps are applied in this study. These maps are provided by NASA through the Crustal Dynamics Data Information System [55]. The European Space Agency (ESA) rapid high-rate solution (EHR) maps and the IGS combined (IHR) maps would also be suitable for the analysis, but in general the IGS data products are only a temporary solution. In the future DLR TEC maps with even higher sampling rate will be applied during the real-time operation of the model. TEC is preferred over other ionospheric parameters (e.g., critical frequency of the F2 layer), because it can be provided as TEC maps with global coverage in near real-time. In addition, TEC is an integral measurement of the ionosphere's electron density (equivalent to the ionospheric range error) that is of interest for most users and provided in established space weather services [56].

Reference Events
The basis for this study and the proposed forecast is a comprehensive list of historical events (ICME and CIR) that have caused ionospheric disturbances. The list compiled for this purpose is based on various sources. A large number of events can be taken from literature (e.g., [57][58][59][60]). Other events are selected manually, with Kp (≥5) and Dst (≤−50 nT) assumed to be indicators of IMCE and CIR. The compiled list is presented in Figure 1 showing a total of 593 events. The number of events each month indicates a dependence on the solar cycle 23 and 24 and thus a representative set of events can be assumed. On the other hand, the number of events that can be used in the analysis is much smaller than the total because data gaps (within 24 h of the onset time) prevent some periods from being evaluated for both solar wind data and TEC maps. Nevertheless, the proposed analysis includes a sufficient number of events.

Modeling Approach
Data assimilation of ongoing observations with dynamical models is a powerful tool for space weather predictions [61] and can be achieved with Kalman filtering [62] or extended Kalman filtering and ensemble Kalman filtering [63,64]. These approaches require a known model. In cases without a model dynamics can be reconstructed using the delay coordinate method, the so-called Takens filtering [65]. The combination of both approaches has been introduced as Kalman-Takens filtering [66,67]. This method realizes a model-free filter which applies Kalman filtering and data-driven Takens modeling [66]. The otherwise required model is replaced with dynamics from the delay coordinate method [66]. The idea of the model presented in this paper is inspired by these approaches, which do not rely on a dynamic model, but differs significantly in its non-continuous calculations and static observation sequences, that are used for the reconstruction.
The proposed forecast in this study is only performed if significant changes in the solar wind are observed. This splits the forecast into two steps: identification of events in solar wind features and TEC reconstruction based on past ionospheric disturbances. The first part is realized comparable to Takens filtering with static delays referencing the past events and a chosen observation function. The TEC reconstruction is realized with corresponding differential TEC (dTEC) entries for these past events. Thus, the precise prediction of the event onset time and the observation function are the crucial parts of the forecast. The details and current development status of the components of the forecast are presented in the following sections, which describe in more detail solar wind monitoring, event reconstruction, dTEC data archive, and quiet as well as storm condition TEC forecast.

Solar Wind Monitoring
ACE and DSCOVR instruments provide continuously solar wind and IMF parameters of which n p , v p , and B z are used for the proposed forecast. In addition, the calculated parameters p d (see Equation (1)) and p e (see Equation (3)) are analyzed. The identification of relevant solar storms is realized by two procedures, where both approaches can be applied to single parameters or any combination of parameters.
The first approach uses static thresholds for each parameter to classify moderate and strong events. The thresholds are summarized in Table 1. These values are based on the previous implementation in AFFECTS, but are intended to be adjusted to the current solar cycle in future analyses since the previous analysis was based on a smaller set of historical storm events. An example calculation of threat levels is shown in Figure 2 for the St. Patrick's Day geomagnetic storm.  This ICME driven super storm was a significant event during the solar cycle 24 [26]. For n p , there are larger data gaps during the period shown, but for the other parameters a continuous calculation can be performed. Similar situations may occur at any time, so the evaluation of single parameters and their combinations is implemented. For example, the rounded up mean threat level T c of all threat levels T according to is an approach that detects a threat when the majority of parameters k exceeds the thresholds. Despite the missing parameter, the event is well bound by the threat levels and an onset time (during the shock) can be identified. The sheath and magnetic cloud as described in other studies [26] are covered with the threat levels as well.
The second approach does not apply thresholds on absolute scales and instead relies on the calculation of multiple simple moving averages (SMA). This method is only appropriate for n p , v p , p d , and p e . B z is excluded due to the quasi-stationary behavior [68] which is observed for the majority of intervals. For each parameter a number s of SMA with time window w and weight v defines the mean SMA c as The time windows and weights may be optimized using a longer time frame or may be defined with a set of values based on expected variations. Finally, the difference of a parameter and corresponding SMA c is calculated according to providing a relative scale that can be applied to detect events. Similar to the thresholds for the absolute values, those for the relative scale must also be optimized for the current solar cycle in future analyses (e.g., using simulations of the solar cycle based on solar flux measurements). The example for v p in Figure 3 is calculated with two equally weighted time windows of 1 and 27 d. The resulting difference shows a much more distinct trend of the event even in this simple and not optimized configuration. In addition, this approach allows to detect storm conditions below the thresholds defined in Table 1. Both presented approaches are potential options for operating the forecast and can be further developed in the future. Other solutions may also be introduced to the framework.

Event Reconstruction
The detection of an event triggers the actual forecast process. For this, a past event from the reference list must be assigned to the current solar wind conditions. In general, this calculation requires the start time of the event t 0 , the last time step t n and all measurements x in this period per parameter k (sampling rate of 1 h). Thus, an input vector I k is defined as Furthermore, a matrix A k over 24 h with all reference events E is stored. All entries up to the last time step t n are selected from this matrix according to and this new defined matrix B k is used to calculate the best fitting event. Before the main process, there is the option to apply another function to the data, which performs a cumulative processing of the measurements according to The maximum may be applied as this specific function to only consider the increases of a parameter over time. The use of such a function generally depends on the method chosen to establish the fitting event. There are two methods implemented to realize this.
The first method will only be briefly mentioned in the context of this study, as this selection method is much more restrictive and, therefore, not specific enough with the given number of reference events. The method is based on the interpretation by lookup tables and thus the processing of selected information before the comparison of solar wind parameters. Such features include season, solar activity level (characterized by the solar radio flux index F10.7), geomagnetic indices and geometrical quantities (e.g., declination). However, this leads quite soon to a reduced set of fitting candidates based on the recently available historical events (e.g., assigning the season reduces the set of fitting reference events to approximately 25 ± 5%). With each further parameter this amount decreases and thus, the current situation of the solar wind has only a secondary role. This is not preferable and complicates the evaluation of the model results.
The second method does not rely on expert knowledge and calculates the best fitting event based on a nearest neighbor approach that takes n p , v p , B z , and p e into account. This is accomplished by calculating the distance d for each combination of reference event with the current event as The result for each combination is either the distance at the last time step or the mean distance from onset to last time step Finally, the best fitting event E F with the smallest distance is selected according to To estimate how much better E F compared to other entries performs (mean distance d calculated with d according to Equations (11) or (12) including all reference events), a quality parameter q is defined as This parameter is not a measure of the overall quality of the forecast, but allows to estimate how well the selected reference event describes the observed conditions and to what extent other reference events could be assigned. An additional parameter can be calculated using the observed and predicted trend (e.g., Pearson's correlation, Spearman's correlation, or cosine similarity are appropriate parameters to evaluate the assigned prediction). The quality of the forecast is further described by the number of time steps at which the selected reference event changes. The number of changes depends strongly on the chosen method (see Equations (11) and (12)) and is a measure of the forecast stability.
The proposed process is evaluated in three steps: by use of the quality parameters, comparison between observed and predicted Kp, as well as comparison between observed and predicted GTEC. Furthermore, four configurations of the algorithm are investigated. The configurations NN t n and NN t 0 ···t n perform the nearest neighbor (NN in configuration name) approach without applying a function (see Equation (9)) for the last (see Equation (11)) and all time steps (see Equation (12)). Both configurations are extended to NNM t n and NNM t 0 ···t n by applying the maximum (M in configuration name) for n p , v p , and p e and the minimum for B z according to Equation (9). This configuration only considers increases (decreases in case of B z ) and produces a constant time series otherwise. The analysis is based on the prediction of each reference event from the set of possible candidates (all other 592 reference events) with the four defined algorithms, excluding the event to be reconstructed. This is performed for each hourly time step of the stored 24 h time windows.
Each data point in the first row of Figure 4 represents one of the reference events and the best-fitting event according to the proposed approaches (excluding itself from the set of potential candidates). Thus, the resulting variability of the quality (see Equation (14)) describes how well events are selected using the different approaches. The second row of Figure 4 shows the number of changes between time steps and summarizes the distribution for all data points (size of dots). The estimated mean quality in Figure 4a,b is constant (approximately 0.87) due to the chosen method adjusting the best fitting event during that time step. However, this causes the selected reference event to change continuously, and the number of changes in Figure 4e shows that this happens for most of the events at each time step. This behavior is not desirable as it prevents a stable prediction. As shown in Figure 4f, the problem can be significantly reduced (more than 50%) by applying the maximum (minimum) without having a significant impact on the quality of the selection. The application of such a simple function has already resulted in a major improvement of the results. Thus, this option is definitely a useful addition to the core algorithm.
The estimated mean quality in Figure 4c,d is decreasing during the first time steps but then stabilizes (approximately at 0.75 and 0.79). The quality for NN t 0 ···t n and NNM t 0 ···t n is lower than for NN t n and NNM t n , because a time series is matched instead of a single data point. The quality in Figure 4c,d does not decrease indefinitely as the most significant changes in events are observed at some of the first time steps (e.g., shock of IMCE). These deviations determine the correlation and if, for example, no secondary shock occurs, then the model reaches a stable state. In the other case, the model would adjust further. The number of changes in Figure 4g,h show as well the impact of applying the maximum (minimum) to the time series. The number of changes is significantly decreased in Figure 4h but also the prediction quality is slightly improved in Figure 4d. As explained, these parameters do not provide any information about the quality of the TEC forecast (e.g., compared to other models) and instead describe how well the selected forecast compares to all other candidates in the reference list. These parameters are also useful to calibrate the stability and sensitivity of the model.
A first estimation of the forecast quality is calculated by comparing observed and predicted parameters. The geomagnetic index Kp is analyzed because it strongly depends on the latitude [69] and thus describes the influence of the solar wind also spatially. Since global TEC maps for the analyzed period are not yet available at the required sampling rate, the global mean TEC (GTEC) across all grid points (latitudes Φ and longitudes Λ) according to Equation (15) from the final combined solution (IGS) maps is used in this study (sampling rate of 2 h).
After the GTEC calculation, the required sampling rate of 1 h is obtained by applying linear interpolation. The analysis with GTEC is expected to be related to geomagnetic indices [70] and thus to solar wind features, but the lack of latitudinal separation is expected to create interference of different spatial variations. Figure 5 shows the difference of Kp and GTEC at the third time step past the onset time with scatter plots (the time step is chosen arbitrarily within the first phase of the storm). The correlation of observed and predicted Kp is moderate for all configurations of the model approach at this time step. NN t 0 ···t n performs better than the other configurations, but the difference is insignificant. All configurations are capable of recreating the geomagnetic effects in the first hours of an event. The correlation of observed and predicted GTEC is weak with NN t 0 ···t n performing the best. Configuration NN t n is considerably worse than the others. Continuously adjusting to the last time step increases the probability of missing a crucial feature in the solar wind resulting in weak correlation. The analysis of the correlation is extended by calculating each time step. The resulting correlation time series are shown in Figure 6.
The correlation of Kp in Figure 6a is moderate and strong after the first few hours for all configurations. The most significant difference between configurations is observed after approximately 12 h. Although configurations NN t n and NN t 0 ···t n show a decrease in correlation from this point on, the correlation for NNM t n and NNM t 0 ···t n is approximately constant. This suggests that the maximum (or minimum) of solar wind features is crucial for the duration of geomagnetic effects, and therefore the approaches that prioritize these maxima (minima) perform better in the recovery phase of an event. The mean correlation based on Kp for NN t n , NNM t n , NN t 0 ···t n and NNM t 0 ···t n is 0.64, 0.70, 0.64, and 0.66, respectively. These correlations are good considering that these are only first results and a Kp nowcast is not taken into account.
The correlation of GTEC in Figure 6b is weak for all configurations and has much stronger deviations. The results indicate that either the model algorithm or configuration needs to be improved, or that GTEC is not appropriate due to the lack of separation of spatial features (dTEC variations in different regions). This can only be resolved with the help of analyses based on TEC maps in the future. NNM t n and NNM t 0 ···t n generally perform better showing again the importance of the maximum (minimum). The mean correlation based on GTEC for NN t n , NNM t n , NN t 0 ···t n , and NNM t 0 ···t n is 0.13, 0.25, 0.12, and 0.21, respectively.   Figure 7i shows the ongoing change of the selected reference event and how this results in a good fit. However, this trend also shows that the variations do not fit before the selected time step. The observed deviations are stronger than in any other configuration. For NNM t n (e.g., see Figure 7j) the number of changes is reduced to 2 and a very good fit for all parameters is calculated. This shows once again that the reconstruction of an event is improved by applying the maximum (minimum) function (see Equation (9)). NN t 0 ···t n and NNM t 0 ···t n have a similar improvement with a reduced number of changes (especially in the second half of the 24 h period). The improvement is less significant for these configurations though, since NN t 0 ···t n shows without the applied function a small number of changes and a good fit for all parameters. In conclusion, NNM t n , NN t 0 ···t n and NNM t 0 ···t n predict n p , v p , B z , and p e for the geomagnetic storm during 14 December 2006 well, and only a small number of adjustments is required. A stable forecast, that is able to account potential variations, is realized on this basis.

dTEC Archive
In the archive, n p , v p , B z , p d , and p e , as well as dTEC maps and TEC maps are stored in a 24 h window for each reference event. The data have a shared sampling rate (for this study 1 h) for direct use in the forecast process. The solar wind data are stored for the calculation of the fitting event to the current situation and the TEC maps are stored for the reconstruction of the event progression. The dTEC maps at each time step t (sampling rate of 1 h) are calculated according to The centered moving average (CMA) is calculated here for the same time of day over a time window of 27 d. The starting time t 0 is the onset time of the reference event. Additionally, a 2-dimensional Gaussian filter can be applied to ensure better processing and interpretability for the forecast. The filtered dTEC is calculated for each coordinate with latitude φ and longitude λ according to the Gaussian kernel G as The standard deviation σ controls the smoothing factor of the Gaussian filter. In Figure 8, the calculated dTEC and filtered dTEC during the St. Patrick's Day geomagnetic storm at 17 March 2015 14:00 UTC are shown. The filter removes certain components of the spatial variations and reduces the range of values. This is desirable only up to a certain degree for an accurate forecast. On the other hand, the filtered map can be better superimposed on the quiet condition forecast to retrieve the storm condition forecast. This eliminates strong gradients and artifacts which otherwise negatively influence the forecast quality. Both TEC and dTEC maps are stored in the archive, so that it is also possible to provide a forecast based on relative values ∆TEC according to For this alternative no quiet condition forecast is required and ∆TEC would be delivered at each time step as the corresponding prediction. This approach was already implemented in the framework of the EU FP7 AFFECTS project [12].  Both calculated trends reflect the results of various studies investigating the ionospheric response to this event [25,27,28] showing the impact of a positive ionospheric storm followed by a negative ionospheric storm during the recovery period. Figure 9 shows also that the impact of the Gaussian filter has no significant impact on the time series. The defining variations of the event are unchanged despite the elimination of spatial artifacts (see Figure 8). The root mean square error (RMSE) between unfiltered and filtered dTEC is 2.88 TECU (2.21%) for all grid points during the 24 h period (see Figure 9). Nevertheless, the applied filter has to be optimized in the future. In addition, each event may obtain a specific filter configuration, if that approach proves to be beneficial.

Quiet Condition Forecast
The forecast for climatological changes in the ionosphere is calculated using the same time window of 27 d as dTEC (covering a single solar rotation). A time window of 81 d as used in various solar flux proxy models would also be an option. However, this is rather uncommon for processing TEC. The quiet condition forecast (qTEC) at each time step t (sampling rate of 1 h) is calculated using a SMA according to Thus, dTEC (reference events) and qTEC (observed event) refer to different time frames with respect to the onset time (see Equation (16)). The use of CMA would be the optimal solution in both cases (accounting for the enveloping ionospheric trend), but this is obviously not possible for a real-time qTEC calculation. Figure 10 shows CMA and SMA during the St. Patrick's Day geomagnetic storm at 17 March 2015. The difference between the two maps is insignificant (e.g., the mean value for CMA and SMA is approximately 26.76 and 26.31 TECU). There are also no apparent differences in the spatial distribution. Therefore, the results are good considering that qTEC is only the background model.

Storm Condition Forecast
The storm condition forecast (fTEC) is provided either directly through dTEC maps (see Equation (16)), ∆TEC maps (see Equation (18)) or through combination of these maps with the background model according to Equations (20) and (21).
The selected reference event E F (see Equation (13)) can change over time resulting in continuous updates of the forecast (see Figure 7).
The forecast according to NNM t n and NNM t 0 ···t n for the geomagnetic storm during 14 December 2006 favors the reference event during 15 September 2005 (assigned to 67% and 17% of the time steps). The forecast according to this reference event and Equation (20) is shown in Figure 11. Both observed and predicted dTEC (see Figure 11a,b), show an TEC enhancement in the Northern and Southern hemisphere between approximately 180 • W to 90 • W. The spatial distribution is different in shape and the observed dTEC in the Southern hemisphere is stronger. Nevertheless, the major features are retrieved. The derived ∆TEC maps (see Figure 11b,c) show the increase in both hemisphere as well, but other features are also present. The observed ∆TEC shows strong enhancements (up to 150%) in the Northern hemisphere at approximately 60 • N between 90 • W to 90 • E. These features (on Earth's night side) is less pronounced (up to 60%) for the predicted ∆TEC. Since TEC tends to take on rather small values at these latitudes and during the night, these changes do not make a significant difference to the absolute values, but nonetheless they indicate areas where ionospheric disturbances are likely to occur. The TEC observations during the event (onset 13:00 UTC, time step 12 h) and final forecast (see Figure 11b,c) have a good agreement regarding absolute values and as expected, a TEC increase in both hemispheres is retrieved. The spatial distribution due to the differences in dTEC is also present in the TEC forecast.
The difference of observed and predicted TEC in Figure 12d shows the discussed, inaccurate features. The TEC enhancement above and below the Equator on Earth's day side, as well as the high latitudes in the Northern hemisphere are underestimated by the forecast. Nevertheless, the forecast is good considering that the RMSE is 5.53 TECU. These features are even more pronounced in following time steps (during maximum impact). However, the difference decreases during the last time step in Figure 12f. The strength of the occurring event was underestimated by the model. This error can be reduced during periods using the same reference event by applying the observed difference during the last available time step to the future time steps as Figure 13 shows the adjusted difference maps (based on results in Figure 12) according to Equation (22) using the difference of 3 h before each time step. The spatial variations are not entirely removed, but they are minimized and the RMSE values are better at every time step (e.g., the underestimated TEC above and below the Equator in Figure 12f is compensated in Figure 13f). This improvement obtains better over time and makes especially the short-term predictions (few hours) more accurate. Nevertheless, the initial forecast and the forecasts covering more than 12 h are still dependent on the reliable reconstruction based on reference events. Figure 14 shows the difference maps according to Equation (22) for the St. Patrick's Day geomagnetic storm at 17 March 2015. These results are calculated based on v p and B z , since n p and p e are not available due to the data gaps of n p in that period (see Figure 2). The critical time period with the strongest variations is again about 15 h after the observed shock (see Figure 14e). The difference is also much stronger than in the first example (see Figure 13e), but still the calculated RMSE values are good (maximum RMSE of 7.49 TECU).  show that good RMSE values can be achieved, but this parameter alone is not enough to describe the quality of the forecast. Especially the spatial distribution of the TEC variations is a crucial criteria to describe the benefit of the approach. These spatial features have to be categorized and analyzed in detail for the realization of the proposed model.

Discussion
The presented ionospheric forecast model is based on the idea to use information from historical solar storm events to provide a detailed spatial and regional forecast of expected ionospheric disturbances. In respect to the first historical event based empirical model, developed within the AFFECTS project [12], several aspects have been added in the presented model in order to significantly enhance such forecast. The area of the predictions was extended from maps of the European region to global maps, which requires a more comprehensive evaluation of the results. Further, the proposed model is based largely on the analysis of solar wind data and is not using geomagnetic indices as the implemented model as part of AFFECTS. However, this may be added again in the future to make the forecast more accurate as solar storms progress. The most crucial difference is in the methods applied in both approaches. The proposed reconstruction from historical reference events (instead of applying a mean trend) in this study allows to analyze the spatial distribution in much more detail and even the consideration of secondary shocks and their respective impacts can be covered. The proposed corrections based on observed TEC during an event (see Figures 12 and 13) may be applied to geomagnetic data as well. Thus, the proposed procedure would match the three data sets (solar wind data, geomagnetic indices, and TEC maps) continuously to reconstruct the best-fitting event overall.
The calculated RMSE in Figures 13 and 14  The operational implementation of the model will update continuously during solar storm forecasts and provide the most recent calculation time (e.g., during the change of the selected reference event). During such an update, the forecast is calculated again for the whole period (covering already passed time steps). Thus, a consistent forecast is provided in real-time. The stability and reliability of these forecasts can be estimated using the quality of the event selection and number of occurred changes (see impact on historical events in Figure 4), but a measure of the overall accuracy of the forecast must also be defined for an operational service. The calculated RMSE (see Figures 12-14) is also considered as a performance parameter for the accuracy of continued forecasts.

Conclusions and Outlook
The TEC forecast model presented in this study does not attempt to describe both quiet and storm conditions, but can be operated on top of a climatological TEC forecast allowing much more detailed predictions of the temporal and spatial storm impact on the ionosphere. The model is activated with solar wind conditions during times of solar activity when ionospheric disturbances are expected. The model selects from a large data base of ionospheric storms at each time step the most appropriate historical event and can significantly limit the predictable variations. The accuracy and precision of the model is, therefore, given by the TEC response to geomagnetic activity driven by past ICME and CIR events.
The detection of such events is implemented with a set of thresholds (applied to solar wind and IMF measurements) and the model forecast is triggered according to the pre-defined threat levels (see Figure 2). In the proposed approach, the selection of fitting reference events is realized using a nearest neighbor approach that is executed in different configurations and continuously updated based on the measurements. The continuous evaluation of the actual event with the pool of available reference events with each other is performed for these configurations (see Figures 4 and 7) and moderate to good as well as weak correlations are retrieved for Kp and GTEC (see Figure 5), respectively. The correlation of Kp mainly confirms the event selection results, whereas GTEC results are more difficult to interpret since these values combine all the spatial variations of TEC (including negative and positive responses).
The presented example forecast of a reference event shows how the spatial distribution of TEC is reconstructed (see Figure 11). Although the general variations are described in both hemispheres, there are significant differences especially in the transition regions between high and low TEC values. This particularly affects the equatorial region, for which additional improvements need to be implemented in the future (e.g., TEC distribution along the geomagnetic Equator). Nevertheless, the difference map in Figures 13 and 14 present very promising results.
The introduced modeling approach aims for long time predictions (more than 12 h) of ionospheric disturbances due to solar storms. This is important, since long-term predic-tions of regional and spatial ionospheric disturbances are crucial for users of satellite based navigation, communication, and remote sensing services. The inclusion of the impact of solar storms on the ionosphere based on real-time data has not yet been satisfactorily completed in existing empirical and physical ionosphere forecast models. The forecast model presented here compares all available historical storm information to predict the most probable impact of future solar storm. In addition, the model is continuously improving by increasing its historical event data base with every upcoming solar storm.
In the future, the solar wind monitoring will be extended by the option of analyzing relative values in order to detect ICME or CIR during periods of strong solar wind activity. Both the current and planned approach will use an optimized setup of thresholds for the last two solar cycles and will be operated in parallel as part of the model. In addition, the filtering in the model (see Equation (17)) will be adjusted, since the existing solution does not reflect the value range of the initial TEC maps. An appropriate scaling or a different filtering method will solve this problem.
The main part of the further analysis will be the comprehensive evaluation of the TEC response during each reference event. The results of this analysis will be used to decide whether the proposed method (nearest neighbor) satisfies the requirements or whether certain selection criteria need to be added (e.g., season). In addition to the spatial distribution, the temporal evolution of the reference events also has a crucial role, which is why the comparison will also be performed covering all time steps (24 h time window). Based on this evaluation, further changes will be applied and sensitivity as well as stability of the model will be set accordingly.
In the future, additions to the model that incorporate further data are of main focus. These additions to the model would progressively improve the forecast with each time step and be more independent of the observed solar wind conditions. Each selection based on one of the measurements (solar wind measurements including IMF data, geomagnetic indices and global TEC maps) could be weighted according to the passed time and attributed importance for the forecast. For this, however, these data must be available with sufficient time in advance.
Longer lead times for the forecast may be achieved by analyzing further in situ measurements, such as considering the velocity and acceleration of CME calculated based on images from Large Angle and Spectrometric Coronagraph on-board the Solar and Heliospheric Observatory [71]. However, this also requires forecasting climatological changes in the ionosphere for longer time periods and thus increases significantly the complexity of the model and its evaluation. This also has to be considered in regard to the defined scope of the model for a real-time service.
In addition, different methods for assigning historical storm events may be analyzed to implement approaches that achieve better accuracy and efficiency than the current implementation. Furthermore, solar storms occurring in the future will be integrated into the data base granting a continuous improvement of the forecast capabilities.  Data Availability Statement: Publicly available datasets were analyzed in this study. ACE MAG and ACE SWEPAM data were obtained from the NASA ASC interface at http://www.srl.caltech. edu/ACE/ASC/level2/index.html (accessed on 16 August 2021). TEC maps were obtained from the NASA CDDIS interface at https://cddis.nasa.gov/archive/gnss/products/ionex/ (accessed on 16 August 2021). Definitive Kp index data were obtained from the GFZ Potsdam interface at https://www.gfz-potsdam.de/kp-index/ (accessed on 16 August 2021). The Dst index data were obtained from the WDC Kyoto interface at http://wdc.kugi.kyoto-u.ac.jp/dstdir/ (accessed on 16 August 2021).