A High Latitude Model for the E Layer Dominated Ionosphere

Under certain conditions, the ionization of the E layer can dominate over that of the F2 layer. This phenomenon is called the E layer dominated ionosphere (ELDI) and occurs mainly in the auroral regions. In the present work, we model the variation of the ELDI for the Northern and Southern Hemispheres. Our proposed Neustrelitz ELDI Event Model (NEEM) is an empirical, climatological model that describes ELDI characteristics by means of four submodels for selected model observables, considering the dependencies on appropriate model drivers. The observables include the occurrence probability of ELDI events and typical E layer parameters that are important to describe the propagation medium for High Frequency (HF) radio waves. The model drivers are the geomagnetic latitude, local time, day of year, solar activity and the convection electric field. During our investigation, we found clear trends for the model observables depending on the drivers, which can be well represented by parametric functions. In this regard, the submodel NEEM-N characterizes the peak electron density NmE of the E layer, while the submodels NEEM-H and NEEM-W describe the corresponding peak height hmE and the vertical width wvE of the E layer electron density profile, respectively. Furthermore, the submodel NEEM-P specifies the ELDI occurrence probability %ELDI. The dataset underlying our studies contains more than two million vertical electron density profiles covering a period of almost 13 years. These profiles were derived from ionospheric GPS radio occultation observations on board the six COSMIC/FORMOSAT-3 satellites (Constellation Observing System for Meteorology, Ionosphere and Climate/Formosa Satellite Mission 3). We divided the dataset into a modeling dataset for determining the model coefficients and a test dataset for subsequent model validation. The normalized root mean square deviation (NRMS) between the original and the predicted model observables yields similar values across both datasets and both hemispheres. For NEEM-N, we obtain an NRMS varying between 36.1% and 47.1% and for NEEM-H, between 6.1% and 6.3%. In the case of NEEM-W, the NRMS varies between 38.5% and 41.1%, while it varies between 56.5% and 60.3% for NEEM-P. In summary, the proposed NEEM utilizes primary relationships with geophysical and solar wind observables, which are useful for describing ELDI occurrences and the associated changes of the E layer properties. In this manner, the NEEM paves the way for future prediction of the ELDI and of its characteristics in technical applications, especially from the fields of telecommunications and navigation.


Introduction
At middle and low latitudes, where photoionization dominates, the electron density of the F2 layer clearly exceeds that of the E layer, which is closely related to the solar zenith angle at daytime [1]. However, at high latitudes, precipitating energetic particles from the magnetosphere can produce strong enhancements in the electron density of the E layer, as have been occasionally observed using different techniques [2]. E layer electron densities clearly exceeding the peak density of the F2 layer were systematically studied for the first time by Mayer and Jakowski [3]. They characterized such events as E layer dominated ionosphere (ELDI), showing the peak electron density at a height range between about 90 and 150 km. ELDI events may occur either due to increased E layer ionization caused by energetic particle precipitation, or due to strong depletion of the F2 layer ionization. Such a depletion may arise, for example, inside the polar hole [4,5] or inside the mid-latitude trough region [6,7]. However, previous studies revealed a strong relationship of ELDI events recorded using ionospheric radio occultation (IRO) techniques with the auroral zone and not with the polar cap or the mid-latitude trough region [3,8,9]. Therefore, in this work, we interpret ELDI events as occurring due to enhanced particle precipitation, which is capable of producing electron densities in the order of more than 10 5 cm −3 . The duration of ELDI events can reach up to 5 hours, as Cai et al. [10] found when analyzing EISCAT observations between 2009 and 2011. The ELDI phenomenon is quite different from sporadic E layer events, which are characterized by very thin layers of extremely enhanced electron density occurring preferably at mid-latitudes in summer [11][12][13].
By analyzing vertical electron density profiles obtained from IRO observations onboard the COSMIC (Constellation Observing System for Meteorology, Ionosphere and Climate/Formosa Satellite Mission 3) satellite mission [14][15][16] and the CHAMP (CHAllenging Minisatellite Payload) mission [17], Mayer and Jakowski [3] found that the location of the ELDI events in the Northern Hemisphere can be described by an ellipse having one focal point at the geomagnetic pole. They observed that the semi-major axis of the ellipse and the spread of ELDI events relative to the ellipse increased during enhanced geomagnetic activity, while the number of ELDI events increased for low solar activity and during local night. By evaluating electron density profiles retrieved from incoherent scatter radar observations carried out during low solar activity conditions for the period from 2009 to 2011 at the European Incoherent Scatter Radar (EISCAT) and at the EISCAT Svalbard Radar (ESR), Cai et al. [10] found that at both radar sites, the number of ELDI events increased during winter and spring. Moreover, during both seasons, the number of ELDI events increased around local night at the EISCAT site. Mannucci et al. [18] analyzed COSMIC IRO electron density profiles at high latitudes during geomagnetic storms that were induced by two high-speed streams in April 2011 and May 2012 and by two coronal mass ejections in July and November 2012. They found that the number of ELDI events increased during the main phases of the storms, whereas their distribution shifted equatorward. Kamal et al. [8] investigated ELDI occurrences for the Northern and Southern Hemispheres using CHAMP IRO data from 2001 to 2008 and COSMIC data from 2006 to 2018. They found that in both hemispheres, the ELDI events were distributed along an ellipse located around the geomagnetic pole. Furthermore, they confirmed that more ELDI events occurred during winter and local night as well as during low solar activity and during geomagnetic storms. As a preliminary step for a possible ELDI model, Kamal et al. [9] described the elliptical ELDI event distribution by three spatial parameters and one temporal parameter. They investigated the variations of these parameters as a function of season, solar activity, geomagnetic activity, interplanetary magnetic field, convection electric field and solar wind energy using COSMIC IRO data covering the period from 2006 to 2018. The authors found clear dependencies between the selected geophysical quantities and the considered ELDI parameters. In addition to the variation in the occurrence rate of ELDI events with maxima in winter and at low solar activity, the authors also observed an increase in their number, as well as a shift in their distribution toward the equator for rising solar wind parameters and related geomagnetic activity indices.
While previous studies on the ELDI have dealt explicitly with its observations, the primary objective of the present work is to develop an empirical model describing main features of ELDI events. In this context, we use a modeling approach similar to other ionospheric parameter models previously developed at the German Aerospace Center (DLR) in Neustrelitz. Space-based single frequency navigation systems require a correction for the signal delay caused by the ionosphere, which depends on the total electron content (TEC) along the ray path. For predicting the vertical TEC, Jakowski et al. [19] developed the global Neustrelitz TEC Model (NTCM) based on TEC data generated by the Center for Orbit Determination in Europe at the Astronomical Institute of the University of Bern. Since the F2 layer exhibits the highest ionization of all the ionospheric layers, it has the largest influence on trans-ionospheric radio wave propagation. Considering this, Hoque and Jakowski [20] developed the Neustrelitz Peak Density Model (NPDM) for describing the peak ionization NmF2 of the F2 layer. The model is based on a large dataset of vertical sounding data from ionosonde stations and radio occultation observations from the CHAMP, GRACE and COSMIC missions. By using the same datasets, Hoque and Jakowski [21] also developed the Neustrelitz Peak Height Model (NPHM) for describing the peak ionization height hmF2 of the F2 layer. Each of the aforementioned models follows a nonlinear multiplicative approach and is implemented by a parametric function. These functions consist of several nonlinear terms, which are linked to each other in a multiplicative manner and depend on one model driver each. Overall, this leads to relatively compact models for the key observables with a small number of coefficients (up to 13 in the case of the aforementioned models).
The proposed Neustrelitz ELDI Event Model (NEEM) is composed of four submodels, whose respective model observables describe one aspect of the ELDI each. In this context, the submodel NEEM-P specifies the ELDI occurrence probability (%ELDI). While this observable indicates a change in the E layer's propagation characteristics, the associated E layer observables indicate how these characteristics change in the case of an ELDI event occurring. In this regard, we represent the observables peak ionization (NmE), peak height (hmE) and vertical profile width (wvE) by the submodels NEEM-N, NEEM-H and NEEM-W, respectively. Based on our previous studies, we select the geomagnetic latitude, local time, day of year, solar activity and the solar wind related convection electric field as user friendly and viable model drivers. Even though we give initial physical explanations for the dependencies found between the model observables and the model drivers, the focus of the present paper is mainly on the modeling of these dependencies. In this work, we use vertical electron density profiles derived from IRO observations onboard the COSMIC satellites. Since the satellites have an orbital inclination of 72 • , these observations cover a large area of the Earth's surface, including the auroral zones. Our database consists of approximately 2.3 million profiles covering a period of more than one solar cycle from April 2006 to December 2018. The presence of an enhanced E layer ionization during ELDI occurrences may affect the propagation of HF radio waves and, therefore, has an effect on telecommunication and navigation applications. The use of an ELDI model in such applications could help predict signal propagation perturbations and their extent.

Data
We used vertical electron density profiles derived from IRO observations onboard the COSMIC satellites. The profile files contained in our dataset cover a period from April 2006 to December 2018 and were obtained from the COSMIC Data Analysis and Archive Center [22]. First, we filtered out all those profile files that could either not be opened or whose contents were corrupted. For each profile, we then performed the computations listed in the following paragraph. The information derived through this process was later used for the computation of the model coefficients and of the model observable trends.
To obtain the ELDI state ELDI st of a profile, which indicates whether the profile represents an ELDI event or not, and to obtain the three E layer observables, we proceeded similarly as described in [8]. In this regard, we first divided the height range of the profile into an E layer segment extending from 90 to 150 km and an F2 layer segment located above 150 km. Thereafter, we determined the maximum electron density NmF2 and the associated height hmF2 within the F2 layer segment. Fitting a Chapman layer function to the F2 layer topside electron density and extrapolating it into the bottomside ionosphere yielded the F2 layer contribution to the profile. Next, we subtracted the F2 layer contribution from the observed profile in order to extract the E layer. Subsequently, we fit a Gaussian function to the E layer. Profiles for which any of these fitting steps failed were ignored from here on. The magnitude, position and full width at half maximum (FWHM) of the Gaussian resulted in the E layer observables NmE, hmE and wvE, respectively. Via a comparison of NmE with NmF2 and additional filter steps outlined in [8], we determined the ELDI state of the profile. The day of year DOY, which represents seasonal variation, and the local time LT were taken from the profile's timestamp information. The observational data for the solar radio flux index F10.7, which we used as a measure for solar activity, were downloaded from the Space Physics Data Facility (SPDF) website of the Goddard Space Flight Center [23]. Following other ionosphere models previously developed in Neustrelitz, as presented in Jakowski et al. [19] and Hoque and Jakowski [20,21], we computed the geomagnetic latitude MLAT of the profile from its geographic latitude and longitude by means of the dipole magnetic field model based on a formula described in Davies [1]. Furthermore, we computed the convection electric field magnitude E Conv as discussed in Kamal et al. [9] according to a formula given in Kelley [24] from observational solar wind speed data (v sw ) and interplanetary magnetic field data (B sw ) downloaded from the SPDF website. Overall, performing the aforementioned processing steps resulted in the following information for each profile: ELDI st , NmE, hmE, wvE, MLAT, LT, DOY, F10.7 and E Conv . The flowchart in Figure 1 summarizes the steps involved in the data pre-processing. After completion of the data pre-processing, our resulting final dataset contains the information about the ELDI state, the E layer observables and the model drivers for about 2.3 million profiles in total. To compute the model coefficients and to validate the model, we randomly split the dataset into a modeling dataset and a test dataset. In order to increase the quality of the model coefficients, the modeling dataset contains 90% of the total data, whereas the test dataset contains 10% of it.

Modeling Approach
As indicated in the introduction, our ELDI model consists of four submodels, each representing a particular model observable. Every submodel is expressed by a separate equation identical for both hemispheres. However, the equation coefficients differ between the Northern and Southern Hemispheres. In the next section, we describe how the equation for a single model observable is put together and how we computed its coefficients.

Model Structure
Since we have had good experience using the multiplicative approach in previous modeling studies with respect to its robustness, small number of coefficients and real-time capability [19][20][21], we follow the same modeling approach in the present work as well. For a model observable F, the associated equation F(x 1 , . . . , x n ) depends on a specific set of model drivers x 1 , . . . , x n . Each dependency of the model observable on a single model driver is given by a separate model term F 1 (x 1 ), . . . , F n (x n ). Since the model equation follows the multiplicative approach, it can be computed as the product of all model terms as shown in Equation (1).
Each model term contains one or more base functions, which were chosen from trend analysis of the model observable with respect to the driver in such a way that they approximate the trend sufficiently well. Furthermore, each term includes internal coefficients located inside of the base functions and external coefficients located outside of them, all of which were determined numerically by means of nonlinear fitting. As an example, in Equation (2), we show a reduced version of the NmE model equation, which is given fully in Section 4.1. The first term models the dependency on the driver MLAT using a Gaussian as a base function, which contains the phase shift MLAT 0 and the standard deviation σ MLAT as internal coefficients. The second term models the dependency on the driver LT using two cosine functions. They represent the diurnal and the semidiurnal variations and contain the phase shifts LT D and LT SD as internal coefficients, respectively. The external coefficients are given by the amplitudes a 1 , a 2 and a 3 , and the bias a 4 .

Model Coefficient Computation
Below, we explain how we computed the internal and external model coefficients. Their computation differs for the E layer observables and the ELDI occurrence probability.

E Layer Observable Model Coefficients
For a considered E layer observable, such as NmE, hmE or wvE, we computed the model coefficients based on all those profiles of the modeling dataset that correspond to an ELDI event. To improve the quality of the computed coefficients, we first reduced the number of outliers of the E layer observable. Such outliers may arise, for example, during the derivation of the observable values from the original profiles. Therefore, as part of a percentile pre-filtering, we set up a histogram of the used observable values and then excluded those profiles whose observable value fell in the lower or upper 10% of the histogram.
To compute the internal coefficients, we considered each model term as independent from all the other terms by analogy with Jakowski et al. [19] and Hoque and Jakowski [20,21]. Regarding this, we set up a cumulative histogram of the used values of the corresponding model driver and divided it into 20 bins in such a way that each bin contained approxi-mately the same number of profiles. Next, we computed the median of the driver values and the median of the observable values for every bin. Subsequently, we fit the considered model term to the resulting 20 data points by means of nonlinear least squares fitting in order to estimate all its coefficients. From this step, we kept only the internal coefficient values as fixed results.
To determine the external coefficients of the full model equation, we directly fit the equation to the model observable-model driver combinations of the pre-filtered ELDI profiles, without performing any additional binning beforehand.

ELDI Occurrence Probability Model Coefficients
As a basis for computing the coefficients of the %ELDI model equation, we used the full modeling dataset of profiles without any form of prior outlier filtering.
To estimate the internal coefficients, we proceeded analogously to their computation for the E layer observables. In this manner, we first set up a cumulative histogram of the used model driver values, subsequently dividing it into 20 bins and then computing the median of the driver values for each bin. Unlike for the E layer observables, for each bin we computed a %ELDI value by relating the number of ELDI profiles to the total number of profiles. Subsequently, we fit the model term to the resulting 20 datapoints.
To compute the external coefficients, we first divided the value range of each model driver into intervals containing approximately the same number of profiles. We refer to a particular combination of these intervals as an interval key. Thereafter, we assigned each profile to such a key, depending on the intervals into which its associated model driver values fell. Below, we give a simple example to demonstrate the assignment of profiles to interval keys. Here, we consider only the three model drivers LT, DOY and F10.7, whose value ranges have been divided into two, three and three intervals, respectively (Table 1). Assuming that, for a given profile, the model drivers have an associated value assignment of LT = 15.3, DOY = 250 and F10.7 = 85.0, the profile will be assigned to the interval key "2-3-1" based on the indices of the matching intervals.
After assigning the profiles to the interval keys, we computed the %ELDI value for each key as the ratio of the ELDI profile number to the total profile number and also the median value for each driver. Subsequently, we fit the entire equation to the %ELDImodel driver combinations of the keys, while keeping the internal coefficients fixed to the previously determined values. This process resulted in all the external coefficients of the submodel.

Modeling Results
In the following sections, for each model observable, we show the associated submodel consisting of the model equation and its coefficients. Next, we show the trends for the original and the predicted observable values depending on the model drivers. Subsequently, we report estimates for the root mean square deviation between observation and prediction. Thereafter, we estimate which model driver used in the submodel has the greatest impact on the variation of the observed model observable values.

NEEM-N: Model for the E Layer Peak Ionization
Equation (3) is the model equation for the peak ionization NmE of the E layer. The model term accounting for the geomagnetic latitude is based on a Gaussian function. The terms representing the local time and day of year dependencies are given by two superposed cosine functions each, while the dependencies on the solar activity and on the convection electric field magnitude are given by linear functions. Table 2 shows the associated internal and external coefficients for both hemispheres computed as described in Section 3.2. The internal coefficients are given by the phase shift MLAT 0 and the standard deviation σ MLAT for the geomagnetic latitude, by the phase shifts LT D and LT SD for the diurnal and semidiurnal variations, and by the phase shifts DOY A and DOY SA for the annual and semiannual variations. The external coefficients of the equation consist of the amplitudes c 1 , ..., c 7 and the bias c 8 . 1 24 ·(LT − LT D ) + c 3 · cos 2π· 2 24 ·(LT − LT SD ) + 1 c 4 · cos 2π· 1 365.25 ·(DOY − DOY A ) + c 5 · cos 2π· 2 365.25 ·(DOY − DOY SA ) + 1 (c 6 ·F10.7 + 1) (c 7 ·E Conv + c 8 ) In Figure 2, we plot two trends for each hemisphere and each dependency of NmE on one of the model drivers. One of these trends results from binning the originally observed NmE values (red), whereas the other is based on binning the NmE values predicted by our model (black). To highlight the symmetry between the observed trends for the Northern and Southern Hemispheres, in the case of MLAT, we generally plot them depending on the absolute value of the geomagnetic latitude. All the trends are supplemented by error bars indicating the standard deviations of the underlying NmE values within each bin. In the last row of the figure, we report the root mean square deviation (RMS) between the observed and the predicted NmE values, averaged for all the dependencies of the respective hemisphere. The normalized root mean square deviation (NRMS) describes the RMS normalized by the mean of the originally observed NmE values. To estimate which model driver causes the largest variation of the observed NmE, we computed the range of each observed NmE trend and then determined the model driver that corresponds to the largest range. In this regard, the range of a considered trend was computed as the absolute value of the difference between its maximum and minimum NmE values. As can be seen in the plots of Figure 2, NmE shows its maximum between 65 • and 70 • MLAT, which is in the main region of the auroral precipitation zone. Furthermore, in local summer and for increased solar activity, the enhanced solar EUV irradiance causes an increased ionization of the entire ionosphere and, thus, also of the E layer. The dependency on LT shows basically a semidiurnal variation. For an enhanced convection electric field, we observe an increase in the peak ionization of the E layer. This observation could be explained by the fact that the convection electric field magnitude is a measure for the degree of magnetosphere-ionosphere coupling. Its increase indicates an enhanced particle precipitation and, therefore, increased NmE. For both hemispheres, the range of the observed NmE is largest depending on DOY (north: 81,000 cm −3 , south: 66,400 cm −3 ).
Since NmE determines the critical frequency of the ionosphere in the case of an ELDI event, this quantity is relevant for terrestrial radio wave propagation at high latitudes. As we see in the results, the modelled values of NmE may reach up to about 2 × 10 5 cm −3 at both hemispheres, which corresponds to a critical frequency of about 4 MHz. Considering the derived RMS values, the critical frequency may reach 6 MHz during an ELDI event within this variability range.

NEEM-H: Model for the E Layer Peak Height
The propagation of HF radio waves also depends on the height of the reflecting layer. In this regard, below, we show the model equation (Equation (4)) and its corresponding coefficients (Table 3) for the peak height hmE of the E layer. Figure 3 shows the related trends and the RMS values for the peak height. Since we found only a negligible variation of hmE depending on the convection electric field magnitude in the initial trend analysis, this quantity is not considered as a driver in the model equation. As expected, the height hmE of the E layer is rather stable, varying between about 105 and 112 km for both hemispheres, with the RMS values being extended by about 7 km. These observations fit well with those of Cai et al. [10]. They found that at the EISCAT site (69.9 • N), which is located inside the auroral oval for most of the time, hmE was about 114 km for the majority of ELDI events. At the ESR site (78 • N), which is mostly located outside the auroral oval, this value was around 108 km. The plots presenting the dependencies of hmE on MLAT show a maximum hmE in the latitude range between 65 • and 70 • , indicating the location of the auroral zone. Regarding the dependency on the local time, a minimum hmE can be found around 8:00 to 12:00 LT. In contrast, the dependency on DOY shows only a minor semiannual variation. Furthermore, we find that hmE increases slightly for increasing F10.7. This effect could be explained by the higher amount of EUV irradiance and enhanced heating for increasing F10.7, which causes an expansion of the upper atmospheric layers in general. For both hemispheres, the range of the observed hmE is largest depending on LT (north: 7.9 km, south: 6.4 km).

NEEM-W: Model for the E Layer Vertical Profile Width
As follows, we introduce the model equation (Equation (5)) and its related coefficients (Table 4) for the vertical profile width wvE of the E layer. The corresponding trends and the RMS are given in Figure 4.  As we can see from the plots, the average width of the E layer generally varies between about 15 and 21 km for both hemispheres. This value range is also in the order of magnitude of the E layer thickness observed by Cai et al. [10], who defined this parameter in a similar way to our E layer profile width. They found that at the ESR site, the E layer thickness was about 17 km for most of the ELDI events, while it was about 23 km at the EISCAT site. These observed magnitude ranges are quite different from the sporadic E layer thickness of up to 2 km at mid-latitudes [11][12][13]. It is interesting to note that the trends show a decrease in wvE for increasing geomagnetic latitudes. Furthermore, we see a slight enhancement of wvE around local noon and with increasing solar activity. The DOY dependency shows a minor semidiurnal variation. For both hemispheres, the range of the observed wvE is largest depending on MLAT (north: 5.4 km, south: 4.6 km).

NEEM-P: Model for the ELDI Occurrence Probability %ELDI
As explained in Section 3.2.2, we divided the value ranges of the model drivers into intervals to compute the external %ELDI model coefficients. On the one hand, the number of intervals of a considered driver and, thus, of the resulting data points must be high enough to adequately represent the curve formed by the base functions included in the associated model term. On the other hand, this number is limited to ensure that each interval key is assigned a sufficiently large amount of profiles to obtain a good %ELDI resolution. Our initial trend analysis showed that for describing the MLAT dependency, a single Gaussian function is suitable. For describing the LT and DOY dependencies, the superposition of two cosine functions is appropriate in each case. All the listed function terms can be sufficiently approximated by six points each. In comparison, the basically linear F10.7 and E Conv dependencies can be described by three points each. This results in the following interval numbers for the model drivers: MLAT: 6, LT: 6, DOY: 6, F10.7: 3, E Conv : 3. For the modeling dataset, we want to ensure a minimum %ELDI resolution of 1% for fitting the model equation to the %ELDI-model driver combinations of the keys. Hence, we set the threshold for the minimum number of assigned profiles required by a valid key to 100. Table 5 shows statistical information about the keys generated for both hemispheres. Thereafter, we give the %ELDI model equation (Equation (6)) and its coefficients ( Table 6).  Figure 5 shows the corresponding trends and RMS values for the %ELDI model. Considering the trends for both hemispheres, we see that the largest %ELDI value is about 12%. Similar to NmE, %ELDI also shows a maximum between 65 • and 70 • MLAT, which fits well with the findings of Kamal et al. [8,9]. In the case of a reduced solar zenith angle, as for example around local noon or in local summer, %ELDI is decreased, which matches the observations of Cai et al. [10] and Kamal et al. [8,9]. A decrease in %ELDI is also visible for increased solar activity, which was likewise found previously by Kamal et al. [8,9]. These dependencies could be explained by the fact that both a small solar zenith angle and a high solar activity are associated with an increased amount of solar irradiance. This causes an enhanced ionization of the upper ionospheric layers, especially of the F2 layer, thereby hiding the ionization of the E layer produced by particle precipitation. In this way, the number of occurring ELDI events decreases. For an increasing convection electric field magnitude, we also see an increase in %ELDI, which is in line with the findings for NmE and was also observed by Kamal et al. [9]. For both hemispheres, the range of the observed %ELDI is largest depending on LT (north: 9.1 %, south: 13.0 %).

Model Validation
In this section, we give the results of validating our models with the test dataset. In this regard, we again show trends for the original model observable values (red) and for those as predicted by our models (black). Furthermore, we show the resulting RMS and NRMS values and compare them with the corresponding values presented in Section 4 for the modeling dataset. Figure 6 shows the results of validating our NmE model NEEM-N with the test dataset. For the Northern Hemisphere, we find an NRMS of 36.1% for the modeling dataset, while this value is slightly increased to 36.3% for the test dataset. For the Southern Hemisphere, the NRMS for the test dataset is 46.9%. This value is slightly lower than the NRMS for the modeling data set, which is 47.1%. Figure 7 shows the results of validating our hmE model NEEM-H with the test dataset. When comparing the Northern Hemisphere NRMS results for the modeling dataset with those for the test dataset, we find that the NRMS is 6.1 and 6.2%, respectively. For the Southern Hemisphere, we have the same NRMS of 6.3% for both datasets. Figure 8 shows the validation results for the wvE model NEEM-W after applying it to the test dataset. By comparing the NRMS results, we see that for the Northern Hemisphere, we have a value of 38.6% for the modeling dataset, while this value is slightly decreased to 38.5% for the test dataset. In the case of the Southern Hemisphere, the NRMS for the two datasets is 41.0 and 41.1%, respectively.
With a share of only 1/10th of the full electron density profile dataset, the test dataset is much smaller than the modeling dataset. Consequently, for the qualitative validation of the %ELDI model, it was necessary to adjust the interval key formation in such a way that a sufficiently large number of valid keys remain. Therefore, we reduced the threshold for the minimum number of assigned profiles required by each valid key to 50, resulting in a %ELDI resolution of 2%. In addition, we reduced the number of intervals to four for the model drivers MLAT, LT and DOY. Table 7 shows the corresponding statistical information about the interval keys generated for both hemispheres. In Figure 9, we show the results of validating our %ELDI model NEEM-P with the test dataset. Regarding the NRMS for the Northern Hemisphere, we see a minor increase from 56.5% for the modeling dataset to 57.4% for the test dataset. For the Southern Hemisphere, we observe an increase from 58.7 to 60.3%.
It is interesting to note that for both the modeling dataset (Section 4) and the test dataset (Section 5), the Southern Hemisphere always shows a higher RMS deviation between the observation and the prediction than the Northern Hemisphere. Since we have chosen identical model equations for both hemispheres in all the cases, this effect could be due to a hemispherical asymmetry, which includes many aspects [25,26]. Consequently, further investigations are necessary to find out the exact causes of this effect.

Conclusions
In this paper, for the first time, we present the empirical Neustrelitz ELDI Event Model (NEEM), which climatologically describes the variation of the E layer dominated ionosphere (ELDI) for the high latitude regions of both hemispheres. NEEM depends on key model drivers. These are the geomagnetic latitude, day of year, local time, solar activity and convection electric field. The model is based on more than two million vertical electron density profiles covering more than one solar cycle from April 2006 to December 2018. The profiles were obtained from ionospheric GPS radio occultation observations on board the COSMIC/FORMOSAT-3 mission. NEEM consists of four submodels. Each submodel is defined by a parametric function equation, which is multiplicatively composed of several terms, one for each driver. The individual submodels NEEM-N, NEEM-H, NEEM-W and NEEM-P describe the peak density, the corresponding height and the width of the vertical E layer electron density profile, as well as the occurrence probability of the ELDI, respectively. The normalized root mean square deviation (NRMS) between the original and the predicted model observables yields similar values across both datasets and both hemispheres. For NEEM-N, we obtain an NRMS varying between 36.1 and 47.1%, and for NEEM-H between 6.1 and 6.3%. In the case of NEEM-W, the NRMS varies between 38.5 and 41.1%, whereas it varies between 56.5 and 60.3% for NEEM-P. In summary, the proposed NEEM represents a promising initial modeling approach for describing the occurrence probability of ELDI events, as well as the associated impact on the E-layer. As such, it paves the way for future prediction of the ELDI and its properties in applications, especially from the fields of terrestrial and space-based telecommunications and navigation. Nevertheless, several possibilities exist to improve NEEM. For example, increasing the size of the modeling dataset could lead to better statistics of the model observables and, thereby, increase the quality of the estimated coefficients. As with other models previously developed in Neustrelitz, the conversion of geographic into geomagnetic coordinates is based on the dipole magnetic field model. An alternative conversion using the IGRF model instead would cause more precise geomagnetic coordinates and, thus, improve the observed dependencies, especially on the geomagnetic latitude. Furthermore, by increasing the number of base functions contained in the individual model terms, the submodels would have more degrees of freedom available to adequately approximate the observed dependencies. Since NEEM is a climatological model that describes the averaged response of the ELDI to the drivers, the reliability to estimate the occurrence of individual ELDI events, often closely related to perturbation induced particle precipitation, is limited. Consequently, the results could certainly be further improved when considering the time delay between the observed solar wind parameters used as model drivers in this study and the magnetospheric-ionospheric-thermospheric response. This is a challenging task that will be addressed in future work.