3.1. Statistical Characterization of Fog Events Duration at A-8 Motor-Road
The statistical characterization of low visibility events associated with fog at the A-8 motor-road is discussed in this experimental subsection. The KS-distances (Equation (3)) between the considered CDFs (see
Table 2) and the ECDF for each year and season have been calculated taking into account the two approaches described in
Section 2.2, i.e., the Maximum Likelihood and L-moments methods. The analysis is carried out by season, in order to discuss whether the statistical characterization of fog events depends on the season of the year. First,
Table 4 shows the statistics of the low-visibility events at Mondoñedo in the years of the study (2018 and 2019).
As can be seen, the number of low-visibility events (<2000 m) by season in both years is quite high, over 200 episodes or quite close to 200 in all cases.
In general, the number of fog events does not seem to depend on the season, unlike the case of radiative-type fogs, which are associated with cool/cold periods. Since the fog studied here is of orographic type, the number of low-visibility events is very similar in all seasons. In fact, as can be seen in
Table 4, the longest average duration of the events is found in summer, with durations about 300 min in both years of study. The shortest durations appear in autumn with 81.21 min of average duration in 2019, or winter with 122.69 min in 2018.
Figure 4 shows the different fog events that occurred in 2018 and 2019 at Mondoñedo and their associated minimum visibility values. Observe that the events of longer duration are associated with the lowest minimum visibility values, i.e., denser fog events take longer to dissipate.
We proceed now with the analysis of the probability distributions for fog events duration, by means of minimizing the KS-distance for the two methods of evaluation considered: Maximum Likelihood and L-moments. We start with the Maximum Likelihood case.
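The comparison described above can be illustrated with a minimal sketch. It assumes the usual definition of the KS-distance as the largest absolute gap between the ECDF and a fitted CDF, and it uses the exponential distribution only because its Maximum Likelihood estimate has a closed form (rate = 1 / sample mean). The function names `ks_distance` and `fit_exponential_ml` are hypothetical and are not taken from the study.

```python
import math


def ks_distance(sample, cdf):
    """Largest absolute gap between the ECDF of `sample` and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The ECDF jumps from (i - 1) / n to i / n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def fit_exponential_ml(sample):
    """Maximum Likelihood fit of an exponential CDF (rate = 1 / mean)."""
    rate = len(sample) / sum(sample)

    def cdf(x):
        return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

    return cdf


# Hypothetical event durations in minutes, for illustration only.
durations = [30, 60, 60, 90, 120, 240, 600, 1500]
distance = ks_distance(durations, fit_exponential_ml(durations))
```

The same `ks_distance` function can be reused with any other candidate CDF, which is how a ranking of distributions by KS-distance would be obtained.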
Table 5 shows the obtained numerical KS-distance using the Maximum Likelihood method. Moreover,
Figure 5 illustrates the estimated distributions where the y-axis is plotted logarithmically scaled. In this case, ten distribution functions described in
Table 2 were analyzed. We show which distributions best characterize the duration of fog events, based on this statistical and seasonal analysis. Specifically, we report the KS-distance, which provides a quantitative reference for selecting the best distribution. According to the results, the extremes are significant, i.e., there are many long-lasting events (extreme low-visibility events), and the best results in terms of KS-distance are obtained with the heavy-tailed distributions. Three distributions fit the duration of fog events at Mondoñedo better than the rest across the seasons: the GEV, GPa and STA distributions. These three distributions have the shortest KS-distances and a stationary behavior over the time period. The KS-distance is around 0.07 for the GEV across the seasons in both years of study, around 0.09 for the GPa, and around 0.06 for the STA in 2018, although with worse values in 2019. LLG also fits the data distribution well in all seasons, with KS-distances around
in 2018 and
in 2019, even better than STA this last year. This can also be seen more visually in
Figure 5, where heavy-tailed distributions such as GEV (in orange) or STA (in burgundy) best fit the data. The same is not true for light-tailed distributions such as EXP (in black) or NRM (in blue), nor for the Logistic (LOG) distribution (the red curve). In fact, EXP and LOG are straight lines far from the origin in the logarithmic representation, so they do not fit very well, and neither do NRM and EV (in cyan), which have a concave shape in this representation. The mismatch is especially marked when the data contain more extreme events since, as previously mentioned, those distributions are characterized by a rapid decrease in the probability of generating extreme values.
If we look at
Table 5, we see that, in most seasons, in both years of study, the shortest KS-distance on average through seasons is obtained by the GEV distribution. It is known that this distribution is well suited for estimating the maximum of samples of size
n, from sufficiently long sequences of independent and identically distributed random variables [
50]. On the other hand, stable type distributions explain more adequately the extreme or rare phenomena, since they usually explain observations with extreme values and skewness. This denotes the presence of heavy tails [
51]. This justifies the inclusion of this type of distribution in the study. Note that these distributions are a more efficient alternative to analyze high volatility phenomena due to their capacity to generate extreme values [
51]. Finally, the GPa distribution also plays an important role in the EVT, and it is very common in the study of extreme events related to hydrological issues [
52,
53]. Its fit in our results shows quite stationary behavior (with both methods), despite not showing the smallest KS-distance. This is not the case with the short-tailed distributions used in the study, i.e., EXP, LOG, NRM and EV. These distributions fit the fog events data worse than heavy-tailed distributions, as they are characterized by a rapid decrease in the probability of generating extreme values, with KS-distance values around 0.3, or even 0.4 for the EV.
Table 6 shows the obtained numerical KS-distance using the L-moments method. Furthermore,
Figure 6 illustrates the estimated distributions where the y-axis is plotted logarithmically scaled. A total of nine distributions were taken into account in this case, because the moments of the STA distribution do not converge for certain parameters of the distribution. The results show that the distributions that best fit the fog events in these two years of study are the LN, GAM and GPa, with KS-distances between 0.1 and 0.2. In addition, the Log-Logistic (LLG) presents stationary KS-distance values over time, about 0.1. These four distributions are used for modelling hydrological processes or, more generally, natural systems [
53]. Specifically, the GAM distribution applies to a wide range of physical processes and is related to other distributions: EXP, Pascal, Erlang, Poisson, and chi-square. It is commonly used in meteorological processes, e.g., to represent pollutant concentrations and precipitation quantities [
54]. Moreover, it is used to measure the time between the occurrence of events when the event process is not completely random [
55]. Similarly, in our case, fog events in the northwest of the Iberian Peninsula, especially in summer, are driven by the displacement of the Azores anticyclone. The GAM distribution seems to benefit from the L-moments estimation method, obtaining lower KS-distances than when estimated by the Maximum Likelihood method, see
Table 5 and
Table 6. However, from a qualitative point of view, the GAM probability density function increasingly resembles a straight line as we move away towards
in
Figure 5 and
Figure 6. This is not the observed behavior of the data distribution. It is expected that, as the number of samples increases, GAM will fit the data distribution worse. Once again, heavy-tailed distributions are the ones that best fit these meteorological situations in Mondoñedo, except for GAM, which is light-tailed and also obtains good results. The light-tailed distributions such as EXP, NRM and EV obtain the poorest fit to the data. See, for example, Spring 2018, with a KS-distance of 0.517 for the EV distribution, and how, together with LOG, EXP and NRM, it does not adjust correctly to the extreme values of two events of more than 70 h located in the tail of the data distribution.
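To make the L-moments approach more concrete, the following sketch computes the first two sample L-moments from probability-weighted moments and uses them to estimate a GPa distribution with a known lower bound. It follows the standard textbook relations in Hosking's parameterization (shape k, scale alpha, where k < 0 corresponds to a heavy tail). It is an illustration under those assumptions and not the implementation used in the study. The function names are hypothetical.

```python
def sample_l_moments(sample):
    """First two sample L-moments (l1, l2) from unbiased PWMs."""
    xs = sorted(sample)
    n = len(xs)
    b0 = sum(xs) / n
    # enumerate starts at 0, so i equals (rank - 1) of each order statistic.
    b1 = sum(i * x for i, x in enumerate(xs)) / (n * (n - 1))
    return b0, 2.0 * b1 - b0


def gpa_from_l_moments(l1, l2, loc=0.0):
    """GPa shape k and scale alpha (Hosking's convention), known `loc`.

    Uses l1 = loc + alpha / (1 + k) and l2 = alpha / ((1 + k) * (2 + k)).
    """
    k = (l1 - loc) / l2 - 2.0
    alpha = (1.0 + k) * (l1 - loc)
    return k, alpha


l1, l2 = sample_l_moments([1.0, 2.0, 3.0, 4.0, 5.0])
k, alpha = gpa_from_l_moments(l1, l2)
```

For this evenly spaced toy sample the estimate is k = 1 and alpha = 6, which in this convention is the uniform distribution on [0, 6], consistent with the sample mean of 3.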
Observing the obtained results in
Table 5, there exist some distributions whose KS-distances hardly vary among seasons, such as GPa or GEV. This is due to the fact that these distributions explain the low visibility event durations equally well among seasons, even though their durations may change between seasons. On the contrary, those distributions whose KS-distances vary among seasons, such as GAM, cannot adapt to the new conditions by simply changing their parameters.
It is possible to notice some differences between the Maximum Likelihood and L-moments methods in the results previously shown. Note that we obtain slightly better KS-distances for the best-fitting distributions estimated with the Maximum Likelihood method than with the L-moments method. This is the case of GEV, which obtains the best KS results through Maximum Likelihood, below
, see
Table 5. However, LN, which best fits the data distribution, obtains KS-distances around
with a high variance among seasons and years, as can be seen in
Table 6. In some cases, the fit of the GAM distribution may seem marginally better than that of the LN. However, it should be noted that, true to its light-tailed nature, the GAM distribution crosses over all of the heavy-tailed distributions (and specifically that of the LN) at large values of fog duration. Therefore, even though GAM may seem a good fit, it fails at large values of fog duration. This indicates that the Maximum Likelihood method fits the main body of the distribution (thus failing at large values of fog duration), while the L-moments method fits the tails of the present data (failing at low values of fog duration).
As a final note on this point, an accurate statistical characterization of fog events with extreme-valued distributions can be used to simulate their occurrence at Mondoñedo within traffic simulators. In this way, the real effects of deep fog events on traffic, which cause jams and important circulation problems, can be studied.
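As an illustration of how a fitted distribution could feed such a simulator, the sketch below draws synthetic event durations from a GPa distribution by inverse-transform sampling. The shape, scale and location values are hypothetical placeholders, not fitted values from this study, and the shape follows the common convention in which a positive value gives a heavy tail.

```python
import math
import random


def gpa_quantile(u, shape, scale, loc=0.0):
    """Quantile function of the generalized Pareto distribution."""
    if abs(shape) < 1e-12:
        # Zero shape reduces to the exponential distribution.
        return loc - scale * math.log(1.0 - u)
    return loc + scale / shape * ((1.0 - u) ** (-shape) - 1.0)


def simulate_durations(n, shape, scale, loc=0.0, seed=0):
    """Draw n synthetic durations by inverse-transform sampling."""
    rng = random.Random(seed)
    # rng.random() lies in [0, 1), so 1 - u never reaches zero.
    return [gpa_quantile(rng.random(), shape, scale, loc) for _ in range(n)]


# Hypothetical parameters (minutes); replace with fitted values.
synthetic = simulate_durations(1000, shape=0.3, scale=120.0)
```

Each simulated value can then be used as the length of one low-visibility episode in the traffic scenario.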
3.1.1. Discussion: Physical Mechanism
The data used in this study show that, although the number of low visibility events is quite similar in all seasons (see
Table 4) they last longer in the warm season. The explanation for this pattern can be found in the high pressure system most influential for the Atlantic and Europe: the Azores Anticyclone. In summer, this pressure system strengthens and reaches its most northerly position [
56], bringing northerly winds to the Iberian Peninsula. As has been discussed by some authors [
57,
58], the main ingredients needed for the formation of the low visibility events that affect the “Alto de O Fiouco” area are northerly winds that push warm and humid air masses coming from the sea. Since in this region there is a large mountain barrier of around 600–700 m quite close to both the Atlantic Ocean and the Cantabrian Sea, these parcels of air are lifted adiabatically, becoming saturated at relatively low levels. In addition, the presence of the typical subsidence inversion caused by the Azores Anticyclone forces the formation of low-level cloud layers (mainly stratus and stratocumulus), which can affect the A-8 motor-road since its elevation at this specific location is quite similar to the level at which these clouds are formed (the so-called lifting condensation level). Furthermore, it should be noted that, because this meteorological phenomenon is caused by a maritime air mass, the number of hygroscopic particles (mainly sea salt) can be considerably higher than normal, which can play a major role in the formation of dense fog events [
58].
3.1.2. Statistical Study with Different Thresholds for Defining Low-Visibility Events
The results on the statistical characterization of low-visibility fog episodes shown above consider as low-visibility events those under the limit of the visibilimeter (<2000 m, light low-visibility). However, note that the threshold used to consider a fog event as low-visibility can be set by the practitioner at alternative values. For example, we can choose different thresholds related to traffic protocols, such as 600 m (moderate low-visibility), 300 m (severe low-visibility), and 50 m (extremely severe low-visibility), all of them with an important effect on safe driving conditions. In fact, visibility below 50 m (extremely severe) very probably leads to motor-road closure.
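The effect of the threshold on the definition of an event can be sketched as follows. The function below scans a visibility series and groups consecutive samples under the threshold into one event, returning its duration and minimum visibility. The sampling step of 30 min and the function name `extract_events` are assumptions made for illustration and may differ from the processing used in the study.

```python
def extract_events(visibility, threshold_m, step_min=30):
    """Group consecutive samples below `threshold_m` into events.

    Returns a list of (duration_in_minutes, minimum_visibility) tuples.
    """
    events = []
    run = []
    for v in visibility:
        if v < threshold_m:
            run.append(v)
        elif run:
            events.append((len(run) * step_min, min(run)))
            run = []
    if run:
        # Close an event that is still open at the end of the series.
        events.append((len(run) * step_min, min(run)))
    return events


# Toy series in metres; 2000 m is the visibilimeter limit.
series = [2000, 500, 400, 2000, 100, 2000]
moderate = extract_events(series, 600)
severe = extract_events(series, 300)
```

Lowering the threshold can only shorten or remove events, which is why the number and duration of events must be recomputed for each threshold before fitting the distributions.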
Table 7 and
Table 8 show the statistics for the low-visibility events at the A-8 motor-road in the years of the study (2018 and 2019), considering as low-visibility those events under 600 m and 300 m, respectively. As can be seen, the numbers of low-visibility events with the new thresholds are similar to each other and also to the case of the threshold set at 2000 m (light low-visibility events). The low-visibility event durations are also quite similar for these three thresholds, and their proportions among the seasons remain constant. The low-visibility events in the warmer seasons continue to be the longest lasting, see
Table 4,
Table 7,
Table 8 and
Table 9. This is due to the fact that fog events are usually very intense at Mondoñedo in these seasons.
Table 9 shows the case of the threshold at 50 m. In this extremely severe case, the number of events is much smaller than for the other thresholds. This indicates that extremely severe low-visibility events are less frequent than moderate and severe events, mainly in winter and autumn, but with a significant incidence in spring and summer.
We repeat here the analysis of the probability distributions for fog events duration, considering low-visibility events defined by setting the thresholds to 600, 300 and 50 m. We consider ten distributions including light and heavy tail distributions, using both the Maximum Likelihood and L-moments methods. In the case of Maximum Likelihood estimation,
Table 10,
Table 11 and
Table 12 show the obtained KS-distance for each distribution, divided by seasons and years. We clearly distinguish two different statistical behaviors in the results obtained. Low-visibility events defined by thresholds of 600 and 300 m behave very similarly to those defined by a threshold of 2000 m, as can be seen in
Table 10 and
Table 11, respectively.
Figure 7 shows the distributions estimated by the Maximum Likelihood method for all seasons in 2018 and 2019 at the 300 m threshold, which supports the discussion. GEV, GPa and STA are still the distributions which best fit the data, with KS-distances below
for both the 600 and 300 m thresholds. Their good approximation to the data distribution is clearly presented in
Figure 7 for the 300 m threshold. The non-negligible probability of extreme events is responsible for the good results reported by these heavy-tailed distributions, similar to those obtained with a 2000 m threshold, see
Table 5 and
Figure 5. GEV obtains the best KS-distances in most of seasons of the two years analyzed, closely followed by STA. Both distributions report KS-distances around
in most seasons, even for autumn 2019 at the 300 m threshold, a season with no extreme fog events, see
Table 11 and
Figure 7. However, STA fails to fit the data distribution in both summers, where most of the extremes take place. LLG and GPa obtain larger KS-distances than the previous ones, between
and
, but still with good results. The results of the rest of the evaluated distributions are far from these ones discussed previously, especially that provided by the light-tail distributions EXP, LOG, NRM and EV, which are characterized by a quick decrease in probability. In
Figure 7, we see that EXP and LOG distributions are straight lines with different slopes, mainly far from the origin, and EV and NRM have concave shape, which does not fit the data distribution trend. The KS-distances obtained by these distributions are above
for both thresholds. In the case of low-visibility events defined by the threshold of 50 m, the behavior changes slightly with respect to the case of the threshold situated at 2000 m, see
Table 12. Again, the best KS-distances are obtained by the heavy-tail distributions GEV, and STA, but their KS distances are now around
with more variations among the seasons. The light-tailed distributions still obtain the worst KS-distances, but these decrease with respect to the previous thresholds. We cannot estimate distributions for autumn 2019 with a threshold of 50 m since the number of low-visibility events in this season is only 3.
Table 13,
Table 14 and
Table 15 present the KS-distances obtained in cases when the L-moments method is used for estimating the distributions, with thresholds at 600, 300 and 50 m, respectively.
Figure 8 shows the distributions estimated by the L-moments method for all seasons in both 2018 and 2019 at the 300 m threshold. The distributions that best fit the fog events for the 600 and 300 m thresholds are still the LN and GPa, but with higher values with respect to the 2000 m threshold, around
, and
, respectively, see
Table 13 and
Table 14. GAM obtains good KS-distances at the 300 m threshold but not at 600 m, and its fit varies across the seasons. Furthermore, GAM struggles to fit the data in spring and autumn 2018 due to the L-moments implementation code used. The reason is that the duration of low-visibility events in summer is longer than in other seasons, and GAM does not fit such a wide range of durations with good accuracy, since its curves are straight lines far from the origin in the y-log-scaled
Figure 8. GEV also fits the data distribution well, even better than with the 2000 m threshold, around
. The light-tailed distributions do not fit the data, although they obtain better KS-distances than in the case of the 2000 m threshold. EXP, LOG, NRM and EV do not fit the data distribution well, as their tails decrease quickly, see
Figure 8. Focusing on the results obtained by fixing a threshold at 50 m,
Table 15 shows similar results to the previous thresholds for the L-moments estimation. Again, the distributions with the best KS-distances are GAM, LN and GPa, with values above
, higher than for the previous thresholds. However, the KS-distances obtained by EXP, EV and NRM are lower than those obtained with the 600 and 300 m thresholds. Note that distribution estimations for autumn 2019 do not appear in
Table 12, since only three extreme low-visibility events occurred, not enough for the parameter estimation.
3.2. Prediction of Fog Events at A-8 with ELMs
The results obtained with an ELM in the short-term prediction of low-visibility events due to fog at the A-8 motor-road are presented in this section. To ensure the independence of the data partition into training and test sets, as well as to assess the performance of the regressors, a
K-fold cross-validation procedure was carried out [
10,
59]. The folding was set to
, and each set consists of an
to train and
to test. Using the full dataset spanning from 1 January 2018 to 30 November 2019, data are randomly selected, breaking the sequence in the data, in order to bring heterogeneity to the values of the samples.
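The random K-fold partition described above can be sketched as follows. The number of folds and the train/test proportions are not reproduced in this text, so `k` is left as a free parameter and the value used in the example is purely illustrative. The function name `k_fold_splits` is hypothetical.

```python
import random


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for a shuffled K-fold split."""
    indices = list(range(n_samples))
    # Shuffling breaks the temporal sequence of the samples.
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


# Illustrative call: 10 samples split into 5 folds.
splits = list(k_fold_splits(10, 5))
```

Every sample appears in exactly one test fold, so averaging the test error over the folds uses each observation once for evaluation.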
The ELM model considered in this paper has the following characteristics: neurons in the hidden layer are designed with a sigmoid activation function. The optimal number of hidden neurons is chosen from a large pool (50–150, in increments of 1), whose values are tried one by one during the validation phase. In addition to the atmospheric features considered, we also use the 4 time instants prior to the target we want to predict (t) as predictors, i.e., we use the target values at t−1, t−2, t−3 and t−4. We should note that in all experiments the input–output data pair, , for both ELMs has a time resolution of half an hour (where n stands for the total number of half-hour intervals in the database); hence, the forecasting time-horizon was set to a 30 min ahead estimation of the visibility (instant t). Finally, the experiments consist of launching 10 executions of each algorithm for each proposed scenario and averaging their results.
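A minimal sketch of an ELM regressor with a sigmoid hidden layer is given below for illustration. It uses only the standard library: the hidden weights are drawn at random and kept fixed, and only the output weights are solved for. As a stated assumption, the output weights are obtained from ridge-regularized normal equations, which stand in for the pseudo-inverse solution usually associated with ELMs. The class name `ELM`, the weight range and the `ridge` value are hypothetical choices, and inputs are assumed to be scaled to a moderate range.

```python
import math
import random


def _sigmoid(z):
    """Logistic function written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def _solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


class ELM:
    """Single-hidden-layer network with fixed random hidden weights."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                  for _ in range(n_hidden)]
        self.b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.beta = [0.0] * n_hidden

    def _hidden(self, x):
        return [_sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
                for w, b in zip(self.w, self.b)]

    def fit(self, xs, ys, ridge=1e-6):
        h = [self._hidden(x) for x in xs]
        k = len(self.b)
        # Regularized normal equations: (H^T H + ridge * I) beta = H^T y.
        a = [[sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
              for j in range(k)] for i in range(k)]
        rhs = [sum(row[i] * y for row, y in zip(h, ys)) for i in range(k)]
        self.beta = _solve(a, rhs)
        return self

    def predict(self, xs):
        return [sum(bj * hj for bj, hj in zip(self.beta, self._hidden(x)))
                for x in xs]
```

Because training reduces to one linear solve, many models with different hidden-layer sizes or input subsets can be trained quickly, which is the property the validation procedure relies on.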
In order to better analyze the ELM performance, a wrapper feature selection process was carried out. This procedure consists of launching as many ELMs as combinations of characteristics we have in a reduced validation set, to find the set of predictors that provides the least error at the output (best set of features). Note that we have 10 features (only the atmospheric features are considered in this process) for this problem (see
Table 1), which means that we have to launch a total of 1024 (2^10) ELM models (prediction problems) to obtain the best set of characteristics (inputs). Note that we need an extremely fast-training algorithm such as ELM to carry out this feature selection analysis, since otherwise the computation time required would be extremely high. The results obtained in the feature selection process provided two sets of features as best results: the first one included a total of nine characteristics: Accumulated precipitation, Salinity, Relative humidity, Air temperature, Floor temperature, Dew temperature, Global solar radiation, Wind speed and Atmospheric pressure, with a Root Mean Square Error (RMSE) of
m in the validation set. A second best set with a total of three characteristics was also obtained: Accumulated precipitation, Relative humidity and Global solar radiation; with a RMSE of
m in the validation set. The remaining feature combinations produced worse results, so we have kept these two best sets for carrying out the experiments. In order to compare the results, we used the Persistence Prediction Operator (PPO
), a well known operator described by the following equation:
The generalized Persistence prediction operator (PPO
) uses the
M last time steps to infer the prediction and can be also defined as:
In our experiments, we fix referenced as (PPO).
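Since the equations are not reproduced in this text, the two baselines can be sketched under the usual definitions: simple persistence predicts the value at instant t as the last observed value, and the generalized operator predicts it as the mean of the last M observed values. The function names `ppo_last` and `ppo_mean` are hypothetical, and the notation may differ from that of the original equations.

```python
def ppo_last(series):
    """Persistence: the prediction for instant t is the value at t - 1."""
    # Predictions are aligned with series[1:].
    return list(series[:-1])


def ppo_mean(series, m):
    """Generalized persistence: mean of the last m observed values."""
    # Predictions are aligned with series[m:].
    return [sum(series[t - m:t]) / m for t in range(m, len(series))]


# Toy visibility values in metres, for illustration only.
visibility = [1, 2, 3, 4, 5]
one_step = ppo_last(visibility)
two_step = ppo_mean(visibility, 2)
```

With m = 1 the generalized operator reduces to simple persistence, while larger m averages over more past values and therefore produces a smoother predicted series.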
Prediction Results
Table 16 shows the average and standard deviation results (10 runs of the algorithms) obtained by the ELM when we use 9 or 3 characteristics as predictors (ELM-9, ELM-3), and those obtained by the PPO, which uses the last (PPO
) and the four last time steps (PPO
) for comparison. These results to evaluate the ELM performance are given in terms of the Pearson’s correlation coefficient,
, the RMSE and the computation times, both training (Train-
t) and test (Test-
t). It can be observed that the ELM approach is able to obtain the best results, with an RMSE of 393.56 m and an 80% of
when 3 features are used as predictors. If we compare these results with the case of ELM-9, we can observe a slight difference in terms of RMSE, with a value of 394.82 m, but a similar value for
. Therefore, the ELM model works slightly better with fewer features, achieving good results using less computation time, in particular taking 16.06 and 0.03 s in Train-
t and Test-
t, respectively in ELM-3, against the 18.99 and 0.05 s for Train-
t and Test-
t in the ELM-9 case. Based on these results we can see that the selection of features has an effect on the prediction process by using the ELM approach. The results obtained by the PPO are considerably worse. If we analyze the PPO for both variants, PPO
and PPO
, we can see the poor performance in terms of RMSE and the worse one in
. In this case, the best results are obtained for PPO
with an RMSE of
m and
in
, below the
of the ELM. The difference with respect to the case of PPO
is larger than in the ELM at least in terms of RMSE which obtains
m. In terms of
, PPO
obtains
, similar to the PPO
. We deduce that, in case of PPO, it is better to use the last time step than four time steps for obtaining the prediction. This is because the visibility time series is quite volatile, and using PPO
(7) strongly smooths the time series. Note that the computation time required to train the ELM is acceptable. PPO reaches real time as the predicted series is simply the mean given by Equation (7).
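For reference, the two accuracy measures used in this comparison can be computed as in the sketch below, which assumes the standard definitions of the RMSE and of Pearson's correlation coefficient. The function names are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def pearson(observed, predicted):
    """Pearson's correlation coefficient between two sequences."""
    n = len(observed)
    mo = sum(observed) / n
    mp = sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    so = math.sqrt(sum((o - mo) ** 2 for o in observed))
    sp = math.sqrt(sum((p - mp) ** 2 for p in predicted))
    return cov / (so * sp)
```

The RMSE is expressed in the units of the target (metres of visibility), whereas the correlation coefficient is dimensionless, so the two measures capture complementary aspects of the prediction quality.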
Figure 9 shows a temporal representation of the predicted visibility variable (in red) versus the measured values of this variable (in blue), by the ELM with nine features and three features. It is possible to see that in both cases the performance of the ELM is excellent in this prediction problem, showing good behaviour even in the deepest fog events. Moreover, these good results are obtained regardless of the set of samples we test. As can be seen, in the prediction graphs with nine and three features as predictors, different test samples are used to quantify the performance of the model, which is very interesting to corroborate the good performance of this type of learning machine.