Tide Prediction in the Venice Lagoon Using Nonlinear Autoregressive Exogenous (NARX) Neural Network

: In the Venice Lagoon some of the highest tides in the Mediterranean occur, which have inﬂuenced the evolution of the city of Venice and the surrounding lagoon for centuries. The forecast of “high waters” in the lagoon has always been a matter of considerable practical interest. In this study, tide prediction models were developed for the entire lagoon based on Nonlinear Autoregressive Exogenous (NARX) neural networks. The NARX-based model development was performed in two different stages. The ﬁrst stage was the training and testing of the NARX network, performed on data collected in a given time interval at the tide gauge of Punta della Salute, at the end of Canal Grande. The second stage consisted of a comprehensive validation of the model in the entire Venice Lagoon, with a detailed analysis of data from three measuring stations located in points of the lagoon with different characteristics. Good predictions were achieved regardless of whether the meteorological parameters were considered among input parameters, even with considerable time advance. Furthermore, the forecasting model based on NARX has proved capable of predicting even exceptional high tides. The proposed model could be a useful support tool for the management of the MOSE system, which will protect Venice from high waters.


Introduction
The city of Venice is a UNESCO world heritage site for the uniqueness of its historical, archaeological, urban, architectural, and artistic legacy. The historic center of Venice is located in the middle of the Venice Lagoon, a closed bay at the northwestern end of the Adriatic Sea. In this area, some of the highest tides in the Mediterranean occur [1]. Over the centuries, high tides have caused significant damage to the city, threatening its cultural heritage [2]. The most relevant high tide events, called "acqua alta" (high waters) in Italian, have significantly influenced the socioeconomic and environmental aspects of the Venice Lagoon throughout history.
High-water events occur in the Lagoon when the effects of the astronomical tide due to the attraction of celestial bodies are enhanced by meteorological disturbances. The most relevant weather factors affecting the tide-level fluctuations are barometric pressure and the wind. In particular, the Scirocco wind, from the southeast, and the Bora wind, from north-northeast, can lead to significant increases compared to the normal astronomical oscillations of the tide level.
One of the first reliable measures of high tide in the Venice Lagoon dates back to 1848 when the water surface reached 140 cm above the mean sea level. This is an exceptional value considering that, currently, the average ground level of Venice is only 80 cm above the mean sea level. Indeed, during the 20th century, Venice lost 25 cm in ground level, approximately 15 cm because of subsidence mainly due to groundwater pumping in the nearby industrial area and about 10 cm due to eustatism [3]. The highest extreme observed event dates back to 4 November 1966, when the water level reached 194 cm [4]. More

Study Area and Dataset
The Venice Lagoon extends for 550 km 2 from the Sile River in the north to the Brenta River in the south, making the Venice Lagoon the largest wetland in the Mediterranean Basin [35,36]. For 80% of its surface, it consists of mudflats, tidal shallows, and salt marshes. About 11% is permanently covered by open water, while only 8% is represented by land, including the city of Venice and many smaller islands. The lagoon is connected to the Adriatic Sea by means of three inlets: Lido, Malamocco, and Chioggia.
The tide-level dataset consisted of measures on a network of 19 tide-gauge stations covering the Venice Lagoon. Wind direction, wind speed, and barometric pressure were taken from a weather station, referred to as Piattaforma CNR and located in the Adriatic Sea 13 km from the Malamocco inlet. For each station, data measured every 30 min were available and were used in the analysis.
Both the time series of the tide-gauge station Punta della Salute and the time series of the weather station were available for the period from January 2009 to December 2014. In addition, the gravitational effects were included in the prediction model by means of the astronomical tide height h astr , which was computed through a harmonic analysis: A n cos(σ n t − k n ) (1) where A 0 is the average sea level, A n is the amplitude, σ n the angular frequency, k n the phase delay of component n, and N is the number of harmonics used to evaluate the astronomical tide height. These values can be found on the Venice Municipality website [37]. Figure 1 shows the location of the tide gauge stations in the Venice Lagoon, and the weather station.
Water 2021, 13, x FOR PEER REVIEW 3 of 18

Study Area and Dataset
The Venice Lagoon extends for 550 km 2 from the Sile River in the north to the Brenta River in the south, making the Venice Lagoon the largest wetland in the Mediterranean Basin [35,36]. For 80% of its surface, it consists of mudflats, tidal shallows, and salt marshes. About 11% is permanently covered by open water, while only 8% is represented by land, including the city of Venice and many smaller islands. The lagoon is connected to the Adriatic Sea by means of three inlets: Lido, Malamocco, and Chioggia.
The tide-level dataset consisted of measures on a network of 19 tide-gauge stations covering the Venice Lagoon. Wind direction, wind speed, and barometric pressure were taken from a weather station, referred to as Piattaforma CNR and located in the Adriatic Sea 13 km from the Malamocco inlet. For each station, data measured every 30 min were available and were used in the analysis.
Both the time series of the tide-gauge station Punta della Salute and the time series of the weather station were available for the period from January 2009 to December 2014. In addition, the gravitational effects were included in the prediction model by means of the astronomical tide height hastr, which was computed through a harmonic analysis: where A0 is the average sea level, An is the amplitude, σn the angular frequency, kn the phase delay of component n, and N is the number of harmonics used to evaluate the astronomical tide height. These values can be found on the Venice Municipality website [37]. Figure 1 shows the location of the tide gauge stations in the Venice Lagoon, and the weather station.

NARX Model Architectures
NARX neural networks are a recurrent dynamic type of ANNs networks, which are composed of interconnected nodes inspired by a simplification of the biological neural system. Therefore, each node represents an artificial neuron that receives one or more inputs and sums them to produce an output. These sums pass through a function, known as an activation function, which, for the NARX network, is nonlinear. Based on the flow and processing information direction, different categories of ANNs can be distinguished. While in the feedforward neural networks (FNNs), the information flows in one direction with nodes arranged in layers; in the recurrent neural networks, such as NARX, information flows both in forward and backward directions, allowing a connection between neurons located in the same or previous layers [38]. The faster convergence in reaching the optimal connection weights between inputs and neurons and the reduced number of the latter to calibrate and make the model effective [39] makes NARX more high-performing compared to other ANNs and better at discovering long-time dependences in comparison with other recurrent neural networks [40]. Moreover, the exogenous inputs of the NARX network allow relating the current value of a time series to both past values of the same series and current and past values of the exogenous series, which represent the external series that affect the time series of interest. The basic equation for the NARX model is: where u(t) and y(t) are the input and output values at time t, n u and n y are the input and output network layers, and f is the nonlinear function, approximated by the FNN. The NARX architectures include 3 different and sequential layers ( Figure 2). The first is the input layer, which consists of the input parameters of the neural network. The second is the hidden layer, which represents the computational step between input and output. The third is the output layer, which leads to the predicted value y(t). Four different NARX-based models were implemented in MATLAB ® 2020a [41] environment. In all four models, both the lagged values of the tide level h tide (t − t a ) and astronomical tide h astr (t − t a ) were considered as input values while the output was represented by the predicted h tide (t), with t a that is the lag time between input and target values. For each model, different combinations of additional input values were considered.
In the first model, indicated as "Model I", the lagged wind speed v wind (t − t a ), the lagged wind direction α wind (t − t a ), and the lagged barometric pressure P atm (t − t a ), were also considered as input values. In the second model, "Model II", v wind (t − t a ) and α wind (t − t a ) were considered as additional input values while for "Model III" only P atm (t − t a ) was included as an additional input value. The fourth model, "Model IV", did not have additional input values, taking into account only the lagged values of the tide level h tide (t − t a ) and astronomical tide h astr (t − t a ). Therefore, input data were ahead of the tide value to be predicted for a time equal to the lag time t a . It should be noted that, despite the fact that meteorological parameters were partly neglected, as for "Model II" and "Model III", or completely neglected, as for "Model IV", their influence is implicitly expressed by including previously observed tidal height values, which in turn also depend on meteorological factors. Moreover, as demonstrated by Di Nunno et al. (2021), considering only the lagged tide level h tide (t − t a ) as an input variable led to relevant underestimation of the high tides. This made the astronomical tide an essential parameter for tide prediction [42].
Different lag time t a values were considered, in order to assess the performance of the models as t a increases. A preliminary analysis was conducted to select the optimal number of hidden nodes. The best performances were achieved for 3 hidden nodes (indicated in Figure 2 as h 1 , h 2 , and h 3 ). For the hidden layer, a sigmoid activation function f 1 was used, while a linear activation function f 2 , with only one neuron n, was used for the output layer. For the output layer (Figure 2), the weight w and bias b of the NARX model were optimized through the training algorithm described below.
The output value, represented by the predicted h tide , was then fed back to the input values as part of the NARX architectures. In particular, the time delay t d and the related feedback delay, which is equal to the number of output values that were fed back as input, were both set to 1. This allowed minimizing the weight of fed-back values in the tide prediction.
The Bayesian Regularization was used as a training algorithm. It consists of a Gauss-Newton approximation to the Hessian matrix J T (w)J(w), where w is the weight vector, J is the Jacobian matrix, and J T the transpose, based on the Bayesian technique [43], and implemented in the Levenberg-Marquardt algorithm, in order to reduce the probability of overfitting and the computational overhead [44]. The Levenberg-Marquardt algorithm approximates the Hessian matrix according to the equation [45]: where I is the identity matrix, e is the error vector and λ is the learning constant, adjusted iteratively to find the minimum error. Despite a slow convergence with respect to the direct application of the Levenberg-Marquardt algorithm, the Bayesian Regularization algorithm usually leads to improved predictions [33]. Furthermore, a normalization of the input values was conducted in order to have a common range between 0 and 1 and improve the modeling performance. The tide level, the astronomical tide, the barometric pressure and the wind speed were normalized with respect to the respectively maximum values along the time series, while the wind direction was divided by 360: where i indicates the temporal step.
Water 2021, 13, x FOR PEER REVIEW 5 of 18 The output value, represented by the predicted htide, was then fed back to the input values as part of the NARX architectures. In particular, the time delay td and the related feedback delay, which is equal to the number of output values that were fed back as input, were both set to 1. This allowed minimizing the weight of fed-back values in the tide prediction.
The Bayesian Regularization was used as a training algorithm. It consists of a Gauss-Newton approximation to the Hessian matrix J T (w)J(w), where w is the weight vector, J is the Jacobian matrix, and J T the transpose, based on the Bayesian technique [43], and implemented in the Levenberg-Marquardt algorithm, in order to reduce the probability of overfitting and the computational overhead [44]. The Levenberg-Marquardt algorithm approximates the Hessian matrix according to the equation [45]: where I is the identity matrix, e is the error vector and λ is the learning constant, adjusted iteratively to find the minimum error. Despite a slow convergence with respect to the direct application of the Levenberg-Marquardt algorithm, the Bayesian Regularization algorithm usually leads to improved predictions [33]. Furthermore, a normalization of the input values was conducted in order to have a common range between 0 and 1 and improve the modeling performance. The tide level, the astronomical tide, the barometric pressure and the wind speed were normalized with respect to the respectively maximum values along the time series, while the wind direction was divided by 360: where i indicates the temporal step.

Evaluation Metrics
The performance of the NARX network was evaluated by means of four evaluation metrics: the coefficient of determination R 2 , which provides a measure of how well experimental data are replicated by the model, the Mean Absolute Error (MAE), which provides the average error magnitude for the predicted values, the Root Mean Squared Error (RMSE), which provides the square root of the average squared errors for the predicted values, and the Relative Absolute Error (RAE), which is the ratio between the absolute

Evaluation Metrics
The performance of the NARX network was evaluated by means of four evaluation metrics: the coefficient of determination R 2 , which provides a measure of how well experimental data are replicated by the model, the Mean Absolute Error (MAE), which provides the average error magnitude for the predicted values, the Root Mean Squared Error (RMSE), which provides the square root of the average squared errors for the predicted values, and the Relative Absolute Error (RAE), which is the ratio between the absolute error and the absolute value of the difference between average and each measured values. These metrics are defined as: where m is the total number of measured data, f i is the predicted value for the i-th data, y i is the experimental value for the i-th data, and y a is the averaged value of the measured data.

Results and Discussion
The NARX modeling was performed in two different stages. The first was the training and testing of the NARX network-based model, performed for the period between January 2009 and December 2011 using the tide level dataset of Punta della Salute and the weather data of Piattaforma CNR. It should be noted that, for an accurate tide-level forecast, a time-series length of at least 12 months is required [42]. The second stage was an analysis extended to the Venice Lagoon, which consisted of validation of the NARX network for the tide prediction in the different tide gauge stations that cover the entire lagoon, including Punta della Salute. Validation was performed for periods not exceeding 1 year and not considered for the previous training and testing step, between January 2012 and December 2014. Table 1 reports, for each tide gauge station, the year considered for the model validation.

Training and Testing
With reference to the training and testing phases conducted on the data collected at the Punta della Salute tide gauge, the NARX-based model shows very good performance for all lag time and models ( Table 2). The best performance was achieved for the lowest t a , equal to 1 h, and Model I (R 2 = 0.9980 − RAE = 0.0426). However, also for Model IV, which did not include the meteorological parameters among the input variables, the prediction was still very accurate, showing negligible differences compared to the other models (R 2 = 0.9980 − RAE = 0.0433). For t a = 2 h (Model I-R 2 = 0.9980 − RAE = 0.0437, Model IV-R 2 = 0.9979 − RAE = 0.0443), performances were slightly lower than those achieved for t a = 1 h. Table 2. Prediction performance in the training and testing of the NARX network for the four models. A more marked reduction in performance (Table 2) is observed passing from a lag time of 3 h (Model I-R 2 = 0.9949 − RAE = 0.0697, Model IV-R 2 = 0.9947 − RAE = 0.0717) to 6 h (Model I-R 2 = 0.9737 − RAE = 0.1639, Model IV-R 2 = 0.9729 − RAE = 0.1662). This result may essentially be attributed to the characteristics of the astronomical tide, which is basically semidiurnal, with two maximum heights and two minimum heights within 24 h. This involves the autocorrelation function [46] of the tide level time series showing positive peaks every 12 h starting from a lag time t a = 12 h and negative peaks every 12 h starting from t a = 6 h ( Figure 3). For lag times corresponding to negative peaks of autocorrelation, the predictive ability of the model tends to decrease. However, the evaluation metrics show that results are still accurate (Model IV-R 2 = 0.9729 − RAE = 0.1662).

OR PEER REVIEW 7 of 18
h. This involves the autocorrelation function [46] of the tide level time series showing positive peaks every 12 h starting from a lag time ta = 12 h and negative peaks every 12 h starting from ta = 6 h ( Figure 3). For lag times corresponding to negative peaks of autocorrelation, the predictive ability of the model tends to decrease. However, the evaluation metrics show that results are still accurate (Model IV-R 2 = 0.9729 − RAE = 0.1662). After ta = 6 h, the predictions returned to slightly improve with the best performances achieved for a lag time corresponding to the positive autocorrelation peaks, showing evaluation metrics that settled on good values even for ta equal to 72 h. Figure 4 shows the tide prediction with Model I and four different lag times for the high-tide event that took place between 23 and 26 December 2021. The three warning levels are also reported, with high tide between 80 cm and 110 cm, very high tide between 110 cm and 140 cm, and exceptional tide above 140 cm. It should be noted that at 110 cm, 12% of the Venice historical center and Giudecca is underwater and at 140 cm 59% is underwater [47]. During the shown period, two exceptional tide events were measured, respectively on 23 and 25 December, interspersed with a very high tide, measured on 24 December. For the first exceptional tide, the NARX modeling led to a slight overestimation of the tide level of 0.6 cm for ta = 1 h ( Figure 4a) and to an underestimation of 3.28 cm, 6.03 cm and 2.40 cm computed respectively for ta equal to 6 h (Figure 4b Overall, regardless of the lag time, the NARX network is able to predict both the positive and negative peaks, including the exceptional tide, with the best performance observed for the lowest ta equal to 1 h. Contrary to what was expected, tide predictions obtained for ta = 72 h outperformed the ones achieved for ta = 24 h, confirming the ability of NARX models in long-term predictions.
In addition, the training and testing stage was not significantly affected by the additional input parameters: the four models showed very similar values of the evaluation metrics. This obviously does not mean that the weather parameters have little influence on high waters: the fundamental influence of the weather parameters is taken into account by means of the lagged values of the tide height. After t a = 6 h, the predictions returned to slightly improve with the best performances achieved for a lag time corresponding to the positive autocorrelation peaks, showing evaluation metrics that settled on good values even for t a equal to 72 h. Figure 4 shows the tide prediction with Model I and four different lag times for the high-tide event that took place between 23 and 26 December 2021. The three warning levels are also reported, with high tide between 80 cm and 110 cm, very high tide between 110 cm and 140 cm, and exceptional tide above 140 cm. It should be noted that at 110 cm, 12% of the Venice historical center and Giudecca is underwater and at 140 cm 59% is underwater [47]. During the shown period, two exceptional tide events were measured, respectively on 23 and 25 December, interspersed with a very high tide, measured on 24 December. For the first exceptional tide, the NARX modeling led to a slight overestimation of the tide level of 0.6 cm for t a = 1 h (Figure 4a) and to an underestimation of 3.28 cm, 6.03 cm and 2.40 cm computed respectively for t a equal to 6 h (Figure 4b

Venice Lagoon Analysis
The model developed using Punta della Salute data was then validated for the entire Venetian Lagoon. Before carrying out the validation, the propagation of the tide in the lagoon was studied using an unconventional approach based on the cross-correlation function XCF between the tide level in Piattaforma CNR and the tide level in the different tide-gauge stations. The calculations were carried out with reference to the year 2009. After calculating the cross-correlation between the tide time series (Figure 5), the tide lag time td,c was evaluated as the time that maximizes the cross-correlation function: Overall, regardless of the lag time, the NARX network is able to predict both the positive and negative peaks, including the exceptional tide, with the best performance observed for the lowest t a equal to 1 h. Contrary to what was expected, tide predictions obtained for t a = 72 h outperformed the ones achieved for t a = 24 h, confirming the ability of NARX models in long-term predictions.
In addition, the training and testing stage was not significantly affected by the additional input parameters: the four models showed very similar values of the evaluation metrics. This obviously does not mean that the weather parameters have little influence on high waters: the fundamental influence of the weather parameters is taken into account by means of the lagged values of the tide height.

Venice Lagoon Analysis
The model developed using Punta della Salute data was then validated for the entire Venetian Lagoon. Before carrying out the validation, the propagation of the tide in the lagoon was studied using an unconventional approach based on the cross-correlation function XCF between the tide level in Piattaforma CNR and the tide level in the different tide-gauge stations. The calculations were carried out with reference to the year 2009. After calculating the cross-correlation between the tide time series (Figure 5), the tide lag time t d,c was evaluated as the time that maximizes the cross-correlation function: where s is the size of the time series and τ is the delay [48]. Lag times are represented in Figure 6 and reported, with the peaks of cross-correlation, in Table 3 where s is the size of the time series and τ is the delay [48]. Lag times are represented in Figure 6 and reported, with the peaks of cross-correlation, in Table 3. Peaks get high values, greater than 0.95 for all tide gauge stations, showing a high similarity between the tide levels inside and outside the lagoon. The lowest values are observed in correspondence with the most distant stations and with shallower bottoms. However, the propagation of the tide occurs gradually, with td,c that passes from values lower than or equal to 0.5 h near the lagoon inlets (values below 0.5 h cannot be detected due to data temporal resolution), e.g., Malamocco Diga Nord (Figure 5a), Lido Diga Nord (Figure 5b), and Lido Diga Sud (Figure 5c), up to values over 3 h in the peripheral areas of the lagoon, e.g., Grassabò (td,c = 3.5 h, Figure 5c). Instead, the central areas of the lagoon are characterized by intermediate lag times. As an example, for Punta della Salute ( Figure 5b) and Treporti (Figure 5c), td,c equal to 1.5 h and 2.0 h was respectively estimated. It should be noted that the tide gauges of Valle Averto (Figure 5a) and Marghera (Figure 5b), despite a relevant distance from the Malamocco and Lido inlets, highlight a td,c = 2 h, equal to those computed in the central area of the Lagoon. This could be explained by the less-sheltered position of these tide gauges in comparison with other peripheral tide gauges, such as Grassabò (Figure 5c) and Cavallino Centro (td,c = 4.0 h). These results are in agreement with Ferla et al. [49].     The R 2 and RAE scattered maps for tide prediction in all the 19 tide gauge stations are shown in Figure 7, for Model I and t a equal to 1 h and for Model IV and t a equal to 72 h. The NARX modeling was accurate for all locations, with R 2 values that never drop below 0.91 and RAE values always lower than 0.3 for t a = 72 h. Furthermore, for t a = 1 h, R 2 was always higher than 0.99 and RAE was lower than 0.085 for all tide-gauge stations.
An overview of the results is given in Table 4, with a statistical analysis that consists of the evaluation of the minimum, maximum, mean, and standard deviation values of the evaluation metrics computed considering all tide-gauge stations and lag times. The best performances were observed for Model II, which exhibit mean R 2 and RAE equal to 0.9298 and 0.2446, respectively. Model I shows slightly lower performances, with mean R 2 = 0.9286 and mean RAE = 0.2472. This demonstrates the low impact of the barometric pressure on the tide-level prediction. Model III is outperformed by Models I and II (mean R 2 = 0.9263 and mean RAE = 0.2494); this further highlights that wind speed and direction have a greater impact on tide prediction in comparison with barometric pressure. However, tide predictions were very accurate regardless of whether weather parameters were considered as input variables, with Model IV that is characterized by the least accurate outcomes, but still close to the results of the other three models (mean R 2 = 0.9257 and mean RAE = 0.2476)s, confirming the great forecasting capability of the NARX network. For all 19 tide-gauge stations, an R 2 greater than 0.7 was calculated, which is usually considered the minimum value for a proper prediction [50].  An overview of the results is given in Table 4, with a statistical analysis that consists of the evaluation of the minimum, maximum, mean, and standard deviation values of the evaluation metrics computed considering all tide-gauge stations and lag times. The best performances were observed for Model II, which exhibit mean R 2 and RAE equal to 0.9298 and 0.2446, respectively. Model I shows slightly lower performances, with mean R 2 = 0.9286 and mean RAE = 0.2472. This demonstrates the low impact of the barometric pressure on the tide-level prediction. Model III is outperformed by Models I and II (mean R 2 = 0.9263 and mean RAE = 0.2494); this further highlights that wind speed and direction have a greater impact on tide prediction in comparison with barometric pressure. However, tide predictions were very accurate regardless of whether weather parameters were considered as input variables, with Model IV that is characterized by the least accurate outcomes, but still close to the results of the other three models (mean R 2 = 0.9257 and mean RAE = 0.2476)s, confirming the great forecasting capability of the NARX network. For all 19 tide-gauge stations, an R 2 greater than 0.7 was calculated, which is usually considered the minimum value for a proper prediction [50].

High Tide Analysis
The most interesting issue in tide modeling is represented by the prediction of high tides. In the training and testing stage, the NARX network proved to be able to predict even an exceptional tide for Punta della Salute. In this section, the results relating to the high-tide predictions for three tide-gauge stations located in different points of the Venice Lagoon are described and discussed. Compared to the training and testing stage, a shorter simulation period, equal to one year, was considered.
The first tide gauge was Chioggia Diga Sud located in proximity of the Chioggia Inlet, in the south of the Venice Lagoon. In the period between 10 and 12 February 2014 in this station, a very high tide of 130 cm was measured, followed by a high tide of 95 cm (Figure 8a). The tide level reached at Chioggia Inlet was, however, higher than those recorded in the same period at the Lido inlet and in the Venice historical center, with peaks of 98 cm and 108 cm measured, respectively, in the stations of Treporti and Punta della Salute. NARX modeling, with Model IV and t a = 72 h, provided accurate forecasting of both high tide measurements for Chioggia Diga Sud, with an underestimation of the very high tide event of only 1.90 cm and to an underestimation of the high tide event of 2.85 cm.
Water 2021, 13, x FOR PEER REVIEW 13 of 18 in predicting negative peaks or tides between 0 and 80 cm (Figure 8b,d,f). The best performance was obtained for Chioggia Diga Sud with R 2 = 0.9764 and RAE = 0.1536. Slightly lower but still very accurate predictions were achieved for Canal Ancora (R 2 = 0.92613 and RAE = 0.2709) and for San Giorgio in Alga (R 2 = 0.9141 and RAE = 0.2906). Further confirmation of the results accuracy obtained through Model IV is shown in Figure 9, which reports a notched box plots representation of the Residuals, expressed as the difference between measured and predicted values, for htide > 80 cm, as the lag time increases. It should be noted that positive Residuals involve an underestimation of positive tide peaks that correspond to high tides, while negative Residuals involve an overestimation of negative tide peaks that correspond to low tides. For the three stations, whiskers ranged between −4 cm and 6 cm and medians between −0.5 cm and 2.5 cm. Exceptions to these results were observed for ta = 6 h where there is a relevant underestimation of high tides with whiskers ranged between 5 cm and 20 cm and the medians between 9 cm and 13 cm. These discrepancies may be explained with the autocorrelation analysis reported in Section 3.2. The best performances were achieved for ta equal to 1 h. However, for ta equal to 72 h, the outliers (red crosses in Figure 9) were a small percentage: 4.62% for Chioggia Diga Sud, 3.04% for San Giorgio in Alga, and 4.82% for Canal Ancora. The second tide gauge was San Giorgio in Alga, located in a homonymous small island in the middle of the lagoon, close to the Giudecca island and historical center. An exceptional tide height of 150 cm was measured at this station on 11 November 2012 and two high tides, with heights between 80 cm and 110 cm, were measured until the following day (Figure 8c). In the nearby station of Punta della Salute, a similar exceptional tide height of 148 cm was measured at the same time. For the exceptional tide measured in San Giorgio in Alga, NARX modeling led to an overestimation of 1.37 cm computed, while for the subsequent two high tides, higher overestimations were observed, respectively equal to 4.75 cm and 1.81 cm.
The third was Canal Ancora, located in the north of the lagoon. For this tide gauge, the most significant peak measured in 2013 was equal to 126 cm, which corresponds to a very high tide (Figure 8e). It should be noted that, in the same day, 11 February 2013, in Punta della Salute an exceptional tide of 142 cm was recorded (Figure 4). The difference of 16 cm between Punta della Salute and Canal Ancora was caused by the propagation of the flood event, which initially affected the central part of the lagoon and then reached the peripheral areas, located at a greater distance from the three inlets, and by and the configuration of the lagoon seabed. NARX modeling provided good predictions also for Canal Ancora, with limited overestimations of the tide level equal to 4.32 cm and 3.17 cm for the very high tide and high tide, respectively.
Overall, the position of the three tide gauges with respect to the weather station, more or less distant from the Lagoon inlets, did not affect the accuracy of the forecast. The tide prediction performed with the simplest model, Model IV, and the highest lag time, equal to 72 h, confirm the ability of the NARX network to predict high tides, even in case of exceptional events. In addition, the forecasts of exceptional tide showed similar accuracy to that observed for very high tides and high tides events, confirming the suitability of the NARX network for the modeling of extreme events. The same accuracy was also achieved in predicting negative peaks or tides between 0 and 80 cm (Figure 8b,d,f). The best performance was obtained for Chioggia Diga Sud with R 2 = 0.9764 and RAE = 0.1536. Slightly lower but still very accurate predictions were achieved for Canal Ancora (R 2 = 0.92613 and RAE = 0.2709) and for San Giorgio in Alga (R 2 = 0.9141 and RAE = 0.2906).
Further confirmation of the results accuracy obtained through Model IV is shown in Figure 9, which reports a notched box plots representation of the Residuals, expressed as the difference between measured and predicted values, for h tide > 80 cm, as the lag time increases. It should be noted that positive Residuals involve an underestimation of positive tide peaks that correspond to high tides, while negative Residuals involve an overestimation of negative tide peaks that correspond to low tides. For the three stations, whiskers ranged between −4 cm and 6 cm and medians between −0.5 cm and 2.5 cm. Exceptions to these results were observed for t a = 6 h where there is a relevant underestimation of high tides with whiskers ranged between 5 cm and 20 cm and the medians between 9 cm and 13 cm. These discrepancies may be explained with the autocorrelation analysis reported in Section 3.2. The best performances were achieved for t a equal to 1 h. However, for t a equal to 72 h, the outliers (red crosses in Figure 9) were a small percentage: 4.62% for Chioggia Diga Sud, 3.04% for San Giorgio in Alga, and 4.82% for Canal Ancora. Good results were also achieved without including the weather data related to the wind and barometric pressure, with Model IV capable of providing accurate predictions. This aspect probably represents one of the most interesting novel results of this study. Moreover, this made it possible to extend the analysis through Model IV to a longer period, from January 2009 to December 2018, allowing the validation of the model on a wider range of events, using the tide-level dataset measured in Punta della Salute, while the trained NARX-based model is the same developed and described in Section 3.1. Figure 10 shows some results of this modeling, obtained for a ta = 72 h, focusing on the highest tide measured in Punta della Salute in the period 2009-2018, equal to 154 cm. As can be seen from the residuals, properly divided in order to take into account the values of high tide, very high tide and exceptional tide, the model provided predictions with small underestimation and overestimations. For the exceptional tide, equal to 154 cm, the level is underestimated at about 6.08 cm. The second exceptional tide, which was observed after few hours, shows an even lower underestimation, equal to 2.95 cm. For htide < 80 cm and 80 cm < htide < 110 cm, some overestimations of the tide level are also observed, with Good results were also achieved without including the weather data related to the wind and barometric pressure, with Model IV capable of providing accurate predictions. This aspect probably represents one of the most interesting novel results of this study. Moreover, this made it possible to extend the analysis through Model IV to a longer period, from January 2009 to December 2018, allowing the validation of the model on a wider range of events, using the tide-level dataset measured in Punta della Salute, while the trained NARX-based model is the same developed and described in Section 3.1. Figure 10 shows some results of this modeling, obtained for a t a = 72 h, focusing on the highest tide measured in Punta della Salute in the period 2009-2018, equal to 154 cm. As can be seen from the residuals, properly divided in order to take into account the values of high tide, very high tide and exceptional tide, the model provided predictions with small underestimation and overestimations. For the exceptional tide, equal to 154 cm, the level is underestimated at about 6.08 cm. The second exceptional tide, which was observed after few hours, shows an even lower underestimation, equal to 2.95 cm. For h tide < 80 cm and 80 cm < h tide < 110 cm, some overestimations of the tide level are also observed, with values lower than 5.4 cm. A maximum underestimation of 4.15 cm was evaluated for a high tide event, on 31 October 2018. In any case, forecast errors can be significantly reduced with the progressive use of NARX-based models characterized by gradually decreasing t a . Moreover, this made it possible to extend the analysis through Model IV to a longer period, from January 2009 to December 2018, allowing the validation of the model on a wider range of events, using the tide-level dataset measured in Punta della Salute, while the trained NARX-based model is the same developed and described in Section 3.1. Figure 10 shows some results of this modeling, obtained for a ta = 72 h, focusing on the highest tide measured in Punta della Salute in the period 2009-2018, equal to 154 cm. As can be seen from the residuals, properly divided in order to take into account the values of high tide, very high tide and exceptional tide, the model provided predictions with small underestimation and overestimations. For the exceptional tide, equal to 154 cm, the level is underestimated at about 6.08 cm. The second exceptional tide, which was observed after few hours, shows an even lower underestimation, equal to 2.95 cm. For htide < 80 cm and 80 cm < htide < 110 cm, some overestimations of the tide level are also observed, with values lower than 5.4 cm. A maximum underestimation of 4.15 cm was evaluated for a high tide event, on 31 October 2018. In any case, forecast errors can be significantly reduced with the progressive use of NARX-based models characterized by gradually decreasing ta. The results obtained also lend themselves to an interesting interpretation from the physical point of view. The possibility of predicting high waters with excellent accuracy regardless of weather-forcing data, even a few days in advance, shows that the complex system represented by the Venice lagoon is characterized by considerable "inertia". In particular, when exceptional high waters occur, meteorological factors begin to influence the phenomenon a few days earlier. The effects persist for some days, during which there are some significant peaks of high tide. Furthermore, the study of the cross-correlation function between the oscillations in the different points of the lagoon allow obtaining relevant information on the propagation of the tide without having to conduct a hydrodynamic study and without having to know in detail the configuration of the lagoon itself. Ultimately, the proposed models make it possible to obtain accurate forecasts for a large area on the basis of limited information relating to two or even a single measurement station: this aspect also distinguishes them from commonly used forecasting models, which instead require measurements of the input in a large area.

Conclusions
This study assessed the ability of the Nonlinear Autoregressive Exogenous neural network to develop forecasting models of the tide level in the Venice Lagoon. Four models were built, including both the lagged values of the tide level and astronomical tide as input values. In each model, different combinations of additional input values were considered in order to take into account the meteorological parameters related to wind features and barometric pressure.
Very accurate tide predictions were achieved regardless of whether the meteorological parameters were considered as input parameters. The ability to forecast even exceptionally high tides makes the NARX neural networks a reliable tool for predicting tide-level fluctuation. In addition, the modeling was not particularly affected by lag-time increase, with a satisfactory performance achieved also for a t a equal to 72 h: this demonstrates the great capability of NARX networks to predict high waters with a forecasting horizon of several days. Therefore, NARX modeling could represent a reliable approach for the management of the MOSE, ensuring an adequate time interval for the activation of the barriers. This would allow a reduction of the disaster risk for the city of Venice (Goal 11 of the 2030 Agenda for Sustainable Development), contrasting the effects of climate change that will lead to a rise in the average sea level (Goal 13) and, at the same time, protecting the biodiversity and ecosystem of the Venice Lagoon (Goal 15).
However, even though meteorological parameters may not be included in the model input data, their significant influence is implicitly expressed by including previously observed tidal height values. In addition, in order to provide accurate forecasting of the tide level, it is recommended to consider the astronomical tide as exogenous input. Furthermore, a time-series length of at least 12 months, with different extreme events during the time interval, is necessary for a proper training of the NARX network.
The good results obtained for the Venice Lagoon recommend the use of the NARX network for tide prediction in coastal areas affected by problems related to high tides. A proper evaluation of the tide level is a key factor for the safeguard of the economic, social, and cultural heritage of these places. In addition, the proposed model could be a support tool for the decisions regarding the timely activation of the MOSE in the Venice Lagoon, and of similar systems in other places around the world.