The GEOframe-NewAge Modelling System Applied in a Data Scarce Environment

In this work, the semi-distributed hydrological modeling system GEOframe-NewAge was integrated with a web-based decision support system implemented for the Civil Protection Agency of the Basilicata region, Italy. The aim of this research was to forecast in near real-time the most important hydrological variables at 160 control points distributed over the entire region. The major challenge was to make the system operational in a data-scarce region characterized by a high hydraulic complexity, with several dams and infrastructures. In fact, only six streamflow gauges were available for the calibration of the model parameters. Reliable parameter sets were obtained by simulating the hydrological budget and then calibrating the rainfall-runoff parameters. After the extraction of the flow-rating curves, six sets of parameters were obtained considering the different streamflow components (i.e., the baseflow and surface runoff) and using a multi-site calibration approach. The results show a good agreement between the measured and modeled discharges, with a better agreement in the sections located upstream of the dams. Moreover, the results were validated using the inflows measured at the most important dams (Pertusillo, San Giuliano and Monte Cotugno). For rivers without monitoring points, parameters were assigned using a principle of hydrological similarity in terms of their geology, lithology, and climate.


Introduction
Improvements in information technology in the last couple of decades have permitted the early warning of disasters and flood control at different spatial scales [1,2]. Integrated real-time hydraulic and hydrological modeling can mitigate the consequences of flooding and landslides by facilitating the rapid diffusion of information throughout threatened areas [3]. Furthermore, the adoption of Decision Support Systems (DSSs) helps decision-makers, such as civil protection agencies, to choose and prioritize actions by exploring alternatives. Thus, any decision should be made on the basis of accurate and reliable predictions. In this respect, much can still be done to provide better models, more customizable informatics, and more tools to fill gaps in data.
Notwithstanding their differences, predictive hydrological tools, such as empirical, distributed, and lumped models, are essentially data-driven: they depend on atmospheric forcings, and their parameters are estimated from hydrometric data at the basin scale [4,5]. Therefore, considerable difficulties and uncertainties arise when models need to be implemented where there are no gauge stations to calibrate them and in the presence of ungauged dams.
River basin discharges are poorly gauged in many parts of the world: while rain gauges are relatively dense, few monitoring points are available for discharge, and their number is often declining. However, little effort is made to extract the data required for robust predictions. Therefore, under these conditions, estimates of extreme values of discharge in ungauged or poorly gauged basins are highly uncertain [4].
Several authors have proposed different strategies to overcome uncertainties due to data scarcity. For example, in Blöschl et al. [6] and Blöschl [7], the best practice recommendations for predicting runoff in ungauged and poorly gauged basins were summarized as (1) collecting all possible data during field surveys of a catchment, (2) deeply analysing the hydrograph to obtain the maximum information, and (3) using regionalization for parameters from similar catchments. Others have suggested (4) the integration of additional data, such as Remote Sensing (RS) measurements, to obtain a more consistent description of hydrological processes [8][9][10][11]. For example, as reported in [12], using RS to construct and verify a groundwater model can reduce the uncertainty of calibrations and increase the predictive value of a model. Predictions can be further improved by (5) performing multiple calibration strategies, such as multi-site calibrations, as proposed in [13]. Multiple parameter sets are obtained from the decomposition of time series to reduce parameter identifiability issues in a semi-distributed model, which could also be helpful in ungauged locations.
One critical aspect of these poorly gauged situations is that it is difficult even to prioritize policy actions that progressively enhance data availability. The implementation of traditional models can be discouraging because they have to be completely rebuilt any time new data come in. Therefore, a new type of modelling infrastructure that can accept changes, without requiring a complete refactoring when new information is acquired, is deemed necessary.
The Basilicata region can be identified as poorly monitored. A limited number of water level gauges (about 10) are available over the region, and discharge measurements are not carried out systematically in each cross section. Only a few sporadic measurements are available, and they do not allow the derivation of reliable flow rating curves. Therefore, the modelling application becomes a challenging exercise that requires using all the available information to overcome such limitations.
Considering all the regional complexities, such as dams and derivations, new strategies needed to be implemented in the DSS, which operates in near real-time for the functional center of the regional Civil Protection, in order to (i) obtain reasonable discharge estimates with the available measurement networks and (ii) implement a modelling infrastructure that is easy to modify when new measurements become available.
To this aim, the hydrological modeling system GEOframe-NewAge [14,15] was chosen and, in the present work, was made operational at a regional scale for the forecasts over 160 monitoring points.
We present (a) the new features developed and integrated in GEOframe-NewAge to enhance the hydrological predictions in the Basilicata region and (b) the various strategies implemented to overcome the uncertainty of calibrations due to data scarcity. Moreover, all these aspects are treated so that the adopted modelling solutions can be replicated in similar future systems.
The present work is organized as follows: Section 2 describes the case study of the Basilicata region; Section 3 presents GEOframe-NewAge, its components, and the NET3 graph (Section 3.1), the calibration strategies in a data-scarce environment (Section 3.2) and the model setup (Section 3.3). Section 4 discusses the results of the application and finally, the conclusions of the study are given in Section 5.

Case Study
The Basilicata region, in Southern Italy, is one of the most vulnerable in the national territory, with nearly 50% of the towns classified as high risk for landslides or floods [16][17][18], and a dense drainage network that supplies around 90% of the water resources for public use to the Puglia region [19]. The Basilicata region covers an area of 9992 km² in Southern Italy. It is mostly mountainous, with up to 47% of the territory located in areas between 700 and 2250 m a.s.l. (Figure 1a).
The climate is typically Mediterranean and characterized by hot, dry summers and mild, wet winters. The yearly average rainfall ranges between 950 and 1050 mm, while the average temperature is around 14 °C, with an average daily maximum of around 21.5 °C during summer and an average daily minimum of around 4.5 °C during winter.
The vegetation is strongly influenced by climatic conditions. The Western part, along the Apennines, is characterized by highly vegetated soils, with many woods and fields under cultivation of vines and olives. The Eastern part of the region, in contrast, is characterized by bare and arid soils, with poor vegetation and few irrigated zones [16,20,21].
As shown in the left part of Figure 1b, the region is characterized by a complex drainage network, in which it is possible to recognize nine main rivers: Agri (1702 km²), Basento (1526 km²), Bradano (2776 km²), Cavone (632 km²), Lao (421 km²), Noce (356 km²), Ofanto (2659 km²), Sele (2369 km²), and Sinni (1294 km²). The Agri, Basento, Bradano, Cavone, and Sinni Rivers flow into the Ionian Sea, while the Ofanto River flows into the Adriatic Sea, and the Sele, Noce, and Lao Rivers flow into the Tyrrhenian Sea. The region is characterized by high water resource availability (around 1 billion m³ per year), which is mostly used for the water supply of the Puglia Region, the Calabria Region, and the Basilicata region itself. A complex scheme of hydraulic infrastructures, such as dams, reservoirs, and canals, has been built on the Agri, Bradano, Basento, Ofanto, and Sinni Rivers. In particular, the most important reservoirs are
Each of the nine major rivers is highlighted with a different color.
Figure 1a shows the network of measuring stations: the red dots represent the meteorological stations, while the blue stars are the hydrometers. The network is composed of 54 meteorological stations, with a density of 0.5/100 km², which provide hourly temperature and precipitation, and 10 hydrometers, only 6 of which are active. The dataset of meteorological inputs, i.e., temperature and precipitation, and of stage measurements covers two years (from 15 December 2013 to 15 December 2015) at an hourly timescale. The region can be described as poorly gauged with respect to the hydrometric stage measurements since there are only six active hydrometers over a 9992 km² territory. Figure 1b shows the regional discretization in 160 Hydrological Response Units (HRUs), whose area varies from around 10 to around 150 km², according to the geomorphology and monitoring points. The DEM used for the derivation of the HRUs has a resolution of 240 × 240 m.

Methodology
In this section, we describe the hydrological modelling system GEOframe-NewAge and the enhancements introduced in this work by the integration of a new NET3 version. Then, the calibration strategies proposed to overcome the problems related to data scarcity in the Basilicata region, i.e., the extraction of the flow-rating curves and the multi-component and multi-site calibrations, are reported. Finally, the model setups at both the HRU scale and the catchment scale are described.
More than 50 components are available, and they can be grouped into nine categories, as follows:
• Geomorphic and DEM analyses: the basin can be discretized into Hydrological Response Units (HRUs), i.e., hydrologically similar parts, such as a catchment or a hillslope or one of its parts, using the components introduced in [32] and reviewed in [33].
• Spatial interpolation: the meteorological forcing data in the centroid of each HRU can be spatially interpolated using a geostatistical approach, such as the Kriging technique [26,34].
• Radiation: the radiation budget includes both shortwave and longwave radiation components [25,29].
• Evapotranspiration (ET): it can be estimated using three different formulations: the FAO Evapotranspiration model [35], the Priestley-Taylor model [35], and the GEOframe ET model based on [36].
• Snow: snow melting and the snow water equivalent are treated using a component that includes three models, as described in [28].
• Runoff production: it is performed using the Embedded Reservoir Model (ERM) [15], which schematizes each HRU as a group of storages (reservoirs) and solves the water budget for each one.
• Travel time analysis: it can be performed using the approach proposed in [37,38].
• Calibration: two algorithms are available, Let Us CAlibrate (LUCA) [39] and Particle Swarm Optimization (PSO) [40].
• Routing: the discharge generated at each hillslope is finally routed to the outlet using the Muskingum-Cunge method [41,42].
Any of these components can be used or removed at run time without disrupting the system, although this obviously requires a re-calibration of the appropriate parameters. Apart from this flexibility, the core of the modelling system consists of non-linear reservoir models, which can be connected in multiple ways at run time.
A graph-based structure called NET3 [43,44] is employed for the management of process simulations. NET3 is designed using a river network/graph structure analogy, where each HRU is a node of the graph, and the channel links are the connections between the nodes. In any NET3 node, a different modeling solution can be implemented, and nodes (HRUs or channels) can be connected or disconnected at run time through scripting. Independent nodes are run in parallel, making the simulations faster and facilitating the implementation of additional features. NET3 also enables a further layer of modeling adaptability that allows hydraulic infrastructures to be seamlessly inserted into the natural river network [44].
GEOframe-NewAge is open source and helps the reproducibility and replicability of research [45] by providing an ordered modality to run simulations and preserving their parameters. The model source code and projects are managed with Git [46][47][48] in GitHub repositories, which developers use to keep track of code and project evolution. A continuous integration service, i.e., Travis CI [49], is also connected to the repository and ensures the building and testing of the source code at each commit. Dependencies on external classes and/or libraries within components are automatically resolved using the Gradle build system [50]. Finally, developers and users collaborate, share documentation, and archive examples and data using the Open Science Framework project [51], which was specifically created for the GEOframe community [52].
The enhancement of GEOframe-NewAge through the integration of the NET3 version developed in the present work makes the model particularly suitable for poorly gauged basins, since it is possible to build the infrastructure and expand it as soon as new measurements (or components) are available. For example, when a new gauge station becomes operative, it is not necessary to re-build the whole system; it is possible to calibrate only the model parameters specific to the HRUs pertaining to the new insertion (and possibly those downstream). This is particularly convenient in operational cases, not only to avoid wasting time in system refactoring, but also to help decision-makers prioritize actions to improve the measurement network. Thus, in a poorly gauged region such as Basilicata, the system makes it easy to define, with minimum effort, the operational investments needed to improve the hydrological monitoring.

Simplified Embedded Reservoir Model
In this work, a simplified form of the Embedded Reservoir Model (ERM) presented in [15] was used, albeit the number and scheme of reservoir connections were the same as those in the original model. Therefore, hereinafter, the ERM model version used in this work is referred to as the simplified Embedded Reservoir Model (sERM).
The simplification was driven by the idea that, in a data-scarce environment, using less complicated models is the first recommendation to follow.
In the almost-infinite panorama of hydrological models, the sERM does not aim to represent the perfect model but rather a modular system. Flexible and extensible, the model makes it possible to take into account a broad range of modeling strategies. Other models worth mentioning that allow such flexibility are the FLEX model [53,54] and the SUMMA model [55,56]. These frameworks are based on a general set of conservation equations for mass and energy, with the capability to incorporate multiple choices for spatial discretization and flux parameterizations [55]. The rationale behind the sERM, FLEX and SUMMA modeling systems is the same:
• the possibility to consider several representations of spatial variability and hydrologic connectivity;
• the possibility to simulate a broad range of hydrologic processes, with multiple options for individual processes.
However, the GEOframe-NewAge system has the further ambition of providing an infrastructure that facilitates the production of the tools and models needed to reach the goals just presented.
The sERM is represented by extended Petri nets (EPNs) [57] in Figure 2. The model is composed of four integrated reservoirs: canopy (green reservoir), root zone (orange reservoir), runoff (purple reservoir), and groundwater (yellow reservoir). The symbols in the figure are reported in the list of sERM definitions in Table 1.
Table 1. List of symbols, names, and units used in the sERM representation. P indicates a calibrated parameter, P* indicates a parameter to be set from the literature, P• indicates a measured parameter, SV indicates a state variable, and F indicates flux. The definitions also include the symbols defined in the expression table below.

Symbol    Name                                                              Type    Unit
a         coefficient of the RZ non-linear reservoir model
          partitioning coefficient between root zone and surface runoff    SV
From Figure 2, the ruling equations can be easily established for the canopy storage, the root zone storage, the runoff storage, and the groundwater storage. Table 2, the so-called expressions table, provides mathematical completeness to the fluxes.

Symbol      Name                                      Expression
ET_rz(t)    evapotranspiration from the root zone     min 1, 4
Following Figure 2, after the detection of the snowmelt and/or the rainfall, the throughfall and the evaporation from the wet canopy are computed. The wet canopy is modeled through a slightly modified version of the Rutter model [58] (last line of Table 2): the original drainage function was omitted to eliminate some calibration parameters. Throughfall is partitioned into the infiltration in the root zone and the direct surface flow according to the saturation conditions of the root zone. The canopy maximum retention storage, S_c,max, is modeled as a function of the time-varying Leaf Area Index (LAI) [m²/m²], as in [59]. If there is no canopy, the melting/rain is partitioned between the root zone and the runoff according to a time-varying partition coefficient, α(t), which is modeled according to [60].
Evapotranspiration is modeled using the formulation proposed by [61], as shown in the third line of the expression table (Table 2). The root zone storage accounts for the evaporation from the bare soils, the transpiration of the plants, and the recharge term of the groundwater.
Precipitation that exceeds the root zone capacity is sent directly to the volume available for surface runoff. The surface runoff Q_R(t) is modeled with a linear reservoir, in which the coefficient kA^β, i.e., the mean residence time of the basin, is computed as a power law of the area A. This approach was originally proposed in [62], with β obtained after fitting the discharges of the Basilicata rivers. Values reported in that study were 0.458 for the Agri River and 0.5 for all the others. In the present study, the β value was set to 0.5 for all the investigated rivers, while the k parameter was calibrated. This was done to further simplify the model by reducing by one the degrees of freedom of the calibrations. The baseflow from the groundwater is modeled using a non-linear reservoir. Finally, the total runoff is the sum of the direct runoff and the baseflow.
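The surface-runoff reservoir described above can be illustrated with a minimal sketch. It assumes that the outflow is Q_R = S_R / (k A^β) and that the storage is updated with an explicit Euler step; the function names, the units, and the value of k used in any call are hypothetical and are not taken from the GEOframe-NewAge code.

```python
def residence_time(k, area_km2, beta=0.5):
    """Mean residence time as a power law of the drained area: k * A**beta."""
    return k * area_km2 ** beta


def route_linear_reservoir(inflow, k, area_km2, beta=0.5, dt=1.0, storage0=0.0):
    """Linear reservoir with explicit Euler stepping (assumed scheme).

    Q = S / tau and dS/dt = inflow - Q, with tau = k * A**beta.
    Returns the outflow series and the storage left at the end.
    """
    tau = residence_time(k, area_km2, beta)
    storage = storage0
    outflow = []
    for q_in in inflow:
        q_out = storage / tau                    # linear storage-discharge law
        storage = storage + dt * (q_in - q_out)  # water budget of the reservoir
        outflow.append(q_out)
    return outflow, storage
```

The explicit step is only stable when dt is smaller than the residence time, which is one reason an operational code would likely use a more robust integrator.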

Calibration Strategies in a Data-Scarce Environment
Besides the sERM simplifications introduced in Section 3.3.1, several calibration strategies to address data scarcity were adopted.
First, the Flow-Rating Curves (FRCs) were extracted, making the best use of the available stage measurements, using two different approaches. In the first approach, which follows the methodology proposed in [63], the curves were obtained directly as the product of the relations of (i) the mean flow velocity (V) with the river stage (H) and (ii) the wetted area (A) with the river stage. These relations were obtained using available observations on flow velocities and cross-section surveys. In the second approach, the FRCs were extracted using observations of the river stage and discharge (Q). Then, for each monitoring point, the best flow-rating curve was chosen by comparing the mean annual discharge volumes calculated with the first approach (VA), the second, standard approach (QH), and a simple water balance model, i.e., the Budyko model [64]. The equations of the Budyko model are reported in Appendix B.
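As an illustration of the first (VA) approach, the sketch below builds a rating curve as the product Q(H) = V(H) · A(H). The power-law form assumed for both V(H) and A(H), the log-log least-squares fit, and all function names are assumptions made for this example; the functional forms actually used in [63] are not reproduced here.

```python
import numpy as np


def fit_power_law(stage, values):
    """Fit values = c * stage**p by least squares in log-log space (assumed form)."""
    p, log_c = np.polyfit(np.log(stage), np.log(values), 1)
    return float(np.exp(log_c)), float(p)


def rating_curve_va(stage, v_params, a_params):
    """Discharge as the product of the velocity-stage and area-stage relations."""
    c_v, p_v = v_params
    c_a, p_a = a_params
    stage = np.asarray(stage, dtype=float)
    return (c_v * stage ** p_v) * (c_a * stage ** p_a)


# Synthetic (hypothetical) observations, only to exercise the functions.
h_obs = np.array([0.5, 1.0, 1.5, 2.0, 3.0])   # river stage [m]
v_obs = 0.8 * h_obs ** 0.6                    # mean velocity [m/s]
a_obs = 12.0 * h_obs ** 1.4                   # wetted area [m2]

v_params = fit_power_law(h_obs, v_obs)
a_params = fit_power_law(h_obs, a_obs)
q_at_2m = rating_curve_va(2.0, v_params, a_params)
```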
Then, different calibrations were carried out to obtain reliable model parameters estimates against the extracted FRCs.
Thanks to the GEOframe-NewAge component-based infrastructure, it was possible to detect each input and output of a single component and calibrate multiple sets of parameters, with each set associated with different hydrological processes, e.g., surface runoff and baseflow, as shown in Figure 3. Calibrating against the different components of the discharge drives the final parameter estimates toward more reliable values. As proposed in [10], the following calibration procedure was performed:
1. from the hourly total discharge, the baseflow was extracted using a mathematical filter, which connected the local minima (Figure 3a, red line);
2. the runoff was extracted by subtracting the baseflow from the hourly total discharge (Figure 3a, blue line);
3. the parameters of the root zone and runoff reservoirs were calibrated against the extracted runoff (Figure 3b);
4. with the calibrated parameters of the root zone and runoff reservoirs kept fixed, the parameters of the groundwater reservoir were calibrated against the extracted baseflow (Figure 3c);
5. finally, the calibrated parameters previously obtained were further optimized against the hourly total discharge (Figure 3d).
When different discharge measuring points were available for the same river, the NET3 graph structure was used to perform multi-site calibrations. The first calibration followed the procedure described above and was performed for the uppermost point. Then, moving toward the outlet, the best upstream parameters remained fixed, and the downstream calibrations were performed. In this way, it was possible to consider different parameter sets according to local features and climatic forcings.
Figure 3. Flowchart of the calibration strategy: first, the discharge components, baseflow (blue line) and runoff (red line), were extracted from the total discharge (a); then, the parameters were calibrated against the runoff series (b) and against the baseflow series (c); finally, all the parameters were recalibrated against the total discharge to further improve the results (d). If multiple monitoring points were available (red stars), then the procedure was repeated for each one.
Finally, for rivers that lacked measurements, the model parameters were set equal to those of hydrologically similar calibrated rivers, in terms of climatic conditions (i.e., precipitation and evapotranspiration), lithology, geology, and land use.
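Steps 1 and 2 of the procedure can be sketched as follows. This is a minimal interpretation of a filter that "connects the local minima": interior local minima and the two end points are joined by straight lines, and the result is capped at the total discharge. The exact filter used in the study is not specified in the text, so these details and the function name are assumptions.

```python
import numpy as np


def separate_baseflow(discharge):
    """Split a discharge series into baseflow and runoff (illustrative filter)."""
    q = np.asarray(discharge, dtype=float)
    n = q.size
    # Interior points that are not larger than either neighbour.
    minima = [i for i in range(1, n - 1) if q[i] <= q[i - 1] and q[i] <= q[i + 1]]
    # The end points are added so that the interpolation covers the whole series.
    knots = sorted(set([0] + minima + [n - 1]))
    baseflow = np.interp(np.arange(n), knots, q[knots])
    baseflow = np.minimum(baseflow, q)   # baseflow can never exceed total discharge
    runoff = q - baseflow                # step 2: runoff by subtraction
    return baseflow, runoff


# Hypothetical hourly discharges [m3/s].
q_total = [2.0, 5.0, 3.0, 8.0, 4.0, 2.0]
q_base, q_run = separate_baseflow(q_total)
```

The two returned series would then serve as calibration targets for steps 3 and 4, respectively.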

Model Setup
Customized modeling solutions, i.e., schemes of connections of the components to perform specific modeling tasks, were created to solve the hydrological budget at the HRU scale, to model the dams and then connect all the nodes of the network to the outlet. In the following sections, the model setups at the HRU scale and basin scale are described.
Figure 4. The modeling solution adopted in this work allows for the simulation of the entire hydrological budget: from the spatialization of the temperature data (SIK-K component [26]) to the runoff production (Embedded reservoirs component).

HRU Scale
The meteorological input forcings are spatially interpolated using Kriging techniques (the SIK-K component). Its parameters were obtained following the procedure reported in [26]. First, the semivariance was analyzed, and experimental semivariograms were fitted using the best of the 10 available theoretical models. The semivariogram parameters (sill, nugget, and range) were optimized using the particle swarm calibrator. Then, the best model was used for the interpolation of the temperature and precipitation using ordinary Kriging. Finally, Kriging performances were assessed using leave-one-out cross-validation.
After the interpolation, the radiation budget, including both shortwave and longwave radiation (SWRB and LWRB components, respectively), is computed. For this project, the SWRB and LWRB component parameters were set to the default literature values, as in [25,29,30,61,65]. Then, the potential evapotranspiration is simulated according to the Priestley-Taylor model. The total interpolated precipitation is separated into rainfall and snowfall, which is an input of the snow component. The outputs are the snow water equivalent and the melting discharge (if there is snow) or rainfall (if there is no snow). Potential evapotranspiration and melting discharge/rainfall feed into the rainfall-runoff model sERM.
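The last two steps (ordinary Kriging and leave-one-out cross-validation) can be sketched in a few lines. The exponential semivariogram, the station coordinates, and the parameter values below are assumptions chosen for illustration; the SIK-K component offers several theoretical models and its implementation is not reproduced here.

```python
import numpy as np


def exponential_semivariogram(h, nugget, sill, range_):
    """Exponential model (assumed choice); gamma(0) = 0 by definition."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * h / range_))
    return np.where(h > 0.0, gamma, 0.0)


def ordinary_kriging(xy, z, target, nugget, sill, range_):
    """Ordinary Kriging estimate at `target` from stations `xy` with values `z`."""
    n = len(z)
    dists = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    system = np.ones((n + 1, n + 1))      # last row/column: unbiasedness constraint
    system[:n, :n] = exponential_semivariogram(dists, nugget, sill, range_)
    system[n, n] = 0.0
    rhs = np.ones(n + 1)
    rhs[:n] = exponential_semivariogram(
        np.linalg.norm(xy - target, axis=1), nugget, sill, range_)
    weights = np.linalg.solve(system, rhs)[:n]   # drop the Lagrange multiplier
    return float(weights @ z)


def leave_one_out(xy, z, nugget, sill, range_):
    """Predict each station from all the others."""
    preds = []
    for i in range(len(z)):
        keep = np.arange(len(z)) != i
        preds.append(ordinary_kriging(xy[keep], z[keep], xy[i], nugget, sill, range_))
    return np.array(preds)


# Hypothetical station coordinates [km] and temperatures [degrees C].
stations = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.2]])
temps = np.array([14.0, 13.2, 12.5, 11.9, 13.6])
loo_pred = leave_one_out(stations, temps, nugget=0.1, sill=1.0, range_=2.0)
loo_rmse = float(np.sqrt(np.mean((loo_pred - temps) ** 2)))
```

In a calibration loop, an error measure such as `loo_rmse` could be the quantity minimized when the sill, nugget, and range are optimized.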
All the parameters, which were fixed for all nine rivers, are reported in Table 3.

Catchment Scale
The modeling solution presented in Figure 4 was solved for each of the 160 HRUs into which the region was discretized. Then, as shown in the EPN of Figure 5, the discharge generated at the HRU scale (represented in gray) is routed along the river network. The dam is simulated by a simple equation, while the routing is done using the Muskingum-Cunge (MC) method [41]. Table 4 defines the symbols, names, and units used in the EPN representation, and Table 5 lists the related expressions.
Table 4. List of symbols, names, and units used in the EPN representation in Figure 5.
Table 5. Expression table associated with the EPN representation in Figure 5.
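As an illustration of the routing step, the sketch below implements the classic Muskingum scheme with a constant storage constant K and weighting factor X. In the Muskingum-Cunge variant these two quantities are derived from channel hydraulics, which is not shown here; the parameter values and function names are hypothetical.

```python
def muskingum_coefficients(K, X, dt):
    """Classic Muskingum routing coefficients; they always sum to one."""
    denom = 2.0 * K * (1.0 - X) + dt
    c0 = (dt - 2.0 * K * X) / denom
    c1 = (dt + 2.0 * K * X) / denom
    c2 = (2.0 * K * (1.0 - X) - dt) / denom
    return c0, c1, c2


def muskingum_route(inflow, K, X, dt, outflow0=None):
    """Route an inflow hydrograph through one reach.

    O[i] = c0 * I[i] + c1 * I[i-1] + c2 * O[i-1]
    The initial outflow defaults to the initial inflow (steady start).
    """
    c0, c1, c2 = muskingum_coefficients(K, X, dt)
    outflow = [inflow[0] if outflow0 is None else outflow0]
    for i in range(1, len(inflow)):
        outflow.append(c0 * inflow[i] + c1 * inflow[i - 1] + c2 * outflow[-1])
    return outflow


# Hypothetical hydrograph [m3/s] with K = 2 h, X = 0.2 and an hourly step.
routed = muskingum_route([10.0, 30.0, 60.0, 40.0, 20.0, 10.0], K=2.0, X=0.2, dt=1.0)
```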

Symbol     Name                       Expression
A(H, t)    surface area of the dam    a_surface (H(t) − b_surface)^(3/2)
In Figure 5, the sites S_MC1, S_MC3, and S_MC4 (in gray) represent the routing reservoirs. The orange site, on the other hand, is a dam located between the streams C1 and C3. The input flux is the discharge routed from upstream, from HRU 1, while the outflow is, in turn, the input of the downstream HRU 3. The NET3 graph machinery manages the connection of each HRU to the next one in a cascade to the outlet, i.e., HRU1 → Dam → HRU3 and HRU4 → HRU3. The topology of the graph, i.e., the schema of the connection of the EPN sites (reservoirs), is specified in a customized input file. NET3 launches the simulations according to this topology and identifies independent processes that can be run in parallel. For example, the simulations for HRU 1 and HRU 4 in Figure 5 can be run in parallel. It is also possible to define different model parameter sets for a single HRU or a group of HRUs to enable multi-site calibrations and optimize the management of the nodes of interest, such as the dam, as explained in Section 3.2.
Three dams out of five were modeled in this study: Pertusillo on the Agri River, Monte Cotugno on the Sinni River, and San Giuliano on the Bradano River, which are the most important and largest ones in terms of volume. For the three dams, data were available at a daily timescale. A simple water balance (calculation of the differences between the recorded daily volumes) was performed to reconstruct the inflows to each dam when inflow data were missing, and the reconstructed inflows were used to validate the calibration results.
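The way independent nodes can be identified from the topology can be sketched with a topological sort. The dictionary below is a hypothetical stand-in for the NET3 input file, whose actual format is not shown in the text; it encodes the example HRU1 → Dam → HRU3 and HRU4 → HRU3. The sketch uses the standard-library `graphlib` module (Python 3.9 or later) and is not the NET3 implementation.

```python
from graphlib import TopologicalSorter


def parallel_batches(downstream_of):
    """Group nodes into batches whose members do not depend on each other.

    `downstream_of` maps each node to the node it drains into
    (None for the outlet). A node is ready once all of its upstream
    nodes have been simulated.
    """
    sorter = TopologicalSorter()
    for node, downstream in downstream_of.items():
        sorter.add(node)
        if downstream is not None:
            sorter.add(downstream, node)   # `downstream` waits for `node`
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # these could run in parallel
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Hypothetical encoding of the example in Figure 5.
topology = {"HRU1": "Dam", "Dam": "HRU3", "HRU4": "HRU3", "HRU3": None}
batches = parallel_batches(topology)
# batches == [["HRU1", "HRU4"], ["Dam"], ["HRU3"]]
```

The first batch contains HRU 1 and HRU 4, matching the statement that these two simulations can be run in parallel.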
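The inflow reconstruction can be sketched as a daily storage difference. The text only mentions the differences between recorded daily volumes; adding the released discharge back to the storage change is an assumption of this example, as are the function name, the units, and the sample numbers.

```python
def reconstruct_inflows(volumes_m3, releases_m3s=None, dt_s=86400.0):
    """Mean inflow per interval from consecutive stored volumes.

    inflow = (V[i+1] - V[i]) / dt + release[i]
    `releases_m3s` is optional; without it, only the storage change is used.
    """
    n = len(volumes_m3) - 1
    releases = [0.0] * n if releases_m3s is None else list(releases_m3s)
    return [
        (volumes_m3[i + 1] - volumes_m3[i]) / dt_s + releases[i]
        for i in range(n)
    ]


# Hypothetical daily volumes [m3] and mean daily releases [m3/s].
inflows = reconstruct_inflows([1_000_000.0, 1_864_000.0, 1_864_000.0], [5.0, 2.0])
```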

Results and Discussion
As described in Section 3.2, the Budyko model was used to validate the flow-rating curves that were extracted for the six monitoring points. Tables 6 and 7 report the results of comparing the mean annual volumes obtained with the VA approach, the QH approach, and the Budyko model for the two considered years (2012-2013) in terms of Mean Absolute Percentage Error (MAPE).
Table 6. Comparison of the mean annual volume of discharges obtained using the extracted flow-rating curves with the two proposed approaches and the expected volumes obtained using the Budyko model for 2012.

River    Budyko    Q(V,A)    Q(H)    MAPE (VA)    MAPE (Q(H))
Agri
Table 7. Comparison of the mean annual volume of discharges obtained using the extracted flow-rating curves with the two proposed approaches and the expected volumes obtained using the Budyko model for 2013.

River                    Budyko    Q(V,A)    Q(H)    MAPE (VA)    MAPE (Q(H))
Agri Ponte La Marmora    750       510       508     32           32
Agri SS 106              841       332       310     61           63
Basento SS 106           348       157       250     55           28
Bradano SS 106           90        64        55      29           39
Cavone SS 106            436       520       459     19           5
Sinni Episcopia          1053      516       472     51           55
As is clear from the results, because of the data scarcity, which affects the extraction of the curves, both approaches overestimated or underestimated the mean annual volumes. This also affected the results of the calibration procedure, as discussed later. In general, the VA FRCs perform better than the QH FRCs, except for the Basento SS106 and Cavone SS106 monitoring points, for which the QH FRCs yield better estimates. Therefore, according to the previous results, the QH FRCs were chosen for Basento SS106 and Cavone SS106, while the VA FRCs were used for the remaining points. The FRCs used for each of the six monitoring points are reported in Appendix C.
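The selection rule can be reproduced from the 2013 values in Table 7 with a short sketch. The percentage error is computed here relative to the Budyko volume, which is consistent with the tabulated values; choosing the curve on the 2013 data alone, and keeping VA in case of a tie, are simplifications of this example rather than the procedure of the study.

```python
# (Budyko, Q(V,A), Q(H)) mean annual volumes for 2013, as listed in Table 7.
table7 = {
    "Agri Ponte La Marmora": (750, 510, 508),
    "Agri SS 106": (841, 332, 310),
    "Basento SS 106": (348, 157, 250),
    "Bradano SS 106": (90, 64, 55),
    "Cavone SS 106": (436, 520, 459),
    "Sinni Episcopia": (1053, 516, 472),
}


def percentage_error(expected, estimated):
    """Absolute percentage error of an estimate relative to the expected value."""
    return abs(expected - estimated) / expected * 100.0


def choose_curve(budyko, va, qh):
    """Pick the approach whose volume is closer to the Budyko estimate."""
    return "VA" if percentage_error(budyko, va) <= percentage_error(budyko, qh) else "QH"


chosen = {station: choose_curve(*volumes) for station, volumes in table7.items()}
```

With these numbers, the QH curve is selected only for Basento SS 106 and Cavone SS 106, in agreement with the choice stated above.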
The final parameter sets of the sERM were obtained using the LUCA calibration algorithm to optimize the Nash-Sutcliffe Efficiency (NSE), whose equation is reported in Appendix D.
For the Agri river, it was possible to calibrate using two measuring stations: (i) upstream of the Pertusillo Dam (Agri (Up)) at the station Grumento-Ponte La Marmora and (ii) downstream (Agri (Down)) at State Road 106 (SS106).
For the Bradano River, the parameters were calibrated only considering the HRUs downstream of the dam because of its strong influence on the measured discharge at the State Road 106 monitoring point. In particular, the upstream HRUs were disconnected from the topology of the river network, and the downstream restitution, which was extracted from the data from the San Giuliano dam management office, was considered to be the only upstream contributor to the total discharge at the closing section on State Road 106.
For the Sinni River, on the contrary, the parameters were calibrated only for the HRUs upstream of the dam since there were no active monitoring points on State Road 106.
For the Ofanto, Lao, Noce, and Sele rivers, the parameterizations used were the same as those used for the Basento, Sinni, Sinni, and Agri up-dam, respectively. Figure 6 shows the Flow Duration Curves (FDCs) obtained for each of the six stations. The FDCs were used to verify the goodness of the calibration results since they enable the interpretation of the discharge components (e.g., the surface, subsurface, and groundwater), which are better described by the sERM model, wherein they are distinguished in terms of the frequency of discharge. It is clear that the model is able to well represent both high-and low-frequency discharges for the Cavone, Agri, and Sinni Rivers. On the contrary, Basento and Bradano show problems related to the subsurface contribution (6 and 10 m 3 /s, respectively) because of the lack of a subsurface component of the runoff, resulting in all the baseflow being attributed to the groundwater component. The Bradano river, in particular, shows a very constant measured baseflow (1 m 3 /s): it is almost zero in summer because of a wide floodplain and because of a subsurface component, and it is difficult to simulate by the model.  Figure 7 shows the differences between the measured and the simulated discharges for the six measuring stations. There is clearly a general underestimation of peak flow for the flood event in January 2014, with a maximum difference of around 600 m 3 /s for the Agri SS106 closure section. This is mainly because of the initial conditions for the maximum storages in the root zone and groundwater, which were still affecting the simulations, given an insufficient warm-up period of the model before the event (less than a month). Basento SS106 and Agri SS106 also show large differences in January-February 2015, for which the model mostly overestimates by around 50 m 3 /s. In these cases, the problem lies in the hydraulic complexities, i.e., dams and derivation channels, which are not fully captured by the model. 
The best fit is obtained for the sections located upstream of the dams, i.e., Agri Ponte La Marmora and Sinni Episcopia; this is confirmed by the Goodness-of-Fit (GOF) analysis. Tables 8 and 9 show the values of the sERM calibrated parameters and the GOF indices, respectively. GOF formulations are reported in Appendix D.
Analyzing the values of the calibrated parameters for the root zone reservoir, we can see that the Sinni River has the greatest maximum storage, 233.05 mm, which is expected since it has a higher mean annual precipitation (around 1500 mm/year) than the other rivers, whose mean annual precipitation is around 1000 mm/year. Some of the model parameters, such as the B parameter of the formulation from [60] and the runoff coefficient, vary little among the different sets. This means that the model is less sensitive to those parameters than to the others. However, these values are in line with previous studies, such as [62].
It is also interesting that the optimized values of the Basento and Cavone Rivers are similar, which is also expected since they are both semi-arid basins, with around 700 mm/year of precipitation and around 600 mm/year of potential evapotranspiration. The coefficient of the non-linear reservoir for the groundwater is high in both cases: 25.89 h⁻¹ for Basento and 29.88 h⁻¹ for Cavone.
Much of the variability in the groundwater values is due to the presence of hydraulic infrastructures.
The overall model performance in reproducing the discharge is good, with the best fit obtained for the Sinni River, with a Kling-Gupta Efficiency (KGE) of 0.82, NSE of 0.76, Root-Mean-Square Error (RMSE) of 5.42 m³/s, and Percent Bias (PBIAS) of −0.1%. In general, better GOFs were obtained for the calibrated sections upstream of the dams (i.e., Agri (Up) and Sinni) since, downstream, the natural hydrology of the region is strongly affected by the complex hydraulic scheme.

Table 9. Indices of goodness of fit obtained for the six investigated points. Columns: River, KGE [-], NSE [-], RMSE [m³/s], PBIAS [%].

The calibration results were validated by comparing the simulated inflows at the three modeled dams (Pertusillo on the Agri River, Monte Cotugno on the Sinni River, and San Giuliano on the Bradano River) with the recorded daily discharges, as shown in Figure 8. This validation is the most reliable since the recorded values are not affected by the uncertainties of the FRC extractions. The comparison covers 2014 only, since that is the only year for which the recorded inflows overlap with our dataset. Simulated discharges were aggregated from the hourly timescale to the daily timescale and then compared with the recorded daily values. Table 10 shows a good agreement between the measured and modeled discharges, with correlation coefficients of 0.91, 0.86, and 0.74 for Pertusillo, Monte Cotugno, and San Giuliano, respectively. The San Giuliano data are highly discontinuous, underlining the importance of using the hydrological model to simulate discharges at a short timescale, i.e., one hour, to support the planning and management of a dam.
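The comparison of hourly simulations with daily dam records can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the daily value is taken as the mean of 24 hourly discharges, the correlation coefficient is assumed to be Pearson's r, and both function names are hypothetical.

```python
import math

def hourly_to_daily_mean(hourly, steps_per_day=24):
    """Average consecutive blocks of `steps_per_day` hourly values.

    An incomplete final day is dropped (assumption).
    """
    n_days = len(hourly) // steps_per_day
    return [
        sum(hourly[d * steps_per_day:(d + 1) * steps_per_day]) / steps_per_day
        for d in range(n_days)
    ]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Illustrative values only: three full days plus five extra hours of simulation
simulated_hourly = [1.0] * 24 + [3.0] * 24 + [2.0] * 24 + [5.0] * 5
simulated_daily = hourly_to_daily_mean(simulated_hourly)
recorded_daily = [2.0, 6.0, 4.0]
r = pearson_r(simulated_daily, recorded_daily)
```

With discontinuous records such as those of San Giuliano, days without a recorded value would have to be removed from both series before computing r.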

Conclusions
This work presents the integration of the hydrological model GEOframe-NewAge into a decision support system to simulate the most relevant hydrological and hydraulic variables involved in flood production and landslide induction at a regional scale in a data-scarce environment.
Every hour, the entire hydrological budget is simulated. After the spatialization of the input variables using the Kriging algorithm, the radiation budget is computed, then the evapotranspiration and the snow processes are simulated, and finally, the discharges are produced for each of the 160 monitoring points. Moreover, a simplified model of the dam enables the simulation of the stage in the three major dams (Pertusillo, Monte Cotugno, and San Giuliano) that were considered in this study.
Various methodologies to overcome data scarcity are proposed:
• the extraction of the flow-rating curves using a novel approach based on velocity and wetted-area measurements;
• the multi-calibration against the different components of the discharge, i.e., the runoff and the groundwater;
• the multi-site calibration in different closure sections, when available.
The proposed methodologies proved to be robust, since the results show a good agreement between the measured and modeled discharges for all six monitoring points. The approach for extracting the flow-rating curves was validated using a simple annual water balance model, and it showed good performance for four of the six gauge stations. The multi-calibrations also gave consistent results, with better goodness-of-fit indices for the sections located upstream of the dams. They were also validated by comparing the modeled and measured inflows for each of the three dams.
Finally, for the Lao, Noce, Ofanto, and Sele rivers, where water level measurements were not available, the model parameters were set equal to those of hydrologically similar calibrated rivers, considering the climatic conditions (i.e., precipitation and evapotranspiration), lithology, geology, and land use.
We claim that the procedures we presented, both for the calibrations and for the model setup, can be easily replicated in any other poorly gauged basin, with great advantages for the early warning of flood and landslide events.
Moreover, the presented infrastructure allows new measurements and/or new components (e.g., new gauge stations) to be integrated as soon as they become available, without the need to re-implement and recalibrate the whole system. This is particularly important in poorly gauged locations, such as the Basilicata region, where a future expansion of the measuring network is foreseen. This is useful not only from the modeling point of view but also from the operational point of view, since it helps to define promptly the priority actions needed to improve the early-warning system.
The expansion of the infrastructure, for example with the near real-time calibration of the model parameters or the data assimilation of snow or soil moisture satellite products, is foreseen and could be easily implemented thanks to the potential and great flexibility of GEOframe-NewAge.

Funding: This work was carried out under a scientific agreement between the Civil Protection Department of Basilicata, the Interuniversity Consortium for Hydrology (CINID), and the University of Basilicata to start up the Basilicata Hydrologic Risk Center.

Conflicts of Interest: The authors declare no conflict of interest.

Appendix A. The Early Warning System of the Basilicata Region
The Italian national system for hydrological and hydraulic risk monitoring is regulated by the Directive of 27 February 2004, which introduced regional functional centers (CFDs) and defined their roles. In particular, on the basis of meteorological forecast bulletins, synoptic maps, and information on antecedent precipitation, a CFD evaluates and assigns critical levels to the different alert zones into which the territory has been divided and produces the corresponding Criticality Bulletin.
A DSS was implemented in support of the Basilicata CFD. The region was divided into seven alert zones according to their geomorphological and hydrological characteristics. Three critical levels (normal, moderate, and high) are defined daily for each zone on the basis of pluviometric thresholds. The latter are identified for characteristic durations of foreseen rainfall events (3, 6, 12, 18, 24, 48, 72, 96, and 120 h) and for return times of 2, 5, and 20 years.
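A check of accumulated rainfall against duration-dependent thresholds can be sketched as follows. This is only an illustration: the threshold values are invented, the function names are hypothetical, and the way exceedances for the different return times map onto the three critical levels is not described here, so it is left out.

```python
# Characteristic durations in hours, as listed in the text
DURATIONS_H = (3, 6, 12, 18, 24, 48, 72, 96, 120)

def rolling_sums(hourly_mm, window):
    """Sums of every run of `window` consecutive hourly rainfall values."""
    if len(hourly_mm) < window:
        return []
    total = sum(hourly_mm[:window])
    sums = [total]
    for i in range(window, len(hourly_mm)):
        total += hourly_mm[i] - hourly_mm[i - window]
        sums.append(total)
    return sums

def exceeded_durations(hourly_mm, thresholds_mm):
    """Durations for which some accumulation window exceeds its threshold."""
    hits = []
    for duration in DURATIONS_H:
        sums = rolling_sums(hourly_mm, duration)
        if sums and max(sums) > thresholds_mm[duration]:
            hits.append(duration)
    return hits

# Hypothetical thresholds [mm] for a single return time and alert zone
thresholds = {3: 25.0, 6: 35.0, 12: 45.0, 18: 55.0, 24: 60.0,
              48: 80.0, 72: 95.0, 96: 105.0, 120: 115.0}
rain = [0.0] * 10 + [10.0] * 3 + [0.0] * 10  # recorded plus forecast, mm/h
hits = exceeded_durations(rain, thresholds)
```

In an operational setting the series would combine recorded and forecast rainfall, and the check would be repeated for each alert zone and each return time.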
The integration of the DSS with a Web-GIS allows for the acquisition of a complete risk scenario using all available information in near real-time, i.e., from the current hour to the next 36 h. Web-GISs have user-friendly and lightweight interfaces, and users can easily access the geographical data and services using a browser [67]. The objective is the near real-time monitoring of the spatial and temporal evolution of the hydrology, which is then compared with thematic maps to determine the potential vulnerability and exposed elements in the territory.
Both dynamic and static layers are integrated with the Web-GIS: dynamic layers are time-variant and consist of the critical maps of hourly precipitation; static layers are thematic maps, such as landslide susceptibility, geology and land cover (maps), population, buildings, and infrastructures. Hourly dynamic maps are obtained from the precipitation data recorded by 54 stations and from forecasts at the synoptic scale, which the CFD produces daily as GRidded Information in Binary (GRIB) files (Cosmo LAMI products), with a resolution of around 5 km.
Every hour, the DSS operates as follows:
1. The cumulative precipitation of the last 120 h (as recorded by the meteorological network) is spatialized using the inverse distance weighting method [68];
2. The GRIB files with the forecasts are downloaded and pre-processed;
3. For each duration (3, 6, 12, 18, 24, 48, 72, 96, and 120 h), from the current time to the next 36 h, the DSS checks whether the threshold has been exceeded. Recorded precipitation and forecasts are accumulated for a time span of 3 h and compared with the critical values. If a threshold has been exceeded, then the area receives a critical alert level, and the related map is produced;
4. A map with the highest expected criticality is produced for the following 12 and 36 h;
5. The temperature data are downloaded from the sensor network and the LAI map is obtained from the MODIS satellite [69];
6. GEOframe-NewAge is run to obtain the 36-h discharge and stage forecasts for each node of the network, considering the running conditions;
7. A saturation degree map is produced for the current time;
8. The expected stages are compared with thresholds with an assigned return time to verify final hydrological criticalities;
9. A historical dataset of discharges and stages is updated, which is required for the definition of the running conditions for the simulation in the next hour.
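The inverse distance weighting used in the first step can be illustrated with a short sketch. It is not the DSS code: the function name and the station layout are hypothetical, and the exponent of 2 is a common default assumed here, since the value used in the system is not stated.

```python
import math

def idw(stations, target, power=2.0):
    """Inverse-distance-weighted estimate at `target`.

    stations: list of (x, y, value) tuples, e.g. coordinates in metres and
              cumulative precipitation in mm.
    target:   (x, y) of the point to estimate.
    power:    distance exponent (assumed value, not taken from the paper).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        dist = math.hypot(x - target[0], y - target[1])
        if dist == 0.0:
            # The target coincides with a station: return its value directly
            return value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Two hypothetical rain gauges and a point halfway between them
gauges = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
estimate = idw(gauges, (1.0, 0.0))
```

Applying the function to every cell of a grid gives the spatialized precipitation map that is then compared with the thresholds.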

Appendix B. The Budyko Model
The Budyko model [64] was used to compute a simple water balance to choose the best FRC for each monitoring point. The following equation was used:
$$\frac{ET_a}{P_a} = \sqrt{\varphi \tanh\left(\frac{1}{\varphi}\right)\left(1 - e^{-\varphi}\right)} \tag{A1}$$
where $ET_a$ is the annual evapotranspiration [L/T], $P_a$ is the annual precipitation [L/T], and $\varphi$ is the aridity index [-], which is defined as the annual potential evapotranspiration divided by the annual precipitation. Then, the annual discharge is simply
$$Q_a = P_a - ET_a \tag{A2}$$
Table A1 reports the method and the expression of the Flow-Rating Curves (FRCs) used in the study for each of the six monitoring points. When the method is "VA", the FRCs are obtained as a product of the relations between the velocity (V) and the stage (H) and between the area (A) and H. When the method is "QH", the discharge (Q) is a function of H. $H_0$ is the hydrometric zero.
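The Budyko water balance used to choose among candidate FRCs can be sketched in a few lines. The sketch assumes the classical Budyko curve, ET_a/P_a = sqrt(φ tanh(1/φ) (1 − exp(−φ))); the function names are hypothetical, and the input values are the approximate figures quoted in the text for the semi-arid basins, used only for illustration.

```python
import math

def budyko_et_ratio(aridity_index):
    """ET_a / P_a from the classical Budyko curve (assumed formulation)."""
    phi = aridity_index
    return math.sqrt(phi * math.tanh(1.0 / phi) * (1.0 - math.exp(-phi)))

def annual_discharge(p_annual, pet_annual):
    """Annual discharge Q_a = P_a - ET_a, in the same units as the inputs."""
    phi = pet_annual / p_annual  # aridity index [-]
    et_annual = p_annual * budyko_et_ratio(phi)
    return p_annual - et_annual

# Roughly 700 mm/year of precipitation and 600 mm/year of potential ET
q_annual = annual_discharge(700.0, 600.0)
```

An annual runoff volume obtained from a candidate FRC can then be compared with `q_annual` to judge whether that curve is plausible.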

Appendix D. Error Metrics Adopted
• Kling-Gupta Efficiency
The Kling-Gupta Efficiency (KGE) incorporates three different statistical measures of the relation between measured and simulated data into one objective function: the correlation coefficient, $r$; the variability error, $a = \sigma_S/\sigma_M$; and the bias error, $b = \mu_S/\mu_M$. Here, $\mu_S$ and $\mu_M$ are the mean values of the simulated and measured data, and $\sigma_S$ and $\sigma_M$ are the corresponding standard deviations.
$$KGE = 1 - \sqrt{(r - 1)^2 + (a - 1)^2 + (b - 1)^2} \tag{A3}$$
KGE = 1 indicates the maximum agreement between predicted and observed values.

• Nash-Sutcliffe Efficiency
The Nash-Sutcliffe Efficiency (NSE) is a normalized model efficiency coefficient. It determines the relative magnitude of the residual variance compared with the measured data variance:
$$NSE = 1 - \frac{\sum_{i=1}^{N} (S_i - M_i)^2}{\sum_{i=1}^{N} (M_i - \mu_M)^2} \tag{A4}$$
where $S_i$ and $M_i$ are the predicted and observed values at a given time step. The NSE varies from −∞ to 1, where 1 corresponds to the maximum agreement between predicted and observed values.

• Root-Mean-Square Error
The Root-Mean-Square Error (RMSE) is given by
$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (S_i - M_i)^2} \tag{A5}$$
where $M$ and $S$ represent the measured and simulated time-series, respectively, and $N$ is the number of components in the series.

• Percent Bias
The Percent Bias (PBIAS) measures the average tendency of the simulated values to overestimate (positive values) or underestimate (negative values) the observed values. PBIAS is given by
$$PBIAS = 100 \cdot \frac{\sum_{i=1}^{N} (S_i - M_i)}{\sum_{i=1}^{N} M_i} \tag{A6}$$
where $M$ and $S$ represent the measured and simulated time-series, respectively, and $N$ is the number of components in the series.
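The four indices can be computed with a few lines of code. The sketch below assumes the standard formulations of KGE, NSE, RMSE, and PBIAS, with the sign convention stated above (positive PBIAS means overestimation); the function names are hypothetical and the series are illustrative, not data from the study.

```python
import math

def _mean(values):
    return sum(values) / len(values)

def _std(values):
    m = _mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def kge(sim, obs):
    """Kling-Gupta Efficiency from correlation, variability and bias ratios."""
    mean_s, mean_m = _mean(sim), _mean(obs)
    std_s, std_m = _std(sim), _std(obs)
    cov = sum((s - mean_s) * (m - mean_m) for s, m in zip(sim, obs)) / len(obs)
    r = cov / (std_s * std_m)
    a = std_s / std_m
    b = mean_s / mean_m
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (a - 1.0) ** 2 + (b - 1.0) ** 2)

def nse(sim, obs):
    """Nash-Sutcliffe Efficiency: 1 minus residual variance over observed variance."""
    mean_m = _mean(obs)
    residual = sum((s - m) ** 2 for s, m in zip(sim, obs))
    return 1.0 - residual / sum((m - mean_m) ** 2 for m in obs)

def rmse(sim, obs):
    """Root-Mean-Square Error, in the units of the input series."""
    return math.sqrt(sum((s - m) ** 2 for s, m in zip(sim, obs)) / len(obs))

def pbias(sim, obs):
    """Percent Bias: positive when the simulation overestimates."""
    return 100.0 * sum(s - m for s, m in zip(sim, obs)) / sum(obs)

measured = [1.0, 2.0, 3.0, 4.0]    # illustrative discharges, m3/s
simulated = [2.0, 3.0, 4.0, 5.0]   # constant overestimation of 1 m3/s
scores = (kge(simulated, measured), nse(simulated, measured),
          rmse(simulated, measured), pbias(simulated, measured))
```

In this example the constant offset leaves the correlation and variability terms of the KGE untouched, so the whole penalty comes from the bias term.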