Downscaling Regional Hydrological Forecast for Operational Use in Local Early Warning: HYPE Models in the Sirba River

Abstract: In recent decades, owing to the dramatic increase in flood frequency and magnitude, floods have become a crucial problem in West Africa. National and international authorities concentrate efforts on developing early warning systems (EWS) to deliver flood alerts and prevent loss of lives and damages. Usually, regional EWS are based on hydrological modeling, while local EWS adopt field observations. This study aims to integrate outputs from two regional hydrological models, Niger HYPE (NH) and World-Wide HYPE (WWH), in a local EWS developed for the Sirba River. The Sirba is the major tributary of the Middle Niger River Basin and has been supported by a local EWS since June 2019. Model evaluation indices were computed with 5-day forecasts, demonstrating a better performance of NH (Nash–Sutcliffe efficiency NSE = 0.58) than WWH (NSE = 0.10) and the need for output optimization. The optimization, conducted with a linear regression post-processing technique, improves performance significantly to “very good” for NH (Heidke skill score HSS = 0.53) and “good” for WWH (HSS = 0.28). HYPE outputs make it possible to extend the local EWS warning lead time up to 10 days. Since the informatic transfer environment is not yet a mature operational system, 10–20% of forecasts were not produced in 2019, impacting operational availability.


Introduction
Recent changes in hydrological processes resulted in a dramatic increase of flood events in West Africa [1,2]. Several authors documented such changes and particularly the Sahelian paradox observed in South-West Niger [3,4]. Hydrological studies over the last 20 years in the region show a runoff decrease in the Sudano-Guinean catchments and an increase in the Sahelian ones [5][6][7]. The causes of hydrological changes in the Sahelian area are highly debated among scientists; rather than from a single factor, they probably derive from the interaction of several drivers [8]. Among them, the most acknowledged are the recent recovery of rainfall in the Sahel, even if below the pre-1970 levels, the increased occurrence of extreme rainfall events, and the strong soil and vegetation degradation [9][10][11][12]. The reduced water-holding capacity of the soil leads to greater and faster runoff, increasing both erosion and river flows [3].
In the last decade, the increase, both in terms of frequency and intensity [13], of extreme flood events together with the demographic growth have produced significant and unprecedented damage to populations [14]. These catastrophic events prompted governments to seek support from the international community to develop early warning systems (EWS) aiming to increase preparedness of local communities.
Some experiences of EWSs developed with a community-based approach have been documented in developing countries, where communities are active participants in the design, monitoring and management of the EWS [15][16][17][18][19]. EWSs, as defined by the United Nations International Strategy for Disaster Reduction, are based on four pillars: (1) risk knowledge, (2) risk monitoring and warning, (3) risk information dissemination and communication, and (4) response capacity. Among these pillars, risk monitoring and warning has been addressed in recent years at global [20][21][22], continental [23,24] or river basin [25] levels by developing hydrological models that allow lead times much longer than those obtainable with local observations [15]. Despite these improvements, in situ measured hydrological data are often considered more reliable than hydrological model outputs for operational applications, particularly if coupled with hydraulic propagation models [26]. Nevertheless, EWSs based only on in situ measurements are limited in forecasting time (propagation time limited to a few days), timeliness (lack of availability of real-time observations) and coverage (low density of gauging stations), being able to save lives but not assets or livelihoods [27].
The integration of hydrological modeling with in situ measurements and hydraulic modeling has recently been successfully tested in Niger on the Sirba River [28]. In such poorly gauged basins, where hydrological models cannot be adequately calibrated, hydrological forecasts are nevertheless useful for the timely activation of pre-warning tasks, while operational warnings are delivered when upstream field observations exceed thresholds linked to flood scenarios developed with ad hoc hydraulic models [29]. The longer lead times ensured by hydrological forecasts underline the importance of downscaling and optimizing model outputs to improve performance and ensure a higher reliability of local early warning systems [30].
The objective of this study is to downscale, optimize and evaluate hydrological forecasts from the FANFAR system [31], specifically provided by the models Niger HYPE 2.23 (NH) and World-Wide HYPE 1.3.6 (WWH), for the application in local EWS of the Sirba River.
This work is structured as follows: Section 2 focuses on the study area, hydrology, EWS and hydrological model features. Section 3 defines the methodology adopted. Sections 4 and 5 describe the results and discuss the importance of the research. Section 6 contains the conclusions and introduces some possible future perspectives of this study.

Study Area
The Sirba River is the main tributary of the Middle Niger River Basin, covering an area of 39,000 km² shared between Burkina Faso and Niger. The morphology of the basin is characterized by very slight height variation (from 444 m a.s.l. to 181 m a.s.l.) and lies on a granitic substrate. The climate is Sahelian with a unimodal distribution of rainfall from June to September, averaging from 400 mm/year in the northern part to 700 mm/year in the southern one [32]. Rainfall distribution along the rainy season is inhomogeneous, with persistent dry spells and extreme rainfall events [33] strongly affecting the hydrology of the Sirba River [13].
Within the Sirba basin, the study area covers the Nigerien reach of 108 km from the Burkina-Niger border to the confluence with the Niger River (Figure 1). In the last 20 years, the riverine communities (97 settlements with 61,703 people, belonging to 7732 households) have been repeatedly affected by Sirba floods [26,34,35]. The flood events caused tremendous damage to livelihoods (mainly related to family subsistence agriculture), infrastructure and private buildings (mainly clay houses).
The Sirba River is gauged with some hydrometers in Burkina Faso, including three major stations (Bassieri, Liptougou and Sebba) on the three tributaries (Koulouko, Faga and Yali rivers), and two stations in Niger on the main watercourse (Garbey Kourou and Bossey Bangou). Garbey Kourou (GK) is located 8 km upstream of the Sirba-Niger confluence and was installed in 1956 for the Sirba flow characterization, while Bossey Bangou (BB) was installed in 2018 a few kilometers downstream of the Niger-Burkina border by the ANADIA2.0 Project to ensure EWS tasks. Both Nigerien stations are managed by the General Directorate for Water Resources and are equipped with staff gauges and pressure devices (SEBA PS Light-2 ®) sending data at an hourly time step via GSM.
Figure 1. Geographical framework of the study area: the Sirba River Basin with the hydrometers, the main hydrographic network, the administrative boundaries, and the sub-basins of NH (Niger HYPE) and WWH (World-Wide HYPE) models. Bottom right: West Africa overview with NH and WWH covered surface.

Sirba Hydrology and Observed Flow Series
Sirba is a classical Sahelian river, dry for 7-8 months and rapidly activated by the first rains in June. Flood peaks generated by rainfall events occur generally between August and September.
The hydrological behavior of the Sirba River has been recently analyzed using the discharge time series of the Garbey Kourou gauging station from its installation in 1956 until 2019. The analysis of the annual maximum discharge reveals a clearly non-stationary distribution with a clear positive trend [13,25]. In accordance with the changes in the rainfall time series [33], two changepoints have been detected in the annual maximum trend: 1968 (the end of the wet period) and 1989 (the end of the dry period). A third breakpoint was detected in 2008, starting a new period of extreme floods [13]. This change is in line with the increased occurrence of extreme daily rainfall [3,9]. The mean hydrographs (Figure 2) clearly show the deep changes in the Sirba hydrology over the four hydrologic periods. The first period (1956-1968), usually considered the reference period, shows a flow season with a maximum flow of about 180 m³/s in September and a mean flow of 26 m³/s. The second period (1969-1989) clearly shows the effects of reduced rainfall in a lower mean annual flow (23 m³/s) and in an earlier flow season that reaches its flow peak (130 m³/s) in August. The third period (1990-2008) reveals the effects of the "Sahelian Paradox" [4] in the Sirba Basin, with peaks (220 m³/s) and mean flow (47 m³/s) higher than in the first period. The fourth period (2009-2019) highlights a completely different hydrologic behavior of the Sirba River, with a mean flow that doubled (100 m³/s) and an unprecedented annual maximum flow (550 m³/s).
Water 2020, 12, x FOR PEER REVIEW 4 of 21

SLAPIS: the Sirba EWS
SLAPIS is an integrated flood EWS aiming to promote decision-making and behavioral shift from reactive to proactive action at several levels, from the community to the administration. It addresses all four EWS components [36], while also being community and impact-based [28]. Risk knowledge is addressed through participatory risk assessment and identification of flood hazard scenarios and potential impacts [29] connecting the available technical capabilities with the local level through a participatory approach. The monitoring and warning component integrates existing hydrological models, real time measurements and qualitative observations into hydrological warnings based on four color-coded classes [26] bridging the gap between top-down and bottom-up approaches. Moreover, the integration of hydrological forecasts and observations with the community monitoring and preparedness system provides a lead time suitable for operational decision-making at national and local levels. The dissemination component was accomplished with the national alert code, involving stakeholders from national to local level and building on multiple communication channels [28]. Response capability builds on the existing local volunteer system, local participatory adaptation, and contingency planning [29]. This allows the beneficiaries to define the rules of the whole system, strengthening their ability to understand the information and react.
Concerning the monitoring and warning component in particular, two hydrological models have been considered for the EWS development: GloFAS and HYPE. GloFAS is a probabilistic global hydrological model developed by the European Commission Joint Research Centre. GloFAS version 2.1 has been optimized on the Sirba River by applying correction factors to model outputs because of the poor reliability of the original forecasts in local application. The optimization ensured a substantial improvement in forecast accuracy (particularly the shape of the forecasted hydrograph rather than its intensity) [30], allowing optimized GloFAS 2.1 to be used in the EWS platform for the Sirba River for pre-warning, while warnings are only sent using the in situ measurements.
SLAPIS is supported by an information system designed to collect observed data as well as forecast data, feed the optimization models and disseminate results to be used in the local EWS. SLAPIS has been operational since June 2019 at www.slapis-niger.org [37].

The FANFAR System, HYPE Models and Forecasted Streamflow
FANFAR is a regional flood forecasting and alert system at West African scale [31]. It is motivated by the increase in flood challenges not only in the Sirba River, but throughout West Africa in recent years. FANFAR stems from ten years of cooperation between West African and European scientists and practitioners. The system is built and continuously refined through a participatory co-design, co-development, and co-operation process involving more than 30 national and regional organizations from 17 countries (Benin, Burkina Faso, Cape Verde, Chad, Gambia, Ghana, Guinea, Guinea Bissau, Ivory Coast, Liberia, Mali, Mauritania, Niger, Nigeria, Senegal, Sierra Leone, and Togo) [38].
FANFAR is a pilot operational system providing openly accessible hydrological forecasts with a 10-day outlook at catchment resolution at https://fanfar.eu/ [39]. The system employs a daily forecasting chain centered on the two hydrological models Niger HYPE (NH) [40] and World-Wide HYPE (WWH) [22], as well as tailored versions of these models [41]. Both models are based on the HYPE open-source code (https://hypeweb.smhi.se/model-water/) but differ substantially in their setup and characteristics. The NH model was set up and calibrated to simulate daily hydrological dynamics of the Niger River basin and incorporates tailored components for specific hydrological features important in the region (the Inner Niger Delta and soil flow paths adapted to represent Sahelian conditions). The WWH, on the other hand, was developed to represent global water dynamics at monthly resolution, without specific tailoring to any region. The NH and WWH models are driven by the HydroGFD meteorological data set, which is a combination of global operational forecast models, reanalysis models and gridded observations [42,43]. The simulated streamflow representing Garbey Kourou and Bossey Bangou in each model was used. Bossey Bangou was, however, not available in NH, and its streamflow was therefore approximated with the inflows of the upstream sub-basins. More details on the models and data used are provided in Table 1.

Hazard Thresholds
The Sirba River is characterized by hazard thresholds from the SLAPIS EWS and from the two hydrological models (NH and WWH) covering the basin. The thresholds share the three-level scale (frequent, severe, and catastrophic floods) and the colors of the scenarios (yellow, orange, and red) but differ in return period and flow rate.
For the hydrological models (Niger HYPE and World-Wide HYPE), the flood hazard and severity levels are determined through a threshold exceedance approach. The thresholds are currently based on exceeding the 2-, 5-, and 30-year return period magnitudes, determined individually for each sub-basin using extreme value analysis on a long historical simulation with each model version [25]. The SLAPIS hazard thresholds were defined according to both the observed flows and the field effects on the main riverine communities [26]. From the hydrologic point of view, these thresholds were related to: (1) the mean flow hydrology (flow duration curves), (2) the return period analysis based on annual maxima and (3) the non-stationary return period analysis according to the hydrologic changes [44]. The specific thresholds used in SLAPIS and FANFAR at Garbey Kourou are presented in Table 2.
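As an illustration of the threshold exceedance approach, the following minimal Python sketch assigns a color class to a flow value. Only the yellow SLAPIS threshold (600 m³/s) is quoted in the text; the orange and red values, the name of the no-alert class ("green"), the choice to count a flow equal to the threshold as an exceedance, and the function name are hypothetical placeholders. The actual values are those of Table 2.

```python
from bisect import bisect_right

# Yellow (600 m3/s) is the SLAPIS value quoted in the text.
# Orange and red are HYPOTHETICAL placeholders, not the values of Table 2.
THRESHOLDS_M3S = [600.0, 800.0, 1500.0]
# "green" (no alert) is an assumed name for the class below the yellow threshold.
LEVELS = ["green", "yellow", "orange", "red"]


def hazard_level(flow_m3s, thresholds=THRESHOLDS_M3S):
    """Return the color class of the highest threshold reached by the flow."""
    # bisect_right counts how many thresholds are <= flow_m3s
    return LEVELS[bisect_right(thresholds, flow_m3s)]


if __name__ == "__main__":
    for flow in (450.0, 600.0, 1000.0, 2000.0):
        print(flow, hazard_level(flow))
```

The same function can be applied per sub-basin by passing a different `thresholds` list, which is how a return-period-based scheme with basin-specific magnitudes could be accommodated.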

Methods
The methodology applied in this study consists of three steps: (1) an evaluation of the original forecast performances, (2) an optimization of model outputs to improve forecast reliability and (3) an evaluation of optimized forecasts to quantify the improvements reached in the post processing procedures [30].
Other methodologies adopted in this research relate to the hydraulic and informatic domains. Hydraulic modeling was fundamental for computing the flood wave translation needed to define the forecast at Bossey Bangou from the upstream sub-basins, according to the results of Massazza et al. (2019) [26]. Since NH does not have a sub-basin at BB, this forecast was built as the sum of the outputs of the three upstream tributaries, which cover 97% of the upstream basin. According to the channel length and the flood wave velocity of 0.93 m/s [26], forecasts were shifted by one day (flood wave translation times in the three tributaries equal to 19.4, 23.6 and 23.8 h). Informatic procedures were used to download the HYPE forecasts daily and to visualize them in the SLAPIS platform after post-processing according to the optimization results. The SLAPIS architecture was developed based on several open-source technologies and software components implemented following service-oriented architecture paradigms [45] and international standards for geographical information management.
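The construction of the Bossey Bangou forecast from the three tributaries can be sketched as follows. This is a minimal illustration, not the procedure implemented in SLAPIS: the wave velocity and translation times are those quoted above, while the function names, the rounding of the translation time to whole days and the sample flows are assumptions. The reach lengths are not given in the text and are only back-computed from time and velocity.

```python
WAVE_VELOCITY_MS = 0.93                 # flood wave velocity quoted in the text [26]
TRANSLATION_HOURS = [19.4, 23.6, 23.8]  # translation times of the three tributaries


def reach_length_km(hours, velocity_ms=WAVE_VELOCITY_MS):
    """Channel length implied by a translation time (derived, not from the text)."""
    return hours * 3600.0 * velocity_ms / 1000.0


def lag_days(hours):
    """Round a translation time to whole days of the daily forecast step (assumption)."""
    return int(round(hours / 24.0))


def forecast_at_outlet(tributary_flows, lags):
    """Sum daily tributary series after delaying each one by its lag in days.

    Days for which at least one delayed series has no value are returned as None.
    """
    n = len(tributary_flows[0])
    total = [0.0] * n
    for series, lag in zip(tributary_flows, lags):
        for day in range(lag, n):
            total[day] += series[day - lag]
    start = max(lags)
    return [None] * start + total[start:]


if __name__ == "__main__":
    lags = [lag_days(h) for h in TRANSLATION_HOURS]
    print(lags)  # each translation time rounds to a one-day shift
    # Synthetic flows (m3/s), for illustration only
    flows = [[10.0, 20.0, 30.0], [1.0, 2.0, 3.0], [5.0, 5.0, 5.0]]
    print(forecast_at_outlet(flows, lags))
```

With the quoted translation times all three lags round to one day, which is consistent with the one-day shift adopted in the study.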

Model Evaluation
The model evaluation is based on continuous and categorical indices and operational availability of the forecast.
Continuous indices are used to evaluate the fit between observed and forecasted flows (Table 3). The discharge was analyzed as a continuous variable with three different indices: (1) the root mean square error (RMSE), which identifies the mean gap, in absolute value, between forecasts and observations, (2) the RMSE-observations standard deviation ratio (RSR), which normalizes the RMSE by the standard deviation of the observations, and (3) the Nash-Sutcliffe efficiency (NSE), a normalized statistic that determines the relative magnitude of the residual variance compared to the measured data variance [46][47][48].
Table 3. Formula, best and worst values and unit of measure (UM) for continuous indices, where T is the number of observed/forecasted days, F is the forecasted flow, O is the observed flow and N is the total number of data.
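The three continuous indices can be computed as in the following sketch. It assumes the commonly used definitions (RMSE as the square root of the mean squared error, RSR as the ratio between the root of the squared error sum and the root of the squared deviations of the observations from their mean, and NSE as one minus the ratio of those two sums). The function names and the sample values are illustrative and are not taken from the study.

```python
import math


def rmse(obs, fc):
    """Root mean square error, in the same unit as the flows (m3/s)."""
    return math.sqrt(sum((f - o) ** 2 for o, f in zip(obs, fc)) / len(obs))


def _sums(obs, fc):
    """Return (sum of squared errors, sum of squared deviations of obs from its mean)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - f) ** 2 for o, f in zip(obs, fc))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return sse, sst


def rsr(obs, fc):
    """RMSE divided by the standard deviation of the observations (best value 0)."""
    sse, sst = _sums(obs, fc)
    return math.sqrt(sse) / math.sqrt(sst)


def nse(obs, fc):
    """Nash-Sutcliffe efficiency (best value 1, unbounded below)."""
    sse, sst = _sums(obs, fc)
    return 1.0 - sse / sst


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0]      # synthetic values, illustration only
    forecasted = [2.0, 2.0, 3.0, 3.0]
    print(rmse(observed, forecasted), rsr(observed, forecasted), nse(observed, forecasted))
```

Under these definitions NSE equals 1 minus the square of RSR, so the two dimensionless indices carry the same information on different scales, while RMSE keeps the unit of the discharge.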

Categorical indices are used to evaluate the reliability of the forecast in terms of predicting flood events. The discharge is compared to a specific threshold and becomes a dichotomous variable of type "yes" or "no" referring to the specific "threshold exceeding" event. The contingency table of Figure 3 identifies the number of "a = hits", "b = false alarms", "c = misses", and "d = correct negatives" [20,46,49]. Usually, the thresholds are defined according to the hazard thresholds used for the EWS (Table 2). However, since the yellow threshold for SLAPIS (600 m³/s) was never exceeded in the 2019 hydrological year, a flow of 450 m³/s was adopted for the computation of the categorical indices in order to analyze a sufficient number of events. This discharge was exceeded on 18 days at Bossey Bangou and on 16 days at Garbey Kourou in the hydrological year 2019, compared with the 15 days defined for the yellow threshold in the mean hydrologic year [26].
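The construction of the contingency table and of one categorical index can be sketched as follows. The Heidke skill score (HSS) is the categorical index reported in the abstract; its formula below is the standard one for a 2x2 table. The strict "greater than" comparison, the function names and the sample flows are assumptions made for illustration.

```python
def contingency(obs, fc, threshold):
    """Count hits (a), false alarms (b), misses (c) and correct negatives (d)."""
    a = b = c = d = 0
    for o, f in zip(obs, fc):
        observed_yes = o > threshold   # strict exceedance is an assumption
        forecast_yes = f > threshold
        if forecast_yes and observed_yes:
            a += 1
        elif forecast_yes and not observed_yes:
            b += 1
        elif observed_yes:
            c += 1
        else:
            d += 1
    return a, b, c, d


def heidke_skill_score(a, b, c, d):
    """Standard HSS for a 2x2 table: 1 is perfect, 0 is no skill over chance."""
    denominator = (a + c) * (c + d) + (a + b) * (b + d)
    if denominator == 0:
        return float("nan")
    return 2.0 * (a * d - b * c) / denominator


if __name__ == "__main__":
    # Synthetic daily flows (m3/s), illustration only; 450 m3/s as in the text
    observed = [500.0, 400.0, 460.0, 300.0, 470.0, 200.0]
    forecasted = [480.0, 460.0, 400.0, 100.0, 455.0, 440.0]
    table = contingency(observed, forecasted, 450.0)
    print(table, heidke_skill_score(*table))
```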
Operational availability is an index used to quantify how many forecasts were produced and made available by 23:59 UTC on each issue date. Since the SLAPIS EWS is required to publish a new forecast every day and the hydrological forecasts come from another system, it is very important that forecasts are consistently delivered in the same format and with the same timing so that they can be used correctly. This index is simply a measure of forecast integrity, which could be limited by problems at the production level, in the informatic supply chain or at the user level [46]. The score is computed as the ratio of available forecasts to the total number of forecasts, expressed as a percentage.
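A minimal sketch of the operational availability score is given below. It assumes that, for each issue date, the delivery time of the forecast is known as a timezone-aware timestamp (or is missing when the forecast was never produced), and that a delivery at exactly 23:59 UTC still counts as on time. The function name and data layout are hypothetical.

```python
from datetime import date, datetime, time, timezone


def operational_availability(issue_dates, delivered_at):
    """Percentage of issue dates whose forecast was available by 23:59 UTC.

    issue_dates  -- list of datetime.date, one per expected forecast
    delivered_at -- dict mapping an issue date to the aware datetime of delivery;
                    dates without a forecast are simply absent from the dict
    """
    on_time = 0
    for issue_date in issue_dates:
        deadline = datetime.combine(issue_date, time(23, 59), tzinfo=timezone.utc)
        delivery = delivered_at.get(issue_date)
        if delivery is not None and delivery <= deadline:
            on_time += 1
    return 100.0 * on_time / len(issue_dates)


if __name__ == "__main__":
    days = [date(2019, 8, k) for k in range(1, 5)]
    deliveries = {
        days[0]: datetime(2019, 8, 1, 8, 0, tzinfo=timezone.utc),    # on time
        days[1]: datetime(2019, 8, 3, 1, 0, tzinfo=timezone.utc),    # late
        days[3]: datetime(2019, 8, 4, 23, 59, tzinfo=timezone.utc),  # on time
    }
    print(operational_availability(days, deliveries))
```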

Model Optimization
Optimization procedures are of fundamental importance for adapting results to boundary conditions in many different domains [50,51]. In hydrology, optimization is usually conducted by varying the sensitive model parameters within a feasible domain to reduce the gap between forecasted and observed flows [52][53][54]. In this case study, because of the need for local optimization of hydrological outputs, the authors chose to adapt the outputs in a post-processing procedure, in order to improve them at the local level without affecting forecast reliability outside the study area. The optimization was therefore developed from the user's rather than the developer's point of view. Moreover, the optimization process was hydrologically oriented: homogeneous periods of the wet season were analyzed separately [28].
The hydrological year was divided into twelve periods according to river hydrology [13]. The dry season (November to May) was treated as a single interval. The low-flow months (June and October) were considered with a monthly time frame, whereas the months with medium-high flows (July, August, and September) were optimized with a ten-day time frame. The probability of flood occurrence was evaluated on the last hydrological period (2009-2019). Flood days were identified as the days on which the mean discharge exceeded the yellow threshold (600 m³/s). This analysis shows that all flood events occurred between 21 July and 30 September, and that 92.4% of threshold exceedances fall between 1 August and 20 September (Table 5). The optimization process was conducted as a volumetric scaling with a linear regression technique based on six different equations (polynomial functions of the 1st to 5th degree and a logarithmic function). The coefficients of the linear regression models were estimated for each period with the ordinary least squares (OLS) method based on 5-day forecasts [55]. The objective function is the maximization of R² between the observed flow and the forecast optimized with the linear regression model [28]. The coefficients respect the Newey-West statistical tests for heteroskedasticity and autocorrelation and are significantly different from zero at the 0.05 level [48,56]. R² reaches values between 0.29 and 0.84 (mean = 0.48) for WWH and between 0.26 and 0.68 (mean = 0.37) for the NH model. The R² values are affected by the scattered forecasted values originating from the regional models and by the rejection of the statistical tests.
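The selection among the six regression equations can be sketched as follows with NumPy. This is a simplified illustration under stated assumptions, not the authors' implementation: it fits the five polynomials and the logarithmic function by least squares and keeps the one with the highest R², but it does not perform the Newey-West based significance tests mentioned above, and it skips the logarithmic candidate when the forecast contains zero or negative values. Function names and sample data are hypothetical.

```python
import numpy as np


def r_squared(observed, predicted):
    """Coefficient of determination between observed and corrected flows."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_candidates(forecast, observed):
    """Fit polynomial (degree 1-5) and logarithmic corrections; return the best by R2."""
    x = np.asarray(forecast, dtype=float)
    y = np.asarray(observed, dtype=float)
    models = {}
    for degree in range(1, 6):
        coeffs = np.polyfit(x, y, degree)  # least-squares polynomial fit
        models[f"poly{degree}"] = lambda v, c=coeffs: np.polyval(c, v)
    if np.all(x > 0):  # the logarithm is undefined for zero forecasts (assumption: skip)
        slope, intercept = np.polyfit(np.log(x), y, 1)
        models["log"] = lambda v, a=slope, b=intercept: a * np.log(v) + b
    scores = {name: r_squared(y, model(x)) for name, model in models.items()}
    best = max(scores, key=scores.get)
    return best, models[best], scores


if __name__ == "__main__":
    # Synthetic example: observations follow a logarithmic function of the forecast
    raw_forecast = np.arange(1.0, 9.0)
    observed = 3.0 * np.log(raw_forecast) + 2.0
    name, correct, scores = fit_candidates(raw_forecast, observed)
    print(name, float(correct(4.0)))
```

Because the polynomials are nested, the in-sample R² cannot decrease with the degree, so a selection based on R² alone tends to favor the highest degree. The significance tests on the coefficients reported in the study act as the counterweight that this sketch leaves out.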
The optimization was conducted on the reference 5-day forecast. This forecast horizon was chosen because it is the middle of the forecast range and is sufficiently long to activate the pre-warning tasks in the field [29]. For the optimization, the dataset was divided into two parts, the first for training and the second for validation. To maximize the training period, all years except the last were used for training (2016-2018 for NH and 2017-2018 for WWH), and the forecast validation, made with the same indices of Section 3.1, was conducted on the hydrological year 2019.
In the following section, the acronyms NH, WWH, NH_OTT and WWH_OTT are used to indicate the original and optimized versions of the forecasts.
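The division into twelve optimization periods and the training/validation split can be sketched as below. The period labels, the record layout and the function names are hypothetical. The boundaries of the ten-day frames (days 1-10, 11-20 and 21 to the end of the month) are an assumption consistent with the dates quoted in the text, and each record is assumed to already carry the hydrological year it belongs to.

```python
from datetime import date, timedelta


def period_of(day):
    """Map a calendar day to one of the twelve optimization periods."""
    month = day.month
    if month in (11, 12, 1, 2, 3, 4, 5):
        return "dry"                                   # single dry-season interval
    if month in (6, 10):
        return f"month-{month:02d}"                    # low-flow months, monthly frame
    dekad = min((day.day - 1) // 10, 2) + 1            # 1-10, 11-20, 21-end of month
    return f"month-{month:02d}-dekad-{dekad}"          # July-September, ten-day frame


def split_records(records, validation_year=2019):
    """Use every hydrological year before the validation year for training."""
    training = [r for r in records if r["hydro_year"] < validation_year]
    validation = [r for r in records if r["hydro_year"] == validation_year]
    return training, validation


if __name__ == "__main__":
    all_days = [date(2019, 1, 1) + timedelta(days=k) for k in range(365)]
    print(sorted({period_of(d) for d in all_days}))
    records = [{"hydro_year": y, "day": date(y, 8, 15)} for y in (2017, 2018, 2019)]
    train, valid = split_records(records)
    print(len(train), len(valid))
```

A separate regression would then be fitted on the training records of each period and applied to the validation records falling in the same period.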

Forecast Performances
The results consider the observed flow and the four forecasted series (NH, WWH, NH_OTT and WWH_OTT) at a 5-day forecast horizon.
The results of Table 6 and Figures 4-7 show the performance for the GK station. The results for BB are not shown since: (1) the short flow time series of BB does not allow a sufficient background and (2) these results are quite closer for BB and GK. The results of Table 7 instead show the mean performances in the two gauging stations to evaluate the overall performances of hydrological forecasts on the Sirba EWS that present some differences in the two stations. Tables and figures with the overall results are presented for completeness in Appendix A.  The hydrographs with the NH forecast demonstrate that the magnitude of forecasted flow is quite close to the observed ones ( Figure 5). The forecasts do not present strong outliers and well identify the flow peaks in the hydrologic year. Generally, forecasts demonstrate a good skill before optimization and improved skill after optimization. In NH, the influence of calibration at the Garbey Kourou station [25] is very clear. The forecast performance at Bossey Bangou has lower quality (Appendix Figure A2).  The hydrographs with the NH forecast demonstrate that the magnitude of forecasted flow is quite close to the observed ones ( Figure 5). The forecasts do not present strong outliers and well identify the flow peaks in the hydrologic year. Generally, forecasts demonstrate a good skill before optimization and improved skill after optimization. In NH, the influence of calibration at the Garbey Kourou station [25] is very clear. The forecast performance at Bossey Bangou has lower quality (Appendix Figure A2). The hydrographs with WWH forecasts clearly show that original forecasts are able to represent the annual flow cycle, especially for the dry season, but not the flow magnitude ( Figure 6). The WWHOTT demonstrates a strongly improved correlation with the observed hydrograph and both timing and magnitude of flood peaks are correctly identified. 
The gap between WWH and observed  The hydrograph of the 2019 validation wet season well highlights strengths and weaknesses of the four compared forecasts (Figure 7): (1) WWH forecasts for July and the first part of August are mainly zeros and the optimized ones are quite overestimated, (2) NH wet season sufficiently well reproduce the GK flows but is shifted about two weeks forward, (3) the optimized forecasted flows correctly identify the annual maximums between the middle of August and the middle of September even if the WWHOTT in the last period of August is quite underestimated and, (4) even if the NHOTT performs well, it's affected by high levels of missing data (i.e., low operational capacity). This is a major limitation of the present FANFAR system, a result of the fact that it is still a pilot system and not yet a production-grade system. flows is related to the setup and goal of WWH (calibrated with a global focus and a monthly resolution for water balance analysis), without the more tailored calibration applied in NH. The hydrograph of the 2019 validation wet season well highlights strengths and weaknesses of the four compared forecasts (Figure 7): (1) WWH forecasts for July and the first part of August are mainly zeros and the optimized ones are quite overestimated, (2) NH wet season sufficiently well reproduce the GK flows but is shifted about two weeks forward, (3) the optimized forecasted flows correctly identify the annual maximums between the middle of August and the middle of September even if the WWHOTT in the last period of August is quite underestimated and, (4) even if the NHOTT performs well, it's affected by high levels of missing data (i.e., low operational capacity). This is a major limitation of the present FANFAR system, a result of the fact that it is still a pilot system and not yet a production-grade system.   
The preliminary analysis based on the basic statistical parameters demonstrates that, while NH correctly identifies the magnitude of the minimum, mean and maximum flows, the WWH values are quite underestimated (Table 6). After the optimization process, the parameters show that both NH OTT and WWH OTT overestimate the observed flows. The behavior differs between the two models: NH OTT overestimates the maximum more than the mean, whereas WWH OTT does the opposite.
Flow duration curves (Figure 4) confirm the preliminary results and emphasize some interesting points: (1) all the forecasts simulate the dry period of the river quite well, even though WWH zeros start from day 100 and NH is not able to produce zero values; (2) WWH forecasts do not demonstrate a clear relation with the observed river flow; (3) NH, even though it has the same mean as the observed flow, overestimates the very high (Q1-Q10) and low flows (Q80-Q150) and underestimates the medium flows (Q10-Q80); (4) NH OTT is quite close to the observed flow until day 75 and overestimates the low flows; and (5) WWH OTT strongly overestimates all the observed flows.
The hydrographs with the NH forecast demonstrate that the magnitude of the forecasted flows is quite close to that of the observed ones (Figure 5). The forecasts do not present strong outliers and identify the flow peaks in the hydrologic year well. Generally, the forecasts demonstrate good skill before optimization and improved skill after optimization. In NH, the influence of calibration at the Garbey Kourou station [25] is very clear. The forecast performance at Bossey Bangou is of lower quality (Appendix A, Figure A2).
The hydrographs with WWH forecasts clearly show that the original forecasts are able to represent the annual flow cycle, especially for the dry season, but not the flow magnitude (Figure 6). WWH OTT demonstrates a strongly improved correlation with the observed hydrograph, and both the timing and the magnitude of flood peaks are correctly identified. The gap between WWH and observed flows is related to the setup and goal of WWH (calibrated with a global focus and at a monthly resolution for water balance analysis), without the more tailored calibration applied in NH.
The hydrograph of the 2019 validation wet season highlights the strengths and weaknesses of the four compared forecasts (Figure 7): (1) WWH forecasts for July and the first part of August are mainly zeros, and the optimized ones are quite overestimated; (2) NH reproduces the GK wet-season flows sufficiently well but is shifted about two weeks forward; (3) the optimized forecasts correctly identify the annual maxima between the middle of August and the middle of September, even if WWH OTT is quite underestimated in the last part of August; and (4) even if NH OTT performs well, it is affected by high levels of missing data (i.e., low operational capacity). This is a major limitation of the present FANFAR system, a result of the fact that it is still a pilot system and not yet a production-grade system.
Continuous indices highlight the capability of the hydrological model to correctly forecast the flow over the whole hydrologic year. RMSE measures the mean gap (in absolute value) between observed and forecasted flow, and it demonstrates the better performance of NH and the importance of the optimization process (Table 7). RSR and NSE are normalized indices that, starting from RMSE, are useful to evaluate the performance against reference values. According to the classification of Moriasi et al. 2007 [49], which identifies five levels (bad, unsatisfactory, satisfactory, good and very good): (1) NH performance is satisfactory before and very good after the optimization and (2) WWH performance is bad (RSR) or unsatisfactory (NSE) for the original version and good after the optimization process.
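As an illustration of how the three continuous indices relate to each other, the following minimal Python sketch computes RMSE, RSR and NSE from paired observed and forecasted flows. It assumes the standard formulations (RSR as RMSE divided by the standard deviation of the observations, NSE as one minus the ratio of squared errors to squared deviations from the observed mean); the function name `continuous_indices` and the sample values are hypothetical and are not taken from the study.

```python
import math


def continuous_indices(observed, forecast):
    """Return (RMSE, RSR, NSE) for paired observed/forecast flows.

    Standard textbook definitions are assumed here; they may differ in
    detail from the formulas used in the study.
    """
    if len(observed) != len(forecast) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    n = len(observed)
    mean_obs = sum(observed) / n
    # Sum of squared forecast errors.
    sq_err = sum((o - f) ** 2 for o, f in zip(observed, forecast))
    # Sum of squared deviations of the observations from their mean.
    sq_dev = sum((o - mean_obs) ** 2 for o in observed)
    if sq_dev == 0:
        raise ValueError("observed series has zero variance")
    rmse = math.sqrt(sq_err / n)
    rsr = math.sqrt(sq_err / sq_dev)
    nse = 1.0 - sq_err / sq_dev
    return rmse, rsr, nse


# Made-up flows in m3/s, for illustration only.
obs = [10.0, 20.0, 30.0, 40.0]
fcst = [12.0, 18.0, 33.0, 37.0]
rmse, rsr, nse = continuous_indices(obs, fcst)
print(f"RMSE={rmse:.2f}  RSR={rsr:.3f}  NSE={nse:.3f}")
```

Under these definitions the square of RSR equals 1 - NSE, so the two normalized indices rank forecasts identically and differ only in the class boundaries applied to them.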
Categorical indices are fundamental to highlight the ability to correctly identify flow values above the threshold used to distinguish flood from normal values. BIAS shows that WWH is not able to forecast flood values, that NH forecasts underestimate (BIAS < 1) and that optimized forecasts overestimate (BIAS > 1) streamflow. POD and FAR are the fundamental values to identify the forecasting capability. HYPE forecasts demonstrate a quite good POD but a too high FAR, resulting in forecasts that are good but not completely reliable. Optimization raises POD (with similar performance for NH OTT and WWH OTT) and reduces FAR for both models. On this point, it is important to highlight that the values in Table 7 are the mean of the Garbey Kourou and Bossey Bangou performances (complete values are reported in Appendix A). The difference between the stations is clearly noticeable for the NH OTT POD and FAR, which derive from excellent values at GK (POD = 0.85 and FAR = 0.35) and bad values at BB (POD = 0.4 and FAR = 0.65). PC values are quite constant for NH before and after the optimization and decrease for WWH because of the high number of over-forecasts. TS and HSS jointly consider POD and FAR, emphasizing the better performance of NH compared to WWH. In wider terms, from both the continuous and the categorical points of view, the optimization process significantly reduces the gap between forecasts and observations, confirming previous results in the literature [30,57].
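The categorical indices can be sketched from a 2x2 contingency table of forecasted versus observed threshold exceedances. The Python example below assumes the standard forecast-verification definitions based on hits, false alarms, misses and correct negatives; the function name, the sample flows and the threshold of 30 are illustrative assumptions, not values from the study.

```python
def categorical_indices(observed, forecast, threshold):
    """Contingency-table scores for exceedance of a flow threshold.

    Standard definitions are assumed: a = hits, b = false alarms,
    c = misses, d = correct negatives.
    """
    a = b = c = d = 0
    for o, f in zip(observed, forecast):
        obs_event, fc_event = o > threshold, f > threshold
        if obs_event and fc_event:
            a += 1
        elif fc_event:
            b += 1
        elif obs_event:
            c += 1
        else:
            d += 1

    def ratio(num, den):
        # Undefined scores (zero denominator) are reported as NaN.
        return num / den if den else float("nan")

    n = a + b + c + d
    return {
        "BIAS": ratio(a + b, a + c),   # forecast events / observed events
        "POD": ratio(a, a + c),        # probability of detection
        "FAR": ratio(b, a + b),        # false alarm ratio
        "PC": ratio(a + d, n),         # proportion correct
        "TS": ratio(a, a + b + c),     # threat score
        "HSS": ratio(2 * (a * d - b * c),
                     (a + c) * (c + d) + (a + b) * (b + d)),
    }


# Made-up flows in m3/s, for illustration only.
obs = [5, 50, 60, 10, 70, 5, 80, 5]
fcst = [5, 55, 10, 60, 75, 5, 90, 40]
scores = categorical_indices(obs, fcst, threshold=30)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

A BIAS above 1 together with a high FAR, as in this toy series, is the signature of over-forecasting discussed above for the optimized forecasts.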
Finally, operational availability quantifies the number of missing forecasts (i.e., not produced on the forecast issue date). The results show that NH has significantly worse availability than WWH for the considered time frame (Table 7), and that neither of them ensures full-time availability. This problem was typically caused by various information and communications technology (ICT) production failures on the Hydrology-TEP cloud platform (https://hydrology-tep.eu), on which the FANFAR pilot system is currently deployed to run every day. Most problems were related to necessary files being inaccessible or incorrectly stored on the Hydrology-TEP Catalogue and Store, while a minority were related to the forecasting production service. To be robust, a local EWS requires new forecasts every day without delay [28]; therefore, there is a need to upgrade the FANFAR pilot system to a fully supported, operational, production-grade system.
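One simple way to express operational availability is the share of days in a period for which a forecast was issued. The sketch below assumes one expected forecast per day; the function name and the dates are hypothetical, and the exact accounting used in the study may differ.

```python
from datetime import date, timedelta


def operational_availability(issued_dates, start, end):
    """Fraction of days in [start, end] with at least one forecast issued.

    Assumes exactly one forecast is expected per calendar day.
    """
    expected = (end - start).days + 1
    if expected <= 0:
        raise ValueError("end must not precede start")
    # A set removes duplicates, so re-issued forecasts count once per day.
    produced = {d for d in issued_dates if start <= d <= end}
    return len(produced) / expected


# Hypothetical ten-day window with two missing forecasts.
start, end = date(2019, 8, 1), date(2019, 8, 10)
issued = [start + timedelta(days=i) for i in range(10) if i not in (2, 6)]
print(f"availability: {operational_availability(issued, start, end):.0%}")
```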

SLAPIS Operational Application
Following the evaluation of the reliability of the HYPE models, conducted with continuous, categorical, and operational availability indices on NH and WWH, the latter was integrated in SLAPIS to be used operationally during the last hydrological season by the Niger Hydrological Directorate, with access restricted to registered users of the SLAPIS platform. There are three reasons for the choice of WWH: (1) the direct output in the BB sub-basin; (2) the more similar results at the two stations and the better forecast performance at BB, which is more useful for the EWS; and (3) the considerably higher operational availability in comparison to NH.
Thanks to the adoption of service-oriented architecture paradigms, it was possible to easily integrate WWH into SLAPIS using open-source technologies and developing software components. The data model of SLAPIS has been enriched with new tables to store WWH outcomes for seven specific sub-basins, two of them connected with the gauging stations of Bossey Bangou and Garbey Kourou. Following the WWH outcome specifications, an automatic procedure has been developed using J2EE technology and integrating an OpenSearch engine to download and store WWH forecasts in the geodatabase every day (Figure 8). The WWH forecast availability in the SLAPIS database starts from June 2017 and is quite constant for each sub-basin.
The WWH optimization procedure has been developed inside the SLAPIS database, based on the PostgreSQL and PostGIS engines, using the PL/pgSQL language. This procedure is exposed through a REST web service developed with JAX-RS and J2EE technologies to supply calibrated WWH outcomes. The REST web service is used by the SLAPIS web application (www.slapis-niger.org) to show and plot WWH optimized forecasts through the graphical user interface (Figure 9). WWH forecasts are available only to profiled users (e.g., operational hydrologists, ANADIA2 partners or scientists). All data are also accessible through the Web Catalogue Service of the SLAPIS web platform or through application programming interfaces (APIs) for users with more advanced IT skills.

Discussion
The results show a general overestimation in the optimized forecasts compared to the 2019 observed flow used for model verification. This behavior is explained by the fact that the 2019 hydrological year was characterized by low flow compared to the previous years: the mean flow in 2019 was 72.8 m³/s, while in the previous years it was higher (2018 = 99.2, 2017 = 106.6 and 2016 = 104.7 m³/s). Therefore, the flows used for optimization force a higher discharge than that observed in 2019 [13,30].
With a longer training period, a larger number of previous forecasts could be used to force new data in the optimized version, likely resulting in higher reliability. The short data sets used for the training period (3 years for NH and 2 years for WWH) are a limitation of this study, and performance could increase with a longer reforecasting sequence [30].
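To make the mechanism concrete, the sketch below fits an ordinary least-squares line between past forecasts and the corresponding observations and then applies it to new forecasts. It is only a generic illustration of linear regression post-processing: the function names, the clipping of negative values to zero, and the sample numbers are assumptions, and the procedure used in the study may differ in its details.

```python
def fit_linear(forecast, observed):
    """Least-squares fit of observed = intercept + slope * forecast."""
    n = len(forecast)
    if n != len(observed) or n < 2:
        raise ValueError("need two or more paired values")
    mean_f = sum(forecast) / n
    mean_o = sum(observed) / n
    sxx = sum((f - mean_f) ** 2 for f in forecast)
    if sxx == 0:
        raise ValueError("forecast series has zero variance")
    sxy = sum((f - mean_f) * (o - mean_o)
              for f, o in zip(forecast, observed))
    slope = sxy / sxx
    intercept = mean_o - slope * mean_f
    return slope, intercept


def apply_linear(forecast, slope, intercept):
    """Correct new forecasts with the fitted line.

    Negative discharges are clipped to zero (an assumption of this sketch).
    """
    return [max(0.0, intercept + slope * f) for f in forecast]


# Made-up training pairs (m3/s): raw forecasts underestimate observations.
train_fcst = [1.0, 2.0, 3.0, 4.0]
train_obs = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_linear(train_fcst, train_obs)
corrected = apply_linear([0.0, 5.0], slope, intercept)
print(slope, intercept, corrected)
```

Because the slope and intercept are estimated only from the training years, a training period wetter than the verification year tends to push the corrected forecasts above the observations, which is the behavior described above for 2019.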
The choice of hazard thresholds significantly affects categorical performance. Hydrologic and damage-based thresholds are the best choice to evaluate the EWS capability against a flow that can produce damage in the field. Unfortunately, the low flows registered in 2019 forced the authors to introduce a threshold lower than the one used operationally in the field.
The results also hint at the most appropriate users of hydrological forecasts, namely operational hydrologists and emergency management experts rather than local authorities or regular users. The POD and FAR results show that only 60% of observed streamflow values above the analysis threshold were correctly forecasted, and that there was a high number of false alarms (50% for NH and 75% for WWH). This accuracy is too poor to allow the use of hydrological forecasts for activating the warning protocols in the field, since it could compromise the users' confidence in SLAPIS and therefore the overall reliability of the system [28]. Large-scale hydrological information may be misleading when interpreted at the local scale and can change the key message of the local impacts [58]. The outputs of hydrological models therefore need context at the decision scale to be useful to users. For these reasons, hydrological forecasts will be used to activate the pre-warning protocols for the hydrological experts and not the field warning. A similar conclusion about appropriate users and stakeholders for FANFAR forecasts has been reached at the West Africa scale [38].
Continuous and categorical performances of the models demonstrate the importance of calibrating hydrological models with flow data at a daily timestep. Strong differences were detected in NH between GK (calibrated) and BB (not calibrated), and also between NH and WWH, which were calibrated at a local scale with a daily resolution and with a global focus at a monthly resolution, respectively. Another critical factor for accurate hydrological forecasts is the availability of meteorological observations and forecasts with low latency and high accuracy. The long latency of such data has been demonstrated to cause bias and drifts in other areas [42], which may explain the two-week forecast delay observed at GK, and efforts to improve both the accuracy and the latency of meteorological input data are being investigated [41]. Broadly speaking, the FANFAR system is designed to incorporate a range of approaches aimed at improving accuracy. For example, the system enables multiple forecast chain configurations (e.g., multiple hydrological models), integration of various observations (e.g., water levels from local gauges and satellite altimetry), and a range of methods to assess flood hazard (e.g., with locally defined flood hazard thresholds). The system is continuously being developed, with new features added regularly. Most recently (since August 2020), the forecasted streamflow is adjusted with recent in-situ gauge observations (currently from up to 61 hydrometric gauges, including the Sirba stations) using a simple auto-regressive approach [41].
The operational availability of forecasts from the FANFAR pilot system is currently a major limitation for the application of HYPE forecasts in local EWS. Since FANFAR is a pilot system, the main emphasis is currently put on system development rather than operational availability. Some operational procedures are in place (e.g., production monitoring), but not yet to the extent normally deployed in mature production-grade systems. In mature systems, there is typically a response process deployed to solve production failures through a cascade of actions including, e.g., (1) automatic repetition of failed processing steps, (2) semi-automated intervention by a dedicated response team, and (3) manual intervention by the system developers, where each level is activated only if the previous one failed. The speed with which problems are solved is typically regulated through service level agreements (SLAs) that define, e.g., maximum response times and whether the response team shall be active 24 h/7 d or only during normal working hours. Production-grade systems typically also have a more thorough procedure for quality control of new code, testing periods before operational launch, as well as heartbeat functionalities through which users are informed of any planned or unplanned production disruptions. Therefore, the operational availability of HYPE forecasts is fully obtainable if deployed in a production-grade ICT environment (e.g., a similar HYPE model deployed on SMHI's production environment had 100% operational availability over a full year of production). If FANFAR is to move from a pilot to a production-grade system, attention to the ICT environment and the associated procedures and SLAs for minimizing or eliminating production failures is required. Critical foundations for this are also that the ICT environment is accessible, that mandated staff have the necessary capacity to operate the system, and that the system is sufficiently financed [31].
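The response cascade described above can be sketched as a retry loop followed by ordered escalation handlers. This is a generic illustration, not a description of how FANFAR, Hydrology-TEP or any production environment is implemented; the function name, the handler labels and the failure scenario are all hypothetical.

```python
import time


def run_with_escalation(step, max_retries=3, wait_seconds=0.0,
                        escalations=()):
    """Run a processing step with a cascade of response levels.

    Level 1 repeats the failed step automatically. If all retries fail,
    each (label, handler) pair in `escalations` is tried in order; a level
    is activated only if the previous one failed.
    Returns (result, label of the level that succeeded).
    """
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return step(), f"automatic (attempt {attempt})"
        except Exception as exc:
            last_error = exc
            time.sleep(wait_seconds)
    for label, handler in escalations:
        try:
            return handler(last_error), label
        except Exception as exc:
            last_error = exc
    raise RuntimeError("all response levels failed") from last_error


# Hypothetical step that fails twice (e.g., input file not yet accessible).
calls = {"n": 0}


def flaky_step():
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("input file not accessible")
    return "forecast produced"


result, level = run_with_escalation(flaky_step, max_retries=3)
print(result, "via", level)
```

In an operational setting the handlers would typically notify people rather than return a value, and the waiting times and staffing hours would follow the applicable SLA.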
Even though the uncertain operational availability of HYPE forecasts limits their effectiveness over time, the conceptual approach embraced in the SLAPIS EWS is to foster the re-use of hydrological forecasts produced at large scale and to tailor them to local users' needs. The critical issue for an operational local EWS built on distributed data resources and hydrological service providers is coping with the continuous technical evolution of e-infrastructures and incoming forecast products. The key system characteristics for adapting efficiently to such changes are flexibility, transparency, timely documentation, and effective communication with the producers of hydrological information. The latter is particularly important and should be a two-way exchange, with producers and users sharing a common understanding of the data and metadata models in order to exchange them effectively across systems [59]. Indeed, data and metadata modeling adds transparency to data exchange and facilitates the development of APIs to interact with distributed data providers. This approach results in a mutual benefit in the perspective of distributed hydrological climate services, adding value to data through customized services, maps, and visualizations. It also calls for system architectures that adopt the paradigm of open data and standard web services to increase the interoperability of distributed hydrometeorological services. Interoperability preserves key elements of diversity while ensuring that systems work together in ways that matter most [60]. Following these principles, SLAPIS adopts interoperable services compliant with the main international standards.

Conclusions
The local flood early warning system on the Sirba River (SLAPIS) was developed in recent years as a pilot system to build a community response to the dramatic flood increase in the area. The system was founded both on a top-down hydraulic forecast and on bottom-up field observations. These two sources have a maximum forecast horizon of 48 h.
The downscaling of hydrological forecasts from the HYPE models has been planned to extend the warning time of SLAPIS to 10 days. The application was made through an optimization on the 5-day model outputs with a linear regression technique.
The forecast verification shows that NH forecasts are better than WWH ones in both the original and the optimized versions, for continuous and categorical indices. However, NH forecasts show a considerably low operational availability. The WWH model was chosen for operational integration in the user interface of the SLAPIS EWS because it offered the best operational availability, more uniform performance across the two stations, and a sub-basin output directly at the upstream gauging station.
However, because of their low performance, and to avoid compromising the reliability of the system, WWH forecasts are used as a pre-warning for hydrologists and not as an operational warning for the villages involved.
Methodologically, this study stresses the role of interoperability in distributed hydrometeorological services, which increases collaborative action, facilitates further use of data and web services, and supports progress toward the Sustainable Development Goals.
Future work will evaluate the optimization process on a higher number of training and validation years and will use operational hazard thresholds. Moreover, the reinforcement of the ICT environment and procedures for the HYPE models will allow a higher operational availability.
Funding: This research was co-funded by the Italian Agency for Development Cooperation, by the Institute of Bioeconomy (IBE)-National Research Council of Italy (CNR), by the National Directorate for Meteorology of Niger (DMN), and by DIST Politecnico and the University of Turin within the project ANADIA2.0 (Aid10848). In addition, the research is based on the FANFAR project, funded by the European Union's Horizon 2020 research and innovation program under grant agreement # 780118.

Acknowledgments:
We would like to thank (1) Gaptia Lawan Katiellou (Director of DMN) for continuously supporting the project, (2) Bruno Guerzoni for SLAPIS's patient IT development work, (3) Giulio Passerotti for previous steps of the MATLAB coding development, and (4) the FANFAR team and the developers of the NH and WWH models, in particular Emmanuel Mathot and Tobias Lagander, for their dedicated work to produce HYPE forecasts on Hydrology-TEP.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Appendix A
Results at the Bossey Bangou hydrometric station: basic statistical parameters (Table A1), flow duration curve (Figure A1) and hydrograph (Figure A2).
Table A1. Basic statistical parameters (minimum, mean, and maximum) at Bossey Bangou in the validation year 2019. Observed flow, NH and WWH (5 day) forecasts before and after the optimization.