Validation of the AROME, ALADIN and WRF Meteorological Models for Flood Forecasting in Morocco

Abstract: Flash floods are common in small Mediterranean watersheds, and the alerts issued by real-time monitoring systems leave too little lead time to warn the population. In this context, there is a strong need to develop flood forecasting systems, in particular for developing countries such as Morocco where floods have severe socio-economic impacts. In this study, the AROME (Application of Research to Operations at Mesoscale), ALADIN (Aire Limited Dynamic Adaptation International Development) and WRF (Weather Research and Forecasting) meteorological models are evaluated for forecasting flood events in the Rheraya and Ourika basins, located in the High-Atlas Mountains of Morocco. The models are evaluated by comparing, for a set of flood events, the observed and simulated probabilities of exceedance for different precipitation thresholds. In addition, two different flood forecasting approaches are compared: the first one relies on the coupling of meteorological forecasts with a hydrological model, and the second one is based on a linear relationship between event rainfall, antecedent soil moisture and runoff. Three different soil moisture products (in-situ measurements, European Space Agency's Climate Change Initiative ESA-CCI remote sensing data and ERA5 reanalysis) are compared to estimate the initial soil moisture conditions before flood events for both methods. Results show that the WRF and AROME models simulate precipitation amounts better than ALADIN, indicating the added value of convection-permitting models. The regression-based flood forecasting method outperforms the hydrological model-based approach, and the maximum discharge is better reproduced when using the WRF forecasts in combination with ERA5. These results provide insights for implementing robust flood forecasting approaches in a context of data scarcity, which could be valuable for developing countries such as Morocco and other North African countries.


Introduction
Flash floods mainly affect small watersheds where the response time to a rainfall event is very short, from a few minutes to a few hours [1][2][3]. Flash floods are common in the Mediterranean region, mainly caused by convective rainfall with an intensity that exceeds the infiltration capacity of the basin, which may potentially generate devastating floods [4,5].

To the knowledge of the authors, this is the first time that this modeling approach is tested in a North African country. The main goal is to provide local authorities with recommendations about the best strategy to implement a reliable forecasting system in a context of data scarcity in semi-arid mountainous basins.

Study Area and Hydrometeorological Data
The Rheraya and Ourika basins (225 km² and 503 km², respectively) are small basins located in the High Atlas of Marrakech, where the slopes are steep and altitudes range from 1000 to 4167 m (Figure 1). These basins are characterized by a semi-arid climate, with mean annual precipitation of 732 mm and 541 mm in the Rheraya and Ourika catchments, respectively. Precipitation is characterized by a strong variability in time and space. Snow is only present above 2000 m during winter months [40]. Previous studies have shown that snowmelt has little influence on flood volumes [41], since snow cover is concentrated only in the highest elevation areas. The geology of the upper part of the basins consists of impervious formations, mainly igneous rocks such as granite, dolerite and andesite, while in the lower part the dominant formations are clays and massive sandstones [31]. The slopes of the basins are devoid of vegetation because their steepness favors thin, rocky soils and erosion; these conditions also favor the genesis of high-magnitude floods. Vegetation cover exists only along the riverbed.
Both basins have a flood warning system based on the exceedance of pre-defined thresholds of observed water levels and rainfall intensities. The rainfall data recorded by all the warning system stations are used in this study. Raw rainfall is recorded at a 10 min time step and is then converted to 3 h accumulations in order to be comparable with the outputs of the meteorological models (Figure 1). In addition, the Rheraya watershed is also monitored by the rainfall network deployed by the Joint International Laboratory Télédétection et Ressources en Eau en Méditerranée semi-Aride (LMI TREMA, [42,43]). This rainfall network covers the 2003-2016 period with a recording interval of 30 min; these 30 min accumulations have also been converted to 3 h accumulations. In total, 9 rainfall stations are available in the Rheraya basin and 6 in the Ourika basin (Figure 1). As most Moroccan basins are only monitored by daily rain gauges, the 3 h rainfall accumulations have also been aggregated to daily accumulations, in order to mimic the actual operational case when implementing the regression method. Rainfall spatial distributions have been obtained by applying the Inverse Distance Weighting (IDW) interpolation method. The discharge data are provided by two ultrasonic water level gauges covering the period from 2013 to 2016 (Figure 1).
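The two preprocessing steps described above (time aggregation of gauge records and IDW interpolation) can be illustrated with a minimal Python sketch. The function names, the planar coordinates and the distance exponent of 2 are assumptions made for illustration; the text does not state the exponent used in the study.

```python
import math

def accumulate(series, factor):
    """Sum consecutive groups of `factor` values.

    factor=18 turns 10 min records into 3 h totals; factor=6 does the same
    for 30 min records. An incomplete trailing group is dropped.
    """
    return [sum(series[i:i + factor])
            for i in range(0, len(series) - factor + 1, factor)]

def idw(stations, target, power=2.0):
    """Inverse Distance Weighting estimate at `target`.

    `stations` is a list of (x, y, value) tuples and `target` an (x, y) pair,
    all in the same planar coordinate system. `power` is an assumed exponent.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            return value  # the target coincides with a gauge
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical example: two gauges, target point halfway between them.
three_hourly = accumulate([0.5] * 36, 18)
estimate = idw([(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)], (1.0, 0.0))
```

With equal distances to both gauges, the estimate reduces to the arithmetic mean of the two gauge values, which is a quick sanity check for any IDW implementation.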
There are 13 flood events available for the Rheraya and 7 for the Ourika (Table 1). These events show the two main types of weather responsible for floods: (i) stormy weather, predominant in summer and linked to thermal convection between the warm air of the plain of Marrakech and the cooler air of the Atlas Mountains, as in the 21/07/2016 event in the Ourika basin; and (ii) the oceanic rain regime, in which cold-season disturbances arrive from the Atlantic Ocean when atmospheric depressions are centered off Morocco. In the latter case the rainfall events last longer, as in the events of November 2014. These flood events have been used to compare the hydrological forecasts driven by the quantitative precipitation forecasts (QPFs) with those obtained with the regression model.

Soil Moisture Datasets
The European Space Agency's Climate Change Initiative (ESA-CCI; http://www.esa-soilmoisturecci.org/) produced the satellite soil moisture product ESA-CCI SM V03.2 in order to obtain long time series of soil moisture data [44][45][46]. The procedure is based on the fusion of active and passive microwave sensors from 1978 to 2016, with a temporal resolution of 1 day and a spatial resolution of 25 km. Within the Copernicus Climate Change Service (https://climate.copernicus.eu/), satellite soil moisture products will be released with a short latency, between a few hours and a day. The product has been validated worldwide by Dorigo et al. [47]. ESA-CCI has been used to estimate the antecedent wetness conditions prior to flood events in Europe [46][47][48] and in Morocco [41]. In this study, the two ESA-CCI grid points that cover the study area are used.
The ERA5 [49] reanalysis is the latest generation of reanalysis products by the European Centre for Medium-Range Weather Forecasts (ECMWF) and the Copernicus Climate Change Service (C3S). ERA5 improves on ERA-Interim by using the latest parameterizations of Earth processes at enhanced spatial and temporal resolutions (i.e., hourly time step and 31 km horizontal resolution). The latency of the product is five days. The volumetric soil water of the first soil layer has been selected and then converted to daily soil moisture content.
In addition, in the Rheraya basin, 30 min soil moisture measurements are available from 2013 to 2016 at the SMPR7 station (Figure 1), which is equipped with three Thetaprobes at different soil depths: 0.05 m and 0.3 m. Soil moisture data are converted from the 30 min to the daily time step in order to derive the initial soil moisture conditions of the basin before the flood events. This station is located at an altitude of 2030 m on a slope of 30%. In this study, the Thetaprobe measurements at 0.05 m depth are used.

Meteorological Models
Weather forecasting in Morocco is carried out by the Directorate of National Meteorology (DMN). The DMN uses two numerical weather prediction (NWP) models at different spatial resolutions:

• The AROME model is based on the ALADIN model cy36t1, with an hourly time step, a forecast range of 36 h and a spatial resolution of 2.5 km. The physical parameterizations are taken from the Méso-NH research model [50,51]. The Rapid Radiative Transfer Model (RRTM) longwave scheme is used [52] and the shortwave radiation is represented by six spectral bands [53]. The externalized SURFEX module is used to represent the surface exchanges [54], with natural land surfaces parameterized by the ISBA scheme [55]. The lateral boundary conditions come from hourly ALADIN forecasts with 7.5 km horizontal resolution. No deep convection parameterization is needed owing to the high resolution, and a bulk microphysics scheme [56] handles the equations of six water variables (water vapor, cloud water, rain water, primary ice, graupel and snow).

• The ALADIN model (ALADIN-France, Aladin International Team 1997) is run with a three-hour time step, a forecast range of 72 h and a spatial resolution of 10 km. The model runs use a two-time-level semi-Lagrangian advection scheme with a complete package of physical parameterizations. The physics are the same as in the ARPÈGE model [57]. The operational version in Morocco is ALADIN-Morocco [58].
The full set-up of the two models over Morocco is described in Hdidou et al. [59]. Since forecast data are not stored by the DMN, the AROME and ALADIN models have been re-run to simulate the flood events considered in the present work (Table 1). These runs are fully comparable to real-time forecasts, since no data assimilation is performed during the events. The ALADIN and AROME models have been developed by Météo France and are used both for precipitation [60,61] and flash flood forecasts [62].
In addition, the WRF model [34] has also been implemented in an operational mode, mimicking the operational configuration routinely used by the Meteorology Group at the University of the Balearic Islands ([63,64]; http://meteo.uib.es/wrf). Specifically, a single computational domain of 650 × 550 grid points has been selected, centered on Morocco and spanning the whole country, the adjacent Atlantic Ocean and the southern part of the Mediterranean Sea. A horizontal resolution of 2.5 km, 50 vertical levels and an integration time step of 12 s are used for all the WRF model simulations, which allows significant deep moist convective systems to be explicitly resolved [65,66]. Physical parameterizations are the single-moment 6-class microphysics (WSM6) scheme, which includes graupel [67]; the 1.5-order Mellor-Yamada-Janjić (MYJ) boundary layer scheme [68]; the Dudhia shortwave scheme [69]; the RRTM longwave scheme [52]; the unified Noah land surface model [70]; and the Eta similarity surface-layer model [68]. Initial and lateral boundary conditions have been provided by the operational deterministic forecasts of the European Centre for Medium-Range Weather Forecasts (ECMWF). Lateral boundary conditions are updated every 3 h. QPF fields are produced at hourly frequency and forecasts span a 48 h period.
Note that it has only been possible to select flood events from 2014 to 2016 because, before 2014, the meteorological models of the DMN were run at a much coarser spatial resolution and those versions of the models are no longer operational.

Evaluation of the Quantitative Meteorological Forecasts
The first evaluation of the meteorological runs is carried out by comparing the QPFs for each event against the observed precipitation, in order to examine the temporal evolution of the 3 h precipitation amounts. The second verification is based on dichotomous skill scores, which are commonly used for evaluation and validation purposes when assessing the performance of meteorological model outputs [71,72]. These statistical scores are based on a contingency table [73] and allow estimating the probability of exceeding, or not, predefined rainfall thresholds (Table 2). The selected rainfall thresholds are 10, 20 and 30 mm, which correspond to precipitation return periods of 2, 5 and 10 years, respectively. Furthermore, these thresholds are currently used in the operational flood alert system.

Table 2. The contingency table used to build the dichotomous skill scores. Note that a corresponds to a forecasted event that occurred; b to a forecasted event that did not occur; c to a non-forecasted event that occurred; and d to a non-forecasted event that did not occur.

The following skill scores are selected:

• Probability of False Detection (POFD) is the fraction of predicted events that have not been observed relative to the total number of unobserved events:

$$\mathrm{POFD} = \frac{b}{b + d}$$

• False Alarm Ratio (FAR) is the fraction of the predicted events that were not observed:

$$\mathrm{FAR} = \frac{b}{a + b}$$

• Bias (BIAS) is the ratio between the number of predicted and observed events:

$$\mathrm{BIAS} = \frac{a + b}{a + c}$$

• Accuracy (ACC) is the fraction of correct forecasts relative to all forecasts:

$$\mathrm{ACC} = \frac{a + d}{N}$$

where a, b, c and d are the numbers of rain events that fulfill the conditions in Table 2 and N = a + b + c + d is the total number of events.
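As an illustration, the scores can be computed from paired forecast and observed accumulations using the standard contingency-table definitions of POFD, FAR, BIAS and ACC. The sketch below is not the authors' code: the function names are hypothetical, and treating "greater than or equal to the threshold" as an exceedance is an assumption.

```python
def contingency(forecast, observed, threshold):
    """Count hits (a), false alarms (b), misses (c) and correct rejections (d).

    An exceedance is assumed to mean value >= threshold.
    """
    a = b = c = d = 0
    for f, o in zip(forecast, observed):
        forecast_yes = f >= threshold
        observed_yes = o >= threshold
        if forecast_yes and observed_yes:
            a += 1
        elif forecast_yes:
            b += 1
        elif observed_yes:
            c += 1
        else:
            d += 1
    return a, b, c, d

def skill_scores(a, b, c, d):
    """Standard dichotomous scores; NaN is returned when a denominator is zero."""
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")
    n = a + b + c + d
    return {
        "POFD": ratio(b, b + d),
        "FAR": ratio(b, a + b),
        "BIAS": ratio(a + b, a + c),
        "ACC": ratio(a + d, n),
    }

# Hypothetical rainfall accumulations (mm) for five events, 10 mm threshold.
counts = contingency([12, 25, 5, 8, 31], [15, 8, 12, 3, 35], 10)
scores = skill_scores(*counts)
```

Returning NaN for an empty denominator keeps a score undefined rather than misleadingly zero, which matters for small event samples such as those of the Ourika basin.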

Rainfall-Runoff Model
The HEC-HMS rainfall-runoff model has been selected in this study (USACE 2015). The Soil Conservation Service-Curve Number method (SCS-CN; [74]) is used to calculate runoff from rainfall. The SCS-CN method has been widely used in Mediterranean basins [18,41,64,75,76]. It was chosen for its simplicity, as it has only one parameter to estimate. The Clark Unit Hydrograph (CUH) transfer model has been used to simulate the conversion of the rainfall excess to runoff, owing to the complex topography of the study area. Tramblay et al. [76] and El Khalki et al. [41] have shown that it is suitable for this type of basin. The CUH is based on two distinct processes: (i) translation, controlled by the Time of Concentration parameter (Tc), which is based on a synthetic time histogram; and (ii) attenuation, controlled by the Storage Coefficient parameter (Sc), which represents the impact of basin storage. The base flow is simulated with the exponential recession model [77], with the Recession Constant (Rc) and Ratio (R) parameters kept constant for all the events. The limited contribution of long-term storage makes this approach more suitable for this type of basin [41].
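For readers unfamiliar with the SCS-CN loss method, the following sketch shows its textbook form in metric units. It is a standalone illustration, not the HEC-HMS implementation: the function name is hypothetical, and the initial abstraction ratio of 0.2 is the conventional default, assumed here because the text does not state the value used.

```python
def scs_cn_runoff(rain_mm, curve_number, ia_ratio=0.2):
    """Event runoff depth (mm) from event rainfall (mm) with the SCS-CN method.

    S  = 25400 / CN - 254      potential maximum retention (mm)
    Ia = ia_ratio * S          initial abstraction (mm), ratio assumed
    Q  = (P - Ia)^2 / (P - Ia + S)  if P > Ia, else 0
    """
    if not 0 < curve_number <= 100:
        raise ValueError("curve_number must be in (0, 100]")
    retention = 25400.0 / curve_number - 254.0
    initial_abstraction = ia_ratio * retention
    if rain_mm <= initial_abstraction:
        return 0.0  # all rainfall is absorbed before runoff starts
    excess = rain_mm - initial_abstraction
    return excess ** 2 / (excess + retention)

# Hypothetical example: 50 mm of rain on a basin with CN = 80.
runoff_mm = scs_cn_runoff(50.0, 80.0)
```

The single free parameter CN is what ties the method to antecedent wetness: a wetter basin is represented by a higher CN, hence a smaller retention S and more runoff for the same rainfall.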
The calibration of the HEC-HMS model is carried out using 13 events for the Rheraya and 7 events for the Ourika. The model inputs are rainfall measurements interpolated by the IDW method. El Khalki et al. [41] carried out this calibration by adjusting the CN, Tc, Sc, Rc and R parameters. The calibrated parameters reproduced the observed discharge well for all the selected events [41], which allows the calibration results to be considered as a benchmark. Afterwards, the HEC-HMS model has been forced with the QPFs from the AROME, ALADIN and WRF models so as to evaluate the capability of the resulting runoff simulations to reproduce the flood events.

Regression Model
In addition to the hydrological model, a simple statistical model based on multiple regression is also tested. The reason is that the lack of long, complete time series of rainfall and runoff data can make it difficult to develop a forecasting system based on a hydrological model. Note that over most basins of Morocco, and in other developing countries, the only hydro-meteorological data available are maximum discharge and rainfall at the daily time step. Therefore, it is also an important objective to develop a flood forecasting system compatible with the existing limited observational databases. According to Penna et al. [30], maximum discharge and soil moisture are well correlated for alpine, impervious and semi-arid basins. Therefore, a multiple regression model is fitted individually for the Rheraya and Ourika basins in order to relate peak discharge to maximum precipitation and to each of the soil moisture products available for each flash-flood event. The parameters of the multiple regression models are estimated by the Generalized Least Squares method (GLS, [78]), so as to avoid issues related to potential collinearity in the variables. Owing to the limited sample size, the validation of the regression models is performed with a resampling procedure. The jack-knife (or leave-one-out) method is implemented: each event is successively removed and the regression between rainfall, maximum discharge and each of the three soil moisture products is re-calculated using the remaining events. After the validation of the regression model, the maximum discharge is forecast by replacing the observed rainfall with the different QPFs in the three resulting equations, each of which represents one soil moisture product. This gives three maximum discharges for each QPF.
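The leave-one-out validation described above can be sketched as follows. This is a simplified illustration under stated assumptions: ordinary least squares is used in place of the GLS estimator of the study, all function names are hypothetical, and the event values are synthetic.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit(rain, soil_moisture, peak_q):
    """Ordinary least squares fit of peak_q = a * rain + b * soil_moisture + c."""
    rows = [(p, s, 1.0) for p, s in zip(rain, soil_moisture)]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * q for r, q in zip(rows, peak_q)) for i in range(3)]
    return tuple(solve(ata, aty))

def predict(coefficients, rain, soil_moisture):
    a, b, c = coefficients
    return a * rain + b * soil_moisture + c

def leave_one_out_rmse(rain, soil_moisture, peak_q):
    """Refit without each event in turn and score the prediction for that event."""
    n = len(peak_q)
    squared_errors = []
    for left_out in range(n):
        keep = [j for j in range(n) if j != left_out]
        coefficients = fit([rain[j] for j in keep],
                           [soil_moisture[j] for j in keep],
                           [peak_q[j] for j in keep])
        error = predict(coefficients, rain[left_out], soil_moisture[left_out]) - peak_q[left_out]
        squared_errors.append(error ** 2)
    return (sum(squared_errors) / n) ** 0.5

# Synthetic events generated from a known linear relation (not study data).
rain = [10.0, 20.0, 30.0, 40.0, 25.0]
soil = [0.10, 0.30, 0.20, 0.40, 0.15]
peak = [2.0 * p + 3.0 * s + 1.0 for p, s in zip(rain, soil)]
coefficients = fit(rain, soil, peak)
loo_rmse = leave_one_out_rmse(rain, soil, peak)
```

In forecasting mode, the fitted coefficients are kept and `predict` is called with a QPF accumulation in place of the observed rainfall, once per soil moisture product.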

Metrics
The efficiency criteria considered in this study for evaluating the performance of the HEC-HMS simulations are the Nash-Sutcliffe efficiency [79] (Equation (6)) and the BIAS on maximum discharge and volume (Equation (8)). For the regression models, the root-mean-square error (RMSE, Equation (7)) and the BIAS on peak discharge are used.

Water 2020, 12, 437

To evaluate the benefit of the regression model against the HEC-HMS model, the efficiency skill score (EFF) has also been employed. In these expressions, RMSE is expressed in m³/s and BIAS in %, $Q_{sim}$ denotes the simulated peak discharge, $Q_{obs}$ stands for the maximum observed discharge, $Q_{hydro}$ is the peak discharge predicted by the hydro-meteorological forecasting chain using the hydrological model and $Q_{reg}$ is the maximum discharge predicted by the regression model. EFF ≥ 0 denotes an improvement of the peak discharge estimation by the regression model when compared with the hydrological model, while EFF < 0 means no improvement of the maximum discharge when compared with the hydrological model.
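A minimal sketch of the first three criteria is given below, using their standard textbook forms. The function names are hypothetical, and expressing the bias as a signed relative error in percent is an assumption consistent with the units stated above. The EFF score is left out because its expression is not given in this text.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_observed = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - residual / variance

def rmse(observed, simulated):
    """Root-mean-square error, in the units of the inputs (e.g. m3/s)."""
    n = len(observed)
    return (sum((s - o) ** 2 for o, s in zip(observed, simulated)) / n) ** 0.5

def percent_bias(observed_value, simulated_value):
    """Signed relative error in percent; negative means underestimation."""
    return 100.0 * (simulated_value - observed_value) / observed_value

# Hypothetical hydrograph ordinates (m3/s).
q_obs = [1.0, 2.0, 3.0, 4.0]
q_sim = [2.0, 3.0, 4.0, 5.0]
nse_value = nash_sutcliffe(q_obs, q_sim)
rmse_value = rmse(q_obs, q_sim)
```

Note that the Nash-Sutcliffe efficiency is computed over the time steps of one hydrograph, whereas RMSE and bias on peak discharge are computed over events.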

Validation of Forecasted Rainfall Events
The QPFs from AROME, WRF and ALADIN are evaluated by comparison with the precipitation interpolated from all the rain gauges, for each event and for both basins. The QPFs successfully reproduce the timing of heavy precipitation for the majority of the events (Figure 2). The ALADIN model underestimates the amount of precipitation over the Rheraya (Figure 2) and the Ourika (Figure 3), by −23% and −46% on average, respectively. The WRF and AROME models overestimate the cumulative precipitation over the Rheraya, by +113% and +62.5% on average, respectively. In the Ourika, the WRF model shows an underestimation of −2.6% and the AROME model an overestimation of +24%. It seems clear that the use of NWP models with convection-permitting horizontal scales (~2.5 km) is paramount in order to simulate realistically the high rainfall amounts of convective origin as well as their timing.

Evaluation of Meteorological Models
The application of the contingency table shows different results for each basin. For the Rheraya, the WRF model gives acceptable results for the different thresholds, followed by the AROME model. As the threshold values increase, the percentage of false alarms increases for the WRF model, whereas the percentage of rejected events increases for the AROME and ALADIN models (Figure 4). This is due to the fact that the cumulative precipitation given by the WRF and AROME models exceeds the observed cumulative precipitation. Conversely, ALADIN exhibits an increase in the percentage of missed events for the higher thresholds. No false alarms are detected for the first two thresholds over the Ourika basin with the WRF and AROME models, because both the forecasted and the observed cumulative rainfalls exceed them. This exceedance is reflected in the percentage of detected events, which is very high for the first two thresholds considered. For the ALADIN model, when the highest threshold exceeds the observed and forecasted cumulative rainfall, the percentage of rejected events increases.
The contingency table allowed us to calculate the different skill scores (Figure 5). WRF and AROME exhibit BIAS > 1 which is explained by the fact that the two models overestimate the amount of the observed precipitations. Conversely, the ALADIN model features a general under-estimation of the total rainfall amounts over the two basins. The ACC indicates that forecasts based on smaller threshold are generally more in agreement with the observed amounts of rainfall.

Hydro-Meteorological Approach
The calibration of the HEC-HMS model gave good results, with average Nash-Sutcliffe efficiencies of 0.79 and 0.66 for the Rheraya and Ourika catchments, respectively. The mean biases on the maximum discharge were −2.47% and −3.15%, respectively (Table 3). After calibration, the HEC-HMS model reproduces the maximum discharge of each event well, which allows the model to be forced with the QPFs while taking the calibrated hydrological simulation as a benchmark.

As expected, the results of the hydro-meteorological forecasting approach show that the ALADIN model underestimates the maximum discharges, with average biases of −30% and −37% for the Rheraya and Ourika basins, respectively (Table 4). Conversely, the runoff simulations driven by WRF and AROME overestimate the maximum discharges, by +178% and +145%, respectively, for the Rheraya basin and by +35% and +163% for the Ourika catchment. The WRF-HEC-HMS system satisfactorily forecasts the floods of greater magnitude, which makes it advisable for a flood forecasting system because the risk of impacts is larger for this kind of flood. On the other hand, low-magnitude floods are better predicted by the multi-model mean: the rainfall underestimation by ALADIN and the overestimation by AROME and WRF compensate each other in the rainfall accumulation.

Table 4. Statistical indices of the QDFs for the Rheraya and Ourika basins. Note that $Q_{obs}$ stands for the observed maximum discharge, $Q_{WRF}$ denotes the simulated maximum discharge using the WRF model, $Q_{ALADIN}$ is the simulated maximum discharge using the ALADIN model, $Q_{AROME}$ is the simulated maximum discharge using the AROME model and $Q_{Mean}$ is the simulated maximum discharge using the multi-model mean.

Regression Approach
As mentioned above, the regression approach is composed of three equations per basin, each equation representing a different soil moisture dataset (i.e., observed, ESA-CCI and ERA5) used to perform the regression. Equations (10)-(12) correspond to the Thetaprobe, ESA-CCI and ERA5 soil moisture products, respectively, for the Rheraya basin, and Equations (13)-(15) to the same three soil moisture datasets for the Ourika basin.

$$Q_{\text{obs-Rh}} = 0.67 \times P + 6.31 \times SM + 24.24 \qquad (10)$$

$$Q_{\text{obs-Ou}} = 5.1 \times P - 3015 \times SM + 257.6 \qquad (13)$$

where $Q_{\text{obs-Rh}}$ is the predicted maximum discharge using in-situ measurements of soil moisture for the Rheraya basin; $Q_{\text{obs-Ou}}$ is the predicted maximum discharge using in-situ measurements of soil moisture for the Ourika basin; $Q_{\text{ESA-CCI-Rh}}$ and $Q_{\text{ESA-CCI-Ou}}$ are the predicted maximum discharges using the ESA-CCI product for the Rheraya and Ourika basins, respectively; $Q_{\text{ERA5-Rh}}$ and $Q_{\text{ERA5-Ou}}$ are the predicted maximum discharges using the ERA5 product for the Rheraya and Ourika basins, respectively; $P$ is the 24 h precipitation of the meteorological models; and $SM$ is the soil moisture of the product used, before the flood event.
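As a worked illustration, Equations (10) and (13) can be evaluated for the 24 h precipitation forecast by each model. The coefficients below are those of the two equations; the function names, the precipitation values and the soil moisture values are hypothetical placeholders, and the soil moisture units are assumed to match those used when the equations were fitted.

```python
def peak_discharge_rheraya(precip_24h, soil_moisture):
    """Equation (10): Rheraya basin, in-situ soil moisture."""
    return 0.67 * precip_24h + 6.31 * soil_moisture + 24.24

def peak_discharge_ourika(precip_24h, soil_moisture):
    """Equation (13): Ourika basin, in-situ soil moisture."""
    return 5.1 * precip_24h - 3015 * soil_moisture + 257.6

# Hypothetical 24 h QPFs (mm) for one event; not values from the study.
qpf_mm = {"WRF": 40.0, "AROME": 35.0, "ALADIN": 15.0}
antecedent_sm = 1.0  # hypothetical pre-event soil moisture value

forecast_peaks = {
    model: peak_discharge_rheraya(precip, antecedent_sm)
    for model, precip in qpf_mm.items()
}
```

Because the relation is linear in P, any bias in the forecast rainfall propagates to the peak discharge in proportion to the precipitation coefficient, which is why the choice of NWP model matters for this method.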
These equations illustrate the good relationship between precipitation, event maximum discharge and soil moisture. The squared correlation coefficient is larger than 0.78, which justifies validating the equations with the leave-one-out method. Validation shows a good performance in the Rheraya basin, with an RMSE of 16 m³/s for the ERA5 dataset. On the other hand, the multiple regression models based on the ESA-CCI and Thetaprobe datasets exhibit weaker results (26 m³/s and 31 m³/s, respectively). It is worth noting that the small number of events considered strongly reduces the robustness of the regression model for the Ourika basin. There, the three multiple regression equations overestimate the maximum discharge, with the lowest RMSE, 216 m³/s, obtained for the ERA5 product. Overall, the ERA5 soil moisture dataset gives the best performance when compared against the observed soil moisture or the ESA-CCI database (Table 5).
When soil moisture products and NWP models are combined, the best performance comes from the combination of the WRF model and the ERA5 dataset. This combination presents the lowest RMSE and bias. In the Ourika basin, the rainfall amount of some events is strongly underestimated by ALADIN, resulting in maximum discharges close to zero. To alleviate this problem, the multi-model mean of the three NWP model simulations has been used. This approach reduces errors with all the soil moisture products and gives good results over the Ourika basin, with bias reductions of up to 26%, 3.4% and 20.2% for the ESA-CCI, ERA5 and observed soil moistures, respectively.
Comparing the results of the hydrological and multiple regression approaches when forced by the QPFs, most cases result in an EFF greater than 0 (Table 5). These results suggest that a better performance is obtained when using the regression method instead of the hydrological model. However, these results also depend on the basin, the model and the dataset used. For instance, the ALADIN model in combination with the soil moisture obtained from the ESA-CCI and Thetaprobe products yields an EFF smaller than 0 over the Ourika basin. In this case, the hydrological model approach performs better than the regression method. Conversely, the EFF coefficients are positive when using the ERA5 dataset.

Table 5. Statistical indices after the application of the linear regression equations approach for the Rheraya and Ourika basins. Note that $Q_{WRF}$ is the simulated maximum discharge using the WRF model, $Q_{ALADIN}$ is the simulated maximum discharge using the ALADIN model, $Q_{AROME}$ is the simulated maximum discharge using the AROME model and $Q_{Mean}$ is the simulated maximum discharge using the average of the three meteorological models. The best results for each meteorological model are represented in bold.

Conclusions
This study provides a first evaluation of two distinct approaches for a flood forecasting chain that could be implemented operationally in Morocco to reduce the vulnerability to flood risk: the first approach relies on the HEC-HMS hydrological model driven by NWP model outputs, and the second uses a regression model between observed rainfall, soil moisture and observed maximum discharge in combination with NWP forecasts. The study focuses on two basins located in the south of Morocco, which are recurrently hit by severe flood events and are highly representative of the basins impacted by floods in Morocco. The AROME, ALADIN and WRF models have been evaluated for a set of flood events over the two basins, with 13 events over the Rheraya basin and 7 events over the Ourika watershed. The AROME model tends to overestimate the observed cumulative rainfall for all the flood events, while the WRF model overestimates the observed cumulative rainfall only over the Rheraya basin. Furthermore, both models clearly outperform the ALADIN model, which strongly underestimates rainfall amounts. The use of convection-permitting horizontal scales in NWP models appears to be paramount to satisfactorily reproduce the high rainfall rates and their timing before flash floods. That is, the WRF and AROME models have been found more effective when forecasting heavy precipitation events, with fewer false alarms and missed events than ALADIN. The skill scores indicate that the accuracy of the meteorological models tends to decrease for increasing rainfall thresholds.
Next, two methods have been compared in order to produce reliable hydrological forecasts based on the outputs of these NWP models. The first method relies on a standard hydrological model based on the SCS-CN infiltration method. The second method is based on a linear regression approach linking event rainfall, antecedent soil moisture and event peak discharge. The comparison of these two distinct flood forecasting approaches has shown that the regression method outperforms the hydrological model in simulating maximum discharge with a 24 h lead time. The best results have been obtained by combining the WRF model with the ERA5 dataset as the estimate of the initial soil moisture conditions. This improvement is more marked in the case of the Rheraya basin, although the same conclusion can also be drawn for the Ourika catchment. However, the Ourika basin has a smaller number of events, so caution is required: with such a small sample size, the robustness of the results cannot be tested thoroughly.
This study provides the first evaluation of flash-flood forecasting methodologies in Morocco. Results indicate that a relatively simple forecasting model based on linear regression could be efficient in the case of a relatively small database of events. Yet, this approach does not take into account the spatial variability of precipitation, and its parameters have no physical definition. On the other hand, the hydrological modeling approach is more difficult to implement in a context of data scarcity, where the high-resolution spatial and temporal hydro-meteorological data required to set up such a model are rare. However, prior to the operational implementation of such forecasting systems, there is a strong need to test and validate different methods on other basins with different sizes, physiographic characteristics and climate conditions. In particular, this methodology could be useful for larger Moroccan basins upstream of dams, where the impact of floods is critical for dam safety. In addition, considering a larger sample of flood events would allow a better analysis of the large-scale synoptic conditions associated with these events, and also of the variability of soil moisture across seasons and its influence on flood generation mechanisms [80]. This would require a national inventory of available hydrometeorological data and a monitoring strategy to increase the spatial coverage of gauged basins. Similarly, the evaluation of different remote sensing soil moisture products is necessary in order to identify the most suitable products to properly estimate the state of initial soil saturation over different basins in Morocco, where the density of hydro-meteorological networks is low and where usually no measured soil moisture is available.
Finally, as a complement to the regression approach considered herein, other machine-learning methods such as logistic regression [80] could be valuable tools to estimate the probability of a flood event exceeding pre-determined thresholds in combination with a quantitative precipitation forecast.