Improving the Accuracy of Hydrodynamic Simulations in Data Scarce Environments Using Bayesian Model Averaging: A Case Study of the Inner Niger Delta, Mali, West Africa

: In this paper, the study area was the Inner Niger Delta (IND) in Mali, West Africa. The IND is threatened by climate change, increasing irrigation, and dam operations. 2D hydrodynamic modelling was used to simulate water levels, discharge, and inundation extent in the IND. Three di ﬀ erent digital elevation models (DEM) (SRTM, MERIT, and a DEM derived from satellite images were used as a source of elevation data. Six di ﬀ erent models were created, with di ﬀ erent sources of elevation data and di ﬀ erent downstream boundary conditions. Given that the performance of the models varies according to the location in the IND, the variable under consideration and the performance criteria, Bayesian Model Averaging (BMA) was used to assess the relative performance of each of the six models. The BMA weights, along with deterministic performance measures, such as the Nash Sutcli ﬀ e coe ﬃ cient (NS) and the Pearson’s correlation coe ﬃ cient (r), provide quantitative evidence as to which model is the best when simulating a particular hydraulic variable at a particular location. After the models were combined with BMA, both discharge and water levels could be simulated with reasonable precision (NS > 0.8). The results of this work can contribute to the more e ﬃ cient management of water resources in the IND.


Introduction
The Inner Niger Delta (IND) is the second-largest wetland in Africa, and its maximum inundated area can exceed 30,000 km 2 during wet years [1]. Its main water sources are the Niger and Bani Rivers, which enter the study area at Ké-Macina and Sofara, respectively, flowing through the Delta over several hundred kilometers, before both exit at Diré (Figure 1). The maximum inundation area shrinks to around 10,000 km 2 in dry years [2] under the combined impacts of climate variability, irrigation withdrawals, and dam operation. The IND faces many environmental challenges due to upstream water interventions on the river system, as well as high precipitation variability. The IND is considered as a hub of human activities, including agriculture, fishing, transport, and tourism. The Delta plays an important role in promoting sustainable development for food security, water management, and the environment. Some upstream dams and irrigation structures are already in place, while more are at the planning stage. The infrastructures aim to increase economic development and food security. However, damming a river often implies a transfer of benefits from the downstream to the upstream regions. Thus, damming the main course or branches of the Niger River can impact on food security, ecology, and the environment as a whole by altering flow and inundation patterns. The operation of existing dams, such as Sélingué, Markala, and Talo has already modified flow patterns in the Delta, and planned dams, such as Fomi and Djenné, may exacerbate the changes. Figure 2 shows the location of existing and planned dams/reservoirs upstream of IND. Zwarts and Frerotte [1] have shown that on average, 9% of the total flood extent area has been lost due to the effects of the Selingue and Markala dams. Liersch et al. [3] anticipated an average reduction of 24% flood extent compared to the natural condition due to the combined effect of existing and planned water infrastructures in the watershed. The situation may get worse due to increased evaporation as a result of global warming [4]. However, changes in precipitation are uncertain [5] where some models project a wetter, and others a drier future in the Upper Niger basin [6]. A lower flow rate, in combination with a smaller flood plain, will most probably affect crop yields, fisheries, livestock, biodiversity, and environmental flow. To understand the possible impacts of these future changes on ecosystems and livelihoods, a realistic simulation of water levels, flow velocity, and inundations are required. Concerns about the hydrologic regime alterations in the watershed have triggered interest in ecosystem services and environmental health, especially in the IND area. The evaluation of ecosystem health requires the knowledge of relevant hydraulics parameters under altered flow scenarios. Such knowledge can only be achieved using hydrodynamic modelling. Unfortunately, previous attempts to develop a hydrodynamic model of the IND were only partially successful. For instance, Neal et al. [7] applied LISFLOOD-FP [8], a raster-based sub-grid model, but model results were affected by the low-resolution (about 1 km) structured grid and simplified assumptions about the solution of the St. Venant equation. Dadson et al. [9] applied a one-dimensional model with 25 km grids, and presented the inundation as a fraction of the cells which did not show the variability of inundation. Haag [10] applied the D-Flow FM [11,12], a shallow-water solver based on the finite-volume method on an unstructured grid to study the flood in the delta, and ended up having an imperfect simulation of water level, where inundation resulted from poor calibration efforts.
This study is an attempt to develop a working hydrodynamic model for the IND in support of ecosystems studies in the area. The modelling was done using TELEMAC 2D [13], a finite-element computer program suite that solves shallow-water equations using the finite element method. Given the lack of accurate digital elevation models of the study area, two global digital elevation models were considered: the Shuttle Radar Topography Mission (SRTM) [14], and the Multi-Error-Removed Improved-Terrain (MERIT DEM) [15]. However, the elevation accuracy of the DEMs makes their use in hydrodynamic modelling challenging. Rodriguez et al. [16] reported an elevation error (standard deviation) of about 4.68 m of SRTM DEM in Africa. Yamazaki et al. [15] reported a relative vertical height error of about 2 m of MERIT DEM for 58% of the land mapped by the DEM. A third DEM was therefore derived using the waterline method [17][18][19].
Downstream boundary conditions were imposed by prescribing rating curves either in the form of discharge as a function of water level or water level as a function of discharge. Inflows in the domain were a combination of measurements and simulations using a Soil Water Assessment Tools (SWAT) hydrological model [20] of the Upper Niger Basin. The combination of all available DEMs and downstream conditions led to a total of six models with different performances in simulating discharges, water levels, and inundation extents. Bayesian Model Averaging (BMA) [21] was used to combine the outputs of these six models to get improved results and identify the best individual models that could be used to simulate water levels and discharges with relatively high performance.
The paper is organized as follows: Section 1 contains the background and objective of the study; Section 2 presents the materials and methods, which includes available data, topographic data derivation, river network, bathymetric data, floodplain friction, adjustment of the elevation of SRTM and MERIT DEM, model setup, model calibration, and the theory of BMA; Section 3 deals with calibration results, BMA prediction, validation results, inundation extent, and performance of global DEMs; and finally, some concluding remarks complete the study.

Discharge and Water Level
Water levels and discharge at Ké-Macina, Sofara, Mopti, Akka, and Diré, as well as a rating curve at Diré were obtained from the Malian hydrometric services (Direction Nationale de l'Hydraulique, DNH). Water level data have been referenced to Institut national de l'information's (IGN) datum. Water entered into the study area at KeMacina and Sofara through the Niger and Bani rivers, respectively. Mopti is located downstream of the confluence of the Bani and Niger rivers. Akka is situated almost at the middle of the study area, along the Niger River. Water drains the system at Dire station.

Topographic Data Sources
The SRTM DEM has been available at a resolution of one arc-second (about 30 m) since 2014. It is a widely used DEM: Biancamaria et al., Schumann et al.,and Neal et al. [7,22,23] used SRTM gridded data for hydraulic modelling of the Ob in western Siberia, the Scioto River in the eastern U.S., and the Niger River (Inner Niger Delta) in Mali, respectively. The SRTM elevation data used in this study were downloaded from https://earthexplorer.usgs.gov/.
The MERIT DEM is available at three arc-sec resolutions (about 90 m), and was derived from SRTM and AW3D-30 m [24] by removing multiple error components, such as absolute bias, stripe noise, speckle noise, and tree height bias [15]. The MERIT DEM was used by Hawker et al. and Archer et al. [25,26] for the flood modelling study of the Ba catchment, Fiji. The MERIT DEM was downloaded from http://hydro.iis.u-tokyo.ac.jp/~{}yamadai/MERIT_DEM/.

DEM Derivation Using the Waterline Method
An additional DEM was developed using the waterline method from the combined uses of the water extent map derived from Landsat satellite images by Zwart et al. [2] and water level data collected on particular dates. The waterline method involves determining the horizontal position of the land-water boundary from a time series of remotely sensed images taken during different periods, and then superimposing on this boundary the heights relative to a datum.
The procedure is as follows: For each of the seven flood inundation maps, water levels at Ké-Macina, Mopti, Akka, and Diré were used to calculate the slope of the water surface along principal flow paths between Ké-Macina and Akka, between Mopti and Akka, and between Akka and Diré. In the absence of additional information, it was further assumed that the water level variation between any two stations is linear.

3.
A series of orthogonal lines were drawn at 30 m intervals along principal flow paths.

4.
It was assumed that water levels along these lines are constant. Therefore, an elevation value could be set at the intersection points between water extent polygon and the orthogonal lines.

5.
At the end of the process, an altitude was estimated for each point within the flood extent polygons. 6.
Areas outside the larger polygon were populated using SRTM elevation data. 7.
GIS was used to interpolate the elevation data inside the study area.

Satellite Imagery
This study makes use of the water extent maps derived from Landsat images by Zwarts et al. [2]. Zwarts et al. [2] processed Landsat 5 TM images captured on 24 different dates spanning from 1984 to 2003, and derived 24 water extent maps for the study area. The images provide evidence of the water extents of inundation areas, along with known water levels at Mopti, Akka, and Diré. The inundation extent obtained from the images on 28 July 2001 and 16 October 2001 are shown in Figure 4. To validate the simulated inundation, a series of Moderate Resolution Imaging Spectroradiometer (MODIS) images were processed to derive the inundation extent. A summary of the satellite data usage is given in Table 1. We used two spectral indices to separate water pixels from non-water pixels. The indices are the Modified Normalized Difference Water Index (MNDWI) [27] and the Normalized Difference Moisture Index (NDMI) [28].
MNDWI and NDMI indexes are expressed as follows: For MODIS images, Green, NIR, and MIR represent reflectance from Band 4 (545 nm-565 nm), Band 2 (841 nm to 876 nm), and Band 6 (1628 nm to 1652 nm), respectively. The study makes use of the finding by Ogilvie et al. [29] on the threshold value of NDMI and MNDWI. They analyzed MODIS images to study the spatial and temporal dynamics of floods across the Niger Inner Delta over the period of 2000-2011. Water was classified when the pixel had a threshold value of 0.15 for NDMI, and for the MNDWI, classifying pixels above the threshold of −0.34 was considered a flooded area.

River Network
The river network was derived from the inundation extent map derived from a Landsat image on 8 July 1985 by Zwarts et al. [2]. Only major rivers and channels were extracted. Small channels, especially at the northern part of the delta, could not be taken into consideration due to their haphazard geometry and complex meandering. Figure 5 shows the delineated river network used in the model.

Channel Geometry
The bed elevation of river sections at Ké-Macina, Mopti, Akka, and Diré was approximated using hydraulic geometry equations proposed by Leopold and Maddock [30]. Hydraulic geometry relationships are a series of power laws that relate river width (w), depth (d), and velocity (v) to discharge. These functions are: Here, a, c, and k are coefficients, and b, f, and m are exponents with b + f + m = 1, and Empirical values of the hydraulic geometry coefficients and exponents derived by Hey and Thorne [31] for Gravel-Bed Rivers in Britain were used in this study. Hey and Thorne [31] found that a = 3.67, b = 0.45, c = 0.33, and f = 0.35 were a fair representation of hydraulic geometry. The same coefficients were used by Neal et al. [7] in the IND.
Leopold [32] suggested taking the bankfull discharge for the analysis, as it is related to channel morphology. The bankfull discharge is often assumed to be the discharge, with a return period of about 1.5 years. Table 2 shows the river bed elevation at KeMacina, Mopti, Akka, and Dire. Elevations of other points in between the sections were calculated assuming a linear variation along the longitudinal direction.

Estimation of Inflow at Ungauged Inlets
Observed streamflow time series were available only at two inlets of the hydrodynamic model domain, namely Ke-Macina and Sofara. Streamflow at the seven other inlets was generated by local precipitation that generate ephemeral streams which ultimately end up in the IND. These ungauged inflows were estimated using a SWAT model of the upper Niger and Bani Rivers basins developed in a previous project [33]. SWAT is a process-based and spatially semi-distributed hydrological model. In the SWAT model, the whole watershed is divided into multiple sub-basins, and each sub-basin is divided into several Hydrological Responsible Units (HRUs) that consist of unique homogeneous combinations of soil, land use, and topographic features in each sub-basin [34]. The hydrological cycle simulated by the SWAT model is based on the water balance equation [35]: where SW t , SW 0 are the final and initial soil water content (mm/d), respectively; t is the time (day); R day is the precipitation(mm/d); Q sur f is the runoff (mm/d); E a is the evapotranspiration (mm/d); W perc is the percolation (mm/d); and Q gw is the return flow (mm/d). The SWAT model implements multiple options for the description of hydrological processes, such as the Soil Conservation Service (SCS) curve number method [36] for runoff calculation and the Penman-Monteith [37] method for the estimation of potential evapotranspiration. The details on SWAT modelling tools can be found from Arnold et al., [20], Arnold et al. [33], and Neitsch et al. [35]. The SWAT model of the UNB contains 32 sub-watersheds ( Figure 6a). The model was calibrated and validated using streamflow from 14 hydrometric stations listed in Table 3, along with the calibration and validation performance. The performance criteria used in the calibration and validation process is NS. Sensitivity analysis was performed to select the parameters to use for calibration: CN2 (SCS runoff curve number for moisture condition II), RCHRG_DP (Deep aquifer percolation fraction), and RES_K (Hydraulic conductivity of the reservoir bottom, mm/hr) which were tuned during calibration. The Sequential Uncertainty Fitting program, SUFI-2 [38] was used for calibration purposes. A detailed description of the model used in this study can be found in Seidou [33] and Maiga [39]. The calibrated SWAT model was forced with observed climate data, and the discharge from surrounding watersheds was assumed to enter the area where the main channel crosses the boundary of the study area (red dots in Figure 6b). Yellow dots in Figure 6b indicates observed discharge stations. The formulas used to calculate inflows into the hydrodynamic model domain from the SWAT model outputs are provided in Table 4.

Floodplain Friction
The Manning roughness coefficient was derived from the USGS Global Land Cover Characterization [40] database following Asante et al. [41]. Figure 7 shows the distributed Manning's roughness coefficient at the IND.

Hydrodynamic Model Setup
The hydrodynamic modelling was performed using TELEMAC 2D, a hydrodynamic model used worldwide. TELEMAC-2D solves the depth-integrated Shallow Water Equations (SWEs) using the finite element method on an irregular mesh.
In TELEMAC 2D, the SWE's are solved using a fractional step method [42], whose main principle is that the hyperbolic and parabolic parts of the SWE's should be treated separately. The solution comprises two steps: a solution of advection terms, and the solution of the propagation, diffusion, and source terms. Several schemes are available for the advection step, where the Method of Characteristics has been applied to solve the advection of u and v. The Streamline Upwind Petrov-Galerkin (SPUG) method [43] has been applied to solve the advection of h in the continuity equation. Variational formulations and space discretization transform the continuous equations into a linear triangular element, where the dependent variables u, v, and h are expanded using equal order interpolation functions. In the second step, the propagation, diffusion, and source terms are coupled through an implicit time scheme, and the resulting linear system is solved with a variant of the conjugate gradient method [44].

Mesh Generation
The entire domain was discretized using a system of irregular triangular elements. In the channel mesher, the rivers and channels were discretized separately, and in the combined mesher, the river and flood plain were joined together to form a combined mesh. The combined mesh has 220,289 nodes and 438,716 elements. The smallest length of the elements is about 50 m, and maximum length is about 844 m.

Model Boundary Conditions
The model has ten liquid boundaries. Nine of the liquid boundaries are inlets, while the last one is an outlet boundary. The location of liquid boundaries is shown in Figure 6b. Time-varying discharges were used to prescribe flows at the inlet boundaries. Among the nine inlets, two are fed with the measurement, and seven comes from the output of a SWAT model. The downstream boundary is prescribed with two types of rating (stage-discharge) curves, where the Type (1) rating curve is in the form of water level as a function of discharge, and the Type (2) rating curve is in the form of discharge as a function of water level. We used a relaxation parameter to the Type 1 rating curve to possibly obtain a stable simulation of water level and discharge free from oscillation, as suggested by Hervouet [13]. Relaxation to the rating curve gives rise to the possibility of a smoothly varying boundary condition, which may otherwise trigger instabilities [13].
Rainfall was not included in the hydrodynamic model, because local rainfall has some effect on IND flood magnitude. The IND is in a semi-arid area, and most of the water flowing in it comes from a very humid region in Guinea, 600-900 km upstream. The headwaters' annual rainfall can go up to 1750 mm. However, in IND, average annual rainfall varies between 200 mm/yr at the downstream and 500 mm/yr at the upstream [2]. We did account for precipitation and evaporation on the watershed through the SWAT model, but chose not to include precipitation directly on the water body. Zwarts [45] found that local rainfall was too limited to influence the flood height in IND.
The authors acknowledge that constant evaporation has a lot of limitations, but the estimation of evaporation in the IND is an ongoing debate, with multiple papers published on the topic, and rather diverging estimates. Evaporation varies between 160 and 240 mm per month, with an average of 200 mm per month [2]. Ogilvie et al. [29] found evaporation rates were between 4 mm/day and 7 mm/day over the years 2008-2009.
Given the huge uncertainty in the estimated evaporation and the fact that our main purpose was flood wave propagation, we believe that the choice of constant evaporation had little impact on the results. The authors are considering the inclusion of dynamic precipitation and evaporation in the model that involves modifications of the TELEMAC 2D routines.

Initial Conditions
An introductory simulation was performed with prescribed constant discharges at each inlet, as well as a constant prescribed elevation at the outlet boundary to obtain the initial condition for subsequent runs. The simulation time step was set as 60 sec. It was selected in such a way so that it would not produce any instability on the simulation. It is to be noted that the courant number gives the impression of the stability of the model to perform the calculation. The courant number of the simulation was kept below 4. Manning's friction coefficient, as discussed in the previous section, was assigned to the nodes. The Manning coefficient at the floodplain was constant during simulation. However, the Manning's coefficient at the river channel was considered variable, and tuned during calibration.

Model Configurations
Given that three elevation data sets were available and that there were two possibilities for downstream boundary conditions to be set (by prescribing a rating curve in one of the following forms: discharge as a function of water level-type 1 or water level as a function of discharge-type 2), we obtained total of six different hydrodynamic models (Table 5). Prescribing Type 1 boundary conditions often leads to resonances or uncontrollable oscillations because of the feedback of the level on the local discharge. To address the issue, the developers of TELEMAC 2D estimated water levels at time t as a weighted average of the water level, corresponding to the discharge prescribed at time t and the water level corresponding to the discharge prescribed at time t − 1. The weight of the water levels corresponding to the prescribed discharge at time t is called the relaxation coefficient (R). The developers of TELEMAC 2D recommend a value of R between 0.01 and 1.0 to smoothly prescribe discharges. In this paper, the value of R has been set to its default value of 0.02. Table 5. Configuration of the six models. Ogilvie et al. [29] found that the years 2001-2002 and 2008-2009 experienced relatively few clouds. Our study used these two periods as the calibration and validation period, respectively. MODIS and Landsat images during the months of September-February were considered for extracting water extent, as these months showed fewer clouds than the other months [29]. The periods between 1 July 2001 to 28 February 2002, and from 1 July 2008 to 28 February 2009 were thus chosen for calibration and validation, respectively.

Calibration
In flood modelling, the roughness coefficient is typically used as a calibration parameter [46]. It is commonly recognized that friction varies in space, and that the precision of simulation can be improved through calibration and the use of effective friction coefficients [47,48]. The model was calibrated by changing the Manning coefficients in the reaches of the Niger river: KeMacina to Mopti (slope = 0.052 m/km), Mopti to Akka (slope = 0.029 m/km), and Akka to Dire (slope = 0.007 m/km). The Manning coefficient in the other reaches was assumed to be constant (0.02). As discussed before, the Manning coefficient of the floodplain was derived from GLCC. The calibration period of the model was chosen as July 2001-February 2002. The simulation results were evaluated by Pearson's correlation coefficient (r) and Nash-Sutcliffe coefficients (NS) at Akka and Mopti. Pearson's correlation coefficient (r) and Nash-Sutcliffe coefficients (NS) are described in the following equations: Here, x represents the observed, and y the simulated value of a variable. x and y are the averages of the observed and simulated values, respectively. Models 1, 3, and 5 (Table 5) were calibrated for discharge and water level, and the calibration parameters were used for Models 2, 4, and 6, respectively.
After having a good calibration of the models against discharge and water level, we moved to further adjust the waterline DEM with respect to the inundated area. Adjustment was made on the elevation data to match the observed inundation area. Inundation extent maps by Zwarts et al. [2] on 28 July 2001, 25 October 1984, and 16 October 2001, which represent small, medium, and high flooded areas corresponding to 166 cm, 331 cm, and 432 cm water heights at Akka, respectively, were chosen for this purpose. The model was run for 1984-1985 with same parameter and Manning's coefficient. Underprediction, agreement, and overprediction areas were calculated by comparing simulated and observed inundation. Over-and under-predicted areas were those areas that were not supposed to submerge or dry in real cases. Simulated water depth at the overpredicted and underpredicted areas were extracted from the simulation result. The topography of the areas where inundation was overpredicted was adjusted by adding the water depth with the elevation at the grid points, and vice versa. This process continued until a good match was obtained between observed and simulated inundation.

Bayesian Model Averaging
Bayesian model averaging (BMA) is a statistical procedure used to infer a consensus prediction by weighing individual predictions based on their probabilistic likelihood measures. A BMA which assesses the performance of a model using its likelihood of predicting the observation is called the model's marginal likelihood, or Bayesian Model Evidence (BME) [49]. The BME is often normalized and transformed into model weights. The better the model, the higher its evidence, hence its weight. A BMA's outcomes have more skill and reliability than the original ensemble members generated by competing models [50,51]. Several authors [51,52] applied BMA to combine hydrological simulations. For instance, Zhu et al. [53] applied BMA for flood prediction and uncertainty estimation of flood event predictions. Beckers et al. [54] applied BMA to a set of water-level forecast models used in the Flood Forecasting and Warning System for Rhine and Meuse rivers.
In this study, we applied BMA to the predicted water level and discharge by the six models developed in this paper, with the expectation that these models' outputs will be combined into a robust prediction. BMA was applied at each location (Akka and Mopti) and for each variable. Therefore, the combined model would be different according to the location and variable of interest. A brief description of the methodology is given below, as per Hoeting et al. [21] If ∆ is the quantity of interest, then its posterior distribution, given data D, is This is an average of the posterior distributions under each of the models considered, weighted by their posterior model probability. In Equation (9) where pr(D|M k ) = pr((D|θ k , M k )Pr((θ k |M k )dθ k (11) pr(D|M k ) is the integrated likelihood of model M k , θ k is the vector of parameters of model M k (e.g., for regression θ = β, σ 2 ), Pr(θ k |M k ) is the prior density of θ k under model M k , pr(D|θ k , M k ) is the likelihood, and pr(M k ) is the prior probability that M k is the true model. pr(M k |D) is also known as the BMA weight w k , and the sum of w k is equal to 1, that is, K k=1 w k = 1. The Bayesian Model Evidence (BME) is defined as: The posterior mean and variance of ∆ are calculated as follows: Mean: Variance: where [55,56] The successful implementation of the BMA requires estimates of the weights, w k and standard deviationsof the conditional probability density function (pdfs) of the ensemble members. Their values can be estimated through the maximum likelihood method [51]. Let The logarithmic likelihood function can be approximated as follows: Equation (17) cannot be solved analytically for θ. The BMA weights can be calculated by two approaches, such as the Markov Chain Monte Carlo (MCMC) [57] algorithm and Expectation-Maximization (EM) [58] algorithm. We applied the EM algorithm, as recommended by Raftery et al. [51], to get the optimum value of θ.

Calibration Results
The calibration performance of all models is shown in Table 6. The results in Table 6 show that the downstream boundary condition affects model performance. Although the NS of discharge and water level at Mopti has no variation, the model performance varies in Akka in the simulating discharge and water level. Models 1, 3, and 5 generally performed better than Models 2, 4, and 6 in terms of NS. Results suggest that prescribing the downstream boundary condition, 'water levels as a function of discharge (Type 1)', leads to better simulations than prescribing the boundary, 'discharge as a function of water level (Type 2)'. The authors speculate that providing a Type 1 boundary condition leads to better results. The reason seems to the use of the relaxation coefficient (R), as described in Section 2.6.4, which smooths out the solution, but introduces some errors in the algorithm.
Models have been ranked in order of decreasing performance when simulating discharge (resp. water levels) in Table 7 (resp. Table 8). While models built on MERIT DEM generally performed well, the order of the models can change according to the boundary condition, the location (Akka or Mopti), the performance measure (r or NS), and finally, the variable under investigation. The results are in line with the findings of Pappenberger et al. [59] that the boundary condition is a source of uncertainty.

BMA Weights
The BMA weights calculated in the calibration period (2001)(2002) for each of the six models are given in Table 9. The fact that the best-performing model depends on several factors was an incentive to use Bayesian Model Averaging (BMA) at each location for each variable to get a location-specific combination of models exploiting the relative strength of the available models. For instance, the combined BMA simulation of discharge at Akka relies on Model 1 (weight = 0.1177) and Model 5 (0.8823), and almost completely ignores Models 2, 3, 4, and 6 (wight = 0.0000 each); the combined BMA simulation of water level at the same location relies more on Model 1 (weight = 0.9602) and Model 4 (0.0398) than on Models 2, 3, 5, and 6 (wight = 0.0000 each). BMA allows the end-user to have a robust estimate of key parameters by varying the weight according to location and variable of interest. If one location and one variable are more important for the end-user and resources are not available to run six models in parallel, the BMA weights will help them identify which model to maintain.
One of the assumptions in the BMA approach is that BMA weights should reflect relative model skill. BMA weights were highly correlated with model performance statistics [52]. Model weights and BME are probabilistic model performance measures, and both are typically associated in the context of BMA [60], as shown in Equation (12). On the other hand, NS, "r", daily root mean square error (DRMS), and daily absolute error (DABS) are used for deterministic model performance measurements. Previous authors [52,53] used performance statistics such as Nash-Sutcliffe (NS), DRMS, or DABS to examine the consistency of the BMA predictions in terms of their weight. In this exercise, we compared model performance in term of NS and "r" with their BMA weight.
If we compare the highest NS values in Table 5 with the highest BMA weight in Table 6, we can see that the highest NS value corresponds to the highest BMA weight, though the "r" values give a mixed picture. The high correlation between NS statistics and BMA weight confirms one of the central assumptions of the BMA scheme, that better-performing models receive higher weights [52]. BMA weights provide an additional means to assess the relative credibility of the six models. If we take NS statistics and BMA weight together, Model 1 was found to perform well in simulating water levels at Mopti and Akka, and Model 2 is better in simulating discharge at Mopti. Finally, Model 5 outperformed the others in simulating discharge at Akka. Thus, we considered Models 1, 2, and 5 to be the best three among the six models.

BMA Estimates of Water Levels and Discharge in the Calibration Period
The models were combined based on their respective weight values, as shown in Table 6. The maximum weights of the value of the individual model was found to contribute more to the resultant forecast. For instance, the BMA forecast of discharge at Mopti mainly comes from the combination of Model 2 and Model 3 with 79.31% and 14.17%, respectively, from their simulated discharge. A comparison of the discharge and water level from the best individual models, as well as the BMA prediction at the stations, are given in Figure 8. The BMA estimate of discharge and water level at the monitoring stations is shown in Figure 8, where there seems to be a good match between observed values (red) and BMA predictions (black). The highest NS values ( Figure 9) confirms that BMA outperforms each best individual model.

Hydrodynamic Models Validation
The model results were validated using the 2008-2009 discharge, water levels, and water extent. Each of the models was run with inflows of 2008-2009 at the model inlets keeping the same parameters as obtained at the calibration. Table 10 summarizes the NS and "r" statistics of the model simulations.

BMA Validation
The weights obtained from the calibration periods were used to compute BMA predictions for the validation period. The performance statistics, including NS and "r", were again employed to examine the consistency of the BMA predictions. A comparison of NS statistics between BMA and each of the models 1, 2, and 5 are shown in Figure 10. Discharge and water levels simulated by Models 1, 2, and 5 have been compared to BMA estimates in Figure 11. The figure shows a good match between observed values (red) and BMA predictions (black).
It can be seen that the BMA predictions perform better than the best individual in most of the cases. Minor degradation of NS statistics also took place; for instance, BMA performance in the validation is slightly worse than that of Models 1 and 2 in predicting discharge at Mopti and the water level at Akka, respectively. Although there is some minor degradation in the performance when the BMA result is compared with the best individual models, BMA is generally superior to that of the individual models, which is consistent with previous studies [52,61]. The improvement due to BMA is relatively low when NS is close to 1, as there is then little room for improvement. When the NS is low, like in the simulation of discharge at Akka (Figures 9 and 10), the improved accuracy is much more obvious. The use of BMA for estimating both variables is, therefore, preferable to the use of individual model simulations. This feature of BMA is particularly interesting in data-scarce environments, such as Mali, where the only available information often comes from global sources with rather coarse resolutions.

Simulated Inundation Extent from Best Individual Models
This section provides a brief comparison between simulated and satellite-derived inundation extents for 2008-2009. Only the best individual models (Models 1, 2, and 5), as obtained from Section 3.1.1, are considered in the comparison. The reference inundation extents were derived from MODIS imagery for three days, representing the rising, peak, and recession limb of the 2008-2009 hydrograph. These images were taken on 28 August 2008, 7 October 2008, and 17 January 2009. The inundation extent simulated with Models 1, 2, and 5 along with the satellite-derived inundation extent is shown in Figure 12. The simulated inundation in the central part of the study area, containing Lakes Walado, Debo, and Korientze, shows better agreement with the observed inundation for Model 5 for almost all the three days. The flood extent between KeMacina and Akka on 28 August was better simulated by Model 5 than Models 1 and 2. The inundation pattern simulated by Model 5 also closely matched with the observed inundation on 7 October.
It was also found that, while Model 5 performed well in reproducing the inundation extent of 28 August 2008 and 7 October 2008, it performed poorly in reproducing the inundation extent of 17 January 2009 (receding flood), especially in the area between Ke-Macina and Akka. The simulations show water in interconnecting channels that seem to be dry in the satellite images. This, however, has to be interpreted with care, as the coarse resolution of the MODIS images (500 m) does not allow the detection of narrow channels.
Based on the above facts, we can consider Model 5 to be the best in terms of simulating inundation. It is worth mentioning here that among the individual models, Model 5 was calibrated against inundation obtained from Landsat images captured in the 2001-2002 season. This probably makes this particular model the best among the participating models in simulating inundation extent.

Potential Usages of the Developed Model
The IND is a complex floodplain that is used by 1.5 million people, where there is little information about floodplain dynamics. Simple empirical equations are used to link water levels at key stations to inundation extents, and flood forecasting is done using linear regressions [62]. A hydrodynamic model, even if not perfect, provides more insight into the flood dynamics, and will be more useful for decision making. The model developed here will be later used to replace or improve the empirical relationships currently used by local authorities, and to develop a process-based flood forecasting model.

Conclusions
In this study, six different hydrodynamic models of the Inner Niger Delta (IND) in Mali were developed using three different DEMs and two different downstream boundary conditions. The elevation data came from two global DEMS (SRTM and MERIT), as well as a DEM derived from satellite imagery and water level observations using waterline methods. Given the data scarcity in the area, a considerable amount of input data, such as bathymetry, the river network, friction coefficient, and inundation extent were derived from secondary sources, leading to a high level of uncertainty in the simulations. The six models which were developed showed unequal performances in simulating water levels, discharges, and floodplain extent, with none of them outperforming the others in all fronts. It was found that specifying the boundary condition (water levels as a function of discharge) led to better results. The reason for this seems to the use of the relaxation coefficient in the algorithm of the stage-discharge curve. Overall, the MERIT DEM was the best DEM among the ones considered in this paper, as it is a processed DEM by removing errors from SRTM. Bayesian Model Averaging (BMA) was used to produce a more robust estimate of water levels and discharges. BMA also helped to identify which individual models had the highest value for an end-user who is more interested in a particular location in the IND and a particular hydraulic variable.