Integrating Water Observation from Space Product and Time-Series Flow Data for Modeling Spatio-Temporal Flood Inundation Dynamics

Periodic inundation of floodplains and wetlands is critical for the well being of ecosystems. This study proposes a simple but efficient model that integrates time series daily flow data and the Landsat-derived Water Observation from Space (WOfS) product to model the spatio-temporal flood inundation dynamics of the Murray-Darling Basin. A zone-gauge framework is adopted in order to reduce the hydrologic complexity of the large river basin. Under this framework, flood frequency analysis was conducted at each gauge station to identify historical peak flows and their annual exceedance probabilities. The results were then linked with the WOfS dataset through date to model the inundation probability in each zone. Inundation frequency was derived by simply overlaying the yearly inundation extent from 1988 to 2015 and counting the inundation times. Both the resultant inundation frequency map and inundation probability map are of ecological significance for the survival and prosperity of riparian ecosystems. The assumptions of the model were validated carefully to enhance its theoretical basis. The WOfS dataset was also compared with another independent water observation dataset to cross-validate its reliability. It is hoped that with the development of more and more global high-resolution surface water datasets, this study could inspire more studies that integrate surface water datasets with hydrological observations for flood inundation modeling.


Introduction
Although covering only about 4% to 6% of the Earth's ice-free land surface [1,2], wetlands and floodplains play a key role in ecological and hydrological cycles. Their inundation usually exhibits obvious seasonal and inter-annual variations [3], which alters their connectivity to the main stem rivers and thus changes nutrient retention and processing [4]. This also affects the aquatic system productivity [5], as well as spawning and nursery habitat [6,7]. The structure and function of riverine ecosystems are also closely linked to the inundation dynamics [8]. Therefore, understanding inundation dynamics of wetlands and floodplains is essential for riparian ecological studies. is required to convert the area value into surface water maps, which is still a challenging work by now.
This study, therefore, aims to present a method that links WOfS dataset with time series flow data to model the spatio-temporal inundation dynamics over the Murray-Darling Basin, Australia's largest river basin. They objectives of this study include: 1) to map the flood inundation frequency over this large river basin from 1988 to 2015; 2) to model the flood inundation probability of the basin. Comparing with our previous work [22], longer time series and higher resolution WOfS dataset is employed to replace MODIS, which obviously would improve the model due to higher quality of input data. Moreover, the assumptions of the model are carefully validated to ensure its reliability, which has not been conducted in the previous work.

Study Area
The Murray-Darling Basin (MDB), covering an area of over 1,000,000 km 2 (Figure 1), supports over a third of Australia's agricultural production [32]. It is also home to some 30,000 wetlands, 200 of which are deemed important Australian wetlands, and 16 of which are recognized internationally for their ecological importance (Ramsar sites) [33]. There is an obvious elevation gradient from southern and eastern margins toward the inland areas. Rivers in MDB generally arise from uplands in the south and east, and flow into its low-lying and flat inland areas [34,35]. Rainfall is spatially uneven, mostly occurring in the east margins [36].

Flow Data
There are over 700 gauges distributed through the MDB (Figure 1). Some of them are maintained by the Murray-Darling Basin Authority (MDBA), and others are maintained by government services of each state, such as New South Wales Office of Water. Observed flow data at these gauges are collected from these agencies and recorded as daily discharge in mega liters per day (ML/day).

WOfS Dataset
The WOfS dataset was acquired from Geoscience Australia's web service (http://www.ga.gov.au/ scientific-topics/hazards/flood/wofs). Each grid cell of WOfS maintains the number of clear satellite observations, the number of water occasions, the percentage of clear observations, and the confidence of water observation. The clear observations are irregular from day to day, from place to place, due to the continuation of Landsat missions and unpredictable cloud cover. Therefore, we generate monthly maximum water maps by overlaying all the observations in a month and selecting all the clear observations of water. Yearly maximum water maps are derived similarly by overlaying all monthly maps of a year. Generation of monthly and yearly maximum water datasets achieves spatially continuous observations by sacrificing the temporal resolution. In this study, WOfS data from 1988 to 2015 were downloaded and used.

Constructing a Zone-Gauge Framework
Flood inundation dynamics in the MDB are quite variable across the basin due to the basin's rich river networks and complex topography. In order to reflect the spatial variation in flood pulses and decrease time lags between flow and inundation, a zone-gauge framework was proposed in the Murray-Darling Basin Floodplain Inundation Modeling (MDB-FIM) projects [37,38], which was later modified by Huang et al. [39] who divided the basin into 90 zones based on a continental scale drainage network product, the Australian Hydrological Geospatial Fabric (Geofabric) published by the Bureau of Meteorology of Australia (http://www.bom.gov.au/water/geofabric/). Each zone was assigned a representative downstream gauge that best records the flow through the zone. Some of the zones were assigned a virtual gauge that was a combination of two or three observed gauges to match as closely as possible to reality. The combination was achieved by addition or subtraction in arithmetic, depending on the location of the gauges. Some of the zones have no clear linkage to the major rivers in the basin and no corresponding gauges. These ungauged zones were excluded in this study because they are generally not inundated.
This study required gauges with observed flow series that are longer than 30 years. Therefore, the original 90-zone framework was modified by merging those zones that have shorter series with others, and a framework with 67 pairs of zones and gauges ( Figure 2) was finally adopted in this study. Under this framework, it was assumed that the flood inundation extent in each zone was closely related with the observed flow at its downstream gauge. Higher flow would usually correspond to a larger inundation extent.

Modeling Flood Inundation Dynamics
Spatio-temporal modeling of flood inundation in MDB is based on the zone-gauge framework, which consists of 67 pairs of zones and gauges. Under this framework, inundation extent in each zone was assumed to be closely related with the observed flow at its corresponding gauge. Under this assumption, flood frequency analysis was carried out to the time-series flow data of each gauge based on the annual flood series method, considering that the flow in the study area generally exhibits a pattern of inter-annual variation [22].
An annual flood is defined as the highest peak discharge in a water year. Flood frequency analysis is to estimate the exceedance probability of different flow peaks based on the historic flow peaks. To do that, we should first assume that flow peaks obey some kind of probability distribution. Gumbel distribution is a statistical method often used for predicting extreme events such as earthquake and floods [40]. Therefore, we assume the flow peaks observed at all the gauges fit the Gumbel distribution.
The probability plotting positions are a type of empirical probability distribution and provide appropriate non-exceedance probabilities for the extreme values [41]. While there are many plotting position formulas, such as the California, Hazen, Weibull, and Gringorten methods [42], the Gringorten method [43] was recommended for the Gumbel distribution [42,[44][45][46]. Therefore, we have adopted this method. It calculates the Annual Exceedance Probability (AEP) as Equation (1).

Modeling Flood Inundation Dynamics
Spatio-temporal modeling of flood inundation in MDB is based on the zone-gauge framework, which consists of 67 pairs of zones and gauges. Under this framework, inundation extent in each zone was assumed to be closely related with the observed flow at its corresponding gauge. Under this assumption, flood frequency analysis was carried out to the time-series flow data of each gauge based on the annual flood series method, considering that the flow in the study area generally exhibits a pattern of inter-annual variation [22].
An annual flood is defined as the highest peak discharge in a water year. Flood frequency analysis is to estimate the exceedance probability of different flow peaks based on the historic flow peaks. To do that, we should first assume that flow peaks obey some kind of probability distribution. Gumbel distribution is a statistical method often used for predicting extreme events such as earthquake and floods [40]. Therefore, we assume the flow peaks observed at all the gauges fit the Gumbel distribution.
The probability plotting positions are a type of empirical probability distribution and provide appropriate non-exceedance probabilities for the extreme values [41]. While there are many plotting position formulas, such as the California, Hazen, Weibull, and Gringorten methods [42], the Gringorten method [43] was recommended for the Gumbel distribution [42,[44][45][46]. Therefore, we have adopted this method. It calculates the Annual Exceedance Probability (AEP) as Equation (1). where r is the rank of the annual flow peaks from largest to smallest; N is the number of years for the record length. According to the date of annual flow peaks, maximum water observation in its corresponding month would be selected, and assigned a value equal to the AEP of this peak to make a probability map. All the probability maps corresponding to each flow peak were then overlaid by taking the maximum probability to generate a final flood inundation probability map. On the other hand, a flood inundation frequency map was derived by simply overlaying all the yearly maximum water observations from 1988 to 2015 and calculating the annual inundated times of each pixel. A flowchart of the modeling process is given in Figure 3. where r is the rank of the annual flow peaks from largest to smallest; N is the number of years for the record length. According to the date of annual flow peaks, maximum water observation in its corresponding month would be selected, and assigned a value equal to the AEP of this peak to make a probability map. All the probability maps corresponding to each flow peak were then overlaid by taking the maximum probability to generate a final flood inundation probability map. On the other hand, a flood inundation frequency map was derived by simply overlaying all the yearly maximum water observations from 1988 to 2015 and calculating the annual inundated times of each pixel. A flowchart of the modeling process is given in Figure 3.  Figure 4 shows the inundation frequency in MDB over the period , with each pixel indicating the annual inundated times in these 28 years. Note that it was derived from annually aggregated maximum flood extent, thus intra-annual variability and hydroperiods were not captured. Enlarged areas are four of the Ramsar sites, Narran Lake Nature Reserve, Riverland, Kerang Wetlands, and Currawinya Lakes. This map shows how often an area has been submerged by water. It is clear that frequently inundated areas are generally around river channels. This is because floods in this basin are usually caused by the overbank flow of rivers. Floodplain and wetland ecosystems rely heavily on the overbank flow for water supply. River red gum for example, needs to receive adequate surface flooding periodically to stay healthy [47]. Some aquatic organisms, such as some fish species, also rely on flood pulse to broad their habitats and gain prosperity [48]. Therefore, this resultant map can be a useful indicator for the survival and prosperity of riparian biota communities, which therefore makes it a helpful tool for building knowledge of the relationship between ecosystem conditions and the flooding water.  Figure 4 shows the inundation frequency in MDB over the period , with each pixel indicating the annual inundated times in these 28 years. Note that it was derived from annually aggregated maximum flood extent, thus intra-annual variability and hydroperiods were not captured. Enlarged areas are four of the Ramsar sites, Narran Lake Nature Reserve, Riverland, Kerang Wetlands, and Currawinya Lakes. This map shows how often an area has been submerged by water. It is clear that frequently inundated areas are generally around river channels. This is because floods in this basin are usually caused by the overbank flow of rivers. Floodplain and wetland ecosystems rely heavily on the overbank flow for water supply. River red gum for example, needs to receive adequate surface flooding periodically to stay healthy [47]. Some aquatic organisms, such as some fish species, also rely on flood pulse to broad their habitats and gain prosperity [48]. Therefore, this resultant map can be a useful indicator for the survival and prosperity of riparian biota communities, which therefore makes it a helpful tool for building knowledge of the relationship between ecosystem conditions and the flooding water.  Figure 5 shows the inundation probability that was derived by integrating time series flow data and WOfS product. The probabilities in this map were derived from the AEP of flood frequency analysis, which indicates the probability of a given flow peak being exceeded in any given year. Through combining with corresponding inundation extent that was derived by remote sensing data, it therefore projects future possible inundation based on historic flow and inundation status. The AEP-derived probabilities in this map can be related with another commonly used measurement, the Average Recurrence Interval (ARI). For example, AEP of 0.632 generally equals to a 1-year recurrence interval, and 0.181 equals to a 5-year recurrence interval (Equation (2)). Therefore, the flood inundation probability map can be easily converted into flood inundation maps of different ARIs ( Figure 6). From Figure 6, it is observed that most area of the Narran Lake Nature Reserve was inundated at least once every year or every two years, which suggests that its wetlands receives flooding water regularly and frequently. For the Riverland, its wetlands are generally inundated once every five or ten years. Most of the Kerang Wetlands are permanent water bodies that have water cover every year. Except for that, some areas could be inundated by two-or five-year floods. Currawinya Lakes have a relative diverse inundation situation. Wetland areas on the upstream are inundated once every five years, while the downstream wetlands can generally be inundated every two years. The ARI maps can be helpful for pre-formulating future flow regimes to ensure important  Figure 5 shows the inundation probability that was derived by integrating time series flow data and WOfS product. The probabilities in this map were derived from the AEP of flood frequency analysis, which indicates the probability of a given flow peak being exceeded in any given year. Through combining with corresponding inundation extent that was derived by remote sensing data, it therefore projects future possible inundation based on historic flow and inundation status. The AEP-derived probabilities in this map can be related with another commonly used measurement, the Average Recurrence Interval (ARI). For example, AEP of 0.632 generally equals to a 1-year recurrence interval, and 0.181 equals to a 5-year recurrence interval (Equation (2)). Therefore, the flood inundation probability map can be easily converted into flood inundation maps of different ARIs ( Figure 6). From Figure 6, it is observed that most area of the Narran Lake Nature Reserve was inundated at least once every year or every two years, which suggests that its wetlands receives flooding water regularly and frequently. For the Riverland, its wetlands are generally inundated once every five or ten years. Most of the Kerang Wetlands are permanent water bodies that have water cover every year. Except for that, some areas could be inundated by two-or five-year floods. Currawinya Lakes have a relative diverse inundation situation. Wetland areas on the upstream are inundated once every five years, while the downstream wetlands can generally be inundated every two years. The ARI maps can be helpful for pre-formulating future flow regimes to ensure important riparian ecosystems receiving enough water supply. River red gum forests in MDB for example, require flooding water once every five years or more often [49]. As a result, their distribution should be contained within the five year inundation extent, which can be derived from Figure 5 by taking those pixels with probabilities greater than 0.181. Similarly, this resultant map can be useful for identifying other flora and fauna communities that may be in danger due to possible lack of water.

Flood Inundation Probability
Remote Sens. 2019, 11, x FOR PEER REVIEW 8 of 18 riparian ecosystems receiving enough water supply. River red gum forests in MDB for example, require flooding water once every five years or more often [49]. As a result, their distribution should be contained within the five year inundation extent, which can be derived from Figure 5 by taking those pixels with probabilities greater than 0.181. Similarly, this resultant map can be useful for identifying other flora and fauna communities that may be in danger due to possible lack of water.

Consistency between Flow and Inundation in Each Zone
The core assumption of this study is that in each zone, the inundation extent is closely related with the observed flow at its complementary gauge. In order to validate this assumption, the inundation area for each month was first calculated and plotted together with daily flow. Figure 7 shows the variation of inundation area (blue line) and flow (red line) from 1987 to 2015 for five demonstration zones. It is clear that flow and inundation in all these zones have similar pattern, with large inundation extent usually accompanying high flow.

Consistency between Flow and Inundation in Each Zone
The core assumption of this study is that in each zone, the inundation extent is closely related with the observed flow at its complementary gauge. In order to validate this assumption, the inundation area for each month was first calculated and plotted together with daily flow. Figure 7 shows the variation of inundation area (blue line) and flow (red line) from 1987 to 2015 for five demonstration zones. It is clear that flow and inundation in all these zones have similar pattern, with large inundation extent usually accompanying high flow.
Pearson's correlation between the flow volume and the inundation area was performed for each zone to further validate the assumption for the whole basin. Pearson's r and corresponding p-value were calculated and displayed in Figure 8a,b. It can be observed that most of the zones have high r-values with p-values less than 0.05, demonstrating strong consistence between flow and inundation. Zones in the southeast margin of MDB have relatively smaller r values, and some of the zones have high p-values, suggesting a weaker relationship between the inundation extent and flow. After checking their flow and inundation series, it was found that some of these small zones located completely in a stripe that has too many miss Landsat coverages, which leads to non-significant correlations. The other zones are likely to be affected by the more frequent precipitation there. Water that comes from rainfall weakens the agreement between the observed flow magnitude/volume and remotely sensed inundation. Pearson's correlation between the flow volume and the inundation area was performed for each zone to further validate the assumption for the whole basin. Pearson's r and corresponding p-value were calculated and displayed in Figure 8a and Figure 8b. It can be observed that most of the zones have high r-values with p-values less than 0.05, demonstrating strong consistence between flow and inundation. Zones in the southeast margin of MDB have relatively smaller r values, and some of the zones have high p-values, suggesting a weaker relationship between the inundation extent and flow. After checking their flow and inundation series, it was found that some of these small zones located completely in a stripe that has too many miss Landsat coverages, which leads to non-significant correlations. The other zones are likely to be affected by the more frequent precipitation there. Water that comes from rainfall weakens the agreement between the observed flow magnitude/volume and remotely sensed inundation.   Pearson's correlation between the flow volume and the inundation area was performed for each zone to further validate the assumption for the whole basin. Pearson's r and corresponding p-value were calculated and displayed in Figure 8a and Figure 8b. It can be observed that most of the zones have high r-values with p-values less than 0.05, demonstrating strong consistence between flow and inundation. Zones in the southeast margin of MDB have relatively smaller r values, and some of the zones have high p-values, suggesting a weaker relationship between the inundation extent and flow. After checking their flow and inundation series, it was found that some of these small zones located completely in a stripe that has too many miss Landsat coverages, which leads to non-significant correlations. The other zones are likely to be affected by the more frequent precipitation there. Water that comes from rainfall weakens the agreement between the observed flow magnitude/volume and remotely sensed inundation.

Testifying Gumbel Distribution
Another assumption of this study is that the flow peaks fit the Gumbel distribution. The cumulative distribution function of Gumbel distribution is as shown in Equation (3).
where x stands for the volume of flow peaks. Taking logarithm to both sides of this equation, a new equation (Equation (4)) is derived as below.
Then, y will linearly correlate with x, if flow peaks obey the Gumbel distribution. In Equation (5), independent F(x) values indicate the occurrence probability of a flow that is smaller or equal to x. Therefore, it can be calculated by subtracting AEPs of flow peaks from 1. Linear regression was then conducted to x (flow peaks) and y. Coefficient of determination (R 2 ) was calculated for each zone and displayed on the map (Figure 9). It can be observed that, most of the zones have high R 2 , except for zone 27 and zone 52, whose coefficients of determination are 0.48 and 0.57 respectively. After checking the flow data of both zones, it was found that zone 27 had an incredibly high flow peak in 1990, corresponding to a major flood event in Macquarie Marshes that year, while most of the flow peaks in zone 52 are 0. This explains why both zones have relatively low R 2 . Therefore, for the majority of the zones, the Gumbel distribution represents flow peak patterns properly. It is legitimate to assume that the flow peaks in the MDB obey the Gumbel distribution.
Another assumption of this study is that the flow peaks fit the Gumbel distribution. The cumulative distribution function of Gumbel distribution is as shown in Equation (3).
where x stands for the volume of flow peaks. Taking logarithm to both sides of this equation, a new equation (Equation (4)) is derived as below.
Here we introduce another variable y calculated as in Equation (5).
Then, y will linearly correlate with x, if flow peaks obey the Gumbel distribution. In Equation (5), independent F(x) values indicate the occurrence probability of a flow that is smaller or equal to x. Therefore, it can be calculated by subtracting AEPs of flow peaks from 1. Linear regression was then conducted to x (flow peaks) and y. Coefficient of determination (R 2 ) was calculated for each zone and displayed on the map (Figure 9). It can be observed that, most of the zones have high R 2 , except for zone 27 and zone 52, whose coefficients of determination are 0.48 and 0.57 respectively. After checking the flow data of both zones, it was found that zone 27 had an incredibly high flow peak in 1990, corresponding to a major flood event in Macquarie Marshes that year, while most of the flow peaks in zone 52 are 0. This explains why both zones have relatively low R 2 . Therefore, for the majority of the zones, the Gumbel distribution represents flow peak patterns properly. It is legitimate to assume that the flow peaks in the MDB obey the Gumbel distribution.

Validating Surface Water Observation Dataset
Pekel et al. [30] have mapped global surface water dynamics of the last three decades through the analysis of three million Landsat images using the GEE cloud computing platform, and

Validating Surface Water Observation Dataset
Pekel et al. [30] have mapped global surface water dynamics of the last three decades through the analysis of three million Landsat images using the GEE cloud computing platform, and distributed their results as a Global Surface Water Dataset (GSWD) on the GEE. It was reckoned to have provided the best understanding of the changes in our planet's surface water [50]. WOfS is also a Landsat-derived surface water dataset that was developed specifically for Australia. Here we want to give a general impression regarding to how big the difference is between both datasets, so that even though we may not be able to find proper references to validate both datasets, cross-validating them would also reveal their reliability to some extent.
An inundation frequency map that records the number of times that a pixel was flagged as water in the annually aggregated water map in the 28 years (1988-2015) was also derived from GSWD using GEE. It was then compared with the inundation frequency map derived from WOfS ( Figure 4). Figure 10 shows their difference in the four Ramsar sites. It is clear that both datasets are not exactly the same, with differences existing in all the four sites. However, absolute differences of most pixels are less than 6, suggesting that their differences are in a relatively small range. For the Narran Lake site (Figure 10a), there are some areas that have differences bigger than 7, or even 13. This means that in these areas, WOfS dataset identifies more surface water than GSWD does. Similar situations also exits in the Currawinya Lakes (Figure 10d). Differences in Riverland (Figure 10b) and Kerang Wetlands (Figure 10c) are mostly bigger than −6 and less than 6. Both overestimation and underestimation exit for the WOfS dataset, comparing with the GSWD. It has also to be noted that although both datasets were derived from Landsat imagery, their spatial references are different, which makes their spatial resolutions slightly different. The WOfS has a spatial resolution of approximately 25 m, while the GSWD has a spatial resolution of approximately 30 m. Their difference in spatial resolution would inevitably cause some inconsistences between their inundation frequency maps. This could be one important reason why there are so many differences existing in both frequency maps. Meanwhile, the current comparison was based on annual aggregates. Both datasets could have had entirely different individual surface water maps at any given time step, but these differences could have been masked out entirely in the annual aggregation step. For example, Figure 11 shows a comparison of monthly maximum water observation in WOfS and GWSD under both high flow and low flow scenarios. It is observed that they match well at some sites (e.g., Currawinya Lakes) no matter the flow is high or low. While at some other sites (e.g., Riverland), WOfS dataset tends to miss some tiny in-channel water when the flow is at low level. In this study, since we were only using high flow peaks and their corresponding water observation, this is thus unlikely to affect our modeling accuracy.
a Landsat-derived surface water dataset that was developed specifically for Australia. Here we want to give a general impression regarding to how big the difference is between both datasets, so that even though we may not be able to find proper references to validate both datasets, cross-validating them would also reveal their reliability to some extent.
An inundation frequency map that records the number of times that a pixel was flagged as water in the annually aggregated water map in the 28 years (1988-2015) was also derived from GSWD using GEE. It was then compared with the inundation frequency map derived from WOfS ( Figure 4). Figure  10 shows their difference in the four Ramsar sites. It is clear that both datasets are not exactly the same, with differences existing in all the four sites. However, absolute differences of most pixels are less than 6, suggesting that their differences are in a relatively small range. For the Narran Lake site (Figure 10a), there are some areas that have differences bigger than 7, or even 13. This means that in these areas, WOfS dataset identifies more surface water than GSWD does. Similar situations also exits in the Currawinya Lakes (Figure 10d). Differences in Riverland (Figure 10b) and Kerang Wetlands (Figure 10c) are mostly bigger than -6 and less than 6. Both overestimation and underestimation exit for the WOfS dataset, comparing with the GSWD. It has also to be noted that although both datasets were derived from Landsat imagery, their spatial references are different, which makes their spatial resolutions slightly different. The WOfS has a spatial resolution of approximately 25 m, while the GSWD has a spatial resolution of approximately 30 m. Their difference in spatial resolution would inevitably cause some inconsistences between their inundation frequency maps. This could be one important reason why there are so many differences existing in both frequency maps. Meanwhile, the current comparison was based on annual aggregates. Both datasets could have had entirely different individual surface water maps at any given time step, but these differences could have been masked out entirely in the annual aggregation step. For example, Figure 11 shows a comparison of monthly maximum water observation in WOfS and GWSD under both high flow and low flow scenarios. It is observed that they match well at some sites (e.g., Currawinya Lakes) no matter the flow is high or low. While at some other sites (e.g., Riverland), WOfS dataset tends to miss some tiny in-channel water when the flow is at low level. In this study, since we were only using high flow peaks and their corresponding water observation, this is thus unlikely to affect our modeling accuracy.

Discussion
Like many large river basins, the MDB embraces a broad variety of small river systems with unique climate and hydro-morphological characteristics, flow and flooding regimes as well as highly variable levels of river regulation and water abstraction for irrigated agriculture and other uses [51]. This makes the modeling of their flood inundation much more difficult. The zone-gauge framework proposed in MDB-FIM [37,38] and Huang et al. [39], and later adopted by Huang et al. [22] and Heimhuber et al. [31,52], turns out to be an effective way to deal with this difficulty. It is also applicable to other similar large river basins. However, it has to be noted that in the zone-gauge framework, it is important to have a close correlation between inundation extent in each zone and river flow observed at its corresponding gauge. To ensure this, zone delineation and gauge selection need to be carried out very carefully. Even when the correlations are ensured, some other factors may still introduce uncertainties to the proposed model. One is the time lag between gauge flow observation and inundation observation at different areas in the zone. Inundation extent derived from a single remote sensing image can only reflect an instantaneous inundation condition, which makes no single inundation map can be concisely linked to the observed flow. But it is possible to model the time lag based on the grid handling, as achieved by Heimhuber et al. [31]. Another source of uncertainties is the flood frequency analysis on the time series flow data, which is a statistical modeling process that tries to project future flooding regime from historical data. In this study, to deal with the time lag issue, we used monthly maximum water map, which is derived from a composite of images, instead of a single image, to serve the purpose of better matching with observed flow peak. In addition, the assumptions of flood frequency analysis have been carefully validated, in order to prove that uncertainties from this part are manageable.
Except for the modeling processes, the input data themselves also embrace uncertainties. The observed flow data, for example, were normally considered as the most reliable in situ hydrological observations. But there are many different sources of errors that would affect the accuracy of observed flow [53]. For the inundation observation using remote sensing technology, uncertainties are also inevitable. On the one hand, they come from remote sensing data themselves, on the other hand, they come from the interpretation methods. As was demonstrated in the last section, due to

Discussion
Like many large river basins, the MDB embraces a broad variety of small river systems with unique climate and hydro-morphological characteristics, flow and flooding regimes as well as highly variable levels of river regulation and water abstraction for irrigated agriculture and other uses [51]. This makes the modeling of their flood inundation much more difficult. The zone-gauge framework proposed in MDB-FIM [37,38] and Huang et al. [39], and later adopted by Huang et al. [22] and Heimhuber et al. [31,52], turns out to be an effective way to deal with this difficulty. It is also applicable to other similar large river basins. However, it has to be noted that in the zone-gauge framework, it is important to have a close correlation between inundation extent in each zone and river flow observed at its corresponding gauge. To ensure this, zone delineation and gauge selection need to be carried out very carefully. Even when the correlations are ensured, some other factors may still introduce uncertainties to the proposed model. One is the time lag between gauge flow observation and inundation observation at different areas in the zone. Inundation extent derived from a single remote sensing image can only reflect an instantaneous inundation condition, which makes no single inundation map can be concisely linked to the observed flow. But it is possible to model the time lag based on the grid handling, as achieved by Heimhuber et al. [31]. Another source of uncertainties is the flood frequency analysis on the time series flow data, which is a statistical modeling process that tries to project future flooding regime from historical data. In this study, to deal with the time lag issue, we used monthly maximum water map, which is derived from a composite of images, instead of a single image, to serve the purpose of better matching with observed flow peak. In addition, the assumptions of flood frequency analysis have been carefully validated, in order to prove that uncertainties from this part are manageable.
Except for the modeling processes, the input data themselves also embrace uncertainties. The observed flow data, for example, were normally considered as the most reliable in situ hydrological observations. But there are many different sources of errors that would affect the accuracy of observed flow [53]. For the inundation observation using remote sensing technology, uncertainties are also inevitable. On the one hand, they come from remote sensing data themselves, on the other hand, they come from the interpretation methods. As was demonstrated in the last section, due to different interpretation methods, WOfS and GSWD exhibit differences from place to place, although both were derived from Landsat images.
Unlike the previous study [22] that used MODIS for inundation mapping, this study used a Landsat-derived WOfS dataset, which provides longer time series and higher spatial resolution inundation extents. On the one hand, higher spatial resolution ensures higher accuracy of inundation. Validation results have confirmed the close relation between the inundation and flow. On the other hand, longer time series guarantee more observed inundation samples available, which makes the model more robust. In the meantime, it has also to be noted that the temporal resolution of Landsat may be a limitation for the model. Due to Landsat's 16-day revisit and cloud cover, peak flood extents would probably often be missed. In this study, we used WOfS monthly maximum water observations to be linked with flow peaks. They have even coarser temporal resolution, but through composition, cloud coverage can be minimized and image quality is significantly improved. Therefore, using monthly composited data is a compromise approach that sacrifices temporal resolution for high image quality. Besides, considering the propagation process of flood inundation, it is legitimate to use a monthly maximum composited water observation to be linked with the flow peak.
Traditional approach for simulating the extent of flooding is through hydrodynamic modeling. Increases in computation power and data availability have led to major advances in continental and even global scale flood modeling in recent years [13]. For example, Schumann et al. [54] applied a hydrodynamic model to generate a 40-year floodplain inundation dynamics across Australia and found good agreement with total inundated areas derived from a Landsat time series. Nevertheless, parameterization of these models remains difficult [55] and accurate representation of complex river and floodplain topographies and the corresponding storage effects is still limited by the quality of suitable datasets such as digital elevation models (DEMs) with global coverage [56]. Moreover, these large-scale hydrodynamic models are still limited for accurately quantifying the complex and fine-scaled dynamics of periodically inundated and potentially shifting inundated areas across large regulated river basins and over long periods of time [18]. Remote sensing technology provides an effective way for quantifying and monitoring surface water dynamics over large areas and long periods of time. With the development of big data and cloud computing technologies, more and more high-quality, long time series, and high-resolution surface water observation datasets can be easily generated. These datasets can be used as valuable input to the hydrodynamic models, or as important reference to calibrate or validate the models [57]. This study proposed a simple but effective model that integrates one of this kind of dataset with flood frequency analysis. It is anticipated that this kind of remote sensing derived dataset could also be integrated with hydrodynamic models to achieve more robust and more reliable flood inundation modeling results.

Conclusions
Time series flow observed at gauges provide rich information for studying flood return frequency, but information of spatial extent of flooding is still lacking [58]. Satellite observation is an efficient method for acquiring spatial inundation over large river basins. According to Huang et al. [59], there are three advantages to integrate remotely sensed data with in situ gauge flow data. First, flow data can help select appropriate remote sensing images for flood detection. Second, remote sensing helps flood observation transfer from point-based to region-based. Last, flow data usually have long enough time series for hydrological analysis, which endows remotely sensed inundation hydrological characteristics.
This study proposed a simple but efficient method that integrates time series flow data and a global water observation product to model the flood inundation dynamics. Since a close relationship between flow and inundation is required as the basis of the model, it is important that a large river basin be divided into small hydrological zones first. This is why we adopted the zone-gauge framework for modeling inundation in the MDB. Through validation analysis as described herein, it has been proven that under this framework, the assumptions and basis of the model are tenable for most of MDB.
However, it has to be noted that there are still some other uncertainties that may affect the modeling results. For example, inundation variations caused by other drivers other than river dynamics, such as precipitation, groundwater seepage, irrigation, as well as various land and water management strategies, are usually not related to the gauge flow [8]. This would obviously affect the assumption basis of the model.
This study used a Landsat-derived water observation product that has a much longer time series than MODIS data, which obviously would benefit the proposed modeling method. However, the coarse temporal resolution of Landsat would also probably missed rapid inundation, which affects the modeling results. In the future, we are expecting some water observation data sources that have both high spatial resolution and high temporal resolution. This may be acquired from very advanced satellite missions (e.g., CubeSat), or generated through blending multiple data sources, such as MODIS, Landsat, Sentinel-1/2, etc.
As demonstrated in the MDB case study, the proposed modeling method can be applied to any river basin that has a long time series of observed flow data that can be linked to WOfS or any other remotely sensed inundation extent, such as GSWD. The inundation frequency and inundation probability maps show patterns of historic floodplain inundation and projects future possible inundation status. They are useful for revealing water status, estimating water stress and pre-formulating water regime for riverine ecosystems. This method is relatively easy to implement and would provide environmental researchers and managers with important information to assist them for water resource management and wetland ecosystem studies.