The FLOod Probability Interpolation Tool (FLOPIT): A Simple Tool to Improve Spatial Flood Probability Quantification and Communication

Abstract: Understanding flood probabilities is essential to making sound decisions about flood-risk management. Many people rely on flood probability maps to inform decisions about purchasing flood insurance, buying or selling real estate, flood-proofing a house, or managing floodplain development. Current flood probability maps typically use flood zones (for example, the 1 in 100 or 1 in 500-year flood zones) to communicate flooding probabilities. However, this choice of communication format can miss important details and lead to biased risk assessments. Here we develop, test, and demonstrate the FLOod Probability Interpolation Tool (FLOPIT). FLOPIT interpolates flood probabilities between water surface elevations to produce continuous flood-probability maps. FLOPIT uses water surface elevation inundation maps for at least two return periods and creates Annual Exceedance Probability (AEP) maps as well as inundation maps for new return levels. Potential advantages of FLOPIT include being open-source, relatively easy to implement, capable of creating inundation maps from data published by agencies other than the Federal Emergency Management Agency (FEMA), and applicable to locations where FEMA has published flood inundation maps but not flood probabilities. Using publicly available data from the FEMA flood risk databases as well as state and national datasets, we produce continuous flood-probability maps at three example locations in the United States: Houston (TX), Muncy (PA), and Selinsgrove (PA). We find that the discrete flood zones generally communicate substantially lower flood probabilities than the continuous estimates. The log-cubic spline has the lowest average bias and is the best interpolation method for this case study. These results suggest that the log-linear method adopted by FEMA can be improved on, but has a relatively low bias.


Introduction
Flooding drives sizable risks around the globe [1,2]. Between 1980 and 2013, global economic flood losses exceeded $1 trillion and floods caused approximately 220,000 fatalities [3]. Future flood risks are projected to increase, driven by a complex interplay between changes in exposures, vulnerabilities, and hazards [3][4][5][6]. Knowledge of flood risks can be a key driver of community participation in flood risk mitigation planning [7].
How one communicates flood probabilities can impact decision-making [8,9]. Flood probability maps are important sources of information about floods. The information communicated through these maps impacts decisions on where to build, whether to elevate structures to prevent flood damage, and whether to purchase flood insurance [9,10]. Flood maps typically bin continuous and spatially varying hazards into discrete flood zones [11][12][13][14][15]. The outer edge of a zone is the maximum extent of a flood with a designated probability (e.g., the 1 in 100-year flood), while the inner edge [...]

Figure 1. [...] The upper left y-axis represents the flood hazard. This is highest near the river and decreases as the elevation rises. The lower right y-axis represents elevation, which in this simple illustration increases with distance from the river. Due to the way flood zones "bin" flood hazard into discrete zones, the communicated flood hazard is almost always lower than the true flood hazard.
The downward bias in the communicated flood probability and the binning associated with flood zones are well understood and communicated by flood mapping organizations [16]. However, downward-biased and binned flood probabilities can present a communication barrier, and the qualifiers communicated by flood mapping organizations may be ignored [18]. This barrier can be particularly problematic if people ignore risks when they view the probability as falling below some threshold level of concern, as some research suggests [19,20]. One approach to reducing the downward bias is to disaggregate flood zones into smaller zones. For example, [4] produces flood maps of the contiguous United States for floods of 10 different probabilities, while FEMA generates maps for between three and six different flood probabilities but publishes only the 1 in 100 and 1 in 500-year flood maps. Increasing the number of probability zones decreases the downward bias and the "in-out" issues associated with "binning" probabilities, but does not solve the underlying problem.
FEMA publishes flood risk information through the Flood Risk Database (FRD; [10,16,21]). The FRD is available for a limited number of riverine and coastal communities across the U.S. and provides water surface elevations, depths, and probabilities for extreme floods. The percent annual chance (or Annual Exceedance Probability; AEP) of flooding included in the FRD is a prominent approach to communicating continuous flood risk and avoiding the in-out format of the Flood Insurance Rate Map (FIRM). FEMA estimates AEP using a log-linear interpolation between flood probability and flood surface elevation. While AEP data from FEMA's FRD address the bias in communicated risk, they are available only for a limited number of communities in the U.S. and only for the log-linear interpolation method.
Here we introduce, design, implement, and demonstrate the FLOod Probability Interpolation Tool (FLOPIT). FLOPIT is a simple tool to interpolate flood probabilities between flood zone boundaries to create continuous flood probability maps. FLOPIT can be used to estimate AEP maps and return level maps for return periods of the user's choice, for any flood-prone community for which raster information for at least two return periods is available. In addition, FLOPIT provides flexibility with respect to the interpolation method.
We design FLOPIT with the goal of helping to improve (i) the communication of flood probabilities and (ii) research on decision-making under uncertainty. We demonstrate the feasibility and importance of flood probability interpolation by applying FLOPIT in three case study locations in the United States: (i) a neighborhood on the Sims Bayou in Houston, Texas, (ii) the borough of Selinsgrove, Pennsylvania, and (iii) the borough of Muncy, Pennsylvania.

Materials and Methods
FLOPIT generates continuous flood probability maps from flood water surface elevation data associated with at least two return periods plus the digital elevation model (DEM) of the area of interest. Typically, FEMA provides flood surface elevation data for 10, 100, and 500-year flood events in the United States. Additionally, DEMs are widely available in varying accuracies across the United States and the world [14,22].
FLOPIT's main inputs are the DEM raster, at least two rasters indicating the water surface elevation or depth of flooding, the return periods associated with the provided flood rasters, and the interpolation method. For each raster cell, FLOPIT then relates the flood surface elevations associated with the provided return periods to the corresponding exceedance probabilities using the user's choice of monotonically increasing cubic spline interpolation [23], log-cubic spline, linear, or log-linear interpolation [16]. Lastly, FLOPIT interpolates the flood probability for each cell from ground surface elevations derived from a DEM (Figure 2). The user can choose to generate flood maps for new return periods not provided in the input. For this task, the return period is required as an input. FLOPIT generates water surface elevation maps, FIRM-like in-out flood maps, and AEP maps. Additionally, the user can choose to generate plots and more information on a specific location by providing the coordinates of the location as an input. In cases where the interpolated probability of a cell is outside the range of flood probability inputs, FLOPIT coerces the flood probability to the "zone" probability (the lowest possible probability bounded by the input data) if the cell is inside the flood extent. Cells that are outside the spatial extent of the flood maps are beyond the limit of extrapolation and are coerced to a "not a number" (NA) value. FLOPIT assigns a flood probability to all grid cells and outputs a raster file containing the flood probability map of the study area. The resolution of the output flood probability map is limited by the resolutions of the flood surface elevation data and the DEM.
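The per-cell step described above can be illustrated with a minimal sketch. It is not FLOPIT's actual code: the function name `interpolate_aep`, its arguments, and the example elevations are hypothetical, and it assumes that "log-linear" means the base-10 logarithm of the AEP varies linearly with water surface elevation. Only the linear and log-linear options are shown; the spline options are omitted.

```python
import numpy as np


def interpolate_aep(ground_elev, wse, return_periods, method="log-linear"):
    """Hypothetical helper (not FLOPIT's API): AEP at a single raster cell.

    ground_elev    -- DEM elevation of the cell
    wse            -- water surface elevations at the cell, one per return period
    return_periods -- return periods (years) matching `wse`
    """
    wse = np.asarray(wse, dtype=float)
    aep = 1.0 / np.asarray(return_periods, dtype=float)

    # np.interp expects increasing x values, so sort by elevation.
    order = np.argsort(wse)
    wse, aep = wse[order], aep[order]

    # Assumption: ground above the rarest input flood surface is treated as
    # outside the flood extent and returned as NaN.
    if ground_elev > wse[-1]:
        return float("nan")

    # Ground below the most frequent input flood surface is clamped by
    # np.interp to the highest input probability (no extrapolation).
    if method == "log-linear":
        return float(10.0 ** np.interp(ground_elev, wse, np.log10(aep)))
    if method == "linear":
        return float(np.interp(ground_elev, wse, aep))
    raise ValueError(f"unsupported method: {method}")


cell_wse = [10.0, 12.0, 14.0]  # illustrative elevations (m), not real data
cell_rp = [10, 100, 500]       # return periods (years)
print(interpolate_aep(11.0, cell_wse, cell_rp))            # log-linear
print(interpolate_aep(11.0, cell_wse, cell_rp, "linear"))  # linear
```

For a ground elevation halfway between the 10-year and 100-year surfaces, the log-linear option gives a smaller probability than the linear one, which shows why the choice of method matters when only a few return periods are available.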
We run FLOPIT on three case studies. DEMs are retrieved from the National Elevation Dataset [24] and the Pennsylvania Spatial Data Access [25]. We use flood surface elevations with known probabilities from the FEMA Flood Map Service Center databases for each location.
We assess FLOPIT's performance by comparing its outputs with FEMA FRD data. We first compare the FLOPIT-interpolated AEP map of Sims Bayou with FEMA's percent annual chance data. Furthermore, we check water surface elevation maps generated by FLOPIT using cross-validation. In the following, we discuss and explain each assessment.
We use Bias as a performance measure to validate FLOPIT. Bias measures the difference between the predicted and the benchmark variable. For example, Bias in the estimated AEP in cell (i,j) is calculated as in Equation (1):

$$\mathrm{Bias}_{i,j} = \mathrm{AEP}^{\mathrm{FLOPIT}}_{i,j} - \mathrm{AEP}^{\mathrm{FEMA}}_{i,j} \qquad (1)$$

To assess the interpolated water surface elevation data from FLOPIT, we use cross-validation. To this end, we leave one return period out, run FLOPIT for all other available return periods, and estimate the left-out return levels. We then compare the estimated return level map with the map provided by FEMA as a benchmark.
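The leave-one-out check and the Bias summary can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: `estimate_return_level` and `bias_summary` are hypothetical names, Bias is taken as predicted minus benchmark, the log-linear rule is assumed to interpolate elevation linearly in the base-10 logarithm of the return period, and the small rasters are invented numbers.

```python
import numpy as np


def estimate_return_level(wse_stack, return_periods, target_rp):
    """Hypothetical helper: log-linear estimate of the water surface
    elevation raster for `target_rp`, which is assumed to lie between
    two of the supplied return periods.

    wse_stack -- array of shape (n_periods, rows, cols)
    """
    log_rp = np.log10(np.asarray(return_periods, dtype=float))
    order = np.argsort(log_rp)
    log_rp = log_rp[order]
    stack = np.asarray(wse_stack, dtype=float)[order]

    # Locate the two input return periods that bracket the target.
    hi = int(np.clip(np.searchsorted(log_rp, np.log10(target_rp)),
                     1, len(log_rp) - 1))
    lo = hi - 1
    weight = (np.log10(target_rp) - log_rp[lo]) / (log_rp[hi] - log_rp[lo])
    return stack[lo] + weight * (stack[hi] - stack[lo])


def bias_summary(predicted, benchmark):
    """Minimum, mean, and maximum of (predicted - benchmark), ignoring NaN."""
    bias = np.asarray(predicted, dtype=float) - np.asarray(benchmark, dtype=float)
    return float(np.nanmin(bias)), float(np.nanmean(bias)), float(np.nanmax(bias))


# Invented 1 x 2 rasters: leave the 100-year map out and re-estimate it.
wse_10 = np.array([[10.0, 11.0]])
wse_500 = np.array([[14.0, 13.0]])
wse_100_benchmark = np.array([[12.5, 12.0]])

estimate = estimate_return_level(np.stack([wse_10, wse_500]), [10, 500], 100)
print(bias_summary(estimate, wse_100_benchmark))
```

Repeating the same comparison with a different interpolation rule in place of the log-linear one is how the methods could be ranked by average bias.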

Results
We first analyze the AEP maps generated by FLOPIT. The Sims Bayou is one of a few areas where we could find published FEMA flood probability maps. As FEMA uses log-linear interpolation [16], we set the interpolation method to log-linear. Results show that FLOPIT's AEP map replicates FEMA's percent annual chance map well (Figure 3). We quantify the performance in AEP estimation using Bias. The minimum, average, and maximum Bias in the study area are −0.088, −0.001, and 0.087, respectively. Figure 3A,B shows the maps of AEPs from FEMA and FLOPIT over the study area. Figure 3C is the map of bias and Figure 3D shows the histogram of bias values across the area. A comparison of FLOPIT's AEP with FEMA's in-out flood zones for Muncy is depicted in Figure 4. The same comparison for the other two study regions is available in Figures S1 and S2. AEP results show that in each zone the probability of flooding is higher than or equal to the probability communicated by FEMA in-out FIRM maps (Figures S3-S5).

We apply cross-validation to Sims Bayou and leave the 100-year flood out. We follow FEMA guidelines and use the log-linear method for interpolation (Figure 5).

For the analyses above, we used a log-linear interpolation to replicate FEMA. However, FLOPIT allows other interpolation methods such as linear, cubic spline, and log-cubic spline. The uncertainty in the interpolation method impacts the estimated water surface elevation. We test this on the 50-year flood maps over Sims Bayou. We use cross-validation, leave the 50-year return period out, and estimate it with each available interpolation method. We then compare these 50-year water surface elevation maps against FEMA's 50-year flood map, which is generated based on a hydraulic dynamical model. We measure the performance of each flood map using average bias. Bias results for all interpolation methods are shown in Table 1.
Results show that the log-cubic spline has the lowest average bias and is the best interpolation method for this case study. These results suggest that the log-linear method adopted by FEMA can be improved on, but has a relatively low bias.

We also use FLOPIT to interpolate flood probability maps for two more riverine flood-prone communities, the boroughs of Muncy and Selinsgrove in Pennsylvania (Figure 4 and Figure S1). These small communities on the Susquehanna River face recurrent river flooding but, to the best of our knowledge, currently do not have FEMA-published flood probability maps. We compare FEMA's in-out flood maps with FLOPIT's probabilistic flood maps in Figure 4. There is a notable divergence between the information communicated by the two maps. The flood hazard communicated through FEMA maps is based on the zone. For example, if a property is in the 100-year zone, a user may assume that the flood chance is 0.01 or 1%. However, the continuous map generated by FLOPIT shows that the flood chance is higher than 1%. This could impact the cost-benefit analysis needed for any flood risk mitigation measure. For example, if a homeowner needs to estimate the Expected Annual Damages (EAD) from FEMA maps, the adopted EAD may be based on the 1% per year flood probability while FLOPIT may suggest a considerably higher value. This means that using only FEMA flood maps to estimate damages can lead to lower damage estimates and consequently a biased sense of security [9].
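The gap between zone-based and continuous probabilities can be made concrete with a small sketch. The helper `zone_probability` and the sample probabilities are hypothetical; the sketch only assumes that a zone communicates the probability of its outer edge, as described in the Introduction.

```python
import numpy as np


def zone_probability(aep, zone_aeps=(0.01, 0.002)):
    """Hypothetical helper: probability communicated by in-out flood zones.

    Each cell is assigned the outer-edge probability of the most frequent
    zone that contains it; cells outside every zone are set to NaN.
    """
    aep = np.asarray(aep, dtype=float)
    communicated = np.full(aep.shape, np.nan)
    # Start with the rarest zone so that more frequent zones overwrite it.
    for edge in sorted(zone_aeps):
        communicated[aep >= edge] = edge
    return communicated


# Invented continuous AEPs for five cells, from near the river to far away.
continuous = np.array([0.10, 0.033, 0.01, 0.004, 0.001])
communicated = zone_probability(continuous)

# Ratio of continuous to communicated probability (1 means no understatement).
print(communicated)
print(continuous / communicated)
```

In this illustration every cell inside a zone has a continuous probability at least as large as the zone value, so any damage estimate that scales with probability is understated by the same ratio.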
FLOPIT can handle flood maps at a wide range of resolutions and spatial extents. It can handle resolutions on the order of meters to kilometers, depending on the computational power of the user's system. The computation time increases with resolution. We report approximate run times for all case studies in Table 2. The reported run times are based on a virtual Amazon Web Services (AWS) instance with 24 physical cores and 185 GB memory. Performance can vary considerably with the degree of aggregation and the number of cells.

Discussion
We introduce FLOPIT, a flood probability interpolation tool that uses flood surface elevation-probability relationships and a digital elevation model to interpolate flood probabilities and produce flood probability (or annual exceedance probability) maps as well as water surface elevation maps for any new return period between input return periods.
Flood probability interpolation tools, such as FLOPIT, can create spatially resolved flood probability maps to help improve stakeholder communication and decision-making. FLOPIT provides spatially continuous flood hazard maps that compare favorably with flood zone maps while improving flood probability quantification. Continuous flood probability mapping also has the potential to improve flood hazard communication, stakeholder decision-making, and the setting of actuarial flood insurance rates.
FLOPIT's flexible resolution framework enables users to navigate the speed versus resolution trade-off with relative ease. Decreasing the map resolution allows for faster flood probability mapping (see Table 2 for approximate wall times). This reduces the computational demands, at the cost of a degraded spatial resolution. One could conceive of scenarios where flood probability data are needed at the home scale for individual flood risk assessments or at the block scale for city-wide flood risk assessments. The current version of FLOPIT is lightly parallelized using the Dask library for Python. Future versions of FLOPIT will incorporate more extensive high-performance computing approaches for better scalability [26,27].
The current implementation of FLOPIT does not consider uncertainties in the outputs. Uncertainties in the output flood maps can stem, for example, from the measurement and resolution uncertainty in the DEM, the input flood maps as a result of nonstationarities and climate change, and the method of interpolation. The number of flood maps used as input can introduce uncertainties in the output AEP or flood maps. As the number of input flood maps increases and provided return periods better represent the overall range, this uncertainty becomes less important. However, when there are a limited number of flood maps available for input, the uncertainty in the interpolation method becomes more important. Expanding the treatment of uncertainties is an important avenue to refine FLOPIT.
Current flood hazard communication approaches typically rely on flood zones, assigning a single probability to an entire zone [16]. National Flood Insurance Program (NFIP) insurance rates are typically applied as a flat rate, such as in the 500-year flood zone, or set by depth below a "base flood elevation" (the 1 in 100-year flood surface elevation). Flood probability interpolation can help to determine the probability of any single flood depth, or of multiple flood depths. This allows for improved numerical integration over flood risk [28] for spatially resolved actuarial rate setting.
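As an illustration of such a numerical integration, the sketch below approximates expected annual damage with the trapezoidal rule over probability-damage pairs. It is a simplified example under stated assumptions: `expected_annual_damage` is a hypothetical helper, the damage values are invented, and the integral is truncated to the range of the supplied probabilities.

```python
import numpy as np


def expected_annual_damage(aep, damage):
    """Hypothetical helper: trapezoidal integral of damage over AEP.

    aep    -- annual exceedance probabilities of the flood events
    damage -- damage caused by each event (same order as `aep`)
    """
    aep = np.asarray(aep, dtype=float)
    damage = np.asarray(damage, dtype=float)

    # Integrate along increasing probability.
    order = np.argsort(aep)
    p, d = aep[order], damage[order]

    # Trapezoidal rule; probabilities outside [p.min(), p.max()] are ignored.
    return float(np.sum(0.5 * (d[1:] + d[:-1]) * np.diff(p)))


# Invented damages (in dollars) for one property at four flood probabilities.
aep = [0.1, 0.04, 0.01, 0.002]
damage = [0.0, 20000.0, 60000.0, 120000.0]
print(expected_annual_damage(aep, damage))
```

With interpolated flood depths at additional probabilities, more points could be added to the integral, which is one way a continuous map could refine a rate estimate compared with a single zone probability.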
FEMA performs log-linear interpolation within the agency and publishes the output flood probability maps only for a few flood-prone communities. There are many flood-prone communities where water surface elevation or depth data are available for a number of return periods but the percent annual chance data are not. FLOPIT could be useful for these communities. Additionally, our study shows that in our chosen study area, log-linear was not the best interpolation option, given that the log-cubic spline method led to a reduced overall bias. While more refined investigations are needed, this study could start a conversation within FEMA about replacing the log-linear method with the log-cubic spline method, or at least about making the interpolation method more flexible and generating probability maps using different interpolation methods.
Finally, FLOPIT is not intended for use only with FEMA data. Other agencies (e.g., the European Commission's Joint Research Centre) generate flood inundation maps for a limited number of return periods and omit other return periods, probably because of the considerable cost of generating these maps (e.g., computational and/or storage limitations). One could use FLOPIT to generate additional flood maps from these available flood maps. In ungauged areas where flood inundation data are not available, FLOPIT could be used in combination with state-of-the-art models and tools that generate flood inundation maps [29][30][31].

Conclusions
FLOPIT is a tool that uses the relationship between flood surface elevation and probability, together with a digital elevation model, to interpolate flood probabilities and produce flood inundation maps for new return periods. FLOPIT provides a fast and flexible resource for producing continuous flood probability maps to aid flood hazard communication and quantification. We demonstrate how flood zones can be a poor approximation to flood probability and can be downward biased in reported and communicated flooding probabilities. Flood probability interpolation tools can help address these issues by reducing the bias of spatially resolved flood probability maps. Refined flood probability maps can be useful to improve decisions, for example about where and how to build, whether to elevate a house, whether and how to change local zoning, and how to set fair flood insurance rates [9,32-35].

[...] Return periods in the 500-year zone range from 100 (1%) to 500 (0.2%), and the average is roughly 250 (0.4%) years. Whiskers extend to maximum and minimum of data.

Figure S4. Box and whisker plot of the interpolated return period versus the FEMA flood zone return period for each map pixel of the Sims Bayou, Houston (TX), map. Flood probabilities in the 1 in 100 (1% annual chance) flood zone range from 1 in 10 (10% annual chance) to 1 in 100 (1% annual chance), with an average flood probability of roughly 1 in 30 (3.3% annual chance). Flood probabilities in the 1 in 500 (0.2% annual chance) zone range from 1 in 100 (1%) to 1 in 500 (0.2%), and the average is roughly 1 in 300 (0.33% annual chance). The solid green line illustrates a hypothetical perfect relationship. Whiskers extend to maximum and minimum of data.

Figure S5. Box and whisker plot of the interpolated return period versus the FEMA flood zone return period for each map pixel of the Selinsgrove map. Return periods in the 100 (1%) year flood zone range from 10 (10%) to 100 (1%), and the average return period is roughly 20 years (4%). Return periods in the 500-year zone range from 100 (1%) to 500 (0.2%), and the average is roughly 250 (0.4%) years. Whiskers extend to maximum and minimum of data.
Author Contributions: K.K. contributed to the study's conceptual framework, the interpretation of the results, and writing the manuscript. K.J.R.-E. contributed to the study's conceptual framework, wrote the code in R, performed the analysis, and contributed to writing the manuscript. M.Z. revised the R code, wrote the Python code, and contributed to writing the manuscript. S.S. performed a code review and edited the paper. All authors have read and agreed to the published version of the manuscript.

Data Availability Statement: All results, model codes, analysis codes, data, and model outputs used for analysis are freely available from https://github.com/pches/FLOPIT/tree/revision_2020 and are distributed under the GNU general public license.