Gridded Flash Flood Risk Index Coupling Statistical Approaches and TOPLATS Land Surface Model for Mountainous Areas

Abstract: This study presents the development of a statistical flash flood risk index model, which is currently operating in research mode for flash flood risk forecasting in ungauged mountainous areas. The grid-based statistical flash flood risk index, with temporal and spatial resolutions of 1 h and 1 km, respectively, has been developed to simulate the flash flood risk index leading to flash flood casualties using hourly rainfall, surface flow, and soil water content in the previous 6 h. The statistical index model employs factor analysis and multi-linear regression to analyze its gridded hydrological components that are obtained from the TOPMODEL-based Land Atmosphere Transfer Scheme (TOPLATS). The performance of the developed index model has been evaluated in estimating flash flooding in ungauged mountain valleys and small streams. Numerical results show that the approach simulated 38 flash flood catastrophes in the Seoul Capital Region with 71% accuracy; therefore, this approach is potentially adequate for flash flood risk forecasting.


Introduction
Flash floods are common and widespread, are a leading cause of weather-related deaths worldwide, and remain one of the most difficult weather phenomena to forecast and warn against because of their complex and multifaceted characteristics. Localized torrential rainfall on 27 July 2011 caused flash floods and landslides on the southern flank of an ungauged mountain in Seoul, the capital of Korea, leading to 60 fatalities [1]. Most such catastrophes have occurred in ungauged mountainous areas, but methods for classifying the flash flood risk index (FFRI) in ungauged mountainous areas remain few. Several previous studies have addressed flash flood forecasting and warning for ungauged mountainous areas based on distributed hydrological models [2-8]. For example, Reed et al. [2] studied flash flood forecasting at ungauged locations with a distributed hydrologic model and a threshold frequency-based method. Wang et al. [3] established a flash flood warning system based on a distributed hydrological model applied to two ungauged mountainous regions.
Flooding hazards caused by heavy rainfall can be categorized according to the rainfall duration and ground-level damage type. They are divided into floods and flash floods with respect to rainfall characteristics such as intensity and duration, as well as speed of flood onset. Flooding hazards are also divided into urban, mountain, and riverine flooding based on the flood-affected areas, of which the riverine flooding is subdivided into large-, medium-, and small-scale riverine floods depending on river size. Given that flash floods are closely associated with the speed of onset of flooding irrespective of the type of ground-level damage, they can be referred to as urban flash floods or mountain flash

Study Area
Summer monsoonal airflow from the tropics triggers heavy rainfall concentrated over coastal regions of East Asia (see geographic location in Figure 2a). An increasing number of extreme rainfall events have been related to mesoscale convective systems with the synoptic-scale East Asian Summer Monsoon. In most regions of South Korea, such torrential rainfall amounts are more than half the annual precipitation, and have been recently concentrated in urban areas with dense human population and industrial centers. Heavy rainfall results in the isolation of hikers and campers in mountainous areas around Seoul (inside yellow line in Figure 2b), and leads to landslide and flooding that causes human casualties in the region [1].

The Han River basin consists of about 30% lowlands and 70% uplands and mountains (Figure 2b). The Seoul Capital Region (outside red line in Figure 2b) is the metropolitan area centered on Seoul (inside yellow line in Figure 2b) in northwestern South Korea; with a surface area of 11,930.06 km², it makes up only 11.8% of the country's land surface. However, almost half (48%) of South Korea's population lives in this area. The region is covered with forest (49.1%), paddy fields (21.6%), and developed land (12.4%), and the average terrain slope is about 15.7%. Heavy rainfall occurring over short periods in the Seoul Capital Region makes it inherently dangerous to measure high peak flows in mountain valleys and small streams; therefore, they remain ungauged. However, emergency rescue requests are available and provide valuable information, including the onset of rainfall and the recognition of danger, with the caveat that there are uncertainties in flood peaks, flood amounts, and flooding time.
Water 2019, 11, 504
Over four years (2009-2012), there were 45 rescue request cases related to flooding in the Seoul Capital Region; after excluding ambiguous locations and cases involving negligence, 38 cases were used in this study (black triangles in Figure 2b). Of these, 34 cases (89.5%) involved isolation, death, or inundation due to rapids or rising water levels in valleys or streams, and the remaining cases (10.5%) involved landslides. Isolation and fatalities due to heavy rainfall suggest that the time from the onset of rainfall to the recognition of danger was very short and that valleys or small streams are at higher risk of incurring such casualties. See Table 1 for more details of each rescue request, such as reported time and catastrophe type provided by the National Disaster Information Center of the Ministry of Public Safety and Security (www.safekorea.go.kr).

Land Surface Model
The TOPLATS model is a multi-scale model for simulating local- to regional-scale catchment water fluxes, and utilizes water balance and energy balance to simulate gridded actual evapotranspiration, soil water content, water table depth, surface runoff, latent heat, sensible heat, ground heat, and net radiation to characterize the redistribution of the water table depth at the sub-catchment scale. The model combines a soil vegetation atmosphere transfer scheme (SVAT), which represents local-scale vertical water fluxes, with the catchment-scale TOPMODEL approach [47], which laterally redistributes the water within a catchment. See Table 2 for the main processes and equations in the TOPLATS model [48]. Because TOPLATS is a grid-based and time-continuous model, and the vertical water fluxes of the grid cells are calculated by the local SVATs, catchment-scale vertical water fluxes are obtained by aggregating local water fluxes. The model does not account for lateral interaction between the local SVATs. However, based on the soil-topographic index of the TOPMODEL approach [54], a lateral redistribution of water is realized by adapting the local groundwater levels, which are used as lower boundary conditions of the local SVATs. In addition, the base flow is generated by integrating local saturated subsurface fluxes along the channel network. A routing routine is not integrated in the model. In the vertical direction, the soil is divided into two layers (root zone and transmission zone). Following Sivapalan et al. [53], saturated conductivity is assumed to decrease exponentially with depth. Percolation is calculated using an approximation for gravity-driven drainage, and capillary rise is calculated based on the approach of Gardner [51]; both approaches use the Brooks and Corey parameterization of soil retention characteristics [52].
Based on soil texture and porosity, soil parameters are derived using the pedotransfer function of Rawls and Brakensiek [55]. Soil water contents within the root zone and transmission zone are calculated using the soil water balance equations [56]. Plant growth is not directly simulated by TOPLATS, but the seasonal development of plant properties is described by monthly updating the plant parameter sets consisting of e.g., leaf area index, plant height and stomatal resistance. The digital elevation model serves as the basic data set for the calculation of the topographic wetness index [54], which is used for calculation of the soil-topographic index additionally accounting for local differences in transmissivity [53]. For further details about the model, the reader is referred to Famiglietti and Wood [56] and Peters-Lidard et al. [41].
The TOPLATS-based simulation in full distribution mode requires gridded meteorological and topographical data of the studied region [26]. Meteorological data include ground-level precipitation (mm), temperature (°C), relative humidity (%), wind speed (m/s), air pressure (mm), incoming shortwave and longwave radiation (W/m²), and net radiation (W/m²). Topographical data include datasets for catchment, land cover, soil, topographic index, and transmissivity. The TOPLATS model can redistribute water table depth only within a single catchment and should thus be constructed by catchment unit (Figure 3), although it is a grid column model. The boundaries of the survey area are set in a square (320 km × 320 km) with grid resolution of 1 km × 1 km, and the TM coordinates on the four corners are lower-left (320,000, 80,000), lower-right (320,000, 40,000), upper-right (640,000, 400,000), and upper-left (640,000, 80,000) in transverse Mercator map projection.
Gridded meteorological data for precipitation, air temperature, relative humidity, wind speed, solar radiation, and air pressure are necessary for simulating catchment water balance with TOPLATS. Given that the purpose of this study is to produce hydrological components per 1 km × 1 km grid, we collected the observations from 600 automated weather systems (AWS) and 72 automated surface observing system (ASOS) stations operated by the Korea Meteorological Administration (KMA). Figure 3 shows the locations of the AWSs and ASOS stations in the Han River basin, which is divided into 48 sub-basins. These data are applied to the model after converting them from point data to grid cells using the inverse distance weighting method.
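The point-to-grid conversion by inverse distance weighting mentioned above can be sketched as follows. This is a minimal illustration, not the configuration used in the study: the function name `idw_interpolate`, the distance exponent of 2, and the two-station example are assumptions made for demonstration.

```python
import numpy as np

def idw_interpolate(station_xy, station_values, grid_xy, power=2.0, eps=1e-12):
    """Inverse distance weighting of point observations onto grid cells.

    station_xy     : (n_stations, 2) station coordinates in metres
    station_values : (n_stations,) observations, e.g. hourly rainfall in mm
    grid_xy        : (n_cells, 2) coordinates of grid-cell centres in metres
    power          : distance exponent (2 is assumed here, not taken from the paper)
    """
    station_xy = np.asarray(station_xy, dtype=float)
    station_values = np.asarray(station_values, dtype=float)
    grid_xy = np.asarray(grid_xy, dtype=float)

    # Pairwise distances between cells and stations, shape (n_cells, n_stations)
    diff = grid_xy[:, None, :] - station_xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    # Weights proportional to 1 / d^power; eps avoids division by zero
    # when a cell centre coincides with a station
    weights = 1.0 / np.maximum(dist, eps) ** power
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ station_values

# Hypothetical example: two stations 1 km apart, two target cells
stations = [(0.0, 0.0), (1000.0, 0.0)]
rain_mm = [10.0, 20.0]
cells = [(500.0, 0.0), (0.0, 0.0)]
gridded = idw_interpolate(stations, rain_mm, cells)
```

The cell midway between the two stations receives the average of both observations, while the cell located on a station reproduces that station's value.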
For a smooth simulation of ground-level hydrologic responses to meteorological conditions, accurate topographical mapping is of vital importance (Figure 4). To construct a TOPLATS model gridded at a resolution of 1 km × 1 km, a gridded catchment map with the same resolution, land cover map, topographic index, and transmissivity coefficient data are used. The gridded catchment map is drawn based on a gridded Han River catchment map, dividing the survey area into 48 sub-catchments (Figure 4a). The land cover map, drawn using 30 m × 30 m data provided by the Water Resource Management Information System (http://www.wamis.go.kr) of the Ministry of Land, Infrastructure and Transport, includes eight land cover types: water, urban, barren, wetland, grasslands, forest, paddy, and farmland (Figure 4b).
It is very important to reflect detailed soil properties to successfully simulate spatial water balance using a land surface model. The National Institute of Agricultural Sciences distributes a detailed soil map (1:25,000) classifying the entire South Korean land surface into 1300 soil series in the Soil Information System (http://soil.rda.go.kr/soil) (Figure 4c). We use this soil map to construct the soil property distribution and attribute databases. The study area has 379 detailed soil textures. The topographic index and transmissivity are used for redistributing the water table depth within the catchment, as shown in Figure 4d,e, respectively.
Please note that two different types of water excess lead to overland flow: infiltration-excess and saturation-excess. Infiltration-excess arises when the infiltration capacity, i.e., the maximum rate at which water can enter the soil surface [57], is exceeded. That is, if the rainfall intensity is greater than the soil surface can absorb, excess water will pond on the surface and eventually start flowing downslope as overland flow. Overland flow also develops when rain falls on saturated soil (saturation-excess); there is no possibility of infiltration and the infiltration capacity is effectively zero. In general, single extreme values of soil transmissivity do not show homogeneous behavior in different catchments [48]. The behavior tends to depend on local soil hydraulic conditions and soil depth.
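The distinction between the two overland flow mechanisms can be illustrated with a deliberately simplified bucket-style sketch. It is not the TOPLATS infiltration scheme; the function name `partition_rainfall` and its arguments are hypothetical and serve only to make the logic of the paragraph above concrete.

```python
def partition_rainfall(rain_mm_h, infiltration_capacity_mm_h, soil_saturated):
    """Split hourly rainfall into infiltration and overland flow (both in mm/h).

    Simplified illustration only; TOPLATS uses a more detailed formulation.
    """
    if soil_saturated:
        # Saturation-excess: no storage left, so capacity is effectively zero
        capacity = 0.0
    else:
        capacity = infiltration_capacity_mm_h

    # Infiltration-excess: only rainfall above the capacity becomes overland flow
    infiltration = min(rain_mm_h, capacity)
    overland_flow = rain_mm_h - infiltration
    return infiltration, overland_flow
```

For example, 30 mm/h of rain on unsaturated soil with a capacity of 10 mm/h yields 20 mm/h of overland flow, whereas on saturated soil all rainfall runs off regardless of its intensity.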

Statistical Model
Flash flood indices such as the Flash Flood Index (FFI), Flash Flood Potential Index (FFPI), and Flash Flood Severity Index (FFSI) have been previously proposed. The FFI is a quantitative index that calculates the differences between the average basin rainfall and the predetermined flash flood guidance (FFG) produced by the NWS River Forecast Centers [58]. Because of the data assimilated into the FFG product, the FFI is limited to areas containing relatively large gauged rivers [59]. The FFPI accounts for watershed physiographic characteristics and combines them with forecast and observed rainfall to determine the likelihood of flash flood occurrence. The method was shown to have poor skill when applied operationally for flash flood forecasting in the western United States [60]. The FFSI is a damage-based post-event assessment tool with five categories ranging from 1 to 5, from least to most destructive, and still needs additional refinement [61].

In this study, variables sensitive to the flash flood occurrence are determined from factor analysis, which is a useful tool for investigating variable relationships for complex concepts that are not easily measured directly [62]. The method collapses a large number of variables into a few interpretable underlying factors. Here, factor analysis reveals that flash flood occurrence is sensitive to precipitation, surface runoff, and soil water content, among the hydrological components produced by the land surface model simulation (see Section 3.1). The statistical FFRI model is then expressed as the first-order linear function of the factor analysis results for surface runoff and rainfall during the six hours preceding the target time. That is, the time-series of precipitation, surface runoff, and soil water content 6 h prior to the current time are expressed using a first-order linear function based on the factor analysis and regression analysis results.
The basic formula for the statistical FFRI model is expressed as follows:

$$\mathrm{FFRI} = w_1 F_{PCP} + w_2 F_{RUN} + w_3 F_{SWS} \tag{1}$$

where the factor scores $F_{PCP}$, $F_{RUN}$, and $F_{SWS}$ respectively represent the components of the flash flood index for precipitation (PCP), i.e., the amount of rainfall; surface runoff (RUN), i.e., the rate of rainfall exceeding the infiltration capacity of the soil, or additional rainfall after the soil water content reaches saturation; and soil water content (SWS), i.e., the quantity of water contained in a given mass of the soil. The coefficients $w_1$, $w_2$, and $w_3$ are the weights of the factor scores. The factor scores and their weights are calculated based on all data available over the four years between 2009 and 2012, and the FFRI model is applied for all flooding occurrences over those years.
Please note that in Equation (1), there are two different kinds of water excess that provide overland flow: infiltration-excess and saturation-excess. Infiltration-excess is water that cannot infiltrate, ponds on the soil surface, and eventually flows downslope across the soil surface as overland flow. Saturation-excess is subsurface flow returning to the surface when the capacity of the soil to transmit flow is exceeded. Therefore, any rain falling on the saturated zone adds to the overland flow [57]. In the TOPLATS model, the transmissivity parameter of the soils controls the infiltration-excess; therefore, excess rainfall directly adds to the overland flow [48] and is represented by the RUN variable in the statistical model in Equation (1). Rainfall falling directly onto already saturated soil adds to the return flow; this overland flow is represented by the SWS variable in Equation (1).
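As a hedged illustration of the first-order linear form described above, the following sketch combines three gridded factor scores with weights $w_1$, $w_2$, and $w_3$. The function name, the weight values, and the 2 × 2 example grids are hypothetical; the fitted weights of the study are not reproduced here.

```python
import numpy as np

def flash_flood_risk_index(f_pcp, f_run, f_sws, weights):
    """First-order linear combination of three gridded factor scores."""
    w1, w2, w3 = weights
    return (w1 * np.asarray(f_pcp, dtype=float)
            + w2 * np.asarray(f_run, dtype=float)
            + w3 * np.asarray(f_sws, dtype=float))

# Hypothetical weights and factor-score grids, for illustration only
example_weights = (0.5, 0.3, 0.2)
f_pcp = np.array([[1.0, 0.0], [2.0, -1.0]])
f_run = np.array([[0.5, 0.0], [1.0, 0.0]])
f_sws = np.array([[0.0, 1.0], [1.0, 0.0]])
index = flash_flood_risk_index(f_pcp, f_run, f_sws, example_weights)
```

Because the operation is element-wise, the same function applies unchanged to a full 1 km × 1 km grid for each hourly time step.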

TOPLATS Results
The parameter estimation procedures in the TOPLATS model and the streamflow simulation values obtained here are similar to those reported in [45]. The accuracy of the hydrologic components is evaluated using observed inflow data from two dams: Chungju Dam, the largest multipurpose concrete dam in Korea, and Soyanggang Dam, the world's fourth largest rock-fill dam containing 29 million tons of water (Figure 2b). Because no gauges were available in the study area, and to simplify and facilitate the comparison of catchment behavior, we compared simulated catchment outflow with the daily amount of dam inflow. Please note that, except for this validation, all other simulations are performed and compared using hourly model results.
Please note that the TOPLATS model has no channel routing module, so the streamflow itself cannot be simulated in the model. In general, the flow in the model is underestimated at the beginning of the rainy season and overestimated at the height and the end of the rainy season. During rainfall, the instantaneous peaks of precipitation generally do not coincide in time with the peaks of dam inflow, partly because water from remote parts of the catchment takes time to reach the dam. However, peak flows in the model are well simulated, except that the highest peak is simulated with some delay, as no routing is integrated in the model. Therefore, the simulated flow is defined as the sum of runoff and base flow, and the simulated daily water balance is compared with observed dam inflows. Figure 5 shows hydrographs of the upstream catchments of Chungju Dam and Soyanggang Dam.
During the calibration periods, the TOPLATS model is parameterized by deriving or directly using as many parameters as possible from standard databases, such as topographic image data and their attribute databases. Calibration can thus be reduced to adjusting plant-specific stomatal resistances by a constant factor to satisfy the long-term water balance and calibrating the parameters of the base flow recession curve. For the validation periods, the accuracy of the simulation is satisfactory for both Chungju Dam (Figure 5a) and Soyanggang Dam (Figure 5b) in terms of total water balance. The relative volume error for the total water budget in each analysis period is satisfactory, within ±10%: for calibration and validation it is −7.81% and −7.55%, respectively, for Chungju Dam, and −4.51% and −4.84%, respectively, for Soyanggang Dam.
Although the statistical skill scores for the validation period are only slightly worse than for the calibration periods, the statistical values show that the model replicated the observed outcomes well, i.e., with statistical significance.
Figure 6 shows that, for daily discharge, the root mean squared error (RMSE) and coefficient of determination (R²) values for the averaged inflows show moderate quality. The RMSE and R² for Chungju Dam are, respectively, 16.35 mm and 0.67 for calibration, and 9.53 mm and 0.60 for validation. The RMSE and R² for Soyanggang Dam are, respectively, 15.97 mm and 0.74 for calibration and 7.39 mm and 0.82 for validation. The RMSE values for the calibration period are higher than those for validation because there was more precipitation during the calibration period than during the validation period.
That is, the average annual precipitation at Chungju Dam is 1389.7 mm in the calibration period and 1152.4 mm in the validation period, while it is 1480.2 mm in the calibration period and 1242.3 mm in the validation period at Soyanggang Dam.
Figure 7 shows hourly spatial distribution maps for precipitation (PCP), surface runoff (RUN), and soil water content (SWS) over sub-basin No. 31 between 1300 UTC 27 July 2011 and 1600 UTC 27 July 2011. The maximum rainfall is 64.1 mm, and the basin mean rainfall is 19.30 mm at 1300 UTC 27 July 2011. Subsequently, the rainfall amounts gradually decrease. The surface runoff is less than the rainfall amount because of the infiltration capacity of the soil, and the spatial distribution of the surface runoff is similar to that of the precipitation. The soil water content is higher along streams than in other regions, and soils in upstream areas also show higher water content than those in downstream regions. Table 3 shows the basin means for rainfall, surface runoff, and soil water content during the flood event.
Overall, the surface runoff, soil water content, and water table depth generally show satisfactory simulation results with respect to the precipitation time series in the simulated grids, which suggests that the water balance analysis from the TOPLATS distributed land surface model works with moderate quality at the grid level.
Because previous similar studies have shown similar results [43,45] and further improvement of the TOPLATS model is beyond the scope of this study, these default simulation parameters in the model are employed to evaluate the performance of the statistical flash flood risk index model.
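The skill measures reported in this section (relative volume error, RMSE, and R²) can be computed along the following lines. This is a sketch under stated assumptions: the sign convention of the volume error (simulated minus observed, so that negative values indicate underestimation) and the use of the squared Pearson correlation for R² are assumptions, since the exact definitions are not given in the text, and the function names are hypothetical.

```python
import numpy as np

def relative_volume_error(sim, obs):
    """Relative volume error in percent (assumed convention: sim minus obs)."""
    sim, obs = np.asarray(sim, dtype=float), np.asarray(obs, dtype=float)
    return 100.0 * (sim.sum() - obs.sum()) / obs.sum()

def rmse(sim, obs):
    """Root mean squared error, in the units of the input series."""
    sim, obs = np.asarray(sim, dtype=float), np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))

def r_squared(sim, obs):
    """Coefficient of determination, assumed here to be the squared Pearson r."""
    r = np.corrcoef(np.asarray(sim, dtype=float), np.asarray(obs, dtype=float))[0, 1]
    return float(r ** 2)

# Hypothetical daily inflow series (mm), for illustration only
observed = np.array([1.0, 2.0, 3.0, 4.0])
simulated = np.array([1.0, 2.0, 3.0, 5.0])
```

With these example series the simulated volume exceeds the observed volume by 10%, and the RMSE is 0.5 mm.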

Statistical Flash Flood Risk Index Model
As mentioned before, WMO and US NWS define a flash flood as a rapid and extreme flow of high water into a normally dry area, or a rapid water level rise in a stream or creek above a predetermined flood level beginning within six hours of the causative event, e.g., intense rainfall, dam failure, or ice jam. However, the actual time threshold may vary regionally. Ongoing flooding can intensify flash flooding in cases where intense rainfall results in a rapid surge of rising flood waters (http://w1.weather.gov/glossary/index.php?letter=f). In Korea, the KMA issues a heavy rain special report, flood warning, on the basis of 6 h or 12 h total rainfall. Specifically, a watch level is issued when the predicted total rainfall exceeds 70 mm in 6 h or 110 mm in 12 h, and a warning level is issued when the predicted total rainfall exceeds 100 mm in 6 h or 180 mm in 12 h. In the following, two advisory thresholds for watch and warning are selected as flood risk criteria, and the parameters in the statistical FFRI model are determined on the basis of 6 h total rainfall.
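The KMA advisory thresholds quoted above translate directly into a small classification rule. The sketch below is illustrative only: the function names and the level labels `"watch"`, `"warning"`, and `"none"` are hypothetical, and "exceeds" is interpreted as a strict inequality.

```python
def accumulate(hourly_mm, hours):
    """Total rainfall (mm) over the most recent `hours` values of an hourly series."""
    return sum(hourly_mm[-hours:])

def kma_heavy_rain_level(rain_6h_mm, rain_12h_mm):
    """Classify predicted rainfall totals using the thresholds quoted in the text."""
    # Warning: more than 100 mm in 6 h or more than 180 mm in 12 h
    if rain_6h_mm > 100.0 or rain_12h_mm > 180.0:
        return "warning"
    # Watch: more than 70 mm in 6 h or more than 110 mm in 12 h
    if rain_6h_mm > 70.0 or rain_12h_mm > 110.0:
        return "watch"
    return "none"

# Hypothetical hourly rainfall series (mm), oldest value first
hourly = [10.0, 20.0, 30.0, 5.0, 5.0, 5.0, 5.0]
level = kma_heavy_rain_level(accumulate(hourly, 6), accumulate(hourly, 12))
```

In this example the 6 h total is exactly 70 mm and the 12 h total is 80 mm, so neither threshold is exceeded.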

Factor Analysis
Factor analysis is a statistical method used to describe variability between observed, correlated variables based on a potentially lower number of unobserved variables, termed factors [62]. Each factor captures a certain amount of the overall variance in the observed variables, and factors are listed in order of the amount of variation they explain. The eigenvalue is a measure of the amount of variance in the observed variables that a factor explains. Any factor with an eigenvalue ≥ 1 explains at least as much variance as a single observed variable [63]. Table 4 shows the factor analysis results for surface runoff and rainfall data for the 6 h prior to the flash flood. In general, the number of factors is determined at the point where the eigenvalue is ≥ 1 or the cumulative explanatory power for the total variance reaches 60-70% or higher. Rainfall and surface runoff are each explained using three factors. Categorizing the factors using the values of the rotated component matrix in Table 4, the three rainfall factors explain 84.3% of the total variance. In more detail, factors 1, 2, and 3 are short-term (1 h, 2 h, 3 h), mid-term (4 h), and long-term (5 h, 6 h) rainfall and have explanatory powers of 35.0%, 23.3%, and 26.0%, respectively. The three factors for surface runoff are short-term (1 h, 2 h), mid-term (3 h), and long-term (4 h, 5 h, 6 h) and have explanatory powers of 31.0%, 17.7%, and 42.3%, respectively. The factor scores of surface runoff and rainfall obtained through factor analysis are expressed in Equations (2)-(7), where the weights are the values of the rotated component matrix (shaded values in Table 4). $P_x$ and $Q_x$ are the hourly rainfall (mm/h) and surface runoff (mm/h) at the preceding time x.
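The factor-retention rule described above (eigenvalue ≥ 1, or a cumulative explanatory power of 60-70% or higher) can be illustrated with a short sketch. The eigenvalues below are hypothetical and are not taken from Table 4; only the explanatory powers in the last lines are the values reported in the text.

```python
def explained_percent(eigenvalues):
    """Share of total variance (%) attributed to each factor."""
    total = sum(eigenvalues)
    return [100.0 * ev / total for ev in eigenvalues]

def n_factors_kaiser(eigenvalues):
    """Number of factors retained under the eigenvalue >= 1 rule."""
    return sum(1 for ev in eigenvalues if ev >= 1.0)

# Hypothetical eigenvalues for six hourly variables (they sum to 6 for a
# correlation matrix); these are NOT the values from the study.
example_eigenvalues = [2.1, 1.6, 1.4, 0.5, 0.3, 0.1]
retained = n_factors_kaiser(example_eigenvalues)
cumulative = sum(explained_percent(example_eigenvalues)[:retained])

# Explanatory powers reported in the text for the three rotated factors.
rainfall_total = round(sum([35.0, 23.3, 26.0]), 1)   # 84.3, as stated
runoff_total = round(sum([31.0, 17.7, 42.3]), 1)     # 91.0, by the same sum
```

With these illustrative eigenvalues, three factors are retained and they jointly account for about 85% of the variance, comparable to the 84.3% reported for rainfall.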
The factor scores $F_{PCP,S}$, $F_{PCP,M}$, and $F_{PCP,L}$ indicate short-term, mid-term, and long-term rainfall, respectively, and $F_{RUN,S}$, $F_{RUN,M}$, and $F_{RUN,L}$ indicate the factor scores of short-term, mid-term, and long-term surface runoff, respectively.

Linear Regression Analysis
Because the presence (rescue reports) or absence (no rescue reports) of a flash flood is a binary data point, a logistic regression model is generally accepted to be an appropriate method. However, in the scatter plot for short-term rainfall, which is the factor that exerts the greatest influence on flash flood occurrence (Figure 8), the slope of the linear regression model is less than that of the logistic regression model. The sharper slope of the logistic regression model can better differentiate the presence or absence of a flash flood. However, this sharp transition makes it inappropriate as a graded index for calculating risk levels over the flash flood potential interval (0%-100%). Therefore, the logistic regression model is inadequate for use in the forecasting and warning system, and the linear regression model is more suitable for smoothly expressing the flash flood index.
Table 5 shows the results of the linear regression analysis. The parameters B and SE indicate the regression coefficient and its standard error, which represents the uncertainty of the regression coefficient. The statistical significance of the estimated B in Table 5 is tested using the t value, i.e., the ratio of B to SE, which measures how far the estimate departs from the null hypothesis of a zero coefficient. Then, F statistics are used in combination with the p-value to determine the significance of the estimated regression. See [64] for more details. The analysis of flash flood occurrence using the linear regression model revealed that surface runoff (F = 18.5, p < 0.001) and rainfall (F = 59.52, p < 0.001) are statistically significant. Factors exerting important influence on flash flood occurrence are short-term surface runoff (t = 5.502, p < 0.001), short-term rainfall (t = 7.356, p < 0.001), and mid-term rainfall (t = 2.077, p = 0.045). The $R^2$ values for runoff and rainfall are 61.3% and 83.6%, respectively (Table 5). The unstandardized coefficients and model explanatory power derived here for each factor are applied to the flash flood index model.
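To make the quantities B, SE, t, and $R^2$ concrete, the following sketch fits an ordinary least squares line to a binary outcome. It is a single-predictor simplification of the multi-linear regression used in the study, and the function name and sample data are hypothetical.

```python
import math

def simple_ols(x, y):
    """Fit y = a + B*x and return B, its standard error SE, t = B/SE, and R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))  # residual sum of squares
    sst = sum((yi - mean_y) ** 2 for yi in y)                    # total sum of squares
    se_b = math.sqrt(sse / (n - 2) / sxx)
    return {"intercept": a, "B": b, "SE": se_b, "t": b / se_b, "R2": 1.0 - sse / sst}

# Hypothetical factor scores and flash flood occurrence (1 = rescue report).
scores = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
occurrence = [0, 0, 0, 1, 1, 1]
fit = simple_ols(scores, occurrence)
```

Unlike a logistic fit, the fitted line rises at a constant rate over the whole range of the predictor, which is the property the text relies on for a smoothly varying index.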

Flash Flood Risk Index Analysis
The statistical FFRI model is then obtained as a first-order linear function of the factor analysis results for surface runoff and rainfall during the 6 h preceding the target time, together with the state variable soil water content, as expressed in Equations (8)-(10). $F_{PCP}$, $F_{RUN}$, and $F_{SWS}$ indicate the factor scores of precipitation, surface runoff, and soil water content, respectively. Then, the weights of the three factor scores for each factor are determined using the unstandardized coefficient B in Table 5.
Here, $SW$ and $SW_{SAT}$ are the soil water content at the target time (%) and the saturated soil water content (%), respectively, and the factor score $F_{SWS}$ is expressed as the current soil water content divided by the saturated soil water content (i.e., the relative soil saturation).
The $R^2$ values in Table 5 are used as the weights for the precipitation and surface runoff in the FFRI model in Equation (1), because $R^2$ indicates the percentage of variability in the response data around its mean that the model explains. Then, the FFRI is obtained from the gridded hydrological components as in Equation (11). To obtain the weights in Equation (11), we first set the sum of weights in the FFRI to 80. An index of 40 (or 60), corresponding to half (or three-fourths) of this range, is set as the threshold for issuing a watch (or warning). We optimized the value $w_1$ that minimizes the flooding advisory of 40 and maximizes the FFRI values subject to the available data, and obtained $w_1$ = 61.5 and $w_2$ = 18.5.
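The soil water factor score and the advisory thresholds can be sketched as below. This does not reproduce the full FFRI equation; it only illustrates the soil water ratio and the mapping from an index value to an advisory level. The function names are hypothetical, and treating an index of exactly 40 as a watch and exactly 60 as still a watch is an assumption about the boundary cases.

```python
WATCH_LEVEL = 40.0    # half of the weight sum of 80
WARNING_LEVEL = 60.0  # three-fourths of the weight sum of 80

def soil_water_factor(sw_percent, sw_sat_percent):
    """F_SWS: current soil water content divided by saturated soil water content."""
    return sw_percent / sw_sat_percent

def ffri_level(ffri):
    """Map an FFRI value to 'warning', 'watch', or 'none' (boundary handling assumed)."""
    if ffri > WARNING_LEVEL:
        return "warning"
    if ffri >= WATCH_LEVEL:
        return "watch"
    return "none"

# The two weights reported in the text add up to the chosen weight sum of 80.
w1, w2 = 61.5, 18.5
weight_sum = w1 + w2
```

A grid cell with a soil water content of 30% and a saturated content of 40% would therefore have a soil water factor score of 0.75.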
Catastrophes caused by flash floods in ungauged mountain areas mostly occur when rapids that form in streams and valleys isolate hikers or campers. Given the difficulty of accurately parameterizing the land surface hydrological dynamics in remote mountainous areas, predicting the occurrence or non-occurrence of flash floods for each hour preceding the target time is a more plausible and practicable approach than simulating river level or inundation area in detail. We obtained FFRI values for the survey area using the statistical FFRI model with temporal and spatial resolutions of 1 h and 1 km, respectively. Figure 9 presents the time series of the flash flood index computed under various conditions from the cell-based hydrological components corresponding to the rescue request times (RRTs) and places. Figure 9a describes a case in which heavy rainfall exceeding 50 mm/h resulted in surface runoff over 30 mm and soil saturation, causing the FFRI to exceed the warning level (60). Moreover, the rainfall and surface runoff fell below 10 mm at the RRT, but the flash flood index exceeded 60 in the simulation, reflecting the values of the hydrological components during the preceding 6 h. In Figure 9b, surface runoff did not occur because of low soil water content, but the flash flood index exceeded the watch level (40) because of cumulative rainfall. Figure 9c shows a case in which gridded rainfall is low, resulting in little change in soil water content and no surface runoff, and the FFRI remains below 40. This suggests that insufficient rainfall did not trigger any changes in the simulated hydrological components, which may be due to the uncertainty in gridded rainfall.

Discussion
To investigate the performance of the statistical approach to flash flood forecasting in ungauged mountainous areas, we should quantify and validate its accuracy against the observed data. There are individual differences in flash flood Rescue Request Time (RRT) even for the same flood event, so it is important to run accurate simulations of signs of flash flood in the pre-RRT hours in addition to calculating the FFRI at the rescue request time.
To calculate the accuracy for the cases in Table 1, based on whether a flash flood watch or warning was issued in each of the six pre-RRT hours, i.e., between RRT-6 and RRT, we first define the accuracy (ACC) of the flash flood index model as in Equation (12). Here, N is the total number of flash flood catastrophes. $HC_i^k$ has a value of 1 if a flash flood watch or warning is issued once or more between pre-RRT hour k and the RRT, irrespective of the time of the actual flash flood occurrence, and a value of 0 otherwise. In Equation (12), the indices i and k indicate the case number and preceding time (or forecast lead time) shown in the first and third columns in Table 4, respectively, such that i = 1, 2, ..., N and k = 0, 1, 2, ..., 6. Table 6 shows whether a flash flood watch or warning would have been issued for each documented case of flash flooding. That is, Table 6 presents the results of calculating the flash flood index for each of the 38 cases of flash flood rescue requests during the six hours preceding the RRT. "Cell No." in the second column refers to the sequential number of the grid cell counted from the upper-left corner of the grid frame for the survey area. The numbers '1' and '2' indicate flash flood watch (with yellow shading) and warning (with red shading), respectively. For example, RRT-k (k = 0, 1, 2, ..., 6) is the flash flood index calculated from the gridded rainfall and surface runoff from pre-RRT hour k + 6 to hour k + 1 and the soil water content at pre-RRT hour k.
Table 6 (cases 9-38). Flash flood watch (1) and warning (2) flags between RRT-6 and RRT for each case, listed in their printed order without the RRT-k column positions; "-" indicates that no watch or warning was issued.
Case  Cell No.  Flags
9     50725     -
10    45909     1 2 1
11    59036     1 1
12    51043     -
13    51043     -
14    68008     2 2 2 2 2
15    55158     1 2 2
16    65089     1 2 2 2
17    60281     -
18    55834     1 2 2
19    64440     2 2 2 2 2 1
20    66386     1 1 1 1
21    67991     1 1 1 1
22    67029     1 1 1
23    71227     1 1 1
24    60972     1 1 1 2 2 1
25    68296     2 2 2 2 1
26    51614     2 2 2
27    45911     1 2 2 2 2 2 1
28    55158     1 1 1
29    57081     1 1
30    57081     1
31    60659     -
32    63798     -
33    59335     -
34    42042     -
35    52301     1 1
36    60306     2 2 2
37    51043     1 2 1
38    60656     1 2 2 1
For the cases in which the model shows a time lag between rainfall onset and the rescue request time, or no watch or warning at all, changes in the hydrological components do not appear because of the severe shortage of precipitation between the reference preceding time and the rescue request time. The cases marked in white and dark gray are those with cumulative rainfall below 5 mm and below 10 mm, respectively, during the 6 h pre-RRT (Table 6), and they cannot be easily improved in the water balance process within the land surface model. This is likely due to QPE error based on AWS precipitation observations. Uncertainty can arise in the gridded precipitation data from interpolating the AWS precipitation data using inverse distance weighting, and the rescue request time may be recorded incorrectly as a result of personal judgment. However, given the lack of stream gauges in mountain areas, these factors suggest that rainfall-related uncertainty is the main source of error, whereas the rescue request time itself is correctly forecasted.
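Inverse distance weighting, the interpolation named above as a source of rainfall uncertainty, can be sketched as follows. This is a generic illustration and not the configuration used in the study; the power of 2, the station tuples, and the function name are assumptions.

```python
import math

def idw(stations, target, power=2.0):
    """Interpolate a value at target=(x, y) from stations=[(x, y, value), ...]."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            return value  # the target coincides with a station
        weight = 1.0 / distance ** power  # assumed exponent, not from the study
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical stations: (x km, y km, hourly rainfall in mm).
gauges = [(0.0, 0.0, 10.0), (2.0, 0.0, 30.0)]
midpoint_rain = idw(gauges, (1.0, 0.0))
```

Because the interpolated value is always a weighted average of the station values, a localized rainfall peak that falls between stations cannot be recovered, which is one way the gridded rainfall in a mountain cell can be underestimated.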
We also classify the flash flooding cases into four scenarios to obtain the accuracy of the statistical FFRI model: Scenario 1 includes all 38 flash flood catastrophes; Scenario 2 includes 36 catastrophes, excluding those without rainfall in the corresponding grid cells (Cases 7 and 8); Scenario 3 includes 34 catastrophes, excluding those with cumulative rainfall below 5 mm during the 6 h pre-RRT (Cases 7, 8, 9, and 31); Scenario 4 includes 31 catastrophes, excluding those with cumulative rainfall below 10 mm during the 6 h pre-RRT (Cases 7, 8, 9, 13, 17, 31, and 33). Figure 10 shows the accuracy for the four scenarios. The accuracy is highest when the full 6 h pre-RRT window is used and decreases as the preceding time approaches the RRT; the accuracy is 71-87% for preceding times between 6 h and 4 h and decreases to 42-52% at 0 h. The accuracy of a flash flood watch or warning within the 6 h pre-RRT counts a prediction as correct irrespective of the actual time of occurrence. Therefore, the accuracy decreases as the preceding time decreases, because a shorter window leaves fewer hours in which a watch or warning can be counted.
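The accuracy measure can be sketched in code under the assumption that it is the percentage of cases with $HC_i^k = 1$, i.e., $ACC_k = \frac{100}{N}\sum_{i=1}^{N} HC_i^k$; this form is inferred from the definitions of N and $HC_i^k$ and is consistent with the reported 71% (27 of 38 cases). The flag lists below are hypothetical and are not rows of Table 6.

```python
def hit(flags_by_hour, k):
    """HC_i^k: 1 if a watch (1) or warning (2) occurs between pre-RRT hour k and RRT.

    flags_by_hour[h] is the flag at pre-RRT hour h, for h = 0 (RRT) to 6 (RRT-6).
    """
    return 1 if any(flags_by_hour[h] > 0 for h in range(k + 1)) else 0

def acc(cases, k):
    """Assumed form of ACC_k: percentage of cases with HC_i^k = 1."""
    return 100.0 * sum(hit(flags, k) for flags in cases) / len(cases)

# Hypothetical cases, indexed from hour 0 (RRT) to hour 6 (RRT-6).
example_cases = [
    [0, 0, 1, 2, 0, 0, 0],  # flagged at RRT-2 and RRT-3
    [1, 0, 0, 0, 0, 0, 0],  # flagged at RRT only
    [0, 0, 0, 0, 0, 0, 0],  # never flagged
    [0, 0, 0, 0, 0, 0, 2],  # flagged at RRT-6 only
]
acc_by_k = [acc(example_cases, k) for k in range(7)]

# Scenario accuracies at k = 6, assuming the 27 flagged cases remain in every
# scenario (i.e., all excluded cases were unflagged).
scenario_acc = [round(100.0 * 27 / n, 1) for n in (38, 36, 34, 31)]
```

Because the window from pre-RRT hour k to the RRT only grows with k, this measure can never decrease as k increases, which matches the reported drop from 71-87% at long preceding times to 42-52% at 0 h.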

Conclusions
This study analyzed the gridded flash flood detection ability obtained from land surface and statistical FFRI models. We produced gridded hydrological components using the TOPLATS land surface model to predict flash flooding caused by high water and rapids in mountain valleys and small streams. We developed a statistical FFRI model to derive a grid-based flash flood index and evaluated its applicability. After establishing a TOPLATS land surface model in the Han River basin, we simulated high-resolution hydrological components with temporal and spatial resolutions of 1 h and 1 km, respectively. We analyzed the performance of the gridded hydrological components in simulating the land surface conditions leading to flash flood casualties and the accuracy of the grid-based FFRI.
Analysis of the gridded hydrological components of the TOPLATS land surface model for the 38 flash flood catastrophes confirmed that rainfall, surface runoff, and soil water content were adequately simulated in 27 cases (71%), excluding 11 cases with rainfall below 10 mm in the 6 h pre-RRT. The simulated rainfall onsets were also compared with the RRTs shown in Table 1. The gridded hydrological components reasonably expressed the RRT in cases where (1) surface runoff resulted from precipitation exceeding the absorption capacity of soil in an unsaturated state, and (2) rainfall increased the infiltration rate up to soil saturation and continuing precipitation equaled the surface runoff amount (figure not shown). The model can also indicate the RRT from precipitation alone, even in the absence of surface runoff. In this case, the entirety of rainfall infiltrated the soil at conditions of very low initial water table depth, resulting in an increase in water table depth and soil water content but not in surface runoff.
A flash flood watch or a flash flood warning is issued when the flash flood risk index ranges from 40 to 60 or is greater than 60, respectively. After calculating the flash flood index of each of the 38 cases during the six hours preceding the RRT, it was found that a flash flood watch or warning was issued in 27 cases. Accuracy analyses for each pre-RRT hour revealed that the prediction accuracy with respect to the rainfall conditions in the corresponding grid cells was highest between 6 h and 4 h pre-RRT (71-87%) and lowest at 0 h (42-52%). The accuracy decreases as the preceding time decreases, because a shorter window places a stricter condition on the agreement between the RRT and the watch or warning time of the simulated flash flood index. A distributed hydrological or land surface model has the advantage of performing spatially refined simulation of hydrological components over a large area. However, this presupposes the availability of accurate input data for soil and land cover, which describe land surface properties, and of accurate gridded rainfall. These problems are expected to be resolved as more accurate GIS data become available and as radar and numerical models improve.
Overall, the results indicate that the FFRI with temporal and spatial resolutions of 1 h and 1 km, respectively, established for the Han River basin is more than 70% accurate within the 6 h pre-RRT. Assuming that the accuracies of the QPE and QPF data simulated via radar or numerical models coincide with that of the QPE obtained from interpolating the AWS point rainfall data in the Capital Region, it is expected that potential flash flood catastrophe areas can be predicted with relatively high accuracy. Thus, the FFRI, which was obtained from analyzing the correlations between the gridded hydrological components simulated with the TOPLATS land surface model and flash flood catastrophes, can be used for predicting areas that may be affected by flash flooding.
It is worth noting that experimental results might exhibit case-by-case variability. Therefore, the statistical FFRI model should be evaluated further to understand the sensitivity of its coefficients to the different factors. Future work should also analyze the correlations between simulated results and undocumented mountain valley/stream runoff, i.e., flash flood cases with no rescue requests and no catastrophe reports. In addition, there would be value in applying the FFRI in a forecasting mode, e.g., based on QPF, which has a large uncertainty compared with observations; this could potentially provide more scientific and operational value to the flash flood forecasting community. Nonetheless, this study provides an important basis for providing accurate forecasts for ungauged mountainous areas, and these improvements will be pursued iteratively in future work.