A Spatial Improved-kNN-Based Flood Inundation Risk Framework for Urban Tourism under Two Rainfall Scenarios

Urban tourism has been suffering socio-economic challenges from flood inundation risk (FIR) triggered by extraordinary rainfall under climate extremes. The evaluation of FIR is essential for mitigating economic losses and even casualties. This study proposes an innovative spatial framework integrating an improved k-nearest neighbor (kNN) algorithm, remote sensing (RS), and geographic information system (GIS) to analyze FIR for tourism sites. Shanghai, China, was selected as a case study. Spatio-temporal factors, including climate, topography, drainage, vegetation, and soil, were selected to generate several flood-related gridded indicators as inputs to the evaluation framework. The likelihood of FIR was mapped to represent possible inundation of tourist sites under a moderate-heavy rainfall scenario and an extreme rainfall scenario. The resultant map was verified against the maximum inundation extent merged from RS images and water bodies. The evaluation outcomes deliver baseline scientific information for urban planners and policymakers to take cost-effective measures for reducing and avoiding the pressure of FIR on the sustainable development of urban tourism. The spatial improved-kNN-based framework provides an innovative, effective, and easy-to-use approach to evaluate the risk for the tourism industry under climate change.


Introduction
Flood inundation has long had negative effects on human society. It makes up a large proportion of the reported natural disasters worldwide, and the number of events has been increasing over the last 30 years [1]. Fatalities and losses from flood inundation are disproportionately high in developing countries [2]. For example, China has recorded more than a thousand large floods since 206 BC, and its flood damages account for a large share of global flood losses (e.g., almost 10% in 1990-2017) [3], which is higher than for other natural hazards. Severe flood events have also been observed around the globe, such as in Australia and Thailand in 2011, central Europe in 2013, and India and Pakistan in 2014, which has raised public interest in flood inundation [4,5]. Considerable research has therefore been conducted to help people understand, assess, and predict flood events and their impacts [6]. Looking ahead, continuing urban sprawl onto river floodplains and coastal deltas, driven by population growth and migration (e.g., urban tourism), will probably produce a substantial increase in flood inundation risk [7].
Urban tourism is an emerging and pollution-free sector. It brings various socioeconomic benefits by providing employment and cultural exchange. For example, the figures in 2018 show that the number of foreign tourists in Shanghai, China, was about 6,859,000 [8], and brought around CNY 208 billion [9]. However, urban tourism also bears the negative brunt of extreme climates with destructive potentials. For example, tourism in Shanghai has suffered from annual losses of roughly CNY 22 million resulting from flood inundation [10]. Therefore, the evaluation of flood inundation risk (FIR) is vital for urban

Basic Principle of kNN
The principle of kNN is that a query object is assigned to the category shared by its k nearest objects [40]. This corresponds to Tobler's first law of geography, which is the following: "All things are related, but nearby things are more related than distant things" [41]. In a kNN algorithm, two essential parameters determine the category of the query object. One is the k value, which sets the number of nearby neighbors. The other is a distance function that measures the similarity between the query object and the nearest k objects. The commonly used function is the Euclidean distance, which can be described as:

$$d(p, q) = \sqrt{(x_p - x_q)^2 + (y_p - y_q)^2} \quad (1)$$

where p(x, y) and q(x, y) are the query object and a nearby object, respectively, and x and y are the features of p and q. Based on the results calculated from Equation (1), the kNN algorithm performs a vote, and the category of the nearest neighbors decides the class of the query object. However, the process of kNN inference holds uncertainty, which can be further illustrated by Figure 1.
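The voting step can be sketched in a few lines of Python. This is a minimal illustration of standard majority-vote kNN with the Euclidean distance of Equation (1), not the authors' implementation; the function name `knn_classify` and the toy coordinates are hypothetical.

```python
import numpy as np
from collections import Counter

def knn_classify(query, points, labels, k):
    """Classify `query` by majority vote among its k nearest labelled objects."""
    points = np.asarray(points, dtype=float)
    query = np.asarray(query, dtype=float)
    # Euclidean distance from the query to every labelled object (Equation (1)).
    distances = np.sqrt(((points - query) ** 2).sum(axis=1))
    # Indices of the k smallest distances.
    nearest = np.argsort(distances, kind="stable")[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): two feature values per object.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (0.9, 1.2)]
labels = ["inundation", "inundation",
          "non-inundation", "non-inundation", "non-inundation"]
query = (0.2, 0.1)

small_k = knn_classify(query, points, labels, k=1)
large_k = knn_classify(query, points, labels, k=5)
```

With this toy data the inferred class changes between k = 1 and k = 5, which is the kind of sensitivity to k that the text attributes to plain kNN.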

Improved Spatial kNN Algorithm
In Figure 1, objects fall into two categories, inundation or non-inundation, and a query object (in green) belongs to one of them. When k equals 2, the query object is assigned to inundation, since the distance D1 is smaller than D2. However, when k equals 13, the query object is assigned to non-inundation, since the distance from the query object to the non-inundation objects (in red) is less than the distance to the inundation objects (in blue). Hence, the k value determines the category of the query object, and choosing an appropriate k value is a major challenge in a kNN algorithm. Besides, the nearest object may be an invalid value or noise, which has a negative impact on the classification accuracy of kNN. Therefore, this study (1) calculates the total distance between the query object and the k nearby objects in inundation and in non-inundation, respectively; and (2) uses probability to express the likelihood of the inferred results, to improve accuracy and diminish uncertainty in kNN. It can be expressed by the following formula:

$$P = \frac{\sum_{i=1}^{n} D_i^{\mathrm{in}}}{\sum_{i=1}^{n} D_i^{\mathrm{in}} + \sum_{i=1}^{n} D_i^{\mathrm{non}}}$$

where P is the likelihood, D_i^in and D_i^non are the distances from the query object to the i-th nearby object in inundation and in non-inundation, respectively, and n is the total number of objects. The numerator is the total distance from the query object to the k neighboring objects in inundation, and the denominator is the sum of that total and the total distance from the query object to the k nearby objects in non-inundation.
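A minimal sketch of the distance ratio described above, assuming the k nearest objects are selected separately within each class and that the Euclidean distance is used. The function name and argument layout are hypothetical, and the sketch returns the ratio exactly as worded in the text (total distance to inundation objects over the combined total); any further weighting or normalization the authors apply is not reproduced.

```python
import numpy as np

def inundation_distance_ratio(query, inundated, non_inundated, k):
    """Summed-distance ratio as worded in the text (hypothetical helper)."""
    query = np.asarray(query, dtype=float)

    def k_nearest_total(objects):
        objects = np.asarray(objects, dtype=float)
        d = np.sqrt(((objects - query) ** 2).sum(axis=1))
        # Sum of the k smallest distances within this class.
        return np.sort(d)[:k].sum()

    total_in = k_nearest_total(inundated)
    total_non = k_nearest_total(non_inundated)
    # Assumes at least one of the two totals is non-zero.
    return total_in / (total_in + total_non)

# Toy example (hypothetical coordinates): distances are 5 and 10 in both classes.
balanced = inundation_distance_ratio(
    (0, 0), [(3, 4), (6, 8)], [(0, 5), (0, 10)], k=2
)
```

Because the result is a continuous value between 0 and 1 rather than a hard vote, a single noisy neighbor shifts it only slightly, which is consistent with the motivation given in the text.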

Framework Conceptualization
After a review of similar investigations, this study proposes a novel conceptualized framework integrating a spatial improved kNN method, RS, and GIS, based on gridded spatio-temporal data. The framework begins with spatial data acquisition, followed by model development and evaluation of FIR (Figure 2).
1. Spatial data acquisition is the first module, which covers the processing of spatial factors for input into the spatial improved kNN algorithm. The inundation extent, extracted from Landsat imagery, and observed water bodies are merged as the maximum inundation extent (MIE), which is employed as ground truth to verify the spatially deduced results.
2. Model development is the second module, which contains the spatial improved kNN algorithm employed to infer the likelihood of FIR for query objects. The evaluation results are verified against MIE.
3. Risk mapping and evaluation is the final module, which produces and evaluates the spatial likelihood of FIR for tourism sites over a study area.
Model development and implementation form the core of the spatial framework and are discussed in detail below.

Improved kNN Modeling and Implementation
Scenario-based methods are commonly used to explore the potential impact of FIR on populations or assets [42]. In this study, two scenarios, a moderate-heavy rainfall scenario (MHRS) and an extreme rainfall scenario (ERS), are designed to explore the impacts of rainfall on tourism sites. Following the first module, the evaluation indices are standardized from 1 to 4 to represent high to no risk. Based on the k values, training datasets are sampled randomly. The testing datasets, located at the positions in MIE that correspond to the training ones, are employed to verify the prediction results, which generates a confusion matrix. Four accuracy indicators are derived from the matrix to evaluate the accuracy of the proposed model: overall accuracy (OA), kappa coefficient (Kappa), average producer's accuracy (APA), and average user's accuracy (AUA). In each iteration, the study compares the values of OA, Kappa, APA, and AUA with those from the previous iteration; training datasets with higher accuracy values are saved, and the others are discarded. After the iterations, the most accurate training datasets are employed to infer the likelihood of FIR cell by cell over the study area. These actions are processed in ArcGIS 10.2. Python is employed to process the spatial data, and R scripting is used to infer the likelihood of FIR using the proposed model.
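The four indicators can be computed from a confusion matrix as sketched below. The sketch uses the conventional definitions and assumes that rows hold the reference (MIE) classes, columns hold the predicted classes, and APA and AUA are unweighted means over classes; the function name and the example matrix are hypothetical.

```python
import numpy as np

def accuracy_indicators(cm):
    """Return OA, Kappa, APA, and AUA for a square confusion matrix.

    Assumed layout: rows = reference classes, columns = predicted classes.
    """
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    diag = np.diag(cm)
    row_totals = cm.sum(axis=1)  # cells per reference class
    col_totals = cm.sum(axis=0)  # cells per predicted class

    oa = diag.sum() / total
    # Agreement expected by chance, from the marginal totals.
    pe = (row_totals * col_totals).sum() / total ** 2
    kappa = (oa - pe) / (1.0 - pe)
    apa = (diag / row_totals).mean()  # mean producer's accuracy
    aua = (diag / col_totals).mean()  # mean user's accuracy
    return {"OA": oa, "Kappa": kappa, "APA": apa, "AUA": aua}

# Hypothetical 2-class matrix (inundation, non-inundation).
scores = accuracy_indicators([[40, 10],
                              [20, 30]])
```

For the example matrix, OA is 0.7 and Kappa is 0.4, which shows how Kappa discounts the agreement expected by chance.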

Study Region
Shanghai is an internationally celebrated city and a tourism destination with a large number of well-known attractions (Figure 3a), such as the Bund and the Oriental Pearl. It has a population of more than 24 million [43,44]. The population density is about 4000 people per km², which is larger than that of other cities in China. Its terrestrial area is 6340 km², and it is surrounded by water on three sides: the Yangtze River Estuary to the north, the East China Sea to the east, and Hangzhou Bay to the south. The Huangpu River and Suzhou Creek pass through the city. The whole city is situated in a flat and low-lying coastal region where the average elevation is about 4 m. Based on historical records, the region's climate is regarded as highly variable. It has a northern subtropical monsoon climate with four distinct seasons. The mean annual precipitation is about 1200 mm, approximately 70% of which occurs during the wet season (April to September). Besides, the city is hit by typhoons about 1.5 times per year on average. The land-use pattern has also changed with urban sprawl, which alters rainfall-runoff. These facts present an urgent need for evaluating FIR for tourism sites in modern cities.

Data Collection and Processing
FIR arises from a complex system in which various natural and social indices interact [45]. In this study, rainfall was defined as a driving factor, tourism sites as a vulnerability, and other indicators as disaster-prone environments [15]. Based on domain knowledge, nine indices were selected, including rainfall from climate, elevation and slope from topography, proximity and density from drainage, and soil water retention (SWR) from soil interacting with land use and land cover change (LUCC) (Table 1, Figure 3b-i).
Rainfall (over short or prolonged periods) has the potential to be the largest and most critical influence on FIR [15,46]. Because of the limitation of datasets, spatial points extracted from APHRODITE were interpolated using Kriging in ArcGIS to generate two indicators, R20 and R50, for the moderate-heavy rainfall scenario (MHRS) and the extreme rainfall scenario (ERS), respectively (Table 1) [42,47,48]. APHRODITE has been demonstrated to accurately capture the seasonal migration of rain belts in China [49]. Rainfall decides the formation of flood inundation, but topography determines the redistribution of FIR, since low-lying areas are more easily inundated, and drainage systems are needed there to reduce the likelihood of FIR. In this study, three levels of drainage systems were defined based on the speed of drainage under flood annual recurrence intervals (ARIs) (Table 2).
Land use and land cover jointly determine the infiltration capacity of soil types, from which curve number (CN) values are derived. CN is a comprehensive parameter reflecting the characteristics of the watershed before rainfall; it depends on the antecedent moisture condition (AMC), the soil hydrologic characteristics, and the ground cover condition, all of which are referenced from the list published by the Soil Conservation Service (SCS, Table 3) [60]. AMC is a key parameter in calculating the SWR, but it has often been omitted in previous studies. In this study, AMC was divided into three categories: dry (AMC I), normal (AMC II), and wet (AMC III). The level of AMC depends on the distance to water bodies: the closer to water bodies, the more humid the soil is. This study categorized all the soil types into four groups, from group A to group D. Group A includes Fe-leachi-Stagnic Anthrosols, Fe-accumuli-Stagnic Anthrosols, and Parasalic Ochri-Aquic Cambosols. Group B embraces Hapli-Stagnic Anthrosols and Ochri-Aquic Cambosols. Group C includes Fe-accumuli-Stagnic Anthrosols, and Group D covers Marinic Aqui-Orthic Halosols and Ochri-Aquic Cambosols. Based on the CN method (Table 1), the potential maximum SWR at cell i (SWR_i) is parameterized as a function of the CN value (CN_i) for each cell.
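Table 1 is not reproduced in this text, so the exact parameterization is not shown. As an illustration only, the sketch below assumes the standard SCS curve-number relation for potential maximum retention in millimetres, S = 25400 / CN - 254, applied cell by cell; the function name is hypothetical, and the form used in Table 1 may differ in detail.

```python
import numpy as np

def potential_retention_mm(cn):
    """Potential maximum retention (mm) from curve numbers.

    Assumption: standard SCS relation S = 25400 / CN - 254, with 0 < CN <= 100.
    """
    cn = np.asarray(cn, dtype=float)
    if np.any((cn <= 0) | (cn > 100)):
        raise ValueError("CN must lie in the interval (0, 100]")
    return 25400.0 / cn - 254.0

# Hypothetical CN raster: impervious, mixed, and permeable cells.
cn_grid = np.array([[100.0, 80.0],
                    [50.0, 80.0]])
swr_grid = potential_retention_mm(cn_grid)
```

Under this relation a fully impervious cell (CN = 100) retains nothing, while lower curve numbers give larger retention, so cells with high CN contribute more runoff.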
Remote sensing imagery provides rich spatio-temporal data for FIR, but it is easily obscured by clouds. Because of the poor quality of RS imagery, this study merged the processed water inundation extent extracted from Landsat TM images with water bodies to obtain the maximum inundation extent (MIE) [61]. MIE in the southwestern area matches the pattern of inundation extent from the images well, which indicates that the merged MIE can meet the needs of the study. The merged MIE was selected as a surrogate for ground truth to validate the final FIR map.
Since the extracted indices have various measurement scales, all data were registered, projected, rasterized at 100 m resolution, and clipped to the study area in ArcGIS. The data were then rescaled so that each factor contributes relatively equally to the occurrence of FIR. The usual rescaling method is min-max normalization, which can be described as

$$X_{\mathrm{Rescaled}} = \frac{X - \min(X)}{\max(X) - \min(X)}$$

where X_Rescaled is the rescaled value of X, and min(X) and max(X) are the minimum and maximum values of the original dataset X.
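The min-max normalization described above can be sketched for a gridded index as follows. The treatment of NoData cells as NaN and the handling of a constant raster are assumptions made for the example, and the function name is hypothetical.

```python
import numpy as np

def min_max_rescale(x):
    """Rescale a raster to [0, 1]; NaN cells (assumed NoData) stay NaN."""
    x = np.asarray(x, dtype=float)
    lo, hi = np.nanmin(x), np.nanmax(x)
    if hi == lo:
        # Constant raster: map valid cells to 0 to avoid division by zero.
        return np.where(np.isnan(x), np.nan, 0.0)
    return (x - lo) / (hi - lo)

# Hypothetical elevation values in metres, with one NoData cell.
elevation = np.array([2.0, 4.0, 6.0, np.nan])
rescaled = min_max_rescale(elevation)
```

After rescaling, indices measured in different units (for example metres of elevation and millimetres of rainfall) share the same 0 to 1 range, so no single index dominates the distance calculation.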
The calculated indices were reclassified from high to low likelihood of FIR in this study (Figure 4).
Figure 3 (caption fragment): pipeline networks (f) and pump stations (g) in the sub-study area; (h) soil type; (i) land use and land cover.

Results and Discussion
After calculation, FIR maps were derived as outputs from the proposed evaluation framework. The spatially inferred results are divided into 5 levels: very high risk (red), high risk (orange), medium risk (medium sand), low (yellow), and very low (green) (Figure 5a,c), using Natural Breaks (Jenks) in ArcGIS because it picks the class breaks that maximize the differences between classes. Still, the classification method can be modified based on practical demands.

Result Validation
To validate the modeled results, the FIR maps in Figure 5a,c were further classified into 2 categories, inundation and non-inundation, using 50% as the breakpoint. The breakpoint selected has a considerable impact on the FIR evaluation. Previous research reclassified results based on domain knowledge. This study selected 50% as the threshold in an attempt to employ a cost-effective method that yields relatively sound validation results. Cells with a likelihood larger than 50% are classed as inundation, and cells with a likelihood less than 50% as non-inundation. Two accuracy comparisons were conducted between the evaluated results and MIE. The overlay calculations in MHRS (Figure 6a) and ERS (Figure 6b) account for 54.19% and 62.87% of MIE, respectively. It should be noted that the inferred extents are smaller than MIE, since some isolated cells cannot be identified as water areas or are noise data.
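The thresholding and overlay step can be sketched as below. The sketch assumes that the reported percentage is the share of MIE cells that the model also classes as inundation; that reading, the function name, and the toy grids are assumptions rather than details confirmed by the text.

```python
import numpy as np

def overlay_share(likelihood, mie_mask, threshold=0.5):
    """Share of MIE cells whose modelled likelihood exceeds the threshold."""
    likelihood = np.asarray(likelihood, dtype=float)
    mie_mask = np.asarray(mie_mask, dtype=bool)
    # Binary reclassification: inundation where likelihood > threshold.
    predicted = likelihood > threshold
    overlap = np.logical_and(predicted, mie_mask).sum()
    return overlap / mie_mask.sum()

# Toy 2 x 2 grids (hypothetical values).
likelihood = [[0.9, 0.4],
              [0.6, 0.2]]
mie = [[1, 1],
       [1, 0]]
share = overlay_share(likelihood, mie)
```

Passing a different `threshold` value makes it easy to see how strongly the overlay share depends on the chosen breakpoint.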

Sensitivity and Uncertainty Analysis
A sensitivity analysis is employed to explore the relationship between model inputs and outputs [62,63]. It is the key to understanding the parameters, uncertainty, and performance of a model. In this study, the pattern of FIR changes with the rainfall in the two scenarios, and the k values affect the accuracy of the model. Therefore, it is necessary to explore the relationship between the input values (rainfall and k) and the resultant evaluation using OA, Kappa, APA, and AUA to quantify the dependency between inputs and outputs (Figure 7).
In MHRS, the optimal values of OA, Kappa, APA, and AUA equal 0.78, 0.42, 0.7, and 0.71, respectively (Figure 7a), and they increase to 0.8, 0.44, 0.71, and 0.74 in ERS (Figure 7b). This illustrates that the risk level rises with an increase in rainfall, which makes a large contribution to FIR. This study also conducted around 16 million experiments to reveal the value ranges of the four accuracy indicators in the two rainfall scenarios. The experimental results show that their values maintain similar trends, which indicates that the improved kNN algorithm is stable for FIR evaluation. Besides, this study explored the changes of the four indicators in relation to k values from 20 to 300. The optimal accuracy was produced when k equals 32, 59, 114, and 139, which means that in this study the optimal accuracy does not depend on whether k is even or odd.
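A sweep over k of the kind described above can be organized as in the following sketch, which records the best k for each indicator. The `evaluate` callable stands in for one full train-and-validate run and is hypothetical; the toy scoring function and its peak positions are arbitrary and do not represent the study's results.

```python
def best_k_per_indicator(evaluate, k_values):
    """Return {indicator: (best_k, best_value)} over the tested k values.

    `evaluate(k)` is assumed to return a dict of indicator name -> score.
    Ties keep the smallest k because only strict improvements replace it.
    """
    best = {}
    for k in k_values:
        for name, value in evaluate(k).items():
            if name not in best or value > best[name][1]:
                best[name] = (k, value)
    return best

def toy_evaluate(k):
    # Arbitrary stand-in scores with peaks at k = 40 and k = 75.
    return {
        "OA": 1.0 - abs(k - 40) / 300.0,
        "Kappa": 1.0 - abs(k - 75) / 300.0,
    }

# Same k range as in the text: 20 to 300 inclusive.
result = best_k_per_indicator(toy_evaluate, range(20, 301))
```

Keeping a separate optimum per indicator makes it visible when OA, Kappa, APA, and AUA peak at different k values.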
Uncertainty broadly exists in flood risk evaluation [64], and it decreases the accuracy of evaluation [65]. Landsat TM optical images used as model inputs could miss flash rainfall events, since study sites are frequently shielded by clouds during a flood period. This reduces the effectiveness of RS imagery in capturing a fast-moving flood, and the uncertainty escalates with the expansion of study areas [66]. Moreover, uncertainty comes from the classification method used in the assessment. There are various classification methods, such as natural breaks and standard deviation, each of which will produce dissimilar results. However, the conclusive evaluation is an uncertain result instead of the accumulation of all associated uncertainties in flood risk analysis [67]. For this reason, scenarios may be the "best" way to identify the "best" solution in FIR prediction or prevention [68].

Comparison between Improved kNN and kNN
From the visualization in Figure 5, the distribution of FIR in MHRS (a) and ERS (c) is closer to the actual situation than the evaluation results in MHRS (b) and ERS (d) obtained using the original kNN. The main study area (Figure 5a,c) is covered by a low-to-medium level of FIR, since the area has facilities that are well equipped against flood inundation. In particular, the sub-study area is mainly at very low flood risk, since it has a high density of pipeline networks (Figure 3f) and pump stations (Figure 3g). These play an important role in lessening the level of FIR under the two scenarios.
Besides, the color gradients between FIR levels in Figure 5a,c are smooth and natural, which matches the real spatial features of FIR transitions. Concretely, water bodies are located in the very high FIR areas at A1 in Figure 5a, which is sound, whereas they are covered by a medium level of FIR at A2 in Figure 5b. Moreover, the evaluation results in B1 (Figure 5a,c) show that the level of FIR increases with the rise of rainfall using the improved kNN method, whereas B2 in Figure 5b,d cannot reflect this trend, or even shows decreases in FIR. Furthermore, the evaluation results from the original kNN are heavily affected by specific factors. For example, C2 in Figure 5b is mainly influenced by precipitation (Figure 3b), and Chongming Island is mainly affected by SWR (Figure 4i).
[...] (Figure 4h). When the volume of runoff water exceeds the conveying capacity of water channels, it easily inundates the areas surrounding the water bodies [35,69]. Rainfall (Figure 4a,b) contributes to the north area and the southwest area. Besides, elevation (Figure 4c), slope (Figure 4d), and SWR (Figure 4i) jointly decide the redistribution of FIR in these areas because of poor soil porosity and steepness, which make rainfall stay on the earth's surface.
In the sub-study area, the risk above medium level is mainly located at the tributaries of the Huangpu River and the confluences of other rivers, since these areas are fed by excess water from surrounding areas [70,71]. Moreover, the study area is covered by impervious surfaces that hinder natural rainfall infiltration [72]. Therefore, well-performing drainage systems play an important role in reducing the likelihood of FIR. Figure 5a,c illustrates that the level of FIR in the sub-study area is relatively lower than that in other areas.
The statistics for FIR categories were computed for the study area and tourism sites under the two scenarios (Figure 8a,b). MHRS (Figures 5a and 9a) shows that about 7 [...]
In ERS (Figures 5c and 9b), the areas of very high, medium, and low risk increase. Areas classified as very high risk represent 8.2% of the total area (about 537.28 km²), high risk about 7.03% (about 460.37 km²), medium risk about 39.49% (about 2587.57 km²), and low and very low risk about 37.60% (about 2463.97 km²) and 7.69% (about 503.72 km²), respectively. The numbers of tourism sites from very high risk to very low risk are 164, 190, 1188, 2691, and 1676, respectively (Figures 8b and 9d).
To illustrate the hotspots in FIR changes from MHRS to ERS, the calculated MHRS-ERS growth rate was classified into three levels, high, medium, and low, using natural breakpoints of 2.20% and 7.00% (Figure 8c). The shares of very high, medium, and low risk increase by 0.9 percentage points (about 58.78 km²), 1.10 percentage points (about 72.23 km²), and 3.87 percentage points (about 253.88 km²), respectively (Figure 9a,b). The areas of increasing high risk are mainly located in the west, central, and southeast areas of Chongming Island, because these areas have a high density of water bodies and lower elevation.
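The three-level classification of growth rates can be sketched with the two breakpoints given above. How values falling exactly on a breakpoint are assigned is not stated in the text; the sketch assumes they go to the higher class, and the function name and sample rates are hypothetical.

```python
import numpy as np

def classify_growth(rate_percent, breaks=(2.20, 7.00)):
    """Label growth rates (in percent) as low, medium, or high.

    Assumption: a value equal to a breakpoint is placed in the higher class.
    """
    labels = np.array(["low", "medium", "high"])
    # np.digitize returns 0 below the first break, 1 between, 2 at or above the last.
    idx = np.digitize(np.asarray(rate_percent, dtype=float), breaks)
    return labels[idx]

# Hypothetical growth rates in percent.
levels = classify_growth([1.0, 5.0, 9.0])
```

The same function works on a full raster of growth rates, since `np.digitize` preserves the shape of its input.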
With the increase of FIR from MHRS to ERS, 30 tourism sites rise to very high risk, with 12 sites becoming high risk, 296 sites becoming medium risk, and 557 sites becoming low risk (Figures 8d and 9c,d). Among these medium-and-above sites, historic sites account for the highest percentage, followed by amusement parks, scenic spots, and churches. The vulnerability of tourism sites is projected to increase along river channels as rainfall increases, due to an increase in the volume of fast-flowing surface water entering rivers.

Conclusions
This study proposes an innovative spatial framework to evaluate the likelihood of flood inundation risk (FIR) by integrating an improved k-nearest neighbors (kNN) algorithm, RS, and GIS. The improved kNN algorithm was applied to deduce the likelihood of FIR based on multiple sources of spatial flood-related factors. RS was used to extract a maximum inundation extent for data sampling and result validation. GIS was employed to derive input indices by processing a variety of data at multiple temporal and spatial resolutions. The case study illustrates that the methodology can derive sound results. The results show that (1) the improved kNN algorithm is suitable for FIR evaluation; and (2) likelihood can better indicate uncertainty when using the kNN algorithm in FIR evaluation. The spatial framework is implemented programmatically and is repeatable. The methodology is not limited by the number of input indices: if the inputs of the framework are changed, the evaluation results are regenerated accordingly. Therefore, the approach can be used as a feasible methodology not only for FIR, but also for other likelihood-related hazard investigations. The practice provides baseline information for urban planners or decision-makers to construct cost-effective measures that lessen and avoid the pressure of FIR on the tourism industry.
However, good-quality RS images, which are vital for FIR evaluation, are hard to obtain during severe rainfall or large flood events, which may result in flash rainfall events being missed. This study had to employ the processed inundation extent from Landsat TM, and its extent affects the evaluation accuracy of the spatial framework. Besides, the resolution of the rainfall data is coarse, which also affects the modeled formation and spatial distribution of FIR. Furthermore, this study does not consider hydrological factors such as rainfall duration and speed. As a further step, the study plans to conduct a sensitivity analysis to explore the relationship between the breakpoints selected in result validation and the evaluation accuracy, and to investigate more parameters for delivering more robust evaluation results with the least uncertainty.