Characterization of the 2014 Indus River Flood Using Hydraulic Simulations and Satellite Images

Rivers play an essential role to humans and ecosystems, but they also burst their banks during floods, often causing extensive damage to crop, property, and loss of lives. This paper characterizes the 2014 flood of the Indus River in Pakistan using the US Army Corps of Engineers Hydrologic Engineering Centre River Analysis System (HEC-RAS) model, integrated into a geographic information system (GIS) and satellite images from Landsat-8. The model is used to estimate the spatial extent of the flood and assess the damage that it caused by examining changes to the different land-use/land-cover (LULC) types of the river basin. Extreme flows for different return periods were estimated using a flood frequency analysis using a log-Pearson III distribution, which the Kolmogorov–Smirnov (KS) test identified as the best distribution to characterize the flow regime of the Indus River at Taunsa Barrage. The output of the flood frequency analysis was then incorporated into the HEC-RAS model to determine the spatial extent of the 2014 flood, with the accuracy of this modelling approach assessed using images from the Moderate Resolution Imaging Spectroradiometer (MODIS). The results show that a supervised classification of the Landsat images was able to identify the LULC types of the study region with a high degree of accuracy, and that the most affected LULC was crop/agricultural land, of which 50% was affected by the 2014 flood. Finally, the hydraulic simulation of extent of the 2014 flood was found to visually compare very well with the MODIS image, and the surface area of floods of different return periods was calculated. This paper provides further evidence of the benefit of using a hydrological model and satellite images for flood mapping and for flood damage assessment to inform the development of risk mitigation strategies.


Introduction
Flooding is one of the most common environmental hazards, affecting many countries worldwide [1]. Pakistan, like many South Asian countries, is often severely impacted by flooding [2]. The Indus River is the longest river [2] and the most crucial water source in Pakistan, providing water to irrigate the land that produces 90% of the agricultural outputs of the country [3]. This river, however, often bursts its banks during the monsoon season, causing extensive damage to property and crops, as well as loss of lives. Droughts are also

Remote Sensing Data for Land-Use/Land-Cover Type Classification
Images from the operational land imager (OLI) and thermal infrared sensor (TIRS) onboard the Landsat-8 satellite were obtained from the United States Geological Survey (USGS: http:/earthexplorer.usgs.gov/) to identify the land-use/land-cover (LULC) types of the river basin. The study area is covered by Path 151 of the Landsat-8 satellite, providing images at a temporal resolution of approximately 16 days. Three satellite images were obtained over the study area: before the flood (July 22), during the flood (August 7), and after it (September 24). Google Earth images were also obtained for comparison with the LULC types derived from the Landsat images.

The HEC-RAS Model and DEM Data
The US Army Corps of Engineers Hydrologic Engineering Centre River Analysis System (HEC-RAS) was applied to the study region to estimate the extent-and damageof the 2014 flood. The n-values of Manning's coefficient required by the model were based on the LULC type, with the LULC classification identified using the Landsat images mentioned above.
The HEC-GeoRAS GIS extension of the model comprises a set of procedures and tools for preparing geometric data for import into the HEC-RAS model to give a 3D view of the river basin, thereby allowing for an estimation of the spatial extent of a flood and its depth [28,29]. This import file requires a DTM represented by triangulated irregular networks (TIN). Each triangle represents a uniform slope steepness and flows direction. The DTM was created using data from the Shuttle Radar Topography Mission (SRTM) at 30 m resolution as distributed by the Consortium for Spatial Information (CSI). The study

The HEC-RAS Model and DEM Data
The US Army Corps of Engineers Hydrologic Engineering Centre River Analysis System (HEC-RAS) was applied to the study region to estimate the extent-and damageof the 2014 flood. The n-values of Manning's coefficient required by the model were based on the LULC type, with the LULC classification identified using the Landsat images mentioned above.
The HEC-GeoRAS GIS extension of the model comprises a set of procedures and tools for preparing geometric data for import into the HEC-RAS model to give a 3D view of the river basin, thereby allowing for an estimation of the spatial extent of a flood and its depth [28,29]. This import file requires a DTM represented by triangulated irregular networks (TIN). Each triangle represents a uniform slope steepness and flows direction. The DTM was created using data from the Shuttle Radar Topography Mission (SRTM) at 30 m resolution as distributed by the Consortium for Spatial Information (CSI). The study area in the dataset fell within one single tile. Several studies have used SRTM data to extract the boundary of a watershed and its drainage network [30,31]. This approach was followed, with the watershed delineated using the following steps in the GIS software ArcMap 10.6: i) importing the SRTM data, ii) creating a depression-less DTM over the study region, with the "Fill" method used to remove all sinks to create an accurate representation of flow direction, iii) determine the flow path for each cell in the DTM using a technique named "Flow Path," and (iv) calculate the flow aggregation value at each cell given the number of upstream cells flowing into it. Depressions, also referred to as sinks, have no drainage due to being surrounded by cells at higher elevations and are often the result of DTM processing. To represent the DTM as a TIN, points were extracted around the edge of the DTM spaces, and then, the surface was filled for each gap, creating a filled surface from the DTM. TINs were calculated from these points and converted to a base surface. The filled base surface from the TINs was subtracted from the filled surface from the DTM, and the result was added to the DTM base surface. The contour intervals were made over the area of interest, and the TIN was generated using these contours to create the geometric data.

Streamflow Data
Pakistan Water and Power Development Authority (WAPDA) provided discharge data for the Indus River at Taunsa Barrage from 1986 to 2014, which were used to identify the maximum daily discharge. The HEC-RAC model, however, requires as input the maximum instantaneous peak flow. As there is a lack of instantaneous peak flow data at the study location, it was estimated using Equation (1), a linear regression model based on 12 pairs of maximum instantaneous and maximum daily discharge data with a Pearson correlation coefficient of 0.94 [32]: where Q p = maximum instantaneous discharge (m 3s −1 ) and Q M = maximum daily discharge (m 3s −1 ). Table 1 depicts the resulting maximum instantaneous discharge at Taunsa Barrage. Table 1. Maximum instantaneous discharge (Q m ) and maximum daily discharge (Q p ) at Taunsa Barrage.

Serial Number
Year 1986  2530  2170  16  2001  3510  2456  2  1987  2850  2230  17  2002  3220  2235  3  1988  2942  2632  18  2003  2930  2156  4  1989  3442  2563  19  2004  2720  2230  5  1990  4530  3325  20  2005  2420  1562  6  1991  4918  2635  21  2006  2520  1456  7  1992  3513  2546  22  2007  2210  1320  8  1993  2967  2364  23  2008  2518  1648  9  1994  3840  3385  24  2009  2695  1241  10  1995  3868  3265  25  2010  4230  2530  11  1996  3510  2568  26  2011  3834  2359  12  1997  3420  2346  27  2012  3540  2640  13  1998  3630  3156  28  2013  3530  2654  14  1999  3340  3562  29  2014  4130  3125  15 2000 Figure 2 illustrates the methodological approach used in this study. The OLI Landsat-8 images were pre-processed using layer stacking and mosaicking. They were subject to atmospheric and radiometric correction with the sensor-specific information in the metadata file analyzed in ENVI v 5.4. In this process, spectral radiance was transformed to a planetary or exoatmospheric reflectance by normalizing solar irradiance to eliminate the influence of different solar zenith angles. The process accounted for different values of exoatmospheric solar irradiance resulting from the difference in a spectral band [33]. Atmospheric correction was also performed to remove the influence of ambient atmospheric conditions, including aerosols, thereby increasing the interpretability of the satellite image. metadata file analyzed in ENVI v 5.4. In this process, spectral radiance was transformed to a planetary or exoatmospheric reflectance by normalizing solar irradiance to eliminate the influence of different solar zenith angles. The process accounted for different values of exoatmospheric solar irradiance resulting from the difference in a spectral band [33]. Atmospheric correction was also performed to remove the influence of ambient atmospheric conditions, including aerosols, thereby increasing the interpretability of the satellite image. Radiometric correction is associated with a sensor that records the electromagnetic radiation intensity as a digital number (DN) for each pixel. These DNs may be translated to more practical real-life units such as radiance, reflectance, or temperature by using the sensor-specific information obtained from the metadata file of the Landsat image. Some software packages for image processing include radiometric calibration tools; for instance, Landsat-8 data can be translated to reflectance in ENVI v 5.4 without measuring the radiance first.
After atmospheric and radiometric corrections, ArcGIS 10.6 was used to perform a supervised classification on the Landsat images obtained before, during, and after the occurrence of the 2014 flood to identify the LULC types during each period ( Figure 3) following the example of [34,35]. A maximum likelihood classification (MLC) [36,37] was also applied on the outcomes of the supervised classification, which resulted in a LULC classification with six categories, i.e., crop/agricultural land, built-up area, barren land, sand, water/wetland, and deposited material. Radiometric correction is associated with a sensor that records the electromagnetic radiation intensity as a digital number (DN) for each pixel. These DNs may be translated to more practical real-life units such as radiance, reflectance, or temperature by using the sensor-specific information obtained from the metadata file of the Landsat image. Some software packages for image processing include radiometric calibration tools; for instance, Landsat-8 data can be translated to reflectance in ENVI v 5.4 without measuring the radiance first.
After atmospheric and radiometric corrections, ArcGIS 10.6 was used to perform a supervised classification on the Landsat images obtained before, during, and after the occurrence of the 2014 flood to identify the LULC types during each period ( Figure 3) following the example of [34,35]. A maximum likelihood classification (MLC) [36,37] was also applied on the outcomes of the supervised classification, which resulted in a LULC classification with six categories, i.e., crop/agricultural land, built-up area, barren land, sand, water/wetland, and deposited material.

Satellite-Based Flood Damage Assessment and Its Validation
Flood damage was assessed using a change detection technique, which consisted of calculating the surface area of the different LULC types of the watershed changing category because of the flood. For instance, an area shifting from "built-up area" or "agricultural/cropland" to "deposited material" in a satellite image obtained after the flood was considered a damaged area.   Google Earth images and a total of 212 samples with a minimum of 30 samples for each LULC type were collected by a team of WAPDA together with their GPS location. These were used to validate the LULC type classification after the flood. The verification was done at random points using the random sample points method in the ArcGIS Spatial Analyst extension. These points were then converted into KML format so that they could be displayed in Google Earth. The accuracy of the LULC type classification after the flood was then measured using 600 ArcGIS random points and the 212 GPS points.
Furthermore, 25 GPS points of the selected groups were used for the post-flood verification process. The accuracy of maps that depicted flooded areas was assessed using 440 ArcGIS random points on the Google Earth images and 160 GPS points, with the uncertainty matrix established using the results obtained from this comparison. The assessment was then quantified using the Kappa coefficient (K).

Mapping the Spatial Extent of Floods of Different Return Periods and Comparing the Extent of the 2014 Flood with MODIS Imagery
The floodplain maps were produced using the following steps: (1) a flood frequency analysis was performed to relate the discharge data at Taunsa Barrage to specific return periods; (2) development of the SRTM-based DTM model; (3) extract the boundary of the watershed and its drainage network; (4) prepare the geometric data using the HEC-GeoRAS extension; (5) run the HEC-RAS model to simulate river discharge; (6) produce floodplain maps for floods of various return periods using the peak discharge identified from the flood frequency analysis.
In the flood frequency analysis, the log-Pearson form III [38], the Gumbel [39,40], and the log-normal [41,42] distributions were fitted to the discharge data depicted in Table 1. The Kolmogorov-Smirnov (KS) test identified the log-Pearson III as the best distribution using the approach described in Khattak et al. [32]. The distribution was then used to estimate the peak discharge value for a flood of a given return period. This peak discharge value was input into the HEC-RAS model to map the extent of the flood.
The watershed characteristics, such as the river centerline, were extracted from the DTM using the Geospatial Hydrologic Modeling Extension of the HEC model (HEC-GeoHMS). Geometric data for the river, such as cross-sections and bank stations and flow direction lines, were prepared using HEC-GeoRAS. In addition to the DTM, geo-referenced natural-color images from Google Earth were obtained to estimate Manning's roughness coefficient, a model parameter, for each of the relevant LULC categories (Table 2). For this, separate polygons were drawn for each LULC category and Manning's roughness coefficient was calculated for each polygon using the method proposed by McCuen [43] and Chow et al. [20], while assigning a value of 0.32 for single-story and double-story houses [44]. After implementing all geometric data specifications and assigning Manning's roughness coefficient values to each LULC category, the GIS file was imported into the HEC-RAS. The next step consisted of providing the steady flow and boundary conditions. For steady and unsteady analyses, standard depth is the most employed boundary state and, accordingly, the HEC-RAS model uses this value to estimate the depth of a flood. The energy slope can be estimated by calculating the slope of the river stretch downstream of the modeled river reach. To run an unsteady model, external boundary conditions are required. These are the boundary conditions that must be applied to the upstream and downstream ends of the river. Internal border requirements are optional and allow the user to identify gate operations and associated modifications to the flow along the length of the river.
The level of the flood along the different stretches of the river was simulated using the peak river flow introduced into the HEC-RAS model. This peak river discharge corresponded to floods with a return period of 5-, 10-, 50-, 100-, and 150-year, along with using the maximum instantaneous discharge value of the 2014 flood. The model produced as outputs the flow depth, the cross-sectional discharge, and other related data. After deleting all errors and after the water surface generation and floodplain delineation, the output file from HEC-RAS was imported into ArcGIS, and a floodplain map was produced. Only the flood depth could be obtained on this map; therefore, a smooth water surface was created.
The UNITAR Operational Satellite Application Network (UNOSAT) analyzed the 2014 Indus River flood using a MODIS image taken on July 31. This image was used to examine the extent of the flood and its damage and to compare it with that simulated by the model. Figure 3 illustrates the LULC types before the flood, during the flood, and after it. 'Deposited material' was not a LULC category for the image taken during the flood but only for the image after it. The change detection technique reveals significant changes for all LULC types. Before the flood, the watershed, covering a surface area of approximately 581.8 km 2 , consisted of 63.8% crop/agricultural land, 11.3% built-up area, 21.1% barren land, 0.9% sand, and 1.1% deposited material (Figure 4). The 'Water/Wetlands' category covered only 1.1% of the watershed area, but, as expected, this increased dramatically to 38% during the flood, therefore affecting other LULC types. Specifically, the 'built-up' LULC type decreased from 11.34 to 9.35%, the 'crop/agricultural land' category, which covers the largest surface area of the catchment, decreased from 63.8 to 55.9%, and the 'barren land' LULC type decreased from 21.1 to 17.2% because of the flood; hence explaining the extensive damage that the 2014 flood caused to crops and the built environment.    (Table 3). Overall, the results reveal a high level of accuracy of the LULC categorization technique. For the pre-flood image, the highest UA, i.e., 98.98%, is attained for the 'built-up area' class, while the highest PA, also at 98.98%, is seen for the 'crop/agricultural land' LULC type. After the flood, the water category returned nearly to its original pre-flood level, decreasing from 38% to 1.87%. However, the percentage of the watershed covered by the 'built-up' LULC category did not return to its original value, but to only 9.35%. This is likely because after the flood water receded, materials were deposited, hence this category showed a significant increase from 1.10% to 15.04% in the post-flood image (Figure 4). An increase from 45.22% to 55.92% was also seen in the 'crop/agricultural land' category, while the 'barren land' LULC type, for its part, remained unchanged. The category defined as 'sand' increased marginally from 0.14 to 0.60%, as seen in Figure 3 (right image).

Damage Assessment
This study shows that the 'crop/agricultural land' was the most damaged LULC class because of the 2014 flood. Approximately 371.2 km 2 of crop/agricultural land was inundated compared to 65.98 km 2 for the built-up area. Furthermore, the results highlight that deposited material damaged about 45.91 km 2 of agricultural/cropland and 11.57 km 2 of built-up area ( Figure 5). gory showed a significant increase from 1.10% to 15.04% in the post-flood image ( Figur  4). An increase from 45.22% to 55.92% was also seen in the 'crop/agricultural land' cate gory, while the 'barren land' LULC type, for its part, remained unchanged. The categor defined as 'sand' increased marginally from 0.14 to 0.60%, as seen in Figure 3 (right image

Damage Assessment
This study shows that the 'crop/agricultural land' was the most damaged LULC clas because of the 2014 flood. Approximately 371.2 km 2 of crop/agricultural land was inun dated compared to 65.98 km 2 for the built-up area. Furthermore, the results highlight tha deposited material damaged about 45.91 km 2 of agricultural/cropland and 11.57 km 2 o built-up area ( Figure 5).

Frequency Analysis
The maximal instantaneous discharges at Taunsa Barrage for floods of different return periods are indicated in Table 4 using the LN, Gumbel, and LP3 distributions, while Figure 6 depicts the discharge for each of those distributions. For the three distributions, the recorded discharge values are higher than those provided by [2]. The expected flood peaks using the LN distribution are larger than those obtained using the LP3 and Gumbel distributions for all return periods greater than 50 years. The K-S test results for Taunsa Barrage are shown in Table 5.  Figure 6. The discharge of the Indus River at Taunsa Barrage using the log-Pearson type III, Gumbel, and log-normal probability distributions.   The LP3 distribution was used to predict flood peaks at Taunsa Barrage based on the value of the K-S statistical test. For the return periods of 5-, 10-, 50-, 100-, and 150-year as well as for the 2014 flood, the instantaneous flows derived from the LP3 distribution were used as steady flow inputs into the HEC-RAS model. The first segment was downstream from lower Taunsa at some distance. The topography was comparatively precise compared to the upstream parts, while the second part was at the intersection of the Indus River. Both floods with 100-and 150-year return periods show higher water levels than the water levels that occurred during the 2014 flood (Figure 7).

Flood Plain Mapping
The flooded area for floods of different return periods was delineated using the model and Figure 8 depicts the areas of the river basin expected to be flooded for a flood with a 50-year return period. Figure 9

Flood Plain Mapping
The flooded area for floods of different return periods was delineated using the model and Figure 8 depicts the areas of the river basin expected to be flooded for a flood with a 50-year return period. Figure 9 illustrates the extent of the flood according to the model in comparison to the extent observed on the MODIS satellite image. Through visual inspection, the hydraulic model simulates very well the spatial extent of the 2014 flood. Although the timing of the satellite image does not exactly correspond with the peak flood timing, field surveys in the study area indicate that the highest flood happened around August 7. However, some positions had minor deviations, the accuracy of the floodplain geometry used in the HEC-RAS model may cause those deviations. There were several islands and reservoirs on the water surface created by ArcGIS, possibly because the DEM elevations had a 1 m interval. The surface of water assumed constant elevation values. Consequently, compared to the lower reach, the flooded area in the upper section of the river and flow depth are comparatively smaller.
Near Taunsa Barrage lies the highest area impacted. It can be seen from Table 6 that for the 100-year return period, more than 147% of the surface area is expected to be flooded.   Overall, this study indicates that the accuracy of the HEC-RAS model to simulate the spatial extent of the 2014 flood of the Indus River to be very good, through visual inspection, but better calibration of the channel roughness could potentially further increase the accuracy of the model.

Discussion
Pakistan is a country highly vulnerable to floods [45,46]. For instance, in the past decade, other disasters were overshadowed by floods due to the heavy death toll and the significant disruption to the economy that they caused [34,47]. In 2014, heavy monsoon rains resulted in a high discharge of the Indus River, which exceeded the channel capacity, causing a major flood in Muzaffargarh District of the Punjab province. Near Taunsa Barrage lies the highest area impacted. It can be seen from Table 6 that for the 100-year return period, more than 147% of the surface area is expected to be flooded. Overall, this study indicates that the accuracy of the HEC-RAS model to simulate the spatial extent of the 2014 flood of the Indus River to be very good, through visual inspection, but better calibration of the channel roughness could potentially further increase the accuracy of the model.

Discussion
Pakistan is a country highly vulnerable to floods [45,46]. For instance, in the past decade, other disasters were overshadowed by floods due to the heavy death toll and the significant disruption to the economy that they caused [34,47]. In 2014, heavy monsoon rains resulted in a high discharge of the Indus River, which exceeded the channel capacity, causing a major flood in Muzaffargarh District of the Punjab province.
This paper showed that a supervised classification of Landsat images was able to identify the LULC types of the study region with a high degree of accuracy. Even though Landsat images are available at a relatively high spatial resolution, they are typically available every 15 days over the study area [34], a time interval that is not frequent enough for flood monitoring and that is also often not suitable for damage assessment, especially given that cloud coverage sometimes restricts the use of optical images [48,49]. This was different for path 151 of the Landsat, which provided cloud-free images over the study area at the time of the 2014 flood, allowing for the current study to be performed.
Previous research has shown the potential of optical satellite imagery to monitor inundated farmland and built-up areas [2]. This study has further expanded this evidence to other LULC types. Moreover, the potential of using Landsat images for damage assessment, even many days following a flood, is demonstrated in this paper, a result in agreement with other studies [26,37]. This paper also shows that sediment build-up can also be detected with good accuracy over crop/agricultural fields and built-up areas [48].
When assessing the reliability of the flood monitoring reports, the precision of the satellite data and the methods used to process and analyze them must also be considered. This analysis performed in this paper has provided an accuracy of more 85% for the LULC categorization. This analysis identified 'sand' with an accurate accuracy, however, wet sand has often been mistaken with water and vegetation/agricultural land in some instances. This is also observed in some situations where blurred pixels, crop/agricultural land, and built-up areas are also misclassified in transition areas.
For flood mapping and damage assessment, high-resolution SAR data are suggested [50]. This is because radar satellite imagery can penetrate clouds, thereby they can be used under any weather conditions [50], and they are considered more accurate to assess disruption to crop/agricultural land and built-up areas [6,48]. However, freely available SAR images could not be obtained during the timing of the 2014 flood. Finally, in built-up areas where floodwaters flow over buildings, roads, and other elements, the modest 30 m spatial resolution of the Landsat images can lead to misclassification.
Another objective of this study was to evaluate the suitability of the HEC-RAS model to simulate the water surface profiles and determine the spatial extent of floods of different return periods. It was found that the HEC-RAS model was able to replicate the magnitude of the 2014 flood. The floodplain map showed that the flood levels were about four times higher for a flood with a 50-year return period than those under typical flow conditions. During the field surveys, it was noticed that the marginal barriers are necessary to restore, because the villagers cut them down in many places and build passages for their tractors and cattle. The deterioration of the foundations of barriers has contributed to the extent of the 2014 flood. The interference by citizens is another related concern with the protection of the floodplain in the basin. In recent years, the population and anthropogenic activities have increased in the floodplains. The floodplain must be reinstated or maintained to its natural undeveloped state to ensure citizens and infrastructure safety. Protection from damage and restoring the floodplain will significantly mitigate the risk of flooding. This paper has shown the potential of applying the HEC-RAS model in the simulation of the 2014 flood, and could be used by relevant authorities to inform risk mitigation strategies. It is cost-effective and would allow government agencies to reduce flood damage. While hydraulic models are known to be challenging to implement, several institutions in Pakistan are now familiar with the HEC-RAS model.

Conclusions
This study proposed a modelling approach using GIS and RS to depict the spatial extent of the 2014 flood of the Indus River in Pakistan, comparing the model output with MODIS satellite imagery, and determining the extent of floods of different return periods for the basin, in addition to assessing the damage caused by the flood. The methodology consisted of using Landsat images to identify the various LULC types over the watershed and the approach was subsequently evaluated using Google Earth images and field data with an overall classification accuracy of 85% obtained. Also, the analysis found that the 'crop/agricultural land' LULC type was the most affected by the flood. This research also evaluated the HEC-RAS model to simulate the 2014 flood and using the model to simulate floods of different return periods. The K-S statistical test identified the LP3 probability distribution to be best at simulating the flow regime of the Indus River at Taunsa Barrage, and this distribution was then used to identify the peak river discharge for floods with a 5-, 10-, 50-, 100-, and 150-year return period, which was then used as input into the HEC-RAC model to estimate the areas of the watershed at risk of flooding for each return period. The spatial extent of the 2014 flood, as simulated by the model, was found to agree very well with the extent of the flood as observed on a MODIS satellite image, showing the potential of using the model and the approach presented in this paper for risk mitigation.

Data Availability Statement:
The data presented in this study are available on request from the first or corresponding authors.