GIS-Based Modeling for Selection of Dam Sites in the Kurdistan Region, Iraq

Iraq, a country in the Middle East, has suffered severe drought events in the past two decades due to a significant decrease in annual precipitation. Water storage by building dams can mitigate drought impacts and assure water supply. This study was designed to identify suitable sites to build new dams within the Al-Khabur River Basin (KhRB). Both the fuzzy analytic hierarchy process (AHP) and the weighted sum method (WSM) were used and compared to select suitable dam sites. A total of 14 layers were used as input dataset (i.e., lithology, tectonic zones, distance to active faults, distance to lineaments, soil type, land cover, hypsometry, slope gradient, average precipitation, stream width, Curve Number Grid, distance to major roads, distance to towns and cities, and distance to villages). Landsat-8/Operational Land Imager (OLI) and QuickBird optical images were used in the study. Three types of accuracies were tested: overall, suitable pixels by number, and suitable pixels by weight. Based on these criteria, we determined that 11 sites are suitable for locating dams for runoff harvesting. Results were compared to the location of 21 preselected dams proposed by the Ministry of Agricultural and Water Resources (MAWR). Three of these dam sites coincide with those proposed by the MAWR. The overall accuracies of the 11 dams ranged between 76.2% and 91.8%. The two most suitable dam sites are located in the center of the study area, with favorable geology, adequate storage capacity, and in close proximity to the population centers. Of the two selection methods, the AHP method performed better as its overall accuracy is greater than that of the WSM. We argue that when stream discharge data are not available, use of high spatial resolution QuickBird imageries to determine stream width for discharge estimation is acceptable and can be used for preliminary dam site selection. The study offers a valuable and relatively inexpensive tool to decision-makers for eliminating sites having severe limitations (less suitable sites) and focusing on those with the least restriction (more suitable sites) for dam construction.

The study area, which includes the sixth-order Al-Khabur River, carries all runoff following precipitation events in the Al-Khabur basin. The area shows significant seasonal variations in precipitation, temperature, and potential evaporation, and is characterized by wet winters and dry summers ( Figure 2). The bulk of the annual precipitation (586 mm) occurs from October to May. For the 2001-2005 period, the highest average monthly precipitation, with an average value of 134.3 mm, occurred in January. July was marked by the highest average monthly evaporation rate, with an average value of 354.7 mm. Monthly mean temperature varied between 8.47 (January) and 33.96 °C (July). The hottest average monthly temperature of 41.31 °C was recorded in July, and the coldest average monthly temperature of 4.17 °C in January. Al-Khabur River is fed by rainfall and snowmelt, resulting in peak discharge in spring and low discharges in summer and early fall. The study area, which includes the sixth-order Al-Khabur River, carries all runoff following precipitation events in the Al-Khabur basin. The area shows significant seasonal variations in precipitation, temperature, and potential evaporation, and is characterized by wet winters and dry summers ( Figure 2). The bulk of the annual precipitation (586 mm) occurs from October to May. For the 2001-2005 period, the highest average monthly precipitation, with an average value of 134.3 mm, occurred in January. July was marked by the highest average monthly evaporation rate, with an average value of 354.7 mm. Monthly mean temperature varied between 8.47 (January) and 33.96 • C (July). The hottest average monthly temperature of 41.31 • C was recorded in July, and the coldest average monthly temperature of 4.17 • C in January. Al-Khabur River is fed by rainfall and snowmelt, resulting in peak discharge in spring and low discharges in summer and early fall. ISPRS Int. J. Geo-Inf. 2020, 9, x FOR PEER REVIEW
The pixel sizes of the thematic maps were resampled to obtain the exact spatial resolution as the pixel size of the digital elevation model (DEM) from the Shuttle Radar Topography Mission (SRTM) (i.e., 30 m spatial resolution). The predictive input factors are either continuous or discrete. Factors such as elevation, distance to road, and slope gradient, are continuous, while soil, land cover, and lithology are discrete (Table 1). We used spatial analyst tools of ArcGIS to convert the continuous to discrete factors. To do that, we classified each continuous input factor into five major classes, which are: most suitable, suitable, moderately suitable, less suitable, and not suitable. The weight of the five main classes are 1, 3, 5, 7, and 9,
The pixel sizes of the thematic maps were resampled to obtain the exact spatial resolution as the pixel size of the digital elevation model (DEM) from the Shuttle Radar Topography Mission (SRTM) (i.e., 30 m spatial resolution). The predictive input factors are either continuous or discrete. Factors such as elevation, distance to road, and slope gradient, are continuous, while soil, land cover, and lithology are discrete (Table 1). We used spatial analyst tools of ArcGIS to convert the continuous to discrete factors. To do that, we classified each continuous input factor into five major classes, which are: most suitable, suitable, moderately suitable, less suitable, and not suitable. The weight of the five main classes are 1, 3, 5, 7, and 9, where the not suitable is 1 and the most suitable is 9. We used the natural breaks method because it allows to reduce the variance within classes and maximizes the variance between classes [32]. The number and boundary of classes can significantly influence the results of the statistical methods [33]. The classes in the discrete input factors were assigned to have the same five classes (i.e., most suitable, suitable, moderately suitable, less suitable, and not suitable).

Suitable Dam Site Selection Model
Although several MCDM methods are available, there is no specific method that could be considered most suitable for all types of decision-making situations [34][35][36]. A big criticism of MCDM is the fact that different approaches can yield different results if applied to the same problem [37]. The determination of a suitable MCDM method is thus not an easy task and the focus should be on careful selection of the method [34]. The literature presents several practical applications of comparative analyses of different MCDM methods [11,12,[38][39][40][41][42][43][44][45][46]. In this study, we used WSM and AHP to determine suitable locations for dams.

Weighted Sum Method (WSM)
WSM does not take into account the significant deficiencies that can occur as input factors [47], given that all factors have equal weight. In the first step, we classified each factor into five classes. These were 1, 3, 5, 7, and 9 for the not suitable, less suitable, moderately suitable, suitable, and most suitable for dam site selection, respectively. The weight of these five classes was determined according to the suitability of each class to locate the dam, as shown in Table A3 (column "Rank") and Table 1 (column "Relation intensity"). We relied mainly on previous literature, such as References [9,14,30], and our own expert opinion to compute the weights for the factors. The next step is summation of all factors following Equation (1), suggested by Fishburn [12].
where n is the number of factors, a ij is the actual value of the i of the j criterion, and w j is the weight of the j criterion.

Analytic Hierarchy Process (AHP)
In 1990, Saaty proposed the AHP method, an easy-to-use method that calculates the index weight by comparing the predictive factors with each other [48]. It is one of the most commonly used methods for dam site selection. The GIS environment was used to determine suitable sites for dam, and ratings of each predictive factor are provided on a five-point continuous scale. The weight of each predictive factor was estimated depending on the relation intensity of the factors that influence dam site selection (Table 1). These weights have been computed depending mainly on the previous literature, such as References [9,14,30], and our own expert opinion. The map of suitable sites for dams is computed by the raster overlay algorithm, using Equation (2) [49]: where x i is the value of predictive factor i (where i = (predictive factors listed in Table 1)), w i is the weight for predictive factor i, and n is the number of predictive factors. We correlated all predictive factors used by normalizing their scales and units, using the following equation (Equation (3)): where Z i is the normalized value of pixel, X i is the value of pixel, X min is the minimum value of pixel, and X max is the maximum value of pixel.
As the dam sites are located within the river courses, we made sets of buffer zones (250, 500, and 1000 m) around the drainage networks. The two maps (i.e., WSM and AHP) were intersected with these three buffer zones. The pixels within these three zones that received average value ≥ moderately suitable, were selected for dam site location.

Accuracy Assessment and Dam Site Selection
We followed Noori [14] by modifying the segmentation accuracy assessment [50] to evaluate the results of AHP and WSM methods. The method used the identified number of segments to calculate the summation of distances from suitable pixels to the reference point. Initially, the 21 large dams (Table A1) proposed by MAWR were used as reference points [16]. Thereafter, the resulting maps of WSM and AHP methods were categorized into five classes: most suitable, suitable, moderately suitable, less suitable, and not suitable for location of dams. We created sets of buffer zones (250, 500, and 1000 m) around the reference points. The total pixels' number, the suitable pixels' numbers, and the distance between the reference and the pixels within the buffer, were calculated. Finally, overall accuracy (OA) of the suitable pixels was calculated using Equations (4), (5), and (6): where A s is the accuracy of the suitable pixels by number, N s is the number of suitable pixels, N is the total pixels, A w is the accuracy of the suitable pixel by weight, W is the summation of weights of the total pixels, and O A is the overall accuracy. In order to refine our approach, we also applied the threshold operation. The selection of the threshold values for the suitable method was determined experimentally. The final thresholded raster of ISPRS Int. J. Geo-Inf. 2020, 9, 244 7 of 34 the suitable method was then used to locate the areas representing potential dam sites. These locations have been determined using shapefile of point feature type.
Two geological maps from published reports of Sissakian [52] and Al-Mousawi [53], at a scale of 1:250,000, were used in this study. These maps were scanned at 300 dots per inch (dpi) and georeferenced to the Universal Transverse Mercator (UTM) coordinate system (zone 38 north). The lithological units, tectonic units, and faults were digitized in a GIS database. The lithology and tectonic zones shapefiles were converted to raster format using a spatial resolution of 30 m.
Lithology of the study area includes 24 rock units ( Figure 3). This raster layer was used as the lithological factor. The Ordovician period includes one unit (i.e., Khabour Formation) consisting of sandstone and shale rocks. Rock units of the Carboniferous-Jurassic periods consist of limestone, shale, marl, and siltstone. Rocks of the Cretaceous period comprise limestone, marl, dolostone, and sandstone. The Tertiary period units consist of clastic rocks such as sandstone, conglomerate, siltstone, claystone, and marl. Quaternary sediments ( Table 2) include residual soil, slope debris, and flood plain deposits [52,54].
According to Foad [53], the study area is located within the Unstable Shelf, which is a part of the Zagros Fold-and-Thrust Belt. This belt is approximately 2000 km long, extending from southeast Turkey through Iraq to southern Iran [55][56][57][58][59][60]. The unstable shelf includes the Imbricated Zone (IZ) and the High Folded Zone (HFZ). The IZ and HFZ cover 24% and 76% of the study area, respectively. The IZ lies to the north of the study area, and the HFZ is located to the south ( Figure 3). ISPRS Int. J. Geo-Inf. 2020, 9, x FOR PEER REVIEW Figure 3. Geological [52,53] and tectonic map [53] of the study area.   Distance to lineaments and active faults have been used as predictive factors because they represent potential weakness zones. Since highly faulted areas are not suitable for dam construction [61], dam sites should be located at least 100 m away from lineaments and active faults [14]. Based on these requirements, we implemented all parameters shown in Table 3 [62]. The result was exported as a shapefile and modified within the ArcGIS [63] environment.
We obtained information on active faults by digitizing the two series of geological maps mentioned above [52,54]. The study area includes 53 fault segments, four of which are normal faults, and 11 are thrust faults, while the rest are unclassified ( Figure 3). The total length of the faults is~297.1 km. Interestingly, 10 of them are >2 km in length. The main direction of the thrust faults is NW-SE, while the main directions of the rest of the faults are NE-SW ( Figure 3). Distance to the faults ranges between 0 and 34.02 km ( Figure 4A). The faults shapefile was used to prepare the raster of faults factor. The Euclidean distance to the closest faults was calculated for each cell.
According to Javhar et al. [62], the best band for lineaments extraction from Landsat-8/Operational Land Imager (OLI) is the spectral band 5 (near infrared: 0.85-0.88 µm). Therefore, we used this spectral band acquired on 24 September 2018 [64] to extract the lineaments. The scene was cropped to cover the study area. The lineaments were mapped using lineament extraction tool available in the PCI Geomatica software [65]. Table 3. Threshold parameters and values of lineament extraction [62].

Parameters
Value RADI-radius of filter in pixels; GTHR-threshold for edge gradien; LTHR-threshold for curve length; FTHR-threshold for fitting line error; ATHR-threshold for angular difference; and DTHR-threshold for linking distance.
The lineaments shapefile was used to prepare the raster of lineament factors. The Euclidean distance to the closest lineament was calculated for each cell ( Figure 4A). The study area includes 2045 lineaments, most of them trending in NE-SW and NW-SE direction. The total length of faults is~2114.9 km. More than 63.2% of the lineaments are <1 km in length. The distance of pixels to the lineaments ranges between 0 and 3.394 km ( Figure 4B). faults, while the rest are unclassified ( Figure 3). The total length of the faults is ~297.1 km. Interestingly, 10 of them are >2 km in length. The main direction of the thrust faults is NW-SE, while the main directions of the rest of the faults are NE-SW ( Figure 3). Distance to the faults ranges between 0 and 34.02 km ( Figure  4A). The faults shapefile was used to prepare the raster of faults factor. The Euclidean distance to the closest faults was calculated for each cell.
According to Javhar et al. [62], the best band for lineaments extraction from Landsat-8/Operational Land Imager (OLI) is the spectral band 5 (near infrared: 0.85-0.88 μm). Therefore, we used this spectral band acquired on 24 September 2018 [64] to extract the lineaments. The scene was cropped to cover the study area. The lineaments were mapped using lineament extraction tool available in the PCI Geomatica software [65]. RADI-radius of filter in pixels; GTHR-threshold for edge gradien; LTHR-threshold for curve length; FTHRthreshold for fitting line error; ATHR-threshold for angular difference; and DTHR-threshold for linking distance.
The lineaments shapefile was used to prepare the raster of lineament factors. The Euclidean distance to the closest lineament was calculated for each cell ( Figure 4A). The study area includes 2045 lineaments, most of them trending in NE-SW and NW-SE direction. The total length of faults is ~2114.9 km. More than 63.2% of the lineaments are <1 km in length. The distance of pixels to the lineaments ranges between 0 and 3.394 km ( Figure 4B).

Environmental Factors
Soil types were obtained from Harmonized World Soil Database (HWSD) [66]. This database consists of a 1 km (or 30 arc-second) raster image. Clay soils were considered more suitable because of their low permeability and greater water-holding capacity [67][68][69]. Four groups of soil are exposed in the study area (black dots; Figure 5), which are classified as leptosols, luvisols, vertisols, and calcisols (Figures 5 and 6A; Table 4). Leptosols soil group includes three sub groups (A, B, and C in Table 4). Figure 5 shows the texture of the soil types in the study area. The most suitable soil type for dam location is vertisols, composed of 43% clay.

Environmental Factors
Soil types were obtained from Harmonized World Soil Database (HWSD) [66]. This database consists of a 1 km (or 30 arc-second) raster image. Clay soils were considered more suitable because of their low permeability and greater water-holding capacity [67][68][69]. Four groups of soil are exposed in the study area (black dots; Figure 5), which are classified as leptosols, luvisols, vertisols, and calcisols (Figures 5 and 6A; Table 4). Leptosols soil group includes three sub groups (A, B, and C in Table 4). Figure 5 shows the texture of the soil types in the study area. The most suitable soil type for dam location is vertisols, composed of 43% clay. Figure 5. Soil textural classification [70]. Small dots indicate the percentage of silt, clay, and sand.   We performed a supervised classification using the Support Vector Machine (SVM) algorithm proposed by Vapnik [71] to discriminate the seven major land cover classes. The selected land cover classes are orchard or tree farm class, mountain brush mixture of oak brush class, cultivated land or bare land class, water body class, road class, built-up land class, and bare land class ( Figure 6B). SVM is a supervised nonparametric method developed from statistical machine learning. It is used to solve complicated class distributions in multi and hyperspectral data [72]. Input data layers contain the seven multispectral reflectance bands of Landsat-8/OLI acquired on 24 September 2018, with a spatial resolution of 30 m. A radial basis function was selected as kernel type, and the penalty parameter was 100. The gamma in kernel function was the inverse of the band numbers used in the data input [72,73]. Such procedures are in agreement with previous research conducted by Othman and Gloaguen [74] and Yang [73]. In addition to fieldwork, we used the QuickBird images to select both training and validating datasets for the seven major land cover classes. A total of 615 pixels was selected randomly for training. Another dataset consisting of 310 pixels was used for validation to calculate the OA and Kappa coefficient (K). The classification accuracy was estimated by defining the OA [75] and the K [76], which is a measure of agreement between the classified map and the true reference data. The realized OA and K are 93.08% and 0.8850, respectively. ISPRS Int. J. Geo-Inf. 2020, 9, x FOR PEER REVIEW distributions in multi and hyperspectral data [72]. Input data layers contain the seven multispectral reflectance bands of Landsat-8/OLI acquired on 24 September 2018, with a spatial resolution of 30 m. A radial basis function was selected as kernel type, and the penalty parameter was 100. The gamma in kernel function was the inverse of the band numbers used in the data input [72,73]. Such procedures are in agreement with previous research conducted by Othman and Gloaguen [74] and Yang [73]. In addition to fieldwork, we used the QuickBird images to select both training and validating datasets for the seven major land cover classes. A total of 615 pixels was selected randomly for training. Another dataset consisting of 310 pixels was used for validation to calculate the OA and Kappa coefficient (K). The classification accuracy was estimated by defining the OA [75] and the K [76], which is a measure of agreement between the classified map and the true reference data. The realized OA and K are 93.08% and 0.8850, respectively.

Topographical Factors
We used two topographic attributes, elevation and slope gradient. Using the DEM/SRTM, we classified moderate elevation lands as most suitable, and the low and high elevation lands less suitable for locating dam sites [14] because low lands are prone to flooding [77,78]. We mosaiced four scenes of SRTM, with a spatial resolution of 1 arc-second. The data was reprojected to UTM Z38N, resulting in a spatial resolution of 30 m. We resampled the DEM using the neighbor resampling method. The range of elevation in the study area was between 335 and 2418 m above sea level (a.s.l.; Figure 7A). The mean elevation was 988 m a.s.l.

Topographical Factors
We used two topographic attributes, elevation and slope gradient. Using the DEM/SRTM, we classified moderate elevation lands as most suitable, and the low and high elevation lands less suitable for locating dam sites [14] because low lands are prone to flooding [77,78]. We mosaiced four scenes of SRTM, with a spatial resolution of 1 arc-second. The data was reprojected to UTM Z38N, resulting in a spatial resolution of 30 m. We resampled the DEM using the neighbor resampling method. The range of elevation in the study area was between 335 and 2418 m above sea level (a.s.l.; Figure 7A). The mean elevation was 988 m a.s.l.
We used two topographic attributes, elevation and slope gradient. Using the DEM/SRTM, we classified moderate elevation lands as most suitable, and the low and high elevation lands less suitable for locating dam sites [14] because low lands are prone to flooding [77,78]. We mosaiced four scenes of SRTM, with a spatial resolution of 1 arc-second. The data was reprojected to UTM Z38N, resulting in a spatial resolution of 30 m. We resampled the DEM using the neighbor resampling method. The range of elevation in the study area was between 335 and 2418 m above sea level (a.s.l.; Figure 7A). The mean elevation was 988 m a.s.l. In addition, DEM was also used to extract slope attribute. The steepness of slopes is a major factor in dam site selection. Lands with gentle slopes are more suitable for dam sites location, and vice versa [79]. In addition, DEM was also used to extract slope attribute. The steepness of slopes is a major factor in dam site selection. Lands with gentle slopes are more suitable for dam sites location, and vice versa [79]. Hamzeh [80] stated that areas with slope > 15 • are unsuitable for dam sites [19]. Slope gradient in the study area ranged between flat and 79.3 • ( Figure 7B), with an average slope gradient of 13.8 • . We determined slope pixels between 0 • and 2 • to be most suitable and slope pixels > 30 • to be not suitable for dam site location.

Hydrological Factors
We also used the TRMM data to determine precipitation in the study area. The TRMM [81] was a joint space mission between NASA and the Japan Aerospace Exploration Agency (JAXA). It was designed to measure the rainfall of tropical and subtropical regions of the world. The type of data used is TRMM (3B43-V7), which combines monthly precipitation with a spatial resolution of 0.25 • × 0.25 • [82]. We calculated the average of annual precipitation using the monthly TRMM (3B43-V7) data acquired from September 2002-August 2017. Thirty pixels were selected to cover the study area and the surrounding regions to create distribution of the precipitation factor. To obtain a continuous coverage, we converted these pixels to points, then, we interpolated point-wise precipitation data using an inverse distance weighting (IDW) method.
Approximately 50% of the runoff in the entire KhRB enters the streams inside the Iraqi portion of the basin [83]. Therefore, besides precipitation, river discharge is a significant factor that controls the amount of water stored in the dam reservoir. Lack of hydrological information is made worse in the Al-Khabur mountainous region due to either a total lack of river gauges or poor quality of the limited in-situ monitoring data. Therefore, we measured stream width of the stream networks as an alternative to in-situ river discharge data.
Suitability of TRMM data in the study area was evaluated by making a comparison with the observed precipitation dataset of the Zakho meteorological station. The data from 62 recorded precipitation measurements, covering the period from September 2002 to December 2007, were used. We found that there is a strong linear relationship between the monthly TRMM dataset and the observed precipitation, where the coefficient of determination (R 2 ) is 0.853 and the p-value is <0.05 ( Figure 8A). The slope and intercept were 0.9124 and 9.5485, respectively. The TRMM 3B43-V7 is a valuable tool for mapping water resources that shows good agreement with the ground stations' data [84]. Precipitation varies from 562.85 mm·yr −1 in the southern part of the study area to 785 mm·yr −1 in the north ( Figure 8B). precipitation dataset of the Zakho meteorological station. The data from 62 recorded precipitation measurements, covering the period from September 2002 to December 2007, were used. We found that there is a strong linear relationship between the monthly TRMM dataset and the observed precipitation, where the coefficient of determination (R 2 ) is 0.853 and the p-value is <0.05 ( Figure 8A). The slope and intercept were 0.9124 and 9.5485, respectively. The TRMM 3B43-V7 is a valuable tool for mapping water resources that shows good agreement with the ground stations' data [84]. Precipitation varies from 562.85 mm•yr −1 in the southern part of the study area to 785 mm•yr −1 in the north ( Figure 8B). Drainage network was extracted using tectonics from digital elevation models (TecDEM) 2.2, a MATLAB-based software, which permits the extraction of geomorphologic indices from DEM [85][86][87]. Ramakrishnan [23], Grum [24], Tiwari [25], and Singhai [30] used stream order to estimate the storage capacity of the sub-basins (Table A2). We selected streams belonging to order 3 or higher for locating dam Drainage network was extracted using tectonics from digital elevation models (TecDEM) 2.2, a MATLAB-based software, which permits the extraction of geomorphologic indices from DEM [85][86][87]. Ramakrishnan [23], Grum [24], Tiwari [25], and Singhai [30] used stream order to estimate the storage capacity of the sub-basins (Table A2). We selected streams belonging to order 3 or higher for locating dam sites. The long streams were split into multi segments to be less than 10 km. The width of each stream was measured using 23 cloud-free QuickBird scenes, acquired between 24 and 28 July 2005. We used these older scenes because no recent QuickBird data was available. We considered areas that have no streams or those where streams width is <60 cm (the spatial resolution of QuickBird imagery)to be unacceptable for dam construction ( Figure 9A). Three buffer maps (250, 500, and 1000 m) were created for locating dam sites in the study area.
DEM was used to calculate the CN grid. CN is an empirical parameter commonly used in hydrology for predicting direct runoff or infiltration from rainfall excess [88][89][90][91]. It is a dimensionless parameter and ranges from 0 to 100.
We estimated the CN per pixel by matching the rainfall, soil group, and land cover maps (Equations (7) and (8); [30]. The Geospatial Hydrologic Modeling Extension, HEC-GeoHMS tool, has been used [92].
where Q is runoff (mm), P is rainfall (mm), I a is the initial abstraction, or the amount of water before runoff, which has generally been assumed that I a = 0.2S, and S is the potential maximum soil moisture retention after runoff begins. The S is calculated using Equation (2) and denoted as per the CN [93,94]. The CN value of 100 (S = 0) represents low runoff potential that suggests an impermeable catchment having the maximum runoff-generation capability. A CN value of 0 represents increasing runoff potential of S (i.e., S = ∞), which suggests an infinitely abstracting catchment having zero runoff-generation capability [19]. As shown in Figure 9B, the CN at the KhRB ranges from 30 to 100. The central area of KhRB represents higher runoff potential, with lower runoff potential in the northern and southern parts of KhRB.
which has generally been assumed that Ia = 0.2S, and S is the potential maximum soil moisture retention after runoff begins. The S is calculated using Equation (2) and denoted as per the CN [93,94].
The CN value of 100 (S = 0) represents low runoff potential that suggests an impermeable catchment having the maximum runoff-generation capability. A CN value of 0 represents increasing runoff potential of S (i.e., S = ∞), which suggests an infinitely abstracting catchment having zero runoff-generation capability [19]. As shown in Figure 9B, the CN at the KhRB ranges from 30 to 100. The central area of KhRB represents higher runoff potential, with lower runoff potential in the northern and southern parts of KhRB.

Socioeconomics Factors
We used three socioeconomics factors: (1) distance to roads (m), (2) distance to towns and cities (m), and (3) distance to villages (m). The informatic layers of roads, villages, towns, and cities were obtained from United Nations Office for the coordination of Humanitarian Affairs-Iraq (UNOCHA-IRAQ) [95]. Although the distance to roads has a low impact on dam site suitability, existence of roads and settlements near proposed dam sites contribute to reducing transportation cost [14]. Buffers surrounding the villages, towns, cities, and roads were used to calculate the distance to villages, towns, cities, and roads ( Figure 10 and Figure 11). The farthest distances between each pixel in the study area and the roads, towns, and villages are about 11.5, 19.3, and 5.4 km, respectively.

Socioeconomics Factors
We used three socioeconomics factors: (1) distance to roads (m), (2) distance to towns and cities (m), and (3) distance to villages (m). The informatic layers of roads, villages, towns, and cities were obtained from United Nations Office for the coordination of Humanitarian Affairs-Iraq (UNOCHA-IRAQ) [95]. Although the distance to roads has a low impact on dam site suitability, existence of roads and settlements near proposed dam sites contribute to reducing transportation cost [14]. Buffers surrounding the villages, towns, cities, and roads were used to calculate the distance to villages, towns, cities, and roads ( Figures 10 and 11). The farthest distances between each pixel in the study area and the roads, towns, and villages are about 11.5, 19.3, and 5.4 km, respectively.

Results
We tested 14 different combinations of predictive factors in order to select the best individual combination for the WSM and AHP models. The relationships between dam sites and dam sites' predictive factors using WSM and AHP models are shown in Table A3.
Each factor has a specific predictive weight that varies between WSM and AHP. The predictive weight in the two models represents the normalized WSM and AHP ranks, respectively. The ranges of the predictive factor weights can be calculated by applying the prediction model Equations (3) and (4).

Results
We tested 14 different combinations of predictive factors in order to select the best individual combination for the WSM and AHP models. The relationships between dam sites and dam sites' predictive factors using WSM and AHP models are shown in Table A3.
Each factor has a specific predictive weight that varies between WSM and AHP. The predictive weight in the two models represents the normalized WSM and AHP ranks, respectively. The ranges of the predictive factor weights can be calculated by applying the prediction model Equations (3) and (4).

Results
We tested 14 different combinations of predictive factors in order to select the best individual combination for the WSM and AHP models. The relationships between dam sites and dam sites' predictive factors using WSM and AHP models are shown in Table A3.
Each factor has a specific predictive weight that varies between WSM and AHP. The predictive weight in the two models represents the normalized WSM and AHP ranks, respectively. The ranges of the predictive factor weights can be calculated by applying the prediction model Equations (3) and (4).

Identification of Suitable Sites for Dams by the WSM and AHP Models
Dam site selection maps have been prepared using two different models ( Figure 12). The predictive factors were evaluated qualitatively by adding and removing the predictive factors (experimental method) to select the significant factors and to enhance the prediction accuracy of the dam site selection maps. In other words, we removed the factors that decrease the accuracy of dam site selection and retained those that increase the accuracy of dam site selection.
Qualitative evaluation shows that, for the AHP method, the most significant subfactors are: Aqra-Bekhme lithology, soil type of Chromic Vertisols, and Calcic Xerosols, land cover, type of water, distance to faults > 1000 m, distance to faults > 3000 m, areas having elevation between 700 and 800 m a.s.l., flat areas, rivers that have width > 10m, areas that have CN > 87, distance to roads < 1000 m, distance to cities and towns > 2500 m, and distance to villages > 1000 m (Table A3).
Statistical models based on the AHP and the WSM method revealed the most suitable areas for dam site location. Suitability of dam sites using the WSM and AHP models are shown in Figure 12. These maps display the suitable and the most suitable areas in magenta and blue color respectively, and are located mainly in the center of the study area. The range of the data distribution of the 1000 m buffer for the WSM and AHP is between 0 and 1. The mean of the AHP suitability map is greater than the mean of the WSM suitability map, and the standard deviation of the AHP model is less than the standard deviation of the WSM model in all selected buffer zones (1000, 500, and 250). The mean of the WSM and AHP data are 0.47 and 0.56 respectively, while the standard deviation is 0.2 and 0.14, respectively. The most suitable area of the WSM and the AHP models are 4,918,500 m 2 and 10,188,900 m 2 , respectively. The difference between the AHP and WSM map shows that the suitability value is higher in almost all pixels that are present in the AHP model, with only a few pixels showing the opposite (Figure 13). ISPRS Int. J. Geo-Inf. 2020, 9, x FOR PEER REVIEW to faults > 1000 m, distance to faults > 3000 m, areas having elevation between 700 and 800 m a.s.l., flat areas, rivers that have width > 10m, areas that have CN > 87, distance to roads < 1000 m, distance to cities and towns > 2500 m, and distance to villages > 1000 m (Table A3).
Statistical models based on the AHP and the WSM method revealed the most suitable areas for dam site location. Suitability of dam sites using the WSM and AHP models are shown in Figure 12. These maps display the suitable and the most suitable areas in magenta and blue color respectively, and are located mainly in the center of the study area. The range of the data distribution of the 1000 m buffer for the WSM and AHP is between 0 and 1. The mean of the AHP suitability map is greater than the mean of the WSM suitability map, and the standard deviation of the AHP model is less than the standard deviation of the WSM model in all selected buffer zones (1000, 500, and 250). The mean of the WSM and AHP data are 0.47 and 0.56 respectively, while the standard deviation is 0.2 and 0.14, respectively. The most suitable area of the WSM and the AHP models are 4,918,500 m 2 and 10,188,900 m 2 , respectively. The difference between the AHP and WSM map shows that the suitability value is higher in almost all pixels that are present in the AHP model, with only a few pixels showing the opposite ( Figure 13).    Figure 14 shows the accuracy of suitable pixels by number, the accuracy of suitable pixels by weight, and the OA for three buffer zones used (i.e., 250, 500, and 1000 m), which are described in Table A4 for the two models. The OA is evaluated using the accuracy of suitable pixels by number, and the accuracy of suitable pixels by weight.

Classification Accuracy of the Two Adopted Models with Previosly Recommended Sites
The average of OA for the AHP model is higher than the OA for WSM, 58.27 and 52.78, respectively. The AHP model shows that the best-planned dams are located at sites labeled number 7, 12, and 21, respectively. However, the WSM model shows that the best-planned dams are at number 21, 16, and 8. Both WSM and AHP models indicate that the planned dams numbered 1, 4, 6, 9, 10, 15, and 19 are either not suitable or less suitable. In addition, the most unsuitable planned dams are at 15, 9, and 19 ( Figure 14, Table A1, and Figure A1).  Figure 14 shows the accuracy of suitable pixels by number, the accuracy of suitable pixels by weight, and the OA for three buffer zones used (i.e., 250, 500, and 1000 m), which are described in Table A4 for the two models. The OA is evaluated using the accuracy of suitable pixels by number, and the accuracy of suitable pixels by weight.

Classification Accuracy of the Two Adopted Models with Previosly Recommended Sites
The average of OA for the AHP model is higher than the OA for WSM, 58.27 and 52.78, respectively. The AHP model shows that the best-planned dams are located at sites labeled number 7, 12, and 21, respectively. However, the WSM model shows that the best-planned dams are at number 21, 16, and 8. Both WSM and AHP models indicate that the planned dams numbered 1, 4, 6, 9, 10, 15, and 19 are either not suitable or less suitable. In addition, the most unsuitable planned dams are at 15, 9, and 19 ( Figure 14, Table A1, and Figure A1).
We applied the threshold operation to the AHP raster, which ranged between 0 and 1. The suitable selected threshold value for the AHP raster was 0.8. The selection of the threshold values for the AHP raster was determined experimentally. The final thresholded AHP raster included 11 groups of pixels, which were used to generate areas representing suitable dam sites. We applied the threshold operation to the AHP raster, which ranged between 0 and 1. The suitable selected threshold value for the AHP raster was 0.8. The selection of the threshold values for the AHP raster was determined experimentally. The final thresholded AHP raster included 11 groups of pixels, which were used to generate areas representing suitable dam sites.

Discussion
Selecting suitable sites using GIS techniques is a complex task due to the influence of many factors that affect the location of dams. Careful consideration of predictive factors is required to adequately assess

Discussion
Selecting suitable sites using GIS techniques is a complex task due to the influence of many factors that affect the location of dams. Careful consideration of predictive factors is required to adequately assess the weightings of these factors according to specific site conditions. Published literature indicates that dam site selection involves consideration of important variables such as geology, hydrology, slope, runoff, drainage order, environmental, and socioeconomics aspects as effective factors [23]. Almost all of these factors were applied in other areas having similar characteristics of climate, environment, morphology, and geology as the Greater Zab River [14] and Duhok governorate [8] in northern Iraq, and Sistan and Baluchestan provinces of Iran [96]. We used another factor, stream width, as an alternative to discharge measurement beside the above-mentioned factors to improve the value of the methodology used. The discharge can be measured by multiplying the velocity by the width and the average depth of the stream [97]. Due to the lack of gauging stations to measure the discharge, stream width has been used as a factor to select the dam site. To the best of our knowledge, this study is the first one to use stream width as a proxy to stream discharge in a GIS-based application for dam site selection. We believe that measuring stream width using high-resolution imaging is a reliable factor to estimate river discharges. In our opinion, it is much better than other suggested factors, such as stream density [31,92] that does not estimate the discharge, and lumps together all drainages in the area regardless of whether they are dry, seasonal, or perennial.
Although the CN grid is not indicative of runoff, several studies on dam site selection have used it as a predictive factor [9,23,[25][26][27][28]30,31]. Al-Ruzouq [31] and Mugo [9] used the drainage density as an expression of runoff and considered the CN grid as a predictive factor. Our study emphasizes combining more than one factor as an expression of runoff.
Noori [14] found an inverse relationship between site suitability and distance to villages, towns, and cities. Sayl [29] determined the distance of selected dam site to villages, towns, and cities to be 250 m. In this study, we used the inverse relation between site suitability and distance to villages, towns, and cities. At the same time, we deemed areas located <250 m from villages, towns, and cities as not suitable for dam sites. The closet village and city to the 11 dam sites, are located 485 and 3600 m away, respectively.
The AHP model is a robust tool for solving decision problems and system analysis as it simplifies complex decisions to a series of pairwise comparisons [98], while the WSM model is a simple multicriteria decision-making approach [99]. The boxplot (Figure 15) of the mean of all buffer zones for the 21 dam sites for each of the WSM and AHP models shows that the AHP method is better than WSM. The suitability of the AHP method is affirmed from the distribution of its weighted factors, which is far above 50% of the OA (Figure 15; right), as compared to that of WSM that falls close to 50% (Figure 15; left). Our study confirms the results of Tscheikner-Gratl et al.
[100] that the AHP better than WSM. However, it disagrees with Mulliner et al. [34], who stated that WSM was nearly similar to AHP, and Adamczak et al. [101], who concluded that WSM has greater efficiency than AHP.
Different lithologic units influence quality of the reservoir water, dam foundation, reservoir characteristics, and stability of the dam in the event of rapid discharge [99]. Accordingly, the geotechnical constraints of the rock units in the Al-Khabur basin make lithology a major predictive factor (Table A3).
Drought events occur regularly in the Al-Khabur area, due to: lack of rain in the summer, high runoff caused by varied topography, and significant evaporation due to high temperatures. These environmental condition calls for careful planning and proper management of the available water resources. Unlike the Mosul Dam where the unfavorable evaporite beds contribute to active karstification threatening the stability of the dam [17], we have examined areas with favorable geology for dam construction. We excluded sites downstream of Mosul Dam because of similar geology (presence of gypsum and anhydrite beds), and focused on areas upstream of the Mosul Dam, where different rock types are exposed.
Based on this study, 11 sites were determined to be suitable for dam location ( Figure A2, Figure 16, and Table 5). Three of these correspond to three of the 21 dams that have been suggested by MAWR. Overall accuracies of the 11 sites range between 76.2% and 91.8%. The most suitable site coincides with dam number 8 (#8), located in the southeastern part of the study area, which has the largest reservoir, covering an area of 14.86 km 2 , and capable of holding 1.182 km 3 of water (Table 6). In addition, mean river depth at dam site #8 is about 80 m, which means lower evaporation, as greater reservoir depth results in slower evaporation rate. No village will be inundated as a result of the construction of this dam. The dam site is located on the Mukdadiyah Formation comprising pebbly sandstone, siltstone, and claystone that generally have favorable engineering properties. The only drawback of this dam site is its length of 1367 m, which would add to construction cost.
Dam site #9 has the highest mean depth of 87 m, but its reservoir is smaller, while dam site #6 has the lowest mean depth (12.5 m) and also the smallest reservoir capacity. Similarly, dam sites #2, #3, and #10 have smaller reservoir capacities (Table 6). Based on these considerations, we have excluded sites #2, #3, #8, #9, and #10.
This study identifies suitable locations for dams that should be selected for detailed site investigations prior to construction. Allocating resources on sites that have been found to be more suitable would entail significant cost savings, as opposed to sites having severe limitations.
These multipurpose dams will provide water for drinking and irrigation, electric power generation, and flood control. Benefits would include economic development of the country, including higher crop yield and increased power generation capacity for Iraq, that currently suffers from a critical shortage of electric power. Additionally, it would prevent flooding events that are common in parts of the study area.
The criteria used for dam site selection by the MAWR were based on superficial field surveys and cursory GIS analysis that lacked scientific information. Critical data, such as stream and river discharge, basin size and storage capacity, geology of dam foundation and reservoir area, and related local and regional geotechnical characteristics, were not taken into account. This study was designed to include all key factors (Table 1) to identify suitable dam and safe sites.
site is located on the Mukdadiyah Formation comprising pebbly sandstone, siltstone, and claystone that generally have favorable engineering properties. The only drawback of this dam site is its length of 1367 m, which would add to construction cost.
Dam site #9 has the highest mean depth of 87 m, but its reservoir is smaller, while dam site #6 has the lowest mean depth (12.5 m) and also the smallest reservoir capacity. Similarly, dam sites #2, #3, and #10 have smaller reservoir capacities (Table 6). Based on these considerations, we have excluded sites #2, #3, #8, #9, and #10.
This study identifies suitable locations for dams that should be selected for detailed site investigations prior to construction. Allocating resources on sites that have been found to be more suitable would entail significant cost savings, as opposed to sites having severe limitations.
These multipurpose dams will provide water for drinking and irrigation, electric power generation, and flood control. Benefits would include economic development of the country, including higher crop yield and increased power generation capacity for Iraq, that currently suffers from a critical shortage of electric power. Additionally, it would prevent flooding events that are common in parts of the study area.
The criteria used for dam site selection by the MAWR were based on superficial field surveys and cursory GIS analysis that lacked scientific information. Critical data, such as stream and river discharge, basin size and storage capacity, geology of dam foundation and reservoir area, and related local and regional geotechnical characteristics, were not taken into account. This study was designed to include all key factors (Table 1) to identify suitable dam and safe sites.      In Table 6, N v is the number of villages that will be inundated.

Conclusions
This study serves as a good example of integration of remote sensing images, GIS, and geotechnics in water resources development. We used both AHP and WSM methods for selecting the most suitable locations for dams. Fourteen major factors in dam site suitability analyses were generated from various remote sensing and ancillary data. For the first time, this study used the stream width, measured from high-resolution images instead of the river discharge measurements, as a predictive factor for dam site selection. Twenty-one dam sites, proposed by MAWR, were used as reference sites. We also evaluated the accuracy of the AHP and WSM techniques. For all 21 dams, the overall accuracy of the AHP method was found to be greater than the WSM. Eleven dam sites were found suitable for potential runoff harvesting. Three of these 11 sites correspond to the dam sites that have been suggested by the MAWR. For both models, the accuracy of the 11 sites ranged between 76.2% and 91.8%. The difference between AHP and WSM map shows that the suitability is higher in almost all the pixels that are present in the AHP model, while a few pixels show the opposite. This study offers a useful and inexpensive tool to decision-makers for preliminary screening of potential dam sites, eliminating sites with severe limitations, and directing geotechnical exploration activities at sites with minimum limitations. This method can be applied for the rest of the hydrological basins in the Kurdistan region.
Based on these analyses, 11 dam sites were determined to be more suitable. However, 10 sites, numbers 1, 4, 5, 6 and 15, 16, 17, 18, and 19, that are located in the Al-Khabur River basin, should be avoided because of heavy solutioning activities and faster reservoir siltation rates. Site #8 appeared most suitable for dam location because of its large reservoir capacity, low evaporation rate, and no villages within its reservoir.
It must be emphasized that this study should not be used for selecting the final site without conducting detailed on-site geotechnical investigations at the suggested locations. Nonetheless, by eliminating sites with serious geological and other constraints, the study has identified sites that appear more suitable where additional exploration should be carried out for design and construction of the dam. Table A1. Proposed sites of dams in the study area [16].

No.
Site     Note: Np is the number of pixels, and the Sp is the WSM suitable pixels.