Improving Accuracy of Impervious Surface Extraction Based on a Threshold Hierarchical Method (THM)

: Impervious surface area (ISA) is an important representation of urban area. It is very popular to extract ISA by using linear spectral mixture analysis (LSMA). However, there are still some defects in this method: underestimated in areas with a large amount of ISA. Hence, we designed a threshold hierarchical method (THM) to test this underestimation and understand which scale is the best to mixture. The capacity of the THM and the optimal threshold in the impervious surface extraction are the focus in this work. In THM model, the medium-resolution image (Landsat 8 OLI) and the high-resolution image (Gaofen-2, GF-2) were used, the LSMA and the object-oriented method (OOM) were applied for the area with a larger amount of impervious surfaces, which was extracted from the Landsat 8 OLI image after ﬁnishing the LSMA procedure by a threshold of the ISA abundance data, the GF-2 image was employed to extract the ISA by OOM. The results show that the THM had the capacity to achieve higher ISA extraction accuracy and ameliorate the ISA underestimate problem.

The vegetation-impervious surface-soil (V-I-S) model, a conceptual model proposed by Ridd [25] for simplifying heterogeneous urban areas to a combination of three basic ground components, provided a theoretical basis for quantitatively understanding urban environments, such as the application of the spectral mixture analysis (SMA). The SMA, especially the liner spectral mixing analysis (LSMA), has been used to identify ISA and has proved to be an effective method to describe urban environment [1,7,12,14]. It has been treated as one of the primary subpixel analysis approaches to estimate the percentage of ISA by analyzing three/four pure endmembers [26,27].
The SMA based methods have a common problem [6,7,27,28]: impervious surface is often overestimated in areas with a small amount of impervious surface (e.g., in the less-developed areas), but is underestimated in areas with a large amount of impervious surface (e.g., in the well-developed areas). Previous studies indicate that bright colored building roofs are confused for dry soil, and dark-colored roads and buildings for water/wetland and shadows [14,29]. Although many methods have been presented to solve the problem [27,30], a reasonable solution is that the fully constrained linear mixture model requires that endmember fractions be positive and sum up to 1 [7].
The object-oriented method (OOM), rather than one run at a sub-pixel level, helps extract impervious surface at a pixel level. OOM is always undertaken considering high-resolution images where the resolution is under 10 m, and it exploits comprehensive information on the texture, context, and object-based features [31,32]. Some studies found that SPOT data are better than that of Landsat TM at the sub-city level [33].
With an improvement in resolution, the proportion of mixed pixels was reduced significantly. For a particular zone, the larger the area of impervious surface in medium-resolution imagery (such as Landsat), the more likely it has purely impervious surface in the high-resolution imagery, which is better than that of Landsat 8 OLI. The OOM could be employed for extracting impervious surface in areas with high amounts of impervious surface. Thus, for a zone with a larger amount of impervious surface, if the OOM is applied with high-resolution, it could help mitigate the underestimation of impervious surfaces in areas with a large number of impervious surfaces brought out by the linear spectral mixture model, helping achieve higher extraction accuracy.
Here, a threshold hierarchical method (THM) was designed to address the underestimation of the number of impervious surfaces in areas with a large amount of such surface brought out by an LSMA. Thresholding is the simplest and most fundamental method for image segmentation. The range of the area with a larger amount of impervious surface is extracted considering a threshold value of the number of impervious surfaces, which is calculated through LSMA. For this area where the abundance of impervious surface is larger than the threshold, high-resolution images are selected instead of medium resolution imagery to extract the impervious surface by using the OOM.
The threshold is the key factor determining the range of the high-resolution imagery and the performance of the THM. In order to understand the optimal capability of this method, the abundance of impervious surface, ranging from 0 to 1, was divided into ten equal parts: ten thresholds ranging from 0.1 to 1 in steps of 0.1. Each threshold was adopted and tested in the THM and the optimal threshold was determined.

Study Area
Guangzhou, the capital of Guangdong province, has a history of long-term urban development and has experienced tremendous growth over the past twenty years. It has been the leading commercial port of China, serving as a major business hub in South China. Guangzhou contains 11 administrative districts, with an area of 7434.4 km 2 . According to the Guangzhou Statistical Yearbook (http://112.94.72.17/portal/queryInfo/statisticsYearbook/index), Guangzhou had a year-end registered resident population of 9.50 million in 2019 and 86.46% of this resident population resided in urban areas. The annual average gross domestic product (GDP) growth is greater than 8% from 2000 to 2019, indicating strong economic growth. The core zone of Guangzhou city was selected as the study area, which includes urban-rural fringe zones, areas with abundant bare soil, central business district (CBD) areas, and riverside ( Figure 1). The selected zone is approximately 145 km 2 in area and has diverse land use types, including man-made materials (e.g., buildings, roads, and parking lots), forest, grassland, and sparse bare soils. The reason for selecting this study area is that the abundance of impervious surface varies from low to high, which allows us to reveal the robustness of the extraction methods and make comparisons. Appl. Sci. 2020, 10, x FOR PEER REVIEW 3 of 13

Study Data
Two images of Guangzhou, a Landsat 8 OLI image and one from the China-made satellite, Gaofen-2 (GF-2), were selected ( Figure 2). Table 1 lists their details and locations.

Study Data
Two images of Guangzhou, a Landsat 8 OLI image and one from the China-made satellite, Gaofen-2 (GF-2), were selected ( Figure 2). Table 1 lists their details and locations.

Study Data
Two images of Guangzhou, a Landsat 8 OLI image and one from the China-made satellite, Gaofen-2 (GF-2), were selected ( Figure 2). Table 1 lists their details and locations.

Test Samples
A total of 209 test polygons (90 m × 90 m), whose edges are consistent with the 3 × 3 Landsat images, were randomly selected from Google Earth ( Figure 3). An IKONOS image acquired 31 October 2014, was used to extract the true objects for comparison with the extracted objects.

Test Samples
A total of 209 test polygons (90 m × 90 m), whose edges are consistent with the 3 × 3 Landsat images, were randomly selected from Google Earth ( Figure 3). An IKONOS image acquired October 31, 2014, was used to extract the true objects for comparison with the extracted objects.

Data Processing
Before the data processing, the remote sensing image received a pretreatment, such as radiometric calibration, and FLAASH atmospheric correction was performed, and the highresolution imageries were geometrically rectified based on the Landsat 8 OLI.

THM
Thresholding is the simplest and fundamental method for image segmentation. A threshold helps divide an image into two parts: an image having a value larger than the threshold, and one with a value smaller than the threshold. A thresholding THM was designed in this study.
After the LSMA procedure was completed considering Landsat 8 OLI, the abundance of the impervious surfaces was determined. A threshold value was chosen and if the abundance was larger than the threshold, a high-resolution image instead of a medium-resolution image was adopted to extract the distribution of impervious surfaces using the OOM.

LSMA
The LSMA method was adopted for unmixing pixels. Suitable results were obtained by Wu and Murray when frictions of impervious surface were estimated by analyzing low and high albedo endmembers [7]. Here, four endmembers, i.e., high-albedo objects, low-albedo objects, vegetation, and soil, were used for unmixing the multispectral images into these fractional images. Following Wu's model, impervious surfaces could be estimated by the addition of low-albedo and high-albedo fractions.
The mathematical model of LSMM could be expressed as following: where i = 1, …, (number of spectral bands); k = 1, …, n (number of endmembers); is the spectral reflectance of band i which contains one or more endmembers; is the proportion of an endmember

Data Processing
Before the data processing, the remote sensing image received a pretreatment, such as radiometric calibration, and FLAASH atmospheric correction was performed, and the high-resolution imageries were geometrically rectified based on the Landsat 8 OLI.

THM
Thresholding is the simplest and fundamental method for image segmentation. A threshold helps divide an image into two parts: an image having a value larger than the threshold, and one with a value smaller than the threshold. A thresholding THM was designed in this study.
After the LSMA procedure was completed considering Landsat 8 OLI, the abundance of the impervious surfaces was determined. A threshold value was chosen and if the abundance was larger than the threshold, a high-resolution image instead of a medium-resolution image was adopted to extract the distribution of impervious surfaces using the OOM.

LSMA
The LSMA method was adopted for unmixing pixels. Suitable results were obtained by Wu and Murray when frictions of impervious surface were estimated by analyzing low and high albedo endmembers [7]. Here, four endmembers, i.e., high-albedo objects, low-albedo objects, vegetation, and soil, were used for unmixing the multispectral images into these fractional images. Following Wu's model, impervious surfaces could be estimated by the addition of low-albedo and high-albedo fractions. The mathematical model of LSMM could be expressed as following: where i = 1, . . . , (number of spectral bands); k = 1, . . . , n (number of endmembers); R i is the spectral reflectance of band i which contains one or more endmembers; f k is the proportion of an endmember k to the pixel; R ik was the known spectral reflectance of endmember k with the pixel on band i; and ε i was the error for band i. A constrained least-squares solution was used here, assuming that the following two conditions were satisfied simultaneously:

OOM
The GF-2 image was processed using the commercial software eCognition Developer. For comparison, the GF-2 image was classified into four parts, the same as those for the Landsat 8 OLI image: impervious surface, vegetation, soil, and water, as the final classification types.
In general, OOM involve two key steps: producing a segmented image and classifying the segments into meaningful categories. Multiresolution segmentation was selected during image segmentation due to its good performance [34]. Thereafter, class rules were assigned using spectral signatures, shapes, and contextual relationships; these rules were used as a basis for the fuzzy classification of the imagery (Figure 4).
Appl. Sci. 2020, 10, x FOR PEER REVIEW 5 of 13 A constrained least-squares solution was used here, assuming that the following two conditions were satisfied simultaneously:

OOM
The GF-2 image was processed using the commercial software eCognition Developer. For comparison, the GF-2 image was classified into four parts, the same as those for the Landsat 8 OLI image: impervious surface, vegetation, soil, and water, as the final classification types.
In general, OOM involve two key steps: producing a segmented image and classifying the segments into meaningful categories. Multiresolution segmentation was selected during image segmentation due to its good performance [34]. Thereafter, class rules were assigned using spectral signatures, shapes, and contextual relationships; these rules were used as a basis for the fuzzy classification of the imagery (Figure 4).
Here, five classification targets-water, vegetation, roads, soil, and other impervious surfacewere chosen, and four related segmentation scales (180, 100, 50, and 20), were selected. Segregation related factors such as color, shape, compactness, and smoothness were considered, as described in Table 2. Information on bands and texture information, and remote sensing indexes such as the red band, green band, blue band, NIR band, NDVI, NDWI, PCA band 1, brightness, and RBI were considered in the classification.

Accuracy Assessment
Accuracy assessment is a critical step during the classification. For each test polygon, the true impervious surface was digitized using ArcGIS based on Google Earth image and an IKONOS image. Here, five classification targets-water, vegetation, roads, soil, and other impervious surface -were chosen, and four related segmentation scales (180, 100, 50, and 20), were selected. Segregation related factors such as color, shape, compactness, and smoothness were considered, as described in Table 2. Information on bands and texture information, and remote sensing indexes such as the red band, green band, blue band, NIR band, NDVI, NDWI, PCA band 1, brightness, and RBI were considered in the classification.

Accuracy Assessment
Accuracy assessment is a critical step during the classification. For each test polygon, the true impervious surface was digitized using ArcGIS based on Google Earth image and an IKONOS image. The real proportion of impervious surface area was computed by dividing the ISA by the test polygon area (810 m 2 ) in each test polygon.
The root-mean-square error (RMSE) was selected as an indicator of accuracy. It is a frequently used measure of the differences between values predicted by a model or an estimator, and the values actually observed. RMSE is calculated as follows: X and X indicate the calculate result and the test sample respectively, and n is number of test samples.

Outcome of the LSMA
The Landsat 8 OLI image with six reflective bands was unmixed into four fractional images: those representing high-albedo, low-albedo, bare soil, and vegetation. Figure 5 shows these fractional images. The abundance of the impervious surface was determined by adding the bands of the high-albedo and low-albedo fractional images ( Figure 6).
Appl. Sci. 2020, 10, x FOR PEER REVIEW 6 of 13 The real proportion of impervious surface area was computed by dividing the ISA by the test polygon area (810 m 2 ) in each test polygon. The root-mean-square error (RMSE) was selected as an indicator of accuracy. It is a frequently used measure of the differences between values predicted by a model or an estimator, and the values actually observed. RMSE is calculated as follows: X and X' indicate the calculate result and the test sample respectively, and n is number of test samples.

Outcome of the LSMA
The Landsat 8 OLI image with six reflective bands was unmixed into four fractional images: those representing high-albedo, low-albedo, bare soil, and vegetation. Figure 5 shows these fractional images. The abundance of the impervious surface was determined by adding the bands of the highalbedo and low-albedo fractional images ( Figure 6).    By the test of 209 sample sites, the RMSE of the abundance of the impervious surface was determined to be 0.2197.

Range of Different Thresholds
The abundance of impervious surface could be determined by the LSMA method; the values ranged from 0 to 1 and were divided into ten parts in the steps of 0.1, considering ten thresholds. For each threshold, areas with an abundance larger than the threshold were selected (Figure 7). Based on the above, the scope of the area with an abundance larger than the threshold was found to gradually increase as the threshold increased.

Range of Different Thresholds
For each threshold and for areas where the abundance was larger than the threshold, considering the scope described in Figure 7, the GF-2 imagery and the OOM were selected and employed; thereafter, various outcomes for extracting impervious surfaces could be determined through the THM by adopting different thresholds.
Considering an entire image, the distribution of ISA could be described in two parts: (1) in areas where the abundance data of ISA was smaller than the threshold, information on ISA determined through the Landsat 8 OLI and data on the abundance of ISA was obtained by adding the high-albedo and low-albedo fractions, which were unmixed by the LSMA; (2) in areas where the abundance of ISAs was larger than the threshold, the GF-2 image was adopted, and the ISA (including roads and buildings) and non-ISA (including water, vegetation, and bare soil) were classified by the OOM. The abundance of the ISAs and non-ISAs was 1 and 0, respectively.
Ten different thresholds, ranging from 0.1 to 1, in steps of 0.1, were employed through the THM. The ISA distribution information at different thresholds could be described as Figure 8.

Range of Different Thresholds
The abundance of impervious surface could be determined by the LSMA method; the values ranged from 0 to 1 and were divided into ten parts in the steps of 0.1, considering ten thresholds. For each threshold, areas with an abundance larger than the threshold were selected (Figure 7). Based on the above, the scope of the area with an abundance larger than the threshold was found to gradually increase as the threshold increased.

Range of Different Thresholds
For each threshold and for areas where the abundance was larger than the threshold, considering the scope described in Figure 7, the GF-2 imagery and the OOM were selected and employed; Several aspects could be concluded based on Figure 9: (1) The RMSE at different thresholds obtained through the THM showed a "U" trend. The RMSE was 0.2019 at a threshold of 0.1, and 0.2203 at a threshold of 1.0; the RMSE was the smallest (0.1871) at a threshold of 0.5. (2) The THM is suitable for improving the accuracy in extracting impervious surfaces. The RMSE value was smaller than that obtained by the LSMA, which was 0.2107 at a threshold of <0.9. (3) The optimal threshold was 0.5. As compared to the LSMA, the precision of the THM, considering the optical threshold, could be increased by 3.83%. and low-albedo fractions, which were unmixed by the LSMA; (2) in areas where the abundance of ISAs was larger than the threshold, the GF-2 image was adopted, and the ISA (including roads and buildings) and non-ISA (including water, vegetation, and bare soil) were classified by the OOM. The abundance of the ISAs and non-ISAs was 1 and 0, respectively. Ten different thresholds, ranging from 0.1 to 1, in steps of 0.1, were employed through the THM. The ISA distribution information at different thresholds could be described as Figure 8.  Several aspects could be concluded based on Figure 9: (1) The RMSE at different thresholds obtained through the THM showed a "U" trend. The RMSE was 0.2019 at a threshold of 0.1, and 0.2203 at a threshold of 1.0; the RMSE was the smallest (0.1871) at a threshold of 0.5. (2) The THM is suitable for improving the accuracy in extracting impervious surfaces. The RMSE value was smaller than that obtained by the LSMA, which was 0.2107 at a threshold of <0.9. (3) The optimal threshold was 0.5. As compared to the LSMA, the precision of the THM, considering the optical threshold, could be increased by 3.83%. Figure 9. Variations in the root-mean-square error (RMSE) at different thresholds obtained through THM.

Discussion
The THM was designed to address the problem of impervious surfaces being underestimated in Figure 9. Variations in the root-mean-square error (RMSE) at different thresholds obtained through THM.

Discussion
The THM was designed to address the problem of impervious surfaces being underestimated in the LSMA for areas with a large number of impervious surfaces. The study focused on understanding the capabilities of the THM for ISA extraction and the optimal threshold. It has been proved that the THM helps achieve a higher ISA extraction accuracy when an entire image is considered; the optimal threshold was 0.5 when the THM was employed for a GF-2 image.
The accuracy of the THM, considering different thresholds, demonstrated a "U" trend, as the THM was employed for two images with different resolutions. The spectral information on the medium-resolution remote sensing data was abundant; however, it was too coarse for extracting impervious surfaces. Moreover, the spatial information on the high-resolution remote sensing image was rich, but its scale was often on a pixel level. Limitations exist for both medium-resolution and high-resolution images in extracting impervious surfaces. The threshold is a key aspect in determining the range of application of these two images. Considering the optimal threshold, an optimal balance was achieved by applying the LSMA and OOM at pixel and subpixel levels, respectively, helping achieve suitable accuracy levels for extracting impervious surfaces.
The THM helped extract impervious surfaces, considering an entire region. Four zones (urban-rural fringe zones, areas with abundant bare soil, central business district (CBD) areas, and riverside) were considered to test the capabilities of the THM ( Figure 10); its accuracy was determined using the optimal threshold and a value of 0.5 was adopted. The comparison of the RMSE of the four zones is shown as Table 3. A few shortcomings were observed in applying the THM to some zones, especially the CBD zone. The RMSE through the THM and LSMA for the CBD zone was found to be 0.322 and 0.2597, respectively. The THM was unsuitable for the CBD zone primarily due to the existence of the shadows of tall buildings, which have similar spectral properties as water bodies, roads, and soil. The size and intensity of building shadows vary with solar illumination angle and miscellaneous building attributes. In CBD zones, the abundance of ISA through the LSMA was usually >0.5 and the THM was employed; however, the shadows in the high-resolution image, such as those from the GF-2, were always mistaken for water, and the accuracy decreased if these shadows were not removed before the OOM.  The comparison of the RMSE of the four zones is shown as Table 3. A few shortcomings were observed in applying the THM to some zones, especially the CBD zone. The RMSE through the THM and LSMA for the CBD zone was found to be 0.322 and 0.2597, respectively. The THM was unsuitable for the CBD zone primarily due to the existence of the shadows of tall buildings, which have similar spectral properties as water bodies, roads, and soil. The size and intensity of building shadows vary with solar illumination angle and miscellaneous building attributes. In CBD zones, the abundance of ISA through the LSMA was usually >0.5 and the THM was employed; however, the shadows in the high-resolution image, such as those from the GF-2, were always mistaken for water, and the accuracy decreased if these shadows were not removed before the OOM. Furthermore, in order to compare the capabilities of the THM for different high-resolution imagery and understand the variations in the optimal threshold, two other high-resolution images, viz., a 16 m resolution GF-1 image and a 1 m resolution panchromatic GF-2 image, were adopted and analyzed through the THM. The RMSE of these high-resolution images at different thresholds are as Table 4. The RMSE at different thresholds demonstrated a "U" trend in both the 1 m resolution panchromatic GF-2 image and the 16 m resolution GF-1 image, same as that for the GF-2 image. The optimal threshold of the two high-resolution images was 0.3 and 0.6, respectively. Considering the three high-resolution images, i.e., panchromatic GF-2, GF-2, and GF-1, it was obvious that the optimal threshold through the THM increased as the resolution improved.
According to Kuang [35], the global ISA of 45.26 × 10 4 km 2 was mainly distributed in central and southern North America, eastern Asia, and Europe, as well as coastal regions around the world. Since the cities in Europe and North America exposed a dramatic mosaic of impervious surface and green space composites in urban construction, the distribution of impervious surfaces in these local districts had its own characteristics, which may differ from that of China. No matter what kind of urban construction, the abundance of impervious surface shows a change from low to high in the gradient from suburb to urban area. During the different abundances of impervious surface, to find a suitable threshold will be a key issue to apply THM. The threshold was also found to be related to the selection of high-resolution, and the threshold increased as the resolution of the imagery improved. Copernicus land services data (https://land.copernicus.eu/), which are collected by different sources with different resolutions, including Earth observation satellites and in situ sensors, can be potentially used as relevant to impervious surface extraction and assessment processes from regional to national scales. More accurate ISA data are valuable for better management of local landscapes and for use as an input in other studies such as socioeconomic and environmental modeling.

Conclusions
This study considered Guangzhou city as an example. A hierarchical thresholding extraction method was designed to address the underestimation of impervious surfaces obtained through the LSMA. It was proved that the THM can achieve a higher ISA extraction accuracy than the LSMA.
The threshold is the key to determining the capacity of the THM. On selecting a GF-2 image, the optical threshold value was 0.5, whereas the highest level of precision through the THM was 0.1871; the accuracy of THM could be increased by 3.83% compared to the LSMA. The threshold was also found to be related to the selection of high-resolution, and the threshold increased as the resolution of the imagery improved.
It was proved that the THM has a suitable accuracy when an entire image was considered; however, the method was found to be unsuitable for areas such as the CBD zone because of the existence of the shadows of tall buildings; moreover, the OOM also has limitations in dealing with this problem. On the one hand, detecting and removing shadows by using object-based detection methods [36,37], building shadow indices [38] or SVM classifiers [39] is an effective step to improve the extraction accuracy of THM. On the other hand, choosing another threshold variable is also a feasible approach to deal with this problem.