Comparing Urban Impervious Surface Identification Using Landsat and High Resolution Aerial Photography

This paper evaluates accuracies of selected image classification strategies, as applied to Landsat imagery to assess urban impervious surfaces by comparing them to reference data manually delineated from high-resolution aerial photos. Our goal is to identify the most effective methods for delineating urban impervious surfaces using Landsat imagery, thereby guiding applications for selecting cost-effective delineation techniques. A high-resolution aerial photo was used to delineate impervious surfaces for selected census tracts for the City of Roanoke, Virginia. National Land Cover Database Impervious Surface data provided an overall accuracy benchmark at the city scale which was used to assess the Landsat classifications. Three different classification methods using three different band combinations provided overall accuracies in excess of 70% for the entire city. However, there were substantial variations in accuracy when the results were subdivided by census tract. No single classification method was found most effective across all census tracts; the best method for a specific tract depended on method, band combination, and physical characteristics of the area. These results highlight impacts of inherent local variability upon attempts to characterize physical structures of urban regions using a single metric, and the value of analysis at finer spatial scales.


Introduction
Urbanization is arguably one of the most significant landuse/landcover changes occurring in the world today [1,2]. Sizes and distributions of urban areas are dynamic because of rising populations. In 2009, for the first time in history, the percentage of the human population living in urban areas exceeded 50%; the United Nations estimates this percentage will increase to 69% by 2050 [3]. Although urban expansion is often attributed to increasing population, the rate of conversion to urban land uses far exceeds the rate of urban population increase, a phenomenon common across the world [3].
Conversion to urban land (losses of agricultural and forested lands, coupled with increasing impervious surface cover) has direct effects on natural temperature regulation [4,5] and alters the hydrologic cycle [5-7]. Decreasing vegetative cover and increasing impervious surface cover alter thermal processes in urban regions, thus creating warmer temperature regimes relative to rural areas (the urban heat island effect) [5,8,9]. Alteration of the hydrologic cycle represents the most significant urban water quality issue present today [5,6,10] because stormwater runoff from impervious surfaces creates water quality problems, including higher water temperatures and elevated levels of contaminants in surface waters [5,11]. A better understanding of such impacts is required to address these adverse issues effectively [5,10,12].
Mapping impervious surfaces is essential for evaluating these impacts and implementing effective environmental and urban management planning [5,10,12-17]. Since the 1970s, remote sensing has been widely used for analyses of urban areas [5,13,18,19]. The earliest research specifically devoted to impervious surface identification began in the 1990s, but was limited in scope until high-resolution imagery and increased computing power became available in the mid-2000s [13].
High-resolution imagery provides the spatial detail necessary to record the fine-scale heterogeneity present within an urban area [12,13,15,18,20-22]. Researchers readily acknowledge that this fine-scale heterogeneity is a problem for remote sensing, especially because it creates mixed pixels (pixels influenced by the variation in spectral values present within their dimensions). Ultimately, the solution to this issue lies in identifying analyses capable of resolving this fine detail. However, acquisition and analysis expenses, limited spatial extent, and the high cost of sequential coverage limit routine applications of high-resolution imagery.
Because of this heterogeneity, remote sensing techniques, image sources, and effectiveness vary widely from study to study [5,13,18], creating a lack of uniformity that limits our ability to make comparisons between studies [5,18]. Furthermore, the success of each approach is often evaluated in different ways (for example, accuracy assessments, root mean square error, or absolute error), which makes it difficult for other researchers to interpret the findings and identify the most effective method(s).
Despite the relatively coarse spatial resolution of Landsat imagery in the context of urban analysis (30 m by 30 m pixel size), it offers advantages of affordability, accessibility, multispectral coverage, sequential acquisition, and the spatial scope to represent complete urban systems. Much literature is specifically devoted to applications of Landsat imagery to derive impervious surfaces; however, assessing and comparing even this body of literature is very difficult. A selection of this literature is presented in Table 1 ([10,19,), which does not attempt a comprehensive review, but rather illustrates the variety of techniques applied in many different areas. Only one of these studies evaluates the effectiveness of a multi-scale approach (from a finer scale to increasingly larger scales; see [23]). Very few have compared applications of alternative techniques within the same milieu (see [49-51]).
Table 1. Selected studies, using Landsat imagery, that illustrate the variety of classification techniques to identify impervious surfaces.

Study Site and Data
The study site is the City of Roanoke, Virginia, USA, a small city (110 square kilometers) located in a valley in southwest Virginia (Figure 1). Roanoke is the largest city in Virginia outside of the major metropolitan areas in the eastern part of the Commonwealth. Although small in area, Roanoke has a population density comparable to much larger cities in the USA. Over recent decades, Roanoke has been the focus of substantial urbanization, economic stress, and landuse changes. Roanoke has substantial drainage problems and experiences frequent flooding due to its proximity to the Roanoke River, the river's tributaries, and urban stormwater runoff from impervious surfaces. Many segments of the Roanoke River system within the city are on the Virginia Department of Environmental Quality's impaired waters list, due to contaminants such as Escherichia coli, high water temperatures, and heavy metals [52]. In addition, its CO2 emissions were estimated at 2.3 million tons in 2009 [53], and city officials estimate impervious surface cover at 28% [54]. For our analysis, Roanoke offers the advantage of a compact urban region with a range of land uses and urban environments that permit evaluation of alternative analytical strategies.

Delineating Impervious Surfaces from Aerial Photos
GIS shapefiles (roads, buildings, and parcels) for our study site were downloaded from Roanoke's geospatial data gateway and the US Census TIGER/Line data gateway. We accessed aerial photos through the Virginia GIS server with ESRI's ArcCatalog® GIS server connection. The Virginia Base Mapping Project 2011 flyover for the city occurred between 4 March 2011 and 10 March 2011 [55] and was completed at 6 inch resolution [56].
We overlaid the city's boundary, parcels, and building shapefiles on the 2011 aerial photos and, using them as guides, delineated impervious surfaces in GIS. Many urban researchers have conducted their analyses using landuse or landcover boundary delineations (see [9,57,58]); this strategy seems to be fruitful for studies that focus upon analysis of the urban heat island, but is less well-matched for studies that must consider other dimensions of the urban landscape. For our analysis, impervious surfaces were delineated separately for each census tract because: (a) each census tract is spatially compact and contains different landuse/landcovers with varying percent impervious, (b) subdividing the study area provides the ability to evaluate intra-urban variations at a fine scale, and (c) providing impervious surface data individually by census tract allows for consideration of impervious surfaces' relationships with demographic data. Census tracts are spatial subdivisions delineated by the United States government for which statistical data are collected during the decennial census [59]. After editing the roads and buildings shapefiles, we joined them using GIS, creating a single polygon for each census tract, then calculated percent impervious for each.
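The percent impervious calculation described above reduces to a ratio of polygon areas. The following is a minimal sketch of that arithmetic, not the GIS workflow actually used: it assumes planar coordinates in a projected system, non-overlapping impervious polygons, and uses hypothetical function names and toy geometry.

```python
def polygon_area(vertices):
    """Planar area of a simple polygon via the shoelace formula.

    `vertices` is a list of (x, y) pairs in a projected coordinate system
    (for example, meters). The ring may be open or closed.
    """
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def percent_impervious(tract_polygon, impervious_polygons):
    """Percent of the tract area covered by non-overlapping impervious polygons."""
    tract_area = polygon_area(tract_polygon)
    impervious_area = sum(polygon_area(p) for p in impervious_polygons)
    return 100.0 * impervious_area / tract_area


# Hypothetical toy tract (100 m x 100 m) with one building and one road.
tract = [(0, 0), (100, 0), (100, 100), (0, 100)]
building = [(10, 10), (30, 10), (30, 40), (10, 40)]   # 20 m x 30 m
road = [(0, 50), (100, 50), (100, 60), (0, 60)]       # 100 m x 10 m
result = percent_impervious(tract, [building, road])
print(f"{result:.1f}% impervious")
```

In practice the dissolve and area steps would be done in GIS software; the sketch only makes the final ratio explicit.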

Landsat Image Processing
A cloud-free Landsat 5 TM scene (path 17, row 34, dated 13 March 2011) was obtained from the US Geological Survey's GloVis server. The US Geological Survey provides the scenes with georeferencing already complete. LEDAPS [58,60] preprocessing removed remaining atmospheric effects during preparation of the surface reflectance image. A histogram adjustment was conducted to ensure that no negative reflectance values remained.
The image was subset to form multiple band combinations: bands 5, 6, and 7; bands 1, 5, and 7; and bands 2 through 6. We chose to include the thermal channel (band 6) because of impervious surfaces' contribution to the urban heat island (UHI) [20-22]. We chose bands 1, 5, and 7 because band 1 is useful for distinguishing soil from vegetation, and the spectral properties of impervious surfaces can mimic those of soil surfaces [61,62]. Supervised classification was performed on the entire scene for these three images and for an image with all seven bands.
Training Areas of Interest (AOIs) were created separately for each band combination using a binary classification: impervious and other. Locations for the impervious surfaces' training areas were identified using the aerial photos, and typically included man-made objects such as airport runways, parking lots, US Interstate highways, and large buildings (shopping malls and centers). Training areas for the other category included all pervious landcovers (agriculture, forests, and water) as well as several golf courses. We did not limit our training areas to Roanoke City but used the entire Landsat scene to avoid including mixed pixels, which were often problems near roads and for the city's urban forest. We created AOIs for the other category using forested areas in the Jefferson National Forest (west and southwest of the City). We created AOIs for the impervious surfaces related to roads by using wider segments of the Interstate 81 corridor where the pavement has four or more lanes in each direction with little to no adjoining vegetation.
Using a feature space created separately from each image, we assessed the training data for normality, separability, and partitioning, and combined selected areas of interest as needed. These assessments ensured that brightness values for our different categories were not highly correlated, were mutually independent between bands, and that spatially separate pixels were not spectrally similar to each other. Our final signatures represented multiple distinct spectral classes for each of the two landcover categories (impervious and other). Four supervised classifications were performed on each image: parallelepiped, maximum likelihood, minimum distance, and Mahalanobis' distance.
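To illustrate how two of the four decision rules named above differ, the sketch below implements a minimum distance rule and a parallelepiped rule in plain Python. It is an illustration under stated assumptions, not the software used in this study: the parallelepiped box is taken from the per-band minimum and maximum of the training pixels (other implementations use the mean plus or minus a multiple of the standard deviation), and the training values and function names are hypothetical.

```python
import math


def class_stats(training):
    """Per-class band means and min/max bounds from training pixels.

    `training` maps a class name to a list of pixel vectors (one value per band).
    """
    stats = {}
    for label, pixels in training.items():
        bands = list(zip(*pixels))  # regroup values band by band
        stats[label] = {
            "mean": [sum(b) / len(b) for b in bands],
            "low": [min(b) for b in bands],
            "high": [max(b) for b in bands],
        }
    return stats


def minimum_distance(pixel, stats):
    """Assign the class whose mean vector is nearest in Euclidean distance."""
    return min(stats, key=lambda label: math.dist(pixel, stats[label]["mean"]))


def parallelepiped(pixel, stats, unclassified="unclassified"):
    """Assign the first class whose per-band box contains the pixel."""
    for label, s in stats.items():
        if all(lo <= v <= hi for v, lo, hi in zip(pixel, s["low"], s["high"])):
            return label
    return unclassified


# Hypothetical three-band reflectance values for the two categories.
training = {
    "impervious": [[0.20, 0.25, 0.22], [0.24, 0.27, 0.26], [0.22, 0.26, 0.24]],
    "other": [[0.05, 0.30, 0.12], [0.04, 0.34, 0.10], [0.06, 0.32, 0.14]],
}
stats = class_stats(training)
typical = [0.21, 0.26, 0.23]   # falls inside the impervious box
outlier = [0.50, 0.50, 0.50]   # outside every box
print(minimum_distance(typical, stats), parallelepiped(typical, stats))
print(minimum_distance(outlier, stats), parallelepiped(outlier, stats))
```

The outlier pixel shows the practical difference: the minimum distance rule always assigns a class, whereas the parallelepiped rule can leave a pixel unclassified. Maximum likelihood and Mahalanobis' distance additionally use the class covariance matrices and are omitted here for brevity.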

Validation Points and Accuracy Assessment
We created a validation dataset by generating a shapefile of random points for each census tract. The total number of random points for the entire city was 1,877; the number of samples positioned within a census tract varied from 50 in the smallest census tract to 115 in the largest. We overlaid these point files individually on the 2011 aerial photos and identified each point as either impervious or other.
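Random validation points within a tract boundary can be generated by rejection sampling, as sketched below. This is only one plausible approach; the GIS tool used for this study may work differently. The polygon, the seed, and the function names are hypothetical.

```python
import random


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon ring."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Count edges crossed by a horizontal ray extending to the right.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def random_points_in_polygon(polygon, n_points, seed=None):
    """Draw n_points uniformly inside the polygon by rejection sampling."""
    rng = random.Random(seed)
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    points = []
    while len(points) < n_points:
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, polygon):
            points.append((x, y))
    return points


# Hypothetical L-shaped tract; 50 points matches the smallest sample size reported.
tract = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
points = random_points_in_polygon(tract, 50, seed=42)
print(len(points))
```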
The most widely utilized data set in the USA for impervious surfaces was developed by the Multi-Resolution Land Characteristics Consortium (MRLC) as part of the National Land Cover Database (NLCD). This raster dataset was developed to identify "percent developed imperviousness" for the coterminous USA [63]. The dataset was derived by means of regression tree software using both leaf-on and leaf-off Landsat images and images from NOAA's Defense Meteorological Satellite Program [63]. The National Land Cover Database Impervious Surfaces (NLCD IS) dataset presents a continuous layer with a gradient of imperviousness from 0 to 100 percent for each 30 m by 30 m pixel [64]; i.e., the value of each pixel is the percent of impervious surfaces present within that pixel. Accuracy assessments have been performed on the full NLCD with resulting overall accuracies ranging from 59.7% to 80.5% [65], 43%-83% [66], and 78%-85% [67]. We were unable to locate a similar accuracy assessment performed specifically on the NLCD IS dataset.
Using the random points, we performed an accuracy assessment of the NLCD IS dataset representing our study area. After downloading the dataset from the MRLC website, we used GIS to generate several binary files with varying impervious thresholds from 10% to 75%, masking the extent to the city's political boundary. We added the values from each NLCD threshold file to the attribute table of our random points shapefile, and calculated user's accuracy, producer's accuracy, and overall accuracy for each threshold. We performed this assessment to establish a base threshold for overall accuracies of our classified Landsat images. Our method for this assessment differs from those employed by other researchers because we are assessing the presence or absence of impervious surfaces, whereas they calculated accuracies for all landcover/landuse categories (see [65-68]). The spatial reference for the Landsat imagery is based on WGS 1984, and that for the NLCD is NAD 1983 Albers. So, for each classified image, we changed the spatial reference from WGS 1984 to NAD 1983 Albers so our accuracy assessments would be comparable to that of the NLCD IS data. We added the value of each pixel (impervious or other) for each classified image for each point to the random points' attribute table, and then calculated overall accuracy for each image.
Our objective is to determine the most effective method(s) for identifying impervious surfaces from Landsat imagery. As such, as explained below and in the table in Section 3.3, we chose the four classified images with the highest overall accuracies and compared those images to the aerial photos. We examined them visually to assess their effectiveness in delineating specific landmarks, such as the airport, river, railway complex, neighborhood streets, and central business district (Figure 2).
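The thresholding and accuracy calculations described above can be sketched as follows. This is a minimal illustration, assuming that a pixel is labeled impervious when its NLCD IS value is greater than or equal to the threshold; the reference labels, percent values, and function names are hypothetical and are not data from this study.

```python
def to_binary(percent_values, threshold):
    """Label a pixel impervious (1) if its percent impervious meets the threshold."""
    return [1 if p >= threshold else 0 for p in percent_values]


def accuracy_report(reference, predicted):
    """Overall, producer's, and user's accuracy for the impervious class (1)."""
    tp = fp = fn = tn = 0
    for ref, pred in zip(reference, predicted):
        if ref == 1 and pred == 1:
            tp += 1
        elif ref == 0 and pred == 1:
            fp += 1
        elif ref == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    total = tp + fp + fn + tn
    return {
        "overall": (tp + tn) / total,
        # Producer's accuracy: share of reference impervious points found.
        "producers": tp / (tp + fn) if tp + fn else float("nan"),
        # User's accuracy: share of predicted impervious points that are correct.
        "users": tp / (tp + fp) if tp + fp else float("nan"),
    }


# Hypothetical reference labels (from photo interpretation) and NLCD IS values
# sampled at the same ten random points.
reference = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
nlcd_percent = [80, 45, 5, 0, 12, 60, 3, 38, 0, 20]
reports = {t: accuracy_report(reference, to_binary(nlcd_percent, t)) for t in (10, 35)}
for threshold, report in reports.items():
    print(threshold, report)
```

In this toy example, raising the threshold from 10% to 35% removes several false positives, which raises both overall and user's accuracy while leaving producer's accuracy unchanged.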

Assessment of Accuracy at the Census Tract Scale
After choosing the images with the greatest overall accuracy for the entire study site, we then assessed accuracy variations within the urban system, using census tracts as units. Although census tracts form somewhat arbitrary units for investigating impervious surfaces, their use here supports related analyses in our study area, and provides boundaries with which governments and institutions are increasingly defining their urban space within the USA. We chose eleven contiguous census tracts within our study site to accomplish the assessment. These included the census tract with the smallest area (Tract 11, also the most highly developed as it comprises the central business district), the largest in area (Tract 6.01), and the census tract with the least amount of development (Tract 28).

Aerial Image Delineation of Impervious Surfaces
The percent impervious, as delineated from the aerial photos, for the eleven census tracts used in the finer scale analysis ranged from 13.3% (Tract 28) to 89.5% (Tract 11) (Table 2). The smallest census tract (11), at 105.3 ha and representing the central business district, has the largest percent impervious at 89.5%. Census Tract 28, with the smallest percent impervious (13.3%), is the second largest in area (1,077.5 ha).

Accuracy Assessments of the NLCD IS Data Set
The several binary classifications of the NLCD IS data produced overall accuracies ranging from 53.5% (pixel impervious value ≥ 10%) to 72.8% (pixel impervious value ≥ 45%) and kappas from 0.21 to 0.43 (Table 3). Accuracy and kappa increase as the threshold of impervious percent increases until about 40%-50%. Between threshold values of 35% and 75%, overall accuracies are consistently close to 70% or above. These results are consistent with the overall accuracy assessments of the general NLCD performed by other researchers (see [65-69]), and thus form the baseline against which we compare our different classifications of the Landsat images.
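Kappa is reported alongside overall accuracy because it discounts agreement expected by chance. The sketch below computes Cohen's kappa for a two-class (impervious versus other) confusion matrix; the counts are hypothetical and the function name is an assumption, not part of the original analysis.

```python
def cohens_kappa(tp, fp, fn, tn):
    """Cohen's kappa for a 2 x 2 confusion matrix.

    tp: reference impervious, classified impervious
    fp: reference other, classified impervious
    fn: reference impervious, classified other
    tn: reference other, classified other
    Undefined (division by zero) when chance agreement equals 1.
    """
    total = tp + fp + fn + tn
    observed = (tp + tn) / total
    # Chance agreement from the row and column marginal totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical counts: overall accuracy is 0.80, but kappa is noticeably lower.
kappa = cohens_kappa(tp=3, fp=1, fn=1, tn=5)
print(round(kappa, 3))
```

This gap between overall accuracy and kappa is why the two measures can rank thresholds or classifiers differently.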

Accuracy Assessments of the Supervised Classifications
Overall accuracies for all sixteen supervised images ranged from 50.8% (Mahalanobis' distance: all bands) to 72.1% (parallelepiped: bands 2-6) (Table 4). Our kappa values for these classified images ranged from 0.16 (Mahalanobis' distance: all bands) to 0.40 (minimum distance: bands 1, 5, and 7) (Table 4). All the classification methods using the band combination of channels 2-6 produced results similar to the NLCD IS standard. The results for ten of our sixteen classified images were comparable to results achieved for the NLCD IS (i.e., for pixel impervious values ≥ 10%, NLCD IS overall accuracies remained steady around 70% and the kappas for the NLCD IS around 0.40).
Since our objective was to identify the most effective method(s), we selected parallelepiped-bands 2-6; minimum distance-bands 1, 5, and 7; and minimum distance-all bands to include in our finer scale assessment at the census tract level. These three images had the highest overall accuracies and highest kappas. In addition, several other classified images achieved results similar to the NLCD IS standard, so we also chose to include the one among them that attained the highest overall accuracy and highest kappa: Mahalanobis' distance-bands 5, 6, and 7. These selections gave us one classified image from each band combination, for a total of four images in our finer scale assessment.
Notes: * classified images in red were chosen for the finer scale analysis.

Comparison of Aerial Photos to Supervised Classifications
In this section, we report only our visual assessment of the four classified images chosen in Section 3.3 above. Each of the images (Figure 3) provided sharp detail for the airport area, the major roads and highways, the railway complex, and a large shopping center near the airport. Minimum distance-bands 1, 5, and 7; minimum distance-all bands; and Mahalanobis' distance-bands 5, 6, and 7 provide additional detail including most of the neighborhood streets. Minimum distance-bands 1, 5, and 7; minimum distance-all bands; and parallelepiped clearly record the high concentration of impervious surfaces within the central business district. Minimum distance-all bands performed the best in separating the river from the surrounding impervious surfaces, an area which can cause difficulty because of the many bridges crossing the river and sediments within the river from stormwater runoff. The neighborhood streets can cause difficulties with different classification algorithms because of shadowing from buildings and some trees, along with variations in width and types of adjoining vegetation.

Supervised Classification Results-Finer Scale
Table 5 summarizes the percent impervious from the manual delineation for each census tract, and the percent impervious per census tract for each of the four chosen classified images (Figure 3). The divergence of the classified image percent from the manual delineation percent is listed in Table 6 (negative values indicate under-estimates). Parallelepiped-bands 2-6 consistently under-estimated the percent impervious in all census tracts (a range in differences from −21.5 to −5.2) (Table 6). The other three methods over-estimated percent impervious for all census tracts (a range in differences from 1.0 to 27.5) (Table 6). No single method produced consistent under- or over-estimates in all the census tracts. When examined at finer scale, no one method consistently provided the highest accuracies and kappas in all tracts. Overall accuracies and kappas for each of the four images by census tract were variable (Table 7), consistent with the percent impervious comparisons in Tables 5 and 6. However, in six of the census tracts, all four methods achieved accuracies similar to the NLCD IS standard. Two of our census tracts had accuracies that exceeded those standards: Tracts 11 and 28. Census Tract 11 has the highest percent impervious and both minimum distance images achieved accuracies in excess of 80.0%. Census Tract 28 has the lowest percent impervious; all methods achieved accuracies in excess of 78%. We did find two census tracts (4 and 27) where the overall accuracies were lower than the NLCD IS standard.

Table 5. Percent impervious for the manual delineation of aerial photos and percent impervious for each of the four chosen classification methods (Figure 3), listed by census tract.

All classification methods produced overall accuracies equal to or exceeding results of the city-wide NLCD IS in the majority of the census tracts (parallelepiped bands 2-6 and minimum distance all bands in eight of eleven tracts, and the other two methods in seven of eleven tracts). The methods also produced overall accuracies in 18 of 44 assessments that exceed the city-wide NLCD IS standard (≥75%), and at least one kappa higher than the NLCD IS standard (≥0.43). In most cases, the highest overall accuracy and highest kappa for a specific census tract were from the same method, with three exceptions: Census Tracts 6.01, 6.02, and 28.
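The divergence values reported in Table 6 follow from a simple difference between the classified and reference percentages. The sketch below shows that arithmetic under the assumption that the classified percent is the share of a tract's pixels labeled impervious; the pixel counts and reference value are hypothetical and do not correspond to any tract in this study.

```python
def classified_percent_impervious(labels):
    """Percent of a tract's pixels labeled impervious (1) in a binary image."""
    return 100.0 * sum(labels) / len(labels)


def divergence(classified_percent, reference_percent):
    """Classified minus reference; negative values indicate under-estimates."""
    return classified_percent - reference_percent


# Hypothetical tract: 200 pixels, 50 classified impervious; reference is 40.0%.
labels = [1] * 50 + [0] * 150
classified = classified_percent_impervious(labels)
difference = divergence(classified, 40.0)
print(classified, difference)
```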

Table 7. Overall accuracy (OA) and kappa (k) for each of the four chosen classified images, listed by census tract.


Discussion
For our study site, at least four of the supervised classification methods performed as well as the NLCD IS data set with respect to overall accuracy and kappa value at the coarser city scale. Three of our four most effective classifications were based in part upon the TM's thermal channel, a result consistent with reports of other researchers who have noted the value of the thermal channel for classifying impervious surfaces, given their contribution to the urban heat island [5,13,16,70]. Because our leaf-off imagery records a time with little to no vegetation growth (winter season), band 1 was useful for distinguishing soil from vegetation, especially in instances where impervious surfaces (roads, highways, parking lots) exhibit properties similar to bare earth (no vegetative cover) [5,61]. Visually, these four classifications did vary in effectiveness for identifying distinguishable features (especially neighborhood streets, the river, and the central business district) in various parts of the city. In heterogeneous urban areas such as these specific locales, the classification algorithms vary in their latitude to assign mixed pixels to specific spectral categories.
When examined at the finer detail of the census tract, internal heterogeneity within the city greatly impacted the results. Except for a few census tracts, our classifications produced variable results within and across the census tracts. Census tracts are defined by demographic criteria, so they are not comparable to each other with respect to composition and variation in landuse/landcover.
Where average parcel sizes are larger, our accuracies were higher, but in areas with wide variability in parcel sizes (thus greater heterogeneity) our accuracies were more variable across the methods.
Average parcel size within the census tract, spatial patterns of the parcels, percent impervious, and interactions with the classification algorithm all contribute to variability of results (Table 8):

Parcels
• Parcel sizes, the range of parcel sizes within a specific tract, the percent impervious, and the spatial pattern of land uses all affect accuracy;
• Units with very high and very low percent impervious show high accuracies for all methods (Census Tracts 11 and 28); and
• Units with the highest percentages of impervious surfaces (Census Tract 11) show low kappa values; the more variability in the range of parcel sizes within a census tract and the higher the percent impervious, the greater the variability in accuracy of classifications between the methods.

Classification Strategy
• At least one classification method achieved an accuracy of over 70% in nine of the eleven census tracts;
• Parallelepiped classification-bands 2-6 proved the most effective method in tracts with moderate percentages of impervious surfaces and larger average parcel sizes;
• In some census tracts, multiple methods achieved accuracies greater than the city-wide NLCD IS standard; and
• For all census tracts, the most effective method included either the thermal channel (band 6) or the blue visible channel (band 1).

Our analysis is subject to several sources of uncertainty:
• Although we avoided bias in our random points dataset by retaining an independent party to complete it, human error still could have been a factor and some points may have been misidentified as impervious or other;
• Shadowing from buildings and coniferous trees could have altered the brightness of some pixels, resulting in misclassification between impervious and other;
• For our finer scale analysis, we chose the best four methods from the broader scale analysis, so it is possible that another method, which we did not consider, could have performed equally well or better at the finer scale; and
• Some imprecision can occur when comparing point values to pixel values. Our pixels cover a 900 square meter area (30 m by 30 m); in urban settings, this area could encompass several different surface covers, whereas a random point has no area and could be located anywhere within a pixel with different surface covers.

Additionally, many studies have pointed out the problems encountered with mixed pixels within urban areas and have developed alternative methods in attempts to solve the mixed pixel dilemma [5,13]. However, our analysis indicates that more complicated methods for delineating impervious surfaces from Landsat imagery may not be necessary for this study site. Our overall accuracies at the coarser city-wide scale for several of our images were comparable to the overall accuracy of the widely used NLCD IS dataset, and to accuracies achieved by many of the more complex methods referenced in Table 1. At finer census tract scales, even though no one method was effective across all census tracts, different methods and band combinations produced accuracies in excess of 70% and 80%.

Conclusions and Future Outlook
Based on our analysis, Landsat imagery is effective in delineating urban impervious surfaces using standard classification methods, but the best specific method is highly dependent on the internal heterogeneity within a specific site. Most of our classifications at the city scale provided accuracies equivalent to those of the US National Land Cover Database. Our study shows that, for our study site and similar urban areas, a finer scale analysis using spatially compact boundaries can overcome this internal heterogeneity and provide higher accuracies. The best method will depend on physical characteristics of the finer-scale compact area.
The literature on identification of impervious surfaces is abundant, with the majority of the studies analyzing urban areas at coarse scales. To properly manage impacts of urbanization, such as stormwater mitigation and temperature regulation, analyses must account for the heterogeneity within urban regions. We encourage applications of more robust, and more detailed, reference data to permit analysis and validation at finer spatial scales to provide improved assessment, and to permit analysis of the detailed spatial fabric of the urban landscape.
Many developed countries have high-resolution aerial photos available to aid in identification of impervious surfaces. However, these analyses are not uniformly effective, and derivation of impervious surfaces from high-resolution aerial photography is time-intensive and costly, a practical concern for developing countries. In contrast, the Landsat image archive is free, readily available for most regions of the world, and covers many decades. Moreover, as our study reveals, standard classification methods can be quite effective, especially when completed at finer scales.

Figure 1. City of Roanoke, Virginia, USA Reference Map.

Figure 2. Specific locations for visual accuracy.

Figure 3. Impervious surfaces as classified by four algorithms.

Table 2. Percent impervious (IS) delineated from aerial photos and total area of census tract.

Table 3. Overall accuracy and kappa of Roanoke's NLCD IS dataset at varying thresholds of percent impervious.

Table 4. Overall accuracy (OA) and kappa of all sixteen classified Landsat images *.

Table 6. Difference in percent impervious between aerial photo delineation and classified images, listed by census tract *.

Table 8. Summary of census tract characteristics and best classification method for that census tract.