Comparison of Remote Sensing Image Processing Techniques to Identify Tornado Damage Areas from Landsat TM Data.

Remote sensing techniques have been shown to be effective for large-scale damage surveys after a hazardous event in both near real-time and post-event analyses. This paper aims to compare the accuracy of common image processing techniques for detecting tornado damage tracks from Landsat TM data. We employed the direct change detection approach using two sets of images acquired before and after the tornado event to produce principal component composite images and a set of image difference bands. Techniques in the comparison include supervised classification, unsupervised classification, and an object-oriented classification approach with a nearest neighbor classifier. Accuracy assessment is based on the Kappa coefficient calculated from error matrices, which cross-tabulate correctly identified cells on the TM image and commission and omission errors in the result. Overall, the object-oriented approach exhibits the highest degree of accuracy in tornado damage detection. PCA and image differencing methods show comparable outcomes. While selected PCs can improve detection accuracy by 5 to 10%, the object-oriented approach performs significantly better, with 15-20% higher accuracy than the other two techniques.


Introduction
Remote sensing is a cost-effective tool for large-scale damage surveys after hazardous events. From hurricanes to earthquakes, satellite or airborne imagery provides an immediate overview of the damaged area and facilitates rescue and recovery efforts. Not only can these images provide damage estimates, but damaged areas identified from the imagery can also guide limited emergency or survey crews to the areas that need detailed analysis. Nevertheless, the usefulness of remote sensing for detecting damaged areas depends on the accuracy of the detection techniques. While several studies have applied satellite or airborne images to detect tornado damage, there has been no systematic assessment of damage detection accuracy across different image processing approaches. A good understanding of how image processing techniques perform enables an informed choice of technique for damage detection. This paper compares three main approaches in remote sensing image processing and, through the comparison, draws insights into the strengths and limitations of each technique in detecting tornado damage tracks. Included in the comparison are Principal Component Analysis (PCA), Image Differencing, and Object-oriented Classification. Fundamentally, damaged area detection is a classification problem: the aim is to classify all areas in the input images into two classes, undamaged area and damaged area. All image processing techniques for damaged area detection assume that damaged and undamaged areas correspond to discernible differences in spectral reflectance on the images. Therefore, classification of image reflectance reveals classes of damaged and undamaged areas.
While the PCA and Image Differencing methods are both based on multivariate statistics, the two approaches detect changes in distinct ways. PCA classifies spectral reflectance on the before and after images separately and then determines the differences between the two images as damaged area. Image Differencing, on the other hand, classifies cells based on spectral differences before and after the event. Furthermore, Object-oriented Classification considers the characteristics of a conceptual object (such as the elongated nature of a tornado track) on an image and applies the object characteristics in determining whether image cells (or pixels) should be considered outside or inside the object. These conceptual differences in image classification are expected to play a major role in accuracy assessment [1][2].
The Oklahoma City Tornado Outbreak on May 3, 1999 serves as the case study here. The event caused widespread damage across both urban and rural areas in central Oklahoma. Landsat TM images are used to compare all three image processing techniques. Extensive field surveys conducted by the National Weather Service provide detailed ground data to assess the accuracy of these techniques. The extended damaged area also provides a wide range of damage intensity that can challenge the detection robustness of image processing techniques.
For all remote sensing analyses, data preparation is critical to ensure the validity of the analysis that follows. The next section describes the procedures taken to perform atmospheric corrections and to geographically match the pre- and post-event Landsat TM images so that changes caused by the tornadic event (i.e., damaged area) can be legitimately derived by comparing the images. The following section details the conceptual and methodological foundations of each image processing technique and its application to detecting tornado damage areas. Both unsupervised and supervised approaches are considered in applying these techniques to damage detection. Classes of undamaged area and damaged area (i.e., tornado damage tracks) are then compared with ground survey results. Error matrices cross-tabulate correctly identified cells and report producer's accuracy, user's accuracy, and the Kappa index for the overall error assessment. A discussion follows to compare error matrices among the tested techniques. The final section summarizes the findings and suggests directions for future studies.

Data Preparation and Study Area
Landsat Thematic Mapper image data (path 28 and row 35) at 28.5 m spatial resolution, with seven channels ranging from the blue to the thermal infrared portion of the spectrum, were used to perform the damage assessment. The thermal channel was excluded from the study because of its coarser resolution. A location map of the study area is provided in Figure 1. We co-registered both images to minimize locational errors (Myint and Wang, 2006).
We then converted the Landsat TM data to apparent surface reflectance using an atmospheric correction method known as the Cos(t) model (Chavez, 1996). This model incorporates all of the elements of the dark object subtraction model (for haze removal) and a procedure for estimating the effects of absorption by atmospheric gases and Rayleigh scattering. Although data import, image layer stacking, visual judgment, and image subsetting were performed in ERDAS Imagine, conversion from DN values to reflectance was performed one band at a time using the ATMOSC module in the IDRISI software package. The reflectance data were imported back into ERDAS Imagine for layer stacking. The layer-stacked image data were multiplied by 10,000 and kept as 16-bit integer data for easy computation and comparison.

Methodology
Digital change detection methods have been broadly divided into either pre-classification spectral change detection or post-classification change detection methods [3][4][5]. In post-classification change detection, two images acquired on different dates are classified separately, and the changes are identified through direct comparison of the classified information [6]. Since the approach was originally employed in the late 1970s for early satellite images, the method has a long history of application to change detection analysis, and some researchers consider it a standard approach for change detection [7]. The analyst can produce a change map with a matrix of changes by overlaying the classification results for time t1 (pre-event image) and time t2 (post-event image).
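The overlay step can be illustrated with a minimal sketch. It assumes two already classified maps stored as integer arrays on the same grid; the function name, the toy arrays, and the two-class coding are hypothetical and are not taken from the study.

```python
import numpy as np

def change_matrix(class_t1, class_t2, n_classes):
    """Cross-tabulate class labels at t1 (rows) against labels at t2 (columns)."""
    t1 = np.asarray(class_t1).ravel()
    t2 = np.asarray(class_t2).ravel()
    # Encode each (t1, t2) label pair as one index, then count occurrences.
    pair_index = t1 * n_classes + t2
    counts = np.bincount(pair_index, minlength=n_classes * n_classes)
    return counts.reshape(n_classes, n_classes)

# Hypothetical 3x3 classified maps: 0 = unchanged cover, 1 = disturbed surface.
pre_event = np.array([[0, 0, 1], [0, 0, 1], [0, 0, 0]])
post_event = np.array([[0, 1, 1], [0, 1, 1], [0, 0, 0]])
matrix = change_matrix(pre_event, post_event, n_classes=2)
print(matrix)
```

Off-diagonal cells of the resulting matrix count pixels whose class differs between the two dates.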
In the case of pre-classification change detection, a new single-band or multi-spectral image is generated from the original bands to detect changed areas [8]. This approach generally involves further processing to determine changes over time. After a change image is obtained, further analysis is required to separate change from no-change pixels and to produce a classified map. Histogram thresholding is a simple approach for identifying the change pixels: pixels that show no significant change tend to be grouped around the mean, while pixels with significant change are found in the tails of the histogram distribution [9]. This approach is known as the direct multidate classification technique or composite analysis [10]. The goal of the study was to assess image processing techniques that can effectively classify non-damaged and damaged areas from remotely sensed imagery. Three main change detection approaches are compared here according to their ability to extract changed areas and to identify damaged and non-damaged areas using a commonly used supervised classifier (i.e., maximum likelihood), a widely used unsupervised classifier (i.e., iterative self-organizing data analysis, or ISODATA), and an object-oriented approach.
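Histogram thresholding of a change image can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the study: the multiplier k = 2 is an arbitrary choice, and the change band is synthetic.

```python
import numpy as np

def threshold_change(change_band, k=2.0):
    """Flag pixels lying more than k standard deviations from the band mean."""
    band = np.asarray(change_band, dtype=float)
    mean, std = band.mean(), band.std()
    # Pixels near the mean are treated as no change; the tails are change.
    return np.abs(band - mean) > k * std

# Synthetic change band: background noise plus one strongly changed patch.
rng = np.random.default_rng(0)
change_band = rng.normal(0.0, 1.0, size=(50, 50))
change_band[10:15, 10:15] += 12.0
change_mask = threshold_change(change_band, k=2.0)
print(change_mask.sum(), "pixels flagged as change")
```

A smaller k flags more pixels as change and therefore admits more unchanged pixels into the change class.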

Damage Assessment Using Principal Component Analysis
Spectral change detection or pre-classification techniques rely on the principle that land cover changes result in persistent changes in the spectral signatures of the affected land surfaces. These techniques involve the transformation of two original images into single-band or multi-band images in which the areas of spectral change are highlighted [10]. There have been efforts to increase the accuracy of changed area identification using a number of direct change detection approaches: principal component comparison [11], change vector analysis [12], regression analysis [8], inner product analysis [13], correlation analysis [14], image ratioing [3], and the multitemporal NDVI analysis method [15]. In contrast to the direct change detection approaches, [16][17]  In this study, we employed the direct change detection approach, applying a principal component analysis (PCA) to two sets of images acquired before and after the tornado event to produce a principal component composite image. This can be achieved by superimposing two N-band images to generate a single 2N-band image dataset, followed by a principal component analysis to produce 2N principal component bands. Following this procedure, two Landsat TM images, each with 6 bands (excluding the thermal band), acquired before and after the May 3, 1999 tornado outbreak over the study area were first layer-stacked to generate a single 12-band image. A principal component analysis was then performed to produce 12 principal component bands.
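The stacking and transformation steps can be sketched with NumPy. The sketch assumes an unstandardized (covariance-based) PCA and uses synthetic 6-band arrays in place of the TM scenes; the function name and data are hypothetical, and the software used in the study may compute the components differently.

```python
import numpy as np

def multidate_pca(pre_image, post_image):
    """Stack two (bands, rows, cols) images; return PC bands and variance shares."""
    stacked = np.concatenate([pre_image, post_image], axis=0)  # 2N bands
    n_bands, rows, cols = stacked.shape
    pixels = stacked.reshape(n_bands, -1).astype(float)
    pixels -= pixels.mean(axis=1, keepdims=True)   # center each band
    cov = np.cov(pixels)                           # (2N, 2N) band covariance
    eigvals, eigvecs = np.linalg.eigh(cov)         # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]              # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    pcs = (eigvecs.T @ pixels).reshape(n_bands, rows, cols)
    return pcs, eigvals / eigvals.sum()

# Synthetic stand-ins for the pre- and post-event 6-band images.
rng = np.random.default_rng(1)
pre = rng.random((6, 40, 40))
post = pre + rng.normal(0.0, 0.01, (6, 40, 40))
post[:, 5:10, :] -= 0.3                            # simulated damage swath
pc_bands, variance_share = multidate_pca(pre, post)
pc_3_4 = pc_bands[[2, 3]]                          # composite of PC bands 3 and 4
```

Indexing is zero-based, so PC bands 3 and 4 correspond to rows 2 and 3 of the output.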
The PCA results suggest that tornado damage areas can be observed in principal component (PC) bands 2, 3, and 4; hence, we layer-stacked PC bands 2, 3, and 4 as the first set of images for the assessment. Figures 3, 4, 5, 6, 7, and 8 show PC bands 1, 2, 3, 4, 5, and 6 respectively. Even though the first principal component contains the largest share of the total scene variance [18], it can be observed from Figure 3 that the first PC band does not show the damaged areas of the 3 May 1999 tornado well. PC5 and the higher-order principal components do not carry much information on changes beyond some noise in the images. Succeeding component images contain a decreasing percentage of the total scene variance; hence, we do not show PC images beyond PC6. A closer inspection revealed that PC bands 3 and 4 showed the strongest response to tornado damage signatures in the images. We anticipated that this could lead to a good result, and we layer-stacked PC bands 3 and 4 as an additional set of PCA images for the analysis. We selected training samples of damaged and non-damaged areas to perform a supervised classification (i.e., maximum likelihood) on the two selected composites of PC image bands (i.e., PC 2, 3, 4 and PC 3, 4). Figures 9 and 10 show PC composite bands 2, 3, 4 and PC composite bands 3, 4 respectively.
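A maximum likelihood classifier of the kind referred to above can be sketched as a Gaussian model per class. This is a minimal illustration assuming equal class priors and synthetic two-band training pixels; it is not the implementation in the image processing software used in the study.

```python
import numpy as np

def train_gaussian_ml(samples_by_class):
    """Estimate mean, inverse covariance, and log-determinant for each class."""
    params = []
    for samples in samples_by_class:               # each: (n_pixels, n_bands)
        samples = np.asarray(samples, dtype=float)
        mean = samples.mean(axis=0)
        cov = np.cov(samples, rowvar=False)
        params.append((mean, np.linalg.inv(cov), np.linalg.slogdet(cov)[1]))
    return params

def classify_gaussian_ml(pixels, params):
    """Assign each pixel to the class with the highest Gaussian log-likelihood."""
    pixels = np.asarray(pixels, dtype=float)       # (n_pixels, n_bands)
    scores = []
    for mean, inv_cov, log_det in params:
        diff = pixels - mean
        mahalanobis = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
        scores.append(-0.5 * (log_det + mahalanobis))  # equal priors assumed
    return np.argmax(np.stack(scores, axis=1), axis=1)

# Synthetic training pixels in two PC bands: class 0 = non-damaged, 1 = damaged.
rng = np.random.default_rng(2)
non_damaged = rng.normal([0.0, 0.0], 0.5, size=(200, 2))
damaged = rng.normal([4.0, 4.0], 0.5, size=(200, 2))
params = train_gaussian_ml([non_damaged, damaged])
labels = classify_gaussian_ml(np.array([[0.1, -0.2], [3.8, 4.1]]), params)
```

Because the classifier models each class with a normal distribution, its results depend on how well the training samples represent the spectral variability of each class.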
We also employed an unsupervised classification algorithm, namely iterative self-organizing data analysis (ISODATA), to identify 50 clusters. We determined which clusters belong to damaged areas, as visually identifiable in the images, by interactively displaying one cluster at a time on the monitor. The ISODATA utility repeats the clustering of the image until either a maximum number of iterations has been performed or a specified percentage of pixels remains unchanged between two iterations. This percentage is known as the convergence threshold: it is the proportion of pixels whose cluster assignments must stay unchanged for the utility to stop. In this study, we used 20 iterations and a convergence threshold of 0.97. A convergence threshold of 0.97 implies that as soon as 97% or more of the pixels stay in the same cluster between one iteration and the next, the utility stops processing.
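The stopping rule can be illustrated with a simplified clustering loop. The sketch below is a k-means-style iteration that borrows only the maximum-iteration and convergence-threshold controls described above; it omits the cluster splitting and merging of full ISODATA, and the function name, random initialization, and data are assumptions made for illustration.

```python
import numpy as np

def cluster_with_convergence(pixels, n_clusters=50, max_iter=20,
                             threshold=0.97, seed=0):
    """Iterative clustering that stops once `threshold` of labels are unchanged."""
    pixels = np.asarray(pixels, dtype=float)       # (n_pixels, n_bands)
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), n_clusters, replace=False)]
    labels = np.full(len(pixels), -1)
    for iteration in range(1, max_iter + 1):
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        unchanged = np.mean(new_labels == labels)  # fraction keeping their cluster
        labels = new_labels
        if unchanged >= threshold:
            break
        for c in range(n_clusters):                # recompute cluster means
            members = pixels[labels == c]
            if len(members):
                centers[c] = members.mean(axis=0)
    return labels, iteration

# Synthetic two-band pixels; a small cluster count keeps the example fast.
rng = np.random.default_rng(3)
pixels = rng.random((600, 2))
labels, n_iterations = cluster_with_convergence(pixels, n_clusters=5)
```

The analyst would still need to inspect the resulting clusters and decide which ones correspond to damaged areas.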

Damage Assessment Using Image Differencing Approach
In image differencing, co-registered images of two different dates are subtracted, followed by the application of a threshold value to generate an image that shows changes in land use and land cover. As discussed earlier, threshold values are typically set based on a standard deviation value. A lower standard deviation threshold may lead to greater inclusion of unchanged pixels. Optimally, selection of the proper threshold should be based on the accuracy of categorizing pixels as change or no change [11]. The threshold values for change/no change can be determined as the mean plus a number of standard deviations, or interactively by an operator using image processing software capable of level slicing. Although this is a straightforward procedure, it determines only the changed areas and does not identify the type of change from one class to another [9].
We used image differences of Landsat TM reflectance data acquired on June 26, 1998 and May 12, 1999. We selected all image difference bands as the first set of image difference bands for the identification of damaged areas. Because tornado damage areas appear evident in bands 3, 5, and 7 of the differencing image, we layer-stacked these image difference bands as the second set of images for the assessment. Figures 11, 12, 13, 14, 15, and 16 show image difference bands 1, 2, 3, 4, 5, and 6 respectively. It can be seen from Figure 14 that image difference band 4 does not show much information on damaged areas. By visual judgment of the difference images, we observed that the second least effective image difference band was band 1.
As mentioned earlier, we selected training samples of damaged and non-damaged areas to perform a supervised classification (i.e., maximum likelihood) on the two selected sets of image differences (i.e., bands 1, 2, 3, 4, 5, 7 and bands 3, 5, 7). Figure 17 shows the image differences of composite bands 3, 5, and 7. We also employed an unsupervised approach (Iterative Self-Organizing Data Analysis Technique, ISODATA) using 20 iterations and a 0.97 convergence threshold to determine 50 clusters for the assessment of tornado-damaged areas in the above sets of image difference bands. Following the same procedure, we determined which clusters belong to damaged areas, as visually identifiable in the images, by interactively displaying one cluster at a time on the monitor.
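Band-by-band differencing and the selection of a band subset can be sketched as follows. The sketch assumes reflectance stored as scaled 16-bit integers and a post-minus-pre sign convention; both the convention and the toy arrays are assumptions, not details confirmed by the study.

```python
import numpy as np

def difference_bands(pre_reflectance, post_reflectance):
    """Subtract co-registered (bands, rows, cols) images, post minus pre."""
    # Widen to 32-bit so differences of 16-bit values cannot overflow.
    pre = np.asarray(pre_reflectance, dtype=np.int32)
    post = np.asarray(post_reflectance, dtype=np.int32)
    if pre.shape != post.shape:
        raise ValueError("images must be co-registered to the same grid")
    return post - pre

TM_BANDS = [1, 2, 3, 4, 5, 7]                      # thermal band 6 excluded

# Toy scaled-reflectance images with a brightened patch in band 3.
pre = np.full((6, 4, 4), 3000, dtype=np.int16)
post = pre.copy()
post[TM_BANDS.index(3), 1:3, 1:3] = 4500
diff = difference_bands(pre, post)

# Second image set: difference bands 3, 5, and 7 only.
subset = diff[[TM_BANDS.index(b) for b in (3, 5, 7)]]
```

The difference bands can then be thresholded or passed to a classifier in the same way as any other multi-band image.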

Damage Assessment Using Object-oriented Approach
In the object-oriented approach to image classification, an object is defined as a group of pixels having similar spectral and spatial properties. An object-based classification approach generally uses segmented objects at different scale levels as the basic units, instead of operating on a per-pixel basis at a single scale [19][20][21].
Image segmentation is the primary task; it splits an image into separate groups of cells, or objects, depending on parameters specified in the first stage, before a classification is carried out. We used eCognition Professional 4.0 to perform the object-based classification. Three parameters need to be specified in the segmentation function of the eCognition software [22], namely the shape (S_sh), compactness (S_cm), and scale (S_sc) parameters. Users can apply weights ranging from 0 to 1 to the shape and compactness factors to determine objects at different scale levels. These two parameters control the homogeneity of the objects. The shape factor balances spectral homogeneity against the shape of objects, whereas the compactness factor, which balances compactness and smoothness, determines whether object shapes tend toward smooth boundaries or compact edges. The scale parameter is generally considered the key parameter in image segmentation; it controls the object size so that it matches the user's required level of detail. Different object sizes can be obtained by applying different values of the scale parameter. A higher scale value (e.g., 200) generates larger homogeneous objects (smaller scale, lower level of detail), whereas a smaller scale value (e.g., 20) leads to smaller objects (larger scale). A smaller value of the scale parameter is considered a higher level in the segmentation procedure. The decision on the scale level depends on the object size required to achieve the goal. The software also allows users to assign different weights to different bands of the selected image during image segmentation.
Image Classification with Object-oriented Approach: To assign classes to selected objects, we used the nearest neighbor classification procedure. The nearest neighbor option is a non-parametric classifier and is therefore independent of the assumption that data values follow a normal distribution. This technique allows unlimited applicability of the classification system to other areas, requiring only the additional selection or modification of objects (training samples) until a satisfactory result is obtained [23]. The major advantage of the nearest neighbor classifier is that it is capable of discriminating classes that are spectrally similar and not well separated using a few features or just one feature [24]. The nearest neighbor approach in eCognition can be applied to any number of classes at any level using any original, composite, transformed, or customized bands. Two options are available with the nearest neighbor function, namely (1) Standard Nearest Neighbor and (2) Nearest Neighbor. The Standard Nearest Neighbor option automatically selects the mean values of objects for all the original bands in the selected image, whereas the Nearest Neighbor option requires users to identify variables (e.g., shape, texture, hierarchy) under object features, class-related features, or global features.
We employed the nearest neighbor approach after performing an image segmentation procedure at the required scale level using a composite of PC image bands 3 and 4 to identify damaged and non-damaged areas. The composite image of PC bands 3 and 4 gave the highest accuracy among all composite bands and shows damaged and non-damaged areas better than the other composite bands. After testing different scale levels and parameter values, we considered a scale level of 100 to be optimal for the study. The shape parameter (S_sh) was set to 0.3 to give less weight to shape and more weight to spectral homogeneity during image segmentation. The compactness parameter (S_cm) was set to 0.5 to balance compactness and smoothness of objects equally. Figure 18 shows a segmented image using shape (S_sh) 0.3, compactness (S_cm) 0.5, and scale (S_sc) 100. We identified 10 to 15 training samples of non-damaged areas and 5 to 10 samples of damaged areas. We selected different samples iteratively and classified damaged areas until we obtained satisfactory results.
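The idea of classifying objects by the mean band values of their nearest training object can be sketched in a few lines. This is a plain one-nearest-neighbor rule on object means with a Euclidean distance, offered as an assumption-laden illustration; it does not reproduce eCognition's segmentation or its nearest neighbor implementation, and the segment image and training choices are hypothetical. Segment labels are assumed to run consecutively from 0.

```python
import numpy as np

def object_means(bands, segments):
    """Mean of each band within each segment; returns (n_segments, n_bands)."""
    bands = np.asarray(bands, dtype=float)         # (n_bands, rows, cols)
    flat_seg = np.asarray(segments).ravel()
    n_segments = flat_seg.max() + 1
    counts = np.bincount(flat_seg, minlength=n_segments)
    sums = [np.bincount(flat_seg, weights=b.ravel(), minlength=n_segments)
            for b in bands]
    return np.stack(sums, axis=1) / counts[:, None]

def nearest_neighbor_objects(features, train_ids, train_classes):
    """Give every object the class of its closest training object."""
    train = features[train_ids]
    dists = np.linalg.norm(features[:, None, :] - train[None, :, :], axis=2)
    return np.asarray(train_classes)[dists.argmin(axis=1)]

# Hypothetical 4x4 scene with four segments and two feature bands.
segments = np.array([[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 3, 3], [2, 2, 3, 3]])
bands = np.zeros((2, 4, 4))
bands[:, :2, 2:] = 5.0                             # segment 1
bands[:, 2:, :2] = 4.6                             # segment 2
bands[:, 2:, 2:] = 0.3                             # segment 3

features = object_means(bands, segments)
# Training objects: segment 0 is non-damaged (0), segment 1 is damaged (1).
object_classes = nearest_neighbor_objects(features, [0, 1], [0, 1])
class_map = object_classes[segments]               # paint classes back onto pixels
```

Adding or replacing training objects and re-running the classification mirrors the iterative sample selection described above.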

Accuracy Assessment
For the classification accuracy assessment, error matrices were produced and analyzed for each composite band and each method. These error matrices cross-tabulate the class to which each pixel truly belongs (columns) against the map unit to which it is allocated by the selected analysis (rows). From each error matrix, overall accuracy, producer's accuracy, user's accuracy, and the Kappa coefficient were generated. It has been suggested that a minimum of 50 sample points for each land-use/land-cover category in the error matrix be collected for the accuracy assessment of any image classification [25]. We used a stratified random sampling approach to select 120 sample points, which leads to approximately 60 points per class (damaged and non-damaged areas) for the accuracy assessment. To be consistent and to allow precise comparison, we used the same sample points for the outputs generated by the object-oriented classifier, the supervised approach (i.e., maximum likelihood), and the unsupervised classification technique (i.e., ISODATA). For a better evaluation, we performed the classification accuracy assessment on the original output maps without editing or manually correcting any of them.
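The accuracy measures can be computed from an error matrix as sketched below. The example matrix is reconstructed, not copied from a table: it uses the counts reported in the Results for the supervised classification of PC bands 2, 3, and 4 (23 damaged points correct, 25 damaged points labeled non-damaged, no false damaged labels), and the 72 non-damaged points are inferred from the 120 sample points and the 79.17% overall accuracy.

```python
import numpy as np

def accuracy_report(error_matrix):
    """Rows = classified (map) classes, columns = reference classes."""
    m = np.asarray(error_matrix, dtype=float)
    total = m.sum()
    diagonal = np.diag(m)
    overall = diagonal.sum() / total
    users = diagonal / m.sum(axis=1)               # 1 - commission error
    producers = diagonal / m.sum(axis=0)           # 1 - omission error
    # Chance agreement expected from the row and column totals.
    expected = (m.sum(axis=1) * m.sum(axis=0)).sum() / total**2
    kappa = (overall - expected) / (1 - expected)
    return overall, producers, users, kappa

# Class order: [non-damaged, damaged]; counts reconstructed as described above.
matrix = [[72, 25],    # classified non-damaged: 72 correct, 25 actually damaged
          [0, 23]]     # classified damaged: 0 false alarms, 23 correct
overall, producers, users, kappa = accuracy_report(matrix)
print(f"overall={overall:.4f} kappa={kappa:.4f}")
print("producer's:", producers.round(4), "user's:", users.round(4))
```

Under these reconstructed counts the sketch gives an overall accuracy of about 79.17% and a Kappa of about 0.5247, with a producer's accuracy of about 47.92% and a user's accuracy of 100% for the damaged class.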

Results and Discussion
Overall accuracies produced by a composite image of PC bands 2, 3, and 4 using the unsupervised approach (i.e., ISODATA) and the supervised approach (i.e., maximum likelihood) were 84.17% (Table 1) and 79.17% (Table 2) respectively. Damaged areas also appear more visible on the unsupervised output (Figure 19) than on the supervised output (Figure 20). The unsupervised output contains more misclassified pixels in non-damaged areas, whereas the supervised output seems to contain far fewer damaged areas. Producer's accuracy, which measures the error of omission, and user's accuracy, which describes the error of commission, for damaged areas identified by the unsupervised approach were 70.83% and 87.18% respectively. Even though PC bands 2, 3, and 4 with the unsupervised approach gave higher overall accuracy than the supervised approach, user's accuracy of damaged areas for the supervised approach reaches 100%. This implies that an area the classification labels as damaged will be found to be damaged 100 percent of the time when visited on the ground. However, the producer's accuracy of damaged areas is only 47.92%: fewer than half of the reference points in damaged areas were identified as damaged by the classification. This was because 25 points that were supposed to be damaged areas were mistakenly identified as non-damaged areas, and only 23 points were correctly identified as damaged areas. In other words, all points in areas classified as damaged were correctly identified, whereas many points in areas classified as non-damaged were found to be damaged on the ground (reference data).
Table 1 summary: Overall Accuracy = 84.17%, Overall Kappa = 0.6595.
Table 2. Overall accuracy, producer's accuracy, user's accuracy, and Kappa coefficient produced by a composite image of PC bands 2, 3, and 4 with a supervised classifier (i.e., maximum likelihood). Overall Accuracy = 79.17%, Overall Kappa = 0.5247.
Figure 19. An output map of PC composite bands 2, 3, and 4 using an unsupervised approach (i.e., ISODATA). Note: blue color represents non-damaged areas and red color represents damaged areas.
Figure 20. An output map of PC composite bands 2, 3, and 4 using a supervised approach (i.e., maximum likelihood). Note: blue color represents non-damaged areas and red color represents damaged areas.
Overall accuracies produced by a composite image of PC bands 3 and 4 using the unsupervised and supervised approaches were 85.00% (Table 3) and 87.50% (Table 4) respectively. Both producer's and user's accuracies for non-damaged areas were higher than those for damaged areas. This is probably because the areal extent of the non-damaged category was much larger than that of the damaged areas, so many randomly selected points fell in non-damaged areas. The overall accuracies of 85% and 87.5% produced by a composite image of PC bands 3 and 4 reach the minimum mapping accuracy of 85% required for most resource management applications [26][27]. Figures 21 and 22 suggest that the output generated by the supervised approach shows more damaged areas, whereas the unsupervised output contains fewer damaged areas. We found that the PC composite of bands 3 and 4, with either an unsupervised or a supervised approach, can be considered effective for identifying areas damaged by a disaster event.
A composite image difference of the original reflectance bands 1, 2, 3, 4, 5, and 7 using the unsupervised and supervised classifiers gave overall accuracies of 80.33% (Table 5) and 83.33% (Table 6) respectively. The outputs generated by the unsupervised and supervised classifiers look similar. Output maps of both techniques (Figures 23 and 24) show less noise in non-damaged areas and contain fewer damaged areas than can be observed visually in the composite images. This could be the reason why producer's accuracies for both outputs were very low (51.06% and 61.36%). In general, this seems to be a good approach, as both outputs produce reasonably high overall accuracies.
Table 3. Overall accuracy, producer's accuracy, user's accuracy, and Kappa coefficient produced by a composite image of PC bands 3 and 4 with an unsupervised classifier (i.e., ISODATA).
Table 4 summary: Overall Accuracy = 87.50%, Overall Kappa = 0.7238.
Figure 21. An output map of PC composite bands 3 and 4 using an unsupervised approach (i.e., ISODATA). Note: blue color represents non-damaged areas and red color represents damaged areas.
Figure 22. An output map of PC composite bands 3 and 4 using a supervised approach (i.e., maximum likelihood). Note: blue color represents non-damaged areas and red color represents damaged areas.
Table 6 summary: Overall Accuracy = 83.33%, Overall Kappa = 0.6154.
Figure 23. An output map of image difference bands 1, 2, 3, 4, 5, and 7 using an unsupervised approach (i.e., ISODATA). Note: blue color represents non-damaged areas and red color represents damaged areas.
Figure 24. An output map of image difference bands 1, 2, 3, 4, 5, and 7 using a supervised approach (i.e., maximum likelihood). Note: blue color represents non-damaged areas and red color represents damaged areas.
Because the visual analysis showed that image difference bands 3, 5, and 7 depicted damaged areas with lower spatial variance than the other image difference bands, we anticipated that a composite image difference of the original reflectance bands 3, 5, and 7 would produce a satisfactory outcome. However, as Tables 7 and 8 show, the composite of image difference bands 3, 5, and 7 did not produce satisfactory overall accuracies with the unsupervised and supervised classifiers (70.83% and 78.33% respectively). Both outputs (Figures 25 and 26) were similar to the outputs from the previous image difference bands. They also show a lower noise level in non-damaged areas and contain fewer damaged areas.
The highest overall accuracy (98.33%) was produced by a composite of PC bands 3 and 4 using the object-oriented approach (Table 9). Producer's accuracy of non-damaged areas and user's accuracy of damaged areas reached 100%. User's accuracy of non-damaged areas and producer's accuracy of damaged areas also exceed 95%. The errors in user's accuracy of non-damaged areas and producer's accuracy of damaged areas arose because 2 points that should have been damaged areas fell on areas mistakenly identified as non-damaged in the output map. It can be observed from Figure 27 that not a single pixel in non-damaged areas was mistakenly classified as damaged. As mentioned earlier, we did not edit, manually correct, or filter any of the output maps produced in this study (Figures 19 to 27). The output map of the object-oriented approach from a composite of PC bands 3 and 4 is the original output generated by the nearest neighbor classifier.
Table 8. Overall accuracy, producer's accuracy, user's accuracy, and Kappa coefficient produced by a composite image difference of the original reflectance bands 3, 5, and 7 with a supervised classifier (i.e., maximum likelihood).

Figure 25. An output map of image difference bands 3, 5, and 7 using an unsupervised approach (i.e., ISODATA). Note: blue color represents non-damaged areas and red color represents damaged areas.
Table 8 summary: Overall Accuracy = 78.33%, Overall Kappa = 0.4956.
Figure 26. An output map of image difference bands 3, 5, and 7 using a supervised approach (i.e., maximum likelihood). Note: blue color represents non-damaged areas and red color represents damaged areas.
Table 9 summary: Overall Accuracy = 98.33%.
Figure 27. An output map of PC composite bands 3 and 4 using an object-oriented approach. Note: The output image was not manually edited or filtered.

Conclusion
We found that a composite image of PC bands 3 and 4 using a supervised approach (i.e., maximum likelihood) gave the highest overall accuracy among all the traditional classifiers with different composite bands. A composite image difference of the original reflectance bands 1, 2, 3, 4, 5, and 7 using the unsupervised and supervised classifiers was found to be the second most effective of all pre-classification change detection techniques. It can be observed from Figures 19 through 26 that there is significant signature confusion between areas damaged by the 3 May 1999 tornado and other areas that changed between the two time periods. The majority of the changed areas other than damaged areas between the two dates (26 June 1998 and 12 May 1999) were found to be changes from active to non-active agriculture and vice versa. To minimize this problem, the two images selected before and after a disaster event should be acquired within a short time frame whenever possible. For example, two images acquired within 10 days before and after a natural disaster can be expected to eliminate, or at least minimize, the signature confusion between damaged areas and other changed areas.
A composite image of PC bands 3 and 4 using the object-oriented approach with a nearest neighbor classifier gave the highest accuracy (98.33%). It can be concluded that the object-oriented approach outperforms the supervised and unsupervised approaches. The object-oriented approach allows quick selection or modification of objects (training samples) after each nearest neighbor classification until a satisfactory result is obtained. This is probably the key advantage of the object-oriented approach. Many combinations of functions, parameters, features, and variables are available. However, it should be noted that the exact computation and operation of many of the parameters and functions available in the eCognition software are not explicit. The successful use of eCognition relies largely on repeatedly modifying training objects, performing the classification, observing the output, and testing different combinations of functions in a trial-and-error manner. The availability of many different combinations of parameters, functions, features, and variables helped us identify damaged and non-damaged areas effectively. Nonetheless, we conclude that the object-oriented approach is effective and reliable in identifying areas damaged by a severe weather event.