The Improvement of Land Cover Classification by Thermal Remote Sensing

Land cover classification has been widely investigated in remote sensing for agricultural, ecological and hydrological applications. Landsat images with multispectral bands are commonly used to compare classification methods and to improve classification accuracy. Thermal remote sensing provides valuable additional information, and here the effectiveness of the thermal bands in extracting land cover patterns is investigated. k-NN and Random Forest algorithms were applied to both a single Landsat 8 image and time series of Landsat 4/5 images for the Attert catchment in the Grand Duchy of Luxembourg. The classifiers were trained and validated with ground-truth reference data following the three-level classification scheme from COoRdination of INformation on the Environment (CORINE), using the 10-fold cross validation method. The accuracy assessment showed that, compared to the visible and near infrared (VIS/NIR) bands, the time series of thermal images alone can produce comparably reliable land cover maps, with a best overall accuracy of 98.7% to 99.1% for the Level 1 classification and 93.9% to 96.3% for the Level 2 classification. In addition, adding the thermal band improves the overall accuracy by 5% and 6% for the single Landsat 8 image in the Level 2 and Level 3 categories, and using all seven bands provides the best classification results for the time series of Landsat TM images.


Introduction
We very much appreciate Johnson's comments [1] on our recently published results [2] on the improvement of land cover classification by thermal remote sensing. In our original paper, we used a single-date Landsat 8 image and time series of Landsat 4/5 images to investigate the potential of thermal information for improving land cover classification in the Attert catchment in the Grand Duchy of Luxembourg [2]. The classification results were assessed by a 10-fold pixel-based cross-validation (CV) method, in which pixels were randomly selected and the overall accuracy (OA) was taken as the evaluation measure. We found that adding thermal bands to the multispectral bands can improve the accuracy of the land cover classification. Based on the accuracy assessment, we also concluded that the time series of thermal images alone produced classification results similar to those of all other VIS/NIR and TIR band combinations.
In his comment, Johnson pointed out that the high accuracy values obtained with the thermal images were likely caused by an overestimation inherent in the pixel-based CV method [1]. He also recommended performing the CV at the region-of-interest (ROI)/polygon level, thereby avoiding spatial autocorrelation between training and validation pixels when multi-resolution data are used.
We acknowledge the various arguments around the accuracy assessment of multi-scale remotely sensed images [3][4][5] and the numerous data fusion methods [6,7]. While our experiments focused on evaluating the thermal bands against the other multispectral variables, we neglected the assessment problems of the random sampling procedure that are related to the scale issues of the resampled thermal images. In this response, we have carefully taken Johnson's suggestion into account and re-evaluated all our original land cover classifications with the polygon-based CV method.
In the following sections, we briefly introduce the new assessment procedure and show the classification results for both the single-date and the time series applications. While we briefly introduce the data used, a detailed description of the images and the classification algorithms can be found in [2].

Polygon-Based Cross Validation Method
Based on the suggestion of [1], we added a polygon-based CV analysis to our study. Training and validation samples were collected using the area of interest tool of the drawing toolbox in the ERDAS software (ERDAS, Inc., Atlanta, GA, USA); the polygons are the same as in the original paper [2]. However, to make sure that the pixels from each polygon are as pure as possible, pixels were kept only when their central points lie inside the polygon; the polygons were then refined. This leads to slightly smaller sample sizes than in the original paper. As we are not aiming to evaluate the effect of the training size on the classification accuracy, the comparison between the pixel-based and the polygon-based CV methods was conducted using samples of the same size (pixels with their central point inside the polygon).
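The pixel selection rule described above (keep a pixel only if its central point lies inside the polygon) can be illustrated with a minimal sketch. This is not the ERDAS procedure used in the study; the ray-casting test, the function names and the grid parameters (`origin`, `pixel_size`) are assumptions introduced purely for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(pt: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: True if pt lies inside the polygon (vertices in order)."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def pixels_with_centre_inside(polygon: List[Point], origin: Point,
                              pixel_size: float, n_rows: int,
                              n_cols: int) -> List[Tuple[int, int]]:
    """Return (row, col) of all grid pixels whose centre lies inside the polygon.

    origin is the (x, y) map coordinate of the upper-left grid corner;
    y decreases with increasing row index (hypothetical grid layout).
    """
    x0, y0 = origin
    kept = []
    for r in range(n_rows):
        for c in range(n_cols):
            cx = x0 + (c + 0.5) * pixel_size
            cy = y0 - (r + 0.5) * pixel_size
            if point_in_polygon((cx, cy), polygon):
                kept.append((r, c))
    return kept


# Hypothetical 3 x 3 grid of 30 m pixels and a square training polygon
square = [(10.0, 10.0), (70.0, 10.0), (70.0, 70.0), (10.0, 70.0)]
print(pixels_with_centre_inside(square, (0.0, 90.0), 30.0, 3, 3))
```

Pixels that are only partly covered by the polygon but whose centre falls outside it are dropped, which is why the sample sizes become slightly smaller than with a looser overlap rule.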
For all land cover categories (except for water bodies), more than 10 polygons of ground truth data exist, and there is little difference in size among the polygons. For the 10-fold polygon-based cross validation, all polygons were split into 10 smaller sets and reorganized into 10 groups of data. The classification model was trained using pixels from nine groups of polygons (training data); the resulting model was then validated on the remaining group (validation data), and this procedure was repeated 10 times so that each group was used for validation once. In this way, correlation between calibration and validation data caused by the multi-resolution imagery is avoided.
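The splitting scheme described above can be sketched as follows. This is a simplified illustration, not the code used in the study: the function names, the random seed and the round-robin assignment are assumptions, and the sketch assigns polygons to folds without regard to land cover class, whereas the study may have balanced the groups per category.

```python
import random
from typing import Dict, Iterator, List, Sequence, Tuple


def polygon_folds(polygon_ids: Sequence[int], n_folds: int = 10,
                  seed: int = 0) -> Dict[int, int]:
    """Assign every distinct polygon to exactly one of n_folds groups."""
    unique = sorted(set(polygon_ids))
    random.Random(seed).shuffle(unique)
    # Round-robin assignment keeps the groups nearly equal in polygon count.
    return {pid: i % n_folds for i, pid in enumerate(unique)}


def polygon_cv_splits(polygon_ids: Sequence[int], n_folds: int = 10,
                      seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, valid_idx) lists of pixel indices, one pair per fold.

    polygon_ids[i] is the polygon that pixel i was sampled from, so all
    pixels of one polygon always end up on the same side of the split.
    """
    fold_of = polygon_folds(polygon_ids, n_folds, seed)
    for k in range(n_folds):
        train = [i for i, pid in enumerate(polygon_ids) if fold_of[pid] != k]
        valid = [i for i, pid in enumerate(polygon_ids) if fold_of[pid] == k]
        yield train, valid


# Hypothetical sample: 25 polygons with 4 pixels each
polygon_ids = [p for p in range(25) for _ in range(4)]
for train_idx, valid_idx in polygon_cv_splits(polygon_ids):
    train_polygons = {polygon_ids[i] for i in train_idx}
    valid_polygons = {polygon_ids[i] for i in valid_idx}
    # No polygon contributes pixels to both training and validation.
    assert train_polygons.isdisjoint(valid_polygons)
```

In a pixel-based CV the same loop would shuffle pixel indices instead of polygon identifiers, so neighbouring pixels of one polygon could appear in both the training and the validation set.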

Results and Discussion
In this section, the accuracy statistics of the land cover classification obtained with the original pixel-based CV method and with the polygon-based CV method are compared. The input data (a single-date Landsat 8 image from 21 July 2013, as well as time series of Landsat 4/5 images) and the classification methods (Random Forest and k-NN) are the same as in the original paper [2].

Three-Level Classification Based on the Single-Date Landsat 8 Image (S1)
In the first part, three variants of spectral band combinations of the single Landsat 8 image are used as input data: (i) Bands4, only considering bands 2 to 5 in the VIS/NIR spectral region, without thermal bands; (ii) Bands6T, as Bands4, but adding the two thermal bands 10 and 11; and (iii) Bands10T, including all bands except the panchromatic band.
Table 1 summarizes the accuracy evaluation results for both the polygon-based and the pixel-based CV methods. The OA values for the classification of the Level 1 land cover categories obtained by the polygon-based CV are almost identical to those of the pixel-based evaluation for the three variants (97% to 98%). The OA values of the Level 2 and Level 3 categories decreased by about 5% and 9%, respectively, when switching to the polygon-based CV method. However, the increase in performance when adding the thermal bands to the classification is still pronounced, and the conclusions drawn in our original paper still hold.
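For reference, the OA of one validation fold is the share of correctly classified validation pixels, and a mean OA can be obtained by averaging over the 10 folds. The sketch below assumes that the reported mean is a simple average of per-fold OA values; the function names and the confusion matrices are hypothetical and do not reproduce any value from Table 1.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[int]]


def overall_accuracy(confusion: Matrix) -> float:
    """OA = correctly classified pixels (diagonal) / all validation pixels."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def mean_overall_accuracy(fold_matrices: List[Matrix]) -> float:
    """Average the per-fold OA values (assumed aggregation, see lead-in)."""
    values = [overall_accuracy(m) for m in fold_matrices]
    return sum(values) / len(values)


# Two hypothetical folds for a two-class problem (rows: reference, columns: predicted)
fold_a = [[50, 2], [3, 45]]   # 95 of 100 pixels correct
fold_b = [[40, 5], [5, 50]]   # 90 of 100 pixels correct
print(overall_accuracy(fold_a), mean_overall_accuracy([fold_a, fold_b]))
```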

Table 1.
The mean values of overall accuracy (OA) calculated by a polygon-based 10-fold cross validation (CV) method for the three variants from Landsat 8 in 2013 classified by k-NN and Random Forest: Bands4, only considering bands 2 to 5 in the visible and near infrared spectral region without thermal bands; Bands6T, as Bands4, but adding the two thermal bands 10 and 11; and Bands10T, including all bands except the panchromatic band (see [2] for a detailed channel/band description). k-NN5 and RF represent the nearest neighbor method with k = 5 and Random Forest, respectively.

Two-Level Classification Based on Time Series of Images (TS1 and TS2)
The second part of the analysis focuses on the differences between the polygon-based and pixel-based CV methods when time series of Landsat 4/5 images are used as input to the land cover classification system (Tables 2 and 3). The time series consist of two groups: TS1, including 7 images between 1984 and 1990, and TS2, including 6 images from 2006 to 2011. Five variants of the time series were analyzed, with the following band combinations: B3B4, the combination of band 3 and band 4 of Landsat TM; 3PC, the first three principal components of all VIS/NIR bands; 6Bands, all bands except the thermal band; Thermal, only the single thermal band; and 7Bands, the combination of all 7 Landsat 4/5 bands.

Table 2.
Overall Accuracy of Level 1 classification assessed by the polygon-based CV method using five variants of time series of images: B3B4, the combination of band 3 and band 4; 3PC, the first three principal components of all bands; 6Bands, the combination of six bands; Thermal, the thermal band; and 7Bands, the combination of all seven bands.

The best OA value from the pixel-based CV is almost the same as in the original paper; the details can be found in the Supplementary Materials (Tables S1 and S2). For simplicity, we only list the OA values of the polygon-based CV for the Level 1 and Level 2 land cover classifications here. Table 2 shows the OA values of the polygon-based CV for the classification of the Level 1 land cover categories, whereby the images within each group (TS1, TS2) are added successively as input. The classification performance increases with an increasing number of images for all variants, a behavior that was already shown in our original study using a random pixel-based CV. Also, the 7Bands variant, which includes the thermal band, still achieved the best overall performance, especially when a smaller number of images was included. Using the full set of available images, all variants performed almost equally well, with OA values of 96% to 98.5%. When only the thermal band is used, the classification performance is reduced by 4.5% compared to the pixel-based CV in our original paper. This demonstrates the overestimation of performance that occurs when the multi-resolution calibration and validation data are correlated, as commented by Johnson [1].
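The band variants and the successive addition of images can be made concrete with a small sketch. It assumes that the per-pixel features of the selected bands are simply concatenated over the acquisition dates; the original paper [2] should be consulted for the actual feature construction. The band numbers follow the Landsat TM convention, in which band 6 is the thermal band. The 3PC variant is omitted because it requires a principal component transform, and all names and pixel values below are hypothetical.

```python
from typing import Dict, List

# Landsat 4/5 TM band numbers per variant (band 6 is the thermal band)
TM_VARIANTS: Dict[str, List[int]] = {
    "B3B4": [3, 4],
    "6Bands": [1, 2, 3, 4, 5, 7],
    "Thermal": [6],
    "7Bands": [1, 2, 3, 4, 5, 6, 7],
}


def feature_vector(pixel_series: List[Dict[int, float]], variant: str,
                   n_images: int) -> List[float]:
    """Concatenate the selected bands of the first n_images acquisitions.

    pixel_series holds, for one pixel, one {band number: value} dict per
    acquisition date in chronological order (assumed input layout).
    """
    bands = TM_VARIANTS[variant]
    return [image[b] for image in pixel_series[:n_images] for b in bands]


# Hypothetical pixel observed in the 7 images of a series such as TS1
series = [{b: 100.0 * t + b for b in range(1, 8)} for t in range(7)]
for n in range(1, 8):
    # The feature count grows with every image that is added.
    print(n, len(feature_vector(series, "Thermal", n)),
          len(feature_vector(series, "7Bands", n)))
```

Under this assumption the Thermal variant contributes one feature per image, whereas 7Bands contributes seven, which is one reason why the variants differ most when only a few images are available.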
The differences between the pixel-based and the polygon-based 10-fold CV methods are summarized in Figure 1. In the left image of Figure 1, the OA values of all five variants from TS1 are summarized in a single box-plot for each time step. On average, the polygon-based CV method clearly produced significantly lower OA values for both time series (TS1, TS2), again supporting the issue raised in Johnson's comment. The two variants 6Bands and 7Bands were selected to show the detailed variation for the two methods in the right image of Figure 1. Despite the lower OA in comparison to the pixel-based CV method, the polygon-based CV still produced a higher average OA for 7Bands than for 6Bands without the thermal band (the right image in Figure 1).
When this analysis is repeated for the classification of the Level 2 land cover categories, the differences in the performance measure (OA) between the two CV methods are even more pronounced. Table 3 summarizes the classification results. The average OA values are generally lower, as we analyze more specific land cover categories. The best OA values for TS1 and TS2 are 86.6% and 93.3%, respectively, when the full set of images is included. The 7Bands variant, which includes the thermal band, still achieved the best OA value of 86.6% for TS1, which is 10.9% lower than the 97.5% from the pixel-based CV method. The best OA of TS2 from the polygon-based CV method is about 5% lower than the corresponding value for the pixel-based CV method. Again, the Thermal variant, which includes only the single thermal band, showed only a relatively weak performance, with OA values of 72% and 74.5% for the two time series, compared to 96.8% and 95.8% for the pixel-based CV method. Figure 2, taking TS2 as an example, summarizes the differences between the pixel-based and the polygon-based CV methods in the left image and displays the variation of the selected 6Bands and 7Bands variants in the right image, again supporting the issue raised by [1].

Conclusions
In this response to [1], a polygon-based CV method, as suggested by [1], was applied to evaluate a land cover classification for three different levels of land cover categories. The classification was based on (i) a single-date Landsat 8 image; and (ii) time series of Landsat 4/5 images. The performance of the classification results using the polygon-based CV was compared to that of the pixel-based CV method applied in our original study [2].
For the single-date Landsat 8 image, the polygon-based method achieved similar accuracy values to the pixel-based method for all three levels of land cover categories and for both classification methods used. For the time series of images, five different variants of band combinations (see Table 2), with and without thermal information, were considered.
The accuracy of the Level 1 classification decreased, but remained at an acceptable and useful level when compared to the commonly recommended standard of 85% [8] (the best OA of Thermal is 94.6% and the best OA of 7Bands is 98.4%). The most obvious decline in performance is observed in the classification results for the Level 2 categories, for which the best OA among the five variants is 93.3%, and only 74.5% when only the thermal band is used.
Consistent with our former findings, the inclusion of the thermal bands still improved the land cover classification in comparison to using only the VIS/NIR bands, also when the classification results are assessed with a polygon-based CV approach. This has also been shown by other researchers: Eisavi et al. [9] applied the random forest classifier to multi-temporal spectral and thermal features in land cover classification and found that the contribution of multi-temporal thermal information led to a considerable increase in accuracy. When using time series of thermal images to classify land cover at Level 2, the OA values were significantly lower for the polygon-based CV than for the pixel-based evaluation for all band combinations considered. Again, the inclusion of thermal information improved the classification results on various levels.
In summary, a clear effect of correlation between the calibration and validation samples due to multi-resolution data could be observed here. Classification accuracy (OA) was highly overestimated when correlation effects were ignored in the selection of calibration and validation data for time series of images. We therefore strongly support the comments made by Johnson and his recommendation concerning the appropriate choice of CV methods. However, the choice of the CV method did not change our original conclusion that the inclusion of thermal data in the classification process can significantly improve classification results.

Figure 1. The distribution of OA values for the Level 1 land cover category classification using time series TS1, the polygon-based and pixel-based 10-fold cross validation methods and the Random Forest method. (Left): all variants (Table 2) are summarized in a single box-plot; (Right): OA comparison of the selected variants 6Bands without thermal band and 7Bands with thermal band.

Figure 2. The distribution of OA values for the Level 2 land cover category classification using time series TS2, the polygon-based and pixel-based 10-fold cross validation methods and the Random Forest method. (Left): all variants (Table 3) are summarized in a single box-plot; (Right): OA comparison of the selected variants 6Bands without thermal band and 7Bands with thermal band.

Table 3.
Overall Accuracy of Level 2 classification assessed by the polygon-based CV method using five variants of time series of images: B3B4, the combination of band 3 and band 4; 3PC, the first three principal components of all bands; 6Bands, the combination of six bands; Thermal, the thermal band; and 7Bands, the combination of all seven bands.