Comparison of Supervised Classification Algorithms Using a Hyperspectral Image for Land Use/Land Cover Classification †

Abstract: Hyperspectral imaging is becoming popular in land use/land cover classification because of its ability to capture detailed information through higher spatial resolution and contiguous spectral bands. Using the hyperspectral image from G-LiHT (Goddard’s LiDAR, Hyperspectral


Introduction
The detailed mapping of land use/land cover change has advanced in recent decades with emerging satellite data based on multispectral or hyperspectral sensors. Remote sensing data have been used in land cover mapping since the 1960s. Detailed mapping assists in analyzing changes over time in various land use/land cover classes and in assessing risk at various scales [1]. This information plays a vital role in preserving ecologically sensitive areas and solving environmental issues. With increasing urbanization and land degradation, the significance of land use classification has grown [2].
Multispectral images are extensively used in image classification, particularly for land use/land cover classification [3]. With the rapid advancement in technology, however, hyperspectral images have recently come into use as well. Hyperspectral images provide spectral data for each pixel in numerous contiguous spectral bands, often covering a wide range of wavelengths [4]. These images also have high spectral resolution, which lets them capture detailed information about the spectral characteristics of the observed objects or surfaces.
Reliable image classification depends on the performance of the classification algorithm. For hyperspectral images specifically, high dimensionality and spectral mixing are the major challenges [5], and both can significantly affect the accuracy of classification results. Additionally, an inadequate amount of ground-truth data, as well as potential redundancy in hyperspectral images, adds complexity to the classification process [6]. Therefore, this study investigates the robustness of classification algorithms in handling spectral mixing and limited ground-truth information by comparing several image classification algorithms on a hyperspectral image.
Image classification is a process in which each individual pixel within the image is assigned to a discrete land use class [7]. The two most commonly used approaches are supervised classification and unsupervised classification. In supervised classification, the analyst knows the class labels or has a training dataset [8] before classifying the image.
Hence, this study aims to perform and compare supervised classifiers (SVM, SAM, and SID) for land use/land cover classification using a hyperspectral image.

Data
A hyperspectral image from G-LiHT (Goddard's LiDAR, Hyperspectral, and Thermal) was used for this research. The data were processed to generate standardized data products, including 1 m at-sensor reflectance hyperspectral imagery. The flight was conducted on 7 May 2015, in Stanton of Knoxville, Tennessee. The image, identified as UTK_7May2015_Stanton, has 119 spectral bands between 418 and 918 nm and was downloaded from the NASA G-LiHT website (https://glihtdata.gsfc.nasa.gov/, accessed on 3 September 2023). The hyperspectral imaging spectrometer was a Hyperspec model 1002A-00451 (Headwall Photonics) [9].

Data Pre-Processing, Training, and Testing Dataset
We preprocessed the hyperspectral image to reduce redundancy and noise. We used the Minimum Noise Fraction (MNF) transformation, a widely used technique that decorrelates spectral bands and reduces noise, effectively isolating the signal from undesirable variations [10]. Following this transformation, an eigenvalue analysis was conducted to determine the importance of each MNF component. MNF components with higher eigenvalues were prioritized, as they capture more detailed information about the land cover classes being analyzed. We then subset the image to 30 bands, discarding only the bands with low eigenvalues.
We used the spectral library (Figure 1), created through an in-field survey with a spectrometer, as the ground-truth or training dataset for image classification. Sample locations in the field were chosen by random sampling. A total of 60 samples of each land cover type (vegetation, grassland, built-up, bare soil, water) were recorded using a spectrometer. Of these, 40 samples were randomly chosen for the training dataset, and the remaining 20 samples were used for testing.
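The MNF step described above can be sketched in NumPy as noise whitening followed by a principal component analysis. This is only an illustrative sketch, not the software used in the study: the function name `mnf_transform`, the noise estimate from differences between horizontally adjacent pixels, and the small synthetic cube are all assumptions made for the example.

```python
import numpy as np

def mnf_transform(cube, n_components=30):
    """Sketch of an MNF transform for a (rows, cols, bands) cube.

    Returns the leading components and all eigenvalues (descending).
    """
    cube = np.asarray(cube, dtype=np.float64)
    rows, cols, bands = cube.shape
    pixels = cube.reshape(-1, bands)
    pixels = pixels - pixels.mean(axis=0)

    # Assumption: noise is estimated from differences between horizontally
    # adjacent pixels, which presumes the signal varies slowly in space.
    diff = (cube[:, 1:, :] - cube[:, :-1, :]).reshape(-1, bands)
    noise_cov = np.cov(diff, rowvar=False) / 2.0
    data_cov = np.cov(pixels, rowvar=False)

    # Step 1: whiten the noise so its covariance becomes the identity.
    noise_vals, noise_vecs = np.linalg.eigh(noise_cov)
    noise_vals = np.maximum(noise_vals, 1e-12)
    whitener = noise_vecs / np.sqrt(noise_vals)

    # Step 2: PCA of the noise-whitened data; eigenvalues rank the components.
    white_cov = whitener.T @ data_cov @ whitener
    eigvals, eigvecs = np.linalg.eigh(white_cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    projection = whitener @ eigvecs[:, :n_components]
    components = (pixels @ projection).reshape(rows, cols, n_components)
    return components, eigvals

# Synthetic 20 x 20 cube with 8 bands (hypothetical data, not G-LiHT).
rng = np.random.default_rng(0)
signal = np.linspace(0.0, 1.0, 20)[:, None, None] * np.linspace(1.0, 2.0, 8)
cube = signal + 0.1 * rng.normal(size=(20, 20, 8))
components, eigenvalues = mnf_transform(cube, n_components=3)
```

Inspecting `eigenvalues` is the analogue of the eigenvalue analysis in the text: components whose eigenvalues are close to 1 carry about as much noise as signal and are candidates for removal.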
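A random 40/20 split of this kind can be sketched as follows. The sketch assumes the split is made separately within each land cover class, which the text implies but does not state outright; the sample identifiers and the `split_samples` helper are hypothetical.

```python
import random

CLASSES = ["vegetation", "grassland", "built-up", "bare soil", "water"]

def split_samples(sample_ids, n_train=40, seed=0):
    """Randomly split sample identifiers into training and testing lists."""
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical identifiers 0..59 for the 60 field samples of each class.
splits = {
    name: split_samples(range(60), n_train=40, seed=i)
    for i, name in enumerate(CLASSES)
}
```

Fixing the seed makes the split reproducible, so that every classifier is trained and tested on the same samples.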

Spectral Angle Mapper (SAM)
SAM is particularly valuable for identifying and characterizing materials or objects within a scene based on their spectral signatures. SAM operates on the principle that the similarity between two spectra can be quantified by measuring the angle between them in a high-dimensional space, where each dimension corresponds to a spectral band or wavelength [11]. For this classification method, we tried various values of the maximum angle (in radians), and the best result was obtained with a value of 0.3.
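As a minimal sketch of this principle, the angle between a pixel spectrum and each reference spectrum can be computed from the normalized dot product, and a pixel is left unclassified when its smallest angle exceeds the maximum angle. The function names, the label -1 for unclassified pixels, and the toy three-band spectra are assumptions for illustration; the spectra must be non-zero.

```python
import numpy as np

def spectral_angles(pixels, references):
    """Angles in radians between pixels (n, bands) and references (k, bands)."""
    p = pixels / np.linalg.norm(pixels, axis=1, keepdims=True)
    r = references / np.linalg.norm(references, axis=1, keepdims=True)
    cosines = np.clip(p @ r.T, -1.0, 1.0)  # guard against rounding error
    return np.arccos(cosines)

def sam_classify(pixels, references, max_angle=0.3):
    """Assign each pixel to the closest reference; -1 means unclassified."""
    angles = spectral_angles(pixels, references)
    labels = angles.argmin(axis=1)
    labels[angles.min(axis=1) > max_angle] = -1
    return labels

# Toy three-band spectra (hypothetical values).
references = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]])
pixels = np.array([[2.0, 0.1, 0.0],   # close to reference 0
                   [0.0, 3.0, 0.2],   # close to reference 1
                   [1.0, 1.0, 1.0]])  # far from both
labels = sam_classify(pixels, references, max_angle=0.3)
```

Because the angle ignores vector length, the first pixel matches reference 0 even though it is twice as bright, which is why SAM is relatively insensitive to illumination differences.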

Spectral Information Divergence (SID)
SID is commonly used in hyperspectral image processing for tasks such as anomaly detection, target detection, and classification. It helps identify areas or objects in an image that deviate significantly from the expected spectral distribution, which can be useful in image classification [12]. For the SID algorithm, a maximum divergence threshold of 0.5 gave the best result.
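A minimal sketch of the measure, assuming the usual definition of SID as the symmetric relative entropy between two spectra normalized to sum to one: a pixel is assigned to the reference with the smallest divergence and left unclassified when that divergence exceeds the threshold. The function names, the label -1, and the toy spectra are hypothetical.

```python
import numpy as np

def sid(pixels, references, eps=1e-12):
    """Divergence between pixels (n, bands) and references (k, bands)."""
    p = np.clip(pixels, eps, None)
    q = np.clip(references, eps, None)
    p = p / p.sum(axis=1, keepdims=True)  # treat each spectrum as a distribution
    q = q / q.sum(axis=1, keepdims=True)
    p, q = p[:, None, :], q[None, :, :]
    # D(p||q) + D(q||p) simplifies to sum((p - q) * (log p - log q)).
    return np.sum((p - q) * (np.log(p) - np.log(q)), axis=2)

def sid_classify(pixels, references, max_divergence=0.5):
    """Assign each pixel to the closest reference; -1 means unclassified."""
    divergence = sid(pixels, references)
    labels = divergence.argmin(axis=1)
    labels[divergence.min(axis=1) > max_divergence] = -1
    return labels

# Toy three-band spectra (hypothetical values).
references = np.array([[0.8, 0.1, 0.1],
                       [0.1, 0.1, 0.8]])
pixels = np.array([[0.70, 0.20, 0.10],
                   [0.10, 0.20, 0.70],
                   [0.33, 0.34, 0.33]])
labels = sid_classify(pixels, references, max_divergence=0.5)
```

The third pixel has a nearly flat spectrum, so its divergence from both references is well above 0.5 and it stays unclassified.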

Support Vector Machine (SVM)
SVM is a supervised machine learning algorithm used for classification and is effective on high-dimensional images [13]. We used a radial basis function (RBF) kernel with a gamma value of 0.009 for SVM.
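The configuration can be illustrated with scikit-learn's `SVC`, whose RBF kernel is exp(-gamma * ||x - y||^2). The source does not name the software used, so scikit-learn is an assumption here, as is the synthetic data, which only mimics the shape of the problem (5 classes, 30 bands, 40 training and 20 testing samples per class).

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_classes, n_bands = 5, 30

def make_samples(n_per_class):
    """Synthetic spectra: one Gaussian cluster per class (hypothetical data)."""
    X = np.vstack([
        rng.normal(loc=3.0 * k, scale=1.0, size=(n_per_class, n_bands))
        for k in range(n_classes)
    ])
    y = np.repeat(np.arange(n_classes), n_per_class)
    return X, y

X_train, y_train = make_samples(40)
X_test, y_test = make_samples(20)

# RBF kernel with the gamma value reported in the text.
model = SVC(kernel="rbf", gamma=0.009)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)  # fraction of correct test labels
```

A small gamma such as 0.009 gives a wide kernel, so each training sample influences a large neighborhood and the decision boundary stays smooth.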

Accuracy Assessment
Accuracy assessment is a crucial step in land use/land cover classification because it validates the classified image. To evaluate the performance of the supervised classification algorithms, we used a confusion matrix, which tabulates the actual class labels against the predicted labels. The overall accuracy was calculated as
Overall accuracy = (Number of correctly classified pixels ÷ Total number of pixels) × 100
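The calculation can be sketched as follows: correctly classified pixels lie on the diagonal of the confusion matrix, so the overall accuracy is the trace divided by the total count. The function names and the ten example labels are hypothetical and are not results from this study.

```python
import numpy as np

def confusion_matrix(actual, predicted, n_classes):
    """Rows are actual classes, columns are predicted classes."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(matrix, (actual, predicted), 1)
    return matrix

def overall_accuracy(matrix):
    """Percentage of samples on the diagonal (correctly classified)."""
    return 100.0 * np.trace(matrix) / matrix.sum()

# Ten hypothetical test samples over five classes.
actual = np.array([0, 0, 1, 1, 2, 2, 2, 3, 4, 4])
predicted = np.array([0, 0, 1, 2, 2, 2, 2, 3, 4, 0])
matrix = confusion_matrix(actual, predicted, n_classes=5)
oa = overall_accuracy(matrix)
```

Here eight of the ten samples fall on the diagonal, so the overall accuracy is 80%; the two off-diagonal entries show which classes were confused with each other.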

Image Classification
The image classification results are shown in Figure 2. SAM appears to leave no-data (unclassified) values around the water bodies and built-up areas. SID removes these no-data values from the image: the forest and built-up areas are classified, although there seems to be some noise over the water bodies. Meanwhile, SVM performed well in detecting the land cover types and left no no-data values over the water bodies or the surrounding built-up areas.

Accuracy Assessment
The confusion matrices for the three classification algorithms (SVM, SID, and SAM) are presented in Tables 1-3.
SVM achieved the highest overall accuracy at 92.03%, followed by SAM at 91.23% and SID at 89.60%. The confusion matrices provide further insight into classification performance, detailing the distribution of true positives (correctly classified pixels), true negatives, false positives, and false negatives for each classifier. The high values on the diagonal of the confusion matrices indicate strong agreement between the predicted and actual class labels.

Discussion
The accuracy assessment results for the land use/land cover classification with the SVM, SAM, and SID classifiers are promising for the hyperspectral image. The accuracies achieved by all three classifiers indicate that they performed very well for detailed classification, with SVM the top performer at 92.03%. This result is consistent with comparative studies on the effectiveness of image classification algorithms, including SVM, SAM, and SID [14,15], which also concluded that SVM performs better than the other methods. Despite showing the highest accuracy, SVM is computationally intensive, especially with large datasets. Although its accuracy differed only negligibly from that of SVM, SAM has been reported to be well suited to capturing spectral similarity based on spectral angles [16].
A notable outcome of this research is that forest and water were consistently distinguishable across all of the classification schemes employed. This implies that the spectral signatures of these classes are distinct and easily discerned by the selected classifiers. In contrast, variability is seen in built-up areas. A further challenge in this research was shadows, particularly those cast by tall structures and trees. In many cases, these shadows create dark pixels within the image that can be incorrectly classified as water bodies.

Conclusions
After analyzing different supervised image classifiers, we found that SVM outperforms the other classifiers in accurately identifying land cover/land use classes and is also effective at handling high-dimensional data. SAM can also serve as a suitable method for detecting land cover/land use classes, as the difference between SAM and SVM was negligible. SID slightly mislabeled shadowed pixels in built-up areas and water bodies, whereas SVM handled such cases effectively. Overall, all three supervised classifiers proved effective in classifying remotely sensed data.

Figure 2. Supervised classification results using SAM, SID, and SVM (left to right) showing five (water, built-up, vegetation, grassland, bare soil) classes and unclassified labels.

Figure 2 .
Figure 2. Supervised classification results using SAM, SID, and SVM (left to right) showing five (water, built-up, vegetation, grassland, bare soil) classes and unclassified labels.

Table 1. Accuracy assessment of SVM.

Table 2. Accuracy assessment of SID.

Table 3. Accuracy assessment of SAM.