Novel Multi-Scale Filter Profile-Based Framework for VHR Remote Sensing Image Classification

Abstract: Filtering is a well-known tool for reducing noise in very high spatial resolution (VHR) remote sensing images. However, a single-scale filter is usually limited in covering the various targets of different sizes and shapes in a given image scene. To address this problem and improve the classification performance for VHR remote sensing images, this study introduces a novel multi-scale filter profile (MFP)-based framework (MFPF). First, an adaptive filter is extended with a series of parameters for MFP construction. Then, a layer-stacking technique is used to concatenate the MFPs and all the features into a stacked vector. Afterward, principal component analysis, a classical dimensionality reduction algorithm, is performed on the fused profiles to reduce the redundancy of the stacked vector. Finally, an initial classification map is obtained through a supervised classifier, and the spatial adaptive region of each filter in the MFPs is used to post-process this initial map and generate the final classification map. Experimental results on three real VHR remote sensing images demonstrate the effectiveness of the proposed MFPF in comparison with state-of-the-art methods. Hard-tuning of parameters is unnecessary in the proposed approach, so the method can be conveniently used in real applications.


Introduction
A remote sensing image with very high resolution (VHR) has an improved visual appearance over imagery of conventional resolution [1]. In a VHR image, each pixel covers a smaller ground area than a pixel in a low- or medium-resolution image. Therefore, a VHR remote sensing image can capture and describe the details of ground targets in terms of size and shape. Such imagery plays an important role in various practical applications, such as land cover classification [2][3][4], target recognition [5][6][7], and change detection [8,9]. However, VHR remote sensing images usually have insufficient spectral information; that is, they have few spectral bands. Most high-resolution remote sensing sensors provide no more than eight spectral bands. For example, a QuickBird satellite image contains four spectral bands, and a WorldView-3 image has only eight multi-spectral bands [10,11]. This situation is attributed to the physical trade-off between the spatial and spectral resolutions of remote sensors. Thus, the high geometric resolution and limited spectral information produce the Hughes phenomenon and cause considerable noise in the classification map [12][13][14][15][16].
The remainder of this paper is organized as follows. Section II discusses the details of the proposed approach. Section III provides the experiments and comparisons. Finally, Section IV presents the conclusion drawn from this study.

Proposed Method
This section introduces the details of the proposed MFPF for classification with VHR remote sensing images. Figure 1 illustrates that the proposed MFPF comprises the following parts. First, MFPs are developed, and a layer-stacking technique is used to fuse the multi-scale filtered image profiles. Then, principal component analysis (PCA) is adopted to reduce redundancy. Finally, an initial classification map is acquired through a supervised classifier, and a post-classification method is proposed on the basis of the MFs. Each part is detailed as follows.

Generation of MFPs
In the previous method (MMF) [23], parameter $T_1$ is the measured spectral difference between the central pixel and its neighbors, and $T_2$ is the size of an adaptive extension. The classification accuracy is insensitive to $T_2$ but sensitive to $T_1$, and selecting $T_1$ for a given image is time consuming and depends on the practitioner's experience. The idea of parameter serialization is introduced in this study to overcome this limitation of the previous MMF and avoid the hard-tuning of $T_1$ for the classification of different datasets.
In the proposed MFPF, $T_1$ is serialized with five values under a fixed $T_2$, that is, $T_1 = \{S_1, S_2, S_3, S_4, S_5\}$, where $S_k$ denotes the k-th value of $T_1$. The adaptive extension algorithm in the previous study [23] indicates that different values of $T_1$ with a fixed $T_2$ create extended regions with various shapes. This phenomenon occurs because different $T_1$ values change the searching direction in the extension process, and different searching directions affect the shape of the adaptive extension region. Therefore, the multi-scale feature of a target is described and utilized when $T_1$ is set to different values. Herein, I denotes a VHR remote sensing image with R-G-B false color bands. To construct the MFPs, each band of the given image I is smoothed with the mean of the adaptive extended region obtained with a pair of $T_1$ and $T_2$. Thus, the MFPs can be written as
$$\mathrm{MFPs} = \{I_{S_1}, I_{S_2}, I_{S_3}, I_{S_4}, I_{S_5}\}, \quad (1)$$
where $I_{S_1}$ is the image filtered with $T_1 = S_1$ and with $T_2$ fixed at a pre-provided value. Each filtered image consists of three band profiles, $I_{S_1} = \{I_{S_1}^{r}, I_{S_1}^{g}, I_{S_1}^{b}\}$, where r, g, and b denote the red, green, and blue bands, respectively. Thus, $I_{S_1}^{r}$ is the filtered image profile for the red band. In short, the MFPs are the filtered profiles developed by smoothing the raw image band-by-band with the adaptive extension algorithm using different values of $T_1$ and a fixed value of $T_2$.
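The construction described above can be illustrated with a minimal Python sketch for a single band stored as a nested list. The growth rule used here (4-connected neighbors are accepted when their value differs from the central pixel by less than T1, and growth stops once the region holds T2 pixels) is a simplifying assumption made for illustration; it is not the exact adaptive extension algorithm of [23]. The function names `adaptive_region`, `adaptive_mean_filter`, and `build_mfps` are hypothetical.

```python
from collections import deque


def adaptive_region(band, i, j, t1, t2):
    """Grow a region around (i, j).

    Assumed rule (a simplification, not the algorithm of [23]): a
    4-connected neighbour joins the region if its value differs from the
    central pixel by less than t1; growth stops at t2 pixels.
    """
    rows, cols = len(band), len(band[0])
    centre = band[i][j]
    region = [(i, j)]
    seen = {(i, j)}
    queue = deque([(i, j)])
    while queue and len(region) < t2:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if len(region) >= t2:
                break
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in seen:
                seen.add((nr, nc))
                if abs(band[nr][nc] - centre) < t1:
                    region.append((nr, nc))
                    queue.append((nr, nc))
    return region


def adaptive_mean_filter(band, t1, t2):
    """Replace every pixel with the mean of its adaptive region."""
    filtered = []
    for i in range(len(band)):
        row = []
        for j in range(len(band[0])):
            region = adaptive_region(band, i, j, t1, t2)
            row.append(sum(band[r][c] for r, c in region) / len(region))
        filtered.append(row)
    return filtered


def build_mfps(band, t1_values=(10, 15, 20, 25, 30), t2=100):
    """One filtered profile per T1 value, with T2 held fixed."""
    return [adaptive_mean_filter(band, t1, t2) for t1 in t1_values]


band = [[10, 12, 50],
        [11, 13, 52],
        [10, 12, 51]]
smoothed = adaptive_mean_filter(band, t1=5, t2=100)
profiles = build_mfps(band)
```

In this toy band, the bright right-hand column is never averaged with the darker pixels when T1 = 5, which mirrors the edge-preserving behavior the text attributes to the adaptive region. For an R-G-B image, the same function would be applied to each band separately.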

Reduction of Redundancy Through PCA
MFPs can capture the multi-scale features of a ground target but are redundant in describing it. In this section, the classical PCA method is used to reduce this redundancy. Before PCA is applied, the layer-stacking tool embedded in the ENVI 5.2 professional software is adopted to fuse the MFPs into a high-dimensional vector.
The details of PCA are not introduced in this study because it is a classical method that has been successfully applied to many image-processing cases [39][40][41]. In the proposed MFPF, the PCA tool integrated into the ENVI 5.2 software is utilized to reduce the dimension of the high-dimensional vector. Figure 2 depicts an observation that further demonstrates the capability of the proposed approach to improve intra-class homogeneity and preserve the edges between different classes. The observations and comparison results show that the proposed approach achieves the smallest standard deviation for each band. The boundaries of the ground targets are preserved, as indicated by the yellow arrows in the figure. The reduced-dimension feature vector is then used as the input feature of a supervised classifier, such as a support vector machine (SVM), to acquire the initial classification map.
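For readers who want the stacking and reduction step written out, the following is a sketch of standard covariance-based PCA applied to the stacked profiles. It assumes five scales and three bands, which gives a 15-dimensional vector per pixel, and keeps the top three components, as stated in the Conclusions. The symbols $\mathbf{x}_{ij}$, $\boldsymbol{\mu}$, $\boldsymbol{\Sigma}$, $\mathbf{w}_q$, and $\mathbf{y}_{ij}$ are introduced here for illustration, and the exact variant implemented by the ENVI tool (for example, covariance versus correlation matrix) is not specified in this study.

```latex
% Stacked MFP vector at pixel (i, j): 5 scales x 3 bands = 15 entries
\mathbf{x}_{ij} = \left[ I_{S_1}^{r}(i,j),\, I_{S_1}^{g}(i,j),\, I_{S_1}^{b}(i,j),\, \dots,\, I_{S_5}^{b}(i,j) \right]^{\top} \in \mathbb{R}^{15}

% Mean and covariance over all N pixels of the image
\boldsymbol{\mu} = \frac{1}{N} \sum_{i,j} \mathbf{x}_{ij}, \qquad
\boldsymbol{\Sigma} = \frac{1}{N} \sum_{i,j} (\mathbf{x}_{ij} - \boldsymbol{\mu})(\mathbf{x}_{ij} - \boldsymbol{\mu})^{\top}

% Eigen-decomposition with eigenvalues sorted in descending order
\boldsymbol{\Sigma}\, \mathbf{w}_q = \lambda_q \mathbf{w}_q, \qquad
\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_{15} \ge 0

% Projection onto the top three principal components (classifier input)
\mathbf{y}_{ij} = \left[ \mathbf{w}_1,\, \mathbf{w}_2,\, \mathbf{w}_3 \right]^{\top} (\mathbf{x}_{ij} - \boldsymbol{\mu}) \in \mathbb{R}^{3}
```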

Post-Processing Classification with MFs
As previously mentioned, each filter in the applied MFs has a corresponding adaptive spatial region. In this section, the label of each pixel in the initial classification map is revised in accordance with the labels of its neighbors through a post-processing classification approach based on the MFPs. First, the labels of each class around a pixel at position (i,j) are counted. If $N_{ij}^{C_m}$ denotes the total number of labels for the specific class $C_m$, then $N_{ij}^{C_m}$ is obtained using Equation (2). Second, the label of the central pixel at position (i,j) is revised to the class that appears most frequently in the multi-scale adaptive regions, as expressed in Equation (3); that is, the label $L_{ij}$ of the pixel at (i,j) is determined by the most frequent class.
$$N_{ij}^{C_m} = \sum_{n=1}^{k} R_{S_n}^{C_m}, \quad (2)$$
$$L_{ij} = \arg\max_{C_m,\; m = 1, \dots, M} N_{ij}^{C_m}, \quad (3)$$
where $R_{S_n}^{C_m}$ is the number of pixels of class $C_m$ within the adaptive region at scale $S_n$, k is the total number of scale parameters, which equals the number of different values of $T_1$, M is the total number of classes for a given image scene, and m is the index of the counted class.
This post-processing with multi-scale adaptive regions can further smooth intra-class noise and preserve the boundaries between different classes. This is because the pixels that compose a target are generally homogeneous in spectra and continuous in the spatial domain. The shape and size of a target in a remote sensing image are unknown before classification. Thus, using the multi-scale adaptive regions of the MFs to post-process the initial classification map helps smooth noise and improve classification performance, as demonstrated by the experiments in this study.
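The label revision expressed by Equations (2) and (3) amounts to a majority vote over the adaptive regions of all scales. The following minimal Python sketch illustrates it for one pixel. It assumes that the adaptive region of each scale is already available as a list of (row, column) positions; the function name `revise_label` and the toy data are hypothetical, and the handling of ties (the first class encountered wins) is an assumption that the text does not address.

```python
from collections import Counter


def revise_label(label_map, regions):
    """Return the most frequent class over all multi-scale regions.

    label_map: initial classification map as a nested list of class ids.
    regions:   one list of (row, col) positions per scale, all grown
               around the same central pixel.
    """
    votes = Counter()
    for region in regions:          # sum over the scales, as in Equation (2)
        for r, c in region:
            votes[label_map[r][c]] += 1
    # class with the largest count, as in Equation (3)
    return votes.most_common(1)[0][0]


initial_map = [[1, 1, 2],
               [1, 2, 2],
               [1, 1, 1]]
# Hypothetical adaptive regions of pixel (1, 1) at two scales.
regions_at_centre = [
    [(1, 1), (0, 1), (1, 0)],
    [(1, 1), (0, 1), (1, 0), (2, 1), (1, 2)],
]
new_label = revise_label(initial_map, regions_at_centre)
```

Here the central pixel is initially labeled 2, but class 1 collects five votes against three across the two scales, so the label is revised to 1.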

Experiments
In this section, three experiments are conducted with three real VHR remote sensing images to verify the effectiveness and superiority of the proposed MFPF. The experimental details are presented as follows.

Datasets
(1) JX01: This dataset was acquired using a Canon EOS 5D Mark II camera mounted on an unmanned aerial vehicle (UAV) platform. The flight height is approximately 100 m, and the spatial resolution is 0.1 m/pixel. This image covers a countryside scene in Jiang Xi City, China, and the scene size is 1400 × 1000 pixels. Figure 3 exhibits a typical area in the countryside of China. Seven classes, including roads, grass, buildings, shadow, trees, soil, and water, were identified in this scene.

Experimental Setting
Three experiments were conducted to test the effectiveness of the proposed MFPF for classifying VHR remote sensing images.
(1) In the first experiment, the JX01 dataset is classified using the SVM classifier, which is an extensively used pixel-wise classifier. The proposed MFPF is compared with the classification of the raw image and of the images filtered with MF, MedF, and MMF [23].
(2) In the second experiment, three classical and extensively used supervised classifiers, namely, SVM, k-nearest neighbor (KNN), and Random Tree (RT), are adopted to classify the Pavia University image scene. This experiment tests the robustness of the proposed approach with respect to different classifiers by comparing the different filters under each of the three classifiers.
(3) In the third experiment, the ZH-6 dataset is classified using SVM, and the proposed MFPF is compared with state-of-the-art spatial-spectral feature-based methods, namely, EPFs, RFs, M_EMPs, and RGF.
In all the experiments, the parameters of the proposed MFPF are fixed at $T_1 = \{10, 15, 20, 25, 30\}$ and $T_2 = 100$ without hard-tuning. Three extensively used indexes, namely, overall accuracy (OA), average accuracy (AA), and the kappa coefficient (Ka), are used to evaluate performance. OA is the number of correctly classified samples divided by the total number of test samples, AA is the mean of the per-class accuracies, and Ka measures the agreement between the classified result and the reference data beyond that expected by chance [42]. The parameters of each compared approach are optimized through trial and error for each dataset to guarantee a fair comparison.
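The three indexes can be computed from a confusion matrix, as in the minimal Python sketch below. It uses the standard definitions of OA, AA, and Cohen's kappa; the function name `accuracy_indexes` and the two-class example matrix are hypothetical and are not taken from the experiments of this study.

```python
def accuracy_indexes(confusion):
    """Compute OA, AA, and kappa from a square confusion matrix.

    confusion[i][j] is the number of test samples whose true class is i
    and whose predicted class is j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    correct = sum(confusion[i][i] for i in range(k))

    oa = correct / total
    # mean of the per-class accuracies (each class weighted equally)
    aa = sum(confusion[i][i] / row_sums[i] for i in range(k)) / k
    # agreement expected by chance from the row and column marginals
    expected = sum(row_sums[i] * col_sums[i] for i in range(k)) / (total * total)
    kappa = (oa - expected) / (1 - expected)
    return oa, aa, kappa


example = [[40, 10],
           [5, 45]]
oa, aa, kappa = accuracy_indexes(example)
```

For this example matrix, OA and AA are both 0.85 and kappa is 0.70, because the chance agreement computed from the marginals is 0.5.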

Experimental Results
In the first experiment based on the JX01 dataset, Table 1 shows that each filter improves the classification accuracies in comparison with those of the raw image without any filter processing. For example, the improvements of OA, AA, and Ka are 4.12%, 2.99%, and 0.052, respectively, compared with those of the traditional MF method. The per-class results for the different filters show that the proposed MFPF achieves the best accuracy for most classes, and the best OA, AA, and Ka are also obtained by the proposed MFPF. The visual comparison presented in Figure 6 further supports this conclusion.
The second experiment is conducted on the Pavia University dataset to further demonstrate the effectiveness and adaptability of the proposed MFPF. The experimental results are as follows. (1) The proposed MFPF achieves the best OA, AA, and Ka on the Pavia University image in comparison with traditional filters, such as the MF, MedF, and previous MMF. Table 2 shows that the OA and AA of the proposed MFPF are 94.18% and 94.14%, respectively, for the KNN classifier, whereas those of MMF are 83.92% and 81.38%. The proposed MFPF thus improves OA and AA by 10.26% and 12.76%, respectively, over MMF. The comparison details for the KNN classifier are summarized in Table 3 to demonstrate the superiority of the proposed MFPF. (2) The results based on the different classifiers and filters imply that the proposed MFPF is effective and robust for the SVM, KNN, and RT classifiers. The proposed MFPF achieves the best accuracies for each classifier in comparison with those of the raw image without any processing and the images filtered with MF, MedF, and MMF.
The visual results for the KNN, SVM, and RT classifiers are illustrated in Figures 7-9, respectively. The visual comparison shows that the classification map obtained through the proposed MFPF contains minimal noise and that the intra-class homogeneity, such as that of the meadow highlighted by the rectangle in Figures 7-9, is improved. The boundaries of ground targets are also well preserved in the classification map obtained through the proposed MFPF.
In the third experiment on the ZH-6 dataset, the proposed MFPF is compared with state-of-the-art spatial-spectral feature-based methods. Table 4 summarizes the quantitative comparison. The results show that the spatial-spectral features improve the classification accuracies by approximately 5% in comparison with those of the raw image without any processing. Moreover, the proposed MFPF not only achieves the best accuracy for most classes but also obtains the highest OA, AA, and Ka among the compared methods. The visual performance depicted in Figure 10 supports this quantitative conclusion.
Note: The bold texts in each row correspond to the optimal accuracies of the comparisons.
Figure 10. Classification maps of the ZH-6 dataset obtained through SVM with spatial-spectral feature images and the proposed methods: (a) raw image without filter; (b) EPFs [23]; (c) RFs [24]; (d) M_EMPs [22]; (e) RGF [25]; (f) proposed MFPF.
The experimental results based on the three real VHR remote sensing images support the following conclusions about the proposed MFPF. First, the proposed approach, as a novel image filter, is effective in smoothing the noise of VHR remote sensing images: the MFP-based approach (MFPF) performs better than the MF, MedF, and the previous MMF in terms of land cover classification accuracy. Second, the proposed MFPF is robust and effective for classical classifiers, namely, SVM, KNN, and RT. Finally, the proposed MFPF outperforms the state-of-the-art spatial-spectral feature-based VHR image classification methods; for example, the improvements are 3.06%-9.76% in OA and 1.47%-8.26% in AA.

Discussion
The relationship between the accuracies and the number of training samples for the different methods is analyzed in this section to promote the potential application of the proposed MFPF. As shown in Figures 11-13, the accuracies of each approach improve as the number of training samples increases. For example, in Figure 11a-c, the OA of the proposed MFPF on the JX01 dataset with the SVM classifier increases from 79.37% to 87.75% when the number of training samples increases from 5 points/class to 10 points/class. The OA of the proposed approach gradually increases to 93.43% when the number of training samples reaches 100 points/class. The fluctuation in the accuracies may be due to the uncertain spatial distribution of the training samples in each test.
The accuracies on the Pavia University image and ZH-6 dataset with the KNN and SVM classifiers also increase swiftly when the number of training samples increases from 5 points/class to 10 points/class. However, the accuracies become less sensitive to the number of training samples once the latter exceeds 50 points/class. For example, the OA of the proposed MFPF with the KNN classifier increases only from 88.21% to 94.18% when the number of training samples increases from 50 points/class to 500 points/class, as illustrated in Figure 12a-c.
The above observations and the comparison results for the three datasets demonstrate that the proposed MFPF achieves higher accuracies than the other filters when the same number of training samples is used. The accuracies of the proposed MFPF first increase and then stabilize as the number of training samples grows. This finding can guide the selection of the number of training samples for the proposed MFPF.

Conclusions
In the present work, a simple but powerful approach called MFPF was proposed for VHR remote sensing image classification. Instead of using a filter with a single processing scale, filter profiles were built in the proposed MFPF approach by filtering the image with a series of different parameters. The proposed MFPF comprises the following three major steps. (1) MFPs were first constructed through a modified mean filter with different scale parameters. (2) The classical PCA was then adopted to reduce the dimension of the MFPs, and the top three components were taken as the input feature for a supervised classifier to obtain the initial classification map. (3) Finally, the adaptive spatial region of each filter was adopted for post-processing the initial classification map. The contributions and advantages of the proposed MFPF are summarized as follows.
(1) The proposed MFPF provided competitive accuracies in land cover classification of VHR remote sensing images. As an extension of MMF [20], the proposed MFPF achieved the best accuracy compared with the MMF, MF, MedF, and the raw image without any filter processing. Furthermore, the results of the second experiment indicated that the proposed MFPF is more robust across the different classifiers than the MF, MedF, and MMF. In addition, the classification results achieved by the proposed MFPF clearly demonstrated its effectiveness and superiority in terms of visual performance and quantitative accuracies compared with those based on the classical spatial-spectral feature extraction approaches, including EPFs [26], RFs [27], M_EMPs [25], and RGF [28].
(2) To the best of the author's knowledge, this study is the first to promote the idea of multi-scale filter profile construction to improve the classification performance with VHR remote sensing images. Experimental results indicate that the proposed MFPF can smooth the salt-and-pepper noise of classification maps because this approach can cover various targets with different shapes and sizes. Furthermore, the post-processing strategy with the adaptive region of each filter in the proposed MFPF can further smooth the noise in the initial classification map and improve its performance and accuracy.
The experimental results based on three real VHR remote sensing images acquired using different sensors and platforms indicated that the proposed MFPF is effective for smoothing the noise of classification maps and improving classification accuracies. An advantage of the proposed MFP is its ability to improve the intra-class spectral homogeneity while preserving the edge of targets. This improvement helps smooth the noise of the classification map and upgrades the recognition accuracies. Another major advantage of the proposed MFPF is its capability to avoid hard-tuning of parameters for a pre-given VHR image, which will be relatively useful in real applications.
Overall, the newly proposed approach is a promising framework for VHR remote sensing image classification. In future works, the robustness and adaptability of the proposed approach will be investigated on additional kinds of remote sensing images, such as the WorldView 4.0 VHR sensing images.