Mixed Noise Estimation Model for Optimized Kernel Minimum Noise Fraction Transformation in Hyperspectral Image Dimensionality Reduction

Abstract: Dimensionality reduction (DR) is of great significance for simplifying and optimizing hyperspectral image (HSI) features. As a widely used DR method, the kernel minimum noise fraction (KMNF) transformation preserves the high-order structures of the original data well. However, conventional KMNF noise estimation (KMNF-NE) uses the local regression residual of neighbourhood pixels, which depends heavily on spatial information. Due to limited spatial resolution, there are many mixed pixels in HSI, making KMNF-NE unreliable for noise estimation and leading to poor classification performance of KMNF on HSIs with low spatial resolution. To overcome this problem, a mixed noise estimation model (MNEM) is proposed in this paper for an optimized KMNF (OP-KMNF). The MNEM adopts sequential and linear combinations of a Gaussian prior denoising model, a median filter, and a Sobel operator to estimate noise. It retains more details and edge features, making it more suitable for noise estimation in KMNF. Experiments using several HSI datasets with different spatial and spectral resolutions were conducted. The results show that, compared with some other DR methods, the improvement of OP-KMNF in average classification accuracy is up to 4%. To improve efficiency, OP-KMNF was implemented on graphics processing units (GPU) and sped up by about 60× compared to the central processing unit (CPU) implementation. These outcomes demonstrate the significant performance of OP-KMNF in terms of classification ability and execution efficiency.


Introduction
Hyperspectral remote sensing combines imaging technology and spectroscopy technology to obtain continuous and narrow-band image data with high spectral resolution [1], which improves the ability to monitor the Earth's systems and human activities [2,3]. However, the high-dimensional data obtained by the hyperspectral sensor are challenging to analyse and apply [4,5]. In hyperspectral image (HSI) classification, with limited training samples, the classifier performance first improves as the dimension increases but degrades when the dimension is higher than an optimal value (the Hughes phenomenon) [6,7]. Dimensionality reduction (DR) is a powerful tool to solve this problem and has become a critical step in HSI processing tasks because of its effectiveness at simplifying and optimizing HSI features [8].
For optical images, it is generally assumed that signal and noise are independent of each other; that is, an HSI is composed of signal plus noise [37]:

$$\vec{y}(x) = \vec{y}_S(x) + \vec{y}_N(x), \qquad (1)$$

where $\vec{y}(x)$ is the pixel vector at location $x$, and $\vec{y}_S(x)$ and $\vec{y}_N(x)$ are the signal and noise components of $\vec{y}(x)$. A hyperspectral image Y containing n pixels and b bands can be expressed as a matrix with n rows and b columns. The covariance matrix $\Sigma_Y$ of image Y can be expressed as the sum of the signal covariance matrix $\Sigma_{Y_S}$ and the noise covariance matrix $\Sigma_{Y_N}$, as in Equation (2):

$$\Sigma_Y = \Sigma_{Y_S} + \Sigma_{Y_N}. \qquad (2)$$

Treating $\bar{y}_i$ as the average value of the $i$th band, a matrix $Y_{mean}$ with n rows and b columns can be obtained, in which every row is the vector of band means $(\bar{y}_1, \ldots, \bar{y}_b)$.
Then, we centralize the matrix Y; the centred matrix is given by $Y_C = Y - Y_{mean}$. The covariance matrix $\Sigma_Y$ of Y can then be expressed as

$$\Sigma_Y = \frac{1}{n-1} Y_C^T Y_C.$$

Similarly, $\Sigma_{Y_S}$ and $\Sigma_{Y_N}$ can be obtained. NF, shown in Equation (6), is defined as the ratio of the noise variance to the image variance along a projection direction:

$$NF = \frac{a^T \Sigma_{Y_N} a}{a^T \Sigma_Y a}, \qquad (6)$$

where $a$ is the eigenvector matrix of NF. Minimizing NF is equivalent to maximizing $1/NF$:

$$\frac{1}{NF} = \frac{a^T \Sigma_Y a}{a^T \Sigma_{Y_N} a}.$$

The dual formulation is obtained by re-parameterizing and setting $a \propto Y_C^T b$:

$$\frac{1}{NF} = \frac{b^T Y_C Y_C^T Y_C Y_C^T b}{b^T Y_C Y_N^T Y_N Y_C^T b}.$$

Introducing a non-linear mapping $\Phi: y \to \Phi(y)$, the kernelized $1/NF$ can be expressed as in Equation (10):

$$\frac{1}{K\_NF} = \frac{b^T \Phi(Y)\Phi(Y)^T \Phi(Y)\Phi(Y)^T b}{b^T \Phi(Y)\Phi(Y_N)^T \Phi(Y_N)\Phi(Y)^T b} = \frac{b^T K K b}{b^T K_N K_N^T b}, \qquad (10)$$

where $\Phi(Y)$ is the matrix after mapping of Y, $\Phi(Y_N)$ is the matrix after mapping of $Y_N$, $K = \Phi(Y)\Phi(Y)^T$, and $K_N = \Phi(Y)\Phi(Y_N)^T$. In KMNF, in order to extract components ordered by image quality, we maximize $1/K\_NF$ according to Equation (10) [26]. Maximizing $1/K\_NF$ amounts to maximizing a Rayleigh quotient, which can be solved as a generalized eigenvalue problem [5,37]:

$$K K b = \lambda K_N K_N^T b,$$

where $\lambda$ denotes the eigenvalues of the problem. As mentioned above, $a \propto Y_C^T b$; after the non-linear mapping, $Y_C^T b$ is transformed to $\Phi(Y)^T b$, and the feature extraction result R can be obtained by Equation (14):

$$R = \Phi(Y)\Phi(Y)^T b = K b. \qquad (14)$$

From the above analysis, the accuracy of $1/K\_NF$ depends mainly on the noise estimation results, so noise estimation is a key step in the KMNF transformation. KMNF-NE fits a paraboloid or a plane by local regression in a 3 × 3 neighbourhood and takes the residual as the noise estimate, which can be regarded as a filtering problem [37,44]. However, recent studies show that space-based noise estimation methods such as KMNF-NE are data-selective and unstable [46].
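The linear (non-kernel) form of this derivation can be sketched in a few lines of NumPy; the kernelized version replaces the band-space covariances with Gram matrices. This is only an illustrative sketch, not the paper's implementation: the noise matrix `Y_noise` is assumed to come from some external noise estimator, and the Rayleigh quotient is maximized by solving the generalized eigenproblem directly.

```python
import numpy as np

def mnf_components(Y, Y_noise, k):
    """Linear MNF sketch: find directions a maximizing
    1/NF = (a^T Sigma_Y a) / (a^T Sigma_YN a).

    Y       : (n_pixels, n_bands) image matrix
    Y_noise : (n_pixels, n_bands) estimated noise matrix
    k       : number of components to keep
    """
    Yc = Y - Y.mean(axis=0)                      # centred image matrix Y_C
    cov_Y = Yc.T @ Yc / (len(Yc) - 1)            # image covariance Sigma_Y
    Nc = Y_noise - Y_noise.mean(axis=0)
    cov_N = Nc.T @ Nc / (len(Nc) - 1)            # noise covariance Sigma_YN
    # Maximizing the Rayleigh quotient is the generalized eigenproblem
    # cov_Y a = lambda * cov_N a; with cov_N invertible this reduces to
    # an ordinary eigenproblem on cov_N^{-1} cov_Y.
    vals, vecs = np.linalg.eig(np.linalg.solve(cov_N, cov_Y))
    order = np.argsort(vals.real)[::-1]          # largest SNR first
    A = vecs[:, order[:k]].real
    return Yc @ A                                # extracted components
```

Components come out ordered by signal-to-noise ratio, which is what "arranged by image quality" means for the kernel variant as well.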
Considering the types of noise in HSIs, we propose an MNEM that combines a Gaussian prior denoising model, median filter, and Sobel operator to estimate noise during KMNF.

Mixed Noise Estimation Model
The Gaussian prior denoising model learns the Gaussian model from the image itself, which enables it to better adapt to the data and achieve better denoising results in the images mixed with Gaussian noise [48].
For images mixed with Gaussian noise, the noise model can be expressed as $Z_G = S_G + N_G$, where $Z_G$ is the image mixed with Gaussian noise, $S_G$ is the clean image, and $N_G$ is noise following a Gaussian distribution.
In the Gaussian prior model, it is assumed that a clean image patch satisfies a Gaussian distribution, that is, $S_G \sim N(\mu, \Sigma)$; the denoised result $R_G$ can be obtained by maximizing the posterior probability as in Equation (16) [48]:

$$R_G = \arg\max_{S_G} p(S_G \mid Z_G) = \arg\max_{S_G} \frac{p(Z_G \mid S_G)\, p(S_G)}{p(Z_G)} = \arg\max_{S_G} p(Z_G \mid S_G)\, p(S_G) = \arg\min_{S_G} \frac{1}{2\sigma^2}\|Z_G - S_G\|^2 + \frac{1}{2}(S_G - \mu)^T \Sigma^{-1}(S_G - \mu), \qquad (16)$$

where $S_G$ is the clean image patch, $\mu$ is the mean of the clean image patch, $\Sigma$ is the covariance of the clean image, and $\sigma$ is the standard deviation of the Gaussian noise. In Equation (16), the second line follows from the first by Bayes' theorem, and the last line is obtained by taking the logarithm of the third line. The result depends on the balance between the first term (fidelity term) and the second term (prior term). When $\Sigma$ is a positive semidefinite matrix, Equation (16) can be regarded as a convex optimization problem [48]. Thus, setting the derivative to zero, we obtain Equation (17):

$$R_G = \mu + \Sigma\,(\Sigma + \sigma^2 I)^{-1}(Z_G - \mu). \qquad (17)$$

Then, the Gaussian noise mixed in the image can be expressed as in Equation (18):

$$N_G = Z_G - R_G. \qquad (18)$$

Recent works show the validity of this method for filtering out Gaussian noise in HSI [48].
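The closed-form MAP estimate of Equation (17) is a one-liner in NumPy. The sketch below is illustrative (the function name and argument layout are ours, and the patch is treated as a flat vector):

```python
import numpy as np

def gaussian_map_denoise(z, mu, cov, sigma):
    """Closed-form MAP denoising under a Gaussian patch prior N(mu, cov)
    and additive Gaussian noise of standard deviation sigma:
        R_G = mu + cov (cov + sigma^2 I)^{-1} (Z_G - mu)   (Equation (17))
    Returns the denoised patch and the estimated noise Z_G - R_G (Eq. (18))."""
    d = z.shape[0]
    r = mu + cov @ np.linalg.solve(cov + sigma**2 * np.eye(d), z - mu)
    return r, z - r
```

Note the limiting behaviour: as sigma approaches 0 the fidelity term dominates and the output approaches the observation, while for large sigma the prior dominates and the output shrinks towards the patch mean.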
The median filter, a typical non-linear filter for signal restoration based on order statistics, replaces the value of each pixel with the median of the grayscale values in an N × N neighbourhood. It preserves details well while filtering out noise (especially impulse noise) but cannot effectively maintain the edge features of the image.
As a widely used edge detection algorithm, the Sobel operator produces good edge detection results and has a smoothing suppression effect on noise. The Sobel operator is a discrete difference operator; applying it to any pixel in the image produces the corresponding gradient vector or its normal vector [49].
The Sobel operator uses the 3 × 3 spatial neighbourhood of each pixel to detect edge information:

$$N_{SX_{x,y,i}} = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} * Z_S, \quad N_{SY_{x,y,i}} = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix} * Z_S, \quad N_{S_{x,y,i}} = |N_{SX_{x,y,i}}| + |N_{SY_{x,y,i}}|,$$

where $Z_{S_{x,y,i}}$ is the value of the pixel located at line x, column y, and band i of the hyperspectral image $Z_S$; $N_{SX_{x,y,i}}$ is the transverse edge detection value; $N_{SY_{x,y,i}}$ is the longitudinal edge detection value; and $N_{S_{x,y,i}}$ is the edge detection value of the Sobel operator. As mentioned above, HSIs often suffer from degradation by a mixture of noise types, including thermal noise, quantization noise, and sparse noise [47]. According to the types of mixed noise in HSIs, the MNEM, which combines a Gaussian prior denoising model, a median filter, and a Sobel operator, is proposed in this paper to estimate noise.
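A compact per-band sketch of the Sobel response, using the common |N_SX| + |N_SY| magnitude approximation (edge replication at the borders is our assumption):

```python
import numpy as np

def sobel_edges(band):
    """Sobel edge magnitude |N_SX| + |N_SY| for one band (3x3 kernels)."""
    Kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    Ky = Kx.T                                 # longitudinal kernel
    p = np.pad(band, 1, mode='edge')
    H, W = band.shape
    gx = np.zeros((H, W))
    gy = np.zeros((H, W))
    for i in range(3):                        # accumulate the windowed sums
        for j in range(3):
            gx += Kx[i, j] * p[i:i + H, j:j + W]
            gy += Ky[i, j] * p[i:i + H, j:j + W]
    return np.abs(gx) + np.abs(gy)
```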
The flowchart of the noise estimation process is shown in Figure 1, which adopts two different ways of combining multiple filters; each filter suppresses or removes a different type of noise in the image. The linear combination processing method is the MNEM-Ratio shown in Figure 1, and the sequential processing method is the MNEM-Order shown in Figure 1. In the sequential scheme, the median filter is first used to filter out the impulse noise in each band of the HSI. Second, considering that the median filter cannot retain the edge features of the image well, the Sobel operator is used for edge detection to enhance the edge information. Finally, the Gaussian prior denoising model is utilized to suppress the Gaussian noise, and the residual between the denoised image and the original image is taken as the noise estimate. For the MNEM-Ratio, the mean spectral angle distance (MSAD), expressed as in Equation (24), is used as the evaluation criterion to determine the combination weights P_M, P_S, and P_G. In evaluating noise estimation performance, the smaller the value of MSAD, the better the denoised result. The procedure is shown in Algorithm 1.

Algorithm 1. The procedure for determining P_M, P_S, and P_G in MNEM-Ratio
Input: hyperspectral image Y.
Step 1: input Y into the median filter to obtain the median filter denoised image Y_Median.
Step 2: the Sobel operator is used on Y to obtain the Sobel-denoised image Y_DeNoise_Sobel = Y − Y_Sobel.
Step 3: Y is input into the Gaussian prior denoising model to obtain the denoised image Y_Gaussian.
Step 4: calculate the MSAD values MSAD_Median, MSAD_Sobel and MSAD_Gaussian between the input image and the denoised images Y_Median, Y_DeNoise_Sobel and Y_Gaussian.
Step 5: take the reciprocals, that is, MSAD_M = 1/MSAD_Median, MSAD_S = 1/MSAD_Sobel, and MSAD_G = 1/MSAD_Gaussian.

To evaluate the noise estimation performance of the MNEM, we apply the proposed method to each band of the HSI. The following evaluation criteria are used [50,51]:
1. Mean peak signal-to-noise ratio (MPSNR): $MPSNR = \frac{1}{M}\sum_{m=1}^{M} PSNR_m$, where $PSNR_m$ is the PSNR in band m, and M is the number of HSI bands.
2. Mean structure similarity (MSSIM): $MSSIM = \frac{1}{M}\sum_{m=1}^{M} SSIM_m$, where $SSIM_m$ is the structure similarity (SSIM) in band m, and M is the number of HSI bands.
3. Mean spectral angle distance (MSAD): $MSAD = \frac{1}{n}\sum_{p=1}^{n} \arccos\!\left(\frac{z_p^T r_p}{\|z_p\|\,\|r_p\|}\right)$, calculated between the original image pixel $z_p$ and the denoised image pixel $r_p$ and then averaged over all pixels in the spatial domain.

The larger the values of MPSNR and MSSIM and the smaller the value of MSAD, the better the denoised results.
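The MSAD criterion of Equation (24) and the reciprocal scores of Algorithm 1 can be sketched as follows. Normalizing the reciprocals so they sum to one is our assumption about how the scores become the combination weights P_M, P_S, and P_G; the excerpt does not spell this step out.

```python
import numpy as np

def msad(Z, R):
    """Mean spectral angle distance between original pixels Z and
    denoised pixels R, both of shape (n_pixels, n_bands)."""
    num = np.sum(Z * R, axis=1)
    den = np.linalg.norm(Z, axis=1) * np.linalg.norm(R, axis=1)
    return np.mean(np.arccos(np.clip(num / den, -1.0, 1.0)))

def mnem_ratio_weights(msad_median, msad_sobel, msad_gauss):
    """Reciprocal MSAD scores (Step 5 of Algorithm 1); normalizing them
    to sum to one (assumed) turns the scores into weights (P_M, P_S, P_G)."""
    inv = np.array([1.0 / msad_median, 1.0 / msad_sobel, 1.0 / msad_gauss])
    return inv / inv.sum()
```

A smaller MSAD (better denoising) yields a larger reciprocal and hence a larger weight for that filter's noise estimate.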

Optimized Kernel Minimum Noise Fraction (OP-KMNF) Transformation
Instead of using the residual of a local regression on the nearest neighbourhood pixels, we adopt MNEM-Ratio and MNEM-Order to estimate noise during KMNF, yielding the optimized KMNF (OP-KMNF). The OP-KMNF-Ratio and OP-KMNF-Order transformation procedures are shown in Algorithms 2 and 3.

Algorithm 2. The procedure of OP-KMNF-Ratio
Input: hyperspectral image Y.
Step 1: input Y into the median filter to obtain Y_Median, and then Noise_Median = Y − Y_Median.
Step 2: the Sobel operator is used on Y to obtain Noise_Sobel = Y_Sobel.
Step 3: Y is input into the Gaussian prior denoising model to obtain Y_Gaussian, and then Noise_Gaussian = Y − Y_Gaussian.
Step 6: calculate the eigenvectors and obtain the matrix b.
Step 7: map all pixels onto the transformation matrix using Equation (14).
Output: hyperspectral image feature extraction result R.

Algorithm 3. The procedure of OP-KMNF-Order
Input: hyperspectral image Y.
Step 1: input Y into the median filter to obtain Y_Median.
Step 2: the Sobel operator is used on Y_Median to obtain Y_Sobel.
Step 3: Y_Sobel is input into the Gaussian prior denoising model to obtain Y_Gaussian, and then MNEM-Order = Y − Y_Gaussian.
Step 5: calculate the eigenvectors and obtain the matrix b.
Step 6: map all pixels onto the transformation matrix using Equation (14).
Output: hyperspectral image feature extraction result R.
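Algorithms 2 and 3 differ only in how the three operators are combined. The sequential (Order) variant's noise estimate can be sketched as below; the three operators are passed in as callables, and the exact residual conventions are our assumptions where the procedure leaves them implicit.

```python
import numpy as np

def mnem_order_noise(band, median_fn, sobel_fn, gauss_fn):
    """MNEM-Order sketch: median filter, then Sobel edge detection,
    then Gaussian-prior denoising; the residual against the original
    band is taken as the noise estimate (Noise = Y - Y_Gaussian)."""
    y_median = median_fn(band)           # Step 1: suppress impulse noise
    y_sobel = sobel_fn(y_median)         # Step 2: edge detection result
    y_gauss = gauss_fn(y_sobel)          # Step 3: suppress Gaussian noise
    return band - y_gauss                # per-band noise estimate
```

Applying this band by band yields the noise matrix that replaces the local-regression residual in the 1/K_NF denominator.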

Graphics Processing Units (GPU)-Based Parallel Computing
Due to the kernel functions in KMNF, the computational complexity increases and the execution efficiency decreases. Recently, parallel processing has attracted increasing attention because of its powerful data processing capacity. The main parallel platforms include the central processing unit (CPU), field programmable gate array (FPGA), and graphics processing unit (GPU). Compared to the FPGA and CPU, the GPU excels at floating-point operations and is more flexible to program [52,53].
Basic linear algebra subprograms (BLAS) are a set of kernels that provide standard building blocks for executing basic vector and matrix operations [54]. The BLAS specifications are widely used to develop high-quality linear algebra software, such as LAPACK and FLAME [55,56]. For the GPU parallel platform, the NVIDIA compute unified device architecture (CUDA) provides a BLAS library (called cuBLAS) [57]. Recent works show the superior performance of cuBLAS in processing matrix and vector operations [54,57,58]. An analysis of OP-KMNF shows voluminous matrix and vector operations in the program, so this paper realizes the parallel processing of OP-KMNF based on the cuBLAS library on GPU.
In this paper, the parallel computing performance of OP-KMNF is evaluated in two different computing systems. Computing system No. 1 consists of an Intel Core i5-4460 CPU at 3.20 GHz with four cores and an NVIDIA GeForce GTX745 GPU card, and computing system No. 2 consists of an Intel Core i7-10750H CPU at 2.60 GHz with six cores and an NVIDIA GeForce RTX2060 GPU card. The hardware specifications of NVIDIA GeForce GTX745 and NVIDIA GeForce RTX2060 are shown in Table 1.

Results
The experiments were conducted on three datasets (the Salinas, Indian Pines, and Xiong'an datasets), introduced in Section 3.1. All DR algorithms were implemented in Visual Studio 2015; for median filtering and the Sobel operator, OpenCV 3.1.0 was used as the dependency library. The first experiment used MPSNR, MSSIM, and MSAD as evaluation criteria to evaluate the denoised results of the MNEM adopted in OP-KMNF; the results are shown in Section 3.2. The second experiment took the average accuracy as the evaluation criterion to assess the performance of OP-KMNF with the support vector machine (SVM) classifier; an SVM with a radial basis function (RBF) kernel was applied. In all experiments, 25% of the samples were randomly selected for training and the rest were employed for testing, and ten-fold cross-validation was used to find the best SVM parameters. The experimental results are given in Section 3.3. To verify the adaptability of OP-KMNF to HSIs with different spatial and spectral resolutions, pixel merging and band merging were performed on the Xiong'an dataset to obtain images with different spatial and spectral resolutions. The experimental results for different spatial resolutions are given in Section 3.4, and those for different spectral resolutions in Section 3.5. The last experiment verified the execution efficiency of OP-KMNF implemented on GPU under increasing data volumes, and the computational costs of OP-KMNF were analysed in detail for a data volume of 400 × 400 × 250; the results are given in Section 3.6. To ensure the reliability of the experimental results, each experiment was conducted five times, and the average values and confidence intervals are reported in this paper.

Salinas Dataset
The Salinas dataset is an image of the Salinas Valley in California, USA, collected by the airborne visible/infrared imaging spectrometer (AVIRIS). It consists of 512 × 217 pixels with a spatial resolution of 3.7 m and originally has 224 spectral bands. In the experiments, we used 204 bands after discarding bands 108–112, 154–167, and 224 because of water absorption. The pseudo-colour image and the ground truth classification map of Salinas are shown in Figure 2.

Indian Pines Dataset
Indian Pines is the earliest dataset used for hyperspectral image classification, collected by AVIRIS in Indiana, USA. It consists of 145 × 145 pixels with a spatial resolution of 20 m and originally has 220 spectral bands. In the experiments, we used 200 bands after discarding bands 104–108, 150–163, and 220 because of water absorption. The pseudo-colour image and the ground truth classification map of Indian Pines are shown in Figure 3.

Xiong'an Dataset
The Xiong'an dataset was collected in Xiong'an New District, Hebei Province, China by the airborne multi-modular imaging spectrometer (AMMIS), a next-generation Chinese airborne hyperspectral imager. Funded by the China High-Resolution Earth Observation Project, AMMIS was designed and manufactured by the Shanghai Institute of Technical Physics, Chinese Academy of Sciences. The dataset consists of 512 × 512 pixels and 250 spectral bands, the wavelength range is 0.4–1.0 µm, and the spatial resolution is 0.5 m [59–63]. The pseudo-colour image and the ground truth classification map of Xiong'an are shown in Figure 4.


Experiments on Noise Estimation
The denoised results of KMNF-NE, SSDC, and MNEM were compared; the MPSNR, MSSIM, and MSAD values on the Salinas, Indian Pines, and Xiong'an datasets are shown in Table 2. To visualize the results, we show the denoised results for band 5 of the Salinas dataset, band 187 of the Indian Pines dataset, and band 15 of the Xiong'an dataset.

Table 2. Mean peak signal-to-noise ratio (MPSNR), mean structure similarity (MSSIM), and mean spectral angle distance (MSAD) of different methods on the Salinas, Indian Pines, and Xiong'an datasets.

In this section, experiments were conducted to assess the noise estimation performance of the MNEM. As mentioned above, the larger the values of MPSNR and MSSIM and the smaller the value of MSAD, the better the denoised results. The experimental results show that MNEM is more effective, more robust, and retains more details and edge features than KMNF-NE and SSDC. Therefore, we adopt MNEM as the noise estimation method during KMNF for OP-KMNF.

Experiments on OP-KMNF
The numbers of samples and training samples of the Salinas, Indian Pines, and Xiong'an datasets are listed in Tables 3 and 4. The average accuracies of different DR methods for SVM classification on the three datasets are shown in Table 5. Taking the features extracted by each method from the Xiong'an dataset (number of features = 9) as an example, the classification accuracies of each class are given in Table 6. To visualize the classification results, the SVM classification results of the Indian Pines, Salinas, and Xiong'an datasets after different DR methods (number of features = 5) are shown in Figures 8–10.

In this section, SVM is used as the classification algorithm, and the average accuracy is used to assess the results. Experimental results obtained on the three hyperspectral datasets (Indian Pines, Salinas, and Xiong'an) show that OP-KMNF outperforms the other traditional DR methods (including LDA, PCA, MNF, OMNF, FA, KPCA, KMNF, OKMNF, and LPP) in terms of average classification accuracy. The improvements of OP-KMNF in average accuracy on the Salinas, Indian Pines, and Xiong'an datasets were 2.33%, 1.85%, and 4.54%, respectively; compared with KMNF alone, the improvements were 2.42%, 4.92%, and 7.64%. Taking the features extracted by each method from the Xiong'an dataset (number of features = 9) as an example, the classification accuracy of each class was assessed. In the classification of Corn, Robinia, Populus, and Sophora japonica, OP-KMNF showed a significant improvement compared with the other methods.

Adaptability of OP-KMNF to Hyperspectral Images with Different Spatial Resolutions
HSIs with different spatial resolutions were obtained by pixel merging on the Xiong'an dataset. We treat the Xiong'an dataset as Spatial_resolution_1; for each band, four adjacent pixels are averaged to obtain Spatial_resolution_2, and eight adjacent pixels are averaged to obtain Spatial_resolution_3. The spatial resolutions of Spatial_resolution_1, Spatial_resolution_2, and Spatial_resolution_3 are 0.5 m, 1 m, and 2 m, respectively. The pseudo-colour images and the ground truth classification map are shown in Figure 11.
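Pixel merging of this kind reduces to block averaging over each band. The sketch below averages f × f blocks of an (H, W, B) cube; the exact grouping of "four adjacent pixels" and "eight adjacent pixels" used in the paper is not fully specified in the text, so the block shape here is our assumption.

```python
import numpy as np

def pixel_merge(cube, f):
    """Average f x f blocks of adjacent pixels in every band to lower
    the spatial resolution of an (H, W, B) hyperspectral cube."""
    H, W, B = cube.shape
    H2, W2 = (H // f) * f, (W // f) * f          # drop ragged borders
    c = cube[:H2, :W2, :]
    return c.reshape(H2 // f, f, W2 // f, f, B).mean(axis=(1, 3))
```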

The numbers of samples and training samples used in the Xiong'an dataset with different spatial resolutions are listed in Table 7. The average accuracies of different DR methods for SVM classification are shown in Figure 12.

Table 7. Numbers of samples and training samples in the Xiong'an dataset with different spatial resolutions.

Class             | 0.5 m samples | 0.5 m training | 1 m samples | 1 m training | 2 m samples | 2 m training
Soybean           | 10,562 | 2641 | 2474 | 618 | 523 | 130
Pear_trees        | 1303 | 326 | 312 | 78 | 71 | 18
Grassland         | 27,703 | 6926 | 6734 | 1683 | 1583 | 396
Sparsewood        | 9292 | 2323 | 2254 | 563 | 534 | 133
Robinia           | 25,761 | 6440 | 6274 | 1568 | 1508 | 377
Paddy             | 30,029 | 7507 | 7364 | 1841 | 1761 | 440
Populus           | 5534 | 1384 | 1345 | 336 | 318 | 80
Sophora japonica  | 811 | 203 | 182 | 45 | 36 | 9
Peach_trees       | 1498 | 375 | 348 | 87 | 73 | 18

In this section, experiments were conducted to verify the adaptability of OP-KMNF to HSIs with different spatial resolutions. For Spatial_resolution_1, Spatial_resolution_2, and Spatial_resolution_3, OP-KMNF showed better performance than the other DR methods (including LDA, PCA, MNF, OMNF, FA, KPCA, KMNF, OKMNF, and LPP).

Adaptability of OP-KMNF to Hyperspectral Images with Different Spectral Resolutions
HSIs with different spectral resolutions were obtained by band merging on the Xiong'an dataset. We treat the Xiong'an dataset as Spectral_resolution_1; for each pixel, two adjacent bands are averaged to obtain Spectral_resolution_2, and four adjacent bands are averaged to obtain Spectral_resolution_3. The spectral resolutions of Spectral_resolution_1, Spectral_resolution_2, and Spectral_resolution_3 are 2.4 nm, 4.8 nm, and 9.6 nm, respectively. The pseudo-colour images and the ground truth classification map are shown in Figure 13.

The numbers of samples and training samples used in the Xiong'an dataset with different spectral resolutions are listed in Table 8. The average accuracies of different DR methods for SVM classification are shown in Figure 14.
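Band merging is the spectral counterpart of pixel merging: groups of adjacent bands are averaged per pixel. A sketch, with the group size f as a parameter:

```python
import numpy as np

def band_merge(cube, f):
    """Average every group of f adjacent bands for each pixel to lower
    the spectral resolution of an (H, W, B) hyperspectral cube."""
    H, W, B = cube.shape
    B2 = (B // f) * f                    # drop leftover trailing bands
    return cube[:, :, :B2].reshape(H, W, B2 // f, f).mean(axis=3)
```

With f = 2 this maps the 250-band, 2.4 nm cube to a 125-band, 4.8 nm cube, matching the Spectral_resolution_2 construction described above.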
In this section, experiments were conducted to verify the adaptability of OP-KMNF to HSIs with different spectral resolutions. For Spectral_resolution_1, Spectral_resolution_2, and Spectral_resolution_3, OP-KMNF showed better performance than the other DR methods (including LDA, PCA, MNF, OMNF, FA, KPCA, KMNF, OKMNF, and LPP).

GPU Implementation of OP-KMNF
Table 9 shows the runtimes and speedups of OP-KMNF before and after parallel computing on two different computing systems with different data volumes. The CPU runtime was tested on computing system No. 1. Taking a data volume of 400 × 400 × 250 as an example, we detail the computational costs of OP-KMNF in Table 10.
In this section, experiments were conducted to compare the time consumption before and after parallel computing in OP-KMNF with identical data volumes on two different computing systems. The experimental results show that implementing OP-KMNF in parallel on the GPU leads to a significant improvement in efficiency: for a data volume of 400 × 400 × 250, OP-KMNF-Order and OP-KMNF-Ratio were sped up by 63.95× and 64.27×, respectively. The computational costs of OP-KMNF for this data volume are analysed in Table 10.

Discussion
In this section, we discuss the performance of the MNEM, OP-KMNF, and the GPU-based parallel computing proposed in this paper. The results are given in Section 3.
We conducted experiments to assess the noise estimation performance of the MNEM, and the results demonstrate that MNEM is more effective, more robust, and retains more details and edge features than KMNF-NE. From the experimental results on the Salinas dataset (spatial resolution: 3.7 m), Indian Pines dataset (spatial resolution: 20 m), and Xiong'an dataset (spatial resolution: 0.5 m), the improvements of MNEM in MPSNR, MSSIM, and MSAD were greatest on the Indian Pines dataset, which shows that MNEM performs better than KMNF-NE and SSDC on a dataset with limited spatial resolution and many mixed pixels. As mentioned above, noise estimation is a crucial component of KMNF, so we adopted the MNEM for noise estimation during KMNF to obtain OP-KMNF.
We validated the performance of the DR methods in terms of SVM classification, using average accuracy to assess the results. Experimental results obtained using three real hyperspectral datasets (the Indian Pines, Salinas, and Xiong'an datasets) showed that, (1) compared with other DR methods (including LDA, PCA, MNF, OMNF, FA, KPCA, KMNF, OKMNF, and LPP), OP-KMNF led to significant improvements in average classification accuracy; (2) it is not the case that the more features are extracted, the higher the classification accuracy will be, so the number of features should be selected to achieve the highest classification accuracy; (3) in practice, FA can be used as a dimensionality reduction method and, in many cases, performs better than MNF and KMNF; and (4) the experimental results prove the importance of accurate noise estimation for the KMNF algorithm and provide a way to optimize KMNF in the future. Compared to the other nine DR algorithms under the same experimental conditions, OP-KMNF handles non-linear features well and performs better in subsequent classification processing.
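The evaluation protocol above (reduce dimensionality, train an SVM on the reduced features, report accuracy) can be sketched as follows. Since no reference implementation of OP-KMNF is given in the text, PCA from scikit-learn stands in for the DR step; the split ratio and RBF kernel are assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def dr_then_svm_accuracy(X, y, n_components=10, seed=0):
    """Reduce pixel spectra with a DR method (PCA here as a stand-in for
    OP-KMNF), then score an SVM classifier on held-out pixels."""
    Z = PCA(n_components=n_components).fit_transform(X)
    Xtr, Xte, ytr, yte = train_test_split(
        Z, y, test_size=0.5, random_state=seed, stratify=y)
    clf = SVC(kernel='rbf').fit(Xtr, ytr)
    return accuracy_score(yte, clf.predict(Xte))
```

Sweeping `n_components` with this harness reproduces observation (2): accuracy typically peaks at an intermediate feature number rather than growing monotonically with dimension.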
To verify the adaptability of OP-KMNF to HSIs with different spatial resolutions, we obtained HSIs with different spatial resolutions by pixel merging on the Xiong'an dataset; the spatial resolutions of Spatial_resolution_1, Spatial_resolution_2, and Spatial_resolution_3 were 0.5 m, 1 m, and 2 m, respectively. Experimental results obtained using these images showed that, (1) compared with other DR methods (including LDA, PCA, MNF, OMNF, FA, KPCA, KMNF, OKMNF, and LPP), OP-KMNF led to significant improvements in average classification accuracy on HSIs with different spatial resolutions; (2) in most cases, the classification performance of the DR methods (including LDA, PCA, MNF, OMNF, FA, KPCA, KMNF, OKMNF, LPP, and OP-KMNF) improved as the spatial resolution decreased; and (3) the MNEM adapts well to HSIs with different spatial resolutions during KMNF, and the lower the spatial resolution, the greater the performance improvement of OP-KMNF over KMNF.
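The paper does not spell out the pixel-merging operator; a common choice, assumed here, is block averaging, where each factor-of-2 merge halves the spatial resolution (0.5 m → 1 m → 2 m):

```python
import numpy as np

def merge_pixels(cube, factor):
    """Degrade spatial resolution by averaging factor x factor pixel blocks
    in each band of an HSI cube of shape (rows, cols, bands)."""
    r, c, b = cube.shape
    r2, c2 = r - r % factor, c - c % factor   # crop to a multiple of factor
    blocks = cube[:r2, :c2].reshape(r2 // factor, factor,
                                    c2 // factor, factor, b)
    return blocks.mean(axis=(1, 3))
```

Each output pixel is the mean of a `factor × factor` block, which also mimics the mixed-pixel effect that motivates the MNEM at coarser resolutions.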
To verify the adaptability of OP-KMNF to HSIs with different spectral resolutions, we obtained HSIs with different spectral resolutions by band merging on the Xiong'an dataset; the spectral resolutions of Spectral_resolution_1, Spectral_resolution_2, and Spectral_resolution_3 were 2.4 nm, 4.8 nm, and 9.6 nm, respectively. Experimental results obtained using these images showed that, (1) compared with other DR methods (including LDA, PCA, MNF, OMNF, FA, KPCA, KMNF, OKMNF, and LPP), OP-KMNF led to significant improvements in average classification accuracy on HSIs with different spectral resolutions; (2) the classification performance of LDA and LPP improved as the spectral resolution decreased, while the performance of KMNF and OKMNF worsened as the spectral resolution decreased; and (3) the MNEM adapts well to HSIs with different spectral resolutions during KMNF, and the lower the spectral resolution, the greater the performance improvement of OP-KMNF over KMNF.
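Band merging is the spectral analogue of pixel merging; assuming averaging of adjacent bands (the operator is not specified in the text), each factor-of-2 merge doubles the band width (2.4 nm → 4.8 nm → 9.6 nm):

```python
import numpy as np

def merge_bands(cube, factor):
    """Degrade spectral resolution by averaging groups of `factor` adjacent
    bands of an HSI cube of shape (rows, cols, bands)."""
    r, c, b = cube.shape
    b2 = b - b % factor                       # drop any trailing partial group
    return cube[..., :b2].reshape(r, c, b2 // factor, factor).mean(axis=-1)
```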
In practice, we need to consider both performance and efficiency. We introduced GPU-based parallel computing to improve the execution efficiency of OP-KMNF and compared the time consumption before and after parallel computing under the same data volumes on two different computing systems. The results showed that the parallel GPU implementation led to a significant improvement in efficiency. The analysis of the experimental results in Tables 9 and 10 showed that, (1) the larger the data volume, the more significant the acceleration effect; (2) when the data volume was 400 × 400 × 250, the execution efficiency sped up by about 60× compared to the CPU implementation; and (3) the execution efficiency of the algorithm will be further improved as parallel systems develop.
From the experimental results, we can see that MNEM-Order and MNEM-Ratio have different adaptability to different datasets. On the Salinas dataset, MNEM-Ratio shows better noise estimation performance, while on the Indian Pines and Xiong'an datasets, MNEM-Order performs better. In most cases, OP-KMNF-Ratio achieves better average classification accuracy on the Salinas dataset, while OP-KMNF-Order performs better when the feature number is 3. Similar results were also observed on the Indian Pines and Xiong'an datasets. The reason for this phenomenon is not yet fully understood.
To further analyse the causes of this phenomenon, the PSNR and SSIM values of each band of the Salinas, Indian Pines, and Xiong'an datasets are shown in Figures 15 and 16. It can be seen that each noise estimation method has different adaptability to different bands. For the KMNF transformation, it is not the case that the more features are extracted, the higher the classification accuracy will be; the noise fraction matrix of the HSI is processed as a whole during the KMNF transformation, and the impact of the noise estimation results of each band on the final results has not been quantitatively assessed. All of these factors have an impact on the final results. In future work, we will perform more in-depth research to explain this phenomenon.
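The per-band PSNR values plotted in Figures 15 and 16 follow the standard definition PSNR_b = 10·log10(MAX_b² / MSE_b); a minimal per-band computation (assuming a clean reference cube is available, which the paper's simulation setup implies) could look like:

```python
import numpy as np

def psnr_per_band(clean, estimate):
    """Per-band peak signal-to-noise ratio in dB between a clean reference
    cube and a denoised/estimated cube, both of shape (rows, cols, bands)."""
    out = []
    for b in range(clean.shape[-1]):
        mse = np.mean((clean[..., b] - estimate[..., b]) ** 2)
        peak = clean[..., b].max()
        out.append(np.inf if mse == 0 else 10 * np.log10(peak ** 2 / mse))
    return np.array(out)
```

Band-wise curves like these make it visible that no single noise estimation method dominates on every band, which is consistent with the mixed Order/Ratio results discussed above.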

Conclusions
The MNEM proposed in this paper estimates the noise of HSI more effectively, is more robust, and retains more details and edge features than KMNF-NE, and it further optimizes the DR performance of KMNF in terms of classification. In consideration of the computational cost, we introduced GPU-based parallel computing to OP-KMNF. Experimental results show that the algorithm handles non-linear features well and adapts well to HSIs with different spatial and spectral resolutions. Additionally, it is superior to other commonly used DR algorithms in terms of classification. The experiments on two different computing systems proved that the execution efficiency improves further as parallel systems develop, which paves the way for real-time dimensionality reduction of HSI. In the future, the performance of OP-KMNF in other applications (e.g., anomaly detection) and the implementation of a real-time version of the algorithm will be studied.
