Integrating MNF and HHT Transformations into Artificial Neural Networks for Hyperspectral Image Classification

Abstract: The critical issue facing hyperspectral image (HSI) classification is the imbalance between dimensionality and the number of available training samples. This study attempts to solve the issue by proposing a method that integrates minimum noise fraction (MNF) and Hilbert-Huang transform (HHT) transformations into artificial neural networks (ANNs) for HSI classification tasks. MNF and HHT function as a feature extractor and an image decomposer, respectively, to minimize the influences of noise and dimensionality and to maximize training sample efficiency. Experimental results using two benchmark datasets, the Indian Pine (IP) and Pavia University (PaviaU) hyperspectral images, are presented. With the intention of optimizing the number of essential neurons and training samples in the ANN, 1 to 1000 neurons and four training sample proportions were tested, and the associated classification accuracies were evaluated. For the IP dataset, the results showed a remarkable classification accuracy of 99.81% with a 30% training sample from the MNF1-14+HHT-transformed image set using 500 neurons. Additionally, a high accuracy of 97.62% using only a 5% training sample was achieved for the MNF1-14+HHT-transformed images. For the PaviaU dataset, the highest classification accuracy was 98.70% with a 30% training sample from the MNF1-14+HHT-transformed images using 800 neurons. In general, the accuracy increased as the number of neurons increased and as the training samples increased. However, the accuracy improvement curve became relatively flat when more than 200 neurons were used, which revealed that using more discriminative information from transformed images can reduce the number of neurons needed to adequately describe the data, as well as the complexity of the ANN model. Overall, the proposed method opens new avenues in the use of MNF and HHT transformations for HSI classification, with outstanding accuracy performance using an ANN.


Introduction
Hyperspectral images (HSIs) are characterized by hundreds of observational bands with rich spectral information at high spectral resolution. Compared to multi-spectral images [1][2][3], the rich spectral information of HSIs provides very high-dimensional data, which are valuable resources for land-cover classification.

In the first step of the proposed approach, the MNF transform ranks its output images by signal-to-noise ratio (SNR); the order of the MNF images thus also reveals their quality. Since image quality significantly affects object detection [47], the first 14 MNF bands, which have higher image quality, are selected to compose two experimental image sets, MNF1-10 and MNF1-14. In the second step, HHT transformation is applied to decompose the 14 selected MNF bands into 14 sets of bidimensional empirical mode components (BEMCs) [45]. Due to the land-use homogeneity of the Indian Pines dataset, and based on an experiment from a previous study [46], the first four bidimensional intrinsic mode functions (BIMFs) were discarded to avoid high-frequency noise. Two experimental image sets, MNF1-10+HHT and MNF1-14+HHT, were merged for ANN classification. In the ANN classification stage, three categories of images, the original 220 band Indian Pines dataset, MNF-transformed images (two sets), and MNF+HHT-transformed images (two sets), were compared regarding their ANN classification performance using different training sample proportions. To further test the impact of the training sample proportion and the number of neurons in the ANN on classification accuracy, four training sample proportions, 5%, 10%, 20%, and 30%, were extracted, and 1 to 1000 ANN neurons were assessed in terms of the associated classification accuracy.


Study Images
Two benchmark datasets, the Indian Pine (IP) and Pavia University (PaviaU) hyperspectral datasets, were employed. The IP dataset was acquired from the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) sensor and has been widely used in image-classification research [28,31,48]. The IP dataset has a unique composition, with one-third forest and two-thirds farmland. Additionally, the IP dataset is composed of a 145 × 145 pixel image, 220 spectral bands, and 16 classes with a 20 m spatial resolution. The IP dataset with ground truth reference is available at https://engineering.purdue.edu/~biehl/MultiSpec/.
Remote Sens. 2020, 12, 2327

The PaviaU dataset was obtained from the Reflective Optics System Imaging Spectrometer (ROSIS) optical sensor, capturing an urban site over the University of Pavia, northern Italy. The PaviaU image is 610 × 340 pixels, with 103 spectral bands and nine classes at a 1.3 m spatial resolution. The PaviaU dataset with ground truth reference is available at http://www.ehu.eus/ccwintco/index.php?title=P%C3%A1gina_principal.

Frequency Transformation-Minimum Noise Fraction (MNF)
MNF was applied for dimensionality reduction in this study. MNF segregates noise from bands through a modified principal-component analysis (PCA) by ranking images on the basis of signal-to-noise ratio (SNR) [39,49]. MNF models each band as the sum of a signal and a noise component:

Z_i(x) = S_i(x) + N_i(x),  i = 1, 2, ..., p,

where N_i(x) is the noise content of the xth pixel in the ith band and S_i(x) is the signal component of the corresponding pixel [43]; the image has p bands with gray levels Z_i(x), where x is the image coordinate. The linear MNF transform is then

Y_i(x) = a_i^T Z(x),

where Y_i(x) is the linear transform of the original pixel, a_i is the left-hand eigenvector of Σ_N Σ^{-1}, and u_i is the corresponding eigenvalue of a_i, equal to the noise fraction in Y_i(x). The ordering u_1 ≤ u_2 ≤ ... ≤ u_p ranks the MNF components by image quality.
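The MNF step can be sketched in a few lines of numpy. This is a hedged illustration, not the paper's implementation: the noise covariance is estimated from horizontal neighbor differences (a common heuristic that the paper does not specify), and the eigenproblem follows the noise-fraction formulation above.

```python
import numpy as np

def mnf(cube):
    """Minimum noise fraction transform of an (H, W, p) image cube.

    Noise covariance is estimated from horizontal neighbor differences
    (an assumed estimator; the original paper does not state one).
    Returns components ordered from lowest to highest noise fraction,
    i.e. from highest to lowest image quality.
    """
    h, w, p = cube.shape
    X = cube.reshape(-1, p).astype(float)
    X -= X.mean(axis=0)
    # Noise estimate: difference between horizontally adjacent pixels.
    N = (cube[:, 1:, :] - cube[:, :-1, :]).reshape(-1, p) / np.sqrt(2)
    cov_n = np.cov(N, rowvar=False)
    cov_x = np.cov(X, rowvar=False)
    # Generalized eigenproblem cov_n a = u cov_x a; a small eigenvalue u
    # means a small noise fraction, i.e. a high-SNR component.
    vals, vecs = np.linalg.eig(np.linalg.solve(cov_x, cov_n))
    order = np.argsort(vals.real)           # ascending noise fraction
    A = vecs.real[:, order]
    Y = X @ A                               # transformed bands
    return Y.reshape(h, w, p), vals.real[order]
```

In this sketch, selecting the image sets MNF1-10 and MNF1-14 corresponds to keeping the first 10 or 14 output bands.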

Frequency Transformation-Hilbert-Huang Transform (HHT)
FABEMD, a branch of HHT, was implemented to decompose the extracted features. FABEMD offers an efficient mathematical solution by using order-statistics filters to estimate the upper and lower envelopes and by setting the screening iteration number for each bidimensional intrinsic mode function (BIMF) to one. The primary process of FABEMD is described below [43,44,50].
A local-maximum map (LMMAX) and a local-minimum map (LMMIN) are generated as two-dimensional arrays of local maxima and minima. Local extreme points are identified by the neighbor-kernel method: points with pixel values strictly above (below) all their neighbors are considered local maxima (minima). A commonly used 3 × 3 kernel was adopted because it produces more favorable local extremum detection results than a larger kernel [44]. When an extreme point lies on the border or at a corner of the image, kernel positions that fall outside the image are ignored.

a_mn is classified as a local maximum if a_mn > a_kl for all neighbors a_kl, and as a local minimum if a_mn < a_kl for all neighbors a_kl,

where a_mn is the element of the array located at the mth row and nth column, and the neighbor indices k and l are given by Equations (4) and (5), which span the w_ex × w_ex neighboring kernel used for detecting extremum points. An illustration of the BEMCs with the associated BIMFs and residue image of the IP dataset is shown in Figure 2.

Figure 2. Illustration of the 14 sets of BEMCs with BIMFs and residue image for the IP dataset.
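The extremum-detection rule above can be sketched as follows. This is an illustrative numpy implementation, not the authors' code; it assumes strict inequalities and the paper's rule of ignoring kernel positions outside the image.

```python
import numpy as np

def local_extrema(img, w_ex=3):
    """Return boolean maps of strict local maxima and minima.

    A pixel is a local maximum (minimum) when its value is strictly
    above (below) every neighbor inside a w_ex x w_ex window; border
    and corner pixels simply have fewer neighbors, mirroring the rule
    of ignoring kernel positions that fall outside the image.
    """
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    r = (w_ex - 1) // 2
    is_max = np.ones((h, w), dtype=bool)
    is_min = np.ones((h, w), dtype=bool)
    for dm in range(-r, r + 1):
        for dn in range(-r, r + 1):
            if dm == 0 and dn == 0:
                continue
            # Compare each pixel with its (dm, dn) neighbor over the
            # region where that neighbor exists inside the image.
            src = img[max(dm, 0):h + min(dm, 0), max(dn, 0):w + min(dn, 0)]
            dst_m = slice(max(-dm, 0), h + min(-dm, 0))
            dst_n = slice(max(-dn, 0), w + min(-dn, 0))
            is_max[dst_m, dst_n] &= img[dst_m, dst_n] > src
            is_min[dst_m, dst_n] &= img[dst_m, dst_n] < src
    return is_max, is_min
```

The default w_ex=3 corresponds to the 3 × 3 kernel adopted in the paper.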

Frequency Transformation-MNF+HHT Transform
Two frequency transformations were performed. First, an MNF transform was performed for noise and dimensionality reduction of the HSIs. The output images of the MNF transform were ranked by their signal-to-noise ratio (SNR) and image quality. In general, low-ordered MNF images had higher SNR and image quality; therefore, the first 14 MNF images were extracted to give two image sets, MNF1-10 and MNF1-14, for comparison purposes. Second, HHT decomposed each selected MNF band into BIMFs and a residue image; the remaining BIMFs and the residue image were composited for later ANN classification. Figure 2 shows the BIMFs and residue image for BEMCs 1-14. All BIMFs and residue images were derived directly from the HHT transformation. Based on visual inspection, the image quality decreased with higher-ordered BIMFs as well as with higher-ordered BEMCs.
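A simplified FABEMD screening loop might look like the sketch below. It is a toy illustration under stated assumptions: the filter window is fixed at 3 × 3, whereas the full FABEMD algorithm adapts the filter width to the spacing of detected extrema, and each BIMF uses a single screening iteration, as in FABEMD.

```python
import numpy as np

def _window_filter(img, size, func):
    """Apply a size x size sliding-window filter (max, min, or mean)."""
    h, w = img.shape
    r = size // 2
    pad = np.pad(img, r, mode='edge')
    out = np.empty_like(img, dtype=float)
    for i in range(h):
        for j in range(w):
            out[i, j] = func(pad[i:i + size, j:j + size])
    return out

def fabemd(img, n_bimfs=3, size=3):
    """Decompose an image into BIMFs plus a residue, FABEMD-style.

    Upper/lower envelopes come from max/min order-statistics filters
    followed by a mean smoothing filter; each BIMF is obtained with one
    screening iteration. The fixed window size is a simplification of
    the adaptive window used by the full algorithm.
    """
    img = np.asarray(img, dtype=float)
    bimfs, residue = [], img
    for _ in range(n_bimfs):
        upper = _window_filter(_window_filter(residue, size, np.max), size, np.mean)
        lower = _window_filter(_window_filter(residue, size, np.min), size, np.mean)
        mean_env = (upper + lower) / 2.0
        bimfs.append(residue - mean_env)   # one screening iteration only
        residue = mean_env
    return bimfs, residue
```

By construction the BIMFs and the residue sum back to the input image, which is the property that lets the BIMFs and residue be composited for classification.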

Machine Learning Classification-Training Sample Proportions
Four training sample proportions, 5%, 10%, 20%, and 30%, were tested with three categories of images (i.e., the original 220 bands of the IP dataset, two sets of MNF-transformed images, and two sets of MNF+HHT-transformed images) separately to investigate how changing the training sample proportion would affect the ANN's classification performance. Two hundred neurons were used in the hidden layer of the ANN as a benchmark to test the classification performance.
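The per-class sample selection described above might be sketched as follows. This is an illustrative routine, not the authors' code; the minimum of one pixel per class is an assumption added here to keep tiny classes represented.

```python
import numpy as np

def stratified_sample(labels, proportion, rng=None):
    """Pick a per-class random subset of pixel indices.

    labels: 1-D array of class ids for the labeled pixels.
    proportion: fraction (e.g. 0.05 for a 5% training sample) drawn
    independently from every class, so each land-cover type is
    represented. At least one pixel per class is kept (an assumption),
    which matters for tiny classes like 'Oats' in the IP dataset.
    """
    rng = np.random.default_rng(rng)
    train_idx = []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        n = max(1, int(round(proportion * idx.size)))
        train_idx.append(rng.choice(idx, size=n, replace=False))
    return np.concatenate(train_idx)
```

Calling this with proportions 0.05, 0.10, 0.20, and 0.30 reproduces the four training sample settings tested in the experiments.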

Machine Learning Classification-Artificial Neural Networks (ANNs)
ANNs, a subset of machine learning, have already shown great promise in HSI classification. The network training was performed using the open-source software ffnet version 0.8.0 [51], with the standard sigmoid function in the hidden layer and the truncated Newton method (TNC) for gradient optimization [52,53]. The number of neurons was set as equal to the number of input bands, and the maximum number of iterations was set to 5000. Fifty percent of the pixels from each class of the HSI were randomly selected to form the training dataset for the assessment of classification accuracy. The selection and assessment were repeated 20 times to obtain reliable accuracy estimates on the training dataset. Percentages of 10%, 20%, 40%, and 60% of the pixels were randomly selected from the training dataset to represent the 5%, 10%, 20%, and 30% training samples.
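The paper trains its networks with ffnet's sigmoid hidden layer and TNC optimizer; as a dependency-free stand-in, the sketch below implements a one-hidden-layer sigmoid network with class-probability outputs, trained with plain gradient descent. The architecture matches the description, but the optimizer, learning rate, and initialization here are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class OneHiddenLayerNet:
    """Feed-forward net: sigmoid hidden layer, softmax output.

    Stand-in for the ffnet/TNC setup used in the paper; plain
    full-batch gradient descent is used only to keep the sketch
    self-contained.
    """

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0, 0.5, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0, 0.5, (n_hidden, n_out))
        self.b2 = np.zeros(n_out)

    def forward(self, X):
        self.h = sigmoid(X @ self.W1 + self.b1)
        z = self.h @ self.W2 + self.b2
        ez = np.exp(z - z.max(axis=1, keepdims=True))
        return ez / ez.sum(axis=1, keepdims=True)   # class probabilities

    def train(self, X, y, lr=0.5, iters=1000):
        Y = np.eye(self.W2.shape[1])[y]             # one-hot targets
        for _ in range(iters):
            P = self.forward(X)
            # Cross-entropy gradient, backpropagated through both layers.
            dZ2 = (P - Y) / len(X)
            dW2, db2 = self.h.T @ dZ2, dZ2.sum(0)
            dH = dZ2 @ self.W2.T * self.h * (1 - self.h)
            dW1, db1 = X.T @ dH, dH.sum(0)
            self.W1 -= lr * dW1; self.b1 -= lr * db1
            self.W2 -= lr * dW2; self.b2 -= lr * db2

    def predict(self, X):
        return self.forward(X).argmax(axis=1)
```

For the IP experiments, n_in would be the number of input bands (e.g. the composited MNF+HHT bands), n_hidden the tested neuron count (1 to 1000), and n_out the 16 land-cover classes.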


Machine Learning Classification-Training Sample Proportions
Figure 3 shows the ANN classification results for the IP dataset, using 200 neurons for each proportion of training sample in each category of images. In general, the classification accuracy increased with the training sample proportion, indicating the data-eager characteristics of ANNs. Additionally, regardless of the proportion of training samples, the MNF+HHT-transformed image sets displayed higher accuracy than the MNF-transformed images and the original 220 bands of the IP dataset, indicating that the data-frequency transformations by MNF and HHT significantly improved the classification accuracy. Moreover, MNF+HHT transformation remarkably reduced the dependence on the amount of training data when using an ANN. For instance, with a 5% training sample, the MNF1-10+HHT images and the MNF1-14+HHT images achieved 96.33% and 97.02% accuracies, respectively, which were 4.58% and 5.28% higher than the 91.75% accuracy achieved by the original 220 band Indian Pine images with a 30% training sample.
Furthermore, Figure 3 displays the results for the 5% to 30% training sample proportions with the original 220 band IP dataset, the MNF-transformed image sets, and the MNF+HHT-transformed image sets. A paired t-test was performed to compare the 220 band set with the MNF+HHT-transformed image sets. The statistical results showed that both the MNF1-10+HHT and MNF1-14+HHT transformations produced significantly higher accuracy than classification on the original 220 band IP dataset (p-values 0.058 and 0.059, respectively; α = 0.10).
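The paired comparisons used throughout the results can be reproduced from the matched accuracy series. The helper below is an illustrative sketch: it computes only the paired t statistic and degrees of freedom (the accuracy vectors shown in the test are toy values, and the p-value lookup from a Student-t table is left out to keep the sketch dependency-free).

```python
import numpy as np

def paired_t_statistic(acc_a, acc_b):
    """Paired t statistic for two matched accuracy series.

    acc_a, acc_b: accuracies of two methods over the same training
    sample proportions (matched pairs, as in the paper's paired
    t-tests). Returns (t, degrees_of_freedom); the p-value is then
    read from a Student-t table or scipy.stats.t.sf(abs(t), df) * 2.
    """
    d = np.asarray(acc_a, float) - np.asarray(acc_b, float)
    n = d.size
    t = d.mean() / (d.std(ddof=1) / np.sqrt(n))
    return t, n - 1
```

With four training sample proportions per comparison, the tests here have three degrees of freedom.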
Larger improvements were observed for the MNF+HHT transformation, as shown in Figure 4. For example, the MNF1-14+HHT transformation improved the accuracy from 62.17% to 97.02% for the ANN classification, a 34.85% improvement, in contrast with the 27.79% improvement from 62.17% to 89.96% for the MNF1-14 image set. For a more rigorous comparison between MNF-transformed and MNF+HHT-transformed images (Figure 4): with the 5% training sample, the accuracy of the MNF1-10+HHT-transformed images reached 96.33%, which was 5.93% higher than the 90.40% accuracy achieved by MNF transformation alone. With the 10%, 20%, and 30% training samples, the accuracies of the MNF1-10+HHT-transformed images were 6.01%, 5.62%, and 4.22% higher than those of the MNF1-10 images, respectively. Likewise, higher accuracies were found for the MNF1-14+HHT images in comparison with the MNF1-14 images. With the 5% training sample, the accuracy of the MNF1-14+HHT-transformed images reached 97.02%, which was 7.06% higher than the 89.96% accuracy achieved by MNF transformation alone. With the 10%, 20%, and 30% training samples, the accuracies of the MNF1-14+HHT-transformed images were 6.77%, 5.70%, 4.19%, and 3.41% higher than those of the MNF1-14 images, respectively.

For the PaviaU dataset, a similar pattern to the IP dataset was observed, as shown in Figures 5 and 6. As shown in Figure 5, the classification accuracy rose with the training sample proportion. Likewise, the MNF+HHT-transformed image sets showed higher accuracy than the MNF-transformed images and the original 103 bands of the PaviaU dataset at every proportion of training samples, which highlights again that MNF and HHT transformations significantly improved the classification accuracy.
Moreover, it was also observed that the MNF and HHT transformations successfully lowered the demand for training samples. When using 5% training samples, the MNF1-10+HHT images and the MNF1-14+HHT images achieved 93.58% and 92.44% accuracies, respectively, which were 1.45% and 0.31% higher than the 92.13% accuracy achieved by the original 103 band PaviaU image with a 30% training sample.
Additionally, Figure 6 displays the accuracy comparison for the 5% to 30% training sample proportions with the original 103 bands of the PaviaU dataset, the MNF-transformed image sets, and the MNF+HHT-transformed image sets. Based on a paired t-test, the statistical results showed that the MNF1-10, MNF1-10+HHT, and MNF1-14+HHT transformations produced significantly higher accuracies than classification on the original 103 band PaviaU dataset (p-value < 0.001).

Machine Learning Classification-Neuron Numbers
To understand the influence of the number of neurons in the ANN on classification accuracy, two categories of images, the MNF1-10+HHT and MNF1-14+HHT image sets, were compared in terms of classification performance, using 1 to 1000 neurons in the hidden layer with 5%, 10%, 20%, and 30% training sample proportions. Tables 1-4 show the classification accuracies of the MNF1-10+HHT and MNF1-14+HHT image sets for the IP and PaviaU datasets. The highest accuracy value in each training sample proportion column is shown in bold; values above 95% are shaded in light gray, and values above 99% are shaded in dark gray.
For the IP dataset, as shown in Table 1, for the MNF1-10+HHT image set with 5% and 10% training sample proportions, the highest accuracies of 96.94% and 98.91% occurred at 800 neurons. With 20% and 30% training sample proportions, the highest accuracies were found when the hidden layer had 600 and 500 neurons, respectively. For the IP MNF1-14+HHT image set (Table 2), the highest accuracies for the 5%, 10%, 20%, and 30% training sample proportions appeared when the hidden layer had 600, 1000, 800, and 500 neurons, respectively. Based on paired t-tests, both MNF1-10+HHT and MNF1-14+HHT transformations produced significantly higher accuracies when more training samples were used.
For both the MNF1-10+HHT and MNF1-14+HHT image sets of the IP dataset, significantly higher accuracies were observed when using a 10% training sample than when using a 5% training sample (p-value = 0.0002, α = 0.01 and p-value = 0.0054, α = 0.01, respectively). Similarly, significantly higher accuracy was achieved when using a 20% training sample than when using a 10% training sample (p-value < 0.0001, α = 0.01 and p-value = 0.0589, α = 0.10). However, no significant difference was found between the 20% and 30% training sample proportions, which demonstrates the limits of the accuracy improvement that can be achieved by increasing the training sample size. Moreover, comparing accuracy values, the MNF1-14+HHT image set reached a value above 99% when the hidden layer used 30 neurons at a 20% training sample proportion, whereas the MNF1-10+HHT image set needed 80 neurons, which supports the inference that the MNF1-14+HHT images carried more discriminative information across classes and thus supported better classification.
For the PaviaU MNF1-10+HHT image set, as shown in Table 3, above 95% accuracy was achieved using 5%, 10%, 20%, and 30% samples when the hidden layer had 30, 20, 15, and 10 neurons, respectively. With a 5% training sample proportion, the highest accuracy of 95.09% occurred at 30 neurons. With 10%, 20%, and 30% training sample proportions, the highest accuracies were found when the hidden layer had 800, 600, and 1000 neurons, respectively. The corresponding results for the PaviaU MNF1-14+HHT image set are displayed in Table 4.
Compared to the IP dataset, smaller but still positive improvements were observed for the MNF+HHT transformation, as shown in Figure 6. For example, MNF1-10+HHT improved the accuracy from 87.64% to 93.58% for the ANN classification, a 5.94% accuracy improvement.

Visually, Figures 7-10 present the classification accuracy results of the IP and PaviaU MNF1-10+HHT and MNF1-14+HHT image sets with 5%, 10%, 20%, and 30% training sample proportions. Based on the structure of an ANN, the number of parameters in each layer was calculated. First, the number of neurons in the hidden layer was set as equal to the number of input bands. Second, the number of outputs was set to 16, the number of classes in the IP dataset. The total number of parameters could therefore be estimated from the number of input bands. In the present study, 5 to 220 bands derived from the IP dataset were taken as the input layer, whereas the hidden layer had 1 to 1000 neurons; the output layer produced the probability of each of the 16 classes. As shown in Figure 11, the estimated number of parameters rose sharply with the number of input bands, and the rise steepened as the number of neurons in the hidden layer increased. As the number of parameters surges, the model becomes more complex and tends to over-fit when the number of available training samples is limited.
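The parameter count for such a one-hidden-layer network can be computed directly. The formula below assumes fully connected layers with bias terms, consistent with a standard feed-forward ANN (the paper does not spell out the bias terms).

```python
def ann_param_count(n_inputs, n_hidden, n_classes=16):
    """Total weights and biases in a one-hidden-layer fully connected ANN.

    Each layer contributes (fan_in + 1) * fan_out parameters (the +1 is
    the bias), so the count grows with the band number, with the neuron
    number, and with their product, which is why reducing the input
    bands via MNF+HHT also shrinks the model.
    """
    return (n_inputs + 1) * n_hidden + (n_hidden + 1) * n_classes

# e.g. ann_param_count(220, 200) for the original 220 IP bands with
# 200 hidden neurons, versus ann_param_count(14, 200) after MNF selection.
```

Comparing the two calls in the comment makes the dimensionality-reduction benefit concrete: the transformed input needs a small fraction of the parameters of the full 220 band input.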
Figures 12 and 13 show the maps generated from the best classification results for each training sample proportion. In general, misclassified pixels can be observed around the boundaries of classification blocks. The classification accuracy increased as the training sample proportion increased, as well as when the number of neurons increased, and the 30% training sample proportion produced the highest accuracy almost every time. However, the rate of accuracy increase was more apparent when the number of neurons was below 200; the accuracy improvement curve became relatively flat when more than 200 neurons were used. This result revealed that using more discriminative information from transformed images can reduce the number of neurons needed to adequately describe the data, as well as reducing the complexity of the ANN model. Furthermore, several interesting results were observed in the experiments with these two datasets. Compared to the IP dataset with 220 bands, the PaviaU dataset derived from ROSIS, comprising 103 bands and nine classes, needed fewer neurons to achieve a similar classification accuracy.
Regarding band selection, the performance of MNFs 1-14 was superior to that of MNFs 1-10 for the IP dataset, reflecting that MNFs 1-10 might have excluded some effective spectral information, whereas MNFs 1-10 showed superior performance to MNFs 1-14 for the PaviaU dataset (using 5% and 10% training samples), reflecting that MNFs 1-14 might have included ineffective spectral information and thus decreased the classification accuracy. As shown in Figure 14, for the PaviaU dataset, the order of the MNF images represents the spectral information of the scene: based on a visual evaluation, the MNF 1 to MNF 10 images conveyed better scene information than the MNF 11 to MNF 14 images. In short, the PaviaU image set needed fewer MNFs than the IP image set to achieve a similar classification accuracy, due to its lower-dimensional spectral information.

For the IP dataset, the training data proportions of 5% and 10% resulted in unsatisfactory classification in the 220 band run because some classes possessed only a few pixels, causing insufficient training. For example, the classes of "Oats", "Hay-windrowed", and "Alfalfa" possessed only 1, 2, and 3 pixels, respectively, in the 5% training data selection, which resulted in lower overall accuracy. However, the proposed method reached a high overall accuracy of 97.62% even with insufficient training data, such as the 5% selection, which proves its usability in situations with limited training data and high-dimensional spectral information.


Conclusions
To enhance HSI classification, this study proposes a process integrating MNF and HHT to reduce image dimensions and decompose images. Specifically, MNF and HHT function as feature extractor and image decomposer, respectively, to minimize the influences of noises and dimensionality. This study tested two variables, the number of neurons and training sample proportion, to evaluate the variation of ANN classification accuracy.
For both the IP and PaviaU hyperspectral datasets, the statistically significant classification accuracy improvement indicated that the proposed MNF+HHT process had excellent and stable performance. The major contributions and findings can be summarized as follows.

1.
With the aim of solving two critical issues in HSI classification, the curse of dimensionality and the limited availability of training samples, this study proposes a novel approach integrating MNF and HHT transformations into ANN classification. MNF was performed to reduce the dimensionality of the HSI, and the decomposition function of HHT produced more discriminative information from the images. After the MNF and HHT transformations, training samples were selected for each land cover type at four proportions and tested using 1-1000 neurons in an ANN. For comparison purposes, three categories of image sets, the original HSI dataset, MNF-transformed images (two sets), and MNF+HHT-transformed images (two sets), were compared regarding their ANN classification performance.

2.
Two HSI datasets, the Indian Pines (IP) and Pavia University (PaviaU) datasets, were tested with the proposed method. The results showed that the IP MNF1-14+HHT-transformed images achieved the highest accuracy of 99.81% with a 30% training sample using 500 neurons, whereas the PaviaU dataset achieved the highest accuracy of 98.70% with a 30% training sample using 800 neurons. The results revealed that the proposed approach of integrating MNF and HHT transformations efficiently and significantly enhanced HSI classification performance by the ANN.

3.
In general, the classification accuracy increased as the training sample proportion increased and as the number of neurons increased, indicating the data-eager characteristics of ANNs. The MNF+HHT-transformed image sets also displayed statistically higher accuracy. A large accuracy improvement, 34.85%, was observed for the IP MNF1-14+HHT image set compared with the original 220 band IP image using 5% training samples. However, no significant difference was found between the 20% and 30% training sample proportions, which demonstrates the limits of the accuracy improvement that can be achieved by increasing the sample size. The accuracy improvement for the PaviaU dataset was smaller but still positive. For the PaviaU dataset, 10 MNFs showed superior performance to 14 MNFs when using 5% and 10% training samples, which reflects that 14 MNFs might include ineffective spectral information and thus decrease the classification accuracy. The PaviaU image set needed fewer MNFs than the IP set did to achieve a similar classification accuracy, due to its lower-dimensional spectral information.

4.
Additionally, the accuracy improvement curve became relatively flat when more than 200 neurons were used for both datasets. This observation revealed that using more discriminative information from transformed images can reduce the number of neurons needed to adequately describe the data, as well as reducing the complexity of the ANN model.

5.
The proposed approach suggests new avenues for further research on HSI classification using ANNs. Various DL-based methods, such as semantic segmentation [54], manifold learning, GANs, RNNs, SAEs, SLFNs, ELMs, or automatic feature-extraction techniques, could be investigated as possible future research directions.