3.2. Machine Learning Classification—Training Sample Proportions
Four training sample proportions (5%, 10%, 20%, and 30%) were tested separately with three categories of images (i.e., the original 220 bands of the IP dataset, two sets of MNF-transformed images, and two sets of MNF+HHT-transformed images) to investigate how changing the training sample proportion would affect the ANN's classification performance. As a benchmark, 200 neurons were used in the hidden layer of the ANN to test the classification performance.
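The per-class (stratified) selection of training pixels implied by these proportions can be sketched as follows. This is an illustrative reconstruction in Python, not the authors' code; the function name, seeding, and the minimum of one training pixel per class are our own assumptions:

```python
import random
from collections import defaultdict

def stratified_sample(labels, proportion, seed=0):
    """Select a per-class fraction of pixel indices for training.

    `labels` is a flat list of class IDs (one per labeled pixel);
    indices not selected for training form the test set. A generic
    sketch of a 5%-30% per-class sampling scheme, not the exact
    procedure used in the paper.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    train = []
    for lab, idxs in by_class.items():
        rng.shuffle(idxs)
        # keep at least one training pixel even for tiny classes
        k = max(1, round(len(idxs) * proportion))
        train.extend(idxs[:k])
    test = sorted(set(range(len(labels))) - set(train))
    return sorted(train), test
```

Stratifying by class keeps very small classes represented in the training set, which matters at the 5% proportion where some classes contribute only a handful of pixels.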
Figure 2 shows the ANN classification results for the IP dataset, using 200 hidden neurons, for each training sample proportion in each category of images. In general, the classification accuracy increased with the training sample proportion, reflecting the data-hungry nature of ANNs. Additionally, regardless of the training sample proportion, the MNF+HHT-transformed image sets displayed higher accuracy than both the MNF-transformed images and the original 220 bands of the IP dataset, indicating that the data-frequency transformations by MNF and HHT significantly improved the classification accuracy.
Moreover, the MNF+HHT transformation remarkably reduced the ANN's dependence on the amount of training data. For instance, with only a 5% training sample, the MNF1–10+HHT and MNF1–14+HHT images achieved 96.33% and 97.02% accuracy, respectively, which were 4.58% and 5.28% higher than the 91.75% accuracy achieved by the original 220 band Indian Pines images with a 30% training sample.
Furthermore, Figure 3 displays the results of the 5% to 30% training sample proportions for the original 220 band IP dataset, the MNF-transformed image sets, and the MNF+HHT-transformed image sets. A paired t-test was performed to compare the 220 band set with the MNF+HHT-transformed image sets. The statistical results showed that both the MNF1–10+HHT and MNF1–14+HHT transformations produced significantly higher accuracy than classification on the original 220 band IP dataset (p-values 0.058 and 0.059, respectively; α = 0.10).
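The paired comparison underlying such p-values can be reproduced with the standard paired t statistic computed over accuracies at matched training proportions. The helper below is a generic stdlib sketch (the function name is ours); the resulting t is then compared against a t distribution with n − 1 degrees of freedom, e.g. via `scipy.stats.t.sf`:

```python
import math
from statistics import mean, stdev

def paired_t(a, b):
    """Paired t statistic for two accuracy series measured on the
    same training sample proportions (e.g. 5%, 10%, 20%, 30%).

    Returns (t, degrees of freedom). The series must be the same
    length and paired element-by-element.
    """
    d = [x - y for x, y in zip(a, b)]   # per-proportion differences
    n = len(d)
    t = mean(d) / (stdev(d) / math.sqrt(n))
    return t, n - 1
```

With only four paired observations the test has three degrees of freedom, which is why a relaxed significance level (α = 0.10) is reasonable here.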
Larger improvements were observed for the MNF+HHT transformation, as shown in Figure 4. For example, the MNF1–14+HHT transformation improved the ANN classification accuracy from 62.17% to 97.02%, a 34.85% improvement, in contrast with the 27.79% improvement (from 62.17% to 89.96%) for the MNF1–14 image set.
A more rigorous comparison of the accuracy of the MNF-transformed and MNF+HHT-transformed images (Figure 4) shows that, with the 5% training sample, the accuracy of the MNF1–10+HHT-transformed images reached 96.33%, which was 5.93% higher than the 90.40% accuracy achieved by MNF transformation alone. With the 10%, 20%, and 30% training samples, the accuracies of the MNF1–10+HHT-transformed images were 6.01%, 5.62%, and 4.22% higher than those of the MNF1–10 images, respectively. Likewise, higher accuracies were found for the MNF+HHT transformation in the comparison of the MNF1–14+HHT and MNF1–14 images. With the 5% training sample, the accuracy of the MNF1–14+HHT-transformed images reached 97.02%, which was 7.06% higher than the 89.96% accuracy achieved by MNF transformation alone. With the 10%, 20%, and 30% training samples, the MNF1–14+HHT-transformed images remained more accurate than the MNF1–14 images, with margins ranging from 6.77% down to 3.41%.
For the PaviaU dataset, a pattern similar to that of the IP dataset was observed, as shown in Figure 5 and Figure 6. As shown in Figure 5, the classification accuracy rose with the training sample proportion. Likewise, the MNF+HHT-transformed image sets showed higher accuracy than the MNF-transformed images and the original 103 bands of the PaviaU dataset at every training sample proportion, again highlighting that the MNF and HHT transformations significantly improved the classification accuracy.
Moreover, the MNF and HHT transformations again successfully lowered the demand for training samples. When using 5% training samples, the MNF1–10+HHT and MNF1–14+HHT images achieved 93.58% and 92.44% accuracy, respectively, which were 1.45% and 0.31% higher than the 92.13% accuracy achieved by the original 103 band PaviaU image with a 30% training sample.
Additionally, Figure 6 displays the accuracy comparison of the 5% to 30% training sample proportions for the original 103 bands of the PaviaU dataset, the MNF-transformed image sets, and the MNF+HHT-transformed image sets. Based on a paired t-test, the MNF1–10, MNF1–10+HHT, and MNF1–14+HHT transformations produced significantly higher accuracies than classification on the original 103 band PaviaU dataset (p-value < 0.001).
Compared to the IP dataset, smaller but still positive improvements were observed for the MNF+HHT transformation, as shown in Figure 6. For example, MNF1–10+HHT improved the ANN classification accuracy from 87.64% to 93.58%, a 5.94% improvement.
3.3. Machine Learning Classification—Neuron Numbers
To understand the influence of the number of neurons in the ANN on classification accuracy, two categories of images, the MNF1–10+HHT and MNF1–14+HHT images, were compared in terms of classification performance, using 1 to 1000 neurons in the hidden layer with 5%, 10%, 20%, and 30% training sample proportions. Table 1, Table 2, Table 3 and Table 4 show the classification accuracies of the MNF1–10+HHT and MNF1–14+HHT image sets for the IP and PaviaU datasets. The highest accuracy value in each training sample proportion column is shown in bold; values above 95% are shaded in light gray, and values above 99% are shaded in dark gray.
For the IP dataset, as shown in Table 1, for the MNF1–10+HHT image set with 5% and 10% training sample proportions, the highest accuracies of 96.94% and 98.91% occurred at 800 neurons. With 20% and 30% training sample proportions, the highest accuracies were found when the hidden layer had 600 and 500 neurons, respectively. In Table 2, for the Indian Pines MNF1–14+HHT image set, the highest accuracies at the 5%, 10%, 20%, and 30% training sample proportions appeared when the hidden layer had 600, 1000, 800, and 500 neurons, respectively. Based on a paired t-test, both the MNF1–10+HHT and MNF1–14+HHT transformations produced significantly higher accuracies when more training samples were used.
For both the MNF1–10+HHT and MNF1–14+HHT image sets for the IP dataset, significantly higher accuracies were observed when using a 10% training sample than when using a 5% training sample (p-value = 0.0002, α = 0.01 and p-value = 0.0054, α = 0.01, respectively). Similarly, significantly higher accuracy was achieved when using a 20% training sample than when using a 10% training sample (p-value < 0.0001, α = 0.01 and p-value = 0.0589, α = 0.10). However, no significant difference was found between the 20% and 30% training sample proportions, which demonstrates the limits of the accuracy improvement achievable by increasing the training sample size. Moreover, comparing the accuracy values, the MNF1–14+HHT image set reached above 99% when the hidden layer used 30 neurons at a 20% training sample proportion, whereas the MNF1–10+HHT image set needed 80 neurons, which suggests that the MNF1–14+HHT images carried more discriminative information across classes to support better classification.
For the PaviaU MNF1–10+HHT image set, as shown in Table 3, above 95% accuracy was achieved with the 5%, 10%, 20%, and 30% samples when the hidden layer had 30, 20, 15, and 10 neurons, respectively. With a 5% training sample proportion, the highest accuracy of 95.09% occurred at 30 neurons. With the 10%, 20%, and 30% training sample proportions, the highest accuracies were found when the hidden layer had 800, 600, and 1000 neurons, respectively. As displayed in Table 4, the PaviaU MNF1–14+HHT image set achieved above 95% accuracy with 10 neurons using the 10% to 30% training samples. The highest accuracies at the 5%, 10%, 20%, and 30% training sample proportions appeared when the hidden layer had 600, 30, 600, and 800 neurons, respectively.
Visually, Figure 7, Figure 8, Figure 9 and Figure 10 present the classification accuracy results of the IP and PaviaU MNF1–10+HHT and MNF1–14+HHT image sets with 5%, 10%, 20%, and 30% training sample proportions. Based on the structure of the ANN, the number of parameters in each layer was calculated. First, the number of neurons in the hidden layer was set equal to the number of input bands. Second, the size of the output layer was set to 16, the number of classes in the IP dataset. Therefore, the total number of parameters could be estimated from the number of input bands. In the present study, 5 to 220 bands derived from the IP dataset were taken as the input layer, the hidden layer had 1 to 1000 neurons, and the output layer produced the probabilities of the 16 classes. As shown in Figure 11, the estimated number of parameters rose rapidly with the number of input bands, and this rise was amplified as the number of neurons in the hidden layer increased. As the estimated number of parameters surges, the model becomes more complex and tends to over-fit when the number of available training samples is limited.
Figure 12 and Figure 13 show the maps generated from the best classification result for each training sample proportion. In general, misclassified pixels can be observed around the boundaries of the classification blocks. The classification accuracy clearly increased both as the training sample proportion increased and as the number of neurons increased, and the 30% training sample proportion produced the highest accuracy almost every time. However, the rate of accuracy improvement was more apparent when the number of neurons was below 200; the accuracy curve became relatively flat when more than 200 neurons were used. This result revealed that using more discriminative information from transformed images can reduce the number of neurons needed to adequately describe the data, as well as the complexity of the ANN model.
Furthermore, several interesting results were observed in the experiments with these two datasets. Compared to the IP dataset with 220 bands, the PaviaU dataset, derived from ROSIS with 103 bands and nine classes, needed fewer neurons to achieve a similar classification accuracy. Regarding band selection, MNFs 1–14 outperformed MNFs 1–10 for the IP dataset, suggesting that MNFs 1–10 may have excluded some effective spectral information, whereas MNFs 1–10 outperformed MNFs 1–14 for the PaviaU dataset (using 5% and 10% training samples), suggesting that MNFs 1–14 may have included ineffective spectral information that decreased the classification accuracy. As shown in Figure 14, for the PaviaU dataset, the order of the MNF images reflects the spectral information of the scene: based on a visual evaluation, the MNF 1 to MNF 10 images depicted the scene better than the MNF 11 to MNF 14 images. In short, the PaviaU images needed fewer MNF components than the IP image set to achieve a similar classification accuracy, owing to their lower-dimensional spectral information.
For the IP dataset, the 5% and 10% training data proportions resulted in unsatisfactory classification for the 220 band run because some classes contained only a few pixels, causing insufficient training. For example, the "Oats", "Hay-windrowed", and "Alfalfa" classes possessed only 1, 2, and 3 pixels, respectively, in the 5% training data selection, which lowered the overall accuracy. However, the proposed method reached a high overall accuracy of 97.62% even with such limited training data, which demonstrates its usability in situations with scarce training data and high-dimensional spectral information.