Decision Fusion Framework for Hyperspectral Image Classification Based on Markov and Conditional Random Fields

Classification of hyperspectral images is a challenging task owing to the high dimensionality of the data, limited ground truth data, collinearity of the spectra and the presence of mixed pixels. Conventional classification techniques do not cope well with these problems. Thus, in addition to the spectral information, features were developed for a more complete description of the pixels, e.g., containing contextual information at the superpixel level or mixed pixel information at the subpixel level. This has encouraged the development of fusion techniques that jointly employ multiple feature sets and the decisions of individual classifiers. In this work, we present a flexible decision fusion framework addressing these issues. In a first step, we propose to use sparse fractional abundances as decision source, complementary to class probabilities obtained from a supervised classifier. This specific selection of complementary decision sources enables the description of a pixel in a more complete way, and is expected to mitigate the effects of small training sample sizes. Secondly, we propose to apply a fusion scheme, based on the probabilistic graphical Markov Random Field (MRF) and Conditional Random Field (CRF) models, which inherently incorporate spatial information into the fusion process. To strengthen the decision fusion process, consistency links across the different decision sources are incorporated to encourage agreement between their decisions. The proposed framework offers flexibility such that it can be extended with additional decision sources in a straightforward way. Experimental results on two real hyperspectral images show superiority over several other approaches in terms of classification performance when very limited training data is available.


Introduction
In recent years, hyperspectral image classification has become a very attractive area of research due to the rich spectral information contained in hyperspectral images (HSI). However, in remote sensing, acquiring ground truth information is a difficult and expensive procedure, generally leading to a limited amount of training data. Together with the high number of spectral bands, this results in the Hughes phenomenon [1], which makes HSI classification a challenging task. Moreover, the high spectral similarity between some materials produces ambiguity and further increases the complexity of the classification problem. In addition, the relatively low spatial resolution of HSI leads to large numbers of mixed pixels, which further hinders the classification task.
To tackle these problems, a more complete description of a pixel and its local context has been pursued. Many spatial-spectral methods were developed that include spatial information through contextual features, e.g., by applying morphological and attribute filters, such as extended morphological profiles [2], extended multi-attribute morphological profiles and extended attribute profiles [3][4][5][6][7][8].
In general, spatial-spectral methods employ feature vectors of much higher dimensionality compared to spectral only methods, thereby decreasing the generalization capability of the classifiers for the same amount of training data. To deal with this, feature fusion and decision fusion methods have been developed. In feature fusion, the features are fused directly, for instance in a stacked architecture or using composite or multiple kernels. In [9], a feature fusion method was introduced by using a stacked feature architecture of morphological information and original hyperspectral data. Ref. [10] used different bands and different morphological filters as spatial features to build dedicated kernels and subsequently a composite kernel was built from these individual kernels. Similar composite kernel methods were applied in [11].
Decision fusion methods obtain probability values (decisions) from different individual feature sets by employing probabilistic classifiers and then perform fusion of the decisions. Several papers applied decision fusion rules to combine pixel-based classification results. In [12,13], the majority voting rule was used as a means to fuse several outputs (decisions) produced by basic classifiers. Ref. [14] used consensus theory [15] to generate opinion pool fusion rules to fuse posterior class probabilities obtained from minimum distance classifiers. In [16], probability outputs produced by maximum likelihood classifiers were fused using a weighted linear opinion rule and a weighted majority voting rule. The latter decision fusion rule was also employed for combining the results from supervised SVMs and unsupervised K-means classifiers [17].
Another group of methods applied probabilistic graphical Markov Random Field (MRF) and Conditional Random Field (CRF) models as regularizers after decision fusion. These models perform a maximum a posteriori classification by minimizing an energy function that includes smoothness constraints between neighboring variables. In [18], an MRF regularizer was applied to a linear combination of pixel-based probabilities and superpixel probabilities. In [19], global and local probabilities produced by SVM and subspace multinomial logistic regression classifiers were combined with the linear opinion pool rule and then refined with an MRF regularizer. In a similar manner, in [20], probabilities from probabilistic (one vs. one) SVM and Multinomial Logistic Regression (MLR) classifiers were combined. In [21], rotation forests were used to produce a set of probabilities, which were then fused by averaging over all probability values from the different rotation forests, and regularized by an MRF. In [22], multiple spatial features were used in a fusion framework in which a distinction was made between reliable and unreliable outputs. An MRF was then applied to determine the labels of the unreliable pixels. In [23], a method was proposed that linearly combined different decisions, weighted by the accuracies of each of the sources. The obtained single source was then regularized by an MRF.
Apart from being used for spatial regularization, MRFs and CRFs can be used directly as decision fusion methods, by combining multiple sources in their energy function. This strategy has been applied for multisource data fusion in remote sensing [24][25][26][27]. A particularly interesting decision fusion method is proposed in [28] for the fusion of multispectral and Lidar data. They used a CRF model with cross link edges between different feature sources. As far as we know, the strategy of direct decision fusion by using MRF and CRF graphical models has not been applied to the fusion of different decision sources obtained from one hyperspectral image.
In this paper, we propose to perform fusion of different decision sources obtained from one hyperspectral image. The proposed method makes use of MRF and CRF graphical models because of their spatial regularization property and because of their ability to combine multiple decision sources in their energy functions. Since the hyperspectral image is the only image source available, complementary decision sources need to be derived from it. For this, we propose to use fractional abundances, obtained from a sparse unmixing method, SunSAL [29], as decision source. This is expected to provide an improved subpixel description in mixed pixel scenarios and to be well suited in small training size conditions. Fractional abundances have been applied before, as features for a direct hyperspectral image classification [30], or they were first classified with a soft classifier that generates class probabilities to be used in a decision fusion method [23]. On the other hand, sparse representation classification (SRC) methods were employed. These methods describe a spectrum as a sparse linear combination of training data (endmembers) in a dictionary, similarly as in sparse spectral unmixing. They facilitate the description of mixed pixels and were proven to be well suited for classification of high dimensional data with limited training samples, and in particular for hyperspectral image classification [31]. To employ the spatial correlation of HSI, methods forcing structured sparsity were developed as well [32,33]. To our knowledge, fractional abundances have never been applied directly as decision source in a decision fusion framework.
Along with the abundances, class probabilities from a probabilistic classifier (the MLR classifier) are generated. The input of the MLR classifier is initially provided by the reflectance spectra, but, alternatively, contextual features are applied as input as well. The two decision sources (abundances and probabilities) are complementary views of the hyperspectral image of a different nature and provide a more complete description of each pixel, which is expected to be favorable in the case of small training sizes. To combine both decision sources, we employ a decision fusion approach similar to the one proposed in [28]. For this, we use MRF or, alternatively, CRF graphical models that include, apart from spatial consistency constraints, cross links between the two decision sources to enforce consistency across their decisions. Finally, the framework is extended to accommodate three or more decision sources.
In the experimental section, the proposed strategy is demonstrated to improve over the use of each of the decision sources separately, and over the use of several other feature and decision fusion methods from the literature, in small training size scenarios.
The paper is organized as follows: in Section 2, the key elements of the proposed method are presented. In Section 2.2, the decision sources are introduced, while Sections 2.3 and 2.4 describe the proposed decision fusion methods MRFL and CRFL. In Section 3, the proposed framework is validated on two real hyperspectral images and compared with several state-of-the-art decision fusion methods. Finally, the conclusions are drawn in Section 4.

Preliminaries
In this section, we detail our proposed decision fusion approach to combine complementary decision sources based on MRF and CRF graphical models.

MRF Regularization
In the classical single source MRF approach, a graph is defined over a set of $n$ observed pixels $x = \{x_1, \ldots, x_n\}$ and their corresponding class labels $y = \{y_1, \ldots, y_n\}$, associated with the nodes in the graph. The graph edges model the spatial neighborhood dependencies between the pixels. While the pixel values are known, the labels are the variables that have to be estimated. To accomplish this, the joint probability distribution of the observed data and the labels, $P(x, y)$, needs to be maximized over $y$. In terms of energies, the optimal labels are inferred by minimizing the following energy function:
$$E(y) = \sum_{i=1}^{n} \psi_i(y_i) + \beta \sum_{i=1}^{n} \sum_{j \in N_i} \psi_{i,j}(y_i, y_j), \tag{1}$$
where $\psi_i(y_i) = -\ln(p(x_i|y_i))$ are the unary potentials, obtained from the class conditional probabilities $p(x_i|y_i)$ [34]. For high dimensional data, one resorts to the more commonly used $\psi_i(y_i) = -\ln(\hat{p}(y_i|x_i))$, where $\hat{p}(y_i|x_i)$ are the estimated posterior probabilities, obtained from a probabilistic classifier [15,35]. The values $\hat{p}(y_i|x_i)$ are calculated by using the spectral reflectance values of the HSI pixels as features, but, in general, other (e.g., contextual) features may be applied as the inputs to the probabilistic classifier to obtain the posterior probabilities.
$\psi_{i,j}(y_i, y_j) = (1 - \delta(y_i, y_j))$ are the pairwise potentials, which are only label dependent and impose smoothness, based on the similarity of the labels within the spatial neighborhood $N_i$ of pixel $i$. The parameter $\beta$ controls the influence of the spatial neighborhood. In the above, $\delta(y_i, y_j)$ denotes the indicator function ($\delta(a, b) = 1$ for $a = b$ and $\delta(a, b) = 0$ otherwise).
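To make the structure of this energy concrete, the following is a minimal sketch that evaluates it for a candidate labeling. It assumes a 4-connected grid, posterior probabilities stored as nested lists, 0-based class indices, and a weight `beta` on the Potts term; the function name `mrf_energy` and the data layout are illustrative choices, not part of the original method description.

```python
import math

# Offsets of the 4-connected neighborhood (an assumption of this sketch).
NEIGHBORS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def mrf_energy(probs, labels, beta):
    """Energy of a labeling: unary terms plus beta times the Potts terms.

    probs[r][c][k] : estimated posterior of class k at pixel (r, c)
    labels[r][c]   : candidate class label (0-based) at pixel (r, c)
    beta           : weight of the spatial smoothness term
    """
    rows, cols = len(labels), len(labels[0])
    eps = 1e-12  # guards against log(0)
    unary = 0.0
    pairwise = 0.0
    for r in range(rows):
        for c in range(cols):
            # Unary potential: negative log posterior of the chosen label.
            unary -= math.log(probs[r][c][labels[r][c]] + eps)
            # Potts potential over the neighborhood of pixel (r, c).
            # Every ordered pair (i, j) is visited, so each edge counts twice.
            for dr, dc in NEIGHBORS:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    if labels[rr][cc] != labels[r][c]:
                        pairwise += 1.0
    return unary + beta * pairwise
```

A labeling with fewer label changes between neighbors receives a lower energy for the same unary evidence, which is the smoothing effect the pairwise term is meant to produce.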

CRF Regularization
One of the drawbacks of the MRF method is that it models the neighborhood relations between the labels without taking the observed data into account. It is a generative model, estimating the joint distribution of the data and the labels. Conditional Random Fields have several desirable properties, making them more flexible and efficient: (1) They are discriminative models, estimating P(y|x) directly, (2) They take into account the observed data in their pairwise potential terms, i.e., they impose smoothness based on the similarity of the observations within the spatial neighborhood of the pixels.

The Decision Sources
Let $x = \{x_1, \ldots, x_n\}$ be a hyperspectral image containing $n$ pixels, with $x_i \in \mathbb{R}^d$, $d$ being the number of spectral bands. $D = \{(x_1, y_1), \ldots, (x_m, y_m)\}$ is a training set containing $m$ labeled samples $x_j$ ($j = 1, \ldots, m$) and their associated labels $y_j \in \{1, \ldots, C\}$, where $C$ is the number of classes. The aim is to assign a label $y_i$ to each image pixel $x_i$.
In this work, the MRFs and CRFs are used as decision fusion methods, by combining multiple decision sources in their energy functions. We propose the fusion of two decision sources. The first is the probability output from the Multinomial Logistic Regression classifier (MLR) [36], i.e., a supervised classification of the spectral reflectance values. The second source of information is produced by considering the sparse spectral unmixing method SunSAL proposed in [29].
As a first source of information, the spectral values of the pixels are employed as input to an MLR, to obtain classification probabilities for each pixel $x_i$:
$$p_c(x_i) = \frac{\exp(\beta_c^T x_i)}{\sum_{c'=1}^{C} \exp(\beta_{c'}^T x_i)}, \tag{2}$$
where $\beta_c \in \mathbb{R}^d$ ($c = 1, \ldots, C$) are the regression coefficients, estimated from the training data. These probabilities form the probability vector $p(x_i) = (p_1(x_i), \ldots, p_C(x_i))$. A class label can be estimated from the probability vector, e.g., by applying a Maximum a Posteriori (MAP) classifier to it: $\hat{y}^p_i = \arg\max_c p_c(x_i)$.
The second source of information is obtained by computing the fractional abundances of each pixel $x_i$ with SunSAL, in which the training data is used as a dictionary of endmembers, $E = [x_1, \ldots, x_m]$ (i.e., the training pixels are assumed to be pure materials):
$$\hat{a}(x_i) = \arg\min_{a} \frac{1}{2}\|x_i - E a\|_2^2 + \lambda \|a\|_1 \quad \text{subject to } a \geq 0. \tag{3}$$
Then, the obtained abundances that correspond to endmembers having class label $y_j = c$ are summed up to obtain one fractional abundance $\alpha_c(x_i)$ per class $c$, and the abundance vector $\alpha(x_i) = (\alpha_1(x_i), \ldots, \alpha_C(x_i))$. In a similar way as with the vector of classification probabilities, a class label can be estimated from the abundance vector, e.g., by applying a MAP classifier to it: $\hat{y}^\alpha_i = \arg\max_c \alpha_c(x_i)$, similarly to sparse representation classifiers.
Rather than expressing the statistical probability that a pixel is correctly classified as belonging to class $c$, the abundances express the fractional presence of class $c$ within the pixel. They are expected to contain complementary information to the classification probabilities, in particular in mixed pixel scenarios. The use of both decision sources allows for a more complete description of the pixels, which is favorable for high-dimensional data and small training size conditions. Once the individual $\alpha$ and $p$ are obtained from the sparse unmixing and the MLR classifier, the decision fusion of these modalities is performed in terms of MRF and CRF graphical models with composite energy functions, including the contributions from both decision sources.
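The two decision sources can be illustrated with a small sketch. Several points are assumptions made for illustration only: the MLR probabilities are computed as a softmax over linear scores, the sparse unmixing step uses a plain projected-gradient solver for the non-negative L1-regularized least-squares problem (SunSAL itself is an ADMM-based solver and is not reproduced here), class indices are 0-based, and the per-class abundances are normalized to sum to one. All function names are hypothetical.

```python
import math


def mlr_probabilities(x, betas):
    """Softmax class probabilities for one pixel x.

    betas[c] holds the regression coefficients of class c.
    """
    scores = [sum(b * v for b, v in zip(beta_c, x)) for beta_c in betas]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_unmix(x, E, lam=1e-3, n_iter=500):
    """Projected gradient for min_a 0.5*||x - E a||^2 + lam*sum(a), a >= 0.

    x : list of d reflectance values
    E : d x m dictionary, one column per training pixel (endmember)
    This is a stand-in for SunSAL, not the SunSAL algorithm itself.
    """
    d, m = len(E), len(E[0])
    # 1 / ||E||_F^2 never exceeds 1 / lambda_max(E^T E), so it is a safe step.
    step = 1.0 / sum(v * v for row in E for v in row)
    a = [0.0] * m
    for _ in range(n_iter):
        residual = [sum(E[i][j] * a[j] for j in range(m)) - x[i] for i in range(d)]
        grad = [sum(E[i][j] * residual[i] for i in range(d)) + lam for j in range(m)]
        # Gradient step followed by projection onto the non-negative orthant.
        a = [max(0.0, a[j] - step * grad[j]) for j in range(m)]
    return a


def class_abundances(endmember_abundances, endmember_labels, n_classes):
    """Sum the abundances of all endmembers of a class, then normalize."""
    per_class = [0.0] * n_classes
    for value, label in zip(endmember_abundances, endmember_labels):
        per_class[label] += value
    total = sum(per_class)
    return [v / total for v in per_class] if total > 0 else per_class


def map_label(scores):
    """Index of the largest entry (arg max over classes)."""
    return max(range(len(scores)), key=lambda c: scores[c])
```

In this sketch, `map_label(mlr_probabilities(x, betas))` and `map_label(class_abundances(sparse_unmix(x, E), labels, C))` give the two single-source decisions that the fusion step later has to reconcile.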

MRF with Cross Links for Fusion (MRFL)
With each decision source, class labels are associated, i.e., $y^\alpha_i$ for the sparse abundances and $y^p_i$ for the classification probabilities. To allow both decision sources to be fused, a bipartite graph is considered, containing two types of nodes for each pixel, denoting random variables associated with the labels $y^\alpha_i$ and $y^p_i$, respectively. For each type of node, edges are defined that model the spatial neighborhood dependencies between the pixels. Moreover, a cross link is defined, connecting both nodes, i.e., connecting label $y^\alpha_i$ with the corresponding label $y^p_i$ [37] (see Figure 1). Adding this cross link encourages the estimates $\hat{y}^\alpha_i$ and $\hat{y}^p_i$ to be the same, i.e., promotes consistency between both decisions. Note that other cross links are possible (e.g., between neighboring pixels), which were omitted here to avoid possible performance degradation in the case of a denser graph.
Figure 1. The graph representation of MRFL. Green nodes denote the random variables associated with $y^\alpha$, blue nodes denote the random variables associated with $y^p$. Black lines denote the edges that model the spatial neighborhood dependencies. Red lines denote the cross links between $y^\alpha$ and $y^p$, encoding the potential interactions $\psi^{\alpha p}_{i,i}(y^\alpha_i, y^p_i)$. $\gamma$ is the parameter that controls the influence of these interaction terms.
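The goal is now to optimize the joint distribution over the observed data and corresponding labels from both sources: $P(\alpha, p, y^\alpha, y^p)$. For this, the following energy function is minimized:
$$E(y^\alpha, y^p) = \sum_{i} \psi^\alpha_i(y^\alpha_i) + \sum_{i} \psi^p_i(y^p_i) + \beta \sum_{i} \sum_{j \in N_i} \left[ \psi^\alpha_{i,j}(y^\alpha_i, y^\alpha_j) + \psi^p_{i,j}(y^p_i, y^p_j) \right] + \gamma \sum_{i} \psi^{\alpha p}_{i,i}(y^\alpha_i, y^p_i). \tag{4}$$
The unary potentials are given by $\psi^\alpha_i(y^\alpha_i = c) = -\ln(\alpha_c(x_i))$ and $\psi^p_i(y^p_i = c) = -\ln(p_c(x_i))$. The pairwise potentials $\psi^\alpha_{i,j} = (1 - \delta(y^\alpha_i, y^\alpha_j))$ and $\psi^p_{i,j} = (1 - \delta(y^p_i, y^p_j))$ impose smoothness based on the similarity of the labels within the spatial neighborhood of pixel $i$, obtained from the fractional abundances and the classification probabilities, respectively. The last pairwise term $\psi^{\alpha p}_{i,i} = (1 - \delta(y^\alpha_i, y^p_i))$ penalizes disagreement between the labels $y^\alpha_i$ and $y^p_i$. Through the pairwise potentials, the MRFL accounts simultaneously for spatial structuring and consistency between the labelings from the two decision sources.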
The minimization of this energy function is an NP-hard combinatorial optimization problem. Nevertheless, there exist methods which can solve this problem efficiently in an approximate way.
We have applied the graph-cut α-expansion algorithm [38][39][40][41]. Since the last term $\psi^{\alpha p}_{i,i}$ encourages cross-source label consistency, for the vast majority of the pixels, one can expect equal label estimates, $\hat{y}^\alpha_i = \hat{y}^p_i$. For this reason, either of the two may be used as the final labeling result. We refer to [28] for more details on the probabilistic framework for graphical models with such cross-links.
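The interplay of the unary, spatial and cross-link terms can be sketched with a simple local search. Note that this sketch uses Iterated Conditional Modes (ICM), a greedy coordinate-wise minimizer, as a deliberately simpler substitute for the graph-cut α-expansion algorithm used in the paper; ICM generally reaches poorer local minima. Here each undirected spatial edge is weighted once by `beta`, the grid is 4-connected, class indices are 0-based, and the names `local_cost` and `icm_mrfl` are hypothetical.

```python
import math

NEIGHBORS = ((-1, 0), (1, 0), (0, -1), (0, 1))  # 4-connected grid (assumed)


def local_cost(source, labels, other_label, r, c, k, beta, gamma):
    """Cost of assigning label k to pixel (r, c) in one source's label field.

    source[r][c][k] : abundance or probability of class k at pixel (r, c)
    labels          : current label field of the same source
    other_label     : current label of the same pixel in the other source
    """
    rows, cols = len(labels), len(labels[0])
    cost = -math.log(source[r][c][k] + 1e-12)  # unary term
    for dr, dc in NEIGHBORS:                   # spatial Potts term
        rr, cc = r + dr, c + dc
        if 0 <= rr < rows and 0 <= cc < cols and labels[rr][cc] != k:
            cost += beta
    if other_label != k:                       # cross-link term
        cost += gamma
    return cost


def icm_mrfl(alpha, p, beta, gamma, n_sweeps=5):
    """Greedy minimization over both label fields; returns (y_alpha, y_p)."""
    rows, cols, n_classes = len(p), len(p[0]), len(p[0][0])
    classes = range(n_classes)
    # Initialize each label field with the per-source arg max decision.
    y_a = [[max(classes, key=lambda k: alpha[r][c][k]) for c in range(cols)]
           for r in range(rows)]
    y_p = [[max(classes, key=lambda k: p[r][c][k]) for c in range(cols)]
           for r in range(rows)]
    for _ in range(n_sweeps):
        for r in range(rows):
            for c in range(cols):
                y_a[r][c] = min(classes, key=lambda k: local_cost(
                    alpha, y_a, y_p[r][c], r, c, k, beta, gamma))
                y_p[r][c] = min(classes, key=lambda k: local_cost(
                    p, y_p, y_a[r][c], r, c, k, beta, gamma))
    return y_a, y_p
```

With `beta = gamma = 0` the result reduces to the two independent arg max decisions; increasing either weight pulls isolated or conflicting labels toward their neighbors and toward the other source.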

CRF with Cross Links for Fusion (CRFL)
The above method is a generative model and models the joint probability distribution of the labels and the observed data: $P(\alpha, p, y^\alpha, y^p)$. Moreover, only relationships between the class labels are taken into account in the pairwise potentials of the MRFL. As an alternative, we employ a discriminative method which is a generalization of the previous MRFL method, directly modeling the posterior distribution $P(y^\alpha, y^p|\alpha, p)$, by simultaneously taking into account both the relationships between the class labels $y^\alpha$, $y^p$ and the observed data $\alpha$, $p$ in the pairwise potentials (see Figure 2).
Figure 2. Graph representation of CRFL. The purple nodes denote random variables associated with the observed data, the green nodes denote random variables associated with the labels $y^\alpha$, blue nodes denote random variables associated with the labels $y^p$. The turquoise lines denote the link of the labels with the observed data. Black lines denote the edges that model the spatial neighborhood dependencies. Red lines denote the cross links between $(\alpha, y^\alpha)$ and $(p, y^p)$ encoding the potential interactions $\psi^{\alpha p}_{i,i}(y^\alpha_i, y^p_i|\alpha, p)$. $\gamma$ is the parameter that controls the influence of these interaction terms.
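For more details on such models, we refer the reader to [27,28,37]. The energy function is now given by:
$$E(y^\alpha, y^p|\alpha, p) = \sum_{i} \psi^\alpha_i(y^\alpha_i) + \sum_{i} \psi^p_i(y^p_i) + \beta \sum_{i} \sum_{j \in N_i} \left[ \psi^\alpha_{i,j}(y^\alpha_i, y^\alpha_j|\alpha) + \psi^p_{i,j}(y^p_i, y^p_j|p) \right] + \gamma \sum_{i} \psi^{\alpha p}_{i,i}(y^\alpha_i, y^p_i|\alpha, p). \tag{5}$$
The unary terms are equivalent to the ones in the MRFL model. For the pairwise potentials, a contrast sensitive Potts model is applied [42]:
$$\psi^\alpha_{i,j}(y^\alpha_i, y^\alpha_j|\alpha) = (1 - \delta(y^\alpha_i, y^\alpha_j)) \exp\left(-\frac{\|\alpha(x_i) - \alpha(x_j)\|^2}{2\sigma_\alpha^2}\right),$$
$$\psi^p_{i,j}(y^p_i, y^p_j|p) = (1 - \delta(y^p_i, y^p_j)) \exp\left(-\frac{\|p(x_i) - p(x_j)\|^2}{2\sigma_p^2}\right),$$
$$\psi^{\alpha p}_{i,i}(y^\alpha_i, y^p_i|\alpha, p) = (1 - \delta(y^\alpha_i, y^p_i)) \exp\left(-\frac{\|\alpha(x_i) - p(x_i)\|^2}{2\sigma_{\alpha p}^2}\right).$$
The first term encourages neighboring pixels with similar abundance vectors to belong to the same class. The second term encourages neighboring pixels with similar class probabilities to belong to the same class. Finally, the third term encourages assigning the same class label to $y^\alpha_i$ and $y^p_i$ for pixels whose abundance vector is similar to their probability vector. The parameters $\sigma$ are standard deviations that determine the strengths of these enforcements. The optimization of this energy function is again performed with the graph-cut α-expansion algorithm.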
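A contrast sensitive Potts potential can be sketched in a few lines. The Gaussian form with $2\sigma^2$ in the denominator is an assumption of this sketch (a common choice for such potentials); the names `sq_dist` and `contrast_potts` are illustrative.

```python
import math


def sq_dist(u, v):
    """Squared Euclidean distance between two equally long vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def contrast_potts(label_i, label_j, feat_i, feat_j, sigma):
    """Contrast sensitive Potts potential.

    Returns 0 when the labels agree. Otherwise returns a penalty in (0, 1]
    that is largest when the two observation vectors are identical and
    decays as they become more different.
    """
    if label_i == label_j:
        return 0.0
    return math.exp(-sq_dist(feat_i, feat_j) / (2.0 * sigma ** 2))
```

The same function covers all three pairwise terms: for the spatial terms, `feat_i` and `feat_j` are the abundance (or probability) vectors of two neighboring pixels; for the cross link, they are the abundance vector and the probability vector of the same pixel.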
Our proposed methods use the graph-cut α-expansion algorithm [38][39][40][41], which has a worst case complexity of $O(mn^2|P|)$ for a single optimization problem, where $m$ denotes the number of edges, $n$ denotes the number of nodes in the graph and $|P|$ denotes the cost of the minimum cut. Thus, the theoretical computational complexity of our proposed method is $O(kCmn^2|P|)$, with $k$ the upper bound of the number of iterations and $C$ the number of classes. A careless addition of edges to the graph, for instance adding a cross link between each node and all nodes from the second source, would lead to a quadratic increase in the computational complexity. On the other hand, the empirical complexity in real scenarios has been shown to be between linear and quadratic w.r.t. the graph size [38].

Hyperspectral Data Sets
We validated our method on two well-known hyperspectral images: the "ROSIS-03 University of Pavia" and the "AVIRIS Indian Pines" images.

University of Pavia
This scene was acquired by the ROSIS-03 sensor over the University of Pavia, Italy. It contains 610 × 340 pixels, with a spatial resolution of 1.3 m per pixel, and 115 bands with a spectral range from 0.43 to 0.86 µm. Twelve noisy bands have been removed, and the remaining 103 spectral channels are used. A false color composite image along with the available ground reference map is shown in Figure 3.

Indian Pines
Indian Pines was acquired by the AVIRIS sensor over an agricultural site in Northwestern Indiana. This scene consists of 145 × 145 pixels with a spatial resolution of 20 m and 220 spectral bands, ranging from 0.4 to 2.5 µm. Prior to using the dataset, the noisy bands and the water absorption bands were manually discarded, leaving us with 164 bands. An RGB image of the scene along with the available ground reference map is shown in Figure 4.

Parameter Settings
In the experiments, we validated the following specific aspects of the proposed methodology:
• the performance of the sparse representation obtained by the pixels' fractional abundances from SunSAL as decisions, when combined with classification probabilities in a decision fusion scheme;
• the comparison of the performances of MRFL and CRFL as decision fusion methods;
• the flexibility of the proposed fusion methods, by including additional decision sets;
• the performance of the method in the case of small training sample sizes.
The parameters of the proposed methods were set as follows: to generate a balanced small training set, we randomly selected 10 pixels per class from each dataset. This training set was used to estimate the regression coefficients of the MLR classifier and to form the endmember dictionary of the sparse unmixing method.
For both datasets, the regularization parameter of the unmixing method was empirically selected from the range $\lambda \in [10^{-5}, 0.5]$ using a grid search, and the abundances were normalized. The inference parameters $\beta$, controlling the influence of the spatial neighborhood, and $\gamma$, controlling the influence of the cross link consistency, were set by a grid search in the range: [0. . The parameters $\sigma_\alpha$, $\sigma_p$, $\sigma_{\alpha p}$ from the pairwise potentials of the CRFL method were determined as the mean squared differences between the abundances, between the probabilities, and between the abundances and probabilities, respectively [43]. The obtained optimal values for $\lambda$, $\beta$ and $\gamma$ are summarized in Table 1.
Note that $\beta$ represents the total weight of all neighborhood pairwise interactions for both modalities in Equations (4) and (5). In a 4-connected neighborhood, all pairwise interactions are weighted equally with $\beta/8$. Optimal values of $\lambda$ are small, an observation reported in other work as well [44,45]. Accuracies remained stable for values of $\lambda$ in the range $[10^{-5}, 10^{-3}]$.
We performed a sensitivity analysis of the inference parameters $\beta$ and $\gamma$ of our MRFL and CRFL decision fusion methods. Figure 5 shows the evolution of the Overall Accuracy (OA) as a function of the inference parameters. From Table 1 and Figure 5, the following conclusions can be drawn:
• The OA initially improves with increasing $\beta$ and $\gamma$, demonstrating the effectiveness of incorporating the spatial neighborhood and the consistency terms in our proposed methods to correct labels that were initially assigned wrongly by the individual sources.
• In general, the OA is more sensitive to changes of $\beta$, and remains relatively stable for a large range of values of $\gamma$.
• A significant accuracy drop can be observed for higher values of $\beta$ and $\gamma$ in the MRFL method, whereas the CRFL method produces more stable results for different combinations of $\beta$ and $\gamma$. This allows for applying the CRFL method to other images without having to perform exhaustive parameter grid searches.
• The optimal values of $\beta$ and $\gamma$ are substantially higher for CRFL than for MRFL. This is because the CRFL method inherently uses observed data in the pairwise potentials, and thus heavily penalizes small differences between decisions that correspond to different class labels.
• For the Indian Pines image, $\gamma$ is much higher than $\beta$ in the case of CRFL. This can be attributed to the presence of large homogeneous regions that imply a low influence of the spatial neighborhood compared to the consistency terms. In contrast, the University of Pavia image contains fewer large homogeneous regions, leading to an increase of the influence of the spatial neighborhood, with larger values of $\beta$ in the case of CRFL.
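The grid search over the inference parameters can be sketched generically. The scoring callback `evaluate` (for example, a function returning the OA on held-out labeled pixels) and the candidate values are hypothetical; the source does not specify how candidates were scored or which grid values were used.

```python
from itertools import product


def grid_search(evaluate, betas, gammas):
    """Exhaustive search over all (beta, gamma) combinations.

    evaluate(beta, gamma) must return a score where higher is better.
    Returns the tuple (best_score, best_beta, best_gamma).
    """
    best = None
    for beta, gamma in product(betas, gammas):
        score = evaluate(beta, gamma)
        if best is None or score > best[0]:
            best = (score, beta, gamma)
    return best
```

Because the cost grows with the product of the two grid sizes, a method whose accuracy is stable over a wide range of both parameters can be tuned with a much coarser grid.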

[Figure 5: OA as a function of the inference parameters β and γ. Panels: University of Pavia and Indian Pines, each for MRFL and CRFL.]

Experiment 1: Complementarity of the Abundances
In this section, we study the potential of the abundances $\alpha(x_i)$, obtained by the SunSAL algorithm, as decision sources for classification. As a first step, we investigated the complementarity of these sources when compared to the class probabilities $p(x_i)$, obtained from the MLR classifier. For this, we apply a MAP classifier to both the abundances, obtaining class labels $\hat{y}^\alpha_i = \arg\max_c \alpha_c(x_i)$, and the MLR class probabilities, obtaining $\hat{y}^p_i = \arg\max_c p_c(x_i)$. From these, a confusion matrix is generated, in which each element $(k, l)$ shows the percentage of the pixels that were classified as class $k$ by the first and as class $l$ by the second classifier. The obtained confusion matrices for the University of Pavia and Indian Pines images are shown in Figure 6. For comparison, the confusion matrices between the MLR classifier and an SVM classifier are given as well. There is a clearly higher spread in the confusion matrices of SunSAL versus MLR than in the ones of SVM versus MLR. This indicates that SunSAL and MLR disagree more than MLR and SVM do, and that the abundances provide more complementary information to the MLR probabilities than the SVM class probabilities do. This makes the abundances a good candidate decision source in a decision fusion approach.
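The confusion matrix between two classifiers described above can be computed as follows. This is a minimal sketch with 0-based class labels; the off-diagonal mass is added here as one possible scalar summary of the "spread", which is an illustrative choice rather than a measure reported in the text.

```python
def agreement_matrix(labels_a, labels_b, n_classes):
    """Entry (k, l): percentage of pixels labeled k by the first
    classifier and l by the second classifier."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for k, l in zip(labels_a, labels_b):
        counts[k][l] += 1
    n = len(labels_a)
    return [[100.0 * v / n for v in row] for row in counts]


def disagreement_rate(matrix):
    """Percentage of pixels on which the two classifiers disagree
    (sum of all off-diagonal entries)."""
    return sum(v for k, row in enumerate(matrix)
               for l, v in enumerate(row) if k != l)
```

A larger off-diagonal mass means the two decision sources make different errors more often, which is what makes one of them informative for correcting the other.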

Experiment 2: Validation of the Decision Fusion Framework
Next, we validate the proposed decision fusion methods MRFL and CRFL by comparing them with several other classification and decision fusion methods. For a fair comparison, all compared methods are applied on the same two decision sets: the abundances and the class probabilities. Some methods only employ one single source while other methods perform a decision fusion of both sources. Some methods are spectral only, i.e., they do not infer information from neighboring pixels, while others are spatial-spectral methods.
The proposed methods MRFL and CRFL are compared to the following methods:
• SunSAL [29]: sparse spectral unmixing is applied to each test pixel, obtaining the abundance vector $\alpha(x_i) = (\alpha_1(x_i), \ldots, \alpha_C(x_i))$. From this vector, the pixel is labeled as the class corresponding to the largest abundance value: $\hat{y}^\alpha_i = \arg\max_c \alpha_c(x_i)$. This is a single source, spectral only method.
• MLR: the Multinomial Logistic Regression classifier [36], generating the class probabilities $p(x_i) = (p_1(x_i), \ldots, p_C(x_i))$. From this vector, the pixel is labeled as the class corresponding to the largest probability: $\hat{y}^p_i = \arg\max_c p_c(x_i)$. This is also a single source, spectral only method.
• LC: linear combination, a simple decision fusion approach, using a linear combination of the obtained abundances and class probabilities by applying the linear opinion pool rule from [15]. This is a spectral only fusion method. This method was applied in [30] on the same sources as initialisation for a semi-supervised approach.
• MRFG_a: the decision fusion method of [23], in which the decision sources are linearly combined, weighted by the accuracies of the individual sources, and the combined source is regularized by an MRF as in Equation (1). In [23], three different sources were applied. For a fair comparison, we apply their fusion method with the abundances and class probabilities from our method as decision sources.
• MRFG: the same decision fusion method as MRFG_a, but this time, the posterior classification probabilities from the abundances as obtained in [23] are employed. In that work, the abundances were obtained with a matched filtering technique. To produce the posterior classification probabilities, the MLR classifier was used.
• MRF_a: this method applies an MRF regularization on the output of SunSAL as a single source. This is a spatial-spectral single source method.
• MRF_p: a spatial-spectral single source method, applying MRF as a regularizer on the output of the MLR classifier.
• CRF_a: a spatial-spectral single source method, applying CRF as a regularizer on the output of SunSAL.
• CRF_p: a spatial-spectral single source method, applying CRF as a regularizer on the output of the MLR classifier.
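As an illustration of the simplest fusion baseline in this list, the linear opinion pool can be sketched as a weighted average of the per-class scores of the sources. Equal weights are assumed by default, since the weights used for LC are not stated here; the function name is hypothetical.

```python
def linear_opinion_pool(sources, weights=None):
    """Weighted average of several per-class score vectors for one pixel.

    sources : list of vectors, e.g. [abundance_vector, probability_vector]
    weights : one non-negative weight per source (default: equal weights)
    """
    if weights is None:
        weights = [1.0 / len(sources)] * len(sources)
    n_classes = len(sources[0])
    return [sum(w * s[c] for w, s in zip(weights, sources))
            for c in range(n_classes)]
```

For example, pooling the vectors `[0.2, 0.8]` and `[0.6, 0.4]` with equal weights gives approximately `[0.4, 0.6]`, so the fused decision follows the more confident source. Unlike MRFL and CRFL, this rule uses no information from neighboring pixels.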
For the proposed methods, the parameters λ, β and γ are set as in Table 1. The parameter β of the MRFG and MRFG_a methods is set as in [23], i.e., β = 0.5. For the methods where we use MRF and CRF as regularizers, optimal values of the parameters were obtained by a grid search.
All experiments were run on a PC with Intel i7-6700K and 32 GB RAM. The execution time for one run with fixed parameters was in the order of a second for the MRFL and a minute for the CRFL. When performing grid search and averaging over 100 runs, we ran the experiments on the UAntwerpen HPC (CalcUA Super-computing facility) having nodes with 128 GB and 256 GB RAM and 2.4 GHz 14-core Broadwell CPUs, on which the different runs were distributed, leading to speedups by a factor of 10-50.
(a) University of Pavia dataset
Each of the described methods is applied on the University of Pavia image, with a training set of 10 pixels per class. Experiments are repeated 100 times. In Table 2, all results are summarized. Classification accuracies for each class, overall accuracy (OA), average accuracy (AA), kappa coefficient (κ) and standard deviations are given. The OA for the different methods is plotted in Figure 7. It can be observed that the OA and AA are generally higher for the proposed methods MRFL and CRFL. A pairwise McNemar statistical test verified that the proposed methods achieved significantly better classification results than most of the other methods.
Figure 8 shows the obtained classification maps from the different methods. We can observe that the single source methods based on only spectral information, SunSAL and MLR, produce noisy classification maps. The methods in which spatial information is included through MRF or CRF regularization (MRF_a, MRF_p, CRF_a and CRF_p) already yield smoother classification maps. Finally, the methods that perform fusion of both modalities (MRFG, MRFG_a, MRFL and CRFL) generated the best classification maps. The CRFL obtained the map closest to the ground truth map. From the table and the figure, one can also notice that MRFG_a performs better than MRFG, so we can conclude that the direct use of the abundances is superior to the use of probabilities obtained from the abundances.
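The three summary measures reported in the tables can be computed from a confusion matrix of reference labels versus predicted labels. The following is a minimal sketch using the standard definitions of OA, AA and Cohen's kappa, with 0-based class labels; the function name is illustrative.

```python
def accuracy_metrics(y_true, y_pred, n_classes):
    """Return (OA, AA, kappa) for reference labels and predicted labels."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    n = len(y_true)
    # Overall accuracy: fraction of correctly labeled pixels.
    oa = sum(cm[k][k] for k in range(n_classes)) / n
    # Average accuracy: mean of the per-class accuracies
    # (classes without reference pixels are skipped).
    per_class = [cm[k][k] / sum(cm[k]) for k in range(n_classes) if sum(cm[k]) > 0]
    aa = sum(per_class) / len(per_class)
    # Kappa: agreement corrected for the agreement expected by chance.
    pe = sum(sum(cm[k]) * sum(row[k] for row in cm) for k in range(n_classes)) / n ** 2
    kappa = (oa - pe) / (1.0 - pe) if pe < 1.0 else 0.0
    return oa, aa, kappa
```

AA weights every class equally regardless of its size, so it is more sensitive than OA to errors on small classes.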

(b) Indian Pines dataset
Quantitative results from the Indian Pines image are summarized in Table 3 and Figure 9. Obtained classification maps are shown in Figure 10. From this, similar conclusions to the University of Pavia image can be drawn. Notice that CRF_a performs quite well in this image. A pairwise McNemar statistical test shows that the proposed methods perform significantly better than the other methods, with the exception of CRF_a.
We have repeated some of the experiments for larger numbers of training samples per class (20, 50, 100), and noticed that the differences between the methods became smaller. This indicates that the advantages of the proposed methods level out for larger training sizes.
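A pairwise McNemar comparison of two classification results can be sketched as follows. The text does not state which form of the test was used; this sketch assumes the z-statistic form that is common in remote sensing accuracy comparisons, where only the pixels on which exactly one of the two methods is correct contribute. The function name is hypothetical.

```python
import math


def mcnemar_z(y_true, pred_1, pred_2):
    """McNemar z statistic for two classifiers evaluated on the same pixels.

    f12 counts pixels that method 1 labels correctly and method 2 wrongly;
    f21 counts the opposite case. A positive z favors method 1.
    """
    f12 = sum(1 for t, a, b in zip(y_true, pred_1, pred_2) if a == t and b != t)
    f21 = sum(1 for t, a, b in zip(y_true, pred_1, pred_2) if a != t and b == t)
    if f12 + f21 == 0:
        return 0.0  # the two methods never differ in correctness
    return (f12 - f21) / math.sqrt(f12 + f21)
```

Under this form, a difference is usually considered significant at the 5% level when |z| exceeds 1.96.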

Experiment 3: Comparison of Different Decision Sources
With this experiment, we study the effect of using different decision sources in a pairwise manner in the proposed fusion frameworks. Three pairs of sources were applied for the MRFL and four for the CRFL:
• Pair 1: probabilities based on the spectra and probabilities based on the fractional abundances. The first source is the same as in the previous experiments, and the fractional abundances were obtained using SunSAL. Subsequently, the abundances were used as input to an MLR classifier to produce posterior classification probabilities. These two sources of information were then fused with the proposed MRFL and CRFL fusion schemes. The only difference with the previous experiment is therefore that classification probabilities derived from the abundances are used instead of the abundances themselves.
• Pair 2: probabilities based on morphological profiles and probabilities based on the fractional abundances. Initially, (partial) morphological profiles were extracted as in [6] and used as input to an MLR classifier to produce posterior classification probabilities. These were fused with the probabilities from the abundances using the proposed MRFL and CRFL fusion schemes. The difference with Pair 1 is that the morphological profiles contain spatial-spectral information.
• Pair 3: probabilities based on morphological profiles and probabilities based on the spectra.
• Pair 4: for the CRFL only, we conducted one additional pairwise fusion, between the pure fractional abundances and the probabilities based on the morphological profiles.
The pairwise fusion results in terms of OA and their standard deviations for the University of Pavia and Indian Pines datasets are displayed in Table 4. The following conclusions can be drawn:
• In general, accuracies go down when the abundances are not used directly but are instead converted into class probabilities (Pair 1).
• For the University of Pavia image, accuracies slightly improve when the spectral features are replaced by contextual features, but part of this gain is lost again because of the above-mentioned effect (Pair 2 and Pair 3). The best result is obtained with a direct use of abundances along with contextual features (CRF_Pair4).
• For the Indian Pines image, no improvement is observed when including contextual features.

Experiment 4: Additional Sources in the Fusion Framework
The proposed fusion framework is flexible in the sense that additional feature sources can be included. This experiment investigates the case where a third source/modality, preferably one whose features contain spatial information, is added to the two existing modalities in our fusion framework. Along with the two decision sources (i.e., the abundances obtained by SunSAL, and the probabilities derived from the initial spectra by using MLR classification), the probabilities derived from the morphological features are included as an additional source.
As before, each source has its own unary potentials. To avoid increasing the model complexity, we kept the number of parameters unchanged. The three binary potential terms, one for each decision source, connecting neighboring pixels, are all jointly controlled by one parameter β. Three cross-link terms are now required, one for each pair of decision sources; these are jointly controlled by one parameter γ. Ultimately, labels are produced for each source separately, and a majority voting rule is applied to produce the final labels.
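The final majority voting step can be sketched as follows. This is an illustrative implementation only: the function name `majority_vote`, the array layout (one row of labels per decision source) and the tie-breaking behavior are assumptions. The text does not specify how ties are resolved when all three sources disagree; in this sketch, the lowest class index wins.

```python
import numpy as np

def majority_vote(label_maps, n_classes):
    """Fuse per-source label maps by majority voting (hypothetical helper).

    label_maps: integer array of shape (n_sources, n_pixels), classes from 0.
    """
    label_maps = np.asarray(label_maps)
    n_pixels = label_maps.shape[1]
    pixel_idx = np.arange(n_pixels)
    # votes[c, i] counts how many sources assign class c to pixel i.
    votes = np.zeros((n_classes, n_pixels), dtype=np.int64)
    for source_labels in label_maps:
        np.add.at(votes, (source_labels, pixel_idx), 1)
    # argmax returns the first maximum, so ties go to the lowest class index.
    return votes.argmax(axis=0)

if __name__ == "__main__":
    # Three sources, four pixels, three classes (toy values).
    labels = [[0, 1, 2, 2],
              [0, 1, 1, 0],
              [1, 1, 2, 1]]
    print(majority_vote(labels, n_classes=3))  # -> [0 1 2 0]
```

In the toy example, the last pixel receives three different labels, so the outcome there depends entirely on the assumed tie-breaking rule.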
The classification accuracies are shown in Table 5 for both datasets. The results reveal that a straightforward extension of the fusion framework with additional informative decision sources leads to an improvement of the classification accuracies.

Table 5. Classification accuracies [%] based on the fusion of three sources (fractional abundances, probabilities based on spectra and probabilities based on morphological profiles) for the University of Pavia and Indian Pines images.

Conclusions
In this paper, we proposed two novel decision fusion methodologies for hyperspectral image classification in remote sensing, addressing the high dimensionality combined with the scarcity of ground truth information, the mixture of materials present in pixels and the collinearity of spectra in realistic scenarios. The decision fusion framework is based on probabilistic graphical models, MRFs and CRFs, with a specific selection of complementary decision sources: (1) fractional abundances, obtained by sparse unmixing, facilitating the characterization of the subpixel content in mixed pixels, and (2) probabilistic outputs from a soft classifier, expressing confidence about the spectral content of the pixels. Furthermore, the methods simultaneously take into account two types of relationships between the underlying variables: (a) spatial neighborhood dependencies between the pixels and (b) consistency between the two decision sources. Experiments on two real hyperspectral datasets with limited training data demonstrated the effectiveness of the framework. The fractional abundances were shown to be an informative decision source. Both methods, MRFL and CRFL, outperformed other fusion approaches when applied to the same decision sources. The fusion method CRFL produced high overall accuracies and was stable over a large range of parameter values. Finally, the addition of a third decision source improved the classification accuracies. In future work, the aim is to further improve the classification accuracies by including additional parameter learning to estimate the model parameters directly from the training data.