## 1. Introduction

Spectral unmixing (SU) is the process of identifying the spectral signatures of the materials in a scene, often referred to as endmembers, and estimating their relative abundances in the measured spectra. SU is used in a wide range of applications, including crop and vegetation classification, disaster monitoring, surveillance, planetary exploration, the food industry, fire and chemical spread detection, and wild animal tracking [1]. Endmembers play an important role in exploring the spectral information of a hyperspectral image [2,3]. Endmember extraction, the process of obtaining the pure signatures of the different features present in an image, is the first and most crucial step in any image analysis [1,4,5]. SU often requires a definition of the mixing model underlying the observed data. A mixing model describes how the endmembers are combined to form the mixed spectrum measured by the sensor [6]. Given the mixing model, SU then inverts the formation process to infer the quantities of interest, specifically the endmembers and their abundances, from the collected spectra [7,8]. This can be achieved through a radiative transfer model, which accurately describes how light is scattered by the materials in the scene observed by the sensor [6].
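The forward-and-inverse view of unmixing described above can be made concrete with the simplest possible case. The sketch below assumes a linear mixing model with only two endmembers; the function names (`mix_linear`, `unmix_two`) and the reflectance values are hypothetical and chosen purely for illustration, not taken from any dataset or from the cited works.

```python
def mix_linear(endmembers, abundances):
    """Forward model: the mixed spectrum is the abundance-weighted sum of endmembers."""
    n_bands = len(endmembers[0])
    return [
        sum(a * e[b] for a, e in zip(abundances, endmembers))
        for b in range(n_bands)
    ]


def unmix_two(pixel, e1, e2):
    """Inverse step for two endmembers under sum-to-one and non-negativity.

    With a2 = 1 - a1, least squares reduces to projecting (pixel - e2)
    onto (e1 - e2); clipping a1 to [0, 1] enforces non-negativity.
    """
    d = [u - v for u, v in zip(e1, e2)]
    r = [p - v for p, v in zip(pixel, e2)]
    a1 = sum(x * y for x, y in zip(r, d)) / sum(x * x for x in d)
    a1 = min(1.0, max(0.0, a1))
    return [a1, 1.0 - a1]


# Made-up four-band reflectance spectra (assumption, for illustration only).
vegetation = [0.05, 0.08, 0.45, 0.50]
soil = [0.20, 0.25, 0.30, 0.35]

# Simulate a mixed pixel, then recover the abundances from it.
pixel = mix_linear([vegetation, soil], [0.3, 0.7])
estimated = unmix_two(pixel, vegetation, soil)
print([round(a, 3) for a in estimated])
```

In this noise-free setting the inversion recovers the abundances used to build the pixel. With more than two endmembers, a constrained least-squares solver would be needed in place of the closed-form projection.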

The most common approach to spectral unmixing is linear spectral unmixing [6,7], which assumes that each photon reaching the sensor interacts with only one material before it is measured [7]. Promising and excellent results have been recorded with linear spectral unmixing methods, as proposed by Keshava and Mustard [1]. Commonly used linear mixture models include Adaptive Spectral Mixture Analysis (ALSMA) [9], Subspace Matching Pursuit (SMP) [10] and Orthogonal Matching Pursuit (OMP) [11]. Li et al. [12] proposed a robust collaborative sparse regression method to spectrally unmix hyperspectral data based on a robust linear mixture model. Thouvenin et al. [13] proposed a linear mixing model that explicitly accounts for spatial and spectral endmember variability. Foody and Cox [14] used a linear mixture model and a regression-based fuzzy membership function to estimate land cover composition, while in [15] the Vertex Component Analysis (VCA) algorithm is shown to unmix hyperspectral data with lower computational complexity than other conventional methods.

Nonlinear mixing models cope with nonlinear interactions, capturing effects that are commonly present in an image [7]. The robust linear mixture model of Li et al. [12], for example, takes nonlinearity in the image into account by treating it as mere outliers. Linear spectral unmixing generally provides poor accuracy when light undergoes multiple interactions between distinct endmembers, or intimate interaction, before reaching the sensor [16,17]. In this case, the linear mixture model can be advantageously replaced with nonlinear methods [18,19], which provide an alternative approach to SU. When interactions occur at a microscopic level, the materials are said to be intimately mixed. A model proposed by Hapke [6] describes the interactions undergone by light when it comes into contact with a surface composed of particles. Such models involve meaningful and interpretable quantities that have physical significance; however, they require a nonlinear formulation that is complex and complicates the derivation of unmixing strategies [7]. These methods account for the intimate mixture of materials in the scene covered by a dataset [1,8]. Different nonlinear mixing models exist: some, such as bilinear models, are motivated by physical arguments, while others exploit a more flexible nonlinear mathematical model to improve the performance of the unmixing method [7]. Nonlinear models can be grouped into several classes, such as intimate mixture models [1], bilinear models [20], physics-based nonlinear mixing models [20] and polynomial post-nonlinear mixing models [21]. Nascimento and Dias [22] solve the nonlinear unmixing problem with an intimate mixture model. Their method first converts the observed reflectance into albedo using a look-up table; a linear algorithm then estimates the endmember albedos and mass fractions for each sample. Chen et al. [18] formulated a kernel-based paradigm that relies on the assumption that the mixing mechanism can be described by a linear mixture of endmember spectra with additive nonlinear fluctuations defined in a reproducing kernel Hilbert space. Hapke [6] derived an analytical model that expresses the measured reflectance as a function of parameters intrinsic to the mixture, including mass fraction, density, size and single-scattering albedo. The main limitation is that these models depend solely on parameters inherent to the experiment: they require full information on the geometric position of the sensor with respect to the observed samples, which makes the inversion process more challenging to implement, especially when the spectral signatures of the endmembers are unknown [1].
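To illustrate why linear unmixing loses accuracy when light interacts with more than one material, the sketch below adds a single second-order interaction term to a two-endmember mixture and then inverts it with a purely linear method. The bilinear form used here (a scaled band-wise product of the two endmembers) is an assumed, simplified stand-in for the bilinear family, not the specific formulation of any cited work; all names and spectra are hypothetical.

```python
import math


def mix_bilinear(e1, e2, a1, gamma):
    """Two-endmember mixture with one bilinear term (assumed illustrative form).

    The linear part is a1*e1 + a2*e2; the interaction part is
    gamma * a1 * a2 * (e1 * e2), using the band-wise product.
    Setting gamma = 0 gives back the linear mixing model.
    """
    a2 = 1.0 - a1
    return [a1 * u + a2 * v + gamma * a1 * a2 * u * v for u, v in zip(e1, e2)]


def unmix_linear_two(pixel, e1, e2):
    """Sum-to-one, non-negative least squares for two endmembers."""
    d = [u - v for u, v in zip(e1, e2)]
    r = [p - v for p, v in zip(pixel, e2)]
    a1 = sum(x * y for x, y in zip(r, d)) / sum(x * x for x in d)
    return min(1.0, max(0.0, a1))


def residual_norm(pixel, e1, e2, a1):
    """Euclidean norm of what the linear model fails to explain."""
    recon = [a1 * u + (1.0 - a1) * v for u, v in zip(e1, e2)]
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(pixel, recon)))


# Made-up four-band reflectance spectra (assumption, for illustration only).
vegetation = [0.05, 0.08, 0.45, 0.50]
soil = [0.20, 0.25, 0.30, 0.35]
true_a1 = 0.3

linear_pixel = mix_bilinear(vegetation, soil, true_a1, gamma=0.0)
bilinear_pixel = mix_bilinear(vegetation, soil, true_a1, gamma=1.0)

est_lin = unmix_linear_two(linear_pixel, vegetation, soil)
est_bil = unmix_linear_two(bilinear_pixel, vegetation, soil)
res_lin = residual_norm(linear_pixel, vegetation, soil, est_lin)
res_bil = residual_norm(bilinear_pixel, vegetation, soil, est_bil)

print(f"linear pixel:   a1 = {est_lin:.3f}, residual = {res_lin:.4f}")
print(f"bilinear pixel: a1 = {est_bil:.3f}, residual = {res_bil:.4f}")
```

On the bilinear pixel the linear inversion returns a biased abundance and leaves a nonzero reconstruction residual. The residual is shown only as a simple sign of model mismatch.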

Another effect that has been considered to a great extent is endmember variability during spectral unmixing caused by atmospheric and temporal conditions. Machine learning methods have worked well in accounting for spectral variability. Combining spectral information with spatial context may improve the accuracy of hyperspectral unmixing and classification results [23]. Techniques such as morphological filters [24], Markov Random Fields (MRF) [23,25,26], Zhang [27], Support Vector Machines (SVM) [28] and Self-Organizing Maps (SOM) [29], among others, have been proposed to impose spatial information. MRF in particular are a very powerful tool for describing neighborhood dependence between image pixels: they integrate spatial correlation information into the posterior probability distribution of the spectral features [25] and, within a Bayesian inference framework, have proven to provide accurate results in the classification and unmixing of hyperspectral data [23]. SVM have shown excellent performance, with high classification accuracies, when applied to datasets with a limited number of training samples [30]. Artificial Neural Networks (ANN) are mathematical models that were initially developed to mimic the complex pattern of neuron interconnections in the human brain [31,32]. Many feed-forward neural network models have been extensively studied for fault detection and diagnosis of mechanical systems. Moreover, ANN have been successfully applied for many years, with excellent performance, in pattern recognition [33], and in particular for spectral data [34,35]. SOM is one of the most widely used unsupervised neural network algorithms and has been successfully applied to hyperspectral image classification [29,36,37] and data visualization [38]. Alternative approaches include rule-based fuzzy logic [39,40,41] and Markovian jump systems [42,43], which could be combined with ANN for switching decision making.
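The way an MRF folds spatial context into a per-pixel posterior can be sketched with a Potts-style smoothness prior and one pass of iterated conditional modes (ICM). This is a generic textbook construction under assumed settings, not the formulation of any cited work: the function name `icm_sweep`, the weight `beta`, and the toy class probabilities are all hypothetical.

```python
import math


def icm_sweep(class_probs, labels, beta):
    """One ICM pass over a 2-D label grid.

    For each pixel, choose the label k that maximises
        log p(spectrum | k) + beta * (number of 4-neighbours labelled k),
    i.e. the spectral likelihood plus a Potts-style spatial prior.
    """
    rows, cols = len(labels), len(labels[0])
    n_classes = len(class_probs[0][0])
    new_labels = [row[:] for row in labels]
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                new_labels[rr][cc]
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < rows and 0 <= cc < cols
            ]
            scores = [
                math.log(class_probs[r][c][k]) + beta * neighbours.count(k)
                for k in range(n_classes)
            ]
            new_labels[r][c] = scores.index(max(scores))
    return new_labels


# Toy 3x3 scene: every pixel clearly favours class 0 except the centre,
# whose spectrum is ambiguous and slightly favours class 1 (made-up values).
class_probs = [[[0.8, 0.2] for _ in range(3)] for _ in range(3)]
class_probs[1][1] = [0.45, 0.55]

# Spectral-only decision: arg max of the per-pixel class probabilities.
spectral_only = [[p.index(max(p)) for p in row] for row in class_probs]

# Adding the spatial prior lets the neighbours overrule the ambiguous pixel.
smoothed = icm_sweep(class_probs, spectral_only, beta=1.0)
print(spectral_only)
print(smoothed)
```

Setting `beta` to zero switches the spatial term off, so the sweep returns the spectral-only labels unchanged. Larger values enforce stronger agreement between neighbouring pixels.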

Deep learning involves models that hierarchically learn features of the input data using ANN, typically with more than three layers [44]. Deep learning has been used extensively in the literature for a range of applications, such as vehicle detection [45,46], avalanche search and rescue operations with Unmanned Aerial Vehicles (UAV), and change detection [47,48]. In this scheme, high-level features are learned from low-level ones, and the derived features can be formulated for pattern recognition and classification [49]. Neural network pattern recognition is often used to classify input data into a set of target categories by training a network and then evaluating its performance using a confusion matrix. Neural networks have been applied in remote sensing and hyperspectral unmixing because of their ability to recognize complex patterns in high-dimensional images [50]. Neural-network-based unmixing of hyperspectral imagery has produced excellent results [51]. Lyu et al. [48] have demonstrated neural networks to be a good tool for unmixing using both linear and nonlinear methods simultaneously [52]. In [46], artificial neural networks were used to detect and count cars in UAV images. Wu and Prasad [53] used a recurrent neural network to model the dependencies between different spectral bands and learn more discriminative features for hyperspectral data classification. Li et al. [35] reported the use of a 3D convolutional neural network to extract combined spectral-spatial features from a hyperspectral image. Kumar et al. [51] used a linear mixture model to unmix hyperspectral data and then neural networks to predict the fraction of the data that accounts for the nonlinear mixture; they used ground truth data and the abundances estimated by the linear method to train the network for effective validation. Giorgio and Frate [50] used neural networks to unmix hyperspectral data and estimate the endmembers and their abundances. Atkinson and Lewis [54] applied neural networks to decompose hyperspectral data and compared their results with a linear unmixing model and a fuzzy c-means classifier; the neural networks outperformed the conventional linear unmixing method.
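Since the confusion matrix is mentioned above as the usual way of evaluating a trained classifier, a minimal sketch of how one is built may help. The class names and label sequences below are made up for illustration, and the helper names (`confusion_matrix`, `overall_accuracy`) are hypothetical rather than taken from a particular library.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Count how often each true class (row) is predicted as each class (column)."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def overall_accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Made-up ground truth and network predictions for six pixels.
classes = ["soil", "vegetation", "water"]
truth = ["soil", "soil", "vegetation", "vegetation", "water", "water"]
predicted = ["soil", "vegetation", "vegetation", "vegetation", "water", "soil"]

cm = confusion_matrix(truth, predicted, classes)
accuracy = overall_accuracy(cm)

for name, row in zip(classes, cm):
    print(f"{name:>10}: {row}")
print(f"overall accuracy: {accuracy:.3f}")
```

Off-diagonal entries show which classes are confused with one another, which is more informative than the overall accuracy alone.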

Little work on combining the linear and nonlinear approaches has been presented in the literature, in particular on selecting the most appropriate of the two methods. In this paper, we note that some nonlinear methods are better suited to scenes with multiple interactions and a complex mixture of features, commonly composed of multi-layered materials, whereas the linear model is appropriate for images in which a pixel contains a single cover type of material. The objective of this paper is to propose a new hybrid methodology for switching between linear and nonlinear spectral unmixing methods using artificial neural networks based on deep learning strategies. The paper is organized as follows.

Section 2 describes our methodology. Experimental results are presented in Section 3 and discussed in Section 4, and conclusions are drawn in Section 5.