A Particle Swarm Optimization based Approach to Pre-tune Programmable Hyperspectral Sensors

Abstract: Identification of optimal spectral bands often involves the collection of in-field spectral signatures followed by thorough analysis. Such rigorous field sampling exercises are tedious, cumbersome, and often impractical on challenging terrain, which is a limiting factor for programmable UAV-hyperspectral systems that require a pre-selection of optimal bands when mapping new environments with unknown target classes. An innovative workflow has been designed and implemented to simplify the process of in-field spectral sampling and real-time analysis for identification of optimal spectral wavelengths. The band selection optimisation workflow uses particle swarm optimisation with minimum estimated abundance covariance (PSO-MEAC) to identify the set of bands most appropriate for UAV-hyperspectral imaging. The criterion function, MEAC, greatly simplifies the in-field spectral data acquisition process by requiring only target class signatures and no extensive training samples per class. The metaheuristic method was tested on an experimental site with diversity in vegetation species and communities. The identified optimal set of bands was found to suitably capture the spectral variation between target vegetation species and communities. The approach streamlines the pre-tuning of wavelengths in programmable hyperspectral sensors for mapping applications. It additionally reduces the total flight time in UAV-hyperspectral imaging, as acquiring only an optimal subset of wavelengths is more efficient and requires less data storage and fewer computational resources when post-processing the captured data.


This is a preprint version.

I. INTRODUCTION
Hyperspectral technology is a potential tool for remote detection and monitoring of targets. A hyperspectral sensor measures reflected electromagnetic radiation from the target in a large number of narrow spectral bands. The inherent objective in target classification and assessment using hyperspectral data is to utilize its high spectral resolution [1]. However, the large dimensionality of hyperspectral data is often associated with the Hughes phenomenon, also known as the curse of dimensionality [2]. The problem is a combined consequence of the high correlation among adjacent bands and the inability of the applied algorithm to process high dimensional data. The problem is paramount in spectrally complex environments such as wetlands and swamps with a large diversity of species to be monitored [1,3,4]. While a common remote sensing data processing solution involves the application of dimensionality reduction techniques or the selection of suitable narrowbands in a post-acquisition step, a hardware-based solution involves the use of programmable hyperspectral sensors in a pre-acquisition step. Programmable hyperspectral sensors typically involve a snapshot based scanning mechanism, unlike general point or line scanning systems, which are non-programmable and acquire a continuous spectrum over the operable wavelength region. Several such programmable hyperspectral sensors have been developed in recent times and are increasingly being used in UAV based remote sensing applications [5][6][7]. A hardware based method such as Fabry-Pérot Interferometer (FPI) technology acquires reflected electromagnetic radiation in pre-selected optimal narrowbands, which are programmed by changing the air gap between the tuneable mirrors [8].
This method has the additional benefit of efficient mapping of the environment through the selection of only the spectral features of interest, which is particularly crucial in high-resolution mapping using unmanned aerial vehicles (UAVs) with limited flight duration. The technology is relatively recent compared to pushbroom hyperspectral sensors, and existing works involving the FPI have used one of two criteria for narrowband selection: (1) a set of bands for generating vegetation indices (VIs), herein referred to as indices based [7,9,10], or (2) a set of bands identified through rigorous experimental testing, herein referred to as knowledge based [11,12]. Indices based criteria for band selection have the potential to assess the condition and/or estimate the yield of the vegetation [7,9]; however, they are not principally suited for multi-target classification since the spectral variation of the target endmembers present within the scene is subjective. Furthermore, the efficacy of the indices based narrowband selection approach for vegetation quality or condition assessment is also subject to the characteristic reflectance of the target, and the traditional list of indices does not always ensure the best results for different vegetation communities or species. A knowledge based approach requires a thorough understanding of the spectral variability among the targets present over the area, which is usually attained through intensive in-situ sampling and is not always realizable over difficult terrain or in scenarios requiring urgent mapping. It is therefore important to adopt a data driven methodology for programmable hyperspectral sensors to estimate appropriate narrow bands for scene classification or assessment. Minet et al. [13] proposed an approach to adaptively maximize the contrast between the targets by employing a Genetic Algorithm (GA) based optimisation of the positions and linewidths of a limited number of filters in an FPI for military applications.
However, this method is unsuitable for thematic applications of remote sensing. Different data driven strategies have been proposed for the selection of optimal bands for traditional remote sensing applications. A sub-optimal search strategy utilizing constrained local extremes in a discrete binary space to select hyper-dimensional features was presented in [14]. Becker et al. [3] used a second-derivative approximation to identify the spectral locations of inflection. A band selection method using correlation among bands based on Mutual Information (MI) and deterministic annealing optimization was also employed [15]. Becker et al. [4] proposed a classification based assessment of three optimal spectral band selection techniques (derivative magnitude, fixed interval and derivative histogram), using the spectral angle mapper (SAM) as a classifier. A GA based wrapper method using a support vector machine (SVM) was proposed for the classification of hyperspectral images [16]. A double parallel feedforward neural network based on radial basis functions was used for dimensionality reduction [17]. A principal component analysis for identifying optimal bands to discriminate wetland plant species was presented in [1]. A semi-supervised band clustering approach for dimensionality reduction was developed in [18]. A particle swarm optimisation (PSO) based dimensionality reduction approach to improve SVM based classification was suggested by [19]. Li et al. [20] and Pal et al. [21] presented a hybrid band selection strategy based on a GA-SVM wrapper to search for optimal band subsets. A method of band selection based on spectral shape similarity analysis was put forward in [22]. Su et al. [23] implemented 1PSO and 2PSO with minimum estimated abundance covariance (MEAC) [24], among other techniques, for the evaluation of optimal bands. Ghamisi et al. [25] presented a feature selection approach based on hybridization of a GA and PSO with an SVM classifier as the fitness function.
However, all these existing optimal band identification studies applied data driven methods to traditional hyperspectral datasets after acquisition, and such methods are yet to be used with a hardware-based solution to pre-tune hyperspectral sensors to acquire the optimal bands.
In this study, for the first time, we have devised an in-field data driven approach to pre-tune a snapshot type UAV-hyperspectral sensor for remote sensing applications. The method employs PSO with minimum estimated abundance covariance (MEAC), similar to [23], where it was used in a post-processing stage for waveband selection after hyperspectral dataset acquisition. The major benefits are: (1) it is an efficient approach to identify the optimal bands in-field before the survey operation, (2) it does not require a large number of spectral samples per class, which is particularly an issue when establishing a spectral library over difficult terrain, and (3) it remains applicable when the number of observed samples is less than the total number of potential hyperspectral bands to select from, which is an important issue with other dimensionality reduction methods such as principal component analysis (PCA). The rest of the paper is arranged as follows. Section II describes the experimental framework of the operation in a test environment, together with the theoretical background of the PSO-MEAC approach as it relates to the proposed application. Section III presents and discusses the optimal band selection results obtained with the PSO-MEAC method on the experimental site, and evaluates the performance of the data driven PSO-MEAC approach against the traditional indices based approach for feature selection and mapping. Finally, concluding remarks are provided in Section IV.

II. MATERIALS AND METHODS
This section details the study area, ground based hyperspectral sensing system, data processing for collected hyperspectral data, workflow for identifying optimal bands in the field, and method for UAV-hyperspectral surveying and assessment.

A. Experiment area
The test site is an upland swamp area within the temperate highland peat swamp on sandstone (THPSS) on the Woronora Plateau of New South Wales, Australia. The area is located in Wollongong, southwest of the city of Sydney, Australia. The focus was on spectrally diverse vegetation communities in critically endangered ecosystems distributed in the Blue Mountains, Lithgow, Southern Highlands and Bombala regions of New South Wales, Australia [26]. The NSW National Parks and Wildlife Service (NPWS) classifies the upland swamp complexes into five major vegetation communities: Banksia Thicket, Cyperoid Heath, Fringing Eucalypt Woodland, Restioid Heath, and Sedgeland [27]. Parts of the site have thick vegetation cover and steep gradients and are inaccessible.

B. Hyperspectral set-up for ground based sampling
The spectral signatures of the target classes in the environment were measured with the visible-infrared snapshot hyperspectral (FPI) sensor (Rikola, Senop Optronics, Kangasala, Finland) connected to a separate data acquisition computer. In this mode of operation, the sensor acquires the maximum number of wavelength bands possible, i.e. 380 bands at 1 nm spectral steps between 500 nm and 880 nm. With a focal length of 9 mm and a field-of-view (FOV) of 36.5×36.5 degrees, the sensor acquires 1010×1010 spatial channels in the snapshot imaging mode. In contrast, in the standalone on-board UAV-based data acquisition mode the sensor records a set of 15 programmed wavelength bands in 1010×1010 pixel format, i.e. up to a total of 16 Mpixel storage per hypercube. The sensor also acquires solar irradiance measurements using an irradiance sensor for radiometric calibration and positional measurements using a global positioning system (GPS) receiver for geometric correction (Fig. 1). All sensors were installed on a handheld mount for hyperspectral imaging. An Android mobile phone was also installed on the sensor mount and paired to the data acquisition computer over a WiFi link to provide a real-time view of the scene, which was useful for bringing the target vegetation into focus before the collection of hyperspectral data (Fig. 1(a)). Additionally, a real-time feed of goniometric measurements (roll and pitch) from the mobile phone's accelerometer was relayed to the screen of the data acquisition computer to monitor the planimetric setting of the hypercubes captured using the FPI sensor (Fig. 1(b)).
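As a quick check of the storage figures quoted above, the pixel count per hypercube follows directly from the number of bands and the 1010×1010 frame size. The symbols $N_{\text{UAV}}$ and $N_{\text{full}}$ are introduced here only for illustration and do not appear in the original text.

```latex
\begin{align*}
N_{\text{UAV}}  &= 15 \times 1010 \times 1010 = 15\,301\,500 \approx 15.3\ \text{Mpixel} \\
N_{\text{full}} &= 380 \times 1010 \times 1010 = 387\,638\,000 \approx 387.6\ \text{Mpixel}
\end{align*}
```

The 15-band figure is consistent with the stated limit of up to 16 Mpixel per hypercube in the on-board mode, while the full 380-band ground acquisition is roughly 25 times larger.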
The simple design of the handheld hyperspectral imaging system made it easy to carry through regions of dense shrub-type vegetation cover (Fig. 1(c)). The hyperspectral data was acquired with a downward nadir orientation over the shrub type swamp vegetation, at a distance of approximately 0.5 m from the top of the canopy (Fig. 1(c)). In this study, the FPI sensor was used as the tool for in-field spectral acquisition to demonstrate an independent form of operation. Nevertheless, the field spectral measurements could also be obtained from other spectroradiometers such as the ASD FieldSpec3 (Analytical Spectral Devices, Boulder, Colorado, USA). In that case, special care should be taken to establish proper radiometric calibration to remove inter-sensor response mismatch; here this was avoided by using the same FPI sensor both for the in-field spectral data collection used to identify the optimal bands and for the later UAV-hyperspectral data acquisition.
Hyperspectral measurements were collected for a total of three target vegetation classes, covering eight upland swamp species, including Grass tree (Xanthorrhoea resinosa), Pouched coral fern (Gleichenia dicarpa) and Sedgeland complex (Empodisma minus, Gymnoschoenus sphaerocephalus, Lepidosperma limicola, Lepidosperma neesii, Leptocarpus tenax and Schoenus brevifolius). In addition, spectral measurements were also collected for background vegetation, containing a mixture of other species which were present in small patches, and not selected in this study. Finally, a background bare-earth spectrum was also collected. To obtain a proper un-mixed spectrum for a single species, field sampling was performed over a region of interest with local homogeneity.

C. In-field ground based hyperspectral data processing
Vegetation in an upland swamp environment is highly diverse, and species can exist in homogeneous and heterogeneous patches. Data collected with the portable handheld FPI system contained minor band misalignments due to unavoidable handheld movement of the sensor and slight movement of the canopy caused by wind. This happens because the FPI sensor acquires each snapshot band sequentially with a small delay, during which the sensor moves [28]. The hyperspectral bands were aligned using a previously developed band alignment workflow described in [28]. The data was first flat-field corrected using dark current removal and a white calibration panel, and was then converted to reflectance using calibration coefficients previously computed with an integrating sphere [7]. A band averaged hyperspectral signal was calculated from the hypercube and used in the optimal band identification workflow. The spectrum was further treated using a Savitzky-Golay [29] smoothing filter with a polynomial order of 3 and a frame length of 17 to remove spectral noise. A PSO with MEAC as the criterion function was employed to identify the suitable bands in the field; the theory of operation is detailed in Section II-D. The entire process of spectral signature retrieval and the PSO-MEAC workflow for suitable band identification was implemented as MATLAB routines, and a graphical user interface (GUI) was designed for user friendly and seamless operation in the field.
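The smoothing step can be illustrated with a minimal Python sketch of a Savitzky-Golay filter using the stated polynomial order of 3 and frame length of 17. This is not the authors' MATLAB routine: the function names are hypothetical, and the edge handling (repeating the end values) is an assumption made for simplicity, so results near the spectrum ends may differ from other implementations.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def savgol_weights(frame_length=17, poly_order=3):
    """Weights that return the fitted polynomial value at the window centre."""
    half = frame_length // 2
    offsets = range(-half, half + 1)
    size = poly_order + 1
    # Normal equations (A^T A) z = e0 for the design matrix A[i][j] = i**j.
    ata = [[float(sum(i ** (r + c) for i in offsets)) for c in range(size)]
           for r in range(size)]
    z = solve_linear(ata, [1.0] + [0.0] * poly_order)
    # The smoothing weight for offset i is row i of A multiplied by z.
    return [sum(z[j] * i ** j for j in range(size)) for i in offsets]


def savgol_smooth(spectrum, frame_length=17, poly_order=3):
    """Smooth a 1-D spectrum; ends are padded by repeating the edge values."""
    half = frame_length // 2
    weights = savgol_weights(frame_length, poly_order)
    padded = [spectrum[0]] * half + list(spectrum) + [spectrum[-1]] * half
    return [sum(w * padded[i + k] for k, w in enumerate(weights))
            for i in range(len(spectrum))]
```

A cubic filter of this kind leaves any cubic trend unchanged away from the edges, which is why it suppresses noise while largely preserving the shape of broad absorption features.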

D. Optimal band identification using PSO-MEAC
Particle swarm optimization (PSO) was originally developed to simulate the social behaviour (movement and interaction) of organisms (particles) in a flock of birds or a school of fish [30]. It has since been used as a robust metaheuristic computational method to improve the selection of candidate solutions for an optimisation problem. The optimisation operates iteratively over a swarm of candidate solutions, with a criterion function as the measure of quality. In our approach, each selected set of bands is called a particle, and the recursive update of the bands is called a velocity. Let the particle position $x_{id}$ denote the selected band subset of size $k$, and the velocity $v_{id}$ denote the update for the selected bands; the particle update can then be expressed, following [30], as in equation (1):
$$ v_{id}^{t+1} = w\, v_{id}^{t} + c_1 r_1 \left(p_{id} - x_{id}^{t}\right) + c_2 r_2 \left(p_{gd} - x_{id}^{t}\right), \qquad x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1} \tag{1} $$
where $p_{id}$ is the historically best local solution, $p_{gd}$ is the historically best global solution among all the particles, $c_1$ and $c_2$ control the contributions from the local and global solutions respectively, $r_1$ and $r_2$ are independent random variables between 0 and 1, and $w$ is the inertia weight used to improve the convergence performance.
The new velocity and position ($v_{id}^{t+1}$ and $x_{id}^{t+1}$ on the left-hand side of equation (1)) of the particles are updated at every iteration based on the existing parameters and the cost criterion (Fig. 2). The iteration process aims to minimise the underlying criterion function.
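A single particle update of the kind described above can be sketched as follows. This is an illustrative example, not the authors' implementation: the function name `pso_step`, the acceleration constants `c1 = c2 = 2.0`, and the clipping of positions to band indices between 1 and 380 are assumptions. Only the inertia weight of 0.98 and the 380-band range come from the text.

```python
import random


def pso_step(position, velocity, local_best, global_best,
             w=0.98, c1=2.0, c2=2.0, n_bands=380, rng=random):
    """Apply one standard PSO velocity and position update to a particle.

    Each list holds one entry per selected band. Positions are kept as
    real numbers and clipped to the valid band-index range; rounding them
    to integer band indices before evaluating the cost is left to the
    caller (an assumption, not stated in the source).
    """
    new_position, new_velocity = [], []
    for x, v, p_local, p_global in zip(position, velocity,
                                       local_best, global_best):
        r1, r2 = rng.random(), rng.random()  # independent draws in [0, 1)
        v_new = w * v + c1 * r1 * (p_local - x) + c2 * r2 * (p_global - x)
        x_new = min(max(x + v_new, 1), n_bands)  # stay within valid bands
        new_velocity.append(v_new)
        new_position.append(x_new)
    return new_position, new_velocity
```

A particle that already sits on both its local and the global best with zero velocity is left unchanged, which is the expected fixed point of the update rule.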
In a traditional supervised classification situation where representative class signatures are known through exhaustive field surveying the band-selection process can be greatly simplified. However, in an aerial survey to determine suitable wavelength bands for programmable UAV-hyperspectral system such an exhaustive exercise is tedious, cumbersome, and not always possible. Therefore, MEAC was used as a criterion function in PSO as it requires only class signatures and no training samples. Efficacy of this technique has been previously evaluated against other existing optimisation methods by Su et al. [23] for feature selection on traditional hyperspectral datasets (airborne and satellite).
Assuming there are $p$ classes present over an area for which the samples were collected, the endmember matrix can be written as $S = [s_1, s_2, \ldots, s_p]$. According to Yang et al. [19], with linear mixing of the endmembers, the pixel $r$ can be expressed as in equation (2):
$$ r = S\alpha + n \tag{2} $$
where $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_p)^T$ is the abundance vector and $n$ is the uncorrelated noise with $E(n) = 0$ and $\mathrm{Cov}(n) = \sigma^2 I$ ($I$ is an identity matrix).
Usually, the actual number of classes ($p$) is greater than the number of known class signatures ($q$), i.e. $q < p$. Hence, the noise will have $\mathrm{Cov}(n) = \sigma^2 \Sigma$, where $\Sigma$ is the noise covariance matrix. Therefore, the abundance vector estimate becomes the weighted least squares solution, as in equation (3):
$$ \hat{\alpha} = \left(S^T \Sigma^{-1} S\right)^{-1} S^T \Sigma^{-1} r \tag{3} $$
with first order moment $E(\hat{\alpha}) = \alpha$ and second order moment $\mathrm{Cov}(\hat{\alpha}) = \sigma^2 \left(S^T \Sigma^{-1} S\right)^{-1}$. The analysis demonstrates that when all the classes are known, the remaining noise can be modelled as independent Gaussian noise. For this application, where meeting such sampling criteria was difficult and unknown classes were present, noise whitening should be applied first. Yang et al. [19] and Su et al. [23] performed the optimal band selection on traditional hyperspectral datasets and used all the pixels for the background noise ($\Sigma$) estimation. In this case, the background pixel noise was calculated using the background class spectra and bare-earth spectra collected through ground-based sampling.
The background and noise covariance is denoted as $\Sigma_{b+n}$; this estimate was used in this study. The estimate of the unknown class pixels is based on the likelihood of the unknown class (or the class of no interest) being present around the sampled class of interest. In scenes where all endmembers are of a known class (or the target class of interest), the noise estimate $\Sigma_{b+n}$ is not required, which is an unlikely condition in a spectrally complex swamp environment [7].
The identified optimal bands should yield minimal deviation of $\hat{\alpha}$ from the actual $\alpha$ [23]. With partially known classes, the criterion function is equivalent to minimising the trace of the covariance, as in equation (4):
$$ \arg\min_{\Phi} \; \mathrm{trace}\left[ \left(S^T \Sigma_{b+n}^{-1} S\right)^{-1} \right] \tag{4} $$
where $\Phi$ is the selected band subset. The resulting band selection algorithm is referred to as the MEAC method [23].
Upon successful completion of the PSO-MEAC iterations, the optimizer returns the identified set of wavelength bands with the least cost criterion (equation (4)) (Fig. 2).
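The trace criterion of equation (4) can be evaluated for a candidate band subset as in the following sketch. It is a plain-Python illustration under stated assumptions rather than the authors' code: `meac_cost`, `mat_inverse` and `mat_mul` are hypothetical helper names, the signature matrix is stored with one row per band and one column per known class, and the background-plus-noise covariance is assumed to be supplied as a full band-by-band matrix that is invertible on the chosen subset.

```python
def mat_inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    m = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def meac_cost(signatures, noise_cov, bands):
    """Return trace((S^T Sigma^-1 S)^-1) restricted to the given bands.

    signatures: one row per band, one column per known class signature.
    noise_cov:  band-by-band background-plus-noise covariance matrix.
    bands:      zero-based indices of the candidate band subset.
    """
    s = [signatures[b] for b in bands]
    sigma = [[noise_cov[i][j] for j in bands] for i in bands]
    s_t = [list(col) for col in zip(*s)]
    inner = mat_mul(mat_mul(s_t, mat_inverse(sigma)), s)
    inverse = mat_inverse(inner)
    return sum(inverse[i][i] for i in range(len(inverse)))
```

For example, with two classes, three bands and identity noise covariance, selecting all three bands gives a lower cost than selecting two, since the extra band adds information about the abundances. A PSO loop would call `meac_cost` on each particle's band subset and keep the subset with the smallest value.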

E. UAV-hyperspectral survey and assessment
After identification of the set of optimal bands through the data driven PSO-MEAC approach, the FPI hyperspectral sensor was programmed to acquire the selected narrow wavelength bands. A UAV-hyperspectral mission was flown in a pre-planned waypoint acquisition mode with >85% forward and >75% lateral overlap from a flying altitude of 50 m. The sensor exposure time was set at 10 ms per band to provide good radiometric image quality for the existing illumination conditions. In addition to the data driven PSO-MEAC tuned mode, another aerial survey was made with an indices based [7] wavelength selection approach, using the same UAV flight characteristics and sensor exposure configuration. A band stabilization workflow was adopted to co-register spatial shifts between bands in the hypercubes from both aerial acquisition modes [28]. Further, the regular radiometric correction, mosaicking and geometric correction procedures for hypercubes were carried out [7]. The UAV-hyperspectral orthomosaics achieved a high spatial resolution of 2 cm in ground sampling distance.
A supervised support vector machine (SVM) classifier was used to classify the hyperspectral datasets into constituent classes. The SVM is an efficient kernel based machine learning classifier suitable for high-dimensional feature spaces and is widely used in classifying hyperspectral datasets (ref). The classification was performed as an evaluation step to compare the efficacy of the wavelengths identified through the data driven PSO-MEAC and indices based approaches. As the fundamental objective of this study was simply to evaluate the two methods, and not to achieve superior classification accuracies, the use of more complex classification algorithms was deemed unnecessary. A standard parameter setting using a radial basis function with a kernel gamma of 0.167, a penalty parameter of 100 and a pyramid level of 5 was used for the SVM classification. The overall and individual class classification accuracies were computed using the ground truth test samples.
A total of 120 ground truth measurements were collected for shrub-type swamp vegetation through a rigorous field survey, and 120 ground truth polygons were identified through visual interpretation of high-resolution hyperspectral data. The sampled ground-based (120) and image-based (120) polygons were randomly divided 1:1 into mutually exclusive training and test sets, i.e. 60 ground-based and 60 image-based polygons in each of the training and test groups. The ground truth training set was used to train the SVM classifier, and the test samples were used to compute the overall accuracy (OA), kappa (κ) and confusion matrix to evaluate the classification accuracies. The spectral data for the training and test sample polygons was obtained from the UAV-hyperspectral datasets in the corresponding data driven PSO-MEAC and indices based modes.
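The overall accuracy and kappa coefficient mentioned above follow from the confusion matrix in the standard way. The short sketch below illustrates the calculation; the function name and the example matrix are hypothetical and are not results from this study.

```python
def overall_accuracy_and_kappa(confusion):
    """Compute overall accuracy and Cohen's kappa from a confusion matrix.

    confusion[i][j] is the number of test samples of reference class i
    that were assigned to class j by the classifier.
    """
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class example, for illustration only.
oa, kappa = overall_accuracy_and_kappa([[40, 10], [5, 45]])
```

In this made-up example 85 of 100 samples are correct, and half of that agreement would be expected by chance given the marginal totals, so kappa is lower than the overall accuracy.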

III. RESULTS AND DISCUSSION
This section details the results and discussion of optimal band selection using data driven PSO-MEAC workflow, and its evaluation against the indices based approach.

A. Optimal band identification using PSO-MEAC
The PSO based optimal band identification workflow determines a list of suitable bands according to the MEAC cost criterion. The PSO-MEAC workflow was executed with a population size of 100, an inertia weight of 0.98 and a maximum of 500 iterations. A total of 15 bands, i.e. $k = 15$, were identified, based on the maximum band capacity of the FPI sensor for the on-board UAV data acquisition mode in the un-binned setting (1010×1010 pixels).
The selected combination of bands is re-configured at every iteration to minimise the cost function (Fig. 2). A new combination of bands is designated optimal if it achieves the best (i.e. minimum) cost so far. To analyse the performance of the in-field optimal band identification and sensor tuning using the PSO-MEAC approach, a set of internally computed parameters (criterion cost and index of runs) was logged at every iteration (Fig. 3). The PSO-MEAC approach determines the suitable combination of bands (or band-index) using the cost criterion (equation (4)). The reduction of the best cost value represents the learning curve of the optimisation workflow (Fig. 3(a)). At every iteration, the cost associated with the previous band-index is compared with that of the new band-index. A record of these parameters reveals the process of convergence to the desired solution by the implemented metaheuristic workflow. A measure of the final cost and a plot of the identified optimal band combination are also produced. It can be seen that progressively better (i.e. smaller) values of the cost criterion are achieved using the PSO-MEAC method. Each iteration may produce slightly different band combinations according to the cost criterion, as shown by the plot of the index of runs in Fig. 3(b). The final best cost of the PSO-MEAC was $-7.7 \times 10^{-9}$. At this stage the identified band index was 56, 88, 101, 119, 151, 172, 211, 217, 251, 284, 303, 326, 341, 360 and 380 (Fig. 3(c)). Su et al. [23] focused on minimising the number of bands in optimal configurations, which is suitable for dimensionality reduction techniques in traditional airborne or satellite hyperspectral imaging, where a complete set of bands has already been acquired. In the proposed method, the number of bands to be identified is predefined by the user, which is important for using the FPI sensor to its fullest potential (i.e. hypercube band capacity at the desired spectral binning) and acquiring the maximum possible information in the optimal configuration.
To evaluate the computational cost, the PSO-MEAC workflow was programmed in MATLAB and implemented as a graphical user interface module running on a portable field data acquisition computer with a 1.5 GHz processor and 512 MB of memory. The module took roughly 4 to 5 minutes for every 500 iterations with the selected number of class samples. This demonstrates the operational efficiency of the system despite its complex search hierarchy, and shows that it is usable for pre-tuning the programmable FPI sensor for optimised wavelength selection in a UAV-hyperspectral survey.
Identification of optimal bands using the characteristic spectral signatures of individual swamp species has traditionally been performed using a measure of separability of the spectra at the respective wavelength bands. In this study, the employed PSO-MEAC based search strategy automatically analyses and identifies wavelength bands based on maximum separability of the reflectance using the MEAC cost criterion function. The collected field spectra for each shrub type vegetation species are shown in Fig. 3(c), and the identified wavelength band positions are shown as a set of superimposed vertical lines. The approach has been implemented with a GUI on a portable data acquisition computer, which enables rapid analysis of spectral signatures and identification of suitable wavelength bands. The developed technique and tools were found to be efficient in a field environment during surveying.

B. Classification
The comparative evaluation of the data driven PSO-MEAC and indices based wavelength tuning approaches was performed using an SVM classifier. Two dedicated datasets (data driven PSO-MEAC and indices based) were collected for the swamp experiment site. The scene primarily comprised three shrub-type vegetation classes (i.e. Grass tree, Pouched coral fern, and Sedgeland complex) and two tree-type vegetation classes (i.e. Black sheoak and Eucalyptus). A small portion of the acquired scene contained no vegetation cover and was treated as a separate 'Bare earth' class. Therefore, a total of six classes were used in the classification based comparative evaluation. With the SVM classifier, the optimal bands identified using the data driven PSO-MEAC approach performed better than those of the indices based approach: the PSO-MEAC bands produced an overall accuracy of 85.16% and a kappa coefficient of 0.73, whereas the indices based approach produced an overall accuracy of 76.54% and a kappa coefficient of 0.67. The comparative classification maps for both the data driven PSO-MEAC and indices based approaches, produced using the SVM classifier, are shown in Fig. 4.
The producer's and user's accuracies for each class with the best-performing approach, data driven PSO-MEAC, are shown in Table 1. With the exception of the Grass tree class, the accuracy for each class is satisfactory (>70%), particularly for differentiating between swamp type (Sedgeland complex) and non-swamp type (Eucalyptus) vegetation. The results also indicate the potential of the process to distinguish certain critical non-swamp type terrestrial species (Black sheoak and Bracken fern) within the swamp environment. An increase in the proportion of these terrestrial species in a swamp indicates changes in the swamp hydrology. No change in the proportion of terrestrial species (or change within equilibrium limits) indicates the stability of hydrology and peat moisture levels. These results, therefore, demonstrate the usefulness of the method for directly mapping the changes induced in a swamp environment by fluctuations of the groundwater level.
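For reference, per-class producer's and user's accuracies are derived from the confusion matrix as sketched below. The function name and the example matrix are hypothetical, and the layout (rows as reference classes, columns as classified classes) is an assumption; the values in Table 1 are not reproduced here.

```python
def producers_users_accuracy(confusion):
    """Return per-class producer's and user's accuracies.

    Assumes confusion[i][j] counts samples of reference class i that were
    classified as class j (rows: reference, columns: classified).
    """
    n = len(confusion)
    row_totals = [sum(row) for row in confusion]        # reference totals
    col_totals = [sum(col) for col in zip(*confusion)]  # classified totals
    # Producer's accuracy: share of each reference class correctly mapped.
    producers = [confusion[i][i] / row_totals[i] for i in range(n)]
    # User's accuracy: share of each mapped class that is actually correct.
    users = [confusion[i][i] / col_totals[i] for i in range(n)]
    return producers, users


# Hypothetical two-class example, for illustration only.
producers, users = producers_users_accuracy([[40, 10], [5, 45]])
```

Producer's accuracy reflects omission errors for a reference class, while user's accuracy reflects commission errors in the mapped class, so a class such as Grass tree can score differently on the two measures.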

IV. CONCLUSION
Identification of optimal bands for vegetation monitoring has been an ongoing research problem. The issue is significant in spectrally complex environments with a diversity of vegetation species, such as swamps and wetlands. Extensive surveys and post-processing solutions have been used repeatedly in different swamp type environments. In this study, an innovative approach was developed for rapid in-field identification of spectrally significant wavelength bands for a given environment, in order to program tuneable hyperspectral sensor acquisition before UAV borne surveys. The method was implemented through a metaheuristic workflow based on particle swarm optimisation (PSO) with minimum estimated abundance covariance (MEAC) as the cost criterion function. A portable in-field hyperspectral signature collection system was devised using the tuneable FPI hyperspectral sensor. The set-up simplified the collection of class spectra and background noise spectra, which were then used to identify the optimal band configuration. The method identifies the optimal bands based on representative class spectral signatures, avoiding the requirement for extensive in-field sampling. Additionally, the method remains applicable when the number of sample observations is less than the total number of potential hyperspectral bands, which is not possible with other dimensionality reduction methods such as PCA. The method was successfully tested to identify a set of optimal bands for maximising spectral differentiation of swamp type vegetation species and communities.
In future research, the algorithm could be tuned to robustly incorporate vegetation trait retrieval by changing the criterion function. This would be valuable to the agriculture industry for the estimation of chlorophyll content and nitrogen use efficiency.