Wetland Mapping Using HJ-1A/B Hyperspectral Images and an Adaptive Sparse Constrained Least Squares Linear Spectral Mixture Model

In this study, we propose an adaptive sparse constrained least squares linear spectral mixture model (SCLS-LSMM) to map wetlands in a sophisticated scene. It includes three procedures: (1) estimating the abundances with the sparse constrained least squares method using all endmembers in the spectral library, (2) selecting “active” endmember combinations for each pixel based on the estimated abundances and (3) estimating the abundances with the linear spectral unmixing algorithm using only the adaptively selected endmember combinations. The performance of the proposed SCLS-LSMM in mapping wetland vegetation communities was compared with that of the traditional fully constrained least squares linear spectral mixture model (FCLS-LSMM) using HJ-1A/B hyperspectral images. The accuracy assessment showed that the proposed SCLS-LSMM performed significantly better than the traditional FCLS-LSMM, with a systematic error (SE) of -0.014 and a root-mean-square error (RMSE) of 0.087 for Reed marsh, and an SE of 0.004 and an RMSE of 0.059 for Weedy meadow. The proposed method improved the unmixing accuracies of wetland vegetation communities and has the potential to help in understanding the process of wetland degradation under the impacts of climate change and permafrost degradation.


Introduction
Wetlands in high-latitude areas are well-known as an irreplaceable part of the cold region ecosystems because they regulate climate, replenish ground water, store carbon and maintain seasonal frozen soil mainly due to the heat insulation and water storage characteristics of the peat layer [1]. Besides, wetlands in cold regions provide a unique habitat for many endangered wetland species, such as Grus japonensis and Grus monacha [2]. The Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report and Millennium Ecosystem Assessment Report indicate that the wetlands in high-latitude cold regions are fragile and unstable because they are nutrient-limited and sensitive to climate changes and human disturbances [3]. Therefore, identifying the spatial distribution and heterogeneous pattern of wetland vegetation communities in cold regions is very important to evaluate the impacts of climate change and permafrost degradation on the wetland ecosystem health and safety [4].
Compared with traditional field surveys, satellite remote sensing has many advantages in providing multi-scale, multi-spectral and multi-temporal imagery for mapping wetland dynamics [5]. For example, multi-spectral sensors (Landsat multispectral scanner (MSS), Landsat-5 thematic mapper (TM), Landsat-7 enhanced thematic mapper plus (ETM+), Landsat-8 operational land imager (OLI) and SPOT's high resolution visible range instruments (HRV)) have been widely used to discriminate wetlands from other land cover types at various scales [6-8]. However, it has been shown that wetlands are difficult to map with multi-spectral remote sensing images alone because of frequent cloud coverage and the spectral heterogeneity of the vegetation communities [9]. Researchers have found that combining multi-source remote sensing data (including optical and radar imagery) with geographical ancillary data (including topographical, soil and landform features) provides an effective way to characterize sophisticated scenes, such as wetlands in seasonally flooded plains [10-12]. Generally, the predictive variables derived from multi-source sensors and ancillary geographical data are sufficient to discriminate major land cover types (e.g., marsh, open water, grass, forest and impervious area). However, the limited number of spectral bands and the speckle noise inherent in optical and radar imagery prevent wetland vegetation communities from being mapped in detail [13]. Commonly used hyperspectral images, such as those from the Earth-Observing One (EO-1) satellite Hyperion sensor and the airborne visible/infrared imaging spectrometer (AVIRIS), have contiguous narrow wavelength bands that can detect finer spectral characteristics of the ground vegetation than multispectral images [14]. Such imagery has proven effective for identifying invasive plants or habitat features in natural wetland areas [15,16].
Nevertheless, the application of these hyperspectral images is limited to some extent by their availability and high cost. Recently, the China environment satellites HJ-1A/B and Sentinel-2 A/B have provided a new hyperspectral data source for earth observation. These hyperspectral data offer a possible approach for identifying wetland vegetation communities in heterogeneous landscapes.
Much effort has been devoted to wetland classification algorithms over the past few decades, including maximum likelihood classification (MLC) [17], spectral angle mapper (SAM) [18], partial least squares regression (PLSR) [19], support vector machines (SVMs) [14], random forests (RF) [20], object-based methods [21] and spectral mixture analysis (SMA) [22]. Among these methods, SMA has been widely used to predict the abundances of wetlands within each pixel of hyperspectral imagery [23]. One common SMA method, fully constrained least squares linear spectral mixture modelling (FCLS-LSMM), has been widely used because of its precision in abundance prediction and its convenience in practical application [24]. With FCLS-LSMM, researchers must construct a single endmember spectra dataset to estimate the land cover abundances for the whole image. However, using the same endmember spectra dataset for every pixel ignores the spatial heterogeneity of the endmembers and may therefore lead to the overestimation of nonexistent endmembers.
In recent years, an improved version of the FCLS-LSMM, the multiple endmember spectral mixture analysis (MESMA), has been proposed to address the spatial heterogeneity of the endmembers [25,26]. However, too many endmembers make the MESMA model overly sensitive to the endmember selection scheme and generate unstable unmixing results [27]. To solve this problem, land cover maps, height fractions and spectral similarity indicators have been used in previous studies as prior knowledge of the endmembers that may be present [23,28,29]. However, the classification errors inherent in the land cover maps, the high cost of deriving height fractions and the subjectivity in selecting spectral similarity thresholds bring unavoidable uncertainties to these approaches.
In the current study, an adaptive endmember selection approach based on sparse constraints is proposed to generate this prior knowledge. Under the abundance nonnegative constraint (ANC) and the abundance sum-to-one constraint (ASC), the sparse unmixing method decreases the number of "active" endmembers, and the abundances of endmembers that are unlikely to exist are set to zero or approximately zero. In this way, the sparse unmixing method can adaptively select the endmembers and their spectra for each pixel in the high-dimensional feature space of hyperspectral imagery, and only the selected endmembers are used to estimate the abundances of each pixel with the FCLS-LSMM. Taking the wetlands in the high-latitude cold regions of China as a study area, we developed an adaptive endmember selection unmixing model constrained by sparsity to improve the classification accuracy of wetland vegetation communities in a sophisticated scene. Reference wetland vegetation types generated from in situ field measurements were used to evaluate the unmixing accuracy and the appropriateness of the proposed algorithm. The developed method has the potential to generate accurate and fine-scale wetland mapping data.

Study Site
Zhalong National Nature Reserve (ZNNR) lies in the northeast of the Songnen Plain, China (Figure 1). The wetlands in the ZNNR connect with the Wuyuer River and Shuyang River. The ZNNR has been listed as a "Wetland of International Importance" under the Ramsar Convention since 1992, as it plays an important role in protecting endangered waterfowl, particularly Grus japonensis. The ZNNR is located in the continental monsoon climate zone; the average annual temperature is 3.9 °C and the annual precipitation is 402.7 mm. Seasonally frozen soil is widely distributed in this area. The ecosystem of the ZNNR is highly varied, and vegetation types such as aquatic vegetation and wet meadow often exhibit similar spectral characteristics in multi-spectral satellite imagery. Our previous studies have detected the wetland environment dynamics of the ZNNR, including changes in vegetation communities and the degradation of habitat quality. However, the present study is among the first attempts to derive a finer-scale classification map of wetland vegetation communities from hyperspectral images with fine temporal and spatial resolution.

Materials
The hyperspectral images derived from the China environment 1A series satellite (HJ-1A) Hyperspectral Imaging Radiometer (HSI) sensor and the multispectral images from the China environment 1B series satellite (HJ-1B) charge coupled device (CCD) sensor were used to produce the fine-scale wetland maps. The HJ-1A HSI hyperspectral sensor has 110 bands ranging from 0.459 to 0.956 μm, while the HJ-1B CCD sensor has four bands ranging from 0.43 to 0.90 μm. The HSI hyperspectral sensor has a spatial resolution of 100 m with a 50 km swath width, while the CCD multispectral sensor has a 30 m spatial resolution with a 360 km swath width. In the current study, one HSI scene (path/row: 120/27) and one CCD scene (path/row: 120/27), both acquired on 9 September 2015 over the Zhalong NNR, were used. Both the HJ-1A HSI and HJ-1B CCD images can be downloaded from the China Center for Resource Satellite Data and Applications (CRESDA) (http://218.247.138.121/). Seventy ground control points collected from 1:50,000 digital topographic maps were used to georeference the images. The root-mean-square errors (RMSE) of the georeferencing procedure were less than 0.5. The digital number (DN) values of the georeferenced images were atmospherically corrected and converted to top-of-atmosphere reflectance with the fast line-of-sight atmospheric analysis of spectral hypercubes (FLAASH) toolbox embedded in the environment for visualizing images (ENVI) 5.1 software. The adaptive intensity-hue-saturation (AIHS) algorithm was used to blend the multispectral CCD and hyperspectral HSI imagery to produce fused imagery with both fine spatial and fine spectral resolution [30]. During the fusion process, the hyperspectral bands were divided into groups, and the hyperspectral bands of each group were overlapped by the corresponding multispectral CCD band. Therefore, we used a stepwise fusion strategy that combines one multispectral CCD band with one group of hyperspectral bands at a time.
We constructed a classification system suitable for typical freshwater marsh wetlands in northeast China, which subdivides the land cover types in this region into 9 classes: Reed marsh, Carex marsh, Grassy meadow, Weedy meadow, Dry land, Paddy field, Saline and alkaline land, Impervious and Open water. Detailed descriptions of the 9 land cover types are given in Table 1. The spectral library of the 9 endmembers was built directly from a field survey, synchronized with the acquisition of the HJ-1 remote sensing images, using field portable spectroradiometers (SVC HR-1024i) (Figure 2). Spectral measurements were repeated nine times for each endmember, and the averaged values for each endmember were used to construct the reference spectral library.

Method
As shown in Figure 3, the sparse constrained least squares linear spectral mixture model (SCLS-LSMM) includes three steps: (1) obtaining the HJ-1A/B hyperspectral images and constructing an endmember spectral library from field-measured endmember spectra, (2) adaptively selecting "active" endmember combinations for each pixel based on prior knowledge of the abundances obtained with the sparse constrained least squares method and (3) estimating the abundances for each pixel with the linear spectral mixture model algorithm using only the adaptively selected endmember combinations.

Adaptive Endmembers Selection Based on Sparsity Constrained Method
In a complex scene such as a wetland area, the number and types of endmembers actually present in each pixel of the image are much smaller than those of the endmembers in the whole scene, a property defined as sparsity [31]. Even with a large endmember spectral library, sparse unmixing ensures that only a few endmembers are included in the subsequent unmixing procedure. The general unmixing model can be written as follows:

$$y = Ax + n$$

where y is a matrix gathering the spectral feature values of all the pixels in the hyperspectral image, A is the spectral library containing the endmembers' signatures in the 110 hyperspectral bands, x is the matrix gathering all the abundance coefficients to be predicted for all pixels in the hyperspectral image and n represents noise. Since only a few endmember signatures in the spectral library are included in the subsequent unmixing procedure, the estimated x contains many zero values, which correspond to endmembers that are unlikely to exist in the specific pixel. To predict the abundance x, the objective function and constraint conditions can be written as follows:

$$\min_{x} \; \|Ax - y\|_F^2 + \lambda \|x\|_2 \quad \text{subject to} \quad x \ge 0, \;\; \sum_{i=1}^{m} |x_i| = 1$$

where $x \ge 0$ is the non-negativity constraint on the abundance vector (abundance nonnegative constraint, ANC), while $\sum_{i=1}^{m} |x_i| = 1$ constrains the abundances of the m endmembers to sum to one (abundance sum-to-one constraint, ASC). $\|X\|_F \equiv \sqrt{\mathrm{trace}(XX^T)}$ is the Frobenius norm. The regularization parameter $\lambda > 0$ balances the two terms in the objective function, and $\|x\|_2$ represents the $L_2$ norm. Although the $L_1$ and $L_{1/2}$ norms have also been used in sparse unmixing to solve this linear convex problem [32], we conducted the sparse unmixing with the $L_2$ norm because the $L_2$ norm can enforce joint sparsity across all pixels [33]. The $L_2$-norm sparse unmixing problem is solved with the alternating direction method of multipliers (ADMM) algorithm [34].
During sparse unmixing, the sparsity constraint drives the abundances of nonexistent endmembers to zero or approximately zero. A threshold is then set manually to identify the "active" endmembers: all endmembers whose abundances are below this value are discarded. Finally, the FCLS-LSMM method is used to invert the abundances from the adaptively selected endmembers and their spectra.
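The thresholding step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `select_active_endmembers`, the threshold value of 0.05 and the rule that every pixel keeps at least its largest-abundance endmember are hypothetical choices made for the example, since the text only states that the threshold is set manually.

```python
import numpy as np


def select_active_endmembers(abundances, threshold=0.05):
    """Return a boolean mask of "active" endmembers for every pixel.

    abundances : array of shape (n_pixels, n_endmembers) holding the
        abundances estimated by the sparse unmixing stage.
    threshold  : hypothetical cut-off; endmembers whose abundance is
        below this value are discarded for that pixel.
    """
    abundances = np.asarray(abundances, dtype=float)
    active = abundances >= threshold
    # Assumption for this sketch: never leave a pixel without any
    # endmember, so the largest abundance is always kept.
    rows = np.arange(abundances.shape[0])
    active[rows, abundances.argmax(axis=1)] = True
    return active


# Two pixels, four library endmembers.
sparse_estimates = np.array([[0.70, 0.02, 0.28, 0.00],
                             [0.01, 0.01, 0.01, 0.00]])
mask = select_active_endmembers(sparse_estimates)
```

In this example the first pixel keeps the first and third endmembers, while the second pixel, where nothing reaches the threshold, keeps only its largest entry.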

Abundance Inversion Based on Linear Spectral Mixture Model
The least squares linear spectral mixture model was used to calculate the abundances from the adaptively selected endmembers for each pixel in the entire image, and the unmixing formula can be expressed as follows:

$$y_i = \sum_{m=1}^{M} C_{i,m}\, x_m + n_i$$

where $y_i$ is the spectral signature of a mixed pixel at band i, $C_{i,m}$ is the spectral signature of the mth endmember at band i, $x_m$ is the corresponding abundance value of the mth endmember, M is the number of endmembers and the noise $n_i$ represents the error term of band i. Traditionally, the endmember library includes all the endmembers collected in the scene; here, however, only the endmembers adaptively selected by the sparsity constrained method were used to invert the abundances with the FCLS-LSMM method. In other words, a different set of "active" endmembers was used to perform the linear spectral unmixing for each pixel, while the signatures of the nonexistent endmembers were removed from the dynamic endmember spectra matrix. Therefore, the proposed method can prevent the overestimation of the abundances of nonexistent endmembers in a complex scene and has the potential to improve the accuracy of abundance inversion.
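A minimal sketch of this per-pixel inversion is given below, assuming the "active" endmember mask is already available. The solver is a plain projected-gradient iteration onto the simplex defined by the ANC and ASC constraints; it is swapped in here for illustration and is not the ENVI routine or the authors' code. The names `project_to_simplex` and `fcls_on_active`, the iteration count and the synthetic spectra are all hypothetical.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = 1}."""
    u = np.sort(v)[::-1]                     # sort in descending order
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def fcls_on_active(y, A, active, n_iter=2000):
    """Fully constrained least squares restricted to active endmembers.

    y      : pixel spectrum, shape (n_bands,)
    A      : spectral library, shape (n_bands, n_endmembers)
    active : boolean mask, shape (n_endmembers,)
    Returns abundances of shape (n_endmembers,); inactive entries are 0.
    """
    x = np.zeros(A.shape[1])
    idx = np.flatnonzero(active)
    if idx.size == 0:
        return x
    C = A[:, idx]                            # dynamic endmember matrix
    step = 1.0 / np.linalg.norm(C, 2) ** 2   # 1 / Lipschitz constant
    z = np.full(idx.size, 1.0 / idx.size)    # start from equal shares
    for _ in range(n_iter):
        gradient = C.T @ (C @ z - y)
        z = project_to_simplex(z - step * gradient)
    x[idx] = z
    return x


# Synthetic, noise-free check with 20 bands and 4 library endmembers.
rng = np.random.default_rng(0)
A = rng.uniform(0.05, 0.6, size=(20, 4))
x_true = np.array([0.6, 0.0, 0.4, 0.0])
y = A @ x_true
x_hat = fcls_on_active(y, A, np.array([True, False, True, False]))
```

Because the inactive columns never enter the least squares problem, their abundances stay exactly zero, which is the mechanism by which the restricted model avoids assigning abundance to endmembers judged absent from the pixel.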

The Accuracy Evaluations of Two LSMMs
We also performed FCLS-LSMM using all endmembers in the spectral library, with the support of ENVI 5.3 software, in order to compare the performance of the traditional unmixing model with that of SCLS-LSMM. Vegetation communities and fractional coverage recorded in the field survey were used to assess the unmixing accuracy of the FCLS-LSMM and SCLS-LSMM algorithms. We randomly sampled 50 homogeneous testing plots along one transect stretching across the Zhalong NNR during a field survey conducted at the same time as the acquisition of the hyperspectral images. The sampling sites were spatially scattered to avoid auto-correlation issues, and neighbouring sites were more than 100 m apart. The coordinates of each plot were recorded with handheld Trimble GEO 7 GPS receivers with centimeter-level positioning accuracy. Each plot measured 30 m × 30 m and contained five quadrats of 1 m × 1 m. The five quadrats were distributed systematically throughout each plot: one quadrat was placed at the crossing point of the two diagonals of the plot, and the other four were located along the diagonals at equal distances from the crossing point. More than three photos of each quadrat were taken with a fisheye camera, and the fractional vegetation cover was extracted from the photos by visual interpretation. The fractional vegetation covers of the five quadrats were averaged to represent the value at each plot center. The abundances of the different wetland vegetation types extracted by the FCLS-LSMM and SCLS-LSMM were each compared with the abundances calculated from the 50 testing plots. The systematic error (SE) and root-mean-square error (RMSE) were used to quantify the accuracy of the abundance estimates. Specifically, RMSE indicates the relative estimation errors of the different land cover abundances, while SE represents the bias, indicating under-estimation or over-estimation.
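$$SE = \frac{1}{N} \sum_{i=1}^{N} (X_i - Y_i)$$

$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (X_i - Y_i)^2}$$

where $X_i$ represents the estimated value of the ith sample, $Y_i$ represents the verification value of the ith sample and N is the number of sample plots.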
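The two accuracy measures can be computed in a few lines. The sketch below assumes that SE is the mean of estimated minus verification values, so that a positive SE indicates over-estimation, and that RMSE is the square root of the mean squared difference; the function names and the sample values are hypothetical and are not taken from the field data.

```python
import numpy as np


def systematic_error(estimated, reference):
    """Mean of (estimated - reference); positive means over-estimation."""
    estimated = np.asarray(estimated, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.mean(estimated - reference))


def root_mean_square_error(estimated, reference):
    """Square root of the mean squared difference."""
    estimated = np.asarray(estimated, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((estimated - reference) ** 2)))


# Hypothetical abundances for three plots (not real survey values).
modelled = [0.5, 0.7, 0.2]
surveyed = [0.4, 0.8, 0.2]
se = systematic_error(modelled, surveyed)
rmse = root_mean_square_error(modelled, surveyed)
```

In this toy case the over- and under-estimates cancel, so SE is essentially zero while RMSE is not, which illustrates why the two measures are reported together.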

Classification Results Based on SCLS-LSMM
The abundances of the various land cover classes were obtained from the HJ-1A/B hyperspectral images with the proposed SCLS-LSMM and the traditional FCLS-LSMM. For both algorithms, each pixel was assigned to the land cover class with the highest abundance (Figure 4a,b). As shown in Figure 4b, Reed marsh is mainly located in the core area of the ZNNR and accounts for 80% of the total area of the reserve. Carex marsh is mainly distributed around the lakes and along the river bench, and generally has the maximum water depth beneath the wetland vegetation canopy. Most Grassy meadow is concentrated to the northeast of the Wuyuer River, in areas of high relief. Weedy meadow, which accounts for a small part of the Zhalong NNR, is mainly scattered across the transitional zone between marsh wetlands and uplands. The spatial distributions of the four wetland classes extracted by the proposed method are consistent with their ecological niches. To evaluate the unmixing accuracy for the main vegetation communities in the wetlands, we compare the proposed SCLS-LSMM algorithm with the traditional FCLS-LSMM algorithm in the following section.
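The rule of labelling each pixel with its highest-abundance class can be sketched as below. The class order in `CLASS_NAMES`, the function name and the toy abundance cube are assumptions made for illustration; only the nine class names come from the classification system described earlier.

```python
import numpy as np

# Assumed index order for the nine land cover classes.
CLASS_NAMES = [
    "Reed marsh", "Carex marsh", "Grassy meadow", "Weedy meadow",
    "Dry land", "Paddy field", "Saline and alkaline land",
    "Impervious", "Open water",
]


def classify_by_max_abundance(abundance_cube):
    """Assign every pixel to the class with the highest abundance.

    abundance_cube : array of shape (rows, cols, n_classes)
    Returns an integer label image of shape (rows, cols).
    """
    return np.argmax(np.asarray(abundance_cube), axis=-1)


# Toy 1 x 2 image: one reed-dominated pixel and one water pixel.
cube = np.zeros((1, 2, len(CLASS_NAMES)))
cube[0, 0, :3] = [0.6, 0.3, 0.1]
cube[0, 1, 8] = 0.9
cube[0, 1, 0] = 0.1
labels = classify_by_max_abundance(cube)
label_names = [[CLASS_NAMES[k] for k in row] for row in labels]
```

A hard label of this kind discards the sub-pixel information, so the abundance maps themselves remain the more informative product when mixed pixels are of interest.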

Accuracy Assessment and Comparative Analysis
The accuracy assessment comparison between the SCLS-LSMM and the traditional FCLS-LSMM algorithms is shown in Table 2. In general, the results indicated that the SCLS-LSMM performed better than the FCLS-LSMM for the main vegetation communities in the Zhalong wetlands. Specifically, the traditional FCLS-LSMM performed worse for Reed marsh, with an RMSE of 0.169 and an SE of 0.053, which are significantly higher than those of the proposed SCLS-LSMM. Further analysis indicated that the FCLS-LSMM severely over-estimated Reed marsh, with a significantly higher absolute value of SE (0.053 vs. -0.014). As shown in Figure 4a, the Reed marsh in the east of the Zhalong wetlands has been overestimated by the FCLS-LSMM algorithm. For Weedy meadow, the SCLS-LSMM method also achieved significantly higher accuracies, with an SE closer to zero (-0.004 vs. -0.043) and a lower RMSE (0.059 vs. 0.130). For Carex marsh and Grassy meadow, the performance of the two models was similar, with the proposed SCLS-LSMM performing slightly better than the traditional FCLS-LSMM. We also produced a scatter plot based on the 50 randomly selected samples to show the agreement between the modelled and reference abundances of the main vegetation communities (Figure 5). The scatter plot showed that the proposed SCLS-LSMM performed better than the traditional FCLS-LSMM, with higher coefficients of determination (R²) and lower root-mean-square errors (RMSE) for the four main vegetation communities.

Table 2. Accuracy assessment of vegetation abundances with the proposed SCLS-LSMM and traditional FCLS-LSMM. SE is the systematic error, while RMSE is the root-mean-square error.
To summarize, the proposed SCLS-LSMM could discriminate Reed marsh from the other vegetation communities and improved both the unmixing accuracies of the main vegetation communities and the detail level of the wetland classification in ZNNR.

Discussion
The current study provides detailed spatial distribution data for the typical freshwater marsh wetlands within our study area of Zhalong NNR, Northeast China, with minimal preprocessing and manual manipulation. To the best of our knowledge, this is among the first attempts to map wetland vegetation communities in such a sophisticated scene. It therefore brings novel insights into detailed wetland vegetation classification and provides an efficient tool for monitoring sub-pixel level wetland dynamics. Unlike upland cover types, which have been mapped and monitored extensively and frequently, spatial data on wetland vegetation communities are scarce for most regions because of their inaccessible locations and similar spectral characteristics. This limitation has inhibited the exploration of questions related to the degradation and succession of wetland vegetation under the water shortage pressures that currently affect most wetland regions in China [35]. Our proposed method can be used to identify wetland vegetation communities. When integrated with hydrological data, these data offer opportunities to investigate the relationship between hydrological processes and the spatial distribution of wetland vegetation species [36]. In addition, detailed data on wetland vegetation communities facilitate studies on the habitat suitability and ecological water requirements of endangered waterfowl.
Many studies have shown that hyperspectral images have an advantage over optical multi-spectral images in mapping land cover, especially in recognizing vegetation communities with similar spectral signatures in a sophisticated scene such as a wetland area [14]. The main wetland type in the Zhalong NNR is Reed marsh, a kind of emergent vegetation subject to permanent or seasonal flooding. Many previous studies integrated multi-spectral optical images with multi-temporal synthetic aperture radar (SAR) images to distinguish marsh from other land cover types [37,38]. However, these studies could not distinguish Reed marsh from the other wetland vegetation communities because its spectral characteristics are similar to those of Carex marsh and meadow in the flooding period. In contrast, the fusion of the fine spatial resolution HJ-1B satellite images with the hyperspectral HJ-1A satellite images makes it feasible to distinguish the spatial distributions of different aquatic vegetation communities. The narrow, contiguous bands of the hyperspectral images help to discriminate Reed marsh from the other aquatic vegetation communities. The spectral profiles of Reed marsh, Carex marsh, Grassy meadow and Weedy meadow showed significant differences in the hyperspectral data, especially from the red to the near-infrared bands (Figure 2). We note that accurate spectral calibration of hyperspectral imaging data is necessary for such quantitative and precise applications, and that artifacts triggered by shifts in the center wavelength matter for accurate spectral calibration. We will therefore further evaluate the influence of these artifacts on wetland classification.
Spectral unmixing is one of the most commonly used approaches for estimating vegetation abundances from hyperspectral images. Constructing the endmember library and defining the spectral signature of each endmember play an important role in spectral unmixing. The objective of our proposed SCLS-LSMM algorithm is to adaptively select, for each pixel, the proper endmember combination for spectral unmixing based on sparsity constraints. By decreasing the occurrence probability of nonexistent endmembers, the SCLS-LSMM improved the abundance estimation accuracies of different wetland vegetation communities in a sophisticated scene. The comparisons between simulated and verified abundances of the different wetland vegetation communities derived from SCLS-LSMM and FCLS-LSMM showed that the proposed SCLS-LSMM improved the accuracies of the wetland vegetation abundances compared with the traditional FCLS-LSMM (Table 2 and Figure 5). This is due to the different endmember selection modes. The traditional FCLS-LSMM algorithm applies a fixed endmember combination to calculate the abundances for every pixel in the scene. Because of the high spatial heterogeneity of image pixels in complex scenes, a fixed endmember combination usually leads to the inclusion of incorrect endmembers. SCLS-LSMM adaptively selects the potential endmembers used to calculate the wetland abundances for each pixel based on the sparsity constraints. Therefore, the proposed method avoids the impacts of nonexistent endmembers on the spectral unmixing accuracies and improves the wetland classification accuracies in the sophisticated scene.

Conclusions
This study developed an adaptive endmember selection method based on sparsity to resolve the problem of the spatial variability of the endmembers' spectra. The performance of the proposed SCLS-LSMM algorithm was compared with that of the traditional FCLS-LSMM, and the feasibility and limit conditions of the proposed algorithm were quantitatively assessed and discussed for typical marsh wetlands of cold regions. The main conclusions are as follows:
(1) The combination of hyperspectral remote sensing imagery with the SCLS-LSMM algorithm has the potential to distinguish marsh wetland vegetation communities. The high spectral resolution of the HJ-1A hyperspectral remote sensing images made it possible to discriminate Reed marsh from the other vegetation communities in the wetlands.
(2) The proposed SCLS-LSMM algorithm can adaptively select the "active" endmembers and their signatures. Only these "active" endmembers are then used to estimate the abundances of the wetland vegetation communities, which effectively relieves the problem of the spatial heterogeneity of endmembers in a sophisticated scene.
(3) Compared with the traditional FCLS-LSMM algorithm, the SCLS-LSMM algorithm significantly improved the unmixing accuracy for Reed marsh. The lower SE and RMSE for the four wetland vegetation communities indicate that the proposed SCLS-LSMM algorithm improved the unmixing accuracies for typical freshwater marsh wetlands.
Funding: This research was funded by Natural Science Foundation of Heilongjiang Province, China, grant number YQ2020D005.

Data Availability Statement:
The data presented in this study are available on reasonable request from the corresponding author.