1. Introduction
Mangrove forests are communities of diverse salt-tolerant evergreen trees and other plant species in tropical and subtropical intertidal zones, and they provide important ecosystem services such as nutrient cycling, carbon sequestration, and coastal hazard (e.g., shoreline erosion, soil salinization, hurricanes, and tsunamis) mitigation [
1,
2,
3,
4]. Due to climate change, natural disasters, and coastal development, the ecological functions of mangrove forests have been continuously degraded for decades [
5,
6]. The diversity and composition of tree species are key parameters for assessing forest ecosystems [
7] and are also particularly essential for understanding their response to environmental change and observing the integrity of endangered ecosystems such as mangroves [
8]. Therefore, the accurate classification of mangrove species and timely monitoring of their spatial distribution are critical for conserving and restoring mangrove forests.
Conventionally, obtaining species information on mangrove forests requires costly, labor-intensive, and time-consuming field investigations, and it is often difficult for investigators to access mangrove forests [
9]. Owing to their capacity for rapid, large-scale, and cost-effective monitoring, remote sensing techniques have been increasingly adopted to survey and evaluate mangrove resources over the past several decades [
10]. Medium-resolution multispectral imagery, such as that from Landsat [
11,
12], SPOT [
13], and Sentinel-2 [
8], is often used to map the distribution of mangrove forests at regional, national, or even global scales. Owing to the rich spatial and textural detail of their high-resolution multispectral images, satellites such as Quickbird [
14], IKONOS [
15], Worldview [
16,
17], and Pléiades-1 [
18] have been widely employed to classify mangrove species at landscape or regional scales.
Compared with multispectral satellite images, which carry limited spectral information, hyperspectral remote sensing data contain dozens or hundreds of contiguous wavebands with spectral features related to plant functional traits and have been found to be more effective for tree species classification [
19]. Leaf [
20,
21], canopy [
21,
22,
23], satellite (e.g., Hyperion [
24,
25]), and airborne (e.g., CASI [
26], AVIRIS [
27,
28], and AISA [
29]) hyperspectral data have been widely employed in classifying mangrove forests and other forest types (e.g., temperate, subtropical, tropical rainforest, and urban). With rapid advancements in unmanned aerial vehicles (UAVs) and self-driving cars, light detection and ranging (LiDAR) techniques have been increasingly utilized in classifying tree species. However, several studies have pointed out that LiDAR alone cannot accurately classify tree species but can be combined with hyperspectral images to further improve classification accuracy [
30,
31,
32]. Therefore, LiDAR-acquired structural parameters are often used as auxiliary information, and hyperspectral data remain indispensable for plant species classification.
The high dimensionality and multi-collinearity of hyperspectral data may decrease model accuracy in supervised learning processes because the number of spectral wavebands often exceeds the number of model calibration samples [
33]. Therefore, additional processing methods are needed for hyperspectral data to resolve the problem of redundant predictors and enhance spectral differences. With regard to feature extraction, dimensionality reduction (e.g., principal component analysis (PCA)), waveband selection (e.g., successive projections algorithm (SPA)), and vegetation index (VI) extraction are three commonly used strategies for relating sensitive spectral features to plant species information [
34,
35,
36]. Moreover, different sample subset partition methods (e.g., stratified random sampling (STRAT), Kennard-Stone sampling algorithm (KS), and sample subset partition based on joint X-Y distances (SPXY)) may cause different classification results [
37,
38]. However, very few studies have investigated the combination of feature extraction and sample subset partition in the classification of mangrove species.
Continuous wavelet analysis (CWA) is an effective noise reduction method that can also enhance the details of spectral features in hyperspectral data [39, 40, 41]. Hence, CWA has been successfully utilized in quantitative remote sensing for retrieving functional traits of plants (e.g., leaf mass per area [42], canopy water content [43], and leaf dry matter content and specific leaf area [44]). In contrast, only a few studies have applied CWA to improve species classification accuracy in herbaceous wetlands and tropical dry forests [45, 46]. Moreover, the advantage of CWA for hyperspectral data has rarely been investigated in mangrove species classification.
Using the leaf hyperspectral reflectance spectra of samples of four mangrove species collected across six regions, this study aimed to explore the potential of CWA combined with different sample subset partition and feature extraction methods in mangrove species classification. The results at the leaf scale may lay the foundation for further studies with UAV- and satellite-based hyperspectral images.
2. Materials and Methods
2.1. Field Sample
A total of 301 leaf samples of four mangrove species were collected from six sites (Figure 1) in 2017 and 2018 (Table 1), comprising 60 of Avicennia marina, 46 of Bruguiera gymnorrhiza, 81 of Kandelia obovata, and 114 of Aegiceras corniculatum. Each sample was collected from a single-species plot of 10 m × 10 m, and the center location of each plot was recorded with a handheld global positioning system (GPS) receiver. Any two plots were at least 30 m apart. For each plot, about 20–30 leaves were picked from the canopy using an extendable trimming pole. To ensure that the picked leaves were mature, leaves between the third and fifth layers from the top were selected [47]. Finally, each sample was immediately sealed in a fresh-keeping bag, kept in a dark box with ice packs, and transported to a nearby laboratory for spectral measurement and chemical analysis.
2.2. Leaf Reflectance Measurement and Spectra Preprocessing
An ASD FieldSpec 4 portable spectroradiometer (Analytical Spectral Devices, Inc., Boulder, CO, USA) was used to measure the leaf spectral reflectance of the four mangrove species. The instrument records 2151 wavebands from 350 to 2500 nm, with a sampling interval of 1.4 nm in the range of 350–1000 nm and 2 nm in the range of 1000–2500 nm. For each sample, ten leaves were randomly selected and measured with an ASD leaf clip and plant probe; each leaf was recorded with ten successive scans, and the spectra of the ten leaves were averaged to give the final reflectance spectrum of the sample.
Because of systematic noise at the two edges of the leaf spectrum (350–399 nm and 2451–2500 nm), the leaf reflectance spectra of the 301 samples were first restricted to 400–2450 nm (Figure 2). To minimize the effects of random noise on model calibration, the remaining spectra were then smoothed with a Savitzky-Golay filter using a second-order polynomial fit and a window size of seven data points [48]. Finally, the smoothed spectra (hereafter Smth) were subjected to first derivative analysis (hereafter Der), because this can enhance the peaks and valleys of spectral features [49] and minimize the impact of multiple scattering of irradiation [50].
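The two preprocessing steps can be illustrated with a minimal sketch. It assumes the standard tabulated Savitzky-Golay weights for a seven-point, second-order fit, leaves the three points at each edge unsmoothed, and approximates the first derivative by central differences; the edge handling, the derivative scheme, and the function names are illustrative assumptions, not the authors' MATLAB implementation.

```python
# Savitzky-Golay convolution weights for a 7-point window and a 2nd-order
# polynomial (standard tabulated values; the divisor is 21).
SG7_WEIGHTS = (-2, 3, 6, 7, 6, 3, -2)
SG7_NORM = 21.0


def savgol7_smooth(spectrum):
    """Smooth a spectrum; the 3 points at each edge are left as-is (assumption)."""
    half = len(SG7_WEIGHTS) // 2
    smoothed = list(spectrum)
    for i in range(half, len(spectrum) - half):
        window = spectrum[i - half:i + half + 1]
        smoothed[i] = sum(w * r for w, r in zip(SG7_WEIGHTS, window)) / SG7_NORM
    return smoothed


def first_derivative(spectrum, step_nm=1.0):
    """Central-difference first derivative; one-sided differences at the two ends."""
    n = len(spectrum)
    deriv = [0.0] * n
    for i in range(1, n - 1):
        deriv[i] = (spectrum[i + 1] - spectrum[i - 1]) / (2.0 * step_nm)
    deriv[0] = (spectrum[1] - spectrum[0]) / step_nm
    deriv[-1] = (spectrum[-1] - spectrum[-2]) / step_nm
    return deriv


# Toy spectrum: a smooth quadratic trend with one noisy spike at index 6.
raw = [0.01 * x * x for x in range(12)]
raw[6] += 0.3
smth = savgol7_smooth(raw)      # counterpart of "Smth"
der = first_derivative(smth)    # counterpart of "Der"
```

A second-order Savitzky-Golay filter reproduces any quadratic trend exactly, so only the spike in the toy spectrum is damped while the underlying shape is preserved.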
2.3. Continuous Wavelet Analysis of Leaf Reflectance
Wavelet analysis is an effective method for decomposing the original signal into multiple amplitudes and scales [
51,
52] and has been widely applied in the field of vegetation remote sensing [
41,
43,
44]. Wavelet analysis is generally implemented as either discrete wavelet analysis (DWA) or continuous wavelet analysis (CWA). DWA often transforms the most informative part of the input data to avoid redundancy, but its decomposed components are difficult to interpret in waveband-by-waveband analysis [44]. In contrast, CWA generates interpretable signals that correspond directly to the original leaf spectra, so the decomposed signals can reflect information on plant absorption features [53, 54]. Therefore, we employed CWA to explore the details of leaf spectra in the classification of mangrove species.
CWA performs the convolution of the reflectance spectrum $f(\lambda)$ into sets of coefficients with a mother wavelet function at various scales (Equation (1)) [55]. This may be expressed as

$$W_f(a,b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{+\infty} f(\lambda)\, \psi\!\left(\frac{\lambda - b}{a}\right) \mathrm{d}\lambda \qquad (1)$$

where $W_f(a,b)$ ($a$ and $b$ are positive real numbers) is the vector of wavelet coefficients, $a$ and $b$ represent the scaling and shifting factors, respectively, indicating the width and position of the wavelet function [56], and $\psi$ is the mother wavelet function.
The shape of the leaf reflectance spectrum is similar to a Gaussian or quasi-Gaussian function, or a composition of several Gaussian functions [
57]. Based on the suggestion of Torrence et al. [
58], the second order derivative of Gaussian (namely the ‘Mexican Hat’, “mexh”) was chosen as the mother wavelet function (
Figure 3). The “mexh” function is symmetric and its mean power is zero [
59]. Moreover, the “mexh” function has an infinite support width of (–5s, 5s) (
) and its effective basic support range is (–5, 5) [
60].
To reduce the computational load, following the suggestion of Cheng et al. [61], CWA was performed only at dyadic scales instead of all possible scales. Moreover, the 2051 wavebands (400–2450 nm) available in this study limited the dyadic scale to less than $2^{10} = 1024$. Based on the suggestion of Cheng et al. [42] and preliminary experiments on mangrove species classification, eight scales ($2^0, 2^1, \ldots, 2^7$) were chosen for the CWA of Smth and Der (also termed the “wavelet power spectra of Smth and Der”), and CWA was implemented with the wavelet packets of MATLAB R2018a.
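As a rough illustration of CWA at dyadic scales, the sketch below evaluates the wavelet coefficients by direct summation with a Mexican Hat mother wavelet. It assumes evenly spaced wavebands (indices stand in for wavelengths), truncates the wavelet at ±5 times the scale, and treats samples outside the spectrum as zero; these choices and the function names are assumptions made for the example and do not reproduce the MATLAB routine used in the study.

```python
import math

MEXH_NORM = 2.0 / (math.sqrt(3.0) * math.pi ** 0.25)


def mexh(t):
    """Mexican Hat mother wavelet (normalised, sign-flipped 2nd derivative of a Gaussian)."""
    return MEXH_NORM * (1.0 - t * t) * math.exp(-0.5 * t * t)


def cwt_mexh(spectrum, scales, support=5.0):
    """Return {scale: coefficients}; W(a, b) = a**-0.5 * sum_k f[k] * psi((k - b) / a).

    The wavelet is truncated at +/- support * a, and samples outside the
    spectrum are treated as zero (both are assumptions of this sketch).
    """
    n = len(spectrum)
    result = {}
    for a in scales:
        reach = int(math.ceil(support * a))
        kernel = [mexh(k / a) / math.sqrt(a) for k in range(-reach, reach + 1)]
        coeffs = []
        for b in range(n):
            total = 0.0
            for j, weight in enumerate(kernel):
                k = b + j - reach
                if 0 <= k < n:
                    total += spectrum[k] * weight
            coeffs.append(total)
        result[a] = coeffs
    return result


dyadic_scales = [2 ** p for p in range(8)]   # 1, 2, 4, ..., 128

# Toy "spectrum": flat background with one Gaussian absorption-like dip at index 100.
toy = [0.5 - 0.2 * math.exp(-0.5 * ((k - 100) / 4.0) ** 2) for k in range(200)]
power = cwt_mexh(toy, dyadic_scales[:4])     # only four scales to keep the demo fast
```

Because the Mexican Hat has zero mean, flat parts of the spectrum give coefficients close to zero away from the edges, whereas the dip produces a strong negative coefficient centred on its position. The zero padding used here creates edge artefacts within about five scale widths of either end of the spectrum.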
2.4. Establishment of Mangrove Species Classification Model
To examine the impact of different sample subset partition and feature extraction methods on the performance of mangrove species classification, three commonly used sample subset partition methods (STRAT, KS, and SPXY) and three feature extraction methods (PCA, SPA, and VI) were tested and compared. Random forests (RF), a prevalent and successful machine learning method, was chosen as the classification model in this study.
According to the suggestion of Roth et al. [
62], to ensure the modeling and prediction sets contained samples of each species, the original dataset (301 samples) was first divided into two sets with STRAT, using 70% (211) of samples for modeling and 30% (90) of samples for prediction (
Figure 4). STRAT was used to select the prediction set because the KS and SPXY algorithms select sample subsets based on Euclidean distances in x-space or in joint x-y space, which resulted in unbalanced prediction sample sizes among the four species. Afterwards, the modeling set was divided into two sets, with 70% (148 samples) being used as a calibration set to construct the species classification models and the remaining samples (the validation set) being discarded because of the aforementioned influence of KS and SPXY. Each sample subset partitioning process was repeated 50 times to ensure the reliability of the classification results [
63].
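The first partition step (70% modeling, 30% prediction, stratified by species) can be sketched as follows. The species codes, the fixed seed, and the per-class rounding rule are assumptions made for illustration; with the sample counts of Section 2.1, this rounding happens to reproduce the 211/90 split reported above.

```python
import random
from collections import defaultdict


def stratified_split(labels, fraction, rng):
    """Return (selected, remaining) index lists, sampling `fraction` of each class."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    selected, remaining = [], []
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)
        take = round(fraction * len(members))   # rounding rule is an assumption
        selected.extend(members[:take])
        remaining.extend(members[take:])
    return sorted(selected), sorted(remaining)


# Species counts from Section 2.1; the short codes are hypothetical labels.
counts = {"AM": 60, "BG": 46, "KO": 81, "AC": 114}
labels = [code for code, n in counts.items() for _ in range(n)]

rng = random.Random(0)                     # fixed seed only for reproducibility
modeling, prediction = stratified_split(labels, 0.70, rng)
print(len(modeling), len(prediction))      # 211 90
```

Repeating the call with different seeds would mimic the 50 repetitions of the partitioning process.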
A total of 270 (18 × 3 × 5) models were constructed, considering 18 types of spectra (Smth, Der, Smth + CWA (eight scales), and Der + CWA (eight scales)) in conjunction with three sample subset partition methods (STRAT, KS, and SPXY) and five feature extraction methods (PCA, SPA, and three VIs). To present the 270 models simply and clearly, the CWA of Smth and Der is denoted as Smth_scale and Der_scale (scale = 1, 2, 4, 8, 16, 32, 64, 128), and each combination of sample subset partition and feature extraction is denoted in the form STRAT_PCA, which indicates that the sample subset was partitioned by STRAT and the features were extracted by PCA.
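The model count and naming scheme can be checked with a short enumeration. The `|` separator between the spectrum label and the partition/feature label is a hypothetical choice for this example only.

```python
from itertools import product

scales = [2 ** p for p in range(8)]                       # 1, 2, ..., 128
spectra = ["Smth", "Der"]
spectra += [f"{base}_{s}" for base in ("Smth", "Der") for s in scales]
partitions = ["STRAT", "KS", "SPXY"]
features = ["PCA", "SPA", "NDVI", "RVI", "TBVI"]

# One label per model, e.g. "Der_8|KS_SPA" (separator is illustrative).
models = [f"{spec}|{part}_{feat}"
          for spec, part, feat in product(spectra, partitions, features)]
print(len(spectra), len(models))    # 18 270
```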
2.4.1. Sample Subset Partition
Compared with simple random sampling (SRS), STRAT can select more representative samples and is especially suitable for remote sensing-based plant species classification [28]. The KS algorithm calculates the Euclidean distances between samples in the independent variable (x) space using a stepwise procedure: the two samples separated by the largest Euclidean distance are selected first, and each subsequent sample is the one farthest from those already selected [64]. SPXY improves on the KS algorithm [65] by extending the Euclidean distance calculation to both independent and dependent variables. For details of KS and SPXY, refer to Galvao et al. [65]. The three aforementioned sample subset partition methods were implemented in MATLAB R2018a.
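The KS selection rule can be written compactly. The sketch below uses the usual maximin formulation, in which "farthest from those already selected" means the candidate whose nearest selected sample is farthest away; it is a plain-Python illustration under that assumption, not the MATLAB code used in the study. SPXY would follow the same loop with a distance that also includes the dependent variable.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def kennard_stone(samples, n_select):
    """Return indices of `n_select` samples chosen by the Kennard-Stone maximin rule."""
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")
    dist = [[euclidean(samples[i], samples[j]) for j in range(n)] for i in range(n)]

    # Step 1: start with the two samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    candidates = set(range(n)) - set(selected)

    # Step 2: repeatedly add the candidate whose nearest selected sample is farthest away.
    while len(selected) < n_select:
        best = max(sorted(candidates),
                   key=lambda c: min(dist[c][s] for s in selected))
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy one-band "spectra" with values 0, 1, 2 and 10.
toy = [[0.0], [1.0], [2.0], [10.0]]
print(kennard_stone(toy, 3))    # [0, 3, 2]
```

Because the rule depends only on the spectra and not on the species labels, nothing prevents it from drawing unevenly from the four species, which is consistent with the unbalanced subsets noted in Section 2.4.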
2.4.2. Feature Extraction
PCA is a typical dimensionality reduction method that is widely applied in hyperspectral image analysis [34, 66]. In this study, the spectra (Smth, Der, Smth + CWA (eight scales), and Der + CWA (eight scales)) were subjected to PCA using the pca() function in MATLAB R2018a, and the leading principal components were chosen to calibrate the model based on the eigenvalues-greater-than-one rule [67]. In addition, we summed the percentages of total variance explained by the selected principal components for each spectrum (Table 2).
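The component selection rule and the summed explained variance can be sketched as below. The eigenvalues are hypothetical and are taken as given (for instance, from a PCA on standardized variables, for which the threshold of one is meaningful); whether the spectra were standardized before PCA is not stated in the text.

```python
def kaiser_selection(eigenvalues):
    """Return (number of retained PCs, % of total variance they explain)."""
    ordered = sorted(eigenvalues, reverse=True)
    retained = [ev for ev in ordered if ev > 1.0]
    explained = 100.0 * sum(retained) / sum(ordered)
    return len(retained), explained


# Hypothetical eigenvalues for five variables (they sum to 5).
n_pcs, pct = kaiser_selection([3.0, 1.5, 0.3, 0.15, 0.05])
print(n_pcs, round(pct, 1))    # 2 90.0
```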
SPA is a forward variable selection algorithm that begins with one waveband, adds a new one at each iteration, and applies projection operators in a vector space until a specified number of wavebands is reached [68]. The advantage of SPA lies in its deterministic search process, which gives good robustness and reproducible results. To avoid over-fitting, the ratio of the number of selected wavebands to the total number of training samples should be 0.15–0.2 [69]. Gross et al. [45] found that model performance with fewer than 20 features was relatively stable and that extra features markedly increased computation time. Hence, we set the parameter m_max (maximum number of variables) of the spa() function to 20. The process was implemented with SPA code (www.ele.ita.br/~kawakami/spa) in MATLAB R2018a.
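The projection step at the core of SPA can be sketched in a few lines. The example shows only the projection chain from a given starting waveband; the selection of the starting waveband and of the final number of wavebands by validation error, which the full algorithm performs, is omitted, and the function name and data layout are assumptions.

```python
import math


def spa_projection(X, start, n_select):
    """Projection phase of SPA: return indices of minimally collinear columns.

    X is a list of rows (samples x wavebands); `start` is the first waveband.
    """
    columns = [list(col) for col in zip(*X)]          # column-wise working copy
    n_select = min(n_select, len(columns))
    selected = [start]
    for _ in range(n_select - 1):
        ref = columns[selected[-1]]
        ref_norm2 = sum(v * v for v in ref)
        if ref_norm2 == 0.0:
            break                                     # nothing left to project onto
        for j, col in enumerate(columns):
            if j in selected:
                continue
            # Remove from column j its component along the last selected column.
            coef = sum(a * b for a, b in zip(col, ref)) / ref_norm2
            columns[j] = [a - coef * b for a, b in zip(col, ref)]
        # Pick the remaining column with the largest norm after projection.
        remaining = [j for j in range(len(columns)) if j not in selected]
        best = max(remaining, key=lambda j: math.fsum(v * v for v in columns[j]))
        selected.append(best)
    return selected


# Toy data: waveband 1 is an exact multiple of waveband 0, so it is never chosen.
toy = [[1.0, 2.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 0.5]]
print(spa_projection(toy, start=0, n_select=3))    # [0, 2, 3]
```

Each new waveband is the one carrying the most information not already explained by the wavebands selected so far, which is how the method limits multi-collinearity.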
VIs are generally derived from two or three wavebands to detect the differences of plant physiology and biochemistry [
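69]. There are many forms of VI based on different mathematical rules; commonly used ones are the normalized difference vegetation index (NDVI = $(R_i - R_j)/(R_i + R_j)$, where $R_i$ and $R_j$ denote the spectral values at wavebands $i$ and $j$) [70], the ratio vegetation index (RVI = $R_i/R_j$) [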
71] and three bands vegetation index (TBVI =
/(
)) [
72]. The wavebands selected by SPA from the spectra (Smth, Der, Smth + CWA (eight scales), and Der + CWA (eight scales)) were then used to construct the three forms of VI. The selected wavebands generated $n \times (n - 1)$ combinations of all possible NDVIs and RVIs and $n \times (n - 1) \times (n - 2)/2$ combinations of TBVIs, where $n$ is the number of wavebands selected by SPA. The computations were carried out in MATLAB R2018a.
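To make the combination counts concrete, the sketch below enumerates all ordered waveband pairs for the standard normalized-difference and simple-ratio forms and evaluates the two counting formulas for n = 20 (the m_max value set for SPA). The reflectance values are hypothetical, and TBVI is only counted, not computed.

```python
from itertools import permutations


def ndvi(r_i, r_j):
    """Normalized difference of two spectral values (undefined if r_i + r_j == 0)."""
    return (r_i - r_j) / (r_i + r_j)


def rvi(r_i, r_j):
    """Simple ratio of two spectral values (undefined if r_j == 0)."""
    return r_i / r_j


def pairwise_indices(values):
    """Return {(i, j): (NDVI, RVI)} for every ordered pair of distinct wavebands."""
    out = {}
    for i, j in permutations(range(len(values)), 2):
        out[(i, j)] = (ndvi(values[i], values[j]), rvi(values[i], values[j]))
    return out


n = 20                                   # m_max used for SPA in this study
n_pairs = n * (n - 1)                    # NDVI and RVI combinations
n_triples = n * (n - 1) * (n - 2) // 2   # TBVI combinations
print(n_pairs, n_triples)                # 380 3420

# Hypothetical reflectance values at three selected wavebands.
toy = pairwise_indices([0.05, 0.45, 0.30])
```

Derivative and wavelet-transformed spectra can take zero or negative values, so a practical implementation would need to guard the divisions.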
2.4.3. Random Forests Classification
The RF algorithm combines the bagging (bootstrap aggregating) sampling method and internal cross-validation with an ensemble of K binary Classification and Regression Trees (CART), and can effectively mitigate the over-fitting problem of machine learning [
73,
74,
75]. Hundreds of decision tree models are constructed by RF and the randomized subsets of target data and variables are utilized for building each tree [
76]. The predictions of these classification trees are then combined by majority voting to determine the final class [
77]. There are two main tuning parameters (
ntree and
mtry) in an RF model; both were kept at their default values because previous studies have reported that the default values and empirical criteria generally produce acceptable results [
74,
78]. The RF was performed with RF code (
https://code.google.com/archive/p/randomforest-matlab/downloads) in MATLAB R2018a.
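Two ingredients of RF classification mentioned above, majority voting and the mtry tuning parameter, can be illustrated in isolation. The sketch assumes the commonly cited classification default of mtry equal to the floor of the square root of the number of features; whether the MATLAB code used here applies exactly this default is an assumption, and the votes are hypothetical.

```python
import math
from collections import Counter


def default_mtry(n_features):
    """Commonly cited default for classification forests: floor(sqrt(n_features))."""
    return max(1, math.isqrt(n_features))


def majority_vote(tree_predictions):
    """Return the class predicted by most trees (ties go to the class seen first)."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical votes of five trees for one leaf sample.
votes = ["K. obovata", "A. marina", "K. obovata", "A. corniculatum", "K. obovata"]
print(majority_vote(votes), default_mtry(20))    # K. obovata 4
```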
2.5. Evaluation of Classification Model Performance
Overall accuracy (OA) (Equation (2)), producer’s accuracy (PA) (Equation (3)) and user’s accuracy (UA) (Equation (4)) were employed to evaluate the performance of each classification model. Furthermore, the allocation disagreement (AD) (Equation (5)) and quantity disagreement (QD) (Equation (6)) were adopted rather than Kappa, since Kappa neglects the assessment of off-diagonal elements [
79] and is highly correlated with OA [
63]. For specific descriptions and explanations of AD and QD, refer to Pontius Jr. et al. [
80] and Nurmemet et al. [
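81]. The larger the values of OA, PA, or UA, the better the model performance; conversely, the larger the value of AD or QD, the poorer the model performance. These measures may be expressed as

$$\mathrm{OA} = \frac{1}{n}\sum_{i=1}^{4} x_{ii} \qquad (2)$$

$$\mathrm{PA}_i = \frac{x_{ii}}{x_{i+}} \qquad (3)$$

$$\mathrm{UA}_i = \frac{x_{ii}}{x_{+i}} \qquad (4)$$

$$\mathrm{AD} = \frac{1}{n}\sum_{i=1}^{4} \min\left(x_{i+} - x_{ii},\; x_{+i} - x_{ii}\right) \qquad (5)$$

$$\mathrm{QD} = \frac{1}{2n}\sum_{i=1}^{4} \left|x_{i+} - x_{+i}\right| \qquad (6)$$

$$X = \begin{pmatrix} x_{11} & \cdots & x_{14} \\ \vdots & \ddots & \vdots \\ x_{41} & \cdots & x_{44} \end{pmatrix} \qquad (7)$$

where $X$ is a $4 \times 4$ error matrix (Equation (7)), and $i$ is the row/column number of the relevant species; $x_{i+}$ represents the sum of the $i$th row of the error matrix, i.e., the total number of samples of the $i$th species, whether assigned to the $i$th species or to the other three species; $x_{+i}$ represents the sum of the $i$th column of the error matrix, i.e., the total number of samples assigned to the $i$th species; $x_{ii}$, the $i$th diagonal element of the error matrix, indicates the number of samples correctly classified; and $n$ is the sample size of the prediction set.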
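The five measures can be computed directly from an error matrix, as in the sketch below. It assumes rows are reference species and columns are predicted species, as described above, and uses the widely used count-based forms of allocation and quantity disagreement; the example matrix is hypothetical.

```python
def accuracy_metrics(matrix):
    """Compute OA, PA, UA, AD and QD from a square error matrix.

    Rows are reference species and columns are predicted species. Every
    species is assumed to occur at least once in both the reference and the
    predicted labels (otherwise PA or UA would divide by zero).
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_sum = [sum(matrix[i]) for i in range(k)]
    col_sum = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    diag = [matrix[i][i] for i in range(k)]

    oa = sum(diag) / n
    pa = [diag[i] / row_sum[i] for i in range(k)]
    ua = [diag[i] / col_sum[i] for i in range(k)]
    qd = sum(abs(row_sum[i] - col_sum[i]) for i in range(k)) / (2 * n)
    ad = sum(min(row_sum[i] - diag[i], col_sum[i] - diag[i]) for i in range(k)) / n
    return {"OA": oa, "PA": pa, "UA": ua, "AD": ad, "QD": qd}


# Hypothetical two-species error matrix (20 prediction samples).
m = accuracy_metrics([[8, 2], [1, 9]])
print(m["OA"], m["AD"], m["QD"])    # 0.85 0.1 0.05
```

In this formulation OA, AD, and QD sum to one, so the two disagreement terms split the total error into a part caused by mismatched class proportions (QD) and a part caused by misallocation among classes (AD).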
5. Conclusions
With leaf hyperspectral data, we have explored the potential of CWA coupled with different sample subset partition and feature extraction methods in mangrove species classification. The following conclusions may be drawn:
- 1) Regardless of the effect of sample subset partition and feature extraction methods on the performance of mangrove species classification, CWA with suitable scales has great potential to improve the classification accuracy.
- 2) The STRAT method combined with PCA or SPA methods is recommended to improve classification performance.
- 3) Compared with the original reflectance spectra, the derivative spectra can significantly improve the classification accuracy.
The leaf-level results can lay the foundation for subsequent in-depth studies of mangrove species classification with UAV-acquired or satellite hyperspectral data, contributing to the understanding of large-scale species composition and to the effective protection and management of mangrove forests. Moreover, the encouraging performance of CWA can also be extended to other vegetation types, such as forests and crops. Further, the effects of eco-environmental factors (e.g., elevation, soil properties, leaf biochemical components, and canopy structure) on the performance of mangrove species classification need to be investigated in future studies.