Mapping an Invasive Plant Spartina alterniflora by Combining an Ensemble One-Class Classification Algorithm with a Phenological NDVI Time-Series Analysis Approach in Middle Coast of Jiangsu, China

Abstract: Spartina alterniflora (S. alterniflora) is one of the worst plant invaders in the coastal wetlands of China. Accurate and repeatable mapping of S. alterniflora invasion is essential for developing cost-effective management strategies to conserve native biodiversity. Traditional remote-sensing-based mapping methods require a lot of fieldwork for sample collection. Moreover, our ability to detect this invasive species is still limited because of poor spectral separability between S. alterniflora and its co-dominant native plants. Therefore, we proposed a novel scheme that uses an ensemble one-class classifier (EOCC) in combination with phenological Normalized Difference Vegetation Index (NDVI) time-series analysis (TSA) to detect S. alterniflora. We evaluated the performance of the EOCC algorithm in two scenarios, i.e., single-scene analysis (SSA) and NDVI-TSA, in the core zones of Yancheng National Natural Reserve (YNNR). A fully supervised classifier, the support vector machine (SVM), was tested in the same two scenarios for comparison. With these scenarios, we also investigated the crucial phenological stages and the advantage of phenological NDVI-TSA in S. alterniflora recognition. The results indicated that the EOCC, using only positive training data, performed about as well as the SVM trained on complete training data in the YNNR. Moreover, the EOCC algorithm transferred more robustly, with notably higher classification accuracy than the SVM, when applied to a second site without retraining. Furthermore, when combined with the phenological NDVI-TSA, the EOCC algorithm produced a more balanced sensitivity-specificity result and showed slightly better transferability than in the best single phenological stage (i.e., the senescence stage in November). The achieved results (overall accuracy (OA), Kappa, and true skill statistic (TSS) of 92.92%, 0.843, and 0.834 for the YNNR, and of 90.94%, 0.815, and 0.825 for transferability to the non-training site) suggest that our detection scheme has high potential for mapping S. alterniflora across different areas, and that the EOCC algorithm can be a viable alternative to traditional supervised classification methods for invasive plant detection.

(1) How does the EOCC algorithm perform compared with individual OCC algorithms and a standard supervised classifier (SVM) in S. alterniflora detection? (2) How much does phenological NDVI-TSA improve S. alterniflora detection? (3) Is the detection scheme transferable and robust when it is applied in different regions?

Study Area
The middle coast of Jiangsu extends from the Chuandong Estuary in the south to the Xinyang Estuary in the north (Figure 1a). The region spans the subtropical and warm temperate zones and has a marine monsoon climate with moderately well-defined seasonality [9]. The vegetation mainly comprises S. alterniflora, Phragmites australis (P. australis), and Suaeda salsa (S. salsa), which are predominantly located in two regions, namely the core zones of the YNNR (Figure 1b) and the DMNNR (Figure 1c). Owing to continuous reclamation and agricultural activities, most of the salt marsh plants between the Doulong and Chuandong estuaries have been replaced by aqua-farms and farmlands (Figure 1a) [9]. We therefore selected these two national nature reserves as the study sites. Specifically, the YNNR was chosen as the study area for evaluating the performance of our proposed method in mapping S. alterniflora, while the DMNNR was used for testing the transferability. The YNNR was established in 1983 to protect red-crowned cranes and their habitat. The DMNNR was established in 1986 for the protection of Elaphurus davidianus and its habitat. Both nature reserves are now under national administration with strict protection implemented [36]. Human activities are strictly forbidden in the core areas of the reserves.
Remote Sens. 2020, 12, 4010

Remote Sensing Data
The Gaofen-1 (GF-1) satellite was launched in April 2013, carrying two panchromatic/multispectral cameras and four wide-field-of-view (WFV) cameras. Each WFV camera has a resolution of 16 m and four multispectral bands: Blue (B1: 0.45-0.52 µm), Green (B2: 0.52-0.59 µm), Red (B3: 0.63-0.69 µm), and NIR (B4: 0.77-0.89 µm). With the four cameras combined, the GF-1 WFV data have a swath width of 800 km and a high-frequency revisit time of four days [37]. A total of 28 high-quality GF-1 WFV images with minimal cloud cover were available for the study area in 2015. Only the highest-quality image of each month was selected to construct the monthly time series [11]. However, no cloud-free GF-1 WFV image (<10% cloud cover) was available for September 2015 because of cloud contamination. The GF-1 WFV image acquired in September 2016 was therefore used in its place in the subsequent time-series analysis (Table 1). All the GF-1 WFV images were freely downloaded from the China Center for Resources Satellite Data and Application.

Reference Data
Because of the limited infrastructure and the strict protection rules, the central areas of the YNNR and DMNNR are difficult to reach; our field surveys were therefore carried out only along the periphery and certain narrow roads in the core areas [38]. Two field trips were conducted to collect ground reference samples, in July 2014 and September 2015. With a handheld GPS unit, we recorded the locations of S. alterniflora, native species, and other land-cover types. In total, 206 S. alterniflora points, 105 P. australis points, 123 S. salsa points, and 226 points of other land-cover types were collected in the YNNR. Because the ground reference data were collected in two different years, we removed nine samples located in the ecotone between S. alterniflora and native plants and three samples from areas frequently flooded by tides, to reduce the uncertainty caused by the time shift. All the ground samples were then inspected on Google Earth (GE) images (acquired on 13 July 2015 and 14 August 2015), and the points located in a GF-1 WFV pixel (16 m × 16 m) without clear land-cover information were excluded. The detailed spatial information provided by the GE images helped to distinguish S. alterniflora accurately from other land-cover types [4]. To ensure a well-stratified reference dataset, we further supplemented the collected in situ data with visual interpretations of the GE images. We first transformed the 2014 land-cover maps of the YNNR and DMNNR into binary maps containing two classes (S. alterniflora and non-S. alterniflora). The land-cover maps were generated from a Landsat-8 Operational Land Imager (OLI) image with a 30 m resolution, and both had overall accuracies (OA) exceeding 95% [4]. Then, 2000 and 1000 random points for the YNNR and DMNNR, respectively, were generated from the binary maps, subject to a minimum-distance constraint of 100 m to ensure that individual pixel locations were sampled only once. Each random point was then inspected using the GE images, and the points without clear land-cover information were excluded. The sample points were cross-checked by at least one other interpreter. Finally, a total of 1721 samples (1213 for the YNNR and 508 for the DMNNR), comprising 650 S. alterniflora and 1071 non-S. alterniflora samples, formed our reference dataset.
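The minimum-distance constraint on the random points can be illustrated with a short sketch. This is not the procedure used in the study (the GIS tool is not stated); it is a simple greedy rejection scheme in plain Python, and the function name `sample_with_min_distance`, the toy candidate grid, and the seed are all assumptions made for illustration.

```python
import math
import random


def sample_with_min_distance(candidates, n_points, min_dist, seed=0):
    """Greedily pick up to n_points candidates that are pairwise >= min_dist apart.

    candidates: iterable of (x, y) coordinates in metres (projected CRS assumed).
    """
    rng = random.Random(seed)
    shuffled = list(candidates)
    rng.shuffle(shuffled)  # visit candidate locations in random order
    accepted = []
    for point in shuffled:
        if len(accepted) == n_points:
            break
        # Keep the point only if it respects the spacing to every accepted point.
        if all(math.dist(point, other) >= min_dist for other in accepted):
            accepted.append(point)
    return accepted


# Toy example (hypothetical data): candidate locations every 50 m in a 1 km square.
candidates = [(x, y) for x in range(0, 1000, 50) for y in range(0, 1000, 50)]
points = sample_with_min_distance(candidates, n_points=20, min_dist=100.0)
```

With a 100 m spacing and 16 m pixels, no two accepted points can fall in the same pixel, which is the property the constraint is meant to guarantee.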

Remote Sensing Image Preprocessing
The remote sensing (RS) images were pre-processed using the Environment for Visualizing Images (ENVI) software; the pre-processing included radiometric correction, atmospheric correction, orthorectification, and accurate geographic registration. The atmospheric correction for GF-1 WFV was performed using the Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes (FLAASH) software package in ENVI 5.3. For the geographic correction of the GF-1 WFV images, we used one 15-m Landsat-8 OLI panchromatic image as a reference. The correction error for each GF-1 WFV image was kept within 0.5 pixels. Several vegetation indices (VIs), including the NDVI, Enhanced Vegetation Index (EVI), Ratio Vegetation Index (RVI), Difference Vegetation Index (DVI), and Soil-adjusted Vegetation Index (SAVI), were generated from each single-date WFV image (Table 2). Additionally, a principal component analysis (PCA) was conducted on the monthly GF-1 WFV data. The first three principal components (PC1, PC2, and PC3) were selected as predictive variables for our model.
The biased SVM (BSVM) algorithm was implemented in the oneClass package in R statistical software [26]. BSVM is a variant of the binary SVM that uses unlabeled samples as negative training samples [18]. Because the unlabeled class also contains samples from the positive class, two cost terms, Cp and Cu, are used to penalize misclassification and margin errors of the positive and unlabeled samples differently. We chose the radial basis function (RBF) kernel, which requires three parameters to be optimized: the inverse kernel width, γ, and the two penalty parameters, Cp and Cu [18]. In our experiment, we manually set the candidate values of γ, Cp, and Cu as follows: γ ∈ {0.1, 1, ..., 10}, Cu ∈ {0.1, 1.1, ..., 7.1}, and Cp = Cu × {1, 5, 10, 20, 100, 200, 500}. Without negative samples, the F score cannot be calculated as a performance measure. Instead, we chose the commonly used Fpu as the model selection criterion. Detailed descriptions of Fpu can be found in Reference [44]. "Optimal" values of the tuning parameters were selected by a grid search using five repetitions of ten-fold cross-validation on the training dataset. We randomly selected 100 positive samples and 5000 unlabeled samples to train the BSVM. Because BSVM delivers a continuous output value for each classified pixel, a threshold is required to convert the output values into binary classes. We used the threshold that maximizes the sum of sensitivity and specificity (MaxSSS) for this conversion.
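The MaxSSS rule can be sketched as a scan over candidate thresholds. The snippet below is a minimal illustration in plain Python, not the implementation used in the study; the function name `max_sss_threshold`, the convention that a pixel is positive when its score is at or above the threshold, and the toy scores are assumptions.

```python
def max_sss_threshold(scores, labels):
    """Return (threshold, sensitivity + specificity) maximizing the sum.

    scores: continuous classifier outputs, one per sample.
    labels: 1 for positive samples, 0 for negative (or unlabeled) samples.
    A sample is predicted positive when its score is >= the threshold.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative sample")

    best_threshold, best_sum = None, -1.0
    for t in sorted(set(scores)):  # every observed score is a candidate threshold
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        total = tp / n_pos + tn / n_neg  # sensitivity + specificity
        if total > best_sum:
            best_threshold, best_sum = t, total
    return best_threshold, best_sum


# Toy example (made-up scores): the two classes separate perfectly at 0.8.
threshold, sss = max_sss_threshold([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1])
```

When only positive and unlabeled data are available, the unlabeled samples stand in for the negatives, so the resulting specificity is an approximation.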
Maximum entropy (MaxEnt) is best known for modeling potential species distributions based on environmental parameters [45]. It separates the target species from the background by applying a maximum entropy approach, which compares probability densities [28]. As a general-purpose machine learning method, it has proven suitable for solving different one-class classification problems based on RS data in recent years [4,24,25,33]. The MaxEnt algorithm was implemented using the dismo R package [46]. The same training dataset used for BSVM (i.e., 100 positive samples and 5000 unlabeled samples) was used for MaxEnt. We ran MaxEnt with its default parameters because MaxEnt with the default settings has been shown to perform similarly to MaxEnt with "optimal" parameters obtained by a grid search [4,33]. The MaxSSS threshold was used to convert the output into binary classes.
The PUL algorithm transforms a traditional binary classifier into a one-class classifier based on the Bayes principle [47]. Traditional binary classifiers cannot model p(y = 1|x) efficiently because reliable negative samples are absent [48]. To address this problem, Elkan and Noto proposed training a classifier p(s = 1|x) instead [47]. The target p(y = 1|x) can then be obtained from the equation p(y = 1|x) = p(s = 1|x)/c. Here, y = 1 denotes positive pixels and y = 0 denotes negative pixels, while s = 1 denotes labeled pixels, s = 0 denotes unlabeled pixels, and x denotes the covariates associated with a pixel [48]. If a binary classifier is trained on labeled and unlabeled data, the model p(s = 1|x) estimates the probability that a pixel x is labeled [20]. Therefore, we can obtain p(y = 1|x) if the constant c is successfully estimated. In practice, a reliable way to estimate c is to average the predicted probabilities of multiple positive pixels [29]:
$$c = \frac{1}{n} \sum_{x \in V} p(s = 1 \mid x)$$
where V is a subset of the training (or validation) set that contains only the labeled (i.e., positive) pixels, and n is the cardinality of V. It is worth mentioning that PUL is not a specific classifier, but rather a general framework that can be implemented with any classifier able to correctly predict the conditional probability [20]. More details about the PUL algorithm can be found in Reference [48].
Deep learning, as a new branch of machine learning, has been successfully applied in RS classification in recent years [7]. A growing number of studies have reported competitive classification results at moderate spatial resolution with deep learning [7,49]. Therefore, we used a multi-layered feedforward deep neural network (DNN), trained with stochastic gradient descent using back-propagation. We named this PUL method a positive and unlabeled deep neural network (PUDNN).
For the PUDNN, we used the rectified linear unit (ReLU) activation function with a softmax output classification function. To improve stability for ReLU, we constrained the squared sum of incoming weights per unit to ten. We tested two to six hidden layers, with 200 neurons in each hidden layer. We also tested input layer dropout ratios of 0, 0.05, 0.1, and 0.2. The relative tolerance for the metric-based stopping criterion was set to 0.01, and the maximum duty cycle fraction for scoring was set to 0.025. Parameters not mentioned here were left at their default values. The optimal parameter combination for the PUDNN was determined by a grid search with ten-fold cross-validation on the training dataset. Regarding the training data, we found that using 5000 random unlabeled samples resulted in PUDNN models with very high computational costs. Consequently, we used 300 unlabeled samples randomly drawn from the 5000 unlabeled samples, together with the same 100 positive samples used for MaxEnt and BSVM. The constant c can be estimated from the labeled positive data in either the training set or an independent validation set, with both options producing similar results [29]. To avoid reducing the sample size, we calculated c by averaging the predicted values of the positive data in the training set. The MaxSSS threshold was then calculated to convert the outputs into binary results. The PUDNN was implemented in the h2o package.
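The PUL correction described above can be sketched in a few lines: estimate c as the mean predicted probability over the labeled positive pixels, then divide each prediction by c. This is a minimal sketch in plain Python that works on probabilities already produced by some classifier; the function names and the clipping of the result to [0, 1] are assumptions added for illustration, not part of the study's description.

```python
def estimate_c(labeled_positive_probs):
    """Estimate c as the mean of p(s=1|x) over the labeled positive pixels (set V)."""
    if not labeled_positive_probs:
        raise ValueError("V must contain at least one labeled positive pixel")
    return sum(labeled_positive_probs) / len(labeled_positive_probs)


def positive_probability(p_labeled, c):
    """Convert p(s=1|x) into p(y=1|x) = p(s=1|x) / c.

    The result is clipped to [0, 1] (an assumption made here so that the
    output remains a valid probability when p(s=1|x) exceeds c).
    """
    return min(1.0, max(0.0, p_labeled / c))


# Toy example (made-up probabilities from a hypothetical classifier).
c = estimate_c([0.4, 0.6])             # mean over labeled positives
p_low = positive_probability(0.25, c)  # pixel unlikely to be labeled
p_high = positive_probability(0.9, c)  # exceeds c, so it is clipped to 1.0
```

Because c only rescales the scores, it does not change their ranking; it matters when the corrected values are interpreted as probabilities.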

Ensemble Model
The ensemble model was implemented by integrating the three individual OCC algorithms (MaxEnt, BSVM, and PUDNN) with a weighted voting scheme. First, the three OCC algorithms were run to produce their respective outputs. The MaxSSS threshold for each algorithm was then calculated to convert the outputs into binary results. Next, the true skill statistic (TSS) was applied to evaluate the accuracies of the binary results. Here, TSS was calculated from the positive and unlabeled data in the training set according to the following equation:
$$\mathrm{TSS} = \mathrm{sensitivity} + \mathrm{specificity} - 1$$
where sensitivity is the true positive rate (also called recall in other fields) and specificity is the true negative rate. TSS values range from −1 to 1: negative and close-to-zero values indicate models that are no better than randomly generated models, values close to 1 indicate good models, and values above 0.5 are regarded as indicating suitable models [50,51]. The TSS was therefore used as the weight for each classifier. The binary outputs of the individual OCC approaches were then combined using the weighted vote approach:
$$\mathrm{EOCC} = \frac{\sum_{i=1}^{n} \mathrm{TSS}_i \, y_i}{\sum_{i=1}^{n} \mathrm{TSS}_i}$$
where TSS_i and y_i refer to the TSS value and predicted class (i.e., one for the positive class and zero for the negative class) of model i, respectively, and n refers to the number of classifiers. The outputs of the EOCC approach fall in the range of 0 to 1. Output values above 0.5 were classified as the positive class.
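The weighted vote can be sketched as follows. The snippet assumes that the vote is normalized by the sum of the TSS weights, which is one reading consistent with the stated 0 to 1 output range; the function names and the toy weights are illustrative assumptions rather than values from the study.

```python
def tss(sensitivity, specificity):
    """True skill statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def weighted_vote(tss_values, predictions, cutoff=0.5):
    """Combine binary predictions (0/1) of several classifiers for one pixel.

    tss_values: one weight per classifier (assumed positive overall).
    predictions: one 0/1 prediction per classifier, in the same order.
    Returns (ensemble score in [0, 1], final class), positive if score > cutoff.
    """
    if len(tss_values) != len(predictions):
        raise ValueError("one prediction is required per classifier")
    total_weight = sum(tss_values)
    if total_weight <= 0:
        raise ValueError("the sum of TSS weights must be positive")
    score = sum(w * y for w, y in zip(tss_values, predictions)) / total_weight
    return score, int(score > cutoff)


# Toy example (made-up weights for three classifiers).
weights = [0.5, 0.25, 0.25]
score_a, class_a = weighted_vote(weights, [1, 1, 0])  # two classifiers agree
score_b, class_b = weighted_vote(weights, [1, 0, 0])  # only the strongest votes yes
```

With these toy weights, a single classifier holding half of the total weight reaches a score of exactly 0.5, which is not above the cutoff, so the pixel stays in the negative class.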

Standard Supervised Classification Method
To systematically investigate the ability of the proposed EOCC algorithm to identify S. alterniflora, we compared its performance with that of the SVM, a state-of-the-art supervised classifier used extensively in RS analysis [20]. To ensure a comparable accuracy assessment, we trained the SVM with the same 100 positive samples used for the OCC methods, along with an additional 110 true negative samples. The optimal parameters for the SVM were determined by a grid search using five repetitions of ten-fold cross-validation. Two parameters were involved in the training of the classifier: the RBF kernel width γ and the penalty parameter C. The parameter grid for the SVM was defined by γ ∈ {2^−11, 2^−9, ..., 2^3} and C ∈ {2^−3, 2^1, ..., 2^13}. The SVM was implemented with the caret package [52] in R statistical software.

Feature Selection
The random forest recursive feature elimination (RF-RFE) method was used to select the important features for the models. RF-RFE is a wrapper-based feature-ranking algorithm that searches the feature space for an optimal subset by performing optimization algorithms [11]. The algorithm starts with all candidate variables in the model and recursively eliminates one insignificant variable at a time until only one remains as input. In each iteration, the feature with the smallest ranking score is removed. The model is then rebuilt with the retained features, and the model accuracy is recalculated. In this way, the algorithm identifies the optimal subset of discriminatory features [53]. In this study, five repetitions of ten-fold cross-validation were used within the algorithm to obtain more reliable evaluations of model performance during the selection process. We implemented RF-RFE using the caret package in R statistical software [52].
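The elimination loop described above can be sketched generically. The study used random forest importance and cross-validated accuracy through the caret package; the sketch below replaces both with pluggable placeholder functions (`rank_fn` and `score_fn`, both hypothetical names) so that it runs in plain Python, and the toy importances and scores are made up.

```python
def recursive_feature_elimination(features, rank_fn, score_fn):
    """Backward elimination: drop the lowest-ranked feature until one remains.

    rank_fn(subset)  -> dict mapping each feature in subset to an importance.
    score_fn(subset) -> model accuracy for that subset (higher is better).
    Returns (best_subset, best_score, history of (subset, score) pairs).
    """
    current = list(features)
    best_subset, best_score = list(current), score_fn(current)
    history = [(list(current), best_score)]
    while len(current) > 1:
        importance = rank_fn(current)          # re-rank the retained features
        weakest = min(current, key=lambda f: importance[f])
        current.remove(weakest)                # eliminate one feature per iteration
        score = score_fn(current)              # rebuild and re-evaluate the model
        history.append((list(current), score))
        if score >= best_score:                # on ties, prefer the smaller subset
            best_subset, best_score = list(current), score
    return best_subset, best_score, history


# Toy stand-ins (assumptions): fixed importances and an integer "accuracy".
TOY_IMPORTANCE = {"NDVI": 3, "EVI": 2, "noise": 1}


def toy_rank(subset):
    return {f: TOY_IMPORTANCE[f] for f in subset}


def toy_score(subset):
    return 5 + 2 * ("NDVI" in subset) + ("EVI" in subset) - ("noise" in subset)


best, best_score, history = recursive_feature_elimination(
    ["NDVI", "EVI", "noise"], toy_rank, toy_score
)
```

In a real workflow, `rank_fn` would fit a random forest and return its variable importances, and `score_fn` would return a repeated cross-validated accuracy.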

Monthly SSA for S. alterniflora Detection (Scenario 1)
We evaluated the performance of 12 monthly SSAs in identifying S. alterniflora by using four OCC algorithms (i.e., EOCC, MaxEnt, BSVM, and PUDNN), along with a standard supervised classification method (i.e., SVM), based on 12 single-date GF-1 WFV images. In addition to the original four GF-1 WFV bands, we also used five VIs and three PCs from the PCA for each SSA, because the VIs and PCs were found to improve model performance in S. alterniflora detection (see Appendix A Figure A1 for details). Variable selection for each SSA was conducted using the RF-RFE algorithm (see Supplementary Materials Figure S1 for more details). With a complete analysis of 12 months, we investigated the discrimination abilities of the different algorithms, as well as the suitable phenological windows for S. alterniflora detection. The optimal SSA was selected to provide a baseline for comparison with the subsequent NDVI-TSA.

NDVI-TSA for S. alterniflora Detection (Scenario 2)
In practice, a priori information on the crucial detection period for an invasive species is often lacking. To avoid this limitation and broaden the applicability of our proposed scheme, we used a complete monthly NDVI time-series dataset covering 12 consecutive months to conduct the NDVI-TSA. Because S. alterniflora grows in intertidal zones and is often affected by tidal fluctuations, including more multi-temporal images could introduce more interference and thereby decrease model performance [9,54]. Hence, we attempted to compress the original NDVI-TSA by using fewer, but more important, NDVI images and analyzed the corresponding classification results. The RF-RFE algorithm was used to select the best feature subset to construct the optimal NDVI-TSA. We named the optimal NDVI-TSA the phenology-based NDVI-TSA (PB-NDVI-TSA). In addition, in the coastal zones where S. alterniflora occurs, frequent cloud coverage often reduces the number of available images. To address this problem, we also evaluated the performance of a second PB-NDVI-TSA that uses only the top three features. Consequently, there were three NDVI-TSA configurations, each assessed using the four OCC algorithms and the SVM.
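How a per-pixel NDVI time series is assembled can be sketched as follows, using the standard definition NDVI = (NIR − Red)/(NIR + Red) with the GF-1 WFV Red (B3) and NIR (B4) bands. This is a minimal single-pixel sketch in plain Python; the function names, the dictionary layout, and the reflectance values are assumptions for illustration.

```python
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def ndvi(red, nir):
    """NDVI = (NIR - Red) / (NIR + Red); returns 0.0 when both bands are zero."""
    total = nir + red
    return (nir - red) / total if total != 0 else 0.0


def ndvi_time_series(red_by_month, nir_by_month, months=MONTHS):
    """Build the NDVI feature vector of one pixel for the requested months.

    red_by_month / nir_by_month: dicts mapping month name -> surface reflectance.
    """
    return [ndvi(red_by_month[m], nir_by_month[m]) for m in months]


# Toy reflectances for a single pixel (made-up values, constant over the year).
red = {m: 0.25 for m in MONTHS}
nir = {m: 0.75 for m in MONTHS}

full_series = ndvi_time_series(red, nir)                          # 12 features
reduced = ndvi_time_series(red, nir, months=("Nov", "Dec", "May"))  # 3 features
```

Restricting `months` to a selected subset is what reduces the full 12-month series to a compressed, phenology-based feature vector.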

Transferability Analysis of the EOCC Algorithm Combined with PB-NDVI-TSA for S. alterniflora Detection
To verify whether the proposed detection scheme is transferable and robust, we directly transferred the "optimal" model to the second region without new training data input. The second PB-NDVI-TSA in Scenario 2, the PB-NDVI-TSA using the top three NDVIs, was selected for the transferability analysis. It should be noted that we only tested the transferability of the second PB-NDVI-TSA because it is often a challenge to obtain enough cloud-free and high-quality images in a year with one satellite sensor for practical applications in areas like Jiangsu coastal wetlands. Clouds are present frequently from June to October over the whole Jiangsu coast, especially during hot and rainy summers. We aimed to develop a detection scheme that could be easily implemented and extended to other regions. To assess the advantage of phenological time-series, the best SSA was also considered for comparison. Moreover, to further demonstrate the applicability of our proposed detection scheme, we compared its performance with the SVM in the transferability analysis.

Accuracy Assessment
To ensure a comparable accuracy assessment, we used an independent validation dataset to evaluate the performance of the OCC and fully supervised classification approaches in Scenarios 1 and 2. A confusion matrix was constructed from the validation data and the outputs of each classification method. Several accuracy indices, including specificity, sensitivity, OA, Kappa statistic, and TSS, were calculated from the confusion matrix according to the following equations:
$$\mathrm{Sensitivity} = \frac{TP}{TP + FN}$$
$$\mathrm{Specificity} = \frac{TN}{TN + FP}$$
$$\mathrm{OA} = \frac{TP + TN}{N}$$
$$\mathrm{Kappa} = \frac{\mathrm{OA} - P_e}{1 - P_e}, \quad P_e = \frac{(TP + FP)(TP + FN) + (FN + TN)(FP + TN)}{N^2}$$
$$\mathrm{TSS} = \mathrm{Sensitivity} + \mathrm{Specificity} - 1$$
where TP denotes true positives (i.e., the number of S. alterniflora pixels correctly detected), FP denotes false positives (i.e., the number of non-S. alterniflora pixels incorrectly detected as S. alterniflora), TN and FN denote true negatives and false negatives, respectively, N is the size of the dataset, and P_e is the agreement expected by chance. For the YNNR, 343 positive samples and 660 negative samples were used as the independent validation dataset, while 207 positive samples and 301 negative samples were used to evaluate the classification accuracy of the models for the DMNNR.
Table 3 shows the classification accuracies of the presented methods, using 12 variables and the optimal variable subsets selected by the RF-RFE method. Compared with the results obtained with the complete variable set (i.e., 12 variables), no significant improvement was found for any classifier after feature selection. Compared with the other OCC methods, the PUDNN showed more pronounced accuracy changes after using the optimal variable subsets. As shown in Table 4, for each SSA, the three most important variables included at least one PC or VI, demonstrating the importance of PCs and VIs for S. alterniflora detection. The optimal number of variables varied from three to twelve among different months. Notably, the NDVI was selected as an important variable in ten months (all except July and September), suggesting its high importance and potential for S. alterniflora detection. In the senescence stage during November and December, the VIs ranked higher than the other variables.
Table 3. Classification accuracies of the support vector machine (SVM) and one-class classifier (OCC) methods, using twelve variables and the optimal variable subsets selected by the random forest recursive feature elimination (RF-RFE) algorithm in Scenario 1. Values in parentheses are standard deviations. OA, overall accuracy; EOCC, ensemble OCC; MaxEnt, maximum entropy; PUDNN, positive and unlabeled deep neural network; BSVM, biased SVM; TSS, true skill statistic.
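The accuracy indices used in the accuracy assessment (sensitivity, specificity, OA, Kappa, and TSS) can be computed directly from the four cells of a binary confusion matrix. The sketch below uses the standard definitions of these indices in plain Python; the function name `accuracy_indices` and the toy counts are illustrative assumptions and are not results from the study.

```python
def accuracy_indices(tp, fp, tn, fn):
    """Compute binary accuracy indices from confusion-matrix counts."""
    n = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    oa = (tp + tn) / n             # overall accuracy (observed agreement)
    # Agreement expected by chance, from the row and column totals.
    p_e = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (oa - p_e) / (1 - p_e)
    tss = sensitivity + specificity - 1
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "OA": oa,
        "Kappa": kappa,
        "TSS": tss,
    }


# Toy confusion matrix (made-up counts, balanced classes).
indices = accuracy_indices(tp=40, fp=10, tn=40, fn=10)
```

For balanced classes such as this toy case, Kappa and TSS coincide; they diverge when the class proportions are unequal, which is one reason for reporting both.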

Accuracy
As expected, the performance of the five methods (the four OCCs and the SVM) for S. alterniflora detection depended strongly on the phenological window of the GF-1 WFV image. The best performance appeared in November, when the classification accuracy was promising, with a high average Kappa of 0.812 and an average OA of 0.918. August showed the worst result, with an average Kappa and OA of 0.377 and 0.714, respectively. Satisfactory classification results were obtained from the images acquired in November, December, April, and May (Figure 2). The mean Kappa and OA for these months were greater than the average levels of 0.658 and 0.848, respectively. By comparison, in the dormancy period during January and February, the rapid growth season during July and August, and the early flowering stage during September, the classification performance of the OCC and SVM algorithms was much poorer than in the senescence and green-up stages. The mean OA and Kappa for these months were below the average levels.
Figure 3 shows the changes in Kappa and OA with an increasing number of variables (in decreasing order of importance) using the RF-RFE algorithm on the NDVI-TSA (12 NDVIs). The highest classification accuracy for identifying S. alterniflora was achieved when the six least important NDVI variables were excluded (Figure 3). The NDVIs from November, December, May, June, April, and January were selected as the best feature subset. As expected, all these months showed relatively high performance in S. alterniflora recognition, indicating that the NDVI-TSA with the RF-RFE algorithm is suitable for selecting the crucial phenological stages for S. alterniflora detection. Among the six selected NDVIs, those for November, December, and May were ranked as the top three predictors, further confirming these months as important phenological windows for S. alterniflora recognition. Moreover, the Kappa and OA values showed no significant change after the inclusion of the first three variables. Therefore, we also investigated the performance of the EOCC algorithm using only these three NDVI variables in the subsequent exploration.
Figure 3. Variability in overall accuracy and Kappa for NDVI time-series analysis (TSA) when using a different number of NDVI variables. Whiskers indicate the standard deviation of five repetitions of 10-fold cross-validation. The green box indicates the best accuracy of the model, using the selected optimal feature subset. The best accuracies for NDVI-TSA were produced when six variables were considered. The order of these variables is NDVI_Nov, NDVI_Dec, NDVI_May, NDVI_Jun, NDVI_Apr, and NDVI_Jan.
Table 5 shows the classification accuracies of the OCC classifiers and the fully supervised classifier in Scenarios 1 and 2. In general, all OCC algorithms performed well and similarly in S. alterniflora detection. Despite returning the highest mean sensitivity value of all individual OCC methods, MaxEnt had the lowest mean specificity value. BSVM had the highest mean OA, Kappa, TSS, and sensitivity values; however, it also yielded the lowest specificity value of all individual OCC classifiers.
Compared with the single OCC algorithms, the EOCC produced a very similar mean sensitivity value but a slightly higher mean specificity value; it also had a slightly higher mean OA of 86.79%, mean Kappa of 0.698, and mean TSS of 0.676, indicating that the EOCC generated slightly better and more balanced results. Moreover, the EOCC obtained the classification scores closest to those of the fully supervised classification method. Although the EOCC's mean sensitivity and TSS values were lower than those of the SVM, its mean OA, Kappa, and specificity values were higher. The EOCC also had a slightly lower standard deviation of the accuracies than the SVM in Scenarios 1 and 2.

Performance of the EOCC Algorithm in NDVI-TSA for S. alterniflora Detection
As the baseline for benchmarking the NDVI-TSA, the EOCC algorithm in the optimal time window (i.e., November) yielded an OA, a Kappa, and a TSS of 91.92%, 0.813, and 0.743, respectively (Table 6). Compared with the best SSA, the NDVI-TSA achieved slightly lower OA and Kappa values when it used all NDVI variables (i.e., 12 NDVIs). However, it produced a more balanced sensitivity-specificity result. After feature selection, the classification accuracies of the NDVI-TSA improved and exceeded those of the best SSA. The PB-NDVI-TSA using the six selected variables produced the best results, with the highest OA, Kappa, TSS, and sensitivity values. Although the EOCC in the best SSA yielded the highest specificity value, it produced the lowest sensitivity value. The sensitivity was significantly improved in the two phenological NDVI-TSA, which indicates that the PB-NDVI-TSA can greatly reduce the omission rate in S. alterniflora detection. Compared with the optimal PB-NDVI-TSA, the second PB-NDVI-TSA, which uses the top three variables, produced slightly lower classification accuracy. However, it still outperformed the NDVI-TSA using 12 NDVIs and the best SSA. These results indicate that phenological NDVI time-series analysis can improve the performance of the EOCC algorithm in S. alterniflora detection.
Table 6. Classification accuracies of the two scenarios, using the EOCC algorithm. The best result of each accuracy metric is shown in bold.
Based on visual inspection, the classification maps generated using the EOCC algorithm in both the best SSA and the NDVI-TSA were satisfactory. These maps matched the land-cover map well. However, in the best SSA map, a few false negative pixels can easily be found inside S. alterniflora patches (Figure 4b). In the ecotone between the invasive plant and native salt marsh plants, where S. alterniflora has relatively low biomass and coverage density, S. alterniflora was underestimated by the best SSA during the senescence stage.
The best SSA underestimated more S. alterniflora patches than the NDVI-TSA. The false negative pixels were significantly reduced in the maps from three NDVI-TSA models (Figure 4c-e). Compared to the NDVI-TSA using 12 NDVIs, the prediction of S. alterniflora was improved in the two PB-NDVI-TSA models (Figure 4c,d). Moreover, the false positive pixels located in the culture pond and farmland were also notably reduced in the two PB-NDVI-TSA models.

Transferability Analysis of the EOCC Algorithm in the PB-NDVI-TSA
Since the desired S. alterniflora detection result was obtained by our scheme using the EOCC algorithm in the second PB-NDVI-TSA, we applied this scheme to a different area without new training data to verify its applicability and transferability. To demonstrate the usefulness of the phenological NDVI-TSA, we also compared its performance with the optimal SSA in the transferability analysis. As shown in Table 7, by using the EOCC algorithm, the classification accuracies for both SSA and PB-NDVI-TSA were promising, with high Kappa (0.789 in the SSA, and 0.815 in the PB-NDVI-TSA), OA (89.57% in the SSA, and 90.94% in the PB-NDVI-TSA), and TSS (0.809 in the SSA, and 0.825 in the PB-NDVI-TSA) values. However, the performances of the SVM in the two schemes (i.e., the best SSA and the PB-NDVI-TSA) were poorer and varied greatly, which indicates that the EOCC has a stronger and more stable performance in the two schemes. Although the EOCC yielded slightly lower sensitivity values in both schemes, it produced significantly higher specificity values, which demonstrates that the EOCC also has a more balanced performance than the SVM in the transferability analysis. Compared to the results in the best SSA, both the SVM and the EOCC produced more balanced specificity-sensitivity accuracies in the PB-NDVI-TSA.
Based on visual examination, we found that the SVM in the SSA and PB-NDVI-TSA generally produced far more false positive pixels than the EOCC did (Figure 5). Overestimation of S. alterniflora was a serious problem for the SVM. As expected, the classification results for both the SVM and the EOCC in the PB-NDVI-TSA were slightly improved compared to the SSA results, with fewer false positive pixels. The EOCC in the PB-NDVI-TSA generated the best classification map, which was highly consistent with the manually delineated classification map (Figure 5d). Surprisingly, only a few misclassification errors were found in the best classification map, and most of the tidal channels within S. alterniflora patches were excluded (Figure 5e).

Ensemble Analysis for OCC Methods
Depending on the input training data, OCC methods can be divided into P-classifiers (trained only with positive (P) data) and PU-classifiers (trained with positive data and, additionally, unlabeled (U) data). Prominent examples of the first category are OCSVM and SVDD, while MaxEnt, BSVM, and PUL belong to the second group. Benefiting from the additional information in unlabeled data, a PU-classifier usually yields higher classification accuracy than a P-classifier [18]. We therefore considered only PU-classifiers for the ensemble analysis in our study. Our results imply that all the individual PU-classifiers (i.e., the MaxEnt, BSVM, and PUL) perform well, returning OA, Kappa, and TSS values similar to those of the standard supervised SVM. However, their sensitivity values were lower than that of the SVM. Overall, the SVM produced more balanced and slightly better accuracy than the individual OCC algorithms. For the EOCC method, most accuracy metrics improved slightly compared to the single OCCs, yielding a classification result very similar to that of the SVM. Although maps of the distribution of S. alterniflora created with traditional supervised classification methods are promising, the associated time and costs reduce their viability, especially for application in large areas and across multiple years. By comparison, the EOCC method uses positive-only data and performs well in recognizing S. alterniflora, greatly reducing the requirement for training data collection. In light of its strong performance, low cost, manageable labor intensity, and convenient operation, we offer the EOCC algorithm as a viable alternative to traditional supervised classification methods for invasive plant detection.
To integrate the outputs from different PU-classifiers, we used a weighted vote scheme, which has been shown to outperform simple majority voting and averaging in ensemble OCC applications [20]. Our results confirmed that higher accuracies can be achieved using the weighted vote approach (Appendix A Table A1). Although the EOCC method did not always deliver the best result of all the OCC methods in all scenarios, it did produce slightly more stable and balanced results. For example, the BSVM yielded the best overall classification accuracy among the individual OCC methods, yet it also produced the lowest OA, Kappa, and TSS values in December for Scenario 1 and in NDVI12 for Scenario 2 (Appendix A Table A2). Such discrepancies did not occur in the EOCC. This indicates that our ensemble scheme may reduce the predictive uncertainties of single OCC methods and improve classification accuracies. However, compared to the individual OCCs, the EOCC showed only a limited improvement in the accuracy metrics. To further improve the detection performance of the ensemble analysis, one feasible way is to use other combination approaches (e.g., Bayesian average and fuzzy integral) that have shown acceptable performance in multi-class classification problems [55]. In addition, compared to the supervised SVM classifier, all the OCCs returned higher specificity values but lower sensitivity values in Scenarios 1 and 2. As we focus on detecting a potentially invasive species, it may make more sense to minimize the number of false negatives (pixels that are actually S. alterniflora but are classified as negatives). This would ensure that most individuals of the invasive species are detected and that potential countermeasures against further spreading can be efficiently implemented [30]. Therefore, one potential improvement for the EOCC might be to adjust the classification thresholds for each individual OCC to minimize the number of false negatives.
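The interplay between a weighted vote and per-classifier thresholds can be illustrated with a minimal sketch. The weights, probabilities, thresholds, and function name below are hypothetical, and the weighting rule is only one plausible form of a weighted vote; it is not necessarily the exact scheme used in this study.

```python
def weighted_vote(probabilities, weights, thresholds, decision_level=0.5):
    """Combine per-classifier outputs for one pixel with a weighted vote.

    probabilities: predicted target-class probability from each classifier.
    weights: non-negative weight of each classifier (e.g., derived from accuracy).
    thresholds: probability cut-off applied to each classifier before voting.
    Returns (weighted share of positive votes, final binary label).
    """
    votes = [p >= t for p, t in zip(probabilities, thresholds)]
    score = sum(w * v for w, v in zip(weights, votes)) / sum(weights)
    return score, score >= decision_level


# Hypothetical pixel in an ecotone: only one of three classifiers is confident.
probs = [0.55, 0.45, 0.40]
weights = [0.40, 0.35, 0.25]

# Default thresholds: the pixel is rejected (a potential false negative).
score_default, label_default = weighted_vote(probs, weights, [0.5, 0.5, 0.5])

# Lowered thresholds: all three classifiers vote positive and the pixel is kept.
score_lowered, label_lowered = weighted_vote(probs, weights, [0.4, 0.4, 0.4])
```

Lowering the thresholds trades specificity for sensitivity, so in practice the cut-offs would need to be tuned against the acceptable commission rate.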

Advantages of Phenological NDVI Time-Series for Mapping Invasive Plant S. alterniflora
Numerous studies have demonstrated that vegetation indices, particularly the commonly used NDVI, can help improve invasive species detection accuracy [6,9,11,12,24]. Our results confirmed that NDVI is of high importance for separating S. alterniflora from native P. australis and S. salsa. NDVI was deemed a crucial variable for most SSAs in our study. A recent study found that a monthly NDVI time-series can achieve better classification results than the multi-temporal imagery composite and the best SSA for the classification of three salt marsh plants (S. alterniflora, P. australis, and Scirpus mariqueter) [8]. In our study, compared to the best SSA, the three NDVI-TSAs showed better balanced sensitivity-specificity results, indicating that NDVI-TSAs are more efficient for detecting the invasive plant with a lower omission rate. Due to the phenological lag of S. alterniflora, most native salt marsh plants in the study area are completely withered or dormant by late November, while S. alterniflora can still maintain many green vegetative parts during this senescence stage [4]. This phenological difference makes S. alterniflora easily distinguishable from the native plants [8]. However, some mixed pixels of S. alterniflora are extremely difficult to distinguish from mudflats [7]. This explains why we obtained very high specificity but relatively low sensitivity values in the best SSA in the YNNR. Based on the visual inspection, we found that many pixels of S. alterniflora located in the high tide area closest to the sea were falsely classified as the negative class. In the green period, it becomes easy to distinguish S. alterniflora from the mudflats [56]. Additionally, in late April and May, most S. alterniflora plants are just starting to germinate or grow with low biomass, while the co-dominant native marsh plants (e.g., P. australis) have already been growing for approximately one month [8,9], so S. alterniflora shows spectral characteristics different from those of the native plants. However, in the ecotone of S. alterniflora and S. salsa, we found many false positive pixels (pixels that are actually S. salsa but are classified as S. alterniflora) during the visual inspection. Our field surveys showed that the density of S. salsa was much lower than that of S. alterniflora in the ecotone. The scattered distribution makes it difficult for the corresponding pixels (16 m × 16 m) to represent the true spectral response of S. salsa because of the effects of both the canopy and the underlying soil [9]. Although S. salsa has been growing for approximately one month by late May, the mixed-pixel phenomenon leads to lower NDVI values. This phenomenon also results in S. alterniflora and S. salsa showing similar spectral traits in the ecotone. Therefore, the SSA method may be inadequate to realize the full discriminative potential. The NDVI-TSA, which combines the green and senescence periods, can make full use of the different phenological characteristics, thus improving the spectral separability in S. alterniflora detection.
However, although the NDVI-TSA using 12 NDVIs showed a slightly better specificity-sensitivity balance, it did not show much advantage in terms of Kappa, OA, and TSS values compared to the best SSA. Due to the differences in illumination and atmospheric conditions across different image acquisition dates, unclear observations (e.g., clouds, cloud shadows, and inherent noise) may increase as more images are included in the NDVI time-series imagery [9]. Even though each image was rigorously filtered (e.g., cloud cover less than 10%) and preprocessed (e.g., atmospheric correction and cloud masking), these unclear observations inevitably exist in the dataset and may influence the classification accuracy. The phenomenon of species succession also hinders accurate S. alterniflora detection. The boundaries and ecotones between S. alterniflora and native salt marsh plants are areas of high misclassification probability [4,9]. Therefore, it is critically important to pick out a few images that best reflect the differences between S. alterniflora and native plants and thereby reduce the span of the time-series, which helps improve the accuracy in unstable regions [9]. The PB-NDVI-TSA method could reduce the aforementioned uncertainties, as fewer but more important images are considered. Compared to the complete NDVI-TSA (i.e., models using 12 NDVIs), our results show that the classification accuracies are improved by using the PB-NDVI-TSA methods. More importantly, our second PB-NDVI-TSA model, which includes NDVI images from only three key months (May, November, and December), shows only a very small reduction in classification accuracy and still yielded higher and more balanced accuracy than any SSA and the complete NDVI-TSA.
We concluded that an improved mapping result can be achieved during May, November, and December. Coincidentally, the NDVI of November, December, and May were selected as the top three variables for the NDVI-TSA with an RF-RFE algorithm. This result suggests that the NDVI-TSA can be effective in identifying critical phenological events using the RF-RFE algorithm, regardless of the availability of a priori phenological information on S. alterniflora. Moreover, our results also demonstrate the strong performance of the classification algorithms in NDVI-TSA. Hence, if a priori information on the phenological characteristics of a particular invasive plant is limited, the quantification of intra-annual phenology from NDVI time-series may offer a detailed perspective on seasonal variability of the plant's phenology [13].
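As an illustration of how a phenology-based NDVI feature vector might be assembled for a single pixel, the sketch below computes NDVI with its standard formula, (NIR - Red) / (NIR + Red), and keeps only the three key months named above. The reflectance values, month keys, and function names are hypothetical, and the sketch leaves out preprocessing (atmospheric correction, cloud masking) and the RF-RFE ranking itself.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    denominator = nir + red
    if denominator == 0:
        return 0.0  # avoid division by zero for no-data pixels
    return (nir - red) / denominator


def phenology_features(monthly_bands, key_months=("may", "nov", "dec")):
    """Build the reduced NDVI time-series for one pixel.

    monthly_bands: dict mapping a month name to a (nir, red) reflectance pair.
    key_months: months retained by the feature selection step.
    """
    return [ndvi(*monthly_bands[month]) for month in key_months]


# Hypothetical reflectance values for one pixel (illustration only).
pixel = {
    "may": (0.30, 0.20),
    "jul": (0.55, 0.08),
    "nov": (0.45, 0.15),
    "dec": (0.35, 0.15),
}
features = phenology_features(pixel)
```

The resulting three-element vector is what a classifier would receive in place of the full 12-month series; months outside `key_months` (here, July) are simply ignored.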

Transferability of the EOCC Combined with PB-NDVI-TSA
For operational applications, an approach must deliver comparable results that extend beyond a specific area [57]. However, except for the PUBNN, all the individual OCCs and the SVM performed relatively poorly in the DMNNR (Appendix A Table A3). Although the PUBNN generally performed well in the transferability analysis, it also produced the worst results in the YNNR. The EOCC method performed consistently well and showed robust transferability and stability at a second test site, without a second training. Compared to the individual OCCs, although the EOCC showed no significant improvement in classification accuracy in the YNNR, it achieved a remarkable improvement in the DMNNR. This is in agreement with previous research indicating that ensemble models outperform individual models, providing more robust and stable predictions in transferability analysis [34,57]. Our results also showed that the EOCC outperforms the SVM when transferred to a second area. A recent study demonstrated that an OCC method outperforms a fully supervised method when the training set is incomplete (i.e., one of the non-target land types is not sampled) [29]. OCC methods are less likely to suffer from the problem of incomplete training data because only the target class needs to be labeled [29]. Compared to the SVM, which used only 110 negative samples, the EOCC, which used 5000 random unlabeled samples, is more likely to sample all land-cover types. Therefore, the SVM might be more susceptible to incomplete datasets, especially in extended areas without new reference data.
Compared to the best SSA, our results show that improved performance for both the SVM and the EOCC can be achieved by combining them with the PB-TSA method. A large number of studies have indicated that the extended observations offered by phenological time-series enable a classifier to show stronger detection performance than the SSA methods in a certain area [8,9,11,13]. However, there is a lack of studies assessing their performance when applied in different geographic spaces. Our results provide some evidence that phenological time-series-based methods could exhibit more stable and stronger transferability than the best SSA method across different geographic regions. Since the timing of leaf senescence for the invasive species varies over space and time, the performance of the best SSA method may vary widely among geographic regions. Moreover, it will be difficult to repeatedly determine the optimal phenological stage for detecting S. alterniflora across locations without expert knowledge. Hence, the best SSA method may lack the generalization capability to conduct long-term regional-scale monitoring of S. alterniflora [11].
The promising results indicate our detection scheme's robustness and potential for larger-scale applications, which are crucially needed for S. alterniflora study and management. However, we only transferred the detection scheme to the DMNNR, which has a similar species composition to the YNNR. Therefore, to further verify the stability and generalizability of the proposed detection scheme, more tests need to be conducted across a broader range of regions in future studies. Although our proposed method is designed for S. alterniflora detection, it should also be beneficial for other invasive plants. For example, P. australis is deemed an invasive species and is widely distributed in New England, North America, and other countries [9]. Its extensive and rapid expansion threatens the native Spartina spp. Thus, it will be interesting to apply our proposed method to identify P. australis if this invasive species displays phenological differences from Spartina spp. in these regions.

Methodological Considerations
A large number of studies have demonstrated that PCs and VIs can increase the potential to discriminate between vegetation species [4]. Our results confirmed that the PCs and VIs are useful for S. alterniflora detection. However, more input variables bring more noise in the datasets and more correlations between variables, which may reduce model stability and interpretability [58]. Therefore, it is critical to find a parsimonious feature subset that balances prediction accuracy and model interpretability [59]. As one of the most popular variable selection approaches, the RF-RFE method was used to find optimal variable subsets in our study. Although no significant improvement in accuracy was found after using the RF-RFE method, smaller feature subsets reduced the computational demand. The best accuracy was achieved in November, which was much higher than in the other SSAs. Since only three variables were retained for the best SSA after feature selection using RF-RFE, we did not further evaluate the impact of multicollinearity on model performance. Studies have pointed out that indices developed from remote sensing datasets often show collinearity and may contribute to overprediction [13]. We found that different VIs in most SSAs showed high collinearity (r > 0.85), which may affect the model performance. Therefore, addressing collinearity to formulate an optimal feature subset can be critical in future studies.
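One simple way to address the collinearity noted above is a greedy correlation filter. The sketch below walks through variables in order of importance and keeps a variable only if its absolute Pearson correlation with every variable already kept does not exceed 0.85, the value reported above. The variable names, sample values, and importance ranking are hypothetical; this filter was not part of our workflow and is shown only as a possible follow-up step.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant variable carries no correlation information
    return cov / (sd_x * sd_y)


def drop_collinear(features, ranked_names, r_max=0.85):
    """Keep variables, in ranked order, that are not collinear with those already kept."""
    kept = []
    for name in ranked_names:
        if all(abs(pearson_r(features[name], features[k])) <= r_max for k in kept):
            kept.append(name)
    return kept


# Hypothetical per-sample values for two vegetation indices and one PC.
samples = {
    "vi_a": [0.10, 0.30, 0.50, 0.70, 0.60],
    "vi_b": [0.12, 0.28, 0.52, 0.69, 0.61],  # nearly identical to vi_a
    "pc1": [5.0, 1.0, 4.0, 2.0, 3.0],
}
selected = drop_collinear(samples, ranked_names=["vi_a", "vi_b", "pc1"])
```

Ranking the variables by importance first (for example, with RF-RFE) ensures that, within each collinear group, the most informative variable is the one retained.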
To ensure a comparable accuracy assessment, we used 1511 independent positive-negative samples to validate the models' results. These samples were not used during model training or decision-making (determination of parameters and thresholds). This means that the number of validation samples could be reduced to lower the cost. In an operational setting, when validation samples are limited, OCC results with only a few observations of the species of interest might also serve as an initial map product for directing the fieldwork. The results could be used to locate areas with a high occurrence probability of the species and hence increase the efficiency of subsequent field campaigns to collect presence data [30]. Our field vegetation plots were sampled in July 2014 and September 2015. The time mismatch between the two years' samples may affect the discrimination results. To reduce this uncertainty, we used GE imagery to inspect the samples and removed the 2014 samples that were collected in the ecotone between vegetation types and in the high tide zone. Unlike the functional properties of vegetation (e.g., biomass and nitrogen content), the structural composition of vegetation usually evolves more slowly (over ±2 years) [60]. Thus, the reference samples are reliable. To ensure a well-stratified reference dataset and reduce the influence of the time mismatch between field sampling and RS data, we supplemented the samples with GE imagery. As a freely available data source, it offers advantages for discerning small, newly colonizing S. alterniflora clumps [3]. GE imagery with sufficient spatial information could be an appropriate tool for the collection of training samples [61,62]. However, the classification accuracy of time-series models is highly correlated with the stability of the plants during the study period. Annual variation of the species niche can impede the accurate mapping of salt marsh plants, especially in the ecotones and boundaries of different salt marsh communities, which experience rapid vegetation succession [9]. Therefore, in future studies, it is necessary to collect high-quality reference data that cover several months for these sensitive regions.
The coastal zones of Jiangsu where S. alterniflora has established are inevitably affected by significant cloud cover [8], which makes it difficult to acquire sufficient monthly cloud-free images over a year from a single satellite sensor. Hence, we attempted to reduce the temporal dimension of the NDVI time-series dataset by using NDVIs from key phenological stages. The results indicated that the PB-NDVI-TSA with only three monthly NDVIs can obtain very high accuracy for S. alterniflora detection. Thus, pinpointing several months that capture the key phenological differences between plants could be a viable way to address the limitations on RS data acquisition [4,8]. However, when constructing a well-refined phenological time-series model, data gaps in crucial phenological periods are another major challenge, especially for long-term mapping tasks. With the availability of more satellite data, the combination of multi-source data, such as the ESA's Sentinel-2A and B constellation, Landsat, and the recently available GaoFen 6-WFV, could be an alternative way to resolve this data gap [4].

Conclusions
In this study, we developed a novel detection scheme that combines an EOCC algorithm with a phenological NDVI-TSA method to detect the invasive species S. alterniflora based on GF-1 WFV monthly time-series data in the core zones of the YNNR and DMNNR. We tested the performance of the EOCC in two scenarios, namely SSA and NDVI-TSA. Within the proposed scenarios, the crucial phenological stages and the advantage of NDVI-TSA in S. alterniflora detection were also investigated. To further demonstrate the identification ability of the EOCC algorithm, we used a standard supervised classifier, the SVM, as a baseline for comparison. Results showed that the EOCC produced slightly more stable accuracies than the individual OCC classifiers and yielded similar accuracies to the SVM in the YNNR. For the EOCC in SSA, the best classification result was achieved in November. Compared to the best SSA, the PB-NDVI-TSA produced a slightly higher classification accuracy, with the Kappa and OA improving from 0.813 and 91.92% to 0.841 and 92.92%, respectively. The satisfactory result could also be transferred to the DMNNR area, where the accuracy of the S. alterniflora detection remained promising, with an OA of 90.94% and a Kappa of 0.815. Moreover, our proposed detection scheme combining the EOCC algorithm with PB-TSA was more robust and applicable than other combinations (i.e., EOCC with the best SSA, SVM with PB-NDVI-TSA, or the best SSA) in the transferability analysis. This highlights that our proposed detection scheme is promising for S. alterniflora mapping over extended areas. We hope this study will provide a potential alternative to traditional methods for invasive-plant detection.
Table A1. Mean and standard deviation of overall accuracy, Kappa coefficient, sensitivity, and specificity of the two ensemble methods (i.e., weighted vote and majority vote) in Scenarios 1 and 2. Values in parentheses are standard deviations. The best result of each accuracy metric is shown in bold.