Identifying Coastal Wetlands Changes Using a High-Resolution Optical Images Feature Hierarchical Selection Method

Abstract: Coastal wetlands are dynamic and fragile ecosystems in which complex changes have taken place. Because they are affected by environmental changes and human activities, regular monitoring of coastal wetland changes is of great practical significance. High-resolution optical data can capture changes in coastal wetlands; however, the impact of different optical features on the identification of these changes is not clear. In addition, combining many features can cause the “dimension disaster” problem, and only small amounts of training samples are accessible at either the pre- or the post-changed time. To solve these problems, a feature hierarchical selection method is proposed that takes into account the jumping degree of different image features, and the influence of different optical features on wetland classification is analyzed. A training sample transfer learning strategy was also designed for wetland classification, and the classification results at the pre- and post-changed times were compared to identify the “from-to” coastal wetland changes. The southeastern coastal wetlands in Jiangsu Province were used as the study area, and ZY-3 images from 2013 and 2018 were used to verify the proposed methods. The results show that the feature hierarchical selection method can provide a quantitative reference for selecting the optimal feature subset. When the training sample transfer learning strategy was used to classify the post-changed optical data, the overall accuracy obtained with the transferred training samples was 91.16%, which meets the accuracy requirements for change identification. In the study area, salt marsh increased mainly from the sea area, because salt marshes expand rapidly throughout coastal areas, and aquaculture ponds increased from the sea area and salt marshes, because of the considerable economic benefits of the aquaculture industry.


Introduction
Coastal wetlands, which are located in the interactive zone between terrestrial and aquatic ecosystems, are dynamic and fragile ecosystems [1]. They play important roles in water conservation, regional climate regulation, flood control, and biodiversity protection, among others [2,3]. Over the past century, coastal wetlands have experienced obvious degradation or even disappearance as a result of environmental changes and human activities, such as climate change, urban expansion, livestock grazing, and agricultural development [4]. Change detection (CD) is an appropriate way to observe wetland changes; accurate and timely CD is therefore fundamental to reliable wetland management and successful wetland protection.
Optical images [5-7] and synthetic aperture radar (SAR) images [8-10] have been effectively utilized for dynamic monitoring of wetlands. Spectral-enhanced features [11], textural features [12], spectral-spatial features [13], spectral-textural features [14], and spectral-spatial-textural features [15,16] have been utilized for change detection; however, there is inevitably redundancy among these features. If all of them are input into the image classifier, it is easy to cause the "dimension disaster" problem, and computation becomes slow. Moreover, adding some features does not improve image classification or change detection. Therefore, it is necessary to select features in order to obtain optimal feature subsets.
Feature selection methods, which are divided into filter methods and wrapper methods [17], select optimal feature subsets from the available image features. Recently, principal component analysis [18], random forest-based methods [19], genetic particle swarm optimization [17], sequential forward selection [20], ensemble learning [21], and an adaptive feature selection network [22] have been proposed for feature selection. Random forest is one of the wrapper methods, and it removes irrelevant features one by one in order to obtain optimal feature subsets. When there are hundreds or thousands of features, removing irrelevant features one by one is time-consuming; it is therefore necessary to improve the efficiency of feature selection using random forest.
In addition, because of the large scale and complex geographical conditions of coastal wetlands [23], collecting training samples is time-consuming and laborious. According to the availability of training samples, CD methods can be divided into two categories: supervised CD and unsupervised CD. According to the level of detail required for the changed regions, CD can also be divided into binary CD and multiple CD. Binary CD only separates changed areas from unchanged areas, whereas multiple CD not only extracts the changed areas but also obtains the "from-to" change information. A number of binary CD methods have been proposed, such as change vector analysis [24], clustering [25], thresholding [26], and deep learning [27]. The development of binary CD methods is summarized in references [28-31].
Compared with binary CD, multiple CD is more challenging. Some approaches have been developed, such as post-classification comparison [32], CD in the change vector analysis polar domain [24], and slow feature analysis [33]. However, these methods were developed on the assumption that there are either some or no ground reference samples at both the pre- and post-changed times. In practice, only small amounts of ground reference data are available, and at only one of the pre- and post-changed times [34]; how to utilize limited ground reference samples for multiple CD is therefore another important consideration.
The contribution of this article is a random forest-based feature hierarchical selection method for obtaining optimal feature subsets, which can then be utilized for coastal wetland classification. In addition, a simple training sample transfer learning strategy was designed for identifying coastal wetland changes when small amounts of ground training samples are available only at the pre-changed time. Feature optimization can improve the accuracy of coastal wetland classification, and sample transfer learning is better suited to the actual monitoring of changes in wetlands.
The rest of this article is organized as follows. The study area and data are described in Section 2, the proposed feature hierarchical selection and the change identification using sample transfer learning are described in Section 3, the experimental results are reported in Section 4, and the conclusions are summarized in Section 5.

Study Area
The study area is a typical coastal wetland located in Dongtai City, in the southeastern part of Jiangsu Province, China (Figure 1). The wetland types in the region are sea, open water, farmland, aquaculture pond, salt marsh, and building.
In the study area, the change patterns consisted of Se-SM (the change from sea to salt marsh marked in rectangle A), Se-AP (the change from sea to aquaculture pond marked in rectangle B), FL-AP (the change from farmland to aquaculture pond marked in rectangle C), SM-FL (the change from salt marsh to farmland marked in rectangle D) and SM-AP (the change from salt marsh to aquaculture pond marked in rectangle E).

Dataset
High spatial-resolution Ziyuan-3 (ZY-3) images from 2013 and 2018 were used as experimental data; the acquisition dates and spatial resolutions are listed in Table 1. Image preprocessing included radiometric correction, image registration, and image cropping using ENVI 5.6 software.

[...] express the morphological characteristics of land cover, so the opening and closing of both MPs and DMPs were used to extract the morphological-based features. Five different scales (1-5) were chosen to describe the morphological characteristics from fine to coarse. Eighty morphological-based features were extracted from the four spectral bands.
(d) Transform-based features were extracted using the non-subsampled shearlet transform (NSST). NSST decomposition was used to obtain high-frequency and low-frequency sub-bands, and NSST reconstruction was used to reconstruct the image features. To describe these features from coarse to fine, NSST reconstruction was applied at three different scales, giving twelve transform-based features for the four spectral bands.
(e) The Sobel operator was used to obtain four edge-based features for the four spectral bands.
(f) Ten vegetation indexes (NDVI, NDWI, GR, DVI, RVI, SAVI, OSAVI, MSAVI, PVI, and EVI) were also extracted from the four spectral bands; the formulas of these vegetation indexes are listed in Table 2.
In total, 303 features were extracted from the ZY-3 images, as shown in Table 3.
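As an illustration of the vegetation-index features, the sketch below computes five of the ten listed indexes for a single pixel. The formulas are the widely used textbook definitions and are an assumption here; they are not copied from Table 2, which may use other variants (for example, a different NDWI form). The function name, the `soil_factor` default of 0.5, and the sample reflectances are hypothetical.

```python
from typing import Dict


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator != 0 else 0.0


def vegetation_indexes(green: float, red: float, nir: float,
                       soil_factor: float = 0.5) -> Dict[str, float]:
    """Compute five common indexes for one pixel from band reflectances.

    Assumed textbook definitions, not the exact formulas of Table 2.
    """
    return {
        "NDVI": _ratio(nir - red, nir + red),
        "NDWI": _ratio(green - nir, green + nir),  # McFeeters form (assumed)
        "DVI": nir - red,
        "RVI": _ratio(nir, red),
        "SAVI": (1 + soil_factor) * _ratio(nir - red, nir + red + soil_factor),
    }


# Hypothetical reflectances of a vegetated pixel.
pixel = vegetation_indexes(green=0.08, red=0.10, nir=0.50)
```

For a vegetated pixel such as this one, NDVI is positive and NDWI is negative; for open water the signs would typically be reversed, which is why the two indexes complement each other as classification features.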

Feature Hierarchical Selection
The variable importance (VI) of the image features was calculated by the random forest algorithm, as shown in Equation (1), where $E_{OOB1}$ is the out-of-bag (OOB) error of the decision trees, $E_{OOB2}$ is the OOB error when the image feature $F_i$ is replaced, and $N_{tree}$ is the number of decision trees. If the OOB error changes markedly after the image feature $F_i$ is replaced, $F_i$ is an important feature. Assume that the data sequence is $\{X_1, X_2, X_3, \cdots, X_n\}$, the statistic that obeys the overall distribution is $F(x, \theta)$, the number of data is $n$, and the expectation of the data sequence is $u$. Then $u_k$ is the point estimate that depends only on the expectation $u$, as shown in Equation (2), in which $x_i$ is the $i$th statistic value and $x_k$ is the $k$th statistic value.
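Equation (1) is not reproduced in the text above. One common permutation-based form that is consistent with the stated definitions of the two OOB errors and the number of trees is sketched below. It is an assumption: the tree index $j$ and the per-tree superscripts are notation introduced here, and the original equation may differ in its normalization.

```latex
VI(F_i) = \frac{1}{N_{tree}} \sum_{j=1}^{N_{tree}} \left( E_{OOB2}^{(j)} - E_{OOB1}^{(j)} \right)
```

Under this form, a feature whose replacement barely changes the OOB error receives a VI value close to zero, which matches the interpretation given in the text.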
In this paper, the jumping degree $t_k$ was designed for feature hierarchical selection. First, all the extracted image features were sorted in ascending order of their VI values. Second, the jumping degree of each image feature was calculated, as shown in Equations (3) and (4), where $t_k$ is the jumping degree at point $k$, $u_k$ is the point estimate at point $k$, $u_{k+1}$ is the point estimate at point $(k+1)$, $s_i$ is the VI value of image feature $F_i$, $s_k$ is the VI value of image feature $F_k$, and $n$ is the number of image features. Finally, if $t_k$ ($k \geq 2$) is greater than the jumping degrees of all of its previous $(k-1)$ image features, $k$ is regarded as the point that distinguishes different feature groups, and the previous $(k-1)$ image features belong to the same feature group. The rule is then repeated on the image features that remain after the previous $(k-1)$ image features are removed, until all image features have been grouped hierarchically.
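The grouping rule can be sketched as follows. Because Equations (3) and (4) are not reproduced here, the sketch uses a hypothetical stand-in for the jumping degree (the gap between a VI value and its predecessor); this is not the paper's definition. It further assumes that the first feature of each pass has no jumping degree, so the comparison starts at the third feature. Only the "first record-high jump closes the group, then repeat on the remainder" logic follows the description above; all names are illustrative.

```python
from typing import List, Sequence


def stand_in_jumps(values: Sequence[float]) -> List[float]:
    """Hypothetical jumping degree, NOT Equations (3)-(4):
    gap between each value and its predecessor (first value has none)."""
    return [values[k] - values[k - 1] for k in range(1, len(values))]


def hierarchical_groups(sorted_vi: Sequence[float]) -> List[List[float]]:
    """Split ascending VI values into groups at record-high jumps."""
    groups: List[List[float]] = []
    remaining = list(sorted_vi)
    while remaining:
        jumps = stand_in_jumps(remaining)  # jumps[j] belongs to remaining[j + 1]
        cut = len(remaining)               # default: everything left is one group
        for j in range(1, len(jumps)):
            if jumps[j] > max(jumps[:j]):  # larger than every earlier jump
                cut = j + 1                # features before this point form a group
                break
        groups.append(remaining[:cut])
        remaining = remaining[cut:]        # repeat the rule on the remainder
    return groups


# Toy VI values (already sorted in ascending order).
groups = hierarchical_groups([1, 2, 3, 10, 11, 12, 30, 31])
# groups == [[1, 2, 3], [10, 11, 12], [30, 31]]
```

With the paper's own jumping degree substituted for `stand_in_jumps`, the same loop would yield the group boundaries; the toy values only show that features with similar VI values end up in the same group.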

Saliency-Guided Binary Change Detection
An effective approach to dynamic wetland monitoring over large areas is to first extract the changed and unchanged areas and then identify the different change categories. Changed and unchanged areas are extracted using binary CD methods, and the different change categories are identified using multiple CD methods. Many binary CD methods have been proposed; according to their analysis units, they can be divided into pixel-based, sub-pixel-based, object-based, and scene-based CD methods [35].
Pixel-based and object-based binary CD methods are widely used, but each has advantages and disadvantages. Pixel-based CD methods are sensitive to image registration errors and suffer from serious salt-and-pepper noise. Object-based CD methods reduce salt-and-pepper noise and are less affected by image registration errors, but they are strongly affected by the image segmentation parameters. To combine the advantages of both and offset their disadvantages, we previously proposed a saliency-guided binary CD method that combines pixel-based and object-based CD, which is described as follows. First, the difference images are obtained, and saliency detection with a maximum symmetric surround (MSS) algorithm is used to generate their saliency maps. Second, fuzzy C-means (FCM) combined with a Markov random field (MRF) is used to extract the initial CD result at the pixel level. Third, a multi-scale segmentation algorithm is utilized for object-oriented image segmentation, in which the rate of change of local variance (ROC-LV) is used to estimate the optimal segmentation scales. Finally, an uncertainty index of the segmentation objects is constructed to adaptively select changed and unchanged samples, and these samples are then used to train a random forest classifier to obtain the final CD result. Further details can be found in [36].
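For the scale-estimation step, the sketch below computes ROC-LV as the percentage change of local variance between consecutive segmentation scales, which is its usual definition. Treating local peaks of the ROC-LV curve as candidate scales is an assumption about how the curve is read, not a detail confirmed by [36]; the function names and the toy local-variance values are hypothetical.

```python
from typing import List, Sequence


def roc_lv(local_variances: Sequence[float]) -> List[float]:
    """Percentage change of local variance between consecutive scales.

    Local variances are assumed to be strictly positive.
    """
    return [100.0 * (cur - prev) / prev
            for prev, cur in zip(local_variances, local_variances[1:])]


def candidate_scales(scales: Sequence[int],
                     local_variances: Sequence[float]) -> List[int]:
    """Return the scales at which the ROC-LV curve has a local peak."""
    roc = roc_lv(local_variances)  # roc[i] describes the step to scales[i + 1]
    peaks = []
    for i in range(1, len(roc) - 1):
        if roc[i] > roc[i - 1] and roc[i] > roc[i + 1]:
            peaks.append(scales[i + 1])
    return peaks


# Toy local-variance curve over five segmentation scales.
peaks = candidate_scales([10, 20, 30, 40, 50], [100.0, 110.0, 115.0, 130.0, 133.0])
# peaks == [40]
```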

Change Identification Using Sample Transfer Learning
For large-scale areas with complex land use and land cover classes, training sample collection is time-consuming and labor-intensive, and only small amounts of training samples are available, at only one of the pre- and post-changed times. Therefore, training sample transfer learning was designed for change identification in order to obtain the "from-to" changes. Its flowchart is shown in Figure 2.

For the image at time $t_1$, the image features were optimized using the feature hierarchical selection method, and the optimal feature subset was then input into a random forest classifier to obtain the wetland distribution at $t_1$. Training samples of six different wetland categories were collected using unmanned aerial vehicle (UAV) flights. From $t_1$ to $t_2$, some training samples changed from one wetland class to another. These changed samples could not serve as training samples at $t_2$; only the unchanged training samples were transferred to $t_2$. The transferred training samples were used for random forest classification at $t_2$, and only the image features belonging to the optimal feature subset were extracted at $t_2$. The wetland distribution at $t_2$ was obtained by random forest classification, in which the wetland classes in unchanged areas stayed the same as at $t_1$. The wetland distributions in the changed areas at $t_1$ and $t_2$ were then compared to obtain the change transfer matrix.
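The sample transfer step can be sketched as a simple filter, assuming that the binary CD result is available as a mask in which 0 marks unchanged pixels and 1 marks changed pixels. The data layout, the function name, and the toy values are hypothetical and serve only to illustrate the idea of carrying unchanged samples over to the later date.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[int, int, str]  # (row, column, wetland class at the first date)


def transfer_samples(samples_t1: Sequence[Sample],
                     change_mask: Sequence[Sequence[int]]) -> List[Sample]:
    """Return the first-date samples located on unchanged pixels (mask value 0)."""
    return [(row, col, label) for row, col, label in samples_t1
            if change_mask[row][col] == 0]


# Toy binary CD result: 1 = changed, 0 = unchanged.
change_mask = [
    [0, 0, 1],
    [0, 1, 1],
]
samples_2013 = [(0, 0, "sea"), (0, 2, "sea"),
                (1, 0, "salt marsh"), (1, 1, "farmland")]
samples_2018 = transfer_samples(samples_2013, change_mask)
# samples_2018 == [(0, 0, "sea"), (1, 0, "salt marsh")]
```

The two samples that fall on changed pixels are dropped, because their 2013 labels may no longer be valid in 2018.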

Feature Selection Results
The VI values of the 303 image features of the 2013 ZY-3 data were calculated using random forest and sorted in ascending order. According to the feature hierarchical selection method, features with similar VI values were assigned to the same feature group. The red dots in Figure 3 are the points distinguishing different feature groups, corresponding to the 35th, 71st, 109th, 158th, 225th, 265th, and 281st features. In this way, the 303 features were divided into eight feature groups, represented as $f_1 = [1 \sim 35]$, $f_2 = [36 \sim 71]$, $f_3 = [72 \sim 109]$, $f_4 = [110 \sim 158]$, $f_5 = [159 \sim 225]$, $f_6 = [226 \sim 265]$, $f_7 = [266 \sim 281]$, and $f_8 = [282 \sim 303]$.
Ground reference sample data for six wetland categories (sea, open water, farmland, aquaculture pond, salt marsh, and building) were collected using UAV flights. In this way, small amounts of training samples for 2013 were obtained and used for wetland classification in 2013. To verify the influence of different feature combinations on wetland classification, different feature subsets were input into random forest classifiers. The overall accuracy and Kappa coefficient of the different feature subsets are listed in Table 4. In general, the overall accuracy decreased as the feature groups with low VI values were deleted one after another, because feature groups that were beneficial to wetland identification were removed, which reduced the separability of the wetland classes. The exception was the deletion of feature group $\{f_1\}$, after which the overall accuracy increased. The subset $F_1 = \{f_2, f_3, f_4, f_5, f_6, f_7, f_8\}$ achieved the highest overall accuracy for wetland identification, 98.51%, and also the highest Kappa coefficient, 0.9763.
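The two accuracy measures reported in Table 4 can be computed from a confusion matrix with the standard definitions, as in the sketch below. The 2 x 2 matrix is a toy example rather than data from this study, and the function names are illustrative.

```python
from typing import Sequence


def overall_accuracy(cm: Sequence[Sequence[int]]) -> float:
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def kappa_coefficient(cm: Sequence[Sequence[int]]) -> float:
    """Cohen's kappa: agreement corrected for chance agreement.

    Undefined (division by zero) if chance agreement equals 1.
    """
    total = sum(sum(row) for row in cm)
    observed = overall_accuracy(cm)
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(col) for col in zip(*cm)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    return (observed - expected) / (1 - expected)


# Toy confusion matrix (rows: reference classes, columns: predicted classes).
toy_cm = [[40, 10],
          [10, 40]]
oa = overall_accuracy(toy_cm)      # 0.8
kappa = kappa_coefficient(toy_cm)  # approximately 0.6
```

Kappa is lower than overall accuracy here because half of the agreement in a balanced two-class problem would be expected by chance.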

Change Identification Results
Based on the ZY-3 images from 2013 and 2018, our previously proposed binary CD method was used to obtain the unchanged and changed areas in the study area. The overall accuracy of this CD method was 93.51% [36], which is reliable enough for the subsequent wetland change identification. Following the change identification flowchart in Figure 2, the wetland classification results for 2018 were obtained. The wetland classification results for 2013 and 2018 are shown in Figure 4; their overall accuracies were 98.51% and 91.16%, respectively, which meets the requirements of actual wetland monitoring.

The change transfer matrix between the six wetland categories from 2013 to 2018 is shown in Table 5. The increase in salt marsh came mainly from the sea area (2.27 km²), because salt marsh has strong adaptability and fertility and can expand rapidly throughout coastal areas. The increase in aquaculture pond came from the sea area (10.58 km²) and salt marsh (5.24 km²), because the aquaculture industry brings considerable economic benefits. Some new aquaculture ponds emerged under a reclamation project, and some salt marshes in the original natural landscape were developed into aquaculture ponds. The increase in farmland came from salt marshes, because the agricultural industry also developed in coastal areas. The increase in building came from salt marshes and farmland, because population migration to the wetland areas led to agricultural land and salt marshes being occupied by residential and industrial land.
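A change transfer matrix of this kind can be assembled by comparing the two classification maps inside the changed areas, as sketched below. The tiny maps, the class list, and the `pixel_area_km2` value are hypothetical; in practice the pixel area would follow from the spatial resolution of the imagery.

```python
from typing import List, Sequence


def change_transfer_matrix(map_t1: Sequence[Sequence[str]],
                           map_t2: Sequence[Sequence[str]],
                           change_mask: Sequence[Sequence[int]],
                           classes: Sequence[str],
                           pixel_area_km2: float) -> List[List[float]]:
    """Area (km^2) moving from each class (rows) to each class (columns),
    accumulated over changed pixels only."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0.0] * len(classes) for _ in classes]
    for row_t1, row_t2, row_mask in zip(map_t1, map_t2, change_mask):
        for c1, c2, changed in zip(row_t1, row_t2, row_mask):
            if changed:
                matrix[index[c1]][index[c2]] += pixel_area_km2
    return matrix


classes = ["sea", "salt marsh", "aquaculture pond"]
map_2013 = [["sea", "sea", "salt marsh"],
            ["sea", "salt marsh", "salt marsh"]]
map_2018 = [["sea", "salt marsh", "aquaculture pond"],
            ["aquaculture pond", "salt marsh", "aquaculture pond"]]
mask = [[0, 1, 1],
        [1, 0, 1]]
# A pixel area of 1.0 km^2 is used only to keep the toy numbers readable.
matrix = change_transfer_matrix(map_2013, map_2018, mask, classes, 1.0)
# matrix == [[0.0, 1.0, 1.0], [0.0, 0.0, 2.0], [0.0, 0.0, 0.0]]
```

Each row gives the "from" class and each column the "to" class, so the entry in the sea row and salt marsh column corresponds to the Se-SM change pattern.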

Conclusions
High-resolution optical data can capture the changes in coastal wetlands in detail. Image features can be extracted from these data, but the impact of different features on change identification is not clear, and combining a large number of features can easily cause the "dimension disaster" problem. Therefore, a feature hierarchical selection method that takes into account the jumping degree of different image features was proposed. The influence of different features on wetland classification was analyzed, and the features that had only a slight influence on wetland classification were deleted, both to reduce dimensionality and to obtain optimal feature subsets.
In addition, only small amounts of training samples were accessible at either the pre- or the post-changed time. Therefore, a training sample transfer learning strategy was designed for wetland classification, and the classification results at the pre- and post-changed times were compared to obtain a change transfer matrix. The main conclusions are as follows:
(1) The jumping degree was introduced to design a feature hierarchical strategy for obtaining optimal feature subsets. The feature selection results showed that the feature hierarchical selection method can provide a quantitative reference for optimal feature subset selection.
Different feature types were extracted from ZY-3 images, including spectral-based, texture-based, morphological-based, transform-based, edge-based, and vegetation indexes.

Figure 2. Change identification using feature hierarchical selection and training samples transfer learning methods.

Figure 3. Hierarchical features in which the red dots are the points distinguishing different feature groups.


(2) The training sample transfer learning strategy was used to classify the post-changed optical data without recollecting training samples, which considerably reduces the effort of sample collection. The overall accuracy obtained with the transferred training samples was 91.16%, demonstrating that the strategy meets the accuracy requirements for change monitoring.
(3) The southeastern coastal wetlands in Jiangsu Province were used as the study area, and ZY-3 images from 2013 and 2018 were used to conduct the experiments. The results demonstrated that salt marshes increased mainly from the sea area (2.27 km²), because salt marshes expand rapidly throughout coastal areas, and that aquaculture ponds increased from the sea area (10.58 km²) and salt marshes (5.24 km²), because of the considerable economic benefits of the aquaculture industry.

Table 2. The formulas of the vegetation indexes.

Table 4. Differences in wetland classification using different feature subsets.

Table 5. Change transfer matrix between different wetland categories from 2013 to 2018 (units: km²).