Using Multisource High-Resolution Remote Sensing Data (2 m) with a Habitat–Tide–Semantic Segmentation Approach for Mangrove Mapping

Abstract: Mangrove wetlands are hotspots of global biodiversity and blue carbon reserves in coastal wetlands, with unique ecological functions and significant socioeconomic value. Annual fine-scale monitoring of mangroves is crucial for evaluating national conservation programs and implementing sustainable mangrove management strategies. However, annual fine-scale mapping of mangroves over large areas using remote sensing remains a challenge due to spectral similarities with coastal vegetation, periodic tidal fluctuations, and the need for consistent and dependable samples across different years. Previous research has lacked strategies that simultaneously consider the spatial, temporal, and methodological aspects of mangrove extraction. Therefore, based on an approach that considers mangrove habitat, tides, and semantic segmentation, we propose a method for fine-scale mangrove mapping suitable for long time-series data. This is an optimized hybrid model that integrates spatial, temporal, and methodological considerations. The model uses imagery from five sensors (GF-1, GF-2, GF-6, ZY3-01, ZY3-02) and combines deep learning U-Net models with mangrove habitat information and low-tide period algorithms. This method produces a mangrove map with a spatial resolution of 2 m. We applied this algorithm to three typical mangrove regions in the Beibu Gulf of Guangxi Province. The results showed the following: (1) The model scored above 0.9 in terms of its F1-score in all three study areas at the time of training, with an average accuracy of 92.54% for mangrove extraction. (2) The average overall accuracy (OA) for the extraction of mangrove distribution in three typical areas in the Beibu Gulf was 93.29%. When comparing the validation of different regions and years, the OA exceeded 89.84% and the Kappa coefficient exceeded 0.74. (3) The model results are reliable for extracting sparse and slow-growing young mangroves and narrow mangrove belts along roadsides. In some areas where tidal flooding occurs, the existing dataset underestimates mangrove extraction to a certain extent. The fine-scale mangrove extraction method provides a foundation for the implementation of fine-scale management of mangrove ecosystems, support for species diversity conservation


Introduction
Mangroves are mainly distributed in tropical and subtropical regions between 32° N and 38° S around the world, and they play important roles in coastal environmental protection [1,2], water purification [3,4], carbon sequestration [5-7], biodiversity conservation [8,9], and other ecological functions [4,10,11]. Due to human activities and climate change, the mangrove ecosystem is one of the most fragile ecosystems on Earth and has unique habitat conditions. Over the past 50 years, global mangrove coverage has decreased by approximately 20-35% [12]. In China, mangroves have experienced a gradual increase recently after an initial decrease, but the current stock is still only approximately 40% of that in 2000 [13]. The severe degradation of mangroves has attracted the attention of the Chinese government. Through the efforts of various sectors in China, mangroves have slowly begun to recover since 2015 [14]. Reliable, high-precision, and regular monitoring of mangroves is crucial for formulating and implementing sustainable mangrove management strategies.
Remote sensing plays a unique role in monitoring inaccessible coastal ecosystems [11]. With the development of satellite remote sensing and intelligent classification algorithms, significant progress has been made in the extraction and monitoring of mangrove wetland areas [15-18]. In recent years, high-resolution GF and ZY series satellites have been widely used for mangrove monitoring tasks [14,19]. With characteristics such as high image resolution and short revisit intervals, these satellites have great potential for detailed and effective mangrove monitoring.
To date, based on existing remote sensing datasets, researchers have conducted a large amount of mangrove monitoring work on local to global scales. Using Landsat datasets with a spatial resolution of 30 m, researchers used machine learning methods such as random forests to create a global mangrove map with a resolution of 30 m from 2000 to 2016 to capture human and natural impacts on mangroves [12]. They have also generated long-term dynamics of Chinese mangroves using image thresholding methods and analyzed the effectiveness of Chinese mangrove protection measures [13]. This research strategy takes temporal optimization into account by identifying the low-tide periods for mangroves using corresponding ecological indices. This method is more suitable for global-scale mapping of mangroves at a 30 m resolution. However, such methods have a higher demand for samples and still pose challenges for fine-scale annual monitoring of long-term mangrove changes. Using Sentinel and Landsat datasets with spatial resolutions ranging from 10 m to 30 m, researchers evaluated the applicability of different indicators and selected the interannual changes in the EVI and NDVI for mangrove time-series analysis [20]. Theil-Sen median trend analysis and the Mann-Kendall test were then used to explore the spatiotemporal variation trends in mangroves in the Qinglan Harbor Nature Reserve in Hainan, China, from 1987 to 2020 [20]. Researchers generated mangrove maps of China using the Otsu method for six time periods from 1973 to 2015 and analyzed national- and local-scale changes in mangroves over a 42-year period [13].
Studies based on UAV (unmanned aerial vehicle) data, hyperspectral data, and multispectral data have also used methods such as XGBoost (extreme gradient boosting) for mangrove extraction at a small spatial scale [21,22], SVMs (support vector machines) for mangrove species classification [23-25], and CNNs (convolutional neural networks) to avoid manual leaf sample identification during the image recognition process for mangrove extraction and species classification [26]. This research strategy considers spatial optimization by combining deep learning with mangrove habitat information to identify potential growth areas of mangroves and then extract mangroves based on this information. This approach helps avoid confusion between mangroves and other terrestrial vegetation due to spectral similarity. However, this research strategy often overlooks the uncertainty of tides, leading to underestimation of the extracted mangroves compared with their actual extent. Additionally, these methods require high-quality samples and are still not suitable for long-term mangrove monitoring. These studies indicate that Landsat images (30 m spatial resolution) and Sentinel images (10 m spatial resolution) are mainly used for large-scale mangrove monitoring, but there is a gap in large-scale, fine spatial resolution (2 m) monitoring for long time series. Since the policy for protecting mangroves was established, the fluctuation in mangrove areas has increased. In this situation, high spatial resolution and dense time-series mangrove monitoring pose even more rigorous challenges in evaluating national protection plans.
Machine learning algorithms are the most widely used methods for extracting mangroves. However, these algorithms rely on rich and unbiased training samples from which they can learn to obtain accurate mappings [27]. In addition, machine learning algorithms require setting the weights of features to balance prior probabilities and fitting quality [28-30]. CNNs achieve high scores in automatic classification tasks, but the results of traditional CNNs for mangrove extraction rely on stable sample quality [31]. Researchers have utilized attention mechanisms and CNNs to extract mangroves in the Dongzhai Port National Nature Reserve in Hainan, China, further improving mangrove extraction accuracy [32]. However, obtaining stable, long-term mangrove samples remains a challenge. Semantic segmentation models like U-Net provide an opportunity for full-sample mapping, requiring only one year of stable samples for training; the model results can be applied to other years [33]. The U-Net deep learning semantic segmentation network can reduce the dependence on sample points and achieve good classification results (OA = 81%) based on Landsat images [34]. In addition, some improved U-Net models have demonstrated excellent performance in mangrove extraction [35,36]. Semantic segmentation models have shown excellent performance in binary extraction tasks and have been repeatedly proven to be more suitable for fine-scale, long-term mangrove extraction. However, mangrove extraction tasks still present challenges, such as errors caused by spectral similarity, which lead to an overestimation of mangrove area, and errors resulting from tidal uncertainty, which lead to an underestimation of mangrove area. Similarly, an independent study in China showed that farmlands, inland shrublands, bushes, and aquatic plants have very similar spectral characteristics to mangroves, and thus misclassifications are common [37]. This indicates that obtaining stable samples for mangrove extraction tasks is highly challenging.
Fine-scale annual monitoring of mangroves still presents three major challenges: (1) Spectral similarity: Mangroves share a high spectral similarity with coastal vegetation. Relying solely on spectral data often results in the misclassification of a significant amount of other vegetation as mangroves. (2) Tidal influence: The appearance of mangroves in imagery is greatly influenced by tidal conditions. Existing extraction strategies struggle to obtain accurate mangrove boundaries due to these tidal fluctuations. (3) Sample availability: Current mangrove extraction strategies require a substantial number of reliable samples. Obtaining reliable samples from noncurrent years is challenging, making it difficult to monitor long-term changes in mangroves.
To address these challenges, this research aims to develop and evaluate hybrid U-Net deep learning models for mapping mangroves from multisource high-resolution data. The research includes the following aspects: (1) developing a hybrid U-Net model based on a habitat-tide-semantic segmentation approach; (2) extracting the maximum extent of annual mangrove data using multisource high-resolution imagery; (3) qualitatively and quantitatively evaluating the model results through visual interpretation and validation by multiple experts; and (4) comparing the model results with existing mangrove datasets.
The rest of this paper is organized as follows: Section 2 provides detailed information about the study area, data sources, and auxiliary data. Section 3 describes the hybrid U-Net deep learning model for mapping mangroves from multisource high-resolution data. Section 4 provides an evaluation of the performance of the hybrid U-Net model. Section 5 discusses the advantages and limitations of the hybrid U-Net model and emphasizes the advantages of multisource high-resolution data in mangrove extraction. Finally, the major findings of this research are presented in Section 6.

Study Areas
The study areas are located in three typical regions of Guangxi Province in southern China (Figure 1): (1) Zhenzhu Harbor-Fangcheng Harbor in Guangxi (ZZW); (2) Dafeng River in Guangxi (DFJ); and (3) Tieshan Harbor in Guangxi (TSG). The study areas include two national nature reserves (Guangxi Beilun Estuary National Nature Reserve and Guangxi Shankou Mangrove Ecological and National Nature Reserve) and one autonomous region nature reserve (Guangxi Maowei Sea-Dafeng River Mangrove Forest Autonomous Region Nature Reserve). ZZW belongs to the Guangxi Beilun Estuary National Nature Reserve, which is the largest and most typical bay-type mangrove in China. The mangroves of DFJ have the typical characteristics of estuary-type mangroves. The major species of mangroves include Aegiceras corniculatum, Kandelia obovata, Avicennia marina, Acanthus ilicifolius L., and Sonneratia apetala [38]. The predominant species in the mangrove communities of TSG, which span the low, mid-, and high intertidal zones, include Avicennia marina, Aegiceras corniculatum, Kandelia obovata, Rhizophora stylosa Griff, and Bruguiera gymnorrhiza (Linn.) Savigny [39]. The mangrove data presented in Figure 1 are openly available from the Science Data Bank (ScienceDB) at doi:10.11922/sciencedb.00449 [40].


Data Preprocessing
This study used imagery from five multispectral satellites, namely GF-1, GF-2, GF-6, ZY3-01, and ZY3-02. The main parameters are shown in Table 1. The preprocessing workflows for GF-1, GF-2, GF-6, ZY3-01, and ZY3-02 were generally the same. The processes included (1) orthorectification of the panchromatic and multispectral images, (2) radiometric calibration, (3) atmospheric correction, (4) image fusion and pansharpening, and (5) data cropping and subset extraction. First, all of the images were orthorectified to the World Geodetic System 1984 (WGS 84) reference system, with a root mean square geometric error of less than one pixel. The changes in solar angle and Earth-Sun distance were normalized by converting the digital values to top-of-atmosphere reflectance for each image. Then, the radiometric calibration tool of ENVI 5.3 and the FLAASH atmospheric correction tool were used to process the multispectral images. The calibration coefficients and atmospheric correction parameters were obtained from the ENVI 5.3 lookup table (GMTED2010). Finally, the pansharpening algorithm [41] was used to fuse the orthorectified panchromatic and multispectral images. Pansharpening is a radiometric transformation that combines high-resolution panchromatic images (or raster bands) with lower-resolution multiband raster datasets. Combining the spatial detail of the panchromatic imagery with the spectral information of the lower-resolution multispectral imagery yields fused images with both high spatial and high spectral resolution. The fused images had the same spatial resolution as the panchromatic band and were then resampled to a spatial resolution of 2 m. The processed images were then divided into separate image sets by year for subsequent mangrove extraction. In this study, images from the five high-resolution satellites with a cloud cover of less than 15% over the Beibu Gulf from 2015 to 2021 were selected [42]. A total of 226 images were selected for the extraction of the mangrove forests in the Beibu Gulf.

Auxiliary Data

2.3.1. Potential Growth Areas of Mangroves
Mangroves occupy a specific ecological niche, so information about the mangrove habitat can be used to eliminate most areas where mangroves are not found [23]. In this study, potential mangrove habitats were determined based on common mangrove habitat characteristics. First, a 20 km buffer zone was set along the Beibu Gulf coastline. Second, Guangxi has 12 species of true mangroves, with the tallest being Sonneratia apetala, which can grow up to 20 m high [43]. The areas where the DSM (digital surface model) value was less than 20 m were therefore extracted from the buffer zone to obtain the mangrove habitat. The DSM data were downloaded from http://www.eorc.jaxa.jp/ALOS/en/aw3d30/data/index.html, accessed on 21 March 2023.
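The two habitat rules above (within 20 km of the coastline, DSM below 20 m) can be illustrated with a minimal sketch. It assumes that a distance-to-coastline raster and a DSM raster are already co-registered on the same grid; the function name `habitat_mask`, the toy grids, and the use of plain nested lists (rather than a raster library) are illustrative assumptions and not the authors' implementation.

```python
def habitat_mask(dist_to_coast_m, dsm_m, buffer_m=20_000.0, max_height_m=20.0):
    """Return a 2-D boolean grid marking potential mangrove habitat.

    A pixel qualifies when it lies inside the coastal buffer AND its
    DSM value is below the height threshold (both rules from the text).
    Both inputs are equally sized 2-D lists of floats (hypothetical layout).
    """
    mask = []
    for dist_row, dsm_row in zip(dist_to_coast_m, dsm_m):
        mask.append([
            (d <= buffer_m) and (h < max_height_m)
            for d, h in zip(dist_row, dsm_row)
        ])
    return mask


# Toy 2 x 3 example: distance to the coastline (m) and DSM height (m).
dist = [[500.0, 15_000.0, 25_000.0],
        [100.0, 19_999.0, 30_000.0]]
dsm = [[3.0, 25.0, 5.0],
       [19.9, 20.0, 1.0]]

mask = habitat_mask(dist, dsm)
```

In this toy grid only the first column passes both rules; the other pixels fail either the height test or the buffer test. On real rasters the same element-wise logic would normally be expressed with array operations.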

Reference Data Collection
The sample dataset of this study was mainly used to evaluate the effectiveness of the proposed method for mangrove extraction. The sample discrimination was based on visual interpretation of Google Earth high-resolution imagery and field investigations. The sample dataset consisted of three parts: (1) For the first part, ArcGIS 10.8 was used to generate random points in the mangrove and its adjacent areas. These points were assessed by multiple geographers and mangrove experts. Each sampling point's category attribute was manually determined based on high-resolution imagery from Google Earth, and uncertain sampling points were removed. (2) The second part came from the areas of change in the mangrove patches, primarily focusing on areas where differences existed between existing datasets for the same year. These areas were initially assessed by an expert system and ultimately visually interpreted using high-resolution imagery from Google Earth. (3) The third part of the sample points came from the 2015 field investigation results. The number and sources of the sampling points in each region are shown in Table 2.

Existing Data Products
In this study, we compared our model's results with five publicly available mangrove datasets. To ensure the reliability of the comparison, we only selected the baseline years of these datasets. These five datasets consist of one set at 2 m resolution (XCina, 2018) [23], two sets at 10 m resolution (TW, LREIS) [44,45] (2019, 2020), and two sets at 30 m resolution (GMW, CAS) (2015, 2018, 2019, 2020). The 10 m resolution products were mainly extracted based on Sentinel images and then underwent field investigations and postprocessing, with an accuracy of over 90%. The 30 m resolution products were mainly extracted based on Landsat images, and the results were published after postprocessing. The most representative products are the GMW data, which have formed annual time-series data since 2015 [15].

Overview of the Mangrove Extraction Process
The workflow for mangrove extraction and evaluation is outlined in Figure 2 and mainly consists of four parts: (1) image preprocessing, (2) mangrove maximum growth area assessment, (3) U-Net mangrove extraction, and (4) accuracy assessment. Image preprocessing: The satellite images were preprocessed to obtain a high-resolution multisource image set with a spatial resolution of 2 m for 2015-2021. Mangrove maximum growth area assessment: Based on the habitat characteristics and NDVI, we acquired cloud-free images during the low-tide period for mangroves, representing the maximum growth area of mangroves. Extracting the maximum mangrove growth area, which is used for U-Net mangrove extraction, is crucial for the "full mapping" of the semantic segmentation model. U-Net mangrove extraction: Accurate determination of the maximum mangrove growth area contributes to achieving a "full mapping" relationship between stable samples (2021) and the spatial scale of remote sensing images. On this basis, image detection and segmentation were performed on the annual cloud-free mangrove images. Multisource image segmentation results were extracted using the hybrid U-Net model, which was used to derive mangrove maps for multiple years (2015-2020). Accuracy assessment: The accuracy of the mangrove maps was verified based on validation sample sites and field survey results, and quantitative and qualitative analyses were performed by comparing multi-person visual interpretations with existing data.

Principle of the U-Net Model
The U-Net model is the most widely used model in semantic segmentation projects and has also been extensively applied to remote sensing image classification [46]. The input image first passes through the encoder to generate downsampled feature maps. Then, the decoder upsamples the feature maps to make them the same size as the original image. U-Net is composed of stackable highway units, allowing deeper networks to be trained using a limited number of samples [34]. To address the problem of gradient degradation caused by excessively deep neural networks in deep learning tasks, this study used ResNet-34 as the backbone network to train the deep learning model and enhance the model's feature representation capability [47,48]. To optimize the update of network parameters, the Adam optimizer [49] was used. The specific equation, Equation (1), is as follows:

$$\theta_{t} = \theta_{t-1} - \alpha \, \frac{\hat{m}_{t}}{\sqrt{\hat{v}_{t}} + \varepsilon} \qquad (1)$$

In Equation (1), $\theta$ denotes the network parameters, $t$ represents the number of training iterations, $\alpha$ is the learning rate, $\hat{m}_{t}$ is the (bias-corrected) exponential moving average of the gradient, and $\hat{v}_{t}$ is the (bias-corrected) exponential moving average of the squared gradient. $\varepsilon$ is usually a constant value of $1 \times 10^{-8}$.
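As a small worked illustration of the Adam update, the sketch below applies it to a single scalar parameter. The decay rates `beta1 = 0.9` and `beta2 = 0.999` are the commonly used defaults and are an assumption here, since the text only specifies the learning rate symbol and the constant 1e-8; the quadratic toy objective is likewise purely illustrative.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter.

    m, v: running averages of the gradient and the squared gradient.
    t:    1-based iteration counter, used for bias correction.
    """
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    m_hat = m / (1.0 - beta1 ** t)   # bias-corrected first moment
    v_hat = v / (1.0 - beta2 ** t)   # bias-corrected second moment
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


# First step from theta = 1.0 on f(theta) = theta^2 (gradient 2 * theta).
first_theta, _, _ = adam_step(1.0, 2.0, 0.0, 0.0, t=1)

# Toy optimisation loop with a larger learning rate.
theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2.0 * theta, m, v, t, lr=0.01)
```

On the first iteration the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly one learning rate regardless of the gradient magnitude.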
The model loss was calculated during training using the average cross-entropy loss function, as shown in Equation (2):

$$L = -\frac{1}{n} \sum_{i=1}^{n} \left[ a_{i} \ln y_{i} + (1 - a_{i}) \ln (1 - y_{i}) \right] \qquad (2)$$

where $n$ represents the batch size and $y_{i}$ and $a_{i}$ are the predicted value and true value of the $i$-th sample in each batch, respectively. The U-Net network model has better spatial accuracy because skip connections are added between the encoder and decoder layers.
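The average cross-entropy loss for a binary (mangrove versus non-mangrove) output can be sketched as follows. The clipping constant `eps` and the example probabilities are illustrative assumptions; they only keep the logarithm finite and are not taken from the study.

```python
import math


def mean_cross_entropy(pred, true, eps=1e-7):
    """Average binary cross-entropy over one batch.

    pred: predicted mangrove probabilities in [0, 1].
    true: reference labels (1 = mangrove, 0 = non-mangrove).
    """
    total = 0.0
    for y, a in zip(pred, true):
        y = min(max(y, eps), 1.0 - eps)  # avoid log(0)
        total += a * math.log(y) + (1 - a) * math.log(1.0 - y)
    return -total / len(pred)


# Hypothetical batch of four pixels.
loss = mean_cross_entropy([0.9, 0.1, 0.8, 0.3], [1, 0, 1, 0])
```

Confident, correct predictions contribute little to the loss, while an uninformative prediction of 0.5 contributes ln 2 per sample.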

Hybrid U-Net Based on the Habitat-Tide Semantic Segmentation Approach
Based on mangrove habitat characteristics and the collected low-tide data, we propose a hybrid strategy that combines habitat information and low-tide data with the U-Net model for mangrove extraction. The hybrid U-Net model consists of three steps: (1) mangrove habitat information, (2) low tide level acquisition, and (3) U-Net model mangrove extraction (Figure 3).
The habitat extraction module identifies the potential growth areas of mangroves. The low-tide data collection module extracts the maximum mangrove extent based on the current data. The U-Net module for mangrove extraction constructs the relationship between the maximum mangrove extent and multisource high-resolution remote sensing imagery to build a mangrove extraction model. This model can be used for mangrove extraction in other years within the same area. In the U-Net module for mangrove extraction, the encoder and decoder parts are the same as those of the standard U-Net model. However, the classifier part is enhanced by integrating the habitat information module and the low-tide data collection module, improving the accuracy of the extraction process.
Step 1: Mangrove habitat information
As described in Section 2.3.1, mangroves grow on intertidal flats, so the buffer zone was set to 20 km from the coastline. Areas with a digital surface model (DSM) value of less than 20 m are considered potential growth areas to reduce interference from other coastal vegetation.
This step uses the unique habitat of mangroves to identify potential mangrove habitat areas and avoid interference from shoreline vegetation in the extraction results. This is an optimization in terms of the model's spatial aspects. The inputs for this step include GF series and ZY series remote sensing images, DSM data, and coastline data. The output is the extent of the mangrove habitat.
Step 2: Low tide level acquisition
The minimum water level in the study area was based on the lowest tidal period of the year [50]. The normalized difference vegetation index (NDVI) values of non-water pixels (vegetation and tidal flats) are higher than those of water pixels, because vegetation absorbs strongly in the red band and reflects strongly in the near-infrared band, whereas water absorbs most near-infrared radiation. Maximum value compositing of the NDVI for the preprocessed 2 m resolution images from 2021 can therefore be used to extract the minimum water surface [51]. This composite image not only enhances the difference between vegetation and water but also removes the periodic flooding effect of seawater on the mangrove forest, effectively separating the mangrove forest from the water. In this study, 51 images were ultimately used to track the low-tide period (GF-1: 13; GF-2: 17; GF-6: 4; ZY3-01: 5; ZY3-02: 12). We list the relevant information for the remote sensing data in the Supplementary Materials.
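The maximum value compositing idea can be sketched on a toy stack of scenes. The sketch assumes co-registered near-infrared and red reflectance grids for each acquisition; the function names, the reflectance values, and the NDVI threshold of 0 used to separate water from non-water are illustrative assumptions rather than values reported in the study.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    denom = nir + red
    return (nir - red) / denom if denom != 0 else 0.0


def max_ndvi_composite(scenes):
    """Per-pixel maximum NDVI over a list of (nir_grid, red_grid) scenes.

    A pixel that is flooded in some scenes but exposed in at least one
    keeps its exposed (higher) NDVI value in the composite.
    """
    rows = len(scenes[0][0])
    cols = len(scenes[0][0][0])
    composite = [[-1.0] * cols for _ in range(rows)]  # -1 is the NDVI minimum
    for nir_grid, red_grid in scenes:
        for r in range(rows):
            for c in range(cols):
                value = ndvi(nir_grid[r][c], red_grid[r][c])
                if value > composite[r][c]:
                    composite[r][c] = value
    return composite


# Two toy scenes over a 1 x 2 grid: pixel 0 is a mangrove that is flooded
# at high tide (scene 1) and exposed at low tide (scene 2); pixel 1 is
# permanent open water.
high_tide = ([[0.05, 0.02]], [[0.08, 0.06]])   # (nir, red)
low_tide = ([[0.40, 0.03]], [[0.05, 0.05]])

composite = max_ndvi_composite([high_tide, low_tide])
non_water = [[value > 0.0 for value in row] for row in composite]  # assumed threshold
```

The flooded mangrove pixel is recovered from the low-tide scene, while the open-water pixel stays negative in every scene and is excluded.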
This step uses the NDVI and multisource high-resolution data to determine the low-tide moments for mangroves and thereby obtain the maximum mangrove area. Based on this, multilevel interpretation is carried out, and the interpretation results serve as inputs to the U-Net model for training, achieving a "full mapping" with remote sensing imagery. The training results can then be applied to mangrove extraction for other years. This is an optimization in terms of the model's temporal aspects. The inputs for the second part include NDVIs constructed from GF series and ZY series remote sensing images. The output is the maximum extent of the mangrove areas.
The model was implemented in Python using the TensorFlow deep learning framework. The hardware configuration of the operating platform includes a Lenovo Ren-7000 i7-12700 (F)/NVIDIA Quadro RTX3070Ti 16G GPU. The slice size was set to 256 × 256 pixels, the batch size was set to 32, the backbone network was ResNet-34, and the other parameters were set to the default values of ResNet-34.
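The slice size and batch size above can be illustrated with a short sketch that computes tile offsets for a scene and groups them into batches. How the study handled image edges is not stated; shifting the last tile back so that it stays inside the image is an assumption made here for illustration, as are the function names and the scene size.

```python
def tile_origins(height, width, tile=256):
    """Top-left (row, col) offsets of tile x tile slices covering an image.

    Tiles do not overlap except at the far edges, where the last tile is
    shifted back to stay inside the image (an illustrative choice).
    """
    def starts(size):
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile + 1, tile))
        if positions[-1] + tile < size:
            positions.append(size - tile)
        return positions

    return [(r, c) for r in starts(height) for c in starts(width)]


def make_batches(items, batch_size=32):
    """Group items into consecutive batches; the last one may be smaller."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


origins = tile_origins(600, 512)          # hypothetical 600 x 512 pixel scene
batches = make_batches(list(range(70)))   # 70 hypothetical slices
```

For the 600 × 512 example this gives six slices, with the third row of tiles starting at row 344 so that it ends exactly at the image edge.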
Remote Sens. 2023, 15, x FOR PEER REVIEW

Accuracy Assessment
We evaluated the classification accuracy of the proposed hybrid U-Net model using three metrics: the Kappa coefficient, overall accuracy (OA), and F1-score [52]. All three are calculated from the confusion matrix, which provides the number of correctly classified and misclassified samples [53]. Specifically, the Kappa coefficient is often used to measure the consistency of test results and the effectiveness of classification, while OA is the ratio of correctly classified samples to the total number of samples [34]. The F1-score is a comprehensive index that is independent of the number of categories; it is a means of balancing precision and recall. Precision is defined as the ratio of true positive (TP) predictions to all samples predicted as the same type of land cover. It refers to the proportion of the model results for a given type (mangrove or non-mangrove) that agrees with the validation point results. Recall is defined as the ratio of correctly predicted samples of the target type to all reference samples of that type. It refers to the proportion of validation points (field survey results and visual interpretation results) classified as mangroves that the model also identifies as mangroves [50].
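$$\mathrm{OA} = \frac{\sum_{i=1}^{k} a_{ii}}{N} \qquad (3)$$

$$\mathrm{Kappa} = \frac{N \sum_{i=1}^{k} a_{ii} - \sum_{i=1}^{k} a_{i+} a_{+i}}{N^{2} - \sum_{i=1}^{k} a_{i+} a_{+i}} \qquad (4)$$

$$F1 = \frac{2TP}{2TP + FP + FN} \qquad (5)$$

In Equations (3) and (4), $a_{ii}$ represents the number of samples correctly predicted as type $i$, $a_{ij}$ represents the number of samples of type $i$ predicted as type $j$, $a_{i+} = \sum_{j} a_{ij}$ and $a_{+i} = \sum_{j} a_{ji}$ are the corresponding row and column totals, $k$ is the number of types, and $N$ is the total number of samples. In Equation (5), TP represents the number of samples correctly predicted as mangrove forests, FP represents the number of samples for which the classification model predicts mangrove forests but the ground truth shows non-mangrove forests, and FN represents the number of samples for which the model predicts non-mangrove forests but the ground truth shows mangrove forests.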
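For the two-class case used here (mangrove versus non-mangrove), the three metrics can be computed directly from the four confusion-matrix counts. The sketch below uses the standard definitions of OA, Cohen's Kappa, and the F1-score; the counts in the example are hypothetical and are not results from this study.

```python
def binary_metrics(tp, fp, fn, tn):
    """OA, Kappa, precision, recall and F1 for the mangrove class.

    tp: mangrove predicted, mangrove in reference
    fp: mangrove predicted, non-mangrove in reference
    fn: non-mangrove predicted, mangrove in reference
    tn: non-mangrove predicted, non-mangrove in reference
    """
    n = tp + fp + fn + tn
    oa = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (oa - pe) / (1.0 - pe)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2.0 * precision * recall / (precision + recall)
    return {"oa": oa, "kappa": kappa, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical validation counts for 100 sample points.
metrics = binary_metrics(tp=40, fp=10, fn=5, tn=45)
```

With these counts the OA is 0.85 while Kappa is 0.70, which shows how Kappa discounts the agreement that would be expected by chance.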

Multilevel Interactive Verification
In this study, the multilevel interactive verification was divided into four parts: preliminary interpretation, comparison of interpretation results, issue resolution and correction, and final review and confirmation. Preliminary interpretation: We engaged multiple experts in geographic information or mangrove studies. Based on their respective professional knowledge and experience, they independently interpreted and identified mangroves in the same remote sensing image, generating interpretation logs. These interpretation logs include the basis for decisions, data sources, and key findings. Comparison of interpretation results: Interpretation results from multiple individuals were compared to identify stable mangrove areas and areas with discrepancies. Issue resolution and correction: Areas with differences in mangrove interpretation were checked against the existing datasets to determine whether they were mangroves. For areas that still could not be definitively determined, in-depth analysis of the imagery, additional data searches, or reference to ground validation data were conducted, and the final determination was made by local mangrove experts. Final review and confirmation: The stable mangrove areas obtained from the preliminary interpretation were overlaid with the discrepancy areas that the experts had determined to be mangroves, and the final visual interpretation results were confirmed.
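The comparison and overlay steps can be illustrated with a small sketch that treats each interpreter's result as a binary grid. Treating unanimous agreement as "stable" and any partial agreement as "disputed" is an assumption made for this illustration, as are the function names and the toy grids; the study describes the procedure only in prose.

```python
def split_by_agreement(interpretations):
    """Split pixels into stable mangrove and disputed areas.

    interpretations: list of 2-D grids (1 = mangrove, 0 = non-mangrove),
    one grid per interpreter. A pixel is stable when every interpreter
    marked it as mangrove and disputed when only some of them did.
    """
    n = len(interpretations)
    rows = len(interpretations[0])
    cols = len(interpretations[0][0])
    stable = [[False] * cols for _ in range(rows)]
    disputed = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            votes = sum(grid[r][c] for grid in interpretations)
            if votes == n:
                stable[r][c] = True
            elif votes > 0:
                disputed[r][c] = True
    return stable, disputed


def merge_final(stable, disputed, expert_decision):
    """Overlay stable areas with disputed pixels that experts confirmed."""
    return [[s or (d and bool(e)) for s, d, e in zip(srow, drow, erow)]
            for srow, drow, erow in zip(stable, disputed, expert_decision)]


# Three hypothetical interpreters on a 1 x 4 strip of pixels.
a = [[1, 1, 0, 0]]
b = [[1, 0, 0, 1]]
c = [[1, 1, 0, 0]]
stable, disputed = split_by_agreement([a, b, c])

expert = [[0, 1, 0, 0]]   # experts confirm only the second pixel
final = merge_final(stable, disputed, expert)
```

Only the two disputed pixels need expert attention in this example, and the final map keeps the one that the experts confirmed.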

Performance Evaluation of the Hybrid U-Net Model during the Training Period
The training results (2021 samples) based on the hybrid U-Net model in three typical mangrove areas in the Beibu Gulf are shown in Figure 4. During the training period, the output of the hybrid U-Net was divided into two categories: mangrove and non-mangrove. The application effect of the U-Net deep learning model was evaluated using classic evaluation metrics such as precision, recall, and F1-score in the three study areas. As seen in Table 3, the F1-scores of the three areas were all above 0.9, indicating that the hybrid U-Net model performed well in both precision and recall. Precision is an indicator from the perspective of model prediction, and the average mangrove prediction precision in the three areas was 92.54%. Recall is an indicator from the perspective of real results, and the average mangrove prediction recall in the three areas was 91.51%. Overall, the model performed similarly in the three areas, with slightly better results in TSG than in ZZW and DFJ. The lowest F1-score for non-mangrove areas during training in the three regions was 0.97. This demonstrates that very few mangrove patches were misinterpreted as non-mangrove patches. This further indicates that the boundaries of mangrove patches used for model training are quite accurate.
The hybrid U-Net model effectively extracted the mangroves in the three areas. The training accuracy-verification curve of the hybrid U-Net is shown in Figure 4. The training accuracy curve was used to measure the performance of the model, and the loss curve was used to further optimize the model. After approximately 7000 iterations in each of the three typical areas, the model converged, at which point the model weights were determined. After convergence, the loss in all three regions was around 0.05.
To avoid overfitting, an early stopping strategy was employed during model training [54]. This strategy monitors the training progress and automatically stops training once the model has converged and the loss on the training dataset no longer decreases significantly. This allows the model to be saved with the optimal parameter settings without continuing training through the full specified number of epochs.
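The early stopping rule described above can be sketched as a small helper that watches the monitored loss. The `patience` and `min_delta` parameters, and the loss values in the example, are assumptions made for illustration; the text does not state the settings actually used.

```python
class EarlyStopping:
    """Stop when the loss has not improved by min_delta for `patience` checks."""

    def __init__(self, patience=5, min_delta=1e-4):
        self.patience = patience      # hypothetical tolerance, in checks
        self.min_delta = min_delta    # hypothetical minimum improvement
        self.best_loss = float("inf")
        self.best_step = None
        self.stale = 0

    def update(self, step, loss):
        """Record a new loss value; return True when training should stop."""
        if loss < self.best_loss - self.min_delta:
            self.best_loss = loss
            self.best_step = step     # weights from this step would be saved
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience

# Illustrative loss sequence that flattens out near 0.05
losses = [0.40, 0.20, 0.10, 0.06, 0.05, 0.05, 0.05, 0.05]
stopper = EarlyStopping(patience=3, min_delta=1e-3)
stopped_at = None
for step, loss in enumerate(losses):
    if stopper.update(step, loss):
        stopped_at = step
        break
```

In a real training loop the weights would be checkpointed whenever `best_step` changes, so the saved model corresponds to the best loss rather than to the last iteration.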

Evaluation of the Hybrid U-Net Model's Mangrove Extraction Accuracy
The extraction results of the hybrid U-Net model can be seen in Figure 5 and Table 4 (the extraction results for DFJ and TSG are shown in the Supplementary Materials, Figures S1 and S2). The producer's accuracy (PA), user's accuracy (UA), overall accuracy (OA), and Kappa coefficient were selected as indicators to validate the accuracy of the hybrid U-Net model in extracting the mangroves. The average OA for the mangrove distribution extraction in the three typical areas of the Beibu Gulf was 93.29%. Across the different regions and years (Table 4), OA was at least 89.84% and the Kappa coefficient exceeded 0.74. The lowest OA in the DFJ area was 93.31% (Kappa = 0.8), and the lowest OA in the TSG area was 91.68% (Kappa = 0.82). The highest extraction accuracy in the DFJ area was in 2015 (OA = 96.58%, Kappa = 0.87), and the lowest accuracy was in the ZZW area in 2020 (OA = 89.84%, Kappa = 0.77). Compared with the results for DFJ, the extraction accuracy for ZZW and TSG in the four years averaged 94.74%. With the hybrid U-Net model, there were very few errors in mangrove extraction (Figure 5), resulting in high PA values for all the mangrove categories (DFJ reached a maximum value of 96.64% in 2015). The PA and UA values of the mangroves in the three typical areas were both greater than 90.11%.
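The four indicators are all derived from a confusion matrix of validation samples. The sketch below uses the standard definitions with a made-up 2 x 2 matrix (mangrove, non-mangrove); the counts are illustrative and are not taken from Table 4.

```python
import numpy as np

def accuracy_report(cm):
    """OA, Kappa, PA, and UA from a confusion matrix.

    cm[i, j] is the number of validation samples whose reference class is i
    and whose mapped class is j.
    """
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    reference_totals = cm.sum(axis=1)
    mapped_totals = cm.sum(axis=0)
    oa = np.trace(cm) / total
    # Agreement expected by chance, from the marginal totals
    expected = np.sum(reference_totals * mapped_totals) / total**2
    kappa = (oa - expected) / (1.0 - expected)
    pa = np.diag(cm) / reference_totals   # producer's accuracy per class
    ua = np.diag(cm) / mapped_totals      # user's accuracy per class
    return oa, kappa, pa, ua

# Hypothetical counts: rows = reference, columns = mapped
cm = [[45, 5],
      [10, 140]]
oa, kappa, pa, ua = accuracy_report(cm)
```

Kappa is lower than OA in this example because the large non-mangrove class inflates chance agreement, which is one reason both indicators are reported together.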


Multilevel Interactive Verification of Mangrove Extraction
To further validate and confirm the reliability of the mangrove extraction results, this study employed a multilevel interactive interpretation approach on remote sensing imagery to extract mangrove information. The extracted results were then compared with the findings of this study. In the context of multilevel interactive verification of mangrove extraction, experts collaborated to review and validate the extracted mangrove maps (Figure 6). (A comparison of the results of hybrid U-Net extraction of mangroves with multi-person interactive validation for DFJ and TSG is shown in the Supplementary Materials (Figures S3 and S4).) The key steps involved in this interactive verification process included data sharing, a collaboration platform, comparative analysis, discussion and consensus, expert guidance, and iterative refinement. The verification process was carried out independently to ensure unbiased results.

Evaluation of the Hybrid U-Net Model with Existing Data Products
We compared the model results with the baseline years of the existing dataset separately (Figures 7 and 8). Overall, the results of this study were more consistent with the 2/10 m resolution dataset, and in some areas, the results of this study were more reliable than the existing dataset, especially in the extraction of sparse and low-growth young forests and narrow mangrove belts along embankments (Figures 7 and 8). (The hybrid U-Net mangrove extraction results compared with existing data products for DFJ and TSG are shown in the Supplementary Materials (Figures S5 and S6).) More details on the comparison are provided in the Supplementary Materials (Figure S7). The 10/30 m data product did not provide sufficiently detailed spatial information. Factors such as the spatial resolution of remote sensing images, tidal inundation, and information extraction strategies can affect the accuracy of mangrove extraction. Previous studies have shown that the area error measured with remote sensing images increases gradually as spatial resolution becomes coarser [55]. Therefore, for narrow and fragmented mangrove areas, the extraction effect of medium-to-low-resolution remote sensing images is poor.

According to the base year sub-data products, the average area values of mangroves in ZZW, DFJ, and TSG were 927.36 ha, 787.55 ha, and 941.58 ha, respectively (Figures 7 and 8 and Table S1). Among them, ZZW and TSG showed a trend of first increasing and then decreasing in mangrove area at the four time points, while the mangrove area in DFJ showed a continuous increasing trend. From 2015 to 2020, the overall mangrove area in ZZW increased by 147.79 ha (18.9%), and that in DFJ increased by 104.66 ha (13.94%), while the mangrove area in TSG remained almost unchanged.
A comparison with 10/30 m data products in this study revealed significant differences in the changes in mangrove area in TSG and ZZW for 30 m data products. The large discrepancy of 693.96 ha in TSG from 2018 (30 m reference) may be because China has many small mangrove patches, as well as narrow mangroves along levees, which were not captured by the 30 m resolution product due to the lack of fine-scale information and different pixel sizes. In contrast, this study's results perform better at extracting detailed mangrove patches. In addition, differences in some areas may be due to different methods used for image acquisition and mangrove mapping in different years. From the time series of different mangrove datasets in Table S1, it can be seen that China's mangrove area has remained stable since 2015, with fluctuations in some regions, which is consistent with the results of [13]. The results of this study were generated using 2 m high-resolution images, with smooth patch edges and complete structures of small patches. In contrast, the mangrove patches in the other five datasets showed serious saw-toothed edge effects and insufficient accuracy for small patches.
The results of this study are closer to the 10 m product in terms of mangrove area. Considering the pixel scale effect, the area mapped at 2 m resolution should be smaller than that mapped at 10 m resolution, which is consistent with the results of this study. For the Pearl River Estuary, where there are mainly large independent mangroves, the differences between the three products were not significant. For TSG and DFJ, there are narrow mangrove belts along the embankments and large independent mangroves. From a spatial perspective, the 10/30 m data have significant omissions for such mangroves. Considering the influence of the scale effect, although the overall changes in the mangrove area for DFJ among the three datasets are not significant, the 2/10 m results are much better than the 30 m result in terms of spatial information. For DFJ, the extraction of the western mangroves using 10/30 m data is very poor, while the 2 m data clearly show the spatial distribution of the western mangroves in DFJ.
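The area and change figures in this comparison follow from simple arithmetic on pixel counts. The sketch below is illustrative only: the helper names are hypothetical, and the implied 2015 baseline is back-calculated from the rounded increase and percentage reported for ZZW, so it is approximate.

```python
M2_PER_HA = 10_000.0

def area_ha(n_pixels, pixel_size_m):
    """Mapped area in hectares for a count of square pixels."""
    return n_pixels * pixel_size_m**2 / M2_PER_HA

def percent_change(old_area, new_area):
    """Relative change from old_area to new_area, in percent."""
    return (new_area - old_area) / old_area * 100.0

def implied_baseline(increase, percent):
    """Starting area implied by an absolute increase and its percentage."""
    return increase / (percent / 100.0)

# One 30 m pixel covers the same ground as 225 pixels at 2 m
pixels_2m_per_30m = (30.0 / 2.0) ** 2

# 2.5 million mangrove pixels at 2 m correspond to 1000 ha
example_area = area_ha(2_500_000, 2.0)

# ZZW: +147.79 ha reported as 18.9% implies a 2015 area of roughly 782 ha
zzw_2015 = implied_baseline(147.79, 18.9)
```

The first figure makes the scale effect concrete: a narrow belt a few metres wide occupies only a small fraction of a 30 m pixel, so it is either dropped or counted as a full 900 m² pixel in the coarser product.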

Stability of the Hybrid U-Net Network Performance
This article proposes an integrated spatial, temporal, and methodological (hybrid U-Net model) optimization for mangrove extraction. In this study, the spatial resolution of the mangroves extracted is 2 m, allowing for the capture of finer mangrove patches, especially small and narrow ones. Mangroves are primarily found near estuaries, and small and narrow mangrove patches are quite common. This study enhances the quality of training samples based on mangrove habitat and tidal information, utilizing a semantic segmentation model like U-Net that requires only one year of samples to map mangroves for multiple years. Compared with other machine learning methods that require repeated acquisition of stable samples, this approach enables rapid detection of mangrove changes.
The output of the hybrid U-Net network model has only two categories, mangrove and non-mangrove, making it a binary classification model. Compared with other models, this approach takes into account the potential growth areas of mangroves and the impact of tides on mangrove extraction, and combines the advantages of semantic segmentation models (reducing dependence on samples). This method is more suitable for large-scale mangrove extraction tasks [56]. Additionally, the combination of GF and ZY series satellite sensors compensates for the shortcomings of individual satellite image information, ensuring high-resolution mapping at a 2 m spatial resolution.
The hybrid U-Net network model has three unique advantages in fine-scale mangrove mapping: (1) It excludes non-vegetation cover before mangrove extraction, avoiding the impact of non-vegetation on sparse mangroves and saving post-processing work; (2) It utilizes vegetation indices to determine the low-tide moments, ensuring the accuracy of extracting the maximum extent of mangroves; (3) It requires only one year of samples for training, and the trained model can be applied to infer mangroves for other years within the same area. This reduces the reliance on samples typical of traditional deep learning models.
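Advantages (1) and (2) can be illustrated with a small sketch. It rests on assumptions that the text does not confirm: NDVI is used as the vegetation index, 0.2 is used as the vegetation threshold, and the lowest-tide scene is taken to be the candidate scene with the most vegetated pixels inside the habitat zone. All function names are hypothetical.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized difference vegetation index; 0 where both bands are 0."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = nir + red
    safe_total = np.where(total == 0, 1.0, total)  # avoid division by zero
    return np.where(total == 0, 0.0, (nir - red) / safe_total)

def vegetation_mask(nir, red, threshold=0.2):
    """True where the pixel is treated as vegetation (assumed threshold)."""
    return ndvi(nir, red) > threshold

def pick_low_tide_scene(scenes, habitat, threshold=0.2):
    """Pick the scene with the largest vegetated area inside the habitat zone.

    scenes maps a scene name to a (nir, red) pair of arrays; habitat is a
    boolean array marking potential mangrove habitat.
    """
    habitat = np.asarray(habitat, dtype=bool)
    counts = {
        name: int(np.sum(vegetation_mask(nir, red, threshold) & habitat))
        for name, (nir, red) in scenes.items()
    }
    return max(counts, key=counts.get), counts

# Two hypothetical 2 x 2 scenes of the same site at different tide levels
red = [[0.1, 0.1], [0.1, 0.1]]
scenes = {
    "high_tide": ([[0.5, 0.1], [0.1, 0.1]], red),
    "low_tide": ([[0.5, 0.5], [0.5, 0.1]], red),
}
habitat = np.ones((2, 2), dtype=bool)
best, counts = pick_low_tide_scene(scenes, habitat)
```

Masking with the habitat zone before counting keeps inland vegetation from influencing the choice, and the vegetation mask of the selected scene can then be passed on as the only area in which the segmentation model looks for mangroves.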

Advantages of High-Resolution Data Extraction for Mangrove Mapping
Currently, most of the published mangrove distribution datasets use medium-resolution satellite data with a spatial resolution of 30 m or synthetic aperture radar data at the L-band with a resolution of 25 m (Figure 9).In these medium-resolution images, the vegetation texture and spatial details are missing, leading to limitations in accurately identifying mangrove forests.In addition, small and irregularly shaped mangrove areas are difficult to accurately identify and extract.

This study uses satellite remote sensing images with a resolution of 2 m from GF and ZY series sensors for mangrove mapping. The finer spatial resolution allows for the delineation of small and scattered mangrove patches and narrow strips along the coastline. This is particularly important for mangrove mapping in China, where there are numerous small and narrow patches of mangroves. Young plants that were newly planted but had not yet formed forest were not included in the existing dataset [23]. The use of high-spatial-resolution remote sensing images enables the extraction of more detailed mangrove distributions at a finer spatial scale and allows for rapid detection of subtle changes. A detailed understanding of the distribution of and changes in these mangroves is of great significance for allowing local management authorities to carry out mangrove ecological protection and restoration and to coordinate coastal economic development with ecological protection.

The Limitations of the Hybrid U-Net Network Model
In general, remote sensing-based surveys of mangrove forests only include areas with a tree crown density of 20% or more [23]. Seedlings and mangrove forests with an area of less than 4 m² cannot currently be extracted due to the limitations in sensor accuracy, which may result in an underestimation of model results compared with actual conditions.
There are other types of vegetation growing on the tidal flats where mangrove forests grow along the coast of the Beibu Gulf in Guangxi, China. These vegetation types are easily confused with mangrove forests in image interpretation. The most common types of vegetation are Spartina alterniflora and reeds (Figure 10) [51]. Many studies have provided evidence that Spartina alterniflora can entirely supplant mangrove forests within a short span of 1 to 5 years [60]. The latest research indicates that the invasion of Spartina alterniflora threatens the survival and reproduction of mangroves, as Spartina alterniflora inhibits the growth and expansion of mangroves [20,61]. There is spectral confusion between Spartina alterniflora and mangroves, which may lead to an overestimation of mangrove areas by the hybrid U-Net [62]. In future research, it will be possible to improve the accuracy of mangrove mapping further by overlaying thematic maps of misclassified species.

Conclusions
This paper presents a spatial-temporal method optimization approach (hybrid U-Net) based on mangrove habitats, low tide, and semantic segmentation models to address challenges in satellite remote sensing of mangrove wetlands, including spectral similarity confusion, tidal uncertainty, and difficulties in obtaining stable samples. The paper demonstrates the advantages of combining tidal water levels and mangrove habitat information in mangrove extraction and mapping.
The hybrid U-Net model overcomes the dependency of traditional machine learning algorithms on large sample datasets. The use of mangrove habitats and low-tide periods helps reduce the interference of surrounding vegetation in the extraction of the mangroves. This method was applied to extract mangroves in three typical areas of the Beibu Gulf in 2015, 2018, 2019, and 2020. This method is less sample-dependent and more accurate and is more suitable for the rapid detection of subtle changes in large-scale mangroves. The results of the hybrid U-Net model were reliable for extracting sparse, slow-growing young mangroves and narrow mangrove strips along roadsides.
Future research will focus on using one or a set of indicators or rules to approximate classification results, attempting to reduce the impact of spectral uncertainty on model accuracy under different climatic conditions. This high-precision mangrove extraction method will provide an important basis for refined calculations and assessments of mangrove ecosystem changes, mangrove species classification, marine ecosystem management, blue carbon restoration, and biodiversity protection in the future.

Remote Sens. 2023, 15
years (2015-2020). Accuracy assessment: the accuracy of the mangrove maps was verified based on validation sample sites and field survey results, and quantitative and qualitative analyses were performed by comparing multi-person visual interpretations with existing data.

Figure 2. Overview of the workflow for the production of 2 m spatial resolution mangrove datasets.


Step 3: U-Net model mangrove extraction. First, multilevel interactive verification of the maximum range of mangrove areas (2021) was performed. Then, each detected mangrove patch was adjusted to obtain the best mangrove patch samples. Finally, the model was trained on the mangrove patch samples to obtain a mangrove extraction model. The cloud-free low-tide images from other years (2015-2020) were used as inputs to the mangrove extraction model to obtain the mangrove map for the corresponding year.


Figure 4. U-Net loss curves based on 2021 samples: (a) loss curve at ZZW, (b) loss curve at DFJ, (c) loss curve at TSG.

Figure 5. The hybrid U-Net results for ZZW at the Beibu Gulf. (a,b) The satellite image of mangrove area in 2021, (c) mangrove habitat information results, (d) low tide level acquisition results, (e) the hybrid U-Net results for 2020, (f) the hybrid U-Net results for 2019, (g) the hybrid U-Net results for 2018, (h) the hybrid U-Net results for 2015.


Figure 6. Comparison of the results of hybrid U-Net extraction of mangroves with multi-person interactive validation for ZZW: (a,b) The satellite image of mangrove area at ZZW in October 2021. (c,e,g,i) The hybrid U-Net results for 2020, 2019, 2018, 2015. (d,f,h,j) Multilevel interactive validation for 2020, 2019, 2018, 2015.

Figure 8. The hybrid U-Net results for mangrove area changes in the ZZW, DFJ, and TSG areas for 2015, 2018, 2019, 2020 with the base year sub-data products.


Supplementary Materials:
The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/rs15225271/s1. (Figure S1. The Hybrid U-Net result of DFJ at the Beibu Gulf. (a, b) The satellite image of mangrove area in 2021, (c) Mangrove habitat information, (d) Low tide level acquisition, (e) The Hybrid U-Net result in 2020, (f) The Hybrid U-Net result in 2019, (g) The Hybrid U-Net result in 2018, (h) The Hybrid U-Net result in 2015. Figure S2. The Hybrid U-Net result of TSG at the Beibu Gulf. (a, b) The satellite image of mangrove area in 2021, (c) Mangrove habitat information, (d) Low tide level acquisition, (e) The Hybrid U-Net result in 2020, (f) The Hybrid U-Net result in 2019, (g) The Hybrid U-Net result in 2018, (h) The Hybrid U-Net result in 2015.
Figure S3: Comparison of the results of hybrid U-Net extraction of mangroves with

Table 1. Parameters of multispectral data used in this study (m for meters).

Table 2. Number of sample points in each area and their sources. (Note: GER stands for Google Earth high-resolution image interpretation of random points; GEC stands for Google Earth high-resolution image interpretation of area change over two consecutive years; and FSs stands for field surveys.)

Table 3. Accuracy evaluation of hybrid U-Net model for various types (mangrove, non-mangrove).