Swin-UperNet: A Semantic Segmentation Model for Mangroves and Spartina alterniflora Loisel Based on UperNet

Abstract: As an ecosystem in transition from land to sea, mangroves play a vital role in wind and wave protection and biodiversity maintenance. However, the invasion of Spartina alterniflora Loisel seriously damages the mangrove wetland ecosystem. To protect mangroves scientifically and dynamically, a semantic segmentation model for mangroves and Spartina alterniflora Loise was proposed based on UperNet (Swin-UperNet). In the proposed Swin-UperNet model, a data concatenation module was proposed to make full use of the multispectral information of remote sensing images, the backbone network was replaced with a Swin transformer to improve the feature extraction capability, and a boundary optimization module was designed to optimize the rough segmentation results. Additionally, a linear combination of cross-entropy loss and Lovasz-Softmax loss was taken as the loss function of Swin-UperNet, which could address the problem of unbalanced sample distribution. Taking GF-1 and GF-6 images as the experiment data, the performance of the Swin-UperNet model was compared against that of other segmentation models, including PSPNet, PSANet, DeepLabv3, DANet, FCN, OCRNet, and DeepLabv3+, in terms of pixel accuracy (PA), mean intersection over union (mIoU), and frames per second (FPS). The results showed that the Swin-UperNet model achieved the best PA of 98.87% and mIoU of 90.0%, and the efficiency of the Swin-UperNet model was higher than that of most models. In conclusion, Swin-UperNet is an efficient and accurate model for the synchronous segmentation of mangroves and Spartina alterniflora Loise, which will provide a scientific basis for Spartina alterniflora Loise monitoring and mangrove resource conservation and management.


Introduction
Mangroves and Spartina alterniflora Loise are the primary vegetation communities in coastal wetlands. Mangroves grow at the junction of land and sea and play a vital role in purifying seawater, buffering wind and waves, storing carbon, and maintaining biodiversity [1]. The invasion of Spartina alterniflora Loise has changed the ecological structure of mangrove wetlands and seriously affected the function and stability of the mangrove wetland ecosystem. Therefore, knowledge of the spatial distribution of mangroves and Spartina alterniflora Loise is important for the conservation and restoration of mangrove resources [2,3].
Remote sensing technology has the advantages of image-spectrum merging, wide detection range, less restriction by ground conditions, and fast information acquisition and has been widely used in practical applications, such as urban planning [4,5], traffic monitoring [6,7], land cover classification [8,9], and change detection [10,11]. Using remote sensing images, several methods have been proposed to segment mangroves and Spartina alterniflora Loise [12-14], such as characteristics-based methods and deep learning methods. Characteristics-based methods are designed based on the spectral reflectance or shape features of objects, and each pixel is analyzed. For example, Pham et al. [15] modeled, mapped, and analyzed the biomass change between 2000 and 2011 of mangrove forests in the Cangio region in Vietnam with characteristics-based image analysis and machine learning algorithms. Hermon et al. [16] developed a model of mangrove land cover change to analyze the change in mangroves. Pham et al. [17] used a characteristics-based approach to segment images from different LANDSAT sensors (TM, ETM+, and OLI) and used a geographic information system (GIS) to study the changes in mangroves during different periods from 1989 to 2013. Characteristics-based methods are highly accurate but time-consuming for segmenting mangroves or Spartina alterniflora Loise. Motivated by the success of deep learning, different deep learning models have been used to segment objects in remote sensing images.
With the development of convolutional neural networks (CNNs) in computer vision, deep learning has opened up new research ideas for semantic segmentation [18], and AlexNet [19], VGGNet [20], and GoogLeNet [21] have been proposed for semantic segmentation successfully. For remote sensing images, fully convolutional network (FCN) [22], U-Net [23], SegNet [24], pyramid scene parsing network (PSPNet) [25], DeepLab [26], and unified perceptual parsing network (UperNet) [27] have been proposed for semantic segmentation. Kampffmeyer et al. [28] proposed a deep CNN for land cover mapping in remote sensing images with a focus on urban areas. Hamaguchi et al. [29] introduced a local feature extraction module to a CNN and acquired remarkably good results, especially for small objects. Gao et al. [30] developed a semantic segmentation model for extracting mangroves in remote sensing images by using pixel classification. Several deep learning methods have been proposed for mangrove segmentation. However, in many cases, small areas of mangroves are missed in the remote sensing images. Spartina alterniflora Loise segmentation is also critical for the analysis of remote sensing data. Currently, segmentation methods for Spartina alterniflora Loise are rarely reported, especially methods for the synchronous segmentation of mangroves and Spartina alterniflora Loise in remote sensing images.
UperNet is a multi-task vision model; it can perform scene recognition, target detection, and region segmentation simultaneously. Thus, the hierarchical structure of UperNet can contribute to object differentiation with a low computation cost. However, when UperNet is applied to segment objects in remote sensing images, the multiband remote sensing images present a challenge for feature extraction. The application of transformers in computer vision provides a new research direction for feature extraction. Different from a CNN, a transformer with self-attention establishes the connection between image locations at the first layer of information processing. Vision transformer (ViT) [31], transformer in transformer (TNT) [32], pyramid vision transformer (PVT) [33], tokens-to-token ViT (T2T-ViT) [34], and Swin transformer [35] have gradually been proposed for extracting image features. In addition, the Swin transformer can solve the problems of large variations in the scale of visual entities and the high resolution of pixels (Figure 1). The patch merging layer is designed to build a hierarchical structure (Figure 1a). The shifted windowing scheme brings greater efficiency by limiting self-attention computation to non-overlapping local windows while also allowing for cross-window connection (Figure 1b). W-MSA and SW-MSA attention mechanisms are applied in the shifted windowing scheme to handle two consecutive feature maps (Figure 1c).
Mangroves and Spartina alterniflora Loise have similar spectral and textural characteristics to other vegetation; therefore, it is difficult to segment mangroves and Spartina alterniflora Loise from other vegetation. Mangroves and Spartina alterniflora Loise also coexist; it is, therefore, a challenge for the semantic segmentation model to distinguish between the mangroves and the Spartina alterniflora Loise. Moreover, the size and distribution of mangroves and Spartina alterniflora Loise are not uniform. Therefore, it is still worth designing a novel model to improve the accuracy of segmenting mangroves and Spartina alterniflora Loise in remote sensing images.
To achieve high efficiency and high accuracy in the synchronous segmentation of mangroves and Spartina alterniflora Loise, a semantic segmentation model based on UperNet was proposed (Swin-UperNet), which was inspired by the hierarchical structure of UperNet and the Swin transformer's method of handling image-encoded data. In the proposed model, a data concatenation module was proposed to make full use of the spectral information of images, which could distinguish between the mangroves and the Spartina alterniflora Loise. The backbone network was replaced with a Swin transformer to improve the feature extraction ability, especially for small areas of mangroves and Spartina alterniflora Loise. A boundary optimization module was designed to optimize the rough segmentation results, which could further improve the accuracy of segmentation of mangroves and Spartina alterniflora Loise. In addition, the loss function was substituted with a linear combination of cross-entropy loss and Lovasz-Softmax loss to solve the unbalanced sample distribution problem. Swin-UperNet can be an efficient semantic segmentation model for the synchronous segmentation of mangroves and Spartina alterniflora Loise.

Data
The experimental datasets were acquired by GF-1 and GF-6 from 23 August 2016 to 18 December 2021, along the northeastern coast of Beibu Gulf, Guangxi, China, with a cloud coverage of less than 5% and a spatial resolution of 8 m. Figure 2 shows the location of the study region. Table 1 shows the information of the studied remote sensing images.
The GF-1 satellite carries a 2 m panchromatic camera, an 8 m multispectral camera, and four 16 m wide field view (WFV) cameras and was launched by China on 26 April 2013. To segment mangroves and Spartina alterniflora Loise, GF-1 multispectral images with 8 m resolution were chosen. The multispectral bands of this image consist of blue (0.45~0.52 µm), green (0.52~0.59 µm), red (0.63~0.69 µm), and near infrared (NIR) (0.77~0.89 µm) bands (Table 2).
The GF-6 satellite is configured with a 2 m panchromatic/8 m multispectral high-resolution camera and a 16 m multispectral medium-resolution wide-field-view camera. The 2 m panchromatic/8 m multispectral camera has an observation width of 90 km, and the 16 m multispectral camera has an observation width of 800 km. The GF-6 satellite was successfully launched at Jiuquan Satellite Launch Center on 2 June 2018. Similar to the case for the GF-1 images, GF-6 multispectral images with 8 m resolution were chosen. The multispectral bands of this image consist of blue (0.45~0.52 µm), green (0.52~0.60 µm), red (0.63~0.69 µm), and near infrared (NIR) (0.76~0.90 µm) bands (Table 3).

Data Preprocessing
The original GF-1 and GF-6 images cover a much larger area than the mangroves or Spartina alterniflora Loise, and the large size of remote sensing images leads to a large amount of computation. Therefore, smaller images containing the mangrove or Spartina alterniflora Loise were cropped manually. Figure 3 shows a schematic diagram of the original image and the cropped images; 14 smaller images were cropped from the original image.

In the cropped images, the area of the mangrove and Spartina alterniflora Loise was still small, which would lead to the unbalanced sample distribution problem. Hence, the mangrove and Spartina alterniflora Loise data were expanded. Figure 4 shows the flow and examples of the expansion. A smaller image of 80 × 80 pixels was randomly selected in the cropped image, and if the percentage of the area of mangrove and Spartina alterniflora Loise was greater than 60%, the selected smaller image was saved. Finally, the saved images were randomly embedded into the cropped image, and a new image was generated.
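The expansion procedure described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the implementation used in this study: the label ids in TARGET_CLASSES, the (H, W, C) image layout, the number of random trials, and the function names collect_patches and embed_patches are all hypothetical. Only the 80 × 80 patch size and the 60% threshold come from the text.

```python
import numpy as np

# Hypothetical label ids: 1 = mangrove, 2 = Spartina alterniflora, 0 = other.
TARGET_CLASSES = (1, 2)


def collect_patches(image, label, size=80, min_ratio=0.6, trials=500, rng=None):
    """Randomly sample size x size patches and keep those whose share of
    target-class pixels is greater than min_ratio."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = label.shape
    saved = []
    for _ in range(trials):  # the number of trials is an assumption
        top = int(rng.integers(0, h - size + 1))
        left = int(rng.integers(0, w - size + 1))
        lab = label[top:top + size, left:left + size]
        ratio = np.isin(lab, TARGET_CLASSES).mean()
        if ratio > min_ratio:
            img = image[top:top + size, left:left + size]
            saved.append((img.copy(), lab.copy()))
    return saved


def embed_patches(image, label, patches, rng=None):
    """Paste saved patches at random positions and return a new image/label pair."""
    rng = np.random.default_rng() if rng is None else rng
    new_image, new_label = image.copy(), label.copy()
    h, w = label.shape
    for img_patch, lab_patch in patches:
        size = lab_patch.shape[0]
        top = int(rng.integers(0, h - size + 1))
        left = int(rng.integers(0, w - size + 1))
        # Image and label are overwritten together so they stay aligned.
        new_image[top:top + size, left:left + size] = img_patch
        new_label[top:top + size, left:left + size] = lab_patch
    return new_image, new_label
```

Pasting the image patch and its label patch at the same position keeps the ground truth consistent with the expanded image.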

Methods
Figure 5 shows the workflow of Swin-UperNet. Taking UperNet as the framework, the backbone network was replaced with a Swin transformer to improve the feature extraction capability (Figure 5B). In the Swin-UperNet model, a data concatenation module was proposed to make full use of the multispectral information of remote sensing images (Figure 5A); a boundary optimization module was designed to refine the rough segmentation results (Figure 5C); and a linear combination of cross-entropy loss and Lovasz-Softmax loss was taken as the loss function to address the problem of unbalanced sample distribution.

Data Concatenation Module
In the data concatenation module, 8 channels were used to enhance the spectral information for mangrove and Spartina alterniflora Loise segmentation, including blue, green, red, and NIR bands, normalized-difference vegetation index (NDVI), forest discrimination index (FDI), difference vegetation index (DVI), and normalized-difference water index (NDWI).
The NIR and red bands are useful for extracting information on different vegetation [40]. The spectral vegetation and water indexes are spectral measures of canopy greenness [41], which better reflect the differences in vegetation cover and growth conditions and are especially suitable for vegetation monitoring. NDVI reflects the growth status and spatial distribution density of vegetation and is widely used for vegetation assessment. FDI reflects the level of vegetation density classification in forest monitoring and is frequently applied in mangrove distribution research. DVI reflects the change in soil background and is used for vegetation ecology monitoring. NDWI reflects information on water bodies, which contributes to distinguishing between mangroves and water bodies. Table 4 shows the calculation method of these spectral vegetation and water indexes.
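As an illustration of how the eight input channels could be assembled, the following NumPy sketch stacks the four bands with the four indexes. Table 4 is not reproduced here, so the formulas below are the commonly used definitions and are assumed, not confirmed, to match it: NDVI = (NIR - R)/(NIR + R), FDI = NIR - (R + G), DVI = NIR - R, and NDWI = (G - NIR)/(G + NIR). The function name, the small eps term, and the channel order are likewise hypothetical.

```python
import numpy as np


def build_eight_channel_input(blue, green, red, nir, eps=1e-6):
    """Stack 4 spectral bands and 4 indexes into an array of shape (8, H, W).

    The index formulas are the commonly used definitions (an assumption;
    Table 4 of the paper is the authoritative source). eps avoids division
    by zero and is not part of the original definitions.
    """
    blue, green, red, nir = (
        np.asarray(band, dtype=np.float32) for band in (blue, green, red, nir)
    )
    ndvi = (nir - red) / (nir + red + eps)      # normalized-difference vegetation index
    fdi = nir - (red + green)                   # forest discrimination index
    dvi = nir - red                             # difference vegetation index
    ndwi = (green - nir) / (green + nir + eps)  # normalized-difference water index
    return np.stack([blue, green, red, nir, ndvi, fdi, dvi, ndwi], axis=0)
```

For a 480 × 480 image, the result has shape (8, 480, 480), which corresponds to the 8-channel input described above.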

Boundary Optimization Module
The boundary optimization module (BOM) was designed with inspiration from a residual block in ResNet (Figure 5C). The BOM avoided the problem of low-level feature vanishing caused by convolution operations and adjusted the rough segmentation results using a low-level feature map. The BOM added the output of two consecutive conv blocks using a skip connection. The conv block contained conv3×3, batch normalization, and ReLU activation function operations.

Loss Function
A linear combination of cross-entropy loss and Lovasz-Softmax loss was taken as the loss function to address the problem of unbalanced sample distribution [42]. The cross-entropy loss function is a pixel-wise loss function used in semantic segmentation tasks to measure the variability of pixels between the predicted value and the ground truth value, which is defined as follows:

$$\mathrm{Loss}_{CE} = -\frac{1}{N} \sum_{i=1}^{N} y_i^{\top} \log \hat{y}_i \tag{1}$$

where $N$ is the number of pixels, $y_i$ is the ground truth class vector of pixel $i$, and $\hat{y}_i$ is the output of the model for pixel $i$. The cross-entropy loss function considers the probability that the prediction is correctly labeled, but when the number of samples in different categories is unbalanced, it will ignore the learning of the foreground class and affect the efficiency of the algorithm. For example, when the number of samples in the background class is much larger than the number of samples in the foreground class, the background class will dominate the learning. In this study, the number of samples of Spartina alterniflora Loise was smaller than that of the other classes, and the unbalanced sample distribution is a serious problem for the segmentation of mangroves and Spartina alterniflora Loise. Hence, Lovasz-Softmax loss was selected to solve the unbalanced sample distribution problem and to optimize the accuracy of segmentation.
Lovasz-Softmax loss was proposed by Berman et al. for the Jaccard index (also called the intersection over union). The Jaccard index of class $c$ is defined as

$$J_c(y^*, \tilde{y}) = \frac{|\{y^* = c\} \cap \{\tilde{y} = c\}|}{|\{y^* = c\} \cup \{\tilde{y} = c\}|} \tag{2}$$

where $y^*$ is a vector of the ground truth labels and $\tilde{y}$ is a vector of the predicted labels. Then, the Jaccard index loss can be defined as

$$\Delta_{J_c}(y^*, \tilde{y}) = 1 - J_c(y^*, \tilde{y}) \tag{3}$$

and we can define the set of mispredicted pixels for class $c$ as

$$M_c(y^*, \tilde{y}) = \{y^* = c, \tilde{y} \neq c\} \cup \{y^* \neq c, \tilde{y} = c\} \tag{4}$$

Equation (3) can be rewritten with $M_c$ as

$$\Delta_{J_c}: M_c \in \{0, 1\}^p \mapsto \frac{|M_c|}{|\{y^* = c\} \cup M_c|} \tag{5}$$

However, this loss function is not differentiable. In order to optimize the Jaccard index for the training model, the discrete loss was smoothly extended based on a submodular analysis of the set function. The smooth extension, named the Lovasz extension of a set function $\Delta$, is defined as

$$\bar{\Delta}: m \in \mathbb{R}^p \mapsto \sum_{i=1}^{p} m_i \, g_i(m) \tag{6}$$

where $p$ is the number of pixels in an image, $m$ is the vector of pixel errors for class $c$, and $g_i(m)$ is defined as

$$g_i(m) = \Delta(\{\pi_1, \ldots, \pi_i\}) - \Delta(\{\pi_1, \ldots, \pi_{i-1}\}) \tag{7}$$

where $\{\pi_1, \ldots, \pi_i\}$ denotes a permutation ordering the components of $m$ in decreasing order, such that $m_{\pi_1} \geq m_{\pi_2} \geq \cdots \geq m_{\pi_p}$. In a multiclass segmentation task, the pixel errors vector of class $c$ can be defined as

$$m_i(c) = \begin{cases} 1 - f_i(c), & \text{if } c = y_i^* \\ f_i(c), & \text{otherwise} \end{cases} \tag{8}$$

where $f_i(c) \in [0, 1]$ is the predicted probability of pixel $i$ for class $c$. Then, the Lovasz-Softmax loss can be defined as

$$\mathrm{Loss}_{LS} = \frac{1}{C} \sum_{c=1}^{C} \overline{\Delta_{J_c}}(m(c)) \tag{9}$$

where $C$ is the number of classes. Therefore, to achieve sample distribution balance and excellent segmentation accuracy, the loss function of Swin-UperNet was defined as

$$\mathrm{Loss} = \alpha \, \mathrm{Loss}_{CE} + (1 - \alpha) \, \mathrm{Loss}_{LS} \tag{10}$$

where $\alpha$ is a weight parameter to balance the cross-entropy loss and Lovasz-Softmax loss functions.
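To make the construction of the combined loss concrete, the sketch below evaluates the cross-entropy term, the Lovasz-Softmax term built from the per-class pixel-error vector, and their weighted sum on flattened predictions. It is a NumPy illustration only and is not differentiable training code; the function names, the (N, C) probability layout, and the default weight alpha = 0.5 are assumptions, since the weight used in this study is not stated in this section. The helper lovasz_grad follows the cumulative-sum formulation commonly used to compute the increments of the Jaccard loss along the sorted errors.

```python
import numpy as np


def cross_entropy(probs, labels, eps=1e-12):
    """Mean pixel-wise cross-entropy. probs: (N, C) softmax outputs, labels: (N,) ints."""
    n = labels.shape[0]
    return float(-np.mean(np.log(probs[np.arange(n), labels] + eps)))


def lovasz_grad(fg_sorted):
    """Increments g_i of the Jaccard loss, given 0/1 ground-truth membership
    sorted by decreasing pixel error."""
    total = fg_sorted.sum()
    intersection = total - np.cumsum(fg_sorted)
    union = total + np.cumsum(1.0 - fg_sorted)
    jaccard = 1.0 - intersection / union
    jaccard[1:] = jaccard[1:] - jaccard[:-1]
    return jaccard


def lovasz_softmax(probs, labels):
    """Average over all classes of the Lovasz extension of the Jaccard loss."""
    num_classes = probs.shape[1]
    losses = []
    for c in range(num_classes):
        fg = (labels == c).astype(np.float64)
        errors = np.abs(fg - probs[:, c])  # 1 - f_i(c) if c is the true class, else f_i(c)
        order = np.argsort(-errors, kind="stable")  # decreasing error
        losses.append(np.dot(errors[order], lovasz_grad(fg[order])))
    return float(np.mean(losses))


def combined_loss(probs, labels, alpha=0.5):
    """alpha * cross-entropy + (1 - alpha) * Lovasz-Softmax (alpha value assumed)."""
    return alpha * cross_entropy(probs, labels) + (1.0 - alpha) * lovasz_softmax(probs, labels)
```

With one-hot predictions that match the labels, both terms are zero; with predictions that are wrong for every pixel, the Lovasz-Softmax term reaches its maximum of 1, independent of how many pixels each class contains. This is the property that makes it less sensitive to class imbalance than the cross-entropy term alone.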

Result and Discussion
To evaluate the segmentation performance of the Swin-UperNet model, two experiments were designed. In the ablation experiment, each improved component was analyzed. In the comparison experiment, the segmentation efficiency and accuracy of the Swin-UperNet model were compared against those of other models, including PSPNet, PSANet [43], DeepLabv3 [44], DANet [45], FCN, OCRNet [46], and DeepLabv3+ [47].

Experimental Data
Based on 27 GF-1/GF-6 remote sensing images, 200 images with a size of 480 × 480 pixels were cropped. These 200 images were then flipped horizontally, vertically, and diagonally, so that 800 remote sensing images were obtained in total. Therefore, the experimental dataset consisted of 800 remote sensing images with a size of 480 × 480 pixels and 8 channels. The 800 remote sensing images were then divided into three sets: 640 images for training, 60 images for testing, and 100 images for validation. To ensure the reliability and validity of the segmentation results, the training and testing data were independent. Before training, the images were randomly rescaled by a factor in the range of 0.5-1.5. Figure 6 shows the different input channels of the image and the ground truth, wherein the ground truth was labeled by experts.
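The flip-based enlargement from 200 to 800 images can be illustrated as follows. This is a sketch under assumptions: the (C, H, W) layout and the function name flip_augment are hypothetical, and the "diagonal" flip is interpreted here as flipping along both axes at once, which the text does not specify. In practice the same flips would be applied to the label maps.

```python
import numpy as np


def flip_augment(image):
    """Return the original image plus three flipped views. image: (C, H, W).

    The diagonal flip is assumed to be a combined horizontal and vertical flip.
    """
    horizontal = image[:, :, ::-1]   # mirror left-right
    vertical = image[:, ::-1, :]     # mirror top-bottom
    diagonal = image[:, ::-1, ::-1]  # both axes (assumed meaning of "diagonally")
    return [image, horizontal, vertical, diagonal]
```

Each cropped image yields four views, so 200 images give 200 × 4 = 800 images, which matches the 640 + 60 + 100 split described above.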

Experimental Setups
The batch size was set to 2; the optimizer was "AdamW"; the weight decay was 0.01; the initial learning rate was 6 × 10^-5; the learning rate strategy was "poly"; and the number of training iterations was 160,000. The segmentation models were implemented using PyTorch 1.7.1+cu101 with the MMSegmentation 0.11.0+ framework and executed on the Windows 10 platform with an NVIDIA Quadro RTX 3000 GPU.
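For reference, the training settings above and the shape of a "poly" learning rate schedule can be written out as below. The dictionary simply restates the values from the text and is not an actual configuration file from this study. The poly formula is the usual polynomial decay; the exponent power = 1.0 and the floor min_lr = 0.0 are assumptions, since neither value is reported here.

```python
# Values restated from the text; the dictionary layout itself is illustrative.
train_settings = {
    "batch_size": 2,
    "optimizer": "AdamW",
    "weight_decay": 0.01,
    "initial_lr": 6e-5,
    "lr_policy": "poly",
    "max_iters": 160_000,
}


def poly_lr(iteration, base_lr=6e-5, max_iters=160_000, power=1.0, min_lr=0.0):
    """Polynomial decay: lr = (base_lr - min_lr) * (1 - iter / max_iters) ** power + min_lr.

    power and min_lr are assumed values, not taken from the paper.
    """
    coeff = (1.0 - iteration / max_iters) ** power
    return (base_lr - min_lr) * coeff + min_lr
```

Under these assumptions, the learning rate starts at 6 × 10^-5, is halved at iteration 80,000, and reaches zero at iteration 160,000.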

Evaluation Metrics
Three evaluation metrics, namely pixel accuracy (PA), mean intersection over union (mIoU), and frames per second (FPS) [48], were used to evaluate the segmentation performance of the different models.
Pixel accuracy represents the ratio of the number of properly classified pixels to the total number of pixels. For $K + 1$ classes, PA is defined by

$$\mathrm{PA} = \frac{\sum_{i=0}^{K} p_{ii}}{\sum_{i=0}^{K} \sum_{j=0}^{K} p_{ij}}$$

where the $K + 1$ classes include $K$ foreground classes and 1 background class, and $p_{ij}$ is the number of pixels of class $i$ predicted as class $j$.
Mean intersection over union represents the average IoU over all classes, and IoU is the ratio of the area of intersection to the area of union between the predicted result and the label. mIoU is defined by

$$\mathrm{mIoU} = \frac{1}{K + 1} \sum_{i=0}^{K} \frac{|A_i \cap B_i|}{|A_i \cup B_i|}$$

where $A_i$ and $B_i$ denote the label and the predicted result for class $i$, respectively. FPS represents the number of frames processed per second, which is used to evaluate the computation efficiency of methods. FPS is defined by

$$\mathrm{FPS} = \frac{1}{t}$$

where $t$ represents the time taken to process an image.
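The three metrics can be computed from a confusion matrix as in the following NumPy sketch. The function names are hypothetical, and skipping classes that appear in neither the label nor the prediction when averaging IoU is a choice made here for illustration, not one described in this study.

```python
import numpy as np


def confusion_matrix(label, pred, num_classes):
    """Entry (i, j) counts pixels of class i predicted as class j."""
    idx = label.ravel() * num_classes + pred.ravel()
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def pixel_accuracy(cm):
    """Correctly classified pixels divided by all pixels."""
    return float(np.trace(cm) / cm.sum())


def mean_iou(cm):
    """Average of per-class intersection over union."""
    inter = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - inter
    valid = union > 0  # skip classes absent from both label and prediction
    return float(np.mean(inter[valid] / union[valid]))


def frames_per_second(seconds_per_image):
    """FPS = 1 / t, where t is the time taken to process one image."""
    return 1.0 / seconds_per_image
```

For example, a processing time of 0.1 s per image corresponds to 10 FPS.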

Ablation Experiment
Table 5 shows the comparison of evaluation metrics between models with different settings, including different loss functions, data processing (DP), data concatenation module (DCM), Swin transformer tiny (ST-Tiny), and boundary optimization module (BOM). Compared to the models with a different loss function, the Swin-UperNet model achieved the best mIoU of 56.36% and PA of 94.29%, which illustrated that the linear combination of cross-entropy loss and Lovasz-Softmax loss was effective for mangrove and Spartina alterniflora Loise segmentation. Compared to the model without the data processing operation, the mIoU and PA of the Swin-UperNet model increased by 29.8% and 2.76%, respectively. Compared to the model with a ResNet backbone network, the mIoU and PA of the Swin-UperNet model increased by 7.04% and 4.52%, respectively, which indicated that the Swin transformer was able to better extract object features for dealing with multichannel data. Compared to the model without a boundary optimization module, the Swin-UperNet model achieved the best mIoU and PA of 90.0% and 98.87%, respectively, which showed that the boundary optimization module could adjust the boundary segmentation and eliminate some misclassifications. These results indicate that the Swin-UperNet model could improve the segmentation accuracy of mangroves and Spartina alterniflora Loise.
Table 6 shows the comparison of evaluation metrics between models with different input channels. The Swin-UperNet model achieved the best mIoU and PA. Furthermore, the mIoU and PA for mangrove segmentation increased from 83.0% to 91.03% and from 87.37% to 98.35%, respectively. The mIoU and PA for Spartina alterniflora Loise segmentation increased from 63.18% to 79.65% and from 69.65% to 89.15%, respectively. These results indicate that adding spectral vegetation or water indexes to the data concatenation module of the Swin-UperNet model could improve the accuracy of mangrove and Spartina alterniflora Loise segmentation.

Comparison Experiment
We compared the proposed Swin-UperNet model against other models, including PSPNet, PSANet, DeepLabv3, DANet, FCN, OCRNet, and DeepLabv3+, to evaluate the segmentation performance for mangroves and Spartina alterniflora Loise. The segmentation results for mangroves and Spartina alterniflora Loise are shown in Figure 7, where the red, yellow, and blue areas denote mangrove, Spartina alterniflora Loise, and other, respectively. In the first and second rows, the segmentation results obtained with the Swin-UperNet model were more accurate, and the segmentation boundaries were closer to the ground truth. The third row shows that only the segmentation results of the Swin-UperNet model did not misclassify other categories as Spartina alterniflora Loise. In the fourth row, only the segmentation results of the Swin-UperNet model contained the small Spartina alterniflora Loise regions. Overall, Figure 7 shows that the segmentation results of the Swin-UperNet model were more consistent with the ground truth.

Conclusions
Changes in the growth and distribution of mangroves and Spartina alterniflora Loise affect the security of ecological systems. Due to tides and silt, field observation is difficult and ineffective. Here, we proposed a Swin-UperNet model for highly efficient and accurate segmentation of mangroves and Spartina alterniflora Loise in remote sensing images.
For the Swin-UperNet model, the mangrove and Spartina alterniflora Loise datasets were built, which provided data support for the deep learning models. The data processing method was designed, which increased the diversity of data and the size of Spartina alterniflora Loise samples. The data concatenation module was proposed, which selected some multispectral bands and indexes and was beneficial for segmenting mangroves and Spartina alterniflora Loise. The Swin transformer was chosen as the backbone network, which improved the accuracy of segmentation. The boundary optimization module was proposed, which optimized the rough segmentation result and reduced misclassification. A linear combination of cross-entropy loss and Lovasz-Softmax loss was chosen as the loss function, which addressed the problem of unbalanced sample distribution.
Three metrics were used to evaluate the accuracy and efficiency of the Swin-UperNet model, including pixel accuracy (PA), mean intersection over union (mIoU), and frames per second (FPS), for which the model achieved results of 98.87%, 90.0%, and 10, respectively. The experiment results demonstrated that the proposed Swin-UperNet model could achieve higher efficiency and accuracy in the synchronous segmentation of mangroves and Spartina alterniflora Loise in remote sensing images. Moreover, the combination of remote sensing technology and deep learning could overcome the difficulty in field observation. However, there are still several challenges for the segmentation of mangroves and Spartina alterniflora Loise: how to use multisource remote sensing data to improve the accuracy of Spartina alterniflora Loise segmentation, and how to predict the changing trends in the distribution of mangroves and Spartina alterniflora Loise over time.

Figure 2. Location of the study region. (A) Northeastern coast of Beibu Gulf. (B) Study region.

Figure 3. Schematic diagram of the original image and the cropped images.

Figure 6. Different input channels and the ground truth.

Electronics 2023, 12, x FOR PEER REVIEW

Figure 7. Segmentation results for mangroves and Spartina alterniflora Loise by different models.

Table 5. Evaluation metrics comparison between models with different settings.

Table 6. Evaluation metrics comparison between different input channels.

Table 7. Performance comparison between different segmentation models.