Article

Deep Learning with UAV Imagery for Subtropical Sphagnum Peatland Vegetation Mapping

1 Hubei Key Laboratory of Critical Zone Evolution, School of Geography and Information Engineering, China University of Geosciences, Wuhan 430078, China
2 Observation and Research Station of Shennongjia Dajiuhu Wetland Earth Critical Zone, Ministry of Natural Resources, China University of Geosciences, Wuhan 430078, China
* Author to whom correspondence should be addressed.
Remote Sens. 2025, 17(17), 2920; https://doi.org/10.3390/rs17172920
Submission received: 13 July 2025 / Revised: 16 August 2025 / Accepted: 17 August 2025 / Published: 22 August 2025

Abstract

Peatlands are vital for global carbon cycling, and their ecological functions are influenced by vegetation composition. Accurate vegetation mapping is crucial for peatland management and conservation, but traditional methods face limitations such as low spatial resolution and labor-intensive fieldwork. We used ultra-high-resolution UAV imagery captured across seasonal and topographic gradients and assessed the impact of phenology and topography on classification accuracy. Additionally, this study evaluated the performance of four deep learning models (ResNet, Swin Transformer, ConvNeXt, and EfficientNet) for mapping vegetation in a subtropical Sphagnum peatland. ConvNeXt achieved peak accuracy at 87% during non-growing seasons through its large-kernel feature extraction capability, while ResNet served as the optimal efficient alternative for growing-season applications. Non-growing seasons facilitated superior identification of Sphagnum and monocotyledons, whereas growing seasons enhanced dicotyledon distinction through clearer morphological features. Overall accuracy in low-lying humid areas was 12–15% lower than in elevated terrain due to severe spectral confusion among vegetation. SHapley Additive exPlanations (SHAP) of the ConvNeXt model identified key vegetation indices, the digital surface model, and select textural features as primary performance drivers. This study concludes that the combination of deep learning and UAV imagery presents a powerful tool for peatland vegetation mapping, highlighting the importance of considering phenological and topographical factors.

1. Introduction

Peatlands are crucial wetlands found worldwide [1]. They store about one-third of the world’s soil organic carbon [2] and play a vital role in the global carbon balance [3]. The carbon cycling in peatlands is closely linked to peat-forming plants [4,5,6]. Peatland vegetation also influences regional hydrological processes and water distribution [7]. Sphagnum, a key genus in peatland ecosystems, significantly affects the carbon sequestration capacity of peatlands [8,9]. Moreover, the composition of vegetation is important in determining whether a restored peatland acts as a carbon sink or source [10]. Therefore, mapping vegetation is essential for effectively managing, protecting, and restoring peatlands.
Due to micro-topographical variations, peatland vegetation exhibits substantial spatial heterogeneity in its distribution, posing a great challenge to vegetation mapping using conventional labor-intensive, time-consuming approaches [11,12,13]. Traditional commercial satellite products provide useful data for mapping peatland vegetation. However, their coarse spatial resolution limits the ability to capture fine-scale features (e.g., species differences and small vegetation patterns), which are crucial for accurate ecological assessments [14,15,16,17]. In contrast, unmanned aerial vehicles (UAVs) equipped with ultra-high-resolution sensors offer a solution to this limitation. UAVs are essential for mapping and analyzing complex ecosystems like peatlands, as they can capture detailed, fine-scale data over large areas, which is particularly useful in remote and hard-to-reach environments [13,18]. Furthermore, UAVs allow researchers to track vegetation responses to environmental changes such as hydrological fluctuations, seasonal shifts, and land use practices, while overcoming terrain and weather limitations that hinder traditional methods [19,20].
The need for improved methods to process UAV imagery remains critical for accurately mapping and classifying vegetation in peatlands. Object-based image analysis (OBIA) is particularly well-suited for addressing the classification challenges posed by increased heterogeneity at finer scales [21,22,23]. It effectively mitigates the salt-and-pepper effect caused by pixel-based methods, which can reduce classification accuracy due to high levels of noise in the imagery [24,25]. Furthermore, the integration of machine learning has made processing large-scale UAV imagery efficient and accurate, particularly with the widespread application of the random forest (RF) algorithm in peatland vegetation mapping [13,22,26].
Recently, deep neural networks (DNNs) have emerged as transformative tools, demonstrating superior accuracy over traditional methods such as random forest and support vector machines (SVMs) through transfer learning and hierarchical feature extraction [27]. Deep learning (DL) is now widely applied to remote sensing image interpretation [28,29,30,31,32,33,34]. Convolutional neural networks (CNNs), in particular, autonomously learn hierarchical features from large datasets through multi-layered architectures, enabling robust identification of target categories and spatial distributions, a critical advantage for addressing complex remote sensing challenges.
Recent applications of deep CNNs tackle fundamental limitations in wetland vegetation mapping, specifically mitigating spectral confusion where spectrally similar vegetation exhibits divergent ecological traits while morphologically analogous species display spectral variability [35,36,37,38]. Nevertheless, widespread adoption faces four critical barriers: computational constraints from model complexity and hardware demands; high-dimensional data challenges due to feature redundancy in spectral–spatial datasets; performance degradation under sample-limited conditions; and limited cross-regional generalizability. Furthermore, diverse DL architectures exhibit heterogeneous classification performance across wetland vegetation types, rendering the evaluation of algorithms for peatland mapping an essential research priority.
To address these challenges, we systematically benchmark four state-of-the-art DL models (ResNet, EfficientNet, ConvNeXt, and Swin Transformer), selected for their proven efficacy in handling spectral–spatial complexity [39,40,41,42]. ResNet and EfficientNet represent canonical CNN architectures, while Swin Transformer leverages self-attention mechanisms. ConvNeXt hybridizes both paradigms, enabling comparative analysis of architectural efficacy for peatland vegetation mapping.
In this study, we aim to improve the accuracy of UAV-based vegetation mapping by evaluating the impact of different DL algorithms and of images captured in different seasons and topographies on the accuracy of peatland vegetation mapping. The research was carried out at the Dajiuhu peatland, a Ramsar-listed wetland in central China. This peatland is a representative subtropical Sphagnum peatland and exhibits pronounced phenological differences in vegetation communities between the growing and non-growing seasons (Figure 1). The objectives of this study include (1) generating high-precision classification maps of peatland vegetation across different seasons and topographic conditions using comparative DL models; (2) exploring the optimal seasonal timing for mapping accuracy within distinct terrain peatland plots; and (3) interpreting feature importance in DL models using SHAP.

2. Materials and Methods

2.1. Study Area

The Dajiuhu peatland (31°25′–31°32′N, 109°58′–110°08′E) developed in a closed sub-alpine basin at an average altitude of 1730 m above sea level. The region is dominated by the East Asian summer monsoon, with an average annual temperature of 7.4 °C and annual precipitation of 1560 mm [43]. The growing season spans from early May to late August, while the non-growing season extends from early September to late April of the following year. At this site, the hydrological conditions and greenhouse gas fluxes have been extensively investigated [43,44].
This study selected peatland patches near Lake No. 3 (LK3) and Yangluchang (YLC) as representative study plots (Figure 1). The sampled areas encompass 0.15 ha (LK3) and 3.8 ha (YLC), both exhibiting high Sphagnum coverage relative to the broader peatland. Field surveys confirmed that despite its limited spatial extent, LK3 contains >35% Sphagnum biomass within its boundaries. The water table level in these areas fluctuates seasonally, with higher levels in the summer and lower levels in the winter [43]. LK3 lies at a higher elevation than YLC, so the water table at YLC is higher relative to the surface than at LK3. The two sites share similar vegetation characteristics and are predominantly composed of four main plant forms: (1) dicotyledonous vegetation (dicots), represented by Sanguisorba officinalis, Veratrum nigrum, and Euphorbia esula; (2) monocotyledonous vegetation (monocots), represented by Juncus effusus, Carex argyi, and Calamagrostis epigeios; (3) shrubs, represented by Crataegus wilsonii and Malus hupehensis; and (4) peat moss, dominated by Sphagnum palustre.

2.2. Data Collection

2.2.1. UAV Imagery Capture and Pre-Processing

UAV images of Dajiuhu were collected on 22 October 2023, 25 April 2024, and 27 August 2024 using a DJI Mavic 3E drone equipped with a red–green–blue (RGB) camera with a resolution of 5280 × 3956 pixels. Automated flight missions maintained 80% forward and side overlap at 60 m above ground level, yielding 468 images at YLC and 193 at LK3. During data collection, the DJI RTK module mounted on the flight controller was used throughout to ensure the accuracy and reliability of the data. All images were acquired between 11:00 a.m. and 1:00 p.m. (UTC+8:00) under fully to mostly cloudy weather conditions. There were two main reasons for choosing cloudy weather for acquiring images. First, the alpine basin’s daytime heating induces persistent cloud formation with limited dispersal due to topographic confinement, causing unstable illumination. Second, the inherently saturated peatland surface generates extensive water bodies that produce specular reflections, distorting spectral signatures critical for vegetation analysis [45].
The UAV images were processed using Pix4Dmapper 4.5.6 software. The processing procedure was as follows: (1) importing the original images, conducting an image quality check, and eliminating images with heading and side overlap rates of less than 70%; (2) automatically matching the RGB images, performing aerial triangulation, and generating dense point cloud data; (3) building a TIN triangle network; and (4) creating digital surface models (DSMs) and digital orthophoto maps (DOMs) of the study area. Finally, to ensure better comparability, ArcGIS Pro was used to resample the ultra-high-resolution DSMs and DOMs to a common 5 cm resolution (Figure 2).
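The study used ArcGIS Pro for the resampling step; the following is a minimal open-source sketch of an equivalent operation with rasterio, shown only for illustration (file names and the bilinear choice are assumptions, not the authors' exact workflow).

```python
# Resample a DOM/DSM GeoTIFF to a common 5 cm grid (assumes a projected CRS in metres).
import rasterio
from rasterio.enums import Resampling
from rasterio.warp import calculate_default_transform, reproject

TARGET_RES = 0.05  # 5 cm; "ylc_dom.tif" is a hypothetical input file

with rasterio.open("ylc_dom.tif") as src:
    transform, width, height = calculate_default_transform(
        src.crs, src.crs, src.width, src.height, *src.bounds, resolution=TARGET_RES
    )
    profile = src.profile.copy()
    profile.update(transform=transform, width=width, height=height)
    with rasterio.open("ylc_dom_5cm.tif", "w", **profile) as dst:
        for band in range(1, src.count + 1):
            reproject(
                source=rasterio.band(src, band),
                destination=rasterio.band(dst, band),
                src_transform=src.transform,
                src_crs=src.crs,
                dst_transform=transform,
                dst_crs=src.crs,
                resampling=Resampling.bilinear,  # continuous data, so bilinear
            )
```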
To examine spectral confusion among vegetation types, we extracted RGB reflectance values from ground reference samples to illustrate class-specific reflectance characteristics across different temporal periods (Figure 3).

2.2.2. Ground Reference Data

Peatland vegetation is classified into four ground object types: (1) monocots, (2) dicots, (3) shrubs, and (4) Sphagnum. Water bodies and bare peat, where no peat-forming plant grows, were classified as a separate ‘Other’ type. To ensure that the points were randomly and evenly distributed across the two study plots, we collected sample data in two ways: (1) Ten sample transect images were obtained by the Mavic 3E UAV (DJ-Innovations, Shenzhen, China) at a flying height of 20 m and were visually interpreted to determine land cover types. The locations of the vegetation plots in the UAV images were double-checked against the field descriptions to verify that the vegetation observed in the field matched that in the UAV images. (2) GPS was used for field surveys, with synchronized ground investigations conducted during each UAV sampling period. Two sets of data were collected at each location, and field survey data from the same sites were merged. Ultimately, a total of 953 sample data points were obtained, 257 from LK3 and 696 from YLC, divided into five ground object types (Table S1, Figure S1).

2.3. Data Processing Approaches

2.3.1. Object-Based Image Analysis and Segmentation

UAV images were segmented using a multi-scale segmentation algorithm implemented in eCognition 9.0.2 software. Multi-scale segmentation, one of the most successful OBIA algorithms, aggregates pixels into segments of increasing size based on predefined levels through an iterative process [46]. A critical aspect of multiresolution segmentation in OBIA is the selection of segmentation parameters, particularly the scale parameter, which relates to the image’s spatial resolution and governs the size of the resulting objects [47,48]. In this study, the color and smoothness criteria were set to 0.8 and 0.5, respectively, aligning with widely accepted standards [26]. The optimal segmentation scale was obtained with the Estimation of Scale Parameter 2 (ESP2) tool using the principle of local variance [49,50].
The optimal scale parameter of 35 was determined using the ESP2 tool [49], which quantifies local variance thresholds to minimize intra-segment heterogeneity and maximize inter-segment distinction. In fact, 35 was not the only mutation point in the rate of change of local variance (Figure S2). Larger segmentation parameters, such as 39, 40, and 46, were also tested but yielded less accurate results, including misclassification of plant formations after segmentation (Figure S3). In contrast, the scale parameter of 35 largely avoided such confusion. With a scale parameter of 35, most object patches range from 0.1 to 0.3 m² and exhibit irregular shapes. Approximately 30,000 objects were produced in LK3 and 80,000 in YLC. Previous studies have used larger segments of 0.96 m² and 0.5625 m² selected through multiple subjective attempts [13,26]. The fine-scale strategy significantly enhanced boundary delineation accuracy (Figure S3). While finer segmentation increases data volume and computational complexity [22], it is essential for resolving spatially explicit vegetation boundaries in heterogeneous peatlands. This approach provides superior foundational data for DL applications by precisely capturing the spatial structures of vegetation communities.
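The segmentation itself is performed in eCognition; as a minimal sketch of an assumed downstream step (not part of the eCognition workflow), the snippet below shows how per-object mean values, like the per-segment features used later, can be derived from an exported segment-ID raster. The arrays are synthetic placeholders.

```python
# Derive per-segment mean values from a label raster exported after segmentation.
import numpy as np
from scipy import ndimage

# hypothetical inputs: RGB orthomosaic (H, W, 3) and a raster of segment IDs (H, W)
rgb = np.random.randint(0, 255, (1000, 1000, 3)).astype(float)
segments = np.random.randint(1, 501, (1000, 1000))

seg_ids = np.unique(segments)
per_segment_means = {
    band: ndimage.mean(rgb[:, :, i], labels=segments, index=seg_ids)
    for i, band in enumerate(["red", "green", "blue"])
}
# Each entry is a vector of per-segment means aligned with seg_ids.
```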

2.3.2. Deep Learning Algorithms

Four popular and high-performing DL algorithms in the field of image classification, ResNet, EfficientNet, Swin Transformer, and ConvNeXt, were employed. Both ResNet and EfficientNet are representative examples of typical CNNs [51].
ResNet introduces residual connections, allowing the network to bypass certain layers and effectively addressing the vanishing gradient problem. It stacks multiple residual blocks, each consisting of identity mappings and convolutional layers [52]. Its architecture consists of 50 layers organized into four sequential stages (conv2_x to conv5_x), each made up of multiple bottleneck residual blocks. Each block follows a 1 × 1–3 × 3–1 × 1 convolutional pattern with batch normalization and ReLU activation and uses identity shortcut connections to bypass nonlinear transformations. This design enables the stable training of very deep networks. For implementation, we adopted the standard ResNet50 configuration with 25.6 million parameters, initialized it with pretrained weights, and trained it for 30 epochs using Adam optimization (learning rate = 0.0001) with a batch size of 32. The model was optimized using multi-class cross-entropy loss.
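A minimal sketch of this configuration, assuming torchvision's ImageNet-pretrained ResNet50 and the hyperparameters reported above (the five-class head reflects the ground object types in Section 2.2.2):

```python
# Fine-tuning setup for ResNet50: pretrained backbone, new 5-class head, Adam, cross-entropy.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 5  # monocots, dicots, shrubs, Sphagnum, other

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)  # replace the classifier head

criterion = nn.CrossEntropyLoss()                              # multi-class cross-entropy loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)      # learning rate = 0.0001
```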
EfficientNet employs a compound scaling method, which uniformly scales the width, depth, and resolution of convolutional networks. It incorporates mobile inverted bottleneck blocks and squeeze-and-excitation layers. Its core component is the mobile inverted bottleneck (MBConv) block, which first expands the input channels via 1 × 1 convolution, then applies depthwise 3 × 3 convolution, and finally squeezes the channels using a 1 × 1 convolution. Each MBConv integrates squeeze-and-excitation (SE) layers to adaptively recalibrate channel-wise feature responses. The B0 variant has 5.3 million parameters and uses a baseline architecture with 7 MBConv stages. The channel counts increase progressively while spatial dimensions are reduced through strided convolutions. For training, we used stochastic gradient descent with momentum, configured with an initial learning rate of 0.01 and cosine annealing decay (final learning rate factor = 0.01). The model was trained for 30 epochs using a batch size of 32.
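A minimal sketch of the EfficientNet-B0 setup described above, assuming torchvision's pretrained weights; the momentum value is an assumption, since only "SGD with momentum" is stated.

```python
# EfficientNet-B0 fine-tuning: SGD with momentum and cosine annealing to 1% of the initial lr.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES, EPOCHS = 5, 30

model = models.efficientnet_b0(weights=models.EfficientNet_B0_Weights.IMAGENET1K_V1)
model.classifier[1] = nn.Linear(model.classifier[1].in_features, NUM_CLASSES)

optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)  # momentum assumed
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=EPOCHS, eta_min=0.01 * 0.01  # final learning rate factor = 0.01
)
```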
The Swin Transformer is a hierarchical transformer model that utilizes shifted windows for self-attention computations [53]. It divides the input image into non-overlapping 4 × 4 patches (with an embedding dimension of 96), which are processed by transformer blocks organized in four hierarchical stages and gradually merged to enhance computational efficiency. Within each stage, local-window multi-head self-attention confines computations to non-overlapping windows (7 × 7 patches), while shifted-window attention enables cross-window connectivity in alternating layers. This approach maintains linear computational complexity relative to image size. Our Swin-T implementation has 28 million parameters and uses a [2, 2, 6, 2] block configuration per stage, along with layer normalization and GELU activations. We trained the model for 30 epochs with a batch size of 16 using AdamW optimization (fixed learning rate = 0.0001, weight decay = 0.05) without explicit learning rate scheduling.
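A minimal sketch of the Swin-T configuration, assuming torchvision's implementation and the reported AdamW settings (no learning rate scheduler is attached, matching the description above):

```python
# Swin-T fine-tuning: pretrained backbone, new 5-class head, AdamW with weight decay 0.05.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 5

model = models.swin_t(weights=models.Swin_T_Weights.IMAGENET1K_V1)
model.head = nn.Linear(model.head.in_features, NUM_CLASSES)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=0.05)
```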
ConvNeXt is a modernized CNN inspired by the design principles of Vision Transformers. It refines standard convolutional layers through techniques such as depthwise convolutions, layer normalization, and inverted residuals [54,55]. Our study used ConvNeXt-Tiny. Key innovations include inverted bottlenecks (expanding channels before depthwise convolution), 7 × 7 kernel sizes to mimic the large receptive fields in Vision Transformers (ViTs), layer normalization instead of batch normalization, and GELU activations. The Tiny variant has 28 million parameters and uses a [3, 3, 9, 3] block structure per stage, with progressive channel scaling (from 96 to 768) and downsampling via strided convolutions between stages. For optimization, we used AdamW with weight decay (5 × 10⁻²) and trained for 30 epochs using a batch size of 32. The learning rate followed a custom warmup schedule starting from an initial value of 5 × 10⁻⁴.
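A minimal sketch of the ConvNeXt-Tiny setup, assuming torchvision's pretrained model; since the exact warmup schedule is not specified above, a simple linear warmup over an assumed number of epochs is shown.

```python
# ConvNeXt-Tiny fine-tuning: AdamW (weight decay 5e-2) with a linear warmup from the base lr.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES, WARMUP_EPOCHS = 5, 5  # warmup length is an assumption

model = models.convnext_tiny(weights=models.ConvNeXt_Tiny_Weights.IMAGENET1K_V1)
model.classifier[2] = nn.Linear(model.classifier[2].in_features, NUM_CLASSES)

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4, weight_decay=5e-2)
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    lr_lambda=lambda epoch: min(1.0, (epoch + 1) / WARMUP_EPOCHS),  # warmup, then constant
)
# scheduler.step() would be called once per epoch during training.
```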
All models were implemented in PyTorch 2.0 and trained on an NVIDIA Tesla K40C GPU with 12 GB VRAM (NVIDIA, Santa Clara, CA, USA). Due to hardware limitations, a uniform batch size of 64 was used. Input imagery underwent downsampling to a standardized resolution of 224 × 224 pixels to ensure dimensional consistency across samples. The data were then divided into 80% training datasets and 20% validation datasets, with stratification by class. The architectural diversity of the selected models enables comprehensive evaluation of complex scene classification capabilities from multiple representational perspectives.
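A minimal sketch of the shared data pipeline: chips resized to 224 × 224 and split 80/20 with class stratification. Scikit-learn is an assumed tool for the stratified split, and the chip paths and labels below are hypothetical.

```python
# Standardize input resolution and create a class-stratified 80/20 train/validation split.
from sklearn.model_selection import train_test_split
from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),  # downsample chips to a uniform 224 x 224
    transforms.ToTensor(),
])

chip_paths = [f"chips/segment_{i}.png" for i in range(1000)]  # hypothetical file list
labels = [i % 5 for i in range(1000)]                          # five ground object classes

train_paths, val_paths, train_labels, val_labels = train_test_split(
    chip_paths, labels, test_size=0.2, stratify=labels, random_state=42
)
```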

2.3.3. Driving Factor Analysis Using Feature Correlation and SHAP

Studies on vegetation classification have demonstrated that using spectral data in conjunction with topographic and texture data enhances classification accuracy [13,56]. Therefore, the integration of spectral bands, vegetation indices, texture features, and DSMs generated a dataset of 29 feature variables (Table S2). These features were obtained by averaging the pixel values within each sample segment patch. To identify key features in the optimal model, we first performed correlation analysis on these 29 variables using the Pearson correlation coefficient (PCC). Features exceeding a correlation threshold of 0.75 were removed, prioritizing the exclusion of the most highly correlated pairs to avoid linear dependence. We then applied the SHAP algorithm for model interpretation.
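A minimal sketch of the correlation screening described above, using pandas: one feature is dropped from every pair whose absolute Pearson correlation exceeds 0.75, starting from the most strongly correlated pairs. The feature table here is synthetic.

```python
# Remove one member of each highly correlated (|r| > 0.75) feature pair.
import numpy as np
import pandas as pd

features = pd.DataFrame(np.random.rand(500, 29),
                        columns=[f"f{i}" for i in range(29)])  # hypothetical 29-variable table

corr = features.corr(method="pearson").abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))  # unique pairs only
pairs = upper.stack().sort_values(ascending=False)                  # most correlated first

dropped = set()
for (a, b), r in pairs.items():
    if r <= 0.75:
        break
    if a not in dropped and b not in dropped:
        dropped.add(b)  # keep the first member of the pair, drop the second

retained = features.drop(columns=list(dropped))
```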
SHAP was derived from game theory concepts proposed by economist Lloyd Shapley [57]. Lundberg and Lee later adapted this method for machine learning to explain predictions from complex models, enhancing model transparency and user trust [58]. We employed SHAP to resolve critical interpretability challenges in our DL-based peatland vegetation classification. This technique was essential for quantifying how specific features collectively drive model predictions across heterogeneous subtropical peatlands. Unlike simpler attribution methods, SHAP robustly handles complex feature interactions within our multi-source dataset.
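A minimal sketch of SHAP attribution over per-segment feature variables. A random forest stand-in classifier is trained here purely so the example runs end to end, whereas the study interprets the ConvNeXt model; the model-agnostic KernelExplainer shown below works with any prediction function, but the exact SHAP variant used in the study is not specified here.

```python
# Model-agnostic SHAP attribution for one class probability over tabular segment features.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import RandomForestClassifier

# hypothetical per-segment feature table (10 features) and five-class labels
X = pd.DataFrame(np.random.rand(400, 10), columns=[f"feat{i}" for i in range(10)])
y = np.random.randint(0, 5, 400)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)  # stand-in model

# explain the predicted probability of a single class (e.g., class 3 = Sphagnum)
predict_class3 = lambda data: clf.predict_proba(data)[:, 3]
explainer = shap.KernelExplainer(predict_class3, shap.sample(X, 50))
shap_values = explainer.shap_values(X.iloc[:100], nsamples=100)  # shape (100, 10)

# mean |SHAP| per feature as a simple global importance ranking
importance = np.abs(shap_values).mean(axis=0)
print(pd.Series(importance, index=X.columns).sort_values(ascending=False))
```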
To evaluate model robustness and quantify the contribution of vegetation height features, we conducted comparative assessments using three input configurations: spectral features only; spectral and texture features combined; and spectral, texture, and height features combined, applied across the different model architectures. This enabled a systematic analysis of the impact of each feature group on classification performance.

2.3.4. Accuracy Assessment

The confusion matrix was constructed from the validation samples and classification results of different DL models, and the classification accuracy of peatland vegetation was evaluated using indicators including overall accuracy (OA), the kappa value, producer accuracy (PA), user accuracy (UA), and the macro-averaged F1-score [36,59,60]. PA relates the number of pixels correctly classified in a class to the number of pixels available in the validation record for that class, reflecting the classifier’s ability to identify ground truth data. UA relates the number of correctly classified pixels in a class to the total number of pixels assigned to that class, reflecting the reliability of the classification results. The F1-score provides critical insight into spatial distribution recognition capabilities, particularly for minority vegetation classes where spectral confusion is prevalent.
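A minimal sketch of these accuracy indicators computed with scikit-learn from a confusion matrix; the validation labels below are synthetic placeholders.

```python
# Overall accuracy, kappa, per-class producer/user accuracy, and macro-averaged F1.
import numpy as np
from sklearn.metrics import cohen_kappa_score, confusion_matrix, f1_score

y_true = np.random.randint(0, 5, 200)  # reference classes of validation segments (hypothetical)
y_pred = np.random.randint(0, 5, 200)  # classes predicted by a model (hypothetical)

cm = confusion_matrix(y_true, y_pred)              # rows = reference, columns = predicted
oa = np.trace(cm) / cm.sum()                       # overall accuracy
kappa = cohen_kappa_score(y_true, y_pred)
producer_acc = np.diag(cm) / cm.sum(axis=1)        # correct / reference total per class
user_acc = np.diag(cm) / cm.sum(axis=0)            # correct / predicted total per class
macro_f1 = f1_score(y_true, y_pred, average="macro")
```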

3. Results

3.1. Classification Results and Model Accuracy

Classification results showed significant variations across seasons and topographies, even when the same algorithm was applied (Figure 4 and Figure 5). Among the four DL methods we evaluated, the highest classification accuracy was achieved using the ConvNeXt algorithm during the non-growing season in the high-topography LK3 area, with an overall accuracy of 87% (Figure 6). In contrast, the lowest performance was observed using the Swin Transformer algorithm during the growing season in the lower-topography YLC area, yielding an overall accuracy of only 65%. The YLC site consistently exhibited weaker overall accuracy and Kappa values across both seasons, compared to the LK3 site (Figure 6a).
In this study, Sphagnum, as a key vegetation type within the study area, was evaluated using PA and UA to assess the conditions under which different models exhibited superior classification performance (Figures S4 and S5). ConvNeXt attained a maximal Sphagnum precision of 0.86 during the non-growing season in YLC, corresponding to 49.76% areal coverage. It further achieved both a high UA (0.91) and PA (0.82) with 35.41% coverage during the non-growing season in LK3. Conversely, EfficientNet delivered the poorest overall performance in the growing season of YLC, with an OA of 66%.
Phenological shifts induced substantial volatility, as evidenced by UA fluctuations of up to 41% within identical study areas across seasons (Figure S5b). Dicots exhibited marked seasonal sensitivity, with accuracy plunging to 40% during leaf senescence, when producer accuracy declined to 63% versus the 87% peak in summer. Monocot misclassification patterns emerged distinctly (Figure 7), showing significant confusion with Sphagnum in the growing season at YLC (EfficientNet PA of 0.48), alongside misclassification as dicots in the growing season at LK3. Dicots demonstrated the lowest non-growing-season UA across regions, reaching a critical low of 0.40 under Swin Transformer in YLC. Crucially, Sphagnum classification consistently outperformed growing-season metrics during non-growing periods across all observations.
Different deep learning algorithms exhibited varying performance when applied to peatland imagery across seasons and topographies. The training curves show that all models reached a convergence threshold with a training accuracy of >90% and a loss of <0.4 by the 30th epoch (Figures S6 and S7). However, there were significant differences in fine-grained classification performance. Among the four models, ConvNeXt and ResNet demonstrated the most favorable performance, both in terms of DL efficiency and final classification accuracy (Figure 6 and Figure S6). ConvNeXt and ResNet achieved accuracies of 96.6% and 96.2%, respectively, on the test set, significantly outperforming EfficientNet, which achieved accuracies ranging from 84.9% to 90.2% (Figures S6 and S7). Notably, although the Swin Transformer performs well in terms of both accuracy and loss metrics, its overall vegetation classification accuracy is 14% ± 3% lower than that of the optimal model under the same conditions (Figure 6a). When selecting DL models, training accuracy and loss rate may not be the sole indicators of classification quality. Despite competitive test set performance in accuracy and loss rates, certain architectures like Swin Transformer yielded suboptimal overall precision.
Analysis across the two distinct peatland study areas revealed clear performance differences. ConvNeXt showed superior performance during non-growing seasons, achieving 87% overall accuracy (OA) in LK3 and 80% in YLC. It also achieved high class-specific F1-scores during the growing season at YLC, with 0.941 for Sphagnum, 0.981 for monocots, and 0.954 for other vegetation (Figure S8). ResNet, on the other hand, performed better during growing seasons, with OAs of 71% and 74%. Notably, it outperformed ConvNeXt in the high-topography LK3 area, achieving an F1-score of 0.902 compared to ConvNeXt’s 0.852. The Swin Transformer displayed more volatility. It performed well during the growing season at YLC with an F1-score of 0.899 but showed a significant decline during non-growing periods, achieving 0.779 in LK3. However, it still outperformed EfficientNet in high-topography growing seasons. EfficientNet consistently underperformed, particularly during growing seasons. In LK3, it achieved an OA of 66% and an F1-score of 0.771 during this period.

3.2. Feature Importance Based on SHAP

This study extracted 29 features to interpret classification outcomes. SHAP-based importance ranking revealed significant regional and seasonal variations. Overall, strong correlations existed among nred, nblue, and ngreen (Figure 8). The final retained variables are shown in Figure 8.
The SHAP summary plot provides a global explanation, indicating the relative importance of each influencing factor in the model calculations (Figure 9). Using the ConvNeXt model, which achieved the highest overall accuracy, combined with SHAP interpretability, we assessed feature importance and performed separate SHAP analyses for different regions and seasons. Feature contributions varied significantly across vegetation classes. Vertically, features are ranked by descending importance. Consistent spectral feature stratification was observed, with the vegetation indices EXGR, EXR, and EXG, along with the texture feature MEAN and the DSM, persistently ranking among the top important features across all scenarios. Spectral features like GRI, VARI, NDI, and Ngreen played less critical roles, and texture features proved more important than these specific spectral features. The contribution of texture features increased by 22% to 68% during the non-growing season compared to the growing season, while DSM importance increased by an average of 31.5% during the growing season. Horizontally, each feature contributed differently to classifying each vegetation type. Shrub classification relied more on texture and height features at LK3 but depended more on spectral features at YLC. Monocot classification relied more on height features during the non-growing season than the growing season. Peat moss classification depended more on spectral and height features in both seasons but relied more on texture features during the growing season than the non-growing season. Dicot classification depended more on height features during the growing season than the non-growing season.
The results of our analysis on the role of feature inclusion demonstrate that input data composition significantly governs vegetation classification performance across models (Figure 10). When using only spectral features, all models performed below expectations. ConvNeXt achieved an overall accuracy of just 0.70 and an F1-score of 0.79, while Swin Transformer performed worse, with an accuracy of only 0.55. Adding texture features systematically improved all models. ConvNeXt accuracy jumped to 0.81 and its F1 score reached 0.88, representing a 0.11 improvement over using spectral features alone. Using the full feature combination allowed ConvNeXt to achieve its peak performance, with an accuracy of 0.87 and an F1-score of 0.95, a further 0.06 gain over the spectral-plus-texture combination. This improvement was consistent across all models, which showed varying degrees of accuracy increase after incorporating the DSM. The accuracy gains from adding texture features were larger than the subsequent gain from adding height features. This indicates that texture provides stronger complementary information to spectral data than topographic height provides to the combination of spectral and texture data.

4. Discussion

4.1. Model Selection Considerations

ConvNeXt emerged as the top-performing model based on overall accuracy and Kappa value in this study. This architecture effectively combines the advantages of large-kernel convolutional operations and hierarchical window attention mechanisms. The 7 × 7 convolutional kernels in ConvNeXt provide broader receptive fields that capture subtle spatial patterns, which was demonstrated during YLC’s growing season where these kernels identified monocot texture features that smaller 3 × 3 kernels missed, reducing monocot misclassification by 0.22 compared to EfficientNet (Figure S4). In contrast, ResNet, employing smaller 3 × 3 convolutional kernels, demonstrated unique efficacy for dicot identification during the growing season. This advantage likely arises from the kernels’ capacity for precise localized feature extraction, optimally capturing diagnostic high-frequency spatial signatures associated with complex dicot morphology, such as intricate leaf venation patterns, compound leaf arrangements, and irregular canopy structures (Figure S10). Therefore, some studies have effectively improved classification accuracy using complementary features like leaf area index. ResNet achieved dicot classification accuracy comparable to ConvNeXt despite its simpler architectural design. However, ConvNeXt’s distinct superiority during non-growing seasons is primarily attributable to its hierarchical window attention mechanism. This mechanism significantly amplified discriminability among senescing vegetation types by adaptively weighting spectral responses across multi-scale windows. For example, it effectively resolved spectral confusion between withered monocots, contributing to a 0.08 increase in overall accuracy over ResNet within LK3 (Figure S5). Large-kernel convolutions dominate when diagnostic structural features are prevalent in the growing season, while attention mechanisms excel under conditions of increased spectral ambiguity in the non-growing season.
Model performance (F1-score) of Swin Transformer was consistently lower for dicots and other classes than for monocots, Sphagnum moss, and shrubs (Figure S8). This performance volatility originates from its windowed attention mechanism, which demonstrates heightened sensitivity to spectral similarities among vegetation types. In subtropical peatlands, pronounced spectral homogeneity among vegetation significantly increases discrimination difficulty, particularly during growing seasons between monocots and dicots (Figure S10). The local-window attention fails to model global dependencies across these spectrally ambiguous regions, exacerbating misclassification (e.g., between monocots and Sphagnum). This aligns with findings from Jamali et al. (2022) in coastal wetlands, where windowed attention incurred high confusion between spectrally similar marsh types [61].
For comparison, the traditional machine learning algorithm random forest was also applied. While RF achieved its highest accuracy of 73% at LK3 during the non-growing season, this remained lower than DL models under identical conditions. Notably, RF’s performance aligns with established wetland mapping studies, confirming its general utility [56]. However, our results demonstrate that DL architectures better resolve the fine-scale spectral–textural complexities encountered in ultra-high-resolution UAV imagery of subtropical peatlands [52].
Subtropical peatlands diverge significantly from boreal counterparts in vegetation composition, climate regimes, and hydrological conditions. Previous studies attained 92% accuracy or higher in northern peatlands by optimizing spectral indices with high-resolution UAV imagery [4] or designing UAV acquisition methods [13]. It is also noteworthy that our model generated seasonal-specific classification outcomes under different data constraints, producing results that contrast with those of Simpson et al. (2024) [25]. In their study of UK peatlands, higher classification accuracy was achieved during the July–August period and early growing season, which indicates that vegetation classification methods developed for northern peatlands may not be directly applicable to subtropical peatlands.

4.2. Topography and Phenology Comparisons

4.2.1. The Influence of Topography on Classification

Topography influences vegetation status by modulating water table depth, thereby influencing final classification outcomes. LK3 maintained high precision across all phenological stages. Conversely, in the YLC region, characterized by a relatively higher water table, Sphagnum moss was more frequently misclassified as monocot vegetation. This misclassification was particularly pronounced during the growing season, when spectral confusion with monocot vegetation intensifies (Figure 6). Varying vegetation classification accuracy among topographies probably results from environmental factors such as moisture availability. Moisture critically affects Sphagnum, altering its color: higher water tables maintain greener Sphagnum, while lower water tables cause yellowing, as visible in Figure S10b. In the peatland, although depressions may remain relatively moist and support vegetation growth, moss hummocks are heavily affected by water scarcity, often resulting in yellowing Sphagnum. These small-scale variations in moisture and growth cause Sphagnum to exhibit differing spectral characteristics, complicating classification efforts. These results underscore topography-driven moisture gradients as a key modulator of classification performance, a finding critical for scaling models to flood-prone subtropical peatlands, where spectral confusion arises between water-stressed Sphagnum and senescent monocots.

4.2.2. The Influence of Phenology on Classification

Classification accuracy for dicots and monocots showed similar trends across topographies but varied considerably by phenology. This difficulty arises from spectral confusion with green vegetation from other plant communities in the growing season and with both black and yellow spectral signatures in the non-growing season [25,62]. In the non-growing season, monocots were more easily confused with Sphagnum, while in the growing season, they were more likely to be misclassified as dicot vegetation. This discrepancy can be attributed to the seasonal changes in monocots: as annuals, they are yellow and wilt during the non-growing season, and their tall stature often causes them to collapse and obscure Sphagnum, making separation difficult. Unlike other vegetation types that senesce and renew with the seasons, Sphagnum is perennial and does not lose its leaves (Figure 3 and Figure S10). Additionally, the scattered distribution of dicot vegetation in subtropical peatlands further complicates classification. This discrepancy is closely linked to the phenological height variations of dicot vegetation. In summer, species such as E. esula and S. officinalis grow taller than Sphagnum and monocots (Figure S10). Consequently, the incorporation of DSM data improves their differentiation. Monocots were classified more accurately during the non-growing season than in the growing season. Phenology indirectly affects Sphagnum classification through its modulation of dicot morphology.
Consequently, UAV imaging during non-growing seasons should be prioritized for classifying monocots and Sphagnum, whereas growing season imagery better supports dicot classification. The season-dependent overall accuracy and per-class F1-scores underscore the significant impact of phenology on vegetation classification accuracy (Figure 6 and Figure S8). Moreover, experimental evidence suggests that phenology has a strong impact on vegetation classification [25,63,64]. Phenological characteristics can serve as distinguishing features for classification, which is particularly important for vegetation classification. While our two-season analysis captured key phenological contrasts, future studies could deploy multi-temporal UAV campaigns across transitional periods to quantify optimal acquisition windows and model phenological trajectories for automated vegetation typing.

4.3. Feature Determination

The pivotal role of vegetation indices (EXGR, EXR, EXG) and DSM aligns with their established effectiveness in boreal peatlands [65,66]. This study demonstrates their indispensability within subtropical complex vegetation communities where spectral confusion intensifies under species co-occurrence. Seasonal fluctuations in feature importance reflect phenological impacts on classification models, with enhanced vegetation index dominance during growing seasons likely attributable to chlorophyll-induced spectral responses.
We found that DSM data were invaluable for distinguishing shrubs and dicots, as evidenced by the higher SHAP values associated with DSM for these vegetation types. This enhanced discriminability was particularly pronounced during the growing season, aligning with observed structural characteristics in the field (Figure S10). However, some studies have suggested that DSM data may have limited utility for classifying different species of Sphagnum in peatlands [14,22]. This phenomenon was observed in the YLC region of our study, whereas DSM remained important for Sphagnum classification in the LK3 region. We hypothesize that this discrepancy may relate to variations in Sphagnum species composition and moss accumulation age. Although Sphagnum species composition was consistent across our study area, it may differ in other regions, potentially explaining contrasting findings. Regarding moss accumulation, Sphagnum in YLC exhibited greater accumulation thickness associated with longer growth periods, while LK3 featured younger moss with lower accumulation.
Texture features have also been shown to be highly influential, compensating for the lack of spectral information and serving as a supplementary tool in peatland vegetation classification [67]. Adding texture features increased overall accuracy by 0.13 across models (Figure 10), with SHAP analysis ranking texture (MEAN) among the top three influential features (Figure 9). Texture features demonstrated context-dependent utility, most critically resolving classification errors between monocots and other classes. We observed that monocot textures are relatively smooth, while Sphagnum and dicot textures are comparatively rough, and areas with high woody shrub coverage exhibit the roughest textures. These textural distinctions contributed significantly to the overall accuracy improvement.

4.4. Implications of Findings

This study demonstrates that DL architectures, particularly ConvNeXt, effectively leverage spectral, textural, and elevation information from UAV-based RGB imagery for high-resolution vegetation mapping in subtropical peatlands. By achieving 87% classification accuracy during non-growing seasons, we validate DL’s capacity to resolve spectral overlap challenges in peatland ecosystems. Previous studies on peatland vegetation classification have reported accuracies ranging between 85% and 92% [16,56]. The superior accuracy in non-growing seasons, with an overall accuracy of 87% compared to 65% in growing seasons, suggests that subtropical peatlands require seasonally adaptive monitoring strategies to optimize resource allocation. This contrasts with boreal frameworks optimized for summer acquisition [25,26,68], highlighting the necessity of region-specific phenological adaptations.
Our static sampling approach inadequately captured hydrological dynamism, exemplified by rapid vegetation shifts triggered by extreme weather events such as droughts and floods in subtropical regions. These disturbances cause transient community reorganization that alters carbon storage dynamics [69]. Therefore, capturing multiple UAV imagery across different seasons and water levels can provide valuable insights into the key mechanisms of vegetation dynamics within the carbon cycle.

5. Conclusions

This study investigates the use of DL models for vegetation mapping in a subtropical Sphagnum peatland using UAV-based RGB imagery captured across different seasons and topographies. By optimizing segmentation parameters, explaining the use of features in DL, and comparing classification accuracies across scenarios and vegetation types, we assessed the feasibility and uncertainties of utilizing DL in peatland mapping. The key findings are outlined below:
(1)
ConvNeXt and ResNet delivered superior vegetation classification performance in subtropical peatlands. ConvNeXt achieved optimal results through its large-kernel architecture and vision-optimized design, while ResNet served as an effective alternative when sufficient computational resources were available during growing seasons. Although EfficientNet and Swin Transformer attained high training accuracy with low loss on validation sets, their low F1-scores and poor overall accuracy indicate limited suitability for subtropical peatland vegetation classification.
(2)
Topography and phenology are both important factors that affect classification accuracy. Topography-derived hydrological gradients serve as the core driver: the higher water table in the low-lying YLC area intensifies spectral confusion among vegetation, reducing accuracy by 12–15% compared to LK3. Phenological variations regulate classification outcomes through vegetation growth dynamics: during non-growing seasons, the withering of annual plants enhances inter-category feature distinctness, whereas in growing seasons, the morphological traits of dicot plants become more recognizable with DSM assistance, though spectral overlap between monocots and Sphagnum moss increases misclassification rates. Consequently, Sphagnum mapping should prioritize non-growing-season imagery, while dicot vegetation classification requires integrated growing-season data and DSM features.
(3)
The SHAP method effectively identifies critical features, including key vegetation indices (EXG, EXR, and EXGR), the DSM, and texture characteristics, clarifying how DL models utilize these inputs. Comparative experiments with different input data configurations demonstrate that the DSM contributes substantially to vegetation classification, underscoring the indispensability of topographic elevation data.
Collectively, this study validates the feasibility of four DL models for vegetation mapping in subtropical Sphagnum peatlands and demonstrates their robust performance across distinct topography and phenological conditions.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/rs17172920/s1, Figure S1: sample point distribution; Figure S2: change rate of local variance for different segmentation scales; Figure S3: segmentation results under different segmentation parameters; Figure S4: producer accuracy heatmap for vegetation types utilizing different algorithms; Figure S5: user accuracy heatmap for vegetation types using different algorithms; Figure S6: accuracy and loss value comparison of model training and validation between four different algorithms in LK3; Figure S7: accuracy and loss value comparison of model training and validation between four different algorithms in YLC; Figure S8: summary of per-class F1-scores for all validation exercises; Figure S9: comparison of classification results using random forest; Figure S10: vegetation images from the Dajiuhu Peatland; Table S1: the number of sample data points in the study area; Table S2: list of predictor variables used in vegetation classification, computed per segment.

Author Contributions

Conceptualization, Z.L. and X.H.; methodology, software, validation, formal analysis, investigation, data curation and visualization, Z.L.; resources, Z.L. and X.H.; writing—original draft preparation, Z.L.; writing—review and editing, X.H.; supervision, project administration and funding acquisition, X.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (No. 42472368), the Key project from the Hubei Research Center for Basic Disciplines of Earth Sciences (No. HRCES-20242), and State Key Laboratory of Geomicrobiology and Environmental Changes, China University of Geosciences (No. GBL12403).

Data Availability Statement

All raw data can be provided by the corresponding authors upon request.

Acknowledgments

Guang Yang, Yuhang Wang, and Jiantao Xue are thanked for their help in the field investigation.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
UAV: unmanned aerial vehicle
OBIA: object-based image analysis
RF: random forest
DL: deep learning
RGB: red–green–blue
DOM: digital orthophoto map
CNN: convolutional neural network
OA: overall accuracy
PA: producer accuracy
UA: user accuracy
EXGR: excess green minus excess red index
EXR: excess red index
EXG: excess green index
DSM: digital surface model
Dicots: dicotyledonous vegetation
Monocots: monocotyledonous vegetation
SHAP: SHapley Additive exPlanations

References

  1. Xu, J.; Morris, P.J.; Liu, J.; Holden, J. PEATMAP: Refining Estimates of Global Peatland Distribution Based on a Meta-Analysis. CATENA 2018, 160, 134–140. [Google Scholar] [CrossRef]
  2. Yu, Z.; Loisel, J.; Brosseau, D.P.; Beilman, D.W.; Hunt, S.J. Global Peatland Dynamics since the Last Glacial Maximum. Geophys. Res. Lett. 2010, 37, 13. [Google Scholar] [CrossRef]
  3. Strack, M.; Davidson, S.J.; Hirano, T.; Dunn, C. The Potential of Peatlands as Nature-Based Climate Solutions. Curr. Clim. Change Rep. 2022, 8, 71–82. [Google Scholar] [CrossRef]
  4. Lehmann, J.R.K.; Münchberger, W.; Knoth, C.; Blodau, C.; Nieberding, F.; Prinz, T.; Pancotto, V.A.; Kleinebecker, T. High-Resolution Classification of South Patagonian Peat Bog Microforms Reveals Potential Gaps in up-Scaled CH4 Fluxes by Use of Unmanned Aerial System (UAS) and CIR Imagery. Remote Sens. 2016, 8, 173. [Google Scholar] [CrossRef]
  5. Jiang, W.; Zhang, Z.; Ling, Z.; Deng, Y. Experience and Future Research Trends of Wetland Protection and Restoration in China. J. Geogr. Sci. 2024, 34, 229–251. [Google Scholar] [CrossRef]
  6. Ritson, J.P.; Lees, K.J.; Hill, J.; Gallego-Sala, A.; Bebber, D.P. Climate Change Impacts on Blanket Peatland in Great Britain. J. Appl. Ecol. 2025, 62, 701–714. [Google Scholar] [CrossRef]
  7. Karlqvist, S.; Burdun, I.; Salko, S.-S.; Juola, J.; Rautiainen, M. Retrieval of Moisture Content of Common Sphagnum Peat Moss Species from Hyperspectral and Multispectral Data. Remote Sens. Environ. 2024, 315, 114415. [Google Scholar] [CrossRef]
  8. Rochefort, L. Sphagnum: A Keystone Genus in Habitat Restoration. Bryologist 2000, 103, 503–508. [Google Scholar] [CrossRef]
  9. Zhao, Y.; Liu, C.; Li, X.; Ma, L.; Zhai, G.; Feng, X. Sphagnum Increases Soil’s Sequestration Capacity of Mineral-Associated Organic Carbon via Activating Metal Oxides. Nat. Commun. 2023, 14, 5052. [Google Scholar] [CrossRef]
  10. Knoth, C.; Klein, B.; Prinz, T.; Kleinebecker, T. Unmanned Aerial Vehicles as Innovative Remote Sensing Platforms for High-Resolution Infrared Imagery to Support Restoration Monitoring in Cut-over Bogs. Appl. Veg. Sci. 2013, 16, 509–517. [Google Scholar] [CrossRef]
  11. Andersen, R.; Poulin, M.; Borcard, D.; Laiho, R.; Laine, J.; Vasander, H.; Tuittila, E.-T. Environmental Control and Spatial Structures in Peatland Vegetation. J. Veg. Sci. 2011, 22, 878–890. [Google Scholar] [CrossRef]
  12. Anderson, K.; Gaston, K.J. Lightweight Unmanned Aerial Vehicles Will Revolutionize Spatial Ecology. Front. Ecol. Environ. 2013, 11, 138–146. [Google Scholar] [CrossRef]
  13. Steenvoorden, J.; Bartholomeus, H.; Limpens, J. Less Is More: Optimizing Vegetation Mapping in Peatlands Using Unmanned Aerial Vehicles (UAVs). Int. J. Appl. Earth Obs. Geoinf. 2023, 117, 103220. [Google Scholar] [CrossRef]
  14. Middleton, M.; Närhi, P.; Arkimaa, H.; Hyvönen, E.; Kuosmanen, V.; Treitz, P.; Sutinen, R. Ordination and Hyperspectral Remote Sensing Approach to Classify Peatland Biotopes along Soil Moisture and Fertility Gradients. Remote Sens. Environ. 2012, 124, 596–609. [Google Scholar] [CrossRef]
  15. Palace, M.; Herrick, C.; DelGreco, J.; Finnell, D.; Garnello, A.; McCalley, C.; McArthur, K.; Sullivan, F.; Varner, R. Determining Subarctic Peatland Vegetation Using an Unmanned Aerial System (UAS). Remote Sens. 2018, 10, 1498. [Google Scholar] [CrossRef]
  16. Räsänen, A.; Juutinen, S.; Tuittila, E.; Aurela, M.; Virtanen, T. Comparing Ultra-High Spatial Resolution Remote-Sensing Methods in Mapping Peatland Vegetation. J. Veg. Sci. 2019, 30, 1016–1026. [Google Scholar] [CrossRef]
  17. Treat, C.C.; Bloom, A.A.; Marushchak, M.E. Nongrowing Season Methane Emissions—A Significant Component of Annual Emissions across Northern Ecosystems. Glob. Change Biol. 2018, 24, 3331–3343. [Google Scholar] [CrossRef]
  18. Bertacchi, A.; Giannini, V.; Di Franco, C.; Silvestri, N. Using Unmanned Aerial Vehicles for Vegetation Mapping and Identification of Botanical Species in Wetlands. Landsc. Ecol. Eng. 2019, 15, 231–240. [Google Scholar] [CrossRef]
  19. Kameoka, T.; Kozan, O.; Hadi, S.; Asnawi; Hasrullah. Monitoring the Groundwater Level in Tropical Peatland through UAV Mapping of Soil Surface Temperature: A Pilot Study in Tanjung Leban, Indonesia. Remote Sens. Lett. 2021, 12, 542–552. [Google Scholar] [CrossRef]
  20. Kelly, M.; Tuxen, K.A.; Stralberg, D. Mapping Changes to Vegetation Pattern in a Restoring Wetland: Finding Pattern Metrics That Are Consistent across Spatial Scale and Time. Ecol. Indic. 2011, 11, 263–273. [Google Scholar] [CrossRef]
  21. Diaz-Varela, R.A.; Calvo Iglesias, S.; Cillero Castro, C.; Diaz Varela, E.R. Sub-Metric Analisis of Vegetation Structure in Bog-Heathland Mosaics Using Very High Resolution Rpas Imagery. Ecol. Indic. 2018, 89, 861–873. [Google Scholar] [CrossRef]
  22. Räsänen, A.; Virtanen, T. Data and Resolution Requirements in Mapping Vegetation in Spatially Heterogeneous Landscapes. Remote Sens. Environ. 2019, 230, 111207. [Google Scholar] [CrossRef]
Figure 1. Locations of study sites and photos of the landscape. (a) Distribution of peatlands in the Dajiuhu basin; (b,c) images of the YLC peatland patch during the non-growing and growing seasons (photographed on 25 April 2024 and 27 August 2024, respectively); (d,e) images of the Lake No. 3 peatland patch during the non-growing and growing seasons (photographed on 22 October 2023 and 27 August 2024, respectively).
Figure 2. Workflow of this study.
Figure 3. Box plots of spectral reflectance. (a,c) show the reflectance for the non-growing and growing seasons in YLC, respectively; (b,d) show the reflectance for the non-growing and growing seasons in LK3, respectively. The horizontal bar within each box represents the median of the data, and the white square symbol represents the mean.
Figure 4. Comparison of classification results in LK3. (a,f) show the true color images for the growing and non-growing seasons in LK3, respectively. (b–e) show the classification results for the growing season produced using ConvNeXt, Swin Transformer, ResNet, and EfficientNet, respectively. Similarly, (g–j) show the classification results for the non-growing season using the same four DL algorithms.
Figure 5. Comparison of classification results in YLC. (a,f) show the true color images for the growing and non-growing seasons in YLC, respectively. (b–e) show the classification results for the growing season produced using ConvNeXt, Swin Transformer, ResNet, and EfficientNet, respectively. Similarly, (g–j) show the classification results for the non-growing season using the same four DL algorithms.
Figure 6. Classification accuracy of plant types based on the four deep learning algorithms. (a) shows the overall accuracy comparison, while (b) presents the Kappa value comparison.
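The overall accuracy (OA) and Kappa values reported in Figure 6 follow the standard confusion-matrix definitions. The minimal sketch below illustrates the calculation with hypothetical label arrays; it is not the validation code used in this study.

```python
# Sketch: overall accuracy (OA) and Cohen's kappa from reference vs. predicted labels.
# The label arrays are hypothetical examples, not the study's validation samples.
import numpy as np
from sklearn.metrics import confusion_matrix, accuracy_score, cohen_kappa_score

y_true = np.array([0, 0, 1, 2, 2, 3, 4, 1, 0, 2])  # reference class IDs
y_pred = np.array([0, 1, 1, 2, 2, 3, 4, 1, 0, 0])  # predicted class IDs

cm = confusion_matrix(y_true, y_pred)
oa = accuracy_score(y_true, y_pred)            # trace(cm) / total samples
kappa = cohen_kappa_score(y_true, y_pred)      # chance-corrected agreement

# Equivalent kappa computed directly from the confusion matrix: (p_o - p_e) / (1 - p_e)
n = cm.sum()
p_o = np.trace(cm) / n
p_e = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n ** 2
print(f"OA = {oa:.3f}, kappa = {kappa:.3f}, kappa_manual = {(p_o - p_e) / (1 - p_e):.3f}")
```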
Figure 7. Detailed results for the highest overall classification accuracy. (a,d) show the true color images for the non-growing and growing seasons in YLC, respectively. (b,c) show the classification results produced using ConvNeXt and ResNet, respectively. (e,h) show the true color images for the non-growing and growing seasons in LK3, respectively. (f,g) show the classification results produced using ConvNeXt and ResNet, respectively.
Figure 8. Correlation analysis. Larger and redder circles indicate stronger positive correlations; larger and bluer circles indicate stronger negative correlations. (a,b) represent the growing season and non-growing season at LK3, respectively; (c,d) represent the growing season and non-growing season at YLC, respectively.
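The pairwise correlations visualized in Figure 8 can be reproduced for any object-level feature table. The sketch below uses synthetic, hypothetical feature columns; the names NDVI, ExG, DSM, and GLCM_contrast are placeholders rather than the exact feature set of this study.

```python
# Sketch: Pearson correlation matrix among object-level features, of the kind
# summarized in Figure 8. All feature values below are synthetic placeholders.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 200
dsm = rng.normal(1730.0, 0.5, n)  # hypothetical elevation values (m)
features = pd.DataFrame({
    "NDVI": rng.normal(0.5, 0.1, n),
    "ExG": rng.normal(0.2, 0.05, n),
    "DSM": dsm,
    "GLCM_contrast": 5.0 - 0.2 * (dsm - 1730.0) + rng.normal(0, 0.5, n),
})

corr = features.corr(method="pearson")  # pairwise Pearson correlation coefficients
print(corr.round(2))
```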
Figure 9. Feature importance analysis based on SHAP values. (a,b) represent the growing season and non-growing season at LK3, respectively; (c,d) represent the growing season and non-growing season at YLC, respectively.
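For readers unfamiliar with SHAP, the sketch below shows how a mean-absolute-SHAP importance ranking, of the kind summarized in Figure 9, can be derived. It uses a placeholder random forest on synthetic data rather than the ConvNeXt pipeline of this study, and the feature names are illustrative only.

```python
# Sketch: global feature importance as mean |SHAP value| per feature.
# A placeholder random forest on synthetic data stands in for the study's model.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))                  # hypothetical feature table
y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)  # hypothetical binary labels
feature_names = ["NDVI", "ExG", "DSM", "GLCM_contrast"]

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
explainer = shap.TreeExplainer(model)
sv = explainer.shap_values(X)
sv = sv[1] if isinstance(sv, list) else sv[..., 1]  # handle older/newer shap output formats

importance = np.abs(sv).mean(axis=0)  # mean |SHAP| per feature
for name, score in sorted(zip(feature_names, importance), key=lambda t: -t[1]):
    print(f"{name}: {score:.3f}")
```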
Figure 10. Classification performance of models trained with different input data sources and features. S-T stands for Swin Transformer.