1. Introduction
Coastal wetland vegetation biomass carbon is a significant component of carbon storage in coastal saltmarshes [1,2,3]. Through photosynthesis, wetland vegetation converts inorganic carbon in wetland ecosystems into organic carbon, which plays a crucial role in the carbon cycle of saltmarsh wetlands in China [4,5]. However, the biomass carbon of coastal wetland vegetation varies with factors such as vegetation type, geographical location, and nutrient conditions [6,7,8]. Remote sensing images can not only compensate for the poor accessibility of wetlands and the difficulty of obtaining vegetation biomass samples but also support the analysis of spatial and temporal variations in biomass [9,10,11].
Remote sensing can be used to estimate the distribution and growth of wetland vegetation and other resources. A statistical model can be established between the spectral data of remote sensing images and the aboveground biomass (AGB) of vegetation, and the biomass can be quickly estimated over a large area. This method solves the problems of poor accessibility and safety concerns associated with ground surveys and the sampling of wetland vegetation biomass [12,13]. Remote sensing is widely used in the fields of vegetation resource investigation, dynamic monitoring, and biomass estimation [14,15,16]. However, unlike in forests and farmland, the remote sensing inversion of aboveground biomass in wetlands is affected by complex environmental factors, such as differences in underlying surface conditions [17,18]. In particular, in the Chongming Island area, wetland vegetation is significantly affected by the law of natural differentiation [10]. The east is affected mainly by seawater, whereas the west is affected mainly by the Yangtze River, resulting in high heterogeneity in wetland vegetation growth and complex spectral response processes [19]. In addition, different plants differ in their spectral, absorption, and reflection characteristics. Distinguishing wetland vegetation types for biomass remote sensing inversion may reduce the impact of spectral response differences between vegetation types to a certain extent and improve the estimation accuracy of aboveground biomass [20]. However, it is still unclear how the biomass remote sensing inversion of multiple wetland vegetation types in the same area differs and what specific impacts modeling with and without distinguishing vegetation types will have on the wetland vegetation biomass inversion process.
In the selection of vegetation biomass inversion models, traditional statistical models such as univariate and multiple linear regressions have been used [21]. With the development of machine learning, nonparametric machine learning methods such as the neural network (NN) regression model, support vector regression (SVR) model, and random forest (RF) regression model have shown great potential for the inversion of AGB from wetland vegetation in recent years [22,23,24,25,26]. Gao et al. [27] utilized the RF model along with the hyperspectral verification method of plots to increase the accuracy of Moderate Resolution Imaging Spectroradiometer (MODIS) images for estimating the AGB and coverage of alpine grasslands on the Qinghai–Tibet Plateau. Pecina et al. [28] used a high-resolution dataset acquired by an unmanned aerial vehicle (UAV) platform combined with the RF algorithm to estimate the AGB of coastal meadows in Estonia. Lu et al. [29] proposed a solution that combined UAV and Sentinel-2 data to successfully estimate and map the AGB of Phragmites australis in the Nandagang Wetland Reserve in Hebei Province, China, effectively solving the problem of insufficient AGB samples due to the difficulty of accessing wetlands. Prakash et al. [30] introduced a robust methodological protocol, which employs machine learning models alongside field measurements and multisensor SAR data, to achieve highly accurate and low-uncertainty estimates of AGB in mangrove forests (R² = 0.93). Statistical methods are the main methods for biomass estimation, but the choice of model depends on factors such as the study area and vegetation type. Previous research has noted variations in biomass among different wetland vegetation types [31,32,33]. However, it is unclear how modeling these types separately versus using an overall model impacts biomass estimation. Currently, there is a gap in systematic research on coastal wetland biomass inversion, and the distribution pattern of wetland vegetation biomass on Chongming Island has not been determined.
From the perspective of image data sources, Landsat, Sentinel-2, MODIS, and other optical satellite data have been widely used to invert vegetation aboveground biomass [34,35,36]. Among them, Landsat images offer higher spatial resolution than MODIS images and are better suited for wetland vegetation communities with limited distribution areas [37,38,39]. While Sentinel-2 images provide even higher spatial resolution, their use in large-scale studies demands a greater volume of image data, involves significant computational effort, and is constrained by the lack of earlier historical records [40,41,42]. In contrast, Landsat data are widely preferred because of their easy accessibility and their broad and historical perspective on vegetation [43,44]. Statistical methods are the main methods used for biomass estimation via optical remote sensing images, but the choice of model depends on factors such as the study area and vegetation type.
In addition to optical data, synthetic aperture radar (SAR) and light detection and ranging (Lidar) data can enhance the accuracy of AGB estimation [45,46,47]. Zhu et al. [48] integrated SAR and UAV data to estimate the AGB of the largest artificially planted mangrove forest in China on Qi’ao Island, which could provide more detailed and accurate information for AGB estimation. Salum et al. [49] incorporated Lidar data and improved the inversion accuracy of mangrove biomass. Lucas et al. [50] retrieved mangrove distribution, age, structure, and biomass parameters by combining optical and SAR data in a Malaysian reserve. While SAR has potential for mangrove biomass retrieval [51,52,53], the biomass of coastal herbaceous wetland vegetation, such as P. australis and Spartina alterniflora, can be challenging to retrieve via SAR data because of the low height and weaker radar signal returns of these plants. On the other hand, Lidar has high accuracy in retrieving vegetation height but has limited availability [33,54,55]. Furthermore, obtaining ground Lidar data in coastal wetlands is difficult, and large-scale monitoring via UAV data is currently challenging [56,57]. Considering the cost-effectiveness and workload involved in data processing, Landsat imagery is the superior choice for the inversion of aboveground biomass in coastal wetland vegetation.
In this study, various wetland vegetation AGB inversion models, such as univariate regression models, multiple linear regression models, and machine learning regression models, were compared based on Landsat images to explore the most suitable AGB retrieval method for each wetland vegetation type. Additionally, the impact of classification modeling on the AGB inversion of wetland vegetation was quantitatively analyzed to examine the utility of different models, spectral bands, species classification, and both parametric and nonparametric statistical methods for biomass prediction. By comprehensively comparing various vegetation types and inversion methods, this research quantitatively reveals the influence of different saltmarsh vegetation types on the remote sensing inversion of wetland AGB. Such an approach is essential for assessing wetland AGB over large areas, significantly enhancing our understanding of biomass dynamics in coastal ecosystems and providing more reliable data support for ecological monitoring and resource management.
3. Results
This study employed 13 models, including univariate regression, multiple linear regression, and machine learning regression models, to estimate the aboveground biomass of vegetation through remote sensing. These models were specifically applied to different vegetation types, such as P. australis, S. alterniflora, and Scirpus spp., and the overall saltmarsh vegetation, which represented all of the species combined. The optimal models for each vegetation type were then integrated to estimate the aboveground biomass of the study area. This was compared with the results from the overall modeling approach to determine whether classification-based modeling enhanced the accuracy of remote sensing biomass estimation for coastal wetland vegetation. This study also aimed to quantify the degree of accuracy improvement and its impact across various vegetation types in coastal wetlands.
3.1. Single-Species AGB Retrieval
This study analyzed the relationships between spectral variables and AGB for different types of wetland vegetation. For P. australis, most spectral variables were significantly correlated with biomass, except for those in the blue band (Table 3). The red band showed the highest correlation with the wet biomass of P. australis, while the NDVI and RVI showed positive correlations. The NDSVI, red band, and NDVI were strongly correlated with dry biomass. Based on these findings, 15 noncollinear spectral variables were selected for modeling P. australis biomass. For S. alterniflora, the correlations between the spectral variables and biomass were consistent for both dry and wet biomass, except for the SWIR2130 band, which was significantly correlated with only wet biomass. Therefore, 11 variables were chosen for modeling the wet biomass of S. alterniflora, and the same variables excluding SWIR2130 (10 in total) were used for modeling dry biomass. For Scirpus spp., all spectral variables, except for the near-infrared band and NDRI, were significantly correlated with both wet and dry biomass. Therefore, 14 spectral variables were chosen for the biomass inversion of Scirpus spp., excluding the near-infrared band and NDRI. The structural differences in vegetation canopies among species lead to variations in spectral responses between species. Additionally, different species exhibit varying optical sensitivities across different models.
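The correlation screening described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in this study: the function names are hypothetical, the reflectance and biomass values are made-up placeholders, and significance is judged by comparing the t statistic of the Pearson coefficient against a critical value supplied by the user (2.776 is the two-sided 5% critical value for 4 degrees of freedom).

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def t_statistic(r, n):
    """t statistic for testing r != 0 with n - 2 degrees of freedom."""
    denom = 1.0 - r * r
    if denom <= 0.0:  # perfect correlation (or rounding slightly past 1)
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / denom)

def screen_variables(spectral, biomass, t_critical):
    """Return {name: r} for variables whose |t| exceeds t_critical."""
    n = len(biomass)
    kept = {}
    for name, values in spectral.items():
        r = pearson_r(values, biomass)
        if abs(t_statistic(r, n)) > t_critical:
            kept[name] = r
    return kept

# Placeholder values for illustration only (not data from this study).
biomass = [100.0, 150.0, 200.0, 260.0, 310.0, 400.0]
spectral = {
    "RVI": [2.0, 2.6, 3.1, 3.9, 4.4, 5.5],
    "blue": [0.05, 0.03, 0.06, 0.04, 0.05, 0.04],
}
selected = screen_variables(spectral, biomass, t_critical=2.776)
print(sorted(selected))
```

A collinearity check among the retained variables would be a separate step and is not shown here.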
3.1.1. P. australis AGB Inversion
This study evaluated different regression models for predicting the AGB of P. australis. When 105 univariate models for wet and dry AGB were compared separately (Table S1), the highest accuracy was achieved by the cubic regression model using the RVI. However, the accuracy of dry biomass prediction was generally lower than that of wet biomass prediction when univariate models were used for P. australis. For wet biomass estimation of P. australis, the stepwise multiple linear regression model was consistent with the univariate linear model. In contrast, for dry biomass estimation, the stepwise multiple linear regression model selected the NDSVI, with an R² value of only 0.340. The performance of the stepwise multiple linear regression model was affected by the samples, leading to inferior performance compared with that of the optimal univariate linear regression model. Five machine learning models were compared for their ability to predict the AGB of P. australis. The RF model performed the best, utilizing spectral data effectively for biomass estimation (Table 4). The GPR model had slightly lower accuracy but was still viable. The DT model worked well for wet biomass but not for dry biomass. The SVR and NN models struggled to utilize spectral data effectively. Overall, the machine learning models were slightly more accurate for wet biomass than for dry biomass, which was consistent with the univariate and multiple linear regression models. The RF regression model outperformed the univariate regression models and stepwise multiple linear regression models in predicting the AGB of P. australis. However, limited measured data pose challenges for fully harnessing the advantages of some machine learning methods in nonparametric regression.
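To make the univariate approach concrete, the sketch below computes the RVI and NDVI from red and near-infrared reflectance and fits a cubic least-squares polynomial of biomass on the RVI. It assumes the common definitions RVI = NIR/Red and NDVI = (NIR − Red)/(NIR + Red), which may differ in detail from the band combinations used in this study; the function names are hypothetical, and the reflectance and biomass values are placeholders rather than measured data.

```python
def rvi(nir, red):
    """Ratio vegetation index (assumed definition: NIR / Red)."""
    return nir / red

def ndvi(nir, red):
    """Normalized difference vegetation index."""
    return (nir - red) / (nir + red)

def fit_polynomial(x, y, degree):
    """Least-squares polynomial coefficients [c0, c1, ..., c_degree]."""
    m = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[sum(v ** (i + j) for v in x) for j in range(m)] for i in range(m)]
    b = [sum(yv * xv ** i for xv, yv in zip(x, y)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, m):
            factor = a[row][col] / a[col][col]
            for k in range(col, m):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        s = sum(a[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (b[i] - s) / a[i][i]
    return coeffs

def predict(coeffs, xv):
    """Evaluate the fitted polynomial at xv."""
    return sum(c * xv ** i for i, c in enumerate(coeffs))

def r_squared(y, y_hat):
    """Coefficient of determination of predictions y_hat against y."""
    mean_y = sum(y) / len(y)
    ss_res = sum((obs - est) ** 2 for obs, est in zip(y, y_hat))
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)
    return 1.0 - ss_res / ss_tot

# Placeholder reflectances and biomass (g/m^2), for illustration only.
nir = [0.30, 0.34, 0.38, 0.42, 0.46, 0.50]
red = [0.10, 0.09, 0.08, 0.07, 0.06, 0.05]
agb = [420.0, 610.0, 830.0, 990.0, 1100.0, 1150.0]

x = [rvi(n, r) for n, r in zip(nir, red)]
cubic = fit_polynomial(x, agb, degree=3)
print("R^2 of cubic RVI model:", r_squared(agb, [predict(cubic, v) for v in x]))
```

The same `fit_polynomial` helper covers the linear and quadratic cases by changing `degree`; logarithmic or S-curve forms would need a transformed predictor or response.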
3.1.2. S. alterniflora AGB Inversion
The RVI outperformed the other spectral variables in the estimation of the AGB of S. alterniflora. Among the seven univariate regression models tested, the RVI was the optimal spectral variable for all of the models except for the S-curve model when wet biomass was estimated and for all the models when dry biomass was estimated (Table S2). The comparison revealed that the quadratic model using the RVI achieved the best performance in estimating both the wet and dry biomass of S. alterniflora, with a slightly greater accuracy in estimating wet biomass. The stepwise multiple linear regression model for estimating the AGB of S. alterniflora included the RVI and NDVI as selected spectral variables; these two spectral variables had the highest correlation with the wet and dry biomass of S. alterniflora, which was consistent with the correlation analysis results. The stepwise multiple linear regression model for dry biomass estimates could use information from both the RVI and the NDVI, resulting in greater accuracy than that of univariate linear regression. However, for wet biomass estimation, the accuracy was between the accuracy of the RVI and the accuracy of the NDVI univariate linear regression models. This is due to the susceptibility of both stepwise multiple linear regression and univariate linear regression models to the influence of sample data [93]. During cross-validation, stepwise multiple linear regression did not always select both the RVI and the NDVI, resulting in lower overall accuracy than that of the RVI univariate linear regression model.
After the five machine learning regression models were compared, the RF regression model was found to perform well in estimating both the wet and dry biomass of S. alterniflora, with higher accuracy for wet biomass than dry biomass. The GPR model with a squared exponential kernel function achieved the second highest accuracy. However, the ability of the SVR, NN, and DT models to estimate the biomass of S. alterniflora was limited (Table 5). Although the RF regression model had a greater estimation accuracy for AGB than the stepwise regression model, its accuracy was lower than that of univariate models such as the RVI quadratic and cubic models. This is primarily because machine learning regression models struggle to learn the relationships among data when the amount of data is limited.
3.1.3. Scirpus spp. AGB Inversion
The analysis revealed that the RVI and NDVI were more effective than the other spectral variables for estimating the wet biomass of Scirpus spp. Among the 98 univariate regression models, the logarithmic model based on the RVI had the highest accuracy in estimating the wet biomass of Scirpus spp. (Table S3). When the dry biomass of Scirpus spp. was estimated, the cubic regression model based on the NDVI had greater accuracy than the other univariate models did. Additionally, the R² of the optimal model for wet biomass estimation was greater than that of the optimal model for dry biomass estimation. In the stepwise regression models for the inversion of the wet and dry biomass of Scirpus spp., the RVI was selected from the 14 spectral variables. The inversion of dry biomass via stepwise regression was consistent with the use of the RVI for univariate linear regression. However, when inverting wet biomass, the spectral variable selected by the stepwise multiple linear regression model was the RVI, while the best spectral variable selected by the univariate linear regression model was the NDVI. This inconsistency arose because tenfold cross-validation can select different variables in each fold, and the RVI was selected more often than the NDVI in the stepwise multiple linear regression model.
When machine learning regression models were used for the wet and dry biomass inversion of Scirpus spp., both the RF and GPR methods effectively inverted AGB, and their inversion accuracy was better than that of the SVR, NN, and DT models. Among them, RF performed slightly better than GPR for wet biomass, while GPR performed better for dry biomass estimation (Table 6). However, the overall estimation accuracy of the machine learning regression models was lower than that of the univariate regression models, mainly because the limited sample size of the measured data makes it difficult to exploit the advantages of machine learning.
In summary, when wetland vegetation types were distinguished for the AGB inversion of a single species, the inversion accuracy was ranked as follows: Scirpus spp. > S. alterniflora > P. australis. Additionally, different wetland vegetation types were suited to different regression models. For P. australis, the RF regression models performed better than the other models for the inversion of aboveground wet and dry biomass. Scirpus spp. was best suited to univariate regression models: the logarithmic model using the RVI was the most accurate for the inversion of wet biomass, and the cubic model using the NDVI was the most accurate for the inversion of dry biomass. S. alterniflora was suited to an RF regression model for wet biomass and a univariate regression model for dry biomass, where the RVI quadratic model had the smallest inversion error. Thus, a single approach may not be suitable for all wetland vegetation types, and it is necessary to choose an appropriate regression model and spectral index for accurate biomass estimation.
3.2. Remote Sensing Inversion of the Total AGB of Saltmarsh Vegetation
In the preceding sections, appropriate regression models for AGB estimation were selected separately for each type of wetland vegetation, but machine learning regression models may not perform well when the number of site samples is low. In addition, site sampling in wetlands requires considerable manpower and resources, resulting in limited sample sizes for each type of vegetation. To address these issues, an overall modeling approach was used with all available data on saltmarsh vegetation. This method efficiently utilizes site data, avoids multiple analyses and modeling, and provides a simple way to estimate aboveground wet and dry biomass for wetland vegetation.
The analysis of the correlation between the remote sensing spectral data and the AGB of saltmarsh wetland vegetation on Chongming Island revealed that there was a significant correlation (p < 0.01) between the wet biomass and all 15 spectral variables, except for the NDRI. Among these variables, the RVI had the highest correlation with wet biomass (Table 7). The dry biomass was significantly correlated (p < 0.01) with all 14 spectral variables except for the SWIR1640 band and was significantly correlated (p < 0.05) with the blue band. The EVI had the highest correlation with dry biomass. Therefore, the 15 spectral variables that were significantly correlated with the aboveground wet and dry biomass of saltmarsh wetland vegetation were selected for analysis.
The RVI was identified as the optimal spectral variable for estimating the aboveground wet biomass of saltmarsh vegetation, and the highest accuracy was achieved with the cubic model (Table 8). The RVI was more effective for wet biomass estimation than the other spectral variables, while the EVI and NDVI were better suited for estimating aboveground dry biomass. Among the univariate regression models, the cubic model using the EVI was the most accurate for estimating aboveground dry biomass, and wet biomass estimation was more accurate than dry biomass estimation in saltmarsh vegetation.
Compared with univariate models, the stepwise multiple linear regression model, which incorporates the RVI and NDSVI, demonstrated greater accuracy in estimating aboveground wet biomass in saltmarsh vegetation. Additionally, for aboveground dry biomass estimation, the EVI and NDSVI selected by the stepwise multiple linear regression model outperformed the two univariate linear regression models, with dry biomass accuracy surpassing that of wet biomass in saltmarsh vegetation.
The RF regression model was the most effective among the five machine learning regression models for utilizing image spectral data to estimate the aboveground wet and dry biomass of saltmarsh vegetation. This model achieved the highest accuracy and has great potential in estimating aboveground wet biomass. Despite slightly lower accuracy, the GPR model performed well in estimating the AGB of saltmarsh vegetation and outperformed the univariate and stepwise models. The accuracy order for estimating aboveground wet biomass was RF > GPR > SVR > NN > DT, while for aboveground dry biomass, it was RF > GPR > NN > DT > SVR (Table 9). Furthermore, the accuracy of the machine learning regression models for the overall AGB of saltmarsh vegetation was greater than that of the single vegetation type inversion methods. This is mainly due to the larger sample data available for overall modeling, which better leverages the advantages of machine learning nonparametric methods.
3.3. Comparison of AGB Retrieval Results With and Without the Distinction of Wetland Vegetation Types
This study evaluated the performance of various models, including seven univariate models, stepwise multiple linear regression models, and five machine learning regression models, in predicting AGB for specific wetland vegetation types (P. australis, S. alterniflora, and Scirpus spp.) and overall saltmarsh vegetation. Optimal inversion models were obtained for each wetland vegetation type and overall saltmarsh vegetation, but there are advantages and disadvantages when distinguishing wetland vegetation types during modeling. Distinguishing vegetation types in modeling provides consistent impacts on radiation transfer due to the uniform plant structure and morphology of wetland vegetation. However, this approach may be influenced by other factors, such as underlying surfaces and limited sample sizes. Conversely, not distinguishing vegetation types allows for a large sample size, but variations in plant structure and morphology can lead to diverse effects on the radiation transfer process. This study further investigated the influence of differences in wetland vegetation types on AGB inversion by comparing results under conditions of distinguishing and not distinguishing vegetation types.
The RF regression model outperformed the other 12 models in the accuracy of the inversion of saltmarsh vegetation AGB when modeling without distinguishing wetland vegetation types. Consequently, this model was selected to predict the aboveground wet and dry biomass of saltmarsh vegetation. In contrast, when distinguishing between wetland vegetation types, optimal aboveground wet and dry biomass inversion models were selected for each wetland vegetation type, and their inversion results are presented in Table 10.
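The comparison between type-specific and overall modeling can be pictured with the following sketch, which routes each sample to the model registered for its vegetation type and scores both strategies with R² and RMSE. Everything in it is hypothetical: the sample values, the model coefficients, and the function names are placeholders chosen for illustration, and the toy samples were constructed to match the type-specific models exactly, so the printed scores say nothing about the results in Table 10.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted biomass."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(observed, predicted):
    """Coefficient of determination of predictions against observations."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def predict_by_type(samples, type_models):
    """Apply the model registered for each sample's vegetation type."""
    return [type_models[veg_type](index) for veg_type, index, _ in samples]

def predict_overall(samples, overall_model):
    """Apply one model to every sample, ignoring vegetation type."""
    return [overall_model(index) for _, index, _ in samples]

# Hypothetical samples: (vegetation type, RVI, measured AGB in g/m^2).
samples = [
    ("P. australis", 4.0, 900.0),
    ("P. australis", 6.0, 1300.0),
    ("S. alterniflora", 5.0, 1500.0),
    ("S. alterniflora", 7.0, 2100.0),
    ("Scirpus spp.", 2.0, 200.0),
    ("Scirpus spp.", 3.0, 300.0),
]

# Hypothetical models with made-up coefficients (not fitted in this study).
type_models = {
    "P. australis": lambda v: 200.0 * v + 100.0,
    "S. alterniflora": lambda v: 300.0 * v,
    "Scirpus spp.": lambda v: 100.0 * v,
}

def overall_model(v):
    return 250.0 * v

observed = [agb for _, _, agb in samples]
by_type = predict_by_type(samples, type_models)
overall = predict_overall(samples, overall_model)
print("type-specific:", r_squared(observed, by_type), rmse(observed, by_type))
print("overall:", r_squared(observed, overall), rmse(observed, overall))
```

In a mapping workflow, the vegetation type of each pixel would come from a classification layer, so any classification error propagates into the type-specific predictions.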
In general, the difference in accuracy between distinguishing and not distinguishing wetland vegetation types was relatively small for the inversion of saltmarsh wetland vegetation wet and dry biomass. However, the accuracy of distinguishing vegetation types was slightly better than that of not distinguishing them. When distinguishing wetland vegetation types, the R² values of the measured and fitted wet and dry biomasses could reach 0.806 and 0.839, respectively, and the inversion accuracy of type-based individual modeling was better than that of overall modeling for all three types of saltmarsh wetland vegetation.
For wet biomass, the difference in inversion accuracy between distinguishing and not distinguishing wetland vegetation types was relatively small. When modeling the wet biomass of P. australis separately, the R² value was 0.541, which was slightly better than that of the overall model (0.537), and the difference in the RMSE between the individual and overall models was only 3.600 g/m². Therefore, the overall modeling method can adequately invert the wet biomass of P. australis. The inversion accuracy of the wet biomass of S. alterniflora was comparable to that of P. australis, and the overall modeling method could be used to invert it as well. The differences in R² and RMSE between distinguishing and not distinguishing wetland vegetation types were 0.027 and 8.262 g/m², respectively. However, when the wet biomass of Scirpus spp. was inverted, the overall modeling method tended to overestimate low values and underestimate high values, and the difference in the RMSE between the individual and overall modeling methods was 11.412 g/m². Therefore, the individual modeling method for estimating the wet biomass of Scirpus spp. was better than the overall modeling method and provided more accurate inversion results (Figure 4).
The difference in inversion accuracy between modeling individually while distinguishing wetland vegetation types and overall modeling without distinguishing vegetation types was greater for dry biomass than for wet biomass. Additionally, by distinguishing wetland vegetation types, the inversion accuracy can be improved by reducing the underestimation of high values. For the dry biomass of P. australis, there was little difference between the individual and overall modeling methods, with differences in R² and RMSE of only 0.032 and 9.090 g/m², respectively. Similarly, for the dry biomass of S. alterniflora, the differences between the individual modeling and overall modeling methods were relatively small, with differences in R² and RMSE of only 0.014 and 4.487 g/m², respectively. Although the overall modeling method tends to underestimate high values more than the individual modeling methods do, both methods have comparable inversion effects on the dry biomass of S. alterniflora. However, when the dry biomass of Scirpus spp. is inverted, high values are occasionally overestimated when the overall modeling method is used compared with the individual modeling methods.
In summary, univariate regression models, multiple linear regression models, and machine learning regression models were compared to estimate the aboveground wet and dry biomass of three main saltmarsh vegetation types on Chongming Island. The optimal inversion models were selected for P. australis, S. alterniflora, Scirpus spp., and all saltmarsh vegetation. The results indicated that using RF regression models for overall modeling could better fit the aboveground wet and dry biomass of wetland vegetation. The inversion accuracy for wet biomass was consistently higher than that for dry biomass, primarily because wet biomass originates from fresh vegetation with higher water content, which results in more pronounced absorption in the red wavelength band. This enhanced absorption correlates with greater inversion accuracy, with the difference in accuracy between wet and dry biomass ranging from 1.111% to 3.974%. While individual modeling by wetland vegetation type provides a better reflection of the aboveground biomass of each type, it requires multiple models and has high computational complexity. On the other hand, using the RF machine learning regression model for overall modeling without distinguishing wetland vegetation types can quickly reveal the distribution of the AGB of saltmarsh vegetation on Chongming Island with results similar to those of individual modeling.
3.4. Estimation of the AGB of Wetland Vegetation Under Different Models
To evaluate the impacts of different regression models on wetland vegetation AGB estimation, various models, including univariate regression, multiple linear regression, machine learning regression, and optimal model combinations for each vegetation type, were compared. The results generally revealed small differences in the inversion performances of several regression models for wet biomass. Among these models, the RF regression model estimates were similar to those of the combined optimal models for the individual vegetation types, while the univariate regression model underestimated the S. alterniflora AGB in Beiliuyao and the stepwise regression model underestimated the Scirpus spp. AGB in Dongtan compared with the other models. A comparison of the biomass inversion of saltmarsh wetland vegetation by multiple models revealed that the RF regression model outperformed the univariate and multiple linear regression models in fitting saltmarsh vegetation AGB and reducing underestimation in some regions. Furthermore, the RF regression model without distinguishing vegetation types fitted the saltmarsh wetland vegetation AGB about as well as the modeling that distinguished vegetation types. In combination with the classification of saltmarsh wetland vegetation, species differences in AGB per unit area were also observed, with S. alterniflora having the highest biomass, followed by P. australis and Scirpus spp. The distribution of AGB per unit area of saltmarsh vegetation was consistent with the spatial classifications (Figure 5).
The dry biomass inversion results revealed that the univariate regression model underestimated the S. alterniflora AGB in Beiliuyao, whereas the multiple linear regression model overestimated the Scirpus spp. AGB in Dongtan, compared with the other regression models (Figure 6). This is attributed to the fact that the univariate regression model has a limited ability to fit high biomass values, resulting in underestimation, while the multiple linear regression model tends to overestimate low biomass values, leading to lower fitting accuracy for low biomass values [20]. In contrast, the RF regression model effectively reduced both the overestimation of low biomass values and the underestimation of high biomass values, providing a better fit for the AGB of saltmarsh wetland vegetation. When the inversions of the RF regression model and the combined inversions for distinguishing vegetation types were compared, both methods demonstrated improved fitting of the saltmarsh vegetation AGB. However, when distinguishing saltmarsh vegetation types, the low values of S. alterniflora and P. australis and the high values of Scirpus spp. exceeded the range of the sample training data and were affected by classification accuracy. Therefore, overall modeling via the RF regression model can better estimate the aboveground dry biomass of saltmarsh vegetation.
3.5. Feature Importance and Stability Analysis of the RF Regression Model
The RF regression model is effective at estimating the AGB of wetland vegetation using spectral data and can determine the relative importance of each spectral variable (Figure 7). The most crucial variable for wet biomass estimation was the RVI, followed by the NDWI2130, NDVI, and NDWI1640. NDWI2130 and NDWI1640 ranked highly in importance because they reflect variations in vegetation water content, which is correlated with the AGB [72]. Additionally, the RVI, NDVI, and SAVI also ranked high in importance as they are sensitive to differences in vegetation greenness [68,94]. Spectral indices such as the NDVI are particularly sensitive to chlorophyll content, which serves as an indicator of biomass [95,96,97]. The variables strongly correlated with wet biomass, such as the RVI, NDWI1640, and NDWI2130, were given high importance in the RF regression model. The RF regression model automatically selected suitable variables based on the data features for wetland vegetation biomass estimation.
For dry biomass estimation, the NDSVI emerged as the most important spectral variable, followed by the RVI, EVI, PVI, SAVI, NDVI, and DVI in descending order of importance. This ranking aligns with their correlation with dry biomass, with the EVI, SAVI, PVI, DVI, NDSVI, RVI, and NDVI being the top seven variables. The NDSVI is sensitive to vegetation water loss, the EVI, RVI, and SAVI reflect differences in vegetation growth, and the PVI indicates vegetation structure; all of these factors are strongly correlated with dry biomass. The RF regression model automatically selected the appropriate variables based on their correlation with biomass, ensuring reliable estimation of wetland vegetation.
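One way to obtain a variable ranking of this kind is permutation importance, sketched below. This is an assumption made for illustration: the text does not state which importance measure the RF implementation used, and impurity-based importance is another common choice. The function names are hypothetical, and the toy model and data are placeholders that stand in for a fitted regressor and the measured samples.

```python
import random

def mse(y, y_hat):
    """Mean squared error between observations and predictions."""
    return sum((obs - est) ** 2 for obs, est in zip(y, y_hat)) / len(y)

def permutation_importance(predict, rows, y, n_features, seed=0):
    """Increase in MSE when each feature column is shuffled in turn.

    predict: callable mapping one feature row (a list) to a prediction.
    rows:    list of feature rows, each a list of length n_features.
    """
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in rows])
    importances = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        rng.shuffle(column)  # break the link between feature j and the target
        shuffled = [row[:j] + [column[i]] + row[j + 1:]
                    for i, row in enumerate(rows)]
        permuted_error = mse(y, [predict(row) for row in shuffled])
        importances.append(permuted_error - baseline)
    return importances

# Toy stand-in: the "model" uses only feature 0, so feature 1 should
# receive zero importance. Values are placeholders, not study data.
rows = [[float(i), float((i * 7) % 5)] for i in range(20)]
targets = [3.0 * row[0] for row in rows]

def toy_model(row):
    return 3.0 * row[0]

scores = permutation_importance(toy_model, rows, targets, n_features=2)
print(scores)
```

Averaging the scores over several shuffles (different seeds) would make the ranking less sensitive to a single permutation.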
The tenfold cross-validation errors between the model fittings and the measured AGB at the sample sites are shown in Figure 8. The RF regression model used in this study is stable and reliable for estimating the AGB of saltmarsh vegetation. The small fluctuation in overall sample errors indicates the model’s consistency and robustness, making it suitable for accurate AGB estimation, which is supported by the similarity in the spectral range between the sample and fitted data.
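The fold-by-fold error check behind such a stability assessment can be sketched as follows. To stay self-contained, the sketch substitutes a simple least-squares line for the RF regressor, which is a deliberate simplification; the function names are hypothetical, the folds are contiguous blocks without shuffling, and the data are synthetic placeholders.

```python
import math

def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds = []
    start = 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def fit_line(x, y):
    """Ordinary least-squares slope and intercept (stand-in for the RF model)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def cross_validated_rmse(x, y, k=10):
    """RMSE on each held-out fold after fitting on the remaining folds."""
    errors = []
    for test_idx in k_fold_indices(len(x), k):
        held_out = set(test_idx)
        train_x = [x[i] for i in range(len(x)) if i not in held_out]
        train_y = [y[i] for i in range(len(y)) if i not in held_out]
        slope, intercept = fit_line(train_x, train_y)
        squared = [(y[i] - (slope * x[i] + intercept)) ** 2 for i in test_idx]
        errors.append(math.sqrt(sum(squared) / len(squared)))
    return errors

# Synthetic placeholder data: a spectral index and biomass (g/m^2).
index = [2.0 + 0.2 * i for i in range(30)]
agb = [150.0 * v + 40.0 * math.sin(i) for i, v in enumerate(index)]

fold_errors = cross_validated_rmse(index, agb, k=10)
print("per-fold RMSE:", [round(e, 1) for e in fold_errors])
print("spread (max - min):", round(max(fold_errors) - min(fold_errors), 1))
```

A small spread of per-fold errors is the kind of evidence that supports a stability claim; with real samples, the indices would normally be shuffled before the folds are formed.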
5. Conclusions
In this study, the AGB of three wetland vegetation types (P. australis, S. alterniflora, and Scirpus spp.), as well as that of the overall saltmarsh vegetation on Chongming Island, was estimated using various regression models (univariate, multiple linear, and machine learning), and the effects of modeling with and without the distinction of vegetation types were compared. The study revealed that: (1) The impact of vegetation type on model performance varies. Biomass retrieval was more precise for Scirpus spp. than for S. alterniflora or P. australis when distinguishing vegetation types. Different vegetation types require specific regression models. For example, P. australis performs well with RF, Scirpus spp. with univariate regression models, and S. alterniflora with a mix of RF and univariate models; (2) Overall modeling without distinguishing vegetation types is beneficial, addressing challenges in wetland sampling and limited training samples. In terms of model selection, nonparametric regression models such as RF demonstrate superior performance in fitting the biomass of saltmarsh vegetation compared with univariate and multiple linear regression models; and (3) Although distinguishing between vegetation types and modeling them separately slightly improves the inversion results, with an increase in R² by 0.003 and 0.01 and a reduction in RMSE by 6.398 and 6.451, it requires considerable computational effort. In contrast, overall modeling without distinguishing vegetation types can still accurately estimate wetland vegetation biomass, especially when nonparametric methods such as random forests are applied, offering strong practical potential. The assessment in this study of the impact of coastal wetland saltmarsh vegetation types on aboveground biomass inversion not only contributes to a better understanding of the biomass dynamics of different vegetation types but also provides valuable scientific support for the protection and restoration of wetland ecosystems.