Hyperspectral Inversion of Phragmites Communis Carbon, Nitrogen, and Phosphorus Stoichiometry Using Three Models

Studying the stoichiometric characteristics of plant C, N, and P is an effective way of understanding plant survival and adaptation strategies. In this study, 60 fixed plots and 120 random plots were set up in a reed-swamp wetland, and canopy spectral data were collected to analyze the stoichiometric characteristics of C, N, and P across all four seasons. Three machine learning models (random forest, RF; support vector machine, SVM; and back propagation neural network, BPNN) were used to estimate these stoichiometric characteristics via hyperspectral inversion. The results showed significant differences in these characteristics across seasons. The RF model had the highest prediction accuracy for the stoichiometric properties of C, N, and P: the R² of the spring, summer, autumn, and winter models was greater than 0.88, 0.95, 0.97, and 0.92, respectively. According to the root mean square error (RMSE) results, the model error of total C (TC) inversion was the smallest, and that of C/N inversion was the largest. The SVM yielded poor predictive results: the R² of the four seasonal models was greater than 0.82, 0.81, 0.81, and 0.70, respectively. According to the RMSE results, the model error of TC inversion was the smallest, and that of C/P inversion was the largest. The BPNN yielded high prediction accuracy: the R² of the four seasonal models was greater than 0.87, 0.96, 0.84, and 0.90, respectively. According to the RMSE results, the model error of TC inversion was the smallest, and that of C/P inversion was the largest. The accuracy and stability of the results were verified on validation data. The RF model showed the greatest prediction stability, followed by the BPNN and then the SVM model. The results indicate that the accuracy and stability of the RF model were the highest. Hyperspectral data can be used to accurately invert the stoichiometric characteristics of C, N, and P in wetland plants. This study provides a scientific basis for the long-term dynamic monitoring of plant stoichiometry through hyperspectral data in the future.


Introduction
All living organisms are composed of multiple chemical elements, with the three main ones being C, N, and P. The balance of elements has an important influence on biological production and nutrient cycling [1]. Many key characteristics of organisms and ecosystems can be determined by the dynamic ratio of these elements [2]. Such an approach, known as "ecological stoichiometry," studies the elemental components of organisms, particularly C, N, and P [3]. Ecological chemometrics
Remote Sens. 2020, 12, 1998

basis for estimating and dynamically monitoring reed stoichiometry by hyperspectral data in different seasons without causing damage to the plants.

Study Area
The study was conducted in Beijing Hanshiqiao Wetland Nature Reserve (40°07′N and 116°48′E), located in Shunyi District, Beijing (Figure 1). It is a swamp wetland on the outskirts of Beijing covering a total area of 1900 ha, where management focuses on protecting reed (Phragmites communis) swamp and wild animals. It is part of the Chaobai River system, with the Caijia River as its main water source. The area has a warm temperate semi-humid monsoon climate, with an average annual temperature of 11.9 °C and average annual precipitation of 603 mm [32]. Precipitation mainly occurs from July to August. Reeds in the study area grow in 5-30 cm of water. During the sampling period, the vegetation coverage was 80-90% in spring and 85-95% in the other three seasons.

Data Collection
Sampling was conducted in each season in 2018 (in April, July, October, and December), with 90 samples collected each season. In order to ensure the comparability of samples from different seasons, a total of 60 fixed plots were set up in the upstream, middle, and downstream regions of the reserve, and the remaining sampling points were randomly distributed, for a total of 120 random plots over the study period. The size of each quadrat was 1 m × 1 m.
Spectral data of the reed canopy were collected using an ASD FieldSpec 4 spectroradiometer (ASD, Analytical Spectral Devices, Inc., Boulder, CO, USA) in the wavelength range of 350-2500 nm. A 5-m fiber optic cable was attached, and the height of the support rod was adjusted to ensure that the spectral probe maintained a distance of 1 m from the reed canopy in the quadrat. Data collection was conducted between 10:00 and 14:00 local time (China Standard Time, GMT+8). At this time, the solar elevation angle was greater than 45°, which ensures good lighting conditions. The spectrum was corrected using a reference panel before each sample was collected. Ten spectral datasets were collected and averaged as one spectral reading for each sampling point. In order to ensure accurate inversion of the reed stoichiometry, the spectral bands affected by water-vapor absorption were removed.
Three reeds were harvested from each quadrat, treated in a drying box at 105 °C for 30 min to halt oxidation, and then dried at 80 °C until a constant mass was reached. The dried reeds were crushed and sieved. Plant C and N were measured using an elemental analyzer (vario EL III, CHNOS elemental analyzer, Elementar Analysensysteme GmbH, Germany), and plant P content was measured using an ultraviolet-visible spectrophotometer (UV-2550, UV-Visible spectrophotometer, Shimadzu, Japan).
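The spectral preprocessing described above (averaging ten scans per sampling point and removing water-vapor bands) can be sketched as follows. This is a minimal illustration, not the authors' processing chain: the function names are hypothetical, and the absorption windows are assumed values, because the text does not state which bands were removed.

```python
from statistics import fmean

# Assumed water-vapor absorption windows in nm. The source does not list
# the bands that were removed, so these limits are illustrative only.
WATER_VAPOR_WINDOWS_NM = [(1350, 1450), (1800, 1950), (2350, 2500)]


def average_scans(scans):
    """Average repeated scans band by band into one spectrum."""
    if not scans:
        raise ValueError("at least one scan is required")
    n_bands = len(scans[0])
    if any(len(scan) != n_bands for scan in scans):
        raise ValueError("all scans must have the same number of bands")
    return [fmean(band_values) for band_values in zip(*scans)]


def drop_water_vapor_bands(wavelengths_nm, reflectance,
                           windows=WATER_VAPOR_WINDOWS_NM):
    """Keep only the bands that fall outside every absorption window."""
    kept = [
        (w, r)
        for w, r in zip(wavelengths_nm, reflectance)
        if not any(lo <= w <= hi for lo, hi in windows)
    ]
    return [w for w, _ in kept], [r for _, r in kept]
```

In practice, `average_scans` would receive the ten scans recorded at one sampling point, and the window limits would be replaced by the bands actually excluded in the study.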

Methods
Hyperspectral data are characterized by a large number of bands, strong correlations between bands, and data redundancy, which makes the related calculations complex. Principal component analysis (PCA) was applied to reduce the dimensionality of the hyperspectral data by compressing many interrelated variables into a few uncorrelated new component variables, each containing a large amount of information from the original data [33]. Taking the principal components obtained by PCA as the independent variables, the next step was to build the prediction models of reed C, N, and P stoichiometry. Each season, the 90 samples were randomly assigned to two groups: one group was used for model building, and the other (one-third of the samples) was used for model validation. Using Weka 3.8 software, three models (RF, SVM, and BPNN) were selected to invert the hyperspectral data in order to analyze the reed C, N, and P stoichiometry. The accuracy and stability of the models were evaluated by the coefficient of determination (R²) and the root mean square error (RMSE).
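The sample split and the two evaluation metrics named above can be illustrated with a short sketch. It assumes that R² is computed as 1 minus the ratio of residual to total sum of squares; the text does not say whether this form or the squared correlation coefficient was used. The function names and the default seed are hypothetical.

```python
import math
import random


def split_samples(samples, validation_fraction=1 / 3, seed=0):
    """Randomly split samples into (modeling, validation) groups."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

With 90 samples per season, `split_samples` returns 60 samples for model building and 30 for validation.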
The RF is an ensemble learning algorithm used for classification and regression. It is constructed by combining the results of many decision trees, each trained on a sample drawn from the original dataset by bagging [34]. Each decision tree produces a prediction (termed a "vote") for a new instance. The votes generated by the decision trees are then counted, and the class with the majority of votes is assumed to "win"; this class is the prediction for the new instance [35]. In this study, using Weka software, a random subspace was first constructed, and the operation was then implemented based on the J48 (C4.5) algorithm. The parameters numIterations, seed, confidence factor, and numFolds were set to 10, 1, 0.25, and 3, respectively.
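The idea of bagging plus a random subspace can be sketched in a few lines. This is a toy illustration and not Weka's implementation: it uses depth-one regression trees ("stumps") instead of J48 trees, and it averages the tree outputs, which is the regression counterpart of the majority vote described above. All function names are hypothetical; only the defaults of 10 iterations and seed 1 mirror the parameters reported in the text.

```python
import random
from statistics import fmean


def fit_stump(X, y, feature_indices):
    """Find the one-feature threshold split with the lowest squared error."""
    best = None  # (sse, feature, threshold, left_mean, right_mean)
    for j in feature_indices:
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [t for row, t in zip(X, y) if row[j] <= threshold]
            right = [t for row, t in zip(X, y) if row[j] > threshold]
            left_mean, right_mean = fmean(left), fmean(right)
            sse = sum((t - left_mean) ** 2 for t in left)
            sse += sum((t - right_mean) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, j, threshold, left_mean, right_mean)
    if best is None:  # every candidate feature is constant: predict the mean
        mean_y = fmean(y)
        return (feature_indices[0], float("inf"), mean_y, mean_y)
    return best[1:]


def predict_stump(stump, row):
    feature, threshold, left_mean, right_mean = stump
    return left_mean if row[feature] <= threshold else right_mean


def fit_bagged_stumps(X, y, n_trees=10, n_features=1, seed=1):
    """Train each stump on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n_samples, n_total = len(X), len(X[0])
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n_samples) for _ in range(n_samples)]  # bagging
        features = rng.sample(range(n_total), n_features)  # random subspace
        stumps.append(
            fit_stump([X[i] for i in idx], [y[i] for i in idx], features)
        )
    return stumps


def predict_ensemble(stumps, row):
    """Average the predictions of all stumps (regression 'vote')."""
    return fmean([predict_stump(s, row) for s in stumps])
```

In the study's setting, the rows of `X` would hold the principal component scores of each sample and `y` one stoichiometric parameter.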
The SVM is a very popular machine-learning technique. This supervised learning model, together with its associated learning algorithms, is used to analyze, classify, and conduct regression analysis on the supplied data [36]. A sample set with input and output variables is defined; an optimal function that minimizes the expected loss is determined; and the trade-off between the flatness of the function and its fitting error is adjusted. After learning from the sample set, the regression prediction equation is constructed [37]. In this study, the SVM model was built in Weka with the Gaussian kernel function, and the kernel size was set to 17.
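As a brief illustration of the two ingredients mentioned above, the sketch below implements a Gaussian (RBF) kernel and the epsilon-insensitive loss commonly used in support vector regression. Both are assumptions about the general technique rather than a description of the Weka configuration: the `gamma` and `epsilon` values are placeholders, and the text does not explain how its "kernel size" of 17 maps onto such parameters.

```python
import math


def gaussian_kernel(x, z, gamma=0.01):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def kernel_matrix(samples, gamma=0.01):
    """Pairwise kernel values for a list of feature vectors."""
    return [[gaussian_kernel(a, b, gamma) for b in samples] for a in samples]


def epsilon_insensitive_loss(observed, predicted, epsilon=0.1):
    """Errors smaller than epsilon cost nothing; larger ones cost linearly."""
    return max(0.0, abs(observed - predicted) - epsilon)
```

The kernel value is 1 for identical samples and decays toward 0 as samples move apart, so `gamma` controls how local the fitted regression function is.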
A BPNN is a multilayer feed-forward neural network trained according to the error back-propagation algorithm. It is a machine learning model that takes the square of the network error as its objective function and uses the gradient descent method to find the minimum of that function [38]. This model uses powerful data recognition and simulation abilities to model nonlinear systems. The BPNN is trained using the input and output data until it can express the unknown black box function, and then the trained network is used to predict the system output. In this study, the model was implemented with the multilayer perceptron algorithm in Weka; the sigmoid function was used as the transfer function, and the number of neurons in the hidden layer was set to 10.
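The training principle described above (squared error as the objective, gradient descent through back-propagation, a sigmoid hidden layer with 10 neurons) can be shown with a deliberately small sketch. It is not the Weka multilayer perceptron: the class name, the linear output unit, the weight initialization, and the learning rate are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyBPNN:
    """One sigmoid hidden layer and a linear output, trained by backprop."""

    def __init__(self, n_inputs, n_hidden=10, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _forward(self, x):
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        output = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        return hidden, output

    def predict(self, x):
        return self._forward(x)[1]

    def train_step(self, x, target, learning_rate=0.1):
        """One gradient descent step on the loss 0.5 * (output - target)^2."""
        hidden, output = self._forward(x)
        error = output - target  # derivative of the loss w.r.t. the output
        for j, h in enumerate(hidden):
            # Back-propagate through the sigmoid using the old output weight.
            grad_hidden = error * self.w2[j] * h * (1.0 - h)
            self.w2[j] -= learning_rate * error * h
            self.b1[j] -= learning_rate * grad_hidden
            for i, xi in enumerate(x):
                self.w1[j][i] -= learning_rate * grad_hidden * xi
        self.b2 -= learning_rate * error
        return 0.5 * error ** 2


def mean_loss(net, X, y):
    """Mean of 0.5 * squared error over a data set."""
    return sum(0.5 * (net.predict(x) - t) ** 2 for x, t in zip(X, y)) / len(X)
```

Repeatedly calling `train_step` over the modeling samples lowers `mean_loss`; because gradient descent only follows the local slope, the network can settle in a local minimum of the objective.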

Stoichiometric Characteristics of Reed
The stoichiometric characteristics of reed C, N, and P were significantly different across seasons, but the error bars were small for all reed samples collected within the same season (Figure 2). This shows that the stoichiometric characteristics of reed are distinct in each growing season. [...] 5.25 mg/g, respectively. The average value of total P (TP) is highest in spring and lowest in winter; the values are 3.79 and 0.40 mg/g, respectively. The average value of total C (TC) is both highest and lowest in summer. The values are 458.49 and 392.45 mg/g, respectively. The average value of C/N is highest in winter and lowest in summer; the values are 84.31 and 12.19, respectively. The average value of N/P is highest in summer and lowest in spring; the values are 15.90 and 6.70, respectively. The average value of C/P is highest in winter and lowest in spring; the values are 1088.28 and 111.37, respectively. The values and ranges of specific stoichiometric characteristics for reed C, N, and P in each of the four seasons are summarized in Table 1.
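The three ratios reported above follow directly from the measured concentrations. The sketch below assumes mass-based ratios (concentrations in mg/g divided directly), which is consistent with the magnitudes reported but is not stated explicitly in the text. The function name and the input values in the usage note are hypothetical.

```python
def stoichiometric_ratios(tc_mg_g, tn_mg_g, tp_mg_g):
    """Return mass-based C/N, N/P, and C/P for one sample (inputs in mg/g)."""
    if tn_mg_g <= 0 or tp_mg_g <= 0:
        raise ValueError("TN and TP must be positive")
    return {
        "C/N": tc_mg_g / tn_mg_g,
        "N/P": tn_mg_g / tp_mg_g,
        "C/P": tc_mg_g / tp_mg_g,
    }
```

For example, a hypothetical sample with TC = 400 mg/g, TN = 20 mg/g, and TP = 2 mg/g gives C/N = 20, N/P = 10, and C/P = 200. Note that the seasonal mean of a ratio generally differs from the ratio of the seasonal means, so ratios should be computed per sample before averaging.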

Correlation Analysis
The results show that the correlation between winter spectral reflectance and plant stoichiometry is markedly different from that of the other seasons. In winter, TC, TN, and TP all show a large positive correlation at 650 nm, while C/N, N/P, and C/P all show a large negative correlation at 650 nm (Figure 3). In the other seasons, the strength of the correlation between spectral reflectance and TC, TN, and TP follows the order autumn > summer > spring. The optimal correlation band positions and values of TC, TN, TP, and spectral reflectance are similar across all four seasons. The C/P and C/N values are negatively correlated with spectral reflectance for all four seasons, while the N/P value is positively correlated with spectral reflectance in spring and summer, and negatively correlated in autumn and winter. The overall correlation between N/P and spectral reflectance (Figure S1) is significantly lower than that of the other five stoichiometric parameters assessed.
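A band-wise correlation analysis of this kind can be sketched as follows, assuming the Pearson correlation coefficient was used (the text does not name the coefficient). The function names are hypothetical; `spectra` holds one reflectance list per sample, and `trait` holds the matching stoichiometric values.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def correlation_by_band(spectra, trait):
    """Correlate the reflectance of every band with one trait."""
    return [pearson_r(list(band), trait) for band in zip(*spectra)]


def strongest_band(wavelengths_nm, correlations):
    """Return (wavelength, r) for the band with the largest |r|."""
    idx = max(range(len(correlations)), key=lambda i: abs(correlations[i]))
    return wavelengths_nm[idx], correlations[idx]
```

Running `correlation_by_band` once per season and per parameter yields correlation curves of the kind plotted against wavelength, and `strongest_band` picks out the optimal correlation band.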

Random Forest Regression Models
The RF model has good prediction accuracy concerning the stoichiometric properties of C, N, and P. Prediction accuracy decreases across seasons in the order of autumn, summer, winter, and spring (Figure 4). The R² for autumn is greater than 0.97, and that for summer is greater than 0.95. The R² of C/N is lower in winter, with a value of 0.92. The R² values for TN and C/N are lower in spring than for the other seasons (0.91 and 0.88, respectively), and the prediction error is also greater than that for other seasons. The prediction accuracy is highest for C/P, with an R² of 0.99 for all four seasons. The R² values for TC and TP are also greater than 0.96. The RMSE values of the inversion models of TC and TN in spring and TP in summer are higher than those of other seasons; those of C/N, C/P, and N/P in autumn and winter are higher than those of spring and summer. The prediction accuracy of the random forest model is slightly higher for TN, TP, and TC than for C/N, N/P, and C/P.

Support Vector Machine Regression Models
The SVM yielded poor predictive results for the stoichiometric properties of C, N, and P. Prediction accuracy decreases across the seasons in the order of autumn, winter, summer, and spring (Figure 5). In spring, the R² value for TC is 0.98, and that of the other stoichiometric parameters is less than 0.89. According to the RMSE results, the model error of TN inversion in spring is the largest. In summer, the R² value of N/P is 0.81. According to the RMSE results, the error of the summer model is the smallest. In autumn, the R² value for N/P is only 0.81, while that of the other stoichiometric parameters is greater than 0.89. The winter R² values for C/N, C/P, and N/P are 0.70, 0.81, and 0.83, respectively. According to the RMSE results for autumn and winter, the inversion errors of the model for C/N, C/P, and N/P are high. Across the stoichiometric parameters, the prediction accuracy of the SVM model is highest for TC, with an R² greater than 0.87. The prediction accuracy of the SVM model for TN, TP, and TC is greater than for C/N, N/P, and C/P, as the R² values of the former group are mostly larger than 0.94, whereas those of the latter group are mostly less than 0.89.

BP Neural Network Regression Models
The BPNN yielded high stoichiometric prediction accuracy. However, the prediction accuracy of the BPNN differs from that of the other two models when comparing seasonal results. Prediction accuracy decreases across the seasons in the order of summer, winter, spring, and autumn (Figure 6). The R² of each stoichiometric parameter is greater than 0.96 for summer, and that for winter is greater than 0.90. The prediction accuracy for N/P is low for spring and autumn, with R² values of 0.87 and 0.84, respectively. The R² values of TN, TP, and TC were greater than 0.96 for all four seasons. The RMSE results of stoichiometric inversion models in different seasons were similar to the trends of the other models, with TC, TN, and TP errors less than C/N, C/P, and N/P errors. The prediction accuracy of the BPNN is better for TN, TP, and TC than for C/N, N/P, and C/P.

Accuracy of Prediction Models
The remaining one-third of the data was used to validate the models built above. The accuracy and stability of the results were verified by comprehensive analysis. The RF model showed the greatest prediction stability, followed by the BPNN and then the SVM model. For the RF model, the inversion stability of C/N is highest, followed by TN and N/P, and the inversion stability of summer TC and winter C/N requires improvement.
The spring R² values for all stoichiometric characteristics of C, N, and P are greater than 0.86 in all models. The R² values produced by the SVM for summer TC and N/P are only 0.83 and 0.64, and that of winter TP is only 0.79. The overall prediction accuracy for N/P is poor for all four seasons under this model. The BPNN showed poor predictive power regarding autumn TC and N/P, with R² values of only 0.86 and 0.84, respectively. Additionally, the R² for winter TP is only 0.89 (Table 2).
Table 2. Accuracy of prediction models for reed C, N, and P stoichiometric characteristics (P < 0.01).
To make model errors comparable across seasons, the RMSE was expressed as a proportion of the sample average value. The seasons with the largest and the smallest errors are the same for identical stoichiometric characteristics under the different models. The errors of the three models for TC prediction are all less than 0.5%, except for the summer SVM model, for which the error is 1.04%.
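The proportional error used for this comparison can be written out as a short sketch. It assumes that the proportion is the RMSE divided by the mean of the observed values, expressed in percent, which matches the percentage figures quoted above but is an interpretation of the text. The function name is hypothetical.

```python
import math


def relative_rmse_percent(observed, predicted):
    """RMSE expressed as a percentage of the mean observed value."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    mean_observed = sum(observed) / n
    if mean_observed == 0:
        raise ValueError("mean of observed values must be non-zero")
    return 100.0 * rmse / mean_observed
```

Normalizing by the mean removes the effect of scale, so that an error in TC (hundreds of mg/g) can be compared with an error in TP (a few mg/g) or in a unitless ratio.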

Discussion
Plants obtain C, which forms the carbon skeleton that is the basis of organic compounds, from CO₂ in the air and water-soluble organic C in the soil. As C is abundant in reed-swamp wetlands, the TC of the reeds is stable across all four seasons. The demand for P is greater in early plant growth stages than in later ones, leading to a gradual reduction of reed TP with seasonal changes. Studies have shown that N is a limiting factor when the N/P ratio is less than 8 [39]. As the average spring N/P is 6.70, our results indicate that vegetative growth may be limited in spring by the total N content. As summer is the rainy season in the study area, the N/P ratio of reeds is higher in this season than in the others. Plant C/N reflects plant nutrient utilization efficiency. Because the reeds stop growing in winter, the C/N is 3-4 times higher at this time than in the other three seasons; conversely, during the summer season of vigorous growth, the C/N is the lowest. In the analysis of the correlation between stoichiometric characteristics and spectral reflectance, the correlations for winter differ markedly from those for the other three seasons. This may be due to a change in spectral reflectance caused by the withering of vegetation in winter.
The larger the vegetation coverage, the less the spectral characteristics are affected by the underlying surface [40]. The lower vegetation coverage in spring may explain why the precision of the stoichiometry inversion models for spring C, N, and P is lower than that of the summer and autumn models. With increasing leaf age, the mesophyll interspace increases. In addition, when the reeds flower in autumn, the vegetation structure becomes more complex, thus increasing the spectral reflectance and noise of the autumn vegetation canopy [40] and reducing the precision of the autumn C, N, and P stoichiometry inversion models.
The RF model performs best in inverting wetland plant stoichiometry, and its autumn model R² is the highest. The strongest correlation between stoichiometry and spectral reflectance was obtained for autumn data. This may be because vegetation growth is maximal in autumn, and the greater vegetation coverage reduces the environmental background interference on spectral data. In addition, the range of sample values is relatively large for autumn, which ensures a more comprehensive range of data. The RF model handles nonlinear, multivariate regression problems well, is easy to tune for better performance, resists overfitting, is efficient and fast when processing large data sets, and yields interpretable results [41]; it is therefore suitable for inverting the stoichiometric characteristics of wetland plants.
The SVM model has high generalization ability and has advantages in solving small sample model inversions, but the model is particularly sensitive to the parameters selected [42,43]. In the present study, the inversion accuracy of the SVM model for TC, TN, and TP is slightly lower than that of the other two models in spring and summer. However, the inversion accuracy for C/N, N/P, and C/P is particularly poor. Therefore, when solving the inversion of small sample plant stoichiometric characteristics in the future, we must select the appropriate kernel function according to the data characteristics.
The BPNN model likewise shows high inversion accuracy for TC, TN, and TP and reduced accuracy for C/N, N/P, and C/P. However, its inversion accuracy for C/N, N/P, and C/P is much higher than that of the SVM model. The model R² for C/N, N/P, and C/P inversion is higher than 0.90, 0.84, and 0.89, respectively. This may be because the BPNN model overfits when retrieving the C/N, N/P, and C/P of wetland plants. The R² of each stoichiometric parameter is greater than 0.96 for summer and greater than 0.90 for winter. This may be because the model tends to converge to a local minimum.
The validation results of the prediction models are broadly consistent with the modeling results of the three models. The seasons with the largest and the smallest errors are the same for identical stoichiometric characteristics under the different models. This is related to the range and distribution of sample values. Therefore, in addition to the differences between models, the growth characteristics of vegetation also have a great impact on the accuracy of the models.

Conclusions
Based on the stoichiometric characteristics of wetland plant C, N, and P in different seasons, we selected three machine learning models for hyperspectral inversion and verified their accuracy. The results allow us to draw the following conclusions:
Hyperspectral data can be used to accurately invert the stoichiometric characteristics of C, N, and P in wetland plants (reed).
All models tested showed higher prediction accuracy for TC, TN, and TP than for C/N, N/P, and C/P.
Of the three models tested, the accuracy and stability of RF are the highest. Within this model, the prediction accuracy and stability are the highest for summer.
This study is a preliminary exploration of the hyperspectral inversion of wetland stoichiometric characteristics in different seasons. It provides a scientific basis for the long-term dynamic monitoring of plant stoichiometry through hyperspectral data in the future.