Estimation of Paddy Rice Nitrogen Content and Accumulation Both at Leaf and Plant Levels from UAV Hyperspectral Imagery

Remote sensing-based mapping of crop nitrogen (N) status is beneficial for precision N management over large geographic regions. Both leaf/canopy level nitrogen content and accumulation are valuable for crop nutrient diagnosis. However, previous studies mainly focused on leaf nitrogen content (LNC) estimation. The effects of growth stages on the modeling accuracy have not been widely discussed. This study aimed to estimate different paddy rice N traits—LNC, plant nitrogen content (PNC), leaf nitrogen accumulation (LNA) and plant nitrogen accumulation (PNA)—from unmanned aerial vehicle (UAV)-based hyperspectral images. Additionally, the effects of the growth stage were evaluated. Univariate regression models on vegetation indices (VIs), the traditional multivariate calibration method, partial least squares regression (PLSR) and modern machine learning (ML) methods, including artificial neural network (ANN), random forest (RF), and support vector machine (SVM), were evaluated both over the whole growing season and in each single growth stage (including the tillering, jointing, booting and heading growth stages). The results indicate that the correlation between the four nitrogen traits and the other three biochemical traits—leaf chlorophyll content, canopy chlorophyll content and aboveground biomass—are affected by the growth stage. Within a single growth stage, the performance of selected VIs is relatively constant. For the full-growth-stage models, the performance of the VI-based models is more diverse. For the full-growth-stage models, the transformed chlorophyll absorption in the reflectance index/optimized soil-adjusted vegetation index (TCARI/OSAVI) performs best for LNC, PNC and PNA estimation, while the three band vegetation index (TBVITian) performs best for LNA estimation. There are no obvious patterns regarding which method performs the best of the PLSR, ANN, RF and SVM in either the growth-stage-specific or full-growth-stage models. For the growth-stage-specific models, a lower mean relative error (MRE) and higher R2 can be acquired at the tillering and jointing growth stages. The PLSR and ML methods yield obviously better estimation accuracy for the full-growth-stage models than the VI-based models. For the growth-stage-specific models, the performance of VI-based models seems optimal and cannot be obviously surpassed. These results suggest that building linear regression models on VIs for paddy rice nitrogen traits estimation is still a reasonable choice when only a single growth stage is involved. However, when multiple growth stages are involved or missing the phenology information, using PLSR or ML methods is a better option.


Introduction
Nitrogen (N) is one of the most important plant macronutrients and plays a significant role in the photosynthesis process. It is important for crop development, production and quality [1][2][3]. Proper N fertilization can help crops achieve maximum potential yield [4]. However, the uncertainty in the climate of the current year, previous crop management, and soil N supply cause actual N uptake to vary in both time and space. N deficiency can negatively affect photosynthesis and biomass accumulation, as well as both the quality and quantity of the crop yield [5]. Thus, considering the relatively low cost of N fertilization, overapplication of N fertilizer is a common phenomenon in agricultural practice to ensure maximum yield [6]. The overapplication of N fertilizer can result in problems such as plant stress, the overproduction of leaves, and more unproductive tillers. Additionally, it increases the N leaching risk of cropping systems.
For early growth stages, monitoring the crop N content can help in fertilization planning. For maturing growth stages, monitoring crop N content can help indicate yield quality [7]. Thus, the continuous monitoring of crop N content is of great importance. The destructive determination of N content, using chemical methods such as the Kjeldahl technique, is laborious, lengthy and costly [5]. Portable chlorophyll meters, such as SPAD-502, can be used to quickly diagnose crop N status. These tools are still point-sampling based and not suitable for precision N management across large areas. Remote sensing, however, provides a cost-effective and nondestructive avenue for monitoring crop N status over large geographic areas [8]. Using radiometric data to map crop N conditions has been reported for many crops.
Although there are a few works trying to use radiation transformation models, such as N-PROSPECT [9] or PROPSECT-PRO [10], which are modified versions of the widely used PROSPECT model [11], for N content estimation, remote sensing-based crop N content detection is mainly conducted using empirical relationships from vegetation indices (VIs) that are sensitive to chlorophyll content based on the fact that the two variables are moderately to be highly correlated [12]. Tian et al. [13] quantified the relationship between paddy rice leaf nitrogen content (LNC) and red-edge position (REP) with different algorithms. Li et al. [14] found that the canopy chlorophyll content index (CCCI) and N planar domain index (NDPI) are stable predictors for wheat plant nitrogen content (PNC) after the heading growth stage. Patel et al. [15] evaluated the relationships of several popular broadband VIs with the PNC, plant nitrogen accumulation (PNA) and aboveground biomass (ABG) of ryegrass. Li et al. [16] evaluated several hyperspectral VIs for winter wheat PNC estimation across different years and growth stages. Fitzgerald et al. [17] suggested that the CCCI is suitable for predicting rainfall wheat PNC at the stem elongation growth stage There are also studies trying to build VIs that are directly related to N traits. Lee et al. [18] suggested that the first derivative of canopy spectra at 735 nm is a good indicator for the PNC of paddy rice at the panicle formation stage. Stroppiana et al. [19] concluded that the normalized difference vegetation index (NDVI) formulated with narrow band reflectance at 483 nm and 503 nm was optimal for paddy rice PNC estimation in a combined data set from panicle initiation and heading stages. They also pointed out that VIs specially designed for chlorophyll/nitrogen estimation, such as the transformed chlorophyll absorption in reflectance index (TCARI) and modified chlorophyll absorption ratio index (MCARI) did not perform well in their data set. Tian et al. [20] proposed a three band index that performed well at both the ground and space scales for paddy rice LNC estimation.
Considering that VIs typically use two to three bands, univariate regression models built on VIs are criticized because they do not fully take advantage of spectrometric data (in particular, hyperspectral data) and may be too simple to characterize the intrinsic relationships between the target variable and spectrometric data [21][22][23][24]. Partial least squares regression (PLSR) is a powerful alternative method and can perform better for biochemical trait estimation in most cases [25][26][27][28][29][30]. Specifically, Hansen and Schjoerring [24] showed that PLSR could reduce the root mean square error (RMSE) of LNC estimation by 24%, compared to those of narrowband VIs. More recently, state-of-art machine learning (ML) methods, such as artificial neural networks (ANNs), random forests (RFs) and support vector machines (SVMs), were evaluated for N trait estimation and yielded encouraging results [31][32][33][34][35][36][37][38].
Most of the previous studies focused on special narrow growth stages, such as the early season [32], stem elongation stage [17], panicle initiation stage [18,26,39], heading stage [14], and anthesis stage [40]. Additionally, the LNC is usually considered the target variable [1]. However, the N content can be expressed both per dry weight and per unit surface. Additionally, the N content can be measured at the leaf scale and at the wholeplant scale. The growth stage development is accompanied by canopy structure variation, biomass accumulation and N allocation. How the prediction strength of spectrometric data for different N traits varies across growth stages has not been fully explored. Therefore, the objective of this study is to evaluate the performance of VIs, PLSR and ML methods for paddy rice N trait estimation, with a special focus on the effects of growth stages.

Study Area and Experimental Setup
The experiments were conducted at the rice research field of Guangdong Academy of Agricultural Sciences (23°23 38 N, 113°25 37 E). This region is characterized by a subtropical monsoon climate. The average annual precipitation and average temperature are 1700 (mm) and 22.5°C, respectively. The parent material of the soil in the test area is lateritic sandy loam formed by river alluvium. In this region, paddy rice is under a two-season cropping system. Our experiments were conducted during the 2020 late rice growing season. The paddy rice variety Huangruanxiuzhan was used as the test material. The paddy rice was sown in a nursery bed and manually transplanted on 9 August 2020. The transplanting density was 400 plants/plot with one plant per hill.
To acquire the necessary reference observations, a paddy rice field with nitrogen treatment was set up and used to conduct field-data collection campaigns (detailed in Figure 1). Nitrogen from 0 to 180 (kg ha −1 ) with a 30 (kg ha −1 ) increase was applied to plots J0 to J6 7d after transplantation (DAT). Nitrogen from 0 to 90 (kg ha −1 ) with an 15 (kg ha −1 ) increase was applied to plots F0 to F6 at 33 DAT. Additionally, 90 (kg ha −1 ) nitrogen was applied to each plot from F0 to F6 at 7 DAT. The paddy rice was at the tillering and jointing growth stages at 7 and 33 DAT, respectively. Phosphates and potash fertilizers were applied as basal dressings with P 2 O 5 and K 2 O at 45 and 120 (kg ha −1 ), respectively, for each plot. For both J0 to J6 and F0 to F6, each nitrogen level comprised two replicates. Thus, there were 28 plots in total. Each plot was (5 × 4.5 m) in size.

Field Data Collection
Four field-data collection campaigns were carried out at 18, 32, 47 and 62 DAT. The paddy rice was in the tillering, jointing, booting, and heading growth stages on the corresponding dates. The growth stage was determined, according to the rules described by [41].
Each plot was equally split into two sampling areas (demonstrated in Figure 1). For each field-data collection campaign, the grazeable biomass of three to five hill plants of each sampling area was randomly destructively sampled. Then, the sample plants were split into two parts: green leaves and other (senescent leaves, stems and particles). For each sample, 30 random SPAD-502 (Minolta Camera Co., Osaka, Japan) readings were recorded on green leaves. Then, all green leaves were scanned by a portable scanner (PERFECTION V39, Seiko Epson Corp., Nagano, Japan). The soil plant analysis development (SPAD) readings were converted to leaf chlorophyll content (LCC) (mg cm −2 ) with an empirical relationship (Equation (1)) proposed by [42] and averaged as the LCC of the corresponding sampling area. The leaf pixels in scanned images were extracted by a threshold method. The leaf area indices (LAIs) of each sampling area were determined by Equation (2), where ρ green lea f is the green-leaf pixel ratio of each scanned image, area img = 216 × 297 × 10 −4 m is the maximum scan area of the scanner, and density = 400/(5 × 4.5) m −2 is the planting density. The canopy chlorophyll content (CCC) was approximated by multiplying the LAI and LCC (Equation (3)).
In the second step, samples were oven-dried at 105°C for half an hour and then at 75°C until a constant weight was reached. Finally, the oven-dried samples of different organs were weighed and used to determine the leaf biomass, ABG, nitrogen content (Kjeldahl method) and accumulation. Both LNC and PNC were measured. Additionally, LNA and PNA were determined by multiplying LNC and PNC with leaf biomass and ABG, respectively.
Plots J0 to J6 were sampled during data collection campaigns at 18, 32 and 47 DAT, while plots F0 to F6 were sampled during data collection campaigns at 47 and 62 DAT. Thus, 28 samples were acquired for each data collection campaign, except for 47 DAT (56 samples).
Canopy reflectance data were acquired by a Cubert S185 hyperspectral camera (Cubert GmbH, Ulm, Germany) installed on a hexacopter (DJI M600Pro, SZ DJI Technology Co., Shenzhen, Guangzhou, China). The hyperspectral camera has a spectral range of 450-950 nm and a spectral resolution of 4 nm. It was installed through a three-axis gimbal stabilizer (DJI Ronin-MX, SZ DJI Technology Co., Shenzhen, Guangzhou, China). All unmanned aerial vehicle (UAV) flight missions were performed at approximately 11:00 a.m. Considering that the camera viewing angle may affect the estimation of the canopy parameters [43,44], the same view angle was ensured by the gimbal stabilizer to acquire the nadir images. The flight height was 30 m, which resulted in 16 cm spatial resolution for hyperspectral images. The hyperspectral images were mosaicked and orthographically projected for further analysis. The bands beyond 850 nm were dropped because of the low spectral quality [45]. The sample-area-wise mean spectra were calculated using regions of interest generated from sample area boundaries (demonstrated in Figure 1).

Methods
Linear regression models based on VIs, as well as PLSR and three ML models (ANN, RF and SVM) based on raw reflectance spectra, were built. Both full-growth-stage and growth-stage-specific models were built for each nitrogen trait (LNC, PNC, LNA and PNA). The general workflow is demonstrated in Figure 2. All models were built in the R environment [46]. The PLSR, ANN, RF and SVM models were built using the R packages plsmod [47], nnet [48], randomForest [49] and kernlab [50], respectively.
Twenty-three VIs were evaluated in this study (listed in Table 1). These indices are commonly used for biomass, chlorophyll and nitrogen status estimation. Specifically, the CCCI [17] is a two-dimensional index based on the normalized difference red edge (NDRE) and NDVI. By capturing the planar domain idea proposed by Clarke et al. [51], the CCCI attempts to separate the soil background signal from the plant signal. The NDVI is used as a surrogate for ground cover, and the NDRE is used as a measure of nitrogen content. Conceptually, this allows the nitrogen for any ground cover to be measured. Similarly, Li et al. [14] proposed the NDPI by changing the NDRE in CCCI to CI re .  [14] ρ λ stands for the reflectance at band λ nm. If λ does not match the S185 wavelengths, the mean reflectance of two adjacent bands is used. Considering that the λ = 445 nm band does not fall in the S185 band ranges, the closest band, λ = 450 nm, is used.
A detailed description of the PLSR, ANN, RF and SVM methods can be found in our previous work [21]. In this work, a single-hidden-layer neural network structure was used for the ANN; a radial basis kernel function (RBF) was used for the SVM. In these methods, the latent variables (LVs) of the PLSR, the number of hidden units (units) of the ANN, the number of randomly selected variables of each tree (mtry) and the number of trees (ntree) of the RF, and the σ in the RBF and the C-constant of the regularization term in the Lagrange formulation of the SVM are hyperparameters that need to be fine tuned.
The full-growth-stage models were built on the full data set, while the growth-stagespecific models were built on the data of the corresponding growth stages. The corresponding data set was randomly split into calibration and validation data sets at a 3:1 ratio. All models were calibrated and validated on the calibration and validation data sets, respectively. The mean relative error (MRE), RMSE and coefficient of determination (R 2 ) were calculated to evaluate the model performance (Equations (4)-(6)), where O i and P i are the ith observed and predicted data, respectively, and n is the size of the data pairs.
For the PLSR, ANN, RF and SVM, the hyperparameters (LVs for the PLSR, units for the ANN, mtry for the RF and C, σ for the SVM) were determined by a grid search with a repeated (5 times) 10-fold cross-validation procedure on the calibration data set (Algorithm 1). The parameter (or parameter combination) associated with the lowest MRE cv was considered optimal. After the key parameters were determined, the final models were calibrated on the corresponding calibration data set and evaluated on the corresponding standalone validation data set. Although no hyperparameters needed to be determined for the linear models based on VIs, the same cross-validation procedure was used to calculate the crossvalidated calibration MRE cv , RMSE cv , R 2 cv . The same fold split scheme across different models (of the same growth stage) was used to ensure a fair comparison. MRE cv , RMSE cv and R 2 cv were mainly used for model performance comparison. cv with MRE cv = mean(MRE), RMSE cv = mean(RMSE), and R 2 cv = mean(R 2 ) correspondingly; end Determine the optimal parameter set; Fit the final model with full training data, using the optimal parameter set.  Table 2 depicts the variation in the measured LNC, PNC, LNA, PNA, LCC, CCC and ABG. In the full data set (with all growth stages combined), the LNC varies from 1.41% to 5.16% with coefficient of variation (CV) at 38.63%, the PNC varies from 0.83% to 4.15% with CV at 56.29%, the LNA varies from 1.12 g m −2 to 6.41 g m −2 with CV at 33.04%, the PNA varies from 18.21 kg ha −1 to 156.36 kg ha −1 with CV at 39.85%, the LCC varies from 22.38 µg cm −2 to 39.88 µg cm −2 with CV at 12.44%, the CCC ranges from 29.39 µg cm −2 to 293.72 µg cm −2 with CV at 51.98%, and the ABG varies from 0.54 t ha −1 to 12.38 t ha −1 with CV at 54.24%.

Variability in the Measured Nitrogen, Chlorophyll and Aboveground Biomass
The means of LNC and PNC generally show a decreasing pattern from the tilling to heading growth stages, due to dilution effects [4]. Apart from that, the mean LNC at the heading growth stage is slightly higher than the mean at the booting growth stage. This is because our LNC is measured on the green leaves. During the heading growth stage, nitrogen is allocated from senescent leaves to young (green) leaves, which yields a LNC higher than that measured by combining senescent and green leaves together. The means of LCC show a pattern similar to those of LNC. The means of LNA and CCC first increase and then decrease from the tillering to heading growth stages. The means of PNA and ABG show an increasing pattern from the tillering to heading growth stages. The CV of each trait is lower for the single growth stage (data collection campaign) than for the full data set. The maximum values of the ratio between the single growth stage CV and full data set CV are 0.34, 0.20, 0.83, and 0.55 for LNC, PNC, LNA, and PNA, respectively.  Figure 3 shows the pairwise scatter plots and correlation coefficients of all seven traits (LNC, PNC, LNA, PNA, LCC, CCC and ABG). LNC is more correlated with LCC than CCC and ABG regardless of whether for the full dataset or a specific growth stage. PNC is highly correlated with LNC regardless of whether it is for the full data set or a specific growth stage. However, PNC is more correlated with LCC than CCC and ABG on the full data set, similar to LNC. However, at a specific growth stage, PNC is more correlated with CCC. Both LNA and PNA are more correlated with ABG than LCC and CCC for almost all specific growth stages. For the full data set, LNA is more correlated with CCC, and PNA is more correlated with ABG of the LCC, CCC and ABG traits.  Figure 4 depicts the full-growth-stage and growth-stage-specific MRE cv and R 2 cv of selected VIs for LNC, PNC, LNA and PNA prediction. For growth-stage-specific models of each tillering, jointing and heading growth stage, almost all selected VIs show good performance with relatively low MRE cv and high R 2 cv values for all four nitrogen traits, except for TCARI and NDVI 483,503 , whose performance is slightly worse than that of the other VIs. For the booting growth stage, the R 2 cv values of the growth-stage-specific models are lower than those of the other three growth stages for all four nitrogen traits. For the fullgrowth-stage models, the performance of all selected VIs is lower than the corresponding growth-stage-specific models (except for the booting growth stage).

Performance of the VIs
The performance of the selected VIs is more diverse in the full-growth-stage models and booting growth stage growth-stage-specific models than in tillering, jointing and heading growth stage growth-stage-specific models. The MRE cv values of the growthstage-specific models for LNC and PNC are generally lower than those of the corresponding models for LNA and PNA. The MRE cv values of the full-growth-stage models for LNA and PNA are generally lower than those of the corresponding models for LNC and PNC. Table 3 lists the best-performing VI according to the MRE cv values for each full-growthstage and growth-stage-specific model of each nitrogen trait. Figure 5 shows the scatter plot between the observed and predicted nitrogen traits in the corresponding validation data set of these models. For growth-stage-specific models, the MRE and R 2 range from 3.34% to 10.48% and 0.08 to 1.00 for LNC in both model calibration and validation. According to MRE cv , the model performances are ordered as follows: tillering > jointing > heading > booting. The MRE and R 2 range from 3.20% to 6.86% and 0.39 to 1.00 for PNC in both model calibration and validation. The model performance order is similar to that of LNC. The validation MRE and R 2 range from 7.05% to 16.24% and 0.21 to 1.00 for LNA in both model calibration and validation. According to MRE cv , the model performances are ordered as follows: tillering > jointing > booting > heading. The MRE and R 2 range from 6.78% to 13.35% for PNA in both model calibration and validation. According to MRE cv , the model performances are ordered as follows: tillering > booting > heading > jointing. For each nitrogen trait, the growth-stage-specific model always yields a lower MRE in both model calibration and validation than the corresponding full-growth-stage models. Figure 6 shows the growth-stage-specific and full-growth-stage MRE cv and R 2 cv values of the PLSR and three ML methods (the ANN, RF and SVM) for the four nitrogen traits (LNC, PNC, LNA and PNA). There are no obvious patterns showing a certain method clearly outperforming the other methods. However, the PLSR may yield a large MRE cv in the full-growth-stage models. Of the four nitrogen traits, the booting growth stage models yield the lowest R cv and moderate or highest MRE cv of the growth-stage-specific models and full-growth-stage models.

Performance of the PLSR and ML Methods
For LNC and PNC, the full-growth-stage models yield R 2 cv values comparable to those of the tillering, jointing and heading growth-stage-specific models. For LNA and PNA, the full-growth-stage models yield slightly lower R 2 cv values than the tillering, jointing and heading growth-stage-specific models. However, the R 2 cv values are still above 0. 60 Table 3. The gray line is the 1:1 line.   Table 4 lists the best-performing models of the full-growth-stage and each growthstage-specific models for LNC, PNC, LNA and PNA selected by MRE cv . Figure 7 shows the scatter plots between the observed and predicted nitrogen traits in the corresponding validation data sets. In the full-growth-stage models, the ANN method outperforms the other three methods for LNC and PNC, while the SVM outperforms the other three methods for LNA and PNA. For both LNC and PNC, the full-growth-stage models built on the ANN yield MRE below 7.82% and R 2 above 0.95 both in model calibration and validation. For LNA and PNA, the performance of the full-growth-stage models slightly degrades. The MRE ranges between 11.04% and 14.85%, and R 2 ranges between 0.72 and 0.87 in both model calibration and validation. For LNC, only the tillering growth-stage-specific model yields good performance, with MRE below 3.38% and R 2 above 0.86 in both model calibration and validation. Although the growth-stage-specific models of the other three growth stages (jointing, booting and heading) yield lower MRE val , the R 2 val values are lower (ranging between 0.01 and 0.50). For LNA, all tillering, jointing and heading growth-stage-specific models yield good performance, with MRE below 28.26% and R 2 above 0.84 in both model calibration and validation. The booting growth-stage-specific model yields lower R 2 val (R 2 val = 0.22). For both PNC and PNA, the growth-stage-specific models from tillering to heading growth stages all show good performance, with MRE below 10.83% and R 2 above 0.44 in both model calibration and validation.
Comparing Tables 3 and 4, which present the best-performing VI-based and fullspectra-based models, respectively, the latter is clearly superior, regarding the full-growthstage models of each nitrogen trait. The VI-based growth-stage-specific model of the booting growth stage yields lower accuracy for each nitrogen trait. With the full-spectrabased models, the prediction accuracy is improved for PNC and PNA.  Table 4. Figure 8 shows the pixelwise inversion results of the four N traits (LNC, PNC, LNA and PNC), using the corresponding optimal ML full-growth-stage models listed in Table 4 (the ANN for LNC and PNC and the SVM for LNA and PNA). The N treatment level variation is relatively homogenous. The spatial pattern due to different nitrogen treatments is more obvious in the maps of LNA and PNA than those of LNC and PNC. The N trait variations caused by growth stages are in line with the observations (detailed in Table 2).  −1 , D), using the corresponding optimal full-growth-stage ML models (the ANN for LNC and PNC, and the SVM for LNA and PNA, as listed in Table 4).

Discussion
With a focus on fresh leaves or plant canopies instead of dry leaves or dried and ground leaves, the remote sensing-based estimation of nitrogen traits is mainly based on the relationship between chlorophyll and nitrogen by means of the use of visible and near infrared (VNIR) spectra. The pairwise correlation analysis (Figure 3) shows that the relationships among the four nitrogen traits and those between the four nitrogen traits and the three biochemical traits (LCC, CCC and ABG) are obviously affected by the growth stage, except for the relationship between LNC and PNC. During the vegetative growth stage, N is abundant in the vegetative tissues, while in the reproductive growth stage, N reallocates from the leaves to the reproductive organs (ears, fruits or seeds). In the reallocation process, the assembly and disassembly of N are accompanied by canopy structure variation. Additionally, the ABG increases more slowly with growth stage development, and the "dilution effects" end at the heading growth stage when the plant N content dominates the canopy reflectance [64]. The reallocation of N and variation in the biomass increase the rate change of the relationships between nitrogen and chlorophyll or ABG across the growth stages.
The performance of the growth-stage-specific VI-based nitrogen trait estimation models is relatively constant within the same growth stage. A similar pattern was observed in a previous study [15]. Patel et al. [15] evaluated the performance of several VIs for irrigated perennial ryegrass PNC estimation and found that the selected VIs showed a similar correlation with PNC when applied to single growth stage data. When multiple growth stage data were pooled together, the accuracy of VI-based full-growth-stage models degraded obviously. Li et al. [16] and Patel et al. [15] found similar results. This is because, in addition to the nitrogen and chlorophyll content, canopy reflectance is affected by LAI, ABG, nonphotosynthetic components, soil background and other factors. With the growth stage development, the canopy structure and background situation vary. The VIs that use several bands cannot fully capture the intrinsic relationship between the canopy spectrum and the target variable across growth stages.
There is general interest in exploring the red-edge spectrum, which are bands where reflectance sharply increases from the lower reflectance red region to high reflectance near the infrared region, for biochemical parameter estimation [13,25,28,64,65]. The REP, which is the main inflection point of the red-edge spectrum, has been found to be useful for chlorophyll and nitrogen content estimation. In this work, the REPs calculated with different techniques, such as linear four-point interpolation, linear extrapolation and Gaussian fit, were evaluated. Despite taking advantage of the narrower continuing hyperspectral data, the REPs are not obviously superior to other VIs in either growth-stage-specific or full-growth-stage models. However, there are several VIs that use the red-edge band-such as CI re , mND705, NDPI, and D735-that perform better than other VIs. Several works have pointed out the importance of blue bands for dry-mass-based nitrogen content (LNC and PNC) estimation [16,19,24]; however, our sensor does not cover this region.
Compared to the VI-based models, the PLSR and ML models generally yield better performance for the full-growth-stage models. This may be because more bands are involved, and the full potential of the hyperspectral data is explored. For the growthstage-specific models, the PLSR and ML models generally show similar performance to, or even worse performance than, the corresponding VI-based models. This may indicate that for growth-stage-specific models, the VI-based estimation is optimal and cannot be further improved.

Conclusions
This work evaluates the performance of univariate regression model vegetation indices (VIs), traditional multivariate calibration method partial least squares regression (PLSR) and modern machine learning (ML) methods, such as the artificial neural network (ANN), random forest (RF), and support vector machine (SVM), for paddy rice nitrogen trait-leaf nitrogen content (LNC), plant nitrogen content (PNC), leaf nitrogen accumulation (LNA) and plant nitrogen accumulation (PNA)-estimation, with a special focus on the effects of growth stages. All methods were evaluated both throughout the growing season and at each growth stage (tillering, jointing, booting and heading).
The results indicate that the correlations between the four nitrogen traits and the other three biochemical traits-leaf chlorophyll content (LCC), canopy chlorophyll content (CCC) and aboveground biomass (ABG)-are affected by the growth stage. Within a single growth stage, the performance of the selected VIs is relatively constant. For full-growth-stage models, the performance of the VI-based models is more diverse. For full-growth-stage models, the TCARI/OSAVI performed best for LNC, PNC and PNA estimation, while the TBVI Tian performed best for LNA estimation. There are no obvious patterns showing whether the PLSR, ANN, RF or SVM performs best in either the growth-stage-specific or full-growth-stage models.
For the growth-stage-specific models, a lower MRE and higher R 2 can be acquired at the tillering and jointing growth stages. The PLSR and ML methods yield obviously better estimation accuracy for the full-growth-stage models than the VI-based models. For the growth-stage-specific models, the performance of VI-based models seems optimal and cannot be obviously improved. These results suggest that building linear regression models on VIs for paddy rice nitrogen traits estimation is still a reasonable choice when only a single growth stage is involved. However, when multiple growth stages are involved or the phenology information is missing, using PLSR or ML methods is a better option. Nevertheless, it must be noticed that only one paddy rice variety was tested in our study. Whether our conclusions still hold when other varieties/cultivations or multiple varieties/cultivations are involved needs further evaluation.