Estimation of Leaf Nitrogen Content in Wheat Using New Hyperspectral Indices and a Random Forest Regression Algorithm

Two novel hyperspectral indices, the first derivative normalized difference nitrogen index (FD-NDNI) and the first derivative ratio nitrogen vegetation index (FD-SRNI), were developed to estimate the leaf nitrogen content (LNC) of wheat. Field stress experiments were conducted with different nitrogen and water application rates across the wheat growing season, and 190 paired measurements of canopy spectra and LNC were collected under the various treatments. Inversion models were constructed from this dataset to evaluate the ability of various spectral indices to estimate LNC. A comparative analysis showed that the models based on FD-NDNI and FD-SRNI were more accurate than those based on other commonly used hyperspectral indices, including mNDVI705, mSR, and NDVI705, as indicated by higher R² and lower root mean square error (RMSE) values. The least squares support vector regression (LS-SVR) and random forest regression (RFR) algorithms were then used to optimize the models constructed from FD-NDNI and FD-SRNI. The p-R² values of the FD-NDNI_RFR and FD-SRNI_RFR models reached 0.874 and 0.872, respectively, which were higher than those of the exponential and LS-SVR models and indicated that the RFR model was accurate. Using the RFR inversion model, remote sensing mapping was carried out on an Operative Modular Imaging Spectrometer (OMIS) image. The mapping yielded an accuracy of R² = 0.721 and RMSE = 0.540 for FD-NDNI and R² = 0.720 and RMSE = 0.495 for FD-SRNI, which indicates that the inverted values agreed well with the measured values. The results show that the new hyperspectral indices, FD-NDNI and FD-SRNI, are the optimal hyperspectral indices for estimating LNC and that the RFR algorithm is the preferred modeling method.


Introduction
Nitrogen content is an essential indicator of the nutritional level and health status of crops. Nitrogen deficiency significantly reduces the photosynthetic yield of crops while excessive application of nitrogen fertilizer can cause stress to crops and environmental pollution [1]. Therefore, information on the nitrogen contents of crop leaves must be obtained for scientific and rational decision-making in agronomy [2,3].
The traditional methods for acquiring chemical crop parameters such as Kjeldahl determination have the disadvantages of being time consuming, restricted to point-source information, and difficult to acquire at the macro scale. In contrast, hyperspectral remote sensing technology can be applied to rapidly estimate the spatiotemporal variations in crop nitrogen content on a macro scale at a relatively low cost when compared to the cost of field measurements [4-6]. Currently, important progress has been made in estimating the nitrogen contents in crops such as wheat using hyperspectral indices and there have been a series of studies that have discussed the properties of different indices [1,2,4-9].
However, most of the indices in current research were directly constructed from spectral reflectance data. Generally, the reflectance spectra collected from a field are a mixture of crop and soil information due to the limited canopy density of crops [2]. Therefore, the spectral indices composed of reflectance spectra usually contain soil background information, which limits the accuracy of the estimation model based on these indices. Fortunately, there is evidence that derivative processing could effectively reduce the soil background values and make remote sensing estimation of biophysical and biochemical parameters more reliable. Furthermore, derivative techniques are effective in enhancing weak spectral features and extracting critical wavelengths by reducing the influence of trends or low-frequency noise [2,10-12].
Therefore, if the spectral indices used for nitrogen estimation are constructed from derivative spectra, the background soil information could be reduced and the feature information of crop nitrogen content could be extracted more effectively. Indices constructed from derivative spectra have already been applied successfully to monitor the photosynthetic organ area of rape and to estimate the leaf area index (LAI) of rice and the leaf water content of wheat [12-14]. In this paper, derivative spectra are used to construct new spectral indices that provide a more accurate method for the hyperspectral remote sensing estimation of the leaf nitrogen content (LNC) of crops.
Generally, spectral indices are computationally simple and can reduce the impacts of interference factors and refine the target information. Thus, they can be considered an optimal method for estimating the physiological and biochemical parameters of vegetation [15-20]. Nevertheless, compared to using the full set of bands as model variables, using spectral indices to estimate nitrogen content reduces the number of model input variables and may result in a lower inversion accuracy. Fortunately, certain algorithms such as random forest regression (RFR) perform well with several or even a single variable if the input variable is highly important and representative [15-17,21-23].
Wheat is one of the most important food crops in the world and is also a large consumer of nitrogen fertilizer among crops [24,25]. In China, the nitrogen use efficiency of wheat production is only about 35%, which is far lower than that of developed countries [25,26]. Therefore, it is necessary to design spectral indices that can sensitively characterize the LNC of wheat to improve nitrogen management and enhance productivity. In this paper, we construct new hyperspectral indices that are sensitive to nitrogen content and then use the RFR algorithm to optimize the modeling method to achieve an accurate inversion of the nitrogen content of wheat. To achieve these objectives, the following steps were performed: (1) screening out the appropriate hyperspectral indices for LNC inversion, (2) using the RFR algorithm to optimize the model, and (3) validating the optimal LNC inversion model (i.e., established by appropriate indices and algorithms) by using the data that were collected from an agricultural remote sensing experiment.

Study Area and Stress Experiment
The study site is located at the National Experiment Station for Precision Agriculture (latitude 40°10′31″N to 40°11′18″N, longitude 116°26′10″E to 116°27′05″E), which covers an area of 167 ha and has flat terrain (30-100 m above the mean sea level). It is approximately 20 km northeast of Beijing, China. The site is within the warm temperate continental monsoon climate zone and has a mean annual precipitation of 507 mm, a mean annual sunshine duration of 2684 hours, and a mean annual temperature of 13 °C. The experimental station was established and is operated for precision agriculture research. The object of this study is winter wheat (Triticum aestivum L.), which is one of the most essential crops in North China. The initial soil properties of the study site (0-20 cm plow layer) for the cropping seasons are shown in Table 1. To ensure that a sufficient range of nitrogen levels was available for modeling, the stress experiments were conducted in the wheat growing season of 2007. The stress experiments were conducted in 24 subplots (360 m² in size) of the experimental field and included six nitrogen application levels (0, 75, 150, 225, 300, and 375 kg/ha) and six water application levels (0, 225, 500, 725, 1000, and 1125 m³/ha). Each level consisted of two replicates (Figure 1).

Figure 1. The background is the true color composite of the OMIS image (477 nm for blue, 553 nm for green, and 638 nm for red). The six levels of nitrogen stress application were as follows: N1, 0 kg/ha; N2, 75 kg/ha; N3, 150 kg/ha; N4, 225 kg/ha; N5, 300 kg/ha; and N6, 375 kg/ha. The six levels of water stress application were as follows: W1, 0 m³/ha; W2, 225 m³/ha; W3, 500 m³/ha; W4, 725 m³/ha; W5, 1000 m³/ha; and W6, 1125 m³/ha.

Field Sampling Measurement
The canopy reflectance spectra from the jointing to the booting stage of wheat (between 4 April and 13 May 2007) were collected by spectral scanning between 350 and 2500 nm with an ASD Fieldspec Pro FR spectroradiometer (ASD Inc., Boulder, CO, USA), and wheat leaves were sampled at the corresponding points. Over the 350-1000 nm range, the sampling interval is 1.4 nm and the spectral resolution is 3 nm. Over the 1000-2500 nm range, the sampling interval is approximately 2 nm and the spectral resolution is 10 nm. The measurements were conducted under clear and cloudless sky conditions between 10:00 and 15:00 to ensure a high solar elevation angle during spectral acquisition. The operators wore dark clothes and stood behind the target area facing the sun to minimize interference with the spectral acquisition. The sensor, fitted with a fiber optic probe with a 25° field of view, was placed 1.3 m above the canopy and pointed perpendicular to the canopy surface. To reduce the impact of environmental changes, each sample was measured 10 times and the average was taken, and the instrument was calibrated against the reference plate every half hour. A total of 190 samples were collected and divided randomly into two groups: 142 samples were used as the training set and 48 samples as the prediction set. All original spectra were denoised by the wavelet threshold denoising method [5].
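The random division into training and prediction sets can be sketched as below. This is a minimal illustration only: the function name `split_samples` and the fixed seed are assumptions made here for reproducibility, and the paper does not state how the random split was generated.

```python
import random

def split_samples(samples, n_train=142, seed=0):
    """Randomly split samples into a training set and a prediction set.

    n_train=142 follows the 142/48 division described in the text;
    the seed is a hypothetical choice so the split can be repeated.
    """
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

# Usage: split 190 sample identifiers into 142 training and 48 prediction samples.
train_ids, test_ids = split_samples(range(190))
```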
Immediately after the canopy reflectance spectra were measured, the leaves at the location of each canopy reflectance measurement were sampled and labeled with a serial number within 20 minutes. Afterward, the LAI and LNC of the samples were determined in the laboratory. The LAI was determined by the dry weight method and rectified by a laser area meter (Type C1-203). The LNCs were determined by the Kjeldahl method (GB 7173-87) using dry samples.

Airborne Hyperspectral Remote Sensing Image
To analyze the adaptability of the indices during the actual remote sensing process, we used Operative Modular Imaging Spectrometer (OMIS) imagery to test them. In this study, OMIS imagery was collected by a Y-5 aircraft at an altitude of 1 km with an approximate pixel size of 3 m between 10:30 and 13:30 on 26 April 2007. The data contain 128 channels covering the visible/near-infrared portions of the solar spectrum from 0.4 to 12.5 µm with a bandwidth (spectral resolution) of 10 nm. The imagery was radiometrically corrected to "at sensor" radiance and then the quick atmospheric correction (QUAC) and empirical linear correction (ELC) atmospheric correction methods were used to obtain the ground reflectance. Afterward, the OMIS imagery was georeferenced using GPS data collected at the center of the sites with ground white targets. All of the above processes were conducted in ENVI 5.1 (Environmental Systems Research Institute, Inc. Redlands, CA, USA).
While the remote sensing images were being acquired, synchronous sampling was carried out at 36 uniformly pre-distributed field sampling points (six fell in the wheat-free area, leaving 30 effective sampling points). The samples were brought to the laboratory to measure the LNC and analyze the accuracy of the remote sensing mapping. To facilitate a comparison with the remote sensing mapping results, the geographic coordinates of the sample points were obtained by dynamic differential GPS at the sub-meter level (Figure 1).

Spectral Indices for This Research
Several spectral indices have been reported in the literature for LNC estimation in plant leaves. Fourteen spectral indices with clear physical meanings and high degrees of recognition were selected for comparative analysis in this paper. The calculation methods and literature sources of these indices are shown in Table 2. The wavebands used by these indices were in the visible and near-infrared range in accordance with the existing studies (Table 2).

Table 2. Spectral indices selected from the literature (columns: Indices; Formula or Definition; References).
However, although these spectral indices could represent the nitrogen contents of the crops sensitively, they were directly constructed from reflectivity, which makes it difficult to avoid the effects of the soil background. To reduce the background signals or noise, resolve overlapping spectral features, and enhance the relationship between spectral data and the LNC, we applied two spectral index approaches, i.e., the normalized difference and the simple ratio, to first derivative spectra. In this study, the first derivative spectra in the range of 350 to 1350 nm were selected for analysis to avoid the relatively high spectral noise at wavelengths greater than 1350 nm. The normalized difference nitrogen index of the first derivative (FD-NDNI) is defined as:
$$\text{FD-NDNI}_{(i,j)} = \frac{FD_i - FD_j}{FD_i + FD_j}$$
where $FD_i$ and $FD_j$ are the first derivative values at $i$ and $j$ nm over the whole hyperspectral range [45]. Similarly, another transformation, the simple ratio nitrogen index of the first derivative (FD-SRNI), is defined by the equation below.
$$\text{FD-SRNI}_{(i,j)} = \frac{FD_i}{FD_j}$$
Spectral indices using complete combinations of the full spectral bands were calculated using the FD-NDNI and FD-SRNI formulas. The contour map of relevant statistical indicators such as the determination coefficient (R²) between the spectral index and a specific target variable can provide comprehensive information about the predictive power of the combination of two separate hyperspectral wavelengths. Maps of FD-NDNI and FD-SRNI are useful for selecting optimal wavelengths, optimal band combinations, and effective bandwidths [46]. In this study, we derived the contour maps of R² between the spectral indices and the LNC to screen out the new indices that had a high predictive ability. To screen out the appropriate indices for estimating the LNC, the newly constructed indices were compared with 14 vegetation indices derived from the literature.
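The band-pair screening described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' implementation: the derivative is approximated by finite differences, R² is taken as the squared Pearson correlation between the index and LNC (the paper may have used the R² of a fitted curve instead), and all function names are hypothetical.

```python
from itertools import permutations

def first_derivative(wavelengths, reflectance):
    """Finite-difference derivative: central inside, one-sided at the ends."""
    n = len(wavelengths)
    fd = []
    for k in range(n):
        lo, hi = max(k - 1, 0), min(k + 1, n - 1)
        fd.append((reflectance[hi] - reflectance[lo]) / (wavelengths[hi] - wavelengths[lo]))
    return fd

def fd_ndni(fd_i, fd_j):
    """Normalized difference of two first-derivative values."""
    return (fd_i - fd_j) / (fd_i + fd_j)

def fd_srni(fd_i, fd_j):
    """Simple ratio of two first-derivative values."""
    return fd_i / fd_j

def r_squared(x, y):
    """Squared Pearson correlation (assumed definition of R² for the map)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)

def best_band_pair(fd_spectra, lnc, index_fn):
    """Scan all ordered band pairs; return (R², i, j) of the best pair.

    fd_spectra holds one derivative spectrum (a list) per sample.
    Pairs where the index is undefined (division by zero) are skipped.
    """
    n_bands = len(fd_spectra[0])
    best = (-1.0, None, None)
    for i, j in permutations(range(n_bands), 2):
        try:
            values = [index_fn(s[i], s[j]) for s in fd_spectra]
        except ZeroDivisionError:
            continue
        r2 = r_squared(values, lnc)
        if r2 > best[0]:
            best = (r2, i, j)
    return best
```

Storing the R² of every pair instead of only the best one would give the matrix from which a contour map can be drawn. A pure-Python scan over a full 1 nm grid is slow, so a vectorized array library would be the practical choice for real spectra.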

Modeling and Optimization
Using the spectral index as the independent variable (x) and the LNC as the dependent variable (y), we selected the best method from various curve-fitting algorithms (linear regression, exponential regression, logarithmic regression, etc.) to construct the estimation model. Afterward, we evaluated the model by the R² and root mean square error (RMSE) values to identify the optimal spectral index.
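As an illustration of one such curve-fitting model and its evaluation, the sketch below fits an exponential curve y = a·exp(b·x) by least squares on ln(y) and computes R² and RMSE from the predictions. The log-linear fitting route and the function names are assumptions made for this example; the paper does not state which fitting procedure was used.

```python
import math

def fit_exponential(x, y):
    """Fit y = a * exp(b * x) by ordinary least squares on ln(y); needs y > 0."""
    n = len(x)
    ln_y = [math.log(v) for v in y]
    mx, ml = sum(x) / n, sum(ln_y) / n
    b = sum((xi - mx) * (li - ml) for xi, li in zip(x, ln_y)) / sum((xi - mx) ** 2 for xi in x)
    a = math.exp(ml - b * mx)
    return a, b

def predict_exponential(a, b, x):
    """Evaluate the fitted curve at each value in x."""
    return [a * math.exp(b * xi) for xi in x]

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

Fitting on ln(y) weights relative rather than absolute errors, so the result can differ slightly from a direct nonlinear least squares fit.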
However, the function types must be predefined to use these methods. Difficulties will be encountered during this process if the sample scatter diagram does not show a regular pattern of a functional form. Several machine learning algorithms such as support vector machine regression (SVR) and RFR can cope with the strong nonlinearity of the functional dependence between target parameters and spectral indices. In addition, these algorithms do not require predefined function types and, therefore, may be more suitable candidates than curve-fitting methods. In this study, SVR and RFR were applied to optimize the estimation model.
Least squares support vector machine regression (LS-SVR) algorithm. The support vector machine (SVM) algorithm, which is a machine learning algorithm based on the principle of structural risk minimization, can ensure the accuracy of a calibration model while reducing the complexity of machine learning to obtain an effective generalization ability and high prediction accuracy [47]. The LS-SVR algorithm has been widely used to estimate target parameters for remote sensing technology in recent years [2,15,48,49]. In this study, the radial basis function (RBF) was applied as the kernel function in the LS-SVR model. The penalty coefficient C and the RBF kernel function parameter g, which have significant impacts on the estimation accuracy, were optimized by cross-validation. To reduce the search difficulty and save computation time, a two-step grid search method was used for cross-validation: (1) a large step length over a large value range was used to obtain the general range of parameters and (2) a small step length based on the above result was used to optimize the parameter values [2]. The termination criteria for the grid search and final training were set at 0.1 and 0.001, respectively.
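The two-step grid search can be sketched generically as below. The exponent ranges, step lengths, and the `toy_score` function are assumptions for illustration only; in practice `score` would return a cross-validated accuracy of the LS-SVR model for a given (C, g), and the paper does not report the ranges it searched.

```python
import math

def two_step_grid_search(score, c_exp_range=(-5, 15), g_exp_range=(-15, 3),
                         coarse_step=2.0, fine_step=0.25):
    """Maximize score(C, g) with C = 2**c and g = 2**g_exp.

    Step 1 scans a wide range with a large step; step 2 rescans the
    neighborhood of the coarse optimum with a small step.
    """
    def frange(start, stop, step):
        n = int(round((stop - start) / step))
        return [start + k * step for k in range(n + 1)]

    def search(c_exps, g_exps):
        best = None
        for ce in c_exps:
            for ge in g_exps:
                s = score(2.0 ** ce, 2.0 ** ge)
                if best is None or s > best[0]:
                    best = (s, ce, ge)
        return best

    _, ce, ge = search(frange(*c_exp_range, coarse_step),
                       frange(*g_exp_range, coarse_step))
    s, ce, ge = search(frange(ce - coarse_step, ce + coarse_step, fine_step),
                       frange(ge - coarse_step, ge + coarse_step, fine_step))
    return 2.0 ** ce, 2.0 ** ge, s

def toy_score(C, g):
    """Hypothetical stand-in for a cross-validation score (peak at 2**3.25, 2**-4.75)."""
    return -((math.log2(C) - 3.25) ** 2 + (math.log2(g) + 4.75) ** 2)

best_C, best_g, best_score = two_step_grid_search(toy_score)
```

Searching on a logarithmic grid keeps the number of model fits small while still covering several orders of magnitude for both parameters.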
Random forest regression (RFR) algorithm. RFR, which was first developed by Breiman [23], is an ensemble machine learning algorithm based on regression trees. RFR relies on the assumption that different independent predictors predict incorrectly in different areas and, thus, the overall prediction accuracy can be improved by combining the prediction results of the independent predictors. The structures of regression trees in RFR exhibit significant differences when the training data vary slightly. By using this characteristic and combining it with bagging (bootstrap aggregating) and random feature selection, independent predictors can be created to construct a random forest. During RFR modeling, the training data for each predictor in the ensemble are generated by sampling with replacement from all of the samples [21,23]. The RFR algorithm requires the number of random features, the number of trees, and the stop criteria to be set. In this study, we set the number of features to the square root of the total number of features and trained the RFR model with 100 trees. The stop criteria were a minimum of one sample per leaf and a minimum impurity of zero.
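To make the bagging idea concrete, the sketch below grows fully developed regression trees on bootstrap samples of a single input variable and averages their predictions. It is a simplified illustration rather than the implementation used in the study: with one spectral index as input, the random feature selection step is trivial (the square root of one feature is one), and the function names, tuple-based tree layout, and seed are assumptions.

```python
import random

def build_tree(xs, ys, min_samples=1):
    """Grow a one-feature regression tree until a node is pure or holds min_samples."""
    mean = sum(ys) / len(ys)
    sse = sum((y - mean) ** 2 for y in ys)
    if len(ys) <= min_samples or sse == 0.0 or len(set(xs)) == 1:
        return mean  # leaf: predict the node mean
    pairs = sorted(zip(xs, ys))
    best = None
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # cannot split between identical x values
        left = [y for _, y in pairs[:k]]
        right = [y for _, y in pairs[k:]]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        cost = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or cost < best[0]:
            best = (cost, k, (pairs[k - 1][0] + pairs[k][0]) / 2)
    _, k, threshold = best
    lx, ly = zip(*pairs[:k])
    rx, ry = zip(*pairs[k:])
    return (threshold, build_tree(lx, ly, min_samples), build_tree(rx, ry, min_samples))

def predict_tree(tree, x):
    """Descend from the root to a leaf and return the leaf mean."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree

def fit_forest(xs, ys, n_trees=100, seed=0):
    """Train n_trees trees, each on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(build_tree([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest

def predict_forest(forest, x):
    """Average the predictions of all trees."""
    return sum(predict_tree(t, x) for t in forest) / len(forest)
```

Each tree overfits its own bootstrap sample, and averaging many such trees reduces the variance of the combined estimate, which is the motivation given above for combining independent predictors.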

Effects of Nitrogen Stress on LNC and Canopy Spectra of Wheat
Figure 2a shows that the wheat LNC under the different nitrogen application rates decreased as growth progressed. Within the same growth stage, the LNC of wheat increased with the nitrogen application rate, but the increase became smaller once the nitrogen rate reached 150 kg/ha. This may reflect a diminishing response of wheat to nitrogen once fertilizer is applied in excess. Figure 2b shows the effects of 30 days of nitrogen stress at different levels on the canopy spectra of wheat. The spectra beyond 1350 nm were removed because that region is of relatively little value in vegetation analysis and the noise is strong in some regions. The figure shows that, as the nitrogen content of the wheat canopy increased, the canopy spectral reflectance increased significantly in the near-infrared region and decreased in the visible region. This is consistent with our understanding of plant canopy spectra. When fertility is sufficient, crops are usually denser and photosynthetically more active, which means they absorb more intensely in the visible region and have higher reflectance in the near-infrared region.

Spectral Indices Optimized for LNC Estimation
Figure 3a shows the map of R² between LNC and FD-NDNI(i,j) using the complete combinations of two wavebands at i and j nm in the form of the normalized difference calculation. From the image, the useful combinations of wavelengths as well as the waveband widths can be inferred for estimating LNC. The waveband area with the most significant relation to LNC was found around FD-NDNI(715,516), which had an R² of 0.861 (Figure 3a). The other type of index, FD-SRNI(i,j), uses the complete combinations of two wavelengths in the form of a ratio calculation. Among these combinations, FD-SRNI(716,526) had the most significant correlation with LNC, with an R² of 0.862 (Figure 3b). From the knowledge of vegetation spectra, we can see that the indices sensitive to nitrogen content are mainly composed of the spectra around the red edge and green peak spectral areas. In addition, as shown in Figure 3, FD-NDNI(715,516) had a broader area of significant points than FD-SRNI(716,526).

The results suggest that the combination of the first derivative values at approximately 715 and 520 nm has a significant role in estimating LNC. In both the FD-SRNI and FD-NDNI maps (Figure 3), the peak width at approximately 520 nm was as narrow as 30 nm even though the width at approximately 715 nm was relatively broad. The R² value between LNC and FD-SRNI(i,j)/FD-NDNI(i,j) dropped sharply beyond this spectral range. To quantitatively verify the influence of the bandwidth on the estimation accuracy, the R² value between LNC and FD-NDNI(715,516)/FD-SRNI(716,526) was calculated when the bandwidth was 5 nm, 10 nm, 20 nm, …, 70 nm (Figure 4).
Figure 4 shows that, when the bandwidth is less than 30 nm, the R² value decreases slightly but remains above 0.843. When the bandwidth is greater than 30 nm, the R² value decreases rapidly. This result indicates that, when the bandwidth is greater than 30 nm, the ability of the spectral indices to estimate the wheat LNC drops sharply as the bandwidth increases. Therefore, when constructing an appropriate spectral index to estimate wheat LNC, a relatively high spectral resolution (<30 nm) is required to determine the optimal position. This result facilitates the efficient selection of the optimum bandwidth and wavelength in the design of new spectral indices for LNC estimation and new sensors for agricultural remote sensing.
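One simple way to emulate a broader band is to average the first derivative values inside a window around each center wavelength before computing the index. The sketch below does this with a rectangular (boxcar) window; the window shape and the function names are assumptions, since the paper does not describe how the different bandwidths were simulated.

```python
def band_mean(wavelengths, values, center, bandwidth):
    """Mean of the values whose wavelength lies within center ± bandwidth / 2."""
    half = bandwidth / 2.0
    selected = [v for w, v in zip(wavelengths, values)
                if center - half <= w <= center + half]
    if not selected:
        raise ValueError("no samples inside the band")
    return sum(selected) / len(selected)

def broadband_fd_ndni(wavelengths, fd, center_i=715, center_j=516, bandwidth=10):
    """FD-NDNI computed from band-averaged first derivative values."""
    fd_i = band_mean(wavelengths, fd, center_i, bandwidth)
    fd_j = band_mean(wavelengths, fd, center_j, bandwidth)
    return (fd_i - fd_j) / (fd_i + fd_j)
```

Repeating the calculation for bandwidths of 5, 10, 20, ..., 70 nm and recomputing R² against LNC at each step would reproduce the kind of sensitivity curve discussed above.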
Figure 4. FD-NDNI takes 715 and 516 nm as the center wavelengths and FD-SRNI takes 715 and 526 nm as the center wavelengths.

Comprehensive Comparison of Various Indices for Estimating LNC
The LNC estimation results obtained with the optimal curve-fitting algorithms and the various spectral indices, including those from the literature (Table 2) and those proposed in this paper, are presented in Table 3. To identify the indices that performed consistently for LNC estimation, the indices were sorted by R² in descending order (Table 3). Table 3 shows that FD-SRNI and FD-NDNI, the spectral indices designed in this paper, had high R² (0.861 for FD-SRNI and 0.860 for FD-NDNI) and low RMSE (0.332 for FD-SRNI and 0.333 for FD-NDNI) values and appeared at the top of the rankings, which indicates that these two spectral indices were strongly related to LNC. In addition, the estimation models based on mNDVI705, mSR705, NDVI705, and GREEN-NDVI also had relatively high accuracies, with R² values in the range of 0.771 to 0.748 and RMSE values in the range of 0.424% to 0.445%, and ranked third through sixth, respectively.
Although these results indicated that FD-NDNI and FD-SRNI were more likely to yield highly accurate LNC estimations, ideal spectral indices should also perform well on different remote sensing platforms. Therefore, the new spectral indices are further analyzed using the OMIS data in the next section.

LNC Estimation Modeling for OMIS
The wavelength range and spectral resolution of the canopy spectra collected by the ASD Fieldspec Pro FR differed from those of the OMIS data. To match the ASD Fieldspec Pro FR data with the OMIS remote sensing image data, the field-measured spectra were converted into simulated OMIS spectra according to the spectral response function of the OMIS bands. Using the OMIS spectra, the spectral indices suited for LNC estimation were quantified by using the following equations, where $R_x$ is the reflectance, $x$ is the central OMIS wavelength, and $FD_x$ is the first derivative value of the OMIS reflectance.
$$\text{FD-SRNI} = \frac{FD_{710.9}}{FD_{527.7}}$$
$$\text{FD-NDNI} = \frac{FD_{710.9} - FD_{515.3}}{FD_{710.9} + FD_{515.3}}$$
LNC inversion models of the various spectral indices were constructed using the curve-fitting, LS-SVR, and RFR algorithms. The results are shown in Table 4. For the curve-fitting models, a comprehensive analysis of Tables 3 and 4 suggests that the accuracies of the spectral indices calculated from the OMIS bands were consistent with those derived from the original spectral bands, which indicates that OMIS bands can be used instead of the original spectral bands to retrieve the wheat LNC for various spectral indices. In addition, compared to the curve-fitting and LS-SVR methods, the accuracies of the prediction results were generally improved by using the RFR algorithm, as indicated by higher R² and lower RMSE values for the various spectral indices (Table 4). These results indicated that the RFR algorithm was the optimal modeling method when compared to the curve-fitting and LS-SVR methods. Therefore, the RFR algorithm and the new spectral indices, FD-NDNI and FD-SRNI, were selected to validate the accuracy of the LNC estimations using the OMIS image.
The study area contains a variety of surface features, but only the wheat coverage area is the target. The OMIS image of the study area was pre-treated by geometric correction, atmospheric correction, de-noising, and independent component analysis (ICA). Using the top six ICA wavebands as the classification variables, the study area in the OMIS image was classified into several categories including wheat, bare soil, water body, and impervious surfaces. The wheat coverage area was extracted by masking the non-wheat coverage area (Figure 5). Then the first derivative thematic map of the image was calculated.

LNC Mapping and Accuracy Test
The above analysis showed that FD-NDNI and FD-SRNI were the optimal indices for nitrogen estimation and RFR was the optimal algorithm for modeling. The remote sensing mapping of the wheat LNC was achieved by inputting the first derivative thematic map of the image to the LNC estimation model established by the RFR algorithm and FD-NDNI/FD-SRNI (Figure 6). The spatial distribution map shows the detailed information of the wheat LNC, which can provide a scientific basis for a quantitative fertilization scheme.

The ground-measured LNC values from the synchronous sampling at the time of remote sensing image acquisition were used for the accuracy test. The estimated and ground-measured values at the same location were compared by regression fitting (Figure 7). As seen from the figure, FD-NDNI yields an accuracy of R² = 0.721 and RMSE = 0.540 for 30 sites with LNC values ranging from 1.85% to 4.71%. For FD-SRNI, the estimation yields an accuracy of R² = 0.720 and RMSE = 0.495 with LNC values ranging from 1.87% to 4.74%. Furthermore, the relationships between the estimates and ground-based measurements both have a slope close to 1 (0.782 for FD-NDNI and 0.864 for FD-SRNI) and a small intercept (0.512 for FD-NDNI and 0.357 for FD-SRNI), which indicates that the modeling approach produces systematically fair estimates.
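The accuracy test compares estimated and measured LNC through a fitted line and summary statistics. A minimal sketch of that comparison is given below; it assumes that R² is the squared correlation of the fitted line and that RMSE is computed directly between estimates and measurements, and the function name `validate` is hypothetical.

```python
import math

def validate(measured, estimated):
    """Regress estimated on measured values and report slope, intercept, R², RMSE."""
    n = len(measured)
    mx, my = sum(measured) / n, sum(estimated) / n
    sxx = sum((x - mx) ** 2 for x in measured)
    syy = sum((y - my) ** 2 for y in estimated)
    sxy = sum((x - mx) * (y - my) for x, y in zip(measured, estimated))
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = sxy * sxy / (sxx * syy)
    rmse = math.sqrt(sum((y - x) ** 2 for x, y in zip(measured, estimated)) / n)
    return {"slope": slope, "intercept": intercept, "r2": r2, "rmse": rmse}
```

A slope near 1 together with an intercept near 0 means the estimates follow the 1:1 line, while RMSE summarizes the typical size of the deviation in LNC units (%).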

Discussion
Nitrogen content is an important indicator of crop growth and physiological status, so a spectral index that estimates it accurately is valuable. Previous researchers conducted a series of studies on different crops and attempted to design spectral indices that respond sensitively to crop nitrogen content without being disturbed by plant canopy density and other environmental factors. At present, in addition to traditional vegetation indices such as NDVI705, many spectral indices are used specifically for nitrogen estimation, such as NDNI, NDVI(573,440), Viopt, RVI(810,560), RVI(950,660), and RVI(810,660) [41][42][43][44]. However, most of these indices are constructed directly from the reflectance spectra of crops. When the underlying surface of the target contributes to the spectral information, such an index is easily influenced by the background, which introduces an interference factor into the model [10,12]. This is also why these indices fail to achieve good results in this paper (Table 3).
Nevertheless, during the remote sensing acquisition of physical parameters, soil is unavoidably present as background information. Therefore, if a spectral index can weaken or even remove the interference from the soil background, the accuracy of remotely sensed estimates of physical parameters will improve [2,15,18]. The new indices constructed in this paper, FD-NDNI and FD-SRNI, are calculated from the first-order derivative values of the spectrum. The first derivative treatment effectively reduced the influence of the soil background on the spectra and laid the foundation for constructing indices that accurately estimate LNC [10][11][12].
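Why a first derivative can suppress background can be illustrated with a small sketch. It rests on two assumptions that are not taken from the paper: the derivative is approximated by central differences, and the soil background is modeled as a constant additive offset over a narrow spectral window. The function name and all reflectance values are hypothetical.

```python
def first_derivative(wavelengths, reflectance):
    """Central-difference first derivative at the interior bands."""
    return [
        (reflectance[k + 1] - reflectance[k - 1])
        / (wavelengths[k + 1] - wavelengths[k - 1])
        for k in range(1, len(wavelengths) - 1)
    ]

# Hypothetical reflectance samples around the red edge (illustrative values only).
wavelengths = [705.0, 710.0, 715.0, 720.0, 725.0]
vegetation = [0.12, 0.17, 0.24, 0.31, 0.36]

soil_offset = 0.05  # assumed constant background contribution
mixed = [r + soil_offset for r in vegetation]

d_veg = first_derivative(wavelengths, vegetation)
d_mixed = first_derivative(wavelengths, mixed)

# A constant offset cancels in the differences, so both derivative spectra agree.
offset_removed = all(abs(a - b) < 1e-12 for a, b in zip(d_veg, d_mixed))
print(offset_removed)
```

Real soil spectra are not perfectly flat, so the cancellation in practice is only partial; the sketch shows the limiting case.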
In this study, the sensitive bands, i.e., approximately 715 nm and 520 nm, were screened out to construct the new indices by using contour maps of the R2 values between the spectral indices and the LNC. The transition from the blue light absorption band of the vegetation spectra to the green peak occurs at approximately 520 nm, i.e., the blue edge turning point. In addition, the transition point from the red absorption band to the near-infrared band is at 715 nm [50]. Nitrogen is one of the most important nutrient elements in plants and has a great influence on the pigment contents and photosynthetic capacities of plant leaves [3][4][5][51][52][53]. When the nitrogen content is sufficient and the physiological activity of a plant leaf is vigorous, chlorophyll strongly absorbs blue and red light, which leads to an obvious green peak and a steep blue edge. At the same time, the near-infrared reflectance is further elevated, which leads to a steep red edge. As a result, the slopes of the reflectance spectra near 520 nm and 715 nm increase and the first derivative values grow. When the nitrogen content is insufficient, the physiological activity of the leaves is stressed and the ability of chlorophyll to absorb blue and red light is weakened. Thus, the green peak of the vegetation spectra flattens and the red edge becomes relatively gentle, so the slopes of the reflectance spectra near 520 nm and 715 nm decrease and the first derivative values decrease accordingly [2]. Therefore, the above analysis indicates that the bands at 520 nm and 715 nm can sensitively reflect the LNC of wheat.
However, even though the bands at 520 nm and 715 nm can sensitively reflect the nitrogen contents of crops, the vegetation spectra are also influenced by factors such as observation geometry, radiation intensity, and the distribution of leaf inclinations in the field. Therefore, if a single band is used as an index, it will be strongly influenced by these non-target factors and will not accurately reflect the target factor. In contrast, environmental effects that act on different bands simultaneously can be offset by the ratio or normalization method. Therefore, better results can be obtained if two or more bands are used to construct an index [2,20,54]. In this paper, the indices FD-NDNI and FD-SRNI were constructed by the normalization and ratio methods, respectively, and these indices achieve good results for LNC estimation.
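The cancellation argument above can be made concrete with a short sketch. The exact index definitions are not restated in this section, so the code assumes the usual normalized-difference and simple-ratio forms applied to first-derivative values at two bands; the function names, the band order, the derivative values, and the gain factor are all hypothetical.

```python
def fd_ndni(fd_i, fd_j):
    """Normalized difference of two first-derivative values (assumed form)."""
    return (fd_i - fd_j) / (fd_i + fd_j)

def fd_srni(fd_i, fd_j):
    """Simple ratio of two first-derivative values (assumed form)."""
    return fd_i / fd_j

# Hypothetical first-derivative values near 715 nm and 520 nm.
fd_715, fd_520 = 0.0070, 0.0020

# Assumed common multiplicative disturbance, e.g. a change in radiation intensity
# that scales both bands by the same factor.
gain = 1.3

ndni_clean = fd_ndni(fd_715, fd_520)
ndni_scaled = fd_ndni(gain * fd_715, gain * fd_520)
srni_clean = fd_srni(fd_715, fd_520)
srni_scaled = fd_srni(gain * fd_715, gain * fd_520)

# A single band changes with the gain, while both two-band indices do not.
single_band_change = abs(gain * fd_715 - fd_715)
print(single_band_change, abs(ndni_scaled - ndni_clean), abs(srni_scaled - srni_clean))
```

Only disturbances that scale both bands equally cancel exactly; effects that differ between the two bands remain in the index.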
In addition, the modeling algorithm has a comparatively large impact on the inversion results. According to Breiman and Polikar, the RFR algorithm is a simple, robust, and effective regression method [21,23]. In this paper, the models established with the different indices show that the RFR algorithm is more accurate than the curve-fitting method and the SVR algorithm, as indicated by higher R2 values and lower RMSE values (Table 4). The results agree with studies indicating that the RFR algorithm is the preferred scheme for regression modeling [16,[55][56][57][58]. That the RFR algorithm achieves better results than the curve-fitting and SVR algorithms is likely attributable to its underlying assumption, i.e., different independent predictors predict incorrectly in different areas and, thus, the overall prediction accuracy can be improved by combining the prediction results of the independent predictors [21,23].
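The ensemble assumption stated above can be illustrated numerically. The sketch below is not a random forest implementation: it simply averages several hypothetical predictors whose errors are drawn independently, which is an idealization (trees in a real forest are only partly independent). The target values, noise level, and number of predictors are invented for illustration.

```python
import math
import random

def rmse(pred, truth):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical LNC targets (%), not the paper's data.
truth = [2.0 + 0.05 * k for k in range(50)]

# Each hypothetical predictor equals the truth plus its own independent error.
n_predictors = 25
predictions = [[t + rng.gauss(0.0, 0.4) for t in truth] for _ in range(n_predictors)]

# Combine the predictors by averaging their outputs sample by sample.
ensemble = [sum(column) / n_predictors for column in zip(*predictions)]

individual_rmse = [rmse(p, truth) for p in predictions]
mean_individual_rmse = sum(individual_rmse) / n_predictors
ensemble_rmse = rmse(ensemble, truth)
print(f"mean individual RMSE={mean_individual_rmse:.3f}, ensemble RMSE={ensemble_rmse:.3f}")
```

Because the individual errors fall in different places, they partly cancel in the average, so the combined prediction has a lower RMSE than a typical single predictor.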

Conclusions
In this study, we designed two new spectral indices, FD-NDNI and FD-SRNI, which were constructed from first-order derivative values of reflectance spectra, to estimate wheat LNC. The comparative analysis shows that the accuracies of FD-NDNI and FD-SRNI were better than those of commonly used indices such as mNDVI705 and NDNI, which indicates that FD-NDNI and FD-SRNI are the optimal indices for wheat LNC estimation.
Choosing the appropriate spectral bands is critical for spectral index construction. The analysis indicates that the spectral indices constructed from the bands at 715 nm and 520 nm can accurately estimate wheat LNC, even though the effective ranges of these two sensitive bands are relatively narrow. When the bandwidth is greater than 30 nm, the capacity of the indices to estimate LNC decreases rapidly as the bandwidth increases. Therefore, in the design of new sensors for crop LNC estimation in agricultural remote sensing, the locations of the sensitive wavebands must be determined and an effective spectral bandwidth (<30 nm) must be selected.
Optimizing the modeling algorithm is an important step in improving the accuracy of estimating physical and chemical vegetation parameters. Compared with curve-fitting and SVR, RFR is the preferred algorithm for regression modeling of wheat LNC, as indicated by the higher R2 and lower RMSE values in the results retrieved with both the FD-NDNI and FD-SRNI indices.

Figure 2. The change of wheat LNC with time under different nitrogen application rates (a) and spectral reflectance of the wheat canopy at different nitrogen fertilization levels after 30 days of the stress experiment (b). The six levels of nitrogen stress application were as follows: N1, 0 kg/ha; N2, 75 kg/ha; N3, 150 kg/ha; N4, 225 kg/ha; N5, 300 kg/ha; and N6, 375 kg/ha.

Figure 3. Image map of the coefficient of determination (R2) between LNC and (a) FD-NDNI(i, j) and (b) FD-SRNI(i, j) using all combinations of two wavebands at i and j nm.

Figure 4. R2 value between LNC and FD-NDNI/FD-SRNI constructed with different bandwidths. FD-NDNI takes 715 and 516 nm as the center wavelengths and FD-SRNI takes 715 and 526 nm as the center wavelengths.

Figure 5. Hyperspectral image classification result (a) and wheat cover area extraction result (b).

Figure 6. Ground-measured nitrogen content versus the values estimated from the inversion model: (a) estimated by FD-NDNI and (b) estimated by FD-SRNI.

Figure 7. Ground-measured nitrogen content versus the estimated values from the inversion model: (a) estimated by FD-NDNI and (b) estimated by FD-SRNI.

Table 1. Chemical properties of the 0-20 cm topsoil at the study site (n = 70).

Table 2. Hyperspectral indices for nitrogen content estimation.

Table 3. Estimation models and evaluation indicators based on various spectral indices (n = 48).

Table 4. LNC estimation results using various algorithms and spectral indices.