An Ensemble Modeling Framework for Distinguishing Nitrogen, Phosphorous and Potassium Deficiencies in Winter Oilseed Rape (Brassica napus L.) Using Hyperspectral Data

Nitrogen (N), phosphorous (P), and potassium (K) are important macronutrients to crops. Deficiencies of these nutrients can change the pigment content in leaves and affect photosynthesis, resulting in the similar spectral characteristics at some wavelengths. Thus, one of the most important challenges in crop nutrient stress assessment through the canopy’s spectral reflectance is the ability to discriminate different nutrient stress conditions. This study proposes a three-layer ensemble-modeling framework to analyze N, P, and K nutrient stresses utilizing canopy hyperspectral data of crops. The framework selects spectral bands that are sensitive to N, P, and K nutrient deficiency levels, using ensembles of random forest classifiers, and then the reflectance of the selected bands is transformed into the more distinguishable probability features to diagnose the N, P, and K nutrient deficiency levels. For this study, this proposed framework was applied to winter oilseed rape (Brassica napus L.) during the overwintering stage, with 915 spectra samples collected from 14 field experiments. The analysis of nutrient deficiency levels resulting from the proposed framework was compared with that of single random forest, support vector machine, and artificial neural network classifiers, using the same reflectance features selected in the first layer of the framework. The overall accuracy of the nutrient deficiency analysis achieved by the proposed framework reached 80.76%, which was 16.55%, 18.43%, and 35.74% higher than the single random forest, support vector machine, and artificial neural network classifiers, respectively. The proposed framework demonstrated competitive advantages in differentiating the medium deficiency of N and K, and the severe deficiency of K from the normal conditions, boosting the accuracy from below 25% to above 50% because the probability features enhanced the differences among nutrient deficiency levels.


Introduction
Optimal management of mineral nutrients is critical for ensuring crop yield and food quality [1] and for minimizing the negative environmental impact of fertilization [2,3]. Traditional approaches to determine fertilization rely on soil composition analysis or the chemical content of leaves. These methods are time consuming, expensive, and destructive, making them difficult to apply on a large scale. more distinguishable based on the hyperspectral reflectance data and achieve an accurate identification of the element and the degree of crop mineral nutrient stress.
Winter oilseed rape (Brassica napus L.), which is one of the most important oilseed crops in the world, was chosen as a model crop species to evaluate the performance of the proposed framework. The Yangtze River Basin in China accounts for one-fifth of the rapeseed production and cultivation area in the world [32]. Winter oilseed rape in this area is usually grown in rotation with rice, cotton, or soybean, and as such, the soil nutrient supply is limited [33]. Among the mineral nutrients, N, P, and K are particularly important for winter oilseed rape in the Yangtze River Basin [34]. In this study, the proposed ensemble framework transforms the original reflectance features of winter oilseed rape under N, P, and K stresses into new probability features that are more distinguishable for determining the nutrient deficiency levels. To evaluate the performance of the proposed framework, we compared the accuracy of the results derived from the proposed framework with those derived from a single classifier using the reflectance features.

Experiment Design
From 2013-2019, 14 nutrient fertilization experiments were conducted at three study sites in the Hubei Province, located in Central China. The study sites have a humid sub-tropical monsoon climate. The annual mean air temperature measured at local weather stations during 2014-2018 was 17.27-17.86 • C and annual precipitation was 1161-1942 mm. Each experiment site was divided into multiple of plots, and randomly subjected to N, P, or K fertilization treatments with 3 to 6 replications for each treatment. Table 1 provides a summary of the location, cultivar, fertilization rates, and the number of samples used in the analysis of each experiment. Fertility status of the top soil (0-20 cm) in three study sites was provided in Table 2. N, P, or K fertilization treatments were applied omitting one nutrient at a time on the assumption that other nutrients were at sufficient and not excessive levels. Please refer to Li et al. [35], Liu et al. [36], and Lu et al. [37] for details of fertilizer applications. Commercial herbicides, insecticides, fungicides, and irrigation were applied in the fields following the local standard practices for winter oilseed rape production. No visual symptoms of diseases and water stress were observed during the growth season.

Data
Canopy reflectance measurements and ground sampling were conducted in mid-January, corresponding to the over-wintering stage of winter oilseed rape. We evaluated the proposed framework using data collected during the over-wintering stage, because topdressing fertilization was recommended at this time by experts [38]. Canopy reflectance data were collected in the 14 nutrient fertilization experiments using a PSR + 3500 field spectrometer (Spectral Evolution, Haverhill, MA, USA) and an Analytical Spectral Devices Field Spec Pro spectrometer (ASD, Boulder, CO, USA), and 3 to 5 canopy spectra were collected in each plot. Canopy radiance was measured by the fiber-optic sensor with a 25 • field of view. The sensor was placed 1 m above the canopy in a nadir position. Radiance of a BaSO 4 reference panel was measured as the standard before the canopy spectra measurements. Canopy reflectance was calculated automatically as a ratio of canopy radiance to standard radiance. In total, 915 spectra samples were used in this study, of which 427, 286, and 202 spectra samples were for the N, P, and K fertilization experiments, respectively.
Raw reflectance spectra were averaged to 10 nm bandwidths, in agreement with the bandwidth of the Airborne Visible/Infrared Imaging Spectrometer instrument [39]. Bands shorter than 400 nm, between 1800-2000 nm in length, and longer than 2300 nm were deleted because of their low signal-to-noise-ratio. Each spectrum was then labeled with the nutrient level (severe, medium, normal, and excessive) according to the nutrient fertilizer application rates (Table 3). This classification of the nutrient deficiency level was determined by the yield response curve ( Figure A1) and the local expert suggestions. Thus, for each nutrient, there were four sub datasets of different nutrient deficiency levels. Within each nutrient deficiency level dataset, the mean and standard deviation of the spectra were computed and then each spectrum was examined to identify anomalies. If there were more than 20 bands with reflectance higher or lower than twice the standard deviation, the spectrum was eliminated from the dataset. Seventy percent of the canopy spectra were randomly selected from each sub dataset as the training dataset, and the remaining samples were used as the independent validation dataset. The sub datasets were grouped with different combinations according to the needs of the band selection and nutrient deficiency discrimination.
In addition to canopy reflectance spectra data, leaf samples were collected and the leaf nitrogen concentration (LNC, %), leaf phosphorous concentration (LPC, %), and leaf potassium concentration (LKC, %) were measured for a few of the experiments ( Table 1). All fully expanded leaves were collected from three individual plants per treatment to determine the LNC, LPC, and LKC in the lab. The detailed methods of measuring LNC, LPC, and LKC are described in  and Lu et al. (2020). Measurements of the three sampled plants were averaged as the value of each treatment. Figure 1 shows the proposed framework workflow, which includes three layers. The first layer aims to reduce the dimension of hyperspectral data by selecting the spectral bands that are sensitive to the different degrees of N, P, and K deficiency. To achieve reliable results for each nutrient, the band selections were based on variable importance (VI) values derived from six models and used different subsets of data. In the second layer, the selected bands were used to train the N, P, and K deficiency models. Each nutrient deficiency model generated the probabilities of the three nutrient levels (severe, medium, and normal). Nine probabilities derived from the N, P, K deficiency models showing the probability of a spectrum being diagnosed as severe deficiency (denoted as N_sev%, P_sev%, and K_sev%), medium deficiency (denoted as N_med%, P_med%, and K_med%), or the normal condition (denoted as N_nor%, P_nor%, and K_nor%) of the N, P, and K nutrients, composed a new feature for the next layer. The third layer classified the spectra as a severe N (Nsev), medium N (Nmed), severe P (Psev), medium P (Pmed), severe K (Ksev), medium K (Kmed) deficiency, or a normal condition (Normal) with the new probability features.

Framework Overview
Remote Sens. 2020, 12, x FOR PEER REVIEW 6 of 20 The random forest (RF) classifier was used in this study for both band selection and nutrient deficiency analysis because RF contains a comparatively small number of model parameters required to be specified by the user, minimizes the risk of overfitting, and automatically computes a VI score that assesses the contribution of individual predictors to the model [40]. The model parameters that need to be specified to fit an RF classifier are the number of variables selected at each split (mtry) and the overall number of trees (ntree) that are grown. To determine the optimal values of these two parameters, ntree value was tested from 100 to 1000 at an interval of 100; mtry was tested from 3 to 10 in the first layer with the original hyperspectral data as inputs, and mtry was tested for 2, 3, and 4 in the second and the third layer. The ntree and mtry values were optimized by minimizing the out- Figure 1. Workflow of the proposed ensemble modeling framework. Due to the limited space, in the first layer, the complete workflow was only presented for the band selection of phosphorous (P) nutrition. For the nitrogen (N) and potassium (K) nutrition, there are the same band selection processes as for the P.
The random forest (RF) classifier was used in this study for both band selection and nutrient deficiency analysis because RF contains a comparatively small number of model parameters required to be specified by the user, minimizes the risk of overfitting, and automatically computes a VI score that assesses the contribution of individual predictors to the model [40]. The model parameters that need to be specified to fit an RF classifier are the number of variables selected at each split (mtry) and the overall number of trees (ntree) that are grown. To determine the optimal values of these two parameters, ntree value was tested from 100 to 1000 at an interval of 100; mtry was tested from 3 to 10 in the first layer with the original hyperspectral data as inputs, and mtry was tested for 2, 3, and 4 in the second and the third layer. The ntree and mtry values were optimized by minimizing the out-of-bag error of the predictions [41]. RF classifiers were built in Python 3.5, using the library of sklearn 0.19.

Selecting Effective Spectral Bands
To improve the reliability and robustness of spectral band selection, the results of six classifiers that were trained with different data sources were fused. The original hyperspectral data were split into six smaller data sources with different combinations of nutrient deficiency levels. Each data source underwent a preliminary classification using a trained RF classifier and within each RF classifier, the VI of the input features was calculated using the mean decrease in Gini. VI was then normalized (VI ) using the mean (µ) and standard deviation (σ) of each classification as follows: The score of each spectral band as the weighted sum of VI derived from six classifiers was then calculated as follows: where w i is the accuracy of each classification. After assessing the accuracy of 5, 10, 15, and 20 spectral bands as the model input, ten spectral bands that had the highest scores were selected for each nutrient (N, P, and K).

Composing New Features and Identifying Nutrient Deficiency
In the second layer of the framework, the selected spectral bands were utilized to train the RF classifier to generate the probabilities of the deficiency level for each nutrient. In total, three models were developed for N, P, and K that generated nine probabilities showing the probability of a spectrum being identified as severe deficiency (denoted as N_sev%, P_sev%, and K_sev%), medium deficiency (denoted as N_med%, P_med%, and K_med%), or normal condition (denoted as N_nor%, P_nor%, and K_nor%). These nine probabilities served as inputs for the secondary model. In the third layer, the secondary RF classifier was developed to identify the nutrient deficiency level (Nsev, Nmed, Psev, Pmed, Ksev, Kmed, or normal) with training samples.

Evaluating the Performance of the Framework
To evaluate the capabilities of the proposed framework in distinguishing the nutrient deficiency levels, the classification results of the single RF classifier, the support vector machine (SVM) classifier, and the artificial neural network (ANN) classifiers were used as the benchmark. The SVM was trained with a Gaussian radial basis function [42]. A feed forward back-propagation multi-layer algorithm was used in this ANN implementation, because it is widely used in remote sensing studies [43][44][45]. The two hidden layers were introduced to increase the network's ability to model complex tasks [46]. The input layer of ANN consisted of 30 neurons corresponding to the reflectance of 30 bands selected by the first layer of the framework. The output layer of ANN consisted of 7 neurons representing seven nutrient deficiency levels. The number of neurons in the hidden layers (m) was calculated by the equation in Shibata and Ikeda [47]: where N i is the number of input neurons and N o is the number of output neurons. Dozens of the ANN architecture of were then tested by adjusting m around the value calculated by the Equation (3). ANN with 4 neurons in the first layer and 7 neurons in the second layer was used in this study because it yielded the highest overall accuracy. For comparison, the training and validation datasets used to generate the benchmark were the same as those used in the proposed framework. The single RF, SVM, and ANN classifiers were trained separately with the reflectance of the spectral bands selected in Section 2.2, to identify the nutrient deficiency level directly. The classification accuracy was then evaluated with the same independent validation samples for both the proposed framework and the benchmark. Three indicators were calculated in a confusion matrix to evaluate the accuracy of the models, including the overall accuracy, producer's accuracy, and user's accuracy. The overall accuracy described the percentage of all the spectral samples that were classified correctly. The producer's accuracy offered the percentage of the spectral samples of a certain category that was classified as such. The user's accuracy offered the percentage of the samples classified as a certain category that was actually correct.

Leaf Nutrient Concentration and Canopy Spectra at Different Nutrient Fertilizer Levels
LNC, LPC, and LKC demonstrated obvious variations under different nutrient deficiency levels ( Figure 2). For treatments omitting N or K fertilization, the nutrition deficiency was associated with the pronounced decrease in LNC or LKC. For treatments omitting P fertilization, the LPC was the lowest at the Psev, but for Pmed, the decrease in LPC was less obvious compared with the LNC at Nmed or the LKC at Kmed. Overall, the significant decrease in the omitted-nutrient content indicated that reflectance data grouped by fertilization levels represented spectral responses to different nutrient deficiency levels.
Remote Sens. 2020, 12, x FOR PEER REVIEW  8 of 20 overall accuracy, producer's accuracy, and user's accuracy. The overall accuracy described the percentage of all the spectral samples that were classified correctly. The producer's accuracy offered the percentage of the spectral samples of a certain category that was classified as such. The user's accuracy offered the percentage of the samples classified as a certain category that was actually correct.

Leaf Nutrient Concentration and Canopy Spectra at Different Nutrient Fertilizer Levels
LNC, LPC, and LKC demonstrated obvious variations under different nutrient deficiency levels ( Figure 2). For treatments omitting N or K fertilization, the nutrition deficiency was associated with the pronounced decrease in LNC or LKC. For treatments omitting P fertilization, the LPC was the lowest at the Psev, but for Pmed, the decrease in LPC was less obvious compared with the LNC at Nmed or the LKC at Kmed. Overall, the significant decrease in the omitted-nutrient content indicated that reflectance data grouped by fertilization levels represented spectral responses to different nutrient deficiency levels.  Figure 3 illustrates the mean canopy spectra of winter oilseed rape in response to different nutrient fertilizer levels. Reflectance differences can be found in the red region (600-700 nm), NIR region (800-1300 nm), and shortwave infrared (SWIR) region (1600-1800 nm). Generally, Kmed, Ksev, and the normal condition were brighter in the NIR and SWIR regions. Nsev and Psev had a lower reflectance in the NIR region and a higher reflectance in the red region.  Figure 3 illustrates the mean canopy spectra of winter oilseed rape in response to different nutrient fertilizer levels. Reflectance differences can be found in the red region (600-700 nm), NIR region (800-1300 nm), and shortwave infrared (SWIR) region (1600-1800 nm). Generally, Kmed, Ksev, and the normal condition were brighter in the NIR and SWIR regions. Nsev and Psev had a lower reflectance in the NIR region and a higher reflectance in the red region.
Remote Sens. 2020, 12, x FOR PEER REVIEW 9 of 20 Figure 3. Canopy spectral reflectance in response to different nutrient deficiency levels. Nsev: severe N deficiency; Nmed: medium N deficiency; Psev: severe P deficiency; Pmed: medium P deficiency; Ksev: severe K deficiency; Kmed: medium K deficiency; Normal: normal condition without any stresses. Figure 4 illustrates the 10 spectral bands that were selected for each nutrient using the proposed framework. For N, the ensemble modeling consistently selected bands within the range of 640-690 nm, with the highest score at 640 nm; however, the selection also included three bands (2000, 2010, and 2070 nm) in the SWIR region. The band selection for P was spread over the entire spectral region and the bands with the highest scores were in the NIR (e.g., 810, 910, and 1,120 nm) and SWIR regions (2000 and 2040 nm). For K, the selected bands with the highest scores were all in the SWIR region, ranging from 2000 to 2300 nm. The selected bands also included two bands (650 and 680 nm) in the red region and one band (530 nm) in the green region. The selected bands were used to train the N, Figure 3. Canopy spectral reflectance in response to different nutrient deficiency levels. Nsev: severe N deficiency; Nmed: medium N deficiency; Psev: severe P deficiency; Pmed: medium P deficiency; Ksev: severe K deficiency; Kmed: medium K deficiency; Normal: normal condition without any stresses. Figure 4 illustrates the 10 spectral bands that were selected for each nutrient using the proposed framework. For N, the ensemble modeling consistently selected bands within the range of 640-690 nm, with the highest score at 640 nm; however, the selection also included three bands (2000, 2010, and 2070 nm) in the SWIR region. The band selection for P was spread over the entire spectral region and the bands with the highest scores were in the NIR (e.g., 810, 910, and 1120 nm) and SWIR regions (2000 and 2040 nm). For K, the selected bands with the highest scores were all in the SWIR region, ranging from 2000 to 2300 nm. The selected bands also included two bands (650 and 680 nm) in the red region and one band (530 nm) in the green region. The selected bands were used to train the N, P, and K deficiency models in the second layer.    Mean canopy spectra at different nutrient deficiency levels are plotted for reference. Nsev: severe N deficiency; Nmed: medium N deficiency; Nexc: excessive N fertilization; Psev: severe P deficiency; Pmed: medium P deficiency; Pexc: excessive P fertilization; Ksev: severe K deficiency; Kmed: medium K deficiency; Kexc: excessive K fertilization; Normal: normal condition without any stresses.

New Probability Features
Each N, P, and K deficiency model generated three probabilities, showing the likelihood of a spectrum being diagnosed as a severely or medium deficient or normal condition. These nine probabilities composed a new feature to identify the nutrient deficiency level, and they presented more obvious diversities than the spectral features ( Figure 5). The mean probabilities shown in Figure 5 illustrate that Nsev, Psev, and Pmed have much higher probabilities at N_sev, P_sev, and P_med, respectively, and the normal condition has almost equally high probabilities in N_nor, P_nor, and K_nor. It is interesting to note that the probabilities were more complicated for K deficiencies. For example, Ksev had high probabilities at both P_nor and K_sev, and Kmed had high probabilities at N_nor, P_nor, and K_med.

New Probability Features
Each N, P, and K deficiency model generated three probabilities, showing the likelihood of a spectrum being diagnosed as a severely or medium deficient or normal condition. These nine probabilities composed a new feature to identify the nutrient deficiency level, and they presented more obvious diversities than the spectral features ( Figure 5). The mean probabilities shown in Figure  5 illustrate that Nsev, Psev, and Pmed have much higher probabilities at N_sev, P_sev, and P_med, respectively, and the normal condition has almost equally high probabilities in N_nor, P_nor, and K_nor. It is interesting to note that the probabilities were more complicated for K deficiencies. For example, Ksev had high probabilities at both P_nor and K_sev, and Kmed had high probabilities at N_nor, P_nor, and K_med. Figure 5. Nine probabilities generated from three nutrient deficiency models. The probabilities show the likelihood of a spectrum being diagnosed with severe deficiency, medium deficiency, or the normal condition for N (N_sev%, N_med%, N_nor%), P (P_sev%, P_med%, P_nor%), or K (K_sev%, K_med%, K_nor%). Nsev: severe N deficiency; Nmed: medium N deficiency; Nexc: excessive N fertilization; Psev: severe P deficiency; Pmed: medium P deficiency; Pexc: excessive P fertilization; Ksev: severe K deficiency; Kmed: medium K deficiency; Kexc: excessive K fertilization; Normal: normal condition without any stresses. Figure 6 compares the accuracy of the nutrient deficiencies diagnosed by the proposed framework and the regular RF, SVM, and ANN classifiers. The overall accuracy of the proposed framework reached 80.76%, which was 16.55% higher than that of the regular RF classifier, 18.43% higher than that of the SVM classifier, and 35.74% higher than that of the ANN classifier (Figure 6a). The comparison of the confusion matrixes derived for both methods demonstrated the competitive advantage of the proposed framework in distinguishing the Nmed, Ksev, and Kmed from the normal condition. The framework boosted the producer's accuracy of Kmed yielded by the RF, SVM, and ANN classifiers from 0.00% to 70.00%, boosted the producer's accuracy of Nmed from 20.73% to 70.07%, and increased the user's accuracy of Kmed from 0.00% to 87.45% and Ksev to 71.40%. In addition, the proposed framework increased the user's accuracy of Psev, Nmed, and Nsev. Figure 5. Nine probabilities generated from three nutrient deficiency models. The probabilities show the likelihood of a spectrum being diagnosed with severe deficiency, medium deficiency, or the normal condition for N (N_sev%, N_med%, N_nor%), P (P_sev%, P_med%, P_nor%), or K (K_sev%, K_med%, K_nor%). Nsev: severe N deficiency; Nmed: medium N deficiency; Nexc: excessive N fertilization; Psev: severe P deficiency; Pmed: medium P deficiency; Pexc: excessive P fertilization; Ksev: severe K deficiency; Kmed: medium K deficiency; Kexc: excessive K fertilization; Normal: normal condition without any stresses. Figure 6 compares the accuracy of the nutrient deficiencies diagnosed by the proposed framework and the regular RF, SVM, and ANN classifiers. The overall accuracy of the proposed framework reached 80.76%, which was 16.55% higher than that of the regular RF classifier, 18.43% higher than that of the SVM classifier, and 35.74% higher than that of the ANN classifier (Figure 6a). The comparison of the confusion matrixes derived for both methods demonstrated the competitive advantage of the proposed framework in distinguishing the Nmed, Ksev, and Kmed from the normal condition. The framework boosted the producer's accuracy of Kmed yielded by the RF, SVM, and ANN classifiers from 0.00% to 70.00%, boosted the producer's accuracy of Nmed from 20.73% to 70.07%, and increased the user's accuracy of Kmed from 0.00% to 87.45% and Ksev to 71.40%. In addition, the proposed framework increased the user's accuracy of Psev, Nmed, and Nsev. Figure 6. (a) Overall accuracy, (b) producer's accuracy, and (c) user's accuracy of the nutrient analysis produced by the proposed framework and the random forest (RF), support vector machine (SVM), and artificial neural network (ANN) classifiers using the same validation dataset. Nsev: severe N deficiency; Nmed: medium N deficiency; Nexc: excessive N fertilization; Psev: severe P deficiency; Pmed: medium P deficiency; Pexc: excessive P fertilization; Ksev: severe K deficiency; Kmed: medium K deficiency; Kexc: excessive K fertilization; Normal: normal condition without any stresses.

Agreement of Band Selection With Known Spectral Features
The objective of band selection was to minimize the information redundancy caused by continuous spectral bands while preserving the significant spectral information of the objects [48]. Spectral bands sensitive to different nutrient deficiency levels were selected with the ensembles of the RF outputs with multiple combinations of sub datasets as inputs. The selected bands were Figure 6. (a) Overall accuracy, (b) producer's accuracy, and (c) user's accuracy of the nutrient analysis produced by the proposed framework and the random forest (RF), support vector machine (SVM), and artificial neural network (ANN) classifiers using the same validation dataset. Nsev: severe N deficiency; Nmed: medium N deficiency; Nexc: excessive N fertilization; Psev: severe P deficiency; Pmed: medium P deficiency; Pexc: excessive P fertilization; Ksev: severe K deficiency; Kmed: medium K deficiency; Kexc: excessive K fertilization; Normal: normal condition without any stresses.

Agreement of Band Selection With Known Spectral Features
The objective of band selection was to minimize the information redundancy caused by continuous spectral bands while preserving the significant spectral information of the objects [48]. Spectral bands sensitive to different nutrient deficiency levels were selected with the ensembles of the RF outputs with multiple combinations of sub datasets as inputs. The selected bands were successfully applied in building the ensemble model to distinguish between different nutrient deficiency levels.

Nitrogen
Several studies have demonstrated a close relationship between N and Chl concentrations in green leaves [49]. In general, N deficiency leads to less Chl in the vegetation tissues, resulting in leaf yellowing [19]. In this study, the selected bands were primarily within the range of 640-690 nm, adjacent to the Chl absorption peak at 675 nm [50], which concurs with previous studies on spectroscopic estimation of plant N content. Mutanga et al. [51] found that increased N fertilization resulted in increased Chl absorption, which deepened and widened the Chl absorption feature until eventually reaching saturation. On the canopy level, nitrate availability influences both leaf Chl concentration and biomass [52]. In addition to the bands in the red region, 2000, 2010, and 2070 nm were also selected as the important bands to differentiate different N stress levels. These bands correspond to leaf dry matter content, which is often correlated with the N concentrations [53].

Phosphorous
P is an essential component of genetic material, and thus cell division and expansion are adversely affected by P deprivation [22]. P-deficient plants typically display weak or stunted growth. Previous studies have demonstrated a close relationship between P and reflectance in the NIR region because P plays a critical role as an energy supplier in energy-consuming processes such as photosynthesis [54]. Photosynthesis is directly related to the leaf area index and involves structural organic compounds, which contribute to the spectral features of green vegetation in the region from 700-1200 nm. Cheema et al. [55] found that the increased application of N and P fertilizers to canola led to an increase in the leaf area index and total dry mass relative to the control, and lower rates of fertilization. Osborne et al. [25] confirmed that the increase in the number of cells per unit of leaf area in P-deficient corn plants was translated into a significant spectral response in the NIR region of the spectrum. Specifically, they found that linear models that included 730 and 930 nm bands were able to predict P concentration during crop development at one specific point of time. In this study, the selection of multiple bands in the NIR region (760, 810, 910, 1090, and 1120 nm), as the sensitive bands to differentiate P deficiency levels corroborated the findings of previous studies.

Potassium
K deficiency symptoms were not evident. As shown in Figure 4c, deficient spectra almost overlap normal spectra, which is in agreement with the findings of Pacumbaba and Beyl's [15] study on lettuce and those of Rustioni et al.'s on grapevine leaves [16]. The similarity in reflectance spectra between the normal and the K-deficient condition made it difficult to diagnose the K deficiency levels using spectral reflectance directly.
K is responsible for the activation and/or stimulation of a number of enzymes, which influence sugar and starch content in plants [56]. In cotton, K fertilization increases leaf protein content and decreases leaf starch [57]. K also has an outstanding role in plant water relations because K regulates stomata and affects osmotic pressure [54]. Fanaei et al. [24] found that the application of K fertilizer could ameliorate the negative effects of water stress on relative water content and stomatal conductance in two cultivars of canola. In this study, 7 out of 10 sensitive bands selected for distinguishing K deficiency levels were in the SWIR region and corresponded to the absorption features of water, starch, and proteins.
Moreover, K deficiency produces an intensification of the reddish color as a consequence of increased anthocyanin (Anth) content in the distal part of the plant [16], which explains the band selection results in the green and red regions. Gitelson et al. [58] found that when Anth was present in leaves in minute amounts, it led to an increased absorption in the green-orange range between 520 and 650 nm. Gamon and Surfus [59] suggested using the red to green reflectance ratio for the estimation of Anth content.
In summary, the selected bands were in concurrence with the known plant responses to different nutrient stresses and the corresponding absorption features in previous studies. Quantitative analysis of the relationship between the selected bands and biochemical and physiological data was not performed as part of this study, owing to the sampling data being limited. However, a comprehensive understanding of the relationship between the crop biochemical and physiological responses under different nutrient stresses and the corresponding spectral features is believed to be critical for remote analysis of crop nutrient deficiencies. Future studies are required to bridge the gap between the sensitive spectral features and the underlying biochemical and physiological processes of crops coping with different nutrient stresses.

Diagnosis of Nutrient Deficiency Using Ensemble Modeling
Diversity of the classifier outputs is a vital requirement for the success of the ensemble, demonstrated by theoretical and empirical studies [28,60]. If the classifiers are similar, it would not improve the accuracy by combining them. The core of the proposed framework was that the probabilities used for the final decision were generated by three distinct models. Each of these three models was trained to identify deficiency levels of single nutrient. Thus, each model had thoroughly 'learned' the spectral responses to deficiency levels of single nutrient, reducing the disturbance by the spectral responses to the other nutrients. Then samples of N, P, and K deficiencies were mixed, and reflectance data of each sample were fed to three distinct models to generate nine probabilities for the final decision. Since these nine probabilities were outputs of models trained for different nutrients, they demonstrated pronounced differences under N, P, and K deficiencies and the normal condition. The comparison between Figures 3 and 5 showed that the correlation among input features was substantially reduced and the diversity was enhanced. Particularly, probabilities demonstrated competitive advantages in distinguishing the Nmed, Ksev, and Kmed from the normal conditions. The results of this study corroborated the statements from previous studies that combining multiple classifiers can boost weak learners into a strong learning algorithm to address difficult conditions [61,62]. Classifier ensemble fuses multiple classifiers with distinct advantages and often yields higher accuracies than single models by enlarging the diversity among classes. In this study, a specific classifier was trained to identify the deficiency levels of a certain nutrient, such that the distinguishing spectral responses to a certain nutrient was emphasized with less interferences from other nutrient. By fusing the outputs of three distinctly trained models, the diversity among deficiencies of three nutrients was effectively enhanced and help generate more accurate classification results. In addition to the RF model, the classifiers within the framework could be replaced with other models (e.g., SVM) based on the needs and dataset.
In this study, spectra were utilized as the input of the proposed framework instead of transformed spectra (e.g., first-or higher-order derivative spectra). Although advanced preprocessing techniques are promising tools for enhancing the sensitivity of spectral signals to the biochemical traits in plants [63][64][65], these techniques require a very high spectral resolution that is limited to only a few hyperspectral data sources. The results from this study indicate the potential of the proposed framework for diagnosing crop nutrient deficiency levels using hyperspectral and multispectral data; however, additional research is needed to test the proposed framework on a large scale using satellite data, such as Sentinel-2.

Conclusions
This study proposed a novel ensemble-modeling framework to transform the crop canopy reflectance data of the selected bands into more distinguishable probability features and identify the nutrient deficiency levels using the probabilities. The framework was applied to distinguishing the N, P, and K deficiency levels in winter oilseed rape based on 915 spectra samples collected in the field. The accuracy of the ensemble-modeling framework was compared with the results of the regular RF, SVM, and ANN. The overall accuracy of the nutrient deficiency analysis of the proposed framework reached 80.76%, which was 16.55%, 18.43, and 35.74% higher than the regular RF, SVM, and ANN classifiers, respectively. In particular, the proposed framework boosted the producer's accuracy of Kmed resulting from the RF, SVM, and ANN classifiers from 0.00% to 70.00%, and boosted the producer's accuracy of Nmed from 20.73% to 70.07%. In addition, the selected bands that were sensitive to different nutrient deficiencies fit the well-known physiological and biochemical roles of each nutrient. The results from this study provide promising evidence for the discernment of different nutrient levels in crops using hyperspectral remote sensing data and multiple model ensembles. The proposed framework had a potential of being applied to improving the practical agricultural management in different regions.   Figure A1. The relationship between the rapeseed yield and the N (a), P (b), and K (c) fertilizer application rates. The relationships were fit with the quadratic regression model. The red dash line marks the fertilizer rates corresponding to the 90% relative yield. Figure A1. The relationship between the rapeseed yield and the N (a), P (b), and K (c) fertilizer application rates. The relationships were fit with the quadratic regression model. The red dash line marks the fertilizer rates corresponding to the 90% relative yield.