A Machine Learning Framework to Predict Nutrient Content in Valencia-Orange Leaf Hyperspectral Measurements

: This paper presents a framework based on machine learning algorithms to predict nutrient content in leaf hyperspectral measurements. This is the ﬁrst approach to evaluate macro- and micronutrient content with both machine learning and reﬂectance / ﬁrst-derivative data. For this, citrus-leaves collected at a Valencia-orange orchard were used. Their spectral data was measured with a Fieldspec ASD FieldSpec ® HandHeld 2 spectroradiometer and the surface reﬂectance and ﬁrst-derivative spectra from the spectral range of 380 to 1020 nm (640 spectral bands) was evaluated. A total of 320 spectral signatures were collected, and the leaf-nutrient content (N, P, K, Mg, S, Cu, Fe, Mn, and Zn) was associated with them. For this, 204,800 (320 × 640) combinations were used. The following machine learning algorithms were used in this framework: k-Nearest Neighbor (kNN), Lasso Regression, Ridge Regression, Support Vector Machine (SVM), Artiﬁcial Neural Network (ANN), Decision Tree (DT), and Random Forest (RF). The training methods were assessed based on Cross-Validation and Leave-One-Out. The Relief-F metric of the algorithms’ prediction was used to determine the most contributive wavelength or spectral region associated with each nutrient. This approach was able to return, with high predictions (R 2 ), nutrients like N (0.912), Mg (0.832), Cu (0.861), Mn (0.898), and Zn (0.855), and, to a lesser extent, P (0.771), K (0.763), and S (0.727). These accuracies were obtained with di ﬀ erent algorithms, but RF was the most suitable to model most of them. The results indicate that, for the Valencia-orange leaves, surface reﬂectance data is more suitable to predict macronutrients, while ﬁrst-derivative spectra is better linked to micronutrients. A ﬁnal contribution of this study is the identiﬁcation of the wavelengths responsible for contributing to these predictions. evaluated leaf samples from multiple orange tress scattered around di ﬀ erent planting ﬁelds in an experimental portion of the orchard. In each ﬁeld, the number of trees was selected according to the size of the planting ﬁeld and trees planted per area. The selection was both to measure the leaf hyperspectral response and to collect them. A total of 320 samples collected in the ﬁeld, with both spectra curves and later-known nutrient content, were gathered during this survey. The sampling method followed standard recommended agronomic procedures, guided by a ﬁeld specialist. To represent the proper conditions of a citrus tree, only leaves at a medium canopy and those healthy with no signs of diseases or damages were evaluated. A lift platform was to elevate the person with the equipment. Since the chemical analysis of the leaf tissue recommends 3rd or 4th leaf of a fruit branch to be sampled, the spectroradiometer equipment was directed as close as possible to the leaves that shared this description. of determination (R 2 mean absolute error (MAE), and root-mean-squared error (RMSE). To ascertain the relationship between the and the predicted data, the overall ﬁnest models were evaluated in a regression plot. systems. Author Contributions: Conceptualization, L.P.O., A.P.M.R., J.M.J., and J.E.C.; methodology, L.P.O., A.P.M.R., É and J.E.C. formal analysis, L.P.O. and M.M.F.P.; resources, J.E.C., N.N.I., F.F.d.A. and J.M.J.; data curation, L.P.O., M.M.F.P., A.P.M.R., É J.E.C., and J.M.J.; writing—original draft preparation, L.P.O.; writing—review and editing, A.P.M.R, É N.N.I., F.F.d.A., N.E., F.I., V.L., L.A.d.C.J., J.L., L.M., W.N.G. and J.M.J.; supervision, A.P.M.R., J.E.C., N.N.I. and J.M.J; project administration, A.P.M.R., J.E.C and J.M.J.; funding acquisition, L.P.O., and J.E.C. All authors have read and agreed to the published version of the manuscript.

Laser-Induced Breakdown Spectroscopy (LIBS) [38]. Similar research, focusing only on near-infrared spectroscopy, also returned high predictions for both classes of nutrients [39].
Another way to infer chemical components from hyperspectral measurements is by applying a derivative analysis. The derivation of the reflectance data allows highlighting absorption features of components that, in a traditional spectral curve (i.e., reflectance curve), may not be measured with the same accuracy or even be detected [40][41][42]. Studies that apply a derivation of the reflectance curves in plants have found good correlations with N [40,41] and cadmium (Cd) concentrations [42]. Since the gains of derivative analysis are known in the literature, there are also methods for data analysis in the remote sensing scenario that may benefit from it. The advantages proportionated by derivatives may assist in the evaluation of leaf nutritional content when combined with more robust techniques.
The aforementioned studies found high relationships with hyperspectral data by employing various statistical methods in their analysis. However, methods like Partial Least-Squares Regression (PLSR), Principal Component Analysis (PCA), Stepwise-Multiple Linear Regression (SMLR), among others, returned different accuracies even for the same cultures [15,24,32,[37][38][39][40]42]. Some of these methods are also reductive, and the prediction may decrease if an increase occurs in the model complexity [32]. Since hyperspectral measurements produce high and complex amounts of data, one type of approach that could ideally deal with this is machine learning.
Machine learning algorithms are a robust and intelligent technique that can model different types of data [43,44]. These algorithms have the advantage of being non-parametric and non-linear while being able to analyze noised and imperfect data [45][46][47]. They are also capable to perform numerous combinations and calculations in a matter of seconds, achieving relative success in remote sensing applications regarding plant analysis [48,49]. Concerning hyperspectral measurements, among the applications evaluated, these algorithms were able to return state-of-the-art performances for many situations [5,16,23,[50][51][52]. Even though, to date, no study evaluated the performance of machine learning algorithms in inferring both plant macro-and micro-nutritional content with only leaf hyperspectral measurements. Since these algorithms have returned good accuracies in different hyperspectral analyses, they could be appropriate to deal with the complexity imposed by this type of dataset in the described situation.
As previously stated, the first-derivative of the reflectance data has already been proved to be effective in associating with different chemical components. From this information, it could be assumed that both the reflectance data and its first-derivative could be of assistance in predicting different nutrients of the leaf tissue. Since the derivation of a reflectance curve can highlight hard-to-detect components at the first level, it is possible that, by integrating these curves with machine learning algorithms, one can create important information regarding proximal sensing and plant nutritional analysis. In this spirit, a framework adopting machine learning algorithms are proposed to predict macro-and micronutrient content in the leaf-tissue directly from its hyperspectral response. This is the first approach to evaluate different nutrient content combining machine learning methods and reflectance/first-derivative data.
In this study, citrus leaves-more specifically, from Valencia-orange trees-were selected to compose the experimental dataset. It is well known that a sufficient supply of both macro-and micronutrients is critical to the management and sustainability of these plants, and the balance of available nutrients is a key component to profitability [53]. Citrus plants are economically important to the agricultural sector of many countries and may benefit significantly from a rapid and indirect nutritional assessment, such as the one proposed here. In this manner, the aims of this work are to a) show a method to indicate the most suitable spectra (reflectance/first-derivative), in order to model the nutrient content according to the algorithms' performance; and b) determine the important wavelengths or spectral regions associated with each nutrient.

Materials and Methods
The framework proposed in this paper was divided into four main phases ( Figure 1). In the first phase, the hyperspectral measurements of the leaf samples in a Valencia-orange orchard were performed. These measurements were conducted with a field spectroradiometer. In the second phase, the spectral measurements were corrected, and the data were pre-processed. These corrections aimed to convert the radiance signal to reflectance, as well as remove the noise and calculate their first-derivative. The third phase involved the data analysis by machine learning algorithms. In this phase, a fine-tuning to determine the most appropriate parameters to model the data was performed. The fourth and final phase consisted of the organization of the prediction values into a hyperspectral map, where it was identified as the most appropriate algorithm and wavelength (i.e., spectral window) to predict each nutrient.
Remote Sens. 2020, 12, x FOR PEER REVIEW 4 of 23 aimed to convert the radiance signal to reflectance, as well as remove the noise and calculate their first-derivative. The third phase involved the data analysis by machine learning algorithms. In this phase, a fine-tuning to determine the most appropriate parameters to model the data was performed. The fourth and final phase consisted of the organization of the prediction values into a hyperspectral map, where it was identified as the most appropriate algorithm and wavelength (i.e., spectral window) to predict each nutrient.

Study Area and Data Acquisition
As a study area, an open field of citrus trees, located on private property in the municipality of Ubirajara, São Paulo state, Brazil, was selected. The species analyzed were all of Valencia-orange (Citrus sinensis "Valencia"), planted on a Citrumelo-Swingle rootstock. During the evaluation, the trees were in their vegetative phase, with an adult size, measuring nearly 3 m in height (groundrelated), with crown areas around 5.5 m 2 . During the survey, the trees were at their maturing stage, which is five years from their initial planting. The area contains 752 trees per hectare, planted at approximately a 7 × 1.9 m spacing. Each plantation field was 250 m × 250 m in size, with some fields compensating for others accordingly to its location. The plantation fields were selected randomly inside this property and configured the different conditions of the treatments. Before the analysis, the soil was fertilized with 250 kg/ha of N in the form of urea, 125 kg/ha of phosphorus excreted, expressed as P2O5, and 167 kg/ha of potassium oxide (K2O). The area is predominantly composed of red-yellow podzolic soil, situated in a Cwa Köppen [54] subtropical climate type unit. This paper evaluated leaf samples from multiple orange tress scattered around different

Study Area and Data Acquisition
As a study area, an open field of citrus trees, located on private property in the municipality of Ubirajara, São Paulo state, Brazil, was selected. The species analyzed were all of Valencia-orange (Citrus sinensis "Valencia"), planted on a Citrumelo-Swingle rootstock. During the evaluation, the trees were in their vegetative phase, with an adult size, measuring nearly 3 m in height (ground-related), with crown areas around 5.5 m 2 . During the survey, the trees were at their maturing stage, which is five years from their initial planting. The area contains 752 trees per hectare, planted at approximately a 7 × 1.9 m spacing. Each plantation field was 250 m × 250 m in size, with some fields compensating for others accordingly to its location. The plantation fields were selected randomly inside this property Remote Sens. 2020, 12, 906 5 of 21 and configured the different conditions of the treatments. Before the analysis, the soil was fertilized with 250 kg/ha of N in the form of urea, 125 kg/ha of phosphorus excreted, expressed as P 2 O 5 , and 167 kg/ha of potassium oxide (K 2 O). The area is predominantly composed of red-yellow podzolic soil, situated in a Cwa Köppen [54] subtropical climate type unit. This paper evaluated leaf samples from multiple orange tress scattered around different planting fields in an experimental portion of the orchard. In each field, the number of trees was selected according to the size of the planting field and trees planted per area. The selection was both to measure the leaf hyperspectral response and to collect them. A total of 320 samples collected in the field, with both spectra curves and later-known nutrient content, were gathered during this survey. The sampling method followed standard recommended agronomic procedures, guided by a field specialist. To represent the proper conditions of a citrus tree, only leaves at a medium canopy height and those visually healthy with no signs of diseases or damages were evaluated. A lift platform was used to elevate the person with the equipment. Since the chemical analysis of the leaf tissue recommends the 3rd or 4th leaf of a fruit branch to be sampled, the spectroradiometer equipment was directed as close as possible to the leaves that shared this description.
After measuring the spectral radiance, the leaves were extracted from their respective branches, separated, and identified them into plastic bags to be submitted to the laboratory. The leaf samples consisted of the same leaves that had their spectral radiance measured. They were conditioned at an appropriate temperature and transported accordingly. In the laboratory, the leaves were washed with a neutral detergent to remove any impurities. Later, they were dried in an oven, for 48 h, at 60-65 • C, and then crushed. From the crushed material, 100 mg was used for the N analysis. For that, the Kjeldahl titration method [55], divided into 3 phases, was followed: (1) digestion; (2) distillation in a nitrogen distiller; and (3) titration with sulfuric acid (H 2 SO 4 ). The remaining material was separated and used for the analysis of the other macronutrient (P, K, Ca, Mg, and S) and the micronutrients (copper (Cu), Fe, Mn, and zinc (Zn)), following standard laboratory procedures of chemical analysis of the leaf-tissue [56].

Hyperspectral Measurement Processing
The spectral radiance of the Valencia-orange leaves was measured with a Fieldspec ASD FieldSpec ® HandHeld 2 spectroradiometer. To record each target, the equipment was positioned at a 45 • angle concerning the tree canopy. For that, a lift platform was used to ensure the correct height. This equipment operates at a spectral range of 325 nm to 1075 nm. In this study, a 10 • aperture lens was adopted, and 10 readings/measurements were conducted in each leaf to produce one mean spectral signature. This procedure was important to reduce noise and variance for the same target. Before each spectral measurement, a Lambertian (Spectralon ® plate) surface plate was registered. This Lambertian plate was used to calibrate the equipment and convert the digital number to a physical signal. As mentioned, the leaf-spectral response in-field was recorded into 320 measurements for this experiment.
The measured spectral curves consist of the radiance value of the target (i.e., leaf samples) spread along the electromagnetic spectrum. To produce the reflectance value (i.e., reflectance factor), the Hemispherical Conical Reflectance Factor (HCRF) was calculated as shown in Equation (1) [57]: where dL is the radiance; ω is the solid angle; θ and Φ are the zenith and azimuth angles, respectively; i is the incident flux; and r is the reflected energy flux. The K value is the calibration coefficient (i.e., correction factor specified for the equipment). The target corresponds to the radiance of the leaf and the reference is the radiance of the Lambertian surface plate. The HCRF represents the spectral signature of the recorded target. After obtaining the reflectance factor of each leaf, a low signal-to-noise removal was performed by excluding wavelengths under 380 nm and above 1020 nm. After this, the first-derivative of all the HCRFs (n = 320) was calculated. The first-derivative calculation is a traditional method for modeling spectral data, and many approaches have discussed this issue. For this study, a linear least mean-squared smoothing filter [58] was firstly performed to reduce the random noise that may vary with the wavelengths and affect the derivative function. In most cases, noise can be assumed to be stationary with constant variance. It then can estimate a noise-free spectrum s(λ) in terms of the current value of the observed data. By knowing the correct signal of the spectrum giving a specific wavelength s(λ), it is possible to perform a final approximation to estimate derivatives by suitable difference schemes according to a finite band resolution: ∆λ. Thus, the first-derivative was calculated according to [58]: where ∆λ is calculated as |λj − λi|, assuming that the interval between the bands is constant. Additional tests involving further derivates, such as the second, third, and fourth, were also made in the experimental phase of this study. However, there were no indications of an improvement over the first-derivate for the used dataset during the machine learning analysis. For this reason, the proposed framework was limited only to the first-derivative, but future research using different leaf data to process additional derivatives is encouraged. From the total leaf measurements (n = 320) used here, 10% (n = 32) was randomly separated and designed to compose the testing dataset ( Figure 2). Wavelengths ranging from 380 to 1020 nm were used in the software as columns, while the leaf measurements (320) were used as rows. The 32 measurements were configured as an independent dataset, which belonged to the Valencia-orange trees located at different plantation fields, never before seen by the algorithms. The other 288 measurements configured the dependent dataset and belonged to trees with conditions or characteristics distinct from one another, observed during the field campaign. To indicate that, a descriptive statistical analysis was conducted with the nutrients' concentration from the chemical analysis of the leaf tissue, and the following parameters were calculated: minimum, maximum, mean, standard deviation, median, and coefficient of variation. They were important to demonstrate the discrepancy of the dependent (calibration/training) data used, and how representative it could be of the nutritional conditions of Valencia-orange leaf tissue in the analyzed period.

Machine Learning Analysis and Hyperspectral Mapping
In a computational environment, the nutrients were individually selected as the target variables. As input parameters, the reflectance and the first-derivative were used, and the performance of the algorithms in predicting these nutrients was evaluated. As stated, the curves were separated into three sets. The training dataset was used to set-up the hyperparametrization of the chosen algorithms. For that, the Random Search approach [59] was used. The same conjunction of training/validation data was adopted for all algorithms. This process was repeated with a fine-tuning until the reduction in the mean absolute error (MAE) did not result in any more practical gains, as the modification in the parameters impacted the processing time. Once the hyperparameters of each algorithm were defined, the testing dataset was used to verify its real performance.
To configure and run the algorithms, the open-source computer program RapidMiner 9.5 was used, which is based on a particular Python Library [60], while still permitting the development and implementation of different codes. The workstation for this task was equipped with an Intel(R) Core (TM) i7-8550U CPU 4.00 GHz, a Nvidia GeForce MX-150 4Gb GDDR5 64-bits 6008 MHz GPU, and 8GB RAM DDR4 2400MHz. The algorithms for the proposed framework were as follows: k-Nearest Neighbor (kNN), Lasso Regression, Ridge Regression, Support Vector Machine (SVM), Artificial Neural Network (ANN), Decision Tree (DT), and Random Forest (RF). The prediction metrics to evaluate these algorithms were the coefficient of determination (R 2 ), mean absolute error (MAE), and root-mean-squared error (RMSE). To ascertain the relationship between the measured data and the predicted data, the overall finest models were evaluated in a regression plot. measurements configured the dependent dataset and belonged to trees with conditions or characteristics distinct from one another, observed during the field campaign. To indicate that, a descriptive statistical analysis was conducted with the nutrients' concentration from the chemical analysis of the leaf tissue, and the following parameters were calculated: minimum, maximum, mean, standard deviation, median, and coefficient of variation. They were important to demonstrate the discrepancy of the dependent (calibration/training) data used, and how representative it could be of the nutritional conditions of Valencia-orange leaf tissue in the analyzed period. Regarding the configuration of each algorithm, the parameters of the used methods were set to the library default values, except those described in Table 1. For both the DT and RF algorithms an Extreme Gradient Boosting (XGBoost) model was used to increase their performances. This model adopts a forward-learning ensemble method [61], which obtains predictive results in gradually improved estimations. To illustrate the machine learning architecture regarding data inputs and outputs in the proposed analysis, a structure was organized in Figure 3. have a characteristic distinct from most data, which is a high number of bands/wavelengths available to model a given problem. The used dataset was composed of 320 leaf measurement (in which 32 were separated as a test) and 640 spectral bands (380-1020 nm). This gives a total of 204,800 combinations to work with, which should be enough to configure a training/testing dataset. Although this high dimensionality could offer potential problems to hyperspectral data processes [65], studies already suggested that maintaining the original data could also outperform feature-selected subsets [66,67].  It is also important to address that, although hyperspectral data is relatively easy to obtain, leaf tissue analysis can be limited. This is mostly because the chemical analysis can be highly cost if considering the amounts of data required to process the machine learning algorithms. Therefore, the appropriate number of samples is something to be observed in each case. In this study, the amount of data used to train and validate/test the used algorithms should be sufficient based on the literature. One study [62] compared different learners, such as RF, SVM, kNN, and others, in diverse settings. Between these settings, they evaluated the number of classes per problem (from 2 to 50) and the number of samples per class (from 5 to 100). This returned a variation of 10 to 5000 samples. Through their study, it was demonstrated that data curation could be modeled by these algorithms from a few to a high number of samples and still achieve appropriate results. In comparison with the proposed approach, other machine learning frameworks also adopted similar sample sizes, like 324 leaf measurements that were used to model the water-stress response from lettuce [23], 189 hyperspectral observations that were used to model grapevine water status [63], and 266 observations that were used as training to predict nitrogen content in rice fields [64].
As a discussion example, a recent paper collected 500 samples per class with 540 spectral bands and adopted a Cross-Validation method with a dataset considering 200 samples for each validation to demonstrate the importance of the feature selection methods [65]. Regardless, hyperspectral data have a characteristic distinct from most data, which is a high number of bands/wavelengths available to model a given problem. The used dataset was composed of 320 leaf measurement (in which 32 were separated as a test) and 640 spectral bands (380-1020 nm). This gives a total of 204,800 combinations to work with, which should be enough to configure a training/testing dataset. Although this high dimensionality could offer potential problems to hyperspectral data processes [65], studies already suggested that maintaining the original data could also outperform feature-selected subsets [66,67].
As mentioned, though the aforementioned studies did use similar sample sizes of data to train, validate, and test their learners, little information related to hyperspectral wavelengths and machine learning method sample size could be encountered in the literature [65]. Since there is no research focused on evaluating the impact of the training set to model spectral data, a previous comparison regarding two well-known sampling methods was performed. The first is the cross-validation method, which is more suitable to deal with the most common tasks in machine learning data curation [43]. The Cross-Validation method was performed with 10k folders. This model separates the data into 10(k) parts while using nine of them to train the algorithm and one to validate. This process is done sequentially, constantly changing the folder used for validation. In this manner, the chosen algorithm is always validated by data not used during the training phase. The second method used was the Leave-One-Out approach. This method is similar to Cross-Validation, but instead it only takes one data instance for validation each time. The method is considered a very time-intensive procedure, and it is only recommended for smaller datasets [43]. After applying the Random Search approach [59] to perform a fine-tuning, both training methods' results were compared ( Table 2).
In the Cross-Validation method, from the 288 samples, 90% was used to train while 10% was used to validate, and was repeated 10 times randomly. In the Leave-One-Out method, 287 samples were used to train, while one sample was used to test it. This was repeated until all instances were used. The low difference between MAE predictions in each nutrient for both methods indicates that, even when adopting a more suitable training approach to model smaller datasets (Leave-One-Out), the training results are similar. Still, while the Leave-One-Out method is approximately unbiased, it could result in a high variance. Normally, the variance in fitting a model tends to be higher in small datasets since it is more sensitive to noise and artifacts in the used training sample. Because of that, a Cross-Validation method would also show signs of high variance, as well as a high bias if given a limited amount of data. This was not the case here, since both methods returned high predictions and similar metrics, thus indicating that, whatever the training method, the amount of data (204,800 combinations) was sufficient to model the given problem. Regardless, the Leave-One-Out method needed a higher computational cost, which is something to be considered when evaluating the amount of processed data. In the workstation, the Leave-One-Out-averaged processing time for all algorithms was 7.5 times slower than the Cross-Validation method. Because of that, the Cross-Validation method was adopted in this study, but future research should considerer both methods according to their respective dataset size and characteristics.
Lastly, the contribution of each spectral wavelength into the performance of the algorithm was computed by displaying their Relief-F value. Relief-F uses a kNN scoring to address noise data while handles incomplete data [68]. It is considered a reliable metric to inform a feature score and then be applied to rank top-scoring features. Here, the Relief-F values were used to map the hyperspectral response of each nutrient regarding the strength of the individual wavelengths to the performance of the evaluated algorithms. Aside from that, to help ascertain the hyperspectral relationship with the evaluated nutrients dataset, an analysis of each nutrient and a Shapiro-Wilk normality test at a 95% confidence interval was performed. As the normality test returned a p-value under 0.05, a non-parametric Spearman's correlation test in a pairwise comparison was executed to verify the association between each nutrient.

Results
The chemical analysis of the leaf tissue returned heterogeneous and non-parametric results for the nutrient content of the analyzed leaves (Table 3). Analysis has shown that the majority of the nutrients presented a high variability and uniform distribution. This behavior was most noticeable in nutrients, such as Ca, Fe, Mn, and Zn. Regardless, this condition is important to demonstrate the applicability of the proposed framework. Since this is a heterogeneous dataset, machine learning algorithms are advantageous for modeling data with such characteristics. The correlation between nutrients (Figure 4) is important information to characterize a dataset. The correlation coefficients indicated that, although significant, most nutrients have a low correlation value among themselves. Still, the pairwise comparison returned an expected behavior. Macronutrients, such as N, P, and K, showed positive correlations with each other while presenting negative correlation coefficients (N and P) with the other nutrients. The correlation coefficients between the nutrients varied around 0.5 or below, with the highest reaching 0.59 and lowest reaching −0.60. The low correlation value is also favorable for the proposed framework, as it helps to isolate the effects of the nutrient on the evaluated wavelengths.
Remote Sens. 2020, 12, x FOR PEER REVIEW 12 of 23 between the nutrients varied around 0.5 or below, with the highest reaching 0.59 and lowest reaching For the machine learning algorithms used, the results were separated between the two datasets: reflectance (Table 4) and first-derivative ( Table 5). The algorithms returned good performances (R 2 > 0.80) for the macronutrients with the spectral reflectance as predictors. When the first-derivative was used, the algorithms performed well on both macro-and micronutrients (some R² > 0.80), but all performances were improved for the micronutrients. This is an important discovery, as it highlights the importance of first-derivative measurements and their relationship with micronutrients in the Valencia-orange leaf tissue. In both datasets, algorithms like RF, ANN, and kNN returned better predictions than most linear ones, such as Lasso and Ridge Regressions and SVM. The MAE predictions returned here are similar to the predictions resulted from the training phase, which indicates how adjusted the sampling method was. For the machine learning algorithms used, the results were separated between the two datasets: reflectance (Table 4) and first-derivative ( Table 5). The algorithms returned good performances (R 2 > 0.80) for the macronutrients with the spectral reflectance as predictors. When the first-derivative was used, the algorithms performed well on both macro-and micronutrients (some R 2 > 0.80), but all performances were improved for the micronutrients. This is an important discovery, as it highlights the importance of first-derivative measurements and their relationship with micronutrients in the Valencia-orange leaf tissue. In both datasets, algorithms like RF, ANN, and kNN returned better predictions than most linear ones, such as Lasso and Ridge Regressions and SVM. The MAE predictions returned here are similar to the predictions resulted from the training phase, which indicates how adjusted the sampling method was. The bolded in the table are the scores representing the overall best performance of each nutrient. The bolded in the table are the scores representing the overall best performance of each nutrient. To ascertain the relationship between each nutrient prediction, their regression values were plotted (Figure 5a,b). A quick analysis of the best-predicted values versus the laboratory-measured values demonstrates how the performance of the algorithms varied with the increase in the nutrient concentrations. Nutrients such as P and Ca did not show a closer resemblance with a 1:1 relationship (dashed-line- Figure 5a,b) as much as the other nutrients' predictions, even lower ones such as Fe. Regardless, most predictions were quite well related to the laboratory data, and on-site measurements of nutrients like N, K, Mg, S, Cu, Mn, and Zn may benefit from the advantage promulgated by the approach presented here. The calculated Relief-F value showed the contribution of each wavelength to the algorithms' performance (Figure 6a,b). This contribution is important to isolate specific spectral regions and wavelengths of the electromagnetic spectrum most closely related to each nutrient. This relationship, however, is limited to the evaluated algorithm and its performance. Still, since most performances were relatively high (Tables 3 and 4) for most nutrients, this metric is an interesting parameter, as it shines some light on the spectral mapping of Valencia-orange leaf nutrients, as not much is known The calculated Relief-F value showed the contribution of each wavelength to the algorithms' performance (Figure 6a,b). This contribution is important to isolate specific spectral regions and wavelengths of the electromagnetic spectrum most closely related to each nutrient. This relationship, however, is limited to the evaluated algorithm and its performance. Still, since most performances were relatively high (Tables 3 and 4) for most nutrients, this metric is an interesting parameter, as it shines some light on the spectral mapping of Valencia-orange leaf nutrients, as not much is known about their spectral behavior.
As mentioned, the Relief-F value calculated for each wavelength indicated important contributions in different ranges for each nutrient (Figure 5a,b). Because of that, certain bands of the electromagnetic spectrum contributed more than others. To summarize the information obtained from the proposed framework, a table (Table 6) indicating the nutrient and its class (macro or micro), the machine learning method most suitable to model it, its coefficient of determination (R 2 ), the spectral data which its prediction was calculated from, and the most contributive wavelengths or spectral regions to model the measured nutrient from the Relief-F value was created. These results demonstrate the potential of applying different machine learning algorithms for this task. So far, this is the first approach of its kind with nutrient content in leaf tissue analysis.

Discussion
In the proposed framework, both reflectance data and their first-derivatives were used to predict macro-and micronutrients. This approach used a robust technique (machine learning) to model the hyperspectral data, which helped to ascertain some discoveries. The results demonstrated compelling performances to predict most of the nutrients (Tables 3-5). Nutrients like N, Mg, Cu, Mn, and Zn were predicted with an R 2 of 0.912, 0.832, 0.861, 0.898, and 0.855, respectively. Other nutrients like P, K, and S presented inferior performances with an R 2 of 0.771, 0.763, and 0.727, respectively. The worst performances were obtained for nutrients like Ca and Fe, with their R 2 's equal to 0.624 and 0.612, respectively. In comparison to the literature, most of these performances, specifically those related to macronutrients, were similar or superior for other types of plants and methods. For N, predictions using visible to infrared data returned accuracies between 0.73 to 0.87 (R 2 ) [2,30,40]. For K, a three-band combination index predicted the nutrient with an R 2 equal to 0.74 [37]. In nutrients like Mg, S, P, Ca, and others, the predictions (R 2 ) variated between lower values of 0.27 up to 0.98, depending on the method applied and the plant evaluated [24,26,30,32,38,39,50].
One important finding from this research is the relationship between nutrients, algorithms, and leaf-spectral curves. For macronutrients, the performance of the algorithms was superior when adopting the surface reflectance data. As for the micronutrients, the first-derivative of the spectral reflectance returned better performances for the algorithms (Figure 5a,b and Table 6). This finding can be related to information reported in the literature [39][40][41]. Since first-derivative spectra allow for highlighting absorption features of the original spectra, it could potentially be linked to different components not so easily observable in spectral reflectance data alone. This approach demonstrated a better relation for all micronutrients when linked to the algorithms with the first-derivative, so this could offer a possible explanation. As previously mentioned in Section 2.2, other derivatives of the dataset were evaluated, but could not find a significant difference over the first derivative. Still, further research should continue to explore the association between first-derivative spectra, secondand third-derivatives, and micronutrients.
Another contribution of the proposed framework is that, although, with a limitation in the accuracy of the algorithms, it is possible to identify the wavelengths and the spectral regions that most contributed to predicting each nutrient (Figure 6a,b and Table 6). While some nutrients show contributions from the same wavelengths, these contributions vary in value (Relief-F). Even so, most of the nutrients showcase particular wavelengths that could potentially be isolated or used in combination with others to ascertain their relationship with the prediction (Table 6). This finding could help to map the Valencia-orange leaf spectral behavior related to both macro-and micronutrients and promote the investigation of simpler mathematical models or spectral indices capable of modeling these nutrients by focusing on these wavelengths.
Machine learning algorithms have the advantage of modeling data in a non-linear and a non-parametric manner. Unlike many traditional statistical methods, these algorithms are built with the advantage of dealing with noisy, complex, and heterogeneous data [16,23,[50][51][52]. These characteristics proved to be an advantage for this study, as the data used had higher variance, was not-normal (Table 3), and, while statistically significant, low-correlated in a pairwise manner (Figure 4). Previous research aimed to model nutrient content with similar characteristics by using multiple mathematical methods in the analysis of plant hyperspectral data, but it did not return the same accuracies [15,24,32,40,42]. Nonetheless, since machine learning methods can deal with most of the data inconsistencies, both in hyperspectral measurements and in nutrient content analysis, the proposed framework should be more appropriate to combine these features not requiring data modification while still returning good performances.
Finally, the different performances returned by the algorithms should be discussed. It is clear that regression models like Lasso, Ridge, and SVM were inferior to others (Tables 3 and 4) in both scenarios (reflectance/first-derivative). Although SVM is known to handle high dimensionality data and do well with a limited training dataset [45], it performed poorly in the used dataset in comparison with the rest. The DT algorithm, though not as inferior in performance as the aforementioned algorithms, achieved middling results in comparison with the remaining methods. For the DT, the XGBoost model was adopted to improve the prediction performance, which was also implemented in the RF base model. During the experimental phase, this boosting model proved to be of assistance in enhancing the performance of both algorithms. Still, DT did not return predictions as accurate as did the RF algorithm.
The highest performances were obtained by the RF, ANN, and kNN algorithms; both for macroand micronutrients. While RF was better in almost all predictions, ANN and kNN performed well in only particular cases, especially for K (reflectance data) and Mn (first-derivative data), respectively. kNN is a simpler method than ANN and RF, being relatively faster. However, throughout the different nutrients, RF and ANN had better consistency. The ANN was constructed in a manner that could predict most of the nutrients, but performance rates were limited. The amount of available data for training the algorithms could also be a potential hindrance for deep learning networks to handle. While the ANN method benefited from a multilayer perceptron, with two hidden layers and a high number of neurons and interactions, the RF algorithm was boosted with the XGBoost model, which returned a continuous performance for the reflectance and first-derivative datasets. Regardless, as no machine learning algorithm is considered universally appropriate to deal with any task, a framework like the one proposed here is recommended since it makes uses of multiple algorithms because different data could potentially impact its performance.

Conclusions
The proposed approach uses leaf spectral data in the visible and near-infrared regions, and switches between reflectance and its first-derivative data to predict the amount of macro-and micronutrients measured in the laboratory. This method was able to return high predictions (R 2 ) for nutrients like N (0.912), Mg (0.832), Cu (0.861), Mn (0.898), and Zn (0.855), and, to a lesser extent, P (0.771), K (0.763), and S (0.727). These accuracies were obtained with the RF, ANN, and kNN algorithms, among which RF performed the best. Another discovery was that reflectance data is more suitable to model macronutrients, while the first-derivative of the reflectance data is better related to micronutrients. Another contribution also made by this study is the identification (by the Relief-F value) of the wavelengths most responsible for the prediction results. Each nutrient was better correlated to one or more spectral wavelengths. Because of it, future research should evaluate simpler models or spectral vegetation indices capable of modeling the nutrient content by focusing on these wavelengths. Although the presented method was used for evaluating the nutritional conditions of Valencia-orange leaves, it can be replicated for different plants and cultivars, with the possibility of even better performances being achievable. Furthermore, as an advantage of this approach, this framework may be implemented in hyperspectral data obtained with sensors embedded in UAV-based systems.