Leaf Nitrogen Concentration and Plant Height Prediction for Maize Using UAV-Based Multispectral Imagery and Machine Learning Techniques

Abstract: Under ideal conditions of nitrogen (N), maize (Zea mays L.) can grow to its full potential, reaching maximum plant height (PH). As a rapid and nondestructive approach, the analysis of unmanned aerial vehicle (UAV)-based imagery may assist in estimating N and height. The main objective of this study is to present an approach to predict leaf nitrogen concentration (LNC, g kg−1) and PH (m) in maize plants with machine learning techniques and UAV-based multispectral imagery. An experiment with 11 maize cultivars under two rates of N fertilization was carried out during the 2017/2018 and 2018/2019 crop seasons. The spectral vegetation indices (VIs) normalized difference vegetation index (NDVI), normalized difference red-edge index (NDRE), green normalized difference vegetation index (GNDVI), and soil adjusted vegetation index (SAVI) were extracted from the images and, in a computational system, used alongside the spectral bands as input parameters for different machine learning models. A randomized 10-fold cross-validation strategy, with a total of 100 replicates, was used to evaluate the performance of 9 supervised machine learning (ML) models using Pearson's correlation coefficient (r), mean absolute error (MAE), coefficient of determination (R²), and root mean square error (RMSE). The results indicated that the random forest (RF) algorithm performed best, with r and RMSE, respectively, of 0.91 and 1.9 g kg−1 for LNC, and 0.86 and 0.17 m for PH. It was also demonstrated that VIs contributed more to the algorithm's performance than individual spectral bands. This study concludes that the RF model is appropriate to predict both agronomic variables in maize and may help farmers to monitor their plants based upon their LNC and PH diagnosis and use this knowledge to improve their production rates in subsequent seasons.


Introduction
Remote sensing techniques aligned with precision agriculture practices are being investigated in research on different farmlands [1]. In recent years, the increased market availability of unmanned aerial vehicles (UAV) has encouraged multiple applications in this field. Agricultural remote sensing is a promising field as it supports a multidisciplinary view of different problems related to crop mapping [2] and has been applied to multiple subjects, such as environment control [3], temporal analysis [4], phenology [5], yield prediction [6][7][8][9], and nutritional analysis [10][11][12]. These studies revealed the importance of evaluating techniques and sensing data to deal with such tasks.
A relevant topic for farmers and technicians is the correct monitoring of their farmlands, as nutrient absorption rates are connected with plant growth and yield estimates. An important nutrient related to plant growth is nitrogen (N). N benefits leaf development and photosynthetic activity in plants, influencing their productivity [13]. Plants with N deficiency show visual symptoms in their leaves, known as chlorosis [14,15]. This nutrient is commonly applied in agricultural areas and is one of the nutrients that contribute most to global production. However, an incorrect diagnosis may be a problem from both economic and environmental points of view [16,17].
To circumvent the aforementioned problem, agronomic technicians rely on traditional chemical leaf tissue analysis to determine the amount of N absorbed by the plant [18]. However, this practice is destructive, time-consuming, and costly. Thus, it is difficult to adopt the traditional analysis as a recurrent procedure to monitor multiple areas and stages [19]. As a rapid, nondestructive, and highly replicable method, UAV-based image analysis may assist in estimating plant nutrient content and growth status [20][21][22][23][24].
As an alternative method, multispectral data analysis collected with sensor systems represents a promising approach to increase the precision in area monitoring [20]. Predicting nutrient content and plant height with remote systems and automated intelligent methods is gaining attention in agriculture practices. With multispectral sensors, at canopy or leaf levels, different studies predicted leaf nitrogen concentration (LNC) in maize (Zea mays L.) [16], winter-wheat (Triticum aestivum) [21], cotton (Gossypium hirsutum) [22], rice (Oryza sativa) [23], citrus (Citrus sinensis) [18,24], among others. Although hyperspectral sensors stand out in their ability to characterize the spectral response with high accuracies [25,26], multispectral sensors are used more frequently in agriculture remote sensing since they are economically viable and accessible to most of the front-end users.
Predicting agronomic variables with multispectral data is a common practice in remote sensing applications. However, performing this task with machine learning techniques is still a recent and relevant topic in agricultural remote sensing since it provides a robust and direct approach to evaluate different agronomic variables. Machine learning is considered a subfield of artificial intelligence in which algorithms learn from data, discover patterns in the dataset, and then make decisions on new, similar data. The algorithms have the potential to model several types of datasets using linear and parametric as well as nonlinear and nonparametric approaches [12,27,28], including multispectral images [29]. Different machine learning algorithms, such as random forests (RF), decision trees (DT), artificial neural networks (ANN), and support vector machines (SVM), among many others, have been adopted for various applications in agricultural remote sensing [5,30-32].
Machine learning has helped not only to increase the prediction accuracy of some agronomic variables but also to solve complex problems related to data heterogeneity. A review study on yield and N content prediction [33] concluded that advances in remote sensing technologies and machine learning techniques will result in more cost-effective and comprehensive solutions for a better crop state assessment. The combination of machine learning techniques and vegetation indices (VI) is also an important subject in agricultural applications and has been adopted in different studies, some of which are related to the prediction of maize characteristics [34,35].
Remote Sens. 2020, 12, 3237
Under ideal conditions of N, maize plants can grow to their full potential, reaching maximum height [36,37]. Considering that, implementing different approaches to estimate height and N with UAV-based remote systems is essential to optimize the monitoring of areas with multiple varieties. Currently, one of the main objectives of maize breeding programs is to identify genotypes with high N use efficiency [38,39]. Obtaining rapid predictions with an alternative approach such as machine learning and UAV-based imagery may enable programs, technicians, and farmers to evaluate multiple genotypes each year, allowing them to optimize the selection of the most promising plants concerning N use efficiency. In this matter, the main idea behind this proposal is to present a feasible alternative to monitor N and plant height (PH) with machine learning techniques and UAV-based imagery.
By implementing the aforementioned approach, farmers can monitor the LNC of their maize plants, select the areas or maize varieties (based upon their location or plots) that are most promising according to this diagnosis, and use this knowledge to improve their production rates in subsequent seasons. As machine learning has been shown [23][24][25][26][27][28][29][30] to be a robust approach to evaluate heterogeneous data, it could return important results when considering different genotypes of maize plants. In this paper, the following questions are addressed: (1) which machine learning models are most suitable to predict LNC and PH in maize (Zea mays L.) plants with spectral data from UAV-based imagery? and (2) among all predictor variables (spectral indices, bands, and the combination of both), which is the most useful for mapping LNC and PH based on the machine learning approach?

Materials and Methods
The proposed method was divided into 4 main phases: (1) the description of the in-field experiments and how the experimental design was set up, (2) the extraction of the variables LNC and PH, (3) the image preprocessing and calculation of the VIs investigated, and (4) the experimental protocol implemented. Each main phase is described in detail in the following subsections.

Field Trials
The experiment was carried out in the municipality of Chapadão do Sul, State of Mato Grosso do Sul, Brazil (18°46′26″ S, 52°37′28″ W, average altitude of 810 m), during the 2017/2018 and 2018/2019 crop seasons. In this experiment, 11 maize cultivars grown under two rates of topdressing nitrogen fertilization, 60 kg ha−1 (considered low) and 180 kg ha−1 (considered high), were investigated, with four replicates of each plot. The cultivars used in the experiment were: Caimbé; CatiVerde; Gorotuba; AlAvaré; BRS106; BRS4103; BRS4104; Diratininga; SCS154; SCS155; and SCS156. Each plot consisted of five rows, 5 m long and spaced 0.45 m apart. Because the experimental area is relatively small, soil conditions are similar across it. This area is constantly monitored and soil corrections are conducted whenever necessary.
The maize cultivars and N rates were allocated to the same plots in both seasons. The use of several cultivars and two rates of N aimed to reproduce different situations encountered on farms in Brazil. Thus, the models tested can estimate the variables for these conditions in both seasons. The integration of multiple varieties was also important to provide enough samples for the machine learning models to learn to predict LNC and PH. It was also necessary to build a dataset heterogeneous enough to demonstrate the feasibility of these techniques. The geographic location of the area, along with the experiment plots, is displayed in Figure 1.

Evaluated Variables
Maize plants were evaluated at the V12 stage. The images were collected at this stage because plants have reached their full potential in terms of growth and nitrogen absorption in this phase. The middle third of five leaves of maize plants was collected in each experimental unit. The LNC (g kg−1) was obtained by the methodology described in [40]. In this regard, N was evaluated with in-field measurements following standard agronomic procedures. For this, the Kjeldahl titration technique was applied, which is divided into (1) digestion, (2) distillation in an N distiller, and (3) titration with sulfuric acid (H2SO4). On this same date, PH (m) was obtained as the average of five plants chosen at random in each plot. For this, a measuring tape was used, positioned from the base of the plant to its apex (i.e., the highest point of the plant, at the top of the canopy). A high-precision tracking GNSS receiver was used to map the crop plots (yellow grids in Figure 1), ensuring that the collected data were representative of each plot.
This provided a total of 176 in-field observations of LNC and PH. The mean measured values of LNC and PH did not differ significantly between the two seasons (2017/2018 and 2018/2019) at the 0.05 significance level. For this, a Shapiro-Wilk test followed by a pairwise Student's t-test was used.

Image Acquisition and Vegetation Indices
The following spectral regions were used for calculating the VIs: green (G), red (R), red-edge (RE), and near-infrared (NIR). The described wavelength (nm) is the bands' center on both sensors. The area was recorded during the first crop season (2017/2018) with a MicaSense Red-Edge multispectral sensor (G: 560 nm, R: 668 nm, RE: 717 nm, and NIR: 842 nm) embedded in a UAV-multirotor X800. For the second crop season (2018/2019), a Sensefly eBee RTK fixed-wing remotely piloted aircraft was used. The eBee was equipped with the Sensefly Parrot Sequoia multispectral sensor (G: 550 nm, R: 660 nm, RE: 735 nm, and NIR: 790 nm).
Both sensors acquired spectral data in the aforementioned wavelengths and used a luminosity sensor that allowed the calibration of the acquired values. The two overflights were performed at 100 m altitude, returning a spatial image resolution (ground sample distance, GSD) of 0.10 m, and were conducted at 10:00 h (local time).
For the image preprocessing, Pix4DMapper was used, optimizing the interior and exterior parameters of the images. A sparse point cloud based on the structure-from-motion (SfM) technique and dense point clouds based on multi-view stereo (MVS) were generated with multiple control points. These points were collected with a dual-frequency global navigation satellite system (GNSS) receiver in real-time kinematic (RTK) mode. Images were acquired with 80% longitudinal and 60% lateral overlaps, and the digital number (DN) was converted to surface reflectance using the calibration parameters described in the manuals of both sensors. The calibration and luminosity corrections were also necessary to minimize the influence of soil brightness. Because spectral indices were used in this study, this interference was further reduced. The crop was also at a fully developed stage and covered most of the soil at the recorded spatial resolution, so the soil contributed minimally to the measured spectral response.
During an experimental phase, multiple VIs were calculated with the aforementioned spectral bands. However, most of the indices did not return promising results and were also redundant across the tests. Because of that, only four main VIs were implemented in the machine learning models: normalized difference vegetation index (NDVI) [41], normalized difference red-edge index (NDRE) [42], green normalized difference vegetation index (GNDVI) [43], and soil adjusted vegetation index (SAVI) [44]. These VIs are among the most commonly used indices to predict plant health and conditions. The equations arranged below demonstrate the spectral data used to obtain these VIs, respectively.
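As an illustration, the four VIs can be computed from plot-level surface reflectance as sketched here. The sketch uses the commonly published definitions of NDVI, NDRE, GNDVI, and SAVI; the soil adjustment factor L = 0.5 and the reflectance values are assumptions made for the example and are not taken from this study.

```python
def normalized_difference(a, b):
    """Generic normalized difference (a - b) / (a + b)."""
    return (a - b) / (a + b)


def ndvi(nir, red):
    """NDVI = (NIR - R) / (NIR + R)."""
    return normalized_difference(nir, red)


def ndre(nir, red_edge):
    """NDRE = (NIR - RE) / (NIR + RE)."""
    return normalized_difference(nir, red_edge)


def gndvi(nir, green):
    """GNDVI = (NIR - G) / (NIR + G)."""
    return normalized_difference(nir, green)


def savi(nir, red, soil_factor=0.5):
    """SAVI = (1 + L) * (NIR - R) / (NIR + R + L).

    soil_factor (L) = 0.5 is an assumed, commonly used default.
    """
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


# Hypothetical mean surface reflectance of one plot (illustrative values only).
plot = {"G": 0.08, "R": 0.05, "RE": 0.20, "NIR": 0.45}

indices = {
    "NDVI": ndvi(plot["NIR"], plot["R"]),
    "NDRE": ndre(plot["NIR"], plot["RE"]),
    "GNDVI": gndvi(plot["NIR"], plot["G"]),
    "SAVI": savi(plot["NIR"], plot["R"]),
}

for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

All four indices share the NIR band, which helps explain the redundancy observed among candidate indices during the experimental phase.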

Data Analysis
The pixel values for each plantation plot were extracted from the images. These values were used as input to estimate the measured in-field values of LNC and PH in their corresponding plot. A randomized 10-fold cross-validation sampling strategy, with a total of 10 repetitions, was used to evaluate the performance of 9 supervised machine learning models (Table 1). To evaluate the performance of each model as well as the relationship between the predicted and observed variables, the root mean squared error (RMSE) and mean absolute error (MAE) metrics were used.
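The two error metrics can be written out explicitly. The following is a minimal sketch using their standard definitions; the observed and predicted LNC values are hypothetical and serve only to show the calculation.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error between observed and predicted values."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


# Hypothetical LNC values in g kg-1 (not data from this study).
observed = [20.0, 25.0, 30.0, 35.0]
predicted = [22.0, 24.0, 29.0, 38.0]

print(f"RMSE = {rmse(observed, predicted):.3f} g kg-1")
print(f"MAE  = {mae(observed, predicted):.3f} g kg-1")
```

Because RMSE squares the residuals, it penalizes large errors more strongly than MAE, so reporting both gives a fuller picture of the error distribution.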
The number of samples was similar to that of previous research [12,28], which also discussed the quantity of input data required to train these types of algorithms. With the cross-validation approach, 90% of the 176 samples were used to train the models and 10% to test them. Because this process was repeated, 100 randomized test sets were constructed. In summary, this type of validation is repeated sequentially, each time changing the fold used for validating the algorithm [12,18,27,45]. In this manner, the algorithm is always validated with data not used in its training phase. In this experiment, the entire procedure was also repeated 100 times, which means that the models were built from scratch in every repetition.
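The splitting scheme can be sketched as follows. This is a minimal illustration of repeated, shuffled 10-fold cross-validation over 176 samples, assuming 10 repetitions (yielding 100 test sets); the function name and the seed are hypothetical, and the sketch does not reproduce the software used in this study.

```python
import random


def repeated_kfold(n_samples, n_folds=10, n_repeats=10, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated, shuffled k-fold CV."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # new random partition in every repetition
        for fold in range(n_folds):
            # Every n_folds-th shuffled index forms the held-out fold.
            test_idx = indices[fold::n_folds]
            held_out = set(test_idx)
            train_idx = [i for i in indices if i not in held_out]
            yield train_idx, test_idx


splits = list(repeated_kfold(n_samples=176))
print(len(splits))  # number of train/test pairs
train_idx, test_idx = splits[0]
print(len(train_idx), len(test_idx))
```

With 176 samples, each held-out fold contains 17 or 18 observations (about 10%), and a model would be fitted from scratch on the remaining indices for every pair.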
Two decision tree-based machine learning algorithms were used here: the reduced error pruning tree (REPT) with backfitting, and the random forest (RF) method with 100% of the training set as bagging size. A K-nearest neighbor (KNN) algorithm was also used with three different K values: 1, 5, and 10. Support vector machines (SVM) adopting sequential minimal optimization (SMO) were tested with two different kernels: radial basis function (RBF) and polynomial. Finally, a linear and a kernel-based regressor were also included for comparison. The linear regression (LR) uses a grid-search strategy for model selection based on the Akaike information criterion, and the kernel-based regressor is a radial basis function network. The library default values were adopted for the number and depth of trees, nodes, and leaves in the decision tree models, and three numbers of neighbors (1, 5, and 10) were used for the KNN algorithm. As stated, two kernel functions were considered for SVM: the RBF, exp(-gamma*|u-v|^2), and the polynomial, (gamma*u'*v + coef0)^2. Each value regarding the described variables was set to be calculated automatically considering the overall best predictions, with the epsilon of the loss function equal to 0.1. Last, a grid search approach was used to fine-tune the linear regression model (RBF Regression), thus optimizing the hyperparameters of this particular model.
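The two SVM kernel functions can be illustrated directly from their expressions. The sketch below evaluates exp(-gamma*|u-v|^2) and (gamma*u'*v + coef0)^2 for two feature vectors; the values of gamma and coef0 and the example vectors are assumptions chosen for illustration, since the study set these parameters automatically.

```python
import math


def rbf_kernel(u, v, gamma):
    """RBF kernel: exp(-gamma * |u - v|^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def polynomial_kernel(u, v, gamma, coef0, degree=2):
    """Polynomial kernel: (gamma * u'v + coef0)^degree, with degree 2 here."""
    dot_product = sum(a * b for a, b in zip(u, v))
    return (gamma * dot_product + coef0) ** degree


# Hypothetical VI vectors (NDVI, NDRE) of two plots.
u = [0.80, 0.38]
v = [0.72, 0.31]

print(rbf_kernel(u, v, gamma=0.5))
print(polynomial_kernel(u, v, gamma=0.5, coef0=1.0))
```

The RBF kernel equals 1 for identical vectors and decays with their squared distance, whereas the polynomial kernel grows with the dot product of the two vectors.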
All the models were tested using three sets of variables: (#1) a set with spectral bands only (SB), (#2) a set with VIs only, and (#3) a set including both SBs and VIs together. During the experimental phase, different hybrid combinations of the adopted models were evaluated. However, these combinations are not discussed in this manuscript, mainly because they did not yield results as informative as the separation between SBs and VIs. After determining the best overall algorithm, an inference model was calculated to produce a prediction map over the UAV image. This map was used to ascertain the relationship between the predicted variables and to help discuss the implications of the proposal of this study.
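The three input configurations can be expressed as simple feature selections over the same plot record. The sketch below is only illustrative: the dictionary keys, the helper name, and the sample values are hypothetical.

```python
SPECTRAL_BANDS = ["G", "R", "RE", "NIR"]
VEGETATION_INDICES = ["NDVI", "NDRE", "GNDVI", "SAVI"]

# Configuration #1 (SB), #2 (VI), and #3 (SB+VI), as described in the text.
CONFIGURATIONS = {
    "SB": SPECTRAL_BANDS,
    "VI": VEGETATION_INDICES,
    "SB+VI": SPECTRAL_BANDS + VEGETATION_INDICES,
}


def select_features(sample, configuration):
    """Return the feature vector of one plot for the chosen configuration."""
    return [sample[name] for name in CONFIGURATIONS[configuration]]


# Hypothetical plot record (illustrative values only).
sample = {
    "G": 0.08, "R": 0.05, "RE": 0.20, "NIR": 0.45,
    "NDVI": 0.80, "NDRE": 0.38, "GNDVI": 0.70, "SAVI": 0.60,
}

for name in CONFIGURATIONS:
    print(name, select_features(sample, name))
```

Keeping the configurations in one mapping makes it straightforward to run every model under each input set with the same cross-validation splits.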
Additionally, based on the overall best algorithm, the most contributive input data used by the learner were also identified. For this, a classifier attribute evaluation, which estimates the worth of an attribute by using a specified classifier (in our case, the overall best algorithm), was implemented with rank search as the selection method. The evaluated rank metric was based on a merit score obtained with the ZeroR regressor. This merit corresponds to the relative increase in the performance of the model in relation to the ZeroR regressor. ZeroR was used since it takes the average value of the target variable and uses this value as its prediction. In this regard, the rank method can return a merit number even greater than 1 (since the relative increase may exceed 100%). This procedure was important to determine the significance of each SB or VI to infer LNC and PH in maize crops.
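The ZeroR baseline against which the merit is measured can be sketched in a few lines. The example only shows the baseline itself, which predicts the mean of the training targets for every sample; the exact merit formula of the attribute evaluator is not given in the text and is therefore not reproduced. Function names and values are hypothetical.

```python
import math


def zero_r_fit(train_targets):
    """ZeroR 'model': the mean of the training targets."""
    return sum(train_targets) / len(train_targets)


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Hypothetical LNC values in g kg-1 (not data from this study).
train_lnc = [20.0, 30.0, 40.0]
test_lnc = [28.0, 32.0]

baseline_value = zero_r_fit(train_lnc)
baseline_predictions = [baseline_value] * len(test_lnc)
baseline_rmse = rmse(test_lnc, baseline_predictions)

print(f"ZeroR prediction: {baseline_value:.1f} g kg-1")
print(f"ZeroR RMSE: {baseline_rmse:.1f} g kg-1")
```

Any attribute that lets the learner improve on this constant prediction receives a positive merit, so the baseline anchors the ranking.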

Relationship among the Agronomic Variables
To present the relationship among the evaluated variables, Figure 3 was prepared to display the Pearson's correlations between LNC and PH with the SBs and VIs evaluated in this study. The magnitude of the correlations was different for each N fertilization rate (high and low).
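For reference, Pearson's correlation coefficient can be computed per fertilization rate as sketched below. The grouping into low and high N rates follows the text, but the NDVI and LNC values are hypothetical and do not reproduce Figure 3.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical (NDVI, LNC) pairs grouped by N rate (illustrative values only).
plots_by_rate = {
    "low": {"NDVI": [0.70, 0.74, 0.78, 0.81], "LNC": [24.0, 23.1, 22.5, 21.0]},
    "high": {"NDVI": [0.76, 0.79, 0.83, 0.85], "LNC": [31.0, 30.2, 30.5, 29.0]},
}

for rate, values in plots_by_rate.items():
    r = pearson_r(values["NDVI"], values["LNC"])
    print(f"N rate {rate}: r = {r:.2f}")
```

Computing r separately for each rate avoids mixing the two fertilization groups, whose correlation magnitudes differ.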

Models' Performances for LNC and PH Prediction
Figures 4 and 5 display the boxplots for the RMSE using 100 runs (10 repetitions of 10-fold cross-validation) of each machine learning algorithm under the 3 data input configurations: SB, VI, and SB+VI. Figure 4 displays the boxplot for LNC, whereas Figure 5 displays the boxplot for PH. Regarding the LNC estimate, the RMSE indicates a higher averaged performance of the RF model, with a smaller interquartile range, for the VI and SB+VI configurations. The performance of RF using only the SBs is lower than with the other configuration sets for both LNC and PH. The three KNN models also showed lower values of RMSE for VI alone than for SB+VI, with a clear advantage for higher K values (e.g., 5 and 10).
In the boxplots for PH, there is a slightly lower averaged RMSE for RF when compared against the other models, but the combination of SBs + VIs seemed to lower the performance of the model. Beyond RF, the REPT and KNN models presented good results for the VIs dataset alone. Although some outliers were detected in the estimations, each boxplot was constructed over a 95% confidence interval. The overall performance of the best model (RF) presented an RMSE equal to 1.9 g kg−1 and 0.17 m for LNC and PH, respectively.
To better ascertain the relationship between predictions and measured variables, the regression of the overall three best methods for each variable (PH and LNC) was plotted. It used the configuration set #2 containing only the VIs as input variables (Figure 6).
The PH scatterplot in Figure 6 demonstrates how consistent the RF model was when predicting this variable. As for the LNC prediction, it is possible to notice that the two topdressing conditions of N fertilization rates are separated by the model. This is an important observation since it demonstrates that the RF approach was able to distinctly separate the low and high rate levels.
To better demonstrate the feasibility of the proposal, a map of the predicted values for both LNC and PH was constructed with the RF model using the VIs as input parameters (Figure 7). This map can provide a qualitative view of the result. Once trained, a machine learning model can perform inference over the image data, returning a visual representation of the area.
Table 2 ranks the contribution of individual attributes for RF for estimating LNC and PH. This metric was estimated with configuration #3 (VIs + SBs), considering all input data to compare the importance of both spectral bands and spectral indices to the model. The merit for each attribute indicated that VIs like NDVI, NDRE, and SAVI were more contributive than SBs. This result supports the earlier observation that the VIs configuration returned better accuracy than the SBs configuration. The merit score was obtained from the ranking-based approach described in the Materials and Methods section.

Discussion
The evaluation of multiple cultivars and different quantities of N fertilizer was implemented to simulate the characteristics encountered in most maize crops around Brazil. With this experimental design, using the spectral data in three distinct configurations, we investigated the performance of a set of machine learning algorithms, such as REPT, RF, KNN, SVM (with RBF and polynomial kernels), and LR. The learners returned similar predictions for each configuration set when predicting both LNC and PH. The adopted configurations indicated the importance of VIs in the prediction of these agronomic variables. In a direct analysis of the relationship between the variables (Figure 3), a negative correlation of LNC with NIR and VIs and a positive correlation with R were found. This observation differs from the literature [10][11][12][13][14][15][16][17], since the LNC is closely related to the influence of chlorophyll on these spectral variables. Still, this may be a particular case related to the field conditions of this study. Similar observations were noted in a previous study conducted in a citrus orchard, which used the same spectral bands from the Parrot Sequoia sensor [11].
The RF algorithm may follow a similar trend in nutrient analysis, since several recent studies concluded that this learner obtains optimal and balanced results with different spectral data. In similar research [52], an experiment conducted with maize and orbital multispectral imagery demonstrated that VIs showed strong performance. Another study [34], aiming to estimate maize characteristics, stated that the RF learner returned the highest accuracies among the evaluated algorithms. For N content, although not conducted in maize crops, multiple studies [25,33,53-56] also concluded that the RF learner, as well as other regressors based on decision trees, was appropriate to model LNC. In the presented approach, the errors obtained with this model are lower than or similar to those of the aforementioned studies.
RF is one of the most powerful methods in the current literature related to machine learning tasks [57][58][59][60][61][62]. The increase in data dimensionality is often seen as a problem for most traditional methods. In this study, however, the increase in dimensionality was necessary to improve the overall accuracy of this model. As for limitations, the major difficulty associated with this method, as well as with the other machine learning approaches, is the small amount of data [63]. However, as agronomic variables are onerous to obtain, tests with multiple repetitions and configuration sets were conducted to ensure the accuracy of this proposal. The strategy of adopting different configurations and repetitions should be further explored in future research where the number of instances is relatively small.
The performance of each algorithm was, as discussed, evaluated with different configurations. This analysis returned interesting outcomes, as the accuracy of the learners was better with the VIs as attributes (Figures 4 and 5). The importance of VIs as estimators of N and PH was evaluated in previous papers [18,52,64-67]. This is mainly because the VIs enhance some characteristics related to biological variables, such as chlorophyll content and biomass, which are highly correlated with LNC and PH. Nonetheless, when VIs are implemented in machine learning methods, it is difficult to understand their exact role in the model's predictions. However, by considering different scenarios and implementing the rank-based approach presented here, it is possible to shed some light on this process. The rank demonstrated that most contributions are provided by the VIs and, to a lesser extent, by the SBs with their respective surface reflectance values. This type of evaluation is important since it provides a means to indicate which input variables are more suitable to model the evaluated problem, which can reduce the amount of input data, resulting in an accurate and more rapid estimate.
As for the image itself, the major limitation of UAV image data collection is its limited capacity to cover and analyze larger areas. However, this type of aerial remote sensing is important when considering the spatial resolution and highly detailed information obtained on the vegetation cover, permitting an analysis at the plant or crop-plot level [68][69][70][71]. Additionally, by evaluating crops from an aerial view, it is easier to ascertain the relationship between spectral data and biophysical variables, since the end-user can reduce the amount of noise introduced in the system by extracting only the pixels corresponding to the canopy itself.
The approach presented here may also be implemented with different datasets over diverse areas, crops, and sensors. The approach of adopting multiple machine learning models and VIs could also be used to predict other agronomic variables, such as other macronutrients and micronutrients. Previous experiments already suggest the possibility of inferring other nutrients with spectral data from proximal sensors [12,62,72]. In this regard, additional experiments could consider multispectral data from sensors embedded in UAVs. Here, the particular objective was to investigate the contribution of multispectral data in machine learning methods to the prediction of nutrient content (N) and height (PH). The advantage of LNC and PH prediction with UAV-based images is that it provides a rapid and cost-efficient means for the recurrent monitoring of agricultural landscapes. However, the traditional agronomic method should not be substituted but rather assisted by remote sensing technologies and computational techniques such as the ones indicated here.

Conclusions
In this study, a machine learning approach was implemented to estimate LNC (g kg−1) and PH (m) for maize plants. It was tested whether the models are affected by different combinations of SBs and VIs as input data. It was also demonstrated which of the implemented learners is most suitable to predict both parameters (LNC and PH). The conducted experiment showed that the RF algorithm performed best, with RMSE equal to 1.9 g kg−1 and 0.17 m for LNC and PH, respectively. The VIs contributed more to the algorithm's performance than the SBs. This paper concludes that the proposed approach of machine learning models is appropriate to predict these agronomic variables. This method may be used in research that intends to evaluate different types of crops, or applied in precision agriculture practices to assist decision-making models. Regardless, future experiments should be conducted under more practical conditions.