Quantifying Leaf Chlorophyll Concentration of Sorghum from Hyperspectral Data Using Derivative Calculus and Machine Learning

: Leaf chlorophyll concentration (LCC) is an important indicator of plant health, vigor, physiological status, productivity, and nutrient deficiencies. Hyperspectral spectroscopy at leaf level has been widely used to estimate LCC accurately and non-destructively. This study utilized leaf-level hyperspectral data with derivative calculus and machine learning to estimate LCC of sorghum. We calculated fractional derivative (FD) orders starting from 0.2 to 2.0 with 0.2 order increments. Additionally, 43 common vegetation indices (VIs) were calculated from leaf spectral reflectance factor to make comparisons with reflectance-based data. Within the modeling pipeline, three feature selection methods were assessed: Pearson’s correlation coefficient (PCC), partial least squares based variable importance in the projection (VIP), and random forest-based mean decrease impurity (MDI). Finally, we used partial least squares regression (PLSR), random forest regression (RFR), support vector regression (SVR), and extreme learning regression (ELR) to estimate the LCC of sorghum. Results showed that: (1) increasing derivative order can show improved model performance until certain order for reflectance-based analysis; however, it is inconclusive to state that a particular order is optimal for estimating LCC of sorghum; (2) VI-based modeling outperformed derivative augmented reflectance factor-based modeling; (3) mean decrease impurity was found effective in selecting sensitive features from large feature space (reflectance-based analysis), whereas simple Pearson’s correlation coefficient worked better with smaller feature space (VI-based analysis); and (4) SVR outperformed all other models within reflectance-based analysis; alternatively, ELR with VIs from original reflectance yielded slightly better results compared to all other models. measured LCC values. Results showed that the reflectance-based analysis yielded good performance with increasing derivative order until approximately 1.2 order, whereas the VI-based analysis showed decreasing performance (distribution of predicted LCC values showed outliers and skewness) of models with increasing derivative order.


Introduction
Demand for sustainable and high yield crops is continually increasing due to rapid population surge and climate change [1][2][3]. Cereal crops can play a significant role in meeting such demand [4]. Among many different cereals, sorghum (Sorghum bicolor) is an important crop in semi-arid environments due to its high drought, heat, and water tolerance [5,6]. However, accurate genomic selection is indispensable to increase the yield and stress tolerance [7,8], which heavily relies on different phenotypic traits collected at plant breeding stations [9][10][11]. Leaf chlorophyll concentration our knowledge, we have not found any studies that utilized FD augmented hyperspectral data for estimating either LCC or any other biochemical properties from sorghum.
Machine learning (ML) algorithms play an important role in estimating crop LCC and other phenotyping traits from either hyperspectral spectroscopy or multi-sensor imageries. For example, multiple linear regression (MLR) [50][51][52][53], partial least squares regression (PLSR) [49,[54][55][56][57], random forest regression (RFR) [14,58,59], support vector machine based regression (SVR) [48,54,57,58], and back propagation neural networks (BPNNs) [14,35,50] have shown incredible performance in estimating LCC of different crops. Recently, extreme learning machine based regression (ELR) [60] has been found to be an efficient and rapid learning algorithm for regression, which outperformed some other ML algorithms for many practical applications [61][62][63][64]. In addition to model training, feature selection is a crucial step before starting any ML pipeline. For example, there could be varying results depending on what feature selection method and how the method is implemented with the training data. However, there has not been any comprehensive study that compares the performance of several ML algorithms in terms of derivative-augmented hyperspectral data for phenotypic trait estimation.
The goal of this study is to investigate the influence of derivative calculus on hyperspectral reflectance data for estimating LCC of sorghum. We asked the following research questions to achieve this goal: 1) Can derivative analysis better quantify LCC of sorghum among reflectance-based and vegetation index (VI)-based spectral data? 2) Which combination of feature selection and ML algorithm has better prediction capability? 3) Can common VIs better estimate LCC compared to reflectance spectra? In this study, we analyzed different derivative orders (including both integer and fractional orders), three feature selection methods (i.e., Pearson's correlation coefficient, variable importance in the projection, and mean decrease impurity), and four ML algorithms (i.e., PLSR, RFR, SVR, and ELR) for LCC estimation of sorghum.

Study Site and Plant Material
The study area (Figure 1) is the Transportation Energy Resources from Renewable Agriculture Phenotyping Reference Platform (TERRA-REF) field scanner ( Figure 1b) field site at the Maricopa Agricultural Center, Maricopa, Arizona, United States. The details of this field scanner system can be found in the research of Burnette et al. [65]. The experimental field site (33.070° N, 111.974° W, elevation 360 m) was planted on 3 August 2016 with two replicates of a Sorghum bicolor research population from Texas A&M (W Rooney) comprised of 173 recombinant inbred lines at the F-10 generation plus the parental lines SC56 and Tx7000. The field layout included 32 rows by 54 ranges in total, with the two outer lateral rows and end ranges as border plots to reduce edge effects. Border plots were excluded from any quantitative analysis. Experimental design followed a two-replicate alpha design with row-column constraint. Plots were four-row plots, 3.5 m long and 0.76 m row spacing, such that sorghum lines were evaluated in the two inner row subplots while the two outer rows were plot borders to reduce plot-to-plot edge effects. There were 350 total plots, where each plot had two subplots and was given unique identifiers. The field trial was managed for optimal growth. Initial irrigation was from sprinklers for emergence followed by subsurface drip lines.

Leaf Chlorophyll Concentration Measurements
In-situ ground LCC was collected using Dualex 4 Scientific (Figure 1c in yellow box, Force-A, France) handheld sensor for 394 sample leaves from 349 plots. The Dualex 4 Scientific instrument measures leaf chlorophyll index by using a red-edge band (710 nm) and a NIR band (850 nm), and estimates LCC in µg/cm 2 using a calibration coefficient [66]. Only sunlit representative leaves from each plot were selected for measurements. The LCC measurements were taken at noon on two days, (9 November 2016 and 11 November 2016) while the sorghum plants were at the grain development growth stage.

Hyperspectral Reflectance Measurements
Reflectance measures, or specifically, the hemispheric conical reflectance factor (HCRF, [67]) were collected using a Spectral Evolution portable spectroradiometer PSR-3500 (Figure 1c in blue box; Spectral Revolution, Inc., Lawrence, MA, USA) almost simultaneously with the Dualex measurements from the same sorghum leaves. Measurements were taken under clear-sky conditions near solar noon to minimize the disturbances from changes in sun angle and cloud or canopy shadow. The spectroradiometer has a spectral range of 350-2500 nm with a resolution of 3.5 nm in the 350-1000 nm range, 10 nm in the 1000-1900 nm range, and 7 nm in the 1900-2500 nm range. A reference spectrum taken from a 99% Spectralon calibration panel (Labsphere, Inc., North Sutton, NH, USA) was used to normalize leaf spectral measurements to reflectance factor. Calibration panel readings were repeated for every 15 min to readjust the baseline to account for any changes in illumination condition. A leaf clip with a bifurcated fiber-optic and a 5-watt tungsten halogen lamp light source was used to record leaf reflectance factor with a black background. With pre-configured settings, the PSR-3500 spectroradiometer averaged 40 readings automatically for each sample. The spectral reflectance factor, referred to as the reflectance hereafter, was interpolated to 1 nm, which resulted in 2151 individual spectral bands.

Fractional Derivative Calculation
Fractional-order derivative has been utilized as a tool to extract useful and sensitive information in many fields of signal processing [68][69][70]. Although fractional derivative (FD) refers to derived integer-order derivative into any positive order, the calculation of FD is complex and several algorithms exist to calculate. However, Riemann-Liouville, Grunwald-Letnikov, and Caputo are the three most frequently used classic definitions [71][72][73][74]. We adopted the Grunwald-Letnikov (G-L) definition to calculate FD at different orders in this study due to its specifically simple formula and coefficients [75]. The G-L definition is generally expressed as Equation (1): where is any order, ℎ is the step size, and are the upper and lower limits of the fractional order derivative, respectively. The G-L algorithm uses a Gamma function, which is expressed as Γ( ) = exp(− ) = ( − 1)! . Considering the resampling interval of spectral reflectance as 1 nm and ℎ = 1, the derived difference in the fractional order derivative of single variable function ( ) can be expressed as Equation (2): We considered calculating FD orders from 0.2 to 2.0 with 0.2 order increments. Therefore, 10 different orders were calculated from the spectral data using the G-L algorithm. A Python package named "differint" [76] was used to calculate the FD augmented spectral data.

Calculation of Vegetation Indices
Hyperspectral narrow band vegetation indices (VIs) are commonly used to estimate different crop biophysical and biochemical properties. We selected 43 common VIs (Table 1) based on studies that estimated different plant biochemical traits.

Feature Selection Methods
Feature selection is one of the most important pre-processing steps before performing any ML regression or classification pipeline [104][105][106]. Since hyperspectral data usually contain a large number of features (i.e., wavelengths), it is ideal to reduce the number of features by selecting the most sensitive features. Our spectral data contained reflectance values for wavelengths from 350-2500 nm with 1 nm intervals, which resulted in 2151 features. Therefore, dimensionality reduction by selecting features that were sensitive to LCC was a necessary step. Other than assessing the impact of FD in estimating LCC using different ML algorithms, we also focused on the effect of different feature selection methods and number of features within the pipeline. We used three common feature selection methods: Pearson's correlation coefficient (PCC), partial least square based variable importance in the projection (VIP), and random forest based mean decrease impurity (MDI) to rank the importance of features.

Pearson's Correlation Coefficient (PCC)
Pearson's correlation coefficient (PCC) is a measure of the linear dependence between two random variables, which is formally defined as the covariance of the variables divided by the product of their standard deviations. The calculation of PCC ( ) can be expressed as Equation (3): where ̅ = ∑ and = ∑ denote the mean of and , respectively, with sample size. The coefficient ( ) ranges from -1 to 1 and it is invariant to linear transformations of either variables. The feature importance scores were calculated based on the absolute value of PCC.

Variable Importance in the Projection (VIP)
Partial least squares (PLS) regression is a common regression technique which is based on explanatory variables that have maximal covariance with the target variable. However, a key feature of PLS regression is that the importance of explanatory variables in predicting the target variable can be quantified by a metric called variable importance in the projection (VIP). The VIP score measures the explicative power of explanatory variables with respect to the target variable which is based on the PLS regression. Feature selection using VIP has been utilized in several studies related to remote sensing of vegetation [107][108][109][110]. According to Eriksson and colleagues [111], the VIP score for the th variable for target variable can be computed with Equation (4): where = 1,2 … , , which is the number of PLS components, is the number of columns of (i.e., features or wavelengths), is the loading weight of the th variable in the th component, and , , and are the th column vectors of , , and , respectively. Here, contains theweights defining the common latent variable space relating and , and holds the loading vectors that best represent the space. The variable with a higher VIP score shows the relevancy of using that variable to predict the target variable.

Mean Decrease Impurity (MDI)
Random forest is an ensemble learning technique based on randomized decision trees and impurity measurements [112] that can provide different feature importance measures. One such technique is known as Gini importance or mean decrease impurity (MDI), when the random forest uses Gini index as its impurity measurement. Breiman [112] proposed to evaluate the importance of a variable for predicting (i.e., LCC) by adding up the weighted impurity decreases ( ( )Δ ( , )) for all nodes where is used and averaged over all trees in the forest as in Equation (5): where ( ) is the proportion / of sample reaching , and ( ) is the variable used in split . Few studies have implemented MDI scoring for feature selection in ML pipeline [108,113,114]. In our study, the MDI score of a variable (i.e., wavelength or VI) represents the corresponding importance estimating LCC.

Machine Learning Algorithms
In the plant phenotyping community, several machine learning (ML) algorithms have become popular in terms of both accuracy and computational efficiency [115]. We investigated four commonly used ML regression techniques (i.e., PLSR, RFR, SVR, and ELR) for estimating LCC from reflectance and VI-based spectra with derivative analysis. PLSR is a multivariate calibration technique that uses component projection to reduce the full feature space to a smaller number of noncorrelated features (also known as latent variables) containing the most useful information [116]. Therefore, PLSR was found to be very effective when the feature space is large, and multicollinearity exists within different features [117]. RFR is an ensemble-learning algorithm that accumulates a large set of decision trees, which are a hierarchically organized set of conditions or restrictions [118]. The process starts with fitting decision tree to randomly drawn samples and for each tree node a subset of input features is selected. Due to random selection of features in each tree, RFR is tolerant to outliers and noise [119]. SVR is the regression implementation of support vector machine (SVM). SVM transforms the non-linear regression problem to a linear one by utilizing different kernel functions. These functions then map the original input space into a high-dimensional feature space to find unique global solutions that are not exploited by multiple local minima [120]. ELR is the regression version of extreme learning machine (ELM), which is a feed-forward neural network with one input layer, one hidden layer, and one output layer [60]. ELM can provide high computational efficiency because the hidden node parameters are generated randomly [8].

Modeling Pipeline and Evaluation
An automated modeling pipeline was developd ( Figure 2) to train different ML regression techniques. After creating both reflectance-based and VI-based derivative order datasets, the modeling pipeline started with dividing the dataset into training (n = 244) and validation (n = 105) sets by a 70%/30% split. The validation set was kept completely outside of the feature selection and model training parts, and only utilized during the final model evaluation step. Since different derivative orders had different ranges of reflectance values, the features were scaled from 0 to 1 before any modeling steps. Feature importance scores were calculated using three feature selection methods (i.e., PCC, VIP, and MDI). Since both VIP and MDI were required to train PLSR and RFR models first, the training parameters were selected based on a grid search algorithm and 10-fold cross-validation. Based on different feature importance scores, different groups of features were extracted from different derivative orders. For reflectance-based analysis, 25, 50, 75, 100, 125, 150, 175, and 200 feature groups were created, whereas for VI-based analysis, 5,10,15,20,25,30,35, and 40 feature groups were extracted. Each of these groups from different FD orders were input data for ML algorithms. The feature groups from both reflectance-based and VI-based data were trained with different ML models. Since different models require different training parameters, we carefully selected different ranges of model parameters for PLSR, RFR, SVR, and ELR based on extensive literature survey, and applied a grid search algorithm to select the best combination of model parameters. The grid search was performed with a 10-fold cross-validation and mean squared error (MSE) was selected as the scoring criteria. Therefore, the combination of parameters resulting in the lowest MSE was considered as the optimal parameters for the model. Each feature group from different derivative orders was processed through this technique and the average MSE score for each input set was retained. Finally, the combination of feature selection method and number of features that showed the lowest MSE score within a particular derivative order and ML algorithm was used for model evaluation with the validation set. The modeling pipeline was implemented in Python and the ML algorithms were utilized from the "Scikit-learn" package [121]. The evaluation of model performance was conducted by using the coefficient of determination ( ), root mean squared error ( ), and relative RMSE ( %). The equations are as follows: where = 1,2,3, … … . . , is the validation sample, and represent the predicted and measured LCC values, respectively, and is the average of each measurable variable.

Descriptive Statistics of Collected Samples
The descriptive statistics and distribution of sample LCC are presented in Table 2

Spectral Features After Fractional Derivative Analysis
Fractional derivative-augmented spectra showed varying spectral patterns with increasing fractional orders. Figure 4 shows such pattern from the corresponding reflectance spectra of three LCC samples: the minimum (red line), maximum (green line), and median (blue line) LCC values. In the case of the original spectral reflectance factor (Figure 4a, from here on the reflectance factor will be denoted as original spectra), the maximum LCC spectra showed higher reflectance peaks at the NIR region (around 750-800 nm) compared to the other two spectra. However, the difference between these reflectance peaks at the NIR region started to diminish with fractional derivative analysis, specifically after 0.4 order (Figure 4c). Derivative treatment also led to increase in the reflectance value with increasing derivative orders exponentially, which allowed the derivative spectra to be sensitive to subtle changes in the reflectance factor. Based on Figure 4, the number of reflectance and absorption peaks increased with incremental derivative orders compared to original spectra. For example, a subtle reflectance peak in the original spectra of maximum LCC sample (green line in Figure 4a) at around 800-1000 nm is amplified in the 0.2 derivative order spectra (Figure 4b). The FD treatment to reflectance spectra enabled sensitive features to become more significant by increasing the derivative reflectance value at certain bands (e.g., the derivative reflectance curve showed a sharp change at around 1000 nm in 0.4 order, Figure  4c), whereas the less sensitive bands were found comparatively lower in their derivative reflectance values.

Feature Importance Scores
The relationship between different features and the target variable in this study (i.e., LCC) is a crucial step before performing model training. Figure 5 shows such a relationship between features from different derivative orders and LCC based on Pearson's correlation coefficient (Pearson's R). Pearson's R ranges from -1 to +1, which represent the negative and positive relationships, respectively. All correlation coefficients from different features and LCC values were tested at the 0.01 significance level (99% confidence); they are shown in Figure 5 for each derivative order. Overall, there were negative correlations between original reflectance spectra and LCC at around 750 nm, 1400 nm, and 2000 nm wavelengths, where some features passed the significance test (Figure 5a). Very few features in original spectra showed a positive correlation and not a single feature with a positive correlation showed statistical significance (Figure 5a). However, with increasing derivative orders, both the correlation coefficient and number of features passing the significance test increased (Figure 5b-k). The highest correlation (both positive and negative) was found at around 700-750 nm range from 1.0 and 1.2 order derivatives. After 1.4 order, the overall correlation and number of significant features started dropping and the correlation curve became noisy (Figure 5i-k). Figure 5. The correlation coefficient between LCC and original spectral data (a) and fractional derivative augmented spectra (b-k). The dashed lines in each plot represent the limit of statistical significance at 99% confidence. The data points located beyond these limits are significantly correlated with LCC. With increasing derivative order, several wavelengths showed increased and statistically significant correlation coefficients (c-h). However, from order 1.6 (i), the pattern of correlation becomes noisy and insignificant. Figures 6 and 7 show the feature importance scores from different feature selection methods for reflectance spectra and VIs, respectively. The PCC, VIP, and MDI scores were scaled in the range of 0-1 to make uniform comparison of scores between each derivative orders and feature selection methods. An important consideration for this analysis was that the scores were calculated using only the training samples, whereas the validation set was set aside for further model evaluations. In terms of reflectance-based feature importance scores (Figure 6), the PCC tended to extract sensitive features from around 700 nm, 1400 nm, and 1800-2400 nm range (Figure 6a). With increasing derivative orders, the most important features were concentrated at around 700 nm and 1400 nm, and after 1.6 order, the pattern of important features became noisy. The VIP scores (Figure 6b) showed a similar pattern of feature importance as the PCC, however, the values were slightly different. Alternatively, in terms of MDI (Figure 6c), the feature importance was more discrete than PCC and VIP. The MDI resulted in important features at the 2000 nm region for original spectra (order 0.0), whereas with increasing derivative orders, the important features were found at around 700-800 nm region. Usually this region is considered as the NIR region and the correlation between features at this region and LCC was found significantly improving with increasing derivative orders ( Figure 5). However, the MDI highlighted unique wavelengths as a result of very clear and sharper increase in feature importance. Feature importance scores from the VIs are shown in Figure 7. The scores are shown for all 43 features with different feature selection methods, however, the order of the features in Figure 7 does not represent any logical meaning. Similar to the reflectance-based feature importance scores, the PCC and VIP showed similar patterns of feature importance with different derivative orders. The PCC tended to highlight several important features even with original spectral data (order 0.0), for example, Cart4, Datt1, MTCI, NDVI, REP, RIdb, SR750/710, VOG2, and VOG3 were found as showing higher scores. With increasing derivative orders, the scores for different features became noisier (Figure 7a,b). In terms of MDI (Figure 7c), very few features were highlighted in each derivative order, for example, only Vog2 and Vog3 were found highly important in original spectra, with order 0.2 and order 0.4, respectively. After order 1.4, the number of important features increased abruptly.

Model Results of LCC Estimation
ML models (i.e., PLSR, RFR, SVR, and ELR) were trained with every possible combination of feature selection methods and number of feature groups. Model evaluation metrics (i.e., R 2 , RMSE, and RMSE%) were only calculated for the combination of feature selection method and number of features that yielded the lowest cross validation MSE score from the training set. These metrics were calculated with the validation dataset and all derivative orders of two different datasets: reflectancebased and VI-based spectra. The validation metrics of LCC estimation are demonstrated in Table 3.
In addition, the model R 2 and RMSE are illustrated in Figure 8 with respect to different derivative order.
In terms of reflectance-based analysis (Figure 8a,c), the derivative order of 1.0 showed superior performance with all four models (R 2 ranging from 0.578 to 0.734 and RMSE% ranging from 8.125 to 10.227). The predictive performance of all models showed improvement with increasing derivative order up to a particular point. For instance, PLSR (R 2 of 0.701 and RMSE% of 8.603) showed the highest result at order 0.2, RFR (R 2 of 0.683 and RMSE% of 8.865) and SVR (R 2 of 0.734 and RMSE% of 8.125) yielded peaks at order 1.0, and ELR (R 2 of 0.704 and RMSE% of 8.567) performed the best at order 0.4. After the respective orders, each model started to decline in their performance (Figure 8a,c). Overall, the SVR showed consistently good performance until the derivative order reached 1.8 (R 2 ranging from 0.457 to 0.734 and RMSE% ranging from 8.125 to 11.605). Table 3 also shows the best combination of feature selection method and number of features for each model and derivative order. The best performing model within the reflectance-based analysis (i.e., SVR with order 1.0) used 75 features selected by MDI. Overall, the MDI was found as the optimal feature selection method for most of the well performed models.   ., b, d). The 0.0 order in the x-axis represents the original spectra without any derivative treatment. For reflectance-based modeling, model performance increases; however, the performance starts decreasing after certain derivative order. For VI-based modeling, the model performance was found better with original spectra (order 0.0). With increasing derivative order, model performance starts declining.
Alternatively, when VIs were used as input features instead of reflectance spectra for different derivative orders, the highest performance was observed at original spectra (R 2 ranging from 0.618 to 0.744 and RMSE% ranging from 7.971 to 9.734). The best performing model was found with ELR at original spectra (R 2 of 0.744 and RMSE% of 7.971), which was even higher than the best model found with reflectance-based analysis (i.e., SVR at order 1.0 resulting in R 2 of 0.734 and RMSE% of 8.125). The ELR with original spectra used 15 features as input which were selected by PCC. Overall, most of the well-performing models at lower derivative orders showed PCC as an optimal feature selection method. However, according to Figure 8b, the model performance decreased with increasing derivative orders within the VI-based analysis. Therefore, the LCC estimation worked better with derivative spectra at 1.0 order when direct reflectance from wavelengths was used, whereas the original spectra showed good performance when the model inputs were VIs.
The distributions of predicted LCC values using different models, derivative orders, and feature types (i.e., reflectance-based or VI-based) with validation dataset are illustrated in Figure 9. The boxplots with different models show how different the distribution of predicted LCC values is with measured LCC values. Results showed that the reflectance-based analysis yielded good performance with increasing derivative order until approximately 1.2 order, whereas the VI-based analysis showed decreasing performance (distribution of predicted LCC values showed outliers and skewness) of models with increasing derivative order.

Performance Analysis of Derivative Spectra and VIs in LCC Estimation
The derivative calculus including both integer-orders and fractional-orders, has proven to be an effective tool for analyzing spectra in many fields. Although many studies have utilized first-order and second-order derivatives in estimating vegetation spectra, very few studies have utilized fractional derivative in analyzing hyperspectral reflectance of crop leaves. To our knowledge, only one study from Chen, Li, and Tang [47] found 0.6 order spectra that resulted in superior performance in estimating nitrogen concentration of natural rubber (Hevea brasiliensis). Additionally, Wang, Zhang, Kung, and Johnson [35] reported that 1.2 order fractional derivative of hyperspectral data yielded the best results for estimating soil organic matter content. Fu, Xiong, and Tian [39] conducted a similar investigation and showed that FD analysis can increase the correlation coefficient between FD-augmented spectra and soil organic matter content. However, Fu, Xiong, and Tian [39] did not conclude with any single order that provided the best result in predictive analytics. The results from our study also dictate that when reflectance spectra are used in modeling LCC, derivative calculus can significantly increase the correlation between LCC until a certain order ( Figure 5). However, different models yielded their best performance at different orders. For example, both SVR and RFR had higher model performance at order 1.0, but PLSR showed its best performance at order 0.2. We also found that the best performance was retrieved from 1.0 order (i.e., first order) with SVR model when reflectance spectra were used as model input. However, the second highest model performance from 0.8 order with SVR (R 2 of 0.729 and RMSE% of 8.201) used fewer features (n = 25) compared to the highest performing model that used more features (n = 75), yet the results were only slightly less than the best model. Therefore, we find it inconclusive to state that either fractional derivative or integer-order derivative is better in estimating LCC from sorghum.
Derivative calculus augmented spectra have the capability to extract more useful information from hyperspectral data since the order is extended arbitrarily to non-integers as well as integers [29,30,34]. This process increases the possibility of highlighting more detailed features within the limits of integer derivatives. For example, Figure 10 shows reflectance spectra of a sample leaf (i.e., from the median LCC value of 50.5 µg/cm 2 ) without any derivative analysis (i.e., original spectrum, Figure 10a) and derivative spectrum from order 0.2 to 2.0 with a smaller spectral window (i.e., the NIR region of 700-1000 nm). The selected features with the best models found at each derivative order are also highlighted. Figure 10 is a close-up version of Figure 4 that highlights how the increasing derivative order amplify certain information in the spectral curve and how important features are then selected by different feature selection methods. According to Figure 10a, the original spectra show an increasing slope until around 760 nm and start to flatten out until 1000 nm. With increasing derivative order, the flatten curve starts to show abrupt peaks on it and the important features start to appear in a distributed manner. For example, with order 0.6 (Figure 10d), important features are seen all over the spectrum instead of clustering at the lower end of the spectrum as in the case of original spectra (Figure 10a). This is the reason that the correlation coefficient between LCC and derivative spectra significantly increased with increasing derivative orders ( Figure 5). Alternatively, use of VIs has been considered as a convenient and powerful feature for estimating different plant characteristics from spectral data. Many studies have reported the good predictive capabilities of using VIs in predicting leaf biochemical properties [122][123][124][125][126][127][128]. We have also found superior results with VIs instead of performing any derivative augmentation (i.e., the best performing model was from 15 VIs). Figure 11 shows those VIs that resulted in the best performing model using ELR and original spectra. These VIs were selected by PCC as the feature selection method. One possible reason behind VIs showing the best performance could be that VIs were developed to enhance certain vegetation information. LCC is considered as one of the major leaf pigments that reflects the photosynthetic ability and overall health status of a plant [129]. Although the VIs selected for this study were based on a wide literature survey, most of the VIs were found highly sensitive to LCC. For example, the highest correlation was found for the Red-Edge Position (REP) index ( Figure 11, equation in Table 1) developed by Clevers [92]. This index highlights reflectance from the red-edge position of the spectrum and simplifies the spectral curve to a straight line between 700 nm and 740 nm. The reflectance of the REP was then estimated as being half of the reflectance in the NIR at about 780 nm and the reflectance minimum of the chlorophyll absorption feature at around 670 nm. By highlighting the chlorophyll absorption band, this index provided the best score and can be used as a potential feature. Other VIs were also closely related to LCC and other leaf biochemical parameters which helped different models to estimate LCC using the original spectra instead of using VIs from derivative transformed spectra. An advantage of using VIs instead of reflectance is that they reduce the feature space and increase model computational efficiency. However, the selection of VIs is very important for estimating certain plant characteristics.

Impact of Feature Selection Methods in Modeling Pipeline
Selecting sensitive features for modeling any biochemical properties is crucial, especially when the input feature space is large. Use of PCC (or absolute value of Pearson's correlation coefficient) is very common for sensitive feature selection in the plant science community. However, we also explored the effectiveness of the VIP score from PLSR and MDI score from RFR. Results suggested that in terms of reflectance-based analysis, MDI worked better since the scores were not saturated over the spectrum for different derivative orders. MDI score is calculated based on node impurity, which is a measure of homogeneity of the variable. With increasing order, the difference between each feature range increases a lot with abrupt changes which increases the impurity in the feature space. That is why MDI can unilaterally pick important features from the spectrum at large distances with increasing orders. With PCC, there exists the chance of multicollinearity (correlation among features) which results in similar feature importance score for adjacent bands in our analysis. On the other hand, since MDI is a tree-based scoring measure, the multicollinearity problem was avoided.
On the other hand, in terms of VI-based analysis, PCC tends to pick up sensitive features distributed over all the available VIs. Since some of the individual VIs showed higher correlation compared to individual original spectra or derivative augmented spectra, PCC was able to pick up important features for modeling. An advantage of PCC is that it does not need to train any model, whereas both VIP and MDI scores were calculated after training PLSR and RFR models, respectively.

Performance of Machine Learning Models in LCC Estimation
In plant science and remote sensing communities, the PLSR, RFR, and SVR have proven to be effective machine learning models for estimating biochemical properties. Recently, ELR has been utilized as a potential machine learning method in regression problems due to its enhanced computational efficiency [58,63]. Our study reveals that within reflectance-based analysis, SVR consistently outperformed all other models in every FD order. Many studies have also reported the superior predictive capability of SVR in estimating crop phenotypic traits [130][131][132]. This can be attributed to the high generalization ability of SVR by providing a global minimum solution [131,133]. ELR also performed well at 0.4 order derivatives but started to decrease its performance with increasing derivative order. Although the difference between model evaluation metrics of different orders is small in some cases (e.g., SVR at order 1.0 and SVR at order 0.8), it has to be noted that the model training was performed with a 10-fold cross validation and the evaluation metrics were calculated using a validation dataset which was completely independent of the training dataset and only used for model evaluation. This showed the robustness of the trained models. Arguably, our study concluded that SVR from order 1.0 is better than SVR from order 0.8. However, for future studies with other crops or other study areas, the result may vary, so careful design of the modeling pipeline is required before making such an inference.
In terms of VI-based analysis, most of the model performed well with original spectra. The best performing model within both VI-based and reflectance-based analysis was found with ELR from reflectance. However, the ML models in terms of VI-based analysis did not perform well with increasing derivative orders. The reason behind this is that VIs were developed to amplify certain information from vegetation spectra rather than derivative-augmented spectra. Therefore, the derivative-augmented spectra were already amplified with different orders and when VIs were calculated from these derivative-augmented spectra, more noise was introduced to the feature space. This resulted in continuous poor performance of models over increasing FD orders. Therefore, it is not advisable to calculate VIs from derivative-augmented spectra.

Conclusions
Accurate and non-destructive measures for estimating LCC for sorghum is an important step to support plant breeders and genetic selection studies. This study investigates the effectiveness of derivative calculus and machine learning models in estimating LCC of sorghum from hyperspectral spectroscopy. Major conclusions include: 1. In terms of reflectance-based analysis, increasing derivative order can show improved model performance until a certain order; however, it is inconclusive to state that a particular derivative order is optimal for estimating LCC of sorghum. Further assessment with data from multiple study sites and growth stages is required to make such an inference. 2. VI-based modeling with original spectra outperformed reflectance-based modeling with derivative-augmented spectra. 3. Sensitive feature selection is a crucial step in any machine learning pipeline. MDI score was found effective in selecting sensitive features from a large feature space (reflectance-based analysis), whereas PCC worked better with a smaller feature space (VI-based analysis). 4. When single wavelengths were used in the analysis from different FD orders, SVR outperformed all other models. However, PLSR and ELR required fewer model parameters and computational time, which can be advantageous in model training. Alternatively, ELR with VIs from original spectra yielded slightly better results compared to all other models. Therefore, ELR worked better when hand-crafted features (VIs) were used. The findings from this study will help plant breeders and scientists in estimating LCC for sorghum non-destructively and efficiently. It also demonstrates a potential framework for how to prepare a semi-automated machine learning pipeline that highlights robust data processing, feature selection, model training, and model evaluation techniques, which can be adopted to other plant phenotypic estimation studies as well. Our next steps and future work will include data augmentation and transferring the pipeline to hyperspectral imagery collected from unmanned aerial vehicle platforms to estimate LCC and other biochemical properties of sorghum.