Stress Distribution Analysis on Hyperspectral Corn Leaf Images for Improved Phenotyping Quality

High-throughput imaging technologies have been developing rapidly for agricultural plant phenotyping purposes. With most of the current crop plant image processing algorithms, the plant canopy pixels are segmented from the images, and the averaged spectrum across the whole canopy is calculated in order to predict the plant’s physiological features. However, the nutrients and stress levels vary significantly across the canopy. For example, it is common to have several times of difference among Soil Plant Analysis Development (SPAD) chlorophyll meter readings of chlorophyll content at different positions on the same leaf. The current plant image processing algorithms cannot provide satisfactory plant measurement quality, as the averaged color cannot characterize the different leaf parts. Meanwhile, the nutrients and stress distribution patterns contain unique features which might provide valuable signals for phenotyping. There is great potential to develop a finer level of image processing algorithm which analyzes the nutrients and stress distributions across the leaf for improved quality of phenotyping measurements. In this paper, a new leaf image processing algorithm based on Random Forest and leaf region rescaling was developed in order to analyze the distribution patterns on the corn leaf. The normalized difference vegetation index (NDVI) was used as an example to demonstrate the improvements of the new algorithm in differentiating between different nitrogen stress levels. With the Random Forest method integrated into the algorithm, the distribution patterns along the corn leaf’s mid-rib direction were successfully modeled and utilized for improved phenotyping quality. The algorithm was tested in a field corn plant phenotyping assay with different genotypes and nitrogen treatments. Compared with the traditional image processing algorithms which average the NDVI (for example) throughout the whole leaf, the new algorithm more clearly differentiates the leaves from different nitrogen treatments and genotypes. We expect that, besides NDVI, the new distribution analysis algorithm could improve the quality of other plant feature measurements in similar ways.


Introduction
High-throughput imaging sensor technologies have been well established for measuring crop growth status in fields or indoors. Plant imaging models have been developed to predict plants' physiological features, such as the plant's biomass [1], field yield [2], chlorophyll content [3], nitrogen content [3], relative water content [4,5], and disease symptoms [6,7]. In conventional plant phenotyping analyses, most studies rely on the averaged spectrum across all the segmented plant pixels of the whole plant [2,4,[8][9][10][11][12]. Researchers firstly segment the target plant tissue from the image. Then, the averaged spectrum across all the plant pixels is utilized for prediction purposes. However, this conventional pixel level, with machine learning approaches. We took the normalized difference vegetation index (NDVI), one of the most popular vegetative indices [30,31], as an example: Hyperspectral images of top-collared corn leaves were collected. The images were processed to generate high resolution NDVI heatmaps of each individual leaf. NDVI distribution patterns along the leaf's mid-rib direction were successfully modeled with selected machine learning algorithms. The NDVI distribution pattern was then combined with the spectral information for a more precise assessment of the nitrogen stress level. The result showed that, compared with the traditional algorithms which use an averaged spectrum, the new distribution analysis algorithm was able to improve the phenotyping quality significantly. The plots of the B73xMo17 genotype were equally divided into two groups: a high nitrogen treatment group (H), treated with 250 kg/ha; and a low nitrogen treatment group (L), treated with 0 kg/ha. The P1105AM genotype plants were all high nitrogen treated with 250 kg/ha There were four plots of replicates for each group, and eight plants were evenly selected from each plot as samples. In total, each group had 32 replicates. In this study, the B73xMo17 plants were labeled as the control group used for the model construction. The P1105AM plants had a high nutrient use efficiency property [32], and they were labeled as the validation group for the performance evaluation.

Handheld Hyperspectral Device and Plant Sampling
The leaf-scale hyperspectral images of the corn plants in this experiment were collected with a portable hyperspectral corn leaf imager, LeafSpec, developed by the Purdue Phenotyping Lab group [33]. As shown in Figure 1, the LeafSpec device takes a hyperspectral image of the entire leaf by smoothly sliding the leaf through it. The imaging process takes about 5 s for each specimen [1]. LeafSpec has a spectral range of 450 to 900 nm, and its sampling spectral resolution is 0.74 nm. It is a push broom imager with a spatial resolution of 878 pixels on each line ( Table 1). The device collects both spectra and morphological data for each leaf. The on-board microcontroller processes the images from the camera and sends the resulting data to a smartphone app in real time. The smartphone app then automatically uploads the geo-referenced phenotyping results (GPS location and timestamp information) to Purdue's GeoHub GIS system immediately after each measurement for map-viewing purposes. In this experiment, LeafSpec was used to scan the top matured leaf of each sample plant at the V8 stage, when the plant had eight leaves with visible leaf collars, from the leaf collar to the leaf tip. both spectra and morphological data for each leaf. The on-board microcontroller processes the images from the camera and sends the resulting data to a smartphone app in real time. The smartphone app then automatically uploads the geo-referenced phenotyping results (GPS location and timestamp information) to Purdue's GeoHub GIS system immediately after each measurement for map-viewing purposes. In this experiment, LeafSpec was used to scan the top matured leaf of each sample plant at the V8 stage, when the plant had eight leaves with visible leaf collars, from the leaf collar to the leaf tip.

Image Processing and Segmentation
After data collection, the raw hyperspectral images were first calibrated with a flat strip made with white Teflon material as the reference. The angle of the white reference was perpendicular to the view of the camera and lighting source. The image calibration was performed with the following equation Equation (1): where R cali is the calibrated image, R raw is the raw hyperspectral leaf image, R dark is the dark reference image, and R white is the hyperspectral image of the white reference. Then, the leaf tissue was segmented from the calibrated images using a segmentation procedure with a convolution methodology [12]. A vector of sequential integers from −20 to 20 was multiplied by the reflectance intensity vector from the red-edged region (680-720 nm). By choosing threshold 7 as the boundary between the plant tissue and the background, the plant tissue was successfully segmented, as shown in Figure 2. The spectral features of the leaf tissue (grey region in Figure 2) were used in this study. Due to the high resolution of the leaf images, the detailed spectra distribution features of the leaf were obtained.

Image Processing and Segmentation
After data collection, the raw hyperspectral images were first calibrated with a flat strip made with white Teflon material as the reference. The angle of the white reference was perpendicular to the view of the camera and lighting source. The image calibration was performed with the following equation Equation (1): where is the calibrated image, is the raw hyperspectral leaf image, is the dark reference image, and is the hyperspectral image of the white reference. Then, the leaf tissue was segmented from the calibrated images using a segmentation procedure with a convolution methodology [12]. A vector of sequential integers from −20 to 20 was multiplied by the reflectance intensity vector from the red-edged region (680-720 nm). By choosing threshold 7 as the boundary between the plant tissue and the background, the plant tissue was successfully segmented, as shown in Figure 2. The spectral features of the leaf tissue (grey region in Figure 2) were used in this study. Due to the high resolution of the leaf images, the detailed spectra distribution features of the leaf were obtained.

NDVI Calculation
Once the hyperspectral leaf images were processed and segmented for all the 64 (B73xMo17) and 32 (P1105AM) corn leaf samples, the Normalized Difference Vegetative Index (NDVI) Equation (2) was

NDVI Calculation
Once the hyperspectral leaf images were processed and segmented for all the 64 (B73xMo17) and 32 (P1105AM) corn leaf samples, the Normalized Difference Vegetative Index (NDVI) Equation (2) was calculated for each pixel. This index was reported to be related to the nitrogen status and chlorophyll content of the plant [34,35]. Since NDVI values range from 0 to 1, the jet color bar was set from 0 (blue) to 1 (red) for the NDVI heatmap image.
Sensors 2020, 20, 3659 5 of 13 where R 800nm and R 650nm are the reflectance values of wavelengths 800 nm and 650 nm, respectively [36,37]. One traditional way of evaluating the plant's nitrogen level is through calculating the average NDVI of each corn leaf Equation (3). The average NDVI was compared with the new stress distribution analysis algorithm's result.
where n is the total number of pixels from the segmented leaf image, and NDVI pixel is the pixel-level value.

Preprocessing of NDVI Images
To efficiently utilize the stress distribution information from the obtained NDVI images, we mainly focused on the mid-rib direction of the leaf (the horizontal direction (X) in Figure 3), which had the major variance in the stress distribution [20,38]. After averaging the NDVI values along each scanning line (the vertical direction (Y) in Figure 3), a further rescaling process was applied in order to rescale all the leaf images to the same length. Since the original leaf samples had different lengths, it was important to rescale them before comparing their distribution patterns. After rescaling, we equally divided the leaf image into 50 sections along the mid-rib direction (X) and calculated the average NDVI for each section. The changes in the NDVI across the 50 sections are shown in Figure 4. Figure 4 shows the different distribution patterns from the different stressed groups: the NDVI value gradually increased along the mid-rib direction and suddenly dropped near the leaf tip. Differences in the NDVI's changing ratio at different sections were observed between the different stressed groups.
One traditional way of evaluating the plant's nitrogen level is through calculating the average NDVI of each corn leaf Equation (3). The average NDVI was compared with the new stress distribution analysis algorithm's result.
where n is the total number of pixels from the segmented leaf image, and NDVIpixel is the pixel-level value.

Preprocessing of NDVI Images
To efficiently utilize the stress distribution information from the obtained NDVI images, we mainly focused on the mid-rib direction of the leaf (the horizontal direction (X) in Figure 3), which had the major variance in the stress distribution [20,38]. After averaging the NDVI values along each scanning line (the vertical direction (Y) in Figure 3), a further rescaling process was applied in order to rescale all the leaf images to the same length. Since the original leaf samples had different lengths, it was important to rescale them before comparing their distribution patterns. After rescaling, we equally divided the leaf image into 50 sections along the mid-rib direction (X) and calculated the average NDVI for each section. The changes in the NDVI across the 50 sections are shown in Figure 4. Figure 4 shows the different distribution patterns from the different stressed groups: the NDVI value gradually increased along the mid-rib direction and suddenly dropped near the leaf tip. Differences in the NDVI's changing ratio at different sections were observed between the different stressed groups.

Application of Machine Learning Algorithms
After preprocessing, the NDVI distribution pattern across the X direction was utilized for the improved assessment of the nitrogen stress level by training machine learning algorithms. The overall schematic of the distribution feature analysis procedure is shown in Figure 5.
The NDVI images were rescaled to 1 × 50 vectors containing the distribution information along the leaf's midrib direction. With 64 training samples from genotype B73xMo17, the final input data constituted a 64 × 50 matrix (Figure 5b). The desired output matrix was the nitrogen treatment (stressed vs. no stress). In this study, we had two nitrogen treatments: the high nitrogen treatment group (H), treated with 250 kg/ha, and the low nitrogen treatment group (L), treated with 0 kg/ha. The nitrogen treatments were labeled 1 and 0, representing high and low nitrogen treatment, respectively.

Application of Machine Learning Algorithms
After preprocessing, the NDVI distribution pattern across the X direction was utilized for the improved assessment of the nitrogen stress level by training machine learning algorithms. The overall schematic of the distribution feature analysis procedure is shown in Figure 5.

Application of Machine Learning Algorithms
After preprocessing, the NDVI distribution pattern across the X direction was utilized for the improved assessment of the nitrogen stress level by training machine learning algorithms. The overall schematic of the distribution feature analysis procedure is shown in Figure 5.
The NDVI images were rescaled to 1 × 50 vectors containing the distribution information along the leaf's midrib direction. With 64 training samples from genotype B73xMo17, the final input data constituted a 64 × 50 matrix (Figure 5b). The desired output matrix was the nitrogen treatment (stressed vs. no stress). In this study, we had two nitrogen treatments: the high nitrogen treatment group (H), treated with 250 kg/ha, and the low nitrogen treatment group (L), treated with 0 kg/ha. The nitrogen treatments were labeled 1 and 0, representing high and low nitrogen treatment, respectively.  The NDVI images were rescaled to 1 × 50 vectors containing the distribution information along the leaf's midrib direction. With 64 training samples from genotype B73xMo17, the final input data constituted a 64×50 matrix (Figure 5b). The desired output matrix was the nitrogen treatment (stressed vs. no stress). In this study, we had two nitrogen treatments: the high nitrogen treatment group (H), treated with 250 kg/ha, and the low nitrogen treatment group (L), treated with 0 kg/ha. The nitrogen treatments were labeled 1 and 0, representing high and low nitrogen treatment, respectively.
Finally, we obtained a 64×50 input matrix and a 64×1 output vector (Figure 5c), and we selected the four most commonly used supervised machine learning algorithms from the Scikit-learn library [39] and denoted them as the regression models ( Table 2). The prediction results of the regression models were then used to evaluate the nitrogen stress level. In order to avoid overfitting issues, we conducted a 10-fold cross-validation to evaluate each model's performance. The data were randomly divided into 10 subsets. The nine subsets were used to train the regression models and the left subset was used as Sensors 2020, 20, 3659 7 of 13 the validation set. The whole validation process was repeated 10 times. The final estimation error was averaged over all 10 trails for each model. Table 2. Overview of machine learning algorithms used for this study.

Regression Algorithm Description Reference
AdaBoost Adaptive Boosting (AdaBoost) is a generalized boost method which is an ensemble technique that attempts to create a strong classifier from several weak classifiers. [40] Logistic Regression Logistic Regression is a predictive analysis method usually used when the dependent variable is dichotomous (binary). [41] PLSR Partial Least Squares Regression (PLSR) is a method that performs least squares regression on new components after reducing original predictors to a smaller set of uncorrelated components. [42] Random Forest Random Forest is an ensemble technique performing both regression and classification tasks with the use of multiple decision trees with Bootstrap Aggregation. [43]

Hyperparameter Optimization
Hyperparameter optimization is the process that is used to maximize a supervised ML model's performance without overfitting, and it is vital in order to enhance the model's performance [23]. For optimizing the selected four ML models (AdaBoost, Logistic Regression, Partial Least Squares Regression (PLSR), and Random Forest), the grid search was selected and performed for the best tuning parameter values. Starting from the default setting in the Scikit-learn library, a set of candidates for the tuning parameter values are specified and then evaluated for each of the methods. In consideration of the efficiency and accuracy of the model searching, we only chose one hyperparameter of each ML method for the grid search. Taking Random Forest as an example, the maximum depth was chosen as the tuning parameter from values in a range of 1 to 10. During the whole parameter tuning process, each of the trained models was 10-fold cross-validated to prevent overfitting. Finally, the best hyperparameter for each model was determined based on the lowest cross-validated root mean square error (RMSE). RMSE is interpreted as the absolute measure of fit, which is commonly used in optimizing machine learning models. It is a good measure of how accurately the model predicts the response, and it is the most important criterion for fit if the main purpose of the model is prediction [44].

Metrices for Model Evaluation
To evaluate the algorithms' performance, we conducted the two-sample t-test on the predictions between the low and high nitrogen groups. We hypothesized that by incorporating the NDVI's spatial distribution information, the new models' results could more clearly separate the nitrogen treatments, compared with the traditional averaged NDVI.
In addition to the nitrogen stress assessment, the new model's prediction results were also used to separate the images from two different corn genotypes: (1) B73xMo17 (control); (2) P1105AM (high nutrient use efficiency). P1105AM is a commercial corn product from Pioneer, and it has been proven to have a high nutrient use efficiency in many studies because it can contribute to deep root system development and increased nutrient absorption by the plant [32]. Similarly, the two-samplet-test result was used to compare the new model's predictions with the conventional averaged NDVI value.

Software and Computation
In this study, the supervised machine learning analysis and figure plotting were performed in the Python version 3.7.2 software environment [45]. Machine learning algorithms were carried out with the package from the Scikit-learn machine learning library [39]. The dataset was analyzed and manipulated using Pandas [46] and Numpy [47]. The figures were drawn with Matplotlib [48]. All the computations were run on an HP 17 G3 Mobile Workstation (Hewlett-Packard, Palo Alto, CA, USA) equipped with 64-gigabytes (GB) of random-access memory (RAM) and a 2.70 GHz Intel®Core™ i7-6820HQ processor and a GPU of Nvidia Quadro P3000.

The Averaged NDVI Comparison
The traditional averaged NDVI values from the leaf images were compared between the different nitrogen treatments and genotypes. Based on the two-samplet-test, the high nitrogen group showed a significantly higher NDVI than the low nitrogen group (Table 3). However, the distributions of the averaged NDVIs were largely overlapped. Figure 6a shows the estimated probability density distributions of the averaged NDVI for the two nitrogen treatments by using the kernel density estimate (KDE) [29]. In this study, the supervised machine learning analysis and figure plotting were performed in the Python version 3.7.2 software environment [45]. Machine learning algorithms were carried out with the package from the Scikit-learn machine learning library [39]. The dataset was analyzed and manipulated using Pandas [46] and Numpy [47]. The figures were drawn with Matplotlib [48]. All the computations were run on an HP 17 G3 Mobile Workstation (Hewlett-Packard, Palo Alto, CA, USA) equipped with 64-gigabytes (GB) of random-access memory (RAM) and a 2.70 GHz Intel® Core™ i7-6820HQ processor and a GPU of Nvidia Quadro P3000.

The Averaged NDVI Comparison
The traditional averaged NDVI values from the leaf images were compared between the different nitrogen treatments and genotypes. Based on the two-samplet-test, the high nitrogen group showed a significantly higher NDVI than the low nitrogen group (Table 3). However, the distributions of the averaged NDVIs were largely overlapped. Figure 6a shows the estimated probability density distributions of the averaged NDVI for the two nitrogen treatments by using the kernel density estimate (KDE) [29].   Similarly, the P1105AM genotype showed a significantly higher averaged NDVI than the control genotype (Table 4). However, the distribution of the averaged NDVI values between the two genotypes was severely overlapped too (Figure 7a). This would lead to severe type 1 and type 2 errors in classifying the plants, either for nutrient or breeding studies. Therefore, this confirmed that the conventional averaged NDVI method is not satisfactory in this kind of plant analysis study. A finer level of image processing algorithm for nutrient and stress analysis is preferred for improved quality of measurement. Similarly, the P1105AM genotype showed a significantly higher averaged NDVI than the control genotype (Table 4). However, the distribution of the averaged NDVI values between the two genotypes was severely overlapped too (Figure 7a). This would lead to severe type 1 and type 2 errors in classifying the plants, either for nutrient or breeding studies. Therefore, this confirmed that the conventional averaged NDVI method is not satisfactory in this kind of plant analysis study. A finer level of image processing algorithm for nutrient and stress analysis is preferred for improved quality of measurement.

Results of Machine Learning Algorithms
The results of the four supervised learning models are summarized in Table 3. They showed that AdaBoost, Logistic Regression, PLSR, and Random Forest all had smaller two-sample t-test P-values than the averaged NDVI. Among them, Random Forest had the lowest P-value for classifying corn leaves from different nitrogen treatments. Besides this, the standard deviation of the prediction results was used to check the consistency of the model. Random Forest had the lowest and most consistent standard deviation values for both the high and low nitrogen groups. Taking both findings into consideration, the Random Forest algorithm delivered the best performance with the most significant difference and the greatest consistency.
To illustrate the performance of the predictions from the new Random Forest model, the distributions of the prediction results from the two nitrogen treatments are shown in Figure 6b. Compared with the distributions of the traditional averaged NDVI (Figure 6a), the new model's predictions were able to more clearly separate the two treatments, as shown in Figure 6b. The -log (P-value) of the two-sample t-test for the new model's predictions also increased to 9.519 from 5.845 when the traditional averaged NDVI was used.

Results of Machine Learning Algorithms
The results of the four supervised learning models are summarized in Table 3. They showed that AdaBoost, Logistic Regression, PLSR, and Random Forest all had smaller two-sample t-test P-values than the averaged NDVI. Among them, Random Forest had the lowest P-value for classifying corn leaves from different nitrogen treatments. Besides this, the standard deviation of the prediction results was used to check the consistency of the model. Random Forest had the lowest and most consistent standard deviation values for both the high and low nitrogen groups. Taking both findings into consideration, the Random Forest algorithm delivered the best performance with the most significant difference and the greatest consistency.
To illustrate the performance of the predictions from the new Random Forest model, the distributions of the prediction results from the two nitrogen treatments are shown in Figure 6b. Compared with the distributions of the traditional averaged NDVI (Figure 6a), the new model's predictions were able to more clearly separate the two treatments, as shown in Figure 6b. The -log (P-value) of the two-sample t-test for the new model's predictions also increased to 9.519 from 5.845 when the traditional averaged NDVI was used.

Model Evaluation for Different Corn Genotypes
Similarly, the prediction results of the new Random Forest model were also used to differentiate between the two corn genotypes: (1) B73xMo17 (control); (2) P1105AM (high nutrient use efficiency). The statistical results are summarized in Table 4. The results showed that the genotype difference was more significant with the new model's prediction result (p-value = 0.004), while it was less significant with the conventional averaged NDVI (p = 0.035). Since the genotype difference is usually less significant in plant breeding studies [49], it indicated that the new model was better at detecting weaker signals compared to the traditional averaged NDVI.

Discussion and Conclusions
In most of the existing hyperspectral image process algorithms, the spectra across all the plant pixels are averaged in order to predict the plant's physiological features. Due to the high variance in the nutrient and stress levels between different locations on the same leaf, the averaged spectrum might not provide a satisfactory degree of plant measurement quality.
In this paper, a new hyperspectral image processing algorithm based on Random Forest and leaf-region rescaling was introduced in order to analyze the nutrient and stress distribution patterns along the top-collared corn leaf. NDVI was selected as an example feature, since it is widely used to predict the nutrient status (water, nitrogen content, etc.) of plants, mainly from leaf mesurements [33,34,38]. A handheld hyperspectral scanner called LeafSpec was used to scan and collect the hyperspectral images of the top-collared leaves of 96 corn plants with different nitrogen treatments and genotypes. The NDVI distribution patterns within a single leaf were modeled to predict the plant's nutrient status. Four machine learning regression models were selected and compared for this purpose. All four models were able to more clearly separate the nitrogen treatments than the traditional averaged NDVI across all the leaf pixels. Among the four models, the Random Forest method showed the best performance. Similarly, the new model for NDVI distribution analysis also outperformed the traditional averaged NDVI in differentiating the leaves from different genotypes as well. The results indicate that although NDVI is a good indicator of the nutrient status of plants, the evalution of nutrient status can be further improved with our distribution analysis algorithum.
Compared with the conventional method, which only considers the averaged value, the analysis of distribution features utilized all the variance information in the captured image. Taking both intensity value and distribution information into consideration, the proposed Random Forest algorithm successfully improves phenotyping quality, with a better sensitivity in detecting the plant's nutrient conditions. Given the successful results, the new distribution analysis algorithm could potentially be applied to other plant feature measurements as well.
In the future, we would like to collect more diverse leaf samples in order to validate the current trained model, as well as to make the prospective model more robust. Moreover, an exhaustive grid search will be done with more hyperparameters involved. Thus, the potential of the current Random Forest model and other ML models could be fully explored. Meanwhile, subspace clustering approaches, such as deep clustering [50] and structured auto-encoder methods [51], will also be explored. Finally, the deep learning-based approach will be tested, as it could learn to automatically extract the features from the raw images, which might enable it to fully make use of the data.