Growth Stage Classification and Harvest Scheduling of Snap Bean Using Hyperspectral Sensing: A Greenhouse Study

The agricultural industry suffers from a significant amount of food waste, some of which originates from an inability to apply site-specific management at the farm-level. Snap bean, a broad-acre crop that covers hundreds of thousands of acres across the USA, is not exempt from this need for informed, within-field, and spatially-explicit management approaches. This study aimed to assess the utility of machine learning algorithms for growth stage and pod maturity classification of snap bean (cv. Huntington), as well as detecting and discriminating spectral and biophysical features that lead to accurate classification results. Four major growth stages and six main sieve size pod maturity levels were evaluated for growth stage and pod maturity classification, respectively. A point-based in situ spectroradiometer in the visible-near-infrared and shortwave-infrared domains (VNIR-SWIR; 400–2500 nm) was used and the radiance values were converted to reflectance to normalize for any illumination change between samples. After preprocessing the raw data, we approached pod maturity assessment with multi-class classification and growth stage determination with binary and multi-class classification methods. Results from the growth stage assessment via the binary method exhibited accuracies ranging from 90–98%, with the best mathematical enhancement method being the continuum-removal approach. The growth stage multi-class classification method used raw reflectance data and identified a pair of wavelengths, 493 nm and 640 nm, in two basic transforms (ratio and normalized difference), yielding high accuracies (~79%). Pod maturity assessment detected narrow-band wavelengths in the VIS and SWIR region, separating between not ready-to-harvest and ready-to-harvest scenarios with classification measures at the ~78% level by using continuum-removed spectra. Our work is a best-case scenario, i.e., we consider it a stepping-stone to understanding snap bean harvest maturity assessment via hyperspectral sensing at a scalable level (i.e., airborne systems). Future work involves transferring the concepts to unmanned aerial system (UAS) field experiments and validating whether or not a simple multispectral camera, mounted on a UAS, could incorporate < 10 spectral bands to meet the need of both growth stage and pod maturity classification in snap bean production.


Introduction
Snap bean is one of the largest sources of broad-acre crop income in the United States, and is planted from California to New York [1]. However, more than four million pounds of the harvested snap bean was disposed of unsold in 2017 [1]. One of the contributing factors to overall "food loss" is our lack of understanding, developing, and adopting site-specific crop management, with one example being crop growth stage assessment, e.g., when and where to schedule the harvest, or apply pesticides and fertilizers. Precision agriculture, or site-specific management, has been shown to provide management inputs that could maximize output by optimizing inputs [2]. This approach has been put into practice in recent decades via a variety of remote sensing systems [3]. Remote sensing systems are non-destructive in nature, provide rapid data collection, and offer synoptic coverage, via ground-based (e.g., spectroradiometers), airborne (e.g., aircraft and UAS), or spaceborne (satellite) platforms [4,5]. Precision agriculture and remote sensing thus are closely tied and poised to take advantage of sensors, such as hyperspectral and structural systems, that exceed our visual assessment ability.
The electromagnetic energy incident on crops results in a unique spectral characterization, which is called a spectral signature (or spectral response), associated with either vegetation as general cover type, or the physiological state of a plant [25]. As a plant grows, its biochemical composition is altered and in turn impacts the spectral response [26]. This change in the spectral response of crops typically is more pronounced in the VNIR-SWIR region, in terms of spectral curve shape, amplitude, and absorption features [27]. The change in crop spectral behavior in the VNIR-SWIR region is dominated by pigment levels, scattering due to intercellular leaf structure, water status, etc. [28][29][30]. These variabilities in crop spectral behavior have led scholars to use spectral information to assess plant health, growth, and deviation from an "ideal" seasonal spectral evolution. We will focus on the applications related to growth stage assessment using two different approaches in this context, namely machine learning techniques and spectral indicators.
Machine learning-based crop growth stage assessment, among many other applications [31][32][33][34][35], has gained traction with the advent of artificial intelligence approaches. Doktor et al. (2014) studied the use of random forests (RF) to extract the physiological status of summer barley crops using simulated signatures, as well as hyperspectral laboratory and field measurements in the VNIR-SWIR region [36]. Their results show that random forests can explain much of the variability (R 2 = 0.80-0.89) in leaf area index (LAI) using spectral data. Yuan et al. (2017) also evaluated the use of machine learning techniques on VNIR hyperspectral data for LAI assessment [37]. Their findings show that RF proved the most accurate when compared to other approaches, with a high coefficient of determination (R 2 = 0.74). Senthilnath et al. (2013), on the other hand, investigated crop stage classification using hyperspectral images, using dimensionality reduction via principal component analysis (PCA), followed by an unsupervised clustering method with the objective of distinguishing between three different growth stages of wheat [38]. Findings from this study showed a high efficiency ( Remote Sens. 2020, 12, x is our lack of understanding, developing, and adopt example being crop growth stage assessment, e.g., whe pesticides and fertilizers. Precision agriculture, or s provide management inputs that could maximize outp been put into practice in recent decades via a variety o systems are non-destructive in nature, provide rapid d ground-based (e.g., spectroradiometers), airborne (e.g platforms [4,5]. Precision agriculture and remote sen advantage of sensors, such as hyperspectral and structu ability. Most spectral imaging systems are categorized as (imaging spectroscopy) types, with the former define latter as containing narrow, contiguous spectral cove devices differ between applications, from visible-toassessments of i) soil properties [7][8][9][10][11], ii) crop disease [15,17,18]; to shortwave-(SWIR; 1000-2500 nm), m applications focused on chemical identification in the We will focus our discussion on the VNIR-SWIR, due The electromagnetic energy incident on crops which is called a spectral signature (or spectral respons cover type, or the physiological state of a plant [25]. A altered and in turn impacts the spectral response [26] typically is more pronounced in the VNIR-SWIR regio and absorption features [27]. The change in crop sp dominated by pigment levels, scattering due to interce These variabilities in crop spectral behavior have led plant health, growth, and deviation from an "ideal" se applications related to growth stage assessment usi namely machine learning techniques and spectral indi Machine learning-based crop growth stage assess has gained traction with the advent of artificial intelli the use of random forests (RF) to extract the physio simulated signatures, as well as hyperspectral laborato region [36]. Their results show that random forests can in leaf area index (LAI) using spectral data. Yuan et learning techniques on VNIR hyperspectral data for L RF proved the most accurate when compared to o determination (R 2 = 0.74). Senthilnath et al. (2013), classification using hyperspectral images, using dime analysis (PCA), followed by an unsupervised clusterin between three different growth stages of wheat [38 efficiency (ɳ = 81.5%), as well as an acceptable kappa c Spectral indicators and vegetation indices (i.e., a on two or more bands) have proven advantageous in t an extension to crop growth stage assessment [26,39]. V broad-or narrow-band type [40]. Narrow-band indices more sensitive to subtle changes in the spectrum, w coarser spectral resolution. The peak in the first deri nm, known as red edge peak, is a narrow-band spectr the other hand, broadband indices, such as the ratio vegetation index (NDVI; [44]), highlight relative biom = 81.5%), as well as an acceptable kappa coefficient (κ = 0.72).
Spectral indicators and vegetation indices (i.e., a mathematical transform or formulation based on two or more bands) have proven advantageous in terms of rapid assessment of crop spectra, with an extension to crop growth stage assessment [26,39]. Vegetation indices are classified as either of the broad-or narrow-band type [40]. Narrow-band indices correspond to fine spectral resolution and are more sensitive to subtle changes in the spectrum, whereas broad-band indices take advantage of coarser spectral resolution. The peak in the first derivative of reflectance spectra between 680-750 nm, known as red edge peak, is a narrow-band spectral indicator of plant stress and age [41,42]. On the other hand, broadband indices, such as the ratio vegetation index (RVI; [43]) and normalized vegetation index (NDVI; [44]), highlight relative biomass, as well as plant vigor/health/leaf area [43]. It also is important to briefly differentiate between plant maturity contexts, which focus either on the fruit-level or plant-level.
We hypothesize that growth stage and pod maturity assessment of snap bean for precision agriculture purposes can be addressed via hyperspectral systems, since growth and maturity are functions of spectral and biophysical features. The objectives of this research therefore were to: (i) evaluate the feasibility of snap bean growth stage assessment using a machine learning and spectral indices approach; (ii) assess snap bean pod maturity level via machine learning techniques, and (iii) determine spectral and biophysical attributes that correspond to accurate growth stage and pod maturity classification.

Study Area
This study was carried out in a greenhouse located at Rochester Institute of Technology (RIT), Rochester, NY, during March-May 2019. One hundred forty-two snap bean seeds (cv. Huntington) were sown on March 6, 2019, in a peat moss-based potting mix ("Potting Mix", Miracle-Gro), after which 48 plants were moved to 6-inch diameter pots, when at least 50% of the seeds had germinated (see Figure 1).

Plant Growth Characteristics
Each plant was considered a single sampling unit. All samples were irrigated every two days with 500 mL of water, and artificial light (tungsten-halogen light), as well as natural sunlight, were available to plants. Two different harvests were scheduled for this experiment; plants 1-24 were harvested when 50% of the 24 samples showed pods of industry sieve size 4-6, and plants 25-48 were harvested when 70% of the 24 samples carried pods of sieve size 4-6. The four critical growth periods, as outlined by Fernández et al. (1986) were defined as vegetative growth (i.e., first to second trifoliate; V3-V4), budding (i.e., pre-flowering; R5), flowering (R6), and pod formation (R7-R9); each was determined when at least 50% of plants from each harvest set showed corresponding signs (see Figure 2) [69]. Six different pod stages, marked by industry sieve sizes (sieve size 1, [S1] < 5.8 mm; sieve size 2, 5.8 mm < [S2] < 7.5 mm; sieve size 3, 7.5 mm < [S3] < 8.5 mm; sieve size 4, 8.5 mm < [S4] < 9.7 mm; sieve size 5, 9.7 mm < [S5] < 10.9 mm; sieve size 6, [S6] > 10.9 mm), were used to classify pods non-destructively (i.e., pods remained on plant) during the pod formation stage. It would be worthwhile to mention that these measurements are widths of the pods. At the time of harvest, however, pods from plants were destructively removed, classified into different sieve sizes, and the average pod weight of each sieve size was determined. This average pod weight was then multiplied by the number of pods per sieve size per plant (recorded during pod formation stage), and the category that had the maximum weighted mass was assigned as the plant maturity level.

Data Collection
Hyperspectral and biophysical data were collected every 2-3 days after 50% of plants showed second trifoliate emergence (i.e., large enough canopy cover to encompass as much of the spectroradiometer's field of view ([FOV] as possible). This temporal resolution allowed us to capture data 25 times during the growth cycle of 69 days. A Spectra Vista Corporation spectroradiometer ("HR-1024i" model, Spectra Vista Corporation, Poughkeepsie, NY) was used to collect the hyperspectral data. This device captures hyperspectral data in 978 bands, ranging from VIS to SWIR (350-2500 nm), which are collected via three different detectors; silicon, InGaAs, and extended InGaAs detectors, responsible for the 350-1000 nm, 1000-1890 nm, and 1890-2500 nm ranges, respectively. A 14 • FOV fore-optic lens was utilized for accurate pointing accuracy, with a resultant sampling diameter of 3-4 cm, at 15-20 cm height above the plant canopy. In order to ensure a repeatable data acquisition procedure, a plant's center was directly positioned beneath the center of the FOV (marked by the spectrometer's laser). The spectroradiometer was installed on a height-adjustable platform, covered on both sides with two tungsten-halogen lamps that provided incident electromagnetic energy throughout the VIS-SWIR region (see Figure 3a). Plant biophysical attributes (PHY), namely height, canopy width (along the longest axis), and the number of leaves, were concurrently collected with spectral data (spec.) acquisition to evaluate whether or not structural information may improve the performance of the growth stage classification. This study thus aimed to assess a best-case scenario, i.e., one for which a high signal-to-noise ratio (SNR) and pure, limited-background vegetative spectra were observed. This was enabled via the setup depicted in Figure 3a, where we covered the collection chamber with thick black felt to inhibit any external, stray light from entering the assembly. Moreover, to ensure a pure vegetative spectrum, a collar-shaped black felt was used to cover the base of the plant to effectively exclude the soil signal from each spectral sample (see Figure 3b). Calibration to reflectance was achieved using a Spectralon panel (see Figure 3c), set on a tripod to coincide with plant height, thereby minimizing illumination differences or variability between sample and collection days. There still existed system and/or spectral noise in few regions; we therefore omitted select spectral ranges from the analysis: 335-480 nm, due to detector fall off and associated low SNR, and 850-1000 nm, 1900-2000 nm, 2400-2500 nm regions, due to low SNR caused by detector overlap [70]. This step reduced the original 978 bands (before data cleaning) to 678 bands. The next step involved data dimensionality reduction, a common approach when dealing with oversampled (hyper)spectral data [71,72].

Principal Component Analysis
Principal component analysis (PCA), a proven approach for variability assessment, outlier detection, and dimensionality reduction, was used to evaluate spectral variability in the data set [73][74][75]. PCA is a linear transform of a random variable into a new space, such that the projected features (i.e., principal components) have maximum variance and are orthogonal to one another. The mathematical expression, shown below in (1), contains X is the random vector, e as the jth normalized eigenvectors, sorted in descending order, and P as the corresponding jth principal component: This mathematical transformation will result in principal components, with the first principal component exhibiting the highest variability (X; [76]).

Data Analysis
Several preprocessing steps were applied to the spectral data set. The Small's and χ-squared tests for multivariate normality were used to identify the proper approach, i.e., parametric vs. non-parametric, for classification analysis [77]. Moreover, due to different scales and units between spectral and biophysical features, we scaled all features to a specific range (0-1 in our study), enabling all features to have the same influence on the analysis [77].

Growth Stage Binary Classification
Scholars and crop growers both are keenly interested in evaluating plant maturity stages and transitions, for scientific or practical, management reasons, respectively. We approached such a growth stage classification as either a binary or multi-class classification. The binary classification approach includes a spectral enhancement step, in addition to using raw reflectance data, as an input to the classifier. Mathematical enhancement techniques include the Savitzky-Golay filter for smoothing and first derivative calculation, and the well-known continuum-removal (CR) approach [78,79]. A configuration of 11 bands, with a second-order smoothing polynomial, was utilized in this study for the Savitzky-Golay approach to reduce spectral noise artifacts, without suppressing pertinent absorption features [70]. The continuum-removal approach was implemented to enhance absorption features in the 555-775 nm, 1120-1265 nm, 1275-1675 nm, 1676-1825 nm, and 1900-2397 nm regions, due to their tie to chlorophyll absorption, structural and water absorption, sugar, protein, and nitrogen absorption, and cellulose and lignin absorption, respectively [79]. Table 1 represents the number of samples in each data set that was used in the analysis in a one-vs.-rest fashion. We ensured an unbiased classification procedure by balancing classes via random sampling. Table 1. Growth stage binary classification data used in this study and the corresponding number of samples in a one-vs.-rest fashion: True class samples indicate the number of samples that were labeled as the "true" class (e.g., vegetative growth) or of the "false" class (non-vegetative growth; i.e., rest). Numbers below indicate the number of samples for each class. These mathematically-enhanced spectra were then subjected to a logistic regression routine to calculate the corresponding per-wavelength discriminating power (C-index; [80,81]). Consequently, C-indices of all features were ordered in descending fashion and collinearity was assessed in a pairwise fashion in order to remove features with high collinearity. Features with an absolute correlation coefficient (Pearson's r) of 0.7 and higher were removed, while maintaining the one with the higher C-index [82]. This approach was essential to avoid potential overfitting in the analysis.
We used nine different parametric and non-parametric classifiers, namely logistic regression (LR), a support vector machine (SVM), a linear support vector classifier (LSVM), the K nearest neighbor (KNN) algorithm, a naïve Bayes (NB) approach, perceptron (Perc.) stochastic gradient descent (SGD), decision trees (DT), and a random forest (RF) method to evaluate the performance of our analysis [83][84][85][86][87][88][89][90]. We used four-fold cross-validation for vegetative growth and flowering and a 10-fold approach for budding and pod formation in order to ensure an unbiased judgment of classifier performance, as well as classifier efficiency on new data (i.e., test data) [91]. This difference in the number of folds is due to sample size for each test (see Table 1). Finally, the false positive rate (FPR) and true positive rate (TPR) were recorded, along with overall accuracy, averaged across all folds. The mentioned scores are presented in terms of receiver operating characteristic (ROC) curves and confusion matrices [92]. The accuracy metric used in this study was average accuracy [92].

Growth Index Classification
Over 200,000 combinations, derived from a two-band-index approach, were selected for the 678 spectral bands (i.e., C 678 2 ) and used as input to the SVM to identify the pair that resulted in the highest accuracy, derived from five-fold cross-validation, for the four growth stages. Two different arrangements of wavelengths were examined in this study for multi-class classification: (i) ratio indices (i.e., w1 w2 ) and (ii) normalized difference indices (i.e., ). The ratio index was named the snap-bean growth index (SGI), while the normalized form was called the normalized difference snap-bean growth index (NDSI). The reported score was average accuracy. We ensured that our classes were balanced, where such a balanced data set for the multi-class classification approach contained 48 samples of each growing stage, totaling 192 samples.

Pod Maturity Classification
The multiclass classification procedure, used for multiclass growth stage assessment, was used to evaluate multiclass pod maturity classification with five-fold cross-validation (see Figure 4). Table 2 shows the number of samples (plants) per sieve size. Three different sub data sets were produced using these data: [set 1], combining sieve size 1-4 as class 0 (not ready-to-harvest; immature and mid-mature), and sieve size 5-6 as class 1 (ready-to-harvest; mature and over-mature); [set 2], merging sieve size 1-4 as class 0 (immature and mid-mature), sieve size 5 as class 1 (mature), and sieve size 6 as class 2 (over-mature); [set 3], mixing sieve size 1-2 as class 0 (immature), sieve size 3-4 as class 1 (mid-mature), and sieve size 5-6 as class 2 (ready-to-harvest; mature and over-mature).

Software
All data preprocessing and machine learning tasks were performed in Python 3.5 [93] with the Scikit-learn module [94]. Publicly-available multivariate normality and continuum-removal algorithms were utilized [95,96]. The presented data are publicly available in FigShare [97].

Results
We noticed large variability in plant height (35.71 ± 6.39; Figure 5) over the 25 days of biophysical data collection. This distinct change in height could imply that this biophysical trait may be a critical attribute for classification, and can be readily assessed via structural sensors, such as ranging sensors (e.g., light detection and ranging; LiDAR) on unmanned aerial system (UAS) platforms. The width feature, on the other hand, exhibited a smaller mean and standard deviation (32.01 ± 4.87), when compared to plant height ( Figure 5). As for the number of leaves (20.19 ± 6.72), we observed a steady increase as the season progressed, followed by leaf loss and yellowing closer to harvest. It is worth noting that these data sets are normal, with only a slight indication of non-normality. The reported p-values were 0.49 and 0.52 for the Small and χ-squared methods, respectively. Biophysical variable levels were a function of our greenhouse study environment and may exhibit different values (mean, range, variability) for operational grower settings.  Figure 6 shows the results for the principal component analysis for spectral data, i.e., the results for the first four principal components (PCs) in terms of impact plots [77]. The reported impact plots explain ± 7 standard deviations around the mean (as positive and negative lines). The total explained variability for the first four PCs was over 99%. As can be seen from Figure 6, PC-1 accounted for most of the variability in the dataset (explained variability = 95.56%) in the 700-1800 nm range. PC-2 showed a more significant change in the visible region, and the higher end of the SWIR region, when compared to PC-1. The visible region's amplitude shifts toward higher reflectance values in the green reflective pigment, and lower amplitudes for blue and red reflective pigments. A closer look at Figure 6 reveals that as chlorophyll absorbs more visible energy (negative impact in the visible region, i.e., less reflective), higher reflectance values can be observed at the longer end of the SWIR region (1500-2500 nm). This could be considered a change in the overall "slope" of the spectrum, with its origin at~750 nm (i.e., higher reflectance in the visible region, with lower reflectance in wavelengths > 750 nm). PC-2 is responsible for 2.2% of the total variability in the data set, which demonstrates potential for being impactful through the continuum-removal approach and being an ideal input feature to a classifier. PC-3, in turn, is responsible for 1.2% of the total variability and exhibits high variability in the 1490 nm and 1900 nm absorption peaks, which are also close to water absorption troughs. We can also see zero shift in the~1250 nm region, minimal change in the visible region, and shifts toward higher and lower reflectance values in the 800-1200 nm region. Finally, PC-4 encapsulates mostly noise in the wavelengths shorter than 500 nm, due to the low SNR in that region.

Growth Stage Binary Classification
This section includes findings for the classification of four growth stages (viz., vegetative growth, budding, flowering, and pod formation), presented as four tables and figures for different classification results, i.e., one table-figure pair per growth stage. Table 3 shows that both raw reflectance and smoothed data sets exhibited similar classification performances. Specifically, we observed explanatory spectral features residing in the 610-620 nm (red), 700-770 nm (red edge), 810-830 nm (NIR), and 1130 nm, 1390 nm, 1490 nm, 1560 nm, 2010-2090 nm, 2170 nm, 2380-2390 nm regions in SWIR domain. The last column in Table 3 represents the selected features, following the feature removal process. The raw reflectance and smoothed data sets exhibited a similar number of selected features (only one for spectral-only and four for spec. + PHY). The limited number of selected features for raw and smoothed data sets demonstrates the problem of collinearity. The addition of biophysical attributes significantly improved the classification performance for raw and smoothed data sets (Table 3). For the first derivative model, among biophysical features, the number of leaves was only identified as differentiating, and for the continuum-removed data set, no physical attributes were detected. It seems that the continuum-removal approach results in superior performance that is independent of biophysical data, when compared to other models. Moreover, while plant height and canopy width could be assessed relatively easily via ranging devices (e.g., LiDAR; [98][99][100]), the number of leaves arguably cannot be considered a feature for more practical applications (i.e., UAS-based).  It can be concluded from Table 3 that both continuum-removed data sets show strong classification performance. We chose to only report classification results for [CR-spec], since both continuum-removal tests show superior and similar performance. Figure 7 shows the classification results for [CR-spec] for the top five features, with corresponding perceptron classifiers. Figure 7a,b represent the confusion matrix and ROC curve, respectively, for the top five features for the [CR-spec] data set, and the confusion matrix is highly diagonal, and the ROC curve shows an AUC of 1.00 and accuracy of 0.97, demonstrating robustness, as well as solid classification performance between vegetative spectra and non-vegetative spectra, based on a nonparametric classifier (perceptron).  Table 4 shows that the spectral wavelengths responsible for accurate classification of budding are located in the 560-595 nm (far green), 600-690 nm (red), 700-710 nm (red edge) regions, and regions in the SWIR domain including 1193 nm, 1400-1460 nm, 2030-2090 nm, 2110 nm, 2210-2300 nm, and 2315-2380 nm. Furthermore, all biophysical attributes show enhancement in classification performance, similar to when plants were vegetative only. All biophysical features were identified for raw and smoothed models, and width was detected as differentiating in first derivative and CR models. However, as can be seen, the continuum-removal approach resulted in a superior performance that did not rely on biophysical characteristics. The corresponding classifier for CR was KNN.

Snap Bean Flowering
The third stage of growth, flowering, was observed for this study over a two-day period in the growth cycle. Table 5 presents findings from this growth stage and shows that the differentiating wavelengths are located in~500 nm (green), 690-775 nm (red edge), 830 nm (NIR), and 1396 nm, 1460 nm, 1630 nm, 1750 nm, 1800-1890 nm, 2010-2080 nm, 2220-2230 nm, 2310-2390 nm in the SWIR region. Physical attributes only enhanced performance of raw and smoothed data sets and were not identified by the first derivative and CR models. The best-performing model (continuum-removal) achieved optimal results (accuracy = 0.98), without the need to include biophysical attributes. LSVM and DT proved accurate when using both continuum-removed data sets. Figure 9 depicts the corresponding confusion matrix and ROC curve for the [CR.-spec] model, i.e., the model with superior performance. The observed confusion matrix is diagonal, and the ROC curve presents an AUC = 0.99, which reflects solid classification performance.

Pod Formation
The final growth stage of the snap bean crop is represented by pod formation, which lasted almost a month in our study; pods varied from small pins to over-mature maturity levels (industry sieve size 1-6). There were several wavelength regions that were identified as distinguishing between pod-forming and non-pod-forming plants: 595-610 nm (red), 660-770 (red edge), and wavelengths 1090 nm, 1190 nm, 1230 nm, 1380-1385 nm, 1460-1490 nm, 1820 nm, 2040 nm, 2350-2390 nm in the NIR and SWIR spectral regions. This stage also showed an increase in classification accuracy when biophysical attributes were added. Similar to the vegetative growth stage, the number of leaves and canopy width were selected as outcome predictors. It is likely that both traits show significance due to leaf yellowing (due to natural senescence), which causes foliage loss. However, in the first derivative model, only width was identified as discriminating (similar to results from budding; see Table 4). While the addition of biophysical attributes increased performance, we can observe from Table 6 that both CR tests selected only spectral attributes and still outperformed other models. Both CR models yielded superior performance when RF and KNN classifiers were used. We opted to show results for the [CR-spec] data set, which represents a classification accuracy of 0.91, based on the RF classifier with the top five features. Figure 10 depicts the confusion matrix and ROC curve, corresponding to the pod formation classification. The analysis again exhibits a diagonal confusion matrix, which is indicative of the robustness of the algorithm, while the ROC curve shows an AUC of 0.98.  Figure 11 shows the histogram of the first and second wavelengths (W1 and W2) of the most accurate 1000 spectral pairs; it is evident that the range between 490-500 nm (far-blue) exhibited the highest frequency, followed by the 540-570 nm (green) spectral region. Both of these spectral domains are crucially important in terms of chlorophyll absorption [101]. Figure 11b shows that the second wavelength in the 600-650 nm (red) spectral region is prominent, followed by the 2000-2050 nm (SWIR) wavelength region. Wavelength regions 490-500 nm (far-blue), 600-650 (red), and 2000-2050 nm (SWIR) were highly relevant to all stages of growth classification; the first two spectral regions are associated with chlorophyll absorption, while the latter has been coupled with starch absorption features [101]. The most accurate wavelength pair for this ratio index was W 1 = 493 nm and W 2 = 640 nm, as identified by the SVM classifier. Table 7 represents the confusion matrix, as well as user's and producer's accuracies (commission and omission errors, respectively), and kappa (κ) value for this snap bean growth index (SGI) [102,103]. The introduced index was termed the SGI.   Table 7 lists the confusion matrix for the SGI classification, as well as the producer's accuracy (also referred to as recall), user's accuracy (also called precision), and their corresponding errors, i.e., errors of omission (also known as a type I error), and error of commission (also known as a type II error), and the kappa value.

Snap-Bean Growth Index (SGI) Assessment
From a producer's point of view, we observe that the vegetative growth and pod formation stages are best separated at~93.75% accuracy, indicating a high degree of separability between early growth stage plants and those that are relatively mature. This wavelength combination yielded a 68.5% producer's accuracy for the budding class, and as Table 7 shows, this classification error is primarily due to misclassifying budding stages as vegetative growth. Finally, the flowering stage exhibited the least accurate performance (producer's accuracy = 60.42%), indicating there is some confusion between plants that are in the flowering and pod formation stages. This result again is to be expected, since both of these stages arguably requires plant resources to be reallocated to either flowering or pod generation, resulting in potentially similar physiological outcomes and associated spectral region impacts. The separability between the flowering and budding stages, in contrast, exhibited relatively little misclassification. This outcome could prove useful in terms of management logistics for the industry, since these are critical stages for management interventions [104,105]. The error of omission, which corresponds to the producer's accuracy, shows how the algorithm failed to apply the correct label to the sample (cost to producers). On the other hand, the user's accuracy ranges from 75-82%, which shows less variability than the producer's accuracy. When comparing these results with the ones from the raw reflectance spectra for binary growth stage classification (see Tables 3-6), we observe these two visible domain bands outperformed all binary classification accuracies for raw reflectance data sets (accuracies ranging between 0.51-0.71). This band pair indicates a robust, reliable index with potential for use without preprocessing of spectral data.
The corresponding kappa value represents a measure of how well a classification algorithm performs relative to a chance class assignment [102]. The kappa value of 0.72 for SGI showed a solid level of agreement. Generally speaking, the overall accuracy of 79.17% and high kappa value proved that the SGI can be used for classifying snap bean maturity levels, based on two wavelengths in the visible region. This is a significant outcome, since growers and industry service providers can implement this index in cost-effective (silicon detector spectral range), operational UAS-based sensing solutions. Figure 12 shows the most accurate first and second 1000 wavelengths for the normalized difference index form. Similar to the SGI results, there is a high incidence for W 1 in the 490-500 nm wavelength range, followed by wavelengths in the 550-575 nm range. Both ranges have proven discrimination for vegetative growth and flowering stage (see Tables 3 and 5). This validates our notion that features selected by SVM in multi-class classification are also detected by the binary classification. Figure 12b represents the most accurate second 1000 wavelengths, with two peaks around the 600-650 nm and 2000-2250 nm regions; both regions have proven useful for binary classification (see Table 3, Table 4, and Table 6). The most accurate wavelength pair is W 1 = 493 nm and W 2 = 640 nm, with an accuracy of 78.64%. These wavelengths agree with those from the SGI-based approach. We dubbed this index the normalized difference snap-bean growth index (NDSI). Table 8 contains the results for classification of the NDSI via SVM. Although the overall accuracy is similar to SGI, the producer's accuracies are different (especially for the flowering and pod formation classes). The results from this analysis exhibited a higher producer's accuracy for flowering by 10%, when compared to SGI, but a lower producer's accuracy for pod formation by almost 10% (i.e., 83% vs. 93%). Moreover, when evaluating the findings for the vegetative growth stage, we observed a higher producer's accuracy and a better distinguishability between vegetative growth and budding. The budding class exhibited the same behavior as those from SGI (i.e., a large portion of samples were misclassified as belonging to the vegetative growth class). The results for the flowering class showed a smaller misclassification error when using NDSI, when considering the flowering and pod formation stages.  Table 8 also shows, in terms of user's accuracy, that there is an almost similar performance to SGI, with the difference that there is a slightly lower user's accuracy for flowering, albeit a higher pod formation stage accuracy. This is indicative of the complex nature of this study and the difficulty in differentiating between growth stages. The kappa value also bears out the null hypothesis (the classification outcome is better than chance) and validates our findings presented earlier. The overall accuracy of 78.64% is similar to the SGI, but with the difference being that there is an observed lower variability in producer's accuracy for the NDSI results. Table 9 shows the results for multi-class pod maturity classification on the three introduced sets. As can be seen from the table, [Set 1] (ready-to-harvest vs. not ready-to-harvest) exhibited the highest accuracy when compared to the other two sets (accuracy = 78%). The identified differentiating wavelengths for the best performing model, As we can see, when biophysical data were added, the best performing models (i.e., [CR-spec.]) either improved the performance only slightly, while detecting the width feature as discriminating, or did not detect biophysical features as contributing to pod maturity level classification. The three models that exhibited superior performance for the three sets are summarized in Figure 13. As can be seen from the three subfigures, the low accuracy measures for [Set 2] and [Set 3] was due to the poor distinction between class 1 and other classes. Class 1 in [Set 2], which is sieve size 5, is largely misclassified as sieve size 6. [Set 2] showed similar but weaker performance, as class 1 (sieve size 3-4) is distinctly misclassified as other two classes (class 0 with sieve size 1-2; class 2 with sieve size 5-6). These findings proved that that there was not much spectral difference between: (i) under-mature (sieve size 1-2) and mid-mature (sieve size 3-4) plants, as shown by Figure 13b,c; and (ii) mature (sieve size 5) and over-mature (sieve size 6) plants, as shown by Figure 13b. However, a diagonal confusion matrix with definite distinction between not-ready-to-harvest (sieve size 1-4) and ready-to-harvest (sieve size 5-6) proved a promising classification scheme. mid-mature (class 1) vs. mature + over-mature (class 2). Note that the distinction between mid-mature from immature and ripe is challenging. The same is true for distinction between mature and over-mature.

Discussion
The generated impact plots by PCA can be linked to the plants' biophysical attributes ( Figure 6). The generated impact plot for PC-1 demonstrated changes in the 700-1800 nm range, especially for the 700-1250 nm range. This range is responsible for both water and structural absorption (e.g., cellulose) features [101]; plant water intake, as well as structural characteristics, which change substantially with plant maturity. PC-2, on the other hand, showed changes in the visible and higher end of the SWIR region, denoting overall plant growth; the ratio between green reflectance and the red absorption trough, which increases as the plant matures, is tied to total visible light absorption by chlorophyll. PC-3 is mostly responsible for sugar, starch, and protein (both in the 800-1200 nm and~1900 nm regions), cellulose and sugar absorption in the~1490 nm region, as well as starch, protein, oil, and water absorption in the 800-1200 nm region [101]. However, in an operational airborne data campaign setting, the 1400 nm and 1900 nm atmospheric water absorption features (low SNR) could impact the PC-3 outcome, thus diminishing the influence of these regions on the growth stage classifiers.
Our binary approach identified spectral wavelengths, at each growth stage, which proved to be discriminating. These spectral features, however, need to be scrutinized in terms of their remote sensing implications and extension to operational scenarios and sensors, and ultimately, they need to be linked to plant physiology. For example, the results from the most accurate model [CR-spec.] for binary classification of vegetative growth stage included chlorophyll, lignin, water, nitrogen, and starch-related chemical changes, as evidenced via spectral features: 1.
The 610-620 nm range, identified as discriminating features for both continuum-removal approaches (see Table 3) are strong indicators of chlorophyll absorption in the red spectral region, necessary for photosynthesis [101]. Also, Card et al. (1988) found high correlation coefficients (R 2 = 0.93) between nitrogen and the 620 nm band, and between lignin and the 610 nm band (R 2 = 0.93) [106].

2.
The 700-710 nm wavelength region represents the well-known red edge spectral feature, and was identified by continuum-removal models. The red edge peak reflects plant growth, health/vigor, and maturity, as well as indicating sugar absorption [42]. The red edge peak is an easy-to-detect spectral metric in first derivative spectra, since the slope of the red edge exhibits a common high value in comparison to other spectral metrics. However, it is notable that the continuum-removal approach proved useful in independently identifying the red edge peak. 3.
The single wavelength at~1390 nm in the SWIR region falls in the atmospheric water absorption feature, due to bending O-H bonds, as well as reflecting total nitrogen absorption [101,106].

4.
The single wavelength at 1490 nm was deemed as being explanatory for cellulose and sugar absorption, via stretching of O-H bonds [101].

5.
The 1560-1570 nm region, resides in the vicinity of two regions, 1540 nm and 1580 nm, which have associations with starch and sugar, due to stretching O-H bonds [101]. 6.
The 2000-2020 nm region, which also appeared in the raw and smoothed data sets, reflects starch absorption, causing deformation in O-H and C-O bonds [101]. 7.
Finally, for the 2380-2390 nm region, we could not find any physiological link in the literature. This may require future efforts to concentrate on detecting ties between spectral signatures in the far-SWIR region and the biophysical/chemical changes due to plant growth.
Results from the continuum removal approach (Table 4) show the budding maturity stage classification can be coupled with oil content in the plant, which was not important in differentiating vegetative growth, starch, protein, nitrogen, lignin, and water, with detected wavelengths as follows: 1.
The significant association with~560 nm highlights the predictive power of reflective green pigment in the plant, and this selection was attributed to leaf senescence (yellowing; or mosaic virus; [107]) and abscission associated with plant senescence.

2.
The 690-700 nm spectral domain was a strong predictor both first derivative and continuum-removal features (see Table 4), previously associated with the vegetative stage (see Table 3), and encompasses the red edge peak.

3.
The single wavelength of 1400 nm highlights water absorption via bending of O-H bonds [101]. This wavelength feature was not previously detected at the vegetative stage.

4.
The wavelength at~2300 nm is coupled to protein (not detected in vegetative stage) and nitrogen absorption by stretching of N-H and C=O bonds, as well as bending of C-H bonds [101,108].

5.
A scientific link in the literature was identified between the 2315 nm wavelength and oil absorption, due to the bending of C-H bonds [101]. 6.
The single wavelength of 2327 nm is within that previously described as associated with starch absorption (2320 nm) caused by a stretch in C-H bonds and deforming of CH2 molecules [101]. Moreover, Card et al. (1988) identified a link between lignin absorption and the 2320 nm band [106]. 7.
Finally, the single wavelength at 2365 nm has been associated with nitrogen content [106].

8.
There was no clear association between 1190 and 2030 nm and physiological changes in the plant.
The best performing model for the flowering stage was identified as the continuum-removal method. The ability to capture this phenological change is of pivotal importance for plant maturity assessment, especially since it has bearing on disease risk (e.g., white mold caused by the fungus, Sclerotinia sclerotiorum; [109]). This model identified spectral indicators indicative of oil (detected in budding classification), cellulose (not detected previously), starch, sugar, starch, and lignin, as below: 1.
The 690-705 nm range, previously detected in the vegetative and budding stage, represents the red edge peak spectral metric.

3.
The~2080 nm wavelength in the SWIR region is coupled to sugar and starch absorption, via stretching and deforming of O-H bonds [101]. 4.
The wavelength region at~2310 nm is attributed to oil absorption (previously found in budding stage), via bending in C-H bonds [101].

5.
A single wavelength was found at 2334 nm, in the vicinity of two absorption features due to starch and cellulose (not detected previously), and also linked to stretching of O-H/C-H bonds and being responsible for deformation in O-H/C-H/CH 2 bonds and molecules [101]. 6.
Lastly, no relevant information was found for the wavelength at 1626 nm and the selection of the 2370-2390 nm region, which calls for future work. It is worth noting that the latter region has been given less attention in the literature due to the associated low SNR caused by atmospheric effects in aerial remote sensing systems.
Finally, pod formation, the final growth stage, is mainly dominated by chlorophyll absorption in the red portion of electromagnetic spectrum, coupled to water, starch, sugar, lignin, protein, nitrogen, and abundant cellulose absorption features (as identified by the most accurate model-continuum-removal). The corresponding wavelengths are: 1.
The 610 nm wavelength, as previously detected in the vegetative classification, is linked to chlorophyll and lignin absorption [106].

2.
The 700-710 nm wavelength region, representing the red edge peak, proves to be crucial in all stages of growth (detected in all stages of growth; see Tables 3-5).

3.
The wavelength at~1190 nm, in the proximity of water, cellulose, starch, lignin absorption ranges (~1200 nm), due to bending of O-H bonds.

4.
The single wavelength at 1490 nm corresponds to cellulose and sugar absorption, with stretching in O-H bonds. 5.
The wavelength at~1820 nm (stretching of O-H/C-H bonds) is responsible for cellulose absorption. 6.
Finally, no information was found in the literature for the~1230 nm, 2040 nm, and~1380 nm wavelength regions, thus demanding future work. It also is worth noting that the selected 2390 nm feature may be rendered useless due to its proximity to the longer SWIR region, which suffers from low SNR due to atmospheric considerations.
In short, the multi-class growth classification, using the proposed vegetation indices (ratio and normalized forms), identified two discriminative wavelengths, namely at 493 nm and 640 nm. Both wavelengths varied with chlorophyll absorption throughout the growth stages of the plant. Results from the pod maturity classification, on the other hand, showed that models that performed best (i.e., [CR-spec.] models) used detected wavelengths in the~550 nm region (attributed to reflective green pigment and leaf senescence/yellowing), the 720-740 nm range (encompassing the red edge peak, between 2000-2070 nm (starch absorption, causing O-H and C-O deformation in the 2000 nm area), the 2040 nm spectral region (no relevant scientific link found), and protein and nitrogen absorption stretching and bending of N-H and N=H bands in the 2060 nm spectral region [101].

Conclusions
We studied growth stage and pod maturity classification of snap bean, as a proxy broadacre crop, via an experimental greenhouse setup with a high spectral and temporal resolution. We approached growth stage assessment with binary (one-vs.-rest) and multi-class classification (vegetative index identifier), and pod maturity detection via multi-class classification. The reason we approached this study using both binary and multi-class fashion for growth stage classification, was to identify wavelengths that are not only discriminatory between four different levels of maturity, but to also generate results that could detect spectral wavelengths corresponding to each stage of snap-bean maturity, which is more conclusive and extensible to operational implementation.
Results from the binary growth stage classification showed the continuum-removal technique was adept at detecting growth-related spectral features for all maturity stages, with accuracies ranging from 0.90-0.98. We determined that addition of biophysical features to spectral data for growth stage assessment enhanced the classification accuracy. Despite this improvement, it was found that the continuum-removal approach exhibited similarly promising results, with or without the addition of biophysical attributes. Findings for the multi-class growth stage classification approach showed that two distinguishing wavelengths at 493 nm and 640 nm yielded the best separation between the four growth classes. Accuracies of 79% for the ratio index (i.e., snap-bean growth index [SGI]) and 78% for the normalized difference form (i.e., normalized difference snap-bean growth index [NDSI]) were observed. The selected binary classification algorithm was effective at determining plant growth stage, relying on both VIS-NIR and SWIR regions for accurate outcomes. However, the multi-class classification and the proposed indices also yielded solid classification results, even if with marginally lower accuracies, with the distinct benefit of only using two wavelengths in the visible region.
Findings from the pod maturity assessment, coupled to harvest scheduling, proved that distinction between not ready-to-harvest and ready-to-harvest sieve sizes (immature + mid-mature vs. mature + over-mature; i.e., industry sieve sizes 1-4 vs. 5-6) is feasible with solid accuracy measures (~78% with only two selected narrow-band wavelengths), while the algorithm failed to differentiate immature vs. mid-mature and mature vs. over-mature due to the similarity of spectral data between these pod maturity levels.
Specific limitations of this study include the relatively small sample size, our focus on only one snap bean cultivar (i.e., cv. Huntington), and the absence of mixed-pixel effects and atmospheric noise by collecting data under controlled laboratory conditions. All of these factors can be evaluated in future studies, e.g., an extension to unmanned aerial systems (UAS) and subsequent transitioning to other airborne remote sensing platforms would introduce both mixed-pixel and atmospheric noise and enable sampling of thousands of individual plants. In terms of the advantages of this approach, we noted the robustness of the algorithms to newly introduced data, while the growth stage classification results exhibited high classification accuracy via the SWIR region, and solid performance using only two bands in the visible spectral domain. We are especially intrigued by this last finding, i.e., the potential of using only two wavelengths in the "cheap" silicon detector range to develop cost-effective, operational platforms to accurately assess snap bean growth stages. An acceptable classification performance for pod maturity to differentiate between not ready-to-harvest and ready-to-harvest, with identified wavelengths residing in the silicon and SWIR spectral regions, may prove to be advantageous to operational implementation, as long as a practical and affordable solution is available to merge the two detectors into one multispectral sensor suite.