Plant Disease Diagnosis Based on Hyperspectral Sensing: Comparative Analysis of Parametric Spectral Vegetation Indices and Nonparametric Gaussian Process Classification Approaches

Abstract: Early and accurate disease diagnosis is pivotal for effective phytosanitary management strategies in agriculture. Hyperspectral sensing has emerged as a promising tool for early disease detection, yet challenges remain in effectively harnessing its potential. This study compares parametric spectral Vegetation Indices (VIs) and a nonparametric Gaussian Process Classification based on an Automated Spectral Band Analysis Tool (GPC-BAT) for diagnosing plant bacterial diseases using hyperspectral data. The study conducted experiments on tomato plants in controlled conditions and kiwi plants in field settings to assess the performance of VIs and GPC-BAT. In the tomato experiment, the modeling processes were applied to classify the spectral data measured on the healthy class of plants (sprayed with water only) and discriminate them from the data captured on plants inoculated with the two bacterial suspensions (10⁸ CFU mL⁻¹). In the kiwi experiment, the standard modeling results of the spectral data collected on nonsymptomatic plants were compared to the ones obtained using symptomatic plants' spectral data. VIs, known for their simplicity in extracting biophysical information, successfully distinguished healthy and diseased tissues in both plant species. The overall accuracy achieved was 63% and 71% for tomato and kiwi, respectively. Limitations were observed, particularly in differentiating specific disease infections accurately. On the other hand, GPC-BAT, after feature reduction, showcased enhanced accuracy in identifying healthy and diseased tissues. The overall accuracy ranged from 70% to 75% in the tomato and kiwi case studies. Despite its effectiveness, the model faced challenges in accurately predicting certain disease infections, especially in the early stages. Comparative analysis revealed commonalities and differences in the spectral bands identified by both approaches, with overlaps in critical regions across plant species.
Notably, these spectral regions corresponded to the absorption regions of various photosynthetic pigments and structural components affected by bacterial infections in plant


Introduction
Plant diseases are a major threat to worldwide agriculture, causing substantial yield losses and impacting food security and quality [1]. Timely and accurate disease diagnosis is crucial for implementing effective management strategies in sustainable agriculture. These practices aim to contribute to more effective and precise plant protection measures through phytosanitary treatments that are better tailored in terms of time, location, product used, and dose. However, traditional diagnostic methods often fail to detect diseases before visible symptoms emerge, limiting their effectiveness in proactive disease management [2,3]. Plant disease monitoring and diagnosis methods involving different state-of-the-art sensing approaches have recently been explored for precise in vivo and in situ disease assessment. Recent strides in sensing techniques, particularly hyperspectral spectroscopy (HS), offer promising avenues for precise disease diagnosis [2,4].
Changes in the host plant's physiological, biochemical, and metabolic properties caused by pathogens result in altered optical and metabolic features. Proximal optical sensors, including HS devices, can detect these changes and monitor the spatiotemporal pattern of disease development [5], which allows the development of several methods of diagnosis.
Plant pigments are among the first host compounds to be affected and degraded by pathogens, resulting in changes in the plant's optical behavior. Chlorophylls (Chl) a and b are the major pigments of plants (accounting for almost 65% of the total pigment content), and their spectral absorption is mostly concentrated in the 410-430 nm (Chl a), 450-470 nm (Chl b), and 600-690 nm (Chl a) bands, located in the blue and red regions, respectively. Green radiation, on the other hand, is less strongly absorbed. In healthy plants, chlorophyll concentration is approximately ten times higher than that of other pigments (e.g., carotenoids and flavonoids, among others), thus masking the specific absorption features of these compounds [6].
As the disease develops, the levels of other photosynthetic pigments, namely carotenoids and polyphenols, are increasingly affected. Carotenoids absorb most effectively between 440 and 480 nm, and their absorption extends into the blue-green region. They include compounds such as yellow lutein pigments, β-carotenes, and xanthophylls (e.g., violaxanthin and zeaxanthin). In turn, polyphenols (e.g., brown pigments) start to appear only when the plant tissues begin to necrose [7][8][9][10]. They include compounds such as flavonoids and anthocyanins, which absorb radiation from blue to red spectral ranges with higher intensity in the shorter wavelengths [7][8][9][10].
Moreover, the optical spectral properties of host plants are also affected in the Near-Infrared (NIR, 700-1300 nm) and short-wave infrared (SWIR, 1000-2500 nm) regions when leaf structure (e.g., cell layers, cell size, and structural components such as lignin and proteins, among others), air spaces, and water content are affected [9,11].
In this regard, changes in plant leaves' biochemistry and cellular composition result in changes in a plant's spectral characteristics. Nevertheless, it is important to mention that a leaf's spectral properties are not static over time. Indeed, they continuously change during growth, maturity, senescence, decay, or stress (e.g., plant disease development).
Hence, despite the potential of HS in plant disease diagnosis, challenges persist in harnessing its full potential due to the complexity of hyperspectral data and the need for efficient processing methodologies to extract relevant information [12,13]. Addressing these challenges is crucial to unlocking the full potential of HS in improved disease diagnosis and management strategies.
HS is known for acquiring data in narrow wavebands (<10 nm), with high precision and resolution, and for being able to capture detailed information from the electromagnetic spectrum [12]. Nevertheless, despite this evident benefit, the measurement of this large number of variables (i.e., features, wavelengths) results in high-dimensional data, which increases the complexity of processing them to produce relevant information. Furthermore, spectral data assessed in near-contiguous variables likely present similar or overlapping information. This potential data redundancy also increases the complexity of analysis and interpretation and the chance of overfitting [14]. Dimensionality reduction methods were developed to mitigate the effects of high dimensionality and collinearity, mostly based on identifying and extracting the most relevant and distinctive spectral features (without losing relevant information) [13].
The computation of spectral Vegetation Indices (VIs) is one of the most widespread Feature Selection (FS) approaches for retrieving crop biophysical information, especially due to their intrinsic simplicity. A VI consists of a user-defined mathematical combination of two or more wavelengths that improves the extraction of crop biophysical information from data, i.e., it identifies spectral relationships that unravel specific plant properties. Hence, VIs are considered parametric, physiologically driven methods. Nonetheless, it is important to note that when narrowband hyperspectral data are used, VIs can be a restrictive formulation, since they only use some of the available wavelengths, failing to leverage the complete wealth of information in the continuous spectral data [15]. Besides that, some of the VIs that have already been developed were designed to estimate specific vegetation traits (e.g., plant biomass and photosynthetic pigments research), which might not entirely suit the assessment of plant disease. The ones developed for studying specific plant-pathogen interactions (e.g., [16]) are usually only applicable to that specific pathosystem (usually in similar environmental conditions), mostly in symptomatic conditions, and are unsuitable for generalized disease assessment. In addition, disease studies are usually modeled as a classification problem, which makes the application of an index more difficult.
Another emerging strategy, recently employed for exploring hyperspectral data, is applying different advanced techniques (e.g., machine learning algorithms) that search for relationships between spectral data and biophysical variables (also known as nonparametric, data-driven methods). They mostly consider all the spectral features measured by the hyperspectral sensors, which constitutes an important benefit compared to the VIs [17]. These methods can be based on linear or nonlinear predictive methods.
Furthermore, automated band analysis tools have been developed in the domain of machine learning classification algorithms (MLCAs). Following a band selection method earlier introduced in regression [18], this paper introduces an automated spectral band analysis tool (BAT) based on Gaussian process classification (GPC) for the spectral analysis of bacterial plant diseases. Briefly, starting from all bands, GPC-BAT sequentially removes the least informative band in GPC until one band is left. By tracking the accuracy statistics, GPC-BAT makes it possible (1) to identify the most informative bands relating spectral data to a classification problem, and (2) to find the smallest number of bands that preserves accurate classification.
Hence, despite the development and availability of diverse methods for extracting meaningful spectral information in the context of plant bacterial disease diagnosis, it is necessary to address their suitability and performance when dealing with different pathosystems. Therefore, the main objective of the present work was to explore, test, and validate the application of proximal optical sensed data for the early assessment and diagnosis of bacterial plant diseases. In this regard, the specific goals of this study were to: (i) explore the suitability of different VIs for extracting relevant spectral features for performing plant bacterial disease diagnosis, using both reflectance and transmittance hyperspectral data (physiologically driven approach); (ii) investigate the potential of GPC-BAT for performing plant bacterial disease diagnosis using reflectance and transmittance hyperspectral data (data-driven approach); (iii) compare and contrast the performance of VIs and GPC-BAT in discerning spectral features crucial for differentiating between healthy and diseased plant tissues; and (iv) uncover the biological significance of the identified spectral features concerning specific plant-pathogen interactions and their implications for early disease diagnosis. To achieve this, two case studies were conducted on tomato (controlled conditions) and kiwi (field conditions) plants, aiming to explore the capabilities of the developed approaches for performing bacterial disease diagnosis in distinct species in different environmental conditions.
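To make the idea of a VI as a "mathematical combination of two or more wavelengths" concrete, the following minimal sketch computes a normalized-difference index from a single spectrum. It is an illustration only: the function names, the toy spectrum, and the choice of the 800 nm and 680 nm bands are assumptions made for this example, and it is not meant to reproduce any specific index used in this study.

```python
def nearest_band(wavelengths, target_nm):
    """Return the index of the measured wavelength closest to target_nm."""
    return min(range(len(wavelengths)), key=lambda i: abs(wavelengths[i] - target_nm))


def normalized_difference(spectrum, wavelengths, band_a_nm, band_b_nm):
    """Normalized-difference form (a - b) / (a + b) for two chosen wavelengths."""
    a = spectrum[nearest_band(wavelengths, band_a_nm)]
    b = spectrum[nearest_band(wavelengths, band_b_nm)]
    return (a - b) / (a + b)


# Hypothetical reflectance values (illustration only, not measured data).
wavelengths = [450, 550, 680, 700, 800]
reflectance = [0.05, 0.12, 0.04, 0.20, 0.50]

# NIR (800 nm) versus red (680 nm): a common two-band combination.
nd_index = normalized_difference(reflectance, wavelengths, 800, 680)
print(f"Normalized difference (800, 680): {nd_index:.3f}")
```

Only two of the five available bands enter the result, which illustrates why a VI can be a restrictive formulation for narrowband hyperspectral data.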

Materials and Methods
The present analysis focuses on two case studies: one in controlled environmental conditions using hyperspectral transmittance sensing data and the second in field conditions using hyperspectral reflectance sensing data. The first case consisted of collecting spectral data in healthy tomato leaflet tissues, along with measurements in tissues inoculated with Pseudomonas syringae pv. tomato (responsible for the bacterial speck disease of tomato) and tissues inoculated with Xanthomonas euvesicatoria (responsible for the bacterial spot disease of tomato). The second case assessed spectral data in nonsymptomatic and symptomatic kiwi leaf tissues affected by the bacterial canker of kiwi caused by Pseudomonas syringae pv. actinidiae. In both case studies, multiple spectral samples were gathered within an experimental setup at various time intervals, encompassing all the plants involved in the study.
The hyperspectral data were then used in two modeling approaches: a physiologically driven parametric approach based on VIs and a nonparametric approach based on a Gaussian Process Classification Band Analysis Tool.

Tomato Bacterial Diseases-Indoor Assay
An indoor assay was performed in a walk-in growth chamber (temperature of 25 to 27 °C, humidity of approximately 60%, photoperiod of 12/12 h, and light intensity of 30 W) with nine tomato (Solanum lycopersicum L.) plants of the variety Cherry in 200 mL pots with a commercial potting substrate. Groups of three plants were formed and physically separated from each other to avoid cross-contamination; one group was sprayed with distilled water (Control, healthy class), the second group with a bacterial suspension (1 × 10⁸ cells/mL) of Pseudomonas syringae pv. tomato DC 3000 (Pst), and the last group with a suspension (1 × 10⁸ cells/mL) of Xanthomonas euvesicatoria (Xeu), following a previously developed protocol [19] (Figure 1). Plant phenotypical observations were performed daily to assess symptom development for 10 days (Table 1).
Figure 1. Experimental setup of the bacterial inoculation assay performed on tomato leaves (walk-in experimental conditions), and in kiwi leaves (field experimental conditions). In both cases, spectral measurements were performed on the adaxial side of the leaves. In the tomato case study, they were, furthermore, performed on the 4th, 5th, and 6th leaves of the assay plants, in a dark room, using a spectrometer (measuring transmittance data) combined with an optical fiber bundle with a reflection probe. A white LED was placed beneath each leaflet. In kiwi, spectra collection was performed at random points of randomly selected leaves, using a passive spectroradiometer (measuring reflectance data).
The success of artificial bacterial inoculation was assessed by a viability assay and through a colony polymerase chain reaction (PCR), as stated in [19]. PCR analyses were performed at 48, 72, and 96 h after bacterial inoculation, as well as on the last assay date. The growth of Pst and Xeu in their appropriate selective media (KB and YDC, respectively) demonstrated that the bacteria were viable at the moment of inoculation. PCR results confirmed the infection success, since bacteria-specific bands were observed for each pathogen species, namely a 200-base pair (bp) fragment for Pst and a 713 bp fragment for Xeu. No PCR amplification was observed in Control samples, assuring they were healthy until after the last spectral measurement. The first macroscopic lesions were detected in Pst-inoculated samples 3 days after inoculation (DAI) and in Xeu-inoculated samples at 8 DAI.
Hyperspectral transmittance point-of-measurement (POM) data were captured inside a walk-in chamber using a setup composed of a mini spectrometer (Hamamatsu Photonics K.K. TM Series C11697MB) with a wavelength range of 200-1100 nm and a spectral resolution of 0.6 nm. This setup includes a transmission optical fiber bundle (FCR-7UVIR200-2-45-BX, Avantes, Eerbeek, The Netherlands), a laptop for data storage and processing, and a white LED spanning from 390 to 800 nm. A specialized evaluation software (SpecEvaluationUSB2.exe, Hamamatsu Photonics K.K., Hamamatsu, Japan) was used for data acquisition. Further details about the setup can be found in previous work [19] (Figure 1). Subsequently, the spectra were resampled to approximately 10 nm to minimize data redundancy. A dataset comprising 2346 samples (spectral observations) encompassing 51 wavelength features (spectral variables) was selected for subsequent analysis. The spectral measurements were later classified according to the leaflets' plant treatment group, including the classes: (i) Control (healthy); (ii) inoculated with Pst; and (iii) inoculated with Xeu (Table 1). This dataset can be found in [20].
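The text does not specify which resampling technique was used to reach the approximately 10 nm spacing. As one possible illustration, the sketch below averages all readings that fall into consecutive 10 nm bins; the binning strategy, the function name, and the toy spectrum are assumptions made for this example and may differ from the procedure actually applied.

```python
def resample_by_binning(wavelengths, values, bin_width_nm=10.0):
    """Average all readings falling into consecutive bins of bin_width_nm.

    Returns the mean wavelength and mean value of each non-empty bin.
    """
    bins = {}
    start = wavelengths[0]
    for wl, value in zip(wavelengths, values):
        index = int((wl - start) // bin_width_nm)
        bins.setdefault(index, []).append((wl, value))

    centers, means = [], []
    for index in sorted(bins):
        points = bins[index]
        centers.append(sum(p[0] for p in points) / len(points))
        means.append(sum(p[1] for p in points) / len(points))
    return centers, means


# Hypothetical finely sampled spectrum: 0.5 nm steps from 400 to 429.5 nm.
fine_wavelengths = [400 + 0.5 * i for i in range(60)]
fine_values = [wl / 1000 for wl in fine_wavelengths]  # toy signal

coarse_wavelengths, coarse_values = resample_by_binning(fine_wavelengths, fine_values)
print(len(fine_wavelengths), "->", len(coarse_wavelengths), "features")
```

Averaging neighboring bands reduces the number of strongly correlated features while keeping the overall spectral shape, which is the stated purpose of the resampling step.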

Kiwi Bacterial Disease-Field Assay
An assay was performed in field conditions in commercial orchards of kiwi (Actinidia deliciosa) of the cultivar 'Bo.Erika', located in Guimarães, Portugal. Macroscopic signs (i.e., symptoms visible to the human eye) of bacterial canker caused by Pseudomonas syringae pv. actinidiae (Psa) were assessed in female plants. Plant visual phenotyping was performed, classifying leaves into nonsymptomatic (NS, when no macroscopic visual symptoms were present) or symptomatic (S, when macroscopic symptoms were visible), as described in [21] (Table 1). The detection of Psa in kiwi samples was carried out by dedicated PCR following the EPPO guideline [22], as detailed by Mariz-Ponte et al. [23]. These samples were collected before the spectral assay and confirmed the presence of the bacterium in the orchards under study.
Hyperspectral reflectance measurements were collected in situ, on in vivo leaves, using a portable spectroradiometer (ASD FieldSpec® HandHeld 2, ASD Instruments, Boulder, CO, USA) with a wavelength range of 325 to 1075 nm, a spectral resolution of 1 nm, and a field-of-view conical angle of 25° (Figure 1). The detailed procedures followed during the spectral acquisition assay can be found in previous research [21]. In brief, three leaves were chosen per plant, and their spectral signatures were collected at different time points, resulting in 504 samples (spectral observations) and 751 spectral features (spectral variables). The leaves' spectra were classified according to leaf phenotype into the binary classes NS and S (Table 1). This dataset can be found in [24].

Modeling Approaches
Parametric Approach-Vegetation Indices (VIs)
Hyperspectral data, including both transmittance and reflectance spectra, usually have an overlapping nature and multi-scale interference [25]. To address this issue, a selection of 33 spectral VIs, encompassing 42 distinct wavelength combinations, was computed to identify the most relevant wavelengths or bands for discriminating healthy and diseased biological tissues (Table 2) (Figure 2). This selection process aimed to integrate VI formulations commonly used to assess different crop traits as well as crops' physiological conditions. The variables used in each formula corresponded to default values explicitly mentioned in the formula (Table 2) or values chosen by the authors, namely: 450 nm (representing the blue region of the electromagnetic spectrum), 550 nm (green), 680 nm (red), 700 nm (red edge), and 800 nm (NIR). The feature representing the blue region was selected because it is related to pigment absorption features (~450 nm, e.g., chlorophylls and carotenoids) [26,27] and a blue fluorescence maximum [28]. The 550 nm wavelength was selected because it corresponds to the green reflectance peak (or green edge), where reflectance values can be more than twice those of the surrounding wavelengths [29,30]. This value is also sensitive to chlorophyll content and has been explored to detect plant stress-induced changes and pigment content variations [31,32]. In turn, 680 nm was chosen because it corresponds to the reflectance minimum in the red region [29,30]. The red-edge value (700 nm) was used because it is highly sensitive to changes in chlorophyll-a absorption and is used to detect subtle changes related to plant physiological status and growth stage transitions [31,33]. The 800 nm spectral feature was chosen because it is related to the influence of changes in leaf structure and density, but it is not sensitive to pigment level changes [34]. Furthermore, all these wavelengths have been extensively used in formulating multiple VIs, as seen in the Index Data Base (IDB), a database for remote sensing indices [35,36].
A Flexible Discriminant Analysis (FDA), which applies predictive modeling with a built-in Feature Selection (FS), was then performed to evaluate the most significant VIs used to discriminate between spectral data measured in (i) healthy tomato tissues (Control, Con), diseased tomato tissues inoculated with Pst, and diseased tomato tissues inoculated with Xeu; and (ii) nonsymptomatic (i.e., without macroscopic lesions, NS) and symptomatic (S) kiwi tissues (Figure 2). The datasets, encompassing both tomato and kiwi cases, were split according to the holdout method [37], which involved partitioning into a training set comprising 70% of the data and a testing set with the remaining 30% of the observations [38].
Model evaluation was performed through a resampling strategy involving repeated 10-fold cross-validation to estimate accuracy. A more detailed explanation can be found in [37][38][39]. Model performance was then evaluated by assessing different classification model metrics, including the Confusion Matrix (CM), accuracy score, kappa coefficient, and F1-score [19]. The CM is a 2D-matrix representation of the actual classes of the collected spectral samples in one dimension and the predicted class values in the other. When the predicted class values are equal to the actual value, they are considered correct classifications and are located on the CM's diagonal. The remaining matrix cells correspond to incorrect classification predictions, where the predicted value does not coincide with the actual value. The class of interest is considered positive, while the other(s) are considered negative. When a sample is correctly classified as the class of interest, it is considered a true positive (TP) case. When a sample is correctly classified as not belonging to the class of interest, it is called a true negative (TN). When a sample is wrongly classified as the class of interest, it is called a false positive (FP), and when it is incorrectly classified as not belonging to the class of interest, it is a false negative (FN).
The accuracy score (also known as Success Rate) corresponds to the number of correctly classified prediction cases divided by the total number of predictions (Equation (1)):
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$
The kappa coefficient (also called Cohen's kappa) adjusts the accuracy score by considering the probability of an accurate prediction occurring by chance alone [37]. Its value typically ranges from zero, indicating agreement no better than chance, to one, the perfect agreement between the model's predictions and the true values. Kappa values (in percentage) can be interpreted as follows: when less than 20%, it is considered a poor agreement; 20% to 40%, a fair agreement; 40% to 60%, a moderate agreement; 60% to 80%, a good agreement; and 80% to 100%, a very good agreement [37]. The kappa coefficient can be estimated through the following formula, where Pr(a) represents the proportion of actual agreement and Pr(e) refers to the expected agreement between the classifier and the true values, under the hypothesis that they were chosen randomly (Equation (2)):
$$\kappa = \frac{\Pr(a) - \Pr(e)}{1 - \Pr(e)} \tag{2}$$
The F1-score (also called F-measure) combines Precision (the proportion of predicted positive cases that are truly positive) with Recall (the number of TP over the total number of actual positives, which measures how complete the results are) into a single number using the harmonic mean (Equation (3)):
$$F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{3}$$
Sensitivity was evaluated, indicating the models' ability to predict the TP of each available class (Equation (4)):
$$\mathrm{Sensitivity} = \frac{TP}{TP + FN} \tag{4}$$
The Specificity metric was also calculated, since it indicates the models' suitability for predicting the TN of each available class (Equation (5)):
$$\mathrm{Specificity} = \frac{TN}{TN + FP} \tag{5}$$
All these computations were made in the software R [40] (v.4.2.1) using the packages 'caret' [41] and 'earth' [42]. The previous formulae can also be found in [41].
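The metrics described above can all be derived from a confusion matrix. The sketch below is a minimal, library-free illustration (the study itself used the R package 'caret'); the function name and the example matrix are hypothetical, and the matrix is assumed to hold true classes in rows and predicted classes in columns.

```python
def classification_metrics(cm):
    """Compute overall and per-class metrics from a confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    Assumes every class is both present and predicted at least once,
    otherwise some ratios below would divide by zero.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    row_totals = [sum(cm[i]) for i in range(n)]                       # true counts
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]  # predicted counts

    accuracy = sum(cm[i][i] for i in range(n)) / total                # Pr(a)
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2  # Pr(e)
    kappa = (accuracy - expected) / (1 - expected)

    per_class = []
    for k in range(n):
        tp = cm[k][k]
        fn = row_totals[k] - tp
        fp = col_totals[k] - tp
        tn = total - tp - fn - fp
        precision = tp / (tp + fp)
        sensitivity = tp / (tp + fn)  # also called recall
        specificity = tn / (tn + fp)
        f1 = 2 * precision * sensitivity / (precision + sensitivity)
        per_class.append({"precision": precision, "sensitivity": sensitivity,
                          "specificity": specificity, "f1": f1})
    return accuracy, kappa, per_class


# Hypothetical two-class confusion matrix (illustration only, not study data).
example_cm = [[40, 10],
              [5, 45]]
accuracy, kappa, per_class = classification_metrics(example_cm)
print(f"accuracy={accuracy:.2f}, kappa={kappa:.2f}, F1(class 0)={per_class[0]['f1']:.3f}")
```

In a multi-class problem such as the tomato case, each class is treated in turn as the class of interest (positive) against all the others, which is what the per-class loop does.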
Table 2. List of the Spectral Vegetation Indices (VIs) computed in this work, mentioning their formula and reference (when available).

Vegetation Indices
Formula Ref.

ARTMO Software
The Automated Radiative Transfer Models Operator (ARTMO) is a modular MATLAB GUI toolbox initially developed for automating the simulation of radiative transfer models (RTMs) [79]. This comprehensive toolbox integrates various leaf and canopy RTMs alongside essential tools for semi-automated retrieval of biophysical and biochemical variables. ARTMO (v.3.31) is connected to a relational SQL database management system (MySQL, version 5.5 or 5.6; local installation required) for storing all generated data (i.e., simulations and statistical results) and trained models along with metadata, enabling the re-execution of earlier models or simulations. An initial version of the machine learning classification algorithm (MLCA) toolbox was introduced in version 3.29, and this functionality has been expanded in subsequent releases.
The current official version (v.1.02) of the MLCA toolbox incorporates 20 supervised MLCAs belonging to the principal families of supervised classifiers, predominantly affiliated with machine learning methodologies. Note that this initial version is limited to pixel-based classifiers, implying that object-based, subpixel-based, or scene-based deep learning classifiers have not been incorporated. Nevertheless, pixel-based classifiers enable the learning and characterization of intricate spectra.
Supervised classifiers are traditionally classified into parametric and nonparametric methods. Parametric methods are grounded in probabilistic theories, modeling the decision boundaries between classes from a fixed number of parameters, independent of the number of samples, employing global criteria for classification [80]. By contrast, nonparametric methods guide the class grouping based on the digital number (single band/image) or spectral data (multi- and hyperspectral reflectance or transmittance). The spectral value distribution is independent and focused on the local data structure, requiring a substantial set of samples for the classification process [81].
Arguably, one of the most promising nonparametric classifiers is the Gaussian process (GP) classification. GPs are stochastic processes where each random variable follows a multivariate normal distribution [82]. The goal of GP classification is to learn a mapping from the input data (e.g., spectral reflectance or transmittance values) to their corresponding classification label (e.g., plant health group type), which can then be used on new, unseen spectral measurements (Figure 2). When the GP is developed with kernel methods [83], it allows mapping the original data into a possibly infinite-dimensional space [84]. In this space, the input-output relationship can be better estimated, as the GP can consider more complex and flexible functions than linear models. This enables the GP to capture intricate relationships between the spectral data and the crop health phenotype, leading to more accurate classification results. Due to its probabilistic framework, the GP provides an uncertainty estimate per sample. This means that for each spectral measurement, the GP can provide a measure of how confident it is in its classification prediction. This uncertainty information can inform decision making, allowing users to be more or less confident in the inferred classification label (e.g., see [15]).
Machine Learning Approaches-Gaussian Process Classification (GPC-BAT)
The GP has another advantage of being capable of using more sophisticated kernel functions than the standard linear kernel or the radial basis function (RBF) kernel (Equation (6)), which can be optimally tuned through likelihood maximization:
$$k_{\mathrm{RBF}}(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^2}{2\sigma^2}\right) \tag{6}$$
where $x_i$ and $x_j$ represent two spectra, $\sigma$ is the kernel width (variance) parameter, and $\lVert x_i - x_j \rVert$ is the Euclidean distance between the two spectra $x_i$ and $x_j$ [18,82].
In the classification case, the output values are discrete class labels (±1); this causes the likelihood function to be non-Gaussian, so some approximation must be performed [85]. We chose the Laplace approximation, which performs well and is robust. One notable kernel function is the automatic relevance determination (ARD) kernel (Equation (7)):
$$k_{\mathrm{ARD}}(x_i, x_j) = \exp\left(-\frac{1}{2}(x_i - x_j)^{\top}\Sigma^{-1}(x_i - x_j)\right) \tag{7}$$
where $\Sigma$ is a diagonal matrix whose diagonal entries are the parameters $\sigma_1^2, \ldots, \sigma_d^2$ that weight each input dimension [18,82]. This kernel covariance function requires one parameter per input feature; it can be optimized under the same framework, and it provides a band ranking based on the optimal parameter values.
Following the rationale presented in Verrelst et al. (2016) [18] for GP regression, a GPC-based band ranking feature has been implemented into a so-called band analysis tool (i.e., GPC-BAT). In short, we employ a simplified and general iterative backward greedy algorithm to identify the most informative bands. This algorithm assesses the impact of each band on the prediction error in the context of the remaining bands. At each iteration, the algorithm removes the band with the highest $\sigma_b$ and re-trains the GPC model with the remaining bands. This is referred to as sequential backward band removal (SBBR). The SBBR algorithm is analogous to recursive feature elimination (RFE), a technique earlier presented with support vector machines or random forests. In RFE, the feature with the lowest ranking score is eliminated, iteratively removing insignificant features until only the most relevant ones remain (e.g., [86][87][88][89]). This SBBR approach allows us to pinpoint the bands that most strongly influence the prediction of our target classes. These bands provide valuable insights into the spectral characteristics that best capture the sensitivity of the classes of interest, e.g., healthy and diseased groups.
A principal application of GPC-BAT is identifying the minimum number of bands needed to retain robust results, together with the most sensitive wavelengths. Accordingly, the output GUI delivers the following band analysis outputs: (1) overall accuracy (OA) statistics as a function of the number of bands, plotted over the sequentially removed bands until only two bands are left, and (2) the associated wavelengths selected by the tool (Figure 3).
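The difference between a single-width RBF kernel and an ARD kernel with one width per band can be illustrated numerically. The sketch below assumes the standard squared-exponential forms, exp(-||x_i - x_j||² / (2σ²)) for the RBF kernel and exp(-½ Σ_b ((x_i,b - x_j,b) / σ_b)²) for the ARD kernel; these forms, the function names, and the toy spectra are assumptions for illustration, and the exact parameterization used in the ARTMO toolbox may differ.

```python
import math


def rbf_kernel(x_i, x_j, sigma):
    """Squared-exponential kernel with one shared width for all bands."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-squared_distance / (2 * sigma ** 2))


def ard_kernel(x_i, x_j, sigmas):
    """Squared-exponential kernel with one width sigma_b per band (ARD)."""
    scaled = sum(((a - b) / s) ** 2 for a, b, s in zip(x_i, x_j, sigmas))
    return math.exp(-0.5 * scaled)


# Two hypothetical three-band spectra (illustration only).
spectrum_a = [0.10, 0.40, 0.55]
spectrum_b = [0.12, 0.10, 0.50]

# With equal widths, the ARD kernel reduces to the RBF kernel.
k_rbf = rbf_kernel(spectrum_a, spectrum_b, sigma=0.5)
k_ard_equal = ard_kernel(spectrum_a, spectrum_b, sigmas=[0.5, 0.5, 0.5])

# A very large width on band 2 makes the kernel almost ignore that band.
k_ard_band2_off = ard_kernel(spectrum_a, spectrum_b, sigmas=[0.5, 1e6, 0.5])

print(k_rbf, k_ard_equal, k_ard_band2_off)
```

Under this form, a band whose optimized width is large contributes little to the similarity between spectra, which is why the optimized widths can be read as a band relevance ranking.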
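The greedy backward loop described above can be sketched as follows. This is not the ARTMO implementation: the function `sequential_backward_band_removal` and its `fit_and_rank` callback are hypothetical names, and the toy callback merely stands in for re-training a GPC with an ARD kernel, returning fixed per-band widths and a made-up accuracy so that the control flow can be followed.

```python
def sequential_backward_band_removal(bands, fit_and_rank, min_bands=1):
    """Greedy backward elimination of spectral bands.

    fit_and_rank(active_bands) must return (accuracy, sigma), where sigma maps
    each active band to its ARD width; the band with the highest width is
    treated as the least informative and removed. Returns the list of
    (active_bands, accuracy) pairs recorded at every iteration.
    """
    active = list(bands)
    history = []
    while True:
        accuracy, sigma = fit_and_rank(active)  # re-train on the remaining bands
        history.append((list(active), accuracy))
        if len(active) <= min_bands:
            break
        least_informative = max(active, key=lambda band: sigma[band])
        active.remove(least_informative)
    return history


# Toy stand-in for GPC training: fixed, invented widths per band (nm).
TOY_SIGMA = {450: 0.8, 550: 5.0, 680: 0.5, 700: 1.2, 800: 9.0}


def toy_fit_and_rank(active_bands):
    """Hypothetical placeholder; a real version would fit a GPC model here."""
    sigma = {band: TOY_SIGMA[band] for band in active_bands}
    # Invented accuracy proxy: share of total "relevance" (1 / width) retained.
    retained = sum(1 / TOY_SIGMA[band] for band in active_bands)
    accuracy = retained / sum(1 / width for width in TOY_SIGMA.values())
    return accuracy, sigma


history = sequential_backward_band_removal([450, 550, 680, 700, 800], toy_fit_and_rank)
for active_bands, accuracy in history:
    print(len(active_bands), active_bands, round(accuracy, 3))
```

Inspecting the recorded accuracy as a function of the number of remaining bands is what allows the smallest band set with acceptable accuracy to be chosen; the `min_bands` argument sets where the removal stops.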

Spectral Vegetation Indices-Parametric Approach
This section presents the predictive classification results of the approach combining the calculation of different VIs (Table 3) followed by the computation of the FDA model, which allowed (i) the classification of tomato leaflet spectral samples collected on healthy (Control, Con) and both inoculated diseased tissues with Pst and Xeu bacteria; and (ii) the classification of kiwi leaf spectral samples measured on nonsymptomatic (NS) and symptomatic (S) diseased tissues.

Tomato Disease in Walk-In Chamber
Table 3 presents an overall accuracy of 63.30% (proportion of correctly classified instances) and a kappa coefficient of 44.70% (which indicates agreement between the predicted and actual classes beyond random occurrence) for the validation dataset (detailed information about the training results is presented in Appendix A Table A1). The model metric analysis per class revealed that samples inoculated with Pst bacteria presented good precision (75.66%), sensitivity (65.60%), specificity (90.52%), and F1-score (70.27%) (Table 3). These metrics indicate the model's suitability for making correct predictions for both classes (healthy vs. inoculated), correctly identifying instances of these classes, distinguishing samples that do not belong to these classes, and achieving a good balance between capturing true positives and avoiding false positives (Table 3). The Control class presented a higher sensitivity, highlighting the model's efficiency in correctly identifying healthy samples. In contrast, the Xeu class had a reduced precision (52.44%), sensitivity (48.56%), and F1-score (50.43%), which translates into a higher likelihood of false positives in Xeu prediction, a greater chance of missing a considerable proportion of positive instances of this class, and an imbalance between capturing true positives and avoiding false positives (Table 3).
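As a quick consistency check, the F1-scores reported above can be recomputed from the reported precision and sensitivity values using the harmonic mean. The short sketch below uses only the figures quoted in the text; the function name is chosen for this example.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (sensitivity)."""
    return 2 * precision * recall / (precision + recall)


# Precision and sensitivity values as reported for the validation set.
pst_f1 = f1_from_precision_recall(0.7566, 0.6560)
xeu_f1 = f1_from_precision_recall(0.5244, 0.4856)

print(f"Pst F1-score: {pst_f1:.2%}")
print(f"Xeu F1-score: {xeu_f1:.2%}")
```

Both values agree with the reported F1-scores of 70.27% (Pst) and 50.43% (Xeu).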
Therefore, although the FDA model exhibits good capabilities in differentiating samples measured in healthy (Control) and Pst-diseased tissues, there is potential for improving the identification of spectra captured on Xeu-inoculated tissues (Table 3).
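As a worked check of how the per-class metrics relate, the F1-score is the harmonic mean of precision and sensitivity. The short sketch below applies that standard definition to the Pst and Xeu values reported for the FDA validation set; the function name is illustrative only.

```python
def f1_score(precision, sensitivity):
    """Harmonic mean of precision and sensitivity (both given in percent)."""
    return 2 * precision * sensitivity / (precision + sensitivity)

# (precision %, sensitivity %) as reported for the FDA validation set
reported = {"Pst": (75.66, 65.60), "Xeu": (52.44, 48.56)}

for name, (p, s) in reported.items():
    print(name, round(f1_score(p, s), 2))
# Pst 70.27
# Xeu 50.43
```

Both values agree with the F1-scores reported for these classes.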
A fraction of these false positives may be accounted for by the sensitivity gap between digital and visual phenotyping methods. These results are based on visual phenotyping, which only allows samples with visible symptoms to be identified. However, Xeu-inoculated plants may present early changes in their optical properties long before the appearance of symptoms, which could justify some of the false positive cases recorded.
CM results indicate that samples collected on tissues inoculated with Xeu were more challenging for the model to predict than those collected on healthy or Pst-inoculated tissues, presenting a higher number of wrong classifications than the remaining classes studied (Table 4, for both training and validation sets). In fact, the model only correctly identified 49.63% of the total Xeu samples. The majority of the remaining Xeu samples were wrongly assigned to the Control class. This can be related to the fact that macroscopic symptoms only appeared 8 DAI, resulting in a high number of nonsymptomatic samples (presenting a phenotype similar to the healthy ones), whose spectral signature is more similar to healthy samples than to Pst-diseased ones. In contrast, approximately 74.17% of the total healthy (Control) samples and 67.13% of the total number of samples inoculated with Pst were accurately classified (Table 4).
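For reference, the overall accuracy, kappa coefficient, and per-class metrics discussed above all derive from the confusion matrix. The sketch below uses the standard definitions. The counts are hypothetical and are not the values of Table 4; rows are taken as true classes and columns as predicted classes.

```python
def classification_metrics(cm):
    """Return (overall accuracy, Cohen's kappa, per-class metrics) for a square CM."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    row_tot = [sum(row) for row in cm]                              # true counts
    col_tot = [sum(cm[i][j] for i in range(n)) for j in range(n)]   # predicted counts
    oa = sum(cm[i][i] for i in range(n)) / total
    # Agreement expected by chance, from the marginal totals
    pe = sum(row_tot[i] * col_tot[i] for i in range(n)) / total ** 2
    kappa = (oa - pe) / (1 - pe)
    per_class = []
    for k in range(n):
        tp = cm[k][k]
        fn = row_tot[k] - tp
        fp = col_tot[k] - tp
        tn = total - tp - fn - fp
        per_class.append({
            "precision": tp / (tp + fp),
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
        })
    return oa, kappa, per_class

# Hypothetical counts for three classes (Con, Pst, Xeu); NOT the study's data
cm = [[50, 5, 15],
      [8, 45, 12],
      [20, 10, 35]]
oa, kappa, per_class = classification_metrics(cm)
print(round(oa, 4), round(kappa, 4))
print({k: round(v, 4) for k, v in per_class[0].items()})
```

Because kappa subtracts the agreement expected by chance, it is lower than the overall accuracy for the same matrix, which is the pattern seen in the reported results.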

Table 4. Confusion Matrix results of the Flexible Discriminant Analysis (FDA) using the Vegetation Indices (VIs) computed in the hyperspectral data collected on tomato leaflet tissues (transmittance) and kiwi leaf tissues (reflectance). The predicted samples of each class (column) that were correctly classified for each true class (row) for the spectral data collected on tomato leaflet tissues (left) and kiwi leaf tissues (right) are shown. The classes used in the tomato case study were control samples (healthy, Con), samples inoculated with Pseudomonas syringae pv. tomato (Pst), and samples inoculated with Xanthomonas euvesicatoria (Xeu). In turn, the binary classes nonsymptomatic (NS) and symptomatic (S) were applied to the kiwi case study.

Kiwi Bacterial Canker Disease in the Field
In turn, when the VIs-based modeling approach was applied to the binary classification of nonsymptomatic (NS in Table 3) and symptomatic (S in Table 3) samples of the validation set taken on kiwi leaves in field conditions, the overall accuracy achieved was 71.33%. The kappa value of 41.73% demonstrates the model's effectiveness in classifying the two classes (results for the training set can be seen in Supplementary Materials Table S1). The model metrics for the spectra collected on nonsymptomatic and symptomatic tissues revealed that the model acceptably identifies a significant proportion of true positive samples of these classes, classifies samples belonging to both classes, and has a good ability to correctly identify instances that do not belong to the class under analysis, along with a good balance between finding positive cases and avoiding false positives (Table 3).
The CM values show that the model has less difficulty predicting nonsymptomatic samples, correctly classifying 78.28% of them, compared with 67.71% of the symptomatic samples (Table 4).

GPC-BAT Performance with Original Training Data and Further Validation
This section presents the predictive classification results of the approach using the ARTMO GPC-BAT tools, which also allowed (i) the classification of tomato leaflet spectral samples collected on healthy (Control, Con), Pst-inoculated, and Xeu-inoculated tissues and (ii) the classification of kiwi leaf spectral samples measured on nonsymptomatic (NS) and symptomatic (S) tissues.

Tomato Diseases in Walk-In Chamber
For the selected 23 wavelengths (from the 51 available) for the tomato case study and 577 wavelengths (from the 611) for the kiwi case study, the models presented the best classification metrics (Table 5). The prediction for discriminating the different classes defined for the tomato dataset achieved a maximum overall accuracy of 70.46% and a kappa of 55.60%. Furthermore, metric evaluation per class provides insights into the model's performance in distinguishing between healthy (Con in Table 5) and diseased instances (Pst and Xeu in Table 5). In terms of Precision (80.72 vs. 75.24%), Specificity (88.75 vs. 89.14%), and F1-values (73.87 vs. 77.31%), the Control and Pst-inoculated classes presented good metric levels. These results highlight the model's accuracy in identifying positive predictions for both classes, distinguishing samples that do not belong to these respective classes, and achieving a good balance in capturing true positives and avoiding false positives (Table 5). The Xeu-inoculated class, compared with Pst, showed lower values of Precision (55.93 vs. 75.24%) and F1-score (60.04 vs. 71.31%), indicating a higher likelihood of false positives in its prediction and an imbalance in assessing true positives and avoiding false positives. This class had a good Specificity value, similar to the other two, suggesting that the model performs well in distinguishing spectral measurements that do not belong to it (Table 5). Regarding Sensitivity, the Pst class presented the highest level, indicating the model's effectiveness in correctly identifying instances of this class. However, the values for the two remaining classes were lower, implying that the model may miss a notable proportion of positive instances in the Control and Xeu classes (Table 5). The CM also demonstrates that 81% of the healthy and 75% of the Pst-inoculated samples were correctly classified. Similar to the previous approach (based on VIs), the GPC-BAT also faced more difficulty in accurately classifying the Xeu-inoculated samples, correctly predicting only 56% of the cases (Table 6). Hence, while the model demonstrates strong capabilities in distinguishing samples collected on Control (healthy) tissues and those measured on Pst-diseased tissues, it needs to be enhanced to accurately identify spectra collected on Xeu-diseased tissues (Table 5).

Kiwi Bacterial Canker Disease in the Field
The binary classification performed for the kiwi leaf hyperspectral reflectance measurements achieved a maximum overall accuracy of 75.40% and a kappa of 49.95%. The model proved effective in both class predictions, allowing the distinction between nonsymptomatic (NS in Table 5) and symptomatic (S in Table 5) assessments collected on kiwi leaves in field conditions. The model metrics for the spectra collected on nonsymptomatic and symptomatic tissues revealed that the model effectively identifies a significant proportion of true positive samples of these classes, has a good ability to correctly classify instances that do not belong to them, and achieves a good balance between identifying positive instances and avoiding false positives (Table 5).
The CM demonstrates that 79% of the nonsymptomatic samples were accurately classified, along with 70% of the symptomatic samples (Table 6).
In neither case study did the GPC model perform best when all the spectral bands were used. In the tomato case study, the overall accuracy and kappa values when all spectral features were used were lower, reaching 69.06% and 53.48%, respectively. Likewise, the kiwi case study's values were 61.90% and 22.50% for accuracy and kappa, respectively. Furthermore, for the kiwi case study, the outcomes were more unstable when compared to the model developed with the wavelengths chosen by BAT, presenting a higher standard deviation (SD) and processing time (Table 7). In the tomato case study, the SD value was lower when all the features were used, but the processing time was almost 40% longer (Table 7). In terms of selected sensitive wavelengths, when the spectral data collected on kiwi were used, GPC predominantly selected wavelengths greater than 800 nm (26 of 34 wavelengths); these wavelengths (>800 nm) did not prove to be important in the construction of the selected VIs. Only PVIhyp (800, 1000 nm) selected wavelengths at 800 nm, but this VI has a very modest representation (24.46%) in the distinction between the nonsymptomatic and symptomatic classes.
For the tomato case study, the FDA selected five VIs whose formula integrated three wavelengths: PSRI, CCCI, EVI, SIPI, and GARI (Table 8).In contrast, in the kiwi case study, all the chosen VIs presented only two wavelengths (Table 8).
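To make the notion of a three-wavelength VI concrete, the sketch below computes two of the indices named above, PSRI and SIPI, from a single spectrum. The formulas are the commonly cited formulations; the exact band positions used in this study's VI table may differ. The helper names and the spectrum values are hypothetical.

```python
def band(spectrum, target_nm):
    """Value at the measured wavelength closest to target_nm."""
    nearest = min(spectrum, key=lambda wl: abs(wl - target_nm))
    return spectrum[nearest]

def psri(spectrum):
    """Plant Senescence Reflectance Index, commonly (R680 - R500) / R750."""
    return (band(spectrum, 680) - band(spectrum, 500)) / band(spectrum, 750)

def sipi(spectrum):
    """Structure Insensitive Pigment Index, commonly (R800 - R445) / (R800 - R680)."""
    r800 = band(spectrum, 800)
    return (r800 - band(spectrum, 445)) / (r800 - band(spectrum, 680))

# Hypothetical leaf spectrum: wavelength (nm) -> reflectance or transmittance
spectrum = {450: 0.05, 500: 0.06, 680: 0.04, 750: 0.45, 800: 0.50}

print(round(psri(spectrum), 4), round(sipi(spectrum), 4))
```

The nearest-band lookup matters for sensors with coarse sampling, such as the ~10 nm resampled tomato data, where the nominal VI wavelength is rarely measured exactly.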
In GPC-BAT, the predictive power of each wavelength for the target variable is evaluated by the index sigma (σ). Accordingly, the lower the sigma value, the more important the feature is. Thus, the contribution of each spectral feature can be ranked through the quantification of this property. In the tomato case study, for the selected spectral features, it is possible to observe in Figure 4 that for the identification of the Control (Con) class, 440.21, 450.00, 460.31, 490.04, 510.41, 520.00, 640.35, 680.02, and 750.17 nm were the most relevant wavelengths, since they presented a lower sigma value (i.e., more weight in the model), leading to a greater distance from the plots' center (blue dots). The most significant features for predicting samples measured on tomato leaflet tissues inoculated with Pst were 440.21 and 450.04 nm (red dots). In classifying samples collected on leaflet tissues inoculated with Xeu, the 440.21, 450.04, 660.15, and 680.02 nm wavelengths were the most significant for prediction (orange dots). In turn, in the kiwi dataset, from the selected wavelengths, the identification of samples belonging to both classes was more influenced by the 597, 771, 791, 835, 869, 883, 902, and 999 nm features (Figure 5). The wavelengths selected as relevant for the classification of nonsymptomatic samples are the same as those chosen to predict the opposing symptomatic class (Figure 5). This tendency was expected, since this is a binary classification task, where the differences between the two categories in the study are expected to occur in coincident spectral features.
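The inverse relation between sigma and band importance can be illustrated with a squared-exponential kernel that has one length-scale per band. This sketch assumes, without confirmation from the text, that sigma plays the role of such a per-band length-scale. The sigma values are hypothetical; only the wavelengths are taken from the tomato results.

```python
import math

def ard_se_kernel(x, y, sigmas, scale=1.0):
    """Squared-exponential kernel with one length-scale (sigma) per band."""
    d2 = sum(((a - b) / s) ** 2 for a, b, s in zip(x, y, sigmas))
    return scale * math.exp(-0.5 * d2)

def rank_bands(wavelengths, sigmas):
    """Wavelengths ordered from most to least relevant (ascending sigma)."""
    return [w for w, _ in sorted(zip(wavelengths, sigmas), key=lambda p: p[1])]

wavelengths = [440.21, 640.35, 680.02]   # nm, from the tomato case study
sigmas = [0.5, 8.0, 1.0]                 # hypothetical values

base = [0.3, 0.3, 0.3]
shift_low_sigma = [0.8, 0.3, 0.3]    # same change applied to the low-sigma band
shift_high_sigma = [0.3, 0.8, 0.3]   # ... and to the high-sigma band

k_low = ard_se_kernel(base, shift_low_sigma, sigmas)
k_high = ard_se_kernel(base, shift_high_sigma, sigmas)
print(round(k_low, 4), round(k_high, 4))
print(rank_bands(wavelengths, sigmas))
```

The same change drops the similarity far more when it occurs in the low-sigma band, so the model's output is more sensitive to that band. This is why a lower sigma is read as higher relevance.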

Biological Interpretation of Sensitive Wavelengths
The wavelengths chosen in both case studies present an interesting biological significance, since they coincide with the spectral absorption regions of several photosynthetic pigments, namely: (i) chlorophylls, in the blue region around 430 to 480 nm and the red region from 640 to 680 nm; (ii) carotenoids, including xanthophylls, in the blue-green region from about 400 to 550 nm; (iii) flavonoids, in the UV-blue wavelengths ranging from 315 to 500 nm, including anthocyanins, whose absorption band is from 500 to 550 nm; and (iv) pheophytins, whose absorption is located in the blue (430 to 480 nm) and red (640 to 680 nm). Furthermore, the selected wavelength features also overlap the NIR spectral range associated with interactions between light and leaf water content and between light and leaf structural components (such as cellulose and lignin). Thus, it is possible to infer that all these pigments and structural components are affected by the action of Psa in kiwi leaves and of Pst and Xeu in tomato leaves, given the relevant spectral variance found among the classes under study at the wavelengths corresponding to their spectral absorption regions.
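The pigment absorption regions listed above can be encoded as a simple lookup, which makes it easy to check which pigments a selected wavelength may relate to. The ranges below are those stated in the text; the dictionary and function names are illustrative.

```python
# Absorption regions (nm) as listed in the text above
ABSORPTION_REGIONS_NM = {
    "chlorophylls": [(430, 480), (640, 680)],
    "carotenoids": [(400, 550)],
    "flavonoids": [(315, 500)],
    "anthocyanins": [(500, 550)],
    "pheophytins": [(430, 480), (640, 680)],
}

def pigments_at(wavelength_nm):
    """Pigments whose listed absorption region contains the wavelength."""
    return [
        name
        for name, regions in ABSORPTION_REGIONS_NM.items()
        if any(lo <= wavelength_nm <= hi for lo, hi in regions)
    ]

for wl in (440.21, 520.00, 660.15, 835):
    print(wl, pigments_at(wl))
```

A wavelength in the NIR, such as 835 nm, falls outside every pigment region in this table, consistent with its association with water content and structural components instead.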

Discussion
This work analyzed two methodologies for performing bacterial disease classification in tomato (assay performed in controlled environmental conditions, using a transmittance-based sensor) and kiwi (assay made in the field, using a reflectance-based sensor) plants. One approach combines the calculation of different VIs described in the literature and a machine learning algorithm with a built-in FS method (FDA). Another approach uses the two distinct hyperspectral datasets combined through the ARTMO GPC-BAT. In both approaches, the most relevant spectral wavelengths for class detection were identified and linked to their biological significance. The first approach uses VIs developed according to the physiological information of plants. In contrast, the second approach constitutes a data-driven approach. The tomato experiment (three classes: Control vs. Pst and Xeu), compared to the kiwi experiment (binary), represents a more complex classification model. The sensor used in the tomato experiment provides spectral information from the visible spectrum up to 800 nm, with wavelength resampling to intervals of approximately 10 nm, whereas the sensor used in the field experiment with kiwi provides data from the visible spectrum up to 1000 nm, with a finer step of 1 nm.
It is possible to observe that both approaches allowed the identification of the different classes in the study, using the tomato and kiwi datasets. Nevertheless, the strategy involving the application of VIs as an FS technique showed lower classification metrics than the methodology that used the GPC-BAT, which may not be high enough to justify their future application. This may be related to the fact that the VIs used, despite being well established in the literature, were developed for specific plant traits and situations differing from the bacterial plant disease diagnosis problem in the study. Furthermore, since they are calculated using only a few of the available spectral features, they may not use all the information in narrowband, high-dimensional hyperspectral data [5]. Hence, this modeling approach must be enhanced, aiming at a more effective class discrimination. Exploring distinct wavelength combinations in the VIs, using different FS models (e.g., Random Forest), and performing hyperparameter tuning are some of the strategies that may improve performance.
In contrast, GPC-BAT considered all the available spectral features and performed a selection according to their relevance for identifying the class in the study.
The GPC-BAT, when applied to the analysis encompassing all wavelengths captured by the hyperspectral sensors, exhibited lower classification metrics than the VIs approach in both the tomato and kiwi case studies. For instance, in the kiwi study, where the hyperspectral data had a narrowband sampling interval of 1 nm, the model's performance using all available features was as poor as when only three wavelengths were utilized (data not displayed). This may be related to hyperspectral data being superimposed in the recorded spectra at different interference scales [25] (i.e., the data collected correspond to several structural and metabolic plant compounds present in the area measured) and to the significant amount of redundant information embedded in contiguous wavelengths. As a result, only a few specific spectral variables are relevant to identify diseased plant tissues [90,91].
In this regard, Feature Selection or spectral reduction techniques are recommended to overcome this hurdle. In this work, two approaches were analyzed, namely an FDA algorithm and the BAT of ARTMO. Given the selected wavelengths, both studied strategies (VI-based vs. GPC-BAT) presented comparable results for both case studies, notably when dealing with the more complex tomato dataset. Equivalent wavelengths were found in the blue (450 nm), green (550 nm), and red-edge regions (680, 690, 700, and 750 nm). VIs further highlighted the 800 nm wavelength in the NIR region. In the kiwi case study, the features selected by the two algorithms were similar but not entirely coincident, namely in the green region (where the VIs selected 530, 553, and 554 nm, and the GPC-BAT chose 544 and 597 nm), the red-edge region (the VIs identified 670, 677, 700, 705, 730, and 750 nm as relevant, whereas GPC-BAT only considered 754 nm), and the NIR region (the VIs picked fewer wavelengths, namely 780, 800, 994, and 1000 nm, than the GPC-BAT, which took into account 771, 790, 791, 795, 825, 835, 839, 845, 850, 851, 860, 864, 866, 869, 881, 883, 888, 893, 902, 905, 906, 928, 932, 939, 945, 947, 973, 980, 993, 999, and 1006 nm). Only the VIs presented features in the blue region (400 and 450 nm). Thus, it is important to note that, although the two modeling strategies were applied to different types of spectra (reflectance vs. transmittance), with different spectral resolutions (~10 nm vs. 1 nm) and different pre-processing methods, they selected similar wavelengths.
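The degree of coincidence between the two kiwi wavelength sets can be quantified with a simple tolerance match, sketched below. The wavelength lists are those given in the text; the tolerance values (5 and 10 nm) and the function name are assumptions chosen for illustration.

```python
# Kiwi case study wavelengths (nm) as listed in the text
VI_BANDS = [400, 450, 530, 553, 554, 670, 677, 700, 705, 730, 750,
            780, 800, 994, 1000]
GPC_BAT_BANDS = [544, 597, 754, 771, 790, 791, 795, 825, 835, 839, 845,
                 850, 851, 860, 864, 866, 869, 881, 883, 888, 893, 902,
                 905, 906, 928, 932, 939, 945, 947, 973, 980, 993, 999, 1006]

def shared_bands(bands_a, bands_b, tol_nm):
    """Bands of bands_a that lie within tol_nm of at least one band of bands_b."""
    return [w for w in bands_a if any(abs(w - v) <= tol_nm for v in bands_b)]

print(shared_bands(VI_BANDS, GPC_BAT_BANDS, tol_nm=5))
# [750, 800, 994, 1000]
print(shared_bands(VI_BANDS, GPC_BAT_BANDS, tol_nm=10))
# [553, 554, 750, 780, 800, 994, 1000]
```

Under these assumed tolerances the agreement is concentrated in the red-edge and NIR regions, with the green region matching only at the looser 10 nm tolerance.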
These findings present biological significance, since the relationship between the plant host and the pathogen causes changes in photosynthetic pigment content, water levels, and structural composition (e.g., cellulose and lignin levels) [92]. This ultimately leads to modifications in the tissues' spectral behavior. In particular, the variance in spectral characteristics among diseased leaves infected by distinct bacteria could be linked to the generation of unique molecules by each pathogen, which may influence the spectral signature of the host. For instance, Pst bacteria produce a specific phytotoxin named coronatine that induces changes in chlorophyll fluorescence (by altering photosystem II, PSII), impacting tomato plant tissues' absorption and scattering of light [93]. Moreover, the host tomato plant can activate diverse defense responses upon encountering a pathogen, initiating a cascade of biochemical and molecular reactions that further contribute to spectral modifications in the visible wavelength ranges. Phytoalexins (e.g., flavonoids) serve as an example, with their production hypothesized to be linked to an increase in the spectral reflectance of plants in the VIS range [94].
Previous studies performed by our team also reported similar outcomes. In particular, one study developed a methodology for the early diagnosis of two bacterial diseases of tomato, caused by Pst and Xeu bacteria, using hyperspectral transmittance data and an applied predictive modeling approach [19]. A total of 3478 spectral measurements were normalized and subjected to a Linear Discriminant Analysis (LDA) aiming to reduce data dimensionality. This algorithm highlighted similar relevant wavelengths in the blue, green, and red spectral regions. Furthermore, a modeling approach using a Support Vector Machine (SVM) was applied for spectral classification. It achieved an accuracy of 100% for samples measured on tissues inoculated with Pst and 74% for tissues inoculated with Xeu when samples collected before symptom appearance were used [19]. Likewise, another study performed in a kiwi orchard allowed the identification of hyperspectral reflectance samples collected on nonsymptomatic and symptomatic Psa-diseased leaf tissues. Several methodologies involving different Feature Selection techniques combined with different Machine Learning algorithms were explored, and the one combining a stepwise forward variable selection (SFVS) approach followed by the computation of an SVM algorithm was selected, achieving an overall accuracy of 85%. Similar to the other strategies explored, the SFVS selected the blue, green, and NIR regions as the most relevant for sample classification.
Furthermore, other researchers reported classification findings similar to the ones found in the present work, namely when studying different tomato and kiwi diseases based on modeling hyperspectral spectroscopy data. The suitability of a portable hyperspectral spectrometer combined with various algorithms for FS and data modeling for early nondestructive diagnosis of tomato bacterial wilt disease (Erwinia tracheiphila) in leaves was explored [95]. The model presenting the highest evaluation metrics (overall accuracy of 90.70%) applied Genetic Algorithms for FS and SVM to predict classification. The Simple Ratio Pigment Index (SRPI) was the VI found to have the highest contribution in the developed model. It considers the 430 and 680 nm wavelengths and is sensitive to leaf nitrogen content and photosynthetic efficiency (and is similar to our findings) [95].
Another study using tomato plants explored the usage of a portable high-resolution spectroradiometer combined with VIs, Principal Component Analysis (PCA), and a K-nearest neighbor (KNN) classification model for the diagnosis of late blight (Phytophthora infestans), target spot (Corynespora cassiicola), and bacterial spot (Xanthomonas euvesicatoria) [96]. They successfully identified the spectral samples collected on detached tomato leaflets with an accuracy reaching the 100% level even in nonsymptomatic stages (Error Rate of 9.50%), when the 15 VIs selected by PCA in the first principal component (PC) were considered. Interestingly, it is possible to observe that when 30 VIs selected by PCA and belonging to the first PC were used, the model showed a lower accuracy value (65.20%) and a higher error rate (28.6). In terms of VIs, the ones selected presented similar features to the ones found in our study (such as the 680 and 800 nm used in the Normalized Difference Index, the Simple Ratio, and the Structure Insensitive Pigment Index, among others) [96].
Hyperspectral VIS-NIR spectroscopy was, moreover, used for the nondestructive early diagnosis of tomato chlorosis virus (ToCV) [97]. They used a Neighborhood Component Analysis (NCA) for performing FS and for selecting the most relevant VIs in the study, along with two ML models for data modeling (XY-fusion network, XY-F, and Multilayer Perceptron with Automated Relevance Determination, MLP-ARD). The best overall accuracy (92.1% before outlier removal and 100% after outlier removal) was obtained using MLP-ARD. In terms of relevant VIs, it is possible to observe that wavelengths such as 550, 670, 700, 720, 740, and 800 nm, among others, were present in the most notable VI formulae (such as the Anthocyanin Reflectance Index, ARI; Pigment Specific Simple Ratio, PSSR; Red Edge Inflection Point, REIP; Simple Ratio, SR; and Vogelmann Index, VOG). In turn, the 15 wavelengths selected by the NCA were mostly located in the blue (402.20 to 449.20 nm), green (556.40 to 566.40 nm), red-edge (676.40 to 726.30 nm), and NIR regions (862.10 nm). These outcomes coincided with our observations [97].
The feasibility of multispectral data for predicting kiwifruit decline (probably caused by Phytophthora spp. and Phytopythium spp.) in diseased orchards was also tested [98]. Multispectral data included the 550, 660, and 790 nm spectral features and, when combined with K-means clustering, allowed the determination of whether the vigor of kiwi plants was affected by the disease, with 73% (or more) Accuracy and 82% Precision. These results are, thus, in line with ours, also identifying the green, red, and NIR regions as relevant for estimating plant biophysical traits [98].
The present outcomes demonstrate that hyperspectral transmittance and reflectance spectroscopy can identify healthy and diseased tissues in crops such as tomato (herbaceous) and kiwi (woody), in laboratory or field conditions. Further research is advised to explore whether specific host-pathogen interactions require customized modeling approaches to be predicted or whether it is possible to elaborate a unified strategy that allows bacterial disease assessment. Nevertheless, it should be taken into consideration that model comparison may be challenging due to several factors: pathogen species in the study; the occurrence of specific host-pathogen interactions; the number of spectral points measured; the environmental conditions where the data are collected; and the stage of the disease cycle where the spectral assessments are made, among others. Furthermore, hyperspectral spectroscopy sensors present a relatively low Technology Readiness Level (TRL), indicating that these sensors have a large margin for improvement. In this regard, effective FS strategies or Dimensionality Reduction approaches may be developed and enhanced to identify specific spectral regions valuable for performing plant disease diagnosis, which may be incorporated in multispectral sensors involving lower production and data processing costs. Moreover, the application of hyperspectral imaging sensors should also be explored in future studies, since it would offer a comprehensive spectral overview across spatial dimensions, allowing the capture of spectral signatures from several pixels in the image. This spatial information enables the identification of spatial variations in spectral features within the plant tissue sample, which can provide valuable insights into the spatial distribution of disease symptoms or other physiological changes. However, hyperspectral imaging typically requires more complex data processing and may involve higher equipment costs compared to point-measurement techniques, which must be taken into consideration in the technique development.

Conclusions
This study aimed to explore and compare two distinct modeling approaches, namely the parametric Spectral Vegetation Indices (VIs) and the Gaussian Process Classification based on an Automated Spectral Band Analysis Tool (GPC-BAT), for diagnosing bacterial diseases in plants using hyperspectral sensing. Our analysis covered two experimental conditions, namely controlled conditions where tomato plants were used and field conditions where kiwi plants were analyzed, revealing insights into the performances of each approach.
Both the VI- and GPC-BAT-based approaches demonstrated potential for class discrimination in both experiments. Nevertheless, the modeling strategy applying VIs showed moderate success in distinguishing healthy and Pst-inoculated tomato tissues, with accuracy values ranging from 63% to 71%. In turn, the GPC-BAT strategy exhibited enhanced accuracy in classifying healthy and diseased tissues, achieving overall accuracy values extending from 70% to 75% in the tomato and kiwi case studies. This approach also revealed greater potential for discriminating nonsymptomatic from symptomatic tissues in kiwi plants in field conditions.
The identified spectral bands, particularly in the blue, green, red-edge, and NIR regions, align with the absorption regions of several photosynthetic pigments and plant structural components. These spectral modifications correlate with bacterial infections, affecting chlorophylls, carotenoids, flavonoids, pheophytins, and water content in plant leaves.
Hence, while VIs offer a simplistic yet moderately effective means of disease diagnosis, GPC-BAT, with feature reduction, shows promise in improving accuracy. However, further refinements are necessary, especially for the early-stage diagnosis of specific bacterial infections.
The identified wavelengths hold biological significance, suggesting a correlation between bacterial infections and alterations in photosynthetic pigments and leaf structural components. Future research could focus on refining and integrating these approaches to develop more robust and accurate diagnostic tools for various plant-pathogen interactions, thereby aiding in early disease detection and management in diverse agricultural settings.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/agronomy14030493/s1, Table S1: Cross-validation statistics of the Flexible Discriminant Analysis (FDA) for the training hyperspectral dataset collected on tomato leaflet tissues (transmittance) and kiwi leaf tissues (reflectance).

Figure 1. Experimental setup of the bacterial inoculation assay performed on tomato leaves (walk-in experimental conditions) and on kiwi leaves (field experimental conditions). In both cases, spectral measurements were performed on the adaxial side of the leaves. In the tomato case study, they were, furthermore, performed on the 4th, 5th, and 6th leaves of the assay plants, in a dark room, using a spectrometer (measuring transmittance data) combined with an optical fiber bundle with a reflection probe. A white LED was placed beneath each leaflet. In kiwi, spectra collection was performed at random points of randomly selected leaves, using a passive spectroradiometer (measuring reflectance data).

Figure 2. Conceptual framework for the predictive modeling approaches of bacterial diseases of tomato and kiwi.

Figure 4. Gaussian Process Classification sigma bands polar plot, representing the most significant wavelengths for each class in the study: Control samples (healthy, Con), samples inoculated with Pseudomonas syringae pv. tomato (Pst), and samples inoculated with Xanthomonas euvesicatoria (Xeu). The lower the sigma value, the greater the importance of the wavelength.

Figure 5. Gaussian Process Classification sigma bands polar plot, representing the most significant wavelengths for the binary class in the study: nonsymptomatic (NS) and symptomatic (S). The lower the sigma value, the greater the importance of the wavelength.

Table 1. Spectral data characterization of the measurements randomly performed on tomato leaflets (walk-in chamber-controlled conditions, transmittance) and kiwi leaves (in-field conditions, reflectance), showing the number of assessment dates, plants, and observations (classified according to visual phenotyping observations).

Table 3. Cross-validation statistics of the Flexible Discriminant Analysis (FDA) using the validation set of the Vegetation Indices (VIs) computed in the hyperspectral data collected on tomato leaflet tissues (transmittance) and kiwi leaf tissues (reflectance). Model metrics by class are also provided.

Table 5. Cross-validation statistics of the Gaussian Process Classification (GPC) models developed using hyperspectral data collected on tomato leaflet tissues (transmittance) and kiwi leaf tissues (reflectance).

Table 6. Confusion Matrix of the GPC model results, showing the predicted samples of each class (column) correctly classified for each true class (row) for the spectral data collected on tomato leaflet tissues and kiwi leaf tissues. The classes used in the tomato case study were Control samples (healthy, Con), samples inoculated with Pseudomonas syringae pv. tomato (Pst), and samples inoculated with Xanthomonas euvesicatoria (Xeu). The binary classes nonsymptomatic (NS) and symptomatic (S) were applied to the kiwi case study.

Table 7. Cross-validation statistics of the GPC models developed using hyperspectral data collected on tomato leaflet tissues (transmittance) and kiwi leaf tissues (reflectance).

Table 8. Vegetation Index (VI) importance for classification according to Flexible Discriminant Analysis (FDA). The importance value corresponds to the t-statistic value scaled to the maximum.