Modeling Hyperspectral Response of Water-Stress Induced Lettuce Plants Using Artificial Neural Networks

Abstract: Modeling the hyperspectral response of vegetables is important for estimating water stress through a noninvasive approach. This article evaluates the hyperspectral response of water-stress induced lettuce (Lactuca sativa L.) using artificial neural networks (ANN). We evenly split 36 lettuce pots into three groups: control, stress, and bacteria. Hyperspectral response was measured four times during 14 days of stress induction with an ASD Fieldspec HandHeld spectroradiometer (325–1075 nm). Both reflectance and absorbance values were calculated. Different biophysical parameters were also evaluated. The performance of the ANN approach was compared against other machine learning algorithms. Our results show that the ANN approach could separate the water-stressed lettuce from the non-stressed group with up to 80% accuracy at the beginning of the experiment. This accuracy improved by the end of the experiment, reaching up to 93%. Absorbance data offered better accuracy than reflectance data for modeling the stress response. This study demonstrated that it is possible to detect early stages of water stress in lettuce plants with high accuracy based on an ANN approach applied to hyperspectral data. The methodology has the potential to be applied to other species and cultivars in agricultural fields.


Introduction
Remote sensing is an important tool for the analysis of vegetation in agricultural fields because it allows farmers to obtain data faster than most traditional methods [1,2]. Changes in the spectral response of plants can be observed with equipment that records wavelength values, such as a spectroradiometer [3]. The analysis of the spectral signatures enables the identification of vegetation characteristics that would not be visually perceived in asymptomatic plants [4]. Because of this, many studies focusing on phytosanitary problems are based on spectroscopy. These problems include nutritional deficiencies [5][6][7], diseases [8], biomass [9], and, as in the present study, water stress [10].
The spectral response of a plant differs according to the species, which motivates the creation of different approaches to model it [11]. One of the problems commonly faced by farmers is water stress, which limits plant growth and compromises production [12]. Water stress causes chlorophyll variation and impairs other biological components, such as leaf area and root size [13]. The alteration of these components results in the appearance of visual symptoms, but they are difficult to identify due to their similarity to other problems, such as diseases, malnutrition, and cold damage [14]. An alternative that can identify changes caused by water stress alone is hyperspectral analysis [10,11].
The amount of leaf water is best estimated in the near-infrared and medium-infrared spectral regions [15]. In the near-infrared region, the spectral response is associated with the structural organization of intracellular molecules located in the mesophyll, which is affected as a consequence of stress [14]. Stress may also unbalance other physiological conditions and cause changes in visible and red-edge regions [15]. Changes in these spectral regions are associated with foliar pigmentation. Studies have sought to evaluate spectral behavior in these and other regions of the spectrum [16,17]. In addition, the absorbance curve has shown a better relationship with leaf pigmentation [18], which encourages the use of absorbance data to evaluate the negative effects caused by stress in plants.
Different approaches have been adopted to model the hyperspectral response when detecting water stress in crops. In rice, multivariate analysis models were applied to determine the spectral response of the plant under different stress levels [10]. In tomatoes, classification trees were used to separate the spectral indices that best corresponded to the induced water stress [12]. In winter wheat crops, through continuous analysis of hyperspectral data over time, it was possible to quantify water stress in relation to other variables, such as disease and nitrogen accumulation [19]. Other studies have evaluated the implications of water stress through hyperspectral data in different plants, such as vineyards [11] and citrus fruits [20].
Recently, machine learning approaches have been used in modeling the hyperspectral response of different conditions associated with vegetation [21]. The popular techniques used for analyzing data include regression analysis, vegetation indices, linear polarizations, wavelet-based filtering, and, more recently, machine learning algorithms like random forest, decision tree, support vector machine (SVM), k-nearest neighbor (kNN), artificial neural networks (ANN), naïve Bayes (NB), and others [22][23][24][25]. To evaluate the hyperspectral response of plants, machine learning has already been implemented in different scenarios. A radial basis function and the kNN were used to detect citrus canker in several disease development stages [26]. ANN, NB, and kNN were also used to model pepper fusarium disease in a climate room [27]. A combination of different machine learning algorithms, including SVM and ANN, was also evaluated to model photosynthetic variables [28].
In lettuce, water stress poses a major threat. To deal with this, commercially available seeds are being inoculated with rhizobacteria, because they mitigate the effects of the stress [29]. These effects, however, may not be visually perceptible, which makes detection by ordinary approaches difficult. Hyperspectral data have already demonstrated high potential in assessing water stress in plants in different spectral regions (350-2500 nm) [11,23,30]. However, to date, no model has evaluated the spectral response of lettuce submitted to water stress. Here we evaluate the hyperspectral response of water-stress induced lettuce with a machine learning method based on ANN. The contribution of this study is twofold. Firstly, we identified the effects of water stress in lettuce and their association with the plants' spectral response. Secondly, we evaluated the performance of the ANN algorithm in modeling these effects. The rest of this article is organized as follows. Section 2 presents the materials and methods adopted in this study. Sections 3 and 4 present and discuss the results obtained in the experimental analysis. Finally, Section 5 concludes the article.

Materials and Methods
The experiment was conducted in a growth chamber (phytotron) under controlled conditions. Twenty-eight-day-old lettuce (Lactuca sativa L.) plants were transplanted to pots with 0.5 kg of agricultural soil (pH (CaCl2 0.01 mol L−1) 5.9; 43.9 mg dm−3 of P (Mehlich-1), 2.7 mmol dm−3 of K, 25.3 mmol dm−3 of Ca, 5.3 mmol dm−3 of Mg, 14.3 mmol dm−3 of H + Al). The experimental design was randomized with three treatments (n = 36). One treatment was carried out with B. subtilis inoculation, strain AP-3 [31]. The inoculum was obtained by scraping cells multiplied in a solid medium and diluting them in sterile water to a concentration of 1.0 × 10^9 cells mL−1, with 0.1 mL applied per inoculation. The plants were cultivated for 14 days under the same irrigation conditions, maintaining the soil at field capacity. From the 14th day on, water restriction was applied.
The treatments were conducted as follows: (i) control group, with maintenance of the field capacity and without inoculation of the plants; (ii) stress group, with maintenance of 50% of the field capacity and without inoculation of the plants; and (iii) bacteria group, with maintenance of 50% of the field capacity and inoculation of the plants. The water replacement needed to maintain only 50% of the field capacity was determined by the gravimetric method. The phytotron chamber maintained the same temperature (25 °C) and lighting conditions during the experiment. Water-deficit treatment was performed for 15 days and was completed on the 30th day after the transplantation of the plants to the pots. The experimental design and analysis are summarized below (Figure 1).

Spectral Data Measurements
To record the spectral response of each plant, considering the different treatments, a darkroom was prepared to avoid light interference from other materials. The spectral response of the lettuce was measured using a Fieldspec HandHeld ASD spectroradiometer, operating at a spectral range of 325-1075 nm, in 512 channels with a spectral resolution of 1.6 nm and a 1° field of view. The equipment was carefully placed close to the leaves, at a 45° inclination in relation to the height of the plant, so that its field of view (FOV) did not exceed the area of the plant and did not register the spectral response of the substrate. A halogen lamp was also placed at 45° on the other side. Before each measurement, the equipment was calibrated with a Lambertian reference plate (Spectralon®).
The spectroradiometer registered 10 spectral curves during the same measurement for different leaves. This resulted in 360 spectral signatures for each measured day. These signatures represent the radiance from the leaves across the measured wavelengths. Because the spectroradiometer records the radiance that reaches the equipment, we needed to transform it into the reflectance factor. For that, the leaf radiance was divided by a reference radiance, which corresponds to the Lambertian plate measured previously. The spectroradiometer also has a known calibration factor (K), by which the result of this division must be multiplied. This factor, together with the radiance values of the Lambertian reference plate, was used to estimate the bidirectional reflectance factor (BRF), as shown in Equation (1) [32].
where dL is the spectral radiance, ω is the solid angle, θ and Φ are the zenith and azimuth angles, respectively, i is the incident flux, and r is the reflected energy flux. As mentioned, the K value is the correction factor from the equipment manufacturer. The BRF represents the spectral signature of the recorded radiometric target, also called the spectral response of the selected target.
To remove regions with a low signal-to-noise ratio, the spectral range from 380 to 1020 nm was selected to compose the spectral data, removing everything outside this range. The spectral curves were evaluated in terms of reflectance and absorbance values. Following Beer-Lambert's law, which shows that a concentration of an absorbent is proportional to the absorbance, the spectral reflectance values were converted using Equation (2).
where A corresponds to absorbance and R corresponds to the spectral reflectance obtained with the Fieldspec HandHeld ASD spectroradiometer.
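The conversion chain described above (radiance to reflectance factor, band selection, reflectance to absorbance) can be sketched in a few lines. This is only an illustrative sketch: the function names and the sample radiances are hypothetical, K is set to 1.0 as a placeholder, and the absorbance is assumed to take the common form A = log10(1/R), which may differ in detail from Equation (2).

```python
import math


def reflectance_factor(leaf_radiance, reference_radiance, k=1.0):
    """Reflectance factor per band: (leaf radiance / reference radiance) * K."""
    if len(leaf_radiance) != len(reference_radiance):
        raise ValueError("leaf and reference spectra must have the same length")
    return [k * leaf / ref for leaf, ref in zip(leaf_radiance, reference_radiance)]


def crop(wavelengths, values, lo=380.0, hi=1020.0):
    """Keep only the bands inside the retained 380-1020 nm window."""
    return [(w, v) for w, v in zip(wavelengths, values) if lo <= w <= hi]


def absorbance(reflectance):
    """Apparent absorbance per band. Assumption: A = log10(1 / R)."""
    return [math.log10(1.0 / r) for r in reflectance]


# Made-up radiances for four bands (illustrative values, not measured data).
wavelengths = [350.0, 450.0, 550.0, 800.0]
leaf = [0.02, 0.05, 0.12, 0.45]
plate = [0.40, 1.00, 1.00, 0.90]

brf = reflectance_factor(leaf, plate, k=1.0)   # K = 1.0 is a placeholder
kept = crop(wavelengths, brf)                  # the 350 nm band is dropped
absorb = absorbance([r for _, r in kept])
```

Under this assumed form, a band with low reflectance maps to a high absorbance value, so strong pigment absorption features appear as peaks rather than troughs.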

Biophysical Data Measurements
The leaf chlorophyll content (α + β) was recorded using a portable chlorophyllometer (Clorofilog Falker). The measurements were taken in the leaves of the apical part, median part, and basal part of each lettuce plant. This device operates in three spectral regions, in which the first two are in the red and red-edge regions and the third one in the near-infrared region [33]. The diameter of the leaves of each plant was also measured using a millimetric tape. The plants were then detached and weighed using a digital balance. At this stage, the aerial part (leaf and stem) was removed from the root and weighed separately, obtaining the fresh mass (g). The material was then left to dry in the open air for 48 h and weighed again to find its dry mass (g).

Statistical Data Analysis
The Shapiro-Wilk test was used to verify the normality of the data related to the biophysical parameters. The ANOVA method was applied to determine the difference among the three treatments (control, stress, and bacteria groups), while the mean difference was verified using the Tukey test. A 95% confidence level was adopted for all statistical analyses. For spectral response curves, the correlation of single wavelengths with the selected biophysical parameters was calculated for both reflectance and absorbance values. Contour maps resulting from traditional 2D correlation analysis were also produced for each pair of treatments to determine the spectral intervals that presented similarity. Correlation graphs between the biophysical parameters and all the spectral wavelengths of the three groups were then plotted. The results were compared using their means, standard deviations, correlation coefficient, coefficient of determination (R²), mean discrepancy (by bootstrapping), and root mean squared error (RMSE). The metrics evaluated here were obtained with the open-source software PAST v. 3.2 and the statistical program R v. 3.6.
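The single-wavelength correlation analysis can be illustrated with a short sketch. It assumes the Pearson coefficient (the text does not name the correlation type), and the function names and the tiny data set are hypothetical. The analysis itself was run in PAST and R, not with this code.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_by_band(spectra, parameter):
    """Correlate each band (column) with one biophysical parameter.

    spectra: one row per plant, one column per wavelength.
    parameter: one measured value per plant (e.g. chlorophyll content).
    """
    n_bands = len(spectra[0])
    return [pearson_r([row[j] for row in spectra], parameter)
            for j in range(n_bands)]


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up data: four plants, two bands (illustrative values only).
spectra = [[0.10, 0.80], [0.20, 0.60], [0.30, 0.40], [0.40, 0.20]]
chlorophyll = [20.0, 30.0, 40.0, 50.0]

r_by_band = correlation_by_band(spectra, chlorophyll)
```

Plotting `r_by_band` against wavelength gives a correlation curve of the kind used to compare reflectance and absorbance against each biophysical parameter.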

ANN and Machine Learning Analysis
In a computational environment, we randomly separated 80% of the data to train the ANN algorithm and 20% to test it. The number of spectral signatures (n = 360) was the same for each day. The spectral wavelengths were used as input features for the ANN, and one hidden layer with n neurons was considered. A linear activation function was applied in the output layer. We adopted the Adam optimizer with a regularization term of α = 0.0001. We used an open-source version of the RapidMiner v. 9.4 software.
To define the best hyperparameters, we performed cross-validation by separating our dataset into 10 folds. This separation was stratified and we used only the training dataset (80%). In this approach, one fold is used to validate the algorithm performance while the remaining folds are used to train the model. The procedure is repeated until each of the 10 folds has been used once as validation data. An example of the training curve being adjusted to the absorbance data of the 1st measurement day is plotted below (Figure 2). The hyperparameter search indicated that 100 neurons in the hidden layer and a maximum of 200 iterations gave the best configuration without overfitting the model in most of the tests. Finally, we plotted receiver operating characteristic (ROC) curves to compare the classifications, along with a confusion matrix of the ANN results. We evaluated the gain ratio and the F-score for each individual wavelength.
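The data partitioning described above (a random 80/20 hold-out followed by stratified 10-fold cross-validation on the training portion) can be sketched as follows. This is not the RapidMiner workflow used in the study. The function names, the seed, and the label ordering are assumptions made for illustration.

```python
import random
from collections import defaultdict


def holdout_split(n_samples, test_fraction=0.2, seed=0):
    """Randomly split sample indices into training and test sets."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def stratified_folds(indices, labels, k=10, seed=0):
    """Deal the indices of each class round-robin into k folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in indices:
        by_class[labels[idx]].append(idx)
    folds = [[] for _ in range(k)]
    position = 0  # carried across classes so fold sizes stay balanced
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        for idx in members:
            folds[position % k].append(idx)
            position += 1
    return folds


def cross_validation_rounds(folds):
    """Yield (training, validation) index lists, one round per fold."""
    for i, validation in enumerate(folds):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


# 360 signatures per day: 12 pots x 10 curves for each of the three groups.
labels = ["control"] * 120 + ["stress"] * 120 + ["bacteria"] * 120

train_idx, test_idx = holdout_split(len(labels), test_fraction=0.2, seed=0)
folds = stratified_folds(train_idx, labels, k=10, seed=0)
rounds = list(cross_validation_rounds(folds))
```

Each hyperparameter candidate would be scored on the ten validation folds, and the 72 held-out signatures would be used only once, for the final test.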
To test the robustness of the ANN, we compared it with other traditional machine learning algorithms: decision tree, support vector machine (SVM), random forest (RF), naïve Bayes, and logistic regression. The training and testing sets remained the same. We also tuned the hyperparameters of these algorithms. Tuning was stopped once it no longer returned any practical gain in classification accuracy (%). For this, we considered the individual characteristics of each classifier, such as the number of trees, nodes, and leaves, the number of iterations, the function degree, and others.
The decision tree and random forest models are built on classification trees; random forest improves overall accuracy by combining the predictions of many independent trees [34]. SVM finds maximum-margin separating boundaries and can be applied in many cases where there is a distinct margin of separation between classes. Naïve Bayes is a probabilistic classifier that applies Bayes' theorem with independence assumptions. Lastly, logistic regression models the predicted classes with a sigmoid function [35].
The metrics used for evaluating the performance of each algorithm were the AUC (area under the curve), overall accuracy, F1-score, precision, and recall. We compared each one of them during the four stages of the spectral response measurement: 14th, 19th, 24th, and 29th days. Both the reflectance and the absorbance values were used separately as input features. The results are presented in the following section.
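For reference, the accuracy, precision, recall, and F1-score listed above can all be derived from a confusion matrix. The sketch below uses hypothetical function names and made-up predictions, and it omits the AUC, which requires class scores rather than hard labels.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, predicted in zip(y_true, y_pred):
        matrix[index[truth]][index[predicted]] += 1
    return matrix


def overall_accuracy(matrix):
    """Share of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def per_class_metrics(matrix, classes):
    """Precision, recall, and F1-score for each class (one-vs-rest)."""
    metrics = {}
    for i, name in enumerate(classes):
        true_positive = matrix[i][i]
        predicted = sum(row[i] for row in matrix)   # column total
        actual = sum(matrix[i])                     # row total
        precision = true_positive / predicted if predicted else 0.0
        recall = true_positive / actual if actual else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        metrics[name] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Made-up labels for twelve test signatures (illustrative only).
classes = ["control", "stress", "bacteria"]
y_true = ["control"] * 4 + ["stress"] * 4 + ["bacteria"] * 4
y_pred = ["control", "control", "control", "bacteria",
          "stress", "stress", "stress", "stress",
          "bacteria", "bacteria", "control", "bacteria"]

matrix = confusion_matrix(y_true, y_pred, classes)
accuracy = overall_accuracy(matrix)
scores = per_class_metrics(matrix, classes)
```

In this toy example the control and bacteria classes are confused with each other once in each direction, which lowers their precision and recall equally while the stress class stays perfect.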

Hyperspectral and Biophysical Parameters Comparison
The water stress caused a reduction in practically all biophysical parameters. The exception was root development in the treatment inoculated with the rhizobacteria, which indicates that the influence of the bacteria was more evident in the root system. The bacteria group had mean values similar to or even higher than those found for the control group, in both the fresh mass and the dry root mass (Table 1). Chlorophyll content also differed markedly between the control group and the others.
At the end of the experiment, the lowest values of reflectance and the highest values of absorbance were observed in the control group (Figure 3). The spectral differences among the treatments increased as the stress progressed. The stress and bacteria groups were both submitted to water stress, and their spectral response curves moved away from the control curve. This condition can be explained by the reduction in fresh leaf mass and leaf diameter, which caused an increase in chlorophyll concentration in both groups (stress and bacteria). Furthermore, a continuous analysis over time showed that reflectance values increased and absorbance values decreased. The stress group, however, was the one that presented the largest discrepancy over time.
To evaluate the correlation between the spectral behaviors on the final day of the experiment, a matrix (Figure 4) was organized with the correlation values between all groups. In general, the control group presented better correlations with the bacteria group than with the stress group. The correlation was higher in the near-infrared region for all curves.
The green region (from 520 to 580 nm) was more evident on the reflectance curve in the bacteria group and was better isolated when evaluated on the absorbance curve (Figure 4C,D). This could make it difficult to differentiate the spectral behavior of the two groups. Still, both groups returned different amplitudes in the averaged values (Figure 3), which indicates a feasible separation between them.
The relationship between the biophysical parameters and all the spectral wavelengths of the three groups is shown in Figure 5. The parameters that presented the best correlations with the spectral curves were chlorophyll, fresh masses and aerial dry matter, and dry weight of roots. With the exception of the dry root weight, the aforementioned parameters presented higher negative and positive correlations in the visible region (Figure 5A,C,D), specifically in the blue region (from 380 to 460 nm) and in a smaller range in the red region (between 640 and 680 nm), which coincides with the absorption regions of chlorophyll. The weight of the root dry mass was the parameter that presented the highest correlation (r = 0.80) with the near-infrared region (700 nm onwards).
The mean values of the correlation of each parameter with the reflectance and absorbance curves were also compared (Table 2): the chlorophyll, diameter, fresh aerial weight, and dry root weight had better correlations (r = −0.803; 0.540; 0.822; and 0.709, respectively) with absorbance than with reflectance. The difference between averages ranged from 0.797 (fresh root) to 1.602 (fresh aerial), which demonstrates how much the reflectance values differ from those of absorbance. This behavior is better observed in the absorbance wavelengths of chlorophyll, fresh weight of the aerial part, and dry-root weight, which presented values higher than 0.7 and the lowest values of root mean squared error (RMSE).
Remote Sens. 2019, 11, x FOR PEER REVIEW

Modeling the Hyperspectral Wavelengths Through Artificial Neural Network
The wavelengths were modeled by different machine learning algorithms from the 14th day of the experiment. The ANN model presented here classified better than any of the remaining algorithms from the first day of measurement (Table 3). In general, the absorbance values offered better accuracy than the reflectance ones. The accuracy and other metrics improved with each measurement, indicating an increased difference between the groups over time. The other machine learning algorithms were also able to return similar classification accuracies. The logistic regression method presented high accuracy in the first three measurements; however, its accuracy declined on the final day. This behavior was noted for the other algorithms as well. ANN not only maintained consistency over time but also presented its highest accuracy on the final day. Another observation is that, for all machine learning methods applied here, the absorbance values were more efficient in discriminating the plant groups in most of the classifications.
To visualize the differences between each group, an ROC curve of the last day of measurement was used (Figure 6). The ROC curves suggest that the ANN was better at differentiating each of the three groups individually, while other algorithms performed worse under specific conditions. The ANN also returned a lower false-positive rate than all of the other machine learning algorithms. The confusion matrix of the final measurement day also shows that the ANN had more problems in predicting the control group (89.6%) than the other groups (94.4% and 94.1% for bacteria and stress, respectively).

Based on this classification, the gain ratio and the relief-F were used to evaluate the contribution of individual wavelengths to the ANN model (Figure 7). These metrics suggest that the stress group presented larger differences from the control group than the bacteria group did, which made it easily distinguishable by the algorithm. However, there appears to be a higher discrepancy between the bacteria group and the control group in the blue region (380 to 440 nm). Nonetheless, the near-infrared region and the 660 to 730 nm region appear to contribute more to the stress group response.

Discussion
This study evaluated the spectral response of lettuce submitted to water stress while modeling its effects with an ANN and other machine learning algorithms. For that, we separated our data into three groups: control, stress, and bacteria. The reason to include the rhizobacteria in this situation is to induce a similarity with what transpires in greenhouses or horticulture models, as this bacterium is commonly present in soil and commercial seeds [29]. The addition of the bacteria group is also important to reinforce our test as it can act as a middle-ground between the stress group and the control group. We firstly evaluated the biological and physical response of the induced stress, and later compared it with the hyperspectral measurement. Lastly, we used ANN and other machine learning algorithms to classify both groups solely by their spectral response.
Our results indicate that the physiological response to water stress in early-stage lettuce is the reduction in leaf size and an increase in chlorophyll concentration. This behavior was evident both in the stress group and in the bacteria group, although with lower intensity in the latter. Changes in leaf pigmentation are noticeable in cases of plant stress [15]. However, an interesting observation is that the stress did not affect root weight in the bacteria group. This indicates that there is an effect in mitigating this stress, although this was not indicated by leaf analysis. By examining the mean spectral curves of each treatment (Figure 2) there is a small amplitude between the red-edge region wavelengths. The red-edge region is commonly known to indicate stress presence [23,34], and this may explain why the bacteria group did not differentiate much from the control group here.
Regarding the correlation between biophysical parameters and the wavelengths, it is initially perceived that absorbance wavelengths are more correlated with most biophysical parameters. An important observation to be made is the strongest correlation between chlorophyll and leaf fresh-weight with these wavelengths (Figure 4 and Table 2). This relationship between absorbance levels and the biophysical parameter was continuous throughout the experiment, particularly in regard to modeling each group's response with machine learning algorithms (Table 3). Still, the correlation was more pronounced in the green and near-infrared regions (Figure 3). This situation is also evident when observing the mean curves of each treatment (Figure 2), where the amplitude between the curves is smaller in the blue, red, and red-edge remaining regions.
The classification performed by the ANN algorithm showed promising results from the first day of measurement, when the lettuces had been under stress for only one day. This is worth noting, as it indicates how powerful hyperspectral analysis can be in conjunction with machine learning algorithms. The evaluation metrics used in this study show that the ANN was better at distinguishing the three plant groups. Over time, the performance of the algorithms increased (Table 3). This can be explained by the increasing spectral separation between the groups: as the stress progressed, the spectral behavior of the treatments became more distinct from each other. Because this study is unique in this regard, there is little literature to compare it with. Still, the accuracy found here is similar to or even higher than those obtained when modeling the effects of different stresses in plants [21,24–27].
Another contribution of this study is the evaluation of different algorithms on both reflectance and absorbance data. Absorbance curves were directly related to changes in biophysical parameters for all treatments (Figure 4 and Table 2). This persisted in the machine learning analysis, where the algorithms performed better at differentiating the three groups when using absorbance values. We therefore recommend converting reflectance to absorbance before modeling these effects in lettuce. In addition, when evaluating each algorithm over time, the ANN accuracy peaked on the last measurement day (92.7%), while the performance of the other algorithms decreased from the third to the fourth day of evaluation. This indicates that the ANN was better suited to modeling the water-stress effects than the other algorithms.
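As an illustration of the recommended conversion, the sketch below assumes the commonly used apparent-absorbance definition A = log10(1/R), with R as fractional reflectance. This section does not state which formula was used, so the definition, the function name `reflectance_to_absorbance`, and the `eps` floor are all assumptions.

```python
import numpy as np

def reflectance_to_absorbance(reflectance, eps=1e-6):
    """Convert fractional reflectance (0 to 1) to apparent absorbance.

    Assumes A = log10(1 / R). Values below `eps` are raised to `eps`
    so that zero or negative readings do not produce inf or nan.
    """
    r = np.clip(np.asarray(reflectance, dtype=float), eps, None)
    return -np.log10(r)  # identical to log10(1 / r)

absorbance = reflectance_to_absorbance([1.0, 0.1, 0.01])
```

If the instrument reports reflectance in percent, the values would need to be divided by 100 before this conversion.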
Lastly, the ANN algorithm showed high precision and recall values (Table 2) when classifying each group, as shown in Figure 5. The confusion matrix showed slightly lower performance (89.6%) in differentiating the control group from the others. The gain ratio and relief-F metrics (Figure 6) show how each individual wavelength contributed to the ANN model. In the gain ratio analysis, wavelengths in the blue (380 to 440 nm), red (660 to 730 nm), and near-infrared (790 nm onwards) regions predominate. Despite the similarity between the curves of both groups, there was a smaller amplitude difference in the blue region. This may indicate how much the blue region contributed to differentiating the effects of the stress on the bacteria group. The relief-F curves showed a similar pattern, in which this same region presented an even higher value than the group under stress. Chlorophyll absorbs strongly in the blue region, which may indicate how important the stress effect was in this spectral range. Nevertheless, the model also indicated greater contributions in the red, red-edge, and near-infrared regions, which corroborates the earlier observations (Figure 3). Besides other species and cultivars, future research could explore additional spectral regions such as the shortwave infrared (SWIR), which is not covered by the ASD Fieldspec HandHeld spectroradiometer.
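For readers less familiar with the metrics mentioned above, the following sketch shows how per-class precision and recall, and the overall accuracy, follow from a three-class confusion matrix. The counts are invented for illustration and are not the values in Figure 5; the function name `per_class_metrics` and the row/column convention (rows are actual groups, columns are predicted groups) are assumptions.

```python
import numpy as np

def per_class_metrics(confusion):
    """Per-class precision and recall plus overall accuracy.

    confusion[i, j] = number of samples of actual class i predicted as class j.
    Assumes every class is predicted at least once (no zero column sums).
    """
    cm = np.asarray(confusion, dtype=float)
    true_pos = np.diag(cm)
    precision = true_pos / cm.sum(axis=0)  # column sums: samples predicted as each class
    recall = true_pos / cm.sum(axis=1)     # row sums: samples actually in each class
    accuracy = true_pos.sum() / cm.sum()
    return precision, recall, accuracy

# Hypothetical counts for control, stress, and bacteria (12 plants each).
cm = [[10, 1, 1],
      [2, 9, 1],
      [0, 1, 11]]
precision, recall, accuracy = per_class_metrics(cm)
```

With these made-up counts, the control class has a recall of 10/12 and a precision of 10/12, and the overall accuracy is 30/36.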

Conclusions
In this study, we applied an artificial neural network algorithm to model the hyperspectral response of lettuce to induced water stress. The ANN algorithm detected differences from the first day of induced stress, with an 80% classification accuracy. Its performance continued to increase over the measurement period, reaching a final accuracy of 93%. The wavelengths that contributed most to its predictions were located around 380 to 440 nm, 660 to 730 nm, and, to a lesser degree, 790 nm onwards. We also found that absorbance values are more suitable than reflectance values for this task. Although the rhizobacteria mitigated the water-stress effect to some extent, the ANN algorithm still detected a difference in spectral behavior, which points to its robustness. The proposed approach showed that water stress in lettuce can be detected at early stages by applying machine learning algorithms such as ANN to hyperspectral data. While the small number of instances (four measurement days) could be a limitation of the experiment, all machine learning algorithms tested here were able to classify the groups appropriately. For future work, we recommend similar studies with other species and cultivars. Additionally, the method demonstrated here could be scaled up to remote sensing platforms such as unmanned aerial vehicles (UAVs), since hyperspectral sensors can now be mounted on them.