Estimation of Water Stress in Potato Plants Using Hyperspectral Imagery and Machine Learning Algorithms

Abstract: This work presents the quantitative detection of water stress in potato crops and the estimation of its level: none, light, moderate, or severe. We use hyperspectral imagery and state-of-the-art machine learning algorithms: random decision forest, multilayer perceptron, convolutional neural networks, support vector machines, extreme gradient boost, and AdaBoost. Detection and estimation are carried out at two phenological stages of the plants: tubers differentiation and maximum tuberization. The machine learning algorithms are trained with a small subset of the pixels of each hyperspectral image corresponding to the plant canopy. The results are then improved by using majority voting to classify all the canopy pixels in the hyperspectral images. The results indicate that both the detection of water stress and the estimation of its level can be obtained with good accuracy, which majority voting improves further. The importance of each band of the hyperspectral images in the classification is assessed with random forest and extreme gradient boost, the machine learning algorithms that perform best overall across both phenological stages and both tasks.

Author Contributions: E.A.S.-A., G.A.G.-V., L.M.T.-D., O.D.O.-P. and A.M.C.-M.; Supervision, A.M.C.-M.; Project administration, A.M.C.-M.; Funding, A.M.C.-M.; Resources, G.A.G.-V. and A.M.C.-M. All authors have read and agreed to the published version of the manuscript.


Introduction
Potato (Solanum tuberosum L.) is the third most important food crop in the world [1]. It provides an economical and rich source of carbohydrates and is part of the diet in both developed and developing countries. Water deficit is the most important abiotic stress affecting the development, productivity, and quality of potato cultivars [2]. Hence, it is important to detect signs of water stress in potato plants as early as possible to avoid production and quality losses. Due to climate change, crops worldwide are suffering from unexpected and longer severe weather events such as droughts, which are becoming increasingly intense [3]. In Colombia specifically, a good portion of the areas suitable for potato production are vulnerable to increased aridity, soil erosion, desertification, and variations in the hydrological system as a consequence of climate change [4]. Therefore, there is a need to map water stress in potato crops using nondestructive technologies such as remote sensing.
Recently, a spectroradiometer (350-2500 nm) was used to explore the effect of water stress on the spectral reflectance of bermudagrass, and five vegetation indexes were studied [5]. In the case of potato crops, 12 vegetation indexes, including four Normalized Water Indexes (NWIs), have been studied to detect water stress in potato leaves under different watering conditions, also using a spectroradiometer (350-2500 nm) [4]. The results indicate clear differences in the spectrum of water-stressed leaves in the 700-1300 nm range [4].
Remote sensing technologies using unmanned aerial vehicles (UAVs) acquiring visible and thermal images were used to map water stress in barley crops [6]. The detection of water stress in plants using aerial imagery has focused on thermal imagery to estimate plant temperature relative to the air temperature, computing NWIs. Since stomata close under water stress, the temperature of the leaves relative to the air increases [6-8]. More recently, remote sensing imaging technologies using visible, near-infrared (NIR), short wave infrared (SWIR), and thermography have been proposed to detect water stress in potato crops [9]. Rather than using broadband multispectral images, hyperspectral imagery and machine learning algorithms have been proposed to determine the quality of food products [10]. Hyperspectral imagery (400-1000 nm) has also been proposed to detect water stress in potato crops using spectral indexes [11]. Hyperspectral imagery (400-2500 nm) was used in combination with partial least squares-discriminant analysis (PLS-DA) and partial least squares-support vector machine (PLS-SVM) classification to detect abiotic and biotic drought stress in tomato canopies [12]. Hyperspectral imagery (450-1000 nm) in combination with machine learning algorithms (random forest and extreme gradient boost) has also been used to detect water stress in vine canopies [13]. Another possibility for detecting water stress in plants is to use radar remote sensing technologies [14,15], which have the advantage of penetrating clouds, a limitation of visible and thermal imagery. Finally, ultrasound wave spectroscopy has also been used to estimate the water content of plant leaves using convolutional neural networks and random forest algorithms [16].
As previously indicated, work on detecting water stress in potato cultivars has been based on vegetation indexes (NDVI, the Simple Ratio, the Photochemical Reflectance Index, the pigment-specific simple ratio of Chlorophyll-a, the reflectance water index, the Normalized Water Indexes and the dry Zea N index). Here we use a hyperspectral camera (400-1000 nm) and several well-known machine learning algorithms to detect water stress in hyperspectral images of potato plants and to estimate its degree (none, light, moderate, or severe) using all image bands. The use of machine learning algorithms also allows us to determine which regions of the spectral signature of the leaves are most influential for estimating water stress from remote sensing images in the visible (400-700 nm) and near-infrared (NIR) (700-1000 nm) bands.

Plant Material and Experimental Design
The experiment was carried out in greenhouse number 17 of AGROSAVIA (Corporación Colombiana de Investigación Agropecuaria), Tibaitatá research center, Colombia (4°41′25.7064″ N, 74°12′08.23″ W), at 2543 m above sea level. Certified seeds of Solanum tuberosum L., variety Diacol Capiro, were planted in the greenhouse. The experiment followed a randomized complete block design in a 2 × 4 factorial arrangement. The first factor was the level of plant development (phenological stage), fixed according to [17]: tubers differentiation (TD) and maximum tuberization (MT) (Appendix A). The second factor was the severity of water stress, determined by the hydric potential of the leaves, measured in megapascals (MPa) using a Scholander pressure chamber. Control plants have a hydric potential in the 0 to −0.49 MPa range, light (L) water stress corresponds to the −0.5 to −0.59 MPa range, moderate (M) water stress to the −0.6 to −0.89 MPa range, and severe (S) water stress to a hydric potential equal to or lower than −0.9 MPa. These ranges were selected based on [18,19] and on the previous research experience of AGROSAVIA in greenhouses containing potato crops.
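The class boundaries above can be written as a small lookup. This is only an illustrative sketch: the function name `stress_level` is hypothetical, and the handling of readings that fall between the listed ranges (for example −0.495 MPa, treated here as control) is an assumption, since the ranges are given at 0.01 MPa resolution.

```python
def stress_level(leaf_potential_mpa: float) -> str:
    """Map a leaf water potential in MPa (zero or negative) to a stress class.

    Thresholds follow the ranges listed in the text; values between two listed
    ranges are assigned to the milder class (an assumption of this sketch).
    """
    if leaf_potential_mpa > 0:
        raise ValueError("leaf water potential is expected to be <= 0 MPa")
    if leaf_potential_mpa <= -0.9:
        return "severe"
    if leaf_potential_mpa <= -0.6:
        return "moderate"
    if leaf_potential_mpa <= -0.5:
        return "light"
    return "control"
```

For example, a reading of −0.7 MPa falls in the moderate class and a reading of −0.3 MPa in the control class.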
Potato plants were sown in a greenhouse in loamy soil that was kept at field capacity (soil water potential did not decline below −0.033 MPa) by drip irrigation from sowing until the 9th and 13th week after sowing, when each stage of development was reached (TD and MT, respectively). At that time, the water supply was suspended, and the leaf water potential was measured daily until each level of stress (L, M, S) was reached. Control plants were watered throughout the experiment.

Hyperspectral Imagery
The hyperspectral images were acquired using a Surface Optics Corporation 710-VP camera with 520 × 696 pixels and 128 spectral bands in the 400-1000 nm range, and stored in the Environment for Visualizing Images (ENVI) format. The images were taken 3 m above the plant canopy with the camera looking downwards. The image acquisition campaigns were carried out at around the same hour of the day. Figure 1 shows a false-color image of the canopy of a plant, loaded and visualized with MultiSpec [20]. As can be seen from this image, a white Spectralon reflectance panel is included in each image to convert the hyperspectral intensity images to reflectance. The white Spectralon panel is easy to segment from the hyperspectral image by computing the average of the red, green, blue, and NIR bands and dividing that image by its maximum intensity. Figure 2 shows this normalized average, where the Spectralon reflectance panel can be segmented using a threshold of 0.5.
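The panel segmentation just described could look roughly like the following NumPy sketch. The band indices in `RGB_NIR_BANDS` are hypothetical: they assume the 128 bands are evenly spaced over 400-1000 nm, which should be checked against the wavelengths listed in the ENVI header. The toy cube stands in for a real image.

```python
import numpy as np

# Hypothetical indices for blue, green, red and NIR bands, assuming 128 bands
# evenly spaced over 400-1000 nm (roughly 452, 551, 650 and 802 nm).
RGB_NIR_BANDS = (11, 32, 53, 85)

def segment_panel(cube, band_idx=RGB_NIR_BANDS, threshold=0.5):
    """Return (panel_mask, normalized_mean) for a raw cube shaped (rows, cols, bands)."""
    mean_img = cube[:, :, list(band_idx)].astype(float).mean(axis=2)
    peak = mean_img.max()
    if peak <= 0:
        raise ValueError("image has no positive intensity")
    normalized = mean_img / peak          # values in (0, 1], 1 at the brightest pixel
    return normalized > threshold, normalized

# Toy cube: dim background with a bright 4 x 6 pixel "panel".
cube = np.full((20, 30, 128), 100.0)
cube[2:6, 3:9, :] = 1000.0
panel_mask, normalized = segment_panel(cube)
```

Here `panel_mask` is a boolean image that is true only on the bright patch, which plays the role of the Spectralon panel.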
The reflectance of each hyperspectral image can be computed using:

$$\rho(x, y, \lambda) = \rho_S(\lambda)\,\frac{I(x, y, \lambda)}{I_S(\lambda)} \qquad (1)$$

where $\rho(x, y, \lambda)$ is the reflectance image at pixel coordinates $x, y$ and waveband $\lambda$, $I(x, y, \lambda)$ is the raw intensity image at pixel coordinates $x, y$ and waveband $\lambda$, $\rho_S(\lambda)$ is the known reflectance of the Spectralon panel at wavelength $\lambda$ (0.99 in the visible and NIR ranges), and $I_S(\lambda)$ is the mean intensity of the Spectralon panel at waveband $\lambda$.
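A minimal sketch of this reflectance conversion is given below, assuming a raw cube shaped (rows, cols, bands) and a boolean mask of the Spectralon panel. The function name `to_reflectance` and the toy data are illustrative, not the authors' code.

```python
import numpy as np

def to_reflectance(cube, panel_mask, panel_reflectance=0.99):
    """Apply rho = rho_S * I / I_S band by band.

    cube: raw intensities shaped (rows, cols, bands).
    panel_mask: boolean image shaped (rows, cols), True on the Spectralon panel.
    """
    cube = cube.astype(float)
    panel_mean = cube[panel_mask].mean(axis=0)   # I_S(lambda), shape (bands,)
    if np.any(panel_mean <= 0):
        raise ValueError("panel intensity must be positive in every band")
    return panel_reflectance * cube / panel_mean  # broadcasts over rows and cols

# Toy example: background intensity 200, panel intensity 1000.
cube = np.full((20, 30, 128), 200.0)
cube[2:6, 3:9, :] = 1000.0
panel_mask = np.zeros((20, 30), dtype=bool)
panel_mask[2:6, 3:9] = True
reflectance = to_reflectance(cube, panel_mask)
```

In this toy case the panel pixels map to a reflectance of 0.99 and the background to 0.99 × 200 / 1000 = 0.198.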
Once the hyperspectral images are converted to reflectance, it is necessary to segment the canopy from its background. The Normalized Difference Vegetation Index (NDVI) has been widely used to detect vegetation canopy [21]:

$$\mathrm{NDVI} = \frac{\rho_{NIR} - \rho_{red}}{\rho_{NIR} + \rho_{red}} \qquad (2)$$

where $\rho_{NIR}$ and $\rho_{red}$ are the reflectances at the NIR and red wavelengths, respectively. However, the NDVI is affected by several factors, including shadows [21], that could lead to undefined 0/0 values. To avoid this, we used the Soil-Adjusted Vegetation Index (SAVI), which overcomes these issues of the NDVI [21], and selected those pixels where SAVI > 0.3 (Figure 3):

$$\mathrm{SAVI} = \frac{(1 + L)\,(\rho_{NIR} - \rho_{red})}{\rho_{NIR} + \rho_{red} + L} \qquad (3)$$

where $L$ is the soil-adjustment factor of the standard SAVI definition (commonly $L = 0.5$).

Figure 2. Normalized sum of red, green, blue, NIR bands.

From the image campaign at the tubers differentiation phenological stage, 64 images were acquired for the machine learning algorithms (stressed and control plants), with water stress durations ranging from 3 to 20 days. From the image campaign at the maximum tuberization phenological stage, 52 images were acquired for the machine learning algorithms (stressed and control plants), with water stress durations ranging from zero to nine days. The reading and preprocessing of the hyperspectral images were done using Python 3.8.5, as distributed with Anaconda [22]. The Python spectral library [23] was used to read the hyperspectral images.
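The canopy segmentation step can be illustrated with a short NumPy sketch. It uses the standard SAVI form with a soil factor of 0.5, which is an assumption here, as are the function names and the choice of which band indices represent red and NIR.

```python
import numpy as np

def ndvi(nir, red):
    """NDVI; returns NaN where NIR + red is zero (for example, deep shadow)."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    denom = nir + red
    out = np.full(denom.shape, np.nan)
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out

def savi(nir, red, soil_factor=0.5):
    """Standard SAVI; the positive soil factor keeps the denominator away from zero."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)

def canopy_mask(reflectance, red_idx, nir_idx, threshold=0.3, soil_factor=0.5):
    """Boolean canopy mask from a reflectance cube shaped (rows, cols, bands)."""
    values = savi(reflectance[:, :, nir_idx], reflectance[:, :, red_idx], soil_factor)
    return values > threshold

# Three toy pixels: vegetation, bare soil, and a fully dark shadow.
nir = np.array([0.50, 0.25, 0.0])
red = np.array([0.05, 0.20, 0.0])
toy_cube = np.stack([red, nir], axis=-1)[np.newaxis, :, :]   # shape (1, 3, 2)
mask = canopy_mask(toy_cube, red_idx=0, nir_idx=1)
```

The dark pixel yields an undefined NDVI but a well-defined SAVI of zero, so it is simply excluded from the canopy by the 0.3 threshold.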
Control plants provide the images for the control class, and each stress condition has several images, taken on different days after the stress level was applied.

Machine Learning Algorithms
Two supervised classification tasks were carried out for each of the two phenological stages of the potato crops: detection of water stress, i.e., whether the plant is water-stressed or not (two classes), and estimation of the level of water stress, i.e., whether the plant is not stressed, lightly stressed, moderately stressed, or severely stressed (four classes). To perform these classification tasks, six well-known machine learning algorithms were used:

• Random decision forest (RF) [24] using 100 trees, with a balanced class weight. An RF is an ensemble of decision trees; the predicted class is the one that receives the most votes from the individual trees.

• Multi-layer perceptron (MLP) [25] with an input layer with as many nodes as bands (128) and an output layer with as many nodes as classes (2 or 4). Each layer is followed by a batch normalization layer [26] and a dropout layer [27] with a probability of 0.2, with a rectified linear activation function (RELU, which outputs its input if it is positive and zero otherwise) [28] on the input layer, and a Softmax activation function [28] on the output layer for the four-class case or a Sigmoid activation function [28] for the two-class case (see Figure 5). An MLP neural network consists of layers of nodes: an input layer, hidden layers, and an output layer. Except for the input nodes, each node is a neuron that uses a nonlinear activation function. Each node in a layer connects to each node of the following layer through a weight. The neural network learns the weights from the training data.

• Convolutional neural networks (CNN) [29] with two convolutional layers, each using a kernel size of 3 and 20 filters. The two convolutional layers are followed by a batch normalization layer, a dropout (0.2) layer, and a RELU layer. After the two convolutional layers, a flatten layer flattens the output of the last convolutional layer into MLP nodes. After the flatten layer comes an input MLP layer with half as many nodes as the flatten layer, then a middle MLP layer with half the nodes of the previous layer, and an output layer with as many nodes as classes. Each MLP layer is followed by a batch normalization layer and a dropout (0.2) layer, plus a RELU layer in the case of the first and second MLP layers. The last MLP layer is followed by a Softmax activation function in the case of four classes or a Sigmoid activation function in the case of two classes (Figure 4). Convolutional neural networks are a kind of deep learning neural network specialized in images, with convolutional layers applying different kinds of filters to patches of the images and then to the outputs of previous convolutional layers, to capture variability at larger scales.

• Support vector machine (SVM) [30] using a linear SVM with default parameters. An SVM maps training examples to points in space so as to maximize the width of the gap between the classes.

• Extreme gradient boost (XGBoost) [31] using tree classifiers (gbtree) as weak learners and 100 estimators. Gradient boosting produces an ensemble of weak prediction models (usually trees) and generalizes them by optimizing a differentiable loss function. XGBoost is an implementation of gradient boosting that uses a more regularized model formalization to control overfitting.

• AdaBoost (AB) [32] with 100 estimators. An AdaBoost classifier first fits a classifier on the dataset and then fits additional copies of the classifier on the same dataset, giving more weight to the incorrectly classified instances, so that subsequent classifiers focus on the harder cases.
The RF, SVM, and AB classifiers were implemented in Python 3.8 using the sklearn library. The MLP and CNN were implemented in Python 3.5 using the keras library with tensorflow under the hood on the High Performance Computing servers of Agrosavia, given the memory required by CNN. The XGBoost classifier was implemented using the xgboost Python library in Python 3.8.
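As a rough illustration of the sklearn-based classifiers listed above, the settings stated in the text (100 trees with balanced class weights, a linear SVM with defaults, 100 AdaBoost estimators) might be expressed as follows. The helper `build_classifiers`, the use of `LinearSVC` for the linear SVM, the random seeds, and the synthetic spectra are all assumptions of this sketch; the XGBoost, MLP, and CNN models are left out to keep it dependency-light.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.svm import LinearSVC

def build_classifiers(seed=0):
    """Return the sklearn classifiers with the hyperparameters named in the text."""
    return {
        "RF": RandomForestClassifier(
            n_estimators=100, class_weight="balanced", random_state=seed
        ),
        # LinearSVC is one way to get a linear SVM; SVC(kernel="linear") is another.
        "SVM": LinearSVC(),
        "AB": AdaBoostClassifier(n_estimators=100, random_state=seed),
    }

# Synthetic stand-in for pixel spectra: 128 bands, two well-separated classes.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.30, 0.05, (100, 128)),   # "control" pixels
    rng.normal(0.45, 0.05, (100, 128)),   # "stressed" pixels
])
y = np.repeat([0, 1], 100)

models = build_classifiers()
train_accuracy = {name: model.fit(X, y).score(X, y) for name, model in models.items()}
```

Training accuracy on such easy synthetic data says nothing about real performance; the snippet only shows how the stated settings map onto estimator arguments.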
Given the size of the images (520 × 696 × 128), equipment memory constraints, and processing times, only 10,000 pixels were selected at random from the canopy (identified using SAVI > 0.3) of each image to form the training dataset. In the case of CNN, a window of size 5 × 5 × 128 centered on each of the 10,000 randomly selected canopy pixels was extracted to form the CNN dataset. To evaluate the classifiers, five-fold cross-validation was employed to assess the risk of overfitting, given the tendency of classifiers to overfit the training dataset. Here, 80% of the dataset is used for training and 20% for testing in each of the five cross-validation runs. In the case of MLP and CNN, 20% of the 80% available for training is used for validation, and the MLP or CNN model is saved only if the loss computed on the validation data improves, as an extra measure against overfitting. Furthermore, the classifiers were trained with the full training dataset and then used to classify the whole canopy of each image (which contains many more pixels unseen by the classifiers) using majority voting, i.e., assigning to the image the class with which most of its pixels are classified.
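The sampling, cross-validation, and majority-voting procedure described above can be sketched on toy data as follows. Everything here is illustrative: the helper names are hypothetical, the cubes are tiny synthetic arrays rather than 520 × 696 × 128 images, and the use of shuffled, stratified folds is an assumption, since the text only states that five folds with an 80/20 split were used.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

def sample_canopy_pixels(reflectance, canopy_mask, n_pixels, rng):
    """Draw up to n_pixels spectra at random (without replacement) from the canopy."""
    rows, cols = np.nonzero(canopy_mask)
    n = min(n_pixels, rows.size)
    pick = rng.choice(rows.size, size=n, replace=False)
    return reflectance[rows[pick], cols[pick], :]

def majority_vote(pixel_labels):
    """Return the most frequent label; ties go to the smallest label value."""
    values, counts = np.unique(pixel_labels, return_counts=True)
    return values[np.argmax(counts)]

rng = np.random.default_rng(0)
n_bands = 8   # toy size; the cubes in the text have 128 bands
# Two toy "images": control (label 0) and stressed (label 1) differ by an offset.
images = [rng.normal(0.30, 0.05, (40, 40, n_bands)),
          rng.normal(0.45, 0.05, (40, 40, n_bands))]
image_labels = [0, 1]
masks = [np.ones((40, 40), dtype=bool) for _ in images]

# Training set: 300 random canopy pixels per image (10,000 in the text).
X = np.vstack([sample_canopy_pixels(img, m, 300, rng) for img, m in zip(images, masks)])
y = np.repeat(image_labels, 300)

clf = RandomForestClassifier(n_estimators=20, class_weight="balanced", random_state=0)
folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(clf, X, y, cv=folds)   # accuracy on each 20% test fold

# Train on the full training set, then classify every canopy pixel and vote per image.
clf.fit(X, y)
image_predictions = [majority_vote(clf.predict(img[m])) for img, m in zip(images, masks)]
```

Majority voting turns many per-pixel decisions into one decision per image, so a moderate fraction of misclassified pixels does not change the image-level label.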

Results
Figure 6 shows the classification performance using two classes (water stress or control) for the tubers differentiation phenological stage in terms of overall accuracy, sensitivity, and specificity (see the confusion matrices in Appendix B); the error bars indicate the standard deviation of the mean of each measure. As can be seen from these results, RF and XGBoost achieve the best classification performance, with XGBoost performing best.
Figure 6. Classification performance, tubers differentiation phenological stage using two classes.
Table 1 compares the classification performance of the three best classifiers found (RF, XGBoost, and CNN), alone and using Majority Voting (MV). This table shows that both RF and XGBoost correctly classify all the images using majority voting, followed by CNN.
Figure 7 shows the classification performance for the tubers differentiation phenological stage and four classes: control and three levels of water stress (light, moderate, and severe) (see the confusion matrices in Appendix B); the error bars indicate the standard deviation of the mean of accuracy, sensitivity, and specificity. In this case, XGBoost performs best, followed by RF and MLP. Table 2 compares the classification performance of the three best classifiers (RF, XGBoost, and CNN), alone and using MV. In this case, XGBoost performs best, followed by RF and CNN.
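The accuracy, sensitivity, and specificity reported above can all be derived from a confusion matrix. The following sketch computes them per class in a one-vs-rest fashion; how the four-class sensitivity and specificity were aggregated in the study is not stated, so that choice, like the function names and the toy labels, is an assumption.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

def classification_metrics(cm):
    """Return overall accuracy and per-class (one-vs-rest) sensitivity and specificity.

    Assumes every class occurs at least once, otherwise a ratio is undefined.
    """
    cm = cm.astype(float)
    total = cm.sum()
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp      # true class i predicted as something else
    fp = cm.sum(axis=0) - tp      # other classes predicted as class i
    tn = total - tp - fn - fp
    accuracy = tp.sum() / total
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity

# Toy two-class example: 0 = control, 1 = water stress.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 1, 0]
cm = confusion_matrix(y_true, y_pred, n_classes=2)
accuracy, sensitivity, specificity = classification_metrics(cm)
```

In this toy case the matrix is [[3, 1], [1, 5]], giving an accuracy of 0.8 and, for the stress class, a sensitivity of 5/6 and a specificity of 3/4.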
Figure 8 shows the classification performance at the maximum tuberization phenological stage using two classes: control and water stress (see the confusion matrices in Appendix B); the error bars indicate the standard deviation of the mean of accuracy, sensitivity, and specificity. The best classifier is XGBoost, followed by RF and CNN. Table 3 compares the classification performance of RF, XGBoost, and CNN, alone and using MV over all the images. This table shows that RF and XGBoost both achieve perfect classification using MV on all the images taken at this phenological stage.
Figure 9 shows the classification performance at the maximum tuberization phenological stage using four classes: control, light, moderate, and severe water stress (see the confusion matrices in Appendix B); the error bars indicate the standard deviation of the mean of accuracy, sensitivity, and specificity. Here, XGBoost obtains the best performance, followed by RF and CNN.
As in the case of the two classes, the classification accuracies are good and allow estimation of the water stress from the first day. Table 4 compares the classification performance of RF, XGBoost, and CNN, alone and using MV, where it can be seen that XGBoost in combination with MV achieves perfect classification, followed by RF and CNN.
Figure 10 shows XGBoost classification results on some images of the tubers differentiation phenological stage using four classes. The color code is green for no water stress, blue for light stress, yellow for moderate stress, and red for severe stress. Figure 10a shows the classification for a control plant (no water stress). Figure 10b shows a plant that suffered light stress. Figure 10c shows a plant that suffered moderate stress. Figure 10d shows a plant that suffered severe stress.
Figure 11 shows some XGBoost classification results for the maximum tuberization phenological stage using the same color code as in Figure 10.
Figure 12 shows the band importance for RF classification in the detection (two classes) and estimation (four classes) of water stress at the tubers differentiation phenological stage. Figure 13 shows the same band importance for RF classification of two and four classes at the maximum tuberization phenological stage. As indicated in Figures 12 and 13, the most important bands for classification in RF are the violet, the red edge, and a few wavelengths in the NIR.
Figure 14 shows the band importance for XGBoost classification in the detection (two classes) and estimation (four classes) of water stress at the tubers differentiation phenological stage. Figure 15 shows the same band importance for XGBoost classification of two and four classes at the maximum tuberization phenological stage. From these figures, XGBoost assigns importance to more bands than RF, i.e., it makes better use of the spectral signature of the hyperspectral images. Band importance could help us identify which bands are better suited to detect water stress from multispectral imagery or to define water stress indices specially designed for potato crops.
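A band-importance ranking of the kind shown in these figures can be obtained from the `feature_importances_` attribute of a fitted sklearn random forest (the xgboost scikit-learn wrapper exposes an attribute of the same name). The sketch below is hypothetical in several respects: it assumes the 128 bands are evenly spaced over 400-1000 nm, and it uses synthetic data in which only band index 60 carries class information.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def band_wavelengths(n_bands=128, start_nm=400.0, stop_nm=1000.0):
    """Band-center wavelengths, assuming evenly spaced bands (an assumption)."""
    return np.linspace(start_nm, stop_nm, n_bands)

def top_bands(model, wavelengths, k=5):
    """Return the k most important bands as (wavelength_nm, importance) pairs."""
    importance = model.feature_importances_
    order = np.argsort(importance)[::-1][:k]
    return [(float(wavelengths[i]), float(importance[i])) for i in order]

# Synthetic spectra where the label depends only on band index 60.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 128))
y = (X[:, 60] > 0).astype(int)

model = RandomForestClassifier(n_estimators=100, class_weight="balanced", random_state=0)
model.fit(X, y)

wavelengths = band_wavelengths()
ranking = top_bands(model, wavelengths, k=5)
best_index = int(np.argmax(model.feature_importances_))
```

On real data the ranking would be read against the wavelengths stored in the image header, so that important bands can be related to spectral regions such as the red edge.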

Discussion
The results indicate that even with a small subset of pixels, taken at random from the hyperspectral images, it is possible to obtain good classification accuracies for detecting and estimating water stress in potato crops. The results also indicate that water stress can be detected and measured as early as one day after the onset of the stress in the tubers differentiation phenological stage, and on the same day as the onset of the stress in the maximum tuberization stage. Other researchers [33] also found that hyperspectral imaging could be useful to detect the water supply conditions of leafy vegetables grown in greenhouses, using a modified partial least squares regression algorithm trained to classify different levels of leaf water potential and obtaining a correlation coefficient of 0.826. In this sense, hyperspectral imaging could become a useful tool for the design of precision irrigation systems that optimize the use of water in crops such as potatoes, although more studies under real commercial cultivation conditions are needed.
Over all classification tasks and phenological stages, XGBoost provides excellent classification accuracies, alone or in combination with majority voting, followed closely by random forest. Random forest and XGBoost also provide a direct measure of band importance to detect and estimate water stress. In this case, XGBoost seems to make better use of the whole spectral signature of the canopy, while RF uses a reduced subset of bands. Although the SVM algorithm did not show the best results in this study, the authors of [34] reported promising results when using this algorithm (R = 0.7684) in combination with the Kullback-Leibler divergence (KLD) dimensionality reduction method to select the most relevant bands of hyperspectral images for the detection of moisture content in maize leaves at the seedling stage. For future experiments, it may be useful to evaluate combinations of algorithms that have proven efficient in the detection of relative water content in leaves from remote hyperspectral sensing, as reported by [35], who used artificial neural networks (ANN) after selecting the most important bands through partial least squares regression (PLSR), improving on the performance of ANN alone.
CNN is a deep learning neural network algorithm that extracts features from images. However, despite being the deep learning neural network most used to analyze images [10], its classification performance was lower than that of RF and XGBoost, and only by using majority voting to classify all image pixels was it possible to improve its performance. This is probably because CNN exploits the spatial structure of the images (such as edges) rather than their spectral signature. In this case, the canopy consists mostly of leaves, with no spatial cues related to water stress.
Our results indicate that the combination of machine learning and spectral images constitutes a phenotyping tool useful to detect and estimate water stress in potato plants, which can also be used in genetic improvement processes by choosing those phenotypes that better resist water stress. The reflectance images obtained may be sensitive to the physiological and biochemical changes of the substances and pigments that are degraded and mobilized due to water stress.

Conclusions
This work shows that the detection of water stress, as well as the estimation of the water stress level, is possible with good accuracy, which increases further when majority voting is applied over the whole canopy, at the tubers differentiation and maximum tuberization phenological stages. In particular, the classification results are accurate and available from the first day of stress for both phenological stages. Extreme gradient boost performed best over all phenological stages and classification tasks, followed by random decision forests. XGBoost and RF also provide a measure of the importance of each band to detect or estimate water stress in potato crops. In the case of RF, these bands are the violet, the red edge, and some specific NIR bands, while XGBoost also includes some additional bands in the visible (green, yellow, red) and NIR, making better use of the spectral signature.
These results could lead to the use of more specific normalized water indexes for water stress detection and estimation in potato crops using these machine learning algorithms. However, they are not intended to be used by producers, since this research work was conducted under greenhouse conditions. In this sense, these results are an important basis for further research considering actual potato crop field conditions and cultural practices. This will allow the design of advanced tools for the early detection of water stress, increasing the efficiency of irrigation.

Appendix B