A Decision Support System for Irrigation Management: Analysis and Implementation of Different Learning Techniques

Abstract: Automatic irrigation scheduling systems are highly demanded in the agricultural sector due to their ability to both save water and manage deficit irrigation strategies. Elaborating a functional and efficient automatic irrigation system is a very complex task due to the high number of factors that the technician considers when managing irrigation in an optimal way. Automatic learning systems offer an alternative to traditional irrigation management through predictions that reproduce the decisions of an agronomist. The aim of this paper is to study several learning techniques in order to determine their goodness and error relative to the expert's decisions. Nine orchards were tested during 2018 using linear regression (LR), random forest regression (RFR), and support vector regression (SVR) methods as engines of the proposed irrigation decision support system (IDSS). The results obtained by the learning methods in three of these orchards were compared with the decisions made by the agronomist over an entire year. The prediction model errors determined the best-fitting regression model. The results obtained lead to the conclusion that these methods are valid engines for developing automatic irrigation scheduling systems.


Introduction
Water is a limiting factor in agricultural production. This fact is intensified in regions where water is scarce. In these regions, the importance of properly managing irrigation is a fundamental factor for sustainable production. There are agricultural techniques that have made it possible to optimize irrigation management, from the use of drip irrigation systems to regulated deficit irrigation strategies able to maintain yields with lower irrigation volumes [1,2].
Information and communication technologies (ICT) have contributed to the sustainable management of water in agriculture. The deployment of wireless sensor networks in crops using Internet of Things (IoT) technologies and the remote management of data with cloud computing have allowed massive monitoring of agricultural variables, which generate a large amount of information [3,4]. This information helps the agronomist to determine the water status of the soil-plant-atmosphere continuum and to make decisions about irrigation, and whether different deficit irrigation strategies adapted to phenology and physiology of the crop should be implemented [5]. In addition, the democratization of IoT technologies for monitoring soil and weather variables is allowing a wide diffusion of water saving tools in home contexts [6,7].
However, the continuous modernization of irrigation systems needs to implement equipment that allows an automated scheduling of irrigation. It must include sensors to provide different parameters [8,9]. Traditionally, these parameters are related to environmental conditions and provide information about the full crop water requirements using weather stations [6], as well as the soil's water status or volumetric content, which indicates the water availability for the plant. The most commonly used soil parameter sensors are those based on dielectric properties, since they are cheap and flexible [10,11], although their correct operation requires complex calibration, taking into account factors such as soil texture and structure, temperature, and water salinity [12][13][14], besides the spatial variability of the soil conditions [15]. Other sensors such as thermal and multispectral cameras, satellites, or infrared radiometers (IR) are used to estimate crop water needs [16][17][18][19].
Regarding irrigation automation, soil sensors have started to be used for this purpose, considering water matric potential [20] and volumetric water content [21], with thermal sensors [22] and, recently, their combination with wireless technologies for flexible implementation [9]. These systems use fixed or dynamic thresholds of the soil, and measure atmospheric or plant parameters for irrigation actuation [23,24].
However, irrigation management can take into account more variables, including the hydro-physical properties of certain soils, the parameters related to other crops, their stages of development, water quality [25,26], and factors related to productivity, fruit quality, and the implementation of deficit irrigation strategies [27], which prevent the proper functioning of irrigation thresholds.
Systems based on machine learning (ML) techniques use the previous irrigation management experience of a human expert to train a system to reproduce that expert behavior. Fuzzy logic, artificial neural networks (ANNs), or regression procedures have been used recently for automatic irrigation management [28][29][30].
Decision support systems (DSSs) are ML applications that use the knowledge of an agronomist (human expert) to learn irrigation scheduling patterns and emulate human activities in decision-making. Additionally, DSSs permit a continuous learning process (while being used) and adapt their performance to context changes or different objectives [31]. Therefore, DSSs in agriculture are useful tools for optimal irrigation management and have demonstrated good behavior [32]. These systems have been defined and implemented over several years in the agricultural sector for a large range of applications, not only for developing irrigation management [33] but also crop growth models [34], and financial and agricultural management models [35]. In some applications for irrigation management, knowledge-based learning models have been developed using climate data provided by weather station networks [36].
Different algorithms are used for water needs estimation in automatic irrigation systems based on ML. ANN [37,38] and support vector regression (SVR) [33] algorithms are widely used for DSS development. In [39], k-nearest neighbor (kNN) and adaptive boosting (AdaBoost) algorithms were compared with an ANN for the estimation of potato water needs. The authors in [40] compared SVR with multivariate adaptive regression splines (MARS) and M5 model tree (M5Tree) in modeling reference evapotranspiration (ETo). Genetic algorithms (GAs) [41] and random forest regression (RFR) [42] have also been used for water needs estimation.

This paper presents the development of an irrigation decision support system (IDSS) for irrigation management optimization in citrus trees. The system uses the following information obtained automatically by in-the-field sensors: (1) weather data, (2) the amount of applied water data from the previous week, and (3) soil water status and water quality data. We also considered aspects such as meteorological predictions, the type of crop, and the phenological stage on which the model performs the irrigation calculation. The system was trained using reports made by agronomists for determining the irrigation frequency and doses on a weekly basis. Three regression learning algorithms were compared for performance purposes: linear regression (LR), SVR, and RFR. The models were evaluated using leave-one-out cross-validation (LOOCV) and random 90-10 shuffle cross-validation techniques.
The irrigation amount estimated for an entire year by the IDSS was compared with the irrigation water applied by the agronomists, allowing us to compare the performance of different learning algorithms.
Section 2 describes the materials and methods used for the development of the IDSS. Section 3 summarizes the results obtained and a discussion. The conclusions are described in Section 4. Figure 1 shows the process used by an agronomist (expert) to determine the amount of irrigation.

ETo (reference evapotranspiration) is obtained from meteorological variables and represents the effect of the weather on net crop water requirements, while Kc (crop coefficient) reflects the specific characteristics of the crop and their effect on water needs (the type of crop, the development and phenological stage, etc.). In a third stage, the quality of the irrigation water, the uniformity coefficient of the irrigation system, the field size, etc., allow for determining the real volumes of irrigation. This value gives an approximate idea of the amount of water needed to satisfy full crop water requirements. It is modulated with the water status of the soil to obtain the volume of irrigation to apply to the crop.
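As a minimal illustrative sketch (not code from the paper), the first stages described above combine ETo and Kc through the standard FAO-56 relation ETc = Kc × ETo:

```python
def crop_water_needs(eto_mm, kc):
    """Net crop water requirement ETc (mm), computed from the reference
    evapotranspiration ETo (mm) and the crop coefficient Kc (FAO-56 relation)."""
    return kc * eto_mm

# e.g. a weekly ETo of 30 mm and Kc = 0.5 give 15 mm of net requirements
print(crop_water_needs(30.0, 0.5))  # -> 15.0
```

The later stages (water quality, uniformity coefficient, soil water status) would then modulate this theoretical value, as the paragraph above describes.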

Materials and Methods
The main goal of the IDSS is to calculate the irrigation doses that have to be applied to the crop. This decision is taken automatically based on the information provided by the sensors and the prediction of a machine learning system. The aim of this component, therefore, is to mimic a human expert (agronomist) in the decision-making process.
The IDSS described in this paper was trained using different varieties of citrus trees (orange, mandarin, and lemon trees) and cultivated in different plots. For each one of them, weekly irrigation reports (carried out by the agronomist) were used to train the system.
In the following paragraphs, the main parts used to develop the IDSS are described: (i) the integrated information platform that provides the soil and irrigation data, (ii) the crops and plots where the data and irrigation reports were obtained and where the IDSS was tested, and (iii) the architecture of the different machine learning algorithms implemented in the IDSS.

Data Collection Platform
The information about the soil and weather conditions is provided by wireless devices called nodes, developed by Widhoc Smart Solutions (CEDIT, Fuente Álamo 30320, Spain). The wireless nodes collect and send data using Wi-Fi or GPRS links. A cloud server stores and indexes the data for further processing purposes. The nodes are powered by solar panels and rechargeable batteries.
The main variables collected by the nodes are as follows: (1) soil matric potential, measured by MPS-6 sensors (Decagon Devices, Inc., Pullman, WA 99163, USA).

Each node samples the variables every 15 min and sends the information to a data server. This server manages communication with the nodes and integrates the databases (customer and equipment information is stored securely). The processed information is shown on a customizable website front-end layer.
Information about weather conditions is provided by the IMIDA (Instituto Murciano de Investigación y Desarrollo Agrario y Alimentario, 30150 Murcia, Spain). This public research institution has deployed a network with 49 climatic stations that cover the region of Murcia (SIAM) [43] (Figure 3). The network is widely used to determine the ETo, in addition to other variables such as temperature (T), relative humidity (RH), global radiation (GR), wind speed (WS), rainfall (RF), dew point (DP), and vapor pressure deficit (VPD).
The weather information from the SIAM network is automatically integrated into the cloud server to complement the soil and water data from the nodes.


Plot and Report Description
The data were collected from nine commercial orchards of citrus trees located in southeastern Spain, specifically in the region of Murcia. This is a semiarid zone where water is very scarce and drip irrigation is commonly used. The irrigation criterion followed was to maximize the yield per unit area. Table 1 shows the main characteristics of the orchards and the number of reports used in the training process. The weekly reports to train the system were generated by two different technicians. They used the information from the automatic weather stations closest to the orchard placement, including crop parameters and other data such as water quality and soil sensor information. They also needed the amount of water applied to the orchard during the previous week. As a final result, the generated reports suggest the total amount of water for the following week. The total number of reports available is 484, for nine different citrus crops located in southeastern Spain.

Irrigation Decision Support System (IDSS)
The proposed IDSS is a trained system that automatically predicts the amount of irrigation water needed for the orchard. A supervised training process was used. Irrigation reports performed by the agronomist in the orchard were used as the ground-truth.
In order to cover the main options among automatic learning systems, regression techniques were implemented. The regression methods used to predict the agricultural technician's criteria were LR, RFR, and SVR. A detailed description of each method and the corresponding parameter selection is presented below.
Two validation techniques (LOOCV and 90-10 shuffle cross-validation) were used to test the goodness of every model, as suggested in [44].
The root-mean-square error (RMSE), given in Equation (1), was used to measure the accuracy of the forecasting models:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}, \qquad (1)$$

where n is the number of data, y_i is the actual output of instance i, and ŷ_i is the corresponding estimated output. It is a global and very standard error measure, where lower values mean higher accuracy in the predictions. Note that the RMSE is measured on the same scale as the output variable, so a simple comparison among the RMSEs of the forecasting methods is enough to evaluate their performance when the output variable is the same for all of them.
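As a plain-Python illustration of Equation (1) (a sketch, not the authors' code):

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error: the square root of the mean squared residual.
    Lower values mean more accurate predictions, on the scale of the output."""
    n = len(y_true)
    return math.sqrt(sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred)) / n)

# Predictions off by a constant 2 units give an RMSE of exactly 2
print(rmse([10.0, 20.0, 30.0], [12.0, 22.0, 32.0]))  # -> 2.0
```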

Description of the Output and Input Variables
In this research, the same output and input variables were used for the three forecasting methods (LR, RFR, and SVR). The output (response variable) in all cases was the total amount of irrigation water for the next week suggested by the agronomists (given in their reports).

Regarding the set of possible input variables, the selection was made according to the main information used by the agronomists when developing the irrigation reports, such as the total water needs (TWN), the soil water status, the amount of water applied previously, and the critical period of the crop: in the case of citrus trees, the main critical periods are flowering and fruit setting (Stage I), and a second period when fruit is growing fast (Stage II).

However, there are more factors that might affect irrigation prediction (such as the weather prediction, the possible irrigation cutoff in the area, etc.), and those factors are not taken into account by the agronomist. This could be a limitation of this IDSS.

The selection of a suitable set of features is crucial for good performances in the prediction models.
In this sense, a selection procedure similar to that depicted in [36] was developed. The inputs that performed best in this new context were the following:
• the daily average of the matric potential of the last 5 d (five inputs);
• the TWN (one input);
• the water applied (sensor-measured) during the previous week (one input);
• a binary value indicating whether or not the crop is in a period where the fruit is gaining weight (one input).

The daily average of the soil matric potential gives representative information about the conditions of the soil in the previous week. The TWN provides the theoretical irrigation volume for the crops in a specific area with specific weather conditions. The quantity of water applied during the last week is a hint of what the water requirement for the next week should be, as the water requirements of consecutive weeks are highly correlated. Finally, the period of the crop helps to fine-tune the irrigation quantity: periods without fruit are less critical than those in which the fruit is present on the crop [27,45].
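The eight inputs described above can be assembled into a single feature vector; the following sketch is illustrative (the function name and units are assumptions, not taken from the paper):

```python
def build_features(matric_daily_avg, twn, water_prev_week, fruit_period):
    """Assemble the 8-element input vector used by the IDSS regressors.
    matric_daily_avg: list of 5 daily averages of soil matric potential
    twn: total water needs for the week
    water_prev_week: water applied during the previous week
    fruit_period: True if the crop is in a fruit-weight-gaining period
    """
    assert len(matric_daily_avg) == 5, "expected one average per day, 5 days"
    return list(matric_daily_avg) + [twn, water_prev_week, int(fruit_period)]

x = build_features([-25.0, -27.1, -30.4, -28.9, -26.2], 180.0, 150.0, True)
print(len(x))  # -> 8
```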
Due to the nature of the output (the amount of water for the next week), we selected the regression methods described below to obtain the predictions.

Linear Regression
LR is a classical statistical method that explains a target variable Y (called a response variable) as a linear function of a set of features X j controlled by the researcher (called regressors or predictors).
In general, the multiple LR model can be expressed as follows:

$$Y_i = \beta_0 + \beta_1 X_{i,1} + \beta_2 X_{i,2} + \cdots + \beta_k X_{i,k} + \varepsilon_i, \quad i = 1, \ldots, n, \qquad (2)$$

where n denotes the sample size. The β_j parameters of the model are estimated using the least squares criterion. In general, some of the predictors proposed by the researcher might not be significant (that is, irrelevant when the rest of the predictors are considered in the model), so it is important to provide simpler models when possible. There are different methods to achieve a simplified model, such as stepwise, forward, and backward selection. The results of the model selection depend on the data being analyzed. In the present research, the three selection methods were applied to the dataset before estimating the multiple LR model, and the same results were obtained.

The main advantage of the LR method over other ML methods is the fast computation time in which the parameters of the model are estimated. Under a suitable theoretical framework, it allows inferences on the regression parameters and predictions. Although the LR method has shown good behavior in many contexts and fields, its efficiency is limited to linear relationships between the response variable and the predictors. However, real problems might present nonlinear and complex relationships between them.
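A minimal numerical sketch of least-squares estimation (using NumPy's generic solver rather than any specific statistical package the authors may have used):

```python
import numpy as np

# Synthetic, noise-free data: y = 2 + 3*x1 - 1*x2, so least squares
# recovers the coefficients exactly
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 2.0 + 3.0 * X[:, 0] - 1.0 * X[:, 1]

# Design matrix with an intercept column; beta minimizes ||y - A @ beta||^2
A = np.column_stack([np.ones(len(X)), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.round(beta, 6))  # approximately [2, 3, -1]
```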

Regression Trees: Bagging Regression and Random Forest Regression
In the case of nonlinear and complex relationships between the features and the response, regression trees have shown better performance than classical approaches. In a regression tree, the feature space is divided into J non-overlapping "boxes," and the prediction for a new observation is given by the mean of the response values of the training data belonging to the same "box" as the new observation.
Let (x_1, y_1), (x_2, y_2), ..., (x_n, y_n) be the training dataset, where each y_i denotes the i-th output (response variable) and x_i = (x_{i,1}, x_{i,2}, ..., x_{i,k}) the corresponding input of the k predictors (features) in the study. The objective in a regression tree is to find boxes B_1, B_2, ..., B_J that minimize the RSS, given by Equation (3):

$$\mathrm{RSS} = \sum_{j=1}^{J} \sum_{i:\, x_i \in B_j} (y_i - \hat{y}_{B_j})^2, \qquad (3)$$

where $\hat{y}_{B_j}$ is the mean response for the training observations within the jth box.
A desirable criterion could be to find the partition of the feature space that minimizes the residual sum of squares (RSS) for the training dataset, but this approach is usually computationally infeasible. Therefore, the way to obtain the partition of the feature space is by means of binary splitting: the algorithm chooses, at each step, the predictor (X j ) and cut-point (s) that minimize the RSS for the resulting tree. In general, the above optimization problem is not very computationally demanding, except when the number of features is too large. The process is repeated until a stopping criterion is reached, for instance, until all regions contain less than a determined number of observations.
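The greedy binary-splitting step described above can be sketched for a single feature (illustrative code, not the authors' implementation):

```python
def best_split(xs, ys):
    """One step of binary splitting for a single feature: choose the cut-point s
    that minimizes the residual sum of squares (RSS) of the two resulting boxes,
    where each box predicts the mean of its training responses."""
    best = (None, float("inf"))
    for s in sorted(set(xs))[1:]:  # candidate cut-points between observed values
        left = [y for x, y in zip(xs, ys) if x < s]
        right = [y for x, y in zip(xs, ys) if x >= s]
        rss = sum((y - sum(left) / len(left)) ** 2 for y in left) + \
              sum((y - sum(right) / len(right)) ** 2 for y in right)
        if rss < best[1]:
            best = (s, rss)
    return best

# Two well-separated groups: the cut between them gives zero RSS
s, rss = best_split([1, 2, 3, 10, 11, 12], [5.0, 5.0, 5.0, 20.0, 20.0, 20.0])
print(s, rss)  # -> 10 0.0
```

A full tree repeats this search over all features and recurses on each box until a stopping criterion is met.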
Regression trees have many advantages: they are easy to explain, computationally fast to obtain, and can handle missing data, outliers, and irrelevant features. However, they are very sensitive to the data and can overfit the training data (there are lots of small branches). A way to keep from overfitting is to prune the least important leaves of the tree. Therefore, using a good strategy for pruning the tree, a single regression tree can be used as a prediction method, but it is more efficient to use them as base learners in complex solutions.
In random forest and bagging, the original training dataset is used to build N new training sets by random sampling with replacement (bootstrap). For each new training set, the corresponding regression tree is developed. Given a new observation, the prediction of each single tree is computed, and the final prediction is obtained as the mean of the single predictions. Averaging the trees reduces the variance and improves accuracy [46].
The difference between bagging (bootstrap aggregating) and random forest is the number of predictors (features) considered at each split of the tree. In bagging, all the features are used, whereas in random forest only a random sample of mtry (predictors considered at each split of the regression tree) can be chosen each time. This last approach allows one to reduce the variance more efficiently.
The main parameters to be tuned in random forest are N and mtry, whereas for bagging only N should be tuned because mtry = k (the total number of features) is determined. It is important to highlight that, for these methods, the fitting goodness increases (or at least does not decrease) with the number of trees, so we do not have to worry about choosing values for N that are greater than necessary.
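In scikit-learn terms (an illustrative sketch on synthetic data, not the authors' implementation), N corresponds to `n_estimators` and mtry to `max_features`; setting `max_features` to the full feature set turns random forest into bagging:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
X = rng.uniform(size=(200, 8))  # 8 features, as in the IDSS input vector
y = 50.0 * X[:, 0] + 20.0 * X[:, 5] + rng.normal(scale=1.0, size=200)

# Random forest: only mtry = 3 randomly chosen features considered per split
rf = RandomForestRegressor(n_estimators=100, max_features=3,
                           random_state=0).fit(X, y)
# Bagging: all k features considered at each split (mtry = k)
bag = RandomForestRegressor(n_estimators=100, max_features=1.0,
                            random_state=0).fit(X, y)

print(rf.predict(X[:1]).shape, bag.predict(X[:1]).shape)
```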

Support Vector Regression
Support vector machines (SVMs) are very popular in the field of classification problems. The adaptation of the SVM approach to regression problems has led to an effective tool in function estimation: SVR. In this section, the basis of the SVR technique is depicted together with suggestions for the parameter selection stage of the method. Firstly, the linear case is presented for simplicity, and the nonlinear case is introduced secondly.
The objective of linear support vector regression (LSVR) is to find a linear function of the following form that can fit the actual output vector y (response variable) while balancing model complexity and prediction error:

$$f(x) = \langle w, x \rangle + b. \qquad (4)$$

In Equation (4), x denotes the vector of input features, k the number of features, w the vector of parameters to be estimated, and b the position parameter to be estimated. The vector of parameters w represents the flatness or simplicity of the function. SVM generalization to SVR is accomplished by introducing an ε-insensitive region around the function, called the ε-tube, in such a way that observations y_i that lie inside the tube lead to null errors.
For the development of the method, it is necessary to set a tolerance margin ε (penalizing only the points placed outside the ε-tube [47]) and to select a loss function (which describes the way estimation errors are measured). There are different types of loss functions (linear, quadratic, Huber, etc.), and we can distinguish between symmetrical and asymmetrical ones. In this paper, we focus on Vapnik's ε-insensitive linear loss function, given by

$$L_\varepsilon(y, f(x)) = \begin{cases} 0, & |y - f(x)| \le \varepsilon \\ |y - f(x)| - \varepsilon, & \text{otherwise.} \end{cases} \qquad (5)$$

Note that the loss function is only affected by the training samples that lie outside the ε-tube, which are called support vectors. Though many phenomena can be modeled by linear functions, there are many situations where the relationship between the output (response variable) and the input variables (features) is not linear.
For nonlinear functions, the data can be mapped into a higher-dimensional space by means of a nonlinear function φ(x) : R^k → R^M, M > k. Now the aim is to find a function of the following form that can fit the output vector y while balancing model complexity and prediction error:

$$f(x) = \langle w, \varphi(x) \rangle + b. \qquad (6)$$

The optimization problem can then be written for the nonlinear case as

$$\min_{w,\, b,\, \xi,\, \xi^*} \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*)$$
$$\text{subject to} \quad y_i - \langle w, \varphi(x_i) \rangle - b \le \varepsilon + \xi_i, \quad \langle w, \varphi(x_i) \rangle + b - y_i \le \varepsilon + \xi_i^*, \quad \xi_i,\, \xi_i^* \ge 0, \qquad (7)$$

where ξ_i, ξ_i^* represent the upper and lower training errors (see Figure 4), and C is a regularization parameter that controls the model error and model simplicity trade-off. For example, large values of C give more weight to minimizing the model error.
Figure 4 shows the behavior of a nonlinear SVM [23]. The selection of an appropriate transformation φ(x) is not an easy task. However, one advantage of SVR is that, in practice, the nonlinear function φ(x) does not need to be used.
If we rewrite the optimization problem of Equation (7) in its dual form, it can be seen that only the inner products ⟨φ(x_i), φ(x_j)⟩ are needed. In this context, Vapnik [47] proposes computing these inner products through a "kernel trick":

$$K(x_i, x_j) = \langle \varphi(x_i), \varphi(x_j) \rangle, \qquad (8)$$

where K(·,·) is a function that verifies Mercer's theorem [47]. Therefore, the SVR technique requires the selection of ε (margin of the tube), C (regularization parameter), and the kernel function. Some of the most used kernel functions in the context of SVR are the linear, polynomial, and sigmoid kernels.
In this paper, the radial basis function (RBF) kernel, given by Equation (9), was used because of its good results in nonlinear relations [48]:

$$K(x_i, x_j) = \exp\left(-\gamma \, \| x_i - x_j \|^2\right), \qquad (9)$$

where γ > 0 is the kernel width parameter.
Recall that higher values of C provide more complex models and can produce overfitting of the training data. Smaller values of C result in a simpler model but low accuracy. For this paper, the value of this parameter was selected following the indications of Mattera and Haykin [49], who propose that C should be equal to the range of the output.
The parameter ε also affects the smoothness or complexity of the model. In addition, the value of ε determines the number of support vectors. Smaller values of ε lead to higher numbers of support vectors and, therefore, a more complex learning machine. However, higher values of ε lead to a lower number of support vectors, so important information may be lost. In this work, the suggestions of Cherkassky and Ma [50] and Mattera and Haykin [49] were chosen. They propose a value of ε such that the percentage of support vectors in the regression model is around 50% of the number of samples.
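The parameter choices above can be illustrated with scikit-learn's SVR on synthetic data (a sketch under the stated heuristics, not the authors' code); following Mattera and Haykin, C is set to the range of the output:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(150, 1))
y = 10.0 * np.sin(X[:, 0]) + rng.normal(scale=0.5, size=150)

# Heuristic: set the regularization parameter C equal to the output range
C = float(y.max() - y.min())
# epsilon controls the tube width and hence the number of support vectors
model = SVR(kernel="rbf", C=C, epsilon=0.5).fit(X, y)

frac_sv = len(model.support_) / len(X)
print(round(frac_sv, 2))  # fraction of training points kept as support vectors
```

In practice, ε would be tuned until roughly 50% of the samples become support vectors, as suggested in [49,50].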

Results and Discussion
In this section, we show the results obtained after the training and testing stages of each regression method, and we provide different measures to compare their performances.
Regarding the LR method, the final estimated model was the same using the three selection methods (stepwise, forward, and backward).
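As an illustration of these selection strategies, the sketch below runs forward and backward sequential feature selection around a linear regression on synthetic data (scikit-learn's `SequentialFeatureSelector` is a greedy CV-scored variant, not the classical p-value stepwise procedure; the data and the choice of two selected features are assumptions for the example). With a strong signal, both directions recover the same feature subset, mirroring the paper's observation that the three selection methods yielded the same final model:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.feature_selection import SequentialFeatureSelector

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))
# Only features 0 and 2 drive the response.
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(0, 0.1, 300)

selected = {}
for direction in ("forward", "backward"):
    sfs = SequentialFeatureSelector(
        LinearRegression(), n_features_to_select=2, direction=direction
    ).fit(X, y)
    selected[direction] = tuple(np.flatnonzero(sfs.get_support()))
```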
As for the RFR method, a value of N = 500 (the total number of trees considered in the training stage) was selected, whereas mtry = 8 provided the best goodness-of-fit in the test dataset.
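A minimal sketch of this configuration in scikit-learn, where R's `mtry` corresponds to `max_features` (the synthetic dataset and the train/test split are assumptions for illustration, not the paper's data):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(2)
X = rng.uniform(size=(400, 10))   # hypothetical input features
y = 50 * X[:, 0] + 30 * X[:, 1] ** 2 + rng.normal(0, 1, 400)

X_train, X_test = X[:360], X[360:]
y_train, y_test = y[:360], y[360:]

# N = 500 trees; mtry = 8 features tried at each split.
rfr = RandomForestRegressor(n_estimators=500, max_features=8, random_state=0)
rfr.fit(X_train, y_train)

rmse = mean_squared_error(y_test, rfr.predict(X_test)) ** 0.5
```

In practice, `max_features` is the main tuning knob of a random forest: lower values decorrelate the trees, while values near the total number of features make each tree closer to a plain bagged regression tree.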
Finally, the SVR model that performs best uses an RBF kernel, a penalty factor C of 100, and an epsilon of 8 (obtaining 265 support vectors, which is slightly higher than 50% of the sample size).
In order to analyze the performance of the different models, LOOCV and random 90-10 shuffle cross-validation techniques were used to ensure a better evaluation. In both cases, the RMSE was selected as the error measure to evaluate the accuracy in the training and test datasets.

Table 2 shows the results of the 90-10 shuffle cross-validation. The three regression models were compared with a dummy model that irrigates the crops based only on the TWN. This dummy model takes advantage of the fact that the maximum amount of water calculated by the agronomist is typically less than the TWN. Using the training data, the optimal percentage is calculated based on the phenological period of the crop: a value of 95% of the TWN is fixed in the more critical period (when the fruit is present), and 75% of the TWN when the fruit is not present. These values are consistent with the irrigation recommendations made by the agronomist.

Table 2. Root-mean-square error (RMSE) of 90-10 shuffle cross-validation for the models (m³ ha⁻¹ week⁻¹). LR: linear regression; RFR: random forest regression; SVR: support vector regression.

In the shuffle cross-validation, the three regression models perform much better than the dummy method. SVR and RFR perform similarly, with test RMSEs of 17.13 and 16.83 m³ ha⁻¹, respectively. LR performs worse than the other two models, which seems natural given its linear nature, unable to adapt well enough to more complex situations.

Table 3 shows the results of the LOOCV for the four prediction models and each orchard separately. The first column represents the orchard number. In this case, for each selected orchard, its data are not used for training, only for testing; in other words, this approach tests the generalization capabilities of the model. Each row represents the training and test errors of the models when the selected orchard is left out.
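The dummy baseline and the 90-10 shuffle cross-validation loop described above can be sketched as follows (the synthetic TWN values, the fruit-presence flag, and the noise level are assumptions for illustration; only the 95%/75% fractions, the 90-10 split, and the RMSE measure come from the text):

```python
import numpy as np
from sklearn.model_selection import ShuffleSplit

def dummy_irrigation(twn, fruit_present):
    """Dummy baseline: a fixed fraction of the theoretical water needs
    (TWN), 95% while the fruit is present and 75% otherwise."""
    frac = np.where(np.asarray(fruit_present, dtype=bool), 0.95, 0.75)
    return frac * np.asarray(twn, dtype=float)

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

rng = np.random.default_rng(3)
twn = rng.uniform(50, 200, 300)            # weekly TWN, m^3/ha (synthetic)
fruit = rng.random(300) < 0.5
y = dummy_irrigation(twn, fruit) + rng.normal(0, 5, 300)  # synthetic targets

# 90-10 shuffle cross-validation: 10 random splits, 10% held out each time.
cv = ShuffleSplit(n_splits=10, test_size=0.10, random_state=0)
test_errors = [rmse(y[te], dummy_irrigation(twn[te], fruit[te]))
               for _, te in cv.split(twn.reshape(-1, 1))]
mean_rmse = float(np.mean(test_errors))
```

A trained regressor would simply replace `dummy_irrigation` inside the loop, with a `fit` call on the training indices of each split.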

In this case, RFR is still the model that performs best, with LR in second place, performing slightly better than SVR, with average test RMSE values of 18.01, 18.35, and 19.99 m³ ha⁻¹ per week, respectively. All models performed better than the dummy model.

Figure 5 and Table 4 analyze the water requirement per month for the RFR model, the dummy model, the TWN, and the ground truth (GT) during the year 2018 for three of the orchards. Although all nine plots were used for the model training and error comparison phases, a comparative representation of the goodness of the models with respect to the GT (agronomist reports) was made only for those plots for which full years of data were available: Orchards 6, 7, and 9.

In this case, the relative error (RE), given in Equation (10), was used to evaluate the monthly accuracy of each prediction method:

RE_j = |y_j − ŷ_j| / y_j,    (10)

where y_j is the actual output for month j, and ŷ_j is the estimated output for month j. Note that this error measure is a dimensionless quantity (see Figure 6), and it does not make sense when the output variable can take null values. The mean relative error (MRE), computed as the mean of the monthly relative errors, provides a dimensionless global error measure (see Table 4):

MRE = (1/m) Σ_{j=1}^{m} RE_j,    (11)

where m denotes the number of months considered; in our case, m = 12 because the methods were evaluated only for the year 2018.

Figure 6 shows the distribution of the relative errors associated with each prediction model compared to the GT. We can notice that the RFR model shows much less dispersion and smaller relative errors than the TWN and dummy methods.

Irrespective of the magnitude of the residuals, it is important to verify that the prediction errors behave properly (no bias, low dispersion, and symmetry). In this sense, the quality of the errors was checked by analyzing, for each prediction method, the distribution of the net residuals by means of the box plots given in Figure 7. It can be seen that the TWN and dummy methods provide higher bias, dispersion, and asymmetry than the RFR method. In Figure 8, we can see that the monthly and weekly RFR predictions follow quite precisely the tendency of the GT for all months and weeks in each of the three orchards tested.
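The RE and MRE measures described above amount to a few lines of code. The sketch below uses hypothetical monthly values (the GT and prediction arrays are invented for the example), and assumes the absolute form of the relative error, consistent with the text's remark that it is undefined when the output is null:

```python
import numpy as np

def relative_error(y_true, y_pred):
    """Monthly relative error, Eq. (10): |y_j - yhat_j| / y_j.
    Undefined when y_j = 0."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.abs(y_true - y_pred) / y_true

def mean_relative_error(y_true, y_pred):
    """MRE, Eq. (11): mean of the monthly relative errors."""
    return float(np.mean(relative_error(y_true, y_pred)))

gt = np.array([100.0, 200.0, 400.0])    # hypothetical monthly GT (m^3/ha)
pred = np.array([110.0, 180.0, 400.0])  # hypothetical predictions

mre = mean_relative_error(gt, pred)     # (0.10 + 0.10 + 0.0) / 3
```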

Conclusions
This paper describes the design and development of an automatic decision support system to manage irrigation in agriculture.
The system was trained with climatic and soil data from nine different citric crops located in different zones of Southeast Spain.
The aim of the IDSS is to mimic the irrigation recommendations of an agronomist, with the idea of creating a robust model (with good generalization capabilities) able to precisely predict the weekly water requirement of the crops with no previous information of the specific field.

Three regression methods were tested to determine the one that best fits the agronomist's criteria. RFR was the method that best emulated the agronomist.
Regardless of the regression model used, the number of agronomist reports available for training is a critical factor that directly affects the performance of the methods.
Regarding the water applied, we can conclude that irrigating based only on the TWN wastes water (an excess of 282, 722, and 1049 m³ ha⁻¹ per year for Orchards 6, 7, and 9, respectively) compared to the agronomist. In contrast, the dummy estimator tends to heavily underestimate the water requirements (by 766, 404, and 393 m³ ha⁻¹ for Orchards 6, 7, and 9, respectively). The RFR model performed much better than the others (deviations of 408, 248, and 7 m³ ha⁻¹), with errors below 12% for all the orchards and weeks.
In terms of performance, comparing the water predicted by the IDSS against the agronomist, the system has a weekly average error below 9% in the most critical periods (those when the fruit is growing), with a 10% error being considered acceptable in agriculture. It can be concluded that the IDSS is a viable predictor [51].
For future research, we aim to extend the dataset with more citrus plantations in order to analyze the performance in different regions and weather conditions. In addition, exporting this model to other plantations different from citrus and adding data only from VWC sensors would be a good way of evaluating the robustness of the model and decreasing the cost of the whole system.
Another future improvement would be to migrate the system from weekly to daily predictions. This change would vastly increase the potential of the model, as it could adapt more quickly to changes in weather conditions and reduce the reaction time, resulting in water savings.