The Integration of Multivariate Statistical Approaches, Hyperspectral Reflectance, and Data-Driven Modeling for Assessing the Quality and Suitability of Groundwater for Irrigation

Sustainable agriculture in arid regions requires that the quality of groundwater be carefully monitored; otherwise, low-quality irrigation water may cause soil degradation and negatively impact crop productivity. This study aimed to evaluate the quality of groundwater samples collected from wells in the Quaternary aquifer, which are located in the Western Desert (WD) and the Central Nile Delta (CND), by integrating multivariate analysis, proximal remote sensing data, and data-driven modeling (adaptive neuro-fuzzy inference system (ANFIS) and support vector machine regression (SVMR)). Data on the physiochemical parameters were subjected to multivariate analysis to ease the interpretation of groundwater quality. Then, six irrigation water quality indices (IWQIs) were calculated, and the original spectral reflectance (OSR) of groundwater samples


Introduction
Groundwater is regarded as one of the most essential irrigation resources, especially in countries that face extreme water scarcity. However, various agricultural and other human activities in recent decades have resulted in the severe pollution of groundwater, making it unsuitable for irrigation purposes under ordinary conditions. Therefore, water quality, which is fundamental to groundwater management and irrigation water, should be regularly evaluated and monitored.
The viability of any water resource for irrigation purposes is evaluated through several irrigation water quality indices (IWQIs) [1]. These indices, which are based on several physiochemical variables of water, include a water quality index (WQI), residual sodium bicarbonate (RSBC), total hardness (TH), potential salinity (PS), residual sodium carbonate (RSC), magnesium hazard (MH), and others [2][3][4][5][6]. The main objective of each index is to determine groundwater's suitability for either drinking or irrigation purposes. For example, the primary objective of the WQI is to evaluate the overall water quality levels by transforming a wide range of selected physiochemical water variables into a single numerical value using a simple mathematical function [4,7,8]. Again, this index can be considered a simplified model of the complex reality of water quality. The main objective of MH is to represent the excess of Mg 2+ in irrigation water. A high MH value in irrigation water may lead to soil alkalinity, as well as reduce the infiltration capacity of soil if a large amount of water is adsorbed between magnesium and clay particles [5,9]. The primary goal of RSC is to measure the relationship between the sum of CO 3 2− and HCO 3 − and that of Ca 2+ and Mg 2+ [10]. A high-RSC value in irrigation water makes the soil infertile because of the deposition of sodium carbonate [10][11][12]. PS is a characterization of the dominant Cl − and SO 4 2− in irrigation water [13]. Since salts with low solubility precipitate in the soil and accumulate with each successive round of irrigation, this leads to issues of soil salinity. Thus, a high PS value would foster soil permeability problems and thus inhibit the water absorption by crops [4]. Therefore, various important dissolved ions in groundwater such as Na + , K + , Ca 2+ , Mg 2+ , Cl − , SO 4 2− , CO 3 2− , and HCO 3 − represent a common factor in determining various IWQIs. 
This is because the productivity of field crops and the physiochemical properties of soil depend mainly on the concentrations of these ions in irrigation water. If the concentration of particular ions exceeds a specific threshold, the water is not suitable for irrigation purposes. For example, excessive concentrations of Na + in irrigation water reduce soil permeability and hinder the uptake of essential ions (Ca 2+ , Mg 2+ , and K + ). Excessive concentrations of Mg 2+ can lead to soil alkalinity. Excessive concentrations of HCO 3 − negatively affect the plant uptake and metabolism of nutrients. Together, these negative impacts limit the growth and productivity of most field crops [4,5,9].
The evaluation of groundwater quality through the analysis of a large number of physiochemical parameters produces complex data matrices, which make the intricacies of water quality difficult to understand and interpret. Recently, multivariate statistical methods such as factor analysis (FA) and principal component analysis (PCA) have been used to improve the interpretation of large and complex water quality data matrices without losing necessary information collected from actual observations [14][15][16][17]. Both types of analysis, which incorporate several physiochemical parameters, are applied to extract new components or factors that account for most of the variation in groundwater quality [16,18,19,20]. In both methods, each new factor has its own effect on groundwater quality within the area under investigation. In summary, FA can determine the key parameters that explain the original data to the greatest extent, whereas PCA can provide a comprehensive picture of the interrelationships between all parameters and confirm their relative degrees of influence.
Thus, in the present study, FA was used together with PCA to examine the influence of different physiochemical parameters on groundwater quality.
Although all of the IWQIs are simple indicators that reflect overall water quality levels, establishing them through conventional procedures (collecting samples at fixed points in the field, analyzing several physiochemical parameters in the laboratory, and applying intensive statistical analysis) is time-consuming and laborious, which makes it an inefficient way to monitor and manage groundwater quality. Importantly, these procedures make it difficult to determine the temporal and spatial variations in water quality expressed by the indices, which play an essential role in the evaluation and management of groundwater quality for irrigation purposes [21].
To overcome the limitations of IWQIs, field-measured spectroscopy has been shown to be a practical tool for the integrative evaluation and monitoring of the temporal and spatial variations of several water quality variables on a large scale in a quick and cost-effective manner. The advantages of this tool include a large number of spectral reflectance bands and information that is significantly influenced by changes in the concentrations and types of physiochemical parameters, such as the content of chlorophyll, total suspended solids (TSS), TH, total alkalinity (TA), total dissolved solids (TDS), salinity, nitrogen and phosphorus, dissolved organic carbon, chemical oxygen demand (COD), and biological oxygen demand (BOD) [22][23][24][25][26][27][28][29][30]. For instance, a close relationship has been found between the total phosphorus in water and spectral reflectance wavebands in the blue (450-510 nm) and green (500-600 nm) regions [31]. Gitelson et al. [32] reported that spectral reflectance from water surfaces in the 700-900 nm range is sensitive to changes in the TSS concentration. Liu et al. [33] reported that spectral reflectance in the range of 560 to 780 nm was adequate for detecting suspended particulate matter in water. Xing et al. [28] also noted that the curves relating spectral reflectance wavebands to water quality variables exhibited the maximum spectral response at intervals of 990-999 nm for NH 3 -N, 760-770 nm for BOD, and 460-475 nm for COD, TH, TA, and TDS. However, because spectral reflectance measurements generate a large volume of data, analyzing them with an appropriate statistical model remains a critical step toward finding the optimal relationship between spectral data and IWQIs.
Recently, new modeling frameworks, such as data-driven modeling (DDM) or data mining, have seen significantly increased use in the creation and application of computational intelligence and computer resources for water-related issues [34,35]. These methodologies constitute an approach for estimating water quality parameters obtained from field datasets and mapping the relationship between water quality variables according to spatial and temporal variability [36]. DDM focuses on the analysis of the data that characterize the system being studied; in particular, a model is built by identifying links between the system's parameters (input, internal, and output variables) without specific knowledge of the physical behavior [37]. DDM approaches are commonly categorized into statistical and artificial intelligence models; the latter include neural networks, fuzzy systems, and evolutionary computing, as well as other artificial intelligence and machine learning methods such as support vector machine regression (SVMR) [38]. SVMR uses a kernel function to map input variables into a high-dimensional feature space and can therefore handle high-dimensional input vectors [39]. Thus, it can provide a more rational solution than linear methodologies [40]. The development and current progress in the integration of various artificial intelligence techniques (knowledge-based systems, genetic algorithms, artificial neural networks, and fuzzy inference systems (FISs)) in the modeling of water quality, sediment transport, and dissolved oxygen concentration have been studied by several researchers [38,41,42,43].
Water 2021, 13, 35
Thus far, the prediction of IWQIs using adaptive neuro-fuzzy inference system (ANFIS) and SVMR models based on spectral reflectance bands has not been addressed. Only one study that we are aware of has focused on predicting the drinking WQI of surface water using the SVMR based on spectral reflectance indices (SRIs) [44]. Additionally, Xing et al. [28] used partial least-squares regression (PLSR) and an extreme learning machine to predict the TH in sewage water using the entire range of spectrum wavebands (350-2500 nm).
The primary objectives of this study are to (i) incorporate the different physiochemical parameters into a multivariate analysis to better understand and interpret the major contributing parameters that influence the groundwater quality of two distinct regions; (ii) evaluate the suitability of the groundwater in both regions for irrigation purposes based on various IWQIs; (iii) determine the optimal spectral wavelength intervals corresponding to the different IWQIs; and (iv) evaluate the ANFIS and SVMR models based on the spectral reflectance bands to predict and simulate the IWQIs of groundwater. We not only evaluate the suitability of groundwater for irrigation purposes using multivariate analysis and a wide variety of IWQIs but also develop a new algorithm that can simulate and predict the quality of water resources by means of field-measured spectroscopy.

Study Area
The Quaternary aquifer represents the main water-bearing formation for the two regions that make up the study area. The first of these (29°0′-29°30′ N and 30°20′-31°10′ E) is located in the El Fayoum depression in the Western Desert (WD) of Egypt and has an area of approximately 1200 km². The second region (30°30′-30°55′ N and 30°40′-31°00′ E) is located in the Central Nile Delta (CND) and has an area of approximately half that of the first region.
The Quaternary deposits in the WD region are composed of varying grain sizes, ranging from sandy clay, sometimes gravelly, in the west to silty clay sediments in the east. The Quaternary aquifer in the CND region is generally composed of sand with a rich content of clay. The depth to the groundwater surface varies from a few centimeters to 9.1 m in the WD region and from 25 to 30 m in the CND region [29].

Sample Collection, Ionic Analysis, and Irrigation Groundwater Quality Evaluation
A total of 40 groundwater samples, 15 from the first region and 25 from the second, were collected during the summer of 2019 (Figure 1). The samples were kept at 4 °C and delivered to the laboratory on the day they were collected for physiochemical analysis. The temperature, electrical conductivity (EC), hydrogen ion activity (pH), and total dissolved solids (TDS) were measured in situ for each sample using a portable calibrated conductivity multi-parameter instrument (Hanna HI 9033). The different ionic concentrations, which are the basis for calculating the different variables related to irrigation groundwater quality, were analyzed according to the Standard Methods for the Examination of Water and Wastewater [45]. The total concentrations of Mg 2+ , Ca 2+ , and Cl − were estimated via titration, using ethylenediaminetetraacetic acid for the first two ions and silver nitrate for the last ion. The total concentrations of Na + and K + were estimated using a flame photometer (ELEX 6361, Eppendorf AG, Hamburg, Germany). The concentrations of HCO 3 − and CO 3 2− were determined using the titrimetric method, whereas the content of SO 4 2− was determined with an ultraviolet-visible spectrophotometer (DR/2040, Loveland, CO, USA). Before the ionic analysis, all of the samples were filtered through 0.45 µm pore-size filters to collect solutions containing the dissolved elements only. All of the ionic concentrations are expressed in milligrams per liter.

Evaluation of the Quality of Groundwater for Irrigation Purposes
To ascertain the suitability of groundwater for irrigation purposes, quality variables such as the WQI, RSBC, RSC, TH, PS, and MH, which cause various problems for soil-water-plant relationships, were calculated for the groundwater samples of the two regions. Table 1 lists the formulas and references of these variables.
The WQI was first proposed by Horton [47]. This index has the capacity to transform large quantities of water parameter data into a single value, which helps to interpret the impacts of several water parameters on overall water quality levels [4,7]. In this capacity, the index is a significant marker for the evaluation and management of groundwater for irrigation purposes. In this study, the WQI was calculated using the following equation [48]:
$$\mathrm{WQI} = \sum_{i=1}^{n} Q_i W_i \tag{1}$$
The WQI was calculated in two steps. In the first step, the dominant physiochemical parameters that play a vital role in the estimation of water quality for irrigation purposes were identified, which in this study are the sodium adsorption ratio (SAR), EC, Na + , Cl − , and HCO 3 − . In the second step, the aggregation weights ($W_i$) and the water quality measurement parameter values ($Q_i$) were determined for each physiochemical parameter according to the criteria proposed by Ayers and Westcot [48].
The values of $Q_i$ were calculated using the following equation:
$$Q_i = Q_{\max} - \frac{(X_{ij} - X_{\inf})\, Q_{iamp}}{X_{amp}} \tag{2}$$
where $Q_{\max}$ is the maximum value of $Q_i$ for each class; $X_{ij}$ is the observed value of each physiochemical parameter; $X_{\inf}$ is the lower limit value of the class to which the parameter belongs; $Q_{iamp}$ is the class amplitude; and $X_{amp}$ is the amplitude of the class to which the parameter belongs. Finally, the values of $W_i$ were normalized so that their sum equals one, according to the following equation:
$$W_i = \frac{\sum_{j=1}^{k} F_j A_{ij}}{\sum_{j=1}^{k} \sum_{i=1}^{n} F_j A_{ij}} \tag{3}$$
where $W_i$ is the relative weight of the physiochemical parameter in the WQI; $F$ is a constant value of component 1; $A_{ij}$ is the extent to which parameter $i$ can be explained by factor $j$; $i$ is the number of physiochemical parameters chosen for the WQI, varying from 1 to $n$; and $j$ is the number of factors selected for the WQI, ranging from 1 to $k$. According to Equation (3), the normalized weights of the SAR, EC, Na + , Cl − , and HCO 3 − are 0.189, 0.211, 0.204, 0.194, and 0.202, respectively.
The MH plays a pivotal role in determining the suitability of water for irrigation purposes, because high concentrations of Mg 2+ , as well as Ca 2+ , in water can increase the soil pH, decrease the availability of phosphorus and the soil quality, and ultimately cause a significant decline in crop production [11,49]. An MH value of <50 is recommended for irrigation purposes, whereas one >50 is considered unsuitable for irrigation.
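The weighted aggregation described above can be illustrated with a minimal Python sketch. It assumes that the WQI is the weighted sum of the quality ratings and that each rating follows the linear class interpolation defined by the variables listed in the text; the function names and the class limits in the usage note are hypothetical and are not taken from the study. Only the five normalized weights come from the text.

```python
# Normalized weights reported in the text for the five WQI parameters.
WEIGHTS = {"SAR": 0.189, "EC": 0.211, "Na": 0.204, "Cl": 0.194, "HCO3": 0.202}


def quality_rating(x, x_inf, x_amp, q_max, q_amp):
    """Linear interpolation of a quality rating within one class (assumed form).

    x     : observed value of the parameter
    x_inf : lower limit of the class that x falls in
    x_amp : width (amplitude) of that class on the parameter axis
    q_max : highest rating assigned to that class
    q_amp : width (amplitude) of that class on the rating axis
    """
    return q_max - ((x - x_inf) * q_amp) / x_amp


def wqi(ratings, weights=WEIGHTS):
    """Weighted sum of the quality ratings, one rating per parameter."""
    return sum(ratings[name] * weights[name] for name in weights)
```

For example, with a hypothetical class spanning 2 to 3 on the parameter axis and 60 to 85 on the rating axis, `quality_rating(2.5, x_inf=2.0, x_amp=1.0, q_max=85.0, q_amp=25.0)` gives 72.5, halfway down the class.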
Generally, irrigation water with high salinity causes changes in soil structure and permeability, toxicity for plants, and reduced water availability. Thus, the PS is another key variable for irrigation water quality guidelines. It is defined as the Cl − concentration plus half of the SO 4 2− concentration [13]. Usually, irrigation water with PS values lower than three is considered fit for irrigation purposes.
The TH of water is the result of the presence of Ca 2+ and Mg 2+ in it. Although both ions are essential for plant growth and certain concentrations of them in irrigation water are beneficial, very high TH values in irrigation water can reduce irrigation efficiency because of the precipitation of both ions throughout the irrigation system. Additionally, very low values of TH can induce corrosion in the system. Hence, the TH variable is also considered an indicator of irrigation water quality. Groundwater with a TH <75 mg L −1 is usually classified as soft, whereas >300 mg L −1 is considered very hard [2].
For irrigation purposes, RSC (Na 2 CO 3 ) and RSBC (NaHCO 3 ) are usually used to describe the relationship between the CO 3 2− and HCO 3 − concentrations and those of Ca 2+ and Mg 2+ . If the CO 3 2− and HCO 3 − concentrations in the irrigation water are less than the Ca 2+ and Mg 2+ levels, the values of the RSC and RSBC become negative. This indicates little risk of Na + accumulation in the soil because the Ca 2+ and Mg 2+ levels offset it; the opposite holds true for positive values.
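As a rough illustration of how the ion-based indices discussed above can be computed from a single sample, the following Python sketch uses formulas that are common in the irrigation literature. Because Table 1 is not reproduced here, the exact formulas, the equivalent weights, and the TH coefficients are assumptions and may differ from those used in the study; only the definition of PS (Cl − plus half of SO 4 2− ) is stated in the text.

```python
# Approximate equivalent weights (mg per meq); assumed values for illustration.
EQ_WEIGHT = {"Ca": 20.04, "Mg": 12.15, "Cl": 35.45,
             "SO4": 48.03, "HCO3": 61.02, "CO3": 30.00}


def to_meq(sample_mg_l):
    """Convert ion concentrations from mg/L to meq/L."""
    return {ion: sample_mg_l[ion] / EQ_WEIGHT[ion] for ion in EQ_WEIGHT}


def irrigation_indices(sample_mg_l):
    """Return MH (%), PS, RSC, RSBC (meq/L) and TH (mg/L as CaCO3)."""
    m = to_meq(sample_mg_l)
    return {
        # Magnesium hazard: share of Mg in (Ca + Mg), assumed formula.
        "MH": 100.0 * m["Mg"] / (m["Ca"] + m["Mg"]),
        # Potential salinity: Cl plus half of SO4, as defined in the text.
        "PS": m["Cl"] + 0.5 * m["SO4"],
        # Residual sodium carbonate, assumed formula.
        "RSC": (m["CO3"] + m["HCO3"]) - (m["Ca"] + m["Mg"]),
        # Residual sodium bicarbonate, assumed formula.
        "RSBC": m["HCO3"] - m["Ca"],
        # Total hardness from mg/L concentrations, assumed coefficients.
        "TH": 2.497 * sample_mg_l["Ca"] + 4.115 * sample_mg_l["Mg"],
    }
```

A sample is passed as a dictionary of concentrations in mg/L, for example `irrigation_indices({"Ca": 40.08, "Mg": 24.30, "Cl": 35.45, "SO4": 96.06, "HCO3": 122.04, "CO3": 0.0})`. A negative RSC in the result corresponds to the low-risk case described above.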

Measurements of Spectral Reflectance
For this study, the groundwater samples were placed in black cylindrical cups with a diameter of 25 cm and a depth of 10 cm. The spectral reflectance data were collected from the water surface of each sample using a passive reflectance sensor (tec5 AG, Oberursel, Germany) with a spectral range of 302-1148 nm, a sampling interval of 2 nm, and a probe field angle of 12°. The spectral readings were collected under sunny and windless conditions, and the probe was held vertically at ≈25 cm above the water surface, in the nadir orientation, to cover a sufficient area of the water surface. A polytetrafluoroethylene white Spectralon reflectance panel, which provides near 100% reflectance, was used to calibrate the spectral reflectance data before the measurements were taken and when needed during the measurement process. The spectral reflectance of each groundwater sample was measured twice, with each measurement representing an average of 10 scans. Finally, the average of the two measurements was recorded as the measured spectrum for a groundwater sample.

Multivariate Statistical Analysis
Multivariate statistical analyses, namely FA and PCA, were used to evaluate the quality of groundwater for irrigation purposes based on 11 physiochemical parameters. Both methods were applied to reduce the data without losing important information contained in the original data [50]. Although the two methods are related, they are not identical. The main objective of FA is to reduce a large set of related physiochemical parameters to a small number of clusters, called factors, to enable a better interpretation of the original data. Each factor is strongly correlated with the original physiochemical parameters it contains [15,51,52]. The contribution of a factor is considered significant when its eigenvalue is >1, in which case it explains a higher percentage of the total variability of the original data. The correlation coefficient (r) between the physiochemical parameters and the selected factors is used to determine the factor loading. The factor loadings are considered weak, moderate, and strong when their values are in the ranges of 0.30-0.50, 0.50-0.75, and 0.75-1.00, respectively [53].
To provide a comprehensive picture of the interrelationships between all physiochemical parameters, a PCA was applied to more easily visualize this interrelationship. The PCA allowed the original data of the physiochemical parameters to be combined within two-dimensional principal components (PCs) and thus provide a straightforward visual representation of what the original data of physiochemical parameters would resemble, rather than appearing as a large mass of numbers [50]. The first two PCs should explain the highest percentage of the total variability of the original data of the parameters (at least 70% of the total variability) [54].
Before running the FA and PCA, the Kaiser-Meyer-Olkin (KMO) test and Bartlett's sphericity test were conducted to verify the suitability of the data and whether they followed a Gaussian distribution. The KMO test was used to compare the simple and partial r between the parameters. Bartlett's sphericity test was used to verify whether the parameters were independent through a test of the correlation between them within the matrix [55]. Finally, if the value of the KMO test was >0.5 and both tests confirmed that the data followed a Gaussian distribution, the FA and PCA were deemed applicable [56]. The FA and PCA were conducted using the XLSTAT statistical package (version 2019.4, Addinsoft, Boston, MA, USA).

ANFIS and SVMR Models
Figure 2 displays a schematic diagram of the methodology used in this study for estimating the groundwater quality variables. Both the ANFIS and SVMR models were integrated with remote sensing data across the entire range of the spectrum (302-1148 nm) to predict the different IWQIs (WQI, RSBC, TH, PS, RSC, and MH). First, the correlation between the spectral reflectance bands and the desired output was examined to select the spectral reflectance bands that correlate significantly with the output groundwater quality variables. Then, the ANFIS model was developed along with the SVMR model. Finally, the performance of the two models was evaluated using several evaluation criteria, namely, the coefficient of determination (R 2 ), the Nash-Sutcliffe coefficient (E), the root-mean-square error (RMSE), and the mean absolute deviation (MAD).
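The band-selection step, in which reflectance bands are screened by their correlation with the target index, might look like the following Python sketch. It is only an illustration under stated assumptions: the use of the Pearson coefficient, the ranking by absolute value, and the names `pearson_r`, `rank_bands`, and `top_k` are hypothetical and do not describe the authors' actual procedure.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no correlation information
    return sxy / sqrt(sxx * syy)


def rank_bands(spectra, index_values, top_k=3):
    """Rank wavebands by |r| against one water quality index.

    spectra      : dict mapping wavelength (nm) to per-sample reflectance
    index_values : per-sample values of the index, in the same sample order
    """
    scored = [(wl, pearson_r(refl, index_values)) for wl, refl in spectra.items()]
    scored.sort(key=lambda item: abs(item[1]), reverse=True)
    return scored[:top_k]
```

Ranking by the absolute value keeps strongly negative bands as candidates as well, since a band whose reflectance falls as the index rises is just as informative to a regression model.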

Figure 2.
Schematic diagram of the methodology of support vector machine regression (SVMR) and adaptive neuro-fuzzy inference system (ANFIS) used in this study for estimating the groundwater quality variables. WQI, RSBC, TH, PS, RSC, and MR indicate water quality index, residual sodium bicarbonate, total hardness, potential salinity, residual sodium carbonate, and magnesium hazard, respectively.

ANFIS Model
The ANFIS model is an FIS that is formulated as a feed-forward neural network. Thus, the advantages of a fuzzy system can be combined with learning algorithms [57][58][59]. Using the input and output data, the ANFIS constructs an FIS and typically tunes its membership function parameters on the basis of a backpropagation algorithm [60]; consequently, the FIS can extract knowledge from the data used in training. Figure 3 represents the typical architecture of the ANFIS, which has a multilayered feed-forward structure and is connected to a fuzzy network for the x and y inputs. It is important to note that an FIS is primarily made up of five functional blocks.

The Sugeno model has a rule base with the following form:
By letting $a = A_1$ and $b = B_1$, we have
$$f_1 = p_1 a + q_1 b + r_1 \tag{4}$$
By letting $a = A_2$ and $b = B_2$, we have
$$f_2 = p_2 a + q_2 b + r_2$$
In this case, $f_i$ is the output within the fuzzy region specified by the fuzzy rule; $A_i$ and $B_i$ are the fuzzy sets; $a$ and $b$ are the inputs; and $p_i$, $q_i$, and $r_i$ are the consequent parameters, which are updated in the forward pass of the learning algorithm.
If the membership functions of the fuzzy sets $A_i$ and $B_i$ are $\mu_{A_i}$ and $\mu_{B_i}$, respectively, the five layers that make up the ANFIS are formulated as discussed below.
Layer 1: If the output of the $i$th node in layer one is represented by $O_{1,i}$, then each node $i$ in this layer is adaptive, with the node function
$$O_{1,i} = \mu_{A_i}(a), \quad i = 1, 2; \qquad O_{1,i} = \mu_{B_{i-2}}(b), \quad i = 3, 4$$
with $a$ and $b$ being the inputs to the $i$th node, and $A_i$ (or $B_{i-2}$) being the fuzzy set associated with this node function.
Layer 2: The nodes in this layer are labeled and multiply the signals that enter them. Each node's output denotes a rule's firing strength:
$$O_{2,i} = w_i = \mu_{A_i}(a)\,\mu_{B_i}(b), \quad i = 1, 2$$
Layer 3: The nodes in this layer, labeled N, scale the firing strengths so that they are normalized:
$$O_{3,i} = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2$$
Layer 4: The output of this layer is a linear combination of the inputs multiplied by the normalized firing strength. The nodes of this layer are adaptive, with the node function
$$O_{4,i} = \bar{w}_i f_i = \bar{w}_i \left(p_i a + q_i b + r_i\right)$$
where $\bar{w}_i$ is the layer three output and $(p_i, q_i, r_i)$ is the parameter set. The parameters of Layer 4 are termed the consequent parameters.
Layer 5: This layer has a single node that calculates the overall output as the sum of all incoming signals:
$$O_{5} = \sum_{i} \bar{w}_i f_i = \frac{\sum_{i} w_i f_i}{\sum_{i} w_i}$$
Each layer drawn as a square is adaptive, and its values are typically changed while the system is trained. Conversely, the layers drawn as circles remain fixed before, during, and after training. Figure 3 represents the network used in this study, which comprises the selected bands as inputs and one output membership function for the desired index (the different irrigation water quality variables).
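The five-layer computation described above can be traced with a short forward-pass sketch in Python for two inputs and two rules. It assumes a generalized bell membership function of the form 1 / (1 + |(x − c)/a|^(2b)), which is one common choice for "bell" functions; the function names and all parameter values in the usage note are hypothetical, and no training step is shown.

```python
def gbell(x, a, b, c):
    """Generalized bell membership function (assumed form)."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


def anfis_forward(a_in, b_in, mf_a, mf_b, consequents):
    """Forward pass of a first-order Sugeno ANFIS.

    mf_a, mf_b  : one (a, b, c) bell parameter tuple per rule, for each input
    consequents : one (p, q, r) tuple per rule
    """
    # Layer 1: membership degrees of each input in each rule's fuzzy set.
    mu_a = [gbell(a_in, *params) for params in mf_a]
    mu_b = [gbell(b_in, *params) for params in mf_b]
    # Layer 2: firing strength of each rule (product of memberships).
    w = [ma * mb for ma, mb in zip(mu_a, mu_b)]
    # Layer 3: normalized firing strengths.
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: normalized strength times the linear consequent of each rule.
    rule_out = [wb * (p * a_in + q * b_in + r)
                for wb, (p, q, r) in zip(w_bar, consequents)]
    # Layer 5: overall output is the sum of the rule outputs.
    return sum(rule_out)
```

For instance, with bell functions centred at 0 and 1 for both inputs and both inputs set to 0.5, the two rules fire equally, so the output is the plain average of the two rule consequents.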
In this study, ANFIS was used to extract the relationship between the spectral reflectance data and the groundwater quality variables and to express it as fuzzy "if-then" rules. The premise part of the fuzzy if-then rules includes the spectral reflectance data. The consequent part comprises the WQI, RSBC, TH, PS, RSC, and MH. The developed ANFIS model comprised a Sugeno-type fuzzy system with a linear output membership function and a bell-shaped input membership function. The ANFIS simulation model contains two phases. In the first phase, the operation rules are created using a fuzzy approach, and the resulting FIS is then applied as an input to the ANFIS system (Figure 4). In the second phase, the irrigation water quality variables are simulated using the final FIS created by the ANFIS.

SVMR Model
SVMR is a general-purpose machine learning approach that originated in pattern classification and recognition. Support vector models can solve both classification and regression problems: they map low-dimensional nonlinear inputs into a high-dimensional feature space in which the relationship can be treated as linear, often with excellent results. In this study, the SVMR model used the spectral reflectance bands as input data and the different groundwater quality variables as output data to establish calibration equations, and cross-validation was used to minimize over-fitting.
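The cross-validation step can be sketched as a k-fold index splitter. The number of folds (5), the seed, and the function name are assumptions made for illustration; the text does not state which cross-validation scheme was used, and the fitting of the regression model itself is left out.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(
    n_samples: int, k: int, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]  # k nearly equal folds
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# Example: 40 pooled samples, 5 folds (fold count is an assumption)
splits = list(k_fold_indices(40, 5))
for train, validation in splits:
    print(len(train), len(validation))
```

Each sample is used for validation exactly once, so the averaged validation error gives a less optimistic estimate of model performance than the calibration error alone.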

Performance Evaluation of the Developed Models
Different goodness-of-fit measures were used to evaluate the ability of the ANFIS and SVMR models to predict the groundwater quality variables. These comprised the coefficient of determination (R 2 ), the Nash-Sutcliffe efficiency coefficient (E), the root mean square error (RMSE), and the mean absolute deviation (MAD). A two-sample t-test of the means was conducted to determine whether the simulated and experimental means differed significantly.
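The Nash-Sutcliffe efficiency [61] was computed as follows:

$$E = 1 - \frac{\sum_{i=1}^{n}\left(WQI_{o,i} - WQI_{f,i}\right)^{2}}{\sum_{i=1}^{n}\left(WQI_{o,i} - \overline{WQI_{o}}\right)^{2}} \tag{11}$$

where $WQI_{o}$ is the observed value, $WQI_{f}$ is the predicted value, $\overline{WQI_{o}}$ is the average of the observed values of the considered parameter, and $n$ is the number of data points. The MAD evaluates the mean magnitude of the errors across a set of simulations, with no consideration of their direction. It also gauges the accuracy for continuous variables and is computed as follows:

$$MAD = \frac{1}{n}\sum_{i=1}^{n}\left|WQI_{o,i} - WQI_{f,i}\right| \tag{12}$$

The absolute variance fraction, $R^{2}$, is calculated as follows:

$$R^{2} = 1 - \frac{\sum_{i=1}^{n}\left(WQI_{o,i} - WQI_{f,i}\right)^{2}}{\sum_{i=1}^{n} WQI_{o,i}^{2}} \tag{13}$$

with the variables as defined above. The RMSE expresses the absolute fit of the model to the data points, that is, how close the experimental data are to the model's predicted values. Smaller RMSE values indicate a better fit, whereas $R^{2}$ is a relative measure of the fit. The RMSE is determined with the following formula:

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(WQI_{o,i} - WQI_{f,i}\right)^{2}} \tag{14}$$

The correlation coefficient is a statistical measure of how closely the trends in the simulated values follow the trends in the experimental (actual) values. It is calculated as follows:

$$r = \frac{\sum_{i=1}^{n}\left(WQI_{o,i} - \overline{WQI_{o}}\right)\left(WQI_{f,i} - \overline{WQI_{f}}\right)}{\sqrt{\sum_{i=1}^{n}\left(WQI_{o,i} - \overline{WQI_{o}}\right)^{2}\,\sum_{i=1}^{n}\left(WQI_{f,i} - \overline{WQI_{f}}\right)^{2}}} \tag{15}$$

where $\overline{WQI_{f}}$ is the average of the predicted values.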
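The goodness-of-fit measures described above can be sketched in a few lines. The observed and predicted values are hypothetical, and the form used for R 2 (one minus the squared-error sum divided by the sum of squared observations) is an assumption based on the name "absolute variance fraction".

```python
import math
from typing import Sequence


def nash_sutcliffe(obs: Sequence[float], pred: Sequence[float]) -> float:
    """E = 1 - sum((o - f)^2) / sum((o - mean(o))^2)."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - f) ** 2 for o, f in zip(obs, pred))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def mad(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Mean absolute deviation between observed and predicted values."""
    return sum(abs(o - f) for o, f in zip(obs, pred)) / len(obs)


def rmse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((o - f) ** 2 for o, f in zip(obs, pred)) / len(obs))


def abs_variance_fraction(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Assumed form: R2 = 1 - sum((o - f)^2) / sum(o^2)."""
    num = sum((o - f) ** 2 for o, f in zip(obs, pred))
    return 1.0 - num / sum(o ** 2 for o in obs)


def pearson_r(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Pearson correlation coefficient between observed and predicted values."""
    mo, mf = sum(obs) / len(obs), sum(pred) / len(pred)
    cov = sum((o - mo) * (f - mf) for o, f in zip(obs, pred))
    so = sum((o - mo) ** 2 for o in obs)
    sf = sum((f - mf) ** 2 for f in pred)
    return cov / math.sqrt(so * sf)


# Hypothetical WQI values: every error has magnitude 2
observed = [60.0, 70.0, 80.0, 90.0]
predicted = [62.0, 68.0, 82.0, 88.0]
print(nash_sutcliffe(observed, predicted), mad(observed, predicted),
      rmse(observed, predicted), pearson_r(observed, predicted))
```

In this example all errors have the same magnitude, so the RMSE equals the MAD; the RMSE exceeds the MAD as soon as the error magnitudes differ. Predicting the observed mean for every sample gives E = 0, the threshold below which a model performs worse than the mean.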

Interpretation of Groundwater Quality through Physiochemical Parameters Using a Multivariate Analysis
The integration of multivariate statistical approaches, such as FA and PCA, with a large set of physiochemical parameters has been widely applied as an effective strategy for obtaining meaningful information on water quality and for sustainably managing water resources [17,19,20,62-65]. For instance, FA is a powerful tool for reducing the complexity of several physiochemical parameters without losing a great deal of information from the original measurements. This reduction greatly aids the interpretation of the original measurements and reduces the cost of laboratory measurements [66]. Generally, this tool is applied to the physiochemical parameters to create new factors, and the resulting factor loadings express the strength of the relationship between each factor and the original parameters. The factor loadings are classified as strong, moderate, and weak if their absolute values are greater than 0.75, between 0.75 and 0.50, and between 0.50 and 0.30, respectively [67,68]. In this study, the FA reduced the 11 physiochemical parameters to two main factors (eigenvalue > 1) in both regions. Together, these two factors explained 73.6% and 57.7% of the total variability of the physiochemical parameters in the WD and CND regions, respectively ( Table 2). In the WD region, the first factor accounted for 47.9% of the total variability and had strong loadings for TDS, K + , Na + , Cl − , and SO 4 2− and moderate loadings for Mg 2+ and Ca 2+ . By contrast, in the CND region it accounted for 40.8% of the total variability and had strong loadings for TDS, Ca 2+ , SO 4 2− , and HCO 3 − , moderate loadings for Mg 2+ , and weak loadings for the temperature, K + , and Na + ( Table 2). The second factor accounted for 25.7% and 16.9% of the total variability in the WD and CND regions, respectively, and had strong loadings for NO 3 − and weak ones for the pH in both regions. This factor also had strong loadings for HCO 3 − in the WD region and for Cl − in the CND region ( Table 2).
Additionally, the PCA was applied to the 11 physiochemical parameters to confirm the results of the FA. The primary objective of the PCA is to group the physiochemical parameters into major components: parameters whose vectors lie adjacent or parallel to each other, with a small angle between them, are strongly associated, whereas diverging vectors indicate weak relationships. Based on the angles between the vectors of the parameters in the PCA biplot, the 11 physiochemical parameters were divided into three and four groups in the WD and CND regions, respectively ( Figure 5). The first group included K + , Ca 2+ , Mg 2+ , Na + , Cl − , SO 4 2− , and TDS in the WD region, whereas in the CND region it included Ca 2+ , Mg 2+ , SO 4 2− , HCO 3 − , and TDS. The second group, which only appeared in the CND region, comprised K + , Na + , and Cl − . The third group contained NO 3 − together with HCO 3 − in the WD region and with pH in the CND region. The fourth group contained pH and temperature in the WD region but only temperature in the CND region ( Figure 5).
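The loading classes and the biplot angle criterion described above can be sketched as follows. The label "negligible" for loadings below 0.30 and the example vectors are assumptions for illustration; they are not taken from Table 2 or Figure 5.

```python
import math
from typing import Sequence


def classify_loading(loading: float) -> str:
    """Classify a factor loading by its absolute value (>0.75, 0.75-0.50, 0.50-0.30)."""
    magnitude = abs(loading)
    if magnitude > 0.75:
        return "strong"
    if magnitude >= 0.50:
        return "moderate"
    if magnitude >= 0.30:
        return "weak"
    return "negligible"  # assumed label for loadings below 0.30


def angle_between(u: Sequence[float], v: Sequence[float]) -> float:
    """Angle in degrees between two biplot vectors; small angles mean strong association."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # guard against rounding
    return math.degrees(math.acos(cosine))


# Hypothetical loadings on (component 1, component 2)
tds_vector = (0.95, 0.10)
na_vector = (0.90, 0.20)
ph_vector = (0.05, 0.40)
print(classify_loading(0.77), angle_between(tds_vector, na_vector),
      angle_between(tds_vector, ph_vector))
```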

Based on the final results of the FA and PCA, the following conclusions can be reached: (1) the TDS, which is an indicator of the salinity level and reflects the total concentration of dissolved cations and anions in water, was closely related to the salt ions (principally Na + , Cl − , and SO 4 2− ), as well as to K + , Ca 2+ , and Mg 2+ in the WD region, whereas it was associated with Ca 2+ , Mg 2+ , SO 4 2− , and HCO 3 − in the CND region ( Figure 5); the salt ions exhibited stronger loadings for the first factor in the WD region than in the CND region, and the opposite was true for Ca 2+ and Mg 2+ ( Table 2); (2) the two cations K + and Na + , whose ratio plays a distinctive role in groundwater quality, particularly for irrigation, had strong and weak loadings for the first factor in the WD and CND regions, respectively, with a very small angle between the vectors of the two parameters in both regions (Table 2 and Figure 5); (3) the NO 3 − had strong loadings (>0.77) for the second factor in both regions ( Table 2) and was plotted in a separate group with HCO 3 − in the WD region and with pH in the CND region ( Figure 5), which indicates that both point sources, such as agricultural fertilizers, domestic waste, and anthropogenic activities, and nonpoint sources, such as agricultural runoff, soil erosion, and atmospheric deposition due to natural weathering, could be the source of NO 3 − in both regions; and (4) the pH in both regions and the temperature in the CND region were less important in accounting for groundwater quality, as they showed weak loadings for both the first and second factors ( Table 2) and a large angle between their vectors and those of the other parameters ( Figure 5).
All of these findings indicate that the PCA largely confirmed the FA results. In addition, the combination of both methods was advantageous for examining and interpreting the behavior of groundwater quality in both regions, as well as for identifying the variables that may affect groundwater quality, because it clarifies the relationship between the physiochemical parameters and the factors or components of both analyses. The final results of both methods indicate that the groundwater of the WD is less suitable for irrigation, because the first factor of the FA exhibited strong loadings for TDS and the salt ions (principally Na + , Cl − , and SO 4 2− ) and moderate loadings for Ca 2+ and Mg 2+ (Table 2), and because the salt ions, which make a major contribution to the TDS of the water, showed a small angle between their vectors and that of the TDS ( Figure 5). The strong loadings of the salt ions in the groundwater of the WD may be attributed to recharge from the underlying fractured limestone of the Eocene aquifer through hydraulic connection, as well as to agricultural runoff into the quaternary aquifer system, which is considered the most important aquifer in the WD region [69,70]. The moderate loadings of Ca 2+ and Mg 2+ , compared with those of Na + and Cl − , also result from intense mineralization through rock-water interactions such as reverse ion exchange, leaching, dissolution, and precipitation [66,71]. Additionally, the excessive use of chemical fertilizers and pesticides and the over-pumping of groundwater for irrigation, especially during summer, may increase the salinity hazard in the groundwater of the WD [72]. Therefore, this water can be used for irrigation, but only with proper water management, such as the provision of a good drainage system and the calculation of the leaching requirements to control salt accumulation in the soil.
The low or moderate loadings of the salt ions in the groundwater of the CND region may be attributed to the Nile River being the main source of recharge in the area via seepage from its two branches [17,73].

Validity of Groundwater for Irrigation Purposes
The evaluation of groundwater quality for irrigation purposes plays a fundamental role in the sustainability of soil health and crop productivity, especially in arid and semiarid conditions. Several IWQIs describe the suitability of a water source for irrigation. Based on these indices, irrigation water quality is divided into classes that reflect the risk of salinity problems, ion toxicity to plants, and soil permeability problems and degradation [9,12,17].
The WQI has been widely used for many decades to evaluate the suitability of groundwater for different uses. The primary idea of this index is to integrate several physiochemical parameters into a single value so that groundwater quality can be interpreted quickly and easily [4,8,74,75]. Importantly, this index helps to evaluate the key indicators of groundwater quality in a comprehensive manner by emphasizing the physiochemical parameters that are most important for irrigation water quality and omitting the less important ones. According to Meireles et al. [4], groundwater with a WQI value ranging from 85 to 100 can be safely used to irrigate the majority of soils, with a low probability of causing sodicity and salinity problems and no toxicity risk for most field crops. Groundwater with a WQI ranging from 70 to 85 is recommended for irrigating soils with a light texture or moderate permeability, but should be avoided for salt-sensitive crops. When the WQI ranges from 55 to 70, the groundwater can be used to irrigate soils with moderate to high permeability and moderately salt-tolerant crops. However, groundwater with a WQI from 0 to 40 should not be used for irrigation under normal conditions, although it may be used occasionally in special cases. In this study, the groundwater samples of the CND exhibited higher WQI values than those of the WD (Table 3). Furthermore, most of the groundwater samples from the CND region (72%) were categorized as having no restriction, and the remaining samples (28%) were categorized as having a low restriction. By contrast, 47% and 33% of the groundwater samples from the WD region fell in the high restriction and severe restriction categories, respectively (Table 4).
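A minimal sketch of how WQI values could be assigned to restriction classes is given below. The class boundaries at 85, 70, 55, and 40 follow the ranges quoted above; the 40-55 "high restriction" band, the label "moderate restriction" for 55-70, and the treatment of boundary values are assumptions based on the commonly cited form of this scheme. The sample values are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence


def classify_wqi(wqi: float) -> str:
    """Map a WQI value (0-100) to an irrigation restriction class."""
    if not 0.0 <= wqi <= 100.0:
        raise ValueError("WQI is expected to lie between 0 and 100")
    if wqi >= 85.0:
        return "no restriction"
    if wqi >= 70.0:
        return "low restriction"
    if wqi >= 55.0:
        return "moderate restriction"  # assumed label
    if wqi >= 40.0:
        return "high restriction"  # assumed band (40-55)
    return "severe restriction"


def class_shares(values: Sequence[float]) -> Dict[str, float]:
    """Percentage of samples falling into each restriction class."""
    counts = Counter(classify_wqi(v) for v in values)
    return {label: 100.0 * n / len(values) for label, n in counts.items()}


# Hypothetical WQI values for five wells
sample_wqi = [91.0, 88.5, 72.0, 48.0, 35.0]
print(class_shares(sample_wqi))
```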
These results indicate that the groundwater of the CND region was fresher than that of the WD region and therefore more suitable for irrigation purposes and human consumption. By contrast, the groundwater of the WD region can be safely used only to irrigate soils that have coarse or medium textures, moderate permeability, and a good drainage system, with the water leaching requirements also calculated.
The RSC measures the relationship between the summed concentrations of CO 3 2− and HCO 3 − and the summed concentrations of Ca 2+ and Mg 2+ . When soil is irrigated with high-RSC water, soil degradation can result and crop productivity can be negatively affected owing to the precipitation of Ca 2+ and Mg 2+ and the resulting potential increase in the sodium hazard in the soil [9-11]. Groundwater with an RSC value greater than 2.5 meq L −1 is harmful and unsuitable for irrigation purposes, whereas that with an RSC value of <1.25 meq L −1 is considered safe. In this study, the RSC values of groundwater for the WD and CND regions ranged from −35.54 to 0.74 meq L −1 and from −6.34 to 0.29 meq L −1 , respectively (Table 3). Accordingly, all groundwater samples from both regions were classified as good water that is suitable for irrigation without limitations due to the RSC (Table 4). Additionally, all groundwater samples from both regions, except for one sample from each, had negative RSC values, which indicates that Ca 2+ and Mg 2+ exceeded CO 3 2− and HCO 3 − in the groundwater of both regions, without a potential build-up of Na + .
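The RSC calculation and the thresholds quoted above can be sketched as follows, with all concentrations in meq L −1. The label "marginal" for values between 1.25 and 2.5 and the sample concentrations are assumptions for illustration.

```python
def rsc(co3: float, hco3: float, ca: float, mg: float) -> float:
    """Residual sodium carbonate: (CO3 + HCO3) - (Ca + Mg), all in meq/L."""
    return (co3 + hco3) - (ca + mg)


def classify_rsc(value: float) -> str:
    """Classify RSC using the 1.25 and 2.5 meq/L thresholds."""
    if value < 1.25:
        return "safe"
    if value <= 2.5:
        return "marginal"  # assumed label for the intermediate range
    return "unsuitable"


# Hypothetical sample: Ca + Mg exceed CO3 + HCO3, so RSC is negative
sample_rsc = rsc(co3=0.0, hco3=3.0, ca=2.0, mg=1.5)
print(sample_rsc, classify_rsc(sample_rsc))
```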
An excess of HCO 3 − in irrigation water negatively affects soil permeability and drainage, as well as the plant uptake of nutrients. Therefore, RSBC is an important IWQI for evaluating the suitability of groundwater for irrigation purposes based on HCO 3 − concentrations [5]. The RSBC values of groundwater for the WD and CND regions ranged from −17.42 to 3.76 meq L −1 and from −3.7 to 1.28 meq L −1 , respectively (Table 3). Furthermore, 100% of the groundwater samples from both regions met the water quality requirement for irrigation purposes, as all RSBC values were <5 meq L −1 and thus fell within the satisfactory category (Table 4).
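As a worked illustration, RSBC is commonly defined as the difference between the bicarbonate and calcium concentrations, both in meq L −1. This definition and the sample concentrations below are assumptions for illustration and are not values from this study.

```latex
\begin{align*}
\mathrm{RSBC} &= [\mathrm{HCO_3^-}] - [\mathrm{Ca^{2+}}] \\
              &= 4.20 - 2.90 = 1.30\ \mathrm{meq\,L^{-1}} < 5\ \mathrm{meq\,L^{-1}}
\end{align*}
```

Under the <5 meq L −1 criterion stated above, such a sample would fall in the satisfactory category.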
A high concentration of Mg 2+ in irrigation water leads to soil alkalinity, causes soil infiltration problems, decreases the availability of phosphorus, and ultimately diminishes crop productivity [49,63,76]. Therefore, MH was proposed for evaluating groundwater for irrigation. Groundwater with MH < 50% is considered safe and suitable for irrigation purposes, whereas groundwater with MH > 50% is considered unsuitable [77]. The MH values of the groundwater samples from the WD and CND ranged from 28.47% to 61.36% and from 12.38% to 44.18%, with 60% and 100% of the samples being safe for irrigation, respectively (Tables 3 and 4).
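The MH criterion can be sketched as follows, assuming the widely used definition of MH as the percentage of Mg 2+ in the sum of Ca 2+ and Mg 2+ (both in meq L −1). The function names and the sample values are hypothetical.

```python
from typing import Sequence


def magnesium_hazard(ca: float, mg: float) -> float:
    """MH (%) = 100 * Mg / (Ca + Mg), with concentrations in meq/L (assumed definition)."""
    total = ca + mg
    if total <= 0.0:
        raise ValueError("Ca + Mg must be positive")
    return 100.0 * mg / total


def percent_safe(mh_values: Sequence[float], limit: float = 50.0) -> float:
    """Share of samples (%) whose MH is below the safety limit."""
    return 100.0 * sum(1 for v in mh_values if v < limit) / len(mh_values)


# Hypothetical MH values for five wells
mh_samples = [30.0, 45.0, 55.0, 61.0, 49.0]
print(magnesium_hazard(ca=3.0, mg=1.0), percent_safe(mh_samples))
```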
PS is another IWQI for the classification of groundwater sources for irrigation. Generally, groundwater is considered safe for irrigation if its PS value is lower than 3 meq L −1 [13]. Only 13% of groundwater samples from WD were classified as being excellent to good, whereas most of the samples (80%) were classified as injurious to unsatisfactory. By contrast, 84% of groundwater samples from the CND were classified as excellent to good, with 4% classified as injurious to unsatisfactory (Table 4). This indicates that the groundwater of the CND has high suitability for irrigation purposes, whereas that of the WD can be used for irrigation, but with a proper water management system (i.e., providing a good drainage system and calculating water leaching requirements) to control salt accumulation in the soil.
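As a worked illustration, PS is commonly computed as the chloride concentration plus half of the sulfate concentration, both in meq L −1. This definition and the sample concentrations below are assumptions for illustration and are not values from this study.

```latex
\begin{align*}
\mathrm{PS} &= [\mathrm{Cl^-}] + \tfrac{1}{2}\,[\mathrm{SO_4^{2-}}] \\
            &= 1.80 + \tfrac{1}{2}(1.60) = 2.60\ \mathrm{meq\,L^{-1}} < 3\ \mathrm{meq\,L^{-1}}
\end{align*}
```

Under the 3 meq L −1 criterion stated above, such a sample would be considered safe for irrigation.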
The suitability of groundwater for irrigation purposes can also be evaluated by TH, which is mostly caused by the presence of Ca 2+ and Mg 2+ [78,79]. Usually, Ca 2+ and Mg 2+ are found in groundwater in the form of CaCO 3 and CaMg(CO 3 ) 2 , respectively. Groundwater with a TH of more than 300 mg L −1 as CaCO 3 equivalent is problematic for the plumbing of irrigation systems [80]. The TH values of the groundwater from the WD and CND regions ranged, respectively, from 299.6 to 2213.8 mg L −1 and 299.6 to 2213.8 mg L −1 (Table 3). Furthermore, most groundwater samples from the WD region fell under the very hard category, whereas 52% of the groundwater samples from the CND region fell under the same category and 36% fell under the hard one. None of the groundwater samples from the WD region fell under the moderately hard category, whereas 12% of the samples from the CND region did (Table 4).

Hyperspectral Characteristics of the Groundwater
Figure 6 displays the spectral reflectance signatures of the different groundwater samples collected from the two regions between 302 and 1148 nm. The figure shows that there are obvious differences in the shape of the spectral reflectance curves between the samples. Overall, the spectral reflectance of the groundwater samples from the CND region was higher than that of the samples from the WD region. The figure also shows that the spectral reflectance curves of the two regions have three deep reflectance troughs around the 320-340, 740-780, and 950-1000 nm wavelength regions and three reflectance peaks around the 400-700, 780-820, and 1100-1148 nm wavelength regions ( Figure 6). Furthermore, the differences in spectral reflectance between the groundwater samples of both regions were particularly large in the 302-318, 350-700, 750-900, and 1074-1148 nm wavelength regions. These results indicate that the obvious differences in physiochemical characteristics between the groundwater samples influenced the hyperspectral reflectance curves in different wavelength regions. Therefore, these results further suggest that it is possible to evaluate the parameters related to IWQIs in terms of the spectral characteristics of the water surface in these wavelength regions.
The results displayed in Figure 7 further confirm this by testing the strength of the relationship between the original spectral reflectance across the entire wavelength range (302 to 1148 nm) and the six parameters related to the IWQIs, using the pooled data of the 40 groundwater samples. Generally, the WQI, RSC, and RSBC were positively correlated with the original spectral reflectance across the entire wavelength range, whereas the PS, TH, and MH displayed the opposite relationship ( Figure 7). Furthermore, the original spectral reflectance exhibited the highest correlation coefficients (r, in absolute value) with WQI (r = 0.24-0.84), followed, in sequence, by TH (r = 0.16-0.63), MH (r = 0.30-0.62), RSC (r = 0.15-0.60), and PS (r = 0.17-0.57). The RSBC was the least correlated with the original spectral reflectance (r = 0.09-0.43). Additionally, the highest correlations between the six IWQI parameters and the original spectral reflectance were located at the wavelength intervals of 302-318, 358-900, and 1074-1148 nm, where the correlation coefficients for the WQI, RSC, RSBC, PS, TH, and MH were higher than 0.70, 0.50, 0.35, 0.49, 0.52, and 0.57, respectively (Figure 7). These findings suggest that the wavelengths with high correlation coefficients can be useful for the spectroscopic analysis and quantification of water quality parameters. Previous studies support this finding and have indicated strong and consistent associations between the spectral reflectance of surface water in various parts of the spectrum and several water quality indicators. For instance, Ma and Dai [81] found that the spectral reflectance between 760 and 1100 nm was the optimal wavelength interval for estimating TSS concentrations. Wu et al. [82] demonstrated that the spectrum between 750 and 900 nm was strongly correlated with TSS and turbidity values. The spectrum between 400 and 900 nm was found to be effective for estimating turbidity and the concentrations of phosphorus, BOD, and COD [22,30,44,83]. Xing et al. [28] also reported that the best wavelengths for estimating different water quality parameters, including COD, BOD, TH, TA, and TDS, were located in the spectrum ranging from 400 to 450 nm and from 900 to 1300 nm.
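The band-by-band correlation analysis described above can be sketched as follows. The reflectance matrix, index values, wavelengths, and threshold are hypothetical and serve only to show how sensitive bands could be selected by the absolute value of r.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x)
    sy = sum((b - my) ** 2 for b in y)
    if sx == 0.0 or sy == 0.0:
        return 0.0
    return cov / math.sqrt(sx * sy)


def band_correlations(spectra: Sequence[Sequence[float]],
                      index_values: Sequence[float]) -> List[float]:
    """Correlate each band (column of the samples-by-bands matrix) with an index."""
    n_bands = len(spectra[0])
    return [pearson([row[j] for row in spectra], index_values) for j in range(n_bands)]


def sensitive_bands(wavelengths: Sequence[int], r_values: Sequence[float],
                    threshold: float) -> List[int]:
    """Wavelengths whose absolute correlation reaches the threshold."""
    return [w for w, r in zip(wavelengths, r_values) if abs(r) >= threshold]


# Hypothetical data: 4 samples, 3 bands, one index value per sample
wavelengths = [302, 700, 1148]
spectra = [[0.1, 0.4, 0.2], [0.2, 0.3, 0.1], [0.3, 0.2, 0.1], [0.4, 0.1, 0.2]]
index_values = [1.0, 2.0, 3.0, 4.0]
r_values = band_correlations(spectra, index_values)
print(r_values, sensitive_bands(wavelengths, r_values, 0.7))
```

Selecting bands by the absolute value of r keeps both positively and negatively correlated wavelengths.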

Figure 7. Correlation coefficient between the original spectral reflectance over the full wavelength range (302-1148 nm) and different irrigation water quality indices (IWQIs), including water quality index (WQI), residual sodium bicarbonate (RSBC), total hardness (TH), potential salinity (PS), residual sodium carbonate (RSC), and magnesium hazard (MH).

Performance of the Developed Models for Predicting the IWQIs
Since the relationships between the full range of the spectrum and the IWQIs generated a wide range of effective wavelengths, and the highest and most significant r values across these wavelengths were relatively flat, as shown in Figure 7, it was difficult to develop effective models for estimating the water quality parameters from r alone over the full spectrum. Moreover, although simple spectral reflectance indices (SRIs, which focus on only two or three specific wavelengths) are widely used to estimate several IWQIs [23,27,29], this approach ignores a large number of wavelengths within the full spectrum, thereby yielding results with little variation (that is, highly concentrated results). Furthermore, because environmental conditions and the concentrations and types of water components vary widely between studies, published SRIs must be validated or new ones derived, which imposes a heavy computational burden. Because of these limitations of the Pearson correlation coefficient and SRIs for the accurate estimation of different IWQIs, several hyperspectral statistical models have recently been developed and gradually applied to improve the accuracy of IWQI estimation. This study considered ANFIS and SVMR models based on several spectral wavelengths in different regions of the spectrum to improve the estimation of the different IWQIs.
In this study, ANFIS and SVMR were used to develop models for simulating the IWQIs. To construct both models, the IWQI datasets were randomly split into training and testing subsets comprising 70% and 30% of the data, respectively. For the ANFIS models, subtractive clustering was used to create a Sugeno-type FIS, with the separate sets of input and output data supplied as its input arguments for each IWQI. The primary purpose of this step was to determine the number of rules and the most appropriate antecedent membership functions. The subsequent step used linear least-squares estimation to determine each rule's consequent equation so that the rules cover the feature space. Once the best-performing model had been chosen through ANFIS training, the predicted values of the selected output were calculated and compared with the measured values (Supplementary Figures S1-S3 and Table 5). For instance, Supplementary Figure S1 displays the WQI values predicted by the ANFIS against the corresponding experimental values for both the training and testing datasets. The curves of the estimated and experimental data nearly overlap, and the trends in the simulated and experimental data are similar apart from a few records that deviate from the experimental values. Table 5 summarizes the training and testing results of the IWQI prediction models. The ANFIS fits the data well, and the overall prediction results are good in terms of R 2 , RMSE, MAD, and E. The large R 2 values indicate close agreement between the predicted and observed values of the IWQIs. Moreover, the RMSE is equal or close to the MAD, which indicates that the errors are of similar magnitude. A significant positive relationship was also obtained between the measured and simulated WQI values for both the training and testing datasets. According to Equation (11), E ranges from −∞ to 1.0 (perfect fit), and a higher E indicates a more efficient data-driven model. Negative values of E occur when the mean observed value is a better estimator than the model's prediction, which indicates inadequate performance. The values of E presented in Table 5 are above 0.85, which shows that the developed model fits the WQI well for both the training and testing datasets.
Table 5 shows the R 2 , RMSE, MAD, and E of the training and testing datasets of the ANFIS and SVMR models based on all of the spectrum wavelengths for each IWQI. Generally, the ANFIS models estimated the different IWQIs more accurately in both the training and testing datasets (R 2 was 1.00 in the training datasets and from 0.74 to 0.98 in the testing ones) than the SVMR models (R 2 ranged from 0.29 to 0.88 in the training datasets and from 0.01 to 0.70 in the testing ones). The ANFIS models also produced lower RMSE and MAD values and higher E values for the IWQIs than the SVMR models. The ANFIS models provided a robust estimation of the WQI, RSC, RSBC, TH, and PS and a moderate one of the MH in the testing datasets. By contrast, the SVMR models provided a strong estimation of the WQI and RSC, a moderate estimation of the PS and TH, and a weak estimation of the RSBC and MH in the testing datasets ( Supplementary Figures S4-S6 and Table 5).
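The random 70/30 partition into training and testing subsets can be sketched as follows. The seed, the function name, and the use of 40 samples are assumptions for illustration; the text does not state how the random split was implemented.

```python
import random
from typing import List, Tuple


def train_test_split_indices(
    n_samples: int, train_fraction: float = 0.7, seed: int = 42
) -> Tuple[List[int], List[int]]:
    """Randomly partition sample indices into training and testing subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_train = round(n_samples * train_fraction)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


# Example: 40 samples split into 28 training and 12 testing samples
train_idx, test_idx = train_test_split_indices(40)
print(len(train_idx), len(test_idx))
```

Fixing the seed makes the partition reproducible, so the ANFIS and SVMR models can be compared on exactly the same training and testing samples.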
These results indicate that ANFIS or SVMR models based on different spectral bands from the full spectrum can further improve the assessment of IWQIs and can be considered a unified method for remotely estimating constituent concentrations in water quality evaluations. This is because the ANFIS and SVMR models incorporate numerous sensitive and effective wavelengths that cover all of the main variations in the water components and are directly linked to the key changes in the undesired water quality parameters. As in this study, Wang et al. [64] developed SVMR models on the basis of SRIs to predict the drinking water quality index (DWQI). The models based on selected SRIs of derivative order (1.6) were found to estimate the DWQI better than the alternatives, with an R² of 0.92, an RMSE of 58.4, and a regression slope of 0.97. Xing et al. [28] reported that including the most sensitive wavelengths in the PLSR model increased the estimation efficiency of several wastewater quality parameters, such as the TH. Additionally, for estimates of single water quality parameters, Wang et al. [64] reported that PLSR models based on the spectral reflectance range from 400 to 900 nm showed the best predictive performance in cross-validation for both the TSS (R² = 0.97, RMSECV = 1.91, and RPD = 6.64) and chlorophyll-a (R² = 0.98, RMSECV = 6.15, and RPD = 7.44).

Table 5. Comparison of the statistical parameters of the adaptive neuro-fuzzy inference system (ANFIS) and support vector machine regression (SVMR) models for the measured and simulated values of different irrigation water quality indices (IWQIs), including water quality index (WQI), residual sodium bicarbonate (RSBC), total hardness (TH), potential salinity (PS), residual sodium carbonate (RSC), and magnesium hazard (MH).

Conclusions
In this study, the physiochemical parameters were incorporated into a multivariate analysis (FA and PCA) to better understand and interpret the major parameters that influence the groundwater quality of two distinct regions (WD and CND). The suitability of the groundwater in both regions for irrigation purposes was also evaluated based on various IWQIs. The ANFIS and SVMR models based on the optimal spectral reflectance bands were also applied to develop a new algorithm that can predict and simulate the IWQIs of groundwater. The results reported herein indicate that the combination of FA and PCA was advantageous for examining and interpreting the behavior of groundwater quality in both regions, as well as for predicting the variables that may impact groundwater quality through an understanding of the relationship between the physiochemical parameters and the factors or components of both analytical approaches. The results of both methods indicate that the groundwater of the WD is less suitable for irrigation, as the first factor of the FA exhibits strong loadings for TDS and salt ions (principally Na⁺, Cl⁻, and SO₄²⁻) and moderate loadings for Ca²⁺ and Mg²⁺, and the salt ions, which make a major contribution to the TDS of the water, showed a small angle between their vectors and that of the TDS. Therefore, the groundwater of the WD region can be safely used to irrigate soils that have coarse and medium textures, moderate permeability, and good drainage systems, provided that the water leaching requirements are also calculated. According to the IWQ, RSBC, PS, RSC, and MH, approximately 7.0%, 100.0%, 13.0%, 100.0%, and 60.0% of the groundwater samples in the WD are suitable for irrigation purposes and classified as low restriction, satisfactory, excellent to good, good, and suitable, respectively. However, almost all (72-100%) of the groundwater samples of the CND are very suitable for irrigation.
The highest correlations between the six IWQIs and the original spectral reflectance were found in the wavelength intervals of 302-318, 358-900, and 1074-1148 nm, where the correlation coefficients between these wavelength regions and the WQI, RSC, RSBC, PS, TH, and MH were higher than 0.70, 0.50, 0.35, 0.49, 0.52, and 0.57, respectively. These findings confirm that the wavelengths with high correlation coefficients can be useful for the spectroscopic analysis and quantification of water quality parameters. The applicability and capacity of the ANFIS and SVMR models were investigated based on a dataset collected from groundwater wells in the quaternary aquifer located in the CND and WD. The ANFIS models offer a more accurate estimation of the different IWQIs in both the training and testing datasets (R² was 1.00 in the training datasets and from 0.74 to 0.98 in the testing ones) than the SVMR models (R² ranged from 0.29 to 0.88 in the training datasets and from 0.01 to 0.70 in the testing ones). The ANFIS models provide a robust estimation for the WQI, RSC, RSBC, TH, and PS and a moderate one for the MH in the testing datasets. However, the SVMR models showed a strong estimation for the WQI and RSC, a moderate estimation for the PS and TH, and a weak estimation for the RSBC and MH in the testing datasets. Finally, the main conclusion of this study is that the integration of both ANFIS and SVMR with proximal remote sensing could provide a highly useful tool for the prediction of IWQIs. On the basis of our findings, we propose the developed models as simple tools for predicting IWQIs and for onsite evaluations.
Supplementary Materials: The following are available online at https://www.mdpi.com/2073-4441/13/1/35/s1, Figure S1. Comparison between training series (a and c) and testing series (b and d) for water quality index (WQI) and residual sodium bicarbonate (RSBC) using the developed ANFIS model. Figure S2. Comparison between training series (a and c) and testing series (b and d) for total hardness (TH) and potential salinity (PS) using the developed ANFIS model. Figure S3. Comparison between training series (a and c) and testing series (b and d) for residual sodium carbonate (RSC) and magnesium hazard (MH) using the developed ANFIS model. Figure S4. Comparison between training series (a and c) and testing series (b and d) for water quality index (WQI) and residual sodium bicarbonate (RSBC) using the developed SVMR model. Figure S5. Comparison between training series (a and c) and testing series (b and d) for total hardness (TH) and potential salinity (PS) using the developed SVMR model. Figure S6. Comparison between training series (a and c) and testing series (b and d) for residual sodium carbonate (RSC) and magnesium hazard (MH) using the developed SVMR model.