Integration of Water Quality Indices and Multivariate Modeling for Assessing Surface Water Quality in Qaroun Lake, Egypt

Water quality has deteriorated in recent years as a result of rising population and unplanned development, impacting ecosystem health. The water quality parameters of Qaroun Lake


Introduction
The natural environment has been severely degraded by industrialization and uncontrolled urbanization. Lakes are among the world's most productive, varied, and interactive ecosystems. An aquatic ecosystem is made up of the biological community, the physicochemical elements, and their interactions. A complex interplay of physical and biological processes exists within the aquatic environment, and changes do not occur in isolation. Moreover, an ecosystem has often evolved over time, with species becoming adapted to their surroundings [1,2].
Water quality indicators have received considerable attention in recent water environment research because of the potential toxicity, persistence, and bioaccumulation of contaminants that can harm aquatic ecosystems [3,4]. Agricultural activities, industrial processes, and urbanization can pollute the environment and contaminate water ecosystems, endangering aquatic biota and humans [5,6]. Water quality is a crucial component of surface water management, so evaluating surface water quality for aquatic environments in developing nations has become a critical issue. One of Egypt's most important inland aquatic habitats is Qaroun Lake, a closed basin that serves as a primary reservoir for agricultural drainage water in Fayoum Province [6]. During the autumn and winter seasons, the lake is an important location for fishing, salt manufacture, tourism, and migrating birds [7]. The lake is notable for its rich biological life, archeological monuments, and geologic formations [8], yet both natural processes (rain, abrasion, soil erosion, etc.) and human inputs (urban, agricultural, and industrial activities) put pressure on its surface water quality [9,10]. Along the lake's southern edge, there are several pollution sources, including agricultural and urban wastewater discharged by Fayoum Province, as well as fisheries [11,12].
Fayoum Province discharges 450 million m³ of untreated effluent into the lake each year [13]. El-Bats and El-Wadi are the two primary drains; they receive massive volumes of household, industrial, and agricultural wastewater, which puts considerable strain on aquatic life in the lake. The quantity and quality of the water supplied from various sources have a significant impact on water quality. Because lakes are still waters that cannot clean themselves, they are more vulnerable to contamination than other water bodies [14]. Because of the growing human population and the associated increase in pollution hazards, lake monitoring and evaluation have become an important part of lake management. As a result, lake water quality management is required to analyze these effects and provide a path to the long-term socioeconomic and environmental sustainability of this essential resource [13].
Physicochemical parameters such as temperature, pH, TDS, Al, Ba, Cd, Cr, Cu, Fe, Pb, Mn, Ni, and Zn are regarded as key indicators of water quality and are crucial in determining the suitability of water for aquatic life. Trace element levels rising above the limit of quantification can affect water quality and damage both the environment and human activities [15,16]. Even Zn, which is a vital element for living organisms, is poisonous in excessive amounts [17].
Water quality has deteriorated in recent years because of rising population and unplanned development, impacting ecosystem health. To understand the impacts on water quality and living organisms, it is necessary to investigate water quality parameters in aquatic environments. Natural and human processes, as well as the transfer of nutrients and trace elements to surface waters, affect water quality in any region [9,10,18,19]. Water quality indices (WQIs) are crucial in this process: they serve as a communication tool for conveying water quality data and should be calculated to monitor water quality [20][21][22]. Therefore, several documented indices, namely the WAWQI, HPI, MI, C_d, and PI, were used in this research to determine the current state of surface water hydrochemistry and the suitability of the water for aquatic ecosystems. Pollution indices (PIs) are helpful techniques for assessing surface water quality; they reflect the cumulative impacts of trace elements and indicate overall water quality and the degree of contamination [23][24][25]. Water pollution indices are also considered an efficient means of ensuring safety, because they support control plans for monitoring the development, expansion, urban production, and direction of human activities in order to prevent negative effects on water resources [26][27][28].
The WAWQI is a weighted arithmetic technique for classifying water quality based on purity levels [29]. The HPI is a useful tool for assessing the impact of specific trace elements on overall water quality and on the suitability of surface water for human consumption [30]. Furthermore, the MI takes into account the cumulative effects of trace elements, allowing for a quick evaluation of overall water quality [31]. The C_d evaluates the degree of pollution of the water in terms of specific trace elements. In addition, the PI evaluates the relative toxicity of each metal separately, and the results together represent the combined impact of all metals on water quality and contamination level [32]. The degree of pollution by trace elements is therefore measured by combining the individual contamination parameters, that is, the cumulative effects of trace elements regarded as harmful to the aquatic environment.
Multivariate analysis of environmental data is widely used to identify potential pollution sources that affect water systems, and it is an important approach for dependable water resource management and for rapid solutions to pollution problems [33][34][35]. Cluster analysis (CA) and principal component analysis (PCA) are often used in the evaluation and monitoring of trace element contamination of water [36,37]. CA and PCA classify metals or other investigated parameters into distinct factors/groups based on their likely source of contribution, and they help organize and simplify large data sets to give useful insight [38]. Furthermore, water quality may be evaluated using a geographic information system (GIS) together with multivariate statistical modeling. Through interconnected layers of geographical information, GIS can represent the real environment [39,40]. GIS makes it simpler to analyze landscape features by providing spatial data that are not readily available through field research [41,42]. Geoprocessing models are crucial because they automate and record the various phases of geospatial processing, as well as the complete geospatial data management process [43]. Pollution indices assist in identifying and mapping pollution levels, as well as in determining present and prospective negative impacts on the aquatic system.
Combining the WAWQI and PIs with machine learning models such as support vector machine regression (SVMR) is a useful and practical way to assess surface water quality, and it gives policymakers the information they need to understand the current state of surface water quality and its control mechanisms. It also helps in determining the best treatment techniques for specific problems [28,44,45,46,47]. Calculating these indices with traditional equations requires several steps, careful arithmetic, and considerable time and effort to convert a large number of water characterization data into a single value (WAWQI or PIs) that describes the level of water quality [29,48]. To overcome this problem, SVMR can be used, since it is a common method for modeling non-linear relationships between a set of independent variables and response variables [45,49,50]. SVMR condenses many water characterization variables into a single index to improve water parameter estimation. As a result, indices such as the WAWQI and PIs can be analyzed simultaneously across a wide range of water characterization data. The SVM maps the data into a new high-dimensional space using a kernel function; a predictive model is then built from a subset of sample cases known as support vectors [49][50][51]. To the best of our knowledge, little research has assessed the performance of SVMR in predicting WQIs from water characterization data. Several distinct water quality indexing methods are used in this study so that their results can be compared.
Therefore, the objectives of this work were to (i) assess the appropriateness of surface water for aquatic environments using the WAWQI; (ii) assess the contamination risk of surface water using PIs; (iii) classify physiochemical parameters into distinct groups/factors using CA and PCA; and (iv) evaluate the efficiency of SVMR models based on physical parameters and trace elements to predict the WAWQI as well as based on trace elements to predict PIs.

Study Area
Qaroun Lake is part of the Fayoum Depression, which was produced by natural circumstances in the northeastern section of Egypt's Western Desert. It is considered a closed shallow semi-saline lake lying between longitudes 30°24′ and 30°50′ E and latitudes 29°24′ and 29°33′ N (Figure 1), with an area of about 200 km² and forming the deepest part in the Fayoum Depression with no outflow except evaporation [52]. The research area is rectangular and elongated in shape, with average measurements of 45 km in length, 5.7 km in width, and 4.2 m in depth [53]. The urban and agricultural regions border the lake on the south and east, while the uninhabited desert lands border it on the north and west.
Qaroun Lake serves as a large natural reservoir for various effluents (agricultural, household, sewage, and industrial wastes) that flow through the eastern and southern drains from a great portion of Fayoum Province [7]. The drainage system has two major drains (El-Bats and El-Wadi) as well as several subsidiary drains (Sheikh Allam and Bahr Qaroun) that go to the lake (Figure 1). The investigated catchment is located in Egypt's desert region, where the temperature is typically warm and dry, with a hot, long dry summer and a moderate, short winter [53]. Low seasonal rainfall (10 mm/y) and a high evaporation rate (7.3 mm/day) are further characteristics of the study area [54,55].

Sampling and Analyses
Water samples were collected from 16 points across Qaroun Lake in July (dry season) of two consecutive years, 2018 and 2019 (Figure 1). The sampling locations were recorded as UTM coordinates using a handheld MAGELLAN GPS 315. Physical properties of the water samples, namely temperature (T, °C), pH, and TDS, were measured in situ using a calibrated YSI Professional Plus handheld multi-parameter instrument (Hanna HI 9811-5). Surface water samples were collected in pre-labeled 500 mL polyethylene bottles and acidified with nitric acid to a pH below 2. The bottles were closed immediately and stored in a refrigerator at 4 °C until analysis. Trace elements (Al, Ba, Cd, Cr, Cu, Fe, Pb, Mn, Ni, and Zn) were analyzed following standard analytical procedures [56] using an inductively coupled plasma mass spectrometer (iCAP TQ ICP-MS, Thermo Fisher Scientific Inc., Waltham, MA, USA) at the Environmental and Food Lab, University of Sadat City, which is accredited according to ISO/IEC 17025:2017. The findings are shown in Table 1. Duplicate analyses were performed for quality assurance and quality control (QA/QC) to improve confidence in the analytical data. In addition, the precision of the method was verified by analyzing a certified reference material (ERM-CA713).
The WAWQI assesses water quality based on the degree of purity using the most routinely measured water quality variables. It is the most appropriate index for determining the overall quality of surface water for aquatic use, and it is defined by the equation published by Brown et al. [29]. The weighted arithmetic approach computes the WAWQI according to Equation (1):
$$\mathrm{WAWQI} = \frac{\sum_{i=1}^{n} Q_i W_i}{\sum_{i=1}^{n} W_i} \qquad (1)$$
where Q_i is the sub-quality index of the ith variable, W_i is the unit weight of that variable, and n = 13 is the number of physicochemical characteristics, which were expressed in mg/L.
The value of Q_i is calculated from the surface water concentration (C_i) and the standard (S_i) for each parameter, taken from the Canadian Council of Ministers of the Environment (CCME) guideline values for aquatic life [57], as shown in Equation (2):
$$Q_i = \frac{C_i}{S_i} \times 100 \qquad (2)$$
The recommended standards are also used to calculate the weight w_i of each parameter [57] by Equation (4):
$$w_i = \frac{K}{S_i} \qquad (4)$$
where K is the proportionality constant. To calculate the WAWQI, a weight (w_i) must be assigned to each surface water parameter, and the relative weight (W_i) and quality rating (Q_i) must then be calculated. W_i values were assigned for the selected physicochemical parameters (Table 1), while w_i was computed using Equation (4). The arithmetic weight approach was used to assign weighted values. The weights (w_i) and arithmetic weights (W_i) for the water parameters are presented in Table 2.
Table 2. Arithmetic rating method for computation of HPI, MI, C_d and PI.
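As an illustration of the weighted arithmetic calculation described above, the following minimal Python sketch computes a WAWQI value for a single sample. It assumes the common forms Q_i = 100 × C_i/S_i and w_i = K/S_i, with K chosen so that the weights sum to one; that choice of K, the function name `wawqi`, and the two-parameter example values are illustrative assumptions, not values or code from the study.

```python
def wawqi(conc, std):
    """Weighted arithmetic WQI for one sample (illustrative sketch).

    conc: measured value per parameter; std: guideline value per parameter.
    """
    # Assumption: K is chosen so that the unit weights w_i sum to 1.
    k = 1.0 / sum(1.0 / s for s in std.values())
    w = {p: k / s for p, s in std.items()}            # w_i = K / S_i
    q = {p: 100.0 * conc[p] / std[p] for p in std}    # Q_i = 100 * C_i / S_i
    return sum(q[p] * w[p] for p in std) / sum(w.values())


# Hypothetical two-parameter example (not data from the study).
standards = {"metal_a": 0.1, "metal_b": 0.5}
sample = {"metal_a": 0.2, "metal_b": 0.25}
wawqi_value = wawqi(sample, standards)
print(round(wawqi_value, 2))
```

Because the weights are inversely proportional to the standards, parameters with strict guideline values dominate the index. A sample that sits exactly at every guideline value scores 100 under these assumptions.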

Pollution Indices (PIs)
Four techniques were used in this work: the HPI, proposed by Prasad and Bose [58]; the MI, proposed by Tamasi and Cini [48]; the C_d, established by Backman et al. [59]; and the PI, proposed by Caeiro et al. [32]. These pollution indices were computed from the concentrations of the ten selected trace elements in Table 1 according to the following equations.

Heavy Element Pollution Index (HPI)
Each selected parameter was given a rating or weight (W_i) to construct the HPI [60]. The HPI is a toxicity index based on mathematical weights of the trace elements; it reflects overall water quality with respect to the recommended standard guideline (S_i) for each metal in the aquatic environment [57]. The concentration limits, i.e., the standard permitted value (S_i) and the maximum desired value (I_i) of each parameter, were taken from the CCME standards [57] (Table 2). The HPI values were then estimated according to Equation (5):
$$\mathrm{HPI} = \frac{\sum_{i=1}^{n} W_i Q_i}{\sum_{i=1}^{n} W_i} \qquad (5)$$
where W_i and Q_i are the unit weight and the sub-index of the ith trace element in Table 1, and n = 10 is the number of trace elements considered. W_i and Q_i are calculated by Equations (6) and (7):
$$W_i = \frac{K}{S_i} \qquad (6)$$
$$Q_i = \frac{\lvert M_i - I_i \rvert}{S_i - I_i} \times 100 \qquad (7)$$
where K is the proportionality constant, and M_i, I_i, and S_i are the monitored, ideal, and standard values of the ith parameter, respectively. The vertical bars denote the numerical difference between the two values, with the algebraic sign ignored. HPI values fall into three categories: low trace element pollution (HPI < 100), trace element pollution at the threshold of risk (HPI = 100), and excessive heavy metal pollution (HPI > 100) [58,61,62].
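The HPI calculation and its three-way classification can be sketched as follows. This is a minimal example that assumes W_i = K/S_i with K = 1 and Q_i = 100 × |M_i − I_i|/(S_i − I_i). The function names and the two hypothetical metals are illustrative only and do not come from the study.

```python
def hpi(measured, standard, ideal, k=1.0):
    """Heavy metal pollution index for one sample (illustrative sketch)."""
    num = 0.0
    den = 0.0
    for metal, m in measured.items():
        s, i = standard[metal], ideal[metal]
        w = k / s                              # unit weight W_i = K / S_i
        q = 100.0 * abs(m - i) / (s - i)       # sub-index Q_i, sign ignored
        num += w * q
        den += w
    return num / den


def classify_hpi(value, critical=100.0):
    """Three categories described in the text, relative to the critical value 100."""
    if value < critical:
        return "low"
    if value == critical:
        return "threshold"
    return "high"


# Hypothetical example values (not data from the study).
measured = {"x": 0.02, "y": 0.05}
standard = {"x": 0.01, "y": 0.10}
ideal = {"x": 0.0, "y": 0.0}
hpi_value = hpi(measured, standard, ideal)
print(round(hpi_value, 2), classify_hpi(hpi_value))
```

Since K appears in both the numerator and the denominator of the index, its value does not change the result; it is kept as a parameter only to mirror the notation.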

Metal Index (MI)
The Metal Index (MI) is a technique for determining the overall quality of water in terms of metals. It is based on a complete trend evaluation of the current state [61]. The MI, computed according to Equation (8), represents water quality conditions under metal stress:
$$\mathrm{MI} = \sum_{i=1}^{n} \frac{H_c}{H_{\max}} \qquad (8)$$
where H_c is the concentration of trace elements, H_max is the maximum permitted concentration for each metal, and i is the ith sample [48].
The degree of contamination (C_d) was calculated based on the contamination factors of specific trace elements that exceeded acceptable limits [32,61], according to Equations (9) and (10):
$$C_d = \sum_{i=1}^{n} C_{fi} \qquad (9)$$
$$C_{fi} = \frac{C_{Ai}}{C_{Ni}} - 1 \qquad (10)$$
where C_fi is the contamination factor of the ith trace element, C_Ai is the analytical value for that metal, and C_Ni is its acceptable concentration, referred to as the MAC (Table 2).
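A short sketch of the MI and C_d calculations follows, assuming the commonly used forms MI = Σ H_c/H_max and C_fi = C_Ai/C_Ni − 1, both summed over the metals. Under these assumed forms C_d = MI − n, which is consistent with the ranges reported for this lake with n = 10 (for instance, 6.343048 − 10 = −3.656952). The function names and example values are hypothetical.

```python
def metal_index(conc, mac):
    """MI: sum of concentration-to-MAC ratios over all metals."""
    return sum(conc[m] / mac[m] for m in conc)


def contamination_degree(conc, mac):
    """C_d: sum of contamination factors C_fi = C_Ai / C_Ni - 1."""
    return sum(conc[m] / mac[m] - 1.0 for m in conc)


# Hypothetical example values (not data from the study).
conc = {"a": 2.0, "b": 0.5}
mac = {"a": 1.0, "b": 1.0}
mi_value = metal_index(conc, mac)
cd_value = contamination_degree(conc, mac)
print(mi_value, cd_value)
```

The identity C_d = MI − n means the two indices rank samples identically and differ only by a constant offset equal to the number of metals.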

Pollution Index (PI)
The impact of trace element pollution on surface water was assessed using PI values, which are computed for each metal individually and classified into five groups (Table 3). The PI reflects the individual contamination effect of each trace element on surface water quality and is calculated according to Equation (11), where C_i is the metal concentration and S_i is the reference level for that metal in water [32,63].
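A possible implementation of the per-metal PI is sketched below. It assumes the form often used with this index, PI = sqrt(((C_i/S_i)²_max + (C_i/S_i)²_min) / 2), where the maximum and minimum are taken over the samples for one metal. The five class boundaries (1, 2, 3, and 5) are also an assumption, chosen because they agree with the class labels reported in this study; Table 3 itself is not reproduced here. Function names and inputs are hypothetical.

```python
import math


def pollution_index(concentrations, standard):
    """PI for one metal from its concentrations across samples (assumed form)."""
    ratios = [c / standard for c in concentrations]
    return math.sqrt((max(ratios) ** 2 + min(ratios) ** 2) / 2.0)


def classify_pi(pi):
    """Five classes; the boundaries are an assumption, not taken from Table 3."""
    if pi < 1.0:
        return "no effect"
    if pi < 2.0:
        return "slightly affected"
    if pi < 3.0:
        return "moderately affected"
    if pi <= 5.0:
        return "strongly affected"
    return "seriously affected"


# Hypothetical concentrations for a single metal (not data from the study).
pi_value = pollution_index([0.3, 0.4], 0.1)
print(round(pi_value, 2), classify_pi(pi_value))
```

With these boundaries, the PI values reported in the results (3.51, 2.76, 2.43, and 1.69) fall into the strongly, moderately, moderately, and slightly affected classes, respectively.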

Data Analysis
Descriptive statistics (e.g., minimum, maximum, mean, and standard deviation) were computed for the physicochemical parameters and WQIs. The Pearson correlation coefficient was used to establish the relationships between the WQIs and the physical and chemical characteristics of the water samples, with significance thresholds of 0.05 and 0.001. For water quality evaluation, CA and PCA were applied to improve the identification of the main contaminant components in surface water by transforming the chemical analysis data into recognizable patterns [64][65][66][67]. CA and PCA were used to identify the sources or factors responsible for changes in water quality by converting the original variables into a new set. PAST software (version 3.25) was used for the descriptive statistics, the Pearson correlation coefficients, and the CA and PCA of the physicochemical concentrations. The maps were created in ArcGIS (version 10) using inverse distance weighted (IDW) interpolation, one of the most basic and widely used interpolation methods for mapping various characteristics [6,68,69]. Using the IDW tool in ArcGIS, the concentrations of trace elements across the research area were estimated from the statistical relationships between the known sampling locations.
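The idea behind IDW interpolation can be shown in a few lines of Python. This sketch is not the ArcGIS tool; it assumes a power of 2 (the paper does not state the power or search settings), and the function name and sample points are hypothetical.

```python
import math


def idw(points, target, power=2.0):
    """Inverse distance weighted estimate at target from (x, y, value) samples."""
    num = 0.0
    den = 0.0
    for x, y, value in points:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return value           # target coincides with a sampled location
        w = 1.0 / d ** power       # closer samples receive larger weights
        num += w * value
        den += w
    return num / den


# Hypothetical sampling points: (easting, northing, concentration).
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint_estimate = idw(samples, (1.0, 0.0))
print(midpoint_estimate)
```

Because IDW is an exact interpolator, the estimate at a sampled location equals the measured value, and estimates elsewhere always lie between the minimum and maximum of the measured values.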

Support Vector Machine Regression
The SVMR algorithm is a machine learning method that can be used to classify and recognize patterns. The SVMR models were constructed in The Unscrambler X, version 10.2 (CAMO Software AS, Oslo, Norway). SVMR was used to build calibration (Cal.) and validation (Val.) models of the WAWQI with three physical parameters and ten trace elements as input data, and of the PIs with the ten trace elements as input data (Table 1). For example, the calibration model for a single dependent variable (e.g., the contamination index, C_d) used several independent variables (e.g., Al, Ba, Cd, Cr, Cu, Fe, Pb, Mn, Ni, and Zn). To construct the models, the measured datasets were randomly divided into a training set (67 percent) and a testing set (33 percent). The performance of the Cal. and Val. models in predicting the WQIs was evaluated using four criteria: the determination coefficient (R²), the root mean square error (RMSE), the mean absolute deviation (MAD), and the accuracy (Acc). The MAD measures the precision for continuous variables. The optimal model was the one with the lowest RMSE and MAD and the highest R² and Acc. The four criteria were computed according to Equations (12)-(15), in which WQIo_i is the observed value, WQIf_i is the predicted value, and n is the number of data points.
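The random 67/33 split and the four evaluation criteria can be sketched in plain Python as shown below. The model fitting itself is not shown. The definitions are assumptions based on common usage: R² as one minus the ratio of residual to total sum of squares, RMSE and MAD in their usual forms, and Acc as one minus the absolute mean relative error. The exact forms of Equations (12)-(15) may differ, and all function names and example values here are hypothetical.

```python
import math
import random


def split_indices(n, train_frac=0.67, seed=0):
    """Randomly split sample indices into training and testing subsets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(n * train_frac)
    return idx[:cut], idx[cut:]


def r2(obs, pred):
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def rmse(obs, pred):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mad(obs, pred):
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def acc(obs, pred):
    # Assumed form: 1 - |mean relative error|; observed values must be non-zero.
    rel = [(p - o) / o for o, p in zip(obs, pred)]
    return 1.0 - abs(sum(rel) / len(rel))


# Hypothetical observed and predicted index values (not data from the study).
observed = [2.0, 4.0]
predicted = [1.0, 5.0]
train_idx, test_idx = split_indices(32)
print(len(train_idx), len(test_idx))
print(r2(observed, predicted), rmse(observed, predicted),
      mad(observed, predicted), acc(observed, predicted))
```

Note that an accuracy defined through the mean of signed relative errors lets over- and under-predictions cancel, so it is best read alongside RMSE and MAD, which do not.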

Physicochemical Data
Physicochemical parameters play an important role in water quality evaluations and are a valuable source of information about water chemistry and quality. Table 4 shows descriptive statistics of the physicochemical characteristics, including trace elements, of the surface water samples taken from Qaroun Lake over two years. Temperature is a key factor in the aquatic environment and one of the variables that determine water quality, because it controls biological, physical, and chemical activities in water. The water temperature varied between 28.8 °C and 34.2 °C, with an average of 31.5 °C during summer across the two years. Although the water in Qaroun Lake lies in the optimal range for most aquatic organisms, steep temperature gradients can have direct harmful effects on fish according to the CCME guidelines for aquatic life [57]. The surface water pH values varied from 7.8 to 8.4, with a mean of 8.2, which falls within the acceptable range for aquatic environments according to the CCME guidelines [57]. These pH values indicate slightly alkaline water as well as an increase in the photosynthetic activity of planktonic algae [70]. The TDS values of the collected samples ranged between 27,652.27 mg/L and 39,056.09 mg/L, with a mean value of 35,679.37 mg/L. These values show that the surface water of Qaroun Lake is of the semi-saline type (i.e., 10,000-100,000 mg/L), a consequence of evaporation combined with very high solute dissolution and continuous recharge by agricultural, domestic, sewage, and industrial wastes in the closed lake.
All water quality parameters are expressed in mg/L except temperature (T, °C) and pH; SD: standard deviation.
The trace elements Al, Ba, Cd, Cr, Cu, Fe, Pb, Mn, Ni, and Zn showed mean values of 0.29, 0.053, 0.21, 0.016, 0.012, 0.10, 0.0068, 0.005, 0.004, and 0.003 mg/L, respectively, following the trend Al > Ba > Fe > Ni > Cu > Zn > Pb > Mn > Cr > Cd. Trace elements in water come from two sources: natural (rock weathering and soil leaching) and anthropogenic (urban residential and industrial waste and chemical fertilizer usage). The trace element concentrations differed significantly between samples, indicating that the surface water was contaminated by Al, Ba, Cd, Cu, Mn, and Zn at levels higher than the permissible limits proposed by the CCME for the protection of aquatic life [57]. The physicochemical results for the study area were in agreement with the results reported by other studies in this region [26,71]. For example, the averages of the studied parameters were compared with those of the Wadi El-Rayan Lakes in Fayoum Province, where eight heavy metals (Cd, Cr, Cu, Fe, Mn, Ni, Pb, and Zn) were used to assess metal pollution in the lakes' water [26]. According to those findings, Pb and Cd concentrations in the upper lake exhibited a significant temporal difference (p < 0.05), whereas Fe, Ni, Zn, and Cu values showed a highly significant spatial difference (p < 0.01). These findings revealed that the discharge of untreated effluents, sewage, and agricultural chemicals into the lakes via the El-Wadi drain, together with the increasing rate of water evaporation, results in increased metal levels, potentially reversing the dramatic transformation story of Qaroun Lake and the deterioration of the aquatic environment.
Furthermore, long-term changes in the water quality characteristics and the metal pollution load (Fe, Mn, Zn, Cu, and Cd) of heavily polluted Mediterranean lakes in Egypt have been investigated [72]. The comparison of the five lakes revealed an increase in most metal values at Qaroun Lake, Mariut Lake, Manzala Lake, and Burullus Lake, except Mn, which had higher levels than Manzala Lake. Burullus Lake was rated third, followed by Idku Lake. The results (Table 5) revealed that the values of all examined metals in several northern Egyptian lakes exceed the CCME permitted levels [57].

Water Quality Indices
Table 6 presents descriptive statistics of the water quality indices (WAWQI, HPI, MI, and C_d) over two years. The WAWQI values ranged from 154.591 to 358.788, with a mean value of 252.461; 100% of the water samples therefore fell into the unsuitable category and are not recommended for the aquatic environment (Table 7). The spatial distribution map shows WAWQI values increasing from northwest to southeast, indicating that most of the surface water quality degradation occurred near the downstream end of the drainage network, where the El-Bats and El-Wadi drains discharge into the lake (Figures 2a and 3a). This may be attributed to the runoff of untreated agricultural and municipal wastewater into the lake.
The HPI values ranged from 154.5875 to 358.8039, with a mean value of 252.4668; 100% of the samples were above the critical HPI value, indicating water highly polluted by trace elements (Figures 2b and 3b). The MI values ranged from 6.343048 to 22.7259, and according to the MI findings, trace elements had a significant impact on all surface water samples (Table 7). Based on the spatial variation map of the MI, surface water samples in the northeastern and southwestern regions of the lake were more influenced by trace elements (Figures 2c and 3c). The HPI and MI showed that the surface water of Qaroun Lake was highly polluted and that the aquatic ecosystem was seriously affected by heavy metals. The heavy metal pollution increased gradually from the southeast to the northwest (Figures 2b,c and 3b,c).
The C_d values of the water samples ranged from −3.65695 to 12.7259. About 50% of the surface water samples had positive values (C_d > 1), indicating highly contaminated surface water, and about 16% of the samples indicated medium contaminated water, while the remaining samples (about 34%) had negative values (C_d < 1), indicating better water quality for the aquatic environment with respect to trace elements (Figures 2d and 3d). The C_d values reveal the degree of metal contamination across the two years resulting from the continuous, rapid discharge of untreated wastewater from the drains, especially at the front of the lake (Figures 2d and 3d).
A comparison of the spatial distribution maps of the WAWQI and PIs (Figures 2 and 3) indicated a decrease in surface water quality for aquatic use. There were no noticeable changes in the spatial distribution of the WQIs between the two years, because the physicochemical characteristics increased only slightly over that period.
The water quality degradation in Qaroun Lake showed that, according to the HPI, the surface water was severely polluted and, according to the MI, heavy metals had a significant impact. Meanwhile, the significant contamination levels for Al, Cd, Cu, and Zn revealed differences between the evaluation schemes for metal concentrations [57]. The surface water quality of the study area was deteriorating because of the rising amounts of effluents swept into the lake from various drains.
Based on the classification of PI levels, the PI data revealed two groups of trace element effects (Table 8). The PI values showed that the surface water samples were strongly affected by Al (PI = 3.51), moderately affected by Cd (PI = 2.43) and Cu (PI = 2.76), and slightly affected by Zn (PI = 1.69), while Ba, Cr, Fe, Pb, Mn, and Ni exerted no effect (PI < 1.0), as shown in Figure 4 and Table 2. According to the PI results, the high loadings of Al and Cu may be attributed to industrial activities, while the high loadings of Cd and Zn point to anthropogenic activities and poor sanitation infrastructure. For comparison, Goher et al. [26] applied PIs to assess the water quality status of the Wadi El-Rayan Lakes. According to the metal index values, all selected surface water samples from those lakes were seriously threatened by metal pollution, and the PI values showed that the surface water of the Wadi El-Rayan Lakes was slightly affected by Cr and Pb and moderately affected by Cd and Cu, with no pollution effect from Fe, Mn, Zn, and Ni for aquatic use.
According to the foregoing findings, the PIs in Qaroun Lake have tended to rise because of uncontrolled releases of domestic and industrial wastewater. Therefore, combining the WAWQI and PIs is a useful and practical method for assessing surface water quality in aquatic ecosystems using physicochemical characteristics in relation to trace elements.

Correlation Matrix between WQIs and Physicochemical Parameters
The correlations between the physicochemical parameters, the WAWQI, and the three PIs were computed via simple regressions, as presented in Figure 5. The WAWQI showed a high, positive, and significant correlation with HPI, MI, and C_d, with r = 0.91 for MI and C_d and r = 0.91 for HPI. The significant correlation coefficients in the matrix of physicochemical parameters, WAWQI, and the three PIs varied from 0.51 to 1.00. TDS, temperature, Ba, and Ni showed non-significant correlations with the four water quality indicators. On the other hand, the four water quality indicators were positively and strongly correlated with Al, Cd, Cr, Cu, Fe, Mn, and Zn, with r varying from 0.64 to 0.95. A moderate correlation between WAWQI and HPI was found, with r = 0.59. Al showed the highest correlation coefficient with MI and C_d (r = 0.95), C_d showed the highest correlation coefficient with WAWQI and HPI (r = 0.94), and Zn showed the highest correlation coefficient with MI and C_d (r = 0.85).
Figure 5. The relationships between physicochemical parameters and WQIs across two years.
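As an illustration of how such a correlation matrix is obtained, the sketch below computes Pearson's r for every pair of columns. It assumes the reported r values are Pearson coefficients; the column names and numbers are invented for demonstration and are not the lake data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Map every (name_a, name_b) pair to its Pearson r."""
    return {(a, b): pearson_r(columns[a], columns[b])
            for a in columns for b in columns}


# Hypothetical measurements at five stations (not the paper's data).
columns = {
    "trace_element": [1.2, 2.4, 3.1, 4.8, 5.5],
    "index": [10.1, 19.8, 26.0, 40.2, 45.9],
    "unrelated": [33.0, 31.0, 35.0, 30.0, 34.0],
}

for (a, b), r in correlation_matrix(columns).items():
    print(f"r({a}, {b}) = {r:.2f}")
```

Whether a given r is significant depends on the number of samples, which this sketch does not test.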

Cluster Analysis
The CA was used to identify water quality changes and to classify the physicochemical characteristics by transforming the initial variables into a new set of variables associated with water. Three clusters were identified in the CA findings for trace elements: Cluster I includes Al and Zn, and another cluster includes Ba (Cluster II) and was further split into two sub-clusters, one comprising Fe, Cu, and Ni and the other comprising Cd, Cr, and Mn (Figure 6a). According to the CA of the physicochemical parameters (Figure 6a), the high contributions of Al, Zn, and Ba may be attributable to industrial and agricultural wastewater arising from human actions. Trace elements such as Fe, Cu, Ni, Cd, Cr, Pb, and Mn fall into a distinct cluster, which points to anthropogenic activities and the rapidly growing industrial sectors in Fayoum Province. Therefore, the main causes of trace element pollution in Qaroun Lake appear to be industrial leaching and precipitation, together with increasing human activities [76,77].
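The grouping step behind a dendrogram such as Figure 6a can be sketched with a small agglomerative procedure. The linkage rule and distance measure used in the study are not stated in this section, so average linkage with Euclidean distance is assumed here, and the variable names and standardized profiles are invented.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    profiles maps a variable name to its (standardized) values.
    """
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between members of both clusters.
                total = sum(euclidean(profiles[a], profiles[b])
                            for a in clusters[i] for b in clusters[j])
                dist = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


# Hypothetical standardized profiles of five variables (not the paper's data).
profiles = {
    "x1": [0.9, 1.1, 1.0],
    "x2": [1.0, 1.2, 0.9],
    "x3": [-1.0, -0.8, -1.1],
    "x4": [0.0, 0.1, -0.1],
    "x5": [0.1, 0.0, 0.05],
}

print(average_linkage(profiles, n_clusters=3))
```

Variables with similar profiles across stations end up in the same cluster, which is the basis for attributing them to a common source.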

Principal Component Analysis
Figure 6b shows the results of a PCA of the physicochemical characteristics and trace elements for the surface water stations over two years. PC1 explained 50.193% of the total variance and was dominated by large positive loadings of T °C, Zn, Cd, Al, Mn, Cr, Cu, Ni, Fe, Pb, and Ba, especially for samples 1 to 9. PC2 (pH and TDS) explained 14.431% of the total variance and was strongly associated with samples 12 and 15 (Figure 6b). All trace elements were clustered together in positive loading combinations, which revealed a strong relationship between the variables. According to the PCA results, the existence of ten essential principal components showed the influence of trace elements on surface water quality in the research region. PC1 showed maximum loadings of T °C, Al, Cd, Mn, Fe, Pb, and Zn in samples No. 2, 3, 4, 5, and 8, while PC2 showed maximum loadings of pH and TDS in samples No. 1, 6, and 9 (Figure 6b). The PCA of the surface water samples for the physicochemical parameters revealed the loadings of Al, Cd, Ba, Ni, Cu, Cr, Mn, Fe, Pb, and Zn on PC1 and the loadings of pH and TDS on PC2 (Figure 6b). These findings could be attributable to industrial and anthropogenic operations in the research region [78–80] that lead to contamination of Qaroun Lake by individual metals, especially given the high loadings of Ba, Cr, Cd, and Zn. Furthermore, phosphorus fertilizers are a source of several harmful trace metals such as Cd, Cr, and Zn, which are mostly of anthropogenic origin [16,81,82].

Most surface water locations in the research area had highly contaminated water, as shown by a strong agreement between PCA and Cd. Agricultural runoff, discharge of industrial wastewater, and urban sewage through the estuary have all developed near the research area in recent years, as evidenced by the integration of trace element contributions in PCA and PIs. Therefore, combining PCA and PIs for surface water quality assessment regarding trace elements is a beneficial and adaptable approach that provides useful insights.
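The share of variance attributed to a component, such as the percentages reported for PC1 and PC2, can be illustrated with a minimal sketch. It assumes the PCA is run on standardized variables (a correlation matrix), which this section does not state, and it finds only the first component by power iteration. The data and function names are hypothetical.

```python
import math


def standardize(columns):
    """Z-score each column (a list of equal-length lists of floats)."""
    scaled = []
    for col in columns:
        mean = sum(col) / len(col)
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (len(col) - 1))
        scaled.append([(v - mean) / sd for v in col])
    return scaled


def correlation_matrix(z):
    """Correlation matrix of already standardized columns."""
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z]
            for zi in z]


def first_component(matrix, iterations=500):
    """Dominant eigenvalue and eigenvector by power iteration.

    The start vector is arbitrary; the method fails only if it happens
    to be orthogonal to the dominant eigenvector.
    """
    vec = [float(i + 1) for i in range(len(matrix))]
    for _ in range(iterations):
        nxt = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    eigenvalue = sum(v * sum(m * w for m, w in zip(row, vec))
                     for v, row in zip(vec, matrix))
    return eigenvalue, vec


# Hypothetical measurements of three variables at six stations.
data = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.1, 3.9, 6.2, 8.1, 9.8, 12.2],
    [5.0, 3.0, 6.0, 2.0, 7.0, 4.0],
]

corr = correlation_matrix(standardize(data))
eigenvalue, loadings = first_component(corr)
# For a correlation matrix the total variance equals the number of variables.
share = eigenvalue / len(corr)
print(f"PC1 explains {100 * share:.1f}% of the total variance")
print("PC1 loadings:", [round(v, 3) for v in loadings])
```

Variables with large loadings of the same sign on a component vary together, which is how the trace elements are grouped on PC1 in the text.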

The Support Vector Machine Regression Models to Predict Water Quality Indices
Mathematical methods can be used to calculate accurate estimates of the WAWQI, HPI, MI, and C_d of surface water from physicochemical parameters. Although these methods are accurate, they require more time, effort, and several steps to convert multiple input data into a single output value. The SVMR is an easy method for building a single predictive model from multiple input variables [46,50,83]. The SVMR model can solve both regression and classification problems, as well as map low-dimensional nonlinear input to high-dimensional output [49,51]. Table 9 shows the R², RMSE, MAD, and Acc of the calibrating and validating datasets of the SVMR models, which were based on all three physical parameters and ten trace elements to predict WAWQI and on ten trace elements to predict HPI, MI, and C_d of surface water. Generally, the SVMR models provided an accurate estimation of the different WQIs in both the calibrating and validating datasets. For example, R² was 0.99 in the calibrating datasets and ranged from 0.97 to 0.99 in the validating ones. The plots for WAWQI and HPI are in Figure 7; the plots for MI and C_d are in Figure 8. From the calibrating to the validating periods, the SVMR model demonstrated only a minimal decline in performance (R², RMSE, MAD, and Acc) for all four WQIs (Table 9 and Figures 7 and 8). Figures 7 and 8 show 1:1 scatter plots of the measured and predicted values of the WAWQI and PIs for the SVMR calibration and validation models. They reveal only small differences between the predicted and measured values of the four indices in the calibrating and validating phases. The slope of the fitted line varied from 0.957 to 1.045 for the Cal. models and from 0.851 to 1.045 for the Val. models (Figures 7 and 8).
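The goodness-of-fit statistics compared between the calibrating and validating datasets can be sketched as follows. The exact formulas behind Table 9 are not given in this section, so the definitions here are assumptions: R² as the coefficient of determination, MAD as the mean absolute deviation between measured and predicted values, and the slope as that of a least-squares line of predicted against measured values. Acc is omitted because its definition is not stated. All numbers are invented.

```python
import math


def regression_metrics(measured, predicted):
    """Return R2, RMSE, MAD and the slope of predicted vs. measured."""
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    residuals = [m - p for m, p in zip(measured, predicted)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    # Least-squares slope of the line predicted = a + b * measured.
    slope = sum((m - mean_m) * (p - mean_p)
                for m, p in zip(measured, predicted)) / ss_tot
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mad": sum(abs(r) for r in residuals) / n,
        "slope": slope,
    }


# Hypothetical measured and predicted index values (not the paper's data).
calibration = ([52.0, 61.5, 70.2, 88.9, 95.4], [51.6, 62.0, 69.8, 89.5, 95.0])
validation = ([55.3, 66.1, 79.0, 91.2], [56.4, 64.9, 80.6, 89.7])

for label, (measured, predicted) in (("Cal.", calibration), ("Val.", validation)):
    stats = regression_metrics(measured, predicted)
    print(label, {name: round(value, 3) for name, value in stats.items()})
```

A slope close to 1 together with a small RMSE indicates that the points lie near the 1:1 line in a measured-versus-predicted scatter plot.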
To the best of our knowledge, the application of machine learning (SVMR) to predict the WAWQI, HPI, MI, and C_d from physicochemical factors has not previously been addressed. Recently, multivariate regression models based on partial least squares regression (PLSR) and stepwise multiple linear regression (SMLR) have been shown to predict water quality indices accurately [28]. For both the Cal. and Val. models, Gad et al. [28] reported that PLSR based on data for many trace elements accurately predicted the PIs and the drinking water quality index (DWQI), with R² ranging from 0.98 to 1.00 for the Cal. models and from 0.88 to 0.99 for the Val. models. Likewise, the SMLR models, which included major ions and heavy metals as input data, provided the best prediction of the PIs and the DWQI, with R² = 1. Elsayed et al. [46] found principal component regression (PCR) and SVMR to be robust models for predicting six irrigation water quality indices in the Cal. and Val. models, with R² ranging from 0.48 to 0.99. In addition, Chen and Liu [84] found multiple linear regression (MLR) with physicochemical parameters as input data to be useful for estimating water quality variables such as chlorophyll disk depth, total phosphorus, and dissolved oxygen, with R² values of 0.55, 0.31, and 0.64, respectively. Finally, the findings of this research show that SVMR is able to predict WAWQI, HPI, MI, and C_d in surface water.
Figure 8. Comparison between calibrating series (a,c) and validating series (b,d) for MI and C_d using the SVMR model. Statistical analysis results were added in Table 9.

Conclusions
In this study, water quality indices (WQIs), multivariate statistical techniques such as CA and PCA, and machine learning in the form of SVMR, all based on physicochemical parameters, were tested to characterize the suitability of surface water quality for aquatic utilization in Qaroun Lake, Egypt. According to the analytical data, the surface water in the analyzed area of Qaroun Lake was of the semi-saline type, and the trace element contents followed the trend Al > Ba > Fe > Ni > Cu > Zn > Pb > Mn > Cr > Cd. The surface water was heavily influenced by Al, moderately influenced by Cd and Cu, and slightly influenced by Zn. The surface water quality of Qaroun Lake has deteriorated because of the widespread use of agricultural fertilizers and pesticides, industrial activity, and insufficient drainage networks. The WQIs, confirmed by multivariate statistical analysis, indicate that industrial effluents and landfill leachates/municipal sewage are considered the primary sources of trace element contamination in Qaroun Lake. Therefore, the use of effective wastewater treatment procedures prior to disposal into the lake will contribute to remediating the deterioration of surface water quality in the investigated region. In both the calibration and validation datasets, the SVMR models performed well in estimating the four WQIs of surface water quality in Qaroun Lake, with the best R² values, the lowest RMSE and MAD values, and the maximum slope values. From the calibrating to the validating periods, the SVMR model demonstrated only a minimal decline in performance (R², RMSE, MAD, and Acc) for all four WQIs. Therefore, the use of physicochemical parameters and water quality indices, supported by GIS techniques, multivariate modelling, and machine learning, is a useful and practical method for determining the quality of surface water and its progression.