Applying Factor Analysis and the CCME Water Quality Index for Assessing Groundwater Quality of an Aegean Island (Rhodes, Greece)

Globally, water quality indices (WQIs) are beneficial for evaluating groundwater and surface water quality. The Canadian Council of Ministers of the Environment Water Quality Index (CCME-WQI) was combined with the parametric values given by Directive 98/83/EC to investigate the suitability of groundwater resources for human consumption on Rhodes Island. Chloride (Cl⁻), pH, calcium (Ca²⁺), electrical conductivity (CND), carbonate (CO₃²⁻), bicarbonate (HCO₃⁻), potassium (K⁺), magnesium (Mg²⁺), sulfate (SO₄²⁻), sodium (Na⁺), nitrate (NO₃⁻), nitrite (NO₂⁻), ammonium (NH₄⁺), and phosphate (PO₄³⁻) were included in the dataset applied in this study. Statistical analysis, a GIS database, and WQI estimation were used to evaluate the groundwater resources of the study area. All studied groundwater parameters have mean and median values lower than the corresponding parametric values established by Directive 98/83/EC. The high CND values (up to 2730 µS cm⁻¹) in groundwater collected from Rhodes' coastal aquifers indicate a direct relationship with seawater intrusion. The CCME-WQI classifies the groundwater samples for most monitoring stations on Rhodes aquifers as "excellent", Class 5, for 2019 and 2020. The findings of this study may be helpful for scientists and stakeholders monitoring the study area and applying measures to protect the groundwater resources.


Introduction
Evaluating water quality and analyzing potential linkages between element concentrations in water and human health are essential components of the laws and guidelines issued by many researchers and authorities [1-8]. Although numerous scientists and engineers have conducted meticulous research on a wide range of water quality issues, mature and scientifically sound procedures are still lacking in several areas [9,10]. One of these issues is the documentation of a universally acknowledged single value or score that expresses the water quality of an aquifer in terms of a water quality index (WQI). WQIs are intended to be valuable and persuasive tools for preventing the deterioration of water quality and for supporting the sustainable development of water resources. WQIs allow a better understanding of the water quality status by calculating a single dimensionless value. This makes it feasible to evaluate, express, and communicate the overall quality of any water source, even to non-experts [11].
The topic of water quality is broad and comprehensive [1,12-15]. Gradual deterioration of water quality is one of the primary impacts of agricultural, industrial, and household expansion and other land use changes [2,4,13,16]. Well-structured planning of water resources is essential for protecting water quality and assessing and managing groundwater and surface water contamination.
In the present survey, a research study based on multivariate statistics and a WQI is performed to assess groundwater quality of Rhodes Island, Greece.
Several recent studies have applied WQIs and multivariate techniques to water quality assessment: (a) Alexakis [1] proposed a new meta-evaluation approach of two widely used WQIs for application in groundwater quality assessment; (b) Panagopoulos et al. [17] implemented the CCME-WQI (Canadian Council of Ministers of the Environment Water Quality Index) for the evaluation of the physicochemical quality of Greek rivers; (c) Alqarawy et al. [18] combined, among others, physicochemical parameters and WQIs to delineate water quality and controlling factors using multivariate modeling techniques; (d) Haider et al. [19] discussed the spatiotemporal water quality variations in smaller water supply systems by applying a modified CCME-WQI from groundwater resources to distribution networks; (e) Molekoa et al. [20] employed hydrogeochemical analysis of groundwater samples to calculate a WQI and evaluated the factors governing water quality evolution in the Mokopane area (South Africa); (f) Shafique et al. [21] applied multivariate and geospatial monitoring of water and soil quality to investigate the impact on the growth pattern of planted mangroves at the Indus delta; (g) Aldrees et al. [22] presented the development of a multi-expression programming based predictive model for water quality parameters and WQI.
Most of the WQIs in the literature include steps for sub-indexing and weighting [23]. However, the CCME-WQI skips these steps and performs the final aggregation by using the parameter measurements directly within fixed mathematical functions [24]. The CCME-WQI has become one of the most widely used indices and has been applied to all kinds of water bodies. Compared to other water quality indices, the CCME-WQI has several benefits: it can be used with anywhere from only four to a very large number of water parameters, it is flexible in the selection of water quality standards, and it can still be used if some data are missing [24]. Additionally, the index is practically independent of a specific set of quality parameters, so it can be used with almost any combination of parameters [24].
The main objectives of the present study are to apply the CCME-WQI to a dataset available for all the aquifers of an Aegean island in order to determine their groundwater quality over a 2-year period and to evaluate water suitability for drinking purposes. To the author's knowledge, this is the first effort to provide a representative application of the CCME-WQI on an Aegean island scale and to investigate possible variations in its performance during two sampling campaigns (2019-2020).
The novelty of the present study lies mainly in the implementation of a well-known index for the first time on groundwater resources of an Aegean island with a high variation in topographical, climatic, geological, and hydrological settings and anthropogenic activities. The application of the CCME-WQI on an Aegean island may attract the attention of many EU researchers, and even policymakers, to the results of a globally used WQI. The results of this study could prove helpful for those who would like to identify how the examined index behaves. Moreover, the findings of this study may initiate the discussion that it is important to install and set up a monitoring network to evaluate groundwater deterioration of the aquifers on Rhodes Island by applying a WQI, because it turns raw water quality data into information that is coherent and convenient for policymakers, stakeholders, and inhabitants. Furthermore, since the climate crisis is most likely to accelerate the mixing of seawater in coastal aquifers, a monitoring network in the study area can contribute to the sustainable management of the groundwater resources of Rhodes Island.

Study Area
Rhodes Island is situated in the Aegean Sea, Greece (Figure 1). Water resource management is a significant concern on the island, particularly during the summer, when water demand increases because of the island's large number of visitors. The total population of Rhodes Island is 115,490 [25], whereas the total number of summer visitors exceeds 3,300,000 [26].
The CORINE land cover classes were acquired from the CORINE/land cover Copernicus application [31] and then added to the GIS (geographic information system) database as a separate polygon layer. The land on Rhodes Island is characterized by agricultural activities (Figure 2). A significant part of the land on Rhodes Island is covered by forest and semi-natural areas (Figure 2).

Primary Data and Data Treatment
A database was developed using all available information for 2019 and 2020 by collecting data from the Greek Ministry of Environment and Energy database [28]. It is important to note that data are available only for the wet period of each year. In the Mediterranean climate type, the hydrological year comprises a wet and a dry period, with the wet period lasting from October to March and the dry period from April to September. The locations of the monitoring stations are presented in Figure 1. The monitoring stations, aquifer codes, and water uses on Rhodes Island are tabulated in Table 1. The primary data applied in this study were assembled by the Institute of Geology and Mineral Exploration. The dataset used in this study included the following parameters: pH, chloride (Cl⁻), calcium (Ca²⁺), electrical conductivity (CND), carbonate (CO₃²⁻), bicarbonate (HCO₃⁻), potassium (K⁺), magnesium (Mg²⁺), sulfate (SO₄²⁻), sodium (Na⁺), nitrate (NO₃⁻), nitrite (NO₂⁻), ammonium (NH₄⁺), and phosphate (PO₄³⁻). The Greek Government Gazette II 1635 of 9 June 2016 contains more technical information about field methods and water chemistry studies [32]. Microsoft® Excel (Redmond, WA, USA) and IBM® SPSS v.28 (International Business Machines Corporation; Statistical Product and Service Solutions; Armonk, NY, USA) for Windows were utilized for data processing. Simplified digital maps were developed using the ArcView 10.4 GIS software (ESRI®; Environmental Systems Research Institute; Redlands, CA, USA).
The Rhodes Island dataset of groundwater quality parameters was processed by R-mode factor analysis, applying the Varimax-raw rotational approach with Kaiser normalization [33], to determine the common origin of the parameters and the inter-parameter relationships.
Kaiser [33] suggested that the variance of the squared loadings across a factor be maximized rather than the variance of the squared loadings for the variables. According to Kaiser [33], the resulting criterion to be maximized is (Equation (1)):
$$V_f = \frac{v \sum_{j=1}^{v} \left(p_{jf}^{2}\right)^{2} - \left(\sum_{j=1}^{v} p_{jf}^{2}\right)^{2}}{v^{2}} \tag{1}$$
where $v$ is the number of variables, the $p_{jf}$'s are the loadings of variable $j$ on factor $f$, and $V_f$ is the variance of the squared loadings for factor $f$. Since factor analysis classifies or combines all the highly correlated variables into the same factor, the WQI is included in the factor analysis to identify which parameters are "more important" than the other water quality parameters. The multivariate statistical analysis was applied to 9 parameters: pH, CND, Cl⁻, NO₃⁻, NO₂⁻, NH₄⁺, SO₄²⁻, Na⁺, and CCME-WQI.
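The KMO (Kaiser-Meyer-Olkin) test evaluates whether data are suited to factor analysis [33]. The test measures sampling adequacy for the complete model as well as for each variable. The statistic measures the proportion of variance among the variables that may be common variance (Equation (2)):
$$KMO_j = \frac{\sum_{k \neq j} r_{jk}^{2}}{\sum_{k \neq j} r_{jk}^{2} + \sum_{k \neq j} p_{jk}^{2}} \tag{2}$$
where $p_{jk}$ is the partial correlation, and $r_{jk}$ is the correlation between the variable in question and another variable. The higher this proportion, and hence the KMO value, the better suited the data are to factor analysis.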
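As an illustration of how the KMO statistic can be obtained from a correlation matrix, the following minimal sketch computes the partial correlations from the inverse of the correlation matrix and then forms the ratio of summed squared correlations to summed squared correlations plus summed squared partial correlations. The helper names `invert` and `kmo` and the example matrix are hypothetical; this is not the SPSS routine used in the study.

```python
import math

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def kmo(corr):
    """Return (overall KMO, list of per-variable KMO values)."""
    n = len(corr)
    inv = invert(corr)
    # Partial correlation of j and k, controlling for all other variables.
    partial = [[-inv[j][k] / math.sqrt(inv[j][j] * inv[k][k]) for k in range(n)]
               for j in range(n)]
    # Squared off-diagonal correlations and partial correlations.
    r2 = [[corr[j][k] ** 2 if j != k else 0.0 for k in range(n)] for j in range(n)]
    p2 = [[partial[j][k] ** 2 if j != k else 0.0 for k in range(n)] for j in range(n)]
    per_variable = [sum(r2[j]) / (sum(r2[j]) + sum(p2[j])) for j in range(n)]
    total_r2 = sum(map(sum, r2))
    total_p2 = sum(map(sum, p2))
    return total_r2 / (total_r2 + total_p2), per_variable

# Hypothetical correlation matrix for three parameters (not data from this study).
example = [[1.0, 0.5, 0.5],
           [0.5, 1.0, 0.5],
           [0.5, 0.5, 1.0]]
overall, by_variable = kmo(example)
print(round(overall, 3), [round(x, 3) for x in by_variable])
```

The pure-Python matrix inversion keeps the sketch free of dependencies; for a real dataset, a numerical library would be the more robust choice.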
A varimax rotation simplifies the expression of a particular subspace in terms of just a few major items each [33,34]. The actual coordinate system is unchanged; it is the orthogonal basis that is rotated to align with those coordinates. The subspace found by principal component analysis or factor analysis is expressed as a dense basis with many non-zero weights, which makes it difficult to interpret. Preserving orthogonality requires a rotation that leaves the subspace invariant. Varimax maximizes the variance of the squared loadings (squared correlations between variables and factors). Intuitively, this is achieved if [33,34]: (a) a variable has a high loading on a single factor but near-zero loadings on the remaining factors and (b) a factor is composed of only a few variables with very high loadings on this factor while the remaining variables have near-zero loadings on this factor.
Varimax rotation brings the loading matrix closer to a simple structure if these conditions hold (as much as the data allow). Varimax seeks a basis that most economically represents each individual, meaning that each individual can be described by a linear combination of only a few basis functions [33,34].
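The rotation described above can be sketched in a few lines. The example below is a minimal illustration of raw varimax by successive planar rotations of factor pairs, in which each pairwise angle is chosen to maximize the variance of the squared loadings summed over the factors. It omits Kaiser normalization and is not the SPSS procedure applied in this study; the function names and the loading matrix are hypothetical.

```python
import math

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    v = len(loadings)
    total = 0.0
    for f in range(len(loadings[0])):
        sq = [row[f] ** 2 for row in loadings]
        total += (v * sum(s * s for s in sq) - sum(sq) ** 2) / v ** 2
    return total

def varimax(loadings, max_sweeps=50, tol=1e-10):
    """Rotate a loading matrix (list of rows) by pairwise planar rotations."""
    L = [list(row) for row in loadings]
    v, k = len(L), len(L[0])
    for _ in range(max_sweeps):
        largest = 0.0
        for a in range(k - 1):
            for b in range(a + 1, k):
                u = [row[a] ** 2 - row[b] ** 2 for row in L]
                w = [2.0 * row[a] * row[b] for row in L]
                A, B = sum(u), sum(w)
                C = sum(ui * ui - wi * wi for ui, wi in zip(u, w))
                D = 2.0 * sum(ui * wi for ui, wi in zip(u, w))
                # atan2 selects the angle that maximizes (not minimizes) the criterion.
                phi = 0.25 * math.atan2(D - 2.0 * A * B / v,
                                        C - (A * A - B * B) / v)
                largest = max(largest, abs(phi))
                c, s = math.cos(phi), math.sin(phi)
                for row in L:
                    x, y = row[a], row[b]
                    row[a], row[b] = c * x + s * y, -s * x + c * y
        if largest < tol:  # no pair needed further rotation
            break
    return L

# Hypothetical simple structure, deliberately rotated by 30 degrees.
simple = [[0.9, 0.0], [0.8, 0.0], [0.0, 0.7], [0.0, 0.6]]
t = math.radians(30.0)
unrotated = [[x * math.cos(t) - y * math.sin(t), x * math.sin(t) + y * math.cos(t)]
             for x, y in simple]
rotated = varimax(unrotated)
print([[round(value, 3) for value in row] for row in rotated])
```

In this constructed case each variable ends up loading on a single factor, which is the "simple structure" that conditions (a) and (b) describe. In general, rotated factors are only determined up to column order and sign.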
According to Davis [34], the application of R-mode factor analysis to geochemical data was performed in the following steps: (a) Computation of variances/covariances. Variances/covariances were calculated using the formula given as Equation (4). (c) Determination of the "correct" number of factors by applying a combination of the standard criteria proposed by Davis [34]. To accomplish a "simple structure," the rotation of the factor axes was calculated. In a simple structure, the correlation coefficients between variables and factors (the loadings in factor analysis) are close to 0 or ±1. The higher the absolute factor loadings, the better the factors characterize the variables.

Application of WQI
The WQI developed by the CCME [24] was utilized to analyze Rhodes Island's groundwater. CCME-WQI employs a goal value (objective or guideline) for each parameter that should not be exceeded and three important aspects (factors) to calculate a single unitless number that represents overall water quality.
The CCME modified the British Columbia (BC) WQI to create an index, the CCME-WQI, that water agencies in many countries could use. The CCME-WQI does not use sub-indices, weights, or classical index aggregation [24]. It is built on three factors: (a) scope, or the number of variables in a dataset not meeting the water quality objectives; (b) frequency, or the number of times the objectives are not met; and (c) amplitude, or the amount by which the objectives are not met. The three factors are combined into a single number ranging from 0 to 100, where 0 indicates the worst water quality and 100 the best ("excellent") water quality [24]. The WQI classes, ratings, and boundaries applied in this study are tabulated in Table 2.
Table 2. CCME-WQI classes, ratings, boundaries, and description of water quality [24].
According to CCME [24], the CCME-WQI is calculated with Equations (5)-(11). In these equations, the term "guidelines" means "objectives" or "target values."
F1 (scope) represents the percentage of failed parameters relative to the total number of measured parameters:
$$F_1 = \frac{\text{number of failed parameters}}{\text{total number of parameters}} \times 100 \tag{5}$$
F2 (frequency) represents the percentage of individual tests that do not meet the guidelines ("failed tests"):
$$F_2 = \frac{\text{number of failed tests}}{\text{total number of tests}} \times 100 \tag{6}$$
F3 (amplitude) is calculated in three steps. An excursion expresses the number of times by which a failed test value is above (or below, if the guideline is a minimum) the guideline. When the i-th test value must not exceed the objective of the j-th parameter [24]:
$$\text{excursion}_i = \frac{\text{failed test value}_i}{\text{objective}_j} - 1 \tag{7}$$
When the test value must not fall below the objective [24]:
$$\text{excursion}_i = \frac{\text{objective}_j}{\text{failed test value}_i} - 1 \tag{8}$$
Summing the excursions of the individual tests and dividing by the total number of tests (both those meeting and those not meeting the guidelines) yields the overall amount of non-compliance. This quantity, the normalized sum of excursions (nse), is estimated as [24]:
$$nse = \frac{\sum_{i=1}^{n} \text{excursion}_i}{\text{total number of tests}} \tag{9}$$
An asymptotic function then scales nse to the range between 0 and 100 to give F3:
$$F_3 = \frac{nse}{0.01\,nse + 0.01} \tag{10}$$
After obtaining the factors, the CCME-WQI is estimated by combining F1, F2, and F3 as follows [24]:
$$\text{CCME-WQI} = 100 - \frac{\sqrt{F_1^{2} + F_2^{2} + F_3^{2}}}{1.732} \tag{11}$$
The divisor 1.732 normalizes the outcome to the range from 0 to 100, because each factor can reach 100 and the vector length can therefore reach at most √(3 × 100²) = 173.2. A score of 0 is the "worst" and 100 is the "best" water quality [24].
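To make the calculation concrete, the following minimal sketch implements the standard CCME formulation of F1, F2, F3, and the final index. It is not the CCME software tool used in this study. The function names, the input layout, and the measurements are hypothetical, and the rating bands (excellent 95-100, good 80-94, fair 65-79, marginal 45-64, poor 0-44) are assumed from the general CCME documentation, since the boundaries of Table 2 are not reproduced here.

```python
import math

def ccme_wqi(tests):
    """Compute (F1, F2, F3, WQI) from a list of tests.

    Each test is a tuple (parameter, value, objective, kind), where kind is
    "max" (value must not exceed the objective) or "min" (value must not
    fall below it).
    """
    parameters = {name for name, _, _, _ in tests}
    failed_parameters = set()
    failed_tests = 0
    excursion_sum = 0.0
    for name, value, objective, kind in tests:
        failed = value > objective if kind == "max" else value < objective
        if failed:
            failed_parameters.add(name)
            failed_tests += 1
            ratio = value / objective if kind == "max" else objective / value
            excursion_sum += ratio - 1.0
    f1 = 100.0 * len(failed_parameters) / len(parameters)  # scope
    f2 = 100.0 * failed_tests / len(tests)                 # frequency
    nse = excursion_sum / len(tests)                       # normalized sum of excursions
    f3 = nse / (0.01 * nse + 0.01)                         # amplitude
    wqi = 100.0 - math.sqrt(f1 ** 2 + f2 ** 2 + f3 ** 2) / 1.732
    return f1, f2, f3, wqi

def rating(wqi):
    """Rating bands assumed from the general CCME documentation."""
    for lower, label in [(95, "excellent"), (80, "good"), (65, "fair"), (45, "marginal")]:
        if wqi >= lower:
            return label
    return "poor"

# Hypothetical measurements in mg/L for one station (not data from this study).
tests = [
    ("Cl", 200.0, 250.0, "max"), ("Cl", 500.0, 250.0, "max"),
    ("NO3", 10.0, 50.0, "max"), ("NO3", 20.0, 50.0, "max"),
    ("Na", 100.0, 200.0, "max"), ("Na", 150.0, 200.0, "max"),
    ("SO4", 100.0, 250.0, "max"), ("SO4", 120.0, 250.0, "max"),
]
f1, f2, f3, wqi = ccme_wqi(tests)
print(round(f1, 2), round(f2, 2), round(f3, 2), round(wqi, 2), rating(wqi))
```

In this example a single chloride test fails with an excursion of 1, so one of four parameters and one of eight tests fail; the index drops below 100 even though seven tests comply.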
In this study, the CCME-WQI score was derived with a software tool that was developed by the CCME and is freely available to the public [24]. All the objective values adopted in this study were taken from Council Directive 98/83/EC [35] (Table 3).
Table 3. Parameters and objective values used to classify groundwater quality on Rhodes Island.

Results and Discussion
The variation of water quality parameters in the aquifers of the study area is described by three factors, which account for 83.2% and 64.2% of the total variance of the dataset for the wet periods of 2019 and 2020, respectively. The varimax-rotated factor loadings and the proportion of variance explained are tabulated in Tables 4 and 5.
The analysis of the 2019 dataset revealed the following findings (Table 4): (a) Factor 1, explaining 48.6% of the total variance, is a dipolar factor with high positive loadings (greater than +0.875) for CND, Cl⁻, SO₄²⁻, and Na⁺ and a high negative loading for CCME-WQI (−0.913); (b) Factor 2 accounts for 17.8% of the total variance and has high positive loadings for pH and NO₂⁻ (greater than +0.706) and a moderate negative loading for NH₄⁺ (−0.680); and (c) Factor 3 explains 16.8% of the total variance and shows a high positive loading for NO₃⁻ (+0.945).
The treatment of the 2020 dataset produced the following results (Table 5): (a) Factor 1, explaining 32.1% of the total variance, is a dipolar factor with high positive loadings (greater than +0.777) for CND, Cl⁻, and SO₄²⁻, a moderate positive loading for Na⁺ (+0.530), and a high negative loading for CCME-WQI (−0.711); (b) Factor 2 accounts for 17.3% of the total variance and has a high positive loading for NO₃⁻ (+0.950); and (c) Factor 3, explaining 14.7% of the total variance, presents a high positive loading for NH₄⁺ (+0.718).
Factor 1 is the most significant factor because it explains the largest proportion of the total variance in both datasets (Tables 4 and 5). The variability of CND, Cl⁻, SO₄²⁻, and Na⁺ in Factor 1 can be attributed to the mixing of seawater with freshwater. The close relationships between Cl⁻, SO₄²⁻, and Na⁺ can be attributed to the presence of these ions in seawater.
Groundwater salinization in the agricultural areas of Rhodes Island located away from the coast can be attributed to "irrigation return-flow" (Figures 2 and 3). In other words, dissolved salts in irrigation water are concentrated by evapotranspiration and finally infiltrate from the soil into the aquifers of Rhodes Island. Similar processes of groundwater salinization directly related to agricultural areas are also reported by Suarez et al. [36] in the Upper Rio Grande (USA) and Foster et al. [37] in Almeria (Spain) and Punjab (Pakistan). Moreover, the CND value increases with the content of dissolved salts. Factor 1 expresses the total salt content of the groundwater samples determined by these parameters. Thus, Factor 1 may be referred to as the "salinity factor" for both examined datasets. Similar results derived from factor analysis were also reported by many researchers who studied seawater intrusion into coastal aquifers [38,39]. Moreover, Vandarakis et al. [40], who evaluated the coastal vulnerability of Rhodes Island to the ongoing sea level rise, indicated high to very high vulnerability for 40% of the entire coastline length. Factor 1 reveals an inverse relationship between the group CND, Cl⁻, SO₄²⁻, and Na⁺ and the CCME-WQI in both sampling campaigns, indicating that these water quality parameters significantly influence the CCME-WQI (Tables 4 and 5).
None of the mean and median values of the examined parameters in the groundwater of the study area exceeded the corresponding parametric values (PVs) given by Directive 98/83/EC [35] (Table 6), denoting that most water samples are suitable for human consumption.
The content of Cl⁻ in groundwater samples collected from three (W4, W5, and W7) and two (W4 and W5) monitoring stations during the 2019 and 2020 sampling periods, respectively, exceeds the PV given by the EC [35]. The Na⁺ concentration in groundwater collected from monitoring stations W5, W8, and W22 during 2019 is higher than the PV given by the EC [35]. The Na⁺ content in groundwater collected from W5 and W8 during 2020 exceeds the corresponding PV established by the EC [35]. Only one monitoring station (W7) in 2019 and one monitoring station (W4) in 2020 presented NO₃⁻ contents in groundwater higher than the PV suggested by the EC [35] (Table 6; Figure 4). The groundwater collected from monitoring station W5 in both sampling periods presented SO₄²⁻ concentrations higher than the PV proposed by the EC [35]. Only one groundwater sample, collected from W5 in 2019, exceeded the PV for CND established by the EC [35] (Table 6; Figure 3). The pH value in groundwater collected from monitoring station W6 during 2020 is higher than the corresponding PV established by the EC [35].
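The station-by-station comparison against the parametric values can be expressed as a simple screening step. The sketch below is an illustration only: the parametric values are assumed from Directive 98/83/EC (chloride 250 mg/L, sulfate 250 mg/L, sodium 200 mg/L, nitrate 50 mg/L, nitrite 0.5 mg/L, ammonium 0.5 mg/L, conductivity 2500 µS/cm, pH between 6.5 and 9.5) and should be checked against the Directive, and the station labels and measurements are hypothetical, not values from Table 6.

```python
# Parametric values (PVs) as (lower limit, upper limit); None means no limit.
# Assumed from Directive 98/83/EC, not copied from Table 3 of this study.
PARAMETRIC_VALUES = {
    "CND": (None, 2500.0),  # µS/cm
    "pH": (6.5, 9.5),
    "Cl": (None, 250.0),    # mg/L
    "SO4": (None, 250.0),   # mg/L
    "Na": (None, 200.0),    # mg/L
    "NO3": (None, 50.0),    # mg/L
    "NO2": (None, 0.5),     # mg/L
    "NH4": (None, 0.5),     # mg/L
}

def exceedances(sample):
    """Return the sorted names of the parameters in one sample that violate their PV."""
    flagged = []
    for name, value in sample.items():
        lower, upper = PARAMETRIC_VALUES[name]
        too_low = lower is not None and value < lower
        too_high = upper is not None and value > upper
        if too_low or too_high:
            flagged.append(name)
    return sorted(flagged)

# Hypothetical station records (not measurements from this study).
stations = {
    "A": {"CND": 2730.0, "Cl": 610.0, "Na": 240.0, "NO3": 12.0, "pH": 7.6},
    "B": {"CND": 820.0, "Cl": 95.0, "Na": 60.0, "NO3": 58.0, "pH": 7.9},
    "C": {"CND": 640.0, "Cl": 70.0, "Na": 45.0, "NO3": 8.0, "pH": 9.7},
}
report = {station: exceedances(sample) for station, sample in stations.items()}
print(report)
```

A screening of this kind lists which parameters fail at each station; the CCME-WQI then condenses the same comparisons into a single score.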
The spatial distribution of NO₃⁻ content in the aquifers on Rhodes Island is depicted in Figure 4. The elevated NO₃⁻ concentration in groundwater of the study area is directly related to agricultural land use and especially to the application of nitrogen fertilizers in cultivated areas (Figures 2 and 4). Many researchers have identified high NO₃⁻ concentrations in groundwater due to the leaching of fertilizers in many regions worldwide [3,41-44]. Moreover, NO₃⁻ is a common groundwater pollutant that is symptomatic of anthropogenic sources, such as agricultural activities and domestic sewage [42,44], and elevated concentration of NO₃⁻ is typical in aquifers [1,41,45].
The spatial variation of the CND value in the groundwater monitoring stations of the study area is presented in Figure 3.
The high CND values are recorded mainly in the coastal aquifer systems of Rhodes Island, suggesting a direct link with seawater intrusion in these systems (Figure 3). Furthermore, Chandrajith et al. [46] reported that the mean value of CND in groundwater samples collected in a coastal region of Sri Lanka was 1260 µS cm⁻¹ and served as the first indication of seawater intrusion. According to scientific reports, the Mediterranean basin will be significantly impacted by anticipated effects of the climate crisis, such as increased air temperatures, decreased precipitation, and rising sea levels [47-49]. The surface temperature is expected to rise by 2.5 °C by 2050, and precipitation is expected to fall by 10.5% [47,48]. Furthermore, existing freshwater in the Mediterranean region has been depleted over the last few decades [47]. Elevated CND values in irrigation water are among the primary threats to the agricultural sector [50,51]. According to the IPCC [52], global warming will cause an increase in sea level, which will, in turn, affect seawater intrusion into coastal aquifers [53,54].
The spatial variation of the quality class in the study area derived by the application of the CCME-WQI is illustrated in Figure 5. The CCME-WQI classifies the majority of the monitoring stations in the Rhodes aquifers into the highest class (Class 5) for both sampling campaigns in 2019 and 2020 (Figure 5). Only three monitoring stations (W4, W5, and W6) show differences in the classification derived by the CCME-WQI (Figure 5). The CCME-WQI has also been applied in many regions to record the quality status of surface water and groundwater [1,3,55-57]. Moreover, Chandrajith et al. [46] also applied a WQI as a vulnerability indicator to highlight seawater intrusion in sedimentary aquifers of Sri Lanka.

Conclusions
All of the examined parameters in the groundwater of the study area presented mean and median values that are lower than the corresponding parametric values established by Directive 98/83/EC. The high CND values are primarily observed in the coastal aquifer systems on the island of Rhodes, indicating a direct relationship between seawater intrusion and these aquifer systems. The CCME-WQI classifies the majority of Rhodes aquifer monitoring stations as belonging to the highest category (Class 5) for both sampling campaigns in 2019 and 2020. The most critical factor, which accounts for the greatest proportion of total variance in both water quality datasets, includes CND, Cl⁻, SO₄²⁻, and Na⁺. The values of CND, Cl⁻, SO₄²⁻, and Na⁺ in the groundwater of Rhodes Island are mainly attributed to the process of seawater intrusion. The statistical analysis results of this study revealed that the CCME-WQI is primarily controlled by CND, Cl⁻, SO₄²⁻, and Na⁺. The linking of factor analysis with the CCME-WQI is a helpful tool for monitoring the groundwater quality on Rhodes Island. It is important to install and set up a monitoring network to evaluate groundwater deterioration of the aquifers on Rhodes Island by applying a WQI, because it turns raw water quality data into information that is coherent and convenient for policymakers, stakeholders, and inhabitants. Since the climate crisis is most likely to accelerate the mixing of seawater in coastal aquifers, a monitoring network in the study area can contribute to the sustainable management of the groundwater resources of Rhodes Island.