Statistical Analysis for Water Quality Assessment: A Case Study of Al Wasit Nature Reserve

Abstract: This study presents a comprehensive data analysis using univariate and multivariate statistical techniques as a tool to establish a baseline for the assessment of water quality parameters in environmental compartments. The Al Wasit Nature Reserve is a hypersaline wetland in the UAE whose water parameters fluctuate spatially, as water flows above ground and ponds form in deeper areas, and over the year due to the arid climate and seasonality. Water samples were collected at fifteen sites along the hypersaline wetland over three periods during the months of February to March 2021 as temperatures started to rise with the oncoming summer. Water quality parameters, including the temperature, pH, turbidity, dissolved oxygen (DO), oxidation-reduction potential (ORP), electrical conductivity (EC), chemical oxygen demand (COD), chloride, ammonia, and nitrates, were measured. The results of the data analysis were used to divide the sites into three groups with similar water quality characteristics. Kruskal-Wallis tests revealed significant differences between the three identified clusters in eight of the evaluated parameters, with only the nitrate and dissolved oxygen concentrations not differing significantly. It was found that one of the three clusters (cluster 1) performed better than the other two for most of the studied parameters. The results of this study demonstrate the applicability and the potential time and cost savings of data analysis tools for long-term data monitoring in the wetland and other environmental systems worldwide.


Introduction
Wetlands are dynamic ecosystems covering approximately seven percent of the earth's total surface [1]. A wetland is an area of land typically saturated with water standing above a soil surface [2]. Regional and local variations in topography, soil, climate, hydrology, chemistry, vegetation, and human activity influence wetlands [3]. Hence, they are classified into different categories depending on their geomorphic settings, water sources, and hydrodynamics. Wetlands can be natural, artificial, stagnant, flowing, brackish, and salty [1]. In a wetland, the water level can vary significantly between permanently flooded and seasonally flooded areas with saturated soils [4]. Wetland habitats are important for birds because they meet their needs for feeding, nesting, and roosting [5]. In general, wetlands have a diverse range of life and provide shelter for a variety of animals and plants.
Healthy, natural wetlands are crucial to human life. Global wetlands cover over 12.1 million km² (about 1.21 billion hectares), of which 93% are inland systems and 7% are marine and coastal. The most significant areas of wetlands are concentrated in Asia, with approximately 32% of the [...]
[...] techniques, such as principal component analysis (PCA), have previously been used in the interpretation of the results and to reduce the dimensionality of a multivariate dataset while maintaining its original structure [17]. In addition, the application of various statistical techniques such as cluster analysis, discriminant analysis, and ANOVA helps in the interpretation of complex datasets to better understand the water quality of a system, enabling the identification of possible factors affecting the water system and providing a valuable tool for water resource management as well as a quick solution to pollution problems [18,19]. Nonparametric multivariate statistical approaches have also proved beneficial in assessing the different levels of some pollutants, such as arsenic, in the aquifers of the Calabria region [20]. A geochemical modeling approach was also used to study arsenic pollution by investigating water-rock interactions and applying reaction path modeling [21]. In this study, various univariate and multivariate data analysis techniques were applied to data from the Al Wasit Nature Reserve in the United Arab Emirates to investigate how water quality parameters change at different sampling sites and to determine which variables are most responsible for the variations in water quality. The findings of this study will benefit scientists by facilitating future sample collection and reducing the time and effort required for these activities.

Literature Review
The relevant literature is extensive and spans several areas of research and practice. This section reviews water quality assessment methods, particularly statistical data analysis, with a focus on the Al Wasit Nature Reserve in the UAE, and identifies the gaps in the body of research knowledge and the motivations for this research.

Importance of Wetlands
Wetlands can effectively reduce the amount of pollutants and effluents before discharge to rivers and other resources. Some wetlands are natural, while others are constructed (artificial). The performance of wetlands in improving water quality and reducing runoff pollution has been described in several studies [22,23]. In addition, wetlands provide diverse watershed functions, including important wildlife habitats, floodwater storage, groundwater recharge, and water filtration. The global decline in wetlands has been related to amphibian reduction, the loss of reptile and invertebrate habitat, and changes in hydrologic states [24-26]. For these reasons, wetlands have been considered in global policies and targets. For example, they are considered an important element in the 2030 Agenda for Sustainable Development because they are necessary for achieving many of the 17 United Nations Sustainable Development Goals, namely (i) SDG 6, which focuses on water and sanitation; (ii) SDG 14, which motivates the protection of coastal and marine areas; and (iii) SDG 15, which promotes the sustainable use and protection of inland freshwater ecosystems and their services [27]. In addition, wetlands are of great relevance to the "Aichi Biodiversity Targets", which are part of the Strategic Plan for Biodiversity (2011-2020) [28]. Many of these targets focused on stopping ecosystem loss, including Target 5, which aimed to halt the loss of natural habitats by 2020. Other targets aimed to preserve inland, coastal, and marine water and promote the sustainable use of aquatic species and the management of aquaculture [29]. Moreover, the UN Convention to Combat Desertification set a target to halt land degradation, which has a direct impact on wetlands such as peatlands and rivers [30].
Wetlands are also part of several international agreements, such as the 2015 Paris Agreement, which called on countries to include wetland protection and management in their nationally determined contributions to addressing climate change with nature-based solutions [31]. Accordingly, it is important to understand the risks and challenges associated with wetlands in order to effectively set regulations to mitigate risks and achieve the aforementioned targets.
Wetland assessment and monitoring are essential to understanding how wetlands are evolving over the long term, how species richness and water quality are changing over time, and how they are deteriorating. Natural wetlands are in long-term decline worldwide. Both inland and marine/coastal wetlands declined by about 35% from 1970 to 2015. In contrast, man-made wetlands doubled during this period and account for 12% of today's wetlands. Nevertheless, this increase in man-made wetlands has not compensated for the loss of natural wetlands [6]. Consequently, wetland-dependent species are expected to decline, raising concerns of extinction. In this context, the International Union for Conservation of Nature Red List assesses the level of threat of extinction of animal and plant species, showing that 25% of species that depend on inland wetlands are threatened with extinction. It has also shown that inland species dependent on rivers and streams are more threatened globally than species in swamps and lakes. In addition, the risk of extinction is higher for species in inland wetlands than for their terrestrial counterparts, and wetland-dependent species are most at risk in tropical areas [32]. According to the Wetland Living Planet Index, wetland-dependent species declined by 39% between 1970 and 2011 [24]. Ultimately, water quality is a crucial concern for human well-being [33], yet the signs indicate that it is deteriorating over time. Although a decline in water quality leads to wetland degradation, wetlands play a significant role in enhancing water quality through ecosystem-regulating services [34]. The preceding discussion suggests that the main challenges facing wetlands are wetland degradation, declines in wetland-dependent species, and water quality degradation.
However, since this study aims to assess and analyze the water quality parameters in wetlands, specifically in the Al Wasit Wetland, the next section will focus more on the problem of water quality in wetlands and the main trends related to it.

Water Quality of Wetlands and Statistical Analysis Techniques
Many factors play important roles in affecting the water quality in wetlands, including the degree of wastewater treatment, the erosion of topsoil, nutrient loading and eutrophication, pathogen pollution, salinity, and thermal stress. Untreated wastewater is a major threat to water quality. Therefore, proper treatment should be carried out before the water enters wetlands. It was shown by [35] that the treatment of industrial and municipal wastewater in countries has a positive correlation with the country's income. On average, 70% of wastewater is treated in high-income countries, followed by 38% in upper-middle-income countries, while lower-middle-income countries treat about 28% of their wastewater, and only 8% of wastewater is treated in low-income countries. In addition, according to the United Nations World Water Development report [36], over 80% of the world's wastewater is discharged into wetlands without adequate treatment. This creates serious problems that negatively impact wetland water quality. The next challenge affecting water quality is topsoil erosion, which leads to nutrient loading and the eutrophication of wetlands, considered the greatest challenge to water quality. In 2017, the United Nations reported that by 2050, about one third of the population will be exposed to water with excessive nitrogen and phosphorus [37]. In addition, increased sedimentation may harm aquatic biodiversity, e.g., [38,39], while on the other hand, the retention of sediment behind dams would minimize sediment loading in coastal and deltaic areas, leading to the subsidence and loss of wetlands. Moreover, the preliminary results from the Global Water Quality Monitoring Program show that one third of all river stretches in Latin America, Africa, and Asia are already affected by severe pathogen pollution [40], which may lead to serious health issues and diseases such as cholera and giardiasis [27]. 
Another important factor in water quality is salinity, which can increase due to vegetation clearing and the irrigation of saline soils as a result of the infiltration of water through the soil profile [41]. The excessive extraction of groundwater and the rise in sea level promote saltwater intrusion [42]. Once they occur, soil salinization and groundwater salinity are considered permanent and irremediable problems [43]. Salinity plays an important role in this study, as the Al Wasit Wetland is considered to be a hypersaline wetland. A hypersaline lake is a body of water with a concentration of salt exceeding 35 g/L [44]. It has been reported that all wetlands are initially freshwater bodies and that they eventually move to more saline states due to natural and anthropogenic factors [45]. Additionally, according to Delaney et al. (2017), saline wetlands report conductivities ranging between 8.8 and 50 mS/cm, and those higher than 50 mS/cm are considered hypersaline. The eutrophication of wetlands is another important problem caused by nitrogen oxides, ammonia, and other compounds released from many sources, such as fertilizers and animal wastes in agriculture and landscaping and the release of treated sewage waters into the environment [46]. Finally, the thermal pollution of wetlands, associated with power plants and industry, reduces oxygen levels, alters the food chain, reduces biodiversity, and promotes the invasion of thermophilic species [47].
Since water is an invaluable and fundamental element for life in wetlands, providing a range of services to ecosystems and serving as a central habitat for various species, the water quality of surface and groundwater systems became an important field of research in the last century and has remained one of the most important research trends in recent decades. Statistical data analysis is an essential research tool for studying relationships, patterns, and trends among variables and has applications in various research fields, including the environmental field, where statistical analysis has played an important role in assessing the water quality in different regions. For example, a case study on the Fuji River Basin in Japan [17] illustrated the usefulness of multivariate statistical analysis techniques for the analysis and interpretation of complex datasets and for water quality assessment. The author performed a discriminant analysis to reduce the dimensionality of the large dataset and to identify a few indicator parameters responsible for large variations in water quality. In addition, factor, principal component, and cluster analyses were applied to surface water quality data in the Tahtali Basin in Turkey [18]. It was concluded that agricultural use and domestic discharges controlled the surface water quality. A cluster analysis was also performed to group the sites based on variable levels, and agricultural discharges were found to strongly influence the north and northeast areas of the region. These techniques were useful in helping managers understand the complex nature of water quality problems and set priorities for improving water quality. Moreover, Apollaro et al. [20] adopted a multivariate nonparametric technique to investigate the different concentration levels of some chemical elements in groundwater, such as arsenic, which is considered to be one of the most monitored pollutant elements worldwide due to its harmful effects on human health.
In addition, a geochemical modeling approach was used by [21] to address arsenic pollution by studying water-rock interactions and applying reaction path modeling to investigate the rock-to-water release of arsenic and the fate of this pollutant in crystalline aquifers. A nonmetric multidimensional scaling analysis was used by [19] to investigate the associations between abiotic and biotic parameters among and within three constructed wetlands in metropolitan Taipei. Further investigation was conducted in [19] to test the pollutant-removal performance of the wetland systems.
According to Gazzaz [48], a latent structure of the Kinta River (Malaysia) water quality dataset was identified using three different multivariate statistical techniques (factor analysis, cluster analysis, and discriminant analysis). The factor analysis identified the parameters responsible for the variations in water quality of the Kinta River. The cluster analysis grouped the monitoring sites into two clusters, one for sites with low water pollution and the other for sites with relatively high river pollution. The discriminant analysis confirmed these clusters and created a discriminant function to predict the membership of new or unknown samples in a cluster. These techniques showed the potential to appropriately reduce the number of water quality parameters and the number of monitoring stations for long-term monitoring. Furthermore, the important role of wetlands in decreasing pollutants and wastewater before discharging into rivers and other water sources encouraged the researchers in [49] to invest time to perform statistical techniques and a pattern analysis, which resulted in time and cost efficiency for data monitoring purposes in a free constructed wetland. The novelty of this study is that it serves as a baseline to demonstrate the use of statistical analysis techniques to evaluate wetland water quality data. Furthermore, it will play a role in determining which sites are representative and which require continuous monitoring.

Study Area
The Al Wasit Nature Reserve (Al Wasit), shown in Figure 1, is located in the northern Sharjah suburb in the United Arab Emirates (UAE). The reserve is considered one of the most ecologically diverse conservation sites, and it is a protected area for both captive and wild birds [50]. The reserve was previously known as Al Ramtha Lagoon, and it is categorized as a saltmarsh (sabkha), giving it a hypersaline nature [51]. The wetland is currently conserved by the Environment and Protected Areas Authority of the Sharjah Government. For the purpose of this study, we have characterized the different areas into ponds, including the big pond (shoreline and middle) occupying the biggest area near the shore and the middle deeper areas; followed by the middle pond, connecting the pond with above- and below-ground flows between the big and upper ponds, and the upper pond (the final water destination).

Sampling and Water Quality Assessments
Samples were collected from fifteen locations along the Al Wasit Wetland biweekly during the months of February to March 2021. The locations were selected based on the available surface water and were representative of the whole reserve. At least three observations were recorded per location. The studied parameters are shown in Table 1. An on-site water quality analysis was conducted using an HI 9829 multiparameter (Hanna Instruments, Singapore) to measure the temperature, pH, turbidity, dissolved oxygen (DO), oxidation-reduction potential (ORP), and electrical conductivity (EC) at various locations on the wetland. Additionally, surface water samples were collected simultaneously in Nalgene or polypropylene bottles and transported to the laboratory for an analysis of chemical oxygen demand (COD), chloride, ammonia, and nitrates. Ion-selective electrodes (ISEs) for ammonia, nitrates, and chloride were used to measure samples in the laboratory (HI 4101, HI 4113, and HI 4107 from Hanna Instruments, Singapore). Chloride samples were diluted due to their high salt content. The chemical oxygen demand (COD) was tested using HI 93754C-25 HR and an HI 83399 multiparameter photometer (Hanna Instruments, Rhode Island, United States). Guiding standards for hypersaline wetlands are not available. Hence, the marine water quality guiding standards published by the Ministry of Climate Change and Environment (MOCCAE) and typical concentrations found in similar hypersaline environments were used for comparison purposes and are shown in the fourth column of Table 1 (MOCCAE, 2020) [52].

Statistical Analysis
Several statistical techniques were utilized to assess the water quality. In particular, a Pearson correlation analysis was performed to examine the associations among all pairs of the evaluated parameters. A hierarchical cluster analysis was used to group the 15 sites into three clusters based on the water quality parameters. A discriminant analysis was also used to verify the quality of the cluster analysis results. Finally, to investigate the differences between the three clusters, a nonparametric ANOVA test was used, followed by multiple comparison procedures, to detect the source of the differences among the three clusters for each parameter.
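As an illustration of the first step listed above, the following is a minimal sketch of a pairwise Pearson correlation in plain Python. It is not the authors' SAS workflow; the function names (`pearson_r`, `correlation_matrix`) and the numbers in `demo` are hypothetical and chosen only to show the mechanics.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares about the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def correlation_matrix(table):
    """table maps a parameter name to its measurements (same sample order)."""
    names = list(table)
    return {(p, q): pearson_r(table[p], table[q]) for p in names for q in names}

# Hypothetical values for illustration only, NOT measurements from the study.
demo = {
    "EC": [30.0, 45.0, 80.0, 110.0],
    "chloride": [12.0, 18.0, 30.0, 37.0],
    "pH": [8.5, 8.4, 8.2, 8.1],
}
corr = correlation_matrix(demo)
```

Each entry `corr[(p, q)]` lies between -1 and 1; with real data, the same table would be built from the ten measured parameters.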

Results
The selected water quality parameters studied per site were recorded during each sample trip. On average, the temperature ranged from 25.33 to 30.52 °C, the pH ranged from 8.12 to 8.50, the electrical conductivity ranged from 27,541 to 116,547 µS/cm, and the DO ranged from 6.81 to 10.88 mg/L, among others. To understand the differences between the sampling locations and parameters, all measurements were further analyzed using Pearson's correlation analysis and two multivariate statistical techniques, cluster analysis (CA) and linear discriminant analysis (LDA).

Multivariate Statistical Analysis
To further understand the differences among the sampling locations, two multivariate statistical techniques, cluster analysis (CA) and linear discriminant analysis (LDA), were used. CA, which is an unsupervised multivariate technique, was used to cluster cases or objects based on their characteristics, resulting in clusters of objects with high internal homogeneity (within-cluster homogeneity) and low homogeneity between clusters [55]. LDA is a multivariate technique that can either predict or describe group differences [56]. This technique is frequently used to develop classification rules in order to assess the relative importance of variables in differentiating between groups [57].
In this study, complete linkage hierarchical clustering was performed on the dataset using the proc varclus procedure in the SAS software in order to cluster all 15 sampling sites based on the water properties, as described by the 10 measured parameters. The parameters were first normalized, and the Euclidean distance was used to measure the dissimilarity between each pair of individual observations. Complete linkage was then used to measure the distance between each pair of clusters. The CA results are depicted in the dendrogram in Figure 2, in which three clusters can be observed: sites 1, 2, and 3 form Cluster 1; sites 4, 5, and 6 form Cluster 2; and the remaining sites form Cluster 3. Furthermore, the three clusters correspond to locations on the map (see Figure 3): Cluster 1 is the upper pond, Cluster 2 is the middle pond, and Cluster 3 is the big pond. Within the big pond, sites 8, 9, 10, and 15 are located in the middle of the pond, while the remaining sites, 7 and 11-14, are located on its shore. The cluster analysis thus grouped the 15 sites into three groups according to the water quality characteristics of the tested samples.
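The clustering steps described above (normalization, Euclidean distances, complete linkage) can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the SAS procedure used in the study; the function names and the six two-parameter rows in `demo_sites` are hypothetical, and normalization is assumed here to mean z-scoring each parameter.

```python
from math import sqrt

def zscore_columns(rows):
    """Standardize each column to mean 0 and sample standard deviation 1.
    Assumes no column is constant (a zero deviation would divide by zero)."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [sqrt(sum((v - m) ** 2 for v in c) / (n - 1)) for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def euclidean(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def complete_linkage(rows, n_clusters):
    """Agglomerative clustering; returns clusters as sorted lists of row indices."""
    points = zscore_columns(rows)
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: cluster distance is the farthest pair of members.
                d = max(euclidean(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge the two closest clusters and drop the absorbed one.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical (EC, COD) values for six sites, NOT data from the study.
demo_sites = [
    [30.0, 50.0], [32.0, 55.0],      # low EC, low COD
    [75.0, 120.0], [78.0, 125.0],    # intermediate
    [110.0, 200.0], [115.0, 210.0],  # high EC, high COD
]
groups = complete_linkage(demo_sites, n_clusters=3)
```

Standardizing first keeps parameters with large numeric ranges, such as EC in µS/cm, from dominating the Euclidean distance.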
In order to verify the CA results, LDA was performed using the proc discrim procedure in the SAS Software as a supervised technique to classify the three clusters based on the 10 water quality variables. Furthermore, a canonical analysis was used as a dimension reduction technique to visualize the three clusters in two-dimensional graphs. Table 3 shows the results of the LDA analysis. The classification tables show 100% training accuracy and 95.83% cross-validation accuracy. Due to the small sample, cross-validation was based on the leave-one-out method, where in each iteration, one observation was used as a testing point and the remaining observations were used as training data. For further verifications of the quality of the CA results, a canonical plot based on the first two canonical variables is shown in Figure 4. It can be clearly seen that the figure supports the results from the CA.
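The leave-one-out scheme described above can be sketched as follows. To keep the example dependency-free, a simple nearest-centroid classifier stands in for LDA, so this shows the cross-validation loop rather than the discriminant analysis itself; all names and the data in `demo_X` and `demo_y` are hypothetical.

```python
from math import sqrt

def nearest_centroid_fit(X, y):
    """Return {label: centroid} computed from the training rows."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for k, v in enumerate(row):
            acc[k] += v
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}

def nearest_centroid_predict(centroids, row):
    """Assign the label whose centroid is closest in Euclidean distance."""
    def dist(c):
        return sqrt(sum((a - b) ** 2 for a, b in zip(row, c)))
    return min(centroids, key=lambda lab: dist(centroids[lab]))

def leave_one_out_accuracy(X, y):
    """Hold out each observation once, train on the rest, score the hold-out."""
    hits = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = nearest_centroid_fit(X_train, y_train)
        hits += nearest_centroid_predict(model, X[i]) == y[i]
    return hits / len(X)

# Hypothetical, well-separated observations; NOT data from the study.
demo_X = [
    [1.0, 1.0], [1.2, 0.9], [0.9, 1.1],
    [5.0, 5.0], [5.1, 4.8], [4.9, 5.2],
    [9.0, 1.0], [9.2, 1.1], [8.8, 0.9],
]
demo_y = ["upper"] * 3 + ["middle"] * 3 + ["big"] * 3
loo_acc = leave_one_out_accuracy(demo_X, demo_y)
```

Because each observation is scored by a model that never saw it, the leave-one-out accuracy is usually a more conservative estimate than the training accuracy, which is consistent with the gap between the two figures reported above.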

Cluster Segmentation
A cluster analysis provides an initial exploratory assessment of the spatial differences in the systems under study. In cluster segmentation, we are mainly interested in investigating the differences between the three clusters. For this purpose, the mean and standard deviation of each cluster are reported in Table 4, in addition to the overall minimums, maximums, and means of all ponds. As mentioned previously, guiding standards for hypersaline wetlands are not available. To provide an understanding of the expected regional water characteristics, the marine water quality guiding standards published by the Ministry of Climate Change and Environment (MOCCAE) and typical concentrations found in similar hypersaline environments were used for comparison purposes [52]. The results of this comparison show that the temperature, DO, pH, and turbidity are within the acceptable levels. For the majority of the parameters, the average values in the three clusters differ from the standard levels shown in Table 1, being lower in some cases and higher in others. The average values for ORP in this study ranged between 41 and 81 mV, yet no specific standard values are available for comparison purposes. The average EC levels in the three clusters were greater than 70,000 µS/cm, agreeing with the standard level of higher than 50,000 µS/cm. Moreover, the average COD levels were higher than the standard level of less than 40 mg/L in all clusters. However, the upper pond showed much lower COD levels than the samples from the other two clusters. Chloride is another parameter of interest for this type of hypersaline ecosystem. The results in Table 4 show that the average chloride concentration was between 12,000 mg/L and 37,000 mg/L in the three clusters, which is consistent with the previously reported values of typical hypersaline environments of approximately 26,000 mg/L.
The average levels of ammonia were between 0.1 mg/L and 1.7 mg/L, exceeding the standard level of 0.06 mg/L. Finally, the average levels of nitrates were between 50 mg/L and 58 mg/L and were slightly higher than the standard level (<50 mg/L).
In general, the upper pond appeared to have lower temperature, ORP, EC, turbidity, nitrate, chloride, and COD values when compared with the middle and big ponds. The big pond had higher ORP, EC, turbidity, nitrate, chloride, and COD values. Despite having higher pH, DO, and ammonia values than the other two ponds, the upper pond appeared to perform better, with major reductions in EC, turbidity, chloride, and COD. Table 5 depicts a heat map of the differences between the three ponds for clarity. The statistical significance of the differences for each variable is investigated further in the following section by running multiple nonparametric ANOVA tests and pairwise comparisons among the three levels in the SAS software using the proc glm procedure.

Univariate Statistical Tests
For each of the ten water quality parameters, a one-way ANOVA test was used to investigate the differences between the three ponds (upper, middle, and big). This method is useful for determining whether each of the ten parameters varies between the three ponds. The parametric ANOVA test assumes normality and variance homogeneity. These assumptions were tested, and Levene's test showed that the homogeneity assumption between the three groups was violated. As a result, the Kruskal-Wallis test, a nonparametric analogue to the parametric one-way ANOVA, was used to test the following hypotheses:
H0: There are no differences among the three ponds.
Ha: There is at least one difference between the three ponds.
Since the Kruskal-Wallis test was applied separately to each of the water quality variables, a Bonferroni p-value adjustment was used in order to control the type I error. According to Table 6, the adjusted p-values were significant for all parameters except for nitrate and dissolved oxygen. To identify the source of the differences between the three ponds, pairwise comparisons were performed using the nonparametric Wilcoxon test, and the results are shown in Table 7. For each combination, Table 7 displays the mean difference, the Bonferroni adjusted p-value, and the simultaneous 95% confidence interval. The table shows that the temperature and ORP levels in the big and middle ponds differed. The temperature and ORP levels appear to be lower in the big pond. The pH, ORP, EC, turbidity, ammonia, chloride, and COD levels differed significantly between the big and upper ponds. The big pond appeared to have higher ORP, EC, turbidity, chloride, and COD levels and lower pH and ammonia levels. Finally, the temperature, pH, EC, turbidity, ammonia, chloride, and COD levels differed significantly between the middle and upper ponds. The temperature, EC, turbidity, chloride, and COD levels were higher in the middle pond, while the pH and ammonia levels were lower.
Note (heat map in Table 5): "Green": high; "Yellow": medium; "Red": low.
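The testing procedure described above can be illustrated with a minimal Kruskal-Wallis sketch for three groups followed by a Bonferroni adjustment. This is a simplified plain-Python illustration under stated assumptions, not the SAS analysis used in the study: it omits the tie correction, relies on the chi-square approximation (which is rough for small samples), and uses hypothetical function names and made-up values.

```python
from math import exp

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def kruskal_wallis_3groups(g1, g2, g3):
    """H statistic (no tie correction) and approximate p-value for three groups."""
    groups = [g1, g2, g3]
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # Three groups give 2 degrees of freedom; the chi-square survival
    # function with 2 df is exactly exp(-h / 2).
    return h, exp(-h / 2)

def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical values for one parameter in three ponds, NOT data from the study.
h_stat, p_raw = kruskal_wallis_3groups(
    [1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], [9.0, 10.0, 11.0, 12.0]
)
# Adjust as if this were one of ten tests (other p-values set to 1.0 for illustration).
p_adjusted = bonferroni([p_raw] + [1.0] * 9)[0]
```

In this toy example the adjustment multiplies the raw p-value by ten, which shows how a result that is significant on its own can become non-significant once all ten parameters are tested together.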

Discussion
In this study, ten parameters were investigated using 44 samples collected from 15 sites in the Al Wasit Nature Reserve. The cluster and discriminant analysis results indicate that, for a rapid assessment of the Al Wasit Nature Reserve, water samples can be collected on a regular basis from one station in each cluster, because each site in a cluster is representative of the entire cluster. As a result, taking water samples from three sampling sites rather than fifteen may accurately reflect the spatial dimension of the water quality throughout the wetland and reduce the monitoring time, effort, and cost without a significant loss of information. This finding is consistent with the findings of a previous study conducted by Mohammadpour et al. [49], who used a spatial pattern analysis to assess the water quality in free surface lakes.
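One possible way to choose the single station per cluster is sketched below. The selection rule used here (the site closest to its cluster centroid in standardized parameter space) is an assumption made for illustration and is not a procedure reported in this study; the site names, values, and function name are hypothetical.

```python
import math


def representative_sites(measurements, clusters):
    """Pick, for each cluster, the site nearest to the cluster centroid.

    measurements: dict mapping site -> list of standardized parameter values
    clusters:     dict mapping site -> cluster label
    """
    by_cluster = {}
    for site, label in clusters.items():
        by_cluster.setdefault(label, []).append(site)

    chosen = {}
    for label, sites in by_cluster.items():
        dim = len(measurements[sites[0]])
        # Component-wise mean of all sites in the cluster.
        centroid = [
            sum(measurements[s][k] for s in sites) / len(sites)
            for k in range(dim)
        ]
        # Euclidean distance to the centroid decides the representative site.
        chosen[label] = min(
            sites, key=lambda s: math.dist(measurements[s], centroid)
        )
    return chosen


# Hypothetical standardized values (two parameters per site).
measurements = {
    "S1": [-1.2, -0.9], "S2": [-1.0, -1.0], "S3": [-0.7, -1.1],
    "S4": [0.8, 1.1], "S5": [1.0, 0.9],
}
clusters = {"S1": 1, "S2": 1, "S3": 1, "S4": 2, "S5": 2}
print(representative_sites(measurements, clusters))
```

Other rules, such as choosing the most accessible site within each cluster, would serve equally well if all sites in a cluster are regarded as interchangeable.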
Previous studies have suggested that, due to the dynamic nature of wetland ecosystems, efforts to monitor and manage the water quality can be severely challenged [58]. For this reason, studies that assess the physical and chemical characterization of a wetland provide valuable inputs that contribute to the development of management programs [59].
The results of this study reflect the importance of monitoring temperature, one of the most important parameters for assessing water quality and ecosystems, because it influences many chemical and biological processes in marine ecosystems [55]. The water temperature ranged from 24.61 °C to 30.81 °C, with a mean value of 27.76 °C, which fell within the Ministry of Climate Change and Environment's (MOCCAE) standard values for marine water properties, suggesting that there is no thermal pollution in the wetland; however, the temperature in the middle pond was significantly higher than in the upper and big ponds. In addition, pH is a routinely measured parameter, and in this study it ranged from 8.00 to 8.60, with no significant difference between the middle and big ponds (p-value > 0.05). The upper pond, however, had a significantly different pH than the middle and big ponds, with the highest average pH value of 8.45. Overall, the average pH values in the Al Wasit Wetland were within the standard range given in Table 1, i.e., between 6.0 and 9.0, indicating no serious threats from wastewater, minerals, or acid rain affecting the pH [56].
Understanding the significance of these results helps guide potential future projects aimed at monitoring parameters over a long period of time to detect possible changes or degradation, an approach previously used in other studies to associate water quality degradation with the trophic status of the ecosystem (Dar et al., 2021). In addition, parameters such as surface water EC vary depending on geological structure and precipitation and can be used to categorize or distinguish wetlands [57]. When EC values exceed 3000 µS/cm, the sample is classified as saline [58]. The Al Wasit Wetland's EC values ranged from 26,960 µS/cm to 134,700 µS/cm, with an average value of 72,172 µS/cm, which falls within the hypersaline range. Furthermore, multiple comparisons revealed that there was no significant difference in the EC values between the big and middle ponds, while the upper pond EC values were significantly lower than those of the other two ponds. The Pearson correlation analysis revealed strong positive correlations between EC and both Cl and COD (r = 0.860 and 0.805, respectively, p-value < 0.001).
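For reference, the Pearson correlation coefficient reported above can be computed as in the following minimal sketch. The paired values are hypothetical and serve only to illustrate the calculation; the function name is not part of any library.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired EC (mS/cm) and chloride (g/L) readings.
ec = [27.0, 45.2, 68.9, 90.4, 120.7]
cl = [8.5, 17.9, 30.1, 36.4, 52.0]
print(f"r = {pearson_r(ec, cl):.3f}")
```

The coefficient is undefined when either sample has zero variance, in which case the sketch raises a division error.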
Furthermore, parameters such as the ORP showed clear differences between the sites in the study. There was a significant difference between the big pond and the other two ponds, as shown in Table 6, with the big pond having a higher ORP value. This could be because the ORP value decreases in the sediment. Because the big pond is deeper than the other two ponds, the samples from the upper and middle ponds are closer to the bottom sediment, whereas the samples from the big pond are further away. The ORP and pH values were found to have a moderate negative correlation. The latter result is consistent with the findings of Ustaoğlu et al. [54], who assessed the water quality of the Pazarsuyu Stream in Turkey.
The concentration of dissolved oxygen indicates how healthy the aquatic ecosystem is. Fish and other aquatic animals may perish if dissolved oxygen levels are too low. DO levels in the Al Wasit Wetland ranged from 5.90 mg/L to 11.59 mg/L, with a mean value of 7.98 mg/L. These results are within the norm, and the mean values for all three ponds were greater than 5 mg/L, indicating a good level of DO, with no significant differences between the three ponds. A moderate negative correlation between Cl and DO was found (r = −0.529, p-value < 0.05). Overall, the DO values in the Al Wasit Wetland were not lower than the reported standard values.
Turbidity is a measure of water clarity. It describes the amount of light scattered or blocked by particles suspended in the water, which give the water a cloudy or turbid appearance. The measured turbidity values ranged from 1.00 FNU to 19.83 FNU, with an average value of 12.97 FNU, which is within the standard turbidity limit (75 FNU). The Kruskal-Wallis test and pairwise comparisons showed a significant difference in the turbidity values between the upper pond and the other two ponds, with the upper pond performing better, with a lower turbidity value of 4.04 FNU. This is an indication of a healthy aquatic ecosystem in the Al Wasit Wetland, as high turbidity can harm fish and other aquatic life. Algae and other aquatic plants require light to grow, and high turbidity reduces the amount of light available underwater.
Ammonia concentration management can improve plant growth and system function [59]. Ammonia toxicity varies with pH and water temperature. The average NH3 concentration in the Al Wasit Wetland was 0.55 mg/L, which is higher than the standard value of 0.06 mg/L. The Kruskal-Wallis test revealed significant differences between the three ponds, with the upper pond exhibiting the highest NH3 levels. The correlation analysis revealed moderate negative relationships between the NH3 levels and the turbidity, Cl, EC, and COD levels. On the other hand, moderate positive correlations were found between NH3 and both DO and pH.
Nitrogen is an essential nutrient for plant development. High concentrations of certain forms of nitrogen, such as nitrate, can be toxic to aquatic organisms. The NO3 levels in the Al Wasit Wetland ranged from 30.60 mg/L to 106.0 mg/L, with a mean value of 55.49 mg/L. This means that the mean NO3 level exceeded the standard value (50 mg/L). The Kruskal-Wallis test revealed no significant differences in the NO3 levels among the three ponds. A Pearson correlation analysis revealed moderate positive correlations between the NO3 levels and pH as well as the EC, Cl, and COD levels.
Previous research, such as [53], showed that chlorides can accumulate in wetlands from year to year and that seasonal increases in chloride concentrations in wetlands can affect the reproduction of some species. Hypersaline wetlands, on the other hand, are known for their high salinity levels, so chloride levels are expected to be high. In this study, the Cl values ranged from 8320 mg/L to 55,200 mg/L, with an average of 30,846 mg/L. The measured values were consistent with what has previously been reported in similar environments in the UAE, namely an average of 26,000 mg/L. The Kruskal-Wallis test and multiple comparisons revealed that the Cl level in the upper pond differed significantly from that in the other two ponds, with the upper pond showing lower Cl values. According to the correlation analysis, Cl levels had moderate positive correlations with temperature, ORP, turbidity, and NO3. However, moderate negative correlations were found between the Cl content and pH, DO, and NH3.
The COD levels in the Al Wasit Wetland ranged from 88 mg/L to 2924 mg/L, with an average of 1195 mg/L. The COD values appeared to be much higher than the standard value (40 mg/L). The Kruskal-Wallis test and pairwise comparisons revealed that the upper pond had significantly lower COD levels than the middle and big ponds, which had much higher average values, in line with the pairwise results in Table 7. The Pearson correlation coefficient revealed that the COD level had a strong positive correlation with the EC and moderate positive correlations with ORP, turbidity, NO3, and Cl. Finally, moderate negative correlations were found between COD and both pH and NH3.
In general, the results of this study imply that samples taken from sites located close to one another may have similar water characteristics. In addition, according to the tested samples, all the sites in the big pond, including sites located on the shore or in the middle of the pond, appear to have similar water quality characteristics. Some of the limitations encountered in this study included the hypersaline nature of the wetland and the lack of previous studies within the region comparing similar environments. As an initial study, this work provides a baseline on the possibilities of measuring water quality parameters, which could be used for site selection and comparisons. In the future, studies should be conducted to include a larger variety of chemical and physical parameters, such as heavy metals, bacteria, and others, in addition to the collection of sediments and other biological matrices to complement the study. In terms of data collection, access to the site and the number of measurements collected were also limiting factors, as some measurements were taken on site and samples were taken to the laboratory for further analysis. Furthermore, if a larger dataset were available, the data could be used to predict spatial distribution patterns and pollution quality indexes, as suggested by Bera et al., 2022 [60]. This study also contributes to knowledge that is scarce in this region and highlights the importance of introducing tools such as remote sensing for regional water quality monitoring and assessment [61,62]. In addition, understanding the water quality parameters of incoming effluents will facilitate regulating and reducing the negative impacts on natural waters and their organisms.
Including additional factors not reported in this study, such as inorganic suspended solids (ISS), total suspended solids (TSS), organic matter (OM), biochemical oxygen demand (BOD), and total phosphorus (TP), can increase the understanding of a wetland's status and suggest treatment methods for the wetland [58,60–63].

Conclusions
In this study, several statistical techniques were used to analyze water quality data from the Al Wasit Nature Reserve in the United Arab Emirates. Using a cluster analysis, fifteen water quality monitoring sites were divided into three groups (the upper, middle, and big ponds) with similar water quality characteristics. A discriminant analysis was used to validate the cluster analysis results. To investigate the differences between the three ponds, the nonparametric Kruskal-Wallis test was used, since the homogeneity assumption of the parametric ANOVA was violated. The three ponds were found to differ significantly in eight of the evaluated parameters, with only the nitrate and dissolved oxygen parameters showing no significant differences. Following that, pairwise comparisons revealed the sources of the differences, with the upper pond (Cluster 1) outperforming the middle pond (Cluster 2) and the big pond (Cluster 3) for the majority of the parameters.
Based on the findings above, the statistical analysis approaches presented here form a comprehensive framework that is useful for professionals involved in the design and implementation of water quality monitoring networks and the interpretation of large water quality datasets. This approach provides a multivariate statistical basis for classifying sites by similar water quality characteristics. It also builds classification models that could be used in the future with large datasets to predict the cluster membership of new, unknown samples. In addition, larger datasets could be collected, and discriminant analyses could be further used to identify the water quality parameters most responsible for spatial variation in water quality and to identify opportunities to reduce the number of water quality parameters and sampling sites to those that are most representative for long-term monitoring. This will assist water quality monitoring agencies in refining current monitoring programs by reducing the number of water quality variables monitored to those that are most influential and limiting the number of sampling sites to those that are most representative of spatial or temporal patterns, which will significantly reduce the time, effort, and cost of assessing water quality.
This study establishes guidelines for sampling in the Al Wasit Wetland, such as sampling only from ponds with significant differences in parameter values. For most parameters (all except temperature and ORP), the study found no statistically significant differences between the middle pond, which is mostly located near the shore, and the big pond, which includes deeper sites that require the use of specialized tools, equipment, and boats for sampling. As a result, samples from the middle pond can largely be considered representative of the big pond in future work, potentially saving costs. It is also suggested that more data be collected in the future in order to create a high-accuracy predictive model. Such models are useful for predicting how parameter values will change as a result of changes in spatial or temporal patterns. The same framework could be used to assess water quality in other areas.