Applying Factor Analysis Combined with Kriging and Information Entropy Theory for Mapping and Evaluating the Stability of Groundwater Quality Variation in Taiwan

In Taiwan many factors, whether geological parent materials, human activities, and climate change, can affect the groundwater quality and its stability. This work combines factor analysis and kriging with information entropy theory to interpret the stability of groundwater quality variation in Taiwan between 2005 and 2007. Groundwater quality demonstrated apparent differences between the northern and southern areas of Taiwan when divided by the Wu River. Approximately 52% of the monitoring wells in southern Taiwan suffered from progressing seawater intrusion, causing unstable groundwater quality. Industrial and livestock wastewaters also polluted 59.6% of the monitoring wells, resulting in elevated EC and TOC concentrations in the groundwater. In northern Taiwan, domestic wastewaters polluted city groundwater, resulting in higher NH3-N concentration and groundwater quality instability was apparent among 10.3% of the monitoring wells. The method proposed in this study for analyzing groundwater quality inspects common stability factors, identifies potential areas influenced by common factors, and assists in elevating and reinforcing information in support of an overall groundwater management strategy.


Introduction
Several water quality items define water quality characteristics. Several researchers have undertaken multivariate analyses to understand hydro-geological characteristics and contamination of regional groundwater [1][2][3][4][5][6][7]. Factor analysis extracts the multivariate influence to understand the cause(s) of major factors affecting water quality, and to acquire information on the strength of the influence. Researchers have recently adopted spatial technology as an important analytical tool to describe and map the spatial variability of hydro-chemical parameters [4,[8][9][10][11]. In collecting data, each spatial sampling position S(x, y) may be used to measure the multivariate factors (v 1 , v 2 , v 3 , …, v n ), and each variable may contain multiple sampling records on different frequencies (v 11 , v 12 , v 13 , …, v nt ), where t is frequency. The analysis unit for considering the characteristics of water quality and spatial correlations should be based on one measurement such as data from a single survey, the average, or the median value during a long-term investigation. Many studies have emphasized spatial analyses of water quality data, but ignored temporal information [8,9,[12][13][14]. Therefore, this study includes information stability for each monitoring well will to understand the spatial variation of well stability.
The characteristics of groundwater quality closely relate to environmental variability. Cruz and Silva analyzed the groundwater data set for Pico Island (Portugal) inferring that silicate mineral dissolution and water salinization were mainly responsible for observed changes in groundwater composition [15]. Aiuppa et al. analyzed groundwater from Mr. Etna, Italy and revealed three major sources of groundwater contaminants: leachate from the host basalt, saline brines from the sedimentary basement below Mt. Etna, and agricultural and municipal wastewaters [12]. Kim et al. used the modified piper diagram to investigate salinization of shallow groundwater in the coastal reclaimed regions of Korea and reported that residual salts from seawater intrusion, and organic matter in the filling materials accelerated the groundwater salinization process [16].
Studies using factor analysis to assess groundwater quality have shown that extracted factors are often related to the saline parameters of groundwater quality. Adams et al. used factor analysis to assess groundwater in the Western Karoo (South Africa) and its interaction with the environment, and reported that the salinization process, mineral precipitation and dissolution, cation exchange, and human activity were the main processes influencing groundwater quality [17]. Kim et al. divided shallow groundwater in the coastal area at Kimje City (Korea) into four groups and revealed that seawater intrusion, chemical fertilizers, and the reduction process affected physicochemical compositions of groundwater [18]. Liu et al. investigated groundwater quality in the coastal Blackfoot Disease (BFD) area in Yun-Lin County (Taiwan) and discovered that groundwater quality was mainly controlled by seawater intrusion and arsenic pollution [9]. Additionally, the areas of high salinization and arsenic pollution were consistent with the area of groundwater over-pumping. Liu et al. demonstrated that brine groundwater was primarily composed of highly evaporated seawater [8]. However, the salinization factor did not determine the analysis results of groundwater samples in the contaminated sites. Subbarao et al. analyzed the effluent contamination of groundwater around a zinc (Zn) smelter plant and a polymer plant at Visakhapatnam (India), showing that the groundwater at the Zn smelter plant was contaminated by magnesium (Mg) and sulfate (SO 4 ), whereas sodium (Na), chloride (Cl), and carbonates (CO 3 ) were the major elements transported into groundwater at the polymer plant [14]. Love et al. applied factor analysis to prove that groundwater quality around an iron (Fe) mine and municipal sewage disposal plant in Southern Africa was related to agricultural activities, mining activities, and chemical usage [13].
Studies often combine factor analysis with cluster analysis for conducting spatial variance analysis. Cluster analysis is used to split water samples into a number of groups according to similar hydro-geochemical composition [2,[18][19][20]. A large-area research typically integrates cluster analysis with geographical information system (GIS) technology to investigate whether the cluster phenomenon exists spatially. If so, factor analysis is then used to discuss the factor influence in each cluster to solve the less prominent factor influence of spatial variances in small areas [4,8,18,[21][22][23].
The reported concentration values used for carrying out the multivariate statistical approach based on a single sampling or the statistics at each monitoring well do not include information on raw data uncertainty. The general results only display the groundwater quality characteristics for the specific time of the survey. Because groundwater quality may vary over time, the above results thus do not represent a realistic groundwater quality state. The ignored information may not change the factor component composition, but it will provide other useful information [8,9,13]. Therefore, this study includes factor influence. Shannon proposed entropy as a measure of uncertainly, a theory recently applied in various fields [24]. Although research has quantified the entropy theory to evaluate uncertainty for hydrological variables and parameters in models of water resources systems [25], studies have not fully explored its application for describing and evaluating large-scale characteristics of groundwater quality. Entropy theory establishes and quantifies uncertainty information to solve water resource and environmental management problems [26].
The three objectives of this study included: (1) applying cluster analysis and GIS technology to evaluate whether a spatial cluster phenomenon exists; (2) using factor analysis combined with kriging to interpret and map major factors that affect groundwater quality in Taiwan, and (3) investigating the stability and spatial variation of influential factors.

Study Area
Taiwan is approximately 36,000 square kilometers is extension, with 32% of the whole island having mean sea level elevations higher than 1,000 meters. The average annual precipitation is about 2,150 mm; the majority of this precipitation occurs from typhoons during the wet season from May to October. However, the spatial and temporal distributions of precipitation are extremely uneven, leading to a great difference in river flow during the dry and wet season. In Taiwan, the groundwater aquifer belongs mostly to quaternary sediments. Geologically, the groundwater aquifer of Taiwan is made up of coastal terrace, river terrace, and alluvial plain; the plain areas are mostly alluvial fans with abundant groundwater. The distribution of groundwater resources in Taiwan are divided into nine groundwater areas. The water resources come from surface water (68%) and groundwater (32%) to supply 70% of agricultural, 21% of domestic, and 9% of industrial water demands [27]. The stable quantity of groundwater deems it an important water resource.

Hydro-Geochemical Dataset
The regional and site-specific groundwater monitoring wells established by the Taiwan Environmental Protection Administration (EPA) and the monitoring wells established by the Water Resources Agency (WRA) constitute the main groundwater monitoring networks in Taiwan. Since 1999, the Taiwan EPA has continually established regional monitoring wells (depth less than 20 meters) with a density of 0.4 wells/100 hectares for seasonal sampling. These wells are mainly used as a pre-warning system for groundwater contamination. The procedure for this study purged groundwater with three times the volume of wells to remove suspended solids before collecting groundwater samples. Samples were collected with bailers, stored in polyethylene bottles, and preserved according to standard analytical methods (EPA). The sample bottles were placed in 4 °C containers and transported to the laboratory for analysis. Some relatively unstable hydrochemical parameters such as temperature (Temp.), pH and electrical conductivity (EC) were measured in the field. The general hydrochemical parameters included: total hardness (TH), total dissolved solid (TDS), chloride (Cl − ), ammonia (NH 3 -N), nitrate (NO 3 -N), sulfate (SO 4 2− ) and total organic carbon (TOC). The metal parameters included arsenic (As), cadmium (Cd), chromium (Cr), copper, (Cu), lead (Pb), zinc (Zn), iron (Fe), and manganese (Mn). Since 2005, the supplemental five parameters have included calcium (Ca 2+ ), magnesium (Mg 2+ ), sodium (Na + ), potassium (K + ), and alkalinity (Alk.), for a total of 23 analytical parameters.
This study analyzed groundwater samples collected seasonally from 2005-2007, derived from 414 regional monitoring wells shown in Figure 1. In general, each monitoring well was sampled and analyzed quarterly. Sometimes samples could not be collected due to natural or human interruption causes. As a result, this study collected eight or more samples from about 90% of the monitoring wells (373) to provide stable and effective hydrochemical data.

Statistical and Geostatistical Analysis
Cluster analysis selected two southern and northern areas for studying spatial evaluations. Factor analysis was applied to discover the factors influencing groundwater quality in these two areas. Then, entropy theory was used to quantitatively evaluate the stability of groundwater quality. Cluster analysis is an unsupervised pattern detection method that classifies all cases into smaller groups or clusters based on similarities within a group and dissimilarities among different groups. Therefore, the magnitude of association is strong (homogeneity) between cases in the same group and weak (heterogeneity) among different groups. The similarity between two cases is typically quantified through Euclidean distance measurements. Hierarchical agglomerative clustering, which is the most common method, has the advantage of not making any prior assumptions about the data. The visual compendium of the clustering processes is typically displayed as a dendrogram (tree diagram). However, the fact that the user must decide the number of groups causes subjective judgments in the cluster analysis. This study employed hierarchical agglomerative cluster analysis on standardized data using Ward's method with squared Euclidean distance. Ward's method attempts to minimize the sum of squared distances of centroids from any two hypothetical groups formed at each step. The linkage distance is expressed as (D link /D max ) × 100, which represents the standardized quotient between the linkage distances for a particular case divided by maximal linkage distance [11,28,29].

Ordinary Kriging (OK)
Kriging is a group of geostatistical techniques used to interpolate the value of a random field at an unobserved location from observations of its value at nearby locations. The main tool of most geostatistical analyses is the variogram. The variogram can be defined as half the expected squared difference between paired random functions separated by the distance and direction vector. The important characteristics of the variogram are range, sill, and nugget effect. The variogram function can be expressed as follows: (1) where is number of pairs observations, represents the regionalized variable at position . For the traditional variogram, which is a function of one variable , the model for the variogram can be obtained by the use of mathematical models for instance exponential, spherical, Gaussian, and linear variogram. These models may be fitted to the variogram and the coefficients of the model may be used to assign optimal weights for interpolation using kriging. In this study used form of kriging is ordinary kriging. Ordinary kriging assumes an unknown constant trend: μ(x) = μ. The estimate method is linear weighted moving averages of the n available observations. Then the interpolation by ordinary kriging is given by: where is estimated the value at . And the kriging weights of ordinary kriging fulfill the unbiasedness condition. The weighting factors can be determined by solving a non-linear optimization problem involving the minimization of the estimated error to the constraint by using the Lagrange multiplier. The variance of estimation error is defined by equation (5): More detailed discussions and mathematical inference are provided by Journel and Huijbregts and Isaaks and Srivastava [30,31].

Factor Analysis (FA)
Factor analysis is an extensively used multivariate statistical method to rearrange original variables into fewer underlying factors (also called common factors) to retain as much information contained in the original variables as possible. Unlike original variables factors are completely uncorrelated with each other. Hence, substituting these factors for the original variables can effectively reduce the overall complexity of large data. The eigenvalue quantifies the contribution of a factor to total variance. Factors are produced according to an eigenvalue analysis of the correlation matrix, and factor loadings and factor scores are the main measurements of factor analysis. The first step of factor analysis is to standardize the raw data and compute a correlation matrix of the variables from the standardized variables. The second step is to estimate the factor loadings that express the degree of closeness between the factor and variables. Factor loadings range from −1 to +1, with a larger absolute value indicating a stronger relationship between the respective factor and variable. Furthermore, Liu et al. proposed that classifying the factor loadings as strong, moderate, and weak corresponds to absolute loading values in the range of >0.75, 0.75-0.50 and 0.50-0.30 [9]. The last step linearly transforms factors associated with the initial set of loadings by factor rotation to maximize variable variances and to obtain a better interpretable loading pattern. Factor scores are computed for each individual case to represent the contribution of each factor in each case.
This study performed factor analysis to determine the factors controlling regional groundwater composition of the two areas and the resulting factors in the main groundwater types. Factor extraction was carried out by principal components, where only eigenvalues greater than one were retained [32]. The factor loading matrix was rotated to obtain uncorrelated factors by varimax rotation.
This study considered factor scores of common factors in each groundwater monitoring well as variables and applied them to kriging methods to create various surfaces to display the range and degree of groundwater quality influenced by common factors.

Information Entropy Theory
Shannon introduced the entropy concept into information theory by suggesting entropy as a measure of information or uncertainty. Shannon entropy expresses the degree of uncertainty implicated in predicting the output of a probabilistic event. Mathematically, an inverse relationship exists between the amount of information and the probability of occurrence. If the occurrence of an event can be precisely predicted the probability value will be great, and inversely, the Shannon entropy will be small. Hence, information and uncertainty as dual terms that reveal the information gained is indirectly measured as the amount of reduced uncertainty. Various fields of ecology, hydrology, and water quality have recently applied entropy theory [33][34][35].
The Shannon entropy can be explained as follows: let a set of n possible outcomes be , and the probabilities of the outcomes as ) The basic assumption of entropy is the amount of information, H(X), being a real non-negative measure, additive, and a continuous function of probability p. Therefore, entropy H(X) is defined as: where p i is the probability of the outcome x i . In this study, the base of the logarithm was 2. The unit of entropy measurement is called a -bit.‖ Table 1 shows the summary of statistics for the hydrochemical data in Taiwan. The water temperature ranged from 18.2 to 32.4 °C , with a maximum difference of 4 °C between the northern and the southern areas. The pH value of groundwater ranges from 3.6 to 10.5-the southern area groundwater is on the alkaline side of neutral, whereas the northern area groundwater is weakly acidic. The mean and standard deviation of other hydrochemical parameters are EC 2,366.4 ± 7,884.2 µS cm −1 , TH 477.7 ± 1,017.6 mg L −1 , and TDS 1,659.2 ± 5,832.9 mg L −1 .    Table 1 shows the concentrations of Cd, Cr, Cu, and Pb in more than 80% of the samples, and Zn in 22% of samples were below the detection limits. More than 50% of shallow groundwater aquifers, equivalent to 60,000 hectares in the sampling area, contain As, indicating that groundwater in Taiwan generally contains trace amounts of As. Researchers have reported groundwater with high-arsenic concentrations in the southwestern coast of Taiwan, with the arsenic content of well water ranging from 0.01 to 1.82 mgL −1 High-arsenic concentrations are also in Blackfoot Disease hyperendemic areas [36][37][38][39]. The results in this study showed that the groundwater As concentrations ranged from N.D. to 0.191 mgL −1 , where the highest concentration of As also occurred in Southwestern Taiwan. Table 2 gives the correlation coefficient matrix for the hydrochemical parameters. If the correlation coefficient (r) is greater than 0.7, two parameters are considered to be strongly correlated, whereas if the r value is between 0.5 and 0.7, it indicates a moderate correlation at a significance level p < 0.05 [17]. Parameters having high degrees of correlations are EC and TDS (r = 0.998) because all of the dissolved components cause increased ionic concentration, as well as increased EC concentration. EC is highly related to TH (r = 0.970), Cl − (r = 0.998), SO 4 2− (r = 0.964), Na + (r = 0.997), K + (r = 0.996), and Mg 2+ (r = 0.965) but moderately related to Ca 2+ (r = 0.670).

Descriptive Statistics
The results indicated that these ions involve various physical and chemical reactions: e.g., oxidation/reduction reactions, and ion exchange in groundwater aquifers, which suggest that the same factor strongly affect them [40]. Ca 2+ and SO 4 2− have a relatively high correlation (r = 0.714), revealing that the calcium ion in groundwater comes mainly from gypsum. Na + and Cl − also have high correlation (r = 0.996). However, heavy metals having no significant correlations with other parameters are As, Cd, Cr, Cu, Pb, and Zn; therefore, As, Cd, Cr, Cu, Pb, and Zn are not included in the subsequent multivariate analyses. Only 17 water quality parameters (Temp., pH, EC, TH, TDS, Cl − , NH3-N, SO 4 2− , TOC, Fe, Mn, Na, K, Ca, Mg, and Alk) for samples collected from the 414 monitoring wells are used. Values for constituents lower than the methods detection limit (MDL) were replaced with half of the method detection limit (MDL/2) prior to statistical analysis to make up all data [41]. This study used the SPSS 13.0 software for multivariate statistical analysis and the results were plotted by ArcGIS 9.3.

Cluster Analysis of Spatial Correlation
The geological factor is the main factor affecting the groundwater by many researches [2,[4][5][6]8,9,11,13,14]. In Taiwan, groundwater monitoring wells are distributed over nine groundwater aquifers with the farthest distance of 354 kilometers between two wells. This highlights the regional pattern of groundwater factors, which must reduce the global influence of common factors. However, common factors in the small area are appropriate to represent the real situation. Thus, cluster analysis and GIS technology were combined to determine if a spatial cluster exists. Based on the raw data, cluster analysis was performed to split the groundwater monitoring wells into two groups, three groups, and four groups up to multiple groups, and then plotted to determine the spatial distribution, shown in Figure 2.
When the monitoring wells were divided into two groups, the percentages of monitoring wells in the groups were 96% and 4%. Monitoring wells of the smaller group located in Southwestern Taiwan had the spatial aggregation as shown in Figure 2(a). When divided into three groups represents the quotient between the linkage distances for a particular case divided by the maximal linkage distance], the percentages of monitoring wells in each group were 56%, 40% and 4%. Figure 2(b) shows the first group aggregated south of the Choshui River Alluvial Fan, the second group in the Southwestern Taiwan, and the third group in the northern Taichung groundwater area and the eastern groundwater area. The three groups demonstrate distinct distributions in space. When the monitoring wells were divided into four and five groups, the spatial patterns were similar to those for the three previous groups, and only a few monitoring wells were divided into other groups [ Figure 2(c) and 2(d)]. However, regardless of how the group increased the number of groups, as shown in Figure 2(e) and Figure 2(f), the division of groundwater characteristics into north and south groups is obvious. Therefore, in order to focus, choose to use the most appropriate area to be discussed.
Wells of the first group, identified as the northern area, were located in the Taipei Basin, the Taoyuan-Chungli Terrace, the Hsinchu-Miaoli Coastal Area, the Taichung Area, the Lanyang Plain, and the Hualien-Taitung Valley. Wells of the second group, the southern area, were located in the Choshui River Alluvial Fan, the Chianan Plain, and the Pingtung Plain. The northern area has 200 monitoring wells whereas the southern area has 214 monitoring wells. Because of the extreme value of the variation in groundwater quality, we compare the relative variation of both the northern and southern areas using a box-and whisker plot. Figure 3 shows the box-and-whisker plot of hydrochemical parameters for the northern and southern areas. Based on the median value, the northern area had a greater concentration of TOC and NO 3 -N than the southern area, whereas the southern area had higher concentration of EC, TH, TDS, Cl − , SO 4 2− , Na + , K + , Ca 2+ , Mg 2+ , Temp., pH, Alk, Mn 2+ , NH 3 -N, and Fe 2+ than the northern area. This demonstrated that factors affecting groundwater quality are different for the northern and the southern areas.

Factor Analysis Combined with Kriging
Factor analysis was performed on the normalized data sets (23 variables) separately for the northern and southern areas of Taiwan. This study retained only factors with eigenvalues that exceeded 1.0, and based on the absolute factor loadings, were greater than 0.625 to determine predominant parameters of the common factors. Table 3 presents the rotated common factors for the percentage of variance and the total cumulative percentage of variance.  , Na + , K + , and Mg 2+ . For the southern area, Ca 2+ is added to consist of nine parameters. Factor 1 explained 42.28% of the total variance for the southern area and 50.34% for the northern area. These parameters are the major ions in aqueous solution. Since EC can reflect the degree of groundwater pollution by seawater intrusion, we can regard it as a water salinization index [9]. The southern area suffered from discharge of agricultural and industrial wastewaters to groundwater and had higher groundwater EC value than the northern area, where groundwater in the southern area was already polluted.

Factor 2
Factor 2 accounts for 15.13% of total variance including the parameters pH, Ca 2+ , and Alk in the northern area. In the southern area, the association of TOC and Alk characterized factor 2 accounting for 12.19% of the total variance. Since the geology of the northern area is primarily limestone, the Ca 2+ ions release into the groundwater, which changes the pH. The groundwater in the southern area has a wide distribution of TOC, mainly from livestock and industrial wastewater discharge. TOC is also a pre-warning of groundwater pollution. The source of alkalinity is different for the northern and the southern areas. In the northern area, Ca 2+ ions are the main source of alkalinity. However, TOC degradation into inorganic carbon mainly causes alkalinity in the southern area.

Factor 3
Factor 3 for the northern area includes the parameters Fe 2+ and Mn 2+ explains 12.21% of the total variance. Factor 3 for the southern area includes only one parameter, Fe 2+ , which explains 8.44% of the total variance. The soil and rock in the groundwater are composed of Fe 2+ and Mn 2+ . The iron dissolves in water to form divalent and trivalent iron cations and in the absence of other ions, the neutral and oxidizing water can form a ferric hydroxide deposit with the iron. Since the groundwater contains low amounts of dissolved oxygen, the anaerobic condition results in reduced trivalent iron and increased divalent iron in water. Hence, the groundwater contains higher concentrations of iron ions than surface water. Similarly, divalent manganese [Mn(II)] is the main type existing in groundwater.

Factor 4
Factor 4 in the northern area accounts for 10.80% of total variance, containing two parameters of NH 3 -N and TOC. The total organic carbon is a composite index that responds to the total mass of organic matter existing in water. In the southern area, Factor 4 only contains one parameter, temperature, which explains 7.76% of total variance. The groundwater contaminated by ammonia nitrogen results from organic matter contained in discharges of industrial, agricultural and domestic wastewaters decomposing into ammonia nitrogen by microbial reactions. The high concentration of ammonia nitrogen or organic nitrogen in water indicates water pollution. Inorganic ammonia is the main parameter of groundwater monitoring work and the existence of nitrogen compounds closely relate to organic matter. Ammonia nitrogen converts into nitrogen gas for release to the atmosphere through nitrification and denitrification, where nitrification is the key to the nitrogen cycle. Groundwater in the anaerobic condition precedes the nitrification process, leading to accumulated ammonia nitrogen in groundwater. Carbon and nitrogen originate from the same source, causing a high concentration of total organic carbon in groundwater.
The factor scores were evaluated using kriging method. Table 4 lists the variography results for common factor in northern area and southern area. A best-fit models with the lowest reduced sum of squares (RSS) and the highest R 2 values were generated using GS+ software in order to fit variograms based on use of a least squares model. The varograms of F1, F2 in the northern area and F2, F4 in the southern area with high nugget effect ratios (>38.0%) represent high levels of small-scale variations. These variations may be due to the extreme observations in the study area. The spatial structures of northern area and southern area in common factors are not all similar.

Shannon Entropy Calculations
The current study analyzed the stability of each parameter using the information entropy theory to extend information contained in the long-term monitoring data. Some missing data for a few groundwater monitoring wells raised the data reliability by selecting monitoring wells with a sum greater than eight for data. The northern area has 175 groundwater monitoring wells, and the southern area has 198; a total of 373 groundwater monitoring wells occupied 90% of the whole data. This study calculated the information entropy for each selected groundwater monitoring well and ranked the groundwater monitoring wells according to their calculated information entropy values. Then the rankings of groundwater monitoring wells were summed up for each parameter classified by each common factor. Caused by the characteristics of different groundwater quality parameters, it is to rank to be added to replace the sum of entropy value. Table 5 shows the contained parameters. Finally, the magnitude of the sum of ranks was used to determine the stability of groundwater quality. The smaller value indicates a more unstable groundwater quality. Water quality variation in many groundwater-monitoring wells is not obvious, so these wells have the same rank. The groundwater parameters for these wells are relatively stable, and are not shown in the plot. Factor 2 for the northern area is taken as an example.

Evaluating Stability of Groundwater Quality Variation
Factor analysis of the scores can use the kriging method to draw contour maps. Factor scores can quantify common factors' influence. A high score means the common factor has high-impact. With a well knowing the factor score and overlaying entropy, we can visualize the factors of influence and stability of the relationship. Figure 4(a) shows the overlaying map of the distribution of Factor 1 scores and the ranks of information entropy values for the northern and the southern areas. For the northern area, the ranks of information entropy values for only five of the monitoring wells vary noticeably. Unstable monitoring wells occupy 2.9% (5/175) of the total number of monitoring wells, located in Taipei City, Miaoli County, and Yilan County. The regions with high scores of Factor 1 conformed to the locations of these five unstable monitoring wells. In the southern area, the ranks of information entropy values showed that 103 monitoring wells had noticeable variation with 52.0% (103/198) of the monitoring wells being unstable. The overlay map reveals that the regions with high Factor 1 scores corresponded with monitoring well locations having the upper rankings of information entropy values. A higher concentration of the groundwater quality parameter contained in Factor 1 less stability. This is common to both the southern and the northern areas, but the problem is more serious in the southwestern coast of Taiwan because Factor 1 represents the extent of groundwater salinization. Thus, the information is useful to obtain the spatial distribution of groundwater salinization. Figure 4(a) reveals that the monitoring wells that have unstable groundwater quality located in potential areas of groundwater salinization are polluted. Figure 4(b) shows the overlay map of the distributions of Factor 4 scores for the northern area and Factor 2 scores for the southern area, and the ranks of information entropy values. The ranks of information entropy values for the northern area exhibit that 18 monitoring wells have obvious variations with 10.3% (18/175) unstable monitoring wells aggregated in Taipei County and Ilan County. The regions with high Factor 4 scores conformed to monitoring well locations having upper rankings of information entropy values. Organic matter in these dense population areas pollutes the groundwater to interfere with groundwater quality stability. In the southern area, the ranks of information entropy values indicate that 118 monitoring wells have obvious variations with 59.6% (118/198) unstable monitoring wells spread extensively. The regions with high Factor 2 scores did not correspond with monitoring well locations having upper rankings of information entropy values, indicating that concentrations of TOC and Alk in the groundwater did not positively relate to groundwater quality stability. This is caused by the pollution source originating from animal husbandry, characterized by a wide range of a long-term, slow pollution.

Serious Salinization Groundwater in Southwestern Taiwan
To understand the extent of Factor 1 impact on groundwater quality in the southern area, the study area concentrated on monitoring wells located only in the coast area of Chianan Plain. Using the information entropy method to evaluate the stability of groundwater parameters in Factor 1 emphasized the correlation between factor scores and information entropy values as shown in Figures 5(a)-(c). Results of overlaying the contours of Factor 1 and the information entropy values of Factor 1 showed that all of the parameters except Ca 2+ indicated that the monitoring wells with high information entropy values of parameters were located closely in the regions having high Factor 1 scores. This indicated that monitoring wells with high Factor 1 scores had relative poor groundwater quality stability. These monitoring wells aggregated in the coastal areas, so salinization obviously affected groundwater quality. Factor loading for Ca 2+ was 0.729, lower than other parameters of Factor 1. However, Ca 2+ fitted the criterion for selecting groundwater parameters, showing that Ca 2+ is not a main control parameter in Factor 1. This finding could prove that additional analyses of groundwater quality uncertainty in this study assisted in understanding various groundwater parameters, and how the same pollution source affected those parameters. Figure 5(d) reveals that the monitoring wells within the top 20 ranks of the information entropy values are mostly located in coastal areas. The Factor 1 scores for monitoring wells also gradually decreased from the coast to inland (the further the monitoring wells are from the coast, the smaller the Factor 1 scores). In contrast, the ranks of information entropy values gradually increased (the further the monitoring wells are from the coast, the lower the ranks of information entropy values). From left to right, the study area can be broadly divided into three zones based on Factor 1 scores. Table 6 shows the statistical ranks of Factor 1 information entropy values. The nine monitoring wells closest to the sea in zone I have high Factor 1 scores (Factor 1 scores 1-8) and high information entropy values. According to the ranks of information entropy values, five of the monitoring wells (56%) are within the top ten ranks and the other four wells are (44%) 11 to 20. In zone II, further away from the coastal area, 12 monitoring wells (58%) have smaller Factor 1 scores than zone I (Factor 1 scores 0-1) with six monitoring wells (50%) ranking 31-103; Only two wells (17%) rank within the top ten. In zone III, furthest away from the coastal area, the monitoring wells have the smallest Factor 1 scores (scores <0).   Based on Figure 5(d).
The basis of the ranks of these information entropy values are that 94% of the total 30 monitoring wells in zone III rank beyond 31, only 6 % rank between 21 to 30. This result indicates that the information entropy values for the monitoring wells in zone II are apparently smaller than in zone I. These observations infer that salinization seriously pollutes groundwater in Southwestern Taiwan and groundwater quality is very unstable, proving ongoing salinization. Monitoring wells located in zone II are also close to zone I, implying a continuous salinization factor seriously affecting groundwater quality, which is a potential area of groundwater quality deterioration. Therefore, these monitoring wells must monitor hydrochemical parameters of salinity regularly to prevent further groundwater quality deterioration. The unplanned development of aquaculture ponds and groundwater over-pumping in Taiwan coastal areas has caused an unbalanced supply/demand of fresh water in groundwater aquifers that has lead to inland seawater intrusion and an adverse influence of groundwater resources in the coastal area. The Taiwan Water Resources Agency (WRA) reported support the points as same as our study, the high water demand for aquaculture feeding consumed about 29% of all groundwater resources in Southwestern Taiwan. Serious over-pumping of groundwater has caused land subsidence and seawater intrusion to result in groundwater salinization. Figure 6 shows the relationship between factor scores and calculated information entropy values of the various hydrochemical parameters for each monitoring well in the northern area and the southern area. In the southern area, the relationship between Factor 1 scores and information entropy values are similar for Factor 1-contained hydrochemical parameters, except for Ca 2+ . Only the results for EC and Ca 2+ are displayed in Figure 6(a) and Figure 6(b). The comparison of EC and Ca 2+ reveals a specific relation between EC factor scores and information entropy values. In terms of Factor 1, the relations are similar for the northern and southern areas. The only difference is that in the southern area, there is no similar relation for the parameter Ca 2+ . Therefore, including Ca 2+ in Factor 1 is not appropriate for the southern area. Figure 6(c) and Figure 6(d) show the results of TOC and Alk, which are parameters for Factor 2 in the southern area. The information entropy values of parameters TOC and Alk have no relationship with Factor 2 scores, indicating extensive groundwater pollution by organic matter. The regions where groundwater quality is unstable assist in finding the location of potential pollution areas. In the northern region, the Alk parameter in Factor 2 is similar to that in the southern area, and the Factor 2 score does not relate to the information entropy value shown in Figure 6(e). Both parameters Ca 2+ and pH are also in Factor 2 in the northern area due to different sources. Hence, natural effects caused the instability of Alk in groundwater of the northern area.

Relations between the Factor Score and Information Entropy Value
In the southern area, Figure 6(f) and Figure 6(g) display the relationship between Factor 3 scores and information entropy values of parameter Fe 2+ , and Factor 4 scores and information entropy values of parameter temperature. Natural effects strongly influence these two factors, so that their factor scores and information entropy values do not correlate. Both parameters, Fe 2+ and Mn 2+ in the northern area have a similar relationship between factor scores and information entropy value, so that only the result for Fe 2+ is shown in Figure 6(h). These findings demonstrate that the northern area is different from the southern area because Factor 3 scores highly correlate with information entropy values for parameters Fe 2+ and Mn 2+ in the northern area. Since Fe 2+ and Mn 2+ are natural elements, the groundwater that contains high concentrations of these elements has an unstable groundwater quality. This represents continuous natural variations, and observations show whether there are any other continuously changing natural phenomena in these regions. Figure 6(i) and Figure 6(j) show the results of NH 3 -N and TOC, which are parameters for Factor 4 in the northern area. Human activities highly influence Factor 4, as demonstrated by the weak relationship between Factor 4 scores and information entropy values of parameters NH 3 -N and TOC. Views of these regions revealed that cities with a high population density discharging improperly treated domestic wastewaters caused groundwater quality deterioration.

Conclusions
Overall, groundwater pollution in Taiwan is not serious. One hundred eight monitoring wells, accounting for 29.0% of total monitoring wells, possess the potential for groundwater salinization or have already been intruded by seawater, causing unstable groundwater quality. Groundwater over-pumping for aquaculture feeding in the southwestern coast of Taiwan causes ongoing seawater intrusion. Cluster analysis let us know that using the Wu River as a boundary, Taiwan can be divided into northern and southern areas with drastic differences in groundwater quality between the two areas. The southern area suffers from discharged agricultural and industrial wastewaters into the groundwater, causing groundwater pollution and higher EC concentration than the northern area. In the northern area, Taipei City, Taipei County, and Ilan County are densely populated areas. Discharges of domestic wastewater have already polluted the groundwater and interfered with groundwater quality stability.
Combining the geostatistical method and the information entropy theory to evaluate groundwater quality variations provides an effective tool for analyzing common factor uncertainty, and for conducting complete analyses of long-term monitoring data collected from a large-scale region. The results will facilitate subsequent researches on environmental remediation, pollution prevention, and investigating natural variations and implementing water management projects.