Identification of Key Factors Affecting the Trophic State of Four Tropical Small Water Bodies

Due to their dimensions, small and shallow water bodies are more sensitive to changes in nutrient load, water flow, and human management. The four water bodies studied are small (area <0.01 km2), constantly supplied by a non-anthropogenic source of nutrients, and these water bodies present different trophic states: mesotrophic, eutrophic, and hyper-eutrophic. The objective of this study was to identify the key environmental factors that created differences in the trophic state of these adjacent shallow urban lakes by modeling chlorophyll-a (Chl a) through the application of the Partial Least Squares Regression (PLSR). The models (n = 36) explain 45.8–60.6% (R2), and predicts 39–52.9% (Q2) of the variance. Environmental variables were identified in the water bodies as critical factors of trophic state determination, water residence time (WRT), ions (e.g., Ca2+), and minerals as hydroxyapatite (HAP). These variables were related to processes that could improve trophic conditions, such as flushing and phosphorous precipitation. Conversely, N-NH3 concentration was associated with nutrient recycling, and found to be able to promote eutrophication.


Introduction
Recent studies propose that the number and surface of small and shallow water bodies can play an essential role within the global carbon cycle because dissolved organic carbon concentrations show a negative correlation with lake size, which means they can play a significant part in carbon sequestration [1]. The small water bodies are also essential for the maintenance of regional biodiversity in aquatic invertebrates and vertebrates [1,2]. From an anthropogenic point of view, they are vital for human development [3].
A specific type of small water body is the urban shallow lake. These water bodies can have high social value and provide various landscape, recreational, or cultural services [4]. The small magnitudes of depth and surface of some water bodies make them critically vulnerable to nutrient enrichment or eutrophication processes, which in turn entails a series of undesirable changes that alter the quality of their waters [5]. Today, eutrophication is the greatest threat to water bodies on the planet, and this situation will continue for at least the next decade [6]. Moreover, eutrophication is costly in economic terms [7]. the planet, and this situation will continue for at least the next decade [6]. Moreover, eutrophication is costly in economic terms [7].
The size and shallowness of these water bodies make them susceptible to external processes, such as cloudless days, intense solar radiation, high water temperatures, strong winds, and fluctuations in the water supply, which can modify the ecological dynamics of the lake, even in the littoral zone. Some of these processes also influence the trophic state [8].
Unlike early studies on the critical factors of eutrophication, which were focused mainly on the role of nutrients [9], current research has a much broader, often multifactorial approach, centered on predicting Chl a concentrations and identifying key factors that support their production, such as temperature, water retention time, water level, photoperiod, presence of macrophytes, and zooplankton herbivory [10].
The lakes studied in the present research are adjacent to each other and share the same water source. Therefore, their environmental conditions and trophic status would be expected to be similar. However, as this did not occur, the present work sought to know which were the main variables at the root of this difference, and how they interacted.
In identifying the key factors, the aim of this study was to identify the variables by PLSR, besides N and P concentrations, that are important in the differential expression of eutrophication in small and shallow lakes, and that could participate in processes that promote (e.g., nutrient recycling) or diminish (e.g., precipitation of phosphates, co-precipitation with divalent cation minerals, or flushing) eutrophication.

Study Site
The lakes are located within the Eastern Quarry (EQ) (Cantera Oriente, in Spanish; 19° 19′ 15" -19° 18′ 47"N and 99° 10′ 22" W) within the Ecological Reserve of Pedregal de San Angel (REPSA), situated northeast of the Xitle volcano lava spill, whose extension is 80 km 2 (Figure 1a,b) [11]. The lakes are located within the EQ, which is part of the Ecological Reserve of Pedregal de San Angel (REPSA). This ecological urban reserve protects the last existing portions of vegetation that were very extensive in the past at Mexico's basin, like the scrub of "Palo loco" (Pittocaulon praecox) [12].   The water that filled the EQ lakes came from the springs of a shallow aquifer flowing through the rocks (fractured basalt) along with the lava spill, as the main recharge sites of the shallow aquifer are the terrains near to the Xitle volcano [13], whose summit is 3100 m a.s.l. and is located 9.5 km southwest of the EQ [11] (Figure 1a).
The EQ has a total area of 206,000 m 2 divided into two areas. The northern part corresponds to the ecological reserve. From the D-D' topographic profile to the southern end of the EQ (Figure 1b), the area is a training ground for a professional football team [12].
The EQ is a depression created due to the extraction of volcanic rock (basalt) during the period 1970-1994. The depression has an altitude range between 2254 and 2292 m a.s.l. [12]. Then, the lakes are classified as high-altitude water bodies, but are also tropical, considering the latitude [14]. The lakes originated when the excavation reached the aquifer and several springs appeared. EQ lakes approximately have the following surfaces, excluding flood zones: NL(5450 m 2 ) < RL(6400 m 2 ) < CL(8300 m 2 ) < SL(9500 m 2 ) [15].
The water flows through a series of channels to the different lakes. The water flow is regulated by gates (4), two of which are located near the mouths of the SL and the RL, with one in the channel, and another that divides NL and CL. The water leaves the shallow lake system area through two holes in the eastern wall-one located north of NL, and the other, most important one in RL. There is no sanitary network that discharges into the lakes; the wastewater is collected in septic tanks and treated later.
The climate in the EQ is temperate subhumid, with rains in the summer and a dry winter season; the rainy season lasted from May to October, and the dry season from November to April. The average annual rainfall is 833 mm, the average annual temperature is 15.6 • C, and predominant winds come from the north and NNE [12]. Over the studied period, September was the month with the highest monthly mean precipitation (127.4 mm), and February was the driest month (1.7 mm). In the period 2013-2016, the average rainfall in the dry season was 123.9 mm, while that in the rainy season was 490.9 mm [16].
The EQ water bodies fulfill the function of maintaining regional biodiversity. The presence of macrophyte vegetation is usually limited to the channels and the littoral zone of the water bodies, even in those water bodies that are more transparent, because it is continuously removed or trimmed. The main species of macrophytes living in the water bodies is Rush (Typha latifolia), and in the channels, the predominant species are the submerged plant, Sago Pondweed (Stuckenia pectinata) and the floating Swollen Duckweed (Lemna gibba) [17]. The species richness of phytoplankton in the water bodies of the EQ is 137, where chlorophytes (Scenedesmus, Desmodesmus, Actinastrum, Chlamydomonas) and bacillariophytes (Cyclotella, Gomphonema, Cymbella) present a higher number of species [18]. The lakes, in addition to the Common Carp (Cyprinus carpio), are inhabited by endemic species of the Valley of Mexico, such as the fish Mexclapique of Zempoala (Girardinichthys multiradiatus) [19] and Moctezuma's Green Frog (Rana montezumae Baird) [20]. These lakes have also been used as a temporary shelter for the Mexican Ajolote (Ambystoma mexicanum) [21], and refugee area for several species of aquatic birds [22].

Sampling and Analysis of Water Parameters
Samples for the determination of the physical, chemical, and nutrient parameters of the lakes and the puddle were collected monthly from June 2013 to May 2016 (n = 36). The samples were integrated from the column every 0.2 m to the bottom.
All the water analyses were performed based on the protocols of the Standard Methods for the Examination of Water and Wastewater [23]. Water temperature (T), specific conductivity (K 25 ), and dissolved oxygen concentration (DO) were measured with a multiparameter YSI-85, the pH with a Conductronic PH-10 potentiometer, and the Secchi Disc depth (zSD) with a 20 cm diameter disc. Samples for determinations of Total Phosphorus (TP), P-PO 4 3− , P-Org, Total Nitrogen (TN), Standardization procedures and quality control gave recovery percentages of 95-99%, and coefficients of variation (CV) between 0.9% and 2.0% when replicates were obtained. Within the sampling period 2013-2016, Chl a was also determined with the acetone (90%) extraction method and read using a 10-AU Turner Designs fluorometer [24], and as quality control, tripled samples were analyzed in each batch, obtaining CV < 5%.

Trophic State Index (TSI)
The trophic state index of Carlson to estimate and classify the degree of eutrophication [25] uses three limnological parameters, such as Chl a (µg/L), zSD (m), and TP (µg/L) to calculate the Integral TSI (TSI-Int) according to the following equations: TSI Chl a = 9.81 ln(Chl a) + 30. 6 (1) The Integral TSI (TSI-Int) was calculated with a modified formula that weighs TSI Chlor-a while reducing the weighting of TSI TP, because the availability of nutrients may not necessarily be the most important factor in promoting eutrophication (Equation (4)), and also to avoid overestimation of the values for classifying the trophic state, taking into account the relatively high TP values of the waters that feed the EQ [26].

Water Residence Time (WRT)
Theoretical or nominal WRT is defined by Equation (5), where V is the volume of the water body and Q is the daily flow in or out [27].
The WRT was estimated for each month and lake in the sampling period, except for WS 4 because it belonged to the channel network, but for statistical analysis and data management, at this site it was considered that WRT tended toward zero days. The average monthly volumes of lakes (V) were estimated from their surfaces and the monthly depths measured during the data collection period (2013-2016). To estimate Q, in each supply or output channel the surface velocity of the flow was measured using the float method, considering that the surface velocity is higher than the average flow rate. The use of a correction factor improves the estimation by this method. Finally, the velocity was multiplied by the area of the cross-section of the wetted perimeter of the different channels, then the Q estimated values were obtained [28].

Hydrogeochemical Modelling of Divalent Cations Minerals
Ca 2+ and Mg 2+ under certain conditions can form different minerals that remove PO 4 3− from the water column and attenuate eutrophication by precipitation or co-precipitation [29,30]. The saturation index (SI), defined as Equation (6), gives the potential formation of any solid in an aqueous medium: where IAP is the ionic activity product for the solid and Kps is the constant of the solubility product of the modeled solid. If SI < 0, the solid is dissolved; when SI = 0, the solid is in chemical equilibrium and can precipitate; and if SI > 0, the system is oversaturated and the solid can precipitate as well [31]. For the development of the hydrogeochemical models, the Aqion program was used, and the basis of this program was the method of calculating the thermodynamic balance that bases its numerical calculation on USGS Phreeqc [32]. The Aqion program offers the results of both SI and concentration of the solid that can potentially form (µM).
Hydrogeochemical models for the formation of phosphates with divalent cations were made for each month in the four lakes, and in WS 4 with the available variables of the 2013-2016 sampling period [Ca 2+ , Mg 2+ , CO 3 2− , HCO 3 − , pH, temperature (T), N-NO 3 − and P-PO 4 3− ], but due to not having the concentrations of all the major elements, the errors in the ion balances were completed with little or no reactive ions, such as Na + or Cl − , according to the case. For additional data handling of this parameter, as statistical analyses, the unit µM was used instead of SI values, since the SI may incorporate negative and zero values, making it difficult to find just a valid transformation factor for all other variables. In the case of the hydrogeochemical data that were not analyzed statistically, the SI values were used.

Organic Carbon in Sediments (OC)
The samples were taken from different sites at the bottom of the lakes in October 2017. The determinations were made by triplicate, through the oxidation method of the organic matter with potassium dichromate (K 2 Cr 2 O 7 ) and sulphuric acid (H 2 SO 4 ) [33]. The method was standardized (n = 7) with anhydrous dextrose, obtaining a 99% recovery and 1.6% of CV.

Spatial and Temporal Differences in Water Parameters
The Discriminant Analysis (DA) was used to evaluate the spatial differences for the analyzed variables between the different water bodies, also allowing to define the contribution of each variable to the grouping of observations [34]. The data was transformed to n + 1, and then the Log 10 transformation was applied to minimize the biases of the data distribution, bringing them closer to a normal distribution [35]. The DA was performed with the XLStat 2014 program [36].
To figure out whether there were significant differences between the climatic seasons, a PERMANOVA analysis using PAST ver. 3.11 was performed [37]. Environmental data (Chl a and nutrients) for the rainy and dry seasons were previously transformed using n+1 and Log 10 .

Partial Least Squares Regression (PLSR)
The PLSR is useful for solving multicollinearity problems in variables [38]. This study applies PLSR to identify the most influential variables on the trophic status for the studied lakes, considering Chl a as the modeled variable. PLSR models were implemented with the XLStat 2014 program [36], following the procedure proposed by Tenenhaus based on the retention of significant variables for the model, according to the β coefficients, confidence intervals, and with their contribution to explaining the variable response [39]. To improve model performance metrics, outlier data were removed, based on the Dixon test [40]. In the NL and CL models, no data were removed, while in the SL model, the three highest observations (outliers) were removed, and in the RL model, one outlier was removed. The value of R 2 is interpreted as the proportion of the variance of the dependent variable explained by the model [38].
The Q 2 is a jack-knife (cross-validation) version of R 2 , which predicts the variation of the dependent variable [38]. The Q 2 index shows how the data omitted from cross-validation can be reconstructed with the help of the model and with the optimal number of PLS components, which include the Water 2020, 12, 1454 6 of 19 response variable for its building [41]. If Q 2 > 0, the model has predictive relevance, while Q 2 < 0 suggests that the model lacks predictive relevance [42].

Root of the Mean Square Error of Prediction (RMSEP)
The mean square error of the prediction (MSEP) or its square root (RMSEP) is often used to evaluate regression performance; the minimum RMSEP value suggests the best model, so this parameter gives the criteria to choose the optimal number of components in PLSR [43], with the condition that the number of PLS components is less than the number of predictors [44]. If the choice of the optimal number of components is not adequate, the model can over-adjust R 2 in Q 2 demerit [39].

Significance and Importance of Variables in PLSR Models
According to the Tenenhaus procedure, the selection of significant variables by the confidence interval of the standardized coefficients, also called the β coefficients, is an iterative process in which from saturated PLSR models, they go away, discarding those variables whose confidence interval of the β coefficient includes zero on its range of values. This procedure is repeated as many times as necessary with the retained variables until no variable includes zero in the range of values of the confidence interval of the respective β coefficient [45]. Saturated models included all variables in Table 1, except TSI-Int, mainly because it is an index derived from Chl a (Equation (4)).
The variable importance in projection (VIP) values derived from loads of the β coefficients summarizes the relative importance of the predictors in the PLSR model, and provide a useful measure in the selection of variables that contribute most to the explanation of the variance of the dependent variable [46]. The VIP values for considering a moderately influenced variable are 0.7-0.8 < VIP < 1, and to classify the influence of a variable as "high", the VIP must be higher than 1 [47]. Table 1 shows a summary of the results of the variables analyzed during 2013-2016. The average depth of the lakes varies between 94 and 121 cm. WS 4 is a part of a channel that has an average depth of 65 cm, with extreme values of 20-100 cm in the sampling site; however, the five sites show a relatively low standard deviation (SD) between 11 and 17 cm, probably due to the dampening of flood zones. Among the sites studied, it is remarkable that the flow of water supply of RL was higher than that of the other lakes, so the WRT for RL was lower (RL < NL < CL < SL). Differences in WRT may be due to the operation of the hydraulic lake infrastructure that varies depending on the season of the year (rainy or dry seasons), and the requirements of maintenance or sports activities that take place in the EQ.

Environmental Conditions in the Lakes (2013-2016)
The maximum and minimum WRT values are probably associated with the rainy and dry seasons, because the water of the springs comes from a shallow aquifer composed of high hydraulic conductivity materials, as fractured basalt. It was observed that over a few days (approximately 7-15), the flows of the springs reflected notorious changes according to the season. The flow value in one of the springs was almost 80 m 3 /day in the dry season, and almost 10,200 m 3 /day in the rainy season [13].
The operation of the gates is decisive in the distribution of water and on the WRT values. During the rainy season, as far as possible, the gates work to avoid flooding of the lowest areas in EQ, and in the dry season, to avoid an excessive drop in lake level. Therefore, seasonal effects can be partially masked by gate management.
The average temperature in the water column exhibits the following order: NL < WS 4 < RL < CL < SL. NL, CL, and mainly SL have remarkable dispersion values (>1.0 • C), while RL and WS 4 have SD values less than 1 • C, which may be explained by the difference in the exposure of lakes to sunlight due to the adjacent rock wall that is 30-38 m high ( Figure 1).
T ( • C) 16   The K 25 shows the average maximum values in WS 4 and NL; both sites also exhibit the highest dispersion in the data along with the CL, while the RL and SL show the lowest SD values. The lowest average values of K 25 in CL, SL, and RL may be associated with solid precipitation.
The Ca 2+ reveals very similar average values in all water bodies (23-26 mg/L) and SD around 3 mg/L, as does Mg 2+ , but the latter exhibits average values in the range of 12.6-13.6 mg/L. These elements probably come from the weathering of non-calcareous minerals [48].
Concerning alkalinity, CO 3 2− had changing values in the different shallow lakes ranging from 0.0 to >100 mg/L, except WS 4, which presented with 0.0 throughout the sampling series. HCO 3 − is the predominant ion in alkalinity and exhibited significant variations in results at each of the sites studied, including WS 4 (20-256 mg/L). The carbonated ions in waters that cross basaltic substrate (low alkaline reserve) probably come from the soil area where the water infiltrated; decomposition of organic matter, the action of the roots, and the microbial respiration generate a CO 2 concentration of one or two orders of magnitude higher than the concentration of CO 2 in the air, and based on pH, it transformed into HCO 3 − or CO 3 2− [49].
As for nutrients, N-NO 3 − is the most abundant analyzed nitrogen ion in the system (SL < CL < TN:TP ratios were calculated by mass, showing a range between 10 and 199 (CL < NL < RL < WS 4), and averaging (57 ± 23), this value suggests that P is the limiting nutrient in accordance with the Redfield ratio (7.2 by mass) for phytoplankton. However, the effect of the TN:TP ratio depends on the concentration of P and N [51]. High TN:TP values and high concentrations of TP can generally relieve the P deficiency and allow for significant phytoplankton growth, as evident in the lakes of the EQ [52].
According to hydrogeochemical models, the only phosphate compound that had SI > 0 was HAP [Ca 5 (PO 4 ) 3 (OH)]. The P-fraction was calculated in the HAP (P-HAP). This mineral is perhaps the most crucial calcium phosphate compound for all uses and applications [53]; it has low solubility and may form under certain conditions that are present in EQ lakes (enough concentration of Ca 2+ and PO 4 3− , adequate temperature values and increases in pH values, due to primary production). These conditions favor the formation of the mineral, which would remove PO 4 3− from the water column and thereby reduce eutrophication [29]. P-HAP results show that NL, RL, and WS 4 did not offer conditions for HAP formation and, therefore, for the presence of P-HAP. In CL and SL, the conditions for the potential formation of HAP were presented because they are warmer and have higher pH values caused by eutrophication, which promotes the formation of this mineral in less time. However, the HAP concentration in CL is about 5.5 times higher than in SL, as the latter lake has the lowest concentration of P-PO 4 3− due to the high consumption by primary producers. Variables influenced by eutrophication, such as DO concentration, deploys the series: WS 4 < RL < NL < NL < CL < SL, while zSD reversely exhibits the same series as DO. Note that the depth and zSD in both RL and WS 4 are similar, as in most measurements, the depth was equal to zSD. Concerning pH, the order of values remains WS 4 < NL < VL < RL < CL < SL.
Chl a has the same order of values as DO, which also has consistency with the TSI-Int results, where NL is referred to as "medium eutrophic", just like RL, while CL and SL are hyper-eutrophic and WS 4 is light eutrophic. In some cases, when zSD = depth, the TSI zSD contribution overestimated the TSI-Int values mainly with regard to WS 4 and RL, where the zSD = depth condition is displayed in 36 and 30 cases, respectively.
The various trophic state conditions of these shallow lakes are probably associated with the differences that most variables present in the sites studied, reflecting different interactions between the environmental variables. As can be seen in RL and WS 4, where the predominant trophic condition is light eutrophic, they are possibly associated with the short WRT of WS 4.
Calcite is a solid that can be formed according to pH [54]. Table 2 shows calcite saturation rates (SI Calcite ) in the lakes and that are based on pH. Conditions for calcite formation in NL, RL, and WS 4 are not favorable most of the time, and can only be seen as favorable when pH values are high; CL and SL show adequate conditions for the potential calcite formation in the respective range. The limited calcite formation can be explained by the reduced formation of carbonic acid, since primary producers take advantage of dissolved CO 2 for photosynthesis, and this raises the pH, promoting the transformation of HCO 3 − into CO 3 2− [55]. The organic carbon concentrations (OC) of sediments in the shallow lakes (Table 3) differ (SL < RL < CL < NL); however, the concentration of OC in the sediments in the NL is significantly higher than at other sites. Primary producers can generate the OC, and in the case of NL, the highest value could relate to the inputs from the adjacent hillside with abundant plant cover (Figure 1).

Spatial and Temporal Differences in Water Parameters
In order to establish the differences between the water bodies studied, a DA was performed. Table 4 presents the variables with the highest correlation with the first two discriminating functions; these correlations allow for the influence of the different variables in the classification to be measured.
The F1 discriminating function explains 88.81% of the variance, and F2 explains 7.01%; both functions are significant, as shown by Wilk's Lambda and the p-value. F1 concentrates the values of the highest coefficients of K 25 , DO, pH, CO 3 2− , HCO 3 − , nitrogen, and phosphate ions, WRT and Chl a, while F2 correlates with zSD, Ca 2+ , and Mg 2+ . The variable with the most influence on group discrimination is WRT, which presents a high correlation with the explanatory function of variance (F1).  Figure 2 illustrates the results of the DA; WS 4 and RL were different from the other lakes, and did not share characteristics among them. NL and RL showed a greater proximity to WS 4, while the SL and CL point clouds were present. NL and RL showed greater proximity to WS 4, while the SL and CL point clouds presented with an incomplete overlap, as both lakes have been classified in the same category in the TSI-Int (hyper-eutrophic), although they may have some differences.  Figure 2 illustrates the results of the DA; WS 4 and RL were different from the other lakes, and did not share characteristics among them. NL and RL showed a greater proximity to WS 4, while the SL and CL point clouds were present. NL and RL showed greater proximity to WS 4, while the SL and CL point clouds presented with an incomplete overlap, as both lakes have been classified in the same category in the TSI-Int (hyper-eutrophic), although they may have some differences. The PERMANOVA analysis was applied using the Chl a and nutrients and did not show differences (p > 0.05) between the dry and rainy seasons. The PERMANOVA analysis was applied using the Chl a and nutrients and did not show differences (p > 0.05) between the dry and rainy seasons. Figure 3a presents the scatter plot and performance metrics of the model, while R 2 explains 60.5%, and Q 2 predicts 52.9% of the Chl a variation. The model with one PLS component had the lowest RMSEP value.  Figure 3a presents the scatter plot and performance metrics of the model, while R 2 explains 60.5%, and Q 2 predicts 52.9% of the Chl a variation. The model with one PLS component had the lowest RMSEP value.

North Lake (NL)
On the Y-axis (Figure 3b), the positive values of the β coefficients for DO and pH reflect direct relationships of the predictor variables modified by eutrophication with Chl a; the negative values of the β coefficients of the rest of the variables indicate inverse relationships. The VIP value (Figure 3b) indicates that the variables modified by eutrophication (pH, zSD, and DO) are the ones that bring the most weight to the model, as well as N-NO3 -. The WRT and P-PO4 3-, have moderate influence, while Ca 2+ has a minor influence on the model, but is significant, based on the β coefficient confidence interval. Variables such as pH, zSD, and DO have a strong influence on Chl a modeling due to the variables that are directly affected by eutrophication.
Nutrients are recognized factors of eutrophication, so the VIP is moderate to high. The inverse relationship of the nutrients (N-NO3 -and P-PO4 3-) with Chl a may be associated with consumption by primary producers, which contributes to the reduction in concentrations of dissolved nutrients in the water column.
After nutrients, WRT in NL is the most influential variable in the trophic state. The WRT in the EQ has a close relationship with the dry and rainy seasons: in the NL, the condition Up Chl a/Down WRT may occur, associated with the connection between NL and CL during the rainy season when the NL receives CL water with a higher Chl a load. The other condition is Down Chl a/Up WRT, and may be related to the non-connection of shallow lakes during the dry season, in addition to the reduction of temperature, photoperiod, and irradiance during the winter, which could contribute to a decrease in the activity of primary producers.
The inverse relationship of Up Chl a/Down Ca 2+ is probably associated with the NL, sometimes having positive SICalcite values ( Table 2). The Down Chl a/Up Ca 2+ relationship is feasibly associated with the water renewal (flushing) that replenishes the precipitated ions and dilutes the Chl a. Figure 4a presents the scatter plot and performance metrics of the model, where R 2 explains 45.8% of the variance, and Q 2 predicts 39% of the Chl a variation. This model was built with one PLS component, and resulted in the lowest RMSEP value. Figure 4b shows the β coefficients of the retained variables and the confidence intervals of the β coefficients to check the significance of the have moderate influence, while Ca 2+ has a minor influence on the model, but is significant, based on the β coefficient confidence interval. Variables such as pH, zSD, and DO have a strong influence on Chl a modeling due to the variables that are directly affected by eutrophication. Nutrients are recognized factors of eutrophication, so the VIP is moderate to high. The inverse relationship of the nutrients (N-NO 3 − and P-PO 4 3− ) with Chl a may be associated with consumption by primary producers, which contributes to the reduction in concentrations of dissolved nutrients in the water column. After nutrients, WRT in NL is the most influential variable in the trophic state. The WRT in the EQ has a close relationship with the dry and rainy seasons: in the NL, the condition Up Chl a/Down WRT may occur, associated with the connection between NL and CL during the rainy season when the NL receives CL water with a higher Chl a load. The other condition is Down Chl a/Up WRT, and may be related to the non-connection of shallow lakes during the dry season, in addition to the reduction of temperature, photoperiod, and irradiance during the winter, which could contribute to a decrease in the activity of primary producers.

Central Lake (CL)
The inverse relationship of Up Chl a/Down Ca 2+ is probably associated with the NL, sometimes having positive SI Calcite values ( Table 2). The Down Chl a/Up Ca 2+ relationship is feasibly associated with the water renewal (flushing) that replenishes the precipitated ions and dilutes the Chl a. Figure 4a presents the scatter plot and performance metrics of the model, where R 2 explains 45.8% of the variance, and Q 2 predicts 39% of the Chl a variation. This model was built with one PLS component, and resulted in the lowest RMSEP value. Figure 4b shows the β coefficients of the retained variables and the confidence intervals of the β coefficients to check the significance of the variables. In this case, the signs of the β coefficients exhibit direct relationships with pH, DO, CO 3 2− , and P-Org and inverse relationships with zSD, HCO 3 − , P-PO 4 3− , and P-HAP.

Central Lake (CL)
Water 2020, 12, x FOR PEER REVIEW 12 of 19 variables. In this case, the signs of the β coefficients exhibit direct relationships with pH, DO, CO3 2-, and P-Org and inverse relationships with zSD, HCO3 -, P-PO4 3-, and P-HAP. The variables that are modified by eutrophication (pH, DO, and zSD) and P-PO4 3-have VIP values that classify them as variables of considerable influence, except for P-PO4 3-, which is low (VIP = 0.674), but the roles of these variables have already been mentioned and are not different from one lake to another.
HCO3 -and CO3 2-have a significant influence on the model (VIP > 1). Its high VIP values are associated with the modifications to pH generated by primary producers; however, in the case of CO3 2-, it is not discarded as a possible eutrophication attenuation factor because it could co-precipitate with P-PO4 3- [30], due to CL presenting with SICalcite > 0 ( Table 3).
The P-Org presented a direct relationship with Chl a of moderate influence in CL (VIP = 0.850), probably due to living and detrital organic matter. The CL presents the highest concentration of P-Org even without being the most productive site, which can be associated with the water movement that favors the accumulation of materials because it presents an orientation in which the north and NNE winds exert a movement of surface water in the opposite direction to the hydraulic gradient. In CL, the winds generate a slowdown, sufficient to probably cause the retention of P-Org in the water body.
The potential HAP formation can be a eutrophication attenuation factor, although the influence of this variable on the model is low (VIP = 0.598). The potential formation of HAP is considered as a possible attenuator because the average P-HAP concentration (0.033 mg/L) approximately corresponds to 25% of the mean P-PO4 3-concentration (0.13 mg/L) in the waters supplying lakes from WS 4 ( Table 1).

South Lake (SL)
In the SL model, R 2 explains 53.2% of the variance, and Q 2 predicts 44.9% of the Chl a variation (Figure 5a). The minimum RMSEP value was obtained with one PLS component.
The pH, and particularly zSD, are variables that strengthen the model performance metrics due to their VIP values (Figure 5b). The β coefficient of N-NH3 has a significant direct influence on the model (VIP = 0.935), and this parameter may reflect a nutrient recycling process because the direct relationship N-NH3/Chl a suggests that organic matter is decomposing, since ammonium is the dominant nitrogen ion at the beginning of the mineralization of organic matter [56]. Nutrient recycling or mineralization may be taking place in the water column, which is consistent with the lowest OC value in SL sediments (1.96%) (Table 3).
Additionally, SL being the warmest water body between EQ lakes can promote mineralization because it increases the activity of decomposers and related microorganisms [54], and the DO content in the SL water column is much greater than 2 mg/L, which is optimal for the removal of The variables that are modified by eutrophication (pH, DO, and zSD) and P-PO 4 3− have VIP values that classify them as variables of considerable influence, except for P-PO 4 3− , which is low (VIP = 0.674), but the roles of these variables have already been mentioned and are not different from one lake to another. HCO 3 − and CO 3 2− have a significant influence on the model (VIP > 1). Its high VIP values are associated with the modifications to pH generated by primary producers; however, in the case of CO 3 2− , it is not discarded as a possible eutrophication attenuation factor because it could co-precipitate with P-PO 4 3− [30], due to CL presenting with SI Calcite > 0 ( Table 3).
The P-Org presented a direct relationship with Chl a of moderate influence in CL (VIP = 0.850), probably due to living and detrital organic matter. The CL presents the highest concentration of P-Org even without being the most productive site, which can be associated with the water movement that favors the accumulation of materials because it presents an orientation in which the north and NNE winds exert a movement of surface water in the opposite direction to the hydraulic gradient. In CL, the winds generate a slowdown, sufficient to probably cause the retention of P-Org in the water body.
The potential HAP formation can be a eutrophication attenuation factor, although the influence of this variable on the model is low (VIP = 0.598). The potential formation of HAP is considered as a possible attenuator because the average P-HAP concentration (0.033 mg/L) approximately corresponds to 25% of the mean P-PO 4 3− concentration (0.13 mg/L) in the waters supplying lakes from WS 4 ( Table 1).

South Lake (SL)
In the SL model, R 2 explains 53.2% of the variance, and Q 2 predicts 44.9% of the Chl a variation (Figure 5a). The minimum RMSEP value was obtained with one PLS component.
The pH, and particularly zSD, are variables that strengthen the model performance metrics due to their VIP values (Figure 5b). The β coefficient of N-NH 3 has a significant direct influence on the model (VIP = 0.935), and this parameter may reflect a nutrient recycling process because the direct relationship N-NH 3 /Chl a suggests that organic matter is decomposing, since ammonium is the dominant nitrogen ion at the beginning of the mineralization of organic matter [56]. Nutrient recycling or mineralization may be taking place in the water column, which is consistent with the lowest OC value in SL sediments (1.96%) (Table 3).
Additionally, SL being the warmest water body between EQ lakes can promote mineralization because it increases the activity of decomposers and related microorganisms [54], and the DO content in the SL water column is much greater than 2 mg/L, which is optimal for the removal of organic materials by biodegradation [57]. The direct relationship between N-NH 3 /Chl a can also be explained by the ammonium that may be being used as a nitrogen source by primary producers and generate more chlorophyll [58].
Water 2020, 12, x FOR PEER REVIEW 13 of 19 organic materials by biodegradation [57]. The direct relationship between N-NH3/Chl a can also be explained by the ammonium that may be being used as a nitrogen source by primary producers and generate more chlorophyll [58].

Regulation Lake (RL)
In the model, R 2 explains 51.5% of the variance, and Q 2 predicts 43.4% of the Chl a variation (Figure 6a). The minimum RMSEP value was obtained with one PLS component. The influence of pH and zSD, as on the other models, is evident (VIP > 1), followed by Ca 2+ and P-PO4 3-with low influences (VIP > 0.7), but is significant based on the confidence intervals of their respective β coefficients (Figure 6b). When the inverse relationships of Ca 2+ and P-PO4 3-with the Chl a have the condition Down Chl a/Up Ca 2+ and P-PO4 3-, it is probably associated with flushing derived from WRT because RL has the lowest WRT among all the water bodies (3 ± 1 day) ( Table 1), but the WRT is not among the significant variables, probably due to the delay in the response of the lake to the replacement. In the condition Up Chl a/Down Ca 2+ and P-PO4 3-, the reduction of Ca 2+ can be associated with SICalcite > 0 ( Table 2), and the inverse relationship Chl a/P-PO4 3-may be associated with consumption of primary producers since HAP (Table 1) is not formed in this lake.

Discussion
The variances explained from Chl a in this study with PLSR (R 2 = 45.8-60.6%) are consistent with the results found in another tropical system, the lakes of the Paraná River basin in Brazil (R 2 =

Regulation Lake (RL)
In the model, R 2 explains 51.5% of the variance, and Q 2 predicts 43.4% of the Chl a variation (Figure 6a). The minimum RMSEP value was obtained with one PLS component. The influence of pH and zSD, as on the other models, is evident (VIP > 1), followed by Ca 2+ and P-PO 4 3− with low influences (VIP > 0.7), but is significant based on the confidence intervals of their respective β coefficients (Figure 6b). When the inverse relationships of Ca 2+ and P-PO 4 3− with the Chl a have the condition Down Chl a/Up Ca 2+ and P-PO 4 3− , it is probably associated with flushing derived from WRT because RL has the lowest WRT among all the water bodies (3 ± 1 day) ( Table 1), but the WRT is not among the significant variables, probably due to the delay in the response of the lake to the replacement.
Water 2020, 12, x FOR PEER REVIEW 13 of 19 organic materials by biodegradation [57]. The direct relationship between N-NH3/Chl a can also be explained by the ammonium that may be being used as a nitrogen source by primary producers and generate more chlorophyll [58].

Regulation Lake (RL)
In the model, R 2 explains 51.5% of the variance, and Q 2 predicts 43.4% of the Chl a variation (Figure 6a). The minimum RMSEP value was obtained with one PLS component. The influence of pH and zSD, as on the other models, is evident (VIP > 1), followed by Ca 2+ and P-PO4 3-with low influences (VIP > 0.7), but is significant based on the confidence intervals of their respective β coefficients (Figure 6b). When the inverse relationships of Ca 2+ and P-PO4 3-with the Chl a have the condition Down Chl a/Up Ca 2+ and P-PO4 3-, it is probably associated with flushing derived from WRT because RL has the lowest WRT among all the water bodies (3 ± 1 day) ( Table 1), but the WRT is not among the significant variables, probably due to the delay in the response of the lake to the replacement. In the condition Up Chl a/Down Ca 2+ and P-PO4 3-, the reduction of Ca 2+ can be associated with SICalcite > 0 ( Table 2), and the inverse relationship Chl a/P-PO4 3-may be associated with consumption of primary producers since HAP (Table 1) is not formed in this lake.

Discussion
The variances explained from Chl a in this study with PLSR (R 2 = 45.8-60.6%) are consistent In the condition Up Chl a/Down Ca 2+ and P-PO 4 3− , the reduction of Ca 2+ can be associated with SI Calcite > 0 (Table 2), and the inverse relationship Chl a/P-PO 4 3− may be associated with consumption of primary producers since HAP (Table 1) is not formed in this lake.

Discussion
The variances explained from Chl a in this study with PLSR (R 2 = 45.8-60.6%) are consistent with the results found in another tropical system, the lakes of the Paraná River basin in Brazil (R 2 = 52%) [59], strengthening the suggestion that linear regression models are useful tools for understanding the complex dynamics of the factors that compose the aquatic systems [59][60][61].
The presence of N-NO 3 − in EQ spring water may be due to edaphologic processes that start at the point where water infiltrates into the aquifer. N-NO 3 − concentrations in EQ springs, in general, do not exceed the limit for drinking water in Mexico [62]. Another previous study focused on analyzing wells within the borders of the lava spill, and its surroundings did not find concentrations higher than 10 mg/L of N-NO 3 − [63], suggesting that the water sprouts from the springs are not polluted with sewage. The origin of P-PO 4 3− in the spring water can be lithogenic because the variation in WS 4 is low (±0.03 mg/l), which points to a stable and constant contribution over the sampling period, and because the phosphorus content in the basalts of Xitle volcano lava spill, such as P 2 O 5 , is 0.25-0.76% (1090-3320 mg/kg as P) [64]. These values are above the upper Earth crust value (757 mg/kg as P) [65]. The spill rock of the Xitle volcano, also rich in iron, aluminum, titanium, and manganese [64], is a basalt mostly composed of olivine and pyroxene [66], the olivine being one of the most reactive minerals in the presence of water [67], and the weathering of these minerals probably result in the release of PO 4 3− contained in the rock to the water that supplies the EQ lakes. The nitrogen and phosphorus intake, primary production, and chemical processes of EQ are likely to be the origins of the high TN:TP ratios at EQ lakes (57 ± 23 by mass), which may be determining the phytoplankton community composition due to the TN:TP ratio. This ratio has a strong influence on biological structure and function, and is therefore widely used to handle and infer the phytoplankton community composition [68]. At low TN:TP ratios, cyanobacteria become dominant [68], and around 7.2 by mass at high N and P concentrations are optimal for Microcystis aeruginosa [51]. The average values of TN:TP ratios in EQ lakes are much greater than 7.2, and thus are able to explain why the dominant group of phytoplankton, in general, is not cyanobacteria, but chlorophytes and bacillariophytes, as mentioned above [18]. These are less harmful for the environmental condition of EQ water bodies.
In NL, nutrient consumption has a significant influence over the trophic state, so the identification of other variables that also participate in the expression of a particular trophic state is important. In NL, we found the WRT had an inverse relationship with Chl a. The WRT has been reported as a variable that relates directly to the increase in primary producers [69], which makes it relevant to focus on the type and response time of small and shallow water bodies.
The extraction of PO 4 3− from the water column as calcium compounds has been studied as a way to mitigate eutrophication [70], although the potential formation of HAP in CL is questionable for at least two reasons-firstly, because it may be affected by the relatively slow formation of the mineral. The formation kinetics may determine the concentration of PO 4 3− available for primary producers, without neglecting how the presence of humic substances inhibits HAP precipitation [71]; and secondly, because the CL model has the lowest PLSR (R 2 and Q 2 ) performance metrics among the four models, and P-HAP also has a minor influence (VIP = 0.598). The pH of the spring water generally does not favor HAP precipitation, and it is only when the water resides in CL and the primary producers modify the pH when the conditions for the presence of HAP improve. However, due to the slow kinetics of HAP precipitation, primary producers are more likely to take advantage of PO 4 3− before HAP formation. In our study, calcite was used to explain the removal of Ca 2+ from the water column, and it was also able to participate in the co-precipitation of CaCO 3 /PO 4 3− , a process that has been reported as a mitigation factor for eutrophication in many lakes of different temperate areas of the world [30,72,73]. Co-precipitation has not been studied in depth under the conditions offered by CL (shallow tropical lake at high altitude), where there could be a minor attenuation factor. Co-precipitation depends on the initial concentration of PO 4 3− -at low or high concentrations, the co-precipitation may below, or even inhibited. The high average concentration of P-PO 4 3− in CL (0.08 mg/L) would be an unfavorable condition for adequate co-precipitation rates [36]. Organic matter mineralization maintains its primary production in SL. A similar process occurred in four shallow eutrophic lakes with a limited external P load, where the organic matter decomposition was stimulated by rises in temperature, derived from a eutrophic state, but was also helped by the contributions of the phosphorus released from the aerobic surface of the sediment [74]. The ortho-phosphate ion, negatively charged, is electrostatically repelled by minerals in the sediment as oxides and hydroxides of iron, aluminum, titanium, or manganese when the water pH value in EQ lakes (6.6-10) could be higher than the pH values of the Points of Zero Charge of these minerals [75], which are present in the basaltic rock of the Xitle spill that presumably gave rise to the sediments of the EQ water bodies [64]. Additionally, in the sediments of eutrophic water bodies, lower concentrations of P-Org were found relative to others with a lower degree of eutrophication, suggesting that the decrease in P-Org concentrations is a consequence of the mineralization [76], a situation that was observed in SL but with the OC.
Often, the shallow lake sediments present a significant interaction with most of the water column, which can lead to mineralization if the water contains a sufficient amount of DO to satisfy the Sediment Oxygen Demand (SOD) [77,78], >2 mg/L, as mentioned above. Eutrophic lakes have a high amount of bacteria and other microorganisms as flagellates and ciliates, related to the abundance of organic matter, and all these organisms play a fundamental role in organic compound degradation [54], where the cilliates showed the highest abundance in SL [79]. In this lake (SL), due to the lower water supply (highest WRT), mineralization seems to be the process that supplies nutrients for eutrophication.
The rate of water renewal or flushing has been widely studied as a strategy for the remediation of all types of epicontinental water bodies, such as estuaries, lakes, rivers, reservoirs, and shallow lakes [80][81][82][83]. In EQ, flushing is applied due to the regulating function of the RL that dampens the flow rates coming from the springs to prevent flooding, dislodging the water as quickly as possible.
With the application of PLSR, not only were the most influential variables in the trophic state of each water body identified, but they also began with the knowledge of the processes present, in which the variables participated. The WRT is a variable that modifies chemical, physical, and biological aspects [10]. In NL, WRT had an apparent direct influence on the Chl a PLSR model; however, in the rest of the lakes, the influence of WRT could be considered indirect, due to the variables identified as influential on the processes that may be favored by the WRT, such as the possible attenuation of eutrophication by the potential formation of HAP in CL, mineralization in SL, and flushing in RL.
The WRT is not among the retained variables in the PLSR models of CL, SL, and RL, probably due to the delay of the response of organisms, reactions, and physical processes, such as water movements, gate operation, changes in temperature, and precipitation of materials. The importance of WRT is embodied as being the crucial variable within F1 in DA that explains 88.81% of the variation, clarifying its importance as a discriminatory variable of lakes, and as a facilitating regulatory factor of processes that can attenuate or lead to eutrophication.
By identifying the variables that are probably the most influential in the trophic state of these small shallow lakes, future research would lead to deepening knowledge of the processes that may be present in the different lakes studied, while maintaining the focus on being able to apply the information to other similar sites.
From our perspective, it would firstly be necessary to study the provenance of the sediments, as well as the role that they play. This is especially so in NL because in this lake, the sediments present with the highest OC concentration, which are not the most eutrophicated; it would also be necessary to identify the origin of P-PO 4 3− . In CL, the study of HAP and co-precipitation of CaCO 3 − /PO 4 3− would be further explored as possible eutrophication mitigation factors. Also in SL, to delve into the study of environmental factors that promote nutrient recycling, it would complement the identification of the groups of species and their biomass to know their role in this process in the environmental conditions offered by small and shallow water bodies.