1. Introduction
Understanding the equilibrium state of aquatic systems is essential for informed decision-making and for preserving the health of dependent organisms [1,2]. This is particularly relevant in tropical and subtropical regions, which are among the most sensitive to climate change [3]. In this context, we examine Lake Chapala, a shallow water body located in the subtropical region of western Mexico. It is the largest lake in Mexico, with a maximum depth of 10.5 m [4]. The lake’s depth and volume were affected by reduced flow in the late 1970s, attributed to dam construction on the Lerma River [5,6]. Since that time, average lake levels have fluctuated around 60% of the current maximum capacity (7897 Mm³), which is significantly below the historical high recorded in 1926 (9663 Mm³) (https://www.ceajalisco.gob.mx/contenido/chapala/, accessed on 4 May 2025).
The current conditions expose the lake to atypical climatic phenomena, such as droughts. Over the past two decades, a semi-permanent drought classified as abnormally dry (D0) has been observed in western Mexico, characterized by periods lasting two to five years, interrupted by brief intervals (https://droughtmonitor.unl.edu/NADM/Statistics.aspx/, accessed on 4 May 2025). The severity of these droughts has progressively intensified, reaching the exceptional classification (D4) in extensive areas of western Mexico, including the Lake Chapala region [7]. This climatic condition induces significant fluctuations in the water volume of rivers and lakes and adversely affects water quality, leading to substantial alterations in natural processes. Consequently, managing water quality under these conditions requires a prompt response, including real-time monitoring of key indicators such as turbidity.
Management strategies for large aquatic systems face significant operational challenges in monitoring water quality, necessitating considerable investments in financial and human resources. Conventional monitoring based on direct sampling inherently offers limited spatial coverage and yields spatially discrete and temporally asynchronous measurements. Environmental remote sensing offers a complementary approach that addresses these limitations. The spatial coverage provided by satellite imagery offers a comprehensive and instantaneous view of the optical properties of water, which can assist in characterizing the physical, chemical, or biological phenomena occurring within a water body, thereby supporting information from discrete records. Furthermore, the periodic satellite pass facilitates the assessment of any phenomenon in near real time and provides a historical dataset for analyzing trends, patterns, or tracking changes over time, enabling a rapid response to pollution events.
With 52 years of data collection, the continuous research and rigorous quality protocols implemented by the Landsat program have established it as one of the most robust platforms, offering a diverse array of products. This has resulted in several product enhancements, including (1) improved image quality, (2) a standardized revisit period over the same site every 16 days, (3) satellite synchronization (e.g., Landsat 8 and 9) to double the data acquisition, thereby increasing the temporal resolution from 16 to 8 days, (4) standardized scene size, and (5) consistent spatial coverage between scenes. Despite their spatial resolution of 30 m per pixel, the attributes of Landsat products render them an optimal choice for research on the optical properties of water bodies and their spatiotemporal variations. Notably, the adjustment of the detector range (from 760–900 nm in Landsat 5 to 850–880 nm in Landsat 8 and 9) to measure the near-infrared spectral response is crucial for conducting research on turbidity linked to inorganic matter in inland water bodies.
The estimation of water quality parameters through optical properties and remote sensing is a well-established practice [8]. Three methodological approaches are employed to model water properties in various aquatic environments: empirical, analytical, and machine learning [9,10]. Herein, we succinctly outline the advantages and limitations of these models as discussed by these authors. Empirical models are extensively utilized due to their simplicity and minimal computational demands. Nonetheless, their predictive capability is constrained by the range of values used for their parameterization and the dynamics of the water body, limiting their applicability across different water bodies. Analytical models are based on the inherent optical properties (IOPs) of water and the atmosphere, which are independent of the ambient light field. These models are infrequently applied in complex water bodies due to the challenges associated with modeling interactions among various water components, and their application as a singular model to optically heterogeneous water bodies requires substantial in situ validation data. Conversely, machine learning approaches, while inherently empirical, are distinguished by their capacity to function in a multidimensional space. These algorithms can produce generalizable models that capture the intricate nonlinear relationships between remotely sensed reflectance and water quality parameters. However, they are heavily reliant on the range and context of data used for model training, the parameterization, and the volume of training data available, which may result in overfitting, particularly in models with numerous input variables subject to collinearity.
The current literature on water quality assessment through remote sensing in Lake Chapala and its surrounding water bodies is relatively limited, and existing studies present certain limitations that this research seeks to address. Several studies have estimated water quality indices or parameters using SPOT images [11] or earlier Landsat products, such as Landsat 7 ETM+ and Standard Level 1 products [12]. These analyses, however, rely on Top of Atmosphere (TOA) reflectance rather than surface reflectance, which introduces inherent uncertainties primarily due to atmospheric noise. In contrast, some studies have utilized machine or deep learning models with higher-quality products, such as Landsat 8 or Sentinel-2 and Sentinel-3, to estimate the spatial distribution of pollutants and their temporal dynamics in water, correlating these with historical data from official sources [13,14]. While machine learning methods effectively capture the natural, often nonlinear, behavior of water quality parameters, their results are heavily contingent on the range of values used for training or validation, which may not include extreme values associated with extraordinary events.
Numerous studies fail to account for two critical physical phenomena in image correction: the sun-glint effect, which is linked to specular reflection at the surface of low-energy water bodies, and the whitecap effect, characteristic of aquatic environments with waves [15]. Reflectance in inland water bodies is typically very low (often less than 10%); thus, correcting for these effects is essential to mitigate the uncertainty caused by radiometric oversaturation noise resulting from these natural light phenomena. Additionally, the majority of these studies rely on estimations derived from asynchronous measurements between satellite data and in situ water quality parameters. In the context of a highly dynamic water body such as Lake Chapala, the temporal discrepancy between field measurements and satellite overpasses introduces significant uncertainty in satellite estimates, particularly when the water body is subject to temperature fluctuations or significant winds (even with diurnal variations). This discrepancy hinders the establishment of meaningful correlations between both data sources. Furthermore, the validation of satellite reflectance through field measurements is another critical aspect that is seldom incorporated into water quality analysis in existing research.
No research to date has introduced an analytical or semi-analytical model that delineates the inherent optical properties of Lake Chapala. However, we have previously advocated for the use of empirical models to estimate parameters such as turbidity across different seasons in Lake Chapala [16]. These models are derived from quasi-synchronous measurements of turbidity and satellite passage, with temporal discrepancies ranging from hours to days, conducted on two separate dates. This methodology has produced representative models for two distinct seasons. Additionally, we have compared these models with historical records to evaluate their predictive capacity for typical turbidity conditions over different timescales [17]. As our aim is to develop a standardized model for Lake Chapala, the empirical model approach proves beneficial in achieving this objective.
The circulation of breezes associated with regional atmospheric patterns exerts a significant influence on the mixing mechanisms within the water column, leading to variations in the lake’s thermal regime. This suggests that the lake exhibits considerable dynamism, particularly on its eastern side [18,19]. Historical records from public entities provide substantial data; however, this information is seldom synchronized with satellite data. Given the lake’s dynamic nature, the water circulation in Lake Chapala complicates the assessment of water quality using remote sensing and machine learning based on historical records. An empirical model approach with quasi-synchronous measurements is deemed most appropriate for characterizing turbidity. Therefore, this study aims to establish a standardized turbidity model tailored to Lake Chapala. We employed satellite surface reflectance, in situ surface reflectance, and turbidity measurements to evaluate data quality before modeling, develop standard empirical models, and validate their accuracy. The latter two measurements were conducted simultaneously and nearly synchronously with satellite data acquisition. Furthermore, turbidity data from official sources were utilized to contextualize the model’s response within the framework of historical turbidity patterns.
2. Data Sources and Methodology
2.1. Study Area
Lake Chapala, the largest lake in Mexico, spans a surface area of approximately 1100 km², with a length of 75 km and an average width of 5.5 km (Figure 1). It is recognized as one of the largest and shallowest lakes in tropical climates [18,20]. The primary sources of water inflow are the Lerma River and La Pasión Creek, which also contribute significantly to the influx of inorganic sediments [4]. Consequently, since the late 1970s, the eastern side of the lake has exhibited the highest concentration of solids [21]. The turbidity of Lake Chapala has notably increased following a reduction in storage volume at the end of the 1970s [5]. Turbidity is predominantly influenced by suspended matter, primarily composed of clay-rich sediments, which undergo constant recirculation within the water column [4,22]. The eastern sector of the lake is particularly noteworthy due to its elevated turbidity levels. Previous research has demonstrated that this shallow water area provides favorable conditions, such as water column mixing, photic depth, and nitrogen availability as a limiting nutrient for phytoplankton photosynthesis, thereby promoting occasional algal growth [20,23]. Additionally, significant seasonal variations in turbidity have been reported in this area, with higher turbidity during the dry season and lower turbidity during the rainy season, associated with the depth of the water column [4]. In light of these considerations and through the analysis of historical Landsat images, eight sampling sites were identified to conduct five campaigns for measuring turbidity and surface reflectance, in proximity to certain sites of the National Water Commission (NWC) in Lake Chapala (Figure 1).
The Lerma River constitutes one of the principal hydrographic basin systems in Mexico, encompassing a surface area of 47,116 km² and exhibiting an average annual surface runoff of 4742 hm³ [24]. The river is responsible for transporting significant quantities of solids, a result of erosion processes mainly due to inadequate soil management practices [25,26]. Furthermore, the Lerma River carries pollutants from upstream areas to the lake, originating from agricultural, livestock, and industrial activities. Some pollutants remain untreated, threatening the lake’s ecosystem [27,28].
2.2. Historical Turbidity Measurements
In Mexico, the NWC is responsible for overseeing the country’s water quality monitoring system. They conducted measurements at 34 sampling sites in Lake Chapala. We utilized data sources from 2000 to 2018 to calculate turbidity statistics, as well as to ascertain trends, typical behavior, and intra-annual variations for reference purposes. This dataset comprised 904 turbidity records for the specified period. From 2005 to 2018, an average of 62 measurements were recorded annually, corresponding to two samplings per site per year.
2.3. Landsat Image Processing
A collection of 41 Landsat images, encompassing six spectral bands within the visible-shortwave infrared region (VIS-SWIR2), was assembled from 22 June 2023 to 23 November 2024. This dataset was obtained from the Landsat 8 and 9 satellites (Collection 2, Level 2), which are products derived from the Land Surface Reflectance Code (LaSRC). These products incorporate implicit radiometric enhancements, atmospheric and topographic corrections, and mitigation of bidirectional effects associated with the geometric relationship between the sun and sensor angles [29,30], thereby ensuring spatial, temporal, and radiometric conformity. Such procedures provide precise measurements of reflectance at the Earth’s surface, thereby enhancing the consistency and comparability of images captured at different times [31].
We considered additional data concerning the pixel quality assessment (PQA) band of Landsat 8 and 9 to effectively filter artifacts from the images and isolate the affected surface. The PQA band is a derivative sub-product of the LaSRC, comprising pixel values encoded within a 16-bit radiometric depth range (up to 2^16), specifically digital numbers (DN) from 1 to 57,240 [32]. These values are indicative of the likelihood of pixels corresponding to artifacts, with spectral response related to snow, ice cover, clouds, cloud shadows, or water. The arrangement of these values is such that a higher digital number indicates a higher probability. We implemented a geoprocessing procedure to filter the images, thereby retaining pixels that are unaffected by clouds and cloud shadows.
We utilized a conditional algorithm to reclassify the PQA band from DN values [32] to Boolean values. Specifically, we assigned a value of 1 to pixels exhibiting a low probability of cloud and cloud shadow, defined as DN ≤ 22,270 and DN ≠ 1. Values not meeting this criterion were nullified using the following expression:
where input1 corresponds to the PQA band. The second step calculates the reflectance for the VIS-SWIR2 bands over the surface enabled by the previous PQA-band output, using the following expression:
where input1 is the PQA band and input2 is any VIS-SWIR2 band. The parameters M and A represent the multiplicative scale factor (2.75 × 10⁻⁵) and additive scale factor (−0.2), respectively, for calibrating the DN to reflectance values. Furthermore, a ratio of the SWIR1 band over the BLUE band was employed to extract pixels corresponding to the water surface of Lake Chapala as follows:
where the SWIR1 and BLUE bands correspond to input1 and input2, respectively, while input3 represents any band of the VIS-SWIR2 spectrum. This step leverages the observation that the disparity in reflectance values between these bands is usually most noticeable when examining the spectral response of water. As a result, ratios less than 1 suggest pixels associated with inland water bodies. Furthermore, the images underwent correction to mitigate the sun-glint effect, a common occurrence on water surfaces. This effect is caused by specular reflection at the air–water interface, just above the water column’s surface, and is directed towards the satellite [15]. A statistical approach (Equation (1)) was used, which originally relies on the variance and covariance (Equation (2)) of the reflectance between the VIS bands and the near-infrared (NIR) band [33].
where the terms denote, in order: the reflectance corrected for the sun-glint effect; the reflectance in any band and the reflectance of the NIR band; and the coefficient of covariance.
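The masking, scaling, and water-extraction steps described earlier in this subsection can be illustrated with a minimal Python sketch. It is not the original geoprocessing expression: the function names (`clear_mask`, `to_reflectance`, `keep_water`) and the use of NumPy arrays are assumptions made for illustration, while the thresholds and scale factors are those stated in the text.

```python
import numpy as np

# Values stated in this subsection.
PQA_MAX_CLEAR_DN = 22270   # DN <= 22,270: low probability of cloud or shadow
PQA_EXCLUDED_DN = 1        # DN == 1 is excluded
SCALE_M = 2.75e-5          # multiplicative scale factor M
SCALE_A = -0.2             # additive scale factor A


def clear_mask(pqa):
    """Return 1.0 where the PQA DN passes the test, NaN (null) elsewhere."""
    pqa = np.asarray(pqa)
    ok = (pqa <= PQA_MAX_CLEAR_DN) & (pqa != PQA_EXCLUDED_DN)
    return np.where(ok, 1.0, np.nan)


def to_reflectance(mask, band_dn):
    """Scale DN to reflectance on the surface enabled by the mask."""
    return mask * (SCALE_M * np.asarray(band_dn, dtype=float) + SCALE_A)


def keep_water(swir1, blue, band):
    """Keep pixels of `band` whose SWIR1/BLUE reflectance ratio is below 1."""
    swir1 = np.asarray(swir1, dtype=float)
    blue = np.asarray(blue, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = swir1 / blue
    # NaN ratios compare as False, so null pixels stay null.
    return np.where(ratio < 1.0, band, np.nan)


# Illustrative (hypothetical) DN values for a 2 x 2 tile.
pqa_tile = np.array([[21824, 22280], [1, 21952]])
nir_dn = np.array([[10000, 12000], [9000, 20000]])
nir_reflectance = to_reflectance(clear_mask(pqa_tile), nir_dn)
```

Representing nullified pixels as NaN keeps the mask multiplicative, so the same mask array can be applied to each of the six VIS-SWIR2 bands.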
In this study, we applied the method with a slight modification by using the SWIR2 band as a reference to correct the sun-glint effect in the VIS-NIR bands, as shown in previous studies [34,35]. The use of the SWIR2 band helps to reduce the light-scattering effect that might be observed in the NIR region due to the optical response of suspended matter.
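A covariance-based sun-glint correction with a reference band can be sketched as follows. The sketch assumes a commonly used regression form, in which each band is reduced by its slope against the reference band (covariance divided by the reference variance) multiplied by the excess of the reference band over its minimum. The exact form of Equations (1) and (2) may differ, for example by subtracting the mean instead of the minimum. The function name `deglint` and the choice to compute the statistics over all valid pixels are also assumptions.

```python
import numpy as np


def deglint(band, ref):
    """Reduce sun glint in `band` using `ref` (e.g., SWIR2) as the reference.

    Assumed form: band - slope * (ref - min(ref)),
    with slope = cov(band, ref) / var(ref) over valid (finite) pixels.
    """
    band = np.asarray(band, dtype=float)
    ref = np.asarray(ref, dtype=float)
    valid = np.isfinite(band) & np.isfinite(ref)
    b, r = band[valid], ref[valid]
    cov = np.mean((b - b.mean()) * (r - r.mean()))
    var = np.mean((r - r.mean()) ** 2)
    slope = cov / var
    return band - slope * (ref - r.min())


# Synthetic (hypothetical) example: a constant water signal of 0.05
# contaminated by glint that scales linearly with the reference band.
swir2 = np.array([0.01, 0.02, 0.03, 0.04])
nir = 0.05 + 2.0 * (swir2 - swir2.min())
nir_corrected = deglint(nir, swir2)
```

In the synthetic case the glint term is removed entirely because it is exactly linear in the reference band; real imagery would leave a residual that depends on how well the reference band isolates the glint signal.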
2.4. Field Measurements of Reflectance and Turbidity
We conducted five campaigns to measure surface reflectance in situ at eight sampling sites on the eastern side of Lake Chapala between 2023 and 2025 (Table 1), an area characterized by the highest turbidity contrast in the lake [16]. The number of sampling campaigns and sites was determined based on available resources, the size of the lake, and the time lag between in situ measurements and the satellite pass.
The number of samples was limited; however, we conducted a comprehensive analysis of Landsat images from the last 10 years to meticulously identify optimal locations for the sampling sites. We selected sites in areas where sediment plumes are typically observed, as well as sites where such plume movement is atypical. Furthermore, historical turbidity data from monitoring sites in proximity to our sampling locations enabled us to validate these selections. Notably, our field measurements encompass 80% of the turbidity range historically recorded by the NWC.
These conditions, coupled with the interest in comparing water surface reflectance from two different information sources, necessitated the scheduling of seasonal sampling. Historical turbidity data facilitated the determination of seasonal frequency, as it more accurately reflects intra-annual turbidity variation compared to monthly frequency analysis. A spectrometer (StellarNet, Black Comet SX-200, Tampa, FL, USA) with a spectral resolution of 0.5 nm was employed to capture the spectral signature of the water surface in the VIS-NIR region (450–900 nm). At each site, three measurements were conducted, with each measurement representing the average of five spectra (900 reflectance records per spectrum), resulting in 15 spectra per site and 120 spectra per campaign. We calculated the average reflectance of the VIS-NIR region, constrained to the bandwidth defined by the Landsat 8 and 9 platforms (https://www.usgs.gov/landsat-missions/landsat-9, accessed on 4 May 2025) for subsequent analysis.
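The band averaging described above can be sketched in Python as follows. The sketch assumes a simple unweighted mean within the nominal Landsat 8 and 9 OLI band limits, not a convolution with the sensor's relative spectral response. The names `OLI_BANDS_NM`, `band_average`, and `site_mean` are hypothetical.

```python
import numpy as np

# Nominal Landsat 8/9 OLI band limits in nm for the VIS-NIR bands.
OLI_BANDS_NM = {
    "blue": (450, 510),
    "green": (530, 590),
    "red": (640, 670),
    "nir": (850, 880),
}


def band_average(wavelengths_nm, reflectance):
    """Average a spectrum within each OLI band (unweighted mean)."""
    w = np.asarray(wavelengths_nm, dtype=float)
    r = np.asarray(reflectance, dtype=float)
    averages = {}
    for name, (low, high) in OLI_BANDS_NM.items():
        inside = (w >= low) & (w <= high)
        averages[name] = float(r[inside].mean())
    return averages


def site_mean(measurements):
    """Average the per-site measurements (rows) into one site spectrum."""
    return np.mean(np.asarray(measurements, dtype=float), axis=0)


# Synthetic (hypothetical) spectra on a 0.5 nm grid from 450 to 900 nm.
wavelengths = np.arange(450.0, 900.5, 0.5)
three_measurements = np.vstack([wavelengths / 1000.0] * 3)
site_bands = band_average(wavelengths, site_mean(three_measurements))
```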
The in situ surface reflectance measurements were utilized to validate the satellite observations. However, the presence of cloudy sky conditions partially compromised the quality of satellite imagery across all campaigns. Notably, during two of the five data collection campaigns (29 November 2023 and 7 May 2024), the overcast sky resulted in a complete absence of satellite reflectance data, thereby affecting the comparative analysis between the two data sources. In the remaining three campaigns, conducted on 12 March 2024, 7 November 2024, and 7 March 2025, we calculated the average in situ surface reflectance at eight sites within a ±3-h window surrounding the Landsat satellite overpass, which is around 11:18 local time.
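Selecting field samples within the ±3-h window can be illustrated with a short sketch. The overpass time (11:18 local time) and the window width come from the text; the function name `within_window` and the use of naive local datetimes are assumptions.

```python
from datetime import datetime, timedelta

OVERPASS_LOCAL = (11, 18)       # hour, minute of the Landsat overpass (local time)
WINDOW = timedelta(hours=3)     # ±3-h tolerance stated in the text


def within_window(sample_time, overpass_hm=OVERPASS_LOCAL, window=WINDOW):
    """Return True if a sample falls within the window around the overpass
    on the same calendar day (naive local time assumed)."""
    overpass = sample_time.replace(
        hour=overpass_hm[0], minute=overpass_hm[1], second=0, microsecond=0
    )
    return abs(sample_time - overpass) <= window


# Hypothetical sampling times on one campaign date.
samples = [
    datetime(2024, 3, 12, 9, 0),
    datetime(2024, 3, 12, 14, 18),
    datetime(2024, 3, 12, 15, 0),
]
selected = [t for t in samples if within_window(t)]
```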
These samples are considered representative of the spring and fall seasons, respectively, to evaluate the model’s response within these temporal contexts. Turbidity was concurrently measured in the field at the same eight sampling sites. Turbidity was optically quantified in nephelometric turbidity units (NTU) using a Hanna portable turbidimeter HI-93703 scanner (Nusfalau, Romania), which detects light dispersion at 860 ± 10 nm, in accordance with ISO 7027-1:2016 standards. This spectral range closely aligns with the Landsat NIR spectral range (850–880 nm), providing certainty for comparative analysis between data sources, while also ensuring independence from measurements in other spectral regions and the multispectral analysis itself. Both in situ turbidity and surface reflectance were measured quasi-synchronously with the satellite overpass.
2.5. Time Series Construction and Statistical Analysis
All images were integrated into a datacube to construct a time series and analyze the statistical properties of reflectance in Lake Chapala. The datacube consists of 246 bands (41 images, each consisting of six bands), partially covering the surface of Lake Chapala. The time series represents the mean reflectance value and complementary statistical parameters for each band in the datacube. This time series facilitates the identification of intra-annual variations in water reflectance between June 2023 and November 2024. The images were affected by cloud cover and their shadows, resulting in surfaces with null pixels. A sufficiency criterion of 80 percent or greater was employed to recover pixels with valid information in each band of the datacube. This criterion ensures the retention of all pixels with reflectance values in at least 196 of the 246 bands, equivalent to 33 of the 41 images. The calculation of descriptive statistics at the pixel level enabled the characterization of water reflectance behavior over time. The integration of statistical parameters (average, minimum, maximum, standard deviation, range, and count) per pixel and band facilitated the identification of spatial contrasts in the typical spectral response of water in Lake Chapala, particularly concerning the NIR surface reflectance.
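The sufficiency criterion and the per-pixel statistics can be sketched as follows for a single spectral band. The sketch assumes the time series is held as a NumPy array of shape (images, rows, columns) with NaN marking null pixels; the names `required_images` and `pixel_statistics` are hypothetical.

```python
import math
import warnings

import numpy as np


def required_images(n_images, min_fraction=0.8):
    """Smallest number of valid images that satisfies the sufficiency criterion."""
    return math.ceil(min_fraction * n_images)


def pixel_statistics(stack, min_fraction=0.8):
    """Per-pixel statistics over time for one band.

    stack: array (n_images, rows, cols); NaN marks cloud, shadow, or non-water.
    Pixels with fewer valid images than required are returned as NaN.
    """
    stack = np.asarray(stack, dtype=float)
    count = np.sum(np.isfinite(stack), axis=0)
    keep = count >= required_images(stack.shape[0], min_fraction)
    masked = np.where(keep[None, :, :], stack, np.nan)
    with warnings.catch_warnings():
        # All-NaN pixels are expected here and yield NaN statistics.
        warnings.simplefilter("ignore", category=RuntimeWarning)
        stats = {
            "mean": np.nanmean(masked, axis=0),
            "min": np.nanmin(masked, axis=0),
            "max": np.nanmax(masked, axis=0),
            "std": np.nanstd(masked, axis=0),
        }
    stats["range"] = stats["max"] - stats["min"]
    stats["count"] = np.where(keep, count, 0)
    return stats


# Synthetic (hypothetical) series: 10 images, 1 x 2 pixels.
series = np.full((10, 1, 2), 0.1)
series[:2, 0, 0] = np.nan   # 8 valid images: kept
series[:3, 0, 1] = np.nan   # 7 valid images: discarded
result = pixel_statistics(series)
```

For the 41 images used in this study, `required_images(41)` gives 33, which matches the number of images stated in the text.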
2.6. Spatial Prediction Model
The model’s construction was undertaken in three distinct phases. Initially, satellite surface reflectance was validated through field measurements. This comparative analysis between data sources enabled the evaluation of the reliability of satellite data via linear correlations for the dates of 12 March 2024, 7 November 2024, and 7 March 2025. In the subsequent phase, nonlinear correlations between in situ turbidity and satellite NIR surface reflectance measurements facilitated the development of empirical turbidity models to estimate the magnitude and spatial distribution of turbidity in Lake Chapala. Quasi-synchronous data from the three campaigns were utilized to construct spatial prediction models. Two second-degree polynomial functions were formulated to estimate turbidity based on satellite surface reflectance at the eight sampling sites for each campaign. These models were applied to the corresponding Landsat imagery to estimate turbidity in Lake Chapala. The reliability of each turbidity model was assessed using linear correlations with field observations at the eight sampling sites. Furthermore, the three campaigns were integrated into a single series to establish a standardized annual model (2024–2025). In the final phase, the response of the turbidity prediction models was evaluated for spring (12 March 2024 and 7 March 2025), fall (7 November 2024), and the annual cycle. The statistical validation was conducted by applying the models to a specific case evaluated in 2016 [16]. The response of each model was assessed through linear correlations between the turbidity estimated by the models and the turbidity measured in that study at the 15 sampling sites. Validation was also performed using the historical NWC data measured at 34 sampling sites. The four models (Equations (3)–(6)) were applied to the NIR band for all 41 Landsat images. A turbidity time series derived from these images was integrated for each model to calculate the pixel-by-pixel descriptive statistics and map the estimated turbidity across the entire lake, as detailed in Section 2.5.
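The fitting and assessment steps of this subsection can be illustrated with a minimal sketch: a second-degree polynomial relating NIR surface reflectance to turbidity, followed by a squared linear correlation between estimated and observed values. The data below are synthetic and hypothetical, not field measurements, and the coefficients do not correspond to Equations (3)–(6). The function names are assumptions.

```python
import numpy as np


def fit_turbidity_model(nir_reflectance, turbidity_ntu):
    """Fit turbidity = a * R^2 + b * R + c by least squares."""
    coefficients = np.polyfit(nir_reflectance, turbidity_ntu, deg=2)
    return np.poly1d(coefficients)


def r2_linear(observed, estimated):
    """Squared Pearson correlation between observed and estimated values."""
    r = np.corrcoef(np.asarray(observed, float), np.asarray(estimated, float))[0, 1]
    return float(r ** 2)


# Synthetic (hypothetical) data for eight sites.
nir_sites = np.array([0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09])
turbidity_sites = 1000.0 * nir_sites ** 2 + 200.0 * nir_sites + 5.0

model = fit_turbidity_model(nir_sites, turbidity_sites)
score = r2_linear(turbidity_sites, model(nir_sites))

# Applying the model to an NIR reflectance image; NaN (null) pixels stay NaN.
nir_image = np.array([[0.05, np.nan], [0.08, 0.03]])
turbidity_map = model(nir_image)
```

Because the relationship is empirical, estimates for reflectance values outside the range observed at the sampling sites are extrapolations and should be interpreted with caution.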
5. Conclusions
This study demonstrates that satellite-derived surface reflectance, when accurately corrected and validated with in situ spectral measurements, can effectively predict turbidity in shallow tropical lakes. The model was constructed using data from eight sampling sites; thus, its results could be subject to scrutiny due to the limited data volume. Nevertheless, the model predictions, based on data from a few rigorously selected sites in the eastern sector of the lake, can estimate turbidity across the entire lake with an acceptable degree of certainty. The empirical models developed, particularly the annual model, exhibited acceptable statistical performance, with an average R² of 0.65 by season or annual cycle (range 0.55 to 0.81) for the whole lake and an average R² of 0.71 (range 0.53 to 0.92) for the eastern side of the lake, thereby establishing their suitability as tools for spatial and temporal turbidity assessment. The surface reflectance data from Landsat 8 and 9 were optimized for water quality analysis through the integration of radiometric corrections, water masking, and sun-glint correction. The strong correlation between field-measured and satellite-derived reflectance underscores the reliability of these products, particularly in the NIR band used for turbidity estimation. The statistical validation with historical turbidity records and prior in situ measurements has demonstrated that the proposed models effectively capture the spatial gradients and seasonal dynamics of turbidity across Lake Chapala. This methodology is particularly advantageous in environments with limited ground monitoring infrastructure, where satellite imagery can address data gaps in both temporal and spatial dimensions.
This study highlights the potential of empirical, single-band remote sensing models, particularly when integrated with rigorous field validation, for scalable water quality monitoring in dynamic freshwater systems. This methodology was applied to the specific case of Lake Chapala, Mexico; however, it could be applicable to other shallow, sediment-laden tropical lakes facing similar monitoring and management challenges.