Article

Climate and Vegetation Dominate Lake Eutrophication in the Inner Mongolia–Xinjiang Plateau (2000–2024)

1
College of Grassland Science and Technology, China Agricultural University, Beijing 100193, China
2
College of Land Science and Technology, China Agricultural University, Beijing 100193, China
3
Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
4
School of Energy and Environmental Engineering, University of Science and Technology Beijing, Beijing 100083, China
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Remote Sens. 2026, 18(7), 988; https://doi.org/10.3390/rs18070988
Submission received: 2 February 2026 / Revised: 4 March 2026 / Accepted: 9 March 2026 / Published: 25 March 2026

Highlights

What are the main findings?
  • The plateau-wide annual mean TLI stayed around 48–49 (mesotrophic to lightly eutrophic) while showing a modest but statistically significant upward trend during 2000–2024 (Sen’s slope = 0.0158 TLI yr−1; p = 0.006).
  • Spatial heterogeneity intensified: 54% of lakes showed increasing TLI (15.6% significantly), with clusters in northeastern basins (Hulun–Erguna–Hailar) and several endorheic/desert–oasis basins, while many high-altitude cold-region basins remained comparatively stable.
What are the implications of the main findings?
  • Vegetation greenness and air temperature explain most interannual TLI variability overall (~34% and ~22%), but population/land-use pressure and grazing dominate in specific basins—supporting basin-specific eutrophication management.
  • The long-term, lake-resolved TLI baseline enables rapid screening and monitoring prioritization in data-sparse dryland basins, with extra attention to hotspot basins.

Abstract

Lakes on the Inner Mongolia–Xinjiang Plateau (IMXP) are increasingly vulnerable to eutrophication under climate change and human pressure, yet long-term monitoring remains limited by sparse field sampling. Here, we reconstruct multi-decadal trophic dynamics across the IMXP using Landsat time series and temporally transferable machine-learning models and further quantify the underlying natural and anthropogenic drivers. We collected monthly in situ water-quality observations (chlorophyll-a, Chl-a; total phosphorus, TP; total nitrogen, TN; Secchi depth, SD; and permanganate index, CODMn) and calculated the trophic level index (TLI). After rigorous quality control and monthly aggregation, we obtained a dataset of 1345 matched lake–month samples spanning 2000–2024 and divided it into a training set (n = 1076; ≤2019) and an independent test set (n = 269; 2020–2024) to evaluate temporal transferability. We used Google Earth Engine to generate monthly surface reflectance composites from Landsat 7 ETM+, Landsat 8 OLI, and Landsat 9 OLI-2. Four supervised regression algorithms were trained to estimate TLI: ridge regression (RR), support vector regression (SVR), random forest (RF), and eXtreme Gradient Boosting (XGBoost). On the independent test period, XGBoost performed best (R2 = 0.780, RMSE = 3.290, MAE = 1.779), followed by RF (R2 = 0.770, RMSE = 3.364), SVR (R2 = 0.700, RMSE = 3.842), and RR (R2 = 0.630, RMSE = 4.267); we then used XGBoost to reconstruct monthly and yearly TLI for 610 perennial grassland lakes from 2000 to 2024. Over this period, the annual mean TLI (48–49) across the IMXP exhibited a statistically significant upward trend (slope = 0.0158 TLI yr−1; 95% confidence interval (CI) = 0.0050–0.0267; p = 0.006). Meanwhile, spatial heterogeneity was pronounced (multi-year mean TLI: 41.51–59.70).
High values concentrated in endorheic and desert–oasis basins (e.g., Eastern Inner Mongolia Plateau, >51), whereas lower values characterized high-altitude regions (e.g., Yarkant River, <45). Overall, trends ranged from −0.49 to 0.51 yr−1, increasing in 54% of lakes (15.6% significantly) and decreasing in 46% (15.4% significantly). Attribution analyses identified NDVI (33.92%) and temperature (21.67%) as dominant drivers (55.59% combined), followed by precipitation (13.99%) and human proxies (30.42% combined: population 10.66%, grazing 10.31%, built-up 9.45%). Across 53 sub-basins, NDVI was the primary driver in 28, followed by temperature (11), population (7), precipitation (3), grazing (3), and built-up land (1); notably, the top two drivers explained 56.6–87.1% of variations. TWFE estimates revealed bidirectional NDVI effects (significant in 31/53): positive associations in 22 basins were linked to nutrient retention, contrasting with negative effects in nine basins associated with agricultural return flows. Temperature effects were significant in 15 basins and predominantly negative (14/15), except for the Qiangtang Plateau. Overall, eutrophication risk across the IMXP lake region reflects the combined influences of climatic conditions, vegetation conditions, and human activities, with their relative contributions varying among basins.

1. Introduction

Plateau lakes deliver critical ecosystem services, including storing and regulating freshwater through buffering runoff peaks, sustaining base flows during dry seasons, and recharging groundwater in water-scarce regions [1]. They also sustain aquatic and riparian biodiversity and mediate regional energy–water exchanges [1,2]. Beyond these hydrological functions, lakes actively participate in regional carbon cycling by emitting CO2 and CH4 to the atmosphere while simultaneously burying organic carbon in their sediments [3,4]. Recent evidence indicates that lakes in arid Inner Mongolia are significant greenhouse gas sources, with emission patterns governed by nutrients, salinity, and catchment productivity [3], while, nationally, Chinese lakes have shifted from a net CO2 source to a net sink over the past two decades [4]—highlighting that changes in lake trophic conditions can feed back to regional climate. Despite their importance, plateau lakes are intrinsically vulnerable: long water residence times and limited flushing predispose them to nutrient accumulation under hydroclimatic variability [2,5], and human-accelerated nutrient loading—cultural eutrophication—has further exacerbated this fragility by promoting algal dominance and undermining downstream water security [5,6]. It is important to note, however, that external nutrient loading (i.e., the mass flux of nitrogen and phosphorus entering a lake) is not equivalent to a lake’s trophic state; the latter is a composite response integrating nutrient concentrations, phytoplankton biomass, water transparency, and organic matter levels, as captured by indices such as TLI [7,8].
IMXP exemplifies these challenges. Situated in the mid-latitude interior drylands, IMXP harbors over 600 perennial lakes embedded in a mosaic of grassland, desert, and irrigated oasis landscapes [9,10,11]. Several lake-specific studies have revealed alarming trophic trajectories within this region: in Hulun Lake, organic carbon burial rates increased 3.75-fold over the past century in concert with rising temperatures [12], while overgrazing has been identified as a primary driver of elevated organic matter concentrations [13]; a recent review further confirmed that the dominant control on Hulun Lake’s water quality shifted from natural climate variability to anthropogenic land-use change around 1998 [14]. For Daihai Lake, temperature-driven evaporative losses have caused a 64% area reduction since the 1980s, concentrating nutrients and accelerating water quality deterioration [15]. In Wuliangsuhai Lake, over 90% of surrounding farmland drainage enters the lake, making agricultural return water the primary nutrient source [16,17]. A regional study of seven Inner Mongolian lakes linked spatial water quality differences to temperature, precipitation, and land-use patterns [18], while a three-decade Mongolian Plateau dataset showed that 71.83% of lakes were eutrophic by 2020 [19]. Meanwhile, a remote sensing analysis of water clarity across IMXP (2000–2019) indicated substantial spatial heterogeneity and climate-sensitivity of lake optical properties [20], and rapid lake losses on the broader Mongolian Plateau have been attributed to drying trends [21]. However, these studies—valuable as they are—have been limited to individual lakes or small lake groups, span relatively short time periods, and lack the systematic cross-basin comparisons needed to disentangle the relative roles of climate, vegetation, and human activities across the full extent of IMXP.
To help address these limitations, satellite remote sensing provides a scalable option; however, reconstructing long-term trophic dynamics still requires multi-decadal observations with consistent calibration and sufficient spatial resolution. Landsat provides the longest continuous record of calibrated, medium-resolution (30 m) multispectral imagery, which is particularly valuable for monitoring the numerous small lakes (1–10 km2) that dominate IMXP but are not well resolved by coarser-resolution sensors such as MODIS (250–500 m) or Sentinel-3 OLCI (300 m) [22]. Cloud-computing platforms such as Google Earth Engine further enable efficient processing of decades of Landsat imagery into standardized time series [23]. Regarding trophic state assessment, TLI—as specified in the Chinese national standard HJ 91.2–2022—integrates five water quality parameters (Chl-a, TP, TN, SD, and CODMn) into a single composite index, differing from Carlson’s trophic state index (TSI) which relies on three parameters (Chl-a, TP, and SD) [7]; we adopt TLI here to ensure direct compatibility with China’s monitoring framework [8]. For retrieval algorithms, empirical and semi-analytical models offer strong interpretability but typically require lake-specific parameterization [24,25,26,27]; in contrast, machine learning methods can learn the nonlinear mapping between multi-source spectral features and TLI, making them more suitable than empirical and semi-analytical approaches for cross-basin, long-term retrieval [24,28]. Nevertheless, because the relationship between lake trophic state and remote sensing signals is not temporally stable under shifting climate and land-use conditions, the temporal transferability of such models must be rigorously evaluated using independent time periods rather than random splits that allow period-specific signals to leak across training and testing sets [24,29]. 
Accordingly, three key challenges can be identified: (i) IMXP lacks a long-term (>20-year), lake-resolved TLI baseline covering its full lake inventory; (ii) existing machine learning retrieval models for TLI have not been tested for temporal transferability using strictly independent time periods; and (iii) the relative contributions of climate, vegetation, and human activities to trophic variability have not been systematically compared across IMXP’s diverse sub-basins. Here, we reconstruct monthly and yearly TLI for 610 perennial lakes across IMXP from 2000 to 2024 using Landsat time series and in situ observations. Specifically, we (i) compile and quality-control monthly in situ water-quality data, calculate TLI, and derive lake-scale predictors from monthly Landsat composites; (ii) evaluate four supervised regression models (RR, SVR, RF, and XGBoost) using a time-based split to assess temporal transferability; (iii) characterize multi-decadal spatiotemporal patterns of TLI; and (iv) quantify basin-specific relationships between TLI variability and climate, vegetation, grazing, and land-use indicators to inform basin-scale management.

2. Materials and Methods

2.1. Study Area

Located in the mid-latitude interior drylands (≈60–130°E, 30–50°N), IMXP is characterized by a cold arid to semi-arid climate with strong continentality [9,10,11]. Precipitation is limited and seasonal (concentrated in the warm season), while winters are long and dry. High evaporation intensifies regional water deficits, making surface water highly sensitive to hydroclimatic variability. The region features pronounced relief—ranging from high mountains to low-lying basins—which creates strong runoff gradients and diverse hydrological sub-units (Figure 1).
This study focuses on 610 perennial lakes (annual surface area ≥ 1 km2 during 2000–2024) compiled from HydroLAKES [30]. The ≥1 km2 criterion is frequently adopted in Landsat-based trophic-state assessments because it balances spatial coverage with spectral robustness: at 30 m resolution, lakes above this size typically retain sufficient interior water pixels after excluding nearshore mixed pixels to yield stable lake-mean reflectance [31,32]. Spatially, these lakes show evident clustering in the western and northeastern sectors. According to standard classification [33], the inventory is dominated by small lakes: 496 lakes (81.3%) fall within the 1–10 km2 class, and 99 (16.2%) are 10–100 km2. Large lakes are scarce, with only 9 (1.5%) in the 100–500 km2 range and 6 (1.0%) ≥500 km2.
Watershed boundaries were derived from HydroBASINS [34]. While the broader region contains over 70 sub-basins, analyses were restricted to the 53 sub-basins that contain at least one target lake (B01–B53; Figure A1; Table A1). These served as consistent spatial units for trend detection and driver attribution.

2.2. Data and Processing

2.2.1. Water Quality Data and Processing

TLI is calculated using five key parameters: Chl-a, TN, TP, CODMn, and SD. The data for Chl-a, TN, TP, and CODMn were primarily sourced from the National Surface Water Quality Automatic Monitoring System [35], with supplementary data for certain lakes obtained from the National Tibetan Plateau Data Center [36] and the Lake-Watershed Science Data Center [37]. The SD data were calculated from synchronous turbidity (Tbdy) values obtained from these same data sources based on the standard conversion formula [38,39]:
$$SD = 3.390 \times \mathrm{Tbdy}^{0.637} \tag{1}$$
To ensure comparability across sources and to reduce the influence of irregular sampling, raw observations were standardized to monthly time series through three sequential steps: (i) compute daily means by averaging same-day measurements; (ii) remove outliers using the IQR rule, flagging values outside [Q1 − 3IQR, Q3 + 3IQR]; and (iii) compute monthly means by averaging daily means within each calendar month. Finally, the monthly composite TLI was calculated from Chl-a, TN, TP, CODMn, and SD following the standard HJ 91.2–2022 [8].
$$TLI = \sum_{j=1}^{5} W_j \times TLI_j \tag{2}$$
$$TLI_j = 10 \times \left(a_j + b_j \ln x_j\right) \tag{3}$$
$$W_j = \frac{r_j^{2}}{\sum_{j=1}^{5} r_j^{2}} \tag{4}$$
where xj is the measured value of the j-th water-quality parameter, aj and bj are the coefficients listed in Table 1, TLIj is the sub-index of the j-th indicator, Wj is its correlation-based weight, and rj is the Pearson correlation coefficient between Chl-a and the j-th parameter. The rj and Wj values are provided in Table A2 [8].
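The three aggregation steps and Equations (2)–(4) can be sketched as follows. This is a minimal illustration, not the authors' code. The coefficients (a_j, b_j, r_j) are the values commonly cited for the Chinese TLI and are an assumption here, as are the units; Table 1 and Table A2 remain authoritative. The quartile method ("inclusive") and the function names `monthly_means`, `weights`, and `tli` are also hypothetical choices.

```python
import math
from collections import defaultdict
from statistics import mean, quantiles

# ASSUMPTION: (a_j, b_j, r_j) are the values commonly cited for the Chinese TLI;
# Table 1 and Table A2 of the article are authoritative.
# Assumed units: Chl-a in mg/m^3, SD in m, TP, TN and CODMn in mg/L.
COEFFS = {
    "chla": (2.500, 1.086, 1.00),
    "tp": (9.436, 1.624, 0.84),
    "tn": (5.453, 1.694, 0.82),
    "sd": (5.118, -1.940, -0.83),
    "codmn": (0.109, 2.661, 0.83),
}


def monthly_means(records, k=3.0):
    """Aggregate (date 'YYYY-MM-DD', value) pairs for one lake and parameter."""
    by_day = defaultdict(list)
    for date, value in records:
        by_day[date].append(value)
    # Step (i): daily means of same-day measurements.
    daily = {day: mean(vals) for day, vals in by_day.items()}
    # Step (ii): drop daily means outside [Q1 - k*IQR, Q3 + k*IQR].
    # Needs at least two daily values; the quartile method is an assumption.
    q1, _, q3 = quantiles(list(daily.values()), n=4, method="inclusive")
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    by_month = defaultdict(list)
    for day, value in daily.items():
        if low <= value <= high:
            by_month[day[:7]].append(value)
    # Step (iii): monthly means of the retained daily means.
    return {month: mean(vals) for month, vals in sorted(by_month.items())}


def weights():
    """Correlation-based weights W_j = r_j^2 / sum(r_j^2) (Eq. 4)."""
    total = sum(r * r for _, _, r in COEFFS.values())
    return {name: r * r / total for name, (_, _, r) in COEFFS.items()}


def tli(sample):
    """Composite TLI from a dict of monthly mean values (Eqs. 2 and 3)."""
    w = weights()
    return sum(
        w[name] * 10.0 * (a + b * math.log(sample[name]))
        for name, (a, b, _) in COEFFS.items()
    )
```

Because SD carries a negative b_j, clearer water lowers the composite index, while higher Chl-a, TP, TN, or CODMn raises it.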

2.2.2. Satellite Data and Processing

This study uses the monthly TLI as the response variable, with spectral band ratios derived from Landsat surface reflectance as the primary predictors, supplemented by spatiotemporal auxiliary variables (Section 2.2.2 and Section 2.3.1).
Satellite data processing was implemented in Google Earth Engine (GEE; Google LLC, Mountain View, CA, USA) using Landsat Collection 2 Level-2 surface reflectance (L7/L8/L9) for 2000–2024 [23,40]. Within each lake polygon, only pixels identified as permanent water in the JRC Global Surface Water dataset were retained for reflectance extraction, excluding intermittently exposed shoreline and lakebed areas [30]. We utilized VNIR–SWIR bands, retaining scenes with <10% cloud cover. To ensure physical validity, we used the QA band to mask pixels affected by clouds, shadows, and radiometric saturation (where signal intensity exceeds the sensor’s dynamic range) [41,42]. ETM+ data were then harmonized to OLI-equivalent reflectance via band-wise OLS transformations (Equation (5), Table A3) [43]; L5 TM was not included because no comparably validated cross-calibration pathway to OLI exists [43]. Valid water-pixel observations were spatially averaged within each lake polygon and composited into monthly means; this spatiotemporal averaging effectively fills gaps caused by the L7 SLC-off error [42].
$$\rho_{OLI} = a + b\,\rho_{ETM+} \tag{5}$$
The ETM+ reflectance harmonized to OLI/OLI-2 standards exhibited low errors (MAE = 5.42 × 10−4–1.00 × 10−3; RMSE = 6.78 × 10−4–1.26 × 10−3) and strong agreement (R2 = 0.72–0.89), with the lowest consistency in SWIR2 (R2 = 0.72) (Figure A2).
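The pixel screening, reflectance scaling, and Equation (5) can be illustrated outside Google Earth Engine with a small sketch. The scale factor, offset, and QA_PIXEL bit positions are those documented for Landsat Collection 2 Level-2 products. The harmonization coefficients default to an identity placeholder because Table A3 is not reproduced here, and all function names are hypothetical.

```python
# Collection 2 Level-2 surface reflectance scaling: reflectance = DN * scale + offset.
SR_SCALE, SR_OFFSET = 2.75e-05, -0.2
# QA_PIXEL bits to reject: 0 fill, 1 dilated cloud, 2 cirrus, 3 cloud, 4 cloud shadow.
REJECT_BITS = (0, 1, 2, 3, 4)


def is_usable(qa_pixel, qa_radsat=0):
    """True when no rejected QA_PIXEL bit is set and QA_RADSAT is zero.

    Treating any non-zero QA_RADSAT value as unusable is a conservative assumption.
    """
    flagged = any((qa_pixel >> bit) & 1 for bit in REJECT_BITS)
    return not flagged and qa_radsat == 0


def to_reflectance(dn):
    """Convert a scaled integer DN to surface reflectance."""
    return dn * SR_SCALE + SR_OFFSET


def harmonize_etm(rho_etm, a, b):
    """Eq. (5): OLI-equivalent reflectance from ETM+ reflectance."""
    return a + b * rho_etm


def lake_mean(pixels, a=0.0, b=1.0):
    """Mean harmonized reflectance over usable (dn, qa_pixel, qa_radsat) pixels.

    a=0, b=1 is a HYPOTHETICAL identity placeholder; use Table A3 values for ETM+.
    """
    values = [
        harmonize_etm(to_reflectance(dn), a, b)
        for dn, qa, sat in pixels
        if is_usable(qa, sat)
    ]
    return sum(values) / len(values) if values else None
```

Returning `None` for a lake with no usable pixels keeps empty months explicit, so they are not mistaken for zero reflectance when monthly composites are built.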
Candidate predictors were assembled by pooling all band ratios and normalized spectral indices employed in previous studies [44,45,46,47,48,49], yielding an initial set of 17 candidate features. To reduce multicollinearity, variance inflation factors (VIF) were computed and predictors with VIF > 5 were iteratively removed until all remaining predictors satisfied VIF < 5 (Table 2) [50]. B3/B4 exploits the Chl-a reflectance peak versus its red absorption [44,45]; B2/B4 and B2/B5 capture scattering-to-absorption balance related to suspended solids and clarity [48]; B5/B3 responds to elevated NIR reflectance under turbid or bloom conditions [46]; B5/B6 separates particulate scattering from water absorption [49]; and (B5 − B4)/(B5 + B4) tracks chlorophyll-driven reflectance changes in the red–NIR transition [45,47].
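As a concrete illustration, the six features discussed above can be computed from lake-mean reflectance as in the sketch below. It assumes the band numbers follow the Landsat 8/9 OLI convention (B2 blue, B3 green, B4 red, B5 NIR, B6 SWIR1), which matches the OLI-equivalent harmonization but is not stated explicitly alongside the feature list. The function name is hypothetical.

```python
def spectral_features(b2, b3, b4, b5, b6):
    """Band ratios and normalized index from OLI-equivalent lake-mean reflectance.

    ASSUMPTION: band numbers follow Landsat 8/9 OLI
    (B2 blue, B3 green, B4 red, B5 NIR, B6 SWIR1).
    """
    return {
        "B3/B4": b3 / b4,
        "B2/B4": b2 / b4,
        "B2/B5": b2 / b5,
        "B5/B3": b5 / b3,
        "B5/B6": b5 / b6,
        "(B5-B4)/(B5+B4)": (b5 - b4) / (b5 + b4),
    }
```

Ratios are undefined when a denominator band is zero, so in practice reflectance would need to be screened for non-positive values before this step.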

2.2.3. Driving Factor Data and Processing

Lake trophic state in continental drylands is governed by hydroclimatic forcing, vegetation dynamics, and anthropogenic nutrient loading [2,5,18,20]. We compiled nine candidate variables spanning these three processes (Table 3) [51,52,53,54,55,56,57]. Among them, air temperature and precipitation, as major meteorological factors, are consistently recognized as key drivers of TLI in IMXP [18,20,21]. In contrast, radiation, wind speed, PET, and drought indices act mostly through energy and evapotranspiration processes and are often derived from, or highly dependent on, air temperature and precipitation [18,20]. At the annual scale, they provide limited incremental information and tend to introduce redundancy and instability, and were therefore not included. NDVI, FVC, and NPP characterize vegetation status from three complementary perspectives: vegetation greenness, coverage, and productivity, respectively, while LHGI is used to quantify grazing pressure [13,16]. Population density, proportion of construction land, and proportion of cultivated land are adopted to represent urbanization and agricultural intensity, both of which are directly linked to nutrient export [5,16]. All raster layers were projected to a common coordinate system and summarized for each sub-basin and year. Spatial preprocessing and extraction were conducted in ArcGIS 10.8 (Esri Inc., Redlands, CA, USA), and the subsequent regression and contribution analyses were performed in Python 3.7 (Python Software Foundation, Beaverton, OR, USA).
To avoid model instability caused by multicollinearity, we screened variables using VIF [50]. In the initial full model, severe collinearity was concentrated among vegetation metrics (NDVI = 54.95, FVC = 34.10, NPP = 12.16) and land-use metrics (farm_ratio = 9.22, built_ratio = 6.49), all of which exceeded the VIF threshold of 5. NDVI was retained as the representative vegetation proxy due to its robustness, longest available record, and widespread use in dryland vegetation monitoring [52]; built_ratio was prioritized over farm_ratio as it more directly characterizes high-intensity anthropogenic disturbance. After iterative removal, all six retained variables satisfied VIF < 5 (Table 4).
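The iterative VIF screening can be sketched as below. VIF for predictor j is 1 / (1 − R²_j), where R²_j comes from regressing predictor j on all the others. The sketch uses only the standard library so that it is self-contained; a numerical library would normally be used instead. The `keep` argument is a hypothetical device that mimics retaining a preferred proxy (as was done for NDVI) even when its VIF is the largest.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def center(col):
    m = sum(col) / len(col)
    return [v - m for v in col]


def r_squared(y, cols):
    """R^2 of an OLS fit (with intercept) of y on the given predictor columns."""
    if not cols:
        return 0.0
    yc, Xc = center(y), [center(c) for c in cols]
    XtX = [[sum(p * q for p, q in zip(ci, cj)) for cj in Xc] for ci in Xc]
    Xty = [sum(p * q for p, q in zip(ci, yc)) for ci in Xc]
    beta = solve(XtX, Xty)
    return sum(bk * v for bk, v in zip(beta, Xty)) / sum(v * v for v in yc)


def vif(columns):
    """columns: {name: list of values} -> {name: variance inflation factor}."""
    out = {}
    for name, col in columns.items():
        others = [c for other, c in columns.items() if other != name]
        r2 = r_squared(col, others)
        out[name] = float("inf") if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return out


def prune_by_vif(columns, threshold=5.0, keep=()):
    """Drop the highest-VIF predictor (never one in `keep`) until all pass."""
    columns = dict(columns)
    while len(columns) > 1:
        scores = vif(columns)
        over = {n: s for n, s in scores.items() if s > threshold and n not in keep}
        if not over:
            break
        del columns[max(over, key=over.get)]
    return list(columns)
```

Exactly collinear predictors make the normal equations singular, so this sketch assumes near, not perfect, collinearity.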

2.3. Research Methods

2.3.1. TLI Model Construction

Because in situ sampling across IMXP is spatially sparse and temporally discontinuous, and TN, TP, and CODMn lack strong optical signatures in Landsat bands, we trained models to retrieve the composite TLI directly from satellite reflectance rather than inverting sub-parameters individually.
Retrieving TLI from satellite imagery across IMXP is challenging because the spectral–TLI relationship ranges from approximately linear in clear, oligotrophic lakes to strongly nonlinear in turbid, eutrophic ones, while the wide diversity in lake optical properties demands broadly generalizable algorithms [17,20,24,28]. We therefore evaluated four models spanning a complexity gradient—RR [58], SVR [59], RF [60], and XGBoost [61]—from regularized linear regression through kernel-based and ensemble methods, following the recommended practice of comparing across model families to balance interpretability and predictive power in remote sensing water quality retrieval [28,38,47]. To address regional heterogeneity, we incorporated auxiliary spatiotemporal variables (coordinates, season, lake size) [17,20,62]. The dataset was split chronologically (training: ≤2019, ~80%; testing: 2020–2024, ~20%) to evaluate temporal transferability (a prerequisite for operational long-term monitoring [29]); this 80/20 partition is widely adopted in machine learning, balancing sufficient training data with a representative validation sample [29,63].
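The chronological split can be sketched in a few lines. The sample layout (a dict with a `year` key) and the function name are hypothetical, and model fitting itself is omitted. What matters is that no test sample predates any training sample, which a random split cannot guarantee.

```python
def chronological_split(samples, last_train_year=2019):
    """Split lake-month samples by year so the test period follows the training period."""
    train = [s for s in samples if s["year"] <= last_train_year]
    test = [s for s in samples if s["year"] > last_train_year]
    # Temporal leakage check: every training year precedes every test year.
    if train and test:
        assert max(s["year"] for s in train) < min(s["year"] for s in test)
    return train, test
```

With the counts reported here, this yields 1076 training samples (≤2019) and 269 test samples (2020–2024), which is an exact 80/20 partition of the 1345 matched samples.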

2.3.2. TLI Model Evaluation

The root-mean-square error (RMSE), mean absolute error (MAE), and Pearson correlation coefficient squared (R2) were used for statistical evaluation [64,65,66]. In this study, RMSE reflects the dispersion between predicted and measured TLI, MAE quantifies the average magnitude of the absolute differences between predicted and measured TLI, and R2 indicates how well the predicted TLI agrees with the measured TLI. Smaller RMSE and MAE values and a higher R2 value indicate better model performance and more stable predictions.
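For reference, the three metrics can be written as below. This is a minimal sketch with hypothetical function names; R2 is computed as the squared Pearson correlation, following the definition given above.

```python
import math


def rmse(obs, pred):
    """Root-mean-square error between measured and predicted TLI."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error between measured and predicted TLI."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def r2_pearson(obs, pred):
    """Squared Pearson correlation between measured and predicted TLI."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov * cov / (var_o * var_p)
```

A squared Pearson correlation is insensitive to constant bias: predictions that are uniformly one TLI unit too high still give R2 = 1, so RMSE and MAE are needed alongside it to capture systematic offsets.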

2.3.3. Sub-Basin Analysis of TLI Driving Factors

To identify interannual TLI drivers across sub-basins, we constructed a panel of annual mean TLI and six candidate driving variables for each sub-basin. Pooled OLS cannot control for unobserved basin-specific characteristics (e.g., morphometry and geology) [67]; SEM requires a priori path specification unsuited to exploratory screening [67]; and machine-learning models sacrifice coefficient interpretability [68,69]. We therefore adopted a two-way fixed-effects (TWFE) model on first-differenced variables (Equation (6)) [70,71,72], where basin fixed effects ($\mu_b$) remove time-invariant basin heterogeneity, year fixed effects ($\tau_t$) absorb common annual shocks, and first-differencing addresses residual non-stationarity [71].
$$\Delta TLI_{b,t} = \beta_0 + \sum_{k=1}^{6} \beta_k \,\Delta X_{k,b,t} + \mu_b + \tau_t + \varepsilon_{b,t} \tag{6}$$
where b and t index sub-basins and years, $\Delta TLI_{b,t}$ and $\Delta X_{k,b,t}$ are the first differences in the TLI and the k-th driving variable, $\beta_k$ is the average marginal association, and $\varepsilon_{b,t}$ is the error term. Because the pooled $\beta_k$ may mask spatial heterogeneity, we further estimated basin-specific regressions with year fixed effects (Equation (7)) [70]:
$$\Delta TLI^{(b)} = \alpha^{(b)} + \sum_{k=1}^{6} \beta_k^{(b)} \,\Delta X_k^{(b)} + \varepsilon^{(b)} \tag{7}$$
where superscript (b) denotes basin-specific estimates, and α(b) the basin intercept.
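The pooled TWFE estimator on first differences can be sketched with the within transformation: each differenced variable is demeaned by basin and by year, and the slopes are then obtained by ordinary least squares without an intercept. The sketch assumes a balanced panel (every basin observed in every year), uses only the standard library, and does not compute standard errors or significance. The function names and data layout are hypothetical.

```python
from collections import defaultdict


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def first_difference(series):
    """{(basin, year): value} -> year-on-year change within each basin."""
    return {
        (basin, year): value - series[(basin, year - 1)]
        for (basin, year), value in series.items()
        if (basin, year - 1) in series
    }


def two_way_demean(values):
    """Remove basin and year means (balanced panel), adding back the grand mean."""
    by_basin, by_year = defaultdict(list), defaultdict(list)
    for (basin, year), v in values.items():
        by_basin[basin].append(v)
        by_year[year].append(v)
    grand = sum(values.values()) / len(values)
    basin_mean = {k: sum(v) / len(v) for k, v in by_basin.items()}
    year_mean = {k: sum(v) / len(v) for k, v in by_year.items()}
    return {
        (basin, year): v - basin_mean[basin] - year_mean[year] + grand
        for (basin, year), v in values.items()
    }


def twfe(tli, drivers):
    """tli: {(basin, year): TLI}; drivers: {name: {(basin, year): value}} -> {name: beta}."""
    y = two_way_demean(first_difference(tli))
    keys = sorted(y)
    X = []
    for series in drivers.values():
        d = two_way_demean(first_difference(series))
        X.append([d[k] for k in keys])
    yv = [y[k] for k in keys]
    XtX = [[sum(p * q for p, q in zip(xi, xj)) for xj in X] for xi in X]
    Xty = [sum(p * q for p, q in zip(xi, yv)) for xi in X]
    return dict(zip(drivers, solve(XtX, Xty)))
```

On a balanced panel the double demeaning removes the basin and year effects exactly, so the slopes match those from a regression with explicit basin and year dummies.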
Relative driver importance within each basin was quantified via Shapley-value R2 decomposition [73,74], which, unlike standardized coefficients or partial R2, yields an order-independent, additive allocation of total R2 among correlated predictors. Normalized percentage contributions $C_k^{(b)}$ were calculated following Lipovetsky and Conklin [73].
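A Shapley-value R2 decomposition averages each predictor's marginal gain in R2 over all subsets of the remaining predictors, with the usual Shapley weights |S|!(p − |S| − 1)!/p!. The sketch below is a generic standard-library implementation with hypothetical names, not necessarily the exact procedure used here. With six predictors it evaluates only 64 sub-models.

```python
from itertools import combinations
from math import factorial


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def center(col):
    m = sum(col) / len(col)
    return [v - m for v in col]


def r_squared(y, cols):
    """R^2 of an OLS fit (with intercept) of y on the given predictor columns."""
    if not cols:
        return 0.0
    yc, Xc = center(y), [center(c) for c in cols]
    XtX = [[sum(p * q for p, q in zip(ci, cj)) for cj in Xc] for ci in Xc]
    Xty = [sum(p * q for p, q in zip(ci, yc)) for ci in Xc]
    beta = solve(XtX, Xty)
    return sum(bk * v for bk, v in zip(beta, Xty)) / sum(v * v for v in yc)


def shapley_r2(y, predictors):
    """predictors: {name: column} -> {name: Shapley share of the full-model R^2}."""
    names = list(predictors)
    p = len(names)
    cache = {}

    def r2(subset):
        key = frozenset(subset)
        if key not in cache:
            cache[key] = r_squared(y, [predictors[n] for n in key])
        return cache[key]

    shares = {}
    for name in names:
        others = [n for n in names if n != name]
        total = 0.0
        for size in range(p):
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for subset in combinations(others, size):
                total += weight * (r2(subset + (name,)) - r2(subset))
        shares[name] = total
    return shares


def percent_contributions(shares):
    """Normalize Shapley shares so that they sum to 100%."""
    total = sum(shares.values())
    return {name: 100.0 * value / total for name, value in shares.items()}
```

The shares sum to the full-model R2 by construction, which is why the normalized contributions can be read as percentages of explained variance.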

3. Results

3.1. TLI Model Performance

Figure 2 compares measured and retrieved TLI for the training (n = 1076) and independent test (n = 269) datasets across four algorithms. XGBoost (Figure 2a) and random forest (RF; Figure 2b) show point clouds that are more tightly clustered around the 1:1 line, with fitted regression lines closer to the 1:1 line, indicating better agreement and generalization. In contrast, support vector regression (SVR; Figure 2c) and ridge regression (RR; Figure 2d) exhibit greater dispersion and more apparent departures from the 1:1 line, suggesting larger prediction errors. The performance statistics reported within each panel further confirm this ranking on the test set: XGBoost (R2 = 0.780, RMSE = 3.290, MAE = 1.779) outperformed RF (R2 = 0.770, RMSE = 3.364, MAE = 2.168), SVR (R2 = 0.700, RMSE = 3.842, MAE = 2.231), and RR (R2 = 0.630, RMSE = 4.267, MAE = 2.635). Therefore, XGBoost was selected for subsequent TLI retrieval and mapping.

3.2. Spatiotemporal Dynamics in the TLI of Lakes on IMXP

Plateau-wide, the annual mean TLI remained within a narrow band (~48–49) throughout 2000–2024 (Figure 3a). Despite this apparent stability, Sen’s trend test detected a statistically significant upward drift (slope = 0.0158 TLI yr−1; 95% CI = 0.0050–0.0267; p = 0.006), indicating a slow but persistent increase in trophic pressure over the 25-year record. In other words, the plateau as a whole is not experiencing rapid eutrophication, but neither is it stable—the trajectory is one of gradual, broadly distributed trophic enrichment.
This modest overall trend masks pronounced spatial heterogeneity: multi-year mean TLI ranged from 41.51 to 59.70 across the 610 lakes, forming distinct basin-scale clusters (Figure 3b). The highest values concentrate in two geographic groups. The first is the northeastern lake basins, notably the Hulun Lake basin (B34) and the adjacent Erguna–Hailar river system (B26, B29), where mean TLI consistently exceeded 51. The second group comprises several endorheic and desert–oasis transition basins in Xinjiang (e.g., Ebinur Lake basin, B10; Turpan Basin, B31; Endorheic region, B49). In contrast, high-altitude inland basins—exemplified by the Qiangtang Plateau inland region (B14)—exhibit comparatively low mean TLI (<45), likely reflecting lower nutrient inputs and shorter growing seasons. Additional basin-level values are summarized in Table A1 and mapped in Figure A1.
TLI trends are directionally mixed—54% of lakes show increasing TLI and 46% show decreasing TLI (Sen’s slopes spanning −0.49 to 0.51 TLI yr−1)—but only ~31% reach statistical significance (Mann–Kendall p < 0.05; Figure 3c,d). Significantly increasing lakes (15.6%) cluster primarily in the northeastern basins (Hulun–Erguna–Hailar) and in several endorheic/oasis basins, mirroring the spatial hotspots identified above. Significantly decreasing lakes (15.4%) are more geographically scattered, occurring in parts of the Irtysh River (B06), Ulungur River (B25), and Western Inner Mongolian Plateau (B16) basins. The remaining ~69% of lakes exhibit non-significant trends, underscoring that most individual lakes have not yet departed clearly from their baseline trophic state.
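The trend statistics used above can be sketched as follows: Sen's slope is the median of all pairwise slopes, and the Mann–Kendall test converts the sum of pairwise signs into a two-sided p-value through a normal approximation. The sketch omits the tie correction to the variance and confidence intervals for the slope, which is a simplifying assumption, and the function names are hypothetical.

```python
import math
from statistics import median


def sens_slope(years, values):
    """Median of the slopes between all pairs of observations."""
    n = len(values)
    slopes = [
        (values[j] - values[i]) / (years[j] - years[i])
        for i in range(n)
        for j in range(i + 1, n)
    ]
    return median(slopes)


def mann_kendall(values):
    """Return (S, Z, two-sided p) for a series ordered in time (no tie correction)."""
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p
```

Applied per lake to the 25 annual means, a lake counts as significantly increasing when the slope is positive and p < 0.05, and as significantly decreasing when the slope is negative and p < 0.05.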
These spatial patterns are corroborated at multiple temporal scales. Annual TLI maps spanning the full study period (Figure A3) confirm that the basin-scale hotspots identified above persist across years. Seasonal composites (Figure A4, Figure A5, Figure A6, Figure A7 and Figure A8) further demonstrate that these spatial contrasts are reproduced at monthly scale: the same basin-scale clustering is evident in every season, with the northeastern and endorheic/oasis basins consistently exhibiting higher TLI.
Given the strong basin-scale clustering of trophic conditions and the mixed but spatially structured TLI trends, we next quantify which climatic, vegetation, and anthropogenic factors explain interannual TLI variability at the sub-basin scale.

3.3. Driving Factors Associated with Annual TLI Variability at the Sub-Basin Scale

At the plateau scale, Shapley-value R2 decomposition identifies vegetation conditions (NDVI and LHGI combined: 44.23%) and climatic conditions (temperature and precipitation combined: 35.66%) as the dominant driver categories, together accounting for ~80% of the explained TLI variance, while human-activity proxies (population density and built-up ratio) contribute the remaining ~20% (Figure 4a). The pooled TWFE regression yields significant coefficients for temperature (β = −0.094, p < 0.001) and precipitation (β = +0.0003, p = 0.003), with built-up ratio marginally significant (β = −38.74, p = 0.049); NDVI, population density, and LHGI are not significant in the pooled model. The non-significance of NDVI—despite its largest individual Shapley share (33.92%)—arises because Shapley values partition total explained variance, whereas the pooled TWFE coefficient estimates a single average marginal effect that cancels out when a driver exerts opposing effects across basins.
At the sub-basin level, the dominant driver varies markedly (Figure 4b; Figure A9). NDVI ranks first in 28 of 53 basins, followed by temperature (11), population density (7), precipitation (3), LHGI (3), and built-up ratio (1). The top two drivers together explain 56.6–87.1% of each basin’s TLI variance, indicating a concentrated attribution structure. Geographically, NDVI-dominant basins span most inland and endorheic regions; temperature-dominant basins cluster in the Hexi Corridor and the Qiangtang Plateau; population-density-dominant basins align with oasis–urban corridors in northern Xinjiang (e.g., B10, B31).
Basin-specific regressions reveal the direction and significance of each driver (Figure 5). NDVI is significant in 31 basins with a pronounced sign reversal: positive in 22 basins (β: +13.05 to +28.59) but negative in nine Tarim River system basins (β: −79.73 to −55.32). Temperature is significant in 15 basins and predominantly negative (14/15; β: −0.33 to −0.30), except in the Qiangtang Plateau (B14; β = +0.47). Population density is significant in 8 basins, positive in northern Xinjiang oasis–urban basins (6/8) and negative in the Ili and Emin basins (2/8). Precipitation, LHGI, and built-up ratio show sparse significance (3, 2, and 1 basins, respectively). These basin-level contrasts—especially the bidirectional NDVI response and the predominantly negative temperature association—suggest distinct underlying processes that warrant further investigation.

4. Discussion

4.1. Model Accuracy and Applicability

By benchmarking against existing remote sensing-based eutrophication studies (Table 5), our XGBoost model demonstrated robust performance on the independent temporal test set (2020–2024), achieving an R2 of 0.780, an RMSE of 3.290, and an MAE of 1.779. These metrics not only attest to the model’s effectiveness across the optically complex lakes of IMXP but also highlight its specific advantages for large-scale, long-term monitoring.
First, the proposed model exhibits competitive retrieval accuracy compared to other regional or multi-lake models. Previous studies targeting lake groups, such as urban lakes in Wuhan or the Poyang Lake basin, typically reported RMSE values ranging from 4.4 to 6.1 or R2 values between 0.54 and 0.72 [45,46,48,75]. In contrast, our model maintained an RMSE below 3.3 and an MAE of 1.78 while covering a heterogeneous inventory of 610 lakes. This suggests that XGBoost, when fed with physics-informed multi-band ratio features, can effectively model the complex non-linear relationship between spectral reflectance and the comprehensive TLI across diverse optical water types without requiring the “one-lake, one-parameter” calibration often needed by empirical or semi-analytical methods.
Second, the results reflect a necessary balance between local precision and broad-scale applicability. We acknowledge that specific single-lake studies, such as the neural network model for Chaohu Lake (R² = 0.8937) [44], achieved higher fitting metrics. However, such high precision often relies on the relatively consistent optical properties of a single water body and abundant local training data. Our study, spanning 53 sub-basins over 25 years, inevitably faces greater environmental heterogeneity. Although our R² is slightly lower than that of highly specialized single-lake models, our approach eliminates the need for individual lake tuning, making it a far more cost-effective and scalable solution for regional screening in data-sparse dryland environments.
Third, the choice of Landsat ensures the feasibility of long-term historical reconstruction, a capability currently unmatched by other sensors in this context. While Sentinel-2/3 offer superior temporal and spectral resolution for recent monitoring [45,46,49], their archives (post-2015) are too short to evaluate multi-decadal responses to climate change. Conversely, while MODIS covers the required time span, its coarse resolution is unsuitable for the small lakes (1–10 km²) that dominate our study region. By utilizing the 30 m resolution Landsat series (L7/L8/L9), our model successfully reconciles the need for a multi-decadal archive (2000–2024) with the spatial granularity required to resolve smaller water bodies.
In summary, although the absolute goodness-of-fit may be slightly lower than that of refined models targeting single water bodies, the proposed XGBoost model maintains low error rates (RMSE/MAE) and effectively meets the challenges of cross-basin generalization and historical reconstruction. Consequently, this approach serves as a reliable and practical method for this study, particularly suitable for reconstructing long-term trophic-state baselines and monitoring dynamics in data-sparse dryland basins.
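As a minimal sketch of how the reported hold-out metrics (R², RMSE, MAE) can be computed for a temporally independent test set, the snippet below splits paired observations by calendar year and evaluates only the later years. The function names, the toy TLI values, and the 2020 cut-off argument are illustrative assumptions and are not taken from the study’s code or data.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return R2, RMSE and MAE for paired observed/predicted TLI values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred
    ss_res = np.sum(resid ** 2)                       # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # total sum of squares
    return {
        "r2": float(1.0 - ss_res / ss_tot),
        "rmse": float(np.sqrt(np.mean(resid ** 2))),
        "mae": float(np.mean(np.abs(resid))),
    }

def temporal_split(years, test_start=2020):
    """Boolean masks (train, test): years >= test_start go to the test set."""
    years = np.asarray(years)
    test = years >= test_start
    return ~test, test

# Hypothetical toy values, for illustration only.
years = np.array([2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023])
observed = np.array([45.0, 47.0, 50.0, 52.0, 48.0, 51.0, 46.0, 55.0])
predicted = np.array([44.0, 48.0, 49.0, 53.0, 49.0, 50.0, 48.0, 52.0])

train_mask, test_mask = temporal_split(years)
metrics = regression_metrics(observed[test_mask], predicted[test_mask])
print(metrics)
```

Evaluating only on years that the model never saw during training is what makes the reported figures a test of temporal transferability rather than of in-sample fit.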

4.2. Analysis of the Spatiotemporal Variations in Lake TLI Within Sub-Basins

Climate-vegetation variables collectively explain ~80% of interannual TLI variance at the plateau scale (NDVI 33.92%, temperature 21.67%, precipitation 13.99%, LHGI 10.31%), with anthropogenic proxies accounting for ~20% (Figure 4a). This hierarchy is consistent with Ma et al. [18], who attributed TLI variation (49.14–71.77) across seven Inner Mongolian lakes primarily to temperature, precipitation, and land use, and with Wang et al. [19], who reported 71.83% of Mongolian Plateau lakes as eutrophic (Forel–Ule Index; FUI ≥ 10) by 2020. However, our basin-resolved Shapley decomposition reveals a dimension these studies may have overlooked: the top two drivers explain 56.6–87.1% of each basin’s variance, yet which two dominate shifts markedly across basins (Figure 4b), suggesting that plateau-scale averages mask distinct eutrophication regimes in adjacent watersheds. Nationally, Hu et al. [31] documented a longitudinal gradient, with a 40-year mean TSI of 62.26 in the eastern plains versus 23.72 on the Tibetan Plateau and with anthropogenic factors explaining 88% of eastern lake variability. Our IMXP mean TLI (48–49) falls between these extremes, consistent with its transitional position, but climate-vegetation dominance (~80%) rather than human forcing (~20%) clearly distinguishes IMXP from the anthropogenically driven eastern lake region.
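To make the idea of a Shapley-based variance decomposition concrete, the sketch below apportions the R² of an ordinary least squares model among its predictors by averaging each predictor’s marginal R² gain over all subsets of the others. This is one common formulation and is offered under that assumption; the exact implementation behind Figure 4 is not restated in this section, and the synthetic drivers and function names here are hypothetical.

```python
from itertools import combinations
from math import factorial

import numpy as np

def r2_ols(X, y):
    """R2 of an OLS fit with intercept on the given predictor columns."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(1.0 - resid @ resid / np.sum((y - y.mean()) ** 2))

def shapley_r2(X, y):
    """Shapley share of the full-model R2 attributed to each column of X."""
    p = X.shape[1]
    cache = {}

    def value(subset):
        # R2 of the model using only the columns in `subset` (0 if empty).
        if not subset:
            return 0.0
        if subset not in cache:
            cache[subset] = r2_ols(X[:, list(subset)], y)
        return cache[subset]

    contrib = np.zeros(p)
    for j in range(p):
        others = [k for k in range(p) if k != j]
        for size in range(p):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for s in combinations(others, size):
                with_j = tuple(sorted(s + (j,)))
                contrib[j] += weight * (value(with_j) - value(s))
    return contrib

# Synthetic example: three hypothetical drivers, the third is irrelevant.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200)

contrib = shapley_r2(X, y)
shares = 100.0 * contrib / contrib.sum()   # percentage contributions
print(shares)
```

By construction the contributions sum to the full-model R², which is why the percentage shares can be read as a partition of explained variance. With six drivers the loop visits only 64 subsets, so exhaustive enumeration remains cheap at the basin level.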
The most notable pattern is the bidirectional NDVI-TLI relationship: significantly positive in 22 basins but negative in 9 (Figure 5f), reflecting that NDVI captures vegetation vigor without distinguishing nutrient pathways. In the nine Tarim River system basins (β: −79.73 to −55.32), vegetation is irrigation-dependent; rising NDVI signals intensified cropping and amplified nutrient export via return flows—a mechanism documented in Wuliangsuhai Lake, where over 90% of water input derives from farmland drainage delivering 2037 t TN and 55.8 t TP annually [16], and in Lake Taihu, where diffuse agricultural sources supply ~90% of dissolved nitrogen [76]. Conversely, the 22 basins with positive coefficients (β: +13.05 to +28.59) are predominantly pastoral, where intact vegetation intercepts nutrients via root uptake, microbial immobilization, and sediment filtration—removing 60–99% of N and P in riparian buffers [77,78]. Hulun Lake confirms this inversely: grassland destruction by overgrazing allowed terrigenous organic matter (~90% of the lake’s total) to enter via wind-blown debris [13]. Neither Ma et al. [18] nor Zhang et al. [20] reported this sign reversal, likely because their designs pooled heterogeneous basins or lacked basin-specific regressions. These results suggest NDVI cannot serve as a universal water quality proxy in drylands; its role is contingent on land use context.
Temperature is significant in 15 basins and predominantly negative (14/15; β: −0.33 to −0.30), contrasting with the documented pattern in temperate lakes where warming intensifies eutrophication via internal phosphorus release, cyanobacterial dominance, and metabolic acceleration [79,80]. Mi et al. [80] showed that in a restored temperate lake, 68% of re-eutrophication stemmed from internal P release and 32% from warming (18% metabolic + 14% synergistic). In water-limited IMXP catchments, warming instead amplifies evapotranspiration, reducing runoff and curtailing allochthonous nutrient delivery—the principal external loading pathway [2,5]. A parallel pattern occurs in semi-arid Brazilian reservoirs, where 91% of 65 systems showed TSI increases during drought due to volume loss rather than temperature-mediated internal cycling [81]. The sole exception—Qiangtang Plateau (B14; β = +0.47)—reinforces this interpretation: at extreme altitude with minimal evaporative demand, warming extends the ice-free season without the runoff suppression governing lower basins. As Adrian et al. [82] noted, climate impacts on lakes are mediated by baseline hydrological regime. Our results extend this across 53 sub-basins, providing multi-basin evidence that warming can suppress rather than promote trophic variability in dryland systems.
Anthropogenic drivers (population density, built-up ratio, grazing intensity) contribute ~20% collectively but reach significance only in a few basins: population density in Xinjiang oasis-urban corridors (B10, B31), built-up ratio solely in the Fen River Basin (B39), and grazing intensity in specific grassland basins (B07). This spatial confinement indicates that direct human pressures act as localized rather than pervasive drivers at the interannual scale on IMXP.
These findings converge on one overarching insight: eutrophication variability in IMXP lakes reflects basin-specific coupling among land use, hydrological regime, and climate sensitivity rather than a fixed driver hierarchy. Vegetation greening signals nutrient export in irrigated basins but nutrient retention in pastoral ones; warming suppresses trophic pressure through runoff reduction in most basins yet enhances it at high altitude; human pressures are locally decisive but spatially confined. This basin-dependent structure implies differentiated management: controlling return flows in irrigated basins, maintaining vegetation cover in pastoral ones, incorporating climate-adaptive planning where temperature-runoff coupling is strong, and targeting point-source controls in anthropogenically dominated basins.

4.3. Uncertainties, Limitations, and Implications for Regional Lake Monitoring

This study has several important limitations that should be considered when interpreting the reconstructed TLI patterns and the basin-scale attribution results. First, uncertainty is introduced by the reference dataset and by the construction of TLI itself. Sampling dates do not always coincide with Landsat overpasses. In addition, SD was not consistently available and was estimated from turbidity to complete the TLI indicator set. To assess the associated uncertainty, we performed a sensitivity analysis by scaling the estimated SD with a multiplier k (Figure 6). The distribution of k values for IMXP SD was concentrated between 0.67 and 1.41 (P5–P95). Within this range, ΔTLI was within ±2.5, which is small relative to the TLI range. ΔRMSE did not exceed 0.4, compared to baseline RMSE values of 3.290 and 2.896. While these errors are relatively small, they demonstrate that turbidity-derived SD does introduce measurable uncertainty into both the TLI and model performance, even though SD has a low weight in the composite TLI. These factors can introduce noise into the learned reflectance–TLI relationship and may limit the attainable accuracy, particularly for short-lived events or under rapidly changing conditions [24,25].
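The size of the SD-related uncertainty can be checked with a short calculation. The sketch below assumes the widely used Chinese TLI sub-index for Secchi depth, TLI(SD) = 10 × (5.118 − 1.94 ln SD), which is not restated in this section, and combines it with the SD weight of 0.1834 listed in Table A2. Under those assumptions, scaling SD by a multiplier k shifts the composite TLI by −19.4 × W_SD × ln k, independent of the SD value itself.

```python
import math

W_SD = 0.1834  # SD weight from Table A2

def tli_sd(sd_m):
    """Assumed SD sub-index: TLI(SD) = 10 * (5.118 - 1.94 * ln(SD)), SD in metres."""
    return 10.0 * (5.118 - 1.94 * math.log(sd_m))

def delta_tli(k, w_sd=W_SD):
    """Change in composite TLI when SD is multiplied by k, other sub-indices fixed."""
    return w_sd * (-19.4 * math.log(k))

# P5 and P95 multipliers reported in the text.
for k in (0.67, 1.0, 1.41):
    print(f"k = {k:.2f} -> delta TLI = {delta_tli(k):+.2f}")
```

For k = 0.67 and k = 1.41 this gives shifts of roughly +1.4 and −1.2 TLI units, which sits inside the ±2.5 bound reported above and illustrates why the low weight of SD limits its influence on the composite index.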
Second, satellite observation constraints affect the completeness of the monthly time series and the representation of extremes. Cloud/ice contamination and the 16-day revisit cycle led to uneven sampling in some months, and monthly compositing can smooth episodic bloom peaks. Spatially, 30 m Landsat pixels can be affected by mixed land–water signals for small lakes and near-shore zones, which is especially relevant given that most lakes in the inventory are 1–10 km². These effects can bias lake-mean reflectance and thus the derived TLI, particularly in optically complex and shallow waters [24,25].
Despite these limitations, the results have clear implications for regional lake monitoring. The Landsat-derived TLI time series can complement sparse field sampling by providing spatially explicit, basin-to-lake screening of trophic conditions over long periods, which is difficult to achieve through conventional monitoring alone. The identified basin-scale hotspots and trend clusters (e.g., northeastern basins and several endorheic/oasis basins) can be used to prioritize where additional field sampling is most needed and to design sampling schedules aligned with satellite overpasses. Finally, the basin-specific driver patterns highlight which drivers (vegetation, climate, and human-pressure variables) should be tracked routinely to support targeted diagnosis and management in different sub-basin contexts.

5. Conclusions

This study provides a 25-year, lake-resolved trophic baseline for 610 lakes across IMXP using Landsat-driven XGBoost retrieval with temporally independent validation, offering a scalable framework for long-term monitoring in data-sparse dryland regions.
Three findings emerge with broader relevance. First, climate-vegetation variables account for ~80% of interannual TLI variability, while anthropogenic proxies remain spatially confined—suggesting that dryland lake eutrophication operates under a different driver structure than that reported for nutrient-saturated lowland systems. Second, NDVI signals nutrient export in irrigated basins but nutrient retention in pastoral ones, indicating that vegetation greening alone is an ambiguous indicator of water quality trajectories unless interpreted alongside land use context. Third, warming is associated with reduced rather than increased trophic levels in most IMXP basins, likely because enhanced evapotranspiration suppresses runoff and external nutrient delivery—a pattern consistent with observations in other semi-arid systems but differing from the positive warming–eutrophication relationship commonly observed in temperate lakes.
These results collectively indicate that eutrophication drivers in dryland lake regions vary by basin rather than following a uniform hierarchy, and that effective management requires strategies tailored to each basin’s dominant land-use–climate–hydrology coupling.

Author Contributions

Conceptualization, L.L. and S.L.; methodology, Y.Z.; software, Y.Z. and W.S.; validation, Y.Z., Y.Y. and Z.Z.; formal analysis, Y.Z.; investigation, Y.Z., Y.Y. and Z.Z.; resources, J.W., Y.R., L.L. and S.L.; data curation, Y.Z., W.S., Y.Y. and Z.Z.; writing—original draft preparation, Y.Z.; writing—review and editing, F.C., Y.R., L.L. and S.L.; visualization, L.W.; supervision, Y.R., L.L., J.W. and S.L.; project administration, Y.R., L.L. and S.L.; funding acquisition, Y.R., L.L. and S.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research has been supported by the National Natural Science Foundation of China (Grant No. 42101081) and the National Key Research and Development Program of China (Grant No. 2021YFD1300503).

Data Availability Statement

The findings of this study are based on publicly available third-party datasets. Landsat Collection 2 Level-2 surface reflectance data are available from the U.S. Geological Survey (https://earthengine.google.com/). Lake and watershed boundaries were obtained from the HydroSHEDS database (https://www.hydrosheds.org/, accessed on 15 January 2025) containing HydroLAKES and HydroBASINS. In situ water quality observations used for model training were compiled from the Ministry of Ecology and Environment of the P.R.C. (https://www.mee.gov.cn/, accessed on 15 January 2025), the China National Environmental Monitoring Centre (https://www.cnemc.cn/, accessed on 15 January 2025), the National Tibetan Plateau Data Center (https://www.tpdc.ac.cn/, accessed on 15 January 2025), and the NIGLAS Data Center (https (GSOD) are available from NOAA NCEI (https://www.ncei.noaa.gov/, accessed on 15 January 2025); vegetation and grazing datasets can be accessed via the Resource and Environment Science and Data Center (https://www.resdc.cn/, accessed on 15 January 2025), the National Tibetan Plateau Data Center, NASA LP DAAC (https://doi.org/10.5067/MODIS/MOD17A3HGF.061), and Figshare (https://doi.org/10.6084/m9.figshare.26195684.v3); and human activity data are available from Oak Ridge National Laboratory (https://landscan.ornl.gov/, accessed on 15 January 2025) and Zenodo (https://doi.org/10.5281/zenodo.15853565). The processed data products derived from these sources are not shared.

Acknowledgments

We acknowledge the data support from the Ministry of Ecology and Environment of the People’s Republic of China and the China National Environmental Monitoring Centre for water-quality monitoring records, the National Tibetan Plateau Data Center and the Nanjing Institute of Geography and Limnology (Chinese Academy of Sciences) Data Center for related data services, and the U.S. Geological Survey for providing Landsat Collection 2 surface reflectance products. We also thank the data providers of the auxiliary driving factor datasets (vegetation, land cover, population, climate, and grazing intensity) as cited in Section 2.2, and the Google Earth Engine platform for enabling large-scale Landsat time-series processing.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
Chl-a     Chlorophyll-a
CI        Confidence interval
CODMn     Permanganate index (chemical oxygen demand, CODMn)
ETM+      Enhanced Thematic Mapper Plus
EVI       Enhanced Vegetation Index
FUI       Forel–Ule Index
FVC       Fractional vegetation cover
GEE       Google Earth Engine
IMXP      the Inner Mongolia–Xinjiang Plateau
IQR       Interquartile range
JRC       Joint Research Centre
LHGI      Livestock husbandry grazing intensity
MAE       Mean absolute error
MODIS     Moderate Resolution Imaging Spectroradiometer
MSI       MultiSpectral Instrument
NDVI      Normalized Difference Vegetation Index
NIR       Near-infrared
NPP       Net primary productivity
OLCI      Ocean and Land Color Instrument
OLI       Operational Land Imager
OLI-2     Operational Land Imager 2
OLS       Ordinary least squares
OWT       Optical water type
PET       Potential evapotranspiration
QA        Quality assessment
R²        Coefficient of determination
RF        Random forest
RMSE      Root mean square error
RR        Ridge regression
SB        Sub-basin
SD        Secchi depth
SEM       Structural equation modeling
SLC       Scan line corrector
SR        Surface reflectance
SVR       Support vector regression
SWIR      Shortwave infrared
SWIR1     Shortwave infrared 1
SWIR2     Shortwave infrared 2
Tbdy      Turbidity
TLI       Trophic Level Index
TM        Thematic Mapper
TN        Total nitrogen
TP        Total phosphorus
TSI       Trophic State Index
TWFE      Two-way fixed effects
VIF       Variance inflation factor
VNIR      Visible and near-infrared
XGBoost   Extreme Gradient Boosting

Appendix A

Table A1. B code-watershed comparison table.
SB_Code   Basin_Name
Overall   Overall
B01       Nierji to Jiangqiao
B02       Lower West Liao River reach (below Sujiapu)
B03       Hekouzhen to Longmen (left bank)
B04       Qindan River
B05       Zhang–Wei River mountainous area
B06       Irtysh River
B07       Eastern Inner Mongolian Plateau
B08       Shizuishan to Hekouzhen (south bank)
B09       Gurbantunggut Desert region
B10       Ebinur Lake basin
B11       Longmen to Sanmenxia mainstem reach
B12       Ba–Ili Basin
B13       Aksu River
B14       Qiangtang Plateau inland region
B15       Wei River (Baojixia to Xianyang)
B16       Western Inner Mongolian Plateau
B17       Cherchen River basin
B18       Xiaheyan to Shizuishan
B19       Wei River (upstream of Baojixia)
B20       Ili River
B21       Kashgar River basin
B22       Heihe River
B23       Qingshui River and Kushui River
B24       Wulijimuren River
B25       Ulungur River
B26       Erguna River mainstem
B27       Shule River
B28       Below Jiangqiao
B29       Hailar River
B30       Luan River mountainous area
B31       Turpan Basin
B32       Rivers of the middle northern Tianshan foothills
B33       Rivers of the eastern northern Tianshan foothills
B34       Hulun Lake basin
B35       Yongding River (upstream of Cetian Reservoir)
B36       Hotan River
B37       Upstream of Nierji
B38       Xilamulun River and Laoha River
B39       Fen River
B40       Shizuishan to Hekouzhen (north bank)
B41       Jeminay small rivers
B42       Daxia River and Tao River
B43       Keriya River small tributaries
B44       Yongding River (Cetian Reservoir to Sanjiadian reach)
B45       Right bank upstream of Wubu
B46       Emin River
B47       Yarkand River
B48       Kaidu–Kongque River basin
B49       Endorheic region
B50       Right bank downstream of Wubu
B51       Weigan River
B52       Tarim River mainstem
B53       Hexi Desert region
Note: Sub-basin codes are used consistently throughout the manuscript and figures to simplify labeling; full basin names are listed in this table for reference.
Table A2. Pearson correlation coefficients (r_j) between each sub-index and Chl-a, and the corresponding normalized weights (W_j) used in the composite TLI calculation.
Index    Chl-a    TP       TN       SD       CODMn
j        1        2        3        4        5
r_j      1        0.84     0.82     −0.83    0.83
r_j²     1        0.7056   0.6724   0.6889   0.6889
W_j      0.2663   0.1879   0.179    0.1834   0.1834
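The weights in Table A2 appear to follow the usual normalization W_j = r_j² / Σ r_j², in which each squared correlation with Chl-a is divided by the sum of all five. The short sketch below reproduces the tabulated values from the listed correlation coefficients; the function and variable names are illustrative only.

```python
def tli_weights(correlations):
    """Normalize squared correlations so that the weights sum to 1."""
    squared = [r * r for r in correlations]
    total = sum(squared)
    return [value / total for value in squared]

# Pearson correlations with Chl-a as listed in Table A2.
r = {"Chl-a": 1.0, "TP": 0.84, "TN": 0.82, "SD": -0.83, "CODMn": 0.83}

weights = dict(zip(r, tli_weights(r.values())))
for name, w in weights.items():
    print(f"{name:6s} {w:.4f}")
```

Because the correlation enters squared, the negative sign of the SD correlation does not affect its weight; the inverse relationship between clarity and trophic level is carried by the SD sub-index itself.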
Table A3. Band-wise OLS coefficients for harmonizing Landsat ETM+ surface reflectance to OLI/OLI-2 equivalence.
Band Name   ETM+ Band   OLI/OLI-2 Band   a (Intercept)   b (Slope)
Blue        1           2                0.0003          0.8474
Green       2           3                0.0088          0.8483
Red         3           4                0.0061          0.9047
NIR         4           5                0.0412          0.8462
SWIR1       5           6                0.0254          0.8937
SWIR2       7           7                0.0172          0.9071
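Applying the coefficients in Table A3 amounts to a per-band linear transform, OLI-equivalent = a + b × ETM+. The sketch below shows this under the assumption that the input is already expressed as unitless surface reflectance (0–1) after any product scale factors have been applied; the dictionary and function names are hypothetical.

```python
import numpy as np

# (intercept a, slope b) per band, copied from Table A3.
ETM_TO_OLI = {
    "blue": (0.0003, 0.8474),
    "green": (0.0088, 0.8483),
    "red": (0.0061, 0.9047),
    "nir": (0.0412, 0.8462),
    "swir1": (0.0254, 0.8937),
    "swir2": (0.0172, 0.9071),
}

def harmonize_etm_to_oli(reflectance, band):
    """Convert ETM+ surface reflectance to OLI-like reflectance for one band."""
    a, b = ETM_TO_OLI[band]
    return a + b * np.asarray(reflectance, dtype=float)

# Example: a few hypothetical ETM+ blue-band reflectance values over water.
etm_blue = np.array([0.02, 0.05, 0.10])
oli_like_blue = harmonize_etm_to_oli(etm_blue, "blue")
print(oli_like_blue)
```

Harmonizing before feature construction matters for band-ratio features in particular, since the slopes differ between bands and an unharmonized ratio would shift at the ETM+/OLI transition.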

Appendix B

Figure A1. B code-watershed comparison figure.
Figure A2. Cross-sensor reflectance harmonization between Landsat 7 ETM+ and Landsat 8 OLI. Scatterplots show band-to-band relationships and the fitted linear regressions used to convert ETM+ reflectance to OLI-like reflectance. (a) Blue band (L7 Band 1/L8 Band 2): solid line denotes the 1:1 reference line, dashed line denotes the fitted regression line; (b) Green band (L7 Band 2/L8 Band 3); (c) Red band (L7 Band 3/L8 Band 4); (d) Near-infrared (NIR) band (L7 Band 4/L8 Band 5); (e) Shortwave infrared 1 (SWIR 1) band (L7 Band 5/L8 Band 6); (f) Shortwave infrared 2 (SWIR 2) band (L7 Band 7/L8 Band 7).
Figure A3. Spatial distribution of annual mean TLI for 610 lakes across IMXP from 2000 to 2024. Each panel displays the lake-level annual mean TLI for one year (2000–2024, 25 panels in total). Symbol color indicates TLI magnitude according to the legend (20–70). The background boundary delineates the IMXP study area.
Figure A4. Spatial distribution of seasonal mean TLI (spring, summer, autumn, and winter) for 610 lakes across IMXP during 2000–2004. Each row represents one year and each column one season. Symbol color indicates TLI magnitude (20–70).
Figure A5. Same as Figure A4, but for the period 2005–2009.
Figure A6. Same as Figure A4, but for the period 2010–2014.
Figure A7. Same as Figure A4, but for the period 2015–2019.
Figure A8. Same as Figure A4, but for the period 2020–2024.
Figure A9. Spatial distribution of driver contributions to interannual TLI variability across the 53 sub-basins. Pie charts represent the relative contributions of six drivers (built-up ratio, NDVI, population density, temperature, precipitation, and LHGI) derived from Shapley values.

References

  1. Millennium Ecosystem Assessment (MEA). Ecosystems and Human Well-Being: Wetlands and Water—Synthesis; World Resources Institute: Washington, DC, USA, 2005. [Google Scholar]
  2. Smith, V.H.; Schindler, D.W. Eutrophication science: Where do we go from here? Trends Ecol. Evol. 2009, 24, 201–207. [Google Scholar] [CrossRef] [PubMed]
  3. Sun, H.; Yu, R.; Liu, X.; Zhang, Z.; Ren, X.; Li, X.; Qi, Z.; Wang, J.; Guo, Z.; Zhu, P.; et al. Large greenhouse gases emissions from lakes in Inner Mongolia, China. J. Hydrol. 2024, 637, 131432. [Google Scholar] [CrossRef]
  4. Xiao, Q.; Xu, X.; Qi, T.; Luo, J.; Lee, X.; Duan, H. Lakes shifted from a carbon dioxide source to a sink over past two decades in China. Sci. Bull. 2024, 69, 1857–1861. [Google Scholar] [CrossRef]
  5. Carpenter, S.R.; Caraco, N.F.; Correll, D.L.; Howarth, R.W.; Sharpley, A.N.; Smith, V.H. Nonpoint pollution of surface waters with phosphorus and nitrogen. Ecol. Appl. 1998, 8, 559–568. [Google Scholar] [CrossRef]
  6. Paerl, H.W.; Otten, T.G. Harmful cyanobacterial blooms: Causes, consequences, and controls. Microb. Ecol. 2013, 65, 995–1010. [Google Scholar] [CrossRef] [PubMed]
  7. Carlson, R.E. A trophic state index for lakes. Limnol. Oceanogr. 1977, 22, 361–369. [Google Scholar] [CrossRef]
  8. Ministry of Ecology and Environment of the People’s Republic of China. Technical Specifications for Surface Water Environmental Quality Monitoring (HJ 91.2-2022); China Environmental Science Press: Beijing, China, 2022.
  9. Beck, H.E.; Zimmermann, N.E.; McVicar, T.R.; Vergopolan, N.; Berg, A.; Wood, E.F. Present and future Köppen–Geiger climate classification maps at 1-km resolution. Sci. Data 2018, 5, 180214. [Google Scholar] [CrossRef] [PubMed]
  10. Fang, J.; Bai, Y.; Wu, J. Towards a better understanding of landscape patterns and ecosystem processes of the Mongolian Plateau. Landsc. Ecol. 2015, 30, 1573–1578. [Google Scholar] [CrossRef]
  11. Hu, Q.; Zhao, Y.; Huang, A.; Ma, P.; Ming, J. Moisture transport and sources of the extreme precipitation over northern and southern Xinjiang in the summer half-year during 1979–2018. Front. Earth Sci. 2021, 9, 770877. [Google Scholar] [CrossRef]
  12. Zhang, F.; Xue, B.; Yao, S.; Gui, Z. Organic carbon burial from multi-core records in Hulun Lake, the largest lake in northern China. Quat. Int. 2018, 475, 80–90. [Google Scholar] [CrossRef]
  13. Chen, X.; Chuai, X.; Yang, L.; Zhao, H. Climatic warming and overgrazing induced the high concentration of organic matter in Lake Hulun, a large shallow eutrophic steppe lake in northern China. Sci. Total Environ. 2012, 431, 332–338. [Google Scholar] [CrossRef]
  14. Yu, H.; Shi, X.; Wang, S.; Zhao, S.; Sun, B.; Liu, Y.; Yang, Z. A review of the characteristics and mechanisms of water environment evolution in Hulun Lake under the dual drivers of climate warming-drying and human activities. Sustainability 2025, 17, 10395. [Google Scholar] [CrossRef]
  15. Du, Y.; Wan, W.; Li, Q.; Zhang, H.; Qian, H.; Cai, J.; Wang, J.; Zheng, X. Impacts of climate and human activities on Daihai Lake in a typical semi-arid watershed, Northern China. PLoS ONE 2022, 17, e0266049. [Google Scholar] [CrossRef] [PubMed]
  16. Liu, X.; Liu, H.; Jing, J.; Liu, Y.; Xu, Z.; Cao, X.; Ma, L.; Zhuo, Y.; Wen, L.; Wang, L. How the land use/cover changes and environmental factors at different scales affect lake water quality in arid and semi-arid regions. Front. Ecol. Evol. 2023, 11, 1188927. [Google Scholar] [CrossRef]
  17. Yu, H.; Shi, X.; Wang, S.; Zhao, S.; Sun, B.; Liu, Y.; Yang, Z. Trophic status of a shallow lake in Inner Mongolia: Long-term, seasonal, and spatial variation. Ecol. Indic. 2023, 156, 111167. [Google Scholar] [CrossRef]
  18. Ren, X.; Yu, R.; Liu, X.; Sun, H.; Geng, Y.; Qi, Z.; Zhang, Z.; Li, X.; Wang, J.; Zhu, P.; et al. Spatial changes and driving factors of lake water quality in Inner Mongolia, China. J. Arid Land 2023, 15, 164–179. [Google Scholar] [CrossRef]
  19. Guo, J.; Liu, K.; Na, J.; Liu, G.; Cao, Z.; Fan, C.; Xue, B.; Huang, J.; Song, C. A three-decade lake dataset on the Mongolian Plateau tracking water area and quality dynamics (1990–2020). Sci. Data 2025, 12, 1788. [Google Scholar] [CrossRef] [PubMed]
  20. Zhang, Y.; Shi, K.; Zhang, Y.; Moreno-Madriñán, M.J.; Xu, X.; Zhou, Y.; Qin, B.; Zhu, G.; Jeppesen, E. Water clarity response to climate warming and wetting of the Inner Mongolia–Xinjiang Plateau: A remote sensing approach. Sci. Total Environ. 2021, 796, 148916. [Google Scholar] [CrossRef]
  21. Tao, S.; Fang, J.; Zhao, X.; Zhao, S.; Shen, H.; Hu, H.; Tang, Z.; Wang, Z.; Guo, Q. Rapid loss of lakes on the Mongolian Plateau. Proc. Natl. Acad. Sci. USA 2015, 112, 2281–2286. [Google Scholar] [CrossRef] [PubMed]
  22. Wulder, M.A.; Loveland, T.R.; Roy, D.P.; Crawford, C.J.; Masek, J.G.; Woodcock, C.E.; Allen, R.G.; Anderson, M.C.; Belward, A.S.; Cohen, W.B.; et al. Current status of Landsat program, science, and applications. Remote Sens. Environ. 2019, 225, 127–147. [Google Scholar] [CrossRef]
  23. Gorelick, N.; Hancher, M.; Dixon, M.; Ilyushchenko, S.; Thau, D.; Moore, R. Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sens. Environ. 2017, 202, 18–27. [Google Scholar] [CrossRef]
  24. Yang, H.; Kong, J.; Hu, H.; Du, Y.; Gao, M.; Chen, F. A review of remote sensing for water quality retrieval: Progress and challenges. Remote Sens. 2022, 14, 1770. [Google Scholar] [CrossRef]
  25. Chen, L.; Liu, L.; Liu, S.; Shi, Z.; Shi, C. The application of remote sensing technology in inland water quality monitoring and water environment science: Recent progress and perspectives. Remote Sens. 2025, 17, 667. [Google Scholar] [CrossRef]
  26. Lee, Z.-P.; Carder, K.L.; Arnone, R. Deriving inherent optical properties from water color: A multi-band quasi-analytical algorithm for optically deep waters. Appl. Opt. 2002, 41, 5755–5772. [Google Scholar] [CrossRef]
  27. Werdell, P.J.; Franz, B.A.; Bailey, S.W.; Feldman, G.C.; Boss, E.; Brando, V.E.; Dowell, M.; Hirata, T.; Lavender, S.J.; Lee, Z.; et al. Generalized ocean color inversion model for retrieving marine inherent optical properties. Appl. Opt. 2013, 52, 2019–2033. [Google Scholar] [CrossRef] [PubMed]
  28. Shi, X.; Gu, L.; Jiang, T.; Zheng, X.; Dong, W.; Tao, Z. Retrieval of chlorophyll-a concentrations using Sentinel-2 MSI imagery in Lake Chagan based on assessments with machine learning models. Remote Sens. 2022, 14, 4924. [Google Scholar] [CrossRef]
  29. Filippelli, S.K.; Schleeweis, K.; Nelson, M.D.; Fekety, P.A.; Vogeler, J.C. Testing temporal transferability of remote sensing models for large area monitoring. Sci. Remote Sens. 2024, 9, 100119. [Google Scholar] [CrossRef]
  30. Pekel, J.-F.; Cottam, A.; Gorelick, N.; Belward, A.S. High-resolution mapping of global surface water and its long-term changes. Nature 2016, 540, 418–422. [Google Scholar] [CrossRef] [PubMed]
  31. Hu, M.; Ma, R.; Xue, K.; Cao, Z.; Chen, X.; Xiong, J.; Xu, J.; Huang, Z.; Yu, Z. A dataset of trophic state index for nation-scale lakes in China from 40-year Landsat observations. Sci. Data 2024, 11, 659. [Google Scholar] [CrossRef]
  32. Bonnema, M.; Oaida, C.; David, C.H.; Frasson, R.P.d.M.; Yun, S.-H. The Global Surface Area Variations of Lakes and Reservoirs as Seen from Satellite Remote Sensing. Geophys. Res. Lett. 2022, 49, e2022GL098987. [Google Scholar] [CrossRef]
  33. Wang, S.; Dou, H. (Eds.) Lakes in China; Science Press: Beijing, China, 1998. [Google Scholar]
  34. Lehner, B.; Grill, G. Global river hydrography and network routing: Baseline data and new approaches to study the world’s large river systems. Hydrol. Process. 2013, 27, 2171–2186. [Google Scholar] [CrossRef]
  35. China National Environmental Monitoring Centre. National Surface Water Quality Automatic Monitoring Real-Time Data Publishing System. Available online: https://szzdjc.cnemc.cn:8070/GJZ/Business/Publish/Main.html (accessed on 15 January 2025).
  36. National Tibetan Plateau Data Center. Data Center Resources and Repository. Available online: https://www.tpdc.ac.cn/ (accessed on 15 January 2025).
  37. National Earth System Science Data Center; Lake-Watershed Science Data Center. Lake-Watershed Science Data Resources. Available online: https://lake.geodata.cn/ (accessed on 15 January 2025).
  38. Li, P.; Hao, F.; Wu, H.; Nie, H. Spatiotemporal dynamic analysis of eutrophication status based on machine learning-based retrieval algorithm: Case study in Liangzi Lake, Hubei, China. Remote Sens. 2024, 16, 4192. [Google Scholar] [CrossRef]
  39. U.S. Geological Survey. Estimation of Secchi Depth from Turbidity Data in the Willamette River at Portland, OR (14211720). Available online: https://or.water.usgs.gov/will_morrison/secchi_depth_model.html (accessed on 15 January 2025).
  40. U.S. Geological Survey. Landsat Collection 2 Surface Reflectance. USGS Landsat Missions. Available online: https://www.usgs.gov/landsat-missions/landsat-collection-2-surface-reflectance (accessed on 15 January 2025).
  41. U.S. Geological Survey. Landsat Collection 2 Quality Assessment Bands. USGS Landsat Missions. Available online: https://www.usgs.gov/landsat-missions/landsat-collection-2-quality-assessment-bands (accessed on 15 January 2025).
  42. Storey, J.C.; Scaramuzza, P.; Schmidt, G.L.; Barsi, J. Landsat 7 scan line corrector-off gap-filled product development. In Proceedings of the Pecora 16—Global Priorities in Land Remote Sensing, Sioux Falls, SD, USA, 23–27 October 2005.
  43. Roy, D.P.; Kovalskyy, V.; Zhang, H.K.; Vermote, E.F.; Yan, L.; Kumar, S.S.; Egorov, A.V. Characterization of Landsat-7 to Landsat-8 reflective wavelength and normalized difference vegetation index continuity. Remote Sens. Environ. 2016, 185, 57–70.
  44. Xiang, B.; Song, J.-W.; Wang, X.-Y.; Zhen, J. Improving the accuracy of estimation of eutrophication state index using a remote sensing data-driven method: A case study of Chaohu Lake, China. Water SA 2015, 41, 753–761.
  45. Liu, H.; He, B.; Zhou, Y.; Yang, X.; Zhang, X.; Xiao, F.; Feng, Q.; Liang, S.; Zhou, X.; Fu, C. Eutrophication monitoring of lakes in Wuhan based on Sentinel-2 data. GIScience Remote Sens. 2021, 58, 776–798.
  46. Liu, H.; He, B.; Zhou, Y.; Kutser, T.; Toming, K.; Feng, Q.; Yang, X.; Fu, C.; Yang, F.; Li, W.; et al. Trophic state assessment of optically diverse lakes using Sentinel-3-derived trophic level index. Int. J. Appl. Earth Obs. Geoinf. 2022, 114, 103026.
  47. Yang, F.; He, B.; Zhou, Y.; Li, W.; Zhang, X.; Feng, Q. Trophic status observations for Honghu Lake in China from 2000 to 2021 using Landsat satellites. Ecol. Indic. 2023, 146, 109898.
  48. Li, J.; Zheng, Z.; Liu, G.; Chen, N.; Lei, S.; Du, C.; Xu, J.; Li, Y.; Zhang, R.; Huang, C. Estimating effects of natural and anthropogenic activities on trophic level of inland water: Analysis of Poyang Lake Basin, China, with Landsat-8 observations. Remote Sens. 2023, 15, 1618.
  49. Zhou, Y.; He, B.; Fu, C.; Giardino, C.; Bresciani, M.; Liu, H.; Feng, Q.; Xiao, F.; Zhou, X.; Liang, S. Assessments of trophic state in lakes and reservoirs of Wuhan using Sentinel-2 satellite data. Eur. J. Remote Sens. 2021, 54, 461–475.
  50. O’Brien, R.M. A caution regarding rules of thumb for variance inflation factors. Qual. Quant. 2007, 41, 673–690.
  51. NOAA National Centers for Environmental Information (NCEI). Global Surface Summary of the Day (GSOD). 1929–Present (Daily Station Summaries; Used Here for Annual Mean Temperature and Annual Total Precipitation). Available online: https://www.ncei.noaa.gov/access/metadata/landing-page/bin/iso?id=gov.noaa.ncdc:C00516&view=iso (accessed on 15 January 2025).
  52. Xu, X. China Annual NDVI and EVI Dataset (1 km; 2000–2024). Data Center for Resources and Environmental Sciences; Chinese Academy of Sciences (RESDC). Available online: https://www.resdc.cn/DOI/doi.aspx?DOIid=49 (accessed on 15 January 2025).
  53. Gao, J.; Shi, Y.; Zhang, H.; Chen, X.; Zhang, W.; Shen, W.; Xiao, T.; Zhang, Y. China Regional 250 m Fractional Vegetation Cover (FVC) Dataset (2000–2024). National Tibetan Plateau/Third Pole Environment Data Center (TPDC). Available online: https://data.tpdc.ac.cn/en/data/f3bae344-9d4b-4df6-82a0-81499c0f90f7 (accessed on 15 January 2025).
  54. NASA LP DAAC. MOD17A3HGF.061: MODIS/Terra Net Primary Production Gap-Filled Yearly L4 Global 500 m SIN Grid (Version 6.1). Available online: https://www.earthdata.nasa.gov/data/catalog/lpcloud-mod17a3hgf-061 (accessed on 15 January 2025).
  55. Wang, D.; Peng, Q.; Li, X.; Zhang, W.; Xia, X.; Qin, Z.; Ren, P.; Liang, S.; Yuan, W. A Long-Term High-Resolution Dataset of Grasslands Grazing Intensity in China. Figshare, 2025, Version 3. 2024. Available online: https://figshare.com/articles/dataset/A_long-term_high-resolution_dataset_of_grasslands_grazing_intensity_in_China/26195684/3 (accessed on 5 July 2025).
  56. Lebakula, V.; Gonzales, J.; Stipek, C.; Tsybina, E.; Zimmer, A.; Nukavarapu, N.; Jeong, B.; Reynolds, B.; Kaufman, J.; Fan, J.; et al. LandScan Global 2024. Available online: https://landscan.ornl.gov/2025; (accessed on 15 October 2025).
  57. Yang, J.; Huang, X. The 30 m Annual Land Cover Dataset and Its Dynamics in China from 1990 to 2019. Earth Syst. Sci. Data 2021, 13, 3907–3925.
  58. Hoerl, A.E.; Kennard, R.W. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics 1970, 12, 55–67.
  59. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
  60. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
  61. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2016), San Francisco, CA, USA, 13–17 August 2016; ACM: New York, NY, USA, 2016; pp. 785–794.
  62. Håkanson, L. The importance of lake morphometry for the structure and function of lakes. Int. Rev. Hydrobiol. 2005, 90, 433–461.
  63. Roberts, D.R.; Bahn, V.; Ciuti, S.; Boyce, M.S.; Elith, J.; Guillera-Arroita, G.; Hauenstein, S.; Lahoz-Monfort, J.J.; Schröder, B.; Thuiller, W.; et al. Cross-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure. Ecography 2017, 40, 913–929.
  64. Chai, T.; Draxler, R.R. Root mean square error (RMSE) or mean absolute error (MAE)?—Arguments against avoiding RMSE in the literature. Geosci. Model Dev. 2014, 7, 1247–1250.
  65. Kvalseth, T.O. Cautionary note about R². Am. Stat. 1985, 39, 279–285.
  66. Willmott, C.J.; Matsuura, K. Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Clim. Res. 2005, 30, 79–82.
  67. Grace, J.B. Structural Equation Modeling and Natural Systems; Cambridge University Press: Cambridge, UK, 2006; ISBN 9780521837422.
  68. Fotheringham, A.S.; Brunsdon, C.; Charlton, M. Geographically Weighted Regression: The Analysis of Spatially Varying Relationships; John Wiley & Sons: Chichester, UK, 2002; ISBN 9780471496168.
  69. Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. arXiv 2017, arXiv:1705.07874.
  70. Wooldridge, J.M. Econometric Analysis of Cross Section and Panel Data, 2nd ed.; The MIT Press: Cambridge, MA, USA, 2010; ISBN 9780262232586.
  71. Xu, H.; Tan, X.; Liang, J.; Cui, Y.; Gao, Q. Impact of agricultural non-point source pollution on river water quality: Evidence from China. Front. Ecol. Evol. 2022, 10, 858822.
  72. Shen, W.; Hu, Q.; Yu, X.; Imwa, B.T. Does coastal local government competition increase coastal water pollution? Evidence from China. Int. J. Environ. Res. Public Health 2020, 17, 6862.
  73. Lipovetsky, S.; Conklin, M. Analysis of regression in game theory approach. Appl. Stoch. Models Bus. Ind. 2001, 17, 319–330.
  74. Song, E.; Nelson, B.L.; Staum, J. Shapley effects for global sensitivity analysis: Theory and computation. SIAM/ASA J. Uncertain. Quantif. 2016, 4, 1060–1083.
  75. Zhou, Y.; He, B.; Xiao, F.; Feng, Q.; Kou, J.; Liu, H. Retrieving the lake trophic level index with Landsat-8 image by atmospheric parameter and RBF: A case study of lakes in Wuhan, China. Remote Sens. 2019, 11, 457.
  76. Wang, M.; Strokal, M.; Burek, P.; Kroeze, C.; Ma, L.; Janssen, A.B.G. Excess nutrient loads to Lake Taihu: Opportunities for nutrient reduction. Sci. Total Environ. 2019, 664, 865–873.
  77. Mayer, P.M.; Reynolds, S.K.; McCutchen, M.D.; Canfield, T.J. Meta-analysis of nitrogen removal in riparian buffers. J. Environ. Qual. 2007, 36, 1172–1180.
  78. Kumwimba, M.N.; Zhu, B.; Stefanakis, A.I.; Asamoah, G.A.; Muyembe, D.K. Nutrient and sediment retention by riparian vegetated buffer strips: Impacts of buffer length, vegetation type, and season. Agric. Ecosyst. Environ. 2024, 369, 109050.
  79. Woolway, R.I.; Sharma, S.; Smol, J.P. Lakes in hot water: The impacts of a changing climate on aquatic ecosystems. BioScience 2022, 72, 1050–1061.
  80. Mi, C.; Shatwell, T.; Ma, J.; Wentzky, V.C.; Rinke, K. Synergistic effects of warming and internal nutrient loading interfere with the long-term stability of lake restoration and induce sudden re-eutrophication. Environ. Sci. Technol. 2023, 57, 4003–4013.
  81. Wiegand, M.C.; Nascimento, A.T.P.; Costa, A.C.; Lima Neto, I.E. Trophic state changes of semi-arid reservoirs as a function of the hydro-climatic variability. J. Arid Environ. 2021, 184, 104321.
  82. Adrian, R.; O’Reilly, C.M.; Zagarese, H.; Baines, S.B.; Hessen, D.O.; Keller, W.; Livingstone, D.M.; Sommaruga, R.; Straile, D.; Van Donk, E.; et al. Lakes as sentinels of climate change. Limnol. Oceanogr. 2009, 54, 2283–2297.
Figure 1. Location of IMXP lake region and the distribution of the analyzed lakes. The background shows elevation. The thick black outline indicates the study-area boundary, and dashed lines denote sub-basin boundaries. Blue circles represent the 610 perennial lakes whose surface area remained ≥1 km² during 2000–2024; symbol size indicates lake area class (1–10, 10–100, 100–500, and ≥500 km²). Red flags mark in situ lake monitoring stations used for 2000–2024 observations. The inset map shows the study region within China.
Figure 2. Agreement between measured and retrieved TLI for the training set (n = 1076) and independent test set (n = 269): (a) XGBoost, (b) random forest (RF), (c) support vector regression (SVR), and (d) ridge regression (RR). Blue points denote training samples and red points denote test samples. The dashed line indicates the 1:1 line, and the solid line indicates the fitted regression; R², RMSE, and MAE are reported in each panel.
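The three agreement statistics reported in each panel of Figure 2 can be computed as in the minimal sketch below. It assumes R² is the coefficient of determination (1 minus the ratio of residual to total sum of squares); the paper may instead report the squared correlation of the fitted line, so treat the `r2` entry as one possible definition. The function name `regression_metrics` is hypothetical.

```python
import math

def regression_metrics(measured, retrieved):
    """Return R^2, RMSE and MAE between measured and retrieved TLI values."""
    n = len(measured)
    mean_obs = sum(measured) / n
    # Residual and total sums of squares around the observed mean.
    ss_res = sum((o - p) ** 2 for o, p in zip(measured, retrieved))
    ss_tot = sum((o - mean_obs) ** 2 for o in measured)
    return {
        "r2": 1.0 - ss_res / ss_tot,   # assumed definition, see lead-in
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(o - p) for o, p in zip(measured, retrieved)) / n,
    }
```

Applied separately to the training (n = 1076) and test (n = 269) samples, this gives one set of metrics per subset and model.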
Figure 3. Spatiotemporal dynamics of lake trophic level index (TLI) across IMXP during 2000–2024. (a) Plateau-wide annual mean TLI derived from the Landsat time series; the orange dashed line shows the Sen’s slope estimate, with the 95% CI and p-value provided. (b) Spatial distribution of the multi-year mean TLI (2000–2024) for 610 lakes. (c) Lake-level Sen’s slope of annual TLI (TLI yr⁻¹) with the proportions of increasing and decreasing trends. (d) Statistical significance of lake-level trends based on the Mann–Kendall test (p < 0.05) with the proportions of significant and non-significant trends.
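The two trend statistics named in Figure 3 can be sketched as follows. This is an illustrative implementation, not the authors' code: Sen's slope is taken as the median of all pairwise slopes, and the Mann–Kendall p-value uses the normal approximation without a tie correction (an assumption that is reasonable for continuous annual TLI values). The function names are hypothetical.

```python
import math
from itertools import combinations
from statistics import median

def sens_slope(years, values):
    """Sen's slope: median of the slopes between every pair of years."""
    slopes = [(values[j] - values[i]) / (years[j] - years[i])
              for i, j in combinations(range(len(years)), 2)]
    return median(slopes)

def mann_kendall_p(values):
    """Two-sided Mann-Kendall p-value (normal approximation, no tie correction)."""
    n = len(values)
    # S counts concordant minus discordant pairs in time order.
    s = sum((values[j] > values[i]) - (values[j] < values[i])
            for i, j in combinations(range(n), 2))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2.0))
```

Running both functions on each lake's 25 annual values gives the slope mapped in panel (c) and the significance class mapped in panel (d).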
Figure 4. Relative contributions of six drivers to interannual TLI variability at overall and basin scales (2000–2024). (a) Overall Shapley-based contribution (%) of each driver to the explained variance of annual TLI variability. (b) Basin-wise contributions for B01–B53 (Table A1); stacked bars sum to 100% within each basin. Asterisks mark statistically significant coefficients in the basin-specific regressions, and “+”/“−” indicate positive/negative associations.
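A Shapley-based split of explained variance, as described for Figure 4, can be illustrated with the sketch below. It assumes an ordinary least-squares model and decomposes its R² by averaging each driver's marginal R² gain over all subsets of the other drivers; whether the authors used exactly this estimator is not stated in this section. The helper names `r2` and `shapley_r2` are hypothetical.

```python
import math
from itertools import combinations
import numpy as np

def r2(X, y, cols):
    """R^2 of an OLS fit of y on the selected columns (with intercept)."""
    if not cols:
        return 0.0
    A = np.column_stack([np.ones(len(y)), X[:, list(cols)]])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()

def shapley_r2(X, y):
    """Shapley share of the full-model R^2 attributed to each column of X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    p = X.shape[1]
    phi = np.zeros(p)
    for j in range(p):
        rest = [c for c in range(p) if c != j]
        for size in range(p):
            # Shapley weight for coalitions of this size.
            w = (math.factorial(size) * math.factorial(p - size - 1)
                 / math.factorial(p))
            for S in combinations(rest, size):
                phi[j] += w * (r2(X, y, S + (j,)) - r2(X, y, S))
    return phi
```

The shares sum to the full-model R², so `100 * phi / phi.sum()` yields percentages that add to 100% within a basin, matching the stacked-bar layout of panel (b). With six drivers this requires only a few hundred small regressions per basin.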
Figure 5. Spatial distribution of statistically significant driver–TLI associations across sub-basins. Each panel maps the basins in which a given driver has a statistically significant coefficient (p < 0.05) in basin-specific regressions: (a) LHGI, (b) population density, (c) precipitation, (d) built-up ratio, (e) temperature, and (f) NDVI. Red and blue polygons denote significantly positive and negative associations, respectively; gray indicates non-significance. Inset statistics in each panel report the number of significant basins and the range of regression coefficients (β) for positive (+) and negative (−) associations; a dash (-) indicates no basin in that category.
Figure 6. Sensitivity analysis of the SD prediction-to-observation ratio k on TLI and model performance in IMXP. (a) Changes in total TLI (ΔTLI_total, black line, left y-axis, unitless) and additional RMSE (ΔRMSE, red line, right y-axis, unitless) as a function of the SD multiplier k (x-axis). Here, k is defined as the ratio of predicted SD to observed SD, used to scale the estimated SD values. (b) Frequency distribution of the SD multiplier k (x-axis) in IMXP (y-axis: count); vertical dashed lines denote the 5th (P5, k = 0.67) and 95th (P95, k = 1.41) percentiles of the k distribution.
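The direction and rough size of the TLI response to the SD multiplier k in Figure 6 follow from the SD coefficients in Table 1. The sketch below assumes the sub-index has the log-linear form TLI_SD = 10(a + b ln SD), so that scaling SD by k shifts the sub-index by 10·b·ln k. The weight `w_sd` applied to obtain the change in total TLI is a caller-supplied assumption, because the weights are not reproduced in this section (0.1834 is the value commonly used in the Chinese TLI scheme).

```python
import math

B_SD = -1.940  # b coefficient for SD, taken from Table 1

def delta_tli_sd(k: float) -> float:
    """Shift in the SD sub-index when SD is multiplied by k (assumed log-linear form)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return 10.0 * B_SD * math.log(k)

def delta_tli_total(k: float, w_sd: float) -> float:
    """Shift in total TLI, holding all other sub-indices fixed; w_sd is an assumption."""
    return w_sd * delta_tli_sd(k)

# Evaluate at the 5th and 95th percentiles of k reported in the caption.
for k in (0.67, 1.41):
    print(k, round(delta_tli_total(k, w_sd=0.1834), 2))
```

Because b is negative, underestimating SD (k < 1) raises TLI and overestimating it (k > 1) lowers TLI.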
Table 1. Coefficients for the TLI sub-index equations (Equation (3)).
| Parameter j | xⱼ | Unit | aⱼ | bⱼ |
| Chl-a | ρChl-a | mg/m³ | 2.500 | 1.086 |
| TP | ρTP | mg/L | 9.436 | 1.624 |
| TN | ρTN | mg/L | 5.453 | 1.694 |
| SD | dSD | m | 5.118 | −1.940 |
| CODMn | ρCODMn | mg/L | 0.109 | 2.661 |
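To show how the coefficients in Table 1 are used, the sketch below evaluates a single sub-index. Equation (3) is not reproduced in this section, so the functional form TLI_j = 10(aⱼ + bⱼ ln xⱼ) is an assumption based on the standard trophic level index formulation; the dictionary keys and the function name `sub_index` are hypothetical.

```python
import math

# (a_j, b_j) pairs copied from Table 1; units are noted per parameter.
COEFFS = {
    "chla": (2.500, 1.086),    # mg/m3
    "tp": (9.436, 1.624),      # mg/L
    "tn": (5.453, 1.694),      # mg/L
    "sd": (5.118, -1.940),     # m
    "codmn": (0.109, 2.661),   # mg/L
}

def sub_index(param: str, value: float) -> float:
    """Assumed form of Equation (3): TLI_j = 10 * (a_j + b_j * ln(x_j))."""
    if value <= 0:
        raise ValueError("concentration or depth must be positive")
    a, b = COEFFS[param]
    return 10.0 * (a + b * math.log(value))
```

The negative bⱼ for SD means that clearer water (larger Secchi depth) lowers the sub-index, whereas the other four parameters raise it as they increase.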
Table 2. Final predictors after VIF screening and corresponding VIF values.
| Feature | Band Formula (OLI Notation) | VIF |
| Green–Red ratio | B3/B4 | 1.375 |
| Blue–Red ratio | B2/B4 | 1.295 |
| Blue–NIR ratio | B2/B5 | 1.341 |
| NIR–Green ratio | B5/B3 | 1.648 |
| NIR–SWIR1 ratio | B5/B6 | 1.069 |
| NIR–Red index | (B5 − B4)/(B5 + B4) | 1.297 |
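The six predictors in Table 2 are simple band combinations, which the following sketch computes for one water pixel. It assumes the inputs are surface reflectance values in OLI band numbering (B2 blue, B3 green, B4 red, B5 NIR, B6 SWIR1); the function name and dictionary keys are hypothetical.

```python
def tli_features(b2, b3, b4, b5, b6):
    """Six spectral predictors from Table 2 for one pixel (OLI band notation)."""
    return {
        "green_red": b3 / b4,
        "blue_red": b2 / b4,
        "blue_nir": b2 / b5,
        "nir_green": b5 / b3,
        "nir_swir1": b5 / b6,
        "nir_red_index": (b5 - b4) / (b5 + b4),
    }
```

For other Landsat sensors the same formulas would apply after mapping each band to its spectral equivalent, since the band numbers differ between sensors.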
Table 3. Set of driving factors for TLI.
| Category | Variable (Unit) | Temporal Scale | Spatial Scale and Format |
| Climatic conditions | temp (°C) | Annual | 1 km, Raster |
| | precip (mm) | Annual | 10 km (2000); 1 km (2001–2024), Raster/Tabular |
| Vegetation conditions | NDVI (-) | Annual | 1 km, Raster |
| | FVC (%) | Annual | 250 m, Raster |
| | NPP (kg C·m⁻²·yr⁻¹) | Annual | 500 m, Raster |
| | LHGI (-) | Annual | 10 km (2000); 250 m (2001–2024), Raster |
| Human activities | pop_dens (persons·km⁻²) | Annual | 1 km, Raster |
| | built_ratio (%) | Annual | 30 m, Raster |
| | farm_ratio (%) | Annual | 30 m, Raster |
Table 4. VIF Values of Different Driving Factors.
| Variable | NDVI | FVC | NPP | Farm_Ratio | Built_Ratio | Pop_Dens | Temp | LHGI | Precip |
| original | 54.95 | 34.10 | 12.16 | 9.22 | 6.49 | 3.16 | 3.05 | 2.74 | 1.61 |
| final | 2.28 | – | – | – | 3.87 | 2.14 | 1.66 | 1.13 | 1.60 |
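The VIF values in Table 4 can be reproduced in principle with the sketch below, which regresses each driver on the others and then removes collinear drivers one at a time. Both the stepwise rule (drop the driver with the largest VIF, then recompute) and the cut-off of 5 are assumptions made for illustration; the screening rule used by the authors is not stated in this section. The function names are hypothetical.

```python
import numpy as np

def vif(X):
    """VIF of each column: 1 / (1 - R^2) from regressing it on the other columns."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        # 1 / (1 - R^2) equals total sum of squares over residual sum of squares.
        out[j] = ((y - y.mean()) ** 2).sum() / (resid @ resid)
    return out

def screen(X, names, threshold=5.0):
    """Drop the highest-VIF column until every VIF is below the threshold."""
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:
        v = vif(X)
        worst = int(np.argmax(v))
        if v[worst] < threshold:
            break
        X = np.delete(X, worst, axis=1)
        names.pop(worst)
    return names, vif(X)
```

Recomputing after each removal matters because dropping one collinear driver lowers the VIF of the drivers it was correlated with, which is consistent with the lower "final" values in Table 4.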
Table 5. Comparison of remote-sensing-based TLI estimation models in previous studies and this study.
| Study Area | RS Data and Features | Best Model | Accuracy (Test/Validation) |
| Chaohu Lake | MODIS MOD09 SR; B1–B5 | ANN (BP-LM) | R² = 0.8937; MSE = 5.3452 [44] |
| Wuhan urban lakes | Landsat-8 OLI (+AWV); radiometric + AWV features | RBFNN | R² = 0.641; RMSE = 5.104 [75] |
| Wuhan lakes | Sentinel-2 MSI; MCI, B5/B4, B3/B4, parameter k | RBFNN | R² = 0.64; MAE = 4.67; MRE = 8.47%; RMSE = 6.15 [45] |
| Wuhan lakes | Sentinel-3 OLCI; OWT framework (OWT-specific inputs) | OWT + LMBR-BPNN | MAE = 4.56; MAPE = 8.33%; RMSE = 5.98 [46] |
| Honghu Lake | Landsat series; Landsat predictors (+optional air temperature and water level) | Semi-empirical RBFNN | R² = 0.723; RMSE = 4.97 [47] |
| Liangzi Lake | Landsat-8 OLI; 19 spectral features | PCA–LASSO–RF | R² = 0.54; RMSE = 4.7; MAE = 3.7 [38] |
| Poyang Lake Basin | Landsat-8 OLI; Chl-a based band combinations | Semi-analytical TLI model | MAD = 3.58; RMSD = 4.43; MAPD = 8.88% [48] |
| Wuhan lakes and reservoirs | Sentinel-2 MSI; FUI/hue angle | GPR | RMSE = 5.8; MAPE = 9% [49] |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhang, Y.; Cao, F.; Rong, Y.; Wen, L.; Su, W.; Wu, J.; Yin, Y.; Zi, Z.; Liu, S.; Liu, L. Climate and Vegetation Dominate Lake Eutrophication in the Inner Mongolia–Xinjiang Plateau (2000–2024). Remote Sens. 2026, 18, 988. https://doi.org/10.3390/rs18070988

Chicago/Turabian Style

Zhang, Yuzheng, Feifei Cao, Yuping Rong, Linglong Wen, Wei Su, Jianjun Wu, Yaling Yin, Zhilin Zi, Shasha Liu, and Leizhen Liu. 2026. "Climate and Vegetation Dominate Lake Eutrophication in the Inner Mongolia–Xinjiang Plateau (2000–2024)" Remote Sensing 18, no. 7: 988. https://doi.org/10.3390/rs18070988
