Multi-Source Data Integration and Model Coupling for Watershed Eco-Assessment Systems: Progress, Challenges, and Prospects

Li Ma; Zihe Xu; Lina Fan; Hongxia Jia; Hao Hu; Lixin Li

doi:10.3390/pr13092998

¹ Information Center of Ministry of Ecology and Environment, Beijing 100029, China

² School of Environment and Chemical Engineering, Heilongjiang University of Science and Technology, Harbin 150022, China

* Authors to whom correspondence should be addressed.

Processes 2025, 13(9), 2998; https://doi.org/10.3390/pr13092998

This article belongs to the Special Issue Artificial Intelligence-Based Analytics for Data-Driven Decision-Making in Industrial Process Engineering

Abstract

The integrated assessment of watershed ecosystems is increasingly critical for sustainable water resource management amid global environmental change. Multi-source data integration—encompassing in situ monitoring, remote sensing, and model-based observations—has significantly expanded the spatial and temporal scales at which watershed processes can be analyzed. Concurrently, advances in model coupling strategies, ranging from loose to embedded architectures, have enabled more dynamic and holistic representations of interactions among hydrology, water quality, and ecological systems. However, a unifying operational framework that links multi-source data, cross-scale coupling, and rigorous uncertainty propagation to actionable, real-time decision support is still missing, largely due to gaps in interoperability and stakeholder engagement. Addressing these limitations demands the development of intelligent, adaptive modeling frameworks that leverage hybrid physics-informed machine learning, cross-scale process integration, and continuous real-time data assimilation. Open science practices and transparent model governance are essential for ensuring reproducibility, stakeholder trust, and policy relevance. The recent literature indicates that loose coupling predominates, physics-informed ML tends to generalize better in data-sparse settings, and uncertainty communication remains uneven. Building on these insights, this review synthesizes methods for data harmonization and cross-scale integration, compares coupling architectures and data assimilation schemes, evaluates uncertainty and interoperability practices, and introduces the Smart Integrated Watershed Eco-Assessment Framework (SIWEAF) to support adaptive, real-time, stakeholder-centered decision-making.

Keywords: watershed eco-assessment; multi-source data fusion; model coupling strategies; digital twin systems; open science and interoperability

1. Introduction

Watersheds are fundamental landscape units where hydrological, ecological, and biogeochemical processes converge. Their ecological integrity supports freshwater availability, biodiversity conservation, agricultural productivity, and resilience to environmental change [1,2]. However, escalating anthropogenic pressures—urbanization, agricultural intensification, pollution, and climate variability—have severely compromised watershed functions worldwide [3,4,5]. This degradation has prompted urgent calls for more integrated, predictive, and adaptive watershed assessment frameworks [6,7,8].

Traditional eco-assessment methods, largely reliant on sparse in situ observations or single-process models, struggle to capture the inherent complexity, feedbacks, and nonlinearity of watershed systems [9,10]. Limitations in spatial and temporal resolution, inability to assimilate heterogeneous data streams, and failure to account for cross-domain interactions (e.g., hydrology-ecology feedbacks) constrain their utility in dynamic environmental contexts [11]. Against this backdrop, the purpose of this review is to consolidate advances in multi-source data integration and model coupling for watershed eco-assessment, and to clarify what is ready for practice versus what remains at the research stage.

In response, there is a paradigm shift toward multi-source data integration and model coupling as core strategies for next-generation watershed eco-assessment. Multi-source integration leverages the complementary strengths of diverse datasets—including ground-based sensor networks, satellite remote sensing products (e.g., MODIS, Sentinel-2), biological monitoring indices (e.g., macroinvertebrate IBI scores), and model-generated outputs (e.g., ERA5 reanalysis)—to build a richer, more continuous, and spatially comprehensive understanding of watershed processes [12,13]. Advanced techniques such as spatiotemporal harmonization, machine learning-driven data fusion, and geospatial knowledge graphs are being increasingly applied to manage and exploit this data heterogeneity [14,15].

Model coupling further enhances eco-assessment by linking different process models across hydrology, water quality, and ecology domains [16,17,18,19]. Coupled models allow the dynamic feedbacks between components—such as the effects of hydrologic alterations on aquatic habitat quality or nutrient cycling—to be explicitly simulated [20,21]. Coupling strategies range from loose file-based exchanges to tight and embedded integrations, facilitated by standards like OpenMI and component frameworks like CSDMS [22,23]. Case studies in the Chesapeake Bay watershed and Great Barrier Reef catchments demonstrate the tangible management benefits of integrated, coupled modeling systems, such as improved pollutant source identification and adaptive management interventions [24,25].

Despite these advances, critical challenges persist. Data standardization and interoperability remain significant bottlenecks, as datasets originate from disparate agencies, platforms, and conventions [26]. Scale mismatches between datasets (e.g., 10 m-resolution remote sensing vs. lumped watershed models) introduce aggregation errors that propagate through coupled models [11]. Furthermore, uncertainty quantification across complex model chains is underdeveloped, risking overconfidence in model-based decision support tools [12].

New technologies introduce additional dimensions of opportunity and complexity. Artificial intelligence (AI) and physics-informed machine learning (PIML) offer powerful tools for data integration, error correction, and surrogate modeling, but integrating AI outputs with traditional process-based models raises concerns about interpretability, generalizability, and ethical use [15,20,21]. Real-time digital twin systems, leveraging IoT sensor networks, remote platforms, and dynamic models, promise continuous monitoring and prediction for adaptive watershed management. However, scalability, cyberinfrastructure, and governance issues pose formidable barriers to operational deployment [22].

This review focuses on basin- to reach-scale eco-assessment systems that unite hydrological, water quality, and ecological components under realistic data and operational constraints. We prioritize studies that (i) combine at least two independent observation streams, (ii) implement explicit coupling among process models or embed AI components within mechanistic frameworks, and (iii) report evaluation against independent observations or management outcomes [27]. We exclude purely conceptual coupling proposals that lack demonstration at catchment scale and single-variable skill studies that do not address cross-domain responses. Where possible, we distinguish research prototypes from operational deployments and summarize computational, data, and governance requirements to support transferability across regions and institutions. This scope clarifies how evidence is weighed and which conclusions can be generalized beyond individual case studies.

Emerging data sources—including UAV-acquired imagery, crowdsourced environmental observations, and edge-computing-enabled smart sensors—further enrich the data landscape but demand novel integration strategies to handle issues of data quality, privacy, and bias [23,25]. Simultaneously, the governance dimension cannot be overlooked: successful implementation of integrated eco-assessment systems requires not only technical innovation but also institutional collaboration, data sharing agreements, capacity building, and stakeholder engagement [13,24].

To avoid ambiguity, we define ‘multi-source data integration’ as the harmonization of in situ, remote-sensing, biological, and reanalysis data into consistent spatiotemporal representations for inference or prediction. A digital twin is defined as a continuously synchronized virtual representation of a watershed that ingests live data, performs forecasting via process and/or AI surrogates, and supports human-in-the-loop control. We assess approaches using multiple lenses: predictive skill (e.g., NSE, KGE, RMSE) and spatiotemporal correspondence, ecological response indicators (e.g., IBI changes or habitat suitability), computational efficiency and scalability, interoperability and maintainability (e.g., component standards), reproducibility and transparency of data, and the clarity of uncertainty quantification and communication across model chains. These criteria underpin our synthesis of strengths, limitations, and suitability for deployment.
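The predictive-skill metrics named above can be made concrete with a short sketch. The function names are illustrative choices, and the KGE shown is the common 2009 formulation (correlation, variability ratio, bias ratio); individual studies may use modified variants.

```python
import math


def rmse(obs, sim):
    """Root mean square error between observed and simulated series."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((s - o) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def kge(obs, sim):
    """Kling-Gupta efficiency (2009 form) from r, alpha and beta."""
    n = len(obs)
    mean_o, mean_s = sum(obs) / n, sum(sim) / n
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    std_s = math.sqrt(sum((s - mean_s) ** 2 for s in sim) / n)
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim)) / n
    r = cov / (std_o * std_s)   # linear correlation
    alpha = std_s / std_o       # variability ratio
    beta = mean_s / mean_o      # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

NSE and KGE both equal 1 for a perfect simulation, but KGE separates timing, variability, and bias errors, which is why the two are often reported together.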

Given these complexities, the future of watershed eco-assessment will likely involve hybrid systems that blend multi-source observational data, coupled process models, AI-augmented components, and real-time decision-support interfaces. It will also necessitate rigorous attention to transparency, reproducibility, and uncertainty communication to maintain stakeholder trust and scientific credibility [26,28].

Our cross-case synthesis yields several transferable lessons. Interoperable architectures and shared vocabularies lower transaction costs for coupling and maintenance [29], while modular data assimilation improves robustness under sensor outages and regime shifts [30]. Success hinges on aligning spatial and temporal supports among observations and models, establishing governance for data access and stewardship [31], and budgeting for cyberinfrastructure alongside science tasks. Common pitfalls include over-reliance on black-box surrogates without stress-testing generalization [32], neglect of scale mismatch and aggregation bias, and under-reported end-to-end uncertainty that erodes stakeholder trust. We therefore emphasize transparent workflows, calibrated handoffs between AI and process components, and audit trails from raw data to decisions.

However, prior surveys are fragmented—separating data fusion from process modeling, treating cross-domain feedbacks and chain-wide uncertainty only qualitatively, and overlooking governance and cyberinfrastructure needed for real-time digital twins. In this review, we critically synthesize recent advances in multi-source data integration and model coupling for watershed eco-assessment. Our aims are to (i) delineate data types and integration techniques and specify when each is appropriate; (ii) compare coupling architectures and workflows across hydrology–water quality–ecology, including enabling standards; (iii) assess data assimilation practices and end-to-end uncertainty quantification across model chains; (iv) appraise AI/PIML-enabled hybrid modeling and emerging real-time digital twin implementations; and (v) distill design principles into a practical roadmap for robust, adaptive, and transparent eco-assessment. By distilling progress and delineating persistent challenges, we aim to chart a roadmap for developing robust, adaptive, and actionable eco-assessment frameworks that can support sustainable watershed management under accelerating global change [33,34,35].

2. Data Sources for Watershed Eco-Assessment

As a roadmap for this review, Figure 1 situates multi-source data, coupling architectures, and digital twin frontiers across Sections 2–6.

Figure 1. Integrated framework for watershed eco-assessment: review roadmap (Sections 2–6).

Effective watershed eco-assessment relies on integrating multiple, complementary data streams to comprehensively characterize hydrological, chemical, and biological processes across diverse spatial and temporal scales. Four major categories of data sources—in situ ground measurements, remote sensing products, biological monitoring data, and model-generated outputs—form the foundation of integrated assessment systems. A synergistic combination of these data types is essential to overcome the limitations inherent in any single source and to enable robust, scalable eco-assessment frameworks.

2.1. Ground Monitoring Data

In situ measurements remain the cornerstone of watershed studies, providing high-accuracy “ground truth” observations that anchor model calibrations and validate remote sensing outputs. Traditional datasets include streamflow records from gauging stations, precipitation measurements from meteorological networks, groundwater level observations from monitoring wells, and water quality sampling stations that quantify parameters such as nutrients, dissolved oxygen, pH, turbidity, and temperature [36]. These datasets typically offer fine temporal resolution (e.g., hourly, sub-daily) and long historical records, making them invaluable for trend analysis, model validation, and detection of regime shifts under climatic and anthropogenic forcing [37,38,39]. Where such records are scarce, remote sensing (RS) can partly substitute: in the water balance model of the Hala Lake basin shown in Figure 2 [39], the model was calibrated with RS-based evapotranspiration (ET) products and validated using lake water storage change (LWSC) from multi-mission satellite data and basin water storage change from GRACE. The approach achieved NSE values ≥ 0.7 in most sub-basins and strong agreement with RS products (R² = 0.8), supporting the feasibility of RS-driven hydrological modeling in data-scarce regions.

Figure 2. Water balance model of Hala Lake basin and its application process based on remote sensing data [39]. Copyright, 2025 Elsevier.

However, spatial coverage of ground monitoring networks often remains sparse, particularly in topographically complex, remote, or politically unstable regions. Data gaps due to equipment failure, maintenance interruptions, and site decommissioning pose additional challenges to continuous monitoring efforts [40]. Furthermore, heterogeneities in measurement protocols, instrumentation sensitivity, and data management standards across agencies complicate multi-source data harmonization.

Efforts such as the USGS National Water Information System in the United States, the European Environment Agency’s Waterbase, and emerging citizen science initiatives (e.g., CrowdWater, FreshWater Watch) are progressively expanding the spatial and thematic coverage of ground-based observations [40]. Nonetheless, ensuring long-term data consistency, minimizing human-induced biases, and standardizing quality assurance and control protocols remain critical needs for maximizing the utility of in situ monitoring in integrated watershed assessment.

2.2. Remote Sensing Data

Remote sensing technologies offer synoptic, repeatable, and multi-scalar observations critical for watershed eco-assessment, particularly in regions where ground monitoring is limited or absent. Multispectral and hyperspectral satellite missions (e.g., Landsat series, Sentinel-2, MODIS, PRISMA) enable monitoring of land cover changes, vegetation dynamics, surface water extents, wetland hydrology, and water quality indicators such as chlorophyll-a concentration, turbidity, and suspended solids [41]. Furthermore, unmanned aerial vehicles (UAVs) equipped with miniaturized sensors are increasingly used to provide ultra-high-resolution data for localized assessments, bridging the spatial-temporal gap between satellite and field observations [42]. Advances in radar (e.g., Sentinel-1 SAR) and passive microwave sensors (e.g., SMAP) provide data on soil moisture, flood inundation extents, snowpack dynamics, and ice cover, independent of cloud and illumination conditions.

High-resolution topographic datasets derived from airborne or terrestrial LiDAR surveys are increasingly crucial for watershed delineation, hydraulic modeling, and floodplain mapping. The integration of these detailed topographic data with process-based ecological models allows for more accurate simulation of hydrological pathways and their influence on habitat suitability [43]. Recent innovations, such as PlanetScope’s daily high-resolution imagery and ICESat-2’s laser altimetry for inland water bodies, are further pushing the frontiers of remote watershed observation [44,45].

Despite their advantages, remote sensing datasets are often subject to limitations including cloud contamination (for optical sensors), coarse revisit times, spectral resolution trade-offs, and indirectness of measurements (e.g., inferring water quality proxies rather than direct chemical concentrations). Calibration and validation using in situ observations are necessary to ensure remote sensing-derived products’ accuracy and transferability across diverse watershed settings [45]. In this context, community-based monitoring and citizen science initiatives are emerging as valuable sources of ground-truth data for validating satellite-derived water quality products [46]. Emerging techniques such as data fusion, machine learning-based downscaling, and physics-informed retrieval algorithms are enhancing the reliability and applicability of remote sensing for eco-assessment purposes.

2.3. Biological Monitoring Data

Aquatic biological communities act as sensitive integrators of watershed health, responding cumulatively to physical, chemical, and biological stressors across multiple temporal scales. Macroinvertebrate assemblages, fish community structure, periphyton biomass, and microbial community diversity are among the most frequently monitored biological indicators. Field-based surveys typically yield indices such as the Biological Monitoring Working Party (BMWP) score, the Index of Biotic Integrity (IBI), and multimetric indices tailored to specific regional contexts [47]. Cross-regional calibrations and trait-based extensions of BMWP/IBI have improved transferability and diagnostic power, particularly along land-use and hydromorphological gradients [48].

Biological data provide critical complementary information to physicochemical monitoring by capturing integrated ecosystem responses that are otherwise difficult to infer from discrete environmental variables alone. However, collecting biological data poses significant logistical challenges: labor-intensive fieldwork, seasonal and annual variability, sampling bias, and the need for specialized taxonomic expertise all hinder large-scale, frequent biological monitoring campaigns [49]. Meta-analyses of bioassessment datasets quantify seasonal and observer variance and recommend sampling designs and power analyses to ensure trend detectability [50].

The advent of molecular and sensor-based technologies is beginning to revolutionize biological monitoring. Environmental DNA (eDNA) metabarcoding enables detection of a wide array of aquatic taxa from water samples, vastly improving the spatial resolution, taxonomic coverage, and cost-effectiveness of biological assessments [51]. Comparative studies show that eDNA metabarcoding often recovers greater taxonomic richness and detects rare or early-invading taxa versus kick-net surveys, albeit with marker and reference-library biases to manage [52]. Similarly, in situ sensors for measuring chlorophyll fluorescence and algal pigments allow near-real-time tracking of algal blooms and primary productivity, enhancing the temporal responsiveness of eco-assessments. Integrating biological monitoring data with physicochemical and remote sensing datasets is increasingly recognized as a best practice for holistic watershed health evaluation [53,54].

2.4. Model Outputs and Reanalysis Products

Model-generated outputs are critical for augmenting observational datasets, especially in data-scarce or inaccessible regions, and for providing the spatially and temporally continuous estimates needed for eco-assessment and scenario planning. Hydrological models such as the Soil and Water Assessment Tool (SWAT+) and the Variable Infiltration Capacity (VIC) model produce continuous runoff, soil moisture, evapotranspiration, and groundwater recharge estimates across catchments [12]. Water quality models simulate nutrient dynamics, sediment transport, pathogen dispersion, and chemical pollutant fate within watershed networks [55].

Meteorological reanalysis products, including ERA5 (produced by ECMWF) and the North American Land Data Assimilation System (NLDAS), assimilate multi-source observations into physically consistent gridded datasets of key atmospheric and land surface variables such as precipitation, temperature, radiation, and humidity [56]. These products enable retrospective analysis of watershed conditions, facilitate forcing hydrological and ecological models, and support climate change impact assessments.

Despite their utility, model-generated datasets inherently carry uncertainties originating from model structural assumptions, parameter choices, numerical approximations, and errors in input forcings. Calibration against ground observations, sensitivity analysis, and robust uncertainty quantification frameworks are essential to ensure the credibility and applicability of model outputs in decision-support contexts [57]. Moreover, hybrid approaches that blend physical models with data-driven techniques (e.g., machine learning emulators) are emerging as powerful tools to enhance model performance and adaptability under changing environmental conditions.

In summary, multi-source data streams provide a rich yet heterogeneous foundation for watershed eco-assessment. Table 1 summarizes Sections 2.1–2.4 with a side-by-side comparison of variables, spatiotemporal support, strengths/limitations, and representative products. Effectively integrating these complementary datasets is critical for capturing complex system dynamics and supporting robust evaluations. The following section explores methods designed to synthesize these diverse information sources into coherent analytical frameworks.

Table 1. Overview of eco-assessment data sources (2.1–2.4)—variables, spatiotemporal support, strengths/limitations, representative products; plus a column for model outputs and reanalysis (2.4).

3. Methods of Multi-Source Data Integration

The integration of heterogeneous data streams is fundamental to building coherent and robust watershed eco-assessment systems. Effective multi-source integration demands addressing discrepancies in data formats, spatial-temporal scales, semantic meanings, and inherent uncertainties [59]. A comprehensive integration pipeline typically involves four core stages: data cleaning and standardization, spatial-temporal harmonization, feature extraction and transformation, and data fusion (Figure 2) [60]. Each step requires rigorous methodological choices to minimize information loss and uncertainty amplification.

3.1. Data Cleaning and Standardization

Environmental datasets frequently suffer from incomplete records, outliers, sensor drift, and inconsistencies arising from diverse instrumentation, protocols, or human error. Data cleaning encompasses procedures such as outlier detection (e.g., Z-score or Mahalanobis distance methods), missing data imputation (e.g., multiple imputation by chained equations), and temporal smoothing to remove noise while preserving signal integrity [61].
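As a minimal sketch of these cleaning steps, the following flags outliers by Z-score and then fills the resulting gaps. Linear interpolation is used here as a deliberately simple stand-in for multiple imputation by chained equations; the function names and the default threshold of 3 are illustrative assumptions, not values taken from the cited studies.

```python
import statistics


def zscore_flags(values, threshold=3.0):
    """Return True where |z| exceeds the threshold; None entries are never flagged."""
    valid = [v for v in values if v is not None]
    mean = statistics.fmean(valid)
    std = statistics.pstdev(valid)
    if std == 0:
        return [False] * len(values)
    return [v is not None and abs(v - mean) / std > threshold for v in values]


def fill_gaps_linear(values):
    """Linearly interpolate interior None gaps; leading/trailing gaps stay None."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


def clean_series(values, threshold=3.0):
    """Mask Z-score outliers as missing, then interpolate across the gaps."""
    flags = zscore_flags(values, threshold)
    masked = [None if flagged else v for v, flagged in zip(values, flags)]
    return fill_gaps_linear(masked)
```

Note that a Z-score test can only flag a point when the series is long enough: with n samples the largest attainable |z| is roughly the square root of n, so a threshold of 3 is ineffective on very short records.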

Advanced approaches increasingly leverage machine learning for anomaly detection. For instance, unsupervised algorithms like isolation forests or autoencoders are used to detect sensor faults or implausible values in high-frequency hydrological data [62]. Integrating domain knowledge into cleaning algorithms—such as enforcing conservation of mass in water balance datasets—further improves reliability.
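The idea of using conservation of mass as a domain-knowledge check can be illustrated with a simple water balance closure test. The record keys (`P`, `ET`, `Q`, `dS`) and the 10% relative tolerance are hypothetical choices for this sketch; a real workflow would set the tolerance from the known errors of each input product.

```python
def water_balance_residual(precip, et, runoff, storage_change):
    """Closure error of P - ET - Q - dS (all terms in the same depth units)."""
    return precip - et - runoff - storage_change


def flag_imbalance(records, rel_tol=0.1):
    """Flag records whose closure error exceeds rel_tol times precipitation.

    Each record is a dict with keys 'P', 'ET', 'Q' and 'dS' (assumed layout).
    """
    flags = []
    for rec in records:
        residual = water_balance_residual(rec["P"], rec["ET"], rec["Q"], rec["dS"])
        scale = max(abs(rec["P"]), 1e-9)  # avoid division by zero in dry periods
        flags.append(abs(residual) / scale > rel_tol)
    return flags
```

Flagged periods are candidates for inspection rather than automatic deletion, since a large residual may reflect an unmeasured flux as well as a sensor fault.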

Standardization involves reconciling units, spatial projections, variable definitions, and metadata conventions [63]. Adoption of open standards such as WaterML for time series hydrology, Climate and Forecast (CF) conventions for gridded datasets, and SensorML for metadata improves machine-readability and cross-platform interoperability [64].

Without rigorous standardization, downstream integration and modeling efforts become error-prone, opaque, and irreproducible. Thus, metadata completeness and adherence to community ontologies are increasingly regarded as critical best practices in watershed informatics.

3.2. Spatial-Temporal Harmonization

The multi-source nature of watershed data necessitates aligning datasets that differ in spatial resolution, coverage, and temporal frequency. Spatial harmonization strategies depend on dataset types and assessment objectives.

For instance, satellite remote sensing imagery (e.g., 30 m Landsat) may be aggregated to the catchment or sub-catchment scale using GIS zonal statistics [65], while gridded climate data (e.g., 0.25° ERA5) may be downscaled using statistical or dynamical methods to match finer hydrological model requirements [66]. Machine learning techniques such as random forests and deep neural networks have been employed for spatial downscaling of precipitation and soil moisture [67].
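The aggregation step can be sketched as a zonal mean over two aligned grids, one holding values and one holding zone labels such as sub-catchment IDs. This is a pure-Python illustration of what GIS zonal statistics tools compute, with nested lists standing in for rasters; it assumes the two grids are already on the same projection and cell size.

```python
from collections import defaultdict


def zonal_mean(values, zones, nodata=None):
    """Mean of grid values per zone label.

    `values` and `zones` are equally shaped nested lists; cells that are
    None, equal to `nodata`, or that have no zone label are skipped.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            if zone is None or value is None or value == nodata:
                continue
            sums[zone] += value
            counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in sums}
```

For example, `zonal_mean([[1.0, 2.0], [3.0, 4.0]], [["A", "A"], ["B", "B"]])` averages the top row into zone A and the bottom row into zone B.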

Temporal harmonization addresses discrepancies between different sampling intervals. Aggregation (e.g., converting hourly precipitation to daily totals) or disaggregation (e.g., temporal disaggregation of monthly runoff using temperature patterns) is applied based on the target modeling timestep [68]. Advanced methods such as temporal Gaussian process interpolation or LSTM-based time series modeling can reconstruct missing or irregular observations with higher accuracy [69].
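A minimal sketch of the aggregation case follows, converting hourly precipitation to daily totals. The `min_coverage` parameter is an assumption added for illustration: days with too few valid hourly values return None instead of a misleadingly low total.

```python
from collections import defaultdict
from datetime import datetime


def hourly_to_daily_totals(records, min_coverage=24):
    """Sum hourly values into daily totals.

    `records` is an iterable of (datetime, value) pairs; None values are
    treated as missing. Days with fewer than `min_coverage` valid values
    are reported as None.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, value in records:
        if value is None:
            continue
        day = timestamp.date()
        totals[day] += value
        counts[day] += 1
    return {
        day: (totals[day] if counts[day] >= min_coverage else None)
        for day in sorted(totals)
    }
```

Making the coverage rule explicit matters for extremes: silently summing an incomplete day would understate storm totals and bias any downstream model forced with them.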

Critically, harmonization must balance computational practicality against the preservation of key process signals (e.g., extreme events), and propagate uncertainties appropriately into the integrated datasets [70].

3.3. Feature Extraction and Transformation

Beyond raw harmonization, feature extraction transforms primary variables into process-relevant indicators. Derived features can reveal latent hydrological, ecological, or biogeochemical processes otherwise obscured in raw datasets.

Hydrologic indices such as baseflow index, runoff coefficients, and flow duration curves are derived from streamflow time series to characterize watershed response [71]. Vegetation indices such as NDVI, EVI, and land surface temperature metrics from optical and thermal imagery capture vegetation health and evapotranspiration dynamics [72].
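Several of these indicators reduce to short formulas. The sketch below computes a runoff coefficient, an empirical flow duration curve, and NDVI; the Weibull plotting position rank/(n + 1) is one common convention for exceedance probability and is an assumption here, as are the function names.

```python
def runoff_coefficient(runoff_mm, precip_mm):
    """Ratio of total runoff depth to total precipitation depth."""
    return sum(runoff_mm) / sum(precip_mm)


def flow_duration_curve(flows):
    """Return (exceedance probability, flow) pairs, highest flow first.

    Uses the Weibull plotting position p = rank / (n + 1).
    """
    ranked = sorted(flows, reverse=True)
    n = len(ranked)
    return [(rank / (n + 1), q) for rank, q in enumerate(ranked, start=1)]


def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    denom = nir + red
    return (nir - red) / denom if denom != 0 else float("nan")
```

The baseflow index is omitted because it requires a hydrograph separation method (for instance a recursive digital filter), and the choice of method and filter parameter affects the result.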

Feature engineering techniques, including principal component analysis (PCA), independent component analysis (ICA), and t-SNE, help reduce high-dimensional remote sensing datasets into compact, informative feature spaces. More recently, convolutional neural networks (CNNs) trained on raw satellite imagery have been used to automatically extract spatial patterns relevant to watershed status [73].

Normalization, scaling, and transformation (e.g., Box-Cox transformations) prepare features for statistical or machine learning models, ensuring comparability across datasets with different distributions. Emerging workflows increasingly adopt theory-guided feature extraction, integrating physical constraints (e.g., water and energy balance laws) into automated feature engineering pipelines [74,75].
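The Box-Cox transformation mentioned above maps a positive variable x to (x^λ − 1)/λ, with the natural logarithm as the limiting case λ = 0. A minimal sketch, followed by zero-mean, unit-variance scaling, is given below; in practice λ is usually estimated from the data (for example by maximum likelihood), which is not shown here.

```python
import math
import statistics


def box_cox(values, lam):
    """Box-Cox transform for strictly positive values with a fixed lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]
```

Skewed variables such as streamflow or nutrient concentration are typically transformed first and standardized afterwards, so that the scaling is not dominated by a few extreme values.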

3.4. Data Fusion Techniques

Data fusion is the synthesis of multiple datasets into a coherent, unified information structure for analysis, modeling, or decision support [76,77]. Fusion approaches span a continuum from simple rule-based methods to sophisticated machine learning and hybrid physics-informed models [78,79].

(1) Rule-based fusion methods are deterministic and transparent, such as prioritizing ground measurements over satellite-derived estimates when available or combining multiple rainfall products via weighted averages based on known accuracies. Despite simplicity, they may not optimally handle biases or dynamic uncertainty.

(2) Statistical fusion methods, such as Bayesian model averaging, Kalman filtering, and ensemble approaches, explicitly model uncertainties and probabilistically blend observations and predictions [80,81]. Data assimilation frameworks (e.g., Ensemble Kalman Filter) are widely used in hydrology for real-time updating of model states with observations, improving forecast accuracy.

(3) Machine learning fusion methods train supervised learning algorithms (e.g., random forests [82,83], gradient boosting machines, deep neural networks) to learn nonlinear mappings between heterogeneous input datasets and target watershed variables (e.g., predicting nitrate loads from satellite land cover, precipitation, and soil attributes) [84,85]. The value of integrating diverse data sources has also been demonstrated in the optimization of multi-parameter systems outside hydrology, such as foamed ceramics synthesis [82].

(4) Hybrid fusion strategies integrate process-based models with data-driven correctors or surrogates, an approach commonly termed physics-informed machine learning (PIML). Examples include augmenting hydrological process models with deep learning error correctors or constraining neural networks with mass balance equations to ensure physical realism [49,74].
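Two of the simpler fusion ideas in the list above can be written in a few lines: an accuracy-weighted average of several products, and the scalar form of the Kalman update that underlies ensemble filters. This sketch assumes independent, unbiased errors with known variances; operational assimilation systems work with multivariate states and ensemble-estimated covariances.

```python
def inverse_variance_fusion(estimates, variances):
    """Combine independent estimates, weighting each by 1 / variance.

    Returns the fused value and its variance.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fused = sum(w * e for w, e in zip(weights, estimates)) / total
    return fused, 1.0 / total


def kalman_update(prior_mean, prior_var, obs, obs_var):
    """Scalar Kalman update of a model state with one direct observation."""
    gain = prior_var / (prior_var + obs_var)  # weight given to the observation
    post_mean = prior_mean + gain * (obs - prior_mean)
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var
```

For a single state and a single observation the two functions give the same answer: the Kalman update is an inverse-variance average of the model forecast and the observation, applied sequentially as new data arrive.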

Fusion strategy selection depends on the specific application domain, available data quality and quantity, interpretability requirements, and computational constraints. In watershed eco-assessment, ensemble and hybrid approaches are gaining prominence for their ability to balance physical credibility with data-driven predictive power.

Looking forward, scalable and explainable fusion frameworks that integrate diverse data modalities (e.g., optical imagery, hydrometeorological time series, ecological surveys) will be essential for operationalizing real-time, adaptive watershed management systems.

Together, these integration methods offer powerful tools for reconciling heterogeneous datasets and extracting meaningful eco-hydrological insights. However, data fusion alone is insufficient; coupling with dynamic simulation models is essential to predict watershed responses under changing conditions. The next section reviews the simulation models on which such coupling builds.

4. Models for Watershed Eco-Assessment

Process-based models are indispensable tools for watershed eco-assessment, enabling the simulation of hydrological, water quality, and ecological dynamics under natural and anthropogenic influences. These models serve to extrapolate observations, assess system behavior under alternative scenarios, and support decision-making for sustainable watershed management [86]. Based on modeling objectives and system complexity, watershed models can be broadly classified into three categories: hydrological models, water quality models, and integrated ecohydrological models.

4.1. Hydrological Models

Hydrological models simulate the movement, distribution, and storage of water in watershed systems to support understanding of streamflow generation, groundwater recharge, and flood risk.

Physically based distributed models, such as SWAT+ [87] and the VIC model, solve process-level water and energy balances over explicit spatial units (grid cells or hydrologic response units, HRUs). Parameters are linked to measurable properties via governing equations, distinguishing physically based schemes from purely empirically calibrated approaches [84]. They represent precipitation partitioning, infiltration, evapotranspiration, and runoff generation using mapped inputs (soils, land cover, topography, climate), enabling scenario analyses of land-use change and climate variability.

Conceptual lumped models (e.g., HBV, GR4J) aggregate the catchment into a few interconnected “stores” (e.g., snow, soil, groundwater, channel) linked by empirical transfer functions. Such models emphasize parsimony and equifinality-aware calibration rather than strict parameter measurability [88]. Parameters are calibrated from hydrographs, delivering computational efficiency for large domains or data-scarce regions [89].
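The "store" concept can be illustrated with a single linear reservoir. This is a hypothetical toy model written for this sketch, not an implementation of HBV or GR4J: it has one store, one calibrated parameter `k`, and evapotranspiration limited by available storage.

```python
def simulate_linear_reservoir(precip, pet, k=0.1, storage0=0.0):
    """Single-store conceptual model (toy example).

    Per time step: add precipitation, remove evapotranspiration up to the
    available storage, then release a fraction k of storage as streamflow.
    Returns (flows, actual_et, final_storage); all terms share depth units.
    """
    storage = storage0
    flows, actual_et = [], []
    for p, e in zip(precip, pet):
        storage += p
        et = min(e, storage)  # ET cannot exceed water in the store
        storage -= et
        q = k * storage       # linear outflow: Q = k * S
        storage -= q
        flows.append(q)
        actual_et.append(et)
    return flows, actual_et, storage
```

With `precip=[10, 0, 0]`, no evapotranspiration, and `k=0.5`, the outflow halves each step (5, 2.5, 1.25), which is the exponential recession that lumped models reproduce by calibrating `k` against observed hydrographs.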

Recent advances combine hydrological models with remote-sensing data assimilation (e.g., soil moisture, snow, ET) and machine-learning post-processors to reduce bias and improve predictions in ungauged or poorly monitored basins, while retaining the model’s physical structure [90].

4.2. Water Quality Models

Water quality models simulate the transport, transformation, and fate of pollutants within aquatic systems, capturing processes such as nutrient cycling, sediment transport, and contaminant dispersion [91,92].

Models like QUAL2K [93] and WASP (Water Quality Analysis Simulation Program) are widely used for riverine water quality simulations, modeling constituents such as dissolved oxygen, nitrogen, phosphorus, and toxicants along river reaches. In agricultural landscapes, integrated watershed models like SWAT incorporate sediment and nutrient loading dynamics from uplands to stream networks, linking land management practices to downstream water quality outcomes [87].

Recent developments emphasize coupled hydrology-water quality frameworks that dynamically link flow generation, sediment detachment, and solute transport processes, improving predictive capabilities under variable climate and land-use conditions [94]. Meanwhile, data-driven surrogate models, particularly machine learning-based emulators, are increasingly used to approximate computationally intensive water quality simulations, facilitating rapid scenario analysis [95].

4.3. Integrated Ecohydrological Models

Integrated ecohydrological models explicitly simulate the feedbacks between hydrological processes and ecological responses, providing a holistic framework for watershed eco-assessment [94,95,96].

Models such as RHESSys (Regional Hydro-Ecologic Simulation System) [97] and MIKE SHE–ECO Lab combine physically based hydrology with ecosystem components, enabling assessments of how hydrological alterations affect vegetation dynamics, habitat availability, and biogeochemical cycling. Such frameworks are critical for evaluating the ecological impacts of flow regulation, land conversion, and climate change [97,98,99].

Emerging integrated platforms leverage modular architectures to flexibly couple hydrology, water quality, and ecological sub-models via interfaces such as OpenMI and CSDMS [21,22]. For example, modular coupling of SWAT with aquatic habitat models (e.g., CASiMiR) enables evaluation of habitat suitability under different flow regimes [85]. In parallel, hybrid models that couple a process-based core with AI components used for sub-grid parameterization, bias correction, and surrogate emulation of computationally expensive processes have emerged to improve computational scalability and to enable probabilistic uncertainty characterization in complex watershed assessments [49,100].

These integrated models attempt to simulate feedbacks between water, climate, and biota, providing a more holistic representation of watershed function. For example, a coupled surface water-groundwater-ecology model can assess how flow regime alterations impact aquatic habitat and species dynamics. The complexity of integrated models often demands significant computational power and detailed data, but they are increasingly important for scenario analysis in environmental flow management and climate adaptation planning.

4.4. Model Selection and Application Considerations

Model selection must be guided by assessment objectives, spatial and temporal scales, data availability, system complexity, and computational resources. Physically based distributed models offer detailed process representations but require intensive calibration and data inputs, while lumped or semi-distributed models trade process realism for operational feasibility [101].

Increasingly, multi-model ensembles are adopted to address structural uncertainties and improve robustness in eco-assessment predictions [102]. Moreover, transparent reporting of model assumptions, calibration procedures, and validation outcomes is critical to ensure credibility and reproducibility in watershed modeling applications [103].
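As a hedged illustration of how a multi-model ensemble might be combined, the sketch below weights each model by the inverse of its mean squared error over a calibration period. Inverse-error weighting is only one simple option and is assumed here for illustration; the function names are hypothetical.

```python
def inverse_mse_weights(predictions, observed):
    """predictions: dict of model name -> simulated series; returns normalized weights."""
    raw = {}
    for name, series in predictions.items():
        mse = sum((s - o) ** 2 for s, o in zip(series, observed)) / len(observed)
        # Small constant avoids division by zero for a perfect fit
        raw[name] = 1.0 / (mse + 1e-12)
    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}


def weighted_ensemble(predictions, weights):
    """Weighted average of all member series, time step by time step."""
    n_steps = len(next(iter(predictions.values())))
    return [
        sum(weights[name] * predictions[name][t] for name in predictions)
        for t in range(n_steps)
    ]
```

Weights derived this way reflect past skill only; they do not by themselves quantify structural uncertainty, so the spread among members is usually reported alongside the combined series.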

Ultimately, the integration of models with multi-source observational data and real-time monitoring systems will be essential for advancing from retrospective assessment to proactive, adaptive watershed management.

The suite of modeling approaches reviewed here enables detailed representation of hydrological, water quality, and ecological processes at varying resolutions and complexities. Nevertheless, meaningful eco-assessment increasingly depends on the ability to dynamically link these models, allowing for cross-domain feedbacks. The subsequent section examines coupling strategies that integrate diverse models into unified simulation systems.

5. Coupling Strategies for Integrated Eco-Assessment

The complexity of watershed eco-assessment necessitates the integration of multiple models representing hydrological, water quality, and ecological processes. Model coupling—the systematic linking of individual models into a unified computational framework—enables the simulation of cross-domain feedbacks, supports scenario analysis under changing environmental drivers, and enhances predictive capability for management interventions [104]. As eco-assessment goals evolve from descriptive assessments toward proactive, real-time decision support, coupling strategies must balance realism, computational efficiency, transparency, and scalability to meet operational demands [103].

5.1. Loose Coupling

Loose coupling represents the simplest integration strategy, where individual models operate independently and exchange information through external files, standardized formats, or customized scripts without synchronized runtime communication [105]. Typically, outputs from an upstream model are processed and formatted to serve as inputs for a downstream model, facilitating a modular workflow. This architecture offers notable advantages, including ease of implementation, flexibility in model replacement or upgrading, and compatibility with legacy systems [106]. A common example is the linkage of a hydrological model (e.g., SWAT) with an aquatic habitat model (e.g., PHABSIM) to evaluate habitat suitability under different flow regimes. However, loose coupling presents important limitations. Temporal misalignments between models can produce inconsistent or inaccurate system predictions, particularly when modeling rapidly evolving phenomena such as flash floods or algal blooms [107]. Additionally, errors and uncertainties tend to accumulate across model interfaces without feedback correction mechanisms, while manual reconciliation of differing spatial resolutions, temporal steps, and unit conventions can significantly increase data processing overhead [108]. The sequential execution structure further complicates uncertainty propagation tracking, undermining the robustness of sensitivity analyses. Despite these limitations, loose coupling remains valuable for exploratory scenario analysis and applications where feedback loops between subsystems are weak or unidirectional.
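The file-based hand-off described above can be sketched as follows. Both "models" are stand-ins: `run_hydrology` simply writes a given flow series, and the triangular suitability curve with `q_opt` and `q_tol` is a hypothetical placeholder, not PHABSIM or SWAT.

```python
import csv
import os
import tempfile


def run_hydrology(path, flows):
    """Stand-in upstream model: writes daily discharge (m3/s) to a CSV file."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["day", "flow_m3s"])
        for day, q in enumerate(flows):
            writer.writerow([day, q])


def habitat_suitability(q, q_opt=10.0, q_tol=5.0):
    """Hypothetical suitability curve: 1 at q_opt, falling to 0 at q_opt +/- q_tol."""
    return max(0.0, 1.0 - abs(q - q_opt) / q_tol)


def run_habitat(path):
    """Stand-in downstream model: reads the exchanged file after the upstream run."""
    with open(path, newline="") as f:
        return [habitat_suitability(float(row["flow_m3s"])) for row in csv.DictReader(f)]


def run_loose_chain(flows):
    """Run the two models in sequence; the CSV file is the only interface."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "flow.csv")
        run_hydrology(path, flows)
        return run_habitat(path)
```

Because the downstream model only sees the finished file, nothing it computes can influence the upstream run, which is why this pattern suits weak or one-way feedbacks.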

5.2. Tight Coupling

Tight coupling introduces a higher degree of integration, enabling models to dynamically exchange information during runtime through shared memory access, application programming interfaces (APIs), or standardized protocols such as OpenMI [109]. In this approach, synchronization between models is maintained at each computational cycle, allowing for iterative feedbacks that improve the realism of process interactions. Tight coupling brings substantial advantages: dynamic cross-domain feedbacks can be explicitly represented, enabling more accurate simulation of complex phenomena such as nutrient-driven aquatic vegetation changes influencing river hydraulics [110]. Moreover, computational efficiency is enhanced compared to sequential loose coupling, particularly when iterative optimization or feedback-controlled simulations are required [101]. Automated data exchange reduces manual intervention, improving reproducibility and strengthening workflow robustness. Notable examples include the integration of MIKE SHE (hydrology) with MIKE 11 (hydrodynamics and water quality) into a seamless simulation platform, or the modular architecture of SWAT+ which enables consistent interaction among land surface, routing, and water quality components [100].

However, implementing tight coupling can be technically challenging, and the resulting integrated model may be computationally intensive. Maintaining numerical stability and convergence when two models run concurrently is another concern. Despite these challenges, tightly coupled models have shown clear benefits in certain applications—for instance, improving predictions of lake–watershed interactions, as demonstrated by a coupled model of Qinghai Lake that successfully reproduced multi-decadal lake level changes and differentiated key water balance components [111].
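A minimal sketch of per-step, two-way exchange is given below. The two toy components and all coefficients (roughness factor 0.1, growth rate 0.1, scour factor 0.02) are assumptions chosen for illustration; they do not represent MIKE SHE, MIKE 11, or SWAT+.

```python
class Hydraulics:
    """Toy hydraulics: velocity drops as vegetation-induced roughness rises."""

    def __init__(self):
        self.velocity = 0.0

    def step(self, inflow, roughness):
        self.velocity = inflow / (1.0 + roughness)
        return self.velocity


class Vegetation:
    """Toy aquatic vegetation: logistic growth minus velocity-driven scour."""

    def __init__(self, biomass=1.0, capacity=10.0):
        self.biomass = biomass
        self.capacity = capacity

    def step(self, velocity):
        growth = 0.1 * self.biomass * (1.0 - self.biomass / self.capacity)
        scour = 0.02 * velocity * self.biomass
        self.biomass = max(0.0, self.biomass + growth - scour)
        return self.biomass


def run_tightly_coupled(inflows):
    """Both components advance together and exchange state at every time step."""
    hyd, veg = Hydraulics(), Vegetation()
    history = []
    for q in inflows:
        roughness = 0.1 * veg.biomass      # vegetation -> hydraulics
        velocity = hyd.step(q, roughness)
        biomass = veg.step(velocity)       # hydraulics -> vegetation, same step
        history.append((velocity, biomass))
    return history
```

Even with constant inflow, the velocity series changes from step to step because the vegetation state feeds back into the hydraulics, a behavior that a one-way file hand-off cannot reproduce.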

5.3. Embedded (Seamless) Coupling

Embedded or seamless coupling represents the deepest level of model integration, wherein multiple process modules—such as hydrology, sediment transport, biogeochemistry, and ecosystem dynamics—are developed within a single software environment, sharing solvers, grids, and control structures [112]. In embedded systems, common computational kernels enforce mass, momentum, and energy conservation across domains, ensuring rigorous consistency in coupled simulations [113]. Examples include ecohydrological models such as RHESSys v7.x, which simultaneously simulates watershed hydrology, carbon cycling, and vegetation dynamics [97], and DHSVM v3.1.x, which integrates snow hydrology, soil moisture, and vegetation processes at high spatial resolution [97]. Embedded coupling offers unmatched computational efficiency, fine-grained process fidelity, and the ability to simulate fully integrated system responses across broad temporal and spatial scales. However, embedded models are often less modular, as updating or replacing individual components—such as a new evapotranspiration scheme—may require significant recoding and system revalidation [114]. Furthermore, the complexity of large, monolithic codebases can obscure internal workings, limiting accessibility, transparency, and adaptability for diverse user communities. High development and maintenance costs also constrain the widespread deployment of embedded frameworks outside specialized research groups. Consequently, embedded coupling is generally reserved for high-fidelity research-grade simulations where maximizing physical realism justifies the sacrifices in modularity and flexibility.

5.4. Emerging Flexible Coupling Frameworks

Recent advances recognize that neither fully loose nor fully embedded coupling offers a universal solution. As a result, flexible, component-based coupling frameworks have emerged to provide modular integration of models. These frameworks (often supported by interoperability standards) allow users to plug in different model components (for hydrology, climate, water quality, etc.) and manage the data exchange between them in a semi-automated way [21]. Notable examples include the Open Modelling Interface (OpenMI) standard, the EPA FRAMES 2.x family platform, and the CSDMS (Community Surface Dynamics Modeling System) which provides a Basic Model Interface for component models [102]. Such frameworks act like “middleware,” handling unit conversions, time-step alignment, and data transfer so that scientists can focus on model dynamics rather than coupling logistics.
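The middleware role can be illustrated with a small component interface loosely inspired by the initialize/update/get/set pattern of the CSDMS Basic Model Interface. The signatures below are simplified assumptions, not the actual BMI or OpenMI APIs, and both components are toy models with hypothetical variable names.

```python
def mm_per_day_to_m3s(depth_mm, area_km2):
    """Convert a runoff depth (mm/day) over a catchment area to discharge (m3/s)."""
    # 1 mm over 1 km2 equals 1000 m3; one day has 86400 s
    return depth_mm * area_km2 * 1000.0 / 86400.0


class Component:
    """Shared get/set surface so a driver can treat components uniformly."""

    def set_value(self, name, value):
        self.values[name] = value

    def get_value(self, name):
        return self.values[name]


class RunoffComponent(Component):
    """Toy land-surface component: runoff is a fixed fraction of precipitation."""

    def initialize(self, runoff_coeff=0.5):
        self.runoff_coeff = runoff_coeff
        self.values = {"precip_mm_day": 0.0, "runoff_mm_day": 0.0}

    def update(self):
        self.values["runoff_mm_day"] = self.runoff_coeff * self.values["precip_mm_day"]


class RoutingComponent(Component):
    """Toy channel component: outflow relaxes toward inflow at each step."""

    def initialize(self, response=0.5):
        self.response = response
        self.values = {"inflow_m3s": 0.0, "outflow_m3s": 0.0}

    def update(self):
        q_out = self.values["outflow_m3s"]
        q_in = self.values["inflow_m3s"]
        self.values["outflow_m3s"] = q_out + self.response * (q_in - q_out)


def run_coupled(precip_series, area_km2):
    """Driver acting as middleware: maps variable names and converts units."""
    land, channel = RunoffComponent(), RoutingComponent()
    land.initialize()
    channel.initialize()
    outflows = []
    for p in precip_series:
        land.set_value("precip_mm_day", p)
        land.update()
        q_in = mm_per_day_to_m3s(land.get_value("runoff_mm_day"), area_km2)
        channel.set_value("inflow_m3s", q_in)
        channel.update()
        outflows.append(channel.get_value("outflow_m3s"))
    return outflows
```

Keeping the unit conversion and name mapping in the driver means either component could be swapped for another that exposes the same small interface, without changes to its internals.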

A key development is the concept of digital twin basins, which takes flexible coupling to the next level. In a digital twin framework, real-time data streams are continuously assimilated into a suite of coupled models representing the basin [103]. The digital twin serves as a living virtual replica of the watershed, blending observations with multi-physics models (hydrological, hydraulic, ecological, etc.) within a high-performance computing environment [115]. This paradigm enables dynamic feedback control (e.g., models adjusting as conditions evolve) and scenario testing in a virtual space. A recent Nature perspective outlined a vision for widely applicable digital twin basins but noted that most existing water models are not yet effectively integrated across all essential processes—highlighting data and modeling gaps that need to be overcome [115]. To address this, physics-informed data-driven approaches are being explored as a glue for multi-model systems [116]. For example, machine learning components can estimate processes that are missing or poorly captured in traditional models, thereby linking model domains more seamlessly.
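Continuous assimilation of observations into a running model can be sketched with a scalar Kalman filter, one of the simplest assimilation schemes. It is used here only as an illustration; the toy forecast model (`decay`), the noise variances, and the function names are assumptions rather than anything prescribed by the digital twin literature cited above.

```python
def kalman_update(forecast, forecast_var, obs, obs_var):
    """Blend a model forecast with an observation, weighted by their variances."""
    gain = forecast_var / (forecast_var + obs_var)
    analysis = forecast + gain * (obs - forecast)
    analysis_var = (1.0 - gain) * forecast_var
    return analysis, analysis_var


def assimilate(initial, initial_var, observations, obs_var,
               decay=0.9, process_var=0.5):
    """Forecast with a toy model x[t+1] = decay * x[t], updating when data arrive.

    observations: list with one entry per step; None marks a missing observation.
    """
    x, p = initial, initial_var
    states = []
    for y in observations:
        # Forecast step: propagate the state and inflate its uncertainty
        x, p = decay * x, decay ** 2 * p + process_var
        # Analysis step: only when an observation is available
        if y is not None:
            x, p = kalman_update(x, p, y, obs_var)
        states.append(x)
    return states
```

When forecast and observation variances are equal, the analysis falls halfway between the two; a more trusted sensor (smaller `obs_var`) pulls the state closer to the observation.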

One illustrative emerging framework is the blueprint for a river basin digital twin proposed in [115], which outlines how real-time monitoring, historical data, analytics, and multi-domain models can be seamlessly integrated within an interoperable platform. This blueprint emphasizes segmenting the basin system into modular components and establishing continuous two-way feedback loops between the physical watershed and its cyber counterpart. The goal is to create adaptive coupling architectures that can evolve with the system, for instance, where AI-driven controllers (or orchestration services) dynamically reconfigure model linkages based on evolving conditions or objectives. Together, these developments point toward “smart coupling” architectures that are adaptive, transparent, and capable of supporting real-time, cross-domain watershed management. A platform-level diagram can visualize how diverse models and data streams interconnect.

5.5. Coupling Strategy Selection and Best Practices

Selecting an appropriate coupling strategy for integrated watershed eco-assessment requires balancing multiple criteria, including scientific fidelity, computational cost, data availability, and end-user needs. In practice, a phased or hybrid approach is often useful: one might begin with loose coupling for ease, then progressively move to tighter coupling as understanding and data improve. It is also common to use different strategies for different subsystems (e.g., tight coupling hydrology–hydraulics, loose coupling to a separate ecological model for post-processing).

These trade-offs among scientific realism, computational efficiency, modularity, and system transparency map onto the main architectures as follows. Loose coupling approaches are well suited for exploratory studies or applications requiring flexible integration of legacy models, whereas tight coupling is preferred when dynamic feedback representation and predictive performance are critical. Embedded coupling, while offering the highest fidelity, demands significant development effort and is best reserved for high-resolution, process-intensive research applications. Community frameworks such as OpenMI and CSDMS operationalize modular interfaces and component coupling for both loose and tight integrations [15,21]. Benchmark architectures such as the SUMMA v3.x family formalize structural choices and promote transparent, modular coupling across process representations [43].

Key best practices include: (a) ensuring temporal/spatial consistency between coupled components (e.g., matching time-steps, aligning grids or watershed units), (b) performing sensitivity analysis to identify which feedbacks are essential to include (not all processes need full coupling), and (c) rigorous calibration and validation of the coupled system as a whole, in addition to validating individual sub-models. Model integration can introduce new uncertainties; thus, uncertainty propagation and analysis are critical [117]. Stakeholder involvement in model design is also beneficial—an integrated model must be understandable and usable by decision-makers to truly add value. Participatory modeling practices enhance transparency and decision relevance [118].
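Practice (a), matching time steps, often reduces to aggregating a finer series onto a coarser step. The sketch below assumes a fixed integer ratio between the two steps and uses a hypothetical helper name; the key point is that fluxes are summed while states and rates are averaged.

```python
def aggregate(values, factor, how="sum"):
    """Aggregate a fine-step series to a coarser step.

    factor: number of fine steps per coarse step (e.g., 24 for hourly -> daily).
    how: "sum" for fluxes such as rainfall depth, "mean" for states or rates
         such as water temperature or discharge.
    """
    if len(values) % factor != 0:
        raise ValueError("series length must be a multiple of the factor")
    chunks = [values[i:i + factor] for i in range(0, len(values), factor)]
    if how == "sum":
        return [sum(chunk) for chunk in chunks]
    if how == "mean":
        return [sum(chunk) / len(chunk) for chunk in chunks]
    raise ValueError("how must be 'sum' or 'mean'")
```

Choosing the wrong rule (for example, summing hourly discharge instead of averaging it) is a common source of silent unit errors at model interfaces.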

Encouragingly, studies have shown that well-chosen coupling strategies can significantly enhance predictive performance and insight. For example, coupling surface and groundwater models has resolved long-standing water budget discrepancies in large basins [119], and linking hydrology with water quality models has improved the attribution of pollution sources in multi-stressor environments [120]. As a rule of thumb, one should adopt the simplest coupling that adequately captures the interactions of interest, to avoid unnecessary complexity. By following best practices and leveraging emerging tools, practitioners can build integrated modeling systems that are both robust and maintainable. Table 2 gives a comparative overview of the coupling strategies, illustrating key characteristics and application contexts.

Table 2. Comparative evaluation of coupling strategies (loose, tight, embedded, and flexible) for watershed eco-assessment against criteria including implementation complexity, computational demand, feedback capability, and example applications.

Best practices for coupling implementation emphasize the importance of clearly articulating coupling objectives and the expected feedback pathways between subsystems. Consistency in spatial and temporal discretization across models is critical to avoid numerical artifacts and ensure smooth information transfer. Interface uncertainties should be systematically managed through ensemble simulations or probabilistic coupling approaches that explicitly quantify and propagate uncertainties across model boundaries [117,121]. Transparent documentation of all data transformations, interface assumptions, and calibration strategies is essential to support reproducibility and facilitate model evaluation [103,122]. Moreover, early engagement of stakeholders in the design and configuration of the coupling framework helps align technical capabilities with policy and management needs, ensuring that eco-assessment outputs are relevant, actionable, and trusted [118].
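Propagating uncertainty across a model boundary with an ensemble can be sketched as a small Monte Carlo experiment. The two-stage chain (runoff, then pollutant load), the distributions, and their parameters are illustrative assumptions, not values drawn from the studies cited above.

```python
import random
import statistics


def propagate(rain_mm, n=2000, seed=42):
    """Monte Carlo propagation through a runoff model and a load model.

    Returns the 5th, 50th, and 95th percentiles of areal load (mg/m2).
    """
    rng = random.Random(seed)  # fixed seed keeps the experiment reproducible
    loads = []
    for _ in range(n):
        # Upstream model: uncertain runoff coefficient, clipped to [0, 1]
        coeff = min(1.0, max(0.0, rng.gauss(0.4, 0.05)))
        runoff_mm = coeff * rain_mm
        # Downstream model: uncertain concentration (mg/L), median 1
        conc = rng.lognormvariate(0.0, 0.3)
        # 1 mm of runoff is 1 L/m2, so mm * mg/L gives mg/m2
        loads.append(runoff_mm * conc)
    cuts = statistics.quantiles(loads, n=20)  # 19 cut points at 5% spacing
    return {"p05": cuts[0], "p50": cuts[9], "p95": cuts[18]}
```

Because each sample passes through both stages, the spread of the final load reflects the combined upstream and downstream uncertainty rather than either source alone.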

As environmental pressures intensify and decision timelines shrink, advancing toward integrated, adaptive, and intelligent model coupling architectures will be pivotal in delivering robust, policy-relevant watershed assessments capable of operating under conditions of deep uncertainty and rapid change.

Advances in coupling strategies have greatly enhanced the ability to simulate interconnected watershed processes under realistic scenarios. Despite these improvements, substantial technical, computational, and institutional challenges remain. The final section identifies these persistent bottlenecks and outlines future research directions necessary to build more adaptive, intelligent, and operationally relevant eco-assessment systems.

6. Challenges and Future Directions

Integrated watershed eco-assessment is evolving rapidly, driven by advances in data science, computational technology, and a growing recognition of the need for holistic water management. In this section, we first identify the persistent challenges and then highlight the emerging trends and future directions that promise to shape next-generation eco-assessment systems.

6.1. Challenges

6.1.1. Data Heterogeneity and Quality

The proliferation of observational platforms—including ground-based monitoring networks, Earth-observing satellites, and citizen science initiatives—has dramatically expanded the availability of watershed-relevant data. However, it has simultaneously intensified data heterogeneity across spatial resolutions, temporal frequencies, formats, and quality standards [123]. Critical data gaps, inconsistencies, and systematic biases remain pervasive, particularly in low-resource regions with limited infrastructure. Moreover, preprocessing steps such as resampling, gap-filling, and bias correction, though necessary, often introduce additional uncertainties that are rarely quantified rigorously, undermining downstream model reliability [124]. Harmonizing multi-source datasets without eroding essential process signals remains an open technical frontier, demanding the development of novel data fusion frameworks that preserve both statistical consistency and dynamical realism.

6.1.2. Scale Mismatch and Model Integration

Watershed processes inherently span a wide range of spatial and temporal scales, from sub-hourly rainfall-runoff responses to multi-decadal vegetation and land-use transitions. Integrating models that operate on incompatible scales or process resolutions often introduces aggregation artifacts, scale mismatch errors, and emergent biases that degrade system realism [125]. Downscaling coarse-grid inputs (e.g., global reanalysis products) or upscaling fine-grained measurements (e.g., soil texture data) remains nontrivial, especially when nonlinear process feedbacks are involved. Developing scale-aware coupling frameworks that explicitly account for cross-scale interactions, hierarchical model structures, and dynamically adaptive resolutions is critical to advancing integrated watershed assessments [126].
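A small numerical example shows why aggregation interacts badly with nonlinear processes. The threshold runoff rule and the four rainfall values are hypothetical; the point is only that modeling the average input differs from averaging the modeled outputs.

```python
def runoff(precip_mm, threshold_mm=20.0):
    """Nonlinear toy process: runoff occurs only above a rainfall threshold."""
    return max(0.0, precip_mm - threshold_mm)


def model_then_average(cells):
    """Run the process on each fine-scale cell, then average the results."""
    return sum(runoff(p) for p in cells) / len(cells)


def average_then_model(cells):
    """Average the inputs to one coarse cell first, then run the process."""
    return runoff(sum(cells) / len(cells))


cells = [5.0, 10.0, 15.0, 50.0]           # fine-scale rainfall (mm)
fine_result = model_then_average(cells)    # 7.5 mm: one wet cell generates runoff
coarse_result = average_then_model(cells)  # 0.0 mm: the mean sits at the threshold
```

The coarse model loses all runoff because the single wet cell is averaged away, which is the kind of aggregation artifact that scale-aware coupling needs to address.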

6.1.3. Computational Complexity

The integration of high-resolution hydrological, biogeochemical, and ecological models, particularly under frameworks demanding uncertainty quantification, ensemble prediction, or real-time updating, imposes substantial computational burdens. Model calibration, sensitivity analysis, and probabilistic scenario evaluation across complex coupled systems require vast computational resources, often limiting the practical application of sophisticated eco-assessment models to academic case studies rather than operational watershed management. While high-performance computing (HPC) and cloud-based platforms offer unprecedented capabilities, bottlenecks in algorithmic scalability, parallelization efficiency, and data throughput persist. There is an urgent need for computational innovations that leverage hybrid architectures, intelligent load balancing, and reduced-order modeling techniques to enable real-time, resource-efficient eco-assessment.

6.1.4. Uncertainty Propagation and Communication

Uncertainty permeates every stage of integrated eco-assessment, cascading from observational measurement errors, through parameter estimation and structural model biases, into ensemble forecast outputs [127]. Yet, despite its ubiquity, the rigorous quantification, propagation, and transparent communication of uncertainty across multi-model workflows remain limited and inconsistently standardized [128]. Particularly under conditions of deep uncertainty, such as climate extremes or land-use transitions, communicating complex uncertainty information to stakeholders and decision-makers is a sociotechnical challenge of growing urgency. Failure to appropriately characterize or communicate uncertainty risks undermining stakeholder trust and misinforming policy interventions. Development of uncertainty-aware visualization frameworks, probabilistic scenario narratives, and risk-based decision tools is essential to bridge the gap between scientific complexity and actionable management.

6.1.5. Governance, Interoperability, and Stakeholder Engagement

Beyond technical challenges, institutional and governance barriers pose significant obstacles to integrated watershed eco-assessment. Fragmentation across data repositories, modeling platforms, and governance agencies leads to duplication of effort, gaps in interoperability, and inefficiencies in knowledge generation [129]. Proprietary software ecosystems, restrictive data licensing practices, and entrenched organizational silos impede the development of open, collaborative modeling environments necessary for integrated, cross-sectoral management [130]. Moreover, the integration of stakeholder perspectives—local knowledge, value systems, and societal priorities—into sophisticated modeling frameworks remains limited, risking a disconnect between technical model outputs and management needs [131]. Advancing inclusive, stakeholder-centered modeling processes is therefore critical for achieving both scientific robustness and social legitimacy in watershed eco-assessment.

6.2. Future Directions

6.2.1. Toward Smart, Adaptive Watershed Systems

The future of watershed eco-assessment lies in the transition from static, periodically updated models to dynamic, smart systems capable of continuous learning and adaptation. Digital twin frameworks, virtual representations of watersheds that are continuously updated with real-time observational data, offer a transformative pathway toward responsive, anticipatory management under conditions of environmental change and uncertainty. By integrating continuous monitoring, intelligent data assimilation, predictive simulation, and automated feedback control, digital twin-based eco-assessment platforms could enable early warning systems, adaptive intervention strategies, and resilient, evidence-based watershed governance. Recent work advocates Earth system digital twins that continuously combine real-time observations with predictive models to inform adaptive management strategies.

6.2.2. AI-Augmented Data Fusion and Modeling

Artificial intelligence (AI), particularly machine learning (ML) and physics-informed machine learning (PIML), holds significant promise for enhancing data fusion, feature extraction, surrogate modeling, and predictive analytics in watershed eco-assessment. Hybrid approaches that integrate domain-based physical process understanding with data-driven learning architectures can improve predictive accuracy, reduce computational burdens, and maintain scientific interpretability [85]. However, caution is warranted to avoid overreliance on “black box” models: embedding physical constraints, uncertainty quantification, and explainability mechanisms into AI architectures is essential for ensuring trustworthy, policy-relevant eco-assessment outputs [13]. PIML frameworks, which integrate physical laws directly into learning algorithms, are emerging as powerful tools to enhance model generalizability and interpretability in environmental systems.
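One common way to embed a physical constraint is to add a penalty term to the training loss. The sketch below assumes a water-balance closure (P - ET - Q - dS = 0) as the constraint and a weighting factor `lam`; both are illustrative choices, and the function names are hypothetical.

```python
def mean_square(values):
    """Mean of squared values."""
    return sum(v * v for v in values) / len(values)


def physics_informed_loss(q_pred, q_obs, precip, et, d_storage, lam=1.0):
    """Data misfit plus a penalty on water-balance violations.

    All series share one time step and one unit (e.g., mm per day).
    lam controls how strongly the physical constraint is enforced.
    """
    # Data term: how far predicted discharge is from observations
    data_term = mean_square([qp - qo for qp, qo in zip(q_pred, q_obs)])
    # Physics term: residual of P - ET - Q - dS, which is zero if mass is conserved
    residuals = [
        p - e - q - ds
        for p, e, q, ds in zip(precip, et, q_pred, d_storage)
    ]
    physics_term = mean_square(residuals)
    return data_term + lam * physics_term
```

In a real training loop this function would be minimized with respect to the parameters of the model that produces `q_pred`; the penalty discourages predictions that fit sparse observations while violating mass conservation.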

6.2.3. Cross-Scale, Multi-Resolution Coupling

Developing cross-scale coupling architectures that flexibly integrate models operating at different spatial and temporal resolutions is a frontier challenge for watershed assessment. Innovations in multi-scale modeling platforms, adaptive mesh refinement (AMR) techniques, and multi-resolution data assimilation strategies are critical to bridging the divide between fine-scale process representation and basin-wide system management [132]. Such architectures must not only preserve critical small-scale dynamics but also allow information aggregation and feedback propagation across scales to inform actionable decisions [133,134].

6.2.4. Emphasizing Uncertainty-Aware Decision Support

Future eco-assessment systems must embed uncertainty quantification and risk-based decision analysis as central, not ancillary, components of their design. Probabilistic forecasting methods, ensemble-based scenario exploration, and robust optimization under uncertainty are crucial tools for informing adaptive, resilient watershed management strategies [135]. Many uncertainty sources are structural and path-dependent; for example, excavation-induced unloading can trigger creep, damage evolution, and shifts in failure modes in deep rock masses, altering risk profiles over time [124]. Furthermore, developing intuitive uncertainty visualization techniques—such as scenario storylines, ensemble fan plots, and decision-tree-based risk maps—will be essential for transparent, stakeholder-centric communication that supports informed decision-making under uncertainty.

6.2.5. Promoting Open Science and Interoperability

Adopting FAIR (Findable, Accessible, Interoperable, Reusable) data principles [136] and promoting open modeling standards are foundational for building collaborative, transparent, and reproducible eco-assessment systems. Open-source, modular modeling platforms coupled with interoperable data hubs will enable rapid innovation, model co-development, and broader stakeholder engagement. Moreover, participatory co-design approaches—where stakeholders are actively involved throughout model conceptualization, development, validation, and application—are critical to ensuring that eco-assessment systems are not only scientifically rigorous but also socially relevant and trusted [118]. Embedding openness, inclusivity, and transparency as core design principles will be central to building watershed management systems fit for the Anthropocene. The FAIR data principles, initially proposed to improve data stewardship, are now being expanded to address the interoperability needs of AI-driven modeling systems [137].

In light of the synthesized challenges and emerging opportunities, we propose a conceptual framework for the next generation of eco-assessment systems: the Smart Integrated Watershed Eco-Assessment Framework (SIWEAF). SIWEAF envisions an adaptive, modular, and stakeholder-driven system architecture that continuously ingests multi-source observations, dynamically couples multi-resolution models, and integrates uncertainty-aware, AI-augmented decision support tools. Central to SIWEAF is the embedding of open science principles, participatory co-design processes, and digital twin technologies to enable near-real-time monitoring, forecasting, and intervention planning under deep uncertainty. By combining technological innovation with governance transformation, SIWEAF aims to bridge the gap between scientific complexity and actionable watershed management, offering a scalable blueprint for resilient, data-driven eco-assessment in the Anthropocene. A conceptual representation of SIWEAF is shown in Figure 3.

Figure 3. Conceptual framework of the Smart Integrated Watershed Eco-Assessment Framework (SIWEAF). Inputs from multi-source observations and stakeholder priorities are integrated through core modules including data fusion, multi-model coupling, AI-augmented modeling (machine learning, physics-informed), uncertainty quantification, and real-time assimilation. Outputs support near-real-time eco-assessment, early warning, and adaptive management, with continuous feedback loops for updating and stakeholder engagement.

Addressing these interlinked challenges will require concerted advances in technology, methodology, and governance [138,139,140]. Future watershed eco-assessment systems must embrace openness, adaptability, and stakeholder-centric design to support sustainable and resilient water resource management in an increasingly uncertain world.

7. Conclusions

Integrated eco-assessment of watersheds has advanced through multi-source data fusion and model coupling, yet heterogeneity, scale mismatch, computation, uncertainty propagation, and institutional fragmentation persist. This review aimed to (i) synthesize methods for data harmonization and cross-scale integration; (ii) compare coupling architectures and data assimilation schemes; (iii) assess uncertainty and interoperability practices; and (iv) introduce the Smart Integrated Watershed Eco-Assessment Framework (SIWEAF) for adaptive, real-time, stakeholder-centered decisions.

Our synthesis indicates that loose coupling predominates, physics-informed ML generalizes better in data-sparse basins, and uncertainty communication remains uneven. To bridge research and practice, priorities include modular digital twin implementations within FAIR-aligned ecosystems, probabilistic/ensemble forecasting, and co-design with stakeholders to ensure legitimacy and uptake.

SIWEAF is designed to operationalize these elements by integrating real-time observations, physics-informed ML, and interoperable components to support anticipatory management under deep uncertainty. Advancing integrated, intelligent, and inclusive eco-assessment is both urgent and achievable for sustainable water resource management. Doing so will require explicit protocols for reproducibility, transparent uncertainty communication, equity-aware metrics, and capacity-building for data-poor regions and agencies across policy contexts.

Funding

This research was supported by the National Key Research and Development Program (2022YFC3202005).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Carpenter, S.R.; Stanley, E.H.; Vander Zanden, M.J. State of the world’s freshwater ecosystems: Physical, chemical, and biological changes. Annu. Rev. Environ. Resour. 2011, 36, 75–99.
2. He, C.; Wang, Y.; Li, Q.-X.; Yan, Z.; Zhang, K.; Ni, S.-F.; Duan, X.-H.; Liu, L. Alkylarylation of alkenes with arylsulfonylacetate as bifunctional reagent via photoredox radical addition/Smiles rearrangement cascade. Chin. Chem. Lett. 2025, 36, 110253.
3. Allan, J.D. Landscapes and Riverscapes: The Influence of Land Use on Stream Ecosystems. Annu. Rev. Ecol. Evol. Syst. 2004, 35, 257–284.
4. Bai, Y.; Li, S.; Ho, S.-H. How do nanomaterials influence the spread of antibiotic resistance genes in aquatic environments? Chin. Chem. Lett. 2025, 111183.
5. Dudgeon, D.; Arthington, A.H.; Gessner, M.O.; Kawabata, Z.-I.; Knowler, D.J.; Lévêque, C.; Naiman, R.J.; Prieur-Richard, A.-H.; Soto, D.; Stiassny, M.L.J.; et al. Freshwater biodiversity: Importance, threats, status and conservation challenges. Biol. Rev. 2006, 81, 163–182.
6. Aredo, M.R.; Lohani, T.K.; Mohammed, A.K. Revisiting the global weights of the integrated watershed health assessment framework and Weyib watershed health analysis: Ethiopia’s policy prospects. World Water Policy 2024, 10, 940–970.
7. Sivapalan, M. Pattern, Process and Function: Elements of a Unified Theory of Hydrology at the Catchment Scale. In Encyclopedia of Hydrological Sciences; John Wiley & Sons: Hoboken, NJ, USA, 2005.
8. Xu, Z.; Yang, P.; Yin, X.; Cai, X. Editorial: Watershed environmental changes and adaptive management for sustainability. Front. Environ. Sci. 2024, 12, 1455906.
9. Kirchner, J.W. Getting the right answers for the right reasons: Linking measurements, analyses, and models to advance the science of hydrology. Water Resour. Res. 2006, 42, W03S04.
10. Tian, Y.; Wen, Z.; Zhang, X.; Cheng, M.; Xu, M. Exploring a multisource-data framework for assessing ecological environment conditions in the Yellow River Basin, China. Sci. Total Environ. 2022, 848, 157730.
11. Jaywant, S.A.; Arif, K.M. Remote Sensing Techniques for Water Quality Monitoring: A Review. Sensors 2024, 24, 8041.
12. Hersbach, H.; Bell, B.; Berrisford, P.; Hirahara, S.; Horányi, A.; Muñoz-Sabater, J.; Nicolas, J.; Peubey, C.; Radu, R.; Schepers, D. The ERA5 global reanalysis. Q. J. R. Meteorol. Soc. 2020, 146, 1999–2049.
13. Reichstein, M.; Camps-Valls, G.; Stevens, B.; Jung, M.; Denzler, J.; Carvalhais, N.; Prabhat, F. Deep learning and process understanding for data-driven Earth system science. Nature 2019, 566, 195–204.
14. Fauvel, M.; Chanussot, J.; Benediktsson, J.A. A spatial–spectral kernel-based approach for the classification of remote-sensing images. Pattern Recognit. 2012, 45, 381–392.
15. Gregersen, J.; Gijsbers, P.; Westen, S. OpenMI: Open modelling interface. J. Hydroinform. 2007, 9, 175–191.
16. Fenocchi, A.; Pella, N.; Copetti, D.; Buzzi, F.; Magni, D.; Salmaso, N.; Dresti, C. Use of process-based coupled ecological-hydrodynamic models to support lake water ecosystem service protection planning at the regional scale. J. Contam. Hydrol. 2025, 268, 104469.
17. Gan, T.; Tucker, G.E.; Hutton, E.W.H.; Piper, M.D.; Overeem, I.; Kettner, A.J.; Campforts, B.; Moriarty, J.M.; Undzis, B.; Pierce, E.; et al. CSDMS Data Components: Data–model integration tools for Earth surface processes modeling. Geosci. Model Dev. 2024, 17, 2165–2185.
18. Hutton, E.W.H.; Piper, M.D.; Tucker, G.E. The Babelizer: Language interoperability for model coupling in the geosciences. J. Open Source Softw. 2022, 7, 3344.
19. Strasser, U.; Warscher, M.; Rottler, E.; Hanzer, F. openAMUNDSEN v 0.8.3: An open source snow-hydrological model for mountain regions. EGUsphere 2024, 2024, 1–25.
20. Harpham, Q.K.; Hughes, A.; Moore, R. Introductory overview: The OpenMI 2.0 standard for integrating numerical models. Environ. Model. Softw. 2019, 122, 104549.
21. Peckham, S.D.; Hutton, E.W.H.; Norris, B. A component-based approach to integrated modeling in the geosciences: The design of CSDMS. Comput. Geosci. 2013, 53, 3–12.
22. Álvarez-Romero, J.G.; Wilkinson, S.N.; Pressey, R.L.; Ban, N.C.; Kool, J.; Brodie, J. Modeling catchment nutrients and sediment loads to inform regional management of water quality in coastal-marine ecosystems: A comparison of two approaches. J. Environ. Manag. 2014, 146, 164–178.
23. Ward, S.; Scott Borden, D.; Kabo-Bah, A.; Fatawu, A.N.; Mwinkom, X.F. Water resources data, models and decisions: International expert opinion on knowledge management for an uncertain but resilient future. J. Hydroinform. 2019, 21, 32–44.
24. Liu, Y.; Theller, L.O.; Pijanowski, B.C.; Engel, B.A. Optimal selection and placement of green infrastructure to reduce impacts of land use change and climate change on hydrology and water quality: An application to the Trail Creek Watershed, Indiana. Sci. Total Environ. 2016, 553, 149–163.
25. Shen, C. A transdisciplinary review of deep learning research and its relevance for water resources scientists. Water Resour. Res. 2018, 54, 8558–8593.
26. Fu, B.; Merritt, W.S.; Croke, B.F.; Weber, T.R.; Jakeman, A.J. A review of catchment-scale water quality and erosion models and a synthesis of future prospects. Environ. Model. Softw. 2019, 114, 75–97.
27. Tricco, A.C.; Lillie, E.; Zarin, W.; O’Brien, K.K.; Colquhoun, H.; Levac, D.; Moher, D.; Peters, M.D.J.; Horsley, T.; Weeks, L.; et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): Checklist and Explanation. Ann. Intern. Med. 2018, 169, 467–473.
28. Blöschl, G. Runoff Prediction in Ungauged Basins: Synthesis Across Processes, Places and Scales; Cambridge University Press: Cambridge, UK, 2013.
29. Harpham, Q.; Lhomme, J.; Parodi, A.; Fiori, E.; Jagers, B.; Galizia, A. Using OpenMI and a Model MAP to Integrate WaterML2 and NetCDF Data Sources into Flood Modeling of Genoa, Italy. JAWRA J. Am. Water Resour. Assoc. 2016, 52, 933–949.
30. Zhang, Y.; Bocquet, M.; Mallet, V.; Seigneur, C.; Baklanov, A. Real-time air quality forecasting, part II: State of the science, current research needs, and future prospects. Atmos. Environ. 2012, 60, 656–676.
31. Sargiotis, D. Key Principles of Data Governance: Building a Strong Foundation. In Data Governance: A Guide; Springer Nature: Cham, Switzerland, 2024; pp. 137–163.
32. Ma, Z.; Huang, Z.; Chen, J.; Cao, Z.; Gong, Y.-J. Surrogate Learning in Meta-Black-Box Optimization: A Preliminary Study. In Proceedings of the Genetic and Evolutionary Computation Conference, Málaga, Spain, 14–18 July 2025; Association for Computing Machinery: New York, NY, USA, 2025; pp. 1137–1145.
33. Gan, M.; Lu, T.; Yu, W.; Feng, J.; Chong, X. Capturing and visualizing the phase transition mediated thermal stress of thermal barrier coating materials via a cross-scale integrated computational approach. J. Adv. Ceram. 2024, 13, 413–428.
34. Lu, Y.; Chen, J.; Gao, G.; An, A.; Feng, J.; Liu, Z. Estimation of carbon stock in the reed wetland of Weishan county in China based on Sentinel satellite series. Carbon Res. 2025, 4, 29.
35. Zhao, J.; Zhao, J.; Hu, Y.; Huang, T.; Zhao, X.; Liu, X. Non-planar fracture propagation model for fluid-driven fracturing based on fluid-solid coupling. Eng. Fract. Mech. 2020, 235, 107159.
36. Mishra, A.K.; Coulibaly, P. Developments in hydrometric network design: A review. Rev. Geophys. 2009, 47, RG2001.
37. Pan, F.; Wu, X.; Zeng, Q.; Tang, R.; Wang, J.; Lin, X.; You, D.; Wen, J.; Xiao, Q. A coarse pixel-scale ground “truth” dataset based on global in situ site measurements to support validation and bias correction of satellite surface albedo products. Earth Syst. Sci. Data 2024, 16, 161–176.
38. Vörösmarty, C.J.; Fekete, B.M.; Meybeck, M.; Lammers, R.B. Global system of rivers: Its role in organizing continental land mass and defining land-to-ocean linkages. Glob. Biogeochem. Cycles 2000, 14, 599–621.
39. Zheng, D.; Zhu, W.; Han, Y.; Lv, A. A novel remote sensing-based calibration and validation method for distributed hydrological modelling in ungauged basins. J. Hydrol. 2025, 658, 133119.
40. Buytaert, W.; Zulkafli, Z.; Grainger, S.; Acosta, L.; Alemie, T.C.; Bastiaensen, J.; De Bièvre, B.; Bhusal, J.; Clark, J.; Dewulf, A. Citizen science in hydrology and water resources: Opportunities for knowledge generation, ecosystem service management, and sustainable development. Front. Earth Sci. 2014, 2, 26.
41. Kerr, Y.H.; Waldteufel, P.; Wigneron, J.-P.; Delwart, S.; Cabot, F.; Boutin, J.; Escorihuela, M.-J.; Font, J.; Reul, N.; Gruhier, C. The SMOS mission: New tool for monitoring key elements of the global water cycle. Proc. IEEE 2010, 98, 666–687.
42. Aasen, H.; Honkavaara, E.; Lucieer, A.; Zarco-Tejada, P.J. Quantitative Remote Sensing at Ultra-High Resolution with UAV Spectroscopy: A Review of Sensor Technology, Measurement Procedures, and Data Correction Workflows. Remote Sens. 2018, 10, 1091.
43. Clark, M.P.; Nijssen, B.; Lundquist, J.D.; Kavetski, D.; Rupp, D.E.; Woods, R.A.; Freer, J.E.; Gutmann, E.D.; Wood, A.W.; Brekke, L.D.; et al. A unified approach for process-based hydrologic modeling: 1. Modeling concept. Water Resour. Res. 2015, 51, 2498–2514.
44. Alawathugoda, C.; Hinge, G.; Elkollaly, M.; Hamouda, M.A. Impact of Utilizing High-Resolution PlanetScope Imagery on the Accuracy of LULC Mapping and Hydrological Modeling in an Arid Region. Water 2024, 16, 2356.
45. Karr, J.R. Assessment of biotic integrity using fish communities. Fisheries 1981, 6, 21–27.
46. Johnson, N.; Druckenmiller, M.L.; Danielsen, F.; Pulsifer, P.L. The Use of Digital Platforms for Community-Based Monitoring. BioScience 2021, 71, 452–466.
47. Deiner, K.; Bik, H.M.; Mächler, E.; Seymour, M.; Lacoursière-Roussel, A.; Altermatt, F.; Creer, S.; Bista, I.; Lodge, D.M.; De Vere, N. Environmental DNA metabarcoding: Transforming how we survey animal and plant communities. Mol. Ecol. 2017, 26, 5872–5895.
48. Sedeño-Díaz, J.E.; Ruiz-Picos, R.A.; López-López, E. Calibrating and Validating the Biomonitoring Working Party (BMWP) Index for the Bioassessment of Water Quality in Neotropical Streams. In Water Quality; Tutu, H., Ed.; IntechOpen: London, UK, 2017.
49. Arnold, J.G.; Srinivasan, R.; Muttiah, R.S.; Williams, J.R. Large area hydrologic modeling and assessment part I: Model development. JAWRA J. Am. Water Resour. Assoc. 1998, 34, 73–89.
50. Gómez-Tolosa, M.; Rivera-Velázquez, G.; Rioja-Paradela, T.M.; Mendoza-Cuenca, L.F.; Tejeda-Cruz, C.; López, S. The use of Odonata species for environmental assessment: A meta-analysis for the Neotropical region. Environ. Sci. Pollut. Res. 2021, 28, 1381–1396.
51. Shrestha, S.; Kazama, F. Assessment of surface water quality using multivariate statistical techniques: A case study of the Fuji river basin, Japan. Environ. Model. Softw. 2007, 22, 464–475.
52. Keck, F.; Hürlemann, S.; Locher, N.; Stamm, C.; Deiner, K.; Altermatt, F. A triad of kicknet sampling, eDNA metabarcoding, and predictive modeling to assess aquatic macroinvertebrate biodiversity. bioRxiv 2022, 474789.
53. Li, L.; Xu, H.; Zhang, Q.; Zhan, Z.; Liang, X.; Xing, J. Estimation methods of wetland carbon sink and factors influencing wetland carbon cycle: A review. Carbon Res. 2024, 3, 50.
54. Zhang, C.; Zhou, M.; Zhang, Y.; Li, D.; Hou, N.; Zhao, X. Novel strategies for enhancing energy metabolism and wastewater treatment in algae-bacteria symbiotic system through carbon dots-induced photogenerated electrons: The definitive role of accelerated electron transport. Chem. Eng. J. 2024, 500, 157016.
55. Beven, K. A manifesto for the equifinality thesis. J. Hydrol. 2006, 320, 18–36.
56. Little, R.J.; Rubin, D.B. Statistical Analysis with Missing Data; John Wiley & Sons: Hoboken, NJ, USA, 2019.
57. Schafer, J.L. Analysis of Incomplete Multivariate Data; CRC Press: Boca Raton, FL, USA, 1997.
58. Wang, L.; Qu, J.J. Satellite remote sensing applications for surface soil moisture monitoring: A review. Front. Earth Sci. China 2009, 3, 237–247.
59. Rapacciuolo, G.; Blois, J.L. Understanding ecological change across large spatial, temporal and taxonomic scales: Integrating data and methods in light of theory. Ecography 2019, 42, 1247–1266.
60. Castanedo, F. A Review of Data Fusion Techniques. Sci. World J. 2013, 2013, 704504.
61. Sharifnia, A.M.; Kpormegbey, D.E.; Thapa, D.K.; Cleary, M. A Primer of Data Cleaning in Quantitative Research: Handling Missing Values and Outliers. J. Adv. Nurs. Res. 2025, 1–6.
62. Pimentel, M.A.F.; Clifton, D.A.; Clifton, L.; Tarassenko, L. A review of novelty detection. Signal Process. 2014, 99, 215–249.
63. Gujar, P. Data Standardization and Interoperability. In Data Usability in the Enterprise: How Usability Leads to Optimal Digital Experiences; Apress: Berkeley, CA, USA, 2025; pp. 89–110.
64. Kush, R.D.; Warzel, D.; Kush, M.A.; Sherman, A.; Navarro, E.A.; Fitzmartin, R.; Pétavy, F.; Galvez, J.; Becnel, L.B.; Zhou, F.L.; et al. FAIR data sharing: The roles of common data elements and harmonization. J. Biomed. Inform. 2020, 107, 103421.
65. Ahmadi, H.; Das, A.; Pourtaheri, M.; Komaki, C.B.; Khairy, H. Redefining the watershed line and stream networks via digital resources and topographic map using GIS and remote sensing (case study: The Neka River’s watershed). Nat. Hazards 2014, 72, 711–722.
66. Pan, M.; Sahoo, A.K.; Troy, T.J.; Vinukollu, R.K.; Sheffield, J.; Wood, E.F. Multisource estimation of long-term terrestrial water budget for major global river basins. J. Clim. 2012, 25, 3191–3206.
67. Smakhtin, V.U. Low flow hydrology: A review. J. Hydrol. 2001, 240, 147–186.
68. Tucker, C.J. Red and photographic infrared linear combinations for monitoring vegetation. Remote Sens. Environ. 1979, 8, 127–150.
69. Huete, A.; Didan, K.; Miura, T.; Rodriguez, E.P.; Gao, X.; Ferreira, L.G. Overview of the radiometric and biophysical performance of the MODIS vegetation indices. Remote Sens. Environ. 2002, 83, 195–213.
70. Liu, Y.; Gupta, H.V. Uncertainty in hydrologic modeling: Toward an integrated data assimilation framework. Water Resour. Res. 2007, 43, W07401.
71. Pelletier, C.; Webb, G.I.; Petitjean, F. Temporal convolutional neural network for the classification of satellite image time series. Remote Sens. 2019, 11, 523.
72. Daw, A.; Karpatne, A.; Watkins, W.D.; Read, J.S.; Kumar, V. Physics-guided neural networks (PGNN): An application in lake temperature modeling. In Knowledge Guided Machine Learning; Chapman and Hall/CRC: Boca Raton, FL, USA, 2022; pp. 353–372.
73. Ajami, N.K.; Gupta, H.; Wagener, T.; Sorooshian, S. Calibration of a semi-distributed hydrologic model for streamflow estimation along a river system. J. Hydrol. 2004, 298, 112–135.
74. Botts, M.; Percivall, G.; Reed, C.; Davidson, J. OGC® sensor web enablement: Overview and high level architecture. In Proceedings of the International Conference on GeoSensor Networks, Boston, MA, USA, 1–3 October 2006; pp. 175–190.
75. Evensen, G. The ensemble Kalman filter: Theoretical formulation and practical implementation. Ocean Dyn. 2003, 53, 343–367.
76. Wang, G.; Wang, L.; Ma, F.; Yang, D.; You, Y. Earthworm and arbuscular mycorrhiza interactions: Strategies to motivate antioxidant responses and improve soil functionality. Environ. Pollut. 2021, 272, 115980.
77. Zhao, X.; Meng, X.; Li, Q.; Ho, S.-H. Nitrogen metabolic responses of non-rhizosphere and rhizosphere microbial communities in constructed wetlands under nanoplastics disturbance. J. Hazard. Mater. 2025, 484, 136777.
78. Wang, L.; Cao, Y.; Wei, J.; Bai, S. Structure-activity relationship of self-immobilized mycelial pellets and their functions in wastewater treatment. Bioresour. Technol. 2025, 430, 132558.
79. Yang, D.; Wang, L.; Bai, S. Enhancement of alfalfa growth resistance by arbuscular mycorrhiza and earthworm in molybdenum-contaminated soils: From the perspective of soil nutrient turnover. Environ. Res. 2025, 267, 120714.
80. Fang, K.; Shen, C.; Kifer, D.; Yang, X. Prolongation of SMAP to spatiotemporally seamless coverage of continental US using a deep learning neural network. Geophys. Res. Lett. 2017, 44, 11030–11039.
81. Karpatne, A.; Atluri, G.; Faghmous, J.H.; Steinbach, M.; Banerjee, A.; Ganguly, A.; Shekhar, S.; Samatova, N.; Kumar, V. Theory-guided data science: A new paradigm for scientific discovery from data. IEEE Trans. Knowl. Data Eng. 2017, 29, 2318–2331.
82. Li, L.; Chai, W.; Kang, J.; Liu, J.; Xing, J.; Li, G.; Zhan, Z. Utilization of graphite tailings and coal gangue in the preparation of foamed ceramics. Int. J. Appl. Ceram. Technol. 2025, 22, e15012.
83. Zhang, C.; Zhou, M.; Du, H.; Li, D.; Lv, D.; Hou, N. Influence of microbial agents-loaded biochar on bacterial community assembly and heavy metals morphology in sewage sludge compost: Insights from community stability and complexity. Bioresour. Technol. 2025, 419, 132070.
84. Beven, K.J. Rainfall-Runoff Modelling: The Primer; Wiley: Hoboken, NJ, USA, 2012.
85. Willard, J.; Jia, X.; Xu, S.; Steinbach, M.; Kumar, V. Integrating physics-based modeling with machine learning: A survey. arXiv 2020, arXiv:2003.04919.
86. Liang, X.; Lettenmaier, D.P.; Wood, E.F.; Burges, S.J. A simple hydrologically based model of land surface water and energy fluxes for general circulation models. J. Geophys. Res. Atmos. 1994, 99, 14415–14428.
87. Bergström, S. The HBV model–Its Structure and Applications; SMHI: Norrköping, Sweden, 1992.
88. Hrachowitz, M.; Savenije, H.H.G.; Blöschl, G.; McDonnell, J.J.; Sivapalan, M.; Pomeroy, J.W.; Arheimer, B.; Blume, T.; Clark, M.P.; Ehret, U.; et al. A decade of Predictions in Ungauged Basins (PUB)—A review. Hydrol. Sci. J. 2013, 58, 1198–1255.
89. Kratzert, F.; Klotz, D.; Shalev, G.; Klambauer, G.; Hochreiter, S.; Nearing, G. Towards learning universal, regional, and local hydrological behaviors via machine learning applied to large-sample datasets. Hydrol. Earth Syst. Sci. 2019, 23, 5089–5110.
90. Wool, T.A.; Ambrose, R.B.; Martin, J.L.; Comer, E.A.; Tech, T. Water Quality Analysis Simulation Program (WASP), Version 6.0; User’s Manual; CHI: Guelph, ON, Canada, 2006.
91. Liu, Z.; Gan, Y.; Luo, J.; Luo, X.; Ding, C.; Cui, Y. Current Status of Emerging Contaminant Models and Their Applications Concerning the Aquatic Environment: A Review. Water 2025, 17, 85.
92. Song, J.; Xu, R.; Li, D.; Jiang, S.; Cai, M.; Xiong, J. Source apportionment and ecological risk assessment of antibiotics in Dafeng River Basin using PMF and Monte-Carlo simulation. Environ. Geochem. Health 2024, 46, 479.
93. Moriasi, D.N.; Arnold, J.G.; Van Liew, M.W.; Bingner, R.L.; Harmel, R.D.; Veith, T.L. Model evaluation guidelines for systematic quantification of accuracy in watershed simulations. Trans. ASABE 2007, 50, 885–900.
94. Tague, C.L.; Band, L.E. RHESSys: Regional Hydro-Ecologic Simulation System—An Object-Oriented Approach to Spatially Distributed Modeling of Carbon, Water, and Nutrient Cycling. Earth Interact. 2004, 8, 1–42.
95. Graham, D.N.; Butts, M.B. Flexible, integrated watershed modelling with MIKE SHE. In Watershed Models; CRC Press: Boca Raton, FL, USA, 2005; pp. 245–272.
96. You, Y.; Ju, C.; Wang, L.; Wang, X.; Ma, F.; Wang, G.; Wang, Y. The mechanism of arbuscular mycorrhizal enhancing cadmium uptake in Phragmites australis depends on the phosphorus concentration. J. Hazard. Mater. 2022, 440, 129800.
97. Wigmosta, M.S.; Vail, L.W.; Lettenmaier, D.P. A distributed hydrology-vegetation model for complex terrain. Water Resour. Res. 1994, 30, 1665–1679.
98. Wang, Z.; Chen, S.; Yang, L.; Wang, Q.; Hou, N.; Zhang, J.; Tong, Y.; Li, X. Remediation strategies of biochar and microbial inoculum for PAHs-contaminated soil: Quorum sensing-mediated PAHs degradation and element cycling. J. Hazard. Mater. 2025, 490, 137854.
99. Yuan, Z.; Wang, Y.; Zhu, L.; Zhang, C.; Sun, Y. Machine-learning-aided biochar production from aquatic biomass. Carbon Res. 2024, 3, 77.
100. Bieger, K.; Arnold, J.G.; Rathjens, H.; White, M.J.; Bosch, D.D.; Allen, P.M.; Volk, M.; Srinivasan, R. Introduction to SWAT+, a completely restructured version of the soil and water assessment tool. JAWRA J. Am. Water Resour. Assoc. 2017, 53, 115–130.
101. Gijsbers, P.; Gregersen, J. OpenMI: A glue for model integration. In Proceedings of the MODSIM 2005 International Congress on Modelling and Simulation, Melbourne, Australia, 12–15 December 2005; pp. 648–654.
102. David, O.; Ascough II, J.C.; Lloyd, W.; Green, T.R.; Rojas, K.; Leavesley, G.H.; Ahuja, L.R. A software engineering perspective on environmental modeling framework design: The Object Modeling System. Environ. Model. Softw. 2013, 39, 201–213.
103. Tarboton, D.G.; Idaszak, R.; Horsburgh, J.S.; Heard, J.; Ames, D.; Goodall, J.L.; Band, L.; Merwade, V.; Couch, A.; Arrigo, J.; et al. HydroShare: Advancing collaboration through hydrologic data and model sharing. In Proceedings of the International Congress on Environmental Modelling and Software, San Diego, CA, USA, 15–19 June 2014.
104. Voinov, A.; Shugart, H.H. ‘Integronsters’, integral and integrated modeling. Environ. Model. Softw. 2013, 39, 149–158.
105. Argent, R.M. An overview of model integration for environmental applications—Components, frameworks and semantics. Environ. Model. Softw. 2004, 19, 219–234.
106. Gober, P.; Wheater, H.S. Debates—Perspectives on socio-hydrology: Modeling flood risk as a public policy problem. Water Resour. Res. 2015, 51, 4782–4788.
107. Belete, G.F.; Voinov, A.; Laniak, G.F. An overview of the model integration process: From pre-integration assessment to testing. Environ. Model. Softw. 2017, 87, 49–63.
108. Hill, M.C.; Tiedeman, C.R. Effective Groundwater Model Calibration: With Analysis of Data, Sensitivities, Predictions, and Uncertainty; John Wiley & Sons: Hoboken, NJ, USA, 2007.
109. Moore, R.V.; Tindall, C.I. An overview of the open modelling interface and environment (the OpenMI). Environ. Sci. Policy 2005, 8, 279–286.
110. Gassman, P.W.; Reyes, M.R.; Green, C.H.; Arnold, J.G. The soil and water assessment tool: Historical development, applications, and future research directions. Trans. ASABE 2007, 50, 1211–1250.
111. Politano, M.; Antonio, A.; Weber, L. A process-based hydrological model for continuous multi-year simulations of large-scale watersheds. Int. J. River Basin Manag. 2025, 23, 15–28.
112. Voinov, A.; Cerco, C. Model integration and the role of data. Environ. Model. Softw. 2010, 25, 965–969.
113. Singh, V.P. Hydrologic modeling: Progress and future directions. Geosci. Lett. 2018, 5, 15.
114. Ascough II, J.; Maier, H.; Ravalico, J.; Strudley, M. Future research challenges for incorporation of uncertainty in environmental and ecological decision-making. Ecol. Model. 2008, 219, 383–399.
115. Pal, D.; Marttila, H.; Ala-Aho, P.; Lotsari, E.; Ronkanen, A.-K.; Gonzales-Inca, C.; Croghan, D.; Korppoo, M.; Kämäri, M.; Rooijen, E.v.; et al. Blueprint conceptualization for a river basin’s digital twin. Hydrol. Res. 2025, 56, 197–212.
116. Yang, Y.; Xie, C.; Fan, Z.; Xu, Z.; Melville, B.W.; Liu, G.; Hong, L. Digital twinning of river basins towards full-scale, sustainable and equitable water management and disaster mitigation. npj Nat. Hazards 2024, 1, 43–117.
117. Refsgaard, J.C.; van der Sluijs, J.P.; Højberg, A.L.; Vanrolleghem, P.A. Uncertainty in the environmental modelling process—A framework and guidance. Environ. Model. Softw. 2007, 22, 1543–1556.
118. Voinov, A.; Bousquet, F. Modelling with stakeholders. Environ. Model. Softw. 2010, 25, 1268–1281.
119. Shu, L.; Li, X.; Chang, Y.; Meng, X.; Chen, H.; Qi, Y.; Wang, H.; Li, Z.; Lyu, S. Advancing understanding of lake–watershed hydrology: A fully coupled numerical model illustrated by Qinghai Lake. Hydrol. Earth Syst. Sci. 2024, 28, 1477–1491.
120. Moriasi, D.N.; Gitau, M.W.; Pai, N.; Daggupati, P. Hydrologic and Water Quality Models: Performance Measures and Evaluation Criteria. Trans. ASABE 2015, 58, 1763–1785.
121. Vrugt, J.A.; ter Braak, C.J.; Diks, C.G.; Robinson, B.A.; Hyman, J.M.; Higdon, D. Accelerating Markov chain Monte Carlo simulation by differential evolution with self-adaptive randomized subspace sampling. Int. J. Nonlinear Sci. Numer. Simul. 2009, 10, 273–290.
122. Wilkinson, S.R.; Aloqalaa, M.; Belhajjame, K.; Crusoe, M.R.; de Paula Kinoshita, B.; Gadelha, L.; Garijo, D.; Gustafsson, O.J.R.; Juty, N.; Kanwal, S.; et al. Applying the FAIR Principles to computational workflows. Sci. Data 2025, 12, 328.
123. Haddeland, I.; Clark, D.B.; Franssen, W.; Ludwig, F.; Voß, F.; Arnell, N.W.; Bertrand, N.; Best, M.; Folwell, S.; Gerten, D. Multimodel estimate of the global terrestrial water balance: Setup and first results. J. Hydrometeorol. 2011, 12, 869–884.
124. Gupta, H.V.; Clark, M.P.; Vrugt, J.A.; Abramowitz, G.; Ye, M. Towards a comprehensive assessment of model structural adequacy. Water Resour. Res. 2012, 48, W08301.
125. Blöschl, G.; Bierkens, M.F.; Chambel, A.; Cudennec, C.; Destouni, G.; Fiori, A.; Kirchner, J.W.; McDonnell, J.J.; Savenije, H.H.; Sivapalan, M. Twenty-three unsolved problems in hydrology (UPH)–a community perspective. Hydrol. Sci. J. 2019, 64, 1141–1158.
126. Fatichi, S.; Pappas, C.; Ivanov, V.Y. Modeling plant–water interactions: An ecohydrological overview from the cell to the global scale. Wiley Interdiscip. Rev. Water 2016, 3, 327–368.
127. Clark, M.P.; Bierkens, M.F.; Samaniego, L.; Woods, R.A.; Uijlenhoet, R.; Bennett, K.E.; Pauwels, V.; Cai, X.; Wood, A.W.; Peters-Lidard, C.D. The evolution of process-based hydrologic models: Historical challenges and the collective quest for physical realism. Hydrol. Earth Syst. Sci. 2017, 21, 3427–3440.
128. Beven, K. Facets of uncertainty: Epistemic uncertainty, non-stationarity, likelihood, hypothesis testing, and communication. Hydrol. Sci. J. 2016, 61, 1652–1665.
129. Voinov, A.; Seppelt, R.; Reis, S.; Nabel, J.E.; Shokravi, S. Values in socio-environmental modelling: Persuasion for action or excuse for inaction. Environ. Model. Softw. 2014, 53, 207–212.
130. Brunet, G.; Shapiro, M.; Hoskins, B.; Moncrieff, M.; Dole, R.; Kiladis, G.N.; Kirtman, B.; Lorenc, A.; Mills, B.; Morss, R. Collaboration of the weather and climate communities to advance subseasonal-to-seasonal prediction. Bull. Am. Meteorol. Soc. 2010, 91, 1397–1406.
131. Laniak, G.F.; Olchin, G.; Goodall, J.; Voinov, A.; Hill, M.; Glynn, P.; Whelan, G.; Geller, G.; Quinn, N.; Blind, M. Integrated environmental modeling: A vision and roadmap for the future. Environ. Model. Softw. 2013, 39, 3–23.
132. Bierkens, M.F.P.; Bell, V.A.; Burek, P.; Chaney, N.; Condon, L.E.; David, C.H.; de Roo, A.; Döll, P.; Drost, N.; Famiglietti, J.S.; et al. Hyper-resolution global hydrological modelling: What is next? Hydrol. Process. 2015, 29, 310–320.
133. Yang, J.; Zhang, T.; Ma, S.; Shang, J.; Li, L.; Ning, Y.; Zhao, X. Enhancing microplastic removal and nitrogen mitigation in constructed wetlands: An earthworm-centric perspective. J. Hazard. Mater. 2025, 489, 137540.
134. Zeng, Q.; Gao, Y.; Guan, K.; Liu, J.; Feng, Z. Machine learning and a computational fluid dynamic approach to estimate phase composition of chemical vapor deposition boron carbide. J. Adv. Ceram. 2021, 10, 537–550.
135. Thyer, M.; Renard, B.; Kavetski, D.; Kuczera, G.; Franks, S.W.; Srikanthan, S. Critical evaluation of parameter consistency and predictive uncertainty in hydrological modeling: A case study using Bayesian total error analysis. Water Resour. Res. 2009, 45, W00B14.
136. Wilkinson, M.D.; Dumontier, M.; Aalbersberg, I.J.; Appleton, G.; Axton, M.; Baak, A.; Blomberg, N.; Boiten, J.-W.; da Silva Santos, L.B.; Bourne, P.E. The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 2016, 3, 160018.
137. Stockhause, M.; Huard, D.; Al Khourdajie, A.; Gutiérrez, J.M.; Kawamiya, M.; Klutse, N.A.B.; Krey, V.; Milward, D.; Okem, A.E.; Pirani, A.; et al. Implementing FAIR data principles in the IPCC seventh assessment cycle: Lessons learned and future prospects. PLoS Clim. 2024, 3, e0000533.
138. Deng, Y.; Wang, D.; Cheng, J.; Zhao, Y.; Li, Z.; Zang, C.; Li, J.; Jia, L. A new popular transition metal-based catalyst: SmMn₂O₅ mullite-type oxide. Chin. Chem. Lett. 2024, 35, 109141.
139. Lu, X.; Shen, L.; Chen, C.; Yu, W.; Wang, B.; Kong, N.; Zeng, Q.; Chen, S.; Huang, X.; Wang, Y.; et al. Advance of self-cleaning separation membranes for oil-containing wastewater treatment. Environ. Funct. Mater. 2024, 3, 72–93.
140. Shao, L.; Zhang, Y.; Geng, S.; Yang, S.; Wang, T.; Song, X.; Liu, J.; Zheng, J. Physicochemical characteristics and environmental impact of coal mine solid waste. J. Min. Sci. Technol. 2024, 9, 653–667.

Figure 1. Integrated framework for watershed eco-assessment: review roadmap (Sections 2–6).

Figure 2. Water balance model of the Hala Lake basin and its application process based on remote sensing data [39]. Copyright 2025, Elsevier.

Figure 3. AI-augmented modeling (machine learning, physics-informed).

Table 1. Overview of eco-assessment data sources (Sections 2.1–2.4): main variables, spatial coverage, temporal resolution, strengths, and limitations, including a row for model outputs and reanalysis products (Section 2.4).

Data Source	Main Variables Measured	Spatial Coverage	Temporal Resolution	Strengths	Limitations
Ground Monitoring	Streamflow, Precipitation, Groundwater Level, Water Quality (Nutrients, DO, Temperature, Turbidity)	Local to regional	Hourly to daily	High measurement accuracy; long historical records; essential for trend analysis	Sparse spatial distribution; Data gaps due to maintenance failures; Variable data quality [37,38,40,58]
Remote Sensing	Land Cover, Vegetation Indices (NDVI), Surface Water Extent, Soil Moisture, Snowpack, Water Quality Proxies (Chlorophyll-a, Turbidity)	Regional to global	Weekly to monthly	Wide spatial coverage; regular repeat cycles; ability to monitor inaccessible areas	Cloud contamination (optical sensors); Need for calibration/validation with ground data [41,45]
Biological Monitoring	Macroinvertebrate Assemblages, Fish Communities, Algal Biomass, Biofilm Diversity	Site-specific	Seasonal to annual	Sensitive indicators of cumulative watershed health; capture biological responses	Labor-intensive; Sampling biases; Requires taxonomic expertise; Limited temporal coverage [47,49,51]
Model Outputs and Reanalysis	Runoff, Soil Moisture, Groundwater Recharge, Nutrient Transport, Meteorological Forcings (Precipitation, Temperature)	Gridded (regional/global)	Hourly to monthly	Provide spatial and temporal continuity; Enable scenario analysis and future projections	Model structural uncertainties; Calibration and validation dependency; Potential error propagation [12,55,56,57]

Table 2. Comparative evaluation of coupling strategies (loose, tight, embedded) for watershed eco-assessment by implementation complexity, flexibility and modularity, computational efficiency, feedback representation, transparency and debuggability, and application context.

Criteria	Loose Coupling	Tight Coupling	Embedded Coupling
Implementation Complexity	Low	Medium	High
Flexibility and Modularity	High	Moderate	Low
Computational Efficiency	Low to Moderate	High	Very High
Feedback Representation	Weak/Sequential	Strong/Dynamic	Strong/Fully Integrated
Transparency and Debuggability	High	Moderate	Low
Application Context	Exploratory studies, Legacy model linkage	Predictive management, Real-time forecasting	High-fidelity research, Integrated process studies
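
The difference between the sequential feedback of loose coupling and the dynamic feedback of tight coupling in Table 2 can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the reviewed studies: the toy linear-reservoir hydrology model, the vegetation growth and uptake rules, all coefficients, and the names `Reservoir`, `run_loose`, and `run_tight`. In the loose variant the hydrology model runs to completion and the ecological model reads its stored output. In the tight variant the two models exchange state at every time step, so vegetation water uptake feeds back into storage and discharge.

```python
from dataclasses import dataclass


@dataclass
class Reservoir:
    """Toy linear-reservoir hydrology model (hypothetical, illustration only)."""
    storage: float = 10.0  # water stored in the catchment, mm
    k: float = 0.2         # fraction of storage released per time step

    def step(self, rain: float, uptake: float = 0.0) -> float:
        """Advance one time step and return discharge (mm)."""
        self.storage = max(self.storage + rain - uptake, 0.0)
        flow = self.k * self.storage
        self.storage -= flow
        return flow


def vegetation_uptake(biomass: float, available: float) -> float:
    """Water removed by vegetation, capped by the water available (assumed rule)."""
    return min(0.05 * biomass, available)


def grow(biomass: float, storage: float) -> float:
    """Biomass update: growth driven by storage minus mortality (assumed rule)."""
    return biomass + 0.01 * storage - 0.02 * biomass


def run_loose(rain_series):
    """Loose coupling: hydrology finishes first, ecology reads its output."""
    hydro = Reservoir()
    flows, storages = [], []
    for rain in rain_series:
        flows.append(hydro.step(rain))   # no feedback from vegetation
        storages.append(hydro.storage)
    biomass, biomasses = 1.0, []
    for storage in storages:             # one-way, sequential hand-off
        biomass = grow(biomass, storage)
        biomasses.append(biomass)
    return flows, biomasses


def run_tight(rain_series):
    """Tight coupling: both models exchange state at every time step."""
    hydro = Reservoir()
    biomass = 1.0
    flows, biomasses = [], []
    for rain in rain_series:
        uptake = vegetation_uptake(biomass, hydro.storage + rain)
        flows.append(hydro.step(rain, uptake))  # vegetation alters hydrology
        biomass = grow(biomass, hydro.storage)  # hydrology alters vegetation
        biomasses.append(biomass)
    return flows, biomasses


rain = [5.0, 0.0, 0.0, 3.0, 0.0]
loose_flows, loose_biomass = run_loose(rain)
tight_flows, tight_biomass = run_tight(rain)
for step, (lf, tf) in enumerate(zip(loose_flows, tight_flows)):
    print(f"step {step}: loose flow = {lf:.3f} mm, tight flow = {tf:.3f} mm")
```

With these assumed rules, discharge in the tightly coupled run is lower at every step because vegetation uptake is removed from storage before outflow is computed. The loosely coupled run cannot represent that effect, which is the trade-off against its simpler implementation.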

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
