Satellite remote sensing has become essential for water quality assessment across inland and coastal environments, with rapid improvements in recent years. Significant advances have been made in detecting optically active parameters (such as chlorophyll-a, suspended matter, and turbidity), showing consistently strong performance across
[...] Read more.
Satellite remote sensing has become essential for water quality assessment across inland and coastal environments, with rapid improvements in recent years. Significant advances have been made in detecting optically active parameters (such as chlorophyll-a, suspended matter, and turbidity), showing consistently strong performance across multiple studies. Specifically, the median validation performance (R
2) derived from the quantitative synthesis indicates R
2 = 0.82 for chlorophyll-a (interquartile range—IQR: 0.75–0.90), R
2 = 0.80 for total suspended matter (IQR: 0.78–0.85), and R
2 = 0.88 for turbidity (IQR: 0.85–0.90). Conversely, the retrieval of optically inactive parameters (such as nutrients like total phosphorus and total nitrogen) remains more context dependent. It exhibits moderate, more variable results, with median R
2 = 0.68 (IQR: 0.64–0.74) for total phosphorus and R
2 = 0.75 (IQR: 0.70–0.80) for total nitrogen. These findings clearly illustrate the varying success of retrievals of optically active and inactive parameters and underscore the inherent difficulties of indirect estimation methods. However, high reported accuracy has yet to translate into transferable, uncertainty-informed, and operational monitoring systems. This gap stems from structural issues in validation design, physics integration, uncertainty management, and multi-sensor compatibility rather than data limitations alone. We present a PRISMA-guided, distribution-aware quantitative synthesis of 152 peer-reviewed studies (1980–2025), based on a systematic search protocol, to evaluate satellite-based retrievals of both optically active and inactive parameters. Instead of simply averaging performance, we analyse the empirical distributions of validation metrics, considering the validation protocol, sensor type, parameter category, degree of physics integration, and uncertainty quantification. The synthesis demonstrates that validation strategy often influences reported results more than the algorithm class itself, with accuracy inflated under non-independent cross-validation methods and notable variability between studies concealed by mean-based reports. Across four decades, four persistent structural challenges remain: limited transferability across sites and sensors beyond calibration areas; weak or implicit physical integration in many data-driven models; lack of or inconsistency in uncertainty quantification; and fragmented multi-sensor harmonisation that restricts operational scalability. To address these issues, we introduce two evidence-based coding frameworks: a physics-integration taxonomy (P0–P4) and an uncertainty-quantification hierarchy (U0–U4). Applying these frameworks shows that most studies remain focused on low-to-moderate levels of physics integration and primarily consider uncertainty at the prediction stage, with limited attention to upstream sources throughout the observation and inference process. Building on this structured synthesis, we propose a transferable, physics-informed, and uncertainty-aware conceptual framework that links model architecture, validation robustness, and probabilistic uncertainty to well-founded design principles. By shifting satellite water quality modelling from isolated algorithm demonstrations towards integrated, evidence-based system design, this study promotes scalable, decision-grade environmental monitoring amid the accelerating impacts of climate change.
Full article