Review

A Review of Water Quality Forecasting Models for Freshwater Lentic Ecosystems

by Jovheiry Christopher García-Guerrero *, José M. Álvarez-Alvarado, Roberto Valentín Carrillo-Serrano, Viviana Palos-Barba and Juvenal Rodríguez-Reséndiz *
Facultad de Ingeniería, Universidad Autónoma de Querétaro, Querétaro 76010, Mexico
* Authors to whom correspondence should be addressed.
Water 2025, 17(15), 2312; https://doi.org/10.3390/w17152312
Submission received: 19 June 2025 / Revised: 24 July 2025 / Accepted: 1 August 2025 / Published: 4 August 2025

Abstract

Water quality (WQ) monitoring has become critical for Mexico and the world due to the water pollution and scarcity problems of recent years. In this article, a systematic review was conducted considering only forecasting models focused on lentic freshwater bodies (to specialize the analysis of variables, problems, considerations, etc.) from 2019 to 2025 (to ensure the inclusion of the most relevant and recent studies). The review analyzes 52 articles with respect to the monitoring place, predictors, forecasted variables, configuration of each forecasting model, results with or without multiple forecast horizons, monitoring conditions, forecast horizon, data availability, and model replicability. Our review shows that the main models used to predict WQ are based on machine learning (where random forests (RFs) are the most used), AI (where Artificial Neural Networks (ANNs) are the most used and LSTM-based architectures the most implemented), and statistical methods (where multiple linear regression (MLR) is the most used). The principal forecasted WQ variables are Chl-α, DO, and TP, while the most used predictors are TP, temperature, and Chl-α. Furthermore, only 10 articles have made their databases available, and nine articles share the configuration of their models. Future research should investigate the real impact of variations in data (quantity and inputs) on forecast values across multiple forecast horizons.

1. Introduction

Nowadays, water quality (WQ) is one of the most essential topics. Many current water problems involve scarcity and contamination caused by human activity. One of the most well-known cases of the 20th century occurred on the River Thames (London), where industrial waste and household sewage were discharged, polluting the river and depleting its dissolved oxygen, among other consequences [1]. Around the world, recent concerns about water pollution and scarcity have grown due to human activities and global climate change.
In the case of Mexico, the water sustainability policy is divided into three stages [2]:
  • Beginning of the 20th century: the water supply was increased. Many storage dams, irrigation districts, aqueducts, and water supply systems were constructed.
  • From 1980 to 1990: CONAGUA (Comisión Nacional del Agua, the National Water Commission) was established as the institution responsible for managing national waters.
  • Dawn of the 21st century: The principal emphasis shifted to water sustainability, wastewater treatment, water reuse, and the management of national waters.
The first stage focused on increasing the water supply, but it was not until the third stage that water protection was considered. As mentioned previously, human activities have polluted freshwater resources, and similar cases have occurred in Mexico: in recent years, some dams and lakes have experienced poor water quality, resulting in the death of local fauna and undesired consequences such as economic losses. To prevent similar situations, various programs were developed as part of the third stage of the water sustainability policy. Examples of programs addressing current WQ issues are PRONACES (Programas Nacionales Estratégicos, National Strategic Programs) and the 2030 Agenda. The PRONACES program includes “water” as one of its ten social themes, and the 2030 Agenda comprises 17 Sustainable Development Goals (SDGs); the sixth SDG concerns clean water and sanitation (a progress report on the indicators for SDG 6 is found in [2]). Both programs require monitoring solutions to quantify WQ.
The Water Quality Monitoring System (WQMS) enables continuous monitoring of various variables. Only some WQ variables are needed for monitoring, depending on the activity that requires water. According to CONAGUA, the main activities that use water are agriculture, aquaculture, livestock, thermoelectric plants, commercial, domestic, industrial, agro-industrial, service, multiple-use, and others.
The data obtained can be used to develop a forecasting model, which can prevent harmful events and their consequences. To generate quality data, some questions to consider are
  • At what time of day should the variable value be measured?
  • How often should monitoring be performed?
  • What amount of data is required to train the forecasting model?
Implementing a WQMS should generate a time-series database. According to [3,4], a time series is a set of observations x_t, each obtained at a specific time t (the monitoring frequency must remain constant). The best predictions achieved today come from computer-trained models that use time-series data. The most popular models in recent years have utilized Artificial Neural Networks (ANNs).
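As a minimal illustration (hypothetical values, not from any reviewed study), a time series can be arranged into lagged input windows and next-step targets, the format most computer-trained models consume:

```python
import numpy as np

# Illustrative only: a time series {x_t} sampled at a fixed frequency.
# Here we simulate daily dissolved-oxygen-like readings (hypothetical values).
rng = np.random.default_rng(seed=0)
t = np.arange(30)                      # 30 consecutive days (t = 0, 1, ..., 29)
x = 8.0 + 0.5 * np.sin(2 * np.pi * t / 7) + rng.normal(0, 0.1, size=t.size)

# A one-step forecasting setup: use the last k observations as inputs.
k = 3
inputs = np.stack([x[i:i + k] for i in range(len(x) - k)])   # sliding windows
targets = x[k:]                                              # next-day values
print(inputs.shape, targets.shape)     # (27, 3) (27,)
```

A forecasting model is then trained to map each window to the value that follows it.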
Currently, many prediction models have been developed for various water quality variables. However, models typically differ from one another in terms of
  • The model/method used to create the prediction model;
  • The configuration of the model parameters;
  • The number of auxiliary variables for predicting the variable of interest;
  • The characteristics of the database used;
  • The amount of data (used to train and validate the model).
Furthermore, it is essential to consider that not all surface freshwater bodies share the same characteristics. Physical aspects that vary include:
  • The surface area;
  • The depth (average, maximum, Secchi);
  • The amount of incoming and outgoing water;
  • The stratification;
  • Others.
There are already published articles that review forecasting models, but these are focused on other topics: Ref. [5] considers only forecasting models using evolving fuzzy systems (not for WQ), whereas Ref. [6] analyzes models specifically for industrial manufacturing systems. Ref. [7] focused only on short-term forecasting models for solar PV power generation, Ref. [8] considers techniques and technologies for energy forecasting, Ref. [9] analyzes hybrid forecasting methods, and Ref. [10] reviews short-term load forecasting models. Some reviews that focus on WQ forecasting typically aim for particular types of models: Refs. [11,12,13] concentrate on machine learning methods, while Refs. [14,15] analyze only ANN (Artificial Neural Network) models.
Other examples are [11] for the WQI (Water Quality Index) and [12] for Chl-α, salinity, dissolved oxygen, the WQI, and multiple variables forecasted simultaneously. On the other hand, Ref. [14] did not focus on specific WQ variables or water types.
Additionally, some reviews focused on specific types of water, such as rivers [11,16] and water resources (including rivers, lakes, dams, catchments, and basins) [13].
In the case of the present review, the primary focus is solely on WQ forecasting/prediction for lentic freshwater bodies (lakes, dams, and similar). This focus allows analyzing which WQ variables are most commonly forecasted and which are used as predictors, as well as the common problems encountered and the considerations taken into account during monitoring or forecasting, all in the specific context of lentic freshwater bodies. No forecasting model or method is excluded. Some critical aspects discussed to improve the forecast process are
  • Input (predictors) and output (forecasted variables) for each model;
  • Model results, with or without multiple forecast horizons;
  • Comparison or combining with other models, methods, or techniques;
  • The monitoring conditions used;
  • Which articles provide enough information to replicate their models (including the availability of their datasets).
This article is organized as follows: Section 2 presents essential topics related to the research, Section 3 introduces the methodological framework used for the systematic literature review, Section 4 presents all analyses made in the review, Section 5 discusses the findings, and Section 6 presents the conclusions and suggestions for future research.

2. Theoretical Background

In this section, the topics required to understand the review are introduced.

2.1. Artificial Intelligence

According to [17], AI (Artificial Intelligence) is a computer science domain that aims to develop computer systems considered intelligent (i.e., systems that perceive, analyze, and respond to inputs). AI has become one of the most popular topics due to the many applications and advances developed in recent years. It is usually presented in terms of three nested levels:
  • Artificial Intelligence (AI): The ability to perform tasks that require human intelligence, like visual or voice recognition, NLP (Natural Language Processing), etc.
  • Machine Learning (ML): AI that can learn from experience without human intervention. ML is divided into supervised learning, unsupervised learning, and reinforcement learning.
  • Deep Learning (DL): AI using algorithms inspired by the human brain (neural networks).

2.1.1. Machine Learning

ML is a subdomain of AI that enables the computer to learn from past data or experience. Many algorithms have been developed to provide results based on the received data. These algorithms are classified into three domains:
  • Supervised learning: In this classification, a human provides labeled data to the algorithm, which learns to predict the output from a given input.
  • Unsupervised learning: With this method, the algorithm does not require labeled data and instead searches for patterns in the data.
  • Reinforcement learning: The methods in this classification use rewards when the algorithm gives the correct output given a specific input, but if it fails, the algorithm receives penalties. These self-training methods repeat this process many times to achieve the desired performance.
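The supervised case can be sketched in a few lines: labeled pairs are provided, a mapping is fitted (training), and the mapping is applied to unseen input (inference). The values below are hypothetical, chosen only for illustration; real WQ models use far richer data.

```python
import numpy as np

# Supervised learning in miniature: labeled pairs (X, y) are provided,
# and the algorithm learns a mapping from input to output.
# Hypothetical example: predict chlorophyll from temperature and phosphorus.
X = np.array([[20.0, 0.02], [22.0, 0.03], [24.0, 0.05], [26.0, 0.08]])  # inputs
y = np.array([5.0, 7.0, 11.0, 17.0])                                    # labels

# Fit y ≈ X @ w + b by ordinary least squares (the "training" step).
A = np.hstack([X, np.ones((X.shape[0], 1))])      # append a bias column
w, *_ = np.linalg.lstsq(A, y, rcond=None)

# Predict for an unseen input (the "inference" step).
x_new = np.array([23.0, 0.04, 1.0])               # [temperature, TP, bias]
print(round(float(x_new @ w), 2))                 # 9.0
```

Unsupervised and reinforcement learning differ only in what is provided: no labels in the first case, and a reward signal instead of labels in the second.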

2.1.2. Deep Learning

DL represents a subdomain of ML, since its algorithms also learn from past experience. These algorithms mimic the function of brain neurons and are called Artificial Neural Networks (ANNs). Many ANNs have been developed using many types of architecture, and they are usually classified into three categories:
  • Feed-forward neural networks: In these ANNs, the data move from the input layer, through the hidden layers, towards the output layer.
  • Recurrent Neural Networks (RNNs): RNNs are ANNs in which outputs from earlier steps are fed back as inputs, giving the network a form of memory over sequences.
  • Convolutional Neural Networks (CNNs): These ANNs have convolutional and pooling layers, which help with feature extraction and pattern recognition. CNNs are especially suited to image data.
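A single forward pass through a tiny feed-forward network illustrates the data flow just described (the weights here are arbitrary illustrative values, not a trained WQ model):

```python
import numpy as np

# A toy feed-forward pass: data flow from the input layer, through one
# hidden layer, towards the output layer, as described above.
def relu(z):
    return np.maximum(0.0, z)

x = np.array([0.5, -0.2, 0.8])          # 3 input features (illustrative)
W1 = np.array([[0.1, 0.4, -0.3],        # hidden layer: 2 neurons
               [0.2, -0.1, 0.5]])
b1 = np.array([0.0, 0.1])
W2 = np.array([[0.7, -0.6]])            # output layer: 1 neuron
b2 = np.array([0.05])

h = relu(W1 @ x + b1)                   # hidden activations
y = W2 @ h + b2                         # linear output (e.g., a forecast value)
print(y.shape)                          # (1,)
```

Training such a network means adjusting W1, b1, W2, and b2 so the outputs match labeled targets; recurrent and convolutional architectures change how the layers are wired, not this basic pattern.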

2.2. ARMA Models

These models combine AR (Autoregressive) and MA (Moving Average) models and are specialized for time-series data (data sampled at a constant frequency). As with ANNs, more complex methods are derived from the basic ARMA (Autoregressive Moving Average) formulation.
  • AR: These models include only the autoregressive part and are written AR(p), where p is an integer representing the order of the model.
  • MA: These models include only the moving-average part and are written MA(q), where q is an integer representing the order of the model.
  • ARMA: These models combine AR and MA models and are written ARMA(p, q).
  • ARIMA (Autoregressive Integrated Moving Average): These models add the “I” (Integrated) part and are written ARIMA(p, d, q), where d is an integer representing the order of differencing.
  • SARIMA (Seasonal Autoregressive Integrated Moving Average): It adds an “S” (seasonal) part and is written SARIMA(p, d, q)(P, D, Q, S), where P, D, and Q are integers representing the order of the seasonal part and S is an integer representing the seasonal period.
  • SARIMAX (Seasonal Autoregressive Integrated Moving Average with Exogenous Regressors): It is a SARIMA model with exogenous regressors, which includes external variables that might influence the time series.
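As a minimal sketch of the AR part (the simplest member of this family), an AR(1) process can be simulated and its coefficient recovered by least squares. This is illustrative only, under assumed parameter values, and not how the reviewed articles fit their models:

```python
import numpy as np

# Illustrative sketch: simulate an AR(1) process x_t = phi * x_{t-1} + e_t
# and recover phi by least squares.
rng = np.random.default_rng(seed=42)
phi_true = 0.8                          # assumed coefficient for the demo
n = 500
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi_true * x[t - 1] + rng.normal(0, 1)

# Regress x_t on x_{t-1}: phi_hat = sum(x_t * x_{t-1}) / sum(x_{t-1}^2)
phi_hat = (x[1:] @ x[:-1]) / (x[:-1] @ x[:-1])
print(round(phi_hat, 2))                # close to 0.8
```

MA, ARIMA, SARIMA, and SARIMAX extend this idea with moving-average terms, differencing, seasonal terms, and exogenous regressors, respectively; in practice they are fitted with dedicated libraries rather than by hand.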

2.3. Water Quality (WQ)

WQ represents the overall health of a water body and is assessed through various parameters, which vary according to the activity involved.
  • Physical parameters: turbidity, TDS (total dissolved solids), etc.
  • Chemical parameters: pH, DO (dissolved oxygen), salinity, heavy metals, etc.
  • Microbiological parameters: E. coli, natural parasites, pathogenic bacteria, etc.

3. Materials and Methods

This section outlines the methodological framework used to conduct the systematic review. The databases used are Scopus and Web of Science (WoS); both are among the most well-known databases, featuring vast collections of indexed journals and excellent tools for conducting literature searches. Figure 1 shows the number of documents published annually in Scopus.
This six-year timeframe was chosen to ensure the inclusion of the most relevant and new studies in forecasting models. This graph illustrates the increasing interest in developing more accurate water quality forecasting models.
The systematic review employs the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) model. The review followed the methodological framework established by [18]. The following inclusion criteria were applied:
  • The article contains some of the following keywords: freshwater, forecasting or prediction, and water quality.
  • The title is related to the “Forecasting or Prediction” of any WQ variable.
  • The description mentions developing a forecasting model for any WQ variable of interest.
This strategic approach ensured the analysis included relevant and high-quality studies.

Research Queries

The search queries used in Scopus are defined as follows:
  • TITLE-ABS-KEY (water AND quality AND forecasting AND freshwater) AND (LIMIT-TO (EXACTKEYWORD, “water quality”) OR LIMIT-TO (EXACTKEYWORD, “forecasting”)).
  • TITLE-ABS-KEY (water AND quality AND prediction AND freshwater) AND (LIMIT-TO (EXACTKEYWORD, “water quality”) OR LIMIT-TO (EXACTKEYWORD, “prediction”) ).
The search queries for WoS are defined as follows:
  • water quality forecasting (All Fields) and freshwater (All Fields).
  • water quality prediction (All Fields) and freshwater (All Fields).
Figure 2 provides visual support to understand how many articles may have been obtained by researching some specific keywords in Scopus, and Figure 3 provides the same for WoS.
Additionally, exclusion criteria were established to further refine the scope of the review. These criteria included the following:
  • The freshwater body of the study was not lentic (lentic bodies include lakes, ponds, dams, reservoirs, etc.).
Figure 4 introduces the flowchart of the PRISMA systematic review presented previously.
This systematic review yielded 52 relevant publications from indexed journals obtained with Scopus and WoS, carefully selected to address specific aspects related to the WQ forecasting models.

4. Results

To research the relationships among keywords related to “Water Quality”, a bibliometric network was established. The bibliometric data were obtained from Scopus and processed with VOSviewer (version 1.6.20). The analysis used
  • The co-occurrence of author keywords.
  • A full counting method.
  • Six as the minimum number of occurrences.
  • The research query used was TITLE-ABS-KEY (water AND quality AND prediction AND freshwater) AND (LIMIT-TO (EXACTKEYWORD, “Water Quality”) OR LIMIT-TO (EXACTKEYWORD, “Forecasting”) OR LIMIT-TO (EXACTKEYWORD, “Prediction”) ).
Some near-duplicate keywords were merged; in each pair, only the variant with fewer occurrences was removed:
  • “Lake” was removed, and “Lakes” was kept.
  • “Water Quality Prediction” was removed, and “Prediction” was kept.
  • “Modelling” was removed, and “Modeling” was kept.
Figure 5 introduces the bibliometric network obtained.
By analyzing Figure 5, the following observations can be made:
  • The principal types of models are machine learning, deep learning, and Artificial Intelligence.
  • The most used models are LSTM, Artificial Neural Network, multiple linear regression, and biotic ligand model.
  • Most related variables are dissolved oxygen, chlorophyll alpha (Chl-α), phosphorus, nitrogen, cyanobacteria, and turbidity.
  • The types of water analyzed are freshwater and groundwater.
  • The principal types of water bodies are lakes, streams, and rivers.

4.1. Which Water Bodies and Variables Are Used in WQ Forecasting?

Table 1 shows which water bodies (and the corresponding dates) were used to obtain the data for each forecasting model, along with all the variables used (input variables → output variables). All abbreviations used in the following tables are listed in the Abbreviations section.
By analyzing Table 1, the following statements can be made:
  • In 34 articles, just one freshwater body was used to obtain the data. A few other articles utilized large datasets with data from numerous water bodies: three with more than 100 water bodies, three with more than 1000 water bodies, and two with more than 10,000 water bodies.
  • There is a wide range of periods over which datasets were obtained:
    - Months: generally to obtain data for a specific part of the year.
    - Seasons: usually to capture the data characteristics of each season or to avoid certain seasons (like winter, when some freshwater bodies freeze).
    - A complete year: habitually to consider a complete annual cycle.
    - More than one year: datasets spanning more than one year were usually obtained from a specific organization.
  • The principal countries from which WQ variables are forecast are the USA and China, with 12 articles each.
  • The water quality variables of greatest interest to forecast are Chl-α (21), DO (8), and TP (5).
  • The predictor variables most used are TP and temperature (17 each), Chl-α (16), AT and pH (15), WT and TN (13), and EC and precipitation (11).
Figure 6 presents the times that a variable was used as a predictor and predicted.

4.2. Which Forecasting Methods and Evaluation Metrics Are Used?

For a more detailed analysis of each article, Table 2 contains a summary for each article without multiple forecast horizons. The summary contains
  • The variables of interest;
  • The models implemented;
  • The evaluation metrics used.
To present the most relevant information for each forecasting model, the following criteria were applied:
  • The forecasting model presented is the model with the best evaluation metrics, according to the authors of each article.
  • Each variable of interest is introduced with its respective evaluation.
  • Table 3 only considers the nearest and furthest forecast horizon evaluated (with the evaluation metrics for each horizon).
In some cases, the selected forecasting model is not the principal model of interest but rather the best-performing combination with other models/methods (as in [25]).
The presentation format for each article in Table 2 is
  • With only one forecasted variable: variable of interest—forecasting model—evaluation metrics.
  • With two or more forecasted variables: forecasting model—variable of interest (evaluation metrics).
Figure 7 summarizes the evaluation metrics used in the review. RMSE and R² are the most commonly used evaluation metrics, each appearing 28 times; however, only 53.8% of articles used both, i.e., not every article using RMSE also used R², and vice versa.
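For reference, the two dominant metrics follow their standard definitions, computed here on hypothetical values rather than data from any reviewed model:

```python
import numpy as np

# Standard definitions of the two most common evaluation metrics.
def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

y_true = np.array([3.0, 5.0, 7.0, 9.0])   # hypothetical observations
y_pred = np.array([2.5, 5.0, 7.5, 9.0])   # hypothetical forecasts
print(rmse(y_true, y_pred))   # ≈ 0.3536
print(r2(y_true, y_pred))     # 0.975
```

RMSE is in the units of the forecasted variable, while R² is dimensionless, which is one reason articles often report both.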
Figure 8 presents how many times each type of model was used to develop forecasting models.
By analyzing Table 2, the following statement can be made:
  • Forty articles did not evaluate their models with multiple forecast horizons, which prevents comparing their performance with models that do.
The authors in [19] present the RMSE values for all the predictions for each lake in the area in just one figure as a heatmap. Other articles, such as [30], introduce the evaluation of their forecasting models (10 developed MLR) in their support material but not in the article. On the other hand, in [42], nine machine learning models for predicting some variables are compared, but only the best model results for each variable are presented. Neither an exact comparison between models is given, nor are the inputs for each best forecasting model introduced (the authors merely mention 50 of the 132 predictor variables for the WQ variable response).
Ref. [33] shows results for many forecasting models; each model has different input variables, and all models are retested using lagged comprehensive variables. The evaluation metrics for each forecasting model in [36] are also provided, along with a ranking based on the results (although numeric MAE values are not presented); additionally, that article presents the exact equations used for each predictor variable.
There are cases where values for each evaluation metric are presented in graphics. In [45], all metrics are given in each plot, corresponding to each model, including results for training and testing datasets. Additionally, in [66], the evaluation metrics are also presented in graphical form (one for each surface and bottom DO, training and testing datasets, each lake, and each of the 30 model runs).
On the other hand, there are cases where the evaluation of each metric is reported with numerical values. In [63], all evaluation metrics for each forecasting model developed are presented. Similarly, the work in [58] presents the values of accuracy, precision, and recall for each alga (Microcystis and Dolichospermum), each dam reservoir, and each dataset (test and training); only Dolichospermum in the Agigawa dam has accuracy results for both datasets. Each model was run 1000 times with different training and testing datasets.
The authors of [48] conducted numerous experiments with all four forecasting models, each under different configurations (12 for RF, 10 for SVR, 14 for LSTM, and 17 for GRU); each model is presented with its configuration and exact values for the two evaluation metrics used. Some articles varied neither the model configurations nor the variables, as in [47], where 22 forecasting models are evaluated with the same model but different datasets (each dataset contains data from only one sampling date and includes data from the CONAGUA dataset and/or satellite images). A similar case is presented in [49], where the authors consider three cases to test each forecasting model: Case A (measurement data), Case B (measurement and meteorological data), and Case C (measurement, meteorological, and satellite data). All four evaluation metrics are shown for each station and each case (Case A has the best results, and Case C performs better than Case B). Ref. [61] (by the same first author) considers the same three cases. In [49], 18 different data quantities are used in the training and testing datasets (eight for Case A, five for Case B, and five for Case C), but in [61], only six are used (three for each station). Because both articles use the same evaluation metrics and forecast the same variable, they can be compared: the best model in [61] yields better results than the best model in [49]. The two articles differ in study sites and (partially) in input variables, and both implement just two models (LSTM and SARIMAX).
In most articles reviewed, each forecasting model evaluation uses the same metrics, but this is not the case for [52]. Despite this article mentioning other models for some variables (the SMAP model for discharge, the Distributed Load model for TN and TP concentrations, and GLM-AED2 for water quality features and WL changes), many of these models did not use the same evaluation metrics (MAE, r, RMSE).
On the other hand, Table 3 presents articles that specify and evaluate different forecast horizons with their forecasting models. The format used to present the results in Table 3 is
  • With only one forecasted variable: variable of interest—(evaluation metrics)—forecasting model—nearest forecast horizon: (values for each evaluation metric); farthest forecast horizon: evaluation.
  • With two or more forecasted variables: forecasting model—(evaluation metrics)—nearest forecast horizon: variable of interest (values for each evaluation metric); farthest forecast horizon: variable of interest (values for each evaluation metric).
Table 3 only shows the nearest and farthest forecast horizon. Still, there are cases, such as [23], where experiments to predict Chl-α are implemented using datasets with different frequency times (daily from 0 to 60 days and monthly from 1 to 3 months).
As mentioned previously, numerical values for each evaluation metric are not always given. Although [22] analyzes the evolution of RMSE, MAE, and R² for each lead time and each WQ variable, no exact numerical values are given for each case (only the skill scores of each case are plotted in a heatmap). This situation is also observed in [67], where all forecasting models are evaluated, accompanied by numerous plots for each evaluation metric.
In the work of [31], the evaluation metrics are presented just for 1-day, 7-day, and 16-day forecasts. Also, the metrics are calculated for all depths (averaged together from 1 to 8 m).
Some articles introduce all evaluation metrics for each case implemented. In [53], the authors report results for training and validation sets for each forecasting model and each station (Xintang and Changji Bridge), including results for the nine periods proposed (only for the CBiLSTM-AT model), and use graphics to show the evolution of accuracy across the periods (again only for the CBiLSTM-AT model). In [46], all evaluation metrics are presented for each forecasting model, for all three lead times (10, 20, and 30 days), and for both cases (with and without multiple lagged predictor variables). Similarly, [16] presents all evaluation metrics for each model, dataset (training and test), and lead time (1, 3, 6, and 12 h).
The work developed in [64] presents the values of each evaluation metric used (R², RMSE, and KLD) for each forecasting model at each time step (1 to 7 days, with a step of one day), except for the main forecasting model (BMA). Most of the results are presented as plots, and no exact numerical data are provided.

4.3. Which Models or Methods Are Usually Compared or Combined?

It is essential to highlight that Table 2 and Table 3 only show the results for the best forecasting models, according to the authors. There are articles where other models are evaluated to compare their performance, or multiple methods/models are implemented to track the performance evolution for evaluation metrics. Table 4 introduces which other models are implemented and which case corresponds (combining or comparing).
All models evaluated in [35] use all four evaluation metrics for each beach (Woodlawn, Hamburg, and Bennett). In the case of [50], all four evaluation metrics are used for each forecasting model in both cases of input variables (Landsat Bands and Landsat Bands + Environmental Variables). A similar case is presented in [39], where results for 1-day and 7-day forecasting are provided for each model. In some model combinations, BLSTM was substituted with SVR, ELM, or LSTM.
On the other hand, all results used in [67] are found in plots. RMSE and MAPE are evaluated for each forecasting model and different datasets by increasing the years of data (1 to 20 years). Additionally, Ref. [67] evaluated each forecasting model using different numbers of features (1 to 4, 5 to 8, and 9 to 13) and two forecast horizons (14 days and 28 days).

4.4. Which Monitoring Conditions Are Usually Implemented?

Table 5 introduces the articles that detail the monitoring conditions used and the time-step forecast their models provide.
The authors in [31] specify that at 00:00 h, their model generates a 16-day forecast.
When analyzing Table 5, it is observed that
  • No article mentions all three aspects considered.
  • Exactly 25 articles mention the monitoring frequency for all or some of the variables used.
  • Only three articles indicate the moment when variables are obtained.
  • Just seven articles detail the frequency of their models for forecasting.
In [58], some WQ variables are calculated as an average or total for the past 7-day data (i.e., average volume in the past 7 days, total daylight hours in the past 7 days, etc.). However, this does not mean that each variable was measured every 7 days.
In most cases, one might assume “same monitoring frequency, same forecasting frequency”. However, this does not hold for many articles: 45 articles did not mention their forecasting frequency. When the monitoring frequency is lower than the forecasting frequency, techniques such as interpolation must be applied to match the two; when it is higher, down-sampling must be considered. The latter case is demonstrated in [67], where the authors implement down-sampling to generate data at regular fortnightly intervals with no missing values.
The monitoring moment is also unmentioned in most articles. When the monitoring frequency is high (such as hourly), the variable's behavior is captured throughout the day, so the data accurately reflect it and the monitoring moment matters less. With a low monitoring frequency (such as monthly), the data reflect the variable only at the moment of monitoring, so that moment must be detailed: some WQ variables change during the day and can reach their minimum or maximum at a time that is never sampled. This is essential because some aquatic species have strict limits for WQ variables; if a variable crosses its lower limit at an unsampled moment, substantial damage to aquatic species (and other consequences) can occur even though the forecasted values appear safe.
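The down-sampling step can be sketched as follows (a generic illustration with simulated daily data; the exact procedure in [67] may differ):

```python
import numpy as np
import pandas as pd

# Sketch: down-sample high-frequency (daily) monitoring data to a regular
# fortnightly grid with no missing values, in the spirit of [67].
rng = np.random.default_rng(seed=1)
idx = pd.date_range("2024-01-01", periods=120, freq="D")       # daily readings
daily = pd.Series(8 + rng.normal(0, 0.3, size=idx.size), index=idx)

fortnightly = daily.resample("14D").mean()   # one averaged value every 14 days
print(len(fortnightly), int(fortnightly.isna().sum()))   # 9 0
```

Averaging within each 14-day bin (rather than picking a single reading) smooths intra-period variation, at the cost of losing sub-fortnightly detail.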
Another feature related to data, as important as the quantity of data, is the percentage of data used for training, testing, or validation of each forecasting model.

4.5. How Much Data Is Required to Train, Test, or Validate Forecasting Models?

In [67], a variation in their results is mentioned depending on the quantity of data used. Also, the results are expected to improve (or worsen) if more (or less) data is obtained. In ANN models, the data is usually divided into sub-datasets (training, testing, or validation). Some articles that report the percentage of data used for training, testing, or validating their models are presented as follows:
  • 80% (training) and 20% (testing/validation): [65,69].
  • 75% (training) and 25% (testing/validation): [27].
  • 75% (training), 10% (validation), and 15% (evaluation): [68].
  • 70% (training) and 30% (testing/validation): [49,50,59,60].
  • 60% (training), 20% (validation), and 20% (testing): [51].
  • Splits reported without percentages: [19] 2/3 (training) and 1/3 (testing); [52] 1000-day data (training) and 700-day data (validation); [54] data from 2017 to 2021 (training) and data from 2022 (testing); [39] 27-month data (training), 3-month data (validation), and 3-month data (testing); [45] 57 sampling points (training) and 21 sampling points (validation); [46] 107 samples (training) and 70 samples (testing).
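A chronological split such as the 60/20/20 scheme of [51] can be sketched as follows (synthetic data; for time series, splitting without shuffling avoids leaking future observations into training):

```python
import numpy as np

def chronological_split(data, train_frac=0.6, val_frac=0.2):
    """Split a time-ordered array into train/validation/test subsets
    without shuffling, so no future information leaks into training."""
    n = len(data)
    i_train = int(n * train_frac)
    i_val = int(n * (train_frac + val_frac))
    return data[:i_train], data[i_train:i_val], data[i_val:]

series = np.arange(1000)  # placeholder for a WQ time series
train, val, test = chronological_split(series)
```

The remaining fraction (here 20%) forms the test set; the three subsets stay in temporal order.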

4.6. Which Articles Have Made Their Dataset Available?

Sometimes, a researcher must compare their forecasting model with another presented in an existing article. In most cases, there are some differences between both forecasting models:
  • The water body of interest is not the same (it is in a different place, has other meteorological conditions, does not have the same physical characteristics, etc.).
  • The characteristics of each dataset are different (have different monitoring frequencies, do not monitor the same variables, do not have the same quantity of data, etc.).
A study may focus solely on developing a novel model to forecast a WQ variable. If the dataset of another study is available, then additional experiments can be implemented (including with that dataset), which allows observing the performance of the model in different cases. However, not all articles shared their datasets. At the time of this review, only 10 articles have made their datasets available for access [19,38,40,44,50,55,60,62,63,66]. For the remaining articles,
  • The link to the repository or web page is down;
  • A username and password are required (just one case);
  • The data will only be available on request;
  • Governments and private institutions granted the data.

4.7. Which Articles Give Information About the Configuration Used in Their Forecasting Models?

Having the datasets is not the only way to compare forecasting models. If an article presents the configuration of its forecasting models, then a researcher can replicate those models with a different dataset and compare performance. Table 6 introduces the articles that share aspects of the configuration of their forecasting models.
Articles like [42] mention which hyperparameters were tuned for each forecasting model, but give no details. The same occurs in [48], where the hyperparameters for each model are listed, but no exact values are given.

5. Discussion

This section discusses the findings of the review regarding variables, models, common problems, reporting practices, and future work.

5.1. About Water Quality Variables

By analyzing all previous tables, the following statements are made:
  • The water quality variables of greatest interest to forecast are Chl-α (21), DO (8), and TP (5).
  • The predictor variables most used are: TP and temperature (17 each), Chl-α (16), AT and pH (15), WT and TN (13), EC and precipitation (11).
  • Just 24 articles used the forecasted variable (at least one) as a predictor variable.
  • Only three articles used one variable as a predictor and as a forecasted variable.
The results for the articles using a single variable (as both predictor and predicted) are R² = 0.52 for Chl-α; R² = 0.96 and RMSE = 0.31; and, for COD, (RMSE, MAE, MAPE (%)) of (0.07, 0.05, 2.21) at 1 day and (0.3, 0.19, 8.18) at 7 days. All three use different variables and methods. The maximum number of predictors found in the reviewed literature was 52 variables, used to predict TP (RMSE = 0.766, std = 0.021), TN (RMSE = 0.567, std = 0.032), Chl-α (RMSE = 0.819, std = 0.014), and Secchi depth (RMSE = 0.559, std = 0.002). In terms of variables (predictors and predicted), no two articles share exactly the same set; in most cases, articles have only a few predictor variables in common while the remaining predictors differ. Also, in most cases, all available variables are used as predictors at all times, and no additional analysis is made by varying the number of predictors. In a few other cases, a correlation analysis is performed, and only the most relevant variables are chosen as predictors.
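The correlation-based predictor selection mentioned above can be sketched as follows (a minimal example on synthetic data; the variable names and threshold are illustrative, not taken from any reviewed study):

```python
import pandas as pd
import numpy as np

rng = np.random.default_rng(1)
n = 200
# Synthetic predictors (names are illustrative only).
df = pd.DataFrame({
    "TP": rng.normal(0.1, 0.02, n),
    "temperature": rng.normal(22, 3, n),
    "pH": rng.normal(7.5, 0.3, n),
})
# Target loosely driven by TP and temperature, plus noise.
df["Chl_a"] = 50 * df["TP"] + 0.5 * df["temperature"] + rng.normal(0, 0.5, n)

# Keep only predictors whose |Pearson r| with the target exceeds a threshold.
corr = df.corr()["Chl_a"].drop("Chl_a").abs()
selected = corr[corr > 0.3].index.tolist()
```

Here the uncorrelated predictor (pH) is dropped, leaving only the variables most relevant to the target.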

5.2. About the Forecasting Models

The articles show that the main documented models used to predict water quality are based on
  • Machine learning (39), including RF (15), SVR (4), GBM (3), XGBoost (3), etc.
  • Deep learning (34), including LSTM-based models (10), other NN-based models (10), RNN (3), GRU (3), etc. Since LSTM and GRU are types of RNN, and RNNs are a type of NN, RNN-based models total 16 and NN-based models total 26.
  • Statistical (19): MLR (4), SARIMAX (2), ARIMA, SARIMA, etc.
  • Regression models (4): SELM, PLS, SPLS, ELM.
  • Metaheuristic (1): WOA.
  • Others (28): these include various models, methods, frameworks, and more.
Despite some articles sharing the same models, it is challenging to make a fair comparison because the variables (predictors and predicted), datasets, forecast evaluation metrics, and additional methods used differ. Also, not all models have the same forecast lead time. The best comparison available is the one made within each article that develops and compares several forecasting models. Only a few articles evaluated every forecasting model they built, and in most cases no details were given about training time or implementation cost for each forecasting model developed. The principal focus is usually to present forecasting models based on novel methods or for a specific water body. This is understandable, since the main goal of all the articles is to forecast WQ rather than to optimize training time or reduce implementation cost.

5.3. About Common Problems

Some problems found are introduced next:
  • Of all 52 articles, just 10 articles have made their datasets available.
  • Only one article did not introduce the variables to be predicted and/or the auxiliary variables to predict the variable of interest.
  • Forty-two articles did not show the configuration of the prediction model.
  • Just three articles detailed the exact monitoring conditions. The monitoring moment is usually missing.
  • Forty-six articles did not present the monitoring and forecast frequency.
  • Only 15 articles gave the percentage or equivalent of data used for training, testing, or evaluating their models.
Not all articles provide sufficient information to replicate their forecasting models exactly. A common problem is missing details about the data used to train, test, or validate the models. In some cases, the datasets were provided by governmental or private organizations, and the authors did not have permission to share them. Others did share the data, but the repository link is now broken. If the data is not shared, then no further investigation, analysis, or experimentation can be conducted by other researchers; such experiments include varying the quantity of data used to train the models or varying the data frequency. Using one's own data is a partial solution: testing the published forecasting models on new data enables researchers to validate their performance, although, since a different dataset is used, the results are expected to vary and the original results cannot be replicated. If the new dataset has the same monitoring conditions (frequency and moment), then the performance obtained should be attributable more to the model than to the data itself. If the dataset has different monitoring conditions, then the new dataset should have an impact on the performance. More experiments should be conducted with the data (by varying monitoring conditions and extracting sub-datasets) to quantify the actual impact of the dataset used.

5.4. How Should the Forecasting Models Be Shared?

Forecasting models should be shared with all details about their configuration (architecture, equations, hyperparameters, etc.) and the data used (monitoring moment, frequency, quantity, etc.) so that the results can be replicated and new experiments conducted. To avoid some common errors found during this review, a methodology is proposed based on the following points:
  • Variables: As seen in previous sections, some information about predictor or forecasted variables is missing, which prevents obtaining data with similar characteristics. To avoid this situation, the following information needs to be presented for each variable:
    - The monitoring moment;
    - The monitoring frequency;
    - The frequency of the model inputs;
    - The output frequency of the forecasted variables.
  • Dataset: Similarly, some key information about the dataset used is missing, and the dataset may not be available for download. To prevent this situation, a summary is required giving the following details:
    - The total data quantity;
    - The quantity of missing and replaced data;
    - The percentage (or exact quantity) of data used for each training, testing, or validation subset;
    - Summary statistics (mean, maximum, minimum, std, etc.).
  • Model: In most cases, key features are missing from the information related to the forecasting models, making comparisons between models poor or impossible. To avoid this, the following items are required for each forecasting model:
    - The exact equations and coefficients obtained (where applicable);
    - The input and output variables;
    - The configuration parameters (such as hyperparameters for ANNs).
  • Evaluation: As seen in Table 2 and Table 3, there are many options for evaluation metrics, and some metrics capture specific information. The following evaluation metrics are proposed to ensure a fair comparison between different models:
    - Error metrics (MAE, MSE, RMSE);
    - Correlation coefficient (r);
    - Determination coefficient (R²).
    For classification models, the following metrics are also recommended:
    - Precision;
    - Accuracy.
In some cases, a confusion matrix or correlation matrix can also be considered.
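The proposed minimum metric set for regression forecasts can be computed in a few lines; the sketch below (illustrative, not taken from any reviewed article) shows one possible implementation with NumPy:

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Compute the proposed minimum metric set: MAE, MSE, RMSE, r, R^2."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mae = np.mean(np.abs(err))                 # mean absolute error
    mse = np.mean(err ** 2)                    # mean squared error
    rmse = np.sqrt(mse)                        # root mean squared error
    r = np.corrcoef(y_true, y_pred)[0, 1]      # Pearson correlation
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot                 # coefficient of determination
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "r": r, "R2": r2}

metrics = evaluate([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

Reporting all five values lets other researchers compare models even when they favor a different single metric.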

5.5. Future Work

To ensure a good analysis for future forecasting models, the next points represent possible research topics:
  • Variation of the training dataset size: These experiments should answer the question “How many days, months, or years of data are required to obtain better predictions?” Only [67] conducted analyses by varying the size of the datasets used to train the forecasting models developed.
  • Analyzing the impact of data frequency: These experiments should answer the question “Which data frequency is required to achieve good precision?” A dataset with 1000 data points obtained every 12 h (500 continuous days of data) is very different from a dataset with 1000 data points obtained every 7 days (approximately 19 years of data). Additionally, in the specific case of ANNs, the inputs can be arrays. How much data is required in the input arrays to improve WQ forecasting? An array of 10 data points from a database with a monitoring frequency of 12 h represents 5 days of data, meaning that each forecast is supported by data from the previous 5 days. Increasing or decreasing the data frequency should have an impact on the forecasted values.
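The construction of such input arrays, a sliding window over the series, can be sketched as follows (a minimal example assuming a univariate series and a hypothetical 10-point window):

```python
import numpy as np

def make_windows(series, window=10, horizon=1):
    """Build (input array, target) pairs: each input holds `window`
    consecutive past observations and the target lies `horizon`
    steps ahead of the window's end."""
    X, y = [], []
    for i in range(len(series) - window - horizon + 1):
        X.append(series[i:i + window])
        y.append(series[i + window + horizon - 1])
    return np.array(X), np.array(y)

# With 12 h monitoring, a 10-point window spans 5 days of history.
series = np.arange(100, dtype=float)  # placeholder WQ series
X, y = make_windows(series, window=10, horizon=1)
```

Varying `window` (and the underlying monitoring frequency) in such experiments would quantify how much history each forecast actually needs.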

6. Conclusions

There is a wide range of forecasting models, each with its own configuration parameters (including hyperparameters and their optimization), architecture, equations, methods, variables, and so on. Although the main purpose of most articles is to evaluate or compare specific models, introduce a novel method, or forecast the WQ of a specific freshwater body, if vital information is missing, then the forecasting model cannot be replicated or compared properly. The most common problem is the availability of the database used in each article. If the database is unavailable, researchers cannot conduct additional analyses. Experimenting with the original data and models enables researchers to assess how well (or poorly) a developed forecasting model performs and the impact of data variation. If all configurations and information about the forecasting model are provided, new forecasting models can be developed using the same data, allowing researchers to test how model performance is affected by new data (with the same or different characteristics). On the other hand, if databases are not shared and no information about them is provided, then important characteristics (such as the monitoring moment and frequency, the quantity of data, and the percentages of data used for training, testing, or validating the forecasting model) will be unknown, preventing others from collecting their own data with the same characteristics and comparing the models on different data. The ideal case for a fair comparison between models is to have both the original data and one's own data. Finally, developed models should report the essential evaluation metrics, and as many others as applicable, so their results can be compared with those of other research.
The principal problem addressed by WQ forecasting appears to be algal blooms, as the most forecasted WQ variable is Chl-α; DO and TP are the second and third most forecasted variables. If a freshwater body has an algal bloom problem, then the DO level is expected to decrease, which is why many articles that forecast Chl-α used DO as a predictor. But algae require nutrients to bloom, and TP is the principal nutrient used as a predictor for Chl-α; indeed, TP is used more often than DO: TP was used 13 times, and DO only 7 times, as a predictor for forecasting Chl-α.
The most implemented forecasting models are machine learning and deep learning models. RF is the most frequently used (15 times), and LSTM is the second most frequently used (10 times). RF models are used more often as main forecasting models than LSTM (RF 7 times, LSTM 6), but LSTM models are used more often as comparison models than RF (LSTM 8 times, RF 7). Between 2019 and 2025, machine learning and deep learning models accounted for 58.4% of the total number of forecasting models. This percentage is expected to increase in the coming years due to the AI boom.
Finally, WQ forecasting models for lentic freshwater bodies lack a standard to present the model and its results. A standard is required to present each forecasting model properly and facilitate a correct comparison.

Author Contributions

Conceptualization, J.C.G.-G., J.M.Á.-A., V.P.-B. and J.R.-R.; methodology, J.C.G.-G.; software, J.R.-R.; validation, J.C.G.-G., R.V.C.-S. and V.P.-B.; formal analysis, J.M.Á.-A., R.V.C.-S., V.P.-B., and J.R.-R.; investigation, J.C.G.-G.; resources, J.R.-R.; data curation, R.V.C.-S.; writing—original draft preparation, J.C.G.-G.; writing—review and editing, J.C.G.-G., J.M.Á.-A., R.V.C.-S., V.P.-B., and J.R.-R.; visualization, R.V.C.-S., V.P.-B., and J.R.-R.; supervision, J.R.-R.; project administration, J.C.G.-G. and J.R.-R.; funding acquisition, J.C.G.-G., J.M.Á.-A., R.V.C.-S., V.P.-B., and J.R.-R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
Models
AC-BiLSTMAttention Convolutional BiLSTM
AMAttention Mechanism
BiLSTMBidirectional LSTM
BMABayesian Model Averaging
BNBayesian Network
BPNNBackpropagation Neural Network
CARTClassification and Regression Trees
CBiLSTMCNN-BiLSTM
CBiLSTM-ATCNN-BiLSTM with AM
CBOPMMCyanobacteria Bloom Occurrence Probability Prediction Model
CEEMDANComplete Ensemble Empirical Mode Decomposition With Adaptive Noise
CNNConvolutional Neural Network
CTClassification Tree
CVMDCEEMDAN-VMD
DGLMDynamic Generalized Linear Model
DNNDeep Neural Network
FLAREForecasting Lake And Reservoir Ecosystems
GAGenetic Algorithm
GAMGeneralized Additive Model
GANBMGeneralized Additive Negative Binomial Model
GAPMGeneralized Additive Poisson Model
Gaussian BMGaussian Bayesian Network
GBMGradient Boosting Machine
GBRGradient Boost Regressor
GNNGraph Neural Networks
GRUGated Recurrent Unit
GPRGaussian Process Regression
INLAIntegrated Nested Laplace Approximation
IVBImproved Complete Ensemble Empirical Mode Decomposition With Adaptive Noise-Variational Mode Decomposition-Bidirectional Long Short-term Memory
KNNK-Nearest Neighbors
LASSOLeast Absolute Shrinkage and Selection Operator
MBPNNMultilayer Backpropagation Neural Network
MLPMultilayer Perceptron
MLRMultiple Linear Regression
NBRMNegative Binomial Regression Model
NN-BiLSTMNeural Network Bidirectional Long Short-Term Memory
NLRNonlinear Regression
NMNaïve Method
PLSPartial Least Square
PLSRPartial Least Square Regression
PRMPoisson Regression Model
PSOParticle Swarm Optimization
QRFQuantile Regression Forest
RFRandom Forest
RFRRandom Forest Regressor
SELMSpatial Error Linear Model
SPLSSparse PLS
SVCSupport Vector Classifier
SVMSupport Vector Machine
SVRSupport Vector Regressor
TPSTwo-Phase System
VMDVariational Mode Decomposition
WAWeighted Average
WOAWhale Optimization Algorithm
XGBExtreme Gradient Boosting
XGBRFExtreme Gradient Boosting with Random Forest
ZINBMZero-Inflated Negative Binomial Model
ZIPMZero-Inflated Poisson Model
Water Quality Variables
ANAmmoniacal Nitrogen (mg/L)
BODBiological Oxygen Demand (mg/L)
Chl-αChlorophyll alpha (mg/L)
CACarbon Accumulation
CODChemical Oxygen Demand (mg/L)
CICyanobacterial Index
DODissolved Oxygen (mg/L)
DOCDissolved Organic Carbon (mg/L)
DOYDay of Year
ECElectrical Conductivity (μS/cm)
HABsHarmful Algal Blooms (μg/L)
LAILeaf Area Index
ORPOxidative Reductive Potential (mV)
SCSpecific Conductance (μS/cm)
SRPSoluble Reactive Phosphate (mg/L)
TOCTotal Organic Carbon (mg/L)
TSSTotal Suspended Solids (mg/L)
TurbidityTurbidity (NTU)
WHWave Height (m)
WLWater Levels (m)
Evaluation Metrics
AICAkaike Information Criterion
BICBayesian Information Criterion
DBSCANDensity-Based Spatial Clustering of Applications with Noise
LOFLocal Outlier Factor
MAEMean Absolute Error
MAPEMean Absolute Percentage Error
MHOEMean Higher Order Error
MSREMean Squared Relative Error
OOBOut-Of-Bag
RMSLERoot-Mean-Square Log Error
RSDRelative Standard Deviation
RSRRoot Mean Squared Error to Standard Deviation

References

  1. Krenkel, P.A.; Novotny, V. Water Quality Management; Academic Press Inc.: New York, NY, USA, 1980. [Google Scholar]
  2. CONAGUA. Estadísticas del Agua en México 2023, March 2024. Available online: https://agua.org.mx/biblioteca/estadisticas-del-agua-en-mexico-2023-conagua/ (accessed on 9 November 2024).
  3. Brockwell, P.J.; Davis, R.A. Introduction to Time Series and Forecasting; Springer: Cham, Switzerland, 2016. [Google Scholar]
  4. Odry, A.; Kecskes, I.; Pesti, R.; Csik, D.; Stefanoni, M.; Sarosi, J.; Sarcevic, P. NN-augmented EKF for Robust Orientation Estimation Based on MARG Sensors. Int. J. Control. Autom. Syst. 2025, 23, 920–934. [Google Scholar] [CrossRef]
  5. Vanegas-Ayala, S.-C.; Barón-Velandia, J.; Romero-Riaño, E. Systematic Review of Forecasting Models Using Evolving Fuzzy Systems. Computation 2024, 12, 159. [Google Scholar] [CrossRef]
  6. Fatima, S.S.W.; Rahimi, A.A. Review of Time-Series Forecasting Algorithms for Industrial Manufacturing Systems. Machines 2024, 12, 380. [Google Scholar] [CrossRef]
  7. Tsai, W.-C.; Tu, C.-S.; Hong, C.-M.; Lin, W.-M. A Review of State-of-the-Art and Short-Term Forecasting Models for Solar PV Power Generation. Energies 2023, 16, 5436. [Google Scholar] [CrossRef]
  8. Mystakidis, A.; Koukaras, P.; Tsalikidis, N.; Ioannidis, D.; Tjortjis, C. Energy Forecasting: A Comprehensive Review of Techniques and Technologies. Energies 2024, 17, 1662. [Google Scholar] [CrossRef]
  9. Sina, L.B.; Secco, C.A.; Blazevic, M.; Nazemi, K. Hybrid Forecasting Methods—A Systematic Review. Electronics 2023, 12, 2019. [Google Scholar] [CrossRef]
  10. Akhtar, S.; Shahzad, S.; Zaheer, A.; Ullah, H.S.; Kilic, H.; Gono, R.; Jasiński, M.; Leonowicz, Z. Short-Term Load Forecasting Models: A Review of Challenges, Progress, and the Road Ahead. Energies 2023, 16, 4060. [Google Scholar] [CrossRef]
  11. Shaheed, H.; Zawawi, M.H.; Hayder, G. The Development of a River Quality Prediction Model That Is Based on the Water Quality Index via Machine Learning: A Review. Processes 2025, 13, 810. [Google Scholar] [CrossRef]
  12. Yan, X.; Zhang, T.; Du, W.; Meng, Q.; Xu, X.; Zhao, X. A Comprehensive Review of Machine Learning for Water Quality Prediction over the Past Five Years. J. Mar. Sci. Eng. 2024, 12, 159. [Google Scholar] [CrossRef]
  13. Willard, J.D.; Varadharajan, C.; Jia, X.; Kumar, V. Time series predictions in unmonitored sites: A survey of machine learning techniques in water resources. Environ. Data Sci. 2025, 4, e7. [Google Scholar] [CrossRef]
  14. Chen, Y.; Song, L.; Liu, Y.; Yang, L.; Li, D. A Review of the Artificial Neural Network Models for Water Quality Prediction. Appl. Sci. 2020, 10, 5776. [Google Scholar] [CrossRef]
  15. Pan, D.; Deng, Y.; Yang, S.X.; Gharabaghi, B. Recent Advances in Remote Sensing and Artificial Intelligence for River Water Quality Forecasting: A Review. Environments 2025, 12, 158. [Google Scholar] [CrossRef]
  16. Pan, D.; Zhang, Y.; Deng, Y.; Thé, J.V.G.; Yang, S.X.; Gharabaghi, B. Dissolved Oxygen Forecasting for Lake Erie’s Central Basin Using Hybrid Long Short-Term Memory and Gated Recurrent Unit Networks. Water 2024, 16, 707. [Google Scholar] [CrossRef]
  17. Ghosh, M.; Thirugnanam, A. Introduction to Artificial Intelligence. In Artificial Intelligence for Information Management: A Healthcare Perspective; Springer: Singapore, 2021; pp. 23–44. [Google Scholar] [CrossRef]
  18. Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. Int. J. Surg. 2021, 88, 105906. [Google Scholar] [CrossRef]
  19. Collins, S.M.; Yuan, S.; Tan, P.N.; Oliver, S.K.; Lapierre, J.F.; Cheruvelil, K.S.; Fergus, C.E.; Skaff, N.K.; Stachelek, J.; Wagner, T.; et al. Winter Precipitation and Summer Temperature Predict Lake Water Quality at Macroscales. Water Resour. Res. 2019, 55, 2708–2721. [Google Scholar] [CrossRef]
  20. Gao, G.; Xiao, K.; Chen, M. An intelligent IoT-based control and traceability system to forecast and maintain water quality in freshwater fish farms. Comput. Electron. Agric. 2019, 166, 105013. [Google Scholar] [CrossRef]
  21. Zhao, C.S.; Shao, N.F.; Yang, S.T.; Ren, H.; Ge, Y.R.; Feng, P.; Dong, B.E.; Zhao, Y. Predicting cyanobacteria bloom occurrence in lakes and reservoirs before blooms occur. Sci. Total Environ. 2019, 670, 837–848. [Google Scholar] [CrossRef] [PubMed]
  22. Peng, Z.; Hu, W.; Liu, G.; Zhang, H.; Gao, R.; Wei, W. Development and evaluation of a real-time forecasting framework for daily water quality forecasts for Lake Chaohu to Lead time of six days. Sci. Total Environ. 2019, 687, 218–231. [Google Scholar] [CrossRef]
  23. Liu, X.; Feng, J.; Wang, Y. Chlorophyll a predictability and relative importance of factors governing lake phytoplankton at different timescales. Sci. Total Environ. 2019, 648, 472–480. [Google Scholar] [CrossRef]
  24. Li, Y.; Khan, M.Y.A.; Jiang, Y.; Tian, F.; Liao, W.; Fu, S.; He, C. CART and PSO plus KNN algorithms to estimate the impact of water level change on water quality in Poyang Lake, China. Arab. J. Geosci. 2019, 12, 287. [Google Scholar] [CrossRef]
  25. Yan, J.; Xu, Z.; Yu, Y.; Xu, H.; Gao, K. Application of a Hybrid Optimized BP Network Model to Estimate Water Quality Parameters of Beihai Lake in Beijing. Appl. Sci. 2019, 9, 1863. [Google Scholar] [CrossRef]
  26. Derot, J.; Yajima, H.; Jacquet, S. Advances in forecasting harmful algal blooms using machine learning models: A case study with Planktothrix rubescens in Lake Geneva. Harmful Algae 2020, 99, 101906. [Google Scholar] [CrossRef] [PubMed]
  27. Banerjee, A.; Chakrabarty, M.; Bandyopadhyay, G.; Roy, P.K.; Ray, S. Forecasting environmental factors and zooplankton of Bakreswar reservoir in India using time series model. Ecol. Informatics 2020, 60, 101157. [Google Scholar] [CrossRef]
  28. Myer, M.H.; Urquhart, E.; Schaeffer, B.A.; Johnston, J.M. Spatio-Temporal Modeling for Forecasting High-Risk Freshwater Cyanobacterial Harmful Algal Blooms in Florida. Front. Environ. Sci. 2020, 8, 581091. [Google Scholar] [CrossRef]
  29. Zhang, X.; Li, B.; Deng, J.; Qin, B.; Wells, M.; Tefsen, B. Advances in freshwater risk assessment: Improved accuracy of dissolved organic matter-metal speciation prediction and rapid biological validation. Ecotoxicol. Environ. Saf. 2020, 202, 110848. [Google Scholar] [CrossRef]
  30. Madani, M.; Seth, R. Evaluating multiple predictive models for beach management at a freshwater beach in the Great Lakes region. J. Environ. Qual. 2020, 49, 896–908. [Google Scholar] [CrossRef]
  31. Thomas, R.Q.; Figueiredo, R.J.; Daneshmand, V.; Bookout, B.J.; Puckett, L.K.; Carey, C.C. A Near-Term Iterative Forecasting System Successfully Predicts Reservoir Hydrodynamics and Partitions Uncertainty in Real Time. Water Resour. Res. 2020, 56, e2019WR026138. [Google Scholar] [CrossRef]
  32. Dugan, H.A.; Skaff, N.K.; Doubek, J.P.; Bartlett, S.L.; Burke, S.M.; Krivak-Tetley, F.E.; Summers, J.C.; Hanson, P.C.; Weathers, K.C. Lakes at Risk of Chloride Contamination. Environ. Sci. Technol. 2020, 54, 6639–6650. [Google Scholar] [CrossRef]
  33. Francy, D.S.; Brady, A.M.G.; Stelzer, E.A.; Cicale, J.R.; Hackney, C.; Dalby, H.D.; Struffolino, P.; Dwyer, D.F. Predicting microcystin concentration action-level exceedances resulting from cyanobacterial blooms in selected lake sites in Ohio. Environ. Monit. Assess. 2020, 192, 513. [Google Scholar] [CrossRef]
  34. Liu, K.; Kong, L.; Wang, J.; Cui, H.; Fu, H.; Qu, X. Two-Phase System Model to Assess Hydrophobic Organic Compound Sorption to Dissolved Organic Matter. Environ. Sci. Technol. 2020, 54, 12173–12180. [Google Scholar] [CrossRef]
  35. Li, L.; Qiao, J.; Yu, G.; Wang, L.; Li, H.Y.; Liao, C.; Zhu, Z. Interpretable tree-based ensemble model for predicting beach water quality. Water Res. 2022, 211. [Google Scholar] [CrossRef]
  36. Navarro, M.B.; Schenone, L.; Martyniuk, N.; Vega, E.; Modenutti, B.; Balseiro, E. Predicting Dissolved Organic Matter Lability and Carbon Accumulation in Temperate Freshwater Ecosystems. Ecosystems 2022, 25, 795–811. [Google Scholar] [CrossRef]
  37. Hadid, N.B.; Goyet, C.; Maiz, N.B.; Shili, A. Long-term forecasting in a coastal ecosystem: Case study of a Southern restored Mediterranean lagoon: The North Lagoon of Tunis. J. Coast. Conserv. 2022, 26, 10. [Google Scholar] [CrossRef]
  38. Jackson-Blake, L.A.; Clayer, F.; Haande, S.; Sample, J.E.; Moe, S.J. Seasonal forecasting of lake water quality and algal bloom risk using a continuous Gaussian Bayesian network. Hydrol. Earth Syst. Sci. 2022, 26, 3103–3124. [Google Scholar] [CrossRef]
  39. Chen, W.; Kim, J.; Yu, J.; Wang, X.; Peng, S.; Zhu, Z.; Wei, Y. COD forecasting of Poyang lake using a novel hybrid model based on two-layer data decomposition. Nongye Gongcheng Xuebao/Trans. Chin. Soc. Agric. Eng. 2022, 38, 296–302. [Google Scholar] [CrossRef]
  40. Carey, C.C.; Woelmer, W.M.; Lofton, M.E.; Figueiredo, R.J.; Bookout, B.J.; Corrigan, R.S.; Daneshm, V.; Hounshell, A.G.; Howard, D.W.; Lewis, A.S.; et al. Advancing lake and reservoir water quality management with near-term, iterative ecological forecasting. Inland Waters 2022, 12, 107–120. [Google Scholar] [CrossRef]
  41. You, L.; Tong, X.; Te, S.H.; Tran, N.H.; Sukarji, N.H.B.; He, Y.; Gin, K.Y.H. Multi-class secondary metabolites in cyanobacterial blooms from a tropical water body: Distribution patterns and real-time prediction. Water Res. 2022, 212, 118129. [Google Scholar] [CrossRef]
  42. Martinsen, K.T.; Sand-Jensen, K. Predicting water quality from geospatial lake, catchment, and buffer zone characteristics in temperate lowland lakes. Sci. Total Environ. 2022, 851, 158090. [Google Scholar] [CrossRef]
  43. Qian, Z.; Cao, Y.; Wang, L.; Wang, Q. Developing cyanobacterial bloom predictive models using influential factor discrimination approach for eutrophic shallow lakes. Ecol. Indic. 2022, 144, 109458. [Google Scholar] [CrossRef]
  44. Crapart, C.; Finstad, A.G.; Hessen, D.O.; Vogt, R.D.; Andersen, T. Spatial predictors and temporal forecast of total organic carbon levels in boreal lakes. Sci. Total Environ. 2023, 870, 161676. [Google Scholar] [CrossRef]
  45. Ding, L.; Qi, C.; Li, G.; Zhang, W. TP Concentration Inversion and Pollution Sources in Nanyi Lake Based on Landsat 8 Data and InVEST Model. Sustainability 2023, 15, 9678. [Google Scholar] [CrossRef]
  46. Gupta, A.; Hantush, M.M.; Govindaraju, R.S. Sub-monthly time scale forecasting of harmful algal blooms intensity in Lake Erie using remote sensing and machine learning. Sci. Total Environ. 2023, 900, 165781. [Google Scholar] [CrossRef] [PubMed]
  47. Torres-Vera, M.A. Mapping of total suspended solids using Landsat imagery and machine learning. Int. J. Environ. Sci. Technol. 2023, 20, 11877–11890. [Google Scholar] [CrossRef]
  48. Amieva, J.F.; Oxoli, D.; Brovelli, M.A. Machine and Deep Learning Regression of Chlorophyll-a Concentrations in Lakes Using PRISMA Satellite Hyperspectral Imagery. Remote Sens. 2023, 15, 5385. [Google Scholar] [CrossRef]
  49. Rodríguez-López, L.; Usta, D.B.; Duran-Llacer, I.; Alvarez, L.B.; Yépez, S.; Bourrel, L.; Frappart, F.; Urrutia, R. Estimation of Water Quality Parameters through a Combination of Deep Learning and Remote Sensing Techniques in a Lake in Southern Chile. Remote Sens. 2023, 15, 4157. [Google Scholar] [CrossRef]
  50. Harkort, L.; Duan, Z. Estimation of dissolved organic carbon from inland waters at a large scale using satellite data and machine learning methods. Water Res. 2023, 229, 119478. [Google Scholar] [CrossRef]
  51. Ozdemir, S.; Yildirim, S.O. Prediction of Water Level in Lakes by RNN-Based Deep Learning Algorithms to Preserve Sustainability in Changing Climate and Relationship to Microcystin. Sustainability 2023, 15, 16008. [Google Scholar] [CrossRef]
  52. Barbosa, C.C.; do Carmo Calijuri, M.; da Silva Anjinho, P.; dos Santos, A.C.A. An integrated modeling approach to predict trophic state changes in a large Brazilian reservoir. Ecol. Model. 2023, 476, 110227. [Google Scholar] [CrossRef]
  53. Tan, R.; Wang, Z.; Wu, T.; Wu, J. A data-driven model for water quality prediction in Tai Lake, China, using secondary modal decomposition with multidimensional external features. J. Hydrol.-Reg. Stud. 2023, 47, 101435. [Google Scholar] [CrossRef]
  54. Hwang, S.Y.; Choi, B.W.; Park, J.H.; Shin, D.S.; Lee, W.S.; Chung, H.S.; Son, M.S.; Ha, D.W.; Lee, K.L.; Jung, K.Y. Evaluation of algal species distributions and prediction of cyanophyte cell counts using statistical techniques. Environ. Sci. Pollut. Res. 2023, 30, 117143–117164. [Google Scholar] [CrossRef]
  55. Villanueva, P.; Yang, J.; Radmer, L.; Liang, X.; Leung, T.; Ikuma, K.; Swanner, E.D.; Howe, A.; Lee, J. One-Week-Ahead Prediction of Cyanobacterial Harmful Algal Blooms in Iowa Lakes. Environ. Sci. Technol. 2023, 57, 20636–20646. [Google Scholar] [CrossRef]
  56. Nkwalale, L.; Schwefel, R.; Yaghouti, M.; Rinke, K. A simple model for predicting oxygen depletion in lakes under climate change. Inland Waters 2023, 13, 576–595. [Google Scholar] [CrossRef]
  57. Rirongarti, R.; Sylvestre, F.; Chalie, F.; Pailles, C.; Mazur, J.C.; Nour, A.M.; Barthelemy, W.; Mariot, H.; der Meeren, T.; Poulin, C.; et al. A diatom-based predictive model for inferring past conductivity in Chadian Sahara lakes. J. Paleolimnol. 2023, 69, 231–248. [Google Scholar] [CrossRef]
  58. Miura, Y.; Imamoto, H.; Asada, Y.; Sagehashi, M.; Akiba, M.; Nishimura, O.; Sano, D. Prediction of algal bloom using a combination of sparse modeling and a machine learning algorithm: Automatic relevance determination and support vector machine. Ecol. Inform. 2023, 78, 102337. [Google Scholar] [CrossRef]
  59. Talukdar, S.; Shahfahad; Bera, S.; Naikoo, M.W.; Ramana, G.V.; Mallik, S.; Kumar, P.A.; Rahman, A. Optimisation and interpretation of machine and deep learning models for improved water quality management in Lake Loktak. J. Environ. Manag. 2024, 351, 119866. [Google Scholar] [CrossRef] [PubMed]
  60. Schaeffer, B.A.; Reynolds, N.; Ferriby, H.; Salls, W.; Smith, D.; Johnston, J.M.; Myer, M. Forecasting freshwater cyanobacterial harmful algal blooms for Sentinel-3 satellite resolved U.S. lakes and reservoirs. J. Environ. Manag. 2024, 349, 119518. [Google Scholar] [CrossRef]
  61. Rodríguez-López, L.; Alvarez, D.; Usta, D.B.; Duran-Llacer, I.; Alvarez, L.B.; Fagel, N.; Bourrel, L.; Frappart, F.; Urrutia, R. Chlorophyll-a Detection Algorithms at Different Depths Using In Situ, Meteorological, and Remote Sensing Data in a Chilean Lake. Remote Sens. 2024, 16, 647. [Google Scholar] [CrossRef]
  62. Woelmer, W.M.; Thomas, R.Q.; Olsson, F.; Steele, B.G.; Weathers, K.C.; Carey, C.C. Process-based forecasts of lake water temperature and dissolved oxygen outperform null models, with variability over time and depth. Ecol. Inform. 2024, 83, 102825. [Google Scholar] [CrossRef]
  63. Nanjappachetty, A.; Sundar, S.; Vankadari, N.; Bapu, T.B.B.R.; Shanmugam, P. An efficient water quality index forecasting and categorization using optimized Deep Capsule Crystal Edge Graph neural network. Water Environ. Res. 2024, 96, e11138. [Google Scholar] [CrossRef]
  64. Chen, C.; Chen, Q.; Yao, S.; He, M.; Zhang, J.; Li, G.; Lin, Y. Combining physical-based model and machine learning to forecast chlorophyll-a concentration in freshwater lakes. Sci. Total Environ. 2024, 907, 168097. [Google Scholar] [CrossRef]
  65. Wang, X.; Tang, X.; Zhu, M.; Liu, Z.; Wang, G. Predicting abrupt depletion of dissolved oxygen in Chaohu lake using CNN-BiLSTM with improved attention mechanism. Water Res. 2024, 261, 122027. [Google Scholar] [CrossRef]
  66. Lin, S.; Pierson, D.C.; Ladwig, R.; Kraemer, B.M.; Hu, F.R.S. Multi-Model Machine Learning Approach Accurately Predicts Lake Dissolved Oxygen With Multiple Environmental Inputs. Earth Space Sci. 2024, 11, e2023EA003473. [Google Scholar] [CrossRef]
  67. Beckmann, D.A.; Werther, M.; Mackay, E.B.; Spyrakos, E.; Hunter, P.; Jones, I.D. Are more data always better?—Machine learning forecasting of algae based on long-term observations. J. Environ. Manag. 2025, 373, 123478. [Google Scholar] [CrossRef]
  68. Sušanj Čule, I.; Ožanić, N.; Volf, G.; Karleuša, B. Artificial Neural Network (ANN) Water-Level Prediction Model as a Tool for the Sustainable Management of the Vrana Lake (Croatia) Water Supply System. Sustainability 2025, 17, 722. [Google Scholar] [CrossRef]
  69. Liu, Y.; Yang, B.; Xie, K.; Sun, J.; Zhu, S. Dongting Lake algal bloom forecasting: Robustness and accuracy analysis of deep learning models. J. Hazard. Mater. 2025, 485, 136804. [Google Scholar] [CrossRef]
Figure 1. Number of articles published: (a) Scopus. (b) Web of Science.
Figure 2. Number of publications in Scopus by researching specific keywords.
Figure 3. Number of publications in WoS by researching specific keywords.
Figure 4. Flowchart of the PRISMA systematic review.
Figure 5. Bibliometric network based on the research in Scopus.
Figure 6. Predictors and predicted variables for water quality forecasting.
Figure 7. Evaluation metrics used.
Figure 8. Types of models used.
Table 1. Monitoring place and monitoring variables.
Ref. — Monitoring Water Bodies — Input and Output Variables
 [19]11,882 lakes (from LAGOS database, 2015)48 climate metrics, Chl-α, water clarity, TP, TN →lake nutrients, Chl-α, water clarity
 [20]Asian carp and rainbow trout fish farm in the city of Baoding (China), 4 Monitoring Points (MP), from 09/06/2018 to 10/15/2018pH, temperature, EC, turbidity, and DO
 [21]13 reservoirs and two lakes in Jinan (China), from spring, summer, and autumn of 2014 and 2015WT, pH, TP, NH4N, COD, DO → cyanobacteria bloom
 [22]Chaohu Lake (China), 12 MP, 2016 to 2017Nutrient concentrations from lake and rivers, WL, rainfall, AT, wind field, SR, inflows, outflows, cloud cover → DO, NH, TN, TP
 [23]Yuqiao Reservoir (China)TP, TN, NH4, WS, WT → Chl-α (as a proxy of phytoplankton)
 [24]Poyang Lake (China), 5 MP, from 2002 to 2008 for training data, 2009 for verifying data, and 2010 to 2012 for WQ evaluationCOD, DO, NH4-N, TP, WL, lake area, and lake volume → COD, DO, NH4-N, TP
 [25]Beihai Lake (Beijing, China), 120 h of data from August 2013pH, Chl-α, NH4-N, BOD, and EC → DO
 [26]Geneva Lake (France and Switzerland), 1 MP, from 1984 to 2018Planktothrix rubescens (Chl-α), sum of cyanobacteria taxa → Planktothrix rubescens (Chl-α)
 [27]Bakreswar reservoir (India), 3 MP, from March 2012 to February 2014AT, WT, humidity, EC, Salinity, hardness, TDS, and Zooplankton
 [28]103 lakes in Florida (USA), from May 2016 to June 2019Cyanobacteria abundance, WT, ambient temperature, precipitation, and lake geomorphology → Cyanobacteria abundance
 [29]Lake Taihu (China), 32 MP, August 2017Metal-DOM, Pb-DOM → Pb-DOM
 [30]Lake St. Clair (Canada), 5 MP, summer months from 2014 to 2018WT, AT, daily rainfall, WS, WD, WH, turbidity, NBirds, DOY → E. coli concentration
 [31]Falling Creek Reservoir (Virginia, USA), from 07/11/2018 to 08/27/2018Meteorological data on downwelling shortwave radiation, downwelling longwave radiation, AT, WS, relative humidity, precipitation, water inflow and outflow, inflow temperature → WT
 [32]49,432 lakes from USAMonth, the lake area, watershed area, land use, ice and snow, open development, low development, medium development, high development, barren island, deciduous forest, evergreen forest, mixed forest, shrub, grassland, pasture/hay, crops, woody wetlands, emergent wetlands, road density in the watershed, index of winter severity, distance to the nearest interstate, distance to the nearest road → Chloride Contamination
 [33]Western Lake Erie Basin (8 MP) and inland lakes in Northeast of Ohio (2 MP) (USA), twice a month to twice a week from May to November, in 2016 to 2017Phycocyanin, Chl-α, pH, SC, WT, WH (ft), temperature, turbidity, ORP, DO, SR, Dew Point, Gage Height (f), Streamflow (cfs), rainfall (inches), Lake Level change (ft), WS (mph), Nutrients, Algal Pigment Fluorescence → Microcystin concentration action-level exceedances
 [34]Taihu Lake and Xuanwi Lake (China), March 2019 (Taihu Lake) and July 2020 (Xuanwu Lake)Temperature, EC, pH, DOC, DOM → KOC
 [35]3 Beaches from Lake Erie (USA)Lake-related, weather-related, stream-related, and others → E. coli concentrations
 [36]59 lentic ecosystems in the North Andean Patagonian glacial lake district, from 2016 to 2017TDN, TDP, Cprot → DOM
 [37]North Lagoon of Tunis (Southern Mediterranean Sea to the east of the city of Tunis), monthly from 01/1989 to 04/2018Chl-α
 [38]Vansjo Lake (Norway), from 1992 to 2012TP → TP; Chl-α, TP, WS → Chl-α; Chl-α, Colour → Cyanobacteria; Colour, Rain sum → Colour;
 [39]Poyang Lake (China), from August 1st 2017 to April 30 2020COD
 [40]Falling Creek Reservoir (Virginia, USA)AT, wind speed, relative humidity, shortwave and longwave radiation, precipitation, inflow discharge, and WT → DO
 [41]Freshwater lake in Singapore, 4 MP, 07/2019 to 11/2019Chl-α, TC, rainfall, NH4, light, TN, DN, DO, TOC, TP, Turb., TIC, NO2, NO3, SO4, CI, PO4, pH, Temp, EC, TDS, salinity → Cyanobacterial MCs and Cyn Chl-α, TC, rainfall, NH4 → MCs; CI, TC, rainfall, NH4 → CYN;
 [42]924 to 1054 lakes from Denmark, from 2000 to 201950 predictor variables → Alkalinity, pH, TP, TN, Chl-α, Secchi depth, color, and pCO2
 [43]Dianshan Lake, 12 MP, from December 2012 to December 2019, and August and September 2021AT, DO, TDS, EC, COD, pH, TP, TN, and N:P → Cyanobacterial Bloom
 [44]4735 boreal lakes in 1995 and 1001 boreal lakes in 2019 (Finland, Sweden, and Norway)NDVI, log Runoff, Bog, Arable, TNdep → TOC
 [45]Nanyi Lake (Southern Anhui), from 01/2015 to 12/2021TP, CODMn, NH3-N, DO → TP
 [46]Lake Erie (USA), monthlyCI, WS, AT, TP, TKN, NO2 + NO3, SRP, TSS, Avg. Streamflow, Ration of TKN to TP, Ration of TKN to (NO2 + NO3), Avg. SR, Avg. WL, Secchi depth, Observation time step of the year → CI
 [47]Chapala Lake (Mexico), from May 2005 to November 2016TSS concentration
 [48]Como Lake, Maggiore Lake and Lugano Lake (between Italy and Switzerland), from 01/15/2019 to 11/05/2022Water Surface temperature, TSM, Chl-α → Chl-α
 [49]Llanquihue Lake (Chile)Secchi Disk Depth, Chl-α, temperature, TN, TP → Chl-α
 [50]USA Lakes in AquaSat database and ERA5-Land, from 1984 to 2019 and since 1981 to 2019 respectivelyAverage monthly AT, WS, LAI, average LAI, evaporation over inland waters, surface net SR, monthly precipitation → DOC
 [51]Lake Sapanca (Turkey), daily from 10/11/2012 to 08/04/2023Maximum temperature, Minimum temperature, Average temperature, precipitation, Withdrawal → WL
 [52]Itupararanga Lake (Brazil), from 07/2011 to 04/2012 and from 12/2013 to 12/2017AT, reservoir inflow, WT → WT
 [53]Tai Lake (China), 6 MP, every 4 h from 11/01/2020 to 02/28/2023WT, Sea Level Pressure, Ground Pressure, AT, Surface temperature, Dew Point temperature, DO content of the adjacent station → DO
 [54]Juam Lake and Tamkin Lake (South Korea), 4 MP, from 01/2017 to 12/2022BOD, COD, TN, TP, TOC, SS, EC, pH, DO, temperature, turbidity, Transparency, Chl-α, Low Water Level, Inflow, Discharge, Reservoir → Cyanophyte Cell Counts
 [55]38 Iowa Lakes, 38 MP, from 2018 to 2021mcyAM, TKN, % hay/pasture, pH, mcyAM: 16S, % developed, DOC, dewpoint temperature, and ortho-P → Cyanobacterial Harmful Algal Blooms
 [56]13 lakes from Germany (two experiments)Trophic state, stratification duration, and hypolimnion temperature → DO depletion
 [57]Ounianga Kebir and Ounianga Serir (groups of lakes in Sahara), from 2015 to 2016Water temperature, pH, and EC → EC
 [58]Hitokura, Terauchi, Murou, and Agigawa dam reservoirs (Japan), from 2001 to 2017Algae (Microcystis and Dolichospermum), TN, TP, Inflow, Discharge, AT, Daylight Hours, Wind Strength, Amount of Rain → Algal Bloom (Microcystis and Dolichospermum)
 [59]Loktak Lake (India), 60 MPEC, pH, turbidity, temperature, TDS, BOD, COD, DO, Nitrate → WQI
 [60]2192 lakes in the USA, from 2017 to 2021Surface WT, precipitation, lake surface area, depth mean → Cyanobacteria index
 [16]Lake Erie (USA), 21 MP, from 06/19/2020 to 10/11/2020 and from 06/19/2021 to 10/11/2021Date-time stamps, temperature, DO → DO
 [61]Maihue Lake (Chile), summer and spring from 2001 to 2020Chl-α, Secchi Disk, temperature, TN, TP, turbidity, precipitation, AT, Relative humidity, WS, spectral bands, vegetation spectral indices → Chl-α
 [62]Sunapee Lake (New Hampshire, USA), from 2021 to 2022AT, shortwave and longwave radiation, windspeed, relative humidity, precipitation, and Hypsography → WT and DO
 [63]3276 water bodies from India, from 2003 to 2014hardness, pH, Potability, turbidity, TDS, Chloramines, Sulfate, EC, Organic Carbon, Trihalomethanes → WQI
 [64]Taihu Lake (China), 14 MP in rivers and 8 MP in sub lakes, daily from 2014 to 2016 (for meteorological data) and once a month from 2014 to 2016 (for hydrological data)Precipitation, AT, atmospheric pressure, SR, humidity, evaporation, cloud coverage coefficient, maximum WS, average WS, daily discharge data of 8 inflow rivers and daily level of water data of 6 outflow rivers, maximum wind direction, WT, DO, nitrates, nitrites, ammonium nitrogen, TN, TP, phosphate, diatom biomass, green algae biomass, cyanobacteria biomass, and Chl-α → Chl-α
 [65]Chaohu Lake (China), 3 MP, from 12/17/2020 to 03/31/2023WT, pH, EC, NTU, COD, NH3, TP, TN, precipitation, WL → DO
 [66]Erken Lake (Sweden), Müggelse Lake (Germany) Furesø Lake (Denmark), Mendota Lake (USA) and Ekoln Lake (Sweden), from 2004 to 2020 (Erken Lake and Müggelse Lake), 1990 to 2017 (Furesø Lake), 1999 to 2015 (Mendota Lake), and 1987 to 2019 (Ekoln Lake)River discharge (Metrics), AT, Air Pressure, precipitation, WS, humidity, Shortwave radiation, Cloud Cover, delT, Accumulated bottom water temperature over 10 days, Ice duration, Days from ice-off date, MLD, Wn, thermD, Water treatment, Invasive species (Binary), Daphnia, Accumulated phosphate from river loading, Dissolved organic nutrients from river loading, Inflow temperature → DO
 [67]Blelham Tarn Lake (English Lake District, UK), from 1987 to 2017 (except 2001)Ammonium, nitrate, surface oxygen saturation, SRP, dissolved reactive silica, surface WT, phytoplankton Chl-α, TP, mean daily WS (Knots), relative humidity, cloud amount (Oktas), AT, and rainfall → Chl-α
 [68]Vrana Lake (Cres Island, Croatia), from 1954 to 2022Rainfall, water supply pumping, WL, evaporation, losses → WL
 [69]Dongting Lake (China), from May 2020 to October 2023Temperature, pH, DO, permanganate index (CODMn), AN, TP, TN, EC, turbidity, Chl-α, Cyanobacterial cell density → HABs
Table 2. Forecasting model, evaluation metrics, and WQ variables for models without multiple forecast horizon evaluation.
Ref. — Forecasting to the Nearest Forecast Horizon | Ref. — Forecasting to the Nearest Forecast Horizon
 [19]Multi-task learning approach model—(RMSE, std)—TP (0.766, 0.021), Chl-α (0.819, 0.014), TN (0.567, 0.032), Secchi depth (0.559, 0.002)[20]M5 model tree—(R, MAE)—DO ( 0.929 , 1.376), pH ( 0.718 , 0.480 ), EC ( 0.684 , 23.946 ), Turbidity ( 0.784 , 85.312 )
 [21]Cyanobacteria Bloom—CBOPMM— R 2 = 0.9237[24]PSO+KNN—Accuracy (%)—COD (100), DO (95.2), NH3-N (93.63), TP (92.06);
 [25]DO—PSO-GA-BPNN— R 2 = 0.9276 , APE (%) = 16.2661, MAPE (%) = 6.7219, RMSE = 0.3596[26]Planktothrix rubescens (Chl-α)—RF— R 2 = 0.9 , adjusted R 2 = 0.9 , τ (Kendall) = 0.87, ρ (Spearman) = 0.97, Pseudo- R 2 = 0.93
 [27]ARIMA-ANN—(ME, RMSE, MAE, MAPE)—AT (−0.003, 1.509, 1.17, 4.657), WT (−0.0001, 0.911, 0.724, 2.886), Humidity (0.008, 6.021, 4.612, 6.739), EC (0.002, 8.141, 6.094, 3.293), Salinity (−0.008, 4.54, 3.238, 3.341), Hardness (−0.001, 4.369, 3.189, 5.395), TDS (−0.008, 8.107, 6.083, 4.539), Zooplankton (−0.012, 7.006, 5.421, 15.421)[28]Cyanobacterial Harmful Algal Blooms—Hierarchical Bayesian spatiotemporal modeling approach—AUC = 0.89, Sensitivity = 0.82, Specificity = 0.82, Accuracy = 0.82
 [29]Pb-DOM—Chemical Speciation Model—Δcentroid (%) = −4, RPDavg (%) = 5.5, RPDgeom (%) = 2.8, R = RPDavg/RPDgeom = 1.9, SSR = 1.4 × 10−14[30]E. coli—MLR—RMSE (from 0.29 to 0.44)
 [32]Chloride concentration—QRF— R 2 = 0.94 , RMSLE = 0.41[33]Microcystin concentration action-level exceedances—MLR— R 2 = 0.88
 [34]KOC—TPS—RMSE = 0.19[35]E. coli concentration—Light GBM—Precision = 0.89, Recall = 0.66, Accuracy = 0.88, F1-score = 0.76
 [36]DOC—First-order exponential decay based on the classical multi-G model—LooIC = 3.57[37]Chl-α—SARIMA— R 2 = 0.52 , AIC = 628.91, BIC = 666.78
 [38]Gaussian BN—( R 2 , RMSE, Classification error (%))—TP ( 0.32 , 3.98, 33), Chl-α ( 0.30 , 4.76, 34), Colour ( 0.72 , 8.75, 23), Cyano ( 0.14 , 1.92, 31)[40]FLARE framework—DO (16-day horizon)—No evaluation metrics are used
 [41]RF—( R 2 , RMSE)—MCs (0.83, 0.68), CYN (0.89, 0.45)[42]RF—( R 2 , RMSE, MAE)—Alkalinity (0.6, 0.142, 0.104), Chl-α (0.28, 0.396, 0.319), Color (0.55, 0.285, 0.204), pCO2 (0.36, 0.288, 0.209), pH (0.5, 0.024, 0.036), Secchi Depth (0.38, 0.227, 0.167), TN (0.33, 0.207, 0.16), TP (0.38, 0.431, 0.329)
 [43]Cyanobacterial blooms—Multivariate Logistic Regression— R 2 = 0.911[44]TOC—SELM— r = 0.84
 [45]TP—XGBoost— R 2 = 0.82, RMSE = 0.0072, Bias (from 0.00021 to 0.82)[47]TSS—MLR— R 2 = 0.96, RMSE = 3.01
 [48]Chl-α—RF—MAE = 0.986, RMSE = 1.181[49]Chl-α—LSTM— R 2 = 0.936, MAE = 0.247, MSE = 0.098, RMSE = 0.314, MaxError = 0.552
 [50]DOC—GPR—RMSE = 4.08 , MAE = 1.90, Bias = −0.69, R 2 = 0.61 [52]WT—Air2stream— r = 0.88 , MAE = −0.039, RMSE = 1.11
 [54]cyanophyte cell counts—XGBRF—RMSE = 82.2896, RMSLE = 1.6291, MSE = 6771.578, MAE = 33.6925[55]Cyanobacterial Harmful Algal Blooms—NN—AUC = 0.940, accuracy = 0.861, sensitivity = 0.857, specificity = 0.857, LR+ = 5.993, and 1/LR– = 5.993
 [56]O2—Hypolimnetic O2 depletion model— R 2 = 0.6 , RMSE = 4.50[57]EC—WA method— R 2 = 0.89 , Jackknife = 0.78
 [58]Algal Microcystis and Dolichospermum—SVM—Mean Accuracy (%) = 83.3, Mean Precision (%) = 82.6, Mean Recall (%) = 58.9[59]WQI—RF— R 2 = 0.97, RMSE = 3.4, MAE = 1.8
 [60]Chl-α—INLA—Accuracy = 0.90, Sensitivity = 0.92, Specificity = 0.90, Precision = 0.43[61]Chl-α—DGLM— R 2 = 0.98, MAE = 0.13, Max Error = 0.43, RMSE = 0.16, MSE = 0.03
 [63]WQI—Cutting-edge optimization techniques with DCGNNs—RMSE = 2.3, MAE = 1, MSE = 6.7, R 2 = 0.3 , Accuracy = 99%, Precision = 98%, Recall = 97%, F1-Score = 94%[65]DO—AC-BiLSTM— R 2 = 0.877 , MSE = 0.191, MAE = 0.315, RMSE = 0.476
 [66]2-step mixed ML model workflow (GBR with LSTM)—( R 2 , NMAE)—Surface DO ( R 2 > 0.6 , NMAE < 0.1), Bottom DO ( R 2 > 0.8 , NMAE < 0.1%)[69]CyanoHAB blooms—iTransformer model—MSE = 0.139, MAE = 0.238
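The studies in Tables 2 and 3 report their results mainly through RMSE, MAE, MAPE, and R 2. These metrics follow their standard definitions; the minimal Python sketch below (illustrative helper functions, not taken from any reviewed study) makes the formulas behind the reported numbers explicit.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean absolute percentage error (%); undefined when a true value is 0."""
    return 100 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination (R^2)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot
```

Because RMSE, MAE, and MAPE are scale-dependent while R 2 is not, values in the tables are only comparable across studies when the target variable and its units match.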
Table 3. Forecasting model, evaluation metrics, and WQ variables for models with multiple forecast horizon evaluations.
Ref. — Forecasting to the Nearest and Farthest Forecast Horizon
 [22]DO, NH, TP, TN— 3D hydrodynamic EcoLake Model—1 to 6 days: for all four parameters, the RMSE range is from 0.06 to 1.28, and the MAE range is from 0.04 to 0.98
 [23]Chl-α—( R 2 (%))—GAM—1 month: ≈82; 3 months: <75; 1 day: <80; 60 days: ≈60
 [31]WT—(RMSE, CRPSforecast, CRPSnull, CRPSforecast skill, RMSE, CI reliability)—FLARE—1 day: (0.52, 0.29 , 0.34 , 0.16 , 0.07, 90); 16 days: (1.62, 0.92 , 1.93 , 0.53 , −0.03, 90)
 [39]COD (RMSE, MAE, MAPE (%))—IVB—1 day: (0.07, 0.05, 2.21); 7 days: (0.3, 0.19, 8.18)
 [46]CyanoHAB—( R 2 , SD, shape parameter, skew parameter)—RF—10 days: (0.62, 0.40, 0.19, 1.03); 30 days: (0.57, 0.45, 0.25, 1.03)
 [51]LWL—(RMSE, MAPE (%))—ANN—1 day: (0.0131, 0.09); 120 days: (0.481, 2.19)
 [53]DO—( R 2 , RMSE, MAPE, MAE)—CBiLSTM-AT—1/6 day: ( 0.9923 , 0.2219, 0.0201, 0.1543); 28 days: DO (0.6901, 1.3972, 0.1404, 1.1471)
 [16]DO—( R 2 , MSE, MAE)—ConvLSTM—1 h: (0.9762, 0.2119, 0.3634); 12 h: (0.9685, 0.2771, 0.3837)
 [62]DO and Temperature—(CRPS)—FLARE—1 day: DO (<0.6) and T (<0.27); 35 days: DO (<0.67) and T (1.08)
 [64]Chl-α—( R 2 , RMSE, KLD)—LSTM—1 day: (≈0.95, < 1, ≈0.005); 7 days: ( 0.832 , 2.673 , 0.028 )
 [67]Chl-α—(RMSE, MAPE (%))—RF—14 days (with 5 years data): ( < 8.5, ≈<44); 28 days (with 5 years data): ( < 11, < 59)
 [68]Chl-α—(MSE, RMSE, MSRE, Bias (%), RSR, R 2 )—MLP—1 month: (0.021, 0.145, 0.021, −0.245, 0.187, 0.984 ); 6 months: (0.793, 0.890, 0.753, 0.152, 1.164, 0.105 )
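Table 3 gathers studies that evaluate the same model at several forecast horizons, where error typically grows from the nearest to the farthest horizon. A common way to obtain multi-step forecasts from a one-step model is iterated (recursive) rollout with walk-forward scoring; the sketch below (hypothetical helper names, not any reviewed study's code) illustrates that evaluation scheme.

```python
import math

def rollout(step_model, history, horizon):
    """Iterated (recursive) multi-step forecasting: each one-step
    prediction is fed back into the input window to reach horizon h."""
    window = list(history)
    preds = []
    for _ in range(horizon):
        p = step_model(window)
        preds.append(p)
        window = window[1:] + [p]  # slide the window forward
    return preds

def horizon_rmse(series, step_model, lags, horizons):
    """Walk-forward RMSE of the h-step-ahead prediction for each h,
    using every admissible forecast origin in the series."""
    scores = {}
    for h in horizons:
        sq_errs = []
        for t in range(lags, len(series) - h + 1):
            pred = rollout(step_model, series[t - lags:t], h)[-1]
            sq_errs.append((series[t + h - 1] - pred) ** 2)
        scores[h] = math.sqrt(sum(sq_errs) / len(sq_errs))
    return scores

# A persistence (naive) baseline: the next value equals the last observed one.
persistence = lambda window: window[-1]
```

On a trending series, the persistence baseline degrades as the horizon lengthens, which is the same nearest-versus-farthest pattern reported in Table 3; comparing a learned model against such a baseline at each horizon shows where its skill actually lies.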
Table 4. Articles comparing their models with others or adding other methods.
Ref. — Main/Best Model — Type — Additional Models
 [20]M5 model tree algorithmComparingCubist, RF, and GBM algorithm
 [23]GAMComparingRF
 [24]PSO+KNNComparingCART
 [25]BP NNCombiningGA, PSO
 [27]ARIMACombiningANN
 [29]Chemical Speciation ModelComparing with own and other modelsOwn models: Six approaches to model Pb-DOM binding (SHM, Lit K-f (CCompADOC), Exp K-f (CCompADOC), Lit K-f (CCompHA), Exp K-f (CCompHA), K-CADOC); Other models: NICA-Donnan and WinHumicV
 [35]Light GBMComparingXGBoost, CatBoost, RF, CT, MLR, PLS, SPLS, BN, ensemble stacking model
 [36]First-order exponential decay based on the classical multi-G modelComparingSame model with different predictor variable units
 [38]GBNComparingBN, seasonal Naïve forecast
 [39]BLSTMComparing and combiningCombining: CEEMDAN, VMD, Improved CEEMDAN; Comparing: SVR, ELM, LSTM
 [44]SELMComparingMLR
 [45]XGBoostComparingSVR, BP, RF and Empirical Method
 [46]RFComparingEA, LASSO, ANN
 [48]RFRComparingSVR, LSTM, GRU
 [49]LSTMComparingSARIMAX, RNN
 [50]GPRComparingSVR, RFR, MBPNN
 [51]ANNComparingRNN (LSTM, GRU, Bidirectional LSTM, Stacked LSTM)
 [53]CBiLSTM-ATComparing and CombiningComparing: BP, LSTM, BiLSTM, BiLSTM-AT; Combining: VMD, F, CVMD, WOA
 [54]XGBRFComparingPRM, NBRM, ZIPM, ZINBM, GAPM, GANBM
 [55]NNComparingXGBoost, Logistic Regression
 [59]RFComparingDNN, GBM
 [60]INLAComparingSVC, RF, DNN, LSTM, RNN, GRU
 [16]LSTMComparing and CombiningCombining: CNN; Comparing: ConvLSTM, CNN-GRU
 [61]DGLMComparingLSTM, SARIMAX
 [63]DCGNNComparingLinear Regression, MLP regressor, SVM, RF
 [64]BMAComparingLSTM, RF, SVM
 [65]BiLSTMAddingCNN, AM
 [66]LSTMCombining and ComparingGBR
 [67]RFComparingSVM, MLP, GRU, Ridge Regression
Table 5. Monitoring conditions and output time-step.
Ref. — Monitoring Frequency — Monitoring Moment — Forecasting Frequency
 [19]Annually (6 parameters), Seasonally (3 parameters), and monthly (2 parameters)--
 [20]Daily6:00, 9:00, 16:00 and 22:00-
 [22]Hourly (meteorological and hydrological data), Monthly (before 2016 and after November 2017 for WQ parameters), once a week (from April to November for WQ parameters)--
 [23]Monthly (2003 to 2017), 4 h (2017)--
 [26]Bi-monthly, except for winter period (monthly)--
 [27]Fortnightly--
 [28]Weekly15:20–16:00-
 [30]Hourly (for weather data), once a week, and five days per week--
 [31]Hourly, daily (for meteorological data at 00:00 hours)-daily step
 [33]Twice a month to twice a week (May to November 2016 and 2017), more frequent during July to September--
 [36]Once in summer (January or February 2016–2017), every 6 h (for DO just for two days), every 2–4 days (for DOC just for 18 days)--
 [37]Monthly--
 [38]5 to 10 times per year (until 2004) and increasing to around 25 (until 2013 for TP, Chl-α, and color), Hourly (just for Hobøl River discharge), 6 to 8 times a year (for Cyanobacterial before 2004), weekly (for other variables between 2005 and 2014), fortnightly (thereafter)--
 [41]Weekly or Bi-weekly--
 [43]Monthly--
 [46]--10 days
 [51]Daily--
 [53]4 h--
 [54]7 days--
 [58]Once a month (for TN and TP), once an hour (for inflow and discharge of water), every 10 min (meteorological data)--
 [60]Daily (AT), Weekly (WT from surface, precipitation, snow/ice mask shapefiles)-Weekly
 [64]Daily and Monthly (for WQ data)-4 h
 [65]4 h and hourly (just for precipitation, WL, and flow velocity data)--
 [66]Hourly (for surface DO) and daily (for the other parameters)-7 days
 [67]Daily (for meteorological data)-Fortnightly
 [68]Monthly-Monthly (1, 2, 4, and 6 months)
 [69]4 h08:00-
Table 6. Articles presenting a configuration of their forecasting models.
Ref. — Observations
 [25]Details the BPNN structure (input, hidden, and output layer), the maximum number of iterations, the threshold error precision, and the learning rate. The PSO and GA algorithms detail the population size, the number of generations, and the learning factors.
 [27]Shows the configuration ( p , d , q , P , D , Q ) for each ARIMA model and (input–hidden–output) for each ANN.
 [35]Presents all the hyperparameters tuned for each forecasting model.
 [43]Gives the equations for the predictive model of cyanobacteria cells and each environmental factor.
 [48]Provides the settings (with numerical values in the cases that correspond) for each forecasting model.
 [50]Shares all the hyperparameters tuned for each forecasting model.
 [51]Introduces the optimized hyperparameter values (neuron number, epoch, batch size, number of layers, and prediction period) for the ANN and the four RNNs implemented.
 [55]Shows the parameters tuned and the final hyperparameters selected for each model.
 [65]Presents the model parameters.
 [67]Gives the values for each hyperparameter obtained for each of the five forecasting models.
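Several of the studies in Table 6 report hyperparameter values obtained through systematic tuning (e.g., neuron number, epochs, batch size in [51]). A minimal grid-search sketch (illustrative function names and search space, not any reviewed study's actual configuration) shows the general procedure behind such reported settings.

```python
from itertools import product

def grid(search_space):
    """Enumerate every combination in a hyperparameter search space."""
    keys = list(search_space)
    for values in product(*(search_space[k] for k in keys)):
        yield dict(zip(keys, values))

def tune(search_space, score_fn):
    """Return the configuration that minimizes a validation score
    (e.g., RMSE on a held-out monitoring period)."""
    return min(grid(search_space), key=score_fn)

# Illustrative search space echoing the kinds of values reported in
# Table 6 (neuron number, epochs, batch size); not any study's real grid.
example_space = {"neurons": [32, 64], "epochs": [50, 100], "batch_size": [16, 32]}
```

Reporting both the search space and the winning configuration, as the nine articles in Table 6 do to varying degrees, is what makes a forecasting model replicable.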

Share and Cite

MDPI and ACS Style

García-Guerrero, J.C.; Álvarez-Alvarado, J.M.; Carrillo-Serrano, R.V.; Palos-Barba, V.; Rodríguez-Reséndiz, J. A Review of Water Quality Forecasting Models for Freshwater Lentic Ecosystems. Water 2025, 17, 2312. https://doi.org/10.3390/w17152312

