A Review of Hybrid Soft Computing and Data Pre-Processing Techniques to Forecast Freshwater Quality’s Parameters: Current Trends and Future Directions

Abstract: Water quality has a significant influence on human health. As a result, water quality parameter modelling is one of the most challenging problems in the water sector. Therefore, the major factor in choosing an appropriate prediction model is accuracy. This research aims to analyse hybrid techniques and pre-processing data methods in freshwater quality modelling and forecasting. Hybrid approaches have generally been seen as a potential way of improving the accuracy of water quality modelling and forecasting compared with individual models. Consequently, recent studies have focused on using hybrid models to enhance forecasting accuracy. The modelling of dissolved oxygen is receiving more attention. From a review of relevant articles, it is clear that hybrid techniques are viable and precise methods for water quality prediction. Additionally, this paper presents future research directions to help researchers predict freshwater quality variables.


Introduction
The growing scarcity of fresh, clean water is one of the most pressing concerns confronting civilization in the twenty-first century [1]. Recent research indicates that climate change will have a significant impact on freshwater supplies due to the probable reduction in rainfall [2]. In addition to projected droughts in various river basins throughout the world due to climate change, several studies have shown potential water quality (WQ) degradation due to dilution or concentration of soluble chemicals [3]. Additionally, multiple studies have indicated that pollution has a negative impact on freshwater resources in general [2]. The decline in river WQ has irreversible consequences for the environment and human health, as more than one billion people do not have access to clean potable water [4]. Hence, it is necessary to forecast WQ in order to anticipate how it will change over time. Forecasting future variations in WQ is also very important for intelligent aquaculture control and is useful for estimating future supply. Robust, reliable, and flexible models are therefore critically needed [5].
Conventional approaches for time series analysis, such as auto-regressive integrated moving average (ARIMA; abbreviations are collected in Table S1 in the Supplementary Materials) and multiple linear regression (MLR) models, have been shown to be limited in accurately determining WQ due to the complexity of WQ time series. Machine learning (ML) methods such as artificial neural networks (ANN) [6-8], support vector machines (SVM) [9,10], deep neural networks (Deep NN) [11], and k-nearest neighbours (KNN) [12] have also been applied to simulate WQ [13]. Artificial intelligence (AI) techniques often achieve better results than traditional models because of their ability to deal with non-linear and complex properties [14,15]. Additionally, combined techniques have been widely employed for WQ modelling because they can improve forecasting accuracy relative to standalone models [16]. The increasing trend in applying hybrid ML methods in recent years is shown in Figure 1.
Environments 2022, 9, x FOR PEER REVIEW
Table 1. Summaries of related review papers.

Reference | Keywords | Summary
[19] | River water quality, state of the art, literature assessment and evaluation, AI, hybrid model | A review of ANN techniques for environmental issues prediction
[17] | AI, ANFIS, ANN, river, water quality | AI for surface water quality monitoring and assessment: a systematic literature analysis
[18] | Pollutant, sediment load, ML tool, ANN, discharge prediction | Applications of IoT and AI in Water Quality Monitoring and Prediction: A Review
[5] | ANNs, feed-forward, recurrent, hybrid, water quality prediction | A Review of the ANN Models for Water Quality Prediction
[20] | Water quality criteria, climate change, urbanisation, eutrophication, best management practices, critical source areas, water quality index, ML algorithms, remote sensing | Water quality prospective in Twenty First Century: Status of water quality in major river basins, contemporary strategies and impediments: A review

Additionally, several other review papers have introduced applications of soft computing to forecast WQ [5,15,17-21]; their keywords and crucial aspects are summarised in Table 1.
The literature on WQ forecasting can be viewed from a variety of perspectives. Emphasizing the supply side of the problem, Tiyasha et al. [19] reviewed papers on AI applications for river WQ prediction, including ANN, kernel-based, and fuzzy-based models, complementary models, and hybrid models. Their review also covered model architecture, input variability, performance criteria, regional generalisation investigation, and a comprehensive evaluation of the progress of AI approaches in river quality research. Han and Wang [15] published a study on how an ANN model can estimate WQ dynamics, comparing it with other approaches such as the radial basis function neural network (RBFNN), long short-term memory (LSTM), and convolutional neural network (CNN) to find precise outcomes and explain their benefits. The study also examined how many parameters were predicted and which countries used the ANN model. Ighalo et al. [17] reviewed papers on neural networks, WQ parameters, location of study, and model accuracy. Mustafa et al. [18] gave an overview of the internet of things (IoT) in WQ monitoring and briefly explained an ANN model with its advantages, limitations, and recent applications. Chen et al. [5] focused on ANN models and basic model architectures in WQ forecasting, such as feed-forward, recurrent, and hybrid structures, in addition to data collection, output strategy, input selection, data division, and data pre-processing (normalisation, missing data imputation, data correction, and abnormal data handling). Giri [20] presented a holistic assessment of WQ decline in key river basins worldwide. In addition, nine modern methods were covered, including field-scale assessment, optimisation strategies for placement of best management practices, a social component in watershed modelling, ML algorithms to address WQ issues in complex natural systems with spatial heterogeneity, and remote sensing in monitoring WQ. The existing constraints on improving WQ were then divided into major and secondary barriers. Rajaee et al. [14] reviewed different kinds of single and combined AI approaches for the prediction of WQ in rivers, including ANNs, fuzzy logic (FL), genetic programming (GP), SVM, hybrid ANN-ARIMA, hybrid genetic algorithm-neural networks (GA-NN), hybrid neuro-fuzzy (NF), and wavelet-based combined techniques such as wavelet-neuro fuzzy (WNF), wavelet-neural networks (WANN), wavelet-support vector regression (WSVR), and wavelet-linear genetic programming (WLGP) models.
Despite these comprehensive surveys of recent applications of AI methods in the WQ field, few researchers have examined hybrid algorithms and how they work step by step and in detail. This review therefore focuses on hybrid ML techniques and their classification power, including data pre-processing methods. The reason to study these hybrid models in detail is that they have several advantages, such as (a) enhanced predictive performance due to increased capacity for pattern detection and simulation, (b) reduced risk of employing a sub-optimal technique (if used in isolation), and (c) a simplified procedure for model choice due to the utilisation of various components [21]. Hajirahimi and Khashei [22] classified hybrid models into several categories and explained the unique characteristics of the models. Based on this literature review, the goal of the paper is to categorise the hybrid models suggested for WQ modelling and forecasting into four main classes: components combination-based hybrid models (CBH), parameter optimisation-based hybrid models (OBH), pre-processing-based hybrid models (PBH), and hybridisation of hybrid models.

Water Quality Parameters
The nature and amount of industrial, agricultural, and other anthropogenic activity within a region's catchments considerably influence surface WQ [23]. WQ parameters are categorised into three primary groups: physical, chemical, and biological. The different WQ parameters that have been modelled are reported in this paper. Physical WQ parameters such as temperature (T), total dissolved solids (TDS), electrical conductivity (EC), salinity, and hydrogen ion concentration (pH) are often of concern. Dissolved oxygen (DO), chemical oxygen demand (COD), and biochemical oxygen demand (BOD) are examples of chemical parameters. Figure 2 shows the WQ parameters modelled in previous studies that used a hybrid model for prediction. Most studies have simulated the DO and EC parameters.

Machine Learning (ML)
ML has been applied for a long time and has received considerable attention over the last few years. It can handle a huge volume of data and capture non-linear structures by means of complex mathematical calculations [24]. ML methods are categorised as supervised or unsupervised learning. Supervised learning is employed to learn the underlying relationship between input and output values. Unsupervised learning, in contrast, gives the learning algorithms no labels or known outcomes [25]. Several ML approaches have been proposed for modelling WQ parameters. The ML models applied include ANN [10,26-28], adaptive neuro-fuzzy inference system (ANFIS) [7,29,30], support vector regression (SVR) [31-33], random forest (RF) [34,35], k-nearest neighbours (KNN) [36], Naive Bayes [37], decision tree (DT) [38,39], and extreme gradient boosting (XGB) [40]. The advantages and disadvantages of the most used ML techniques are summarised in Table 2.

Table 2. Advantages and disadvantages of the most used ML techniques.

Model | Advantages | Disadvantages | References
ANN |  | Over-parameterisation and overfitting difficulties are common in ANNs, especially when the approaches are based on optimal input selection, and the model is regarded as a black-box model. In addition, because no consistent principles govern proper ANN model development and construction, it is not easy to prioritise a suitable model. | [18,41,42]
ANFIS | It can be used when the system input data are confusing and imprecise. It can manage non-linear data series and allows the modelling process to have the least possible uncertainty level. | When the number of fuzzy rules grows, it might become computationally expensive and may risk overfitting. | [42-44]
SVR | Increased generalisation ability, unique and globally optimal structures, and the ability to be trained quickly. Flexibility is one of its strongest features, relying on several types of kernel functions such as linear, polynomial, and radial basis function (RBF) kernels. | Hyper-parameters such as the penalty factor, accuracy, and kernel function variance significantly affect the performance of the SVR model. | [45,46]
RF | It is able to manage large datasets with several features, and the accuracy of modelling improves when the number of trees increases. | The training process is slowed when using the model with a high number of trees. | [47,48]

Data Pre-Processing Techniques
Data pre-processing techniques are considered essential to the data mining process [49]. Data preparation is vital to ensuring that all predictors receive equal attention during the learning phase, and it helps speed up the procedure [50,51]. These methods foster high accuracy and low computational cost in the learning phase, because noisy and unreliable information in data records adversely affects the training stage and results in a poor model [49]. Data pre-processing consists of three approaches: normalisation, cleaning, and model input determination, as in Zubaidi et al. [52]. Previous studies used one or two pre-processing steps (Table S2 in the Supplementary Materials). Among the studies reviewed here, only 48% employed data normalisation, 53% utilised data cleaning, and 67% used best model input selection.

Data Normalisation
The goal of data normalisation is to bring each of the ANN model's inputs into the same range of values and to make the time series normally or nearly normally distributed, which aids the stable convergence of the weights and biases and limits the impact of noise [2].
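As an illustration of the two aims stated above, the following minimal sketch rescales inputs to a common range with min-max scaling and offers a log transform as one common way of reducing skew. The function names and the sample readings are hypothetical and are not taken from any of the reviewed studies.

```python
import math


def min_max_scale(values, low=0.0, high=1.0):
    """Rescale values linearly into the interval [low, high]."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant series has no range; map every point to the lower bound.
        return [low for _ in values]
    span = v_max - v_min
    return [low + (high - low) * (v - v_min) / span for v in values]


def log_transform(values):
    """Natural-log transform for strictly positive, right-skewed data."""
    return [math.log(v) for v in values]


# Hypothetical readings on very different scales (DO in mg/L, EC in uS/cm).
do_mg_l = [6.0, 8.0, 7.0, 10.0]
ec_us_cm = [250.0, 1250.0, 500.0, 750.0]

do_scaled = min_max_scale(do_mg_l)
ec_scaled = min_max_scale(ec_us_cm)
```

After scaling, both inputs lie in [0, 1], so neither dominates the weight updates simply because of its units.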

Data Cleaning
The cleaning step aims to identify and eliminate noise from raw data in order to reduce the error and improve the regression coefficient [2]. Data cleaning is required to discover and treat unwanted values, because noise and outliers negatively affect data analysis and, in turn, the performance of the suggested model [51,53].
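One simple way to treat outliers can be sketched as follows. The rule used here (a robust z-score based on the median absolute deviation, with flagged points replaced by the median) is an assumption chosen for illustration and is not the specific cleaning method of the cited studies. The threshold and the sample pH readings are likewise hypothetical.

```python
import statistics


def clean_outliers(values, threshold=3.0):
    """Replace points far from the median (in scaled MAD units) with the median."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median: nothing can be flagged reliably.
        return list(values)
    cleaned = []
    for v in values:
        # 1.4826 makes the MAD comparable to a standard deviation for normal data.
        robust_z = abs(v - med) / (1.4826 * mad)
        cleaned.append(med if robust_z > threshold else v)
    return cleaned


# Hypothetical pH record with one sensor spike.
raw_ph = [7.1, 7.3, 7.2, 14.9, 7.0, 7.4, 7.2]
clean_ph = clean_outliers(raw_ph)
```

Only the spike is replaced; the remaining readings pass through unchanged.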

Selecting Appropriate Descriptors
One of the most critical steps in data pre-processing is selecting the best model input [2]. The selection of explanatory factors influencing WQ metrics as model input data is vital in creating any successful model [54].
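A minimal sketch of input selection is given below, assuming a simple filter that ranks candidate predictors by the absolute Pearson correlation with the target and keeps those above a cut-off. The cut-off of 0.5, the variable names, and the sample data are hypothetical; the reviewed studies use a range of other selectors (for example PCA, GA, or Boruta).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def rank_inputs(candidates, target, min_abs_r=0.5):
    """Return predictors with |r| >= min_abs_r, strongest first, plus all scores."""
    scores = {name: pearson(series, target) for name, series in candidates.items()}
    kept = [name for name, r in scores.items() if abs(r) >= min_abs_r]
    kept.sort(key=lambda name: abs(scores[name]), reverse=True)
    return kept, scores


# Hypothetical candidate predictors for dissolved oxygen.
candidates = {
    "temperature": [10.0, 12.0, 14.0, 16.0, 18.0],
    "ph": [7.2, 7.0, 7.3, 7.1, 7.2],
    "ec": [300.0, 310.0, 305.0, 330.0, 340.0],
}
dissolved_oxygen = [11.0, 10.4, 9.9, 9.1, 8.6]

selected, scores = rank_inputs(candidates, dissolved_oxygen)
```

A correlation filter only detects linear association, which is one reason non-linear selectors are also used in the literature.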

Hybrid Models
A hybrid model combines two or more methods, one serving as the primary model and the others as pre- or post-processing approaches [2]. In recent years, combined models have arisen as a way to construct flexible and efficient models and to improve on the forecasting accuracy of individual algorithms [5,55]. Hybrid models can be classified into four types: components combination-based hybrid models (CBH), parameter optimisation-based hybrid models (OBH), pre-processing-based hybrid models (PBH), and hybridisation of hybrid models, as in Hajirahimi and Khashei [22]. The studies on each type of hybrid model are shown in Figure 3.

Components Combination Based Hybrid Models (CBH)
In this class, ML models are combined to compensate for the relative weaknesses of the individual models. CBH models aim to improve prediction performance by exploiting the distinct capacities of individual prediction models, regardless of the combination structure [22]. For example, Lola et al. [56] developed a combined technique to forecast daily WQ data (DO, water T, pH, and salinity) using ARIMA and ANN. Compared with stand-alone ARIMA and ANN, the experimental results demonstrate that the suggested model is a viable and effective strategy for increasing prediction precision, with high correlation coefficients and a decrease in error for all indicators of up to 87.87% in both mean absolute error (MAE) and root mean square error (RMSE).
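The idea of combining component forecasts can be sketched with a simple parallel structure in which two forecasts are averaged with weights inversely proportional to their mean squared errors. This weighting scheme is an assumption chosen for illustration and is not the combination structure of Lola et al. [56]. The forecast values are hypothetical, and the weights are computed on the same data only for brevity; in practice a separate validation period would be used.

```python
def mse(observed, predicted):
    """Mean squared error between two equal-length series."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def inverse_error_weights(observed, forecasts, eps=1e-12):
    """Weight each component forecast by the inverse of its MSE (weights sum to 1)."""
    inverse = [1.0 / (mse(observed, f) + eps) for f in forecasts]
    total = sum(inverse)
    return [w / total for w in inverse]


def combine(forecasts, weights):
    """Weighted average of the component forecasts at every time step."""
    n = len(forecasts[0])
    return [sum(w * f[t] for w, f in zip(weights, forecasts)) for t in range(n)]


# Hypothetical DO observations and two component forecasts with opposite biases.
observed = [8.0, 7.5, 7.0, 7.4]
linear_forecast = [8.4, 7.9, 7.4, 7.8]     # stands in for a linear model such as ARIMA
nonlinear_forecast = [7.8, 7.3, 6.8, 7.2]  # stands in for a non-linear model such as ANN

weights = inverse_error_weights(observed, [linear_forecast, nonlinear_forecast])
hybrid_forecast = combine([linear_forecast, nonlinear_forecast], weights)
```

Because the two components err in opposite directions, the weighted combination has a lower error than either one alone, which is the effect CBH models try to exploit.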

Barzegar et al. [57] investigated the predictive capability of two single deep learning (DL) models, LSTM and CNN, along with their combined CNN-LSTM technique, to forecast short-term WQ. Two conventional ML methods, SVR and DT, were also used, and their results were compared with those of the DL models. Various statistical criteria were considered to assess the models. The results show that both DL models perform similarly in predicting chlorophyll-a (Chl-a), and LSTM is better than CNN for simulating DO. Generally, the combined CNN-LSTM technique was superior to the LSTM, CNN, SVR, and DT models, and it was able to simulate the high and low levels of WQ parameters, especially for the DO concentration. Similarly, Baek et al. [58] suggested a composite model combining LSTM with a DL model to forecast the water level (WL) and quality parameters (total phosphorus, TP; total nitrogen, TN; total organic carbon, TOC). The outcomes showed that the hybrid model's performance was more precise according to the Nash-Sutcliffe efficiency (NSE).
Yan et al. [59] suggested using one-dimensional residual convolutional neural networks (1-DRCNN), bi-directional gated recurrent units (BiGRU), GRU, LSTM, and a combined 1-DRCNN-BiGRU model to forecast TN, TP, and the potassium permanganate index (CODMn). The outcomes demonstrate that the combined technique has greater forecasting precision and generalisation in predicting WQ than the standalone models (LSTM, GRU, and BiGRU) based on statistical metrics such as MAPE and the determination coefficient (R²).
Hien Than et al. [60] investigated the LSTM-MA model to forecast DO, pH, COD, BOD, TSS, Tur, ammonia nitrogen oxidation-reduction potential (NH3-NL), and coliform variables and to classify WQ. The combined LSTM-MA approach was employed to classify WQ, and the model proved dependable and effective. The results revealed that LSTM-MA was superior to the ARIMA, NAR, NAR-MA, and LSTM models according to the RMSE. According to these studies, combined approaches can be customised by coupling two ML models to suit researchers' needs.
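The statistical criteria named in this section (MAE, RMSE, MAPE, NSE, and R²) can be computed as in the minimal sketch below. R² is taken here as the squared Pearson correlation between observed and simulated values, which is one common convention and an assumption on our part, since the reviewed studies do not all state their definition. The sample values are hypothetical.

```python
import math


def mae(obs, sim):
    """Mean absolute error."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def rmse(obs, sim):
    """Root mean square error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mape(obs, sim):
    """Mean absolute percentage error (observations must be non-zero)."""
    return 100.0 * sum(abs((o - s) / o) for o, s in zip(obs, sim)) / len(obs)


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    numerator = sum((o - s) ** 2 for o, s in zip(obs, sim))
    denominator = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - numerator / denominator


def r_squared(obs, sim):
    """Squared Pearson correlation between observed and simulated values."""
    n = len(obs)
    mean_o, mean_s = sum(obs) / n, sum(sim) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return cov ** 2 / (var_o * var_s)


# Hypothetical observed and simulated concentrations.
observed = [2.0, 4.0, 6.0, 8.0]
simulated = [2.5, 3.5, 6.5, 7.5]

metrics = {
    "MAE": mae(observed, simulated),
    "RMSE": rmse(observed, simulated),
    "MAPE": mape(observed, simulated),
    "NSE": nse(observed, simulated),
    "R2": r_squared(observed, simulated),
}
```

NSE penalises bias as well as scatter, whereas the correlation-based R² does not, so the two can rank models differently.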

Parameter Optimisation-Based Hybrid Models (OBH)
Metaheuristics are commonly employed in WQ forecasting models to tune the parameters of other approaches, estimate the coefficients of a function, or train an intelligent agent; they are methods for finding a good (near-optimal) answer at a reasonable computational cost [61].
Numerous approaches and algorithms have been developed to allow AI modellers to apply computing systems in hydrology, for example in predicting and optimising storage systems. The tasks are becoming more complex as the management of water resources expands to a broader scope and must deal with the uncertainties of climate change, among other factors. Aside from AI models, other areas of research include optimisation algorithms and so-called evolutionary computing approaches, which can be used as a single algorithm for forecasting or combined with traditional methods to create a hybrid model.

Particle Swarm Optimisation (PSO)
PSO is an iterative computational search and optimisation tool [49]. It is inspired by social behaviour in animal societies, such as flocks of birds or schools of fish. The technique uses a swarm of particles, each of which represents a potential solution [47]. PSO evolves according to two significant aspects of a bird flock's movement: velocity and position [62]. It is applied to obtain the forecasting model coefficients that give the lowest error between measured and forecasted values. It has recently been used effectively to select optimal solutions in various fields, such as intelligent agriculture [63], WL [64], streamflow [62,65], drought [66], and WQ [67,68].
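The velocity and position updates described above can be sketched as follows. This is a minimal global-best PSO under assumed settings (inertia weight 0.7, acceleration constants 1.5, 30 particles, 200 iterations), used here to fit the two coefficients of a toy linear forecast by minimising RMSE. The data, parameter values, and function names are hypothetical and do not reproduce any of the reviewed models.

```python
import math
import random


def rmse_of_line(params, xs, ys):
    """RMSE between measured ys and the forecast a * x + b."""
    a, b = params
    return math.sqrt(sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def pso(objective, dim, bounds, n_particles=30, n_iter=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise objective over a box with a basic global-best particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to personal best + pull to global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Position update, clipped to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


# Toy data generated from y = 2x + 1 (hypothetical).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]

best_params, best_error = pso(lambda p: rmse_of_line(p, xs, ys), dim=2, bounds=(-10.0, 10.0))
```

In an OBH model the same loop would tune the parameters of the forecasting technique (for example ANFIS membership parameters) instead of the two line coefficients used here.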
The study in [67] adopted two AI methods, ANFIS and ANFIS-PSO. The results showed that both models are highly effective for forecasting inorganic indicators of WQ. However, the ANFIS-PSO approach was more flexible in modelling than the standalone ANFIS approach according to the performance criteria (i.e., MRE%, MAE, RMSE, R, and t statistics).
Azad et al. [68] applied the ANFIS model in conjunction with PSO and ant colony optimisation for continuous domains (ACOR) to predict WQ parameters. The ANFIS approach, which uses least squares and gradient descent as training algorithms, was applied and compared with ANFIS-PSO and ANFIS-ACOR. The research revealed that ANFIS-PSO was the best model for forecasting the EC, TDS, TH, sodium adsorption ratio (SAR), and carbonate hardness (CH) parameters. Thus, PSO may be a suitable strategy for optimising and training the aforementioned technique.
Shah et al. [69] proposed a hybrid feed-forward neural network (PSO-FFNN) and a hybrid gene expression programming model (PSO-GEP) to forecast DO and TDS levels. The most important input factors for TDS and DO forecasting were determined using principal component analysis (PCA). The results show that the PSO-GEP model outperforms the PSO-FFNN model in precision according to the statistical metrics.

Genetic Algorithm (GA)
GA is a robust and powerful optimisation method based on natural selection and evolutionary principles [28]. It was inspired by natural processes of biological evolution and has been widely employed to generate high-quality solutions to optimisation problems [70]. In the late twentieth century, genetic algorithms found their way into the field of hydrology [47]. GA has been applied in several areas, such as water flow [71,72] and WQ [73,74].
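The selection, crossover, and mutation cycle that underlies GA can be sketched as below. This is a minimal real-coded GA under assumed design choices (tournament selection of size 3, arithmetic crossover, Gaussian mutation, and one elite individual per generation) applied to the same kind of toy coefficient-fitting problem; none of these settings are taken from the cited studies.

```python
import math
import random


def rmse_of_line(params, xs, ys):
    """RMSE between measured ys and the forecast a * x + b."""
    a, b = params
    return math.sqrt(sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def ga_minimise(objective, dim, bounds, pop_size=40, n_gen=150,
                mut_rate=0.2, mut_sigma=0.2, seed=0):
    """Minimise objective with a simple real-coded genetic algorithm."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    history = []  # best objective value found in each generation
    for _ in range(n_gen):
        ranked = sorted(pop, key=objective)
        history.append(objective(ranked[0]))
        new_pop = [ranked[0][:]]  # elitism: the best individual always survives
        while len(new_pop) < pop_size:
            # Tournament selection: the fitter of three random individuals wins.
            parent1 = min(rng.sample(pop, 3), key=objective)
            parent2 = min(rng.sample(pop, 3), key=objective)
            # Arithmetic crossover: a random convex blend of the two parents.
            alpha = rng.random()
            child = [alpha * g1 + (1 - alpha) * g2 for g1, g2 in zip(parent1, parent2)]
            # Gaussian mutation applied gene by gene, clipped to the bounds.
            child = [
                min(hi, max(lo, g + rng.gauss(0.0, mut_sigma)))
                if rng.random() < mut_rate else g
                for g in child
            ]
            new_pop.append(child)
        pop = new_pop
    best = min(pop, key=objective)
    return best, objective(best), history


# Toy data generated from y = 2x + 1 (hypothetical).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]

best_params, best_error, history = ga_minimise(
    lambda p: rmse_of_line(p, xs, ys), dim=2, bounds=(-10.0, 10.0)
)
```

When GA is coupled with a network such as LSTM, each individual would instead encode design choices (for instance the time window and number of memory cells), and the objective would be the validation error of the trained network.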
Stajkowski et al. [74] used the GA-LSTM technique to forecast river water temperature (WT), with an RNN model as a benchmark to check the robustness of the suggested technique. The goal of using GA is to improve the ANN design process. The results showed that the GA-LSTM model outperformed the RNN and that the fundamental issue of identifying the ideal time frame and number of memory cell units was overcome. According to the findings, GA-LSTM can be applied as an advanced DL approach for time series analysis.
Azad et al. [73] implemented GA, ACOR, and differential evolution (DE) to improve the performance of ANFIS. The most appropriate inputs for each model were first determined using sensitivity analysis, and then all of the quality characteristics were forecast using the aforementioned models. The most suitable model for simulating EC and TH was ANFIS-DE, but both the ANFIS-DE and ANFIS-GA techniques showed improved performance compared with ANFIS in forecasting river WQ parameters.
Jin et al. [75] investigated a hybrid approach that couples an improved genetic algorithm (IGA) with a back-propagation neural network (BPNN) to forecast variations in surface WQ for real-time early warning, covering the NH3-N, TURB, and EC parameters. IGA optimises the initial weight parameters and prevents the evolved method from converging to a local optimum. BPNN is used to adjust suitable connection structures and find the features of WQ variation. The findings revealed that the resulting AI technique could significantly increase forecasting accuracy and dependability and provide effective real-time early warnings for emergency response. The proposed model outperformed BPNN according to statistical criteria.

Other Optimisation Algorithms
The firefly algorithm (FFA) proposed by Yang [76] in 2010 is a biologically inspired heuristic optimisation algorithm based on a specific behavioural pattern, namely the light-flashing characteristic of fireflies [77].
Raheli et al. [78] evaluated the ability of a newly suggested combined prediction technique that uses the FFA as a heuristic optimiser coupled with the MLP. The model was applied to forecast monthly WQ (i.e., BOD, DO, COD, K, EC, pH, PO4, Cl, Na, and NH4-N). According to the performance criteria, the MLP-FFA technique outperforms the corresponding MLP model.
The cuckoo search (CS) was proposed by Yang and Deb. It is effective in tackling global optimisation problems [79]. Chatterje et al. [80] used CS to support an NN-based classification technique for predicting WQ. To identify the best weight vector for the ANN model, the suggested approach (NN-CS) iteratively minimises an objective function (RMSE). The suggested technique was compared with other well-established approaches, such as NN-GA and NN-PSO, in terms of precision, Matthews correlation coefficient (MCC), recall, Fowlkes-Mallows index (FM index), and F-measure. The simulation outcomes showed that NN-CS outperformed the other models.
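The classification criteria listed above can all be derived from the four cells of a binary confusion matrix, as in the minimal sketch below. The counts are hypothetical and serve only to show the formulas.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Precision, recall, F-measure, Fowlkes-Mallows index, and MCC from counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    fm_index = math.sqrt(precision * recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "fm_index": fm_index,
        "mcc": mcc,
    }


# Hypothetical counts for a two-class WQ classifier.
scores = classification_metrics(tp=40, fp=10, fn=5, tn=45)
```

Unlike the other four criteria, MCC also uses the true negatives, so it remains informative when the two classes are imbalanced.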
Li et al. [81] applied a combined approach based on LSTM and a sparse auto-encoder (SAE) to enhance the forecasting precision of DO in aquaculture. The SAE pre-trained the hidden-layer data containing deep latent WQ features, which were then fed into the LSTM to improve forecast precision. The outcomes showed that SAE-LSTM outperforms LSTM and SAE-BPNN.
The artificial bee colony (ABC) algorithm was proposed by Karaboga [82]. It has ushered in a new way of thinking about optimisation algorithms. It was inspired by the study of the life cycle of bees and includes two core concepts: self-organisation and division of labour [82]. The ABC optimisation approach has not been employed broadly in hydrology. However, there have been limited attempts to adopt it in optimising WQ variables, such as Chen et al. [83], who used an improved artificial bee colony (IABC) algorithm with BPNN to predict the DO, BOD, and CODMn parameters. The IABC algorithm optimised the connection weights between network layers and the threshold of each layer of the BP neural network. Compared with the regular BP, ABC-BP, and PSO-BP neural network models, the IABC-BP neural network showed better prediction capability and reached considerably higher accuracy, about 25% higher than the BP neural network. The new technique is beneficial for predicting WQ in a water diversion project and could readily be used in this area.
Grey relational analysis (GRA) is a branch of the grey system method that deals with ambiguous or uncertain problems and circumstances involving discrete data and inadequate knowledge [84]. Zhou et al. [85] proposed three models (LSTM, BPNN, and ARIMA) to forecast DO concentrations. Additionally, the improved grey relational analysis (IGRA) method was used for the feature selection of WQ information. The results revealed that LSTM outperformed the other models and that the hybrid IGRA-LSTM technique was the best.
Melesse et al. [4] proposed ten approaches to forecast salinity: M5 Prime (M5P), RF, bagging-M5P, bagging-RF, random subspace (RS)-M5P, RS-RF, random committee (RC)-M5P, RC-RF, additive regression (AR)-M5P, and AR-RF. The results revealed that AR-M5P outperformed the other models according to the performance criteria. The combination of ML algorithms enhanced model performance in capturing extreme salinity values, which is critical in managing water resources. Tiyasha et al. [28] suggested four tree-based predictive models: RF, random forest geneRator (Ranger), conditional random forests (cForest), and XGBoost. Four feature selection techniques (GA, Boruta, XGBoost, and multivariate adaptive regression splines, MARS) were used to determine the optimum independent variables for forecasting DO changes. The outcomes show that all predictive approaches performed well with the features selected by the MARS and XGBoost algorithms. Additionally, the XGBoost predictive technique recorded the best performance when combined with the MARS and XGBoost feature selectors according to the various statistical criteria applied.
Kadkhodazadeh and Farzin [86] explored a novel gradient-based optimiser (GBO) algorithm coupled with a least squares support vector machine (LSSVM) technique for the evaluation of WQ parameters. The performance of the LSSVM-GBO method was first examined using three benchmark datasets (Housing, LVST, and Servo). The results of the hybrid algorithm were then compared with those of the ANN, ANFIS, and LSSVM techniques. Based on the evaluation criteria, LSSVM-GBO outperformed the other techniques on all benchmark datasets. EC and TDS modelling was then carried out at varying time delays using the best input combination and the best algorithm. The Gotvand station had the highest modelling accuracy for the EC and TDS parameters.
Dehghan et al. [87] used SVR in stand-alone and hybrid versions. SVR was integrated with four metaheuristic algorithms, namely chicken swarm optimisation (CSO), social ski-driver (SSD) optimisation, black widow optimisation (BWO), and the algorithm of the innovative gunner (AIG), to predict monthly DO. All the hybrid models performed well according to the different statistical criteria, and SVR-AIG offered the best results. Moreover, the combined techniques improved the precision of the stand-alone SVR method by 1.75-6.52%.

Preprocessing-Based Hybrid Models (PBH)
In this method, the input data are pre-processed using various methods such as decomposition-based, filter-based, denoising-based, feature selection, and data cleaning approaches.Following this, the appropriate individual model forecasts the screened time series [88].
Solg, et al. [89] investigated two models: SVR and ANFIS.The wavelet transform approach was used to clean raw data from noise and analyse the data set into sub-series.Additionally, principal component analysis (PCA) is applied to determine the best predictors.The outcomes showed that the SVR was better than the ANFIS model, the wavelet transform approach improved data quality, and the hybrid W-PCA-SVR is the best technique.
Zhang et al. [23] designed Kernal PCA (kPCA) with a recurrent neural network (RNN) model to estimate the trend of DO.The kPCA technique is used to reconstruct WQ variables, which tries to minimise the noise in raw sensory data while preserving actionable information.The model can use previous knowledge to forecast future trends because of the RNN's recurrent connections.When compared to present AI techniques such as FFNN, SVR, and the general regression neural network model (GRNN), the kPCA-RNN model attained the predicted accuracy and outperformed the comparative models.
Al-Sulttani et al. [90] proposed five different hybrid ML techniques based on Gradient Boosting Machines (GBM H2O), RF, quantile regression forest (QRF), radial SVM, and Stochastic Gradient Boosting (GBM). The techniques were integrated with two different feature-identification algorithms, GA and PCA, to predict monthly BOD values. GA was used to select the best-fitting predictors based on their evolutionary potential. The findings show that the combined PCA-QRF approach performed best in predicting WQ compared with the other models.
Bi et al. [91] applied ANN, SVR, ARIMA, XGBoost, and LSTM models to forecast DO and CODMn. The outcomes reveal that the SE-LSTM technique is superior to the other methods based on statistical metrics: the Savitzky-Golay filter can remove possible noise from the WQ time series, and the LSTM can capture non-linear properties in a complex water environment.
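To illustrate the filtering step, the sketch below applies a Savitzky-Golay smoother with a five-point window and a quadratic fit, for which the convolution weights are (-3, 12, 17, 12, -3)/35. The window length and polynomial order are assumptions chosen for brevity; the settings used in [91] are not stated here.

```python
from typing import List

# Five-point, quadratic Savitzky-Golay smoothing weights (to be divided by 35).
SG5 = (-3.0, 12.0, 17.0, 12.0, -3.0)


def savgol5(series: List[float]) -> List[float]:
    """Replace each interior point with the value of a least-squares quadratic
    fitted to its five-point window; the two points at each end are left unchanged."""
    out = list(series)
    for i in range(2, len(series) - 2):
        window = series[i - 2:i + 3]
        out[i] = sum(c * v for c, v in zip(SG5, window)) / 35.0
    return out


# A quadratic trend passes through unchanged, while an isolated spike is damped.
trend = savgol5([float(i * i) for i in range(7)])
spike = savgol5([0.0, 0.0, 10.0, 0.0, 0.0])
```

Because the filter fits a local polynomial rather than averaging, it tends to preserve peaks in the series better than a plain moving average of the same width.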
Ahmed et al. [92] created a hybrid model by combining the MARS model with the maximal overlap discrete wavelet transformation (MODWT) (i.e., MODWT-MARS). The suggested model was compared against various ML techniques (MARS, CEEMDAN-MARS, CEEMDAN-SVR, SVR, KRR, KNN, RF) in estimating daily WQ parameters. The results revealed that the combined algorithm (MODWT-MARS) was superior to the other methods according to statistical criteria. This hybrid approach could be used in the future to anticipate WQ characteristics using fewer predictor factors.
Ahmadianfa, et al. [93] proposed a novel hybrid model that couples the discrete wavelet transform with locally weighted linear regression (LWLR), employing the mother wavelet Bior 6.8 to decompose the data into two levels. The outcomes reveal that the W-LWLR technique outperforms other methods such as LWLR, MLR, SVR, ARIMA, W-MLR, W-ARIMA, and W-SVR.
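A two-level discrete wavelet decomposition can be sketched as below. The Haar wavelet is swapped in here because its filters fit in two lines; the Bior 6.8 wavelet used in [93] has longer filters that would normally be taken from a wavelet library. The function names are illustrative.

```python
import math
from typing import List, Tuple


def haar_step(signal: List[float]) -> Tuple[List[float], List[float]]:
    """One Haar DWT level: pairwise scaled sums (approximation) and differences (detail)."""
    s = math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) / s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / s for i in pairs]
    return approx, detail


def haar_decompose(signal: List[float], levels: int = 2) -> Tuple[List[float], List[List[float]]]:
    """Return the final approximation and the detail sub-series of each level."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx, details = list(signal), []
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details


approx, details = haar_decompose([4.0, 4.0, 4.0, 4.0, 8.0, 8.0, 8.0, 8.0], levels=2)
```

In a wavelet-based hybrid, the approximation and detail sub-series (rather than the raw series) are passed to the regression model as inputs.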
Eze et al. [94] developed a new combined forecast approach based on ensemble empirical mode decomposition (EEMD) and an LSTM neural network. In this combined EEMD-DL-LSTM technique, the WQ indicator datasets are first pre-treated with moving average filtering and linear interpolation to improve their integrity. Then, the EEMD technique decomposes the measured sensor WQ data. Finally, a multi-feature selection procedure chooses the set of intrinsic mode functions (IMFs) that are substantially correlated with the measured WQ parameter datasets and feeds them as inputs to the DL-LSTM neural network. The performance of the hybrid prediction model was validated by comparing its results with real datasets. Various measurement criteria (MAE, MAPE, RMSE, and MSE) were utilised to assess the overall precision of the hybrid prediction technique.
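The four criteria named above have standard definitions, sketched below for reference. The function and argument names are illustrative, and the sample observations and predictions are made up for the example.

```python
import math
from typing import Sequence


def mae(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def mse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)


def rmse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Root mean squared error, in the units of the observations."""
    return math.sqrt(mse(obs, pred))


def mape(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Mean absolute percentage error; undefined if any observation is zero."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


# Made-up observed and predicted values, for illustration only.
observed = [2.0, 4.0, 5.0]
predicted = [1.0, 4.0, 7.0]
scores = {f.__name__: f(observed, predicted) for f in (mae, mse, rmse, mape)}
```

MAE and RMSE are expressed in the units of the WQ parameter, whereas MAPE is scale-free, which is why studies often report them together.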

Hybridisation of Hybrid Models
The hybridisation of hybrid models is a novel idea proposed to improve forecasting precision over traditional hybrid classes [22].
In 2020, several researchers combined a hybrid model with a pre-processing algorithm. Ya, et al. [95] suggested a technique for forecasting a WQ parameter (TN) based on the deep belief network (DBN) method. The DBN, which extracts feature vectors from WQ data at several scales, is optimised using the PSO algorithm. The PSO-DBN WQ prediction model is then integrated with least squares support vector regression (LSSVR), which serves as the top forecast layer of the approach. Compared with the classic back propagation (BP) neural network, the DBN neural network, LSSVR, and the DBN-LSSVR hybrid technique, the proposed model (PSO-DBN-LSSVR) forecasts the WQ parameters accurately and shows good robustness based on statistical metrics.
Wang et al. [96] established a combined wavelet analysis model (WA-PSO-SVR) to simulate three WQ metrics: CODMn, NH3-N, and DO. The results showed that the combined WA-PSO-SVR technique outperformed two other methods (PSO-SVR and a single SVR) in predicting non-linear stationary and non-stationary time series, particularly for extreme value prediction. Daily forecasts were more precise than monthly forecasts, indicating that the combined technique was better suited to short-term forecasting in this case.
In 2021, Song et al. [97] suggested a novel hybrid technique (SWT-ISSA-LSTM). An improved LSTM model was presented to overcome the vanishing or exploding gradients of standard RNNs and their inability to handle long-term dependence, and thereby to enhance the model's performance. Additionally, the synchrosqueezed wavelet transform (SWT) was used to clean the raw data in order to address the non-stationarity, unpredictability, and nonlinearity of the WQ parameter data. The improved sparrow search algorithm (ISSA), a novel heuristic optimisation technique integrating Cauchy mutation and opposition-based learning (OBL), was used to obtain the optimum hyperparameter values for the LSTM method. The suggested combined system was assessed using weekly WQ parameters. The results show that the proposed model, which combines the strong noise resistance of the SWT with the non-linear mapping of the LSTM, outperforms the peer models (stand-alone LSTM, BPNN, SVR, SWT-LSTM, and ISSA-LSTM) at two gauging stations. The suggested combined technique (SWT-ISSA-LSTM) can be utilised as an alternative framework for predicting WQ.
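The two operators that distinguish the ISSA from the basic sparrow search can be illustrated in isolation. The sketch below uses their generic textbook forms (the opposite point `lower + upper - x` and a Cauchy-distributed step obtained as `tan(pi * (u - 0.5))`); the exact update rules of [97] are not reproduced, and the greedy `refine` helper and the toy objective are assumptions made for the example.

```python
import math
import random
from typing import Callable, List


def opposite(x: List[float], lower: List[float], upper: List[float]) -> List[float]:
    """Opposition-based learning: reflect a candidate within its search bounds."""
    return [lo + hi - xi for xi, lo, hi in zip(x, lower, upper)]


def cauchy_mutate(x: List[float], lower: List[float], upper: List[float],
                  rng: random.Random, scale: float = 1.0) -> List[float]:
    """Perturb each coordinate with a heavy-tailed Cauchy step, clipped to the bounds."""
    out = []
    for xi, lo, hi in zip(x, lower, upper):
        step = scale * math.tan(math.pi * (rng.random() - 0.5))
        out.append(min(hi, max(lo, xi + step)))
    return out


def refine(x: List[float], lower: List[float], upper: List[float],
           objective: Callable[[List[float]], float], rng: random.Random) -> List[float]:
    """Hypothetical greedy step: keep the best of the candidate, its opposite, and a mutant."""
    candidates = [x, opposite(x, lower, upper), cauchy_mutate(x, lower, upper, rng)]
    return min(candidates, key=objective)


# Toy objective standing in for a validation error over two hyperparameters.
def toy_error(p: List[float]) -> float:
    return sum(v * v for v in p)


rng = random.Random(42)
start = [1.0, 8.0]
better = refine(start, [0.0, 0.0], [10.0, 10.0], toy_error, rng)
```

The heavy tails of the Cauchy distribution occasionally produce long jumps, which is the usual motivation for using it to escape local optima, while the opposite point gives a cheap second look at the far side of the search range.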
Jamei et al. [98] developed two novel wavelet-complementary intelligence methodologies, the wavelet least square support vector machine coupled with improved simulated annealing (W-LSSVM-ISA) and the wavelet extended Kalman filter integrated with an artificial neural network (W-EKF-ANN), to predict monthly Mg and SO4 metrics. The findings showed that both novel complementary paradigms could provide acceptable accuracy for WQP prediction based on the correlation coefficient (R) and RMSE.
Sha et al. [99] evaluated various DL approaches, namely CNN, LSTM, and CNN-LSTM models. Moreover, they employed complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) to decompose the DO and TN concentration series and reduce their complexity. The outcomes reveal that (i) the CNN-LSTM performed better than the stand-alone CNN and LSTM models; (ii) the models using CEEMDAN-based input data performed significantly better than those using the original input data; and (iii) model precision gradually decreased as the number of forecasting steps increased, with models using the original input data degrading more rapidly than those using CEEMDAN-based input data. This indicates that pre-processing the input data with the CEEMDAN method could significantly enhance forecasting performance.
Yan et al. [100] compared four models (GA-BPNN, PSO-BPNN, PSO-GA-BPNN, and BPNN) for forecasting DO concentration. The findings indicated that the PSO-GA-BPNN technique had better forecasting precision and robustness than the other methods. In this work, the connection weights and thresholds of the BPNN were optimised using PSO and GA. The hybrid PSO-GA algorithm is based on the PSO algorithm, with the GA inserted during the execution of the PSO method. It combines the benefits of both algorithms, resulting in less processing, faster convergence, and better global convergence performance.
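One possible reading of "the GA inserted during the PSO method's execution" is sketched below: after each PSO velocity and position update, the worse half of the swarm is replaced by crossover offspring of the better half, with occasional mutation. This is an assumption about the general scheme, not the algorithm of [100]; the coefficients, the function name `pso_ga`, and the sphere objective (standing in for a BPNN training error) are all illustrative.

```python
import random
from typing import Callable, List, Tuple


def pso_ga(objective: Callable[[List[float]], float], dim: int, lower: float, upper: float,
           n_particles: int = 20, iters: int = 50, seed: int = 0
           ) -> Tuple[List[float], float, List[float]]:
    """Minimise `objective` with PSO plus a GA step per iteration.
    Returns the best position, its value, and the best-value history."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]
    half = n_particles // 2
    for _ in range(iters):
        # PSO step: inertia plus attraction to personal and global bests.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (pbest[i][d] - pos[i][d])
                             + 1.5 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(upper, max(lower, pos[i][d] + vel[i][d]))
        # GA step: worse half replaced by arithmetic-crossover children of the better half.
        order = sorted(range(n_particles), key=lambda i: objective(pos[i]))
        for i in order[half:]:
            a, b = rng.sample(order[:half], 2)
            alpha = rng.random()
            child = [alpha * pos[a][d] + (1.0 - alpha) * pos[b][d] for d in range(dim)]
            if rng.random() < 0.1:  # occasional Gaussian mutation of one coordinate
                d = rng.randrange(dim)
                child[d] = min(upper, max(lower, child[d] + rng.gauss(0.0, 0.1 * (upper - lower))))
            pos[i] = child
        # Update personal and global bests.
        for i in range(n_particles):
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


def sphere(p: List[float]) -> float:
    return sum(v * v for v in p)


best, best_val, history = pso_ga(sphere, dim=2, lower=-5.0, upper=5.0)
```

To tune a network, the position vector would hold the connection weights and thresholds and the objective would return the training or validation error.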
The details of the selected papers, including the authors, location, time scale, methods, input variables, output prediction, and evaluation criteria, are given in Table 3.
An analysis of several reviewed articles on optimisation algorithms revealed the following:

• The general optimisation approaches demonstrated their ability to tune all AI models to achieve far higher scores on various evaluation criteria than a single model that does not use any optimisation technique. In addition, the probability of achieving ideal values is substantially higher than with a trial-and-error procedure.

• The PSO algorithm is the algorithm most commonly paired with AI approaches to form a combined model in the WQ area.

• Several studies used pre-processing algorithms to overcome the non-stationarity, randomness, and nonlinearity of the WQ indicator data. However, most papers did not apply all of the data pre-processing steps.

• The use of hybrid models has increased in recent years.

Future Research Directions
Azad et al. [68] suggested employing modified algorithms to enhance other types of ML methods that suffer from comparable shortcomings and comparing these modified hybrid models with different physical and soft computing models. Shah et al. [69] proposed that other studies should employ additional AI models, such as ensemble forecasting combined with PSO, to further improve their performance with optimum parameters in modelling WQ factors. Li et al. [81] recommended creating a deep network through layerwise pre-training to capture deeper latent features, in order to investigate the impact of increasing the number of network layers of the SAE (sparse auto-encoder) on predictive performance. Tiyasha et al. [28] mentioned that the MARS algorithm as a feature selector, and the XGBoost algorithm as both a feature selector and a predictive method, should be investigated for various types of WQ data. In addition, the Boruta algorithm should be used to create scenarios to determine the cutoff value for the best predictors. Furthermore, an uncertainty analysis is required to determine the stochasticity of the data when applying the suggested AI techniques (RF, cForest, Ranger) and XGBoost. Song et al. [97] stated that more effective pre-processing procedures for WQ data should be investigated to increase the model's precision. Jamei et al. [98] stated that, in the future, an ensemble multi-wavelet transform (EMWT) paradigm could be employed to utilise the wavelets simultaneously. In addition, an ensemble tree-based method could be effective for combining the benefits of each complementary strategy to estimate surface water WQPs. Combined versions that incorporate more than one training technique are also recommended for improving the predictability of WQ parameters.
Additionally, all of the studies reviewed here support the suggestions below:

• It is recommended that the three data pre-processing steps be applied to remove outliers and noise and to select the most reliable and precise data to be employed as predictors later.

• Other data pre-treatment techniques, such as EEMD and singular spectrum analysis, are proposed.

• Predictor selection is significant in determining the model's performance and precision. Accordingly, it is advised that more effort be made to select the optimal combination of predictors, using techniques such as feature extraction, feature selection, and dimensionality reduction.

• The application of hybrid metaheuristic algorithms and soft computing techniques to WQ parameter prediction has grown considerably in recent years. Nevertheless, there is still room for improvement in WQ parameter prediction.
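As a rough illustration of a pre-processing chain, the sketch below applies outlier cleaning, normalisation, and predictor screening in turn. The three steps are not enumerated in this section, so this grouping is an assumption, as are the specific methods (median absolute deviation rule, min-max scaling, Pearson correlation threshold), the function names, and the sample values.

```python
import math
from statistics import median
from typing import Dict, List


def clean_outliers(values: List[float], k: float = 3.0) -> List[float]:
    """Replace points more than k scaled MADs from the median with the median."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0.0:
        return list(values)
    limit = k * 1.4826 * mad
    return [med if abs(v - med) > limit else v for v in values]


def min_max(values: List[float]) -> List[float]:
    """Scale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / math.sqrt(sxx * syy)


def select_predictors(candidates: Dict[str, List[float]], target: List[float],
                      threshold: float = 0.5) -> List[str]:
    """Keep candidates whose absolute correlation with the target meets the threshold."""
    return [name for name, series in candidates.items()
            if abs(pearson(series, target)) >= threshold]


# Illustrative values only: a pH-like series with one spurious reading.
cleaned = clean_outliers([7.1, 7.0, 7.2, 7.1, 25.0, 6.9, 7.0])
kept = select_predictors({"temp": [1.0, 2.0, 3.0, 4.0], "noise": [1.0, -1.0, 1.0, -1.0]},
                         target=[2.0, 4.0, 6.0, 8.0])
```

A linear correlation screen is only the simplest option; the feature extraction and dimensionality reduction methods mentioned above would replace `select_predictors` in a fuller pipeline.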

Conclusions
This work reviewed papers that employed hybrid methodologies to simulate WQ parameters. The selected papers revealed an increasing tendency in recent years toward employing these methods in the area of WQ modelling. Among the many modelling approaches, combining data pre-processing techniques with metaheuristic algorithms and soft computing models has enhanced WQ prediction accuracy. Therefore, hybrid models are effective techniques that should be used to enhance the precision of WQ parameter predictions. A comprehensive hybrid model incorporates both pre-processing techniques and metaheuristic algorithms. Accordingly, a key strength of the current study is that it provides a comprehensive examination of all the above factors.
Most of the previous research used the WQ parameters as predictors, and few studies applied other factors such as weather. For this type of data, models that incorporate only factors proven to be effective are more precise than models that incorporate all factor data without testing the variables' efficiency. Additionally, most previous studies used only one or two pre-processing steps, which affected the accuracy of the prediction models. Therefore, future studies should test the efficiency of the factors (predictors), and apply normalisation and cleaning, before using the data as input to the forecast models. Furthermore, although significant advances in hybrid modelling techniques have been made recently, no single technique has emerged as the best forecasting model. Consequently, WQ parameter forecasting remains an open research problem, which leaves room for scholars to improve hybrid techniques for specific applications.

Conflicts of Interest:
The authors declare no conflict of interest.

Figure 1. Number of studies using hybrid ML models for WQ parameter prediction over the last four years.


Figure 2. Number of studies employing each WQ parameter over the years.


Figure 3. Hierarchy chart showing the taxonomy of the reviewed hybrid models.


Table 1. Summaries of related review papers.
AI; hybrid model; wavelet transform; river water quality; prediction; review. AI-based Single and Hybrid Models for Prediction of Water Quality in Rivers: A Review

Table 2. Advantages and disadvantages of the ANN, ANFIS, and SVR models.


Table 3. Summary of the application of different types of hybrid models in WQ monitoring.