Daily Streamflow Forecasting Based on the Hybrid Particle Swarm Optimization and Long Short-Term Memory Model in the Orontes Basin

Water, a renewable but limited resource, is vital for all living creatures. Increasing demand makes the sustainability of water resources crucial. River flow management, one of the key drivers of sustainability, will be vital to protect communities from the worst impacts on the environment. Modelling and estimating river flow in the hydrological process is crucial for the effective planning, management, and sustainable use of water resources. Therefore, in this study, a hybrid approach integrating long short-term memory (LSTM) networks and particle swarm optimization (PSO) was proposed. For this purpose, three hydrological stations along the Orontes River basin were used: Karasu, Demirköprü, and Samandağ. The data from the Demirköprü and Karasu stations covered 2010 to 2019, and the Samandağ station data covered 2009–2018. The datasets consisted of daily flow values. In order to validate the performance of the model, the first 80% of the data were used for training, and the remaining 20% were used for testing at the three flow measurement stations (FMSs). Linear regression and the classical autoregressive integrated moving average (ARIMA) model were used for comparison to assess the proposed method's performance and demonstrate its superior predictive ability. The estimation results of the models were evaluated with the RMSE, MAE, MAPE, SD, and R² statistical metrics. The comparison of the daily streamflow predictions revealed that the PSO-LSTM model provided promising accuracy and performed better than the benchmark and linear regression models.


Introduction
Water plays a major role in the creation of everything we produce. There are no substitutes, and while it is renewable, there is only a finite amount of it [1]. Although water covers three-quarters of the Earth's surface, the amount of freshwater is very small. Of the roughly 1.4 billion km³ of water in the world, 97.5% is found as saltwater in the oceans and seas and only 2.5% as fresh water in rivers and lakes. Additionally, some freshwater resources are located at the poles and underground, so the amount of usable water is low. Population growth, economic development, and climate change are gradually increasing the pressure on freshwater resources and the competition for access to them. This situation is therefore expected to cause a global water crisis in the near future. In order to prevent future disaster scenarios, it is necessary to design an accurate planning and management strategy for water resources [2]. Water, which is constantly in circulation in our ecosystem, is insufficient to meet the needs of the increasing world population because of global warming and the resulting drought. Drought affects water resources in two ways: directly and indirectly. The direct effect of drought on water resources comes from high temperature, low relative humidity, and increased evaporation losses, especially in surface water resources; the indirect effect is through the increase in input.
However, RNNs may have difficulties retaining information from previous layers. This constraint is called the vanishing gradient problem, and its result is described as the short-term memory problem in RNNs. Nowadays, LSTM-based methods, which build on an advanced version of RNNs, are the most widely studied. The LSTM unit remembers long or short time periods. The key to this capability is that it uses no activation functions in its recurring components [19,20].
Additionally, LSTM networks occasionally yield unsatisfactory results because their initialization parameters are selected randomly. Hybrid modeling studies are therefore attracting progressively more attention as a way to obtain better performance [21]. Consequently, in this study, the random selection of initialization parameters, which significantly affects the performance of the LSTM model, was addressed by creating a PSO-LSTM hybrid model using the particle swarm optimization (PSO) algorithm. Recently, hybrid modeling studies that merge ANNs with various optimization methods have risen in popularity as a way to enhance performance in data analysis in hydrology and other fields. The number of studies developing hybridization methods for time series prediction has been increasing rapidly.
Mohammadi et al. [22] recommended a novel hybrid approach for SSL estimation in which multilayer perceptron (MLP) was hybridized with PSO and then integrated with a differential evolution algorithm (DE); the model was called MLP-PSODE. The developed MLP-PSODE model was found to be a parsimonious model that incorporates a lower number of input parameters in its structure for SSL estimation. Gharabaghi et al. [23] introduced a new hybrid algorithm, known as PSOGA, based on the advantage of two evolutionary algorithms, PSO and genetic algorithms (GA). The results demonstrated that the presented hybrid algorithm in the optimized design of ANFIS (PSOGA) has better accuracy than that of individual algorithms. Meshram et al. [24] generated a hybrid model by combining the feedforward neuron network (FNN) with the PSO model developed with the gravity search algorithm (FNN-PSOGSA). The results showed that the prediction accuracy of the hybrid model developed using rainfall values was successful. Motahari and Mazandaranizadeh [25] utilized a PSO algorithm as a metaheuristic approach to train an artificial neural network (ANN). The results revealed that applying the PSO-ANN model can achieve an acceptable prediction of the runoff up to two days ahead. Zounemat-Kermani et al. [26] developed integrative models, and the well-known particle swarm optimization (PSO) and novel manta ray foraging optimization (MRFO) heuristic algorithms are embedded in the models.
Yan et al. [27] built three new models hybridized with PSO for water quality time series. The hybrid models were compared with the data-based models. It was seen that the prediction accuracy of the hybrid model has an advantage in terms of time consumption. Asadnia et al. [28] developed the hybrid ANN-PSO and compared this model with the LN-MM model integrated into the ANN model. The hybrid model gave better results than those of the comparison model. Dökme [29] used the PSO algorithm to reduce the size of data by making feature selection in order to perform better data analysis from datasets. The PSO-based method performed better than other models did in the study. Feng et al. [30] proposed a novel enhanced LSTM model called LN-LSTM-PSO by integrating layer normalization (LN), LSTM network, and PSO to improve prediction accuracy. LN is able to accelerate the convergence speed of the LSTM network, and PSO substantially increases model performance by automating the hyperparameter selection.
Adnan et al. [31] developed a hybrid model for monthly runoff prediction by integrating particle swarm optimization (PSO) and grey wolf optimization (GWO) with extreme learning machine (ELM). The results revealed that the proposed model can achieve a successful prediction. Kouk et al. [32] developed precipitation modeling with an integrated PSO method. The results showed that the developed hybrid model could be successfully applied to precipitation models. Sihag et al. [33] compared the ant algorithm integrated with ANFIS and a model with integrated PSO. When the performance of the models was examined, it was observed that the hybridized model with PSO had higher accuracy compared with the other model.
As seen in the literature, many hybrid models can be applied to enhance prediction performance. In addition, hybrid flow models created by integrating various deep learning and machine learning methods through different techniques focus on improving prediction accuracy. Factors such as the prediction accuracy and training time of the algorithms used to optimize deep learning models such as LSTM should also be considered. Therefore, it is necessary to determine the optimum parameters for artificial intelligence-based models and to choose an appropriate optimization method when designing the hybrid model.
The primary focus of this paper is as follows: (1) three flow measurement stations were determined to validate the predictive capacity of the generated model; (2) the PSO algorithm was integrated into LSTM to optimize the number of hidden layer nodes and the learning rate, to achieve higher prediction accuracy, a shorter time in which to handle complex calculations, and long-term correlation.

Study Region
Although water scarcity is a natural physical phenomenon, it can have devastating effects because society depends vitally on water resources. In order to minimize the damage, planning must identify the regions at risk by using historical hydrological data on a regional basis. The Orontes Basin, located in the south of Turkey and included in the scope of transboundary waters, is essential to the planning of this region because of its geopolitical location. The total water potential of the Orontes basin is determined as 2.64 billion m³/year. Of this, 0.27 billion m³/year derives from Lebanon, about 1.09 billion m³/year from Syria, 0.18 billion m³/year from Afrin, including the waters passing through Syria, and about 1.18 billion m³/year originates from Turkey [34]. Accurate flow estimates are therefore noteworthy for this basin, in which approximately 55% of the total water potential comes from outside Turkey. The Orontes River, shown in Figure 1 and called Asi in Arabic, rises east of the Lebanon Mountains. The river takes shape along its slope, fed by Rasel-Ayn and Al-Labwah, which form its main sources. Subsequently, the streams merge in Syrian territory after crossing the Bekaa valley between the Lebanon and Anti-Lebanon Mountains. Near Homs, the river heads first to the northeast and then to the north under the influence of basalt flows. The river then emerges from the Gharb Plain around Karkur and forms the Turkey-Syria border, starting near the village of Etun (Zambakiye). Near the village of Eşrefli, it finally enters Turkish territory. After proceeding 10 km north across the Amik Plain, the river bends to the southwest in an arc and enters the Mediterranean Sea near Samandağ [35,36].

Datasets and Pre-Processing
In this study, three flow measurement stations (FMSs) that represent various hydrological conditions of the Orontes River Basin were selected to validate the predictive capacity of the generated model. They were chosen because they lie on different branches of the Orontes River basin, as shown in Figure 2. Daily records from the FMSs were used to gather long-term, 10-year streamflow data. Demirköprü FMS (D19A07) is where the Orontes River enters Turkish territory. Karasu FMS (E19A05) is on the Karasu River, which merges with the Orontes River. The Karasu FMS was chosen because the river passes through the Amik Plain, where intensive agricultural activities take place. Furthermore, the Karasu River merges with the Small Asi River and empties into the sea at Samandağ. Samandağ FMS (D19A09) is the last point before the Orontes River spills into the sea. The Samandağ station was selected because the river empties into the sea at Samandağ after passing through both city centers. In addition, after the merger with the Karasu River, these regions show intensive agricultural activity. The locations of the stations on the Orontes River are presented with geographical coordinates in Table 1. As shown in Figure 3, during the observation period, the minimum and maximum rates of flow belonging to the three river stations were 1.78 m³/s and 30 m³/s, respectively.
At the Demirköprü FMS, the lowest streamflow was 2.09 m³/s in 2017 and the highest was 30 m³/s in 2018. At the Karasu FMS, the lowest daily streamflow was 1.20 m³/s in 2016, whereas the highest was 30 m³/s in 2010. At the Samandağ FMS, the lowest streamflow was 1.78 m³/s in 2017 and the highest was 29.77 m³/s in 2016. Lastly, the highest streamflow at all three stations was recorded in the March–May period.
The hybrid model was implemented in Python 3.9. The Keras library and Deep library were used for the training and prediction processes. In the hybrid model, in which daily river flow data were analyzed, the LSTM was trained for 100 epochs with a batch size of eight; Adam was the optimizer and MSE was the loss function. The dataset consisted of one flow value per day, taken from the daily flow records of EIEI (Electrical Works Survey Administration General Directorate) and DSI (State Hydraulic Works). The original data from the flow observation stations contained 10 years (3651 days) of records for each station. Of the total dataset, 80% was used as the training set and the remaining 20% as the test set. The models to be compared were trained on the training data, and the hybrid model performance was then analyzed on the test data. In addition, the hybrid model comprised one dense layer and two hidden layers.
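The chronological 80/20 split and the scaling step described above can be illustrated with a minimal sketch. It is not the authors' code: the synthetic series stands in for the station records, and the helper names, the min-max scaling, and the 7-day look-back window are assumptions made only for illustration.

```python
import numpy as np

def chronological_split(series, train_fraction=0.8):
    """Split a 1-D series into a leading training part and a trailing test part."""
    split = int(len(series) * train_fraction)
    return series[:split], series[split:]

def minmax_fit(train):
    """Return the scaling bounds, taken from the training part only."""
    return float(train.min()), float(train.max())

def minmax_apply(values, lo, hi):
    """Scale values with the bounds learned on the training part."""
    return (values - lo) / (hi - lo)

def make_windows(values, lookback):
    """Build (samples, lookback, 1) inputs and next-day targets for an LSTM."""
    X = np.stack([values[i:i + lookback] for i in range(len(values) - lookback)])
    y = values[lookback:]
    return X[..., np.newaxis], y

# Synthetic stand-in for one station: 3651 daily values (assumption, not real data).
rng = np.random.default_rng(0)
days = np.arange(3651)
flow = 15 + 10 * np.sin(2 * np.pi * days / 365) + rng.normal(0, 1, days.size)

train, test = chronological_split(flow)
lo, hi = minmax_fit(train)
X_train, y_train = make_windows(minmax_apply(train, lo, hi), lookback=7)
X_test, y_test = make_windows(minmax_apply(test, lo, hi), lookback=7)
```

Fitting the scaling bounds on the training part only keeps information from the test period out of the training step; whether the study did the same is not stated.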
In this study, the historical flow data of the stations were analyzed in order to estimate future river flows and evaluate the proposed models. For this reason, flow data that were uninterrupted over a long period were included so as to obtain accurate estimates. It is important that the flow data used are recorded completely and without interruption, although short-term interruptions are acceptable at this stage. However, in many basin-based studies, when meteorological data (precipitation, snow, temperature, evaporation, etc.) and hydrological data (flow observation or flow measurement) are obtained from institutions, the data for past dates might be missing or interrupted for various reasons, such as climatic difficulties, transportation difficulties, or problems with the measuring device. Gaps in flow data caused by unfavorable climatic conditions or other reasons create significant problems for the effective planning, design, and operation of water resources. In addition, these conditions should be taken into account when determining the flow values so that the structure and hydrological characteristics of the datasets are not degraded.
In addition, as stated above, three hydrological stations, Demirköprü, Karasu, and Samandağ, which represent the various climatic regions and hydrological conditions of the Orontes River, were selected to validate the PSO-LSTM model. The Orontes River Demirköprü station is located in the Hatay Watershed, one of the basins with a high flood regime. It is the first measurement station encountered after the Orontes River enters the territory of Turkey. It covers the wide river valley in the upper reaches and is located in the riverbed that extends to the transition zone where the canal turns into a plain. In addition, since this station is near the border of Turkey, it is the least affected by interventions in the river waters within Turkey. Karasu station is located where the Karasu merges with the Orontes River within the borders of Turkey; it is the last station along that reach and covers a large part of the catchment area. Samandağ station records the flow of the river to the sea. On the basis of these features, the D19A07, E19A05, and D19A09 stations were used to assemble the datasets for this study. The data from the Demirköprü and Karasu stations covered 2010 to 2019, and the Samandağ station data covered 2009–2018. The datasets consisted of daily flow values.

Long Short-Term Memory Network
Long short-term memory (LSTM) is a powerful RNN architecture whose most noteworthy feature is its ability to resolve the vanishing gradient problem, or at least to reduce its impact on training performance. As in an RNN, nodes in an LSTM neural network receive the hidden states of the previous step. However, the node, a common LSTM unit, has a more advanced structure than an RNN node, and this is the primary aspect that provides long-term memory by reducing the vanishing gradient effect [37]. Three major components make up the LSTM's internal structure: the forget gate removes undesirable information from the current cell state, the input gate adds further data to the current cell state, and the output gate produces an output from the current cell state. Each performs a specific operation on the cell state [38]. These gates determine which data need to be added or cleared. The cell state, $C_t$, can be thought of as the memory of the network. It ensures that previous information is maintained. The gates determine the data to be transported, as shown in Figure 4. In Equation (1), the information from the previous cell, $h_{t-1}$, and the current input, $X_t$, are passed through the sigmoid activation function to give $f_t$. The forget gate, $f_t$, determines how much of the previous memory state, $C_{t-1}$, is preserved. Information with a value of 0 is forgotten, and information with a value of 1 continues to be carried by the cell state. Another gate is the input gate, $i_t$, in Equation (2), which provides the information to write into the current memory state, $C_t$. It updates the cell state, $C_t$, and decides how to combine the previous and current information according to the result of the sigmoid ($\sigma$) operation. The LSTM decides which information to delete with the sigmoid function: information with a value of 0 is considered trivial, and information with a value of 1 is deemed essential.
In addition, the tanh activation function, which compresses the data between −1 and 1, is used to regulate the network. The sigmoid and tanh outputs are then multiplied to choose which information will be updated, as in Equations (3) and (4). The output gate determines the hidden state, $h_t$, that is passed as input to the next cell. It is also used for prediction. The existing information in the cell state is passed through the tanh function, and the result is multiplied by the output of the gate to determine what information will be the input for the next cell. When the gate operations for the current cell are completed, the cell state that will proceed to the next cell and the hidden state, $h_t$, which serves as the input information of the next cell, are decided. In Equations (5) and (6), relying on the current cell state, $C_t$, the output of the LSTM, $h_t$, is determined by the output gate, $o_t$ [39][40][41].
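For reference, the gate operations described above correspond to the conventional LSTM formulation sketched below. This is the standard textbook form, numbered to match the references in the text; the weight matrices $W$ and bias vectors $b$ are notation assumed here, and the formulation printed in the original article may differ in detail.

```latex
\begin{align}
f_t &= \sigma\left(W_f\,[h_{t-1}, X_t] + b_f\right) \tag{1}\\
i_t &= \sigma\left(W_i\,[h_{t-1}, X_t] + b_i\right) \tag{2}\\
\tilde{C}_t &= \tanh\left(W_C\,[h_{t-1}, X_t] + b_C\right) \tag{3}\\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \tag{4}\\
o_t &= \sigma\left(W_o\,[h_{t-1}, X_t] + b_o\right) \tag{5}\\
h_t &= o_t \odot \tanh\left(C_t\right) \tag{6}
\end{align}
```

Here $[h_{t-1}, X_t]$ denotes the concatenation of the previous hidden state with the current input, and $\odot$ is element-wise multiplication.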
Water 2022, 14

Particle Swarm Optimization
Many global optimization techniques based on nature-inspired analogies have been developed over several decades. These techniques draw on the collective behavior of populations and employ tools that can overcome many of the limitations of derivative-based approaches. One of these popular techniques, PSO, developed by Kennedy and Eberhart, is a sociologically inspired, population-based metaheuristic founded on the simulation of collective behaviors such as bird flocking and fish schooling, and it is related to approaches such as evolutionary programming and ant colony optimization. These algorithms have demonstrated their ability to solve challenging and complex optimization problems in various fields. Compared with the genetic algorithm (GA) and other evolutionary algorithms (EAs), PSO was used in this study because of its faster convergence rate and easy implementation [42].
The PSO system is initialized with random solutions and searches for the best solution by updating them at each iteration. Each potential solution, called a particle, is represented by a point in the multidimensional solution space. While searching for the optimal solution, the particles move through the solution space at a certain velocity. Each particle adjusts its position and velocity according to its own experience and the experience of its neighbors. Specifically, each particle keeps track of the best solution it has reached so far, called the personal best, pbest. The system also preserves the best solution found by the whole swarm, called gbest. The basic concept of PSO involves changing the velocity of each particle towards its pbest and gbest positions at each iteration [43]. The swarm continues iterating through this process until an optimal solution is found. The flow chart of the PSO algorithm is depicted in Figure 5. The PSO system combines a local search approach (through self-experience) with global approaches (through neighboring experience), balancing exploration and exploitation. The state of the particles in the search space is described by the particle position, $X_i$, and the particle velocity, $V_i$.
The expression $V_i = [V_{i1}, V_{i2}, \ldots, V_{in}]$ is the velocity of particle $i$, which specifies the distance that the particle will travel from its current position. The expression $X_i = [X_{i1}, X_{i2}, \ldots, X_{in}]$ specifies the position of particle $i$. The expression pbest is the previous best position of particle $i$. The expression gbest represents the best position among all particles in the population. The expressions $r_1$ and $r_2$ denote uniformly distributed random variables in [0, 1]. The expressions $C_1$ and $C_2$, called acceleration coefficients, are greater than 0 and pull each particle towards its personal best position and the best position in the swarm, respectively.
The first part of Equation (7), the expression $V_i[t]$, refers to the particle's previous velocity, which acts as a memory of its previous direction of motion. This term can be considered the momentum that prevents the particle from altering its direction drastically and that influences the current direction.
The second part, the expression $C_1 \times r_1 \times (pbest_i[t] - X_i[t])$, is called the cognitive part and refers to the particle's individual experience. This cognitive part resembles the individual memory of the best place the particle has found. The consequence of this term is that particles return to their best places, similar to the tendency of individuals to return to the most satisfying situations or places in their past [44,45].
The last part, the expression $C_2 \times r_2 \times (gbest_i[t] - X_i[t])$, captures the association among particles, and it is called the social component. The term is analogous to a group standard that individuals seek to achieve. The outcome of this term is that each particle is attracted to the best position found by its neighbors. The numbers $r_1$ and $r_2$ are random numbers in the range [0, 1].
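Putting the three parts together, the velocity update can be written in its conventional form as sketched below. This is the standard PSO update rather than a reproduction of the article's typeset equation; the inertia weight $w$ is an assumption (setting $w = 1$ gives the form without inertia), and the position update on the second line is the usual companion step, left unnumbered because the text does not reference it.

```latex
\begin{align}
V_i[t+1] &= w\,V_i[t] + C_1\, r_1 \left(pbest_i[t] - X_i[t]\right) + C_2\, r_2 \left(gbest_i[t] - X_i[t]\right) \tag{7}\\
X_i[t+1] &= X_i[t] + V_i[t+1] \notag
\end{align}
```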

Forecasting Based on PSO-LSTM (Proposed) Model
In the LSTM neural network, the initial values of the parameters critically influence the network's performance. In this study, the PSO algorithm was employed to optimize two essential parameters of the LSTM network: the number of hidden layer neurons and the learning rate. In constructing the proposed model, a standard LSTM prediction model was built first. It was trained ten times with random parameters, the test outcomes were compared, and the most promising results were recorded as the benchmark model. Next, several hyperparameters of the LSTM model were optimized with PSO. The optimal values found by the PSO algorithm were passed to the LSTM network as parameters, the LSTM model was retrained, and the outcomes were compared with the benchmark model. In addition, the linear regression model was run to verify the accuracy of the results, and the results of both models were compared.
First, the data were prepared for training. The dataset was divided into training and test sets of 80% and 20%, respectively. Translation and normalization techniques were then applied to both datasets so that the parameters could be optimized, and the data were converted into a form suitable for training. Afterwards, the LSTM network was first trained with one dense and one LSTM layer, and then with one dense and two LSTM layers to achieve better performance. The latter structure was accepted as more suitable, and this three-layer structure was used in the following operations. By altering the number of neurons in the hidden layers, the network was run 10 times, and the best results of the three-layer network were taken as the reference. Many attempts were made to determine the most appropriate bias value for the model; as a result of these experiments, the bias value was set to 0.5. The mapping between PSO particles and LSTM parameters was then built into this structure. The weight was 0.5, the swarm size was 20, the maximum number of iterations was 50, the acceleration constants $C_1$ and $C_2$ were in the range (−2, 2), the velocity was in the range (−3, 3), and the number of particles was in the range (32, 256). For the calculation of the pbest and gbest values, the results from the PSO were employed as the learning rate. The number of neurons in the proposed network and the R² (coefficient of determination) were used to determine the fitness values. In this paper, $r_1$ is equal to 0.6, and $r_2$ is equal to 0.3. The network was run with the optimization results corresponding to gbest, and the results were recorded. After these procedures, the linear regression model was also run. The graphs and results of the three models were compared. The flowchart of the hybrid model is shown in Figure 6.
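The search described above can be sketched as a small PSO loop over the two tuned quantities. This is an illustrative sketch under several assumptions, not the authors' implementation: particles live in a unit square that `decode` maps to a number of hidden units in [32, 256] and to a learning rate on a log scale from 1e-4 to 1e-1; `toy_fitness` is a hypothetical placeholder for "train the LSTM and return its validation error"; the acceleration constants (1.5) and the velocity limit (0.2 in the unit square) are chosen for illustration; and $r_1$, $r_2$ are drawn at random at every step, as in the standard algorithm, rather than fixed at 0.6 and 0.3.

```python
import numpy as np

def decode(u):
    """Map a point of the unit square to (hidden units, learning rate)."""
    n_hidden = int(round(32 + u[0] * (256 - 32)))
    learning_rate = float(10 ** (-4 + 3 * u[1]))  # log scale, 1e-4 .. 1e-1
    return n_hidden, learning_rate

def toy_fitness(u):
    """Hypothetical stand-in for the LSTM validation error (lower is better)."""
    n_hidden, learning_rate = decode(u)
    return ((n_hidden - 128) / 224) ** 2 + ((np.log10(learning_rate) + 2) / 3) ** 2

def pso_search(fitness, n_particles=20, n_iter=50, w=0.5,
               c1=1.5, c2=1.5, v_max=0.2, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.random((n_particles, 2))        # positions in the unit square
    v = np.zeros_like(x)                    # initial velocities
    pbest = x.copy()
    pbest_val = np.array([fitness(p) for p in x])
    best = int(pbest_val.argmin())
    gbest, gbest_val = pbest[best].copy(), float(pbest_val[best])
    for _ in range(n_iter):
        r1 = rng.random(x.shape)
        r2 = rng.random(x.shape)
        # inertia + cognitive part + social part
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        v = np.clip(v, -v_max, v_max)
        x = np.clip(x + v, 0.0, 1.0)
        vals = np.array([fitness(p) for p in x])
        improved = vals < pbest_val
        pbest[improved] = x[improved]
        pbest_val[improved] = vals[improved]
        best = int(pbest_val.argmin())
        if pbest_val[best] < gbest_val:
            gbest, gbest_val = pbest[best].copy(), float(pbest_val[best])
    return decode(gbest), gbest_val

(best_units, best_lr), best_val = pso_search(toy_fitness)
```

In a real run, `toy_fitness` would be replaced by a function that builds and trains the network with the decoded values and returns an error measure (or a negated R²) on held-out data, which makes each fitness evaluation far more expensive than in this toy setting.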

Performance Evaluation of Models
The hybrid, linear regression, and ARIMA models are compared with the benchmark model in this section of the study. ARIMA is one of the well-known classical linear statistical models for time series estimation; it predicts the future value of a variable using its past values. The linear regression model was employed to examine the correlation among the data. The linear regression method used a linear function to model the association between the dependent and independent variables in the dataset range and tested the linear correlation. Since the regression approach models the dependent variable as a linear function of the independent variables, it provides an interpretable explanation of how the input affects the output [46]. The performance results of each flow measurement station are shown in Figure 7. Five commonly used statistical indicators were employed to study and compare the estimation results: RMSE, MAE, MAPE, standard deviation (SD), and R². These indicators and the statistical measurement results of the stations are given in Table 2. Model performance was assessed on 730 test values at each of the three stations. When the measurement criteria presented in Table 2 were examined, the hybrid model performed strongly against the other models applied in the study, and the statistical measurements supported this result.
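The five indicators can be computed as in the following sketch. It uses the common textbook definitions, which may differ in detail from those given in Table 2: in particular, SD is assumed here to be the standard deviation of the prediction errors, and R² is assumed to be the coefficient of determination 1 − SS_res/SS_tot. The function name `evaluate` is hypothetical.

```python
import numpy as np

def evaluate(observed, predicted):
    """Return RMSE, MAE, MAPE (%), SD of errors, and R^2 for two equal-length series."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    err = pred - obs
    ss_res = np.sum(err ** 2)                   # residual sum of squares
    ss_tot = np.sum((obs - obs.mean()) ** 2)    # total sum of squares
    return {
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAE": float(np.mean(np.abs(err))),
        # MAPE is undefined when an observed value is zero (e.g., a dry river day).
        "MAPE": float(100.0 * np.mean(np.abs(err / obs))),
        "SD": float(np.std(err)),               # assumption: SD of the errors
        "R2": float(1.0 - ss_res / ss_tot),
    }

# Toy series for illustration only (not data from the study).
metrics = evaluate([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 39.0])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```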

Comparative Analysis and Discussion
Scatter plots were used to compare the proposed PSO-LSTM model with the other models. A regression line was also drawn in the plots; it is a standard fit line and is significant for demonstrating model performance. In this context, while determining the quality of a model, the spread of the points and whether they form a pattern were analyzed. The results of the test data were studied on these graphs. When the plots for Karasu station, shown in Figure 7a, are examined, PSO-LSTM presented a very satisfactory performance, with an R² value of 0.95262 compared to 0.8893 for LSTM, 0.8798 for ARIMA, and 0.8725 for linear regression. According to the R² criterion at Demirköprü station, shown in Figure 7b, PSO-LSTM outperformed the LSTM (0.8740), ARIMA (0.7281), and linear regression (0.8373) models with a value of 0.9270. According to the Samandağ station plots, PSO-LSTM achieved an R² value of 0.9749, compared to 0.9202 for LSTM, 0.8890 for ARIMA, and 0.8916 for linear regression. The Samandağ station, the last point, where the Orontes River empties into the Mediterranean, revealed a strong correlation between the estimated flow data and the daily flow values, consistent with its features such as accumulation and acting as a downstream point. Additionally, when the comparison models were examined, the LSTM models were more promising than the linear regression models at all three stations. Analysis of the PSO-LSTM and LSTM methods confirmed the feasibility of their application to flow estimation in the Orontes River basin, with all R² coefficients of PSO-LSTM being greater than 0.92 at the three typical hydrological stations. From the analysis of the five evaluation indices, the accuracy of the models was in the order PSO-LSTM > LSTM > linear regression > ARIMA. This showed that no additional data error was added to the hybrid calculation.
Overall, the proposed PSO-LSTM hybrid model was reliable and exhibited higher accuracy in daily flow prediction. Table 2 shows the values of the statistical measurements for the three hydrological stations. At Karasu station, according to the MAE criterion, the LSTM model had a value of 0.1530, while the hybrid model had a value of 0.1401. The linear regression model had a value of 0.1948, and the ARIMA model had a value of 0.0978. When the RMSE criterion was examined, the hybrid, benchmark, linear regression, and ARIMA results were 0.8276, 1.2363, 1.3308, and 1.2886, respectively. According to the standard deviation criterion, these values were 0.2611, 0.2942, 0.3390, and 0.1742, respectively. According to the MAPE criterion, these values were 14.0196, 15.3023, 19.4855, and 9.7838, respectively. When the evaluation criteria at Karasu station were examined, the hybrid model was reported to be successful across the evaluation criteria relative to the benchmark, linear regression, and ARIMA models.
At the Demirköprü station, according to the MAE criterion, the LSTM model had a value of 0.0714, whereas the hybrid model had a value of 0.0728. The linear regression model had a value of 0.0892, and the ARIMA model had a value of 0.2401. When the RMSE criterion was examined, the hybrid, benchmark, linear regression, and ARIMA results were 0.9073, 1.2836, 1.3498, and 1.7860, respectively. According to the standard deviation criterion, these values were 0.1545, 0.1563, 0.2129, and 0.3006, respectively. According to the MAPE criterion, these values were 7.2830, 7.1450, 8.9201, and 24.0195, respectively. When the evaluation criteria at Demirköprü station were analyzed, the benchmark model was slightly ahead of the hybrid model on the MAE and MAPE criteria, and both were clearly more successful than the linear regression model. In the other three statistical measurements, the hybrid model was successful compared to the benchmark, linear regression, and ARIMA models.
At the last station, Samandağ, according to the MAE criterion, the LSTM model had a value of 0.1270, while the hybrid model had a value of 0.1025. The linear regression model had a value of 0.0951, and the ARIMA model had a value of 0.1066. When the RMSE criterion was investigated, the results of the hybrid, benchmark, linear regression, and ARIMA models were 1.2557, 2.3066, 2.6876, and 2.6255, respectively. According to the standard deviation criterion, these values were −0.1541, 0.1865, 0.1902, and 0.1647, respectively. According to the MAPE criterion, these values were 10.2574, 12.7057, 9.5131, and 10.6665, respectively. When the evaluation criteria at the Samandağ station were examined, the hybrid and linear regression models provided similar results according to the MAPE and MAE evaluation criteria. The benchmark model lagged behind the hybrid and linear regression models in these criteria. In the other evaluation criteria, the hybrid model was quite successful compared to the benchmark and linear regression models. At the Demirköprü station, the ARIMA model lagged behind the other models in all evaluation criteria.
In addition, when all evaluation criteria for the three stations were examined, the hybrid model provided significant percentage improvements. In the general evaluation, the highest R² and the lowest standard deviation were observed at the Samandağ station. Demirköprü station stood out in the MAE and MAPE evaluations, and Karasu station in the RMSE criterion. As mentioned before, the three stations have different characteristics: the Karasu station is located where the Orontes River merges; Samandağ is the last point, where the river flows into the sea; and the Demirköprü station lies where the Orontes River enters the borders of Turkey, in a region not exposed to pollutants originating from Turkey. The precipitation area capacities of the three stations also differ from one another.
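The percentage improvements mentioned above can be illustrated with the RMSE values quoted in the text for the hybrid (PSO-LSTM) and benchmark (LSTM) models. The paper does not state the formula used, so this sketch assumes the usual relative reduction, 100 × (benchmark − hybrid) / benchmark; the helper name `improvement_pct` is hypothetical.

```python
# RMSE values as quoted in the text for each station: (hybrid, benchmark).
rmse = {
    "Karasu": (0.8276, 1.2363),
    "Demirköprü": (0.9073, 1.2836),
    "Samandağ": (1.2557, 2.3066),
}

def improvement_pct(proposed, baseline):
    """Relative error reduction of `proposed` over `baseline`, in percent (assumed formula)."""
    return 100.0 * (baseline - proposed) / baseline

for station, (hybrid, benchmark) in rmse.items():
    print(f"{station}: RMSE reduced by {improvement_pct(hybrid, benchmark):.1f}%")
```

The same helper applies to MAE, MAPE, and SD, where lower values are also better; for R², where higher is better, the sign of the difference would need to be reversed.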
The streamflow measurement values at Samandağ have the widest range. The striking point here is that the LSTM and linear regression models tend to cluster in the same value range, 12–30 m³/s. Although ARIMA generally moves in the direction of the trend line, the hybrid model gave much more accurate results, as the value ranges of the other models lie far from those of the hybrid model. Figure 8 illustrates the standard deviation (SD) and correlation for the benchmark (1), proposed (2), linear regression (3), and ARIMA (4) models in a Taylor diagram. The distance from the reference (observed) point to a model point measures the centered RMSE [47]. Thus, the reference point, with a correlation coefficient equal to 1, marks a perfect model that is in full agreement with the observations and has the same amplitude of variation as the observations [48]. At all three stations, the hybrid model results were closer to the observation points than the other model results were, confirming the better accuracy of the optimized model. Although the benchmark model performed better than ARIMA and linear regression at the Samandağ and Demirköprü stations, the ARIMA model provided results very close to those of the benchmark model at Karasu station, but lagged behind linear regression.
To show that the hybrid model used in this study has high accuracy in forecasting river flows, we compared it with estimation results from the literature that used hybrid models to predict time series. Jabbari and Bae [49] evaluated the real-time bias correction of precipitation data and, from a hydrometeorological point of view, assessed hydrological model improvements in real-time flood forecasting for the Imjin River (South and North Korea). The performance of the real-time flood forecast improved using the ANN bias correction method. Jiandong et al. [50] developed a hybrid forecasting model.
In that study, long short-term memory neural networks (LSTMs) and deep belief networks based on particle swarm optimization (PSO-DBN) were utilized to construct sub-series prediction models. The results showed that the method proposed in that paper was more effective than the other existing methods. Wang et al. [51] proposed a hybrid model based on "feature decomposition-component prediction-result reconstruction", named VMD-LSTM-PSO, to cope with the nonlinear and nonstationary challenges that conventional runoff forecasting models face and to improve daily runoff prediction accuracy. Based on its high predictive accuracy and stability, the novel model promised to be a preferred data-driven tool for hydrological forecasting in practice. Chen et al. [52] utilized three popular DL models, namely a deep neural network (DNN), a temporal convolution neural network (TCN), and a long short-term memory neural network (LSTM), to estimate daily reference evapotranspiration (ETₒ). The results showed that all proposed DL and CML models outperformed radiation-based or humidity-based empirical equations beyond the study areas in which they were trained. Di Nunno et al. [53] predicted spring flows by applying nonlinear autoregressive with exogenous inputs (NARX) neural networks. The good results achieved support using the NARX network for spring discharge prediction in other areas characterized by karst aquifers. Granata and Di Nunno [54] built three recurrent neural network-based models to predict short-term actual evapotranspiration. Two variants of each model were developed by changing the employed algorithm, selecting between long short-term memory (LSTM) and the nonlinear autoregressive network with exogenous inputs (NARX). The results revealed that deep learning-based models could provide very accurate predictions of actual evapotranspiration; however, the performance of the models can be significantly affected by local climatic circumstances.
As can be seen, the results obtained in many studies show that hybrid models created with PSO outperform the comparative model and provide estimation precision.
Considering all these details, converting the results from the PSO algorithm to LSTM parameters proved to be critical. PSO has several limitations, relating to stability, patterns of movement, convergence to a local optimum, and expected first hitting [55]. Based on the known features of PSO, it was considered appropriate to integrate it into the model. The PSO algorithm was utilized to optimize the learning rate and the number of hidden neurons, which are two essential parameters of the LSTM network. As a result, the improvement rates were quite high. However, the LSTM neural network is complex and has other parameters affecting the prediction performance. Future studies can be guided by models that use PSO to determine further parameters, such as dropout, number of iterations, and batch size, in addition to the two considered here, or that integrate the parameters into the proposed model using a different optimization algorithm. In addition, the study may serve as a reference for hybrid methods, both in the development of the deep learning techniques that are diversifying day by day and in the search for more suitable parameters in these complex structures.
The hybrid model successfully estimated the daily flow rate from the data of flow measurement stations under three different hydrological conditions. In addition, the study demonstrated the success of the hybrid model in predicting the optimal level of river flows when compared with the benchmark and linear regression models.
To sum up, it can be seen from the results (Table 2 and Figure 6) that, across the three datasets, the PSO-LSTM achieved the best performance on three of the five evaluation criteria. The estimation results of the PSO-LSTM were superior to those of the LSTM and linear regression models in nearly all cases, except for two statistical measurements, MAE and MAPE, for the river flow datasets. In conclusion, the estimation results indicated that the proposed PSO-LSTM algorithm achieves the best overall results compared with the LSTM model and the regression model for river flow estimation.

Conclusions
In this study, a hybrid method in which PSO is integrated into LSTM is proposed to estimate flow data. The performance of the proposed method has been tested on river flow data from three different flow observation stations on the Orontes River. Although studies have shown that flow can be predicted successfully with artificial neural networks, which provide better results than regression analysis does, the success of the method depends on the availability of healthy, reliable, and sufficient data. The proposed hybrid model was compared with the benchmark model and the linear regression model. Even though the basic LSTM generally demonstrates a strong learning ability for time series, it can occasionally perform poorly owing to the random selection of initialization parameters. In these cases, supporting the model with optimization algorithms influences the performance considerably. One of the reasons for choosing PSO as the optimization algorithm to search for appropriate values of the LSTM parameters is that, compared to genetic algorithms, it works with real numbers and does not need binary coding to make calculations. Statistical evaluation criteria, which are among the basic statistical evaluation methods, were used to measure the model's performance [56]. The results obtained show that, with the proposed PSO-LSTM approach, the estimation errors of the flow data are quite low compared to those of the other models used in the study. Furthermore, when the R² values are considered, the estimation accuracy of the proposed model is correspondingly high, which shows that the improvement effect is significant. In addition, the parameters of the PSO algorithm used in this study are among the factors that need to be developed in future studies.
For this reason, in new hybrid models to be built with the PSO algorithm, a new algorithm will be presented by studying the factors that affect the model, such as particle number, particle size, particle spacing, learning factors, stopping condition, rate of change, particle swarm size, speed, and the maximum number of iterations. Combining the PSO algorithm with different optimization methods and comparing the PSO algorithm with new algorithms will benefit future research. In addition, new hybrid models created using metaheuristic techniques will also be beneficial in future studies. The PSO-LSTM model has been seen to provide promising results in river flow predictions. However, the study has some limitations. In this study, only flow data were used as input. Flow time series are nonlinear, and many parameters such as humidity, snowmelt, and temperature can shape these time series. This study can be repeated with different input parameters and can prepare the ground for future studies. Since the data are nonlinear, decomposition techniques can be included in the model. The hybrid model was evaluated only for daily flow data. It can be evaluated for shorter time intervals (hourly, 30 min, 15 min) in possible future studies. The proposed model can also be applied to other hydrological variables in the field of hydrology. The contribution of the PSO algorithm to the hybridized model is promising. However, the benchmark model can also be hybridized with other recently popular algorithms (e.g., the grey wolf algorithm), and the contribution of the two algorithms to the prediction accuracy can be examined.
Funding: This research did not receive any specific grant from funding agencies in the public, commercial or not-for-profit sectors.