A Methodology for Forecasting Dissolved Oxygen in Urban Streams

Real-time monitoring of river water quality is at the forefront of proactive urban water management strategies to meet the global challenge of sustaining vital freshwater resources. The concentration of dissolved oxygen (DO) is a primary indicator of the health of aquatic habitats, and its modeling is crucial for river water quality management. This paper investigates how the choice of preprocessing and stochastic modeling techniques affects the development of a simple and reliable linear stochastic model for forecasting DO in urban rivers. We describe several methods of evaluation, preprocessing, and modeling for the DO time series of the Credit River, Ontario, Canada, to identify the optimum data preprocessing and input selection techniques and, consequently, obtain the optimum performance of the stochastic models as an effective river management tool. The Manly normalization and standardization (Std) methods were chosen for preprocessing the time series. Modeling the preprocessed time series with the stochastic autoregressive integrated moving average (ARIMA) model resulted in very accurate forecasts, with a negligible difference from the sole-normalization and spectral analysis (Sf) methods.


Introduction
River water quality monitoring programs have been established around the globe to help watershed managers better protect the quality of water in their watersheds [1]. The dissolved oxygen (DO) concentration plays a critical role in regulating various biogeochemical processes and biological communities in river ecosystems. Maintaining sufficient levels of DO in water is critical for water quality because oxygen is needed for the survival and preservation of various aquatic species, including fish, amphibians, benthos, bacteria, and aquatic plants. Therefore, DO is used as a health index for water bodies [2].
Several studies have described the importance of both the diurnal and seasonal fluctuations of DO as it relates to fish habitats [3][4][5]. The occurrence of low DO concentrations, in a normally well-oxygenated river system, can cause mortality in fish and other aquatic life. When DO concentrations are reduced, aquatic species are forced to lower their activity and alter respiration rates, which will delay their development, and can cause reproductive problems and/or deformities [6].
As observed in both the diurnal and the seasonal fluctuations in water temperature and DO concentrations, a strong negative correlation commonly exists between DO concentrations and river water temperatures [7]. Refractory carbonaceous biochemical oxygen demand (CBOD) and the oxidation of ammonia to nitrite and nitrate are some of the most significant processes leading to low dissolved oxygen concentrations in a river [8]. DO concentrations in natural river ecosystems are affected by many complex processes, including solubility (atmospheric pressure, water temperature, humidity, and cloud cover) and the processes that determine river flow turbulence and mixing (stream depth, width, velocity, and roughness), to name a few [9,10].
Most recently, many researchers have employed advanced machine learning and artificial intelligence (AI) techniques for forecasting streamflow and water quality parameters [11][12][13][14]. Sentas et al. [14] compared the ARIMA model with an artificial neural network (ANN) for modeling daily DO. They claimed that the ARIMA model produced better results in modeling the DO parameters measured at several depths. Heddam and Kisi [15] performed multi-input AI modeling for DO estimation. They used daily water temperature, pH, and specific conductance as inputs to least squares support vector machine, multivariate adaptive regression splines, and M5 model tree (M5T) models and found that these models produce acceptable results in DO modeling. They also reported that the results vary from station to station and that no superior model can be chosen. Harvey et al. [16] developed a regression model to forecast daily, weekly, and monthly water temperature and DO. They also claimed that their regression models linking air temperature and DO produced accurate results. Bertone et al. [17] used a multi-input regression model to forecast DO seven days ahead. They achieved a correlation coefficient higher than 0.8 and low errors, and they also utilized preprocessing methods to obtain preferable results. Parmar and Bhardwaj [18] modeled monthly water quality parameters (DO, pH, temperature) for the Yamuna River with an ARIMA model based on a limited analysis, obtaining promising results from the stochastic modeling of these parameters.
Using multi-input models has the advantage of utilizing more historical data to estimate the studied parameter. However, such an approach is accompanied by accumulated errors from the measurements of each of the inputs, and optimized input selection and the complexity of such models are further problems. The availability of historical data, gaps in the dataset, and the associated cost of collecting these data are other challenges in applying a multi-input modeling approach. Gaps in the data are remedied by removing the missing time intervals [19] or reconstructed by interpolation [20], which becomes a challenge when the gaps are numerous or the timespans are long. A solution to these problems can be achieved by adopting a single time series modeling approach, a proper analysis procedure and preprocessing method, careful selection of inputs, or simpler models.
A universal pursuit for less complex models for DO forecasting led to the application of stochastic modeling techniques [21,22]. One of the most widely used stochastic models in hydrology is the ARIMA model using nonseasonal parameters. The advantage of the stochastic ARIMA model over deterministic models is that it requires less data, and it is well suited for forecasting [22]. However, the successful application of the stochastic ARIMA model for the forecasting of DO is associated with several challenges, the key issues being proper data preprocessing that is addressed in this study.
Therefore, the main objective of this study is to compare preprocessing methods applied to the DO ARIMA models to develop the most accurate while still simple and practical forecast model, which to our knowledge has not been studied in-depth before. The ultimate goal of the study is to develop a user-friendly model for the practitioners to easily assess the health of urban streams and avoid the challenging problems associated with the application of more complex multiparameter models.

ARIMA Model
One of the most widely used stochastic models in hydrology is the ARIMA model, which is defined by using nonseasonal parameters. These nonseasonal parameters are ϕ and θ (the autoregressive and moving average parameters, respectively). The orders of these parameters are represented by p and q, respectively, with d denoting the differencing order. The model's equation is defined as [23]:

ϕp(B)(1 − B)^d DO(t) = θq(B) ε(t)

where ϕp(B) = 1 − ϕ1B − ... − ϕpB^p and θq(B) = 1 − θ1B − ... − θqB^q are the autoregressive and moving average polynomials, (1 − B)^d is the dth-order nonseasonal differencing, and B is the backshift operator, i.e., B DO(t) = DO(t − 1). The striking feature of the ARIMA model is that it uses inline differencing to stationarize the time series. After calculating the model parameters, historical values are used to forecast one-step-forward values.
Stationarity is defined as the stability of statistical parameters. Each time series is composed of periodic patterns, steps, gradual upward or downward changes, and a stochastic part. The presence of any of the first three components (periodic patterns, steps, gradual trends) in a time series, alone or simultaneously, causes the time series to be nonstationary. Therefore, the series should be made stationary by one of the stationarizing methods. In the Diff method, each data point is subtracted from the previous one, so trends and seasonal patterns in the series are eliminated [24]:

∇DO(t) = DO(t) − DO(t − 1)

Standardization (Std) is a preprocessing method used to rescale the data distribution so that the observational data have a mean of zero and a standard deviation of one. Standardization eliminates the trend and removes jumps in the variance (based on the variance before and after the jump) [25].
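The two transforms above can be sketched in a few lines; the DO values below are illustrative placeholders, not measured data.

```python
# Minimal sketch of the two stationarizing transforms described above:
# first-order differencing (Diff) and standardization (Std).
import numpy as np

do = np.array([8.2, 8.5, 9.1, 9.6, 9.9, 9.4, 8.8, 8.3])  # illustrative DO values

# Diff method: subtract each value from the previous one.
diff = np.diff(do)                    # DO(t) - DO(t-1)

# Standardization (Std): zero mean, unit standard deviation.
std = (do - do.mean()) / do.std()

print(diff)
print(std.mean(), std.std())
```

Note that differencing shortens the series by one point, while standardization preserves its length and only rescales it.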
Periodic patterns can be represented by sinusoidal functions. Therefore, by transforming the time series to the frequency domain, these patterns can be removed from the DO time series. The Fourier expansion is employed to perform the spectral analysis (Sf), by which the repetitive patterns in the time series can be formulated and subtracted from the DO time series [26]. The remaining part is closer to being stationary.
The periodic term is written as

P(t) = DO̅ + Σh [αh cos(2π fh t) + βh sin(2π fh t)], h = 1, ..., k

where DO̅ is the average of the DO data points, ε(t) are the residuals, αh and βh are the Fourier coefficients, fh is the hth harmonic of the base frequency, and k is the maximum harmonic. By subtracting the periodic term P(t) from the DO series, the Sf series is obtained:

Sf(t) = DO(t) − P(t)

The normal distribution is the basic assumption of statistical models. Therefore, the distribution of the series should be evaluated prior to modeling. Many visual and numerical tests are available to assess the distribution of the DO time series. The Jarque-Bera (JB) test is one of many numerical tests that evaluates normality based on kurtosis and skewness [27]. In the absence of a normal distribution, normalization transformations are used, here the method developed by Manly [28]. This transformation is a developed form of the Box-Cox (BC) transformation [29] and covers the former method's shortcomings: unlike the BC transform, it can transform time series with both positive and negative domains and has more potential for normalizing data.
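A hedged sketch of the Sf idea follows: the dominant sinusoidal component is fitted by least squares and subtracted, leaving a residual series closer to stationary. The 365-day period mimics the seasonal DO cycle, the single harmonic is a simplification of the full Fourier expansion, and the data are synthetic.

```python
# Spectral-analysis (Sf) sketch: fit the base-frequency cosine/sine terms
# by least squares and subtract the fitted periodic term P(t).
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(730.0)                       # two "years" of daily data
do = 9.0 + 2.0 * np.sin(2 * np.pi * t / 365.0) + rng.normal(0, 0.3, t.size)

# Design matrix: mean plus one harmonic at the base frequency f = 1/365.
f = 1.0 / 365.0
X = np.column_stack([np.ones_like(t),
                     np.cos(2 * np.pi * f * t),
                     np.sin(2 * np.pi * f * t)])
coef, *_ = np.linalg.lstsq(X, do, rcond=None)

periodic = X @ coef                        # fitted periodic term P(t)
sf = do - periodic                         # Sf series: DO(t) - P(t)

print(do.var(), sf.var())                  # variance drops once P(t) is removed
```

Additional harmonics would be added as extra cosine/sine columns when more than one significant periodicity is present.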
The transformation parameter λ takes values in the range −5 ≤ λ ≤ 5, and the transform has the form y = (exp(λx) − 1)/λ for λ ≠ 0 and y = x for λ = 0. Any predictable and repetitive pattern in a time series is evidence of nonstationarity. The stationarity condition is the base assumption in stochastic modeling and can be evaluated by various tests, such as the augmented Dickey-Fuller (ADF) test [30] and autocorrelation function graphs. The stationarity evaluation in the ADF test is done by fitting a regression line to the DO time series and determining the roots of the fitted equation. In this test, the stationarity assumption is rejected when the probability corresponding to the test statistic (P_ADF) is higher than the significance level of 5%.
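The Manly transform can be sketched as below. Choosing λ by minimizing the absolute skewness of the transformed series is one simple selection strategy, used here for illustration; the gamma-distributed stand-in data are synthetic.

```python
# Sketch of the Manly exponential normalization transform:
# y = (exp(lam * x) - 1) / lam for lam != 0, and y = x for lam = 0.
# Unlike Box-Cox, it also accepts negative input values.
import numpy as np

def manly(x, lam):
    return x if lam == 0 else (np.exp(lam * x) - 1.0) / lam

def skewness(y):
    z = (y - y.mean()) / y.std()
    return (z ** 3).mean()

rng = np.random.default_rng(1)
do = rng.gamma(shape=2.0, scale=1.5, size=2000)   # right-skewed stand-in data

# Grid search for lam in [-5, 5] minimizing |skewness| (illustrative criterion).
lams = np.linspace(-5, 5, 201)
best = min(lams, key=lambda lam: abs(skewness(manly(do, lam))))
y = manly(do, best)

print(best, skewness(do), skewness(y))
```

In practice the JB statistic (kurtosis plus skewness) would be the selection criterion, and the chosen λ must be stored so forecasts can be back-transformed.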
Gradual, oriented changes in the DO time series can be assessed by the Mann-Kendall test [31]. This test can determine gradual changes in a time series that may occur alternately and seasonally by considering both seasonal and nonseasonal parameters, leading to a seasonal or nonseasonal trend in the time series. The trendless DO time series assumption is rejected when the probability of the test statistic is lower than the significance level of 5%.
Steps in the DO time series can be created by human or natural factors and are distinguished by sudden ascending or descending patterns in the time series. These sudden changes are easily observed from the raw values of the time series. A numerical method to identify them is the Mann-Whitney (MW) test [32]. This test evaluates steps in the DO time series by ordering the data points, dividing them into subclasses, and comparing the groups. A corresponding probability lower than the significance level α = 1% rejects the step-less DO time series assumption.
One of the easiest and most used tools to check the periodicity in the DO time series is the correlogram (ACF). With the help of the correlogram, this property can visually be observed in the time series in the form of alternating variations in distinct and sinusoidal intervals. Alternatively, the numerical Fisher's test [33] can be utilized to evaluate the periodicity in the DO time series. The test, similar to the spectral analysis, uses the Fourier coefficients to assess the significant periodic patterns. For the significance level of 5%, the degrees of freedom in the denominator of the test equal 3 and the test statistics higher than 3 depict periodicity in DO time series.
To study the techniques of water quality data preprocessing and their modeling methods, two scenarios are specified based on the flowchart in Figure 1. In the first scenario, the data are only normalized; then, using different tests and differencing, the possibility of modeling with other stochastic models is considered. In the second scenario, the data are stationarized after normalization, and the feasibility of modeling with different linear methods is investigated. The modeling and preprocessing methods are compared based on various indices.
The indices used in evaluating the DO time series include the conventional coefficient of determination (R2), variance accounted for (VAF), scatter index (SI), root mean squared error (RMSE), and mean absolute error (MAE). The corrected Akaike information criterion (AICc) is used to choose a parsimonious ARIMA model. Theil's coefficients and the Nash-Sutcliffe model efficiency coefficient (EN-S) [34][35][36] are used to assess the quality of the modeling as well. The Theil test performance is based on the accuracy of forecasting (denoted UI) and the forecasting quality (denoted UII). Using the AICc, UI, and UII indices, superior models can be identified for each time series; the lower these indices are, the more parsimonious and accurate the model is. Conversely, the higher the EN-S is, the better the model is.
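A few of these indices can be sketched directly from their standard definitions; the observed/simulated arrays below are illustrative, and the Theil coefficient shown is the common U_I form (other variants exist).

```python
# Sketch of the forecast-evaluation indices: RMSE, MAE,
# Nash-Sutcliffe efficiency (EN-S), and Theil's U_I coefficient.
import numpy as np

def rmse(obs, sim):
    return float(np.sqrt(np.mean((obs - sim) ** 2)))

def mae(obs, sim):
    return float(np.mean(np.abs(obs - sim)))

def nash_sutcliffe(obs, sim):
    # 1 means a perfect model; 0 means no better than the observed mean.
    return float(1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2))

def theil_u1(obs, sim):
    # Theil's U_I: 0 means perfect forecasts, larger values are worse.
    num = np.sqrt(np.mean((obs - sim) ** 2))
    den = np.sqrt(np.mean(obs ** 2)) + np.sqrt(np.mean(sim ** 2))
    return float(num / den)

obs = np.array([8.0, 8.4, 9.0, 9.3, 8.9])   # illustrative observed DO
sim = np.array([8.1, 8.3, 9.1, 9.2, 9.0])   # illustrative forecasted DO

print(rmse(obs, sim), mae(obs, sim), nash_sutcliffe(obs, sim), theil_u1(obs, sim))
```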
In stochastic modeling, the Ljung-Box (lbq) test [37] is one of the popular methods to investigate a model's adequacy by testing the model's residuals. If the probability corresponding to the lbq test statistic in the χ2 distribution is less than the 5% significance level (P_lbq < 5%), the model is not adequate, and improvements in the preprocessing and/or modeling can take place to obtain better results.


Study Site
The case study site to evaluate the ARIMA model is located in the Credit River Watershed, which is approximately 1000 km2 in size and drains to Lake Ontario, Canada. The upper and middle areas of the watershed are predominantly rural and include large portions of the Town of Orangeville, Town of Erin, Town of Halton Hills, and Town of Caledon. The lower parts of the watershed, however, are mainly urbanized and include the City of Brampton, Town of Oakville, and City of Mississauga. The spatial distribution of water quality trends in this basin is driven by land use, with the lower tributaries being impacted by urbanization and the resulting point and nonpoint pollution [38]. The dissolved oxygen (DO) data for this study were obtained from two water quality monitoring stations in the Credit River. The first station (DO I) is situated at Old Derry Road at 43°37′19.1′′ N 79°44′00.1′′ W, and the second (DO II) is located 20.2 km downstream at the Mississauga Golf and Country Club, with the coordinates 43°33′16.7′′ N 79°37′13.0′′ W. Water quality was monitored using a Hydrolab DS5X multiparameter sonde at each location. DO was measured using a Hach luminescent dissolved oxygen sensor with an accuracy of ±0.1 mg/L at <8 mg/L and ±0.2 mg/L at >8 mg/L, with a resolution of 0.01 mg/L. Sensor data were polled every 15 min and transferred via SODA telemetry to a central database. For DO I, the period of record is from 20 February 2010 to 11 December 2016, i.e., 2337 days (221,819 data points, after removing errors and interpolating missing data). For DO II, the period of record is from 6 October 2011 to 11 December 2016, i.e., 1865 days (177,819 data points). Figure 3 shows the seasonal fluctuations of the DO data during the 7 years of monitoring. In the preliminary investigation of the data, some missing data were observed. The missing data were reconstructed [40] using the linear interpolation method.
After data reconstruction, the data were averaged daily to obtain the daily average DO index. The data were divided into two parts: one for testing and one for the training of the model, with a ratio of 30% and 70%, respectively. The statistical specification of the data at both stations is presented in Table 1.
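The data-preparation step above (daily averaging of the 15-min record, then a 70/30 train/test split) can be sketched with pandas; the dates, record length, and DO values below are synthetic stand-ins.

```python
# Sketch of the data-preparation step: daily-averaging a 15-min DO record
# and splitting it 70/30 into training and testing sets.
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
idx = pd.date_range("2010-02-20", periods=4 * 24 * 30, freq="15min")  # 30 days
do_15min = pd.Series(9.0 + rng.normal(0, 0.5, idx.size), index=idx)

daily = do_15min.resample("D").mean()        # daily average DO index

n_train = int(0.7 * len(daily))              # 70% training, 30% testing
train, test = daily.iloc[:n_train], daily.iloc[n_train:]

print(len(daily), len(train), len(test))
```

Missing 15-min values would be filled first (e.g., with `Series.interpolate(method="linear")`, matching the linear interpolation used in this study) so that the daily means are not biased by gaps.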

Diurnal and Seasonal DO Fluctuations
Nbr., Number of data; Min. and Max., Minimum and Maximum of data; 1st Q. and 3rd Q., first and third Quartiles; σ(n), Standard Deviation; γ1, Skewness; γ2, Kurtosis.
The time series was normalized using the Manly transformation to meet the modeling requirements. After normalization, the data were analyzed by the JB test. It was found that the normalization transformation had little effect on the time series: the value of the test statistic was not lower than the critical value, although the series was close to the normal distribution. Therefore, the preprocessing of the time series was continued. In the next step, the time series were stationarized using the standardization and spectral analysis methods. The results are shown in Table 2. The periodicity reduction in the ACF chart of the time series can be observed in Figure 4. The spectral analysis method was able to significantly reduce the periodicity of the DO II series, while the results of all three preprocessing approaches are close for DO I. First-order differencing in the ARIMA model was also used to investigate the stationarizing impact of the model on the time series. Table 3 demonstrates that all preprocessed time series are stationary, and the probability of the ADF test statistic for all of them is below 5%. All deterministic terms in the series have been significantly reduced.
The results of the tests applied to the normalized and standardized series are very similar for the preprocessing methods at each station. The spectral analysis method differs slightly from the other two methods. The complete elimination of the periodicity in the subtracted series is shown in the ACF graphs in Figure 5. In addition to eliminating seasonal fluctuations, nonseasonal correlations have also been eliminated, and the preprocessed time series are completely stationary from the same first lag. The differencing resulted in an improvement over the values in Table 3.


Modeling Results
After preprocessing, the correlograms showed that the primary lag was completely damped, so the maximum required orders for ARIMA modeling are of the first order. However, to investigate the effect of increasing the order of the parameters, orders up to 10 were considered. Thus, the parameters p = q = {0, 1, 2, ..., 9, 10} and the differencing order d = 1 were considered. As a result, by varying the p and q parameters, 99 models were developed for each time series, and in general, for all six cases, 6 × 99 = 594 stochastic models were developed and reviewed. The series modeling results are presented in Table 4. As the results are very close to each other and the errors are very small, the values in Table 4 are multiplied by 100 for better presentation, except for the AICc index. It is observed that the coefficient of determination for all series is above 94% and very close across series. Furthermore, the error and model-evaluation indices are also very close. In the preprocessed series from the first station, the values of the model-evaluation indices are almost identical for the Norm and Sf methods and, with a slight difference, better than Std. Therefore, considering the fewer parameters (AICc = −999.57) of the Std method and, consequently, the simplicity of the model, Std is chosen as the superior preprocessing method for the DO I station.
In the time series modeling of the DO II station, the Norm and Std methods obtained close values and are slightly better than the Sf results. Considering the effect of the Sf method on reducing the periodicity in the DO II series in the ACF diagram of Figure 4, this method could be selected as superior by a slight margin; however, with an insignificant difference, the Std model results were better. The AICc index for Std equals −756.846, which is slightly better than that of the Sf method. Broadly, Std can be chosen as the superior method for the second station as well. Figures 6-8 examine the existence of significant periodicity in the model residuals and the modeling accuracy [41,42]. Figure 6 shows the cumulative periodogram of the residuals of the ARIMA models. The cumulative residual periodogram (CP) values of the modeled series are within the 1% Kolmogorov-Smirnov confidence intervals. This means that the periodicity of the model residuals has been eliminated.
Figure 7 graphically shows the results of the Ljung-Box test, which is used to check the independence of the residuals. Test values higher than 5% indicate the independence of the sample. Residual independence is one of the postmodeling assumptions in stochastic modeling, and this evaluation is performed to assess the adequacy of the stochastic models; if significant correlations remain in the residuals, the modeling process should be reviewed. In this figure, the results of the independence test for both stations' data are presented. Using the daily time series, the first 365 primary nonseasonal lags (the first primary seasonal lag) of the residuals from the ARIMA models were evaluated by the lbq test. Since the results of the three methods for each station are very close, the corresponding lbq test results overlap. For Station II (DO II) model residuals, total independence is observed. At the first station, however, some negligible correlation is seen in the first lags for all three methods. This correlation might be due to periodicity in the DO I time series. Nevertheless, as the lbq values for the correlated lags are close to the 5% significance line, the correlations disappear after a few lags, and the rest of the residual series is independent, it can be concluded that the models are adequate for the DO I time series as well. A slight decline in independence is also seen in the initial lags of the model residuals for the DO II series, yet all the residuals of those models are independent. To study the modeling accuracy and the correlation of the modeled values, scatter plots of the data are presented in Figure 8. The accuracy of the ARIMA stochastic models at the DO II station is well illustrated: these values are within the Chi-square 95% confidence ellipse and close to the regression line.
However, at the DO I station, interpretation is difficult due to outlier data; therefore, another method is required that is not affected by outliers. The outlier values can be caused by natural events or human interventions in the watershed, which suddenly change the values of the water quality indices. In addition, errors in the sensor measurements can also produce outlier data [43,44]. Figures 9 and 10 show the Taylor [45] diagram and the forecast plots of the DO time series vs. the linear stochastic models, respectively. The Taylor diagram simultaneously compares the standard deviation, the centered root mean square difference (RMSD), and the correlation coefficient. Figure 9 shows that the model accurately replicates the statistical characteristics of the original data [44][45][46]. In the Taylor diagram, the proposed linear methodology gave very close results for both stations, except that at Station DO II the Sf method forecasted all DO data with a lower standard deviation. Alongside its other characteristics, the stochastic model should preserve the statistical attributes of the original DO data [46].
Figure 10. The forecast plots of the modeled DO time series at both DO Stations I and II for the superior method.
Thus, Table 5 presents the descriptive statistics of the forecasted DO time series vs. the observed ones. The forecasted DO data closely match the observed data at both stations. Figure 10 illustrates the superior model vs. the observed series; the capability of the one-step-forward stochastic model in forecasting the DO time series can be seen in the figure, as the method correctly forecasted the fluctuations in the original series. Min. and Max., Minimum and Maximum of data; 1st Q. and 3rd Q., first and third Quartiles; σ(n), Standard Deviation; γ1, Skewness; γ2, Kurtosis.
The DO time series follows a predictable diurnal cycle (Figure 11), which is driven by both water temperature and the mass balance between DO sources (reaeration and production by photosynthesis) and DO consumption (by biochemical oxygen demand (BOD) and plant respiration). This cycle is closely tied to the solar cycle and can be modeled using a half-sinusoid function [47,48]. The methods that we have applied to the daily average DO may also be used to predict the minimum and maximum DO. The subhourly time series may then be reconstructed assuming that the timing of the minimum and maximum DO values is fixed relative to sunset and sunrise. This is illustrated in Figure 11, where the estimated daily minimum based on linear regression is shown compared to the original time series and the daily average. Comprehensive DO water quality metrics focus on preventing both acute and chronic effects by considering a lower limit on the instantaneous value, as well as limits based on a long-term average [49][50][51][52][53]. The average seven-day minimum is a commonly used metric for chronic exposure to low DO [50,52,53] and can be calculated simply from the above-mentioned linear relationship between the daily average and the daily minimum. This method produces results with acceptable error during the summer period between June and September, when the lowest DO values occur seasonally (RMSE = 0.41 mg/L, MAE = 0.31 mg/L for DO I; RMSE = 0.37 mg/L, MAE = 0.29 mg/L for DO II).
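The daily-minimum estimation step can be sketched in a few lines: fit a linear regression of daily minimum on daily average, then take a seven-day running mean of the estimated minima (one common reading of the average seven-day minimum metric). The data and the resulting coefficients below are illustrative, not the fitted Credit River values:

```python
import numpy as np

# mock daily average and daily minimum DO (mg/L)
daily_avg = np.array([9.1, 8.8, 8.5, 8.3, 8.0, 7.9, 8.2, 8.6, 9.0, 8.7])
daily_min = np.array([8.2, 7.9, 7.6, 7.4, 7.0, 6.9, 7.2, 7.7, 8.1, 7.8])

# least-squares fit: daily_min ≈ a * daily_avg + b
a, b = np.polyfit(daily_avg, daily_min, 1)
est_min = a * daily_avg + b

# average seven-day minimum: 7-day running mean of estimated daily minima
window = 7
seven_day = np.array([est_min[i:i + window].mean()
                      for i in range(len(est_min) - window + 1)])

# error of the regression against the true daily minima
rmse = np.sqrt(np.mean((est_min - daily_min) ** 2))
mae = np.mean(np.abs(est_min - daily_min))
```

In the paper's setting the same two error measures (RMSE and MAE) are what yield the 0.41/0.31 and 0.37/0.29 mg/L summer figures quoted above.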

Conclusions
In this study, the dissolved oxygen (DO) time series of two stations at different geographical points along the Credit River was used to examine the preprocessing and one-step-ahead stochastic modeling techniques. The data were measured in 15 min intervals at two different time periods. Initially, using the linear interpolation method, missing data were reconstructed and then averaged daily to obtain the daily DO values.
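The first preprocessing step (linear interpolation of missing 15-min readings, followed by daily averaging) can be sketched with pandas; the dates and the sinusoidal mock record are illustrative:

```python
import numpy as np
import pandas as pd

# mock 15-min DO record over three days, with a simulated sensor dropout
idx = pd.date_range("2020-06-01", periods=4 * 24 * 3, freq="15min")
rng = np.random.default_rng(1)
do = (8.0 + np.sin(np.arange(len(idx)) * 2 * np.pi / 96)   # diurnal cycle
      + 0.1 * rng.standard_normal(len(idx)))               # sensor noise
series = pd.Series(do, index=idx)
series.iloc[10:14] = np.nan                                # missing data

filled = series.interpolate(method="linear")               # reconstruct gaps
daily = filled.resample("D").mean()                        # daily average DO
```

Daily averaging after gap filling gives the regularly spaced series that the subsequent normality and stationarity tests require.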
The initial premises of stochastic methods are the normal distribution of the time series and their stationarity. The Jarque-Bera (JB) test was used to determine the distribution of the series. The stationarity of both series was studied in two steps. In the first step, general stationarity was examined using the augmented Dickey-Fuller (ADF) test. In the second step, the Mann-Whitney, Mann-Kendall, and Fisher tests were used to examine the factors causing the nonstationarity of the series. It was observed that both time series are nonstationary and non-normally distributed. The nonstationarity of the time series is due to the existence of a trend, jump, and periodicity. After scrutinizing the test results, the ACF and PACF graphs were considered for modeling. In the first scenario, the data were normalized and then stationarized using the first-order differencing operator in ARIMA.
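Two of these checks are easy to reproduce: scipy provides the Jarque-Bera normality test directly, and the Mann-Kendall trend statistic can be computed from the signs of all pairwise differences. The sketch below uses a synthetic trending series (the ADF test, available as `adfuller` in statsmodels, is omitted to keep the example dependency-light):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# mock DO series with a deliberate upward trend
x = 8.0 + 0.01 * np.arange(300) + 0.3 * rng.standard_normal(300)

# Jarque-Bera test: H0 = the sample is normally distributed
jb_stat, jb_p = stats.jarque_bera(x)

# Mann-Kendall S statistic: sum of signs of all pairwise differences
n = len(x)
s = sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
var_s = n * (n - 1) * (2 * n + 5) / 18.0      # variance assuming no ties
z = (s - np.sign(s)) / np.sqrt(var_s)         # continuity-corrected z-score
has_trend = abs(z) > 1.96                     # two-sided test at the 5% level
```

A significant Mann-Kendall z-score flags the trend component as one cause of nonstationarity, matching the paper's finding that trend, jump, and periodicity all contribute.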
In the second scenario, the series was stationarized after normalization and then modeled with ARIMA. Initially, the daily data were normalized by the Manly transformation (Norm). This transformation was unable to fully normalize the distribution of either time series but brought both distributions closer to normal. The values were then stationarized by differencing, standardization (Std), and spectral analysis (Sf). It was found that the Std and Sf methods, which were used to stationarize the series after normalization, are only able to reduce the periodicity in the series and practically cannot stationarize them alone. However, using the differencing operator in the stochastic ARIMA model, the preprocessed series were stationarized. According to the results of numerous tests, the values of the stationarity and of the elimination of deterministic terms are very close to each other. After modeling the preprocessed time series using the stochastic ARIMA model, the results showed that the methods performed similarly to one another and were very accurate. The Std method was the superior time series preprocessing method, outperforming the other methods at both stations.
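The Norm → Std → differencing pipeline can be sketched end to end with numpy. The Manly (1976) exponential transformation is y = (e^{λx} − 1)/λ for λ ≠ 0; the λ value, the seasonal grouping, and the closing AR(1) least-squares fit (a hand-rolled stand-in for a full ARIMA(1,1,0) fit) are all illustrative assumptions, not the paper's fitted values:

```python
import numpy as np

def manly(x, lam):
    """Manly (1976) exponential transformation toward normality."""
    x = np.asarray(x, dtype=float)
    return x if lam == 0 else (np.exp(lam * x) - 1.0) / lam

rng = np.random.default_rng(3)
t = np.arange(730)                      # two mock years of daily DO
do = 9.0 + 2.0 * np.sin(2 * np.pi * t / 365) + 0.3 * rng.standard_normal(730)

y = manly(do, lam=0.1)                  # Norm step (lambda is illustrative)

# Std step: standardize within seasonal groups to damp the annual cycle
grp = (t // 30) % 12                    # ~monthly groups
mu = np.array([y[grp == g].mean() for g in range(12)])
sd = np.array([y[grp == g].std() for g in range(12)])
z = (y - mu[grp]) / sd[grp]

# ARIMA(1,1,0) sketch: first-difference, then AR(1) by least squares
dz = np.diff(z)
phi = np.dot(dz[:-1], dz[1:]) / np.dot(dz[:-1], dz[:-1])
forecast_next = z[-1] + phi * dz[-1]    # one-step-ahead forecast
```

In practice one would fit λ by maximizing a normality criterion and use a full ARIMA implementation (e.g., statsmodels) with orders chosen from the ACF/PACF graphs, but the pipeline order (normalize, deseasonalize, difference, fit) is the point of the sketch.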
Dissolved oxygen is one of many important water quality parameters measured at real-time water quality stations and is a crucial parameter for assessing stream habitat suitability. This preprocessing approach can be used for other water quality parameters, such as temperature, which exhibit similar periodicity and trends, and can be applied to other models. For example, the genetic algorithm (GA)-optimized long short-term memory (LSTM) water temperature model developed by Stajkowski et al. [4] can be combined with this technique to increase accuracy. The main goal of developing accurate models of key water quality parameters is the forecasting and assessment of the impact of disturbances (anthropogenic, such as pollution, or climatic, such as climate change) on aquatic habitat suitability and, therefore, the health of aquatic species. There is a strong linear relationship between the daily average DO and the daily minimum DO, which allows water quality metrics such as the average seven-day minimum DO to be predicted from this model.
It can finally be concluded that, with the correct selection of preprocessing and stochastic methods, which are reliable, simple, and well understood, it is possible to accurately model water quality indices. This is in contrast to artificial intelligence methods, which are black boxes in which the connection between the model structure and the time series components (both deterministic and stochastic) is not explicit. Additionally, the preprocessing methods used in this study can be applied in combination with artificial intelligence models, including hybrid models such as ARIMA-ANN, ARIMA-long short-term memory, and ARIMA-ELM, to improve forecasting accuracy. Moreover, by utilizing other stationarization methods, such as the advanced exponential smoothing method, the impacts of these methods on DO time series preprocessing can be investigated.