Improved Deep Learning Predictions for Chlorophyll Fluorescence Based on Decomposition Algorithms: The Importance of Data Preprocessing

Abstract: Harmful algal blooms (HABs) have been deteriorating water bodies worldwide, and accurately predicting algal dynamics with models remains a challenging research area. High-frequency monitoring and deep learning technology have opened up new horizons for HAB forecasting. However, the non-stationary and stochastic processes behind algal monitoring data largely limit prediction performance and the early warning of algal blooms. Through an analysis of the published literature, we found that decomposition methods are widely used in time-series analysis for hydrological processes, whereas predictions of ecological indicators have received less attention due to their inherent fluctuations. This study explores and demonstrates the predictive enhancement for chlorophyll fluorescence data obtained by coupling three decomposition algorithms with conventional deep learning models: the convolutional neural network (CNN) and long short-term memory (LSTM). We found that the decomposition algorithms can successfully capture the time-series patterns of chlorophyll fluorescence concentrations. The results indicate that decomposition-based models improve on the accuracy of single models in predicting chlorophyll concentrations, with improvement percentages ranging from 25.7% to 71.3% in RMSE, from 28.3% to 75.7% in MAE, and from 14.8% to 34.8% in R². In addition, the comparison of decomposition methods suggests that singular spectral analysis is superior to the wavelet transform and empirical mode decomposition models in hourly predictive tasks for chlorophyll fluorescence. Overall, while decomposition methods come with their respective strengths and weaknesses, they are efficient in combination with deep learning models in dealing with high-frequency monitoring data of chlorophyll fluorescence.
We also suggest that model developers pay more attention to online data preprocessing and conduct comparative analyses to determine the best model combinations for forecasting algal blooms and water management.


Introduction
Harmful algal blooms (HABs) have become a severe worldwide environmental problem by releasing excess toxins, which can have detrimental effects on aquatic ecosystems and endanger human health [1]. The timely and accurate prediction of HAB occurrence and intensity is essential in controlling their detrimental environmental and public health effects [2]. Continuous and high-frequency monitoring technologies are widely applied in HAB monitoring. For instance, flow cytometry analysis [3], hyperspectral imagery [4], and unmanned aerial vehicles [5] can monitor the real-time distribution of HABs. In particular, chlorophyll-a (Chla) is a crucial parameter for characterizing phytoplankton communities, making it a commonly employed diagnostic pigment in measurements [2]. Chlorophyll fluorescence sensors can provide fast, cost-effective data with high temporal resolution to train advanced models for HAB predictions [6]. However, high-resolution time-series data often exhibit stochastic, non-stationary distributions, owing to environmental drivers interactively influencing the formation of HABs [7]. Therefore, additional modelling efforts are required to overcome the challenges associated with HAB forecasting.
In recent years, data-driven models have gained wider usage for forecasting algal blooms in inland water [8]. Machine learning algorithms, including neural networks [9], evolutionary computation [10], support vector machines (SVMs) [11], random forests [12], and gradient boosting machines [13], are well known to be helpful in predicting HABs. However, the non-linear and intermittent processes behind the algal monitoring data hinder the performance of machine learning in the accurate forecasting of early-stage blooms [14]. With evolving artificial intelligence, deep learning (DL) models have received increasing attention in HAB forecasts [15]. For instance, the independent recurrent neural network (RNN) [16], long short-term memory (LSTM) [17], and gated recurrent unit (GRU) [18] have been increasingly used to forecast HABs [19,20]. Some researchers have also demonstrated the effectiveness of image-based convolutional neural networks (CNNs) in modelling HABs [21,22]. Nonetheless, deep learning requires large-scale monitoring data to train models. In fact, online monitoring data from HAB sensors often contain abnormal values, peak values, and error components with irregular random movements, resulting in inherently non-linear, complex, and non-stationary algal time series [23]. Therefore, purely data-driven methodologies prove unsatisfactory in addressing the high variations in algal dynamics [24]. In this case, choosing an appropriate data preprocessing procedure might be essential to increase the forecasting accuracy of the predicted algal parameters [25,26].
Decomposition-based frameworks have been demonstrated in an increasing number of studies to extract the dynamic features of time-series data and enhance models' predictive performance [27]. In contrast to conventional time-series models, decomposition-based frameworks divide the time series into components of varying frequencies, predict each component individually with deep learning models, and sum the component predictions to obtain the final result. Among decomposition algorithms, the wavelet transform (WT) and empirical mode decomposition (EMD) are two of the most commonly used methods [28]. For example, Liu et al. [26] proposed a hybrid prediction model combining WT and LSTM, which decomposes the original algal parameter series into multiple sub-series using the wavelet transform and then applies LSTM to the sub-series components. Zhu et al. [29] reported a hybrid EMD-LSTM model with an attention mechanism and indicated that EMD can effectively enhance the smoothness of the time series and increase predictive accuracy. In addition, Luo et al. [30] developed an ensemble empirical mode decomposition (EEMD)-LSTM model to predict water quality. Apaydin et al. [31] investigated singular spectral analysis (SSA) with LSTM to increase the accuracy of monthly streamflow prediction. Following that, integrated SSA and genetic-based models were developed for river flow forecasting and showed improved accuracy [32]. To date, no reported study has compared these decomposition-based methods in dealing with high-frequency water-quality data, especially for chlorophyll fluorescence monitoring.
In this study, we evaluate and compare the performance of different hybrid approaches that couple WT, EEMD, and SSA decomposition for extracting the sub-series components of Chla data, along with the deep learning approaches for predicting HABs. The Chla data are obtained from a substantial volume of in situ multi-sensor monitoring data in Lake Dianchi, China. Specifically, the goals of this paper are as follows: (1) to obtain better prediction performance of the Chla fluorescence by combining the WT, EEMD, and SSA decomposition approaches; (2) to develop decomposition-based hybrid models and evaluate the prediction effectiveness of the models; and (3) to further compare the prediction performance of different decomposition approaches for chlorophyll fluorescence forecasting. This study demonstrates that the decomposition-based Chla prediction methods could function as a robust and trustworthy tool for forecasting HABs in water management.

Study Sites and Data
The study area is Lake Dianchi (24°40′-25°02′ N, 102°36′-102°47′ E), Kunming, Yunnan, in southwest China (Figure 1). Lake Dianchi is a well-known plateau freshwater lake with a surface area of 330 km², an average depth of 4.4 m, a maximum water depth of 6.7 m, and a watershed area of 2920 km² [33]. Additionally, an artificial causeway, Haigeng Dam, divides the lake into two parts, Caohai in the north and Waihai in the south, covering 286.78 km² [34]. Over recent decades, this eutrophic lake has frequently experienced intense HABs and has gradually developed some of the heaviest cyanobacterial blooms among Chinese lakes [35]. Due to the low water flow and high pollutant concentration, cyanobacterial blooms are a frequent occurrence in Caohai, which is located closer to the urban area. This study collected the average values of the Chla data of the Duanqiao and Caohai center sampling sites through continuous monitoring every four hours from 1 February 2019 to 8 January 2021 (Table 1). The two online monitoring sites in Caohai, Lake Dianchi, belong to the sections managed by the Kunming Municipal Environmental Monitoring Center, Yunnan, China. The first 50% of the sequence (February 2019-January 2020) was used to train the hybrid models. The remaining data (January 2020-January 2021) were used to test the model performance.

Decomposition-Based Deep Learning Model Development
In this section, we present the multi-decomposition architecture (Section 2.2.1) and introduce the wavelet transformation analysis (Section 2.2.2), ensemble empirical mode decomposition (Section 2.2.3), and singular spectral analysis (Section 2.2.4). We further illustrate convolutional neural networks (Section 2.2.5) and LSTM (Section 2.2.6).

The Multi-Decomposition Architecture
This study established and compared the performance of three decomposition-based models in forecasting the Chla at the Caohai center and Duanqiao sites of Lake Dianchi. A technical flowchart of this study is presented in Figure 2. Firstly, the original Chla series were decomposed into sub-series by the multi-decomposition process. Then, the sub-series were input into deep learning models to be trained and validated one by one. Finally, all the individual forecasted sub-series were summed to derive the predicted Chla sequence.
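The decompose, forecast, and sum workflow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: `moving_average_split` is a toy two-component decomposition standing in for WT, EEMD, or SSA, and `persistence_forecast` is a placeholder for the trained CNN or LSTM; both names are hypothetical.

```python
import numpy as np

def moving_average_split(series, window=5):
    """Toy decomposition (a stand-in, NOT WT/EEMD/SSA): trend plus remainder."""
    series = np.asarray(series, dtype=float)
    kernel = np.ones(window) / window
    # Edge-pad so the smoothed trend has the same length as the input.
    padded = np.pad(series, (window // 2, window - 1 - window // 2), mode="edge")
    trend = np.convolve(padded, kernel, mode="valid")
    return [trend, series - trend]

def persistence_forecast(component):
    """Placeholder one-step forecaster: predict each value as the previous one."""
    return np.asarray(component)[:-1]

def hybrid_forecast(series, decompose, forecast):
    """Decompose the series, forecast every sub-series, and sum the forecasts."""
    components = decompose(series)
    per_component = [forecast(c) for c in components]
    return np.sum(per_component, axis=0)

chla = np.array([3.0, 4.0, 6.0, 9.0, 7.0, 5.0, 4.0, 6.0])
prediction = hybrid_forecast(chla, moving_average_split, persistence_forecast)
```

In a real hybrid model each sub-series would get its own trained network; the sketch only shows how the pieces connect.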

Wavelet Transformation Analysis
The wavelet transformation (WT) analysis method is a useful mathematical tool from signal analysis theory in physics and engineering [36,37]. By decomposing the original signal into several sub-components in different time-frequency spaces, the WT can analyze non-stationary data and extract the time and frequency features of the original time series simultaneously. The sub-sequences are derived from a template referred to as the "mother wavelet", and the decomposed wavelets are scaled and translated versions of it. The advantage of wavelet analysis lies in the flexible selection of a mother wavelet that matches the specific characteristics of the investigated time series. However, determining scale and translation parameters for every possible position requires significant computational effort when using a continuous wavelet transformation (CWT).
In contrast, the discrete wavelet transformation (DWT) substantially reduces the computational cost of wavelet transformations by adopting dyadic scales and positions, typically based on powers of two [26]. The DWT of a time series, $f(t)$, is typically carried out as follows:

$$\Psi_{a,b}(t) = m_0^{-a/2}\,\Psi\!\left(\frac{t - b\,n_0\,m_0^{a}}{m_0^{a}}\right)$$

$$W_f(a,b) = \int_{-\infty}^{+\infty} f(t)\,\Psi^{*}_{a,b}(t)\,dt$$

where the integers $a$ and $b$ represent the decomposition level and translation factor, respectively; the constant $m_0$ is the decomposition scale factor; the constant $n_0$ is the position factor of translation; $\Psi^{*}_{a,b}(t)$ is the (complex-conjugated) wavelet function; $\Psi(t)$ is the mother wavelet, which can be set as the "Daubechies", "Haar", or "Morlet" wavelet; and $W_f(a,b)$ are the DWT coefficients. The DWT employs high-pass and low-pass filters to decompose the original time series, $f(t)$, into different resolution levels, yielding a low-frequency approximation sub-sequence ($A_n$) and high-frequency detail sub-sequences ($D_1, D_2, \ldots, D_n$).
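The filter-bank view of the DWT can be illustrated with the Haar wavelet, the simplest mother wavelet mentioned above. The sketch below is an assumption-laden illustration, not the study's code: it uses Haar filters, and the helper names (`haar_dwt_level`, `haar_wavedec`, `haar_waverec`) are hypothetical. Each level splits the current approximation into a low-pass part and a high-pass part at half the length.

```python
import numpy as np

def haar_dwt_level(x):
    """One Haar DWT level: low-pass (approximation) and high-pass (detail)."""
    x = np.asarray(x, dtype=float)
    if len(x) % 2:
        raise ValueError("length must be even for a dyadic split")
    pairs = x.reshape(-1, 2)
    approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)
    detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0)
    return approx, detail

def haar_wavedec(x, levels=3):
    """Multi-level decomposition returning A_n and [D_1, ..., D_n]."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        approx, detail = haar_dwt_level(approx)
        details.append(detail)
    return approx, details

def haar_waverec(approx, details):
    """Invert haar_wavedec by undoing the levels from coarsest to finest."""
    for detail in reversed(details):
        out = np.empty(2 * len(approx))
        out[0::2] = (approx + detail) / np.sqrt(2.0)
        out[1::2] = (approx - detail) / np.sqrt(2.0)
        approx = out
    return approx

signal = np.arange(16, dtype=float)
a3, (d1, d2, d3) = haar_wavedec(signal, levels=3)
```

The coefficient arrays shrink by half at each level; full-length sub-series like those fed to a forecasting model would be obtained by reconstructing each coefficient set separately.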

Ensemble Empirical Mode Decomposition (EEMD)
Empirical mode decomposition (EMD) is a signal-adaptive, noise-reducing decomposition algorithm for non-linear and non-stationary data [38]. The original time series can be decomposed into a finite number of intrinsic mode functions (IMFs), each containing only a single instantaneous frequency, and a residual (Res) [39]. Nonetheless, noise in the signal may cause mode mixing within the IMFs, generating inaccurate time-frequency distributions and diminishing the interpretability of the IMFs. To mitigate the adverse impact of noise during the decomposition process, Wu and Huang (2009) [40] proposed ensemble empirical mode decomposition (EEMD), a data analysis approach that adds white noise to the original time series. Given an original series, $f(t)$, the detailed process of EEMD is as follows:
(1) Add random white noise, $n_i(t) \sim N(0, \sigma^2)$ with known $\sigma$, to the original time series:

$$f_i(t) = f(t) + n_i(t)$$

where $i$ denotes the index of the white-noise addition.
(2) Employ the EMD algorithm to decompose the noise-added sequence, $f_i(t)$, into $IMFs_j^i(t)$ $(j = 1, 2, \ldots, K)$ and the residual, $Res^i(t)$:

$$f_i(t) = \sum_{j=1}^{K} IMFs_j^i(t) + Res^i(t)$$

where $IMFs_j^i(t)$ indicates the $j$-th IMF component derived from the decomposition of the $i$-th noise-added series.
(3) Repeat steps (1) and (2) $N$ times, each time using different Gaussian white noise, and obtain the corresponding IMFs.
(4) Compute the average of the corresponding IMFs over the $N$ iterations to mitigate the impact of the introduced white noise on the original signal. The $j$-th IMF component is as follows:

$$IMF_j(t) = \frac{1}{N}\sum_{i=1}^{N} IMFs_j^i(t)$$

(5) Finally, the decomposition of the original series, $f(t)$, by EEMD can be expressed as follows:

$$f(t) = \sum_{j=1}^{K} IMF_j(t) + Res(t), \qquad Res(t) = \frac{1}{N}\sum_{i=1}^{N} Res^i(t)$$

where $i = 1, 2, \ldots, N$.
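The ensemble-averaging logic of the steps above can be sketched independently of the inner EMD routine. In the sketch below, `eemd` takes any decomposition function `emd_fn` that returns a fixed number of equally long components; that fixed count is an assumption, since a real EMD may return a different number of IMFs per trial. `toy_emd` is a hypothetical stand-in (moving-average split), not an actual sifting procedure, and the noise standard deviation is treated as an absolute value.

```python
import numpy as np

def toy_emd(x):
    """Stand-in for EMD (assumption): one 'IMF' (remainder) and a residual (trend)."""
    trend = np.convolve(x, np.ones(5) / 5.0, mode="same")
    return [x - trend, trend]

def eemd(signal, emd_fn, n_trials=100, noise_std=0.05, seed=0):
    """Average the components of noise-perturbed decompositions over n_trials."""
    rng = np.random.default_rng(seed)
    signal = np.asarray(signal, dtype=float)
    total = None
    for _ in range(n_trials):
        # Step 1: add a fresh white-noise realization to the original series.
        noisy = signal + rng.normal(0.0, noise_std, size=signal.shape)
        # Step 2: decompose the noise-added series.
        comps = np.asarray(emd_fn(noisy))
        total = comps if total is None else total + comps
    # Steps 3-4: the ensemble mean suppresses the added noise.
    return total / n_trials

t = np.linspace(0.0, 4.0 * np.pi, 200)
series = np.sin(t)
components = eemd(series, toy_emd, n_trials=100, noise_std=0.05, seed=0)
```

Summing the averaged components recovers the original series up to the mean of the added noise, which shrinks as the number of trials grows.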

Singular Spectral Analysis
Singular spectral analysis (SSA) is a non-parametric method for estimating the spectral characteristics of time-series data, aimed at discerning distinct patterns of variability [41]. The fundamental framework of SSA encompasses data embedding, singular value decomposition (SVD), eigentriple grouping, and diagonal averaging.
Embedding can be viewed as a transformation that converts a one-dimensional time series into a trajectory matrix using the selected window length ($L$), upon which SVD is performed. The window length sets the number of lagged values in each column of the trajectory matrix and hence the number of components that can be extracted. The final product of the SVD is a set of eigentriples, the count of which matches the chosen window length. Consider a series $Y_N = (y_1, y_2, \ldots, y_N)$ that is not identically zero; the trajectory matrix, $X$, can be expressed as

$$X = [X_1, \ldots, X_K] = \begin{pmatrix} y_1 & y_2 & \cdots & y_K \\ y_2 & y_3 & \cdots & y_{K+1} \\ \vdots & \vdots & \ddots & \vdots \\ y_L & y_{L+1} & \cdots & y_N \end{pmatrix}$$

where $K = N - L + 1$. Note that the resulting trajectory matrix is a Hankel matrix, implying that all elements along each anti-diagonal, where $i + j$ is constant, are equal.
Secondly, singular value decomposition is applied to the trajectory matrix, $X$. Let $S = XX^{T}$; let $\lambda_1, \lambda_2, \ldots, \lambda_L$ be the eigenvalues of $S$ sorted in descending order ($\lambda_1 \geq \ldots \geq \lambda_L \geq 0$); and let $U_1, \ldots, U_L$ be the orthonormal eigenvectors of $S$ corresponding to these eigenvalues. Let $d = \mathrm{rank}(X) = \max\{i : \lambda_i > 0\}$ (in practical sequences, usually $d = L^{*}$, $L^{*} = \min(L, K)$). Then, the SVD of the trajectory matrix can be written as $X = X_1 + \ldots + X_d$, where

$$X_i = \sqrt{\lambda_i}\,U_i V_i^{T}, \qquad V_i = \frac{X^{T} U_i}{\sqrt{\lambda_i}}$$

In the grouping step, the periodogram, the scatter plot of the right eigenvectors, or the eigenvalue plot can be analyzed to distinguish between noise and signal. There are no specific rules for grouping when reconstructing the signal. The set of indices $\{1, \ldots, d\}$ can be divided into $m$ disjoint subsets, $I_1, I_2, \ldots, I_m$, according to the properties of the time series to be reconstructed. If $I = \{i_1, \ldots, i_p\}$, then the resultant matrix is $X_I = X_{i_1} + \ldots + X_{i_p}$, and the decomposition becomes $X = X_{I_1} + X_{I_2} + \ldots + X_{I_m}$.
The final step of SSA transforms each matrix resulting from the grouping into a new sequence of length $N$. Let $T$ be an $L \times K$ matrix with elements $t_{ij}$, $1 \leq i \leq L$, $1 \leq j \leq K$. Set $L^{*} = \min(L, K)$ and $K^{*} = \max(L, K)$, and let $t^{*}_{ij} = t_{ij}$ if $L < K$ and $t^{*}_{ij} = t_{ji}$ otherwise. Through diagonal averaging, matrix $T$ is transformed into a series $t_1, t_2, \ldots, t_N$ using the following formula:

$$t_k = \begin{cases} \dfrac{1}{k}\displaystyle\sum_{m=1}^{k} t^{*}_{m,\,k-m+1}, & 1 \leq k < L^{*} \\[2ex] \dfrac{1}{L^{*}}\displaystyle\sum_{m=1}^{L^{*}} t^{*}_{m,\,k-m+1}, & L^{*} \leq k \leq K^{*} \\[2ex] \dfrac{1}{N-k+1}\displaystyle\sum_{m=k-K^{*}+1}^{N-K^{*}+1} t^{*}_{m,\,k-m+1}, & K^{*} < k \leq N \end{cases}$$

A single reconstructed sub-sequence, $RC_t$, of length $N$ is obtained from each matrix according to this formula. The series is then recovered as the sum of the $d$ $RC_t$ components:

$$Y_N = \sum_{t=1}^{d} RC_t$$
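The four SSA stages (embedding, SVD, elementary matrices, diagonal averaging) fit in a short NumPy function. This is a sketch under stated assumptions rather than the MATLAB code used in the study; the function name `ssa_decompose` is hypothetical, and no grouping is applied, so every eigentriple yields its own reconstructed component.

```python
import numpy as np

def ssa_decompose(series, window):
    """Return (components, singular_values); components has shape (d, N)."""
    y = np.asarray(series, dtype=float)
    n = len(y)
    k = n - window + 1
    # Embedding: column j of the L x K trajectory matrix is y[j : j + L].
    X = np.column_stack([y[j:j + window] for j in range(k)])
    # SVD: squared singular values are the eigenvalues of X X^T.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    components = []
    for i in range(len(s)):
        Xi = s[i] * np.outer(U[:, i], Vt[i])  # elementary matrix X_i
        # Diagonal averaging: mean over each anti-diagonal (i + j constant).
        flipped = Xi[::-1]
        rc = np.array([flipped.diagonal(off).mean()
                       for off in range(-(window - 1), k)])
        components.append(rc)
    return np.array(components), s

rng = np.random.default_rng(1)
y = np.sin(np.linspace(0.0, 6.0, 60)) + 0.1 * rng.normal(size=60)
rcs, sing_vals = ssa_decompose(y, window=15)
```

Because the trajectory matrix is itself a Hankel matrix, the reconstructed components sum back to the input series, which is a convenient sanity check for any implementation.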

Convolutional Neural Network
One-dimensional convolution (1D-CNN) applies filters specialized for sequential data, extracting sequence features as the convolution kernels slide over the input. A standard 1D-CNN typically comprises an input layer, multiple interleaved convolutional and pooling layers, a fully connected layer, and an output layer, as illustrated in Figure 3. The convolutional and pooling layers are the distinctive components of the convolutional neural network. With a single-layer 1D-CNN, the output vector $Y = [y_1, \ldots, y_j, \ldots, y_{(a-b)/t+1}]$ is obtained as follows:

$$y_j = \sum_{i=1}^{b} k_i\, I_{(j-1)t+i}$$

where $I = [I_1, \ldots, I_i, \ldots, I_a]$ is an input vector of size $a$, $t$ is the stride, and $b$ is the size of the convolution kernel, $k$. After $y_1$ is obtained according to this equation, the calculation window slides forward by the stride so that it starts at $I_{t+1}$. This process is repeated until there are no remaining data in the input.
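The sliding-window computation of a single 1D convolution filter can be sketched in plain Python. The function name `conv1d` is hypothetical; bias and activation are omitted, and, as in most deep learning frameworks, the kernel is applied without flipping (cross-correlation). The output length of `(a - b) // stride + 1` assumes no padding.

```python
def conv1d(inputs, kernel, stride=1):
    """Slide `kernel` over `inputs` with the given stride (no padding)."""
    a, b = len(inputs), len(kernel)
    n_out = (a - b) // stride + 1
    outputs = []
    for j in range(n_out):
        start = j * stride  # the window for output j starts here
        outputs.append(sum(kernel[i] * inputs[start + i] for i in range(b)))
    return outputs

dense = conv1d([1, 2, 3, 4, 5], [1, 0, -1], stride=1)   # windows start at 0, 1, 2
strided = conv1d([1, 2, 3, 4, 5], [1, 0, -1], stride=2)  # windows start at 0, 2
```

A larger stride shortens the output because fewer window positions fit into the input.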


Long Short-Term Memory
The LSTM network is a variant of the recurrent neural network (RNN) that extends the basic recurrent architecture with a long short-term memory function. The LSTM effectively captures long-range dependencies, addressing the problems of exploding and vanishing gradients during backpropagation that are common in traditional RNNs [17]. Each LSTM block consists of a memory cell and three gates: the input gate, forget gate, and output gate (Figure 4). Each gate selectively determines how information from the previous step is passed to the current step. The memory cell acts as an accumulator of state information, preserving the hidden details of the time series. This allows the LSTM to leverage long-term historical context. The details are as follows. The forget gate first determines how much of the state at the previous moment is retained:

$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right)$$

In this formula, $\sigma$ is the Sigmoid activation function, $W_f$ represents the weights of the forget gate, and $b_f$ represents the bias of the forget gate. The Sigmoid function maps the input and the state of the previous moment to a value from 0 to 1. A value of $f_t = 1$ indicates full retention, and $f_t = 0$ signifies complete discarding. The input gate determines the extent to which the current network input, denoted as $x_t$, is incorporated into the cell state, $C_t$:

$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right)$$

$$\tilde{C}_t = \tanh\left(W_c \cdot [h_{t-1}, x_t] + b_c\right)$$

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$

where $W_i$ and $b_i$ are the weights and bias of the input gate, and $W_c$ and $b_c$ represent the weights and bias used when constructing the candidate vector, $\tilde{C}_t$. The last of these equations implements the cell state update at moment $t$: the forget gate sets the proportion of the previous state that is kept, and the input gate sets the proportion of the candidate vector that is added. The output gate determines the output value with the following formulas:

$$o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right)$$

$$h_t = o_t \odot \tanh(C_t)$$

where $W_o$ and $b_o$ are the weights and bias of the output gate. The current state, $C_t$, is passed through tanh and multiplied by the output gate activation, $o_t$, to obtain the output, $h_t$, at the current moment.
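A single forward step of an LSTM cell, following the forget, input, and output gates described above, can be written directly in NumPy. This is an illustrative sketch, not the Keras layer used in the study; the function name `lstm_step` and the dictionary layout of the parameters are assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; each W_* has shape (hidden, hidden + input)."""
    z = np.concatenate([h_prev, x_t])                   # [h_{t-1}, x_t]
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])    # forget gate
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])    # input gate
    c_hat = np.tanh(params["W_c"] @ z + params["b_c"])  # candidate vector
    c_t = f_t * c_prev + i_t * c_hat                    # cell state update
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])    # output gate
    h_t = o_t * np.tanh(c_t)                            # hidden output
    return h_t, c_t

hidden, n_in = 2, 3
# All-zero parameters make every gate equal to 0.5 and the candidate equal to 0.
zero_params = {f"W_{g}": np.zeros((hidden, hidden + n_in)) for g in "fico"}
zero_params.update({f"b_{g}": np.zeros(hidden) for g in "fico"})
c_prev = np.array([1.0, -2.0])
h_t, c_t = lstm_step(np.ones(n_in), np.zeros(hidden), c_prev, zero_params)
```

With all-zero parameters the cell simply halves the previous state, which makes the role of the forget gate easy to verify by hand.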

Model Implementation
In this study, decomposition-based hybrid models are employed for predicting the concentration of Chla in Lake Dianchi. To appropriately train the deep learning prediction models, we split the modeling procedure into training and testing. In the training part, the first 50% of the decomposed sequence (February 2019-January 2020) is input to build the network. In the testing part, the remaining data (January 2020-January 2021) are used to estimate the model performance. To prevent varying scales from influencing parameter learning, Chla concentrations are normalized to a range from 0 to 1 using min-max normalization based on the training data. Furthermore, we use the mean square error (MSE) as the loss function and the adaptive moment estimation (Adam) method to optimize the weights. We implemented our deep learning networks on the Keras development platform. The Daubechies-4 (db4) mother wavelet was used to perform a three-level decomposition of the original time series, a widely adopted choice with efficient performance [26]. The ensemble number of the EEMD model was set to 100, and the standard deviation of the Gaussian white noise, $n_i(t)$, was 0.05 [30]. We set the window length of the SSA to 15 according to the empirical evaluation of component contributions in the experiment. The WT, EEMD, and SSA were carried out using the MATLAB R2019b software. To allow an equitable comparison across the decomposition methods, the parameters used for the CNN and LSTM are presented in Table 2.
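The chronological split and training-based min-max normalization described above can be sketched as follows. The helper names (`split_and_scale`, `make_windows`) and the lag length `n_lags` are hypothetical, since the network inputs are specified in Table 2 rather than in the text. Fitting the scaler on the training half only means test values may fall outside [0, 1].

```python
import numpy as np

def split_and_scale(series, train_fraction=0.5):
    """Chronological split, then min-max scaling fitted on the training part."""
    series = np.asarray(series, dtype=float)
    n_train = int(len(series) * train_fraction)
    train, test = series[:n_train], series[n_train:]
    lo, hi = train.min(), train.max()
    return (train - lo) / (hi - lo), (test - lo) / (hi - lo), (lo, hi)

def make_windows(series, n_lags):
    """Build (samples, n_lags) inputs and next-step targets for supervised learning."""
    X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    y = series[n_lags:]
    return X, y

raw = np.arange(10, dtype=float)
train_s, test_s, (lo, hi) = split_and_scale(raw, train_fraction=0.5)
X_train, y_train = make_windows(train_s, n_lags=2)
```

Predictions made in the scaled space are mapped back with `pred * (hi - lo) + lo` before computing errors in the original units.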

Evaluation Metrics
To comprehensively measure the prediction performance of the decomposition-based hybrid deep learning models, three different criteria are used: the root mean square error (RMSE), the mean absolute error (MAE), and the coefficient of determination (R²). The formulas for calculating these indicators are as follows:

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$

where $n$ represents the number of observations in the test data; $y_i$ and $\hat{y}_i$ denote the observed and predicted algal parameter values, respectively; and $\bar{y}$ denotes the mean of the observed values. Moreover, the improvement percentage is introduced to facilitate quantitative comparisons between the decomposition-based hybrid deep learning models and the single models. The improvement percentages for the RMSE, MAE, and R² are calculated as follows:

$$P_{RMSE} = \frac{RMSE_1 - RMSE_2}{RMSE_1} \times 100\%$$

$$P_{MAE} = \frac{MAE_1 - MAE_2}{MAE_1} \times 100\%$$

$$P_{R^2} = \frac{R^2_1 - R^2_2}{R^2_2} \times 100\%$$

where $RMSE_1$ and $MAE_1$ denote the errors of the single models, and $RMSE_2$ and $MAE_2$ represent the errors of the decomposition-based models. $R^2_1$ and $R^2_2$ denote the coefficients of determination of the decomposition-based and single models, respectively. A substantial positive value indicates superior accuracy of the decomposition-based models compared to the single models.
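The three metrics and the improvement percentages can be computed with a few helper functions. The sketch below assumes that R² is defined as one minus the ratio of residual to total sum of squares and that the R² improvement is taken relative to the single model; the function names are hypothetical. With the R² values quoted in the results for SSA-LSTM versus LSTM at the Duanqiao site (0.975 vs. 0.723), this relative form gives roughly 34.9%.

```python
import math

def rmse(y_true, y_pred):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot

def error_improvement(single, hybrid):
    """Percentage reduction in RMSE or MAE relative to the single model."""
    return (single - hybrid) / single * 100.0

def r2_improvement(single, hybrid):
    """Percentage increase in R2 relative to the single model."""
    return (hybrid - single) / single * 100.0

observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
gain = r2_improvement(0.723, 0.975)
```

Positive values from both improvement functions indicate that the hybrid model outperforms the single model.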

Results and Discussion
Firstly, we present the decomposition results of the WT, EEMD, and SSA models. Secondly, we present and discuss the Chla concentration forecasting enhancement by the hybrid WT-CNN, EEMD-CNN, SSA-CNN, WT-LSTM, EEMD-LSTM, and SSA-LSTM models compared to the independent CNN and LSTM models. Then, we compare the performance among those hybrid deep learning models. Finally, we assess the strengths and limitations of these decomposition techniques in algal parameter prediction.

The Process of Different Decomposition-Based Algorithms for Chlorophyll Fluorescence Data
Due to the intricate non-stationary nature of algal parameter series, the appropriate decomposition of the original data plays an important role in improving forecasting accuracy. After data preprocessing, the Chla concentration series were decomposed using the WT, EEMD, and SSA methods. The decomposition results of the algal parameters at the Duanqiao and Caohai center sites are shown in Figure 5. In detail, the wavelet decomposition of the hourly original chlorophyll series generated an approximation low-frequency coefficient (A3) and three detailed high-frequency coefficients (D1-D3) at the Duanqiao and Caohai center sites, as depicted in Figure 5a,b. Compared to the original time series, the sub-series A3 extracted the trend and major peaks of the hourly algal series, while sub-series D1 to D3 captured the more subtle fluctuation components. The EEMD results of the algal parameters are shown in Figure 5c,d. The hourly original chlorophyll series were decomposed into four IMFs with different frequencies, which characterize the volatility, and one residual component. Figure 5c,d show the IMFs from high frequency to low frequency, the residue, and the original algal series sequentially from top to bottom. Visually, the EEMD captures the trend and volatility characteristics of the chlorophyll series well. Furthermore, the SSA decomposed the algal series into 15 components (Figure 5e,f) with a window length of 15. At the Duanqiao site, the RC1-RC6 components, with a contribution rate of 97.50%, were reconstructed into the main trend terms; the RC7-RC13 sub-series were chosen as the fluctuation component; and RC14-RC15 were considered as the noise. In comparison, at the Caohai center site, the top six sub-series (i.e., RC1-RC6), with a contribution rate of 97.90%, were chosen as the main trend components of the algal parameters, and the remaining RC7-RC11 and RC11-RC15 components were treated as the volatility and noise components, respectively, in the subsequent model. The series of algal parameters exhibited more prominent trends after reconstruction, suggesting that SSA can effectively extract the trend, volatility, and noise components, capturing the primary features of the series.
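The grouping of reconstructed components into trend, fluctuation, and noise terms can be sketched as below. The text does not define the contribution rate, so the sketch assumes the common convention of each eigenvalue's share of the eigenvalue sum (eigenvalues being squared singular values); the function names and the example component matrix are hypothetical, and the index ranges follow the Duanqiao grouping (RC1-RC6, RC7-RC13, RC14-RC15) with zero-based indexing.

```python
import numpy as np

def contribution_rates(singular_values):
    """Share of each eigenvalue (squared singular value) in the total (assumed definition)."""
    eig = np.asarray(singular_values, dtype=float) ** 2
    return eig / eig.sum()

def group_components(components, groups):
    """Sum the reconstructed components listed in each named index group."""
    components = np.asarray(components, dtype=float)
    return {name: components[list(idx)].sum(axis=0) for name, idx in groups.items()}

rates = contribution_rates([3.0, 4.0])

# Placeholder matrix of 15 components with 4 time steps each (illustrative values only).
rc = np.arange(15, dtype=float)[:, None] * np.ones((15, 4))
duanqiao_groups = {
    "trend": range(0, 6),         # RC1-RC6
    "fluctuation": range(6, 13),  # RC7-RC13
    "noise": range(13, 15),       # RC14-RC15
}
grouped = group_components(rc, duanqiao_groups)
```

Because the groups are disjoint and cover all components, the grouped series still sum to the original series.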

Evaluating the Predictive Performance of Deep Learning Based on Multi-Decomposition Process
In this section, the CNN and LSTM predict each sub-series obtained through the multi-decomposition methods. We obtain the final predicted values of the algal series by summing the forecasting results of all the sub-components. To adequately assess the effectiveness of the hybrid deep learning models, we employ hybrid CNN and LSTM models based on multi-decomposition, including WT-CNN, EEMD-CNN, SSA-CNN, WT-LSTM, EEMD-LSTM, and SSA-LSTM. The detailed curves of the predicted and observed values in the training and test datasets are shown in Figure 6. In Figure 6, the fitting curve obtained by the SSA-based hybrid models (orange lines) closely approximates the observed values (blue lines), especially at peak locations. This implies that the SSA-based hybrid models perform better than the WT-based and EEMD-based hybrid models in algal sequence prediction. While the individual CNN and LSTM models can forecast the Chla trends, significant errors exist between the observed and predicted values, and these models are inaccurate in capturing details and sharp peaks. The decomposition-based hybrid models predict the detailed components more accurately and significantly enhance the model performance. Taken together, this evidence indicates that the decomposition-based hybrid methods can reliably predict algal dynamics at different sampling sites.
The hybrid decomposition-based deep learning framework and the commonly employed individual time-series forecasting methods (CNN and LSTM) are also cross-compared for the Caohai center and Duanqiao sites of Lake Dianchi (Figure 7). For the CNN model, the best predictions were achieved by the SSA-based hybrid model, resulting in highly satisfactory R² values (R² = 0.9652 and 0.9518), which represented a 32.16% increase for the Caohai center site and a 26.67% increase for the Duanqiao site (Figure 7a,b). When compared to the CNN, the prediction accuracy of WT-CNN and EEMD-CNN, as measured by R², showed improvements of 28.02%, 24.99%, 19.18%, and 14.8% for the two sites, respectively. For the LSTM approach, consistent with the CNN results, SSA-LSTM outperforms the single LSTM in terms of its R² statistic (0.978 vs. 0.738 at Caohai center, and 0.975 vs. 0.723 at Duanqiao). Additionally, although the WT-LSTM models (R² = 0.955, 0.944) perform better than EEMD-LSTM (R² = 0.857, 0.881) at the two sites, they still perform slightly worse than the SSA-LSTM approach. The single CNN and LSTM models demonstrated the lowest performance among the four cases examined (Figure 7), showing the limitation of single deep learning models in handling complicated and non-stationary forecasting tasks of algal dynamics.
Cross-comparisons between the hybrid CNN or LSTM models and the single models show that the most significant improvement in forecasting HAB dynamics is achieved through decomposition-based hybrid approaches. These outcomes show that the hybrid deep learning algorithms based on multi-decomposition can decompose non-stationary time series into sub-sequences and better detect the main trend, volatility, and noise components of algal dynamics. In addition, the sub-series are fed to the CNN and LSTM models one by one to further improve HAB forecasting. The hybrid models based on decomposition are able to uncover the peak and extreme values of algal parameters, indicating the robust resilience and effective signal-smoothing capability of the preprocessing approach. This is in line with the previous study by Liu et al. [26], who demonstrated that the decomposition-based hybrid WT-LSTM model could enhance the fitting of abrupt or extreme data points and improve HAB prediction performance. Likewise, Luo et al. [30] developed a hybrid EEMD-LSTM model for predicting water quality, and the results demonstrated that the proposed model outperformed the individual LSTM model in various evaluation indicators. Cui et al. [42] also found that integrating SSA with a lightweight gradient-boosting machine in a hybrid model led to high-accuracy, real-time predictions of urban runoff. Interestingly, the hybrid models integrating LSTM show advantages over the CNN-based models, indicating the benefit of automatically capturing long-term temporal information through the LSTM recurrent chains.
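As a worked illustration of how such improvement percentages can be obtained, the sketch below defines the three evaluation indicators and a relative-improvement helper. The improvement is assumed here to be the change relative to the single model, and R² is assumed to be the coefficient of determination. The function names are hypothetical, and the only values taken from the text are the LSTM and SSA-LSTM R² values quoted above.

```python
import numpy as np

def rmse(obs, pred):
    """Root mean square error."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))

def mae(obs, pred):
    """Mean absolute error."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.mean(np.abs(obs - pred)))

def r2(obs, pred):
    """Coefficient of determination (assumed definition of R^2)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def improvement_pct(single, hybrid, higher_is_better):
    """Relative improvement of the hybrid model over the single model, in percent."""
    change = hybrid - single if higher_is_better else single - hybrid
    return 100.0 * change / single

# R^2 values quoted in the text for LSTM vs. SSA-LSTM.
caohai_gain = improvement_pct(0.738, 0.978, higher_is_better=True)
duanqiao_gain = improvement_pct(0.723, 0.975, higher_is_better=True)
print(f"Caohai center: {caohai_gain:.1f}%  Duanqiao: {duanqiao_gain:.1f}%")
```

For error metrics such as RMSE and MAE, `higher_is_better=False` turns a reduction in error into a positive improvement percentage.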

Comparing the Effectiveness of Different Decomposition Approaches in Forecasting HABs
To compare the prediction performance of different decomposition approaches, we combine three common decomposition approaches, namely WT, EEMD, and SSA, with the CNN and LSTM forecasting techniques to forecast the algal series at two stations. The results are presented in Tables 3 and 4. Firstly, when combined with the SSA decomposition, the hybrid LSTM and CNN prediction methods reach the lowest RMSE (1.518 µg/L, 1.927 µg/L, 1.092 µg/L, and 1.513 µg/L) and MAE values (0.855 µg/L, 1.361 µg/L, 0.702 µg/L, and 1.025 µg/L) and the highest R² (0.978, 0.965, 0.975, and 0.952) at Caohai center and Duanqiao station, respectively. This supports the superiority of the SSA decomposition method over the WT and EEMD methods. Secondly, the decomposition methods are more effective when combined with a more precise forecasting technique. In particular, LSTM gains more accuracy from decomposition than the CNN does. Specifically, WT decomposition reduces the RMSE of LSTM from 5.282 to 2.198 (an improvement of 58.4%) and the MAE from 3.525 to 1.405 (60.1%), and raises the R² from 0.738 to 0.955 (29.3%); for the CNN, the RMSE decreases from 5.364 to 2.635 (50.9%) and the MAE from 3.538 to 1.742 (50.8%), and the R² increases from 0.730 to 0.935 (28%). Similarly, the SSA decomposition method offers a greater performance enhancement for LSTM than for the CNN model, while EEMD decomposition yields similar results for both models. Thirdly, with the same WT and SSA decomposition techniques, LSTM typically outperforms the CNN in prediction accuracy, while for EEMD decomposition, LSTM is not always superior to the CNN.
Table 5 provides a concise comparison of the WT, EEMD, and SSA decomposition methods employed in this study. Each approach has distinct strengths and limitations. Specifically, WT is commonly employed as a preprocessing method for predicting water levels [43], algal blooms [26], precipitation [44], rainfall runoff [45], and river flow [46]. WT is well suited for handling signals with a constant frequency and near periodicity, but it requires presetting the basis function and its order, which significantly affect the decomposition results. The noise-assisted ensemble empirical mode decomposition (EEMD) method is used to address the issue of mode mixing and is capable of processing complex and non-stationary time series. EEMD has been widely adopted to enhance the forecasting performance for precipitation [47], daily runoff [48], water levels [49], rainfall [50], streamflow [51], river flow [52], and water quality [30]. Singular spectral analysis (SSA) has found widespread application in the preprocessing of hydrological data, including streamflow [31], rainfall [53], runoff [54], and rainfall runoff [42] prediction. Previous research has demonstrated that SSA can significantly enhance the prediction effectiveness of independent deep learning models, highlighting its potential to improve forecasting accuracy. Different decomposition methods have distinct conditions of applicability. Consequently, selecting the appropriate decomposition method according to its range of applicability can provide a research example for the prediction of algal blooms. Despite the extensive research on decomposition-based

Figure 1. Overview of the study area in Caohai, Lake Dianchi, and the distribution of sampling sites. (a) Location of Lake Dianchi in Yunnan, Southwest China. (b) Distribution of Lake Dianchi and riverway networks in Lake Dianchi Basin. (c) The red pentagon marks the location of the two sampling sites inside Caohai of Lake Dianchi.

Water 2023, 15, x FOR PEER REVIEW 5 of 20

Figure 2. Schematic flow chart of the decomposition-based hybrid deep learning models.

Figure 6. Predicted and observed time series of Chla concentrations by the decomposition-based hybrid deep learning models and independent CNN and LSTM approaches in Lake Dianchi at (a,b) Caohai center and (c,d) Duanqiao.

Figure 7. Scatter diagrams of the Chla with one-step prediction lead times at the Caohai center and Duanqiao stations in Lake Dianchi. Red dots represent the single models, green dots represent the hybrid EEMD-based models, orange dots represent the hybrid WT-based models, and blue dots represent the hybrid SSA-based models. (a) Comparison of single model and hybrid CNN-based models at Caohai center station. (b) Comparison of single model and hybrid CNN-based models at Duanqiao station. (c) Comparison of single model and hybrid LSTM-based models at Caohai center station. (d) Comparison of single model and hybrid LSTM-based models at Duanqiao station.

Table 1. Overview of online monitoring datasets for chlorophyll-a at two monitoring sites.

Table 2. The hyper-parameters of the CNN and LSTM models.

Table 3. The improvement percentages in the accuracy of the decomposition-based CNN and LSTM methods compared with single CNN and LSTM approaches for Chla (chlorophyll-a) prediction at Caohai center station (µg/L). (+∆ represents the improvement percentages in RMSE and MAE for decomposition-based models compared to single models.)

Table 4. The improvement percentages in the accuracy of the decomposition-based CNN and LSTM methods compared with single CNN and LSTM approaches for Chla (chlorophyll-a) prediction at Duanqiao station (µg/L). (+∆ represents the improvement percentages in RMSE and MAE for decomposition-based models compared to single models.)