Denoising Stock Price Time Series with Singular Spectrum Analysis for Enhanced Deep Learning Forecasting

Department of Statistics and Data Science, Faculty of Science, National University of Singapore, Singapore 117546, Singapore
* Author to whom correspondence should be addressed.
Analytics 2026, 5(1), 9; https://doi.org/10.3390/analytics5010009
Submission received: 3 December 2025 / Revised: 22 December 2025 / Accepted: 22 January 2026 / Published: 27 January 2026

Abstract

Aim: Stock price prediction remains a highly challenging task due to the complex and nonlinear nature of financial time series data. While deep learning (DL) has shown promise in capturing these nonlinear patterns, its effectiveness is often hindered by the low signal-to-noise ratio inherent in market data. This study aims to enhance predictive performance and trading outcomes by integrating Singular Spectrum Analysis (SSA) with deep learning models for stock price forecasting and strategy development on the Australian Securities Exchange (ASX)50 index. Method: The proposed framework begins by applying SSA to decompose raw stock price time series into interpretable components, effectively isolating meaningful trends and eliminating noise. The denoised sequences are then used to train a suite of deep learning architectures, including Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), and hybrid CNN-LSTM models. These models are evaluated based on their forecasting accuracy and the profitability of the trading strategies derived from their predictions. Results: Experimental results demonstrated that the SSA-DL framework significantly improved the prediction accuracy and trading performance compared to baseline DL models trained on raw data. The best-performing model, SSA-CNN-LSTM, achieved a Sharpe Ratio of 1.88 and a return on investment (ROI) of 67%, indicating robust risk-adjusted returns and effective exploitation of the underlying market conditions. Conclusions: The integration of Singular Spectrum Analysis with deep learning offers a powerful approach to stock price prediction in noisy financial environments. By denoising input data prior to model training, the SSA-DL framework enhanced signal clarity, improved forecast reliability, and enabled the construction of profitable trading strategies. These findings suggest strong potential for SSA-based preprocessing in financial time series modeling.

1. Introduction

This study investigates how integrating Singular Spectrum Analysis (SSA) with deep learning models can enhance stock price forecasting accuracy and trading performance in noisy financial markets, with a specific focus on the Australian Securities Exchange (ASX)50 index. The Australian equity market offers a valuable yet underexplored environment for testing advanced forecasting models. Unlike major global markets such as those in the U.S. or China, the Australian market exhibits distinctive structural characteristics: moderate liquidity, high concentration in the resource and financial sectors, and pronounced sensitivity to global commodity price movements. These characteristics introduce complex, non-linear dependencies that pose significant challenges for traditional time-series models. By applying the SSA–DL framework to this unique context, the study extends existing research beyond well-studied markets and demonstrates the model’s robustness and adaptability across diverse economic settings. This focus not only enriches the global literature on financial forecasting but also provides insights applicable to other markets with similar structural profiles.
To date, machine learning algorithms have been extensively applied in stock price prediction studies across global financial markets. Various models, such as Artificial Neural Networks (ANN) and Convolutional Neural Networks (CNN), have demonstrated strong effectiveness in improving predictive accuracy [1]. However, stock market forecasting remains inherently challenging due to the noisy, chaotic, non-stationary, and highly volatile nature of financial time series data [1,2]. To mitigate these challenges, several signal processing and modeling techniques, such as WaveNet and Singular Spectrum Analysis (SSA), have been developed to enhance forecasting reliability [3,4].
While deep learning algorithms integrated with Singular Spectrum Analysis (SSA) have been applied to financial time series in various international markets, there remains a notable gap in the application of these advanced techniques to Australian stock market data. This underrepresentation may stem from competing research interests in other regions, the relatively smaller size of the Australian market, and broader global research priorities. To address this gap, the present study focuses on stocks listed on the Australian Securities Exchange (ASX), specifically the ASX50 Index, which comprises the 50 largest and most liquid stocks in the market. In this study, SSA is employed as a preprocessing step to decompose and denoise stock price series, thereby enhancing the quality of inputs for the subsequent deep learning models.
This study makes two key contributions to the existing literature. First, it advances the application of cutting-edge analytics by integrating Singular Spectrum Analysis (SSA) with deep learning architectures, specifically, Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), and a hybrid CNN-LSTM model for stock price forecasting. To the best of our knowledge, this integrated approach has not been previously explored in the context of the Australian stock market. Second, while many studies on stock analytics focus primarily on predictive performance, this research extends the scope by developing profitable and reliable trading strategies derived from the model outputs, thereby bridging the gap between predictive modeling and practical financial decision-making.
The proposed methodology follows a systematic and structured process. First, the Singular Spectrum Analysis (SSA) algorithm is applied separately to the training and test datasets to generate denoised time series. Second, these denoised training datasets are used to train multiple deep learning models, which are subsequently employed to forecast stock prices during the test period. Third, the predicted stock prices are aggregated and compared with the actual prices to evaluate each model’s forecasting performance. Finally, a trading strategy is developed based on the predicted prices to assess the practical utility and profitability of the proposed SSA-deep learning (SSA-DL) framework.

2. Literature Review

The Efficient Market Hypothesis (EMH), as proposed by Fama, posits that stock prices fully reflect all available market information [5]. According to this theory, when investors attempt to earn excess returns through extensive analysis of historical stock data, the market rapidly incorporates such information, thereby adjusting prices to eliminate any potential profit opportunities. Despite this, many stock investors continue to rely on technical analysis, which seeks to identify empirical patterns and market behaviours based on publicly available price data.
Furthermore, numerous studies have questioned the assumption of market efficiency, presenting evidence that financial markets are not entirely efficient. Recent research, such as that of [2], provides compelling evidence that certain technical trading strategies can still yield significant and consistent returns, particularly in the stock markets of China and South Korea.
Previous research has demonstrated that financial indicators such as the Moving Average (MA) [6], Moving Average Convergence–Divergence (MACD), and the Relative Strength Index (RSI) [7] are statistically significant in predicting stock prices and developing profitable trading strategies. These advantages have been observed across both bull and bear market conditions. Furthermore, various time series models have been employed for stock price forecasting, with notable approaches such as the Autoregressive Moving Average (ARMA) model incorporating past indicators and cyclical factors to enhance predictive accuracy.
Given that many stock price series exhibit non-stationary behavior, often mitigated through differencing, the Autoregressive Integrated Moving Average (ARIMA) model has been widely employed in previous studies. For instance, ref. [8] applied the ARIMA model to forecast stock prices across various sectors of the National Stock Exchange (NSE), reporting high predictive accuracy and robustness as validated through paired t-tests. Another commonly used approach, the Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model, effectively captures stock market volatility dynamics, offering valuable forecasts that serve as key tools for risk management in stock trading strategies [9].
Since the late 20th century, quantitative investing has grown rapidly in popularity, fueled by advances in computing power, analytical methodologies, and the increasing demand from large institutional investors [10]. Today, numerous hedge funds and asset management firms leverage machine learning algorithms for portfolio analysis and management. Given the limitations of traditional time series models in addressing the nonlinear and non-stationary characteristics of financial data, many studies have demonstrated the superior effectiveness of machine learning techniques in predicting stock prices and formulating optimal trading strategies across different markets. Among these methods, supervised learning is the most widely used approach in stock market prediction [11].
Given the nonlinear trends in stock prices and their complex relationships with various influencing factors, several machine learning models employing nonlinear algorithms have been applied to stock price forecasting. These include Support Vector Machines (SVM), Support Vector Regressors (SVR), Random Forests (RF), and Artificial Neural Networks (ANN). For instance, Yu et al. [12] utilized Principal Component Analysis (PCA) to classify stocks based on multiple fundamental indicators and subsequently applied the SVM model for stock selection, achieving superior performance compared to the A-share index on the Shanghai Stock Exchange. Similarly, Kazem et al. [13] proposed an SVR model integrated with a chaotic firefly algorithm and backtested it on three U.S. stock datasets, finding that it outperformed traditional models in terms of Mean Squared Error (MSE) and Mean Absolute Percentage Error (MAPE). Additionally, Polamuri et al. [14] employed RF and Extra Tree Regressor models to forecast stock prices on the S&P 500 index, confirming their superiority over conventional linear models based on Mean Absolute Error (MAE) and MSE metrics.
Vijh et al. [15] constructed technical indicators from stock data and applied Random Forest (RF) and Artificial Neural Network (ANN) models to predict the closing prices of five major U.S. companies. Their results indicated that both RF and ANN achieved strong predictive performance, with ANN generally outperforming RF in terms of Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE), and Mean Bias Error (MBE) metrics [15]. Similarly, Göçken et al. [16] integrated ANN with Genetic Algorithms and Harmony Search to optimize technical indicators and mitigate overfitting and underfitting issues, thereby enhancing stock price prediction accuracy. It is noteworthy that the architecture of the ANN model plays a crucial role in determining predictive performance, as factors such as the number of hidden layers, the number of nodes per layer, the inclusion of dropout layers during training, and other hyperparameter configurations can significantly influence the final outcomes [16].
In recent years, deep learning models with multiple hidden layers, such as Deep Neural Networks (DNN), Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM) networks, have proven increasingly effective for stock market forecasting due to their superior ability to extract meaningful features from large datasets [11]. Zhong and Enke [17] employed various DNN models with differing numbers of hidden layers to predict the daily return direction of the SPDR S&P 500 ETF, utilizing Principal Component Analysis (PCA) for feature engineering with 60 financial and economic indicators. While CNNs are well known for their strong performance in image classification tasks and LSTM networks excel in sequence-to-sequence (Seq2Seq) learning by mitigating the vanishing and exploding gradient problems, both architectures and their hybrid variants have demonstrated superior performance in stock market forecasting due to their ability to learn complex relationships between large sets of input and output variables. Hoseinzade and Haratizadeh [18] further enhanced predictive accuracy by developing fine-tuned 2D-CNNpred and 3D-CNNpred models with kernels designed to mimic image-processing feature extraction, achieving more accurate predictions for six stock index movements compared to a shallow ANN model and a baseline CNN-core model.
Durairaj and Mohan developed two novel chaotic hybrid models, Chaos + CNN and Chaos + CNN + PR, which first reconstructed noisy time series affected by chaotic behaviour and then fit both the original time series and the fitted noise series into CNN models. These hybrid models generally produced more accurate predictions for foreign exchange, commodity, and stock market indices compared to traditional models such as ARIMA, CART, and Random Forest (RF) [19]. However, in certain cases, the hybrid models did not outperform the standalone CNN model, suggesting that the CNN alone could capture intrinsic patterns within the noisy time series. To address the issue of overfitting in stock market forecasting, Beak and Yim proposed a specialized LSTM architecture that combined an overfitting-prevention LSTM module with a prediction LSTM module. This model yielded improved forecasts for the S&P 500 and KOSPI 200 indices [20]. Similarly, Fazeli and Houghten employed an LSTM model enhanced with manually constructed technical indicators to predict the stock trends of major companies such as Apple, Microsoft, Google, and Intel, demonstrating the model’s capability to generate effective buy/sell signals based on historical data [21].
The hybrid CNN-LSTM model has also gained considerable attention in recent research. Livieris et al. utilized a CNN to extract meaningful features and an LSTM to learn the internal representation of time-series data, concluding that the CNN-LSTM model achieved improved predictions for gold market prices [22]. Similarly, Lu et al. incorporated information from the preceding 10 days using a CNN model as input for an LSTM to predict stock prices of the Shanghai Composite Index from 1 July 1991, to 31 August 2020. Their results, compared with models such as MLP, CNN, RNN, and LSTM, demonstrated that the CNN-LSTM combination produced lower RMSE and higher R2 values [23]. Song and Choi implemented both CNN-LSTM and GRU-CNN architectures (featuring different configurations of recurrent and convolutional neural networks) for one-step and multi-step predictions of closing prices for the DAX, DOW, and S&P 500 indices [24]. Likewise, Beak applied a CNN-LSTM model using the most recent 20 days of technical data, combined it with Genetic Algorithms (GA) for hyperparameter optimization, and found that this approach achieved higher prediction accuracy for the KOSPI index compared to standalone CNN, LSTM, and CNN-LSTM models [25].
Australia’s equity market is a vital component of global investment management. Research on the Australian market not only uncovers diverse investment opportunities but also enhances market efficiency and transparency. However, most major quantitative studies have primarily focused on regions such as Asia, Europe, North America, and South America, with relatively little attention given to Australia [11]. This gap underscores the need for further exploration of the Australian stock market.
Kwong (2001) conducted a time-series study of selected Australian stocks using neural networks to uncover patterns between stock movements and influencing factors [26]. Indika Priyadarshani investigated the asymmetry associated with volatility effects in the Australian stock market compared to other global markets. By modeling covolatility shocks across markets using a multivariate generalized autoregressive conditional heteroskedasticity (MGARCH) approach, Priyadarshani demonstrated that the US stock market exerts a dominant influence on the Australian stock market [27]. Hargreaves and Hao applied various machine learning techniques to develop trading strategies based on fundamental factors and concluded that machine-learning-driven equity research can generate superior returns in the Australian market [28]. Hussain et al. employed adaptive neuro-fuzzy inference systems (ANFIS), which integrate the strengths of artificial neural networks (ANNs) and fuzzy systems (FSs), to forecast the performance of Australian stocks listed on the ASX. Their results showed that ANFIS outperformed traditional models such as LSTM and GRU in terms of RMSE, MAE, and MAPE [29].
Due to the random fluctuations or irregularities inherent in both the market and individual stocks, filtering noise becomes a crucial challenge. Singular Spectrum Analysis (SSA) is a non-parametric method that, without many statistical constraints, decomposes a time series into multiple signals, effectively filtering out noise to reconstruct a cleaner time series. This method has a wide range of applications [30]. Wang and Li developed an SSA-NN model that smoothed commodity price series with a threshold of 0.02%, subsequently inputting the results into multiple artificial feed-forward neural networks for prediction purposes [31]. Xiao et al. used SSA to decompose the Shanghai Composite Index into long-term trends, significant event effects, and short-term noise, then applied Support Vector Machines (SVM) to make more accurate predictions than several baseline models [4]. Syukur and Marjuni improved the performance of SSA for forecasting SMS2.SG stock prices over the next 30 days by applying Hadamard transformation to determine the optimal window length for SSA [32].
Fathi et al. employed SSA to decompose price series into various features, which were then used to train non-linear autoregressive neural networks (NARNN) to forecast the performance of 24 stocks in the Egyptian market [33]. While some studies have combined SSA with deep neural networks for forecasting, there has been limited application of this approach to stock markets. For example, Galajit et al. applied SSA to remove noisy components from skewed electrical load series data, using Long Short-Term Memory (LSTM) networks for more accurate electrical load forecasting [34]. Similarly, Wei and Bai integrated SSA with a Convolutional Neural Network (CNN) and a Bidirectional Gated Recurrent Unit (BiGRU) model to forecast non-linear, non-stationary building energy consumption, achieving precise and robust multi-step predictions compared to the individual models [35].
Previous studies have demonstrated the effectiveness of combining Singular Spectrum Analysis (SSA) with deep learning for load and energy forecasting. However, its application in financial markets remains limited. This study advances existing work by extending the SSA–DL framework to sector-level equity forecasting and comparing CNN, LSTM, and CNN–LSTM architectures, each optimized for sector-specific temporal patterns. This approach provides a novel contribution by evaluating the robustness and adaptability of SSA–DL models in complex financial environments.
Existing studies on stock price forecasting have explored a range of traditional and modern techniques, from statistical models such as ARIMA and GARCH to machine learning and deep learning architectures including Support Vector Machines (SVM), Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM) networks. While these methods have achieved notable success in various markets, several limitations persist.
First, most deep learning models struggle with the noisy and non-stationary nature of financial time series, often resulting in overfitting and unstable forecasts. Data decomposition methods, such as Empirical Mode Decomposition (EMD) and Wavelet Transforms, have been introduced to address this issue; however, their parameter sensitivity and mode-mixing problems limit their effectiveness. In contrast, Singular Spectrum Analysis (SSA) has shown superior performance in denoising and extracting meaningful patterns from complex signals, yet its integration with deep learning models for stock market forecasting remains scarce [31,32,33,34,35].
Second, the Australian stock market (ASX), despite being one of the most advanced and resource-rich markets globally, has received relatively little attention in data-driven forecasting research compared to the U.S., European, and Asian markets [11,26,27,28,29]. Consequently, there is a pressing need to develop robust forecasting frameworks tailored to the ASX context that can provide both accurate predictions and actionable trading insights.
Finally, existing research tends to focus predominantly on forecasting accuracy metrics, such as MSE or RMSE, without translating predictive results into real-world trading performance. This disconnect between model accuracy and financial utility limits the practical application of predictive models in investment decision-making.
These gaps highlight the opportunity to design a hybrid forecasting and trading framework that integrates advanced signal decomposition, deep learning, and practical trading evaluation, especially in the underexplored ASX market.

3. Contributions

To address these gaps, this study proposes a hybrid Singular Spectrum Analysis–Deep Learning (SSA–DL) framework that integrates Singular Spectrum Analysis (SSA) with Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), and a hybrid SSA–CNN–LSTM model to forecast stock prices of companies listed on the ASX50 index. The performance of these models is further evaluated through a back-tested trading strategy, linking predictive modeling with investment applicability.
First, stocks from the ASX50 are grouped into three subgroups based on their industrial sectors, and the closing price series of all stocks are decomposed into multiple signals using SSA, with window lengths and criteria optimized for each subgroup. Second, the filtered and denoised signals are used as inputs for deep neural networks to produce rolling forecasts of each signal. These individual forecasts are aggregated into overall stock price predictions, which are evaluated using standard forecasting metrics such as Mean Squared Error (MSE) and Mean Absolute Error (MAE). Finally, a trading strategy based on sector-level stock price forecasts is designed and back-tested to assess real-world profitability and robustness.
The key contributions of this research are as follows:
1. Novel methodological integration of SSA and deep learning for financial forecasting: This study introduces a hybrid SSA–DL framework that combines SSA’s noise-reduction capabilities with the pattern recognition power of deep learning architectures (CNN, LSTM, and CNN–LSTM). While SSA has been previously applied to non-financial domains such as energy and commodity forecasting [31,32,33,34,35], its integration with deep learning for stock market prediction—particularly within the Australian context—remains largely unexplored. This approach demonstrates enhanced forecasting accuracy compared to traditional deep learning models trained on raw, noisy data.
2. Empirical advancement in modeling the Australian stock market: The study provides one of the first comprehensive deep learning–based analyses of the Australian Securities Exchange (ASX), addressing the lack of research attention to this market [11,26,27,28,29]. By focusing on the ASX50 index, this research offers empirical insights into market dynamics and presents a benchmark for future studies investigating Australian equity behavior using advanced data-driven methods.
3. Bridging predictive modeling and actionable trading strategies: Unlike most prior research, which primarily emphasizes predictive accuracy, this study evaluates model performance in terms of real-world trading outcomes. A portfolio-level trading strategy is developed based on the model’s stock price forecasts, and its performance is assessed using profitability metrics such as return on investment (ROI) and Sharpe Ratio. This integration of forecasting and trading evaluation enhances the practical relevance of the proposed models for investors and portfolio managers.

4. Research Methodologies

4.1. Singular Spectrum Analysis (SSA)

Singular Spectrum Analysis (SSA) is a non-parametric method used to analyze time series data, allowing the detection of underlying patterns, trends, and noise [4,30]. The SSA process can be divided into two main stages: decomposition and reconstruction.
  • Decomposition: This step involves applying Singular Value Decomposition (SVD) to the trajectory matrix, which helps in breaking down the time series into components that capture its underlying structure.
  • Reconstruction: The second stage involves grouping the obtained components and performing anti-diagonal averaging to reconstruct the time series while filtering out noise [33,36,37,38].
Further details on the methodology, including proofs and additional insights, can be found in [36,39].
Step 1: Decomposition
Embedding: The first stage of decomposition transforms the 1-D price series $S = [s_1, s_2, s_3, \ldots, s_{N-1}, s_N]$ into a trajectory matrix composed of lagged price subseries created with a predetermined window length $L$. The trajectory matrix $X$ is
$$X = \begin{bmatrix} s_1 & s_2 & s_3 & \cdots & s_K \\ s_2 & s_3 & s_4 & \cdots & s_{K+1} \\ s_3 & s_4 & s_5 & \cdots & s_{K+2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ s_L & s_{L+1} & s_{L+2} & \cdots & s_N \end{bmatrix} \in \mathbb{R}^{L \times K},$$
where each column vector is a lagged subseries of length $L$ and $K = N - L + 1$.
Singular Value Decomposition (SVD): The next stage applies SVD to the trajectory matrix $X$:
$$X = U \Sigma V^{T},$$
where $U \in \mathbb{R}^{L \times L}$, $\Sigma \in \mathbb{R}^{L \times K}$, and $V \in \mathbb{R}^{K \times K}$. Further, $X$ can be rewritten as
$$X = \sum_{i=1}^{r} \sigma_i u_i v_i^{T} = X_1 + X_2 + \cdots + X_r,$$
where $r = \operatorname{rank}(X)$ and $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r$ are the singular values of $X$ (the square roots of the eigenvalues of $X X^{T}$), arranged in descending order.
Step 2: Reconstruction
Grouping: The grouping stage categorizes the $r$ elementary matrices of $X$ into $m$ disjoint subsets, and the matrices in each subset are summed. Let $I = \{I_1, I_2, \ldots, I_m\}$ ($m < r$) denote the index sets of the new groups. Then $X$ can be rewritten as
$$X = X_{I_1} + X_{I_2} + \cdots + X_{I_m},$$
where the contribution of each term $X_{I_k}$ is $\sum_{i \in I_k} \lambda_i \big/ \sum_{i=1}^{r} \lambda_i$, $k = 1, \ldots, m$, and the eigenvalues are related to the singular values by $\sigma_i = \sqrt{\lambda_i}$. In this stage, according to a prespecified criterion, such as whether a certain number of the $r$ elementary matrices of $X$ are included in the newly grouped series (termed the smoothing threshold [31]) or whether the increment of the singular entropy $E$ reaches an asymptotic value [40], $X$ can be partitioned into $X_{\mathrm{information}}$ and $X_{\mathrm{noise}}$, and the noise term is filtered out. The singular entropy increment due to eigentriple $i$ is
$$E_i = -\frac{\lambda_i}{\sum_{i=1}^{r} \lambda_i} \log\!\left(\frac{\lambda_i}{\sum_{i=1}^{r} \lambda_i}\right),$$
and we obtain $X_{\mathrm{information}} = X_{I_1} + X_{I_2} + \cdots + X_{I_{m'}}$, $m' < m$.
Anti-diagonal averaging: This step reconstructs the denoised time series from the subseries of $X_{\mathrm{information}}$. For each grouped matrix $X_{I_k}$, with $L^{*} = \min\{L, K\}$, $K^{*} = \max\{L, K\}$, and $N = L + K - 1$, the reconstructed series is
$$\tilde{X}_{I_k}(n) = \begin{cases} \dfrac{1}{n} \sum_{m=1}^{n} X_{I_k}[m, n-m+1], & 1 \leq n < L^{*}, \\ \dfrac{1}{L^{*}} \sum_{m=1}^{L^{*}} X_{I_k}[m, n-m+1], & L^{*} \leq n < K^{*}, \\ \dfrac{1}{N-n+1} \sum_{m=n-K^{*}+1}^{N-K^{*}+1} X_{I_k}[m, n-m+1], & K^{*} \leq n \leq N, \end{cases}$$
where $X_{I_k}[i, j]$ denotes the $(i, j)$ element of $X_{I_k}$. After performing anti-diagonal averaging on all matrices in $\{X_{I_1}, X_{I_2}, \ldots, X_{I_{m'}}\}$, each resulting series has length $N = L + K - 1$, and the final reconstructed denoised series is
$$\tilde{X} = \sum_{k=1}^{m'} \tilde{X}_{I_k}.$$
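As an illustration of the two stages above, the following NumPy sketch embeds a price series, applies SVD, retains the leading eigentriples by cumulative contribution, and anti-diagonally averages each retained component back into a series. It is a minimal sketch of the generic SSA procedure under the settings reported later in this paper (quarterly window, 99.97% threshold), not the authors’ exact implementation; the function name is illustrative.

import numpy as np

def ssa_decompose(series, L, contribution=0.9997):
    """Decompose a 1-D series with SSA: embed it into a trajectory matrix,
    apply SVD, keep the leading eigentriples whose cumulative eigenvalue
    share reaches the contribution threshold, and anti-diagonally average
    each retained elementary matrix back into a series of length N."""
    series = np.asarray(series, dtype=float)
    N = len(series)
    K = N - L + 1
    # Embedding: L x K trajectory matrix of lagged subseries.
    X = np.column_stack([series[i:i + L] for i in range(K)])
    # SVD of the trajectory matrix; eigenvalues of X X^T are the squared singular values.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    lam = s ** 2
    r = np.searchsorted(np.cumsum(lam) / lam.sum(), contribution) + 1
    components = []
    for i in range(r):
        Xi = s[i] * np.outer(U[:, i], Vt[i, :])
        # Anti-diagonal averaging: element (j, k) of Xi contributes to time j + k.
        comp, counts = np.zeros(N), np.zeros(N)
        for j in range(L):
            comp[j:j + K] += Xi[j, :]
            counts[j:j + K] += 1
        components.append(comp / counts)
    return components, lam[:r]

# Example: denoise a synthetic trend-plus-cycle series with a quarterly window (L = 63).
t = np.arange(500)
prices = 50 + 0.02 * t + 2 * np.sin(2 * np.pi * t / 63) + np.random.normal(0, 0.5, 500)
components, eigenvalues = ssa_decompose(prices, L=63)
denoised = np.sum(components, axis=0)   # reconstructed series with the noise tail discarded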

4.2. Convolutional Neural Network (CNN)

Convolutional Neural Networks (CNNs) are among the most popular deep learning algorithms, particularly renowned for their robustness in image processing tasks. They are capable of automatically extracting important, high-level features from datasets. Recently, CNNs have also found wide applications in stock price prediction tasks.
A typical CNN consists of multiple layers, including Convolutional Layers, Pooling Layers, Dropout Layers, and Fully Connected Layers. Here is how each layer contributes to the model:
  • Time Series Data Transformation: Initially, the time series stock price data is transformed into a 3D tensor with the shape (samples, timesteps, features), making it compatible with CNNs.
  • Convolutional Layer: The Convolutional Layer uses its kernel (filter) to capture patterns in the lagged series data. This process applies filters to the data to detect relevant features, while activation functions introduce non-linearity between the outputs of different neurons.
  • Pooling Layer: The Pooling Layer helps in summarizing the essential features extracted by the Convolutional Layer. It performs down-sampling, reducing the dimensionality of the data while retaining important information.
  • Dropout Layer: Dropout is applied to randomly “turn off” certain neurons during training. This regularization technique introduces random noise into the learning process, helping to mitigate the risk of overfitting and ensuring better generalization.
  • Fully Connected Layer: After flattening the output from the convolutional and pooling layers into a 1-dimensional vector, the Fully Connected Layer connects all input information from previous layers. This layer is responsible for generating the final output, which is used for prediction or evaluation.
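A minimal Keras sketch of a 1-D CNN of the kind described above is given below, assuming input tensors of shape (samples, timesteps, features). The layer sizes and the helper name build_cnn are illustrative and do not reproduce the exact tuned architectures reported in Section 6.

from tensorflow import keras
from tensorflow.keras import layers

def build_cnn(timesteps=5, features=1):
    """1-D CNN for one-step-ahead forecasting on (samples, timesteps, features) tensors."""
    model = keras.Sequential([
        layers.Input(shape=(timesteps, features)),
        # Convolutional layer: filters slide over the lagged window to detect local patterns.
        layers.Conv1D(filters=64, kernel_size=3, activation="relu", padding="same"),
        # Pooling layer: down-samples the feature maps while retaining salient information.
        layers.MaxPooling1D(pool_size=2),
        # Dropout layer: randomly disables units during training to reduce overfitting.
        layers.Dropout(0.3),
        # Flatten and fully connected layers produce the final one-step forecast.
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(1),
    ])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001), loss="mse")
    return model

# X_train has shape (n_samples, 5, 1); y_train holds the next value of each window.
# model = build_cnn(); model.fit(X_train, y_train, epochs=100, batch_size=32)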

4.3. Long Short-Term Memory (LSTM)

In the realm of neural networks, recurrent models distinguish themselves from Convolutional Neural Networks (CNNs) by employing a recursive approach that propagates hidden state information forward through time. This mechanism allows recurrent models to retain and enhance their knowledge, improving predictions as they process more data over time.
A prominent type of recurrent model is the Long Short-Term Memory (LSTM), designed to address key challenges that traditional Recurrent Neural Networks (RNNs) face, such as the vanishing gradient and exploding gradient problems. LSTMs incorporate a unique mechanism to control how information is retained or discarded across time-steps, allowing them to maintain long-range dependencies in sequential data, which is crucial for time-series analysis.
The LSTM’s core advantage lies in its gates: the input gate, output gate, and forget gate, which are implemented using sigmoid activation functions and element-wise multiplication operations. These gates regulate what information should be remembered or forgotten as the model processes the sequence. This selective memory mechanism makes LSTMs particularly well-suited for tasks such as speech recognition, natural language processing, and financial time-series prediction.
Training an LSTM involves Backpropagation Through Time (BPTT) and the use of gradient descent optimization techniques. These methods enable the model to update its internal weights progressively based on the errors in its predictions, ensuring the model improves its performance over time.
Figure 1 illustrates the detailed structure and functioning of an LSTM model.
Inside the LSTM cell, $X_t$ denotes the input at time $t$, $H_t$ denotes the hidden state at time $t$, and the operators inside the circles are pointwise operators. First, the Forget Gate $F_t$ determines what information to discard through a sigmoid activation function:
$$F_t = \sigma(W_f [H_{t-1}, X_t] + b_f).$$
The next step decides what new values are stored in the memory cell: the Input Gate selects the values to update, and a tanh layer produces the candidate memory cell:
$$I_t = \sigma(W_i [H_{t-1}, X_t] + b_i),$$
$$\tilde{C}_t = \tanh(W_C [H_{t-1}, X_t] + b_C).$$
The memory cell is then updated using the three quantities above:
$$C_t = F_t \odot C_{t-1} + I_t \odot \tilde{C}_t.$$
Finally, the Output Gate produces the hidden state from the updated memory cell, gated by a sigmoid activation function:
$$O_t = \sigma(W_o [H_{t-1}, X_t] + b_o),$$
$$H_t = O_t \odot \tanh(C_t).$$
In all of the equations above, $W$ and $b$ represent the weight matrices and bias terms, respectively, which are optimized during backpropagation.
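The gate equations above can be traced with a few lines of NumPy. The single-step sketch below uses randomly initialized (untrained) parameters and the concatenation $[H_{t-1}, X_t]$ from the equations; it is intended only to make the data flow through the gates concrete, not to replace a library implementation.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step following the forget/input/output gate equations.
    W and b hold the weights and biases of the f, i, c, and o transformations."""
    z = np.concatenate([h_prev, x_t])          # [H_{t-1}, X_t]
    f_t = sigmoid(W["f"] @ z + b["f"])         # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])         # input gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])     # candidate memory cell
    c_t = f_t * c_prev + i_t * c_tilde         # updated memory cell
    o_t = sigmoid(W["o"] @ z + b["o"])         # output gate
    h_t = o_t * np.tanh(c_t)                   # new hidden state
    return h_t, c_t

# Toy dimensions: 1 input feature, 4 hidden units, random (untrained) parameters.
rng = np.random.default_rng(0)
n_in, n_hid = 1, 4
W = {k: rng.normal(size=(n_hid, n_hid + n_in)) for k in "fico"}
b = {k: np.zeros(n_hid) for k in "fico"}
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in [0.5, 0.7, 0.6]:                      # process a short input sequence
    h, c = lstm_step(np.array([x]), h, c, W, b)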

4.4. CNN-LSTM

The CNN-LSTM model is a powerful hybrid deep learning framework that integrates Convolutional Neural Networks (CNNs) with Long Short-Term Memory (LSTM) networks. This model is designed to process sequential data effectively by combining the strengths of both architectures.
  • CNNs are used to automatically extract hierarchical and spatial features from the input data. These networks are particularly effective in identifying patterns and key features, such as local dependencies within the data, which are crucial for tasks like image processing and time-series feature extraction.
  • LSTM networks, on the other hand, are designed to capture long-range dependencies and temporal relationships within sequential data. LSTMs can learn the context and sequence over time, making them ideal for handling sequential data where previous states influence future ones, such as in time-series forecasting or natural language processing.
By combining CNNs as feature extractors and LSTMs for sequence modeling, the CNN-LSTM model can efficiently process complex data, retain temporal information, and generate more accurate predictions. This hybrid structure allows the model to leverage the strength of CNNs for feature extraction and the power of LSTMs for capturing sequential patterns, making it highly suitable for tasks involving both spatial and temporal data.
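A minimal Keras sketch of such a hybrid is shown below, with a convolutional front end feeding an LSTM. The layer sizes loosely follow the SSA-CNN-LSTM configuration described in Section 6.2 but are illustrative rather than the exact tuned architecture.

from tensorflow import keras
from tensorflow.keras import layers

def build_cnn_lstm(timesteps=5, features=1):
    """CNN front end for local pattern extraction, LSTM back end for temporal dependencies."""
    model = keras.Sequential([
        layers.Input(shape=(timesteps, features)),
        layers.Conv1D(filters=64, kernel_size=3, activation="relu", padding="same"),
        layers.MaxPooling1D(pool_size=2),
        # The pooled feature maps form a shorter sequence that the LSTM reads in order.
        layers.LSTM(64, activation="tanh"),
        layers.Dense(64, activation="relu"),
        layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model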

5. Experiment Design

5.1. Dataset Description

The ASX50 stock list was sourced from the Market Index website [41] and downloaded from Yahoo Finance [42]. The dataset includes stock data spanning from 12 April 2018, to 31 March 2023, covering a total of 47 companies with consistent trading records and no significant restructuring or delisting during this period. Using the Company Profile information from Yahoo Finance, these 47 companies are divided into three groups based on their respective industrial sectors. Companies within each sector share the same fine-tuned hyperparameters due to the sectoral similarities.
Each deep learning model (CNN, LSTM, and CNN-LSTM) was implemented and fine-tuned separately for each sector group to capture sector-specific temporal and spatial patterns in the data. The CNN architecture consisted of two convolutional layers (kernel size = 3, stride = 1) followed by a max-pooling layer and two fully connected layers. The LSTM model included two LSTM layers (64 and 32 units) with dropout regularization (rate = 0.2). The hybrid CNN-LSTM combined the convolutional layers for feature extraction with an LSTM layer for temporal sequence learning. Model optimization employed the Adam optimizer with an initial learning rate of 0.001, and training was performed with a batch size of 32 for up to 100 epochs using early stopping based on validation loss. Hyperparameters were fine-tuned separately for each sector group using a grid search approach, selecting the configuration that minimized the mean squared error on the validation set.
The chosen CNN configuration emphasizes local temporal feature extraction with an appropriate receptive field to capture short-term price dynamics without excessive model complexity. The LSTM design focuses on preserving sequential dependencies and long-term temporal memory, which is essential for financial time-series prediction. The hybrid CNN-LSTM architecture leverages both short-term pattern extraction and long-term dependency modeling, a structure widely supported in existing financial forecasting literature.
The number of layers, hidden units, kernel sizes, and optimizer settings were determined through preliminary experiments and guided by commonly accepted best practices in deep learning for financial series. These choices balance predictive performance against computational efficiency and overfitting risk. Optimizers were selected based on empirical stability and convergence performance in our datasets.
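The per-sector grid search with early stopping described above can be sketched as follows, assuming a hypothetical build_model(**params) factory (for example, builders like those sketched in Section 4, parameterized by the values being searched) and a chronological train/validation split; the parameter names and grid values are illustrative.

import itertools
from tensorflow import keras

def grid_search(build_model, X_train, y_train, X_val, y_val, grid):
    """Fit one model per hyperparameter combination and keep the configuration
    with the lowest validation MSE, using early stopping on the validation loss."""
    best_model, best_params, best_mse = None, None, float("inf")
    stop = keras.callbacks.EarlyStopping(monitor="val_loss", patience=10,
                                         restore_best_weights=True)
    for combo in itertools.product(*grid.values()):
        params = dict(zip(grid.keys(), combo))
        model = build_model(**params)
        model.fit(X_train, y_train, validation_data=(X_val, y_val),
                  epochs=100, batch_size=32, callbacks=[stop], verbose=0)
        mse = model.evaluate(X_val, y_val, verbose=0)
        if mse < best_mse:
            best_model, best_params, best_mse = model, params, mse
    return best_model, best_params

# Example grid for one sector group (values illustrative):
# grid = {"filters": [64, 128], "dropout": [0.2, 0.3], "lstm_units": [32, 64]}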
For two companies that had only one missing value on 29 March 2023, the missing data points were filled using the previous day’s data. To ensure accurate forecasting and account for factors like dividend payments that can distort true stock values, only the adjusted close price series are used. Other data such as Open, High, and Low prices are not considered in the SSA process. Table 1 below outlines the sector grouping of selected companies:

5.2. Experiment Procedure

The training and prediction procedure for each stock is as follows. Stocks in different groups follow this automated process, with different hyperparameters applied based on their sector:
1. Split the Data: Divide the daily adjusted close prices chronologically into a training set (80%) and a testing set (20%), using distinct, non-overlapping time periods to minimize temporal data leakage. This preserves the chronological integrity of the financial time series and ensures comparability across all model architectures.
2. Decompose the Training Series: Use Singular Spectrum Analysis (SSA) to decompose the training data based on the specified window length.
3. Calculate Contributions: Compute the contribution of each decomposed term and sort the terms in descending order. Accumulate the contributions in that order and stop once the cumulative contribution reaches the set criterion.
4. Reconstruct Time Series: Reconstruct the time series components using the selected terms from the decomposition.
5. Cluster Components: Apply the K-means algorithm to aggregate similar components together.
6. Normalize Components: Standardize each reconstructed component by normalizing it.
7. Reshape the Data: Reshape the normalized series into 3D tensors according to the optimal rolling window size.
8. Build and Train the Model: Construct and train the model using each normalized component.
9. Forecast Future Prices: For each point in the testing dataset, forecast the future price by repeating the following sub-steps until all time points are predicted (a rolling-forecast sketch is given below): (a) decompose the price series available up to that point and generate the 3D tensors as described in steps 2–7; (b) perform a one-step prediction for each component; (c) denormalize the predicted components.
10. Aggregate and Evaluate: Combine all predicted components into a single price series and evaluate its accuracy using appropriate performance metrics.
Figure 2 below illustrates the training and prediction procedure.
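The rolling forecast in step 9 can be sketched as follows, reusing the ssa_decompose function from Section 4.1 and a trained Keras model. For brevity, the sketch forecasts a single aggregated denoised series rather than separate trend and periodicity components, and the function name is illustrative.

import numpy as np

def rolling_forecast(prices, split_idx, model, window_len, timesteps=5):
    """Walk forward through the test period: at each step, re-run SSA on the
    history available so far (no look-ahead), normalize the latest window,
    and make a one-step-ahead prediction."""
    predictions = []
    for t in range(split_idx, len(prices)):
        components, _ = ssa_decompose(prices[:t], L=window_len)
        denoised = np.sum(components, axis=0)
        mu, sigma = denoised.mean(), denoised.std()        # normalize
        x = ((denoised[-timesteps:] - mu) / sigma).reshape(1, timesteps, 1)
        y_hat = model.predict(x, verbose=0)[0, 0]
        predictions.append(y_hat * sigma + mu)             # denormalize the forecast
    return np.array(predictions)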

5.3. Trading Strategy

We utilized the model’s predictions to devise a trading strategy, which was tested over 252 trading days from 1 April 2022, to 31 March 2023. The steps of the strategy were as follows:
1. Initial Capital and Trade Size:
    The initial capital was set at $5 million.
    $2 million (40% of the initial capital) was allocated to each daily trade.
    A maximum cumulative loss of $3 million was set to prevent excessive losses over the trading period.
2. Stock Ranking and Investment:
    Each day, stocks were ranked based on their predicted percentage returns.
    The top three stocks for the day were selected for investment.
    An equal amount of $2/3 million ($666,666.67 per stock) was allocated to each of the three selected stocks daily, ensuring that no short trades were made.
3. Closing and Rebalancing:
    Positions were closed at the end of each trading day.
    The strategy was rebalanced daily, using fresh predictions and rankings for the next day’s trades.
4. Commission Fees:
    A commission fee of 0.025% was applied to both buying and selling, so that transaction costs were accounted for in the strategy.
5. Percentage Return Calculation:
    The percentage return for each stock trade was calculated as
    $$\text{Percentage Return}_t = \frac{\hat{p}_t - \hat{p}_{t-1}}{\hat{p}_{t-1}},$$
    where $\hat{p}_t$ denotes the predicted price on day $t$; that is, the expected return is the change in the predicted price relative to the previous day’s predicted price.
6. Performance Evaluation:
    The strategy was evaluated on its cumulative returns, accounting for both transaction costs and predicted stock performance (a simplified backtest sketch is given below).
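A simplified backtest of the rules above is sketched below, assuming DataFrames of predicted and actual adjusted close prices with one column per stock and one row per trading day. The daily capital, top-three selection, and 0.025% commission follow the description; the data layout and the flat treatment of the commission are simplifying assumptions.

import numpy as np
import pandas as pd

def backtest(predicted, actual, daily_capital=2_000_000, fee=0.00025, top_n=3):
    """Daily long-only strategy: rank stocks by predicted return, buy the top
    three in equal dollar amounts, and close all positions at the end of the day."""
    pnl = []
    for t in range(1, len(predicted)):
        # Predicted percentage return of each stock for day t.
        pred_ret = predicted.iloc[t] / predicted.iloc[t - 1] - 1
        picks = pred_ret.nlargest(top_n).index
        alloc = daily_capital / top_n
        realised = actual.iloc[t][picks] / actual.iloc[t - 1][picks] - 1
        gross = (alloc * realised).sum()
        costs = 2 * fee * daily_capital            # commission on the buy and sell legs
        pnl.append(gross - costs)
    return pd.Series(pnl, index=predicted.index[1:])

# predicted and actual are DataFrames indexed by date with one column per ASX50 stock.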

6. Results

6.1. Parameters

In the implementation of the Singular Spectrum Analysis method, two key parameters were considered:
  • The window length (L) used during the embedding stage, which determines the size of the trajectory matrix constructed from the time series data.
  • The contribution threshold for determining which components are retained for reconstruction. This threshold guides the grouping of components based on their relative significance.
The selection of the key parameters in the Singular Spectrum Analysis (SSA) was guided by both theoretical and empirical considerations to ensure a balance between signal fidelity and noise suppression. The window length ( L ) was chosen to be sufficiently large to capture the dominant temporal patterns in the data, while remaining below half of the series length, consistent with established SSA practices. The contribution thresholds of 99.95% and 99.97% were applied to retain the principal components that together explained nearly all the signal variance, effectively filtering out residual noise without over-smoothing the reconstructed series. Preliminary sensitivity analyses confirmed that small variations in L and the threshold values did not materially affect the results, indicating that the decomposition and reconstruction were robust to parameter choice.
In our research, we set the contribution criteria to 99.95% and 99.97%, respectively, meaning that only the components contributing to this cumulative percentage of the signal variance were preserved. The retained components were then grouped using the K-means clustering algorithm into two distinct categories: trend and periodicity [31]. The remaining components, accounting for the last 0.05% (or 0.03%) of the contribution, were classified as noise and deemed non-informative for prediction purposes.
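The component selection and grouping described above can be sketched as follows, assuming the component series returned by the ssa_decompose sketch in Section 4.1 (which already applies the contribution threshold) and at least two retained components. The dominant-frequency feature used for the K-means clustering is an illustrative assumption; it simply separates slowly varying trend components from oscillatory ones.

import numpy as np
from sklearn.cluster import KMeans

def group_components(components):
    """Split the retained SSA components into trend and periodicity groups
    with K-means on a dominant-frequency feature."""
    kept = np.asarray(components)
    # Feature: index of the dominant non-zero frequency of each component
    # (trend components cluster near zero, periodic ones at higher values).
    feats = np.array([[np.argmax(np.abs(np.fft.rfft(c))[1:])] for c in kept], dtype=float)
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(feats)
    trend_label = labels[0]                        # cluster containing the leading component
    trend = kept[labels == trend_label].sum(axis=0)
    periodicity = kept[labels != trend_label].sum(axis=0)
    return trend, periodicity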
The window length must be tuned for the stocks in each group, and [37] suggested that the window length should be large enough but no larger than $N/2$, where $N$ is the total length of the series. This issue was also observed in our research: a very small window length does not capture the periodicity effect, causing the contribution of the trend alone to exceed the criterion. According to Hassani et al., $L = N/4$ is common practice for the SSA method [37]. Hence, in our research, the window length $L$ is optimized by testing, for each group, the candidate values displayed in Table 2 below:
For the selected deep learning algorithms, several parameters required careful tuning to optimize model performance:
  • Time Step for 3D Tensor Formation: This defines the number of lag values used from the time series to form each input sample.
  • Neural Network Architecture and Hyperparameters: These include optimizer settings, hidden layer configurations, dropout rates, and activation functions.
In our study, a time step of 5 was chosen to construct each 3D tensor, corresponding to one trading week. For the neural network optimization, we employed the Adam optimizer with a learning rate of 0.001, as implemented in the Keras library. Adam is widely recognized for its adaptive learning capabilities and computational efficiency [43].
Other hyperparameters such as the number of hidden units per layer, dropout rates, and activation functions were fine-tuned separately for each sector group to best fit the characteristics of the grouped stocks.
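The 3D tensor construction with a time step of 5 amounts to a small windowing helper, sketched below for a single normalized component series; the function name is illustrative.

import numpy as np

def make_windows(series, timesteps=5):
    """Turn a 1-D series into (samples, timesteps, 1) inputs and next-step targets."""
    series = np.asarray(series, dtype=float)
    X, y = [], []
    for i in range(len(series) - timesteps):
        X.append(series[i:i + timesteps])      # five lags = one trading week
        y.append(series[i + timesteps])        # the value to be predicted
    return np.asarray(X)[..., np.newaxis], np.asarray(y)

# X, y = make_windows(normalized_component, timesteps=5)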

6.2. Forecasting Results

This study first applies the Singular Spectrum Analysis (SSA) algorithm to decompose each stock’s original price series and reconstruct its Trend and Periodicity components. Based on predefined contribution criteria, SSA identifies and excludes components deemed noise, which are therefore not considered in the K-means clustering stage. The contribution thresholds tested in this study are 99.95% and 99.97%, representing the cumulative variance retained from the original signal.
An illustrative example is provided using the stock ALL.AX, with the reconstructed series summarized in Table 3. From the results, it can be observed that the trend components remain consistent across both criteria. However, the periodicity component reconstructed under the 99.97% threshold is smaller compared to that under the 99.95% threshold. This suggests that the trend captures most of the price series’ informational contribution, and the variation in the periodic component is primarily influenced by the stricter criterion.
Importantly, the sequence reconstructed under the 99.97% threshold more closely tracks the original stock price series, as illustrated in Figure 3, where the line labeled “Reconstructed (99.97%)” aligns more accurately with the actual stock prices. This improved alignment is further supported by superior performance in downstream validation. Therefore, subsequent tuning and modeling in this study are conducted primarily under the 99.97% criterion.
The selection of the Window Length (L) in SSA plays a crucial role in effective feature detection and time series reconstruction. An improperly chosen L may hinder the SSA’s ability to accurately capture signal structures. In this study, a range of window lengths from 63 to 504 is explored, corresponding to meaningful temporal intervals such as quarterly (63 days), yearly (252 days), and biannual (504 days) periods.
For each value of L, the original stock price series is decomposed and reconstructed, followed by the construction of various deep learning (DL) models as described previously. These models are then fine-tuned individually within each group using optimized hyperparameters to perform stock price forecasting. The reconstructed trend and periodicity components are aggregated to produce the final predicted stock price series.
To evaluate the accuracy of the predictions, the Mean Squared Error (MSE) and Mean Absolute Error (MAE) are used as performance metrics, comparing the predicted prices to the actual stock prices. The evaluation results for different window lengths and model configurations are summarized in Table 4, Table 5 and Table 6, respectively.
Examining the prediction results of the different models when the SSA window length is set to 63 for reconstructing the sequence, we find that SSA-CNN provides the predictions closest to the real stock prices in most cases, and almost all of the best MSE and MAE values are obtained from this model. The ordinary CNN also gives relatively accurate results; it should be noted that the ordinary CNN uses the original stock prices directly for prediction, independent of the output sequence of the SSA algorithm. SSA-LSTM and SSA-CNN-LSTM rarely provide more accurate results than SSA-CNN and CNN, except for the WTC.AX stock, for which the best prediction is obtained by SSA-CNN-LSTM. However, the differences in prediction accuracy across models are all very small, so it is also necessary to examine the empirical analysis provided by the trading strategy.
When analysing the prediction results with larger window lengths of 252 and 504, it becomes apparent that the prediction accuracy of the SSA-CNN model declines significantly. In contrast, the accuracy of SSA-LSTM and SSA-CNN-LSTM remains relatively stable, with only slight decreases observed. Notably, for the WTC.AX stock, the predictive performance even improves as the window length increases. This suggests that as the window length grows, the relative contribution of trend components diminishes, while the influence of periodicity and noise components increases, due to the higher number of decomposed components.
Furthermore, it is observed that the best prediction results for stocks in the Industrial and Infrastructure sectors tend to be achieved by the SSA-CNN-LSTM model rather than the others. This can be attributed to the model’s strength in capturing long-term fluctuations within periodic sequences, whereas SSA-CNN is better suited for non-smooth, trend-dominated sequences. Consequently, the performance of SSA-CNN deteriorates significantly in these cases, occasionally performing worse than the other two models.
Interestingly, CNN-based models (including non-SSA CNN) also demonstrate strong performance. This can be explained by the fact that with larger window lengths, if the same fixed contribution criteria are used to discard noise, there is a greater risk of omitting important information from the original stock price series. As a result, the reconstructed sequences used for training may lack critical signals, leading to less effective predictions.
However, it is important to recognize that the original stock price series inherently contains noise, which may not be informative or actionable for real-world trading strategies. Although CNNs trained on raw data might achieve higher accuracy in terms of price prediction, they might also capture spurious patterns, making them less suitable for practical trading applications. Investors are typically more interested in robust predictions based on the underlying structure of stock movements, rather than predictions influenced by noise.
For this reason, the present study emphasizes the empirical value of SSA-based Deep Learning (SSA-DL) models, aiming to construct more reliable and interpretable trading strategies grounded in the intrinsic properties of stock behaviour.
The architecture of the best-performing SSA-CNN model, identified when the window length is set to 63, is illustrated in Figure 4 (left). The model begins with two convolutional layers, each comprising 128 filters with a kernel size of 3 and employing the ReLU activation function. Notably, no pooling layer is applied after the convolutional stages, allowing the model to retain the full granularity of the extracted features. These layers are followed by a Dropout layer with a dropout rate of 0.3 to prevent overfitting. Finally, a fully connected (dense) layer with 64 neurons and ReLU activation is employed to generate the output prediction.
In contrast, the SSA-CNN-LSTM model structure, shown in Figure 4 (right), integrates both convolutional and recurrent layers to leverage spatial and temporal features. The model starts with a convolutional layer consisting of 64 filters and a ReLU activation function, which extracts local spatial patterns from the input sequence. This is followed by a Max Pooling layer to down sample and highlight the most salient features. The resulting feature maps are then passed into an LSTM layer with 64 units and a tanh activation function, designed to capture the temporal dependencies within the sequence. The final prediction is generated through a dense layer with 64 neurons and ReLU activation.

6.3. Trading Strategy Performance

In this study, beyond aiming for more accurate stock price predictions using SSA-DL models, we also emphasize the importance of obtaining noise-filtered forecasts that are practically applicable for real-world trading and profitability. This aligns with the insights from Dessian, who conducted a comprehensive review of over 190 research articles and highlighted that many commonly used evaluation metrics such as MSE and RMSE may be inadequate when the ultimate objective is profit maximization in real financial markets [44].
To assess the practical utility of our models, we constructed daily frequency trading strategies based on the predicted stock prices generated by various models. These strategies were applied to a selection of 47 stocks from the ASX50 index, enabling us to test the extent to which model-driven predictions could inform profitable investment decisions. The performance of each strategy was evaluated using a suite of financial metrics, including Win Rate, Return on Investment (ROI), and the Sharpe Ratio (assuming a risk-free rate of 2%).
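These metrics can be computed directly from a daily PnL series such as the one produced by the backtest sketch in Section 5.3. The annualization over 252 trading days, the 2% risk-free rate, and the capital figures follow the description above, while the exact formulas behind the reported figures may differ; the sketch is illustrative.

import numpy as np

def strategy_metrics(daily_pnl, initial_capital=5_000_000, daily_capital=2_000_000,
                     risk_free=0.02, periods=252):
    """Win Rate, ROI, and annualized Sharpe Ratio from a daily PnL series."""
    daily_pnl = np.asarray(daily_pnl, dtype=float)
    daily_ret = daily_pnl / daily_capital                 # return on the daily trading capital
    win_rate = np.mean(daily_pnl > 0)
    roi = daily_pnl.sum() / initial_capital               # cumulative gain over the initial capital
    excess = daily_ret - risk_free / periods              # subtract the daily risk-free rate
    sharpe = np.sqrt(periods) * excess.mean() / daily_ret.std(ddof=1)
    return {"win_rate": win_rate, "roi": roi, "sharpe": sharpe}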
The empirical results and comparative analysis across models are presented in the following sections.
As shown in Table 7, the SSA-CNN-LSTM model with a window length of 252 achieves the highest Sharpe Ratio, reaching a value of 1.878. Notably, this model also generates over $3.3 million in cumulative return, based on a daily trading capital of $2 million, significantly outperforming the other models. The SSA-LSTM model (also with a window length of 252) demonstrates similarly strong performance, recording the highest Win Rate and total Dollar Gain among all tested strategies.
Interestingly, while SSA-CNN (window length = 63) and the standard CNN model exhibit relatively higher predictive accuracy in terms of MSE and MAE, their trading strategy performance is notably weaker. In fact, the CNN model results in a substantial financial loss. A closer look at Win Rate and Dollar Loss further underscores that SSA-LSTM and SSA-CNN-LSTM outperform both CNN and SSA-CNN models in risk-adjusted returns. Although SSA-CNN performs comparably to the other two SSA-DL models in terms of total Dollar Gain, its high Dollar Loss reduces its final ROI and Sharpe Ratio.
The fact that the two best-performing trading strategies are based on models with a window length of 252 supports the empirical validity of the guideline L = N/4 for selecting the SSA window length. To illustrate these findings visually, Figure 5 presents the PnL (Profit and Loss) curves for CNN and the best-performing models from each SSA-DL variant, compared with the S&P/ASX50 index [45].
The results reveal that the strategies built using SSA-DL models significantly outperform the market baseline. During the one-year evaluation period, the S&P/ASX50 index declined from 7254 to 7049, representing a −2.83% return. Despite the overall market downturn, the SSA-CNN-LSTM model achieved an ROI of 66.58%, while the SSA-LSTM and SSA-CNN models delivered ROIs of 61.28% and 44.51%, respectively. See Figure 5 below. Yan and Ling also integrated their forecasting results with quantitative investing principles and constructed a new strategy that achieved better returns in twelve selected American financial stocks [46]. This validates the practical value of incorporating SSA-DL models into the design of trading strategies, offering substantial alpha generation even in bearish market conditions.
To further investigate the trading behavior of the proposed models, we selected all trading days from 3 January to 31 January 2023 for a detailed performance review. The results, summarized in Table 8, reveal that during this month, the SSA-CNN model with a window length of 504 achieved the highest cumulative Dollar Gain. Despite this, notable differences in stock selection can be observed among the three models. Interestingly, SSA-LSTM (252) and SSA-CNN-LSTM (252) exhibit greater overlap in daily stock picks, suggesting a similarity in their portfolio construction approach, in contrast to SSA-CNN (504), which tends to diverge significantly in its choices.
Additionally, the analysis shows that on several consecutive trading days, the models, particularly SSA-LSTM and SSA-CNN-LSTM, generate portfolios with unchanged or minimally rotated stock selections. This consistency implies that frequent rebalancing is not always necessary, and avoiding unnecessary portfolio turnover could reduce transaction costs, thereby enhancing real-world returns beyond what is reflected in the back-testing results.
A particularly noteworthy observation is that the SSA-LSTM model selects PLS.AX (an Industrial sector stock) 90 times over the course of the year, resulting in substantial representation of the Industrial and Infrastructure sectors within its portfolio. This consistent preference may indicate the model’s sensitivity to long-term cyclical patterns or strong predictive signals inherent in the stock’s behavior.
To examine sectoral tendencies, Table 9 reports the annual frequency of stock selections made by each model. Relative to the SSA-DL models, the standard CNN model exhibits a pronounced bias towards the Industrial and Infrastructure sectors, while underrepresenting stocks from the Consumer Services, Financials, Healthcare, Technology, and Utilities sectors. In contrast, the SSA-DL models achieve a more balanced sectoral distribution, thereby providing improved diversification.

7. Limitations of Study

While the back-testing results demonstrate the potential effectiveness of the proposed trading strategy, it is important to acknowledge that the evaluation was conducted under idealized conditions. Real-world trading involves additional complexities such as transaction costs, liquidity constraints, slippage, and execution delays, which were not explicitly modelled in this study. As the main objective was to assess the predictive capability of the model rather than its operational implementation, these factors were excluded to maintain analytical focus. Future research could extend this work by incorporating realistic trading frictions to better assess the practical performance and robustness of the strategy in live market environments.
In addition, while the proposed SSA-DL framework demonstrates strong predictive potential, this study did not include direct comparisons with classical time-series models such as ARIMA or GARCH, or with simple benchmark strategies such as buy-and-hold. This omission was deliberate, as the focus of this work was to explore the methodological integration of SSA with deep learning architectures rather than to establish dominance over all alternatives. Future research could extend this work by systematically comparing the SSA-DL framework against both traditional models and baseline strategies to provide a broader perspective on its relative advantages and robustness.
Furthermore, the variation in model performance across different window lengths can be attributed to the distinct learning characteristics of the deep learning architectures employed. Specifically, the SSA-CNN model performs better with shorter window lengths, as it effectively captures localized temporal and structural patterns within the decomposed components. In contrast, the SSA-LSTM model tends to excel with longer window lengths, as its recurrent structure enables it to better model long-term dependencies in the financial time series. These complementary strengths highlight the importance of aligning the choice of model architecture and window length with the underlying temporal dynamics of the data.
Note that although the proposed SSA-DL trading framework demonstrated superior returns compared to the ASX50 benchmark, such results should be interpreted cautiously. The high alpha may partially reflect model sensitivity to specific market dynamics during the test period rather than persistent predictive power. Moreover, despite measures taken to mitigate overfitting, the possibility of model over-optimization cannot be fully excluded.

8. Conclusions

This study proposed a novel framework that combines Singular Spectrum Analysis (SSA) with deep learning algorithms to predict stock prices of companies listed on the ASX50 index. The primary objective was to reduce noise in stock price time series, extract meaningful trend and periodicity components, and enhance the accuracy of stock price forecasts. By improving prediction reliability, the proposed approach offers practical guidance for stock selection and portfolio construction in dynamic financial markets. We propose that future work extend the evaluation of the CNN, LSTM, CNN-LSTM, and SSA-CNN-LSTM models beyond the current dataset to a wide range of financial markets with diverse structural characteristics, including major equity indices such as the S&P 500 (United States), Nikkei 225 (Japan), and ASX200 (Australia), as well as the foreign exchange market (e.g., EUR/USD, USD/JPY), commodity markets (e.g., gold, crude oil), and cryptocurrency markets (e.g., Bitcoin, Ethereum). These markets differ substantially in volatility dynamics, liquidity levels, trading mechanisms, and information efficiency, thereby providing a robust platform for validating the generality and adaptability of the proposed approach.
To evaluate the models' performance, we employed Mean Squared Error (MSE) and Mean Absolute Error (MAE) as forecasting accuracy metrics and further validated the models through back-tested trading strategies. Results demonstrated that SSA effectively filtered noise and isolated underlying market patterns, enabling the deep learning models, specifically the CNN, LSTM, and hybrid CNN–LSTM architectures, to generate more stable and accurate predictions. Among these, the SSA–CNN–LSTM model with a window length of 252 achieved the best overall performance, yielding a 66% return on investment (ROI) and a Sharpe Ratio of 1.88.
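For completeness, the two accuracy metrics are straightforward to compute from predicted and actual closing prices; the snippet below is a generic sketch (equivalent functions exist in scikit-learn), with price values loosely based on the ALL.AX example in Table 3 for illustration only.

```python
import numpy as np

def mse_mae(actual, predicted):
    """Mean Squared Error and Mean Absolute Error between two price series."""
    err = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.mean(err ** 2)), float(np.mean(np.abs(err)))

# Hypothetical actual vs. predicted closes (values loosely follow Table 3)
actual    = [21.906, 22.512, 22.671, 22.848, 23.407]
predicted = [22.704, 22.932, 23.144, 23.363, 23.592]
mse, mae = mse_mae(actual, predicted)
print(f"MSE = {mse:.3f}, MAE = {mae:.3f}")
```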
While low-variance SSA components are treated as noise for the current daily/multi-day forecasting horizon, future studies could investigate their potential predictive value for ultra-short-term or intraday strategies, capturing high-frequency market microstructure effects.
These findings have several practical implications. For investors and portfolio managers, the SSA–DL framework provides a data-driven method to improve timing and selection of trades, particularly by filtering out short-term market noise that can distort traditional technical signals. For quantitative analysts and financial engineers, the integration of SSA with deep learning offers a promising pathway for developing robust forecasting engines capable of adapting to volatile, non-stationary financial environments. For trading strategy developers, the study demonstrates how preprocessing techniques can enhance the performance of deep learning-based algorithmic trading systems, leading to superior risk-adjusted returns compared to benchmark models and market indices.
The research also underscores the importance of model configuration, as key hyperparameters such as SSA window length and contribution criteria significantly influenced predictive performance.

9. Future Research Directions

While the proposed SSA–DL hybrid framework has shown promising results, several potential extensions can be explored:
  • Hyperparameter Optimization: Future work could employ automated optimization methods, such as Bayesian optimization, grid search, or genetic algorithms, to identify optimal SSA parameters and deep learning architectures, further enhancing model robustness.
  • Multimodal Data Integration: Extending the current framework to include fundamental indicators, macroeconomic variables, news sentiment, and social media signals could provide a richer feature set, capturing a more comprehensive view of market dynamics.
  • Cross-Market Generalization: Applying the SSA–DL framework to different stock exchanges (e.g., NYSE, NASDAQ, or FTSE) or to other asset classes such as commodities, cryptocurrencies, or exchange-traded funds (ETFs) could test the model’s generalizability across diverse markets.
  • Real-Time and High-Frequency Forecasting: Future studies could adapt the model for real-time prediction or high-frequency trading, incorporating adaptive learning mechanisms to respond to rapidly changing market conditions.
  • Explainable AI (XAI) and Interpretability: Incorporating interpretability techniques, such as SHAP or LIME, could help explain how the SSA–DL model generates predictions, increasing its transparency and trustworthiness for practitioners and regulators.
  • Seasonality: Evaluate performance across different seasons and on diverse datasets to provide a more comprehensive assessment of stability.
  • Experiments using Filtering: Conduct a comprehensive comparison with other filtering methods, isolating the effect of each block in the pipeline.
  • Extension of Framework: Link the framework more explicitly to organizational decision-making processes (such as rebalancing and scaling), incorporating model governance practices and relevant regulatory considerations.
  • Statistical Significance Testing: Future research could employ statistical tests such as the Diebold–Mariano test or paired t-tests to confirm whether the observed improvements are statistically meaningful (a minimal sketch of such a test is provided after this list). This addition would enhance the robustness of the performance comparison and provide stronger empirical support for the model's effectiveness.
  • Intra-Year Performance Analysis: While the current study focuses on 63-day, 252-day, and 504-day metrics to provide a clear overview of the strategy's overall returns and risk, more detailed temporal analysis, such as monthly or quarterly returns, could offer additional insight into performance stability. Examining these intra-year dynamics and their potential correlation with specific market regimes, including bear markets or periods of high volatility, represents a valuable direction for future research.
  • Alternative Benchmarks: While the current study compares the proposed strategy to a passive investment in the ASX50 index, comparisons with simpler daily selection benchmarks, such as a one-day momentum strategy or a randomly selected stock portfolio, could provide additional insight into the contribution of the predictive model versus the selection mechanism. Exploring such benchmarks would help isolate and quantify the "alpha" generated by the model itself.
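As a concrete starting point for the statistical-testing item above, the sketch below implements a minimal Diebold–Mariano test under squared-error loss. The function name and the synthetic error series are illustrative assumptions; an actual analysis would use the models' out-of-sample forecast errors and might add the Harvey small-sample correction.

```python
import numpy as np
from scipy import stats

def diebold_mariano(e1, e2, h=1):
    """Minimal Diebold-Mariano test (squared-error loss) for equal predictive accuracy.
    e1, e2 are forecast-error series of equal length; returns (DM statistic, two-sided p-value)."""
    e1, e2 = np.asarray(e1, float), np.asarray(e2, float)
    d = e1 ** 2 - e2 ** 2                     # loss differential
    T, d_mean = d.size, d.mean()
    # Long-run variance via autocovariances up to lag h-1 (h=1 reduces to the plain variance)
    gamma = [np.mean((d[: T - k] - d_mean) * (d[k:] - d_mean)) for k in range(h)]
    lr_var = gamma[0] + 2 * sum(gamma[1:])
    dm_stat = d_mean / np.sqrt(lr_var / T)
    p_value = 2 * (1 - stats.norm.cdf(abs(dm_stat)))
    return dm_stat, p_value

# Hypothetical usage: one year of daily forecast errors from two competing models
rng = np.random.default_rng(1)
err_baseline = rng.normal(scale=0.8, size=252)   # e.g., CNN on raw prices
err_ssa_dl   = rng.normal(scale=0.5, size=252)   # e.g., SSA-CNN-LSTM
print(diebold_mariano(err_baseline, err_ssa_dl))
```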
In summary, this study demonstrates that the SSA–DL hybrid approach not only improves predictive accuracy in noisy financial time series but also translates these improvements into tangible investment benefits, bridging the gap between academic modeling and practical financial decision-making. The future research directions outlined above offer promising avenues for further exploration and refinement of this framework in both academic and professional contexts.

Author Contributions

Conceptualization, Z.F. and C.A.H.; Methodology, Z.F. and C.A.H.; Software, Z.F. and C.A.H.; Validation, Z.F. and C.A.H.; Formal analysis, Z.F. and C.A.H.; Investigation, Z.F. and C.A.H.; Resources, Z.F. and C.A.H.; Data curation, C.A.H.; Writing—original draft, Z.F. and C.A.H.; Writing—review and editing, C.A.H.; Supervision, C.A.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original data presented in the study are openly available in repository: https://zenodo.org/record/8319706.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Rouf, N.; Malik, M.B.; Arif, T.; Sharma, S.; Singh, S.; Aich, S.; Kim, H.-C. Stock Market Prediction Using Machine Learning Techniques: A Decade Survey on Methodologies, Recent Developments, and Future Directions. Electronics 2021, 10, 2717. [Google Scholar] [CrossRef]
  2. Ni, Y.; Day, M.-Y.; Cheng, Y.; Huang, P. Can investors profit by utilizing technical trading strategies? Evidence from the Korean and Chinese stock markets. Financ. Innov. 2022, 8, 54. [Google Scholar] [CrossRef]
  3. Cho, C.-H.; Lee, G.-Y.; Tsai, Y.-L.; Lan, K.-C. Toward Stock Price Prediction using Deep Learning. In Proceedings of the 12th IEEE/ACM International Conference on Utility and Cloud Computing Companion—UCC ’19 Companion, Auckland, New Zealand, 2 December–5 December 2019. [Google Scholar] [CrossRef]
  4. Xiao, J.; Zhu, X.; Huang, C.; Yang, X.; Wen, F.; Zhong, M. A New Approach for Stock Price Analysis and Prediction Based on SSA and SVM. Int. J. Inf. Technol. Decis. Mak. 2019, 18, 287–310. [Google Scholar] [CrossRef]
  5. Fama, E.F. Efficient Capital Markets: A Review of Theory and Empirical Work. J. Financ. 1970, 25, 383–417. [Google Scholar] [CrossRef]
  6. Metghalchi, M.; Marcucci, J.; Chang, Y.-H. Are moving average trading rules profitable? Evidence from the European stock markets. Appl. Econ. 2011, 44, 1539–1559. [Google Scholar] [CrossRef]
  7. Chong, T.T.-L.; Ng, W.-K. Technical analysis and the London stock exchange: Testing the MACD and RSI rules using the FT30. Appl. Econ. Lett. 2008, 15, 1111–1114. [Google Scholar] [CrossRef]
  8. Mondal, P.; Shit, L.; Goswami, S. Study of Effectiveness of Time Series Modeling (Arima) in Forecasting Stock Prices. Int. J. Comput. Sci. Eng. Appl. 2014, 4, 13–29. [Google Scholar] [CrossRef]
  9. Abdalla, S.Z.S.; Winker, P. Modelling Stock Market Volatility Using Univariate GARCH Models: Evidence from Sudan and Egypt. Int. J. Econ. Financ. 2012, 4, 161–178. [Google Scholar] [CrossRef]
  10. Chincarini, L. The Impact of Quantitative Methods on Hedge Fund Performance. Eur. Financ. Manag. 2013, 20, 857–890. [Google Scholar] [CrossRef]
  11. Kumbure, M.M.; Lohrmann, C.; Luukka, P.; Porras, J. Machine learning techniques and data for stock market forecasting: A literature review. Expert Syst. Appl. 2022, 197, 116659. [Google Scholar] [CrossRef]
  12. Yu, H.; Chen, R.; Zhang, G. A SVM Stock Selection Model within PCA. Procedia Comput. Sci. 2014, 31, 406–412. [Google Scholar] [CrossRef]
  13. Kazem, A.; Sharifi, E.; Hussain, F.K.; Saberi, M.; Hussain, O.K. Support vector regression with chaos-based firefly algorithm for stock market price forecasting. Appl. Soft. Comput. 2013, 13, 947–958. [Google Scholar] [CrossRef]
  14. Polamuri, S.R.; Srinivas, K.; Mohan, A.K. Stock Market Prices Prediction using Random Forest and Extra Tree Regression. Int. J. Recent Technol. Eng. 2019, 8, 1224–1228. [Google Scholar] [CrossRef]
  15. Vijh, M.; Chandola, D.; Tikkiwal, V.A.; Kumar, A. Stock Closing Price Prediction using Machine Learning Techniques. Procedia Comput. Sci. 2020, 167, 599–606. [Google Scholar] [CrossRef]
  16. Göçken, M.; Özçalıcı, M.; Boru, A.; Dosdoğru, A.T. Integrating metaheuristics and Artificial Neural Networks for improved stock price prediction. Expert Syst. Appl. 2016, 44, 320–331. [Google Scholar] [CrossRef]
  17. Zhong, X.; Enke, D. Predicting the daily return direction of the stock market using hybrid machine learning algorithms. Financ. Innov. 2019, 5, 4. [Google Scholar] [CrossRef]
  18. Hoseinzade, E.; Haratizadeh, S. CNNpred: CNN-based stock market prediction using a diverse set of variables. Expert Syst. Appl. 2019, 129, 273–285. [Google Scholar] [CrossRef]
  19. Durairaj, M.; Mohan, B.H.K. A convolutional neural network based approach to financial time series prediction. Neural Comput. Appl. 2022, 34, 13319–13337. [Google Scholar] [CrossRef] [PubMed]
  20. Baek, Y.; Kim, H.Y. ModAugNet: A new forecasting framework for stock market index value with an overfitting prevention LSTM module and a prediction LSTM module. Expert Syst. Appl. 2018, 113, 457–480. [Google Scholar] [CrossRef]
  21. Fazeli, A.; Houghten, S. Deep Learning for the Prediction of Stock Market Trends. 2019 IEEE Int. Conf. Big Data 2019, 5513–5521. [Google Scholar] [CrossRef]
  22. Livieris, I.E.; Pintelas, E.; Pintelas, P. A CNN–LSTM model for gold price time-series forecasting. Neural Comput. Appl. 2020, 32, 17351–17360. [Google Scholar] [CrossRef]
  23. Lu, W.; Li, J.; Li, Y.; Sun, A.; Wang, J. A CNN-LSTM-Based Model to Forecast Stock Prices. Complexity 2020. [Google Scholar] [CrossRef]
  24. Song, H.; Choi, H. Forecasting Stock Market Indices Using the Recurrent Neural Network Based Hybrid Models: CNN-LSTM, GRU-CNN, and Ensemble Models. Appl. Sci. 2023, 13, 4644. [Google Scholar] [CrossRef]
  25. Baek, H. A CNN-LSTM Stock Prediction Model Based on Genetic Algorithm Optimization. Asia-Pac Financ. Mark. 2023, 31, 205–220. [Google Scholar] [CrossRef]
  26. Chung, K.K. Financial Forecasting Using Neural Network or Machine Learning Techniques, 2001. Available online: https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=e7ce06dc8611f415dc2a609784d2bc579441b34d (accessed on 3 September 2022).
  27. Priyadarshani, K.A.I. Modelling Australian Stock Market Volatility. Doctoral Dissertation, University of Wollongong, January 2011. [Google Scholar]
  28. Hargreaves, C.; Hao, Y. Prediction of Stock Performance Using Analytical Techniques. J. Emerg. Technol. Web Intell. 2013, 5, 136–142. [Google Scholar] [CrossRef]
  29. Hussain, W.; Merigó, J.M.; Raza, M.R. Predictive intelligence using ANFIS-induced OWAWA for complex stock market prediction. Int. J. Intell. Syst. 2021, 37, 4586–4611. [Google Scholar] [CrossRef]
  30. Arteche, J.; García-Enríquez, J. Singular Spectrum Analysis for signal extraction in Stochastic Volatility models. Econ. Stat. 2017, 1, 85–98. [Google Scholar] [CrossRef]
  31. Wang, J.; Li, X. A combined neural network model for commodity price forecasting with SSA. Soft Comput. 2018, 22, 5323–5333. [Google Scholar] [CrossRef]
  32. Syukur, A.; Marjuni, A. Stock Price Forecasting Using Univariate Singular Spectral Analysis through Hadamard Transform. Int. J. Intell. Eng. Syst. 2020, 13, 96–107. [Google Scholar] [CrossRef]
  33. Fathi, A.Y.; El-Khodary, I.A.; Saafan, M. Integrating singular spectrum analysis and nonlinear autoregressive neural network for stock price forecasting. IAES Int. J. Artif. Intell. 2022, 11, 851. [Google Scholar] [CrossRef]
  34. Neeraj, N.; Mathew, J.; Agarwal, M.; Behera, R.K. Long short-term memory-singular spectrum analysis-based model for electric load forecasting. Electr. Eng. 2020, 103, 1067–1082. [Google Scholar] [CrossRef]
  35. Wei, S.; Bai, X. Multi-Step Short-Term Building Energy Consumption Forecasting Based on Singular Spectrum Analysis and Hybrid Neural Network. Energies 2022, 15, 1743. [Google Scholar] [CrossRef]
  36. Golyandina, N.; Nekrutkin, V.V.; Zhigljavsky, A. Analysis of Time Series Structure: SSA and Related Techniques; Chapman & Hall/CRC: Boca Raton, FL, USA, 2001. [Google Scholar]
  37. Hassani, H.; Mahmoudvand, R.; Zokaei, M. Separability and window length in singular spectrum analysis. Comptes Rendus Math. 2011, 349, 987–990. [Google Scholar] [CrossRef]
  38. Rodrigues, P.M.M.; Mahmoudvand, R. A new approach for the vector forecast algorithm in singular spectrum analysis. Commun. Stat.–Simul. Comput. 2019, 49, 591–605. [Google Scholar] [CrossRef]
  39. Hassani, H. Singular Spectrum Analysis: Methodology and Comparison. J. Data Sci. 2021, 5, 239–257. [Google Scholar] [CrossRef]
  40. Fathi, A.Y.; El-Khodary, I.A.; Saafan, M. A Hybrid Model Integrating Singular Spectrum Analysis and Backpropagation Neural Network for Stock Price Forecasting. Rev. D’intelligence Artif. 2021, 35, 483–488. [Google Scholar] [CrossRef]
  41. Market Index—S&P/ASX50 (LIVE DATA): Share Prices & Charts. Available online: https://www.marketindex.com.au/asx50 (accessed on 8 May 2023).
  42. Yahoo Finance—Business Finance Stock Market News. Available online: https://in.finance.yahoo.com/ (accessed on 8 May 2023).
  43. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014. [Google Scholar] [CrossRef]
  44. Dessain, J. Machine learning models predicting returns: Why most popular performance metrics are misleading and proposal for an efficient metric. Expert Syst. Appl. 2022, 199, 116970. [Google Scholar] [CrossRef]
  45. S&P/ASX50—S&P Dow Jones Indices. Available online: www.spglobal.com/ (accessed on 14 July 2023).
  46. Yan, K.; Ling, Y. Machine learning-based analysis of volatility quantitative investment strategies for American financial stocks. Quant. Financ. Econ. 2024, 8, 364–386. [Google Scholar] [CrossRef]
Figure 1. LSTM structure.
Figure 2. Experiment procedure flow diagram.
Figure 3. Reconstructed series plot of ALL.AX with window_length = 63.
Figure 4. SSA-CNN model structure (left), SSA-CNN-LSTM model structure (right).
Figure 5. PnL Diagram for ASX50 Index and the best performing models in each type.
Table 1. Stock Grouping Detail.
Group | Sector | Company Name | Company Code on Yahoo Finance
Industrial and Infrastructure SectorsBasic MaterialsBHP Group LimitedBHP.AX
Basic MaterialsFortescue Metals Group LimitedFMG.AX
Basic MaterialsIGO LimitedIGO.AX
Basic MaterialsJames Hardie Industries plcJHX.AX
Basic MaterialsMineral Resources LimitedMIN.AX
Basic MaterialsNewcrest Mining LimitedNCM.AX
Basic MaterialsNorthern Star Resources LimitedNST.AX
Basic MaterialsPilbara Minerals LimitedPLS.AX
Basic MaterialsRio Tinto GroupRIO.AX
Basic MaterialsSouth32 LimitedS32.AX
EnergyOrigin Energy LimitedORG.AX
EnergyWashington H. Soul Pattinson and Company LimitedSOL.AX
EnergySantos LimitedSTO.AX
EnergyWoodside Energy Group Ltd.WDS.AX
IndustrialsAuckland International Airport LimitedAIA.AX
IndustrialsBrambles LimitedBXB.AX
IndustrialsQantas Airways LimitedQAN.AX
IndustrialsReece LimitedREH.AX
IndustrialsTransurban GroupTCL.AX
Consumer and Service SectorsCommunication ServicesREA Group LimitedREA.AX
Communication ServicesTelstra Group LimitedTLS.AX
Communication ServicesTPG Telecom LimitedTPG.AX
Consumer CyclicalAristocrat Leisure LimitedALL.AX
Consumer CyclicalThe Lottery Corporation LimitedTLC.AX
Consumer CyclicalWesfarmers LimitedWES.AX
Consumer DefensiveColes Group LimitedCOL.AX
Consumer DefensiveEndeavour Group LimitedEDV.AX
Consumer DefensiveWoolworths Group LimitedWOW.AX
Financial, Healthcare, Technology, and Utilities SectorsFinancial ServicesANZ Group Holdings LimitedANZ.AX
Financial ServicesASX LimitedASX.AX
Financial ServicesCommonwealth Bank of AustraliaCBA.AX
Financial ServicesComputershare LimitedCPU.AX
Financial ServicesInsurance Australia Group LimitedIAG.AX
Financial ServicesMacquarie Group LimitedMQG.AX
Financial ServicesNational Australia Bank LimitedNAB.AX
Financial ServicesQBE Insurance Group LimitedQBE.AX
Financial ServicesSuncorp Group LimitedSUN.AX
Financial ServicesWestpac Banking CorporationWBC.AX
HealthcareCochlear LimitedCOH.AX
HealthcareCSL LimitedCSL.AX
HealthcareFisher & Paykel Healthcare Corporation LimitedFPH.AX
HealthcareRamsay Health Care LimitedRHC.AX
HealthcareResMed Inc.RMD.AX
HealthcareSonic Healthcare LimitedSHL.AX
Real EstateGoodman GroupGMG.AX
Real EstateScentre GroupSCG.AX
Real EstateStocklandSGP.AX
TechnologyWiseTech Global LimitedWTC.AX
TechnologyXero LimitedXRO.AX
UtilitiesAPA GroupAPA.AX
Table 2. SSA’s Values to be tuned.
Contribution Criteria | 99.95% | 99.97%
Window Length L | 63 (trading days in one quarter) | 252 (trading days in one year) | 504 (trading days in two years)
Table 3. Reconstructed Series values of ALL.AX with window_length = 63.
Contribution criterion: 99.95% | 99.97%
Date | Periodicity | Trend | Aggregate | Periodicity | Trend | Aggregate | Actual
12 April 2018−3.80026.50522.704−4.23226.50522.27221.906
13 April 2018−3.64326.57522.932−4.08826.57522.48722.512
16 April 2018−3.50026.64423.144−3.94126.64422.70322.671
17 April 2018−3.34726.71023.363−3.76026.71022.95022.848
18 April 2018−3.18126.77323.592−3.54326.77323.23123.407
19 April 2018−3.02526.83523.811−3.31326.83523.52223.463
20 April 2018−2.87126.89624.025−3.06626.89623.83023.799
23 April 2018−2.72626.95624.230−2.81526.95624.14123.892
24 April 2018−2.58527.01424.429−2.56427.01424.45024.675
26 April 2018−2.46227.07124.609−2.33827.07124.73324.852
27 April 2018−2.35527.12824.773−2.14327.12824.98425.076
30 April 2018−2.26127.18324.922−1.98427.18325.19925.001
1 May 2018−2.17527.23825.064−1.85927.23825.37925.066
2 May 2018−2.09327.29425.201−1.76827.29425.52625.905
3 May 2018−2.01827.34825.330−1.71827.34825.63025.775
4 May 2018−1.94227.40025.458−1.69727.40025.70325.533
7 May 2018−1.85827.45125.592−1.69227.45125.75925.803
8 May 2018−1.77027.50125.730−1.69827.50125.80325.290
9 May 2018−1.67127.55025.879−1.69727.55025.85325.439
10 May 2018−1.56127.59826.037−1.68127.59825.91826.157
11 May 2018−1.44627.64526.199−1.64927.64525.99526.344
14 May 2018−1.32927.69026.360−1.59827.69026.09226.735
Table 4. Models’ Performance (window_length = 63).
Company Group | Stocks | CNN (MSE, MAE) | SSA-CNN (MSE, MAE) | SSA-LSTM (MSE, MAE) | SSA-CNN-LSTM (MSE, MAE)
Consumer and Service Sectors
ALL.AX0.6460.6430.5870.5880.9760.7440.9920.773
REA.AX11.0862.50912.2522.66523.0583.70118.9423.346
TLS.AX0.0210.1370.0020.0400.0030.0430.0030.043
TPG.AX0.0440.1780.0110.0740.0180.1010.0170.091
WES.AX2.3731.3641.1830.8101.6040.9471.6520.966
WOW.AX0.3560.4850.3690.4660.4870.5420.4260.503
Financial, Healthcare, Technology, and Utilities SectorsANZ.AX0.2160.3820.3380.4400.4840.5050.4310.477
APA.AX0.1850.3820.0500.1750.0860.2270.1280.285
ASX.AX3.0951.3852.3491.2083.0841.3683.0731.384
CBA.AX36.1985.6725.4011.7156.7451.9375.6541.858
COH.AX20.0343.45115.2122.92824.6163.75318.9913.269
CPU.AX1.9861.3350.4380.4760.5990.5510.4780.501
CSL.AX56.5826.44719.5643.51325.3624.01321.7283.726
FPH.AX0.4210.5230.2920.4420.7110.6600.5580.587
GMG.AX0.7690.7140.3960.4750.5640.5630.5160.543
IAG.AX0.0590.2200.0080.0730.0100.0820.0130.090
MQG.AX97.7748.96022.6783.50930.6874.07022.8333.588
NAB.AX2.7401.5670.4670.5120.5950.5720.5010.529
QBE.AX0.3810.5270.0710.2010.1370.2800.1100.251
RHC.AX4.2021.3835.2151.2127.4811.5545.9701.359
RMD.AX3.5291.7210.4550.5180.9160.7491.0550.831
SCG.AX0.0020.0370.0040.0500.0050.0590.0040.052
SGP.AX0.0040.0500.0040.0490.0060.0590.0050.055
SHL.AX0.9390.7500.8740.7051.0250.7711.0750.788
SUN.AX0.0720.2090.0580.1840.0850.2370.0660.203
WBC.AX0.1520.3260.2310.3470.3290.4100.2670.374
WTC.AX31.2465.0814.1591.5843.8031.5113.6851.462
XRO.AX9.7212.4829.9612.40216.4253.19111.8912.623
Industrial and Infrastructure SectorsAIA.AX0.0190.1070.0140.0910.0200.1100.0170.102
BHP.AX6.3772.1741.1750.8382.1521.1631.6301.047
BXB.AX0.0910.2370.0700.1960.1260.2640.1230.261
FMG.AX4.8412.0510.2670.4000.3640.4580.4270.497
IGO.AX2.6051.5030.2780.4100.3790.4890.2920.431
JHX.AX1.6500.9601.2660.8552.2051.1521.6160.996
MIN.AX115.4879.9716.5062.0087.5892.16615.9843.215
NCM.AX1.5831.0200.2840.4150.4440.5250.4870.542
NST.AX0.2440.4120.0810.2230.1010.2470.0890.242
ORG.AX0.0990.2270.0490.1310.1100.2420.0730.180
PLS.AX0.4460.6030.0420.1570.0430.1530.0370.146
QAN.AX0.0460.1760.0210.1110.0330.1420.0280.131
REH.AX0.1990.3680.2100.3600.4710.5500.2960.428
RIO.AX21.1863.6936.6402.02311.5822.75210.2942.583
S32.AX0.0670.2230.0170.1020.0390.1490.0250.124
SOL.AX0.4230.5470.1930.3350.3410.4440.2880.400
STO.AX0.0700.2250.0260.1210.0420.1570.0330.143
TCL.AX0.0680.2270.0460.1710.0720.2140.0570.193
WDS.AX4.7751.9720.9040.7641.8371.1031.4080.957
Table 5. Models’ Performance (window_length = 252).
Company Group | Stocks | CNN (MSE, MAE) | SSA-CNN (MSE, MAE) | SSA-LSTM (MSE, MAE) | SSA-CNN-LSTM (MSE, MAE)
Consumer and Service Sectors
ALL.AX0.6460.6430.6360.6270.9400.7481.0160.785
REA.AX11.0862.50915.0262.99427.9624.11321.0903.612
TLS.AX0.0210.1370.0040.0510.0050.0590.0060.063
TPG.AX0.0440.1780.0130.0800.0260.1190.0200.104
WES.AX2.3731.3641.4450.9401.5830.9661.7201.032
WOW.AX0.3560.4850.4800.5000.6910.6180.7530.666
Financial, Healthcare, Technology, and Utilities SectorsANZ.AX0.2160.3820.3410.4650.4480.5190.4710.539
APA.AX0.1850.3820.0520.1800.1020.2580.1650.341
ASX.AX3.0951.3853.4351.4705.2791.8465.0411.796
CBA.AX36.1985.6725.6281.8547.0472.0727.3822.076
COH.AX20.0343.45119.7533.56330.4494.43623.9453.822
CPU.AX1.9861.3350.6540.5991.0120.7410.7130.628
CSL.AX56.5826.44728.3104.20735.1034.63840.7245.028
FPH.AX0.4210.5230.3040.4290.6910.6910.3690.486
GMG.AX0.7690.7140.5340.5470.7160.6350.6100.607
IAG.AX0.0590.2200.0110.0820.0150.0980.0140.096
MQG.AX97.7748.96024.7203.74432.0894.22429.2184.047
NAB.AX2.7401.5670.4760.5710.6000.6250.7130.664
QBE.AX0.3810.5270.0810.2120.1660.3030.1370.275
RHC.AX4.2021.3834.9381.3176.1071.4925.7851.477
RMD.AX3.5291.7210.6200.6120.9710.7810.8220.710
SCG.AX0.0020.0370.0040.0480.0060.0610.0050.058
SGP.AX0.0040.0500.0050.0550.0100.0810.0070.062
SHL.AX0.9390.7501.3740.9421.4970.9791.4200.948
SUN.AX0.0720.2090.0700.2050.0980.2400.1020.249
WBC.AX0.1520.3260.2570.3980.4030.4530.3950.436
WTC.AX31.2465.0813.7651.5093.0051.3232.9781.304
XRO.AX9.7212.48211.7722.67117.5263.28712.4932.816
Industrial and Infrastructure SectorsAIA.AX0.0190.1070.0170.1000.0260.1220.0270.128
BHP.AX6.3772.1741.5490.9972.4541.2322.2071.208
BXB.AX0.0910.2370.0720.2010.1240.2640.1220.256
FMG.AX4.8412.0510.3050.4180.4760.5360.3970.475
IGO.AX2.6051.5030.2530.3960.2970.4300.2870.424
JHX.AX1.6500.9601.4940.9662.5031.2362.1561.113
MIN.AX115.4879.9716.7552.0357.8122.16311.3702.745
NCM.AX1.5831.0200.2630.4110.5310.5950.4410.536
NST.AX0.2440.4120.0690.2080.0640.2020.0790.226
ORG.AX0.0990.2270.0560.1490.0860.1930.0810.190
PLS.AX0.4460.6030.0300.1340.0410.1620.0330.139
QAN.AX0.0460.1760.0240.1180.0410.1570.0340.140
REH.AX0.1990.3680.2740.4060.3520.4780.3160.446
RIO.AX21.1863.6938.2872.21512.5462.75512.1512.765
S32.AX0.0670.2230.0180.1060.0280.1270.0330.144
SOL.AX0.4230.5470.2690.4040.4520.5230.4230.511
STO.AX0.0700.2250.0260.1250.0450.1710.0380.146
TCL.AX0.0680.2270.0400.1630.0670.2070.0590.195
WDS.AX4.7751.9720.9880.7731.3330.9121.5851.008
Table 6. Models’ Performance (window_length = 504).
Company Group | Stocks | CNN (MSE, MAE) | SSA-CNN (MSE, MAE) | SSA-LSTM (MSE, MAE) | SSA-CNN-LSTM (MSE, MAE)
Consumer and Service Sectors
ALL.AX0.6460.6432.2121.1931.6610.9621.4370.870
REA.AX11.0862.50928.1663.91624.8163.61620.2833.444
TLS.AX0.0210.1370.0420.1860.0100.0820.0100.084
TPG.AX0.0440.1780.0910.2520.0380.1440.0320.133
WES.AX2.3731.3643.5031.5882.9021.3202.2661.172
WOW.AX0.3560.4851.3270.9441.0580.8301.0050.796
Financial, Healthcare, Technology, and Utilities SectorsANZ.AX0.2160.3820.6950.6610.8820.7650.7460.682
APA.AX0.1850.3820.2000.3660.0790.2300.0840.235
ASX.AX3.0951.3857.6472.2136.6892.0555.7871.898
CBA.AX36.1985.67248.5776.3739.9762.4299.2022.347
COH.AX20.0343.45148.9525.43733.4964.53330.1664.314
CPU.AX1.9861.3354.5751.9691.0390.7730.8790.712
CSL.AX56.5826.447219.10312.27466.1066.48761.2426.298
FPH.AX0.4210.5231.1910.8730.8920.7280.6230.629
GMG.AX0.7690.7141.3320.9210.6470.6020.6200.595
IAG.AX0.0590.2200.0780.2490.0270.1180.0260.116
MQG.AX97.7748.960152.10110.99842.1384.56437.3124.540
NAB.AX2.7401.5673.4771.6561.8141.0541.3880.910
QBE.AX0.3810.5270.5390.6220.1680.3280.1580.318
RHC.AX4.2021.38314.4922.5458.1211.6437.6501.548
RMD.AX3.5291.7215.2082.0961.2050.8280.9290.727
SCG.AX0.0020.0370.0100.0760.0100.0750.0100.076
SGP.AX0.0040.0500.0140.0950.0150.0920.0130.088
SHL.AX0.9390.7501.8291.1321.1860.8501.0370.786
SUN.AX0.0720.2090.1980.3710.1600.3050.1420.288
WBC.AX0.1520.3260.6980.6480.9460.6590.8370.618
WTC.AX31.2465.08134.6595.4313.3161.3892.4361.192
XRO.AX9.7212.48215.7683.18916.9683.35610.1832.519
Industrial and Infrastructure SectorsAIA.AX0.0190.1070.0590.2100.0430.1710.0450.173
BHP.AX6.3772.17410.4172.8002.4111.2271.9141.099
BXB.AX0.0910.2370.3360.4700.1900.3570.1810.341
FMG.AX4.8412.0515.9042.3120.5450.5800.4310.496
IGO.AX2.6051.5032.9661.5960.3650.4870.2180.364
JHX.AX1.6500.9603.6901.4593.7401.5542.3571.168
MIN.AX115.4879.971133.29410.7618.4902.1966.7631.959
NCM.AX1.5831.0202.4751.3790.8370.7410.7460.684
NST.AX0.2440.4120.2580.4120.1050.2640.1150.275
ORG.AX0.0990.2270.2580.3710.0970.2020.0870.190
PLS.AX0.4460.6030.4680.6300.0430.1590.0380.152
QAN.AX0.0460.1760.0910.2580.0440.1600.0390.148
REH.AX0.1990.3680.6570.6480.3380.4620.5150.564
RIO.AX21.1863.69331.1994.71815.2273.09011.9242.647
S32.AX0.0670.2230.1130.2990.0370.1510.0320.138
SOL.AX0.4230.5471.6321.1130.5800.6050.5050.548
STO.AX0.0700.2250.1070.2750.0590.1870.0490.169
TCL.AX0.0680.2270.2930.4660.1150.2740.1050.265
WDS.AX4.7751.9724.2981.7092.0141.1441.6901.053
Table 7. Financial metrics for all models.
Models | Dollar Return (Thousand) | Dollar Gain (Thousand) | Dollar Loss (Thousand) | Win Rate | Lose Rate | ROI | Sharpe Ratio (rf = 0.02) | Max DrawDown
CNN | $1021 | $12,070 | $12,840 | 0.4669 | 0.5331 | −20.43% | −0.746 | −0.410
SSA-CNN (63) | $1121 | $14,439 | $13,066 | 0.5238 | 0.4762 | 22.43% | 0.546 | −0.438
SSA-CNN (252) | $1283 | $14,381 | $12,846 | 0.5053 | 0.4947 | 25.66% | 0.648 | −0.445
SSA-CNN (504) | $2226 | $14,733 | $12,255 | 0.5185 | 0.4815 | 44.51% | 1.214 | −0.473
SSA-LSTM (63) | $1153 | $13,824 | $12,419 | 0.5053 | 0.4947 | 23.06% | 0.611 | −0.359
SSA-LSTM (252) | $3064 | $15,701 | $12,385 | 0.5450 | 0.4550 | 61.28% | 1.597 | −0.473
SSA-LSTM (504) | $1982 | $13,701 | $11,467 | 0.5397 | 0.4603 | 39.64% | 1.156 | −0.390
SSA-CNN-LSTM (63) | $1891 | $13,364 | $11,220 | 0.5317 | 0.4683 | 37.82% | 1.169 | −0.474
SSA-CNN-LSTM (252) | $3328 | $14,749 | $11,169 | 0.5304 | 0.4696 | 66.57% | 1.878 | −0.468
SSA-CNN-LSTM (504) | $3022 | $14,328 | $11,053 | 0.5437 | 0.4563 | 60.45% | 1.751 | −0.458
Table 8. Daily Trading Positions and Dollar Return in January 2023.
Date | SSA-CNN (504): Dollar Return (Thousand) | Long Position | SSA-LSTM (252): Dollar Return (Thousand) | Long Position | SSA-CNN-LSTM (252): Dollar Return (Thousand) | Long Position
3 January 2023 | $88.0 | XRO.AX, BHP.AX, PLS.AX | $94.2 | BHP.AX, FMG.AX, PLS.AX | $80.6 | ANZ.AX, GMG.AX, AIA.AX
4 January 2023 | $107.7 | IAG.AX, NAB.AX, XRO.AX | $128.4 | ANZ.AX, BHP.AX, PLS.AX | $116.8 | ANZ.AX, WBC.AX, SOL.AX
5 January 2023 | $19.4 | TLS.AX, XRO.AX, PLS.AX | $23.2 | TPH.AX, PLS.AX, SOL.AX | $11.5 | TPG.AX, ANZ.AX, SOL.AX
6 January 2023 | $84.6 | XRO.AX, PLS.AX, REH.AX | $191.4 | TLS.AX, BHP.AX, PLS.AX | $72.2 | ANZ.AX, CSL.AX, IGO.AX
9 January 2023 | $76.7 | XRO.AX, IGO.AX, REH.AX | $8.1 | CSL.AX, FPH.AX, XRO.AX | $27.7 | CSL.AX, FPH.AX, RMD.AX
10 January 2023 | $13.1 | REA.AX, CPU.AX, IGO.AX | $45.1 | CPU.AX, XRO.AX, PLS.AX | $19.7 | CPU.AX, CSL.AX, RMD.AX
11 January 2023 | $88.0 | REA.AX, CPU.AX, REH.AX | $112.0 | CPU.AX, IAG.AX, PLS.AX | $70.8 | CPU.AX, IAG.AX, QBE.AX
12 January 2023 | $0.5 | REA.AX, CPU.AX, JHX.AX | $141.3 | CPU.AX, XRO.AX, PLS.AX | $54.3 | CPU.AX, SUN.AX, XRO.AX
13 January 2023 | $36.8 | REA.AX, JHX.AX, PLS.AX | $54.1 | REA.AX, CPU.AX, NCM.AX | $3.2 | CPU.AX, SUN.AX, NCM.AX
16 January 2023 | $183.1 | REA.AX, WTC.AX, REH.AX | $29.5 | CPU.AX, NCM.AX, PLS.AX | $7.8 | CPU.AX, SUN.AX, NCM.AX
17 January 2023 | $20.6 | CPU.AX, WTC.AX, REH.AX | $13.8 | CPU.AX, JHX.AX, MIN.AX | $1.8 | CPU.AX, SUN.AX, IGO.AX
18 January 2023 | $2.6 | XRO.AX, PLS.AX, S32.AX | $45.0 | CPU.AX, XRO.AX, JHX.AX | $6.4 | CPU.AX, WTC.AX, NCM.AX
19 January 2023 | $9.9 | ALL.AX, FPH.AX, NCM.AX | $22.4 | CPU.AX, JHX.AX, REH.AX | $10.0 | CPU.AX, NCM.AX, ORG.AX
20 January 2023 | $53.2 | ALL.AX, FPH.AX, XRO.AX | $46.4 | CPU.AX, XRO.AX, REH.AX | $115.0 | CPU.AX, FPH.AX, ORG.AX
23 January 2023 | $157.7 | ALL.AX, XRO.AX, PLS.AX | $54.8 | CPU.AX, XRO.AX, REH.AX | $2.7 | XRO.AX, ORG.AX, REH.AX
24 January 2023 | $154.1 | FPH.AX, IGO.AX, PLS.AX | $149.5 | ALL.AX, GMG.AX, REH.AX | $142.6 | ALL.AX, TPG.AX, REH.AX
25 January 2023 | $18.9 | FPH.AX, IGO.AX, PLS.AX | $10.6 | COH.AX, CPU.AX, FPH.AX | $43.7 | TPG.AX, COH.AX, FPH.AX
27 January 2023 | $16.4 | TPG.AX, WES.AX, IGO.AX | $1.7 | TPG.AX, CPU.AX, IGO.AX | $20.0 | TPG.AX, IGO.AX, SOL.AX
30 January 2023 | $17.4 | TPG.AX, NCM.AX, REH.AX | $32.2 | TPG.AX, CPU.AX, IGO.AX | $28.1 | TPG.AX, IGO.AX, NCM.AX
31 January 2023 | $41.1 | TPG.AX, WES.AX, REH.AX | $14.1 | TPG.AX, CPU.AX, RMD.AX | $3.7 | TPG.AX, RMD.AX, SHL.AX
Total = $859.0 | Total = $709.2 | Total = $444.8
Table 9. Frequency of stock selections by each model.
Company Group | Stocks | CNN | SSA-CNN (504) | SSA-LSTM (252) | SSA-CNN-LSTM (252)
Consumer and Service SectorsALL.AX16Total 67 times20Total 104 times6Total 109 times26Total 132 times
REA.AX22254626
TLS.AX0330
TPG.AX19102430
WES.AX0391728
WOW.AX1071322
Financial, Healthcare, Technology, and Utilities SectorsANZ.AX9Total 251 times18Total 353 times10Total 309 times28Total 366 times
APA.AX18810
ASX.AX7132439
CBA.AX01036
COH.AX12365
CPU.AX14265337
CSL.AX4423
FPH.AX18234819
GMG.AX4371533
IAG.AX971324
MQG.AX091119
NAB.AX21246
QBE.AX21234
RHC.AX151049
RMD.AX725621
SCG.AX22615
SGP.AX1525106
SHL.AX5202233
SUN.AX17649
WBC.AX9121012
WTC.AX242626
XRO.AX19515742
Industrial and Infrastructure SectorsAIA.AX12Total 435 times1Total 299 times1Total 338 times3Total 258 times
BHP.AX13162818
BXB.AX16301
FMG.AX1412203
IGO.AX26322246
JHX.AX12215025
MIN.AX4421170
NCM.AX23122613
NST.AX412526
ORG.AX262410
PLS.AX55599020
QAN.AX31146
REH.AX12502443
RIO.AX2113824
S32.AX216154
SOL.AX701018
STO.AX19548
TCL.AX121000
WDS.AX30101310
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
