LSTM-RF Stock Prediction Algorithm via Short-Term Directional Probability-Based Model Selection

Zhu, Chunman; Dawod, Ahmad Yahya; Yu, Xi; Zhou, Qingwei

doi:10.3390/info17060548


by Chunman Zhu ¹,², Ahmad Yahya Dawod ¹,*, Xi Yu ³ and Qingwei Zhou ²

¹ International College of Digital Innovation, Chiang Mai University, Chiang Mai 50200, Thailand

² School of Artificial Intelligence, Sichuan Tourism University, Chengdu 610100, China

³ Stirling College, Chengdu University, Chengdu 610106, China

* Author to whom correspondence should be addressed.

Information 2026, 17(6), 548; https://doi.org/10.3390/info17060548

Submission received: 2 May 2026 / Revised: 26 May 2026 / Accepted: 27 May 2026 / Published: 2 June 2026

(This article belongs to the Special Issue AI and Machine Learning in the Big Data Era: Advanced Algorithms and Real-World Applications)


Abstract

This paper proposes a hybrid machine learning algorithm that enhances stock price prediction accuracy by selecting the optimal model based on the predicted probabilities of short-term upward and downward trends. First, a long short-term memory (LSTM) network and a random forest (RF) model are employed to forecast the next-day closing price. Then, based on each model’s statistical performance in predicting upward (HR+) and downward (HR−) trends over the preceding 60 trading days, the optimal model is selected, and the final forecast is determined accordingly. Experimental results based on nine stocks from the Shanghai and Shenzhen Stock Exchanges, covering the period from 1 January 2018 to 31 December 2023, demonstrate that the proposed method outperforms RF, CNN, LSTM, GRU, CNN-LSTM, LSTM-RNN, LSTM-GRU, and AE + LSTM models. Specifically, it achieves superior performance in the directional accuracy metrics (HR, HR+, and HR−), with overall HR improving by approximately 2–5% and MAPE decreasing by about 1–2%. Furthermore, the results indicate that the LSTM model performs better in upward trend prediction, while the RF model is more effective in downward trend prediction. In addition, the tanh activation function is found to outperform ReLU in deep learning models for stock prediction. These findings suggest that the proposed algorithm has practical value for research on stock investment algorithms.

Keywords:

LSTM; random forests; stock price prediction; financial forecasting; directional prediction

1. Introduction

The equity capital market, as a central platform for corporate financing and asset allocation, facilitates the efficient flow of capital through stock issuance and trading mechanisms. Stock market prediction, a frontier in financial engineering, fundamentally involves modeling nonlinear time series. Traditional econometric methods have proven inadequate for capturing the complex interactions among multiple influencing factors, such as policy changes and macroeconomic indicators [1]. In recent years, progress in machine learning has introduced a new paradigm for developing predictive models. Deep learning networks have significantly improved prediction accuracy by identifying latent patterns in high-dimensional data [2]. Accurate stock price prediction is a fundamental component of broader market forecasting, making the enhancement of predictive model performance particularly significant. However, designing such models remains challenging because stock prices exhibit complex nonlinear dynamics and pronounced volatility, which have long posed significant difficulties for predictive modeling.

In the field of stock market prediction, numerous studies have focused on developing predictive models using various machine learning techniques, achieving promising results. Traditional machine learning approaches for stock prediction include support vector machines (SVMs) [3], linear regression [4], random forests (RFs) [5,6], and k-nearest neighbors (KNN) [7,8], among others. For instance, a fine-tuned support vector regression model [9] has been applied to time series data, where grid search is employed to select the optimal kernel function and tune model parameters on the training set. Experimental results demonstrate that this approach improves prediction accuracy while reducing computational time and memory usage. Furthermore, comparative studies of traditional machine learning algorithms [10] have shown that random forests perform particularly well on large datasets; however, their accuracy declines as the number of technical indicators decreases. These findings suggest that, despite their effectiveness, traditional machine learning methods still face challenges in capturing the nonlinear characteristics of stock time series data.

With the rapid advancement of deep learning technologies, extensive research has been conducted to develop effective models for stock price prediction, including artificial neural networks (ANNs) [11], convolutional neural networks (CNNs) [12], long short-term memory (LSTM) networks [13,14,15,16,17], and gated recurrent units (GRUs) [18,19,20]. For instance, ref. [21] proposed a CNN-based approach that incorporates eight input features, such as financial technical indicators, gold prices and their volatility indices, and crude oil prices and their volatility indices, to predict future stock market trends, achieving an accuracy of up to 67% in forecasting price movements over the next 10 days. In addition, ref. [22] reported that a single LSTM model achieves approximately 50% accuracy in predicting the direction of stock price changes; to address this limitation, multiple LSTM models were employed to predict closing prices over a four-day horizon, and the rate of trend change was calculated to improve directional prediction accuracy. Furthermore, ref. [23] introduced a method that applies wavelet thresholding to remove high-frequency noise from time series signals, followed by an improved GRU model, resulting in enhanced prediction performance on the S&P 500 and Shanghai Composite 300 indices. Comparative analysis indicates that some studies focus on algorithms applied after the prediction model, while others emphasize data preprocessing and model optimization before prediction. Despite these advancements, improvements in prediction accuracy remain limited. As shown in [24], the accuracy of individual deep learning models typically ranges from 46% to 67%. Moreover, ref. [25] demonstrated that although artificial neural networks can better handle missing data, they are prone to overfitting, which reduces their generalization capability, while support vector machines (SVMs) are less susceptible to overfitting and achieve average prediction accuracies between 60% and 70%. Overall, these findings suggest that single-model approaches exhibit limited performance in stock price prediction, primarily due to the nonlinear nature of financial time series and their sensitivity to external factors such as financial news, as well as the impact of high-frequency fluctuations on prediction accuracy.

To improve the accuracy of stock price prediction, researchers have increasingly focused on hybrid models, such as CNN-LSTM [26,27], LSTM-RNN [28], AE + LSTM [29], and LSTM-GRU [30,31]. For example, ref. [32] proposed a CNN-LSTM framework in which CNN is used to extract features from input data, followed by LSTM for predicting stock closing prices, achieving an accuracy of approximately 53%. Similarly, ref. [28] employed the ARIMA model to preprocess stock index data and then utilized an LSTM-RNN model to forecast daily closing prices, yielding an RMSE of 2.51 and an MAPE of 1.84%. In addition, ref. [30] combined GRU and LSTM models, where GRU captures short- and medium-term dependencies and LSTM extracts long-term features, with final predictions generated through convolutional and fully connected layers; however, the resulting mean squared error (MSE) was 3% higher than that of the best existing method. These studies demonstrate that hybrid models leverage the complementary strengths of different deep learning architectures. Nevertheless, the multi-layer feature transformation process may lead to feature distortion, information loss, or the introduction of noise, requiring extensive parameter tuning and resulting in unstable performance improvements. Furthermore, ref. [29] proposed an AE + LSTM model, in which an autoencoder is used for feature extraction and LSTM for return prediction. The model was evaluated using directional accuracy metrics, including HR, HR+, and HR−, with results showing that the average HR ranged from 47% to 50.42%. Overall, these findings indicate that, although hybrid models can enhance prediction performance to some extent, their accuracy remains insufficient for practical applications and requires further improvement.

To incorporate richer stock-related feature information, Transformer-based models have attracted increasing attention in recent years [33]. For example, ref. [34] introduced a framework that employs the Time2Vec encoding technique to represent stock time series data, enabling a Transformer model to perform price prediction and achieve promising results on historical data from eight stocks listed on the Dhaka Stock Exchange. In addition, ref. [35] constructed a graph structure based on relationships such as industry affiliation, price co-movements, and supply chain connections, while leveraging a BERT model to extract sentiment information from social media; the integrated features were then fed into a Transformer model for stock price prediction, achieving a mean absolute percentage error of 0.80% for S&P 500 constituents. Furthermore, ref. [36] utilized generative adversarial networks (GANs) to generate synthetic stock price data incorporating market sentiment and volatility, and applied attention mechanisms to select salient features and patterns for prediction. Although Transformer models are capable of capturing long-range dependencies and support parallel computation, they are relatively insensitive to temporal distance. To address this limitation, existing studies have incorporated additional features, such as market news, sentiment indicators, industry attributes, and financial indicators, to enhance dependency modeling. However, compared with daily trading data, these auxiliary features often exhibit lag effects on stock prices, making them less suitable for short-term prediction tasks. Moreover, the effective quantification of such heterogeneous features remains a significant challenge. Table 1 summarizes the key feature combinations used in existing studies and highlights the distinctions between this work and prior research.

To further enhance the accuracy of stock price prediction, this study focuses on the development of hybrid models. In terms of input features, daily trading data with a direct impact on short-term price movements are selected. For model design, classical deep learning architectures that are well-suited for time series forecasting are adopted to effectively capture temporal dependencies in stock data. Through experimental analysis, the performance of several classical models was systematically evaluated for stock price prediction. The models that demonstrated high performance, particularly RF and LSTM, were integrated to develop a hybrid algorithm (LSTM–RF) that leverages the probabilities of short-term upward and downward trends.

The key contributions of this study are summarized as follows: (1) The impact of activation functions on the effectiveness of deep learning models, including LSTM, GRU, CNN, and AE–LSTM, was investigated. The results confirm that the tanh activation function outperforms ReLU in stock prediction tasks. Furthermore, these models were observed to converge after approximately 240 training iterations, providing practical guidance for model training. (2) The predictive performance of RF, LSTM, GRU, CNN, and AE–LSTM was compared using statistical measures, including overall trend accuracy (HR), downward trend accuracy (HR−), and upward trend accuracy (HR+). The results indicate that RF achieves superior performance in predicting downward trends (HR−), whereas LSTM performs better in predicting upward trends (HR+). This finding provides a basis for designing more effective hybrid prediction algorithms. (3) A hybrid algorithm (LSTM-RF) is proposed to combine the complementary strengths of LSTM and RF. Specifically, LSTM and RF are first used to forecast the closing stock price. Then, based on each model’s statistical performance in predicting upward (HR+) and downward (HR−) trends over the preceding 60 trading days, the optimal model is selected, and the ultimate forecast is determined accordingly. Experimental results based on stocks from the Shanghai Stock Exchange and the Shenzhen Stock Exchange demonstrate that the proposed method performs well across HR, HR+, and HR− metrics, improving the overall HR by approximately 2–5%. LSTM-RF also outperforms other hybrid models in terms of prediction error, measured by MAPE, achieving improvements of approximately 1–2%. These findings suggest that the proposed approach has practical value for the research on stock investment-related algorithms. The proposed LSTM-RF algorithm addresses the gap in existing approaches by incorporating directional probabilities to select the optimal prediction model.

In summary, the proposed LSTM–RF method represents an effective approach for stock time-series prediction and can be extended to other time-series forecasting tasks. It contributes to advancements in hybrid model design, objective function selection, and algorithm optimization, thereby offering a useful basis for subsequent studies and real-world implementation.

The remainder of this paper is organized as follows: Section 2 outlines the hybrid framework, providing details on stock data normalization, the LSTM and RF models, and the proposed LSTM–RF method. Section 3 describes the experimental data and the results obtained. Section 4 examines the results and outlines the limitations of the approach. Finally, Section 5 summarizes the findings and suggests future research directions.

2. Materials and Methods

This section presents the proposed hybrid algorithm (LSTM-RF) that leverages the predicted probabilities of short-term upward and downward trends to improve stock price forecasting. The algorithm comprises four main steps: (1) preprocessing daily trading data; (2) using a trained LSTM model to estimate the closing price; (3) applying a trained RF model to generate an alternative next-day closing price prediction; and (4) selecting the most appropriate forecast through the proposed LSTM–RF integration strategy.

2.1. Dataset Preparation

Ten commonly used indicators from daily stock trading data were selected as inputs to the deep learning model: opening price, closing value, minimum price, maximum price, trading volume, trading amount, amplitude, increase/decrease amount, increase/decrease percentage, and turnover rate. The number of days of memory for the LSTM network is denoted as $T$; its value is usually obtained through grid search during model training. The input feature data can be represented as

$$ \{(ID_{1,1}, ID_{2,1}, \dots, ID_{10,1}), (ID_{1,2}, ID_{2,2}, \dots, ID_{10,2}), \dots, (ID_{1,t}, ID_{2,t}, \dots, ID_{10,t})\}, $$

where $t \in [1, T]$. To address inconsistencies in the value ranges of different indicators, the Z-score standardization method is applied to each feature. Specifically, the average value and standard deviation of each feature are first computed, and then the data are normalized using Equation (1). After normalization, each indicator has zero mean and unit variance, which enhances both the model’s training precision and its stability.

$$ \mathit{STvalue}_{i,t} = \frac{ID_{i,t} - mn_{i}}{sd_{i}} \tag{1} $$

where $ID_{i,t}$ represents the value of the $i$-th daily trading indicator at time step $t$. The terms $mn_{i}$ and $sd_{i}$ denote the average value and standard deviation of the $i$-th indicator, respectively.
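The standardization in Equation (1) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names `zscore_fit` and `zscore_apply`, the toy data, the use of the population standard deviation, and the handling of a constant series are all assumptions that the paper does not specify.

```python
from statistics import fmean, pstdev


def zscore_fit(columns):
    """Return {indicator: (mn_i, sd_i)} computed from the training data."""
    # Assumption: population standard deviation; the paper does not say which form is used.
    return {name: (fmean(vals), pstdev(vals)) for name, vals in columns.items()}


def zscore_apply(columns, params):
    """Standardize every indicator with stored parameters, as in Equation (1)."""
    scaled = {}
    for name, vals in columns.items():
        mn, sd = params[name]
        # Assumption: a constant series (sd == 0) is mapped to zeros to avoid division by zero.
        scaled[name] = [(v - mn) / sd if sd > 0 else 0.0 for v in vals]
    return scaled


# Toy data with two of the ten indicators (values are made up).
train = {"close": [10.0, 12.0, 14.0], "volume": [100.0, 300.0, 200.0]}
params = zscore_fit(train)
scaled = zscore_apply(train, params)
```

Keeping the fitted parameters separate from the transformation lets the values computed on the training period be reused unchanged at inference time, in line with the statement that the standardization parameters are stored for each target.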

2.2. LSTM Network

Long short-term memory (LSTM), a widely used extension of recurrent neural networks, improves the modeling of sequential data by refining the internal neuron architecture. Its key feature is the incorporation of three gating units: the forget gate ($f_{t}$), the input gate ($i_{t}$), and the output gate ($o_{t}$). These gates manage the preservation and transmission of information across time steps, thereby alleviating the vanishing gradient issue commonly observed in conventional RNNs when handling long sequences. As a result, LSTM is particularly effective for time-series modeling. Figure 1 depicts the basic structure of the LSTM network.

The LSTM computations are given in Equations (2)–(7):

$$ f_{t} = \sigma(W_{f} \cdot [h_{t-1}, x_{t}] + b_{f}) \tag{2} $$

Here, $b_{f}$ represents the bias term, $W_{f}$ represents the weight matrix, and $\sigma$ denotes the sigmoid activation function, which produces outputs within the interval (0, 1).

$$ i_{t} = \sigma(W_{i} \cdot [h_{t-1}, x_{t}] + b_{i}) \tag{3} $$

$$ \tilde{C}_{t} = \tanh(W_{c} \cdot [h_{t-1}, x_{t}] + b_{c}) \tag{4} $$

$$ C_{t} = f_{t} \odot C_{t-1} + i_{t} \odot \tilde{C}_{t} \tag{5} $$

$$ o_{t} = \sigma(W_{o} \cdot [h_{t-1}, x_{t}] + b_{o}) \tag{6} $$

$$ h_{t} = o_{t} \odot \tanh(C_{t}) \tag{7} $$

Here, $\tilde{C}_{t}$ denotes the candidate update to the cell state, $C_{t}$ represents the memory state, $h_{t-1}$ refers to the hidden state at the previous time step, $x_{t}$ is the input at time step $t$, and $\odot$ indicates element-wise multiplication.
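To make the data flow through the gates concrete, the sketch below evaluates one time step of Equations (2)–(7). It is an illustration under stated assumptions, not the network used in the paper: plain Python lists keep it dependency-free, the names `lstm_step`, `affine`, and the parameter dictionary layout are hypothetical, and the all-zero weights serve only as an easily checked toy case.

```python
import math


def sigmoid(z):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Compute W · v + b for a matrix W stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step; p holds W_f, W_i, W_c, W_o and b_f, b_i, b_c, b_o."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(p["W_f"], z, p["b_f"])]        # Eq. (2)
    i_t = [sigmoid(a) for a in affine(p["W_i"], z, p["b_i"])]        # Eq. (3)
    c_hat = [math.tanh(a) for a in affine(p["W_c"], z, p["b_c"])]    # Eq. (4)
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]  # Eq. (5)
    o_t = [sigmoid(a) for a in affine(p["W_o"], z, p["b_o"])]        # Eq. (6)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]               # Eq. (7)
    return h_t, c_t


# Toy case (assumption): hidden size 1, input size 2, all weights and biases zero.
hidden, n_in = 1, 2
params = {f"W_{g}": [[0.0] * (hidden + n_in) for _ in range(hidden)] for g in "fico"}
params.update({f"b_{g}": [0.0] * hidden for g in "fico"})
h_t, c_t = lstm_step([0.5, -0.2], [0.0], [1.0], params)
```

With zero weights every gate evaluates to 0.5 and the candidate state to 0, so the new cell state is half the previous one. This shows how the forget gate scales the retained memory.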

The LSTM network proposed in this paper employs a hierarchical architecture: an input layer consisting of a single LSTM neuron to receive the original time-series features; multiple hidden LSTM layers responsible for encoding time-series information and extracting relevant features; and a fully connected deep neural network (DNN) layer followed by a DNN output layer to generate the final predictions.

Several hyperparameters are tuned, including the number of hidden layers, the learning rate, the number of neurons per layer, the training epochs, the dropout rate, the loss function, and the activation function. Given the properties of time-series data, the input window size is limited to a range of 1–10 days to meet the task requirements. Based on the experimental results reported in Section 3.2, the number of training iterations is set to 240, and the activation function is set to tanh. The finalized hyperparameter settings are listed in Table 2.

Because the numerical scale and fluctuation range of daily trading data differ substantially between individual stocks and industry indices, parameters optimized for one target cannot be applied directly to another. Each target therefore requires an independent training process. After training, the key configurations (the Z-score standardization parameters, the selected LSTM network structure parameters, and the optimal time window length) are stored separately for each target to ensure consistency and accuracy in subsequent model inference.

2.3. Random Forest

Random forest is a classical algorithm in ensemble learning. It significantly improves predictive performance by building numerous independent decision trees in parallel and aggregating their outputs. In the financial domain, this algorithm effectively captures complex nonlinear relationships and can uncover market patterns that traditional linear models often fail to detect. As a result, it is widely used in stock price trend prediction tasks.

The bootstrap aggregation process in random forest operates as follows: Multiple independent training subsets are generated through random sampling with replacement, and separate decision tree models are trained independently. Unlike traditional decision trees, random forests introduce a feature selection mechanism during tree construction. Specifically, when a node is partitioned, a randomly chosen subset of features is selected from the full feature space, and the optimal split is determined from this subset. This strategy mitigates reliance on specific features while increasing diversity and complementarity among the individual trees. In classification problems, the final decision is made through majority voting, whereas in regression tasks, the prediction is computed as the average of all tree outputs. This ensemble mechanism effectively balances bias and variance, thereby improving overall predictive performance. A generalized form of the model is shown in Equation (8).

$$ \hat{F}(x) = A\left\{ f_{b}(x; \Theta_{b}) \right\}_{b=1}^{B} \tag{8} $$

Here, $\hat{F}(x)$ represents the ultimate forecast output; $f_{b}(x; \Theta_{b})$ denotes the prediction function of the $b$-th decision tree, whose behavior is governed by the random parameter $\Theta_{b}$; and $A$ is the integration operator corresponding to the task type.

For random forest training, the same Z-score standardized parameters are employed, and the number of decision trees is set to 1000. Although adding more trees typically enhances model performance, it also increases computational overhead. To guarantee reproducibility, the random seed is set to 42. During both training and prediction, the ten daily trading data indicators for each day are used as input to the random forest model.
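The two ingredients of Equation (8), bootstrap sampling and the integration operator $A$, can be sketched in a few lines. This is a simplified illustration, not the authors' code: the function names and the per-tree outputs are hypothetical, and tree fitting itself is omitted. The paper does not name the library used. In practice a ready-made implementation such as scikit-learn's `RandomForestRegressor(n_estimators=1000, random_state=42)` would typically be used with the settings described above.

```python
import random
from collections import Counter
from statistics import fmean


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement, giving one bagging training subset."""
    return [rng.choice(rows) for _ in rows]


def aggregate(tree_outputs, task):
    """Integration operator A: mean for regression, majority vote for classification."""
    if task == "regression":
        return fmean(tree_outputs)
    if task == "classification":
        return Counter(tree_outputs).most_common(1)[0][0]
    raise ValueError(f"unknown task: {task}")


rng = random.Random(42)  # fixed seed, mirroring the reproducibility setting in the text
subset = bootstrap_sample(list(range(10)), rng)

# Hypothetical outputs of individual trees for a single input x.
price_votes = [10.2, 10.4, 10.0, 10.6]
label_votes = ["up", "down", "up"]
avg_price = aggregate(price_votes, "regression")
majority = aggregate(label_votes, "classification")
```

Because the task here is next-day price regression, the averaging branch is the one that corresponds to the forecast used in this paper.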

2.4. Proposed LSTM-RF Algorithm

The experimental results of this paper show that LSTM outperforms random forest in predicting upward stock price movements, whereas random forest performs better in forecasting downward movements. Based on this observation, this paper proposes, for the first time, an algorithm called LSTM-RF, which combines the complementary advantages of LSTM and RF in stock price prediction. Figure 2 illustrates the stages of the proposed method.

The algorithm computes directional accuracy separately for stock price increases and decreases over the preceding 60 trading days, and uses these accuracies as estimates of the probability that each model predicts an upward or a downward movement correctly. If the statistical period is too short, the sample size may be insufficient, leading to inflated probability estimates. An empirical decision threshold, denoted as $th_{1}$, is introduced to regulate model selection in the LSTM-RF algorithm. When a model's estimated probability exceeds this threshold, the model is considered to have achieved satisfactory performance in the recent period and is therefore directly selected for prediction. In this study, $th_{1}$ is set to 0.6, indicating that a model is deemed reliable when its recent directional accuracy exceeds 60%. The algorithm also employs the quantities updown, trend accuracy rate (HR), negative trend accuracy rate (HR−), and positive trend accuracy rate (HR+), which are calculated using Equations (9)–(12).

$$ \mathit{updown} = \frac{\hat{tr}_{k} - tr_{k-1}}{tr_{k-1}} \times 100\% \tag{9} $$

$$ HR = \frac{\mathrm{Count}_{k=1}^{N}\left(\hat{tr}_{k} \cdot tr_{k} > 0\right)}{\mathrm{Count}_{k=1}^{N}\left(\hat{tr}_{k} \cdot tr_{k} \neq 0\right)} \tag{10} $$

$$ HR^{-} = \frac{\mathrm{Count}_{k=1}^{N}\left(\hat{tr}_{k} < 0 \ \text{AND} \ tr_{k} < 0\right)}{\mathrm{Count}_{k=1}^{N}\left(tr_{k} < 0\right)} \tag{11} $$

$$ HR^{+} = \frac{\mathrm{Count}_{k=1}^{N}\left(\hat{tr}_{k} > 0 \ \text{AND} \ tr_{k} > 0\right)}{\mathrm{Count}_{k=1}^{N}\left(tr_{k} > 0\right)} \tag{12} $$

where $\hat{tr}_{k}$ and $tr_{k}$ denote the predicted and actual values, respectively.
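A short sketch may help to show how Equations (9)–(12) are evaluated over a window of $N$ days. It rests on one interpretive assumption that should be stated plainly: the sign tests in Equations (10)–(12) are applied to signed changes (for example, the predicted and realized percentage changes), since raw prices are always positive. The function names, the toy series, and the NaN convention for empty denominators are likewise hypothetical.

```python
def updown(pred_price, prev_price):
    """Predicted percentage change relative to the previous close, Equation (9)."""
    return (pred_price - prev_price) / prev_price * 100.0


def safe_ratio(numerator, denominator):
    """Return the ratio, or NaN when the window contains no qualifying days."""
    return numerator / denominator if denominator else float("nan")


def hit_ratios(pred_changes, true_changes):
    """Return (HR, HR+, HR-) for paired series of predicted and actual changes."""
    pairs = list(zip(pred_changes, true_changes))
    hits = sum(1 for p, a in pairs if p * a > 0)       # same sign, Eq. (10) numerator
    nonzero = sum(1 for p, a in pairs if p * a != 0)   # Eq. (10) denominator
    up_hits = sum(1 for p, a in pairs if p > 0 and a > 0)
    ups = sum(1 for _, a in pairs if a > 0)
    down_hits = sum(1 for p, a in pairs if p < 0 and a < 0)
    downs = sum(1 for _, a in pairs if a < 0)
    return (safe_ratio(hits, nonzero),
            safe_ratio(up_hits, ups),       # Eq. (12)
            safe_ratio(down_hits, downs))   # Eq. (11)


# Toy window of five days (made-up percentage changes).
pred = [1.0, -0.5, 0.8, -0.2, 0.3]
true = [0.6, 0.4, 1.1, -0.9, -0.1]
hr, hr_up, hr_down = hit_ratios(pred, true)
change = updown(10.5, 10.0)
```

In the algorithm these ratios would be computed over the preceding 60 trading days for each model separately.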

Figure 3 presents a schematic overview of the LSTM–RF algorithm. The steps of the procedure are outlined below.

Step 1: Input the target stock code and the current date. Retrieve the daily trading data used by the LSTM model and standardize it via Z-score normalization. The daily data include ten indicators: opening price, closing value, minimum price, maximum price, trading volume, trading amount, amplitude, increase/decrease amount, increase/decrease percentage, and turnover rate. Then, load the trained LSTM model to forecast the following day’s stock price, denoted as $\mathit{predictP}_{1}$, and compute the corresponding predicted percentage change, $\mathit{updown}_{1}$.

Step 2: Input the current day’s trading data and load the trained RF model to forecast the following day’s stock price, denoted as $\mathit{predictP}_{2}$, and compute the predicted percentage change, $\mathit{updown}_{2}$.

Step 3: Use the trained LSTM model to generate next-day stock price predictions for each of the past 60 trading days and calculate the corresponding accuracy metrics $HR_{1}$, $HR_{1}^{+}$, and $HR_{1}^{-}$. Similarly, apply the trained RF model to obtain predictions for the same period and compute $HR_{2}$, $HR_{2}^{+}$, and $HR_{2}^{-}$.

Step 4: Process the four possible cases defined by the combinations of upward and downward movements in $\mathit{updown}_{1}$ and $\mathit{updown}_{2}$.

(a) When both $\mathit{updown}_{1}$ and $\mathit{updown}_{2}$ indicate an upward trend, the forecasted value $\mathit{predictP}_{1}$ is selected. According to the findings reported in Section 4, the LSTM model performs better in predicting upward movements.

(b) When both $\mathit{updown}_{1}$ and $\mathit{updown}_{2}$ indicate a downward trend, the predicted price $\mathit{predictP}_{2}$ from the random forest (RF) model is selected. Experimental results in Section 4 indicate that the RF model performs better in predicting downward movements.

(c) When $\mathit{updown}_{1}$ indicates an upward trend and $\mathit{updown}_{2}$ indicates a downward trend, the model selection is based on the directional prediction accuracy over the past 60 trading days. Four cases are considered.

If both the LSTM upward accuracy $HR_{1}^{+}$ and the RF downward accuracy $HR_{2}^{-}$ exceed the threshold $th_{1}$, then the combined metrics are compared. If $(HR_{1} + HR_{1}^{+}) > (HR_{2} + HR_{2}^{-})$, the prediction $\mathit{predictP}_{1}$ is selected; otherwise, $\mathit{predictP}_{2}$ is selected.

If only $HR_{1}^{+} > th_{1}$, then $\mathit{predictP}_{1}$ is selected.

If only $HR_{2}^{-} > th_{1}$, then $\mathit{predictP}_{2}$ is selected.

If both $HR_{1}^{+}$ and $HR_{2}^{-}$ are below $th_{1}$, the combined metrics are again compared. If $(HR_{1} + HR_{1}^{+}) > (HR_{2} + HR_{2}^{-})$, $\mathit{predictP}_{1}$ is selected; otherwise, $\mathit{predictP}_{2}$ is selected.

(d) When $\mathit{updown}_{1}$ indicates a downward trend and $\mathit{updown}_{2}$ indicates an upward trend, the procedure in case (c) is applied symmetrically, that is, with the LSTM downward accuracy $HR_{1}^{-}$ and the RF upward accuracy $HR_{2}^{+}$ in place of $HR_{1}^{+}$ and $HR_{2}^{-}$.

3. Results

This section first examines the selection of activation functions and the total number of training epochs in the LSTM network. Subsequently, the forecasting capability of the proposed LSTM-RF method is contrasted with that of conventional models.

3.1. Simulation Environment

In this study, three stocks were randomly drawn from each of the Shanghai Composite, Shenzhen Composite, and ChiNext indices, resulting in a total of nine stocks, whose historical data were used for the experiments. To ensure data adequacy, each selected stock was required to have at least 700 trading days between 2020 and 2023, including a minimum of 200 trading days in 2023. These indices are widely regarded as representative benchmarks of the Chinese stock market and collectively reflect its overall performance. Table 3 presents the stock codes selected for this experiment.
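The data-adequacy rule above can be expressed as a simple filter. The sketch below is illustrative only: the function names are hypothetical, and the synthetic calendars (all weekdays, ignoring exchange holidays) stand in for real trading dates.

```python
from datetime import date, timedelta


def is_eligible(trading_dates, min_total=700, min_2023=200):
    """Check the adequacy rule: >= 700 trading days in 2020-2023 and >= 200 in 2023."""
    total = sum(1 for d in trading_dates if 2020 <= d.year <= 2023)
    in_2023 = sum(1 for d in trading_dates if d.year == 2023)
    return total >= min_total and in_2023 >= min_2023


def weekdays(start, end):
    """Synthetic calendar: every Monday to Friday between start and end (inclusive)."""
    days, current = [], start
    while current <= end:
        if current.weekday() < 5:
            days.append(current)
        current += timedelta(days=1)
    return days


long_history = weekdays(date(2020, 1, 1), date(2023, 12, 31))
recent_listing = weekdays(date(2023, 1, 1), date(2023, 12, 31))
eligible_long = is_eligible(long_history)
eligible_recent = is_eligible(recent_listing)
```

A stock listed only in 2023 meets the 200-day condition but fails the 700-day condition, so it would be excluded from the random draw.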

This study focuses on simulating and evaluating trading data from 2023, a period during which China transitioned from strict COVID-19 lockdown measures to a more relaxed policy environment. During this time, stock prices exhibited significant volatility. As market activity increased, the magnitude of price fluctuations also expanded. If the training sample period is too short, it may fail to capture the full range of stock price variations. Therefore, each model was trained using five years of historical data, from 1 January 2018 to 31 December 2022. The data were obtained from the AkShare platform and Eastmoney Securities, including daily trading data for industry indices.

3.2. Selection of LSTM Activation Function and Number of Iterations

In this study, experimental data for the stock with code 000333 were used to analyze the performance of deep learning models, including LSTM, CNN, GRU, and AE–LSTM, in stock price prediction. The choice of activation functions plays a crucial role in model effectiveness, with widely used options including sigmoid, tanh, and ReLU. However, the sigmoid function suffers from issues such as vanishing gradients and non-zero-centered outputs, which can lead to inefficient and unbalanced gradient updates. Therefore, this study focuses on a comparative analysis of tanh and ReLU.

The effectiveness of the model is evaluated based on two criteria: mean absolute percentage error (MAPE) and mean squared error (MSE). MAPE measures the proportional difference between the estimated and observed stock prices, with larger values signifying greater errors; its formulation is presented in Equation (13). MSE places greater emphasis on large deviations because the errors are squared, which makes it particularly sensitive to outliers; higher MSE values likewise indicate poorer predictive performance. Its calculation is provided in Equation (14). In Equations (13) and (14), $\hat{tr}_{k}$ denotes the predicted value at time step $k$, and $tr_{k}$ denotes the true value.

$$ MAPE = \frac{1}{N} \sum_{k=1}^{N} \left| \frac{\hat{tr}_{k} - tr_{k}}{tr_{k}} \right| \times 100\% \tag{13} $$

$$ MSE = \frac{1}{N} \sum_{k=1}^{N} \left( tr_{k} - \hat{tr}_{k} \right)^{2} \tag{14} $$
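As a quick numerical check of Equations (13) and (14), the following sketch computes both metrics for a toy series. The function names and the price values are illustrative assumptions.

```python
def mape(pred, true):
    """Mean absolute percentage error in percent, Equation (13)."""
    return sum(abs((p - t) / t) for p, t in zip(pred, true)) / len(true) * 100.0


def mse(pred, true):
    """Mean squared error, Equation (14)."""
    return sum((t - p) ** 2 for p, t in zip(pred, true)) / len(true)


# Made-up predicted and actual closing prices for three days.
pred_prices = [10.5, 19.0, 30.0]
true_prices = [10.0, 20.0, 30.0]
mape_value = mape(pred_prices, true_prices)
mse_value = mse(pred_prices, true_prices)
```

The first two days each contribute a 5% relative error, yet the second day dominates the MSE (1.0 versus 0.25 in squared error). This illustrates why MSE is more sensitive to large absolute deviations than MAPE.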

Figure 4 presents the MAPE trends of the four training models for stock code 000333 during the iterative training process using the ReLU activation function. In Figure 4a, the MAPE convergence process on the training dataset is illustrated, while Figure 4b shows the corresponding results on the validation dataset. As shown in Figure 4, all four models converge after approximately 50 iterations. In the training phase (Figure 4a), the CNN model exhibits a slower convergence rate than the other approaches, whereas the LSTM, GRU, and AE-LSTM models demonstrate similar convergence behavior. In the validation phase (Figure 4b), the CNN model again converges more slowly. In addition, LSTM shows slightly less fluctuation than GRU and AE-LSTM, indicating more stable performance.

Figure 5a displays the MAPE outcomes for the four models on the training set. The MAPE values for LSTM, GRU, AE-LSTM, and CNN are 6.57%, 6.65%, 7.21%, and 7.24%, respectively. Figure 5b reports the corresponding results on the validation dataset, where the MAPE values for GRU, LSTM, AE-LSTM, and CNN are 1.48%, 1.64%, 2.15%, and 2.67%, respectively. On the training dataset, LSTM achieves the lowest MAPE, closely followed by GRU, with a marginal difference of 0.08%. In contrast, on the validation dataset, GRU achieves the lowest MAPE, outperforming LSTM by a small margin of 0.16%. Overall, GRU and LSTM exhibit comparable and superior performance, followed by AE-LSTM and CNN.

Figure 6 presents the MSE curves of the four benchmark models during the iterative training process on both the training and validation sets. As illustrated in the figure, the CNN model exhibits a slower convergence rate, whereas the LSTM, GRU, and AE-LSTM models demonstrate similar convergence behavior.

The combined results in Figure 4, Figure 5 and Figure 6 illustrate that, under the ReLU activation function, LSTM achieves the best overall performance during training, followed by GRU, AE-LSTM, and CNN. Although LSTM and GRU perform similarly on the training dataset, LSTM exhibits the smallest fluctuation during the iterative process, indicating more stable convergence. The AE-LSTM model introduces additional noise during feature extraction through the autoencoder structure, which reduces its predictive accuracy relative to LSTM and GRU. Among the four approaches, the CNN model performs the worst with respect to both predictive accuracy and convergence efficiency.

Figure 7 and Figure 8 present the MAPE results of the four training models for stock code 000333 during the iterative process using the tanh activation function. Figure 7 illustrates the MAPE results for the training set, whereas Figure 8 presents the corresponding values for the validation set. As illustrated in Figure 7 and Figure 8, all four models tend to converge after approximately 240 iterations. Both the training and validation results indicate that CNN converges first, followed sequentially by LSTM, GRU, and AE-LSTM. Regarding prediction accuracy, LSTM achieves the best MAPE, followed by GRU, CNN, and AE-LSTM. These results suggest that the autoencoder may introduce noise during feature extraction, which can reduce prediction accuracy.

Figure 9a displays the MAPE outcomes for the four models on the training set. The MAPE values for LSTM, GRU, AE-LSTM, and CNN are 4.91%, 4.96%, 7.40%, and 7.92%, respectively. Figure 9b reports the corresponding results on the validation dataset, where the MAPE values for LSTM, CNN, GRU, and AE-LSTM are 1.51%, 1.69%, 1.78%, and 3.51%, respectively. On the training dataset, LSTM achieves the lowest MAPE, 0.05 percentage points below GRU. On the validation dataset, LSTM also obtains the lowest MAPE, 0.18 and 0.27 percentage points below CNN and GRU, respectively. Overall, LSTM demonstrates the best performance on both datasets; GRU ranks second on the training set and CNN second on the validation set, while AE-LSTM is clearly the weakest on the validation set.

Figure 10 and Figure 11 present the evolution of the loss function, measured by MSE, for the four models during the training and validation processes. The figures indicate that all models exhibit similar convergence behavior after approximately 240 iterations. Taken together, Figures 7–11 show that, when using the tanh activation function, the LSTM model achieves the best performance during training, followed by GRU, CNN, and AE-LSTM.
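The text does not state how the convergence point was read off the curves. One simple way to make such a reading reproducible is a plateau rule: report the first iteration from which the relative change in the loss stays below a tolerance for several consecutive iterations. The sketch below assumes that rule; the function name `convergence_epoch`, the tolerance, the patience, and the synthetic loss curve are illustrative choices, not values taken from the experiments.

```python
from typing import Optional, Sequence


def convergence_epoch(losses: Sequence[float], tol: float = 0.01,
                      patience: int = 5) -> Optional[int]:
    """Return the first iteration of a run of `patience` consecutive
    iterations whose relative loss change is below `tol`, or None."""
    run = 0
    for i in range(1, len(losses)):
        rel_change = abs(losses[i] - losses[i - 1]) / max(abs(losses[i - 1]), 1e-12)
        run = run + 1 if rel_change < tol else 0
        if run >= patience:
            return i - patience + 1
    return None


# Synthetic loss curve (not experimental data): exponential decay towards 1.0.
synthetic_losses = [10.0 * 0.5 ** i + 1.0 for i in range(20)]
epoch = convergence_epoch(synthetic_losses)
print(epoch)
```

A stricter tolerance or a longer patience moves the reported point later, so any comparison of convergence speed across models should use the same settings for all of them.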

Based on the preceding results, the LSTM model converges after approximately 240 iterations when using the tanh activation function, compared to about 50 iterations when using ReLU. A comparison of Figure 5 and Figure 9 indicates that the LSTM model achieves higher prediction accuracy on both the training and validation datasets with tanh than with ReLU. Overall, among the four models tested, LSTM is the most suitable for predicting sequential data such as stock prices, outperforming GRU, CNN, and AE-LSTM. Under the tanh activation function, it requires more iterations to converge but delivers the best overall performance.
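The two activation functions compared in this section treat negative inputs very differently, which a few numbers make concrete. The sketch below evaluates both on a handful of hypothetical standardized inputs; the input values are illustrative only, and inside an LSTM the activation acts on internal cell quantities rather than directly on price changes.

```python
import math


def relu(x: float) -> float:
    """Rectified linear unit: passes positive inputs, zeroes out negative ones."""
    return max(0.0, x)


# Hypothetical standardized inputs (illustrative values only).
inputs = [-2.0, -0.5, 0.0, 0.5, 2.0]

tanh_out = [math.tanh(x) for x in inputs]
relu_out = [relu(x) for x in inputs]

for x, t, r in zip(inputs, tanh_out, relu_out):
    print(f"x={x:+.1f}  tanh={t:+.4f}  relu={r:+.4f}")
```

tanh keeps the sign of every input and stays within (−1, 1), whereas ReLU maps all negative inputs to zero and leaves positive inputs unbounded.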

3.3. Comparison of Accuracy of Four Basic Prediction Models

To assess the forecasting capability of the proposed LSTM-RF hybrid model, four indicators were selected—MAPE, HR, HR−, and HR+—forming a multi-dimensional evaluation system. MAPE quantifies the relative deviation between estimated and actual stock prices, reflecting the model’s fitting accuracy for price fluctuations. HR, HR−, and HR+ assess the model’s directional prediction accuracy across three sub-dimensions: overall market trends, downward trends, and upward trends. These indicators are widely utilized in financial forecasting tasks and have been adopted in numerous similar studies, ensuring strong comparability across models.
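As a rough illustration of the four indicators, the sketch below computes MAPE and the three hit rates from lists of prices. It assumes that direction is judged against the previous day's actual close, that HR+ and HR− are the hit rates over days on which the price actually rose or did not rise, and that flat days count as downward. The paper's formal definitions take precedence over these assumptions, and the function names and sample prices are hypothetical.

```python
from typing import Dict, Optional, Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def hit_rates(prev_close: Sequence[float], actual: Sequence[float],
              predicted: Sequence[float]) -> Dict[str, Optional[float]]:
    """Directional hit rates in percent; None when a category has no days."""
    hits = ups = up_hits = downs = down_hits = 0
    for prev, a, p in zip(prev_close, actual, predicted):
        actual_up = a > prev            # assumption: flat days count as "down"
        hit = actual_up == (p > prev)   # predicted direction matches actual direction
        hits += hit
        if actual_up:
            ups += 1
            up_hits += hit
        else:
            downs += 1
            down_hits += hit

    def pct(num: int, den: int) -> Optional[float]:
        return 100.0 * num / den if den else None

    return {"HR": pct(hits, ups + downs), "HR+": pct(up_hits, ups), "HR-": pct(down_hits, downs)}


# Hypothetical five-day example (not data from the experiments).
prev_close = [10.0, 10.2, 10.1, 10.4, 10.3]
actual = [10.2, 10.1, 10.4, 10.3, 10.5]
predicted = [10.1, 10.3, 10.2, 10.5, 10.2]

error = mape(actual, predicted)
rates = hit_rates(prev_close, actual, predicted)
print(round(error, 4), rates)
```

In this toy example the forecasts are close in price terms (MAPE below 2%) yet miss every downward day, which is why the directional indicators are reported alongside MAPE.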

Table 4 presents the results of experimental simulations based on stock trading data for nine Chinese securities market stocks (000333, 002033, 002343, 300059, 300083, 600977, 600837, 601995, and 603881) in 2023. The results indicate that, for the HR+ metric, LSTM achieves the highest average prediction value (50.68%) among the four benchmark models (LSTM, RF, GRU, and CNN). For the HR− metric, RF obtains the highest average prediction value (63.42%). These findings suggest that LSTM performs best in predicting upward price movements, whereas RF performs best in predicting downward movements.

Table 4 further shows that, for the HR+ metric, the LSTM-RF model achieves an average prediction value of 53.74%, exceeding the best-performing benchmark model (LSTM at 50.68%) and substantially outperforming the worst-performing model (GRU at 40.10%). For the HR− metric, LSTM-RF obtains an average value of 55.10%, which is higher than LSTM’s 54.57% but lower than the values for CNN (59.58%), GRU (59.87%), and RF (63.42%). For the overall HR metric, LSTM-RF achieves an average prediction value of 54.63%, surpassing the highest benchmark result (LSTM at 53.30%) and the lowest result (GRU at 50.25%) by approximately 1 and 4 percentage points, respectively.

In terms of prediction error (MAPE), Table 4 shows that LSTM-RF (2.41%) is 0.46 percentage points worse than RF (1.95%) but better than the other benchmark models. Compared with LSTM (3.75%), LSTM-RF reduces MAPE by approximately 1 percentage point. Overall, the findings suggest that the LSTM-RF model improves overall directional accuracy relative to the benchmark models, which may contribute to improved returns in stock investment.
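The average MAPE values can be re-derived directly from the per-stock entries in the MAPE block of Table 4. The short sketch below does this and reports each benchmark's gap to LSTM-RF in percentage points; the numbers are copied from Table 4, and the variable names are illustrative.

```python
# Per-stock MAPE values (%) copied from the MAPE block of Table 4, in the
# column order 000333, 002033, 002343, 300059, 300083, 600977, 600837,
# 601995, 603881.
mape_rows = {
    "LSTM-RF": [1.44, 4.14, 3.10, 2.08, 2.47, 2.13, 1.38, 2.16, 2.79],
    "LSTM": [2.06, 7.97, 4.61, 3.11, 3.18, 2.83, 2.48, 3.68, 3.82],
    "RF": [1.36, 2.27, 2.66, 1.99, 2.40, 1.95, 0.94, 1.52, 2.46],
    "GRU": [2.32, 3.39, 4.58, 2.36, 3.51, 3.40, 1.78, 2.69, 3.25],
    "CNN": [3.22, 7.11, 5.56, 6.38, 3.80, 5.23, 6.22, 6.44, 5.14],
}

# Average over the nine stocks, rounded to two decimals as in the table.
averages = {model: round(sum(vals) / len(vals), 2) for model, vals in mape_rows.items()}

# Gap to LSTM-RF in percentage points (positive: the benchmark has higher error).
gaps = {model: round(avg - averages["LSTM-RF"], 2)
        for model, avg in averages.items() if model != "LSTM-RF"}

print(averages)
print(gaps)
```

A negative gap marks a benchmark whose average MAPE is lower than that of LSTM-RF.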

3.4. Comparison with Other Hybrid Models

Experimental simulations using 2023 stock trading data for nine Chinese securities—000333, 002033, 002343, 300059, 300083, 600977, 600837, 601995, and 603881—were conducted to further evaluate the proposed model’s performance relative to other hybrid models. The models compared were LSTM-RF, CNN-LSTM [32], LSTM-RNN [28], LSTM-GRU [30], and AE + LSTM [29].

The CNN-LSTM model adopts the experimental settings reported in [32]. The input features include opening price, highest price, lowest price, closing price, trading volume, turnover, price change, and percentage change, with all features normalized using the min–max scaling method. The model parameters are configured in accordance with Table 2 of [32].
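Min–max scaling maps each feature onto the range spanned by a reference sample. The sketch below shows one common way to apply it, with the minimum and maximum taken from the training portion only so that later data do not leak into the scaling. Whether [32] or the present experiments fit the scaler this way is not stated; the function names and prices are hypothetical.

```python
from typing import List, Sequence, Tuple


def fit_min_max(train: Sequence[float]) -> Tuple[float, float]:
    """Return the (min, max) of the training values for one feature."""
    return min(train), max(train)


def apply_min_max(values: Sequence[float], lo: float, hi: float) -> List[float]:
    """Scale values with a previously fitted (lo, hi) pair."""
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]  # constant feature: map everything to 0
    return [(v - lo) / span for v in values]


# Hypothetical closing prices; the last value plays the role of unseen data.
closes = [52.0, 55.0, 58.0, 61.0]
train, test = closes[:3], closes[3:]

lo, hi = fit_min_max(train)
scaled_train = apply_min_max(train, lo, hi)
scaled_test = apply_min_max(test, lo, hi)
print(scaled_train, scaled_test)
```

Values outside the training range, such as the last price here, fall outside [0, 1] after scaling, which is expected under this design.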

The LSTM-RNN model follows the configuration described in [28]. As the input features are not explicitly specified in [28], the same feature set as the LSTM-RF model is adopted. Data preprocessing is performed using an ARIMA (1,1,1) model. The network architecture consists of two layers: an LSTM layer followed by an RNN layer. Model parameters are determined via a network search method, consistent with the settings reported in Table 2 of [28].

The LSTM-GRU model utilizes the experimental setup from [30]. The input consists of daily trading data from the previous 30 days, using the same feature set as the LSTM-RF model, with min–max normalization applied. The architecture comprises four layers: a GRU layer, an LSTM layer, a fully connected layer, and an output layer. Model parameters are optimized using a network search approach, consistent with the settings in Table 2 of [30].
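Feeding "the previous 30 days" to a recurrent model amounts to slicing the daily series into overlapping windows, each paired with the next day's target. The sketch below illustrates that slicing under the assumption that the target is the next-day closing price; the function name `make_windows` and the synthetic data are hypothetical.

```python
from typing import List, Sequence, Tuple


def make_windows(rows: Sequence[Sequence[float]], targets: Sequence[float],
                 window: int = 30) -> Tuple[List[List[List[float]]], List[float]]:
    """Pair each block of `window` consecutive daily feature rows with the
    target value of the day that immediately follows the block."""
    inputs, labels = [], []
    for end in range(window, len(rows)):
        inputs.append([list(r) for r in rows[end - window:end]])
        labels.append(targets[end])
    return inputs, labels


# Synthetic data (not market data): 100 days, two features per day.
rows = [[float(day), float(day) * 2.0] for day in range(100)]
closes = [float(day) for day in range(100)]

inputs, labels = make_windows(rows, closes)
print(len(inputs), len(inputs[0]), labels[0])
```

With 100 days and a 30-day window, 70 samples remain, because the first 30 days have no complete history.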

The AE + LSTM model follows the configuration in [29]. The input features consist of logarithmic daily returns over the previous 60 days, calculated according to Equations (17)–(19) in [29]. The training parameters of the autoencoder (AE) and LSTM components are set according to Table 1 and Table 2 of [29], respectively.
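Equations (17)–(19) of [29] are not reproduced here. As a hedged illustration, the sketch below uses the standard definition of a logarithmic daily return, the natural logarithm of the ratio of consecutive closing prices, and keeps the most recent 60 values as the input window; the function names and the synthetic price path are assumptions.

```python
import math
from typing import List, Sequence


def log_returns(prices: Sequence[float]) -> List[float]:
    """Standard log return: ln(P_t / P_{t-1}) for consecutive prices."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


def last_window(returns: Sequence[float], window: int = 60) -> List[float]:
    """Keep the most recent `window` returns as the model input."""
    return list(returns[-window:])


# Synthetic price path (not market data): 1% growth per day for 62 days.
prices = [100.0 * 1.01 ** day for day in range(62)]

returns = log_returns(prices)
features = last_window(returns)
print(len(returns), len(features), round(features[0], 6))
```

Log returns add up over time, so the sum of the daily values equals the log return over the whole period.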

As shown in Table 5, with respect to the HR+ indicator, LSTM-RF achieves an average prediction accuracy of 53.74%, slightly below the highest value of 54.40% obtained by AE + LSTM. However, AE + LSTM performs poorly on individual stocks such as 000333 and 601995, with accuracies of only 43.33% and 43.75%, respectively. For the HR− indicator, LSTM-RF obtains an average value of 55.10%, outperforming the other four hybrid models. In terms of the composite indicator HR, LSTM-RF achieves an average value of 54.63%, exceeding the highest value (52.14% for LSTM-GRU) and the lowest value (49.70% for AE + LSTM) among competing models by approximately 2 and 5 percentage points, respectively.

Regarding prediction error, measured by MAPE, LSTM-RF also achieves the lowest average value (2.41%). The reduction is roughly 1–2 percentage points relative to CNN-LSTM (4.06%) and AE + LSTM (4.27%), 0.69 percentage points relative to LSTM-GRU (3.10%), and only 0.04 percentage points relative to LSTM-RNN (2.45%). As indicated in Table 5, the CNN-LSTM and AE + LSTM models perform significantly worse than the remaining approaches. This may be attributed to their reliance on CNN and AE techniques for feature extraction during data preprocessing, which can introduce noise and negatively affect prediction accuracy.

Overall, these results suggest that the proposed method surpasses existing hybrid models in directional prediction accuracy and offers potential benefits for achieving higher returns in stock investment.

To evaluate the performance differences of the proposed LSTM-RF algorithm, Diebold–Mariano (DM) and McNemar tests were conducted by comparing its predictions with those of eight benchmark models, namely LSTM, RF, GRU, CNN, CNN-LSTM, LSTM-RNN, LSTM-GRU, and AE + LSTM. The results are presented in Table 6. The DM test assesses differences in predicted values, and the corresponding p-values (p-value1) in Table 6 are all well below 0.05, indicating strong statistical significance. The McNemar test evaluates differences in prediction direction, and the reported p-values (p-value2) are also below 0.05, demonstrating statistically significant differences. Overall, these results indicate that the differences between the proposed LSTM-RF algorithm and the compared models are statistically significant in both predicted values and directional performance.
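The paper does not specify which variants of the two tests were used. The sketch below shows textbook forms under stated assumptions: a one-step-ahead DM statistic on squared-error loss with a two-sided normal p-value, and an exact (binomial) McNemar test on the discordant direction counts. All function names and the sample inputs are hypothetical.

```python
import math
from typing import Sequence, Tuple


def two_sided_normal_p(z: float) -> float:
    """Two-sided p-value of a standard normal statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def diebold_mariano(errors_a: Sequence[float],
                    errors_b: Sequence[float]) -> Tuple[float, float]:
    """DM statistic for one-step forecasts with squared-error loss.
    A positive value means model A has the larger average loss."""
    d = [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)
    dm = mean_d / math.sqrt(var_d / n)
    return dm, two_sided_normal_p(dm)


def mcnemar_exact(b: int, c: int) -> float:
    """Exact two-sided McNemar p-value. b: days only model A got the
    direction right; c: days only model B got it right."""
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(math.comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical forecast errors and discordant counts (not experimental data).
dm_stat, dm_p = diebold_mariano([1.0, 1.2, 0.8, 1.1, 0.9, 1.3],
                                [0.5, 0.4, 0.6, 0.5, 0.3, 0.6])
mc_p = mcnemar_exact(10, 2)
p_from_table = two_sided_normal_p(3.1670)  # DM value reported for LSTM in Table 6
print(round(dm_stat, 3), dm_p, round(mc_p, 5), round(p_from_table, 6))
```

Under the normal approximation, the DM statistic of 3.1670 reported for LSTM in Table 6 corresponds to a two-sided p-value of about 0.0015, in line with the tabulated p-value1.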

4. Discussion

This paper proposes a hybrid algorithm (LSTM-RF) that utilizes the predicted probabilities of short-term upward and downward movements to improve stock price forecasting. The approach exploits the strength of the LSTM model in predicting price increases and the advantage of the RF model in predicting price decreases, thereby enhancing overall predictive accuracy. The performance of the proposed method has been evaluated against advanced models. The main findings are summarized as follows.

First, experimental results confirm that, among deep learning models, the tanh activation function outperforms ReLU for stock closing price prediction. In addition, these models tend to converge after approximately 240 training iterations. This phenomenon can be explained by the intrinsic properties of activation functions. The tanh activation function is symmetric and bounded, enabling it to effectively capture both upward and downward movements in stock prices. In contrast, the ReLU activation function is asymmetric, with an unbounded positive range, which may lead to the loss of negative information.

Second, the results indicate that RF achieves higher accuracy in predicting downward trends (HR−), whereas LSTM performs better in predicting upward trends (HR+). This phenomenon can be interpreted through theories such as asymmetric market behavior, volatility clustering, and investor sentiment. During downward phases, negative information often induces asymmetric volatility responses, leading investors to converge toward similar emotional states and resulting in smoother price declines. Under such conditions, the RF model is more effective at capturing relatively linear downward trends, thereby yielding superior predictive performance. In contrast, upward movements are typically characterized by frequent fluctuations and stronger nonlinear dynamics. Consequently, the LSTM model demonstrates better performance in capturing these complex patterns compared to other models.

Third, the proposed method incorporates the statistical probabilities generated by LSTM and RF over the previous 60 trading days to select the optimal prediction result. Empirical results based on stocks from the Shanghai and Shenzhen Stock Exchanges indicate that the proposed algorithm achieves superior performance across HR, HR+, and HR− metrics, improving the overall accuracy (HR) by approximately 2–5%. In terms of prediction error, measured by MAPE, LSTM-RF also outperforms other hybrid models, with improvements of approximately 1–2%.
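The selection step can be sketched in code, with the caveat that the exact rule is the one defined in the methods section, which is not restated here. The version below is one plausible reading and should be treated as an assumption: each model is scored by its trailing 60-day hit rate for the direction it currently predicts (HR+ if it predicts a rise, HR− otherwise), and the forecast of the higher-scoring model is kept. The names, the tie-breaking rule, and the sample histories are hypothetical.

```python
from typing import List, Tuple

Record = Tuple[bool, bool]  # (actual_up, predicted_up) for one past trading day


def directional_rate(history: List[Record], up: bool, window: int = 60) -> float:
    """Hit rate over the last `window` days, restricted to days whose actual
    direction equals `up` (HR+ when up is True, HR- otherwise)."""
    recent = [rec for rec in history[-window:] if rec[0] == up]
    if not recent:
        return 0.5  # assumption: no evidence in the window, treat as a coin flip
    return sum(1 for actual_up, pred_up in recent if actual_up == pred_up) / len(recent)


def select_forecast(prev_close: float, lstm_pred: float, rf_pred: float,
                    lstm_hist: List[Record], rf_hist: List[Record],
                    window: int = 60) -> Tuple[str, float]:
    """Keep the forecast of the model with the better trailing record for the
    direction it predicts today. Assumption: ties go to LSTM."""
    lstm_score = directional_rate(lstm_hist, lstm_pred > prev_close, window)
    rf_score = directional_rate(rf_hist, rf_pred > prev_close, window)
    return ("LSTM", lstm_pred) if lstm_score >= rf_score else ("RF", rf_pred)


def make_history(up_hits: int, up_days: int, down_hits: int, down_days: int) -> List[Record]:
    """Build a synthetic history with the given numbers of correct calls."""
    return ([(True, True)] * up_hits + [(True, False)] * (up_days - up_hits)
            + [(False, False)] * down_hits + [(False, True)] * (down_days - down_hits))


# Synthetic 60-day histories (not experimental data).
lstm_hist = make_history(24, 30, 12, 30)  # HR+ = 0.8, HR- = 0.4
rf_hist = make_history(9, 30, 27, 30)     # HR+ = 0.3, HR- = 0.9

choice_a = select_forecast(10.0, 10.3, 9.8, lstm_hist, rf_hist)   # LSTM says up, RF says down
choice_b = select_forecast(10.0, 10.3, 10.1, lstm_hist, rf_hist)  # both say up
print(choice_a, choice_b)
```

When the two models disagree, this rule favors whichever has the stronger recent record for its own call; when they agree, only the price level of the forecast differs.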

Accurate forecasting of stock closing prices has a direct bearing on investment returns in capital markets and has been widely studied. During upward price movements, stock prices often exhibit prolonged fluctuations, making accurate prediction challenging; in this context, LSTM demonstrates strong performance in directional forecasting. Conversely, during downward movements, prices typically decline more rapidly over shorter periods, where RF shows superior predictive accuracy. In addition, in deep learning–based stock price prediction, the tanh activation function demonstrates superior performance compared to ReLU. These findings provide valuable guidance for the design of related predictive models. Building on these observations, the proposed LSTM-RF algorithm leverages short-term directional probabilities to integrate the complementary strengths of LSTM and RF models. This approach enriches existing machine learning–based stock prediction research and achieves improved predictive performance relative to current hybrid methods.

Despite these contributions, several limitations remain. First, the proposed method focuses primarily on choosing between LSTM and RF based on their respective strengths; future research could aim to enhance the performance of individual models. Second, the study considers only nine stocks from a single market, resulting in a relatively small sample size and limited comparative analysis. Future work should extend the evaluation to a broader range of securities markets and include more comprehensive benchmarking methods. Third, the computational cost of the algorithm is not explicitly discussed. Real-time prediction, combined with the calculation of statistical probabilities over the past 60 trading days, can be time-consuming. To address this issue, the historical statistical probabilities can be precomputed, thereby reducing computational overhead. Finally, incorporating additional features, such as news data and investor sentiment, or integrating the approach with Transformer-based models, may further improve prediction accuracy.
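One inexpensive way to precompute trailing statistics is a prefix sum over the daily hit indicators, after which the hit rate for any 60-day window is a single subtraction. The sketch below illustrates the idea for one model and one indicator; it is an assumption about how the precomputation could be organized, not a description of the implementation used in the experiments, and the function name is hypothetical.

```python
from itertools import accumulate
from typing import List, Optional, Sequence


def rolling_hit_rate(hits: Sequence[bool], window: int = 60) -> List[Optional[float]]:
    """For each day t, the hit rate over the up-to-`window` days strictly
    before t (None when there is no history yet)."""
    prefix = [0] + list(accumulate(int(h) for h in hits))
    rates: List[Optional[float]] = []
    for t in range(len(hits)):
        start = max(0, t - window)
        n = t - start
        rates.append((prefix[t] - prefix[start]) / n if n else None)
    return rates


# Tiny synthetic example with a 2-day window (not experimental data).
example = rolling_hit_rate([True, False, True, True], window=2)
print(example)
```

Because each day's value uses only earlier days, the table can be extended by one entry per trading day without recomputing the past.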

5. Conclusions

This paper proposes a hybrid algorithm (LSTM-RF) that leverages the predicted probabilities of short-term upward and downward movements to improve directional accuracy in stock price forecasting. Experiments were performed using stocks from the Shanghai Stock Exchange and the Shenzhen Stock Exchange as research samples, and the results indicate that deep learning models generally converge after approximately 240 training iterations. The findings also confirm that the tanh activation function outperforms ReLU in stock prediction tasks. Furthermore, the results show that LSTM achieves higher accuracy than other deep learning models in predicting upward price movements, while the RF model demonstrates superior performance in predicting downward movements. These observations illustrate that the proposed LSTM-RF algorithm integrates the strengths of both models by utilizing their respective prediction probabilities. The empirical results indicate that the proposed method improves the accuracy of stock price prediction and makes a meaningful contribution to algorithmic research in investment decision-making and stock forecasting. Future research will focus on optimizing the proposed hybrid algorithm for stock price prediction. In particular, the integration of Transformer-based models will be explored to incorporate additional features that may further enhance predictive accuracy. Moreover, the proposed method will be extended to a broader range of investment domains to improve its applicability and robustness. The ultimate objective is to enhance the efficiency and accuracy of machine learning–based prediction systems.

Author Contributions

Methodology, A.Y.D. and X.Y.; writing—original draft, C.Z.; writing—review and editing, C.Z. and Q.Z.; supervision, C.Z., A.Y.D. and Q.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Key Research and Development Project of Sichuan Provincial Department of Science and Technology of China under Grant No. 2024YFFK0439, the Sichuan Province Philosophy and Social Sciences Foundation Project of China under Grant No. 24SDLYAQYB017, and the Sichuan Tourism University Research Project of China under Grant Nos. JG2024014, KJC-HXKYGZ-202503260083 and ZX063220.

Data Availability Statement

The datasets supporting the findings of this study are openly available at https://drive.google.com/drive/folders/14tgLoQvv-8GpeIRNDf_naXK4b7huGFy9?usp=sharing (accessed on 26 April 2026).

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Shah, D.; Isah, H.; Zulkernine, F. Stock Market Analysis: A Review and Taxonomy of Prediction Techniques. Int. J. Financ. Stud. 2019, 7, 26.
2. Soni, P.; Tewari, Y.; Krishnan, D. Machine Learning Approaches in Stock Price Prediction: A Systematic Review. J. Phys. Conf. Ser. 2022, 2161, 012065.
3. Madhusudan, D.M. Stock Closing Price Prediction Using Machine Learning SVM Model. Int. J. Res. Appl. Sci. Eng. Technol. 2020, 8, 379–383.
4. Emioma, C.C.; Edeki, S.O. Stock Price Prediction Using Machine Learning on Least-Squares Linear Regression Basis. J. Phys. Conf. Ser. 2021, 1734, 012058.
5. Sadorsky, P. A Random Forests Approach to Predicting Clean Energy Stock Prices. J. Risk Financ. Manag. 2021, 14, 48.
6. Omar, A.B.; Huang, S.; Salameh, A.A.; Khurram, H.; Fareed, M. Stock Market Forecasting Using the Random Forest and Deep Neural Network Models Before and During the COVID-19 Period. Front. Environ. Sci. 2022, 10, 917047.
7. Ghosh, P.; Neufeld, A.; Sahoo, J.K. Forecasting Directional Movements of Stock Prices for Intraday Trading Using LSTM and Random Forests. Financ. Res. Lett. 2022, 46, 102280.
8. Reddy, M.C.K.; Praneeth, M.; Reddy, K.P.; Reddy, A.S. Stock Market Trend Prediction Using K-Nearest Neighbor (KNN) Algorithm. Int. J. Innov. Eng. Manag. Res. 2021, 13, 1–8.
9. Ahmad, B.; Zakria, M. Comparative Study of Box-Jenkins ARIMA and KNN Algorithm for Stock Price Prediction in Pakistan. J. Soc. Sci. Humanit. 2022, 30, 102–122.
10. Dash, R.K.; Nguyen, T.N.; Cengiz, K.; Sharma, A. Fine-Tuned Support Vector Regression Model for Stock Predictions. Neural Comput. Appl. 2023, 35, 23295–23309.
11. Liu, G.; Ma, W. A Quantum Artificial Neural Network for Stock Closing Price Prediction. Inf. Sci. 2022, 598, 75–85.
12. Mehtab, S.; Sen, J. Stock Price Prediction Using Convolutional Neural Networks on a Multivariate Time Series. arXiv 2021, arXiv:1912.07700.
13. Ghosh, A.; Bose, S.; Maji, G.; Debnath, N.; Sen, S. Stock Price Prediction Using LSTM on Indian Share Market. In Proceedings of the 32nd International Conference on Computer Applications in Industry and Engineering, San Diego, CA, USA, 30 September–2 October 2019; pp. 101–190.
14. Ding, G.; Qin, L. Study on the Prediction of Stock Price Based on the Associated Network Model of LSTM. Int. J. Mach. Learn. Cybern. 2020, 11, 1307–1317.
15. Jin, Z.; Yang, Y.; Liu, Y. Stock Closing Price Prediction Based on Sentiment Analysis and LSTM. Neural Comput. Appl. 2020, 32, 9713–9729.
16. Ge, Q. Enhancing Stock Market Forecasting: A Hybrid Model for Accurate Prediction of S&P 500 and CSI 300 Future Prices. Expert Syst. Appl. 2025, 260, 125380.
17. Tian, B.; Yan, T.; Yin, H. Forecasting the Volatility of CSI 300 Index with a Hybrid Model of LSTM and Multiple GARCH Models. Comput. Econ. 2025, 66, 1969–1999.
18. Gupta, U.; Bhattacharjee, V.; Bishnu, P.S. StockNet—GRU Based Stock Index Prediction. Expert Syst. Appl. 2022, 207, 117986.
19. Fang, W.; Zhang, S.; Xu, C. Improving Prediction Efficiency of Chinese Stock Index Futures Intraday Price by VIX-Lasso-GRU Model. Expert Syst. Appl. 2024, 238, 121968.
20. Chen, C.; Xue, L.; Xing, W. Research on Improved GRU-Based Stock Price Prediction Method. Appl. Sci. 2023, 13, 8813.
21. Chen, Y.-C.; Huang, W.-C. Constructing a Stock-Price Forecast CNN Model with Gold and Crude Oil Indicators. Appl. Soft Comput. 2021, 112, 107760.
22. Zhu, C.; Yahya Dawod, A.; Xi, Y.; Chen, G. A Portfolio Optimization Model for Return Trend Rate and Risk Trend Rate Based on Machine Learning. IAES Int. J. Artif. Intell. (IJ-AI) 2025, 14, 933.
23. Qi, C.; Ren, J.; Su, J. GRU Neural Network Based on CEEMDAN–Wavelet for Stock Price Prediction. Appl. Sci. 2023, 13, 7104.
24. Kumar, I.; Dogra, K.; Utreja, C.; Yadav, P. A Comparative Study of Supervised Machine Learning Algorithms for Stock Market Trend Prediction. In Proceedings of the 2018 Second International Conference on Inventive Communication and Computational Technologies (ICICCT), Coimbatore, India, 20–21 April 2018; IEEE: New York, NY, USA, 2018; pp. 1003–1007.
25. Kurani, A.; Doshi, P.; Vakharia, A.; Shah, M. A Comprehensive Comparative Study of Artificial Neural Network (ANN) and Support Vector Machines (SVM) on Stock Forecasting. Ann. Data Sci. 2023, 10, 183–208.
26. Livieris, I.E.; Pintelas, E.; Pintelas, P. A CNN–LSTM Model for Gold Price Time-Series Forecasting. Neural Comput. Appl. 2020, 32, 17351–17360.
27. Wu, J.M.-T.; Li, Z.; Herencsar, N.; Vo, B.; Lin, J.C.-W. A Graph-Based CNN-LSTM Stock Price Prediction Algorithm with Leading Indicators. Multimed. Syst. 2023, 29, 1751–1770.
28. Varadharajan, V.; Smith, N.; Kalla, D.; Kumar, G.R.; Samaah, F.; Polimetla, K. Stock Closing Price and Trend Prediction with LSTM-RNN. J. Artif. Intell. Big Data 2024, 4, 1–13.
29. Ma, Y.; Wang, W.; Ma, Q. A Novel Prediction Based Portfolio Optimization Model Using Deep Learning. Comput. Ind. Eng. 2023, 177, 109023.
30. Farhadi, A.; Zamanifar, A.; Alipour, A.; Taheri, A.; Asadolahi, M. A Hybrid LSTM-GRU Model for Stock Price Prediction. IEEE Access 2025, 13, 117594–117618.
31. Tripathy, N.; Parida, S.; Nayak, S.K. Forecasting Stock Market Indices Using Gated Recurrent Unit (GRU) Based Ensemble Models: LSTM-GRU. Int. J. Comput. Commun. Technol. 2023, 9, 85–90.
32. Lu, W.; Li, J.; Li, Y.; Sun, A.; Wang, J. A CNN-LSTM-Based Model to Forecast Stock Prices. Complexity 2020, 2020, 6622927.
33. Sahoo, J. Learning to Trade: Deep Neural Networks for Robust Stock Market Price Forecasting. Int. J. Eng. Inf. Manag. 2026, 2, 83–107.
34. Muhammad, T.; Aftab, A.B.; Ibrahim, M.; Ahsan, M.M.; Muhu, M.M.; Khan, S.I.; Alam, M.S. Transformer-Based Deep Learning Model for Stock Price Prediction: A Case Study on Bangladesh Stock Market. Int. J. Comput. Intell. Appl. 2023, 22, 2350013.
35. Ridhawi, M.A.; Ali, M.H.; Al Osman, H. Stock Market Prediction Using Node Transformer Architecture Integrated with BERT Sentiment Analysis. IEEE Access 2026, 14, 72613–72631.
36. Li, S.; Xu, S. Enhancing Stock Price Prediction Using GANs and Transformer-Based Attention Mechanisms. Empir. Econ. 2025, 68, 373–403.

Figure 1. Schematic representation of LSTM.

Figure 2. An overview diagram of each processing stage of the algorithm.

Figure 3. Flowchart of the LSTM-RF algorithm.

Figure 4. The MAPE curves of the four benchmark models during the iterative process when using the ReLU activation function. (a) The convergence process of MAPE on the training dataset; (b) the convergence process of MAPE on the validation dataset.

Figure 5. The MAPE statistical charts of the four benchmark models when using the ReLU activation function. (a) The statistical graph of MAPE on the training dataset; (b) the statistical graph of MAPE on the validation dataset.

Figure 6. The MSE curves of the four benchmark models during the iterative process when using the ReLU activation function. (a) The convergence process of MSE on the training dataset; (b) the convergence process of MSE on the validation dataset.

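Figure 7. The MAPE curves of the four benchmark models on the training dataset during the iterative process when using the tanh activation function.

Figure 8. The MAPE curves of the four benchmark models on the validation dataset during the iterative process when using the tanh activation function.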

Figure 9. MAPE statistical charts for the four benchmark models when using the tanh activation function. (a) MAPE on the training dataset; (b) MAPE on the validation dataset.

Figure 10. MSE convergence process graphs for the four benchmark models on the training dataset when the activation function is tanh.

Figure 11. MSE validation process graphs for the four benchmark models on the validation dataset when the activation function is tanh.

Table 1. A summary of recent studies on stock market prediction.

Study	Dataset/Market	Feature Selection Strategy	Prediction Model	Evaluation Metrics	Main Contribution	Key Limitations
[5]	Five US clean energy ETFs	Stock daily trading data	RF	MD accuracy, MD Gini	RFs and tree bagging show much better prediction accuracy.	No deep learning is involved.
[21]	The S&P 500 (GSPC)	Closing price, SMA, EMA, ROC, MACD, Fast %K, Slow %D, Upper band, Lower band, %B, oil price, oil volatility index, gold price, gold volatility index	CNN	Accuracy, Precision, Recall, F1 score	Conducted experiments on different categories of input indicators.	The data is rather limited.
[22]	The Shanghai and Shenzhen Stock Exchanges	Stock daily trading data	LSTM	HR, AR, SR, MD	The introduction of the trend rate indicator has improved the prediction accuracy.	The data is rather limited.
[23]	S&P 500 and CSI 300 stock indices	Opening price, closing price, highest price, and lowest price	GRU	MSE, MAE	The wavelet threshold method is used to remove high-frequency noise.	Only four input features are used.
[32]	The Shanghai Composite Index	Opening price, highest price, lowest price, closing price, volume, turnover, price change, and percentage change	CNN-LSTM	MAE, RMSE, R²	Uses CNN to extract the features of the input data.	The data is rather limited.
[28]	Amazon Inc. stock	Stock daily trading data	LSTM-RNN	RMSE, MAE, MAPE	An LSTM-RNN model was constructed.	The data is rather limited.
[30]	Eight stocks of Iranian listed companies	42 indicators	LSTM-GRU	MSE, MAE, MAPE	An LSTM-GRU model was constructed.	The data is rather limited.
[29]	The CSI 100 component stocks	Logarithmic daily returns over the past 60 days	AE + LSTM + OMEGA	HR, HR+, HR−, MSE, MAE	An AE + LSTM + OMEGA model was constructed.	AE feature processing is prone to introducing noise.
Proposed Study	Nine listed stocks on the Shanghai and Shenzhen Stock Exchanges	Opening price, closing price, minimum price, maximum price, trading volume, trading amount, amplitude, increase/decrease amount, increase/decrease percentage, and turnover rate	LSTM-RF	MAPE, HR, HR+, HR−, MSE	Improved directional prediction performance	The data is rather limited.

Table 2. LSTM network hyperparameter settings.

Parameters	Values
Hidden layers	1, 2, 3
Number of nodes in each layer	10, 20, 32, 64
Learning rate	0.001, 0.01, 0.1
Number of iterations	240
Dropout rate	0.1, 0.2, …, 0.5
Loss function	MAPE, MSE
Activation function	tanh
Number of days of LSTM memory	1, 2, 3, …, 10
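To give a sense of the size of this search space, the sketch below enumerates every combination of the candidate values in Table 2. It assumes that the dropout range "0.1, 0.2, …, 0.5" advances in steps of 0.1 and that all combinations are eligible; whether the experiments searched the space exhaustively is not stated, and the dictionary keys are illustrative names.

```python
from itertools import product

# Candidate values transcribed from Table 2 (keys are illustrative names).
search_space = {
    "hidden_layers": [1, 2, 3],
    "nodes_per_layer": [10, 20, 32, 64],
    "learning_rate": [0.001, 0.01, 0.1],
    "iterations": [240],
    "dropout_rate": [0.1, 0.2, 0.3, 0.4, 0.5],  # assumption: step of 0.1
    "loss": ["MAPE", "MSE"],
    "activation": ["tanh"],
    "memory_days": list(range(1, 11)),
}

# Cartesian product of all candidate values, one dict per configuration.
configs = [dict(zip(search_space, combo)) for combo in product(*search_space.values())]

print(len(configs))
print(configs[0])
```

Under these assumptions the grid contains 3 × 4 × 3 × 5 × 2 × 10 = 3600 configurations, which is why an exhaustive search over all of them would be costly at 240 iterations each.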

Table 3. Stocks randomly drawn from each of the Shanghai Composite, Shenzhen Composite, and ChiNext indices.

Corresponding Index	Stock Code	Start Date of Statistics	End Date of Statistics	Total Trading Days	Trading Days in 2023
Shanghai Composite Index	600837	1 January 2018	31 December 2023	1457 days	242 days
	601995	2 November 2020	31 December 2023	771 days	242 days
	603881	1 January 2018	31 December 2023	1457 days	242 days
Shenzhen Composite Index	000333	1 January 2018	31 December 2023	1417 days	242 days
	002033	1 January 2018	31 December 2023	1411 days	242 days
	002343	1 January 2018	31 December 2023	1452 days	242 days
ChiNext Market	300059	1 January 2018	31 December 2023	1457 days	242 days
	300083	1 January 2018	31 December 2023	1437 days	242 days
	600977	1 January 2018	31 December 2023	1457 days	242 days

Table 4. Statistical table of prediction results of 4 benchmark models.

Index	Model	Prediction Statistics (%)									Average (%)
Index	Model	000333	002033	002343	300059	300083	600977	600837	601995	603881	Average (%)
HR	LSTM-RF	53.94	50.21	56.43	56.43	56.02	56.43	52.28	55.19	54.77	54.63
	LSTM	49.79	51.45	49.38	57.26	54.77	52.28	53.94	56.43	54.36	53.30
	RF	56.85	48.96	59.34	52.70	49.38	59.34	47.72	50.21	54.77	53.25
	GRU	51.45	51.04	50.21	48.55	48.13	45.64	50.62	54.77	51.87	50.25
	CNN	57.26	50.62	49.79	52.28	51.87	44.40	49.79	46.06	51.87	50.44
HR−	LSTM-RF	53.33	48.57	57.59	61.83	55.85	56.99	50.00	56.77	54.97	55.10
	LSTM	47.11	20.51	44.19	60.29	77.34	54.35	53.04	68.99	65.32	54.57
	RF	80.17	58.12	82.17	36.76	67.19	58.73	71.30	55.04	61.29	63.42
	GRU	83.47	71.79	48.06	52.21	50.00	51.59	45.22	75.19	61.29	59.87
	CNN	69.85	36.21	50.43	62.02	54.96	80.77	54.20	64.84	62.90	59.58
HR+	LSTM-RF	54.95	51.47	54.22	50.00	56.60	54.55	55.14	52.33	54.44	53.74
	LSTM	52.50	80.65	55.36	53.33	29.20	45.61	54.76	41.96	42.74	50.68
	RF	33.33	40.32	33.04	73.33	29.20	61.54	26.19	44.64	47.86	43.27
	GRU	19.17	31.45	52.68	43.81	46.02	39.13	55.56	31.25	41.88	40.10
	CNN	40.95	64.00	49.19	41.07	48.18	16.79	44.55	24.78	40.17	41.08
MAPE	LSTM-RF	1.44	4.14	3.10	2.08	2.47	2.13	1.38	2.16	2.79	2.41
	LSTM	2.06	7.97	4.61	3.11	3.18	2.83	2.48	3.68	3.82	3.75
	RF	1.36	2.27	2.66	1.99	2.40	1.95	0.94	1.52	2.46	1.95
	GRU	2.32	3.39	4.58	2.36	3.51	3.40	1.78	2.69	3.25	3.03
	CNN	3.22	7.11	5.56	6.38	3.80	5.23	6.22	6.44	5.14	5.46

Table 5. Statistical table of prediction results for hybrid models.

Index	Model	Prediction Statistics (%)									Average (%)
Index	Model	000333	002033	002343	300059	300083	600977	600837	601995	603881	Average (%)
HR	LSTM-RF	53.94	50.21	56.43	56.43	56.02	56.43	52.28	55.19	54.77	54.63
	CNN-LSTM	51.30	35.24	44.46	53.53	55.43	53.35	48.04	54.38	53.86	49.95
	LSTM-RNN	49.52	34.85	45.43	55.58	60.31	52.89	50.43	58.31	56.54	51.54
	LSTM-GRU	55.60	46.42	45.91	52.62	54.76	48.66	48.41	61.54	55.39	52.14
	AE + LSTM	48.13	50.62	44.81	52.70	47.72	43.15	51.04	57.26	51.87	49.70
HR−	LSTM-RF	53.33	48.57	57.59	61.83	55.85	56.99	50.00	56.77	54.97	55.10
	CNN-LSTM	51.13	21.31	41.16	54.10	59.00	60.61	46.17	59.36	56.76	49.95
	LSTM-RNN	49.23	20.87	43.29	57.06	68.18	57.39	49.57	64.69	61.25	52.39
	LSTM-GRU	62.41	43.27	43.65	54.47	60.19	50.29	46.35	69.41	59.59	54.40
	AE + LSTM	52.89	41.03	40.31	49.26	38.28	32.82	40.87	68.99	42.74	45.24
HR+	LSTM-RF	54.95	51.47	54.22	50.00	56.60	54.55	55.14	52.33	54.44	53.74
	CNN-LSTM	52.87	50.38	48.96	54.12	52.87	47.69	51.11	50.59	52.36	51.22
	LSTM-RNN	51.18	50.43	48.97	55.50	53.63	49.80	52.49	53.32	53.42	52.08
	LSTM-GRU	50.59	51.17	49.77	52.78	51.32	48.83	52.25	55.67	52.99	51.71
	AE + LSTM	43.33	59.68	50.00	57.14	58.41	55.45	60.32	43.75	61.54	54.40
MAPE	LSTM-RF	1.44	4.14	3.10	2.08	2.47	2.13	1.38	2.16	2.79	2.41
	CNN-LSTM	2.41	8.22	4.86	3.43	3.43	3.38	2.73	3.93	4.17	4.06
	LSTM-RNN	1.96	2.79	3.06	2.49	3.00	2.35	1.34	2.12	2.96	2.45
	LSTM-GRU	2.44	3.49	4.78	2.36	3.51	3.50	1.78	2.69	3.38	3.10
	AE + LSTM	2.42	7.93	5.44	3.67	4.07	4.10	2.90	3.60	4.28	4.27

Table 6. The p-values of different models on the experimental stocks.

Model	p-Values (Compared with the LSTM-RF Model)
LSTM	DM = 3.1670, p-value1 = 0.001540; p-value2 = 0.0455
RF	DM = 3.2130, p-value1 = 0.001314; p-value2 = 0.0462
GRU	DM = 4.1670, p-value1 = 0.000031; p-value2 = 0.0344
CNN	DM = 4.0481, p-value1 = 0.000052; p-value2 = 0.0381
CNN-LSTM	DM = 4.9481, p-value1 = 0.000001; p-value2 = 0.0189
LSTM-RNN	DM = 3.8820, p-value1 = 0.000104; p-value2 = 0.0422
LSTM-GRU	DM = 3.5432, p-value1 = 0.000395; p-value2 = 0.0428
AE + LSTM	DM = 5.0131, p-value1 = 0.000001; p-value2 = 0.0171


Share and Cite

MDPI and ACS Style

Zhu, C.; Dawod, A.Y.; Yu, X.; Zhou, Q. LSTM-RF Stock Prediction Algorithm via Short-Term Directional Probability-Based Model Selection. Information 2026, 17, 548. https://doi.org/10.3390/info17060548

