Article

A New Stock Price Forecasting Method Using Active Deep Learning Approach

by Khalid Alkhatib 1,*, Huthaifa Khazaleh 1, Hamzah Ali Alkhazaleh 2,*, Anas Ratib Alsoud 3 and Laith Abualigah 4

1 Department of Computer Information Systems, Jordan University of Science and Technology, Irbid 22110, Jordan
2 College of Engineering and IT, University of Dubai, Academic City, Dubai 14143, United Arab Emirates
3 Hourani Center for Applied Scientific Research, Al-Ahliyya Amman University, Amman 19328, Jordan
4 Faculty of Computer Sciences and Informatics, Amman Arab University, Amman 11953, Jordan
* Authors to whom correspondence should be addressed.
J. Open Innov. Technol. Mark. Complex. 2022, 8(2), 96; https://doi.org/10.3390/joitmc8020096
Submission received: 23 April 2022 / Revised: 24 May 2022 / Accepted: 25 May 2022 / Published: 27 May 2022

Abstract:
Stock price prediction is a significant research field due to its importance in terms of benefits for individuals, corporations, and governments. This research explores the application of a new approach to predicting the adjusted closing price of a specific corporation. A new set of features is used to improve accuracy and reduce losses by forming a six-feature set (High, Low, Volume, Open, HiLo, OpSe) rather than the traditional four-feature set (High, Low, Volume, Open). The study also investigates the effect of data size by using datasets of different sizes (Apple, ExxonMobil, Tesla, Snapchat) to boost open innovation dynamics, and it considers the effect of the business sector on the loss results. Finally, the study includes six deep learning models, MLP, GRU, LSTM, Bi-LSTM, CNN, and CNN-LSTM, to predict the adjusted closing price of the stocks. The six variables used (High, Low, Open, Volume, HiLo, and OpSe) are evaluated according to the models' outcomes, showing fewer losses than the original approach, which utilizes the original feature set. The results show that LSTM-based models improved the most under the new approach, even though all models showed comparable results, with no model consistently outperforming the others. Overall, the added features positively affected the prediction models' performance.

1. Introduction

Countries focus on improving and enhancing their economies to create a good standard of living by ensuring public spending. The modern economy establishes large corporations that can create enormous opportunities and keep up with rapid changes in the world economy [1,2]. The stock market is a pool of buyers and sellers of securities, divided into private, open, and mixed-ownership stock exchanges [3]. The private stock exchange involves exchanging shares of private companies, whereas the open stock exchange involves shares of companies listed on the public stock market. Mixed-ownership stock belongs to companies whose shares are only partially exchangeable on the public stock market. Prominent stock exchanges exist in the United Kingdom, such as the London Stock Exchange, and in the United States, such as the New York Stock Exchange (NYSE) [4,5,6,7,8,9].
Stock price forecasting is one of the most challenging problems that financial institutions, businesses, and individual investors face [10]. Many factors impact the validity of stock price forecasts, including economics, political contexts, and investor psychology. According to the literature, because of this complexity, there is much interest in applying computational intelligence methods such as machine learning, probabilistic reasoning, and evolutionary programming to assess large historical datasets of stock prices [11,12]. Because it does not require any statistical hypotheses, the Artificial Neural Network, a non-parametric method, is one of the most popular predictive modeling tools among these computational intelligence approaches [13,14,15,16,17,18].
The stock market is the backbone of any economy; the primary purposes of any investment in the stock market are profit maximization and risk minimization [4]. Countries therefore need to enhance their stock markets, since these are related to economic growth [19]. Investing in the stock market can lead to a quick return on investment, so stock market prediction is one of the best strategies for achieving a profit. Stock market behavior is not linear, which makes predicting a corporation's stock prices in a specific market harder [20]. Consequently, investors and researchers have to find techniques that can lead to accurate results and higher profits [21]. Conventional machine learning models are superior to statistical models such as ARIMA [22]. In turn, deep learning models such as the Long Short-Term Memory (LSTM) have been proven to outperform machine learning models such as Support Vector Regression (SVR) [23]; Kara et al. (2011) similarly showed that an Artificial Neural Network (ANN) outperformed a Support Vector Machine (SVM) [24].
Forex price forecasting is similar to stock price forecasting [25,26]. An attention RNN-ARIMA (ARNN-ARIMA) model was proposed to forecast forex prices. The proposed model was evaluated using three main metrics: Root Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE), and Directional Accuracy (DA). It was compared with multiple models, including RNN, GRU, LSTM, and ARNN, and outperformed all of them on every metric, achieving the lowest RMSE and MAPE (1.65 × 10−3 and 23.2%, respectively) and the highest DA (75.7%), slightly outperforming ARNN, which achieved 73.5% DA [27]. An LSTM with an embedded layer (ELSTM) and an LSTM with an autoencoder (ALSTM) are introduced in [28]. The two models were evaluated on two datasets against multiple baselines using several metrics. In the first experiment, on the first dataset, ALSTM and ELSTM performed well, outperforming models such as the attention multi-layer perceptron (AMLP) and embedded multi-layer perceptron (EMLP) by scoring a lower MSE and higher relative accuracy on the Shanghai A-share composite index; however, ALSTM achieved the worst MSE score on the second dataset, and both models achieved the worst results in terms of comparative accuracy on Sinopec.
Deep learning models have given excellent results in many areas [29,30]. They have shown potential for stock market prediction due to their capability to detect the dynamics of stock market movements and achieve adequate results [31]. This article focuses on six deep learning models and the differences between them, including the LSTM [32,33]. The Gated Recurrent Unit (GRU) [34], also an RNN-based model, has been used in the evaluation process, along with a Multi-Layer Perceptron (MLP) [35], a Convolutional Neural Network (CNN) [36], a CNN-LSTM model, and a Bidirectional LSTM (Bi-LSTM) [37]. The first of the six models is the MLP, a neural network of three sections of neurons: an input layer, a hidden layer, and an output layer, where the model can have multiple hidden layers. Each neuron is connected to all neurons in the previous layer; these connections are called fully connected layers or dense layers. Neurons of the same layer are not connected. The learning process changes the weights of each neuron after processing the data, according to the error in the output compared with the expected result. These models were evaluated on data from four companies: Apple, Tesla, ExxonMobil, and Snapchat. Each dataset covers a different period to detect the effect of data size, and each company has a different business focus. This article proposes a feature extraction technique to increase the number of features the models can utilize in order to give accurate predictions with fewer losses. Finally, as noted by Kim and Kim in [38], the loss functions used in the evaluation process are the Mean Squared Error (MSE) and Mean Absolute Percentage Error (MAPE). The results showed that LSTM-based models improved under the new approach, even though all models showed comparable results, with no model consistently outperforming the others. The CNN model showed the best efficiency in terms of execution time, and GRU and CNN were the best models at giving good results with fewer examples. The main aims of this paper are as follows:
  • Study the effects of the additional features (HiLo and OpSe) added to the standard set (High, Low, Volume, Open).
  • Detect the effect of the size of the datasets on the prediction accuracy.
  • Detect the difference between the deep learning models (i.e., MLP, GRU, LSTM, Bi-LSTM, CNN, and CNN-LSTM).
The main sections of this paper are organized as follows. Section 2 reviews related work. Section 3 presents the proposed methodology for stock price forecasting. Section 4 presents the experiments and results, and Section 5 presents the discussion. Finally, the conclusions and directions for future research are given in Section 6.

2. Related Work

Recently, a great deal of research on forecasting forex and stock market prices has been undertaken [39,40,41,42,43]. Zhang et al. (2019) proposed a Generative Adversarial Network (GAN) architecture with Long Short-Term Memory (LSTM) as the generator and a Multi-Layer Perceptron (MLP) as the discriminator. The GAN model was compared with LSTM, an Artificial Neural Network (ANN), and Support Vector Regression (SVR); multiple metrics were utilized to evaluate the models, and the proposed GAN proved superior to the other models according to all metrics used in that paper [44]. Big data allows for more efficiency and innovative speed. Venture capital, equity funds, and exchange-traded funds are examples of financial innovations that have aided financial development and economic growth [45,46,47,48,49].
Three models, Support Vector Regression (SVR), Linear Regression (LR), and Long Short-Term Memory (LSTM), are introduced in [50]. The LSTM outperformed the other models by far, achieving a score of 0.0151, whereas LR came second with 13.872 and SVR came last with 34.623 [51]. Pratik et al. proposed two models based on graph theory; the first was based on the correlation between historical prices and the other on causation. The results proved that graph-based models are superior to traditional methods, with the causation-based model achieving slightly better results than the correlation-based one. Basic RNN, LSTM, and GRU models are proposed in [52]. The GRU model achieved 0.67 accuracy and 0.629 log loss, followed by LSTM with 0.665 accuracy and 0.629 log loss, and RNN with 0.625 accuracy and a log loss of 0.725. Both LSTM and GRU were then tweaked with the addition of a dropout layer; the GRU model showed no enhancement from the dropout layer, whereas the LSTM showed a slight performance improvement of 2%.
The LSTM model is proposed in [53] to forecast NIFTY 50 stock prices; the LSTM is an RNN architecture also used in Natural Language Processing (NLP). The results showed that performance improves with more parameters and epochs, with the best performance of 0.00859 RMSE achieved using the High, Low, Open, Close parameter set and 500 epochs. Four deep learning models, namely MLP, RNN, CNN, and LSTM, are introduced in [54]; these models were trained on TATA MOTORS data. After training, the models were evaluated by predicting stock prices and achieved satisfactory results, identifying the patterns of stock movements even in other stock markets, which shows that deep learning models can identify the underlying dynamics; CNN proved to be superior. This article also tried the ARIMA model, but it did not learn the underlying dynamics between multiple time series.
A CNN model that uses a high-order structure is proposed in [55]. It was compared with many different models: traditional methods such as ARIMA and Wavelet performed the worst, followed by machine-learning models and the Hidden Markov Model (HMM), which were also inferior to deep learning models such as LSTM and SMF by 1–3% accuracy. These deep learning models were in turn inferior to the high-order-structure CNN. These results were obtained after evaluating multiple datasets, including Apple, Google, IBM, the S&P 500, and others. RNN, CNN, and LSTM deep learning models are introduced in [56], with ARIMA compared against them. The models were trained and evaluated on the Infosys, TCS, and Cipla datasets to investigate whether they would capture the hidden underlying dynamics of the data. The deep learning models showed performance superior to the ARIMA model, with CNN being the best deep learning model, outperforming ARIMA by 1352.1%, LSTM by 177.1%, and RNN by 165.2%.
The performance of various deep learning models was compared for stock price forecasting: deep LSTM, MLP, and ELSTM models in [57], LSTM and GRU in [58], and SVR and NN in [59]. Data from three banks on the NSE of India were gathered to evaluate these models; deep LSTM was proven to have higher accuracy and lower MSE than the other models. A Deep Wide Neural Network (DWNN) is proposed in [60] that combines RNN and CNN models to overcome the limitations of basic RNN models; it was trained on stock data from a sector of China's SSE, and the results showed that the combination of RNN and CNN reduced the prediction error by 30% compared with the vanilla RNN. A hybrid model that combines the Discrete Wavelet Transform (DWT) and Artificial Neural Network (ANN) is proposed in [61]; the DWT decomposes the original data into approximation and detail coefficients used as inputs to the model, and this method enhanced performance compared with the original ANN model on five datasets.
A novel model is proposed in [62] to predict Bitcoin prices, a problem similar to stock price prediction. Three models were compared: the vanilla RNN, LSTM, and ARIMA. The three showed similar accuracy, 52.78%, 50.25%, and 50.05% for LSTM, RNN, and ARIMA, respectively; however, in terms of RMSE, the two deep learning models far outperformed the ARIMA model, with 6.87% and 5.45% for LSTM and RNN, respectively, versus 53.74% for ARIMA. A new deep learning model is proposed using the vanilla CNN, ANN, and a CNN enhanced by a genetic algorithm (GA-CNN) [63]. The results showed that GA-CNN outperforms both the CNN and ANN models in terms of accuracy, achieving 73.74%, beating the vanilla CNN by over +3% and the ANN by +15%. In [64], multiple deep learning models are introduced, including LSTM, CNN, LSTM-CNN, and SVR, with Applied Empirical Mode Decomposition (EMD) and Complete Ensemble EMD (CEEMD) used to improve the LSTM- and CNN-based models. These models were applied to four different datasets, and the results showed CEEMD-LSTM-CNN to be superior to the other models introduced in that paper.
A novel model utilized the Wavelet Transform, stacked autoencoders, and bidirectional long short-term memory [65]. This model, called WAE-BLSTM, has a three-stage workflow: noise elimination, dimensionality reduction, and prediction using BLSTM. To demonstrate its capabilities, the model was compared with four baselines, W-BLSTM, W-LSTM, BLSTM, and LSTM; WAE-BLSTM outperformed all of them according to both the MAE and RMSE metrics. A CNN-BiLSTM-AM model is presented in [66] that combines CNN, BiLSTM, and the attention mechanism: the CNN extracts the features, the BiLSTM makes predictions using these features, and the attention mechanism captures the influence of the extracted features. Compared with BiLSTM-AM, CNN-BiLSTM, CNN-LSTM, BiLSTM, LSTM, CNN, RNN, and MLP, the model proved superior according to the MAE and RMSE metrics.
The Elman neural network, an RNN-based network, is introduced in [67]. The Elman-NN was extended with direct input-to-output connections (DIOCs) to produce Elman-DIOC models, which were evaluated against the plain Elman-NN and MLP on four global stock indices. The Elman-DIOCs outperformed both the Elman-NN and MLP according to the MAE and RMSE metrics, suggesting that DIOCs are usually beneficial additions to neural network models. A graph-based CNN called the Stock Sequence Array Convolutional Neural Network (SSACNN) is introduced in [68]. It gathers data, including historical prices and leading indicators, as an array and feeds it to the CNN model as a graph; ten stock datasets from two markets were fed into the model during evaluation, and SSACNN proved to outperform the CNN, ANN, and SVM models in terms of accuracy.
Different GRU models are presented in [69] to predict Bitcoin prices and are compared with LSTM and an Artificial Neural Network (ANN); these GRU variants include the basic GRU, a GRU-Dropout model, and a GRU-Dropout-GRU model. The results showed that the basic GRU outperformed the other GRU variants, as well as the LSTM and ANN models, by achieving a lower RMSE. An attention-based LSTM that utilizes the Wavelet Transform to remove noise from the data (AWLSTM) is introduced in [70]. This model was compared with WLSTM, LSTM, and GRU models. Three datasets (S&P 500, DJIA, and HSI) and four metrics were used to evaluate the models, and the results proved AWLSTM superior to the other models according to all four metrics [70]. Table 1 shows an overview of the most related works.

3. Methodology

This section presents the main procedures of the methods used, as follows.

3.1. Datasets

This research includes four datasets of four companies with different business sectors: Apple, Tesla, Snapchat, and ExxonMobil.
Apple is a software and hardware provider. Its dataset includes stock price indexes, namely the Open, Volume, High, and Low prices, which are treated as input features, and the Adjusted Closing price, which is the prediction target. This first dataset covers the period from 30 October 2000 to 17 October 2021, roughly 21 years of stock price data, with 5283 instances. The second dataset contains 11 years of stock price data for Tesla, an automobile company, from 29 June 2010 to 27 October 2021, with 2855 instances. Tesla's market capitalization and stock prices have been more volatile than those in the Apple and Snapchat datasets, partly due to tweets by Tesla's chief executive, Elon Musk, which have influenced Tesla's market capitalization and stock prices. The third dataset contains three years and nine months of stock price data for Snapchat, a social media platform and a relatively new company compared with the other three, from 3 February 2017 to 11 November 2021, with 1186 instances. Its relatively small dataset creates a challenge for the models to make predictions and can lead to underfitting. The fourth dataset is the ExxonMobil dataset, which includes pricing data from 3 January 2000 to 7 December 2021, roughly 21 years, with 5520 instances; ExxonMobil is an oil company created from the merger of the Exxon and Mobil oil companies, and its dataset was added to diversify the datasets used. Data were collected from Yahoo Finance as (.csv) files with four input features and one output feature. The Date/Time dimension was removed because it has no direct effect on the prediction process.
The data have been normalized using a min–max scaler:

x* = (x − min)/(max − min)

where x* is the new (scaled) value, x is the old value, and min and max are the minimum and maximum values of the sample, so that x is mapped to [0, 1].
The data are then split into 70% training, 15% testing, and 15% validation data; this split is used to prevent overfitting and to evaluate the models accurately [72].
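As an illustration, the preprocessing described above can be sketched in Python with pandas and scikit-learn (both listed among the libraries used in Section 4). The file name, and the use of a chronological rather than shuffled split, are assumptions not stated in the paper.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Load a Yahoo Finance CSV export (file name is illustrative).
df = pd.read_csv("AAPL.csv").drop(columns=["Date"])  # Date has no effect on prediction

X_raw = df[["High", "Low", "Open", "Volume"]].values  # input features
y_raw = df[["Adj Close"]].values                      # prediction target

# Min-max scaling maps every column to [0, 1].
x_scaler, y_scaler = MinMaxScaler(), MinMaxScaler()
X = x_scaler.fit_transform(X_raw)
y = y_scaler.fit_transform(y_raw)

# 70% training, 15% testing, 15% validation (chronological split assumed).
n = len(X)
i1, i2 = int(n * 0.70), int(n * 0.85)
X_train, y_train = X[:i1], y[:i1]
X_test, y_test = X[i1:i2], y[i1:i2]
X_val, y_val = X[i2:], y[i2:]
```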

3.2. Used Models

The overall flow diagram of the proposed work is presented in Figure 1.
This research introduces six models; the first is the Multi-Layer Perceptron (MLP). The MLP is a neural network of three sections of neurons: an input layer, a hidden layer, and an output layer, and the model can have multiple hidden layers. Each neuron is connected to all neurons in the previous layer; these connections are called fully connected layers or dense layers. Neurons of the same layer are not connected. The learning process changes the weights of each neuron after processing the data, according to the error in the output compared with the expected result. Each neuron has several inputs (xi), each with a weight (wi); the sum of the inputs (xi) multiplied by their weights (wi) is added to the threshold value (b), as shown in the equation below [73].
A = ∑ xiwi + b
Then, this net input A is applied to the activation function F(A) to give the output, as in the equation below.
output = F(A)
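For concreteness, the following minimal NumPy sketch evaluates these two equations for a single neuron; the input values, weights, and the choice of tanh as the activation F are purely illustrative.

```python
import numpy as np

x = np.array([0.5, 0.2, 0.8])   # inputs x_i (illustrative values)
w = np.array([0.4, -0.6, 0.1])  # weights w_i (illustrative values)
b = 0.05                        # threshold value b

A = np.dot(x, w) + b            # A = sum_i x_i * w_i + b
output = np.tanh(A)             # output = F(A), with tanh standing in for F
```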
The second model used in this research is the Long Short-Term Memory (LSTM). The LSTM is an RNN-based model used when long-term dependencies are a significant part of the learning process [73], as remembering dependencies for a long time is a major benefit of the LSTM: it has a forget gate on top of the two main gates, the input and output gates. This forget gate allows the model to learn when to forget [24]; Figure 2 breaks down the working of the LSTM cell [73].
As Figure 2 reveals, Ct-1 and Ct are the old and present cell states, and ht-1 and ht are the outputs of the previous and current cells. ft is the forget gate, while the input gate is denoted it, and Ot is the output sigmoid gate. The line from Ct-1 to Ct carries the information covering the entire network, gathering the information from the gates of the cell and transferring it from Ct-1 to Ct [73]. The ft layer decides what information to remember [37], and its output is multiplied by Ct-1. Then, the product of the sigmoid input gate it and the tanh candidate layer Ĉt is added to the result, and the point-wise multiplication of Ot and tanh(Ct) forms the output ht [73].
Bidirectional LSTM (Bi-LSTM) is a version of the LSTM introduced to increase the amount of information available to the neural network [71,74]. The LSTM can only learn from past information, whereas the Bi-LSTM can learn from both the past and the future at the same time, because it has two hidden layers running in opposite directions connected to the same output [74], as shown in Figure 3.
The Convolutional Neural Network (CNN) is a special Feed-Forward Neural Network (FFNN). CNNs have shown decent performance in many Artificial Intelligence (AI) applications such as Natural Language Processing (NLP) and image and video processing, as well as on time series data [27]. The CNN uses weight sharing and local perception to reduce the number of parameters, and it can be separated into three layer types: convolutional, pooling, and fully connected [75]. The CNN works as follows: the convolutional layer conducts a convolution operation to extract the features, then the pooling layer reduces the number of extracted features, reducing dimensionality to speed up the process and avoid the curse of dimensionality [76].
This research also introduces a CNN-LSTM model that combines the CNN and LSTM models to get the best out of each; however, since this model is slightly deeper than the other models proposed in this research, it needs a higher volume of data. The LSTM part uses the features extracted by the CNN part to predict stock prices, exploiting the LSTM's ability to identify dependencies [77].
The Gated Recurrent Unit (GRU) is an RNN-based model similar to the LSTM, but it merges the forget gate and the input gate into a single gate called the update gate, and it combines the cell state and hidden state. Both the GRU and LSTM solve the vanishing gradient problem of the vanilla RNN, but since the GRU has fewer tensor operations, it trains faster than the LSTM. Figure 4 shows the GRU model representation [78].
Here, xt is the input and ht-1 is the output of the previous unit; both are multiplied by weights, summed, and passed through the sigmoid function. The vanishing gradient problem is solved by the update gate zt, which decides how much information should pass. The reset gate rt performs an operation similar to that of the input gate and decides how much information should be forgotten. For the candidate memory content ĥt, the input is multiplied by the weights W and ht-1 is multiplied element-wise (Hadamard product) by the output of the reset gate rt; the tanh function is then applied to the summation [78]. To get ht, the following operations are applied:
zt = σ(Wz·[ht-1, xt])
rt = σ(Wr·[ht-1, xt])
ĥt = tanh(W·[rt ⊙ ht-1, xt])
ht = (1 − zt) ⊙ ht-1 + zt ⊙ ĥt
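A minimal NumPy sketch of one GRU time step, following the four equations above, is shown below. Bias terms are omitted to match the equations, and the weight shapes are illustrative; a production GRU layer (such as the Keras layer used later) differs in such details.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, Wz, Wr, W):
    # The weight matrices act on the concatenation [h_{t-1}, x_t].
    hx = np.concatenate([h_prev, x_t])
    z_t = sigmoid(Wz @ hx)                                    # update gate
    r_t = sigmoid(Wr @ hx)                                    # reset gate
    h_hat = np.tanh(W @ np.concatenate([r_t * h_prev, x_t]))  # candidate state
    return (1.0 - z_t) * h_prev + z_t * h_hat                 # new hidden state

rng = np.random.default_rng(0)  # illustrative sizes: 4 inputs, hidden size 3
Wz, Wr, W = (rng.normal(size=(3, 7)) for _ in range(3))
h = gru_step(rng.normal(size=4), np.zeros(3), Wz, Wr, W)
```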
All six models used the Exponential Linear Unit (ELU), which outperformed the Rectified Linear Unit (ReLU) in the first experiments; therefore, the ELU is the primary activation function for all of the experiments mentioned here. The ELU is an activation function that can speed up the training process, solve the vanishing gradient problem by improving linear characteristics, and give an identity mapping for positive values [52]. The ELU is considered an alternative to ReLU because of its ability to reduce bias shifts by pushing the mean activation towards zero during training. Networks using the ELU can learn faster and generalize better than with Leaky ReLU (LReLU) or ReLU [52]. The ELU also performs normalization across the network layers without additional normalization steps, with a predetermined parameter scaling the ELU. The following equation represents the ELU function [79].
ELU(x) = α(e^x − 1) if x ≤ 0; ELU(x) = x otherwise
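The ELU can be implemented directly; a short NumPy version is shown below, with α = 1.0 as a common default, since the paper does not state the value of its scaling parameter.

```python
import numpy as np

def elu(x, alpha=1.0):
    # Identity for positive inputs; alpha * (exp(x) - 1) for x <= 0.
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

print(elu(np.array([-2.0, 0.0, 3.0])))  # approx. [-0.8647, 0.0, 3.0]
```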
  • MLP Model
The MLP was the first model that tested both approaches using the four datasets (Apple, Tesla, Snapchat, and ExxonMobil). The MLP model used in this research contains three layers: an input layer (Sequential), a hidden layer (100 Dense neurons), and an output layer (Dense single neuron). It utilized an ELU activation function and Adam optimization function; the model completed one hundred epochs with a batch size of 2. As described in Section 3.1, the data split was 70% training, 15% testing, and 15% validation.
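A Keras sketch consistent with this description follows (Keras is among the libraries listed in Section 4). The layer sizes, epochs, and batch size are taken from the text; everything else, such as the loss and metric wiring, is an assumption.

```python
from tensorflow import keras
from tensorflow.keras import layers

n_features = 6  # 4 for the original feature set, 6 with HiLo and OpSe

mlp = keras.Sequential([
    keras.Input(shape=(n_features,)),
    layers.Dense(100, activation="elu"),  # hidden layer: 100 dense neurons
    layers.Dense(1),                      # output: adjusted closing price
])
mlp.compile(optimizer="adam", loss="mse", metrics=["mape"])
# mlp.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=100, batch_size=2)
```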
  • CNN Model
The CNN was the fifth model that tested both approaches using the four datasets (Apple, Tesla, Snapchat, and ExxonMobil). The CNN model used in this research contains six layers: an input layer (Sequential), four hidden layers, and an output layer (Dense single neuron). The first hidden layer is a Conv1D layer with 64 filters and a kernel size of 2, followed by a MaxPooling1D layer with a pooling size of 2, a Flatten layer, and a Dense layer of 50 neurons. The model utilized an ELU activation function and Adam optimization function; the model completed 100 epochs with a batch size of 4.
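A corresponding Keras sketch is below; the input window length (how many past days form one sample) is not stated in the paper, so window = 5 is an assumption.

```python
from tensorflow import keras
from tensorflow.keras import layers

window, n_features = 5, 6  # window length is an assumption

cnn = keras.Sequential([
    keras.Input(shape=(window, n_features)),
    layers.Conv1D(64, kernel_size=2, activation="elu"),
    layers.MaxPooling1D(pool_size=2),
    layers.Flatten(),
    layers.Dense(50, activation="elu"),
    layers.Dense(1),
])
cnn.compile(optimizer="adam", loss="mse", metrics=["mape"])
# cnn.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=100, batch_size=4)
```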
  • LSTM Model
LSTM was the third model that tested both approaches using the four datasets (Apple, Tesla, Snapchat, and ExxonMobil). The LSTM model used in this research contains three layers: an input layer (Sequential), a hidden layer (32 LSTM neurons), and an output layer (Dense single neuron). The model utilized an ELU activation function and Adam optimization function; the model completed 100 epochs with a batch size of 2.
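A matching Keras sketch follows (window length assumed, as before). The paper states that the ELU is the primary activation for all models; whether it replaces the LSTM's internal tanh is not specified, so applying it here is an assumption.

```python
from tensorflow import keras
from tensorflow.keras import layers

window, n_features = 5, 6  # window length is an assumption

lstm = keras.Sequential([
    keras.Input(shape=(window, n_features)),
    layers.LSTM(32, activation="elu"),  # ELU as the cell activation is an assumption
    layers.Dense(1),
])
lstm.compile(optimizer="adam", loss="mse", metrics=["mape"])
# lstm.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=100, batch_size=2)
```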
  • Bi-LSTM Model
Bi-LSTM was the fourth model that tested both approaches using four datasets (Apple, Tesla, Snapchat, and ExxonMobil). The Bi-LSTM model used in this research contains four layers: an input layer (Sequential), two hidden layers (32 Bi-LSTM neurons and 16 Bi-LSTM neurons), and an output layer (Dense single neuron). The model utilized an ELU activation function and Adam optimization function; the model completed 100 epochs with a batch size of 2.
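A Keras sketch is below; whether the stated 32 and 16 neurons are per direction or totals is not specified, so per-direction sizes are assumed here.

```python
from tensorflow import keras
from tensorflow.keras import layers

window, n_features = 5, 6  # window length is an assumption

bilstm = keras.Sequential([
    keras.Input(shape=(window, n_features)),
    layers.Bidirectional(layers.LSTM(32, activation="elu", return_sequences=True)),
    layers.Bidirectional(layers.LSTM(16, activation="elu")),
    layers.Dense(1),
])
bilstm.compile(optimizer="adam", loss="mse", metrics=["mape"])
# bilstm.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=100, batch_size=2)
```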
  • GRU Model
The GRU was the second model that tested both approaches using four datasets (Apple, Tesla, Snapchat, and ExxonMobil). The GRU model used in this research contains four layers: The input layer (Sequential), two hidden layers (both of which are GRU layers wherein the first contains 50 neurons and the second contains 25 neurons), and an output layer (Dense single neuron). The model utilized an ELU activation function and Adam optimization function; the model completed 70 epochs with a batch size of 2.
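A Keras sketch of this stacked GRU follows, under the same window-length assumption.

```python
from tensorflow import keras
from tensorflow.keras import layers

window, n_features = 5, 6  # window length is an assumption

gru = keras.Sequential([
    keras.Input(shape=(window, n_features)),
    layers.GRU(50, activation="elu", return_sequences=True),  # first GRU layer
    layers.GRU(25, activation="elu"),                         # second GRU layer
    layers.Dense(1),
])
gru.compile(optimizer="adam", loss="mse", metrics=["mape"])
# gru.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=70, batch_size=2)
```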
  • CNN-LSTM Model
CNN-LSTM was the sixth model that tested both approaches using the four datasets (Apple, Tesla, Snapchat, and ExxonMobil). The CNN-LSTM model used in this research contains six layers: an input layer (Sequential), four hidden layers, and an output layer (Dense single neuron). The first hidden layer is a Conv1D layer with 64 filters and a kernel size of 1, followed by a MaxPooling1D layer with a pooling size of 2, a Flatten layer, and an LSTM layer of 50 neurons. The model utilized an ELU activation function and Adam optimization function; the model completed 100 epochs with a batch size of 4.
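A Keras sketch follows. Note that the paper lists a Flatten layer before the LSTM, but a Flatten output is 2-D while an LSTM expects a (time, channels) sequence; this sketch therefore feeds the pooled sequence to the LSTM directly, which is an interpretation rather than a literal transcription.

```python
from tensorflow import keras
from tensorflow.keras import layers

window, n_features = 5, 6  # window length is an assumption

cnn_lstm = keras.Sequential([
    keras.Input(shape=(window, n_features)),
    layers.Conv1D(64, kernel_size=1, activation="elu"),
    layers.MaxPooling1D(pool_size=2),
    # Flatten is omitted here: the LSTM consumes the pooled sequence directly.
    layers.LSTM(50, activation="elu"),
    layers.Dense(1),
])
cnn_lstm.compile(optimizer="adam", loss="mse", metrics=["mape"])
# cnn_lstm.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=100, batch_size=4)
```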

3.3. Feature Engineering

The original approach to stock prediction using deep learning uses four features:
  • High: represents the highest price of the stock on a particular day.
  • Low: represents the lowest price of the stock on a particular day.
  • Open: represents the price at the opening stock exchange on a particular day.
  • Volume: represents the total number of shares or contracts exchanged between buyers and sellers.
These four are the features most commonly used to predict the adjusted closing price, which amends a stock's closing price to reflect the stock's value after accounting for any corporate actions. This research investigates the effect of modifying the original prediction approach, which uses the feature set mentioned above (High, Low, Volume, Open), by creating two additional features, referred to as HiLo (High − Low) and OpSe (Open − Close).
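A pandas sketch of the two engineered features is shown below, assuming they are plain differences, as the names HiLo (High − Low) and OpSe (Open − Close) suggest; the exact formulas are not spelled out in the paper.

```python
import pandas as pd

df = pd.read_csv("AAPL.csv")  # Yahoo Finance export (file name is illustrative)

# Engineered features, assumed to be simple differences.
df["HiLo"] = df["High"] - df["Low"]    # daily trading range
df["OpSe"] = df["Open"] - df["Close"]  # open-to-close move

X = df[["High", "Low", "Open", "Volume", "HiLo", "OpSe"]]  # six-feature set
y = df["Adj Close"]                                        # prediction target
```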

4. Results

In this section, the proposed methods are tested and compared with other methods from the literature. A new set of features is used to improve accuracy and reduce losses by forming a six-feature set (High, Low, Volume, Open, HiLo, OpSe) rather than the traditional four-feature set (High, Low, Volume, Open). The study also investigates the effect of data size by using datasets of different sizes (Apple, ExxonMobil, Tesla, Snapchat) and the effect of the business sector on the loss results; finally, the study includes six deep learning models, MLP, GRU, LSTM, Bi-LSTM, CNN, and CNN-LSTM, to predict the adjusted closing price of the stocks. This study revealed that using six variables (High, Low, Open, Volume, HiLo, and OpSe) improves the models' outcomes, showing fewer losses than the original approach, which utilizes the original feature set. The software used in this paper is Python, and the original main parameters of the tested methods are applied. The research was performed using Google Colab, and several libraries were used: pandas, NumPy, Matplotlib, Sklearn, and Keras.
Table 2 and Table 3 demonstrate that the models showed minor or major improvements depending on the model and how much it could benefit from the new additional features. The results also show that the LSTM model outperformed the other models' training results in both cases.
Table 4 and Table 5 show that when using the four-feature set, the CNN outperformed the other models on the Tesla dataset; however, when using the additional two features, Bi-LSTM outperformed the other models according to the MSE metric due to the massive boost in performance from the two added features. Moreover, LSTM outperforms the other models according to the MAPE metric, which shows that the LSTM-based models improved more than the other models when using the additional features.
As shown in Table 6 and Table 7, when using four features, the CNN model outperformed the other models, but when using the additional two features, LSTM showed a massive boost in performance and outperformed the other models, showing that the LSTM model benefited the most from the two additional features. Table 8 and Table 9 show that Bi-LSTM outperforms the other models on both the MSE and MAPE metrics and on both datasets. The tables also show that almost all models benefited from the two additional features, except CNN-LSTM.
The tables presented previously show that LSTM-based models perform better than other models in most datasets, with the CNN performing better in two cases. In general, most models showed an improvement when using the additional features.
The results of Table 10 and Table 11 show that the CNN-based models (CNN, CNN-LSTM) outperform the other models according to the MSE metric, and the LSTM-based models (LSTM and Bi-LSTM) outperform the other models according to the MAPE metric. Surprisingly, Bi-LSTM showed overfitting when using four input features; this overfitting is reduced when using the additional two features. The results of Table 12 and Table 13 show that Bi-LSTM is severely overfitted, with a high loss compared with its training results according to the MSE metric. The two additional features proved immensely important here, pushing the Bi-LSTM model from the worst model according to the MSE metric and the second-worst according to the MAPE metric to the best model according to both metrics.
The results of Table 14 and Table 15 show that the Bi-LSTM model outperforms the other models according to the MSE metric when using four input features, and GRU outperforms the other models in Table 15 according to the MAPE metric; GRU also outperforms the other models according to both metrics when using the additional two features. The results of Table 16 show that GRU outperforms the other models according to both the MSE and MAPE metrics. The CNN showed a high loss, but as shown in Table 17, this loss was mitigated after using the additional features. In general, all models showed higher losses compared with the other datasets, because oil companies' stock prices are by nature very volatile, which makes the prediction process harder for the models.
Tables 2–17 show that no model continuously outperformed the others; the results also highlight that the new approach did improve the prediction accuracy of the models in most cases, especially for the LSTM and Bi-LSTM models. The most significant losses were related to Tesla due to its high adjusted closing price compared with the other datasets; for example, the adjusted closing price for Tesla on 27 October 2021 was 1037.85, whereas Apple's adjusted closing price on the same day was 148.85. This means that the large difference in loss is due to Tesla's high adjusted closing price. The results also show that the small Snapchat dataset did not create a loss problem, with losses in line with the others, while the models achieved a high loss when predicting prices on the ExxonMobil dataset compared with the other datasets due to the volatile nature of its stock prices. To gain another perspective on the achieved results, visualizations have been used to clearly show the effect of the proposed approach. Figure 5 and Figure 6 show the validation results of both the standard and proposed approaches.
As shown in Figure 5 and Figure 6, the proposed approach decreased the loss on both the MSE and MAPE metrics in most cases. Bi-LSTM showed a considerable decrease in loss on the MSE metric, and both the CNN and GRU showed a good decrease in loss on the MAPE metric. Figure 7 and Figure 8 present the visualization of the results for the Tesla corporation on the MSE and MAPE metrics. It is clear from these figures that the Bi-LSTM obtained the best MAPE value with four features, and the LSTM obtained the best MAPE value with six features.
As shown in Figure 7 and Figure 8, Bi-LSTM showed a large decrease in loss on the MSE metric, and a good decrease in loss on the MAPE metric, with six features (the new approach); the figures also show that in some cases the new approach made no difference or caused a slight increase in loss. Figure 9 and Figure 10 show the visualization results for the Snapchat corporation.
As shown in Figure 9 and Figure 10, the new approach caused a decrease in loss in some cases and a noticeable increase in loss in others; this increase might be due to the small size of the Snapchat dataset. From Figure 9, we can see that the MLP method obtained the best MSE results on the Snapchat dataset with both the four-feature set and the six-feature set. From Figure 10, we can also see that the MLP method obtained the best MAPE results on the Snapchat dataset with the six-feature set, while the CNN-LSTM method obtained the best MAPE results with the four-feature set. Figure 11 and Figure 12 show the visualization results for the ExxonMobil Corporation.
Figure 11 and Figure 12 show that the new approach was usually beneficial in decreasing the loss on both the MSE and MAPE metrics, with the CNN model benefiting the most compared with the other models. From Figure 11, we can also see that on the ExxonMobil dataset, both the CNN and CNN-LSTM methods obtained their best results according to the MAPE metric when using the four-feature set.
Finally, the proposed new approach showed promising results that could help create better and more accurate stock predictions; however, it is not perfect, and in some cases it led to a slight increase in loss. As with any new technique, the proposed approach needs more research. By adding these essential features, we increased the algorithms' effectiveness in each area. Automatic feature selection is an essential collection of techniques for preparing a dataset, and in this work we examined feature selection, the advantages of primary feature classification, and how to use these techniques to their full potential.

5. Discussion

In this section, the relation between deep learning-based stock price forecasting methods and open innovation is presented.
Little research has focused on projecting daily stock market returns, especially using vital machine learning approaches such as deep neural networks (DNNs) [80,81]. The operation of established economic models requires two-factor and three-factor financial analysis to examine the dynamics of a company's profitability [82,83]. The model suggested in [84] is based on the convergence of the deterministic financial analysis methods included in the DuPont model and simulation methods that allow analysis with random components. A good prediction of a stock's future price can provide significant profit, and many methodologies have been used in prior years to project stock trends [44,47].

6. Conclusions

6.1. Implication

Predicting the future has been a dream for most economies and people due to the benefits it may bring. Predicting stock price movements will also benefit those interested in researching stock market prediction. Artificial intelligence presents researchers with forecasts that are more accurate than ever, and it will become more accurate still as technology and algorithms advance over time. The results of this study reveal that the new technique of using six variables (High, Low, Open, Volume, HiLo, and OpSe) improves the models' outcomes, showing fewer losses compared with the original approach, which utilizes the original feature set (High, Low, Open, Volume). The paper also showed that LSTM-based models improved much more under the new approach, even though all models showed comparable results, with no model consistently outperforming the others; thus, overall, feature engineering proved to benefit the models and should be considered an essential step in designing better learning models. In this work, we examined feature selection, the advantages of primary feature classification, and how to use these techniques to their full potential.

6.2. Limitations and Future Research Topics

The limitations of this research include its reliance on basic deep learning models only, as the research did not investigate transformer-based approaches or transfer learning. Moreover, the area of time series analysis lacks a large pre-trained model comparable to BERT in NLP or DALL·E 2 in computer vision; this gap might be covered in future work. Improved machine learning and deep learning methods might be proposed to tackle the current weaknesses and the low performance in some tested cases, and other test cases could be considered to validate the performance of the proposed method. Future work could also build on this study by using a more advanced deep learning approach, or a hybrid model that combines stock price indexes with sentiment analysis of news to provide more features from which the models could benefit.

Author Contributions

Conceptualization, K.A., H.K., H.A.A., A.R.A. and L.A.; methodology, K.A., H.K., H.A.A. and L.A.; software, K.A. and H.K.; validation, K.A. and H.K.; formal analysis, K.A. and H.K.; investigation, K.A. and H.K.; resources, K.A. and H.K.; data curation, K.A. and H.K.; writing—original draft preparation, K.A., A.R.A. and H.K.; writing—review and editing, K.A., H.K., H.A.A., A.R.A. and L.A.; visualization, K.A. and H.K.; supervision, K.A. and H.K.; project administration, K.A. and H.K.; funding acquisition, K.A. and H.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The source code will be available upon request from the corresponding authors.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Al-qaness, M.A.; Ewees, A.A.; Fan, H.; Abualigah, L.; Abd Elaziz, M. Boosted ANFIS model using augmented marine predator algorithm with mutation operators for wind power forecasting. Appl. Energy 2022, 314, 118851. [Google Scholar] [CrossRef]
  2. Mehr, A.D.; Ghiasi, A.R.; Yaseen, Z.M.; Sorman, A.U.; Abualigah, L. A novel intelligent deep learning predictive model for meteorological drought forecasting. J. Ambient Intell. Humaniz. Comput. 2022, 1–15. [Google Scholar] [CrossRef]
  3. Mexmonov, S. Stages of Development of the Stock Market of Uzbekistan. Арxив нayчныx иccлeдoвaний 2020, 24, 6661–6668. [Google Scholar]
  4. Nti, I.K.; Adekoya, A.F.; Weyori, B.A. A systematic review of fundamental and technical analysis of stock market predictions. Artif. Intell. Rev. 2019, 53, 3007–3057. [Google Scholar] [CrossRef]
  5. Sengupta, A.; Sena, V. Impact of open innovation on industries and firms—A dynamic complex systems view. Technol. Forecast. Soc. Chang. 2020, 159, 120199. [Google Scholar] [CrossRef]
  6. Terwiesch, C.; Xu, Y. Innovation Contests, Open Innovation, and Multiagent Problem Solving. Manag. Sci. 2008, 54, 1529–1543. [Google Scholar] [CrossRef] [Green Version]
  7. Blohm, I.; Riedl, C.; Leimeister, J.M.; Krcmar, H. Idea evaluation mechanisms for collective intelligence in open innovation communities: Do traders outperform raters? In Proceedings of the 32nd International Conference on Information Systems, Cavtat, Croatia, 21–24 June 2010.
  8. Del Giudice, M.; Carayannis, E.G.; Palacios-Marqués, D.; Soto-Acosta, P.; Meissner, D. The human dimension of open innovation. Manag. Decis. 2018, 56, 1159–1166. [Google Scholar] [CrossRef]
  9. Daradkeh, M. The Influence of Sentiment Orientation in Open Innovation Communities: Empirical Evidence from a Business Analytics Community. J. Inf. Knowl. Manag. 2021, 20, 2150031. [Google Scholar] [CrossRef]
  10. Kao, L.-J.; Chiu, C.-C.; Lu, C.-J.; Yang, J.-L. Integration of nonlinear independent component analysis and support vector regression for stock price forecasting. Neurocomputing 2013, 99, 534–542. [Google Scholar] [CrossRef]
  11. Ezugwu, A.E.; Ikotun, A.M.; Oyelade, O.O.; Abualigah, L.; Agushaka, J.O.; Eke, C.I.; Akinyelu, A.A. A comprehensive survey of clustering algorithms: State-of-the-art machine learning applications, taxonomy, challenges, and future research prospects. Eng. Appl. Artif. Intell. 2022, 110, 104743. [Google Scholar] [CrossRef]
  12. Wang, Y.; Guo, Y. Forecasting method of stock market volatility in time series data based on mixed model of ARIMA and XGBoost. China Commun. 2020, 17, 205–221. [Google Scholar] [CrossRef]
  13. Mosteanu, N.; Faccia, A. Fintech Frontiers in Quantum Computing, Fractals, and Blockchain Distributed Ledger: Paradigm Shifts and Open Innovation. J. Open Innov. Technol. Mark. Complex. 2021, 7, 19. [Google Scholar] [CrossRef]
  14. Guo, Y.; Zheng, G. How do firms upgrade capabilities for systemic catch-up in the open innovation context? A multiple-case study of three leading home appliance companies in China. Technol. Forecast. Soc. Chang. 2019, 144, 36–48. [Google Scholar] [CrossRef]
  15. Yun, J.J.; Won, D.; Park, K. Entrepreneurial cyclical dynamics of open innovation. J. Evol. Econ. 2018, 28, 1151–1174. [Google Scholar] [CrossRef]
  16. Shabanov, V.; Vasilchenko, M.; Derunova, E.; Potapov, A. Formation of an Export-Oriented Agricultural Economy and Regional Open Innovations. J. Open Innov. Technol. Mark. Complex. 2021, 7, 32. [Google Scholar] [CrossRef]
  17. Hilmola, O.P.; Torkkeli, M.; Viskari, S. Riding the economic long wave: Why are the open innovation index and the performance of leading manufacturing industries intervened? Int. J. Technol. Intell. Plan. 2007, 3, 174. [Google Scholar] [CrossRef]
  18. Du, J.; Leten, B.; Vanhaverbeke, W. Managing open innovation projects with science-based and market-based partners. Res. Policy 2014, 43, 828–840. [Google Scholar] [CrossRef]
  19. Bahadur, S.G.C. Stock market and economic development: A causality test. J. Nepal. Bus. Stud. 2006, 3, 36–44. [Google Scholar]
  20. Bharathi, S.; Geetha, A.; Sathiynarayanan, R. Sentiment Analysis of Twitter and RSS News Feeds and Its Impact on Stock Market Prediction. Int. J. Intell. Eng. Syst. 2017, 10, 68–77. [Google Scholar] [CrossRef]
  21. Sharma, A.; Bhuriya, D.; Singh, U. Survey of stock market prediction using machine learning approach. In Proceedings of the 2017 International Conference of Electronics, Communication and Aerospace Technology (ICECA), Coimbatore, India, 20–22 April 2017; pp. 506–509. [Google Scholar]
  22. Nassar, L.; Okwuchi, I.E.; Saad, M.; Karray, F.; Ponnambalam, K. Deep learning based approach for fresh produce market price prediction. In Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK, 19–24 July 2020; pp. 1–7. [Google Scholar]
  23. Bathla, G. Stock Price prediction using LSTM and SVR. In Proceedings of the 2020 Sixth International Conference on Parallel, Distributed and Grid Computing (PDGC), Waknaghat, India, 6–8 November 2020; pp. 211–214. [Google Scholar] [CrossRef]
  24. Kara, Y.; Boyacioglu, M.A.; Baykan, Ö.K. Predicting direction of stock price index movement using artificial neural networks and support vector machines: The sample of the Istanbul Stock Exchange. Expert Syst. Appl. 2011, 38, 5311–5319. [Google Scholar] [CrossRef]
  25. Abualigah, L.; Diabat, A. Improved multi-core arithmetic optimization algorithm-based ensemble mutation for multidisciplinary applications. J. Intell. Manuf. 2022, 1–42. [Google Scholar] [CrossRef]
  26. Alkhatib, K.; Almahmood, M.; Elayan, O.; Abualigah, L. Regional analytics and forecasting for most affected stock markets: The case of GCC stock markets during COVID-19 pandemic. Int. J. Syst. Assur. Eng. Manag. 2021, 1–11. [Google Scholar] [CrossRef]
  27. Zeng, Z.; Khushi, M. Wavelet denoising and attention-based RNN-ARIMA model to predict forex price. In Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK, 19–24 July 2020; pp. 1–7. [Google Scholar]
  28. Pang, X.; Zhou, Y.; Wang, P.; Lin, W.; Chang, V. An innovative neural network approach for stock market prediction. J. Supercomput. 2018, 76, 2098–2118. [Google Scholar] [CrossRef]
  29. Shahi, T.B.; Sitaula, C.; Neupane, A.; Guo, W. Fruit classification using attention-based MobileNetV2 for industrial applications. PLoS ONE 2022, 17, e0264586. [Google Scholar] [CrossRef] [PubMed]
  30. Sitaula, C.; Shahi, T.B.; Aryal, S.; Marzbanrad, F. Fusion of multi-scale bag of deep visual words features of chest X-ray images to detect COVID-19 infection. Sci. Rep. 2021, 11, 1–12. [Google Scholar] [CrossRef]
  31. Chen, Y.; Wu, J.; Bu, H. Stock market embedding and prediction: A deep learning method. In Proceedings of the 2018 15th International Conference on Service Systems and Service Management (ICSSSM), Hangzhou, China, 21–22 July 2018; pp. 1–6. [Google Scholar]
  32. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
  33. Elman, J.L. Finding structure in time. Cogn. Sci. 1990, 14, 179–211. [Google Scholar] [CrossRef]
  34. Cho, K.; Van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv 2014, arXiv:1406.1078. [Google Scholar]
  35. Rosenblatt, F. The perceptron: A probabilistic model for information storage and organization in the brain. Psychol. Rev. 1958, 65, 386–408. [Google Scholar] [CrossRef] [Green Version]
  36. Atlas, L.; Homma, T.; Marks, R. An artificial neural network for spatio-temporal bipolar patterns: Application to phoneme classification. In Neural Information Processing Systems; American Institute of Physics: Maryland, MD, USA, 1987; pp. 31–40. [Google Scholar]
  37. Schuster, M.; Paliwal, K.K. Bidirectional recurrent neural networks. IEEE Trans. Signal Processing 1997, 45, 2673–2681. [Google Scholar] [CrossRef] [Green Version]
  38. Kim, S.; Kim, H. A new metric of absolute percentage error for intermittent demand forecasts. Int. J. Forecast. 2016, 32, 669–679. [Google Scholar] [CrossRef]
  39. Watanabe, C.; Shin, J.; Heikkinen, J.; Tarasyev, A. Optimal dynamics of functionality development in open innovation. IFAC Proc. Vol. 2009, 42, 173–178. [Google Scholar] [CrossRef] [Green Version]
  40. Jeong, H.; Shin, K.; Kim, E.; Kim, S. Does Open Innovation Enhance a Large Firm’s Financial Sustainability? A Case of the Korean Food Industry. J. Open Innov. Technol. Mark. Complex. 2020, 6, 101. [Google Scholar] [CrossRef]
  41. Le, T.; Hoque, A.; Hassan, K. An Open Innovation Intraday Implied Volatility for Pricing Australian Dollar Options. J. Open Innov. Technol. Mark. Complex. 2021, 7, 23. [Google Scholar] [CrossRef]
  42. Wu, B.; Gong, C. Impact of open innovation communities on enterprise innovation performance: A system dynamics perspective. Sustainability 2019, 11, 4794. [Google Scholar] [CrossRef] [Green Version]
  43. Arias-Pérez, J.; Coronado-Medina, A.; Perdomo-Charry, G. Big data analytics capability as a mediator in the impact of open innovation on firm performance. J. Strategy Manag. 2021, 15, 1–15. [Google Scholar] [CrossRef]
  44. Zhang, K.; Zhong, G.; Dong, J.; Wang, S.; Wang, Y. Stock Market Prediction Based on Generative Adversarial Network. Procedia Comput. Sci. 2019, 147, 400–406. [Google Scholar] [CrossRef]
  45. Chesbrough, H.W. Open Innovation: The New Imperative for Creating and Profiting from Technology; Harvard Business Press: Boston, MA, USA, 2003. [Google Scholar]
46. Moretti, F.; Biancardi, D. Inbound open innovation and firm performance. J. Innov. Knowl. 2020, 5, 1–19.
47. Kiran, G.M. Stock Price Prediction with LSTM-Based Deep Learning Techniques. Int. J. Adv. Sci. Innov. 2021, 2, 18–21.
48. Bhatti, S.H.; Santoro, G.; Sarwar, A.; Pellicelli, A.C. Internal and external antecedents of open innovation adoption in IT organisations: Insights from an emerging market. J. Knowl. Manag. 2021, 25, 1726–1744.
49. Yang, C.H.; Shyu, J.Z. A symbiosis dynamic analysis for collaborative R&D in open innovation. Int. J. Comput. Sci. Eng. 2010, 5, 74.
50. Patil, P.; Wu, C.S.M.; Potika, K.; Orang, M. Stock market prediction using ensemble of graph theory, machine learning and deep learning models. In Proceedings of the 3rd International Conference on Software Engineering and Information Management, Sydney, Australia, 12–15 January 2020; pp. 85–92.
51. Rana, M.; Uddin, M.M.; Hoque, M.M. Effects of activation functions and optimizers on stock price prediction using LSTM recurrent networks. In Proceedings of the 2019 3rd International Conference on Computer Science and Artificial Intelligence, Normal, IL, USA, 6–8 December 2019; pp. 354–358.
52. Di Persio, L.; Honchar, O. Recurrent neural networks approach to the financial forecast of Google assets. Int. J. Math. Comput. Simul. 2017, 11, 7–13.
53. Roondiwala, M.; Patel, H.; Varma, S. Predicting stock prices using LSTM. Int. J. Sci. Res. 2017, 6, 1754–1756.
54. Hiransha, M.; Gopalakrishnan, E.A.; Menon, V.K.; Soman, K.P. NSE stock market prediction using deep-learning models. Procedia Comput. Sci. 2018, 132, 1351–1362.
55. Wen, M.; Li, P.; Zhang, L.; Chen, Y. Stock Market Trend Prediction Using High-Order Information of Time Series. IEEE Access 2019, 7, 28299–28308.
56. Selvin, S.; Vinayakumar, R.; Gopalakrishnan, E.A.; Menon, V.K.; Soman, K.P. Stock price prediction using LSTM, RNN and CNN-sliding window model. In Proceedings of the 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Udupi, India, 13–16 September 2017; pp. 1643–1647.
57. Agrawal, M.; Shukla, P.; Rgpv, B. Deep Long Short Term Memory Model for Stock Price Prediction using Technical Indicators. Available online: https://www.researchgate.net/publication/337800379_Deep_Long_Short_Term_Memory_Model_for_Stock_Price_Prediction_using_Technical_Indicators (accessed on 22 April 2022).
58. Shahi, T.B.; Shrestha, A.; Neupane, A.; Guo, W. Stock Price Forecasting with Deep Learning: A Comparative Study. Mathematics 2020, 8, 1441.
59. Pun, T.B.; Shahi, T.B. Nepal stock exchange prediction using support vector regression and neural networks. In Proceedings of the 2018 Second International Conference on Advances in Electronics, Computers and Communications (ICAECC), Bangalore, India, 9–10 February 2018; pp. 1–6.
60. Zhang, R.; Yuan, Z.; Shao, X. A new combined CNN-RNN model for sector stock price analysis. In Proceedings of the 2018 IEEE 42nd Annual Computer Software and Applications Conference (COMPSAC), Tokyo, Japan, 23–27 July 2018; pp. 546–551.
61. Chandar, S.K.; Sumathi, M.; Sivanandam, S.N. Prediction of Stock Market Price using Hybrid of Wavelet Transform and Artificial Neural Network. Indian J. Sci. Technol. 2016, 9, 1–5.
62. McNally, S.; Roche, J.; Caton, S. Predicting the price of bitcoin using machine learning. In Proceedings of the 2018 26th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), Cambridge, UK, 21–23 March 2018; pp. 339–343.
63. Chung, H.; Shin, K.-S. Genetic algorithm-optimized multi-channel convolutional neural network for stock market prediction. Neural Comput. Appl. 2019, 32, 7897–7914.
64. Rezaei, H.; Faaljou, H.; Mansourfar, G. Stock price prediction using deep learning and frequency decomposition. Expert Syst. Appl. 2021, 169, 114332.
65. Xu, Y.; Chhim, L.; Zheng, B.; Nojima, Y. Stacked deep learning structure with bidirectional long-short term memory for stock market prediction. In International Conference on Neural Computing for Advanced Applications; Springer: Singapore, 2020; pp. 447–460.
66. Lu, W.; Li, J.; Wang, J.; Qin, L. A CNN-BiLSTM-AM method for stock price prediction. Neural Comput. Appl. 2021, 33, 4741–4753.
67. Wang, Y.; Wang, L.; Yang, F.; Di, W.; Chang, Q. Advantages of direct input-to-output connections in neural networks: The Elman network for stock index forecasting. Inf. Sci. 2020, 547, 1066–1079.
68. Wu, J.M.; Li, Z.; Srivastava, G.; Tasi, M.; Lin, J.C. A graph-based convolutional neural network stock price prediction with leading indicators. Softw. Pract. Exp. 2020, 51, 628–644.
69. Dutta, A.; Kumar, S.; Basu, M. A Gated Recurrent Unit Approach to Bitcoin Price Prediction. J. Risk Financial Manag. 2020, 13, 23.
70. Qiu, J.; Wang, B.; Zhou, C. Forecasting stock prices with long-short term memory neural network based on attention mechanism. PLoS ONE 2020, 15, e0227222.
71. Houssein, E.H.; Dirar, M.; Abualigah, L.; Mohamed, W.M. An efficient equilibrium optimizer with support vector regression for stock market prediction. Neural Comput. Appl. 2021, 34, 3165–3200.
72. Al Bashabsheh, E.; Alasal, S.A. ES-JUST at SemEval-2021 Task 7: Detecting and Rating Humor and Offensive Text Using Deep Learning. In Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), Online, 5–6 August 2021; pp. 1102–1107.
73. Jia, H. Investigation into the effectiveness of long short term memory networks for stock price prediction. arXiv 2016, arXiv:1603.07893.
74. Qin, L.; Yu, N.; Zhao, D. Applying the convolutional neural network deep learning technology to behavioural recognition in intelligent video. Tehnički Vjesnik 2018, 25, 528–535.
75. Hao, Y.; Gao, Q. Predicting the Trend of Stock Market Index Using the Hybrid Neural Network Based on Multiple Time Scale Feature Learning. Appl. Sci. 2020, 10, 3961.
76. Kamalov, F. Forecasting significant stock price changes using neural networks. Neural Comput. Appl. 2020, 32, 17655–17667.
77. Livieris, I.E.; Pintelas, E.; Pintelas, P. A CNN–LSTM model for gold price time-series forecasting. Neural Comput. Appl. 2020, 32, 17351–17360.
78. Clevert, D.A.; Unterthiner, T.; Hochreiter, S. Fast and accurate deep network learning by exponential linear units (ELUs). arXiv 2015, arXiv:1511.07289.
79. Cococcioni, M.; Rossi, F.; Ruffaldi, E.; Saponara, S. A novel posit-based fast approximation of ELU activation function for deep neural networks. In Proceedings of the 2020 IEEE International Conference on Smart Computing (SMARTCOMP), Bologna, Italy, 14–17 September 2020; pp. 244–246.
80. Armano, G.; Marchesi, M.; Murru, A. A hybrid genetic-neural architecture for stock indexes forecasting. Inf. Sci. 2005, 170, 3–33.
81. Zhong, X.; Enke, D. Predicting the daily return direction of the stock market using hybrid machine learning algorithms. Financ. Innov. 2019, 5, 1–20.
82. Hu, Z.; Zhao, Y.; Khushi, M. A Survey of Forex and Stock Price Prediction Using Deep Learning. Appl. Syst. Innov. 2021, 4, 9.
83. Dang, H.; Mei, B. Stock Movement Prediction Using Price Factor and Deep Learning. Int. J. Comput. Inf. Eng. 2022, 16, 73–76.
84. Borodin, A.; Mityushina, I.; Streltsova, E.; Kulikov, A.; Yakovenko, I.; Namitulina, A. Mathematical Modeling for Financial Analysis of an Enterprise: Motivating of Not Open Innovation. J. Open Innov. Technol. Mark. Complex. 2021, 7, 79.
Figure 1. The overall flow diagram of the proposed work.
Figure 2. LSTM cell.
Figure 3. Bi-LSTM Neural Network.
Figure 4. GRU model representation.
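Figures 2–4 sketch the recurrent cells used by the models. For reference, the standard LSTM cell update, which Figure 2 conventionally depicts, is reproduced below; this is the textbook formulation and the implemented variant may differ in minor details, while the GRU of Figure 4 is analogous but merges the forget and input gates into a single update gate.

```latex
% Standard LSTM cell equations (textbook formulation, assumed to match
% Figure 2); \sigma is the logistic sigmoid, \odot the element-wise product.
\begin{align*}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) & \text{(forget gate)}\\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) & \text{(input gate)}\\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) & \text{(output gate)}\\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) & \text{(candidate state)}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t & \text{(cell state)}\\
h_t &= o_t \odot \tanh(c_t) & \text{(hidden state)}
\end{align*}
```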
Figure 5. The visualization of the MSE results for Apple.
Figure 6. The visualization of the MAPE results for Apple.
Figure 7. The visualization of the MSE results for Tesla.
Figure 8. The visualization of the MAPE results for Tesla.
Figure 9. The visualization of the MSE results for Snapchat.
Figure 10. The visualization of the MAPE results for Snapchat.
Figure 11. The visualization of the MSE results for ExxonMobil.
Figure 12. The visualization of the MAPE results for ExxonMobil.
Table 1. An overview of the most related works.

Reference | Method | Contribution | Measures | Year
[56] | Deep learning architecture-based Long Short-Term Memory (LSTM) | Stock price prediction using LSTM, RNN, and CNN-sliding window model | ARIMA | 2017
[60] | A Deep Wide Neural Network (DWNN) | A new combined CNN-RNN model for sector stock price analysis | Prediction mean squared error | 2018
[44] | Long Short-Term Memory (LSTM) as a generator and Multi-Layer Perceptron (MLP) | A new hybrid machine learning approach for stock market prediction | MAE, RMAE, MAE, AR | 2019
[68] | A novel convolutional neural network | A graph-based convolutional neural network stock price prediction with leading indicators | Prediction accuracy | 2020
[50] | Support Vector Regression (SVR), Linear Regression (LR), and Long Short-Term Memory (LSTM) | Stock market prediction using an ensemble of graph theory, machine learning and deep learning models | RMAE | 2020
[71] | Support vector regression (SVR) method with equilibrium optimizer (EO) | An efficient equilibrium optimizer with support vector regression for stock market prediction | Mean fitness function (prediction rate) | 2022
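Tables 2–17 and Figures 5–12 report two loss measures, the mean squared error (MSE) and the mean absolute percentage error (MAPE, in percent). As a point of reference, a minimal sketch of how both are computed is given below, assuming NumPy arrays of true and predicted adjusted closing prices; the function and variable names are illustrative and not taken from the authors' code.

```python
import numpy as np

def mse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean squared error: the average of the squared prediction errors."""
    return float(np.mean((y_true - y_pred) ** 2))

def mape(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean absolute percentage error, reported in percent."""
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

# Example: score a dummy forecast against true adjusted closing prices.
y_true = np.array([150.2, 151.0, 149.8, 152.3])
y_pred = np.array([149.9, 151.4, 150.5, 151.8])
print(f"MSE = {mse(y_true, y_pred):.4f}, MAPE = {mape(y_true, y_pred):.3f}%")
```

Because MSE is scale-dependent while MAPE is relative, the two measures need not rank the models identically, which is visible in several of the tables below.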
Table 2. The training results of the models using Apple data with four input features.

Model | MSE (Training) | MAPE (Training)
MLP | 0.0241 | 1.583
GRU | 0.0265 | 5.3065
LSTM | 0.0105 | 1.29
Bi-LSTM | 0.033 | 4.985
CNN | 0.0396 | 1.993
CNN-LSTM | 0.036 | 2.238
Table 3. The models' training results using Apple data with six input features.

Model | MSE (Training) | MAPE (Training)
MLP | 0.0188 | 1.582
GRU | 0.0177 | 5.57
LSTM | 0.0019 | 1.08
Bi-LSTM | 0.01 | 4.894
CNN | 0.0301 | 2.228
CNN-LSTM | 0.034 | 2.557
Table 4. The training results of the models using Tesla data with four input features.

Model | MSE (Training) | MAPE (Training)
MLP | 0.7101 | 1.809
GRU | 0.4642 | 5.254
LSTM | 0.3695 | 1.672
Bi-LSTM | 0.601 | 3.8
CNN | 0.3157 | 1.576
CNN-LSTM | 0.593 | 2.069
Table 5. The models' training results using Tesla data with six input features.

Model | MSE (Training) | MAPE (Training)
MLP | 0.6071 | 1.86
GRU | 0.1269 | 6.325
LSTM | 0.1099 | 1.765
Bi-LSTM | 0.098 | 5.779
CNN | 0.4015 | 1.791
CNN-LSTM | 0.483 | 1.964
Table 6. The models' training results using Snapchat data with four input features.

Model | MSE (Training) | MAPE (Training)
MLP | 0.2348 | 2.184
GRU | 0.145 | 2.62
LSTM | 0.0870 | 1.653
Bi-LSTM | 0.127 | 2.276
CNN | 0.0869 | 1.483
CNN-LSTM | 0.134 | 1.899
Table 7. The models' training results using Snapchat data with six input features.

Model | MSE (Training) | MAPE (Training)
MLP | 0.1989 | 1.765
GRU | 0.024 | 1.444
LSTM | 0.0075 | 1.006
Bi-LSTM | 0.021 | 1.363
CNN | 0.0813 | 1.563
CNN-LSTM | 0.126 | 1.926
Table 8. The training results of the models using ExxonMobil data with four input features.

Model | MSE (Training) | MAPE (Training)
MLP | 0.479 | 1.422
GRU | 0.325 | 1.15
LSTM | 0.4266 | 1.285
Bi-LSTM | 0.319 | 1.131
CNN | 5.357 | 3.583
CNN-LSTM | 3.186 | 2.823
Table 9. The training results of the models using ExxonMobil data with six input features.

Model | MSE (Training) | MAPE (Training)
MLP | 0.43 | 1.377
GRU | 0.304 | 1.012
LSTM | 0.342 | 1.158
Bi-LSTM | 0.266 | 0.99
CNN | 3.188 | 2.985
CNN-LSTM | 3.719 | 2.919
Table 10. The validation results of the models using Apple data with four input features.

Model | MSE (Validation) | MAPE (Validation)
MLP | 9.504 | 3.46
GRU | 19.196 | 4.1615
LSTM | 2.388 | 3.208
Bi-LSTM | 69.915 | 2.875
CNN | 2.281 | 6.744
CNN-LSTM | 3.339 | 8.473
Table 11. The validation results of the models using Apple data with six input features.

Model | MSE (Validation) | MAPE (Validation)
MLP | 15.89 | 3.488
GRU | 15.77 | 2.069
LSTM | 3.325 | 1.983
Bi-LSTM | 5.301 | 2.078
CNN | 2.77 | 3.744
CNN-LSTM | 2.494 | 7.624
Table 12. The validation results of the models using Tesla data with four input features.

Model | MSE (Validation) | MAPE (Validation)
MLP | 39.9 | 2.49
GRU | 19.24 | 2.518
LSTM | 56 | 1.89
Bi-LSTM | 294.95 | 2.037
CNN | 5.475 | 1.46
CNN-LSTM | 3.659 | 1.678
Table 13. The validation results of the models using Tesla data with six input features.

Model | MSE (Validation) | MAPE (Validation)
MLP | 44.28 | 2.854
GRU | 14.12 | 2.105
LSTM | 47.05 | 2.533
Bi-LSTM | 2.272 | 1.279
CNN | 3.889 | 1.626
CNN-LSTM | 11.261 | 1.765
Table 14. The validation results of the models using Snapchat data with four input features.

Model | MSE (Validation) | MAPE (Validation)
MLP | 37.82 | 5.538
GRU | 12.27 | 1.875
LSTM | 16.23 | 2.054
Bi-LSTM | 4.139 | 3.825
CNN | 5.42 | 3.545
CNN-LSTM | 21.809 | 6.757
Table 15. The validation results of the models using Snapchat data with six input features.

Model | MSE (Validation) | MAPE (Validation)
MLP | 45.11 | 7.409
GRU | 0.904 | 0.612
LSTM | 36.6 | 4.118
Bi-LSTM | 0.997 | 4.523
CNN | 7.839 | 3.219
CNN-LSTM | 18.618 | 6.69
Table 16. The validation results of the models using ExxonMobil data with four input features.

Model | MSE (Validation) | MAPE (Validation)
MLP | 19.604 | 6.537
GRU | 19.101 | 5.902
LSTM | 27.244 | 7.434
Bi-LSTM | 34.503 | 6.048
CNN | 127.875 | 15.909
CNN-LSTM | 74.338 | 13.814
Table 17. The validation results of the models using ExxonMobil data with six input features.

Model | MSE (Validation) | MAPE (Validation)
MLP | 18.21 | 5.996
GRU | 19.291 | 5.854
LSTM | 26.56 | 7.672
Bi-LSTM | 19.581 | 6.626
CNN | 72.303 | 13.488
CNN-LSTM | 79.134 | 14.981
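To make the experimental setup concrete, the sketch below assembles the six-feature input and trains a small LSTM regressor in Keras. Two caveats: this excerpt does not spell out how HiLo and OpSe are derived, so their definitions here (high–low range and open-to-close spread) are an illustrative guess rather than the authors' formulas; and the window length, layer sizes, and synthetic data are placeholder choices, not the tuned configuration behind Tables 2–17.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from tensorflow import keras
from tensorflow.keras import layers

# Synthetic OHLCV history standing in for a downloaded stock dataset.
rng = np.random.default_rng(0)
n = 500
close = 100 + np.cumsum(rng.normal(0, 1, n))
df = pd.DataFrame({
    "Open":     close + rng.normal(0, 0.5, n),
    "High":     close + np.abs(rng.normal(0, 1, n)),
    "Low":      close - np.abs(rng.normal(0, 1, n)),
    "Volume":   rng.uniform(1e6, 5e6, n),
    "AdjClose": close,
})

# Assumed definitions of the two added features (illustrative guess,
# not the paper's formulas).
df["HiLo"] = df["High"] - df["Low"]        # daily trading range
df["OpSe"] = df["Open"] - df["AdjClose"]   # open-to-close spread

features = ["High", "Low", "Open", "Volume", "HiLo", "OpSe"]
# Caveat: fitting the scaler on the full series leaks future information;
# a real experiment would fit it on the training split only.
X_scaled = MinMaxScaler().fit_transform(df[features])

# Sliding windows: each sample is `window` days of the six features,
# and the target is the next day's adjusted close.
window = 30
X = np.stack([X_scaled[i:i + window] for i in range(n - window)])
y = df["AdjClose"].to_numpy()[window:]

model = keras.Sequential([
    layers.LSTM(64, input_shape=(window, len(features))),
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse",
              metrics=[keras.metrics.MeanAbsolutePercentageError()])
model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2, verbose=0)
print(model.evaluate(X, y, verbose=0))  # [MSE, MAPE]
```

Swapping layers.LSTM for layers.GRU, layers.Bidirectional(layers.LSTM(64)), or a Conv1D front end yields generic counterparts of the other architectures compared above.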
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
