# US Dollar/Turkish Lira Exchange Rate Forecasting Model Based on Deep Learning Methodologies and Time Series Analysis

## Abstract


## 1. Introduction

## 2. Literature Review

## 3. Proposed System

#### 3.1. Sentiment Analysis

#### 3.2. Word Embedding Models

**Word2Vec:** Word2Vec is accepted as a pioneering word embedding method that started a new trend in natural language processing. It expresses words in a vector space and is a prediction-based, unsupervised model [13]. Thanks to neural networks, the model learns representations of words as dense vectors that encode patterns and many linguistic regularities among words; trained words can thus be displayed as vectors that capture multiple linguistic relations between them. There are two sub-methods, Skip-Gram and continuous bag of words (CBOW). Although both methods are broadly similar, each has its own advantages. The skip-gram model predicts the words surrounding a given word; the CBOW model, conversely, predicts a target word w_t from the surrounding words by maximizing the log probabilities. The training complexity of the CBOW model is expressed as:

Q = N × D + D × log2(V),

where N is the number of context words, D is the dimensionality of the word vectors, and V is the size of the vocabulary.
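A minimal sketch (not the authors' code) of how the two sub-methods frame their training data: CBOW maps a context window to its target word, while skip-gram maps the target word to each of its context words. The `training_pairs` helper and the window size are hypothetical.

```python
# Build CBOW and skip-gram training pairs from a tokenized sentence.
def training_pairs(tokens, window=2):
    cbow, skipgram = [], []
    for t, target in enumerate(tokens):
        context = [tokens[j]
                   for j in range(max(0, t - window),
                                  min(len(tokens), t + window + 1))
                   if j != t]
        cbow.append((context, target))                 # CBOW: context -> target
        skipgram.extend((target, c) for c in context)  # skip-gram: target -> each context word
    return cbow, skipgram

cbow, sg = training_pairs(["the", "lira", "fell", "sharply", "today"])
# cbow[2] pairs the window around "fell" with the target "fell"
```

In a real model, the CBOW pairs feed a network that averages the context vectors to predict the target, and the skip-gram pairs do the reverse.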

**GloVe:** Global Vectors (GloVe) is another popular word embedding algorithm, introduced in [14]. Word2Vec models train on the local surroundings of words and do not take advantage of count-based statistics such as word co-occurrences. For this purpose, the GloVe method consolidates the local context window and count-based matrix factorization techniques to achieve a more effective representation. Matrix factorization allows word-to-word statistical information to be obtained from a corpus. In summary, GloVe aims to model the ratios of the co-occurrence probabilities of words, rather than the raw probabilities themselves, and captures this information through vector differences.

Let X denote the word-word co-occurrence matrix, where X_ij is the number of times word j occurs in the context of word i, and let X_i = Σ_k X_ik be the total number of words appearing in the context of word i. The probability that word j appears in the context of word i is calculated as follows:

P_ij = P(j | i) = X_ij / X_i
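A minimal sketch (not the authors' code) of the count statistics GloVe builds on: `X[i][j]` counts how often word j appears within a symmetric window around word i, and P(j|i) = X_ij / X_i. The helper names and the toy corpus are hypothetical.

```python
from collections import defaultdict

def cooccurrence(tokens, window=2):
    """Word-word co-occurrence counts within a symmetric context window."""
    X = defaultdict(lambda: defaultdict(int))
    for t, wi in enumerate(tokens):
        for j in range(max(0, t - window), min(len(tokens), t + window + 1)):
            if j != t:
                X[wi][tokens[j]] += 1
    return X

def cooccur_prob(X, i, j):
    return X[i][j] / sum(X[i].values())   # P(j|i) = X_ij / X_i

X = cooccurrence(["a", "b", "a", "c"], window=1)
```

GloVe then fits word vectors so that their dot products reproduce the logarithms of these counts, which is the matrix factorization step described above.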

**FastText:** FastText is an artificial neural network library developed for text classification. It converts text or words into continuous vectors and can be applied to any language and to speech-related tasks; spam detection is one of the most common examples. It is faster and more efficient than other text classification structures. Instead of using whole words as inputs, it divides each word into several character-based n-grams, where n specifies the length of the character sequences, allowing the model to capture subword information [15,16]. FastText uses the skip-gram model with negative sampling proposed for Word2Vec, with a modified skip-gram loss function. The FastText objective is expressed as:

−(1/N) Σ_{n=1..N} y_n log(f(B A x_n)),

where x_n denotes the bag of words (features) created for the n-th document, y_n indicates the class label, f is the softmax function, and A and B are the weight matrices.
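A minimal sketch (not the authors' code) of the character n-grams FastText derives from a word. Following the FastText convention, the word is wrapped in `<` and `>` boundary markers before the n-grams are extracted; the function name is hypothetical.

```python
def char_ngrams(word, n=3):
    """Character n-grams of a word, with FastText-style boundary markers."""
    w = f"<{word}>"
    return [w[i:i + n] for i in range(len(w) - n + 1)]

grams = char_ngrams("lira")
# "<lira>" has length 6, yielding four trigrams
```

Because a word's vector is built from these subword units, FastText can produce embeddings even for words never seen during training.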

#### 3.3. Deep Learning Models

**Convolutional Neural Networks:** CNNs are very successful in image processing and, in recent years, have also proven successful in natural language processing (NLP) problems [17,18]. A CNN is a feed-forward neural network comprising convolution, pooling, and fully connected layers. There can be many convolution layers, each applying a convolution filter to the data in order to extract features, which are fed into pooling layers and followed by dense layers. The fundamental task of the filters is to learn the context of the problem throughout the training procedure. In this way, dependencies located in the original data are represented through feature maps, which is called the convolution process. The pooling layer is then used to decrease the number of parameters and calculations in the network, with the purpose of reducing training time, dimensionality, and over-fitting. After that, the final decision is produced by the fully connected layers. The convolution layer uses hyperparameters such as depth, stride, and zero padding to reduce and optimize the complexity of its output.
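The convolution and pooling steps described above can be sketched in one dimension (not the authors' code); the kernel values and input sequence are hypothetical toy numbers standing in for one embedding dimension over token positions.

```python
def conv1d(seq, kernel):
    """Slide a filter over the sequence, producing one feature map."""
    k = len(kernel)
    return [sum(seq[i + j] * kernel[j] for j in range(k))
            for i in range(len(seq) - k + 1)]

def max_pool(seq, size=2):
    """Keep only the strongest response in each window, shrinking the map."""
    return [max(seq[i:i + size]) for i in range(0, len(seq), size)]

feat = conv1d([1, 2, 3, 4, 5], [1, 0, -1])   # filter responds to local trends
pooled = max_pool(feat, 2)                   # fewer values, same signal
```

A real text CNN runs many such filters in parallel over word embedding matrices and feeds the pooled responses to the fully connected layers.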

**Recurrent Neural Networks:** In RNNs, the output of the preceding step feeds the current step as input, allowing the network to remember earlier words. RNNs use a hidden layer to retain information computed in the past. Unlike other neural networks, RNNs apply the same operations, with the same weights, to every input and hidden state to produce the result, which reduces the complexity of the parameter set [19,20]. When long-term dependencies appear in sequence data, however, RNN-based models cannot learn earlier data properly. The reason is the gradient computations performed in the back-propagation process: as a result of repeated matrix multiplications, small weight values decrease exponentially and vanish, while large weight values grow until they reach "NaN". To handle these issues, techniques such as suitable activation functions or gradient clipping can be used. In summary, a simple recurrent network has activation feedback that acts as short-term memory. The network consists of weight matrices and state-transition and output functions that adapt automatically through learning in the hidden and output layers. The hidden (state) layer at time t is fed not only by the input layer at time t, but also by its own activation from time t − 1.
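The recurrence can be sketched for a single scalar unit (not the authors' code): the hidden state h_t is computed from the current input x_t and the previous state h_{t−1}, which is what lets the network remember earlier inputs. The weight values are hypothetical toy numbers.

```python
import math

def rnn_forward(xs, w_x=0.5, w_h=0.8, b=0.0):
    """Forward pass of a one-unit recurrent network over a sequence."""
    h = 0.0                                   # initial hidden state h_0
    states = []
    for x in xs:                              # same weights reused at every step
        h = math.tanh(w_x * x + w_h * h + b)  # h_t depends on x_t and h_{t-1}
        states.append(h)
    return states

states = rnn_forward([1.0, 0.0, 0.0])
# the first input still influences later states through the feedback term
```

With |w_h| < 1, repeated multiplication shrinks the contribution of old inputs toward zero, which is exactly the vanishing-gradient behavior described above.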

**Long Short-Term Memory Networks:** LSTMs were designed to handle the gradient-based problems of RNNs. They are a sub-branch of RNNs that can maintain information in memory for long periods of time. Thus, long-range dependencies among data are stored and the contextual semantics are preserved through the use of LSTMs. The starting point is to provide a solution to the exponential error growth problem that arises when deep neural networks are trained with the back-propagation algorithm. Errors are stored and used by LSTMs in the back-propagation process, and the LSTM can decide what to keep in memory and when to allow it to be read [21,22]. An LSTM network calculates the activations that map the input sequence to the output sequence using the standard gate equations:

i_t = σ(W_i x_t + U_i h_{t−1} + b_i)
f_t = σ(W_f x_t + U_f h_{t−1} + b_f)
o_t = σ(W_o x_t + U_o h_{t−1} + b_o)
c_t = f_t ⊙ c_{t−1} + i_t ⊙ tanh(W_c x_t + U_c h_{t−1} + b_c)
h_t = o_t ⊙ tanh(c_t)

where σ is the logistic sigmoid, ⊙ denotes element-wise multiplication, W and U are weight matrices, and b are bias vectors.
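A minimal sketch (not the authors' code) of the standard LSTM gates for a single scalar cell: the forget gate f, input gate i, and output gate o control what the cell state c discards, stores, and exposes. The scalar weights are hypothetical toy values; real layers use weight matrices.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, w=0.5, u=0.5, b=0.0):
    """One LSTM step for a single scalar cell with shared toy weights."""
    f = sigmoid(w * x + u * h_prev + b)    # forget gate: what to drop from c
    i = sigmoid(w * x + u * h_prev + b)    # input gate: what to write to c
    g = math.tanh(w * x + u * h_prev + b)  # candidate cell update
    o = sigmoid(w * x + u * h_prev + b)    # output gate: what to expose as h
    c = f * c_prev + i * g                 # cell state carries long-term memory
    h = o * math.tanh(c)                   # new hidden state
    return h, c

h, c = lstm_step(1.0, 0.0, 0.0)
```

Because the cell state is updated additively (f·c_prev + i·g) rather than by repeated multiplication, gradients can flow across many steps without vanishing.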

#### 3.4. Time Series Analysis

A time series is a sequence of observations ordered in time, where Y_t indicates the value that Y has taken at time t. Since most machine learning models are not suitable for working with missing values, the time series must be continuous for these models to be used effectively and appropriately. To avoid this problem, the missing values must be filled in with appropriate data, or the rows containing the missing data must be deleted.

For quadratic interpolation, given three distinct data points (x_0, y_0), (x_1, y_1), and (x_2, y_2), there should be a model that satisfies the equation at the given three data points.

where b_t refers to the trend and α and β* signify the smoothing parameters.

where b_t refers to the trend, α and β* denote the smoothing parameters, and m refers to the frequency of the seasonality.

where Y_t is the value at time t, X is the delay (lag) value, and ε denotes the randomly defined error parameters.
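As an illustration of the exponential smoothing family referenced above, a minimal sketch (not the authors' code) of simple exponential smoothing (SES): each smoothed level is a weighted average of the newest observation and the previous level, weighted by α. The function name, sample rates, and α value are hypothetical.

```python
def ses(series, alpha=0.3):
    """Simple exponential smoothing; returns the one-step-ahead forecast."""
    level = series[0]                        # initialize with the first observation
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level   # blend new data with history
    return level

forecast = ses([3.77, 3.79, 3.81, 3.80], alpha=0.5)
```

Holt's method extends this with a second smoothed component for the trend (the b_t above), and Holt-Winters adds a third for seasonality with frequency m.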

#### 3.5. Architecture of the Proposed Model

## 4. Experiment Setup

## 5. Experiment Results

fastText_{EN} outperforms the other models with 85.75% accuracy when the cross-validation results are evaluated. The classification performance of the models, in descending order, is: fastText_{EN}, GloVe_{TR}, fastText_{TR}, Word2Vec_{TR}, GloVe_{EN}, Word2Vec_{EN}.

## 6. Discussion and Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## References

- Okazaki, M.; Matsuo, Y. Semantic Twitter: Analyzing Tweets for Real-Time Event Notification. In Proceedings of the International Conference on Social Software, Cork, Ireland, 3–4 March 2008; pp. 63–74.
- Pang, B.; Lee, L. Opinion Mining and Sentiment Analysis. Found. Trends Inf. Retr. **2008**, 2, 1–135.
- Stenqvist, E.; Lönnö, J. Predicting Bitcoin Price Fluctuation with Twitter Sentiment Analysis. Master's Thesis, KTH Royal Institute of Technology, Stockholm, Sweden, 2017.
- Ozcan, F. Exchange Rate Prediction from Twitter's Trending Topics. In Proceedings of the International Conference on Analytical and Computational Methods in Probability Theory, Moscow, Russia, 23–27 October 2017; pp. 1–46.
- Komariah, K.S.; Sin, B.K. Naïve Bayes Approach for Predicting Foreign Exchange Rate Fluctuation Based on Twitter Sentiment Analysis. In Proceedings of the Spring Conference of Korean Multimedia Society, Andong, Korea, 2–3 August 2015; pp. 78–92.
- Ozturk, S.; Ciftci, K.A. Sentiment Analysis of Twitter Content as a Predictor of Exchange Rate Movements. Rev. Econ. Anal. **2014**, 6, 132–140.
- Yasir, M.; Durrani, M.Y.; Afzal, S.; Maqsood, M.; Aadil, F.; Mehmood, I.; Rho, S. An Intelligent Event-Sentiment-Based Daily Foreign Exchange Rate Forecasting System. Appl. Sci. **2019**, 9, 2980.
- Maria, F.C.; Eva, D. Exchange-Rates Forecasting: Exponential Smoothing Techniques and ARIMA Models. Ann. Econ. **2011**, 1, 499–508.
- Rout, M.; Majhi, B.; Majhi, R.; Panda, G. Forecasting of Currency Exchange Rates Using an Adaptive ARMA Model with Differential Evolution Based Training. J. King Saud Univ. Comput. Inf. Sci. **2014**, 26, 7–18.
- Rojas, C.G.; Herman, M. Foreign Exchange Forecasting via Machine Learning. Bachelor's Thesis, Stanford University, Stanford, CA, USA, 2018.
- Varenius, M. Real Currency Exchange Rate Prediction—A Time Series Analysis. Bachelor's Thesis, Stockholm University, Stockholm, Sweden, 2017.
- Zhang, H.; Li, D. Naïve Bayes Text Classifier. In Proceedings of the IEEE International Conference on Granular Computing, Fremont, CA, USA, 2–4 November 2007; pp. 708–711.
- Santos, I.; Nedjah, N.; Macedo, D.E.; Mourelle, L. Sentiment Analysis Using Convolutional Neural Network with FastText Embeddings. In Proceedings of the IEEE Latin American Conference on Computational Intelligence, Arequipa, Peru, 8–10 November 2017; pp. 1–5.
- Zhang, L.; Wang, S.; Liu, B. Deep Learning for Sentiment Analysis: A Survey. Data Min. Knowl. Discov. **2018**, 8, 1–25.
- Joulin, A.; Grave, E.; Bojanowski, P.; Mikolov, T. Bag of Tricks for Efficient Text Classification. arXiv **2016**, arXiv:1607.01759.
- Mikolov, T.; Grave, E.; Bojanowski, P.; Puhrsch, C.; Joulin, A. Advances in Pre-Training Distributed Word Representations. arXiv **2017**, arXiv:1712.09405.
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep Learning. Nature **2015**, 521, 436–444.
- Voulodimos, A.; Doulamis, N.; Doulamis, A.; Protopapadakis, E. Deep Learning for Computer Vision: A Brief Review. Comput. Intell. Neurosci. **2018**, 1, 1–13.
- Elman, J.L. Finding Structure in Time. Cogn. Sci. **1990**, 14, 179–211.
- Tunalı, V.; Bilgin, T.T. PRETO: A High-Performance Text Mining Tool for Preprocessing Turkish Texts. In Proceedings of the International Conference on Computer Systems and Technologies, Ruse, Bulgaria, 19–20 June 2012; pp. 134–140.
- Greff, K.; Srivastava, R.K.; Koutník, J.; Steunebrink, B.R.; Schmidhuber, J. LSTM: A Search Space Odyssey. IEEE Trans. Neural Netw. Learn. Syst. **2017**, 28, 2222–2232.
- Kent, D.; Salem, F.M. Performance of Three Slim Variants of the Long Short-Term Memory (LSTM) Layer. arXiv **2019**, arXiv:1901.00525.
- Vandebogert, K. Method of Quadratic Interpolation. Ph.D. Thesis, University of South Carolina, Columbia, SC, USA, 2017.
- Hyndman, R.J.; Athanasopoulos, G. Forecasting: Principles and Practice, 2nd ed.; OTexts: Melbourne, Australia, 2018.
- Holt, C. Forecasting Seasonals and Trends by Exponentially Weighted Moving Averages. Int. J. Forecast. **2004**, 20, 5–10.
- Winters, P.R. Forecasting Sales by Exponentially Weighted Moving Averages. Manag. Sci. **1960**, 6, 324–342.
- Box, G.E.; Jenkins, G.M.; Reinsel, G.C.; Ljung, G.M. Time Series Analysis: Forecasting and Control, 5th ed.; Wiley: Hoboken, NJ, USA, 2016.
- Zhang, G.P. Time Series Forecasting Using a Hybrid ARIMA and Neural Network Model. Neurocomputing **2003**, 50, 159–175.
- Central Bank of the Republic of Turkey. Available online: https://www.tcmb.gov.tr/kurlar/kurlar_tr.html (accessed on 10 September 2019).

**Figure 2.** Receiver operating characteristic (ROC) curve of the long short-term memory (LSTM) + GloVe combination for the Turkish dataset.

| | Turkish | English |
|---|---|---|
| Positive | 75,947 | 4585 |
| Negative | 15,244 | 6968 |
| Other | 6 | 100 |
| Total | 91,197 | 11,653 |
| Avg. of terms per doc. | 1740 | 1425 |
| Avg. term length | 632 | 519 |

| Dataset | Content | Label |
|---|---|---|
| Twitter_{EN} | Unfortunately your imprudent messages caused the dollar/tl hit almost 5! | negative |
| Twitter_{TR} | dolar tl güne hızlı yükselişle başladı gram altın liranın üzerinde seyrediyor ("dollar/TL started the day with a rapid rise; gram gold is trading above the lira mark") | positive |
| Exchange Rate Data | (date) 23022018, (opening price) 3.7789, (closing price) 3.7941 | positive |

| Model | Accuracy | F-Measure | Precision | Sensitivity | CV | MCC |
|---|---|---|---|---|---|---|
| Word2Vec_{TR} | 84.59 | 91.04 | 88.24 | 94.03 | 83.40 | 0.35 |
| Word2Vec_{EN} | 73.65 | 68.30 | 64.44 | 72.65 | 73.09 | 0.55 |
| GloVe_{TR} | 85.10 | 91.35 | 88.32 | 94.59 | 85.24 | 0.41 |
| GloVe_{EN} | 78.36 | 73.71 | 70.17 | 77.63 | 79.59 | 0.58 |
| fastText_{TR} | 83.69 | 90.15 | 90.82 | 89.49 | 84.34 | 0.67 |
| fastText_{EN} | 85.29 | 80.98 | 81.81 | 80.18 | 85.75 | 0.63 |

**Table 4.** Experimental results obtained by combining deep learning methods and word embedding models in the Turkish dataset.

| Model | Accuracy | F-Measure | Precision | Sensitivity | CV | MCC |
|---|---|---|---|---|---|---|
| RNN_{GloVe} | 85.31 | 91.42 | 89.12 | 93.85 | 85.33 | 0.41 |
| RNN_{Word2Vec} | 84.45 | 91.21 | 86.27 | 96.75 | 84.44 | 0.32 |
| RNN_{fastText} | 84.05 | 91.23 | 84.27 | 99.44 | 83.84 | 0.35 |
| CNN_{GloVe} | 84.88 | 91.17 | 88.91 | 93.54 | 85.00 | 0.38 |
| CNN_{Word2Vec} | 84.60 | 91.23 | 86.87 | 96.06 | 83.94 | 0.31 |
| CNN_{fastText} | 84.54 | 91.45 | 84.89 | 99.11 | 84.48 | 0.43 |
| LSTM_{GloVe} | 86.03 | 91.85 | 89.45 | 94.39 | 85.97 | 0.41 |
| LSTM_{Word2Vec} | 83.80 | 90.49 | 88.63 | 92.43 | 83.43 | 0.36 |
| LSTM_{fastText} | 83.75 | 90.61 | 87.50 | 93.94 | 83.42 | 0.65 |

**Table 5.** Experimental results obtained by combining deep learning methods and word embedding models in the English dataset.

| Model | Accuracy | F-Measure | Precision | Sensitivity | CV | MCC |
|---|---|---|---|---|---|---|
| RNN_{GloVe} | 76.68 | 64.04 | 80.54 | 53.16 | 77.65 | 0.54 |
| RNN_{Word2Vec} | 75.34 | 68.65 | 68.20 | 69.10 | 75.50 | 0.46 |
| RNN_{fastText} | 68.71 | 48.47 | 68.00 | 37.65 | 67.69 | 0.41 |
| CNN_{GloVe} | 77.85 | 70.08 | 73.87 | 66.67 | 76.01 | 0.52 |
| CNN_{Word2Vec} | 70.99 | 56.46 | 68.18 | 48.17 | 71.47 | 0.43 |
| CNN_{fastText} | 77.91 | 72.10 | 71.24 | 72.98 | 76.50 | 0.66 |
| LSTM_{GloVe} | 79.01 | 72.83 | 73.70 | 71.98 | 78.46 | 0.55 |
| LSTM_{Word2Vec} | 74.34 | 58.73 | 79.03 | 46.73 | 76.11 | 0.44 |
| LSTM_{fastText} | 76.11 | 67.98 | 71.38 | 64.89 | 75.00 | 0.72 |

| Evaluation Metric | SES | HLT | HWC | HWT | ARIMA |
|---|---|---|---|---|---|
| MAPE | 2.6432 | 2.4576 | 2.6325 | 2.6273 | 2.7144 |
| MAE | 0.0442 | 0.0412 | 0.0440 | 0.0439 | 0.0472 |
| MSE | 0.0024 | 0.0026 | 0.0024 | 0.0023 | 0.0020 |
| Accuracy | 95.58 | 95.87 | 95.59 | 95.60 | 92.99 |

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Yasar, H.; Kilimci, Z.H.
US Dollar/Turkish Lira Exchange Rate Forecasting Model Based on Deep Learning Methodologies and Time Series Analysis. *Symmetry* **2020**, *12*, 1553.
https://doi.org/10.3390/sym12091553
