Proceeding Paper

Leveraging Large Language Models and Data Augmentation in Cognitive Computing to Enhance Stock Price Predictions †

Nassera Habbat, Hicham Nouri and Zahra Berradi
1 AIRA Laboratory, Faculty of Science and Technology of Settat, Hassan First University, Settat 26000, Morocco
2 Faculty of Legal, Economic and Social Sciences Aïn Sebaa, Hassan II University of Casablanca, Casablanca 20580, Morocco
3 ENSA of Tetouan, Abdelmalek Essaadi University, Tetouan 10430, Morocco
* Author to whom correspondence should be addressed.
† This paper is an extended version of our paper presented at the 7th International Conference on Advanced Technologies for Humanity (ICATH 2025), Kenitra, Morocco, 9–11 July 2025.
Eng. Proc. 2025, 112(1), 40; https://doi.org/10.3390/engproc2025112040
Published: 17 October 2025

Abstract

Precise stock price forecasting is essential for informed decision-making in financial markets. This study examines the combination of large language models (LLMs) with data augmentation approaches, drawing on advances in cognitive computing to enhance stock price prediction. Traditional methods rely on structured data and basic time-series analysis, but recent research shows that deep learning and transformer-based architectures can effectively process unstructured financial data such as news articles and social media sentiment. We employ RNN, mBERT, RoBERTa, and GPT-4-based architectures to illustrate the efficacy of the proposed method in forecasting stock movements, and we apply data augmentation techniques, including synthetic data generation with Generative Pre-trained Transformers, to rectify imbalances in the training datasets. Model performance is verified with accuracy, F1-score, recall, and precision, and we also investigate the influence of preprocessing methods such as text normalization and feature engineering. Extensive experiments show that transformer models predict stock price movements far better than traditional methods: the GPT-4-based model achieved an F1-score of 0.92 and an accuracy of 0.919, underscoring the strong potential of LLMs in financial applications.

1. Introduction

Owing to its importance for investors, policymakers, and economists, stock price forecasting has become a major topic of financial research. Despite its rising importance, predicting market behavior remains difficult and time-consuming: the inherent volatility and stochasticity of financial markets make forecasting models unreliable [1]. Machine learning has been applied to these problems, but market volatility still limits prediction accuracy [2]. Market predictability has long been framed by traditional financial theories such as the Efficient Market Hypothesis (EMH) and Random Walk Theory. According to the EMH, stock prices instantly reflect all publicly available information, making past data useless for prediction [3]. Random Walk Theory holds that price variations follow a random path and therefore cannot be foreseen from historical patterns. These theories stress that exogenous variables such as financial news and macroeconomic developments typically influence stock price fluctuations more than historical trends do, especially during market turmoil. Nevertheless, empirical studies show that, under some conditions, patterns in past stock data can produce useful projections [4]. When applied to well-structured information, time-series models such as the Autoregressive Integrated Moving Average (ARIMA) can improve short-term forecasting, and ARIMA-based models have shown significant prediction accuracy in case studies of the Indian stock market [5]. However, when the data is tainted with noise or unstructured material, such as financial news or social media sentiment, their performance drops considerably [6].
Big data and artificial intelligence have revived interest in text mining and deep learning for stock market prediction. Deep neural networks excel at extracting complex features from heterogeneous data sources, improving forecasting accuracy. Although deep learning architectures have improved [7], many cannot adapt to complex and turbulent financial markets, and research continues on adapting model designs to rapid market swings and exogenous shocks. Behavioral finance highlights the significant impact of investors' emotions and psychological patterns on financial decision-making [8]. Deng et al. [9] recently investigated the potential of using sentiment from retail, institutional, and foreign investors to predict changes in the Shanghai Stock Exchange Index. The growing influence of social media on investor sentiment has made such data a valuable component of predictive stock market models [10,11]. The proliferation of online platforms has established textual content as a significant medium for expressing investor emotions, which often influence market dynamics [12]. Extracting meaningful insights from vast, unstructured textual data nonetheless remains challenging. In response, analytical techniques including sentiment analysis (SA) [4], natural language processing (NLP) [13], and opinion mining have become increasingly important for interpreting investor sentiment and improving forecasting accuracy [14].
Despite extensive research in this area, many existing models still exhibit large prediction errors and growing complexity, prompting calls for more accurate, interpretable, and practical solutions for stakeholders. This study uses cognitive computing methods, namely large language models (LLMs) [15,16] and data augmentation [17,18], to improve the accuracy and interpretability of stock price prediction models.
Traditional modeling methods struggle to capture subtle sentiment and contextual cues in financial text from news, analyst reports, and social media because of its volume and complexity. GPT-4 and BERT-based LLMs, with their enhanced language understanding, can derive predictive insights from unstructured financial narratives. Data augmentation strategies synthetically extend training datasets, reduce class imbalance, and improve model generalization, especially in volatile or low-signal situations. This dual strategy addresses the limitations of sentiment-driven forecasting in a cost-effective, scalable, and efficient manner. Paper structure: Section 2 reviews research on cognitive computing, LLMs, and the foundations of financial prediction. Section 3 describes the data sources (stock market records, financial news, and social media), the LLM architectures, and the augmentation methods. Section 4 presents empirical observations comparing our models to traditional baselines. Section 5 discusses the study's main findings and directions for future research, such as multimodal data and real-time learning frameworks.

2. Related Works

Zhao et al. [19] introduced SA-DLSTM in 2023, a hybrid framework designed to improve stock market forecasting and simulation trading. The design combines LSTM, DAE, and emotion-enhanced convolutional neural network (ECNN) components: the initial stock market data were supplemented with online user-generated content, sentiment characteristics were extracted with the ECNN, and the DAE captured the market data's key properties. The methodology refined sentiment indices by linking temporal sentiment variations with market dynamics; these indices, together with key market indicators, were fed into the LSTM for prediction. SA-DLSTM outperformed traditional models in prediction accuracy, return on investment, and risk reduction, improving investor decision-making. The same year, Mu et al. [10] developed the MS-SSA-LSTM model, which combines multi-source fundamental data, deep learning, sentiment analysis, and swarm intelligence. They calculated sentiment indices using a modified sentiment lexicon applied to East Money forum postings, used the Sparrow Search Algorithm (SSA) to optimize the LSTM hyperparameters [1], and then predicted stock prices from sentiment and trading fundamentals with the optimized LSTM. MS-SSA-LSTM showed higher performance and generalizability, with a median R² increase of 10.74% over typical LSTM models. In 2022, Swathi et al. combined Twitter-based sentiment analysis with a TLBO-enhanced LSTM model to estimate stock prices. Because tweets are short and unstructured, data preparation was necessary to reduce noise and prepare the LSTM inputs; after sentences were classified as positive or negative, TLBO optimized the LSTM output layer. Validation on Twitter data showed that the TLBO-LSTM model outperformed traditional models on several performance measures. In 2023, Zhao et al. [20] built a machine learning approach that combined traditional financial indicators with social media sentiment, commodity prices (oil and gold), and financial news to increase stock-forecasting accuracy; a gradient boosting (GBM) classifier using sentiment and oil price data achieved 87.2% prediction accuracy. Wang et al. [10] built a hybrid market prediction model using technical indicators and social sentiment data, evaluating several algorithms (MLP, NB, DT, LR, RF, XGBoost, LSTM, and CNN) and reaching an accuracy of 73.41% and an F1-score of 84.19%. These studies indicate that public sentiment improves prediction. Lastly, Huang et al. [4] developed a hybrid genetic algorithm (HGA) with gray relational analysis (GRA) to predict stock markets; social media sentiment analysis was used to discover chip-based trading signals, which were combined with LSTM outputs to improve forecasting.

3. Materials and Methods

The objective of this section is to present a comprehensive overview of the research methodology employed to integrate large language models (LLMs) [21] and data augmentation techniques [22] within a cognitive computing framework, with the specific aim of improving the accuracy and reliability of stock price prediction. The approach adapts readily to diverse economic contexts and financial environments rather than being tied to a single cultural or regional focus. The method is organized into distinct phases, each comprising a set of connected steps for combining LLMs with augmented datasets; these phases are depicted schematically in Figure 1, which summarizes the entire research workflow. The following sections explore each phase in more detail, explaining how data acquisition, preprocessing, sentiment analysis, and performance evaluation fit into this combined predictive system.
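To make the augmentation phase concrete, the following sketch shows one plausible way to generate synthetic labeled tweets with a GPT-style chat API; the client setup, model name, prompt, and helper function are illustrative assumptions rather than the exact configuration used in this study.

```python
# Hypothetical sketch: generating synthetic labeled stock tweets with an
# OpenAI-style chat API to rebalance a training set. Prompt wording and
# the generate_synthetic_tweets helper are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_synthetic_tweets(ticker: str, sentiment: str, n: int = 5) -> list[str]:
    """Ask the model for n short stock tweets carrying a given sentiment label."""
    prompt = (
        f"Write {n} short, realistic investor tweets about {ticker} stock "
        f"expressing a {sentiment} sentiment. Return one tweet per line."
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.9,  # higher temperature for more varied synthetic examples
    )
    text = response.choices[0].message.content
    return [line.strip() for line in text.splitlines() if line.strip()]

# Example: top up an under-represented negative class before training.
extra_negatives = generate_synthetic_tweets("AAPL", "negative", n=10)
```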

3.1. Word Embedding

BERT [23] is a widely used deep learning model engineered for natural language processing tasks, prized for its capacity to comprehend linguistic context. BERT improves a machine's understanding of intricate language patterns by gathering contextual cues from surrounding words. The architecture is pre-trained on large textual corpora and subsequently fine-tuned for downstream applications such as sentiment analysis. Built on the transformer architecture, BERT employs an attention mechanism to allocate variable weights to input-output relationships. The final hidden layer of the transformer encodes the semantic attributes of an input sentence $S$ into a word vector matrix $S_D$, offering a comprehensive representation for further processing.
$S_D = [A_0, A_1, A_2, A_3, \ldots, A_n, A_{n+1}]$

A submatrix $D_r$ contains the specific terms of interest:

$D_r = [A_1, A_2, A_3, A_4, \ldots, A_{1+m-1}]$

where $D_r \in \mathbb{R}^{m \times l}$ and $m$ represents the required target length. Max-pooling then selects, for each dimension, the most salient feature across all target word vectors:

$V = \max\{D_r, L = 0\}, \quad V \in \mathbb{R}^{l \times L}$
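As a minimal sketch of this embedding step, the code below encodes a sentence with BERT, slices a target sub-span $D_r$ from the last hidden layer, and max-pools over its token vectors to obtain $V$. The bert-base-uncased checkpoint and the span indices are illustrative assumptions.

```python
# Sketch: obtain S_D from BERT's final hidden layer, slice the target
# sub-span D_r, and max-pool over its m token vectors to get V.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

sentence = "Shares of the company rallied after strong quarterly earnings."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

S_D = outputs.last_hidden_state[0]  # (sequence length, l): word vector matrix S_D
D_r = S_D[1:4]                      # (m, l): assumed target sub-span D_r
V = D_r.max(dim=0).values           # (l,): max-pooled target representation V
```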

3.2. Cognitive Computing Models

3.2.1. RNN

RNNs [3] are artificial neural networks that store and reuse information from earlier inputs thanks to their cyclic connections. Unlike feedforward networks, RNNs maintain memory through feedback connections. This architecture makes RNNs well suited to sequential data such as time series, text, and audio signals. In NLP, RNNs are used for language modeling, machine translation, and sentiment classification; beyond NLP, they are applied to speech recognition and time-dependent predictive modeling. Despite the rise of more recent deep learning architectures such as Transformers, RNNs remain useful for sequence-based applications; a typical architecture is shown in Figure 2.
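A compact sketch of such a recurrent classifier follows, with illustrative hyperparameters: an embedding layer feeds an RNN whose final hidden state is mapped to sentiment classes.

```python
# Minimal PyTorch sketch of a recurrent sentiment classifier; vocabulary
# size, dimensions, and class count are illustrative assumptions.
import torch
import torch.nn as nn

class RNNClassifier(nn.Module):
    def __init__(self, vocab_size=30000, embed_dim=128, hidden_dim=256, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)  # (batch, seq_len, embed_dim)
        _, hidden = self.rnn(embedded)        # hidden: (1, batch, hidden_dim)
        return self.fc(hidden.squeeze(0))     # (batch, num_classes)

logits = RNNClassifier()(torch.randint(0, 30000, (4, 32)))  # 4 sequences of 32 tokens
```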

3.2.2. Multilingual BERT (mBERT)

mBERT is a multilingual variant of the original BERT transformer architecture, trained on a heterogeneous corpus spanning 104 languages. Following the BERT base configuration, mBERT comprises twelve transformer layers, each with 768 hidden units and twelve self-attention heads, and accepts input sequences of up to 512 tokens [23].
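These configuration values can be read directly from the published checkpoint; the sketch below assumes the standard bert-base-multilingual-cased release on Hugging Face.

```python
# Confirming the mBERT configuration quoted above (12 layers, 768 hidden
# units, 12 attention heads, 512-token input limit).
from transformers import BertConfig

config = BertConfig.from_pretrained("bert-base-multilingual-cased")
print(config.num_hidden_layers)        # 12 transformer layers
print(config.hidden_size)              # 768 hidden units
print(config.num_attention_heads)      # 12 self-attention heads
print(config.max_position_embeddings)  # 512-token input limit
```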

3.2.3. RoBERTa

Several hybrid deep learning architectures use the Robustly Optimized BERT Pretraining Approach (RoBERTa) [25], an upgraded BERT model based on the Transformer architecture. Transformers use self-attention to determine the relative importance of elements of the input sequence and, unlike RNNs, process all sequence positions concurrently with dynamic contextual awareness. In the encoder, self-attention layers and feed-forward neural networks create context-aware input representations; the decoder combines self-attention, cross-attention over encoder outputs, and feed-forward layers. These architectural features are well matched to the difficulties of predictive modeling. This study uses the encoder component of RoBERTa, which generates contextual embeddings from the input data.
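A minimal sketch of this encoder-only usage follows, assuming the roberta-base checkpoint with a binary sentiment head; the checkpoint and label count are assumptions, not the paper's stated configuration.

```python
# Sketch: RoBERTa's encoder with a sequence-classification head, to be
# fine-tuned on sentiment labels; checkpoint and num_labels are assumed.
import torch
from transformers import RobertaForSequenceClassification, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaForSequenceClassification.from_pretrained("roberta-base", num_labels=2)

batch = tokenizer(["$TSLA breaking out, looking bullish"], return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**batch).logits  # (1, 2): encoder embeddings pooled into class scores
```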

3.2.4. GPT-4

GPT-4 (Generative Pre-trained Transformer 4) [7], a state-of-the-art large language model from OpenAI, generates human-like text by predicting the next word in a sequence from its context. It improves on its predecessors through the Transformer architecture, a neural network framework adept at handling sequential input via attention mechanisms. By training on a vast corpus of text from diverse sources, GPT-4 captures sophisticated patterns of language, semantics, and syntax. Unsupervised pre-training followed by task-specific fine-tuning improves its performance on translation, summarization, question answering, and creative writing. GPT-4 offers greater capacity, scalability, and context-aware responses. Multiple layers of decoder blocks employing masked multi-head self-attention, layer normalization, and residual connections strengthen its ability to understand and represent information, allowing it to model long-range dependencies in text and interpret sophisticated prompts. GPT-4 has set new standards in natural language processing and generation, making it central to sophisticated AI systems and applications.
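Since GPT-4 is accessed as a hosted model, one plausible way to apply it to tweet-level sentiment scoring is prompt-based classification; the sketch below is an assumption about usage, with an illustrative prompt, not the paper's exact setup.

```python
# Hedged sketch: zero-shot sentiment classification of a stock tweet with
# a GPT-4-class model via an OpenAI-style chat API.
from openai import OpenAI

client = OpenAI()

tweet = "$AAPL earnings beat expectations, guidance raised for next quarter."
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system",
         "content": "Classify the stock tweet as 'positive' or 'negative'. Reply with one word."},
        {"role": "user", "content": tweet},
    ],
    temperature=0,  # deterministic output for classification
)
print(response.choices[0].message.content)  # e.g., "positive"
```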

4. Results and Discussion

4.1. Dataset

In this study, we used the Stock Tweets for Sentiment Analysis and Prediction dataset from Kaggle [26], which pairs social media sentiment with stock market movements. As shown in Figure 3, Figure 4, Figure 5 and Figure 6, it includes almost 80,000 tweets on Yahoo Finance's top 25 stock tickers from 30 September 2021 to 30 September 2022.
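A hedged loading sketch follows; the file name and column labels are assumptions about the published dataset's layout.

```python
# Illustrative loading of the Kaggle stock-tweets dataset with pandas;
# "stock_tweets.csv", "Date", and "Stock Name" are assumed names.
import pandas as pd

df = pd.read_csv("stock_tweets.csv", parse_dates=["Date"])
df = df[(df["Date"] >= "2021-09-30") & (df["Date"] <= "2022-09-30")]
print(len(df))                                   # roughly 80,000 tweets
print(df["Stock Name"].value_counts().head(10))  # most-mentioned tickers (cf. Figure 4)
```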

4.2. Pre-Processing Using NLP

We applied several preprocessing approaches [27] in the first stage of text data preparation to enhance its quality and relevance. The process started with tokenization, dividing the text into individual units (tokens) to enable more detailed analysis. Stop words, commonly occurring words that usually carry little meaning, were then removed to lower noise and concentrate on more informative terms. Stemming and lemmatization reduced words to their base forms, collapsing different inflections of a word into a single version. Finally, word-embedding techniques transformed the refined tokens into numerical vectors, enabling their use in machine learning models for prediction. The Natural Language Processing (NLP) components of this system methodically analyze and reformat text obtained from social media; this reorganization highlights features relevant to sentiment analysis, which are then merged with numerical stock data to improve the forecasting of stock price changes. Given this background, we retained only English-language tweets, keeping the data consistent and avoiding the complexity of multilingual processing. The carefully executed preparation steps of tokenization, stop-word removal, and lemmatization ensured that the text data was well prepared for the subsequent analysis, as illustrated in Figure 7 and Figure 8.
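The sketch below illustrates this chain with NLTK; the toolkit choice is an assumption, and any equivalent NLP library would serve.

```python
# Minimal sketch of the preprocessing chain: tokenization, stop-word
# removal, and lemmatization with NLTK.
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize

for resource in ("punkt", "stopwords", "wordnet"):
    nltk.download(resource)

stop_words = set(stopwords.words("english"))
lemmatizer = WordNetLemmatizer()

def preprocess(text: str) -> list[str]:
    tokens = word_tokenize(text.lower())                 # tokenization
    tokens = [t for t in tokens if t.isalpha()]          # drop punctuation/numbers
    tokens = [t for t in tokens if t not in stop_words]  # stop-word removal
    return [lemmatizer.lemmatize(t) for t in tokens]     # lemmatization

print(preprocess("Stocks are rallying after the earnings reports!"))
# e.g., ['stock', 'rallying', 'earnings', 'report']
```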

4.3. Results Analysis

The outcomes in Table 1 and Table 2 show clear patterns in the performance of the models employed for stock price prediction. Examining the models with data augmentation (Table 1) reveals a consistent increase in performance measures across all architectures, with the GPT-4-based model clearly outperforming the rest. With an accuracy of 0.919 and an F1-score of 0.92, it shows not only good overall correctness but also a solid balance between precision and recall. RoBERTa likewise does well, with an F1-score of 0.882, while mBERT and RNN exhibit reasonable but considerably lower performance. These findings highlight how well large transformer models handle unstructured financial data.
Comparing the models without data augmentation reveals a clear decline in all measures (Table 2). GPT-4 still leads, with an F1-score of 0.903 and an accuracy of 0.902, though this is a slight decline relative to its performance with augmentation. RoBERTa and mBERT follow, and RNN once more comes last with the lowest scores. This trend underlines the role of data augmentation in improving model generalization and robustness. The continued success of transformer-based models, even without extra data, shows that they handle the complex language patterns in financial text better than traditional RNNs.
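For reference, the reported metrics can be computed as in the following sketch, where y_true and y_pred are placeholders standing in for held-out labels and model predictions.

```python
# Sketch: computing the evaluation metrics of Tables 1 and 2 with
# scikit-learn on placeholder binary movement labels.
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 1]  # placeholder ground-truth movement labels
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]  # placeholder model outputs

print("Accuracy :", accuracy_score(y_true, y_pred))
print("F1-score :", f1_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
```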
GPT-generated synthetic data appears to have made a significant contribution. Data augmentation increased the performance of every model, with the largest improvements in the more complex architectures. Augmented data raised the GPT-4-based model's F1-score from 0.903 to 0.92 and its accuracy from 0.902 to 0.919. GPT-generated data seems to have offset training data constraints by providing more varied and informative instances, which likely sharpened decision boundaries and reduced overfitting. Likewise, the mBERT and RoBERTa models showed observable increases in recall and accuracy, suggesting better sensitivity and reliability in their forecasts.
All in all, these findings validate the research premise that combining LLMs with sophisticated data augmentation methods significantly improves the accuracy of stock-price-forecasting models. The outstanding performance of GPT-4, even without augmentation, indicates that modern language models are already well suited for challenging tasks involving unstructured data. Augmented data’s further enhancement, thus, highlights the synergistic advantage of enhancing training sets by means of synthetic data synthesis. These results indicate a smarter and more adaptable future for market analysis by allowing for greater use of LLMs and enhancement methods in financial forecasting processes.
The comparison between Table 1 and Table 2 emphasizes a notable drop in performance measures when data augmentation is not used, hence supporting the idea that augmented data is vital for model training and generalization. The decline in performance of the GPT-4 model from an F1-score of 0.92 to 0.903 suggests that although transformer models are naturally strong, the inclusion of synthetic data is essential in optimizing their potential. This finding is especially relevant in financial forecasting as the availability of several meaningful training examples may significantly lower the overfitting risk and enhance model resilience.
The gains seen in mBERT and RoBERTa, together with GPT-4, further support the theory that data augmentation increases model sensitivity and reliability. The significant rises in recall and accuracy imply that augmented data not only offers additional training examples but also enriches the dataset with varied situations, allowing the models to make more informed forecasts. Particularly in specialized sectors like banking, this result corresponds with earlier research supporting the use of synthetic data to offset the constraints of small or imbalanced datasets.
These results have far-reaching consequences beyond the obvious ones; they point to a changing possibility for the combination of data augmentation methods and large language models (LLMs) in financial forecasting. The performance of the GPT-4 model, even without augmentation, suggests that modern language models are well suited to handle difficult tasks, including unstructured data. The extra performance increases obtained by means of data augmentation, therefore, draw attention to the synergistic advantages of integrating sophisticated modeling approaches with creative training methodologies.
Future studies should investigate how scalable these techniques are across other financial sectors and datasets. Examining the effects of other augmentation techniques beyond GPT-generated data might provide further information on how to maximize the model performance. Building confidence and knowledge in the use of these models will also depend on their interpretability in relation to stock price projection. Integrating LLMs and data augmentation will probably be more important as financial markets change, as it will improve the predicted accuracy and help to fit dynamic market behavior.
All things considered, the results of this study support the hypothesis that pairing LLMs with advanced data augmentation techniques can greatly improve the accuracy of stock-price-forecasting models. The exceptional performance of the GPT-4 model, even without augmentation, emphasizes the readiness of current language models for difficult tasks involving unstructured financial data. The gains achieved with augmented data not only underscore the value of synthetic data generation for market research and financial forecasting but also point to a promising path for future development. Building increasingly intelligent and flexible systems able to negotiate the complexity of financial markets will depend on the continued investigation of these methods.

5. Conclusions

This study examined how data augmentation affects machine learning models for stock price prediction, with particular attention to large language models (LLMs) and synthetic data generation. LLMs, notably GPT-4, outperform conventional models at interpreting unstructured financial data. With data augmentation, the GPT-4-based model performed best, with an accuracy of 0.919 and an F1-score of 0.92, demonstrating its ability to handle the sophisticated vocabulary of financial texts. RoBERTa, mBERT, and even the conventional RNN also improved with GPT-generated data, though they remained behind GPT-4. The performance difference with and without data augmentation shows how synthetic data improves generalization and reduces overfitting.
LLMs are strong and adaptable stock-forecasting tools, especially when combined with modern data augmentation techniques, suggesting a paradigm shift in financial modeling. The improvements in accuracy, recall, and reliability show that GPT-generated data can address the constraints of financial datasets. This research verifies the usefulness of these methodologies in financial forecasting and suggests applications in risk assessment, sentiment analysis, and predictive analytics. Integrating contemporary machine learning algorithms can support more accurate, data-driven decision-making systems as financial markets grow more complex. The study also opens avenues for further research on integrating ensemble models, evaluating other augmentation procedures, and extending these findings to additional unstructured data domains.

Author Contributions

H.N. developed and performed the experiments, analyzed the data, and wrote the manuscript. N.H. collected the data and contributed to the final version of the manuscript. Z.B. contributed to the conceptualization of the study, participated in the experiments, and assisted in interpreting the results. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article.

Acknowledgments

We thank the anonymous reviewers for their constructive comments and valuable suggestions that helped improve the quality of this manuscript.

Conflicts of Interest

The authors declare no conflicts of interest related to the content of this article.

Abbreviations

The following acronyms are employed in this paper:
BERT: Bidirectional Encoder Representations from Transformers
NLP: Natural Language Processing
LLM: Large Language Model
GPT: Generative Pre-trained Transformer
RNN: Recurrent Neural Network

References

  1. Banik, S.; Sharma, N.; Mangla, M.; Mohanty, S.N.; Shitharth, S. LSTM based decision support system for swing trading in stock market. Knowl.-Based Syst. 2022, 239, 107994. [Google Scholar] [CrossRef]
  2. Bordoloi, M.; Biswas, S.K. Sentiment analysis: A survey on design framework, applications and future scopes. Artif. Intell. Rev. 2023, 56, 12505–12560. [Google Scholar] [CrossRef]
  3. Hicham, N.; Karim, S.; Habbat, N. Enhancing Arabic Sentiment Analysis in E-Commerce Reviews on Social Media Through a Stacked Ensemble Deep Learning Approach. Math. Model. Eng. Probl. 2023, 10, 790–798. [Google Scholar] [CrossRef]
  4. Huang, J.-Y.; Tung, C.-L.; Lin, W.-Z. Using Social Network Sentiment Analysis and Genetic Algorithm to Improve the Stock Prediction Accuracy of the Deep Learning-Based Approach. Int. J. Comput. Intell. Syst. 2023, 16, 93. [Google Scholar] [CrossRef]
  5. Yadav, K.; Yadav, M.; Saini, S. Stock values predictions using deep learning based hybrid models. CAAI Trans. Intell. Technol. 2022, 7, 107–116. [Google Scholar] [CrossRef]
  6. Nouri, H.; Sabri, K.; Habbat, N. A Comprehensive Analysis of Consumers Sentiments Using an Ensemble Based Approach for Effective Marketing Decision-Making. In Artificial Intelligence and Industrial Applications; Masrour, T., Ramchoun, H., Hajji, T., Hosni, M., Eds.; Lecture Notes in Networks and Systems; Springer Nature: Cham, Switzerland, 2023; Volume 772, pp. 323–333. [Google Scholar]
  7. Hicham, N.; Nassera, H. Improving emotion classification in e-commerce customer review analysis using GPT and meta-ensemble deep learning technique for multilingual system. Multimed. Tools Appl. 2024, 83, 87323–87367. [Google Scholar] [CrossRef]
  8. Hicham, N.; Habbat, N. Customer behavior forecasting using machine learning techniques for improved marketing campaign competitiveness. Int. J. Eng. Market. Dev. (IJEMD) 2023, 1, 1–15. [Google Scholar] [CrossRef]
  9. De Souza, O.A.P.; Miguel, L.F.F. CIOA: Circle-Inspired Optimization Algorithm, an algorithm for engineering optimization. SoftwareX 2022, 19, 101192. [Google Scholar] [CrossRef]
  10. Mu, G.; Gao, N.; Wang, Y.; Dai, L. A Stock Price Prediction Model Based on Investor Sentiment and Optimized Deep Learning. IEEE Access 2023, 11, 51353–51367. [Google Scholar] [CrossRef]
  11. Li, J.; Yang, J. Financial shocks, investor sentiment, and heterogeneous firms’ output volatility: Evidence from credit asset securitization markets. Finance Res. Lett. 2024, 60, 104860. [Google Scholar] [CrossRef]
  12. He, N.; Wang, L.; Zheng, P.; Zhang, C.; Li, L. CBSASNet: A Siamese Network Based on Channel Bias Split Attention for Remote Sensing Change Detection. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–17. [Google Scholar] [CrossRef]
  13. Kanakaraj, M.; Guddeti, R.M.R. Performance analysis of Ensemble methods on Twitter sentiment analysis using NLP techniques. In Proceedings of the 2015 IEEE 9th International Conference on Semantic Computing (IEEE ICSC 2015), Anaheim, CA, USA, 7–9 February 2015; pp. 169–170. [Google Scholar]
  14. Anaraki, M.V.; Farzin, S. Humboldt Squid Optimization Algorithm (HSOA): A Novel Nature-Inspired Technique for Solving Optimization Problems. IEEE Access 2023, 11, 122069–122115. [Google Scholar] [CrossRef]
  15. Fan, W.; Ding, Y.; Ning, L.; Wang, S.; Li, H.; Yin, D.; Chua, T.; Li, Q. A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models. arXiv 2024, arXiv:2405.06211. [Google Scholar] [CrossRef]
  16. Agrawal, S.; Trenkle, J.; Kawale, J. Beyond Labels: Leveraging Deep Learning and LLMs for Content Metadata. In Proceedings of the RecSys ’23: Seventeenth ACM Conference on Recommender Systems, Singapore, 18–22 September 2023; p. 1. [Google Scholar]
  17. Karimi, A.; Rossi, L.; Prati, A. AEDA: An Easier Data Augmentation Technique for Text Classification. arXiv 2021, arXiv:2108.13230. [Google Scholar] [CrossRef]
  18. Ansari, G.; Garg, M.; Saxena, C. Data Augmentation for Mental Health Classification on Social Media. arXiv 2021, arXiv:2112.10064. [Google Scholar] [CrossRef]
  19. Zhao, Y.; Yang, G. Deep Learning-based Integrated Framework for stock price movement prediction. Appl. Soft Comput. 2023, 133, 109921. [Google Scholar] [CrossRef]
  20. Wang, Z.; Hu, Z.; Li, F.; Ho, S.-B.; Cambria, E. Learning-Based Stock Trending Prediction by Incorporating Technical Indicators and Social Media Sentiment. Cogn. Comput. 2023, 15, 1092–1102. [Google Scholar] [CrossRef]
  21. Alghisi, S.; Rizzoli, M.; Roccabruna, G.; Mousavi, S.M.; Riccardi, G. Should We Fine-Tune or RAG? Evaluating Different Techniques to Adapt LLMs for Dialogue. arXiv 2024, arXiv:2406.06399. [Google Scholar] [CrossRef]
  22. Bayer, M.; Kaufhold, M.-A.; Buchhold, B.; Keller, M.; Dallmeyer, J.; Reuter, C. Data augmentation in natural language processing: A novel text generation approach for long and short text classifiers. Int. J. Mach. Learn. Cybern. 2023, 14, 135–150. [Google Scholar] [CrossRef]
  23. Pires, T.; Schlinger, E.; Garrette, D. How multilingual is Multilingual BERT? arXiv 2019, arXiv:1906.01502. [Google Scholar] [CrossRef]
  24. Feng, W.; Guan, N.; Li, Y.; Zhang, X.; Luo, Z. Audio visual speech recognition with multimodal recurrent neural networks. In Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, USA, 14–19 May 2017; pp. 681–688. [Google Scholar] [CrossRef]
  25. Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv 2019, arXiv:1907.11692. Available online: http://arxiv.org/abs/1907.11692 (accessed on 9 August 2023).
  26. Equinxx. Stock Tweets for Sentiment Analysis and Prediction. Kaggle Datasets, 2025. Available online: https://www.kaggle.com/datasets/equinxx/stock-tweets-for-sentiment-analysis-and-prediction (accessed on 20 April 2025).
  27. Hicham, N.; Karim, S.; Habbat, N. Customer sentiment analysis for Arabic social media using a novel ensemble machine learning approach. Int. J. Electr. Comput. Eng. (IJECE) 2023, 13, 4504–4515. [Google Scholar] [CrossRef]
Figure 1. The proposed methodology.
Figure 2. RNN architecture [24].
Figure 3. Tweet volume over time.
Figure 4. Top 10 most mentioned companies.
Figure 5. Tweets across different months.
Figure 6. Tweets across different days of the week.
Figure 7. Sentiment overview.
Figure 8. Sentiment over time.
Table 1. Model performance with data augmentation.

Model   | Accuracy | F1-Score | Recall | Precision | Preprocessing | Data Augmentation
RNN     | 0.823    | 0.810    | 0.792  | 0.836     | Yes           | Yes (GPT-generated)
mBERT   | 0.877    | 0.869    | 0.855  | 0.884     | Yes           | Yes (GPT-generated)
RoBERTa | 0.891    | 0.882    | 0.871  | 0.893     | Yes           | Yes (GPT-generated)
GPT-4   | 0.919    | 0.920    | 0.911  | 0.928     | Yes           | Yes (GPT-generated)

Table 2. Model performance without data augmentation.

Model   | Accuracy | F1-Score | Recall | Precision | Preprocessing | Data Augmentation
RNN     | 0.808    | 0.795    | 0.777  | 0.820     | Yes           | No
mBERT   | 0.861    | 0.853    | 0.839  | 0.868     | Yes           | No
RoBERTa | 0.874    | 0.866    | 0.855  | 0.876     | Yes           | No
GPT-4   | 0.902    | 0.903    | 0.894  | 0.911     | Yes           | No