Big Data and Graph Deep Learning for Financial Decision Support from Social Networks: A Critical Review
Abstract
1. Introduction
1.1. Contributions of This Study
- It reframes social network analytics for finance around evidence observability and decision cutoffs, clarifying why platform access, ranking effects, and deletions change what results can legitimately claim.
- It synthesizes model components (text, temporal, relational, fusion, deployment constraints) through their failure modes and assumptions, rather than treating architectures as ends in themselves.
- It organizes the applied literature by signal type (sentiment aggregation, relational indicators, multimodal and cross-platform signals) and explains how each signal becomes decision-facing only after attribution and alignment steps.
- It introduces an assurance-oriented perspective, emphasizing manipulation risk, robustness under missing evidence and delay, calibration under shift, and traceability of outputs back to time-appropriate evidence.
- It consolidates bounded findings, recurring failure modes, and research directions that follow directly from validity gaps rather than from model fashion.
1.2. Scope and Limitations
1.3. Paper Structure
2. Methods
3. Social Network Data in Financial Contexts
3.1. Evidence Types, Provenance, Sampling Bias
3.2. Conditioning: Bots, Spam, Entity Resolution, Timestamp Alignment
3.3. Practical Data Issues That Directly Affect Evaluation
4. Model Components for Social–Financial Representation and Inference
4.1. Text Representation for Noisy, Time-Sensitive Evidence
4.2. Temporal Modeling and Session-Aware Alignment
4.3. Relational Modeling and Graph Encoders
4.4. Fusion Mechanisms for Heterogeneous Evidence
4.5. Training and Deployment Constraints
4.6. Comparative Synthesis of Model Components
5. Transformer-Based Sentiment Analysis for Financial Decision-Making
5.1. Sentiment Signals and Aggregation
5.2. Relational Signals: Diffusion, Coordination, Influence
5.3. Multimodal and Cross-Platform Signals
5.4. What Transfers Across Market Phases, What Breaks Under Drift
6. Network Structure and Information Diffusion
6.1. Manipulation, Synthetic Content, and Platform Bias
6.2. Information Cascades and Viral Content
6.3. Interpretability, Traceability, and Audit Needs
7. Real-Time Processing and Scalability
7.1. Stable Findings and Narrow Claims That Hold
7.2. Fragile Findings and Common Failure Modes
8. Research Directions
- A first priority is standardizing time-aware study design for social evidence, including explicit decision cutoffs, session-aligned windows, and observability constraints that model collection delay and deletion effects. Shared benchmarks are useful only if they enforce these constraints; otherwise they reward optimization toward leakage-prone setups.
- A second direction is stronger treatment of attribution under ambiguity. This includes time-aware entity dictionaries, conservative abstention when mention confidence is low, and evaluation that measures not only sentiment accuracy but also attribution error and its downstream impact on indices and graphs. Work in this area would shift the focus from post-level classification toward instrument-level signal validity, which is closer to financial use.
- Third, robustness needs to be evaluated as a primary objective. This requires stress tests that mimic operational degradation: rate limiting, partial outages, modality dropout, and shifts in bot prevalence. Models should be compared on graceful degradation and calibration stability, not only on peak performance. In parallel, reliability weighting methods should be developed and tested under shifting manipulation intensity, with explicit reporting of when weighting fails or becomes biased.
- Fourth, relational analysis would benefit from clearer semantics and observability models. Rather than treating all interactions as equivalent edges, future work should differentiate edge types and evaluate whether the learned structure reflects diffusion, debate, coordinated amplification, or platform-driven exposure. Dynamic graph designs should also be evaluated under time-local construction rules to avoid embedding post-event structure that inflates results.
- Fifth, multimodal and cross-platform integration should be judged by incremental value and transferability. Research should report modality contribution under consistent training budgets, include missing-modality stress tests, and evaluate cross-platform alignment methods under realistic propagation delays and vocabulary drift. Where identity linkage is impossible, topic- and URL-based alignment should be treated as an approximation with measured error, not as a neutral merge.
- Finally, decision readiness calls for practical traceability standards. Systems should be designed so that outputs can be reconstructed from versioned preprocessing, logged evidence identifiers, and explicit action mappings with thresholds and abstention rules. This would make failures diagnosable, support governance requirements, and reduce the gap between retrospective modeling and accountable operational deployment.
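As an illustration of the first and last directions above, the following minimal sketch shows how a decision cutoff, a collection delay, and deletion effects could be combined into a single observability filter whose output is a logged list of evidence identifiers. All names here (`Post`, `observable_at`, `evidence_for_decision`) are hypothetical, and the fixed collection delay is a simplifying assumption; real pipelines would likely need per-source delays and versioned preprocessing as well.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Post:
    """Hypothetical record for one piece of social evidence."""
    post_id: str
    created_at: datetime
    deleted_at: Optional[datetime] = None  # None if the post was never deleted


def observable_at(post: Post, cutoff: datetime, collection_delay: timedelta) -> bool:
    """Return True if a collector with the given delay would hold the post at the cutoff."""
    collected_at = post.created_at + collection_delay
    if collected_at > cutoff:
        # The post had not been collected yet when the decision was made.
        return False
    if post.deleted_at is not None and post.deleted_at <= collected_at:
        # The post was deleted before the collector could capture it.
        return False
    return True


def evidence_for_decision(
    posts: Iterable[Post], cutoff: datetime, collection_delay: timedelta
) -> List[str]:
    """Return sorted IDs of posts usable at the cutoff, suitable for an audit log."""
    return sorted(
        p.post_id for p in posts if observable_at(p, cutoff, collection_delay)
    )


# Example with an assumed 16:00 decision cutoff and a 5-minute collection delay.
cutoff = datetime(2024, 1, 2, 16, 0)
delay = timedelta(minutes=5)
posts = [
    Post("a", datetime(2024, 1, 2, 15, 0)),                                  # collected in time
    Post("b", datetime(2024, 1, 2, 15, 58)),                                 # collected after cutoff
    Post("c", datetime(2024, 1, 2, 15, 30), datetime(2024, 1, 2, 15, 32)),   # deleted before collection
    Post("d", datetime(2024, 1, 2, 15, 30), datetime(2024, 1, 2, 17, 0)),    # deleted after cutoff
]
evidence_ids = evidence_for_decision(posts, cutoff, delay)
print(evidence_ids)  # ['a', 'd']
```

Post `d` is kept because it was observable at the cutoff, even though a later retrospective crawl would no longer find it; post `b` is excluded although it was created before the cutoff. Storing `evidence_ids` alongside each output is one simple way to make a decision reconstructable from time-appropriate evidence.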
9. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
- Gao, J.; Li, P.; Chen, Z.; Zhang, J. A Survey on Deep Learning for Multimodal Data Fusion. Neural Comput. 2020, 32, 829–864. [Google Scholar] [CrossRef]
- Cheng, Y.; Wang, D.; Zhou, P.; Zhang, T. Model Compression and Acceleration for Deep Neural Networks: The Principles, Progress, and Challenges. IEEE Signal Process. Mag. 2018, 35, 126–136. [Google Scholar] [CrossRef]
- Liu, Q.; Son, H. Methods for Aggregating Investor Sentiment from Social Media. Humanit. Soc. Sci. Commun. 2024, 11, 925. [Google Scholar] [CrossRef]
- Muhammad, I.; Rospocher, M. On Assessing the Performance of LLMs for Target-Level Sentiment Analysis in Financial News Headlines. Algorithms 2025, 18, 46. [Google Scholar] [CrossRef]
- Kasula, V.K.; Tumma, C.; Konda, B. A Comprehensive Review of Artificial Intelligence Models for Lifetime Value Optimization. In 2025 2nd International Conference on Research Methodologies in Knowledge Management, Artificial Intelligence and Telecommunication Engineering (RMKMATE), Chennai, India, 7–8 May 2025; IEEE: Piscataway, NJ, USA, 2025; pp. 1–5. [Google Scholar] [CrossRef]
- Abdollahi, H.; Fjesme, S.L.; Sirnes, E. Measuring Market Volatility Connectedness to Media Sentiment. N. Am. J. Econ. Financ. 2024, 71, 102091. [Google Scholar] [CrossRef]
- ALDayel, A.; Magdy, W. Stance Detection on Social Media: State of the Art and Trends. Inf. Process. Manag. 2021, 58, 102597. [Google Scholar] [CrossRef]
- Tang, Y.; Yang, Y.; Huang, A.; Tam, A.; Tang, J. FinEntity: Entity-Level Sentiment Classification for Financial Texts. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Singapore, 6–10 December 2023; Association for Computational Linguistics: Stroudsburg, PA, USA, 2023; pp. 15465–15471. [Google Scholar]
- Corsi, G. Evaluating Twitter’s Algorithmic Amplification of Low-Credibility Content: An Observational Study. EPJ Data Sci. 2024, 13, 18. [Google Scholar] [CrossRef]
- Aguilera, A.; Quinteros, P.; Dongo, I.; Cardinale, Y. CrediBot: Applying Bot Detection for Credibility Analysis on Twitter. IEEE Access 2023, 11, 108365–108385. [Google Scholar] [CrossRef]
- Akdogan, Y.E.; Anbar, A. More than Just Sentiment: Using Social, Cognitive, and Behavioral Information of Social Media to Predict Stock Markets with Artificial Intelligence and Big Data. Borsa Istanb. Rev. 2024, 24, 61–82. [Google Scholar] [CrossRef]
- Broadstock, D.C.; Zhang, D. Social-Media and Intraday Stock Returns: The Pricing Power of Sentiment. Financ. Res. Lett. 2019, 30, 116–123. [Google Scholar] [CrossRef]
- Chu, X.; Wan, X.; Qiu, J. The Relative Importance of Overnight Sentiment versus Trading-Hour Sentiment in Volatility Forecasting. J. Behav. Exp. Financ. 2023, 39, 100826. [Google Scholar] [CrossRef]
- Audrino, F.; Sigrist, F.; Ballinari, D. The Impact of Sentiment and Attention Measures on Stock Market Volatility. Int. J. Forecast. 2020, 36, 334–357. [Google Scholar] [CrossRef]
- Oliveira, N.; Cortez, P.; Areal, N. The Impact of Microblogging Data for Stock Market Prediction: Using Twitter to Predict Returns, Volatility, Trading Volume and Survey Sentiment Indices. Expert Syst. Appl. 2017, 73, 125–144. [Google Scholar] [CrossRef]
- Kastrati, M.; Kastrati, Z.; Shariq Imran, A.; Biba, M. Leveraging Distant Supervision and Deep Learning for Twitter Sentiment and Emotion Classification. J. Intell. Inf. Syst. 2024, 62, 1045–1070. [Google Scholar] [CrossRef]
- Pan, Q.; Meng, Z. Hybrid Uncertainty Calibration for Multimodal Sentiment Analysis. Electronics 2024, 13, 662. [Google Scholar] [CrossRef]
- Wagner, M.; Wei, X. Ambiguous Investor Sentiment. Financ. Res. Lett. 2024, 67, 105773. [Google Scholar] [CrossRef]
- Firdaniza, F.; Ruchjana, B.; Chaerani, D.; Radianti, J. Information Diffusion Model in Twitter: A Systematic Literature Review. Information 2021, 13, 13. [Google Scholar] [CrossRef]
- Bellutta, D.; Carley, K.M. Investigating Coordinated Account Creation Using Burst Detection and Network Analysis. J. Big Data 2023, 10, 20. [Google Scholar] [CrossRef]
- Hirshleifer, D.; Peng, L.; Wang, Q. News Diffusion in Social Networks and Stock Market Reactions. Rev. Financ. Stud. 2025, 38, 883–937. [Google Scholar] [CrossRef]
- Bandy, J.; Diakopoulos, N. Curating Quality? How Twitter’s Timeline Algorithm Treats Different Types of News. Soc. Media Soc. 2021, 7, 20563051211041648. [Google Scholar] [CrossRef]
- Gausen, A.; Luk, W.; Guo, C. Using Agent-Based Modelling to Evaluate the Impact of Algorithmic Curation on Social Media. J. Data Inf. Qual. 2023, 15, 1–24. [Google Scholar] [CrossRef]
- Chen, Z.-H.; Wu, W.-L.; Li, S.-P.; Bao, K.; Koedijk, K.G. Social Media Information Diffusion and Excess Stock Returns Co-Movement. Int. Rev. Financ. Anal. 2024, 91, 103036. [Google Scholar] [CrossRef]
- Tardelli, S.; Nizzoli, L.; Tesconi, M.; Conti, M.; Nakov, P.; Da San Martino, G.; Cresci, S. Temporal Dynamics of Coordinated Online Behavior: Stability, Archetypes, and Influence. Proc. Natl. Acad. Sci. USA 2024, 121, e2307038121. [Google Scholar] [CrossRef]
- Zouzou, Y.; Varol, O. Unsupervised Detection of Coordinated Fake-Follower Campaigns on Social Media. EPJ Data Sci. 2024, 13, 62. [Google Scholar] [CrossRef]
- Tian, Y.; Xie, Y. Artificial Cheerleading in IEO: Marketing Campaign or Pump and Dump Scheme. Inf. Process. Manag. 2024, 61, 103537. [Google Scholar] [CrossRef]
- Ogburn, E.L.; Sofrygin, O.; Díaz, I.; Van Der Laan, M.J. Causal Inference for Social Network Data. J. Am. Stat. Assoc. 2024, 119, 597–611. [Google Scholar] [CrossRef] [PubMed]
- Agarwal, S.; Mehta, S. Effective Influence Estimation in Twitter Using Temporal, Profile, Structural and Interaction Characteristics. Inf. Process. Manag. 2020, 57, 102321. [Google Scholar] [CrossRef]
- Loh, W.W.; Ren, D. Estimating Social Influence in a Social Network Using Potential Outcomes. Psychol. Methods 2022, 27, 841–855. [Google Scholar] [CrossRef]
- Yang, J.; Li, Y.; Gao, C.; Dong, W. Entity Disambiguation with Context Awareness in User-Generated Short Texts. Expert Syst. Appl. 2020, 160, 113652. [Google Scholar] [CrossRef]
- Park, J.H.; Moon, J.Y.; Hong, S.-J. Understanding the Bi-Directional Message Diffusion Mechanism in the Context of IT Trends and Current Social Issues. Inf. Manag. 2021, 58, 103527. [Google Scholar] [CrossRef]
- Morstatter, F.; Liu, H. Discovering, Assessing, and Mitigating Data Bias in Social Media. Online Soc. Netw. Media 2017, 1, 1–13. [Google Scholar] [CrossRef]
- Torres-Lugo, C.; Pote, M.; Nwala, A.C.; Menczer, F. Manipulating Twitter through Deletions. Proc. Int. AAAI Conf. Web Soc. Media 2022, 16, 1029–1039. [Google Scholar] [CrossRef]
- López-Vizcaíno, M.; Nóvoa, F.J.; Fernández, D.; Cacheda, F. Time Aware F-Score for Cybersecurity Early Detection Evaluation. Appl. Sci. 2024, 14, 574. [Google Scholar] [CrossRef]
- Diallo, A.R.; Homri, L.; Boeuf, T.; Dantan, J.-Y.; Bonnet, F. Quantifying and Mitigating Alarm Fatigue Caused by Fault Detection Systems. Reliab. Eng. Syst. Saf. 2026, 267, 111890. [Google Scholar] [CrossRef]
- Horta Ribeiro, M.; Hosseinmardi, H.; West, R.; Watts, D.J. Deplatforming Did Not Decrease Parler Users’ Activity on Fringe Social Media. PNAS Nexus 2023, 2, pgad035. [Google Scholar] [CrossRef] [PubMed]
- Ben El Hadj Said, I.; Slim, S. The Dynamic Relationship between Investor Attention and Stock Market Volatility: International Evidence. J. Risk Financ. Manag. 2022, 15, 66. [Google Scholar] [CrossRef]
- Lopez-Vizcaino, M.F.; Novoa, F.J.; Fernandez, D.; Cacheda, F. Measuring Early Detection of Anomalies. IEEE Access 2022, 10, 127695–127707. [Google Scholar] [CrossRef]
- Toraman, C.; Şahinuç, F.; Yilmaz, E.H.; Akkaya, I.B. Understanding Social Engagements: A Comparative Analysis of User and Text Features in Twitter. Soc. Netw. Anal. Min. 2022, 12, 47. [Google Scholar] [CrossRef]
- Kim-Hahm, H.; Abou-Zaid, A.S.; Mohd, A. News vs. Social Media: Sentiment Impact on Stock Performance of Big Tech Companies. J. Risk Financ. Manag. 2025, 18, 660. [Google Scholar] [CrossRef]
- Cookson, J.A.; Lu, R.; Mullins, W.; Niessner, M. The Social Signal. J. Financ. Econ. 2024, 158, 103870. [Google Scholar] [CrossRef]
- Ho, T.-T.; Huang, Y. Stock Price Movement Prediction Using Sentiment Analysis and CandleStick Chart Representation. Sensors 2021, 21, 7957. [Google Scholar] [CrossRef]
- Ruan, L.; Jiang, H. Stock Price Prediction Using FinBERT-Enhanced Sentiment with SHAP Explainability and Differential Privacy. Mathematics 2025, 13, 2747. [Google Scholar] [CrossRef]
- Nguyen, N.-H.; Nguyen, T.-T.; Ngo, Q.T. DASF-Net: A Multimodal Framework for Stock Price Forecasting with Diffusion-Based Graph Learning and Optimized Sentiment Fusion. J. Risk Financ. Manag. 2025, 18, 417. [Google Scholar] [CrossRef]
- Duszejko, P.; Walczyna, T.; Piotrowski, Z. Detection of Manipulations in Digital Images: A Review of Passive and Active Methods Utilizing Deep Learning. Appl. Sci. 2025, 15, 881. [Google Scholar] [CrossRef]
- Asmawati, E.; Saikhu, A.; Siahaan, D.O. Sentiment Analysis of Meme Images Using Deep Neural Network Based on Keypoint Representation. Informatics 2025, 12, 118. [Google Scholar] [CrossRef]
- Hill, B.G.; Koback, F.L.; Schilling, P.L. The Risk of Shortcutting in Deep Learning Algorithms for Medical Imaging Research. Sci. Rep. 2024, 14, 29224. [Google Scholar] [CrossRef]
- Jones, S.M.; Van De Sompel, H.; Shankar, H.; Klein, M.; Tobin, R.; Grover, C. Scholarly Context Adrift: Three out of Four URI References Lead to Changed Content. PLoS ONE 2016, 11, e0167475. [Google Scholar] [CrossRef]
- Farhan, M.; Butt, U.; Sulaiman, R.B.; Alraja, M. Self-Sovereign Identities and Content Provenance: VeriTrust—A Blockchain-Based Framework for Fake News Detection. Future Internet 2025, 17, 448. [Google Scholar] [CrossRef]
- Bandy, J.; Diakopoulos, N. More Accounts, Fewer Links: How Algorithmic Curation Impacts Media Exposure in Twitter Timelines. Proc. ACM Hum.-Comput. Interact. 2021, 5, 1–28. [Google Scholar] [CrossRef]
- Murdock, I.; Carley, K.M.; Yağan, O. An Agent-Based Model of Cross-Platform Information Diffusion and Moderation. Soc. Netw. Anal. Min. 2024, 14, 145. [Google Scholar] [CrossRef]
- Gao, H.; Wang, Y.; Shao, J.; Shen, H.; Cheng, X. User Identity Linkage across Social Networks with the Enhancement of Knowledge Graph and Time Decay Function. Entropy 2022, 24, 1603. [Google Scholar] [CrossRef]
- Huang, M.; Wang, J.-L.; Zhang, Z.-K. Narrative Co-Evolution in Hybrid Social Networks: A Longitudinal Computational Analysis of Confucius Institutes. Entropy 2025, 27, 1240. [Google Scholar] [CrossRef] [PubMed]
- Zhan, Y.; Yang, R.; You, J.; Huang, M.; Liu, W.; Liu, X. A Systematic Literature Review on Incomplete Multimodal Learning: Techniques and Challenges. Syst. Sci. Control Eng. 2025, 13, 2467083. [Google Scholar] [CrossRef]
- Fernando, M.; Cèsar, F.; David, N.; José, H. Missing the Missing Values: The Ugly Duckling of Fairness in Machine Learning. Int. J. Intell. Syst. 2021, 36, 3217–3258. [Google Scholar] [CrossRef]
- Pereira, R.C.; Abreu, P.H.; Rodrigues, P.P.; Figueiredo, M.A.T. Imputation of Data Missing Not at Random: Artificial Generation and Benchmark Analysis. Expert Syst. Appl. 2024, 249, 123654. [Google Scholar] [CrossRef]
- Nevado-Catalán, D.; Pastrana, S.; Vallina-Rodriguez, N.; Tapiador, J. An Analysis of Fake Social Media Engagement Services. Comput. Secur. 2023, 124, 103013. [Google Scholar] [CrossRef]
- Chelas, S.; Routis, G.; Roussaki, I. Detection of Fake Instagram Accounts via Machine Learning Techniques. Computers 2024, 13, 296. [Google Scholar] [CrossRef]
- Yuan, Y.; Li, Z.; Zhao, B. A Survey of Multimodal Learning: Methods, Applications, and Future. ACM Comput. Surv. 2025, 57, 1–34. [Google Scholar] [CrossRef]
- Singh, S.; Saber, E.; Markopoulos, P.P.; Heard, J. Regulating Modality Utilization within Multimodal Fusion Networks. Sensors 2024, 24, 6054. [Google Scholar] [CrossRef]
- Ma, X.; Cai, X.; Song, Y.; Liang, Y.; Liu, G.; Yang, Y. RMP: Robust Multi-Modal Perception Under Missing Condition. Electronics 2025, 15, 119. [Google Scholar] [CrossRef]
- Ma, Y.; Li, S.; Zhou, M. Twitter-Based Market Uncertainty and Global Stock Volatility Predictability. N. Am. J. Econ. Finance 2025, 75, 102256. [Google Scholar] [CrossRef]
- Anand, A.; Pathak, J. The Role of Reddit in the GameStop Short Squeeze. Econ. Lett. 2022, 211, 110249. [Google Scholar] [CrossRef]
- Gan, B.; Alexeev, V.; Bird, R.; Yeung, D. Sensitivity to Sentiment: News vs Social Media. Int. Rev. Financ. Anal. 2020, 67, 101390. [Google Scholar] [CrossRef]
- Bousbaa, Z.; Sanchez-Medina, J.; Bencharef, O. Financial Time Series Forecasting: A Data Stream Mining-Based System. Electronics 2023, 12, 2039. [Google Scholar] [CrossRef]
- Fan, R.; Talavera, O.; Tran, V. Social Media Bots and Stock Markets. Eur. Financ. Manag. 2020, 26, 753–777. [Google Scholar] [CrossRef]
- Zeng, T.; Shema, A.; Acuna, D.E. Dead Science: Most Resources Linked in Biomedical Articles Disappear in Eight Years. In Information in Contemporary Society; Taylor, N.G., Christian-Lamb, C., Martin, M.H., Nardi, B., Eds.; Lecture Notes in Computer Science; Springer International Publishing: Cham, Switzerland, 2019; Volume 11420, pp. 170–176. [Google Scholar]
- Abdalgader, K.; Matroud, A.A.; Al-Doboni, G. Temporal Dynamics in Short Text Classification: Enhancing Semantic Understanding Through Time-Aware Model. Information 2025, 16, 214. [Google Scholar] [CrossRef]
- Eg, R.; Demirkol Tønnesen, Ö.; Tennfjord, M.K. A Scoping Review of Personalized User Experiences on Social Media: The Interplay between Algorithms and Human Factors. Comput. Hum. Behav. Rep. 2023, 9, 100253. [Google Scholar] [CrossRef]
- Bugajev, A.; Kriauzienė, R.; Chadyšas, V. Realistic Data Delays and Alternative Inactivity Definitions in Telecom Churn: Investigating Concept Drift Using a Sliding-Window Approach. Appl. Sci. 2025, 15, 1599. [Google Scholar] [CrossRef]
- Oliveira, J.M.; Ramos, P. Evaluating the Effectiveness of Time Series Transformers for Demand Forecasting in Retail. Mathematics 2024, 12, 2728. [Google Scholar] [CrossRef]
- Lin, X.; Chang, L.; Nie, X.; Dong, F. Temporal Attention for Few-Shot Concept Drift Detection in Streaming Data. Electronics 2024, 13, 2183. [Google Scholar] [CrossRef]
- Cao, Z.; Li, Y.; Kim, D.-H.; Shin, B.-S. Deep Neural Network Confidence Calibration from Stochastic Weight Averaging. Electronics 2024, 13, 503. [Google Scholar] [CrossRef]
- Kogan, S.; Moskowitz, T.J.; Niessner, M. Social Media and Financial News Manipulation. Rev. Financ. 2023, 27, 1229–1268. [Google Scholar] [CrossRef]
- Fernandez Vilas, A.; Diaz Redondo, R.P.; Lorenzo Garcia, A. The Irruption of Cryptocurrencies Into Twitter Cashtags: A Classifying Solution. IEEE Access 2020, 8, 32698–32713. [Google Scholar] [CrossRef]
- Hewamalage, H.; Ackermann, K.; Bergmeir, C. Forecast Evaluation for Data Scientists: Common Pitfalls and Best Practices. Data Min. Knowl. Discov. 2023, 37, 788–832. [Google Scholar] [CrossRef]
- Incorvaia, G.; Hond, D.; Asgari, H. Uncertainty Quantification of Machine Learning Model Performance via Anomaly-Based Dataset Dissimilarity Measures. Electronics 2024, 13, 939. [Google Scholar] [CrossRef]
- Cresci, S. A Decade of Social Bot Detection. Commun. ACM 2020, 63, 72–83. [Google Scholar] [CrossRef]
- Breuer, J.; Kmetty, Z.; Haim, M.; Stier, S. User-Centric Approaches for Collecting Facebook Data in the ‘Post-API Age’: Experiences from Two Studies and Recommendations for Future Research. Inf. Commun. Soc. 2023, 26, 2649–2668. [Google Scholar] [CrossRef]
- Gebru, T.; Morgenstern, J.; Vecchione, B.; Vaughan, J.W.; Wallach, H.; Iii, H.D.; Crawford, K. Datasheets for Datasets. Commun. ACM 2021, 64, 86–92. [Google Scholar] [CrossRef]
- Gupta, G.; Raja, K.; Gupta, M.; Jan, T.; Whiteside, S.T.; Prasad, M. A Comprehensive Review of DeepFake Detection Using Advanced Machine Learning and Fusion Methods. Electronics 2023, 13, 95. [Google Scholar] [CrossRef]
- Theodorakopoulos, L.; Theodoropoulou, A.; Klavdianos, C. Big Data Analytics and AI for Consumer Behavior in Digital Marketing: Applications, Synthetic and Dark Data, and Future Directions. Big Data Cogn. Comput. 2026, 10, 46. [Google Scholar] [CrossRef]
- Ghiurău, D.; Popescu, D.E. Distinguishing Reality from AI: Approaches for Detecting Synthetic Content. Computers 2024, 14, 1. [Google Scholar] [CrossRef]
- Liu, X.; Li, Y.; Li, K. Enhancing the Robustness of AI-Generated Text Detectors: A Survey. Mathematics 2025, 13, 2145. [Google Scholar] [CrossRef]
- Gillespie, T. Do Not Recommend? Reduction as a Form of Content Moderation. Soc. Media Soc. 2022, 8, 20563051221117552. [Google Scholar] [CrossRef]
- Theodorakopoulos, L.; Theodoropoulou, A.; Klavdianos, C. Interactive Viral Marketing Through Big Data Analytics, Influencer Networks, AI Integration, and Ethical Dimensions. J. Theor. Appl. Electron. Commer. Res. 2025, 20, 115. [Google Scholar] [CrossRef]
- Rieder, B.; Hofmann, J. Towards Platform Observability. Internet Policy Rev. 2020, 9, 1–28. [Google Scholar] [CrossRef]
- Ohme, J.; Araujo, T.; Boeschoten, L.; Freelon, D.; Ram, N.; Reeves, B.B.; Robinson, T.N. Digital Trace Data Collection for Social Media Effects Research: APIs, Data Donation, and (Screen) Tracking. Commun. Methods Meas. 2024, 18, 124–141. [Google Scholar] [CrossRef]
- Haimson, O.L.; Delmonaco, D.; Nie, P.; Wegner, A. Disproportionate Removals and Differing Content Moderation Experiences for Conservative, Transgender, and Black Social Media Users: Marginalization and Moderation Gray Areas. Proc. ACM Hum.-Comput. Interact. 2021, 5, 1–35. [Google Scholar] [CrossRef]
- Lee, H.-C.; Lee, S.-W. Provenance-Based Trust-Aware Requirements Engineering Framework for Self-Adaptive Systems. Sensors 2023, 23, 4622. [Google Scholar] [CrossRef]
- Stieglitz, S.; Mirbabaie, M.; Ross, B.; Neuberger, C. Social Media Analytics—Challenges in Topic Discovery, Data Collection, and Data Preparation. Int. J. Inf. Manag. 2018, 39, 156–168. [Google Scholar] [CrossRef]
- Golland, L.; Watteler, O.; Recker, J.; Schwalbach, J.; Bishop, L. From (Almost) Open to Heavily Restricted Data Access—The Development of the Twitter/X Developer Policies. Big Data Soc. 2026, 13, 20539517261419333. [Google Scholar] [CrossRef]
- Kim, Y.; Nordgren, R.; Emery, S. The Story of Goldilocks and Three Twitter’s APIs: A Pilot Study on Twitter Data Sources and Disclosure. Int. J. Environ. Res. Public Health 2020, 17, 864. [Google Scholar] [CrossRef] [PubMed]
- Belák, V.; Mashhadi, A.; Sala, A.; Morrison, D. Phantom Cascades: The Effect of Hidden Nodes on Information Diffusion. Comput. Commun. 2016, 73, 12–21. [Google Scholar] [CrossRef]
- Huang, F.; Zhang, M.; Li, Y. A Comparison Study of Tie Non-Response Treatments in Social Networks Analysis. Front. Psychol. 2019, 9, 2766. [Google Scholar] [CrossRef]
- Salehzadeh-Yazdi, A.; Hütt, M.-T. Assessing the Impact of Sampling Bias on Node Centralities in Synthetic and Biological Networks. Npj Syst. Biol. Appl. 2025, 11, 47. [Google Scholar] [CrossRef]
- Che, Z.; Purushotham, S.; Cho, K.; Sontag, D.; Liu, Y. Recurrent Neural Networks for Multivariate Time Series with Missing Values. Sci. Rep. 2018, 8, 6085. [Google Scholar] [CrossRef]
- Ehrig, M.; Bullock, G.S.; Leng, X.I.; Pajewski, N.M.; Speiser, J.L. Imputation and Missing Indicators for Handling Missing Longitudinal Data: Data Simulation Analysis Based on Electronic Health Record Data. JMIR Med. Inform. 2025, 13, e64354. [Google Scholar] [CrossRef]
- Yang, C.; Liang, Z.; Liu, T.; Hu, Z.; Yan, D. MGMR-Net: Mamba-Guided Multimodal Reconstruction and Fusion Network for Sentiment Analysis with Incomplete Modalities. Electronics 2025, 14, 3088. [Google Scholar] [CrossRef]
- Grzenda, M.; Gomes, H.M.; Bifet, A. Delayed Labelling Evaluation for Data Streams. Data Min. Knowl. Discov. 2020, 34, 1237–1266. [Google Scholar] [CrossRef]
- Botacin, M.; Gomes, H. Towards More Realistic Evaluations: The Impact of Label Delays in Malware Detection Pipelines. Comput. Secur. 2025, 148, 104122. [Google Scholar] [CrossRef]
- Gabrovšek, P.; Aleksovski, D.; Mozetič, I.; Grčar, M. Twitter Sentiment around the Earnings Announcement Events. PLoS ONE 2017, 12, e0173151. [Google Scholar] [CrossRef]
- Hanczar, B. Performance Visualization Spaces for Classification with Rejection Option. Pattern Recognit. 2019, 96, 106984. [Google Scholar] [CrossRef]
- Zhou, X.; Chen, B.; Gui, Y.; Cheng, L. Conformal Prediction: A Data Perspective. ACM Comput. Surv. 2026, 58, 1–37. [Google Scholar] [CrossRef]
- Cheng, J.; Tian, J.; Spoto, F.; Azhir, A.; Mork, D.; Estiri, H. Signal Fidelity Index-Aware Calibration for Addressing Distributional Shift in Predictive Modeling across Heterogeneous Real-World Data. Sci. Rep. 2025, 16, 2807. [Google Scholar] [CrossRef]
- Foltyn, A.; Deuschel, J. Towards Reliable Multimodal Stress Detection under Distribution Shift. In Proceedings of the Companion Publication of the 2021 International Conference on Multimodal Interaction, Montreal, QC, Canada, 18–22 October 2021; ACM: New York, NY, USA, 2021; pp. 329–333. [Google Scholar]
- Ramezani, M.; Ahadinia, A.; Ziaei Bideh, A.; Rabiee, H.R. Joint Inference of Diffusion and Structure in Partially Observed Social Networks Using Coupled Matrix Factorization. ACM Trans. Knowl. Discov. Data 2023, 17, 1–28. [Google Scholar] [CrossRef]
- Shvydun, S. Centrality in Complex Networks under Incomplete Data. PLoS Complex Syst. 2025, 2, e0000042. [Google Scholar] [CrossRef]
- Mora-Cantallops, M.; Sánchez-Alonso, S.; García-Barriocanal, E.; Sicilia, M.-A. Traceability for Trustworthy AI: A Review of Models and Tools. Big Data Cogn. Comput. 2021, 5, 20. [Google Scholar] [CrossRef]
- Gultekin, E.; Aktas, M.S. A Novel End-to-End Provenance System for Predictive Maintenance: A Case Study for Industrial Machinery Predictive Maintenance. Computers 2024, 13, 325. [Google Scholar] [CrossRef]
- Jain, S.; Wallace, B.C. Attention is not Explanation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies; Association for Computational Linguistics: Minneapolis, MN, USA, 2019; pp. 3543–3556. [Google Scholar]
- Slack, D.; Hilgard, S.; Jia, E.; Singh, S.; Lakkaraju, H. Fooling LIME and SHAP: Adversarial Attacks on Post Hoc Explanation Methods. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, New York, NY, USA, 7–9 February 2020; ACM: New York, NY, USA, 2020; pp. 180–186. [Google Scholar]
- Li, Y.; Zhou, J.; Zheng, B.; Shafiabady, N.; Chen, F. Reliable and Faithful Generative Explainers for Graph Neural Networks. Mach. Learn. Knowl. Extr. 2024, 6, 2913–2929. [Google Scholar] [CrossRef]
- Seranmadevi, R.; Addula, S.R.; Kumar, D.; Tyagi, A.K. Security and Privacy in AI: IoT-Enabled Banking and Finance Services. In Monetary Dynamics and Socio-Economic Development in Emerging Economies; IGI Global Scientific Publishing: Hershey, PA, USA, 2026; pp. 163–194. [Google Scholar] [CrossRef]
- Studer, S.; Bui, T.B.; Drescher, C.; Hanuschkin, A.; Winkler, L.; Peters, S.; Müller, K.-R. Towards CRISP-ML(Q): A Machine Learning Process Model with Quality Assurance Methodology. Mach. Learn. Knowl. Extr. 2021, 3, 392–413. [Google Scholar] [CrossRef]
- Zhao, X.; Ma, Z.G.; Jørgensen, B.N. An End-to-End Data and Machine Learning Pipeline for Energy Forecasting: A Systematic Approach Integrating MLOps and Domain Expertise. Information 2025, 16, 805. [Google Scholar] [CrossRef]
- Theodorakopoulos, L.; Theodoropoulou, A.; Kampiotis, G.; Kalliampakou, I. NeuralACT: Accounting Analytics Using Neural Network for Real-Time Decision Making from Big Data. IEEE Access 2025, 13, 8621–8637. [Google Scholar] [CrossRef]
- Petre, C.; Duffy, B.E.; Hund, E. “Gaming the System”: Platform Paternalism and the Politics of Algorithmic Visibility. Soc. Media Soc. 2019, 5, 2056305119879995. [Google Scholar] [CrossRef]
- Erlandsson, F.; Bródka, P.; Boldt, M.; Johnson, H. Do We Really Need to Catch Them All? A New User-Guided Social Media Crawling Method. Entropy 2017, 19, 686. [Google Scholar] [CrossRef]
- An, W.; Beauvile, R.; Rosche, B. Causal Network Analysis. Annu. Rev. Sociol. 2022, 48, 23–41. [Google Scholar] [CrossRef]
- Yuan, Y.; Pang, N.; Zhang, Y.; Liu, K. Which Cascade Is More Decisive in Rumor Detection on Social Media: Based on Comparison between Repost and Reply Sequences. Knowl.-Based Syst. 2023, 278, 110857. [Google Scholar] [CrossRef]
- Wang, M.; Fan, S.; Li, Y.; Gao, B.; Xie, Z.; Chen, H. Robust Multi-Modal Fusion Architecture for Medical Data with Knowledge Distillation. Comput. Methods Programs Biomed. 2025, 260, 108568. [Google Scholar] [CrossRef]
- Komorniczak, J.; Ksieniewicz, P.; Zyblewski, P. Structuring the Processing Frameworks for Data Stream Evaluation and Application. Pattern Recognit. 2026, 172, 112516. [Google Scholar] [CrossRef]
- Yang, Y.; Kuchibhotla, A.K.; Tchetgen Tchetgen, E. Doubly Robust Calibration of Prediction Sets under Covariate Shift. J. R. Stat. Soc. Ser. B Stat. Methodol. 2024, 86, 943–965. [Google Scholar] [CrossRef] [PubMed]
- Pham, T.; Kottke, D.; Krempl, G.; Sick, B. Stream-Based Active Learning for Sliding Windows under the Influence of Verification Latency. Mach. Learn. 2022, 111, 2011–2036. [Google Scholar] [CrossRef]
- Arratia, A.; El Daou, M.; Kagerhuber, J.; Smolyarova, Y. Examining Challenges in Implied Volatility Forecasting: A Critical Review of Data Leakage and Feature Engineering Combined with High-Complexity Models. Comput. Econ 2025. [Google Scholar] [CrossRef]
- Zhang, Y.; Zhu, Y.; Linnainmaa, J.T. Man versus Machine Learning Revisited. Rev. Financ. Stud. 2025, 38, 3768–3790. [Google Scholar] [CrossRef]
- Ayala, M.J.; Gonzálvez-Gallego, N.; Arteaga-Sánchez, R. Google Search Volume Index and Investor Attention in Stock Market: A Systematic Review. Financ. Innov. 2024, 10, 70. [Google Scholar] [CrossRef]
- Derczynski, L.; Nichols, E.; Van Erp, M.; Limsopatham, N. Results of the WNUT2017 Shared Task on Novel and Emerging Entity Recognition. In Proceedings of the 3rd Workshop on Noisy User-Generated Text, Copenhagen, Denmark, 7 September 2017; Association for Computational Linguistics: Stroudsburg, PA, USA, 2017; pp. 140–147. [Google Scholar]
- Ganea, O.-E.; Hofmann, T. Deep Joint Entity Disambiguation with Local Neural Attention. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, 9–11 September 2017; Association for Computational Linguistics: Stroudsburg, PA, USA, 2017; pp. 2619–2629. [Google Scholar]
- Cinus, F.; Minici, M.; Luceri, L.; Ferrara, E. Exposing Cross-Platform Coordinated Inauthentic Activity in the Run-Up to the 2024 U.S. Election. In Proceedings of the ACM on Web Conference 2025, Sydney, NSW, Australia, 28 April–2 May 2025; ACM: New York, NY, USA, 2025; pp. 541–559. [Google Scholar]




| Model Block | Best-Suited Evidence | Typical Validity Risks | Feasibility Constraints |
|---|---|---|---|
| Text encoders | Short posts, threads, instrument-anchored mentions | Reaction content in the evidence window; post edits/deletions not handled; ambiguous mentions forced to an asset | Per-message inference latency and memory footprint under volume spikes; calibration under shift |
| Temporal modules | Session-level sequences, irregular bursts, rolling windows | Session misalignment; horizon overlap; observation delay ignored | Window/state maintenance sensitivity; retraining cadence and monitoring overhead |
| Graph encoders | Interaction graphs, co-mentions, user–asset links | Graph construction crossing decision cutoffs; edge semantics ambiguity; visibility-driven missing edges | Graph construction and update cost at scale; sampling/sparsification can alter behavior |
| Fusion mechanisms | Text + interaction + market features; optional media/links | Cross-modal timing mismatch; missing-modality artifacts; dominance of manipulable modalities | Cross-modal synchronization and buffering overhead; robustness to missing or delayed evidence required |
| Lightweight tabular heads | Aggregated indices, engineered indicators | Leakage via post hoc aggregates; target leakage through market-derived labels | Low latency; easier monitoring but limited expressiveness |
| Retrieval-augmented analysis | Threads plus external artifacts and links | Retrieval includes future material; freshness/provenance not enforced | Retrieval and external lookup latency; audit logging/versioning needed for reproducibility |
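
Several of the validity risks in the table above (graph construction crossing decision cutoffs, retrieval that includes future material, observation delay ignored) reduce to the same check: was the evidence both created and actually collected before the decision cutoff? The following is a minimal sketch of that check for graph construction, not a method taken from the reviewed studies. The `Interaction` record, its `created_at` and `observed_at` fields, and the function name `build_graph_as_of` are hypothetical names introduced only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Interaction:
    """Hypothetical record for one observed interaction (reply, repost, mention)."""
    source: str
    target: str
    created_at: datetime   # event time reported by the platform
    observed_at: datetime  # time the collector actually retrieved the record


def build_graph_as_of(interactions, cutoff):
    """Build weighted adjacency from evidence knowable at `cutoff`.

    An edge is kept only if the interaction was both created and observed
    no later than the cutoff, so late-arriving records cannot leak in.
    """
    adjacency = defaultdict(lambda: defaultdict(int))
    for item in interactions:
        if item.created_at <= cutoff and item.observed_at <= cutoff:
            adjacency[item.source][item.target] += 1
    # Convert to plain dicts so the result is easy to log and compare.
    return {src: dict(targets) for src, targets in adjacency.items()}
```

Filtering on `observed_at` as well as `created_at` is the design choice that matters here: a record created before the cutoff but collected afterwards was not available at decision time, so including it would overstate what the graph encoder could have known.
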

| Signal Source | What It Adds | Main Risks | Minimum Safeguards to Report |
|---|---|---|---|
| Social text + market features | Context on volatility and trend; sometimes improves stability | Market channel dominates; social contribution becomes unclear | Incremental value test (market-only vs. market + social) under the same time-forward split |
| Visual content (images, video frames) | Non-text cues (charts, screenshots, community-coded signals) | Fabrication, weak attribution, small labeled sets; models learn style and attention cues | Provenance assumptions; missing-visual handling; error analysis on manipulated or low-quality visuals |
| Links and referenced content | External context; topic anchoring via sources | Pages change or disappear; timing often reaction-driven; link-sharing bias | Time cutoffs for link availability; retention/archiving policy; robustness to dead or modified links |
| Cross-platform aggregation | Broader coverage; reduces single-platform dependence | Alignment errors, identity mismatch, vocabulary drift; conflation of unrelated threads | Alignment method (topic/entity/URL); propagation delay handling; tests on platform-shift periods and missing-platform conditions |
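
The incremental value test named in the first row of the table above (market-only vs. market + social under the same time-forward split) can be sketched as follows. This is an illustrative sketch under simplifying assumptions: one market feature, one social feature, an expanding training window, and closed-form least squares with non-degenerate inputs. The function names and default parameters are hypothetical and are not drawn from any reviewed study.

```python
from statistics import mean


def time_forward_splits(n_obs, n_folds, min_train):
    """Yield (train, test) index ranges; every test index follows every train index."""
    fold = (n_obs - min_train) // n_folds
    if min_train < 2 or fold < 1:
        raise ValueError("not enough observations for the requested folds")
    for k in range(n_folds):
        train_end = min_train + k * fold
        test_end = n_obs if k == n_folds - 1 else train_end + fold
        yield range(train_end), range(train_end, test_end)


def fit_ols(y, x1, x2=None):
    """Least squares with intercept on one or two predictors.

    Assumes the predictors vary and are not collinear in the training window.
    Returns (intercept, coef_x1, coef_x2); coef_x2 is 0.0 when x2 is omitted.
    """
    my, m1 = mean(y), mean(x1)
    s11 = sum((a - m1) ** 2 for a in x1)
    s1y = sum((a - m1) * (b - my) for a, b in zip(x1, y))
    if x2 is None:
        b1 = s1y / s11
        return my - b1 * m1, b1, 0.0
    m2 = mean(x2)
    s22 = sum((a - m2) ** 2 for a in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s2y = sum((a - m2) * (b - my) for a, b in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return my - b1 * m1 - b2 * m2, b1, b2


def incremental_value(y, market, social, n_folds=2, min_train=20):
    """Out-of-sample MSE of market-only vs. market + social on identical splits."""
    err_market, err_both = [], []
    for train, test in time_forward_splits(len(y), n_folds, min_train):
        y_tr = [y[i] for i in train]
        m_tr = [market[i] for i in train]
        s_tr = [social[i] for i in train]
        a0, b0, _ = fit_ols(y_tr, m_tr)
        a1, b1, c1 = fit_ols(y_tr, m_tr, s_tr)
        for i in test:
            err_market.append((y[i] - a0 - b0 * market[i]) ** 2)
            err_both.append((y[i] - a1 - b1 * market[i] - c1 * social[i]) ** 2)
    return {
        "mse_market_only": mean(err_market),
        "mse_market_social": mean(err_both),
    }
```

Both models are fitted and scored on exactly the same train and test indices, so any difference in error can be attributed to the added social channel rather than to a different split. In practice the linear fit would be replaced by whatever model family is under study; the split discipline is the part that carries over.
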

| Task Setting | When Masking/Missing-Indicator Handling Is Preferable | When Limited Imputation Is More Defensible | Main Risk If Mishandled |
|---|---|---|---|
| Sentiment-to-signal aggregation | When missing posts, counters, or linked artifacts reflect moderation, rate limits, platform outages, event-time overload, or selective visibility, because absence may itself indicate reduced observability or evidence quality | When the missing field is auxiliary and low-level, such as short gaps in dense numeric covariates, and when an explicit missingness indicator is retained | Imputation can convert collection artifacts into artificial stability and distort the meaning of the sentiment index |
| Surveillance/manipulation detection | Usually preferable, because structured absence may itself be part of the signal, and synthetic completion can hide suspicious propagation patterns or moderation effects | Only in narrow cases involving peripheral numeric fields that do not alter cascade structure or behavioral interpretation | Imputation can fabricate coordination motifs, suppress anomaly cues, or increase false confidence in threat assessment |
| Relational/graph-based analysis | Preferable when missing edges or users arise from rate limiting, visibility restrictions, or removals, since partial observability affects centrality and cascade structure in non-uniform ways | Rarely appropriate, except for carefully bounded structural assumptions that are disclosed and stress-tested | Filled-in edges or interactions can create graph structures that were never observed and can mislead diffusion or centrality analysis |
| Multimodal fusion | Preferable when modality absence is informative, irregular, or platform-driven, and when the system must preserve uncertainty about what was actually observed | More defensible when one optional modality is frequently absent in a stable pattern and the model is explicitly trained and evaluated under that condition | Default filling can make the model appear robust while actually relying on unrealistic completion patterns |
| Short-horizon operational decision support | Preferable when absence is tied to timing, collection delay, or evidence freshness constraints, because missingness affects what was knowable at decision time | Limited imputation may be acceptable for slow-moving auxiliary features whose absence does not change time validity | Imputation can blur the boundary between unavailable evidence and available evidence, leading to unrealistic operational evaluation |
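
The contrast between masking and default imputation in the table above can be made concrete with a small sketch. It assumes a hypothetical window of per-post sentiment scores in which `None` marks a post that could not be observed (for example, removed or rate-limited); the function names are illustrative. The masked version averages only what was seen and reports coverage as an explicit indicator, while the zero-filled version silently treats unobserved posts as neutral.

```python
def masked_mean(values):
    """Average over observed scores only, and report how much was observed."""
    observed = [v for v in values if v is not None]
    coverage = len(observed) / len(values) if values else 0.0
    mean = sum(observed) / len(observed) if observed else None
    return {"mean": mean, "coverage": coverage, "n_observed": len(observed)}

def zero_filled_mean(values):
    """Naive baseline for contrast: missing scores are imputed as 0.0 (neutral)."""
    filled = [0.0 if v is None else v for v in values]
    return sum(filled) / len(filled) if filled else None

# Hypothetical window: two observed scores, two unobservable posts.
window = [0.5, None, 0.25, None]
masked = masked_mean(window)      # keeps the missingness visible via "coverage"
naive = zero_filled_mean(window)  # pulls the index toward zero without saying why

# A fully unobserved window yields no index at all rather than a fake neutral one.
empty = masked_mean([None, None])
```

Passing `coverage` downstream as a feature or a reliability weight is one way to let a model degrade predictably when evidence is thin, instead of reading a damped index as genuine calm.
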

| What to Report | Why It Matters |
|---|---|
| Evidence source and access limits (API tier, sampling method, rate limits) | Observed data is a visibility-filtered sample; access changes can alter results without any model change |
| Evidence freshness policy (cutoffs, delays tolerated, staleness handling) | Prevents post-cutoff reaction content from entering features and inflating performance |
| Observation delay and backlog behavior under spikes | Delays are largest when attention spikes; retrospective studies often assume immediate availability |
| Deletion and edit handling (retention, re-fetching, snapshot policy) | Later edits and removals mean that retrospectively collected data can differ from what was actually observable at decision time |
| Entity linking method and confidence handling (abstention or fallback) | Ambiguous mentions can systematically corrupt instrument-level indices and graphs |
| Window definitions (evidence window, decision cutoff, label horizon; session convention) | Avoids leakage and cross-session mixing; enables apples-to-apples comparison of results |
| Missing-modality policy (masking, imputation, reliability weights) | Missingness is structured and can become a predictor; models must degrade predictably |
| Controls for attention confounds (volume/engagement baselines, conditioning by activity level) | Prevents social indicators from acting as proxies for volatility or news intensity |
| Calibration and thresholding policy (abstention, uncertainty, alert-rate control) | Decision costs depend on confidence; miscalibration under drift drives unstable actions and alerts |
| Stress tests (partial outages, modality dropout, manipulation templates, shifted periods) | Measures robustness under plausible failure conditions rather than only average-case accuracy |
| Traceability logs (evidence IDs, timestamps, preprocessing versions, mapping rules) | Enables reconstruction, debugging, and governance; supports accountable decision use |
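
The window definitions and freshness policy listed above can be sketched as a filter that admits a post only if it was both created inside the evidence window and already collected by the decision cutoff. The field names `created_at` and `observed_at`, the two-hour window, and the one-day label horizon are assumptions made for this example, not conventions taken from the reviewed studies.

```python
from datetime import datetime, timedelta

def select_evidence(posts, decision_cutoff, evidence_window):
    """Keep posts created within the window and observed no later than the cutoff."""
    start = decision_cutoff - evidence_window
    kept = []
    for post in posts:
        created_in_window = start <= post["created_at"] <= decision_cutoff
        observed_in_time = post["observed_at"] <= decision_cutoff
        if created_in_window and observed_in_time:
            kept.append(post)
    return kept

def label_interval(decision_cutoff, label_horizon):
    """The label horizon starts at the cutoff, so it never overlaps the evidence."""
    return decision_cutoff, decision_cutoff + label_horizon

cutoff = datetime(2024, 3, 1, 15, 30)
posts = [
    # Created and collected before the cutoff: admissible.
    {"id": "a", "created_at": datetime(2024, 3, 1, 14, 0),
     "observed_at": datetime(2024, 3, 1, 14, 5)},
    # Created in time but collected late (backlog): not knowable at the cutoff.
    {"id": "b", "created_at": datetime(2024, 3, 1, 15, 20),
     "observed_at": datetime(2024, 3, 1, 15, 45)},
    # Created after the cutoff: reaction content.
    {"id": "c", "created_at": datetime(2024, 3, 1, 15, 40),
     "observed_at": datetime(2024, 3, 1, 15, 41)},
    # Created before the evidence window opens: stale.
    {"id": "d", "created_at": datetime(2024, 3, 1, 9, 0),
     "observed_at": datetime(2024, 3, 1, 9, 1)},
]

kept_ids = [p["id"] for p in select_evidence(posts, cutoff, timedelta(hours=2))]
label_start, label_end = label_interval(cutoff, timedelta(days=1))
```

Logging `kept_ids` together with the cutoff and window settings gives a simple form of the traceability record described in the last row of the table.
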

| Task Family | Typical Evidence Used | Common Model Families | Common Evaluation Target/Horizon | What Tends to Hold Up Across Studies | Why Raw Metrics Are Hard to Compare |
|---|---|---|---|---|---|
| Sentiment-based short-horizon forecasting | Post text, sentiment scores, cashtags, basic engagement indicators | Contextual text encoders, FinBERT-style models, lightweight sequence or tabular prediction heads | Intraday, same-day, post-market, next-session direction or volatility-sensitive response | Gains are most defensible when entity attribution is explicit, evidence windows end before the decision cutoff, and sentiment is evaluated beyond simple attention baselines | Same-day versus next-day targets, market-close conventions, reaction contamination, and session alignment differ substantially across studies |
| Risk monitoring and volatility-sensitive inference | Aggregated sentiment indices, activity bursts, narrative intensity, news-linked or cross-source signals | Temporal models, hybrid text-plus-market pipelines, multimodal fusion architectures | Volatility response, stress-state classification, risk monitoring around event windows, short-horizon deterioration signals | Social evidence is often most useful during high-attention or uncertainty periods, especially when it is evaluated incrementally over market-only controls | Volatility definitions, event windows, baseline controls, and horizon length vary enough to make pooled metrics unstable |
| Coordination, manipulation, and surveillance tasks | Interaction graphs, repost cascades, user–asset links, repeated phrasing, behavioral metadata | Graph encoders, dynamic graph models, graph-plus-text hybrids, anomaly and coordination detectors | Suspicious coordination, diffusion anomalies, influence estimation, surveillance flags, manipulation-related detection | Relational signals are more defensible for monitoring structure, propagation, and coordinated behavior than for claiming direct causal prediction of market outcomes | Graph construction rules, edge semantics, visibility gaps, and labeling practices differ sharply across studies |
| Multimodal or cross-platform decision support | Text, interaction traces, linked artifacts, optional market covariates, sometimes images or external documents | Early, late, or intermediate fusion; retrieval-supported pipelines; multimodal transformers | Event interpretation, short-horizon support, robustness under partial evidence, decision support under heterogeneous observability | Improvements are more credible when ablations, missing-modality tests, and reliability controls are explicit | Platform access, identity linkage quality, modality availability, and observability assumptions are rarely aligned across papers |
| Transfer, drift, and deployment-oriented evaluation | Time-separated social evidence, repeated evaluation windows, delayed or incomplete evidence, platform-change periods | Drift-aware pipelines, recalibrated classifiers, uncertainty-aware models, monitoring-oriented evaluation designs | Time-forward generalization, breakage under shift, robustness under delay or missingness | Stable findings are more likely when studies report where performance degrades, how evidence properties changed, and whether calibration survives shift | Adjacent-period splits, retraining frequency, platform evolution, and observability changes make cross-study performance numbers especially fragile |
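
For the last row of the table above, one way to check whether calibration survives shift is to compute a calibration error separately for each evaluation period. The sketch below uses a binary reliability form of expected calibration error (mean predicted probability of the positive class versus observed positive rate, per bin); the equal-width binning, the period names, and the toy records are assumptions for illustration only.

```python
from collections import defaultdict

def expected_calibration_error(probs, labels, n_bins=5):
    """Weighted mean gap between predicted probability and observed positive rate."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        positive_rate = sum(y for _, y in members) / len(members)
        ece += (len(members) / total) * abs(mean_prob - positive_rate)
    return ece

def ece_by_period(records, n_bins=5):
    """records: iterable of (period, predicted_probability, label) tuples."""
    grouped = defaultdict(list)
    for period, prob, label in records:
        grouped[period].append((prob, label))
    return {
        period: expected_calibration_error(
            [p for p, _ in pairs], [y for _, y in pairs], n_bins)
        for period, pairs in grouped.items()
    }

# Toy records (fabricated): the same confidence level, different hit rates.
records = (
    [("early", 0.75, y) for y in (1, 1, 1, 0)] +
    [("late", 0.75, y) for y in (1, 0, 0, 0)]
)
by_period = ece_by_period(records)
```

In this toy case the model stays equally confident in both periods while its hit rate falls, which is the pattern a single pooled accuracy figure would hide.
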

| Failure Mode | How It Appears in Results | Minimal Mitigation |
|---|---|---|
| Visibility and sampling bias | Strong performance that does not transfer when access tier, ranking, or moderation changes | Report access limits; test across non-adjacent periods; run sensitivity analysis on sampling rules |
| Temporal overlap leakage | Unusually high short-horizon accuracy; sharp drop under time-forward evaluation | Enforce strict decision cutoffs; separate evidence and label horizons; document window definitions |
| Entity ambiguity and misattribution | Noisy or inconsistent instrument-level signals; unstable cross-asset results | Confidence-aware entity linking; abstain on ambiguous mentions; evaluate attribution error explicitly |
| Engagement and attention confounding | Models succeed mainly during high-volatility episodes; weak incremental value beyond volume proxies | Include attention baselines; condition results by activity level; control for event intensity |
| Drift in language and participation | Degradation over time; brittle behavior on new assets and emerging narratives | Time-forward testing on later periods; recalibration; monitor feature shift and error by period |
| Missing evidence and observation delay | Silent failures or alert flooding under spikes; calibration collapse under sparse evidence | Missingness stress tests; delay-aware evaluation; conservative fallback and abstention rules |
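
The mitigation for entity ambiguity in the table above (confidence-aware linking with abstention) can be sketched as a rule that links a mention only when the best candidate is both confident and clearly ahead of the runner-up. The candidate scores, the placeholder instrument names, and the two thresholds are hypothetical; a real system would take its scores from whatever linker it uses and tune the thresholds on labeled attribution errors.

```python
def link_entity(candidates, min_conf=0.8, min_margin=0.2):
    """Return the best-scoring instrument, or None to abstain.

    candidates: dict mapping instrument id -> linking confidence in [0, 1].
    Abstains when there are no candidates, when the top score is below
    min_conf, or when the lead over the second candidate is below min_margin.
    """
    if not candidates:
        return None
    ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
    best_id, best_score = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    if best_score < min_conf or (best_score - runner_up) < min_margin:
        return None
    return best_id

def abstention_rate(mentions):
    """Share of mentions left unlinked; worth reporting alongside results."""
    decisions = [link_entity(m) for m in mentions]
    return sum(1 for d in decisions if d is None) / len(decisions)

# Hypothetical mentions with placeholder instrument ids.
mentions = [
    {"INSTR_A": 0.95, "INSTR_B": 0.10},  # clear winner: linked
    {"INSTR_C": 0.55, "INSTR_D": 0.50},  # low confidence: abstain
    {"INSTR_E": 0.90, "INSTR_F": 0.85},  # confident but ambiguous: abstain
    {},                                  # nothing to link: abstain
]
linked = [link_entity(m) for m in mentions]
rate = abstention_rate(mentions)
```

Abstained mentions are dropped from instrument-level indices rather than assigned to the nearest guess, which trades coverage for cleaner attribution.
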

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Theodorakopoulos, L.; Theodoropoulou, A. Big Data and Graph Deep Learning for Financial Decision Support from Social Networks: A Critical Review. Electronics 2026, 15, 1405. https://doi.org/10.3390/electronics15071405
