Journal Description
Big Data and Cognitive Computing is an international, peer-reviewed, open access journal on big data and cognitive computing, published monthly online by MDPI.
- Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
- High Visibility: indexed within Scopus, ESCI (Web of Science), dblp, Inspec, Ei Compendex, and other databases.
- Journal Rank: JCR - Q1 (Computer Science, Theory and Methods) / CiteScore - Q1 (Computer Science Applications)
- Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 24.5 days after submission; acceptance to publication takes 4.6 days (median values for papers published in this journal in the first half of 2025).
- Recognition of Reviewers: reviewers who provide timely, thorough peer-review reports receive vouchers entitling them to a discount on the APC of their next publication in any MDPI journal, in appreciation of the work done.
Impact Factor: 4.4 (2024); 5-Year Impact Factor: 4.2 (2024)
Latest Articles
Integration of Associative Tokens into Thematic Hyperspace: A Method for Determining Semantically Significant Clusters in Dynamic Text Streams
Big Data Cogn. Comput. 2025, 9(8), 197; https://doi.org/10.3390/bdcc9080197 - 25 Jul 2025
Abstract
With the exponential growth of textual data, traditional topic modeling methods based on static analysis demonstrate limited effectiveness in tracking the dynamics of thematic content. This research aims to develop a method for quantifying the dynamics of topics within text corpora using a thematic signal (TS) function that accounts for temporal changes and semantic relationships. The proposed method combines associative tokens with original lexical units to reduce thematic entropy and information noise. Approaches employed include topic modeling (LDA), vector representations of texts (TF-IDF, Word2Vec), and time series analysis. The method was tested on a corpus of news texts (5000 documents). Results demonstrated robust identification of semantically meaningful thematic clusters. An inverse relationship was observed between the level of thematic significance and semantic diversity, confirming a reduction in entropy using the proposed method. This approach allows for quantifying topic dynamics, filtering noise, and determining the optimal number of clusters. Future applications include analyzing multilingual data and integration with neural network models. The method shows potential for monitoring information flows and predicting thematic trends.
Full article
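As a rough, self-contained illustration of the pipeline this abstract outlines (topic modeling over a time-stamped corpus, then tracking per-topic weight as a signal), the sketch below fits LDA with scikit-learn on toy documents and averages topic weights per day. The corpus, day indices, and two-topic setting are invented; the paper's thematic signal function and associative tokens are not reproduced.

```python
# Minimal sketch (not the authors' implementation): fit LDA on a small corpus
# and track a per-topic "thematic signal" as the mean topic weight per day.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = ["stock market rally", "market crash fears", "new vaccine trial",
        "vaccine rollout expands", "market rebounds on earnings"]
days = np.array([0, 0, 1, 1, 2])                 # toy timestamps (day index)

X = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
theta = lda.fit_transform(X)                     # document-topic weights

for day in np.unique(days):
    signal = theta[days == day].mean(axis=0)     # thematic signal for this day
    print(f"day {day}: topic weights {np.round(signal, 3)}")
```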
Open Access Article
Leadership Uniformity in Timeout-Based Quorum Byzantine Fault Tolerance (QBFT) Consensus
by Andreas Polyvios Delladetsimas, Stamatis Papangelou, Elias Iosif and George Giaglis
Big Data Cogn. Comput. 2025, 9(8), 196; https://doi.org/10.3390/bdcc9080196 - 24 Jul 2025
Abstract
This study evaluates leadership uniformity—the degree to which the proposer role is evenly distributed among validator nodes over time—in Quorum-based Byzantine Fault Tolerance (QBFT), a Byzantine Fault-Tolerant (BFT) consensus algorithm used in permissioned blockchain networks. By introducing simulated follower timeouts derived from uniform, normal, lognormal, and Weibull distributions, it models a range of network conditions and latency patterns across nodes. This approach integrates Raft-inspired timeout mechanisms into the QBFT framework, enabling a more detailed analysis of leader selection under different network conditions. Three leader selection strategies are tested: Direct selection of the node with the shortest timeout, and two quorum-based approaches selecting from the top 20% and 30% of nodes with the shortest timeouts. Simulations were conducted over 200 rounds in a 10-node network. Results show that leader selection was most equitable under the Weibull distribution with shape , which captures delay behavior observed in real-world networks. In contrast, the uniform distribution did not consistently yield the most balanced outcomes. The findings also highlight the effectiveness of quorum-based selection: While choosing the node with the lowest timeout ensures responsiveness in each round, it does not guarantee uniform leadership over time. In low-variability distributions, certain nodes may be repeatedly selected by chance, as similar timeout values increase the likelihood of the same nodes appearing among the fastest. Incorporating controlled randomness through quorum-based voting improves rotation consistency and promotes fairer leader distribution, especially under heavy-tailed latency conditions. However, expanding the candidate pool beyond 30% (e.g., to 40% or 50%) introduced vote fragmentation, which complicated quorum formation in small networks and led to consensus failure. Overall, the study demonstrates the potential of timeout-aware, quorum-based leader selection as a more adaptive and equitable alternative to round-robin approaches, and provides a foundation for developing more sophisticated QBFT variants tailored to latency-sensitive networks.
Full article
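The core of the reported experiment is easy to mimic: each round, every node draws a timeout, the leader is chosen from the fastest 20% of nodes, and leadership counts are tallied over 200 rounds in a 10-node network. The sketch below does that with a Weibull timeout distribution; the shape parameter, the random pick from the candidate pool, and the interpretation of the histogram are illustrative assumptions, not the paper's simulator.

```python
# Illustrative simulation of timeout-based, quorum-style leader selection.
import numpy as np

rng = np.random.default_rng(42)
n_nodes, rounds, quorum_frac = 10, 200, 0.2
k = max(1, int(n_nodes * quorum_frac))           # size of the candidate pool
lead_counts = np.zeros(n_nodes, dtype=int)

for _ in range(rounds):
    timeouts = rng.weibull(1.5, size=n_nodes)    # assumed heavy-tailed delays
    candidates = np.argsort(timeouts)[:k]        # fastest 20% of nodes
    leader = rng.choice(candidates)              # controlled randomness
    lead_counts[leader] += 1

# A flatter histogram indicates more uniform leadership over time.
print("leadership counts per node:", lead_counts)
```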

Open Access Article
Discovering the Emotions of Frustration and Confidence During the Application of Cognitive Tests in Mexican University Students
by Marco A. Moreno-Armendáriz, Jesús Mercado-Ríos, José E. Valdez-Rodríguez, Rolando Quintero and Victor H. Ponce-Ponce
Big Data Cogn. Comput. 2025, 9(8), 195; https://doi.org/10.3390/bdcc9080195 - 24 Jul 2025
Abstract
Emotion detection using computer vision has advanced significantly in recent years, achieving remarkable performance that, in some cases, surpasses that of humans. Convolutional neural networks (CNNs) excel in this task by capturing facial features that allow for effective emotion classification. However, most research focuses on basic emotions, such as happiness, anger, or sadness, neglecting more complex emotions, like frustration. People set expectations or goals to meet; if they do not happen, frustration arises, generating reactions such as annoyance, anger, and disappointment, which can harm confidence and motivation. These aspects make it especially relevant in mental health and educational contexts, where detecting it could help mitigate its adverse effects. In this research, we developed a CNN-based approach to detect frustration through facial expressions. The scarcity of specific datasets for this task led us to create an experimental protocol to generate our dataset. This classification task presents a high degree of difficulty due to the variability in facial expressions among different participants when feeling frustrated. Despite this, our new model achieved an F1-score of , thus obtaining an adequate baseline model.
Full article
(This article belongs to the Special Issue Application of Deep Neural Networks)
Open Access Article
Large Language Model-Based Topic-Level Sentiment Analysis for E-Grocery Consumer Reviews
by Julizar Isya Pandu Wangsa, Yudhistira Jinawi Agung, Safira Raissa Rahmi, Hendri Murfi, Nora Hariadi, Siti Nurrohmah, Yudi Satria and Choiru Za’in
Big Data Cogn. Comput. 2025, 9(8), 194; https://doi.org/10.3390/bdcc9080194 - 23 Jul 2025
Abstract
Customer sentiment analysis plays a pivotal role in the digital economy by offering comprehensive insights that inform strategic business decisions, optimize digital marketing initiatives, and improve overall customer satisfaction. We propose a large language model-based topic-level sentiment analysis framework. We employ a BERT-based model to generate contextualized vector representations of the documents, and then clustering algorithms are automatically applied to group documents into topics. Once the topics are formed, a GPT model is used to perform sentiment classification on the content related to each topic. The simulations show the effectiveness of this approach, where selecting appropriate clustering techniques yields more semantically coherent topics. Furthermore, topic-level sentiment polarization shows that 31.7% of all negative sentiment concentrates on the shopping experience, despite an overall positive sentiment trend.
Full article
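A minimal sketch of the embed-cluster-classify shape of this framework is given below. The sentence-transformers encoder, the k-means step, and the keyword-rule stand-in for the GPT sentiment classifier are all illustrative assumptions; the paper's actual models and prompts are not shown.

```python
# Sketch of the pipeline shape: embed reviews -> cluster into topics ->
# classify sentiment per topic (here via a trivial placeholder rule).
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

reviews = ["delivery was late again", "great prices and fresh produce",
           "app keeps crashing at checkout", "driver was very friendly"]

embeddings = SentenceTransformer("all-MiniLM-L6-v2").encode(reviews)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embeddings)

def llm_sentiment(texts):
    """Placeholder for the GPT-based classifier; a keyword rule stands in here."""
    return ["negative" if any(w in t for w in ("late", "crash")) else "positive"
            for t in texts]

for topic in sorted(set(labels)):
    topic_docs = [r for r, l in zip(reviews, labels) if l == topic]
    print(f"topic {topic}: {llm_sentiment(topic_docs)}")
```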

Open Access Article
Survey on the Role of Mechanistic Interpretability in Generative AI
by Leonardo Ranaldi
Big Data Cogn. Comput. 2025, 9(8), 193; https://doi.org/10.3390/bdcc9080193 - 23 Jul 2025
Abstract
The rapid advancement of artificial intelligence (AI) and machine learning has revolutionised how systems process information, make decisions, and adapt to dynamic environments. AI-driven approaches have significantly enhanced efficiency and problem-solving capabilities across various domains, from automated decision-making to knowledge representation and predictive modelling. These developments have led to the emergence of increasingly sophisticated models capable of learning patterns, reasoning over complex data structures, and generalising across tasks. As AI systems become more deeply integrated into networked infrastructures and the Internet of Things (IoT), their ability to process and interpret data in real-time is essential for optimising intelligent communication networks, distributed decision making, and autonomous IoT systems. However, despite these achievements, the internal mechanisms that drive LLMs’ reasoning and generalisation capabilities remain largely unexplored. This lack of transparency, compounded by challenges such as hallucinations, adversarial perturbations, and misaligned human expectations, raises concerns about their safe and beneficial deployment. Understanding the underlying principles governing AI models is crucial for their integration into intelligent network systems, automated decision-making processes, and secure digital infrastructures. This paper provides a comprehensive analysis of explainability approaches aimed at uncovering the fundamental mechanisms of LLMs. We investigate the strategic components contributing to their generalisation abilities, focusing on methods to quantify acquired knowledge and assess its representation within model parameters. Specifically, we examine mechanistic interpretability, probing techniques, and representation engineering as tools to decipher how knowledge is structured, encoded, and retrieved in AI systems. Furthermore, by adopting a mechanistic perspective, we analyse emergent phenomena within training dynamics, particularly memorisation and generalisation, which also play a crucial role in broader AI-driven systems, including adaptive network intelligence, edge computing, and real-time decision-making architectures. Understanding these principles is crucial for bridging the gap between black-box AI models and practical, explainable AI applications, thereby ensuring trust, robustness, and efficiency in language-based and general AI systems.
Full article

Open Access Article
Synonym Substitution Steganalysis Based on Heterogeneous Feature Extraction and Hard Sample Mining Re-Perception
by Jingang Wang, Hui Du and Peng Liu
Big Data Cogn. Comput. 2025, 9(8), 192; https://doi.org/10.3390/bdcc9080192 - 22 Jul 2025
Abstract
Linguistic steganography can be utilized to establish covert communication channels on social media platforms, thus facilitating the dissemination of illegal messages, seriously compromising cyberspace security. Synonym substitution-based linguistic steganography methods have garnered considerable attention due to their simplicity and strong imperceptibility. Existing linguistic steganalysis methods have not achieved excellent detection performance for the aforementioned type of linguistic steganography. In this paper, based on the idea of focusing on accumulated differences, we propose a two-stage synonym substitution-based linguistic steganalysis method that does not require a synonym database and can effectively detect texts with very low embedding rates. Experimental results demonstrate that this method achieves an average detection accuracy 2.4% higher than the comparative method.
Full article

Open Access Article
Enhanced Face Recognition in Crowded Environments with 2D/3D Features and Parallel Hybrid CNN-RNN Architecture with Stacked Auto-Encoder
by Samir Elloumi, Sahbi Bahroun, Sadok Ben Yahia and Mourad Kaddes
Big Data Cogn. Comput. 2025, 9(8), 191; https://doi.org/10.3390/bdcc9080191 - 22 Jul 2025
Abstract
Face recognition (FR) in unconstrained conditions remains an open research topic and an ongoing challenge. The facial images exhibit diverse expressions, occlusions, variations in illumination, and heterogeneous backgrounds. This work aims to produce an accurate and robust system for enhanced Security and Surveillance. A parallel hybrid deep learning model for feature extraction and classification is proposed. An ensemble of three parallel extraction layer models learns the best representative features using CNN and RNN. 2D LBP and 3D Mesh LBP are computed on face images to extract image features as input to two RNNs. A stacked autoencoder (SAE) merged the feature vectors extracted from the three CNN-RNN parallel layers. We tested the designed 2D/3D CNN-RNN framework on four standard datasets. We achieved an accuracy of . The hybrid deep learning model significantly improves FR against similar state-of-the-art methods. The proposed model was also tested on an unconstrained conditions human crowd dataset, and the results were very promising with an accuracy of . Furthermore, our model shows an 11.5% improvement over similar hybrid CNN-RNN architectures, proving its robustness in complex environments where the face can undergo different transformations.
Full article
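To make the 2D LBP feature-extraction step concrete, the sketch below computes a uniform-LBP histogram for one grayscale face crop with scikit-image. The radius, number of sampling points, and the random stand-in image are assumptions; the 3D Mesh LBP and the CNN-RNN/stacked-autoencoder fusion are not reproduced.

```python
# Small sketch of 2D LBP feature extraction for a single grayscale face image.
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(gray_face, points=8, radius=1):
    """Return a normalized uniform-LBP histogram for one face image."""
    lbp = local_binary_pattern(gray_face, points, radius, method="uniform")
    n_bins = points + 2                  # uniform patterns + one non-uniform bin
    hist, _ = np.histogram(lbp, bins=n_bins, range=(0, n_bins), density=True)
    return hist

face = np.random.rand(64, 64)            # stand-in for a cropped face image
print(lbp_histogram(face).shape)          # (10,) feature vector per image
```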

Open Access Article
Detection of Biased Phrases in the Wiki Neutrality Corpus for Fairer Digital Content Management Using Artificial Intelligence
by Abdullah, Muhammad Ateeb Ather, Olga Kolesnikova and Grigori Sidorov
Big Data Cogn. Comput. 2025, 9(7), 190; https://doi.org/10.3390/bdcc9070190 - 21 Jul 2025
Abstract
Detecting biased language in large-scale corpora, such as the Wiki Neutrality Corpus, is essential for promoting neutrality in digital content. This study systematically evaluates a range of machine learning (ML) and deep learning (DL) models for the detection of biased and pre-conditioned phrases. Conventional classifiers, including Extreme Gradient Boosting (XGBoost), Light Gradient-Boosting Machine (LightGBM), and Categorical Boosting (CatBoost), are compared with advanced neural architectures such as Bidirectional Encoder Representations from Transformers (BERT), Long Short-Term Memory (LSTM) networks, and Generative Adversarial Networks (GANs). A novel hybrid architecture is proposed, integrating DistilBERT, LSTM, and GANs within a unified framework. Extensive experimentation with intermediate variants DistilBERT + LSTM (without GAN) and DistilBERT + GAN (without LSTM) demonstrates that the fully integrated model consistently outperforms all alternatives. The proposed hybrid model achieves a cross-validation accuracy of 99.00%, significantly surpassing traditional baselines such as XGBoost (96.73%) and LightGBM (96.83%). It also exhibits superior stability, statistical significance (paired t-tests), and favorable trade-offs between performance and computational efficiency. The results underscore the potential of hybrid deep learning models for capturing subtle linguistic bias and advancing more objective and reliable automated content moderation systems.
Full article
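A hedged sketch of the DistilBERT-plus-LSTM part of the proposed hybrid is shown below; the GAN component, the training regime, and all hyperparameters are omitted or assumed, so this is only the overall wiring, not the published model.

```python
# Sketch: DistilBERT token embeddings fed to a BiLSTM and a 2-class head
# (biased vs. neutral). Sizes are assumptions for illustration.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

class BiasClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = AutoModel.from_pretrained("distilbert-base-uncased")
        self.lstm = nn.LSTM(768, 128, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * 128, 2)

    def forward(self, **enc):
        tokens = self.encoder(**enc).last_hidden_state        # (B, T, 768)
        _, (h, _) = self.lstm(tokens)
        h = torch.cat([h[-2], h[-1]], dim=-1)                  # both directions
        return self.head(h)

tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
batch = tok(["The so-called experts were obviously wrong."],
            return_tensors="pt", padding=True, truncation=True)
print(BiasClassifier()(**batch).shape)                         # torch.Size([1, 2])
```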

Open Access Systematic Review
State of the Art and Future Directions of Small Language Models: A Systematic Review
by Flavio Corradini, Matteo Leonesi and Marco Piangerelli
Big Data Cogn. Comput. 2025, 9(7), 189; https://doi.org/10.3390/bdcc9070189 - 21 Jul 2025
Abstract
Small Language Models (SLMs) have emerged as a critical area of study within natural language processing, attracting growing attention from both academia and industry. This systematic literature review provides a comprehensive and reproducible analysis of recent developments and advancements in SLMs post-2023. Drawing on 70 English-language studies published between January 2023 and January 2025, identified through Scopus, IEEE Xplore, Web of Science, and ACM Digital Library, and focusing primarily on SLMs (including those with up to 7 billion parameters), this review offers a structured overview of the current state of the art and potential future directions. Designed as a resource for researchers seeking an in-depth global synthesis, the review examines key dimensions such as publication trends, visual data representations, contributing institutions, and the availability of public datasets. It highlights prevailing research challenges and outlines proposed solutions, with a particular focus on widely adopted model architectures, as well as common compression and optimization techniques. This study also evaluates the criteria used to assess the effectiveness of SLMs and discusses emerging de facto standards for industry. The curated data and insights aim to support and inform ongoing and future research in this rapidly evolving field.
Full article

Open Access Article
An Approach to Enable Human–3D Object Interaction Through Voice Commands in an Immersive Virtual Environment
by Alessio Catalfamo, Antonio Celesti, Maria Fazio, A. F. M. Saifuddin Saif, Yu-Sheng Lin, Edelberto Franco Silva and Massimo Villari
Big Data Cogn. Comput. 2025, 9(7), 188; https://doi.org/10.3390/bdcc9070188 - 17 Jul 2025
Abstract
Nowadays, the Metaverse is facing many challenges. In this context, Virtual Reality (VR) applications allowing voice-based human–3D object interactions are limited due to the current hardware/software limitations. In fact, adopting Automated Speech Recognition (ASR) systems to interact with 3D objects in VR applications through users’ voice commands presents significant challenges due to the hardware and software limitations of headset devices. This paper aims to bridge this gap by proposing a methodology to address these issues. In particular, starting from a Mel-Frequency Cepstral Coefficient (MFCC) extraction algorithm able to capture the unique characteristics of the user’s voice, we pass it as input to a Convolutional Neural Network (CNN) model. After that, in order to integrate the CNN model with a VR application running on a standalone headset, such as Oculus Quest, we converted it into an Open Neural Network Exchange (ONNX) format, i.e., a Machine Learning (ML) interoperability open standard format. The proposed system demonstrates good performance and represents a foundation for the development of user-centric, effective computing systems, enhancing accessibility to VR environments through voice-based commands. Experiments demonstrate that a native CNN model developed through TensorFlow presents comparable performances with respect to the corresponding CNN model converted into the ONNX format, paving the way towards the development of VR applications running in headsets controlled through the user’s voice.
Full article
(This article belongs to the Special Issue Advances in Artificial Intelligence for Computer Vision, Augmented Reality Virtual Reality and Metaverse)
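The MFCC-to-CNN-to-ONNX flow described in the abstract can be outlined in a few lines. The synthetic audio, layer sizes, five-command vocabulary, and the choice of tf2onnx as the converter are illustrative assumptions rather than the authors' configuration.

```python
# Sketch: extract MFCCs, run them through a small Keras CNN, export to ONNX
# so the model could be loaded by a headset-side ONNX runtime.
import numpy as np, librosa, tensorflow as tf, tf2onnx

sr = 22050
signal = np.random.randn(sr).astype("float32")             # 1 s of stand-in audio
mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13)    # (13, frames)
x = mfcc[np.newaxis, ..., np.newaxis].astype("float32")    # (1, 13, T, 1)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=x.shape[1:]),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(5, activation="softmax"),         # e.g. 5 voice commands
])
print(model(x).shape)                                       # (1, 5)

spec = (tf.TensorSpec(x.shape, tf.float32, name="mfcc"),)
tf2onnx.convert.from_keras(model, input_signature=spec, output_path="asr_cnn.onnx")
```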
Open Access Article
LeONet: A Hybrid Deep Learning Approach for High-Precision Code Clone Detection Using Abstract Syntax Tree Features
by Thanoshan Vijayanandan, Kuhaneswaran Banujan, Ashan Induranga, Banage T. G. S. Kumara and Kaveenga Koswattage
Big Data Cogn. Comput. 2025, 9(7), 187; https://doi.org/10.3390/bdcc9070187 - 15 Jul 2025
Abstract
Code duplication, commonly referred to as code cloning, is not inherent in software systems but arises due to various factors, such as time constraints in meeting project deadlines. These duplications, or “code clones”, complicate the program structure and increase maintenance costs. Code clones are categorized into four types: Type-1, Type-2, Type-3, and Type-4. This study aims to address the adverse effects of code clones by introducing LeONet, a hybrid Deep Learning approach that enhances the detection of code clones in software systems. The hybrid approach, LeONet, combines LeNet-5 with Oreo’s Siamese architecture. We extracted clone method pairs from the BigCloneBench Java repository. Feature extraction was performed using Abstract Syntax Trees, which are scalable and accurately represent the syntactic structure of the source code. The performance of LeONet was compared against other classifiers including ANN, LeNet-5, Oreo’s Siamese, LightGBM, XGBoost, and Decision Tree. LeONet demonstrated superior performance among the classifiers tested, achieving the highest F1 score of 98.12%. It also compared favorably against state-of-the-art approaches, indicating its effectiveness in code clone detection. The results validate the effectiveness of LeONet in detecting code clones, outperforming existing classifiers and competing closely with advanced methods. This study underscores the potential of hybrid deep learning models and feature extraction techniques in improving the accuracy of code clone detection, providing a promising direction for future research in this area.
Full article
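LeONet consumes features derived from Java ASTs in BigCloneBench; as a language-agnostic stand-in, the sketch below uses Python's standard ast module to turn two snippets into node-type count vectors and compare them, which conveys the kind of syntactic fingerprint an AST-based clone detector works from. The overlap measure is an assumption for illustration only.

```python
# Stand-in for AST feature extraction: count node types and compare two snippets.
import ast
from collections import Counter

def ast_node_counts(source: str) -> Counter:
    """Count AST node types in a code snippet."""
    return Counter(type(node).__name__ for node in ast.walk(ast.parse(source)))

a = "def add(a, b):\n    return a + b"
b = "def total(x, y):\n    return x + y"      # Type-2 clone: renamed identifiers

fa, fb = ast_node_counts(a), ast_node_counts(b)
overlap = sum((fa & fb).values()) / sum((fa | fb).values())
print(f"node-type overlap: {overlap:.2f}")     # identical structure -> 1.00
```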

Open Access Article
CNN-Based Framework for Classifying COVID-19, Pneumonia, and Normal Chest X-Rays
by Cristian Randieri, Andrea Perrotta, Adriano Puglisi, Maria Grazia Bocci and Christian Napoli
Big Data Cogn. Comput. 2025, 9(7), 186; https://doi.org/10.3390/bdcc9070186 - 11 Jul 2025
Cited by 1
Abstract
This paper describes the development of a CNN model for the analysis of chest X-rays and the automated diagnosis of pneumonia, bacterial or viral, and lung pathologies resulting from COVID-19, offering new insights for further research through the development of an AI-based diagnostic tool, which can be automatically implemented and made available for rapid differentiation between normal pneumonia and COVID-19 starting from X-ray images. The model developed in this work is capable of performing three-class classification, achieving 97.48% accuracy in distinguishing chest X-rays affected by COVID-19 from other pneumonias (bacterial or viral) and from cases defined as normal, i.e., without any obvious pathology. The novelty of our study is represented not only by the quality of the results obtained in terms of accuracy but, above all, by the reduced complexity of the model in terms of parameters and a shorter inference time compared to other models currently found in the literature. The excellent trade-off between the accuracy and computational complexity of our model allows for easy implementation on numerous embedded hardware platforms, such as FPGAs, for the creation of new diagnostic tools to support medical practice.
Full article
(This article belongs to the Special Issue Beyond Diagnosis: Machine Learning in Prognosis, Prevention, Healthcare, Neurosciences, and Precision Medicine)
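For orientation, a compact three-class chest X-ray CNN of the general kind described here might look like the sketch below; the input resolution and layer sizes are assumptions and do not reproduce the authors' reduced-complexity architecture.

```python
# Minimal three-class CNN sketch (COVID-19 / other pneumonia / normal).
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 1)),          # grayscale chest X-ray
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```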
Open Access Article
Adaptive, Privacy-Enhanced Real-Time Fraud Detection in Banking Networks Through Federated Learning and VAE-QLSTM Fusion
by Hanae Abbassi, Saida El Mendili and Youssef Gahi
Big Data Cogn. Comput. 2025, 9(7), 185; https://doi.org/10.3390/bdcc9070185 - 9 Jul 2025
Abstract
Increased digital banking operations have brought about a surge in suspicious activities, necessitating heightened real-time fraud detection systems. Conversely, traditional static approaches encounter challenges in maintaining privacy while adapting to new fraudulent trends. In this paper, we provide a unique approach to tackling those challenges by integrating VAE-QLSTM with Federated Learning (FL) in a semi-decentralized architecture, maintaining privacy alongside adapting to emerging malicious behaviors. The suggested architecture builds on the adeptness of VAE-QLSTM to capture meaningful representations of transactions, serving in abnormality detection. On the other hand, QLSTM combines quantum computational capability with temporal sequence modeling, seeking to give a rapid and scalable method for real-time malignancy detection. The designed approach was set up through TensorFlow Federated on two real-world datasets—notably IEEE-CIS and European cardholders—outperforming current strategies in terms of accuracy and sensitivity, achieving 94.5% and 91.3%, respectively. This proves the potential of merging VAE-QLSTM with FL to address fraud detection difficulties, ensuring privacy and scalability in advanced banking networks.
Full article

Open Access Review
LLMs in Cyber Security: Bridging Practice and Education
by Hany F. Atlam
Big Data Cogn. Comput. 2025, 9(7), 184; https://doi.org/10.3390/bdcc9070184 - 8 Jul 2025
Abstract
Large Language Models (LLMs) have emerged as powerful tools in cyber security, enabling automation, threat detection, and adaptive learning. Their ability to process unstructured data and generate context-aware outputs supports both operational tasks and educational initiatives. Despite their growing adoption, current research often focuses on isolated applications, lacking a systematic understanding of how LLMs align with domain-specific requirements and pedagogical effectiveness. This highlights a pressing need for comprehensive evaluations that address the challenges of integration, generalization, and ethical deployment in both operational and educational cyber security environments. Therefore, this paper provides a comprehensive and State-of-the-Art review of the significant role of LLMs in cyber security, addressing both operational and educational dimensions. It introduces a holistic framework that categorizes LLM applications into six key cyber security domains, examining each in depth to demonstrate their impact on automation, context-aware reasoning, and adaptability to emerging threats. The paper highlights the potential of LLMs to enhance operational performance and educational effectiveness while also exploring emerging technical, ethical, and security challenges. The paper also uniquely addresses the underexamined area of LLMs in cyber security education by reviewing recent studies and illustrating how these models support personalized learning, hands-on training, and awareness initiatives. The key findings reveal that while LLMs offer significant potential in automating tasks and enabling personalized learning, challenges remain in model generalization, ethical deployment, and production readiness. Finally, the paper discusses open issues and future research directions for the application of LLMs in both operational and educational contexts. This paper serves as a valuable reference for researchers, educators, and practitioners aiming to develop intelligent, adaptive, scalable, and ethically responsible LLM-based cyber security solutions.
Full article

Open Access Article
Gait-Based Parkinson’s Disease Detection Using Recurrent Neural Networks for Wearable Systems
by Carlos Rangel-Cascajosa, Francisco Luna-Perejón, Saturnino Vicente-Diaz and Manuel Domínguez-Morales
Big Data Cogn. Comput. 2025, 9(7), 183; https://doi.org/10.3390/bdcc9070183 - 7 Jul 2025
Abstract
Parkinson’s disease is one of the neurodegenerative conditions that has seen a significant increase in prevalence in recent decades. The lack of specific screening tests and notable disease biomarkers, combined with the strain on healthcare systems, leads to delayed detection of the disease, which worsens its progression. The development of diagnostic support tools can support early detection and facilitate timely intervention. The ability of Deep Learning algorithms to identify complex features from clinical data has proven to be a promising approach in various medical domains as support tools. In this study, we present an investigation of different architectures based on Gated Recurrent Neural Networks to assess their effectiveness in identifying subjects with Parkinson’s disease from gait records. Models with Long-Short term Memory (LSTM) and Gated Recurrent Unit (GRU) layers were evaluated. Performance results reach competitive effectiveness values with the current state-of-the-art accuracy (up to 93.75% (average ± SD: 86 ± 5%)), simplifying computational complexity, which represents an advance in the implementation of executable screening and diagnostic support tools in systems with few computational resources in wearable devices.
Full article
(This article belongs to the Topic eHealth and mHealth: Challenges and Prospects, 2nd Edition)
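A wearable-friendly GRU classifier of the kind compared in this study can be sketched as follows; the window length, sensor channel count, and layer sizes are assumptions for illustration, not the evaluated configurations.

```python
# Minimal GRU classifier for fixed-length gait windows.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(100, 16)),          # 100 time steps, 16 channels
    tf.keras.layers.GRU(32),                         # small layer for wearables
    tf.keras.layers.Dense(1, activation="sigmoid"),  # Parkinson's vs. control
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```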
Open Access Article
Dependency-Aware Entity–Attribute Relationship Learning for Text-Based Person Search
by Wei Xia, Wenguang Gan and Xinpan Yuan
Big Data Cogn. Comput. 2025, 9(7), 182; https://doi.org/10.3390/bdcc9070182 - 7 Jul 2025
Abstract
Text-based person search (TPS), a critical technology for security and surveillance, aims to retrieve target individuals from image galleries using textual descriptions. The existing methods face two challenges: (1) ambiguous attribute–noun association (AANA), where syntactic ambiguities lead to incorrect associations between attributes and the intended nouns; and (2) textual noise and relevance imbalance (TNRI), where irrelevant or non-discriminative tokens (e.g., ‘wearing’) reduce the saliency of critical visual attributes in the textual description. To address these aspects, we propose the dependency-aware entity–attribute alignment network (DEAAN), a novel framework that explicitly tackles AANA through dependency-guided attention and TNRI via adaptive token filtering. The DEAAN introduces two modules: (1) dependency-assisted implicit reasoning (DAIR) to resolve AANA through syntactic parsing, and (2) relevance-adaptive token selection (RATS) to suppress TNRI by learning token saliency. Experiments on CUHK-PEDES, ICFG-PEDES, and RSTPReid demonstrate state-of-the-art performance, with the DEAAN achieving a Rank-1 accuracy of 76.71% and an mAP of 69.07% on CUHK-PEDES, surpassing RDE by 0.77% in Rank-1 and 1.51% in mAP. Ablation studies reveal that DAIR and RATS individually improve Rank-1 by 2.54% and 3.42%, while their combination elevates the performance by 6.35%, validating their synergy. This work bridges structured linguistic analysis with adaptive feature selection, demonstrating practical robustness in surveillance-oriented TPS scenarios.
Full article

Open Access Article
Laor Initialization: A New Weight Initialization Method for the Backpropagation of Deep Learning
by Laor Boongasame, Jirapond Muangprathub and Karanrat Thammarak
Big Data Cogn. Comput. 2025, 9(7), 181; https://doi.org/10.3390/bdcc9070181 - 7 Jul 2025
Abstract
This paper presents Laor Initialization, an innovative weight initialization technique for deep neural networks that utilizes forward-pass error feedback in conjunction with k-means clustering to optimize the initial weights. In contrast to traditional methods, Laor adopts a data-driven approach that enhances convergence’s stability and efficiency. The method was assessed using various datasets, including a gold price time series, MNIST, and CIFAR-10 across the CNN and LSTM architectures. The results indicate that the Laor Initialization achieved the lowest K-fold cross-validation RMSE (0.00686), surpassing Xavier, He, and Random. Laor demonstrated a high convergence success (final RMSE = 0.00822) and the narrowest interquartile range (IQR), indicating superior stability. Gradient analysis confirmed Laor’s robustness, achieving the lowest coefficients of variation (CV = 0.2230 for MNIST, 0.3448 for CIFAR-10, and 0.5997 for gold price) with zero vanishing layers in the CNNs. Laor achieved a 24% reduction in CPU training time for the Gold price data and the fastest runtime on MNIST (340.69 s), while maintaining efficiency on CIFAR-10 (317.30 s). It performed optimally with a batch size of 32 and a learning rate between 0.001 and 0.01. These findings establish Laor as a robust alternative to conventional methods, suitable for moderately deep architectures. Future research should focus on dynamic variance scaling and adaptive clustering.
Full article
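The abstract does not spell out the procedure, so the sketch below shows only one plausible reading of combining k-means with forward-pass error feedback: seed several candidate first-layer weight matrices from cluster centroids of the inputs, evaluate each with an untrained forward pass, and keep the candidate with the lowest error. The toy data, normalization, and readout are assumptions, not the published Laor algorithm.

```python
# One plausible reading (a sketch, not the published method): k-means-seeded
# candidate weights, selected by forward-pass error before any training.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))                  # toy inputs
y = (X.sum(axis=1) > 0).astype(float)          # toy binary targets
hidden = 16

def forward_error(W):
    h = np.tanh(X @ W)                         # (500, hidden)
    p = 1 / (1 + np.exp(-h.mean(axis=1)))      # crude untrained readout
    return np.mean((p - y) ** 2)

candidates = []
for seed in range(5):
    centers = KMeans(n_clusters=hidden, n_init=5, random_state=seed).fit(X).cluster_centers_
    W = centers.T / np.linalg.norm(centers.T, axis=0, keepdims=True)   # (8, hidden)
    candidates.append((forward_error(W), W))

best_err, W0 = min(candidates, key=lambda c: c[0])
print(f"selected initial weights with forward-pass MSE {best_err:.4f}")
```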

Open Access Article
Modeling and Simulation of Public Opinion Evolution Based on the SIS-FJ Model with a Bidirectional Coupling Mechanism
by Wenxuan Fu, Renqi Zhu, Bo Li, Xin Lu and Xiang Lin
Big Data Cogn. Comput. 2025, 9(7), 180; https://doi.org/10.3390/bdcc9070180 - 4 Jul 2025
Abstract
The evolution of public opinion on social media affects societal security and stability. To effectively control the societal impact of public opinion evolution, it is essential to study its underlying mechanisms. Public opinion evolution on social media primarily involves two processes: information dissemination and opinion interaction. However, existing studies overlook the bidirectional coupling relationship between these two processes, with limitations such as weak coupling and insufficient consideration of individual heterogeneity. To address this, we propose the SIS-FJ model with a bidirectional coupling mechanism, which combines the strengths of the SIS (Susceptible–Infected–Susceptible) model in information dissemination and the FJ (Friedkin–Johnsen) model in opinion interaction. Specifically, the SIS model is used to describe information dissemination, while the FJ model is used to describe opinion interaction. In the computation of infection and recovery rates of the SIS model, we introduce the opinion differences between individuals and their observable neighbors from the FJ model. In the computation of opinion values in the FJ model, we introduce the node states from the SIS model, thus achieving bidirectional coupling between the two models. Moreover, the model considers individual heterogeneity from multiple aspects, including infection rate, recovery rate, and individual susceptibility. Through simulation experiments, we investigate the effects of initial opinion distribution, individual susceptibility, and network structure on public opinion evolution. Interestingly, neither initial opinion distribution, individual susceptibility, nor network structure exerts a significant influence on the proportion of disseminating and non-disseminating individuals at termination. Furthermore, we optimize the model by adjusting the functions for infection and recovery rates.
Full article
(This article belongs to the Topic Social Computing and Social Network Analysis)
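A toy version of the bidirectional coupling can be simulated directly: opinion gaps from the FJ layer modulate SIS infection rates, and only currently disseminating (infected) neighbours enter the FJ opinion update. The rate functions, random network, and parameters below are assumptions chosen for illustration, not the paper's formulation.

```python
# Toy SIS-FJ coupling: opinions influence infection, infection gates opinion updates.
import numpy as np

rng = np.random.default_rng(1)
N = 50
A = (rng.random((N, N)) < 0.1).astype(float)        # random contact network
np.fill_diagonal(A, 0)
opinion = rng.uniform(-1, 1, N)                      # FJ opinions in [-1, 1]
initial = opinion.copy()
infected = rng.random(N) < 0.1                       # SIS states (disseminating or not)
susceptibility = rng.uniform(0.2, 0.8, N)            # individual heterogeneity

for _ in range(100):
    # SIS step: infection is more likely when opinions are close to infected neighbours'
    for i in range(N):
        nbrs = np.where(A[i] > 0)[0]
        inf_nbrs = nbrs[infected[nbrs]]
        if not infected[i] and len(inf_nbrs):
            closeness = 1 - np.abs(opinion[i] - opinion[inf_nbrs]).mean() / 2
            infected[i] = rng.random() < 0.3 * closeness
        elif infected[i]:
            infected[i] = rng.random() > 0.1         # recovery with probability 0.1
    # FJ step: average only over infected (actively disseminating) neighbours
    new_op = opinion.copy()
    for i in range(N):
        nbrs = np.where((A[i] > 0) & infected)[0]
        if len(nbrs):
            new_op[i] = (susceptibility[i] * opinion[nbrs].mean()
                         + (1 - susceptibility[i]) * initial[i])
    opinion = new_op

print(f"disseminating fraction at termination: {infected.mean():.2f}")
```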
Open Access Article
Research on a Crime Spatiotemporal Prediction Method Integrating Informer and ST-GCN: A Case Study of Four Crime Types in Chicago
by Yuxiao Fan, Xiaofeng Hu and Jinming Hu
Big Data Cogn. Comput. 2025, 9(7), 179; https://doi.org/10.3390/bdcc9070179 - 3 Jul 2025
Abstract
As global urbanization accelerates, communities have emerged as key areas where social conflicts and public safety risks clash. Traditional crime prevention models experience difficulties handling dynamic crime hotspots due to data lags and poor spatiotemporal resolution. Therefore, this study proposes a hybrid model combining Informer and Spatiotemporal Graph Convolutional Network (ST-GCN) to achieve precise crime prediction at the community level. By employing a community topology and incorporating historical crime, weather, and holiday data, ST-GCN captures spatiotemporal crime trends, while Informer identifies temporal dependencies. Moreover, the model leverages a fully connected layer to map features to predicted latitudes. The experimental results from 320,000 crime records from 22 police districts in Chicago, IL, USA, from 2015 to 2020 show that our model outperforms traditional and deep learning models in predicting assaults, robberies, property damage, and thefts. Specifically, the mean average error (MAE) is 0.73 for assaults, 1.36 for theft, 1.03 for robbery, and 1.05 for criminal damage. In addition, anomalous event fluctuations are effectively captured. The results indicate that our model furthers data-driven public safety governance through spatiotemporal dependency integration and long-sequence modeling, facilitating dynamic crime hotspot prediction and resource allocation optimization. Future research should integrate multisource socioeconomic data to further enhance model adaptability and cross-regional generalization capabilities.
Full article

Open Access Review
Toward the Mass Adoption of Blockchain: Cross-Industry Insights from DeFi, Gaming, and Data Analytics
by Shezon Saleem Mohammed Abdul, Anup Shrestha and Jianming Yong
Big Data Cogn. Comput. 2025, 9(7), 178; https://doi.org/10.3390/bdcc9070178 - 3 Jul 2025
Abstract
Blockchain’s promise of decentralised, tamper-resistant services is gaining real traction in three arenas: decentralized finance (DeFi), blockchain gaming, and data-driven analytics. These sectors span finance, entertainment, and information services, offering a representative setting in which to study real-world adoption. This survey analyzes how each domain implements blockchain, identifies the incentives that accelerate uptake, and maps the technical and organizational barriers that still limit scale. By examining peer-reviewed literature and recent industry developments, this review distils common design features such as token incentives, verifiable digital ownership, and immutable data governance. It also pinpoints the following domain-specific challenges: capital efficiency in DeFi, asset portability and community engagement in gaming, and high-volume, low-latency querying in analytics. Moreover, cross-sector links are already forming, with DeFi liquidity tools supporting in-game economies and analytics dashboards improving decision-making across platforms. Building on these findings, this paper offers guidance on stronger interoperability and user-centered design and sets research priorities in consensus optimization, privacy-preserving analytics, and inclusive governance. Together, the insights equip developers, policymakers, and researchers to build scalable, interoperable platforms and reuse proven designs while avoiding common pitfalls.
Full article
(This article belongs to the Special Issue Application of Cloud Computing in Industrial Internet of Things)

Topics
Topic in AI, Applied Sciences, BDCC, Sensors, Information, IJGI
Applied Computing and Machine Intelligence (ACMI)
Topic Editors: Chuan-Ming Liu, Wei-Shinn Ku. Deadline: 31 July 2025
Topic in Algorithms, BDCC, BioMedInformatics, Information, Mathematics
Machine Learning Empowered Drug Screen
Topic Editors: Teng Zhou, Jiaqi Wang, Youyi Song. Deadline: 31 August 2025
Topic in IJERPH, JPM, Healthcare, BDCC, Applied Sciences, Sensors
eHealth and mHealth: Challenges and Prospects, 2nd Edition
Topic Editors: Antonis Billis, Manuel Dominguez-Morales, Anton Civit. Deadline: 31 October 2025
Topic in Actuators, Algorithms, BDCC, Future Internet, JMMP, Machines, Robotics, Systems
Smart Product Design and Manufacturing on Industrial Internet
Topic Editors: Pingyu Jiang, Jihong Liu, Ying Liu, Jihong Yan. Deadline: 31 December 2025

Special Issues
Special Issue in BDCC
Field Robotics and Artificial Intelligence (AI)
Guest Editors: Robert Ross, Alex Stumpf. Deadline: 20 August 2025
Special Issue in BDCC
Semantic Web Technology and Recommender Systems 2nd Edition
Guest Editors: Konstantinos Kotis, Dimitris Spiliotopoulos. Deadline: 31 August 2025
Special Issue in BDCC
Machine Learning Methodologies and Applications in Cybersecurity Data Analysis
Guest Editors: Biao Han, Xiaoyan Wang, Xiucai Ye, Na Zhao. Deadline: 31 August 2025
Special Issue in BDCC
Energy Conservation Towards a Low-Carbon and Sustainability Future
Guest Editors: Yongming Han, Xuan Hu. Deadline: 25 September 2025