Journal Description
Big Data and Cognitive Computing is an international, peer-reviewed, open access journal on big data and cognitive computing, published monthly online by MDPI.
- Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
- High Visibility: indexed within Scopus, ESCI (Web of Science), dblp, Inspec, Ei Compendex, and other databases.
- Journal Rank: JCR - Q1 (Computer Science, Theory and Methods) / CiteScore - Q1 (Computer Science Applications)
- Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 23.1 days after submission; acceptance to publication takes 4.6 days (median values for papers published in this journal in the second half of 2025).
- Recognition of Reviewers: reviewers who provide timely, thorough peer-review reports receive vouchers entitling them to a discount on the APC of their next publication in any MDPI journal, in appreciation of the work done.
- Journal Cluster of Artificial Intelligence: AI, AI in Medicine, Algorithms, BDCC, MAKE, MTI, Stats, Virtual Worlds and Computers.
Impact Factor: 4.4 (2024); 5-Year Impact Factor: 4.2 (2024)
Latest Articles
Comparative Read Performance Analysis of PostgreSQL and MongoDB in E-Commerce: An Empirical Study of Filtering and Analytical Queries
Big Data Cogn. Comput. 2026, 10(2), 66; https://doi.org/10.3390/bdcc10020066 - 19 Feb 2026
Abstract
This paper presents a comparative analysis of read performance for PostgreSQL and MongoDB in e-commerce scenarios, using identical datasets in a resource-constrained single-host environment. The results demonstrate that PostgreSQL executes complex analytical queries 1.6–15.1 times faster, depending on the query type and data volume. The study employed synthetic data generation with the Faker library across three stages, processing up to 300,000 products and executing each of 6 query types 15 times. Both filtering and analytical queries were tested on non-indexed data in a controlled localhost environment with PostgreSQL 17.5 and MongoDB 7.0.14, using default configurations. PostgreSQL showed 65–80% shorter execution times for multi-criteria queries, while MongoDB required approximately 33% less disk space. These findings suggest that normalized relational schemas are advantageous for transactional e-commerce systems where analytical queries dominate the workload. The results are directly applicable to small and medium e-commerce developers operating in budget-constrained, single-host deployment environments when choosing between relational and document-oriented databases for structured transactional data with read-heavy analytical workloads. A minimal indexed validation confirms that the baseline trends remain consistent under a simple indexing configuration. Future work will examine broader indexing strategies, write-intensive workloads, and distributed deployment scenarios.
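The benchmarking protocol described above (each of the 6 query types executed 15 times, with medians reported) can be approximated with a small timing harness. The sketch below is a stdlib-only illustration, not the authors' code: the lambda workloads are placeholders standing in for real psycopg2/pymongo query calls.

```python
import statistics
import time

def benchmark(run_query, repetitions=15):
    """Time a query callable `repetitions` times; return median latency in ms.

    `run_query` stands in for any driver call, e.g. a psycopg2
    cursor.execute(...) or a pymongo aggregate(...) (not shown here).
    """
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run_query()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)

def speedup(baseline_ms, contender_ms):
    """How many times faster the contender is than the baseline."""
    return baseline_ms / contender_ms

# Placeholder workloads standing in for the two databases under test.
pg_ms = benchmark(lambda: sum(range(10_000)))
mongo_ms = benchmark(lambda: sum(range(50_000)))
print(f"median speedup: {speedup(mongo_ms, pg_ms):.1f}x")
```

Using the median rather than the mean, as the study does, makes the reported latency robust to one-off cache or scheduler outliers.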
Open Access Perspective
Integration of Lean Analytics and Industry 6.0: A Novel Meta-Theoretical Framework for Antifragile, Generative AI-Orchestrated, Circular–Regenerative, and Hyper-Connected Manufacturing Ecosystems
by
Mohammad Shahin, Mazdak Maghanaki and F. Frank Chen
Big Data Cogn. Comput. 2026, 10(2), 65; https://doi.org/10.3390/bdcc10020065 - 17 Feb 2026
Abstract
The convergence of Lean manufacturing principles with Industry 4.0 has yielded significant operational improvements, yet the emerging paradigm of Industry 6.0—characterized by antifragile, autonomous, and sustainable systems—demands a fundamental rethinking of existing analytical frameworks. This paper introduces the Industry 6.0 Lean Analytics (I6LA) Framework, a novel meta-theoretical approach that integrates Lean principles with the core concepts of Industry 6.0. By systematically analyzing the limitations of current Lean analytics in the context of Industry 6.0 requirements, we identify critical gaps in areas such as system resilience, AI-driven autonomy, and circular economy integration. The I6LA Framework addresses these gaps through four new theoretical pillars: Antifragile Lean Systems Theory, generative AI-Orchestrated Value Streams, Circular–Regenerative Analytics, and Hyper-Connected Ecosystem Integration. This research provides a new set of mathematical models for measuring antifragility, generative orchestration efficiency, and circularity, offering a comprehensive analytical toolkit for the next generation of manufacturing. The framework’s primary contribution is a paradigm shift from optimizing stable, human-in-the-loop systems to managing dynamic, autonomous ecosystems that thrive on volatility and are regenerative by design. This paper provides both a robust theoretical foundation and practical implementation guidance for organizations navigating the transition to Industry 6.0.
(This article belongs to the Section Cognitive System)
Open Access Article
Efficient Time Series Visual Exploration for Insight Discovery
by
Heba Helal and Mohamed A. Sharaf
Big Data Cogn. Comput. 2026, 10(2), 64; https://doi.org/10.3390/bdcc10020064 - 16 Feb 2026
Abstract
Visual exploration of time series data is essential for uncovering meaningful insights in domains such as healthcare monitoring and financial analysis, yet it remains computationally challenging due to the combinatorial explosion of potential subsequence comparisons. For long time series, an exhaustive comparison of all possible subsequence pairs becomes prohibitively expensive, limiting interactive exploration. This paper presents the TiVEx (Time Series Visual Exploration) family of algorithms for efficiently discovering the top-k most dissimilar subsequence pairs in comparative time series analysis. TiVEx achieves scalability through three complementary strategies: TiVEx-sharing exploits computational reuse across overlapping subsequence windows, eliminating redundant distance calculations; TiVEx-pruning employs distance-based upper bounds to eliminate unpromising candidates without exhaustive evaluation; and TiVEx-hybrid integrates both mechanisms to maximize efficiency gains. The key observation is that overlapping subsequences share a substantial computational structure, which can be systematically exploited while maintaining result optimality through provably correct pruning bounds. Extensive experiments on six diverse datasets demonstrate that TiVEx-hybrid achieves up to 84% reduction in distance calculations compared to exhaustive search while producing identical top-k results. Compared to state-of-the-art subsequence comparison methods, TiVEx-hybrid achieves 2.3× improvement in computational efficiency. Our effectiveness analysis confirms that TiVEx achieves result quality within 5% of exhaustive search even when exploring only a subset of candidate positions, enabling scalable visual exploration without compromising insight quality.
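The abstract does not reproduce TiVEx's bounds, but the pruning idea it describes — skip a candidate pair whenever a cheap upper bound on its distance cannot beat the current k-th best — can be sketched as follows. The crude range-based upper bound on Euclidean distance and all parameter names are illustrative assumptions, not TiVEx's actual formulation.

```python
import heapq
import math
import random

def topk_dissimilar_pairs(series, w, k, prune=True):
    """Return the top-k most dissimilar (Euclidean) subsequence pairs of
    length w, plus the number of full distance computations performed."""
    n = len(series) - w + 1
    wins = [series[i:i + w] for i in range(n)]
    # O(n) precompute: per-window range statistics used by the upper bound.
    wmax = [max(win) for win in wins]
    wmin = [min(win) for win in wins]
    heap = []          # min-heap of (distance, i, j): current top-k
    computed = 0
    for i in range(n):
        for j in range(i + 1, n):
            if prune and len(heap) == k:
                # |a_t - b_t| <= max(max_i - min_j, max_j - min_i) elementwise,
                # so sqrt(w) * that range bounds the Euclidean distance above.
                ub = math.sqrt(w) * max(wmax[i] - wmin[j], wmax[j] - wmin[i], 0.0)
                if ub <= heap[0][0]:
                    continue   # provably cannot enter the top-k: skip
            d = math.dist(wins[i], wins[j])
            computed += 1
            if len(heap) < k:
                heapq.heappush(heap, (d, i, j))
            elif d > heap[0][0]:
                heapq.heapreplace(heap, (d, i, j))
    return sorted(heap, reverse=True), computed
```

Because the bound never underestimates the true distance, the pruned search returns exactly the exhaustive top-k while skipping part of the distance computations — the same optimality-preserving property the paper claims for its bounds.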
(This article belongs to the Special Issue Application of Pattern Recognition and Machine Learning)
Open Access Review
Cognitive Assemblages: Living with Algorithms
by
Stéphane Grumbach
Big Data Cogn. Comput. 2026, 10(2), 63; https://doi.org/10.3390/bdcc10020063 - 16 Feb 2026
Abstract
The rapid expansion of algorithmic systems has transformed cognition into an increasingly distributed and collective enterprise, giving rise to what can be described as cognitive assemblages, dynamic constellations of humans, institutions, data infrastructures, and artificial agents. This paper traces the historical and conceptual evolution that has led to this shift. First, we show how cognition, once conceived as the property of autonomous individuals, has progressively become embedded in socio-technical networks in which algorithmic processes participate as co-agents. Second, we revisit the progressive awareness of human cognitive limits, from bounded rationality to contemporary theories of extended mind. These frameworks anticipate and help explain today’s hybrid cognitive ecologies. Third, we assess the philosophical implications for Enlightenment ideals of autonomy, rationality, and self-governance, showing how these concepts must be reinterpreted in light of pervasive algorithmic intermediation. Finally, we examine global initiatives that seek to integrate augmented cognitive capacities into large-scale cybernetic forms of societal coordination, ranging from digital platforms and data spaces to AI-driven governance systems. These developments offer new opportunities for steering complex societies under conditions of globalization, environmental disruption, and the rise of autonomous intelligent systems, yet they also raise profound questions regarding control, accountability, and democratic legitimacy. We argue that understanding cognitive assemblages is essential to designing socio-technical systems capable of supporting collective intelligence while preserving human values in an era of accelerating complexity.
Open Access Article
Skill Classification of Youth Table Tennis Players Using Sensor Fusion and the Random Forest Algorithm
by
Yung-Hoh Sheu, Cheng-Yu Huang, Li-Wei Tai, Tzu-Hsuan Tai and Sheng K. Wu
Big Data Cogn. Comput. 2026, 10(2), 62; https://doi.org/10.3390/bdcc10020062 - 15 Feb 2026
Abstract
This study addresses the issue of inaccurate results in traditional table tennis player classification, which is often influenced by subjective judgment and environmental factors, by proposing a youth table tennis player classification system based on sensor fusion and the random forest algorithm. The system utilizes an embedded intelligent table tennis racket equipped with an ICM20948 nine-axis sensor and a wireless transmission module to capture real-time acceleration and angular velocity data during players’ strokes while synchronously employing a camera with OpenPose to extract joint angle variations. A total of 40 players’ stroke data were collected. Due to the limited sample size of top-tier players, the Synthetic Minority Over-sampling Technique (SMOTE) was applied, resulting in a final dataset of 360 records. Multiple key motion indicators were then computed and stored in a dedicated database. Experimental results showed that the proposed system, powered by the random forest algorithm, achieved a classification accuracy of 91.3% under conventional cross-validation, while subject-independent LOSO validation yielded a more conservative accuracy of 70.89%, making it a valuable reference for coaches and referees in conducting objective player classification. Future work will focus on expanding the dataset of domestic high-performance athletes and integrating precise sports science resources to further enhance the system’s performance and algorithmic models, thereby promoting the scientific selection of national team players and advancing the intelligent development of table tennis.
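For readers unfamiliar with SMOTE, the interpolation step it relies on — expanding 40 players' data to 360 records — can be sketched in a few lines. The neighbour count and the flat tuple representation below are illustrative assumptions, not the study's configuration.

```python
import math
import random

def smote(minority, n_synthetic, k=3, rng=random):
    """Generate synthetic minority-class samples by interpolating each base
    sample toward one of its k nearest minority neighbours (classic SMOTE)."""
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # random position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (q - b) for b, q in zip(base, neighbour)))
    return synthetic
```

Each synthetic point is a convex combination of two real minority samples, so it stays inside the minority class's coordinate ranges rather than being drawn from an arbitrary distribution.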
(This article belongs to the Section Artificial Intelligence and Multi-Agent Systems)
Open Access Article
Underwater Visual-Servo Alignment Control Integrating Geometric Cognition Compensation and Confidence Assessment
by
Jinkun Li, Lingyu Sun, Minglu Zhang and Xinbao Li
Big Data Cogn. Comput. 2026, 10(2), 61; https://doi.org/10.3390/bdcc10020061 - 14 Feb 2026
Abstract
To meet the requirements for the automatic alignment, insertion, and inspection of guide-tube opening pins on the upper core plate in a component pool during refueling outages of nuclear power units, this paper proposes a cognition-enhanced visual-servoing framework that integrates geometric cognition-based compensation, observation-confidence modeling, and constraint-aware optimal control. The framework addresses the key challenge posed by the coexistence of long-term geometric drift and underwater observation uncertainty. Specifically, historical closed-loop data are leveraged to learn and compensate for systematic geometric errors online, substantially improving coarse-positioning accuracy. In addition, an explicit confidence model is introduced to quantitatively assess the reliability of visual measurements. Building on these components, a confidence-driven, finite-horizon, constrained model predictive control strategy is designed to achieve safe and efficient finite-step convergence while strictly respecting actuator physical constraints. Ground experiments and deep-water component-pool validations demonstrate that the proposed method reduces coarse-positioning error by approximately 75%, achieves stable sub-millimeter alignment with an ample engineering safety margin, and effectively decreases erroneous insertions and the need for manual intervention. These results confirm the engineering applicability and safety advantages of the proposed cognition-enhanced visual-servoing framework for underwater alignment tasks in nuclear component pools.
(This article belongs to the Special Issue Field Robotics and Artificial Intelligence (AI))
Open Access Article
Reliability of LLM Inference Engines from a Static Perspective: Root Cause Analysis and Repair Suggestion via Natural Language Reports
by
Hongwei Li and Yongjun Wang
Big Data Cogn. Comput. 2026, 10(2), 60; https://doi.org/10.3390/bdcc10020060 - 13 Feb 2026
Abstract
Large Language Model (LLM) inference engines are becoming critical system infrastructure, yet their increasing architectural complexity makes defects difficult to diagnose and repair. Existing reliability studies predominantly focus on model behavior or training frameworks, leaving inference engine bugs underexplored, especially in settings where execution-based debugging is impractical. We present a static, issue-centric approach for automated root cause analysis and repair suggestion generation for LLM inference engines. Based solely on issue reports and developer discussions, we construct a real-world defect dataset and annotate each issue with a semantic root cause category and affected system module. Leveraging text-based representations, our framework performs root cause classification and coarse-grained module localization without requiring code execution or specialized runtime environments. We further integrate structured repair patterns with a large language model to generate interpretable and actionable repair suggestions. Experiments on real-world vLLM issues demonstrate that our approach achieves effective root cause identification and module localization under limited and imbalanced data. A cross-engine evaluation further shows promising generalization to TensorRT-LLM. Human evaluation confirms that the generated repair suggestions are correct, useful, and clearly expressed. Our results indicate that static, issue-level analysis is a viable foundation for scalable debugging assistance in LLM inference engines. This work highlights the feasibility of static, issue-level defect analysis for complex LLM inference engines and explores automated debugging assistance techniques. The dataset and implementation will be publicly released to facilitate future research.
(This article belongs to the Special Issue Advanced Software and Machine Learning Techniques for System Architectures and Big Data)
Open Access Article
PLTA-FinBERT: Pseudo-Label Generation-Based Test-Time Adaptation for Financial Sentiment Analysis
by
Hai Yang, Hainan Chen, Chang Jiang, Juntao He and Pengyang Li
Big Data Cogn. Comput. 2026, 10(2), 59; https://doi.org/10.3390/bdcc10020059 - 11 Feb 2026
Abstract
Financial sentiment analysis leverages natural language processing techniques to quantitatively assess sentiment polarity and emotional tendencies in financial texts. Its practical application in investment decision-making and risk management faces two major challenges: the scarcity of high-quality labeled data due to expert annotation costs, and semantic drift caused by the continuous evolution of market language. To address these issues, this study proposes PLTA-FinBERT, a pseudo-label generation-based test-time adaptation framework that enables dynamic self-learning without requiring additional labeled data. The framework consists of two modules: a multi-perturbation pseudo-label generation mechanism that enhances label reliability through consistency voting and confidence-based filtering, and a test-time dynamic adaptation strategy that iteratively updates model parameters based on high-confidence pseudo-labels, allowing the model to continuously adapt to new linguistic patterns. PLTA-FinBERT achieves 0.8288 accuracy on the sentiment classification dataset of financial sentiment analysis, representing an absolute improvement of 2.37 percentage points over the benchmark. On the FiQA sentiment intensity prediction task, it obtains an of 0.58, surpassing the previous state-of-the-art by 3 percentage points.
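The consistency-voting and confidence-filtering mechanism described above can be sketched as follows. The agreement and confidence thresholds are illustrative assumptions, not the values used in PLTA-FinBERT.

```python
from collections import Counter

def pseudo_label(view_predictions, min_agreement=0.6, min_confidence=0.7):
    """Accept a pseudo-label only if enough perturbed views agree on it and
    the agreeing views are confident on average; otherwise return None.

    `view_predictions` is a list of (label, confidence) pairs, one per
    perturbed copy of the same input text.
    """
    votes = Counter(label for label, _ in view_predictions)
    label, count = votes.most_common(1)[0]
    if count / len(view_predictions) < min_agreement:
        return None  # views disagree: sample too unreliable to self-train on
    confs = [c for lbl, c in view_predictions if lbl == label]
    if sum(confs) / len(confs) < min_confidence:
        return None  # views agree but without confidence
    return label
```

Only samples that pass both filters would then feed the test-time parameter updates, which is what keeps self-training from amplifying its own mistakes.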
Open Access Article
Bias Correction and Explainability Framework for Large Language Models: A Knowledge-Driven Approach
by
Xianming Yang, Qi Li, Chengdong Qian, Haitao Wang, Yonghui Wu and Wei Wang
Big Data Cogn. Comput. 2026, 10(2), 58; https://doi.org/10.3390/bdcc10020058 - 10 Feb 2026
Abstract
Large Language Models (LLMs) have demonstrated extraordinary capabilities in natural language generation; however, their real-world deployment is frequently hindered by the generation of factually incorrect or biased content, along with an inherent deficiency in transparency. To address these critical limitations and thereby enhance the reliability and explainability of LLM outputs, this study proposes a novel integrated framework, namely the Adaptive Knowledge-Driven Correction Network (AKDC-Net), which incorporates three core algorithmic innovations. Firstly, the Hierarchical Uncertainty-Aware Bias Detector (HUABD) performs multi-level linguistic analysis (lexical, syntactic, semantic, and pragmatic) and, for the first time, decomposes predictive uncertainty into epistemic and aleatoric components. This decomposition enables principled, interpretable bias detection with clear theoretical underpinnings. Secondly, the Neural-Symbolic Knowledge Graph Enhanced Corrector (NSKGEC) integrates a temporal graph neural network with a differentiable symbolic reasoning module, facilitating logically consistent and factually grounded corrections based on dynamically updated knowledge sources. Thirdly, the Contrastive Learning-driven Multimodal Explanation Generator (CLMEG) leverages a cross-modal attention mechanism within a contrastive learning paradigm to generate coherent, high-quality textual and visual explanations that enhance the interpretability of LLM outputs. Extensive evaluations were conducted on a challenging medical domain dataset to validate the effectiveness of the proposed AKDC-Net framework. Experimental results demonstrate significant improvements over state-of-the-art baselines: specifically, a 14.1% increase in the F1-score for bias detection, a 19.4% enhancement in correction quality, and a 31.4% rise in user trust scores. 
These findings establish a new benchmark for the development of more trustworthy and transparent artificial intelligence (AI) systems, laying a solid foundation for the broader and more reliable application of LLMs in high-stakes domains.
(This article belongs to the Special Issue Enhancement Optimization Techniques on Large Language Model)
Open Access Article
Enhancing the Artificial Rabbit Optimizer Using Fuzzy Rule Interpolation
by
Mohammad Almseidin
Big Data Cogn. Comput. 2026, 10(2), 57; https://doi.org/10.3390/bdcc10020057 - 10 Feb 2026
Abstract
Metaheuristic optimization algorithms have demonstrated their effectiveness in solving complex optimization tasks, such as those related to Intrusion Detection Systems (IDSs). They have been widely used to enhance the detection rate of various types of cyber attacks by reducing the feature space or tuning a model's hyperparameters. The Artificial Rabbit Optimizer (ARO) mimics rabbits' intelligent foraging and hiding behavior, and has seen widespread adoption in the optimization field thanks to its simple design and ease of implementation. However, ARO can get trapped in local optima due to its limited diversity in population dynamics. Although the transition between phases is managed via an energy shrink factor, fine-tuning this balance remains challenging and largely unexplored. These limitations could reduce the ARO algorithm's effectiveness in high-dimensional spaces such as those found in IDSs. This paper introduces a novel enhancement of the original ARO that integrates Fuzzy Rule Interpolation (FRI) to dynamically compute the energy factor during the optimization process. In this work, we integrate FRI with the ARO algorithm to improve solution accuracy, maintain population diversity, and accelerate convergence, particularly in high-dimensional and complex problems such as IDS. The integration of FRI and ARO aims to control the exploration-exploitation balance in the IDS application area. To validate the proposed hybrid approach, we tested it on a diverse set of eight benchmark intrusion detection datasets. The approach proved effective across a range of intrusion classification tasks: for binary classification it achieved accuracy rates ranging from 96% to 99.9%, while for multiclass classification the accuracy was slightly more consistent, falling between 91.6% and 98.9%. The suggested approach also effectively reduced the feature space, achieving reduction rates from 56% up to 96%. Furthermore, the proposed approach outperformed other state-of-the-art methods in terms of detection rate.
Open Access Article
ISFJ-RAG: Interventional Suppression of Hallucinations via Counter-Factual Joint Decoding Retrieval-Augment Generation
by
Yuezhao Liu, Wei Li, Yijie Wang, Ningtong Chen and Min Chen
Big Data Cogn. Comput. 2026, 10(2), 56; https://doi.org/10.3390/bdcc10020056 - 9 Feb 2026
Abstract
Although retrieval-augmented generation (RAG) technology mitigates the hallucination issue in large language models (LLMs) by incorporating external knowledge, and combining reasoning models can further enhance RAG system performance, retrieval noise and attention bias still lead to the diffusion of factual errors in problems such as factual queries, multi-hop questions, and unanswerable questions. Existing methods struggle to effectively suppress “high-confidence hallucinations” in long-chain reasoning due to their failure to decouple knowledge bias effects from causal reasoning paths. To address this, this paper proposes the ISFJ-RAG framework, which dynamically intervenes in hallucinations through counterfactual joint decoding. First, a structural causal model (SCM) reveals three root causes of hallucinations in RAG systems: irrelevant knowledge interference, reasoning path bias, and spurious correlations in self-attention mechanisms. A dual-decoder architecture is further designed: the total causal effect decoder models the global relationship between user queries and knowledge, while the knowledge bias effect decoder captures potential biases induced by external knowledge. A dynamic modulation module converts the latter’s output into a proxy measure of hallucination bias. By computing individual treatment effects (ITEs), the bias component is removed from the full generation distribution, achieving simultaneous suppression of knowledge-irrelevant and reasoning-irrelevant hallucinations. Ablation experiments validate the robustness of average token log-probability as a confidence metric. Experiments demonstrate that on the RAGEval benchmark, ISFJ-RAG improves generation completeness to 86.89% (+5.49%) while reducing hallucination rates to 10.39% (−2.5%) and irrelevance rates to 4.44% (−2.99%).
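The core debiasing step — removing the knowledge-bias decoder's contribution from the full generation distribution before sampling — resembles contrastive decoding and can be sketched as below. The scalar `strength` stands in for the dynamic modulation module and is an assumption, not part of ISFJ-RAG.

```python
import math

def debias_logits(total_logits, bias_logits, strength=1.0):
    """Subtract scaled bias logits from the total-effect logits, then
    renormalize with a numerically stable softmax. Generic sketch of a
    contrastive-decoding-style correction, not the ISFJ-RAG implementation."""
    adjusted = [t - strength * b for t, b in zip(total_logits, bias_logits)]
    m = max(adjusted)                       # shift for numerical stability
    exps = [math.exp(a - m) for a in adjusted]
    z = sum(exps)
    return [e / z for e in exps]
```

When the bias decoder assigns high probability to a token only because of retrieved-but-irrelevant knowledge, the subtraction demotes that token, shifting mass toward tokens supported by the query itself.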
Open Access Review
What Distinguishes AI-Generated from Human Writing? A Rapid Review of the Literature
by
Georgios P. Georgiou
Big Data Cogn. Comput. 2026, 10(2), 55; https://doi.org/10.3390/bdcc10020055 - 8 Feb 2026
Abstract
Large language models (LLMs) are now routine writing tools across various domains, intensifying questions about when text should be treated as human-authored, artificial intelligence (AI)-generated, or collaboratively produced. This rapid review aims to identify cue families reported in empirical studies as distinguishing AI from human-authored text and to assess how stable these cues are across genres/tasks, text lengths, and revision conditions. Following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, we searched four online databases for peer-reviewed empirical articles (1 January 2022–1 January 2026). After deduplication and screening, 40 studies were included. Evidence converged on five cue families: surface, discourse/pragmatic, epistemic/content, predictability/probabilistic, and provenance. Surface cues dominated the literature and were the most consistently operationalized. Discourse/pragmatic cues followed, particularly in discipline-bound academic genres where stance and metadiscourse differentiated AI from human writing. Predictability/probabilistic cues were central in detector-focused studies, while epistemic/content cues emerged primarily in tasks where grounding and authenticity were salient. Provenance cues were concentrated in watermarking research. Across studies, cue stability was consistently conditional rather than universal. Specifically, surface and discourse cues often remained discriminative within constrained genres, but shifted with register and discipline; probabilistic cues were powerful yet fragile under paraphrasing, post-editing, and evasion; and provenance signals required robustness to editing, mixing, and span localization. Overall, the literature indicates that AI–human distinction emerges from layered and context-dependent cue profiles rather than from any single reliable marker. High-stakes decisions, therefore, require condition-aware interpretation, triangulation across multiple cue families, and human oversight rather than automated classification in isolation.
(This article belongs to the Special Issue Machine Learning Applications in Natural Language Processing)
Open Access Article
Research on Modeling Method of eLoran Signal Propagation Delay Prediction Model: Integrating Path-Weighted Meteorological Data and Propagation Delay Data in Long-Distance Scenarios
by
Tao Jin, Shiyao Liu, Baorong Yan, Xiang Jiang, Wei Guo, Yu Hua, Shougang Zhang and Lu Xu
Big Data Cogn. Comput. 2026, 10(2), 54; https://doi.org/10.3390/bdcc10020054 - 7 Feb 2026
Abstract
The enhanced long-range navigation (eLoran) system serves as an important backup for the global navigation satellite system (GNSS). In long-distance transmission scenarios, the signal propagation delay of the eLoran system is affected by fluctuations in meteorological factors along the path. To address issues such as the potential timing errors caused by meteorological factors and the resulting limits on timing accuracy, this paper proposes an innovative prediction model that predicts propagation delay by fusing the propagation delay data of multiple differential reference stations along the path with path-weighted meteorological data. By collecting and processing actual data, four types of prediction tasks were designed, and the prediction performance of eight common models was compared on a unified dataset. The results show that the Pucheng–Zhengzhou path-weighted ten-factor back-propagation neural network (PZWT-BPNN) model performs best, balancing prediction accuracy and training efficiency. The model effectively suppresses timing errors caused by meteorological fluctuations and improves the accuracy of the system's propagation delay predictions, providing technical support for key fields such as the low-altitude economy and transportation.
Full article

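At its core, the PZWT-BPNN model described above is a back-propagation neural network regressor over path-weighted meteorological factors. The following is a minimal sketch of such a regressor; the data, layer size, learning rate, and feature count are hypothetical stand-ins, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training set: 200 samples of 10 path-weighted meteorological
# factors (e.g. temperature, pressure, humidity along the path) and a
# synthetic propagation-delay target. Real inputs would come from the
# differential reference stations along the transmission path.
X = rng.normal(size=(200, 10))
true_w = rng.normal(size=10)
y = X @ true_w + 0.1 * rng.normal(size=200)  # delay in arbitrary units

# One-hidden-layer back-propagation neural network (BPNN) for regression.
H = 16                                        # hidden units (assumed)
W1 = rng.normal(scale=0.3, size=(10, H))
b1 = np.zeros(H)
W2 = rng.normal(scale=0.3, size=H)
b2 = 0.0
lr = 0.01

def forward(X):
    h = np.tanh(X @ W1 + b1)
    return h, h @ W2 + b2

_, p0 = forward(X)
mse0 = float(np.mean((p0 - y) ** 2))          # error before training

for epoch in range(500):
    h, pred = forward(X)
    err = pred - y
    # Backward pass: mean-squared-error gradients through the two layers.
    gW2 = h.T @ err / len(y)
    gb2 = err.mean()
    gh = np.outer(err, W2) * (1 - h ** 2)     # gradient through tanh
    gW1 = X.T @ gh / len(y)
    gb1 = gh.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

_, pred = forward(X)
mse = float(np.mean((pred - y) ** 2))         # error after training
```

The paper's model additionally weights each meteorological factor by its position along the propagation path before feeding it to the network; that weighting scheme is not reproduced here.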
Open Access Article
Modern ICT Tools and Video Content in Athletes’ Education—Inspiration from Corporate Learning and Development
by
Martin Mičiak, Dominika Toman, Milan Kubina, Tatiana Poljaková, Klaudia Ivanovič, Kvetoslava Šimová, Anna Majchráková, Ivana Bystrická, Linda Kováčik and Tibor Furmánek
Big Data Cogn. Comput. 2026, 10(2), 53; https://doi.org/10.3390/bdcc10020053 - 6 Feb 2026
Abstract
Active athletes are a specific target group for learning and development. Their schedules, filled with training sessions and competitions, leave little time for education, yet athletes still need skills beyond sport to be prepared for future employment. Our study approaches this issue by identifying appropriate settings for athletes’ learning and development. (1) Against the background of current athletes’ education, it addresses the gap of insufficient attention to practices transferable from corporate learning and development. (2) The study’s methodology is primarily a case study, which presents the video content we created for athletes’ learning and development. This is combined with content analysis of selected examples from corporate learning and development and a design thinking workshop engaging key stakeholder groups: athletes (2 participants), lecturers (2 participants), and representatives of sports organizations (1 participant). The other 9 workshop participants were master’s students in a managerial study programme, chosen for their age similarity to current athletes and the applicability of their coursework to athletes’ education. (3) The designed process was created as a digital twin using haptic artefacts and S2M technology (version 1.0) within the OMiLAB platform (version 1.6). Our results show that video content tailored to athletes’ constraints is a viable solution that improves their career prospects. (4) The study’s practical implications are supported by expert validation of the model from within the management of large sports organizations.
Full article

Open Access Article
CCTD-MARL: Coupled Communication-Task Decoupling Framework for Multi-Agent Systems Under Partial Observability
by
Kehan Li, Zhenya Wang, Xin Tang, Heng You, Long Hu, Haidong Xie and Min Chen
Big Data Cogn. Comput. 2026, 10(2), 52; https://doi.org/10.3390/bdcc10020052 - 5 Feb 2026
Abstract
Although multi-agent reinforcement learning (MARL) has achieved significant success in various domains, its deployment in real-world scenarios remains challenging, particularly in communication-constrained environments involving multi-task coupling. Existing methods suffer from two limitations: (1) the inability to effectively integrate and process incomplete state information from disparate agents, and (2) a lack of robust mechanisms for handling complex multi-task coupling. To address these challenges, we propose the Coupled Communication-Task Decoupling (CCTD) framework. CCTD introduces two critical innovations: first, a distributed state compensation mechanism that processes historical data to reconstruct accurate global states from partial observations; second, a hierarchical architecture that systematically decomposes complex tasks into manageable subtasks while preserving their interdependencies. Thanks to its modular design, CCTD integrates with existing MARL algorithms and allows flexible combinations of subtasks. Extensive experiments demonstrate that CCTD outperforms baseline methods, achieving a 10% improvement in communication reception rate and superior performance across all subtasks in multi-task environments.
Full article

Open Access Article
Hybrid Method of Organizing Information Search in Logistics Systems Based on Vector-Graph Structure and Large Language Models
by
Vadim Voloshchuk, Yaroslav Melnik, Irina Safronenkova, Egor Lishchenko, Oleg Kartashov and Alexander Kozlovskiy
Big Data Cogn. Comput. 2026, 10(2), 51; https://doi.org/10.3390/bdcc10020051 - 5 Feb 2026
Abstract
In logistics systems, the organization of information retrieval plays a key role in human interaction with technical systems, underpinning decision-making speed, route optimization, planning, and resource allocation. At the same time, when a logistics system must simultaneously process large and constantly updated volumes of data, its efficiency is determined by the speed with which user requests are processed and the accuracy of the responses the system provides. Within a retrieval-augmented generation architecture, a hybrid information retrieval method is proposed, based on the combined use of a vector-graph data representation structure and large language models. Experiments showed that the hybrid method achieved the best accuracy rates (0.24–0.25) among all considered methods, with enhanced scalability (when the number of nodes increases fourfold, the time increases only twofold, from 0.09 s to 0.20 s) due to limiting the graph traversal area in the graph component of the hybrid search. An optimal range of 30–50 traversed nodes was also identified, balancing precision and query processing speed. The findings are of practical value to logistics system developers and supply chain managers aiming to implement high-precision, natural language-based information retrieval in dynamic operational environments.
Full article

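The scalability result in this abstract hinges on bounding the graph traversal area after a vector-similarity stage selects seed nodes. The following is a minimal sketch of that two-stage hybrid retrieval; the toy embeddings, graph, and node names are invented for illustration, and the LLM answer-generation stage is omitted.

```python
import heapq
from math import sqrt

# Hypothetical toy data: node embeddings and a directed relation graph.
EMB = {
    "warehouse_A": [0.9, 0.1, 0.0],
    "route_17":    [0.7, 0.6, 0.1],
    "truck_42":    [0.1, 0.9, 0.2],
    "depot_B":     [0.2, 0.2, 0.9],
}
GRAPH = {
    "warehouse_A": ["route_17", "depot_B"],
    "route_17":    ["truck_42"],
    "truck_42":    [],
    "depot_B":     ["route_17"],
}

def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    return num / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def hybrid_search(query_emb, k_seeds=2, node_budget=50):
    """Vector stage picks seed nodes; graph stage expands them breadth-first
    but visits at most `node_budget` nodes -- the traversal-area limit that
    the abstract credits for the method's scalability."""
    seeds = heapq.nlargest(k_seeds, EMB, key=lambda n: cosine(query_emb, EMB[n]))
    visited, order, frontier = set(seeds), list(seeds), list(seeds)
    while frontier and len(visited) < node_budget:
        nxt = []
        for node in frontier:
            for nb in GRAPH.get(node, []):
                if nb not in visited and len(visited) < node_budget:
                    visited.add(nb)
                    order.append(nb)
                    nxt.append(nb)
        frontier = nxt
    # Rank the collected subgraph by vector similarity to the query.
    return sorted(order, key=lambda n: cosine(query_emb, EMB[n]), reverse=True)

results = hybrid_search([1.0, 0.2, 0.0])
```

Capping `node_budget` (the paper finds 30–50 optimal) is what keeps query time growing sublinearly as the graph grows: the traversal cost depends on the budget, not on total graph size.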
Open Access Article
Improving Transferability of Adversarial Attacks via Maximization and Targeting from Image to Video Quality Assessment
by
Georgii Gotin, Ekaterina Shumitskaya, Dmitriy Vatolin and Anastasia Antsiferova
Big Data Cogn. Comput. 2026, 10(2), 50; https://doi.org/10.3390/bdcc10020050 - 5 Feb 2026
Abstract
This paper proposes a novel method for transferable adversarial attacks from Image Quality Assessment (IQA) to Video Quality Assessment (VQA) models. Attacking modern VQA models is challenging due to their high complexity and the temporal nature of video content. Since IQA and VQA models share similar low- and mid-level feature representations, and IQA models are substantially cheaper and faster to run, we leverage them as surrogates to generate transferable adversarial perturbations. Our method, MaxT-I2VQA, jointly Maximizes IQA scores and Targets IQA feature activations to improve transferability from IQA to VQA models. We first analyze the correlation between IQA and VQA internal features and use these insights to design a feature-targeting loss. We evaluate MaxT-I2VQA by transferring attacks from four state-of-the-art IQA models to four recent VQA models and compare against three competitive baselines. Compared to prior methods, MaxT-I2VQA increases the attack success rate under transfer by 7.9% and reduces per-example attack runtime by a factor of 8. Our experiments confirm that IQA and VQA feature spaces are sufficiently aligned to enable effective cross-task transfer.
Full article

Open Access Article
SiAraSent: From Features to Deep Transformers for Large-Scale Arabic Sentiment Analysis
by
Omar Almousa, Yahya Tashtoush, Anas AlSobeh, Plamen Zahariev and Omar Darwish
Big Data Cogn. Comput. 2026, 10(2), 49; https://doi.org/10.3390/bdcc10020049 - 3 Feb 2026
Abstract
Sentiment analysis of Arabic text, particularly on social media platforms, presents a formidable set of challenges stemming from the language’s complex morphology, its numerous dialectal variations, and the frequent and nuanced use of emojis to convey emotional context. This paper presents SiAraSent, a hybrid framework that integrates traditional text representations, emoji-aware features, and deep contextual embeddings based on Arabic transformers. Starting from a strong and fully interpretable baseline built on Term Frequency–Inverse Document Frequency (TF–IDF)-weighted character and word N-grams combined with emoji embeddings, we progressively incorporate SinaTools for linguistically informed preprocessing and AraBERT for contextualized encodings. The framework is evaluated on a large-scale dataset of 58,751 Arabic tweets labeled for sentiment polarity. We evaluate four experimental configurations: (1) a baseline traditional machine learning architecture that employs TF-IDF, N-grams, and emoji features with a Support Vector Machine (SVM) classifier; (2) a Large Language Model (LLM) feature extraction approach that leverages deep contextual embeddings from the pre-trained AraBERT model; (3) a novel hybrid fusion model that concatenates traditional morphological features, AraBERT embeddings, and emoji-based features into a high-dimensional vector; and (4) a fully fine-tuned AraBERT model specifically adapted for the sentiment classification task. Our experiments demonstrate the efficacy of the proposed framework, with the fine-tuned AraBERT architecture achieving an accuracy of 93.45%, a significant 10.89% improvement over the best traditional baseline.
Full article
(This article belongs to the Special Issue Advances in Natural Language Processing and Text Mining: 2nd Edition)
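The baseline configuration above combines TF-IDF-weighted n-grams with emoji features. The following is a minimal sketch of that feature construction using character n-grams only; the toy "tweets", the 3-gram order, and the plain emoji-count features are illustrative assumptions (the paper additionally uses word n-grams, emoji embeddings, SinaTools preprocessing, and an SVM classifier, all omitted here).

```python
import math
from collections import Counter

# Toy corpus standing in for Arabic tweets (hypothetical examples).
DOCS = ["خدمة رائعة 😍", "تجربة سيئة 😡", "منتج رائع 😍😍"]
EMOJIS = {"😍", "😡"}

def char_ngrams(text, n=3):
    """All overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def tfidf_vectors(docs, n=3):
    """TF-IDF over character n-grams, plus one emoji-count feature per emoji
    appended after the textual features."""
    grams = [Counter(char_ngrams(d, n)) for d in docs]
    vocab = sorted(set().union(*grams))
    df = {g: sum(1 for c in grams if g in c) for g in vocab}  # document freq.
    N = len(docs)
    vecs = []
    for doc, counts in zip(docs, grams):
        total = sum(counts.values()) or 1
        row = [(counts[g] / total) * math.log(N / df[g]) for g in vocab]
        row += [doc.count(e) for e in sorted(EMOJIS)]  # emoji-aware features
        vecs.append(row)
    return vocab, vecs

vocab, vecs = tfidf_vectors(DOCS)
```

Character n-grams are a common choice for morphologically rich languages such as Arabic because they capture sub-word patterns without requiring a stemmer.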
Open Access Article
AMPS: A Direction-Aware Adaptive Multi-Scale Potential Model for Link Prediction in Complex Networks
by
Xinghua Qin, Sizheng Liu, Mengmeng Zhang, Jun Tang and Yirun Ruan
Big Data Cogn. Comput. 2026, 10(2), 48; https://doi.org/10.3390/bdcc10020048 - 3 Feb 2026
Abstract
To overcome the limitations of current link prediction methods in effectively leveraging topological information and node importance, this paper introduces a new model called AMPS (Adaptive Multi-scale Potential-enhanced Path Similarity). The model is built on a hierarchical structure that captures both global network topology and local interaction patterns, with full compatibility for directed and undirected networks. This is achieved through a process that quantifies node potential fields, enhances multi-scale similarity, and fuses information across scales. Specifically, we define three types of potential field models (global, local, and k-hop) to flexibly measure node importance. We also introduce two complementary prediction modules: an enhanced common neighbor matrix (PCN), which uses potential fields to refine local structural details, and a feature-weighted generalized path similarity (GLP), which integrates node importance into path evaluation. The final similarity score is obtained by adaptively combining the outputs of PCN and GLP. Experiments on 12 undirected and 9 directed datasets demonstrate that AMPS significantly outperforms mainstream algorithms on the AUC metric. It also exhibits strong robustness under varying training set ratios, maintaining stable advantages in both directed and undirected scenarios. This framework provides a physically intuitive, topology-aware, and high-precision solution for link prediction across various types of networks.
Full article

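The PCN module described above weights common neighbors by a node potential instead of counting them flatly. The following is a minimal sketch of that idea on a toy undirected graph; using plain degree as the potential is an assumption for illustration (the paper defines global, local, and k-hop potential fields), and the adaptive fusion with GLP is not shown.

```python
from itertools import combinations

# Toy undirected graph as symmetric adjacency sets (hypothetical).
ADJ = {
    1: {2, 3, 4},
    2: {1, 3},
    3: {1, 2, 4},
    4: {1, 3, 5},
    5: {4},
}

def potential(node):
    """Stand-in potential: node degree (simplest proxy for node importance)."""
    return len(ADJ[node])

def pcn_score(u, v):
    """Potential-weighted common-neighbor similarity: each shared neighbor
    contributes its potential rather than a flat count of 1."""
    return sum(potential(z) for z in ADJ[u] & ADJ[v])

# Score every non-edge pair; a higher score means a more likely future link.
pairs = [(u, v) for u, v in combinations(sorted(ADJ), 2) if v not in ADJ[u]]
ranked = sorted(pairs, key=lambda p: pcn_score(*p), reverse=True)
```

On this toy graph the pair (2, 4) ranks first: its two shared neighbors (nodes 1 and 3) are both high-degree, so they contribute more than the single shared neighbor of the other candidate pairs.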
Open Access Article
Lithology Identification from Well Logs via Meta-Information Tensors and Quality-Aware Weighting
by
Wenxuan Chen, Guoyun Zhong, Fan Diao, Peng Ding and Jianfeng He
Big Data Cogn. Comput. 2026, 10(2), 47; https://doi.org/10.3390/bdcc10020047 - 2 Feb 2026
Abstract
In practical well-logging datasets, severe missing values, anomalous disturbances, and highly imbalanced lithology classes are pervasive. To address these challenges, this study proposes a well-logging lithology identification framework that combines Robust Feature Engineering (RFE) with quality-aware XGBoost. Instead of relying on interpolation-based data cleaning, RFE uses sentinel values and a meta-information tensor to explicitly encode patterns of missingness and anomalies, and incorporates sliding-window context to transform data defects into discriminative auxiliary features. In parallel, a quality-aware sample-weighting strategy is introduced that jointly accounts for formation boundary locations and label confidence, thereby mitigating training bias induced by long-tailed class distributions. Experiments on the FORCE 2020 lithology prediction dataset demonstrate that, relative to baseline models, the proposed method improves the weighted F1 score from 0.66 to 0.73, while Boundary F1 and the geological penalty score are also consistently enhanced. These results indicate that, compared with traditional workflows that rely solely on data cleaning, explicit modeling of data incompleteness provides more pronounced advantages in terms of robustness and engineering applicability.
Full article
(This article belongs to the Section Data Mining and Machine Learning)
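The RFE strategy above replaces interpolation with sentinel values, an explicit missingness channel, and quality-aware sample weights. The following is a minimal sketch of that preprocessing step; the sentinel value, toy log readings, and the linear weighting rule are illustrative assumptions (the paper's weighting also accounts for formation boundaries and label confidence, and the outputs would feed an XGBoost classifier).

```python
SENTINEL = -999.0  # assumed sentinel for missing log readings

# Toy well-log rows: (gamma_ray, resistivity, density); None marks missing.
ROWS = [
    (60.0, 1.2, 2.4),
    (None, 0.9, 2.5),
    (80.0, None, None),
]

def robust_features(rows):
    """Return (features, meta_mask, weights).

    features: sentinel-filled values;
    meta_mask: 1 where a value was missing (the 'meta-information' channel,
               letting the model learn from missingness patterns);
    weights:   quality-aware sample weights that shrink as the fraction of
               defective entries in a row grows."""
    feats, masks, weights = [], [], []
    for row in rows:
        mask = [1 if v is None else 0 for v in row]
        feats.append([SENTINEL if v is None else v for v in row])
        masks.append(mask)
        weights.append(1.0 - sum(mask) / len(row))  # e.g. 2/3 missing -> 1/3
    return feats, masks, weights

feats, masks, weights = robust_features(ROWS)
```

The key design choice is that missingness is encoded as a feature rather than erased by imputation, so a systematic gap in one log (itself often geologically informative) remains visible to the model.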
Topics
Topic in
AI, BDCC, Fire, GeoHazards, Remote Sensing
AI for Natural Disasters Detection, Prediction and Modeling
Topic Editors: Moulay A. Akhloufi, Mozhdeh Shahbazi
Deadline: 31 March 2026
Topic in
Applied Sciences, Electronics, J. Imaging, MAKE, Information, BDCC, Signals
Applications of Image and Video Processing in Medical Imaging
Topic Editors: Jyh-Cheng Chen, Kuangyu Shi
Deadline: 30 April 2026
Topic in
Actuators, Algorithms, BDCC, Future Internet, JMMP, Machines, Robotics, Systems
Smart Product Design and Manufacturing on Industrial Internet
Topic Editors: Pingyu Jiang, Jihong Liu, Ying Liu, Jihong Yan
Deadline: 30 June 2026
Topic in
Sensors, Electronics, Technologies, AI, Entropy, Quantum Reports, BDCC
Responsible Classic/Quantum AI Technologies for Industrial Applications
Topic Editors: Youyang Qu, Khandakar Ahmed, Zhiyi Tian
Deadline: 31 July 2026
Special Issues
Special Issue in
BDCC
Field Robotics and Artificial Intelligence (AI)
Guest Editors: Robert Ross, Alex Stumpf
Deadline: 27 February 2026
Special Issue in
BDCC
Evolutionary Computation and Artificial Intelligence: Building a Sustainable Future for Smart Cities
Guest Editors: Changjun Zhou, Zhichao Pan
Deadline: 28 February 2026
Special Issue in
BDCC
Deep Network Learning and Its Applications: 2nd Edition
Guest Editors: Guarino Alfonso, Rocco Zaccagnino, Emiliano Del Gobbo
Deadline: 28 February 2026
Special Issue in
BDCC
Application of Semantic Technologies in Intelligent Environment
Guest Editors: Maria Nisheva-Pavlova, Galia Angelova, Moulay A. Akhloufi
Deadline: 28 February 2026




