Search Results (2,994)

Search Parameters:
Keywords = large benchmark

19 pages, 1427 KB  
Article
Federated Deep Reinforcement Learning for Energy Scheduling in Privacy-Sensitive PV-EV Charging Networks
by Yongguang Zhao, Xinni Li, Yongqing Zheng and Wei Guo
Electronics 2026, 15(5), 1012; https://doi.org/10.3390/electronics15051012 (registering DOI) - 28 Feb 2026
Abstract
The large-scale adoption of electric vehicles (EVs) improves transport sustainability but creates severe peak-time stress on distribution grids. In PV-assisted charging networks, station operators must jointly decide retail charging prices and energy-storage dispatch under uncertain demand and generation conditions. This paper develops a distributed federated deep reinforcement learning framework for multi-station scheduling, where each station trains a local soft actor–critic (SAC) policy and only model parameters are exchanged with a global aggregator. To better adapt prices to local supply–demand conditions, we introduce a sales-factor-based correction mechanism that links the announced price to demand pressure and storage status. The objective combines station revenue, operating expenses, and user-discomfort-related penalties under operational constraints. Simulation results on a five-station setting show stable convergence and consistent gains over benchmark methods, with profit improvements of 3.90–39.00%. The framework keeps raw operational data local and supports collaborative optimization across stations. Full article
(This article belongs to the Special Issue Deep Learning and Advanced Machine Learning for Energy Forecasting)
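The federated scheme described in this abstract exchanges only model parameters with a global aggregator. As a minimal sketch of what such an aggregation step can look like, here is plain FedAvg-style weighted averaging in Python; note this is an illustrative assumption, since the abstract does not specify the paper's exact aggregation rule, and the names `federated_average` and `station_params` are hypothetical:

```python
import numpy as np

def federated_average(station_params, weights=None):
    """Average per-station parameter sets into a global model.

    station_params: list of dicts mapping layer name -> np.ndarray.
    weights: optional per-station weights (e.g. local data volume);
             uniform averaging is used if None.
    """
    n = len(station_params)
    if weights is None:
        weights = [1.0 / n] * n
    global_params = {}
    for name in station_params[0]:
        # Weighted sum of each station's copy of this parameter tensor.
        global_params[name] = sum(w * p[name] for w, p in zip(weights, station_params))
    return global_params

# Each station trains locally (e.g. a SAC policy) and shares only parameters.
stations = [{"w": np.array([1.0, 2.0])}, {"w": np.array([3.0, 4.0])}]
print(federated_average(stations)["w"])  # uniform average of the two stations
```

Raw operational data never leaves a station under this pattern; only the parameter dictionaries cross the network.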

31 pages, 822 KB  
Review
A Review of Multi-Agent AI Systems for Biological and Clinical Data Analysis
by Jackson Spieser, Ali Balapour, Jarek Meller, Krushna C. Patra and Behrouz Shamsaei
Methods Protoc. 2026, 9(2), 33; https://doi.org/10.3390/mps9020033 (registering DOI) - 28 Feb 2026
Abstract
This review evaluates the emerging paradigm of multi-agent systems (MASs) for biomedical and clinical data analysis, focusing on their ability to overcome the reasoning and reliability limitations of standalone large language models (LLMs). We synthesize findings from recent architectural frameworks, specifically LangGraph, CrewAI, and the Model Context Protocol (MCP), to examine how specialized agent teams divide labor, utilize precision tools, and cross-verify outputs. We find that MAS architectures yield significant performance gains in various domains: recent implementations improved oncology decision-making accuracy from 30.3% to 87.2% and reached a peak of 93.2% accuracy on USMLE-style benchmarks through simulated clinical evolution. In clinical trial matching, multi-agent frameworks achieved 87.3% accuracy and enhanced clinician screening efficiency by 42.6% (p < 0.001). However, we also highlight critical operational challenges, including an unreliability tax of 15–50× higher token consumption compared to standalone models and the risk of cascading errors where initial hallucinations are amplified across the agent collective. We conclude that while MAS enables a shift toward collaborative intelligence in biomedicine, its clinical and research adoption requires the development of deterministic orchestration and rigorous cost-utility frameworks to ensure safety and expert-centered oversight. Full article
(This article belongs to the Section Biomedical Sciences and Physiology)
15 pages, 1050 KB  
Article
Preclinical HistoBench: A Pilot Benchmark Dataset for Evaluating Large Language Models on Preclinical Histopathological Classification
by Avan Kader, Marie-Luise H. H. Ranner-Hafferl, Felix Reuter, Miriam L. Fichtner, Marcus R. Makowski, Keno K. Bressem and Lisa C. Adams
Biology 2026, 15(5), 395; https://doi.org/10.3390/biology15050395 - 27 Feb 2026
Abstract
Background and Purpose: We present a pilot benchmark dataset of 378 preclinical histological samples for evaluating large language model (LLM) performance on multi-dimensional classification tasks. This dataset addresses the lack of standardized benchmarks for assessing LLMs in preclinical histopathology, encompassing species identification (mouse, rabbit, rat), organ recognition, staining methods, and preparation techniques. Methods: We evaluated the LLMs GPT-4.1, GPT-4o-mini, and Llama 3.2 on 378 histological samples across four classification dimensions: species identification (mouse, rabbit, rat), organ recognition (kidney, liver, prostate, spleen), staining method classification (H&E, Elastica van Gieson, collagen, iron, IHC-elastin, MOVAT’s pentachrome), and preparation type determination (frozen vs. paraffin-embedded). Performance was assessed using sensitivity and specificity metrics with confusion matrix analysis. Results: Model performance varied substantially across tasks and exhibited strong sensitivity to class imbalance. For preparation type classification, GPT-4.1 achieved the most balanced performance (50% frozen sensitivity, 85.7% paraffin sensitivity), while Llama 3.2 failed to recognize paraffin samples (0% sensitivity). In species classification, Llama 3.2 was the only model capable of identifying all three species (rabbit: 75% sensitivity, rat: 85.7% sensitivity) despite poor mouse recognition (0.3% sensitivity). GPT-4.1 achieved higher mouse sensitivity within this dataset (70.4% sensitivity) but failed with minority species. For staining classification, Llama 3.2 demonstrated highest overall performance, achieving >88% sensitivity for most staining types, while GPT-4o-mini showed perfect H&E recognition (100% sensitivity). Conclusions: Current LLMs demonstrate variable performance for histological classification with substantial sensitivity to class imbalance. 
While not suitable for standalone diagnostic use, they may serve as useful screening tools in research settings with appropriate human oversight. Full article
(This article belongs to the Special Issue AI Deep Learning Approach to Study Biological Questions (2nd Edition))
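The per-class sensitivity figures quoted in this abstract follow directly from a confusion matrix. The sketch below shows the standard computation; the 2-class toy matrix is illustrative (its counts are chosen only to reproduce the abstract's 50% / 85.7% sensitivity pattern, not taken from the paper's data):

```python
import numpy as np

def per_class_sensitivity_specificity(cm):
    """cm[i, j] = count of samples with true class i predicted as class j.

    Returns {class_index: (sensitivity, specificity)}.
    """
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    results = {}
    for i in range(cm.shape[0]):
        tp = cm[i, i]                   # true positives for class i
        fn = cm[i, :].sum() - tp        # class-i samples predicted elsewhere
        fp = cm[:, i].sum() - tp        # other samples predicted as class i
        tn = total - tp - fn - fp
        results[i] = (tp / (tp + fn), tn / (tn + fp))
    return results

# Toy 2-class example, e.g. frozen vs. paraffin-embedded preparations.
cm = [[5, 5],   # 5 of 10 "frozen" samples correct -> 50% sensitivity
      [1, 6]]   # 6 of 7 "paraffin" samples correct -> ~85.7% sensitivity
stats = per_class_sensitivity_specificity(cm)
print(stats[0][0], stats[1][0])  # 0.5 0.857...
```

Under class imbalance, a model can score high sensitivity on the majority class while collapsing to 0% on a minority class, which is exactly the failure mode the abstract reports.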

40 pages, 21366 KB  
Article
Three-Dimensional Digital Model Reconstruction and Seepage Characteristic Analysis of Porous Polyimide
by Zhaoliang Dou, Shuang Li, Wenbin Chen, Ye Yang, Hongjuan Yan, Lina Si, Qianghua Chen, Kang An, Hong Li and Fengbin Liu
Polymers 2026, 18(5), 591; https://doi.org/10.3390/polym18050591 - 27 Feb 2026
Abstract
This study focuses on porous polyimide (PPI) lubricating materials for high-speed aerospace bearings. Based on their real microstructure, three-dimensional digital model reconstruction and mesoscale seepage characteristics were investigated. First, a sequence of two-dimensional slice images of PPI was obtained using micro-focus X-ray computed tomography (CT). Through image filtering, threshold segmentation, and three-dimensional reconstruction, a highly faithful digital model of the pore structure was constructed, and a quantified pore-network model was further extracted. Second, a multiple-relaxation-time lattice Boltzmann model based on the D3Q27 discrete scheme was established, and its accuracy and stability in complex boundaries and pressure-driven flows were verified using classic benchmark cases. Subsequently, the validated numerical model was applied to the reconstructed PPI pore structure to simulate and systematically analyze the single-phase seepage behavior of lubricating oil. The results show that the lubricant seepage exhibits a strong “preferential flow path” effect, with most of the flow transported through a small number of large-size throats. A clear quantitative relationship exists between the microscopic flow field structure—including velocity distribution, flow paths, and pressure gradient—and the pore-topology features, such as throat-size distribution, connectivity, and tortuosity. This verifies the mesoscale mechanism that “structure governs flow.” The complete technical chain established in this work—“real-structure reconstruction–numerical model validation–seepage mechanism analysis”—provides a reliable theoretical and numerical tool for gaining deeper insight into the lubricant transport behavior in porous polyimide and offers guidance for the microstructural design and optimization of this material. Full article
(This article belongs to the Section Polymer Analysis and Characterization)
25 pages, 25354 KB  
Article
OpenPlant: A Large-Scale Benchmark Dataset for Agricultural Plant Classification Using CNNs, ViTs, and VLMs
by Kaiqi Liu, Wei Sun, Guanping Wang, Quan Feng and Hui Li
Plants 2026, 15(5), 727; https://doi.org/10.3390/plants15050727 - 27 Feb 2026
Abstract
Accurate deep-learning-based plant classification is important for precision agriculture applications such as weed control, crop monitoring, and smart farming systems. The accuracy of deep learning models depends on the datasets they are trained on. Although many datasets have been proposed in recent decades, they share common limitations: limited scale, low environmental diversity, and difficulties in data integration. To address these problems, we introduce OpenPlant, a large-scale open dataset containing 635,176 RGB images across 1167 plant species. OpenPlant covers diverse plant growth stages, plant structures, and environmental conditions, and its annotations were carefully verified to ensure quality. OpenPlant can thus serve as a benchmark for agricultural plant classification. We benchmarked 10 widely used convolutional neural networks (CNNs), 6 vision transformers (ViTs), and 12 vision–language models (VLMs) on it to provide a comprehensive evaluation. The OpenPlant dataset offers a comprehensive benchmark for agricultural research using deep learning, and the results provide insights into future directions. Full article

30 pages, 960 KB  
Article
SCIM: Self-Correcting Iterative Mechanism for Retrieval-Augmented Generation
by Ke Li and Tingting Zhang
Electronics 2026, 15(5), 996; https://doi.org/10.3390/electronics15050996 (registering DOI) - 27 Feb 2026
Abstract
Standard Retrieval-Augmented Generation (RAG) models are limited by their “one-shot” nature, failing to assess or improve answer quality dynamically. To address this, we introduce SCIM (Self-Correcting Iterative Mechanism), a framework featuring multi-dimensional evaluation and adaptive retrieval. A key distinction of SCIM is its efficiency: it operates on a lightweight Flan-T5-base model (250M parameters) and requires no fine-tuning, challenging the industry’s reliance on 7B+ parameter models. Experimental results across four major benchmarks show that SCIM yields a 17.2% improvement over standard RAG (p < 0.001). Notably, SCIM achieves parity with state-of-the-art models like ITER-RETGEN while reducing retrieval overhead by 31%, with 35% of queries converging within just 1–2 iterations. With high human correlation (Spearman ρ = 0.842), SCIM demonstrates that robust, self-correcting RAG performance is attainable without the computational costs of large-scale LLMs. Full article
(This article belongs to the Section Artificial Intelligence)
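The abstract above describes an iterate-until-converged alternative to one-shot RAG. A generic sketch of such a self-correcting loop is given below; it is an illustration of the overall pattern (retrieve, generate, evaluate, repeat), not SCIM's actual mechanism, and all function names and the `threshold` parameter are hypothetical:

```python
def iterative_rag(query, retrieve, generate, evaluate,
                  max_iters=5, threshold=0.8):
    """Generic self-correcting retrieval-augmented generation loop.

    retrieve(query, feedback) -> list of passages
    generate(query, passages) -> answer string
    evaluate(query, answer)   -> quality score in [0, 1]
    """
    feedback = None
    best_answer, best_score = None, -1.0
    for _ in range(max_iters):
        passages = retrieve(query, feedback)
        answer = generate(query, passages)
        score = evaluate(query, answer)
        if score > best_score:
            best_answer, best_score = answer, score
        if score >= threshold:
            break  # good enough: many queries converge within 1-2 iterations
        feedback = answer  # use the draft to steer the next retrieval round
    return best_answer, best_score
```

The early-exit check is what bounds retrieval overhead: queries that pass the quality threshold on the first pass pay no iteration cost at all.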
18 pages, 2368 KB  
Article
TransGoT: Structured Graph-of-Thoughts Reasoning for Machine Translation with Large Language Models
by Danying Zhang, Yixin Liu, Jie Zhao and Cai Xu
Big Data Cogn. Comput. 2026, 10(3), 70; https://doi.org/10.3390/bdcc10030070 - 27 Feb 2026
Abstract
Machine translation with large language models has recently attracted growing attention due to its flexibility and strong zero-shot and few-shot capabilities. However, most prompt-based LLM translation methods rely on linear generation or shallow self-refinement, implicitly committing to a single reasoning path. Such designs are brittle when translating long and syntactically complex sources, where reliable translation often requires structured planning and hypothesis exploration. In this paper, we propose TransGoT, a novel machine translation framework inspired by the graph-of-thoughts paradigm, which formulates translation as a structured, multi-stage reasoning process over a graph of intermediate thoughts. TransGoT explicitly decomposes translation into constraint identification, draft generation, and culture- and style-aware refinement, enabling systematic exploration and aggregation of alternative translation hypotheses. To better adapt graph-based reasoning to translation, we design two key mechanisms: (1) Uncertainty-driven thought transformation. Unlike general reasoning tasks, translation uncertainty is often localized and unevenly distributed across tokens, making holistic regeneration inefficient. We therefore design uncertainty-driven thought transformation, which leverages model-internal confidence signals to guide targeted token-level revision; (2) Dispersion-adaptive thought scoring. It emphasizes evaluation criteria with stronger inter-candidate variance to enable robust multi-criteria thought selection. We evaluate TransGoT on the WMT22 benchmarks and experimental results demonstrate that TransGoT consistently outperforms strong LLM-based translation baselines, validating the effectiveness of structured graph-based reasoning for machine translation. Full article
(This article belongs to the Special Issue Natural Language Processing Applications in Big Data)
28 pages, 18338 KB  
Article
Forecast of Electric Power Consumed by Public Buildings: Univariate and Multivariate Approaches Based on Quantile Regression Models
by Sara Perna, Anna Rita Di Fazio, Andrea Iacovacci, Francesco Conte and Pasquale De Falco
Energies 2026, 19(5), 1200; https://doi.org/10.3390/en19051200 - 27 Feb 2026
Abstract
Load forecasting has become a key tool, especially for distribution system operators, to ensure optimal grid management and control. In recent years, attention has shifted toward probabilistic load forecasting (PLF), as it can model forecast uncertainty. Because electricity demand is strongly influenced by time-dependent factors such as seasonal patterns and daily habits, non-parametric PLF methods are particularly suitable because they make no assumptions about the distribution of variables. This study focuses on quantile regression (QR), a widely studied non-parametric PLF technique that models forecast uncertainty by only assuming a linear dependency among variables. It is applied every hour to forecast the daily consumption of three large public buildings—an elderly healthcare center, a biomedical research facility, and a polyclinic—with different demand variability profiles. Forecasts are carried out using real-world consumption data and evaluated considering both univariate (uQR) and multivariate (mQR) approaches. The performance of both QR approaches is rigorously evaluated against that of two persistence-based methods through standard evaluation metrics. For the univariate case, two aggregation levels are considered: single buildings and aggregation of buildings. The results confirm the effectiveness of both uQR and mQR, which consistently outperform persistence-based benchmarks. In terms of the pinball loss (PL) function, the QR approaches exhibit values ranging from 1% to 1.8% across all case studies. Both approaches demonstrate reliable and sharp prediction intervals (PIs); for example, for the PI(10–90) using the uQR, the PI coverage probability (PICP) ranges from 0.78 to 0.89 and the PI normalized average width (PINAW) from 0.09 to 0.26. Overall, uQR achieves lower PL, whereas mQR yields slightly better PICP and PINAW results for the building characterized by an irregular and unpredictable consumption profile. Full article
(This article belongs to the Special Issue Advanced Forecasting Methods for Sustainable Power Grid: 2nd Edition)
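The pinball loss used to score the quantile forecasts above has a simple closed form: for quantile level q, under-prediction is penalized by q and over-prediction by (1 − q). A minimal sketch (function and variable names are illustrative, not from the paper):

```python
import numpy as np

def pinball_loss(y_true, y_pred, q):
    """Average pinball (quantile) loss for quantile level q in (0, 1).

    Minimizing this loss over predictions yields the q-th
    conditional quantile of the target distribution.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    diff = y_true - y_pred
    # q * diff applies when under-predicting (diff > 0),
    # (q - 1) * diff when over-predicting (diff < 0).
    return float(np.mean(np.maximum(q * diff, (q - 1.0) * diff)))

# At q = 0.5 the pinball loss reduces to half the mean absolute error.
print(pinball_loss([10.0, 12.0], [8.0, 13.0], 0.5))  # 0.75
```

Evaluating the loss at many quantile levels (e.g. q = 0.1 … 0.9) is what lets a QR forecaster report calibrated prediction intervals such as the PI(10–90) cited above.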
18 pages, 7561 KB  
Article
Large-Scale Real-World Smartphone Photoplethysmography Datasets for Vascular Assessment
by Stevan Jokić, Ivan Jokić, Nenad Gligorić, Aneta Kartali and Octavian M. Machidon
Electronics 2026, 15(5), 988; https://doi.org/10.3390/electronics15050988 (registering DOI) - 27 Feb 2026
Abstract
The development of reliable smartphone-based methods for vascular assessment is limited by the scarcity of large-scale, high-quality, real-world photoplethysmography (PPG) datasets. This work introduces two openly reusable smartphone camera-based PPG datasets curated from over one million unconstrained recordings, designed to support vascular morphology analysis and vascular aging research. The first dataset comprises approximately 5000 high-fidelity PPG heartbeat templates labeled into four morphological classes based on dicrotic notch characteristics, enabling assessment of arterial waveform structure beyond chronological age. The second dataset contains about 10,000 demographically balanced PPG samples curated for chronological age regression using rigorous subject-level balancing and correlation-based quality control. A standardized processing pipeline is presented, including beat alignment, ensemble averaging, and objective signal acceptance criteria to ensure morphological stability. To validate dataset utility, multiple machine learning models were benchmarked using raw signals, second derivatives, and compact Gaussian representations, achieving classification accuracy up to 90.08% and age prediction error below 10 years. By prioritizing real-world data quality, transparency, and reuse, this work provides a robust foundation for scalable, interpretable, and reproducible research in smartphone-based vascular assessment. Full article
(This article belongs to the Special Issue Feature Papers in Bioelectronics: 2025–2026 Edition)

20 pages, 3196 KB  
Article
Semantic Firewalls with Online Ensemble Learning for Secure Agentic RAG Systems in Financial Chatbots
by Victor Castro-Maldonado, Marco A. Aceves-Fernández, Luis R. García-Noguez and Jesús C. Pedraza-Ortega
AI 2026, 7(3), 80; https://doi.org/10.3390/ai7030080 - 27 Feb 2026
Abstract
The RAG agentic architecture has demonstrated its ability to transform large language models (LLMs) into agents capable of planning, reasoning, and executing subtasks using external tools or APIs. In the financial sector, one of the main priorities when implementing new technologies—especially in systems like chatbots—is the protection of customer data and the need to maintain customer trust, making the challenges significant. This research presents a robust banking chatbot system that integrates RAG agentic architecture with specialized financial components, setting a new standard in the digital banking sector by prioritizing security, transparency, and functionality. The contributions of this work include the implementation of RAG agentic reasoning and self-correction financial components, and, primarily, the empirical study of the impact of a semantic firewall with online learning in financial RAG agentic systems, evaluated using public benchmarks and standard ranking metrics. Full article

36 pages, 1552 KB  
Article
RO-FIN-LLM: A Benchmark with LLM-as-a-Judge and Human Evaluators for Romanian Tax and Accounting
by Maria-Ecaterina Olariu, Vlad-Gabriel Buinceanu, Cristian Simionescu, Octavian Dospinescu, Răzvan Georgescu, Cezar Tudor, Adrian Iftene and Ana-Maria Bores
Systems 2026, 14(3), 244; https://doi.org/10.3390/systems14030244 - 27 Feb 2026
Abstract
Large Language Models (LLMs) are increasingly being adopted in business settings; however, there remains a shortage of evaluation tools that account for country-specific regulations, particularly for Romania’s taxation and financial accounting requirements. RO-FIN-LLM is a benchmark designed to test how well LLMs handle Romania-specific regulatory question answering in taxation (including VAT regimes, income/profit tax, microenterprise rules, and other obligations) and financial accounting (including journal entries/monographs, amortization, provisions, and foreign exchange transactions). The benchmark contains questions curated by experts, each including the applicable regulatory time frames and the legal sources for the answers. Evaluation is performed in two protocols: closed-book and open-book with Retrieval Augmented Generation (RAG), using Tavily Search API. Evaluation metrics are represented by rubrics, namely correctness, legal citation quality, and clarity/structure. A subset of answers produced by three models was additionally evaluated by 12 specialists in the financial-accounting domain. In this revision, we also describe a public release plan for the question schema, prompts, and evaluation scripts to support independent reproducibility. Full article
(This article belongs to the Special Issue Business Intelligence and Data Analytics in Enterprise Systems)

49 pages, 1910 KB  
Review
Beyond Next-Token Prediction: A Standards-Aligned Survey of Autoregressive LLM Failure Modes, Deployment Patterns, and the Potential Role of World Models
by Lorenzo Ricciardi Celsi and James McCann
Electronics 2026, 15(5), 966; https://doi.org/10.3390/electronics15050966 - 26 Feb 2026
Abstract
This paper is a focused, standards-aligned survey of where autoregressive (AR) large language models (LLMs) tend to break down when deployed inside industrial informatics workflows that must satisfy long-horizon objectives, hard constraints, traceability, and functional-safety obligations (e.g., IEC 61508/ISO 26262/ISO 21448). Rather than claiming new algorithms or experiments, we synthesize and organize prior work into (i) a control-oriented taxonomy of four AR failure modes that recur in practice (compounding error, myopic objectives, data brittleness/hallucinations, and scaling/latency inefficiencies), (ii) a catalog of standards-compatible deployment patterns that mitigate these issues (human-gated LLM-in-the-loop, retrieval + verification pipelines, planner-of-record architectures, and runtime assurance envelopes), and (iii) an operational decision framework (criteria table with observable proxies, a stepwise decision procedure, and worked examples) for deciding when token-centric mitigations are sufficient versus when state/world-model components become warranted. Joint Embedding Predictive Architectures (JEPA) and Hierarchical JEPA (H-JEPA) are examined as representative state-predictive architectures, with discussion explicitly bounded by currently available empirical evidence; we explicitly note that the published evidence base is currently concentrated on vision/multimodal benchmarks and that industrial control validation remains limited. To make evidence boundaries transparent, we introduce (a) a survey method (scope, inclusion/exclusion criteria, and data-extraction fields), (b) a comparison matrix across representative prior systems, and (c) an evidence map that links each deployment pattern to peer-reviewed empirical findings and system reports. Full article
30 pages, 2430 KB  
Article
ST-GraphRCA: A Root Cause Analysis Model for Spatio-Temporal Graph Propagation in IoT Edge Computing
by Tianyi Su, Ruibing Mo, Yanyu Gong and Haifeng Wang
Sensors 2026, 26(5), 1474; https://doi.org/10.3390/s26051474 - 26 Feb 2026
Abstract
Real-time processing demands for massive IoT sensor data necessitate reliance on distributed microservice systems within edge clusters. However, pinpointing the root cause of anomalies within these edge microservice clusters poses a critical challenge for intelligent IoT operation and maintenance. To address the issue, a spatio-temporal graph propagation model ST-GraphRCA is proposed for root cause analysis in IoT edge environments. Our approach begins by resolving the fundamental issue of time-series asynchrony across distributed multi-source metrics. A PCA-DTW hybrid feature extraction method is introduced with a dynamic alignment strategy to mitigate the effects of random network delays and data deformation without requiring prior synchronization. Subsequently, ST-GraphRCA constructs a stream-based forward propagation graph based on the flow conservation principle. By integrating dynamic edge weights with node-level input–output anomaly scores, ST-GraphRCA precisely infers fault propagation pathways and identifies potential root cause candidates through causal reasoning. Finally, a topology-constrained high-utility mining algorithm filters these candidates. Using a constraint matrix, the algorithm filters out unreachable service combinations to locate low-frequency and high-risk root causes. Experimental results indicate that ST-GraphRCA achieves an F1-Score of 0.89, outperforming existing methods. In resource-constrained edge scenarios, its average localization time is merely 238.8 ms, representing a six-fold improvement over key benchmarks. Thus, ST-GraphRCA not only provides an efficient anomaly fault tracing solution for large-scale IoT systems but also offers technical support for the intelligent operation and maintenance of distributed microservice systems. Full article
(This article belongs to the Section Industrial Sensors)

29 pages, 1696 KB  
Article
Optimizing Lightweight Convolutional Networks via Topological Attention and Entropy-Constrained Distillation: A Spectral–Topological Approach for Robust Facial Expression Recognition
by Xiaohong Dong, Yu Gao, Mengyan Liu and Wenxiaoman Yu
Algorithms 2026, 19(3), 177; https://doi.org/10.3390/a19030177 - 26 Feb 2026
Abstract
Deep learning models typically rely on large-scale datasets with accurate annotations, yet real-world applications inevitably suffer from label noise, which severely degrades generalization—particularly for lightweight neural networks with limited capacity. Existing learning with noisy labels methods are mainly designed for over-parameterized models and are often unsuitable for resource-constrained deployment. To address this challenge, we propose a robust framework that integrates a Micro Hybrid Attention Module (MHAM) with knowledge distillation (KD) for lightweight architectures such as MobileNetV3. MHAM employs a decoupled channel–spatial attention design to enhance discriminative feature extraction while suppressing noise-sensitive background responses. From a graph–signal perspective, MHAM can be interpreted as a spectral smoothing operator that improves optimization stability. In addition, knowledge distillation with soft teacher supervision mitigates overfitting to corrupted hard labels and reduces prediction uncertainty. Extensive experiments demonstrate the effectiveness of the proposed method. On FER2013, a real-world noisy facial expression recognition benchmark, our approach achieves 68.5% accuracy with only 0.52M parameters, while reducing optimization variance by 24%. On CIFAR-10 with 40% symmetric label noise, it improves accuracy from 54.85% to 60.10%. On CIFAR-10N with multiple types of real-world human annotation noise, the proposed method consistently achieves 63.9–71.9% accuracy under different noise protocols. These results show that the proposed framework provides an efficient and robust solution for noisy label learning in lightweight facial expression and object classification on edge devices. Full article
(This article belongs to the Special Issue Deep Neural Networks and Optimization Algorithms (2nd Edition))

31 pages, 1026 KB  
Article
Bridging Cognitive and Expression Spaces in Creative AI by Integrating DIKWP-TRIZ and Semantic Mathematics
by Zhendong Guo and Yucong Duan
Electronics 2026, 15(5), 963; https://doi.org/10.3390/electronics15050963 - 26 Feb 2026
Abstract
Large Language Models (LLMs) generate fluent text but often struggle with reliable multi-step reasoning, factual grounding, and stable use of long context, especially when inputs are incomplete, inconsistent, or imprecise. To address these challenges, we propose a Creative AI framework that integrates DIKWP-TRIZ with a semantic-mathematical constraint layer. DIKWP-TRIZ extends TRIZ by embedding a DIKWP (Data–Information–Knowledge–Wisdom–Purpose) network, enabling purposeful, value-aware transformations and explicit repair operations under 3-No conditions. The semantic layer introduces three context-indexed constraints over concept–expression mappings (Existence, Contextual Uniqueness, and Transitivity), making ambiguities and contradictions explicit and checkable during inference and generation. We enumerate the DIKWP × DIKWP transformation type space (25 ordered pairs over {D,I,K,W,P}) and provide candidate TRIZ inventive principles for each type as design-time guidance. A global Purpose controller steers transformation selection and enforces goal alignment and ethical constraints. We present a reference architecture and qualitative case analyses against a standard LLM, illustrating how the framework structures intermediate steps, surfaces assumptions, and supports traceable explanations. Quantitative benchmarking remains for future work. Full article
(This article belongs to the Special Issue Autonomous Intelligence: Concepts and Applications of Agentic AI)
