Search Results (2,175)

Search Parameters:
Journal = Computers

21 pages, 11533 KB  
Article
LACE-Net: A Swin Transformer with Local Frequency-Domain Energy and Adaptive Contrast Enhancement for Fine-Grained Land Cover Classification
by Yongmei Tan, Gong Chen, Yan Huang, Hengzhou Ye and Jincheng Tang
Computers 2026, 15(5), 281; https://doi.org/10.3390/computers15050281 - 28 Apr 2026
Abstract
The Swin Transformer exhibits limitations in fine-grained land use and land cover (LULC) classification, particularly in capturing high-frequency texture details and representing low-contrast regions. To address these issues, we propose a novel network model, termed LACE-Net, which integrates local frequency-domain energy and adaptive contrast enhancement. Built upon the Swin Transformer backbone, the model introduces an innovative Local Frequency-Domain Energy-Adaptive Contrast Enhancement Multi-Scale Attention (LACE). This block consists of parallel branches for frequency-domain perception and contrast enhancement, which effectively combine texture and illumination physical priors. In addition, a texture-adaptive momentum adjustment mechanism is incorporated to refine the spatial enhancement attention weights dynamically. Consequently, LACE-Net greatly strengthens the modeling and representation of high-frequency details and complex spatial structural features. Experiments are performed on a self-constructed Guangxi regional dataset (denoted as GLC-30) and the publicly available remote sensing scene classification benchmark dataset NWPU-RESISC45. The results show that LACE-Net achieves a Top-1 accuracy (Top-1 Acc) of 96.48% and a macro-averaged F1 score (mF1) of 93.13%. These results outperform current mainstream vision models, particularly in mitigating the spectral confusion issue of “same spectrum, different objects.” The model exhibits superior fine-grained classification performance and robust generalization across datasets.
(This article belongs to the Special Issue Advanced Image Processing and Computer Vision (2nd Edition))

52 pages, 639 KB  
Article
xjb: Fast Float to String Algorithm
by Junbo Xiang and Tiejun Wang
Computers 2026, 15(5), 280; https://doi.org/10.3390/computers15050280 - 27 Apr 2026
Abstract
Efficiently and accurately converting floating-point numbers to decimal strings remains a fundamental challenge in numerical computation, data serialization, and human–computer interaction. While modern algorithms such as Ryū, Dragonbox, and Schubfach rigorously satisfy the Steele–White criteria for correctness and minimal output length, their performance is frequently constrained by branch mispredictions, high-precision multiplication overhead, and suboptimal utilization of instruction-level parallelism. This paper introduces xjb, a novel floating-point–string conversion algorithm derived from Schubfach that systematically overcomes these bottlenecks. By restructuring the core computation to reduce instruction dependencies, adopting branchless decision logic, and exploiting SIMD instruction sets for decimal-to-ASCII formatting, xjb delivers state-of-the-art throughput across diverse hardware platforms. The algorithm requires only a single 64-by-128-bit multiplication for IEEE 754 binary64 conversions and a single 64-by-64-bit multiplication for binary32, drastically decreasing arithmetic complexity. Extensive benchmarking on AMD R7-7840H and Apple M1/M5 processors demonstrates that xjb consistently outperforms leading contemporary implementations. Notably, on the Apple M5, xjb achieves speedups of approximately 20% and 136% for binary64 and binary32 conversions, respectively, when compared to the highly optimized zmij library. The algorithm is fully compliant with the Steele–White principle; exhaustive validation over the entire binary32 space and extensive random testing across the binary64 range confirm both its theoretical soundness and practical robustness.
(This article belongs to the Special Issue Computational Science and Its Applications 2025 (ICCSA 2025))
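The Steele–White shortest round-trip requirement the abstract refers to can be illustrated with a brute-force Python sketch (this is not xjb's single-multiplication method; the function name is ours):

```python
def shortest_roundtrip(x: float) -> str:
    """Return a shortest-significant-digits decimal string that parses back
    to exactly the same binary64 value (brute force; illustration only)."""
    for ndigits in range(1, 18):   # 17 significant digits always suffice for binary64
        s = f"{x:.{ndigits}g}"
        if float(s) == x:          # the Steele-White round-trip criterion
            return s
    return repr(x)                 # repr() already round-trips in Python 3

# 0.1 has no exact binary representation, yet "0.1" is the shortest
# string that parses back to the same bits:
assert shortest_roundtrip(0.1) == "0.1"
```

Production algorithms such as Schubfach and xjb obtain the same digits without the search loop, which is where the single wide multiplication comes in.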
24 pages, 453 KB  
Article
Reason2Decide-C: Adaptive Cycle-Consistent Training for Clinical Rationales
by H M Quamran Hasan, Housam Khalifa Bashier Babiker, Mi-Young Kim and Randy Goebel
Computers 2026, 15(5), 279; https://doi.org/10.3390/computers15050279 - 27 Apr 2026
Abstract
Large Language Models (LLMs) used for clinical decision support must not only make accurate predictions but also generate rationales that are consistent with, and sufficient for, those predictions. Building on Reason2Decide, a two-stage rationale-driven multi-task framework, we propose Reason2Decide-C (R2D-C, where C denotes cycle consistency), which augments Reason2Decide’s stage 2 training with confidence-adaptive scheduled sampling and cycle-consistent rationale-to-label training. In stage 1, we pretrain our model on rationale generation. In stage 2, we jointly train on label prediction and rationale generation, gradually replacing gold labels with model-predicted labels based on confidence. Simultaneously, we feed the rationale logits back into the model to recover the label, thus enforcing explanation sufficiency. We evaluate R2D-C on one proprietary triage dataset, as well as public biomedical QA and reasoning datasets. Across model sizes, R2D-C substantially improves rationale–prediction consistency (where stage 1 and stage 2 predictions agree) and sufficiency (where the rationale alone recovers the ground-truth label) over other baselines while matching or modestly improving predictive performance (F1); in several settings R2D-C surpasses 40× larger foundation models. Ablations confirm that the full combination is optimal, maximizing alignment and LLM-as-a-Judge rationale quality. These results demonstrate that confidence-adaptive scheduled sampling and cycle-consistent rationale-to-label training substantially enhance explanation alignment without sacrificing accuracy.
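The confidence-based replacement of gold labels described above can be sketched in a few lines; the threshold, mixing probability, and function name are illustrative assumptions, not the paper's values:

```python
import random

def choose_training_label(gold, predicted, confidence, threshold=0.8, mix=0.5):
    """Confidence-adaptive scheduled sampling (illustrative sketch): replace
    the gold label with the model's own prediction only when the model is
    confident, and then only with probability `mix`."""
    if confidence >= threshold and random.random() < mix:
        return predicted
    return gold

# With mix=1.0 the substitution is deterministic once confidence clears the bar:
labels = [choose_training_label("urgent", "routine", c, mix=1.0) for c in (0.2, 0.95)]
# -> ["urgent", "routine"]
```

In the actual framework the schedule would run per batch during stage 2, with the cycle-consistency term feeding rationale logits back to recover the label.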

44 pages, 7975 KB  
Article
A Validated Design Guideline for Mobile Applications Grounded in the Participation of Deaf Users for Accessible Development
by Andrés Eduardo Fuentes-Cortázar and José Rafael Rojano-Cáceres
Computers 2026, 15(5), 278; https://doi.org/10.3390/computers15050278 - 27 Apr 2026
Abstract
Mobile devices are widely used, yet accessibility for people with disabilities remains a critical challenge. Deaf users who rely primarily on sign language (SL) frequently encounter barriers when interacting with applications not designed for their communication needs. This study proposes a design guide for developing mobile applications tailored to sign language users. The guide was developed through the active participation of three groups: Deaf individuals, usability and user experience (UX) experts, and mobile application developers. Based on their contributions, thirteen design guidelines were defined, addressing sign language integration, visual feedback, navigation, content presentation, and interface design. The guidelines were validated through usability and UX evaluations conducted with the three participant groups. A mobile application was subsequently developed following the proposed guidelines to assess their practical applicability. The evaluation results indicate that the guide effectively supports the development of more accessible and usable mobile applications for Deaf users. Incorporating sign language-centered design principles significantly improves usability and user experience for individuals with hearing disabilities, contributing to more inclusive mobile application development.
(This article belongs to the Section Human–Computer Interactions)

36 pages, 637 KB  
Article
Cognitive Grounding for Perspective Integration in Multi-LLM Systems
by Lev Sukherman, Yetunde Longe-Folajimi and Marina Konkol
Computers 2026, 15(5), 277; https://doi.org/10.3390/computers15050277 - 27 Apr 2026
Abstract
This paper investigates whether structured collaboration between multiple large language models (LLMs), each assigned a distinct cognitive role grounded in psychological theory, produces benefits beyond simple answer aggregation. We propose the Parallel Synthesis architecture, in which three cognitively specialized roles, Analyzer (hierarchical decomposition), Creative (divergent thinking), and Critic (critical evaluation), process each task independently and in parallel, and a Synthesizer integrates their outputs into a final response. To evaluate collaborative reasoning, we introduce the Emergent Reasoning Score (ERS), a composite metric that separates perspective integration (Synthesis Effectiveness) from novel concept generation (Emergent Value). Experiments on the AI2 Reasoning Challenge (ARC-Challenge) (1172 questions) and the Massive Multitask Language Understanding benchmark (MMLU) (1531 questions) show two consistent findings. First, the architecture achieves high Synthesis Effectiveness (SE = 0.711–0.744), indicating reliable integration of all three cognitive perspectives. Second, Emergent Value remains low (EV = 0.096–0.112), indicating that synthesis primarily recombines existing concepts rather than generating substantial novel content. A Majority Voting baseline achieves comparable or slightly higher answer accuracy than the Synthesizer on both benchmarks, showing that the architecture’s main contribution lies not in answer selection but in producing integrated reasoning traces that draw on multiple perspectives. These findings suggest that the practical value of cognitively grounded multi-agent architectures lies in reliable perspective integration, while ERS provides a reusable framework for distinguishing integration from genuinely novel reasoning in multi-agent LLM systems. The empirical results reported here constitute a pilot validation of the proposed framework on closed-form benchmarks, intended to establish a proof of concept and motivate larger-scale evaluation.
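The Majority Voting baseline mentioned above reduces to a plurality count over the independent model answers; a minimal sketch (first-seen tie-breaking is our assumption, not necessarily the paper's):

```python
from collections import Counter

def majority_vote(answers):
    """Aggregate independent model answers by plurality; ties are broken
    in favour of the earliest answer that reaches the top count."""
    counts = Counter(answers)
    best = max(counts.values())
    for a in answers:                 # first answer with the top count wins
        if counts[a] == best:
            return a

assert majority_vote(["B", "A", "B"]) == "B"
```

The paper's point is that the Synthesizer matches this baseline on accuracy while additionally producing an integrated reasoning trace.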
27 pages, 2028 KB  
Article
Monitoring of Customer Segment Dynamics Using Clustering and Event-Based Alerts
by Stavroula Chatzinikolaou, Giannis Vassiliou, Efstratia Vasileiou, Sotirios Batsakis and Nikos Papadakis
Computers 2026, 15(5), 276; https://doi.org/10.3390/computers15050276 - 27 Apr 2026
Abstract
Continuous customer activity generated by modern digital platforms drives the evolution of behavioral segments over time. Traditional customer segmentation methods typically rely on periodic batch analysis of historical data, producing static snapshots that may quickly become outdated and fail to capture emerging behavioral patterns. This paper presents a monitoring-oriented framework for detecting customer segment evolution and generating timely notifications about meaningful structural changes in the customer population. The proposed system continuously ingests user activity events, incrementally updates customer profiles, and periodically recomputes behavioral segments using fixed-k KMeans clustering over standardized recency, frequency, and monetary (RFM) features. To improve robustness and interpretability, the framework incorporates adaptive event scoring, stability-aware segment validation, drift-aware centroid matching, and persistence-based filtering of transient changes. These mechanisms reduce noisy alerts caused by repeated clustering updates while preserving meaningful signals about evolving customer behavior. The framework is evaluated on the Online Retail II and Instacart datasets under streaming simulation conditions. Experimental results show that the proposed approach maintains stable clustering structures, identifies persistent segment changes, and uncovers economically meaningful customer groups. Compared with static segmentation and periodic clustering baselines, the framework improves clustering quality while enabling continuous monitoring of segment evolution. Overall, the results suggest that adaptive monitoring can extend traditional customer segmentation into a practical continuous analytics process for moderate-scale dynamic environments.
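The standardized RFM feature construction can be sketched in plain Python; the clustering itself (fixed-k KMeans) is omitted, and the z-score standardization shown is one common choice, not necessarily the paper's exact preprocessing:

```python
from collections import defaultdict
from statistics import mean, pstdev

def rfm_features(events, now):
    """Build recency/frequency/monetary features per customer from
    (customer_id, timestamp, amount) events, then z-score each column."""
    per = defaultdict(lambda: {"last": 0, "freq": 0, "spend": 0.0})
    for cust, ts, amount in events:
        p = per[cust]
        p["last"] = max(p["last"], ts)    # most recent activity
        p["freq"] += 1                    # number of events
        p["spend"] += amount              # total monetary value
    ids = sorted(per)
    raw = [[now - per[c]["last"], per[c]["freq"], per[c]["spend"]] for c in ids]
    # Column-wise standardization so KMeans distances weight R, F, M equally:
    stats = [(mean(col), pstdev(col) or 1.0) for col in zip(*raw)]
    scaled = [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in raw]
    return dict(zip(ids, scaled))

feats = rfm_features([("a", 1, 10.0), ("a", 9, 5.0), ("b", 10, 100.0)], now=10)
```

In the streaming setting these profiles would be updated incrementally per event rather than rebuilt from scratch.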

22 pages, 1217 KB  
Article
The Missing Layer in Modern IT: Governance of Commitments, Not Just Compute and Data
by Rao Mikkilineni and William Patrick Kelly
Computers 2026, 15(5), 275; https://doi.org/10.3390/computers15050275 - 24 Apr 2026
Abstract
Contemporary enterprise IT operations are largely implemented on Shannon–Turing computing models in which programs execute read–compute–write cycles over data structures, while governance—fault handling, configuration control, auditability, continuity, and accounting—is applied externally through infrastructure platforms, observability stacks, and human operational processes. This separation scales analytical throughput but accumulates what we term coherence debt: locally expedient operational commitments whose provenance and revisability degrade over time until exposed by failures, security incidents, regulatory demands, or architectural transitions. This paper examines the evolution of operational computing models that integrate computation with regulation at two distinct levels. First, Distributed Intelligent Managed Elements (DIME) extend the classical Turing cycle toward a supervised execution loop—read–check-with-oracle–compute–write—by incorporating signaling overlays and FCAPS (Fault, Configuration, Accounting, Performance, and Security) supervision into computation in progress. Second, the Autopoietic Management and Orchestration System (AMOS), grounded in the General Theory of Information, the Burgin–Mikkilineni Thesis, and Deutsch’s epistemic framework, fully decouples process executors from governance by treating any Turing-equivalent engine as a replaceable execution substrate while elevating knowledge structures—encoded as local and global Digital Genomes—to first-class operational state within a governed knowledge network. Using a distributed microservice transaction testbed, we demonstrate how this approach operationalizes topology-as-data, a capability-oriented control plane, decoupled application-layer FCAPS independent of infrastructure management, and policy-selectable consistency/availability semantics. Our results show that the principal benefit of AMOS is not circumventing theoretical constraints such as the Consistency, Availability, and Partition tolerance (CAP) theorem, but governing their trade-offs as explicit, auditable commitments with defined convergence pathways and controlled return to a coherent system state, thereby reducing coherence debt and improving operational reliability in distributed AI-enabled enterprise systems.
(This article belongs to the Special Issue Cloud Computing and Big Data Mining)
27 pages, 2562 KB  
Article
Case Studies on the Logical Structure of the Algorithms Tabu Search and Threshold Accepting for Generating Solutions in Searching and Solving the Bin-Packing Problem
by Vanesa Landero-Nájera, Joaquín Pérez-Ortega, Laura Cruz-Reyes, Claudia Guadalupe Gómez-Santillán, Nelva N. Almanza-Ortega, Carlos Rodríguez-Orta and Carlos Andrés Collazos-Morales
Computers 2026, 15(5), 274; https://doi.org/10.3390/computers15050274 - 24 Apr 2026
Abstract
The logical structure of approximation algorithms has been identified by the scientific community in four principal parts: tuning parameters, generating initial solutions, generating neighbor solutions, and stopping algorithm execution. A review of the literature specifically for the algorithms Threshold Accepting (TA) and Tabu Search (TS) indicates that, in most cases, choices are performed on one or several of these logical parts, often implicitly guided by expert knowledge for improving algorithm performance. However, these design choices, particularly in the selection of initialization and neighborhood strategies, are rarely analyzed in a systematic and reproducible manner. A formal experimental framework is presented to systematically analyze logical structure design choices, which are typically based on empirical expertise, by isolating and evaluating the combined effects of methodologies in the logical parts of initialization and neighborhood under controlled conditions of TA and TS algorithms in solving the one-dimensional Bin Packing Problem (BPP). A total of 324 benchmark instances were used to assess multiple algorithmic variants. Performance was evaluated in terms of solution quality and computational effort, supported by graphical analysis and statistical methods, including Wilcoxon signed-rank tests, effect size measures, bootstrap-based confidence intervals, and linear regression. The experimental results consistently show that the simpler internal logical structure of TA and TS algorithms, specifically with a probability-guided initialization combined with a single neighborhood operator, can achieve a better balance between solution quality and computational effort compared to more complex alternatives in general instances of BPP.
(This article belongs to the Special Issue Computational Science and Its Applications 2025 (ICCSA 2025))
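For readers unfamiliar with one-dimensional BPP terminology, here is a minimal sketch of an initial-solution constructor (first-fit) and a single relocation-move neighborhood operator; these are illustrative, not the paper's exact initialization or operator:

```python
def first_fit(items, capacity):
    """First-fit construction: place each item into the first open bin
    with enough remaining room, opening a new bin when none fits."""
    bins = []
    for it in items:
        for b in bins:
            if sum(b) + it <= capacity:
                b.append(it)
                break
        else:
            bins.append([it])
    return bins

def move_item(bins, capacity, src, idx, dst):
    """Single-move neighborhood operator: relocate one item between bins,
    dropping the source bin if it empties. Returns a new solution or None
    when the move violates the capacity constraint."""
    item = bins[src][idx]
    if sum(bins[dst]) + item > capacity:
        return None
    new = [list(b) for b in bins]
    new[src].pop(idx)
    new[dst].append(item)
    return [b for b in new if b]

solution = first_fit([5, 4, 3, 2], capacity=7)   # -> [[5, 2], [4, 3]]
```

A TA or TS loop would then repeatedly sample such moves, accepting or tabu-restricting them according to its acceptance rule.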
18 pages, 1839 KB  
Article
A GNN-Based Log Anomaly Detection Framework with Prompt Learning for Edge Computing
by Xianlang Hu, Guangsheng Feng, Xinling Huang, Xiangying Kong and Hongwu Lv
Computers 2026, 15(5), 273; https://doi.org/10.3390/computers15050273 - 24 Apr 2026
Abstract
System logs have been critical for analyzing the operational status and abnormal behavior of highly distributed and heterogeneous edge computing nodes. In edge environments, logs exhibit cross-event and cross-field structural interactions, making it difficult to uncover potential anomaly patterns from isolated events. Moreover, sparse annotations and varying log formats limit the effectiveness of existing methods. To address these challenges, we propose a graph neural network (GNN)-based anomaly detection framework with prompt learning. It leverages few-shot prompt learning to automatically extract key fields and constructs a weighted directed graph that jointly models semantic embeddings and temporal dependencies, fully representing the structural interactions and semantic associations across events and fields. Furthermore, the framework performs graph-level anomaly detection by jointly optimizing graph representation learning and the classification objective within an enhanced one-class directed graph convolutional network, enabling effective identification of global structural anomaly patterns in log graphs. Experimental results demonstrate that the proposed method achieves an average F1-score of 93.3%, surpassing the current state-of-the-art (SOTA) methods by 6.93%.
(This article belongs to the Special Issue Mobile Fog and Edge Computing)
31 pages, 6114 KB  
Article
A Multi-Stage YOLOv11-Based Deep Learning Framework for Robust Instance Segmentation and Material Quantification of Mixed Plastic Waste
by Andrew N. Shafik, Mohamed H. Khafagy, Alber S. Aziz and Shereen A. Hussein
Computers 2026, 15(5), 271; https://doi.org/10.3390/computers15050271 - 24 Apr 2026
Abstract
Instance segmentation in heterogeneous waste scenes remains challenging due to object variability, deformable shapes, partial occlusion, and large appearance differences across packaging types. This study presents a YOLOv11-based deep learning framework for mixed plastic waste instance segmentation, developed to connect visual perception with reliable material quantification. The framework integrates curated instance-level annotations, strict split isolation, multi-stage optimization, training strategy ablation, and seed-robustness analysis to support reproducible model selection. Experimental results on a held-out test set show that the optimized model achieves a mask mAP@50:95 of 0.9337, indicating strong segmentation performance under heterogeneous waste-scene conditions. To extend the analysis beyond standard vision metrics, the framework incorporates a physics-informed mask-to-mass module that converts predicted masks into class-specific mass estimates using geometric calibration and material priors. Applied to a representative stream of 1253 detected objects, the system estimated a total plastic mass of 15.48 ± 1.08 kg, corresponding to a theoretical H2 potential of 0.41 ± 0.04 kg and a greenhouse-gas avoidance of 34.57 ± 4.15 kg CO2e. Overall, the proposed framework extends waste-scene understanding beyond vision-level assessment toward physically grounded, data-driven decision support for smart material recovery systems.
(This article belongs to the Special Issue Machine Learning: Innovation, Implementation, and Impact)
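The mask-to-mass idea, pixel area through geometric calibration and material priors to a mass estimate, is simple arithmetic; all constants below are illustrative assumptions, not the paper's calibration values:

```python
def mask_to_mass(area_px, cm_per_px, thickness_cm, density_g_cm3):
    """Convert a predicted mask's pixel area into an estimated mass:
    pixel area -> physical area (geometric calibration) -> volume
    (material thickness prior) -> mass (density prior)."""
    area_cm2 = area_px * cm_per_px ** 2
    volume_cm3 = area_cm2 * thickness_cm
    return volume_cm3 * density_g_cm3

# e.g. a 20,000-pixel mask at 0.05 cm/px, 0.03 cm wall thickness,
# PET density ~1.38 g/cm^3 (hypothetical calibration):
mass_g = mask_to_mass(20000, 0.05, 0.03, 1.38)
```

Per-class density and thickness priors would let such a module aggregate mass estimates over a whole detected stream, as the abstract describes.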
34 pages, 2325 KB  
Article
VIRTUOSO: A Multilayer Cloud Security and Risk Management Framework
by Raja Waseem Anwar, Flavio Pastore and Tariq Abdullah
Computers 2026, 15(5), 272; https://doi.org/10.3390/computers15050272 - 24 Apr 2026
Abstract
Despite its continued growth, cloud computing remains susceptible to significant security challenges, as shared virtualised environments pose threats at multiple levels. These vulnerabilities are caused by a lack of security coverage in the responsibility model between the provider and the tenant. In this work, we propose the multi-layered architecture VIRTUOSO (VIRTual Unified Operation Security Optimiser) to cover these security gaps through advanced automation and ML. VIRTUOSO has four layers. The Input Layer extracts key risk components from collected telemetry data. The Deep Automation Security Layer provides automated actions and continuous monitoring of security defences. Its counterpart, the Intelligent Security Layer, predicts threats using anomaly detection. The last layer, the Output Layer, returns an aggregated risk summary. The datasets we used were chosen for their relevance: the UNSW-NB15 dataset, a subset of the web-attack classification from CSE-CIC-IDS2018, and a sample of anonymised log events from AWS CloudTrail. Our ensemble classifiers achieve a best accuracy of 95.08% ± 0.13% on UNSW-NB15 (RF), with statistically significant differences among models confirmed by the Friedman test (p < 0.004) and Nemenyi post hoc analysis, and 99.25% ± 0.52% on web-attack (CatBoost), where ensemble differences are not statistically significant (p = 0.093), consistent with the high separability of this dataset. The training-test gap and DNN curves show no overfitting, whereas our adversarial tests show a maximum accuracy loss of 8.1% at ε = 0.02. With these promising results, we can assert that, pending verification in an actual cloud environment and potential integration with FL, our ensemble classifier model appears to be a good real-world prototype.
(This article belongs to the Special Issue Using New Technologies in Cyber Security Solutions (3rd Edition))
17 pages, 3548 KB  
Article
R-Snort: A Performance-Optimized Multi-Agent NIDS Architecture for SOHO and Edge-of-Things Networks Using Snort 3 on Raspberry Pi 5
by Julio Gómez López, Deian Orlando Petrovics Tabacu, Nicolás Padilla Soriano and Alfredo Alcayde García
Computers 2026, 15(5), 270; https://doi.org/10.3390/computers15050270 - 24 Apr 2026
Abstract
Network Intrusion Detection Systems (NIDSs) are critical to ensuring the resilience of modern digital infrastructures. Although traditionally deployed in large-scale corporate environments, the expanding threat landscape requires the integration of robust security measures into Small Office/Home Office (SOHO) and Edge-of-Things (EoT) networks. However, these environments often face significant constraints in terms of specialized hardware and technical expertise. This article presents R-Snort, an open-source NIDS based on Snort 3, optimized for low-cost Raspberry Pi 5 hardware. Its multi-agent architecture enables distributed deployment with centralized traffic analysis and cross-agent attack correlation, while an intuitive web interface simplifies alert visualization and system management for non-expert administrators. Its main contributions are: (1) a performance-optimized NIDS agent achieving 1 Gbps throughput; (2) a distributed multi-agent architecture enabling centralized event correlation and detection of multi-vector attacks; and (3) an IaC-based automated deployment framework with an intuitive web interface, democratizing professional-grade security for SOHO and EoT environments.

18 pages, 1042 KB  
Article
Development and Evaluation of a Chatbot-Based System for Early Detection of Depression Indicators
by Min Yang, Makoto Oka and Hirohiko Mori
Computers 2026, 15(5), 269; https://doi.org/10.3390/computers15050269 - 23 Apr 2026
Abstract
In this study, we developed a chatbot-based system for detecting early signs of depression and verified its effectiveness through experimental evaluations and user surveys. Emphasizing that it does not rely on medical checklists, the system is designed to automatically extract three linguistic features associated with depression—frequent use of first-person pronouns, pessimistic expressions, and obsessive-compulsive writing styles—from natural user conversations. Multiple models were constructed for these features, and an ensemble layer integrates their outputs for a comprehensive judgment. The implemented system analyzes input sentences obtained through chat, extracts the three categories of features, calculates a final score through an ensemble layer, and visualizes potential signs of depression based on the total score. We performed an evaluation experiment with 20 participants. In the test data evaluation, the system demonstrated over 76% accuracy in each of the three classification categories: first-person usage, pessimistic tendency, and obsessive-compulsive tendency.
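The first linguistic feature and the ensemble layer described above can be sketched as follows; the pronoun list, weights, and function names are our assumptions, not the paper's trained models:

```python
def first_person_rate(tokens, pronouns=("i", "me", "my", "mine", "myself")):
    """Share of tokens that are first-person singular pronouns (for English;
    the paper's system targets natural chat input)."""
    if not tokens:
        return 0.0
    return sum(t.lower() in pronouns for t in tokens) / len(tokens)

def ensemble_score(first_person, pessimism, compulsive, weights=(0.4, 0.3, 0.3)):
    """Ensemble layer sketch: weighted combination of the three per-feature
    model outputs into one final score (weights are illustrative)."""
    return sum(w * s for w, s in zip(weights, (first_person, pessimism, compulsive)))

score = ensemble_score(first_person_rate("I think I failed".split()), 0.7, 0.2)
```

In the real system the pessimism and obsessive-compulsive scores would come from trained classifiers rather than constants, and the total score would drive the visualization step.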
23 pages, 915 KB  
Article
Learning Scientific Document Representations via Triple-Source Automatic Supervision Without Annotations or Citations
by Mussa Turdalyuly, Ainur Tursynkhan, Aigerim Yerimbetova, Tolganay Turdalykyzy, Bakzhan Sakenov, Nurzhan Mukazhanov and Nazerke Baisholan
Computers 2026, 15(5), 268; https://doi.org/10.3390/computers15050268 - 23 Apr 2026
Abstract
Learning meaningful representations of scientific documents is essential for information retrieval, knowledge discovery, and recommendation systems. Traditional methods such as TF-IDF rely on lexical matching and fail to capture deeper semantic relationships, while transformer-based approaches typically depend on limited supervision signals. In this work, we propose a Triple-Source automatic supervision framework for learning document embeddings from scientific corpora. The model integrates three types of supervision—title–abstract pairs, same-category document pairs, and document-level semantic relationships—within a unified contrastive learning framework based on a multilingual XLM-RoBERTa encoder. Unlike prior approaches that rely on citation graphs or manual annotations, our method enables citation-free and annotation-free representation learning using only lightweight metadata. Experiments on a publicly available arXiv dataset consisting of 98,649 documents demonstrate improved semantic retrieval performance, achieving Recall@1 = 0.6181 for same-category retrieval and outperforming both TF-IDF and single-source transformer baselines. The learned embeddings also exhibit improved clustering of scientific domains, indicating more structured semantic representations.
(This article belongs to the Special Issue Advances in Semantic Multimedia and Personalized Digital Content)
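The Recall@1 same-category retrieval metric reported above can be computed on toy embeddings as follows; this is a sketch of the evaluation, with the real system using learned XLM-RoBERTa embeddings rather than the hand-made vectors below:

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def recall_at_1(embeddings, categories):
    """Fraction of documents whose nearest neighbour (cosine similarity,
    excluding the query itself) shares the query's category."""
    hits = 0
    for i, (q, cat) in enumerate(zip(embeddings, categories)):
        nearest = max((j for j in range(len(embeddings)) if j != i),
                      key=lambda j: cosine(q, embeddings[j]))
        hits += categories[nearest] == cat
    return hits / len(embeddings)

emb = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
cats = ["cs", "cs", "math", "math"]
# recall_at_1(emb, cats) == 1.0: every document's nearest neighbour
# shares its category in this toy layout
```

Contrastive training pushes same-category and title–abstract pairs toward high cosine similarity, which is exactly what this metric rewards.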
33 pages, 17932 KB  
Article
Early Detection of Aggressive Human Behavior in Video Streams Using Deep Spatiotemporal Models
by Aida Issembayeva, Anargul Shaushenova, Ardak Nurpeisova, Aidar Ispussinov, Buldyryk Suleimenova, Anargul Bekenova, Aliya Satybaldieva, Aigul Zholmukhanova and Galiya Mauina
Computers 2026, 15(5), 267; https://doi.org/10.3390/computers15050267 - 23 Apr 2026
Abstract
In this paper, we propose a spatiotemporal approach for binary classification of violent and non-violent behavior in real-world settings. The experimental pipeline includes video preprocessing, stratified data splitting, generation of temporally structured clips, and comparative evaluation of baseline models, including a convolutional neural network. We also developed a Residual Adaptive Motion Temporal Binary Heat Network model that combines frame color characteristics, residual motion descriptions, temporal feature fusion, an early risk assessment mechanism, and interpretable localization maps. Experiments were conducted on a balanced dataset of 2000 video clips. The proposed model demonstrated the best early warning performance: a supervision rate of 0.6, an F1 score of 0.9527, and a balanced accuracy of 0.9533. With full supervision, the F1 score was 0.9342, and the area under the receiver operating characteristic curve (AUC) was 0.9871. The practical significance of the work is that the proposed approach can be used as a decision support tool for the preliminary identification of potentially dangerous video fragments with subsequent manual verification, without the assumption of autonomous use in high-risk scenarios.
(This article belongs to the Special Issue Deep Learning and Explainable Artificial Intelligence (2nd Edition))
