Search Results (6,053)

Search Parameters:
Keywords = semantic information

13 pages, 2368 KB  
Article
DGE-YOLO: Dual-Branch Gathering and Attention for Efficient Accurate UAV Object Detection
by Kunwei Lv, Zhiren Xiao, Hang Ren, Xiali Li and Ping Lan
Appl. Sci. 2026, 16(8), 4004; https://doi.org/10.3390/app16084004 - 20 Apr 2026
Abstract
The rapid proliferation of unmanned aerial vehicles (UAVs) has amplified the need for robust and efficient object detection in diverse aerial environments. However, detecting small objects under complex conditions (e.g., low illumination, cluttered backgrounds, and thermal–visual discrepancies) remains challenging. While many existing detectors emphasize real-time inference, they often rely on weak or late fusion strategies, resulting in suboptimal utilization of complementary multi-modal cues. To address this limitation, we propose DGE-YOLO, an enhanced YOLO-based framework for effective infrared–visible (IR–RGB) multi-modal fusion in UAV object detection. DGE-YOLO adopts a dual-branch architecture for modality-specific feature extraction, preserving modality-aware representations before fusion. To strengthen cross-scale semantics, we introduce an Efficient Multi-scale Attention (EMA) module that improves feature discrimination across spatial resolutions. Furthermore, we replace the conventional neck with a Gather-and-Distribute module to reduce information loss during feature aggregation and improve multi-scale feature propagation. Extensive experiments on the DroneVehicle dataset demonstrate that DGE-YOLO consistently outperforms state-of-the-art baselines, confirming its effectiveness and practicality as an applied multi-modal detection solution for UAV scenarios.
(This article belongs to the Special Issue Applied Multimodal AI: Methods and Applications Across Domains)
22 pages, 12163 KB  
Article
SV-LIO: A Probabilistic Adaptive Semantic Voxel Map for LiDAR–Inertial Odometry
by Lixiao Yang and Youbing Feng
Electronics 2026, 15(8), 1744; https://doi.org/10.3390/electronics15081744 - 20 Apr 2026
Abstract
Accurate and real-time localization is a fundamental prerequisite for the autonomous navigation of mobile robots. LiDAR–Inertial Odometry (LIO) achieves high-precision state estimation and scene reconstruction in unknown environments by effectively fusing data from LiDAR and Inertial Measurement Units (IMUs). However, conventional LIO methods typically rely solely on geometric features during point cloud registration. In complex scenarios, such as outdoor unstructured or dynamic environments, these methods are often susceptible to reduced localization accuracy due to geometric degeneration or mismatches. To address these challenges, we propose SV-LIO, a probabilistic adaptive semantic voxel map for LiDAR–Inertial Odometry, which leverages point-wise semantic information from semantic segmentation to enhance registration accuracy and system robustness. Specifically, we construct a probabilistic adaptive semantic voxel map that extracts multi-scale spatial planes annotated with semantic information. Building on this representation, we employ a semantic-guided strategy for nearest-neighbor plane association between LiDAR scans and the local map, and construct semantic-weighted point-to-plane residuals to constrain pose estimation. By jointly optimizing the IMU-propagated pose prior and semantic-guided LiDAR observation constraints, SV-LIO realizes high-precision real-time state estimation and semantic scene reconstruction. Extensive experiments on the KITTI dataset demonstrate that SV-LIO achieves significant improvements in localization accuracy compared to state-of-the-art (SOTA) LIO methods, while also constructing semantic maps capable of providing rich environmental information.
(This article belongs to the Section Electrical and Autonomous Vehicles)
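To make the core constraint in the abstract above concrete, here is a minimal sketch of a semantic-weighted point-to-plane residual. This is an illustration only, not the authors' implementation; the class names and weight values are hypothetical.

```python
import math

def point_to_plane_residual(point, plane_point, normal, weight=1.0):
    """Semantic-weighted signed distance from a LiDAR point to a map plane."""
    norm = math.sqrt(sum(c * c for c in normal))
    unit = [c / norm for c in normal]
    dist = sum((p - q) * c for p, q, c in zip(point, plane_point, unit))
    return weight * dist

# Hypothetical per-class weights: trust structured surfaces more than
# vegetation, which is prone to mismatches.
SEMANTIC_WEIGHTS = {"building": 1.0, "road": 0.9, "vegetation": 0.4}

p = (1.0, 2.0, 0.5)        # LiDAR point
q = (0.0, 0.0, 0.0)        # a point on the associated plane
n = (0.0, 0.0, 1.0)        # plane normal (z-up ground plane)
residual = point_to_plane_residual(p, q, n, SEMANTIC_WEIGHTS["road"])
```

In a full LIO pipeline, one such residual per associated point would enter the joint optimization alongside the IMU-propagated prior.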
35 pages, 4414 KB  
Article
Superpixel-Based Deep Feature Analysis Coupled with Dense CRF for Land Use Change Detection Using High-Resolution Remote Sensing Images
by Jinqi Gong, Tie Wang, Zongchen Wang and Junyi Zhou
Remote Sens. 2026, 18(8), 1245; https://doi.org/10.3390/rs18081245 - 20 Apr 2026
Abstract
Land use change detection (LUCD) serves as a crucial technical cornerstone for natural resource management and ecological environment monitoring, playing an indispensable role in advancing the modernization of national governance capacities. Nonetheless, severe interference from radiometric variations on feature representation readily induces spurious changes and thus a high false alarm rate. Additionally, the challenge of balancing discriminative feature extraction and fine-grained contextual modeling leads to fragmented change regions and missed detections. To address these issues and eliminate the reliance on annotated samples, a novel framework is proposed for unsupervised LUCD, integrating superpixel-based deep feature analysis with a dense conditional random field (CRF). Firstly, relative radiometric correction and band-wise maximum stacking fusion are performed on the bi-temporal images. A simple non-iterative clustering (SNIC) algorithm is adopted to generate homogeneous superpixels with cross-temporal consistency. Then, a deep feature coupling mining mechanism is introduced to implement spatial–spectral feature extraction and in-depth parsing of invariant semantic information. Meanwhile, a difference confidence map based on dual features is constructed using superpixel-level discriminant vectors to enhance separability. Finally, leveraging homogeneous units with spatial correspondence, the global optimization model is redesigned for the task to achieve precise extraction of change regions, incorporating difference confidence, spatial adjacency relationships, and cross-temporal feature similarity into the dense CRF. The experimental results demonstrate that the proposed method achieves an average overall accuracy of over 90% across all datasets, with strong overall performance and good practical applicability. It effectively suppresses salt-and-pepper noise, significantly improves the recall rate of change regions (maintained at approximately 90%), and exhibits favorable robustness in complex land cover scenarios.
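The CRF optimization step described above can be illustrated with a toy energy function over superpixels. This sketch uses only a unary difference-confidence cost and a Potts smoothness term over adjacent superpixels; a true dense CRF additionally couples all pairs through feature-similarity kernels, which is omitted here. All numbers are hypothetical.

```python
def crf_energy(labels, unary, edges, w_pairwise=1.0):
    """Energy of a labeling: unary difference-confidence costs plus a
    Potts smoothness penalty for disagreeing adjacent superpixels."""
    e = sum(unary[i][label] for i, label in enumerate(labels))
    e += w_pairwise * sum(labels[i] != labels[j] for i, j in edges)
    return e

# Three superpixels; label 0 = unchanged, 1 = changed.
# unary[i][l] = cost of giving superpixel i label l (hypothetical values).
unary = [(0.1, 0.9), (0.8, 0.2), (0.7, 0.3)]
edges = [(0, 1), (1, 2)]                      # spatial adjacency

smooth = crf_energy([0, 1, 1], unary, edges)  # one coherent boundary
noisy = crf_energy([0, 1, 0], unary, edges)   # salt-and-pepper flip
```

Minimizing this energy prefers the coherent labeling, which is exactly how the smoothness term suppresses salt-and-pepper noise.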
25 pages, 3443 KB  
Article
Improved Parameter-Driven Automated Three-Class Segmentation for Concrete CT: A Reproducible Pipeline for Large-Scale Dataset Production
by Youxi Wang, Tianqi Zhang and Xinxiao Chen
Buildings 2026, 16(8), 1620; https://doi.org/10.3390/buildings16081620 - 20 Apr 2026
Abstract
The automated production of large-scale labeled datasets from concrete X-ray computed tomography (CT) images is a fundamental prerequisite for training and validating deep learning-based segmentation models. However, existing methods either require extensive manual annotation or rely on domain-specific deep learning models that themselves demand labeled data—a circular dependency. This paper presents a parameter-driven three-class segmentation framework that automatically classifies each pixel in a concrete CT slice into one of three material phases: void (air pores and cracks), coarse aggregate, and mortar matrix, generating annotation masks suitable for large-scale dataset production without manual labeling. The proposed method combines (1) fixed-threshold void detection calibrated to concrete CT grayscale characteristics; (2) adaptive percentile-based initial segmentation responsive to image-specific statistics; (3) multi-criteria connected component scoring based on area, shape descriptors (circularity, solidity, compactness, extent, aspect ratio), intensity distribution, and boundary gradient; (4) material science-informed size constraints aligned with concrete phase volume fractions; and (5) a material continuity enforcement module that applies topological hole-filling and conditional convex-hull consolidation to eliminate internal contamination within accepted aggregate regions, reducing boundary roughness by 7.6% and recovering misclassified boundary pixels. All parameters are centralized in a configuration file, enabling reproducible batch processing of 224 × 224 pixel CT slices at 0.07–1.12 s per image. Evaluated on 1007 concrete CT patches of 224 × 224 pixels cropped from 200 representative scan frames, the framework produces three-class segmentation masks with physically consistent void fractions (mean 3.2%), aggregate fractions (mean 32.4%), and mortar fractions (mean 64.4%), all within ranges reported in the concrete CT literature (used as a dataset-scale QC screen, not a validation metric). Primary outputs and the archived image–mask pairs for this work are provided as an 8-bit patch archive. For pixel-wise validation, we report IoU, Dice, and pixel accuracy on an independently labeled subset that can be unambiguously paired with the released predictions: averaged over 57 matched patches, mean pixel accuracy is 88.6%, macro-mean IoU is 74.7%, and macro-mean Dice is 84.9%. The framework provides a fully automated annotation pipeline for dataset production, eliminating manual labeling costs for concrete CT image collections. The generated datasets are suitable for training semantic segmentation networks such as U-Net and its variants.
(This article belongs to the Section Building Materials, and Repair & Renovation)
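As an illustration of step (3) in the abstract above, multi-criteria connected component scoring gated by size constraints might look like the following sketch. The descriptor weights, thresholds, and example values are hypothetical, not the paper's calibrated parameters.

```python
import math

def score_component(area, perimeter, hull_area, min_area=50, max_area=5000):
    """Score one connected component (aggregate candidate) from shape
    descriptors, gated by material-informed size limits. Weights and
    size limits here are hypothetical."""
    if not (min_area <= area <= max_area):   # size-constraint gate
        return 0.0
    circularity = 4 * math.pi * area / (perimeter ** 2)  # 1.0 for a disc
    solidity = area / hull_area                          # convexity measure
    return 0.5 * min(circularity, 1.0) + 0.5 * solidity

# A compact, solid blob (aggregate-like) scores high; a thin,
# crack-like component scores low; an undersized speck is rejected.
blob = score_component(area=1000, perimeter=120, hull_area=1050)
crack = score_component(area=400, perimeter=600, hull_area=2000)
tiny = score_component(area=10, perimeter=12, hull_area=11)
```

A real pipeline would add the intensity and boundary-gradient criteria the abstract lists and compare the score against an acceptance threshold.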
17 pages, 913 KB  
Article
An Empirical Study of Knowledge Graph-Enhanced RAG for Information Security Compliance
by Dimitar Jovanovski, Marija Stojcheva, Mila Dodevska, Petre Lameski, Igor Mishkovski and Dejan Gjorgjevikj
Information 2026, 17(4), 389; https://doi.org/10.3390/info17040389 - 20 Apr 2026
Abstract
Information security compliance has become critical for organizations worldwide, with the ISO/IEC 27000 family serving as the most widely adopted framework for establishing information security management systems. Despite their global acceptance, these standards present significant interpretation challenges due to their formal language, abstract structure, and extensive cross-referencing across 97 documents. Traditional retrieval-augmented generation (RAG) systems, which rely on independent text chunking and dense vector retrieval, prove inadequate for such highly interconnected regulatory materials, often fragmenting contextual relationships and reducing accuracy. This study introduces a privacy-preserving RAG framework that integrates LightRAG, a knowledge graph-based retrieval system, with locally hosted open-source language models. Unlike chunk-based RAG systems that treat document segments independently, the system in this study constructs a semantic knowledge graph that explicitly models relationships between clauses through typed edges representing cross-references, semantic similarity, and hierarchical dependencies. To enable rigorous evaluation, we developed a curated benchmark dataset of 222 multiple-choice questions with authoritative ground-truth answers, systematically constructed from official ISO standards, certification preparation materials, and academic sources. Through systematic evaluation on this benchmark, we show that knowledge graph-based retrieval achieves higher accuracy than chunk-based RAG and non-retrieval LLM baselines within the evaluated setup. The analysis indicates that embedding model quality is strongly associated with system performance, that hybrid retrieval modes combining local and global graph traversal tend to yield better accuracy, and that mid-sized open-source models paired with strong retrievers can approach the performance of larger proprietary systems. The best configuration achieves 90.54% accuracy, demonstrating the effectiveness of graph-structured retrieval for multiple-choice regulatory questions.
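The typed-edge clause graph described in the abstract above can be sketched with a simple adjacency structure. The clause identifiers and edge types below are illustrative stand-ins, not taken from the paper's actual graph.

```python
from collections import defaultdict

# Illustrative clause graph with typed edges mirroring the abstract's
# cross-reference / similarity / hierarchy relations. Clause IDs are
# made up for the example.
graph = defaultdict(list)

def add_edge(src, dst, edge_type):
    graph[src].append((dst, edge_type))

add_edge("ISO27001:A.5.1", "ISO27002:5.1", "cross_reference")
add_edge("ISO27001:A.5.1", "ISO27001:A.5", "parent_of")
add_edge("ISO27001:A.5.1", "ISO27001:A.8.2", "similar_to")

def neighbors(node, edge_types=None):
    """Expand a retrieved clause along selected edge types only."""
    return [dst for dst, t in graph[node]
            if edge_types is None or t in edge_types]

# Retrieval can then follow cross-references and hierarchy while
# ignoring weaker similarity links, unlike flat chunk retrieval.
hits = neighbors("ISO27001:A.5.1", {"cross_reference", "parent_of"})
```

Restricting expansion by edge type is one way graph retrieval keeps interconnected clauses together where chunk-based retrieval would fragment them.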
26 pages, 3904 KB  
Article
AcneFormer: A Lesion-Aware and Noise-Robust CNN–Transformer for Acne Image Classification
by Yongtao Zhou and Kui Zhao
Sensors 2026, 26(8), 2533; https://doi.org/10.3390/s26082533 - 20 Apr 2026
Abstract
Convolutional neural networks (CNNs) have been widely used for acne image classification due to their effectiveness in capturing the local texture of skin lesions. However, the locality of convolution operations limits their ability to model long-range dependencies. Vision Transformer (ViT) methods address this issue to some extent, but their high computational complexity and reliance on large-scale pre-training present challenges. Although hybrid CNN–Transformer architectures partly alleviate this conflict, acne images present task-specific challenges, including indistinct lesion boundaries, subtle inter-class variations, and various facial interference factors. In this paper, we propose AcneFormer, a lesion-aware and noise-robust CNN–Transformer architecture for acne image classification. We introduce three modules designed specifically for acne tasks: a Lesion Cue Enhancement (LCE) module to highlight discriminative multi-scale spatial patterns, a Cross-Layer Feature Transmission (CLFT) module to enhance cross-layer information flow in Transformers, and a Differential Semantic Denoising (DSD) module to suppress irrelevant responses during deep feature interaction. Extensive experiments show that AcneFormer outperforms several strong baselines. Ablation and external lesion-annotated analyses further show a consistent pattern: LCE mainly improves lesion-sensitive localization and class-balanced recognition, CLFT expands valid cross-depth lesion evidence, and DSD suppresses off-lesion semantic responses.
18 pages, 892 KB  
Article
Emotional Recognition Under Multimodal Conflict: A Gaze-Based Response Task
by Alessandro De Santis, Giusi Antonia Toto, Martina Rossi, Laura D’Amico and Pierpaolo Limone
Psychol. Int. 2026, 8(2), 26; https://doi.org/10.3390/psycholint8020026 - 20 Apr 2026
Abstract
Background: Emotional recognition relies on the integration of multiple affective cues. In everyday contexts, however, facial expressions, vocal prosody, and semantic content may convey incongruent emotional information, generating emotional conflict and increasing cognitive demands. Objective: The present study examined how multimodal emotional conflict affects emotion recognition during video viewing, focusing on short videos in which a single actor simultaneously conveyed incongruent emotional cues across facial, vocal, and semantic channels. Methods: Forty-seven undergraduate students completed a gaze-based response task in which, after each short video, they provided a single judgment of the overall emotion conveyed by the stimulus. The videos depicted either congruent or incongruent combinations of semantic content, facial expressions, and vocal prosody across six basic emotions and a neutral condition. Data were analyzed using repeated-measures ANOVAs and generalized linear mixed-effects models. Results: Accuracy was consistently higher for congruent than incongruent stimuli across all domains, indicating a robust emotional interference effect. Critically, the magnitude of this effect differed by domain. Semantic content showed the largest performance reduction under incongruence, followed by facial expression and vocal prosody. Mixed-effects models confirmed these effects while accounting for participant- and item-level variability and revealed a significant Congruency × Domain interaction. Conclusions: In a gaze-based response task requiring a single overall emotion judgment, emotional conflict disrupted recognition in a domain-specific manner, with semantic information being particularly vulnerable to multimodal interference.
(This article belongs to the Section Cognitive Psychology)
36 pages, 5744 KB  
Article
Multi-Scale Atrous Feature Fusion Based on a VGG19-UNet Encoder for Brain Tumor Segmentation
by Shoffan Saifullah and Rafał Dreżewski
Appl. Sci. 2026, 16(8), 3971; https://doi.org/10.3390/app16083971 - 19 Apr 2026
Abstract
Accurate brain tumor segmentation from magnetic resonance imaging (MRI) remains challenging due to heterogeneous tumor morphology, intensity variability, and multi-scale structural complexity. This study proposes a DeepLabV3+-based segmentation framework integrating a VGG19-UNet encoder, Atrous Spatial Pyramid Pooling (ASPP), and low-level feature refinement to [...] Read more.
Accurate brain tumor segmentation from magnetic resonance imaging (MRI) remains challenging due to heterogeneous tumor morphology, intensity variability, and multi-scale structural complexity. This study proposes a DeepLabV3+-based segmentation framework integrating a VGG19-UNet encoder, Atrous Spatial Pyramid Pooling (ASPP), and low-level feature refinement to simultaneously capture hierarchical semantics and boundary-sensitive spatial details. The architecture enhances receptive field coverage without additional downsampling while preserving fine-grained contour information during reconstruction. Extensive evaluation was conducted on the Figshare Brain Tumor Segmentation (FBTS) dataset and the BraTS 2021 and BraTS 2018 benchmarks, focusing on Whole Tumor segmentation across multiple MRI modalities and tumor grades. Under five-fold cross-validation, the proposed model achieved a mean Dice Similarity Coefficient of 0.9717 and Jaccard Index of 0.9456 on FBTS, with stable and competitive performance across FLAIR, T1, T2, and T1CE modalities in both HGG and LGG cases. Boundary-level analysis further confirmed controlled Hausdorff Distance and low Average Symmetric Surface Distance. Statistical validation and ablation analysis demonstrate consistent improvements over baseline U-Net configurations. The proposed framework provides a robust and computationally efficient solution for automated brain tumor segmentation across heterogeneous datasets.
(This article belongs to the Special Issue Research on Artificial Intelligence in Healthcare)
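The ASPP component in the abstract above rests on atrous (dilated) convolution, which enlarges the receptive field without extra downsampling. A minimal 1-D sketch (an illustration of the general operation, not the paper's implementation) shows the effect of the dilation rate:

```python
def atrous_conv1d(x, kernel, rate):
    """Toy 1-D atrous (dilated) convolution: kernel taps are spaced
    `rate` samples apart, so the effective receptive field grows with
    the rate while the kernel size stays fixed."""
    k = len(kernel)
    span = (k - 1) * rate + 1          # effective receptive field
    return [sum(kernel[j] * x[i + j * rate] for j in range(k))
            for i in range(len(x) - span + 1)]

signal = list(range(10))
dense = atrous_conv1d(signal, [1, 1, 1], rate=1)   # receptive field 3
atrous = atrous_conv1d(signal, [1, 1, 1], rate=3)  # receptive field 7
```

ASPP applies several such convolutions in parallel at different rates (in 2-D) and fuses the results, capturing multi-scale context in one layer.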
27 pages, 873 KB  
Article
ToR-Lite: A Lightweight Semantic Query Decomposition for Multi-Hop Retrieval-Augmented Generation in Cloud-Based AI Systems
by Hee-Kyong Yoo, Wonbae Kim and Nammee Moon
Appl. Sci. 2026, 16(8), 3966; https://doi.org/10.3390/app16083966 - 19 Apr 2026
Abstract
Cloud-based AI systems increasingly rely on Retrieval-Augmented Generation (RAG) to handle complex, knowledge-intensive queries. However, query decomposition for multi-hop retrieval—traditionally powered by large language models (LLMs)—incurs significant latency and cost, rendering it impractical for large-scale, cost-sensitive cloud deployments. We propose ToR-Lite, a lightweight, generative LLM-free semantic query decomposition framework designed to enhance multi-hop retrieval efficiency in cloud-based AI systems. ToR-Lite employs a novel Word-Window Splitting algorithm that detects semantic breakpoints via sliding window embeddings, effectively decomposing complex queries without expensive LLM inference. Experiments on the MultiHop-RAG benchmark (n = 2255) demonstrate that ToR-Lite achieves +6.03 pp Hits@10 and +0.89 pp Exact Match improvements over the baseline, while operating 3.18 times faster than LLM-based Adaptive ToR. Retrieval performance correlates monotonically with decomposition granularity: decomposition into three sub-queries (#Dq = 3) yields a +7.00 pp Hits@10 improvement, confirming that semantic granularity is a key driver of retrieval performance. Comparison with rule-based baselines confirms that these gains derive from the precision of semantic boundary detection rather than decomposition quantity alone. ToR-Lite delivers nearly twice the retrieval improvement per unit of computational cost, offering a practical and cost-effective solution for latency-sensitive cloud AI deployments.
(This article belongs to the Special Issue AI Technology and Security in Cloud/Big Data)
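The word-window splitting idea described above can be sketched as follows: embed adjacent sliding windows of words and cut wherever their cosine similarity drops below a threshold. The toy bag-of-letters "embedding" below merely stands in for a real sentence encoder, and the window size and threshold are hypothetical, not the paper's settings.

```python
import math

def embed(words):
    """Toy stand-in for a sentence encoder: normalized bag-of-letters.
    A real system would use dense sentence embeddings here."""
    v = [0.0] * 26
    for w in words:
        for c in w.lower():
            if c.isalpha():
                v[ord(c) - 97] += 1
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def split_query(query, window=3, threshold=0.85):
    """Cut the query where cosine similarity between adjacent word
    windows drops below a threshold (a semantic breakpoint)."""
    words = query.split()
    cuts = [0]
    for i in range(window, len(words) - window + 1):
        left, right = embed(words[i - window:i]), embed(words[i:i + window])
        if sum(a * b for a, b in zip(left, right)) < threshold:
            cuts.append(i)
    cuts.append(len(words))
    return [" ".join(words[a:b]) for a, b in zip(cuts, cuts[1:]) if b > a]

subqueries = split_query(
    "who directed the film and in what year was that director born")
```

Each resulting sub-query is then retrieved independently, giving multi-hop coverage without any generative LLM call.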
25 pages, 7376 KB  
Article
Adaptive Prompting-Driven Degradation-Aware Fusion for Infrared and Visible Images
by Qian Zhang, Jie Zhou and Hong Liang
Appl. Sci. 2026, 16(8), 3947; https://doi.org/10.3390/app16083947 - 18 Apr 2026
Abstract
Infrared and visible image fusion aims to combine the complementary advantages of thermal radiation information and rich texture details to generate more informative images for downstream perception tasks. However, existing deep learning-based methods usually assume ideal imaging conditions and often suffer from performance degradation in complex environments such as low illumination, rain interference, and strong lighting disturbances. To address this problem, this paper proposes an adaptive prompting-driven degradation-aware fusion framework. Specifically, a degradation-aware prompt generation module is introduced to automatically perceive degradation patterns from the input images and generate structured conditional prompts. These prompts guide the network to adaptively adjust feature representations through learnable affine modulation. Furthermore, a semantic-aligned feature learning strategy is designed to ensure consistent cross-modal representation in the latent space. Extensive experiments demonstrate that the proposed method achieves superior performance compared with several state-of-the-art fusion approaches under both normal and degraded conditions.
50 pages, 4359 KB  
Article
Evaluating CLAP and MERT for Fine-Grained Cymbal Classification: A Multi-Stage Representation Analysis
by Michael Starakis, Maximos Kaliakatsos-Papakostas and Chrisoula Alexandraki
Electronics 2026, 15(8), 1723; https://doi.org/10.3390/electronics15081723 - 18 Apr 2026
Abstract
This study presents a representation-centric evaluation of audio foundation models (AFMs) for fine-grained musical instrument analysis, focusing on cymbal classification. A confound-aware comparison of CLAP and MERT embeddings is conducted to examine how each latent space supports recoverability of acoustically and semantically relevant information. To support this analysis, the study introduces a representation-centric, confound-aware multi-stage evaluation framework that separates exploratory geometry, leakage-safe probing, and supporting unsupervised clustering evidence. The methodology is applied to a challenging cymbal dataset characterized by hierarchical labels, class imbalance, and subtle acoustic variation. Results reveal a target-dependent profile of representational strengths rather than a single overall winner. CLAP exhibits stronger variance concentration and more label-consistent local neighborhood organization, and it outperforms MERT on fine-grained, strike-related targets. MERT, however, retains a small but consistent advantage on higher-level cymbal-type classification. Unsupervised analyses show that these advantages reflect local neighborhood structure, not strong global cluster formation, and confound diagnostics indicate that size-related information remains largely type-mediated. Overall, the findings underscore the importance of structured, multi-stage evaluation for disentangling embedding geometry, recoverability, and confound effects while demonstrating the complementary strengths of AFMs in complex audio classification settings.
19 pages, 510 KB  
Article
From Vector Space to Symbolic Space: Informational and Semantic Analysis of Benign and DDoS IoT Traffic Using LLMs
by Mironela Pirnau, Iustin Priescu, Mihai-Alexandru Botezatu, Catalina Mihaela Priescu and Daniela Joita
Electronics 2026, 15(8), 1724; https://doi.org/10.3390/electronics15081724 - 18 Apr 2026
Abstract
This paper investigates the feasibility of using Large Language Models (LLMs) for the structural analysis of flow-based network data. This analysis is carried out in the presence of a structural difference between the multidimensional numerical space of IoT features and the symbolic space in which LLMs operate. The primary objective was the development of a formal framework that enables the controlled transformation of numerical data into linguistically analyzable semantic representations, without resorting to classification or machine learning mechanisms. We propose the Semantic Flow Encoding (SFE) mechanism, a deterministic method for robust discretization and behavioral abstraction that converts the numerical characteristics of Internet of Things (IoT) flows into structural semantic descriptions using the Canadian Institute for Cybersecurity Internet of Things Device Identification and Anomaly Detection (CIC IoT-DIAD) 2024 dataset. Using formal informational measures, we demonstrate the existence of an intrinsic structural difference between benign and DDoS traffic in the analyzed dataset. In the validation stage, we evaluated whether these informational differences are reflected at the level of linguistic abstraction through controlled inference experiments in IBM WatsonX. The present paper suggests that LLMs may support semantic auditing of distributional structure when guided by a formal encoding layer. In this manner, a reproducible framework for integrating numerical security data into language-model-based analysis is proposed.
34 pages, 8222 KB  
Article
DPF-DETR: Enhancing Drone Image Detection with Density Perception and Multi-Scale Feature Fusion
by Sidi Lai, Zhensong Li, Xiaotan Wei, Yutong Wang and Shiliang Zhu
Remote Sens. 2026, 18(8), 1221; https://doi.org/10.3390/rs18081221 - 17 Apr 2026
Abstract
The DPF-DETR model has been designed to address the challenges encountered in object detection within drone imagery, particularly in scenarios involving significant target scale variations, dense targets, and complex backgrounds. To overcome the limitations of traditional object detection methods, the Density Sensing Mechanism (DSM) and Adaptive Density Map Loss (AdaptiveDM Loss) have been incorporated into the model to provide fine-grained supervision signals. The DSM optimizes the query selection mechanism by utilizing density maps, enabling the number of queries to be adaptively adjusted based on the distribution density of targets, thus improving detection accuracy in dense regions. Furthermore, the precision of the model in detecting dense targets is enhanced by AdaptiveDM Loss, which dynamically adjusts the weights for object localization and classification. Multi-scale feature fusion capabilities are also improved by the Multi-Scale Feature Fusion Network (MSFFN) and the Selective Feature Integration Module (SFIM). The MSFFN refines the fusion of features, which improves the detection of targets across various scales, particularly in complex scenes. Additionally, SFIM enhances the detection accuracy for small targets and complex backgrounds by integrating low-level spatial features with high-level semantic information. The Context-Sensitive Feature Interaction Module (CSFIM) further optimizes multi-scale feature fusion through context-guided interactions, bridging the semantic gap between features of different scales, thus improving the robustness of the model in dense scenarios. Experimental results have shown that DPF-DETR outperforms traditional models and state-of-the-art detection methods across multiple datasets, demonstrating superior robustness and accuracy, especially in dense target detection and complex background scenarios.
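The density-sensing idea in the abstract above, adapting the number of detector queries to target density, can be sketched as a simple proportional allocation over image regions. The region layout, query budget, and floor below are hypothetical illustrations, not the paper's mechanism.

```python
def allocate_queries(density, regions, total=300, floor=10):
    """Give each region a query count proportional to its density-map
    mass, with a floor so sparse regions are never starved.
    Regions are (y0, y1, x0, x1) slices of the density map."""
    masses = [sum(density[y][x] for y in range(y0, y1) for x in range(x0, x1))
              for (y0, y1, x0, x1) in regions]
    m = sum(masses) or 1.0
    return [max(floor, round(total * mi / m)) for mi in masses]

# 100x100 toy density map: one dense quadrant, one sparse quadrant.
density = [[0.0] * 100 for _ in range(100)]
for y in range(50):
    for x in range(50):
        density[y][x] = 1.0            # dense area (e.g., parking lot)
for y in range(50, 100):
    for x in range(50, 100):
        density[y][x] = 0.1            # sparse area

counts = allocate_queries(density, [(0, 50, 0, 50), (50, 100, 50, 100)])
```

Concentrating queries where the density map predicts many targets is what lets a fixed query budget cover crowded regions without wasting capacity on empty ones.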
7 pages, 191 KB  
Proceeding Paper
Psychological Dimensions Involved in Image Communication: A Multidisciplinary Research Proposal for Analyzing Cognitive and Perceptual Processes in Visual Education
by Giusi Antonia Toto and Pierpaolo Limone
Proceedings 2026, 139(1), 7; https://doi.org/10.3390/proceedings2026139007 - 17 Apr 2026
Viewed by 107
Abstract
Image communication represents a fundamental domain of human experience that intersects cognitive neuroscience, educational psychology, and visual communication theory. The increasing digitalization of contemporary society has amplified the importance of visual literacy, defined as the ability to interpret, use, and create visual media. While neuroscientific research highlights the brain’s proficiency in processing visual information, significant gaps remain in understanding the underlying psychological mechanisms and their practical applications in educational contexts. This study proposes a multidisciplinary research design to systematically analyze these psychological dimensions. The research will integrate cognitive, perceptual, and pedagogical perspectives to understand how visual representations influence learning. The methodological design includes a multi-method approach combining experimental analysis, ethnographic observation, and psychometric evaluation on a stratified sample of 240 participants (aged 16–25) divided into three groups: high school students (n = 80), university students (n = 80), and young professionals (n = 80). The proposed methodology will utilize eye-tracking to analyze visual perception patterns, integrated with semantic differential methods to evaluate cognitive and affective associations with visual imagery. The expected results should clarify how the effectiveness of image communication depends on the coherence between technical and semantic aspects of visual imagery. The research aims to contribute to the theoretical framework of educational neuroscience, offering empirical evidence for optimizing teaching strategies based on multimodal visual communication. Full article
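The proposal above mentions semantic differential methods for evaluating cognitive and affective associations. As a minimal illustration of how such ratings are typically aggregated (this is a generic sketch, not the study's actual instrument; scale names and the 1–7 range are assumptions), each bipolar adjective scale is averaged across raters to form a profile:

```python
def semantic_differential_profile(ratings):
    """Average semantic-differential ratings per bipolar adjective scale.

    ratings: dict mapping scale name -> list of integer ratings
             (e.g., 1 = 'unpleasant' ... 7 = 'pleasant')
    Returns a dict mapping scale name -> mean rating across raters.
    """
    return {scale: sum(vals) / len(vals) for scale, vals in ratings.items()}
```

The resulting per-scale means can then be compared across participant groups or stimulus types.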
22 pages, 2585 KB  
Article
Enhancing Supply Chain Resilience in Textile SMEs: A Human-Centric Customer-to-Manufacturer Framework Using Public E-Commerce Data
by Chien-Chih Wang, Yu-Teng Hsu and Hsuan-Yu Kuo
J. Theor. Appl. Electron. Commer. Res. 2026, 21(4), 123; https://doi.org/10.3390/jtaer21040123 - 17 Apr 2026
Viewed by 180
Abstract
Upstream textile small and medium-sized enterprises (SMEs) frequently exhibit constrained supply chain resilience owing to persistent information latency and structural dependence on downstream orders. To address these challenges, this study develops and validates a customer-to-manufacturer (C2M) intelligence framework that enables data-driven production planning using publicly available e-commerce data. The framework incorporates ethically compliant acquisition of consumer demand signals, semantic translation of unstructured market data into textile engineering attributes, machine-learning-based demand forecasting, and human-centric decision support. Utilizing 3.87 million consumer comments from 127,846 product listings, a Neural Boosted Tree model with entity embeddings for textile attributes was constructed. This model achieved a mean R2 of 0.921 in cross-validation, surpassing benchmark methods. Consumer comment volume was validated as a proxy for sales activity, facilitating demand estimation. Forecasts were translated into production guidance using Monte Carlo simulation and a decision dashboard. In a 12-month field study at a Taiwanese dyeing SME, implementation resulted in a 28% reduction in inventory value, a 31% decrease in dye-lot changeovers, and a 16% increase in capacity utilization. This research extends the C2M paradigm from downstream retail contexts to upstream textile SMEs, proposes an integrated and operationally feasible intelligence framework for resource-constrained manufacturers, and demonstrates how digital intelligence can enhance supply chain resilience while supporting, rather than replacing, human decision-making. The results indicate that upstream textile SMEs can leverage publicly visible e-commerce signals to enhance production planning responsiveness, minimize inventory exposure and dye-lot disruptions, and strengthen resilience to demand uncertainty through planner-centered digital decision support. Full article
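The abstract above says forecasts were translated into production guidance via Monte Carlo simulation. As a generic sketch of that idea (not the paper's actual pipeline; the function name, the Gaussian demand assumption, and the service-level quantile rule are all assumptions), one can simulate demand scenarios around a point forecast and produce enough to cover a target fraction of them:

```python
import random

def production_target(point_forecast, rel_error, service_level=0.95,
                      n_sims=10_000, seed=42):
    """Monte Carlo sketch: demand forecast -> production quantity.

    point_forecast: forecast demand (units)
    rel_error: assumed relative forecast error (std dev as fraction of forecast)
    service_level: desired probability of covering simulated demand
    """
    rng = random.Random(seed)
    # Simulate non-negative demand scenarios around the forecast.
    sims = [max(0.0, rng.gauss(point_forecast, rel_error * point_forecast))
            for _ in range(n_sims)]
    sims.sort()
    # Take the service-level quantile of simulated demand.
    idx = min(n_sims - 1, int(service_level * n_sims))
    return sims[idx]
```

For a forecast of 1000 units with 20% relative error and a 95% service level, the target lands above the point forecast, reflecting the safety margin the quantile rule builds in.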
(This article belongs to the Section Data Science, AI, and e-Commerce Analytics)