
Search Results (96)

Search Parameters:
Keywords = Self-Knowledge Distillation

14 pages, 621 KB  
Article
Accelerating Realization of Effective Capacity in Lightweight Vision Models via Self-Competitive Distillation
by Weidong Zhang, Baoxin Li, Huan Liu, Pak Lun Kevin Ding and Ahmet Arda Dalyanci
Algorithms 2026, 19(4), 262; https://doi.org/10.3390/a19040262 - 1 Apr 2026
Abstract
We introduce Self-Competitive Distillation (SCD), a parameter-neutral training strategy aimed at influencing optimization dynamics without increasing model size or relying on external teachers. Two identical instances of the same architecture, initialized with different random seeds, are trained jointly and dynamically exchange asymmetric teacher–student roles based on instantaneous performance, enabling knowledge transfer between diverging optimization trajectories. Under fixed parameter and training budgets, SCD is observed to improve the realized effective capacity of lightweight architectures, yielding a higher test accuracy at matched epochs. Across multiple lightweight vision models and datasets, SCD demonstrates gains in both in-domain performance and cross-domain generalization, as measured by xScore. These results suggest that, within the evaluated experimental conditions, SCD can help mobile models make more effective use of training dynamics, while the underlying architecture remains the primary determinant of effective capacity in resource-constrained settings. Full article
(This article belongs to the Special Issue Advances in Deep Learning-Based Data Analysis)
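The role-swapping mechanism described in the abstract can be sketched as follows. This is a minimal illustration only: the function names, the temperature value, and the exact form of the distillation loss are assumptions for exposition, not taken from the paper.

```python
import math

def softmax(logits, T=1.0):
    # Temperature-softened softmax over a list of logits.
    m = max(logits)
    exps = [math.exp((z - m) / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl(p, q, eps=1e-12):
    # KL(p || q), with a small epsilon for numerical safety.
    return sum(pi * (math.log(pi + eps) - math.log(qi + eps))
               for pi, qi in zip(p, q))

def scd_step(logits_a, logits_b, loss_a, loss_b, T=4.0):
    """One self-competitive step (sketch): the instantaneously better model
    (lower task loss) acts as teacher, and the other adds a temperature-
    softened KL term toward the teacher's distribution."""
    if loss_a <= loss_b:
        teacher, student, student_name = logits_a, logits_b, "b"
    else:
        teacher, student, student_name = logits_b, logits_a, "a"
    p_t = softmax(teacher, T)   # treated as constant (no gradient)
    p_s = softmax(student, T)
    return student_name, (T ** 2) * kl(p_t, p_s)
```

Because the roles are reassigned from instantaneous performance at every step, neither copy is a permanent teacher, which is what makes the scheme parameter-neutral.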

31 pages, 5672 KB  
Article
D-SOMA: A Dynamic Self-Organizing Map-Assisted Multi-Objective Evolutionary Algorithm with Adaptive Subregion Characterization
by Xinru Zhang and Tianyu Liu
Computers 2026, 15(4), 207; https://doi.org/10.3390/computers15040207 - 26 Mar 2026
Abstract
Multi-objective evolutionary optimization faces significant challenges due to guidance mismatch under complex Pareto-front geometries. This paper proposes a dynamic self-organizing map-assisted evolutionary algorithm (D-SOMA), a manifold-aware framework that harmonizes knowledge-informed priors with unsupervised objective-space characterization. Specifically, a knowledge-informed guided resampling strategy is formulated to bridge stochastic initialization and targeted exploitation. By distilling spatial distribution priors from the decision-variable boundaries of early-stage elite solutions, it establishes a high-quality starting population biased towards promising regions. To capture the intrinsic geometry of the evolving population, a self-organizing map (SOM)-based adaptive subregion characterization strategy leverages the topological preservation of self-organizing maps to extract latent modeling parameters. This strategy adaptively determines subregion centers and influence radii, enabling a data-driven partitioning that respects the underlying manifold structure. Furthermore, a density-driven phase-responsive scale adjustment strategy is introduced. By synthesizing spatial density feedback and temporal evolutionary trajectories, it dynamically modulates the characterization granularity K, thereby maintaining a rigorous balance between geometric modeling fidelity and computational overhead. Extensive experiments on 50 benchmark problems from the DTLZ, WFG, MaF and RWMOP suites demonstrate that D-SOMA is statistically superior to seven state-of-the-art algorithms, exhibiting robust convergence and superior diversity across diverse problem landscapes. Full article
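The SOM-based subregion characterization relies on the standard topology-preserving update rule of self-organizing maps. A minimal sketch of one online SOM step follows; the grid layout, learning rate, and neighborhood width are illustrative assumptions, and D-SOMA's adaptive center/radius logic is not reproduced here.

```python
import math

def som_update(centers, x, lr=0.5, sigma=1.0):
    """One online SOM step on a 1-D grid of K centers: find the
    best-matching unit (BMU) in data space, then pull every center toward
    x with a Gaussian weight computed in grid space -- this neighborhood
    coupling is what preserves topology."""
    def dist2(c):
        return sum((ci - xi) ** 2 for ci, xi in zip(c, x))
    bmu = min(range(len(centers)), key=lambda k: dist2(centers[k]))
    updated = []
    for k, c in enumerate(centers):
        h = math.exp(-((k - bmu) ** 2) / (2 * sigma ** 2))  # neighborhood
        updated.append([ci + lr * h * (xi - ci) for ci, xi in zip(c, x)])
    return updated, bmu
```

After convergence, the trained centers can serve as data-driven subregion centers, with per-unit spread estimates playing the role of influence radii.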

29 pages, 3177 KB  
Article
Dual-Distillation Vision-Language Model for Multimodal Emotion Recognition in Conversation with Quantized Edge Deployment
by DeogHwa Kim, Yu il Lee, Da Hyun Yoon, Byeong Jun Kim and Deok-Hwan Kim
Appl. Sci. 2026, 16(6), 3103; https://doi.org/10.3390/app16063103 - 23 Mar 2026
Abstract
Multimodal Emotion Recognition in Conversation (ERC) has attracted attention as a key technology in human–computer interaction, mental healthcare, and intelligent services. However, deploying ERC in real-world settings remains challenging due to reliability gaps across modalities, instability in visual representations, and the high computational cost of large pretrained models. In particular, on resource-constrained edge devices, it is difficult to reduce model size and inference latency while preserving accuracy. To address these challenges, we jointly propose a knowledge-distillation-based multimodal ERC model, called DDVLM, with an edge-optimized Weight-Only Quantization (WOQ) pipeline for efficient edge deployment. DDVLM assigns the textual modality as the teacher and the visual modality as the student, transferring emotion-distribution knowledge to improve non-verbal representations and stabilize multimodal learning. In addition, Exponential Moving Average (EMA)-based self-distillation enhances the consistency and generalization capability of text features. Meanwhile, the proposed WOQ pipeline quantizes linear-layer weights to INT8 while preserving precision-sensitive operations in mixed precision, thereby minimizing accuracy loss and reducing model size, memory usage, and inference latency. Experiments on the MELD dataset demonstrated that the proposed approach achieves state-of-the-art performance while also enabling real-time inference on edge devices such as NVIDIA Jetson. Overall, this work presents a practical ERC framework that jointly considers accuracy and deployability. Full article
(This article belongs to the Special Issue Multimodal Emotion Recognition and Affective Computing)
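The weight-only quantization idea behind the WOQ pipeline can be illustrated with symmetric per-tensor INT8 quantization. This is a generic sketch under that assumption; the paper's actual scaling granularity and mixed-precision rules may differ.

```python
def woq_quantize(weights):
    """Symmetric per-tensor INT8 weight-only quantization: weights become
    integers in [-127, 127] plus a single float scale; activations and
    precision-sensitive operations stay in floating point."""
    m = max(abs(w) for w in weights) or 1.0
    scale = m / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def woq_dot(x, q, scale):
    """Inference with quantized weights: accumulate against the integer
    weights, then apply the scale once at the end."""
    return scale * sum(xi * qi for xi, qi in zip(x, q))
```

Storing int8 weights plus one scale cuts weight memory roughly 4x versus float32 while keeping the dot product close to its full-precision value.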

30 pages, 2362 KB  
Article
SGCAD: A SAR-Guided Confidence-Gated Distillation Framework of Optical and SAR Images for Water-Enhanced Land-Cover Semantic Segmentation
by Junjie Ma, Zhiyi Wang, Yanyi Yuan and Fengming Hu
Remote Sens. 2026, 18(6), 962; https://doi.org/10.3390/rs18060962 - 23 Mar 2026
Abstract
Multimodal fusion of synthetic aperture radar (SAR) and optical imagery is widely used in Earth observation for applications such as land-cover mapping, surface-water mapping (including post-event flood mapping under near-synchronous acquisitions), and land-use inventory. Optical images provide rich spectral and texture cues, whereas SAR offers all-weather structural information that is complementary but heterogeneous. In practice, this heterogeneity often introduces fusion conflicts in multi-class segmentation, causing critical categories such as water bodies to be under-optimized. To address this issue, this paper presents a SAR-guided class-aware knowledge distillation (SGCAD) method for multimodal semantic segmentation. First, a SAR-only HRNet is trained as a water-expert teacher to learn discriminative backscattering and boundary priors for water extraction. Second, a lightweight multimodal student model (LightMCANet) is optimized using a class-aware distillation strategy that transfers teacher knowledge only within high-confidence water regions, thereby suppressing noisy supervision and reducing interference to other classes. Third, a SAR edge guidance module (SEGM) is introduced in the decoder to enhance boundary continuity for slender structures such as water bodies and roads. Overall, SGCAD improves targeted category learning while maintaining stable performance across the remaining classes. Experiments on a self-built dataset from GF-1 optical and LuTan-1 SAR imagery demonstrate higher overall accuracy and more coherent water/road predictions than representative baselines. Future work will extend the proposed distillation scheme to additional categories and broader geographic scenes. Full article
(This article belongs to the Section Remote Sensing Image Processing)

18 pages, 4746 KB  
Article
MS2-CL: Multi-Scale Self-Supervised Learning for Camera to LiDAR Cross-Modal Place Recognition
by Wen Liu, Lei Ma, Xuanshun Zhuang and Zhongliang Deng
Sensors 2026, 26(5), 1561; https://doi.org/10.3390/s26051561 - 2 Mar 2026
Abstract
Place recognition is a fundamental challenge for robotics and autonomous vehicles. While visual place recognition has achieved high precision, cross-modal place recognition—specifically, visual localization within large-scale point cloud maps—remains a formidable problem. Existing methods often struggle with the significant domain gap between modalities and can be computationally prohibitive, especially those processing raw 3D point clouds. Furthermore, they frequently fail to learn features invariant to viewpoint and scale variations, limiting generalization to unseen environments. In this paper, we formulate cross-modal recognition as a problem of learning a scale-invariant, unified embedding space. Our framework employs a hierarchical Swin Transformer to extract multi-scale features from unified 2D representations of both modalities. The central principle of our method is a multi-scale self-distillation paradigm, which recasts feature learning as an intra-modal knowledge transfer task. Specifically, the coarse-scale “teacher” features provide supervision for the fine-scale “student” features. The final inter-modal alignment is then achieved via a global contrastive loss, exclusively leveraging the semantically rich “teacher” embeddings to ensure a reliable and discriminative matching. Extensive experiments on the KITTI and KITTI-360 datasets demonstrate that our method achieves state-of-the-art performance. Notably, using only the KITTI-trained model without fine-tuning, Recall@1 exceeds 60% on all evaluable sequences of KITTI-360 at a 10 m threshold. Code and pre-trained models will be made publicly available upon acceptance. Full article
(This article belongs to the Section Radar Sensors)

14 pages, 8345 KB  
Article
A Self-Mutual Learning Framework Based on Knowledge Distillation for Scene Text Detection
by Weisheng Zheng, Xiaofei Zhang, Kefan Qu, Ye Tao, Juan Feng and Wangpeng He
Electronics 2026, 15(5), 1037; https://doi.org/10.3390/electronics15051037 - 2 Mar 2026
Abstract
Knowledge distillation serves as a prevalent model compression strategy within scene text detection, enabling the transfer of learned representations from a high-capacity teacher architecture to a streamlined student counterpart. Building upon this concept, deep mutual learning alleviates dependence on the teacher model through interactive learning among student models. However, existing deep mutual learning networks inadequately address the complex redundant backgrounds and text feature distributions in scene text images, failing to effectively balance the trade-off between model performance and lightweight design. To address this issue, this paper proposes an improved self-mutual learning framework based on deep mutual learning. By employing a design that incorporates parallel multi-detection heads and interactive learning, the proposed approach simplifies the model training process while significantly improving detection accuracy. Specifically, the framework introduces a pruning mechanism that enables different detection heads to capture input features with varying degrees of sparsity. This not only reduces interference from redundant backgrounds but also leads to a more lightweight implementation. Moreover, varying feature sparsity among detection heads promotes more diverse knowledge exchange throughout mutual learning. This substantially boosts the distilled model’s resilience in intricate text environments. Comprehensive evaluations show that our approach achieves superior F-measure scores compared to leading knowledge distillation methods. Full article
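The deep-mutual-learning objective that this framework builds on can be sketched as follows: each peer minimizes its own cross-entropy plus a KL term toward the other peer's current prediction. The function names and the unit weighting of the KL term are illustrative assumptions; the paper's multi-head, pruned variant is not reproduced here.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl(p, q, eps=1e-12):
    # KL(p || q) between two discrete distributions.
    return sum(pi * (math.log(pi + eps) - math.log(qi + eps))
               for pi, qi in zip(p, q))

def mutual_losses(logits_a, logits_b, label):
    """Deep mutual learning (sketch): each peer gets its own cross-entropy
    on the ground-truth label plus a KL penalty toward the other peer, so
    the cohort exchanges knowledge symmetrically with no fixed teacher."""
    pa, pb = softmax(logits_a), softmax(logits_b)
    ce_a = -math.log(pa[label] + 1e-12)
    ce_b = -math.log(pb[label] + 1e-12)
    return ce_a + kl(pb, pa), ce_b + kl(pa, pb)
```

When the peers agree, the KL terms vanish and training reduces to independent supervised learning; disagreement is what drives the knowledge exchange.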

25 pages, 1279 KB  
Article
SSKD: Stepwise Self-Knowledge Distillation for Binary Neural Networks in Keyword Spotting
by Hailong Zou, Jionghao Zhang, Jun Li, Hang Ran, Wulve Yang, Rui Zhou, Zenghui Yu, Yi Zhan and Shushan Qiao
Appl. Sci. 2026, 16(4), 2021; https://doi.org/10.3390/app16042021 - 18 Feb 2026
Abstract
Power-aware hardware implementations of keyword spotting (KWS) require a small memory footprint, low computational complexity, and high accuracy. Binary neural networks (BNNs) naturally satisfy these constraints: they quantize both weights and activations to 1 bit, which reduces storage and replaces most multiply–accumulate operations with bitwise operations. However, such extreme quantization incurs substantial information loss and leaves a noticeable accuracy gap relative to full-precision models. Optimization is also more difficult because the sign function is non-differentiable, and surrogate-gradient updates introduce gradient mismatch. To preserve the hardware benefits of BNNs while alleviating the accuracy degradation induced by 1-bit quantization, this article addresses the problem from two complementary aspects. First, a Stepwise Self-Knowledge Distillation (SSKD) training approach is proposed to improve the student BNN's accuracy: the SSKD framework provides effective supervision for student BNNs, a Stepwise Training Strategy improves training stability and accuracy, and a Weight Scaling Factor strengthens the student's representational capability. Second, an extremely lightweight Binary Temporal Convolutional ResNet (BTC-ResNet) is proposed, which greatly reduces the parameter count and computation required for inference. Experiments on the GSCD v1 and GSCD v2 benchmarks demonstrate the effectiveness of our methods for low-power keyword spotting. For the 12-class task, BTC-ResNet14 achieves 97.23% accuracy on GSCD v1 and 97.31% on GSCD v2 with 0.75 Mb parameters and 1.35 M FLOPs. For the 35-class task on GSCD v2, it reaches 95.56% accuracy with 0.76 Mb parameters and 1.35 M FLOPs. These results indicate that our method achieves a competitive accuracy–efficiency balance relative to recent distillation-based BNN KWS baselines, which is promising for future KWS deployment on low-power hardware devices. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
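The two BNN ingredients the abstract mentions, a weight scaling factor and surrogate gradients for the non-differentiable sign function, can be sketched generically. The XNOR-Net-style mean-magnitude scale and the clipped straight-through estimator below are common choices used here as stand-ins; the paper's exact formulations may differ.

```python
def binarize(weights):
    """Binarization with a per-tensor weight scaling factor: W is
    approximated by alpha * sign(W), with alpha = mean(|W|) so the
    binary weights keep the original magnitude on average."""
    alpha = sum(abs(w) for w in weights) / len(weights)
    return [alpha if w >= 0 else -alpha for w in weights]

def ste_backward(weights, upstream):
    """Straight-through estimator: sign() has zero gradient almost
    everywhere, so the backward pass copies the upstream gradient,
    clipped to |w| <= 1 for stability."""
    return [g if abs(w) <= 1.0 else 0.0
            for w, g in zip(weights, upstream)]
```

The mismatch between this surrogate backward pass and the true (zero) gradient of sign() is exactly the "gradient mismatch" the distillation framework is meant to compensate for.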

23 pages, 14619 KB  
Article
Edge-Distilled and Local–Global Feature Selection Network for Hyperspectral Image Super-Resolution
by Xinzhao Li, Mengzhe Fan, Xiaoqing Zheng and Jiandong Shang
Sensors 2026, 26(3), 1055; https://doi.org/10.3390/s26031055 - 6 Feb 2026
Abstract
In recent years, methods based on convolutional neural networks have achieved significant progress in hyperspectral image super-resolution. However, existing methods still face two key challenges: (1) they fail to fully extract edge detail information from hyperspectral images; (2) they struggle to simultaneously capture local and global features. To address these issues, we propose an Edge-Distilled and Local–Global Feature Selection network (EDLGFS) for hyperspectral image super-resolution. This network aims to effectively leverage edge details and local–global features, thereby enhancing super-resolution reconstruction quality. Firstly, we design an edge-guided super-resolution network based on knowledge distillation, which transfers edge knowledge to improve the reconstruction. Secondly, we propose a Local–Global Feature Selection mechanism (LGFS), which integrates convolutions of different sizes with the self-attention mechanism. This design models spatial correlations across features with different receptive fields, achieving efficient feature selection to more effectively capture local and global features. Finally, we propose a dynamic loss mechanism to more effectively balance the contribution of each loss term. Extensive experimental results on three public datasets demonstrate that the proposed EDLGFS achieves superior super-resolution reconstruction quality. Full article
(This article belongs to the Special Issue Intelligent Sensing and Artificial Intelligence for Image Processing)

26 pages, 11755 KB  
Article
SAMKD: A Hybrid Lightweight Algorithm Based on Selective Activation and Masked Knowledge Distillation for Multimodal Object Detection
by Ruitao Lu, Zhanhong Zhuo, Siyu Wang, Jiwei Fan, Tong Shen and Xiaogang Yang
Remote Sens. 2026, 18(3), 450; https://doi.org/10.3390/rs18030450 - 1 Feb 2026
Abstract
Multimodal object detection is currently a research hotspot in computer vision. However, the fusion of visible and infrared modalities inevitably increases computational complexity, making most high-performance detection models difficult to deploy on resource-constrained UAV edge devices. Although pruning and knowledge distillation are widely used for model compression, applying them independently often leads to an unstable accuracy–efficiency trade-off. Therefore, this paper proposes a hybrid lightweight algorithm named SAMKD, which combines selective activation pruning with masked knowledge distillation in a staged manner to improve efficiency while maintaining detection performance. Specifically, the selective activation network pruning model (SAPM) first reduces redundant computation by dynamically adjusting network weights and the activation state of input data to generate a lightweight student network. Then, the mask binary classification knowledge distillation (MBKD) strategy is introduced to compensate for this degradation by guiding the student network to recover missing representation patterns under masked feature learning. Moreover, MBKD reformulates classification logits into multiple foreground–background binary mappings, effectively alleviating the severe foreground–background imbalance commonly observed in UAV aerial imagery. This paper constructs a multimodal UAV aerial imagery object detection dataset, M2UD-18K, which includes 9 types of targets and over 18,000 pairs. Extensive experiments show that SAMKD performs well on the self-constructed M2UD-18K dataset, as well as the public DroneVehicle dataset, achieving a favorable trade-off between detection accuracy and detection speed. Full article

43 pages, 1250 KB  
Review
Challenges and Opportunities in Tomato Leaf Disease Detection with Limited and Multimodal Data: A Review
by Yingbiao Hu, Huinian Li, Chengcheng Yang, Ningxia Chen, Zhenfu Pan and Wei Ke
Mathematics 2026, 14(3), 422; https://doi.org/10.3390/math14030422 - 26 Jan 2026
Cited by 1
Abstract
Tomato leaf diseases cause substantial yield and quality losses worldwide, yet reliable detection in real fields remains challenging. Two practical bottlenecks dominate current research: (i) limited data, including small samples for rare diseases, class imbalance, and noisy field images, and (ii) multimodal heterogeneity, where RGB images, textual symptom descriptions, spectral cues, and optional molecular assays provide complementary but hard-to-align evidence. This review summarizes recent advances in tomato leaf disease detection under these constraints. We first formalize the problem settings of limited and multimodal data and analyze their impacts on model generalization. We then survey representative solutions for limited data (transfer learning, data augmentation, few-/zero-shot learning, self-supervised learning, and knowledge distillation) and multimodal fusion (feature-, decision-, and hybrid-level strategies, with attention-based alignment). Typical model–dataset pairs are compared, with emphasis on cross-domain robustness and deployment cost. Finally, we outline open challenges—including weak generalization in complex field environments, limited interpretability of multimodal models, and the absence of unified multimodal benchmarks—and discuss future opportunities toward lightweight, edge-ready, and scalable multimodal systems for precision agriculture. Full article
(This article belongs to the Special Issue Structural Networks for Image Application)

27 pages, 6058 KB  
Article
Hierarchical Self-Distillation with Attention for Class-Imbalanced Acoustic Event Classification in Elevators
by Shengying Yang, Lingyan Chou, He Li, Zhenyu Xu, Boyang Feng and Jingsheng Lei
Sensors 2026, 26(2), 589; https://doi.org/10.3390/s26020589 - 15 Jan 2026
Abstract
Acoustic-based anomaly detection in elevators is crucial for predictive maintenance and operational safety, yet it faces significant challenges in real-world settings, including pervasive multi-source acoustic interference within confined spaces and severe class imbalance in collected data, which critically degrades the detection performance for minority yet critical acoustic events. To address these issues, this study proposes a novel hierarchical self-distillation framework. The method embeds auxiliary classifiers into the intermediate layers of a backbone network, creating a deep teacher–shallow student knowledge transfer paradigm optimized jointly via Kullback–Leibler divergence and feature alignment losses. A self-attentive temporal pooling layer is introduced to adaptively weigh discriminative time-frequency features, thereby mitigating temporal overlap interference, while a focal loss function is employed specifically in the teacher model to recalibrate the learning focus towards hard-to-classify minority samples. Extensive evaluations on the public UrbanSound8K dataset and a proprietary industrial elevator audio dataset demonstrate that the proposed model achieves superior performance, exceeding 90% in both accuracy and F1-score. Notably, it yields substantial improvements in recognizing rare events, validating its robustness for elevator acoustic monitoring. Full article
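The focal loss used to recalibrate the teacher toward hard minority samples has a standard closed form, sketched below for the binary case. The default gamma and alpha values are the commonly used ones, assumed here rather than taken from the paper.

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: scales cross-entropy by (1 - p_t)^gamma so
    well-classified (easy, majority-class) samples contribute little and
    training refocuses on hard, rare events."""
    p = min(max(p, 1e-12), 1.0 - 1e-12)   # numerical safety
    p_t = p if y == 1 else 1.0 - p        # probability of the true class
    a_t = alpha if y == 1 else 1.0 - alpha
    return -a_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

For a confidently correct prediction the (1 - p_t)^gamma factor is nearly zero, so virtually all of the gradient signal comes from misclassified or uncertain samples, which is what counteracts class imbalance.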

17 pages, 710 KB  
Article
KD-SecBERT: A Knowledge-Distilled Bidirectional Encoder Optimized for Open-Source Software Supply Chain Security in Smart Grid Applications
by Qinman Li, Xixiang Zhang, Weiming Liao, Tao Dai, Hongliang Zheng, Beiya Yang and Pengfei Wang
Electronics 2026, 15(2), 345; https://doi.org/10.3390/electronics15020345 - 13 Jan 2026
Abstract
With the acceleration of digital transformation, open-source software has become a fundamental component of modern smart grids and other critical infrastructures. However, the complex dependency structures of open-source ecosystems and the continuous emergence of vulnerabilities pose substantial challenges to software supply chain security. In power information networks and cyber–physical control systems, vulnerabilities in open-source components integrated into Supervisory Control and Data Acquisition (SCADA), Energy Management System (EMS), and Distribution Management System (DMS) platforms and distributed energy controllers may propagate along the supply chain, threatening system security and operational stability. In such application scenarios, large language models (LLMs) often suffer from limited semantic accuracy when handling domain-specific security terminology, as well as deployment inefficiencies that hinder their practical adoption in critical infrastructure environments. To address these issues, this paper proposes KD-SecBERT, a domain-specific semantic bidirectional encoder optimized through multi-level knowledge distillation for open-source software supply chain security in smart grid applications. The proposed framework constructs a hierarchical multi-teacher ensemble that integrates general language understanding, cybersecurity-domain knowledge, and code semantic analysis, together with a lightweight student architecture based on depthwise separable convolutions and multi-head self-attention. In addition, a dynamic, multi-dimensional distillation strategy is introduced to jointly perform layer-wise representation alignment, ensemble knowledge fusion, and task-oriented optimization under a progressive curriculum learning scheme. Extensive experiments conducted on a multi-source dataset comprising National Vulnerability Database (NVD) and Common Vulnerabilities and Exposures (CVE) entries, security-related GitHub code, and Open Web Application Security Project (OWASP) test cases show that KD-SecBERT achieves an accuracy of 91.3%, a recall of 90.6%, and an F1-score of 89.2% on vulnerability classification tasks, indicating strong robustness in recognizing both common and low-frequency security semantics. These results demonstrate that KD-SecBERT provides an effective and practical solution for semantic analysis and software supply chain risk assessment in smart grids and other critical-infrastructure environments. Full article

21 pages, 1342 KB  
Article
TSCL-LwF: A Cross-Subject Emotion Recognition Model via Multi-Scale CNN and Incremental Learning Strategy
by Chunting Wan, Xing Tang, Cong Hu, Juan Yang, Shaorong Zhang and Dongyi Chen
Brain Sci. 2026, 16(1), 84; https://doi.org/10.3390/brainsci16010084 - 9 Jan 2026
Abstract
Background/Objectives: Wearable affective human–computer interaction increasingly relies on sparse-channel EEG signals to ensure comfort and practicality in real-life scenarios. However, the limited information provided by sparse-channel EEG, together with pronounced inter-subject variability, makes reliable cross-subject emotion recognition particularly challenging. Methods: To address these challenges, we propose a cross-subject emotion recognition model, termed TSCL-LwF, based on sparse-channel EEG. It combines a multi-scale convolutional network (TSCL) and an incremental learning strategy with Learning without Forgetting (LwF). Specifically, the TSCL is utilized to capture the spatio-temporal characteristics of sparse-channel EEG, employing diverse receptive fields of convolutional networks to extract and fuse the interaction information within the local prefrontal area. The incremental learning strategy with LwF introduces a limited set of labeled target domain data and incorporates a knowledge distillation loss to retain the source domain knowledge while enabling rapid target domain adaptation. Results: Experiments on the DEAP dataset show that the proposed TSCL-LwF achieves an accuracy of 77.26% for valence classification and 80.12% for arousal classification. Moreover, it also exhibits superior accuracy when evaluated on the self-collected dataset EPPVR. Conclusions: The successful implementation of cross-subject emotion recognition based on sparse-channel EEG will facilitate the development of wearable EEG technologies with practical applications. Full article
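The Learning-without-Forgetting objective combines a supervised loss on the new (target-domain) data with a distillation term that keeps the model's outputs on the old task close to responses recorded before adaptation. A minimal sketch follows; the function names, weighting `lam`, and temperature are assumptions, not the paper's exact settings.

```python
import math

def softmax(logits, T=1.0):
    m = max(logits)
    exps = [math.exp((z - m) / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def lwf_loss(new_logits, old_logits_now, old_logits_recorded,
             label, lam=1.0, T=2.0):
    """LwF objective (sketch): cross-entropy on the new-task label plus a
    soft cross-entropy (distillation) term anchoring the model's old-task
    outputs to the responses recorded before adaptation began."""
    ce = -math.log(softmax(new_logits)[label] + 1e-12)
    p_old = softmax(old_logits_recorded, T)   # frozen pre-adaptation targets
    p_now = softmax(old_logits_now, T)
    kd = -sum(po * math.log(pn + 1e-12) for po, pn in zip(p_old, p_now))
    return ce + lam * (T ** 2) * kd
```

The distillation term is minimized exactly when the adapted model still reproduces its pre-adaptation (source-domain) responses, which is how catastrophic forgetting is suppressed.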

42 pages, 3251 KB  
Article
Efficient and Accurate Epilepsy Seizure Prediction and Detection Based on Multi-Teacher Knowledge Distillation RGF-Model
by Wei Cao, Qi Li, Anyuan Zhang and Tianze Wang
Brain Sci. 2026, 16(1), 83; https://doi.org/10.3390/brainsci16010083 - 9 Jan 2026
Viewed by 719
Abstract
Background: Epileptic seizures are unpredictable, and while existing deep learning models achieve high accuracy, their deployment on wearable devices is constrained by high computational costs and latency. To address this, we propose the RGF-Model, a lightweight network that unifies seizure prediction and detection within a single causal framework. Methods: By integrating Feature-wise Linear Modulation (FiLM) with a Ring-Buffer Gated Recurrent Unit (Ring-GRU), the model achieves adaptive task-specific feature conditioning while strictly enforcing causal consistency for real-time inference. A multi-teacher knowledge distillation strategy is employed to transfer complementary knowledge from complex teacher ensembles to the lightweight student, significantly reducing complexity without sacrificing accuracy. Results: Evaluations on the CHB-MIT and Siena datasets demonstrate that the RGF-Model outperforms state-of-the-art teacher models in terms of efficiency while maintaining comparable accuracy. Specifically, on CHB-MIT, it achieves 99.54% Area Under the Curve (AUC) and a False Prediction Rate of 0.01 per hour (FPR/h) for prediction, and 98.78% Accuracy (Acc) for detection, with only 0.082 million parameters. Statistical significance was assessed against a random-predictor baseline (p < 0.05). Conclusions: The results indicate that the RGF-Model provides a highly efficient solution for real-time wearable epilepsy monitoring. Full article
(This article belongs to the Section Neurotechnology and Neuroimaging)
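The multi-teacher distillation described above, where a lightweight student matches an aggregate of several teachers' softened outputs, can be sketched as below. This is an illustrative NumPy sketch under the assumption of a (possibly weighted) average of teacher distributions; the paper's actual aggregation scheme may differ.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-softened softmax over the last axis."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def multi_teacher_kd(student_logits, teacher_logits_list, weights=None, T=4.0):
    """Soft-label distillation from an ensemble of teachers.

    The target is a weighted average of each teacher's temperature-
    softened distribution; the loss is the cross-entropy between that
    target and the student's softened output, scaled by T^2.
    """
    k = len(teacher_logits_list)
    w = np.full(k, 1.0 / k) if weights is None else np.asarray(weights, float)
    w = w / w.sum()  # normalize teacher weights
    target = sum(wi * softmax(tl, T) for wi, tl in zip(w, teacher_logits_list))
    p_student = softmax(student_logits, T)
    return -np.mean(np.sum(target * np.log(p_student + 1e-12), axis=-1)) * T * T
```

A student whose softened output matches the averaged teacher distribution incurs a lower loss than one that contradicts it, which is the mechanism by which complementary teacher knowledge is transferred.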

29 pages, 11833 KB  
Article
MIE-YOLO: A Multi-Scale Information-Enhanced Weed Detection Algorithm for Precision Agriculture
by Zhoujiaxin Heng, Yuchen Xie and Danfeng Du
AgriEngineering 2026, 8(1), 16; https://doi.org/10.3390/agriengineering8010016 - 1 Jan 2026
Viewed by 1003
Abstract
As precision agriculture places higher demands on real-time field weed detection and recognition accuracy, this paper proposes MIE-YOLO (Multi-scale Information Enhanced), a multi-scale information-enhanced weed detection algorithm for precision agriculture. Built on the popular YOLO12 (You Only Look Once 12) model, MIE-YOLO combines edge-aware multi-scale fusion with additive gated blocks and two-stage self-distillation to improve small-object and boundary detection while remaining lightweight. First, the MS-EIS (Multi-Scale-Edge Information Select) architecture is designed to aggregate and select edge and texture information at different scales, enhancing fine-grained feature representation. Next, the Add-CGLU (Additive-Convolutional Gated Linear Unit) pyramid network is proposed, which improves the representational power and information-transfer efficiency of multi-scale features through additive fusion and gating mechanisms. Finally, the DEC (Detail-Enhanced Convolution) detection head is introduced to enhance detail and refine the localization of small objects and fuzzy boundaries. To further improve detection accuracy and generalization, the DS (Double Self-Knowledge Distillation) strategy performs two rounds of self-knowledge distillation across the entire network. Experimental results on the custom Weed dataset, which contains 9257 images spanning eight weed categories, show that MIE-YOLO improves the F1 score by 1.9% and mAP by 2.0%. Furthermore, it reduces parameters by 29.9%, FLOPs by 6.9%, and model size by 17.0%, achieving a runtime speed of 66.2 FPS. MIE-YOLO thus improves weed detection performance while maintaining competitive inference efficiency, providing an effective technical path and an engineering reference for intelligent field inspection and precise weed control in precision agriculture. The source code is available on GitHub. Full article
(This article belongs to the Special Issue Integrating AI and Robotics for Precision Weed Control in Agriculture)
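The additive fusion with gating that the abstract attributes to the Add-CGLU block can be sketched as below. This is a simplified NumPy illustration: `value` and `gate` stand in for the block's two convolutional branches, and `additive_gated_fusion` is a hypothetical name, not the authors' code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def additive_gated_fusion(x, value, gate):
    """Additive gated-linear-unit style fusion (sketch).

    The gate branch decides, element-wise, how much of the value branch
    is passed through; the gated product is added back to the input
    (additive fusion), so features are modulated rather than replaced.
    """
    return x + value * sigmoid(gate)
```

With the gate driven strongly negative the block reduces to an identity pass-through, and with the gate driven strongly positive the full value branch is added; this selective pass/suppress behavior is what lets a gated pyramid network control multi-scale information flow.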
