Search Results (708)

Search Parameters:
Keywords = distillation learning

43 pages, 1250 KB  
Review
Challenges and Opportunities in Tomato Leaf Disease Detection with Limited and Multimodal Data: A Review
by Yingbiao Hu, Huinian Li, Chengcheng Yang, Ningxia Chen, Zhenfu Pan and Wei Ke
Mathematics 2026, 14(3), 422; https://doi.org/10.3390/math14030422 - 26 Jan 2026
Abstract
Tomato leaf diseases cause substantial yield and quality losses worldwide, yet reliable detection in real fields remains challenging. Two practical bottlenecks dominate current research: (i) limited data, including small samples for rare diseases, class imbalance, and noisy field images, and (ii) multimodal heterogeneity, where RGB images, textual symptom descriptions, spectral cues, and optional molecular assays provide complementary but hard-to-align evidence. This review summarizes recent advances in tomato leaf disease detection under these constraints. We first formalize the problem settings of limited and multimodal data and analyze their impacts on model generalization. We then survey representative solutions for limited data (transfer learning, data augmentation, few-/zero-shot learning, self-supervised learning, and knowledge distillation) and multimodal fusion (feature-, decision-, and hybrid-level strategies, with attention-based alignment). Typical model–dataset pairs are compared, with emphasis on cross-domain robustness and deployment cost. Finally, we outline open challenges—including weak generalization in complex field environments, limited interpretability of multimodal models, and the absence of unified multimodal benchmarks—and discuss future opportunities toward lightweight, edge-ready, and scalable multimodal systems for precision agriculture.
(This article belongs to the Special Issue Structural Networks for Image Application)
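As a rough illustration of the feature-level multimodal fusion strategies this review surveys, the sketch below combines an image embedding and a symptom-text embedding with a learned gate. It is a minimal example under assumed dimensions and module names, not code from any surveyed paper.

```python
import torch
import torch.nn as nn

class GatedFeatureFusion(nn.Module):
    """Feature-level fusion of an image embedding and a text embedding.

    A learned gate decides, per dimension, how much to trust each modality
    before the fused vector is passed to a disease classifier.
    Illustrative sketch only; dimensions are arbitrary assumptions.
    """
    def __init__(self, img_dim=512, txt_dim=256, fused_dim=256, num_classes=10):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, fused_dim)   # align image features
        self.txt_proj = nn.Linear(txt_dim, fused_dim)   # align text features
        self.gate = nn.Sequential(nn.Linear(2 * fused_dim, fused_dim), nn.Sigmoid())
        self.classifier = nn.Linear(fused_dim, num_classes)

    def forward(self, img_feat, txt_feat):
        zi = self.img_proj(img_feat)
        zt = self.txt_proj(txt_feat)
        g = self.gate(torch.cat([zi, zt], dim=-1))      # per-dimension weights in (0, 1)
        fused = g * zi + (1.0 - g) * zt                 # convex combination of modalities
        return self.classifier(fused)

# Toy usage with random embeddings standing in for backbone outputs.
model = GatedFeatureFusion()
logits = model(torch.randn(4, 512), torch.randn(4, 256))
print(logits.shape)  # torch.Size([4, 10])
```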

20 pages, 4006 KB  
Article
Deformable Pyramid Sparse Transformer for Semi-Supervised Driver Distraction Detection
by Qiang Zhao, Zhichao Yu, Jiahui Yu, Simon James Fong, Yuchu Lin, Rui Wang and Weiwei Lin
Sensors 2026, 26(3), 803; https://doi.org/10.3390/s26030803 - 25 Jan 2026
Abstract
Ensuring sustained driver attention is critical for intelligent transportation safety systems; however, the performance of data-driven driver distraction detection models is often limited by the high cost of large-scale manual annotation. To address this challenge, this paper proposes an adaptive semi-supervised driver distraction detection framework based on teacher–student learning and deformable pyramid feature fusion. The framework leverages a limited amount of labeled data together with abundant unlabeled samples to achieve robust and scalable distraction detection. An adaptive pseudo-label optimization strategy is introduced, incorporating category-aware pseudo-label thresholding, delayed pseudo-label scheduling, and a confidence-weighted pseudo-label loss to dynamically balance pseudo-label quality and training stability. To enhance fine-grained perception of subtle driver behaviors, a Deformable Pyramid Sparse Transformer (DPST) module is integrated into a lightweight YOLOv11 detector, enabling precise multi-scale feature alignment and efficient cross-scale semantic fusion. Furthermore, a teacher-guided feature consistency distillation mechanism is employed to promote semantic alignment between teacher and student models at the feature level, mitigating the adverse effects of noisy pseudo-labels. Extensive experiments conducted on the Roboflow Distracted Driving Dataset demonstrate that the proposed method outperforms representative fully supervised baselines in terms of mAP@0.5 and mAP@0.5:0.95 while maintaining a balanced trade-off between precision and recall. These results indicate that the proposed framework provides an effective and practical solution for real-world driver monitoring systems under limited annotation conditions.
(This article belongs to the Section Vehicular Sensing)
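A minimal sketch of the category-aware, confidence-weighted pseudo-label idea described above (the thresholds, shapes, and weighting scheme are assumptions, not the authors' implementation):

```python
import torch
import torch.nn.functional as F

def pseudo_label_loss(student_logits, teacher_logits, class_thresholds):
    """Confidence-weighted pseudo-label loss with per-class thresholds.

    teacher_logits: teacher predictions on unlabeled images.
    class_thresholds: one confidence threshold per class (category-aware,
    so rare behaviors can use a lower bar). Illustrative sketch only.
    """
    with torch.no_grad():
        probs = F.softmax(teacher_logits, dim=-1)
        conf, pseudo = probs.max(dim=-1)                 # confidence and hard pseudo-label
        keep = conf >= class_thresholds[pseudo]          # category-aware filtering

    if keep.sum() == 0:
        return student_logits.sum() * 0.0                # nothing confident this step

    ce = F.cross_entropy(student_logits[keep], pseudo[keep], reduction="none")
    weights = conf[keep]                                 # confidence-weighted contribution
    return (weights * ce).sum() / weights.sum()

# Toy usage: 8 unlabeled samples, 5 behavior classes.
s = torch.randn(8, 5, requires_grad=True)
t = torch.randn(8, 5)
thr = torch.full((5,), 0.2)
pseudo_label_loss(s, t, thr).backward()
```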
55 pages, 3083 KB  
Review
A Survey on Green Wireless Sensing: Energy-Efficient Sensing via WiFi CSI and Lightweight Learning
by Rod Koo, Xihao Liang, Deepak Mishra and Aruna Seneviratne
Energies 2026, 19(2), 573; https://doi.org/10.3390/en19020573 - 22 Jan 2026
Abstract
Conventional sensing expends energy at three stages: powering dedicated sensors, transmitting measurements, and executing computationally intensive inference. Wireless sensing re-purposes WiFi channel state information (CSI) inherent in every packet, eliminating extra sensors and uplink traffic, though reliance on deep neural networks (DNNs), often trained and run on graphics processing units (GPUs), can negate these gains. This review highlights two core energy efficiency levers in CSI-based wireless sensing. First, ambient CSI harvesting cuts power use by an order of magnitude compared to radar and active Internet of Things (IoT) sensors. Second, integrated sensing and communication (ISAC) embeds sensing functionality into existing WiFi links, thereby reducing device count, battery waste, and carbon impact. We review conventional handcrafted and accuracy-first methods to set the stage for surveying green learning strategies and lightweight learning techniques, including compact hybrid neural architectures, pruning, knowledge distillation, quantisation, and semi-supervised training that preserve accuracy while reducing model size and memory footprint. We also discuss hardware co-design from low-power microcontrollers to edge application-specific integrated circuits (ASICs) and WiFi firmware extensions that align computation with platform constraints. Finally, we identify open challenges in domain-robust compression, multi-antenna calibration, energy-proportionate model scaling, and standardised joules-per-inference metrics. Our aim is a practical, battery-friendly wireless sensing stack ready for smart-home and 6G-era deployments.
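The joules-per-inference metric the survey calls for is simply average device power multiplied by wall-clock latency, amortised over the batch. A back-of-the-envelope calculation (the power and latency figures are made-up placeholders, not measurements from the paper):

```python
def joules_per_inference(avg_power_watts: float, latency_s: float, batch_size: int = 1) -> float:
    """Energy per single inference: average power times latency, divided by
    the batch size it was amortised over. Placeholder numbers only."""
    return avg_power_watts * latency_s / batch_size

# Hypothetical comparison: a GPU-hosted model vs. a microcontroller CSI model.
gpu_j = joules_per_inference(avg_power_watts=250.0, latency_s=0.004, batch_size=32)
mcu_j = joules_per_inference(avg_power_watts=0.3, latency_s=0.050, batch_size=1)
print(f"GPU: {gpu_j * 1000:.2f} mJ/inference")
print(f"MCU: {mcu_j * 1000:.2f} mJ/inference")
```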

17 pages, 7858 KB  
Article
Sensor-Drift Compensation in Electronic-Nose-Based Gas Recognition Using Knowledge Distillation
by Juntao Lin and Xianghao Zhan
Informatics 2026, 13(1), 15; https://doi.org/10.3390/informatics13010015 - 20 Jan 2026
Abstract
Environmental changes and sensor aging can cause sensor drift in sensor array responses (i.e., a shift in the measured signal/feature distribution over time), which in turn degrades gas classification performance in real-world deployments of electronic-nose systems. Previous studies using the UCI Gas Sensor Array Drift Dataset as a benchmark reported promising drift compensation results but often lacked robust statistical validation and may overcompensate for drift by suppressing class-discriminative variance. To address these limitations and rigorously evaluate improvements in sensor-drift compensation, we designed two domain adaptation tasks based on the UCI electronic-nose dataset: (1) using the first batch to predict remaining batches, simulating a controlled laboratory setting, and (2) using Batches 1 through n−1 to predict Batch n, simulating continuous training data updates for online training. Then, we systematically tested three methods—our semi-supervised knowledge distillation method (KD) for sensor-drift compensation; a previously benchmarked method, Domain-Regularized Component Analysis (DRCA); and a hybrid method, KD–DRCA—across 30 random test-set partitions on the UCI dataset. We showed that semi-supervised KD consistently outperformed both DRCA and KD–DRCA, achieving up to 18% and 15% relative improvements in accuracy and F1-score, respectively, over the baseline, proving KD’s superior effectiveness in electronic-nose drift compensation. This work provides a rigorous statistical validation of KD for electronic-nose drift compensation under long-term temporal drift, with repeated randomized evaluation and significance testing, and demonstrates consistent improvements over DRCA on the UCI drift benchmark.
(This article belongs to the Section Machine Learning)
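A generic sketch of the semi-supervised soft-label distillation loss underlying this kind of drift compensation (temperature, weighting, and class count are assumptions, not the authors' exact formulation):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """Soft-label KD loss: KL divergence between temperature-softened
    teacher and student distributions, scaled by T^2 as in Hinton et al."""
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)

# Semi-supervised use: the labeled source batch gets cross-entropy,
# the unlabeled drifted batch is supervised only by the teacher's soft labels.
student_src, labels = torch.randn(16, 6, requires_grad=True), torch.randint(0, 6, (16,))
student_tgt = torch.randn(16, 6, requires_grad=True)
teacher_tgt = torch.randn(16, 6)                      # teacher run on the drifted batch
loss = F.cross_entropy(student_src, labels) + distillation_loss(student_tgt, teacher_tgt)
loss.backward()
```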

18 pages, 3705 KB  
Article
Cross-Platform Multi-Modal Transfer Learning Framework for Cyberbullying Detection
by Weiqi Zhang, Chengzu Dong, Aiting Yao, Asef Nazari and Anuroop Gaddam
Electronics 2026, 15(2), 442; https://doi.org/10.3390/electronics15020442 - 20 Jan 2026
Abstract
Cyberbullying and hate speech increasingly appear in multi-modal social media posts, where images and text are combined in diverse and fast-changing ways across platforms. These posts differ in style, vocabulary, and layout, and labeled data are sparse and noisy, which makes it difficult to train detectors that are both reliable and deployable under tight computational budgets. Many high-performing systems rely on large vision–language backbones, full-parameter fine-tuning, online retrieval, or model ensembles, which raises training and inference costs. We present a parameter-efficient cross-platform multi-modal transfer learning framework for cyberbullying and hateful content detection. Our framework has three components. First, we perform domain-adaptive pretraining of a compact ViLT backbone on in-domain image–text corpora. Second, we apply parameter-efficient fine-tuning that updates only bias terms, a small subset of LayerNorm parameters, and the classification head, leaving the inference computation graph unchanged. Third, we use noise-aware knowledge distillation from a stronger teacher built from pretrained text and CLIP-based image–text encoders, where only high-confidence, temperature-scaled predictions are used as soft labels during training, and teacher models and any retrieval components are used only offline. We evaluate primarily on Hateful Memes and use IMDB as an auxiliary text-only benchmark to show that the deployment-aware PEFT + offline-KD recipe can still be applied when other modalities are unavailable. On Hateful Memes, our student updates only 0.11% of parameters and retains about 96% of the AUROC of full fine-tuning.
(This article belongs to the Special Issue Data Privacy and Protection in IoT Systems)
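The bias/LayerNorm/head-only fine-tuning recipe described above can be sketched generically in PyTorch as follows (the toy model and module choices are assumptions; actual ViLT parameter names and structure differ):

```python
import torch.nn as nn

def mark_trainable(model: nn.Module, head: nn.Module) -> None:
    """Freeze everything except bias terms, LayerNorm parameters, and the
    classification head; the inference graph is left unchanged."""
    for p in model.parameters():               # freeze everything first
        p.requires_grad = False
    for name, p in model.named_parameters():   # unfreeze all bias terms
        if name.endswith("bias"):
            p.requires_grad = True
    for m in model.modules():                  # unfreeze LayerNorm weights/biases
        if isinstance(m, nn.LayerNorm):
            for p in m.parameters():
                p.requires_grad = True
    for p in head.parameters():                # unfreeze the classification head
        p.requires_grad = True

# Toy stand-in for a multimodal encoder followed by a classification head.
model = nn.Sequential(nn.Linear(768, 768), nn.GELU(), nn.LayerNorm(768), nn.Linear(768, 2))
mark_trainable(model, head=model[3])
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {100.0 * trainable / total:.2f}% of parameters")
```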

35 pages, 3598 KB  
Article
PlanetScope Imagery and Hybrid AI Framework for Freshwater Lake Phosphorus Monitoring and Water Quality Management
by Ying Deng, Daiwei Pan, Simon X. Yang and Bahram Gharabaghi
Water 2026, 18(2), 261; https://doi.org/10.3390/w18020261 - 19 Jan 2026
Abstract
Accurate estimation of Total Phosphorus, referred to as “Phosphorus, Total” (PPUT; µg/L) in the sourced monitoring data, is essential for understanding eutrophication dynamics and guiding water-quality management in inland lakes. However, lake-wide PPUT mapping at high resolution is challenging to achieve using conventional in-situ sampling, and nearshore gradients are often poorly resolved by medium- or low-resolution satellite sensors. This study exploits multi-generation PlanetScope imagery (Dove Classic, Dove-R, and SuperDove; 3–5 m, near-daily revisit) to develop a hybrid AI framework for PPUT retrieval in Lake Simcoe, Ontario, Canada. PlanetScope surface reflectance, short-term meteorological descriptors (3- to 7-day aggregates of air temperature, wind speed, precipitation, and sea-level pressure), and in-situ Secchi depth (SSD) were used to train five ensemble-learning models (HistGradientBoosting, CatBoost, RandomForest, ExtraTrees, and GradientBoosting) across eight feature-group regimes that progressively extend from bands-only, to combinations with spectral indices and day-of-year (DOY), and finally to SSD-inclusive full-feature configurations. The inclusion of SSD led to a strong and systematic performance gain, with mean R2 increasing from about 0.67 (SSD-free) to 0.94 (SSD-aware), confirming that vertically integrated optical clarity is the dominant constraint on PPUT retrieval and cannot be reconstructed from surface reflectance alone. To enable scalable SSD-free monitoring, a knowledge-distillation strategy was implemented in which an SSD-aware teacher transfers its learned representation to a student using only satellite and meteorological inputs. The optimal student model, based on a compact subset of 40 predictors, achieved R2 = 0.83, RMSE = 9.82 µg/L, and MAE = 5.41 µg/L, retaining approximately 88% of the teacher’s explanatory power. Application of the student model to PlanetScope scenes from 2020 to 2025 produces meter-scale PPUT maps; a 26 July 2024 case study shows that >97% of the lake surface remains below 10 µg/L, while rare (<1%) but coherent hotspots above 20 µg/L align with tributary mouths and narrow channels. The results demonstrate that combining commercial high-resolution imagery with physics-informed feature engineering and knowledge transfer enables scalable and operationally relevant monitoring of lake phosphorus dynamics. These high-resolution PPUT maps enable lake managers to identify nearshore nutrient hotspots and tributary plume structures. In doing so, the proposed framework supports targeted field sampling, early warning for eutrophication events, and more robust, lake-wide nutrient budgeting.
(This article belongs to the Section Water Quality and Contamination)
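The teacher-to-student transfer described above is a regression form of distillation: the SSD-aware teacher's predictions become the training target for an SSD-free student. A schematic scikit-learn version of that privileged-information setup (feature layout, model choices, and data are placeholders, not the paper's pipeline):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Placeholder design matrix: satellite/meteorological predictors plus one
# Secchi-depth (SSD) column that is unavailable at prediction time.
n, n_sat_feats = 500, 12
X_sat = rng.normal(size=(n, n_sat_feats))
ssd = rng.normal(size=(n, 1))
y_pput = 10 + 3 * ssd[:, 0] + X_sat[:, 0] + rng.normal(scale=0.5, size=n)  # synthetic TP

# Teacher sees satellite features + SSD (the "privileged" clarity signal).
teacher = GradientBoostingRegressor().fit(np.hstack([X_sat, ssd]), y_pput)

# Student sees only satellite features and is fitted to the teacher's outputs,
# so it can be applied to any PlanetScope scene without in-situ SSD.
soft_targets = teacher.predict(np.hstack([X_sat, ssd]))
student = GradientBoostingRegressor().fit(X_sat, soft_targets)

pput_map = student.predict(X_sat)   # stands in for per-pixel PPUT retrieval
print(f"student vs. teacher MAE: {np.mean(np.abs(pput_map - soft_targets)):.3f} ug/L")
```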

20 pages, 390 KB  
Systematic Review
Systematic Review of Quantization-Optimized Lightweight Transformer Architectures for Real-Time Fruit Ripeness Detection on Edge Devices
by Donny Maulana and R Kanesaraj Ramasamy
Computers 2026, 15(1), 69; https://doi.org/10.3390/computers15010069 - 19 Jan 2026
Abstract
Real-time visual inference on resource-constrained hardware remains a core challenge for edge computing and embedded artificial intelligence systems. Recent deep learning architectures, particularly Vision Transformers (ViTs) and Detection Transformers (DETRs), achieve high detection accuracy but impose substantial computational and memory demands that limit their deployment on low-power edge platforms such as NVIDIA Jetson and Raspberry Pi devices. This paper presents a systematic review of model compression and optimization strategies—specifically quantization, pruning, and knowledge distillation—applied to lightweight object detection architectures for edge deployment. Following PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, peer-reviewed studies were analyzed from Scopus, IEEE Xplore, and ScienceDirect to examine the evolution of efficient detectors from convolutional neural networks to transformer-based models. The synthesis highlights a growing focus on real-time transformer variants, including Real-Time DETR (RT-DETR) and low-bit quantized approaches such as Q-DETR, alongside optimized YOLO-based architectures. While quantization enables substantial theoretical acceleration (e.g., up to 16× operation reduction), aggressive low-bit precision introduces accuracy degradation, particularly in transformer attention mechanisms, highlighting a critical efficiency–accuracy tradeoff. The review further shows that Quantization-Aware Training (QAT) consistently outperforms Post-Training Quantization (PTQ) in preserving performance under low-precision constraints. Finally, this review identifies critical open research challenges, emphasizing the efficiency–accuracy tradeoff and the high computational demands imposed by transformer architectures. Future directions are proposed, including hardware-aware optimization, robustness to imbalanced datasets, and multimodal sensing integration, to ensure reliable real-time inference in practical agricultural edge computing environments.
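QAT, which the review finds superior to PTQ, inserts simulated (fake) quantization into training so weights adapt to low precision. A minimal fake-quantization op with a straight-through estimator, shown as a textbook construction rather than code from any reviewed detector:

```python
import torch

def fake_quantize(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Simulate symmetric uniform quantization in the forward pass while
    letting gradients pass through unchanged (straight-through estimator)."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = x.detach().abs().max().clamp(min=1e-8) / qmax
    x_q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale
    # Forward uses x_q; backward sees identity because (x_q - x) is detached.
    return x + (x_q - x).detach()

# QAT-style use inside a training step: quantize weights before the matmul.
w = torch.randn(64, 32, requires_grad=True)
x = torch.randn(8, 32)
out = x @ fake_quantize(w, num_bits=4).t()
out.sum().backward()
print(w.grad.shape)   # gradients flow to the full-precision weights
```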

21 pages, 11032 KB  
Article
Scale Calibration and Pressure-Driven Knowledge Distillation for Image Classification
by Jing Xie, Penghui Guan, Han Li, Chunhua Tang, Li Wang and Yingcheng Lin
Symmetry 2026, 18(1), 177; https://doi.org/10.3390/sym18010177 - 18 Jan 2026
Abstract
Knowledge distillation achieves model compression by training a lightweight student network to mimic the output distribution of a larger teacher network. However, when the teacher becomes overconfident, its sharply peaked logits break the scale symmetry of supervision and induce high-variance gradients, leading to unstable optimization. Meanwhile, research that focuses only on final-logit alignment often fails to utilize intermediate semantic structure effectively. This causes weak discrimination of student representations, especially under class imbalance. To address these issues, we propose Scale Calibration and Pressure-Driven Knowledge Distillation (SPKD): a one-stage framework comprising two lightweight, complementary mechanisms. First, a dynamic scale calibration module normalizes the teacher’s logits to a consistent magnitude, reducing gradient variance. Second, an adaptive pressure-driven mechanism refines student learning by preventing feature collapse and promoting intra-class compactness and inter-class separability. Extensive experiments on CIFAR-100 and ImageNet demonstrate that SPKD achieves superior performance to distillation baselines across various teacher–student combinations. For example, SPKD achieves a score of 74.84% on CIFAR-100 for the homogeneous architecture VGG13–VGG8. Additional evidence from logit norm and gradient variance statistics, as well as representation analyses, confirms that SPKD stabilizes optimization while learning more discriminative and well-structured features.
(This article belongs to the Section Computer)
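One reasonable instantiation of the scale-calibration idea, normalizing each teacher logit vector to a fixed magnitude before softening so an overconfident teacher cannot produce arbitrarily sharp soft labels, is sketched below. The specific normalization and constants are assumptions, not SPKD's actual module.

```python
import torch
import torch.nn.functional as F

def calibrate_logits(logits: torch.Tensor, target_norm: float = 3.0) -> torch.Tensor:
    """Rescale each sample's centered logit vector to a fixed L2 norm,
    bounding how peaked the resulting soft labels can be."""
    centered = logits - logits.mean(dim=-1, keepdim=True)
    return target_norm * centered / centered.norm(dim=-1, keepdim=True).clamp(min=1e-8)

def kd_loss(student_logits, teacher_logits, T=2.0):
    p_t = F.softmax(calibrate_logits(teacher_logits) / T, dim=-1)
    log_p_s = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(log_p_s, p_t, reduction="batchmean") * T * T

# An overconfident teacher (large-magnitude logits) distilling into a fresh student.
teacher = 20.0 * torch.randn(8, 100)
student = torch.randn(8, 100, requires_grad=True)
kd_loss(student, teacher).backward()
```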

28 pages, 9150 KB  
Article
PhysGraphIR: Adaptive Physics-Informed Graph Learning for Infrared Thermal Field Prediction in Meter Boxes with Residual Sampling and Knowledge Distillation
by Hao Li, Siwei Li, Xiuli Yu and Xinze He
Electronics 2026, 15(2), 410; https://doi.org/10.3390/electronics15020410 - 16 Jan 2026
Abstract
Infrared thermal field (ITF) prediction for meter boxes is crucial for the early warning of power system faults, yet this task faces three major challenges: data sparsity, complex geometry, and resource constraints in edge computing. Existing physics-informed neural network–graph neural network (PINN-GNN) approaches suffer from redundant physics residual calculations (over 70% of flat regions contain little information) and poor model generalization (requiring retraining for new box types), making them inefficient for deployment on edge devices. This paper proposes the PhysGraphIR framework, which employs an Adaptive Residual Sampling (ARS) mechanism to dynamically identify hotspot region nodes through a physics-aware gating network, calculating physics residuals only at critical nodes to reduce computational overhead by over 80%. In this study, a ‘hotspot region’ is explicitly defined as a localized area exhibiting significant temperature elevation relative to the background—typically concentrated around electrical connection terminals or wire entrances—which is critical for identifying potential thermal faults under sparse data conditions. Additionally, it utilizes a Physics Knowledge Distillation Graph Neural Network (Physics-KD GNN) to decouple physics learning from geometric learning, transferring universal heat conduction knowledge to specific meter box geometries through a teacher–student architecture. Experimental results demonstrate that on both synthetic and real-world meter box datasets, PhysGraphIR achieves a hotspot region mean absolute error (MAE) of 11.8 °C under 60% infrared data missing conditions, representing a 22% improvement over traditional PINN-GNN. The training speed is accelerated by 3.1 times, requiring only five infrared samples to adapt to new box types. The experiments prove that this method significantly enhances prediction accuracy and computational efficiency under sparse infrared data while maintaining physical consistency, providing a feasible solution for edge intelligence in power systems.
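The core of adaptive residual sampling is evaluating the physics residual only at nodes a gate flags as hotspot-like, rather than at every graph node. A simplified version of that masking step for a steady-state heat-conduction residual on a graph (the gate, the graph-Laplacian residual, and the selection fraction are illustrative assumptions):

```python
import torch

def physics_residual_loss(temps, laplacian, sources, gate_scores, top_frac=0.2):
    """Steady-state heat-conduction residual |L·T - q| evaluated only on the
    nodes with the highest gate scores (assumed hotspot candidates)."""
    num_nodes = temps.shape[0]
    k = max(1, int(top_frac * num_nodes))
    idx = torch.topk(gate_scores, k).indices          # sparse node selection
    residual = laplacian @ temps - sources            # physics residual at all nodes
    return residual[idx].pow(2).mean()                # only selected nodes contribute

# Toy graph: 100 nodes, random symmetric adjacency, predicted temperature field.
n = 100
adj = (torch.rand(n, n) < 0.05).float()
adj = ((adj + adj.t()) > 0).float().fill_diagonal_(0)
laplacian = torch.diag(adj.sum(1)) - adj
temps = torch.randn(n, requires_grad=True)            # stands in for the GNN output
sources = torch.zeros(n)
gate_scores = torch.rand(n)                           # stands in for the gating network
physics_residual_loss(temps, laplacian, sources, gate_scores).backward()
```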

27 pages, 6058 KB  
Article
Hierarchical Self-Distillation with Attention for Class-Imbalanced Acoustic Event Classification in Elevators
by Shengying Yang, Lingyan Chou, He Li, Zhenyu Xu, Boyang Feng and Jingsheng Lei
Sensors 2026, 26(2), 589; https://doi.org/10.3390/s26020589 - 15 Jan 2026
Abstract
Acoustic-based anomaly detection in elevators is crucial for predictive maintenance and operational safety, yet it faces significant challenges in real-world settings, including pervasive multi-source acoustic interference within confined spaces and severe class imbalance in collected data, which severely degrades the detection performance for minority yet critical acoustic events. To address these issues, this study proposes a novel hierarchical self-distillation framework. The method embeds auxiliary classifiers into the intermediate layers of a backbone network, creating a deep teacher–shallow student knowledge transfer paradigm optimized jointly via Kullback–Leibler divergence and feature alignment losses. A self-attentive temporal pooling layer is introduced to adaptively weigh discriminative time-frequency features, thereby mitigating temporal overlap interference, while a focal loss function is employed specifically in the teacher model to recalibrate the learning focus towards hard-to-classify minority samples. Extensive evaluations on the public UrbanSound8K dataset and a proprietary industrial elevator audio dataset demonstrate that the proposed model achieves superior performance, exceeding 90% in both accuracy and F1-score. Notably, it yields substantial improvements in recognizing rare events, validating its robustness for elevator acoustic monitoring.
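The focal loss used in the teacher branch down-weights easy majority-class examples so rare acoustic events dominate the gradient. The standard formulation (Lin et al.), shown here as a generic sketch rather than the paper's code:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Focal loss for class-imbalanced classification: cross-entropy scaled
    by (1 - p_t)^gamma so well-classified (easy) samples contribute little."""
    ce = F.cross_entropy(logits, targets, reduction="none")
    p_t = torch.exp(-ce)                       # probability of the true class
    return (alpha * (1.0 - p_t) ** gamma * ce).mean()

# Toy imbalanced batch: class 0 is common, class 3 (a rare fault sound) is not.
logits = torch.randn(32, 4, requires_grad=True)
targets = torch.cat([torch.zeros(30, dtype=torch.long), torch.tensor([3, 3])])
focal_loss(logits, targets).backward()
```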

20 pages, 5073 KB  
Article
SAWGAN-BDCMA: A Self-Attention Wasserstein GAN and Bidirectional Cross-Modal Attention Framework for Multimodal Emotion Recognition
by Ning Zhang, Shiwei Su, Haozhe Zhang, Hantong Yang, Runfang Hao and Kun Yang
Sensors 2026, 26(2), 582; https://doi.org/10.3390/s26020582 - 15 Jan 2026
Abstract
Emotion recognition from physiological signals is pivotal for advancing human–computer interaction, yet unimodal pipelines frequently underperform due to limited information, constrained data diversity, and suboptimal cross-modal fusion. Addressing these limitations, the Self-Attention Wasserstein Generative Adversarial Network with Bidirectional Cross-Modal Attention (SAWGAN-BDCMA) framework is proposed. This framework reorganizes the learning process around three complementary components: (1) a Self-Attention Wasserstein GAN (SAWGAN) that synthesizes high-quality Electroencephalography (EEG) and Photoplethysmography (PPG) to expand diversity and alleviate distributional imbalance; (2) a dual-branch architecture that distills discriminative spatiotemporal representations within each modality; and (3) a Bidirectional Cross-Modal Attention (BDCMA) mechanism that enables deep two-way interaction and adaptive weighting for robust fusion. Evaluated on the DEAP and ECSMP datasets, SAWGAN-BDCMA significantly outperforms multiple contemporary methods, achieving 94.25% accuracy for binary and 87.93% for quaternary classification on DEAP. Furthermore, it attains 97.49% accuracy for six-class emotion recognition on the ECSMP dataset. Compared with state-of-the-art multimodal approaches, the proposed framework achieves an accuracy improvement ranging from 0.57% to 14.01% across various tasks. These findings offer a robust solution to the long-standing challenges of data scarcity and modal imbalance, providing a profound theoretical and technical foundation for fine-grained emotion recognition and intelligent human–computer collaboration.
(This article belongs to the Special Issue Advanced Signal Processing for Affective Computing)
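Bidirectional cross-modal attention lets each modality query the other before fusion. A compact sketch using torch.nn.MultiheadAttention (dimensions, pooling, and class count are assumptions; this is not the authors' architecture):

```python
import torch
import torch.nn as nn

class BidirectionalCrossModalAttention(nn.Module):
    """EEG features attend to PPG features and vice versa; the two attended
    streams are then concatenated for classification. Illustrative only."""
    def __init__(self, dim=64, heads=4, num_classes=4):
        super().__init__()
        self.eeg_to_ppg = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ppg_to_eeg = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.classifier = nn.Linear(2 * dim, num_classes)

    def forward(self, eeg, ppg):
        # Query with one modality, attend over the other (both directions).
        eeg_att, _ = self.eeg_to_ppg(query=eeg, key=ppg, value=ppg)
        ppg_att, _ = self.ppg_to_eeg(query=ppg, key=eeg, value=eeg)
        fused = torch.cat([eeg_att.mean(dim=1), ppg_att.mean(dim=1)], dim=-1)
        return self.classifier(fused)

# Toy sequences: batch of 8, 128 EEG frames and 64 PPG frames, 64-d features.
model = BidirectionalCrossModalAttention()
out = model(torch.randn(8, 128, 64), torch.randn(8, 64, 64))
print(out.shape)   # torch.Size([8, 4])
```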

24 pages, 3471 KB  
Article
Transformable Quadruped Wheelchair: Unified Walking and Wheeled Locomotion via Mode-Conditioned Policy Distillation
by Atsuki Akamisaka and Katashi Nagao
Sensors 2026, 26(2), 566; https://doi.org/10.3390/s26020566 - 14 Jan 2026
Abstract
In recent years, while progress has been made in barrier-free design, the complete elimination of physical barriers such as uneven road surfaces and stairs remains difficult, and wheelchair passengers continue to face significant mobility constraints. This study aims to verify the effectiveness of a transformable quadruped wheelchair that can switch between two modes of movement: walking and wheeled travel. Specifically, reinforcement learning using Proximal Policy Optimization (PPO) was used to acquire walking strategies for uneven terrain and wheeled travel strategies for flat terrain. NVIDIA Isaac Sim was used for simulation. To evaluate the stability of both modes, we performed a frequency analysis of the passenger’s acceleration data. As a result, we observed periodic vibrations around 2 Hz in the vertical direction in walking mode, while in wheeled mode, we confirmed extremely small vibrations and stable running. Furthermore, we distilled these two strategies into a single mode-conditioned strategy and conducted long-distance running experiments involving mode transformation. The results demonstrated that by adaptively switching between walking and wheeled modes depending on the terrain, mobility efficiency was significantly improved compared to continuous operation in a single mode. This study demonstrates the effectiveness of an approach that involves learning multiple specialized strategies and switching between them as needed to efficiently traverse diverse environments using a transformable robot.
(This article belongs to the Section Wearables)
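Distilling a walking policy and a wheeled policy into one mode-conditioned policy can be framed as behavior cloning with a mode flag appended to the observation. A schematic version (observation sizes, networks, and data are placeholders; the paper trains its expert policies with PPO in Isaac Sim):

```python
import torch
import torch.nn as nn

obs_dim, act_dim = 48, 12                     # placeholder sizes

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 128), nn.Tanh(), nn.Linear(128, out_dim))

walk_expert, wheel_expert = mlp(obs_dim, act_dim), mlp(obs_dim, act_dim)  # stand-in teachers
student = mlp(obs_dim + 1, act_dim)           # +1 input: mode flag (0 = walk, 1 = wheel)
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

for step in range(100):                       # distillation by behavior cloning
    obs = torch.randn(256, obs_dim)           # stands in for states sampled in simulation
    mode = torch.randint(0, 2, (256, 1)).float()
    with torch.no_grad():                     # teacher action for the active mode
        target = torch.where(mode.bool(), wheel_expert(obs), walk_expert(obs))
    loss = nn.functional.mse_loss(student(torch.cat([obs, mode], dim=-1)), target)
    opt.zero_grad(); loss.backward(); opt.step()
```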

17 pages, 710 KB  
Article
KD-SecBERT: A Knowledge-Distilled Bidirectional Encoder Optimized for Open-Source Software Supply Chain Security in Smart Grid Applications
by Qinman Li, Xixiang Zhang, Weiming Liao, Tao Dai, Hongliang Zheng, Beiya Yang and Pengfei Wang
Electronics 2026, 15(2), 345; https://doi.org/10.3390/electronics15020345 - 13 Jan 2026
Abstract
With the acceleration of digital transformation, open-source software has become a fundamental component of modern smart grids and other critical infrastructures. However, the complex dependency structures of open-source ecosystems and the continuous emergence of vulnerabilities pose substantial challenges to software supply chain security. In power information networks and cyber–physical control systems, vulnerabilities in open-source components integrated into Supervisory Control and Data Acquisition (SCADA), Energy Management System (EMS), and Distribution Management System (DMS) platforms and distributed energy controllers may propagate along the supply chain, threatening system security and operational stability. In such application scenarios, large language models (LLMs) often suffer from limited semantic accuracy when handling domain-specific security terminology, as well as deployment inefficiencies that hinder their practical adoption in critical infrastructure environments. To address these issues, this paper proposes KD-SecBERT, a domain-specific semantic bidirectional encoder optimized through multi-level knowledge distillation for open-source software supply chain security in smart grid applications. The proposed framework constructs a hierarchical multi-teacher ensemble that integrates general language understanding, cybersecurity-domain knowledge, and code semantic analysis, together with a lightweight student architecture based on depthwise separable convolutions and multi-head self-attention. In addition, a dynamic, multi-dimensional distillation strategy is introduced to jointly perform layer-wise representation alignment, ensemble knowledge fusion, and task-oriented optimization under a progressive curriculum learning scheme. Extensive experiments conducted on a multi-source dataset comprising National Vulnerability Database (NVD) and Common Vulnerabilities and Exposures (CVE) entries, security-related GitHub code, and Open Web Application Security Project (OWASP) test cases show that KD-SecBERT achieves an accuracy of 91.3%, a recall of 90.6%, and an F1-score of 89.2% on vulnerability classification tasks, indicating strong robustness in recognizing both common and low-frequency security semantics. These results demonstrate that KD-SecBERT provides an effective and practical solution for semantic analysis and software supply chain risk assessment in smart grids and other critical-infrastructure environments.
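A multi-teacher ensemble can be distilled by fusing the teachers' temperature-softened distributions into a single weighted soft target. A generic sketch of that fusion step (the fixed weights and temperature are assumptions, not the paper's dynamic distillation strategy):

```python
import torch
import torch.nn.functional as F

def multi_teacher_soft_targets(teacher_logits_list, weights, T=2.0):
    """Fuse several teachers (e.g., general-language, security-domain, and
    code-analysis models) into one weighted soft-label distribution."""
    weights = torch.tensor(weights) / sum(weights)
    probs = [w * F.softmax(t / T, dim=-1) for w, t in zip(weights, teacher_logits_list)]
    return torch.stack(probs).sum(dim=0)

def ensemble_kd_loss(student_logits, teacher_logits_list, weights, T=2.0):
    targets = multi_teacher_soft_targets(teacher_logits_list, weights, T)
    return F.kl_div(F.log_softmax(student_logits / T, dim=-1), targets,
                    reduction="batchmean") * T * T

# Three teachers scoring 16 samples over 5 vulnerability classes.
teachers = [torch.randn(16, 5) for _ in range(3)]
student = torch.randn(16, 5, requires_grad=True)
ensemble_kd_loss(student, teachers, weights=[0.4, 0.4, 0.2]).backward()
```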

15 pages, 1363 KB  
Article
Hierarchical Knowledge Distillation for Efficient Model Compression and Transfer: A Multi-Level Aggregation Approach
by Titinunt Kitrungrotsakul and Preeyanuch Srichola
Information 2026, 17(1), 70; https://doi.org/10.3390/info17010070 - 12 Jan 2026
Abstract
The success of large-scale deep learning models in remote sensing tasks has been transformative, enabling significant advances in image classification, object detection, and image–text retrieval. However, their computational and memory demands pose challenges for deployment in resource-constrained environments. Knowledge distillation (KD) alleviates these issues by transferring knowledge from a strong teacher to a student model, which can be compact for efficient deployment or architecturally matched to improve accuracy under the same inference budget. In this paper, we introduce Hierarchical Multi-Segment Knowledge Distillation (HIMS_KD), a multi-stage framework that sequentially distills knowledge from a teacher into multiple assistant models specialized in low-, mid-, and high-level representations, and then aggregates their knowledge into the final student. We integrate feature-level alignment, auxiliary similarity-logit alignment, and supervised loss during distillation. Experiments on benchmark remote sensing datasets (RSITMD and RSICD) show that HIMS_KD improves retrieval performance and enhances zero-shot classification; when a compact student is used, it also reduces deployment cost while retaining strong accuracy.
(This article belongs to the Special Issue AI-Based Image Processing and Computer Vision)

23 pages, 1579 KB  
Article
Exploring Difference Semantic Prior Guidance for Remote Sensing Image Change Captioning
by Yunpeng Li, Xiangrong Zhang, Guanchun Wang and Tianyang Zhang
Remote Sens. 2026, 18(2), 232; https://doi.org/10.3390/rs18020232 - 11 Jan 2026
Abstract
Understanding complex change scenes is a crucial challenge in the remote sensing field. The remote sensing image change captioning (RSICC) task has emerged as a promising approach to translate the changes that appear between bi-temporal remote sensing images into textual descriptions, enabling users to make accurate decisions. Current RSICC methods frequently encounter difficulties in maintaining consistency for contextual awareness and semantic prior guidance. Therefore, this study explores a difference semantic prior guidance network that reasons over context-rich sentences to capture the observed visual changes. Specifically, a context-aware difference module is introduced to guarantee the consistency of unchanged/changed context features, strengthening multi-level changed information to improve the ability of semantic change feature representation. Moreover, to effectively mine higher-level cognitive ability to reason about salient/weak changes, we employ difference comprehension with shallow change information to realize semantic change knowledge learning. In addition, the designed parallel cross-refined attention in the Transformer decoder can balance visual difference and semantic knowledge for implicit knowledge distilling, enabling fine-grained perception of changes in semantic details and reducing pseudo-changes. Compared with advanced algorithms on the LEVIR-CC and Dubai-CC datasets, experimental results validate the outstanding performance of the designed model in RSICC tasks. Notably, on the LEVIR-CC dataset, it reaches a CIDEr score of 143.34%, representing a 3.11% improvement over the most competitive SAT-cap.
