Search Results (141)

Search Parameters:
Keywords = linear decoder model

19 pages, 3447 KB  
Article
Hybrid Decoding with Co-Occurrence Awareness for Fine-Grained Food Image Segmentation
by Shenglong Wang and Guorui Sheng
Foods 2026, 15(3), 534; https://doi.org/10.3390/foods15030534 - 3 Feb 2026
Viewed by 124
Abstract
Fine-grained food image segmentation is essential for accurate dietary assessment and nutritional analysis, yet remains highly challenging due to ambiguous boundaries, inter-class similarity, and dense layouts of meals containing many different ingredients in real-world settings. Existing methods based solely on CNNs, Transformers, or Mamba architectures often fail to simultaneously preserve fine-grained local details and capture contextual dependencies over long distances. To address these limitations, we propose HDF (Hybrid Decoder for Food Image Segmentation), a novel decoding framework built upon the MambaVision backbone. Our approach first employs a convolution-based feature pyramid network (FPN) to extract multi-stage features from the encoder. These features are then thoroughly fused across scales using a Cross-Layer Mamba module that models inter-level dependencies with linear complexity. Subsequently, an Attention Refinement module integrates global semantic context through spatial–channel reweighting. Finally, a Food Co-occurrence Module explicitly enhances food-specific semantics by learning dynamic co-occurrence patterns among categories, improving segmentation of visually similar or frequently co-occurring ingredients. Evaluated on two widely used, high-quality benchmarks, FoodSeg103 and UEC-FoodPIX Complete, which are standard datasets for fine-grained food segmentation, HDF achieves a 52.25% mean Intersection-over-Union (mIoU) on FoodSeg103 and a 76.16% mIoU on UEC-FoodPIX Complete, outperforming current state-of-the-art methods by a clear margin. These results demonstrate that HDF’s hybrid design and explicit co-occurrence awareness effectively address key challenges in food image segmentation, providing a robust foundation for practical applications in dietary logging, nutritional estimation, and food safety inspection. Full article
(This article belongs to the Section Food Analytical Methods)
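The mean Intersection-over-Union (mIoU) figures quoted in the abstract above can be read against the following minimal sketch of the metric. It assumes flat lists of per-pixel class labels and skips classes absent from both maps; the benchmarks' exact evaluation protocol (ignored labels, dataset-level aggregation) is not stated in the abstract, and the function name and toy data are illustrative only.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in pred or target."""
    ious = []
    for c in range(num_classes):
        # Pixels where both maps say class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where at least one map says class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 6-pixel example (not data from the paper).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.3f}")
```

Here the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5.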

16 pages, 3327 KB  
Article
EEMD-TiDE-Based Passenger Flow Prediction for Urban Rail Transit
by Dongcai Cheng, Yuheng Zhang and Haijun Li
Electronics 2026, 15(3), 529; https://doi.org/10.3390/electronics15030529 - 26 Jan 2026
Viewed by 168
Abstract
Urban rail transit networks in developing countries are rapidly expanding, entering a networked operational phase where accurate passenger flow forecasting is crucial for optimizing vehicle scheduling, resource allocation, and transportation efficiency. In the short term, accurate real-time forecasting enables the dynamic adjustment of train headways and crew deployment, reducing average passenger waiting times during peak hours and alleviating platform overcrowding; in the long term, reliable trend predictions support strategic planning, including capacity expansion, station retrofitting, and energy management. This paper proposes a novel hybrid forecasting model, EEMD-TiDE, that combines improved Ensemble Empirical Mode Decomposition (EEMD) with a Time Series Dense Encoder (TiDE) to enhance prediction accuracy. The EEMD algorithm effectively overcomes mode mixing issues in traditional EMD by incorporating white noise perturbations, decomposing raw passenger flow data into physically meaningful Intrinsic Mode Functions (IMFs). At the same time, the TiDE model, a linear encoder–decoder architecture, efficiently handles multi-scale features and covariates without the computational overhead of self-attention mechanisms. Experimental results using Xi’an Metro passenger flow data (2017–2019) demonstrate that EEMD-TiDE significantly outperforms baseline models. This study provides a robust solution for urban rail transit passenger flow forecasting, supporting sustainable urban development. Full article
(This article belongs to the Section Computer Science & Engineering)

22 pages, 5115 KB  
Article
Intelligent Detection Method of Defects in High-Rise Building Facades Using Infrared Thermography
by Daiming Liu, Yongqiang Jin, Yuan Yang, Zhenyang Xiao, Zeming Zhao, Changling Gao and Dingcheng Zhang
Sensors 2026, 26(2), 694; https://doi.org/10.3390/s26020694 - 20 Jan 2026
Viewed by 313
Abstract
High-rise building facades are prone to defects due to prolonged exposure to complex environments. Infrared detection, as a commonly employed method for facade defect inspection, often results in low accuracy owing to abundant interferences and blurred defect boundaries. In this work, an intelligent defect detection method for high-rise building facades is proposed. In the first stage of the proposed method, a segmentation model based on DeepLabV3+ is proposed to remove interferences in infrared images using masks. The model incorporates a Post-Decoder Dual-Branch Boundary Refinement Module, which is subdivided into a boundary feature optimization branch and a boundary-guided attention branch. Sub-pixel-level contour refinement and boundary-adaptive weighting are hence achieved to mitigate edge blurring induced by thermal diffusion and to enhance the perception of slender cracks and cavity edges. A triple constraint mechanism is also introduced, combining cross-entropy, multi-scale Dice, and boundary-aware losses to address class imbalance and enhance segmentation performance for small targets. Furthermore, superpixel linear iterative clustering (SLIC) is utilized to enforce regional consistency, hence improving the smoothness and robustness of predictions. In the second stage of the proposed method, a defect detection model based on YOLOV11 is proposed to process masked infrared images for detecting hollow, seepage, cracks and detachment. This work validates the proposed method using 180 infrared images collected via unmanned aerial vehicles. The experimental results demonstrate that the proposed method achieves a detection precision of 89.7%, an mAP@0.5 of 87.9%, and an mAP@50-95 of 57.8, surpassing other algorithms and confirming its effectiveness and superiority. Full article
(This article belongs to the Section Fault Diagnosis & Sensors)
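As a rough illustration of how a cross-entropy term and a Dice term can be combined to counter class imbalance, here is a minimal single-scale, binary sketch. It is not the authors' loss: the multi-scale Dice and boundary-aware terms named in the abstract are omitted, and the weights `w_ce` and `w_dice` are hypothetical.

```python
import math


def soft_dice_loss(probs, mask, eps=1e-6):
    """1 - Dice overlap between predicted probabilities and a binary mask."""
    inter = sum(p * m for p, m in zip(probs, mask))
    denom = sum(probs) + sum(mask)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)


def binary_cross_entropy(probs, mask, eps=1e-7):
    """Mean binary cross-entropy, with probabilities clipped for stability."""
    total = 0.0
    for p, m in zip(probs, mask):
        p = min(max(p, eps), 1.0 - eps)
        total -= m * math.log(p) + (1 - m) * math.log(1.0 - p)
    return total / len(probs)


def combined_loss(probs, mask, w_ce=1.0, w_dice=1.0):
    """Weighted sum of the two terms (weights are illustrative assumptions)."""
    return w_ce * binary_cross_entropy(probs, mask) + w_dice * soft_dice_loss(probs, mask)


mask = [1, 0]
good = combined_loss([0.9, 0.1], mask)  # prediction close to the mask
bad = combined_loss([0.1, 0.9], mask)   # prediction inverted
print(good, bad)
```

The Dice term depends on overlap rather than on per-pixel counts, which is why it is often paired with cross-entropy when the target class covers few pixels.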

34 pages, 10017 KB  
Article
U-H-Mamba: An Uncertainty-Aware Hierarchical State-Space Model for Lithium-Ion Battery Remaining Useful Life Prediction Using Hybrid Laboratory and Real-World Datasets
by Zhihong Wen, Xiangpeng Liu, Wenshu Niu, Hui Zhang and Yuhua Cheng
Energies 2026, 19(2), 414; https://doi.org/10.3390/en19020414 - 14 Jan 2026
Viewed by 304
Abstract
Accurate prognosis of the remaining useful life (RUL) for lithium-ion batteries is critical for mitigating range anxiety and ensuring the operational safety of electric vehicles. However, existing data-driven methods often struggle to maintain robustness when transferring from controlled laboratory conditions to complex, sensor-limited, real-world environments. To bridge this gap, this study presents U-H-Mamba, a novel uncertainty-aware hierarchical framework trained on a massive hybrid repository comprising over 146,000 charge–discharge cycles from both laboratory benchmarks and operational electric vehicle datasets. The proposed architecture employs a two-level design to decouple degradation dynamics, where a Multi-scale Temporal Convolutional Network functions as the base encoder to extract fine-grained electrochemical fingerprints, including derived virtual impedance proxies, from high-frequency intra-cycle measurements. Subsequently, an enhanced Pressure-Aware Multi-Head Mamba decoder models the long-range inter-cycle degradation trajectories with linear computational complexity. To guarantee reliability in safety-critical applications, a hybrid uncertainty quantification mechanism integrating Monte Carlo Dropout with Inductive Conformal Prediction is implemented to generate calibrated confidence intervals. Extensive empirical evaluations demonstrate the framework’s superior performance, achieving a RMSE of 3.2 cycles on the NASA dataset and 5.4 cycles on the highly variable NDANEV dataset, thereby outperforming state-of-the-art baselines by 20–40%. Furthermore, SHAP-based interpretability analysis confirms that the model correctly identifies physics-informed pressure dynamics as critical degradation drivers, validating its zero-shot generalization capabilities. With high accuracy and linear scalability, the U-H-Mamba model offers a viable and physically interpretable solution for cloud-based prognostics in large-scale electric vehicle fleets. Full article
(This article belongs to the Section F5: Artificial Intelligence and Smart Energy)
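The calibrated intervals mentioned in the abstract above rely on inductive (split) conformal prediction. The sketch below shows the basic recipe under simple assumptions: absolute residuals as the nonconformity score and a held-out calibration set. The Monte Carlo Dropout component is not modeled, and all numbers are illustrative rather than taken from the paper.

```python
import math


def conformal_halfwidth(cal_pred, cal_true, alpha=0.1):
    """Interval half-width from calibration residuals (split conformal)."""
    scores = sorted(abs(p - y) for p, y in zip(cal_pred, cal_true))
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return float("inf")  # too few calibration points for this alpha
    return scores[k - 1]


def predict_interval(point, halfwidth):
    """Symmetric interval around a point prediction."""
    return point - halfwidth, point + halfwidth


# Hypothetical calibration set: predicted vs. observed RUL in cycles.
cal_pred = [10, 20, 30, 40, 50, 60, 70, 80, 90]
cal_true = [11, 22, 33, 44, 55, 66, 77, 88, 99]  # residuals 1..9
q = conformal_halfwidth(cal_pred, cal_true, alpha=0.2)
low, high = predict_interval(100, q)
print(q, low, high)
```

With nine calibration residuals and `alpha=0.2`, the rank is ceil(10 × 0.8) = 8, so the half-width is the eighth-smallest residual.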

29 pages, 7092 KB  
Article
Dual-Branch Attention Photovoltaic Power Forecasting Model Integrating Ground-Based Cloud Image Features
by Lianglin Zou, Hongyang Quan, Jinguo He, Shuai Zhang, Ping Tang, Xiaoshi Xu and Jifeng Song
Energies 2026, 19(2), 409; https://doi.org/10.3390/en19020409 - 14 Jan 2026
Viewed by 133
Abstract
The photovoltaic field has seen significant development in recent years, with continuously expanding installation capacity and increasing grid integration. However, due to the intermittency of solar energy and meteorological variability, PV output power poses serious challenges to grid security and dispatch reliability. Traditional forecasting methods largely rely on modeling historical power and meteorological data, often neglecting the consideration of cloud movement, which constrains further improvement in prediction accuracy. To enhance prediction accuracy and model interpretability, this paper proposes a dual-branch attention-based PV power prediction model that integrates physical features from ground-based cloud images. Regarding input features, a cloud segmentation model is constructed based on the vision foundation model DINO encoder and an improved U-Net decoder to obtain cloud cover information. Based on deep feature point detection and an attention matching mechanism, cloud motion vectors are calculated to extract cloud motion speed and direction features. For feature processing, feature attention and temporal attention mechanisms are introduced, enabling the model to learn key meteorological factors and critical historical time steps. Structurally, a parallel architecture consisting of a linear branch and a nonlinear branch is adopted. A context-aware fusion module adaptively combines the prediction results from both branches, achieving collaborative modeling of linear trends and nonlinear fluctuations. Comparative experiments were conducted using two years of engineering data. Experimental results demonstrate that the proposed model outperforms the benchmarks across multiple metrics, validating the predictive advantages of the dual-branch structure that integrates physical features under complex weather conditions. Full article
(This article belongs to the Section A2: Solar Energy and Photovoltaic Systems)

26 pages, 5686 KB  
Article
MAFMamba: A Multi-Scale Adaptive Fusion Network for Semantic Segmentation of High-Resolution Remote Sensing Images
by Boxu Li, Xiaobing Yang and Yingjie Fan
Sensors 2026, 26(2), 531; https://doi.org/10.3390/s26020531 - 13 Jan 2026
Viewed by 205
Abstract
With rapid advancements in sub-meter satellite and aerial imaging technologies, high-resolution remote sensing imagery has become a pivotal source for geospatial information acquisition. However, current semantic segmentation models encounter two primary challenges: (1) the inherent trade-off between capturing long-range global context and preserving precise local structural details—where excessive reliance on downsampled deep semantics often results in blurred boundaries and the loss of small objects and (2) the difficulty in modeling complex scenes with extreme scale variations, where objects of the same category exhibit drastically different morphological features. To address these issues, this paper introduces MAFMamba, a multi-scale adaptive fusion visual Mamba network tailored for high-resolution remote sensing images. To mitigate scale variation, we design a lightweight hybrid encoder incorporating an Adaptive Multi-scale Mamba Block (AMMB) in each stage. Driven by a Multi-scale Adaptive Fusion (MSAF) mechanism, the AMMB dynamically generates pixel-level weights to recalibrate cross-level features, establishing a robust multi-scale representation. Simultaneously, to strictly balance local details and global semantics, we introduce a Global–Local Feature Enhancement Mamba (GLMamba) in the decoder. This module synergistically integrates local fine-grained features extracted by convolutions with global long-range dependencies modeled by the Visual State Space (VSS) layer. Furthermore, we propose a Multi-Scale Cross-Attention Fusion (MSCAF) module to bridge the semantic gap between the encoder’s shallow details and the decoder’s high-level semantics via an efficient cross-attention mechanism. Extensive experiments on the ISPRS Potsdam and Vaihingen datasets demonstrate that MAFMamba surpasses state-of-the-art Convolutional Neural Network (CNN), Transformer, and Mamba-based methods in terms of mIoU and mF1 scores. 
Notably, it achieves superior accuracy while maintaining linear computational complexity and low memory usage, underscoring its efficiency in complex remote sensing scenarios. Full article
(This article belongs to the Special Issue Intelligent Sensors and Artificial Intelligence in Building)

19 pages, 1780 KB  
Article
Dynamic Topology-Aware Linear Attention Network for Efficient Traveling Salesman Problem Optimization
by Shilong Zhao and Qianqian Duan
Mathematics 2026, 14(1), 166; https://doi.org/10.3390/math14010166 - 1 Jan 2026
Viewed by 400
Abstract
The Traveling Salesman Problem (TSP) is a classic combinatorial optimization problem with broad applications in logistics and smart agriculture. However, despite significant progress in Transformer-based deep reinforcement learning methods, two major challenges remain. First, standard linear embedding layers struggle to capture dynamic local geometric relationships between nodes. Second, the quadratic complexity of self-attention in the decoder hinders efficiency in large-scale TSP instances. To address these issues, this paper proposes a Dynamic Topology-Aware Linear Attention Network (DTALAN). The encoder employs a Channel-aware Topological Refinement Graph Convolution (CTRGC) module to model local geometric structures and a Global Attention Mechanism (GAM) for adaptive feature recalibration. The decoder introduces a temporal locality-aware attention mechanism that focuses only on recently visited nodes, reducing self-attention complexity from quadratic to linear while preserving solution quality. The policy network is trained using the REINFORCE algorithm with baseline and the Adam optimizer. Experiments on random instances and the TSPLIB benchmark show that DTALAN outperforms leading deep reinforcement learning methods in both optimality gap and inference efficiency. For TSP100, it achieves an optimality gap of 0.55%, producing near-optimal solutions. Ablation studies confirm that both the improved CTRGC and enhanced GAM modules are essential to these results. Full article
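One way to see why attending only to recently visited nodes turns quadratic decoding cost into linear cost is the sketch below. It is an assumed, simplified reading of the idea, not the DTALAN mechanism itself: each decoding step runs scaled dot-product attention over a fixed window of the most recent nodes, so the per-step cost is bounded by the window size rather than by the tour length. The function name, `window` parameter, and data are hypothetical.

```python
import math


def recent_window_attention(query, visited_keys, visited_values, window=3):
    """Scaled dot-product attention over the last `window` visited nodes."""
    keys = visited_keys[-window:]
    values = visited_values[-window:]
    scale = math.sqrt(len(query))
    logits = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    # Numerically stable softmax over at most `window` logits.
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    context = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return context, weights


# Five visited nodes with 2-D embeddings (illustrative values).
keys = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.5, 0.5], [2.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [2.0, 2.0]]
context, weights = recent_window_attention([1.0, 0.0], keys, values, window=2)
print(context, weights)
```

Each step touches at most `window` keys, so decoding a tour of n nodes costs on the order of n × window operations instead of n².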

24 pages, 4080 KB  
Article
An Unsupervised Situation Awareness Framework for UAV Sensor Data Fusion Enabled by a Stabilized Deep Variational Autoencoder
by Anxin Guo, Zhenxing Zhang, Rennong Yang, Ying Zhang, Liping Hu and Leyan Li
Sensors 2026, 26(1), 111; https://doi.org/10.3390/s26010111 - 24 Dec 2025
Viewed by 500
Abstract
Effective situation awareness relies on the robust processing of high-dimensional data streams generated by onboard sensors. However, the application of deep generative models to extract features from complex UAV sensor data (e.g., GPS, IMU, and radar feeds) faces two fundamental challenges: critical training instability and the difficulty of representing multi-modal distributions inherent in dynamic flight maneuvers. To address this, this paper proposes a novel unsupervised sensor data processing framework to overcome these issues. Our core innovation is a deep generative model, VAE-WRBM-MDN, specifically engineered for stable feature extraction from non-linear time-series sensor data. We demonstrate that while standard Variational Autoencoders (VAEs) often struggle to converge on this task, our introduction of Weighted-uncertainty Restricted Boltzmann Machines (WRBM) for layer-wise pre-training ensures stable learning. Furthermore, the integration of a Mixture Density Network (MDN) enables the decoder to accurately reconstruct the complex, multi-modal conditional distributions of sensor readings. Comparative experiments validate our approach, achieving 95.69% classification accuracy in identifying situational patterns. The results confirm that our framework provides robust enabling technology for real-time intelligent sensing and raw data interpretation in autonomous systems. Full article
(This article belongs to the Section Intelligent Sensors)

28 pages, 6383 KB  
Article
Learning the Grid: Transformer Architectures for Electricity Price Forecasting in the Australian National Market
by Mark Sinclair, Andrew J. Shepley and Farshid Hajati
Appl. Sci. 2026, 16(1), 75; https://doi.org/10.3390/app16010075 - 21 Dec 2025
Viewed by 435
Abstract
The increasing adoption of highly variable renewable energy has introduced unprecedented volatility into the National Electricity Market (NEM), rendering traditional linear price forecasting models insufficient. The Australian Energy Market Operator (AEMO) spot price forecasts often struggle during periods of volatile demand, renewable variability, and strategic rebidding. This study evaluates whether transformer architectures can improve intraday NEM price forecasting. Using 34 months of market data and weather conditions, several transformer variants, including encoder–decoder, decoder-only, and encoder-only, were compared against the AEMO’s operational forecast, a two-layer LSTM baseline, the Temporal Fusion Transformer, PatchTST, and TimesFM. The decoder-only transformer achieved the best accuracy across the 2–16 h horizons in NSW, with nMAPE values of 33.6–39.2%, outperforming both AEMO and all baseline models. Retraining in Victoria and Queensland produced similarly strong results, demonstrating robust regional generalisation. A feature importance analysis showed that future-facing predispatch and forecast covariates dominate model importance, explaining why a decoder-only transformer variant performed so competitively. While magnitude estimation for extreme price spikes remains challenging, the transformer models demonstrated superior capability in delivering statistically significant improvements in forecast accuracy. An API providing real-time forecasts using the small encoder–decoder transformer model is available. Full article
(This article belongs to the Special Issue Artificial Intelligence (AI) for Energy Systems)

24 pages, 4961 KB  
Article
U-PKAN: A Dual-Module Kolmogorov–Arnold Network for Agricultural Plant Disease Detection
by Dejun Xi, Baotong Zhang and Yi-Jia Wang
Agriculture 2025, 15(24), 2599; https://doi.org/10.3390/agriculture15242599 - 16 Dec 2025
Viewed by 380
Abstract
Crop diseases and pests have a significant impact on planting costs and crop yields and, in severe cases, can threaten food security and farmers’ incomes. Currently, most researchers employ various deep learning methods, such as the YOLO series algorithms and U-Net and its variants, for the detection of agricultural plant diseases. However, the existing algorithms suffer from insufficient interpretability and are limited to linear modeling, which can lead to issues such as trust crises in current technologies, restricted applications and difficulties in tracing and correcting errors. To address these issues, a dual-module Kolmogorov–Arnold Network (U-PKAN) is proposed for agricultural plant disease detection in this paper. A KAN encoder–decoder structure is adopted to construct the network. To ensure the network fully extracts features, two different modules, namely Patchembed-KAN (P-KAN) and Decoder-KAN (D-KAN), are designed. To enhance the network’s feature fusion capability, a KAN-based symmetrical structure for skip connections is designed. The proposed method places learnable activation functions on weights, enabling it to achieve higher accuracy with fewer parameters. Moreover, it can reveal the compositional structure and variable dependencies of synthetic datasets through symbolic formulas, thus exhibiting excellent interpretability. A field corn disease image dataset was collected and constructed. Additionally, the performance of the U-PKAN model was verified using the open plant disease dataset PlantDoc and a gear pitting dataset. To better understand the performance differences between different methods, U-PKAN was compared with U-KAN, U-Net, AttUNet, and U-Net++ models for performance benchmarking. IoU and the Dice coefficient were chosen as evaluation metrics. The experimental results demonstrate that the proposed method achieves faster convergence and higher segmentation accuracy. 
Overall, the proposed method demonstrates outstanding performance in aspects such as function approximation, global perception, interpretability and computational efficiency. Full article

34 pages, 7119 KB  
Article
A Deployment-Aware Framework for Carbon- and Water-Efficient LLM Serving
by Julian Hoxha, Marsela Thanasi-Boçe and Tarek Khalifa
Sustainability 2025, 17(23), 10473; https://doi.org/10.3390/su172310473 - 22 Nov 2025
Viewed by 999
Abstract
Inference now dominates the lifecycle footprint of large language models, yet published estimates often use inconsistent boundaries and optimize carbon while ignoring water. We present a provider-agnostic framework that unifies scope-transparent measurement with time-resolved, SLO-aware orchestration and jointly optimizes carbon and consumptive water. Measurement reports daily medians at a comprehensive serving boundary that includes accelerators, host CPU/DRAM, provisioned idle, and PUE uplift, and provides accelerator-only whiskers for reconciliation. Optimization uses a mixed-integer linear program solved over five-minute windows; it selects region, batch size, and phase-aware hardware for prefill and decode while enforcing p95 TTFT and TPOT as well as capacity constraints. Applied to four representative models, a single SLO-aware policy reduces comprehensive-boundary medians by 57 to 59 percent for energy, 59 to 60 percent for water, and 78 to 80 percent for location-based CO2, with SLOs met in every window. For a day with 500 million queries on GPT-4o, totals fall from 0.344 to 0.145 GWh, 1.196 to 0.490 ML, and 121 to 25 t CO2 (location-based). The framework offers a deployable template for carbon- and water-aware LLM serving with auditable and scope-transparent reporting. Full article
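The daily totals quoted for the 500-million-query day can be checked against the stated percentage ranges with a few lines of arithmetic. The sketch below uses only the figures given in the abstract; the helper name is illustrative.

```python
def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


energy = percent_reduction(0.344, 0.145)  # GWh
water = percent_reduction(1.196, 0.490)   # ML
carbon = percent_reduction(121, 25)       # t CO2, location-based

print(f"energy {energy:.1f}%, water {water:.1f}%, CO2 {carbon:.1f}%")
```

The three results (roughly 57.8%, 59.0%, and 79.3%) fall inside the 57 to 59, 59 to 60, and 78 to 80 percent ranges reported for the comprehensive-boundary medians.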

26 pages, 2510 KB  
Article
A Three-Machine Flowshop Scheduling Problem with Linear Fatigue Effect
by Weiping Xu, Zehou Sun, Xiaotian Ai, Baoyun Zhao, Jingyi Lu, Hanyu Zhou, Xinqi Mao, Xiaoling Wen, Chin-Chia Wu and Shufeng Liu
Mathematics 2025, 13(22), 3670; https://doi.org/10.3390/math13223670 - 16 Nov 2025
Viewed by 576
Abstract
Highly customized requirements in smart manufacturing result in the unavoidable manual execution of complex operational procedures. Physical and mental fatigue from long work periods for assembly-line operators induces production issues, such as defective work-in-processes or equipment failure. An effective production schedule should account for worker fatigue. This study investigates a three-machine flowshop scheduling problem with the objective of makespan minimization, in which a linear fatigue effect function provides an approximate mathematical representation of fatigue and recovery processes in workers. A mixed integer programming (MIP) model is developed to optimize the integration of automated and human-operated production in manufacturing systems. Given its NP-hardness, an improved tabu search (ITS) algorithm is designed to obtain high-quality solutions, incorporating multiple initial solutions, a well-designed encoding-decoding strategy, and a tabu-based adaptive search mechanism to enhance efficiency. Numerical simulations indicate the veracity of the MIP model and the effectiveness of the ITS algorithm. Full article
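For orientation, the makespan objective of a permutation flowshop can be computed with the standard completion-time recurrence, sketched below for three machines. This is a simplified illustration: the linear fatigue effect is not modeled because its functional form is not given in the abstract, exhaustive enumeration stands in for the paper's tabu search, and the processing times are made up.

```python
from itertools import permutations


def makespan(sequence, proc_times):
    """Completion time of the last job on the last machine."""
    num_machines = len(next(iter(proc_times.values())))
    finish = [0] * num_machines  # finish[m]: last completion time on machine m
    for job in sequence:
        for m in range(num_machines):
            # A job starts on machine m once the machine is free and the
            # job has finished on machine m - 1.
            ready = finish[m - 1] if m > 0 else 0
            finish[m] = max(finish[m], ready) + proc_times[job][m]
    return finish[-1]


# Hypothetical processing times (machine 1, machine 2, machine 3) per job.
proc_times = {"A": (3, 2, 2), "B": (1, 4, 3), "C": (2, 1, 4)}

baseline = makespan(["A", "B", "C"], proc_times)
best_seq = min(permutations(proc_times), key=lambda s: makespan(s, proc_times))
best = makespan(best_seq, proc_times)
print(baseline, best_seq, best)
```

Enumeration is only feasible for a handful of jobs, which is why NP-hard instances call for heuristics such as the tabu search described in the abstract.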

20 pages, 4429 KB  
Article
ANT-KT: Adaptive NAS Transformers for Knowledge Tracing
by Shuanglong Yao, Yichen Song, Ye Liu, Ji Chen, Deyu Zhao and Xing Wang
Electronics 2025, 14(21), 4148; https://doi.org/10.3390/electronics14214148 - 23 Oct 2025
Viewed by 1096
Abstract
Knowledge Tracing aims to assess students’ mastery of knowledge concepts in real time, playing a crucial role in providing personalized learning services in intelligent tutoring systems. In recent years, researchers have attempted to introduce Neural Architecture Search (NAS) into knowledge tracing tasks to automatically design more efficient network structures. However, existing NAS-based methods for Knowledge Tracing suffer from excessively large search spaces and slow search efficiency, which significantly constrain their practical applications. To address these limitations, this paper proposes an Adaptive Neural Architecture Search framework based on Transformers for KT, called ANT-KT. Specifically, we design an enhanced encoder that combines convolution operations with state vectors to capture both local and global dependencies in students’ learning sequences. Moreover, an optimized decoder with a linear attention mechanism is introduced to improve the efficiency of modeling long-term student knowledge state evolution. We further propose an evolutionary NAS algorithm that incorporates a model optimization efficiency objective and a dynamic search space reduction strategy, enabling the discovery of high-performing yet computationally efficient architectures. Experimental results on two large-scale real-world datasets, EdNet and RAIEd2020, demonstrate that ANT-KT significantly reduces time costs across all stages of NAS while achieving performance improvements on multiple evaluation metrics, validating the efficiency and practicality of the proposed method. Full article
(This article belongs to the Section Artificial Intelligence)

20 pages, 2894 KB  
Article
End-to-End Swallowing Event Localization via Blue-Channel-to-Depth Substitution in RGB-D: GRNConvNeXt-Modified AdaTAD with KAN-Chebyshev Decoder
by Derek Ka-Hei Lai, Zi-An Zhao, Andy Yiu-Chau Tam, Jing Li, Jason Zhi-Shen Zhang, Duo Wai-Chi Wong and James Chung-Wai Cheung
AI 2025, 6(11), 276; https://doi.org/10.3390/ai6110276 - 22 Oct 2025
Viewed by 967
Abstract
Background: Swallowing is a complex biomechanical process, and its impairment (dysphagia) poses major health risks for older adults. Current diagnostic methods such as the videofluoroscopic swallowing study (VFSS) and fiberoptic endoscopic evaluation of swallowing (FEES) are effective but invasive, resource-intensive, and unsuitable for continuous monitoring. This study proposes a novel end-to-end RGB–D framework for automated swallowing event localization in continuous video streams. Methods: The framework enhances the AdaTAD backbone through three key innovations: (i) finding the optimal strategy to integrate depth information to capture subtle neck movements, (ii) examining the best adapter design for efficient temporal feature adaptation, and (iii) introducing a Kolmogorov–Arnold Network (KAN) decoder that leverages Chebyshev polynomials for non-linear temporal modeling. Evaluation on a proprietary swallowing dataset comprising 641 clips and 3153 annotated events demonstrated the effectiveness of the proposed framework. We analysed and compared the modification strategy across designs of adapters, decoders, input channel combinations, regression methods, and patch embedding techniques. Results: The optimized configuration (VideoMAE + GRNConvNeXtAdapter + KAN + RGD + boundary regression + sinusoidal embedding) achieved an average mAP of 83.25%, significantly surpassing the baseline I3D + RGB + MLP model (61.55%). Ablation studies further confirmed that each architectural component contributed incrementally to the overall improvement. Conclusions: These results establish the feasibility of accurate, non-invasive, and automated swallowing event localization using depth-augmented video. The proposed framework paves the way for practical dysphagia screening and long-term monitoring in clinical and home-care environments.
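To give a sense of what a KAN layer built on Chebyshev polynomials might look like, here is a minimal NumPy sketch. It assumes the commonly used construction in which each input is squashed into [-1, 1] with `tanh` and expanded via the recurrence T_0(x) = 1, T_1(x) = x, T_k(x) = 2x T_{k-1}(x) - T_{k-2}(x). The layer shape, the `tanh` squashing, and the names `chebyshev_basis` and `cheby_kan_layer` are illustrative assumptions, not the authors' decoder.

```python
import numpy as np

def chebyshev_basis(x, degree):
    """Evaluate T_0..T_degree at x in [-1, 1]; returns shape (..., degree + 1)."""
    x = np.asarray(x, dtype=float)
    terms = [np.ones_like(x)]
    if degree >= 1:
        terms.append(x)
    for _ in range(2, degree + 1):
        # Three-term recurrence for Chebyshev polynomials of the first kind.
        terms.append(2.0 * x * terms[-1] - terms[-2])
    return np.stack(terms, axis=-1)

def cheby_kan_layer(x, coeffs):
    """x: (batch, d_in); coeffs: (d_in, d_out, degree + 1), learnable in practice."""
    degree = coeffs.shape[-1] - 1
    basis = chebyshev_basis(np.tanh(x), degree)   # (batch, d_in, degree + 1)
    # Each input-output edge applies its own polynomial; outputs sum over inputs.
    return np.einsum("bik,iok->bo", basis, coeffs)

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 6))                 # hypothetical temporal features
coeffs = rng.normal(size=(6, 2, 4)) * 0.1          # degree-3 polynomials per edge
y = cheby_kan_layer(features, coeffs)
```

Unlike an MLP, where the non-linearity is a fixed activation applied after a linear map, the non-linearity here lives on each edge as a polynomial with its own coefficients.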

19 pages, 2867 KB  
Article
Non-Linear Modeling and Precision Analysis Approach for Implantable Multi-Channel Neural Recording Systems
by Jinyan He, Jian Xu and Yueming Wang
Micromachines 2025, 16(10), 1176; https://doi.org/10.3390/mi16101176 - 17 Oct 2025
Viewed by 631
Abstract
High-precision implantable multi-channel neural recording systems play a crucial role in the diagnosis and treatment of neurological disorders. However, it is a significant design challenge to achieve an optimal trade-off among linear parameters, signal fidelity, power consumption, and circuit area. To address this challenge, a Simulink-based modeling approach has been proposed to incorporate adjustable non-linear parameters across the front-end circuits and analog-to-digital converter (ADC) stages. The model evaluates non-linearity impacts on system performance through both quantitative spike detection accuracy analysis and a neural decoding paradigm based on Chinese handwriting reconstruction. Simulated results show that total harmonic distortion (THD) can be set to −34.32 dB for the low-noise amplifier (LNA), −33.73 dB for the programmable gain amplifier (PGA), and −57.95 dB for the ADC in order to achieve reliable detection accuracy with minimal design cost. Moreover, ADC non-linearity has a greater influence on system performance than LNA and PGA non-linearity. The proposed approach offers quantitative and systematic hardware design guidance to balance signal fidelity and resource efficiency for future low-power, high-accuracy neural recording systems.
(This article belongs to the Section B1: Biosensors)
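To help interpret the THD figures quoted in the abstract, the short sketch below converts between harmonic amplitudes and THD in dB. It assumes the common amplitude-ratio definition, THD = 20·log10(sqrt(sum of squared harmonic amplitudes) / fundamental amplitude). The abstract does not state which definition the authors use, and the function names are hypothetical.

```python
import math

def thd_db(fundamental, harmonics):
    """THD in dB from the amplitude of the fundamental and of each harmonic."""
    ratio = math.sqrt(sum(h * h for h in harmonics)) / fundamental
    return 20.0 * math.log10(ratio)

def db_to_ratio(thd):
    """Convert a THD value in dB back to a linear amplitude ratio."""
    return 10.0 ** (thd / 20.0)

# THD targets quoted in the abstract, expressed as amplitude ratios.
lna_ratio = db_to_ratio(-34.32)
pga_ratio = db_to_ratio(-33.73)
adc_ratio = db_to_ratio(-57.95)
```

Under this definition the LNA and PGA targets correspond to harmonic content of roughly 2% of the fundamental amplitude, and the ADC target to roughly 0.13%. That is more than an order of magnitude tighter, in line with the abstract's observation that ADC non-linearity has the larger effect.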