Search Results (726)

Search Parameters:
Keywords = residual fusion networks

21 pages, 3876 KB  
Article
Res-FormerNet: A Residual–Transformer Fusion Network for 2-D Magnetotelluric Inversion
by Junhu Yu, Xingong Tang and Zhitao Xiong
Appl. Sci. 2026, 16(1), 270; https://doi.org/10.3390/app16010270 (registering DOI) - 26 Dec 2025
Abstract
We propose Res-FormerNet, an improved inversion network that integrates a lightweight Transformer encoder into a ResNet50 backbone to enhance two-dimensional magnetotelluric (MT) inversion. The model is designed to jointly leverage residual convolutional structures for local feature extraction and global attention mechanisms for capturing long-range spatial dependencies in geoelectrical resistivity models. To evaluate the effectiveness of the proposed architecture, more than 100,000 synthetic models generated by a two-dimensional staggered-grid finite-difference forward solver are used to construct training and validation datasets for TE and TM apparent resistivity responses, with realistic noise levels applied to simulate field acquisition conditions. A smoothness-aware loss function is further introduced to improve inversion stability and structural continuity. Results from synthetic tests demonstrate that incorporating the Transformer encoder substantially enhances the recovery of large-scale anomalies, structural boundaries, and resistivity contrasts compared with the original ResNet50. The proposed method also exhibits strong generalization capability when applied to real MT field data from southern Africa, producing inversion results highly consistent with those obtained using the nonlinear conjugate gradient (NLCG) method. These findings confirm that the Res-FormerNet architecture provides an effective and robust framework for MT inversion and illustrate the potential of hybrid convolution–attention networks for advancing data-driven electromagnetic inversion. Full article
(This article belongs to the Special Issue Applied Geophysical Imaging and Data Processing)
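As a concrete illustration of the smoothness-aware loss mentioned in the abstract, the sketch below combines an MSE data misfit with a first-difference penalty on the predicted 2-D resistivity model. The function name, the L1 form of the penalty, and the weighting factor lambda_smooth are assumptions for illustration; the abstract does not give the exact formulation.

```python
# Hypothetical sketch of a smoothness-aware inversion loss: an MSE data misfit
# plus a first-difference (total-variation-style) penalty on the predicted 2-D
# resistivity model. Weights and norms are illustrative assumptions.
import torch
import torch.nn.functional as F

def smoothness_aware_loss(pred, target, lambda_smooth=0.1):
    """pred, target: (batch, 1, H, W) log-resistivity models."""
    data_misfit = F.mse_loss(pred, target)
    # First differences along depth (H) and horizontal position (W)
    dz = pred[:, :, 1:, :] - pred[:, :, :-1, :]
    dx = pred[:, :, :, 1:] - pred[:, :, :, :-1]
    smoothness = dz.abs().mean() + dx.abs().mean()
    return data_misfit + lambda_smooth * smoothness
```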
24 pages, 1386 KB  
Article
Distributed Cooperative Spectrum Sensing via Push–Sum Consensus for Full-Duplex Cognitive Aerial Base Stations
by Andrea Tani and Dania Marabissi
Future Internet 2026, 18(1), 10; https://doi.org/10.3390/fi18010010 (registering DOI) - 26 Dec 2025
Abstract
The integration of terrestrial and aerial components in future wireless networks is a key enabler for achieving wide-area coverage and providing ubiquitous services. In this context, and with the goal of enhancing spectral efficiency through opportunistic spectrum reuse, this paper investigates a cooperative spectrum sensing approach in which cognitive UAVs equipped with full-duplex (FD) MIMO technology operate as aerial base stations (ABS). Each UAV performs local detection using the sphericity test; a push–sum consensus protocol is then employed to fuse the local test statistics without relying on a fusion center. Unlike conventional unweighted consensus or centralized hard-decision fusion, the proposed approach accounts for the heterogeneity introduced by residual self-interference (RSI) in FD transceivers. Specifically, multipath in the self-interference channel induces temporal correlation, increasing the variance of the local test statistic and, consequently, the false-alarm probability. To mitigate this effect, we design variance-aware consensus weights proportional to the inverse of the sphericity test variance, enhancing robustness to RSI-induced variability. Numerical results demonstrate that the proposed scheme outperforms both unweighted consensus and centralized OR-rule fusion in user capacity, while maintaining negligible communication overhead. Moreover, the operational altitude of the UAVs is evaluated to balance the coverage provided to users and the primary signal detection capability. Full article
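To make the fusion step above concrete, the following NumPy sketch runs push–sum consensus on a small directed graph; initializing the numerators with a_i·T_i and the denominators with a_i (where a_i = 1/variance) makes every node converge to the inverse-variance weighted average of the local sphericity statistics. The ring topology, equal mixing shares, and iteration count are illustrative assumptions rather than the paper's protocol parameters.

```python
# Minimal push-sum sketch on a directed graph using NumPy. Each UAV starts with
# a local test statistic T_i and a variance estimate var_i; the ratio of the
# push-sum numerator and denominator converges at every node to the
# inverse-variance weighted average. Graph and weights are illustrative.
import numpy as np

def push_sum_weighted_average(T, var, out_neighbors, iters=200):
    n = len(T)
    a = 1.0 / np.asarray(var, dtype=float)    # variance-aware weights
    num = a * np.asarray(T, dtype=float)      # push-sum numerators
    den = a.copy()                            # push-sum denominators
    for _ in range(iters):
        new_num, new_den = np.zeros(n), np.zeros(n)
        for i in range(n):
            targets = [i] + list(out_neighbors[i])   # keep a share, send the rest
            share = 1.0 / len(targets)               # column-stochastic mixing
            for j in targets:
                new_num[j] += share * num[i]
                new_den[j] += share * den[i]
        num, den = new_num, new_den
    return num / den                          # same fused statistic at every node

# Example: 4 UAVs on a directed ring
fused = push_sum_weighted_average(
    T=[1.2, 0.9, 1.1, 1.4], var=[0.5, 1.0, 0.8, 2.0],
    out_neighbors={0: [1], 1: [2], 2: [3], 3: [0]})
```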

22 pages, 6921 KB  
Article
SFE-DETR: An Enhanced Transformer-Based Face Detector for Small Target Faces in Open Complex Scenes
by Chenhao Yang, Yueming Jiang and Chunyan Song
Sensors 2026, 26(1), 125; https://doi.org/10.3390/s26010125 - 24 Dec 2025
Viewed by 112
Abstract
Face detection is an important task in the field of computer vision and is widely used in practical applications. However, in open and complex scenes with dense faces, occlusions, and image degradation, small face detection still faces significant challenges due to the extremely small target scale, difficult localization, and severe background interference. To address these issues, this paper proposes a small face detector for open complex scenes, SFE-DETR, which aims to simultaneously improve detection accuracy and computational efficiency. The backbone network of the model adopts an inverted residual shift convolution and dilated reparameterization structure, which enhances shallow features and enables deep feature self-adaptation, thereby better preserving small-scale information and reducing the number of parameters. Additionally, a multi-head multi-scale self-attention mechanism is introduced to fuse multi-scale convolutional features with channel-wise weighting, capturing fine-grained facial features while suppressing background noise. Moreover, a redesigned SFE-FPN introduces high-resolution layers and incorporates a novel feature fusion module consisting of local, large-scale, and global branches, efficiently aggregating multi-level features and significantly improving small face detection performance. Experimental results on two challenging small face detection datasets show that SFE-DETR reduces parameters by 28.1% compared to the original RT-DETR-R18 model, achieving a mAP50 of 94.7% and AP-s of 42.1% on the SCUT-HEAD dataset, and a mAP50 of 86.3% on the WIDER FACE (Hard) subset. These results demonstrate that SFE-DETR achieves the best detection performance among models of the same scale while maintaining efficiency. Full article
(This article belongs to the Section Optical Sensors)

22 pages, 4809 KB  
Article
Multi-Scale Interactive Network with Color Attention for Low-Light Image Enhancement
by Haoxiang Lu, Changna Qian, Ziming Wang and Zhenbing Liu
Sensors 2026, 26(1), 83; https://doi.org/10.3390/s26010083 - 22 Dec 2025
Viewed by 146
Abstract
Enhancing low-light images is crucial in computer vision applications. Most existing learning-based models struggle to balance light enhancement and color correction, while images typically contain different types of information at different levels. Hence, we propose a multi-scale interactive network with color attention named MSINet to effectively explore these different types of information for low-light image enhancement (LLIE) tasks. Specifically, MSINet first employs a CNN-based branch built upon stacked residual channel attention blocks (RCABs) to fully explore the local features of the image. Meanwhile, a Transformer-based branch constructed from Transformer blocks containing cross-scale attention (CSA) and multi-head self-attention (MHSA) mines the global features. Notably, the local and global features extracted by each RCAB and Transformer block interact through a fusion module. Additionally, a color correction branch (CCB) based upon self-attention (SA) learns the color distribution of the low-light input to further guarantee the color fidelity of the final output. Extensive experiments have demonstrated that our proposed MSINet outperforms state-of-the-art LLIE methods in light enhancement and color correction. Full article
(This article belongs to the Section Sensing and Imaging)
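A minimal PyTorch sketch of the kind of residual channel attention block (RCAB) the CNN branch stacks is shown below; the kernel sizes, reduction ratio, and squeeze-and-excite style gating are assumptions, not the authors' exact configuration.

```python
# Sketch of a residual channel attention block: two 3x3 convolutions, a
# channel-attention gate computed from a global average pool, and a residual
# connection. Layer sizes are illustrative assumptions.
import torch
import torch.nn as nn

class RCAB(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # Channel attention: squeeze (global average pool), then excite
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        res = self.body(x)
        return x + res * self.attn(res)   # residual connection around attended features
```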

23 pages, 2688 KB  
Article
RGSGAN–MACRNet: A More Accurate Recognition Method for Imperfect Corn Kernels Under Sample-Size-Limited Conditions
by Chenxia Wan, Wenzheng Li, Qinghui Zhang, Le Xiao, Pengtao Lv, Huiyi Zhao and Shihua Jing
Foods 2025, 14(24), 4356; https://doi.org/10.3390/foods14244356 - 18 Dec 2025
Viewed by 253
Abstract
Under sample-size-limited conditions, the recognition accuracy of imperfect corn kernels is severely degraded. To address this issue, a recognition framework that integrates a Residual Generative Spatial–Channel Synergistic Attention Generative Adversarial Network (RGSGAN) with a Multi-Scale Asymmetric Convolutional Residual Network (MACRNet) is proposed. First, residual structures and a spatial–channel synergistic attention mechanism are incorporated into the RGSGAN generator, and the Wasserstein distance with gradient penalty is integrated to produce high-quality samples and expand the dataset. On this basis, the MACRNet employs a multi-branch asymmetric convolutional residual module to perform multi-scale feature fusion, thereby substantially enhancing its ability to capture subtle textural and local structural variations in imperfect corn kernels. The experimental results demonstrate that the proposed method attains a classification accuracy of 98.813%, surpassing ResNet18, EfficientNet-v2, ConvNeXt-T, and ConvNeXt-v2 by 8.3%, 6.16%, 3.01%, and 4.09%, respectively, and outperforms the model trained on the original dataset by 5.29%. These results confirm the superior performance of the proposed approach under sample-size-limited conditions, effectively alleviating the adverse impact of data scarcity on the recognition accuracy of imperfect corn kernels. Full article
(This article belongs to the Section Food Analytical Methods)
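The abstract's "Wasserstein distance with gradient penalty" refers to the standard WGAN-GP training objective; the sketch below shows the usual gradient-penalty term for a generic PyTorch image critic. The interpolation scheme and the assumption of 4-D image tensors follow the common recipe, not the authors' released code.

```python
# Standard WGAN-GP gradient penalty: penalize the deviation of the critic's
# gradient norm from 1 on points interpolated between real and generated
# samples. Assumes 4-D (batch, C, H, W) image tensors.
import torch

def gradient_penalty(critic, real, fake, device="cpu"):
    eps = torch.rand(real.size(0), 1, 1, 1, device=device)
    mixed = (eps * real + (1.0 - eps) * fake).requires_grad_(True)
    scores = critic(mixed)
    grads = torch.autograd.grad(
        outputs=scores, inputs=mixed,
        grad_outputs=torch.ones_like(scores),
        create_graph=True, retain_graph=True)[0]
    grads = grads.view(grads.size(0), -1)
    return ((grads.norm(2, dim=1) - 1.0) ** 2).mean()

# Critic loss: critic(fake).mean() - critic(real).mean() + lambda_gp * gradient_penalty(...)
```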

28 pages, 4625 KB  
Article
Hybrid PCA-Based and Machine Learning Approaches for Signal-Based Interference Detection and Anomaly Classification Under Synthetic Data Conditions
by Sebastián Čikovský, Patrik Šváb and Peter Hanák
Sensors 2025, 25(24), 7585; https://doi.org/10.3390/s25247585 - 14 Dec 2025
Viewed by 308
Abstract
This article addresses anomaly detection in multichannel spatiotemporal data under strict low-false-alarm constraints (e.g., 1% False Positive Rate, FPR), a requirement essential for safety-critical applications such as signal interference monitoring in sensor networks. We introduce a lightweight, interpretable pipeline that deliberately avoids deep learning dependencies, implemented solely in NumPy and scikit-learn. The core innovation lies in fusing three complementary anomaly signals in an ensemble: (i) Principal Component Analysis (PCA) Reconstruction Error (MSE) to capture global structure deviations, (ii) Local Outlier Factor (LOF) on residual maps to detect local rarity, and (iii) Monte Carlo Variance as a measure of epistemic uncertainty in model predictions. These signals are combined via learned logistic regression (F*) and specialized Neyman–Pearson optimized fusion (F** and F***) to rigorously enforce bounded false alarms. Evaluated on synthetic benchmarks that simulate realistic anomalies and extensive SNR shifts (±12 dB), the fusion approach demonstrates exceptional robustness. While the best single baseline (MC-variance) achieves a True Positive Rate (TPR) of ≈0.60 at 1% FPR on the 0 dB hold-out, the fusion significantly raises this to ≈0.74 (F**), avoiding the performance collapse of baselines under degraded SNR (maintaining ≈ 0.62 TPR at −12 dB). This deployable solution provides a transparent, edge-ready anomaly detection capability that is highly effective at operating points critical for reliable monitoring in dynamic environments. Full article
(This article belongs to the Section Intelligent Sensors)
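Since the abstract states the pipeline uses only NumPy and scikit-learn, the sketch below assembles the three per-sample signals it names (PCA reconstruction error, LOF on the PCA residuals, and a variance proxy standing in for Monte Carlo variance) into a feature matrix for a learned fusion. The function name, hyperparameters, and the variance stand-in are placeholders, not the authors' pipeline.

```python
# Sketch of the three-signal anomaly-feature construction using NumPy and
# scikit-learn only. The Monte Carlo variance of the paper is replaced here by
# a simple residual-variance proxy for illustration.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import LocalOutlierFactor

def anomaly_features(X_train, X_test, n_components=10):
    """Return per-sample (reconstruction MSE, LOF score, variance proxy)."""
    pca = PCA(n_components=n_components).fit(X_train)
    train_resid = X_train - pca.inverse_transform(pca.transform(X_train))
    test_resid = X_test - pca.inverse_transform(pca.transform(X_test))
    mse = (test_resid ** 2).mean(axis=1)                    # (i) PCA reconstruction error
    lof = LocalOutlierFactor(n_neighbors=20, novelty=True).fit(train_resid)
    lof_score = -lof.score_samples(test_resid)              # (ii) local rarity on residual maps
    var_proxy = test_resid.var(axis=1)                      # (iii) stand-in for MC variance
    return np.column_stack([mse, lof_score, var_proxy])

# Learned fusion (F*): fit sklearn.linear_model.LogisticRegression on labeled
# calibration features, then threshold its scores at the value giving 1% FPR.
```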

18 pages, 3768 KB  
Article
DFGNet: A CropLand Change Detection Network Combining Deformable Convolution and Grouped Residual Self-Attention
by Xiangxi Feng and Xiaofang Liu
Appl. Sci. 2025, 15(24), 13133; https://doi.org/10.3390/app152413133 - 14 Dec 2025
Viewed by 126
Abstract
To address the challenges of limited multi-scale feature alignment, excessive feature redundancy, and blurred change boundaries in arable land change detection, this paper proposes an improved model based on the Feature Pyramid Network (FPN). Building upon FPN as the foundational framework, a deformable convolutional network is incorporated into the upsampling path to enhance geometric feature extraction for irregular change regions. Subsequently, the multi-scale feature maps generated by the FPN are processed by a Dynamic Low-Rank Fusion (DLRF) module, which integrates a Grouped Residual Self-Attention mechanism. This mechanism suppresses feature redundancy through low-rank decomposition and performs dynamic, adaptive, cross-scale feature fusion via attention weighting, ultimately producing a binary map of arable land changes. Experiments on public datasets demonstrate that the proposed method outperforms both the original FPN and other mainstream models in key metrics such as mIoU and F1-score, while generating clearer change maps. These results validate the effectiveness of incorporating deformable convolutions and the dynamic low-rank fusion strategy within the FPN framework, providing an effective approach that achieves an mIoU of 57.57% and a change detection F1-score of 72.42% for cultivated land identification. Full article

19 pages, 1320 KB  
Article
Enhanced Short-Term Load Forecasting Based on Adaptive Residual Fusion of Autoformer and Transformer
by Lukun Zeng, Kaihong Zheng, Guoying Lin, Jingxu Yang, Mingqi Wu, Guanyu Chen and Haoxia Jiang
Energies 2025, 18(24), 6496; https://doi.org/10.3390/en18246496 - 11 Dec 2025
Viewed by 204
Abstract
Accurate short-term electricity load forecasting (STELF) is essential for grid scheduling and low-carbon smart grids. However, load exhibits multi-timescale periodicity and non-stationary fluctuations, making STELF highly challenging for existing models. To address this challenge, an Autoformer–Transformer residual fusion network (ATRFN) is proposed in this paper. A dynamic weighting mechanism is applied to combine the outputs of Autoformer and Transformer through residual connections. In this way, lightweight result-level fusion is enabled without modifications to either architecture. In experimental validations on real-world load datasets, the proposed ATRFN model achieves notable performance gains over single STELF models. For univariate STELF, the ATRFN model reduces forecasting errors by 11.94% in mean squared error (MSE), 10.51% in mean absolute error (MAE), and 7.99% in mean absolute percentage error (MAPE) compared with the best single model. In multivariate experiments, it further decreases errors by at least 5.22% in MSE, 2.77% in MAE, and 2.85% in MAPE, demonstrating consistent improvements in predictive accuracy. Full article
(This article belongs to the Special Issue Application of Artificial Intelligence in Electrical Power Systems)
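One plausible reading of the result-level fusion described above is sketched below: a small gate produces per-step weights from the two base forecasts, and a residual head adds a correction on top of the blend. The gate and residual-head designs are assumptions; the abstract only states that the fusion is dynamic, result-level, and residual.

```python
# Sketch of adaptive residual fusion of two forecasts: a sigmoid gate yields
# per-step mixing weights, and a linear residual head refines the blend.
# Shapes and module choices are illustrative assumptions.
import torch
import torch.nn as nn

class AdaptiveResidualFusion(nn.Module):
    def __init__(self, horizon):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * horizon, horizon), nn.Sigmoid())
        self.residual_head = nn.Linear(2 * horizon, horizon)

    def forward(self, y_autoformer, y_transformer):
        # y_*: (batch, horizon) forecasts from the two unmodified base models
        stacked = torch.cat([y_autoformer, y_transformer], dim=-1)
        alpha = self.gate(stacked)                        # dynamic per-step weights in (0, 1)
        fused = alpha * y_autoformer + (1 - alpha) * y_transformer
        return fused + self.residual_head(stacked)        # residual correction on the blend
```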

26 pages, 7430 KB  
Article
PMSAF-Net: A Progressive Multi-Scale Asymmetric Fusion Network for Lightweight and Multi-Platform Thin Cloud Removal
by Li Wang and Feng Liang
Remote Sens. 2025, 17(24), 4001; https://doi.org/10.3390/rs17244001 - 11 Dec 2025
Viewed by 199
Abstract
With the rapid advancement of deep learning, significant progress has been made in cloud removal for remote sensing images (RSIs). However, the practical deployment of existing methods on multi-platform devices faces several limitations, including high computational complexity that prevents real-time processing, substantial hardware resource demands that are unsuitable for edge devices, and inadequate performance in complex cloud scenarios. To address these challenges, we propose PMSAF-Net, a lightweight Progressive Multi-Scale Asymmetric Fusion Network designed for efficient thin cloud removal. The proposed network employs a Dual-Branch Asymmetric Attention (DBAA) module to optimize spatial details and channel dependencies, reducing computation cost while improving feature extraction. A Multi-Scale Context Aggregation (MSCA) mechanism captures multi-level contextual information through hierarchical dilated convolutions, effectively handling clouds of varying scales and complexities. A Refined Residual Block (RRB) minimizes boundary artifacts through reflection padding and residual calibration. Additionally, an Iterative Feature Refinement (IFR) module progressively enhances feature representations via dense cross-stage connections. Extensive experiments on multi-platform datasets show that the proposed method achieves favorable performance against state-of-the-art algorithms. With only 0.32 M parameters, PMSAF-Net maintains low computational costs, demonstrating its strong potential for multi-platform deployment on resource-constrained edge devices. Full article
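A minimal sketch of a refined residual block built around reflection padding, in the spirit of the RRB described above, is given below; the kernel sizes and the channel-wise "residual calibration" gate are illustrative assumptions.

```python
# Sketch of a residual block that uses reflection padding instead of zero
# padding to reduce boundary artifacts, plus a simple channel-wise gate as a
# stand-in for residual calibration. Details are assumptions.
import torch
import torch.nn as nn

class RefinedResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.ReflectionPad2d(1),               # reflection padding at the borders
            nn.Conv2d(channels, channels, 3),
            nn.ReLU(inplace=True),
            nn.ReflectionPad2d(1),
            nn.Conv2d(channels, channels, 3),
        )
        self.calibrate = nn.Sequential(          # channel-wise residual calibration gate
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        res = self.body(x)
        return x + res * self.calibrate(res)
```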

28 pages, 11936 KB  
Article
AC-YOLOv11: A Deep Learning Framework for Automatic Detection of Ancient City Sites in the Northeastern Tibetan Plateau
by Xuan Shi and Guangliang Hou
Remote Sens. 2025, 17(24), 3997; https://doi.org/10.3390/rs17243997 - 11 Dec 2025
Viewed by 437
Abstract
Ancient walled cities represent key material evidence for early state formation and human–environment interaction on the northeastern Tibetan Plateau. However, traditional field surveys are often constrained by the vastness and complexity of the plateau environment. This study proposes an improved deep learning framework, AC-YOLOv11, to achieve automated detection of ancient city remains in the Qinghai Lake Basin using 0.8 m GF-2 satellite imagery. By integrating a dual-path attention residual network (AC-SENet) with multi-scale feature fusion, the model enhances sensitivity to faint geomorphic and structural features under conditions of erosion, vegetation cover, and modern disturbance. Training on the newly constructed Qinghai Lake Ancient City Dataset (QHACD) yielded a mean average precision (mAP@0.5) of 82.3% and F1-score of 94.2%. Model application across 7000 km² identified 309 potential sites, of which 74 were verified as highly probable ancient cities, and field investigations confirmed 3 new sites with typical rammed-earth characteristics. Spatial analysis combining digital elevation models and hydrological data shows that 75.7% of all ancient cities are located within 10 km of major rivers or the lake shoreline, primarily between 3500 and 4000 m a.s.l. These results reveal a clear coupling between settlement distribution and environmental constraints in the high-altitude arid zone. The AC-YOLOv11 model demonstrates strong potential for large-scale archaeological prospection and offers a methodological reference for automated heritage mapping on the Qinghai–Tibet Plateau. Full article

29 pages, 4957 KB  
Article
Wind Power Prediction Method Based on Physics-Guided Fusion and Distribution Constraints
by Wenbin Zheng, Jiaojiao Yin, Zhiwei Wang, Huijie Sun and Letian Bai
Energies 2025, 18(24), 6479; https://doi.org/10.3390/en18246479 - 10 Dec 2025
Viewed by 391
Abstract
Accurate wind power prediction is of great significance for grid stability and renewable energy integration. Addressing the challenge of effectively integrating physical mechanisms with data-driven methods in wind power prediction, this paper proposes a two-stage deep learning prediction framework incorporating physics-guided fusion and distribution constraints, aiming to improve the prediction accuracy and physical plausibility of forecasts for individual wind turbines. In the first stage, we construct a baseline model based on multi-branch multilayer perceptrons (MLP) that eschews traditional attempts to accurately reconstruct complex three-dimensional spatiotemporal wind fields, instead directly learning the power conversion characteristics of wind turbines under specific meteorological conditions from historical operational data, namely the power coefficient (Cp). This data-driven Cp fitting method provides a physically interpretable and robust benchmark for power prediction. In the second stage, targeting the prediction residuals from the baseline model, we design a bidirectional long short-term memory network (BiLSTM) for refined correction. The core innovation of this stage lies in introducing Maximum Mean Discrepancy (MMD) as a regularization term to constrain the predicted wind speed–power joint probability distribution. This constraint keeps the model-generated power predictions statistically consistent with the distribution of historical data, preventing the model from producing predictions that deviate from physical reality and significantly enhancing its generalization capability and reliability. Experimental results demonstrate that compared to traditional methods, the proposed method achieves significant improvements in Mean Absolute Error, Root Mean Square Error, and other metrics, validating the effectiveness of physical constraints in improving prediction accuracy. Full article
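For reference, a Gaussian-kernel MMD term of the kind used here as a distribution constraint can be written as in the sketch below; the kernel bandwidth, the biased estimator, and how the term is weighted against the forecasting loss are assumptions, since the abstract does not specify them.

```python
# Biased Gaussian-kernel MMD between predicted and historical (wind speed,
# power) pairs, usable as an extra regularization term in the training loss.
import torch

def gaussian_mmd(x, y, bandwidth=1.0):
    """x: (n, d) and y: (m, d) samples, e.g. d=2 for (wind speed, power)."""
    def kernel(a, b):
        d2 = torch.cdist(a, b) ** 2
        return torch.exp(-d2 / (2.0 * bandwidth ** 2))
    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()

# total_loss = forecast_loss + lambda_mmd * gaussian_mmd(pred_pairs, hist_pairs)
```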

23 pages, 7184 KB  
Article
RAFF-AMACNet: Adaptive Multi-Rate Atrous Convolution Network with Residual Attentional Feature Fusion for Satellite Signal Recognition
by Leyan Chen, Bo Zang, Yi Zhang, Lin Li, Haitao Wei, Xudong Liu and Meng Wu
Sensors 2025, 25(24), 7514; https://doi.org/10.3390/s25247514 - 10 Dec 2025
Viewed by 264
Abstract
With the launch of an increasing number of satellites to establish complex satellite communication networks, automatic modulation recognition (AMR) plays a crucial role in satellite signal recognition and spectrum management. However, most existing AMR models struggle to handle signals in such complex satellite communication environments. Therefore, this paper proposes an adaptive multi-rate atrous convolution network with residual attentional feature fusion (RAFF-AMACNet) that employs the adaptive multi-rate atrous convolution (AMAC) module to adaptively extract and dynamically fuse salient multi-scale features, enhancing the model’s time-series context awareness and generating robust feature maps. On this basis, the pyramid backbone consists of multiple stacked residual attentional feature fusion (RAFF) modules, featuring a dual-attention collaborative mechanism designed to mitigate feature map shifts and increase the separation between feature clusters of different classes under significant Doppler effects and nonlinear influences. On our independently constructed RML24 dataset, a general-purpose dataset tailored for satellite cognitive radio systems, simulation results indicate that at a signal-to-noise ratio of 0 dB, the modulation recognition accuracy reaches 92.99%. Full article
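The sketch below illustrates one way to realize a multi-rate atrous convolution block with input-dependent branch weights for I/Q sequences, in the spirit of the AMAC module; the dilation rates, channel counts, and gating design are assumptions for illustration.

```python
# Sketch of a multi-rate atrous (dilated) 1-D convolution block whose branch
# outputs are mixed by weights predicted from the global input context.
import torch
import torch.nn as nn

class AdaptiveMultiRateAtrousConv(nn.Module):
    def __init__(self, in_ch, out_ch, rates=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv1d(in_ch, out_ch, kernel_size=3, padding=r, dilation=r)
            for r in rates
        ])
        self.gate = nn.Sequential(                 # per-branch weights from global context
            nn.AdaptiveAvgPool1d(1),
            nn.Flatten(),
            nn.Linear(in_ch, len(rates)),
            nn.Softmax(dim=-1),
        )

    def forward(self, x):
        # x: (batch, in_ch, T), e.g. in_ch = 2 for I/Q samples
        weights = self.gate(x)                     # (batch, num_rates)
        outs = torch.stack([b(x) for b in self.branches], dim=1)  # (batch, rates, out_ch, T)
        return (weights[:, :, None, None] * outs).sum(dim=1)
```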

28 pages, 5016 KB  
Article
A Lightweight Improved YOLOv8-Based Method for Rebar Intersection Detection
by Rui Wang, Fangjun Shi, Yini She, Li Zhang, Kaifeng Lin, Longshun Fu and Jingkun Shi
Appl. Sci. 2025, 15(24), 12898; https://doi.org/10.3390/app152412898 - 7 Dec 2025
Viewed by 286
Abstract
As industrialized construction and smart building continue to advance, rebar-tying robots place higher demands on the real-time and accurate recognition of rebar intersections and their tying status. Existing deep learning-based detection methods generally rely on heavy backbone networks and complex feature-fusion structures, making it difficult to deploy them efficiently on resource-constrained mobile robots and edge devices, and there is also a lack of dedicated datasets for rebar intersections. In this study, 12,000 rebar mesh images were collected and annotated from two indoor scenes and one outdoor scene to construct a rebar-intersection dataset that supports both object detection and instance segmentation, enabling simultaneous learning of intersection locations and tying status. On this basis, a lightweight improved YOLOv8-based method for rebar intersection detection and segmentation is proposed. The original backbone is replaced with ShuffleNetV2, and a C2f_Dual residual module is introduced in the neck; the same improvements are further transferred to YOLOv8-seg to form a unified lightweight detection–segmentation framework for joint prediction of intersection locations and tying status. Experimental results show that, compared with the original YOLOv8L and several mainstream detectors, the proposed model achieves comparable or superior performance in terms of mAP@50, precision and recall, while reducing model size and computational cost by 51.2% and 58.1%, respectively, and significantly improving inference speed. The improved YOLOv8-seg also achieves satisfactory contour alignment and regional consistency for rebar regions and intersection masks. Owing to its combination of high accuracy and low resource consumption, the proposed method is well suited for deployment on edge-computing devices used in rebar-tying robots and construction quality inspection, providing an effective visual perception solution for intelligent construction. Full article
(This article belongs to the Special Issue Advances in Smart Construction and Intelligent Buildings)

23 pages, 9482 KB  
Article
A Hybrid End-to-End Dual Path Convolutional Residual LSTM Model for Battery SOH Estimation
by Azadeh Gholaminejad, Arta Mohammad-Alikhani and Babak Nahid-Mobarakeh
Batteries 2025, 11(12), 449; https://doi.org/10.3390/batteries11120449 - 6 Dec 2025
Viewed by 353
Abstract
Accurate estimation of battery state of health is essential for ensuring safety, supporting fault diagnosis, and optimizing the lifetime of electric vehicles. This study proposes a compact dual-path architecture that combines Convolutional Neural Networks with Convolutional Long Short-Term Memory (ConvLSTM) units to jointly extract spatial and temporal degradation features from charge-cycle voltage and current measurements. Residual and inter-path connections enhance gradient flow and feature fusion, while a three-channel preprocessing strategy aligns cycle lengths and isolates padded regions, improving learning stability. Operating end-to-end, the model eliminates the need for handcrafted features and does not rely on discharge data or temperature measurements, enabling practical deployment in minimally instrumented environments. The model is evaluated on the NASA battery aging dataset under two scenarios: Same-Battery Evaluation and Leave-One-Battery-Out Cross-Battery Generalization. It achieves average RMSE values of 1.26% and 2.14%, converging within 816 and 395 epochs, respectively. An ablation study demonstrates that the dual-path design, ConvLSTM units, residual shortcuts, inter-path exchange, and preprocessing pipeline each contribute to accuracy, stability, and reduced training cost. With only 4913 parameters, the architecture remains robust to variations in initial capacity, cutoff voltage, and degradation behavior. Edge deployment on an NVIDIA Jetson AGX Orin confirms real-time feasibility, achieving 2.24 ms latency, 8.24 MB memory usage, and 12.9 W active power, supporting use in resource-constrained battery management systems. Full article

17 pages, 3220 KB  
Article
ArecaNet: Robust Facial Emotion Recognition via Assembled Residual Enhanced Cross-Attention Networks for Emotion-Aware Human–Computer Interaction
by Jaemyung Kim and Gyuho Choi
Sensors 2025, 25(23), 7375; https://doi.org/10.3390/s25237375 - 4 Dec 2025
Viewed by 376
Abstract
Recently, the convergence of advanced sensor technologies and innovations in artificial intelligence and robotics has highlighted facial emotion recognition (FER) as an essential component of human–computer interaction (HCI). Traditional FER studies based on handcrafted features and shallow machine learning have shown a limited performance, while convolutional neural networks (CNNs) have improved nonlinear emotion pattern analysis but have been constrained by local feature extraction. Vision transformers (ViTs) have addressed this by leveraging global correlations, yet both CNN- and ViT-based single networks often suffer from overfitting, single-network dependency, and information loss in ensemble operations. To overcome these limitations, we propose ArecaNet, an assembled residual enhanced cross-attention network that integrates multiple feature streams without information loss. The framework comprises (i) channel and spatial feature extraction via SCSESResNet, (ii) landmark feature extraction from specialized sub-networks, (iii) iterative fusion through residual enhanced cross-attention, and (iv) final emotion classification from the fused representation. Our research introduces a novel approach by integrating pre-trained sub-networks specialized in facial recognition with an attention mechanism and our uniquely designed main network, which is optimized for size reduction and efficient feature extraction. The extracted features are fused through an iterative residual enhanced cross-attention mechanism, which minimizes information loss and preserves complementary representations across networks. This strategy overcomes the limitations of conventional ensemble methods, enabling seamless feature integration and robust recognition. The experimental results show that the proposed ArecaNet achieved accuracies of 97.0% and 97.8% on the public databases FER-2013 and RAF-DB, exceeding the existing state-of-the-art method, PAtt-Lite, by 4.5% on FER-2013 and 2.75% on RAF-DB, and achieving a new state-of-the-art accuracy on each database. Full article
(This article belongs to the Special Issue Sensor-Based Behavioral Biometrics)
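A single residual enhanced cross-attention step, with main-network tokens as queries and a sub-network (e.g., landmark) stream as keys and values, might look like the sketch below; the dimensions and single-step form are assumptions, whereas the paper applies the fusion iteratively across several specialized streams.

```python
# Sketch of one residual cross-attention fusion step: standard multi-head
# cross-attention added back to the main stream residually, then normalized.
import torch
import torch.nn as nn

class ResidualCrossAttentionFusion(nn.Module):
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, main_tokens, aux_tokens):
        # main_tokens: (batch, N, dim) queries; aux_tokens: (batch, M, dim) keys/values
        fused, _ = self.attn(main_tokens, aux_tokens, aux_tokens)
        return self.norm(main_tokens + fused)    # residual enhancement keeps the original stream
```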