Search Results (2,893)

Search Parameters:
Keywords = multi-fusion information

17 pages, 4338 KiB  
Article
Lightweight Attention-Based CNN Architecture for CSI Feedback of RIS-Assisted MISO Systems
by Anming Dong, Yupeng Xue, Sufang Li, Wendong Xu and Jiguo Yu
Mathematics 2025, 13(15), 2371; https://doi.org/10.3390/math13152371 - 24 Jul 2025
Abstract
Reconfigurable Intelligent Surface (RIS) has emerged as a promising enabling technology for wireless communications, significantly enhancing system performance through real-time manipulation of electromagnetic wave reflection characteristics. In RIS-assisted communication systems, existing deep learning-based channel state information (CSI) feedback methods often suffer from excessive parameter requirements and high computational complexity. To address this challenge, this paper proposes LwCSI-Net, a lightweight autoencoder network specifically designed for RIS-assisted multiple-input single-output (MISO) systems, aiming to achieve efficient and low-complexity CSI feedback. The core contribution of this work lies in an innovative lightweight feedback architecture that deeply integrates multi-layer convolutional neural networks (CNNs) with attention mechanisms. Specifically, the network employs 1D convolutional operations with unidirectional kernel sliding, which effectively reduces trainable parameters while maintaining robust feature-extraction capabilities. Furthermore, by incorporating an efficient channel attention (ECA) mechanism, the model dynamically allocates weights to different feature channels, thereby enhancing the capture of critical features. This approach not only improves network representational efficiency but also reduces redundant computations, leading to optimized computational complexity. Additionally, the proposed cross-channel residual block (CRBlock) establishes inter-channel information-exchange paths, strengthening feature fusion and ensuring outstanding stability and robustness under high compression ratio (CR) conditions. Our experimental results show that for CRs of 16, 32, and 64, LwCSI-Net significantly improves CSI reconstruction performance while maintaining fewer parameters and lower computational complexity, achieving an average complexity reduction of 35.63% compared to state-of-the-art (SOTA) CSI feedback autoencoder architectures. Full article
(This article belongs to the Special Issue Data-Driven Decentralized Learning for Future Communication Networks)
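
The abstract describes ECA only at a high level; as a hedged illustration, a minimal PyTorch sketch of an efficient-channel-attention block of this kind (kernel size, tensor shapes, and the 1D layout are assumptions, not the paper's exact design):

```python
import torch
import torch.nn as nn

class ECABlock(nn.Module):
    """Efficient channel attention: a small 1D convolution over pooled
    channel descriptors produces per-channel weights (sketch only)."""
    def __init__(self, kernel_size: int = 3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, length) for 1D feature maps
        y = x.mean(dim=-1, keepdim=True)       # global average pool -> (B, C, 1)
        y = self.conv(y.transpose(1, 2))       # 1D conv across channels -> (B, 1, C)
        w = torch.sigmoid(y.transpose(1, 2))   # per-channel weights in (0, 1)
        return x * w                           # reweight channels

x = torch.randn(2, 64, 128)
print(ECABlock()(x).shape)  # torch.Size([2, 64, 128])
```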

17 pages, 2072 KiB  
Article
Barefoot Footprint Detection Algorithm Based on YOLOv8-StarNet
by Yujie Shen, Xuemei Jiang, Yabin Zhao and Wenxin Xie
Sensors 2025, 25(15), 4578; https://doi.org/10.3390/s25154578 - 24 Jul 2025
Abstract
This study proposes an optimized footprint recognition model based on an enhanced StarNet architecture for biometric identification in the security, medical, and criminal investigation fields. Conventional image recognition algorithms exhibit limitations in processing barefoot footprint images characterized by concentrated feature distributions and rich texture patterns. To address this, our framework integrates an improved StarNet into the backbone of the YOLOv8 architecture. Leveraging the unique advantages of element-wise multiplication, the redesigned backbone efficiently maps inputs to a high-dimensional nonlinear feature space without increasing channel dimensions, achieving enhanced representational capacity with low computational latency. Subsequently, an Encoder layer facilitates feature interaction within the backbone through multi-scale feature fusion and attention mechanisms, effectively extracting rich semantic information while maintaining computational efficiency. In the feature fusion stage, a feature modulation block processes multi-scale features by synergistically combining global and local information, thereby reducing redundant computations and decreasing both parameter count and computational complexity to achieve model lightweighting. Experimental evaluations on a proprietary barefoot footprint dataset demonstrate that the proposed model exhibits significant advantages in terms of parameter efficiency, recognition accuracy, and computational complexity. The parameter count has been reduced by 0.73 million, further improving the model's speed. GFLOPs have been reduced by 1.5, lowering the hardware requirements for model deployment. Recognition accuracy has reached 99.5%, with further improvements in model precision. Future research will explore how to capture shoeprint images with complex backgrounds from shoes worn at crime scenes, aiming to further enhance the model's recognition capabilities in more forensic scenarios. Full article
(This article belongs to the Special Issue Transformer Applications in Target Tracking)
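
StarNet's element-wise multiplication is the key ingredient named above; a minimal PyTorch sketch of a "star" block under assumed layer sizes (the paper's exact backbone block is not reproduced here):

```python
import torch
import torch.nn as nn

class StarBlock(nn.Module):
    """Sketch of the 'star operation': two parallel 1x1 projections
    combined by element-wise multiplication, implicitly lifting features
    to a high-dimensional nonlinear space without widening channels."""
    def __init__(self, channels: int):
        super().__init__()
        self.dw = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)
        self.f1 = nn.Conv2d(channels, channels, 1)
        self.f2 = nn.Conv2d(channels, channels, 1)
        self.act = nn.ReLU6()

    def forward(self, x):
        y = self.dw(x)
        return x + self.act(self.f1(y)) * self.f2(y)  # element-wise product

print(StarBlock(32)(torch.randn(1, 32, 56, 56)).shape)  # (1, 32, 56, 56)
```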

22 pages, 4611 KiB  
Article
MMC-YOLO: A Lightweight Model for Real-Time Detection of Geometric Symmetry-Breaking Defects in Wind Turbine Blades
by Caiye Liu, Chao Zhang, Xinyu Ge, Xunmeng An and Nan Xue
Symmetry 2025, 17(8), 1183; https://doi.org/10.3390/sym17081183 - 24 Jul 2025
Abstract
Performance degradation of wind turbine blades often stems from geometric asymmetry induced by damage. Existing methods for assessing damage face challenges in balancing accuracy and efficiency due to their limited ability to capture fine-grained geometric asymmetries associated with multi-scale damage under complex background interference. To address this, based on the high-speed detection model YOLOv10-N, this paper proposes a novel detection model named MMC-YOLO. First, the Multi-Scale Perception Gated Convolution (MSGConv) Module was designed, which constructs a full-scale receptive field through multi-branch fusion and channel rearrangement to enhance the extraction of geometric asymmetry features. Second, the Multi-Scale Enhanced Feature Pyramid Network (MSEFPN) was developed, integrating dynamic path aggregation and an SENetv2 attention mechanism to suppress background interference and amplify damage response. Finally, the Channel-Compensated Filtering (CCF) module was constructed to preserve critical channel information using a dynamic buffering mechanism. Evaluated on a dataset of 4818 wind turbine blade damage images, MMC-YOLO achieves an 82.4% mAP [0.5:0.95], representing a 4.4% improvement over the baseline YOLOv10-N model, and a 91.1% recall rate, an 8.7% increase, while maintaining a lightweight parameter count of 4.2 million. This framework significantly enhances geometric asymmetry defect detection accuracy while ensuring real-time performance, meeting engineering requirements for high efficiency and precision. Full article
(This article belongs to the Special Issue Symmetry and Its Applications in Image Processing)
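
The MSGConv module is described only in outline; a hypothetical PyTorch sketch of a multi-branch gated convolution with channel rearrangement, with all widths, kernel sizes, and the gating arrangement assumed:

```python
import torch
import torch.nn as nn

class MSGConv(nn.Module):
    """Hypothetical multi-scale gated convolution: parallel branches with
    different receptive fields, a channel shuffle to mix the branch
    groups, and a sigmoid gate (sizes are illustrative assumptions)."""
    def __init__(self, c: int):
        super().__init__()
        self.b3 = nn.Conv2d(c, c // 2, 3, padding=1)
        self.b5 = nn.Conv2d(c, c // 2, 5, padding=2)
        self.gate = nn.Conv2d(c, c, 1)

    def forward(self, x):
        y = torch.cat([self.b3(x), self.b5(x)], dim=1)  # multi-branch fusion
        b, c, h, w = y.shape
        # channel rearrangement (shuffle across the two branch groups)
        y = y.view(b, 2, c // 2, h, w).transpose(1, 2).reshape(b, c, h, w)
        return y * torch.sigmoid(self.gate(x))          # gated output

print(MSGConv(32)(torch.randn(1, 32, 40, 40)).shape)  # (1, 32, 40, 40)
```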

35 pages, 1231 KiB  
Review
Toward Intelligent Underwater Acoustic Systems: Systematic Insights into Channel Estimation and Modulation Methods
by Imran A. Tasadduq and Muhammad Rashid
Electronics 2025, 14(15), 2953; https://doi.org/10.3390/electronics14152953 - 24 Jul 2025
Abstract
Underwater acoustic (UWA) communication supports many critical applications but still faces several physical-layer signal processing challenges. In response, recent advances in machine learning (ML) and deep learning (DL) offer promising solutions to improve signal detection, modulation adaptability, and classification accuracy. These developments highlight the need for a systematic evaluation to compare various ML/DL models and assess their performance across diverse underwater conditions. However, most existing reviews on ML/DL-based UWA communication focus on isolated approaches rather than integrated system-level perspectives, which limits cross-domain insights and reduces their relevance to practical underwater deployments. Consequently, this systematic literature review (SLR) synthesizes 43 studies (2020–2025) on ML and DL approaches for UWA communication, covering channel estimation, adaptive modulation, and modulation recognition across both single- and multi-carrier systems. The findings reveal that models such as convolutional neural networks (CNNs), long short-term memory networks (LSTMs), and generative adversarial networks (GANs) enhance channel estimation performance, achieving error reductions and bit error rate (BER) gains in the range of 10⁻³ to 10⁻⁶. Adaptive modulation techniques incorporating support vector machines (SVMs), CNNs, and reinforcement learning (RL) attain classification accuracies exceeding 98% and throughput improvements of up to 25%. For modulation recognition, architectures like sequence CNNs, residual networks, and hybrid convolutional–recurrent models achieve up to 99.38% accuracy with latency below 10 ms. These performance metrics underscore the viability of ML/DL-based solutions in optimizing physical-layer tasks for real-world UWA deployments. Finally, the SLR identifies key challenges in UWA communication, including high complexity, limited data, fragmented performance metrics, deployment realities, energy constraints, and poor scalability. It also outlines future directions like lightweight models, physics-informed learning, advanced RL strategies, intelligent resource allocation, and robust feature fusion to build reliable and intelligent underwater systems. Full article
(This article belongs to the Section Artificial Intelligence)

22 pages, 2952 KiB  
Article
Raw-Data Driven Functional Data Analysis with Multi-Adaptive Functional Neural Networks for Ergonomic Risk Classification Using Facial and Bio-Signal Time-Series Data
by Suyeon Kim, Afrooz Shakeri, Seyed Shayan Darabi, Eunsik Kim and Kyongwon Kim
Sensors 2025, 25(15), 4566; https://doi.org/10.3390/s25154566 - 23 Jul 2025
Abstract
Ergonomic risk classification during manual lifting tasks is crucial for the prevention of workplace injuries. This study addresses the challenge of classifying lifting task risk levels (low, medium, and high risk, labeled as 0, 1, and 2) using multi-modal time-series data comprising raw facial landmarks and bio-signals (electrocardiography [ECG] and electrodermal activity [EDA]). Classifying such data presents inherent challenges due to multi-source information, temporal dynamics, and class imbalance. To overcome these challenges, this paper proposes a Multi-Adaptive Functional Neural Network (Multi-AdaFNN), a novel method that integrates functional data analysis with deep learning techniques. The proposed model introduces a novel adaptive basis layer composed of micro-networks tailored to each individual time-series feature, enabling end-to-end learning of discriminative temporal patterns directly from raw data. The Multi-AdaFNN approach was evaluated across five distinct dataset configurations: (1) facial landmarks only, (2) bio-signals only, (3) full fusion of all available features, (4) a reduced-dimensionality set of 12 selected facial landmark trajectories, and (5) the same reduced set combined with bio-signals. Performance was rigorously assessed using 100 independent stratified splits (70% training and 30% testing) and optimized via a weighted cross-entropy loss function to manage class imbalance effectively. The results demonstrated that the integrated approach, fusing facial landmarks and bio-signals, achieved the highest classification accuracy and robustness. Furthermore, the adaptive basis functions revealed specific phases within lifting tasks critical for risk prediction. These findings underscore the efficacy and transparency of the Multi-AdaFNN framework for multi-modal ergonomic risk assessment, highlighting its potential for real-time monitoring and proactive injury prevention in industrial environments. Full article
(This article belongs to the Special Issue (Bio)sensors for Physiological Monitoring)
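
The adaptive basis layer is the paper's central device; a minimal sketch of the general idea, assuming a micro-network that learns basis functions over the (normalized) time axis and scores each series by discretized inner products:

```python
import torch
import torch.nn as nn

class AdaptiveBasisLayer(nn.Module):
    """Sketch of an adaptive functional basis: a micro-network maps time
    points t in [0, 1] to basis values; each series is scored by the
    (Riemann-sum) inner product with the learned bases."""
    def __init__(self, n_bases: int = 4, hidden: int = 16):
        super().__init__()
        self.micro = nn.Sequential(
            nn.Linear(1, hidden), nn.Tanh(), nn.Linear(hidden, n_bases)
        )

    def forward(self, x):                                    # x: (batch, timesteps)
        t = torch.linspace(0, 1, x.shape[1]).unsqueeze(-1)   # (T, 1) time grid
        basis = self.micro(t)                                # (T, n_bases) learned bases
        return x @ basis / x.shape[1]                        # (batch, n_bases) scores

scores = AdaptiveBasisLayer()(torch.randn(8, 120))
print(scores.shape)  # torch.Size([8, 4])
```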

27 pages, 8957 KiB  
Article
DFAN: Single Image Super-Resolution Using Stationary Wavelet-Based Dual Frequency Adaptation Network
by Gyu-Il Kim and Jaesung Lee
Symmetry 2025, 17(8), 1175; https://doi.org/10.3390/sym17081175 - 23 Jul 2025
Abstract
Single image super-resolution is the inverse problem of reconstructing a high-resolution image from its low-resolution counterpart. Although recent Transformer-based architectures leverage global context integration to improve reconstruction quality, they often overlook frequency-specific characteristics, resulting in the loss of high-frequency information. To address this limitation, we propose the Dual Frequency Adaptive Network (DFAN). DFAN first decomposes the input into low- and high-frequency components via Stationary Wavelet Transform. In the low-frequency branch, Swin Transformer layers restore global structures and color consistency. In contrast, the high-frequency branch features a dedicated module that combines Directional Convolution with Residual Dense Blocks, precisely reinforcing edges and textures. A frequency fusion module then adaptively merges these complementary features using depthwise and pointwise convolutions, achieving a balanced reconstruction. During training, we introduce a frequency-aware multi-term loss alongside the standard pixel-wise loss to explicitly encourage high-frequency preservation. Extensive experiments on the Set5, Set14, BSD100, Urban100, and Manga109 benchmarks show that DFAN achieves gains of up to +0.64 dB peak signal-to-noise ratio (PSNR), +0.01 structural similarity index measure (SSIM), and −0.01 learned perceptual image patch similarity (LPIPS) over the strongest frequency-domain baselines, while also delivering visibly sharper textures and cleaner edges. By unifying spatial and frequency-domain advantages, DFAN effectively mitigates high-frequency degradation and enhances SISR performance. Full article
(This article belongs to the Section Computer)
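
The Stationary Wavelet Transform front end can be reproduced with PyWavelets; a short sketch, assuming a Haar wavelet and one decomposition level (the paper's choices are not stated in the abstract):

```python
import numpy as np
import pywt

# Stationary (undecimated) wavelet decomposition of an image into
# low- and high-frequency parts, as in DFAN's front end.
img = np.random.rand(64, 64).astype(np.float32)
(ll, (lh, hl, hh)), = pywt.swt2(img, wavelet="haar", level=1)

print(ll.shape, lh.shape)  # (64, 64) (64, 64): SWT keeps full resolution
# ll would feed the Swin-based low-frequency branch; lh/hl/hh feed the
# directional high-frequency branch.
```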

22 pages, 2420 KiB  
Article
BiEHFFNet: A Water Body Detection Network for SAR Images Based on Bi-Encoder and Hybrid Feature Fusion
by Bin Han, Xin Huang and Feng Xue
Mathematics 2025, 13(15), 2347; https://doi.org/10.3390/math13152347 - 23 Jul 2025
Abstract
Water body detection in synthetic aperture radar (SAR) imagery plays a critical role in applications such as disaster response, water resource management, and environmental monitoring. However, it remains challenging due to complex background interference in SAR images. To address this issue, a bi-encoder and hybrid feature fusion network (BiEHFFNet) is proposed for achieving accurate water body detection. First, a bi-encoder structure based on ResNet and Swin Transformer is used to jointly extract local spatial details and global contextual information, enhancing feature representation in complex scenarios. Additionally, the convolutional block attention module (CBAM) is employed to suppress irrelevant information in the output features of each ResNet stage. Second, a cross-attention-based hybrid feature fusion (CABHFF) module is designed to interactively integrate local and global features through cross-attention, followed by channel attention to achieve effective hybrid feature fusion, thus improving the model's ability to capture water structures. Third, a multi-scale content-aware upsampling (MSCAU) module is designed by integrating atrous spatial pyramid pooling (ASPP) with the Content-Aware ReAssembly of FEatures (CARAFE), aiming to enhance multi-scale contextual learning while alleviating feature distortion caused by upsampling. Finally, a composite loss function combining Dice loss and Active Contour loss is used to provide stronger boundary supervision. Experiments conducted on the ALOS PALSAR dataset demonstrate that the proposed BiEHFFNet outperforms existing methods across multiple evaluation metrics, achieving more accurate water body detection. Full article
(This article belongs to the Special Issue Advanced Mathematical Methods in Remote Sensing)
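
The composite boundary loss combines Dice with an Active Contour term; a minimal PyTorch sketch using a common length-term formulation (the weighting and the exact contour energy are assumptions):

```python
import torch

def dice_loss(pred, target, eps: float = 1e-6):
    """Soft Dice loss for binary water masks (pred: sigmoid probabilities)."""
    inter = (pred * target).sum()
    return 1 - (2 * inter + eps) / (pred.sum() + target.sum() + eps)

def active_contour_loss(pred):
    """Length term of an active-contour loss: penalizes ragged boundaries
    via spatial gradients of the predicted mask (a common formulation)."""
    dy = pred[:, :, 1:, :] - pred[:, :, :-1, :]
    dx = pred[:, :, :, 1:] - pred[:, :, :, :-1]
    return torch.sqrt(dy.abs().mean() + dx.abs().mean() + 1e-6)

pred = torch.sigmoid(torch.randn(2, 1, 64, 64))
target = (torch.rand(2, 1, 64, 64) > 0.5).float()
loss = dice_loss(pred, target) + 0.5 * active_contour_loss(pred)  # weight assumed
print(loss.item())
```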

22 pages, 2919 KiB  
Article
A Feasible Domain Segmentation Algorithm for Unmanned Vessels Based on Coordinate-Aware Multi-Scale Features
by Zhengxun Zhou, Weixian Li, Yuhan Wang, Haozheng Liu and Ning Wu
J. Mar. Sci. Eng. 2025, 13(8), 1387; https://doi.org/10.3390/jmse13081387 - 22 Jul 2025
Abstract
The accurate extraction of navigational regions from images of navigational waters plays a key role in ensuring on-water safety and the automation of unmanned vessels. However, current methods face significant challenges from fluctuations in water surface illumination, reflective disturbances, and surface undulations, among other disruptions, which make rapid and precise boundary segmentation difficult. To cope with these challenges, in this paper we propose a coordinate-aware multi-scale feature network (GASF-ResNet) for water segmentation. The method integrates the Global Grouping Coordinate Attention (GGCA) module in the four downsampling branches of ResNet-50, enhancing the model's ability to capture target features and improving the feature representation. To expand the model's receptive field and boost its capability to extract features of multi-scale targets, the Atrous Spatial Pyramid Pooling (ASPP) technique is used. Combined with multi-scale feature fusion, this effectively enhances the expression of semantic information at different scales and improves the segmentation accuracy of the model in complex water environments. The experimental results show that the mean pixel accuracy (mPA) and mean intersection over union (mIoU) of the proposed method are 99.31% and 98.61% on the self-made dataset, and 98.55% and 99.27% on the USVInland unmanned ship dataset, respectively, significantly better than the results obtained by existing mainstream models. These results help overcome the background interference caused by water surface reflection and uneven lighting and enable accurate segmentation of the water area for the safe navigation of unmanned vessels, which is of great value for the stable operation of unmanned vessels in complex environments. Full article
(This article belongs to the Section Ocean Engineering)
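
For reference, the mPA and mIoU figures reported above can be computed from a per-pixel confusion matrix; a small NumPy sketch with a toy two-class (water / non-water) matrix:

```python
import numpy as np

def mpa_miou(conf: np.ndarray):
    """Mean pixel accuracy and mean IoU from a confusion matrix
    (rows: ground truth, cols: prediction)."""
    tp = np.diag(conf).astype(float)
    pa = tp / conf.sum(axis=1).clip(min=1)                          # per-class accuracy
    iou = tp / (conf.sum(axis=1) + conf.sum(axis=0) - tp).clip(min=1)
    return pa.mean(), iou.mean()

# Toy confusion matrix, not the paper's data.
conf = np.array([[9500, 120],
                 [80, 10300]])
print(mpa_miou(conf))
```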

37 pages, 55522 KiB  
Article
EPCNet: Implementing an ‘Artificial Fovea’ for More Efficient Monitoring Using the Sensor Fusion of an Event-Based and a Frame-Based Camera
by Orla Sealy Phelan, Dara Molloy, Roshan George, Edward Jones, Martin Glavin and Brian Deegan
Sensors 2025, 25(15), 4540; https://doi.org/10.3390/s25154540 - 22 Jul 2025
Abstract
Efficient object detection is crucial to real-time monitoring applications such as autonomous driving or security systems. Modern RGB cameras can produce high-resolution images for accurate object detection. However, increased resolution results in increased network latency and power consumption. To minimise this latency, Convolutional Neural Networks (CNNs) often have a resolution limitation, requiring images to be down-sampled before inference, causing significant information loss. Event-based cameras are neuromorphic vision sensors with high temporal resolution, low power consumption, and high dynamic range, making them preferable to regular RGB cameras in many situations. This project proposes the fusion of an event-based camera with an RGB camera to mitigate the trade-off between temporal resolution and accuracy, while minimising power consumption. The cameras are calibrated to create a multi-modal stereo vision system where pixel coordinates can be projected between the event and RGB camera image planes. This calibration is used to project bounding boxes detected by clustering of events into the RGB image plane, thereby cropping each RGB frame instead of down-sampling to meet the requirements of the CNN. Using the Common Objects in Context (COCO) dataset evaluator, the average precision (AP) for the bicycle class in RGB scenes improved from 21.08 to 57.38. Additionally, AP increased across all classes from 37.93 to 46.89. To reduce system latency, a novel object detection approach is proposed where the event camera acts as a region proposal network, and a classification algorithm is run on the proposed regions. This achieved a 78% improvement over baseline. Full article
(This article belongs to the Section Sensing and Imaging)
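
The crop-instead-of-downsample step can be illustrated compactly; a NumPy sketch that projects event-camera boxes into the RGB plane, here simplified to a planar homography rather than the paper's full stereo calibration:

```python
import numpy as np

def project_and_crop(frame, event_boxes, H):
    """Project event-camera boxes into the RGB image plane with a
    homography H (a simplification) and crop the frame so the CNN sees
    full-resolution regions instead of a down-sampled image."""
    crops = []
    for (x1, y1, x2, y2) in event_boxes:
        pts = np.array([[x1, y1, 1], [x2, y2, 1]], dtype=float) @ H.T
        pts = pts[:, :2] / pts[:, 2:3]               # dehomogenize
        (u1, v1), (u2, v2) = pts.round().astype(int)
        u1, v1 = max(u1, 0), max(v1, 0)
        crops.append(frame[v1:v2, u1:u2])            # crop for inference
    return crops

frame = np.zeros((720, 1280, 3), dtype=np.uint8)
crops = project_and_crop(frame, [(100, 100, 300, 260)], np.eye(3))
print(crops[0].shape)  # (160, 200, 3)
```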

15 pages, 1174 KiB  
Article
A New Incremental Learning Method Based on Rainbow Memory for Fault Diagnosis of AUV
by Ying Li, Yuxing Ye, Zhiwei Zhang and Long Wen
Sensors 2025, 25(15), 4539; https://doi.org/10.3390/s25154539 - 22 Jul 2025
Abstract
Autonomous Underwater Vehicles (AUVs) are gradually becoming some of the most important equipment in deep-sea exploration. However, given the dynamic nature of the deep-sea environment, any unplanned fault in an AUV can cause serious accidents. Traditional fault diagnosis models are trained on static, fixed datasets, making them difficult to adapt to new and unknown deep-sea environments. To address these issues, this study explores incremental learning and proposes the RM-MFKAN model, which enables AUVs to continuously adapt to new fault scenarios while preserving previously learned diagnostic knowledge. First, the Rainbow Memory (RM) framework is employed to analyze data characteristics and temporal sequences, thereby delineating boundaries between old and new tasks. Second, the model evaluates data importance to select and store key samples encapsulating critical information from prior tasks. Third, the RM framework is combined with an enhanced KAN network: the stored samples are combined with new task training data and fed into a multi-branch feature fusion neural network. The proposed RM-MFKAN model was evaluated on the "Haizhe" dataset, and the experimental results demonstrate that it achieves superior performance in fault diagnosis for AUVs. Full article
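
The replay mechanism, in which stored key samples are mixed with new-task data, is straightforward to sketch in PyTorch (tensors below are placeholders, not the "Haizhe" data):

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

# Replay-style training set: selected old-task samples are mixed with the
# new task's data before training the fusion network.
memory = TensorDataset(torch.randn(64, 20), torch.randint(0, 4, (64,)))    # stored key samples
new_task = TensorDataset(torch.randn(256, 20), torch.randint(4, 6, (256,)))
loader = DataLoader(ConcatDataset([memory, new_task]), batch_size=32, shuffle=True)

for x, y in loader:
    print(x.shape, y.shape)  # batches feed the multi-branch fusion network
    break
```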

22 pages, 2485 KiB  
Article
Infrared and Visible Image Fusion Using a State-Space Adversarial Model with Cross-Modal Dependency Learning
by Qingqing Hu, Yiran Peng, KinTak U and Siyuan Zhao
Mathematics 2025, 13(15), 2333; https://doi.org/10.3390/math13152333 - 22 Jul 2025
Abstract
Infrared and visible image fusion plays a critical role in multimodal perception systems, particularly under challenging conditions such as low illumination, occlusion, or complex backgrounds. However, existing approaches often struggle with global feature modelling, cross-modal dependency learning, and preserving structural details in the fused images. In this paper, we propose a novel adversarial fusion framework driven by a state-space modelling paradigm to address these limitations. In the feature extraction phase, a computationally efficient state-space model is utilized to capture global semantic context from both infrared and visible inputs. A cross-modality state-space architecture is then introduced in the fusion phase to model long-range dependencies between heterogeneous features effectively. Finally, a multi-class discriminator, trained under an adversarial learning scheme, enhances the structural fidelity and detail consistency of the fused output. Extensive experiments conducted on publicly available infrared–visible fusion datasets demonstrate that the proposed method achieves superior performance in terms of information retention, contrast enhancement, and visual realism. The results confirm the robustness and generalizability of our framework for complex scene understanding and downstream tasks such as object detection under adverse conditions. Full article
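
The multi-class discriminator can be sketched as a small classifier over source labels rather than a real/fake critic; a hedged PyTorch illustration with assumed layer sizes:

```python
import torch
import torch.nn as nn

class MultiClassDiscriminator(nn.Module):
    """Sketch of a multi-class discriminator: it classifies an input as
    infrared, visible, or fused, pushing the fused image toward both
    source distributions (layer sizes are assumptions)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 3)
        )

    def forward(self, x):
        return self.net(x)  # logits over {infrared, visible, fused}

logits = MultiClassDiscriminator()(torch.randn(4, 1, 128, 128))
print(logits.shape)  # torch.Size([4, 3])
```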

16 pages, 1655 KiB  
Article
FO-DEMST: Optimized Multi-Scale Transformer with Dual-Encoder Architecture for Feeding Amount Prediction in Sea Bass Aquaculture
by Hongpo Wang, Qihui Zhang, Hong Zhou, Yunchen Tian, Yongcheng Jiang and Jianing Quan
J. Sens. Actuator Netw. 2025, 14(4), 77; https://doi.org/10.3390/jsan14040077 - 22 Jul 2025
Abstract
Traditional methods for predicting feeding amounts rely on historical data and experience but fail to account for non-linear fish growth and the influence of water quality and meteorological factors. This study presents a novel approach for sea bass feeding prediction based on Spearman + RF feature optimization and multi-scale feature fusion using a transformer model. A logistic growth curve model is used to analyze sea bass growth and establish the relationship between biomass and feeding amount. Spearman correlation analysis and random forest optimize the feature set for improved prediction accuracy. A dual-encoder structure incorporates historical feeding data and biomass along with water quality and meteorological information. Multi-scale feature fusion addresses time-scale inconsistencies between the input variables. The results showed that the MSE and MAE of the improved transformer model for sea bass feeding prediction were 0.42 and 0.31, respectively, a decrease of 43% in MSE and 33% in MAE compared to the traditional transformer model. Full article
(This article belongs to the Special Issue Remote Sensing and IoT Application for Smart Agriculture)
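
The logistic growth curve model is standard; a short SciPy sketch fitting it to illustrative (not the paper's) biomass observations:

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, K, r, t0):
    """Logistic growth curve: biomass approaches carrying capacity K."""
    return K / (1 + np.exp(-r * (t - t0)))

# Illustrative weekly biomass observations (kg).
t = np.arange(0, 30)
w = logistic(t, K=2.5, r=0.3, t0=15) + 0.05 * np.random.randn(30)

(K, r, t0), _ = curve_fit(logistic, t, w, p0=(3.0, 0.2, 10.0))
print(f"K={K:.2f} kg, r={r:.2f}/week, t0={t0:.1f}")
# The feeding amount can then be tied to the fitted biomass trajectory.
```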

25 pages, 2727 KiB  
Review
AI-Powered Next-Generation Technology for Semiconductor Optical Metrology: A Review
by Weiwang Xu, Houdao Zhang, Lingjing Ji and Zhongyu Li
Micromachines 2025, 16(8), 838; https://doi.org/10.3390/mi16080838 - 22 Jul 2025
Abstract
As semiconductor manufacturing advances into the angstrom-scale era characterized by three-dimensional integration, conventional metrology technologies face fundamental limitations regarding accuracy, speed, and non-destructiveness. Although optical spectroscopy has emerged as a prominent research focus, its application in complex manufacturing scenarios continues to confront significant technical barriers. This review establishes three concrete objectives: to categorize AI–optical spectroscopy integration paradigms spanning forward surrogate modeling, inverse prediction, physics-informed neural networks (PINNs), and multi-level architectures; to benchmark their efficacy against critical industrial metrology challenges including tool-to-tool (T2T) matching and high-aspect-ratio (HAR) structure characterization; and to identify unresolved bottlenecks for guiding next-generation intelligent semiconductor metrology. By categorically elaborating on the innovative applications of AI algorithms—such as forward surrogate models, inverse modeling techniques, physics-informed neural networks (PINNs), and multi-level network architectures—in optical spectroscopy, this work methodically assesses the implementation efficacy and limitations of each technical pathway. Through application case studies involving J-profiler software 5.0 and associated algorithms, this review validates the significant efficacy of AI technologies in addressing critical industrial challenges, including tool-to-tool (T2T) matching. The research demonstrates that the fusion of AI and optical spectroscopy delivers technological breakthroughs for semiconductor metrology; however, persistent challenges remain concerning data veracity, insufficient datasets, and cross-scale compatibility. Future research should prioritize enhancing model generalization capability, optimizing data acquisition and utilization strategies, and balancing algorithm real-time performance with accuracy, thereby catalyzing the transformation of semiconductor manufacturing towards an intelligence-driven advanced metrology paradigm. Full article
(This article belongs to the Special Issue Recent Advances in Lithography)

23 pages, 3554 KiB  
Article
Multi-Sensor Fusion Framework for Reliable Localization and Trajectory Tracking of Mobile Robot by Integrating UWB, Odometry, and AHRS
by Quoc-Khai Tran and Young-Jae Ryoo
Biomimetics 2025, 10(7), 478; https://doi.org/10.3390/biomimetics10070478 - 21 Jul 2025
Abstract
This paper presents a multi-sensor fusion framework for the accurate indoor localization and trajectory tracking of a differential-drive mobile robot. The proposed system integrates Ultra-Wideband (UWB) trilateration, wheel odometry, and Attitude and Heading Reference System (AHRS) data using a Kalman filter. This fusion approach reduces the impact of noisy and inaccurate UWB measurements while correcting odometry drift. The system combines raw UWB distance measurements with wheel encoder readings and heading information from an AHRS to improve robustness and positioning accuracy. Experimental validation was conducted through repeated closed-loop trajectory trials. The results demonstrate that the proposed method significantly outperforms UWB-only localization, yielding reduced noise, enhanced consistency, and lower Dynamic Time Warping (DTW) distances across repetitions. The findings confirm the system’s effectiveness and suitability for real-time mobile robot navigation in indoor environments. Full article
(This article belongs to the Special Issue Advanced Intelligent Systems and Biomimetics)
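
The prediction/correction structure of the fusion can be shown in a few lines; a minimal Kalman-filter sketch where odometry drives the prediction and a UWB position fix drives the update (matrices are illustrative; the paper also fuses raw UWB ranges and AHRS heading):

```python
import numpy as np

x = np.zeros(2)            # state: planar position [px, py]
P = np.eye(2)              # state covariance
Q = 0.01 * np.eye(2)       # odometry (process) noise
R = 0.25 * np.eye(2)       # UWB (measurement) noise

def predict(x, P, odom_delta):
    """Dead-reckoning step: odometry displacement grows the uncertainty."""
    return x + odom_delta, P + Q

def update(x, P, z_uwb):
    """Correction step with a direct position measurement (H = I)."""
    K = P @ np.linalg.inv(P + R)          # Kalman gain
    return x + K @ (z_uwb - x), (np.eye(2) - K) @ P

x, P = predict(x, P, odom_delta=np.array([0.10, 0.02]))
x, P = update(x, P, z_uwb=np.array([0.12, 0.00]))
print(x, np.diag(P))
```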

25 pages, 8560 KiB  
Article
Visual Point Cloud Map Construction and Matching Localization for Autonomous Vehicle
by Shuchen Xu, Kedong Zhao, Yongrong Sun, Xiyu Fu and Kang Luo
Drones 2025, 9(7), 511; https://doi.org/10.3390/drones9070511 - 21 Jul 2025
Abstract
Collaboration between autonomous vehicles and drones can enhance the efficiency and connectivity of three-dimensional transportation systems. When satellite signals are unavailable, vehicles can achieve accurate localization by matching rich ground environmental data to digital maps, simultaneously providing auxiliary localization information for drones. However, conventional digital maps suffer from high construction costs, easy misalignment, and low localization accuracy. Thus, this paper proposes a visual point cloud map (VPCM) construction and matching localization method for autonomous vehicles. We fuse multi-source information from vehicle-mounted sensors and the regional road network to establish a geographically high-precision VPCM. In the absence of satellite signals, we segment the prior VPCM along the road network based on real-time localization results, which accelerates matching and reduces the probability of mismatches. Simultaneously, by continuously introducing matching constraints between the real-time point cloud and the prior VPCM through an improved iterative closest point (ICP) matching method, the proposed solution can effectively suppress the drift error of the odometry and output accurate fused localization results based on pose-graph optimization. The experiments carried out on the KITTI datasets demonstrate the effectiveness of the proposed method, which can autonomously construct a high-precision prior VPCM. The localization strategy achieves sub-meter accuracy and reduces the average error per frame by 25.84% compared to similar methods. The method's reusability and localization robustness under lighting and environmental changes were then verified on a campus dataset. Compared to a similar camera-based method, the matching success rate increased by 21.15%, and the average localization error decreased by 62.39%. Full article
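
The matching core is ICP; a textbook point-to-point ICP iteration in NumPy (the paper's improved variant adds constraints not shown here):

```python
import numpy as np

def icp_step(src, dst):
    """One point-to-point ICP iteration: nearest-neighbour association,
    then the best rigid transform via SVD of the cross-covariance."""
    d = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(-1)   # brute-force NN
    matched = dst[d.argmin(axis=1)]
    mu_s, mu_m = src.mean(0), matched.mean(0)
    U, _, Vt = np.linalg.svd((src - mu_s).T @ (matched - mu_m))
    Rm = Vt.T @ U.T
    if np.linalg.det(Rm) < 0:        # guard against reflections
        Vt[-1] *= -1
        Rm = Vt.T @ U.T
    t = mu_m - Rm @ mu_s
    return src @ Rm.T + t

map_pts = np.random.rand(200, 2)
scan = map_pts[:50] + np.array([0.05, -0.03])   # shifted "real-time" scan
aligned = icp_step(scan, map_pts)
print(np.abs(aligned - map_pts[:50]).mean())     # residual after one step
```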