Search Results (779)

Search Parameters:
Keywords = real-time information fusion

27 pages, 21019 KiB  
Article
A UWB-AOA/IMU Integrated Navigation System for 6-DoF Indoor UAV Localization
by Pengyu Zhao, Hengchuan Zhang, Gang Liu, Xiaowei Cui and Mingquan Lu
Drones 2025, 9(8), 546; https://doi.org/10.3390/drones9080546 - 1 Aug 2025
Abstract
With the increasing deployment of unmanned aerial vehicles (UAVs) in indoor environments, the demand for high-precision six-degrees-of-freedom (6-DoF) localization has grown significantly. Ultra-wideband (UWB) technology has emerged as a key enabler for indoor UAV navigation due to its robustness against multipath effects and high-accuracy ranging capabilities. However, conventional UWB-based systems primarily rely on range measurements, operate at low measurement frequencies, and are incapable of providing attitude information. This paper proposes a tightly coupled error-state extended Kalman filter (TC–ESKF)-based UWB/inertial measurement unit (IMU) fusion framework. To address the challenge of initial state acquisition, a weighted nonlinear least squares (WNLS)-based initialization algorithm is proposed to rapidly estimate the UAV’s initial position and attitude under static conditions. During dynamic navigation, the system integrates time-difference-of-arrival (TDOA) and angle-of-arrival (AOA) measurements obtained from the UWB module to refine the state estimates, thereby enhancing both positioning accuracy and attitude stability. The proposed system is evaluated through simulations and real-world indoor flight experiments. Experimental results show that the proposed algorithm outperforms representative fusion algorithms in 3D positioning and yaw estimation accuracy.
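The predict/update loop behind a tightly coupled IMU + range filter can be sketched in one dimension. This is not the paper's TC-ESKF (which tracks a full 6-DoF error state with TDOA/AOA measurements); it is a minimal linear Kalman filter with illustrative noise values, showing how IMU dead reckoning is corrected by UWB-style position fixes:

```python
# Minimal 1-D sketch of the predict/update cycle in an IMU + range-measurement
# Kalman filter. State is [position, velocity]; noise values are assumptions.

def kf_step(x, v, P, accel, z_pos, dt=0.1, q=0.01, r=0.05):
    """One predict/update cycle."""
    # Predict: integrate the IMU acceleration (dead reckoning).
    x_pred = x + v * dt + 0.5 * accel * dt * dt
    v_pred = v + accel * dt
    P_pred = P + q                       # simplified scalar covariance growth

    # Update: correct position with the UWB-style measurement.
    K = P_pred / (P_pred + r)            # Kalman gain
    x_new = x_pred + K * (z_pos - x_pred)
    P_new = (1.0 - K) * P_pred
    return x_new, v_pred, P_new

x, v, P = 0.0, 0.0, 1.0
for k in range(50):
    truth = 0.1 * (k + 1)                # target moving at 1 m/s, sampled at 10 Hz
    x, v, P = kf_step(x, v, P, accel=0.0, z_pos=truth)
```

After 50 cycles the estimate tracks the moving target with a small steady-state lag, and the covariance `P` has contracted well below its initial value.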

26 pages, 62045 KiB  
Article
CML-RTDETR: A Lightweight Wheat Head Detection and Counting Algorithm Based on the Improved RT-DETR
by Yue Fang, Chenbo Yang, Chengyong Zhu, Hao Jiang, Jingmin Tu and Jie Li
Electronics 2025, 14(15), 3051; https://doi.org/10.3390/electronics14153051 - 30 Jul 2025
Abstract
Wheat is one of the most important grain crops, and spike counting is crucial for predicting yield. However, in complex farmland environments, wheat heads vary greatly in scale, their color is highly similar to the background, and ears often overlap with each other, all of which make wheat ear detection challenging. At the same time, the growing demand for high accuracy and fast response in wheat spike detection requires models to be lightweight, reducing hardware costs. Therefore, this study proposes a lightweight wheat ear detection model, CML-RTDETR, for efficient and accurate detection of wheat ears in real, complex farmland environments. In constructing the model, the lightweight network CSPDarknet is first introduced as the backbone of CML-RTDETR to improve feature extraction efficiency. In addition, the FM module is introduced to modify the bottleneck layer in the C2f component, realizing hybrid feature extraction by splicing the spatial and frequency domains to strengthen feature extraction for wheat ears in complex scenes. Secondly, to improve the model's detection of targets at different scales, a multi-scale feature enhancement pyramid (MFEP) is designed, consisting of GHSDConv for efficiently obtaining low-level detail information and CSPDWOK for constructing a multi-scale semantic fusion structure. Finally, channel pruning based on Layer-Adaptive Magnitude Pruning (LAMP) scoring is performed to reduce model parameters and runtime memory. Experimental results on the GWHD2021 dataset show that the AP50 of CML-RTDETR reaches 90.5%, an improvement of 1.2% over the baseline RTDETR-R18 model. Meanwhile, the parameters and GFLOPs are reduced to 11.03 M and 37.8 G, reductions of 42% and 34%, respectively. Finally, the real-time frame rate reaches 73 fps, achieving both parameter reduction and a speed improvement.
(This article belongs to the Section Artificial Intelligence)
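The LAMP scoring used for the channel pruning step can be illustrated on a flat weight list. This is a sketch of the published LAMP formula (each weight's squared magnitude normalized by the sum of squared magnitudes of all weights at least as large as it); the example weights are made up:

```python
def lamp_scores(weights):
    # Sort squared magnitudes ascending; each weight's LAMP score is its
    # squared magnitude divided by the sum of the squared magnitudes of
    # all weights with magnitude at least as large as its own.
    order = sorted(range(len(weights)), key=lambda i: weights[i] ** 2)
    sq = [weights[i] ** 2 for i in order]
    suffix = [0.0] * (len(sq) + 1)
    for i in range(len(sq) - 1, -1, -1):
        suffix[i] = suffix[i + 1] + sq[i]     # sum of sq[i:]
    scores = [0.0] * len(weights)
    for rank, i in enumerate(order):
        scores[i] = sq[rank] / suffix[rank]
    return scores

w = [0.5, -0.1, 0.3, 0.05]
s = lamp_scores(w)
# Prune the globally lowest-scoring weights (here: just the minimum):
keep = [wi for wi, si in zip(w, s) if si > min(s)]
```

Because scores are globally comparable, one threshold prunes across all layers at once, which is what makes the criterion "layer-adaptive".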

18 pages, 4857 KiB  
Article
Fast Detection of FDI Attacks and State Estimation in Unmanned Surface Vessels Based on Dynamic Encryption
by Zheng Liu, Li Liu, Hongyong Yang, Zengfeng Wang, Guanlong Deng and Chunjie Zhou
J. Mar. Sci. Eng. 2025, 13(8), 1457; https://doi.org/10.3390/jmse13081457 - 30 Jul 2025
Abstract
Wireless sensor networks (WSNs) are used for data acquisition and transmission in unmanned surface vessels (USVs). However, the openness of wireless networks makes USVs highly susceptible to false data injection (FDI) attacks during data transmission, which affects the sensors’ ability to receive real data and leads to decision-making errors in the control center. In this paper, a novel dynamic data encryption method is proposed whereby data are encrypted prior to transmission and the key is dynamically updated using historical system data, increasing the difficulty for attackers to crack the ciphertext. At the same time, a dynamic relationship is established among ciphertext, key, and auxiliary encrypted ciphertext, and an attack detection scheme based on dynamic encryption is designed to realize instant detection and localization of FDI attacks. Further, an H∞ fusion filter is designed to filter external interference noise, and the real information is estimated and restored by a weighted fusion algorithm. Finally, the validity of the proposed scheme is confirmed through simulation experiments.
(This article belongs to the Special Issue Control and Optimization of Ship Propulsion System)
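The dynamic-encryption idea (a key rolled forward from historical system data, plus an auxiliary ciphertext that exposes tampering) can be sketched with stdlib hashing. The SHA-256 keystream and tag construction here are illustrative assumptions, not the paper's scheme:

```python
import hashlib

def next_key(prev_key: bytes, prev_measurement: bytes) -> bytes:
    # The key evolves with historical data, so capturing one key
    # does not expose future transmissions.
    return hashlib.sha256(prev_key + prev_measurement).digest()

def encrypt(measurement: bytes, key: bytes):
    stream = hashlib.sha256(key).digest()
    cipher = bytes(m ^ s for m, s in zip(measurement, stream))
    tag = hashlib.sha256(cipher + key).digest()    # auxiliary ciphertext
    return cipher, tag

def verify_and_decrypt(cipher: bytes, tag: bytes, key: bytes):
    if hashlib.sha256(cipher + key).digest() != tag:
        return None                                # FDI attack flagged
    stream = hashlib.sha256(key).digest()
    return bytes(c ^ s for c, s in zip(cipher, stream))

key = hashlib.sha256(b"shared-secret").digest()
cipher, tag = encrypt(b"heading=042", key)
assert verify_and_decrypt(cipher, tag, key) == b"heading=042"
tampered = bytes([cipher[0] ^ 0xFF]) + cipher[1:]  # injected false data
assert verify_and_decrypt(tampered, tag, key) is None
key = next_key(key, b"heading=042")                # roll the key forward
```

Because the tag binds ciphertext to the current key, any injected packet fails verification immediately, giving per-packet detection and localization.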

4 pages, 976 KiB  
Proceeding Paper
Developing a Risk Recognition System Based on a Large Language Model for Autonomous Driving
by Donggyu Min and Dong-Kyu Kim
Eng. Proc. 2025, 102(1), 7; https://doi.org/10.3390/engproc2025102007 - 29 Jul 2025
Abstract
Autonomous driving systems have the potential to reduce traffic accidents dramatically; however, conventional modules often struggle to accurately detect risks in complex environments. This study presents a novel risk recognition system that integrates the reasoning capabilities of a large language model (LLM), specifically GPT-4, with traffic engineering domain knowledge. By incorporating surrogate safety measures such as time-to-collision (TTC) alongside traditional sensor and image data, our approach enhances the vehicle’s ability to interpret and react to potentially dangerous situations. Utilizing the realistic 3D simulation environment of CARLA, the proposed framework extracts comprehensive data—including object identification, distance, TTC, and vehicle dynamics—and reformulates this information into natural language inputs for GPT-4. The LLM then provides risk assessments with detailed justifications, guiding the autonomous vehicle to execute appropriate control commands. The experimental results demonstrate that the LLM-based module outperforms conventional systems by maintaining safer distances, achieving more stable TTC values, and delivering smoother acceleration control during dangerous scenarios. This fusion of LLM reasoning with traffic engineering principles not only improves the reliability of risk recognition but also lays a robust foundation for future real-time applications and dataset development in autonomous driving safety.
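A minimal sketch of the surrogate-safety step: compute TTC from the gap and closing speed, then phrase the situation as natural language for the LLM. The 1.5 s / 3.0 s risk thresholds and the prompt wording are invented for illustration:

```python
def time_to_collision(gap_m, ego_speed, lead_speed):
    """TTC = gap / closing speed; infinite when the gap is not closing."""
    closing = ego_speed - lead_speed
    return gap_m / closing if closing > 0 else float("inf")

def describe_risk(gap_m, ego_speed, lead_speed):
    # Reformulate numeric sensor data as a natural-language input,
    # the way the framework feeds GPT-4.
    ttc = time_to_collision(gap_m, ego_speed, lead_speed)
    level = "high" if ttc < 1.5 else "medium" if ttc < 3.0 else "low"
    return (f"Lead vehicle {gap_m:.0f} m ahead, closing at "
            f"{ego_speed - lead_speed:.1f} m/s, TTC {ttc:.1f} s: {level} risk.")

prompt = describe_risk(gap_m=20.0, ego_speed=15.0, lead_speed=5.0)
```

The resulting string would be one field of the larger prompt, alongside object identities and vehicle dynamics.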

40 pages, 13570 KiB  
Article
DuSAFNet: A Multi-Path Feature Fusion and Spectral–Temporal Attention-Based Model for Bird Audio Classification
by Zhengyang Lu, Huan Li, Min Liu, Yibin Lin, Yao Qin, Xuanyu Wu, Nanbo Xu and Haibo Pu
Animals 2025, 15(15), 2228; https://doi.org/10.3390/ani15152228 - 29 Jul 2025
Abstract
This research presents DuSAFNet, a lightweight deep neural network for fine-grained bird audio classification. DuSAFNet combines dual-path feature fusion, spectral–temporal attention, and a multi-band ArcMarginProduct classifier to enhance inter-class separability and capture both local and global spectro–temporal cues. Unlike single-feature approaches, DuSAFNet captures both local spectral textures and long-range temporal dependencies in Mel-spectrogram inputs and explicitly enhances inter-class separability across low, mid, and high frequency bands. On a curated dataset of 17,653 three-second recordings spanning 18 species, DuSAFNet achieves 96.88% accuracy and a 96.83% F1 score using only 6.77 M parameters and 2.275 GFLOPs. Cross-dataset evaluation on Birdsdata yields 93.74% accuracy, demonstrating robust generalization to new recording conditions. Its lightweight design and high performance make DuSAFNet well-suited for edge-device deployment and real-time alerts for rare or threatened species. This work lays the foundation for scalable, automated acoustic monitoring to inform biodiversity assessments and conservation planning.
(This article belongs to the Section Birds)
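The ArcMarginProduct idea (an additive angular margin on the target class) can be sketched per logit. This is a toy version with assumed scale `s` and margin `m`; the actual classifier applies the margin per frequency band during training:

```python
import math

def arc_margin_logit(cos_theta, is_target, s=30.0, m=0.5):
    # ArcMargin adds an angular margin m to the target class before
    # rescaling by s, forcing same-class embeddings into a tighter cone.
    if is_target:
        theta = math.acos(max(-1.0, min(1.0, cos_theta)))
        return s * math.cos(theta + m)
    return s * cos_theta

# Cosine similarities of one embedding against 3 class centres
# (class 0 is the ground-truth label):
cosines = [0.8, 0.6, 0.1]
logits = [arc_margin_logit(c, i == 0) for i, c in enumerate(cosines)]
```

The margin deliberately shrinks the target logit during training (here below the second class), so the network must learn larger angular separation to classify correctly.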

25 pages, 2518 KiB  
Article
An Efficient Semantic Segmentation Framework with Attention-Driven Context Enhancement and Dynamic Fusion for Autonomous Driving
by Jia Tian, Peizeng Xin, Xinlu Bai, Zhiguo Xiao and Nianfeng Li
Appl. Sci. 2025, 15(15), 8373; https://doi.org/10.3390/app15158373 - 28 Jul 2025
Abstract
In recent years, a growing number of real-time semantic segmentation networks have been developed to improve segmentation accuracy. However, these advancements often come at the cost of increased computational complexity, which limits their inference efficiency, particularly in scenarios such as autonomous driving, where strict real-time performance is essential. Achieving an effective balance between speed and accuracy has thus become a central challenge in this field. To address this issue, we present a lightweight semantic segmentation model tailored for the perception requirements of autonomous vehicles. The architecture follows an encoder–decoder paradigm, which not only preserves the capability for deep feature extraction but also facilitates multi-scale information integration. The encoder leverages a high-efficiency backbone, while the decoder introduces a dynamic fusion mechanism designed to enhance information interaction between different feature branches. Recognizing the limitations of convolutional networks in modeling long-range dependencies and capturing global semantic context, the model incorporates an attention-based feature extraction component. This is further augmented by positional encoding, enabling better awareness of spatial structures and local details. The dynamic fusion mechanism employs an adaptive weighting strategy, adjusting the contribution of each feature channel to reduce redundancy and improve representation quality. To validate the effectiveness of the proposed network, experiments were conducted on a single RTX 3090 GPU. The Dynamic Real-time Integrated Vision Encoder–Segmenter Network (DriveSegNet) achieved a mean Intersection over Union (mIoU) of 76.9% and an inference speed of 70.5 FPS on the Cityscapes test dataset, 74.6% mIoU and 139.8 FPS on the CamVid test dataset, and 35.8% mIoU with 108.4 FPS on the ADE20K dataset. The experimental results demonstrate that the proposed method achieves an excellent balance between inference speed, segmentation accuracy, and model size.
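The adaptive weighting behind a dynamic fusion step can be sketched as a per-channel softmax gate over two feature branches. The gate scores here are supplied by hand; in the network they would be predicted from the features themselves:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dynamic_fuse(branch_a, branch_b, gate_scores):
    # Per-channel gate: each channel mixes the two branches with weights
    # derived from the inputs, instead of a fixed elementwise sum.
    fused = []
    for a, b, (ga, gb) in zip(branch_a, branch_b, gate_scores):
        wa, wb = softmax([ga, gb])
        fused.append(wa * a + wb * b)
    return fused

a = [1.0, 0.0, 2.0]          # e.g. deep semantic branch, 3 channels
b = [0.0, 1.0, 2.0]          # e.g. spatial detail branch
gates = [(2.0, 0.0), (0.0, 2.0), (1.0, 1.0)]
fused = dynamic_fuse(a, b, gates)
```

Channels where one branch is more informative receive most of that branch's signal, which is how the adaptive strategy reduces redundancy between branches.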

21 pages, 3293 KiB  
Article
A Fusion of Entropy-Enhanced Image Processing and Improved YOLOv8 for Smoke Recognition in Mine Fires
by Xiaowei Li and Yi Liu
Entropy 2025, 27(8), 791; https://doi.org/10.3390/e27080791 - 25 Jul 2025
Abstract
Smoke appears earlier than flames, so image-based fire monitoring techniques mainly focus on smoke detection, regarded as one of the effective strategies for preventing initial fires from spreading and evolving into serious fires. Smoke monitoring in mine fires faces serious challenges: the underground environment is complex, smoke blends closely with the background, and visual features are blurred, making it difficult for existing image-based monitoring techniques to meet practical needs for accuracy and robustness. Conventional ground-based methods, when applied directly underground, suffer high rates of missed and false detections. Aiming at the core problems of mixed target and background information and high boundary uncertainty in smoke images, this paper, inspired by the principle of information entropy, proposes a method for recognizing smoke from mine fires that integrates entropy-enhanced image processing with an improved YOLOv8. First, based on the spatio-temporal entropy changes produced by smoke diffusion, an equidistant frame image differential fusion method built on spatio-temporal entropy separation is proposed, which effectively suppresses low-entropy background noise, enhances detail clarity in high-entropy smoke regions, and significantly improves the image signal-to-noise ratio. Further, to cope with the variable scale and complex texture (high information entropy) of smoke targets, an improvement mechanism based on entropy-constrained feature focusing is introduced on top of the YOLOv8m model to more effectively capture and distinguish the rich detailed features and uncertain information of smoke regions, realizing balanced and accurate detection of both large and small smoke targets. Experiments show that the overall performance of the proposed method is significantly better than the baseline model and comparable algorithms, and it meets the demands of real-time detection. Compared with YOLOv9m, YOLOv10n, and YOLOv11n, inference speed decreases somewhat, but precision, recall, mAP(50), and mAP(50–95) are all substantially improved. The precision and robustness of smoke recognition in complex mine scenarios are effectively improved.
(This article belongs to the Section Multidisciplinary Applications)
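The equidistant frame differential fusion idea can be sketched on a toy 1-D sequence: differencing frames a fixed stride apart cancels the static low-entropy background, while drifting smoke accumulates signal. This is a schematic reconstruction, not the paper's exact pipeline:

```python
def equidistant_frame_fusion(frames, stride=3):
    # Difference frames taken `stride` apart, then average the absolute
    # differences: static (low-entropy) background cancels, while moving
    # (high-entropy) smoke regions accumulate response.
    diffs = []
    for t in range(len(frames) - stride):
        diffs.append([abs(a - b) for a, b in zip(frames[t + stride], frames[t])])
    n = len(diffs)
    return [sum(col) / n for col in zip(*diffs)]

# Toy 1-D "video": a bright static wall plus a smoke blob moving right.
frames = []
for t in range(6):
    row = [100] * 8              # static background intensity
    row[t + 1] += 50             # moving smoke intensity
    frames.append(row)
fused = equidistant_frame_fusion(frames, stride=3)
```

In the fused response the static columns are exactly zero while every column the blob visited carries energy, which is the signal-to-noise gain the abstract describes.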

27 pages, 33803 KiB  
Article
Multi-Channel Spatio-Temporal Data Fusion of ‘Big’ and ‘Small’ Network Data Using Transformer Networks
by Tao Cheng, Hao Chen, Xianghui Zhang, Xiaowei Gao, Lu Yin and Jianbin Jiao
ISPRS Int. J. Geo-Inf. 2025, 14(8), 286; https://doi.org/10.3390/ijgi14080286 - 24 Jul 2025
Abstract
The integration of heterogeneous spatio-temporal datasets presents a critical challenge in geospatial data science, particularly when combining large-scale, passively collected “big” data with precise but sparse “small” data. In this study, we propose a novel framework—Multi-Channel Spatio-Temporal Data Fusion (MCST-DF)—that leverages transformer-based deep learning to fuse these data sources for accurate network flow estimation. Our approach introduces a Residual Spatio-Temporal Transformer Network (RSTTNet), equipped with a layered attention mechanism and multi-scale embedding architecture to capture both local and global dependencies across space and time. We evaluate the framework using real-world mobile sensing and loop detector data from the London road network, demonstrating over 89% prediction accuracy and outperforming several benchmark deep learning models. This work provides a generalisable solution for spatio-temporal fusion of diverse geospatial data sources and has direct relevance to smart mobility, urban infrastructure monitoring, and the development of spatially informed AI systems.
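The core attention operation inside such a transformer fusion block can be sketched in plain Python: each query position (say, a "small"-data channel) attends over key/value positions from another channel. The shapes and values are purely illustrative:

```python
import math

def attention(queries, keys, values):
    # Scaled dot-product attention: every query attends over all keys,
    # so one data channel can pull context from another channel.
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        m = max(scores)
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        w = [wi / z for wi in w]
        out.append([sum(wi * v[j] for wi, v in zip(w, values))
                    for j in range(len(values[0]))])
    return out

q = [[1.0, 0.0]]                 # one query position, dim 2
k = [[1.0, 0.0], [0.0, 1.0]]     # two key positions
v = [[10.0], [20.0]]             # their values
ctx = attention(q, k, v)
```

The query aligned with the first key draws most of its context from that position, blending in a smaller share of the second: the weighted-mixing behavior that the layered attention mechanism stacks across space and time.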

21 pages, 5181 KiB  
Article
TEB-YOLO: A Lightweight YOLOv5-Based Model for Bamboo Strip Defect Detection
by Xipeng Yang, Chengzhi Ruan, Fei Yu, Ruxiao Yang, Bo Guo, Jun Yang, Feng Gao and Lei He
Forests 2025, 16(8), 1219; https://doi.org/10.3390/f16081219 - 24 Jul 2025
Abstract
The accurate detection of surface defects in bamboo is critical to maintaining product quality. Traditional inspection methods rely heavily on manual labor, making the manufacturing process labor-intensive and error-prone. To overcome these limitations, TEB-YOLO is introduced in this paper, a lightweight and efficient defect detection model based on YOLOv5s. Firstly, EfficientViT replaces the original YOLOv5s backbone, reducing the computational cost while improving feature extraction. Secondly, BiFPN is adopted in place of PANet to enhance multi-scale feature fusion and preserve detailed information. Thirdly, an Efficient Local Attention (ELA) mechanism is embedded in the backbone to strengthen local feature representation. Lastly, the original CIoU loss is replaced with EIoU loss to enhance localization precision. The proposed model achieves a precision of 91.7% with only 10.5 million parameters, marking a 5.4% accuracy improvement and a 22.9% reduction in parameters compared to YOLOv5s. Compared with other mainstream models including YOLOv5n, YOLOv7, YOLOv8n, YOLOv9t, and YOLOv9s, TEB-YOLO achieves precision improvements of 11.8%, 1.66%, 2.0%, 2.8%, and 1.1%, respectively. The experimental results show that TEB-YOLO significantly improves detection precision and model lightweighting, offering a practical and effective solution for real-time bamboo surface defect detection.
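The EIoU loss swap mentioned above can be sketched directly from its published definition: one minus IoU, plus a centre-distance term and explicit width and height difference terms, each normalised by the smallest enclosing box:

```python
def eiou_loss(box_a, box_b):
    # Boxes as (x1, y1, x2, y2).
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    iou = inter / (area_a + area_b - inter)
    # Smallest enclosing box:
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    centre_dist = (((ax1 + ax2) - (bx1 + bx2)) ** 2
                   + ((ay1 + ay2) - (by1 + by2)) ** 2) / 4.0
    dist_term = centre_dist / (cw ** 2 + ch ** 2)
    w_term = ((ax2 - ax1) - (bx2 - bx1)) ** 2 / cw ** 2
    h_term = ((ay2 - ay1) - (by2 - by1)) ** 2 / ch ** 2
    return 1.0 - iou + dist_term + w_term + h_term

perfect = eiou_loss((0, 0, 2, 2), (0, 0, 2, 2))
shifted = eiou_loss((0, 0, 2, 2), (1, 0, 3, 2))
```

Separating the width and height penalties (rather than CIoU's single aspect-ratio term) is what gives EIoU its sharper localization gradient.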

17 pages, 4338 KiB  
Article
Lightweight Attention-Based CNN Architecture for CSI Feedback of RIS-Assisted MISO Systems
by Anming Dong, Yupeng Xue, Sufang Li, Wendong Xu and Jiguo Yu
Mathematics 2025, 13(15), 2371; https://doi.org/10.3390/math13152371 - 24 Jul 2025
Abstract
Reconfigurable Intelligent Surface (RIS) has emerged as a promising enabling technology for wireless communications, which significantly enhances system performance through real-time manipulation of electromagnetic wave reflection characteristics. In RIS-assisted communication systems, existing deep learning-based channel state information (CSI) feedback methods often suffer from excessive parameter requirements and high computational complexity. To address this challenge, this paper proposes LwCSI-Net, a lightweight autoencoder network specifically designed for RIS-assisted multiple-input single-output (MISO) systems, aiming to achieve efficient and low-complexity CSI feedback. The core contribution of this work lies in an innovative lightweight feedback architecture that deeply integrates multi-layer convolutional neural networks (CNNs) with attention mechanisms. Specifically, the network employs 1D convolutional operations with unidirectional kernel sliding, which effectively reduces trainable parameters while maintaining robust feature-extraction capabilities. Furthermore, by incorporating an efficient channel attention (ECA) mechanism, the model dynamically allocates weights to different feature channels, thereby enhancing the capture of critical features. This approach not only improves network representational efficiency but also reduces redundant computations, leading to optimized computational complexity. Additionally, the proposed cross-channel residual block (CRBlock) establishes inter-channel information-exchange paths, strengthening feature fusion and ensuring outstanding stability and robustness under high compression ratio (CR) conditions. Our experimental results show that for CRs of 16, 32, and 64, LwCSI-Net significantly improves CSI reconstruction performance while maintaining fewer parameters and lower computational complexity, achieving an average complexity reduction of 35.63% compared to state-of-the-art (SOTA) CSI feedback autoencoder architectures.
(This article belongs to the Special Issue Data-Driven Decentralized Learning for Future Communication Networks)
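The ECA step can be sketched in plain Python: a small 1-D convolution slides over the pooled channel descriptor and a sigmoid produces one gate per channel. The kernel values below are made-up stand-ins for learned weights:

```python
import math

def eca_weights(channel_means, k=3):
    # Efficient Channel Attention: a 1-D convolution of kernel size k
    # slides across the channel dimension of the globally pooled
    # descriptor, then a sigmoid yields one gate per channel; only the
    # k kernel values are trained, with no fully connected layers.
    kernel = [0.2, 0.6, 0.2]          # illustrative learned kernel (k = 3)
    pad = k // 2
    padded = [0.0] * pad + channel_means + [0.0] * pad
    gates = []
    for c in range(len(channel_means)):
        conv = sum(kernel[j] * padded[c + j] for j in range(k))
        gates.append(1.0 / (1.0 + math.exp(-conv)))
    return gates

# Global-average-pooled descriptor for 4 feature channels:
pooled = [2.0, -1.0, 0.0, 3.0]
gates = eca_weights(pooled)
recalibrated = [p * g for p, g in zip(pooled, gates)]
```

Because the gate for each channel depends only on its k neighbours, the parameter count stays tiny, which is why ECA suits a lightweight feedback network.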

22 pages, 4611 KiB  
Article
MMC-YOLO: A Lightweight Model for Real-Time Detection of Geometric Symmetry-Breaking Defects in Wind Turbine Blades
by Caiye Liu, Chao Zhang, Xinyu Ge, Xunmeng An and Nan Xue
Symmetry 2025, 17(8), 1183; https://doi.org/10.3390/sym17081183 - 24 Jul 2025
Abstract
Performance degradation of wind turbine blades often stems from geometric asymmetry induced by damage. Existing methods for assessing damage face challenges in balancing accuracy and efficiency due to their limited ability to capture fine-grained geometric asymmetries associated with multi-scale damage under complex background interference. To address this, based on the high-speed detection model YOLOv10-N, this paper proposes a novel detection model named MMC-YOLO. First, the Multi-Scale Perception Gated Convolution (MSGConv) Module was designed, which constructs a full-scale receptive field through multi-branch fusion and channel rearrangement to enhance the extraction of geometric asymmetry features. Second, the Multi-Scale Enhanced Feature Pyramid Network (MSEFPN) was developed, integrating dynamic path aggregation and an SENetv2 attention mechanism to suppress background interference and amplify damage response. Finally, the Channel-Compensated Filtering (CCF) module was constructed to preserve critical channel information using a dynamic buffering mechanism. Evaluated on a dataset of 4818 wind turbine blade damage images, MMC-YOLO achieves an 82.4% mAP [0.5:0.95], representing a 4.4% improvement over the baseline YOLOv10-N model, and a 91.1% recall rate, an 8.7% increase, while maintaining a lightweight parameter count of 4.2 million. This framework significantly enhances geometric asymmetry defect detection accuracy while ensuring real-time performance, meeting engineering requirements for high efficiency and precision.
(This article belongs to the Special Issue Symmetry and Its Applications in Image Processing)

22 pages, 2952 KiB  
Article
Raw-Data Driven Functional Data Analysis with Multi-Adaptive Functional Neural Networks for Ergonomic Risk Classification Using Facial and Bio-Signal Time-Series Data
by Suyeon Kim, Afrooz Shakeri, Seyed Shayan Darabi, Eunsik Kim and Kyongwon Kim
Sensors 2025, 25(15), 4566; https://doi.org/10.3390/s25154566 - 23 Jul 2025
Abstract
Ergonomic risk classification during manual lifting tasks is crucial for the prevention of workplace injuries. This study addresses the challenge of classifying lifting task risk levels (low, medium, and high risk, labeled as 0, 1, and 2) using multi-modal time-series data comprising raw facial landmarks and bio-signals (electrocardiography [ECG] and electrodermal activity [EDA]). Classifying such data presents inherent challenges due to multi-source information, temporal dynamics, and class imbalance. To overcome these challenges, this paper proposes a Multi-Adaptive Functional Neural Network (Multi-AdaFNN), a novel method that integrates functional data analysis with deep learning techniques. The proposed model introduces a novel adaptive basis layer composed of micro-networks tailored to each individual time-series feature, enabling end-to-end learning of discriminative temporal patterns directly from raw data. The Multi-AdaFNN approach was evaluated across five distinct dataset configurations: (1) facial landmarks only, (2) bio-signals only, (3) full fusion of all available features, (4) a reduced-dimensionality set of 12 selected facial landmark trajectories, and (5) the same reduced set combined with bio-signals. Performance was rigorously assessed using 100 independent stratified splits (70% training and 30% testing) and optimized via a weighted cross-entropy loss function to manage class imbalance effectively. The results demonstrated that the integrated approach, fusing facial landmarks and bio-signals, achieved the highest classification accuracy and robustness. Furthermore, the adaptive basis functions revealed specific phases within lifting tasks critical for risk prediction. These findings underscore the efficacy and transparency of the Multi-AdaFNN framework for multi-modal ergonomic risk assessment, highlighting its potential for real-time monitoring and proactive injury prevention in industrial environments.
(This article belongs to the Special Issue (Bio)sensors for Physiological Monitoring)
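The adaptive basis idea from functional data analysis can be sketched as inner products between a sampled time series and basis curves. Here the bases are fixed toy curves; in Multi-AdaFNN each basis would be generated by a small per-feature micro-network and learned end to end:

```python
def basis_projection(series, basis, dt=1.0):
    # Functional-data view: treat the sampled series as a function x(t)
    # and approximate the integral of x(t) * beta(t) by a Riemann sum;
    # the scalar coefficients feed the downstream classifier.
    return sum(x * b for x, b in zip(series, basis)) * dt

# One facial-landmark trajectory sampled at 5 time points:
series = [0.0, 1.0, 2.0, 3.0, 1.0]
early = [1.0, 1.0, 0.0, 0.0, 0.0]    # "early lifting phase" basis curve
late = [0.0, 0.0, 0.0, 1.0, 1.0]     # "late lifting phase" basis curve
features = [basis_projection(series, b) for b in (early, late)]
```

Inspecting which learned basis carries the most weight is what lets the authors point to the lifting phases most critical for risk prediction.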

37 pages, 55522 KiB  
Article
EPCNet: Implementing an ‘Artificial Fovea’ for More Efficient Monitoring Using the Sensor Fusion of an Event-Based and a Frame-Based Camera
by Orla Sealy Phelan, Dara Molloy, Roshan George, Edward Jones, Martin Glavin and Brian Deegan
Sensors 2025, 25(15), 4540; https://doi.org/10.3390/s25154540 - 22 Jul 2025
Abstract
Efficient object detection is crucial to real-time monitoring applications such as autonomous driving or security systems. Modern RGB cameras can produce high-resolution images for accurate object detection. However, increased resolution results in increased network latency and power consumption. To minimise this latency, Convolutional Neural Networks (CNNs) often have a resolution limitation, requiring images to be down-sampled before inference, causing significant information loss. Event-based cameras are neuromorphic vision sensors with high temporal resolution, low power consumption, and high dynamic range, making them preferable to regular RGB cameras in many situations. This project proposes the fusion of an event-based camera with an RGB camera to mitigate the trade-off between temporal resolution and accuracy, while minimising power consumption. The cameras are calibrated to create a multi-modal stereo vision system where pixel coordinates can be projected between the event and RGB camera image planes. This calibration is used to project bounding boxes detected by clustering of events into the RGB image plane, thereby cropping each RGB frame instead of down-sampling to meet the requirements of the CNN. Using the Common Objects in Context (COCO) dataset evaluator, the average precision (AP) for the bicycle class in RGB scenes improved from 21.08 to 57.38. Additionally, AP increased across all classes from 37.93 to 46.89. To reduce system latency, a novel object detection approach is proposed where the event camera acts as a region proposal network, and a classification algorithm is run on the proposed regions. This achieved a 78% improvement over the baseline.
(This article belongs to the Section Sensing and Imaging)
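Projecting an event-cluster bounding box into the RGB image plane can be sketched with a 3x3 homography, one common model for such a plane-to-plane calibration (the paper's stereo calibration may differ). The matrix values below are toy numbers:

```python
def apply_homography(h, x, y):
    # Map a pixel from the event-camera plane to the RGB plane using a
    # 3x3 homography, given as a row-major list of 9 calibration values.
    xn = h[0] * x + h[1] * y + h[2]
    yn = h[3] * x + h[4] * y + h[5]
    w = h[6] * x + h[7] * y + h[8]
    return xn / w, yn / w

def project_box(h, box):
    # Project all four corners and take the axis-aligned hull: this is
    # the RGB crop fed to the CNN instead of a down-sampled full frame.
    (x1, y1, x2, y2) = box
    corners = [apply_homography(h, x, y)
               for x, y in ((x1, y1), (x2, y1), (x1, y2), (x2, y2))]
    xs = [c[0] for c in corners]
    ys = [c[1] for c in corners]
    return min(xs), min(ys), max(xs), max(ys)

# Toy calibration: scale by 2 and shift (+10, +5) between the two planes.
H = [2.0, 0.0, 10.0,
     0.0, 2.0, 5.0,
     0.0, 0.0, 1.0]
rgb_box = project_box(H, (100, 80, 140, 120))
```

Taking the hull of all four projected corners keeps the crop axis-aligned even when the true mapping rotates or skews the box.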

25 pages, 2727 KiB  
Review
AI-Powered Next-Generation Technology for Semiconductor Optical Metrology: A Review
by Weiwang Xu, Houdao Zhang, Lingjing Ji and Zhongyu Li
Micromachines 2025, 16(8), 838; https://doi.org/10.3390/mi16080838 - 22 Jul 2025
Abstract
As semiconductor manufacturing advances into the angstrom-scale era characterized by three-dimensional integration, conventional metrology technologies face fundamental limitations regarding accuracy, speed, and non-destructiveness. Although optical spectroscopy has emerged as a prominent research focus, its application in complex manufacturing scenarios still confronts significant technical barriers. This review establishes three concrete objectives: to categorize AI-optical spectroscopy integration paradigms spanning forward surrogate modeling, inverse prediction, physics-informed neural networks (PINNs), and multi-level architectures; to benchmark their efficacy against critical industrial metrology challenges, including tool-to-tool (T2T) matching and high-aspect-ratio (HAR) structure characterization; and to identify unresolved bottlenecks to guide next-generation intelligent semiconductor metrology. By elaborating on the innovative applications of each of these AI techniques in optical spectroscopy, this work methodically assesses the implementation efficacy and limitations of each technical pathway. Through application case studies involving J-profiler software 5.0 and associated algorithms, this review validates the significant efficacy of AI technologies in addressing critical industrial challenges such as T2T matching. The research demonstrates that the fusion of AI and optical spectroscopy delivers technological breakthroughs for semiconductor metrology; however, persistent challenges remain concerning data veracity, insufficient datasets, and cross-scale compatibility. Future research should prioritize enhancing model generalization capability, optimizing data acquisition and utilization strategies, and balancing algorithm real-time performance with accuracy, thereby catalyzing the transformation of semiconductor manufacturing towards an intelligence-driven advanced metrology paradigm. Full article
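The forward-surrogate and inverse-prediction paradigms surveyed in this review can be illustrated with a toy example. Everything below is an assumption for illustration: `rcwa_stub` stands in for a rigorous electromagnetic solver (e.g., RCWA), the spectrum model is invented, and the surrogate is a simple polynomial least-squares fit rather than the neural networks discussed in the review; only the two-step pattern (train a fast forward model, then invert it against a measured spectrum) reflects the text.

```python
import numpy as np

wavelengths = np.linspace(400, 800, 50)  # nm

def rcwa_stub(cd):
    """Stand-in for a rigorous solver: reflectance spectrum as a smooth
    function of a single geometry parameter cd (critical dimension, nm)."""
    return cd / 100.0 + 0.2 * np.sin(wavelengths / 100.0 + cd / 50.0)

# 1) Forward surrogate: fit spectrum ~ A.T @ [1, cd, cd^2] per wavelength.
train_cd = np.linspace(20, 60, 41)                 # training geometries (nm)
X = np.stack([np.ones_like(train_cd), train_cd, train_cd**2], axis=1)
Y = np.stack([rcwa_stub(c) for c in train_cd])     # (41, 50) spectra
A, *_ = np.linalg.lstsq(X, Y, rcond=None)          # surrogate coefficients

def surrogate(cd):
    """Fast approximation of the solver, usable inside an inverse loop."""
    return np.array([1.0, cd, cd**2]) @ A

# 2) Inverse prediction: search the surrogate against a measured spectrum.
true_cd = 37.0
measured = rcwa_stub(true_cd)                      # pretend this was measured
grid = np.linspace(20, 60, 401)
errors = [np.sum((surrogate(c) - measured) ** 2) for c in grid]
cd_hat = grid[int(np.argmin(errors))]              # recovered geometry
```

The practical appeal is the same as in the review: the expensive solver is called only offline to build training data, while the cheap surrogate is evaluated thousands of times during inverse fitting.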
(This article belongs to the Special Issue Recent Advances in Lithography)
23 pages, 3554 KiB  
Article
Multi-Sensor Fusion Framework for Reliable Localization and Trajectory Tracking of Mobile Robot by Integrating UWB, Odometry, and AHRS
by Quoc-Khai Tran and Young-Jae Ryoo
Biomimetics 2025, 10(7), 478; https://doi.org/10.3390/biomimetics10070478 - 21 Jul 2025
Abstract
This paper presents a multi-sensor fusion framework for the accurate indoor localization and trajectory tracking of a differential-drive mobile robot. The proposed system integrates Ultra-Wideband (UWB) trilateration, wheel odometry, and Attitude and Heading Reference System (AHRS) data using a Kalman filter. This fusion approach reduces the impact of noisy and inaccurate UWB measurements while correcting odometry drift. The system combines raw UWB distance measurements with wheel encoder readings and heading information from an AHRS to improve robustness and positioning accuracy. Experimental validation was conducted through repeated closed-loop trajectory trials. The results demonstrate that the proposed method significantly outperforms UWB-only localization, yielding reduced noise, enhanced consistency, and lower Dynamic Time Warping (DTW) distances across repetitions. The findings confirm the system’s effectiveness and suitability for real-time mobile robot navigation in indoor environments. Full article
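The two stages described in this abstract, position fixes from UWB ranges followed by Kalman fusion with odometry, can be sketched as below. This is a simplified illustration under stated assumptions: the trilateration is a standard linearized least-squares solve (not necessarily the paper's), the filter is a position-only Kalman update rather than the paper's full state model, and the odometry displacement `u` is assumed to be already rotated into the world frame using the AHRS heading.

```python
import numpy as np

def trilaterate(anchors, dists):
    """Least-squares 2D position from UWB anchor ranges.
    Linearized by subtracting the first anchor's range equation:
    2 (a_i - a_0) . p = d_0^2 - d_i^2 + |a_i|^2 - |a_0|^2."""
    a0, d0 = anchors[0], dists[0]
    A = 2.0 * (anchors[1:] - a0)
    b = d0**2 - dists[1:]**2 + np.sum(anchors[1:]**2, axis=1) - np.sum(a0**2)
    pos, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pos

def kf_fuse(x, P, u, z, Q, R):
    """One predict/update cycle: predict with odometry displacement u
    (world frame, via AHRS heading), update with UWB position fix z."""
    x_pred = x + u                    # motion model: dead-reckoned step
    P_pred = P + Q                    # odometry drift grows uncertainty
    K = P_pred @ np.linalg.inv(P_pred + R)   # Kalman gain
    x_new = x_pred + K @ (z - x_pred)        # pull toward the UWB fix
    P_new = (np.eye(2) - K) @ P_pred
    return x_new, P_new

# Four anchors at the corners of a 10 m x 10 m room; robot at (3, 4).
anchors = np.array([[0., 0.], [10., 0.], [0., 10.], [10., 10.]])
true_pos = np.array([3.0, 4.0])
pos = trilaterate(anchors, np.linalg.norm(anchors - true_pos, axis=1))

# Fuse one odometry step with the next UWB fix.
x, P = pos, np.eye(2) * 0.5
new_pos = true_pos + np.array([0.5, 0.0])
z = trilaterate(anchors, np.linalg.norm(anchors - new_pos, axis=1))
x, P = kf_fuse(x, P, u=np.array([0.5, 0.0]), z=z,
               Q=np.eye(2) * 0.01, R=np.eye(2) * 0.1)
```

The gain `K` is what realizes the trade-off the abstract describes: large UWB noise `R` makes the filter lean on odometry, while growing process noise `Q` keeps odometry drift from accumulating unchecked.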
(This article belongs to the Special Issue Advanced Intelligent Systems and Biomimetics)