Search Results (733)

Search Parameters:
Keywords = guided filtering

18 pages, 46866 KB  
Article
SATrack: Semantic-Aware Alignment Framework for Visual–Language Tracking
by Yangyang Tian, Liusen Xu, Zhe Li, Liang Jiang, Cen Chen and Huanlong Zhang
Electronics 2025, 14(19), 3935; https://doi.org/10.3390/electronics14193935 - 4 Oct 2025
Abstract
Visual–language tracking often faces challenges like target deformation and confusion caused by similar objects. These issues can disrupt the alignment between visual inputs and their textual descriptions, leading to cross-modal semantic drift and feature-matching errors. To address these issues, we propose SATrack, a Semantic-Aware Alignment framework for visual–language tracking. Specifically, we first propose the Semantically Aware Contrastive Alignment module, which leverages attention-guided semantic distance modeling to identify hard negative samples that are semantically similar but carry different labels. This helps the model better distinguish confusing instances and capture fine-grained cross-modal differences. Secondly, we design the Cross-Modal Token Filtering strategy, which leverages attention responses guided by both the visual template and the textual description to filter out irrelevant or weakly related tokens in the search region. This helps the model focus more precisely on the target. Finally, we propose a Confidence-Guided Template Memory mechanism, which evaluates the prediction quality of each frame using convolutional operations and confidence thresholding. High-confidence frames are stored to selectively update the template memory, enabling the model to adapt to appearance changes over time. Extensive experiments show that SATrack achieves a 65.8% success rate on the TNL2K benchmark, surpassing the previous state-of-the-art UVLTrack by 3.1% and demonstrating superior robustness and accuracy. Full article
(This article belongs to the Special Issue Deep Perception in Autonomous Driving, 2nd Edition)
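The Confidence-Guided Template Memory described above can be illustrated with a minimal sketch; the threshold, memory size, and feature placeholders are illustrative assumptions, not SATrack's actual parameters:

```python
from collections import deque

def update_template_memory(memory, frame_features, confidence,
                           threshold=0.7, max_templates=5):
    """Confidence-gated template update: store a frame's features only
    when its prediction confidence clears the threshold, keeping at most
    `max_templates` recent templates (oldest evicted first)."""
    if confidence >= threshold:
        memory.append(frame_features)
        if len(memory) > max_templates:
            memory.popleft()
    return memory

memory = deque()
update_template_memory(memory, "frame_1_features", 0.9)  # stored
update_template_memory(memory, "frame_2_features", 0.4)  # rejected: low confidence
```

The gate keeps unreliable frames out of the memory, so the template bank only adapts to appearance changes it is confident about.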

28 pages, 7501 KB  
Article
Multi-Step Apparent Temperature Prediction in Broiler Houses Using a Hybrid SE-TCN–Transformer Model with Kalman Filtering
by Pengshen Zheng, Wanchao Zhang, Bin Gao, Yali Ma and Changxi Chen
Sensors 2025, 25(19), 6124; https://doi.org/10.3390/s25196124 - 3 Oct 2025
Abstract
In intensive broiler production, rapid environmental fluctuations can induce heat stress, adversely affecting flock welfare and productivity. Apparent temperature (AT), integrating temperature, humidity, and wind speed, provides a comprehensive thermal index, guiding predictive climate control. This study develops a multi-step AT forecasting model based on a hybrid SE-TCN–Transformer architecture enhanced with Kalman filtering. The temporal convolutional network with SE attention extracts short-term local trends, the Transformer captures long-range dependencies, and Kalman smoothing reduces prediction noise, collectively improving robustness and accuracy. The model was trained on multi-source time-series data from a commercial broiler house and evaluated for 5, 15, and 30 min horizons against LSTM, GRU, Autoformer, and Informer benchmarks. Results indicate that the proposed model achieves substantially lower prediction errors and higher determination coefficients. By combining multi-variable feature integration, local–global temporal modeling, and dynamic smoothing, the model offers a precise and reliable tool for intelligent ventilation control and heat stress management. These findings provide both scientific insight into multi-step thermal environment prediction and practical guidance for optimizing broiler welfare and production performance. Full article
(This article belongs to the Section Smart Agriculture)
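The Kalman smoothing stage can be sketched with a scalar filter under a random-walk state model; the variances below are illustrative placeholders, not the values used in the paper:

```python
def kalman_filter_1d(measurements, process_var=1e-3, meas_var=0.25):
    """Scalar Kalman filter with a random-walk state model: predict the
    previous estimate, then blend in each noisy measurement with a gain
    set by the relative process and measurement variances."""
    x, p = measurements[0], 1.0      # initial estimate and its variance
    estimates = [x]
    for z in measurements[1:]:
        p += process_var             # predict: uncertainty grows
        k = p / (p + meas_var)       # Kalman gain in [0, 1]
        x += k * (z - x)             # update toward the measurement
        p *= (1.0 - k)               # posterior variance shrinks
        estimates.append(x)
    return estimates

smoothed = kalman_filter_1d([20.0, 20.6, 19.8, 20.3, 20.1])
```

A small `process_var` relative to `meas_var` yields heavier smoothing, which is the noise-reduction role Kalman filtering plays after the SE-TCN–Transformer prediction.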
21 pages, 2248 KB  
Article
TSFNet: Temporal-Spatial Fusion Network for Hybrid Brain-Computer Interface
by Yan Zhang, Bo Yin and Xiaoyang Yuan
Sensors 2025, 25(19), 6111; https://doi.org/10.3390/s25196111 - 3 Oct 2025
Abstract
Unimodal brain–computer interfaces (BCIs) are inherently limited by their reliance on a single signal modality. While hybrid BCIs combining electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS) offer complementary advantages, effectively integrating their spatiotemporal features remains a challenge due to inherent signal asynchrony. This study aims to develop a novel deep fusion network to achieve synergistic integration of EEG and fNIRS signals for improved classification performance across different tasks. We propose the Temporal-Spatial Fusion Network (TSFNet), which consists of two key sublayers: the EEG-fNIRS-guided Fusion (EFGF) layer and the Cross-Attention-based Feature Enhancement (CAFÉ) layer. The EFGF layer extracts temporal features from EEG and spatial features from fNIRS to generate a hybrid attention map, which is utilized to achieve more effective and complementary integration of spatiotemporal information. The CAFÉ layer enables bidirectional interaction between fNIRS and fusion features via a cross-attention mechanism, which enhances the fusion features and selectively filters informative fNIRS representations. Through the two sublayers, TSFNet achieves deep fusion of multimodal features. Finally, TSFNet is evaluated on motor imagery (MI), mental arithmetic (MA), and word generation (WG) classification tasks. Experimental results demonstrate that TSFNet achieves superior classification performance, with average accuracies of 70.18% for MI, 86.26% for MA, and 81.13% for WG, outperforming existing state-of-the-art multimodal algorithms. These findings suggest that TSFNet provides an effective solution for spatiotemporal feature fusion in hybrid BCIs, with potential applications in real-world BCI systems. Full article

30 pages, 8197 KB  
Article
Numerical and Experimental Study of Mode Coupling Due to Localised Few-Mode Fibre Bragg Gratings and a Spatial Mode Multiplexer
by James Hainsworth, Adriana Morana, Lucas Lescure, Philippe Veyssiere, Sylvain Girard and Emmanuel Marin
Sensors 2025, 25(19), 6087; https://doi.org/10.3390/s25196087 - 2 Oct 2025
Abstract
Mode conversion effects in Fibre Bragg Gratings (FBGs) are widely exploited in applications such as sensing and fibre lasers. However, when FBGs are inscribed into Few-mode optical Fibres (FMFs), the mode interactions become highly complex due to the increased number of guided modes, rendering their practical use difficult. In this study, we investigate whether the addition of a spatial mode multiplexer, used to selectively excite specific fibre modes, can simplify the interpretation and utility of few-mode FBGs (FM-FBGs). We focus on point-by-point (PbP)-inscribed FBGs, localised with respect to the transverse cross-section of the fibre core, and study their interaction with a range of Hermite–Gaussian input modes. We present a comprehensive numerical study supported by experimental validation, examining the mechanisms of mode coupling induced by localised FBGs and its implications, with a focus on sensing applications. Our results show that the introduction of a spatial mode multiplexer leads to slight simplification of the FBG transmission spectrum. Nevertheless, significant simplification of the reflection spectrum is achievable after modal filtering occurs as the reflected light re-traverses the spatial mode multiplexer, potentially enabling WDM monitoring of FM-FBGs. Notably, we report a novel approach to multiplexing FBGs based on their transverse location within the fibre core and the modal content initially coupled into the fibre; to the best of our knowledge, this multiplexing technique has not previously been reported. Full article
(This article belongs to the Special Issue Feature Papers in Sensing and Imaging 2025)

25 pages, 2147 KB  
Article
Skeletal Image Features Based Collaborative Teleoperation Control of the Double Robotic Manipulators
by Hsiu-Ming Wu and Shih-Hsun Wei
Electronics 2025, 14(19), 3897; https://doi.org/10.3390/electronics14193897 - 30 Sep 2025
Abstract
In this study, a vision-based remote and synchronized control scheme is proposed for a pair of six-DOF robotic manipulators. Using an Intel RealSense D435 depth camera and the MediaPipe skeletal image feature technique, the operator’s 3D hand pose is captured and mapped to the robot’s workspace via coordinate transformation. Inverse kinematics is then applied to compute the necessary joint angles for synchronized motion control. Implemented on double robotic manipulators with the MoveIt framework, the system successfully achieves a collaborative teleoperation task in which an object is transferred from one robotic manipulator to the other. Further, moving average filtering techniques are used to enhance trajectory smoothness and stability. The framework demonstrates the feasibility and effectiveness of non-contact, vision-guided multi-robot control for applications in teleoperation, smart manufacturing, and education. Full article
(This article belongs to the Section Systems & Control Engineering)
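The moving-average smoothing mentioned above can be sketched as a causal filter over a stream of hand-pose coordinates; the window size is an illustrative choice, not the paper's setting:

```python
def moving_average(samples, window=3):
    """Causal moving-average filter: each output is the mean of the last
    `window` inputs (fewer at the start of the stream), smoothing jitter
    in tracked coordinates at the cost of a small lag."""
    smoothed = []
    for i in range(len(samples)):
        lo = max(0, i - window + 1)
        segment = samples[lo:i + 1]
        smoothed.append(sum(segment) / len(segment))
    return smoothed

noisy_x = [0.10, 0.14, 0.09, 0.13, 0.11]
smooth_x = moving_average(noisy_x)
```

Applied per coordinate axis, this damps the frame-to-frame jitter in skeletal keypoint estimates before inverse kinematics is solved.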
43 pages, 6500 KB  
Article
Human Risk Mitigators: A Bibliometric and Thematic Analysis of Financial Advisors in Household Resilience
by Maria-Roxana Balea-Stanciu, Georgiana-Iulia Lazea and Ovidiu-Constantin Bunget
J. Risk Financial Manag. 2025, 18(10), 548; https://doi.org/10.3390/jrfm18100548 - 30 Sep 2025
Abstract
In the context of rising uncertainty and financial crises, the roles of financial advisors are evolving beyond technical compliance, particularly in household contexts. This article introduces a novel perspective by highlighting how these professionals contribute to resilience and stability at all levels of society by building financial literacy and acting as human barriers against systemic risk. From the datasets retrieved from Web of Science and Scopus, a final curated sample of 102 peer-reviewed articles was retained following thematic refinement and in-depth human filtering. After data harmonisation, a bibliometric analysis was conducted through VOSviewer, identifying five key thematic clusters. Beyond cartographic description, a rigorous thematic exploration was conducted. We advance an interpretive architecture consisting of mechanisms (M1–M4), advice-to-outcome pathways (P1–P3), and a conditional context (Conditions of Success (CS), Failure points (F) and Moderating Factors (MF)), enabling integrative inference and cumulative explanation across an otherwise heterogeneous corpus. Results show that financial advisors mitigate risk by educating clients, guiding decisions, and turning complexity into usable judgment. They also bear risk; as human barriers, they channel and transform these pressures through their professional practice, returning stabilizing effects to households and, by extension, to the wider financial system. Full article
(This article belongs to the Special Issue Financial and Sustainability Reporting in a Digital Era, 2nd Edition)

14 pages, 1646 KB  
Article
Arabic WikiTableQA: Benchmarking Question Answering over Arabic Tables Using Large Language Models
by Fawaz Alsolami and Asmaa Alrayzah
Electronics 2025, 14(19), 3829; https://doi.org/10.3390/electronics14193829 - 27 Sep 2025
Abstract
Table-based question answering (TableQA) has made significant progress in recent years; however, most advancements have focused on English datasets and SQL-based techniques, leaving Arabic TableQA largely unexplored. This gap is especially critical given the widespread use of structured Arabic content in domains such as government, education, and media. The main challenge lies in the absence of benchmark datasets and the difficulty that large language models (LLMs) face when reasoning over long, complex tables in Arabic, due to token limitations and morphological complexity. To address this, we introduce Arabic WikiTableQA, the first large-scale dataset for non-SQL Arabic TableQA, constructed from the WikiTableQuestions dataset and enriched with natural questions and gold-standard answers. We developed three methods to evaluate this dataset: a direct input approach, a sub-table selection strategy using SQL-like filtering, and a knowledge-guided framework that filters the table using semantic graphs. Experimental results with an LLM show that the graph-guided approach outperforms the others, achieving 74% accuracy, compared to 64% for sub-table selection and 45% for direct input, demonstrating its effectiveness in handling long and complex Arabic tables. Full article
(This article belongs to the Special Issue Deep Learning Approaches for Natural Language Processing)

25 pages, 17562 KB  
Article
SGFNet: Redundancy-Reduced Spectral–Spatial Fusion Network for Hyperspectral Image Classification
by Boyu Wang, Chi Cao and Dexing Kong
Entropy 2025, 27(10), 995; https://doi.org/10.3390/e27100995 - 24 Sep 2025
Abstract
Hyperspectral image classification (HSIC) involves analyzing high-dimensional data that contain substantial spectral redundancy and spatial noise, which increases the entropy and uncertainty of feature representations. Reducing such redundancy while retaining informative content in spectral–spatial interactions remains a fundamental challenge for building efficient and accurate HSIC models. Traditional deep learning methods often rely on redundant modules or lack sufficient spectral–spatial coupling, limiting their ability to fully exploit the information content of hyperspectral data. To address these challenges, we propose SGFNet, which is a spectral-guided fusion network designed from an information–theoretic perspective to reduce feature redundancy and uncertainty. First, we designed a Spectral-Aware Filtering Module (SAFM) that suppresses noisy spectral components and reduces redundant entropy, encoding the raw pixel-wise spectrum into a compact spectral representation accessible to all encoder blocks. Second, we introduced a Spectral–Spatial Adaptive Fusion (SSAF) module, which strengthens spectral–spatial interactions and enhances the discriminative information in the fused features. Finally, we developed a Spectral Guidance Gated CNN (SGGC), which is a lightweight gated convolutional module that uses spectral guidance to more effectively extract spatial representations while avoiding unnecessary sequence modeling overhead. We conducted extensive experiments on four widely used hyperspectral benchmarks and compared SGFNet with eight state-of-the-art models. The results demonstrate that SGFNet consistently achieves superior performance across multiple metrics. From an information–theoretic perspective, SGFNet implicitly balances redundancy reduction and information preservation, providing an efficient and effective solution for HSIC. Full article
(This article belongs to the Section Multidisciplinary Applications)

21 pages, 3479 KB  
Article
A Comprehensive Methodology for Soft Error Rate (SER) Reduction in Clock Distribution Network
by Jorge Johanny Saenz-Noval, Umberto Gatti and Cristiano Calligaro
Chips 2025, 4(4), 39; https://doi.org/10.3390/chips4040039 - 24 Sep 2025
Abstract
Single Event Transients (SETs) in clock-distribution networks are a major source of soft errors in synchronous systems. We present a practical framework that assesses SET risk early in the design cycle, before layout and parasitics, using a Vulnerability Function (VF) derived from Verilog fault injection. This framework guides targeted Engineering Change Orders (ECOs), such as clock-net remapping, re-routing, and the selective insertion of SET filters, within a reproducible open-source flow (Yosys, OpenROAD, OpenSTA). A new analytical Soft Error Rate (SER) model for clock trees is also proposed, which decomposes contributions from the root, intermediate levels, and leaves, and is calibrated by SPICE-measured propagation probabilities, area, and particle flux. When coupled with throughput, this model yields a frequency-aware system-level Bit Error Rate (BERsys). The methodology was validated on a First-In First-Out (FIFO) memory, demonstrating a significant vulnerability reduction of approximately 3.35× in READ mode and 2.67× in WRITE mode. Frequency sweeps show monotonic decreases in both clock-tree vulnerability and BERsys at higher clock frequencies, a trend attributed to temporal masking and throughput effects. Cross-node SPICE characterization between 65 nm and 28 nm reveals a technology-dependent effect: for the same injected charge, the 28 nm process produces a shorter root-level pulse, which lowers the propagation probability relative to 65 nm and shifts the optimal clock-tree partition. These findings underscore the framework’s key innovations: a technology-independent, early-stage VF for ranking critical clock nets; a clock-tree SER model calibrated by measured propagation probabilities; an ECO loop that converts VF insights into concrete hardening actions; and a fully reproducible open-source implementation. 
The paper’s scope is architectural and pre-layout, with extensions to broader circuit classes and a full electrical analysis outlined for future work. Full article

20 pages, 2197 KB  
Article
Perceptual Image Hashing Fusing Zernike Moments and Saliency-Based Local Binary Patterns
by Wei Li, Tingting Wang, Yajun Liu and Kai Liu
Computers 2025, 14(9), 401; https://doi.org/10.3390/computers14090401 - 21 Sep 2025
Abstract
This paper proposes a novel perceptual image hashing scheme that robustly combines global structural features with local texture information for image authentication. The method starts with image normalization and Gaussian filtering to ensure scale invariance and suppress noise. A saliency map is then generated from a color vector angle matrix using a frequency-tuned model to identify perceptually significant regions. Local Binary Pattern (LBP) features are extracted from this map to represent fine-grained textures, while rotation-invariant Zernike moments are computed to capture global geometric structures. These local and global features are quantized and concatenated into a compact binary hash. Extensive experiments on standard databases show that the proposed method outperforms state-of-the-art algorithms in both robustness against content-preserving manipulations and discriminability across different images. Quantitative evaluations based on ROC curves and AUC values confirm its superior robustness–uniqueness trade-off, demonstrating the effectiveness of the saliency-guided fusion of Zernike moments and LBP for reliable image hashing. Full article
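The Local Binary Pattern features used here can be illustrated with a minimal 8-neighbour implementation; the saliency-map input and the scheme's quantization details are simplified away:

```python
def lbp_codes(img):
    """Basic 8-neighbour Local Binary Pattern: each interior pixel gets
    an 8-bit code, one bit per neighbour whose grey level is >= the
    centre value. `img` is a list of equal-length rows."""
    # Neighbour offsets, clockwise from the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = len(img), len(img[0])
    codes = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            centre = img[y][x]
            bits = 0
            for k, (dy, dx) in enumerate(offsets):
                if img[y + dy][x + dx] >= centre:
                    bits |= 1 << k
            row.append(bits)
        codes.append(row)
    return codes
```

A histogram of these codes over the saliency map gives the compact local-texture descriptor that is then fused with the global Zernike moments.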

13 pages, 5006 KB  
Article
Enhancing Heart Rate Detection in Vehicular Settings Using FMCW Radar and SCR-Guided Signal Processing
by Ashwini Kanakapura Sriranga, Qian Lu and Stewart Birrell
Sensors 2025, 25(18), 5885; https://doi.org/10.3390/s25185885 - 20 Sep 2025
Abstract
This paper presents an optimised signal processing framework for contactless physiological monitoring using Frequency Modulated Continuous Wave (FMCW) radar within automotive environments. This research focuses on enhancing heart rate (HR) and heart rate variability (HRV) detection from radar signals by integrating radar placement optimisation and advanced phase-based processing techniques. Optimal radar placement was evaluated through Signal-to-Clutter Ratio (SCR) analysis, conducted with multiple human participants in both laboratory and dynamic driving simulator experimental conditions, to determine the optimal in-vehicle location for signal acquisition. An effective processing pipeline was developed, incorporating background subtraction, range bin selection, bandpass filtering, and phase unwrapping. These techniques facilitated the reliable extraction of inter-beat intervals and heartbeat peaks from the phase signal without the need for contact-based sensors. The framework was evaluated using a Walabot FMCW radar module against ground truth HR signals, demonstrating consistent and repeatable results under baseline and mild motion conditions. In subsequent work, this framework was extended with deep learning methods, where radar-derived HR and HRV were benchmarked against research-grade ECG and achieved over 90% accuracy, further reinforcing the robustness and reliability of the approach. Together, these findings confirm that carefully guided radar positioning and robust signal processing can enable accurate and practical in-cabin physiological monitoring, offering a scalable solution for integration in future intelligent vehicle and driver monitoring systems. Full article
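The phase-unwrapping step in the pipeline can be sketched in one dimension as a naive nearest-multiple correction; real radar pipelines wrap this in additional filtering:

```python
import math

def unwrap(phases):
    """Naive 1-D phase unwrapping: whenever consecutive samples jump by
    more than pi, add or subtract a multiple of 2*pi so that the
    sequence varies smoothly instead of wrapping into (-pi, pi]."""
    unwrapped = [phases[0]]
    for p in phases[1:]:
        d = p - unwrapped[-1]
        # Snap the jump to its nearest 2*pi-equivalent value.
        d -= 2 * math.pi * round(d / (2 * math.pi))
        unwrapped.append(unwrapped[-1] + d)
    return unwrapped
```

The chest-motion-induced phase in FMCW radar accumulates well past ±pi, so unwrapping is what turns the raw wrapped phase into a displacement signal from which heartbeat peaks can be extracted.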

20 pages, 42612 KB  
Article
Progressive Color Correction and Vision-Inspired Adaptive Framework for Underwater Image Enhancement
by Zhenhua Li, Wenjing Liu, Ji Wang and Yuqiang Yang
J. Mar. Sci. Eng. 2025, 13(9), 1820; https://doi.org/10.3390/jmse13091820 - 19 Sep 2025
Abstract
Underwater images frequently exhibit color distortion, detail blurring, and contrast degradation due to absorption and scattering by the underwater medium. This study proposes a progressive color correction strategy integrated with a vision-inspired image enhancement framework to address these issues. Specifically, the progressive color correction process includes adaptive color quantization-based global color correction, followed by guided filter-based local color refinement, aiming to restore accurate colors while enhancing visual perception. Within the vision-inspired enhancement framework, the color-adjusted image is first decomposed into a base layer and a detail layer, corresponding to low- and high-frequency visual information, respectively. Subsequently, detail enhancement and noise suppression are applied in the detail pathway, while global brightness correction is performed in the structural pathway. Finally, results from both pathways are fused to yield the enhanced underwater image. Extensive experiments on four datasets verify that the proposed method effectively handles the aforementioned underwater enhancement challenges and significantly outperforms state-of-the-art techniques. Full article
(This article belongs to the Section Ocean Engineering)
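The guided filter used for local color refinement can be illustrated in one dimension. This follows the standard guided-filter formulation (output locally a linear transform of the guide, fitted to the input); the radius and epsilon are illustrative, and the paper's 2-D, color-channel details are omitted:

```python
def box(xs, r):
    """1-D box (mean) filter with radius r, shrinking at the borders."""
    n = len(xs)
    return [sum(xs[max(0, i - r):min(n, i + r + 1)])
            / (min(n, i + r + 1) - max(0, i - r)) for i in range(n)]

def guided_filter_1d(I, p, r=2, eps=1e-4):
    """1-D guided filter: within each window, fit p ~ a*I + b, then
    average the per-window coefficients so edges in the guide I are
    preserved while p is smoothed."""
    mI, mp = box(I, r), box(p, r)
    mII = box([i * i for i in I], r)
    mIp = box([i * q for i, q in zip(I, p)], r)
    var_I = [a - b * b for a, b in zip(mII, mI)]
    cov_Ip = [a - b * c for a, b, c in zip(mIp, mI, mp)]
    a = [c / (v + eps) for c, v in zip(cov_Ip, var_I)]
    b = [mq - ai * mi for mq, ai, mi in zip(mp, a, mI)]
    ma, mb = box(a, r), box(b, r)
    return [ai * i + bi for ai, bi, i in zip(ma, mb, I)]
```

Because the output is a locally linear function of the guide, structure in the guide survives the smoothing, which is why guided filtering suits edge-aware color refinement.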

30 pages, 1643 KB  
Article
Destination (Un)Known: Auditing Bias and Fairness in LLM-Based Travel Recommendations
by Hristo Andreev, Petros Kosmas, Antonios D. Livieratos, Antonis Theocharous and Anastasios Zopiatis
AI 2025, 6(9), 236; https://doi.org/10.3390/ai6090236 - 19 Sep 2025
Abstract
Large language-model chatbots such as ChatGPT and DeepSeek are quickly gaining traction as an easy, first-stop tool for trip planning because they offer instant, conversational advice that once required sifting through multiple websites or guidebooks. Yet little is known about the biases that shape the destination suggestions these systems provide. This study conducts a controlled, persona-based audit of the two models, generating 6480 recommendations for 216 traveller profiles that vary by origin country, age, gender identity and trip theme. Six observable bias families (popularity, geographic, cultural, stereotype, demographic and reinforcement) are quantified using tourism rankings, Hofstede scores, a 150-term cliché lexicon and information-theoretic distance measures. Findings reveal measurable bias in every bias category. DeepSeek is more likely than ChatGPT to suggest off-list cities and recommends domestic travel more often, while both models still favour mainstream destinations. DeepSeek also points users toward culturally more distant destinations on all six Hofstede dimensions and employs a denser, superlative-heavy cliché register; ChatGPT shows wider lexical variety but remains strongly promotional. Demographic analysis uncovers moderate gender gaps and extreme divergence for non-binary personas, tempered by a “protective” tendency to guide non-binary travellers toward countries with higher LGBTQI acceptance. Reinforcement bias is minimal, with over 90 percent of follow-up suggestions being novel in both systems. These results confirm that unconstrained LLMs are not neutral filters but active amplifiers of structural imbalances. 
The paper proposes a public-interest re-ranking layer, hosted by a body such as UN Tourism, that balances exposure fairness, seasonality smoothing, low-carbon routing, cultural congruence, safety safeguards and stereotype penalties, transforming conversational AI from an opaque gatekeeper into a sustainability-oriented travel recommendation tool. Full article
(This article belongs to the Special Issue AI Bias in the Media and Beyond)

18 pages, 5667 KB  
Article
Verification of Vision-Based Terrain-Referenced Navigation Using the Iterative Closest Point Algorithm Through Flight Testing
by Taeyun Kim, Seongho Nam, Hyungsub Lee and Juhyun Oh
Sensors 2025, 25(18), 5813; https://doi.org/10.3390/s25185813 - 17 Sep 2025
Abstract
Terrain-referenced navigation (TRN) provides an alternative navigation method for environments with limited GPS availability. This paper proposes a vision-based TRN framework that employs stereo imagery and a rotation-invariant iterative closest point (ICP) algorithm to align reconstructed elevation maps with a terrain elevation database. In contrast to conventional ICP, which is sensitive to camera intrinsic errors, the proposed approach improves robustness at high altitudes. Its feasibility and effectiveness are demonstrated through full-scale flight tests using a Cessna aircraft equipped with an IMU, camera, and barometric altimeter. The results show that the proposed method consistently enhances positioning accuracy and robustness compared with a filter-based approach, particularly under challenging high-altitude conditions where image resolution is reduced. The algorithm proved capable of maintaining reliable performance across varying flight altitudes, demonstrating its robustness under high-altitude conditions. This study establishes the novelty of integrating rotation-invariant ICP with vision-based TRN and provides real-world validation through actual flight testing. The findings offer valuable implications for future research and potential applications in unmanned aerial vehicles and long-range guided systems, where passive and GPS-independent navigation is critical for mission success. Full article
(This article belongs to the Section Navigation and Positioning)
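The alignment step at the heart of ICP can be illustrated in 2-D: given matched point pairs, the least-squares rotation and translation have a closed form. This sketch assumes correspondences are already known; full ICP (and the paper's rotation-invariant 3-D variant) iterates nearest-neighbour matching with a step like this:

```python
import math

def align_2d(src, dst):
    """One least-squares rigid alignment step (Kabsch in 2-D): find the
    rotation angle and translation mapping points `src` onto matched
    points `dst`, minimising squared distances."""
    n = len(src)
    csx = sum(p[0] for p in src) / n; csy = sum(p[1] for p in src) / n
    cdx = sum(p[0] for p in dst) / n; cdy = sum(p[1] for p in dst) / n
    s_cos = s_sin = 0.0
    for (ax, ay), (bx, by) in zip(src, dst):
        ax, ay = ax - csx, ay - csy      # centre both point sets
        bx, by = bx - cdx, by - cdy
        s_cos += ax * bx + ay * by       # sum of dot products
        s_sin += ax * by - ay * bx       # sum of cross products
    theta = math.atan2(s_sin, s_cos)     # optimal rotation angle
    tx = cdx - (csx * math.cos(theta) - csy * math.sin(theta))
    ty = cdy - (csx * math.sin(theta) + csy * math.cos(theta))
    return theta, tx, ty
```

In the TRN setting the same idea operates in 3-D between the reconstructed elevation map and the terrain database, with the recovered transform correcting the navigation solution.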

16 pages, 881 KB  
Article
Text-Guided Spatio-Temporal 2D and 3D Data Fusion for Multi-Object Tracking with RegionCLIP
by Youlin Liu, Zainal Rasyid Mahayuddin and Mohammad Faidzul Nasrudin
Appl. Sci. 2025, 15(18), 10112; https://doi.org/10.3390/app151810112 - 16 Sep 2025
Abstract
3D Multi-Object Tracking (3D MOT) is a critical task in autonomous systems, where accurate and robust tracking of multiple objects in dynamic environments is essential. Traditional approaches primarily rely on visual or geometric features, often neglecting the rich semantic information available in textual modalities. In this paper, we propose Text-Guided 3D Multi-Object Tracking (TG3MOT), a novel framework that incorporates Vision-Language Models (VLMs) into the YONTD architecture to improve 3D MOT performance. Our framework leverages RegionCLIP, a multimodal open-vocabulary detector, to achieve fine-grained alignment between image regions and textual concepts, enabling the incorporation of semantic information into the tracking process. To address challenges such as occlusion, blurring, and ambiguous object appearances, we introduce the Target Semantic Matching Module (TSM), which quantifies the uncertainty of semantic alignment and filters out unreliable regions. Additionally, we propose the 3D Feature Exponential Moving Average Module (3D F-EMA) to incorporate temporal information, improving robustness in noisy or occluded scenarios. Furthermore, the Gaussian Confidence Fusion Module (GCF) is introduced to weight historical trajectory confidences based on temporal proximity, enhancing the accuracy of trajectory management. We evaluate our framework on the KITTI dataset and compare it with the YONTD baseline. Extensive experiments demonstrate that although the overall HOTA gain of TG3MOT is modest (+0.64%), our method achieves substantial improvements in association accuracy (+0.83%) and significantly reduces ID switches (−16.7%). These improvements are particularly valuable in real-world autonomous driving scenarios, where maintaining consistent trajectories under occlusion and ambiguous appearances is crucial for downstream tasks such as trajectory prediction and motion planning. The code will be made publicly available. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
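The 3D Feature Exponential Moving Average idea can be sketched as a per-dimension blend of a running feature vector with each new observation; `alpha` is an illustrative decay factor, not the paper's setting:

```python
def ema_update(running, new, alpha=0.9):
    """Exponential moving average over feature vectors: retain a
    fraction `alpha` of the accumulated history and blend in the new
    frame's features, damping noise from occluded or blurry frames."""
    if running is None:                  # first frame initialises the average
        return list(new)
    return [alpha * r + (1.0 - alpha) * n for r, n in zip(running, new)]

track_feature = None
for frame_feature in ([1.0, 0.0], [0.8, 0.2], [0.0, 1.0]):
    track_feature = ema_update(track_feature, frame_feature)
```

Because each update only shifts the running vector by a fraction of the new observation, a single corrupted frame cannot swing the track's appearance model, which is the temporal-robustness role the module plays.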
