MDPI - Publisher of Open Access Journals

22 pages, 20186 KB

Open AccessArticle

Real-Time Edge-Prior Guided SegFormer for Robust Contour Extraction of Aggregate Particles in Conveyor-Belt Depth Maps

by Jian Shen, Hanye Liu, Zhilin Chen, Xiangnan Zhao and Huijuan Yang

Sensors 2026, 26(10), 3196; https://doi.org/10.3390/s26103196 - 18 May 2026

Accurate contour extraction of aggregate particles from conveyor-belt depth maps is essential for downstream particle counting and size measurement, yet industrial depth data often contains weak discontinuities, missing values, and speckle-like noise. We propose a task-specific geometry-aware contour extraction framework that combines a compact SegFormer encoder with depth-derived priors, a lightweight local branch, edge-prior gated fusion, and full-resolution residual refinement. The input representation consists of normalized depth, Sobel gradient magnitude, and the absolute Laplacian response. On AGG_FULLDATA, the method achieves Optimal Dataset Scale (ODS), Optimal Image Scale (OIS), and Average Precision (AP) values of 0.9607/0.9716/0.9683 under the primary tolerance-based protocol ($\mathrm{tol} = 1$), while retaining an ODS of 0.6476 under strict pixel-exact matching. On External130, a test-only split collected under altered operating conditions using the same sensor, it reaches 0.9580/0.9734/0.9683 without retraining and consistently outperforms the MiT-only baseline. A rigid-object repeatability study based on 30 raw PLY scans shows a mean boundary deviation of 0.335 px, a within-1 px correspondence rate of 97.1%, and a coefficient of variation (CV) of equivalent diameter below 1%, supporting the practical meaning of $\mathrm{tol} = 1$. The full pipeline runs at 48.9 frames per second (FPS) with 3.71M parameters on an NVIDIA GeForce RTX 4060 GPU. Broader robustness to separately controlled operating factors, environmental disturbances, and cross-device settings still requires validation.

(This article belongs to the Section Sensing and Imaging)
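
The three-channel input described in the abstract above (normalized depth, Sobel gradient magnitude, absolute Laplacian response) can be illustrated with a small sketch. This is not the authors' code: the min-max normalization, the replicated-border handling, the 4-neighbour Laplacian kernel, and all function names are assumptions chosen for illustration, and missing depth values are not handled.

```python
import math

# Standard 3x3 Sobel kernels and a 4-neighbour Laplacian kernel.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))
LAPLACIAN = ((0, 1, 0), (1, -4, 1), (0, 1, 0))


def normalize_depth(depth):
    """Min-max scale a 2D depth map (list of rows) to [0, 1] (assumed scheme)."""
    flat = [v for row in depth for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on a flat map
    return [[(v - lo) / span for v in row] for row in depth]


def filter3x3(img, kernel):
    """Correlate img with a 3x3 kernel, replicating border pixels."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    yy = min(max(y + ky - 1, 0), h - 1)
                    xx = min(max(x + kx - 1, 0), w - 1)
                    acc += kernel[ky][kx] * img[yy][xx]
            out[y][x] = acc
    return out


def depth_priors(depth):
    """Return (normalized depth, Sobel magnitude, |Laplacian|) channels."""
    d = normalize_depth(depth)
    gx = filter3x3(d, SOBEL_X)
    gy = filter3x3(d, SOBEL_Y)
    h, w = len(d), len(d[0])
    grad = [[math.hypot(gx[y][x], gy[y][x]) for x in range(w)] for y in range(h)]
    lap = [[abs(v) for v in row] for row in filter3x3(d, LAPLACIAN)]
    return d, grad, lap
```

On a depth map with a single vertical step, both derived channels respond only in the two columns adjacent to the step, which is the kind of edge prior such a representation is meant to expose.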

29 pages, 14315 KB

Open AccessArticle

A Proof-of-Concept Free-Flight Photogrammetric Framework Based on Monocular Vision and Sensor-Group Displacement Fusion

by Enshun Lu, Xin Wan, Wupeng Deng and Xiaofeng Li

Sensors 2026, 26(10), 3177; https://doi.org/10.3390/s26103177 - 17 May 2026

Viewed by 158

Abstract

As unmanned aerial vehicles (UAVs) have increasingly become aerial imaging platforms, the reliance of traditional photogrammetry on ground control points (GCPs) remains a major limitation in complex terrain, confined spaces, and scenarios where control points are difficult to deploy. To address this issue, this study proposes a proof-of-concept framework for free-flight photogrammetry based on the fusion of monocular vision and sensor-group displacement information. The framework employs a rigid point set station-displacement algorithm to compute the exterior orientation elements between adjacent measurement stations, providing a feasible approach for multi-station pose propagation under control-point-free conditions. In addition, a composite weighting strategy incorporating the effects of optical distortion and rigid-body consistency evaluation is developed to improve the rational use of point-set information during station-displacement computation. To evaluate the feasibility of the proposed method, numerical simulations were first conducted to analyze the variation patterns of exterior orientation computation and target-point reconstruction under different sampling intervals and error conditions. Subsequently, an indoor controlled bench-top experimental platform was constructed to physically validate the complete workflow of the proposed method. The bench-top experimental results show that the overall mean three-dimensional positioning error of the two cross-station image pairs was 15.450 mm, and the maximum three-dimensional positioning error was 36.685 mm. The mean absolute distance errors for station 1–station 2 and station 1–station 3 were 9.230 mm and 12.436 mm, respectively. These results indicate that the proposed method can complete station-displacement-based exterior orientation computation and three-dimensional target measurement in a controlled physical scenario, demonstrating clear proof-of-concept significance. It should be noted that UAV measurement experiments under real flight conditions have not yet been completed in this study, and further validation on an actual UAV platform is still required.

(This article belongs to the Section Remote Sensors)

22 pages, 45694 KB

Open AccessArticle

Visual Localization for Deep-Sea Mining Vehicles During Operation

by Yangrui Cheng, Bingkun Wang, Xiaojun Zhuo, Kai Liu and Yingjie Guan

J. Mar. Sci. Eng. 2026, 14(8), 759; https://doi.org/10.3390/jmse14080759 - 21 Apr 2026

Viewed by 366

Abstract

Deep-sea mining operations demand continuous, drift-free positioning over multi-day missions, a requirement that traditional acoustic dead-reckoning systems struggle to meet due to error accumulation and frequent DVL bottom-lock loss in sediment plume environments. Inspired by Google Cartographer’s 2D grid mapping paradigm, we present a prior map-based visual localization framework that decouples offline mapping from real-time localization, fundamentally eliminating drift through absolute image registration against pre-built seabed mosaics. By integrating adaptive keyframe selection, Multi-Scale Retinex (MSR) enhancement, and the AD-LG deep feature matching architecture, our system constructs globally consistent seabed maps for absolute positioning. The framework leverages deformable convolutions and LightGlue to effectively mitigate challenges such as low texture and non-rigid distortion. Quantitative validation on tank simulation datasets demonstrates significant superiority over IMU-only and standard fusion schemes; qualitative deployment on real Pacific CCZ imagery confirms near-real-time operational feasibility on an embedded Jetson Orin NX platform. This system establishes visual navigation as a viable backup to acoustic systems, addressing a critical gap in deep-sea mining vehicle autonomy.

(This article belongs to the Special Issue Advances in Underwater Positioning and Navigation Technology)

33 pages, 10259 KB

Open AccessArticle

Multimodal Remote Sensing Image Classification Based on Dynamic Group Convolution and Bidirectional Guided Cross-Attention Fusion

by Lu Zhang, Yaoguang Yang, Zhaoshuang He, Guolong Li, Feng Zhao, Wenqiang Hua, Gongwei Xiao and Jingyan Zhang

Remote Sens. 2026, 18(7), 1066; https://doi.org/10.3390/rs18071066 - 2 Apr 2026

Viewed by 530

Abstract

The synergistic integration of Hyperspectral Imaging (HSI) and Light Detection and Ranging (LiDAR) data has become a pivotal strategy in remote sensing for precise land-cover classification. However, existing multimodal deep learning frameworks frequently suffer from intrinsic limitations, including rigid feature extraction protocols, underutilization of LiDAR-derived textural information, and asymmetric fusion mechanisms that fail to balance the contribution of spectral and elevation features effectively. To address these challenges, this paper proposes a novel framework named DGC-BCAF, which integrates Dynamic Group Convolution and Bidirectional Guided Cross-Attention Fusion to achieve adaptive feature representation and robust cross-modal interaction. First, a Dynamic Group Convolution (DGConv) module embedded within a ResNet18 backbone is designed to function as the central spatial context extractor. Unlike traditional group convolution, this module learns a dynamic relationship matrix to automatically group input channels, thereby facilitating flexible and context-aware feature representation that adapts to complex spatial distributions. Second, to overcome the insufficient exploitation of elevation data, we introduce a dedicated LiDAR texture encoding branch. This branch innovatively fuses Gray-Level Co-occurrence Matrix (GLCM) statistical features with multi-scale convolutional representations, capturing both geometric height information and fine-grained surface textural details that are critical for distinguishing objects with similar elevations. Finally, central to our architecture is the Bidirectional Cross-Attention Fusion (BCAF) module. Unlike standard unidirectional fusion approaches, BCAF employs LiDAR geometry to guide the selection of salient spectral bands, while simultaneously utilizing spectral signatures to emphasize informative LiDAR channels. This mutual guidance ensures a balanced contribution from both modalities. Extensive experiments conducted on three benchmark datasets (Houston 2013, Trento, and MUUFL) demonstrate that DGC-BCAF consistently outperforms state-of-the-art methods in terms of overall accuracy, average accuracy, and Kappa coefficient. The results confirm that the proposed adaptive grouping and bidirectional guidance strategies significantly improve classification performance, particularly in distinguishing spectrally similar materials and delineating complex urban structures.

17 pages, 1639 KB

Open AccessArticle

Cascade Registration and Fusion for Unaligned Infrared and Visible Images in Autonomous Driving

by Long Xiao, Yidong Xie and Chengda Yao

Electronics 2026, 15(7), 1427; https://doi.org/10.3390/electronics15071427 - 30 Mar 2026

Viewed by 394

Abstract

Infrared and visible image fusion is a critical technology for enhancing the all-weather perception capabilities of autonomous driving systems. However, the inherent physical parallax of vehicle-mounted sensors combined with motion-induced vibrations makes it difficult to achieve strict alignment between the source images. Direct fusion of such misaligned pairs leads to ghosting artifacts, which can significantly compromise driving safety. To address this challenge, this paper proposes a cascaded deep fusion framework tailored for autonomous driving scenarios. A dual-modal perception dataset is first constructed, incorporating realistic physical parallax and non-rigid deformations. Subsequently, a decoupled strategy is established, characterized by geometric correction followed by semantic fusion: the Static-Feature Recursive Registration (SFRR) network is utilized to explicitly correct the spatial misalignments caused by parallax, thereby establishing geometric consistency; then, the Hierarchical Invertible Block Fusion (HIBF) network achieves lossless integration of cross-modal features by combining spatial frequency separation with invertible interaction techniques. Experimental results demonstrate that the proposed method outperforms representative algorithms across several metrics, including Mutual Information (MI), Visual Information Fidelity (VIF), Structural Similarity (SSIM), and Correlation Coefficient (CC), producing high-quality fused images with clear structural definitions.

(This article belongs to the Special Issue Development and Application of Computer Vision and Perception in Vehicles)

40 pages, 2214 KB

Open AccessArticle

A CNN-ViT Hybrid Architecture Res101-MViT-Ens for Accurate and Lightweight Automated Ocular Disease Diagnosis

by Hao Wang, Ting Ke and Hui Lv

Appl. Sci. 2026, 16(6), 2905; https://doi.org/10.3390/app16062905 - 18 Mar 2026

Viewed by 448

Abstract

Automated ocular disease diagnosis faces critical challenges including insufficient diagnostic precision, local–global feature imbalance, rigid feature fusion, weak cross-domain generalization, and difficult lightweight deployment. This study aims to develop a high-performance, generalizable, and deployable hybrid deep learning architecture for accurate multi-class ocular disease diagnosis. We propose the Res101-MViT-Ens hybrid architecture, which fuses ResNet101 for local fine-grained feature extraction and MobileViT-XXS for global contextual modeling via an end-to-end dynamic learnable weight fusion mechanism, with class-balanced sampling and medically adaptive augmentation for data preprocessing. The model is validated on the ODIR-5K dataset and cross-evaluated on three heterogeneous datasets (MESSIDOR-2, Kaggle DR, EyePACS). It achieves 99.44% accuracy, a 99.41% F1-score, and 99.32% Kappa on ODIR-5K, with a 99.46% average cross-dataset accuracy, outperforming state-of-the-art models. With 54 M parameters and 42.6 ms per-image inference latency on the Snapdragon 8 Gen2 edge module (Qualcomm Technologies, Inc., San Diego, CA, USA), it outperforms mainstream edge architectures. This proposed architecture achieves state-of-the-art diagnostic precision; balances accuracy, generalization and practicality; and is suitable for lightweight grassroots deployment in ocular disease screening.

(This article belongs to the Section Computing and Artificial Intelligence)

32 pages, 7690 KB

Open AccessArticle

FSSC-Net: A Frequency–Spatial Self-Calibrated Network for Task-Adaptive Remote Sensing Image Understanding

by Hao Yuan and Bin Zhang

Remote Sens. 2026, 18(5), 824; https://doi.org/10.3390/rs18050824 - 6 Mar 2026

Viewed by 693

Abstract

Although recent studies have achieved remarkable progress in remote sensing image understanding by fusing spatial- and frequency-domain features to leverage their complementary strengths, they still face two key limitations: frequency modeling remains rigid due to static constraints, limiting adaptability, and spatial–frequency fusion often suffers from poor generalization and instability across tasks and network depths. Our experiments reveal that the relative importance of low- and high-frequency components varies dynamically across feature hierarchies and training stages, indicating that frequency information is inherently task-dependent and stage-aware. Motivated by these observations, we propose the Frequency–Spatial Self-Calibrated Network (FSSC-Net), a task-driven framework for adaptive frequency modeling and collaborative spatial–frequency fusion. FSSC-Net incorporates a lightweight, plug-and-play self-calibrated frequency modeling mechanism, comprising a Dynamic Frequency Selection Module and a Task-Guided Calibration Fusion Module. This mechanism adaptively modulates frequency responses via soft masks, enabling dynamic extraction of task-relevant low- and high-frequency components and effective alignment between spatial- and frequency-domain features. Moreover, we present a systematic analysis of frequency importance across tasks and training stages, providing quantitative evidence for the necessity of task-calibrated frequency modeling. Extensive experiments on various benchmarks demonstrate that FSSC-Net consistently outperforms state-of-the-art methods, exhibiting strong task adaptability and robust cross-task generalization.

(This article belongs to the Section Remote Sensing Image Processing)

16 pages, 7688 KB

Open AccessArticle

Vision-Only Localization of Drones with Optimal Window Velocity Fusion

by Seokwon Yeom

Electronics 2026, 15(3), 637; https://doi.org/10.3390/electronics15030637 - 2 Feb 2026

Viewed by 536

Abstract

Drone localization is essential for various purposes such as navigation, autonomous flight, and object tracking. However, this task is challenging when satellite signals are unavailable. This paper addresses database-free vision-only localization of flying drones using optimal window template matching and velocity fusion. Assuming the ground is flat, multiple optimal windows are derived from a piecewise linear segment (regression) model of the image-to-real world conversion function. Each optimal window is used as a fixed region template to estimate the instantaneous velocity of the drone. The multiple velocities obtained from multiple optimal windows are integrated by a hybrid fusion rule: a weighted average for lateral (sideways) velocities, and a winner-take-all decision for longitudinal velocities. In the experiments, a drone performed a total of six medium-range (800 m to 2 km round trip) and high-speed (up to 14 m/s) maneuvering flights in rural and urban areas. The flight maneuvers include forward-backward flights, zigzags, and banked turns. Performance was evaluated by the root mean squared error (RMSE) and drift error between the GNSS-derived ground-truth trajectories and the rigid-body rotated vision-only trajectories. Four fusion rules (simple average, weighted average, winner-take-all, hybrid fusion) were evaluated, and the hybrid fusion rule performed the best. The proposed video stream-based method has been shown to achieve flight errors ranging from a few meters to tens of meters, which corresponds to a few percent of the flight length.

(This article belongs to the Special Issue Recent Research and Applications of Computer Vision and Image Processing)
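
The hybrid fusion rule in the abstract above (weighted average for lateral velocities, winner-take-all for longitudinal velocities) can be sketched as follows. The abstract does not say how the weights or the "winner" are determined, so the per-window confidence score used here for both is purely an assumption, as are the function name and the tuple layout.

```python
def hybrid_fusion(estimates):
    """Fuse per-window velocity estimates with a hybrid rule.

    estimates: list of (lateral, longitudinal, weight) tuples, where weight
    is a hypothetical non-negative confidence score for that window.
    Returns (fused_lateral, fused_longitudinal).
    """
    if not estimates:
        raise ValueError("need at least one window estimate")
    total = sum(w for _, _, w in estimates)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weighted average for the lateral (sideways) component.
    lateral = sum(lat * w for lat, _, w in estimates) / total
    # Winner-take-all for the longitudinal component: keep the estimate
    # from the single most confident window (assumed selection criterion).
    _, longitudinal, _ = max(estimates, key=lambda e: e[2])
    return lateral, longitudinal
```

For example, two windows reporting `(1.0, 10.0)` and `(3.0, 12.0)` m/s with weights 1 and 3 would fuse to a lateral velocity of 2.5 m/s and a longitudinal velocity of 12.0 m/s.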

28 pages, 29386 KB

Open AccessArticle

Dual-Scale Pixel Aggregation Transformer for Change Detection in Multitemporal Remote Sensing Images

by Kai Zhang, Ziqing Wan, Xue Zhao, Feng Zhang, Ke Liu and Jiande Sun

Remote Sens. 2026, 18(3), 422; https://doi.org/10.3390/rs18030422 - 28 Jan 2026

Viewed by 776

Abstract

Transformers have recently been applied to change detection (CD) of multitemporal remote sensing images because of their ability to model global information. However, the rigid patch partitioning in vanilla self-attention destroys spatial structures and consistency in observed scenes, leading to limited CD performance. In this paper, we propose a novel dual-scale pixel aggregation transformer (DSPA-Former) to mitigate this issue. The core of DSPA-Former lies in a dynamic superpixel tokenization strategy and bidirectional dual-scale interaction within the learned feature space, which preserves semantic integrity while capturing long-range dependencies. Specifically, we design a hierarchical decoder that integrates multiscale features through specialized mechanisms for pixel–superpixel dialogue, guided feature enhancement, and adaptive multiscale fusion. By modeling the homogeneous properties of spatial information via superpixel segmentation, DSPA-Former effectively maintains structural consistency and sharpens change boundaries. Comprehensive experiments on the LEVIR-CD, WHU-CD, and CLCD datasets demonstrate that DSPA-Former achieves superior performance compared to state-of-the-art methods, particularly in preserving the structural integrity of complex change regions.

(This article belongs to the Section Remote Sensing Image Processing)

36 pages, 22245 KB

Open AccessArticle

CMSNet: A SAM-Enhanced CNN–Mamba Framework for Damaged Building Change Detection in Remote Sensing Imagery

by Jianli Zhang, Liwei Tao, Wenbo Wei, Pengfei Ma and Mengdi Shi

Remote Sens. 2025, 17(23), 3913; https://doi.org/10.3390/rs17233913 - 3 Dec 2025

Viewed by 1471

Abstract

In war and explosion scenarios, buildings often suffer varying degrees of damage characterized by complex, irregular, and fragmented spatial patterns, posing significant challenges for remote sensing–based change detection. Additionally, the scarcity of high-quality datasets limits the development and generalization of deep learning approaches. To overcome these issues, we propose CMSNet, an end-to-end framework that integrates the structural priors of the Segment Anything Model (SAM) with the efficient temporal modeling and fine-grained representation capabilities of CNN–Mamba. Specifically, CMSNet adopts CNN–Mamba as the backbone to extract multi-scale semantic features from bi-temporal images, while SAM-derived visual priors guide the network to focus on building boundaries and structural variations. A Pre-trained Visual Prior-Guided Feature Fusion Module (PVPF-FM) is introduced to align and fuse these priors with change features, enhancing robustness against local damage, non-rigid deformations, and complex background interference. Furthermore, we construct a new RWSBD (Real-world War Scene Building Damage) dataset based on Gaza war scenes, comprising 42,732 annotated building damage instances across diverse scales, offering a strong benchmark for real-world scenarios. Extensive experiments on RWSBD and three public datasets (CWBD, WHU-CD, and LEVIR-CD+) demonstrate that CMSNet consistently outperforms eight state-of-the-art methods in both quantitative metrics (F1, IoU, Precision, Recall) and qualitative evaluations, especially in fine-grained boundary preservation, small-scale change detection, and complex scene adaptability. Overall, this work introduces a novel detection framework that combines foundation model priors with efficient change modeling, along with a new large-scale war damage dataset, contributing valuable advances to both research and practical applications in remote sensing change detection. Additionally, the strong generalization ability and efficient architecture of CMSNet highlight its potential for scalable deployment and practical use in large-area post-disaster assessment.

(This article belongs to the Special Issue Remote Sensing Image Change Detection and Feature Enhancement Based on Deep Learning)

36 pages, 106084 KB

Open AccessArticle

Critical Factors for the Application of InSAR Monitoring in Ports

by Jaime Sánchez-Fernández, Alfredo Fernández-Landa, Álvaro Hernández Cabezudo and Rafael Molina Sánchez

Remote Sens. 2025, 17(23), 3900; https://doi.org/10.3390/rs17233900 - 30 Nov 2025

Viewed by 1103

Abstract

Ports pose distinctive monitoring challenges due to harsh marine conditions, mixed construction typologies, and heterogeneous ground conditions. These factors complicate the routine use of satellite InSAR, especially when medium-resolution scatterers must be reliably attributed to specific assets for risk and asset management decisions. In current practice, persistent and distributed scatterer (PS/DS) points are often interpreted in map view without an explicit positional uncertainty model or systematic linkage to three-dimensional infrastructure geometry. We present an end-to-end Differential InSAR framework tailored to large ports that fuses medium-resolution Sentinel-1 Level 2 Co-registered Single-Look Complex (L2-CSLC) stacks with high-resolution airborne LiDAR at the post-processing stage. For the Port of Bahía de Algeciras (Spain), we process 123 Sentinel-1A/B images (2020–2022) in ascending and descending geometry using PS/DS time-series analysis with ETAD-like timing corrections and RAiDER tropospheric/ionospheric mitigation. LiDAR is then used to (i) derive look-specific shadow/layover masks and (ii) perform a whitening-transformed nearest-neighbor association that assigns PS/DS points to LiDAR points under an explicit range–azimuth–cross-range (RAC) uncertainty ellipsoid. The RAC standard deviations $(\sigma_{r}, \sigma_{a}, \sigma_{c})$ are derived from the effective CSLC range/azimuth resolution and from empirical height correction statistics, providing a geometry- and data-informed prior on positional uncertainty. Finally, we render dual-geometry red–green composites (ascending to R, descending to G; shared normalization) on the LiDAR point cloud, enabling consistent inspection in plan and elevation. Across asset types, rigid steel/concrete elements (trestles, quay faces, and dolphins) sustain high coherence, small whitened offsets, and stable backscatter in both looks; cylindrical storage tanks are bright but exhibit look-dependent visibility and larger cross-range residuals due to height and curvature; and container yards and vessels show high amplitude dispersion and lower temporal coherence driven by operations. Overall, LiDAR-assisted whitening-based linking reduces effective positional ambiguity and improves structure-specific attribution for most scatterers across the port. The fusion products, geometry-aware linking plus three-dimensional dual-geometry RGB, enhance the interpretability of medium-resolution SAR and provide a transferable, port-oriented basis for integrating deformation evidence into risk and asset management workflows.
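
As a rough illustration of the whitening-transformed nearest-neighbor association described above, the sketch below divides each range, azimuth, and cross-range offset by its standard deviation before measuring distance, so the uncertainty ellipsoid becomes a unit sphere. It assumes that both point sets are already expressed in a common RAC frame and that the uncertainty is axis-aligned (a diagonal covariance); the function name and brute-force search are illustrative only and are not taken from the paper.

```python
import math


def whitened_nearest(ps_point, lidar_points, sigmas):
    """Return (index, whitened distance) of the closest LiDAR point.

    ps_point and each LiDAR point are (range, azimuth, cross_range) tuples
    in a common frame (assumed); sigmas is (sigma_r, sigma_a, sigma_c).
    """
    best_index, best_dist = -1, math.inf
    for i, lp in enumerate(lidar_points):
        # Whitening: scale each offset by its standard deviation so that
        # poorly constrained axes contribute less to the distance.
        d = math.sqrt(
            sum(((p - q) / s) ** 2 for p, q, s in zip(ps_point, lp, sigmas))
        )
        if d < best_dist:
            best_index, best_dist = i, d
    return best_index, best_dist
```

With a large cross-range uncertainty, a LiDAR point that is 5 m away in cross-range can be a better match than one that is 2 m away in range, which a plain Euclidean nearest-neighbor search would not capture.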

25 pages, 6629 KB

Open AccessArticle

A Study of a GNSS/IMU System for Object Localization and Spatial Position Estimation

by Rosen Miletiev, Peter Z. Petkov and Rumen Yordanov

Sensors 2025, 25(22), 6968; https://doi.org/10.3390/s25226968 - 14 Nov 2025

Viewed by 3390

Abstract

Today, navigation systems are commonly used in a variety of applications such as autonomous vehicles, image stabilization, object detection and tracking, and virtual reality (VR) or augmented reality (AR) systems. These systems require not only the precise location but also accurate tracking of the orientation of rigid bodies moving in three-dimensional (3D) space. This study introduces the integration of GNSS and a 10DoF IMU system to solve the navigation task and calculate the object position, attitude, and heading. As the location and attitude calculations require different states but use the same data from the INS sensors, sensor data fusion using two Kalman filters is proposed. Because filter performance depends critically on the initial states, we study in detail the Allan Variance and normal distribution parameters of three different MEMS IMU sensors. The GNSS system performance and statistics are examined using two commercial and three proposed single or dual-band GNSS antennas. An experimental study is conducted, and the Kalman filter (KF) output of the heading angle is compared with other sources.

(This article belongs to the Special Issue Smart Sensing and Control for Autonomous Intelligent Unmanned Systems)
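
The Allan Variance analysis mentioned in the abstract above is not specified further there. As a minimal sketch, the textbook non-overlapping estimator averages the sensor output over clusters of `m` consecutive samples and takes half the mean squared difference between successive cluster means. The function name and the choice of the non-overlapping variant are assumptions; the corresponding averaging time would be `m` times the sampling interval.

```python
def allan_variance(samples, m):
    """Non-overlapping Allan variance for clusters of m samples.

    samples: sensor rate readings at a fixed sampling interval.
    m: number of consecutive samples averaged per cluster.
    """
    if m < 1:
        raise ValueError("cluster size m must be at least 1")
    k = len(samples) // m  # number of complete clusters
    if k < 2:
        raise ValueError("need at least two full clusters")
    means = [sum(samples[i * m:(i + 1) * m]) / m for i in range(k)]
    # Squared differences between successive cluster averages.
    diffs = [(b - a) ** 2 for a, b in zip(means, means[1:])]
    return sum(diffs) / (2 * (k - 1))
```

Evaluating this over a range of cluster sizes and plotting the square root against averaging time gives the usual Allan deviation curve used to characterize MEMS noise terms.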

24 pages, 11432 KB

Open AccessArticle

MRDAM: Satellite Cloud Image Super-Resolution via Multi-Scale Residual Deformable Attention Mechanism

by Liling Zhao, Zichen Liao and Quansen Sun

Remote Sens. 2025, 17(21), 3509; https://doi.org/10.3390/rs17213509 - 22 Oct 2025

Viewed by 1266

Abstract

High-resolution meteorological satellite cloud imagery plays a crucial role in diagnosing and forecasting severe convective weather phenomena characterized by suddenness and locality, such as tropical cyclones. However, constrained by imaging principles and various internal/external interferences during satellite data acquisition, current satellite imagery often fails to meet the spatiotemporal resolution requirements for fine-scale monitoring of these weather systems. Particularly for real-time tracking of tropical cyclone genesis-evolution dynamics and capturing detailed cloud structure variations within cyclone cores, existing spatial resolutions remain insufficient. Therefore, developing super-resolution techniques for meteorological satellite cloud imagery through software-based approaches holds significant application potential. This paper proposes a Multi-scale Residual Deformable Attention Model (MRDAM) based on Generative Adversarial Networks (GANs), specifically designed for satellite cloud image super-resolution tasks considering their morphological diversity and non-rigid deformation characteristics. The generator architecture incorporates two key components: a Multi-scale Feature Progressive Fusion Module (MFPFM), which enhances texture detail preservation and spectral consistency in reconstructed images, and a Deformable Attention Additive Fusion Module (DAAFM), which captures irregular cloud pattern features through adaptive spatial-attention mechanisms. Comparative experiments against multiple GAN-based super-resolution baselines demonstrate that MRDAM achieves superior performance in both objective evaluation metrics (PSNR/SSIM) and subjective visual quality, supporting its suitability for satellite cloud image super-resolution tasks.

(This article belongs to the Special Issue Neural Networks and Deep Learning for Satellite Image Processing)

15 pages, 2983 KB

Open Access Article

A Comparative Study of Five Target Volume Definitions for Radiotherapy in Glioblastoma Multiforme

by Kamuran Ibis, Kubra Ozkaya Toraman, Canan Koksal Akbas, Ozlem Guler Guniken, Korhan Kokce, Sezi Ceren Gunay, Rasim Meral and Musa Altun

Medicina 2025, 61(10), 1860; https://doi.org/10.3390/medicina61101860 - 16 Oct 2025

Viewed by 1994

Abstract

Background and Objectives: This study aimed to compare target volumes and organ-at-risk (OAR) doses using five different volume definitions in radiotherapy (RT) planning of patients with glioblastoma multiforme (GBM). Materials and Methods: Rigid image fusion was performed using simulation computed tomography and postoperative magnetic resonance imaging scans of 20 patients with GBM. Volumetric modulated arc therapy (VMAT) plans were generated according to five protocols, each delivering a total dose of 60 Gy: three two-phase protocols, namely American Brain Tumor Consortium (ABTC), North Central Cancer Treatment Group/Alliance (NCCTG/Alliance), and Radiation Therapy Oncology Group/NRG (RTOG/NRG), and two single-phase protocols, namely European Organisation for Research and Treatment of Cancer (EORTC) and European Society for Radiotherapy and Oncology–European Association of Neuro-Oncology (ESTRO/EANO). OARs and dose constraints were evaluated. Statistical analysis was performed using the paired-samples t-test. Results: The ESTRO/EANO volume had the smallest median PTV overall (p < 0.001). The lowest brain-PTV Dmean in the initial phase was observed in the ABTC group, followed closely by ESTRO/EANO (p < 0.001). Among boost volumes, the ABTC volume was the smallest, and the median brain-PTV Dmean was lowest in the ESTRO/EANO volume. ESTRO/EANO provided the lowest doses for contralateral and ipsilateral cochlea Dmean, brainstem D1cc, and contralateral lens Dmax. Notably, both EORTC and ESTRO/EANO plans maintained OAR doses within acceptable constraints, with ESTRO/EANO achieving the most consistently minimised exposure. Conclusions: Given the reduced irradiated brain volume, acceptable OAR preservation, and practical applicability, the use of ESTRO/EANO and EORTC target volumes in radiotherapy of glioblastoma multiforme may provide dosimetric advantages that require further validation in clinical outcome studies.

(This article belongs to the Special Issue High-Grade Gliomas: Updates and Challenges)

9 pages, 2155 KB

Open Access Review

Esophageal Injury in Patients with Ankylosing Spondylitis After Cervical Spine Trauma: Our Case Series and Narrative Review

by Nenad Koruga, Alen Rončević, Mario Špoljarić, Tomislav Ištvanić, Stjepan Ištvanić, Vedran Farkaš, Klemen Grabljevec, Anđela Grgić, Tatjana Rotim, Tajana Turk, Domagoj Kretić and Anamarija Soldo Koruga

Medicina 2025, 61(10), 1855; https://doi.org/10.3390/medicina61101855 - 16 Oct 2025

Cited by 1 | Viewed by 1371

Abstract

Introduction: Ankylosing spondylitis (AS) is a chronic inflammatory disorder that causes progressive ossification and fusion of the spine, particularly in the cervical region. This results in a rigid spinal column that is highly susceptible to unstable fractures, even after low-energy trauma. Cervical fractures in AS are often complex, extending through multiple spinal segments, and are associated with a high risk of neurological compromise. Esophageal injury associated with such fractures is rare but clinically significant, as the proximity of the esophagus to the cervical spine makes it vulnerable to direct trauma, delayed perforation, or secondary damage from fracture displacement and hardware failure. Aim: The purpose of this review is to highlight the clinical relevance of esophageal injury in cervical spine trauma among patients with AS, emphasizing the diagnostic challenges and surgical treatment in order to improve outcomes. Results: Esophageal injuries in the context of AS-related cervical trauma are frequently overlooked due to subtle clinical manifestations such as dysphagia, subcutaneous emphysema, or covert signs of mediastinitis. Plain radiographs are insufficient to identify such complications; advanced imaging modalities are often required for detection. Management is complex and usually demands a multidisciplinary approach, involving both stabilization of the cervical spine and repair of the esophagus. Despite treatment efforts, these patients remain at increased risk for morbidity and mortality, mainly due to infection and sepsis. Conclusions: Esophageal injury in cervical spine trauma associated with AS is an uncommon but life-threatening condition. Early recognition, comprehensive radiologic evaluation, and careful surgical planning are crucial for optimal management. Heightened clinical suspicion and awareness of this rare complication are essential to improve diagnostic accuracy and patient outcomes.

(This article belongs to the Section Neurology)
