MDPI - Publisher of Open Access Journals

40 pages, 27259 KB

Open Access Article

Monocular 3D Position Estimation of a Moving Vehicle Based on a Kalman-Goldschmidt Adaptive Filter

by Diana Kalita, Pavel Lyakhov, Valery Andreev and Denis Butusov

J. Sens. Actuator Netw. 2026, 15(3), 48; https://doi.org/10.3390/jsan15030048 - 18 Jun 2026

Viewed by 119

Determining the 3D position of a vehicle from a 2D image plays a key role in video surveillance, autonomous driving, and spatial localization. However, localization accuracy can degrade significantly under incomplete or synthetic measurement noise and keypoint jitter. In this paper, we propose a new iterative 3D position estimation algorithm (KGA). This algorithm includes geometric correction and calibration steps for converting from 2D to 3D coordinates; trajectory prediction and correction using a Kalman filter; and adaptive tuning of the filter parameters using the Goldschmidt algorithm. Experiments confirm that KGA outperforms the standard (FK) and modified (MFK) Kalman filters in accuracy and convergence speed, demonstrating robustness to various camera angles and noise levels. The novelty of this approach lies in the integration of the Goldschmidt algorithm into the Kalman filter to create an adaptation mechanism that dynamically adjusts the measurement noise covariance based on instantaneous innovation magnitude. Unlike end-to-end deep learning trackers or nonlinear filters (EKF/UKF), KGA is designed as a lightweight post-processing stage that can be seamlessly integrated into existing detection pipelines while maintaining the low computational footprint required for UAV-based edge deployment. The algorithm is of practical value for computer vision systems requiring accurate and robust tracking under varying observational conditions, with the current implementation suitable for offline or buffered processing and clear pathways to real-time deployment through code optimization.
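The abstract does not say how the Goldschmidt algorithm is wired into the filter, so the following is only an illustrative sketch, not the authors' KGA. It assumes a scalar state, uses Goldschmidt's multiplicative iteration for the divisions, and inflates the measurement noise by the normalized squared innovation. The function names and the inflation rule are hypothetical.

```python
import math


def goldschmidt_divide(n: float, d: float, iterations: int = 6) -> float:
    """Approximate n / d for d > 0 with Goldschmidt's multiplicative iteration."""
    if d <= 0.0:
        raise ValueError("d must be positive")
    # Scale the denominator into [0.5, 1): d = m * 2**e.
    m, e = math.frexp(d)
    num = math.ldexp(n, -e)
    den = m
    for _ in range(iterations):
        f = 2.0 - den  # correction factor
        num *= f
        den *= f       # den converges quadratically to 1, so num converges to n / d
    return num


def adaptive_update(x_pred: float, p_pred: float, z: float, r_base: float):
    """One scalar Kalman correction with innovation-scaled measurement noise."""
    nu = z - x_pred            # innovation
    s_base = p_pred + r_base   # nominal innovation variance
    # Hypothetical adaptation rule: inflate R by the normalized squared innovation.
    r_adapt = r_base * (1.0 + goldschmidt_divide(nu * nu, s_base))
    gain = goldschmidt_divide(p_pred, p_pred + r_adapt)
    x_new = x_pred + gain * nu
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new, gain
```

With this rule a large innovation lowers the gain, so an outlying measurement pulls the estimate less than a measurement close to the prediction.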

(This article belongs to the Section Big Data, Computing and Artificial Intelligence)

28 pages, 6306 KB

Open Access Article

A Hybrid Closed-Loop Tracker Fusing a Kalman Filter State Observer for Fast and Robust Embedded Visual Tracking

by Xile Wei, Jiacheng Li and Meili Lu

Electronics 2026, 15(11), 2276; https://doi.org/10.3390/electronics15112276 - 25 May 2026

Viewed by 232

Abstract

Visual object tracking finds extensive application in real-time video analysis on edge devices, yet faces dual challenges: decreased speed due to limited computational resources and weak anti-disturbance capability in complex scenarios. This paper proposes the Hybrid Closed-Loop Tracker (HCLT) to enhance both speed and robustness of embedded visual tracking. HCLT integrates high-precision and high-speed trackers to make real-time performance controllable, while a Kalman filter is employed for state observation and feedback. Within this closed-loop framework, we introduce motion and feature point information as supplementary states and further design mechanisms for adaptive search region adjustment and tracking recovery. These mechanisms effectively mitigate the impact of external disturbances. Experimental results demonstrate that HCLT further improves the speed and robustness of high-performance trackers, achieving high tracking accuracy across multiple public benchmark datasets. It demonstrates excellent anti-disturbance performance, particularly in challenging scenarios such as blur and occlusions, while maintaining frame rates exceeding 35 frames per second (FPS) at 720p resolution when deployed on an RK3588 embedded device, a significant improvement over deep neural network trackers.

(This article belongs to the Special Issue Advances in Visual Tracking: Emerging Techniques and Applications)

23 pages, 10226 KB

Open Access Article

Rotor Attitude Estimation for Spherical Motors Using Geometry-Constrained Kalman Transformer Algorithm in Monocular Vision

by Fucong Liu, Baokaidi Tian, Faqiang Wen, Lei Yu, Tianxiang Yu and Min Li

Sensors 2026, 26(10), 3156; https://doi.org/10.3390/s26103156 - 16 May 2026

Viewed by 357

Abstract

Permanent-magnet spherical motors (PMSpMs) possess three-degree-of-freedom omnidirectional motion characteristics, and rotor attitude estimation (RAE) is essential for closed-loop control. This article proposes a visual RAE method for spherical motors using a Kalman filter and geometric constraint Transformer (GK-TransT). An RAE system was equipped with a monocular area scan camera with a visual feature component (VFC) mounted on the bottom of the rotor. In the proposed GK-TransT algorithm, the Kalman filter is used to enhance the robustness and accuracy of the TransT tracker. To verify the algorithm, a tracking comparison was conducted among the GK-TransT, original TransT, KCF, and CSRT algorithms. The results indicate that the tracking precisions of the proposed GK-TransT algorithm for the main and auxiliary feature points reach 90.9% and 94.4%, respectively, with an average processing speed of 61.23 FPS and a single-frame latency of 16.33 ms. Considering tracking precision, real-time performance, and robustness under occlusion and motion blur, the GK-TransT algorithm is better suited to the RAE of the PMSpM. In addition, an RAE test bench was developed, and the GK-TransT-based method was compared with a micro-electro-mechanical system (MEMS) sensor, using the physical ground truth of a hydraulic rotary table as the benchmark. The comparison results indicate that the GK-TransT-based method achieves a higher accuracy than the MEMS method, which demonstrates the practicability of the proposed method.

(This article belongs to the Section Sensors and Robotics)

20 pages, 7422 KB

Open Access Article

MAAT: A Marine-Aware Adaptive Tracker for Robust and Real-Time Multi-Object Tracking in Maritime Environments

by Xinjie Han, Qi Han, Yunsheng Fan and Dongdong Mu

J. Mar. Sci. Eng. 2026, 14(8), 738; https://doi.org/10.3390/jmse14080738 - 16 Apr 2026

Viewed by 507

Abstract

Multi-object tracking (MOT) is a key technology for enabling autonomous navigation of unmanned surface vehicles (USVs), as it provides continuous perception of surrounding maritime targets and supports navigation decision-making. However, videos acquired on maritime platforms typically suffer from challenges such as platform-induced jitter and nonlinear object motion, which significantly degrade tracking performance. To address these challenges, this paper builds upon ByteTrack by incorporating an adaptive Kalman filtering scheme and proposing a density-aware association strategy, resulting in a novel tracker termed the Marine-Aware Adaptive Tracker (MAAT). Specifically, an adaptive Kalman filter is introduced to increase the contribution of high-confidence detections during the state update process, thereby enhancing the stability and robustness of state estimation. Furthermore, to better mitigate the frequent identity switches caused by severe jitter of the USV observation platform, a density-aware association strategy is proposed. This strategy dynamically adjusts the composition of the cost matrix according to the density of high-confidence targets, enabling more reliable data association under varying scene conditions. Finally, the proposed tracking algorithm is evaluated against several state-of-the-art methods on the Singapore Maritime Dataset. It achieves competitive performance, attaining 44.37 MOTA and 43.857 IDF1. Moreover, MAAT operates in real time, running at 41.4 FPS. The experimental results demonstrate that MAAT is capable of performing accurate and real-time multi-object tracking in dynamic maritime environments with surface fluctuations, thereby providing effective technical support for intelligent maritime surveillance applications.

(This article belongs to the Special Issue New Technologies in Autonomous Ship Navigation)

23 pages, 3588 KB

Open Access Article

Laser-Tracker-Based Robot Pose Measurement Using PSD Spot Sensing and Multi-Sensor Fusion with Simulation Validation

by Suli Wang, Jing Yang and Xiaodan Sang

Micromachines 2026, 17(3), 290; https://doi.org/10.3390/mi17030290 - 26 Feb 2026

Viewed by 921

Abstract

Accurate measurement of robotic pose is indispensable for large-scale precision manufacturing and robotic calibration, particularly because traditional robotic kinematic models often fall short owing to environmental disturbances and structural uncertainties. Laser tracker systems offer high-precision, large-volume measurement capabilities and are therefore appealing as external references for robot pose estimation; however, their practical efficacy is heavily reliant on optical tracking stability, sensor noise levels, and system robustness. This paper introduces a laser tracker-based framework for measuring robot pose, which integrates PSD-based optical spot sensing, multi-sensor fusion, and simulation-based system analysis. A prototype PSD sensing subsystem has been developed utilizing analog signal conditioning, high-speed A/D sampling, and FPGA-based centroid computation. Bench experiments validate the linearity, geometric sensitivity, and robustness of the PSD sensing chain under controlled spot translations and various ambient illumination conditions. Results demonstrate that the PSD response is nearly linear within a ±0.9 mm spot displacement and that the implementation of an interference optical filter significantly enhances measurement repeatability under background light. At the system level, a comprehensive simulation framework is established wherein PSD measurements are fused with inertial and encoder data via an extended Kalman filter. The simulations explore the effects of process noise tuning, time synchronization, systematic error sources, and control strategies on pose estimation accuracy. Ranging-related effects and error-compensation mechanisms are analyzed within the context of modeling and simulation, providing insights into the interferometric ranging principle underlying the complete laser tracker system. The validation of the prototype alongside simulation results demonstrates that PSD-based optical tracking, combined with multi-sensor fusion and layered error compensation, can effectively improve robustness and positional accuracy. The proposed framework offers valuable guidance for the development and phased validation of laser tracker-oriented robot pose measurement systems in complex industrial environments.

(This article belongs to the Special Issue Micro/Nano Optical Devices and Sensing Technology)

17 pages, 2743 KB

Open Access Article

Research on Motion Trajectory Correction Method for Wall-Climbing Robots Based on External Visual Localization System

by Haolei Ru, Meiping Sheng, Fei Gao, Zhanghao Li, Jiahui Qi, Lei Cheng, Kuo Su, Jiahao Zhang and Jiangjian Xiao

Sensors 2026, 26(3), 773; https://doi.org/10.3390/s26030773 - 23 Jan 2026

Viewed by 477

Abstract

To reduce manual operation and enhance the intelligence of the high-altitude maintenance wall-climbing robot during its operation, path planning and autonomous navigation need to be implemented. Due to non-uniform magnetic adhesion between the wall-climbing robot and the steel plate, often caused by variations in steel thickness or surface pitting, the robot may drift from its planned trajectory. To obtain the actual deviation from the expected trajectory, the wall-climbing robot must be located accurately. This allows for the generation of precise control signals, enabling trajectory correction and ensuring high-precision autonomous navigation. Therefore, this paper proposes an external visual localization system based on a pan–tilt laser tracker unit. The system utilizes a zoom camera to track an AprilTag marker and drives the pan–tilt platform, while a laser rangefinder provides high-accuracy distance measurement. The robot’s three-dimensional (3D) pose is ultimately calculated by fusing the visual and ranging data. However, due to the limited tracking speed of the pan–tilt mechanism relative to the robot’s movement, we introduce an Extended Kalman Filter (EKF) to robustly predict the robot’s true spatial coordinates. The robot’s three-dimensional coordinates are periodically compared with the predefined route coordinates to calculate the deviation. This comparison generates closed-loop control signals for the robot’s movement direction and speed. Finally, based on the LoRa communication protocol, closed-loop control of the robot’s movement direction and speed is achieved through the upper-level computer, ensuring that the robot returns to the predefined track. Extensive comparative experiments demonstrate that the localization system achieves stable localization with an accuracy better than 0.025 m on a 6 m × 2.5 m steel structure surface. Based on this high-precision positioning and motion correction, the robot’s motion deviation is kept within 0.1 m, providing a reliable pose reference for precise motion control and high-reliability operation in complex structural environments.

(This article belongs to the Special Issue New Trends in the Sensing and Control Techniques Used for Intelligent Industrial Perception and Service Robotics)

32 pages, 1500 KB

Open Access Article

Communication-Efficient Asynchronous Fusion for Multi-Radar Systems via State and Covariance Projection

by Wenhui Xue, Peng Chen, Chunguo Li, Zhenxin Cao and Shuqin Zhang

Electronics 2026, 15(2), 458; https://doi.org/10.3390/electronics15020458 - 21 Jan 2026

Viewed by 638

Abstract

Multi-radar systems can significantly improve tracking robustness and accuracy, but practical deployments are challenged by asynchronous sensing timestamps across distributed platforms and by limited communication bandwidth. This paper proposes a communication-efficient asynchronous track fusion framework based on state and covariance projection. Each radar performs local Kalman filtering and transmits only a compact track message consisting of the posterior state estimate, the associated error covariance, and a timestamp. At the fusion center, a causal reference time is chosen as the latest received timestamp, and all tracks are projected to this common time using a hybrid constant-acceleration (CA)/constant-velocity (CV) motion model with appropriately discretized process noise, followed by information-form (inverse-covariance) fusion. Under standard linear-Gaussian assumptions, the fusion rule is minimum mean square error (MMSE)-optimal when the projected estimation errors are approximately independent. We also analyze the computational complexity and the communication payload of the proposed procedure. Monte Carlo simulations with five heterogeneous radars and random inter-radar time offsets up to 37.5 ms over 100 runs show that the proposed fusion reduces the steady-state range root mean square error (RMSE) by about 66% and the radial-velocity RMSE by about 31% relative to the average single-radar tracker, while maintaining statistical consistency as verified by the normalized estimation error squared (NEES). These results indicate that projection-based track fusion provides an effective accuracy–communication trade-off for asynchronous multi-radar tracking.
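The project-then-fuse step described in this abstract can be sketched in a few lines. The sketch below is a simplified illustration, not the paper's procedure: it assumes a two-element state [position, velocity], uses only a constant-velocity model (the paper uses a hybrid CA/CV model), and takes a white-noise-acceleration process noise of intensity `q`. All function names are hypothetical.

```python
def project_cv(x, P, dt, q):
    """Project a [position, velocity] track forward by dt under a CV model."""
    pos, vel = x
    (p00, p01), (p10, p11) = P
    # x' = F x with F = [[1, dt], [0, 1]]
    x_proj = [pos + dt * vel, vel]
    # P' = F P F^T + Q, with Q = q * [[dt^3/3, dt^2/2], [dt^2/2, dt]]
    a00 = p00 + dt * (p01 + p10) + dt * dt * p11
    a01 = p01 + dt * p11
    a10 = p10 + dt * p11
    P_proj = [
        [a00 + q * dt**3 / 3.0, a01 + q * dt**2 / 2.0],
        [a10 + q * dt**2 / 2.0, p11 + q * dt],
    ]
    return x_proj, P_proj


def inv2(M):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def fuse_tracks(tracks, q=1.0):
    """Fuse (x, P, t) track messages at the latest timestamp in information form."""
    t_ref = max(t for _, _, t in tracks)  # causal reference time
    Y = [[0.0, 0.0], [0.0, 0.0]]          # information matrix: sum of P^-1
    y = [0.0, 0.0]                        # information vector: sum of P^-1 x
    for x, P, t in tracks:
        x_p, P_p = project_cv(x, P, t_ref - t, q)
        I = inv2(P_p)
        for i in range(2):
            y[i] += I[i][0] * x_p[0] + I[i][1] * x_p[1]
            for j in range(2):
                Y[i][j] += I[i][j]
    P_f = inv2(Y)
    x_f = [P_f[i][0] * y[0] + P_f[i][1] * y[1] for i in range(2)]
    return x_f, P_f, t_ref
```

Summing inverse covariances treats the projected errors as independent, which is the condition under which the abstract states the rule is MMSE-optimal.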

(This article belongs to the Special Issue Challenges and Opportunities in the Internet of Vehicles)

43 pages, 6158 KB

Open Access Article

A Multi-Fish Tracking and Behavior Modeling Framework for High-Density Cage Aquaculture

by Xinyao Xiao, Tao Liu, Shuangyan He, Peiliang Li, Yanzhen Gu, Pixue Li and Jiang Dong

Sensors 2026, 26(1), 256; https://doi.org/10.3390/s26010256 - 31 Dec 2025

Viewed by 998

Abstract

Multi-fish tracking and behavior analysis in deep-sea cages face two critical challenges: first, the homogeneity of fish appearance and low image quality render appearance-based association unreliable; second, standard linear motion models fail to capture the complex, nonlinear swimming patterns (e.g., turning) of fish, leading to frequent identity switches and fragmented trajectories. To address these challenges, we propose SOD-SORT, which integrates a Constant Turn-Rate and Velocity (CTRV) motion model within an Extended Kalman Filter (EKF) framework into DeepOCSORT, a recent observation-centric tracker. Through systematic Bayesian optimization of the EKF process noise (Q), observation noise (R), and ReID weighting parameters, we achieve harmonious integration of advanced motion modeling with appearance features. Evaluations on the DeepBlueI validation set show that SOD-SORT attains IDF1 = 0.829 and reduces identity switches by 13% (93 vs. 107) compared to the DeepOCSORT baseline, while maintaining comparable MOTA (0.737). Controlled ablation studies reveal that naive integration of CTRV-EKF with default parameters degrades performance substantially (identity switches: 172 vs. 107 for the baseline), but careful parameter optimization resolves this motion-appearance conflict. Furthermore, we introduce a statistical quantization method that converts variable-length trajectories into fixed-length feature vectors, enabling effective unsupervised classification of normal and abnormal swimming behaviors in both the Fish4Knowledge coral reef dataset and real-world Deep Blue I cage videos. The proposed approach demonstrates that principled integration of advanced motion models with appearance cues, combined with high-quality continuous trajectories, can support reliable behavior modeling for aquaculture monitoring applications.
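For readers unfamiliar with the CTRV model named above, its state transition can be sketched as follows. This is the textbook form of the model, not code from SOD-SORT; the state layout (x, y, speed, heading, turn rate) and the function name are assumptions, and a full EKF would also need the Jacobian of this transition and a process noise term.

```python
import math


def ctrv_predict(state, dt, eps=1e-6):
    """Propagate a (x, y, v, yaw, yaw_rate) state by dt under the CTRV model."""
    x, y, v, yaw, yaw_rate = state
    if abs(yaw_rate) > eps:
        # Turning: the target moves along a circular arc of radius v / yaw_rate.
        r = v / yaw_rate
        x_new = x + r * (math.sin(yaw + yaw_rate * dt) - math.sin(yaw))
        y_new = y + r * (math.cos(yaw) - math.cos(yaw + yaw_rate * dt))
    else:
        # Near-zero turn rate: fall back to straight-line motion to avoid dividing by ~0.
        x_new = x + v * dt * math.cos(yaw)
        y_new = y + v * dt * math.sin(yaw)
    return (x_new, y_new, v, yaw + yaw_rate * dt, yaw_rate)
```

Unlike a constant-velocity model, the prediction follows the arc through a turn, which is the behavior the abstract says linear models fail to capture.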

(This article belongs to the Special Issue Sensors and Advanced Sensing Techniques for Computer Vision Applications: Second Edition)

22 pages, 3668 KB

Open Access Article

OcclusionTrack: Multi-Object Tracking in Dense Scenes

by Yuzhi Chen, Fanqin Meng and Ziqiu Chen

Appl. Sci. 2025, 15(24), 13030; https://doi.org/10.3390/app152413030 - 10 Dec 2025

Viewed by 1813

Abstract

This paper presents OcclusionTrack (OCCTrack), a robust multi-object tracker designed to address occlusion challenges in dense scenes. Occlusion remains a critical issue in multi-object tracking; despite significant advancements in current tracking methods, dense scenes and frequent occlusions continue to pose formidable challenges for existing tracking-by-detection trackers. Therefore, four key improvements are integrated into a tracking-by-detection paradigm: (1) a confidence-based Kalman filter (CBKF) that dynamically adapts measurement noise to handle partial occlusions; (2) camera motion compensation (CMC) for inter-frame alignment to stabilize predictions; (3) a depth–cascade-matching (DCM) algorithm that uses relative depth to resolve association ambiguities among overlapping objects; and (4) a CMC-detection-based trajectory re-activation method to recover and correct tracks after complete occlusion. Despite relying solely on IoU matching, OCCTrack achieves highly competitive performance on MOT17 (HOTA 64.9, MOTA 80.9, IDF1 79.7), MOT20 (HOTA 63.2, MOTA 76.9, IDF1 77.5), and DanceTrack (HOTA 57.5, MOTA 91.4, IDF1 58.4). The primary contribution of this work lies in the cohesive integration of these modules into a unified, real-time pipeline that systematically mitigates both partial and complete occlusion effects, offering a practical and reproducible framework for complex real-world tracking scenarios.
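The idea of adapting measurement noise to detection confidence can be illustrated with a scalar update. The abstract does not give the CBKF formula, so this sketch assumes one common choice: scaling the base noise by (1 - confidence). The function name and the `r_floor` guard are hypothetical.

```python
def confidence_kalman_update(x_pred, p_pred, z, r_base, confidence, r_floor=1e-3):
    """Scalar Kalman correction whose measurement noise shrinks as confidence grows."""
    c = min(max(confidence, 0.0), 1.0)  # clamp the detector score to [0, 1]
    # Assumed scaling rule: R_eff = (1 - c) * R, floored so the gain stays below 1.
    r_eff = max((1.0 - c) * r_base, r_floor * r_base)
    gain = p_pred / (p_pred + r_eff)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new
```

Under this rule a low-confidence box, such as one from a partially occluded target, moves the state less than a high-confidence one.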

25 pages, 8383 KB

Open Access Article

MemLoTrack: Enhancing TIR Anti-UAV Tracking with Memory-Integrated Low-Rank Adaptation

by Jae Kwan Park and Ji-Hyeong Han

Sensors 2025, 25(23), 7359; https://doi.org/10.3390/s25237359 - 3 Dec 2025

Viewed by 1213

Abstract

Tracking small, fast-moving unmanned aerial vehicles (UAVs) in thermal infrared (TIR) imagery is a significant challenge due to low-resolution targets, dynamic background clutter, and frequent occlusions. To address this, we introduce MemLoTrack, a novel one-stream Vision Transformer tracker that integrates a memory mechanism into a parameter-efficient LoRA framework. MemLoTrack enhances a baseline tracker (LoRAT) with two key components: (i) a gated First-In, First-Out (FIFO) memory bank (MB) for temporal context aggregation and (ii) a lightweight Memory Attention Layer (MAL) for effective information retrieval. A key component of our method is a selective memory update policy, which commits a frame to the memory bank only when it satisfies both a classification confidence threshold (τ) and a Kalman filter-based motion consistency check. This gating mechanism robustly prevents memory contamination due to distractors, occlusions, and reappearance events. Our training is highly efficient, updating only the LoRA adapters, MAL, and prediction head while the pretrained DINOv2 backbone remains frozen. Evaluated on the challenging Anti-UAV410 benchmark, MemLoTrack (L_mem = 7, τ = 0.8) achieves an AUC of 63.6 and a State Accuracy (SA) of 64.0, a significant improvement over the LoRAT baseline of +1.4 AUC and +1.5 SA. Compared to the state-of-the-art method FocusTrack, MemLoTrack demonstrates superior robustness with higher AUC (63.6 vs. 62.8) and SA (64.0 vs. 63.9), at the cost of lower precision (P/P-Norm) scores. Furthermore, MemLoTrack operates at 153 FPS on a single RTX 4070 Ti SUPER, demonstrating that parameter-efficient fine-tuning with a selective memory mechanism is a powerful and deployable strategy for real-time Anti-UAV tracking in demanding TIR environments.
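The selective memory update policy can be pictured as a bounded queue with a two-part gate. The sketch below is an illustration under stated assumptions, not MemLoTrack's implementation: the class name is hypothetical, the motion check is reduced to a pixel distance between a Kalman-predicted center and the observed center, and `max_residual` is an invented parameter. Only the capacity of 7 and the threshold of 0.8 come from the abstract.

```python
import math
from collections import deque


class GatedMemoryBank:
    """FIFO memory that admits a frame only if it passes both gates."""

    def __init__(self, capacity=7, tau=0.8, max_residual=20.0):
        self.frames = deque(maxlen=capacity)  # oldest entry is evicted when full
        self.tau = tau                        # classification confidence threshold
        self.max_residual = max_residual      # assumed motion gate, in pixels

    def maybe_commit(self, feature, confidence, predicted_center, observed_center):
        """Store the frame feature and return True if both gates pass."""
        dx = observed_center[0] - predicted_center[0]
        dy = observed_center[1] - predicted_center[1]
        motion_ok = math.hypot(dx, dy) <= self.max_residual
        if confidence >= self.tau and motion_ok:
            self.frames.append(feature)
            return True
        return False
```

Requiring both gates means a confident detection of a distractor far from the predicted position is still kept out of memory.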

(This article belongs to the Special Issue Vision Sensors for Object Detection and Tracking)

17 pages, 8567 KB

Open Access Article

Multi-Object Tracking with Confidence-Based Trajectory Prediction Scheme

by Kai Yi, Jiarong Li and Yi Zhang

Sensors 2025, 25(23), 7221; https://doi.org/10.3390/s25237221 - 26 Nov 2025

Cited by 1 | Viewed by 2695

Abstract

Multi-Object Tracking (MOT) aims to associate multiple objects across consecutive video sequences and maintain continuous and stable trajectories. Currently, much attention has been paid to data association problems, where many methods filter detection boxes for object matching based on the confidence scores (CS) of the detectors without fully utilizing the detection results. The Kalman filter (KF) is a traditional means for sequential frame processing and has been widely adopted in MOT. It matches and updates a predicted trajectory with a detection box in the video. However, in crowded scenes, noise produces low-confidence detection boxes, causing identity switches (IDS) and tracking failure. In this paper, we thoroughly investigate the limitations of existing trajectory prediction schemes in MOT and prove that KF can still achieve competitive results in video sequence processing if proper care is taken to handle the noise. We propose a confidence-based trajectory prediction scheme (dubbed ConfMOT) based on KF. The CS of the detection results is used to adjust the noise when updating the KF and to predict the trajectories of the tracked objects in videos. A cost matrix (CM) is also constructed to measure the cost of successful matching of unreliable objects. Meanwhile, each trajectory is labeled with a unique CS, and lost trajectories that have not been updated for a long time are removed. Our tracker is simple yet efficient. Extensive experiments have been conducted on mainstream datasets, where our tracker has exhibited superior performance to other advanced competitors.

(This article belongs to the Section Sensing and Imaging)

21 pages, 4837 KB

Open Access Article

IEGS-BoT: An Integrated Detection-Tracking Framework for Cellular Dynamics Analysis in Medical Imaging

by Shuqin Tu, Weidian Chen, Liang Mao, Quan Zhang, Fang Yuan and Jiaying Du

Biomimetics 2025, 10(9), 564; https://doi.org/10.3390/biomimetics10090564 - 24 Aug 2025

Cited by 1 | Viewed by 1299

Abstract

Cell detection-tracking tasks are vital for biomedical image analysis with potential applications in clinical diagnosis and treatment. However, they pose challenges such as ambiguous boundaries and complex backgrounds in microscopic video sequences, leading to missed detection, false detection, and loss of tracking. Therefore, we propose an enhanced multiple object tracking algorithm IEGS-YOLO + BoT-SORT, named IEGS-BoT, to address these issues. Firstly, the IEGS-YOLO detector is developed for cell detection tasks. It uses the iEMA module, which effectively combines the global information to enhance the local information. Then, we replace the traditional convolutional network in the neck of the YOLO11n with GSConv to reduce the computational complexity while maintaining accuracy. Finally, the BoT-SORT tracker is selected to enhance the accuracy of bounding box positioning through camera motion compensation and a Kalman filter. We conduct experiments on the CTMC dataset, and the results show that in the detection phase, the mAP50 (mean Average Precision) and mAP50–95 values are 73.2% and 32.6%, outperforming the YOLO11n detector by 1.1% and 0.6%, respectively. In the tracking phase, using the IEGS-BoT method, the multiple object tracking accuracy (MOTA), higher order tracking accuracy (HOTA), and identification F1 (IDF1) reach 53.97%, 51.30%, and 67.52%, respectively. Compared with the base BoT-SORT, the proposed method achieves improvements of 1.19%, 0.23%, and 1.29% in MOTA, HOTA, and IDF1, respectively. ID switches (IDSW) decrease from 1170 to 894, which demonstrates significant mitigation of identity confusion. This approach effectively addresses the challenges posed by object loss and identity switching in cell tracking, providing a more reliable solution for medical image analysis.

(This article belongs to the Special Issue New Biomimetic Advances in Signal and Image Processing for Biomedical Applications 2025)

30 pages, 7223 KB

Open Access Editor’s Choice Article

Smart Wildlife Monitoring: Real-Time Hybrid Tracking Using Kalman Filter and Local Binary Similarity Matching on Edge Network

by Md. Auhidur Rahman, Stefano Giordano and Michele Pagano

Computers 2025, 14(8), 307; https://doi.org/10.3390/computers14080307 - 30 Jul 2025

Cited by 6 | Viewed by 4254

Abstract

Real-time wildlife monitoring on edge devices poses significant challenges due to limited power, constrained bandwidth, and unreliable connectivity, especially in remote natural habitats. Conventional object detection systems often transmit redundant data of the same animals detected across multiple consecutive frames as part of a single event, resulting in increased power consumption and inefficient bandwidth usage. Furthermore, maintaining consistent animal identities in the wild is difficult due to occlusions, variable lighting, and complex environments. In this study, we propose a lightweight hybrid tracking framework built on the YOLOv8m deep neural network, combining motion-based Kalman filtering with Local Binary Pattern (LBP) similarity for appearance-based re-identification using texture and color features. To handle ambiguous cases, we further incorporate Hue-Saturation-Value (HSV) color space similarity. This approach enhances identity consistency across frames while reducing redundant transmissions. The framework is optimized for real-time deployment on edge platforms such as NVIDIA Jetson Orin Nano and Raspberry Pi 5. We evaluate our method against state-of-the-art trackers using event-based metrics such as MOTA, HOTA, and IDF1, with a focus on occlusion handling, trajectory analysis, and counting of detected animals during both day and night. Our approach significantly enhances tracking robustness, reduces ID switches, and provides more accurate detection and counting compared to existing methods. When transmitting time-series data and detected frames, it achieves up to 99.87% bandwidth savings and 99.67% power reduction, making it highly suitable for edge-based wildlife monitoring in resource-constrained environments.

(This article belongs to the Special Issue Intelligent Edge: When AI Meets Edge Computing)

19 pages, 2267 KB

Open AccessArticle

Closed-Loop Aerial Tracking with Dynamic Detection-Tracking Coordination

by Yang Wang, Heqing Huang, Jiahao He, Dongting Han and Zhiwei Zhao

Drones 2025, 9(7), 467; https://doi.org/10.3390/drones9070467 - 30 Jun 2025

Cited by 2 | Viewed by 1484

Abstract

Aerial tracking is an important service for many Unmanned Aerial Vehicle (UAV) applications. Existing work has failed to provide robust solutions when handling target disappearance, viewpoint changes, and tracking drifts in practical scenarios with limited UAV resources. In this paper, we propose a closed-loop framework integrating three key components: (1) a lightweight adaptive detector with multi-scale feature extraction, (2) spatiotemporal motion modeling through Kalman-filter-based trajectory prediction, and (3) autonomous decision-making through composite scoring of detection confidence, appearance similarity, and motion consistency. By implementing dynamic detection-tracking coordination with quality-aware feature preservation, our system enables real-time operation through performance-adaptive frequency modulation. Evaluated on VOT-ST2019 and OTB100 benchmarks, the proposed method yields marked improvements over baseline trackers, achieving a 27.94% increase in Expected Average Overlap (EAO) and a 10.39% reduction in failure rates, while sustaining a frame rate of 23–95 FPS on edge hardware. The framework achieves rapid target reacquisition during prolonged occlusion scenarios through optimized protocols, outperforming conventional methods in sustained aerial surveillance tasks.
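
The composite scoring in component (3) can be pictured as a weighted combination of the three cues. The sketch below is only one plausible reading: the abstract gives no formula, so the linear form, the weights, the threshold, and the names `composite_score` and `should_redetect` are hypothetical.

```python
def composite_score(det_conf, appearance_sim, motion_consistency,
                    weights=(0.4, 0.3, 0.3)):
    """Weighted sum of three cues, each assumed to lie in [0, 1].

    The weights are illustrative placeholders, not values from the paper.
    """
    w_det, w_app, w_mot = weights
    return (w_det * det_conf
            + w_app * appearance_sim
            + w_mot * motion_consistency)


def should_redetect(score, threshold=0.5):
    """Trigger the detector when tracking quality drops below a threshold.

    The threshold of 0.5 is an assumption made for this example.
    """
    return score < threshold


healthy = composite_score(0.9, 0.8, 0.9)
drifting = composite_score(0.2, 0.1, 0.3)
print(healthy, should_redetect(healthy))
print(drifting, should_redetect(drifting))
```

A decision rule of this shape would let the tracker run alone while the score stays high and call the heavier detector only when it falls, which is one way to read the "dynamic detection-tracking coordination" described above.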

(This article belongs to the Section Drone Design and Development)

20 pages, 119066 KB

Open AccessArticle

Coarse-Fine Tracker: A Robust MOT Framework for Satellite Videos via Tracking Any Point

by Hanru Shi, Xiaoxuan Liu, Xiyu Qi, Enze Zhu, Jie Jia and Lei Wang

Remote Sens. 2025, 17(13), 2167; https://doi.org/10.3390/rs17132167 - 24 Jun 2025

Cited by 1 | Viewed by 1715

Abstract

Traditional Multiple Object Tracking (MOT) methods in satellite videos mostly follow the Detection-Based Tracking (DBT) framework. However, the DBT framework assumes that all objects are correctly recognized and localized by the detector. In practice, the low resolution of satellite videos, small objects, and complex backgrounds inevitably lead to a decline in detector performance. To alleviate the impact of detector degradation on tracking, we propose Coarse-Fine Tracker, a framework that integrates the MOT framework with the Tracking Any Point (TAP) method CoTracker for the first time, leveraging TAP’s persistent point correspondence modeling to compensate for detector failures. In our Coarse-Fine Tracker, we divide the satellite video into sub-videos. For one sub-video, we first use ByteTrack to track the outputs of the detector, referred to as coarse tracking, which involves the Kalman filter and box-level motion features. Given the small size of objects in satellite videos, we treat each object as a point to be tracked. We then use CoTracker to track the center point of each object, referred to as fine tracking, by calculating the appearance feature similarity between each point and its neighboring points. Finally, the Consensus Fusion Strategy eliminates mismatched detections in coarse tracking results by checking their geometric consistency against fine tracking results and recovers missed objects via linear interpolation or linear fitting. This method is validated on the VISO and SAT-MTB datasets. Experimental results in VISO show that the tracker achieves a multi-object tracking accuracy (MOTA) of 66.9, a multi-object tracking precision (MOTP) of 64.1, and an IDF1 score of 77.8, surpassing the detector-only baseline by 11.1% in MOTA while reducing ID switches by 139. Comparative experiments with ByteTrack demonstrate the robustness of our tracking method when the performance of the detector deteriorates.
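
The recovery of missed objects via linear interpolation can be sketched as follows. This is a generic gap-filling routine under the assumption that each object is represented by its center point, as the abstract describes; the function name `interpolate_gap` and its interface are hypothetical and do not come from the paper.

```python
def interpolate_gap(start_frame, start_pt, end_frame, end_pt):
    """Fill missing center points between two observed frames.

    Returns {frame_index: (x, y)} for every frame strictly between
    start_frame and end_frame, assuming constant velocity in the gap.
    """
    if end_frame <= start_frame:
        raise ValueError("end_frame must be greater than start_frame")
    span = end_frame - start_frame
    filled = {}
    for frame in range(start_frame + 1, end_frame):
        t = (frame - start_frame) / span  # fraction of the gap covered
        x = start_pt[0] + t * (end_pt[0] - start_pt[0])
        y = start_pt[1] + t * (end_pt[1] - start_pt[1])
        filled[frame] = (x, y)
    return filled


# Object seen at frame 10 and again at frame 14; frames 11-13 are missing.
recovered = interpolate_gap(10, (0.0, 0.0), 14, (4.0, 8.0))
print(recovered)
```

Linear fitting, the alternative mentioned in the abstract, would instead fit a line to several observed points on either side of the gap; it is not shown here.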

(This article belongs to the Special Issue Target Detection, Recognition, Tracking, and Positioning Using Remote Sensing and AI Techniques)

Search Results (63)
