Joint Adjustment Image Stabilization Method Based on Trajectories of Maritime Multi-Target Detection and Tracking
Abstract
1. Introduction
- Multi-Scale Heterogeneous Convolution Detection Algorithm: A heterogeneous network module that integrates convolutions of different scales is designed. It jointly learns features through target-region, background-region, and strip convolutions, improving the detection accuracy and robustness for weak, small ship targets (90.13% precision and 82.45% recall in the comparative detection experiment).
- Trajectory Correction Method Based on Curve Model Fitting: To address the absence of stable feature points in maritime scenes, a target trajectory fitting method based on curve models is proposed. Under the prior assumption of uniform motion, an equal-division correction strategy is introduced to fit ship target trajectories accurately.
- Joint Stabilization Method with Multi-Target Deviation Consistency Constraint: A joint image stabilization method based on multiple target trajectories is constructed. By imposing a consistency constraint on the deviations of multiple targets within the same frame, the stabilization parameters are solved accurately, which reduces trajectory jitter (the mean absolute deviation falls from 8.5 to 3.4 pixels in the stabilization validation experiment).
2. Related Work
2.1. Small-Target Detection and Tracking in Remote Sensing Images
2.2. Marine Target Detection and Tracking
2.3. Sequence Image Stabilization Technology
2.4. Multi-Target Trajectory Modeling in Image Processing
3. Methodology
3.1. Small-Target Detection and Tracking
3.2. Multi-Target Joint Image Stabilization
- Let t be the time index, running from the start time to the end time of the sequence.
- Let i be the vessel index.
- The observed position denotes the image coordinates of the i-th ship at time t.
- The fitted position denotes the image coordinates of the i-th ship at time t obtained from the quadratic curve model fitting.
- The image offset denotes the offset of the image at time t.
- The ship offset denotes the offset of the i-th ship at time t.
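The quantities defined above can be illustrated with a minimal sketch. It assumes that each coordinate of each ship is fitted independently by a least-squares quadratic in t, that a ship's offset is its observed position minus its fitted position, and that the image offset shared by all ships in a frame is taken as the mean of the ship offsets (the least-squares solution for a single common translation). These choices, and all function and variable names, are illustrative assumptions rather than the authors' exact solver.

```python
# Sketch only: quadratic trajectory fit per ship, then one shared offset per frame.
# fit_quadratic, joint_offsets and the `tracks` layout are hypothetical names.


def solve3(a, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def fit_quadratic(ts, vs):
    """Least-squares coefficients (c0, c1, c2) of v(t) = c0 + c1*t + c2*t^2.

    Requires at least three distinct time values.
    """
    s = [sum(t ** k for t in ts) for k in range(5)]
    rhs = [sum(v * t ** k for t, v in zip(ts, vs)) for k in range(3)]
    normal = [[s[i + j] for j in range(3)] for i in range(3)]
    return solve3(normal, rhs)


def evaluate(coeffs, t):
    """Evaluate the fitted quadratic at time t."""
    c0, c1, c2 = coeffs
    return c0 + c1 * t + c2 * t * t


def joint_offsets(tracks):
    """tracks[i][t] = (x, y), the observed position of ship i at frame t.

    Returns one (dx, dy) per frame: the mean observed-minus-fitted deviation
    over all ships, used here as the shared image offset (an assumption).
    """
    ts = list(range(len(tracks[0])))
    fitted = []
    for track in tracks:
        cx = fit_quadratic(ts, [p[0] for p in track])
        cy = fit_quadratic(ts, [p[1] for p in track])
        fitted.append([(evaluate(cx, t), evaluate(cy, t)) for t in ts])
    n = len(tracks)
    offsets = []
    for t in ts:
        dx = sum(tracks[i][t][0] - fitted[i][t][0] for i in range(n)) / n
        dy = sum(tracks[i][t][1] - fitted[i][t][1] for i in range(n)) / n
        offsets.append((dx, dy))
    return offsets


# Two synthetic ships on smooth paths, with a shared 5-pixel jump in x at frame 3.
smooth = [
    [(10 + 2 * t, 5 + 0.1 * t * t) for t in range(8)],
    [(40 - t, 20 + 3 * t) for t in range(8)],
]
jittered = [
    [(x + (5 if t == 3 else 0), y) for t, (x, y) in enumerate(track)]
    for track in smooth
]
for t, (dx, dy) in enumerate(joint_offsets(jittered)):
    print(t, round(dx, 2), round(dy, 2))
```

Under these assumptions, a stabilized frame would be obtained by shifting frame t by the negative of its estimated offset. Because the fit itself absorbs part of any jitter, the estimated offset at a disturbed frame is somewhat smaller than the true disturbance, and averaging over more ships reduces the influence of per-ship detection noise.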
4. Experiments
4.1. Datasets
- Frame selection: Three consecutive frames are selected from the sequence, ensuring they capture the same geographic area with minimal time intervals. This aims to preserve temporal dynamics while maintaining spatial consistency.
- Stacking operation: The selected frames are stacked along the time dimension to form a three-channel, RGB-like image in which each channel corresponds to one frame. Targets that appear as gray patches in the original frames are thereby transformed into red–green–blue point sequences.
- Sample generation: The stacked images are divided into non-overlapping, fixed-size patches, which collectively constitute the complete dataset. Bounding boxes are annotated based on the synthetic sequence point targets from the three frames. During testing, the center point of each predicted bounding box is used as the target position in the current frame. The dataset contains a total of 260 target sequences, each consisting of 30 frames, and is split into training, validation, and test sets in an 8:1:1 ratio.
- Quality control: Each sample undergoes quality checks to verify the absence of severe artifacts such as cloud cover or sensor saturation. Significantly contaminated samples are excluded to ensure training reliability.
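The stacking, patching, and splitting steps above can be sketched as follows. This is a minimal illustration, assuming frames are plain lists of pixel rows, that incomplete border tiles are discarded, and that the 8:1:1 split is applied to the 260 sequences; the function names and the `patch` parameter are hypothetical, and the patch size used here is a toy value, not the one used in the paper.

```python
# Sketch only: three-frame stacking, non-overlapping patching, and an 8:1:1 split.
# stack_frames, split_patches and split_counts are hypothetical helper names.


def stack_frames(f0, f1, f2):
    """Stack three equally sized grayscale frames into an H x W image of 3-tuples.

    Channel 0 holds the first frame, channel 1 the second, channel 2 the third.
    """
    return [
        [(a, b, c) for a, b, c in zip(r0, r1, r2)]
        for r0, r1, r2 in zip(f0, f1, f2)
    ]


def split_patches(image, patch):
    """Cut non-overlapping patch x patch tiles; incomplete border tiles are dropped."""
    h, w = len(image), len(image[0])
    tiles = []
    for top in range(0, h - patch + 1, patch):
        for left in range(0, w - patch + 1, patch):
            tiles.append([row[left:left + patch] for row in image[top:top + patch]])
    return tiles


def split_counts(n, ratios=(8, 1, 1)):
    """Number of items in the training, validation, and test sets for n items."""
    total = sum(ratios)
    train = n * ratios[0] // total
    val = n * ratios[1] // total
    return train, val, n - train - val


def blank(h, w):
    """An all-zero grayscale frame."""
    return [[0] * w for _ in range(h)]


# A bright point target moving one pixel to the right per frame.
frames = [blank(4, 4) for _ in range(3)]
for k, frame in enumerate(frames):
    frame[0][k] = 255

stacked = stack_frames(*frames)
print(stacked[0][:3])                  # pixels along the target's path
print(len(split_patches(stacked, 2)))  # number of 2 x 2 tiles in a 4 x 4 image
print(split_counts(260))               # sequences per subset
```

In the toy example the moving point occupies a different channel at each position, which is the red–green–blue point sequence described in the stacking step. With 260 sequences, an 8:1:1 split gives 208, 26, and 26 sequences.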
4.2. Experimental Setting
4.3. Evaluation Metrics
4.3.1. Target Detection and Tracking Metrics
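The detection and tracking comparisons report precision, recall, MOTA, and the number of identity switches (IDSw). The sketch below assumes the standard definitions of these metrics apply: precision = TP / (TP + FP), recall = TP / (TP + FN), and MOTA = 1 − (FN + FP + IDSw) / GT, where GT is the total number of ground-truth objects over all frames. The counts in the example are made up for illustration and are not taken from the experiments.

```python
# Sketch only: standard detection and tracking metrics under assumed definitions.


def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def mota(fn, fp, id_switches, num_gt):
    """Multiple Object Tracking Accuracy: 1 - (FN + FP + IDSw) / GT."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (fn + fp + id_switches) / num_gt


# Hypothetical counts, for illustration only.
p, r = precision_recall(tp=90, fp=10, fn=20)
print(f"precision={p:.4f} recall={r:.4f}")
print(f"MOTA={mota(fn=20, fp=10, id_switches=5, num_gt=110):.4f}")
```

MOTA penalizes missed targets, false alarms, and identity switches together, so a tracker can have few identity switches and still score a low MOTA if the underlying detector misses many targets.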
4.3.2. Image Stabilization Deviation Evaluation Metric
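The Absolute columns in the per-frame deviation table are consistent, up to rounding of the displayed components, with the Euclidean norm of the horizontal and vertical deviations, and the Mean row of those columns is consistent with the arithmetic mean of the per-frame absolute deviations. The sketch below assumes that reading of the metric; the function names are hypothetical, and the sample values are the original-image deviations of frames 4 to 6 from the table.

```python
# Sketch only: absolute deviation as the Euclidean norm of (horizontal, vertical).
import math


def absolute_deviation(dx, dy):
    """Absolute deviation in pixels from horizontal and vertical deviations."""
    return math.hypot(dx, dy)


def mean_absolute_deviation(deviations):
    """Arithmetic mean of the absolute deviations of a list of (dx, dy) pairs."""
    return sum(absolute_deviation(dx, dy) for dx, dy in deviations) / len(deviations)


# Original-image deviations of frames 4, 5 and 6 (horizontal, vertical), in pixels.
sample = [(1.0, 3.0), (-3.0, 4.0), (-11.1, 7.0)]
for dx, dy in sample:
    print(dx, dy, round(absolute_deviation(dx, dy), 1))
print(round(mean_absolute_deviation(sample), 1))
```

Note that the horizontal and vertical means are signed, so positive and negative frames cancel; the mean absolute deviation is therefore the more informative summary of residual jitter.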
4.4. Comparative Experiments
4.4.1. Comparative Experiment on Target Detection and Tracking
4.4.2. Image Stabilization Validation Experiment
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest

| Hyperparameter | Value |
|---|---|
| Epoch | 150 |
| Batch Size | 16 |
| Image Size | |
| Initial Learning Rate | 0.001 |
| Optimizer | Adam |

| Task | Method | Precision (%) | Recall (%) |
|---|---|---|---|
| Detection | Faster R-CNN | 83.93 | 78.56 |
| Detection | YOLOv11 | 88.86 | 81.67 |
| Detection | R-DFPN | 89.22 | 81.52 |
| Detection | Our approach | 90.13 | 82.45 |

| Task | Method | MOTA (%) | IDSw (count) |
|---|---|---|---|
| Tracking | DeepSort | 63.59 | 63 |
| Tracking | ByteTrack | 68.44 | 56 |
| Tracking | TrackFormer | 73.26 | 38 |
| Tracking | Our approach | 73.97 | 35 |

| Frame No. | Original Horizontal (px) | Original Vertical (px) | Original Absolute (px) | Stabilized Horizontal (px) | Stabilized Vertical (px) | Stabilized Absolute (px) |
|---|---|---|---|---|---|---|
| 1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2 | 0.0 | −4.0 | 4.0 | 0.9 | −0.8 | 1.2 |
| 3 | 0.0 | −4.0 | 4.0 | 0.6 | −0.5 | 0.8 |
| 4 | 1.0 | 3.0 | 3.2 | 1.2 | −1.0 | 1.6 |
| 5 | −3.0 | 4.0 | 5.0 | −0.4 | 1.8 | 1.8 |
| 6 | −11.1 | 7.0 | 13.1 | 0.2 | 0.3 | 0.4 |
| 7 | −8.1 | 9.0 | 12.1 | 0.5 | 0.2 | 0.6 |
| 8 | −12.1 | 5.0 | 13.1 | 0.8 | 1.2 | 1.4 |
| 9 | −10.1 | −2.0 | 10.3 | −1.1 | 1.3 | 1.7 |
| 10 | −2.0 | −4.0 | 4.5 | −2.1 | 0.5 | 2.2 |
| 11 | −7.0 | −3.0 | 7.7 | −1.2 | 0.8 | 1.4 |
| 12 | −15.1 | 0.0 | 15.1 | −3.3 | 1.3 | 3.6 |
| 13 | −14.1 | 2.0 | 14.2 | −2.6 | 1.8 | 3.1 |
| 14 | −17.1 | 8.0 | 18.9 | −2.9 | 1.4 | 3.2 |
| 15 | −11.1 | 6.0 | 12.6 | −4.3 | 1.1 | 4.4 |
| 16 | −5.0 | 2.0 | 5.4 | −3.8 | 0.9 | 3.9 |
| 17 | 1.0 | −1.0 | 1.4 | −5.4 | 1.8 | 5.7 |
| 18 | −1.0 | −4.0 | 4.1 | −6.1 | 1.8 | 6.3 |
| 19 | −4.0 | −2.0 | 4.5 | −5.8 | 0.8 | 5.9 |
| 20 | −9.1 | 1.0 | 9.1 | −5.6 | −0.1 | 5.6 |
| 21 | −12.1 | 4.0 | 12.7 | −4.5 | −0.9 | 4.6 |
| 22 | −16.1 | 5.0 | 16.9 | −2.5 | −0.7 | 2.6 |
| 23 | 0.0 | 1.0 | 1.0 | −3.6 | −3.4 | 4.9 |
| 24 | 2.0 | 4.0 | 4.5 | −1.7 | −2.0 | 2.6 |
| 25 | 10.1 | 1.0 | 10.1 | 1.1 | −2.6 | 2.8 |
| 26 | 13.1 | 1.0 | 13.1 | 0.9 | −4.1 | 4.1 |
| 27 | 7.0 | 4.0 | 8.1 | 2.5 | −4.5 | 5.2 |
| 28 | 1.0 | 5.0 | 5.1 | 2.1 | −5.9 | 6.3 |
| 29 | −4.0 | 7.0 | 8.1 | 3.7 | −6.2 | 7.2 |
| 30 | 4.0 | 11.0 | 11.7 | 4.0 | −6.7 | 7.8 |
| Mean | −4.1 | 2.2 | 8.5 | −1.2 | −0.8 | 3.4 |

| Object Number | Original Horizontal (px) | Original Vertical (px) | Original Absolute (px) | Stabilized Horizontal (px) | Stabilized Vertical (px) | Stabilized Absolute (px) |
|---|---|---|---|---|---|---|
| ≥3 | −3.8 | 2.1 | 7.9 | −1.2 | −0.7 | 3.2 |
| 2 | −3.7 | 2.0 | 7.7 | −1.6 | −0.7 | 4.5 |
| 1 | −3.9 | 2.1 | 8.1 | −1.6 | −0.6 | 5.2 |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Liu, F.; Li, Y.; Wang, M. Joint Adjustment Image Stabilization Method Based on Trajectories of Maritime Multi-Target Detection and Tracking. Appl. Sci. 2026, 16, 4029. https://doi.org/10.3390/app16084029
