A Ship Incremental Recognition Framework via Unknown Extraction and Joint Optimization Learning
Highlights
- An open-world detection framework for remote-sensing ship targets is proposed.
- An unknown-target extraction module based on tail-distribution modeling is proposed, which accurately distinguishes unknown ships from known classes.
- A joint-optimization-based learning module is proposed that enables incremental recognition of new-class samples while significantly alleviating catastrophic forgetting of known classes.
- The proposed method offers a practical route to discovering new types of maritime targets in complex scenarios, supporting maritime-safety and traffic-management applications.
Abstract
1. Introduction
- An open-world detection framework for remote-sensing ship targets is proposed. Supported by an adaptive unknown-rejection threshold and an incremental learning mechanism, it offers a new solution to the continual evolution of ship categories and the unpredictability of complex real-world scenarios;
- A Fine-Grained Feature and Extreme Value-Based Unknown Recognition Module (FEUR) is designed, which achieves precise detection and effective differentiation of unknown ship targets by capturing subtle differences between ship categories and combining them with tail-distribution modeling;
- A Joint Optimization-Based Incremental Learning (JOIL) module is proposed, which applies hierarchical parameter constraints to adjust the backbone network and detection head differentially. This enables incremental recognition from only a small number of labeled new-class samples, while significantly alleviating catastrophic forgetting of known categories.
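The hierarchical parameter constraint in JOIL is in the spirit of elastic weight consolidation (EWC): parameters that were important for old classes are anchored by their Fisher information. As a rough illustration only (the function name, the backbone/head split, and the two penalty weights are assumptions, not the paper's implementation), such a differentiated penalty can be sketched as:

```python
def ewc_penalty(params, old_params, fisher, lam_backbone, lam_head, head_keys):
    """Hierarchical EWC-style penalty (illustrative sketch).

    Deviations from the pre-update parameters are penalized in proportion
    to their Fisher information, with a stronger weight (lam_backbone) on
    backbone parameters and a weaker one (lam_head) on detection-head
    parameters, so the head can adapt to new classes more freely.
    """
    total = 0.0
    for name, theta in params.items():
        lam = lam_head if name in head_keys else lam_backbone
        for t, t_old, f in zip(theta, old_params[name], fisher[name]):
            total += 0.5 * lam * f * (t - t_old) ** 2
    return total
```

During incremental training this term would be added to the new-class detection loss, so backbone weights important for known categories move little while the detection head is adjusted with more latitude.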
2. Related Works
2.1. Ship Object Detection
2.2. Unknown Identification and Open World Detection
2.3. Incremental Learning
3. Methods
3.1. Overall Architecture of the Proposed Method
3.2. Fine-Grained Feature and Extreme Value–Based Unknown Recognition Module
3.3. Joint Optimization–Based Incremental Learning Module
4. Experiments
4.1. Datasets and Evaluation Metrics
4.2. Implementation Details
4.3. Ablation Analysis
- When Tailsize is 10, the A-OSE and WI metrics are at their lowest, indicating the lowest risk of the model misclassifying unknown categories as known ones;
- As Tailsize increases, UDR gradually improves, because a larger Tailsize covers more tail samples and thus raises sensitivity to outliers (unknown categories). In the extreme case where Tailsize equals the total number of samples, the model treats all targets as outliers: UDR = 1, but the mAP on known categories drops to 0, which has no practical value;
- When Tailsize is 10, the model achieves the highest UDP, indicating that confidence calibration for unknown-category detection is best at this setting;
- The dynamic Tailsize selection strategy (Tailsize = 5% of the samples) shows no significant performance advantage over a fixed value.
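Tailsize controls how many of the largest per-class distance values enter the tail-distribution model. The sketch below uses hypothetical function names and substitutes a plain empirical tail for the per-class Weibull fit of EVT-style methods, purely to make Tailsize's role concrete: a larger tail sample covers more borderline cases, which mirrors the UDR trend above.

```python
def tail_model(distances, tailsize):
    """Keep the `tailsize` largest feature-to-class-mean distances as the
    tail sample (a stand-in for fitting a Weibull to these values, as
    EVT/OpenMax-style methods do)."""
    return sorted(distances)[-tailsize:]


def unknown_score(d, tail):
    """Empirical tail CDF at distance d: a value near 1 means d lies beyond
    most of the known-class tail, so the detection is flagged as a
    candidate unknown."""
    return sum(1 for t in tail if t <= d) / len(tail)
```

With a fixed rejection threshold on the score, enlarging Tailsize admits smaller distances into the tail and so pushes more detections toward the unknown label, trading known-class mAP for unknown recall exactly as the ablation shows.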
4.4. Algorithm Performance Comparison
4.4.1. Unknown Detection Experiment Results
4.4.2. Incremental Learning Experimental Results
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Prasad, D.K.; Rajan, D.; Rachmawati, L.; Rajabally, E.; Quek, C. Video processing from electro-optical sensors for object detection and tracking in a maritime environment: A survey. IEEE Trans. Intell. Transp. Syst. 2017, 18, 1993–2016.
- Tang, B.; Lu, R.; Yang, X.; Li, Y.; Li, Y.; Zhang, D.; Chen, S. R2PLoc: A Region-to-Point UAV Visual Geo-Localization Framework Leveraging Hierarchical Semantic Representation. IEEE Trans. Geosci. Remote Sens. 2025, 63, 5643818.
- Hu, J.; Wei, Y.; Chen, W.; Zhi, X.; Zhang, W. CM-YOLO: Typical object detection method in remote sensing cloud and mist scene images. Remote Sens. 2025, 17, 125.
- Yao, Y.; Jiang, Z.; Zhang, H.; Zhao, D.; Cai, B. Ship detection in optical remote sensing images based on deep convolutional neural networks. J. Appl. Remote Sens. 2017, 11, 042611.
- Zhang, R.; Yao, J.; Zhang, K.; Feng, C.; Zhang, J. S-CNN-based ship detection from high-resolution remote sensing images. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2016, 41, 423–430.
- Zou, Z.; Shi, Z. Ship detection in spaceborne optical image with SVD networks. IEEE Trans. Geosci. Remote Sens. 2016, 54, 5832–5845.
- Zhou, K.; Zhang, M.; Wang, H.; Tan, J. Ship detection in SAR images based on multi-scale feature extraction and adaptive feature fusion. Remote Sens. 2022, 14, 755.
- Zhuo, Z.; Lu, R.; Yao, Y.; Wang, S.; Zheng, Z.; Zhang, J.; Yang, X. TAF-YOLO: A Small-Object Detection Network for UAV Aerial Imagery via Visible and Infrared Adaptive Fusion. Remote Sens. 2025, 17, 3936.
- Zhou, Y.; Zhu, Y.; Ren, H.; Kang, J.; Zou, L.; Wang, X. Refined Multi-modal Feature Learning Framework for Marine Target Detection Using Radar Sensor. Digit. Signal Process. 2025, 170, 105816.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 39, 1137–1149.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
- Dosovitskiy, A. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
- Joseph, K.; Khan, S.; Khan, F.S.; Balasubramanian, V.N. Towards open world object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 5830–5840.
- Peng, C.; Zhao, K.; Wang, T.; Li, M.; Lovell, B.C. Few-shot class-incremental learning from an open-set perspective. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; Springer: Cham, Switzerland, 2022; pp. 382–397.
- Shmelkov, K.; Schmid, C.; Alahari, K. Incremental learning of object detectors without catastrophic forgetting. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 3400–3409.
- Kirkpatrick, J.; Pascanu, R.; Rabinowitz, N.; Veness, J.; Desjardins, G.; Rusu, A.A.; Milan, K.; Quan, J.; Ramalho, T.; Grabska-Barwinska, A.; et al. Overcoming catastrophic forgetting in neural networks. Proc. Natl. Acad. Sci. USA 2017, 114, 3521–3526.
- Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common objects in context. In Proceedings of the European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; Springer: Cham, Switzerland, 2014; pp. 740–755.
- Everingham, M.; Van Gool, L.; Williams, C.K.; Winn, J.; Zisserman, A. The PASCAL Visual Object Classes (VOC) challenge. Int. J. Comput. Vis. 2010, 88, 303–338.
- Er, M.J.; Zhang, Y.; Chen, J.; Gao, W. Ship detection with deep learning: A survey. Artif. Intell. Rev. 2023, 56, 11825–11865.
- Zhang, Z.; Zhang, L.; Wang, Y.; Feng, P.; He, R. ShipRSImageNet: A large-scale fine-grained dataset for ship detection in high-resolution optical remote sensing images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 8458–8472.
- Bentes, C.; Velotto, D.; Tings, B. Ship classification in TerraSAR-X images with convolutional neural networks. IEEE J. Ocean. Eng. 2017, 43, 258–266.
- Liu, G.; Zhang, Y.; Zheng, X.; Sun, X.; Fu, K.; Wang, H. A new method on inshore ship detection in high-resolution satellite images using shape and context information. IEEE Geosci. Remote Sens. Lett. 2013, 11, 617–621.
- Zhang, S.; Wu, R.; Xu, K.; Wang, J.; Sun, W. R-CNN-based ship detection from high resolution remote sensing imagery. Remote Sens. 2019, 11, 631.
- Kanjir, U.; Greidanus, H.; Oštir, K. Vessel detection and classification from spaceborne optical images: A literature survey. Remote Sens. Environ. 2018, 207, 1–26.
- Hu, J.; Li, Y.; Zhi, X.; Shi, T.; Zhang, W. Complementarity-aware Feature Fusion for Aircraft Detection via Unpaired Opt2SAR Image Translation. IEEE Trans. Geosci. Remote Sens. 2025, 63, 5628019.
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. In Proceedings of the 26th International Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA, 3–6 December 2012; Volume 25.
- Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988.
- Tian, Z.; Shen, C.; Chen, H.; He, T. FCOS: Fully convolutional one-stage object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9627–9636.
- Carion, N.; Massa, F.; Synnaeve, G.; Usunier, N.; Kirillov, A.; Zagoruyko, S. End-to-end object detection with transformers. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; Springer: Cham, Switzerland, 2020; pp. 213–229.
- Yu, Y.; Yang, X.; Li, J.; Gao, X. A cascade rotated anchor-aided detector for ship detection in remote sensing images. IEEE Trans. Geosci. Remote Sens. 2020, 60, 5600514.
- Su, H.; He, Y.; Jiang, R.; Zhang, J.; Zou, W.; Fan, B. DSLA: Dynamic smooth label assignment for efficient anchor-free object detection. Pattern Recognit. 2022, 131, 108868.
- Ren, Z.; Tang, Y.; Yang, Y.; Zhang, W. SASOD: Saliency-aware ship object detection in high-resolution optical images. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5611115.
- Bendale, A.; Boult, T.E. Towards open set deep networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1563–1572.
- Gupta, A.; Narayan, S.; Joseph, K.; Khan, S.; Khan, F.S.; Shah, M. OW-DETR: Open-world detection transformer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 9235–9244.
- Wu, Z.; Lu, Y.; Chen, X.; Wu, Z.; Kang, L.; Yu, J. UC-OWOD: Unknown-classified open world object detection. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; Springer: Cham, Switzerland, 2022; pp. 193–210.
- Ma, S.; Wang, Y.; Wei, Y.; Fan, J.; Li, T.H.; Liu, H.; Lv, F. CAT: Localization and identification cascade detection transformer for open-world object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 19681–19690.
- Zohar, O.; Wang, K.C.; Yeung, S. PROB: Probabilistic objectness for open world object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 11444–11453.
- Du, X.; Wang, Z.; Cai, M.; Li, Y. VOS: Learning what you don’t know by virtual outlier synthesis. arXiv 2022, arXiv:2202.01197.
- Liang, W.; Xue, F.; Liu, Y.; Zhong, G.; Ming, A. Unknown sniffer for object detection: Don’t turn a blind eye to unknown objects. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 3230–3239.
- Li, Z.; Hoiem, D. Learning without forgetting. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 2935–2947.
- Rebuffi, S.A.; Kolesnikov, A.; Sperl, G.; Lampert, C.H. iCaRL: Incremental classifier and representation learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2001–2010.
- Peng, C.; Zhao, K.; Lovell, B.C. Faster ILOD: Incremental learning for object detectors based on Faster R-CNN. Pattern Recognit. Lett. 2020, 140, 109–115.






| Tailsize | A-OSE | WI | UDR | UDP |
|---|---|---|---|---|
| 5 | 21 | 0.0087 | 0.7059 | 0.4167 |
| 10 | 20 | 0.0086 | 0.8235 | 0.5238 |
| 15 | 21 | 0.0090 | 0.8431 | 0.5116 |
| 5% | 22 | 0.0093 | 0.7647 | 0.4359 |

| Category | Tailsize = 5 | Tailsize = 10 | Tailsize = 15 | Tailsize = 5% |
|---|---|---|---|---|
| Unknown (CRS) | 1.1 | 1.7 | 1.8 | 0.9 |
| AC | 87.2 | 85.3 | 85.3 | 85.3 |
| AAS | 82.7 | 79.8 | 78.2 | 78.2 |
| ATD | 81.6 | 79.7 | 79.7 | 79.7 |
| CV | 62.3 | 47.1 | 44.1 | 40.9 |
| CR | 65.7 | 62.1 | 62.1 | 62.1 |
| DD | 74.4 | 63.7 | 62.9 | 62.3 |
| FF | 87.5 | 76.4 | 75.0 | 73.9 |
| LS | 65.3 | 58.8 | 57.3 | 56.5 |
| LCS | 77.2 | 77.2 | 77.2 | 77.2 |
| MWV | 44.8 | 27.8 | 22.4 | 22.1 |
| PV | 67.5 | 62.1 | 61.8 | 59.4 |
| RO | 77.8 | 68.6 | 67.1 | 66.1 |
| SUB | 76.7 | 75.5 | 75.5 | 73.2 |
| CTS | 87.4 | 87.4 | 87.4 | 87.4 |
| OT | 77.0 | 75.0 | 72.6 | 73.9 |
| SC | 77.4 | 79.2 | 77.4 | 77.4 |
| mAP (w/o CRS) | 74.5 | 69.1 | 67.9 | 67.2 |

| Category |  |  |  |  |  |  |
|---|---|---|---|---|---|---|
| CRS | 64.5 | 63.0 | 63.6 | 64.9 | 54.5 | 0.0 |
| AC | 86.2 | 86.9 | 87.3 | 89.0 | 89.6 | 89.3 |
| AAS | 85.0 | 88.7 | 92.2 | 91.9 | 92.7 | 93.1 |
| ATD | 56.4 | 69.6 | 73.7 | 79.3 | 85.4 | 85.3 |
| CV | 72.3 | 72.7 | 72.7 | 72.6 | 73.9 | 73.8 |
| CR | 71.6 | 72.9 | 74.9 | 76.7 | 81.8 | 81.8 |
| DD | 79.1 | 79.5 | 80.0 | 80.6 | 82.8 | 83.5 |
| FF | 86.9 | 87.1 | 87.0 | 87.1 | 87.2 | 87.3 |
| LS | 70.0 | 71.1 | 71.4 | 72.7 | 78.1 | 75.5 |
| LCS | 89.0 | 88.6 | 86.7 | 87.0 | 87.3 | 86.1 |
| MWV | 47.4 | 51.0 | 51.3 | 52.7 | 54.0 | 53.2 |
| PV | 76.2 | 78.0 | 78.7 | 78.6 | 78.6 | 79.1 |
| RO | 55.6 | 65.3 | 75.0 | 83.2 | 86.7 | 86.7 |
| SUB | 81.2 | 87.3 | 81.3 | 81.2 | 88.6 | 88.4 |
| CTS | 82.8 | 85.4 | 89.3 | 90.6 | 94.1 | 94.6 |
| OT | 74.1 | 79.5 | 80.3 | 87.0 | 85.6 | 81.8 |
| SC | 71.9 | 82.7 | 85.3 | 85.8 | 87.7 | 89.7 |
| mAP | 73.5 | 77.0 | 78.3 | 80.1 | 81.7 | 78.2 |

| Methods | A-OSE | WI | UDR | UDP |
|---|---|---|---|---|
| Baseline method | 48 | 0.0151 | 0.5410 | 0.0002 |
| ORE | 45 | 0.0142 | 0.5820 | 0.2215 |
| OW-DETR | 42 | 0.0135 | 0.6842 | 0.2850 |
| UC-OWOD | 38 | 0.0126 | 0.7105 | 0.3120 |
| PROB | 32 | 0.0110 | 0.7350 | 0.3980 |
| VOS | 35 | 0.0118 | 0.7420 | 0.3640 |
| UnSniffer | 29 | 0.0095 | 0.7850 | 0.4510 |
| Ours | 20 | 0.0086 | 0.8235 | 0.5238 |
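The UDR and UDP columns can be read as recall- and precision-like quantities for unknown objects. Under one common pair of definitions (the paper's exact formulas may differ), they reduce to simple ratios over detection counts:

```python
def udr_udp(n_gt_unknown, n_localized_unknown, n_correct_unknown):
    """Unknown Detection Recall and Precision (one common formulation).

    UDR: fraction of ground-truth unknown objects that are localized at all,
    even if mislabeled as a known class.
    UDP: among those localized unknowns, the fraction actually labeled
    'unknown'.
    """
    udr = n_localized_unknown / n_gt_unknown if n_gt_unknown else 0.0
    udp = n_correct_unknown / n_localized_unknown if n_localized_unknown else 0.0
    return udr, udp
```

Read this way, the table shows the proposed method both finds more of the unknown ships (higher UDR) and labels a larger share of them correctly as unknown (higher UDP) than the compared detectors.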

| Category | Full Fine-Tuning | EWC | Our Method | Sample Storage | EWC + Sample Storage | Our Method + Sample Storage | Original Network |
|---|---|---|---|---|---|---|---|
| CRS | 58.0 | 54.5 | 63.6 | 63.6 | 36.4 | 91.3 | 0.0 |
| AC | 67.3 | 89.6 | 89.6 | 89.5 | 89.3 | 89.0 | 95.6 |
| AAS | 65.3 | 92.7 | 93.4 | 94.0 | 93.7 | 89.8 | 95.5 |
| ATD | 47.0 | 85.4 | 85.7 | 86.7 | 85.8 | 87.3 | 88.0 |
| CV | 68.1 | 73.9 | 73.4 | 74.8 | 74.3 | 73.6 | 74.6 |
| CR | 68.3 | 81.8 | 81.4 | 82.1 | 82.3 | 82.0 | 83.7 |
| DD | 76.3 | 82.8 | 83.0 | 84.1 | 83.8 | 83.4 | 84.9 |
| FF | 86.2 | 87.2 | 87.3 | 87.3 | 87.3 | 86.6 | 87.5 |
| LS | 62.0 | 78.1 | 75.4 | 79.7 | 79.8 | 75.4 | 81.6 |
| LCS | 85.0 | 87.3 | 87.6 | 88.6 | 87.6 | 88.0 | 89.8 |
| MWV | 41.9 | 54.0 | 54.6 | 61.6 | 60.6 | 51.6 | 59.2 |
| PV | 73.2 | 78.6 | 78.7 | 78.9 | 79.0 | 72.5 | 78.3 |
| RO | 44.7 | 86.7 | 86.4 | 86.8 | 86.9 | 87.2 | 88.6 |
| SUB | 80.6 | 88.6 | 88.5 | 89.2 | 88.7 | 81.2 | 89.3 |
| CTS | 38.9 | 94.1 | 94.6 | 95.4 | 95.5 | 90.4 | 90.2 |
| OT | 62.7 | 85.6 | 85.0 | 86.4 | 86.3 | 81.6 | 90.6 |
| SC | 26.0 | 87.7 | 87.9 | 89.9 | 89.7 | 89.7 | 90.9 |
| mAP | 61.9 | 81.7 | 82.1 | 83.4 | 81.6 | 82.4 | 80.5 |

| Method | Storage Content | Storage Space | Training Time |
|---|---|---|---|
| Full Fine-tuning | Training data for the new classes | 216 MB | 4.2 min |
| EWC/Our Method | Data for new classes + Fisher Information Matrix | 356 MB | 4.3 min |
| Sample Storage | Data for new classes + Representative samples from old classes | 332 MB | 12.7 min |
| Our Method + Sample Storage | Data for new classes + Representative samples from old classes + Fisher Information Matrix | 472 MB | 13.1 min |
Share and Cite
Li, Y.; Bao, G.; Hu, J.; Zhi, X.; Hu, T.; Wang, J.; Wu, W. A Ship Incremental Recognition Framework via Unknown Extraction and Joint Optimization Learning. Remote Sens. 2026, 18, 149. https://doi.org/10.3390/rs18010149

