RSPS-SAM: A Remote Sensing Image Panoptic Segmentation Method Based on SAM
Abstract
1. Introduction
- An end-to-end remote sensing image panoptic segmentation method based on SAM, RSPS-SAM, was proposed. By designing a Mask Decoder, the method achieved automatic generation of mask prompts and determination of mask category information.
- A Batch Attention Pyramid was designed to extract multi-scale information and long-range contextual information from remote sensing images through multi-scale batch attention calculation and multi-scale feature fusion.
- Experimental results demonstrated that the proposed RSPS-SAM achieved a panoptic quality (PQ) of 57.2 on high-resolution remote sensing images.
2. Proposed Methods
2.1. Network Architecture
2.2. Batch Attention Pyramid
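
The body of this subsection is not reproduced here, so the following is only a hedged sketch of one plausible reading of "batch attention": scaled dot-product self-attention computed across the images of a batch rather than across the positions of a single image. The function names, the identity query/key/value projections, and the choice of one feature vector per image are all assumptions made for illustration, not the authors' formulation. In a pyramid, the same operation would presumably be repeated at several feature resolutions and the results fused.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def batch_attention(tokens):
    """Self-attention across the batch axis (illustrative assumption).

    tokens: list of B feature vectors, one per image in the batch, all
    taken at the same spatial location and scale. Returns B vectors, each
    a softmax-weighted mix of all B inputs. Learned query/key/value
    projections are omitted (identity) to keep the sketch short.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    mixed = []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in tokens]
        weights = softmax(scores)
        mixed.append([sum(w * value[d] for w, value in zip(weights, tokens))
                      for d in range(dim)])
    return mixed


if __name__ == "__main__":
    # Two tiles in one batch, each described by a 2-D feature vector.
    print(batch_attention([[2.0, 0.0], [0.0, 2.0]]))
```

Under this reading, each tile's features are enriched with information from the other tiles loaded in the same batch, which is one way long-range context could be shared between neighbouring tiles of a large scene.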
2.3. Mask Decoder
3. Experiments and Analysis
3.1. Experimental Setup
3.1.1. Experimental Data
3.1.2. Evaluation Method
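
The PQ values reported in the result tables are assumed here to follow the standard panoptic quality definition: predicted and ground-truth segments are matched when their IoU exceeds 0.5, and PQ is the sum of matched IoUs divided by TP + 0.5·FP + 0.5·FN. The sketch below computes PQ for a single class from segments given as sets of pixel indices; the function names and the set-based representation are illustrative choices, not taken from the paper.

```python
def iou(a, b):
    """Intersection over union of two segments given as sets of pixel ids."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def panoptic_quality(pred_segments, gt_segments):
    """Return (PQ, SQ, RQ) for one class.

    A prediction and a ground-truth segment form a true positive when
    their IoU is above 0.5. Because panoptic segments do not overlap,
    each segment can take part in at most one such match.
    """
    matched_ious = []
    used = set()
    for gt in gt_segments:
        for idx, pred in enumerate(pred_segments):
            if idx in used:
                continue
            value = iou(pred, gt)
            if value > 0.5:
                matched_ious.append(value)
                used.add(idx)
                break
    tp = len(matched_ious)
    fp = len(pred_segments) - tp
    fn = len(gt_segments) - tp
    denom = tp + 0.5 * fp + 0.5 * fn
    if denom == 0:
        return 0.0, 0.0, 0.0
    sq = sum(matched_ious) / tp if tp else 0.0  # segmentation quality
    rq = tp / denom                             # recognition quality
    return sq * rq, sq, rq


if __name__ == "__main__":
    gt = [{0, 1, 2, 3}, {4, 5}]
    pred = [{0, 1, 2}, {6, 7}]
    print(panoptic_quality(pred, gt))
```

The function returns values in the range 0 to 1; figures such as 57.2 correspond to the class-averaged value multiplied by 100.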
3.1.3. Detailed Settings
3.2. Results
3.2.1. Comparative Experiments
3.2.2. Ablation Studies
3.2.3. Complexity and Computational Efficiency
4. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Methods | PQ | PQ^th | PQ^st | Params. | GFLOPs | FPS |
---|---|---|---|---|---|---|
Panoptic-FPN | 31.4 | 26.7 | 33.1 | 45.6M | 170 | 14.5 |
Mask2Former | 48.3 | 42.0 | 50.6 | 216M | 868 | 4.1 |
Panoptic SegFormer | 49.4 | 40.1 | 53.8 | 221M | 886 | 3.9 |
Mask DINO | 54.0 | 53.0 | 54.3 | 223M | 1326 | 6.0 |
RSPS-SAM | 57.2 | 52.8 | 58.7 | 740M | 2973.7 | 1.2 |
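
The accuracy and cost columns above can be read together as a trade-off. As a small worked example, the snippet below copies the table values and computes the PQ gain of RSPS-SAM over a chosen baseline alongside the relative increase in parameters and GFLOPs and the drop in frame rate. The dictionary layout and the `compare` helper are illustrative names, not part of the paper.

```python
# Values copied from the comparison table:
# (PQ, parameters in millions, GFLOPs, FPS).
RESULTS = {
    "Panoptic-FPN": (31.4, 45.6, 170.0, 14.5),
    "Mask2Former": (48.3, 216.0, 868.0, 4.1),
    "Panoptic SegFormer": (49.4, 221.0, 886.0, 3.9),
    "Mask DINO": (54.0, 223.0, 1326.0, 6.0),
    "RSPS-SAM": (57.2, 740.0, 2973.7, 1.2),
}


def compare(target, baseline, results=RESULTS):
    """PQ gain and cost ratios of `target` relative to `baseline`."""
    t_pq, t_params, t_flops, t_fps = results[target]
    b_pq, b_params, b_flops, b_fps = results[baseline]
    return {
        "pq_gain": round(t_pq - b_pq, 1),
        "params_ratio": round(t_params / b_params, 2),
        "gflops_ratio": round(t_flops / b_flops, 2),
        "slowdown": round(b_fps / t_fps, 2),
    }


if __name__ == "__main__":
    for name in RESULTS:
        if name != "RSPS-SAM":
            print(name, compare("RSPS-SAM", name))
```

Against Mask DINO, the strongest baseline in the table, this gives a 3.2-point PQ gain at roughly 3.3 times the parameters and one fifth of the frame rate.
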
Methods | Batch Attention Pyramid | PQ |
---|---|---|
RSPS-SAM | without | 56.1 |
RSPS-SAM | with | 57.2 |
Methods | Loading Method | PQ |
---|---|---|
RSPS-SAM | Shuffled | 55.8 |
RSPS-SAM | Ordered | 57.2 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Liu, Z.; Li, Z.; Liang, Y.; Persello, C.; Sun, B.; He, G.; Ma, L. RSPS-SAM: A Remote Sensing Image Panoptic Segmentation Method Based on SAM. Remote Sens. 2024, 16, 4002. https://doi.org/10.3390/rs16214002