Defense against Adversarial Patch Attacks for Aerial Image Semantic Segmentation by Robust Feature Extraction
Abstract
1. Introduction
- To the best of our knowledge, we are the first to systematically analyze the impact of adversarial patch attacks on aerial image semantic segmentation and to propose an effective defense against both targeted and untargeted adversarial patch attacks. Our research highlights the importance of adversarial robustness in deep learning models for safety-critical Earth observation tasks.
- We comprehensively analyze the characteristics of adversarial patches in aerial image semantic segmentation and propose a novel robust feature extraction network (RFENet) to defend against them. By extracting robust local, semantic, contour and global features, RFENet suppresses the interference of adversarial patches throughout the feature extraction process.
- To demonstrate the performance of the proposed method, we conduct a series of experiments, including evaluations of the model's defense capability against multiple types of adversarial patches. Experiments on three aerial image datasets covering urban and suburban scenes show that the proposed framework defends against adversarial patches while maintaining high semantic segmentation accuracy.
2. Related Works
2.1. Adversarial Attack and Defense for Aerial Images
- (1) Attack on classification tasks. Czaja et al. [30] first studied adversarial example attacks in remote sensing image classification and demonstrated that adding only weak adversarial perturbations can mislead classifier predictions. Li et al. [23] constructed black-box and white-box attack methods for attacking SAR image classifiers. Xu et al. [25] systematically evaluated adversarial example attacks in remote sensing image classification tasks. Ai et al. [31] analyzed the influence of adversarial perturbations on aerial image classification and verified the transferability of adversarial examples across different classification models. Jiang et al. [32] proposed a projected gradient descent (PGD) adversarial attack method to attack multi-source remote sensing image classifiers. Based on the adversarial training strategy, Cheng et al. [33] proposed a perturbation-seeking generative adversarial network (PS-GAN) to improve the robustness of remote sensing scene classification models. Chen et al. [34] systematically analyzed the impact of four adversarial attack methods on multiple remote sensing scene classification models.
- (2) Attack on detection tasks. Compared with aerial image classification, attacks against object detection are more challenging. Lian et al. [35] constructed a benchmark of adversarial patches for degrading the performance of aerial object detectors. Lu et al. [36] proposed a scale-adaptive adversarial patch attack for aircraft detection models, which adaptively adjusts the patch size according to the detected object. Zhang et al. [37] constructed a universal adversarial patch to attack aerial object detection models and demonstrated its feasibility in the physical domain. Du et al. [38] proposed a patch generation method for attacking vehicle detectors in aerial scenarios, which improves attack efficiency by increasing the similarity between the patch region and the vehicle object. Deng et al. [39] proposed an adversarial patch generation method based on style transfer theory and used multiple data augmentation methods to improve the generalization and transferability of patch attacks.
- (3) Attack on semantic segmentation tasks. Attacks on aerial image semantic segmentation have also received attention recently. Xu et al. [21] proposed black-box mixup-attack and mixcut-attack methods for attacking semantic segmentation models, which obtain essential regions of the original image by random cropping and use gradient optimization and momentum iteration to improve the attack effect. They further released the generated adversarial examples as a dataset for researchers to design advanced adversarial defense methods in aerial scenarios. Dräger et al. [40] proposed a patch attack method based on the wavelet transform to degrade semantic segmentation performance. This method first uses the wavelet transform to decompose aerial images into high- and low-frequency components and then embeds adversarial patches in the low-frequency components to enhance the stealthiness of the attack. Since aerial image semantic segmentation involves many safety-critical tasks, the corresponding adversarial attack methods deserve more attention.
- (4) Adversarial defense technology. With the continuous emergence of adversarial attack methods in aerial image processing, the corresponding adversarial defense methods have also been studied. To improve the defense ability of remote sensing image classification models against unknown attacks, Cheng et al. [33] proposed a perturbation-seeking generative adversarial network (PS-GAN) defense framework, which uses a GAN to generate abundant data samples and introduces a scaling search radius strategy that reduces the difference between adversarial and clean examples to achieve adversarial defense. Chen et al. [41] proposed a soft threshold defense method against various adversarial attacks on remote sensing scene classification models; it uses adversarial examples as negative samples, obtains a decision boundary with a logistic regression algorithm and uses that boundary to judge the confidence of each category and thereby detect adversarial examples. To defend against adversarial patch attacks in aerial image object detection, Xue et al. [42] constructed a cascade adversarial defense framework, which locates the adversarial patch region according to high-frequency and saliency information in the backward gradient propagation and then uses random erasure to suppress the adversarial patch.
2.2. Robust Features against Adversarial Attacks
3. Methodology
3.1. Limited Receptive Field Mechanism
3.2. Spatial Semantic Enhancement Module
3.3. Boundary Feature Perception Module
3.4. Global Correlation Encoder
- (1) For the input feature $X \in \mathbb{R}^{L \times H \times W}$, the $L$-dimensional feature vector at each spatial position of the feature map is used as a feature node $x_i$, and the nodes are reconstructed into the node graph $G = \{x_i\}_{i=1}^{N}$, where $N = H \times W$.
- (2) The feature nodes are fed into convolution layers to obtain different feature node vectors, and dot-product operations between these vectors form the spatial relationship matrix $R$ representing the relationship between each pair of vectors. The pairwise relationship between node $i$ and node $j$ is represented as $r_{ij}$ and is calculated as $r_{ij} = \theta(x_i)^{\top} \phi(x_j)$, where $\theta(\cdot)$ and $\phi(\cdot)$ denote the convolution embeddings.
- (3) For the $i$th feature node, the pairwise relationships with all other nodes are stacked sequentially to obtain the spatial relationship vector $r_i = [r_{i1}, r_{i2}, \ldots, r_{iN}]$. The spatial relationship vector and the original feature are concatenated to obtain the spatial relationship attention $a_i = [r_i, x_i]$, which combines global structure information with local detail information.
- (4) The spatial relationship attention is used to calculate the attention weight of each position in the feature map, and the attention weight is multiplied with the original feature to obtain the spatial-relationship-attention-weighted feature; see the sketch after this list.
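The four steps above correspond to a non-local-style attention module. The PyTorch sketch below is our illustrative reconstruction rather than the paper's released code: the 1 × 1 convolution embeddings `theta`/`phi`, the sigmoid gating layer and the fixed node count passed to the constructor are all assumptions.

```python
import torch
import torch.nn as nn

class GlobalCorrelationEncoder(nn.Module):
    """Sketch of steps (1)-(4): pairwise node relations -> spatial relation attention."""

    def __init__(self, channels: int, n_nodes: int):
        super().__init__()
        self.theta = nn.Conv2d(channels, channels // 2, 1)  # node embedding (assumed 1x1 conv)
        self.phi = nn.Conv2d(channels, channels // 2, 1)    # node embedding (assumed 1x1 conv)
        # maps the concatenation [r_i ; x_i] (N + C dims) to one weight per node
        self.gate = nn.Linear(n_nodes + channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        n = h * w
        nodes = x.view(b, c, n).transpose(1, 2)            # (1) node graph: B x N x C
        t = self.theta(x).view(b, -1, n).transpose(1, 2)   # B x N x C/2
        p = self.phi(x).view(b, -1, n)                     # B x C/2 x N
        relation = torch.bmm(t, p)                         # (2) r_ij = theta(x_i)^T phi(x_j)
        attn_in = torch.cat([relation, nodes], dim=2)      # (3) a_i = [r_i ; x_i]
        weights = torch.sigmoid(self.gate(attn_in))        # (4) per-position attention weight
        out = nodes * weights                              # weight the original features
        return out.transpose(1, 2).reshape(b, c, h, w)

# usage: a 64-channel feature map of spatial size 16 x 16 gives N = 256 nodes
gce = GlobalCorrelationEncoder(channels=64, n_nodes=16 * 16)
print(gce(torch.randn(2, 64, 16, 16)).shape)  # torch.Size([2, 64, 16, 16])
```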
3.5. Optimization Function
4. Experiments and Analysis
4.1. Data Descriptions
4.2. Adversarial Patch Setting
4.3. Implementation Details
Algorithm 1 Adversarial Patch Attack on RFENet

Input:
1: Aerial image x and corresponding ground truth y.
2: Semantic segmentation model f with parameters θ.
3: Adversarial patch δ, training epochs T, and learning rate η.
Output: The predictions on the adversarial patch images.
4: Initialize model parameters θ with a uniform distribution.
5: for t in 1, …, T do
6:  Compute the local features via Equations (1)–(6).
7:  Compute the semantic features via Equations (7)–(10).
8:  Compute the boundary features via Equations (11)–(14).
9:  Compute the global features via Equations (15)–(19).
10:  Compute the cross-entropy loss via Equation (20).
11:  Update θ by descending its stochastic gradients.
12: end for
13: Generate the adversarial patch image via Refs. [16,17,18,63,64,65].
14: Feed the adversarial patch image to the model f to obtain the segmentation.
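Lines 13 and 14 of Algorithm 1 paste a pre-generated patch into the image and feed the result to the trained model. A minimal sketch of that evaluation step, assuming images normalized to [0, 1] and a fixed patch location (the cited attack methods also optimize patch content and placement):

```python
import torch

@torch.no_grad()
def segment_with_patch(model: torch.nn.Module,
                       images: torch.Tensor,   # B x 3 x H x W, values in [0, 1]
                       patch: torch.Tensor,    # 3 x h x w pre-generated adversarial patch
                       top: int, left: int) -> torch.Tensor:
    """Apply an adversarial patch (Algorithm 1, lines 13-14) and predict labels."""
    ph, pw = patch.shape[-2:]
    x_adv = images.clone()
    x_adv[:, :, top:top + ph, left:left + pw] = patch.clamp(0.0, 1.0)  # paste patch
    logits = model(x_adv)          # B x num_classes x H x W segmentation logits
    return logits.argmax(dim=1)    # per-pixel class predictions
```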
4.4. Comparison with State-of-the-Art Methods
- BSNet [66]: It includes feature extraction and restoration stages in which the feature extraction stage uses the gradient convolution to obtain boundary features, and the feature restoration stage uses global dependencies to recover feature resolution.
- MANet [67]: This method uses an efficient attention mechanism to extract global context dependencies, uses a linear-complexity kernel attention mechanism for local and global feature alignment and performs feature fusion by channel weighting.
- AFNet [68]: This method uses the small-scale dilated convolution kernel to extract multi-scale features of different ground objects and then uses the multi-scale structure with a scale layer attention module to obtain discriminative feature information.
- SSAtNet [69]: This method uses the pyramid attention pooling module to obtain detailed feature information, uses the pooling index to fuse local and global features and recovers the fine-grained feature information by information correction.
- MDANet [70]: This method uses multi-scale deformable attention to capture different scale features, uses a self-attention module to establish long-range context dependence and optimizes the boundary region segmentation effect with a local embedding module.
4.5. Ablation Study
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Russell, B.J.; Soffer, R.J.; Ientilucci, E.J.; Kuester, M.A.; Conran, D.N.; Arroyo-Mora, J.P.; Ochoa, T.; Durell, C.; Holt, J. The Ground to Space CALibration Experiment (G-SCALE): Simultaneous Validation of UAV, Airborne, and Satellite Imagers for Earth Observation Using Specular Targets. Remote Sens. 2023, 15, 294.
- Tu, W.; Hu, Z.; Li, L.; Cao, J.; Jiang, J.; Li, Q.; Li, Q. Portraying Urban Functional Zones by Coupling Remote Sensing Imagery and Human Sensing Data. Remote Sens. 2018, 10, 141.
- Zhang, Y.; Guo, L.; Wang, Z.; Yu, Y.; Liu, X.; Xu, F. Intelligent Ship Detection in Remote Sensing Images Based on Multi-Layer Convolutional Feature Fusion. Remote Sens. 2020, 12, 3316.
- Shirmard, H.; Farahbakhsh, E.; Müller, R.D.; Chandra, R. A Review of Machine Learning in Processing Remote Sensing Data for Mineral Exploration. Remote Sens. Environ. 2022, 268, 112750–112760.
- Yang, L.; Cervone, G. Analysis of Remote Sensing Imagery for Disaster Assessment Using Deep Learning: A Case Study of Flooding Event. Soft Comput. 2019, 23, 13393–13408.
- Gu, J.; Wang, Z.; Kuen, J.; Ma, L.; Shahroudy, A.; Shuai, B.; Liu, T.; Wang, X.; Wang, G.; Cai, J.; et al. Recent Advances in Convolutional Neural Networks. Pattern Recognit. 2018, 77, 354–377.
- Szegedy, C.; Zaremba, W.; Sutskever, I.; Bruna, J.; Erhan, D.; Goodfellow, I.; Fergus, R. Intriguing Properties of Neural Networks. arXiv 2014, arXiv:1312.6199.
- Goodfellow, I.J.; Shlens, J.; Szegedy, C. Explaining and Harnessing Adversarial Examples. arXiv 2015, arXiv:1412.6572.
- Madry, A.; Makelov, A.; Schmidt, L.; Tsipras, D.; Vladu, A. Towards Deep Learning Models Resistant to Adversarial Attacks. arXiv 2019, arXiv:1706.06083.
- Kurakin, A.; Goodfellow, I.; Bengio, S. Adversarial Machine Learning at Scale. arXiv 2017, arXiv:1611.01236.
- Papernot, N.; McDaniel, P.; Jha, S.; Fredrikson, M.; Celik, Z.B.; Swami, A. The Limitations of Deep Learning in Adversarial Settings. In Proceedings of the IEEE European Symposium on Security and Privacy, Saarbrücken, Germany, 11–15 March 2016; pp. 372–387.
- Carlini, N.; Wagner, D. Towards Evaluating the Robustness of Neural Networks. arXiv 2017, arXiv:1608.04644.
- Athalye, A.; Engstrom, L.; Ilyas, A.; Kwok, K. Synthesizing Robust Adversarial Examples. In Proceedings of the International Conference on Machine Learning, PMLR, Stockholm, Sweden, 10–15 July 2018; pp. 284–293.
- Muhammad, A.; Bae, S.-H. A Survey on Efficient Methods for Adversarial Robustness. IEEE Access 2022, 10, 118815–118830.
- Brown, T.B.; Mané, D.; Roy, A.; Abadi, M.; Gilmer, J. Adversarial Patch. arXiv 2017, arXiv:1712.09665.
- Karmon, D.; Zoran, D.; Goldberg, Y. LaVAN: Localized and Visible Adversarial Noise. arXiv 2018, arXiv:1801.02608.
- Chindaudom, A.; Siritanawan, P.; Sumongkayothin, K.; Kotani, K. AdversarialQR: An Adversarial Patch in QR Code Format. In Proceedings of the Joint ICIEV & icIVPR, Kitakyushu, Japan, 26–29 August 2020; pp. 1–6.
- Bai, T.; Luo, J.; Zhao, J. Inconspicuous Adversarial Patches for Fooling Image-Recognition Systems on Mobile Devices. IEEE Internet Things J. 2022, 9, 9515–9524.
- Zhang, H.; Ma, X. Misleading Attention and Classification: An Adversarial Attack to Fool Object Detection Models in the Real World. Comput. Secur. 2022, 122, 102876–102881.
- Nesti, F.; Rossolini, G.; Nair, S.; Biondi, A.; Buttazzo, G. Evaluating the Robustness of Semantic Segmentation for Autonomous Driving against Real-World Adversarial Patch Attacks. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–8 January 2022; pp. 2826–2835.
- Xu, Y.; Ghamisi, P. Universal Adversarial Examples in Remote Sensing: Methodology and Benchmark. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–15.
- Chen, L.; Xu, Z.; Li, Q.; Peng, J.; Wang, S.; Li, H. An Empirical Study of Adversarial Examples on Remote Sensing Image Scene Classification. IEEE Trans. Geosci. Remote Sens. 2021, 59, 7419–7433.
- Li, H.; Huang, H.; Chen, L.; Peng, J.; Huang, H.; Cui, Z.; Mei, X.; Wu, G. Adversarial Examples for CNN-Based SAR Image Classification: An Experience Study. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 1333–1347.
- Chen, L.-C.; Papandreou, G.; Schroff, F.; Adam, H. Rethinking Atrous Convolution for Semantic Image Segmentation. arXiv 2018, arXiv:1706.05587.
- Xu, Y.; Du, B.; Zhang, L. Assessing the Threat of Adversarial Examples on Deep Neural Networks for Remote Sensing Scene Classification: Attacks and Defenses. IEEE Trans. Geosci. Remote Sens. 2021, 59, 1604–1617.
- Xu, Y.; Du, B.; Zhang, L. Self-Attention Context Network: Addressing the Threat of Adversarial Attacks for Hyperspectral Image Classification. IEEE Trans. Image Process. 2021, 30, 8671–8685.
- He, X.; Yang, S.; Li, G.; Li, H.; Chang, H.; Yu, Y. Non-Local Context Encoder: Robust Biomedical Image Segmentation against Adversarial Attacks. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; pp. 8417–8424.
- Xiang, C.; Bhagoji, A.N.; Sehwag, V.; Mittal, P. PatchGuard: A Provably Robust Defense against Adversarial Patches via Small Receptive Fields and Masking. arXiv 2021, arXiv:2005.10884.
- Lal, S.; Rehman, S.U.; Shah, J.H.; Meraj, T.; Rauf, H.T.; Damaševičius, R.; Mohammed, M.A.; Abdulkareem, K.H. Adversarial Attack and Defence through Adversarial Training and Feature Fusion for Diabetic Retinopathy Recognition. Sensors 2021, 21, 3922.
- Czaja, W.; Fendley, N.; Pekala, M.; Ratto, C.; Wang, I.-J. Adversarial Examples in Remote Sensing. In Proceedings of the 26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Seattle, WA, USA, 6–9 November 2018; pp. 408–411.
- Ai, S.; Voundi Koe, A.S.; Huang, T. Adversarial Perturbation in Remote Sensing Image Recognition. Appl. Soft Comput. 2021, 105, 107252–107259.
- Jiang, Y.; Yin, G.; Yuan, Y.; Da, Q. Project Gradient Descent Adversarial Attack against Multisource Remote Sensing Image Scene Classification. Secur. Commun. Netw. 2021, 2021, 6663028.
- Cheng, G.; Sun, X.; Li, K.; Guo, L.; Han, J. Perturbation-Seeking Generative Adversarial Networks: A Defense Framework for Remote Sensing Image Scene Classification. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–11.
- Chen, L.; Li, H.; Zhu, G.; Li, Q.; Zhu, J.; Huang, H.; Peng, J.; Zhao, L. Attack Selectivity of Adversarial Examples in Remote Sensing Image Scene Classification. IEEE Access 2020, 8, 137477–137489.
- Lian, J.; Mei, S.; Zhang, S.; Ma, M. Benchmarking Adversarial Patch Against Aerial Detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–16.
- Lu, M.; Li, Q.; Chen, L.; Li, H. Scale-Adaptive Adversarial Patch Attack for Remote Sensing Image Aircraft Detection. Remote Sens. 2021, 13, 4078.
- Zhang, Y.; Zhang, Y.; Qi, J.; Bin, K.; Wen, H.; Tong, X.; Zhong, P. Adversarial Patch Attack on Multi-Scale Object Detection for UAV Remote Sensing Images. Remote Sens. 2022, 14, 5298.
- Du, A.; Chen, B.; Chin, T.J.; Law, Y.W.; Sasdelli, M.; Rajasegaran, R.; Campbell, D. Physical Adversarial Attacks on an Aerial Imagery Object Detector. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 4–8 January 2022; pp. 1796–1806.
- Deng, B.; Zhang, D.; Dong, F.; Zhang, J.; Shafiq, M.; Gu, Z. Rust-Style Patch: A Physical and Naturalistic Camouflage Attacks on Object Detector for Remote Sensing Images. Remote Sens. 2023, 15, 885.
- Dräger, N.; Xu, Y.; Ghamisi, P. Backdoor Attacks for Remote Sensing Data with Wavelet Transform. arXiv 2022, arXiv:2211.08044.
- Chen, L.; Xiao, J.; Zou, P.; Li, H. Lie to Me: A Soft Threshold Defense Method for Adversarial Examples of Remote Sensing Images. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5.
- Xue, W.; Chen, Z.; Tian, W.; Wu, Y.; Hua, B. A Cascade Defense Method for Multidomain Adversarial Attacks under Remote Sensing Detection. Remote Sens. 2022, 14, 3559.
- Zhang, H.; Wang, J. Defense against Adversarial Attacks Using Feature Scattering-Based Adversarial Training. In Proceedings of the Annual Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, BC, Canada, 8–14 December 2019; pp. 113–118.
- Zhang, X.; Wang, J.; Wang, T.; Jiang, R.; Xu, J.; Zhao, L. Robust Feature Learning for Adversarial Defense via Hierarchical Feature Alignment. Inf. Sci. 2021, 560, 256–270.
- Xie, C.; Wu, Y.; van der Maaten, L.; Yuille, A.L.; He, K. Feature Denoising for Improving Adversarial Robustness. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 501–509.
- Zhou, D.; Liu, T.; Han, B.; Wang, N.; Peng, C.; Gao, X. Towards Defending against Adversarial Examples via Attack-Invariant Features. In Proceedings of the 38th International Conference on Machine Learning, Virtual, 18–24 July 2021; pp. 12835–12845.
- Freitas, S.; Chen, S.-T.; Wang, Z.J.; Horng Chau, D. UnMask: Adversarial Detection and Defense Through Robust Feature Alignment. In Proceedings of the IEEE International Conference on Big Data, Atlanta, GA, USA, 10–13 December 2020; pp. 1081–1088.
- Liu, Z.; Liu, Q.; Liu, T.; Xu, N.; Lin, X.; Wang, Y.; Wen, W. Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 860–868.
- Li, X.; Zhu, D. Robust Detection of Adversarial Attacks on Medical Images. In Proceedings of the IEEE International Symposium on Biomedical Imaging (ISBI), Iowa City, IA, USA, 3–7 April 2020; pp. 1154–1158.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Chollet, F. Xception: Deep Learning with Depthwise Separable Convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1800–1807.
- Chen, X.; Li, Z.; Jiang, J.; Han, Z.; Deng, S.; Li, Z.; Fang, T.; Huo, H.; Li, Q.; Liu, M. Adaptive Effective Receptive Field Convolution for Semantic Segmentation of VHR Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2021, 59, 3532–3546.
- Jiang, K.; Wang, Z.; Yi, P.; Lu, T.; Jiang, J.; Xiong, Z. Dual-Path Deep Fusion Network for Face Image Hallucination. IEEE Trans. Neural Netw. Learn. Syst. 2022, 33, 378–391.
- Chen, Y.; Dai, X.; Liu, M.; Chen, D.; Yuan, L.; Liu, Z. Dynamic Convolution: Attention Over Convolution Kernels. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 11027–11036.
- Fu, J.; Liu, J.; Tian, H.; Li, Y.; Bao, Y.; Fang, Z.; Lu, H. Dual Attention Network for Scene Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 3141–3149.
- Luan, S.; Chen, C.; Zhang, B.; Han, J.; Liu, J. Gabor Convolutional Networks. IEEE Trans. Image Process. 2018, 27, 4357–4366.
- Li, J.; Zha, S.; Chen, C.; Ding, M.; Zhang, T.; Yu, H. Attention Guided Global Enhancement and Local Refinement Network for Semantic Segmentation. IEEE Trans. Image Process. 2022, 31, 3211–3223.
- Li, X.; Yu, L.; Chang, D.; Ma, Z.; Cao, J. Dual Cross-Entropy Loss for Small-Sample Fine-Grained Vehicle Classification. IEEE Trans. Veh. Technol. 2019, 68, 4204–4212.
- Wang, L.; Wang, C.; Sun, Z.; Chen, S. An Improved Dice Loss for Pneumothorax Segmentation by Mining the Information of Negative Areas. IEEE Access 2020, 8, 167939–167949.
- Lyu, Y.; Vosselman, G.; Xia, G.-S.; Yilmaz, A.; Yang, M.Y. UAVid: A Semantic Segmentation Dataset for UAV Imagery. ISPRS J. Photogramm. Remote Sens. 2020, 165, 108–119.
- Chen, L.; Liu, F.; Zhao, Y.; Wang, W.; Yuan, X.; Zhu, J. VALID: A Comprehensive Virtual Aerial Image Dataset. In Proceedings of the 2020 IEEE International Conference on Robotics and Automation, Paris, France, 31 May 2020; pp. 2009–2016.
- Nigam, I.; Huang, C.; Ramanan, D. Ensemble Knowledge Transfer for Semantic Segmentation. In Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision, Lake Tahoe, NV, USA, 12–15 March 2018; pp. 1499–1508.
- Gao, L.; Zhang, Q.; Song, J.; Liu, X.; Shen, H.T. Patch-Wise Attack for Fooling Deep Neural Network. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; Volume 12373, pp. 307–322.
- Zhou, X.; Pan, Z.; Duan, Y.; Zhang, J.; Wang, S. DiAP-A Data Independent Approach to Generate Adversarial Patches. Mach. Vis. Appl. 2021, 32, 67–75.
- Pintor, M.; Angioni, D.; Sotgiu, A.; Demetrio, L.; Demontis, A.; Biggio, B.; Roli, F. ImageNet-Patch: A Dataset for Benchmarking Machine Learning Robustness against Adversarial Patches. Pattern Recognit. 2023, 134, 109064–109072.
- Hou, J.; Guo, Z.; Wu, Y.; Diao, W.; Xu, T. BSNet: Dynamic Hybrid Gradient Convolution Based Boundary-Sensitive Network for Remote Sensing Image Segmentation. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–22.
- Li, R.; Zheng, S.; Zhang, C.; Duan, C.; Su, J.; Wang, L.; Atkinson, P.M. Multiattention Network for Semantic Segmentation of Fine-Resolution Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–13.
- Liu, R.; Mi, L.; Chen, Z. AFNet: Adaptive Fusion Network for Remote Sensing Image Semantic Segmentation. IEEE Trans. Geosci. Remote Sens. 2021, 59, 7871–7886.
- Zhao, Q.; Liu, J.; Li, Y.; Zhang, H. Semantic Segmentation with Attention Mechanism for Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–13.
- Zuo, R.; Zhang, G.; Zhang, R.; Jia, X. A Deformable Attention Network for High-Resolution Remote Sensing Images Semantic Segmentation. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–14.
Datasets | Object Category | Training | Validation | Test |
---|---|---|---|---|
UAVid | building, road, tree, low-vegetation, moving-car, static-car, human, background-clutter | 200 | 70 | 150 |
Semantic Drone | tree, rocks, dog, fence, grass, water, bicycle, dirt, pool, door, gravel, wall, obstacle, car, vegetation, fence-pole, window, paved-area | 280 | 40 | 80 |
Aeroscapes | person, bike, car, drone, obstacle, construction, road, sky, animal, boat, vegetation | 2288 | 654 | 327
Class | BSNet | MANet | AFNet | SSAtNet | MDANet | RFENet |
---|---|---|---|---|---|---|
building | 13.24/8.89 | 15.86/11.52 | 14.48/9.75 | 17.62/12.14 | 16.38/10.82 | 85.26/84.38 |
road | 20.75/12.46 | 22.37/13.48 | 25.86/15.81 | 27.63/16.24 | 26.75/17.32 | 87.13/86.57 |
tree | 23.48/14.52 | 25.32/15.72 | 28.45/17.83 | 30.14/18.65 | 31.75/18.96 | 88.75/87.46 |
low-vegetation | 17.52/10.63 | 19.86/12.57 | 21.75/14.89 | 22.73/15.26 | 23.17/16.84 | 86.24/85.23 |
moving-car | 8.75/4.26 | 12.73/10.86 | 15.28/13.75 | 17.42/14.31 | 18.52/14.85 | 81.32/79.41 |
static-car | 10.63/8.15 | 13.78/11.43 | 14.23/12.64 | 16.17/15.85 | 17.62/16.35 | 78.63/77.42 |
human | 7.15/5.48 | 11.73/10.62 | 16.31/14.86 | 17.35/15.24 | 19.13/18.76 | 76.51/74.64 |
background | 31.25/25.36 | 33.87/26.13 | 36.55/28.70 | 39.48/37.26 | 42.68/38.97 | 80.13/76.85 |
PA (%) | 26.17/20.78 | 27.15/21.75 | 28.60/24.65 | 30.40/24.96 | 31.42/26.03 | 91.26/89.32 |
mPA (%) | 23.74/17.65 | 24.83/18.32 | 25.48/21.63 | 27.51/22.06 | 28.92/24.38 | 88.57/86.24 |
mF1 (%) | 19.57/16.85 | 21.36/18.13 | 22.82/19.46 | 24.38/21.42 | 26.93/24.35 | 85.42/83.27 |
mIoU (%) | 16.59/11.21 | 19.44/14.04 | 21.62/16.03 | 23.56/18.12 | 24.50/19.11 | 82.98/81.49 |
Runtime (s) | 21.32/22.75 | 26.83/27.42 | 25.47/26.14 | 31.58/33.18 | 33.72/35.64 | 19.84/21.45 |
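For reference, the PA, mPA, mF1 and mIoU rows in these result tables follow standard segmentation-metric definitions. A minimal sketch computing them from a K × K per-class confusion matrix (standard formulations; the paper's exact averaging conventions may differ):

```python
import numpy as np

def segmentation_metrics(conf: np.ndarray) -> dict:
    """conf[t, p] counts pixels of true class t predicted as class p."""
    tp = np.diag(conf).astype(float)
    true_total = conf.sum(axis=1).astype(float)   # pixels per true class
    pred_total = conf.sum(axis=0).astype(float)   # pixels per predicted class
    pa = tp.sum() / conf.sum()                    # overall pixel accuracy (PA)
    recall = tp / np.maximum(true_total, 1)       # per-class accuracy
    precision = tp / np.maximum(pred_total, 1)
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    iou = tp / np.maximum(true_total + pred_total - tp, 1)
    return {"PA": pa, "mPA": recall.mean(), "mF1": f1.mean(), "mIoU": iou.mean()}
```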
Class | BSNet | MANet | AFNet | SSAtNet | MDANet | RFENet |
---|---|---|---|---|---|---|
tree | 13.27/14.73 | 14.15/16.43 | 18.63/19.28 | 16.82/17.15 | 18.79/19.37 | 75.82/77.95 |
rocks | 9.42/6.38 | 11.58/12.26 | 14.30/13.52 | 11.43/12.56 | 13.48/14.62 | 70.42/71.35 |
dog | 7.63/8.12 | 9.45/11.38 | 11.93/12.46 | 10.23/9.62 | 15.74/16.58 | 73.57/75.69 |
fence | 12.57/13.81 | 14.72/16.54 | 15.82/16.97 | 11.85/13.16 | 17.86/18.20 | 74.98/76.24 |
grass | 22.71/21.64 | 24.52/23.83 | 27.38/28.14 | 19.54/21.32 | 31.78/32.26 | 92.54/93.38 |
water | 20.14/19.58 | 21.65/20.46 | 25.62/24.35 | 19.57/18.62 | 27.98/28.42 | 90.42/91.85 |
bicycle | 12.57/10.62 | 15.72/16.31 | 18.39/19.58 | 11.23/12.15 | 19.36/20.37 | 75.87/76.35 |
dirt | 7.65/8.54 | 9.43/11.26 | 12.53/13.86 | 7.54/6.38 | 14.24/15.37 | 69.42/70.31 |
pool | 17.43/18.75 | 19.86/20.78 | 26.75/25.34 | 16.82/15.68 | 30.27/31.86 | 94.26/95.77 |
door | 2.14/3.27 | 4.31/5.64 | 7.42/6.73 | 3.75/5.82 | 10.42/12.45 | 66.78/67.42 |
gravel | 19.43/20.63 | 21.59/23.86 | 31.86/32.97 | 26.93/27.42 | 35.98/34.26 | 85.97/86.41 |
wall | 14.64/16.52 | 16.25/17.31 | 22.42/21.75 | 13.87/15.46 | 25.08/24.22 | 72.47/73.86 |
obstacle | 16.78/17.88 | 18.64/20.93 | 23.96/22.87 | 17.94/18.53 | 26.93/27.51 | 78.32/79.43 |
car | 21.65/22.15 | 23.70/24.56 | 25.78/26.14 | 20.65/19.37 | 29.82/28.13 | 93.56/94.27 |
vegetation | 19.58/18.96 | 22.41/21.93 | 26.53/28.52 | 21.85/23.97 | 31.53/32.85 | 79.32/81.36 |
fence-pole | 3.22/3.54 | 5.82/6.87 | 8.62/9.74 | 10.86/11.42 | 12.95/13.37 | 66.87/67.12 |
window | 7.25/6.75 | 9.57/11.36 | 15.30/16.27 | 12.79/11.68 | 16.58/17.74 | 71.43/73.84 |
paved-area | 21.49/22.31 | 23.86/25.73 | 25.93/27.36 | 24.78/23.64 | 38.36/38.02 | 95.78/96.34
PA (%) | 29.45/31.27 | 31.73/32.56 | 36.12/37.41 | 33.42/34.41 | 42.13/44.56 | 84.72/87.36 |
mPA (%) | 22.68/23.57 | 27.51/28.46 | 33.87/34.18 | 30.89/31.42 | 39.05/41.73 | 81.87/82.35 |
mF1 (%) | 16.29/17.48 | 18.32/19.17 | 22.75/23.64 | 18.97/19.30 | 26.93/27.86 | 79.84/81.43 |
mIoU (%) | 13.86/14.12 | 15.96/16.91 | 19.89/20.32 | 15.47/15.77 | 23.18/23.65 | 79.32/80.50 |
Runtime (s) | 23.72/24.51 | 28.47/29.16 | 27.64/28.03 | 32.27/33.85 | 34.96/35.82 | 20.32/21.98 |
Class | BSNet | MANet | AFNet | SSAtNet | MDANet | RFENet |
---|---|---|---|---|---|---|
tree | 12.26/14.37 | 15.72/17.31 | 18.64/20.17 | 21.95/23.04 | 25.34/27.15 | 82.53/83.42 |
rocks | 3.42/4.86 | 5.16/7.23 | 9.41/11.84 | 12.72/13.84 | 17.85/19.21 | 65.34/66.57 |
dog | 5.92/6.78 | 8.57/10.36 | 15.72/17.37 | 18.97/19.28 | 23.86/24.57 | 73.96/74.18 |
fence | 4.89/5.13 | 7.93/8.75 | 14.29/16.53 | 17.50/18.26 | 20.43/21.92 | 61.97/63.42 |
grass | 6.28/7.52 | 11.58/13.64 | 18.51/21.75 | 21.62/22.83 | 25.32/26.71 | 66.28/67.74 |
water | 8.15/9.43 | 13.72/15.08 | 22.73/24.32 | 25.76/27.08 | 30.72/31.98 | 60.92/62.31 |
bicycle | 17.82/18.94 | 23.75/25.84 | 29.62/31.65 | 32.95/34.26 | 37.24/38.62 | 88.74/89.92 |
dirt | 22.45/23.71 | 31.84/33.91 | 38.54/39.41 | 41.85/42.32 | 46.31/48.05 | 91.24/92.51 |
pool | 7.14/8.43 | 9.48/12.56 | 13.57/14.28 | 16.96/18.14 | 21.54/23.52 | 62.73/64.95 |
door | 9.56/10.75 | 14.87/16.92 | 20.32/21.46 | 23.41/24.96 | 29.31/30.89 | 83.47/85.26 |
gravel | 17.84/19.13 | 22.73/25.04 | 27.43/28.54 | 31.86/33.28 | 37.03/39.41 | 87.62/88.43 |
PA (%) | 32.16/33.78 | 35.96/37.21 | 38.96/39.57 | 42.63/44.28 | 43.97/45.26 | 88.96/90.08 |
mPA (%) | 26.47/27.92 | 28.56/29.75 | 31.62/32.74 | 34.57/35.94 | 36.57/38.21 | 82.74/84.25 |
mF1 (%) | 22.36/24.05 | 24.57/25.82 | 26.75/27.92 | 30.92/32.63 | 32.41/34.45 | 79.85/81.43 |
mIoU (%) | 10.52/11.73 | 15.03/16.96 | 20.79/22.48 | 24.14/25.20 | 28.63/30.18 | 74.98/76.24 |
Runtime (s) | 25.43/27.52 | 31.25/33.98 | 30.54/31.63 | 38.42/39.57 | 39.24/40.26 | 21.75/22.68 |
Baseline | LRFM | SSEM | BFPM | GCEM | UAVid | Semantic Drone | Aeroscapes |
---|---|---|---|---|---|---|---|
✔ | | | | | 21.53 | 19.86 | 22.45
✔ | ✔ | | | | 43.26 (21.73 ↑) | 38.47 (18.61 ↑) | 45.94 (23.49 ↑)
✔ | ✔ | ✔ | | | 63.82 (20.56 ↑) | 59.13 (20.66 ↑) | 68.70 (22.76 ↑)
✔ | ✔ | ✔ | ✔ | | 79.51 (15.69 ↑) | 75.68 (16.55 ↑) | 81.93 (13.23 ↑)
✔ | ✔ | ✔ | ✔ | ✔ | 87.24 (7.73 ↑) | 82.71 (7.03 ↑) | 88.46 (6.53 ↑) |
Method | Patch Size | | | | | | Patch Shape |
---|---|---|---|---|---|---|---|---
 | | | | | | | Rectangle | Circle
BSNet | 73.16 | 62.41 | 48.75 | 31.57 | 27.71 | 20.62 | 18.64 | 24.38 |
MANet | 72.47 | 63.85 | 47.18 | 38.45 | 26.32 | 21.87 | 19.85 | 25.97 |
AFNet | 75.83 | 64.22 | 46.22 | 36.73 | 28.16 | 22.54 | 17.93 | 25.42 |
SSAtNet | 76.15 | 61.4 | 47.86 | 35.94 | 29.34 | 21.25 | 18.77 | 26.85 |
MDANet | 77.43 | 66.57 | 45.27 | 34.68 | 30.38 | 23.79 | 20.34 | 27.66 |
RFENet | 84.36 | 84.16 | 83.64 | 83.51 | 82.97 | 82.68 | 83.05 | 84.22 |
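The patch-size and patch-shape settings above only change the binary mask used when pasting the patch into the image. A minimal sketch of such masks, assuming a square base size and an illustrative 2:1 rectangle aspect ratio (the specific sizes behind the unlabeled columns are not reproduced here):

```python
import torch

def make_patch_mask(size: int, shape: str = "square") -> torch.Tensor:
    """Binary mask selecting the pixels a patch of the given shape overwrites."""
    if shape == "square":
        return torch.ones(size, size)
    if shape == "rectangle":                      # assumed 2:1 aspect ratio
        return torch.ones(size, 2 * size)
    if shape == "circle":
        ys, xs = torch.meshgrid(torch.arange(size), torch.arange(size), indexing="ij")
        r = (size - 1) / 2.0                      # inscribed circle
        return (((ys - r) ** 2 + (xs - r) ** 2) <= r ** 2).float()
    raise ValueError(f"unknown shape: {shape}")
```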
Dataset | Patch Attack | BSNet | MANet | AFNet | SSAtNet | MDANet | RFENet |
---|---|---|---|---|---|---|---|
UAVid | Clean Sample | 77.52 | 78.13 | 76.84 | 79.37 | 81.42 | 89.47 |
 | LaVAN | 23.41 (54.11 ↓) | 24.82 (53.31 ↓) | 25.74 (51.10 ↓) | 28.05 (51.32 ↓) | 28.74 (52.68 ↓) | 88.63 (0.84 ↓)
 | QR-Patch | 16.27 (61.25 ↓) | 17.58 (60.55 ↓) | 21.34 (55.50 ↓) | 20.95 (58.42 ↓) | 23.65 (57.77 ↓) | 87.36 (2.11 ↓)
 | IAP | 19.73 (57.79 ↓) | 21.46 (56.67 ↓) | 23.65 (53.19 ↓) | 24.83 (54.54 ↓) | 25.47 (55.95 ↓) | 87.42 (2.05 ↓)
 | Patch-Wise | 17.35 (60.17 ↓) | 18.94 (59.19 ↓) | 24.37 (52.47 ↓) | 25.43 (53.94 ↓) | 24.69 (56.73 ↓) | 88.06 (1.41 ↓)
 | DiAP | 18.32 (59.20 ↓) | 19.57 (58.56 ↓) | 22.38 (54.46 ↓) | 24.94 (54.43 ↓) | 25.12 (56.30 ↓) | 88.59 (0.88 ↓)
 | Image-Patch | 21.64 (55.88 ↓) | 22.75 (55.38 ↓) | 24.06 (52.78 ↓) | 23.87 (55.50 ↓) | 23.43 (57.99 ↓) | 89.34 (0.13 ↓)
Semantic Drone | Clean Sample | 68.41 | 69.20 | 71.42 | 73.65 | 75.27 | 85.26 |
 | LaVAN | 20.74 (47.67 ↓) | 22.64 (46.56 ↓) | 23.42 (48.00 ↓) | 24.96 (48.69 ↓) | 25.87 (49.40 ↓) | 84.75 (0.51 ↓)
 | QR-Patch | 12.43 (55.98 ↓) | 13.71 (55.49 ↓) | 14.26 (57.16 ↓) | 15.92 (57.73 ↓) | 16.57 (58.70 ↓) | 83.15 (2.11 ↓)
 | IAP | 18.94 (49.47 ↓) | 19.37 (49.83 ↓) | 21.58 (49.84 ↓) | 20.95 (52.70 ↓) | 22.46 (52.81 ↓) | 83.98 (1.28 ↓)
 | Patch-Wise | 22.35 (46.06 ↓) | 23.89 (45.31 ↓) | 24.17 (47.25 ↓) | 25.48 (48.17 ↓) | 24.56 (50.71 ↓) | 84.32 (0.94 ↓)
 | DiAP | 16.54 (51.87 ↓) | 17.68 (51.52 ↓) | 18.21 (53.21 ↓) | 19.54 (54.11 ↓) | 19.86 (55.41 ↓) | 83.65 (1.61 ↓)
 | Image-Patch | 24.45 (43.96 ↓) | 26.87 (42.33 ↓) | 26.42 (45.00 ↓) | 27.63 (46.02 ↓) | 28.75 (46.52 ↓) | 84.93 (0.33 ↓)
Aeroscapes | Clean Sample | 74.56 | 76.34 | 77.85 | 78.46 | 79.35 | 86.75 |
 | LaVAN | 22.37 (52.19 ↓) | 23.56 (52.78 ↓) | 24.97 (52.88 ↓) | 26.30 (52.16 ↓) | 27.63 (51.72 ↓) | 86.14 (0.61 ↓)
 | QR-Patch | 11.38 (63.18 ↓) | 12.68 (63.66 ↓) | 13.04 (64.81 ↓) | 14.57 (63.89 ↓) | 15.62 (63.73 ↓) | 84.52 (2.23 ↓)
 | IAP | 19.26 (55.30 ↓) | 21.79 (54.55 ↓) | 20.35 (57.50 ↓) | 22.84 (55.62 ↓) | 23.74 (55.61 ↓) | 85.64 (1.11 ↓)
 | Patch-Wise | 15.06 (59.50 ↓) | 17.93 (58.41 ↓) | 18.24 (59.61 ↓) | 19.78 (58.68 ↓) | 21.35 (58.00 ↓) | 86.45 (0.30 ↓)
 | DiAP | 17.35 (57.21 ↓) | 18.89 (57.45 ↓) | 19.24 (58.61 ↓) | 18.73 (59.73 ↓) | 20.26 (59.09 ↓) | 85.92 (0.83 ↓)
 | Image-Patch | 21.52 (53.04 ↓) | 22.76 (53.58 ↓) | 23.85 (54.00 ↓) | 24.67 (53.79 ↓) | 25.92 (53.43 ↓) | 86.57 (0.18 ↓)