Defending against Poisoning Attacks in Aerial Image Semantic Segmentation with Robust Invariant Feature Enhancement
Abstract
1. Introduction
- To the best of our knowledge, we introduce the concept of a poisoning attack into aerial image semantic segmentation for the first time and propose an effective defense framework against both targeted and untargeted poisoning attacks. Our research highlights the importance of enhancing the robustness of deep learning models in handling safety-critical aerial image processing tasks.
- To effectively defend against poisoning attacks, we propose a novel robust invariant feature enhancement framework based on the theory of robust feature representation. By obtaining robust invariant texture features, structural features, and global features, the proposed defense framework can effectively suppress the influence of poisoning samples on feature extraction and representation.
- To demonstrate the effectiveness of the proposed defense framework, we conduct extensive experiments to evaluate its defense performance against poisoning attacks. Experiments on benchmark aerial image datasets of urban scenes show that the proposed framework effectively defends against poisoning attacks while maintaining strong semantic segmentation performance.
2. Related Works
2.1. Poisoning Attacks
2.2. Poisoning Defense
2.3. Robust Feature Representation
3. Methodology
3.1. Texture Feature Enhancement Module
3.2. Structural Feature Enhancement Module
3.3. Global Feature Enhancement Module
3.4. Multi-Resolution Feature Fusion Module
3.5. Hierarchical Loss Function
4. Experiments and Analysis
4.1. Data Descriptions
- UDD Dataset: The dataset was collected by a professional-grade DJI Phantom 4 drone equipped with a 4K high-resolution camera at altitudes between 60 and 100 m. During data collection, the camera was set to interval shooting mode, capturing a panoramic image every 120 s. The original images have a resolution of 4096 × 2160 or 4000 × 3000 and are provided with manual annotations for semantic segmentation. Because the images were mainly collected in urban regions, the dataset contains common urban ground objects such as facades, roads, vegetation, vehicles, and roofs. In Figure 7, we provide the number of instances for each category along with some sample examples. The dataset provides 205 high-resolution aerial images; we use 145 images as the training set, 20 as the validation set, and the remaining 40 as the testing set. Limited by the computing resources of our hardware, we rescale the training images to 1024 × 512 while keeping the original image size for the validation and testing sets.
- UAVid Dataset: The dataset was collected with a DJI Phantom 3 drone flying stably at 10 m/s at an altitude of around 50 m, using a 4K camera in continuous shooting mode. The original images have a resolution of 4096 × 2160 or 3840 × 2160, and each image contains urban objects in different scenes. The dataset provides fine-grained manual annotations, with labeled categories including building, road, tree, vegetation, moving-car, static-car, and human. Because the data were mainly collected in urban center regions, the image scenes are more complex. The number of instances per category and some sample images are shown in Figure 7. For the semantic segmentation task, the dataset provides 300 high-resolution aerial images; we use 210 images as the training set, 30 as the validation set, and the remaining 60 as the testing set. In addition, we rescale the original images to 2048 × 1024 to reduce the computational burden of hardware devices and accelerate model training.
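The fixed splits described above (205 → 145/20/40 for UDD and 300 → 210/30/60 for UAVid) amount to a seeded shuffle-and-slice over the image ids. The helper below is an illustrative sketch; the function name and seed are our own choices, not part of either dataset:

```python
import random

def split_dataset(image_ids, n_train, n_val, seed=0):
    """Randomly partition image ids into train/val/test lists."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)  # fixed seed -> reproducible split
    return (ids[:n_train],
            ids[n_train:n_train + n_val],
            ids[n_train + n_val:])

# UDD: 205 images -> 145 train / 20 val / 40 test
train, val, test = split_dataset(range(205), n_train=145, n_val=20)
assert (len(train), len(val), len(test)) == (145, 20, 40)

# UAVid: 300 images -> 210 train / 30 val / 60 test
train, val, test = split_dataset(range(300), n_train=210, n_val=30)
assert (len(train), len(val), len(test)) == (210, 30, 60)
```

Varying the seed yields the independent random splits needed when the experiment is repeated multiple times, as in Section 4.2.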
4.2. Experimental Setup and Implementation Details
- Poisoning Attack Settings: To demonstrate the effectiveness of the proposed defense framework against poisoning attacks, we use different poisoning sample generation strategies, including the clean-label attack [35], back-gradient attack [38], generative attack [65], feature selection attack [66], transferable clean-label attack [67], and concealed poisoning attack [68]. For each attack method, we consider only the untargeted attack scenario, which degrades the target model's predictions for pixels of all categories. We assume that the attacker has full knowledge of the target model, including the model structure and training samples.
- Application Details: In the experiments, we implement the proposed defense framework with PyTorch 1.11.0 and Python 3.8.0. All experiments run on a Dell workstation with an Intel i9-12900T CPU, 64 GB RAM, an NVIDIA GeForce RTX 3090 GPU, and the Ubuntu 18.04 operating system. For model parameter optimization, we set the initial learning rate to 0.001, use stochastic gradient descent (SGD) with a momentum of 0.9 as the optimizer, and adopt the poly learning-rate strategy to adjust the learning rate automatically during training. The number of training epochs is set to 2000, and the batch size is set to 16. To ensure the credibility of the experimental results, we randomly select the samples used for the training, validation, and testing sets and repeat the experimental process 20 times. In addition, because the number of training samples is limited, we apply data augmentation methods, including random cropping, random flipping, brightness transformation, and random erasure, to increase the number of samples and improve the model's generalization capability. To ensure fair comparisons, for all compared aerial image semantic segmentation methods we use the source code provided by the authors and keep the original hyper-parameter settings and optimization strategies.
- Evaluation Metrics: To quantitatively evaluate the experimental results, we use the pixel accuracy (PA), mean pixel accuracy (mPA), F1 score (F1_score), and mean intersection over union (mIoU) typically used in semantic segmentation as evaluation metrics. Specifically, we first define TP, FP, FN, and TN as the numbers of true positive, false positive, false negative, and true negative pixels, respectively. The evaluation metrics are defined as follows.
- Pixel Accuracy (PA): This metric is the proportion of correctly classified pixels to the total number of pixels, that is, PA = (TP + TN) / (TP + FP + FN + TN).
- Mean Pixel Accuracy (mPA): This metric is the average of the per-category pixel accuracy: the pixel accuracy is first computed for each category and then averaged over all categories.
- F1 Measure (F1_score): This metric is the harmonic mean of the precision P and recall R of each class. Formally, F1_score = 2PR / (P + R), where P = TP / (TP + FP) and R = TP / (TP + FN).
- Mean Intersection over Union (mIoU): This metric is the mean of the per-class IoU, where the IoU of the ith class is computed as IoU_i = |P_i ∩ G_i| / |P_i ∪ G_i|, with P_i and G_i denoting the sets of predicted pixels and ground-truth pixels for that class.
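All four metrics above can be derived from a per-class confusion matrix. The following NumPy sketch illustrates one way to compute them; the function name and the use of nanmean to skip classes absent from an image are our own choices, not prescribed by the paper:

```python
import numpy as np

def segmentation_metrics(pred, gt, n_classes):
    """Compute PA, mPA, mean F1, and mIoU from integer label maps."""
    pred = np.asarray(pred).ravel()
    gt = np.asarray(gt).ravel()
    # conf[i, j] = number of pixels with ground truth i predicted as class j
    conf = np.bincount(n_classes * gt + pred,
                       minlength=n_classes ** 2).reshape(n_classes, n_classes)
    tp = np.diag(conf).astype(float)
    fp = conf.sum(axis=0) - tp          # predicted as a class but wrong
    fn = conf.sum(axis=1) - tp          # pixels of a class that were missed
    pa = tp.sum() / conf.sum()
    with np.errstate(divide="ignore", invalid="ignore"):
        per_class_pa = tp / conf.sum(axis=1)
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        iou = tp / (tp + fp + fn)
    # nanmean ignores classes absent from both prediction and ground truth
    return pa, np.nanmean(per_class_pa), np.nanmean(f1), np.nanmean(iou)

# Sanity check: a perfect prediction scores 1.0 on every metric
gt = np.array([[0, 1], [1, 2]])
pa, mpa, f1, miou = segmentation_metrics(gt, gt, n_classes=3)
assert pa == mpa == f1 == miou == 1.0
```

For multi-class segmentation, the binary TP/TN formulation of PA reduces to the trace of the confusion matrix divided by the total pixel count, which is what the code computes.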
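The poly learning-rate strategy mentioned in the application details scales the base rate by (1 − epoch/max_epochs)^power. A minimal sketch follows; the power of 0.9 is a commonly used value that the text does not specify, so treat it as an assumption:

```python
def poly_lr(base_lr, epoch, max_epochs, power=0.9):
    """Poly schedule: decay base_lr smoothly to zero over training."""
    return base_lr * (1.0 - epoch / max_epochs) ** power

# With the settings above: initial learning rate 0.001 over 2000 epochs
lrs = [poly_lr(0.001, e, 2000) for e in range(0, 2001, 500)]
assert lrs[0] == 0.001 and lrs[-1] == 0.0
assert all(a > b for a, b in zip(lrs, lrs[1:]))  # monotonically decreasing
```

In PyTorch this schedule is typically attached to the SGD optimizer via a lambda-based scheduler that multiplies the base rate by the decay factor each epoch.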
4.3. Comparison with State-of-the-Art Methods
- AFNet [69]: This network uses the hierarchical cascade structure to enhance different scale features and uses the scale-feature attention mechanism to establish the context correlation of multi-scale feature information.
- SBANet [70]: This network uses the boundary attention module to enhance the feature representation of the boundary region and uses the multi-task learning strategy to guide the model to mine valuable feature information.
- MANet [71]: This network uses discriminative feature learning to obtain fine-grained feature information and uses the multi-scale feature calibration module to filter redundant features to enhance feature representation.
- SSAtNet [72]: This network uses the pyramid attention pooling module to adaptively enhance multi-scale feature representation and uses the pooling index correlation module to restore the loss of detailed feature information.
- HFGNet [25]: This network enhances the representation of different feature information by mining hidden attention feature maps and uses the local channel attention mechanism to establish feature correlation.
- STUFormer [73]: This model uses a spatial interaction module to establish pixel-level correlations and a feature compression module to reduce the loss of detail features and restore the feature-map resolution.
- EMRFormer [74]: This model uses a multi-layer Transformer structure to extract local feature information, a spatial attention mechanism to obtain global information, and a feature alignment module to achieve feature fusion.
- CONFormer [75]: This model uses context Transformer to adaptively fuse local feature information and uses the two-branch semantic correlation module to establish the correlation between local features and global features.
- ATTFormer [76]: This model uses atrous Transformer to enhance multi-scale feature representation and uses channel and spatial attention mechanism to enhance the fine-grained representation of global feature information.
- DSegFormer [77]: This model uses position-encoder attention mechanism to extract valuable feature information from different categories of pixel regions and uses skip connections for feature interaction and fine-grained fusion.
4.3.1. Experimental Results on UDD Dataset
4.3.2. Experimental Results on UAVid Dataset
4.4. Ablation Study
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Osco, L.P.; Junior, J.M.; Ramos, A.P.M.; de Castro Jorge, L.A.; Fatholahi, S.N.; de Andrade Silva, J.; Matsubara, E.T.; Pistori, H.; Gonçalves, W.N.; Li, J. A Review on Deep Learning in UAV Remote Sensing. Int. J. Appl. Earth Obs. Geoinf. 2021, 102, 102456–102464. [Google Scholar] [CrossRef]
- Feroz, S.; Abu Dabous, S. UAV-Based Remote Sensing Applications for Bridge Condition Assessment. Remote Sens. 2021, 13, 1809. [Google Scholar] [CrossRef]
- Zhang, L.; Zhang, H.; Niu, Y.; Han, W. Mapping Maize Water Stress based on UAV Multispectral Remote Sensing. Remote Sens. 2019, 11, 605. [Google Scholar] [CrossRef] [Green Version]
- Yuan, X.; Shi, J.; Gu, L. A Review of Deep Learning Methods for Semantic Segmentation of Remote Sensing Imagery. Expert Syst. Appl. 2021, 169, 114417–114425. [Google Scholar] [CrossRef]
- Liu, S.; Cheng, J.; Liang, L.; Bai, H.; Dang, W. Light-Weight Semantic Segmentation Network for UAV Remote Sensing Images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 8287–8296. [Google Scholar] [CrossRef]
- Pires de Lima, R.; Marfurt, K. Convolutional Neural Network for Remote-Sensing Scene Classification: Transfer Learning Analysis. Remote Sens. 2019, 12, 86. [Google Scholar] [CrossRef] [Green Version]
- Li, K.; Wan, G.; Cheng, G.; Meng, L.; Han, J. Object Detection in Optical Remote Sensing Images: A Survey and A New Benchmark. ISPRS J. Photogramm. Remote Sens. 2020, 159, 296–307. [Google Scholar] [CrossRef]
- Zhang, J.; Lin, S.; Ding, L.; Bruzzone, L. Multi-Scale Context Aggregation for Semantic Segmentation of Remote Sensing Images. Remote Sens. 2020, 12, 701. [Google Scholar] [CrossRef] [Green Version]
- Zhu, X.X.; Tuia, D.; Mou, L.; Xia, G.S.; Zhang, L.; Xu, F.; Fraundorfer, F. Deep Learning in Remote Sensing: A Comprehensive Review and List of Resources. IEEE Geosci. Remote Sens. Mag. 2017, 5, 8–36. [Google Scholar] [CrossRef] [Green Version]
- Mohsan, S.A.H.; Khan, M.A.; Noor, F.; Ullah, I.; Alsharif, M.H. Towards the Unmanned Aerial Vehicles (UAVs): A Comprehensive Review. Drones 2022, 6, 147. [Google Scholar] [CrossRef]
- Zhang, Z.; Liu, Y.; Liu, T.; Lin, Z.; Wang, S. DAGN: A Real-Time UAV Remote Sensing Image Vehicle Detection Framework. IEEE Geosci. Remote Sens. Lett. 2019, 17, 1884–1888. [Google Scholar] [CrossRef]
- Huang, S.; Papernot, N.; Goodfellow, I.; Duan, Y.; Abbeel, P. Adversarial Attacks on Neural Network Policies. arXiv 2017, arXiv:1702.02284. [Google Scholar]
- Czaja, W.; Fendley, N.; Pekala, M.; Ratto, C.; Wang, I.J. Adversarial Examples in Remote Sensing. In Proceedings of the 26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Seattle, WA, USA, 6–9 November 2018; pp. 408–411. [Google Scholar]
- Chen, L.; Zhu, G.; Li, Q.; Li, H. Adversarial Example in Remote Sensing Image Recognition. arXiv 2019, arXiv:1910.13222. [Google Scholar]
- Xu, Y.; Du, B.; Zhang, L. Assessing the Threat of Adversarial Examples on Deep Neural Networks for Remote Sensing Scene Classification: Attacks and Defenses. IEEE Trans. Geosci. Remote Sens. 2020, 59, 1604–1617. [Google Scholar] [CrossRef]
- Chen, L.; Xu, Z.; Li, Q.; Peng, J.; Wang, S.; Li, H. An Empirical Study of Adversarial Examples on Remote Sensing Image Scene Classification. IEEE Trans. Geosci. Remote Sens. 2021, 59, 7419–7433. [Google Scholar] [CrossRef]
- Ai, S.; Koe, A.S.V.; Huang, T. Adversarial Perturbation in Remote Sensing Image Recognition. Appl. Soft Comput. 2021, 105, 107252–107263. [Google Scholar] [CrossRef]
- Bai, T.; Wang, H.; Wen, B. Targeted Universal Adversarial Examples for Remote Sensing. Remote Sens. 2022, 14, 5833. [Google Scholar] [CrossRef]
- Chen, L.; Xiao, J.; Zou, P.; Li, H. Lie to Me: A Soft Threshold Defense Method for Adversarial Examples of Remote Sensing Images. IEEE Geosci. Remote Sens. Lett. 2021, 19, 1–5. [Google Scholar] [CrossRef]
- Wei, X.; Yuan, M. Adversarial Pan-Sharpening Attacks for Object Detection in Remote Sensing. Pattern Recognit. 2023, 139, 109466. [Google Scholar] [CrossRef]
- Lian, J.; Mei, S.; Zhang, S.; Ma, M. Benchmarking Adversarial Patch Against Aerial Detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–16. [Google Scholar] [CrossRef]
- Wang, Z.; Wang, B.; Liu, Y.; Guo, J. Global Feature Attention Network: Addressing the Threat of Adversarial Attack for Aerial Image Semantic Segmentation. Remote Sens. 2023, 15, 1325. [Google Scholar] [CrossRef]
- Alfeld, S.; Zhu, X.; Barford, P. Data Poisoning Attacks against Autoregressive Models. In Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016; Volume 30. [Google Scholar]
- Jagielski, M.; Severi, G.; Pousette Harger, N.; Oprea, A. Subpopulation Data Poisoning Attacks. In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, Virtual, 15–19 November 2021; pp. 3104–3122. [Google Scholar]
- Wang, Z.; Zhang, S.; Zhang, C.; Wang, B. Hidden Feature-Guided Semantic Segmentation Network for Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–17. [Google Scholar] [CrossRef]
- Shafahi, A.; Najibi, M.; Xu, Z.; Dickerson, J.; Davis, L.S.; Goldstein, T. Universal Adversarial Training. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 5636–5643. [Google Scholar]
- Zhang, H.; Wang, J. Defense against Adversarial Attacks using Feature Scattering-based Adversarial Training. Adv. Neural Inf. Process. Syst. 2019, 32. [Google Scholar]
- Zhang, X.; Wang, J.; Wang, T.; Jiang, R.; Xu, J.; Zhao, L. Robust Feature Learning for Adversarial Defense via Hierarchical Feature Alignment. Inf. Sci. 2021, 560, 256–270. [Google Scholar] [CrossRef]
- Xu, Y.; Du, B.; Zhang, L. Self-Attention Context Network: Addressing the Threat of Adversarial Attacks for Hyperspectral Image Classification. IEEE Trans. Image Process. 2021, 30, 8671–8685. [Google Scholar] [CrossRef]
- Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125. [Google Scholar]
- Tian, Z.; Cui, L.; Liang, J.; Yu, S. A Comprehensive Survey on Poisoning Attacks and Countermeasures in Machine Learning. ACM Comput. Surv. 2022, 55, 1–35. [Google Scholar] [CrossRef]
- Chen, T.; Ling, J.; Sun, Y. White-Box Content Camouflage Attacks against Deep Learning. Comput. Secur. 2022, 117, 102676–102682. [Google Scholar] [CrossRef]
- Liu, G.; Lai, L. Provably Efficient Black-Box Action Poisoning attacks against Reinforcement Learning. Adv. Neural Inf. Process. Syst. 2021, 34, 12400–12410. [Google Scholar]
- Pang, T.; Yang, X.; Dong, Y.; Su, H.; Zhu, J. Accumulative Poisoning Attacks on Real-Time Data. Adv. Neural Inf. Process. Syst. 2021, 34, 2899–2912. [Google Scholar]
- Shafahi, A.; Huang, W.R.; Najibi, M.; Suciu, O.; Studer, C.; Dumitras, T.; Goldstein, T. Poison frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks. Adv. Neural Inf. Process. Syst. 2018, 31. [Google Scholar]
- Zhao, B.; Lao, Y. CLPA: Clean-Label Poisoning Availability Attacks using Generative Adversarial Nets. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 22 February–1 March 2022; Volume 36, pp. 9162–9170. [Google Scholar]
- Kurita, K.; Michel, P.; Neubig, G. Weight Poisoning Attacks on Pre-trained Models. arXiv 2020, arXiv:2004.06660. [Google Scholar]
- Muñoz-González, L.; Biggio, B.; Demontis, A.; Paudice, A.; Wongrassamee, V.; Lupu, E.C.; Roli, F. Towards Poisoning of Deep Learning Algorithms with Back-gradient Optimization. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, Dallas, TX, USA, 3 November 2017; pp. 27–38. [Google Scholar]
- Guo, W.; Tondi, B.; Barni, M. An Overview of Backdoor Attacks against Deep Neural Networks and Possible Defences. IEEE Open J. Signal Process. 2022, 3, 261–287. [Google Scholar] [CrossRef]
- Huang, A. Dynamic Backdoor Attacks against Federated Learning. arXiv 2020, arXiv:2011.07429. [Google Scholar]
- Aghakhani, H.; Meng, D.; Wang, Y.X.; Kruegel, C.; Vigna, G. Bullseye Polytope: A Scalable Clean-Label Poisoning Attack with Improved Transferability. In Proceedings of the 2021 IEEE European Symposium on Security and Privacy (EuroS&P), Virtually, 21–25 February 2021; pp. 159–178. [Google Scholar]
- Shafahi, A.; Najibi, M.; Ghiasi, M.A.; Xu, Z.; Dickerson, J.; Studer, C.; Davis, L.S.; Taylor, G.; Goldstein, T. Adversarial Training for Free! Adv. Neural Inf. Process. Syst. 2019, 32. [Google Scholar]
- Geiping, J.; Fowl, L.; Somepalli, G.; Goldblum, M.; Moeller, M.; Goldstein, T. What Doesn’t Kill You Makes You Robust (er): How to Adversarially Train against Data Poisoning. arXiv 2021, arXiv:2102.13624. [Google Scholar]
- Gao, Y.; Wu, D.; Zhang, J.; Gan, G.; Xia, S.T.; Niu, G.; Sugiyama, M. On the Effectiveness of Adversarial Training against Backdoor Attacks. arXiv 2022, arXiv:2202.10627. [Google Scholar]
- Hallaji, E.; Razavi-Far, R.; Saif, M.; Herrera-Viedma, E. Label Noise Analysis meets Adversarial Training: A Defense against Label Poisoning in Federated Learning. Knowl.-Based Syst. 2023, 266, 110384. [Google Scholar] [CrossRef]
- Chen, J.; Zhang, X.; Zhang, R.; Wang, C.; Liu, L. De-pois: An Attack-Agnostic Defense against Data Poisoning Attacks. IEEE Trans. Inf. Forensics Secur. 2021, 16, 3412–3425. [Google Scholar] [CrossRef]
- Liu, A.; Liu, X.; Yu, H.; Zhang, C.; Liu, Q.; Tao, D. Training Robust Deep Neural Networks via Adversarial Noise Propagation. IEEE Trans. Image Process. 2021, 30, 5769–5781. [Google Scholar] [CrossRef]
- Yang, X.; Xu, Z.; Luo, J. Towards Perceptual Image Dehazing by Physics-based Disentanglement and Adversarial Training. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–3 February 2018; Volume 32. [Google Scholar]
- Li, X.; Qu, Z.; Zhao, S.; Tang, B.; Lu, Z.; Liu, Y. Lomar: A Local Defense against Poisoning Attack on Federated Learning. IEEE Trans. Dependable Secur. Comput. 2021, 20, 437–450. [Google Scholar] [CrossRef]
- Dang, T.K.; Truong, P.T.T.; Tran, P.T. Data Poisoning Attack on Deep Neural Network and Some Defense Methods. In Proceedings of the 2020 International Conference on Advanced Computing and Applications (ACOMP), Quy Nhon, Vietnam, 25–27 November 2020; pp. 15–22. [Google Scholar]
- Zhang, J.; Xu, X.; Han, B.; Niu, G.; Cui, L.; Sugiyama, M.; Kankanhalli, M. Attacks Which Do Not Kill Training Make Adversarial Learning Stronger. In Proceedings of the International Conference on Machine Learning, PMLR, Virtual, 13–18 July 2020; pp. 11278–11287. [Google Scholar]
- Li, T.; Wu, Y.; Chen, S.; Fang, K.; Huang, X. Subspace Adversarial Training. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 13409–13418. [Google Scholar]
- Kim, J.; Lee, B.K.; Ro, Y.M. Distilling Robust and Non-Robust Features in Adversarial Examples by Information Bottleneck. Adv. Neural Inf. Process. Syst. 2021, 34, 17148–17159. [Google Scholar]
- Xie, S.M.; Ma, T.; Liang, P. Composed Fine-Tuning: Freezing Pre-Trained Denoising Autoencoders for Improved Generalization. In Proceedings of the International Conference on Machine Learning, PMLR, Virtual, 18–24 July 2021; pp. 11424–11435. [Google Scholar]
- Song, C.; He, K.; Lin, J.; Wang, L.; Hopcroft, J.E. Robust Local Features for Improving the Generalization of Adversarial Training. arXiv 2019, arXiv:1909.10147. [Google Scholar]
- Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, 5–9 October 2015; Proceedings, Part III 18. Springer: New York, NY, USA, 2015; pp. 234–241. [Google Scholar]
- Liao, X.; Yin, J.; Chen, M.; Qin, Z. Adaptive Payload Distribution in Multiple Images Steganography based on Image Texture Features. IEEE Trans. Dependable Secur. Comput. 2020, 19, 897–911. [Google Scholar] [CrossRef]
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Nashville, TN, USA, 11–17 October 2021; pp. 10012–10022. [Google Scholar]
- Zhu, Y.; Liang, Z.; Yan, J.; Chen, G.; Wang, X. ED-Net: Automatic Building Extraction from High-Resolution Aerial Images with Boundary Information. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 4595–4606. [Google Scholar] [CrossRef]
- Li, P.; Ren, P.; Zhang, X.; Wang, Q.; Zhu, X.; Wang, L. Region-Wise Deep Feature Representation for Remote Sensing Images. Remote Sens. 2018, 10, 871. [Google Scholar] [CrossRef] [Green Version]
- Li, X.; Yu, L.; Chang, D.; Ma, Z.; Cao, J. Dual Cross-Entropy Loss for Small-Sample Fine-Grained Vehicle Classification. IEEE Trans. Veh. Technol. 2019, 68, 4204–4212. [Google Scholar] [CrossRef]
- Luo, Y.; Lü, J.; Jiang, X.; Zhang, B. Learning From Architectural Redundancy: Enhanced Deep Supervision in Deep Multipath Encoder–Decoder Networks. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 4271–4284. [Google Scholar] [CrossRef]
- Chen, Y.; Wang, Y.; Lu, P.; Chen, Y.; Wang, G. Large-scale Structure from Motion with Semantic Constraints of Aerial Images. In Proceedings of the Pattern Recognition and Computer Vision: First Chinese Conference, PRCV 2018, Guangzhou, China, 23–26 November 2018; Proceedings, Part I 1. Springer: New York, NY, USA, 2018; pp. 347–359. [Google Scholar]
- Lyu, Y.; Vosselman, G.; Xia, G.S.; Yilmaz, A.; Yang, M.Y. UAVid: A Semantic Segmentation Dataset for UAV Imagery. ISPRS J. Photogramm. Remote Sens. 2020, 165, 108–119. [Google Scholar] [CrossRef]
- Yang, C.; Wu, Q.; Li, H.; Chen, Y. Generative Poisoning Attack Method against Neural Networks. arXiv 2017, arXiv:1703.01340. [Google Scholar]
- Liu, H.; Ditzler, G. Data Poisoning against Information-Theoretic Feature Selection. Inf. Sci. 2021, 573, 396–411. [Google Scholar] [CrossRef]
- Zhu, C.; Huang, W.R.; Li, H.; Taylor, G.; Studer, C.; Goldstein, T. Transferable Clean-Label Poisoning Attacks on Deep Neural Nets. In Proceedings of the International Conference on Machine Learning, PMLR, Long Beach, CA, USA, 9–15 June 2019; pp. 7614–7623. [Google Scholar]
- Zheng, J.; Chan, P.P.; Chi, H.; He, Z. A Concealed Poisoning Attack to Reduce Deep Neural Networks’ Robustness against Adversarial Samples. Inf. Sci. 2022, 615, 758–773. [Google Scholar] [CrossRef]
- Liu, R.; Mi, L.; Chen, Z. AFNet: Adaptive Fusion Network for Remote Sensing Image Semantic Segmentation. IEEE Trans. Geosci. Remote Sens. 2020, 59, 7871–7886. [Google Scholar] [CrossRef]
- Li, A.; Jiao, L.; Zhu, H.; Li, L.; Liu, F. Multitask Semantic Boundary Awareness Network for Remote Sensing Image Segmentation. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–14. [Google Scholar] [CrossRef]
- He, P.; Jiao, L.; Shang, R.; Wang, S.; Liu, X.; Quan, D.; Yang, K.; Zhao, D. MANet: Multi-Scale Aware-Relation Network for Semantic Segmentation in Aerial Scenes. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–15. [Google Scholar] [CrossRef]
- Zhao, Q.; Liu, J.; Li, Y.; Zhang, H. Semantic Segmentation with Attention Mechanism for Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–13. [Google Scholar] [CrossRef]
- He, X.; Zhou, Y.; Zhao, J.; Zhang, D.; Yao, R.; Xue, Y. Swin Transformer Embedding UNet for Remote Sensing Image Semantic Segmentation. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–15. [Google Scholar] [CrossRef]
- Xiao, T.; Liu, Y.; Huang, Y.; Li, M.; Yang, G. Enhancing Multiscale Representations with Transformer for Remote Sensing Image Semantic Segmentation. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–16. [Google Scholar] [CrossRef]
- Ding, L.; Lin, D.; Lin, S.; Zhang, J.; Cui, X.; Wang, Y.; Tang, H.; Bruzzone, L. Looking Outside the Window: Wide-Context Transformer for the Semantic Segmentation of High-Resolution Remote Sensing Images. arXiv 2021, arXiv:2106.15754. [Google Scholar] [CrossRef]
- Zhang, C.; Jiang, W.; Zhang, Y.; Wang, W.; Zhao, Q.; Wang, C. Transformer and CNN hybrid Deep Neural Network for Semantic Segmentation of Very-High-Resolution Remote Sensing Imagery. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–20. [Google Scholar] [CrossRef]
- Li, X.; Cheng, Y.; Fang, Y.; Liang, H.; Xu, S. 2DSegFormer: 2-D Transformer Model for Semantic Segmentation on Aerial Images. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–13. [Google Scholar] [CrossRef]
- Badrinarayanan, V.; Kendall, A.; Cipolla, R. Segnet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495. [Google Scholar] [CrossRef] [PubMed]
| Category | AFNet | SBANet | MANet | SSAtNet | HFGNet | RIFENet |
|---|---|---|---|---|---|---|
| facade | 15.73/13.52/10.87 | 12.82/9.53/7.65 | 19.58/16.31/13.82 | 26.35/21.97/19.85 | 24.43/22.56/18.64 | 82.75/79.62/78.43 |
| road | 18.52/16.37/14.75 | 14.87/11.34/9.46 | 22.38/18.75/12.53 | 32.67/28.64/25.41 | 25.17/27.85/23.71 | 85.97/84.31/82.45 |
| vegetation | 22.58/20.86/18.42 | 19.63/16.35/12.02 | 25.86/22.13/15.64 | 35.25/31.46/28.52 | 36.87/28.14/25.42 | 86.53/85.26/84.17 |
| vehicle | 11.24/9.75/5.62 | 8.52/6.34/4.21 | 11.28/8.37/6.98 | 22.45/18.64/16.35 | 24.91/17.24/15.84 | 83.62/82.46/82.05 |
| roof | 23.71/18.34/16.57 | 21.62/17.58/14.35 | 24.56/20.62/18.95 | 39.41/32.52/29.78 | 36.54/28.92/24.83 | 87.26/86.31/85.42 |
| background | 32.64/28.73/26.45 | 26.38/20.35/18.73 | 33.84/30.98/28.34 | 42.56/38.23/35.64 | 40.15/35.06/33.27 | 88.15/87.12/86.07 |
| mPA (%) | 20.74/17.93/15.45 | 17.31/13.58/11.07 | 22.91/19.53/16.04 | 33.12/28.57/25.92 | 31.35/26.63/23.62 | 85.71/84.18/83.10 |
| F1_score (%) | 18.51/15.32/13.41 | 15.12/11.24/9.65 | 19.13/17.12/13.89 | 29.38/26.52/24.39 | 27.58/22.46/20.54 | 82.46/81.52/79.38 |
| mIoU (%) | 16.72/14.58/13.24 | 13.89/10.41/8.36 | 17.42/15.21/11.35 | 27.04/24.75/22.84 | 24.81/20.24/18.37 | 81.63/80.75/78.24 |
| Runtime (s) | 19.42/21.65/22.36 | 17.28/18.52/19.35 | 24.38/26.75/28.97 | 21.57/22.86/24.15 | 18.24/19.85/21.36 | 16.52/17.58/18.46 |
| Category | STUFormer | EMRFormer | CONFormer | ATTFormer | DSegFormer | RIFENet |
|---|---|---|---|---|---|---|
| building | 21.46/18.24/19.52 | 15.72/17.85/13.34 | 18.95/15.32/16.97 | 17.62/16.24/14.78 | 13.47/15.86/11.24 | 78.96/75.32/77.65 |
| road | 25.87/21.52/22.87 | 19.23/22.14/16.25 | 22.63/20.57/18.65 | 20.43/21.06/17.93 | 16.75/18.64/13.65 | 81.37/79.58/80.42 |
| tree | 31.85/27.64/29.51 | 24.95/21.62/18.37 | 25.84/17.83/21.45 | 26.71/18.46/19.65 | 20.35/17.68/19.73 | 83.75/84.21/83.97 |
| vegetation | 23.45/19.26/22.93 | 17.87/15.91/15.32 | 20.63/13.48/16.83 | 18.52/14.15/14.77 | 15.14/13.21/12.86 | 79.54/77.26/76.83 |
| moving-car | 11.38/9.57/10.46 | 9.24/10.58/7.22 | 12.45/8.42/6.52 | 11.37/9.54/8.15 | 8.45/9.58/6.84 | 74.35/75.42/73.06 |
| static-car | 13.26/12.75/14.82 | 8.24/11.96/10.63 | 10.65/9.37/8.14 | 9.24/10.68/9.21 | 7.13/5.74/8.36 | 71.92/70.87/72.35 |
| human | 7.54/8.65/6.32 | 4.26/5.48/3.62 | 5.41/3.97/4.54 | 4.85/6.02/5.63 | 5.06/6.42/5.67 | 72.53/71.62/70.59 |
| background | 35.97/32.84/33.64 | 31.85/28.42/26.97 | 32.85/27.46/28.25 | 32.76/29.31/27.42 | 28.72/32.17/27.41 | 80.45/78.67/81.42 |
| mPA (%) | 21.35/18.81/20.01 | 16.42/16.75/13.97 | 18.68/14.55/15.17 | 17.69/15.68/14.69 | 14.38/14.91/13.22 | 77.86/76.62/77.04 |
| F1_score (%) | 19.24/16.73/17.85 | 14.58/13.96/12.24 | 16.56/12.24/13.81 | 14.32/13.75/12.37 | 12.54/13.05/11.96 | 75.35/74.17/75.24 |
| mIoU (%) | 16.52/13.74/14.27 | 11.25/10.97/9.65 | 13.15/10.83/11.26 | 12.05/11.52/10.65 | 10.68/11.73/9.54 | 73.85/72.64/72.51 |
| Runtime (s) | 22.86/21.59/23.74 | 28.65/27.32/29.43 | 32.54/33.87/34.95 | 21.75/23.46/22.34 | 23.85/24.32/25.76 | 17.35/16.51/18.25 |
| Baseline | T-FEM | S-FEM | G-FEM | MR-FFM | mPA (%) | F1_Score (%) | mIoU (%) | Runtime (s) |
|---|---|---|---|---|---|---|---|---|
| ✓ | | | | | 11.45 | 8.47 | 6.12 | 15.97 |
| ✓ | ✓ | | | | 25.86 (14.41 ↑) | 22.58 (14.11 ↑) | 20.95 (14.83 ↑) | 17.65 (1.68 ↑) |
| ✓ | ✓ | ✓ | | | 46.37 (20.51 ↑) | 41.23 (18.65 ↑) | 38.53 (17.58 ↑) | 18.74 (1.09 ↑) |
| ✓ | ✓ | ✓ | ✓ | | 75.42 (29.05 ↑) | 73.89 (32.66 ↑) | 70.64 (32.11 ↑) | 21.83 (3.09 ↑) |
| ✓ | ✓ | ✓ | ✓ | ✓ | 83.65 (8.23 ↑) | 80.14 (6.25 ↑) | 78.26 (7.62 ↑) | 23.17 (1.34 ↑) |
| Attacks | AFNet [69] | SBANet [70] | MANet [71] | SSAtNet [72] | HFGNet [25] |
|---|---|---|---|---|---|
| benign dataset | 76.23 | 78.65 | 82.41 | 79.42 | 83.67 |
| clean-label [35] | 21.86 (54.37 ↓) | 18.42 (65.25 ↓) | 21.32 (61.09 ↓) | 25.46 (54.26 ↓) | 31.75 (51.92 ↓) |
| back-gradient [38] | 18.53 (57.70 ↓) | 14.31 (69.36 ↓) | 18.21 (62.20 ↓) | 22.17 (57.25 ↓) | 26.87 (56.80 ↓) |
| generative [65] | 16.24 (59.99 ↓) | 12.54 (71.13 ↓) | 16.53 (65.88 ↓) | 19.52 (59.90 ↓) | 21.62 (62.05 ↓) |
| feature selection [66] | 15.72 (60.51 ↓) | 10.62 (73.05 ↓) | 14.68 (67.73 ↓) | 17.35 (62.07 ↓) | 18.14 (65.53 ↓) |
| trans-clean-label [67] | 11.37 (64.86 ↓) | 8.73 (74.94 ↓) | 9.57 (72.84 ↓) | 12.82 (66.60 ↓) | 14.05 (69.62 ↓) |
| concealed poisoning [68] | 12.48 (63.75 ↓) | 11.74 (71.93 ↓) | 10.93 (71.48 ↓) | 14.13 (65.29 ↓) | 16.34 (67.33 ↓) |
| Attacks | STUFormer [73] | EMRFormer [74] | CONFormer [75] | ATTFormer [76] | DSegFormer [77] | RIFENet (Ours) |
|---|---|---|---|---|---|---|
| benign dataset | 78.34 | 75.25 | 81.57 | 83.34 | 85.42 | 87.59 |
| clean-label [35] | 26.46 (53.88 ↓) | 22.53 (52.72 ↓) | 24.64 (56.93 ↓) | 20.17 (63.17 ↓) | 19.68 (65.74 ↓) | 81.74 (5.85 ↓) |
| back-gradient [38] | 24.57 (53.77 ↓) | 20.92 (54.33 ↓) | 21.35 (60.22 ↓) | 18.31 (65.03 ↓) | 17.52 (67.90 ↓) | 79.83 (7.76 ↓) |
| generative [65] | 22.31 (56.03 ↓) | 18.75 (56.50 ↓) | 19.42 (62.15 ↓) | 16.42 (66.92 ↓) | 16.65 (68.77 ↓) | 78.36 (9.23 ↓) |
| feature selection [66] | 20.24 (58.10 ↓) | 15.68 (59.57 ↓) | 17.37 (64.20 ↓) | 14.27 (69.07 ↓) | 15.14 (70.28 ↓) | 77.65 (9.94 ↓) |
| trans-clean-label [67] | 17.18 (61.16 ↓) | 11.14 (64.11 ↓) | 12.98 (68.59 ↓) | 9.85 (73.49 ↓) | 8.23 (77.19 ↓) | 75.82 (11.77 ↓) |
| concealed poisoning [68] | 18.32 (60.02 ↓) | 13.57 (61.68 ↓) | 15.24 (66.33 ↓) | 12.93 (70.41 ↓) | 13.98 (71.44 ↓) | 76.93 (10.66 ↓) |
Share and Cite
Wang, Z.; Wang, B.; Zhang, C.; Liu, Y.; Guo, J. Defending against Poisoning Attacks in Aerial Image Semantic Segmentation with Robust Invariant Feature Enhancement. Remote Sens. 2023, 15, 3157. https://doi.org/10.3390/rs15123157