Abstract
Deep neural network (DNN)-based object detection has been widely deployed on Unmanned Aerial Vehicles (UAVs). However, these models exhibit significant vulnerability to adversarial attacks, particularly physically realizable adversarial patches, which are easy to deploy in practice. Existing adversarial patch generation methods are susceptible to factors such as motion blur and color distortion, which degrade the attack success rate (ASR). To address these limitations, a low-frequency robust adversarial patch (LFRAP) generation framework is proposed that jointly optimizes three dimensions: color, texture, and the frequency domain. First, a clustering-based dynamic extraction mechanism for an environmental color pool is introduced, which improves environmental integration and reduces printing loss. Second, a mathematical model of the effects of high-speed UAV motion is incorporated into patch training; the specialized texture derived from this model mitigates patch blurring, and the resulting drop in attack efficacy, caused by rapid UAV movement. Finally, a frequency-domain separation strategy is introduced during generation to optimize the patch's frequency-space distribution, reducing information loss when the patch is recaptured by UAV vision systems. Experimental results show that the framework improves the environmental integration rate of generated patches by 18.9% and the ASR under motion blur by 19.2%, significantly outperforming conventional methods.
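As an illustration of the first component, the sketch below shows one plausible way to build a clustering-based environmental color pool and constrain patch pixels to it. The abstract does not name a clustering algorithm or color space; K-means over RGB pixels, the pool size `k`, and all function names here are assumptions, not the paper's stated method.

```python
# A minimal sketch of a clustering-based environmental color pool.
# Assumptions (not specified in the abstract): K-means clustering,
# RGB color space, a fixed pool size k; names are hypothetical.
import numpy as np
from sklearn.cluster import KMeans

def extract_color_pool(background: np.ndarray, k: int = 8) -> np.ndarray:
    """Cluster background pixels and return k dominant colors (the pool).

    background: H x W x 3 uint8 image of the deployment environment.
    Returns a k x 3 float array of cluster-center colors in [0, 1].
    """
    pixels = background.reshape(-1, 3).astype(np.float32) / 255.0
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(pixels)
    return km.cluster_centers_  # dominant environmental colors

def project_to_pool(patch: np.ndarray, pool: np.ndarray) -> np.ndarray:
    """Snap every patch pixel (float, in [0, 1]) to its nearest pool color."""
    flat = patch.reshape(-1, 3)
    dists = np.linalg.norm(flat[:, None, :] - pool[None, :, :], axis=-1)
    return pool[dists.argmin(axis=1)].reshape(patch.shape)
```

Restricting the patch to colors already present in the scene is what would plausibly drive both effects claimed in the abstract: better visual blending and fewer out-of-gamut colors lost at print time.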
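For the second component, the sketch below models high-speed UAV motion as a differentiable linear motion blur applied to the patch during training, so the optimizer can anticipate the blur. The linear point-spread function, its length/angle parameterization, and the PyTorch depthwise-convolution implementation are assumptions; the abstract does not give the paper's actual motion model.

```python
# A minimal sketch of injecting a motion-blur model into patch training.
# Assumption: UAV motion is approximated by a linear point-spread
# function (PSF) of a given length and angle.
import math
import torch
import torch.nn.functional as F

def motion_blur_kernel(length: int, angle_deg: float) -> torch.Tensor:
    """Line-shaped PSF of the given length/orientation, normalized to sum 1."""
    k = torch.zeros(length, length)
    c = (length - 1) / 2.0
    theta = math.radians(angle_deg)
    for t in torch.linspace(-c, c, steps=2 * length):
        x = int(round(c + t.item() * math.cos(theta)))
        y = int(round(c + t.item() * math.sin(theta)))
        if 0 <= x < length and 0 <= y < length:
            k[y, x] = 1.0
    return k / k.sum()

def apply_motion_blur(patch: torch.Tensor, length: int, angle_deg: float) -> torch.Tensor:
    """Blur a (3, H, W) patch; differentiable, so gradients reach the patch.

    length should be odd so the padded output matches the input size.
    """
    k = motion_blur_kernel(length, angle_deg).to(patch.device)
    weight = k.unsqueeze(0).unsqueeze(0).repeat(3, 1, 1, 1)  # one PSF per channel
    return F.conv2d(patch.unsqueeze(0), weight, padding=length // 2, groups=3).squeeze(0)
```

Sampling random blur lengths and angles at each training step would be one natural way to make the learned texture robust across flight speeds, in the spirit of expectation-over-transformation training.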
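For the frequency-domain separation strategy, one plausible reading, consistent with the "low-frequency" in LFRAP, is an FFT-based split that concentrates the adversarial signal in low frequencies, which survive camera recapture better than fine high-frequency detail. The hard circular low-pass mask and the `radius_frac` threshold below are assumptions, not the paper's stated rule.

```python
# A minimal sketch of frequency-domain separation for a patch.
# Assumption: a hard circular low-pass mask in the shifted FFT spectrum.
import torch

def frequency_split(patch: torch.Tensor, radius_frac: float = 0.15):
    """Split a (3, H, W) patch into (low, high) frequency components."""
    _, h, w = patch.shape
    spec = torch.fft.fftshift(torch.fft.fft2(patch), dim=(-2, -1))
    yy, xx = torch.meshgrid(
        torch.arange(h, dtype=torch.float32) - h // 2,
        torch.arange(w, dtype=torch.float32) - w // 2,
        indexing="ij",
    )
    # Circular mask centered on the DC component (radius is an assumption).
    mask = ((xx**2 + yy**2).sqrt() <= radius_frac * min(h, w)).float()
    low = torch.fft.ifft2(torch.fft.ifftshift(spec * mask, dim=(-2, -1))).real
    return low, patch - low
```

During optimization, penalizing or discarding the high-frequency component would steer the attack energy into the band least degraded by the UAV imaging pipeline, which is one way to read the claimed reduction in recapture information loss.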