Black-Box Boundary Attack Based on Gradient Optimization

Abstract: Deep neural networks are widely applied in computer vision and have achieved significant success in fundamental research tasks such as image classification. However, the robustness of these networks is severely challenged by adversarial attacks. In real-world scenarios, hard-label attacks often require tens of thousands of queries. To address this, the Black-Box Boundary Attack based on Gradient Optimization (GOBA) is introduced. The method uses a binary search to obtain an initial adversarial example with a large perturbation. The Monte Carlo algorithm is then used to estimate the gradient at the sample, and the sample is moved iteratively along the estimated gradient and towards the malicious label. Moreover, query vectors positively correlated with the gradient are extracted to construct a sampling space of optimal scale, which improves the efficiency of the Monte Carlo algorithm. The GOBA was evaluated against the HSJA, QEBA, and NLBA attack methods on the ImageNet, CelebA, and MNIST datasets. The results indicate that, under a budget of 3k queries, the GOBA reduces the perturbation (L2 distance) by 55.74% on average and increases the attack success rate by 13.78% on average compared with the other methods.


Introduction
In recent years, significant strides have been made in the field of computer vision through the application of deep learning, particularly in tasks such as image classification [1][2][3][4][5], object detection [6][7][8][9][10], and image segmentation [11]. Object detection algorithms, in particular, have seen widespread use in critical areas such as autonomous driving [12,13] and industrial inspection [14]. Because these applications are security-critical, there is a growing demand for algorithms with greater robustness and security.
However, the susceptibility of deep learning to adversarial attacks [15], where even subtle perturbations to input images can drastically alter model outputs, has become a prominent concern. Consequently, research on adversarial attacks and defenses in deep learning has received growing emphasis [16][17][18][19]. Depending on how much information about the targeted model is accessible, adversarial attacks can be broadly categorized into two types. White-box attacks assume complete transparency of the internal structure and parameters of the targeted model, which allows attackers to construct precise adversarial samples. However, white-box methods become far less effective when attackers lack access to the model's structure and parameters, and their attack success rate drops significantly. Black-box attacks, in which attackers lack specific information about the model or training dataset, align more closely with real-world scenarios and play a crucial role in evaluating model robustness in practical applications. Black-box attacks are further divided into non-query attacks [20] and query-based attacks. Query-based attacks can be subcategorized into score-based and hard-label attacks [21]. In score-based attacks, attackers have access to the complete set of labels and their corresponding probabilities, whereas hard-label attacks return only a single output label, which makes the attack substantially harder by withholding this richer information. This paper focuses on hard-label attacks, with the aim of enhancing the success rate of hard-label black-box attacks under a constrained query budget.
Presently, adversarial attacks in image classification have received considerable attention, resulting in the emergence of numerous noteworthy attack algorithms. However, existing decision-based approaches have not effectively tackled the challenges of reducing query counts and improving attack success rates. Consequently, it becomes relatively straightforward to detect and reject queries near the boundary. Analysis suggests that the hardest part of minimizing query counts lies in gradient computation during the iterative image modification process. The Monte Carlo algorithm introduces probabilistic errors when calculating gradient directions, so it is important either to select random variables that minimize the variance or to increase the number of simulations under a fixed variance. Fewer attack attempts are better aligned with the practical scenario of hard-label problems.
Addressing these challenges, this paper introduces the Black-Box Boundary Attack based on Gradient Optimization (GOBA), a novel algorithm designed for black-box attacks. The GOBA introduces a low-dimensional noise history space to effectively approximate decision boundaries, capitalizing on the inherent smoothness of these boundaries. By constructing a vector distribution sampling space, the GOBA outperforms independent sampling directly in the original space and yields more accurate gradients. Therefore, the GOBA achieves higher attack success rates for adversarial samples without the need for external information.

Major Contributions
The main contributions of this paper are as follows:
1. An innovative Black-Box Boundary Attack based on Gradient Optimization (GOBA) is introduced. By exploiting the flatness of classification boundaries, perturbation vectors positively correlated with them are extracted to construct a random vector sampling space. This minimizes the need for independent sampling, effectively reducing the query budget required for boundary attacks.
2. An optimal-dimension subspace is employed to enhance the precision of gradients in high-dimensional spaces, and an optimized version of the traditional binary search boundary method is introduced. This ensures accurate calculation of the sample's movement step, so that adversarial samples adhere more closely to the adversarial boundary, which in turn increases the attack success rate.
3. The effectiveness and generality of the proposed method are validated through extensive comparative experiments conducted on the ImageNet, CelebA, and MNIST datasets. Experimental results demonstrate that, compared to existing attack methods, the GOBA not only exhibits robust generality but also demonstrates outstanding performance in black-box attack scenarios.

Related Work
Over the past few years, significant efforts have been dedicated to countering adversarial attacks in machine learning models.
Brendel et al. [22] introduced the pioneering hard-label-based attack method, which starts with a substantial adversarial perturbation while simultaneously diminishing perturbations in the source and sphere directions. While their approach effectively addresses hard-label attacks, its efficiency is limited by its reliance on a standard normal distribution. Cheng et al. [23] framed hard-label attacks as an optimization problem involving direction, distance, and gradient estimation of the decision boundary. However, in high-dimensional scenarios, the distance calculation and gradient estimation in their method require a considerable number of queries. Evolutionary methods [24] have improved variance updates by replacing the normal distribution with custom-modeled weights for variance and pixels after successful sampling. Nevertheless, the use of symbol-agnostic variance introduces instability during the sampling process. Shi et al. [25] introduced the Customized Adversarial Boundary (CAB), utilizing the square of the difference between adversarial samples and source images as variance and cumulative direction in case of mean failure. However, this method did not significantly enhance the attack success rate of the samples.
Cheng et al. [26] introduced the sign function to approximate the direction of ascent or descent (sign-opt), but it still requires a significant number of queries. Liu et al. [27] iteratively extracted noise from a normal distribution to estimate the gradient direction of the decision boundary. Rahmati et al. [28] developed an algorithm to determine the optimal query distribution, yet both methods use random sampling within constrained spaces without a substantial reduction in query count. Guo et al. [29] leveraged the gradient of reference models to construct a search subspace, with the aim of enhancing query efficiency. However, this approach relies heavily on external information and model portability, leading to a decrease in attack success rate despite some reduction in query count.
Chen et al. [30] proposed the HopSkipJump attack (HSJA), which directly computes gradient estimation on the decision boundary. While the algorithm achieves accurate gradient calculation through iterative updates of adversarial examples along the decision boundary, the query counts of the HSJA method remain high. The Query-Efficient Boundary-Based Black-box Attack (QEBA) [31] builds on the HSJA by performing gradient calculations through subspace sampling of low-dimensional vectors. Li et al. introduced Nonlinear Gradient Estimation for the Query-Efficient Black-box Attack (NLBA) [32], highlighting the existence of non-linear projections that achieve higher cosine similarity lower bounds. Zhang et al. [33] demonstrated the presence of an optimal scale in the projection space. However, these methods primarily study how random vector dimension transformations and sampling space sizes affect the gradient, without substantially improving query efficiency. Maho et al. [34] explored moving in different directions based on the geometric properties of the decision boundary but observed a noticeable decrease in the success rate of the SurFree attack with a limited number of queries. Reducing the query budget while increasing the attack success rate therefore remains a central challenge in research on hard-label black-box attacks.

Adversarial Attack
In the black-box attack scenario on neural networks used for image classification, the target model is denoted as F(X), where X represents a specific image.
Adversarial samples are images that have been perturbed with imperceptible noise. This noise is sufficiently small, yet it causes the image to be misclassified into a malicious label y_tgt by the target model F(X), as illustrated in Equation (1), where ρ represents the perturbation noise to be added and X_src is the source image.
Assuming that X* represents the currently discovered adversarial example with the smallest noise amplitude and X represents the adversarial sample obtained by adding new noise to the image X*, the objective of the adversarial attack is defined by Equation (2). The objective is to maximize the difference between the L2 distance of image X relative to the image X* under the condition that images X* and X are consistently misclassified.
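Equations (1) and (2) are not reproduced in this text. Under the definitions above, one plausible reading is sketched below. Both forms are assumptions made for illustration, not the authors' exact formulation; in particular, measuring distances relative to X_src in Equation (2) is inferred from the later description of the attack.

```latex
% Assumed form of Eq. (1): the perturbed source image receives the malicious label
F(X_{src} + \rho) = y_{tgt}

% Assumed form of Eq. (2): the new sample X should reduce the L2 distance to the
% source image as much as possible relative to the current best example X^{*},
% while both remain misclassified as y_{tgt}
\max_{X} \; \bigl( \lVert X^{*} - X_{src} \rVert_{2} - \lVert X - X_{src} \rVert_{2} \bigr)
\quad \text{s.t.} \quad F(X^{*}) = F(X) = y_{tgt}
```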

Black-Box Boundary Attack Based on Gradient Optimization
The boundary attack, which is a decision-based adversarial attack, aims to generate an adversarial sample that closely approaches the decision boundary. The adversary initiates the attack by starting with the source image X_src and perturbing it within the pixel space in the direction of X_tgt, where X_tgt is a target image that the model classifies correctly.
An image situated on the decision boundary between two labels and classified as y_tgt (F(X_tgt) = y_tgt) is referred to as a boundary image. The primary objective of the attack is to discover an adversarial image X_adv such that F(X_adv) = y_tgt, while simultaneously minimizing the distance metric D(X_src, X_adv), which is typically calculated using the L2 norm or L∞ norm. Consequently, the generated adversarial image X_adv is a boundary image with an optimized, minimized distance from the source image.
The iterative process of generating adversarial examples is depicted in Figure 1, where the source image X_src and the target image X_tgt, labeled with the malicious category y_tgt, are selected. Initially, a binary search is performed between the two images to derive the initial image X_0, which is misclassified as the malicious label, following the procedure described in Equation (3).
Following this, an iterative algorithm comprising three steps was implemented: (1) sampling vectors in the image space, (2) introducing them as perturbations to the image, and (3) estimating the current boundary gradient by evaluating the classification outcomes of the target model. In the case of score-based attacks, the exact prediction function was determined using the confidence scores derived from the target model's output, as illustrated in Equation (4). Boundary-based attacks, in contrast, yield only hard labels, so precise label confidence scores cannot be ascertained. This limitation requires the prediction function to be transformed into a computable indicator function ϕ(X), as given in Equation (5). After obtaining the batch of indicator-function outputs from the target model, we estimate the gradient at boundary points using the Monte Carlo algorithm. This estimate is formalized in Equation (6), where U_rnd denotes a batch of B random vectors and η is a small weighting constant introduced to refine the estimation accuracy.
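The three-step estimate described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `model` callable (returning a hard label for a flat list of floats), the +1/-1 convention for the indicator, and the normalisation of the result are all assumptions, since Equations (5) and (6) are not reproduced here.

```python
import math
import random


def indicator(model, x, y_tgt):
    """Hard-label indicator: +1 if the model outputs y_tgt, else -1 (assumed convention)."""
    return 1.0 if model(x) == y_tgt else -1.0


def random_unit_vector(dim, rng):
    """Draw a direction uniformly on the unit sphere by normalising a Gaussian sample."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def estimate_gradient(model, x_t, y_tgt, batch_size, eta, rng):
    """Monte Carlo estimate of the boundary gradient direction at x_t.

    Each of the batch_size queries perturbs x_t by eta * u and weights the
    direction u by the indicator value of the perturbed image.
    """
    dim = len(x_t)
    grad = [0.0] * dim
    for _ in range(batch_size):
        u = random_unit_vector(dim, rng)
        x_q = [xi + eta * ui for xi, ui in zip(x_t, u)]
        phi = indicator(model, x_q, y_tgt)
        for i in range(dim):
            grad[i] += phi * u[i]
    norm = math.sqrt(sum(g * g for g in grad))
    return [g / norm for g in grad] if norm > 0 else grad


# Toy usage: a hypothetical linear "model" whose boundary is the plane x[0] = 0.
toy_model = lambda x: 1 if x[0] > 0 else 0
g = estimate_gradient(toy_model, [0.0, 0.0, 0.0], 1, 200, 0.1, random.Random(0))
```

On this toy boundary the estimate should align closely with the first coordinate axis, which is the true normal of the boundary.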
To refine the adversarial example's trajectory towards a correct classification boundary, we systematically adjust its position along the estimated gradient direction towards an image X_t. The step size for each movement is calculated based on the proximity of the current adversarial example X_t to X_src, and the step magnitude is deliberately reduced as the iteration count grows, mitigating the risk of undue distortion. Following this, we apply a binary search to adjust the adversarial example X_{t+1}, ensuring its precise positioning near the decision boundary that separates the classes of X_src and X_tgt.
This process ensures that the adversarial example undergoes a continuous cycle of iterative refinement, consistently edging closer to the decision boundary. The iteration step size, coupled with the gradient, dictates the directional adjustment for each boundary point and is a pivotal factor in accelerating the adversarial example's convergence towards the intended target.
Figure 2 outlines the procedural framework of the Black-Box Boundary Attack based on Gradient Optimization (GOBA) method. The process begins with the selection of two distinct images, X_src and X_tgt, each accurately classified under its respective label. Perturbations derived from random sampling are then added to image X_t. These perturbed images are fed into the target model to obtain their classification labels, which allows the gradient values to be computed by evaluating the indicator function.
Leveraging the computed gradient values alongside the initial perturbation vectors, we establish a novel sampling space, setting the stage for the subsequent iteration phase. This step is crucial for strengthening the correlation between the noise vectors and the gradient values, thereby enabling the generation of more effective perturbations and the refinement of gradient value estimates. The ultimate goal is to iteratively update the adversarial examples, with the aim of crafting an image that, while visually resembling X_src (bird), is misclassified as the malicious label y_tgt (dog) by the target model F(X).

Proposed System Design
Algorithm 1 provides a detailed overview of the implementation steps for the Gradient Optimization-based Black-Box Boundary Attack. In this algorithm, B denotes the size of the random vector set, R_s represents the sampling space, τ signifies the threshold for the distance between the selected vector and the gradient, ξ stands for the step size for moving the boundary points in the direction of the gradient, ε represents the maximum threshold of the distance between the image moved by a certain step size and the current adversarial sample, and I indicates the number of iterations, which is typically set to 100. The specific implementation steps are outlined as follows.
In step 1, suitable coefficients are selected for X_src and X_tgt during initialization to guarantee that, when misclassified as y_tgt, the image X_t (t = 0) exhibits minimal deviation from X_src. The resulting image X_0 is then used as the original image for the subsequent boundary attack.
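The initialization in step 1 can be illustrated with a short sketch. It assumes, hypothetically, that the "suitable coefficients" are a single blending weight alpha between X_src and X_tgt found by binary search, that the model is a callable returning a hard label, and that images are flat lists of floats; none of these details are confirmed by the text.

```python
def blend(x_src, x_tgt, alpha):
    """Convex combination (1 - alpha) * x_src + alpha * x_tgt."""
    return [(1.0 - alpha) * s + alpha * t for s, t in zip(x_src, x_tgt)]


def initial_adversarial(model, x_src, x_tgt, y_tgt, tol=1e-3):
    """Binary search for the smallest blending weight still classified as y_tgt.

    Assumes model(x_tgt) == y_tgt and model(x_src) != y_tgt, so the decision
    boundary is crossed somewhere on the segment between the two images.
    """
    if model(x_tgt) != y_tgt:
        raise ValueError("x_tgt must be classified as y_tgt")
    low, high = 0.0, 1.0  # low: not yet y_tgt, high: classified as y_tgt
    while high - low > tol:
        mid = (low + high) / 2.0
        if model(blend(x_src, x_tgt, mid)) == y_tgt:
            high = mid  # still adversarial, move closer to the source
        else:
            low = mid  # crossed the boundary, back off towards the target
    return blend(x_src, x_tgt, high)


# Toy usage with a hypothetical threshold classifier on the first coordinate.
toy_model = lambda x: 1 if x[0] > 0.5 else 0
x_0 = initial_adversarial(toy_model, [0.0, 0.0], [1.0, 1.0], y_tgt=1)
```

Each loop iteration costs one model query, so the tolerance `tol` directly trades initial perturbation size against query budget.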
In steps 2 to 7, at t = 1, the initial sampling space is randomly sampled to acquire perturbations. Subsequent boundary point sampling is then carried out in the optimal-dimension subspace R_θ corresponding to the guidance vectors. The low-dimensional samples obtained are subsequently projected back into the original input space. To address the potential loss of vector information resulting from nonlinear transformations, a linear projection method is used to derive the perturbation vector for the next boundary point. To maintain consistent guidance from historical information throughout the gradient estimation process, the sampling subspace constructed from historical data is dynamically updated. The iteration of historical information does not require the design of an adaptive factor, which effectively reduces the algorithm's complexity. Following the sampling process, the outcomes are projected back into the original space, resulting in B perturbation vectors.
Algorithm 1, lines 4 to 7:
4: get U_rnd ∈ R_s
5: else:
6: get V_rnd ∈ R_θ
7: U_rnd = Bil_Interp(V_rnd)
In steps 8 to 9, a random perturbation is added to the current image to create a new input, denoted as X_q. This input X_q is then fed into the target model, and the resulting output is transformed to derive the value of the indicator function ϕ(X_q), thus completing the gradient estimation.
In steps 10 to 11, the process involves generating the guidance vector set.The GOBA utilizes historical information obtained through sampling to construct the guidance vector set.The selection of information is illustrated in Figure 3.
Each random vector generated near a boundary point is represented by U_rnd. After the gradient vector G is calculated, historical vectors are added to or removed from the guidance vectors according to a predefined threshold τ, so that the condition ||U_rnd − G||_p < τ is met. The threshold τ is updated continuously to keep the number of vectors within a fixed range. Subsequently, a subset of vectors is chosen to form the guidance vector set, which in turn guides the update of the next boundary point.
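The threshold-based selection of guidance vectors can be sketched as below. The multiplicative update of τ, the `factor` value, and the `min_count`/`max_count` bounds are hypothetical choices made for illustration; the text only states that τ is updated to keep the number of vectors within a fixed range.

```python
def lp_distance(u, g, p=2):
    """L_p distance between a historical vector u and the gradient g."""
    return sum(abs(a - b) ** p for a, b in zip(u, g)) ** (1.0 / p)


def select_guidance(history, grad, tau, min_count, max_count, factor=1.5):
    """Keep historical vectors within distance tau of the gradient.

    Returns the kept vectors and an updated tau: tau shrinks when too many
    vectors pass the test and grows when too few do (assumed update rule).
    """
    kept = [u for u in history if lp_distance(u, grad) < tau]
    if len(kept) > max_count:
        tau /= factor
    elif len(kept) < min_count:
        tau *= factor
    return kept, tau


# Toy usage with made-up two-dimensional vectors.
history = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
guidance, tau = select_guidance(history, [1.0, 0.0], tau=0.5, min_count=1, max_count=3)
```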
When dealing with a large sample space, the effectiveness of random sampling attacks in high-dimensional spaces depends heavily on the scale of the sampling subspace. Conducting attacks at an optimal scale can significantly improve efficiency. To fine-tune the subspace scale, PGAN (Progressive GAN) [31] is leveraged. Several techniques have been proposed to expedite the attack process, including dimensionality reduction [22,35] and low-frequency constraints [29,36].
Upon acquiring the historical vector set, normalization and dimensionality reduction are performed to adjust to the optimal dimension corresponding to the current dataset. In comparison to alternative dimensionality reduction approaches, bilinear interpolation stands out for its simplicity and speed. Consequently, the vectors within the set undergo bilinear interpolation, reducing them to the optimal-dimension space and thus forming the sampling space R_θ.
The set of random vectors, represented as U_rnd = [ω_1, ..., ω_n] ∈ R^{m×n}, comprises n orthogonal basis vectors in R^m. Here, the query space R_s ⊆ R^m denotes the sampling within the original space. Upon acquiring the guidance vectors in the reduced dimension, they are used to construct the sampling space R_θ of random vectors. The objective is to sample random perturbations from the optimal dimension rather than directly from the original space R^m. A vector V_rnd ∈ R_θ is randomly sampled from the unit ball in R_θ. If span(θ) = R^m, then this sampling process is equivalent to sampling in the original space R^m.
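The mapping U_rnd = Bil_Interp(V_rnd) can be illustrated with a small pure-Python sketch. The grid sizes, the corner-aligned coordinate mapping, and the function names are assumptions made for illustration; the text does not specify how the interpolation is parameterised.

```python
import math
import random


def sample_unit_ball(h, w, rng):
    """Sample an h x w grid uniformly from the unit ball of dimension h * w."""
    flat = [rng.gauss(0.0, 1.0) for _ in range(h * w)]
    norm = math.sqrt(sum(c * c for c in flat))
    radius = rng.random() ** (1.0 / (h * w))  # radius law for a uniform ball
    flat = [radius * c / norm for c in flat]
    return [flat[r * w:(r + 1) * w] for r in range(h)]


def bilinear_resize(grid, out_h, out_w):
    """Resize a 2-D grid with bilinear interpolation (corners aligned)."""
    in_h, in_w = len(grid), len(grid[0])
    out = []
    for i in range(out_h):
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(math.floor(y))
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(math.floor(x))
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            top = (1 - wx) * grid[y0][x0] + wx * grid[y0][x1]
            bottom = (1 - wx) * grid[y1][x0] + wx * grid[y1][x1]
            row.append((1 - wy) * top + wy * bottom)
        out.append(row)
    return out


# Sample in a hypothetical 2 x 2 subspace and project to a 4 x 4 image grid.
v_rnd = sample_unit_ball(2, 2, random.Random(0))
u_rnd = bilinear_resize(v_rnd, 4, 4)
```

Sampling 4 values instead of 16 is the point of the reduced space: the interpolated perturbation is smooth, and far fewer random coordinates have to be explored per query.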
In steps 12 to 14, the step size ξ for movement in the gradient direction is calculated. The image classification result is continuously updated based on the adjusted image to ensure the image label remains y_tgt, with ξ_t representing the step size at the t-th step. Consequently, the prediction score for the adversarial class is expected to increase.
In step 16, within the iterative loop of boundary points, the adversarial example X_t is advanced by a distance of ξ_t along the gradient direction G_t.
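A possible form of the step-size computation and the gradient move is sketched below. The initial step ||X_t − X_src||_2 / sqrt(t) follows the convention used by HSJA and is an assumption here, as is the halving rule that keeps the label at y_tgt; the text only states that the step depends on the distance to X_src, shrinks with the iteration count, and preserves the label.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two flat images."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def gradient_step(model, x_t, x_src, grad, y_tgt, t, max_halvings=20):
    """Move x_t along grad with a step that shrinks until the label stays y_tgt.

    t is the 1-based iteration index. Returns the moved image and the step
    size used; if no step keeps the label, x_t is returned unchanged with 0.0.
    """
    xi = l2_distance(x_t, x_src) / math.sqrt(t)  # assumed HSJA-style schedule
    for _ in range(max_halvings):
        candidate = [x + xi * g for x, g in zip(x_t, grad)]
        if model(candidate) == y_tgt:
            return candidate, xi
        xi /= 2.0  # step left the adversarial region, so halve it
    return list(x_t), 0.0


# Toy usage with a hypothetical threshold classifier on the first coordinate.
toy_model = lambda x: 1 if x[0] > 0.5 else 0
x_next, xi_t = gradient_step(toy_model, [0.6, 0.0], [0.0, 0.0], [1.0, 0.0], 1, t=1)
```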
Proceeding to step 17, within the iterative loop of boundary points, the adversarial example advanced along the gradient direction undergoes further refinement using the optimized binary search of Algorithm 2. The decision boundary range is progressively narrowed in the first 6 steps; the last 4 steps then search for the misclassified example X_t within a small perturbation range.
Algorithm 2: Binary Search
input: X_t, indicator function ϕ_x, ε
output: X_{t+1}
1: X_tem = X_tgt
2: while ||X_tem − X_t||_p > ε do:
3: X_tem = X_tem / 2
6: else: break
7: while ϕ_x(X_t) = 0 do:
8: X_tem = 2X_tem
9: X_t = X_tem / 2 + X_t / 2
10: X_{t+1} = X_t
11: return X_{t+1}
In step 18, the distance between the current adversarial example X_{t+1} and the original image X_src is computed to evaluate the magnitude of the perturbation introduced to the existing example.
Fifty pairs of source and target images were randomly selected from the validation set, with the target model predicting different classes for the two images in each pair. A pre-trained ResNet-18 model was used as the target model for the experiment. The evaluation focused on six methods, HSJA, QEBA-S, QEBA-F, NLBA-AE, NLBA-VAE, and GOBA, comparing the perturbation sizes (measured by L2 distance) of generated samples under consistent constraints and the attack success rates under varying perturbation thresholds across the datasets. The L2 distance was employed as the criterion for evaluating perturbations, with the attack methods' superiority determined based on their attack success rates. Each attack method had distinct effects on individual images, which makes per-image comparisons of efficiency unreliable. Thus, the overall attack success rate across the dataset was considered a more comprehensive indicator of attack efficiency, as calculated in Equation (7). Perturbation thresholds were set at 1 × 10^−3 for the ImageNet dataset, 1 × 10^−4 for CelebA, and 5 × 10^−3 for MNIST.
Here, N represents the total number of samples, while N_adv denotes the number of samples for which the L2 distance of the generated adversarial samples, obtained after a limited number of queries, falls below a specified L2 threshold.
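The definitions of N and N_adv suggest that the attack success rate is the ratio N_adv / N. A minimal sketch under that assumption is given below; the function name and the example distances are hypothetical and are not experimental results.

```python
def attack_success_rate(l2_distances, threshold):
    """Fraction of samples whose final L2 perturbation is below the threshold.

    l2_distances holds one final L2 distance per attacked sample, measured
    after the query budget is exhausted.
    """
    if not l2_distances:
        raise ValueError("at least one sample is required")
    n_adv = sum(1 for d in l2_distances if d < threshold)
    return n_adv / len(l2_distances)


# Made-up distances for four samples, with a threshold of 1e-3.
rate = attack_success_rate([5e-4, 2e-3, 9e-4, 1e-3], threshold=1e-3)
```

With these made-up values, two of the four distances fall strictly below the threshold, so the rate is 0.5.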

Noise Similarity Analysis
To assess the efficacy of using a constructed sampling space for random vector sampling, this study calculated the similarity between random vectors (noise) and gradient vectors across six methodologies as the number of queries increased, as illustrated in Figure 4. This evaluation was crucial for validating the effectiveness of employing the bilinear interpolation method to compile historical information for sampling space construction. As shown in Figure 4a, on the MNIST dataset, a substantial enhancement in the correlation between gradient and random vectors was observed, with the similarity consistently exceeding 0.2 despite minor fluctuations as the number of queries increased. Similarly, as depicted in Figure 4b,c for the ImageNet and CelebA datasets, respectively, a significant increase in vector similarity was noted, sustaining levels above 0.04 and 0.05, respectively, alongside an upward trend in query volumes.
These findings underscore the benefits of dynamically modifying the sampling space and fine-tuning its dimensional parameters, which markedly bolsters the Monte Carlo algorithm's efficiency in gradient estimation compared to static-value approaches.This dynamic strategy enhances the alignment between noise and gradient directions, thereby optimizing the generation of adversarial examples with greater precision and reduced computational overhead.
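The similarity measure is not defined explicitly in this section. A minimal sketch, assuming it is the cosine similarity between each noise vector and the estimated gradient, averaged over the batch, could look as follows; both function names are hypothetical.

```python
import math


def cosine_similarity(u, g):
    """Cosine of the angle between a noise vector u and a gradient vector g."""
    dot = sum(a * b for a, b in zip(u, g))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_g = math.sqrt(sum(b * b for b in g))
    if norm_u == 0.0 or norm_g == 0.0:
        return 0.0  # similarity is undefined for a zero vector; report 0
    return dot / (norm_u * norm_g)


def mean_noise_similarity(noise_batch, grad):
    """Average cosine similarity between a batch of noise vectors and grad."""
    return sum(cosine_similarity(u, grad) for u in noise_batch) / len(noise_batch)


# Toy usage: one aligned and one orthogonal noise vector.
score = mean_noise_similarity([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0])
```

For directions drawn independently and uniformly in a high-dimensional space, the expected cosine similarity with any fixed vector is close to zero, which is why even small sustained values indicate useful guidance.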

Attack Performance Analysis
The experiment was designed to monitor the variations in the L2 norm distance throughout the adversarial attack process across the MNIST, ImageNet, and CelebA datasets under varying numbers of queries. This study conducted a comparative analysis of six distinct methods, focusing on the magnitude of perturbations generated as the number of queries increased. This analysis was quantified by measuring the L2 distance between the adversarial and original samples, with the results depicted in Figure 5. Notably, the L2 distance on the MNIST dataset was observed to be generally larger compared to that on ImageNet and CelebA, aligning with findings from prior research. This phenomenon is likely attributable to the increased challenge of deceiving models tasked with simpler problems, where the models' heightened sensitivity to perturbations demands larger modifications to the input images to successfully induce misclassification.

1. Adversarial examples crafted using the GOBA method demonstrate superior visual quality. Remarkably, when the L2 norm distance is minimized to a negligible level, these adversarial examples become virtually indistinguishable to the human eye from their original counterparts, yet they significantly impair the classification accuracy of machine learning models. In scenarios involving up to 10,000 queries, the GOBA achieved a reduction in the L2 norm distance between the adversarial and original images of 5.38 × 10^−4 for the MNIST dataset, 1.03 × 10^−5 for the ImageNet dataset, and 4.88 × 10^−7 for the CelebA dataset. This performance markedly outstrips that of the previous most effective method, the QEBA-S, with an average reduction in perturbation magnitude (measured by the L2 distance) of 54.31%. Thus, the GOBA surpasses not only the QEBA-S but also the other evaluated methods, demonstrating that it generates adversarial examples with minimal perturbation yet a maximal misclassification impact.
The proposed method for constructing a sampling space from historical query data is designed to make queries with random vectors more effective. By dynamically adjusting the sampling space, it seeks the optimal perturbation vector for a given image within a finite number of queries, thereby computing gradients with greater accuracy, minimizing the overall image perturbation, and significantly raising the attack success rate.
In stark contrast, alternative methods that rely on independent sampling fail to exploit the available model queries to improve accuracy, which often makes them inefficient.
This approach not only improves the strategic use of the query budget but also underscores the importance of adaptively adjusting the sampling space to obtain more precise gradient estimates. The outcome is a marked advance in the generation of adversarial examples, characterized by reduced perturbations while maintaining high levels of misclassification effectiveness.

2. Superior attack performance of the GOBA: Figure 6 illustrates the attack success rate curves of the six methods for generating adversarial samples on the three datasets. With a query budget of 1 k, the GOBA's attack success rate on the MNIST dataset is 2.67 times that of the HSJA and 2.28 times that of NLBA-VAE. Under a 1.5 k query budget on the ImageNet dataset, the GOBA's attack success rate is at least 2% higher than those of the HSJA and QEBA-S. On the CelebA dataset, it is at least 9% higher. Evidently, the GOBA's attack success rate surpasses those of the HSJA, QEBA, and NLBA.
In the experiments, QEBA, NLBA, and GOBA all leverage low-frequency noise for attacks, whereas HSJA employs Gaussian noise. The smooth characteristics of low-frequency noise allow images to exhibit texture features closely resembling real images. The classifier captures features introduced by low-frequency noise, incorporating them with the original features, thereby diminishing confidence in the correct class. Conversely, Gaussian noise is sharper, less likely to form features resembling real images, and is easily filtered out by linear filters, rendering the attack less effective. This contributes to the suboptimal performance of the HSJA, particularly with a low query count. Although the QEBA and NLBA attacks utilize low-frequency noise in experiments, the QEBA directly applies linear transformations based on uniform noise during adversarial sample initialization. Similarly, the NLBA involves nonlinear transformations, which substantially increase the time cost of the attack and make the attack effect unstable.

3. Efficient convergence in attack success rate: The GOBA exhibits faster convergence in attack success rate. Under the constraint of a 90% success rate, the GOBA requires ...
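The contrast between low-frequency and Gaussian noise discussed under point 2 can be illustrated with a small sketch. It assumes one common way of producing low-frequency noise, namely drawing noise on a coarse grid and upsampling it bilinearly to the image resolution. The function names, the grid sizes, and the `roughness` measure are hypothetical and are not taken from the evaluated attacks.

```python
import random


def gaussian_noise(size, rng):
    """Per-pixel i.i.d. Gaussian noise on a size x size grid."""
    return [[rng.gauss(0.0, 1.0) for _ in range(size)] for _ in range(size)]


def upsample_bilinear(small, out_size):
    """Bilinearly interpolate a square grid up to out_size x out_size (out_size > 1)."""
    n = len(small)
    scale = (n - 1) / (out_size - 1)
    out = []
    for r in range(out_size):
        fy = r * scale
        y0 = int(fy)
        y1 = min(y0 + 1, n - 1)
        ty = fy - y0
        row = []
        for c in range(out_size):
            fx = c * scale
            x0 = int(fx)
            x1 = min(x0 + 1, n - 1)
            tx = fx - x0
            top = small[y0][x0] * (1 - tx) + small[y0][x1] * tx
            bottom = small[y1][x0] * (1 - tx) + small[y1][x1] * tx
            row.append(top * (1 - ty) + bottom * ty)
        out.append(row)
    return out


def low_frequency_noise(size, coarse, rng):
    """Smooth noise: Gaussian noise on a coarse grid, upsampled to full size."""
    return upsample_bilinear(gaussian_noise(coarse, rng), size)


def roughness(img):
    """Mean absolute difference between horizontally adjacent pixels."""
    diffs = [abs(row[c + 1] - row[c]) for row in img for c in range(len(row) - 1)]
    return sum(diffs) / len(diffs)


rng = random.Random(0)
sharp = gaussian_noise(32, rng)
smooth = low_frequency_noise(32, 4, rng)
print("roughness, Gaussian noise:     ", round(roughness(sharp), 3))
print("roughness, low-frequency noise:", round(roughness(smooth), 3))
```

The upsampled noise varies slowly from pixel to pixel, which matches the smooth, texture-like character attributed to low-frequency noise above, while the per-pixel Gaussian noise changes sharply between neighbours.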
Table 1 provides a summary of the datasets and models employed for the HSJA, QEBA-S, QEBA-F, NLBA-AE, NLBA-VAE, and GOBA, presenting the attack success rate (ASR) and the perturbation size (L2 distance) between adversarial samples and original images with a fixed query number of 3 k. The data highlight a substantial improvement for the GOBA after 3 k queries, with a reduction in perturbation size (L2 distance) of at least 55.74% and an average increase in ASR of 13.78%. This underscores that the GOBA is a rapidly converging, query-efficient attack.
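The two summary quantities reported here can be computed along the following lines. This is a hedged sketch: it assumes that an attack counts as successful when the final L2 distance falls below a chosen threshold, a convention that this section does not spell out, and the distances and threshold below are made-up toy values rather than entries from Table 1.

```python
def attack_success_rate(distances, threshold):
    """Fraction of samples whose final L2 distance is at or below the threshold."""
    return sum(d <= threshold for d in distances) / len(distances)


def relative_reduction(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline` (0.5 means 50%)."""
    return (baseline - improved) / baseline


# Toy per-image distances after a fixed query budget (illustrative values only).
baseline_distances = [0.8, 0.6, 0.4, 0.2]
improved_distances = [0.4, 0.3, 0.2, 0.1]
threshold = 0.35  # hypothetical success threshold

mean_base = sum(baseline_distances) / len(baseline_distances)
mean_new = sum(improved_distances) / len(improved_distances)
print("ASR baseline:", attack_success_rate(baseline_distances, threshold))
print("ASR improved:", attack_success_rate(improved_distances, threshold))
print("relative L2 reduction:", relative_reduction(mean_base, mean_new))
```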

Security Analysis
The results achieved by the Black-Box Boundary Attack based on Gradient Optimization (GOBA) are encouraging in terms of attack success rates and query efficiency. The following examines the security aspects of the GOBA in more depth.
In adversarial attacks, a primary concern is the robustness of the proposed method against various defense mechanisms. Evaluating the GOBA's performance when confronted with common defense strategies employed by machine learning models, such as adversarial training, input preprocessing, and gradient masking, is essential. Such an evaluation facilitates the identification of more targeted defensive strategies. As adversarial attacks grow increasingly sophisticated, analyzing the detectability of adversarial samples generated by the GOBA using state-of-the-art methods enables an understanding of the limitations of current detection techniques and supports ongoing efforts to advance adversarially robust models.
In conclusion, the proposed GOBA method is crucial for ensuring the effectiveness and reliability of existing defense methods in real-world scenarios, contributing to the development of more robust and secure machine learning models and enhancing their resilience against adversarial attacks.

Conclusions
This paper addresses the issue of hard-label attacks and proposes an efficient query attack based on gradient direction optimization (GOBA). In the presence of gradients, this method utilizes historical query information as prior knowledge for optimizing random vectors, dynamically constructing an optimal dimensional sampling space based on historical information. The evaluation results of this method on three natural image datasets indicate that, under the same attack success rate threshold, the GOBA can reduce the query budget by over 40%, accompanied by an improvement of 13.78% in attack success rate with the minimum possible number of queries. However, this study still has limitations: existing hard-label attacks can be defended against by restricting queries near the boundary or by inserting additional 'unknown' classes for low-confidence inputs to expand decision boundaries, which renders existing methods ineffective. Therefore, how to introduce perturbations that are not limited to the vicinity of the boundary (global perturbations), or how to obtain decision boundaries with higher accuracy during attacks, should be further investigated. Subsequent research can focus on these two directions to generate higher-quality adversarial samples.

Figure 3. Selection of the guide vector.


Figure 5. L2 distance between the adversarial example and the target image in different datasets. (a) MNIST, (b) ImageNet, (c) CelebA.

Table 1. L2 distance and attack success rate of different attacks under a query budget of 3 k.