This section presents an active learning algorithm that integrates adversarial training with feature fusion for boundary data samples. This method employs adversarial training to enhance model robustness, thereby strengthening performance when processing noisy and complex data. Simultaneously, the active learning strategy reduces the required number of training samples, lowering overall training costs. Specifically, in our boundary data feature fusion approach for active learning, samples selected in each round are initially augmented through adversarial training to generate adversarial counterparts. These adversarial samples are subsequently merged with the existing labeled pool, enabling the model to fully exploit the augmented data during updates.
4.1. FGSM Adversarial Training
Adversarial training serves as a method to improve model generalization by incorporating adversarial samples into the training dataset. This approach compels the model to learn from these challenging examples, thereby improving its ability to defend against adversarial perturbations. The Fast Gradient Sign Method (FGSM) is one of the most widely used techniques for generating adversarial samples, and FGSM adversarial training is a training methodology that employs FGSM to generate adversarial samples and incorporate them into the training set [33].
FGSM is an algorithm that efficiently generates adversarial perturbations by computing the gradient of the loss function with respect to the input. The fundamental principle involves applying a small perturbation along the direction of the loss function’s gradient to input samples, thereby causing the model to produce erroneous predictions. This perturbation is computed individually for each input sample, making it inherently sample-specific [34]. The FGSM generation process follows these steps:
Calculate the gradient: For each input sample and its corresponding label, we first compute the gradient of the loss function with respect to the input:

$$g = \nabla_x J(\theta, x, y),$$

where $J(\theta, x, y)$ represents the loss function with model parameters $\theta$, input sample $x$, and true label $y$. This gradient indicates the direction in which small changes to the input would most significantly increase the loss.
Generate adversarial perturbations: Using the computed gradient, FGSM takes the sign of the gradient (its elementwise direction) and adds a perturbation along that direction, with the magnitude controlled by a small constant:

$$\eta = \epsilon \cdot \mathrm{sign}\big(\nabla_x J(\theta, x, y)\big),$$

where $\mathrm{sign}(\cdot)$ is the sign function that extracts the sign of each element in the gradient vector, and $\epsilon$ is the hyperparameter that controls the perturbation magnitude. This formula applies a small perturbation to the input sample along the direction of the loss function’s gradient. The sign function ensures that the perturbation moves in the direction that would maximally increase the loss, while $\epsilon$ bounds the perturbation size so that the adversarial sample remains similar to the original input.
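To make these two steps concrete, the following minimal PyTorch sketch generates an FGSM adversarial example. The cross-entropy loss, the [0, 1] pixel range, and the function name fgsm_perturb are illustrative assumptions rather than the paper’s exact implementation.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, epsilon):
    """Generate FGSM adversarial examples for inputs x with labels y.

    A minimal sketch: the cross-entropy loss and the [0, 1] input
    range are assumptions, not the paper's exact configuration.
    """
    x = x.clone().detach().requires_grad_(True)  # track gradients w.r.t. the input
    loss = F.cross_entropy(model(x), y)          # J(theta, x, y)
    loss.backward()                              # fills x.grad with grad_x J
    eta = epsilon * x.grad.sign()                # eta = epsilon * sign(grad_x J)
    x_adv = (x + eta).clamp(0.0, 1.0).detach()   # keep the sample in the valid range
    return x_adv
```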
The generated adversarial samples are incorporated alongside the original samples during the training process, enabling the model to learn correct predictions when confronted with adversarially perturbed inputs. This approach enhances the model’s robustness by exposing it to challenging examples that lie near the decision boundary.
In adversarial training, the training process comprises two complementary components. First, clean-sample training follows the traditional approach of using the original data for model training. Second, adversarial-sample training incorporates adversarial samples generated using FGSM into the training data. During each training step, a batch of data is selected from the training set, where each sample consists of an input x and its corresponding label y. FGSM is then applied to generate adversarial perturbations for each sample, producing the corresponding adversarial examples.
The training procedure calculates losses for both the original samples and their adversarial counterparts and combines them for backpropagation to update the model parameters:

$$\mathcal{L}_{total} = \lambda \, J(\theta, x, y) + (1 - \lambda) \, J(\theta, x_{adv}, y),$$

where $J(\theta, x, y)$ represents the loss computed on the original sample, $J(\theta, x_{adv}, y)$ denotes the loss computed on the adversarial sample, and $\lambda$ is the mixing coefficient described below. This combined loss function ensures that the model simultaneously learns to make correct predictions on both normal and adversarially perturbed inputs.
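A sketch of one such training step is given below, reusing the fgsm_perturb sketch above. The convex $\lambda$-weighting and the default lam=0.5 are assumptions consistent with the mixing coefficient discussed in the next paragraph, not the paper’s tuned values.

```python
def adversarial_training_step(model, optimizer, x, y, epsilon, lam=0.5):
    """One training step on both clean and FGSM-perturbed inputs.

    lam weights the clean loss against the adversarial loss; its
    default here is illustrative, not the paper's tuned setting.
    """
    x_adv = fgsm_perturb(model, x, y, epsilon)    # adversarial counterparts
    optimizer.zero_grad()                         # clear stale gradients
    loss_clean = F.cross_entropy(model(x), y)     # loss on original samples
    loss_adv = F.cross_entropy(model(x_adv), y)   # loss on adversarial samples
    loss = lam * loss_clean + (1.0 - lam) * loss_adv
    loss.backward()                               # backpropagate the combined loss
    optimizer.step()
    return loss.item()
```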
For all hyperparameters, unless otherwise specified, we adopt the following settings in all experiments. FGSM perturbation magnitude $\epsilon$: for image inputs we search over values including 2/255, 4/255, and 8/255, report results for the selected value in each experiment, and use a fixed default unless otherwise noted. Loss mixing coefficient $\lambda$: we weight the clean and adversarial losses as a $\lambda$-mixed combination (corresponding to Equation (14)) and tune $\lambda$ over a small range around its default. Number of neighbors $k$: for modules that require neighbor retrieval (e.g., k-NN consistency regularization or neighborhood-based augmentation, when applicable in our pipeline), we select $k$ from a small candidate set with a fixed default.
FGSM adversarial training constitutes an effective method for improving model robustness. By generating adversarial samples and incorporating them into the training data, this approach enhances the model’s ability to adapt to input perturbations, enabling the model to maintain performance when confronted with adversarial examples during inference.
4.2. Adversarial Training for Sample Selection
This method combines active learning and adversarial training to enhance model stability and performance through the following process:
First, the trained model performs forward propagation to extract feature representations from the final layer for each sample, as expressed in Equation (3). We then select the most representative samples by computing pairwise similarity using a combined Manhattan and Euclidean distance metric, as shown in Equation (6). This combination leverages the strengths of both metrics to achieve stable similarity calculations, particularly for features with varying scales.
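As an illustration, the combined metric can be computed over a batch of penultimate-layer features as follows. The equal weighting of the two distance terms is an assumption; Equation (6) defines the exact form used in the paper.

```python
def combined_distance(features):
    """Pairwise Manhattan + Euclidean distances over feature rows.

    features: tensor of shape (N, d). Equal weighting of the two
    metrics is an assumption; see Equation (6) for the exact form.
    """
    d_l1 = torch.cdist(features, features, p=1)  # Manhattan (L1) distances
    d_l2 = torch.cdist(features, features, p=2)  # Euclidean (L2) distances
    return d_l1 + d_l2                           # combined distance matrix
```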
After identifying the n most similar samples through inter-sample similarity calculations, we merge them using the MixUp fusion method to generate new training instances that enhance model generalization, as shown in Equation (7). The model evaluates classification confidence and returns the corresponding index list for active learning selection.
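A minimal sketch of the fusion step follows, assuming the standard MixUp formulation with the mixing coefficient drawn from Beta(α, α); the alpha=1.0 default is illustrative, and Equation (7) gives the exact fusion rule.

```python
def mixup_pair(x_i, x_j, alpha=1.0):
    """Fuse two samples with MixUp; alpha=1.0 is an illustrative default."""
    lam = torch.distributions.Beta(alpha, alpha).sample()  # mixing coefficient
    return lam * x_i + (1.0 - lam) * x_j                   # interpolated sample
```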
Following sample selection by the boundary data feature fusion algorithm, the original images undergo forward propagation through the neural network to obtain prediction results via the fully connected layer. FGSM then calculates the gradient of the loss function with respect to the input image through backpropagation, revealing how each pixel should be modified to maximize the loss. Larger gradient magnitudes indicate greater pixel impact on the loss function.
The perturbation is computed and adversarial samples are generated according to

$$\eta = \epsilon \cdot \mathrm{sign}\big(\nabla_x J(\theta, x, y)\big),$$

where $\epsilon$ represents the perturbation step size, $\nabla_x J(\theta, x, y)$ denotes the gradient of the loss function with respect to input sample $x$, $J$ is the model’s loss function, and $\mathrm{sign}(\cdot)$ is the sign function. The parameter $\epsilon$ determines the perturbation magnitude: smaller values are used for simpler tasks such as MNIST or SVHN, while larger values are selected for complex tasks such as CIFAR-10. Larger perturbations make training for robust performance more challenging.
The calculated perturbation is added to the original sample to generate the adversarial sample:

$$H = x + \eta,$$

where $H$ represents the adversarial sample resulting from adding the perturbation $\eta$ to the original input $x$.
Algorithm overview: We work with a model $f_\theta$ that iteratively improves using an unlabeled pool $U$ and a labeled set $L$. For each $x \in U$, we extract a feature embedding $z_x$ from the model’s penultimate layer and measure pairwise similarity using the combined distance $D = D_{L1} + D_{L2}$. Each example $x$ has a neighborhood consisting of its top-$n$ most similar peers under $D$. We synthesize interpolated examples using MixUp with $\tilde{x} = \lambda x_i + (1 - \lambda) x_j$ and $\lambda \sim \mathrm{Beta}(\alpha, \alpha)$ to probe the decision boundary and calibrate uncertainty.
Based on uncertainty $u(x)$, we select the $B$ most uncertain samples to form a batch $\mathcal{B}$ for labeling, obtaining labels $y$. For each selected sample $x \in \mathcal{B}$, we create an adversarial counterpart using FGSM: we form a pseudo-label $\hat{y}$ from the model’s current prediction, compute the input gradient $\nabla_x J(\theta, x, \hat{y})$, and craft a perturbation $\eta = \epsilon \cdot \mathrm{sign}(\nabla_x J(\theta, x, \hat{y}))$, producing the adversarial example $x^{adv} = x + \eta$ (clipped to the valid input range). The labeled set is augmented with both clean and adversarial pairs $(x, y)$ and $(x^{adv}, y)$ for $x \in \mathcal{B}$.
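The sketch below illustrates one selection-and-augmentation round under these definitions, reusing fgsm_perturb from Section 4.1. Predictive entropy stands in for the paper’s confidence-based uncertainty score, and oracle is a hypothetical labeling function; both are assumptions.

```python
def select_and_augment(model, unlabeled, budget, epsilon, oracle):
    """One round of uncertainty-based selection plus FGSM augmentation.

    Entropy is an assumed stand-in for the paper's confidence score;
    `oracle` is a hypothetical function returning true labels.
    """
    with torch.no_grad():
        probs = F.softmax(model(unlabeled), dim=1)
        entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
    idx = entropy.topk(budget).indices            # B most uncertain samples
    x_sel = unlabeled[idx]
    y_pseudo = model(x_sel).argmax(dim=1)         # pseudo-labels for FGSM
    x_adv = fgsm_perturb(model, x_sel, y_pseudo, epsilon)
    y_true = oracle(idx)                          # query true labels
    return (x_sel, y_true), (x_adv, y_true)       # clean and adversarial pairs
```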
Training minimizes the combined objective $\mathcal{L} = \lambda \, J(\theta, x, y) + (1 - \lambda) \, J(\theta, x^{adv}, y)$, where $\lambda$ balances clean accuracy and robustness, and $\epsilon$ controls perturbation strength. This cycle of feature extraction, neighborhood identification, uncertainty-based selection, and adversarial augmentation repeats until convergence. By selecting samples that are uncertain and lie in dense feature regions, then training on their adversarial variants, the algorithm enhances model robustness while reducing labeling costs, as shown in Algorithm 2.
| Algorithm 2 Distance-Measured Data Mixing with Adversarial Training (DM2-AT) |

1: Input: Model $f_\theta$, unlabeled pool $U$, labeled pool $L$, batch size $B$, neighbor count $n$, MixUp parameter $\alpha$, FGSM step size $\epsilon$
2: Output: Trained model $f_\theta$
3: Begin:
4: while model has not converged do
5:   Extract features $z_x = f_\theta(x)$ for all $x \in U$.
6:   For each $x \in U$, find its top-$n$ neighbors using the L1+L2 distance.
7:   Generate a synthetic set via MixUp on pairs from $U$ and their neighbors.
8:   Score the synthetic set with model uncertainty to select the $B$ most uncertain original samples $\mathcal{B}$.
9:   for each $x \in \mathcal{B}$ do
10:    $\hat{y} \leftarrow \arg\max f_\theta(x)$ // $\hat{y}$ is the model’s predicted label
11:    $x^{adv} \leftarrow x + \epsilon \cdot \mathrm{sign}(\nabla_x J(\theta, x, \hat{y}))$
12:  end for
13:  Query true labels $y$ for the selected samples $x \in \mathcal{B}$.
14:  Update $L \leftarrow L \cup \{(x, y)\} \cup \{(x^{adv}, y)\}$.
15:  Retrain $f_\theta$ on $L$ using the combined loss for original and adversarial samples.
16: end while
17: Return $f_\theta$
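Tying the sketches together, one DM2-AT iteration might look like the following. The list-of-batches representation of the labeled pool and the function names are simplifying assumptions, not the paper’s implementation.

```python
def dm2_at_round(model, optimizer, unlabeled, labeled, budget,
                 epsilon, alpha, oracle):
    """One DM2-AT iteration: select, augment, and retrain (a sketch).

    `labeled` is assumed to be a list of (x_batch, y_batch) tuples.
    """
    clean_pairs, adv_pairs = select_and_augment(
        model, unlabeled, budget, epsilon, oracle)
    labeled += [clean_pairs, adv_pairs]           # grow the labeled pool
    for x, y in labeled:                          # retrain with the combined loss
        adversarial_training_step(model, optimizer, x, y, epsilon)
    return model
```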