Review

Advances in Brain-Inspired Deep Neural Networks for Adversarial Defense

1 School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China
2 Center of Brain Science, Beijing Institute of Basic Medical Sciences, Beijing 100850, China
3 Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai 200031, China
4 Department of Emergency, First Medical Center, Chinese PLA General Hospital, Beijing 100853, China
5 Chinese Institute for Brain Research, Beijing 102206, China
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Electronics 2024, 13(13), 2566; https://doi.org/10.3390/electronics13132566
Submission received: 17 May 2024 / Revised: 16 June 2024 / Accepted: 21 June 2024 / Published: 29 June 2024
(This article belongs to the Section Artificial Intelligence)

Abstract

Deep convolutional neural networks (DCNNs) have achieved impressive performance in image recognition, object detection, and related tasks. Nevertheless, they are susceptible to adversarial attacks and interfering noise. Adversarial attacks can mislead DCNN models by manipulating input data with small perturbations, posing security risks to intelligent system applications, yet these small perturbations have very limited perceptual impact on humans. Therefore, research on brain-inspired adversarially robust models has gained increasing attention. Starting from the concepts and schemes of adversarial attacks, we review conventional adversarial attack and defense methods and compare the advantages and differences between brain-inspired robust neural networks and conventional adversarial defenses. We further review existing adversarially robust DCNN models, including methods inspired by the early visual system and methods supervised by neural signals. Representative examples validate the efficacy of brain-inspired methods for designing adversarially robust models, which may benefit further research on brain-inspired robust deep convolutional neural networks and their intelligent system applications.

1. Introduction

Over recent years, supported by big data and high-performance computing, the theoretical foundations and technical advancements of deep learning have been studied intensively. A series of intelligent algorithms, particularly deep convolutional neural networks (DCNNs), have been widely used and have achieved remarkable success in various fields. In critical application scenarios such as autonomous driving [1], pedestrian detection [2], face recognition [3], and facial payment, DCNN algorithms have demonstrated impressive performance and have been extensively deployed in real-world environments. Nevertheless, most DCNN models cannot guarantee accurate results under all conditions. Therefore, the robustness of DCNN models has attracted increasing attention in deep learning.
Although DCNNs borrow computing schemes from the human brain and exhibit strong performance in most classification tasks, they are far from ideally robust in real applications. Robustness can be defined as the stability and resilience of a system or model when confronted with interferences targeting its security or effectiveness; it primarily encompasses the capacity of the model to withstand and mitigate such interferences while preserving its intended functionality and performance. Adversarial perturbations or image corruption can easily lead to incorrect predictions in DCNNs, whereas human perception is much less affected by adversarial perturbations or image degradation. This indicates a gap between DCNNs and the generalization capabilities of the biological visual system [4]. How to enhance the robustness of convolutional neural networks is one of the recent emerging topics in deep learning [5]. Among such studies, brain-inspired methods that incorporate findings from neuroscience research can help to narrow the gap between convolutional neural networks and the biological visual system [6]. Brain-inspired schemes are believed to enhance the robustness, generalization, interpretability, and security of deep neural networks.
Reviewing the existing related studies, early research focused on defense against specific adversarial attack algorithms. More recently, researchers have shifted their attention to narrowing the gap between convolutional neural networks and human perception of adversarial samples by drawing inspiration from neuroscience [7]. These models are designed to enhance security and defense capabilities, enabling systems to better counter evolving threats and attacks [8]. By delving into the biological visual system, researchers have discovered key mechanisms such as the push–pull inhibition of primary visual cortex (V1) neurons [9] and surround modulation [10]. These mechanisms have been applied to the design and training of DCNNs, thereby improving their robustness in image classification, which is of significant importance for the performance and reliability of computer vision systems in practical applications.
To comprehensively survey the current mainstream adversarial defense strategies and summarize the defense methods for enhancing neural network robustness based on different principles, this paper provides a comprehensive exposition of the representative brain-inspired approaches in the field. These works primarily leverage the characteristics and mechanisms of the animal visual system, applying them to the design and training of neural networks to make deep neural networks more consistent with the animal visual system, thereby approaching the cognitive capabilities of humans. These representative works demonstrate the effective integration of neuroscience and deep learning, serving as valuable adversarial defense methods in the domain of adversarial attack and defense, with significant implications for enhancing the adversarial robustness of object classification.
The remainder of this paper is organized as follows. Section 2 presents the preliminaries of this topic and elaborates on the traditional adversarial defense schemes. Section 3 summarizes the applications of bio-inspired deep neural networks in adversarial defense, while Section 4 elaborates on the conclusions.

2. Adversarial Attacks and Defenses in Deep Learning

2.1. Adversarial Attacks

2.1.1. Adversarial Examples

In early research, Szegedy et al. [11] discovered that, when noise is added to clean image samples, a deep learning model with high classification accuracy will misclassify the samples with high confidence, indicating poor generalization of typical deep learning models to adversarial samples. The noise added to the sample images is typically invisible to human eyes as it involves only subtle modifications of the original images. These modified image samples are known as adversarial examples. Figure 1 presents an adversarial example: the original sample’s true label and the classification result of a DCNN model are both “Panda”, but, after carefully designed small perturbations are applied to the original example, the model classifies it as “Gibbon”. This phenomenon is referred to as an adversarial attack, and the mechanisms developed to counter adversarial attacks are known as adversarial defenses. It is worth noting that there are no visually perceptible differences between the original and adversarial examples, indicating that the human visual system can effectively mitigate the impact of adversarial perturbations. Therefore, incorporating human visual mechanisms into DCNNs can enhance their adversarial robustness, bringing them closer to the robustness exhibited by the human visual system.
The existence of adversarial samples can be generally attributed to two reasons. From the perspective of the datasets, data incompleteness can pose challenges to the learning and generalization process of the model. Unlabeled or mislabeled data in the dataset can influence the model and result in less precise and robust decision boundaries [12]. Considering the distribution of the data, they often reside in low-dimensional manifold regions within a high-dimensional space. As the dimensionality increases, the boundaries of the data distribution become increasingly ambiguous, leading to sparsity of data points in the high-dimensional space. Exploiting this characteristic, attackers can generate adversarial samples by introducing small perturbations that push the samples toward the boundaries of the data distribution. In terms of the properties of deep learning models, Szegedy et al. [11] highlighted that the fundamental cause of adversarial samples lies in the nonlinear decision boundaries of deep neural networks and their vulnerability in high-dimensional spaces. Furthermore, Goodfellow et al. [13] specifically proposed the high-dimensional linear hypothesis, pointing out that the linear nature of the local space of the model contributes to the generation of adversarial samples.
Adversarial attacks involve making small but targeted modifications to input data, causing the model to produce incorrect classification results [14]. In image classification, for a given deep learning model $f(x) = y$, the input dataset is defined as $\{(x_i, l_i)\}_{i=1}^{N}$, where $x_i$ is a data sample with label $l_i$ and $N$ is the size of the dataset. The target neural network is represented as $f(\cdot)$, with input $x$ and predicted output $y$. The attack adds a small perturbation to the input so as to optimize an adversarial loss function $J(\theta, x, y)$, where $\theta$ denotes the model weights. Adversarial attacks aim to find a small perturbation $r$ on the input $x$ of the target model $f(\cdot)$ such that $f(x + r) \neq y$. Let $x' = x + r$, where $x$ is the clean sample, $r$ is the adversarial perturbation, and $x'$ is the adversarial sample. For targeted attacks, the objective is to make the model output the desired target class, i.e., $f(x') = y_t$, where $y_t$ represents the target class.
Adversarial samples $x'$ approximate the samples $x$ under a specific distance metric:
$$x' : \; D(x, x') < \delta, \quad f(x') \neq y$$
where $D(\cdot, \cdot)$ is the distance metric and $\delta$ is a predefined distance constraint, also known as the allowed perturbation budget. Empirically, a small $\delta$ is adopted to guarantee the similarity between $x$ and $x'$, such that $x'$ is indistinguishable from $x$.
According to this definition, an adversarial sample $x'$ should be close to a benign sample $x$ under a specific distance metric. The most popular choice is the $l_p$ distance. The $l_p$ distance between $x$ and $x'$ is denoted as $\|x - x'\|_p$, where $\|\cdot\|_p$ is defined as follows:
$$\|v\|_p = \left( |v_1|^p + |v_2|^p + \cdots + |v_d|^p \right)^{1/p}$$
where $p$ is a real number and $d$ is the dimensionality of $v$. Specifically, the $l_0$ distance counts the number of elements of the benign sample $x$ modified by the adversarial attack, the $l_2$ distance measures the standard Euclidean distance between $x$ and $x'$, and the $l_\infty$ distance measures the maximum difference between corresponding elements of the clean and adversarial samples.
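As a concrete illustration of these metrics, the following minimal PyTorch sketch computes the three norms of a perturbation; the tensor shapes and values are arbitrary toy data, not taken from any experiment in this paper.

```python
import torch

def perturbation_norms(x, x_adv):
    """Return the l0, l2, and l_inf norms of the perturbation r = x_adv - x."""
    r = (x_adv - x).flatten()
    l0 = int((r != 0).sum())        # number of modified elements
    l2 = float(r.norm(p=2))         # standard Euclidean distance
    linf = float(r.abs().max())     # largest change to any single element
    return l0, l2, linf

# Toy example: perturb a single pixel of a 3x4x4 "image" by 0.03.
x = torch.zeros(3, 4, 4)
x_adv = x.clone()
x_adv[0, 1, 2] += 0.03
print(perturbation_norms(x, x_adv))  # (1, ~0.03, ~0.03)
```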

2.1.2. Adversarial Attack Methods

According to the attacker’s level of knowledge about and access to the target model [15], attack methods can be categorized into white-box attacks and black-box attacks. A white-box attack refers to a scenario where the attacker possesses complete knowledge about the internal structure and parameters of the target model, along with unrestricted access to its information, including datasets, model architecture, and training methods. Conversely, a black-box attack implies that the attacker has no knowledge about the internal structure or parameters of the target model and can only launch attacks based on the model’s input–output behavior. Currently, most attack algorithms rely on white-box access to generate adversarial samples. Therefore, this section presents white-box attack algorithms from four aspects.
(1)
Optimization-based attack algorithms. Optimization-based attack methods are commonly used techniques in adversarial attacks. They search for the minimal perturbation in the input space that maximally alters the model’s predictions, leading to misclassification of input samples or classification of input samples into incorrect target classes. Szegedy et al. [11] first proposed the Box-constrained L-BFGS attack, which applies the L-BFGS quasi-Newton optimizer under a box constraint on the perturbed input; it was the first optimization-based adversarial attack algorithm. Another optimization-based adversarial attack algorithm is the C&W (Carlini and Wagner) algorithm [16], an enhanced version of the Box-constrained L-BFGS attack. This algorithm introduces a perturbation vector to the input sample that approximates the minimal perturbation, enabling the perturbed sample to deceive the target model. It successfully bypassed the Defensive Distillation defense proposed by Papernot et al. [17].
(2)
Iterative attacks. Iterative attacks adjust the pixels or features of the original input sample over one or more steps to maximize the success rate of the attack. Goodfellow et al. [13] proposed the Fast Gradient Sign Method (FGSM) based on the observation that the locally linear nature of the model contributes to the generation of adversarial samples. FGSM performs a one-step update along the direction (i.e., the sign) of the gradient of the adversarial loss $J(\theta, x, y)$, increasing the loss in the steepest direction with the goal of causing misclassification. FGSM is a single-step adversarial attack that generates adversarial samples quickly and exhibits good transferability, but it has a relatively low attack success rate. To improve the success rate, Kurakin et al. [18] introduced the Basic Iterative Method (BIM), an iterative version of FGSM. BIM applies FGSM repeatedly with smaller step sizes, generating more powerful adversarial samples, but increases the computational cost significantly. In 2017, Madry et al. [19] proposed the Projected Gradient Descent (PGD) attack, which is essentially an iterative version of FGSM that projects each step back onto the allowed perturbation set; like BIM, it serves as one of the benchmark algorithms for evaluating model robustness (a minimal sketch of FGSM and PGD is given after this list). Moosavi-Dezfooli et al. [20] introduced DeepFool, a method that iteratively computes minimal perturbations. DeepFool is designed to deceive deep neural networks and guide them toward incorrect classifications by minimizing the distance between the input sample and the decision boundary, exploiting the locally linear approximation of the decision boundaries of neural networks.
(3)
Generative neural networks. Attack methods based on generative neural networks leverage the powerful capabilities of generative models to generate high-quality adversarial samples, causing misdirection and deception to the target model. Baluja et al. [21] were the first to utilize generative neural networks to generate adversarial samples and developed the Adversarial Transformation Network (ATN). Adversarial samples generated by the ATN exhibit strong attack potency and have a certain level of interpretability, but their transferability is relatively weak. To address this limitation, Hayes et al. [22] proposed the Universal Adversarial Network (UAN) attack algorithm to enhance the transferability of the ATN. The UAN attack trains a simple deconvolutional neural network to transform randomly sampled noise from a natural distribution into universal adversarial perturbations, which can be applied to various types of input samples. Additionally, Xiao et al. [23] introduced the concept of generative adversarial network (GAN) and proposed the AdvGAN. This network consists of a generator network and a discriminator network, similar to GAN. The generator network is responsible for generating adversarial samples, while the discriminator network is used to distinguish between real samples and adversarial samples.
(4)
Practical attacks. Adversarial samples can extend from the digital space to the physical world. For instance, Sharif et al. [24] conducted adversarial attack tests on face recognition systems in real-world scenarios and proposed an attack method using stickers attached to eyeglass frames to achieve their adversarial goals. Hu et al. [25] introduced Adversarial Texture (AdvTexture) to perform multi-angle attacks. AdvTexture can be applied to cover clothing of arbitrary shapes, allowing individuals wearing such clothing to evade human body detectors from various viewpoints, as shown in Figure 2. They also proposed a generative method called Circular Crop-Based Scalable Generation Attack to produce AdvTexture with repetitive structures.
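As referenced in item (2) above, the following is a minimal PyTorch sketch of FGSM and its iterative PGD variant. It assumes `model` is any differentiable classifier returning logits, that inputs lie in [0, 1], and that the loss is cross-entropy; the perturbation budget `eps`, step size `alpha`, and step count are left to the caller and are not values prescribed by the cited papers.

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    """Single-step FGSM: move each input element by eps along the sign of the loss gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    return (x + eps * grad.sign()).clamp(0, 1).detach()

def pgd(model, x, y, eps, alpha, steps):
    """Iterative FGSM with projection back onto the l_inf ball of radius eps around x."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)   # project onto the eps-ball
        x_adv = x_adv.clamp(0, 1)                  # keep a valid pixel range
    return x_adv.detach()
```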
To date, the vulnerability of deep learning models to adversarial attacks remains an open issue without a unified theoretical explanation, which limits their broader deployment in security-critical applications. In safety-critical autonomous driving systems, for instance, attackers could interfere with deep learning decision models by altering the appearance of traffic signs, potentially resulting in severe consequences. Therefore, addressing adversarial attack challenges is crucial for ensuring the security and robustness of deep learning models.

2.2. Adversarial Defenses

With the continuous evolution of adversarial attacks, adversarial defense has become a primary approach to enhance deep learning security. Currently, defense methods against adversarial attacks can be mainly categorized into data-level defenses and model-level defenses. Additionally, a promising and highly discussed approach is to leverage brain-inspired deep convolutional neural networks to enhance model robustness.

2.2.1. Data Preprocessing-Based Methods

(1)
Image transformation. Data-level defense, also known as data preprocessing, applies transformations to images before feeding them into the network, such as random cropping, flipping, and scaling, to reduce or eliminate the impact of adversarial perturbations [15]. Adversarial perturbations can also be mitigated with JPEG compression, although the defense provided by this method alone is limited. To address this issue, Liu et al. [26] redesigned the JPEG compression algorithm and proposed a defense method called feature distillation, which utilizes distributional information of image features to reduce the impact of adversarial perturbations. Bhagoji et al. [27] proposed using Principal Component Analysis (PCA) to compress input data, thereby reducing or eliminating the influence of adversarial perturbations. Jia et al. [28] proposed an image compression network that further eliminates adversarial perturbations and restores clean samples through a reconstruction process. Prakash et al. [29] introduced the pixel deflection method, which randomly selects several pixels and replaces them with randomly chosen nearby pixels, and applied wavelet denoising to remove the noise introduced by the pixel replacements. These individual defenses are designed for specific attacks. To address this limitation, Raff et al. [30] proposed a comprehensive defense that combines multiple preprocessing methods: before feeding images into the network, a series of transformations, such as JPEG compression, wavelet denoising, and non-local means filtering, are applied to remove adversarial perturbations from the samples.
(2)
Denoising networks. Denoising networks aim to remove adversarial perturbations from input data to mitigate the impact of adversarial attacks. Mustafa et al. [31] proposed a super-resolution defense that remaps off-manifold samples back onto the natural image manifold. This approach does not require modifications to the architecture or training process of the target classification model; instead, it preprocesses adversarial samples at the input layer, improving the image quality of adversarial samples while maintaining the model’s classification accuracy on original images. Osadchy et al. [32] regarded adversarial perturbations as noise and proposed a pixel-guided denoiser that uses filters to eliminate it. Liao et al. [33] extended this work by introducing a high-level representation guided denoiser, which denoises based on high-level feature representations. This method leverages a U-Net model to learn the noise distribution and the mapping relationship for denoising; after training, adversarial samples are fed into the U-Net to remove perturbations and restore clean images.
(3)
Adversarial training. Adversarial training augments the training set with adversarial samples generated by adversarial attacks, aiming to improve the model’s robustness against the corresponding perturbations; the concept was initially proposed in Ref. [13] (a minimal training-loop sketch is given after this list). Considering the inefficiency of training with a large number of adversarial samples, Kurakin et al. [18] introduced a training strategy that expands the training set and applies batch normalization, effectively improving the efficiency of adversarial training. Moosavi-Dezfooli et al. [20] pointed out that adversarial training only enhances robustness against the adversarial samples represented in the training set, and it may degrade the model’s performance on original samples. Yang et al. [34] observed that image datasets are generally separable and exploited this separability to balance the trade-off between model robustness and accuracy in adversarial defense.
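For item (3), a minimal adversarial-training loop might look as follows; `model`, `loader`, and `optimizer` are assumed to be standard PyTorch objects, `pgd` refers to the attack sketch in Section 2.1.2, and the default attack hyperparameters are illustrative rather than taken from the cited works.

```python
import torch.nn.functional as F

def adversarial_training_epoch(model, loader, optimizer, eps=8/255, alpha=2/255, steps=7):
    """One epoch of adversarial training: generate adversarial examples on the fly
    from each clean mini-batch and fit the model on them."""
    model.train()
    for x, y in loader:
        x_adv = pgd(model, x, y, eps, alpha, steps)  # attack from the earlier sketch
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x_adv), y)      # optionally mixed with the clean-sample loss
        loss.backward()
        optimizer.step()
```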

2.2.2. Model Robustness Enhancement Methods

Defense strategies for models primarily include model structure optimization, gradient regularization, and the utilization of auxiliary networks to enhance model robustness.
Neural networks often exhibit excessive sensitivity because the gradients of their outputs with respect to their inputs can be large. To address this issue, Ross et al. [35] proposed gradient regularization, which introduces regularization terms to constrain the gradients of model outputs with respect to inputs, reducing the sensitivity of the model to small variations in the input and weakening gradient-based attacks. Gradient obfuscation is a closely related concept and plays a prominent role in this line of work. It encompasses shattered gradients, random (stochastic) gradients, and vanishing or exploding gradients. Shattered gradients rely on non-differentiable preprocessing that corrupts the true gradient signal. Random gradients rely on stochastic defenses, where the network itself or the input undergoes random transformations, making it difficult for attackers to determine the true gradient and reducing the success rate of attacks. Vanishing and exploding gradients exploit the accumulation of derivatives through multiplicative factors, leading to gradients that are either too small or too large for attackers to use in generating adversarial samples. However, defenses based on gradient obfuscation can still be circumvented: Athalye et al. [36] broke seven out of nine gradient obfuscation-based defenses from ICLR 2018 using gradient approximation methods. Inspired by the idea of smoothing the model’s output, Hinton et al. [37] proposed knowledge distillation, which transfers the features of an original model to an auxiliary model, achieving neural network compression while maintaining prediction accuracy. Building on this idea, Papernot et al. [17] introduced defensive distillation, a defense method that enhances the robustness of deep neural networks. It trains an auxiliary distillation model to inherit the behavior of the main model: during training, the output of the main model is used as the (soft) label for the distillation model. Training with soft labels makes the model’s decision boundaries smoother, reducing sensitivity and improving robustness against adversarial examples.
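As a rough illustration of input-gradient regularization (not the exact formulation of Ross et al. [35]), the following sketch adds a penalty on the gradient of the cross-entropy loss with respect to the input; the double backward pass enabled by `create_graph=True` is what makes training with this objective more expensive, and the weighting `lam` is an arbitrary placeholder.

```python
import torch
import torch.nn.functional as F

def gradient_regularized_loss(model, x, y, lam=0.1):
    """Cross-entropy plus a penalty on the norm of the loss gradient w.r.t. the input,
    discouraging the model from being overly sensitive to small input changes."""
    x = x.clone().requires_grad_(True)
    ce = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(ce, x, create_graph=True)[0]   # keep graph for double backprop
    penalty = grad.pow(2).flatten(1).sum(dim=1).mean()
    return ce + lam * penalty
```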
Using additional networks to enhance model robustness has demonstrated promising results in adversarial defense. This approach provides an extra layer of defense, enhancing the model’s ability to withstand adversarial attacks and improving overall robustness. Akhtar et al. [38] proposed adding a defense network in front of the original network: the original network is responsible for the classification task, while the defense network is specifically designed to correct adversarial examples. By incorporating the trained defense network into the original model, the model’s predictions on adversarial examples remain consistent with those on clean examples. Lee et al. [39] employed generative adversarial networks (GANs), which consist of a generator and a discriminator, to generate realistic samples through adversarial and mutual learning. The generator network takes a random noise vector as input and attempts to generate samples similar to the training data, while the discriminator network receives real and generated samples as input and aims to classify them accurately. GAN-based defense methods can effectively protect against adversarial attacks when adequately trained.

2.2.3. Enhancing Model Robustness with Brain-Inspired Deep Neural Networks

With the continuous evolution of adversarial attacks, general defense methods have become less effective. Currently, the industry focuses mainly on using adversarial training as a defense mechanism against adversarial attacks. However, while adversarial training has proven effective in improving the robustness of models against adversarial attacks, it often requires extra computational resources and time for training. Furthermore, the trained model can only defend against attacks from the adversarial algorithms involved in the training process, rendering it ineffective against other attacks. Therefore, it is crucial to explore more efficient and cost-effective methods for defending against adversaries. To better address the adversarial attacks, researchers have started to seek biologically inspired adversarial defense strategies [40], drawing inspiration from the biological system and simulating the defense mechanisms of the brain cortex to train more robust deep convolutional neural networks. Through this interdisciplinary research, new perspectives and strategies are provided for the further development of deep learning.
It is worth mentioning that incorporating brain-inspired feature extraction into deep learning architectures has potential advantages. The brain has remarkable feature extraction capabilities, extracting key features from complex and noisy sensory inputs [41]. By leveraging the feature extraction mechanisms of the brain, more robust and efficient deep learning models can be designed to counter the impact of adversarial attacks. For example, convolutional neural networks are inspired by the visual cortex of the brain, where the convolution operation simulates the local receptive fields and weight-sharing mechanism of the cortical layers [42]. This fusion can enhance the model’s understanding of input data and its generalization capabilities.
Currently, research on the brain has focused mainly on the primary visual cortex (V1). V1 has garnered significant attention from researchers worldwide as a crucial component for understanding vision. As a key brain region for information processing, V1 exhibits various types of simple cells with orientation-selective receptive fields, extracting low-level features such as high signal-to-noise ratio edges, lines, and points. Therefore, introducing orientation-selective receptive fields from V1 into neural network models can enhance their robustness. Additionally, numerous studies in neuroscience have found that Gaussian convolution kernels align well with the receptive field properties of the visual system’s neurons. The Gaussian function is described as follows:
$$G(\mathbf{x};\sigma) = \frac{1}{2\pi\sigma^2}\exp\left(-\frac{\mathbf{x}^T\mathbf{x}}{2\sigma^2}\right)$$
where $\mathbf{x} = [x, y]^T$ denotes the plane coordinates and $\sigma \in \mathbb{R}^+$ stands for a scale in scale-space. Based on the Gaussian function, researchers have derived various convolution kernels, such as the widely used Laplacian of Gaussian (LoG) and Gabor filters. The LoG [10], denoted as $\nabla^2 G$, is obtained by taking the second-order partial derivatives of a standard (unnormalized) Gaussian function. The LoG function is described as follows:
$$LoG = \nabla^2 G(\mathbf{x};\sigma) = -\frac{1}{\pi\sigma^4}\left(1 - \frac{\mathbf{x}^T\mathbf{x}}{2\sigma^2}\right)\exp\left(-\frac{\mathbf{x}^T\mathbf{x}}{2\sigma^2}\right)$$
The LoG can capture edge information and extract salient edge features from adversarial samples that remain largely unaffected by adversarial perturbations. The Gabor filter [4] is a widely used feature extraction method in image processing and computer vision. It emulates the response characteristics of simple cells in the human visual system. Gabor filters are commonly used to extract texture, edge, and frequency information, and their multiscale and multi-orientation properties make the extracted features less sensitive to perturbations. The Gabor function consists of a two-dimensional grating with a Gaussian envelope and is described by the following equation:
$$G_{\theta, f, \phi, n_x, n_y}(x, y) = \frac{1}{2\pi\sigma_x\sigma_y}\exp\left(-0.5\left(\frac{x_{rot}^2}{\sigma_x^2} + \frac{y_{rot}^2}{\sigma_y^2}\right)\right)\cos\left(2\pi f x_{rot} + \phi\right)$$
where
$$x_{rot} = x\cos(\theta) + y\sin(\theta), \qquad y_{rot} = -x\sin(\theta) + y\cos(\theta)$$
$$\sigma_x = \frac{n_x}{f}, \qquad \sigma_y = \frac{n_y}{f}$$
$x_{rot}$ and $y_{rot}$ are the coordinates orthogonal and parallel to the grating, $\theta$ is the angle of the grating orientation, $f$ is the spatial frequency of the grating, $\phi$ is the phase of the grating relative to the Gaussian envelope, and $\sigma_x$ and $\sigma_y$ are the standard deviations of the Gaussian envelope orthogonal and parallel to the grating, defined as multiples ($n_x$ and $n_y$) of the grating cycle (the inverse of the frequency). By investigating and modeling V1 neurons, their application in computer vision enables specific feature extraction. Even in the presence of adversarial samples that modify the image structure, V1-inspired models can still selectively attend to relevant features such as edges, lines, and textures, giving the model a robustness of visual perception closer to that of primates.
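The Gabor function above can be sampled directly to build a filter bank. The following NumPy sketch does so with illustrative orientations, frequencies, and phases; these parameter values are placeholders, not the biologically fitted distributions used in the models discussed below.

```python
import numpy as np

def gabor_kernel(size, theta, f, phi, nx, ny):
    """Sample the Gabor function above on a size x size grid centred at the origin."""
    sigma_x, sigma_y = nx / f, ny / f
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    y_rot = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-0.5 * (x_rot**2 / sigma_x**2 + y_rot**2 / sigma_y**2))
    grating = np.cos(2 * np.pi * f * x_rot + phi)
    return envelope * grating / (2 * np.pi * sigma_x * sigma_y)

# A small multi-orientation, multi-frequency, multi-phase bank (illustrative parameters).
bank = [gabor_kernel(25, theta, f, phi, nx=0.5, ny=0.5)
        for theta in np.linspace(0, np.pi, 4, endpoint=False)
        for f in (0.05, 0.1, 0.2)
        for phi in (0.0, np.pi / 2)]
```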
Moreover, deep convolutional neural networks have shown outstanding performance in various tasks such as neural activity prediction and analysis thanks to their inspiration from biological structures [43]. Early research demonstrated that neural networks outperformed 37 models from neuroscience and computer vision in predicting representations in the temporal cortex [44]. Subsequent studies have found that neural networks show similar superiority in predicting voxel activity across the visual hierarchy. For instance, Cadena et al. [45] successfully predicted responses in the primary visual cortex (V1) of macaques using multi-layer convolutional neural networks, which were the only class of models shown to provide accurate V1 responses to natural stimuli with multiple nonlinearities. Additionally, Schrimpf et al. [46] introduced the concept of “Brain-Score”, which scores neural networks based on their similarity to core mechanisms of the brain, and they released the “brain-score.org” platform where neural networks for visual processing can be submitted to obtain brain scores and rankings relative to other models.

3. Models of Bio-Inspired Deep Neural Networks in Adversarial Defense

Adopting biologically inspired neural mechanisms in adversarial defense strategies provides a novel theoretical framework [47]. This framework facilitates a deeper understanding of the robustness issues in deep learning models and establishes a theoretical foundation for building powerful adversarial defense strategies. This section delves into several brain-inspired robust models and discusses how they effectively address the evolving challenges posed by adversarial attacks. Table 1 summarizes these representative models in the field. These works have reshaped the architecture and conceptual framework of biologically inspired mechanisms in computer vision, playing a crucial role in supporting and encouraging the development of deep neural network models for adversarial robustness.

3.1. Primary Visual Cortex-Inspired Robust Deep Neural Networks

Convolutional neural networks (CNNs) lack innate biological constraints in simulating primate vision. To address this issue, Malhotra et al. [48] replaced the first convolutional layer of a standard CNN with a set of fixed-parameter Gabor filters. Gabor filters are widely regarded by neuroscientists as a standard model of simple cells in the primary visual cortex (V1) [49], so employing fixed-parameter Gabor filters makes it possible to simulate the behavior of simple cells. The experimental results show that replacing the convolutional layer with Gabor filters reduces the CNN’s dependence on non-shape features and improves model robustness.
Recent research has indicated that models with higher robustness against white-box attacks can also accurately predict the neural responses in the V1 of macaques at early stages. Inspired by this, Dapello et al. [4] introduced a novel hybrid CNN model that incorporates primate V1 visual processing mechanisms as the front-end of the model while utilizing trainable standard CNN architectures (e.g., AlexNet or ResNet) as the back-end. This novel model, known as VOneNet, not only surpasses the benchmark models in adversarial attack robustness but also competes with computationally expensive adversarial training methods. This finding reveals a new pathway to enhance the robustness of deep learning models and provides a fresh perspective for understanding and simulating the biological visual system.
The core component of VOneNet is the VOneBlock, which comprises three main elements: a set of biologically constrained Gabor Filter Banks (GFB), simple- and complex-cell nonlinearities, and a V1 neuronal stochasticity generator, as shown in Figure 3A. The parameters of the biologically constrained GFB are tuned to approximate the neural response characteristics of primate V1. The GFB operates on the input RGB image using a set of Gabor filters with various orientations, scales, shapes, and spatial frequencies, with each Gabor filter computing convolution responses over the individual color channels of the input image. This endows the resulting spatial filter bank with higher heterogeneity than the standard first-layer filters of a CNN, better approximating the diversity of receptive fields in primate V1. Simple and complex cells constitute the primary downstream projection to V2; therefore, the VOneBlock incorporates both simple- and complex-cell nonlinearities in its nonlinear layer, allocating these mechanisms to each channel according to cell type. Stochasticity plays a significant role in numerous critical properties of neural responses, implying that the same visual input can elicit different neural responses. Experimental observations on awake macaque monkeys revealed that the mean spike count (obtained from repeated measurements) depends on the presented image, and the spike sequences in each trial approximate a Poisson process. To simulate this stochastic characteristic of neural responses, the VOneBlock adds independently distributed Gaussian noise to its outputs.
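A highly simplified PyTorch sketch of such a three-stage front end is given below. It is not the released VOneNet code: the fixed Gabor weights, the even split into simple and complex channels, the crude energy nonlinearity, and the Gaussian noise level are all placeholders for the biologically fitted components described above.

```python
import torch
import torch.nn as nn

class SimplifiedV1FrontEnd(nn.Module):
    """Toy three-stage front end: fixed Gabor-like filtering, simple/complex cell
    nonlinearities, and additive Gaussian noise as a crude stand-in for V1 stochasticity."""

    def __init__(self, gabor_weights, noise_std=0.1):
        super().__init__()
        # gabor_weights: (out_channels, in_channels, k, k) tensor of precomputed Gabor kernels.
        out_channels, in_channels, k, _ = gabor_weights.shape
        self.gfb = nn.Conv2d(in_channels, out_channels, k, padding=k // 2, bias=False)
        with torch.no_grad():
            self.gfb.weight.copy_(gabor_weights)
        self.gfb.weight.requires_grad_(False)   # biologically constrained, not trained
        self.noise_std = noise_std

    def forward(self, x):
        r = self.gfb(x)
        half = r.shape[1] // 2
        simple = torch.relu(r[:, :half])                 # simple cells: rectified responses
        complex_ = torch.sqrt(r[:, half:] ** 2 + 1e-6)   # complex cells: crude phase-invariant energy
        out = torch.cat([simple, complex_], dim=1)
        if self.training:
            out = out + self.noise_std * torch.randn_like(out)  # stochastic V1-like responses
        return out
```

The output of such a block would then feed a trainable back-end (e.g., a ResNet stage), mirroring the front-end/back-end split described above.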
The experimental results demonstrate that the VOneBlock significantly enhances the robustness of the benchmark networks, namely ResNet50, CORnet-S, and AlexNet, against white-box attacks while maintaining stable classification performance on clean images. The improvement in robustness is particularly pronounced under high-intensity attacks, as shown in Figure 3B. This suggests that the VOneBlock is highly portable, enabling its integration as a front-end architecture in diverse neural network models to enhance their robustness against adversarial attacks. Furthermore, VOneResNet50 was compared with the current state-of-the-art defense methods on the ImageNet-C dataset, considering both white-box adversarial attacks and a larger panel of image perturbations containing a variety of common corruptions. The results, shown in Table 2, demonstrate that VOneResNet50 improved on both perturbation types, outperforming all the other models on the perturbation mean (the average over white-box attacks and common corruptions), with an improvement of 18% over ResNet50. Future work could explore the various types of V1 neurons that contribute to robustness, enriching the repertoire of front-end convolutional kernels and improving the alignment between such models and V1.
The VOneNet model exhibits high robustness when confronted with common image corruptions. In a subsequent study, Baidya et al. [50] generated different variants of the V1 front-end by deleting or modifying individual components of the VOneBlock, at the cost of potential accuracy losses, and tested each variant against specific types of image corruption; the architectures of the variants are shown in Figure 4. Interestingly, the different variant models showed significant improvements in robustness against specific types of image corruption. The authors therefore employed an ensemble technique to integrate seven different variants of VOneNet into a single ensemble model. The experimental results demonstrated that the ensemble model outperformed any individual variant in handling various types of image corruption, even achieving accuracy comparable to a ResNet18-based model on clean images.
While certain CNN models have surpassed human visual capabilities in image classification, their performance remains suboptimal when dealing with images containing various common noise patterns. This highlights the primary limitations of such models in handling image noise and perturbations. To address this issue, researchers modified the VOneBlock structure and observed relative improvements in V1 model variants for specific perturbation types. Furthermore, by employing ensemble techniques to combine individual models with different V1 front-end variants, the advantages of each model for specific noise types were successfully leveraged. Although this ensemble approach lacks theoretical support from a biological standpoint, it holds potential in adversarial defense as different noise types can be viewed as a form of adversarial attack, and the ensemble model can enhance the overall robustness by aggregating the strengths of each model. Therefore, by improving the VOneBlock structure and applying ensemble techniques, new methods and insights can be provided for adversarial defense to address various complex noise and perturbation scenarios. This offers a promising direction for further enhancing models’ adversarial performance and robustness.

3.2. Neural Signal-Based Robust Deep Neural Networks

Deep convolutional neural networks (DCNNs) have emerged as state-of-the-art techniques for predicting neural responses in the primary visual cortex. Safarani et al. [51] used Multi-Task Learning (MTL) to establish a shared representation between image classification and neural response prediction. The MTL architecture, depicted in Figure 5, uses VGG-19 as its backbone. In MTL, the shared representation is regularized by the neural data, so the network inherits the functional inductive biases present in the neural data. This approach enhances the network’s generalization to out-of-distribution images, thereby improving model robustness.
Safarani et al. [51] first trained a model to predict neural responses from data recorded in the V1 of two awake macaque monkeys, building on the optimal model proposed by Cadena et al. [45] for predicting V1 responses. The predicted neural responses for images in the TIN dataset were then used as the neural data for Multi-Task Learning (MTL). The experimental model was based on a variant of VGG-19 with a batch normalization (BN) layer after each convolutional layer. It was found that jointly training the model with neural responses from the macaque V1 enhanced robustness. By using the monkey V1 responses predicted by the first model as training data for the MTL-Monkey model, the model exhibited improved robustness against TIN-TC image corruptions compared with the baseline. Jointly training the network for image classification and for predicting neural responses to natural stimuli in the monkey V1 thus enhanced robustness on the computer vision task even though no specific image distortions were seen during training. Additionally, the resulting network demonstrated higher out-of-distribution generalization, with robustness comparable to an Oracle network trained directly on noisy images; the results are shown in Figure 6. As the network’s robustness improved, its representation became more brain-like. Moreover, the jointly trained network was found to be more sensitive to content than to noise and more sensitive to salient regions in the scene, consistent with existing theories that V1 plays a crucial role in object detection and bottom-up saliency.
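A minimal sketch of such a joint objective combines a classification loss with a neural-response prediction loss on a second head of the shared backbone. The Poisson negative log-likelihood used here is a common choice for spike-count-like targets, though the exact loss and weighting in Safarani et al. [51] may differ; `class_logits`, `predicted_rates`, and `beta` are assumed names.

```python
import torch.nn.functional as F

def mtl_loss(class_logits, labels, predicted_rates, target_rates, beta=1.0):
    """Joint objective: image-classification loss plus a neural-response prediction loss
    (a Poisson negative log-likelihood on non-negative predicted firing rates)."""
    cls_loss = F.cross_entropy(class_logits, labels)
    neural_loss = F.poisson_nll_loss(predicted_rates, target_rates, log_input=False)
    return cls_loss + beta * neural_loss
```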
In computer vision, bridging the gap between artificial and biological intelligence significantly improves model reliability and generalization [52]. Network models trained with macaque V1 data exhibited significant enhancements in robustness and brain-like representations, suggesting the possibility of transferring perceptual biases from biological neural networks to artificial neural networks. Further exploration and optimization of this transfer could involve utilizing more biological data and tasks for joint training to improve network robustness and generalization. Future research can also delve deeper into the representational features of bio-inspired networks and explore their connection to brain cognition and perception processes. A thorough understanding of the mechanisms and effects of bio-inspired networks can offer valuable insights for designing more robust deep convolutional neural networks.

3.3. Mechanism-Inspired Robust Deep Neural Networks

3.3.1. Retinal Non-Uniform Sampling and Multi-Receptive Field Mechanism

Convolutional neural networks (CNNs) employ a uniform square grid for sampling images. However, in the primate retina, the distribution of cones is non-uniform: their density varies across the visual field, so the spatial sampling of visual stimuli on the retina is uneven. In a network, this mechanism corresponds to a peak sampling density at fixation points in the image that decreases with increasing distance from these points, with the highest-density regions corresponding to the vicinity of the fovea centralis. This non-uniform distribution allows primates to perceive and focus on visual scene details and important information more acutely, which is critical for visual tasks and cognitive activities, while the peripheral regions with lower cone density are better suited for rapidly detecting and recognizing moving targets. Furthermore, in the primate visual pathway, the receptive field size increases with eccentricity. From a sampling perspective, this implies that, during visual processing, a series of image patches at different spatial scales is sampled, centered on fixation points in the image. This multiscale mechanism is important because it integrates translation, scale, and clutter invariance into neural networks. By employing multiple receptive field sizes, the system can capture visual features at different scales and adapt to variations in object size and position. This multiscale representation enables the primate visual system to better handle object translation, scaling, and clutter, thereby enhancing the robustness and generalization of visual processing.
Reddy et al. [53] demonstrated experimentally that the combined effect of these two mechanisms significantly improves the robustness of neural networks to adversarial perturbations across a range of perturbation strengths, hyperparameters, and adversarial criteria, without relying on gradient obfuscation.
Non-uniform sampling refers to the way the retina spatially samples visual information, mimicking to some extent the workings of the primate visual system [54]. It has been found that applying this non-uniform sampling mechanism to neural networks can improve their robustness against small adversarial perturbations. This mechanism enables neural networks to better adapt to perturbations by performing high-resolution sampling in the central region and low-resolution sampling in the periphery, thereby enhancing robustness to adversarial attacks. Another factor is the presence of multiple receptive fields. The receptive field refers to the region of input that a neuron perceives. In the primate visual system, receptive fields of different positions and sizes capture information about different scales and spatial features [55]. Studies have shown that incorporating this multi-receptive-field mechanism into neural networks can enhance their robustness to small adversarial perturbations: by having neurons with different receptive field sizes, the network can better capture and utilize multiscale information from the input image, mitigating the impact of adversarial perturbations on classification results.
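A crude way to emulate these two mechanisms in code is to sample crops of increasing spatial extent around a fixation point and resize them to a common resolution, so central content is kept at high resolution while the periphery is seen at coarser effective resolution. The sketch below is purely illustrative and is not the sampling scheme of Reddy et al. [53]; the crop sizes and output resolution are arbitrary.

```python
import torch
import torch.nn.functional as F

def multiscale_foveal_crops(image, center, sizes=(32, 64, 128), out_size=32):
    """Crop patches of increasing extent around a fixation point and resize them to a
    common resolution: small crops keep fine foveal detail, large crops cover the
    periphery at coarser effective resolution."""
    _, h, w = image.shape          # image: (C, H, W)
    cy, cx = center
    crops = []
    for s in sizes:
        top = max(0, min(h - s, cy - s // 2))
        left = max(0, min(w - s, cx - s // 2))
        patch = image[:, top:top + s, left:left + s].unsqueeze(0)
        crops.append(F.interpolate(patch, size=out_size, mode='bilinear', align_corners=False))
    return torch.cat(crops, dim=0)  # (num_scales, C, out_size, out_size)
```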
Biologically inspired mechanisms promise to improve the robustness of standard convolutional neural networks. They have also led to more extensive explorations and supported the hypothesis that CNNs are a good model for biological vision. However, this research is still in its early stages and requires further investigation and validation to better understand the role of these mechanisms in adversarial robustness. The next step is to integrate these biologically inspired mechanisms into the design of neural networks to improve adversarial robustness without sacrificing standard performance [56].

3.3.2. Shape Defense

Convolutional neural networks (CNNs) exhibit a preference for texture in their convolutional operation primarily because of the higher number of texture-related pixels compared to the number of pixels on object boundaries. This imbalance in reliance on texture versus shape contributes to the vulnerability of CNNs to adversarial examples. However, humans rely more on shape than texture when recognizing objects. For instance, when presented with an image of a cat with an elephant’s skin texture, humans would correctly label it as a cat based on its shape, whereas a CNN might misclassify it as an elephant due to its texture. This reveals an imbalance in the dependence of CNNs on texture and shape during the recognition process. Borji et al. [57] proposed two methods to incorporate shape bias into CNNs, aiming to enhance their reliance on shape over texture.
The first method is called edge-guided adversarial training (EAT). The edge map preserves the structure of the image and helps to eliminate classification ambiguity. One training approach involves the adversarial training of grayscale or RGB images along with edge maps (Figure 7A). In Figure 7B, a more complex form of adversarial training is performed, where, for each input, a new edge map replaces the old edge map to conduct adversarial training.
The second method is called GAN-based Shape Defense (GSD). This approach involves training a conditional GAN to map edge maps from clean or adversarial images to their corresponding clean images and using the generated images to train the CNN. During inference, the edge map is computed first, and then the generated image from that edge map is classified. The results indicate that using better edge detection and image generation methods can lead to improved performance. Adding edge information (introducing shape bias) to the original model can enhance robustness against common image corruptions. The effectiveness of this method is primarily due to the conditional generative adversarial network (cGAN) learning a function that is invariant to adversarial perturbations. Since edge maps are not completely invariant to (especially large) perturbations, cGAN needs to be trained on an augmented dataset that includes both clean and perturbed images. One advantage of this method is its computational efficiency since it does not require adversarial training. Any adaptive attack against this defense would need to deceive the cGAN, which may fail because failures (i.e., the inability to generate good images) would be noticeable from the generated images. Compared to other adversarial defense methods that utilize generative adversarial networks (GANs), this method exhibits less reliance on texture [58]. Furthermore, this defense approach can be combined with texture and edge-based defense methods, potentially further improving its effectiveness.
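To illustrate the shape-bias idea in its simplest form, the sketch below appends a Sobel edge-magnitude map to each input image as an extra channel, giving a classifier an explicit shape cue alongside texture. The actual EAT and GSD methods additionally involve adversarial training and a conditional GAN, which are omitted here; the edge detector and normalization are simple stand-ins for the edge maps used in the paper.

```python
import torch
import torch.nn.functional as F

def append_edge_channel(images):
    """Append a Sobel edge-magnitude map to each RGB image as a fourth channel."""
    gray = images.mean(dim=1, keepdim=True)                        # (N, 1, H, W)
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)                                        # Sobel y is the transpose of Sobel x
    gx = F.conv2d(gray, kx, padding=1)
    gy = F.conv2d(gray, ky, padding=1)
    edges = torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)
    edges = edges / edges.amax(dim=(2, 3), keepdim=True).clamp_min(1e-8)  # normalize to [0, 1]
    return torch.cat([images, edges], dim=1)                       # (N, 4, H, W)
```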
Future work should consider evaluating these defense mechanisms against other adversarial attacks such as sparse attacks, pixel attacks [59], and attacks that only perturb edge pixels [60]. Shape defense can also be combined with other defense mechanisms to generate robust models without significant slowdown. Finally, there may be other approaches to incorporate shape bias into CNNs, such as augmenting the dataset with both original images and their corresponding edge maps, overlaying the texture of certain objects onto others (similar to [57]), increasing the intensity of edge pixels compared to texture pixels, and designing and using normalization layers (e.g., divisive normalization).

3.3.3. V1 Push–Pull Inhibition-Based Robust Convolutional Model

In the brain’s primary visual cortex, certain neurons exhibit a phenomenon known as push–pull inhibition. These neurons consist of two distinct receptive fields. One receptive field exhibits excitatory responses to positive stimuli, while the other receptive field shows excitatory responses to negative stimuli. Typically, the negative receptive field has a broader range, allowing it to inhibit the response of the positive receptive field. The push–pull inhibition phenomenon enhances the selectivity of neurons to specific visual stimuli, enabling them to maintain high selectivity even in the presence of noise interference. Simulating this mechanism, a robust deep-learning model can be designed to maintain high selectivity to specific stimuli when faced with noise interference or other perturbations.
Strisciuglio et al. [9] proposed a novel CNN layer called the push–pull layer, inspired by push–pull inhibition in the primary visual cortex. Including this layer in a network architecture helps to improve the model’s robustness to various corruptions of the input images while maintaining clean-image classification performance. The push–pull layer can be incorporated into any CNN architecture, exhibits strong portability, and does not increase the model size, making it a generic approach to enhance network robustness. The layer is built from two convolutional kernels, called the push and pull kernels, which simulate the excitatory and inhibitory receptive fields of push–pull neurons. The pull kernel typically has a larger receptive field; its weights are derived as a negated, upsampled version of the push kernel’s weights, and a fraction of the pull response is subtracted from the push component response to achieve push–pull inhibition. After the push–pull response map is computed, a nonlinearity is applied as the activation function of the push–pull receptive field. The general structure of the push–pull layer is shown in Figure 8.
The response of a push–pull layer is defined as
$$P(I) = \theta(k \ast I) - \alpha\,\theta\!\left(-\hat{k}_h \ast I\right)$$
where $\theta$ is the rectified linear unit (ReLU) function, $\alpha$ is the weight coefficient of the pull component, referred to as the inhibition strength, and $\hat{k}_h$ denotes the push kernel $k(\cdot)$ upsampled by a scale factor $h > 1$. During training, only the push kernel weights are updated, and the pull kernel weights are derived from them. The implementation of the push–pull layer ensures that gradients flow back through both the push and pull kernels, and the push kernel weights are updated accordingly; in this way, when gradients backpropagate through the pull kernel, the influence of the pull component is taken into account.
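A toy PyTorch version of such a layer is sketched below; the bilinear upsampling of the push kernel, the odd-size adjustment, and the default inhibition strength are simplifications of the published design rather than a faithful reimplementation.

```python
import torch.nn as nn
import torch.nn.functional as F

class PushPullConv2d(nn.Module):
    """Toy push-pull layer: the pull kernel is a negated, upsampled copy of the push
    kernel, and its rectified response is subtracted from the push response."""

    def __init__(self, in_ch, out_ch, kernel_size, alpha=1.0, scale=2):
        super().__init__()
        self.push = nn.Conv2d(in_ch, out_ch, kernel_size, padding=kernel_size // 2, bias=False)
        self.alpha, self.scale = alpha, scale

    def forward(self, x):
        push_resp = F.relu(self.push(x))
        k = self.push.weight.shape[-1]
        up = k * self.scale
        up = up + 1 if up % 2 == 0 else up            # keep the pull kernel odd-sized
        # Pull kernel: upsampled, negated push kernel; gradients still flow back to the push weights.
        pull_kernel = -F.interpolate(self.push.weight, size=(up, up),
                                     mode='bilinear', align_corners=False)
        pull_resp = F.relu(F.conv2d(x, pull_kernel, padding=up // 2))
        return push_resp - self.alpha * pull_resp
```

Following the substitution described below, a layer such as `PushPullConv2d(3, 64, 5)` could stand in for a network’s first convolutional layer.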
The effectiveness of the push–pull layer was validated by replacing the first convolutional layer of the LeNet model on the MNIST dataset and the first convolutional layers of the ResNet and DenseNet models on the corrupted versions of the CIFAR dataset. The experimental results demonstrate that the push–pull layer significantly enhances the robustness of existing networks to corruptions of input images. Moreover, following the principles of VOneNet, substituting the first convolutional layer of any CNN with push–pull layers has been shown to enhance the robustness and generalization of the network. By introducing the push–pull layer, the network can better handle various image corruptions, including noise and contrast variations. The incorporation of this idea provides new directions for further improvements and optimization of CNN architectures.
The push–pull layer has shown promising results on the MNIST and CIFAR datasets. Future research can consider exploring the application of the push–pull layer to a broader range of datasets and tasks, including natural image classification, object detection, semantic segmentation, etc. This will further validate the effectiveness and applicability of the push–pull layer. The push–pull layer can be used with other improvement techniques and architectures to further enhance network performance. For example, combining it with attention mechanisms, residual connections, or regularization techniques may yield more powerful models. Such combinations can enhance the expressiveness, robustness, and generalization ability of the network. The current research has mainly focused on replacing the first convolutional layer with the push–pull layer. Future studies can explore more complex push–pull structures, such as introducing push–pull layers at different levels or branches of the network. This may lead to richer feature representations and stronger robustness. From a broader perspective, this research suggests that CNNs have not fully considered the image structures crucial for robustness. To improve and achieve better results, it is worth exploring additional ways to incorporate shape information or combine other feature representation techniques, for example, incorporating bounding box information as additional input channels or using more sophisticated GAN models to better generate clean images corresponding to the edge maps.

3.3.4. Surround Modulation-Inspired Neural Network

In the brain’s primary visual cortex, many neurons have classical receptive fields surrounded by non-classical receptive field regions. The non-classical receptive field is a functional region that can modulate the neuron response to the same stimulus, a phenomenon known as surround modulation. Surround modulation arises due to differences in visual stimulus features, such as spatial frequency, orientation, color, motion direction, and brightness, between the classical and non-classical receptive field regions. In neural networks, modulation of the classical and non-classical receptive fields involves three types of connections: feedforward connections from lower-level areas that influence the center of the receptive field, lateral connections within the region that mainly affect local surround inhibition, and feedback connections from higher-level cortical areas that regulate a broader region, including the distant surround. When the features in the center and surround are similar, surround modulation typically exhibits an inhibitory effect, where the presence of the surround suppresses the response of the center. This surround modulation plays a crucial role in visual information processing as it enhances edge detection [61,62], improves the contrast sensitivity of the visual system, and helps the neural system to reduce redundant information and optimize the encoding of visual information. These characteristics provide valuable insights for the design of deep learning and artificial neural networks.
Based on these findings, Hasani et al. [10] introduced local lateral connections into the activation maps of convolutional layers to simulate the effect of surround modulation and obtain a more brain-like structure. In this model, an excitatory Gaussian function defines the interactions within the classical receptive field, while an inhibitory Gaussian function with a broader spatial extent defines the interactions within the non-classical receptive field. Center–surround modulation is simulated through lateral connections in CNNs, as shown in Figure 9a, implemented as fixed-kernel convolution layers with difference-of-Gaussians (DoG) weights. The resulting surround modulation (SM) kernels increase feature saliency by suppressing redundant, spatially invariant responses in the activation maps, as shown in Figure 9b: within a fixed kernel, neighboring neurons receive positive weights and more distant neurons receive negative weights. This approach improves the performance of CNNs on image classification.
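As an illustration, the following PyTorch sketch applies a fixed difference-of-Gaussians kernel to each activation map as a depthwise convolution, emulating the lateral excitation and inhibition described above. The sigma values, kernel size, and the SurroundModulation class are illustrative assumptions and do not reproduce the exact kernels of Ref. [10].

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


def dog_kernel(size=5, sigma_e=1.2, sigma_i=2.0):
    """Difference of Gaussians: a narrow excitatory Gaussian minus a broader
    inhibitory one, giving positive weights near the center and negative
    weights farther out."""
    ax = torch.arange(size, dtype=torch.float32) - size // 2
    yy, xx = torch.meshgrid(ax, ax, indexing="ij")
    d2 = xx ** 2 + yy ** 2
    excite = torch.exp(-d2 / (2 * sigma_e ** 2)) / (2 * math.pi * sigma_e ** 2)
    inhibit = torch.exp(-d2 / (2 * sigma_i ** 2)) / (2 * math.pi * sigma_i ** 2)
    return excite - inhibit


class SurroundModulation(nn.Module):
    """Fixed-weight depthwise convolution that applies the DoG kernel to every
    activation map, emulating lateral excitation/inhibition between units."""

    def __init__(self, channels, size=5, sigma_e=1.2, sigma_i=2.0):
        super().__init__()
        weight = dog_kernel(size, sigma_e, sigma_i).repeat(channels, 1, 1, 1)
        self.register_buffer("weight", weight)  # fixed kernel, never trained
        self.channels, self.pad = channels, size // 2

    def forward(self, x):
        return F.conv2d(x, self.weight, padding=self.pad, groups=self.channels)


# Illustrative use on the activations of an early convolutional layer:
# feats = torch.relu(conv1(images))
# feats = SurroundModulation(channels=64)(feats)
```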
The experimental results demonstrate that the surround modulation mechanism significantly improves the performance and training speed of conventional CNNs in visual tasks. The study also analyzed the impact of this simple form of surround modulation on the robustness of CNNs under complex visual conditions, as well as its effect on increasing the sparsity and decorrelation of CNN neural activity. In most cases, the surround-modulated CNN matched baseline performance with fewer optimization steps or less training data. Based on this empirical evidence, surround modulation appears to facilitate learning, consistent with neurophysiological studies of the brain: by increasing feature saliency and reducing redundancy, it may account for the faster training observed. Hasani et al. [10] also analyzed batch normalization in the presence of surround-modulated neural activity and found that batch normalization increases sparsity and decorrelation similarly to surround modulation, suggesting that the success of batch normalization may be partly explained by more effective neural encoding.
Compared with traditional CNNs, the surround modulation mechanism of the biological visual system offers a biologically plausible way to modify models. CNN models inspired by surround modulation generalize better under drastic changes in illumination and are more robust in complex environments involving occlusion and clutter. Introducing the surround modulation mechanism can also enhance the robustness of neural network models against adversarial attacks. In particular, when facing physically realizable, spatially localized adversarial examples, traditional CNN models struggle to recognize objects from partially visible image information, which can cause irreversible harm in applications such as autonomous vehicles. In contrast, models based on the surround modulation mechanism of the biological visual system, through top-down feedback connections and attention mechanisms, introduce more abstract features in segmentation tasks and can accurately identify objects such as traffic signs even under occlusion, demonstrating excellent recognition capabilities. This is beneficial for intelligent safety systems.

3.3.5. On–Off-Center-Surround Pathway

Designing perceptual models that remain robust under lighting variations is challenging in computer vision. To address this challenge, Babaiee et al. [63] drew inspiration from two prevalent cell types in the visual processing system of vertebrates, namely on-center and off-center neurons, and extended the receptive fields of convolutional neural networks accordingly. On-center neurons have excitatory centers and inhibitory surrounds: they are activated when light falls directly on the center of their receptive field but not when the surrounding area is illuminated [64,65]. Conversely, off-center neurons have inhibitory centers and excitatory surrounds: they are activated when the surrounding area is illuminated but the receptive field center is not. Babaiee et al. [63] proposed a network component called On–Off-Center-Surround (OOCS) to simulate these neuron types. By introducing mechanisms analogous to on-center and off-center neurons, this component enhances the robustness of deep visual networks to changes in lighting conditions.
Building on the surround modulation mechanism discussed above, the convolution responses of the on-center and off-center filters turn out to be complementary. Figure 10 shows the feature preferences of the on-center filter, the off-center filter, and their combination when extracting features from real-world images. The on-center filter detects the inner edges of light features on a dark background: in the judo photograph in Figure 10, the bright spot at the top left inside the red circle is correctly detected only by the on-center filter. The off-center filter, in contrast, detects the outer edges of light features on a dark background: in its feature map, the bright spot at the top left yields only edge information, whereas the dark particles at the bottom of the bird image are correctly detected only by the off-center filter. Both filters are therefore necessary. Combining the two extraction processes, a mixed convolution sums the responses of both kernels, which makes it easier for the network to detect small objects that are otherwise difficult to recognize in the image.
Similar to the surround modulation mechanism proposed in Ref. [10], OOCS also uses a difference-of-Gaussians (DoG) kernel with positive weights for the central neuron and negative weights for the surrounding neurons [66]. Unlike Ref. [10], however, the DoG in OOCS does not require a hyperparameter search to find the optimal variance; instead, the variance is computed from the size of the receptive field. In addition, in Ref. [10] the two Gaussian amplitudes $K_i = 1/(2\pi\sigma_i^2)$ are nearly equal when $\sigma_1$ and $\sigma_2$ are close, so their difference is small, and normalizing these values by the center value would cause the center to lose weight in either excitation or inhibition. In contrast, the weight values computed in OOCS are large enough that no normalization is needed.
Inspired by the OOCS mechanism in biology, any CNN can be extended to an OOCS-CNN with the same number of trainable parameters. The input is processed by a pair of complementary center–surround (CS) convolution kernels whose responses are added to the output of the original convolutional layer. Experiments demonstrate that OOCS not only improves the performance of CNNs in image recognition but also enhances their robustness to challenging lighting conditions. This makes OOCS valuable for future applications requiring robustness, such as vision-based autonomous driving, where lighting can vary from bright sunlight to deep shade.
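A minimal sketch of how such complementary on/off kernels could be attached to an existing convolutional layer is given below, assuming Gaussian widths derived from the kernel (receptive field) size with an illustrative center/surround ratio gamma. The channel split, the scaling by the input channel count, and the OOCSBlock wrapper are simplifications for illustration, not the exact OOCS construction of Ref. [63].

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


def oocs_kernels(size=5, gamma=2.0 / 3.0):
    """On- and off-center DoG kernels; the Gaussian widths are tied to the
    receptive-field (kernel) size via the illustrative ratio gamma, and the
    off-center kernel is simply the negated on-center kernel."""
    radius = size // 2
    sigma_c = gamma * radius      # center width derived from the receptive field
    sigma_s = float(radius)       # broader surround width
    ax = torch.arange(size, dtype=torch.float32) - radius
    yy, xx = torch.meshgrid(ax, ax, indexing="ij")
    d2 = xx ** 2 + yy ** 2
    center = torch.exp(-d2 / (2 * sigma_c ** 2)) / (2 * math.pi * sigma_c ** 2)
    surround = torch.exp(-d2 / (2 * sigma_s ** 2)) / (2 * math.pi * sigma_s ** 2)
    on = center - surround
    return on, -on


class OOCSBlock(nn.Module):
    """Adds fixed on/off center-surround responses to an existing convolution
    without introducing any trainable parameters."""

    def __init__(self, conv: nn.Conv2d):
        super().__init__()
        self.conv = conv
        size = conv.kernel_size[0]
        on, off = oocs_kernels(size)
        out_ch, in_ch = conv.out_channels, conv.in_channels
        # Half of the output channels receive the on response, half the off;
        # dividing by in_ch is a simple scaling so the fixed path stays moderate.
        w_on = on.expand(out_ch // 2, in_ch, size, size) / in_ch
        w_off = off.expand(out_ch - out_ch // 2, in_ch, size, size) / in_ch
        self.register_buffer("cs_weight", torch.cat([w_on, w_off], dim=0))

    def forward(self, x):
        cs = F.conv2d(x, self.cs_weight, stride=self.conv.stride,
                      padding=self.conv.padding)
        return self.conv(x) + cs


# Illustrative use: wrap the first convolution of an existing CNN,
# e.g. model.conv1 = OOCSBlock(model.conv1)
```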

4. Discussion

4.1. Advantages of Bio-Inspired Deep Neural Networks

The contributions of bio-inspired mechanisms to adversarial defense are significant, particularly against white-box attacks, where they exhibit clear advantages over traditional defense methods. Adversarial training, the most powerful and widely adopted defense, has been employed against a broad range of adversarial attacks. However, it incurs high training and computational costs and generalizes poorly: adversarially trained networks often suffer significant performance degradation on clean and previously unseen samples, which limits the reliability of adversarial training as a defense strategy.
In contrast, through cross-disciplinary research with neuroscience, brain-inspired mechanisms can be incorporated into neural network models by modifying the architecture to mimic the processing of external stimuli in the human V1 area. The model parameters are aligned with biological visual processing mechanisms and kept fixed, so no additional training overhead is required to achieve the defense objective. This approach significantly enhances robustness against adversarial attacks and also improves robustness to common image corruptions. Furthermore, the architectural benefits of bio-inspired mechanisms can be combined with training-based defense methods to achieve larger overall robustness gains.
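Schematically, such a defense amounts to prepending a frozen, biologically constrained front end to a trainable backbone, as in the hypothetical sketch below; fixed_front_end and backbone are placeholders rather than components of any specific model reviewed here.

```python
import torch.nn as nn


def build_defended_model(fixed_front_end: nn.Module,
                         backbone: nn.Module) -> nn.Module:
    """Prepend a biologically constrained front end whose parameters are frozen,
    so the defense itself adds no training overhead; only the backbone is trained."""
    for p in fixed_front_end.parameters():
        p.requires_grad = False  # fixed, biology-derived weights
    return nn.Sequential(fixed_front_end, backbone)


# The frozen front end can then be combined with any training-based defense,
# e.g. adversarial training applied to backbone.parameters() only.
```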

4.2. Main Unresolved Challenges

(1) More complex theoretical foundations. To understand how bio-inspired mechanisms contribute to model robustness and why matching biology leads to more robust computer vision models, extensive theoretical foundations and a series of complex mathematical modeling efforts still need to be developed. Our understanding of biological neural mechanisms remains incomplete, and the lack of clear theoretical guidance makes it challenging to apply these mechanisms to neural network design. While neuroscience has recently adopted numerous novel neurocomputational models and tools from machine learning, the latest advances in machine learning and computer vision have primarily been driven by the widespread availability of computational resources and immense computing power. Incorporating detailed biological mechanisms may therefore lengthen training and inference and demand more computational and storage resources, posing challenges for real-time and embedded applications.
(2) Data requirements and model interpretability. Simulating the biological visual perception system requires substantial neural data for reference and training, yet collecting and processing large-scale neural datasets is challenging and costly in practice. Moreover, neural data from primates exhibit individual variation and complexity, and handling such differences is itself a challenge. Biologically inspired approaches can also increase model complexity, rendering models harder to interpret and understand; this may restrict the ability to explain model decisions and assess their reliability, which is crucial in applications such as medical diagnostics and security.
(3) Ethics and privacy issues. Methods that simulate biological visual perception may involve significant amounts of neural data and personal information, raising ethical and privacy concerns about data collection, storage, and protection, as well as potential misuse and associated risks. It is imperative to conduct thorough risk assessments and establish corresponding regulatory and control measures to prevent abuse and incidents affecting legitimate users. In addition, defense mechanisms may generate false positives, incorrectly labeling legitimate behavior as malicious, which could lead to the wrongful blocking or unfair treatment of legitimate users. Defense mechanisms should therefore be designed to minimize false positives and collateral damage while providing effective channels for appeal and remedy. Finally, models incorporating biological neural mechanisms may have profound societal implications across diverse domains, so ensuring ethical and social responsibility in their development and deployment is essential.

4.3. Bio-Inspirations and Future Work

Compared with current approaches, the success of biologically inspired defense methods lies in their closer approximation of the architecture of the most extensively studied visual areas of the brain. These methods have made progress in enhancing the robustness of deep neural networks but still face challenges and opportunities for improvement. In future research, exploring brain-inspired robustness holds tremendous potential for deep learning, and further investigation of brain mechanisms can deepen our understanding of the relationship between the brain and deep learning on multiple levels. Brain-inspired research currently focuses primarily on the V1 area, and the next objective is to pursue more neurobiologically accurate models. For instance, models can be extended to enhance biological fidelity by incorporating attributes such as divisive normalization [67] and contextual modulation [68], aiming to provide greater robustness. The perceptual role of contextual modulation is also crucial: the biological visual system not only evades adversarial attacks based on subtle perturbations but also handles noise disturbances (such as Gaussian and random noise). This ability arises because we recognize objects not solely by their edges, shapes, and textures but also by the surrounding environment and background. However, this mechanism has not yet been effectively applied to neural networks, and we believe it will be an inspiring direction for future research.
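For reference, canonical divisive normalization [67] divides each response by a pooled measure of neighboring activity. The sketch below illustrates the idea with arbitrary exponent, semi-saturation constant, and pooling choices; these are illustrative assumptions, not the parameters fitted to neural data in the original study.

```python
import torch
import torch.nn.functional as F


def divisive_normalization(x, n=2.0, sigma=0.1, pool_size=5):
    """Divide each rectified response by a pooled measure of neighboring
    activity (across channels and a local spatial window), suppressing
    redundant gain in the style of canonical divisive normalization."""
    drive = x.clamp(min=0) ** n
    pool = drive.mean(dim=1, keepdim=True)                  # channel pooling
    pool = F.avg_pool2d(pool, pool_size, stride=1,
                        padding=pool_size // 2)             # spatial pooling
    return drive / (sigma ** n + pool)


# Illustrative use on a feature tensor of shape (N, C, H, W):
# normalized = divisive_normalization(torch.relu(features))
```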
In addition to V1, the retina and the lateral geniculate nucleus (LGN) play crucial roles in preprocessing visual information. Current V1 models capture only part of this processing, indicating the potential to extend the present work toward a retina/LGN front end to better integrate CNNs with human visual object recognition. The different layers of the retina also receive additional top-down connections, and receptive field sizes vary with retinal location. It is therefore worthwhile to explore a complete retina model and to further improve its robustness. Furthermore, because biologically inspired models are constrained by neural data and require no additional training, they are well suited for preprocessing or data augmentation, especially in medical image segmentation.
In recent years, multimodal learning has attracted increasing attention. Integrating multimodal learning, such as vision and speech, with neurobiological mechanisms may further enhance the perception and robustness of models. This would involve understanding how multiple sensory modalities interact and are integrated in the visual cortex and applying this knowledge to the design and training of multimodal neural networks, which may also improve the robustness and generalization abilities of deep learning models.
In conclusion, we believe that there is still vast untapped potential in biological intelligence, and many researchers are actively advancing neuroscientifically inspired machine learning algorithms. The models presented in this paper demonstrate that this approach has become a reality: models inspired by primate neurobiology require less training to achieve behavior closer to human performance, creating a virtuous cycle in which neuroscience and artificial intelligence mutually reinforce each other’s understanding and capabilities. Finally, interdisciplinary collaborations and platforms will help drive the development of deep neural networks with enhanced adversarial robustness.

5. Conclusions

Enhancing the robustness of deep convolutional neural network (DCNN) models has attracted increasing attention in deep learning. In this paper, we have elaborated on the concepts of adversarial attack and adversarial defense, provided a comparative analysis of the current mainstream adversarial attack methods, and emphasized the effectiveness of integrating neural computation into robust deep convolutional networks for adversarial defense. Furthermore, we have reviewed recent research achievements that underscore the advantages of drawing inspiration from biological visual processing mechanisms to enhance the adversarial robustness of deep convolutional networks. Neural network models incorporating brain-like visual computation and biological neural signals offer novel ideas and methods for improving adversarial robustness. Drawing inspiration from the brain’s neural topology and the mechanisms of multi-brain-region visual pathways, robust deep convolutional networks that incorporate multimodal, multitemporal, multipath parallel, and multiscale attention mechanisms hold significant promise for real-world applications.

Author Contributions

Conceptualization, G.W.; methodology, G.W., Z.D. and R.L.; software, G.W., Z.D. and R.L.; validation, G.W., Z.D., R.L. and M.D.; formal analysis, G.W. and R.L.; investigation, G.W., Z.D. and R.L.; resources, G.W. and L.W.; data curation, G.W., Z.D. and R.L.; writing—original draft preparation, M.K., Z.D. and R.L.; writing—review and editing, R.L., M.K., T.Z., M.D. and G.W.; visualization, Z.D. and R.L.; supervision, G.W. and M.K.; project administration, G.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (62102443).

Data Availability Statement

No new data were created.

Acknowledgments

The authors would like to thank Dajun Xing (Beijing Normal University, China) and Tao Zhang (CAS, China) for their constructive suggestions.

Conflicts of Interest

We confirm that there are no known conflicts of interest associated with this publication and that there has been no significant financial support for this work that could have influenced its outcome.

References

  1. Rinchen, S.; Vaidya, B.; Mouftah, H.T. Scalable multi-task learning R-CNN for object detection in autonomous driving. In Proceedings of the IEEE 2023 International Wireless Communications and Mobile Computing, Kuala Lumpur, Malaysia, 4–8 December 2023. [Google Scholar] [CrossRef]
  2. Cai, J.; Xu, M.; Li, W.; Xiong, Y.; Xia, W.; Tu, Z.; Soatto, S. MeMOT: Multi-object tracking with memory. In Proceedings of the IEEE 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022. [Google Scholar] [CrossRef]
  3. Xiao, Z.; Gao, X.; Fu, C.; Dong, Y.; Gao, W.; Zhang, X.; Zhou, J.; Zhu, J. Improving transferability of adversarial patches on face recognition with generative models. In Proceedings of the IEEE 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 19–25 June 2021. [Google Scholar] [CrossRef]
  4. Dapello, J.; Marques, T.; Schrimpf, M.; Geiger, F.; Cox, D.D.; DiCarlo, J.J. Simulating a Primary Visual Cortex at the Front of CNNs Improves Robustness to Image Perturbations. Adv. Neural Inf. Process. Syst. 2020, 33, 13073–13087. [Google Scholar] [CrossRef]
  5. Liu, X.; Cheng, M.; Zhang, H.; Hsieh, C.J. Towards robust neural networks via random self-ensemble. In Lecture Notes in Computer Science; Springer International Publishing: Berlin/Heidelberg, Germany, 2018; pp. 381–397. [Google Scholar] [CrossRef]
  6. He, K.; Zhang, X.; Ren, S.; Sun, J. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In Proceedings of the 2015 IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015. [Google Scholar] [CrossRef]
  7. Zhuang, C.; Yan, S.; Nayebi, A.; Schrimpf, M.; Frank, M.C.; DiCarlo, J.J.; Yamins, D.L.K. Unsupervised Neural Network Models of the Ventral Visual Stream. Proc. Natl. Acad. Sci. USA 2020, 118, e2014196118. [Google Scholar] [CrossRef]
  8. Tuncay, G.S.; Demetriou, S.; Ganju, K.; Gunter, C.A. Resolving the predicament of android custom permissions. In Proceedings of the 2018 Network and Distributed System Security Symposium: Internet Society, NDSS, San Diego, CA, USA, 18–21 February 2018. [Google Scholar] [CrossRef]
  9. Strisciuglio, N.; Lopez-Antequera, M.; Petkov, N. Enhanced robustness of convolutional networks with a push–pull inhibition layer. Neural Comput. Appl. 2020, 32, 17957–17971. [Google Scholar] [CrossRef]
  10. Hasani, H.; Soleymani, M.; Aghajan, H. Surround modulation: A bio-inspired connectivity structure for convolutional neural networks. Adv. Neural Inf. Process. Syst. 2019, 32, 1–12. [Google Scholar]
  11. Szegedy, C.; Zaremba, W.; Sutskever, I.; Bruna, J.; Erhan, D.; Goodfellow, I.; Fergus, R. Intriguing properties of neural networks. arXiv 2013, arXiv:1312.6199. [Google Scholar] [CrossRef]
  12. Nguyen, A.; Yosinski, J.; Clune, J. Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015. [Google Scholar] [CrossRef]
  13. Goodfellow, I.J.; Shlens, J.; Szegedy, C. Explaining and Harnessing Adversarial Examples. arXiv 2014, arXiv:1412.6572. [Google Scholar] [CrossRef]
  14. Wei, J.; Du, S.; Yu, Z. Review of white-box adversarial attack technologies in image classification. J. Comput. Appl. 2022, 42, 2732–2741. [Google Scholar] [CrossRef]
  15. Liang, B.; Li, H.; Su, M.; Li, X.; Shi, W.; Wang, X. Summary of the security of image adversarial samples. J. Inf. Secur. Res. 2021, 7, 294–309. [Google Scholar] [CrossRef]
  16. Carlini, N.; Wagner, D. Towards evaluating the robustness of neural networks. In Proceedings of the 2017 IEEE Symposium on Security and Privacy, San Diego, CA, USA, 22–26 May 2017. [Google Scholar] [CrossRef]
  17. Papernot, N.; McDaniel, P.; Wu, X.; Jha, S.; Swami, A. Distillation as a defense to adversarial perturbations against deep neural networks. In Proceedings of the 2016 IEEE Symposium on Security and Privacy, San Diego, CA, USA, 22–26 May 2016. [Google Scholar] [CrossRef]
  18. Kurakin, A.; Goodfellow, I.; Bengio, S. Adversarial examples in the physical world. arXiv 2016, arXiv:1607.02533. [Google Scholar] [CrossRef]
  19. Madry, A.; Makelov, A.; Schmidt, L.; Tsipras, D.; Vladu, A. Towards Deep Learning Models Resistant to Adversarial Attacks. arXiv 2017, arXiv:1706.06083. [Google Scholar] [CrossRef]
  20. Moosavi-Dezfooli, S.M.; Fawzi, A.; Frossard, P. DeepFool: A simple and accurate method to fool deep neural networks. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016. [Google Scholar] [CrossRef]
  21. Baluja, S.; Fischer, I. Adversarial Transformation Networks: Learning to Generate Adversarial Examples. arXiv 2017, arXiv:1703.09387. [Google Scholar] [CrossRef]
  22. Hayes, J.; Danezis, G. Learning universal adversarial perturbations with generative models. In Proceedings of the 2018 IEEE Security and Privacy Workshops, San Diego, CA, USA, 24 May 2018. [Google Scholar] [CrossRef]
  23. Xiao, C.; Li, B.; Zhu, J.Y.; He, W.; Liu, M.; Song, D. Generating Adversarial Examples with Adversarial Networks. arXiv 2018, arXiv:1801.02610. [Google Scholar] [CrossRef]
  24. Sharif, M.; Bhagavatula, S.; Bauer, L.; Reiter, M.K. Accessorize to a Crime: Real and stealthy attacks on state-of-the-art face recognition. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, CCS’16, Vienna, Austria, 24–28 October 2016. [Google Scholar] [CrossRef]
  25. Hu, Z.; Huang, S.; Zhu, X.; Sun, F.; Zhang, B.; Hu, X. Adversarial Texture for Fooling Person Detectors in the Physical World. arXiv 2022, arXiv:2203.03373. [Google Scholar] [CrossRef]
  26. Liu, Z.; Liu, Q.; Liu, T.; Xu, N.; Lin, X.; Wang, Y.; Wen, W. Feature distillation: DNN-oriented JPEG compression against adversarial examples. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019. [Google Scholar] [CrossRef]
  27. Bhagoji, A.N.; Chakraborty, S.; Mittal, P.; Calo, S. Analyzing Federated Learning through an Adversarial Lens. arXiv 2018, arXiv:1811.12470. [Google Scholar] [CrossRef]
  28. Jia, X.; Wei, X.; Cao, X.; Foroosh, H. ComDefend: An efficient image compression model to defend adversarial examples. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019. [Google Scholar] [CrossRef]
  29. Prakash, A.; Moran, N.; Garber, S.; DiLillo, A.; Storer, J. Deflecting adversarial attacks with pixel deflection. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018. [Google Scholar] [CrossRef]
  30. Raff, E.; Sylvester, J.; Forsyth, S.; McLean, M. Barrage of random transforms for adversarially robust defense. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019. [Google Scholar] [CrossRef]
  31. Mustafa, A.; Khan, S.H.; Hayat, M.; Shen, J.; Shao, L. Image Super-Resolution as a Defense Against Adversarial Attacks. IEEE Trans. Image Process. 2020, 29, 1711–1724. [Google Scholar] [CrossRef] [PubMed]
  32. Osadchy, M.; Hernandez-Castro, J.; Gibson, S.; Dunkelman, O.; Perez-Cabo, D. No Bot Expects the DeepCAPTCHA! Introducing Immutable Adversarial Examples, with Applications to CAPTCHA Generation. IEEE Trans. Inf. For. Secur. 2017, 12, 2640–2653. [Google Scholar] [CrossRef]
  33. Liao, F.; Liang, M.; Dong, Y.; Pang, T.; Hu, X.; Zhu, J. Defense against adversarial attacks using high-level representation guided denoiser. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018. [Google Scholar] [CrossRef]
  34. Yang, Y.Y.; Rashtchian, C.; Zhang, H.; Salakhutdinov, R.; Chaudhuri, K. A Closer Look at Accuracy vs. Robustness. arXiv 2020, arXiv:2003.02460. [Google Scholar] [CrossRef]
  35. Ross, A.S.; Doshi-Velez, F. Improving the Adversarial Robustness and Interpretability of Deep Neural Networks by Regularizing their Input Gradients. arXiv 2017, arXiv:1711.09404. [Google Scholar] [CrossRef]
  36. Athalye, A.; Carlini, N.; Wagner, D. Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples. arXiv 2018, arXiv:1802.00420. [Google Scholar] [CrossRef]
  37. Davidson, H.E. Filling the Knowledge Gaps. Consult. Pharm. 2015, 30, 249. [Google Scholar] [CrossRef]
  38. Ren, K.; Zheng, T.; Qin, Z.; Liu, X. Adversarial Attacks and Defenses in Deep Learning. Engineering 2020, 6, 346–360. [Google Scholar] [CrossRef]
  39. Lee, H.; Han, S.; Lee, J. Generative Adversarial Trainer: Defense to Adversarial Perturbations with GAN. arXiv 2017, arXiv:1705.03387. [Google Scholar] [CrossRef]
  40. Lindsay, G.W. Convolutional Neural Networks as a Model of the Visual System: Past, Present, and Future. J. Cogn. Neurosci. 2021, 33, 2017–2031. [Google Scholar] [CrossRef]
  41. Kerr, D.; Coleman, S.A.; McGinnity, T.M.; Clogenson, M. Biologically inspired intensity and range image feature extraction. In Proceedings of the 2013 International Joint Conference on Neural Networks (IJCNN), Dallas, TX, USA, 4–9 August 2013. [Google Scholar] [CrossRef]
  42. Yan, D.; Hu, B. Shared Representation Generator for Relation Extraction With Piecewise-LSTM Convolutional Neural Networks. IEEE Access 2019, 7, 31672–31680. [Google Scholar] [CrossRef]
  43. Kudithipudi, D.; Aguilar-Simon, M.; Babb, J.; Bazhenov, M.; Blackiston, D.; Bongard, J.; Brna, A.P.; Chakravarthi Raja, S.; Cheney, N.; Clune, J.; et al. Biological underpinnings for lifelong learning machines. Nat. Mach. Intell. 2022, 4, 196–210. [Google Scholar] [CrossRef]
  44. Khaligh-Razavi, S.M.; Kriegeskorte, N. Deep Supervised, but Not Unsupervised, Models May Explain IT Cortical Representation. PLoS Comput. Biol. 2014, 10, e1003915. [Google Scholar] [CrossRef] [PubMed]
  45. Cadena, S.A.; Denfield, G.H.; Walker, E.Y.; Gatys, L.A.; Tolias, A.S.; Bethge, M.; Ecker, A.S. Deep convolutional models improve predictions of macaque V1 responses to natural images. PLoS Comput. Biol. 2019, 15, e1006897. [Google Scholar] [CrossRef] [PubMed]
  46. Schrimpf, M.; Kubilius, J.; Hong, H.; Majaj, N.J.; Rajalingham, R.; Issa, E.B.; Kar, K.; Bashivan, P.; Prescott-Roy, J.; Geiger, F.; et al. Brain-Score: Which Artificial Neural Network for Object Recognition is most Brain-Like? bioRxiv 2018. [Google Scholar] [CrossRef]
  47. Machiraju, H.; Choung, O.H.; Frossard, P.; Herzog, M.H. Bio-inspired Robustness: A Review. arXiv 2021, arXiv:2103.09265. [Google Scholar] [CrossRef]
  48. Malhotra, G.; Evans, B.; Bowers, J. Adding biological constraints to CNNs makes image classification more human-like and robust. In Proceedings of the 2019 Conference on Cognitive Computational Neuroscience: Cognitive Computational Neuroscience, CCN, Online, 13–16 September 2019. [Google Scholar] [CrossRef]
  49. Luan, S.; Chen, C.; Zhang, B.; Han, J.; Liu, J. Gabor Convolutional Networks. IEEE Trans. Image Process. 2018, 27, 4357–4366. [Google Scholar] [CrossRef]
  50. Baidya, A.; Dapello, J.; DiCarlo, J.J.; Marques, T. Combining Different V1 Brain Model Variants to Improve Robustness to Image Corruptions in CNNs. arXiv 2021, arXiv:2110.10645. [Google Scholar] [CrossRef]
  51. Safarani, S.; Nix, A.; Willeke, K.; Cadena, S.A.; Restivo, K.; Denfield, G.; Tolias, A.S.; Sinz, F.H. Towards robust vision by multi-task learning on monkey visual cortex. arXiv 2021, arXiv:2107.14344. [Google Scholar] [CrossRef]
  52. Li, Z.; Brendel, W.; Walker, E.Y.; Cobos, E.; Muhammad, T.; Reimer, J.; Bethge, M.; Sinz, F.H.; Pitkow, X.; Tolias, A.S. Learning From Brains How to Regularize Machines. arXiv 2019, arXiv:1911.05072. [Google Scholar] [CrossRef]
  53. Reddy, M.V.; Banburski, A.; Pant, N.; Poggio, T. Biologically Inspired Mechanisms for Adversarial Robustness. arXiv 2020, arXiv:2006.16427. [Google Scholar] [CrossRef]
  54. Freeman, J.; Simoncelli, E.P. Metamers of the ventral stream. Nat. Neurosci. 2011, 14, 1195–1201. [Google Scholar] [CrossRef] [PubMed]
  55. Han, Y.; Roig, G.; Geiger, G.; Poggio, T. Scale and translation-invariance for novel objects in human vision. Sci. Rep. 2020, 10, 61. [Google Scholar] [CrossRef] [PubMed]
  56. Kang, X.; Guo, J.; Song, B.; Cai, B.; Sun, H.; Zhang, Z. Interpretability for reliable, efficient, and self-cognitive DNNs: From theories to applications. Neurocomputing 2023, 545, 126267. [Google Scholar] [CrossRef]
  57. Borji, A. Shape Defense Against Adversarial Attacks. arXiv 2020, arXiv:2008.13336. [Google Scholar] [CrossRef]
  58. Mirza, M.; Osindero, S. Conditional Generative Adversarial Nets. arXiv 2014, arXiv:1411.1784. [Google Scholar] [CrossRef]
  59. Su, J.; Vargas, D.V.; Sakurai, K. One Pixel Attack for Fooling Deep Neural Networks. IEEE Trans. Evol. Comput. 2019, 23, 828–841. [Google Scholar] [CrossRef]
  60. Brendel, W.; Rauber, J.; Bethge, M. Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models. arXiv 2017, arXiv:1712.04248. [Google Scholar] [CrossRef]
  61. Wang, G.; Lopez-Molina, C.; De Baets, B. Multiscale Edge Detection Using First-Order Derivative of Anisotropic Gaussian Kernels. J. Math. Imaging Vis. 2019, 61, 1096–1111. [Google Scholar] [CrossRef]
  62. Jing, J.; Liu, S.; Wang, G.; Zhang, W.; Sun, C. Recent advances on image edge detection: A comprehensive review. Neurocomputing 2022, 503, 259–271. [Google Scholar] [CrossRef]
  63. Babaiee, Z.; Hasani, R.; Lechner, M.; Rus, D.; Grosu, R. On-off center-surround receptive fields for accurate and robust image classification. In Proceedings of the 38th International Conference on Machine Learning, Virtual, 18–24 July 2021. [Google Scholar]
  64. Wang, G.; Lopez-Molina, C.; Vidal-Diez de Ulzurrun, G.; De Baets, B. Noise-robust line detection using normalized and adaptive second-order anisotropic Gaussian kernels. Signal Process. 2019, 160, 252–262. [Google Scholar] [CrossRef]
  65. Wang, G.; Lopez-Molina, C.; De Baets, B. Blob reconstruction using unilateral second order gaussian kernels with application to high-iso long-exposure image denoising. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 4817–4825. [Google Scholar]
  66. Wang, G.; Lopez-Molina, C.; De Baets, B. Automated blob detection using iterative Laplacian of Gaussian filtering and unilateral second-order Gaussian kernels. Digit. Signal Process. 2020, 96, 102592. [Google Scholar] [CrossRef]
  67. Carandini, M.; Heeger, D.J.; Movshon, J.A. Linearity and normalization in simple cells of the macaque primary visual cortex. J. Neurosci. 1997, 17, 8621–8644. [Google Scholar] [CrossRef]
  68. Roelfsema, P.R. Cortical algorithms for perceptual grouping. Annu. Rev. Neurosci. 2006, 29, 203–227. [Google Scholar] [CrossRef]
Figure 1. A demonstration of an adversarial sample generated by applying an adversarial attack to an ImageNet-pretrained ResNet. The imperceptible perturbation fools ResNet into recognizing the image as a gibbon.
Figure 2. Adversarial textures can be applied to different styles of clothing (such as T-shirts, skirts, and dresses) [25], rendering individuals wearing these garments undetectable by intelligent detectors, regardless of their positions and gestures.
Figure 3. The architecture of VOneNet (A) and the robustness evaluation results (B) [4]. Gray bars show the performance of the base models, blue bars show the improvements from the VOneBlock, and dashed lines indicate the performance on clean images.
Figure 4. Each variant of VOneNet incorporates a distinct VOneBlock [50].
Figure 5. The VGG-19 architecture with Multi-Task Learning (MTL) for image classification and neural prediction [51].
Figure 6. Exemplar classification results on TIN-TC for three types of corruption [52], for which MTL-Monkey exhibits the best (left), median (middle), and worst (right) robustness scores across five gradually increasing severity levels.
Figure 7. Illustration of the edge-guided adversarial training (EAT) scheme. The adversarial training of grayscale or RGB images along with edge maps (A) and the more complex adversarial training approach (B).
Figure 8. Architectural scheme of the push–pull layer [9].
Figure 9. Achieving surround modulation via lateral connections [10]. (a) Each modulation unit, based on its own activity level, excites nearby units and inhibits distant units within a certain range. (b) A 5 × 5 surround modulation (SM) kernel with excitatory connections (green) and inhibitory connections (red) and their associated weights.
Figure 10. The convolutional results of on-center and off-center filters are complementary [63]. On-center filters detect outer edges for dark features on a light background, while off-center filters detect inner edges for dark features on a light background. Conversely, for light features on a dark background, the roles of on-center and off-center filters are reversed. The first column represents the original image, the second and third columns represent the convolutional responses of the on-center and off-center filters, respectively, and the fourth column shows the combined response obtained by summing the two filter responses.
Table 1. Summary of representative models in the field of biologically inspired adversarial defense.
Category | Model | Backbone | Characteristics
V1-inspired | VOneNet | ResNet50 | The VOneBlock is an adaptable, plug-and-play front end with strong transferability for mitigating adversarial attacks and image perturbations.
V1-inspired | VOneNet variants | ResNet50 | Initial evidence shows that V1-inspired CNNs can achieve higher robustness gains through distillation, yet they still lack biologically plausible explanations.
Neural signal-based | MTL | VGG-19 | Using a novel constrained reconstruction analysis, this study examined how the feature representations of a brain-trained network change as its robustness improves. However, training relied solely on neural data from monkey brains, which may introduce specific conditions and limitations and thus restrict the generalizability of the findings.
Mechanism-inspired | Retinal fixations | Standard CNN | By incorporating non-uniform sampling and multiscale receptive fields, the CNN shows significantly improved robustness against small adversarial perturbations, but at the cost of increased computational overhead, and the performance of these mechanisms on large-scale datasets remains unverified.
Mechanism-inspired | EAT | Standard CNN | In contrast to fixed-weight biological filters, EAT still employs adversarial training to enhance edge features. The invariance of edge maps makes the model more robust against moderate imperceptible perturbations, albeit at a higher training cost.
Mechanism-inspired | Push–pull layer | ResNet20 | Introducing a novel push–pull layer into CNN architectures improves the robustness of existing networks against various types of image corruption. Integrating this mechanism into object detection to enhance noise robustness is a promising future direction.
Mechanism-inspired | SM-CNN | Standard CNN | SM-CNN shows superior generalization on challenging visual tasks, but determining appropriate connection weights and structural parameters is difficult, and extensive tuning is required in practice to achieve optimal results.
Mechanism-inspired | OOCS-CNN | Standard DCNN | On-center and off-center filters, whose variances are computed from the receptive field size, enable better detection of edge information and improve classification accuracy and robustness. The resulting robustness to illumination variations is particularly promising for visual perception systems such as autonomous driving.
Table 2. VOneResNet50 outperforms other defenses on perturbation mean and overall mean.
Model | Clean | White Box | Corruption | Mean
Base ResNet50 | 75.6 | 16.4 | 38.8 | 27.6
AT L∞ | 62.4 | 52.3 | 32.3 | 42.3
ANT3×3 + SIN | 74.1 | 17.3 | 52.6 | 34.9
VOneResNet50 | 71.7 | 51.1 | 40.0 | 45.6
