Artificial Visual System for Orientation Detection Based on Hubel–Wiesel Model

The Hubel–Wiesel (HW) model is a classical neurobiological model for explaining the orientation selectivity of cortical cells. However, the HW model has still not been fully proved physiologically, and there are few concise yet efficient systems that quantify and simulate the HW model and can be used for object orientation detection applications. To realize a straightforward and efficient quantitative method and to validate the HW model’s reasonability and practicality, we use the McCulloch-Pitts (MP) neuron model to simulate simple cells and complex cells and implement an artificial visual system (AVS) for two-dimensional object orientation detection. First, we realize four types of simple cells, each of which is responsible only for detecting a specific orientation angle locally. Complex cells are realized with the sum function. The local orientation information of an object is collected by the simple cells and subsequently converged to the complex cell of the same type, which computes the global activation degree. Finally, the global orientation is obtained from the activation degree of each type of complex cell. Based on this scheme, an AVS for global orientation detection is constructed. We conducted computer simulations to test the feasibility and effectiveness of our scheme and the AVS. The simulations show that the mechanism-based AVS discriminates orientation accurately and shows striking similarities with the natural visual system, which indirectly supports the rationality of the Hubel–Wiesel model. Furthermore, in a comparison with a traditional convolutional neural network (CNN), our AVS outperforms the CNN on orientation detection tasks in identification accuracy, noise resistance, computation and learning cost, hardware implementation, and reasonability.


Introduction
The human brain's nervous system is a highly complex deep network constructed from more than 10^11 neurons [1]. About 80% of the information received by our brain comes from the visual system, and the neurons in the human brain are concentrated more heavily on visual tasks [2,3]. Accordingly, research on the visual system is widely considered a proper starting point for figuring out how the brain works. Phenomenologically, forms, colors, and movements are fundamental and distinct attributes of visual images. Thus, we consider the essential functions of our visual system to include form perception, color perception, and motion perception [4,5]. Within the visual system, we regard orientation detection as a kind of form perception. It has an essential role in human behavioral decision making. So far, however, the principle of orientation selectivity remains unclear [6,7]. Once the mechanism of visual orientation detection is understood, it would be significant for studies of the human brain [8,9]. From 1955 to 1978, Hubel and Wiesel systematically studied the functional structure of the visual system [10]. In 1959, they reported that some cat cortical neurons showed orientation selectivity [11]. When they showed objects with various shapes and locations in front of a cat's eyes, these neurons responded optimally to an object with a specific orientation at a specific location; otherwise, they showed little or no response [12]. In 1968, they reported that some neurons in the monkey's striate cortex had similar characteristics, but their optimal response to an object was no longer limited to a specific location. Hubel and Wiesel named these two kinds of neurons simple cells and complex cells [13]. Simple cells are a simple type of cortical cell in the visual cortex that responds optimally to an object with a specific orientation at a certain fixed location in the visual field. Furthermore, an area that is ineffective for one simple cell might be effective for another cell.
Complex cells also respond optimally to objects with a specific orientation angle, but they no longer restrict the objects' locations in the visual field. An object with the optimal orientation can move within the receptive field without inactivating the neuron [14–17]. To explain the orientation selectivity of these cortical cells, Hubel and Wiesel put forward a scheme in which a simple cell's receptive field is integrated from the center-surround receptive fields of several LGN cells, and a complex cell's receptive field is integrated from those of several simple cells [10,11]. The neural circuit thus forms a feedforward neural network, which is usually called the Hubel-Wiesel (HW) model (described in Section 2.1). Although the HW model has not been fully proved, some physiological experiments showed that there are indeed connections between LGN cells and simple cells, which lends some plausibility to the HW model [18–20]. Meanwhile, with growing demand from practical applications, several ways to realize orientation detection have been developed: the principal component analysis method, the gradient modeling method, the digital filter method, and the CNN method [21–24]. Among these methods, the CNN method showed the best recognition performance after training on a considerable amount of data [25,26]. As the requirements extend to more complex scenes, however, traditional deep learning runs into generalization difficulties, and a trained model is usually applicable only to a limited set of tasks. Thus, scientists have started to concentrate more on the brain and to apply its visual information processing mechanisms to computer vision and artificial intelligence [27]. Although many works already focus on simulating the visual system, most of them cannot be applied directly in deep learning.
Some methods focus on realization in electronic devices [28–30]; although these device-based works are impressive, they are hard to connect with computer vision. Some related research concentrates on simulating biological features [31–33]; although these systems can simulate the cell features, their designs are complicated. Other works concentrate on applications [34–36]; although they are applicable, they do not generalize and are limited to specific tasks. So far, we lack a concise and efficient quantitative realization of the classical HW model.
To test the reasonability and practicality of the HW model and explain orientation selectivity in a quantitative manner, we propose a McCulloch-Pitts (MP) neuron-based orientation detection scheme and implement an artificial visual system (AVS) for two-dimensional object orientation detection. Simple cells and complex cells are realized with the MP neuron model. For simplicity, we realize four types of simple cells for a 3 × 3 two-dimensional local receptive field, each of which corresponds to a specific orientation angle (0°, 45°, 90°, and 135°). In the detection process, the orientation information of each local receptive field is extracted by the simple cells separately and converged to the complex cell of the same type. The function of a complex cell is to sum the activations of all simple cells of its type. Finally, the global orientation is inferred from the activation degrees of the complex cells: the type of complex cell with the highest activation corresponds to the global orientation. Based on this scheme, we implement an AVS for global orientation detection and evaluate its performance by computer simulation on an image dataset. The objects in this dataset are two-dimensional and have different ideal shapes, locations, and orientation angles. The computer simulation results show that the AVS behaves similarly to the biological visual system and recognizes the orientation of objects with different sizes, shapes, and locations with excellent accuracy, which indirectly supports the reasonability and practicality of the HW model. To show the AVS's advantages, we compare the performance of the AVS and a CNN on orientation detection and find that the AVS outperforms the CNN in identification accuracy, noise resistance, computation and learning cost, hardware implementation, and reasonability.

Mechanism and System
This section introduces the Hubel-Wiesel (HW) model and realizes simple cells and complex cells with an artificial neuron model. It then describes the implementation of an artificial visual system based on the HW model for two-dimensional object orientation detection.

Hubel-Wiesel Model
The Hubel-Wiesel (HW) model is a scheme for explaining the orientation selectivity of cortical cells [10]. Orientation selectivity is a feature of cortical cells observed in Hubel and Wiesel's experiments. The setup and results of the experiment on a cat are sketched in Figure 1 [17,37,38]. The recorded electric signals show that some cells in the cat cortex respond optimally to light stimuli with an edge of a specific orientation at a fixed location. These cells are named simple cells. Another type of cell, called complex cells, responds vigorously to stimuli with an edge of a specific orientation but is no longer limited to a fixed local location: a stimulus with the optimal orientation activates the complex cell most strongly at every location within the global receptive field [14–17]. To explain the orientation selectivity of simple cells and complex cells, Hubel and Wiesel proposed a feedforward model. They speculated that a simple cell receives the convergent input of several LGN cells whose receptive fields are arranged along a definite orientation, so that the simple cell's optimal response is tuned to stimuli with this specific orientation. Similarly, a complex cell's inputs converge from several simple cells with the same orientation selectivity. Accordingly, complex cells are insensitive to the location of the stimulus within the global receptive field [10,11]. Figure 2 shows how the receptive fields are linked [17,38]. Thus, a classical HW feedforward model can be described theoretically as in Figure 3.

McCulloch-Pitts Neuron Model
In the 1940s, McCulloch and Pitts proposed a simple model of biological neurons [39]. The McCulloch-Pitts (MP) neuron model is a simplification of the biological nerve cell that has only two states: 1, excited (firing), and 0, not excited (inhibited). Figure 4 describes the detailed structure of an MP neuron. The neuron accepts inputs with different weights. When the weighted sum exceeds a certain threshold, the neuron fires and outputs y = 1; otherwise, y = 0 [40].
Figure 4. The structure of the McCulloch-Pitts neuron model.
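The threshold rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `mp_neuron` is hypothetical, and the comparison uses "greater than or equal to" as an assumption (the text says the weighted sum must exceed or reach the threshold; the two readings coincide whenever the threshold is not itself an attainable sum).

```python
from typing import Sequence


def mp_neuron(inputs: Sequence[float], weights: Sequence[float], threshold: float) -> int:
    """Hypothetical helper: return 1 (fire) if the weighted sum reaches the threshold, else 0."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    # Assumption: firing condition is weighted_sum >= threshold.
    return 1 if weighted_sum >= threshold else 0


# With unit weights and a threshold of 1.5, two binary inputs behave like a logical AND.
print(mp_neuron([1, 1], [1, 1], 1.5))  # 1
print(mp_neuron([1, 0], [1, 1], 1.5))  # 0
```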

Realization of Simple Cell and Complex Cell
This section describes the realization of simple and complex cells based on the artificial neuron model. Studies on biological neural networks have led to the consensus that a single neuron can perform a simple task. Thus, following the features of simple cells and complex cells, we design simple cells based on the MP model for local orientation detection and complex cells with a sum function for accumulating the activations.
As introduced above, a simple cell's receptive field may be formed from the receptive fields of several LGN cells (see Figure 2a). In this paper, to simplify the simulation and the neuron computation, we design each simple cell with a 3 × 3 local receptive field. As shown in Figure 5, the receptive fields of several LGN cells together form a simple cell receptive field of size 3 × 3. The activation of a simple cell depends directly on the light information in its spatial receptive field. We therefore omit the processing of retinal cells in this pathway and let the light information be transmitted directly to the simple cells, so we do not need to consider which and how many LGN cells' neural signals are input to a simple cell. When light falls on a region, a group of photoreceptors receives the light signal and generates a corresponding electrical signal. The electrical signal is then transmitted to simple cells through the primary visual pathway. The simplified signal transmission circuit is shown in Figure 6. As a simplification, one photoreceptor receives the light information of a one-pixel region, and the light information in a nine-pixel (3 × 3) region is transmitted directly to a simple cell. The electrical signals are simplified to 0-1 signals: when a photoreceptor receives light, it outputs 1; otherwise, it outputs 0.
This study introduces four kinds of orientation-selective simple cells based on the MP model for detecting orientation angles of 0°, 45°, 90°, and 135°, respectively. The details of a 45°-selective simple cell are illustrated in Figure 7, where the input signal is denoted by x i,j and 'i' and 'j' give the two-dimensional location in the local receptive field. Furthermore, to simplify the realization of the AVS (introduced in the next section) and the neural computation, we idealize the 'OFF' regions of simple cells: light stimulation in the 'OFF' region does not cause any inhibitory response of the corresponding simple cell (the weights of the neural connections in the OFF region can be regarded as 0).

Figure 6. Signal transmission flow from light information to a simple cell. In a 3 × 3 region, the locations of each pixel are labeled from x 1 to x 9 .
As shown in Figure 7, only the light stimulation in the optimal orientation-selective region (ON region) is received by the 45°-selective simple cell. For a 45°-selective simple cell, the optimal orientation-selective locations are x i,j , x i+1,j−1 and x i−1,j+1 . The spatial light information in the local receptive field is projected onto the retina and received by the corresponding photoreceptors, and the generated electrical input signals are transmitted to the 45°-selective simple cell. When a photoreceptor receives light, the generated input is 1 (effective input), and the weight of its neural connection to the simple cell is set to 1. The threshold θ is set to 2.5. When the weighted sum of inputs reaches the threshold θ, the neuron is activated. Thus, the 45° simple cell is activated if and only if x i,j , x i+1,j−1 , and x i−1,j+1 are all effective inputs. The activation result can be expressed by the following equation:

$$ y^{45^\circ}_{i,j} = \begin{cases} 1, & x_{i,j} + x_{i+1,j-1} + x_{i-1,j+1} \geq \theta \\ 0, & \text{otherwise} \end{cases} \qquad \theta = 2.5 $$
The structures of the four types of orientation-selective simple cells and their optimal stimulus orientations are shown in Figure 8. The simple cells for the other three orientations are realized in a similar way. In a 3 × 3 local receptive field, each of them also responds to only three effective inputs. The 0°-selective simple cell responds only to stimuli at x i,j , x i,j−1 and x i,j+1 . The inputs of the 90°-selective simple cell come from x i,j , x i−1,j and x i+1,j . The effective input locations of the 135°-selective simple cell are x i,j , x i−1,j−1 and x i+1,j+1 .
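The four simple cells described above can be sketched as one small Python function. This is an illustrative sketch under stated assumptions, not the authors' code: the names `ON_OFFSETS`, `THETA`, and `simple_cell` are hypothetical, and the index i is assumed to be the row (increasing downward) and j the column, which makes the offsets below match the 0°, 45°, 90°, and 135° lines.

```python
# Offsets (di, dj) of the three ON-region inputs relative to the centre x_{i,j}.
# Assumption: i is the row index (increasing downward), j is the column index.
ON_OFFSETS = {
    0: [(0, -1), (0, 0), (0, 1)],
    45: [(1, -1), (0, 0), (-1, 1)],
    90: [(-1, 0), (0, 0), (1, 0)],
    135: [(-1, -1), (0, 0), (1, 1)],
}
THETA = 2.5  # threshold given in the text; all ON weights are 1, OFF weights are 0


def simple_cell(patch, angle):
    """Return 1 if all three ON inputs of the given angle are lit in a 3x3 patch of 0/1 values."""
    total = sum(patch[1 + di][1 + dj] for di, dj in ON_OFFSETS[angle])
    return 1 if total >= THETA else 0


# A 3x3 patch containing a line from bottom-left to top-right.
patch = [
    [0, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
]
print([simple_cell(patch, a) for a in (0, 45, 90, 135)])  # [0, 1, 0, 0]
```

Because the threshold is 2.5 and every ON weight is 1, each cell fires only when all three of its ON pixels are lit; pixels in the OFF region never change the outcome.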
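Complex cells in our proposed detection scheme are responsible for converging the total activation of all simple cells. For simplicity, all simple cells with the same optimal orientation selectivity are connected to one single complex cell. Correspondingly, four different orientation-selective complex cells are needed (0°, 45°, 90°, and 135°). The realization of a complex cell is described in Figure 9. The output of the complex cell can be expressed by the following equation:

$$ Y^{\varphi} = \sum_{i} \sum_{j} y^{\varphi}_{i,j}, \qquad \varphi \in \{0^\circ, 45^\circ, 90^\circ, 135^\circ\}, $$

where y^{\varphi}_{i,j} ∈ {0, 1} is the output of the φ-selective simple cell whose local receptive field is centered at location (i, j), and the sum runs over all local receptive fields.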

AVS for Global Orientation Detection
We implement an artificial visual system (AVS) for two-dimensional object orientation detection based on the simple and complex cells designed above. This section describes the structure of the AVS and the process by which it detects the global orientation of an object. As mentioned above, a simple cell is activated by a 3-pixel line of its optimal orientation within a 3 × 3 local receptive field. A larger image of size M × N is divided into M × N local receptive fields by taking each pixel as a central point. The basic detection scheme for a large image therefore uses simple cells to detect the possible orientations in every local receptive field and uses complex cells to record the total activation of each type of simple cell. Accordingly, to extract the local orientation information of an object in a two-dimensional M × N image, M × N × 4 simple cells and 4 corresponding complex cells are needed. Figure 10 shows the entire structure of the AVS for detecting the global orientation of an object in a 5 × 5 image. Taking each pixel as the central location, a 5 × 5 image can be divided into 25 local receptive fields of size 3 × 3 (any part of a local receptive field that lies beyond the image is regarded as receiving no light stimulation). The light stimulation in each local receptive field is received by 9 photoreceptors, which generate the corresponding 0-1 input signals. In each local receptive field, the photoreceptors are connected to a set of four different simple cells, and each set extracts the orientation information of its own local receptive field. Subsequently, the activation results of all simple cells are input to the complex cell of the corresponding type. Each complex cell sums the total activation of one type of simple cell, giving four final outputs that represent the activation of the four orientations. The object's global orientation is inferred from the type of complex cell that is most activated.
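The whole detection process for an M × N binary image can be sketched in plain Python as follows. This is a minimal sketch under assumptions rather than the authors' implementation: all function and variable names are hypothetical, i is taken as the row index and j as the column index, and ties between complex cells are reported as a list of angles instead of being resolved.

```python
# Assumption: i is the row index (increasing downward), j is the column index.
ON_OFFSETS = {
    0: [(0, -1), (0, 0), (0, 1)],
    45: [(1, -1), (0, 0), (-1, 1)],
    90: [(-1, 0), (0, 0), (1, 0)],
    135: [(-1, -1), (0, 0), (1, 1)],
}
THETA = 2.5


def pixel(image, i, j):
    """Return the 0/1 value at (i, j); locations beyond the image count as no light."""
    if 0 <= i < len(image) and 0 <= j < len(image[0]):
        return image[i][j]
    return 0


def complex_cell_outputs(image):
    """Sum, per orientation, the activations of the simple cells centred on every pixel."""
    rates = {angle: 0 for angle in ON_OFFSETS}
    for i in range(len(image)):
        for j in range(len(image[0])):
            for angle, offsets in ON_OFFSETS.items():
                total = sum(pixel(image, i + di, j + dj) for di, dj in offsets)
                if total >= THETA:  # the simple cell at (i, j) fires
                    rates[angle] += 1
    return rates


def global_orientation(image):
    """Return the most activated orientation(s) and all four complex cell outputs."""
    rates = complex_cell_outputs(image)
    best = max(rates.values())
    return [angle for angle, rate in rates.items() if rate == best], rates


# A 5x5 image containing a horizontal 5-pixel line in the middle row.
image = [[0] * 5, [0] * 5, [1] * 5, [0] * 5, [0] * 5]
print(global_orientation(image))  # ([0], {0: 3, 45: 0, 90: 0, 135: 0})
```

In the example, only the three interior pixels of the line have lit neighbours on both sides, so the 0°-selective complex cell receives three activations and the other three receive none.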

Simulation and Result
This section describes the validation results of the AVS on datasets and in some biology-inspired experiments. We also compare the performance of the AVS and a CNN on noisy data. All simulations were run on Apple M1 hardware.
To validate the mechanism's feasibility and the mechanism-based AVS, we implemented this mechanism and the AVS for global orientation detection by computer simulation. This section describes the AVS's physiological similarity with the biological visual system and evaluates the practicality combined with dataset testing results.
We first tested the AVS's feasibility on a binary image dataset. Each image had 1024 pixels (32 × 32) and contained at least 3 light-spot pixels. In each image, the light spots formed an ideal object (with central or axial symmetry) with a specific orientation angle (0°, 45°, 90°, or 135°). We evaluated the detection system by analyzing its recognition accuracy on 45,788 images, and the results are summarized in Table 1. The AVS has high detection accuracy for the orientation of ideal objects in binary images. An example of orientation detection on an object in a binary image is provided in Figure 11. From Figure 11a, we can observe that the object's orientation angle in the image was 135°. This 32 × 32 image could be divided into 1024 local receptive fields. To obtain the object's global orientation, 4096 (32 × 32 × 4) simple cells were needed to examine the local receptive fields, and 4 complex cells were required to converge the activations of the simple cells. Following the single-neuron potential recording method used by biologists [11], we used the spike rate to record each complex cell's activation degree. One active simple cell makes the complex cell of the same type generate one spike, so theoretically a complex cell's spike rate could be as high as 1024. From the activation results shown in Figure 11b,c, we can see that the 135°-selective complex cell had the highest spike rate, 44. Thus, the detected orientation of the object was 135°.
An example of detecting a 0°-oriented object is shown in Figure 12. From Figure 12b,c, we know that the 0°-selective complex cell had the highest spike rate, 65. The detection result agreed with what we observed with our own eyes. When we rotated the object in Figure 12a to other orientation angles and detected it with our AVS, we obtained the results shown in Figure 13. Whichever orientation the object took, it activated the corresponding complex cell the most. This result also supports the feasibility and reliability of our mechanism. To further verify our mechanism, we conducted some comparative experiments. First, consider the object shown in Figure 14a, a 5 × 5 square. From the detection results shown in Figure 14b,c, we know that the spike rates of the 0°-selective and 90°-selective complex cells were equal and were the highest. The AVS therefore cannot determine which angle is the object's orientation. The same holds for the biological visual system: we humans cannot tell its orientation angle either, and can only say that it is oriented toward 0° and 90° at the same time. To further investigate the correlation between object shape and complex cell activation, we gradually increased the length of a 3 × 3 square along the direction of 0° until it became a 3 × 18 rectangle. The objects and the spike rate curves are shown in Figure 15. Then we increased the length of the 3 × 18 rectangle along the direction of 90° until it became a square again. The objects and the spike rate curves are shown in Figure 16. From the spike rate curves shown in Figure 15b, we can observe that as the object's shape approached a rectangle with clearly distinguishable length and width, the spike rates of the complex cells became markedly different.
The spike rate of the 0°-selective complex cell far exceeded those of the other cells, which means that the AVS can easily determine the object's orientation angle, just as human beings can recognize it more easily. Accordingly, we can conclude that as the length of the object increases along a certain direction, the corresponding complex cell's spike rate increases, and vice versa. This conclusion is consistent with the experimental phenomena observed by Hubel in cat cortical cells [11]. Examining the objects in Figure 16a, we also find that as the object approaches a square, it becomes more and more difficult for human beings to identify its orientation angle. Furthermore, from Figure 16b, we can observe that the activation of the 90°-selective complex cell approaches that of the 0°-selective complex cell. This is similar to our own visual recognition: the closer the shape of an object is to a square, the more difficult it is to identify its orientation angle. This result also supports the conclusion drawn above. In addition, the objects in the images were placed at different locations, yet their orientation angles were detected correctly. This result mirrors the characteristic feature of complex cells, which respond selectively to stimuli with a particular orientation regardless of the exact location of the stimulus [10,11].
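The trend described above can be checked with a short worked calculation. For a solid rectangle with `height` rows and `width` columns, a simple cell fires only when its three ON pixels all lie inside the rectangle, so the activations can be counted in closed form. The sketch below is our own derivation under the assumed 3-pixel ON regions and zero OFF weights, not a reproduction of the values plotted in Figures 15 and 16; the function name is hypothetical.

```python
def rectangle_spike_rates(height, width):
    """Count simple-cell activations per orientation for a solid height x width rectangle."""
    inner_h = max(height - 2, 0)  # rows that have a lit neighbour above and below
    inner_w = max(width - 2, 0)   # columns that have a lit neighbour left and right
    return {
        0: height * inner_w,     # horizontal triples: any row, interior columns
        45: inner_h * inner_w,   # diagonal triples: interior rows and interior columns
        90: inner_h * width,     # vertical triples: interior rows, any column
        135: inner_h * inner_w,
    }


print(rectangle_spike_rates(5, 5))   # {0: 15, 45: 9, 90: 15, 135: 9}
print(rectangle_spike_rates(3, 3))   # {0: 3, 45: 1, 90: 3, 135: 1}
print(rectangle_spike_rates(3, 18))  # {0: 48, 45: 16, 90: 18, 135: 16}
```

Under these assumptions a square always produces a tie between the 0° and 90° outputs, while stretching the 3 × 3 square to 3 × 18 raises the 0° output much faster than the others.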
To compare the performance of our AVS with that of a CNN in orientation detection tasks, including their noise resistance, we conducted a series of comparative experiments. First, we generated an original dataset consisting of 13,438 images (32 × 32), in which each object had at least 32 light-spot pixels as well as a specific orientation angle and location. Based on the original dataset, we then added noise to generate datasets with different types and quantities of noise. According to how the noise was added, the noisy data fall into two categories. Examples of the two types of noise are shown in Figure 17. In the first type, shown in Figure 17a, noise was added randomly to the background only, leaving the object untouched. In the second type, shown in Figure 17b, noise was added randomly to the whole image. Then, we generated seven datasets with different quantities of noise for each type of noise: 5%, 10%, 15%, 20%, 25%, and 30% (for a 32 × 32 image, the given proportion of the pixels of the whole image were noise). The performance of the AVS on these noisy datasets is shown in Table 2. The results in Table 2 show that the AVS resists background noise better than whole-image noise. Take the two objects shown in Figures 11a and 12a as examples. We added noise to the images and ran the detection. The detection results on the noisy images are shown in Figures 18 and 19. In Figure 18a, the noise was randomly added to the background only. Comparing the detection results with those for the noise-free image, we see that although the noise increased the spike rates of the other complex cells, the 135°-selective complex cell was still the most activated. In Figure 19a, the noise was randomly added to the whole image, which changed the object's shape. From Figure 19c, we know that the 0°-selective complex cell had the highest spike rate, the same as the detection result for the object in Figure 12a.
Comparing the spike records shown in Figure 19b with those in Figure 12b, it is evident that contiguous noise in the image affected the complex cells' spike rates. Contiguous noise in the background activated 0°-selective and 135°-selective simple cells in some local receptive fields, and contiguous noise in the object inhibited the activation of simple cells. In short, although noise affects the spike rates of complex cells, the orientation angle of an object can still be recognized as long as the proportion of noise stays below a certain level. Additionally, noise in the object affects the detection results more than noise in the background.
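The two noise categories can be illustrated with a small Python sketch. The source does not specify how a noise pixel is generated, so the sketch assumes that noise flips the binary value of a randomly chosen pixel and that the proportion is counted against all pixels of the image; the function name `add_noise`, its parameters, and the example bar image are hypothetical.

```python
import random


def add_noise(image, proportion, background_only, seed=0):
    """Return a copy of a 0/1 image with a given proportion of its pixels flipped.

    Assumptions (not from the source): noise flips a pixel's value, and the
    proportion refers to the total pixel count. With background_only=True,
    object pixels (value 1) are never selected.
    """
    rng = random.Random(seed)
    rows, cols = len(image), len(image[0])
    n_noise = round(proportion * rows * cols)
    candidates = [
        (i, j)
        for i in range(rows)
        for j in range(cols)
        if not (background_only and image[i][j] == 1)
    ]
    noisy = [row[:] for row in image]
    for i, j in rng.sample(candidates, min(n_noise, len(candidates))):
        noisy[i][j] = 1 - noisy[i][j]
    return noisy


# A 32x32 image with a 16-pixel horizontal bar as the object.
image = [[0] * 32 for _ in range(32)]
for j in range(8, 24):
    image[16][j] = 1

noisy_bg = add_noise(image, 0.10, background_only=True)    # first noise type
noisy_all = add_noise(image, 0.10, background_only=False)  # second noise type
```

With background-only noise the object keeps its shape and the noise can only add spurious activations, whereas whole-image noise can also break the object's own lines, which is consistent with the larger effect of in-object noise reported above.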
We also compared the generalization performance and noise immunity of the AVS and a CNN on orientation detection. The structure of the CNN used in these experiments is shown in Figure 20. We chose Adam as the optimizer. Thirty 3 × 3 filters were used in the convolution layer, and 2 × 2 max-pooling was used in the pooling layer. The output size of the first affine layer was 100, and the last layer output four values. The training set consisted of 10,750 ideal object images. We trained the CNN model for 50 epochs and chose the model with the best detection accuracy as the final model. The ideal object testing dataset consisted of 2687 images. We also collected eight natural objects (in binary form). We rotated, moved, and resized these objects within the image to obtain a natural object dataset of 1280 different images. Figure 21 shows several examples of the natural object data. Based on the two original testing sets, we then generated several noisy datasets.
The testing results are summarized in Tables 3 and 4. From Table 3, we know that the CNN's recognition accuracy was very high without noise but dropped quickly as the proportion of noise increased. When the noise proportion exceeded 1%, the performance of the CNN collapsed on both ideal objects and natural objects. The AVS consistently held a clear advantage over the CNN. Its recognition accuracy was still about 98% on the ideal object datasets and could exceed 90% on the natural object datasets. Overall, the AVS can correctly discriminate an object's orientation regardless of the object's shape, size, and location. Although the AVS already performs well on natural objects, we further explored its performance and the factors that affect its robustness. We recorded the classification results of the AVS on the natural object datasets with 0% and 10% noise and plotted the corresponding confusion matrices, as shown in Figure 22. Without noise, the AVS classified all objects correctly. With noise, according to the confusion matrix, the AVS still classified all objects oriented at 0° and 90° correctly but made errors on some objects oriented at 45° and 135°. Reviewing the misclassified images, we found that they showed the same objects at different positions and sizes. These objects activated similar numbers of 45°-selective and 135°-selective simple cells. When the images are clean, the AVS gives the correct detection results, but when noise is added, the cells' activations are affected, which leads to the classification errors. Overall, the AVS performs excellently on clean images and has good noise immunity on noisy data.

Discussion
This research aimed to study the feasibility and reliability of the Hubel-Wiesel (HW) model, to find concise and efficient quantitative realizations of simple cells and complex cells, and to realize a practical artificial visual system (AVS) based on the HW model. The resulting AVS has the following merits:
• Effectiveness: the AVS achieved 100% accuracy on datasets of ideally shaped objects and of natural objects with a particular orientation, which shows that our detection mechanism and the mechanism-based AVS can effectively detect the orientation of objects at distinct locations and with distinct sizes. The simulation results of the biology-inspired experiments also show that the AVS is effective and highly consistent with real physiological experiments [12].
• Robustness: compared with the CNN on orientation detection tasks, the AVS costs fewer computational resources but has better performance and noise resistance.
• Interpretability: the mechanism, structure, and parameters of the AVS for global orientation detection were all designed from the HW physiological model, so the AVS does not need learning and saves considerable training resources. The CNN method is a black-box operation that usually requires more training data or deeper networks to improve noise immunity and places requirements on the size of the input data. The AVS method does not need more layers and is easier to accept and trust. The calculation of the AVS is straightforward, and the image size does not need to be fitted to the AVS, so its hardware implementation is also more straightforward than that of a CNN. If we do want to train the AVS, we can use the perceptron algorithm in place of the fixed MP neuron model. The training can then start from a better and more reasonable initial condition, which accelerates learning and helps avoid local minima.
Overall, on object orientation detection tasks, the AVS performs much better than the CNN method: it has good generalization ability, higher recognition accuracy, and stronger noise resistance, and it is explainable, feasible, reasonable, and robust. The AVS based on the HW model is feasible, efficient, and very similar to the perception mechanism of the biological visual system. Therefore, the implementation scheme of simple and complex cells realized in computer simulations is expected to suggest useful directions for experiments in neural research.

Conclusions
This paper proposed a two-dimensional global orientation detection mechanism based on the Hubel-Wiesel (HW) model. Although we still know little about the principle of visual perception, we drew on the characteristics of cortical cells with orientation selectivity and designed four types of simple cells and complex cells. Simple cells extract the local orientation information, the activations of all simple cells are converged to the complex cell of the corresponding type, and the global orientation is given by the complex cell with the highest activation. Simple cells and complex cells are realized with the McCulloch-Pitts neuron model. Based on this scheme, we proposed an artificial visual system (AVS) for global orientation detection and tested its performance on different orientation detection tasks. Although the inhibitory effect of the OFF region was omitted in the simple cells, the success of the AVS provides a possible scheme for explaining the principles of the orientation selectivity of cortical cells, gives evidence for the reasonability of the HW model, and offers a potential implementation scheme for neural experiments in orientation selectivity research. Since the present version of the AVS can only detect binary objects with a definite orientation, future studies will need to extend its application and generalization to color images and more orientations.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to data privacy regulations.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: