Image Fusion Method Based on Snake Visual Imaging Mechanism and PCNN

Image fusion enriches an image and improves its quality so as to facilitate subsequent image processing and analysis. With the increasing importance of image fusion technology, the fusion of infrared and visible images has received extensive attention. Deep learning is now widely used in the field of image fusion; however, in some applications, a large amount of training data cannot be obtained. Because some specialized organs of snakes can receive and process both infrared and visible information, fusion methods for infrared and visible light that simulate the visual mechanism of snakes have emerged. This paper therefore approaches image fusion from the perspective of visual bionics; such methods do not require a significant amount of training data. However, most fusion methods that simulate snakes suffer from unclear details, so this paper combines such a method with a pulse-coupled neural network (PCNN). By studying two receptive field models of retinal nerve cells, the six dual-mode cell imaging mechanisms of rattlesnakes and their mathematical models, and the PCNN model, an improved fusion method for infrared and visible images is proposed. The proposed fusion method was evaluated on eleven groups of source images and compared with seven other fusion methods using three no-reference image quality evaluation indexes. The experimental results show that the improved algorithm proposed in this paper is better overall than the comparison methods on the three evaluation indexes.


Introduction
Image fusion is the process of using different sensors and computer technology to generate a richer, higher-quality image of the same scene. Image fusion can extract the effective information and the available complementary information in the source images, filter out redundant and invalid information, and generate a robust, informative image of improved quality.
Image fusion has been studied for more than 40 years and has been widely applied. Researchers have improved fusion methods from various angles. Among them, the fusion of visible and infrared images has received particular attention. Fusion algorithms based on traditional image processing methods are constantly evolving and include many different approaches, such as multi-scale transform fusion algorithms [1,2], sparse representation image fusion algorithms [3] and other traditional methods [4,5]. These traditional methods are generally stable, require no training and achieve good fusion effects, but they tend to be computationally complex, inefficient and prone to problems in the processing of details. Nowadays, fusion algorithms based on deep learning are becoming more and more popular, including methods based on convolutional neural networks [6,7], generative adversarial networks [8], autoencoder networks [9] and so on. These methods generally provide good fusion effects and rich details, but require a large amount of data for training. However, in some applications, a large amount of training data cannot be obtained.
Research findings in biology provide a new idea for the study of image fusion methods. Researchers have studied the biological mechanisms of visual perception and infrared vision in snakes. In 1953, by recording from retinal ganglion cells of the cat, Kuffler [10] characterized their activity and, based on their firing patterns, established the existence of two basic receptive-field types: ON-center and OFF-center cells with antagonistic surrounds. Hodgkin and Huxley [11] proposed the passive membrane equation, which underlies center-surround shunting neural networks (CSSNNs). In 1978, Hartline et al. [12] studied the visual function and infrared perception of pit vipers (rattlesnakes) using electrophysiological methods, pointed out that visible-light- and infrared-sensing neurons are distributed in the optic tectum of these snakes, and established the existence of dual-mode cells. In 1981, Newman and Hartline [13] discovered that the dual-mode cells of rattlesnakes can receive and process information from both infrared and visible light, naturally fusing infrared and visible images. Chen et al. then temporarily blocked some sensory organs of the pit viper, demonstrating that infrared and visible information complement each other when the pit viper hunts prey and inhibit each other during localization [14].
On the basis of this research on the infrared-sensing organs of snakes, some researchers have proposed novel infrared and visible fusion methods that mimic the visual imaging mechanism of snakes. In 1997, Waxman et al. [15] imitated the physiological mechanism of rattlesnakes and proposed an opponent-based fusion method for night vision and infrared images. However, the pseudo-color images generated by this method are distorted and have low visibility, which is not conducive to human observation. Reinhard et al. [16] proposed a method for color transfer between two color images. Li [17] and Zhang et al. [18] improved the classical receptive field model of snakes and achieved good results.
The pulse-coupled neural network is a third-generation artificial neural network that requires no training, which distinguishes it from traditional artificial neural networks. It is inspired by the mammalian visual system. In 1990, Eckhorn et al. [19] proposed a neural network model based on the signal transduction of neurons in the cat visual cortex. In 1999, Johnson and Padgett [20] modified this model for image processing and named it the PCNN. Through linear addition and modulation coupling, the PCNN reflects the exponential attenuation and time delay of bioelectrical transmission. The PCNN therefore handles adjacent excitation signals well and can be used for image fusion, image segmentation and other tasks, all without training.
This paper builds on research into image fusion methods based on the rattlesnake visual imaging system. From the perspective of bionics, a fusion method for infrared and visible images is designed to simulate the mechanism by which rattlesnakes fuse infrared and visible signals. Unlike deep learning methods, this kind of method does not require a large amount of training data and can still achieve fusion in applications where such data cannot be obtained. So far, the fusion methods that simulate snakes have generated pseudo-color images, because human eyes recognize objects faster in color images than in gray ones. However, the pseudo-color images generated by these methods may leave details unclear. Therefore, when building a model that simulates the visual mechanism of snakes, this paper does not directly map the processed images to the three RGB channels to generate a pseudo-color image, but instead combines the model with the PCNN.

Visual Receptive Field
In the 1930s, Hartline became the first to record from the axons of single retinal ganglion cells by dissecting optic nerve fibers from frogs. He identified three types of retinal ganglion cells: ON cells that fire strongly when the retina is illuminated, OFF cells that fire when the light is turned off, and ON/OFF cells that react briefly to both the turning on and turning off of light. He proved that each cell was sensitive to only a small illuminated area on the surface of the retina, which he called the receptive field of the cell. Kuffler then discovered in cats that the receptive field of each ganglion cell is actually composed of two concentrically arranged regions: an excitatory central region and an antagonistic surround region. Stimulating the central area with a small spot causes a strong response, while a larger spot produces a diminished response (antagonism) as it spreads into the surrounding area. When the light spot is limited to the center of its receptive field, ON/OFF cells have a strong transient response to both the onset and the offset of light. ON cells are activated when the light intensity increases or local light is enhanced, and the frequency of their nerve pulses increases. OFF cells are activated when the light is removed or the local light intensity is reduced, and the frequency of their nerve pulses increases. When the size of the spot increases into the surrounding area, the responses to light increment and decrement both decrease, showing that the surrounding area produces the opposite response.
According to the study of anatomy and physiology, the common receptive field of retinal nerve cells can be divided into ON-center and OFF-center ganglion cells.The ON ganglion cells are located in the ON excitatory region and surrounded by the OFF inhibitory region.The OFF ganglion cells are located in the OFF excitatory region and surrounded by the ON inhibitory region.In Figure 1, "+" represents the excitatory region and "−" represents the inhibitory region.

The ON and OFF opponent responses can be described by center-surround shunting equations of the form

dX(x, y)/dt = −A·X(x, y) + [E − X(x, y)]·[D1 + (C∗I)(x, y)] − [X(x, y) + F]·(S∗I)(x, y)
dY(x, y)/dt = −A·Y(x, y) + [E − Y(x, y)]·[D2 + (S∗I)(x, y)] − [Y(x, y) + F]·(C∗I)(x, y)

where X(x, y) and Y(x, y) are the ON-center and OFF-center opponent cell responses, respectively, A represents the attenuation constant of the cell, D1 and D2 represent the basal activity of the ON and OFF opponent cells, respectively, E and F are the polarization constants, and C(x, y) and S(x, y) represent the central region and surrounding region of the receptive field, both obeying a Gaussian distribution:

C(x, y) = (1/(2π·σc²))·exp(−(x² + y²)/(2σc²)),  S(x, y) = (1/(2π·σs²))·exp(−(x² + y²)/(2σs²)),

with σc < σs. The above formulas describe the instantaneous change in the cells after stimulation. When the cell response eventually reaches equilibrium (dX/dt = dY/dt = 0), the following outputs are obtained.

ON opponent cell output: X(x, y) = [E·(D1 + (C∗I)(x, y)) − F·(S∗I)(x, y)] / [A + D1 + (C∗I)(x, y) + (S∗I)(x, y)]
OFF opponent cell output: Y(x, y) = [E·(D2 + (S∗I)(x, y)) − F·(C∗I)(x, y)] / [A + D2 + (S∗I)(x, y) + (C∗I)(x, y)]

Six Dual-Mode Cell Fusion Mechanisms and Mathematical Models in Rattlesnakes

ON, OFF, and ON/OFF cells in the retina and their antagonistic center-surround organization form the basic structure of all vertebrate visual systems, and spatial antagonism is common in the receptive fields of visual cells. A great number of dual-mode cells exist in the optic tectum of venomous snakes such as pythons and rattlesnakes. These cells show different nonlinear responses when receiving infrared and visible light stimulation. These responses fall roughly into six categories, which are briefly introduced below.
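Before detailing the six cell models, the receptive-field model above can be illustrated in code. This is a minimal sketch, not the paper's implementation: approximating the center and surround convolutions with scipy.ndimage.gaussian_filter, the surround width used here, and all function names are our own assumptions (the paper's experiments report σc = 2.83 and σs = 500).

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def opponent_responses(img, sigma_c=2.83, sigma_s=10.0,
                       A=1.0, D1=0.0, D2=0.0, E=900.0, F=1.0):
    """ON- and OFF-center opponent responses at equilibrium (assumed form).

    img is a float image; C*I and S*I are approximated by Gaussian blurs
    with center width sigma_c and surround width sigma_s (sigma_c < sigma_s).
    """
    c = gaussian_filter(img, sigma_c)  # center response, C * I
    s = gaussian_filter(img, sigma_s)  # surround response, S * I
    x_on = (E * (D1 + c) - F * s) / (A + D1 + c + s)   # ON opponent output
    y_off = (E * (D2 + s) - F * c) / (A + D2 + s + c)  # OFF opponent output
    return x_on, y_off

# A bright spot on a dark background excites the ON channel more than the OFF channel.
img = np.zeros((32, 32))
img[16, 16] = 1.0
x_on, y_off = opponent_responses(img)
```

The shunting denominator keeps both outputs bounded regardless of input intensity, which is the practical benefit of the equilibrium form.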

"OR" Cells
When the "OR" cell receives the two kinds of stimulus signals, the visible light signal and the infrared signal, it responds to either signal alone and also to both signals present at the same time, with a gain effect when both signals stimulate the cell simultaneously. Therefore, a weighted method is adopted to simulate the physiological mechanism of "OR" cells.
When I_V(x, y) < I_IR(x, y), the mathematical model takes the weighted form

I_OR(x, y) = m·I_IR(x, y) + n·I_V(x, y)

When I_V(x, y) > I_IR(x, y), the mathematical model takes the weighted form

I_OR(x, y) = m·I_V(x, y) + n·I_IR(x, y)

where m > 0.5 and n < 0.5, so that the stronger of the two signals receives the larger weight, and I_OR(x, y) represents the image obtained after the processing of the "OR" cell mathematical model.

"AND" Cells
When the "AND" cell receives the two kinds of stimulus signals, the visible light signal and the infrared signal, it produces an obvious response only when both stimulus signals are present at the same time. When either signal stimulates the cell separately, there is basically no response or only a weak one. Therefore, a weighted method is adopted to simulate the physiological mechanism of "AND" cells.
When I_V(x, y) < I_IR(x, y), the mathematical model takes the weighted form

I_AND(x, y) = m·I_V(x, y) + n·I_IR(x, y)

When I_V(x, y) > I_IR(x, y), the mathematical model takes the weighted form

I_AND(x, y) = m·I_IR(x, y) + n·I_V(x, y)

where m > 0.5 and n < 0.5, so that the weaker, common component of the two signals receives the larger weight, and I_AND(x, y) represents the image obtained after the processing of the "AND" cell mathematical model.
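As a concrete illustration, the two weighted cell models can be sketched as follows. The specific weights m = 0.7 and n = 0.3 are illustrative only (the text fixes only m > 0.5 and n < 0.5), and the function names are our own.

```python
import numpy as np

def or_cell(i_v, i_ir, m=0.7, n=0.3):
    """'OR' cell: responds to either input, with a gain when both are present.
    The larger of the two inputs receives the larger weight m."""
    return np.where(i_v < i_ir, m * i_ir + n * i_v, m * i_v + n * i_ir)

def and_cell(i_v, i_ir, m=0.7, n=0.3):
    """'AND' cell: emphasizes the component common to both inputs.
    The smaller of the two inputs receives the larger weight m."""
    return np.where(i_v < i_ir, m * i_v + n * i_ir, m * i_ir + n * i_v)

i_v = np.array([0.2, 0.9])
i_ir = np.array([0.8, 0.1])
# or_cell -> [0.62, 0.66], and_cell -> [0.38, 0.34]
```

For every pixel the "OR" output is at least as large as the "AND" output, matching the gain versus common-information roles of the two cell types.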

Enhanced Cell Mathematical Model
(1) The mathematical model of infrared-enhanced visible light: When the cell receives the two kinds of stimulus signals, the visible light signal and the infrared signal, a response is generated when the visible light signal stimulates the cell separately, while the infrared signal alone produces basically no response or only a weak one. However, when both signals stimulate the cell at the same time, the response is enhanced. The infrared signal therefore plays an enhancing role in this cell.
When the two signals stimulate the cell simultaneously, the response generated by the visible light signal is the most significant part, while the infrared signal strengthens that response. Therefore, in the mathematical model simulating the physiological mechanism of this cell, the visible light signal plays the dominant role, while the infrared signal enters through an exponential function that represents the enhancement effect. The mathematical model takes the form

I_IR+VI(x, y) = I_V(x, y) · exp(I_IR(x, y))

(2) The mathematical model of visible-enhanced infrared light: When the cell receives the two kinds of stimulus signals, the visible light signal and the infrared signal, a response is generated when the infrared signal stimulates the cell alone, while the visible light signal alone produces basically no response or only a weak one. However, when both signals stimulate the cell at the same time, the response is enhanced. The visible light signal therefore plays an auxiliary, enhancing role in this cell.
Therefore, in the mathematical model simulating the physiological mechanism of this cell, the infrared signal plays the dominant role, while the visible signal enters through an exponential function that represents the enhancement effect. The mathematical model takes the form

I_VI+IR(x, y) = I_IR(x, y) · exp(I_V(x, y))

Inhibited Cell Mathematical Model
(1) The mathematical model of infrared suppression of visible light: When the cell receives the two kinds of stimulus signals, the visible light signal and the infrared signal, a response is generated when the visible light signal stimulates the cell alone, while the infrared signal alone produces basically no response or only a weak one. However, when both signals stimulate the cell at the same time, the response is weakened. The infrared signal therefore plays an inhibitory role in this cell.
When the two signals stimulate the cell simultaneously, the response generated by the visible light signal is the most significant part, while the infrared signal weakens that response. Therefore, in the mathematical model simulating the physiological mechanism of this cell, the visible light signal plays the dominant role, while the infrared signal enters through a logarithmic function that represents the inhibition effect. The mathematical model takes the form

I_IR−VI(x, y) = I_V(x, y) / (1 + ln(1 + I_IR(x, y)))

(2) The mathematical model of visible suppression of infrared light: When the cell receives the two kinds of stimulus signals, the visible light signal and the infrared signal, a response is generated when the infrared signal stimulates the cell alone, while the visible light signal alone produces basically no response or only a weak one. However, when both signals stimulate the cell at the same time, the response is weakened. The visible light signal therefore plays an inhibitory role in this cell.
Therefore, in the mathematical model simulating the physiological mechanism of this cell, the infrared signal plays the dominant role, while the visible signal enters through a logarithmic function that represents the inhibition effect. The mathematical model takes the form

I_VI−IR(x, y) = I_IR(x, y) / (1 + ln(1 + I_V(x, y)))
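The four enhancement and inhibition models can be sketched together. The exact functional forms here are assumptions consistent with the descriptions above (an exponential gain for enhancement, a logarithmic attenuation for inhibition); the function names are our own, and inputs are assumed normalized to [0, 1].

```python
import numpy as np

def ir_enhances_vi(i_v, i_ir):
    """Infrared-enhanced visible: the visible signal dominates and the
    infrared signal multiplies it by an exponential gain factor."""
    return i_v * np.exp(i_ir)

def vi_enhances_ir(i_v, i_ir):
    """Visible-enhanced infrared: the infrared signal dominates and the
    visible signal provides the exponential gain."""
    return i_ir * np.exp(i_v)

def ir_suppresses_vi(i_v, i_ir):
    """Infrared-suppressed visible: the infrared signal attenuates the
    visible response through a logarithmic term (assumed form)."""
    return i_v / (1.0 + np.log1p(i_ir))

def vi_suppresses_ir(i_v, i_ir):
    """Visible-suppressed infrared (assumed form)."""
    return i_ir / (1.0 + np.log1p(i_v))
```

For any nonzero secondary input, the enhancement outputs exceed the dominant signal alone and the suppression outputs fall below it, mirroring the cell behaviors described above.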

Basic Theory of Pulse-Coupled Neural Networks
As a well-known third-generation artificial neural network, the PCNN has its own advantages over other image processing methods. First, the PCNN model is derived from studies of the cat visual cortex, so its information processing is closer to human visual processing. The PCNN also has a flexible structure. In addition, existing PCNN methods show that the PCNN has a wide range of applications in image processing fields such as image fusion and image enhancement. For these reasons, and because of its biological grounding, image fusion based on PCNNs has recently attracted increasing attention.
This section mainly introduces the PCNN neuron model, a simplified neuron model and the operating mechanism of the PCNN. First, the standard PCNN model and its simplified model are presented.
The structure of a PCNN neuron is shown in Figure 2. The neuron consists of an input part, a connection part and an impulse generator. Neurons receive input signals through feed inputs and link inputs. The feed input is the main input from the receptive area of the neuron, which consists of the pixels adjacent to the corresponding pixel in the input image. Link inputs are secondary inputs laterally connected to adjacent neurons. The difference between these inputs is that the feed input has a slower characteristic response time constant than the link input. The standard PCNN model is represented by the following formulas. The role of the receptive field is to receive the following two types of input:

F_ij(n) = e^(−α_F)·F_ij(n − 1) + V_F·Σ_kl M_ijkl·Y_kl(n − 1) + S_ij
L_ij(n) = e^(−α_L)·L_ij(n − 1) + V_L·Σ_kl W_ijkl·Y_kl(n − 1)

where F_ij and L_ij represent the feed input and the link input, respectively, S_ij represents the external stimulus, and M_ijkl and W_ijkl are the connection weight matrices, which regulate the influence of each neuron in the neighborhood on the central neuron. α_F and α_L are time attenuation constants, which determine the decay speed of the F and L channels; usually α_L > α_F. V_F and V_L are the inherent potential constants, i.e., the amplitude coefficients of the feed input and the link input; they scale the energy transferred from firing neurons in the neighborhood to the central neuron. The subscript ij indicates the position of the central pixel of the PCNN, and the subscript kl indicates the position of an adjacent pixel relative to the central pixel.
In the modulation domain, the following equation is used:

U_ij(n) = F_ij(n)·(1 + β·L_ij(n))

where U_ij represents the internal state of the neuron and β represents the connection coefficient of the modulation domain, which weights the linked channel in the internal activity; the value of β usually depends on the application. If the influence of the L channel is expected to be large, β should be given a larger value. All neurons usually share the same value of β, but this is not mandatory; each neuron can have its own value. The function of the pulse generator is to generate the pulse output; it is composed of a threshold regulator, a comparator and a pulse generator, as shown below:

T_ij(n) = e^(−α_T)·T_ij(n − 1) + V_T·Y_ij(n − 1)
Y_ij(n) = 1 if U_ij(n) > T_ij(n), and Y_ij(n) = 0 otherwise

where T_ij(n) represents the dynamic threshold and α_T is a time attenuation constant that determines the rate at which the threshold decays during iteration. It directly determines the firing time of the neurons and is an important parameter. A smaller α_T lets the PCNN discriminate intensities more finely, but requires much more time to complete the processing; larger α_T values reduce the running time of the PCNN. V_T determines the threshold jump of firing neurons and is usually constant. When the internal state U_ij of the neuron is greater than the threshold T_ij, that is, when U_ij(n) > T_ij(n), the neuron generates a pulse, which is also called firing once.
Since the formulas of the standard PCNN model are complicated, a simplified PCNN model was proposed later as research progressed. This simplified model improves on Equation (15) and is commonly written as:

F_ij(n) = S_ij
L_ij(n) = V_L·Σ_kl W_ijkl·Y_kl(n − 1)
U_ij(n) = F_ij(n)·(1 + β·L_ij(n))
T_ij(n) = e^(−α_T)·T_ij(n − 1) + V_T·Y_ij(n − 1)

with the pulse output Y_ij(n) generated as in the standard model.
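The simplified PCNN can be implemented in a few lines. This is a minimal sketch, assuming a 3×3 linking weight matrix and illustrative parameter values (β, V_L, V_T and α_T are not fixed by the text); the function name is our own.

```python
import numpy as np
from scipy.ndimage import convolve

def simplified_pcnn(S, iters=40, beta=0.2, v_l=1.0, v_t=20.0, alpha_t=0.1):
    """Run a simplified PCNN on a normalized image S in [0, 1].
    Returns the per-pixel firing counts over `iters` iterations."""
    W = np.array([[0.5, 1.0, 0.5],
                  [1.0, 0.0, 1.0],
                  [0.5, 1.0, 0.5]])  # linking weights to the 8-neighborhood
    Y = np.zeros_like(S)             # pulse output
    T = np.ones_like(S)              # dynamic threshold
    counts = np.zeros_like(S)
    for _ in range(iters):
        F = S                                       # feed input = external stimulus
        L = v_l * convolve(Y, W, mode='constant')   # link input from firing neighbors
        U = F * (1.0 + beta * L)                    # modulation: internal activity
        Y = (U > T).astype(S.dtype)                 # fire when activity exceeds threshold
        T = np.exp(-alpha_t) * T + v_t * Y          # threshold decays, then jumps on firing
        counts += Y
    return counts

S = np.full((8, 8), 0.1)
S[4, 4] = 1.0
counts = simplified_pcnn(S)  # the brightest pixel fires first and most often
```

Brighter pixels fire earlier and more often because their internal activity overtakes the decaying threshold sooner, which is what makes firing maps useful as fusion weights.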

Fusion Method Based on Rattlesnake Imaging Mechanism and PCNN
In the previous section, six cell types were classified according to the physiological mechanism of rattlesnake dual-mode cells, and the mechanisms and mathematical models of these six cells were introduced, together with the basic theory of the PCNN model. An improved algorithm based on the rattlesnake imaging mechanism and the PCNN is now proposed to solve the problem of detail loss in fused images.
The structure of the improved algorithm is shown in Figure 3. Infrared and visible light source images are denoted as IR and VI.First, VI is input into the ON confrontation system.According to the characteristics of the ON-centered receptive field, the input VI is enhanced to obtain the enhanced visible image VI_ON.
Second, VI and IR are treated with "OR", "AND" and "enhancement" according to the simulated mathematical model by simulating the six dual-mode cell working mechanisms of rattlesnake vision, and the information of VI∩IR, VI∪IR, VI+IR (visible light enhanced infrared) and IR+VI (infrared enhanced visible light) is obtained, respectively.
Third, after subtracting the common information (VI∩IR) obtained by "AND" cells from the enhanced visible light image VI_ON, the unique information vi of the enhanced visible light is obtained.
Fourth, VI∪IR is fed to the surround suppression area as the suppression signal, and VI+IR is fed to the central excitation area as the excitation signal; the output is OR+VI_IR. At the same time, VI∪IR is fed to the surround suppression area as the suppression signal, and IR+VI is fed to the central excitation area as the excitation signal; the output is OR+IR_VI. That is, VI+IR, IR+VI and VI∪IR are input into two ON confrontation systems at the same time.
Then, OR+VI_IR and vi are entered into a PCNN, resulting in PCNN1.At the same time, OR+IR_VI and vi are entered into another PCNN to obtain PCNN2.
Finally, the two-image information of PCNN1 and PCNN2 is weighted to obtain the final fusion image.
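The steps above can be sketched end to end. This is a schematic sketch only, not the paper's implementation: it reuses the assumed cell models from earlier sections, replaces both the ON confrontation systems and the two PCNNs with simple stand-ins, and all names and parameter values are our own.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def norm(img):
    """Rescale an image to [0, 1]."""
    img = img - img.min()
    return img / (img.max() + 1e-12)

def on_system(center, surround, sigma_c=2.0, sigma_s=8.0):
    """Stand-in for an ON confrontation system: the excitation signal drives
    the center and the suppression signal drives the antagonistic surround."""
    return norm(gaussian_filter(center, sigma_c) - 0.5 * gaussian_filter(surround, sigma_s))

def fuse(vi, ir, m=0.7, n=0.3):
    vi, ir = norm(vi), norm(ir)
    vi_on = on_system(vi, vi)                              # step 1: enhanced visible VI_ON
    union = np.where(vi < ir, m * ir + n * vi, m * vi + n * ir)   # "OR":  VI∪IR
    common = np.where(vi < ir, m * vi + n * ir, m * ir + n * vi)  # "AND": VI∩IR
    ir_vi = vi * np.exp(ir)                                # infrared-enhanced visible IR+VI
    vi_ir = ir * np.exp(vi)                                # visible-enhanced infrared VI+IR
    vi_unique = np.clip(vi_on - common, 0.0, None)         # step 3: unique visible info vi
    out1 = on_system(vi_ir, union)                         # step 4: OR+VI_IR
    out2 = on_system(ir_vi, union)                         #         OR+IR_VI
    # Steps 5-6: the paper feeds (out1, vi_unique) and (out2, vi_unique) into two
    # PCNNs and weights the results; a plain weighted average stands in here.
    f1 = 0.5 * (out1 + vi_unique)
    f2 = 0.5 * (out2 + vi_unique)
    return norm(0.5 * (f1 + f2))

vi = np.random.rand(32, 32)
ir = np.random.rand(32, 32)
fused = fuse(vi, ir)  # same shape as the inputs, values in [0, 1]
```

The sketch makes the data flow explicit: the unique visible detail extracted in step 3 is injected into both branches, which is the mechanism the paper relies on to avoid losing detail.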

Results and Discussion
Following on from the above discussion, this section conducts experimental simulations of the proposed improved algorithm and compares it with other fusion methods. The image fusion effect of the improved algorithm is compared in two respects: subjective evaluation and objective evaluation.
To observe the image fusion effect of our improved algorithm more directly and concretely, seven fusion algorithms are prepared in this section as comparative experiments.In this paper, we briefly introduce the following methods to compare with the improved algorithm.
First, the improved fusion algorithm proposed by Li [17] is taken as one of the comparative experiments. In order to facilitate the identification of comparison methods in subsequent evaluation, it is denoted as Li here.
Second, the improved MIT color fusion algorithm proposed by Zhang et al. [18] is taken as one of the comparative experiments.In order to facilitate the identification of comparison methods in subsequent evaluation, it is denoted as Zhang here.
Third, the improved image fusion model based on rattlesnake double-mode cells proposed by Wang et al. [23] is taken as one of the comparative experiments.In order to facilitate the identification of comparison methods in subsequent evaluation, it is denoted as Wang here.
Fourth, the fusion method based on Gradient Transfer Fusion and total variation (TV) minimization proposed by Ma et al. [24] is taken as one of the comparative experiments. In order to facilitate the identification of comparison methods in subsequent evaluation, it is denoted as GTF here.
Fifth, the fusion method of latent low-rank representation proposed by Li et al. [25] is taken as one of the comparative experiments.In order to facilitate the identification of comparison methods in subsequent evaluation, it is denoted LatLRR here.
Sixth, the image fusion method based on multi-resolution singular value decomposition proposed by Naidu et al. [26] is taken as one of the comparative experiments.In order to facilitate the identification of comparison methods in subsequent evaluation, it is denoted as MSVD here.
Seventh, the multi-sensor image fusion method based on the fourth-order partial differential equations proposed by Bavirisetti et al. [27] is taken as one of the comparative experiments.In order to facilitate the identification of comparison methods in subsequent evaluation, it is denoted as FPDE here.
Eighth, in order to facilitate the identification of the comparative methods in subsequent evaluation, the proposed improved algorithm is recorded as Our.

Subjective Evaluation
Aiming at the improved method, Our, proposed in the previous section, and the other fusion methods compared with it, 11 groups of source images of different scenes are used in this section for comparative simulation. Among them, seven groups of source images are from the TNO dataset [28] and four groups are from the MSRS (Multi-Spectral Road Scenarios for Practical Infrared and Visible Image Fusion) dataset. The experimental parameters of Our are as follows: σs = 500, σc = 2.83, A = 1, D = 0, E = 900 and F = 1. In this section, a comparative analysis is made from the subjective evaluation, namely the visual effect of the fused image as observed by human eyes. In each group of results, (1) and (2) are the visible and infrared source images, respectively; (3) is the fusion result of Li's method [17]; (4) is the fusion result of Zhang's method [18]; (5) is the fusion result of Wang's method [23]; (6) is the fusion result of the GTF method [24]; (7) is the fusion result of the LatLRR method [25]; (8) is the fusion result of the MSVD method [26]; (9) is the fusion result of the FPDE method [27]; and (10) is the fusion result of the improved method Our.
Sensors 2024, 24, x FOR PEER REVIEW
By observing all the experiments on the TNO dataset, we can find that there is not much difference among panels (3) to (10) in the surrounding environment, and the rough outline of the key information can be distinguished in every result. Combining the comparison of the seven groups of images, we can conclude that the results of the fusion method shown in panel (3) have the most natural color, but the target information is not very prominent. The results shown in panels (4) and (5) are bright as a whole. As shown in panel (6), the target information is prominent, but the loss of the surrounding environment information retained in the visible image is noticeable, and the overall result is closer to the infrared image. The results shown in panels (7)-(9) are basically similar, and closer to the visible image than panel (6). The results shown in panel (10) not only highlight the target information from the infrared image, but are also closest to the visible image.
By observing all the experiments on the MSRS dataset, we can likewise find that there is not much difference among panels (3) to (10) in the surrounding environment, and the rough outline of the key information can be distinguished in every result. Combining the comparison of the four groups of images, we can conclude that the results shown in panel (3) have the most natural color, but the target information is not very prominent. As shown in panel (4), the experimental results are red and bright as a whole. As shown in panel (5), the target information is prominent, but there are distortion problems in some positions. As shown in panel (6), the results are darker as a whole and closer to the infrared image. The results shown in panels (7)-(9) are basically similar, and closer to the visible image than panel (6). The results shown in panel (10) not only highlight the target information from the infrared image, but are also closest to the visible image.

Objective Evaluation
Above, we applied the improved method Our to 11 groups of source images, compared it experimentally with seven other algorithms, and carried out a subjective evaluation of the fusion results of the eight algorithms. This section evaluates the performance of these eight fusion algorithms through three evaluation indexes: standard deviation, spatial frequency (SF) and information entropy.
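The three no-reference indexes can be sketched under their common definitions: standard deviation of pixel intensities, spatial frequency as row/column gradient energy, and Shannon entropy of the grey-level histogram. The sketch below assumes these standard formulations rather than any paper-specific variant, and the tiny 3x3 image is an illustrative placeholder:

```python
import math

def std_dev(img):
    """Standard deviation of pixel intensities (a contrast measure)."""
    n = sum(len(row) for row in img)
    mean = sum(sum(row) for row in img) / n
    return math.sqrt(sum((p - mean) ** 2 for row in img for p in row) / n)

def spatial_frequency(img):
    """SF = sqrt(RF^2 + CF^2): mean squared row and column intensity gradients."""
    h, w = len(img), len(img[0])
    rf = sum((img[i][j] - img[i][j - 1]) ** 2 for i in range(h) for j in range(1, w))
    cf = sum((img[i][j] - img[i - 1][j]) ** 2 for i in range(1, h) for j in range(w))
    return math.sqrt(rf / (h * w) + cf / (h * w))

def entropy(img, levels=256):
    """Shannon entropy of the grey-level histogram, in bits."""
    hist = [0] * levels
    n = 0
    for row in img:
        for p in row:
            hist[int(p)] += 1
            n += 1
    return -sum((c / n) * math.log2(c / n) for c in hist if c)

# Hypothetical fused image patch, not data from the paper
fused = [[0, 128, 255], [64, 192, 32], [255, 0, 128]]
print(round(std_dev(fused), 2), round(spatial_frequency(fused), 2), round(entropy(fused), 2))
```

For all three indexes, larger values indicate a fused image with more contrast, detail, or information, which is why they can be compared directly across methods without a reference image.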

TNO Dataset
Table 1 and Figure 15 show the data table and corresponding line graph of the standard deviation evaluation index obtained by the experimental comparison of the seven other fusion methods and the improved method Our using the seven TNO image pairs. By analyzing the information in the data table and line chart, it can be found that 1/7 of the standard deviation best values are generated by the method proposed by Li et al., 2/7 by the method proposed by Wang et al., and 4/7 by the improved method Our. This shows that, in the comparative experiment of these eight fusion methods, the optimal method in terms of standard deviation is the improved method Our.


Table 2 and Figure 16 show the data table and corresponding line graph of the spatial frequency evaluation index obtained by the experimental comparison of the seven other fusion methods and the improved method Our using the seven TNO image pairs. By analyzing the information in the data table and line chart, it can be found that 1/7 of the spatial frequency best values are generated by the FPDE method, and 6/7 by the improved method Our. This shows that, in the comparative experiment of these eight fusion methods, the optimal method in terms of spatial frequency is the improved method Our.

Table 2. Spatial frequency (TNO). Columns: Li [17], Zhang [18], Wang [23], GTF [24], LatLRR [25], MSVD [26], FPDE [27], Our.

Table 3 and Figure 17 show the data table and corresponding line graph of the information entropy evaluation index obtained by the experimental comparison of the seven other fusion methods and the improved method Our using the seven TNO image pairs. By analyzing the information in the data table and line chart, it can be found that 1/7 of the information entropy best values are generated by the method proposed by Li et al. [17], 1/7 by the GTF method [24], 2/7 by the method proposed by Wang et al. [23], and 3/7 by the improved method Our. This shows that, in the comparative experiment of these eight fusion methods, the optimal method in terms of information entropy is the improved method Our.

MSRS Dataset
Table 4 and Figure 18 show the data table and corresponding line graph of the standard deviation evaluation index obtained by the experimental comparison of the seven other fusion methods and the improved method Our using the four MSRS image pairs. By analyzing the information in the data table and line chart, it can be found that 100% of the standard deviation best values are generated by the improved method Our. This shows that, in the comparative experiment of these eight fusion methods, the optimal method in terms of standard deviation is the improved method Our.

Table 5 and Figure 19 show the data table and corresponding line graph of the spatial frequency evaluation index obtained by the experimental comparison of the seven other fusion methods and the improved method Our using the four MSRS image pairs. By analyzing the information in the data table and line chart, it can be found that 25% of the spatial frequency best values are generated by the method proposed by Wang et al. [23], and 75% by the improved method Our. This shows that, in the comparative experiment of these eight fusion methods, the optimal method in terms of spatial frequency is the improved method Our.

Table 5. Spatial frequency (MSRS).
Columns: Li [17], Zhang [18], Wang [23], GTF [24], LatLRR [25], MSVD [26], FPDE [27], Our.

Table 6 and Figure 20 show the data table and corresponding line graph of the information entropy evaluation index obtained by the experimental comparison of the seven other fusion methods and the improved method Our using the four MSRS image pairs. By analyzing the information in the data table and line chart, it can be found that 25% of the information entropy best values are generated by the method proposed by Zhang et al. [18], 25% by the method proposed by Wang et al. [23], and 50% by the improved method Our. This shows that, in the comparative experiment of these eight fusion methods, the optimal method in terms of information entropy is the improved method Our.

On the whole, most of the optimal values on both the TNO dataset and the MSRS dataset are concentrated in the method proposed in this paper. The Wilcoxon signed-rank test is then used to test the proposed method. On the basis of these 11 image pairs, each of the seven comparative methods is paired with the method proposed in this paper on the three indicators. First, the null hypothesis is that the difference between the two groups of data is zero; the alternative hypothesis is that there is a difference between the two groups. Since there are seven method pairings and three indicators, there are 21 sets of data tests. Following the Wilcoxon signed-rank test, the difference value is taken as the data index of the method in this paper minus that of the comparative method. When the value of the method in this paper is higher than that of the comparative method for an image pair, the difference has a positive rank, and vice versa. Secondly, the difference values are ranked by absolute value, with ranks assigned from the smallest absolute value to the largest. Then, the sum of the positive ranks and the sum of the negative ranks are computed separately. The test statistic W is the smaller of the sum of positive ranks and the sum of negative ranks. Since there are 11 image pairs in total, n = 11. To determine whether the null hypothesis should be rejected, we look up the critical value table of the Wilcoxon signed-rank test: the critical value corresponding to a significance level α of 0.1 and n = 11 is 13. If the test statistic W is less than or equal to the critical value 13, we can reject the null hypothesis; otherwise, we cannot.
(1) According to the calculation, in comparison with the method proposed by Li, we obtain the test statistic W1 = 9 using the standard deviation index, the test statistic W2 = 0 using the spatial frequency index, and the test statistic W3 = 11 using the information entropy index. Since the test statistic W for all three indexes is less than the critical value 13, the null hypothesis can be rejected for each index.
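The ranking and test-statistic computation described above can be sketched in pure Python. The sketch assumes the usual conventions (zero differences dropped, tied absolute differences given their average rank); the paired scores below are hypothetical placeholders, not the paper's measured index values:

```python
def wilcoxon_W(ours, theirs):
    """Wilcoxon signed-rank statistic W = min(W+, W-) for paired samples."""
    diffs = [round(a - b, 6) for a, b in zip(ours, theirs)]
    diffs = [d for d in diffs if d != 0]          # drop zero differences
    ranked = sorted(diffs, key=abs)
    # assign average ranks to ties in |d|
    ranks = {}
    i = 0
    while i < len(ranked):
        j = i
        while j < len(ranked) and abs(ranked[j]) == abs(ranked[i]):
            j += 1
        avg = (i + 1 + j) / 2                     # average of ranks i+1 .. j
        for k in range(i, j):
            ranks.setdefault(abs(ranked[k]), avg)
        i = j
    w_pos = sum(ranks[abs(d)] for d in diffs if d > 0)
    w_neg = sum(ranks[abs(d)] for d in diffs if d < 0)
    return min(w_pos, w_neg)

# Hypothetical index scores for 11 image pairs (not the paper's data)
ours =   [7.1, 6.8, 7.4, 6.9, 7.2, 7.0, 6.7, 7.3, 7.5, 6.6, 7.1]
theirs = [6.9, 6.5, 7.0, 7.1, 6.8, 6.6, 6.9, 7.0, 7.2, 6.4, 6.8]
W = wilcoxon_W(ours, theirs)
CRITICAL = 13  # two-sided, alpha = 0.1, n = 11
print(W, "reject H0" if W <= CRITICAL else "fail to reject H0")
```

In practice a library routine such as SciPy's `scipy.stats.wilcoxon` performs the same computation and additionally returns a p-value; the manual version above is shown only to make the ranking steps in the text concrete.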

Figure 1. ON/OFF central receptive field model (left: ON receptive field; right: OFF receptive field).
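As a concrete sketch, an ON-center receptive field of this kind is commonly modeled as a difference of Gaussians: a narrow excitatory center (scale σc) minus a wide inhibitory surround (scale σs). This is an illustrative assumption rather than the paper's exact kernel, and the kernel size and scales below are arbitrary toy values (the experiments in this paper use σs = 500 and σc = 2.83):

```python
import math

def dog_kernel(size, sigma_c, sigma_s):
    """ON-center difference of Gaussians: narrow excitatory center
    minus wide inhibitory surround (sigma_c < sigma_s)."""
    c = size // 2
    def gauss(x, y, s):
        return math.exp(-(x * x + y * y) / (2 * s * s)) / (2 * math.pi * s * s)
    return [[gauss(i - c, j - c, sigma_c) - gauss(i - c, j - c, sigma_s)
             for j in range(size)] for i in range(size)]

k = dog_kernel(7, sigma_c=1.0, sigma_s=3.0)
print(k[3][3] > 0, k[0][0] < 0)  # positive center, negative surround
```

An OFF-center field is simply the negation of this kernel, so one implementation serves both halves of Figure 1.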

2.1.2. ON/OFF Mathematical Model of the Central Receptive Field
At first, A. F. Huxley et al. proposed the passive membrane equation to simulate the exchange of cell membrane ion currents in physiology. Later, Newman et al. [21] used Grossberg's center-surround shunting neural network to build an image fusion model based on snakes. The formula is as follows:
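The formula referenced here did not survive extraction. A standard form of Grossberg's center-surround shunting equation, reconstructed from the cited literature and the parameters A, D and E quoted in the experiments (this is an assumed reconstruction, not necessarily the paper's exact expression), is:

```latex
\frac{dx_{ij}}{dt} = -A\,x_{ij} + (E - x_{ij})\,C_{ij} - (D + x_{ij})\,S_{ij}
```

where $C_{ij}$ is the excitatory center input, $S_{ij}$ the inhibitory surround input, $A$ the passive decay rate, and $E$ and $-D$ the upper and lower saturation bounds. Setting $dx_{ij}/dt = 0$ gives the equilibrium response $x_{ij} = (E\,C_{ij} - D\,S_{ij})/(A + C_{ij} + S_{ij})$, which is the form typically used for image fusion.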

Figure 2. PCNN model (Fij: feed input; Lij: link input; Sij: external stimulus; M and W: connection weight matrices; αL and αF: time attenuation constants; VF and VL: inherent potential constants; Uij: internal state of the neuron; β: connection coefficient of the modulation domain; Tij: dynamic threshold; αT: time attenuation constant; VT: firing threshold potential) [22].
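A minimal sketch of the simplified PCNN iteration implied by this caption: each neuron's internal state is U = F(1 + βL), it fires when U exceeds its dynamic threshold T, and a firing makes T jump by VT before it decays again. The constants and the tiny stimulus below are illustrative assumptions, not the paper's parameters:

```python
import math

def pcnn_fire_map(S, steps=10, beta=0.5, aL=1.0, aT=0.2, VL=1.0, VT=20.0):
    """Simplified PCNN: returns per-pixel firing counts after `steps` iterations."""
    h, w = len(S), len(S[0])
    L = [[0.0] * w for _ in range(h)]      # link inputs
    T = [[1.0] * w for _ in range(h)]      # dynamic thresholds
    Y = [[0] * w for _ in range(h)]        # firing outputs of previous step
    fires = [[0] * w for _ in range(h)]    # total firing counts
    for _ in range(steps):
        newY = [[0] * w for _ in range(h)]
        for i in range(h):
            for j in range(w):
                # link input: decayed previous link plus 8-neighbour firing
                nbr = sum(Y[i + di][j + dj]
                          for di in (-1, 0, 1) for dj in (-1, 0, 1)
                          if (di or dj) and 0 <= i + di < h and 0 <= j + dj < w)
                L[i][j] = math.exp(-aL) * L[i][j] + VL * nbr
                F = S[i][j]                      # feed input = external stimulus
                U = F * (1 + beta * L[i][j])     # modulation domain
                newY[i][j] = 1 if U > T[i][j] else 0
                # threshold decays each step and jumps by VT after a firing
                T[i][j] = math.exp(-aT) * T[i][j] + VT * newY[i][j]
                fires[i][j] += newY[i][j]
        Y = newY
    return fires

stim = [[0.9, 0.9, 0.1], [0.9, 0.9, 0.1], [0.1, 0.1, 0.1]]
print(pcnn_fire_map(stim))
```

Bright pixels cross the decaying threshold early and fire, and the linking term lets neighbouring neurons pull each other toward synchronous firing; fusion rules then compare firing counts (or firing times) between the infrared and visible channels.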

Figure 3. Model structure based on the rattlesnake dual-mode mechanism and PCNN.


4.1.1. TNO Dataset
Figures 4-10 show the fusion results obtained by the proposed improved algorithm and the corresponding comparative experiments using the seven groups of source images of the TNO dataset. In these results, (1) and (2) are the visible and infrared source images, respectively; (3) is the fusion result of Li's method [17]; (4) is the fusion result of Zhang's method [18]; (5) is the fusion result of Wang's method [23]; (6) is the fusion result of the GTF method [24]; (7) is the fusion result of the LatLRR method [25]; (8) is the fusion result of the MSVD method [26]; (9) is the fusion result of the FPDE method [27]; and (10) is the fusion result of the improved method Our.

Figure 10. Comparison of experimental results for the seventh set of images in the TNO dataset. (1) visible image; (2) infrared image; (3) Li [17]; (4) Zhang [18]; (5) Wang [23]; (6) GTF [24]; (7) LatLRR [25]; (8) MSVD [26]; (9) FPDE [27]; (10) Our.

4.1.2. MSRS Dataset
Figures 11-14 show the fusion results obtained by the proposed improved algorithm and the corresponding comparative experiments using the four groups of source images of the MSRS dataset. The images corresponding to (1) to (10) are the same as in the previous section.

