1. Introduction
The concept of cellular automata (CA) was first introduced by Stan Ulam and John von Neumann in the 1940s while they were working on the Manhattan Project at Los Alamos National Laboratory [1]. A remarkable contribution to the development of cellular automata arose from Ulam’s research on crystal growth and von Neumann’s interest in self-replicating systems [1]. The initial idea focused on developing a two-dimensional cellular automaton comprising a grid of square cells, where each cell takes a black or white state depending on its neighboring cells [1]. According to John von Neumann’s approach, the neighborhood of a given cell is composed of its four adjacent squares; that is, the neighborhood can be seen as a cross composed of square cells [1,2]. Based on the above, it is stated, in general terms, that an automaton consists of a set of cells with a finite number of states, the neighborhood, and local transition rules [3].
Therefore, cellular automata are considered discrete dynamic systems of finite spatial and temporal state, composed of a finite set of cells that evolve in parallel in discrete time steps [1,4]; in other words, they are discrete in space and time, allowing the description of local interactions by means of transition rules for each cell in the space. In this way, cellular automata are structures that can be used for modeling and studying complex non-linear dynamic systems [1,2,5].
Given the diversity of applications of cellular automata and the ease with which they model systems, researchers and scientists extended the concept of cellular automata to digital images.
Digital images are considered as a two-dimensional representation of a real image from a numerical matrix that records values for each element at a given point. Each position within the matrix is called a pixel, and it takes discrete values according to properties such as brightness (the intensity of light) or color [
6].
Regarding characteristics, digital images have some that are considered “basic”. One is the type of image; for example, there are black and white images that only record the intensity of the light that falls on the pixels, and non-optical images such as ultrasound or X-ray images, in which the intensity of sound or X-rays is recorded. Resolution is another characteristic, expressed as the number of pixels per inch (ppi); the higher the resolution, the more detailed the image. Another is the color depth (of a color image), or “bits per pixel”, which is the number of bits that describe the brightness or color of one pixel; more bits allow more shades of gray or more colors to be recorded. Finally, the image format specifies how the numbers are arranged in the image file and includes the type of compression employed [6,7].
Digital image processing arises from the need to improve appearance, and thereby make certain details more evident [
8]. As a result, techniques were implemented to treat different image characteristics (brightness, sharpness, contrast, intensity, noise, among others), but particularly, methods that allow the information contained to be highlighted or suppressed selectively in an image [
8].
In this sense, one frequently covered defect is noise, since, through different devices capable of capturing digital images, errors or interferences can be generated when transmitting information bits. Theoretically, different types of noise have been defined and classified according to their characteristics. Consequently, treatment methods and techniques have been developed for each one [
9].
Conventional techniques include filters in the space and frequency domains (high pass, low pass, average, etc.), which have shown poorer results than newer techniques [10]; for this reason, different alternatives have been developed to obtain efficient and visually pleasing results. In this regard, combining methods, techniques, and various mathematical models has been useful to achieve these objectives. Likewise, edge detection plays a relevant role in image processing, since these algorithms make it possible to identify objects, define patterns, or segment information within images [9].
Regarding dictionary learning-based approaches for image denoising, [11] proposes a scheme that couples the Method of Optimal Directions (MOD) and the Approximate K-Singular Value Decomposition (AK-SVD). This approach integrates a reconstruction and learning process into a model to remove multiplicative and additive noise. A sparse term is used to reduce non-Gaussian outliers associated with multiplicative noise, and a Laplacian Schatten norm is employed to capture the global structure information. Another related work is presented in [12], which introduces a discriminative ridge regression approach to supervised classification called Discriminative Ridge Machine (DRM). The focus here is to determine a representative model while defining class discriminations of categorical information. The model is also extended to other existing models such as lasso and group lasso, incorporating discriminative information. For implementation, the authors consider a quadratic model that allows analytical solutions in closed form.
According to [13], image denoising can be addressed as an inverse problem, one approach being sparse decomposition over redundant dictionaries. In a sparse representation, the signals correspond to a linear combination of redundant dictionary atoms. Following this orientation, [13] presents an image denoising algorithm based on Non-Negative Matrix Factorization (N-NMF) and sparse representation over a redundant dictionary. In this proposal, the dictionary is trained on samples from the noisy image, and the best representation is then searched for using Approximate Matching Pursuit (AMP). Another advancement in this direction is presented in [14], which proposes a non-negative matrix factorization method that learns both clustering and local similarity. Such a representation exposes the inherent geometric properties of the data. Applying the representation in the kernel space boosts the capability of the model to identify nonlinear structures associated with the data.
Regarding applications based on Robust Principal Component Analysis (RPCA), [15] addresses the task of eliminating both mixed and heavy noise from Hyperspectral Images (HSI). The authors propose a non-convex RPCA formulation for hyperspectral image denoising. This approach takes the log-determinant rank approximation and the $\ell_{2,\log}$ norm to enforce the low-rank and column-wise sparse properties of the component matrices. Another related work is [16], which proposes an RPCA model based on matrix tri-factorization that uses SVD computations of small matrices. Such an approach reduces the complexity of RPCA, making it linear and completely scalable.
Regarding cellular automata and image processing, one approach to CA-based image segmentation is the GrowCut algorithm proposed in [17]. Since this is an interactive image segmentation algorithm, the user selects the set of seed pixels that mark the regions of interest in the image. Building on GrowCut, [18] presents an interactive image segmentation procedure designed to reduce specific segmentation problems when identifying regions of interest. The feasibility of automatically generating seeds for GrowCut is shown, and the authors suggest a method to automate seed generation for segmenting heart images. In addition, [19] enhances a conventional GrowCut cellular automaton using chaotic features. This development employs an extended, stochastic neighborhood, where randomly selected remote neighbors reinforce the conventional local neighbors. The authors state that, according to the results, small changes in the initial conditions of the process can induce major changes in the segmentation result.
Regarding other works, [20] proposes an algorithm for determining optimal outdoor evacuation routes in hills. The system uses web services to obtain geographic information from Google Image. The routes are determined using graph theory with geographic information (latitude, longitude, and elevation) and 3D cellular automata.
Meanwhile, in [21], cellular automata rules for edge detection are optimized employing Particle Swarm Optimization (PSO). This work shows that cellular automata provide fast computation, rule optimization, and adaptability to target images. According to the authors, the method can be tuned for medical images to identify structures such as cardiac cavities.
Considering the integration of cellular automata with other techniques, [22] designs an automaton that could be part of a more complex system for building bio-computers and that can be used by teachers in multidisciplinary education. Meanwhile, [23] considers a Hybrid Cellular Automata (HCA) architecture for modeling the cardiac cell–cell membrane resistance. This work shows that the proposed model reproduces important and complex spatio-temporal properties that can be used in future models. In addition, the authors show how GPU-based technology can accelerate the simulation and analysis of these systems. Finally, in [24], random walk theory and lattice gas cellular automata are used to generate a mathematical model for gluing wood particles.
Given the above, researchers have implemented methods, algorithms, and techniques that combine basic theories with cellular automata for image processing and modeling applications. Although outstanding results have been obtained, there remains room for research on improving digital image processing using cellular automata.
Regarding previous works, reference [25] presents an algorithm based on cellular automata to reduce impulsive noise in digital images. Meanwhile, the papers [26,27] describe the integration of a cellular neural network and an adaptive cellular automaton for impulsive noise reduction and edge detection in digital images; finally, the document [28] presents a process for edge detection in digital images employing a cellular automaton.
As can be observed, there are different approaches to eliminating noise in digital images, cellular automata being a suitable alternative given the direct relationship between a cellular automaton and a digital image, both of which are represented as matrices. In this way, exploratory research can be carried out to include different behaviors of cellular automata in digital image processing.
Proposal Approach and Document Organization
This document presents the mathematical model of an algorithm based on cellular automata to eliminate noise in digital images. The considered algorithm is implemented in [25,26,27]; however, those works do not provide a mathematical description of the automaton dynamics during the filtering process, which is the object of study in this paper.
In this way, the contributions of the article are the presentation of the mathematical model of the CA operation, including the description of the adaptation process of the cellular automaton, and the methodology for statistical analysis using synthetic images.
The motivation for this work is that cellular automata have a very direct relationship with digital images given their structure, so CA behaviors can be used to process images; in particular, an adaptive behavior in which the CA modifies the neighborhood of the cells is considered. In this context, it is important to describe this type of behavior mathematically, which can serve as a reference for further works.
The document is organized as follows. First, Section 2 introduces the background. Section 3 presents the cellular automata algorithm to eliminate impulsive noise in digital images, and Section 3.2 details the model for the algorithm based on cellular automata. Next, Section 4 presents the simulation results to observe the algorithm characteristics. Finally, the discussion and conclusions are presented in Section 5 and Section 6.
3. Algorithm Based on Cellular Automata
According to [35], cellular automata are a class of discrete dynamic systems that allow complex systems to be modeled. A CA is composed of a set of cells with a dimension D, which takes the value 1, 2, or 3. Image processing employs CA in a two-dimensional space [36]. According to [37], a cellular automaton is composed of three parts:
- (a) Cells or lattice.
- (b) Neighborhood or adjacent neighbors.
- (c) Rules for cell transitions.
Regarding the functioning of a cellular automaton, the state of a cell at time $t+1$ is calculated using its current state and the values of its neighbors at time $t$ [38,39]. A D-dimensional cellular automaton can be defined as a 4-tuple $A = (\mathbb{Z}^D, S, N, f)$, where:
$\mathbb{Z}^D$: D-dimensional space of integers.
$S$: finite set of the states of A.
$N$: finite ordered subset of $\mathbb{Z}^D$ that corresponds to the neighborhood of A.
$f$: local rule (transition function) of A.
The neighborhood allows environmental information to be obtained. As shown in Figure 1, the Moore neighborhood has a $3 \times 3$ matrix configuration that covers a larger number of pixels than the von Neumann neighborhood. The Moore neighborhood is described by Equation (1), where r is the range; for 8 neighbors, $r = 1$.
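As an illustration of the two neighborhoods, the following minimal sketch enumerates the cell offsets of each one. It assumes the usual definitions (Moore: row and column offsets of at most r in absolute value; von Neumann: offsets whose absolute values sum to at most r); the function names are hypothetical and not part of the original algorithm.

```python
def moore_offsets(r=1):
    """Offsets (di, dj) of the Moore neighborhood of range r, center excluded."""
    return [(di, dj)
            for di in range(-r, r + 1)
            for dj in range(-r, r + 1)
            if (di, dj) != (0, 0)]


def von_neumann_offsets(r=1):
    """Offsets (di, dj) of the von Neumann neighborhood of range r, center excluded."""
    return [(di, dj)
            for di in range(-r, r + 1)
            for dj in range(-r, r + 1)
            if 0 < abs(di) + abs(dj) <= r]


print(len(moore_offsets(1)))        # 8 neighbors in a 3x3 window
print(len(moore_offsets(2)))        # 24 neighbors in a 5x5 window
print(len(von_neumann_offsets(1)))  # 4 neighbors forming a cross
```

With r = 1 the Moore neighborhood covers eight cells against four for the von Neumann neighborhood, which is the difference in coverage mentioned above.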
In order to implement the algorithm, each cell of the CA corresponds to a pixel of the digital image, and the value of each cell corresponds to an intensity value of the image; in a grayscale image, the values range from 0 to 255 [40]. It should be noted that the proposed algorithm performs an adaptive modification of the neighborhood, for which the first and last rows and columns of the image are extended.
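The border extension can be sketched as follows. The text only states that the first and last rows and columns are extended; replicating the edge values is an assumption made here for illustration, and `extend_borders` is a hypothetical helper name.

```python
def extend_borders(image, pad=1):
    """Return a copy of `image` (list of rows) with its first/last row and
    column replicated `pad` times. Edge replication is an assumed choice."""
    rows = [[row[0]] * pad + list(row) + [row[-1]] * pad for row in image]
    top = [list(rows[0]) for _ in range(pad)]
    bottom = [list(rows[-1]) for _ in range(pad)]
    return top + rows + bottom


img = [[10, 20],
       [30, 40]]
for row in extend_borders(img):
    print(row)
```

With a padding of 1 every original pixel has a complete $3 \times 3$ neighborhood; a padding of 2 would be needed for the expanded $5 \times 5$ neighborhood.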
3.1. Cellular Automata Algorithm to Eliminate Noise in Digital Images
The mathematical model developed to describe noise elimination in digital images (using CA) is a synthesis of the algorithm presented in [25], called DK (Dynamic-Knowledge), which uses cellular automata and an adaptive property to eliminate the salt-and-pepper noise present in a digital grayscale image.
A two-dimensional digital image can be represented as a matrix in which each element, located in the i-th row and the j-th column, is a pixel of the digital image. The value assigned to each element corresponds to the luminosity information of the pixel [25]. Additionally, a cellular automaton can be defined as a discrete dynamic system composed of a set of cells with a dimension D. Two-dimensional automata are the most used to model natural or artificial systems, since they easily recreate a collection of simple objects interacting locally with each other [2,3,25].
The DK algorithm uses a Moore neighborhood, which considers the 8 neighbors of the base (central) cell. Each cell of the cellular automaton can be seen as a pixel of the image. When the algorithm traverses the image and detects a possible noisy pixel, a function is applied to change the pixel value based on the values of its neighbors. Moreover, when the neighborhood provides insufficient information, the cellular automaton extends its neighborhood by one row and one column on each side, forming a $5 \times 5$ matrix, to obtain more data that lead toward the best possible decision.
3.2. Cellular Automata Behavioral Model
To describe the model, the unit step function is defined first, since it is used to activate or deactivate the adaptive property of the algorithm. This function is given by Equation (2).
Figure 2 graphically shows the unit step function.
Shifted by a given value, the unit step function allows a piecewise function to be described by turning the components of the function on or off. For example,
Equation (
3) is represented as
.
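For reference, a common form of the shifted unit step function and of the on/off mechanism it provides is sketched below. The symbols $u$, $a$, $f_1$, and $f_2$ are illustrative assumptions and are not necessarily the notation of Equations (2) and (3).

```latex
u(t - a) =
\begin{cases}
1, & t \geq a \\
0, & t < a
\end{cases}
\qquad
f(t) = f_1(t)\,\bigl[1 - u(t - a)\bigr] + f_2(t)\,u(t - a)
```

Under this convention, $f(t) = f_1(t)$ for $t < a$ and $f(t) = f_2(t)$ for $t \geq a$, so a single expression switches between two components.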
On the other hand, a matrix of dimensions $3 \times 3$ or $5 \times 5$ represents a part of the digital image. Each element of the matrix is identified by its position relative to the central cell, as shown in Figure 3.
Each of the eight elements surrounding the central cell is part of the Moore neighborhood. The DK algorithm takes this neighborhood around a noisy central pixel and counts the neighbors that have values 0 or 255 (possible noisy pixels). If the number of neighbors with values 0 or 255 is less than 5, the algorithm calculates the average of the remaining (non-noisy) values and, based on the neighborhood rules, changes the value of the central pixel. Otherwise, if the number of cells with values 0 or 255 is greater than or equal to 5, the algorithm expands the neighborhood to one of size $5 \times 5$ containing the previous one (see Figure 4), in order to collect more information and make the best decision.
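The decision rule just described can be sketched for a single pixel as follows. This is a minimal illustration under stated assumptions, not the reference implementation of [25]: averaging only the non-noisy cells in the expanded window, skipping cells outside the image, and leaving the pixel unchanged when no clean cell is available are choices made here; `window_values` and `dk_pixel` are hypothetical names.

```python
NOISE_VALUES = (0, 255)  # salt-and-pepper extremes, as in the text


def window_values(image, i, j, r):
    """Values in the (2r+1)x(2r+1) window around (i, j), center excluded.
    Cells outside the image are skipped (assumed border handling)."""
    h, w = len(image), len(image[0])
    return [image[i + di][j + dj]
            for di in range(-r, r + 1)
            for dj in range(-r, r + 1)
            if (di, dj) != (0, 0) and 0 <= i + di < h and 0 <= j + dj < w]


def dk_pixel(image, i, j):
    """New value for pixel (i, j) following the adaptive rule described above."""
    center = image[i][j]
    if center not in NOISE_VALUES:
        return center                      # not a candidate noisy pixel
    near = window_values(image, i, j, 1)   # 3x3 Moore neighborhood
    n = sum(v in NOISE_VALUES for v in near)
    # Adaptive step: with 5 or more noisy neighbors, expand to 5x5.
    values = near if n < 5 else window_values(image, i, j, 2)
    clean = [v for v in values if v not in NOISE_VALUES]
    if not clean:
        return center                      # assumed fallback: nothing to average
    return sum(clean) // len(clean)        # floor of the average


patch = [[80, 82, 81],
         [83, 255, 84],
         [80, 0, 82]]
print(dk_pixel(patch, 1, 1))  # 81: floor of the mean of the 7 clean neighbors
```

In the example only one neighbor is noisy, so the $3 \times 3$ neighborhood is kept and the central value is replaced by the floor of 572/7.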
Considering the above, a unit step function is used to turn off the function that employs the $3 \times 3$ neighborhood, and a new function using the $5 \times 5$ neighborhood is considered to calculate the average of the values of the cells that do not have values 0 or 255. Thus, the function that models the behavior of the DK algorithm is given by Equation (4).
where n is the number of pixels with values 0 or 255 in the $3 \times 3$ neighborhood, and m is the number of pixels with values 0 or 255 in the $5 \times 5$ neighborhood. The remaining symbols denote the pixel value at a given position and the corresponding position indices.
Equation (4) can be expressed as the sum of two terms. The first term counts the number of cells without salt-and-pepper values and calculates the average of these values. Since the summation includes the central position, its value is removed from the calculated average. Finally, the result is rounded down to an integer employing the floor function $\lfloor \cdot \rfloor$.
On the other hand, the second term is gated by the unit step function, which is activated when the number of pixels with salt-and-pepper values (0 or 255) in the $3 \times 3$ neighborhood of the cell is equal to or greater than 5.
When the number of noisy cells in the $3 \times 3$ neighborhood is equal to or greater than 5, the DK algorithm takes a neighborhood of size $5 \times 5$ and calculates the average of the cells in that neighborhood. Given that the sum includes the central position, its value is removed from the calculated average. Note that when the unit step function turns on, the term for the $5 \times 5$ neighborhood is activated and the term for the $3 \times 3$ neighborhood is deactivated. Finally, the result obtained is rounded down to the nearest integer.
4. Simulation Validation
In this section, a computational analysis is presented by simulation in R of the mathematical model described to evaluate the behavior of the DK algorithm compared to the conventional methods for noise elimination (median and mean filters).
In order to have simulation results with statistical validity, an experimental design based on a simple random sampling with replacement is employed, in such a way that there is a sufficient number of experiments to evaluate the performance of the algorithms. The number of experiments was determined considering what was reported in [
41] for simple random sampling with replacement.
The denoising algorithms are evaluated using randomly generated $3 \times 3$ arrays (simulating a portion of a picture). In these experiments, the central point of the array corresponds to the pixel with noise. Once the number of experiments (population) is defined, the noise elimination algorithms (mean, median, and DK) are executed. Then, the absolute error of each result is calculated so that the results can be compared.
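One experiment of this kind can be sketched as follows. The sketch assumes that the "real" central value is drawn from the same interval as the neighbors before being replaced by 0 or 255, and that the mean and median filters operate on the full $3 \times 3$ window; the function names and the fixed seed are illustrative only.

```python
import random
import statistics


def random_mask(low, high, rng):
    """3x3 mask with values in [low, high]; returns (noisy mask, true center)."""
    mask = [[rng.randint(low, high) for _ in range(3)] for _ in range(3)]
    true_center = mask[1][1]
    mask[1][1] = rng.choice((0, 255))   # impulsive noise at the center
    return mask, true_center


def mean_filter(mask):
    """Floor of the mean of the nine values in the window."""
    values = [v for row in mask for v in row]
    return sum(values) // len(values)


def median_filter(mask):
    """Median of the nine values in the window."""
    return int(statistics.median(v for row in mask for v in row))


rng = random.Random(0)                  # fixed seed for repeatability
mask, truth = random_mask(80, 84, rng)
print("mean error:  ", abs(mean_filter(mask) - truth))
print("median error:", abs(median_filter(mask) - truth))
```

Repeating this over the required number of masks and collecting the absolute errors yields the data from which histograms and descriptive statistics can be computed.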
The experiments were carried out via simple random sampling with replacement; that is, $3 \times 3$ matrices were randomly generated in a discrete interval of pixel values, so the size of the population is known [41]. An integer value from the interval is placed in each element of the array, except for the central position, which can take the values 0 or 255. The size of the population is given by Equation (7), where N is divided by 8 given the symmetries of the square (according to group theory [42]). In this way, the sample size (number of experiments) is given by Equation (8).
In Equation (8), the respective variables are:
N: Population size.
Z: 95% confidence level ($Z = 1.96$).
p: Probability of success ($p = 0.5$).
q: Probability of failure ($q = 0.5$).
d: Accuracy ($d = 0.05$).
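The population and sample sizes can be checked numerically with the sketch below. It assumes that Equation (7) counts the combinations of gray levels over the eight neighbors divided by the 8 symmetries of the square, and that Equation (8) is the usual finite-population sample size expression $n = N Z^2 p q / (d^2 (N - 1) + Z^2 p q)$; the values $Z = 1.96$, $p = q = 0.5$, and $d = 0.05$ are the conventional ones for a 95% confidence level and are adopted here because they reproduce the figures reported in this section.

```python
def population_size(levels, neighbors=8, symmetries=8):
    """Number of distinct masks: levels^neighbors, reduced by the symmetries."""
    return levels ** neighbors // symmetries


def sample_size(N, Z=1.96, p=0.5, q=0.5, d=0.05):
    """Finite-population sample size for simple random sampling."""
    return N * Z ** 2 * p * q / (d ** 2 * (N - 1) + Z ** 2 * p * q)


N = population_size(5)        # the interval [80, 84] has 5 gray levels
n = sample_size(N)
print(N, round(n))            # 48828 381
```

Both results agree with the 48,828 possible experiments and the 381 simulated masks reported for the interval [80, 84].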
Simulations were made in MATLAB and the samples obtained were statistically analyzed. As an example, the pixel interval [80, 84] is taken. In this case, the space of all possible experiments is 48,828, that is, the $5^8$ combinations of five gray levels over the eight neighbors, divided by 8.
It is relevant to mention that when five or more of the neighbors in the $3 \times 3$ matrix are noisy (i.e., more than half of them), the matrix expands to a size of $5 \times 5$ to encompass a larger number of pixels. In this way, after determining the correct or closest value of the central pixel, the matrix returns to its original size of $3 \times 3$.
With the sampling formula, using the same parameter values, 381 masks were simulated. The absolute errors between the real value of the central pixel and the value obtained with each of the algorithms were calculated. From these, the histogram of the absolute error displayed in Figure 5 is obtained, where the results of each algorithm are shown separately.
Figure 6 shows the histogram set of data obtained in a simulation, presenting in the same scale the results for the median, mean, and DK algorithms. In these results, the DK algorithm presents smaller values.
Regarding Figure 5 and Figure 6, it is observed that for the DK algorithm, most of the absolute differences are close to zero. This implies that, of the total samples obtained, the DK algorithm identifies the real value of the pixel in a greater number of cases. In other words, it is possible to recover the original value of the image (or a sufficiently close value), so that there is no significant difference from the image without noise.
In the case of the mean algorithm, although it exhibits a distribution with many values close to zero, it also presents a considerable number of pixels that are far from the real value, which implies that there are notable differences in the results after noise removal. On the other hand, the median algorithm presents results with a particular distribution (with three groups): there is one group where the original value is recovered, but the other two groups are far from the real pixel value, resulting in an image where impulsive noise is not satisfactorily removed.
The data were saved in three vectors and processed using the statistical software R, obtaining the mean, variance, and other relevant statistical values, as presented in Table 1. The basic statistics are the number of values considered in the sample (nbr-val), number of null values within the sample (nbr-null), number of missing values within the sample (nbr-na), minimal value obtained (min), maximal value obtained (max), range equal to the difference between max and min (range), and the sum of all non-missing values (sum). Meanwhile, the descriptive statistics are: the median (median), the mean (mean), the standard error on the mean for a given variable (SE-mean), the confidence interval for the arithmetic mean (CI-mean) at the respective p-level, the variance (var), the standard deviation (std-dev), and the variation coefficient (coef-var) equal to the standard deviation divided by the mean. Finally, the normal distribution statistics are: the skewness coefficient $g_1$ (skewness), kurtosis coefficient $g_2$ (kurtosis), statistic of a Shapiro–Wilk test of normality (norm-test-SW), and the respective associated probability (norm-test-p) [43].
In addition, the statistical summary of the median, mean filters and DK algorithm is presented in
Table 2. This information can be seen more compactly in the box and whisker plot shown in
Figure 7.
In these results, the DK algorithm shows a stronger trend towards zero. Thus, on average, the data obtained with the algorithm tend to be more similar to the real values of the mask. However, in simulations with fewer salt-and-pepper points, the DK algorithm and the mean algorithm behaved similarly.
Additionally, the correlation of the data obtained was calculated and is presented in Table 3. A positive correlation was observed between the median and mean algorithms (filters), while the DK algorithm showed very little correlation with either of them.
Finally, to observe the algorithm performance quantitatively and qualitatively, Figure 8 shows the elimination of noise using the mean, median, and DK algorithms, where a different noise level is considered for each of the eight rows; the noise level indicates the number of pixels contaminated in the image. The first column presents the original image; the second column, the noisy image; the third, the image processed with the median filter; the fourth, the result with the mean filter; and the fifth, the result of filtering with the DK algorithm. As can be seen from these results, in most cases the DK algorithm presents a better result than the other algorithms considered.
Considering [
25,
44], the Peak Signal-to-Noise Ratio (PSNR), the Signal-to-Noise Ratio (SNR), and the Structural Similarity Index Measure (SSIM) are calculated using the images in
Figure 8 to evaluate the performance of DK algorithm quantitatively.
The PSNR in decibels (dB) is calculated using Equation (9), which employs the Mean Squared Error (MSE) given in Equation (10); the MSE compares each pixel of the original image (reference image) with the corresponding pixel of the reconstructed image (filtered image), and B is the number of bits employed for representing each pixel (8 bits).
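The PSNR computation can be sketched as follows, assuming the standard definitions: the MSE is the mean of the squared pixel differences and $\mathrm{PSNR} = 10 \log_{10}\bigl((2^B - 1)^2 / \mathrm{MSE}\bigr)$. Images are represented as lists of rows, and the function names are illustrative.

```python
import math


def mse(original, filtered):
    """Mean squared error between two images of equal size."""
    diffs = [(x - y) ** 2
             for row_x, row_y in zip(original, filtered)
             for x, y in zip(row_x, row_y)]
    return sum(diffs) / len(diffs)


def psnr(original, filtered, bits=8):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(original, filtered)
    if error == 0:
        return math.inf
    peak = 2 ** bits - 1            # 255 for 8-bit pixels
    return 10 * math.log10(peak ** 2 / error)


a = [[100, 100], [100, 100]]
b = [[100, 100], [100, 110]]
print(mse(a, b), psnr(a, b))
```

A larger PSNR indicates a filtered image closer to the reference; identical images give an MSE of zero, which is handled here as an infinite PSNR.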
The performance index SNR in dB is determined by Equation (11). This metric characterizes the quality of an image through the relationship between the image power and the power of the noise it presents [44].
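A minimal sketch of the SNR is given below. Several SNR definitions exist; the one assumed here is the ratio between the energy of the reference image and the energy of the error image, expressed in dB, which may differ in detail from Equation (11).

```python
import math


def snr(original, filtered):
    """Signal-to-noise ratio in dB: reference energy over error energy."""
    signal = sum(x * x for row in original for x in row)
    noise = sum((x - y) ** 2
                for row_x, row_y in zip(original, filtered)
                for x, y in zip(row_x, row_y))
    return math.inf if noise == 0 else 10 * math.log10(signal / noise)


a = [[100, 100], [100, 100]]
b = [[100, 100], [100, 110]]
print(snr(a, b))
```

Unlike the PSNR, this measure depends on the content of the reference image rather than on the maximum representable pixel value.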
On the other hand, the SSIM, calculated via Equation (12), is used to establish the similarity between two images, allowing us to determine how different the original image is from the distorted image. Values of luminance, contrast, and structure are used for the SSIM, with the parameter settings considered in [10,25,32]. The SSIM takes a value between 0 and 1, where one is absolute similarity and zero the total loss of similarity.
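The structure of the SSIM can be illustrated with the simplified sketch below. It assumes the commonly used form in which the luminance, contrast, and structure terms have unit exponents and are merged into a single expression with constants $C_1 = (0.01 L)^2$ and $C_2 = (0.03 L)^2$, where $L = 2^B - 1$. It also evaluates the index once over the whole image, whereas practical implementations average it over local windows; these are assumptions and not necessarily the settings of Equation (12).

```python
def ssim_global(x, y, bits=8, k1=0.01, k2=0.03):
    """Single-window SSIM over whole images given as lists of rows."""
    xs = [v for row in x for v in row]
    ys = [v for row in y for v in row]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n                    # mean luminance
    vx = sum((v - mx) ** 2 for v in xs) / (n - 1)        # variance of x
    vy = sum((v - my) ** 2 for v in ys) / (n - 1)        # variance of y
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / (n - 1)
    peak = 2 ** bits - 1
    c1, c2 = (k1 * peak) ** 2, (k2 * peak) ** 2          # stabilizing constants
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2))


img = [[10, 20], [30, 40]]
print(ssim_global(img, img))
```

Comparing an image with itself makes the numerator and denominator coincide, which corresponds to the maximum similarity.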
Table 4 shows the results of the metrics with different noise levels for the algorithms evaluated using the images displayed in
Figure 8, as can be seen, the noise was effectively reduced in most of the levels using DK algorithm, and the similarity was always above
, which means that filtering the images with the proposed method was successful and shows a broad capacity to restore the image.
5. Discussion
Even though the considered cellular automaton was employed in previous works [25,26,27,28], a detailed mathematical model was not addressed. Therefore, the main contribution of this paper is the mathematical description and statistical validation.
The proposed mathematical model of cellular automata can be used in a later work to carry out the respective dynamic analysis (not addressed in this work). In order to observe the algorithm features, a statistical validation for the functioning of the cellular automaton to eliminate impulsive noise is carried out.
This paper presents the mathematical description of the algorithm and a comparison with two very well-known algorithms, which can be considered a comparison standard for salt-and-pepper noise removal. It is also relevant to mention that the algorithm is designed to eliminate salt-and-pepper noise; for other types of noise, further research is expected to adjust the proposed algorithm.
In order to have an experimental validation, a comparison is made with two standard algorithms for impulsive noise elimination, observing that the DK algorithm displays a better performance; however, a broader comparison with other algorithms can be made in further work. In this way, new strategies can also be considered for incorporation into the DK algorithm. To perform suitable algorithm comparisons, the following aspects must be taken into account:
- Selection of the type of algorithms to be compared, considering reported performance, recency, available code, number of citations, and the approach proposed by the algorithm. Some algorithms to consider are cellular automata-based approaches for noise removal in digital images, such as Outer Totalistic Cellular Automata (OTCA) [45] and other developments such as those presented in [10,46,47,48,49,50,51,52]; likewise, hybrid methods that incorporate cellular automata and fuzzy logic [32,53], as well as modifications and improvements of the median filter such as the Unsymmetric Trimmed Median Filter (UTMF) [54], median-type noise detectors [34], and implementations using local image statistics [33]. Other approaches could also be considered, including algorithms based on dictionary learning methods [11,12], non-negative matrix factorization [13,14], and robust principal component analysis [15,16].
- Type of noise to eliminate, considering the different algorithm approaches. Additive, multiplicative, and impulsive static and dynamic noise can be considered [55]. The associated probability distribution can also be considered, such as uniform, Gaussian, Poisson, Rayleigh, Speckle, Gamma, White, and Brownian, as well as other noise characteristics such as periodic and structural [56].
- Performance metrics that reflect the operation of the algorithms, so that the advantages of each algorithm can be observed, such as processing time, amount of noise removed, image distortion, Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Signal-to-Noise Ratio (SNR), Image Enhancement Factor (IEF), and Structural Similarity Index Measure (SSIM), which is a perceptual metric that quantifies image quality degradation caused by the processing; also the Peak Signal-to-Noise Ratio (PSNR), corresponding to the relationship between the maximum possible energy of a signal and the noise that affects it [44,45,53].
- Statistical tests to carry out the comparisons (ANOVA, Kruskal–Wallis, Bonferroni, etc.), considering assumptions of normality and equality of variance to establish the type of test (parametric or non-parametric), and in this way perform a fair comparison between algorithms [57,58].
Finally, the limitations of this work are that the images used to carry out the statistical tests are synthetic and consider only impulsive noise; in addition, the algorithm operates on grayscale images, and no wide comparison with other types of algorithms is made.
6. Conclusions
In this work, the mathematical description of the algorithm based on a cellular automaton to eliminate (reduce) noise in digital images is obtained. Various functions are incorporated into this model to complete this description.
The proposed model can be used to adapt the operation of the algorithm to carry out other processes on the digital image such as edge separation, equalization, and pattern identification.
A statistical validation of the cellular automaton operation to eliminate impulsive noise is carried out, considering an experimental design based on simple random sampling with replacement and using a sufficient number of experiments to evaluate the performance of the algorithms.
A comparison of the DK algorithm with other well-known techniques for salt-and-pepper noise removal is made. These results show that the algorithm obtains a suitable performance compared to the other techniques. The denoising technique based on cellular automata can be considered a non-linear type of filtering.
According to the statistical analysis results carried out, it is observed from the correlation table that the DK algorithm presents an approximate percentage relationship of with respect to the median algorithm, and to the mean algorithm. This means that, even though the DK algorithm is based on a behavior similar to median and mean algorithms, its adaptive feature gives it the ability to expand the information with which it makes decisions and obtains effective results.
Limitations to overcome in other works are the dynamic analysis of the model, the filtering of other types of noise, the operation on color images, and a wide comparison with other types of algorithms.
In further work, the generalization of this model can be considered for application to color images; it can also be used to acquire or modify other color image characteristics. In addition, the application of the algorithm can be extended to other types of noise. Moreover, additional strategies could be included in the DK algorithm, such as neural networks, neuro-fuzzy systems, and support vector machines.