A Grid-Density Based Algorithm by Weighted Spiking Neural P Systems with Anti-Spikes and Astrocytes in Spatial Cluster Analysis

Abstract: In this paper, we propose a novel clustering approach based on P systems and a grid-density strategy. We present a grid-density based approach for clustering high-dimensional data, which first projects the data patterns onto a two-dimensional space to overcome the curse of dimensionality. Then, by meshing the plane with grid lines and deleting sparse grids, clusters are identified. In particular, we present weighted spiking neural P systems with anti-spikes and astrocytes (WSNPA2, for short) to implement the grid-density based approach in parallel. Each neuron in a weighted SN P system contains spikes, whose number can be expressed by a computable real number. Spikes and anti-spikes are inspired by neurons communicating through excitatory and inhibitory impulses, while astrocytes exert excitatory and inhibitory influence on synapses. Experimental results on multiple real-world datasets demonstrate the effectiveness and efficiency of our approach.


Introduction
Spiking neural P systems (SN P systems, for short) are a class of parallel and distributed neural-like computation models in the field of membrane computing [1,2]. SN P systems, which are inspired by neural cells [3], process information through spikes and rules, called firing and forgetting rules [4]. Inspired by different biological phenomena and mathematical motivations, several families of SN P systems have been constructed, such as SN P systems with anti-spikes [5], SN P systems with weights [6], SN P systems with astrocytes [7], stochastic numerical P systems [8], SN P systems with thresholds [9], numerical spiking neural P systems [10], double-layer self-organized spiking neural P systems [11], SN P systems with rules on synapses [12], and SN P systems with structural plasticity [13]. As for applications, SN P systems have been used to design logic gates, logic circuits [7] and operating systems [14], perform basic arithmetic operations [15], solve combinatorial optimization problems [16], and realize fingerprint recognition [11]. Păun, who initiated P systems, pointed out that solving real problems by membrane computing still needs to be addressed [17]. Comparative analysis of the dynamic behaviors of a hybrid algorithm indicates that combining evolutionary computation with P systems can produce a better algorithm for balancing exploration and exploitation [18][19][20]. However, such hybrid algorithms do not use the objects and rules defined by P systems, and P systems themselves are still largely at the stage of performing addition, subtraction, multiplication, and division [13]. How, then, can a P system realize more complex and universally applicable functions? Clustering algorithms have universal applicability, and their inherent characteristics make them especially suitable for parallel execution in a P system, offering the possibility of reduced time complexity.
The whole clustering process proposed in this paper is implemented through changes of objects driven by rules in membranes: objects encode the data, and rules working on the objects achieve the clustering goal. Real-world datasets usually have multiple attributes, so they often consist of high-dimensional data with features in many dimensions. Grid-based clustering is typically used for such complex, high-dimensional data: the data space is partitioned into a certain number of cells, which serve as the basic units for clustering operations [21]. OPTIGRID [22] is designed to obtain an optimal grid partitioning. CLIQUE is probably the most intuitive and comprehensive clustering technique [23]. The shifting grid approach (SHIFT) has been reported to be similar to the sliding-window technique. However, grid-based clustering methods face the curse of dimensionality: as the dimensionality increases, the number of grids grows exponentially. To address this problem, some methods select two features to form a plane before meshing. In [24], random projection is used to reduce the dimensionality of the data. A dynamic feature mask has been proposed to deal with the feature selection problem [25]. However, the features selected by these methods are not always the most distinguishable. To further improve the clustering effect, we propose to select features based on the data distribution histogram of each dimension. Inspired by AGRID [26], we combine grid-based clustering with density-based clustering. Based on the above considerations, this paper develops a hybrid optimization method: a grid-density based algorithm implemented by weighted SN P systems with anti-spikes and astrocytes. The characteristic of each dimension is calculated and compared by rules independently in different membranes, synchronously, and communication among membranes is utilized to explore clusters.
Experimental results on multiple real-world datasets demonstrate the effectiveness and efficiency of our approach.

Weighted Spiking Neural P Systems with Anti-Spikes and Astrocytes
A weighted spiking neural P system with anti-spikes and astrocytes (WSNPA2, for short) of degree m ≥ 1 is a construct of the form Π = (O, σ_1, . . . , σ_m, syn, ast_1, . . . , ast_k, In, Out), where: O = {a, ā} is the set of spikes, with a a spike and ā an anti-spike; the empty string is denoted by λ. σ_1, σ_2, . . . , σ_m are neurons (m is the degree of the system) of the form σ_i = (n_i, R_i), 1 ≤ i ≤ m, where n_i is the initial number of spikes contained in σ_i and R_i is a finite set of rules of the following two forms: (1) E/s^c → s, where s is a spike or anti-spike, c ≥ 1 is the number of spikes consumed by the rule, and E is a regular expression over a or ā; (2) s^e → λ, where e ≥ 1 is the number of spikes forgotten. syn ⊆ {1, 2, . . . , m} × {1, 2, . . . , m} × ω is the set of synapses between neurons, where ω ∈ Z is the weight on synapse (i, j); for each pair (i, j), there is at most one synapse (i, j, ω). A rule E/s^c → s is applied as follows: if neuron σ_i contains r spikes/anti-spikes with r ≥ c, then the rule can fire; c spikes/anti-spikes are consumed, r − c remain in σ_i, and one spike/anti-spike is released. The released spikes/anti-spikes are multiplied by ω and passed immediately to all neurons σ_j with (i, j, ω) ∈ syn. A rule s^e → λ is a forgetting rule: e spikes/anti-spikes are removed from the neuron immediately.
For spikes a^q and anti-spikes ā^p (where q and p are the numbers of spikes and anti-spikes, respectively), the annihilation rule aā → λ is applied in a maximal manner: a^(q−p) or ā^(p−q) remains for the next step, provided that q ≥ p or p ≥ q, respectively. ast_1, . . . , ast_k are astrocytes of the form ast_i = (syn_ast_i, t_i), where syn_ast_i ⊆ syn is the subset of synapses controlled by the astrocyte and t_i is the threshold of the astrocyte. Suppose there are k spikes passing along the neighboring synapses syn_ast_i. If k ≥ t_i, then ast_i has an inhibitory influence on syn_ast_i, and the k spikes are transformed into one spike by a^k → a; the resulting spike a is sent into the neuron connected to ast_i. Otherwise (k < t_i), ast_i has an excitatory influence on syn_ast_i, and all spikes survive and reach their destination neurons.
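The annihilation and astrocyte mechanics above can be illustrated with a small sketch (the function names are ours, not part of the formal model):

```python
def annihilate(spikes, anti_spikes):
    """Apply the annihilation rule a ā -> λ maximally: pairs consisting of
    one spike and one anti-spike cancel until one kind is exhausted."""
    k = min(spikes, anti_spikes)
    return spikes - k, anti_spikes - k

def astrocyte_gate(k_spikes, threshold):
    """Astrocyte control of a synapse bundle: if k >= t the astrocyte acts
    inhibitorily and collapses the k spikes to one (a^k -> a); otherwise it
    acts excitatorily and all k spikes survive."""
    return 1 if k_spikes >= threshold else k_spikes
```

For example, a neuron holding 5 spikes and 3 anti-spikes is left with 2 spikes after maximal annihilation, and 4 spikes crossing an astrocyte with threshold 3 are collapsed to a single spike.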

Identify the Two Well-Informed Features
Generally, in grid-based methods the computation grows exponentially with the number of dimensions, because evaluations must be performed over all grid points. For example, a cluster analysis with N dimensions and L grid partitions in each dimension results in L^N grids. To avoid this curse of dimensionality, we project the data from the actual feature space into a 2D space, aiming to discover the initial locations of clusters in a plane. The plane formed by the two well-informed features n_i, n_j ∈ N is covered by an L × L lattice of grids containing the M data objects X_p (p = 1, . . . , M).
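The gap between the two grid counts is easy to quantify (the numbers below are illustrative, using the Wine data set's 13 features):

```python
def full_grid_count(n_dims, partitions):
    # L partitions per dimension over N dimensions -> L**N cells
    return partitions ** n_dims

def projected_grid_count(partitions):
    # after projecting onto two well-informed features: only L*L cells
    return partitions * partitions

# With 13 features and L = 10 partitions per dimension, a full grid would
# need 10**13 cells, while the projected 2-D plane needs only 100.
print(full_grid_count(13, 10), projected_grid_count(10))
```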
First, each dimension of the objects is partitioned into bins, where x_pi is the value of feature n_i in data pattern X_p and | . | is the cardinality operator representing the number of elements in a set. For each attribute of the Wine data set, we draw a histogram according to the above rules. Since the data set has 13 attributes, we obtain 13 feature maps. Figure 1 depicts these histograms for the 13 features of the well-known Wine data set. Because the number of peaks reflects the ability of a feature to separate the data in this data set, we take ε, the number of peaks, as the measurement standard. As shown in Figure 1, the values of ε in these histograms are 3, 2, 1, 2, 1, 2, 2, 4, 3, 3, 3, 4, 5, respectively. According to these values of ε, the features n_8, n_12 and n_13 are selected as candidates. If there are ties among the maximum values of ε, we divide each dimension of the data objects into K/2 bins and recalculate the number of peaks, repeating until two well-informed features are selected. These two features are then used for the cluster analysis.
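The histogram-and-peaks selection can be sketched as follows. This is a simplified reading of the method: equal-width bins, a peak defined as a bin strictly higher than both neighbors, and a plain top-2 choice instead of the paper's K/2 tie-breaking loop; the function names are ours.

```python
def histogram_counts(values, k):
    """Partition the value range into k equal-width bins and count the
    number of data falling into each bin."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0  # guard against a constant feature
    counts = [0] * k
    for v in values:
        counts[min(int((v - lo) / width), k - 1)] += 1
    return counts

def count_peaks(counts):
    """epsilon: a bin is a peak when it is strictly higher than both
    neighbors (missing neighbors count as height 0)."""
    padded = [0] + counts + [0]
    return sum(1 for i in range(1, len(padded) - 1)
               if padded[i] > padded[i - 1] and padded[i] > padded[i + 1])

def select_two_features(data, k):
    """Pick the two feature indices with the most histogram peaks."""
    eps = [count_peaks(histogram_counts(col, k)) for col in zip(*data)]
    order = sorted(range(len(eps)), key=lambda i: eps[i], reverse=True)
    return order[0], order[1]
```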

Clustering by Grid-Density Based Algorithm
The plane formed by the two well-informed features is covered by an H = L × L lattice of grids, denoted by G = {g_1, g_2, . . . , g_H}. C(g_h), h ∈ {1, 2, . . . , H}, is the number of data patterns X_p partitioned into grid g_h according to (3).
Next, non-dense grids are deleted. A grid is dense if C(g_h) > θ, where θ ∈ N+ is a threshold defined before the computation. The threshold is initialized to 2% of the number of data and, on this basis, floats upwards by 10% and downwards by 1%. Several experiments are performed to select the threshold that yields the best clustering effect. After obtaining the initial members of the grid graph G, G is refined by finding the dense grids; the sparse grids are discarded, giving the refined grid graph G_r. Each grid g_h ∈ G_r has 4 neighbors connected to it, as shown in Figure 2. When no further dense grid can be connected to the grids of a cluster, the cluster is complete: a cluster is a set of neighboring dense grids. The clustering procedure is shown in Algorithm 1 below.

In this section, the weighted spiking neural P system with anti-spikes and astrocytes is designed for grid-density based clustering. Objects in each neuron are organized as spikes and anti-spikes with real-valued numbers corresponding to Ω = {X_pi, 1 ≤ p ≤ M, 1 ≤ i ≤ N}. Feature selection and cluster analysis are implemented by the rules of WSNPA2, which is divided into three subsystems: feature selection, effectiveness comparison and clustering. The structure of WSNPA2 is shown in Figure 3, where ovals represent neurons, rhombi stand for astrocytes and arrows indicate channels. The WSNPA2 for the grid-density based clustering algorithm is described as the following construct Π = (O, σ_S1, σ_S2, σ_S3, syn, R, ast_S1, ast_S3, σ_in1, . . . , σ_inN, σ_out1, . . . , σ_outt), where O = {a, ā}. At the beginning, each input neuron contains x_pi copies of spike a. σ_S1 stands for the neurons in the feature selection subsystem, σ_S1 = {DM_iz, F_iz, FS_iz}. The number of astrocytes ast_S1 in the feature selection subsystem is N × N, one between each pair of neurons DM_iz. The number of astrocytes ast_S3 in the clustering subsystem is L × L × 2 + 1. The input neurons σ_in1, . . . , σ_inN belong to the feature selection subsystem.
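A sequential sketch of the grid-density clustering step (before its parallel WSNPA2 implementation) might look like this. The grid mapping, the density test C(g_h) > θ, and the 4-neighbor merge follow the description above; the helper names are ours, and merging is done with a plain breadth-first search.

```python
from collections import deque

def grid_index(point, mins, maxs, L):
    """Map a 2-D point to its (row, col) cell in an L x L lattice."""
    def axis(v, lo, hi):
        w = (hi - lo) / L or 1.0
        return min(int((v - lo) / w), L - 1)
    return axis(point[0], mins[0], maxs[0]), axis(point[1], mins[1], maxs[1])

def grid_density_clusters(points, L, theta):
    """Count points per cell, keep dense cells (count > theta), and merge
    4-neighbor dense cells into clusters via breadth-first search."""
    xs = [p[0] for p in points]; ys = [p[1] for p in points]
    mins, maxs = (min(xs), min(ys)), (max(xs), max(ys))
    counts = {}
    for p in points:
        g = grid_index(p, mins, maxs, L)
        counts[g] = counts.get(g, 0) + 1
    dense = {g for g, c in counts.items() if c > theta}
    clusters, seen = [], set()
    for start in dense:
        if start in seen:
            continue
        cluster, queue = set(), deque([start])
        seen.add(start)
        while queue:
            r, c = queue.popleft()
            cluster.add((r, c))
            for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if nb in dense and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        clusters.append(cluster)
    return clusters
```

Two groups of points that fall in non-adjacent dense cells thus end up in two separate clusters, mirroring the "no connectable dense grid" stopping condition.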
The output neurons σ_out1, . . . , σ_outt belong to the clustering subsystem. Several different clustering subsystems work in parallel for different grid numbers H = L × L, which means the whole system can output several clustering results simultaneously. The clustering results obtained by the different clustering subsystems are then connected according to the neighboring relationship of grid positions, so as to obtain the final clustering result. Multi-WSNPA2 lets the computation proceed in parallel in the feature selection subsystem, the effectiveness comparison subsystem and the clustering subsystem, respectively. The complexity is thereby reduced from O(n) to O(kn), where k is a constant less than 1. The complexity of the grid-density based algorithm is calculated in detail as follows:
1. Traversing the N data to form the feature histograms: N.
2. Calculating the amount of data falling into each interval of the histogram: N.
3. Determining whether the amount of data in each rectangle of the histogram is greater than that of the rectangles on its left and right: K, where K is the number of rectangles.
4. Finding the two features with the most peaks: A, where A is the number of features in the data set.
5. Projecting the data patterns into the H = L × L grids: N.
6. Calculating the amount of data in each grid: N.
7. Selecting the dense grids: L².
8. Combining neighboring dense grids: L² − D, where D is the number of grids removed.
The complexity of the grid-density based algorithm is therefore O(N + N + K + A + N + N + L² + L² − D), where K, A, L and D are constants, which simplifies to O(n). When multi-WSNPA2 is used to execute the above algorithm, the data traversals in steps 1 and 5, the interval traversal in step 2 and the grid traversals in steps 6 and 7 are performed in parallel. The complexity then becomes O(1 + maxdata_K + K + K + 1 + maxdata_L + 1 + 1), where maxdata_K and maxdata_L are the maximum amount of data in an interval K and in a grid L, respectively. This simplifies to O(kn), where k is a constant less than 1.
syn represents the synapses among neurons. R is the following set of firing and forgetting rules ([ ]_x means the rule works in neuron x; otherwise, the rule executes in all neurons):

Overview of Computations
A data set of M observations is codified by spikes a^(x_pi), 1 ≤ i ≤ N, 1 ≤ p ≤ M. The computation of the P system is split into three subsystems. When the spikes a^(x_pi) arrive in neurons DM_i1, the computation begins in parallel.
In the feature selection subsystem, each astrocyte ast_S1ih carries a threshold t_ih. If x_pi > t_ih, then x_pi is said to belong to the current neuron DM_iz, and rule 2 adds a spike to DM_iz. Otherwise, a^(x_pi) passes through DM_iz to find its neuron (bin) by rule 1. After all a^(x_pi) have been processed by rules 1 and 2, the peak of each dimension is chosen by rules 3 and 4.
All peaks of dimension i are gathered as spikes a in neuron E_i. Then, the effectiveness comparison subsystem starts, and the maximum number of peaks over the dimensions is selected by rules 5-9. Rules 5 and 6 copy the peaks a^m into a^(2m+2) and send a^m into neuron EC_i for preparation. Then, the differing numbers of a^m are decremented one by one by rule 8. Rule 9 helps ECS collect all dimensions except the one with the maximum number of peaks. The serial number of the neuron that sends out ā^2 by rule 10 is chosen as the first dimension for clustering. The other effectiveness comparison subsystems work in the same way, except that the already chosen dimension is deleted by rule 11.
Rule 12 activates the input neurons of the two selected features, and the clustering subsystem begins. Rules 13-14 put the observations into suitable bins in their own dimensions (θ_i = (X_i^max − X_i^min)/L). Then, rule 15 selects the grid that holds two spikes; it is chosen as the initial grid for a cluster. Rule 16 activates the input neurons of the two selected features again. Rules 17-19 find the dense grids. Rules 12-16 continue to work until there is no spike input. The clustering result is obtained from the serial numbers of the neurons that output a by rule 19.

Results and Discussion
The experiments set out to investigate the performance of the proposed approach compared to classical clustering algorithms. We conduct experiments on ten real-world datasets, all taken from the UCI repository (https://archive.ics.uci.edu/ml/datasets.php). Table 1 summarizes these data sets, ordered by their number of attributes.
The amounts of resources necessary to define multi-WSNPA2 for grid-density based clustering on the ten datasets are shown in Table 2.
To compare the algorithm with k-means, AHC (agglomerative hierarchical clustering) and two other recent algorithms more precisely, their clustering performance in terms of accuracy is reported in Table 3. The AHC uses the Ward linkage [27], which is appropriate for Euclidean distance. The accuracy of a clustering evaluates the proportion of objects correctly assigned to clusters in each class.
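The paper does not spell out the exact accuracy computation; a common convention, sketched here under that assumption, maps each predicted cluster to its majority true class and reports the fraction of correctly assigned objects:

```python
from collections import Counter

def clustering_accuracy(labels_true, labels_pred):
    """Map each predicted cluster to its majority true class and return the
    fraction of objects that end up correctly assigned."""
    clusters = {}
    for t, p in zip(labels_true, labels_pred):
        clusters.setdefault(p, []).append(t)
    correct = sum(Counter(members).most_common(1)[0][1]
                  for members in clusters.values())
    return correct / len(labels_true)
```

Under this convention, a clustering that splits one true class into two clusters is not penalized, while a cluster mixing two classes counts only its majority members as correct.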
Clearly, the accuracy is comparable to that of k-means, AHC and the two other recent algorithms, and even better on average (averages shown in boldface). This means that the clustering effect of our method is better than that of the other algorithms.

Table 1. Summary of the ten data sets.

Dataset       Attributes   Classes   Instances
Haberman           3           2        306
Iris               4           3        150
Thyroid            5           4        215
Ecoli              7           8        336
Diabetes           8           3        768
Breast             9           3        699
Glass              9           6        214
Wine              13           3        178
Vehicle           18           4        846
Ionosphere        34           2        351

Table 2. The amount of necessary resources to define multi-WSNPA2 of the ten datasets.

The intrinsic maximal parallelism of P systems can be exploited to speed up solutions. To achieve this, the model needs several ingredients, among them the ability to generate an exponential workspace in polynomial time. The computational cost is higher than that of k-means, as the last stage of the algorithm is repetitive. Table 4 compares the running times against k-means and AHC, where the fastest (on average) is shown in boldface. The results show that our algorithm clusters faster on most data sets.

Conclusions
This paper discusses the use of weighted spiking neural P systems with anti-spikes and astrocytes to develop a novel hybrid method with a grid-density based algorithm for solving clustering problems, which first projects the data patterns onto a two-dimensional space to overcome the curse of dimensionality. To choose the two well-informed features, a simple and fast feature selection algorithm is proposed. Then, by meshing the plane with grid lines and deleting sparse grids, clusters are identified. In particular, we present weighted spiking neural P systems with anti-spikes and astrocytes (WSNPA2, for short) to implement the grid-density based approach in parallel. Each neuron in a weighted SN P system contains spikes, whose number can be expressed by a computable real number. Spikes and anti-spikes are inspired by neurons communicating through excitatory and inhibitory impulses, while astrocytes exert excitatory and inhibitory influence on synapses. The characteristic of each dimension is calculated and compared by rules independently in different membranes, synchronously, and communication among membranes is utilized to explore clusters. Experimental results on multiple real-world datasets demonstrate the effectiveness and efficiency of our approach compared to classical k-means, AHC and two other recent algorithms.

Conflicts of Interest:
The authors declare no conflict of interest.