Dynamics of Fuzzy-Rough Cognitive Networks

Fuzzy-rough cognitive networks (FRCNs) are interpretable recurrent neural networks, primarily designed for solving classification problems. Their structure is simple and transparent, while their performance is comparable to that of well-known black-box classifiers. Although there are many applications of fuzzy cognitive maps and, more recently, of FRCNs, only a very limited number of studies discuss the theoretical properties of these models. In this paper, we examine the behaviour of FRCNs, viewing them as discrete dynamical systems. It will be shown that their mathematical properties depend strongly on the size of the network, i.e., there are structural differences between the long-term behaviour of FRCN models of different sizes, which may influence the performance of these modelling tools.


Introduction
Artificial Intelligence (AI) models and methods are part of our lives. However, most AI techniques are black boxes in the sense that they do not explain how and why they arrived at a specific conclusion. Explainable AI (XAI) tries to overcome this situation by developing models with interpretable semantics and transparency [1,2]. Fuzzy Cognitive Maps (FCMs) are one of the earliest XAI models, introduced by B. Kosko [3]. FCMs are recurrent neural networks employing weighted causal relations between the model's concepts. Due to their modelling ability and interpretability, these models have a wide range of applications [4,5].
Although FCMs have an enormous number of applications, only a few studies are devoted to the analytical, rather than empirical, discussion of their behaviour. Boutalis et al. [6] examined the existence and uniqueness of fixed points of FCMs. Lee and Kwon studied the stability of FCMs using Lyapunov's method [7]. Knight et al. [8] analyzed FCMs with linear and sigmoid transfer functions. In [9], the authors generalized the findings of [6] to FCMs with arbitrary sigmoid functions. All of these studies arrived at the conclusion (although in different forms) that when the parameter of the sigmoid threshold function is small enough, the FCM converges to a unique fixed point, regardless of the initial activation values.
The hybridisation of rough set theory and fuzzy set theory [10,11] provides a promising and fruitful combination of different methods for handling and modelling uncertainty [12,13].
The application areas encompass a wide variety of sciences, so only a few of them are mentioned here, without any claim of completeness. Fuzzy rough sets and fuzzy rough neural networks have been applied to feature selection problems [14], and evolutionary fuzzy rough neural networks have been developed for stock prediction [15]. Fuzzy rough set models are used in multi-criteria decision-making in [16]. Classification tasks have been solved by fuzzy rough granular neural networks in [17]. The combination of unsupervised convolutional neural networks and fuzzy-rough C-means was used effectively for the clustering of a large-scale image dataset in [18]. The environmental impact of a renewable energy system was estimated using fuzzy rough sets in [19]. The interval-valued fuzzy-rough based Delphi method was applied for evaluating the siting criteria of offshore wind farms in [20]. Another current direction is the fusion of neutrosophic theory and rough set theory [21]; an example of its application is the emission-based prioritization of bridge maintenance projects [22]. Nevertheless, from the perspective of this paper, fuzzy rough granular networks [23] are the most exciting applications of the synergy of fuzzy and rough theories.
Granular Computing [24] uses information granules, such as classes, clusters, subsets etc., just as humans do. Granular Neural Networks (GNNs) [25] create a synergy between the celebrated neural networks and granular computing [26]. Rough Cognitive Networks (RCNs) are GNNs, introduced by Nápoles et al. [27], combining the abstract semantics of the three-way decision model with the neural reasoning mechanism of Fuzzy Cognitive Maps to address numerical decision-making problems. The information space is discretized (granulated) using Rough Set Theory [28,29], which has many other interesting applications [30][31][32][33]. According to simulation results, RCNs were capable of outperforming standard classifiers. On the other hand, learning the similarity threshold parameter had a significant computational cost.
Rough Cognitive Ensembles (RCEs) were proposed to overcome this computational burden [34]. An RCE employs a collection of Rough Cognitive Networks as base classifiers, each operating at a different granularity level. This removes the requirement of learning a similarity threshold. Nevertheless, the model is still very sensitive to the similarity threshold upon which the rough information granules are built.
Fuzzy-Rough Cognitive Networks (FRCNs) were introduced by Nápoles et al. [35]. The main feature of FRCNs is that the crisp information granules are replaced with fuzzy-rough granules. Based on simulation results, FRCNs show performance comparable to the best black-box classifiers.
Vanloffelt et al. [36] studied the contributions of the building blocks to the FRCNs' performance via empirical simulations with several different network topologies. They concluded that the connections between positive neurons might not be necessary to maintain the performance of FRCNs. The theoretical study by Concepción et al. [37] discussed the contribution of negative and boundary neurons. Moreover, they arrived at the conclusion that negative neurons have no impact on the decision, and that the ranking between positive neurons remains invariant during the whole reasoning process.
Besides the results presented in [37], this paper was motivated by the fact that only a few studies discuss the behaviour of cognitive networks from a strictly mathematical point of view. Nevertheless, such studies may provide us with information about what we can or cannot achieve with these models. Analyzing the behaviour and contribution of the building blocks unveils the exact role of the components of the complex structure: which part is crucial, which one is unnecessary, etc. In this paper, we neither develop nor implement another new fuzzy-rough model. Instead, we analyze the behaviour of FRCNs, which are comparable in performance to the best black-box classifiers. Because of their proven competitiveness [35], there is no need for further model verification and validation.
In the current paper, the dynamical behaviour of fuzzy-rough cognitive networks is examined. The main contributions are the following: first, we show that stable positive neurons have at most two different activation values for any initial activation vector. Then we show that a certain point with equal coordinates (called the trivial fixed point) is always a fixed point, but not always a fixed point attractor. Furthermore, a condition for the existence of a unique, globally attractive fixed point is also stated. A complete analysis of the dynamics of positive neurons for two and three decision classes is provided. Finally, we show that for a higher number of classes, the occurrence of limit cycles is a necessity and the vast majority of initial activation values lead to oscillation.
The rest of the paper is organized as follows. In Section 2, we recall the construction of fuzzy-rough cognitive networks and overview the existing results about their behaviour. In Section 3, a summary of the mathematical background necessary for the further investigation of the dynamics of FRCNs is provided, including contraction mappings and elements of bifurcation theory. Section 4 presents general results about the dynamics of positive neurons, a condition for a unique fixed point attractor, and a refinement of some findings of [37]. Section 5 introduces size-specific results for FRCNs, providing a complete description of the cases of two and three decision classes and pointing out that above a specific size, oscillatory behaviour is naturally present. Section 6 discusses the relation of the behaviour of positive neurons to the final decision class of FRCNs. The paper ends with a short conclusion in Section 7.

Fuzzy-Rough Cognitive Networks
In this section, we briefly summarize the basic notions of fuzzy-rough cognitive networks and the findings about their dynamical properties reported in the literature. It is based on the works [35][36][37].

Construction of FRCNs
Building up FRCNs includes the following three steps: information space granulation, network construction and finally, network exploitation.
Information space granulation means dividing the available information into granules. Let U denote the universe of discourse and X = {X_1, . . . , X_N}, X_c ⊂ U, where X_c is the subset containing all objects assigned (labelled) to decision class D_c. The membership degree of x ∈ U to X_c is computed in a binary way: µ_Xc(x) = 1 if x is labelled with D_c, and µ_Xc(x) = 0 otherwise.
The next component is the membership function µ_P(y, x), which uses the similarity degree between two instances x and y. Here µ_P : U × U → [0, 1] is the membership degree of y to X_c, given that x belongs to X_c (in this sense, it is a conditional membership degree). It is composed of the previously defined binary membership function µ_Xc(x) and the similarity degree ϕ(x, y); the latter is based on a normalized distance measure δ(x, y). From these components, the membership functions of the lower and upper approximations of any fuzzy set X_c are built, where I denotes an implication function and T denotes a conjunction function. As a next step, we calculate the membership functions associated with the positive, negative and boundary regions.
After determining the membership functions of the decision classes, we construct a network using four types of neurons: positive, negative, boundary and decision neurons. The recurrent neural network has N output neurons (decision neurons). The number of input neurons is between 2N and 3N, depending on the number of non-empty boundary regions. The 'wiring' between the neurons is based on the following steps (see Algorithm 1). Positive, negative and boundary neurons influence themselves with intensity 1. Each positive neuron influences the decision neuron related to it with intensity 1; moreover, these neurons act on the other positive and decision neurons with intensity −1. Finally, if two decision classes share non-empty boundary regions, then each boundary neuron influences both decision neurons with intensity 0.5. As is clear from this setting, the weights of an FRCN are not determined by learning methods; they are based on the semantic relations between the information granules.

Algorithm 1:
The construction procedure of the fuzzy-rough cognitive network.
for each subset X_c do:
    Add a neuron P_c as the c-th positive region;
    Add a neuron N_c as the c-th negative region;
    Add a neuron B_c as the c-th boundary region;
    Add a neuron D_c as the c-th decision class.
The computation of the initial activation values (A_i^(0)) of the neurons is based on the similarity degree between the new object y and all x ∈ U, and on the membership degree of every x to the positive, negative and boundary regions. The initial activation value of the decision neurons is zero.
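The wiring rules above can be sketched as a weight-matrix builder. This is a minimal illustration, not the authors' code: the neuron ordering, the `shared_boundaries` argument and the exact pairing of boundary and decision neurons are our assumptions.

```python
import numpy as np

def build_frcn_weights(n_classes, shared_boundaries=()):
    """Weight matrix sketch for an FRCN, following the wiring rules in the text.

    Neuron ordering (an assumption for illustration): positive P_1..P_N,
    negative N_1..N_N, boundary B_1..B_N, decision D_1..D_N.
    `shared_boundaries` lists pairs (i, j) of classes with a shared non-empty
    boundary region; how boundary neurons are paired with decision neurons is
    our reading of the construction, not a quote from the paper.
    W[i, j] is the weight of the connection from neuron j to neuron i.
    """
    N = n_classes
    W = np.zeros((4 * N, 4 * N))
    P, NEG, B, D = 0, N, 2 * N, 3 * N          # block offsets
    for c in range(N):
        W[P + c, P + c] = 1.0                  # positive self-loop
        W[NEG + c, NEG + c] = 1.0              # negative self-loop
        W[B + c, B + c] = 1.0                  # boundary self-loop
        W[D + c, P + c] = 1.0                  # P_c -> D_c
        for k in range(N):
            if k != c:
                W[P + k, P + c] = -1.0         # P_c -> other positive neurons
                W[D + k, P + c] = -1.0         # P_c -> other decision neurons
    for (i, j) in shared_boundaries:           # boundary -> both decision neurons
        W[D + i, B + i] = W[D + j, B + i] = 0.5
        W[D + i, B + j] = W[D + j, B + j] = 0.5
    return W
```

Note that no learning is involved: the matrix is fully determined by the granulation, exactly as the construction above states.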
The initial activation values of the negative neurons (N_i) and the boundary neurons (B_i) are computed analogously from the corresponding membership degrees. After determining the initial values, the next step is network exploitation, by the iteration rule A_i^(t+1) = f(Σ_j w_ij A_j^(t)) (Equation (11)), or in matrix-vector form A^(t+1) = f(W A^(t)), where f is understood coordinate-wise. Here w_ij is the weight of the connection from neuron C_j to neuron C_i. Finally, the label of the decision neuron with the highest activation value is assigned to the classified object. Figure 1 shows the FRCN's structure for a binary classification problem.
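The exploitation phase itself is a plain recurrent iteration; a minimal sketch (function and variable names are ours, the stopping rule follows the description above):

```python
import numpy as np

def sigmoid(x, lam=2.0):
    """Coordinate-wise sigmoid transfer function f(x) = 1 / (1 + exp(-lam * x))."""
    return 1.0 / (1.0 + np.exp(-lam * x))

def frcn_reason(W, a0, lam=2.0, max_iter=200, tol=1e-9):
    """Network exploitation: iterate A(t+1) = f(W A(t)) (Equation (11)) until
    the activations stabilize or the iteration budget is exhausted."""
    a = np.asarray(a0, dtype=float)
    for _ in range(max_iter):
        a_next = sigmoid(W @ a, lam)
        if np.max(np.abs(a_next - a)) < tol:   # consecutive iterates agree
            return a_next
        a = a_next
    return a
```

For example, iterating only the 2-neuron positive block with λ = 1.5 < 2 drives any initial vector to the trivial fixed point (0.5, 0.5), as the results of Sections 4 and 5 predict.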

Preliminary Results on the Dynamics of FRCNs
In this subsection, we briefly summarize the main contributions of [37] regarding the dynamics of fuzzy-rough cognitive networks. For detailed explanations and proofs, see [37].

•
Negative and boundary neurons always converge to a unique fixed value. This value depends on parameter λ, but the convergence and uniqueness are independent of λ.

•
The ranking between positive neurons remains invariant during the recurrent reasoning. As we will see in Section 5, although this statement is absolutely true in the strict mathematical sense, from the practical point of view it has very limited applicability in some cases.
• In an FRCN with N decision classes, there will always be N − 1 positive neurons with activation values less than or equal to 1/2 (after at least one iteration step).
• In an FRCN, there will be at most one neuron with activation value higher than or equal to 1/2 (after at least one iteration step).

Consider the updating rule A^(t+1) = f(W A^(t)) (see Equation (11)), where f is the sigmoid function with parameter λ. Assuming that the positive neurons reach stable states, for every i it holds that P_i^* = f(Σ_j w_ij P_j^*). Recall that each positive neuron is influenced by the other positive neurons with weight −1, while it influences itself with weight 1 (i.e., if i = j, then w_ij = 1, otherwise w_ij = −1). Consequently, we have P_i^* = f(2P_i^* − Σ_j P_j^*). Applying the inverse of the sigmoid and rearranging, we get the following equation: Σ_j P_j^* = 2P_i^* − (1/λ) ln(P_i^*/(1 − P_i^*)); on the left-hand side we find the sum of all activation values, which is the same for every i (P_i^* is the stabilized activation value of any positive neuron). The sum of the activation values is thus a function of P_i^*. Define the following real function:

s(x) = 2x − (1/λ) ln(x/(1 − x)). (14)

From the behaviour of s(x), the authors derived some properties of the positive neurons:
• If λ ≤ 2, then s(x) is monotone decreasing, thus a specific value can be produced by a single input value x. This means that if λ ≤ 2 and the positive neurons are stable (converge to a fixed point), then they all have the same activation value.
• If λ > 2, then s(x) produces the same value for at most three different input values in (0, 1). This means that if the vector of positive neurons converges to a fixed point, then this vector has at most three different coordinate values.
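The claimed shape of s(x) is easy to check numerically; a small sketch, assuming only the formula for s(x) derived above:

```python
import numpy as np

def s(x, lam):
    """s(x) = 2x - (1/lam) * ln(x / (1 - x)): the sum of the stabilized
    positive-neuron activations as a function of a single coordinate."""
    return 2 * x - np.log(x / (1 - x)) / lam

x = np.linspace(0.001, 0.999, 9999)

# For lam <= 2 the derivative s'(x) = 2 - 1/(lam*x*(1-x)) is non-positive,
# so s is monotone decreasing: each value has a unique preimage.
assert np.all(np.diff(s(x, lam=2.0)) <= 1e-12)

# For lam > 2 there is an increasing middle branch, so a horizontal line
# can cross the graph up to three times.
assert np.any(np.diff(s(x, lam=4.0)) > 0)

# Symmetry about the point (1/2, 1): s(x) + s(1 - x) = 2.
assert np.allclose(s(x, 4.0) + s(1 - x, 4.0), 2.0)
```

The symmetry checked in the last line is used in Section 4 when counting the possible coordinate values of a stable activation vector.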
In Sections 4 and 5, we refine these statements and introduce some additional results regarding the dynamics of FRCNs.

Mathematical Background
The updating rule of FRCNs suggests handling them as discrete dynamical systems. In this section, we briefly summarize the most important notions and methods used in the forthcoming sections.

Contraction Mapping
In the network exploitation phase, we apply the iteration rule Equation (11) again and again until the activation values stabilize (or the number of iterations reaches the predefined maximum). If the activation values stabilize (i.e., arrive at an equilibrium state), then the difference between the outputs of two consecutive iteration steps becomes smaller and smaller. In other words, the iteration contracts a subset of the state space (or the whole state space) into a single point. The following definition provides a strict mathematical description of this property.
Definition 1 (see [38], p. 220). Let (X, d) be a metric space with metric d. If ϕ maps X into X and if there is a number c < 1 such that d(ϕ(x), ϕ(y)) ≤ c d(x, y) for all x, y ∈ X, then ϕ is said to be a contraction of X into X. (15)
If the iteration reaches an equilibrium state, then this state will not change by applying the updating rule. Mathematically speaking, it is a fixed point of the mapping generating the iteration. The following famous theorem establishes connection between the contraction mapping and a unique fixed point.
Theorem 1 (Contraction mapping theorem or Banach's fixed point theorem, see [38], pp. 220-221). If X is a complete metric space, and if ϕ is a contraction of X into X, then there exists one and only one x ∈ X such that ϕ(x) = x.
The proof of this theorem is constructive (see [38], p. 221) and offers a straightforward way to find this unique fixed point. We only have to pick an arbitrary element x_0 of X and apply the mapping ϕ again and again; the limit of the sequence x_0, ϕ(x_0), ϕ(ϕ(x_0)), . . . will be the unique fixed point of ϕ. An arbitrary fixed point x* is said to be asymptotically stable if, starting the iteration close enough to x*, the limit will be x*. Moreover, the fixed point x* is said to be globally asymptotically stable if, starting the iteration from any element of the state space, the limit will be x*. Based on Theorem 1 and its constructive proof, it is clear that the unique fixed point of a contraction is globally asymptotically stable.

Elements of Bifurcation Theory
The dynamics of a discrete dynamical system may change, if its parameters are varied. A qualitative change of the dynamical behaviour, e.g., a transition from a unique stable fixed point to multiple fixed points or oscillation, is called bifurcation and the corresponding critical parameter is called bifurcation point.
The detailed description of the dynamics of FRCNs requires the application of elements of bifurcation theory. Here only the most important notions are listed. For more details, see [39,40].
Consider a discrete-time dynamical system x^(t+1) = G(x^(t), λ), depending on a single parameter λ, where G : R^n → R^n is smooth with respect to x and λ. Let us assume that x_0 is a fixed point of the mapping with parameter λ_0. The local stability (resistance against small perturbations) of the fixed point depends on the eigenvalues of the Jacobian evaluated at x_0 (J_G(x_0)). If the Jacobian has no eigenvalues on the unit circle, x_0 is said to be hyperbolic. Hyperbolic fixed points can be categorized according to the eigenvalues of J_G(x_0):
• If all of the eigenvalues lie inside the unit circle (i.e., the absolute value of every eigenvalue is less than one), then it is an asymptotically stable fixed point. In other words, this fixed point attracts the space in every direction in its (sometimes very small) neighborhood. Consequently, its basin of attraction is an n-dimensional subset of the n-dimensional space.
• If there is at least one eigenvalue with absolute value greater than one and at least one with absolute value less than one, then it is a saddle point. This means that the fixed point may attract points of the space in some, but not every, direction in its neighborhood. Consequently, the dimension of its basin of attraction is less than n.
• If all of the eigenvalues have absolute value greater than one, then it is an unstable fixed point (repeller).
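This classification can be mechanized; a small helper (the function and its labels are ours, not from the paper) that categorizes a hyperbolic fixed point from the Jacobian's eigenvalues:

```python
import numpy as np

def classify_fixed_point(J):
    """Classify a fixed point of a discrete map from the Jacobian J evaluated
    at that point: stable attractor, saddle, or repeller; eigenvalues on the
    unit circle mean the point is not hyperbolic."""
    moduli = np.abs(np.linalg.eigvals(J))
    if np.any(np.isclose(moduli, 1.0)):
        return "non-hyperbolic"
    if np.all(moduli < 1.0):
        return "stable"
    if np.all(moduli > 1.0):
        return "repeller"
    return "saddle"
```

For instance, for the two-class positive block discussed in Section 5 the Jacobian at the trivial fixed point is (λ/4)W_P; the helper returns "stable" for λ = 1 and "saddle" for λ = 4.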
In FCM theory, and similarly in FRCN terminology, a stable fixed point is a fixed point that can be the limit of the iteration. In this sense, both stable and saddle fixed points are considered stable in the FCM (FRCN) sense. Nevertheless, their dynamical behaviour can be very different. We will see in Section 5 that different types of fixed points have basins of attraction of significantly different size. In other words, some fixed points are less important than others.
The simplest ways in which hyperbolicity can be violated are the following:
• A simple positive eigenvalue crosses the unit circle at α = 1. The bifurcation associated with this scenario is called a saddle-node bifurcation. It is the birth of new fixed points.
• A simple negative eigenvalue crosses the unit circle at α = −1. This causes a period-doubling bifurcation, the birth of limit cycles.
• A pair of complex conjugate eigenvalues reaches the unit circle at α_{1,2} = e^{±iφ}, φ ∈ (0, π). This is the so-called Neimark-Sacker bifurcation, causing the occurrence of a closed invariant curve.
It will be shown in Section 5, that depending on the size of the FRCN, different types of bifurcations determine the main dynamics of the system.

General Results on the Dynamics of Positive Neurons
In this section, we introduce some general results about the dynamics of positive neurons. Size-specific results are presented in Section 5.
We start with the refinement of a result from [37]. Further investigation of the function s(x) (see Equation (14)) provides more information about the possible fixed points of positive neurons (see Figure 2). It has been shown that there is at most one positive neuron with activation value higher than 1/2. If s(x) > 1, then a specific value may be produced by at most three different values of x, but two of these values are higher than 1/2, thus only one of them can appear in the activation vector. It means that one coordinate of the activation vector is greater than 1/2, while the remaining ones are less than 1/2 and have equal values.
Consider now the case when s(x) < 1. Observe that the graph of s(x) is symmetrical about the point (1/2, 1) (this can be easily verified analytically). Let us choose a specific value γ = s(x) such that the horizontal line y = γ has three intersection points with the graph. Denote the first coordinates of these points by x_1 < x_2 < x_3. Using the symmetry of the function, we can conclude that x_1, x_2 < 1/2 < x_3. Since the sum of the activation values is less than 1 (s(x) < 1), it follows that there can be at most two different values (x_1 and x_2) in the activation vector for any given s(x).
Summarizing this short argument: if the activation vector of positive neurons converges to a fixed point, then it may have at most two different coordinate values.
The iteration rule for updating the neurons' activation values has the form A^(t+1) = f(W A^(t)), where the sigmoid is applied coordinate-wise and W is the connection matrix of the network. Based on the construction of the network (see Algorithm 1), W has an upper block-triangular form built from the following blocks: O and I denote the zero matrix and the identity matrix, respectively; W_B describes the connections from boundary to decision neurons (it contains 0s and 0.5s, whose positions depend on the non-empty boundary regions); W_P describes the connections between positive neurons (if i = j, then w_ij = 1, else w_ij = −1); and W_D = W_P contains the connections from positive neurons to decision neurons. Because of this block structure, instead of dealing with the whole matrix, we can work with the blocks. It has been proved in [37] that the activation values of the negative and boundary neurons converge to the same unique value, which depends on λ but is independent of the initial activation values. Positive neurons influence themselves, each other and the decision neurons, but do not receive input from the other sets of neurons. Their activation values are propagated to the decision neurons. In the long run, when the neurons reach a stable state (or the iteration is stopped upon reaching the maximal number of iterations), the propagated value is their stable (or final) state. In the following, we examine the long-term behaviour of the positive neurons.
Lemma 1. For any λ > 0 and any number of decision classes N, the positive neurons have exactly one fixed point with equal coordinates.
Proof. Consider the fixed point equation P = f(W_P P), which in coordinate-wise form reads, for every 1 ≤ j ≤ N, P_j = f(2P_j − Σ_k P_k). If P_j = x for every 1 ≤ j ≤ N, then it simplifies to the following equation: x = f((2 − N)x). We show that there always exists a unique solution to this equation. Let us introduce the function g(x) = x − f((2 − N)x) = x − 1/(1 + e^{λ(N−2)x}). Function g(x) is continuous and differentiable; moreover, g(0) = −0.5 < 0 and g(1) = 1 − 1/(1 + e^{λ(N−2)}) > 0, thus it has at least one zero in (0, 1).
According to Rolle's theorem, between two zeros of a differentiable function its derivative has a zero. The derivative is g'(x) = 1 + (N − 2)λ f((2 − N)x)(1 − f((2 − N)x)), which is always positive for N ≥ 2, so g has at most one zero in (0, 1). It means that there is exactly one zero of g(x) in (0, 1). Consequently, we have shown that for any given λ > 0 and N, there is exactly one fixed point of the positive neurons with equal coordinates. There may be other fixed points, but their coordinates are not all equal.
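Since g is continuous with g(0) < 0 < g(1), the trivial fixed point can be computed by simple bisection; a sketch (the function name is ours):

```python
import math

def trivial_fixed_point(lam, N, tol=1e-12):
    """Find the unique x in (0, 1) with x = f((2 - N) x), i.e. the zero of
    g(x) = x - 1/(1 + exp(-lam*(2-N)*x)), by bisection (g(0) < 0 < g(1))."""
    def g(x):
        return x - 1.0 / (1.0 + math.exp(-lam * (2 - N) * x))
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For N = 2 the exponent vanishes and the trivial fixed point is 0.5 for every λ, in agreement with the two-class analysis in Section 5.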
The following lemma plays a crucial role in the proof of Theorem 2 and in the examination of the Jacobian of the iteration mapping.
Lemma 2. Let W_P be the N × N matrix with entries w_ij = 1 if i = j and w_ij = −1 if i ≠ j. Then the eigenvalues of W_P are 2 − N (with multiplicity one) and 2 (with multiplicity N − 1).

Proof. Basic linear algebra: W_P = 2I − J, where J is the all-ones matrix, whose eigenvalues are N (with multiplicity one) and 0 (with multiplicity N − 1).
Theorem 2. Consider a fuzzy cognitive map (recurrent neural network) with sigmoid transfer function f(x) = 1/(1 + e^{−λx}) and with weight matrix W_P whose entries are w_ij = 1 if i = j and w_ij = −1 otherwise. If λ < 4/max{2, N − 2}, then it has exactly one fixed point. Moreover, this fixed point is a global attractor, i.e., the iteration starting from any initial activation vector ends at this point.

Proof.
We are going to show that if the condition of the theorem is fulfilled, then the mapping P → f(W_P P) is a contraction; thus, according to Banach's theorem, it has exactly one fixed point and this fixed point is globally asymptotically stable, i.e., iterations starting from any initial vector arrive at this fixed point. Let us choose two different vectors, P and P′. Then

‖f(W_P P) − f(W_P P′)‖_2 ≤ (λ/4) ‖W_P P − W_P P′‖_2 ≤ (λ/4) ‖W_P‖_2 ‖P − P′‖_2.

Here the first inequality comes from the fact that the derivative of the sigmoid function f(x) is at most λ/4, so f(x) is Lipschitzian with constant λ/4, while the second inequality comes from the definition of the induced matrix norm. Since W_P is a real, symmetric matrix, its spectral norm ‖·‖_2 equals the maximal absolute value of its eigenvalues. By Lemma 2, ‖W_P‖_2 = max{2, |2 − N|}. According to the definition of contraction (Equation (15)), if the coefficient of ‖P − P′‖_2 is less than one, then the mapping is a contraction, and by Theorem 1 it has exactly one fixed point, which is globally asymptotically stable. The inequality in the theorem comes from a simple rearrangement: (λ/4) max{2, N − 2} < 1 if and only if λ < 4/max{2, N − 2}.
An immediate corollary of Theorem 2 and Lemma 1 is that if there is a unique, globally attracting fixed point, then its coordinates are equal. We will refer to the fixed point with equal coordinates as the trivial fixed point. The whole complex behaviour of positive neurons (and in this way, of fuzzy-rough cognitive networks) evolves from this trivial fixed point via bifurcations (see the flowchart in Figure 3 for the way to the first bifurcation). In Section 5, we show that FRCNs of different sizes (different numbers of decision classes N) may show significantly different qualitative behaviour.
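Both Lemma 2 and the contraction condition of Theorem 2 can be verified numerically; a sketch, assuming only the weight pattern of W_P defined above:

```python
import numpy as np

def sigmoid(x, lam):
    return 1.0 / (1.0 + np.exp(-lam * x))

def positive_block(N):
    """W_P: 1 on the diagonal, -1 elsewhere (the matrix of Lemma 2)."""
    return 2.0 * np.eye(N) - np.ones((N, N))

# Lemma 2: the eigenvalues of W_P are 2 - N (once) and 2 (N - 1 times).
N = 5
eig = np.linalg.eigvalsh(positive_block(N))          # ascending order
assert np.allclose(eig, [2 - N] + [2] * (N - 1))

# Theorem 2: for lam < 4 / max(2, N - 2) the map P -> f(W_P P) is a
# contraction, so two arbitrary orbits collapse onto the same point,
# the trivial fixed point with equal coordinates.
lam = 0.9 * 4.0 / max(2, N - 2)
P, Q = np.random.rand(N), np.random.rand(N)
for _ in range(500):
    P = sigmoid(positive_block(N) @ P, lam)
    Q = sigmoid(positive_block(N) @ Q, lam)
assert np.allclose(P, Q)
assert np.allclose(P, P[0])   # equal coordinates, as Lemma 1 predicts
```

The factor 0.9 keeps λ strictly below the bound; at the bound itself the contraction estimate is no longer strict and the argument breaks down.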

Dynamics of Positive Neurons
First we compute the Jacobian at the trivial fixed point. In general (except for the case N = 2), this fixed point is a function of λ and N. Let us denote the coordinates of the trivial fixed point by p*. Using the facts that f(W_P P*) = P* and that for the sigmoid function f′(x) = λ f(x)(1 − f(x)), the (i, j) entry of the Jacobian of the mapping f(W_P P) at this point is λ p*(1 − p*) w_ij. The whole Jacobian matrix evaluated at the trivial fixed point is therefore λ p*(1 − p*) W_P. Its eigenvalues are λ p*(1 − p*) times the eigenvalues of W_P: (2 − N)λ p*(1 − p*) and 2λ p*(1 − p*). As the value of λ increases, at a certain point the absolute value of the eigenvalue with the highest modulus reaches one, the trivial fixed point loses its global stability and a bifurcation occurs. The type of this bifurcation has a great effect on the further evolution and dynamics of the system. Based on the eigenvalues of W_P, we see that a Neimark-Sacker bifurcation cannot occur here, but saddle-node and period-doubling bifurcations do play an important role.

N = 2
Consider first the case when we have only two decision classes. The relations between the positive neurons can be seen in Figure 1. The weight matrix describing the connections is the following (subscript P refers to positive): W_P has 1s on the diagonal and −1s off the diagonal. It is easy to check that the point [0.5, 0.5]^T is always a fixed point of the mapping f(W_P P) = P, since W_P [0.5, 0.5]^T = [0, 0]^T and f(0) = 0.5. According to Theorem 2, if λ < 2, then it is the only fixed point; moreover, it is globally asymptotically stable, i.e., starting from any initial activation vector, the iteration will converge to this fixed point. The Jacobian of the mapping at this fixed point is (λ/4)W_P, and its eigenvalues are 0 and λ/2. When the eigenvalue λ/2 reaches 1 (λ = 2), a bifurcation occurs, giving birth to two new fixed points. In the following, we show that for every λ > 2 there are exactly three fixed points, namely [0.5, 0.5]^T, [x*, 1 − x*]^T and [1 − x*, x*]^T, where x* is a fixed point of a one-dimensional mapping described below. Let us assume that (x_1, x_2)^T is a fixed point of the mapping; then x_1 = f(x_1 − x_2) and x_2 = f(x_2 − x_1). Since f is the sigmoid function, we have f(−x) = 1 − f(x); consequently, x_2 = 1 − f(x_1 − x_2) = 1 − x_1. So for a fixed point the coordinates are (x_1, 1 − x_1). The first equation then leads to the following fixed point equation: x_1 = f(2x_1 − 1). (39) It means that the fixed points of the positive neurons can be determined by solving Equation (39). From the graphical point of view, it is easy to see that if λ ≤ 2, then it has exactly one solution (x_1 = 0.5), but if λ > 2, then there are three different solutions: 0.5, x* and 1 − x* (see Figure 4).
From the analytical viewpoint, we have to solve the equation x = f(2x − 1). Applying the inverse of f and rearranging the terms, we obtain 2x − (1/λ) ln(x/(1 − x)) = 1, i.e., s(x) = 1. As was pointed out in [37], if λ > 2, then the left-hand side has a local minimum at 1/2 − √((λ − 2)/(4λ)) with value less than one, and a local maximum at 1/2 + √((λ − 2)/(4λ)) with value greater than one. If λ ≤ 2, then the function is strictly monotone decreasing. Using the continuity of the function, we conclude that there are exactly three solutions for every λ > 2 and a unique solution if λ ≤ 2 (see Figure 5). Let us examine the basins of attraction of the three different fixed points, i.e., λ > 2 and the fixed points are [0.5, 0.5]^T, [x*, 1 − x*]^T and [1 − x*, x*]^T, with x* > 1/2. Consider a point (x_1, x_2)^T as the initial activation vector (see Figure 6).
• If x_1 = x_2, then the iteration leads to the fixed point (0.5, 0.5), since equal coordinates are preserved by the updating rule.
• If x_1 > x_2, then the iteration leads to (x*, 1 − x*); if x_1 < x_2, then it leads to (1 − x*, x*).
The size of a basin of attraction can be considered as the number of its points. In the strict mathematical sense this is infinity, of course. On the other hand, the basin of the fixed point (0.5, 0.5) is a one-dimensional object (a line segment), while the basins of (x*, 1 − x*) and (1 − x*, x*) are two-dimensional sets (triangles), so they are 'much bigger' sets.
In applications, we always work with a sometimes large, but finite number of points, based on the required and available precision. Let us define the level of granularity as the length of the subintervals obtained when we divide the unit interval into n equal parts. Then the division points are 0, 1/n, 2/n, . . . , 1, so we have n + 1 points. The basin of the fixed point (0.5, 0.5) contains n + 1 grid points (the diagonal), while the basins of the two other fixed points contain n(n + 1)/2 points each. By increasing the number of division points, the proportions of the basins tend to 0 and 1/2, as expected.
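These counts can be reproduced by brute force over the grid; a sketch (λ = 4 is an arbitrary choice satisfying λ > 2, and the iteration count is generous):

```python
import numpy as np

def sigmoid(x, lam):
    return 1.0 / (1.0 + np.exp(-lam * x))

W_P = np.array([[1.0, -1.0], [-1.0, 1.0]])

def limit_point(x1, x2, lam, iters=300):
    """Iterate the two positive neurons until (numerically) stabilized."""
    p = np.array([x1, x2])
    for _ in range(iters):
        p = sigmoid(W_P @ p, lam)
    return p

# Count basin sizes on an (n+1) x (n+1) grid for lam = 4 > 2.
n, lam = 20, 4.0
to_center = to_upper = to_lower = 0
for i in range(n + 1):
    for j in range(n + 1):
        p = limit_point(i / n, j / n, lam)
        if abs(p[0] - 0.5) < 1e-6:
            to_center += 1
        elif p[0] > 0.5:
            to_upper += 1
        else:
            to_lower += 1

assert to_center == n + 1                  # the diagonal x1 = x2
assert to_upper == to_lower == n * (n + 1) // 2
```

The proportions (n + 1)/(n + 1)^2 → 0 and n/(2(n + 1)) → 1/2 follow immediately from these counts.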
In a certain sense, this means that the fixed points (x*, 1 − x*)^T and (1 − x*, x*)^T are more important than the fixed point (0.5, 0.5)^T, since many more initial activation values lead to these points.

N = 3
The structure of the connections between positive neurons can be seen in Figure 7. In this case, the eigenvalues of W_P are −1 (with multiplicity one) and 2 (with multiplicity two). The fixed point with equal coordinates loses its global asymptotic stability when the absolute value of the largest eigenvalue of the Jacobian equals one. Since the positive eigenvalue has the higher absolute value, this bifurcation results in new fixed points. Nevertheless, this eigenvalue has multiplicity two, so it is not a simple bifurcation, i.e., not only a single pair of new fixed points arises, but several new fixed points. The trivial fixed point becomes a saddle point, i.e., it attracts points in certain directions, but repels them in others. If we further increase the value of the parameter λ, then the absolute value of the negative eigenvalue reaches one and the trivial fixed point undergoes a bifurcation again. Since the eigenvalue is −1, this is a period-doubling bifurcation, giving birth to a two-period limit cycle.
We show that there are three types of fixed points:
• The trivial fixed point with equal coordinates (FP_0);
• Fixed points with one high and two low values (FP_1);
• Fixed points with one low and two medium coordinates (FP_2).
The existence of FP_0 is clear, as was shown by Lemma 1. As was pointed out in Section 4, the non-trivial fixed points have two different coordinate values. Let us denote these values by x, x and y. Then the fixed point equation (x, x, y)^T = f(W_P (x, x, y)^T) simplifies to the following system of equations: x = f(−y), y = f(y − 2x). By substituting x = 1 − f(y), we have y = f(y + 2 f(y) − 2). This is again a fixed point equation, whose number of solutions depends on the value of the parameter λ (see Figure 8):
• if λ ≤ 2.2857 (rounded), then there is exactly one solution; it corresponds to the trivial fixed point (FP_0);
• if λ > 2.2857, then there are three different solutions: one corresponds to FP_0, one to the FP_1s and one to the FP_2s.
Using the values of y, we can determine the values of x. Furthermore, if y is high, then based on the equation x = f(−y) = 1 − f(y), we may conclude that x is low. Similarly, if y is low, then x is medium. Finally, there are seven fixed points: FP 0 , three of type FP 1 and three of type FP 2 . Initial activation values lead to these fixed points as follows:
• if P 1 (0) = P 2 (0) = P 3 (0), then the iteration converges to FP 0 ;
• if there is a unique maximal value among the initial activation values, then the iteration converges to FP 1 ;
• if exactly two of the initial activation values share the maximum, then the iteration converges to FP 2 .
Ranking is preserved between positive neurons, in the sense that if P x (0) ≥ P y (0), then P x * ≥ P y *. Since the number of possible outcomes is very limited (only three cases without permutations), some differences in the initial activation values will be magnified: for example, if the initial activation vector is (0.3, 0.2, 0.1), then the iteration converges to (0.9926, 0.0069, 0.0069). On the other hand, some large differences will be hidden: the initial activation vector (1, 0.95, 0) leads again to (0.9926, 0.0069, 0.0069).
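Both the magnification and the hiding of initial differences can be reproduced by iterating the map directly. A minimal sketch with λ = 5, assuming the weight matrix with ones on the diagonal and −1 elsewhere:

```python
import numpy as np

def iterate_frcn_positive(p0, lam=5.0, steps=300):
    """Iterate the positive-neuron subnetwork p <- f(W_P p)
    (assumed connection matrix: self-loop 1, mutual weight -1)."""
    n = len(p0)
    W = -np.ones((n, n)) + 2.0 * np.eye(n)
    p = np.asarray(p0, dtype=float)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-lam * (W @ p)))
    return p

print(iterate_frcn_positive([0.3, 0.2, 0.1]).round(4))   # small gaps magnified
print(iterate_frcn_positive([1.0, 0.95, 0.0]).round(4))  # large gaps hidden: same fixed point
```

Both initial vectors end in the same fixed point (0.9926, 0.0069, 0.0069), as stated above.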
A limit cycle occurs when the negative eigenvalue of the Jacobian computed at the trivial fixed point reaches −1 (at about λ = 5.8695). Similarly to the trivial fixed point, the elements of the limit cycle have equal coordinates. Let us denote these points by (x 1 , x 1 , x 1 ) and (x 2 , x 2 , x 2 ). The members of a two-period limit cycle are fixed points of the double-iterated function.
In coordinate-wise form this provides the following system of equations:
x 1 = f(−x 2 ),
x 2 = f(−x 1 ),
from which we have
x = f(−f(−x)).
If λ ≤ 5.8695, then it has a unique solution, which refers to the trivial fixed point, since this point is a fixed point of the double-iterated function, too. If λ > 5.8695, then there are two other solutions, a low and a medium one; these are the coordinates of the two-period limit cycle. For example, for λ = 7, these points are 0.0563 and 0.4027.
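The two-period limit cycle can be observed by iterating the scalar map x → f(−x), which governs the equal-coordinate vectors for N = 3 (a sketch assuming the standard sigmoid, with λ = 7 as in the example above):

```python
import math

lam = 7.0
f = lambda z: 1.0 / (1.0 + math.exp(-lam * z))

x = 0.3                      # any generic equal-coordinate initial value
for _ in range(1000):
    x = f(-x)                # equal-coordinate dynamics for N = 3

cycle = sorted([x, f(-x)])   # the two alternating values
print([round(v, 4) for v in cycle])  # -> [0.0563, 0.4027]
```

The iteration settles into the alternation between 0.0563 and 0.4027 quoted above.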
In the general case, basins of attraction of a dynamical system are difficult to determine, and sometimes finding them analytically is not feasible [41,42]; it is enough to mention the famous graph of the basins of attraction of Newton's method [43]. We examined the basins of attraction of the fixed points by putting an equally spaced grid on the set of possible initial values of the positive neurons P 1 , P 2 and P 3 , and applied the grid points as initial activation values. Table 1 shows the sizes of the basins of attraction for different granularity. Results are visualized in Figures 9 and 10.

N ≥ 4
If the FRCN has N = 4 decision classes, then the eigenvalues of the Jacobian at the trivial fixed point have the same magnitude, but different signs. Thus the positive and negative eigenvalues reach one (in absolute value) at the same value of parameter λ (see Figure 11), causing the appearance of new fixed points and a limit cycle simultaneously. The trivial fixed point is no longer an attractor, but fixed points with the patterns one high, three low and two medium, two low values do exist. If N ≥ 5, then the absolute value of the negative eigenvalue of the Jacobian evaluated at the trivial fixed point is higher than the positive one; consequently, a period-doubling bifurcation occurs first and the trivial fixed point loses its attractiveness. We should note that the occurrence of fixed points of types FP 1 and FP 2 is not linked to the other (positive) eigenvalue, since they occur earlier, for a smaller value of λ.
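The sign and magnitude pattern of the eigenvalues as a function of N can be checked directly; the sketch again assumes W_P with ones on the diagonal and −1 elsewhere, which reproduces the eigenvalues −1 and 2 quoted for N = 3. (Since the Jacobian at the trivial fixed point is a positive scalar multiple of W_P, the magnitude comparison carries over.)

```python
import numpy as np

def wp_eigs(N):
    """Sorted eigenvalues of the assumed positive-neuron matrix W_P."""
    W = -np.ones((N, N)) + 2.0 * np.eye(N)
    return np.sort(np.linalg.eigvalsh(W))

for N in range(2, 7):
    ev = wp_eigs(N)
    # smallest eigenvalue is 2 - N; largest is 2 with multiplicity N - 1
    print(N, round(ev[0], 6), round(ev[-1], 6))
```

For N = 4 the extreme eigenvalues are −2 and 2 (equal magnitude, different signs), while for N ≥ 5 the negative eigenvalue 2 − N dominates in absolute value.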
For the general case (N decision classes), there exist two types of fixed points with the following patterns: one high, N − 1 low values and two medium, N − 2 low values. The fixed point equations for the one high (x 1 ), N − 1 low (x 2 ) pattern are
x 1 = f(x 1 − (N − 1)x 2 ),
x 2 = f(−x 1 − (N − 3)x 2 ),
which lead, by eliminating x 1 , to a one-dimensional fixed point problem. The pattern two medium (x 1 ), N − 2 low (x 2 ) values leads to the following equations:
x 1 = f(−(N − 2)x 2 ),
x 2 = f(−2x 1 − (N − 4)x 2 ),
from which we get the one-dimensional fixed point problem
x 2 = f(−2 f(−(N − 2)x 2 ) − (N − 4)x 2 ).
Nevertheless, these fixed points are less important for multiple decision classes. Finally, we provide a geometrical reasoning about the structure of the fixed points. Consider two fixed points of type FP 1 , i.e., they have one high and N − 1 low coordinates. Their basins of attraction are separated by a set whose points belong to neither of them, but lie on an (N − 1)-dimensional hyperplane 'between' them. Without loss of generality, we may assume that one fixed point is P * 1 = (α, β, . . . , β) and the other one is P * 2 = (β, α, β, . . . , β). Because of symmetry, the hyperplane is perpendicular to the line connecting P * 1 and P * 2 , i.e., its normal vector is parallel to −−→ P * 1 P * 2 = (β − α, α − β, 0, . . . , 0). Additionally, the hyperplane crosses the line at the midpoint of P * 1 and P * 2 , which has coordinates ((α + β)/2, (α + β)/2, β, . . . , β). Consequently, the equation of the separating hyperplane is x 1 = x 2 . The separating set is a subset of this plane with the additional constraint x i < x 1 for every i ≠ 1, 2. Consequently, a fixed point of type FP 2 has two medium coordinates with equal values (x 1 = x 2 ) and N − 2 equal, but low coordinates. Since there are N fixed points of type FP 1 , there are (N choose 2) = N(N − 1)/2 fixed points of type FP 2 .
Simulation results show that, for N ≥ 4, limit cycles dominate the dynamics. A limit cycle oscillates between two activation vectors with equal coordinates (the equality of the coordinates is an immediate consequence of symmetry). Let us denote these points by (x 1 , . . . , x 1 ) and (x 2 , . . . , x 2 ). The members of a two-period limit cycle are fixed points of the double-iterated function.
In coordinate-wise form this gives the following system of equations:
x 1 = f(−(N − 2)x 2 ),
x 2 = f(−(N − 2)x 1 ),
from which we have
x = f(−(N − 2) f(−(N − 2)x)).
For λ values generally applied in fuzzy cognitive maps, and for λ = 5 used in FRCNs, this fixed point equation has three solutions: one refers to the trivial fixed point, which is no longer a fixed point attractor; the other two (low and medium) are the coordinates of the elements of the limit cycle (see Figure 12). Simulation results show that by increasing the number of decision classes, more and more initial values arrive at a limit cycle. We put an equally spaced N-dimensional grid on the set [0, 1]^N with step sizes 0.5, 0.25, 0.2 and 0.1, then applied the grid points as initial activation values of the positive neurons. Since any particular real-life dataset finally turns into initial activation values for the FRCN model, these grid points can be viewed as representations of possible datasets, up to the predefined precision (i.e., step size). The iteration was stopped when convergence or a limit cycle was detected, or when the predefined maximum number of steps was reached. As we can observe in Table 2, most of the initial values finally arrive at a limit cycle. Table 2. Number of points in the basin of attraction as a percentage of the total number of points, for different numbers of classes (N) and levels of granularity, λ = 5. FP 1 refers to fixed points with one high and N − 1 low values, while FP 2 refers to fixed points with two medium and N − 2 low values; LC stands for limit cycle.
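The grid experiment for general N can be sketched as below (shown for N = 4, λ = 5, step size 0.25, with the assumed weight matrix of ones on the diagonal and −1 elsewhere). An orbit is labelled a limit cycle when one further step still moves it after many iterations:

```python
import itertools
import numpy as np

lam, N = 5.0, 4
W = -np.ones((N, N)) + 2.0 * np.eye(N)   # assumed connection matrix

def classify(P, steps=1000, tol=1e-6):
    """Label each row of P as 'FP' (settles to a fixed point) or
    'LC' (settles to a two-period limit cycle)."""
    P = np.atleast_2d(np.asarray(P, dtype=float))
    for _ in range(steps):
        P = 1.0 / (1.0 + np.exp(-lam * (P @ W.T)))
    Q = 1.0 / (1.0 + np.exp(-lam * (P @ W.T)))   # one more step
    osc = np.abs(Q - P).max(axis=1) >= tol
    return np.where(osc, "LC", "FP")

grid = np.arange(0.0, 1.0 + 1e-9, 0.25)
labels = classify(list(itertools.product(grid, repeat=N)))
print((labels == "LC").mean())  # fraction of grid points ending in a limit cycle
```

For instance, an equal-coordinate start such as (0.3, 0.3, 0.3, 0.3) oscillates, while a start with a dominant coordinate such as (0.9, 0.1, 0.1, 0.1) still converges to an FP 1 -type fixed point.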

Relation to Decision
It has been proved previously that the values of negative and boundary neurons converge to the same value (≈ 0.9930 for λ = 5), and the dynamical behaviour of positive neurons was analyzed in the preceding sections. Now we examine their effect on the decision neurons and, in this way, on the final decision.
Decision neurons have only input values; they influence neither each other nor other types of neurons. As a consequence, the sigmoid transfer function f(x) only transforms their values into the (0, 1) interval, but does not change the order of the values (with respect to the ordering relation '≤'), since f(x) is strictly monotonically increasing. Before analyzing the effects of the results of the previous sections, we briefly summarize the conclusion of [37]: assuming that the activation values of positive neurons reach a stable state, the authors concluded that negative neurons have no influence on FRCNs' performance, but the ranking of the positive neurons' activation values and the number of boundary neurons connected to each decision neuron have a high impact. Based on the previous sections, below we add some more insights to this result.
If the positive neurons reach a stable state (fixed point), then this stable state has either the pattern one high and N − 1 low values (FP 1 ) or two medium and N − 2 low values (FP 2 ); the trivial fixed point with equal coordinates (FP 0 ) plays a role only for 2 and 3 decision classes. These values are unique and completely determined by the parameter λ and the number of decision classes N. It means that the number of possible final states is very limited. This fact was mentioned in the case of N = 3 decision classes, but it is valid for every N ≥ 3. Namely, small differences between the initial activation values can be magnified by the exploitation phase. Almost equal initial activation values with a proper maximum lead to the pattern of one high and N − 1 low values, resulting in something like a winner-takes-all rule. Although the runner-up has only a slightly smaller initial value, after reaching the stable state it needs the same number of boundary connections to overcome the winner as the one with a very low initial value needs.
If the maximal number of iterations is reached without convergence (i.e., the activation value vector oscillates in a limit cycle), then the iteration is stopped and the last activation vector is taken. It has either a (low, . . . , low) or a (medium, . . . , medium) pattern with equal coordinates. In this case, the positive neurons have absolutely no effect on the final decision. The classification goes to the neuron with the highest number of boundary connections, regardless of the small or large differences between the initial activation values of the positive neurons.

Conclusions and Future Work
The behaviour of fuzzy-rough cognitive networks was studied by applying the theory of discrete dynamical systems and their bifurcations. The dynamics of negative and boundary neurons was fully discussed in the literature, so we focused on the behaviour of positive neurons. It was pointed out that the number of fixed points is very limited and their coordinate values follow a specific pattern (FP 0 , FP 1 , FP 2 ). Additionally, it was proved that when the number of decision classes is greater than three, limit cycles unavoidably occur, rendering the recurrent reasoning inconclusive. Simulations show that the proportion of initial activation values leading to limit cycles increases with the number of decision classes, and the vast majority of scenarios lead to oscillation. In this case, the decision relies totally on the number of boundary neurons connected to each decision neuron, regardless of the initial activation values of the positive neurons.
The method applied in this paper may be followed in the analysis of other FCM-like models. As we have seen, if the parameter of the sigmoid threshold function is small enough, then an FCM has one and only one fixed point, which is globally asymptotically stable. If we increase the value of the parameter, then a fixed-point bifurcation occurs, causing an entirely different dynamical behaviour. If the weight matrix has a nice structure, as in the case of the positive neurons, then there is a chance to find the unique fixed point in a simple form or as a limit of a lower-dimensional iteration, and to determine the parameter value at the bifurcation point. Similarly, based on the eigenvalues of the Jacobian evaluated at this fixed point, we can determine the type of the bifurcation. Nevertheless, general FCMs have no well-structured weight matrices, since the weights are usually determined by human experts or learning methods. This imposes some limitations on the generalization of the applied method: theoretically, we can find the unique fixed point and the bifurcation point, but this task is much more difficult for a general weight matrix.
Another exciting and important research direction is the possible generalization of the results to the extensions of fuzzy cognitive maps. Some well-known extensions are fuzzy grey cognitive maps (FGCMs) [44], interval-valued fuzzy cognitive maps (IVFCMs) [45], intuitionistic fuzzy cognitive maps (IFCMs) [46,47], temporal IFCMs [48], the combination of fuzzy grey and intuitionistic FCMs [49], and interval-valued intuitionistic fuzzy cognitive maps (IVIFCMs) [50]. This future work probably requires a deep mathematical inspection of interval-valued dynamical systems and may lead to several new theoretical and practical results on interval-valued cognitive networks as well.

Data Availability Statement:
No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest:
The author declares no conflict of interest.