
# Series of Semihypergroups of Time-Varying Artificial Neurons and Related Hyperstructures

1
Department of Mathematics, Faculty of Electrical Engineering and Communication, Brno University of Technology, Technická 8, 616 00 Brno, Czech Republic
2
Department of Quantitative Methods, University of Defence in Brno, Kounicova 65, 662 10 Brno, Czech Republic
*
Author to whom correspondence should be addressed.
Symmetry 2019, 11(7), 927; https://doi.org/10.3390/sym11070927
Received: 30 May 2019 / Revised: 28 June 2019 / Accepted: 2 July 2019 / Published: 16 July 2019

## Abstract

Detailed analysis of the function of the multilayer perceptron (MLP) and its neurons, together with the use of time-varying neurons, allowed the authors to find an analogy with the use of structures of linear differential operators. This procedure allowed the construction of a group and a hypergroup of artificial neurons. In this article, focusing on semihyperstructures and using the above described procedure, the authors bring new insights into structures and hyperstructures of artificial neurons and their possible symmetric relations.

## 1. Introduction

As mentioned in the PhD thesis , neurons are the atoms of neural computation. All neural networks are built up from these simple computational units. The output computed by a neuron can be expressed using two functions, $y = g(f(w, x))$. The computation consists of several steps: in the first step the input to the neuron, $x := \{x_i\}$, is combined with the weights of the neuron, $w := \{w_i\}$, by the so-called propagation function $f$. This can be thought of as computing the activation potential from the pre-synaptic activities. From that result the so-called activation function $g$ then computes the output of the neuron. The weights, which mimic synaptic strength, constitute the adjustable internal parameters of the neuron. The process of adapting the weights is called learning [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18].
From the biological point of view it is appropriate to use an integrative propagation function. A convenient choice is therefore the weighted sum of the inputs, $f(w, x) = \sum_i w_i x_i$, i.e., the activation potential equals the scalar product of inputs and weights. This has in fact been the most popular propagation function since the dawn of neural computation. However, it is often used in a slightly different form:
$f(w, x) = \sum_i w_i x_i + \Theta.$
The special weight $\Theta$ is called the bias. Applying $\Theta(x) = 1$ for $x > 0$ and $\Theta(x) = 0$ for $x < 0$ as the activation function yields Rosenblatt's famous perceptron. In that case the function $\Theta$ works as a threshold.
Let $F: \mathbb{R} \to \mathbb{R}$ be a general non-linear (or piecewise linear) transfer function. Then the action of a neuron can be expressed by
$y(k) = F\left(\sum_{i=0}^{m} w_i(k)\, x_i(k) + b\right),$
where $x_i(k)$ is the input value at discrete time $k$, $i = 0, \dots, m$, $w_i(k)$ is the weight value at discrete time $k$, $b$ is the bias, and $y(k)$ is the output value at discrete time $k$.
Notice that in some very special cases the transfer function $F$ can also be linear. The transfer function defines the properties of the artificial neuron and can be any mathematical function. Usually it is chosen on the basis of the problem that the artificial neuron (artificial neural network) needs to solve, and in most cases it is taken (as mentioned above) from the following set of functions: the step function, a linear function, and a non-linear (sigmoid) function [1,2,5,7,9,12,16,19].
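The behaviour just described can be sketched in a few lines of Python (a minimal illustration only; the function names and the plain-Python style are ours, not the paper's):

```python
import math

def neuron(x, w, b, transfer):
    """Discrete-time artificial neuron: y = F(sum_i w_i * x_i + b)."""
    activation = sum(wi * xi for wi, xi in zip(w, x)) + b   # propagation function
    return transfer(activation)                             # transfer (activation) function

# The three usual transfer-function choices mentioned in the text.
step    = lambda a: 1.0 if a > 0 else 0.0        # threshold: Rosenblatt's perceptron
linear  = lambda a: a                            # identity
sigmoid = lambda a: 1.0 / (1.0 + math.exp(-a))   # smooth non-linear squashing

print(neuron([1.0, 2.0], [0.5, -0.25], 0.1, step))  # 1.0
```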
In what follows we will consider a certain generalization of the classical artificial neurons mentioned above, such that the inputs $x_i$ and weights $w_i$ are functions of an argument $t$ belonging to a linearly ordered (tempus) set $T$ with least element $0$. As the index set we use the set $C(J)$ of all continuous functions defined on an open interval $J \subset \mathbb{R}$. So, denote by $W$ the set of all non-negative functions $w: T \to \mathbb{R}$, forming a subsemiring of the ring of all real functions of one real variable $x: \mathbb{R} \to \mathbb{R}$. Denote by $Ne(\vec{w}_r) = Ne(w_{r1}, \dots, w_{rn})$ for $r \in C(J)$, $n \in \mathbb{N}$ the mapping
$y_r(t) = \sum_{k=1}^{n} w_{r,k}(t)\, x_{r,k}(t) + b_r,$
which will be called the artificial neuron with bias $b_r \in \mathbb{R}$. By $\mathrm{AN}(T)$ we denote the collection of all such artificial neurons.
Neurons are usually denoted by capital letters $X, Y$ or $X_i, Y_i$; nevertheless, we also use the notation $Ne(\vec{w})$, where $\vec{w} = (w_1, \dots, w_n)$ is the vector of weights [20,21,22].
For the sake of simplicity we suppose that the transfer functions (activation functions) $\varphi, \sigma$ (or $f$) are the same for all neurons from the collection $\mathrm{AN}(T)$, and that this role is played by the identity function $f(y) = y$.
Feedforward multilayer networks are architectures where the neurons are assembled into layers and the links between the layers go in one direction only, from the input layer to the output layer. There are no links between neurons in the same layer. There may be one or several hidden layers between the input and the output layer [5,9,16].
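A feedforward pass through such a layered architecture can be sketched as follows (a minimal illustration of the layer-by-layer computation; the helper names are our own):

```python
def layer(inputs, weights, biases, transfer):
    """One fully connected layer: each neuron sees all outputs of the previous layer."""
    return [transfer(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

def mlp(x, layers, transfer=lambda a: a):
    """Feedforward pass: links go only from one layer to the next."""
    for weights, biases in layers:
        x = layer(x, weights, biases, transfer)
    return x

# 2 inputs -> hidden layer of 2 neurons -> 1 output neuron, identity transfer.
net = [([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),   # hidden layer (identity weights)
       ([[1.0, 1.0]], [0.5])]                     # output layer
print(mlp([2.0, 3.0], net))  # [5.5]
```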

## 2. Preliminaries on Hyperstructures

From an algebraic point of view, it is useful to describe the terms and concepts used in the field of algebraic structures. A hypergroupoid is a pair $( H , · )$, where H is a (nonempty) set and
$· : H × H → P * ( H ) ( = P ( H ) − { ∅ } )$
is a binary hyperoperation on the set $H$. If $a \cdot (b \cdot c) = (a \cdot b) \cdot c$ for all $a, b, c \in H$ (the associativity axiom), then the hypergroupoid $(H, \cdot)$ is called a semihypergroup. A semihypergroup is said to be a hypergroup if the following axiom:
$a · H = H = H · a$
for all $a \in H$ (the reproduction axiom) is satisfied. Here, for sets $A, B \subseteq H$, $A \neq \emptyset \neq B$, we define as usual
$A · B = ⋃ { a · b ; a ∈ A , b ∈ B } .$
Thus, hypergroups considered in this paper are hypergroups in the sense of F. Marty [23,24]. In some constructions it is useful to apply the following lemma (also called the Ends lemma, which has many applications; cf. [25,26,27,28,29]). Recall first that by a (quasi-)ordered semigroup we mean a triad $(S, \cdot, \le)$, where $(S, \cdot)$ is a semigroup, $(S, \le)$ is a (quasi-)ordered set, i.e., a set $S$ endowed with a reflexive and transitive binary relation "≤", and for all triads of elements $a, b, c \in S$ the implication $a \le b \Rightarrow a \cdot c \le b \cdot c,\ c \cdot a \le c \cdot b$ holds.
Lemma 1 (Ends-Lemma).
Let $( S , · , ≤ )$ be a (quasi-)ordered semigroup. Define a binary hyperoperation
$*: S \times S \to P^*(S)$ by $a * b = \{x \in S;\ a \cdot b \le x\}.$
Then $( S , ∗ )$ is a semihypergroup. Moreover, if the semigroup $( S , · )$ is commutative, then the semihypergroup $( S , ∗ )$ is also commutative and if $( S , · , ≤ )$ is a (quasi-)ordered group then the semihypergroup $( S , ∗ )$ is a hypergroup.
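The lemma can be checked directly on a small finite example. The sketch below (our own illustration) takes the ordered commutative semigroup $(\{0, \dots, 4\}, \max, \le)$ and verifies the associativity of the induced hyperoperation $a * b = \{x;\ \max(a, b) \le x\}$:

```python
from itertools import product

S = range(5)
op = max                          # (S, max, <=) is an ordered semigroup

def hstar(a, b):
    """EL-hyperoperation: a * b = {x in S ; a.b <= x}."""
    return {x for x in S if op(a, b) <= x}

def hstar_set(A, B):
    """Extension of * to sets: A * B = union of a * b over a in A, b in B."""
    return {x for a in A for b in B for x in hstar(a, b)}

# Associativity a * (b * c) == (a * b) * c, checked over all 125 triads.
assert all(hstar_set({a}, hstar(b, c)) == hstar_set(hstar(a, b), {c})
           for a, b, c in product(S, S, S))
```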
Notice that if $(G, \cdot), (H, \cdot)$ are (semi-)hypergroups, then a mapping $h: G \to H$ is said to be a homomorphism of $(G, \cdot)$ into $(H, \cdot)$ if for any pair $a, b \in G$ we have
$h ( a · b ) ⊆ h ( a ) · h ( b ) .$
If for any pair $a, b \in G$ the equality $h(a \cdot b) = h(a) \cdot h(b)$ holds, the homomorphism $h$ is called a good (or strong) homomorphism; cf. [30,31]. By $\mathrm{End}\, G$ we denote the endomorphism monoid of a semigroup (group) $G$.
Linear differential operators described in the article and used, e.g., in [29,42] are of the following form:
Definition 1.
Let $J ⊆ R$ be an open interval, $C ( J )$ be the ring of all continuous functions $φ : J → R .$ For $p k ∈ C ( J ) , k = 0 , ⋯ , n − 1 , p 0 ≠ 0$ we define
$L ( p n − 1 , ⋯ , p 0 ) y ( x ) = y ( n ) ( x ) + ∑ k = 0 n − 1 p k ( x ) y ( k ) ( x ) , y ∈ C n ( J )$
(the ring of all functions smooth up to order $n$, i.e., having continuous derivatives up to order $n$ on the interval $J \subseteq \mathbb{R}$).
Definition 2
([41,49]). Let $(G, \cdot)$ be a semigroup and $P \subset G$, $P \neq \emptyset$. A hyperoperation $*_P: G \times G \to P(G)$ defined by $[x, y] \mapsto xPy$, i.e., $x *_P y = xPy$ for any pair $[x, y] \in G \times G$, is said to be the $P$-hyperoperation in $G$. If
$x ∗ P ( y ∗ P z ) = x P y P z = ( x ∗ P y ) ∗ P z$
holds for any triad $x, y, z \in G$, the $P$-hyperoperation is associative. If the axiom of reproduction is also satisfied, the hypergroupoid $(G, *_P)$ is said to be a $P$-hypergroup.
Evidently, if $(G, \cdot)$ is a group, then $(G, *_P)$ is a $P$-hypergroup. If the set $P$ is a singleton, then the $P$-operation $*_P$ is a usual single-valued operation.
Definition 3.
A subset $H ⊂ G$ is said to be a sub-P-hypergroup of $( G , ∗ P )$ if $P ⊂ H ⊂ G$ and $( H , * P )$ is a hypergroup.
Now, similarly to the case of the collection of linear differential operators , we will construct a group and a hypergroup of artificial neurons, cf. [29,32,42,43,44].
Denote by $\delta_{ij}$ the Kronecker delta, $i, j \in \mathbb{N}$, i.e., $\delta_{ii} = \delta_{jj} = 1$ and $\delta_{ij} = 0$ whenever $i \neq j$.
Suppose $Ne(\vec{w}_r), Ne(\vec{w}_s) \in \mathrm{AN}(T)$, $r, s \in C(J)$, $\vec{w}_r = (w_{r,1}, \dots, w_{r,n})$, $\vec{w}_s = (w_{s,1}, \dots, w_{s,n})$, $n \in \mathbb{N}$. Let $m \in \mathbb{N}$, $1 \le m \le n$, be such an integer that $w_{r,m} > 0$. We define
$N e ( w → r ) · m N e ( w → s ) = N e ( w → u ) ,$
where
$w → u = ( w u , 1 , ⋯ , w u , n ) = ( w u , 1 ( t ) , ⋯ , w u , n ( t ) ) ,$
$w_{u,k}(t) = w_{r,m}(t)\, w_{s,k}(t) + (1 - \delta_{m,k})\, w_{r,k}(t), \quad t \in T,$
and, of course, the neuron $Ne(\vec{w}_u)$ is defined as the mapping $y_u(t) = \sum_{k=1}^{n} w_{u,k}(t)\, x_k(t) + b_u$, $t \in T$, with $b_u = b_r b_s$. Further, for a pair $Ne(\vec{w}_r), Ne(\vec{w}_s)$ of neurons from $\mathrm{AN}(T)$ with $\vec{w}_r = (w_{r,1}(t), \dots, w_{r,n}(t))$, $\vec{w}_s = (w_{s,1}(t), \dots, w_{s,n}(t))$ and with the same bias, we put $Ne(\vec{w}_r) \le_m Ne(\vec{w}_s)$ if $w_{r,k}(t) \le w_{s,k}(t)$ for $k \in \mathbb{N}$, $k \neq m$, and $w_{r,m}(t) = w_{s,m}(t)$, $t \in T$. Evidently $(\mathrm{AN}(T), \le_m)$ is an ordered set. A relationship (compatibility) of the binary operation "$\cdot_m$" and the ordering $\le_m$ on $\mathrm{AN}(T)$ is given by the following assertion, analogous to Lemma 2 in .
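With constant weight functions, the operation "$\cdot_m$" reduces to plain arithmetic on weight vectors. The following sketch (our own; constant weights represented as tuples) implements $w_{u,k} = w_{r,m} w_{s,k} + (1 - \delta_{m,k}) w_{r,k}$ together with the product of biases:

```python
def compose(wr, br, ws, bs, m):
    """Product Ne(wr) ._m Ne(ws) on constant weight vectors (m is 1-based)."""
    i = m - 1                                     # 0-based index of the chosen weight
    wu = tuple(wr[i] * ws[k] + (0 if k == i else wr[k])
               for k in range(len(wr)))
    return wu, br * bs                            # bias of the product is br * bs

wu, bu = compose((2.0, 3.0), 1.0, (5.0, 7.0), 2.0, m=1)
print(wu, bu)  # (10.0, 17.0) 2.0
```

Note that the $m$-th coordinate of the product is $w_{r,m} w_{s,m}$, which stays positive whenever both factors do, so the operation does not leave the considered set of neurons.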
Lemma 2.
The triad $( AN ( T ) , · m , ≤ m )$ (algebraic structure with an ordering) is a non-commutative ordered group.
A sketch of the proof was published in .
Denoting
$AN 1 ( T ) m = { N e ( w → ) ; w → = ( w 1 , ⋯ , w n ) , w k ∈ C ( T ) , k = 1 , ⋯ , n , w m ( t ) ≡ 1 } , 1 ≤ m ≤ n ,$
we get the following assertion:
Proposition 1
(Prop. 1. , p. 239). Let $T = 〈 0 , t 0 ) ⊂ R , t 0 ∈ R ∪ { ∞ }$. Then for any positive integer $n ∈ N , n ≥ 2$ and for any integer m such that $1 ≤ m ≤ n$ the semigroup $( AN 1 ( T ) m , · m )$ is an invariant subgroup of the group $( AN ( T ) m , · m ) .$
Proposition 2
(Prop. 2. , p. 240). Let $t_0 \in \mathbb{R}$, $t_0 > 0$, $T = \langle 0, t_0) \subset \mathbb{R}$ and let $m, n \in \mathbb{N}$ be integers such that $1 \le m \le n - 1$. Define a mapping $F: \mathrm{AN}_n(T)_m \to \mathrm{LA}_n(T)_{m+1}$ by this rule: for an arbitrary neuron $Ne(\vec{w}_r) \in \mathrm{AN}_n(T)_m$, where $\vec{w}_r = (w_{r,1}(t), \dots, w_{r,n}(t)) \in [C(T)]^n$, we put $F(Ne(\vec{w}_r)) = L(w_{r,1}, \dots, w_{r,n}) \in \mathrm{LA}_n(T)_{m+1}$ with the action:
$L ( w r , 1 , ⋯ , w r , n ) y ( t ) = d n y ( t ) d t n + ∑ k = 1 n w r , k ( t ) d k − 1 y ( t ) d t k − 1 , y ∈ C n ( T ) .$
Then the mapping $F : AN n ( T ) m → LA n ( T ) m + 1$ is a homomorphism of the group $( AN n ( T ) m , · m )$ into the group $( LA n ( T ) m + 1 , ∘ m + 1 ) .$
Now, using the construction described in Lemma 1, we obtain the final transposition hypergroup (also called a non-commutative join space). Denote by $P(\mathrm{AN}(T)_m)^*$ the power set of $\mathrm{AN}(T)_m$ consisting of all nonempty subsets of the latter set and define a binary hyperoperation
$* m : AN ( T ) m × AN ( T ) m → P ( AN ( T ) m ) *$
by the rule
$N e ( w → r ) * m N e ( w → s ) = { N e ( w → u ) ; N e ( w → r ) · m N e ( w → s ) ≤ m N e ( w → u ) }$
for all pairs $Ne(\vec{w}_r), Ne(\vec{w}_s) \in \mathrm{AN}(T)_m$. In more detail, if $\vec{w}_u = (w_{u,1}, \dots, w_{u,n})$, $\vec{w}_r = (w_{r,1}, \dots, w_{r,n})$, $\vec{w}_s = (w_{s,1}, \dots, w_{s,n})$, then $w_{r,m}(t)\, w_{s,m}(t) = w_{u,m}(t)$ and $w_{r,m}(t)\, w_{s,k}(t) + w_{r,k}(t) \le w_{u,k}(t)$ if $k \neq m$, $t \in T$. Then we have that $(\mathrm{AN}(T)_m, *_m)$ is a non-commutative hypergroup. We say that this hypergroup is constructed by using the Ends lemma (cf. e.g., [8,25,29]); such hypergroups are called EL-hypergroups. The above defined invariant (also called normal) subgroup $(\mathrm{AN}_1(T)_m, \cdot_m)$ of the group $(\mathrm{AN}(T)_m, \cdot_m)$ is the carrier set of a subhypergroup of the hypergroup $(\mathrm{AN}(T)_m, *_m)$, and it has certain significant properties.
Using a certain generalization of methods from  (p. 283) and investigating the constructed structures, we obtain the following result:
Theorem 1.
Let $T = \langle 0, t_0) \subset \mathbb{R}$, $t_0 \in \mathbb{R} \cup \{\infty\}$. Then for any positive integer $n \in \mathbb{N}$, $n \ge 2$, and for any integer $m$ such that $1 \le m \le n$, the hypergroup $(\mathrm{AN}(T)_m, *_m)$, where
$AN ( T ) m = { N e ( w → r ) ; w → r = ( w r , 1 ( t ) , ⋯ , w r , n ( t ) ) ∈ [ C ( T ) ] n , w r , m ( t ) > 0 , t ∈ T } ,$
is a transposition hypergroup (i.e., a non-commutative join space) such that $(\mathrm{AN}_1(T)_m, *_m)$ is its subhypergroup, which is
-
invertible (i.e., $Ne(\vec{w}_r)/Ne(\vec{w}_s) \cap \mathrm{AN}_1(T)_m \neq \emptyset$ implies $Ne(\vec{w}_s)/Ne(\vec{w}_r) \cap \mathrm{AN}_1(T)_m \neq \emptyset$, and $Ne(\vec{w}_r) \backslash Ne(\vec{w}_s) \cap \mathrm{AN}_1(T)_m \neq \emptyset$ implies $Ne(\vec{w}_s) \backslash Ne(\vec{w}_r) \cap \mathrm{AN}_1(T)_m \neq \emptyset$, for all pairs of neurons $Ne(\vec{w}_r), Ne(\vec{w}_s) \in \mathrm{AN}(T)_m$),
-
closed (i.e., $Ne(\vec{w}_r)/Ne(\vec{w}_s) \subset \mathrm{AN}_1(T)_m$ and $Ne(\vec{w}_r) \backslash Ne(\vec{w}_s) \subset \mathrm{AN}_1(T)_m$ for all pairs $Ne(\vec{w}_r), Ne(\vec{w}_s) \in \mathrm{AN}_1(T)_m$),
-
reflexive (i.e., $Ne(\vec{w}_r) \backslash \mathrm{AN}_1(T)_m = \mathrm{AN}_1(T)_m / Ne(\vec{w}_r)$ for any neuron $Ne(\vec{w}_r) \in \mathrm{AN}(T)_m$), and
-
normal (i.e., $Ne(\vec{w}_r) * \mathrm{AN}_1(T)_m = \mathrm{AN}_1(T)_m * Ne(\vec{w}_r)$ for any neuron $Ne(\vec{w}_r) \in \mathrm{AN}(T)_m$).
Remark 1.
A certain generalization of the formal (artificial) neuron can be obtained from the expression of a linear differential operator of the $n$-th order. Recall the expression of a formal neuron with inner potential $y_{in} = \sum_{k=1}^{n} w_k(t)\, x_k(t)$, where $\vec{x}(t) = (x_1(t), \dots, x_n(t))$ is the vector of inputs and $\vec{w}(t) = (w_1(t), \dots, w_n(t))$ is the vector of weights. Using the bias $b$ of the considered neuron and the transfer function $\sigma$, we can express the output as
$y(t) = \sigma\left(\sum_{k=1}^{n} w_k(t)\, x_k(t) + b\right).$
Now consider a tribal function $u: J \to \mathbb{R}$, where $J \subseteq \mathbb{R}$ is an open interval; inputs are derived from $u \in C^n(J)$ as follows: $x_1(t) = u(t)$, $x_2(t) = \frac{du(t)}{dt}$, ..., $x_n(t) = \frac{d^{n-1}u(t)}{dt^{n-1}}$, $n \in \mathbb{N}$. Further, the bias is $b = b_0 \frac{d^n u(t)}{dt^n}$. As weights we use the continuous functions $w_k: J \to \mathbb{R}$, $k = 1, \dots, n-1$.
Then the formula
$D_n(u)(t) = \sum_{k=1}^{n} w_k(t)\, \frac{d^{k-1}u(t)}{dt^{k-1}} + b_0 \frac{d^n u(t)}{dt^n}$
describes the action of the neuron $D_n$, which will be called a formal (artificial) differential neuron. This approach allows using the solution spaces of the corresponding linear differential equations.
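For a polynomial tribal function the differential neuron can be evaluated exactly. In the sketch below (our own illustration, assuming constant weights $w_1, \dots, w_n$ and bias $b_0\, u^{(n)}$), a polynomial is stored as a coefficient list and differentiated by index shifting:

```python
def deriv(coeffs):
    """Derivative of a polynomial [c0, c1, ...] representing c0 + c1*t + c2*t^2 + ..."""
    return [i * c for i, c in enumerate(coeffs)][1:] or [0]

def poly_eval(coeffs, t):
    return sum(c * t**i for i, c in enumerate(coeffs))

def differential_neuron(u, w, b0, t):
    """D_n: inputs x_k = u^(k-1), bias b = b0 * u^(n), constant weights w_1..w_n."""
    n = len(w)
    derivatives = [u]                       # u, u', u'', ..., u^(n)
    for _ in range(n):
        derivatives.append(deriv(derivatives[-1]))
    y = sum(wk * poly_eval(derivatives[k], t) for k, wk in enumerate(w))
    return y + b0 * poly_eval(derivatives[n], t)

# u(t) = t^2, weights (w1, w2) = (1, 1), b0 = 1:  y = u + u' + u'' = t^2 + 2t + 2.
print(differential_neuron([0, 0, 1], [1.0, 1.0], 1.0, t=3.0))  # 17.0
```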
Proposition 3
(, p. 16). Let $(G_1, \cdot), (G_2, \cdot)$ be two groups, $f \in \mathrm{Hom}(G_1, G_2)$ and $P \subset G_1$. Then the homomorphism $f$ is a good homomorphism between the $P$-hypergroups $(G_1, *_P)$ and $(G_2, *_{f(P)})$.
Concerning the discussed theme see [26,27,28,30,32,36,39,45]. Now denote by $S \subseteq C(T)$ an arbitrary non-empty subset and let
$P = { N e ( w → u ( t ) ) ; u ∈ S } ⊆ AN ( T ) .$
Then defining
$N e ( w → p ( t ) ) ∗ N e ( w → q ( t ) ) = N e ( w → p ( t ) ) · m P · m N e ( w → q ( t ) ) =$
${ N e ( w → p ( t ) ) · m N e ( w → u ( t ) ) · m N e ( w → q ( t ) ) ; u ∈ S }$
for any pair of neurons $Ne(\vec{w}_p(t)), Ne(\vec{w}_q(t)) \in \mathrm{AN}(T)$, we obtain a $P$-hypergroup of artificial time-varying neurons. If $S$ is a singleton, i.e., $P$ is a one-element subset of $\mathrm{AN}(T)$, the obtained structure is a variant of $\mathrm{AN}(T)$. Notice that any $f \in \mathrm{End}\, G$ for a group $(G, \cdot)$ induces a good homomorphism of the $P$-hypergroups $(G, *_P), (G, *_{f(P)})$, and any automorphism creates an isomorphism between the above $P$-hypergroups.
Let $(\mathbb{Z}, +)$ be the additive group of all integers. Let $Ne(\vec{w}_s(t)) \in \mathrm{AN}(T)$ be an arbitrary but fixed artificial neuron with the output function $y_s(t) = \sum_{k=1}^{n} w_{s,k}(t)\, x_{s,k}(t) + b_s$. Denote by $\lambda_s: \mathrm{AN}(T) \to \mathrm{AN}(T)$ the left translation within the group of time-varying neurons determined by $Ne(\vec{w}_s(t))$, i.e.,
$\lambda_s(Ne(\vec{w}_p(t))) = Ne(\vec{w}_s(t)) \cdot_m Ne(\vec{w}_p(t))$
for any neuron $N e ( w → p ( t ) ) ∈ AN ( T )$. Further, denote by $λ s r$ the r-th iteration of $λ s$ for $r ∈ Z$. Define the projection $π s : AN ( T ) × Z → AN ( T )$ by
$\pi_s(Ne(\vec{w}_p(t)), r) = \lambda_s^r(Ne(\vec{w}_p(t))).$
It is easy to see that we get a usual (discrete) transformation group, i.e., the action of $( Z , + )$ (as the phase group) on the group $AN ( T )$. Thus the following two requirements are satisfied:
• $π s ( N e ( w → p ( t ) ) , 0 ) = N e ( w → p ( t ) )$ for any neuron $N e ( w → p ( t ) ) ∈ AN ( T )$,
• $\pi_s(Ne(\vec{w}_p(t)), r + u) = \pi_s(\pi_s(Ne(\vec{w}_p(t)), r), u)$ for any integers $r, u \in \mathbb{Z}$ and any artificial neuron $Ne(\vec{w}_p(t))$. Notice that in dynamical system theory this structure is called a cascade.
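Both cascade axioms can be traced on a toy model in which the left translation multiplies a single positive constant weight (our own illustration; the factor $s = 2$ stands in for the fixed neuron $Ne(\vec{w}_s(t))$):

```python
def pi(x, r, s=2.0):
    """r-th iteration of the left translation x -> s*x; r may be negative."""
    return x * s**r

x = 3.0
assert pi(x, 0) == x                 # identity axiom: pi(x, 0) = x
assert pi(pi(x, 2), 3) == pi(x, 5)   # composition axiom: pi(pi(x, r), u) = pi(x, r + u)
```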
On the phase set we will define a binary hyperoperation. For any pair of neurons $N e ( w → p ( t ) ) ,$$N e ( w → q ( t ) )$ define
$Ne(\vec{w}_p(t)) * Ne(\vec{w}_q(t)) = \pi_s(Ne(\vec{w}_p(t)), \mathbb{Z}) \cup \pi_s(Ne(\vec{w}_q(t)), \mathbb{Z}) = \{\lambda_s^a(Ne(\vec{w}_p(t)));\ a \in \mathbb{Z}\} \cup \{\lambda_s^b(Ne(\vec{w}_q(t)));\ b \in \mathbb{Z}\}.$
Then $*: \mathrm{AN}(T) \times \mathrm{AN}(T) \to P(\mathrm{AN}(T))$ is a commutative binary hyperoperation, and since $Ne(\vec{w}_p(t)), Ne(\vec{w}_q(t)) \in Ne(\vec{w}_p(t)) * Ne(\vec{w}_q(t))$, we obtain that the hypergroupoid $(\mathrm{AN}(T), *)$ is a commutative, extensive hypergroup [20,27,29,30,31,34,35,38,43,46,47]. Using its properties we can characterize certain properties of the cascade $(\mathrm{AN}(T), \mathbb{Z}, \pi_s)$. The hypergroup $(\mathrm{AN}(T), *)$ can be called the phase hypergroup of the given cascade.
Recall now the concept of invariant subsets of the phase set of a cascade $(X, \mathbb{Z}, \pi_s)$ and the concept of a critical point. A subset $M$ of a phase set $X$ of the cascade $(X, \mathbb{Z}, \pi_s)$ is called invariant whenever $\pi(x, r) \in M$ for all $x \in M$ and all $r \in \mathbb{Z}$. A critical point of a cascade is an invariant singleton. It is evident that a subset of neurons $M \subseteq \mathrm{AN}(T)$ is invariant in the cascade $(\mathrm{AN}(T), \mathbb{Z}, \pi_s)$ whenever it is the carrier set of a subhypergroup of the hypergroup $(\mathrm{AN}(T), *)$, i.e., $M$ is closed with respect to the hyperoperation $*$, which means $M * M = \bigcup_{a, b \in M} a * b \subseteq M$. Moreover, the union or intersection of an arbitrary non-empty system of invariant subsets of $\mathrm{AN}(T)$ is also invariant.

## 3. Main Results

Now we will construct series of groups and hypergroups of artificial neurons, using a certain analogy with the series of groups of differential operators described in .
We denote by $\mathrm{LA}_n(J)$ (for an open interval $J \subseteq \mathbb{R}$) the set of all linear differential operators $L(p_{n-1}, \dots, p_0)$, $p_0 \neq 0$, $p_k \in C(J)$ (the ring of all continuous functions defined on the interval $J$), acting as
$L(p_{n-1}, \dots, p_0)\, y(x) = y^{(n)}(x) + \sum_{k=0}^{n-1} p_k(x)\, y^{(k)}(x), \quad y \in C^n(J),$
and endowed with the binary operation
$L ( q n − 1 , ⋯ , q 0 ) ∘ L ( p n − 1 , ⋯ , p 0 ) = L ( q 0 p n − 1 + q n − 1 , ⋯ , q 0 p 1 + q 1 , q 0 p 0 ) .$
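For constant coefficients, the composition rule above can be checked to be associative by direct computation (the tuple representation is our own illustration):

```python
def compose(q, p):
    """(q_{n-1},...,q_1,q_0) o (p_{n-1},...,p_1,p_0) per the composition formula above."""
    q0, p0 = q[-1], p[-1]
    # r_k = q0*p_k + q_k for k = n-1,...,1, and r_0 = q0*p0.
    return tuple(q0 * pk + qk for pk, qk in zip(p[:-1], q[:-1])) + (q0 * p0,)

a, b, c = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0)
assert compose(compose(a, b), c) == compose(a, compose(b, c))  # associativity
```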
Now denote by $LA ¯ n ( J )$ the set of all operators $L ¯ ( q n , ⋯ , q 0 ) , q 0 ≠ 0 , q k ∈ C ( J )$ acting as
$L ¯ ( q n , ⋯ , q 0 ) y ( x ) = ∑ k = 0 n q k ( x ) y ( k ) ( x ) , q 0 ≠ 0 , q k ∈ C ( J )$
with similarly defined binary operations such that $LA n ( J ) , LA ¯ n ( J )$ are noncommutative groups. Define mappings $F n : LA n ( J ) → LA n − 1 ( J )$ by
$F n ( L ( p n − 1 , ⋯ , p 0 ) ) = L ( p n − 2 , ⋯ , p 0 )$
and $\phi_n: \mathrm{LA}_n(J) \to \overline{\mathrm{LA}}_{n-1}(J)$ by
$ϕ n ( L ( p n − 1 , ⋯ , p 0 ) ) = L ¯ ( p n − 2 , ⋯ , p 0 ) .$
It can be easily verified that both $F n$ and $ϕ n ,$ for an arbitrary $n ∈ N$, are group homomorphisms.
Evidently, $LA n ( J ) ⊂ LA ¯ n ( J ) , LA ¯ n − 1 ( J ) ⊂ LA ¯ n ( J )$ for all $n ∈ N$. Thus we obtain complete sequences of ordinary linear differential operators with linking homomorphisms $F n , ϕ n :$
Now consider the groups of time-varying neurons $(\mathrm{AN}(T)_m, \cdot_m)$ and the above defined homomorphism of the group $(\mathrm{AN}_n(T)_m, \cdot_m)$ into the group $(\mathrm{LA}_n(T)_{m+1}, \circ_{m+1})$ from Proposition 2. Then we can change the diagram in the following way:
Using the Ends lemma and results of the theory of linear operators, we can also describe the linking morphisms in sequences of groups of linear differential operators:
$\mathrm{LA}_n(J) \xleftarrow{F_{n+1}} \mathrm{LA}_{n+1}(J) \xleftarrow{F_{n+2}} \mathrm{LA}_{n+2}(J) \xleftarrow{F_{n+3}} \mathrm{LA}_{n+3}(J) \xleftarrow{F_{n+4}} \cdots$
and, analogously, in sequences of groups of time-varying neurons:
$\mathrm{AN}_n(T)_m \xleftarrow{F^*_{n+1}} \mathrm{AN}_{n+1}(T)_m \xleftarrow{F^*_{n+2}} \mathrm{AN}_{n+2}(T)_m \xleftarrow{F^*_{n+3}} \mathrm{AN}_{n+3}(T)_m \xleftarrow{F^*_{n+4}} \cdots$
Theorem 2.
Let $T = \langle 0, t_0) \subset \mathbb{R}$, $t_0 \in \mathbb{R} \cup \{\infty\}$, $n \in \mathbb{N}$ such that $n \ge 2$, and $m \in \mathbb{N}$ such that $m \le n$. Let $(\mathrm{HAN}_n(T)_m, *_m)$ be the hypergroup obtained from the group $(\mathrm{AN}_n(T)_m, \circ_m)$ by Proposition 2. Suppose that $F_n: (\mathrm{AN}_n(T)_m, \circ_m) \to (\mathrm{AN}_{n-1}(T)_m, \circ_m)$ are the above defined surjective group homomorphisms. Then $F_n: (\mathrm{HAN}_n(T)_m, *_m) \to (\mathrm{HAN}_{n-1}(T)_m, *_m)$ are surjective homomorphisms of hypergroups.
Remark 2.
The second sequence of (2) can thus be bijectively mapped onto sequence of hypergroups
$\mathrm{HAN}_n(T)_m \xleftarrow{F_{n+1}} \mathrm{HAN}_{n+1}(T)_m \xleftarrow{F_{n+2}} \mathrm{HAN}_{n+2}(T)_m \xleftarrow{F_{n+3}} \mathrm{HAN}_{n+3}(T)_m \xleftarrow{F_{n+4}} \cdots$
with the linking surjective homomorphisms $F_n$. Therefore, the bijective mapping of the above mentioned sequences is functorial.
Now we shift to the concept of an automaton. This was developed as a mathematical interpretation of real-life systems that work on a discrete time-scale. Using the binary operation of concatenation of chains of input symbols, we obtain automata with input alphabets endowed with the structure of a semigroup or a group. Considering mainly the structure given by the transition function and neglecting output functions with output sets, we reach a very useful generalization of the concept of an automaton, called a quasi-automaton [29,31,48,49]. Let us introduce the concept of automata as an action of time-varying neurons. Let a system $(A, S, \delta)$ consist of a nonempty set of states $A \subseteq \mathrm{AN}(T)_m$ of time-varying neurons and an arbitrary semigroup $S$ of their inputs, and let the mapping $\delta: A \times S \to A$ fulfill the following condition:
$\delta(\delta(a, r), s) = \delta(a, r \cdot s)$
for arbitrary $a \in A$ and $r, s \in S$. Such a system can be understood as an analogy of the concept of a quasi-automaton, a generalization of the Mealy-type automaton. The above condition is sometimes called the Mixed Associativity Condition (MAC).
Definition 4.
Let A be a nonempty set, $( H , · )$ a semihypergroup and $δ : A × H → A$ a mapping satisfying the condition:
$\delta(\delta(s, a), b) \in \delta(s, a \cdot b) \qquad (3)$
for any triad $(s, a, b) \in A \times H \times H$, where $\delta(s, a \cdot b) = \{\delta(s, x);\ x \in a \cdot b\}$. The triad $(A, H, \delta)$ is called a quasi-multiautomaton with the state set $A$ and the input semihypergroup $(H, \cdot)$. The mapping $\delta: A \times H \to A$ is called the transition function (or next-state function) of the quasi-multiautomaton $(A, H, \delta)$. Condition $(3)$ is called the Generalized Mixed Associativity Condition (GMAC).
The just defined structures are also called actions of semihypergroups $(H, \cdot)$ on sets $A$ (called state sets).
The neuron $Ne(\vec{w})$ acts as described above:
$y ( t ) = ∑ i = 1 n w i ( t ) x i ( t ) + b ,$
where $i = 1, \dots, n$, $w_i(t)$ is the weight value in continuous time $t$, $b$ is a bias and $y(t)$ is the output value in continuous time $t$. Here the transfer function $F$ is the identity function.
Now suppose that the input functions $x i$ are differentiable up to arbitrary order $n$.
We consider linear differential operators
$L(m, w_n, \dots, w_0): C^n(T) \times \cdots \times C^n(T) \to C^n(T)$, where $C^n(T) \times \cdots \times C^n(T) = [C^n(T)]^{n+1}$,
defined by
$L(m, w_n, \dots, w_0)\, x(t) = mb + \sum_{k=0}^{n} w_k(t)\, \frac{d^k x_k(t)}{dt^k}, \quad x(t) = (x_0(t), x_1(t), \dots, x_n(t)) \in [C^n(T)]^{n+1}.$
Then we denote by $\mathrm{LNe}_n(T)$ the additive Abelian group of linear differential operators $L(m, w_n, \dots, w_0)$, where for $L(m, w_n, \dots, w_0), L(s, w_n^*, \dots, w_0^*) \in \mathrm{LNe}_n(T)$ with the bias $b$ we define
$L(m, w_n, \dots, w_0) + L(s, w_n^*, \dots, w_0^*) = L(m + s, w_n + w_n^*, \dots, w_0 + w_0^*),$
where
$L(m + s, w_n + w_n^*, \dots, w_0 + w_0^*)\, x(t) = (m + s)\, b + \sum_{k=0}^{n} (w_k(t) + w_k^*(t))\, \frac{d^k x_k(t)}{dt^k}, \quad t \in T,\ x(t) = (x_0(t), x_1(t), \dots, x_n(t)) \in [C^n(T)]^{n+1}.$
Suppose that $w k ( t ) ∈ C n ( T )$ and define
$δ n : C n ( T ) × L N e n ( T ) → C n ( T )$
by
$\delta_n(x(t), L(m, w_n, \dots, w_0)) = mb + x(t) + m + \sum_{k=0}^{n} w_k(t)\, \frac{d^k x(t)}{dt^k}, \quad x(t) \in C^n(T),$
where $w_n, \dots, w_0$ are the weights corresponding to the inputs and $b$ is the bias of the neuron corresponding to the operator $L(m, w_n, \dots, w_0) \in \mathrm{LNe}_n(T)$.
Theorem 3.
Let $L N e n ( T ) , C n ( T )$ be the above defined structures and $δ n : C n ( T ) × L N e n ( T ) → C n ( T )$ be the above defined mapping. Then the triad $( C n ( T ) , L N e n ( T ) , δ n )$ is an action of the group $L N e n ( T )$ on the group $C n ( T )$, i.e., a quasi-automaton with the state space $C n ( T )$ and with the alphabet $L N e n ( T )$ with the group structure of artificial neurons.
Proof.
We are going to verify the mixed associativity condition (MAC). Suppose $x \in C^n(T)$ and $L(m, w_n, \dots, w_0), L(s, u_n, \dots, u_0) \in \mathrm{LNe}_n(T)$. Then we have
$\delta_n(\delta_n(x(t), L(m, w_n, \dots, w_0)), L(s, u_n, \dots, u_0)) =$
$= \delta_n\Big(mb + x(t) + m + \sum_{k=0}^{n} w_k(t)\, \frac{d^k x(t)}{dt^k},\ L(s, u_n, \dots, u_0)\Big) =$
$= sb + mb + x(t) + m + s + \sum_{k=0}^{n} w_k(t)\, \frac{d^k x(t)}{dt^k} + \sum_{k=0}^{n} u_k(t)\, \frac{d^k x(t)}{dt^k} =$
$= (m + s)\, b + x(t) + m + s + \sum_{k=0}^{n} (w_k(t) + u_k(t))\, \frac{d^k x(t)}{dt^k} =$
$= \delta_n(x(t), L(m + s, w_n(t) + u_n(t), \dots, w_0(t) + u_0(t))) =$
$= \delta_n(x(t), L(m, w_n, \dots, w_0) + L(s, u_n, \dots, u_0)),$
thus MAC is satisfied. □
Consider an interval $T \subseteq \mathbb{R}$ and the ring $C(T)$ of all continuous functions defined on this interval. Let $\{\varphi_k;\ k \in \mathbb{N}\}$ be a sequence of ring endomorphisms of $C(T)$. Denote by $A_{n+k}N(T)_m$ the EL-hypergroup of artificial neurons constructed above, with weight vectors of dimension $n + k \in \mathbb{N}\ (= \{1, 2, 3, \dots\})$. Let $[C(T)]^{n+k} = C(T) \times C(T) \times \cdots \times C(T)$ ($n+k$ times), i.e., $[C(T)]^{n+k}$ is the $(n+k)$-dimensional Cartesian power. Denote by $\bar{\varphi}_k: [C(T)]^{n+k} \to [C(T)]^{n+k-1}$ the extension of $\varphi_k$ such that $\bar{\varphi}_k(\vec{w}) = \bar{\varphi}_k((w_1, \dots, w_{n+k-1}, w_{n+k})) = (w_1, \dots, w_{n+k-1})$. Denote by $F_k: A_{n+k}N(T)_m \to A_{n+k-1}N(T)_m$ the mapping defined by $F_k(Ne(\vec{w})) = Ne(\vec{w}_1)$ with $\vec{w}_1 = (w_1, \dots, w_{n+k-1})$. Consider the underlying sets of the hypergroups $A_{n+k}N(T)_m$ endowed with the above defined ordering relation:
for $\vec{w} = (w_1, \dots, w_{n+k})$, $\vec{u} = (u_1, \dots, u_{n+k}) \in [C(T)]^{n+k}$
we have $\vec{w} \le \vec{u}$ if $w_r \le u_r$, $r = 1, 2, \dots, n+k$, $r \neq m$, and $w_m = u_m$. Now, for $Ne(\vec{w}), Ne(\vec{u}) \in A_{n+k}N(T)_m$ such that $\vec{w} = (w_1, \dots, w_{n+k})$, $\vec{u} = (u_1, \dots, u_{n+k})$, $Ne(\vec{w}) \le Ne(\vec{u})$, which means $\vec{w} \le \vec{u}$ ($w_m = u_m$ and the biases of the corresponding neurons are the same), we have $\bar{\varphi}_k(\vec{w}) = (w_1, \dots, w_{n+k-1}) \le (u_1, \dots, u_{n+k-1}) = \bar{\varphi}_k(\vec{u})$, which implies $F_k(Ne(\vec{w})) \le F_k(Ne(\vec{u}))$.
Consequently the mapping $F k : ( A n + k N ( T ) m , ≤ ) → ( A n + k − 1 N ( T ) m , ≤ )$ is order-preserving, i.e., this is an order-homomorphism of hypergroups. The final result of our considerations is the following sequence of hypergroups of artificial neurons and linking homomorphisms:
$A_n N(T)_m \xleftarrow{F_1} A_{n+1}N(T)_m \xleftarrow{F_2} \cdots \xleftarrow{F_k} A_{n+k}N(T)_m \xleftarrow{F_{k+1}} A_{n+k+1}N(T)_m \cdots$
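On constant weight vectors, the linking mapping $F_k$ simply forgets the last coordinate, and its order-preservation can be traced directly (our own illustrative sketch; $m$ is the distinguished 1-based position of the ordering):

```python
def F(w):
    """Linking homomorphism: drop the last weight coordinate."""
    return w[:-1]

def leq(w, u, m):
    """Ordering <=_m: componentwise <= away from position m, equality at position m."""
    i = m - 1
    return w[i] == u[i] and all(wk <= uk
                                for k, (wk, uk) in enumerate(zip(w, u)) if k != i)

w, u = (1.0, 2.0, 5.0), (1.0, 3.0, 6.0)
assert leq(w, u, m=1) and leq(F(w), F(u), m=1)   # F preserves the ordering
```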

## 4. Conclusions

Artificial neural networks and structured systems of artificial neurons have been discussed by a great number of researchers. They are an important part of artificial intelligence, with many useful applications in various branches of science and engineering. Our considerations are based on an algebraic and analytic approach using a certain formal similarity between classical structures and new hyperstructures of differential operators. We discussed certain generalizations of classical artificial time-varying neurons and studied them using recently derived methods. The presented investigations allow further development.

## Author Contributions

Investigation, J.C. and B.S.; Methodology, J.C. and B.S.; Supervision, J.C. and B.S.; Writing original draft, J.C. and B.S.; Writing review and editing, J.C. and B.S.

## Funding

The first author was supported by the FEKT-S-17-4225 grant of Brno University of Technology.

## Conflicts of Interest

The authors declare no conflict of interest.

## References

1. Koskela, T. Neural Network Methods in Analysing and Modelling Time Varying Processes. Ph.D. Thesis, Helsinki University of Technology, Helsinki, Finland, 2003; pp. 1–113. [Google Scholar]
2. Behnke, S. Hierarchical Neural Networks for Image Interpretation. In Lecture Notes in Computer Science; Springer: Heidelberg, Germany, 2003; Volume 2766, p. 245. [Google Scholar]
3. Bishop, C.M. Neural Networks for Pattern Recognition; Oxford University Press: Oxford, UK, 1995; pp. 1–482. [Google Scholar]
4. Buchholz, S. A Theory of Neural Computation With Clifford-Algebras; Technical Report Number 0504; Christian-Albrechts-Universität zu Kiel, Institut für Informatik und Praktische Mathematik: Kiel, Germany, 2005; pp. 1–135. [Google Scholar]
5. Gardner, M.W.; Dorling, S.R. Artificial Neural Networks (the Multilayer Perceptron): A Review of Applications in the Atmospheric Sciences. Atmos. Environ. 1998, 32, 2627–2636. [Google Scholar] [CrossRef]
6. Koudelka, V.; Raida, Z.; Tobola, P. Simple electromagnetic modeling of small airplanes: Neural network approach. Radioengineering 2009, 18, 38–41. [Google Scholar]
7. Krenker, A.; Bešter, J.; Kos, A. Introduction to the artificial neural networks. In Artificial Neural Networks—Methodological Advances and Biomedical Applications; Suzuki, K., Ed.; Tech: Rijeka, Croatia, 2011; pp. 3–18. [Google Scholar]
8. Raida, Z.; Lukeš, Z.; Otevřel, V. Modeling broadband microwave structures by artificial neural networks. Radioengineering 2004, 13, 3–11. [Google Scholar]
9. Rosenblatt, F. The Perceptron: A probabilistic model for information storage and organization in the brain. Psychol. Rev. 1958, 65, 386–408. [Google Scholar]
10. Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 2014, 15, 1929–1958. [Google Scholar]
11. Tučková, J. Comparison of two Approaches in the Fundamental Frequency Control by Neural Nets. In Proceedings of the 6th Czech-German Workshop on Speech Processing, Praha, Czech Republic, 2–4 September 1996; p. 37. [Google Scholar]
12. Tučková, J.; Boreš, P. The Neural Network Approach in Fundamental Frequency Control. In Speech Processing: Forum Phoneticum; Wodarz, H.-W., Ed.; Hector Verlag: Frankfurt am Main, Germany, 1997; pp. 143–154. ISSN 0341–3144. ISBN 3-930220-10-5. [Google Scholar]
13. Tučková, J.; Šebesta, V. Data Mining Approach for Prosody Modelling by ANN in Text-to-Speech Synthesis. In Proceedings of the IASTED International Conference on “Artificial Intelligence and Applications (AIA 2001)”, Marbella, Spain, 4–7 September 2001; Hamza, M.H., Ed.; ACTA Press Anaheim-Calgary-Zurich: Marbella, Spain, 2001; pp. 164–166. [Google Scholar]
14. Volná, E. Neuronové sítě 1 [Neural Networks 1], 2nd ed.; Ostravská Univerzita: Ostrava, Czech Republic, 2008; p. 86. [Google Scholar]
15. Waldron, M.B. Time varying neural networks. In Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, New Orleans, LA, USA, 4–7 November 1988. [Google Scholar]
16. Widrow, B.; Lehr, A. 30 Years of adaptive networks: Perceptron, Madaline, and Backpropagation. Proc. IEEE 1990, 78, 1415–1442. [Google Scholar] [CrossRef]
17. Kremer, S. Spatiotemporal Connectionist Networks: A Taxonomy and Review. Neural Comput. 2001, 13, 249–306. [Google Scholar] [CrossRef]
18. Narendra, K.; Parthasarathy, K. Identification and control of dynamical systems using neural networks. IEEE Trans. Neural Netw. 1990, 1, 4–27. [Google Scholar] [CrossRef] [PubMed]
19. Hagan, M.; Demuth, H.; Beale, M. Neural Network Design; PWS Publishing: Boston, MA, USA, 1996. [Google Scholar]
20. Chvalina, J.; Smetana, B. Models of Iterated Artificial Neurons. In Proceedings of the 18th Conference on Applied Mathematics Aplimat, Bratislava, Slovakia, 5–7 February 2019; pp. 203–212, ISBN 978-80-227-4884-1. [Google Scholar]
21. Chvalina, J.; Smetana, B. Groups and Hypergroups of Artificial Neurons. In Proceedings of the 17th Conference on Applied Mathematics Aplimat, Bratislava, Slovakia, 6–8 February 2018; University of Technology in Bratislava, Faculty of Mechanical Engineering: Bratislava, Slovakia, 2018; pp. 232–243. [Google Scholar]
22. Chvalina, J.; Smetana, B. Solvability of certain groups of time varying artificial neurons. Ital. J. Pure Appl. Math. 2018. submitted. [Google Scholar]
23. Chvalina, J. Commutative hypergroups in the sense of Marty and ordered sets. In Proceedings of the Summer School on General Algebra and Ordered Sets, Olomouc, Czech Republic, 4–12 September 1994; pp. 19–30. [Google Scholar]
24. Marty, F. Sur une généralisation de la notion de groupe. In Proceedings of the IV Congrès des Mathématiciens Scandinaves, Stockholm, Sweden, 14–18 August 1934; pp. 45–49. [Google Scholar]
25. Novák, M.; Cristea, I. Composition in EL–hyperstructures. Hacet. J. Math. Stat. 2019, 48, 45–58. [Google Scholar] [CrossRef]
26. Novák, M. n-ary hyperstructures constructed from binary quasi-ordered semigroups. An. Stiintifice Ale Univ. Ovidius Constanta Ser. Mat. 2014, 22, 147–168. [Google Scholar] [CrossRef]
27. Novák, M. On EL—Semihypergroups. Eur. J. Comb. 2015, 44, 274–286. [Google Scholar] [CrossRef]
28. Novák, M. Some basic properties of EL-hyperstructures. Eur. J. Combin. 2013, 34, 446–459. [Google Scholar] [CrossRef]
29. Chvalina, J.; Novák, M.; Staněk, D. Sequences of groups and hypergroups of linear ordinary differential operators. Ital. J. Pure Appl. Math. 2019. accepted for publication. [Google Scholar]
30. Corsini, P. Prolegomena of Hypergroup Theory; Aviani Editore Tricesimo: Udine, Italy, 1993. [Google Scholar]
31. Corsini, P.; Leoreanu, V. Applications of Hyperstructure Theory; Kluwer Academic Publishers: Dordrecht, The Netherlands; Boston, MA, USA; London, UK, 2003. [Google Scholar]
32. Chvalina, J.; Hošková-Mayerová, Š.; Dehghan Nezhad, A. General actions of hypergroups and some applications. An. Stiintifice Ale Univ. Ovidius Constanta 2013, 21, 59–82. [Google Scholar]
33. Cristea, I. Several aspects on the hypergroups associated with n-ary relations. An. Stiintifice Ale Univ. Ovidius Constanta 2009, 17, 99–110. [Google Scholar]
34. Cristea, I.; Ştefănescu, M. Binary relations and reduced hypergroups. Discret. Math. 2008, 308, 3537–3544. [Google Scholar] [CrossRef]
35. Cristea, I.; Ştefănescu, M. Hypergroups and n-ary relations. Eur. J. Combin. 2010, 31, 780–789. [Google Scholar] [CrossRef]
36. Leoreanu-Fotea, V.; Ciurea, C.D. On a P-hypergroup. J. Basic Sci. 2008, 4, 75–79. [Google Scholar]
37. Pollock, D.; Waldron, M.B. Phase dependent output in a time varying neural net. Proc. Ann Conf. EMBS 1989, 11, 2054–2055. [Google Scholar]
38. Račková, P. Hypergroups of symmetric matrices. In Proceedings of the 10th International Congress of Algebraic Hyperstructures and Applications (AHA 2008), Brno, Czech Republic, 3–9 September 2008; pp. 267–272. [Google Scholar]
39. Vougiouklis, T. Generalization of P-hypergroups. In Rendiconti del Circolo Matematico di Palermo; Springer: Berlin, Germany, 1987; pp. 114–121. [Google Scholar]
40. Vougiouklis, T. Hyperstructures and their Representations. In Monographs in Mathematics; Hadronic Press: Palm Harbor, FL, USA, 1994. [Google Scholar]
41. Vougiouklis, T.; Konguetsof, L. P-hypergroups. Acta Univ. C. Math. Phys. 1987, 28, 15–20. [Google Scholar]
42. Chvalina, J.; Chvalinová, L. Modelling of join spaces by n-th order linear ordinary differential operators. In Proceedings of the Fourth International Conference APLIMAT 2005, Bratislava, Slovakia, 1–4 February 2005; pp. 279–284, ISBN 80-969264-2-X. [Google Scholar]
43. Chvalina, J.; Chvalinová, L. Action of centralizer hypergroups of n-th order linear differential operators on rings on smooth functions. J. Appl. Math. 2008, 1, 45–53. [Google Scholar]
44. Neuman, F. Global Properties of Linear Ordinary Differential Equations; Academia Praha—Kluwer Academic Publishers: Dordrecht, The Netherlands; Boston, MA, USA; London, UK, 1991; p. 320. [Google Scholar]
45. Vougiouklis, T. Cyclicity in a special class of hypergroups. Acta Univ. C Math. Phys. 1981, 22, 3–6. [Google Scholar]
46. Chvalina, J.; Svoboda, Z. Sandwich semigroups of solutions of certain functional equations and hyperstructures determined by sandwiches of functions. J. Appl. Math. 2009, 2, 35–43. [Google Scholar]
47. Cristea, I.; Novák, M.; Křehlík, Š. A class of hyperlattices induced by quasi-ordered semigroups. In Proceedings of the 16th Conference on Applied Mathematics Aplimat 2017, Bratislava, Slovakia, 31 January–2 February 2017; pp. 1124–1135. [Google Scholar]
48. Bavel, Z. The source as a tool in automata. Inf. Control 1971, 18, 140–155. [Google Scholar] [CrossRef]
49. Borzooei, R.A.; Varasteh, H.R.; Hasankhani, A. $\mathcal{F}$-Multiautomata on Join Spaces Induced by Differential Operators. Appl. Math. 2014, 5, 1386–1391. [Google Scholar] [CrossRef]
