Artificial Neural Networks Using Quiver Representations of Finite Cyclic Groups

Wanditra, Lucky Cahya; Muchtadi-Alamsyah, Intan; Nasution, Dellavitha

doi:10.3390/sym15122110


by Lucky Cahya Wanditra ¹,*,†, Intan Muchtadi-Alamsyah ²,³,† and Dellavitha Nasution ²,†

¹ Doctoral Program in Mathematics, Faculty of Mathematics and Natural Sciences, Institut Teknologi Bandung, Bandung 40132, Indonesia

² Algebra Research Group, Faculty of Mathematics and Natural Sciences, Institut Teknologi Bandung, Bandung 40132, Indonesia

³ University Center of Excellence on Artificial Intelligence for Vision, Natural Language Processing and Big Data Analytics (U-CoE AI-VLB), Institut Teknologi Bandung, Bandung 40132, Indonesia

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

Symmetry 2023, 15(12), 2110; https://doi.org/10.3390/sym15122110

Submission received: 3 October 2023 / Revised: 20 November 2023 / Accepted: 22 November 2023 / Published: 24 November 2023

(This article belongs to the Special Issue Selected Papers of International Conference on Mathematical and Statistical Sciences: A Selçuk Meeting (Dedicated to Victims of Türkiye Earthquakes))


Abstract

In this paper, we propose using quiver representations as a tool for understanding artificial neural network algorithms. Specifically, we construct these algorithms by utilizing the group algebra of a finite cyclic group as vertices and convolution transformations as maps. We will demonstrate the neural network using convolution operation in the group algebra. The convolution operation in the group algebra that is formed by a finite cyclic group can be seen as a circulant matrix. We will represent a circulant matrix as a map from a cycle permutation matrix to a polynomial function. Using the permutation matrix, we will see some properties of the circulant matrix. Furthermore, we will examine some properties of circulant matrices using representations of finite symmetric groups as permutation matrices. Using the properties, we also examine the properties of moduli spaces formed by the actions of the change of basis group on the set of quiver representations. Through this analysis, we can compute the dimension of the moduli spaces.

Keywords: quiver representations; finite cyclic group; group algebra; moduli space; dimension

1. Introduction

In the modern age, artificial intelligence has seamlessly woven into our daily lives. Whether it is the video assistant referees in football matches or the autonomous cars navigating our city streets, artificial intelligence’s presence is undeniable [1]. At its core, artificial intelligence often relies on the remarkable capabilities of artificial neural networks, a technology that mirrors the workings of the human brain, enabling object recognition, tracking, and much more [2].

The human brain, a marvel of complexity, processes information through the intricate interplay of neurons, transferring signals with astonishing precision. In artificial neural networks, these neurons find their mathematical counterpart as nodes, and the transmission of information takes on the guise of maps. We craft a mathematical model that employs quiver representations to better understand and manipulate these networks. In this model, nodes become vertices, and the transfer of information is akin to connecting paths between them. Quiver representations offer a visual means of grasping the intricate connectivity between artificial neurons, facilitating our exploration of information flows within neural networks.

Crucially, within this framework, the activation function emerges as the arbiter of information significance, determining which data are worthy of propagation through the network. Yet, this is only the beginning of our journey into the captivating world of artificial neural networks.

To further enrich our understanding and empower our analysis, we introduce the concept of a group algebra. It is the vector space of all functions from a group to a field (in our case, the field of complex numbers), equipped with a convolution product. This addition becomes particularly essential when dealing with situations where a single neuron must process multi-dimensional information, such as colors. The functional properties of the group algebra offer a versatile toolkit for convolution and transformation, allowing us to explore the relationships between different functions within the neural network.

In our exploration, we focus on group algebras of finite cyclic groups. This choice becomes especially pertinent when creating a quiver group algebra representation, where we draw connections with neural networks. These representations provide us with the means to compare and contrast various artificial neural networks through the lens of morphisms of quiver representations [3,4,5].

As we traverse this intricate web of mathematical constructs, neural science, and data analysis, our journey leads us to the intriguing concept of moduli spaces. These spaces offer a glimpse into the diverse dimensions and possibilities within neural networks, shedding light on their potential and limitations.

In this paper, we will embark on a comprehensive exploration of the interplay between artificial intelligence, quiver representations, group algebras, cyclic groups, and moduli spaces. By the journey’s end, it will become evident that these seemingly disparate threads are tightly woven into a unified tapestry that reshapes the landscape of mathematics, neural science, and data-driven discovery.

2. Quiver Representation from Convolution Group Algebras

2.1. Quiver Representation

Let

Q = (V, E, s, t)

be a quiver where V is a set of vertices, E is a set of arrows and

s, t : E \to V

map every arrow

ϵ \in E

to its source vertex

s (ϵ) \in V

and target vertex

t (ϵ) \in V

, respectively. A quiver can have loops. An arrow

ϵ \in E

is a loop if

s (ϵ) = t (ϵ)

[4]. Quiver Q is said to be arranged by layers if it satisfies the following conditions:

All vertices $v \in V$ can be arranged in columns from left to right;
There are no arrows from vertices in the right columns to vertices in the left columns;
There are no arrows between vertices in the same column [4].

A quiver Q is called a network quiver if it satisfies the following conditions:

Q is arranged by layers;
Every input, output, and bias vertex does not have any loop;
Every hidden vertex has exactly one loop [4].

Now, we will describe a quiver representation.

Definition 1.

Let

Q = (V, E, s, t)

be a quiver. A quiver representation is defined by a tuple

(W, T)

where

W = {W_{v}}_{v \in V}

is a family of vector spaces indexed by the vertices in V and

T = {T_{ϵ}}_{ϵ \in E}

is a family of linear maps indexed by the arrows in E, such that for every arrow ϵ,

T_{ϵ} : W_{s (ϵ)} \to W_{t (ϵ)}

is a linear map [6].

Definition 2.

Let

(W, T)

and

(U, S)

be two quiver representations of a quiver Q. A morphism of representations is defined by

τ = {τ_{v}}_{v \in V}

a family of linear maps

τ_{v} : W_{v} \to U_{v}

that satisfy

τ_{t (ϵ)} T_{ϵ} = S_{ϵ} τ_{s (ϵ)}

for every arrow ϵ in E. If

τ_{v}

is a bijection for every

v \in V

, we say that τ is an isomorphism and

(W, T)

and

(U, S)

are isomorphic to each other [6].

2.2. Convolution Representation

Now, we will see a quiver representation that uses group algebra and a convolution operation that induces linear transformation.

Definition 3.

A non-empty set G, is called a group if there exists a map

\cdot : G \times G \to G

, that is called an operation of the group, such that

$(a \cdot b) \cdot c = a \cdot (b \cdot c)$ for all $a, b, c \in G$ ;
There is $e \in G$ such that $e \cdot a = a = a \cdot e$ for all $a \in G$ , called the identity of G;
For every $a \in G$ there is $a^{- 1} \in G$ such that $a \cdot a^{- 1} = e = a^{- 1} \cdot a$ .

If there is

n \in N

such that

| G | = n

, then G is called a finite group.

Definition 4.

Let G be a finite group. Define

C^{G} = {f : G \to C}

where

For every $f, g \in C^{G}$ and $x \in G$ , we have $(f + g) (x) = f (x) + g (x)$ ;
For every $f \in C^{G}$ , $z \in C$ , and $x \in G$ , we have $(z f) (x) = z (f (x))$ ;
For every $f, g \in C^{G}$ , we define $〈 f, g 〉 = \sum_{x \in G} f (x) \bar{g (x)}$ ;
For every $f, g \in C^{G}$ , we define $(f * g) (x) = \sum_{y \in G} f (y) g (y^{- 1} x)$ (convolution operation) [7].
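The convolution in Definition 4 can be made concrete for a cyclic group. The following is a minimal sketch, assuming G = Z_n written additively, so that y^{-1} x becomes (x - y) mod n, and assuming that a function f : G \to C is stored as a NumPy array of length n. The function name `cyclic_convolution` and the sample vectors are illustrative choices, not notation from the paper.

```python
import numpy as np

def cyclic_convolution(f, g):
    """(f * g)(x) = sum_y f(y) g(y^{-1} x), with y^{-1} x = (x - y) mod n on Z_n."""
    f = np.asarray(f, dtype=complex)
    g = np.asarray(g, dtype=complex)
    n = len(f)
    return np.array([sum(f[y] * g[(x - y) % n] for y in range(n)) for x in range(n)])

# Two sample elements of C^G for G = Z_4 (hypothetical values).
f = np.array([1, 2, 0, 0])
g = np.array([0, 1, 0, 3])
fg = cyclic_convolution(f, g)

# The indicator function of the identity element acts as the unit for *.
delta_e = np.array([1, 0, 0, 0])
```

Because Z_n is abelian, the convolution is commutative in this setting, which is one reason the cyclic case is convenient.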

Using the group algebra, we will define the Fourier transformation as follows.

Definition 5.

Let

n \in N

and

G ≅ Z_{n}

. Let

ω_{n} = e^{\frac{2 π i}{n}}

. Define

\begin{matrix} F : & C^{G} \to C^{G} \\ a \mapsto \hat{a} \end{matrix}

where

\hat{a} (g) = \sum_{h \in G} a (h) ω_{n}^{- g h}

. The transformation

F

is called a Fourier transformation [7].
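As a hedged illustration of Definition 5, the sketch below evaluates the sum defining \hat{a} directly and compares it with NumPy's FFT, whose documented sign convention uses the exponent -2πi gh/n and therefore agrees with the formula above. It also checks the standard fact that this transformation turns convolution into pointwise multiplication. The helper names and sample vectors are assumptions made for the example.

```python
import numpy as np

def fourier_transform(a):
    """hat a(g) = sum_h a(h) * omega_n^(-g h), with omega_n = exp(2 pi i / n)."""
    a = np.asarray(a, dtype=complex)
    n = len(a)
    omega = np.exp(2j * np.pi / n)
    h = np.arange(n)
    return np.array([np.sum(a * omega ** (-g * h)) for g in range(n)])

def cyclic_convolution(f, g):
    """Convolution on Z_n from Definition 4, written additively."""
    f = np.asarray(f, dtype=complex)
    g = np.asarray(g, dtype=complex)
    n = len(f)
    return np.array([sum(f[y] * g[(x - y) % n] for y in range(n)) for x in range(n)])

f = np.array([1, 2, 0, 0])
g = np.array([0, 1, 0, 3])

# Transform of the convolution versus pointwise product of the transforms.
lhs = fourier_transform(cyclic_convolution(f, g))
rhs = fourier_transform(f) * fourier_transform(g)
```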

From the definition, the Fourier transformation can be represented by the Fourier matrix

F = (\begin{matrix} 1 & 1 & 1 & \dots & 1 \\ 1 & ω_{n} & ω_{n}^{2} & \dots & ω_{n}^{n - 1} \\ 1 & ω_{n}^{2} & ω_{n}^{4} & \dots & ω_{n}^{2 (n - 1)} \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ 1 & ω_{n}^{n - 1} & ω_{n}^{2 (n - 1)} & \dots & ω_{n}^{(n - 1) (n - 1)} \end{matrix}) .

Now, we obtain group algebra with a Fourier transformation and convolution operation. Using the group algebra, we will define a quiver representation that is called a

C^{G}

-representation.

Definition 6.

Let

Q = (V, E, s, t)

be a quiver and G be a finite group. Let

W = ({W_{v}}_{v \in V}, {W_{ϵ}}_{ϵ \in E})

be a representation of Q. The representation W is called a

C^{G}

-representation if

W_{v} = C^{G}

for every

v \in V

where

C^{G}

is the group algebra related to G and

C

.

Definition 7.

Let

W = ({C^{G}}_{v \in V}, {W_{ϵ}}_{ϵ \in E})

be a

C^{G}

-representation of a quiver

Q = (V, E, s, t)

. The representation W is called a

C^{G}

-convolution representation if for every

ϵ \in E

there is

w_{ϵ} \in C^{G}

such that

\begin{matrix} W_{ϵ} : & C^{G} \to C^{G} \\ a \mapsto w_{ϵ} * a . \end{matrix}

We can compare two representations using a transformation called a morphism in quiver representation. The morphism of representations is defined as follows:

Definition 8.

Let

U, W

be two

C^{G}

-representations of a quiver

Q = (V, E, s, t)

. A morphism of

C^{G}

-representations

τ : W \to U

is given by

τ = {τ_{v}}_{v \in V}

, where

τ_{v} : W_{v} \to U_{v}

is a linear map such that for every

ε \in E

, we have

τ_{t (ε)} W_{ε} = U_{ε} τ_{s (ε)} .

If for every

v \in V

,

τ_{v}

is invertible then τ is an isomorphism and W is isomorphic to U.

Define

H (Q)

as the set of all morphisms of

C^{G}

-representations.

Theorem 1.

Let

Q = (V, E, s, t)

be a quiver. Let

C^{G} (Q)

be the set of all

C^{G}

-representations of Q. Define

Γ (Q) = \{τ \in H (Q) | \exists U, W \in C^{G} (Q) such that τ : W \to U is an isomorphism\}

where for every

τ, σ \in Γ

, we have

τ \cdot σ = {τ_{v} σ_{v}}_{v \in V} .

The set Γ under operation · forms a group. Furthermore,

Γ (Q) = \prod_{v \in V} G L (C^{G}) .

Proof.

It is easy to see that for every

τ, σ \in Γ (Q)

,

τ \cdot σ \in Γ (Q)

because

τ_{v} σ_{v}

is invertible for every

v \in V

and

τ \cdot σ

commutes with the representations. Next, we will see the associative property of the operation · in

Γ (Q)

. Let

τ, σ, ρ \in Γ (Q)

. We have

(τ \cdot σ) \cdot ρ = {(τ_{v} σ_{v}) ρ_{v}}_{v \in V} = {τ_{v} σ_{v} ρ_{v}}_{v \in V} = {τ_{v} (σ_{v} ρ_{v})}_{v \in V} = τ \cdot (σ \cdot ρ) .

The identity morphism

i d = {i d_{C^{G}}}_{v \in V}

is the identity element in

Γ (Q)

. It is easy to see that every

τ \in Γ (Q)

has an inverse because for every

v \in V

,

τ_{v}

is an isomorphism and commutes with the representations. Therefore

Γ (Q)

forms a group. Now, we will prove that

Γ (Q) = \prod_{v \in V} G L (C^{G})

. It is clear that

Γ (Q) \subseteq \prod_{v \in V} G L (C^{G})

. Let

τ = {τ_{v}}_{v \in V} \in \prod_{v \in V} G L (C^{G})

. We know that

τ

has an inverse, that is

τ^{- 1} = {τ_{v}^{- 1}}_{v \in V}

because

τ_{v} \in G L (C^{G})

. We only need to show that

τ

is a morphism of

C^{G}

-representation. Let W be a

C^{G}

-representation and define a

C^{G}

-representation

U = τ W τ^{- 1}

where for every

ε \in E

, we have

U_{ε} = τ_{t (ε)} W_{ε} τ_{s (ε)}^{- 1} .

This means that

τ \in Γ (Q)

. Therefore,

Γ (Q) = \prod_{v \in V} G L (C^{G})

. □
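The last step of the proof can be checked numerically on a single arrow. This is only a sketch under stated assumptions: the quiver has one arrow from a vertex `s` to a vertex `t`, the space C^G is identified with C^n for n = 3, and the invertible maps are built as unitriangular matrices so that invertibility is guaranteed. None of these choices come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3  # |G|, so each vertex space C^G is identified with C^n

def unitriangular(rng, n):
    """Upper-triangular matrix with ones on the diagonal (determinant 1, hence invertible)."""
    return np.eye(n, dtype=complex) + np.triu(rng.normal(size=(n, n)), k=1)

# Hypothetical quiver with a single arrow eps : s -> t and an arbitrary linear map on it.
W_eps = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
tau = {"s": unitriangular(rng, n), "t": unitriangular(rng, n)}

# U = tau W tau^{-1}, computed on the arrow: U_eps = tau_t W_eps tau_s^{-1}.
U_eps = tau["t"] @ W_eps @ np.linalg.inv(tau["s"])

# Morphism condition of Definition 8: tau_t W_eps = U_eps tau_s.
lhs = tau["t"] @ W_eps
rhs = U_eps @ tau["s"]
```

The two sides agree up to floating-point error, which is the statement that τ is a morphism from W to U.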

Now, we will define a

C^{G}

-representation that uses convolutions as linear transformations.

Definition 9.

Let

W = ({C^{G}}_{v \in V}, {W_{ε}}_{ε \in E})

be a

C^{G}

-representation of a quiver

Q = (V, E, s, t)

. The representation W is called a convolution representation if for every

ε \in E

there is

w_{ε} \in C^{G}

such that

\begin{matrix} W_{ε} : & C^{G} \to C^{G} \\ a \mapsto w_{ε} * a . \end{matrix}

Definition 10.

Let

W, U

be two

C^{G}

convolution representations of a quiver. A morphism of convolution representations

τ : W \to U

is a morphism of

C^{G}

-representation. Furthermore, if τ is an isomorphism of

C^{G}

-representation, then τ is an isomorphism of convolution representations, and we say W is isomorphic to U.

Let G be a finite group. We write

C_{C^{G}} (Q)

for the set of all convolution representations of a quiver Q and

C (Q)

for the set of all isomorphisms of convolution representations of quiver Q. Obviously

C (Q) \subseteq Γ (Q)

. Define

C_{Γ} (Q) = \{τ \in C (Q) | τ W τ^{- 1} \in C_{C^{G}} (Q), \forall W \in C_{C^{G}} (Q)\} .

If

| G | = n

then

C^{G}

is isomorphic to

C^{n}

as a

C

-vector space. From now on, we assume that G is cyclic. Let

w \in C^{G}

and write

w = {(w_{0}, w_{1}, w_{2}, \dots, w_{n - 1})}^{T}

. Define a linear map

\begin{matrix} W : & C^{G} \to C^{G} \\ a \mapsto w * a . \end{matrix}

By definition,

(w * a) (x) = \sum_{y \in G} w (y) a (y^{- 1} x)

for all

x \in G .

Under the standard basis

{δ_{g}}_{g \in G}

where

δ_{g} (x) = 1

if

x = g

and

δ_{g} (x) = 0

if

x \neq g

, we can write

W (a) = (\begin{matrix} w_{0} & w_{n - 1} & \dots & w_{1} \\ w_{1} & w_{0} & \dots & w_{2} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ w_{n - 1} & w_{n - 2} & \dots & w_{0} \end{matrix}) (\begin{matrix} a_{0} \\ a_{1} \\ ⋮ \\ a_{n - 1} \end{matrix}) .

It means that the matrix representation of a convolution linear map is a circulant matrix. Circulant matrices have a nice property.
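A short numerical check of this identification, assuming G = Z_4 and sample vectors chosen only for illustration: the matrix with entry (i, j) equal to w_{(i - j) mod n} has first column w and first row (w_0, w_{n-1}, ..., w_1), and applying it to a gives the same result as the convolution w * a.

```python
import numpy as np

def circulant(w):
    """Matrix with entry (i, j) equal to w[(i - j) mod n]; its first column is w."""
    w = np.asarray(w, dtype=complex)
    n = len(w)
    i, j = np.indices((n, n))
    return w[(i - j) % n]

def cyclic_convolution(w, a):
    """(w * a)(x) = sum_y w(y) a(x - y) on Z_n."""
    w = np.asarray(w, dtype=complex)
    a = np.asarray(a, dtype=complex)
    n = len(w)
    return np.array([sum(w[y] * a[(x - y) % n] for y in range(n)) for x in range(n)])

w = np.array([1, 2, 3, 4])
a = np.array([1, 0, -1, 2])
W = circulant(w)
```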

Lemma 1.

Let

π = c i r c (0, 1, 0, \dots, 0) = (\begin{matrix} 0 & 1 & 0 & \dots & 0 \\ 0 & 0 & 1 & \dots & 0 \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ 1 & 0 & 0 & \dots & 0 \end{matrix})

. Let A be an

n \times n

matrix. Then A is circulant if and only if

A π = π A .

Proof.

We know that

π

is the permutation matrix of

σ = (1, 2, \dots, n)

. So, we can get

π^{- 1} = π^{*}

where

π^{*}

is the conjugate transpose of

π

. Let

A = (a_{i, j})

. We know that

π A π^{*} = (a_{σ (i), σ (j)}) .

We know that A is a circulant matrix if and only if

a_{i, j} = a_{σ (i), σ (j)}

. So, we can conclude

π A π^{*} = A

if and only if A is a circulant matrix. Because of

π^{*} = π^{- 1}

, we can get

π A = A π

if and only if A is a circulant matrix. □
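Lemma 1 is easy to test on small examples. The sketch below, with n = 4 and matrices picked only for illustration, builds π with ones at the positions (i, i + 1 mod n) and checks that a circulant matrix commutes with π while a diagonal matrix with distinct entries does not.

```python
import numpy as np

def circulant(w):
    """Matrix with entry (i, j) equal to w[(i - j) mod n]."""
    w = np.asarray(w, dtype=complex)
    n = len(w)
    i, j = np.indices((n, n))
    return w[(i - j) % n]

n = 4
# pi = circ(0, 1, 0, ..., 0): row i has its single 1 in column (i + 1) mod n.
pi = np.roll(np.eye(n), 1, axis=1)

def commutes_with_pi(A):
    """Test the criterion A pi = pi A from Lemma 1."""
    return np.allclose(A @ pi, pi @ A)

A = circulant([1, 2, 3, 4])   # circulant, so it should commute with pi
D = np.diag([1, 2, 3, 4])     # not circulant, so it should not commute with pi
```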

Lemma 2.

Let

A, B

be two invertible matrices. If

A C B

is a circulant matrix for every circulant matrix C, then

A, B

are also circulant.

Proof.

Let

A, B

be two invertible matrices and

π = c i r c (0, 1, 0, \dots, 0)

. Let

A C B

be a circulant matrix for every circulant matrix C. We have

π (A I B) = π (A B) = (A B) π

(1)

where I is the identity matrix. From Equation (1), we have

π A = A B π B^{- 1}

. Now we have

\begin{matrix} (A π B) π & = π (A π B) \\ A π B π & = (π A) π B \\ A π B π & = (A B π B^{- 1}) π B \\ π B π & = (B π B^{- 1}) π B \\ B & = π^{*} B π B^{- 1} π B π^{*} . \end{matrix}

We get

B = π^{*} B π B^{- 1} π B π^{*}

. Notice that

\begin{matrix} B π B^{- 1} & = (π^{*} B π B^{- 1} π B π^{*}) π B^{- 1} \\ & = π^{*} B π B^{- 1} π . \end{matrix}

Hence

B π B^{- 1}

is a circulant matrix. From the diagonalization of circulant matrices, we get

B π B^{- 1} = F^{*} D F

where F is a Fourier matrix and D is a diagonal matrix. Now, we have

\begin{matrix} B π B^{- 1} = B F^{*} Ω F B^{- 1} & = F^{*} D F \\ F B F^{*} Ω F B^{- 1} F^{*} & = D \end{matrix}

where

Ω = d i a g (1, ω^{k}, ω^{2 k}, \dots, ω^{(n - 1) k})

. Let

D = d i a g (d_{1}, d_{2}, d_{3}, \dots, d_{n})

and

M = F B F^{*} = (\begin{matrix} m_{1, 1} & m_{1, 2} & m_{1, 3} & \dots & m_{1, n} \\ m_{2, 1} & m_{2, 2} & m_{2, 3} & \dots & m_{2, n} \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ m_{n, 1} & m_{n, 2} & m_{n, 3} & \dots & m_{n, n} \end{matrix}) .

We get

M Ω = D M .

\begin{matrix} \Rightarrow (\begin{matrix} m_{1, 1} & m_{1, 2} & \dots & m_{1, n} \\ m_{2, 1} & m_{2, 2} & \dots & m_{2, n} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ m_{n, 1} & m_{n, 2} & \dots & m_{n, n} \end{matrix}) (\begin{matrix} 1 & 0 & \dots & 0 \\ 0 & ω^{k} & \dots & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & \dots & ω^{(n - 1) k} \end{matrix}) \\ = (\begin{matrix} d_{1} & 0 & \dots & 0 \\ 0 & d_{2} & \dots & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & \dots & d_{n} \end{matrix}) (\begin{matrix} m_{1, 1} & m_{1, 2} & \dots & m_{1, n} \\ m_{2, 1} & m_{2, 2} & \dots & m_{2, n} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ m_{n, 1} & m_{n, 2} & \dots & m_{n, n} \end{matrix}) . \\ \Rightarrow (\begin{matrix} m_{1, 1} & ω^{k} m_{1, 2} & \dots & ω^{(n - 1) k} m_{1, n} \\ m_{2, 1} & ω^{k} m_{2, 2} & \dots & ω^{(n - 1) k} m_{2, n} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ m_{n, 1} & ω^{k} m_{n, 2} & \dots & ω^{(n - 1) k} m_{n, n} \end{matrix}) = (\begin{matrix} d_{1} m_{1, 1} & d_{1} m_{1, 2} & \dots & d_{1} m_{1, n} \\ d_{2} m_{2, 1} & d_{2} m_{2, 2} & \dots & d_{2} m_{2, n} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ d_{n} m_{n, 1} & d_{n} m_{n, 2} & \dots & d_{n} m_{n, n} \end{matrix}) . \end{matrix}

We know

d_{i} \neq 0

for all

i \in {1, 2, \dots, n}

because D is invertible. Now, let us see the component of the matrices in the last equation. For column j we get that

d_{i} = ω^{(j - 1) k}

for every

i \in {1, 2, \dots, n}

or

m_{i, j} = 0

for

i \neq j

. If

d_{i} = ω^{(j - 1) k}

for every

i \in {1, 2, \dots n}

, we must get

m_{i, l} = 0

for

l \neq j

but that is impossible because M is an invertible matrix. So, we can get

m_{i, j} = 0

for

i \neq j

. It means that M must be a diagonal matrix and

D = Ω

. So, we have

\begin{matrix} F B F^{*} Ω F B^{- 1} F^{*} & = D \\ F B F^{*} Ω F B^{- 1} F^{*} & = Ω \\ B π B^{- 1} & = F^{*} Ω F \\ B π B^{- 1} & = π . \end{matrix}

Therefore, B is a circulant matrix. Notice that

\begin{matrix} π (A B) & = (A B) π \\ π A B & = A (B π) \\ (π A) B & = A (π B) & because B is circulant . \end{matrix}

We conclude that

π A = A π

and A is a circulant matrix. □

Let

C i r (C^{n})

be the set of all

n \times n

invertible circulant matrices over

C

.

Theorem 2.

Let G be a cyclic group of order

n .

The set

C_{Γ} (Q)

under the operation · forms a subgroup of Γ (Q). Furthermore,

C_{Γ} (Q) = \prod_{v \in V} C i r (C^{n})

Proof.

Let

τ \in C_{Γ} (Q)

and

W \in C_{C^{G}} (Q)

. From the definition, we know that

τ W τ^{- 1}

is in

C_{C^{G}} (Q)

. Let

τ W τ^{- 1} = U

, then for every

ϵ \in E

we have

U_{ϵ} = τ_{t (ϵ)} W_{ϵ} τ_{s (ϵ)}^{- 1} .

We know that

U_{ϵ}

and

W_{ϵ}

are convolution maps, so their matrices are circulant. From Lemma 2 we get that

τ_{t (ϵ)}

and

τ_{s (ϵ)}^{- 1}

must be circulant too, and

τ_{s (ϵ)}

is circulant because the inverse of a circulant matrix is a circulant matrix. Hence, for every

v \in V

,

τ_{v}

must be a circulant matrix, so we have

C_{Γ} (Q) \subseteq \prod_{v \in V} C i r (C^{G}) = \prod_{v \in V} C i r (C^{n})

. Now, we will see

\prod_{v \in V} C i r (C^{G}) \subseteq C_{Γ} (Q)

. We know that the product of two circulant matrices is also a circulant matrix, and so is the inverse of an invertible circulant matrix. So we must have

\prod_{v \in V} C i r (C^{G}) \subseteq C_{Γ} (Q)

. It is easy to see

C_{Γ} (Q)

is a group because

\prod_{v \in V} C i r (C^{G})

forms a group under · operation. □
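The closure properties used for the second inclusion can be illustrated numerically. The following sketch only checks one example of that direction: invertible circulant matrices at the source and target of an arrow send a circulant weight to a circulant weight. The vectors are hypothetical and were chosen so that their discrete Fourier transforms have no zero entry, which makes the corresponding circulant matrices invertible.

```python
import numpy as np

def circulant(w):
    """Matrix with entry (i, j) equal to w[(i - j) mod n]."""
    w = np.asarray(w, dtype=complex)
    n = len(w)
    i, j = np.indices((n, n))
    return w[(i - j) % n]

def is_circulant(A):
    """A is circulant exactly when it is rebuilt from its own first column."""
    return np.allclose(A, circulant(A[:, 0]))

tau_s = circulant([3, 1, 0, 1])   # eigenvalues 5, 3, 1, 3
tau_t = circulant([2, 0, 1, 0])   # eigenvalues 3, 1, 3, 1
W_eps = circulant([1, 2, 3, 4])   # a convolution weight on the arrow

# Change of basis on one arrow: U_eps = tau_t W_eps tau_s^{-1}.
U_eps = tau_t @ W_eps @ np.linalg.inv(tau_s)
```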

3. Moduli Space of Neural Networks from Convolution Group Algebras

3.1. Neural Network Function

Definition 11.

Let

Q = (V, E, s, t)

be a quiver. The delooped quiver is the quiver

Q ° = (V °, E °, s °, t °)

where

V ° = V

,

E ° = {ϵ \in E | s (ϵ) \neq t (ϵ)}

,

s ° = s ↾_{E °}

, and

t ° = t ↾_{E °}

[4].

Definition 12.

Let

Q = (V, E, s, t)

be a network quiver. The hidden quiver is the quiver

\tilde{Q} = (\tilde{V}, \tilde{E}, \tilde{s}, \tilde{t})

where

\tilde{V}

is a set of all vertices in the hidden layer,

\tilde{E}

is a set of all arrows between hidden vertices,

\tilde{s}, \tilde{t} : \tilde{E} \to \tilde{V}

induced by

s, t

[4].
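Definitions 11 and 12 are purely combinatorial, so they can be sketched with a small data structure. The `Quiver` class, the arrow triples, and the example quiver below are hypothetical; the hidden quiver is computed here from the delooped quiver, which is an assumption about how loops should be treated.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quiver:
    vertices: tuple
    arrows: tuple  # each arrow is a (name, source, target) triple

def deloop(q):
    """Keep only the arrows whose source and target differ (Definition 11)."""
    return Quiver(q.vertices, tuple(a for a in q.arrows if a[1] != a[2]))

def hidden_quiver(q, hidden):
    """Restrict q to the hidden vertices and the arrows between them (Definition 12)."""
    hidden = set(hidden)
    vertices = tuple(v for v in q.vertices if v in hidden)
    arrows = tuple(a for a in q.arrows if a[1] in hidden and a[2] in hidden)
    return Quiver(vertices, arrows)

# Hypothetical network quiver: input a, hidden b and c (one loop each), output d.
Q = Quiver(
    vertices=("a", "b", "c", "d"),
    arrows=(("e1", "a", "b"), ("e2", "b", "c"), ("e3", "c", "d"),
            ("l_b", "b", "b"), ("l_c", "c", "c")),
)
Q_deloop = deloop(Q)
Q_hidden = hidden_quiver(Q_deloop, hidden=("b", "c"))
```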

Definition 13.

Let Q be a network quiver. Let

Q °

be a delooped quiver of Q. A neural network over quiver Q is a pair of

(W, f)

such that W is a convolution representation of

Q °

and

f = {f_{v}}_{{v \in \tilde{V}}}

where for every

v \in \tilde{V}

we have

f_{v} : C^{G} \to C^{G}

is a differentiable function. The function

f_{v}

is called the activation function [4].

Now, we will define a neural network function using the vertex output function.

Definition 14.

Let

(W, f)

be a neural network over quiver Q. Define

ζ_{v} = {ϵ \in E | t (ϵ) = v}

and a function

a {(W, f)}_{v} : {(C^{G})}^{d} \to C^{G}

where

a {(W, f)}_{v} (x) = \{\begin{matrix} x_{v} & if v is an input vertex \\ 1 & if v is a bias vertex \\ f_{v} (\sum_{ϵ \in ζ_{v}} w_{ϵ} a {(W, f)}_{s ° (ϵ)} (x)) & if v is a vertex that has activation function \\ \sum_{ϵ \in ζ_{v}} w_{ϵ} a {(W, f)}_{s ° (ϵ)} (x) & if v is an output vertex \end{matrix} .

Define the function

Ψ (W, f) : {(C^{G})}^{d} \to {(C^{G})}^{k}

where every

x \in {(C^{G})}^{d}

is mapped to

Ψ (W, f) (x) = {a {(W, f)}_{v} (x)}_{{v \in V_{o u t}}}

with

V_{o u t} = {v \in V | v is an output vertex}

[4].
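To make Definition 14 concrete, here is a minimal sketch for the smallest possible network quiver, a chain input → hidden → output with no bias vertex, so d = k = 1. The convolution is computed through the FFT, and the activation function is taken to be the elementwise hyperbolic tangent. Both the quiver and the activation are assumptions made for the example, not choices prescribed by the paper.

```python
import numpy as np

def conv(w, a):
    """Cyclic convolution w * a on Z_n, computed via the FFT."""
    return np.fft.ifft(np.fft.fft(w) * np.fft.fft(a))

def network_function(x, w1, w2, f=np.tanh):
    """Psi(W, f)(x) for the chain: input --w1--> hidden --w2--> output."""
    a_hidden = f(conv(w1, x))    # hidden vertex: activation applied to the weighted input
    return conv(w2, a_hidden)    # output vertex: weighted sum, no activation

x = np.array([0.5, -1.0, 2.0])       # one element of C^G for G = Z_3
delta = np.array([1.0, 0.0, 0.0])    # identity for convolution
w1 = np.array([1.0, 0.5, 0.0])       # hypothetical weights
w2 = np.array([0.0, 1.0, -1.0])
y = network_function(x, w1, w2)
```

With both weights equal to `delta`, the network reduces to the activation function alone, which gives a quick sanity check of the implementation.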

Using the concept of morphism of quiver representations, we can define a morphism of neural networks.

Definition 15.

Let

(W, f)

and

(W^{'}, f^{'})

be two neural networks over a network quiver Q. A morphism of neural networks is a morphism of convolution representations

τ : (W, f) \to (W^{'}, f^{'})

that satisfies

$τ_{v} = 1$ for every $v \notin \tilde{V}$ ;
For every $v \in \tilde{V}$ , the activation functions commute with τ, that is, $τ_{v} \circ f_{v} = f_{v}^{'} \circ τ_{v}$ ,

where

\tilde{V}

is a set of all vertices in the hidden layer. If for every

v \in V

,

τ_{v}

is bijective, then τ is an isomorphism and

(W, f)

is isomorphic to

(W^{'}, f^{'})

[4].

Now, we will see the similarity of neural network functions from two isomorphic neural networks.

Theorem 3.

If

(W, f)

and

(V, g)

are two isomorphic

C^{G}

-neural networks then

Ψ (W, f) (x) = Ψ (V, g) (x) .

Proof.

Let

(W, f)

and

(V, g)

be two isomorphic

C^{G}

-neural networks. It means that there is an isomorphism

τ

such that

τ ⋄ (W, f) = (V, g)

. Let

ϵ : s (ϵ) \to t (ϵ)

be an arrow in Q. Let

x \in {(C^{G})}^{d}

be an input for

(W, f)

and

(V, g)

. If v is an input or bias vertex, we have

a {(W, f)}_{v} (x) = a {(V, g)}_{v} (x) = \{\begin{matrix} x_{v} & v is an input vertex \\ 1 & v is a bias vertex \end{matrix}

because

τ_{v} = i d

.

If v is in the first hidden layer, we have

\begin{matrix} a {(V, g)}_{v} (x) & = g_{v} (\sum_{ϵ \in ζ_{v}} V_{ϵ} a {(V, g)}_{s (ϵ)} (x)) \\ & = τ_{v} f_{v} (τ_{v}^{- 1} \sum_{ϵ \in ζ_{v}} V_{ϵ} a {(V, g)}_{s (ϵ)} (x)) \\ & = τ_{v} f_{v} (τ_{v}^{- 1} \sum_{ϵ \in ζ_{v}} τ_{t (ϵ)} W_{ϵ} τ_{s (ϵ)}^{- 1} a {(V, g)}_{s (ϵ)} (x)) \\ & = τ_{v} f_{v} (\sum_{ϵ \in ζ_{v}} τ_{v}^{- 1} τ_{v} W_{ϵ} a {(V, g)}_{s (ϵ)} (x)) & because τ_{s (ϵ)} = 1 \\ & = τ_{v} f_{v} (\sum_{ϵ \in ζ_{v}} W_{ϵ} a {(V, g)}_{s (ϵ)} (x)) . \end{matrix}

Since

s (ϵ)

is an input or bias vertex, we have

a {(V, g)}_{s (ϵ)} (x) = a {(W, f)}_{s (ϵ)} (x)

. Hence, we get

a {(V, g)}_{v} (x) = τ_{v} f_{v} (\sum_{ϵ \in ζ_{v}} W_{ϵ} a {(W, f)}_{s (ϵ)} (x)) .

Therefore, we get

a {(V, g)}_{v} (x) = τ_{v} a {(W, f)}_{v} (x) .

(2)

For v in the second hidden layer, we have

\begin{matrix} a {(V, g)}_{v} (x) & = g_{v} (\sum_{ϵ \in ζ_{v}} V_{ϵ} a {(V, g)}_{s (ϵ)} (x)) \\ = τ_{v} f_{v} (τ_{v}^{- 1} \sum_{ϵ \in ζ_{v}} V_{ϵ} a {(V, g)}_{s (ϵ)} (x)) \\ = τ_{v} f_{v} (τ_{v}^{- 1} \sum_{ϵ \in ζ_{v}} τ_{t (ϵ)} W_{ϵ} τ_{s (ϵ)}^{- 1} a {(V, g)}_{s (ϵ)} (x)) \\ = τ_{v} f_{v} (\sum_{ϵ \in ζ_{v}} τ_{v}^{- 1} τ_{v} W_{ϵ} τ_{s (ϵ)}^{- 1} τ_{s (ϵ)} a {(W, f)}_{s (ϵ)} (x)) & because ϵ \in ζ_{v} \\ = τ_{v} f_{v} (\sum_{ϵ \in ζ_{v}} W_{ϵ} a {(W, f)}_{s (ϵ)} (x)) \\ = τ_{v} a {(W, f)}_{v} (x) & from the definition of (W, f) . \end{matrix}

It means if v is in the second hidden layer, we get

a {(V, g)}_{v} (x) = τ_{v} a {(W, f)}_{v} (x) .

Inductively, we will get

a {(V, g)}_{v} (x) = τ_{v} a {(W, f)}_{v} (x)

for every v in the hidden layer. If v is an output vertex, we have

\begin{matrix} a {(V, g)}_{v} (x) & = \sum_{ϵ \in ζ_{v}} V_{ϵ} a {(V, g)}_{s (ϵ)} (x) \\ & = \sum_{ϵ \in ζ_{v}} τ_{t (ϵ)} W_{ϵ} τ_{s (ϵ)}^{- 1} a {(V, g)}_{s (ϵ)} (x) \\ & = \sum_{ϵ \in ζ_{v}} τ_{v} W_{ϵ} τ_{s (ϵ)}^{- 1} τ_{s (ϵ)} a {(W, f)}_{s (ϵ)} (x) & because s (ϵ) is in hidden layer \\ & = \sum_{ϵ \in ζ_{v}} W_{ϵ} a {(W, f)}_{s (ϵ)} (x) & because τ_{v} = 1 for v \notin \tilde{V} . \end{matrix}

Therefore, we get

a {(V, g)}_{v} (x) = a {(W, f)}_{v} (x) .

Since this holds for every output vertex v, we get

Ψ (W, f) = Ψ (V, g) .

□
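The invariance in Theorem 3 can be observed on a small example. The sketch below assumes the chain quiver input → hidden → output, an elementwise tanh activation, and an invertible circulant matrix τ at the hidden vertex (identity at the input and output vertices). The second network is built as V_ϵ = τ_{t(ϵ)} W_ϵ τ_{s(ϵ)}^{-1} with activation g = τ f τ^{-1}, following the relations used in the proof. All concrete numbers are hypothetical.

```python
import numpy as np

def circulant(w):
    """Matrix with entry (i, j) equal to w[(i - j) mod n]."""
    w = np.asarray(w, dtype=complex)
    n = len(w)
    i, j = np.indices((n, n))
    return w[(i - j) % n]

f = np.tanh                          # activation of the first network (an assumption)
W1 = circulant([1, 2, 0, -1])        # input -> hidden
W2 = circulant([0.5, 0, 1, 0])       # hidden -> output
tau = circulant([3, 1, 0, 1])        # invertible: its eigenvalues are 5, 3, 1, 3
tau_inv = np.linalg.inv(tau)

V1 = tau @ W1                        # tau_hidden W1 tau_input^{-1}, with tau_input = I
V2 = W2 @ tau_inv                    # tau_output W2 tau_hidden^{-1}, with tau_output = I

def g(z):
    """Activation of the second network: g = tau f tau^{-1}."""
    return tau @ f(tau_inv @ z)

x = np.array([0.3, -0.7, 1.1, 0.2], dtype=complex)
psi_W = W2 @ f(W1 @ x)               # Psi(W, f)(x)
psi_V = V2 @ g(V1 @ x)               # Psi(V, g)(x)
```

The hidden activations of the two networks differ by τ, but the outputs agree up to floating-point error.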

3.2. Moduli Spaces of Neural Network

The last theorem tells us that two isomorphic neural networks have the same neural network function. The moduli space can be defined through the isomorphism classes of neural networks as follows.

Definition 16.

Let G be a group and X be a set. An action of G on X is a map

\cdot : G \times X \to X

, denoted by

a \cdot x

such that

$e \cdot x = x$ for every $x \in X$ where e is the identity in G,
$a \cdot (b \cdot x) = (a \cdot b) \cdot x$ for all $a, b \in G$ and all $x \in X$ .

The set

O = \{a \cdot x | a \in G\}

is called an orbit of the action [8].

We know that

Γ (Q)

is a group of isomorphisms. Let

C^{G} (Q)

be the set of all

C^{G}

-representations of Q. Define a map

\begin{matrix} \cdot : & Γ (Q) \times C^{G} (Q) \to C^{G} (Q) \\ & (τ, W) \mapsto τ \cdot W \end{matrix}

where

τ \cdot W = τ W τ^{- 1}

. We know that the map · is an action of

Γ (Q)

on

C^{G} (Q)

because we have

I = {i d_{C^{G}}}_{v \in V} \in Γ (Q)

and

I W I^{- 1} = I W I = W

for all

W \in C^{G} (Q)

. We also have

τ \cdot (σ \cdot W) = τ \cdot (σ W σ^{- 1}) = τ σ W σ^{- 1} τ^{- 1} = (τ σ) W {(τ σ)}^{- 1} = (τ \cdot σ) \cdot W

for all

τ, σ \in Γ (Q)

and all

W \in C^{G} (Q)

. Now, we can define

M (Q)

as the set of all orbits of the action of

Γ

on

C^{G} (Q)

. The set

M (Q)

is the moduli space of

C^{G}

-representations (see also [9]). We will apply the same group action to

C_{Γ} (Q)

as a subgroup of

Γ (Q)

and

C_{C^{G}} (Q)

as a subset of

C^{G} (Q)

. We know that the map · is also a group action of

C_{Γ} (Q)

on

C_{C^{G}} (Q)

.

Definition 17.

Let

Q = (V, E, s, t)

be a network quiver. Define an action of

C_{Γ} (Q)

on

C_{C^{G}} (Q)

as

τ \cdot W = τ W τ^{- 1}

. The moduli space

M (Q)

of the convolution representation is the set of all orbits from the group action.

Let

\tilde{W}

be a convolution representation of

\tilde{Q}

. We can fix a family of vector spaces

{V_{v}}_{v \in \tilde{V}}

indexed by

v \in \tilde{V}

and given

V_{v} = {(C^{G})}^{k}

if v is an output vertex in

\tilde{Q}

and

V_{v} = 0

for any other vertices in

\tilde{Q}

. A choice of convolution representations

\tilde{W}

of hidden quiver

\tilde{Q}

and a linear map

h_{v} : {\tilde{W}}_{v} \to V_{v}

for each

v \in \tilde{V}

determines a pair

(\tilde{W}, h)

where

h = {h_{v}}_{v \in \tilde{V}}

that is known as a framed quiver representation of

\tilde{Q}

by the family of vector spaces

{V_{v}}_{v \in \tilde{V}}

. From the definition, we can say

k e r (h) = {k e r (h_{v})}_{v \in \tilde{V}}

.

Dually, we can fix a family of vector spaces

{U_{v}}_{v \in \tilde{V}}

indexed by

v \in \tilde{V}

and given

U_{v} = {(C^{G})}^{d}

when v is an input vertex of

\tilde{Q}

and

U_{v} = 0

for any other

v \in \tilde{V}

. A choice of convolution representation

\tilde{W}

of hidden quiver

\tilde{Q}

and a linear map

l_{v} : U_{v} \to {\tilde{W}}_{v}

for each

v \in \tilde{V}

determines a pair

(l, \tilde{W})

where

l = {l_{v}}_{v \in \tilde{V}}

, that is known as co-framed quiver representation of

\tilde{Q}

by the family of vector spaces

{U_{v}}_{v \in \tilde{V}}

. From the definition we can say

I m (l) = {I m (l_{v})}_{v \in \tilde{V}}

.

Definition 18.

A double-framed convolution quiver representation is a triple

(l, \tilde{W}, h)

where

\tilde{W}

is a convolution representation of quiver

\tilde{Q}

,

(\tilde{W}, h)

is a framed quiver representation of

\tilde{Q}

, and

(l, \tilde{W})

is a co-framed quiver representation of

\tilde{Q}

[4].

Now, we will see the stability of the double-framed moduli space.

Definition 19.

A double-framed quiver representation

(l, \tilde{W}, h)

is stable if

The only subrepresentation contained in $k e r (h)$ is the zero subrepresentation;
The only subrepresentation that contains $I m (l)$ is $\tilde{W}$ [4].

From this definition, we can get this lemma.

Lemma 3.

Let

(l, \tilde{W}, h)

be a double-framed convolution representation. Then

(l, \tilde{W}, h)

is stable if

For every output vertex v, $k e r (h_{v}) = ⋂_{i = 1}^{k} k e r ({(h_{v})}_{i})$ holds;
For every input vertex v, $I m (l_{v}) = s p a n (⋃_{i = 1}^{k} I m ({(l_{v})}_{i}))$ holds.

This lemma only gives us a sufficient condition for the stability condition. Not all double-framed convolution representations are stable. Nevertheless, we can see the dimension of the moduli space. Let us see this example first.

Example 1.

Let Q be a network quiver that is drawn like this:

We can make the delooped quiver from Q as

Q °

After that, we will make a new quiver as follows:

The quiver will be denoted as

Q^{v}

. The

C^{G}

quiver representation of Q is as follows:

We can also choose a morphism of representation τ such that

τ_{a_{3}} = {W_{ϵ_{1}}}^{- 1}

,

τ_{a_{4}} = W_{ϵ_{5}}^{- 1}

,

τ_{a_{5}} = W_{ϵ_{6}}^{- 1}

,

τ_{a_{6}} = {W_{ϵ_{1}}}^{- 1} {W_{ϵ_{7}}}^{- 1}

,

τ_{a_{7}} = {W_{ϵ_{5}}}^{- 1} {W_{ϵ_{11}}}^{- 1}

,

τ_{a_{8}} = {W_{ϵ_{6}}}^{- 1} {W_{ϵ_{15}}}^{- 1}

, and

τ_{a_{1}} = τ_{a_{2}} = τ_{a_{9}} = τ_{a_{10}} = I

such that quiver

Q^{v}

has weight equal to identity for all arrows:

So, the dimension of the space of double-framed convolution representations is proportional to the number of arrows whose maps are not equal to the identity.

Theorem 4.

Let

V_{i n}

be the set of input vertices of Q and

V_{o u t}

be the set of output vertices from Q. Define

C_{Γ} (\tilde{Q}) = {I}_{v \in V_{i n}} \times \prod_{v \in \tilde{V}} C i r (C^{G}) \times {I}_{v \in V_{o u t}} .

Let

M (\tilde{Q})

be the set of all orbits from the action of

C_{Γ} (\tilde{Q})

on

C_{C^{G}} (\tilde{Q})

, then the set

M (\tilde{Q})

will form the moduli space. Furthermore, the dimension of the moduli space is

| G | (| E ° | - | \tilde{V} |)

.

Proof.

Let $Q$ be a network quiver with $\tilde{Q}$ as the hidden quiver and $Q^{\circ}$ as the delooped quiver, and let $W$ be a convolution representation of the quiver $Q^{\circ}$. For every hidden vertex $v \in \tilde{V}$, there is an arrow $\epsilon \in E^{\circ}$ such that $t(\epsilon) = v$. We choose exactly one such arrow for every $v \in \tilde{V}$, and the chosen arrows form a new quiver $Q^{v}$. By construction, no two arrows of $Q^{v}$ have the same target. This implies that $Q^{v}$ is a union of trees, and the intersection of any two trees can only be a source vertex of $Q$. Furthermore, each of those trees has a unique source, and that source is not a hidden vertex, because every hidden vertex is the target of an arrow of $Q^{v}$.

Now, we construct a morphism of quiver representations $\tau = \{\tau_{v}\}_{v \in V}$. If $v$ is an input or output vertex, set $\tau_{v} = I$, as required by the definition of $C_{\Gamma}(\tilde{Q})$. If $v$ is a hidden vertex, there is an arrow $\alpha$ of $Q^{v}$ such that $t(\alpha) = v$, so we can set $\tau_{v} = W_{\alpha}^{-1} \tau_{s(\alpha)}$. Applying this recursive formula along each tree, starting from its source, we get a new convolution representation such that every arrow in $Q^{v}$ is represented by the identity linear map from $C^{G}$ to $C^{G}$. From that fact, we conclude that the number of free choices in the quiver representation equals the number of arrows in $Q^{\circ}$ that have not been set to the identity, namely $|E^{\circ}| - |\tilde{V}|$, since exactly one arrow was chosen for each hidden vertex.

Consider

$$C = \{T : C^{G} \to C^{G} \mid \exists\, a \in C^{G} \text{ such that } T(f) = a * f\}.$$

Using the Fourier transform, we know that the dimension of $C$ is equal to $|G|$. Therefore, the dimension of the moduli space is $|G| (|E^{\circ}| - |\tilde{V}|)$. □
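The two ingredients of this dimension count can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the authors' implementation: it takes G = Z_4 with made-up vectors `a` and `f`, and the helper names `dft`, `conv`, and `moduli_dimension` are hypothetical. The sketch checks that the Fourier transform turns convolution by `a` into pointwise multiplication, so that a convolution map is fixed by |G| numbers, and then evaluates the formula of Theorem 4 for a hypothetical quiver with four arrows and two hidden vertices over Z_3.

```python
import cmath

def dft(a):
    """Unnormalized discrete Fourier transform on Z_n."""
    n = len(a)
    return [sum(a[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]

def conv(a, f):
    """Cyclic convolution (a * f)(k) = sum_m a(m) f(k - m) on Z_n."""
    n = len(a)
    return [sum(a[m] * f[(k - m) % n] for m in range(n)) for k in range(n)]

def moduli_dimension(order_G, num_arrows, num_hidden):
    """Evaluate |G| (|E°| - |V~|), the dimension stated in Theorem 4."""
    return order_G * (num_arrows - num_hidden)

a = [2, 1, 0, 5]      # made-up kernel of a convolution map T(f) = a * f
f = [1, -1, 3, 0]     # made-up input vector

# Convolution theorem: dft(a * f) equals dft(a) times dft(f), entry by entry,
# so T acts diagonally in the Fourier basis with |G| diagonal entries.
lhs = dft(conv(a, f))
rhs = [x * y for x, y in zip(dft(a), dft(f))]
max_gap = max(abs(x - y) for x, y in zip(lhs, rhs))

# The kernel is recovered from the map by applying it to the delta function,
# so the assignment a -> T is injective and the space of maps has dimension |G|.
delta = [1, 0, 0, 0]
recovered = conv(a, delta)

print(max_gap < 1e-9, recovered == a, moduli_dimension(3, 4, 2))
```

The last value printed is the dimension 3 · (4 − 2) for the assumed quiver; other quivers only change the three arguments of `moduli_dimension`.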

4. Conclusions

We have defined a neural network function for an artificial neural network built from $C^{G}$ representations, in particular from convolution representations. We have also described properties of the moduli space formed by the action of the isomorphism group on the set of all convolution representations. Because the change of basis sets one incoming arrow of every hidden vertex to the identity, the number of free parameters drops to $|G| (|E^{\circ}| - |\tilde{V}|)$, which reduces the complexity of the neural network algorithm.

In further research, we will explore the moduli spaces of recurrent neural networks and the topology of these moduli spaces. We will also study properties of neural network functions with different types of activation functions and the effect of the activation function on the continuity of the neural network function.

Author Contributions

Conceptualization, L.C.W., I.M.-A. and D.N.; methodology, L.C.W.; software, L.C.W.; validation, I.M.-A. and D.N.; formal analysis, L.C.W., I.M.-A. and D.N.; investigation, L.C.W.; resources, L.C.W.; data curation, L.C.W.; writing—original draft preparation, L.C.W.; writing—review and editing, I.M.-A. and D.N.; visualization, L.C.W.; supervision, L.C.W.; project administration, I.M.-A.; funding acquisition, D.N. All authors have read and agreed to the published version of the manuscript.

Funding

The authors sincerely thank the Institute for Research and Community Services, Bandung Institute of Technology, for the financial support through the Research, Community Service and Innovation Program (PPMI FMIPA scheme) 2023.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments

The authors thank the anonymous referees for their helpful comments that improved the quality of the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
