Matrix Expression of Convolution and Its Generalized Continuous Form

In this paper, we consider the matrix expression of convolution and its generalized continuous form. The matrix expression of convolution is applied effectively in convolutional neural networks, and in this study we relate the concept of convolution in mathematics to the corresponding concept in convolutional neural networks. Convolution is, of course, a main process of deep learning, the learning method of deep neural networks, and a core technology. In addition, the generalized continuous form of convolution is expressed as a new variant of Laplace-type transform that encompasses almost all existing integral transforms. Finally, we describe the theoretical content in as much detail as possible so that the paper is self-contained.


Introduction
Deep learning means the learning of deep neural networks, which are called deep if multiple hidden layers exist. Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction [1]. The convolution in a convolutional neural network (CNN) is the tool for obtaining a feature map from the original image data: it sweeps the original image with a kernel matrix and transforms the original data into a different shape. This transformed image is called a feature map. Therefore, in a CNN, the convolution can be regarded as a tool that creates a feature map from the original image. Herein, the concept of convolution in artificial intelligence is demonstrated mathematically.
The core concept of a CNN is the convolution, which applies the weights to the receptive fields only and transforms the original data into a feature map. This is a principle similar to that of an integral transform, which maps a problem from its original domain to another domain where it can be solved more easily. Since the matrix expression of convolution is an essential concept in artificial intelligence, we believe that this study is certainly meaningful. In addition, the generalized continuous form of convolution has also been studied, and this form is expressed as a new variant of Laplace-type transform.
On the other hand, transform theory is extensively utilized in fields involving medical diagnostic equipment, such as magnetic resonance imaging and computed tomography. Typically, projection data are obtained by an integral transform, and an image is produced using an inverse transform. Although many plausible integral transforms exist, almost all of them fall short of full generality and can be interpreted as Laplace-type transforms. One of us proposed a comprehensive form of the Laplace-type integral transform in [2]. The present study investigates the matrix expression of convolution and its generalized continuous form.
In [2], a Laplace-type integral transform was proposed, expressed as
$$G_\alpha(f)(u) = u^\alpha \int_0^\infty e^{-t/u} f(t)\,dt. \qquad (1)$$
For values of α of 0, −1, 1, and −2, we have, respectively, the Laplace [3], Sumudu [4], Elzaki [5], and Mohand [6] transforms. This form can be expressed in various manners. Replacing t by ut, we have
$$G_\alpha(f)(u) = u^\beta \int_0^\infty e^{-t} f(ut)\,dt,$$
where β = α + 1. In this form, β values of 1, 0, 2, and −1 correspond to the Laplace, Sumudu, Elzaki, and Mohand transforms, respectively. If we substitute u = 1/s in (1), we obtain the simplest form of the generalized integral transform,
$$G_\alpha(f)(s) = s^\gamma \int_0^\infty e^{-st} f(t)\,dt,$$
where γ = −α. In this form, the Laplace, Sumudu, Elzaki, and Mohand transforms have γ values of 0, 1, −1, and 2, respectively. Put simply, the Sumudu transform can be derived by multiplying the Laplace transform by s; similarly, multiplying by s^{−1} yields the Elzaki transform, and multiplying by s^2 yields the Mohand transform. The natural transform [7] can be obtained by substituting f(t) with f(ut). Additionally, by substituting t = ln x, the Laplace-type transform G(f) can be expressed as
$$G_\alpha(f)(u) = u^\alpha \int_1^\infty x^{-1/u - 1} f(\ln x)\,dx,$$
which is similar in form to the Mellin transform [8], $\int_0^\infty f(x)\, x^{s-1}\,dx$.
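To make the family concrete, the following sympy sketch (our own illustration, not code from [2]) implements the proposed transform as G_α(f)(u) = u^α ∫_0^∞ e^{−t/u} f(t) dt and checks, for the sample input f(t) = t, that the stated values of α reproduce the known Laplace, Sumudu, Elzaki, and Mohand transforms of t.

```python
import sympy as sp

t, u, s = sp.symbols("t u s", positive=True)

def G(f, alpha):
    # G_alpha(f)(u) = u**alpha * integral_0^oo exp(-t/u) * f(t) dt
    return sp.simplify(u**alpha * sp.integrate(sp.exp(-t / u) * f, (t, 0, sp.oo)))

f = t  # sample input function

laplace = G(f, 0).subs(u, 1 / s)   # alpha = 0 with u = 1/s: should give 1/s**2
sumudu = G(f, -1)                  # alpha = -1: should give u
elzaki = G(f, 1)                   # alpha = 1: should give u**3
mohand = G(f, -2).subs(u, 1 / s)   # alpha = -2 with u = 1/s: should give 1
print(laplace, sumudu, elzaki, mohand)
```

The expected values 1/s², u, u³, and 1 are the standard closed forms of these four transforms applied to f(t) = t.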
As shown above, many integral transforms wear their own distinctive guises, but most of them can essentially be interpreted as Laplace-type transforms. From a different point of view, a slight change in the kernel results in a significant difference in integral transform theory. Meanwhile, plausible transforms exist, such as the Fourier, Radon, and Mellin transforms. Typically, if the interval of integration and the power of the kernel are different, the result can be interpreted as a completely different transform. Studies using the Laplace transform were conducted in [9,10]: the generalized solutions of the third-order Cauchy-Euler equation in the space of right-sided distributions were found in [9], the solution of the heat equation without boundary conditions was studied in [10], and further properties of the Laplace-type transform were investigated in [11]. As an application, a new class of Laplace-type integrals involving generalized hypergeometric functions has been studied [12,13]. As for research related to integral equations, Noeiaghdam et al. [14] presented a new scheme based on stochastic arithmetic, designed to guarantee the validity and accuracy of the homotopy analysis method; different kinds of integral equations, such as singular equations and equations of the first kind, are considered there to find optimal results by applying the proposed algorithms.
The main objective of this study is to investigate the matrix expression of convolution and its generalized continuous form. The generalized continuous form of the matrix expression is obtained as a new variant of Laplace-type transform. The obtained results are as follows. (1) If the matrix representing the function (image) f is A and the matrix representing the function g is B, then the convolution f * g is represented by the sum of all elements of A • B, and this is the same as tr(AB^T), where • is array multiplication, T is the transpose, and tr is the trace. Thus, the convolution in artificial intelligence (AI) is the same as tr(AB^T). (2) The generalized continuous form of the convolution in AI can be represented as $V(f) = \Phi(u) \cdot \int_0^\infty e^{-t\Delta} f(t)\,dt$, where Φ(u) is an arbitrary bounded function.

Matrix Expression of Convolution in Convolutional Neural Network (CNN)
Note that functions can be interpreted as images in artificial intelligence (AI). The convolution is changed from its continuous form, $(f * g)(t) = \int_0^t f(\tau)\, g(t - \tau)\,d\tau$, by discretization. The convolution in a CNN is the tool for obtaining a feature map from the original image data: it sweeps the original image with a kernel matrix (or filter) and transforms the original data into a different shape. To calculate the convolution, each n × n part of the original matrix is element-wise multiplied by the kernel matrix and all of the resulting components are added. Typically, a 3 × 3 kernel matrix is used. On the other hand, pooling (or sub-sampling) is a simple job: it reduces the size of the image made by the convolution. It relies on the principle that the apparent resolution increases when the screen is reduced.
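The sweeping and pooling steps described above can be sketched in numpy as follows; `conv2d` and `max_pool` are hypothetical helper names, and the 4 × 4 image and 2 × 2 kernel are toy data chosen only for illustration.

```python
import numpy as np

def conv2d(image, kernel):
    # "Sweeping": element-wise multiply each patch by the kernel and sum,
    # producing the (valid-mode) feature map.
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    # Down-sample by taking the maximum of each size x size block.
    H, W = fmap.shape
    trimmed = fmap[:H - H % size, :W - W % size]
    return trimmed.reshape(H // size, size, W // size, size).max(axis=(1, 3))

image = np.arange(16, dtype=float).reshape(4, 4)   # toy 4 x 4 "image"
kernel = np.array([[1.0, 0.0], [0.0, 1.0]])        # toy 2 x 2 kernel
fmap = conv2d(image, kernel)
print(fmap)            # 3 x 3 feature map
print(max_pool(fmap))  # pooled map
```

Real CNN layers add strides, padding, and multiple channels, but the inner patch-multiply-and-sum is exactly the operation described in the text.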
Let the matrix representing the function f be A and the matrix representing the function g be B. For two matrices A and B of the same dimension, the array multiplication (or sweeping) A • B is given by $(A \bullet B)_{ij} = a_{ij} b_{ij}$. For example, the array multiplication for 2 × 2 matrices is
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \bullet \begin{pmatrix} e & f \\ g & h \end{pmatrix} = \begin{pmatrix} ae & bf \\ cg & dh \end{pmatrix}.$$
The array multiplication appears in lossy compression, such as the Joint Photographic Experts Group (JPEG) format, and in the decoding step. Let us look at an example.
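In numpy, the array multiplication is simply the element-wise `*` operator, in contrast to the ordinary matrix product `@`; a minimal illustration with arbitrary 2 × 2 matrices:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

C = A * B   # array (Hadamard) multiplication: entry-wise product
D = A @ B   # ordinary matrix multiplication, shown for contrast

print(C)  # [[ 5 12], [21 32]]
print(D)  # [[19 22], [43 50]]
```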

Example 1.
In the classification field of AI, the pixels of an image are treated as a matrix. The computation is just an example for understanding; in a perceptron, the output takes a value between −1 and 1 through the activation function. Note that the perceptron is an artificial network designed to mimic the brain's cognitive abilities. Therefore, the output of a neuron (or node) Y can be represented in terms of a weight w, the threshold value Θ, and the activation applied to the weighted sum X = ∑_{i=1}^n x_i w_i. In the backpropagation algorithm of a deep neural network, the sigmoid function is used as the activation function [15]; this function is easy to differentiate and ensures that the neuron output lies in [0, 1]. If max-pooling is applied to the above convolved feature map, the resulting matrix becomes the 1 × 1 matrix (17) = 17.
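The neuron output described above can be sketched as follows; the sample inputs, the weights, and the convention of subtracting the threshold Θ inside the activation are assumptions made only for illustration.

```python
import math

def sigmoid(x):
    # Differentiable activation whose output lies in (0, 1),
    # as used in the backpropagation algorithm.
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(inputs, weights, theta):
    # X = sum_i x_i * w_i; the threshold theta shifts the activation input
    X = sum(x * w for x, w in zip(inputs, weights))
    return sigmoid(X - theta)

y = neuron_output([1.0, 0.5], [0.8, -0.4], theta=0.2)
print(y)  # sigmoid(0.4), about 0.599
```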
As discussed above, convolution in AI can be obtained by array multiplication. We would like to associate this definition with matrix multiplication in mathematics.

Definition 1. (Convolution in AI)
If the matrix representing the function (image) f is A and the matrix representing the function g is B, then the convolution f * g is represented by the sum of all elements of A • B, and this is the same as tr(AB^T), where • is array multiplication, T is the transpose, and tr is the trace. Thus, the convolution in AI is the same as tr(AB^T), the sum of all elements on the main diagonal of AB^T.
Typically, the convolution kernel is a 3 × 3 matrix, but for ease of understanding, let us consider 2 × 2 matrices. If
$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad B = \begin{pmatrix} e & f \\ g & h \end{pmatrix},$$
then the convolution in AI is calculated as ae + bf + cg + dh by the sweeping. On the other hand,
$$AB^T = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} e & g \\ f & h \end{pmatrix} = \begin{pmatrix} ae + bf & ag + bh \\ ce + df & cg + dh \end{pmatrix},$$
so tr(AB^T) = ae + bf + cg + dh, where T is the transpose and tr is the trace. This is the same result as in AI.
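The identity that the sum of all entries of A • B equals tr(AB^T) is easy to verify numerically; the random 3 × 3 matrices below are arbitrary test data.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 5, size=(3, 3)).astype(float)
B = rng.integers(-5, 5, size=(3, 3)).astype(float)

sweep = np.sum(A * B)       # sum of all entries of the array product A . B
trace = np.trace(A @ B.T)   # tr(A B^T)
print(sweep, trace)         # the two numbers agree
```

Both expressions equal ∑_{i,j} a_{ij} b_{ij}, which is why the sweeping used in CNNs and the trace formula give the same convolution value.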

Generalized Continuous Form of Matrix Expression of Convolution
If the matrix representing a function f is A and the matrix representing a function g is B, then the convolution of the functions f and g can be denoted by tr(AB^T). Intuitively, the diagonal part of B^T corresponds to a graph of g(t − τ), and the overlapping part of the graphs can be interpreted as the concept of intersection, that is, the concept of multiplication. Thus, the generalized continuous form of the convolution in AI can be represented as a variant of Laplace-type transform, defined as follows. If f(t) is a function defined for all t ≥ 0, an integral of Laplace-type transform vi(f) is given by
$$vi(f) = \int_0^\infty e^{-t\Delta} f(t)\,dt.$$
Additionally, let Φ(u) be an arbitrary bounded function; then the variant of Laplace-type transform V(f) is represented as V(f) = Φ(u) · vi(f). Let us now see the relation with other integral transforms. If δ → 1+ and Φ(u) = u^α, the variant corresponds to the G_α transform. When we take δ → 1, Φ(u) = 1, and u = 1/s, we get the Laplace transform. Similarly, when we take δ → 1 and Φ(u) = u^{−1} (respectively, δ → 1 and Φ(u) = u), we get the Sumudu transform (respectively, the Elzaki transform). To obtain a simple form of generalization, it would suffice to set Φ(u) to u^α for an arbitrary integer α. However, we judge that an arbitrary bounded function Φ(u) is the more suitable generalization, since Φ(u) can express more integral transforms.

Lemma 1 (Lebesgue dominated convergence theorem [16,17]). Let (X, M, µ) be a measure space and suppose {f_n} is a sequence of extended real-valued measurable functions defined on X such that (a) lim_{n→∞} f_n(x) = f(x) exists µ-a.e., and (b) there is an integrable function g so that for each n, |f_n| ≤ g µ-a.e.

Then, f is integrable and
$$\lim_{n \to \infty} \int_X f_n\,d\mu = \int_X f\,d\mu.$$
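The variant V(f) = Φ(u) · vi(f), with vi(f) = ∫_0^∞ e^{−t∆} f(t) dt, can be checked symbolically. The sketch below treats ∆ as a free positive parameter and recovers the Laplace transform (Φ = 1, ∆ = s) and the Sumudu transform (Φ(u) = 1/u, ∆ = 1/u) of f(t) = sin t; the helper names `vi` and `V` are our own.

```python
import sympy as sp

t, u, s = sp.symbols("t u s", positive=True)
Delta = sp.symbols("Delta", positive=True)

def vi(f):
    # vi(f) = integral_0^oo exp(-t*Delta) * f(t) dt
    return sp.integrate(sp.exp(-t * Delta) * f, (t, 0, sp.oo))

def V(f, Phi):
    # variant of Laplace-type transform: V(f) = Phi(u) * vi(f)
    return Phi * vi(f)

f = sp.sin(t)
laplace = sp.simplify(V(f, 1).subs(Delta, s))          # Phi = 1, Delta = s
sumudu = sp.simplify(V(f, 1 / u).subs(Delta, 1 / u))   # Phi = 1/u, Delta = 1/u
print(laplace, sumudu)
```

The outputs agree with the classical tables: the Laplace transform of sin t is 1/(s² + 1) and its Sumudu transform is u/(u² + 1).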
Beppo Levi's theorem is a special form of Lemma 1; it states the same conclusion for a nondecreasing sequence (g_n). The details can be found on page 71 of [16]. Note that the convolution of f and g is given by
$$(f * g)(t) = \int_0^t f(\tau)\, g(t - \tau)\,d\tau.$$
The following theorem collects the basic properties of the transform. Since the proofs are not difficult, we would like to cover just a few of them.
(2) (Shifting theorem) If f(t) has the transform F(u), then e^{at} f(t) has the transform Φ(u) · F(∆ − a). Moreover, if f(t) has the transform F(u), then the shifted function f(t − a)h(t − a) has the transform e^{−a∆} · Φ(u)F(∆), where h(t − a) is the Heaviside function (we write h since u is needed to denote u-space).
(3) (Linearity) Let V( f ) be the variant of Laplace-type transform. Then V( f ) is a linear operation.
(4) (Existence) If f(t) is defined and piecewise continuous on every finite interval on the semi-axis t ≥ 0 and satisfies |f(t)| ≤ Me^{kt} for all t ≥ 0 and some constants M and k, then the variant of Laplace-type transform V(f) exists for all ∆ > k.
(5) (Uniqueness) If the variant of Laplace-type transform of a given function exists, then it is uniquely determined.
where h is the Heaviside function.
(7) (Dirac's delta function) We consider the impulse function f_k. In a similar way as for the Heaviside function, we take the integral of Laplace-type transform; denoting the limit of f_k as δ(t − a), we obtain the transform of the delta function. (8) (Shifted data problems) For a given differential equation y'' + ay' + by = r(t) subject to y(t_0) = c_0 and y'(t_0) = c_1, where t_0 ≠ 0 and a and b are constants, we can set t = t_1 + t_0. Then t = t_0 gives t_1 = 0, and so we have y_1'' + ay_1' + by_1 = r(t_1 + t_0), y_1(0) = c_0, y_1'(0) = c_1 for the input r(t). Taking the variant transform, we can obtain the output y(t).
(9) (Transforms of derivatives and integrals) Let a function f be n-times differentiable and integrable, and let us consider ∆ as an operator. Then V applied to the n-th derivative of f(t) satisfies the corresponding operational formula. (10) (Convolution) If two functions f and g are integrable and * is the convolution, then V(f * g) = Φ(u)F(∆)G(∆), where vi(f) = F(∆) and vi(g) = G(∆).

Proof. (5) Suppose, on the contrary, that two distinct functions f_1 and f_2 have the same transform (i.e., V(f_1) = V(f_2)). Then Φ(u) ∫_0^∞ e^{−t∆}(f_1(t) − f_2(t)) dt = 0, and so f_1 = f_2 a.e. Hence f_1 = f_2 except on a set of measure zero, a contradiction, and the transform is uniquely determined. (9) Note that vi(f) = ∫_0^∞ e^{−t∆} f(t) dt, and let us approach the proof by induction. In the case n = 1, integrating by parts, we have
$$vi(f') = \left[e^{-t\Delta} f(t)\right]_0^\infty + \Delta \int_0^\infty e^{-t\Delta} f(t)\,dt = \Delta \cdot vi(f) - f(0),$$
which is consistent with (2). Next, let us suppose that the statement is valid for n = m; the case n = m + 1 then follows by one further integration by parts.
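The base case of the induction, vi(f') = ∆ · vi(f) − f(0), can be confirmed symbolically; the sample function e^{−t} cos t is an arbitrary choice with f(0) = 1.

```python
import sympy as sp

t, Delta = sp.symbols("t Delta", positive=True)

def vi(f):
    # vi(f) = integral_0^oo exp(-t*Delta) * f(t) dt
    return sp.integrate(sp.exp(-t * Delta) * f, (t, 0, sp.oo))

f = sp.exp(-t) * sp.cos(t)   # sample function with f(0) = 1
lhs = vi(sp.diff(f, t))      # transform of the derivative
rhs = Delta * vi(f) - f.subs(t, 0)
print(sp.simplify(lhs - rhs))  # 0, confirming vi(f') = Delta*vi(f) - f(0)
```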
As a direct result of (9), initial value problems can be solved operationally. For example, we consider y'' − y = t subject to y(0) = 1 and y'(0) = 1. Taking the integral of Laplace-type transform on both sides, we have ∆²Y − ∆y(0) − y'(0) − Y = 1/∆² for Y = vi(y), so that (∆² − 1)Y = ∆ + 1 + 1/∆². From the relation V(f) = Φ(u) · F*(∆), we have the solution y(t) = e^t + sinh t − t, which involves the hyperbolic sine function.
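Working this example out, the transform method gives y = e^t + sinh t − t for y'' − y = t with y(0) = y'(0) = 1. As an independent check, the snippet below lets sympy's `dsolve` solve the initial value problem directly and compares the two answers.

```python
import sympy as sp

t = sp.symbols("t")
y = sp.Function("y")

ode = sp.Eq(y(t).diff(t, 2) - y(t), t)   # y'' - y = t
sol = sp.dsolve(ode, y(t), ics={y(0): 1, y(t).diff(t).subs(t, 0): 1})

candidate = sp.exp(t) + sp.sinh(t) - t   # solution obtained via the transform
print(sp.simplify(sol.rhs - candidate))  # 0: the two solutions agree
```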

Solution.
(1) Since this equation is y + y * t = 1, taking the integral of Laplace-type transform on both sides gives Y + Y/∆² = 1/∆ for Y = vi(y), so Y = ∆/(∆² + 1), and we obtain the solution y = cos t.
Let us check this by expansion. Differentiating the equation twice, we get y''(t) + y(t) = 0. Since ∫_a^a f = 0, we get y(0) = 1 and y'(0) = 0. Thus, we obtain y = cos t.
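The claimed solution y = cos t of y + y * t = 1 can also be verified by computing the convolution integral directly:

```python
import sympy as sp

t, tau = sp.symbols("t tau", nonnegative=True)
y = sp.cos(t)

# (y * t)(t) = integral_0^t y(tau) * (t - tau) d tau
conv = sp.integrate(sp.cos(tau) * (t - tau), (tau, 0, t))
print(sp.simplify(y + conv))  # 1, so cos t satisfies y + y*t = 1
```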
(2) This is rewritten as a convolution y(t) − y * sin t = t.
Taking the integral of Laplace-type transform, we have Y − Y/(∆² + 1) = 1/∆² for Y = vi(y). The solution of this algebraic equation is Y = 1/∆² + 1/∆⁴, which gives the answer y = t + t³/6. (3) Note that the equation is the same as y − (1 + t) * y = 1 − sinh t. Taking the transform, we get Y − Y(1/∆ + 1/∆²) = 1/∆ − 1/(∆² − 1), and hence Y(∆² − ∆ − 1)/∆² = (∆² − ∆ − 1)/(∆(∆² − 1)). Simplification gives Y = ∆/(∆² − 1), and so we obtain the answer y = cosh t. Let us turn the topic to the initial value problem of the convolution. Taking the transform of such an initial value problem yields an algebraic equation in Y(∆) = vi(y) and F(∆) = vi(f), and simplification gives the solution. Proof. This is an immediate consequence of V(f) = Φ(u) · F*(∆) and V(f) = Φ(u) · vi(f). For this reason, detailed proofs are omitted.
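Working the transform algebra out, part (2) has the solution y = t + t³/6; this can be verified by evaluating the convolution y * sin t directly:

```python
import sympy as sp

t, tau = sp.symbols("t tau", nonnegative=True)
y = t + t**3 / 6   # candidate solution from the transform algebra

# (y * sin)(t) = integral_0^t y(tau) * sin(t - tau) d tau
conv = sp.integrate((tau + tau**3 / 6) * sp.sin(t - tau), (tau, 0, t))
print(sp.simplify(y - conv - t))  # 0, so y - y*sin t = t holds
```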
The statements below are the immediate results of Theorem 2.
(1) V(t f(t)) = −Φ(u)F′(∆). Let us check examples for the temperature in an infinite bar and the displacement in a semi-infinite string by the variant of Laplace-type transform.

Example 4.
(Semi-infinite string) Find the displacement w(x, t) of an elastic string subject to the following conditions [3].
(a) The string is initially at rest on the x-axis from x = 0 to ∞. (b) For t > 0, the left end of the string is moved in a given fashion, namely, according to a single sine wave. Then the displacement w is expressed in terms of the Heaviside function h.
The proof is simple, and the interchangeability of the limit and the integral in the proof process is guaranteed by the Lebesgue dominated convergence theorem.

Solution.
Taking the integral of Laplace-type transform on both sides of w_t = c²w_xx and organizing the resulting equality, we obtain an ordinary differential equation in x whose Wronskian is W = 2√∆/c. The condition lim_{x→∞} f(x) = 0 gives lim_{x→∞} F(x, u) = 0, and hence B(u) = 0. Thus, from (4), together with vi(1) = 1/∆, and taking the inverse transform, we obtain the temperature w(x, t) involving the term 4c²t + k_0 on |x| < 1, where * is the convolution. In the case |x| > 1, we have the corresponding solution. In the above equality, we note that vi(f * g) = F(∆)G(∆) for vi(f) = F(∆) and vi(g) = G(∆).

Conclusions
In this study, the concept of convolution in convolutional neural networks (CNNs) was presented mathematically and connected with the concept of convolution in mathematics. As a continuous form of the convolution in a CNN, a new form of Laplace-type transform has been proposed. In the future, we will study how the convolution in a CNN changes as the stride changes. In addition, we shall explore the possibility of applying our newly defined Laplace-type transform to obtain new and interesting results involving generalized hypergeometric functions, results that would unify and generalize those available in the literature and may be potentially useful from an applications point of view.