Abstract
In this paper, we consider the matrix expression of convolution and its generalized continuous form. The matrix expression of convolution is applied effectively in convolutional neural networks, and in this study we relate the concept of convolution in mathematics to that in convolutional neural networks. Convolution is a core process of deep learning, the learning method of deep neural networks. In addition, the generalized continuous form of convolution is expressed as a new variant of the Laplace-type transform that encompasses almost all existing integral transforms. Finally, we describe the theoretical content in as much detail as possible so that the paper may be self-contained.
1. Introduction
Deep learning means the learning of deep neural networks, which are called deep when they contain multiple hidden layers. Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction [1]. The convolution in a convolutional neural network (CNN) is the tool for obtaining a feature map from the original image data: it sweeps the original image with a kernel matrix and transforms the original data into a different shape. This transformed image is called a feature map. Therefore, in a CNN, the convolution can be regarded as a tool that creates a feature map from the original image. Herein, the concept of convolution in artificial intelligence is described mathematically.
The core concept of a CNN is the convolution, which applies weights to the receptive fields only and transforms the original data into a feature map. This principle is similar to that of an integral transform: an integral transform maps a problem from its original domain to another domain in which it can be solved more easily. Since the matrix expression of convolution is an essential concept in artificial intelligence, we believe that this study is meaningful. In addition, the generalized continuous form of convolution has also been studied, and this form is expressed as a new variant of the Laplace-type transform.
Transform theory is extensively utilized in fields involving medical diagnostic equipment, such as magnetic resonance imaging and computed tomography. Typically, projection data are obtained by an integral transform, and an image is produced using an inverse transform. Although many plausible integral transforms exist, no single one of them is fully comprehensive, and almost all of them can be interpreted as Laplace-type transforms. One of us proposed a comprehensive form of the Laplace-type integral transform in [2]. The present study investigates the matrix expression of convolution and its generalized continuous form.
In [2], a Laplace-type integral transform was proposed, expressed as
$$u^{\alpha}\int_0^{\infty} f(t)\,e^{-t/u}\,dt, \qquad u\in(0,\infty). \tag{1}$$
For values of $\alpha$ equal to $0$, $-1$, $1$, and $-2$, we have, respectively, the Laplace [3], Sumudu [4], Elzaki [5], and Mohand [6] transforms. This form can be expressed in various manners. Replacing $t$ by $ut$, we have
$$u^{\alpha+1}\int_0^{\infty} f(ut)\,e^{-t}\,dt,$$
where the exponent becomes $\alpha+1$. In this form, values of the exponent equal to $1$, $0$, $2$, and $-1$ correspond to the Laplace, Sumudu, Elzaki, and Mohand transforms, respectively. If we substitute $u=1/s$ in (1), we then obtain the simplest form of the generalized integral transform as follows:
$$s^{-\alpha}\int_0^{\infty} f(t)\,e^{-st}\,dt,$$
where $s=1/u$. In this form, the Laplace, Sumudu, Elzaki, and Mohand transforms have values of the exponent $-\alpha$ equal to $0$, $1$, $-1$, and $2$, respectively. It is somewhat roundabout, but essentially a simple way to derive the Sumudu transform is to multiply the Laplace transform by $s$. Similarly, the Elzaki transform can be obtained by multiplying by $s^{-1}$, and the Mohand transform by multiplying by $s^{2}$. The natural transform [7] can be obtained by substituting $f(t)$ with $f(ut)$ in the Laplace transform. Additionally, by substituting $x=e^{-t}$, the Laplace-type transform can be expressed as
$$s^{-\alpha}\int_0^{1} f(-\ln x)\,x^{s-1}\,dx.$$
As a similar form, there is the Mellin transform [8] of the form
$$\int_0^{\infty} f(x)\,x^{s-1}\,dx.$$
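To make these relations concrete, the following is a minimal numerical sketch assuming only the standard definitions of the Laplace, Sumudu, Elzaki, and Mohand transforms recalled above; the test function f(t) = t, the evaluation point s = 2.5, and the helper name `laplace` are illustrative choices and do not come from [2].

```python
import numpy as np
from scipy.integrate import quad

def laplace(f, s):
    """Numerical Laplace transform: int_0^inf e^(-s t) f(t) dt."""
    value, _ = quad(lambda t: np.exp(-s * t) * f(t), 0, np.inf)
    return value

f = lambda t: t      # test function with known Laplace transform 1/s^2
s = 2.5              # illustrative evaluation point

L = laplace(f, s)
# Each classical transform is the Laplace transform times a power of s
# (exponents 0, 1, -1, 2 in the "simplest form" above).
print(L,        1 / s**2)   # Laplace transform of t:  1/s^2
print(s * L,    1 / s)      # Sumudu transform of t:   1/s
print(L / s,    1 / s**3)   # Elzaki transform of t:   1/s^3
print(s**2 * L, 1.0)        # Mohand transform of t:   1
```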
As shown above, many integral transforms wear their own distinctive masks, but most of them can essentially be interpreted as Laplace-type transforms. From a different point of view, a slight change in the kernel results in a significant difference in integral transform theory. Meanwhile, other plausible transforms exist, such as the Fourier, Radon, and Mellin transforms. Typically, if the interval of integration and the power in the kernel are different, the result can be interpreted as a completely different transform. Studies using the Laplace transform were conducted in [9,10]: generalized solutions of the third-order Cauchy–Euler equation in the space of right-sided distributions were found in [9], the solution of the heat equation without boundary conditions was studied in [10], and further properties of the Laplace-type transform were investigated in [11]. As an application, a new class of Laplace-type integrals involving generalized hypergeometric functions has been studied [12,13]. As for research related to integral equations, Noeiaghdam et al. [14] presented a new scheme based on stochastic arithmetic, designed to guarantee the validity and accuracy of the homotopy analysis method; different kinds of integral equations, such as singular and first-kind equations, are considered there to find optimal results by applying the proposed algorithms.
The main objective of this study is to investigate the matrix expression of convolution and its generalized continuous form. The generalized continuous form of the matrix expression is obtained as a new variant of the Laplace-type transform. The obtained results are as follows:
- (1)
- If the matrix representing the function (image) f is A and the matrix representing the function g is B, then the convolution is represented by the sum of all elements of $A \circ B$, and this is the same as $\mathrm{tr}(AB^{T})$, where ∘ is the array multiplication, $T$ is the transpose, and $\mathrm{tr}$ is the trace. Thus, the convolution in artificial intelligence (AI) is the same as $\mathrm{tr}(AB^{T})$.
- (2)
- The generalized continuous form of the convolution in AI can be represented as a variant of the Laplace-type transform involving an arbitrary bounded function (see Section 3).
2. Matrix Expression of Convolution in Convolutional Neural Network (CNN)
Note that functions can be interpreted as images in artificial intelligence (AI). By discretization, the convolution is changed from the continuous form
$$(f*g)(t)=\int f(\tau)\,g(t-\tau)\,d\tau$$
to the discrete form
$$(f*g)[n]=\sum_{m} f[m]\,g[n-m].$$
The convolution in a CNN is the tool for obtaining a feature map from the original image data; it plays the role of sweeping the original image with a kernel matrix (or filter) and transforms the original data into a different shape. In order to calculate the convolution, each part of the original matrix is element-wise multiplied by the kernel matrix and all of the resulting components are added. Typically, a small square kernel matrix is used. On the other hand, the pooling (or sub-sampling) is a simple job of reducing the size of the image made by the convolution; it is analogous to the principle that a picture appears sharper when it is reduced in size.
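As an illustration of this sweeping procedure, here is a minimal sketch in Python; the 4×4 image, the 2×2 kernel, the stride of 1, and the function name `feature_map` are illustrative assumptions rather than the data of the examples below.

```python
import numpy as np

def feature_map(image, kernel, stride=1):
    """Sweep the kernel over the image: at each position, element-wise
    multiply the covered patch by the kernel and add all components."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out_h = (ih - kh) // stride + 1
    out_w = (iw - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)   # array multiplication, then sum
    return out

# Illustrative 4x4 "image" and 2x2 kernel (not the matrices of Example 1).
image = np.array([[1., 0., 1., 0.],
                  [0., 1., 0., 1.],
                  [1., 1., 0., 0.],
                  [0., 0., 1., 1.]])
kernel = np.array([[1., 0.],
                   [0., 1.]])

print(feature_map(image, kernel))   # 3x3 convolved feature map
```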
Let the matrix representing the function f be $A=(a_{ij})$ and the matrix representing the function g be $B=(b_{ij})$. For two matrices A and B of the same dimension, the array multiplication (or sweeping) is given by
$$(A \circ B)_{ij} = a_{ij}\, b_{ij}.$$
For example, the array multiplication for two 2 × 2 matrices is
$$\begin{pmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{pmatrix} \circ \begin{pmatrix} b_{11} & b_{12}\\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11}b_{11} & a_{12}b_{12}\\ a_{21}b_{21} & a_{22}b_{22} \end{pmatrix}.$$
The array multiplication appears in lossy compression, such as the Joint Photographic Experts Group (JPEG) format, and in the decoding step. Let us look at an example.
Example 1.
In the classification field of AI, the pixels are treated as a matrix. When the original image is
array multiplying the kernel matrix
on the first matrix, we obtain the matrix
Now, adding all of its components, we obtain . Next, if we array-multiply the kernel matrix to the matrix
on the right and add all the components, we get by stride . If we continue this process to the final matrix
we get . Consequently, the original matrix changes to
by using the convolution kernel. This is called the convolved feature map.
This is just an example for understanding; in a perceptron, the output takes a value between 0 and 1 through the activation function. Note that the perceptron is an artificial network designed to mimic the brain's cognitive abilities. Therefore, the output of a neuron (or node) Y can be represented as
$$Y = X\Big(\sum_{i} w_i x_i - \theta\Big),$$
where $w$ is a weight, $\theta$ is the threshold value, and $X$ is the activation function. In the backpropagation algorithm of a deep neural network, the sigmoid function
$$X(x) = \frac{1}{1+e^{-x}}$$
is used as the activation function [15]. This function is easy to differentiate and ensures that the neuron output lies in $(0,1)$. If max-pooling is applied to the above convolved feature map, the resulting matrix becomes a smaller matrix.
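The following sketch illustrates the sigmoid activation and max-pooling just described; the 4×4 feature map, the 2×2 pooling window, and the function names are illustrative assumptions, not the values of Example 1.

```python
import numpy as np

def sigmoid(x):
    """Sigmoid activation; every output lies strictly between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-x))

def max_pool(feature, size=2, stride=2):
    """Max-pooling: keep only the largest value in each size-by-size window."""
    h, w = feature.shape
    out = np.zeros((h // stride, w // stride))
    for i in range(0, h - size + 1, stride):
        for j in range(0, w - size + 1, stride):
            out[i // stride, j // stride] = feature[i:i + size, j:j + size].max()
    return out

# Illustrative convolved feature map (not the one of Example 1).
feature = np.array([[1., 3., 2., 0.],
                    [5., 4., 1., 2.],
                    [0., 2., 6., 1.],
                    [3., 1., 2., 4.]])

print(sigmoid(feature))    # each entry squashed into (0, 1)
print(max_pool(feature))   # 2x2 pooled map: [[5., 2.], [3., 6.]]
```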
As discussed above, convolution in AI can be obtained by array multiplication. We would like to associate this definition with matrix multiplication in mathematics.
Definition 1.
(Convolution in AI) If the matrix representing the function (image) f is A and the matrix representing the function g is B, then the convolution is represented by the sum of all elements of $A \circ B$, and this is the same as $\mathrm{tr}(AB^{T})$, where ∘ is the array multiplication, $T$ is the transpose, and $\mathrm{tr}$ is the trace. Thus, the convolution in AI is the same as $\mathrm{tr}(AB^{T})$, the sum of all elements on the main diagonal (running down to the right) of $AB^{T}$.
Typically, the convolution kernel is a somewhat larger square matrix, but for easy understanding, let us consider a small matrix.
Example 2.
If
and
then the convolution in AI is calculated as by the sweeping. On the other hand,
and
where $T$ is the transpose and $\mathrm{tr}$ is the trace. This is the same result as the convolution in AI.
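A quick numerical check of the identity in Definition 1, namely that the sum of all elements of $A\circ B$ equals $\mathrm{tr}(AB^{T})$; the 2×2 matrices below are illustrative, since the entries used in Example 2 are not reproduced here.

```python
import numpy as np

# Illustrative matrices (the entries of Example 2 are not reproduced in the text).
A = np.array([[1., 2.],
              [3., 4.]])
B = np.array([[5., 6.],
              [7., 8.]])

sweep = np.sum(A * B)        # sum of all elements of the array product A o B
trace = np.trace(A @ B.T)    # tr(A B^T)

print(sweep, trace)          # both equal 70.0
```

The same check works for matrices of any common dimension, since both expressions equal the sum of the products of corresponding entries.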
3. Generalized Continuous Form of Matrix Expression of Convolution
If the matrix representing a function f is A and the matrix representing a function g is B, then the convolution of the functions f and g can be denoted by $\mathrm{tr}(AB^{T})$. Intuitively, the diagonal part of $AB^{T}$ corresponds to a graph, and the overlapping part of the graph can be interpreted as the concept of intersection, that is, the concept of multiplication. Thus, the generalized continuous form of the convolution in AI can be represented as a variant of the Laplace-type transform given by
If is a function defined for all , an integral of Laplace-type transform is given by
for with
Additionally, let be an arbitrary bounded function and let be a variant of Laplace-type transform of . If is a function defined for all , is defined by
for with .
Based on the above two definitions, it is clear that the above variant of Laplace-type transform is represented as for an arbitrary function . If so, let us see the relation with other integral transforms. Since
if and , then it corresponds to the transform. When we take , , and , we get the Laplace transform. Similarly, when we take , , we get the Sumudu transform (Elzaki transform), respectively. In order to obtain a simple form of generalization, it is better to set to for an arbitrary integer . However, it is judged that is better than as a suitable generalization, where is a bounded arbitrary function. The reason is that can express more integral transforms.
Lemma 1.
(Lebesgue dominated convergence theorem [16,17]). Let $(X,\mathcal{A},\mu)$ be a measure space and suppose $\{f_n\}$ is a sequence of extended real-valued measurable functions defined on $X$ such that
(a) $f(x)=\lim_{n\to\infty}f_n(x)$ exists $\mu$-a.e.;
(b) there is an integrable function $g$ so that $|f_n|\le g$ for each $n$, $\mu$-a.e.
Then, $f$ is integrable and
$$\lim_{n\to\infty}\int_X f_n\,d\mu=\int_X f\,d\mu.$$
Beppo Levi's theorem is a special form of Lemma 1. Its contents are as follows:
$$\lim_{n\to\infty}\int_X f_n\,d\mu=\int_X \lim_{n\to\infty} f_n\,d\mu$$
for $\{f_n\}$ a nondecreasing sequence of nonnegative measurable functions. The details can be found on page 71 in [16]. Note that the convolution of f and g is given by
$$(f * g)(t)=\int_0^{t} f(\tau)\,g(t-\tau)\,d\tau.$$
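The sketch below illustrates this convolution numerically and checks the classical convolution property of the Laplace transform (the analogue for the variant is stated as item (10) of Theorem 1); the functions f(t) = sin t, g(t) = e^(-t), the point s = 3, and the finite upper limit of integration are illustrative assumptions.

```python
import numpy as np
from scipy.integrate import quad

def convolve(f, g, t):
    """(f * g)(t) = int_0^t f(tau) g(t - tau) dtau."""
    value, _ = quad(lambda tau: f(tau) * g(t - tau), 0, t)
    return value

def laplace(f, s, upper=40.0):
    """Laplace transform, truncated at a finite upper limit since e^(-s t) decays fast."""
    value, _ = quad(lambda t: np.exp(-s * t) * f(t), 0, upper)
    return value

f = lambda t: np.sin(t)    # illustrative functions
g = lambda t: np.exp(-t)

s = 3.0
lhs = laplace(lambda t: convolve(f, g, t), s)   # L{f * g}(s)
rhs = laplace(f, s) * laplace(g, s)             # L{f}(s) L{g}(s) = 1/((s^2+1)(s+1))
print(lhs, rhs)   # both approximately 0.025
```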
The following theorem collects basic properties of the transform. Since the proofs are not difficult, we cover just a few of them.
Theorem 1.
- (1)
- (Duality with Laplace transform) If is the Laplace transform of a function , then it satisfies the relation of .
- (2)
- (Shifting theorem) If has the transform , then has the transform . That is,Moreover, If has the transform , then the shifted function has the transform . In formula,for is Heaviside function (We write h since we need u to denote u-space).
- (3)
- (Linearity) Let be the variant of Laplace-type transform. Then is a linear operation.
- (4)
- (Existence) If is defined, piecewise continuous on every finite interval on the semi-axis and satisfiesfor all and some constants M and k, then the variant of Laplace-type transform exists for all .
- (5)
- (Uniqueness) If the variant of Laplace-type transform of a given function exists, then it is uniquely determined.
- (6)
- (Heaviside function)where h is Heaviside function.
- (7)
- (Dirac’s delta function) We consider the functionIn a similar way to Heaviside, taking the integral of Laplace-type transform, we getIf we denote the limit of as , then
- (8)
- (Shifted data problems) For a given differential equation subject to and , where and a and b are constant, we can set . Then gives and so, we havefor input . Taking the variant, we can obtain the output .
- (9)
- (Transforms of derivatives and integrals) Let a function f be n-times differentiable and integrable, and let us consider the fraction Δ as an operator. Then the transform of the n-th derivative of f satisfies and
- (10)
- (Convolution) If two functions f and g are integrable for * is the convolution, then satisfiesfor .
Proof.
(5) Assume that exists by and both. If for , then
This is a contradiction on , and hence the transform is uniquely determined. Conversely, if two functions and have the same transform (i.e., if ), then
and so the difference vanishes a.e. Hence the two functions coincide except on a set of measure zero.
(9) Note that , and let us approach the proof by induction. In case of ,
Integrating by parts, we have
which is true by (2).
Next, let us suppose that is valid for some m. Thus,
holds for is the m-th derivative of f. Let us show that
Now we start with the left-hand side of (2).
Therefore, this theorem is valid for an arbitrary natural number n. Putting ,
follows. □
As direct results of (9), and follow.
For example, we consider subject to and . Taking the integral of Laplace-type transform on both sides, we have
for . Organizing this equation, we get . Simplification gives
From the relation of , we have the solution
where h is hyperbolic function.
Example 3.
(Integral equations of Volterra type) Find the solution of
Solution.
- (1)
- Since this equation is , taking the integral of Laplace-type transform on both sides, we havefor . Thusand so, we obtain the solution .Let us do the check by expansion. Expanding, we get . Since , we get and . Thus, we obtain .
- (2)
- This is rewritten as a convolutionTaking the integral of Laplace-type transform, we havefor . The solution isand gives the answer
- (3)
- Note that the equation is the same as . Taking the transform, we getand henceSimplification givesand so, we obtain the answerby the relation of .
Let us turn to an initial value problem involving the convolution. The initial value problem
gives
where and . Simplification gives
If we put the system function , then
Since for , taking the inverse transform, we have
Theorem 2.
(Differentiation and integration of transforms) Let us put and . Then
Proof.
This is an immediate consequence of and . For this reason, detailed proofs are omitted.
The statements below are the immediate results of Theorem 2.
Let us check examples for temperature in an infinite bar and displacement in a semi-infinite string by the variant of Laplace-type transform. □
Example 4.
(Semi-infinite string) Find the displacement of an elastic string subject to the following conditions [3].
- (a)
- The string is initially at rest on the x-axis from $x=0$ to ∞.
- (b)
- For $t>0$, the left end of the string is moved in a given fashion, namely, according to a single sine wave
- (c)
- Furthermore, as for .
Then the displacement w is
where h is Heaviside function.
The proof is simple; the interchange of limit and integral in the proof process is justified by the Lebesgue dominated convergence theorem.
Example 5.
(Temperature in an infinite bar) Find the temperature w in an infinite bar if the initial temperature is
with .
Solution. Taking the integral of Laplace-type transform on both sides of , we have
for . Organizing the equality, we get
Organizing this equality, we get
where the Wronskian The value gives , and hence . Thus, from , we get
By the direct calculation, we have
From the formula of for and , we know
and . Taking the inverse transform, we obtain the temperature as follows:
on , and * is the convolution. In case of , we have the solution
In the above equality, we note that
because for .
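As a numerical companion to Example 5, the following sketch evaluates the classical heat-kernel convolution for an infinite bar and checks the heat equation by finite differences; the step initial temperature, the parameters c = 1, x = 0.3, t = 0.5, and the step size h are illustrative assumptions, since the specific data of Example 5 are not reproduced in the text.

```python
import numpy as np
from scipy.integrate import quad

def temperature(x, t, c=1.0):
    """Classical solution of w_t = c^2 w_xx on the infinite bar:
    w(x, t) = (1 / (2 c sqrt(pi t))) * int f(v) exp(-(x - v)^2 / (4 c^2 t)) dv,
    here for the step initial temperature f = 1 on (-1, 1) and 0 otherwise."""
    kernel = lambda v: np.exp(-(x - v) ** 2 / (4 * c ** 2 * t))
    value, _ = quad(kernel, -1.0, 1.0)          # f vanishes outside (-1, 1)
    return value / (2 * c * np.sqrt(np.pi * t))

x, t, h = 0.3, 0.5, 1e-2
w = temperature(x, t)

# Finite-difference check that w_t is close to w_xx at (x, t), i.e., the heat equation holds.
w_t = (temperature(x, t + h) - temperature(x, t - h)) / (2 * h)
w_xx = (temperature(x + h, t) - 2 * w + temperature(x - h, t)) / h ** 2
print(w, w_t, w_xx)   # w_t and w_xx agree up to discretization error
```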
4. Conclusions
In this study, the concept of convolution in convolutional neural networks (CNNs) was presented mathematically and connected with the concept of convolution in mathematics. As a continuous form of the convolution in CNNs, a new variant of the Laplace-type transform was proposed. In the future, we will study how the convolution in a CNN changes when the stride is changed. In addition, we shall also explore the possibility of applying our newly defined Laplace-type transform to obtain new and interesting results involving generalized hypergeometric functions, which would unify and generalize the results available in the literature and may be potentially useful from an applications point of view.
Author Contributions
Conceptualization, A.K.R.; validation, Y.H.G.; and writing, H.K. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Acknowledgments
The corresponding author (H.K.) acknowledges the support of the Kyungdong University Research Fund, 2021. The authors are also grateful to the anonymous referees whose valuable suggestions and comments significantly improved the quality of this paper.
Conflicts of Interest
The authors declare no conflict of interest.
References
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521. [Google Scholar] [CrossRef] [PubMed]
- Kim, H. The intrinsic structure and properties of Laplace-typed integral transforms. Math. Probl. Eng. 2017, 2017, 1–8. [Google Scholar] [CrossRef]
- Kreyszig, E. Advanced Engineering Mathematics; Wiley: Singapore, 2013. [Google Scholar]
- Watugala, G.K. Sumudu Transform: A new integral transform to solve differential equations and control engineering problems. Integr. Educ. 1993, 24, 35–43. [Google Scholar] [CrossRef]
- Elzaki, T.M.; Ezaki, S.M.; Hilal, E.M.A. ELzaki and Sumudu Transform for Solving some Differential Equations. Glob. J. Pure Appl. Math. 2012, 8, 167–173. [Google Scholar]
- Mohand, M.; Mahgoub, A. The New Integral Transform "Mohand Transform". Adv. Theor. Appl. Math. 2017, 12, 113–120. [Google Scholar]
- Belgacem, F.B.M.; Silambarasan, R. Theory of natural transform. Math. Eng. Sci. Aerosp. 2012, 3, 105–135. [Google Scholar]
- Bertrand, J.; Bertrand, P.; Ovarlez, J.P. The Mellin Transform, The Transforms and Applications; Poularkas, A.D., Ed.; CRC Press: Boca Raton, FL, USA, 1996. [Google Scholar]
- Jhanthanam, S.; Nonlaopon, K.; Orankitjaroen, S. Generalized Solutions of the Third-Order Cauchy-Euler Equation in the Space of Right-Sided Distributions via Laplace Transform. Mathematics 2019, 7, 376. [Google Scholar] [CrossRef]
- Kim, H. The solution of the heat equation without boundary conditions. Dyn. Syst. Appl. 2018, 27, 653–662. [Google Scholar]
- Supaknaree, S.; Nonlaopon, K.; Kim, H. Further properties of Laplace-type integral transforms. Dyn. Syst. Appl. 2019, 28, 195–215. [Google Scholar]
- Koepf, W.; Kim, I.; Rathie, A.K. On a New Class of Laplace-Type Integrals Involving Generalized Hypergeometric Functions. Axioms 2019, 8, 87. [Google Scholar] [CrossRef]
- Sung, T.; Kim, I.; Rathie, A.K. On a new class of Eulerian’s type integrals involving generalized hypergeometric functions. Aust. J. Math. Anal. Appl. 2019, 16, 1–15. [Google Scholar]
- Noeiaghdam, S.; Fariborzi Araghi, M.A.; Abbasbandy, S. Finding optimal convergence control parameter in the homotopy analysis method to solve integral equations based on the stochastic arithmetic. Numer. Algorithms 2019, 81. [Google Scholar] [CrossRef]
- Negnevitsky, M. Artificial Intelligence; Addison-Wesley: Essex, England, 2005. [Google Scholar]
- Cohn, D.L. Measure Theory; Birkhäuser: Boston, MA, USA, 1980. [Google Scholar]
- Jang, J.; Kim, H. An application of monotone convergence theorem in PDEs and Fourier analysis. Far East J. Math. Sci. 2015, 98, 665–669. [Google Scholar]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).