Computational Methods for Parameter Identification in 2D Fractional System with Riemann–Liouville Derivative

In recent years, many types of systems have been modeled using fractional derivatives. Thanks to derivatives of this type, it is possible to model certain phenomena more precisely and conveniently. This article presents a system consisting of a two-dimensional fractional differential equation with the Riemann–Liouville derivative, together with a numerical algorithm for its solution. The presented algorithm uses the alternating direction implicit method (ADIM). Further, the algorithm for solving the inverse problem, consisting of the determination of unknown parameters of the model, is also described. For this purpose, the objective function was minimized using the ant algorithm and the Hooke–Jeeves method. Inverse problems with fractional derivatives are important in many engineering applications, such as modeling the phenomenon of anomalous diffusion, designing electrical circuits with a supercapacitor, and applying fractional-order control theory. This paper presents a numerical example illustrating the effectiveness and accuracy of the described methods. The example also made it possible to compare the methods of searching for the minimum of the objective function. The presented algorithms can be used as a tool for parameter training in artificial neural networks.


Introduction
Fractional calculus is widely used in various fields of science and technology, e.g., in the design of sensors, in signal processing, and in sensor networks [1][2][3][4][5]. In the paper [2], the authors describe the use of fractional calculus in artificial neural networks. Fractional derivatives are mainly used for parameter training with optimization algorithms, system synchronization, and system stabilization. As the authors note, such systems have been used in unmanned aerial vehicles (UAVs), circuit realization, robotics, and many other engineering applications. The paper [3] covers applications of fractional calculus in the sensing and filtering domains. The authors present the most important achievements in the fields of fractional-order sensors and fractional-order analog and digital filters. In [5], the authors present a new fractional sensor based on a classical accelerometer and the concepts of fractional calculus. To achieve this, two synthesis methods were presented: in the first, the successive stages follow an identical analytical recursive formulation, and in the second, a PSO algorithm determines the fractional system elements numerically.
In addition to applications in electronics, neural networks, and sensors, fractional calculus is also used in the modeling of thermal processes [6,7], in the modeling of anomalous diffusion [8,9], in medicine [10], and in control theory [11,12]. The authors of the study [6] model heat transfer in a two-dimensional plate using the Caputo operator. The theoretical results are verified with experimental data from a thermal camera. It is shown that the fractional model is more accurate than the integer-order model in the sense of the mean square error cost function.
In applications of fractional calculus, differential equations with fractional derivatives often have to be solved numerically, which is why the development of algorithms for this type of problem is important. Many papers presenting numerical solutions of fractional partial differential equations have been published in recent years. In the paper [13], the author used an artificial neural network to construct a solution method for the one-phase Stefan problem. In turn, Ref. [14] presented an algorithm for the solution of fractional-order delay differential equations. Bu et al., in [15], presented a space-time finite element method for solving a two-dimensional diffusion equation. The paper describes a fully discrete scheme for the considered equation. The authors also presented theorems on the existence and stability of the method, together with error estimates and numerical examples. Another interesting study is [16], in which an ADI method for solving fractional reaction-diffusion equations with Dirichlet boundary conditions is described. The authors used a new fractional version of the alternating direction implicit method. A numerical example was also presented.
In this paper, the authors present a solution to an inverse problem consisting of the appropriate selection of the model input parameters in such a way that the system response fits the measurement data. Inverse problems are a very important part of all sorts of engineering problems [17]. In [18], an inverse problem is considered for a fractional partial differential equation with a nonlocal condition of integral type. The considered equation is a generalization of the Barenblatt-Zheltov-Kochina differential equation, which simulates the filtration of a viscoelastic fluid in fractured porous media. In [19], the authors considered two inverse problems with a fractional derivative. The first problem is to reconstruct the state function based on the knowledge of its value and the value of its derivative at the final moments of time. The second problem consists of recovering the source function in fractional diffusion and wave equations; the additional information consists of measurements in a neighborhood of the final time. The authors prove the uniqueness of the solutions to these problems. Finally, the authors derive explicit solutions for some particular cases. In the paper [20], a fractional heat conduction inverse problem is considered, consisting of finding the heat conductivity in the presented model. The authors also compare two optimization methods: an iteration method and a swarm algorithm.
The learning algorithm constitutes the main part of deep learning. The number of layers differentiates a deep neural network from a shallow one: the higher the number of layers, the deeper the network becomes. Each layer can be specialized to detect a specific aspect or feature. The goal of the learning algorithm is to find the optimal values of the weight vectors that solve a class of problems in a domain. Training algorithms aim to achieve this goal by reducing the cost function. While the weights are learned by training on the dataset, there are additional crucial parameters, referred to as hyperparameters, that are not directly learned from the training dataset. These hyperparameters can take a range of values and add to the complexity of finding the optimal architecture and model [21]. Deep learning can be optimized in different areas. The training algorithms can be fine-tuned at different levels by incorporating heuristics, e.g., for hyperparameter optimization. The time to train a deep learning network model is a major factor in gauging the performance of an algorithm or network, so the problem of training optimization in a deep learning application can be seen as the solution of an inverse problem. In fact, the inverse problem consists of selecting the appropriate model input parameters in order to obtain the desired data on the output. To solve the problem, we create an objective function that compares the desired values (targets) with the network outputs calculated for the determined values of the searched parameters (weights). By finding the minimum of the objective function, we find the sought weights.
In Section 2 of this paper, a system consisting of a 2D fractional partial differential diffusion equation with the Riemann-Liouville derivative is presented, supplemented by Dirichlet boundary conditions. This type of model can be used, e.g., in the design process for heat conduction in porous media. In Section 2.2, a numerical scheme for the considered equation is presented, based on the alternating direction implicit method (ADIM). In Section 3, the inverse problem is formulated; it consists of the identification of two parameters of the presented model based on measurements of the state function at selected points of the domain. The inverse problem has been reduced to solving an optimization problem. For this purpose, two algorithms were used and compared: the probabilistic ant colony optimization (ACO) algorithm and the deterministic Hooke-Jeeves (HJ) method. Section 4 presents a numerical example illustrating the operation of the described methods. Section 5 provides the conclusions.

Fractional Model
This section describes the considered anomalous diffusion model with a fractional derivative and then presents a numerical algorithm for solving the resulting differential equation.

Model Description
Models using fractional derivatives have recently been widely used in various engineering problems, e.g., in electronics for modeling supercapacitors, in mechanics for modeling heat flow in porous materials, in automation for describing problems in control theory, or in biology for modeling drug transport. In this study, we consider a model of anomalous diffusion given by Equation (1) with the conditions

u(x, y, t)|_∂Ω = 0, t ∈ (0, T], u(x, y, t)|_{t=0} = ϕ(x, y), (x, y) ∈ Ω. (2)

The differential Equation (1) describes the anomalous diffusion phenomenon (e.g., heat conduction in porous materials [22][23][24]) and is defined in the area Ω × (0, T], where (x, y) ∈ Ω; c, ρ, λ > 0 are parameters defining material properties; u is the state function; and f is an additional source term in the model. Using the terminology of heat conduction theory, c is the specific heat, ρ is the density, λ is the heat conduction coefficient, and the function f describes an additional heat source. All parameters are multiplied by constants of value one carrying the units that ensure dimensional consistency of the entire equation. The state function u describes the temperature distribution in time and space. Conditions (2) define the initial and boundary conditions necessary to solve the differential equation uniquely: the state function u vanishes on the boundary, and at the initial moment the values of u are given by the known function ϕ. Equation (1) also contains fractional derivatives of orders α and β. In the considered model, these derivatives are defined as Riemann-Liouville derivatives [25]. Formula (3) defines the left derivative, and Formula (4) defines the right derivative; in both cases, α ∈ (0, 1). In addition, the derivative of order β with respect to y in Equation (1) is defined analogously as a Riemann-Liouville derivative.

Numerical Solution of Direct Problem
Now, let us present the numerical solution of the model defined by Equations (1) and (2). If we have all the data of the model, i.e., the parameters c, ρ, λ, α, β, the initial and boundary conditions, and the geometry of the area, then by solving Equation (1) we solve the direct problem. In order to solve the problem under consideration, we first write Equation (1) in the form (5). Then, we discretize the area by creating a uniform mesh in each of the dimensions and adopt the notation u^k_{i,j} ≈ u(x_i, y_j, t_k), f^k_{i,j} = f(x_i, y_j, t_k), λ_{i,j} = λ(x_i, y_j). We approximate the Riemann-Liouville derivative with respect to x using the shifted Grünwald formula [26], in which the values of the state function are weighted by the coefficients g_{α,s} = (−1)^s C(α, s), where C(α, s) denotes the generalized binomial coefficient. Similarly, we approximate the fractional derivative with respect to the spatial variable y. In the case of the derivative over time, we use the difference quotient (u^{k+1}_{i,j} − u^k_{i,j})/∆t. Let λ^x_{i,j} denote the first-order derivative of the λ function with respect to the x variable at (x_i, y_j); we assume analogous symbols for the y variable. After using Formulas (6)-(10) and some transformations, the difference scheme for Equation (5) can be written in the form (11). In order to simplify the description of the numerical algorithm to be implemented, we present the difference scheme (11) in matrix form, introducing the matrices R_x(l). First, we create the block matrix S, which is a diagonal block matrix containing the matrices R_x(l), l = 1, 2, . . . , M_y − 1, on the main diagonal and zeros elsewhere.
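The shifted Grünwald approximation used above can be illustrated numerically. The following is a minimal sketch (not the paper's implementation): it computes the coefficients g_{α,s} by the standard recurrence and approximates the left Riemann-Liouville derivative of the test function u(x) = x², for which the closed form 2x^(2−α)/Γ(3 − α) is known.

```python
import math

def grunwald_coeffs(alpha, n):
    # g_{alpha,0} = 1;  g_{alpha,s} = g_{alpha,s-1} * (s - 1 - alpha) / s
    # (equivalent to (-1)^s * binom(alpha, s))
    g = [1.0]
    for s in range(1, n + 1):
        g.append(g[-1] * (s - 1 - alpha) / s)
    return g

def left_rl_derivative(u, h, alpha, i):
    # shifted Grunwald formula at node x_i:
    # D^alpha u(x_i) ~ h^(-alpha) * sum_{s=0}^{i+1} g_{alpha,s} * u[i - s + 1]
    g = grunwald_coeffs(alpha, i + 1)
    return sum(g[s] * u[i - s + 1] for s in range(i + 2)) / h ** alpha

# check against the exact left RL derivative of x^2: 2 x^(2-alpha) / Gamma(3-alpha)
alpha, N = 0.5, 2000
h = 1.0 / N
u = [(j * h) ** 2 for j in range(N + 1)]
i = N // 2                                   # evaluation point x = 0.5
approx = left_rl_derivative(u, h, alpha, i)
exact = 2 * 0.5 ** (2 - alpha) / math.gamma(3 - alpha)
```

Since the shifted Grünwald formula is first-order accurate in h, the approximation agrees with the closed form to within a fraction of a percent on this mesh.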
Second, we create the matrix H, which has the same dimension as the matrix S, in the form (15). Now it is possible to write the difference scheme (11) in the matrix form (16). The matrices in the difference scheme (16) are large, so the obtained system of equations is time-consuming to solve. Hence, we applied the alternating direction implicit method (ADIM) to the difference scheme (11), which significantly reduces the computation time (details can be found in [27]). This is an important issue in the case of inverse problems, where the direct problem must be solved many times. Let us write the scheme (11) in the form of the directional separation product (17). The numerical scheme (17) is split into two parts and solved, respectively, first in the direction x and afterwards in the direction y. With this approach, the resulting matrices of the systems of equations have significantly lower dimensions than in the case of the scheme (11). The numerical algorithm has two main steps:
• For each fixed y_j, solve the numerical scheme (18) in the direction x. As a consequence, we obtain a temporary solution ũ^{k+1}_{i,j}.
• Then, for each fixed x_i, solve the numerical scheme (19) in the direction y.
This process can be symbolically depicted as in Figure 1. For the boundary nodes and the initial condition, we applied the conditions (2). In the case of the ADIM, it is also possible to present the equations in matrix form, which is done below. First, for each l = 1, 2, . . . , M_x − 1, we define auxiliary vectors u*_l. Then, the numerical scheme (18) can be written in matrix form (for p = 1, 2, . . . , M_y − 1), where the temporary solution has the form ũ^k. We obtain M_y − 1 systems of equations, each of dimension (M_x − 1) × (M_x − 1). Next, we present the scheme (19) in the direction y in matrix form (for l = 1, 2, . . . , M_x − 1), where u^{k+1}_l = [u^{k+1}_{l,1}, u^{k+1}_{l,2}, . . . , u^{k+1}_{l,M_y−1}]^T and ũ^k_l = [ũ^k_{l,1}, ũ^k_{l,2}, . . . , ũ^k_{l,M_y−1}]^T.
At this stage of the algorithm, we solve M_x − 1 systems of equations with dimensions (M_y − 1) × (M_y − 1) each. The Bi-CGSTAB method [28,29] is used to solve the systems of equations, which has a significant influence on the computation time. More implementation details and a comparison of computation times for the described method can be found in the papers [27,30].
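The direction-splitting structure of the ADI algorithm can be illustrated on the classical (integer-order) 2D heat equation; this is only a structural sketch, not the paper's scheme, and in the fractional case the tridiagonal Laplacian blocks below would be replaced by dense Grünwald matrices. One Peaceman-Rachford ADI step solves implicitly in x with an explicit y part, then implicitly in y with an explicit x part:

```python
import numpy as np

def adi_heat_step(u, nu, dt, h):
    """One Peaceman-Rachford ADI step for u_t = nu*(u_xx + u_yy)
    with zero Dirichlet boundary conditions on a uniform grid."""
    m = u.shape[0] - 2                       # number of interior nodes
    r = nu * dt / (2.0 * h * h)
    # 1D second-difference (Laplacian) matrix on the interior nodes
    L = (np.diag(-2.0 * np.ones(m))
         + np.diag(np.ones(m - 1), 1)
         + np.diag(np.ones(m - 1), -1))
    A = np.eye(m) - r * L                    # implicit operator
    B = np.eye(m) + r * L                    # explicit operator
    inner = u[1:-1, 1:-1]
    # sweep 1: implicit in x (axis 0), explicit in y (axis 1)
    ustar = np.linalg.solve(A, inner @ B)
    # sweep 2: implicit in y, explicit in x
    unew = np.linalg.solve(A, (B @ ustar).T).T
    out = np.zeros_like(u)
    out[1:-1, 1:-1] = unew                   # zero Dirichlet boundary kept
    return out

# smooth initial bump on [0,1]^2; the solution should decay while staying bounded
n = 21
h = 1.0 / n
x = np.linspace(0.0, 1.0, n + 1)
u0 = np.outer(np.sin(np.pi * x), np.sin(np.pi * x))
u1 = adi_heat_step(u0, nu=1.0, dt=0.01, h=h)
```

Each sweep only requires the solution of small one-dimensional systems, which is the source of the computation-time savings described above; in the sketch these are solved with a dense solver, whereas the paper uses Bi-CGSTAB.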

Inverse Problem
In many engineering problems, in particular in various types of simulations and mathematical modeling, there is a need to solve an inverse problem. In our case, the inverse problem consists of selecting appropriate input parameters of the model (1) and (2) so as to obtain the desired data on the output. The values of the state function u at selected points of the domain (so-called measurement points) are treated as the input data of the inverse problem. The task consists of selecting the unknown parameters of the model in such a way that the function u assumes the given values at the measurement points. Problems of this type are ill-conditioned, which may result in instability or non-uniqueness of the solution [31,32]. Details of the solution algorithm are presented in the following sections.

Parameter Identification
In the model (1) and (2), the following data are assumed: the source function

f(x, y, t) = (3,000,000/1309) · 82,467 (x − 2)² x² (y − 1)² y³ cos(t/100),

where the remaining data are given by (24). To solve the problem, we create an objective function that compares the values of the function u calculated for the determined values of the sought parameters λ, α (at the measurement points) with the measurement data. Therefore, we define the objective function as

J(λ, α) = Σ_{(i,j)} Σ_{k} ( u^k_{i,j}(λ, α) − û^k_{i,j} )², (25)

where the outer sum runs over the N_1 measurement points (x_i, y_j) and the inner sum over the N_2 measurement times t_k; N_1 and N_2 are the number of measurement points and the number of measurements at a given measurement point, respectively. In the considered example, N_1 = 7, and N_2 depends on the mesh used. By u^k_{i,j}(λ, α), we denote the values of the function u obtained from the algorithm for the fixed parameters λ, α, and by û^k_{i,j} the measurement data. By finding the minimum of the objective function (25), we find the sought parameters.
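The objective function can be sketched as follows. Here `forward_solver` stands for the direct-problem solver of Section 2.2 and is a hypothetical placeholder; it is mocked below by a simple closed-form model for illustration only.

```python
import math

def objective(forward_solver, measurements, lam, alpha):
    """Sum of squared differences between the model response u(lam, alpha)
    at the measurement points/times and the measured values."""
    u = forward_solver(lam, alpha)          # dict keyed like `measurements`
    return sum((u[key] - m) ** 2 for key, m in measurements.items())

# illustration with a mock forward model u_k = lam * exp(-alpha * k);
# this stands in for the ADI solver, which is too long to repeat here
def mock_solver(lam, alpha):
    return {k: lam * math.exp(-alpha * k) for k in range(5)}

true_lam, true_alpha = 240.0, 0.8
measurements = mock_solver(true_lam, true_alpha)   # synthetic exact data
J_true = objective(mock_solver, measurements, true_lam, true_alpha)
J_off = objective(mock_solver, measurements, 200.0, 0.5)
```

By construction, the objective vanishes at the true parameters and grows away from them, which is the property the minimization algorithms of the next section exploit.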

Function Minimization
In the case of minimizing the objective function, we can use any heuristic algorithm (e.g., swarm algorithms). In this paper, we decided to use two algorithms:
• Ant colony optimization algorithm (ACO).
• Hooke-Jeeves algorithm (HJ).
In this section, we describe both algorithms.

Ant Colony Optimization Algorithm
The presented ACO algorithm is probabilistic, so each execution may return a different result. A proper selection of the algorithm parameters should, however, ensure that the obtained results are convergent. The algorithm is inspired by the behavior of an ant swarm in nature. More about the ACO algorithm and its applications can be found in the articles [33][34][35]. In order to describe the algorithm, we introduce the following notations:
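The sampling mechanism of such a continuous-domain ant colony algorithm can be sketched as follows. This is a minimal ACO_R-style implementation in the spirit of Socha and Dorigo, with rank-based Gaussian weights and distance-based standard deviations assumed here; the paper's exact parameterization may differ.

```python
import math
import random

def aco_r(f, bounds, L=10, M=10, I=100, q=0.1, xi=0.85, seed=1):
    """Minimize f over a box using an archive of L solutions (pheromone
    spots); in each of I iterations, M ants sample new points from
    Gaussians built on the archive."""
    rng = random.Random(seed)
    n = len(bounds)
    archive = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(L)]
    archive.sort(key=f)
    # rank-based weights omega_l and selection probabilities p_l
    omega = [math.exp(-l ** 2 / (2.0 * (q * L) ** 2)) / (q * L * math.sqrt(2 * math.pi))
             for l in range(L)]
    total = sum(omega)
    p = [w / total for w in omega]
    for _ in range(I):
        ants = []
        for _ in range(M):
            l = rng.choices(range(L), weights=p)[0]
            point = []
            for j in range(n):
                # std dev: mean distance of the chosen solution to the rest
                sigma = xi * sum(abs(a[j] - archive[l][j]) for a in archive) / (L - 1)
                point.append(rng.gauss(archive[l][j], sigma))
            ants.append(point)
        # pheromone update: keep the L best of old archive plus new samples
        archive = sorted(archive + ants, key=f)[:L]
    return archive[0]

best = aco_r(lambda v: (v[0] - 3.0) ** 2 + (v[1] + 1.0) ** 2,
             bounds=[(-10.0, 10.0), (-10.0, 10.0)])
```

On a smooth test function the archive contracts around the minimizer; box constraints are not enforced after sampling in this sketch, which is one place where a production implementation (or the paper's penalty function) would differ.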

Hooke-Jeeves Algorithm
The Hooke-Jeeves algorithm is a deterministic method of searching for the minimum of an objective function. It is based on two main operations:
• Exploratory move: testing the behavior of the objective function in a small selected area with the use of trial steps along all directions of the orthogonal base.
• Pattern move: moving in a strictly determined manner to the next area, where the next trial step is considered, but only if at least one of the performed steps was successful.
In this algorithm, we consider the following parameters: the starting point, the initial step length, the narrowing parameter β, and the stopping criterion ξ.

For comparison, recall the sampling rule of the ACO algorithm: each pheromone spot (solution vector) is assigned a probability p_l = ω_l / Σ_{s=1}^{L} ω_s, where the weights ω_l are related to the solution index l; then, for k = 1, 2, . . . , M, an ant randomly chooses the l-th solution with probability p_l and transforms each of the coordinates (j = 1, 2, . . . , n) of the selected solution using a Gauss function.

The Hooke-Jeeves method proceeds as follows:
1. Perform trial steps along all directions of the orthogonal base (an exploratory move).
2. If a better point is found, continue in that direction (a pattern move).
3. If no better point is found, narrow down the search space using the narrowing parameter β.
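The two moves can be sketched in a minimal implementation; the parameter names `beta` (narrowing factor) and `xi` (stopping criterion) follow the text, while the specific control flow below is one common variant of the method, not necessarily the paper's.

```python
def explore(f, x, step):
    # exploratory move: trial steps +/- step along each coordinate direction
    x, fx = list(x), f(list(x))
    for j in range(len(x)):
        for d in (step, -step):
            x[j] += d
            fn = f(x)
            if fn < fx:
                fx = fn
                break                     # keep the improving step
            x[j] -= d                     # revert the unsuccessful step
    return x, fx

def hooke_jeeves(f, x0, step=1.0, beta=0.5, xi=1e-8):
    base, fb = list(x0), f(list(x0))
    while step > xi:
        x, fx = explore(f, base, step)
        while fx < fb:
            # pattern move: extrapolate from the old base through the new point
            pattern = [2.0 * xn - xb for xn, xb in zip(x, base)]
            base, fb = x, fx
            x, fx = explore(f, pattern, step)
        step *= beta                      # no improvement: narrow the search
    return base, fb

best, fbest = hooke_jeeves(lambda v: (v[0] - 3.0) ** 2 + (v[1] + 1.0) ** 2,
                           [0.0, 0.0])
```

Being deterministic, repeated runs from the same starting point return the same result, which is the property exploited in the comparison of Section 4.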

Results-Numerical Examples
We consider the inverse problem described in Section 3.1. In the model (1) and (2), we set the data described by Equations (23) and (24). We used two different grids, 160 × 160 × 250 and 100 × 100 × 200, and different levels of disturbance of the measurement data (the input data of the inverse problem): 0%, 2%, 5%, 10%. The unknown data in the model are λ and α; these parameters need to be identified using the presented algorithm. To examine and test the algorithm, the exact values of these parameters are known: λ = 240, α = 0.8.
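The input-data disturbance and the reported relative errors can be reproduced schematically as follows. This is a sketch: the paper does not specify its noise generator, so a uniform multiplicative perturbation is assumed here.

```python
import random

def disturb(values, level, seed=0):
    # perturb each exact value by a pseudo-random relative error in [-level, level]
    rng = random.Random(seed)
    return [v * (1.0 + rng.uniform(-level, level)) for v in values]

def relative_error_percent(reconstructed, exact):
    return abs(reconstructed - exact) / abs(exact) * 100.0

noisy = disturb([240.0, 0.8], 0.02)     # 2% disturbance of the exact parameters
# e.g., a reconstructed value of about 240.068 corresponds to the
# 0.0283% relative error reported for lambda
err_lambda = relative_error_percent(240.068, 240.0)
```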
First, we present the results obtained using the ACO algorithm. We set the parameters L, M, and I of the ant algorithm; based on these parameters, we can determine the number of calls to the objective function, which in our example is M · I + L = 656. The obtained results are presented in Table 1. The best results were obtained for exact input data: for the 100 × 100 × 200 mesh, the relative errors of the reconstructed parameters λ and α are 0.0283% and 0.584%, respectively, and for the 160 × 160 × 250 mesh, these errors are equal to 0.151% and 0.687%. In the case of input data with a pseudo-random error, the obtained results are also very good, and the errors of the reconstructed parameters do not exceed the disturbance errors of the input data. In particular, the errors of reconstruction of the λ coefficient are very small and do not exceed 1% (except in the case of the input data disturbed by a 10% error on the 100 × 100 × 200 grid). The relative errors of the reconstructed α parameter are greater than the λ errors, most likely because the sought value is significantly lower than λ. Naturally, along with the increase in the input data disturbances, the values of the minimized objective function also increased. Except for a few cases, the mesh density did not significantly affect the results. Table 1. Results of calculations in the case of the ACO algorithm. λ: reconstructed value of the thermal conductivity coefficient; α: reconstructed value of the x-direction derivative order; δ: the relative error of reconstruction; J: the value of the objective function; σ: standard deviation of the objective function. Figure 3 shows how the value of the objective function changed with the iteration number for four input data cases. The figures do not include the objective function values for the initial iterations: these values were relatively high, and including them would reduce the legibility of the figures.
We can see that in the last few (2-5) iterations, the values of the objective function no longer change. The appropriate selection of the L, M, I parameters for the ACO algorithm affects the computation time and is not always a simple task. It depends on the complexity of the objective function and on the number of sought parameters (the size of the problem). In particular, a situation in which the algorithm does not change the solution over the next dozen or so iterations should be avoided. As we can observe in the presented example, the selection of the ACO parameters, such as the number of iterations and the size of the population, seems appropriate.

Next, we consider the Hooke-Jeeves algorithm. It is a deterministic algorithm, and the resulting solution, as well as the number of calls to the objective function, depends on the starting point and the stopping criterion ξ. In our example, we consider four different starting points: (100, 0.2), (300, 0.1), (450, 0.5), (500, 0.9). It turned out that, regardless of the selected starting point, the same solution was almost always obtained; however, whenever the value of any of the reconstructed parameters exceeded the predetermined limits, the so-called penalty function was applied. This was significant in the case of the (100, 0.2) starting point, for which the algorithm exceeded the limits and stopped at a local minimum; e.g., for the 160 × 160 × 250 grid and 0% disturbances, we obtained the results λ ≈ 250, α ≈ 1.8, J ≈ 138. Similar results were obtained for the remaining cases with the (100, 0.2) start. Table 2 shows the results obtained using the Hooke-Jeeves algorithm.
Comparing the results obtained from both algorithms, we can see that in most cases the errors of parameter reconstruction are smaller for the Hooke-Jeeves algorithm; e.g., for the 160 × 160 × 250 grid and the 2% input data disturbance, the errors of the sought parameters λ and α for the HJ algorithm were 0.0198% and 0.231%, respectively, while for the ACO algorithm these errors were 0.371% and 1.64%. In addition, the value of the objective function for the HJ algorithm was smaller: J_HJ ≈ 1014 versus J_ACO ≈ 1020. As mentioned earlier, without the penalty function, the HJ algorithm started from the (100, 0.2) point returned unsatisfactory results. This should be kept in mind when the objective function is complicated, for example, by increasing the number of parameters to be found. Now we present the errors of reconstruction of the state function u at the grid points; these results are summarized in Table 3. The mean errors of reconstruction of the state function u are at a low level and do not exceed 0.5% in any of the analyzed cases. We can also observe that the maximum errors are in most cases greater for the 100 × 100 × 200 grid; this is particularly visible for the input data noised by the 5% and 10% errors. Figures 4 and 5 show the error plots of the reconstruction of the state function u at the measurement points K1, K2, . . . , K7. The plots of these errors for the ACO and HJ algorithms are quite similar. It can be noticed that for the measurement points K1, K2, K5, K6, greater errors were obtained for the input data noised by the 5% error than for the input data disturbed by the 10% error. The levels of the reconstruction errors of u for the input data unaffected and affected by the 2% error (red and green colors) are much lower than for the other input data (blue and black colors).

Sensitivity Analysis
A sensitivity analysis was also performed for both identified parameters [38]. The sensitivity coefficients are the derivatives of the measured quantity with respect to the identified quantities. In the calculations, both of these derivatives are approximated by central difference quotients:

∂u/∂p (x, y, t) ≈ (u_{p+ε}(x, y, t) − u_{p−ε}(x, y, t)) / (2ε), p ∈ {λ, α},

where ε = 10⁻⁵ [39], and u_p(x, y, t) denotes the state function determined for a given value of the parameter p.
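The central-difference approximation of the sensitivity coefficients can be sketched as follows; `state_function` is a hypothetical stand-in for the numerical solution u_p(x, y, t), mocked here by a closed-form expression with a known derivative for illustration.

```python
def sensitivity(state_function, p, eps=1e-5):
    # central difference quotient: dU/dp ~ (u_{p+eps} - u_{p-eps}) / (2*eps)
    return (state_function(p + eps) - state_function(p - eps)) / (2.0 * eps)

# mock state function with a known derivative: u(p) = p**2, du/dp = 2p,
# evaluated at the exact conductivity value p = 240 used in the example
coeff = sensitivity(lambda p: p ** 2, 240.0)
```

In the actual analysis, each evaluation of `state_function` is one run of the direct-problem solver, so computing both coefficients costs four additional direct solves per time level.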
We considered a test case with α = 0.8 and λ = 240. Figure 6 shows the variability of the sensitivity coefficients at the measurement points over the entire analyzed period of time. The obtained results are symmetrical with respect to the vertical axis of symmetry of the area, i.e., the line x = 1. Therefore, the sensitivity coefficients at the points K_5, K_6, and K_7 are equal to the coefficients at the points K_1, K_2, and K_3, respectively. The performed sensitivity analysis showed that the selected positions of the measurement points are correct: they ensure an appropriate sensitivity of the state function to changes in the values of the identified parameters.

Conclusions
This paper presents algorithms for solving the direct and inverse problems for a model consisting of a differential equation with a fractional derivative of the Riemann-Liouville type with respect to space. Equations of this type are used to describe the phenomena of anomalous diffusion, e.g., anomalous heat transfer in porous media. The inverse problem has been reduced to the search for the minimum of a properly constructed objective function. Two algorithms were used to deal with this problem: the ant colony optimization algorithm and the Hooke-Jeeves method. From the presented numerical example, we can draw the following conclusions:
• The obtained results are satisfactory, and the errors of parameter reconstruction are small.
• Both presented algorithms returned similar results, but in the case of the HJ algorithm, it was necessary to use the penalty function for one of the starting points.
• The number of evaluations of the objective function was smaller for the HJ algorithm (250-300) than for the ACO algorithm (656).
The differential scheme used is unconditionally stable and has an approximation order equal to O((∆x)² + (∆y)² + (∆t)²) [26]. The convergence of the differential scheme is fast; even for sparse meshes, the approximation errors of the solution of the direct problem are small [27]. In addition, in the case of the inverse problem considered in this paper, a relatively sparse mesh suffices to reconstruct the searched parameters very well. The presented method can be used as a tool for parameter training in artificial neural networks.