Reduced Multiplicative (BURA-MR) and Additive (BURA-AR) Best Uniform Rational Approximation Methods and Algorithms for Fractional Elliptic Equations

Numerical methods for spectral space-fractional elliptic equations are studied. The boundary value problem is defined in a bounded domain of general geometry, Ω ⊂ Rd, d ∈ {1, 2, 3}. Assuming that the finite difference method (FDM) or the finite element method (FEM) is applied for discretization in space, the approximate solution is described by the system of linear algebraic equations Aαu = f, α ∈ (0, 1). Although matrix A ∈ RN×N is sparse, symmetric and positive definite (SPD), matrix Aα is dense. The recent achievements in the field are determined by methods that reduce the original non-local problem to solving k auxiliary linear systems with sparse SPD matrices that can be expressed as positive diagonal perturbations of A. The present study is in the spirit of the BURA method, based on the best uniform rational approximation rα,k(t) of degree k of tα in the interval [0, 1]. The introduced additive BURA-AR and multiplicative BURA-MR methods follow the observation that the matrices of part of the auxiliary systems possess very different properties. As a result, solution methods with substantially improved computational complexity are developed. In this paper, we present new theoretical characterizations of the BURA parameters, which gives a theoretical justification for the new methods. The theoretical estimates are supported by a set of representative numerical tests. The new theoretical and experimental results raise the question of whether the almost optimal estimate of the computational complexity of the BURA method in the form O(N log2 N) can be improved.


Introduction
The basic assumption about the fractional diffusion phenomenon is that the Brownian motion hypothesis is violated. In the case of isotropic homogeneous media, such processes are described by the fractional Laplacian. A natural and easy-to-comprehend presentation of fractional Laplacian in the whole space R d , d = 1, 2, 3 can be derived through the Fourier transform. Serious (not only computational) challenges appear when the equation is posed in a bounded domain Ω, equipped with the correct boundary conditions, and the related boundary value problem is considered. There exist at least two different and not equivalent definitions of fractional Laplacian [1,2]. They follow the spectral and the Riesz formulations, respectively. Some recent comparison results between those two approaches are presented in [3], where the difference of the asymptotes of boundary layers is analyzed in detail.
In this paper, we consider the spectral fractional elliptic equation with power α ∈ (0, 1), which is in the form where A is a self-adjoint elliptic operator in Ω, satisfying homogeneous Dirichlet boundary conditions. The non-local operator A α is defined via the spectral decomposition of A, namely where λ j > 0 are the eigenvalues, respectively, ψ j are the corresponding normalized eigenfunctions of A, and (...) is the L 2 inner product.
Let ω h be a uniform rectangular mesh, and the finite difference method (FDM) using a standard (2d + 1)-stencil is applied for approximating the operator A . Then, the FDM discretization of (1) leads to a system of linear algebraic equations with respect to the mesh functions u and f in the form Here, A ∈ R N×N is a sparse, symmetric and positive definite (SPD) matrix, and As in the continuous case, {λ j,h } N j=1 is the spectrum of A, and the eigenvectors Ψ j,h are normalized with respect to the Euclidean dot product (...). Similar construction is applicable when the finite element method (FEM), using a quasi uniform triangulation T h , is applied. In this case, A = M −1 K, where K and M are the FEM stiffness and mass matrices, respectively, and A is SPD with respect to the scalar product generated by the mass matrix. In what follows, we assume that linear finite elements are used.
The recent achievements in the numerical solution of spectral fractional elliptic equations are determined by the class of methods that are applicable to multidimensional domains Ω of general geometry; see [4,5], the references therein, as well as the most recent paper [6]. Although there are several rather different approaches there, all the obtained algorithms reduce the original non-local problem to solving k auxiliary linear systems with sparse SPD matrices that can be expressed as some diagonal perturbations of A in the form A − d i I, d i < 0, i = 1, . . . k. Thus, all these methods can be interpreted as certain rational approximations of degree k [4,7], i.e., A −α ≈ r α,k (A). In this context, the advantages of the BURA method [4,5,8,9] follow directly by its definition, as being the best uniform rational approximation r α,k of degree k for t α in [0, 1].
Introducing the BURA method [9], the modified Remez algorithm was applied to derive r α,k . However, it faces serious difficulties regarding the computational stability for larger k and smaller α. The results obtained in the last few years largely overcome this problem. For example, in [7], the Chebfun implementation of the adaptive Antoulas-Anderson (AAA) algorithm is applied, where the representation of the rational approximant in barycentric form and greedy selection of the support points is used to approximate z −α for z ∈ [λ 1,h , λ N,h ]. Further progress in this direction has been made in [10] (see [11] for the related software implementation). The algorithm (BRASIL) is based on the introduced barycentric rational interpolation. The new algorithms and software tools for stable computation of best uniform rational approximations for larger degrees k provide promising opportunities with respect to further expansions of the application of the BURA technology and its better understanding and interpretation.
The next four publications [12][13][14][15] are in the spirit of further development of methods based on best uniform rational approximations.
Solving time-fractional differential equations is considered in [13], where the spectrum of the fractional Laplace kernel is approximated with a rational function, applying the AAA algorithm. Stability and convergence properties of the proposed numerical scheme are studied. Moreover, the developed algorithm is efficiently applied to a time-fractional Cahn-Hilliard problem.
A class of Reduced Basis Methods (RBM) is studied in [12]. Based on the developed theory, the equivalence between RBM and Rational Krylov Methods (RKM) is analyzed in [14]. Then a unified RKM framework for the spectral fractional in space and fractional in time problems is proposed in [15].
The methods discussed above have a strong impact on the development of efficient solution methods for fractional diffusion problems beyond the scalar elliptic case (see, e.g., [4]). Such results are presented in the more recent papers [16,17], where nonlinear and time-dependent fractional diffusion-in-space problems are considered, respectively. In this spirit, the results are also presented in [18], where a tensor numerical method for optimal control problems constrained by a fractional elliptic operator is developed.
In many studies, the needed degree k for obtaining the targeted accuracy of r α,k (A) is considered as a measure for the computational efficiency of the method. This holds true under the assumption that the complexity of the method is O(kN). In the case of a large scale unstructured matrix A, this means that some iterative method of optimal complexity O(N) is used for solving the auxiliary systems. Obviously, these auxiliary systems possess very different properties. More recently [19], it was found that for larger k, part of the coefficients −d i can be extremely large. This observation motivates the proposed reduced sum modification of the additive BURA method, called BURA-AR.
In this paper, we present new theoretical characterizations of the BURA parameters, which gives theoretical justification of the BURA-AR method and the newly proposed reduced product multiplicative BURA-MR method. In this way, a significant improvement in computational complexity is achieved. The presented numerical results can be used as direct practical receipts for optimizing the computational complexity, depending on the fractional power α. The new theoretical and experimental results raise the question of whether the current state-of-the-art almost-optimal estimate of the computational complexity of the BURA method in the form O(N(log N) 2 ) can be improved.
The rest of the paper is organized as follows. Some basic definitions, error estimates, and additive and multiplicative representations of the BURA algorithm are given in Section 2. BURA's stabilized calculations for larger k are discussed in the next section. It includes a comparative analysis of the theoretical error estimates and the accuracy of BURA, computed using the BRASIL software. Important theoretical results are presented in Section 4. The obtained characterization of the BURA parameters includes interlacing inequalities and asymptotic cluster analysis. The BURA-AR and BURA-MR methods are introduced in the next section. It is shown how to optimize the new parameter l in order to obtain the best computational complexity for a given target accuracy and given a priori estimates of the extreme eigenvalues λ 1,h and λ N,h . The numerical results presented in Section 6 confirm the efficiency of the developed computational technology. The considered large-scale 3D numerical tests illustrate the influence of the fractional power α depending on the smoothness of the right-hand side f (x). Brief concluding remarks are given at the end.

The BURA Method
The abbreviation BURA stands for best uniform rational approximation. Let us consider the min-max problem: find r α,k ∈ R(k, k) such that where r k (t) = P k (t)/Q k (t), P k and Q k are polynomials of degree k. Then the error E α,k of the k-BURA element r α,k is defined as A sharp estimate of E α,k is derived in [20]: Following [5] we introduce the approximation of A −α in the form where u k is the BURA numerical solution of the linear algebraic system (3). In other words, u k is the BURA approximation of the solution u(x) of the fractional elliptic Equation (1).
The following error estimate of the BURA method holds true.
Let Ω ⊂ R 2 , and let the lumped mass linear finite elements for discretization in space be used. Then and therefore with γ > 0, and C independent of h and k. Here u h,k is the FEM function corresponding to the BURA approximation u k .

Remark 1.
The equivalence of the lumped linear FEM and FDM discretizations on uniform rectangle meshes allows obtaining analogues of the error estimates (9)-(11) for the case of the finite difference method.
As noted in [5], for Ω ⊂ R d , d = 1, 3, the estimates (8) and (9) remain valid provided that 1 + γ is replaced by d/2 + γ. In this way, in the more interesting three-dimensional case, one can obtain the error bounds that is where Ω ⊂ R 3 . An additive or multiplicative representation of the rational function r α,k (1/z) can be used for implementation of the BURA method. Till now, the first one is preferred, but in this paper we will analyze both of them independently. The additive representation relies on the partial fraction decomposition of r α,k , while the multiplicative one deals with the factorization of r α,k : where c i > 0 and d i , ξ i < 0. In these ways, the BURA method reads as Thus, the BURA method for numerical solution of the fractional differential Equation (1) requires the solving of k auxiliary linear systems with sparse SPD matrices, which are positive diagonal shifts of the FEM/FDM matrix A. The unified view of the methods for solving Equation (1) as a rational approximation [4,7] leads to a general understanding of the degree k as a measure for computational efficiency. Thus, for the computational complexity of the BURA method, we accept that the following asymptotic estimate holds true: This is valid if a method with optimal computational complexity is used to solve the auxiliary linear systems that appear in (13). In the multidimensional case, this means that such an iterative solver is applied. In our numerical tests, we used the Boomer-AMG implementation from HYPRE [21] of the algebraic multigrid preconditioner in the PCG framework. The further analysis is based on the error estimates (9)- (11). From there, choosing properly the mesh parameters h and the BURA order k, the contributions of the discretization and the BURA approximation to the total error are balanced. In this way, the following almost optimal computational complexity of the BURA algorithms is obtained Improving the computational efficiency of the BURA method is the central focus of this paper. The analysis of the BURA parameters, and in particular of the coefficients d i , determining the diagonal shifts of A in (13), is behind the developed approach.

BURA: Stabilized Computations for Large k
In [4], a modified Remez algorithm (see [22]) was used for the computation of the BURA element of a degree up to ten. This algorithm is very time consuming and because of the difficulties related to the computational stability for large k's in [19], we used the Best Rational Approximation by Successive Interval Length (BRASIL) [11] to get the BURA of degree 85. Here, we computed the error of the rational approximation of the function t α in the interval [0, 1] obtained using the BRASIL software version 1.2.1. We used the following options in baryrat.brasil: convergence criterion tolerance; npi = −30 30 iterations of golden section search per interval; maxiter =10,000 the maximum number of iterations.
The BRASIL algorithm is based on a barycentric representation of the rational approximation of the form In order to obtain the coefficients of the simple additive representation (12), we first find the roots {ξ i } k i=1 and the poles {d i } k i=1 of the above barycentric representation. This is already implemented in the baryrat package by solving a corresponding eigenvalue problem using arithmetical precision of 100 decimal places (provided by the mpmath Python package). Since we are interested in r α,k (z) = r α,k (z −1 ) our roots and poles will be , respectively. Therefore, we can straightforwardly compute d i . To find the values of the c i coefficients in (12), we then solve the following linear system For the solution of the above linear system, we have also used the mpmath package with a precision of 100 decimal places. Let us also note that here we are using a modified convergence criterion for the BRASIL algorithm to ensure that the algorithm converges for large values of k without tinkering with the tolerance. The BRASIL algorithm first computes the errors for each interval between the interpolation points. Convergence is then decided, based on comparing the deviation of the errors in each interval to the tolerance The issue with the above is that ε has to be changed for different values of k. When the errors become very small, the arithmetical error in computing the deviation becomes significant and the algorithm stops converging for small values of ε. Instead of only looking at the deviation, we are checking the product of the maximum error and the deviation The idea is to automatically get as close to the BURA as double-precision floating point arithmetics would allow. The errors for the values of α = 1 4 , 1 2 , 3 4 are presented in Figure 1. We compare these errors with the theoretical estimate (6) of the BURA error from [20]. It is easy to see that the computed errors are very close to the theoretical estimate when the estimate is greater than the double-precision accuracy. The comparison shows that the theoretical estimate of the error is sharp.

In the Unit Interval t ∈ [0, 1]
The multiplicative and the additive representations of r α,k (t) are the following: Note that due to lim t→∞ r α,k (t) = c 0 , the coefficients c 0 in both formulas above coincide.
Theorem 1 (see [20,23]). For arbitrary α ∈ (0, 1) and k ∈ N the following holds true: (a) The zeros and the poles of r α,k are real, negative, and interlacing: The number of poles and zeros of r α,k on any given closed subinterval (d) The number of extreme points of ε α,k on any given closed subinterval Lemma 1. Let α ∈ (0, 1). As k → ∞, almost all poles and zeros of r α,k , as well as extreme points of ε α,k tend to the origin.
Proof. Fix m ∈ N. We will show that for every ε > 0 there exists a K = K(ε), such that for all k > K we have −ε < d m < ξ m < 0. The result for the extreme points is analogous. Indeed, consider an arbitrary ε > 0. Applying (19) with a = −∞ and b = −ε, we derive that there exists a δ = δ(ε) > 0 and a K = K (δ), such that for all k > K Thus, for all k > K Since Taking K = max(K , K ), combining (21) with (17) completes the proof. (6), the integral constants in (19) and (20) are not sharp for smaller k, meaning that K 1 in the proof of Lemma 1. Therefore, (21) is not a reliable approximant for practical applications, where k = O(10).
According to (18) Therefore the additive representation of r α,k can be rewritten in the form This provides the upper bound for (22). The proof is completed.
monotonically increases with i, meaning that the sharpness of the estimate decreases.
Consider the multiplicative representation of r α,k from (16). Then Proof. Again, the lower bound holds true for all i = 1, . . . , k, due to Theorem 1. For the upper bound, we combine (23) and (22) to derive For the last inequality, we apply (17) and conclude that The following relations between the parameters hold true: The last identity is due to c 0 = lim z→∞ r α,k (z) = r α,k (0), while the others are straightforward. According to Section 4.1 and (26), we have the following properties of the parameters, related to r α,k for all choices of α ∈ (0, 1):

The BURA-AR and the BURA-MR Methods
In Figures 2 and 3, we illustrate the behavior of the poles d i of r α,k (z) in (25), where α ∈ { 1 4 , 1 2 , 3 4 } and k = 45, 70, respectively.   (13), is practically equal to one. Of course, we do not need any preconditioning for such well-conditioned matrices. Moreover, even solving such systems may not be the right approach. This motivates us to investigate potential reduction on the number of linear systems to be solved without losing BURA accuracy.
Let δ < 1 be given. We can then introduce the error indicators Both functions r A α,k, and r M α,k, share the same poles and the error analysis performed later in this section indicates that both reduction procedures could be used in practical applications. They do not only stabilize the numerical computations but also improve the efficiency of the BURA solver.
Motivated by the above observation, we replace the r α,k (A) in the BURA method for solving the spectral fractional elliptic Equation (1) by either r A α,k, (A) or r M α,k, (A) and obtain the following BURA-AR and BURA-MR formulations Using the BURA-AR and BURA-MR methods, we reduce the number of linear systems that are to be solved from k to k − , and thus we decrease the computational complexity of the method.

BURA-AR Error Analysis
Throughout this section, let δ, α, and k be fixed. We are interested in analyzing the error indicator E A,δ α,k, . By triangle inequality, we have so we need to estimate the second term. Applying (27) and Lemma 2, we derive Hence, we proved the following theorem: Theorem 2. Let δ, α ∈ (0, 1) and k ∈ N. If ∈ N, such that |d | ≤ 1, then As a byproduct, we also derived that whenever δ < η 2k+1 meaning that for fixed α and k, the error indicator E A,δ α,k, is a monotonically increasing function with respect to and a monotonically decreasing function with respect to δ. The proof relies on the observation that r A α,k, (z) − r A α,k, (z) > 0, z > 0, proven in an identical way as r A α,k, (z) − r α,k (z) > 0 above, together with (27) and (18) that imply Note that while log 10 (2 |d | 1+α δ −1 ) ≤ log 10 (E α,k ) , the reduction error between the BURA-AR method and the original BURA method is dominated by the BURA approximation error E α,k , thus the reduction process does not affect the overall accuracy of the solver in any way. In practice, we observe that due to the large difference in the orders of |d i | for small i, the multiplier l can be omitted and

Corollary 2.
Let u be the solution of the linear algebraic system (3) and u A k, be the additive reduced BURA solution, corresponding to (28) with |d | ≤ 1. Then Example 1. Consider the fractional Laplacian problem with homogeneous Dirichlet boundary conditions in the d dimensional unit cube, d ∈ {1, 2, 3}. Take the finite difference or lumped finite element discretization of the problem on a uniform mesh with mesh-size h. It is well-known that the condition number of the matrix A generated in this way for the corresponding algebraic Equation (3) behaves like independent of the space dimensionality of the problem. According to (8), in order to balance the discretization and the approximation BURA error, there is no practical point to take larger k than the one that guarantees E α,k = h −2 . Now, in order to balance the reduction error with the approximation one, applying Corollary 2, we conclude that all systems, corresponding to diagonal shifts d i , such that can be reduced in the numerical computation. Here, following (27), we used that d i = 1/d i .
In particular, for α = 0.25 and h = 10 −2 , 10 −3 , 10 −4 we can reduce all systems corresponding to diagonal shifts that are higher or equal to 10 7 , 10 10 , 10 13 , respectively. The results are summarized in Table 1. We observe that the BURA-AR method can reduce more than 1/3 of the systems to be solved without influencing the overall BURA error order when α = 0.25. As α increases, the impact of BURA-AR on the computational efficiency decreases. Thus, for the considered example, the results in Table 1 show that for α = 0.75, the recommended reduction is = 0. This is in agreement with the significantly higher accuracy of BURA for α = 0.75, see Figure 1. In other words, smaller k is needed to ensure a certain accuracy, and then a smaller reduction is applicable.  Furthermore, for every α ∈ (0, 1), we can reduce all systems with diagonal shifts larger than 10 16 , as long as h ≥ 10 −4 , which usually holds true in practical applications.

BURA-MR Error Analysis
Again, let δ, α, and k be fixed. We are interested in analyzing the error indicator E M,δ α,k, . By triangle inequality, we have where r M α,k,0 ≡ r α,k . Since (27)), and 1/(z − ξ s+1 ) is strictly monotonically increasing for z ∈ [1, ∞) we conclude that where the constants are uniformly bounded for fixed k. Moreover, (19) suggests that the zeros and the poles of and C(k, s) < 2 k−s . Analogously to Section 5.1, we proved the following theorem: where the constant C < +∞ depends solely on k. Again, as a byproduct, we also derived that whenever δ < η 2k+1 meaning that for fixed α and k, the error indicator E M,δ α,k, is a monotonically increasing function with respect to and a monotonically decreasing function with respect to δ. Corollary 3. Let u be the solution of the linear algebraic system (3) and u M k, be the multiplicative reduced BURA solution, corresponding to (28) with |d | ≤ 1. Then

BURA-AR and BURA-MR Comparison
To summarize, the following theoretical estimates and relations were established in Section 5: Theorem 4. Let α ∈ (0, 1) and k ∈ N. Consider the BURA element r α,k (t) and take arbitrary 0 < δ < η 2k+1 , where η 2k+1 is the extreme point of ε α,k (t) before 1. Then Therefore, from a theoretical approximating point of view, the BURA-AR method always outperforms the BURA-MR method, giving rise to smaller errors for a fixed number of systems to be solved. Furthermore, in the BURA-MR method, apart from the k − algebraic systems, there are also k − matrix-vector multiplications (see (28)). On top of that, the BURA-AR method allows for parallel, independent solutions of the algebraic systems, while the BURA-MR method is a sequential one, as the solution of the previous system becomes the right-hand side of the next one. However, due to the interlacing property (17) and (27), the BURA-MR method seems to be numerically stable, independently of the value of . Moreover, based on the conducted numerical experiments, we observe that the difference E M,δ α,k, (z) − E A,δ α,k, (z) decays as increases, so from an application point of view, the two methods are equally useful. This topic will be addressed in the next section, where various experimental data are generated and analyzed.

Numerical Experiments
In this section, two classes of numerical experiments are considered. The first one aims at numerical validation of the theoretical foundations for the error indicator E A,δ α,k, , derived in Section 5. Therefore, the experiments are in 1D and involve numerical and visual comparison of z −α and r A α,k, (z) in the interval z ∈ [1, δ −1 ). A large value for k, namely k = 70 is considered so that the reduction benefits of BURA-AR are better illustrated. The decreasing effect of BURA-AR when α increases has already been commented on; therefore, the focus here is mostly on α = {0.25, 0.5}. Here, and in what follows, the case α = 0.75 and k = 45 are included for completeness. The numerical data are summarized in Table 2.
It can be seen that even in the case δ = 10 −10 , the accuracy E A,δ α,70, < 5 × 10 −11 for ≤ 29 and ≤ 17 is α = 1 4 and α = 1 2 , respectively. The values = 29 and = 17 are the minimum values that guarantee that all remaining poles of r A 0.25,70, and r A 0.5,70, , respectively, are within the double-precision limit | d | ≤ 2 52 ≈ 5 × 10 15 . The monotonically increasing behavior of r A α,k, when increases, documented in Theorem 4, is well illustrated especially on the right plots, where z ∈ [2 × 10 8 , 10 10 ]. which is a smooth function that agrees with the boundary conditions, and f 2 (x, y, z) ≡ 1, which gives rise to boundary layers and has been a subject of theoretical and experimental investigations in numerous papers already (e.g., see [2] or [24]).
For the first choice, the exact continuous solution can be explicitly derived, thus the error estimate (11) can be numerically computed (see Figure 5). For the second choice, the BURA reduction errors u k − u M k, 2 / f 2 and u k − u A k, 2 / f 2 have been studied (see Figures 6 and 7). The simulations were performed on coarse (h = 1/128) and finer (h = 1/1024) meshes using for the auxiliary sparse SPD systems the BoomerAMG implementation from HYPRE [21] of the algebraic multigrid preconditioner in the PCG framework. The values α = {0.25, 0.5, 0.75} have been considered. Since the best possible FEM accuracy is O(h 2 ), the BURA method for α = 0.25 is set on degree k = 24, which guarantees E 0.25,24 ≈ 10 −6 , thus, in all numerical simulations, the BURA error will not dominate over the error of discretization. For the corresponding degrees when α = 0.5 and α = 0.75, we choose k = 24/2 = 12 and k = 24/3 = 8, respectively, motivated by the estimate (6). In Figure 6, we compare the BURA reduction errors for f 2 to both the theoretical estimate from Theorems 2 and 3 and the corresponding pure non-reduction error E α,k , related to = 0. All results agree with Theorem 4, as u k − u M k, 2 ≥ u k − u A k, 2 always holds true. Further, we observe that the difference u A k, − u M k, 2 decreases when increases.
Finally, the error plots indicate that we can reduce half of the systems ( = 12) for α = 0.25, one-quarter of the systems ( = 3) for α = 0.5, and one-eighth of the systems ( = 1) for α = 0.75, without affecting the order of the overall BURA accuracy. In Figure 7, we compare the BURA reduction errors for f 2 with h 2 -the best order of the discretization error estimate. The idea of this experiment is that whenever the BURA reduction error is below h 2 , then the total error estimate (11) should be dominated by the error of discretization; thus, the reduction process should again not affect the overall accuracy. We observe that for h = 1/128∼10 −2 , we can "safely" reduce 13 systems within both BURA-AR and BURA-MR methods when α = 0.25. Analogously, we can "safely" reduce 4 systems within both BURA-AR and BURA-MR methods when α = 0.5, and 3 and 2 systems, respectively, within the BURA-AR and BURA-MR methods, when α = 0.75. When h = 1/1024∼10 −3 , we are within the "optimal" scenario depicted in Table 1 and, as suggested in the last column of the rows corresponding to h = 10 −3 , we witness the possibility of exactly 9 system reductions for α = 0.25, exactly 2 system reductions for α = 0.5, and even 1 system reduction for α = 0.75. Finally, in Figure 5, we conduct experiments analogous to Figure 6 but using the smooth right-hand side f 1 and estimating the total error (11). The level of potential systems reduction is similar to that in the previous experiment. Moreover, we observe the balance between the BURA error and the error of discretization around level h 2 , especially for finer meshes. This is an indication that the theoretically derived order of discretization of h 2α in certain cases (smooth right-hand sides that agree on the boundary with the boundary conditions and give rise to regular solutions) may be too pessimistic and that the upper order estimate of h 2 , corresponding to classical, non-fractional diffusion, could be achieved, independently of the choice of α.

Concluding Remarks
The newly obtained results significantly improve the computational efficiency of the BURA method. The fact that practically all other methods for a numerical solution of multidimensional spectral fractional-in-space diffusion problems in arbitrary domains can be viewed as some rational approximation further enhances the advantages of the new BURA-MR and BURA-AR methods. On the other hand, the proposed reduced product/sum approach can be applied to many of the other existing methods, as well. The new theoretical results and the presented proof-of-concept numerical tests open the window for a wide range of real-life applications, including various problems beyond the scalar elliptic case.
The number of auxiliary sparse SPD systems to be solved is applied as a measure of computational efficiency in [9], where the BURA method is introduced. Then, this approach is used in the comparative analysis of various other methods, which can be interpreted as a certain rational approximation of degree k of A −α . The theory of the BURA method does not depend on the shape of the computational domain. This also applies to the methods for which the unified approach from [7] is applicable. Based on this, in various papers, one-dimensional numerical tests are considered to be fully representative. In theory, this is correct, but in practice, it is not. The key point in the multidimensional case is that the SPD matrix A is generally unstructured and large-scale. Thus, the implementation of methods based on rational approximation of degree k naturally involves a certain iterative solver of the related k auxiliary sparse SPD systems. In this spirit is the presented study on the efficient implementation of the BURA method in the case of higher degrees k.
The idea of reducing BURA was firstly discussed in the short paper [19]. Here, we analyzed two alternative methods: the recently proposed BURA-AR together with the new BURA-MR. They are based on the additive and the multiplicative representations of BURA, respectively. Although the two expressions are completely equivalent, the theoretical estimates of the reduced sum BURA-AR and the reduced product BURA-MR, as well as the corresponding numerical results, are different.
Until now, only the additive representation was used in the software implementation of the BURA method for large-scale multidimensional problems. One of the reasons for this is perhaps its better parallelization. Here, we show rather promising results for the multiplicative variant of BURA and related BURA-MR methods. This resumes the discussion of the advantages and disadvantages of both options.
The results of this study contribute to the strengthening of theoretical knowledge and practical skills for a numerical solution of real-life fractional diffusion problems in complex multidimensional domains. At the same time, there are still certain issues related to computational stability that most likely need further understanding, interpretation and treatment.
The development of computationally efficient BURA-AR and BURA-MR methods for reaction-diffusion and time-dependent equations involving fractional-in-space diffusion operators is among the priority topics for further research.
Finally, we observe that under certain assumptions, the computational complexity of the reduced BURA can be improved to O(N log N).

Data Availability Statement:
The study did not report any data.