Article

Enhanced Efficient 3D Poisson Solver Supporting Dirichlet, Neumann, and Periodic Boundary Conditions

Department of Civil Engineering, Tamkang University, New Taipei City 251301, Taiwan
Computation 2025, 13(4), 99; https://doi.org/10.3390/computation13040099
Submission received: 7 March 2025 / Revised: 8 April 2025 / Accepted: 14 April 2025 / Published: 18 April 2025
(This article belongs to the Special Issue Advances in Computational Methods for Fluid Flow)

Abstract

This paper generalizes the efficient matrix decomposition method for solving the finite-difference (FD) discretized three-dimensional (3D) Poisson's equation using symmetric 27-point, 4th-order accurate stencils so that it accommodates more boundary conditions (BCs), i.e., Dirichlet, Neumann, and Periodic BCs. It employs equivalent Dirichlet nodes to streamline the computation of the source terms induced by BCs. A generalized eigenvalue formulation is presented to accommodate the flexible 4th-order stencil weights. The proposed method significantly enhances computational speed by reducing the 3D problem to a set of independent 1D problems. Compared with the typical matrix inversion technique, it yields a speed-up ratio proportional to $n^4$, where $n$ is the number of nodes along one side of the cubic domain. Accuracy is validated using Gaussian and sinusoidal source fields, showing 4th-order convergence for Dirichlet and Periodic boundaries, and 2nd-order convergence for Neumann boundaries due to extrapolation limitations, though with lower errors than traditional 2nd-order schemes. The method is also applied to vortex-in-cell flow simulations, demonstrating its capability to handle outer boundaries efficiently and its compatibility with immersed boundary techniques for internal solid obstacles.

1. Introduction

Poisson's equation is crucial in engineering physics, including elasticity [1], electrostatics [2], heat transfer [3], and fluid mechanics. In aerodynamics, for instance, the pressure field can be derived from a given velocity field by solving the Poisson equation. Another key application is the relationship between the vector stream function and vorticity. Assuming scalar quantities for the solution $\phi$ and the source $f$, the generic form of the equation is:
$$\nabla^2 \phi = f \tag{1}$$
In this paper, the compact finite-difference (FD) discretization of Equation (1) is employed, which is derived through central differencing and error analysis to numerically solve the Poisson equation. This method is chosen for its ease of implementation, flexibility, and broad applicability [4,5]. The ultimate goal is to develop a fast and accurate three-dimensional (3D) solver for Equation (1) that can handle various boundary conditions (BCs), including Dirichlet, Neumann, and Periodic. However, as the review will show, few FD methods fully achieve these objectives. Therefore, this work aims to enhance the efficiency of FD methods based on existing knowledge.
The finite difference (FD) approximation results in an $n \times n$ matrix, where $n = n_x n_y$ represents the total number of grid points in a 2D rectangular domain, with $n_x$ and $n_y$ as the grid sizes in the x- and y-directions. For a typical 2nd-order discretization of the 2D Poisson equation, each row of the matrix has only five nonzero entries. In small systems ($n^2 \sim O(10^2)$), direct methods like Gauss elimination [4] are efficient. However, doubling the grid resolution increases the matrix order four-fold in 2D and eight-fold in 3D, leading to memory constraints and making direct methods impractical. To cope with this, iterative methods such as successive over-relaxation [4] are used, trading memory efficiency for increased computational time. The computational time of iterative methods can be reduced when coupled with multigrid schemes, as demonstrated in 2D [6] and 3D [7]. However, these remain iterative methods that do not directly solve the problem.
These challenges arose shortly after the advent of computers in the 1960s for scientific computations. Hockney [8] proposed an efficient framework to solve the 2nd-order accurate FD discretized Poisson equation, which was further developed over the following decades. The approach treats the large solution matrix as a composition of block matrices representing the FD approximation along each dimension. Using a 48 × 48 uniform mesh for a 2D 'plasma problem', ref. [8] demonstrated that matrix decomposition with Fourier analysis (FA) and Cyclic Reduction (CR) improved calculation speed by a factor of 10 on the IBM 7090.
Cyclic Reduction (CR) is an efficient direct method that uses the recursive properties of block matrices to solve reduced systems. Stable solutions require Buneman’s variants [9], which Buzbee et al. [10] extended to handle Periodic and Neumann boundary conditions for the 2D Poisson equation. CR halves the system size at each step and was originally designed for grid widths of 2 K + 1 , where K is the maximum reduction level. Sweet [11] later generalized this approach. Despite its benefits, Buneman’s CR variants have limitations: (i) they are restricted to 2nd-order accurate discretization, (ii) they require advanced mathematical knowledge, and (iii) the final reduction step depends on boundary conditions. These issues likely explain the lack of recent developments based on CR.
Another efficient method is matrix decomposition, which uses modal decomposition to diagonalize block matrices, allowing the solution to be obtained from the diagonalized system and the inverse transformation. Buzbee et al. [10] expanded Hockney’s methods to handle FD discretization with accuracies higher than 2nd-order. Swarztrauber [12] analytically presented eigenvectors and eigenvalues for Dirichlet, Neumann, and Periodic boundary conditions within the same framework using trigonometric functions, which can be coupled with the Fast Fourier Transform [13,14] to enhance computational speed. For staggered grids, the formulations are summarized in [15,16,17].
While efficient methods for solving FD-discretized Poisson equations are well established in 2D, detailed 3D implementations remain limited. Wilhelmson and Ericksen [18] extended Buneman's CR variants and Fourier analysis to 3D, using a square block matrix of order $n_x n_y$, where $n_x$ and $n_y$ are the grid sizes in the x- and y-directions. For large grids, $n_x n_y$ can exceed $10^4$, leading to excessive memory demands and slow computation, making a direct 3D extension impractical. For their 3D matrix decomposition method, cosine transforms are used to reduce the problem to 2D. However, because detailed explanations are lacking, their implementation is difficult to follow.
Recently, Shiferaw and Mittal [19] proposed a 3D matrix decomposition method for solving the fourth-order accurate FD-discretized Poisson equation. They derived a symmetric 27-point compact stencil using Taylor series error analysis and presented trigonometric formulations for eigenvalues and eigenvectors of the block matrices. While their method is well-detailed for processing the first two dimensions, the third-dimensional reduction lacks proof, raising concerns. Wang et al. [20] used the fast discrete sine transform to build the fast 3D Poisson’s solver for a 4th-order accurate compact stencil. Both of the works [19,20] are limited to Dirichlet BCs, excluding Neumann and Periodic BCs. Feng and Zhao [21] also used the fast discrete sine transform for a high-order Poisson solver extendable to 3D and various BCs. However, the non-compact central FD stencil requires extending grid nodes beyond the boundary for multiple steps, depending on the accuracy order. Special boundary condition treatments, like the Augmented Matched Interface and Boundary method, must be applied to handle these extended nodes. Since the current work focuses on the method for compact FD stencils, extending multiple layers of nodes beyond the boundary may not be required.
Various software packages for solving FD discretized Poisson equations have emerged, including the pioneering FISHPACK written in FORTRAN 90 [21] and the recent PittPack [22,23,24], which accelerates 2nd-order FD solutions using parallel GPU and CPU computations. While these resources are useful, data transfer issues arise due to interfaces between different programming languages. Instead of unifying software, this work aims to provide the necessary background to help readers build their own codes, as existing technical notes often lack sufficient detail for this purpose.
The literature reveals that matrix decomposition is ideal for constructing efficient 3D Poisson solvers, but most recent 4th-order FD stencil developments are limited to Dirichlet BCs. Previous work on various BCs mainly addresses 2D Poisson equations with 2nd-order stencils. This gap between 4th-order 3D solvers and their application to various BCs is the focus of this paper. Hence, it first reviews the 27-point, fourth-order symmetric compact stencil for the 3D Poisson solver in Section 2. The proposed 3D matrix decomposition method is presented in Section 4, which reduces the 3D problem to 2D via modal transformation and then further to 1D using the 2D matrix decomposition method discussed in Section 3. Eigenvalue and eigenvector formulations, originally for 2nd-order FD discretization [12], are adapted for 4th-order stencils (see Appendix A). Section 5 validates the method with Gaussian and sinusoidal source fields under various BCs, demonstrating its application in vortex-in-cell flow simulations.

2. Finite Difference Discretization of Poisson’s Equation

In this section, the general form of the compact symmetric stencil deduced from the 4th-order finite difference discretization of Poisson's equation is introduced. Let $n_x$, $n_y$, and $n_z$ be the numbers of solution nodes along the x-, y-, and z-directions, respectively. Equation (2) is then used to specify the sub-indices of the solution nodes, i.e.,
$$\mathbf{x}_{i,j,k} = (x_j,\, y_i,\, z_k) = (x_o + j\,h,\; y_o + i\,h,\; z_o + k\,h), \quad i = 1, 2, \ldots, n_y,\; j = 1, 2, \ldots, n_x,\; k = 1, 2, \ldots, n_z \tag{2}$$
where $(x_o, y_o, z_o)$ is the coordinate of the South-West-Bottom corner of the rectangular domain; $h$ is the spacing of the uniform grid; $i$, $j$, and $k$ are the sub-indices associated with the y-, x-, and z-directions, respectively.
In 3D, the discretization of Equation (1) at a node, $\mathbf{x}_{i,j,k}$, can be written as
$$
\begin{aligned}
& a_\phi\, \phi_{i,j,k} + b_\phi \big[ \phi_{i,j,k-1} + \phi_{i,j-1,k} + \phi_{i-1,j,k} + \phi_{i,j+1,k} + \phi_{i+1,j,k} + \phi_{i,j,k+1} \big] \\
& + c_\phi \big[ \phi_{i,j-1,k-1} + \phi_{i-1,j,k-1} + \phi_{i,j+1,k-1} + \phi_{i+1,j,k-1} + \phi_{i-1,j-1,k} + \phi_{i-1,j+1,k} \\
& \qquad + \phi_{i+1,j+1,k} + \phi_{i+1,j-1,k} + \phi_{i,j-1,k+1} + \phi_{i-1,j,k+1} + \phi_{i,j+1,k+1} + \phi_{i+1,j,k+1} \big] \\
& + d_\phi \big[ \phi_{i-1,j-1,k-1} + \phi_{i-1,j+1,k-1} + \phi_{i+1,j+1,k-1} + \phi_{i+1,j-1,k-1} \\
& \qquad + \phi_{i-1,j-1,k+1} + \phi_{i-1,j+1,k+1} + \phi_{i+1,j+1,k+1} + \phi_{i+1,j-1,k+1} \big] = h^2 \tilde{f}_{i,j,k}
\end{aligned} \tag{3}
$$
where the coefficients $a_\phi$, $b_\phi$, $c_\phi$, and $d_\phi$ are the weights for nodes at the stencil cube center, stencil face centers, stencil edge centers, and stencil corners, respectively. Note that for the 4th-order discretization, the right-hand-side (RHS) source term, $\tilde{f}$, is computed by averaging the source, $f$, at neighboring nodes using a similar pattern of stencil weights, $a_f$, $b_f$, $c_f$, and $d_f$, i.e.,
$$
\begin{aligned}
\tilde{f}_{i,j,k} = {} & a_f\, f_{i,j,k} + b_f \big[ f_{i,j,k-1} + f_{i,j-1,k} + f_{i-1,j,k} + f_{i,j+1,k} + f_{i+1,j,k} + f_{i,j,k+1} \big] \\
& + c_f \big[ f_{i,j-1,k-1} + f_{i-1,j,k-1} + f_{i,j+1,k-1} + f_{i+1,j,k-1} + f_{i-1,j-1,k} + f_{i-1,j+1,k} \\
& \qquad + f_{i+1,j+1,k} + f_{i+1,j-1,k} + f_{i,j-1,k+1} + f_{i-1,j,k+1} + f_{i,j+1,k+1} + f_{i+1,j,k+1} \big] \\
& + d_f \big[ f_{i-1,j-1,k-1} + f_{i-1,j+1,k-1} + f_{i+1,j+1,k-1} + f_{i+1,j-1,k-1} \\
& \qquad + f_{i-1,j-1,k+1} + f_{i-1,j+1,k+1} + f_{i+1,j+1,k+1} + f_{i+1,j-1,k+1} \big]
\end{aligned}
$$
Using the notation of Equation (3), the 2nd-order discretization of the 3D Poisson's equation has left-hand-side (LHS) stencil weights $a_\phi = -6$, $b_\phi = 1$, and $c_\phi = d_\phi = 0$, and the corresponding RHS stencil weights are $a_f = 1$ and $b_f = c_f = d_f = 0$. For the 2D discretization, the general form of Equation (3) can still be used but without the weights of the Top ($k+1$) and Bottom ($k-1$) plane nodes. Hence, for the 2nd-order 2D discretization, the LHS stencil weights are $a_\phi = -4$, $b_\phi = 1$, and $c_\phi = 0$, while the RHS stencil weights are $a_f = 1$ and $b_f = c_f = 0$. For the 4th-order discretization, the derivations of the 3D stencil weights follow the results from [19,25], while the values of the 2D stencil weights are based on [26]. Table 1 outlines these stencil weights, and their general forms are illustrated in Figure 1 for the 2D and 3D domains. Note that nodes of the same symbol and color identify the same coefficient. In Table 1, two 'Types' of 4th-order accurate 3D stencils are considered, where Type 1 utilizes the full 27 weights while Type 2 neglects the corner weights, i.e., $d_\phi = 0$, for the LHS, and the edge and corner weights, i.e., $c_f = d_f = 0$, for the RHS.
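As a quick check of how the four weight classes in Equation (3) are applied, the following minimal Python sketch evaluates the 27-point left-hand side at one node. The function name `apply_stencil`, the test field, and the grid spacing are illustrative assumptions, not part of the original work. The center weight is taken as negative ($a_\phi = -6$, $b_\phi = 1$), which is the sign convention assumed here for the 2nd-order stencil of $\nabla^2 \phi = f$.

```python
def apply_stencil(phi, i, j, k, a, b, c, d):
    """Evaluate the left-hand side of the 27-point stencil at node (i, j, k).

    phi is a callable returning the nodal value for integer sub-indices.
    """
    # The weight is chosen by how many of the three offsets are nonzero:
    # 0 -> center, 1 -> face center, 2 -> edge center, 3 -> corner.
    weights = {0: a, 1: b, 2: c, 3: d}
    total = 0.0
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            for dk in (-1, 0, 1):
                n_offsets = abs(di) + abs(dj) + abs(dk)
                total += weights[n_offsets] * phi(i + di, j + dj, k + dk)
    return total


h = 0.1  # uniform grid spacing (illustrative value)


def phi_quadratic(i, j, k):
    # phi = x^2 + y^2 + z^2, whose Laplacian is exactly 6 everywhere.
    x, y, z = j * h, i * h, k * h
    return x * x + y * y + z * z


# 2nd-order weights: only the center and the six face neighbors contribute.
lhs = apply_stencil(phi_quadratic, 3, 4, 5, a=-6.0, b=1.0, c=0.0, d=0.0)
laplacian_estimate = lhs / h**2
```

For this quadratic field the central difference is exact, so `laplacian_estimate` matches the analytical Laplacian up to round-off. The 4th-order weights of Table 1 could be passed through the same `a`, `b`, `c`, `d` arguments.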
For 3D rectangular grids, $N_x$, $N_y$, and $N_z$ are the total numbers of grid nodes along the x-, y-, and z-directions, respectively. To name the neighbors of the stencil weights or the boundary faces in this paper, 'Bottom (B) and Top (T)', 'West (W) and East (E)', and 'South (S) and North (N)' are used to refer to the lower and the upper neighbors or bounds along the z-, x-, and y-directions, respectively. Three types of boundary conditions (BCs) are considered, i.e., the Dirichlet (D), Neumann (N), and Periodic (P) BCs. Among these three types of boundary nodes, only the Dirichlet nodes carry no unknowns to be solved from the discretized system. Neumann and Periodic nodes, on the other hand, have unknowns that need to be solved and, hence, need to be indexed. Because of this, the total number of solution nodes may differ from the number of grid nodes. For example, if Dirichlet conditions are used on the W- and E-boundaries, then the number of solution nodes along the x-direction is $n_x = N_x - 2$. The values of $n_x$, $n_y$, and $n_z$ for the various boundary conditions are summarized in Table 2.
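The node-counting rule can be sketched as a small helper. This is an illustration under a stated assumption: only Dirichlet faces remove a layer of unknowns, which follows from the statements above that Dirichlet nodes carry no unknowns while Neumann and Periodic nodes do. The function name `solution_nodes` is hypothetical, and Table 2 remains the authoritative source, in particular for the Periodic case.

```python
def solution_nodes(n_grid, lower_bc, upper_bc):
    """Number of unknowns along one direction for BC codes 'D', 'N', 'P'.

    Assumption: each Dirichlet face removes one layer of unknowns,
    while Neumann and Periodic faces keep their nodes as unknowns.
    """
    for bc in (lower_bc, upper_bc):
        if bc not in ("D", "N", "P"):
            raise ValueError("boundary code must be 'D', 'N', or 'P'")
    if (lower_bc == "P") != (upper_bc == "P"):
        raise ValueError("periodic conditions must be set on both faces")
    n_dirichlet = int(lower_bc == "D") + int(upper_bc == "D")
    return n_grid - n_dirichlet


n_x = solution_nodes(7, "D", "D")  # Dirichlet on W and E: N_x - 2
n_y = solution_nodes(7, "N", "D")  # Neumann on S, Dirichlet on N
```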
In addition to the sub-index noted in Equation (2), the linear index, $\hat{i}$, is sometimes used. The relationship between the sub-indices and the linear index follows
$$\hat{i} = i + j\,n_x + k\,n_y n_x \tag{4}$$
Hence, $\phi_{i,j,k}$ in sub-index notation is equivalent to $\phi_{\hat{i}}$ in linear-index notation.
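Using the linear indexing of Equation (4) for ordering, the fields of solutions and sources can be collected into vectors, i.e.,
$$
\begin{aligned}
\boldsymbol{\Phi} &= \big[ \phi_1,\ \phi_2,\ \ldots,\ \phi_{\hat{i}},\ \ldots,\ \phi_{n_x n_y n_z} \big]^T \\
\boldsymbol{F} &= \big[ \tilde{f}_1,\ \tilde{f}_2,\ \ldots,\ \tilde{f}_{\hat{i}},\ \ldots,\ \tilde{f}_{n_x n_y n_z} \big]^T
\end{aligned} \tag{5}
$$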
In this paper, the nodes along a y-directed line are sometimes considered independently. For this reason, the vectors $\boldsymbol{\phi}_{j,k}$ and $\tilde{\boldsymbol{f}}_{j,k}$ are used to represent the solution and source nodal values along the y-direction, $y_i$ ($i = 1, 2, \ldots, n_y$), at x-coordinate $x_j$ and on the xy-plane of z-coordinate $z_k$, i.e.,
$$
\begin{aligned}
\boldsymbol{\phi}_{j,k} &= \big[ \phi_{1,j,k}\ \ \phi_{2,j,k}\ \ \cdots\ \ \phi_{n_y-1,j,k}\ \ \phi_{n_y,j,k} \big]^T \\
\tilde{\boldsymbol{f}}_{j,k} &= \big[ \tilde{f}_{1,j,k}\ \ \tilde{f}_{2,j,k}\ \ \cdots\ \ \tilde{f}_{n_y-1,j,k}\ \ \tilde{f}_{n_y,j,k} \big]^T
\end{aligned} \tag{6}
$$
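If the solutions and sources on the k-th xy-plane are considered, then $\boldsymbol{\phi}_k$ and $\tilde{\boldsymbol{f}}_k$ are used to represent the planar values, to which similar linear indexing is applied, i.e.,
$$
\begin{aligned}
\boldsymbol{\phi}_k &= \big[ \boldsymbol{\phi}_{1,k}\ \ \boldsymbol{\phi}_{2,k}\ \ \cdots\ \ \boldsymbol{\phi}_{n_x-1,k}\ \ \boldsymbol{\phi}_{n_x,k} \big]^T \\
\tilde{\boldsymbol{f}}_k &= \big[ \tilde{\boldsymbol{f}}_{1,k}\ \ \tilde{\boldsymbol{f}}_{2,k}\ \ \cdots\ \ \tilde{\boldsymbol{f}}_{n_x-1,k}\ \ \tilde{\boldsymbol{f}}_{n_x,k} \big]^T
\end{aligned}
$$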
To index the 2D rectangular grid, let k = 0 , and then Equations (2) and (4) can be used for the sub- and linear indices, respectively. Nodal value collections in the form of Equations (5) and (6) can be reduced to 2D as well by neglecting the effect of the z-direction.

3. Matrix Decomposition Method for 2D Poisson’s Equation

3.1. Two-Dimensional Finite-Difference Discretization

To illustrate the FD-discretized system, we first review the 2D finite difference (FD) discretization of Poisson’s equation. Figure 2 presents two examples of 2D rectangular domains with uniformly spaced mesh grids, differing in BCs: NNDD in Figure 2a and DPNP in Figure 2b, where the BCs are named in the sequence of West, South, East, and North boundaries. The boundary nodes are distinguished by colors and symbols—green circles for Dirichlet (D) nodes, blue triangles for Neumann (N) nodes, and red diamonds for Periodic (P) nodes. At corners where two boundary conditions apply, both must be considered for the corresponding node. Inside the domain, black crosses mark the ‘Interior’ nodes, which require solutions. Additionally, solutions must be determined for Neumann and Periodic boundary nodes, except for Dirichlet nodes. The linear indices of the solution nodes, based on Equation (4), are labeled on the right side of the grid in Figure 2. Outside the Neumann and Periodic boundaries, a layer of ‘Dummy’ nodes is added. These auxiliary nodes do not belong to the physical domain but serve as computational padding.
When the boundary conditions (BCs) become non-homogeneous, the finite difference (FD) discretization of Poisson's equation near the boundaries becomes more complex. To illustrate this, we analyze two $N_x \times N_y = 7 \times 7$ example grids (Figure 2). For a generalized approach to discretization and source terms, we select nodes #11, #36, #6, and #1 from the domain in Figure 2a, along with nodes #1 and #36 from Figure 2b. A closer view of these nodes, their neighbors, and local sub-indices is shown in Figure 3. The discretization results for the left-hand side (LHS) and right-hand side (RHS) stencil weights, modified by the BCs, are shown in Figure 4 and Figure 5, respectively. Among these nodes, detailed discretization processes are given only for the more complex cases, i.e., the cases in Figure 3d–f. The discretization processes are then generalized to the simpler cases in Figure 3a–c. Since the discretization focuses on the LHS stencil weights, the coefficients $a_\phi$, $b_\phi$, $c_\phi$, and $d_\phi$ are simplified to $a$, $b$, $c$, and $d$ in the following discussions.
Example 1.
The corner node connecting two Neumann boundaries—#1 in Figure 2a.
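Applying the 9-point stencil in Figure 3d leads to the discretization
$$a\,\phi_{i,j} + b\,[\phi_{i+1,j} + \phi_{i,j+1}] + c\,\phi_{i+1,j+1} = h^2 \tilde{f}_{i,j} - b\,[\phi_{i,j-1} + \phi_{i-1,j}] - c\,[\phi_{i-1,j-1} + \phi_{i+1,j-1} + \phi_{i-1,j+1}]$$
which requires the values on the dummy nodes. These values are extrapolated based on the 2nd-order accurate central difference, i.e.,
$$
\begin{aligned}
\phi_{i+1,j-1} &= \phi_{i+1,j+1} + 2h \cdot \left.\frac{\partial \phi}{\partial x}\right|_{i+1,j} \\
\phi_{i,j-1} &= \phi_{i,j+1} + 2h \cdot \left.\frac{\partial \phi}{\partial x}\right|_{i,j} \\
\phi_{i-1,j} &= \phi_{i+1,j} + 2h \cdot \left.\frac{\partial \phi}{\partial y}\right|_{i,j} \\
\phi_{i-1,j+1} &= \phi_{i+1,j+1} + 2h \cdot \left.\frac{\partial \phi}{\partial y}\right|_{i,j+1} \\
\phi_{i-1,j-1} &= \phi_{i+1,j+1} + 2h \cdot \left[ \left.\frac{\partial \phi}{\partial x}\right|_{i,j} + \left.\frac{\partial \phi}{\partial y}\right|_{i,j} \right]
\end{aligned}
$$
With the gradient information, $\partial\phi/\partial x$ and $\partial\phi/\partial y$, given on the Neumann nodes, substituting the extrapolated values into the discretization leads to
$$
\begin{aligned}
a\,\phi_{i,j} + 2b\,[\phi_{i+1,j} + \phi_{i,j+1}] + 4c\,\phi_{i+1,j+1} = {} & h^2 \tilde{f}_{i,j} - b \left[ 2h \cdot \left.\frac{\partial \phi}{\partial x}\right|_{i,j} + 2h \cdot \left.\frac{\partial \phi}{\partial y}\right|_{i,j} \right] \\
& - c \left[ 2h \cdot \left( \left.\frac{\partial \phi}{\partial x}\right|_{i,j} + \left.\frac{\partial \phi}{\partial y}\right|_{i,j} \right) + 2h \cdot \left.\frac{\partial \phi}{\partial x}\right|_{i+1,j} + 2h \cdot \left.\frac{\partial \phi}{\partial y}\right|_{i,j+1} \right]
\end{aligned}
$$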
Example 2.
The corner node connecting Periodic and Dirichlet Boundaries—#1 in Figure 2b.
Applying the 9-point stencil in Figure 3e leads to the discretization
$$a\,\phi_{i,j} + b\,[\phi_{i+1,j} + \phi_{i,j+1}] + c\,\phi_{i+1,j+1} = h^2 \tilde{f}_{i,j} - b\,[\phi_{i,j-1} + \phi_{i-1,j}] - c\,[\phi_{i-1,j-1} + \phi_{i+1,j-1} + \phi_{i-1,j+1}]$$
which involves three dummy nodes on the South side of the stencil. Since the South boundary of Figure 2b is Periodic, the dummy nodal values can be obtained from the North boundary of the domain, i.e.,
$$
\begin{aligned}
\phi_{i-1,j-1} &= \phi_{N_y,j-1} \\
\phi_{i-1,j} &= \phi_{N_y,j} \\
\phi_{i-1,j+1} &= \phi_{N_y,j+1}
\end{aligned}
$$
Since the Periodic values are part of the solution, the unknowns are moved to the LHS:
$$a\,\phi_{i,j} + b\,[\phi_{i+1,j} + \phi_{i,j+1} + \phi_{N_y,j}] + c\,[\phi_{N_y,j+1} + \phi_{i+1,j+1}] = h^2 \tilde{f}_{i,j} - b\,\phi_{i,j-1} - c\,[\phi_{N_y,j-1} + \phi_{i+1,j-1}]$$
Example 3.
The corner nodes connecting the Neumann and Periodic boundaries—#36 in Figure 2b.
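Applying the 9-point stencil in Figure 3f leads to the discretization
$$a\,\phi_{i,j} + b\,[\phi_{i,j-1} + \phi_{i+1,j}] + c\,\phi_{i+1,j-1} = h^2 \tilde{f}_{i,j} - b\,[\phi_{i-1,j} + \phi_{i,j+1}] - c\,[\phi_{i-1,j-1} + \phi_{i-1,j+1} + \phi_{i+1,j+1}]$$
First, the periodic conditions are applied to obtain the SW and S nodal and gradient values, i.e.,
$$
\begin{aligned}
\phi_{i-1,j-1} &= \phi_{N_y,j-1} \\
\phi_{i-1,j} &= \phi_{N_y,j} \\
\left.\frac{\partial \phi}{\partial x}\right|_{i-1,j} &= \left.\frac{\partial \phi}{\partial x}\right|_{N_y,j}
\end{aligned}
$$
Next, the Neumann conditions are applied to extrapolate the SE, E, and NE nodal values, i.e.,
$$
\begin{aligned}
\phi_{i-1,j+1} &= \phi_{i-1,j-1} + 2h \cdot \left.\frac{\partial \phi}{\partial x}\right|_{i-1,j} = \phi_{N_y,j-1} + 2h \cdot \left.\frac{\partial \phi}{\partial x}\right|_{N_y,j} \\
\phi_{i,j+1} &= \phi_{i,j-1} + 2h \cdot \left.\frac{\partial \phi}{\partial x}\right|_{i,j} \\
\phi_{i+1,j+1} &= \phi_{i+1,j-1} + 2h \cdot \left.\frac{\partial \phi}{\partial x}\right|_{i+1,j}
\end{aligned}
$$
With these substitutions and all unknowns moved to the LHS, the discretization becomes
$$a\,\phi_{i,j} + b\,[2\phi_{i,j-1} + \phi_{N_y,j} + \phi_{i+1,j}] + 2c\,[\phi_{i+1,j-1} + \phi_{N_y,j-1}] = h^2 \tilde{f}_{i,j} - b \cdot 2h \cdot \left.\frac{\partial \phi}{\partial x}\right|_{i,j} - c \left[ 2h \cdot \left.\frac{\partial \phi}{\partial x}\right|_{N_y,j} + 2h \cdot \left.\frac{\partial \phi}{\partial x}\right|_{i+1,j} \right]$$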
The FD discretizations for the nodes shown in Figure 3 yield the LHS stencil weights and RHS stencil sources summarized in Figure 4 and Figure 5, respectively. For a typical interior node (Figure 3a), all neighbors are solution nodes, so the LHS stencil weights (Figure 4a) are the same as those prescribed in Figure 1, and the RHS source term is contributed solely by the central node, i.e., $\tilde{f}_{i,j}$ in Figure 5a. For nodes beside the Dirichlet boundaries (Figure 3b,c,e), the stencil weights of the Dirichlet nodes are removed from the LHS (Figure 4b,c,e) because their values are known, and they are moved to the RHS as source terms (Figure 5b,c,e). For the Neumann nodes (Figure 3c,d,f), the LHS stencil weights are doubled for the neighbor nodes on the interior side of the Neumann face (Figure 4c,d,f), and the gradient terms on the Neumann nodes also contribute to the RHS source terms (Figure 5c,d,f). For the Periodic nodes (Figure 3e,f), the neighbor stencil weights are associated with the nodes on the opposite side of the domain in the periodic direction (Figure 4e,f), and no source terms are generated from the neighboring Periodic nodes (Figure 5e,f).
As shown in the examples and Figure 5, calculating source terms induced by mixed boundary conditions can be complex. In 3D discretization, corner nodes in the 2D plane extend to edge nodes in the 3D domain, increasing their influence. Moreover, corner nodes in 3D may involve three different boundary conditions, further complicating the calculations. Therefore, efficiently computing boundary-induced source terms with high accuracy is crucial. To address this, Neumann and Periodic boundary conditions are transformed into equivalent Dirichlet boundary conditions on extended dummy nodes. The following procedure outlines this approach, which can be directly generalized to a 3D scenario:
(i) The solutions, $\phi$, are assumed to be zero on all grid nodes except the Dirichlet nodes. Next, for the Dummy nodes outside the Periodic boundaries (e.g., the S and N Dummy nodes in Figure 2b), the nodal values are obtained by copying the Periodic conditions from the opposite side of the domain. Since the solutions are assumed to be zero, this step effectively extends the Dirichlet conditions in the periodic directions. The gradient conditions are copied in the same way if Neumann nodes exist.
(ii) For the Dummy nodes outside the Neumann boundaries (e.g., the W- and S-boundaries in Figure 2a and the E-boundary in Figure 2b), the nodal values are obtained by extrapolation using a 2nd-order scheme. Since the solutions are zero before this step, it effectively assigns the gradient source terms to the Dummy nodes.
(iii) Consider the extended grid that includes the Dummy nodes, and reset the boundary conditions such that the fringe nodes of the domain are all Dirichlet. These new Dirichlet nodes have nodal values that are equivalent to the effects of the Neumann and Periodic BCs.
(iv) Add the Dirichlet conditions to the RHS source term, $\tilde{f}$, for each solution node. For this step, the routine may first check whether a Dirichlet node lies SW, W, NW, S, N, SE, E, or NE of any solution node, and then add the Dirichlet condition to that solution node.
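Steps (ii) to (iv) can be illustrated on a 1D analogue. The sketch below is an illustration under assumptions rather than the paper's routine: it uses the 1D 2nd-order stencil ($a = -2$, $b = 1$), a Dirichlet value on the West end, and a Neumann gradient on the East end. All variable names, the test problem $\phi = x^2 + 1$, and the helper `thomas` (a standard tridiagonal solver) are choices made for this example.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system (lower[0] and upper[-1] are ignored)."""
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for r in range(1, n):
        denom = diag[r] - lower[r] * c_prime[r - 1]
        c_prime[r] = upper[r] / denom
        d_prime[r] = (rhs[r] - lower[r] * d_prime[r - 1]) / denom
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for r in range(n - 2, -1, -1):
        x[r] = d_prime[r] - c_prime[r] * x[r + 1]
    return x


N = 10                 # number of grid intervals on [0, 1]
h = 1.0 / N
a, b = -2.0, 1.0       # 1D 2nd-order stencil weights
phi_west = 1.0         # Dirichlet value at x = 0
grad_east = 2.0        # Neumann gradient d(phi)/dx at x = 1
source = [2.0] * N     # f at the unknown nodes x_1 ... x_N

# Step (ii): with all unknowns set to zero, the East dummy node
# receives only the gradient term of the 2nd-order extrapolation.
dummy_east = 0.0 + 2.0 * h * grad_east

# Steps (iii)-(iv): both fringe nodes now act as Dirichlet nodes,
# and their values are moved to the right-hand side.
rhs = [h * h * f_j for f_j in source]
rhs[0] -= b * phi_west
rhs[-1] -= b * dummy_east

lower = [b] * N
diag = [a] * N
upper = [b] * N
lower[-1] = 2.0 * b    # doubled weight on the interior side of the Neumann node

phi = thomas(lower, diag, upper, rhs)
exact = [(j * h) ** 2 + 1.0 for j in range(1, N + 1)]
max_err = max(abs(p - e) for p, e in zip(phi, exact))
```

Because the central difference is exact for quadratic fields, `max_err` stays at round-off level here. For general fields the Neumann extrapolation limits the accuracy to 2nd order, as noted in the abstract.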
By ordering the solution nodes using linear indexing of Equation (4), the results of 2D FD discretization can be written as
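$$\mathbf{A}\,\boldsymbol{\Phi} = \boldsymbol{F} \tag{11}$$
where $\boldsymbol{\Phi}$ and $\boldsymbol{F}$ are the solution and source vectors in Equation (5) (2D version), and $\mathbf{A}$ is the square solution matrix formed by $n_x \times n_x$ blocks of square matrices of order $n_y$. Written in terms of the block matrices, Equation (11) becomes
$$
\begin{bmatrix}
\mathbf{A}_1 & g_W \mathbf{A}_2 & & & p_W \mathbf{A}_2 \\
\mathbf{A}_2 & \mathbf{A}_1 & \mathbf{A}_2 & & \\
& \ddots & \ddots & \ddots & \\
& & \mathbf{A}_2 & \mathbf{A}_1 & \mathbf{A}_2 \\
p_E \mathbf{A}_2 & & & g_E \mathbf{A}_2 & \mathbf{A}_1
\end{bmatrix}
\begin{bmatrix}
\boldsymbol{\phi}_1 \\ \boldsymbol{\phi}_2 \\ \vdots \\ \boldsymbol{\phi}_{n_x-1} \\ \boldsymbol{\phi}_{n_x}
\end{bmatrix}
=
\begin{bmatrix}
\tilde{\boldsymbol{f}}_1 \\ \tilde{\boldsymbol{f}}_2 \\ \vdots \\ \tilde{\boldsymbol{f}}_{n_x-1} \\ \tilde{\boldsymbol{f}}_{n_x}
\end{bmatrix}
\tag{12}
$$
Here $n_x$ and $n_y$ are the numbers of solution nodes along the x- and y-directions of a rectangular domain; $g_W$, $g_E$, $p_W$, and $p_E$ are constants depending on the BCs of the West (W) and East (E) boundaries. The values of these quantities are summarized in Table 2. By summarizing the patterns of the LHS stencil weights for various boundary conditions, e.g., Examples 1 to 3 and Figure 4, the block matrices can be written as follows: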
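$$
\mathbf{A}_1 =
\begin{bmatrix}
a & g_S b & & & p_S b \\
b & a & b & & \\
& \ddots & \ddots & \ddots & \\
& & b & a & b \\
p_N b & & & g_N b & a
\end{bmatrix}_{n_y \times n_y}
\tag{13a}
$$
$$
\mathbf{A}_2 =
\begin{bmatrix}
b & g_S c & & & p_S c \\
c & b & c & & \\
& \ddots & \ddots & \ddots & \\
& & c & b & c \\
p_N c & & & g_N c & b
\end{bmatrix}_{n_y \times n_y}
\tag{13b}
$$
Here $g_S$, $g_N$, $p_S$, and $p_N$ are constants depending on the BCs of the South (S) and North (N) boundaries and are summarized in Table 2. Note that the constants $g$ and $p$ are used to discern whether the boundary conditions are Neumann or Periodic, respectively, while the subscripts denote the boundary to which each constant applies.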
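The pattern shared by $\mathbf{A}_1$ and $\mathbf{A}_2$ can be assembled with one helper, sketched below. The values used for the constants are assumptions inferred from Examples 1 to 3 (a doubled weight beside a Neumann face, a wrap-around weight for a Periodic face): $g = 2$ for Neumann and 1 otherwise, $p = 1$ for Periodic and 0 otherwise. Table 2 is the authoritative source for these constants, and the function name `boundary_block` is hypothetical.

```python
def boundary_block(n, center, neighbor, south_bc, north_bc):
    """Assemble an n-by-n block with the pattern of Equation (13).

    Use (center, neighbor) = (a, b) for A1 and (b, c) for A2.
    BC codes: 'D' (Dirichlet), 'N' (Neumann), 'P' (Periodic).
    """
    if n < 3:
        raise ValueError("this sketch expects at least three nodes")
    # Assumed constants: doubling for Neumann, wrap-around for Periodic.
    g_s = 2.0 if south_bc == "N" else 1.0
    g_n = 2.0 if north_bc == "N" else 1.0
    p_s = 1.0 if south_bc == "P" else 0.0
    p_n = 1.0 if north_bc == "P" else 0.0

    block = [[0.0] * n for _ in range(n)]
    for r in range(n):
        block[r][r] = center
        if r > 0:
            block[r][r - 1] = neighbor
        if r < n - 1:
            block[r][r + 1] = neighbor
    # First and last rows carry the boundary-dependent factors.
    block[0][1] = g_s * neighbor
    block[0][n - 1] = p_s * neighbor
    block[n - 1][n - 2] = g_n * neighbor
    block[n - 1][0] = p_n * neighbor
    return block


# 2nd-order 2D weights (a = -4, b = 1) with Neumann on S and Dirichlet on N.
A1 = boundary_block(5, -4.0, 1.0, "N", "D")
# Same weights with Periodic conditions on both S and N.
A1_periodic = boundary_block(5, -4.0, 1.0, "P", "P")
```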

3.2. 2D Modal Decomposition Technique for Solution

This section reviews the background of the matrix decomposition method [10] for solving Equation (12) and generalizes it for the 4th-order stencil and various boundary conditions. This review is necessary for continuing the dimensional reduction in the 3D implementation shown in Section 4.2. The technique mainly utilizes modal decomposition of the block matrices. Since the block matrices $\mathbf{A}_1$ and $\mathbf{A}_2$ can be proved to have the same eigenvectors [27], $\mathbf{Q}$, the following modal decomposition can be applied, i.e.,
$$
\begin{aligned}
\mathbf{Q}^{-1} \mathbf{A}_1 \mathbf{Q} &= \boldsymbol{\Lambda}_1 \\
\mathbf{Q}^{-1} \mathbf{A}_2 \mathbf{Q} &= \boldsymbol{\Lambda}_2
\end{aligned} \tag{14}
$$
Here, $\mathbf{Q}^{-1}$ denotes the inverse of the eigenvector matrix. If $\mathbf{A}_1$ and $\mathbf{A}_2$ are symmetric, which applies to the Dirichlet and Periodic conditions (along the y-direction), the eigenvectors are orthogonal, i.e., $\mathbf{Q}^T = \mathbf{Q}^{-1}$. $\boldsymbol{\Lambda}_1$ and $\boldsymbol{\Lambda}_2$ are the diagonal matrices containing the eigenvalues $\lambda_i$ and $\mu_i$, i.e.,
$$
\begin{aligned}
\boldsymbol{\Lambda}_1 &= \mathrm{diag}(\lambda_1,\ \lambda_2,\ \ldots,\ \lambda_{n_y}) \\
\boldsymbol{\Lambda}_2 &= \mathrm{diag}(\mu_1,\ \mu_2,\ \ldots,\ \mu_{n_y})
\end{aligned} \tag{15}
$$
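By pre-multiplying each block row of Equation (12) by $\mathbf{Q}^{-1}$, denoting the transformed source term with an overbar, i.e., $\bar{\tilde{\boldsymbol{f}}}_j = \mathbf{Q}^{-1} \tilde{\boldsymbol{f}}_j$, and substituting $\boldsymbol{\phi}_j = \mathbf{Q} \bar{\boldsymbol{\phi}}_j$ for $j = 1, 2, \ldots, n_x$, Equation (12) becomes
$$
\begin{bmatrix}
\boldsymbol{\Lambda}_1 & g_W \boldsymbol{\Lambda}_2 & & & p_W \boldsymbol{\Lambda}_2 \\
\boldsymbol{\Lambda}_2 & \boldsymbol{\Lambda}_1 & \boldsymbol{\Lambda}_2 & & \\
& \ddots & \ddots & \ddots & \\
& & \boldsymbol{\Lambda}_2 & \boldsymbol{\Lambda}_1 & \boldsymbol{\Lambda}_2 \\
p_E \boldsymbol{\Lambda}_2 & & & g_E \boldsymbol{\Lambda}_2 & \boldsymbol{\Lambda}_1
\end{bmatrix}
\begin{bmatrix}
\bar{\boldsymbol{\phi}}_1 \\ \bar{\boldsymbol{\phi}}_2 \\ \vdots \\ \bar{\boldsymbol{\phi}}_{n_x-1} \\ \bar{\boldsymbol{\phi}}_{n_x}
\end{bmatrix}
=
\begin{bmatrix}
\bar{\tilde{\boldsymbol{f}}}_1 \\ \bar{\tilde{\boldsymbol{f}}}_2 \\ \vdots \\ \bar{\tilde{\boldsymbol{f}}}_{n_x-1} \\ \bar{\tilde{\boldsymbol{f}}}_{n_x}
\end{bmatrix}
\tag{16}
$$
Since the eigenvalue matrices are diagonal, the equations associated with each mode in Equation (16) are independent. For the $i$-th mode, the system is
$$
\begin{bmatrix}
\lambda_i & g_W \mu_i & & & p_W \mu_i \\
\mu_i & \lambda_i & \mu_i & & \\
& \ddots & \ddots & \ddots & \\
& & \mu_i & \lambda_i & \mu_i \\
p_E \mu_i & & & g_E \mu_i & \lambda_i
\end{bmatrix}_{n_x \times n_x}
\begin{bmatrix}
\bar{\phi}_{i,1} \\ \bar{\phi}_{i,2} \\ \vdots \\ \bar{\phi}_{i,n_x-1} \\ \bar{\phi}_{i,n_x}
\end{bmatrix}
=
\begin{bmatrix}
\bar{\tilde{f}}_{i,1} \\ \bar{\tilde{f}}_{i,2} \\ \vdots \\ \bar{\tilde{f}}_{i,n_x-1} \\ \bar{\tilde{f}}_{i,n_x}
\end{bmatrix}
\tag{17a}
$$
Equation (17a) can be rewritten in an abbreviated form, i.e.,
$$\boldsymbol{\Gamma}_i\, \hat{\boldsymbol{\phi}}_i = \hat{\tilde{\boldsymbol{f}}}_i, \qquad i = 1, 2, \ldots, n_y \tag{17b}$$
Here $\boldsymbol{\Gamma}_i$ is the tridiagonal square matrix collecting the eigenvalues of the $i$-th mode; $\hat{\boldsymbol{\phi}}_i$ and $\hat{\tilde{\boldsymbol{f}}}_i$ denote, respectively, the transformed solution and source vectors of the $i$-th mode.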
With the modal decomposition technique described in Equations (14)–(17), the solution algorithm is as follows:
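  • Compute the eigenvector matrix, $\mathbf{Q}$, and the diagonal matrices of eigenvalues, $\boldsymbol{\Lambda}_1$ and $\boldsymbol{\Lambda}_2$, for matrices $\mathbf{A}_1$ and $\mathbf{A}_2$, respectively.
  • Compute the transformed source vectors, i.e., $\bar{\tilde{\boldsymbol{f}}}_j = \mathbf{Q}^{-1} \tilde{\boldsymbol{f}}_j$, for $j = 1, 2, \ldots, n_x$.
  • Collect the transformed source of the $i$-th mode to form the vector $\hat{\tilde{\boldsymbol{f}}}_i = [\bar{\tilde{f}}_{i,1},\ \bar{\tilde{f}}_{i,2},\ \ldots,\ \bar{\tilde{f}}_{i,n_x}]^T$, and form the tridiagonal eigenvalue matrix $\boldsymbol{\Gamma}_i$.
  • Solve for the transformed solution of the $i$-th mode, $\hat{\boldsymbol{\phi}}_i = [\bar{\phi}_{i,1},\ \bar{\phi}_{i,2},\ \ldots,\ \bar{\phi}_{i,n_x}]^T$, for $i = 1, 2, \ldots, n_y$, from the system of Equation (17).
  • Transform back to the solution by $\boldsymbol{\phi}_j = \mathbf{Q} \bar{\boldsymbol{\phi}}_j$ for $j = 1, 2, \ldots, n_x$.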
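The five steps above can be sketched in plain Python for the simplest case. The sketch rests on several assumptions that go beyond the text: Dirichlet conditions on all four boundaries (so $g = 1$ and $p = 0$ throughout), a right-hand side that already contains $h^2 \tilde{f}$ and any boundary contributions, and the well-known sine eigenvectors of a symmetric tridiagonal Toeplitz matrix in place of the Appendix A formulations. The function names and the stencil weights in the usage lines are illustrative and are not taken from Table 1.

```python
import math


def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system (lower[0] and upper[-1] are ignored)."""
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for r in range(1, n):
        denom = diag[r] - lower[r] * c_prime[r - 1]
        c_prime[r] = upper[r] / denom
        d_prime[r] = (rhs[r] - lower[r] * d_prime[r - 1]) / denom
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for r in range(n - 2, -1, -1):
        x[r] = d_prime[r] - c_prime[r] * x[r + 1]
    return x


def solve_2d_dirichlet(rhs, a, b, c):
    """rhs[j][i] is the right-hand side on line j (x) at node i (y)."""
    n_x, n_y = len(rhs), len(rhs[0])
    # Step 1: analytical eigenpairs of A1 = tridiag(b, a, b), A2 = tridiag(c, b, c).
    theta = [math.pi * (m + 1) / (n_y + 1) for m in range(n_y)]
    scale = math.sqrt(2.0 / (n_y + 1))
    q = [[scale * math.sin((i + 1) * t) for t in theta] for i in range(n_y)]
    lam = [a + 2.0 * b * math.cos(t) for t in theta]
    mu = [b + 2.0 * c * math.cos(t) for t in theta]
    # Step 2: transformed sources; Q is symmetric and orthogonal, so Q^{-1} = Q.
    f_bar = [[sum(q[m][i] * line[i] for i in range(n_y)) for m in range(n_y)]
             for line in rhs]
    # Steps 3-4: one independent tridiagonal solve along x for every mode m.
    phi_bar = [[0.0] * n_y for _ in range(n_x)]
    for m in range(n_y):
        mode_rhs = [f_bar[j][m] for j in range(n_x)]
        mode_sol = thomas([mu[m]] * n_x, [lam[m]] * n_x, [mu[m]] * n_x, mode_rhs)
        for j in range(n_x):
            phi_bar[j][m] = mode_sol[j]
    # Step 5: transform back, phi_j = Q phi_bar_j.
    return [[sum(q[i][m] * phi_bar[j][m] for m in range(n_y)) for i in range(n_y)]
            for j in range(n_x)]


def max_residual(phi, rhs, a, b, c):
    """Largest mismatch of the 9-point stencil, with zeros outside the grid."""
    n_x, n_y = len(phi), len(phi[0])

    def val(j, i):
        return phi[j][i] if 0 <= j < n_x and 0 <= i < n_y else 0.0

    worst = 0.0
    for j in range(n_x):
        for i in range(n_y):
            lhs = (a * val(j, i)
                   + b * (val(j - 1, i) + val(j + 1, i) + val(j, i - 1) + val(j, i + 1))
                   + c * (val(j - 1, i - 1) + val(j - 1, i + 1)
                          + val(j + 1, i - 1) + val(j + 1, i + 1)))
            worst = max(worst, abs(lhs - rhs[j][i]))
    return worst


# Illustrative symmetric weights and an arbitrary right-hand side.
rhs = [[math.sin(j + 1.0) * (i + 2.0) for i in range(5)] for j in range(6)]
phi = solve_2d_dirichlet(rhs, a=-20.0, b=4.0, c=1.0)
residual = max_residual(phi, rhs, a=-20.0, b=4.0, c=1.0)
```

The residual check applies the 9-point stencil directly to the returned field, so it verifies the decomposition independently of the eigenvalue formulas. Neumann or Periodic boundaries would change both the eigenpairs and the corner entries of each modal system, which this sketch does not cover.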
A few things can be observed when using the matrix decomposition method. Firstly, the 2D FD discretization of Poisson's equation generates a large sparse solution matrix of order $n_x n_y \times n_x n_y$ (i.e., Equation (12)). Direct inversion of such a large matrix would require large computer memory as well as computational time. In the matrix decomposition method of Equation (17), the solution matrix is reduced to a tridiagonal one of order $n_x \times n_x$, which is equivalent to solving a 1D system along the x-direction. Although this matrix needs to be inverted for $n_y$ modes, the modal systems are independent of each other and are therefore faster to solve. Secondly, the eigenvectors, $\mathbf{Q}$, and the inverse of the tridiagonal matrix, $\boldsymbol{\Gamma}_i$, can be predetermined and stored for repeated use if the grid does not change its geometry. This reduces the computational time by removing the eigenvalue calculation from each time step when simulating the temporal evolution of a problem. To preserve the accuracy intended by the FD discretization, the contributions of all $n_y$ modes are considered in the modal decomposition and reconstruction. The only exception is the rare case where the transformed source vector, i.e., $\hat{\tilde{\boldsymbol{f}}}_i$ in Equation (17b), is a zero vector, in which case its modal contribution can be neglected.
In the matrix decomposition, the eigenvectors and eigenvalues can be prescribed analytically, which removes the need to compute them numerically. Swarztrauber [12] provides a valuable summary of these formulations for the typical 2nd-order accurate FD stencil and various BCs (on the South and North boundaries). However, as noted by Cohen [27], because the general form of the block matrix can be obtained by a linear combination of the basic 2nd-order discretization and an identity matrix, the eigenvectors of the block matrix associated with the 4th-order discretization are the same as the 2nd-order ones. The eigenvalues, on the other hand, are different in the 4th-order discretization and, therefore, need to be modified from those prescribed in [12]. This work presents the adapted formulations of eigenvectors and eigenvalues in Appendix A for completeness of the proposed method.
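The shared-eigenvector argument can be made explicit with a short worked step. The matrix $\mathbf{K}$ below is notation introduced here for illustration and does not appear in the source. It collects the off-diagonal pattern of Equation (13): unit entries on the sub- and super-diagonals, with the factors $g_S$, $g_N$, $p_S$, and $p_N$ in the positions shown there.

$$
\mathbf{A}_1 = a\,\mathbf{I} + b\,\mathbf{K}, \qquad \mathbf{A}_2 = b\,\mathbf{I} + c\,\mathbf{K}
$$

$$
\mathbf{K}\,\mathbf{q}_i = \kappa_i\,\mathbf{q}_i \;\;\Rightarrow\;\; \mathbf{A}_1 \mathbf{q}_i = (a + b\,\kappa_i)\,\mathbf{q}_i, \quad \mathbf{A}_2 \mathbf{q}_i = (b + c\,\kappa_i)\,\mathbf{q}_i
$$

Hence every eigenvector $\mathbf{q}_i$ of $\mathbf{K}$ is an eigenvector of both block matrices, with $\lambda_i = a + b\,\kappa_i$ and $\mu_i = b + c\,\kappa_i$. For Dirichlet conditions on both the South and North boundaries, the standard result for a symmetric tridiagonal Toeplitz matrix gives $\kappa_i = 2\cos\big(i\pi/(n_y+1)\big)$. The other boundary combinations are covered by the formulations in Appendix A.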

4. Matrix Decomposition Method for 3D Poisson’s Equation

4.1. Linear System of the Discretization

By generalizing the results of the 2D FD discretization, the 3D FD discretization of Poisson’s equation results in the following linear system:
M Φ = F
Since the components of the solutions Φ and sources F are ordered as in Equation (5), the square matrix M formed by n z block matrices A and T can be summarized as follows as the results of the 3D FD discretization, i.e.,
M = A g B T T A T p B T T p T T A T T A T g T T A n z b l o c k s × n z b l o c k s
Note that the number of blocks, n z , and the coefficients, g B , p B , g T , and p T in M will depend on the boundary conditions of the Bottom (B) and Top (T) faces of the domain, i.e., the two ends in the z-direction. These numbers are summarized in Table 2. The block matrices A and T are used to link the nodal values on the k-th xy-plane to its top ((k+1)-th) and bottom ((k−1)-th) neighbor of the xy-planes. The matrices A and T can be further decomposed into square matrices formed by n x blocks as follows:
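$$
A = \begin{bmatrix}
A_1 & g_W A_2 & & & p_W A_2 \\
A_2 & A_1 & A_2 & & \\
 & \ddots & \ddots & \ddots & \\
 & & A_2 & A_1 & A_2 \\
p_E A_2 & & & g_E A_2 & A_1
\end{bmatrix}_{n_x\ \text{blocks} \times n_x\ \text{blocks}} \tag{18c}
$$
$$
T = \begin{bmatrix}
T_1 & g_W T_2 & & & p_W T_2 \\
T_2 & T_1 & T_2 & & \\
 & \ddots & \ddots & \ddots & \\
 & & T_2 & T_1 & T_2 \\
p_E T_2 & & & g_E T_2 & T_1
\end{bmatrix}_{n_x\ \text{blocks} \times n_x\ \text{blocks}} \tag{18d}
$$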
Here $A_1$ and $A_2$ are the matrices that link the nodal values on the j-th line to its East ((j+1)-th) and West ((j−1)-th) neighbor lines of the grids in the y-direction on the k-th xy-plane. Note that the block matrices $A_1$ and $A_2$ in matrix $A$ of Equation (18c) are the same as those in Equation (13) used in the 2D discretization. Matrices $T_1$ and $T_2$ play the same roles as $A_1$ and $A_2$, respectively, but for the nodes on the upper ((k+1)-th) and lower ((k−1)-th) xy-planes, i.e.,
$$
T_1 = \begin{bmatrix}
b & g_S c & & & p_S c \\
c & b & c & & \\
 & \ddots & \ddots & \ddots & \\
 & & c & b & c \\
p_N c & & & g_N c & b
\end{bmatrix}_{n_y \times n_y}
$$
$$
T_2 = \begin{bmatrix}
c & g_S d & & & p_S d \\
d & c & d & & \\
 & \ddots & \ddots & \ddots & \\
 & & d & c & d \\
p_N d & & & g_N d & c
\end{bmatrix}_{n_y \times n_y}
$$
In A 1 (Equation (13a)), the coefficients a and b link the nodal values on the i-th center node to its South ((i−1)-th) and North ((i+1)-th) neighbors on the j-th line of the grids in the y-direction. The coefficients b and c in matrix A 2 (Equation (13b)) do the same thing as a and b in A 1 , respectively, but for the (j+1)- and (j−1)-th lines of the grids in the y-direction. The same rule can be applied to explain the coefficients b , c , and d in T 1 and T 2 for nodes on the upper ((k+1)-th) and lower ((k−1)-th) neighbors of the k-th xy-plane. These values are summarized in Table 2 as well.
To evaluate the source terms ( F ) induced by various boundary conditions, the equivalent Dirichlet BCs proposed in Section 3.1—generalized directly from the 2D version—are used. This approach avoids the added complexity of implementing a 3D-specific computer routine.

4.2. 3D Modal Decomposition Technique for Solution

This section expands the modal decomposition technique described in Section 3.2 from 2D to 3D. The way Shiferaw and Mittal [19] started the derivation is first followed but ends up with a more general result that could account for various combinations of BCs. Note that the matrix decomposition results presented in [19] can only be applied to the Dirichlet BCs, and the formulations of the associated eigenvalues are presented without explanation. Here it deduces in detail how the 3D problem with various BCs can be reduced to independent 1D tridiagonal systems using this technique for a fast solution.
Since the stencil coefficients are symmetric (Figure 1), the eigenvectors of matrices A 1 , A 2 , T 1 , and T 2 are the same (see argument in [27]) and, hence, can be assumed as Q . The eigenvalues, on the other hand, need to be distinguished. For matrices A 1 and A 2 , the corresponding eigenvalues Λ 1 and Λ 2 are obtained using Equations (14) and (15). For matrices T 1 and T 2 , the eigenvalues are obtained in a similar way, i.e.,
$$Q^{-1}\,T_1\,Q = \Omega_1$$
$$Q^{-1}\,T_2\,Q = \Omega_2$$
where $\Omega_1$ and $\Omega_2$ are the diagonal matrices of order $n_y$ containing the eigenvalues, i.e.,
$$\Omega_1 = \mathrm{diag}\left(\alpha_1,\ \alpha_2,\ \ldots,\ \alpha_{n_y}\right)$$
$$\Omega_2 = \mathrm{diag}\left(\beta_1,\ \beta_2,\ \ldots,\ \beta_{n_y}\right)$$
Note that if the BCs on the South-North boundaries, i.e., the two ends in the y-direction, are Dirichlet-Dirichlet or Periodic-Periodic, the matrices $A_1$, $A_2$, $T_1$, and $T_2$ are symmetric, and, hence, the orthogonality of the eigenvectors, i.e., $Q^T = Q^{-1}$, can be applied.
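The claim that $T_1$ and $T_2$ share one eigenvector matrix but have different eigenvalues can be checked numerically. The following is a minimal sketch for the Dirichlet-Dirichlet case only, using the sine eigenvectors of Appendix A with `n` interior nodes (so `n + 1` plays the role of $N_y - 1$). The weights `b`, `c`, `d` are arbitrary illustrative values rather than those of Table 1, and all function names are hypothetical.

```python
import math

def tridiag(n, diag, off):
    """n x n symmetric tridiagonal Toeplitz block (Dirichlet-Dirichlet case)."""
    return [[diag if r == s else off if abs(r - s) == 1 else 0.0
             for s in range(n)] for r in range(n)]

def sine_eigvecs(n):
    """Q[j][k] = sqrt(2/(n+1)) * sin(pi*(j+1)*(k+1)/(n+1)); columns are eigenvectors."""
    scale = math.sqrt(2.0 / (n + 1))
    return [[scale * math.sin(math.pi * (j + 1) * (k + 1) / (n + 1))
             for k in range(n)] for j in range(n)]

def matmul(X, Y):
    return [[sum(X[r][m] * Y[m][s] for m in range(len(Y)))
             for s in range(len(Y[0]))] for r in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def max_offdiag(X):
    """Largest off-diagonal magnitude; close to zero if X is diagonal."""
    return max(abs(X[r][s]) for r in range(len(X))
               for s in range(len(X)) if r != s)

n = 6
b, c, d = 0.5, 0.25, 0.125          # illustrative weights only (assumption)
Q = sine_eigvecs(n)
Qt = transpose(Q)                   # Q^T = Q^-1 for symmetric blocks
omega1 = matmul(Qt, matmul(tridiag(n, b, c), Q))   # Q^-1 T1 Q
omega2 = matmul(Qt, matmul(tridiag(n, c, d), Q))   # Q^-1 T2 Q
alpha = [omega1[k][k] for k in range(n)]
beta = [omega2[k][k] for k in range(n)]
```

Both products come out diagonal with the same `Q`, while `alpha` and `beta` differ because they depend on the pairs `(b, c)` and `(c, d)`, respectively.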
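Now let $\tilde{Q}$ be the matrix containing $n_x$ blocks of $Q$ in the diagonal blocks, i.e.,
$$
\tilde{Q} = \begin{bmatrix}
Q & & & \\
 & Q & & \\
 & & \ddots & \\
 & & & Q
\end{bmatrix}_{n_x\ \text{blocks} \times n_x\ \text{blocks}}
$$
and the corresponding eigenvalue matrices of the same order follow:
$$
\tilde{Q}^{-1} A\, \tilde{Q} = \Lambda = \begin{bmatrix}
\Lambda_1 & g_W \Lambda_2 & & & p_W \Lambda_2 \\
\Lambda_2 & \Lambda_1 & \Lambda_2 & & \\
 & \ddots & \ddots & \ddots & \\
 & & \Lambda_2 & \Lambda_1 & \Lambda_2 \\
p_E \Lambda_2 & & & g_E \Lambda_2 & \Lambda_1
\end{bmatrix}
$$
$$
\tilde{Q}^{-1} T\, \tilde{Q} = \Omega = \begin{bmatrix}
\Omega_1 & g_W \Omega_2 & & & p_W \Omega_2 \\
\Omega_2 & \Omega_1 & \Omega_2 & & \\
 & \ddots & \ddots & \ddots & \\
 & & \Omega_2 & \Omega_1 & \Omega_2 \\
p_E \Omega_2 & & & g_E \Omega_2 & \Omega_1
\end{bmatrix}
$$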
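Recall from Equation (7) that $\phi_k$ and $\tilde{f}_k$ denote the vectors collecting, respectively, the solutions and sources on the nodes of the k-th xy-plane. Then the discretized system, Equation (18a), can be expanded as
$$
\begin{aligned}
A\,\phi_1 + g_B T\,\phi_2 + p_B T\,\phi_{n_z} &= \tilde{f}_1 \\
T\,\phi_1 + A\,\phi_2 + T\,\phi_3 &= \tilde{f}_2 \\
&\ \ \vdots \\
T\,\phi_{n_z-2} + A\,\phi_{n_z-1} + T\,\phi_{n_z} &= \tilde{f}_{n_z-1} \\
p_T T\,\phi_1 + g_T T\,\phi_{n_z-1} + A\,\phi_{n_z} &= \tilde{f}_{n_z}
\end{aligned}
$$
By pre-multiplying the above equations by $\tilde{Q}^{-1}$, defining the transforms $\bar{\phi}_k = \tilde{Q}^{-1}\phi_k$ and $\bar{\tilde{f}}_k = \tilde{Q}^{-1}\tilde{f}_k$, and noting the inverse transforms, $\phi_k = \tilde{Q}\,\bar{\phi}_k$ and $\tilde{f}_k = \tilde{Q}\,\bar{\tilde{f}}_k$, the above equations become
$$
\begin{aligned}
\Lambda\,\bar{\phi}_1 + g_B \Omega\,\bar{\phi}_2 + p_B \Omega\,\bar{\phi}_{n_z} &= \bar{\tilde{f}}_1 \\
\Omega\,\bar{\phi}_1 + \Lambda\,\bar{\phi}_2 + \Omega\,\bar{\phi}_3 &= \bar{\tilde{f}}_2 \\
&\ \ \vdots \\
\Omega\,\bar{\phi}_{n_z-2} + \Lambda\,\bar{\phi}_{n_z-1} + \Omega\,\bar{\phi}_{n_z} &= \bar{\tilde{f}}_{n_z-1} \\
p_T \Omega\,\bar{\phi}_1 + g_T \Omega\,\bar{\phi}_{n_z-1} + \Lambda\,\bar{\phi}_{n_z} &= \bar{\tilde{f}}_{n_z}
\end{aligned}
$$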
By collecting the terms associated with the i-th mode and using the sub-indexing for the solution and source vectors, the above equations become the following system:
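$$
\begin{bmatrix}
\Lambda^{(i)} & g_B\,\Omega^{(i)} & & & p_B\,\Omega^{(i)} \\
\Omega^{(i)} & \Lambda^{(i)} & \Omega^{(i)} & & \\
 & \ddots & \ddots & \ddots & \\
 & & \Omega^{(i)} & \Lambda^{(i)} & \Omega^{(i)} \\
p_T\,\Omega^{(i)} & & & g_T\,\Omega^{(i)} & \Lambda^{(i)}
\end{bmatrix}
\begin{bmatrix}
\bar{\phi}_{i,1,1} \\ \bar{\phi}_{i,2,1} \\ \vdots \\ \bar{\phi}_{i,n_x,1} \\ \bar{\phi}_{i,1,2} \\ \bar{\phi}_{i,2,2} \\ \vdots \\ \bar{\phi}_{i,n_x,2} \\ \vdots \\ \bar{\phi}_{i,1,n_z} \\ \bar{\phi}_{i,2,n_z} \\ \vdots \\ \bar{\phi}_{i,n_x,n_z}
\end{bmatrix}
=
\begin{bmatrix}
\bar{\tilde{f}}_{i,1,1} \\ \bar{\tilde{f}}_{i,2,1} \\ \vdots \\ \bar{\tilde{f}}_{i,n_x,1} \\ \bar{\tilde{f}}_{i,1,2} \\ \bar{\tilde{f}}_{i,2,2} \\ \vdots \\ \bar{\tilde{f}}_{i,n_x,2} \\ \vdots \\ \bar{\tilde{f}}_{i,1,n_z} \\ \bar{\tilde{f}}_{i,2,n_z} \\ \vdots \\ \bar{\tilde{f}}_{i,n_x,n_z}
\end{bmatrix} \tag{25}
$$
where, for compactness, the $n_x \times n_x$ sub-blocks of the i-th mode are written as
$$
\Lambda^{(i)} = \begin{bmatrix}
\lambda_i & g_W \mu_i & & & p_W \mu_i \\
\mu_i & \lambda_i & \mu_i & & \\
 & \ddots & \ddots & \ddots & \\
 & & \mu_i & \lambda_i & \mu_i \\
p_E \mu_i & & & g_E \mu_i & \lambda_i
\end{bmatrix},
\qquad
\Omega^{(i)} = \begin{bmatrix}
\alpha_i & g_W \beta_i & & & p_W \beta_i \\
\beta_i & \alpha_i & \beta_i & & \\
 & \ddots & \ddots & \ddots & \\
 & & \beta_i & \alpha_i & \beta_i \\
p_E \beta_i & & & g_E \beta_i & \alpha_i
\end{bmatrix}
$$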
By comparing with the 2D finite difference discretization results in Equations (12) and (13), it can be observed that Equation (25) can be thought of as the finite difference discretization on the 2D plane (i.e., the xz-plane) formed by the two directions that remain after the modal transformation. The pattern of the equivalent 2D stencil is like the 4th-order stencil with the nodal coefficients replaced by the eigenvalues $a = \lambda_i$, $c = \beta_i$, and $d = 0$. However, this reduction does not lead to a symmetric stencil, since for South-North neighbors the stencil weights are $b_{S/N} = \mu_i$, but for West-East neighbors the stencil weights are $b_{W/E} = \alpha_i$. The schematic of the equivalent stencil and the reduced 2D grid are shown in Figure 6. By comparing Equation (25) to Equations (12) and (13), it can be seen that the effects of BCs on the W, E, B, and T boundaries are preserved in Equation (25), i.e., the locations of $g_W$, $g_E$, $g_B$, $g_T$, $p_W$, $p_E$, $p_B$, and $p_T$ in Equation (25) are the same as those of the 2D discretization.
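Reading the interior rows of Equation (25) directly, the weights of the equivalent stencil for the i-th mode can be laid out on the xz-plane as follows, with j indexing the x-direction and k the z-direction. This layout is a sketch for interior nodes only; rows next to a boundary carry the additional $g$ and $p$ factors.

$$
\begin{array}{c|ccc}
 & j-1 & j & j+1 \\ \hline
k+1 & \beta_i & \alpha_i & \beta_i \\
k & \mu_i & \lambda_i & \mu_i \\
k-1 & \beta_i & \alpha_i & \beta_i
\end{array}
$$

The two axis-aligned weights differ ($\mu_i$ along x, $\alpha_i$ along z), which is the asymmetry noted above.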
The equivalent 2D discretization in Equation (25) can be further reduced to a 1D linear system using the matrix decomposition method described in Section 3.2. The solution solved from the 1D system can be transformed back to 2D and 3D fields to complete the solution. Such a procedure is summarized as follows:
(i) Determine the $n_y \times n_y$ eigenvector matrix, $Q$, associated with the boundary conditions in the y-direction, using the results in Appendix A.
(ii) Apply the first transform of the source field to obtain $\bar{\tilde{f}}$. This is done through line-wise operations, i.e., $\bar{\tilde{f}}_{j,k} = Q^{-1}\tilde{f}_{j,k}$ for $j = 1, 2, \ldots, n_x$ and $k = 1, 2, \ldots, n_z$, where $\bar{\tilde{f}}_{j,k}$ and $\tilde{f}_{j,k}$ are the $n_y \times 1$ vectors in the format of Equation (6) that are used to store nodal values along the y-direction. Permute the first dimension of $\bar{\tilde{f}}$ to the third so that $\bar{\tilde{f}}_{i,j,k}$ now becomes $\bar{\tilde{f}}_{j,k,i}$.
(iii) For a given i-th mode, $\bar{\tilde{f}}_i$ is now the $n_x n_z \times 1$ vector on the RHS of Equation (25). To solve for the $n_x n_z \times 1$ vector of the transformed solution on the LHS of Equation (25), $\bar{\phi}_i$, apply the 2D modal decomposition technique in Section 3.2 based on the equivalent 2D (non-symmetric) compact stencil shown in Figure 6 and the BCs on the xz-planes. Repeat this step for all modes ($i = 1, 2, \ldots, n_y$) to complete the solution. Once done, reverse the permutation of $\bar{\phi}_{j,k,i}$ to restore its original indexing, $\bar{\phi}_{i,j,k}$.
(iv) Conduct an inverse transform to obtain the solution using line-wise operations similar to step (ii), i.e., $\phi_{j,k} = Q\,\bar{\phi}_{j,k}$ for $j = 1, 2, \ldots, n_x$ and $k = 1, 2, \ldots, n_z$.
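The four steps above can be sketched in a few lines for the simplest configuration. The following is a minimal sketch under several assumptions that are not taken from the paper: homogeneous Dirichlet BCs on all six faces, a cubic grid of `n` interior nodes per side, sine eigenvectors in both the y- and x-directions (so the reduced 2D problem of step (iii) is itself decomposed along x and solved as tridiagonal systems along z), and illustrative stencil weights `a`, `b`, `c`, `d` for the center, face, edge, and corner neighbors. All function names are hypothetical. The result is checked by applying the 27-point stencil directly to the computed solution.

```python
import math

def sine_matrix(n):
    """Dirichlet-Dirichlet eigenvectors; this matrix is symmetric and orthogonal."""
    s = math.sqrt(2.0 / (n + 1))
    return [[s * math.sin(math.pi * (p + 1) * (q + 1) / (n + 1))
             for q in range(n)] for p in range(n)]

def thomas(off, diag, rhs):
    """Solve a tridiagonal system with constant diagonal and off-diagonals."""
    n = len(rhs)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = off / diag, rhs[0] / diag
    for k in range(1, n):
        den = diag - off * cp[k - 1]
        cp[k] = off / den
        dp[k] = (rhs[k] - off * dp[k - 1]) / den
    x = [0.0] * n
    x[-1] = dp[-1]
    for k in range(n - 2, -1, -1):
        x[k] = dp[k] - cp[k] * x[k + 1]
    return x

def transform(field, Q, axis):
    """Multiply field[i][j][k] by Q along axis 0 (y index i) or axis 1 (x index j)."""
    n = len(Q)
    out = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if axis == 0:
                    out[i][j][k] = sum(Q[i][m] * field[m][j][k] for m in range(n))
                else:
                    out[i][j][k] = sum(Q[j][m] * field[i][m][k] for m in range(n))
    return out

def solve_dirichlet(f, a, b, c, d):
    """Steps (i)-(iv) for homogeneous Dirichlet BCs on an n x n x n interior grid."""
    n = len(f)
    Q = sine_matrix(n)                                  # step (i); here Q^-1 = Q
    cos = [math.cos(math.pi * (m + 1) / (n + 1)) for m in range(n)]
    fbar = transform(transform(f, Q, 0), Q, 1)          # step (ii), then x-transform
    pbar = [[None] * n for _ in range(n)]
    for i in range(n):                                  # step (iii): loop over y-modes
        lam, mu = a + 2 * b * cos[i], b + 2 * c * cos[i]    # eigenvalues of A1, A2
        alp, bet = b + 2 * c * cos[i], c + 2 * d * cos[i]   # eigenvalues of T1, T2
        for j in range(n):                              # x-modes of the reduced 2D problem
            pbar[i][j] = thomas(alp + 2 * bet * cos[j], lam + 2 * mu * cos[j], fbar[i][j])
    return transform(transform(pbar, Q, 1), Q, 0)       # step (iv)

def apply_stencil(phi, a, b, c, d):
    """Apply the symmetric 27-point stencil with zero values outside the grid."""
    n = len(phi)
    w = {0: a, 1: b, 2: c, 3: d}
    out = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        for dk in (-1, 0, 1):
                            ii, jj, kk = i + di, j + dj, k + dk
                            if 0 <= ii < n and 0 <= jj < n and 0 <= kk < n:
                                out[i][j][k] += w[abs(di) + abs(dj) + abs(dk)] * phi[ii][jj][kk]
    return out

n = 5
a, b, c, d = -4.0, 0.3, 0.15, 0.05      # illustrative weights, not those of Table 1
f = [[[math.sin(1.3 * i + 0.7 * j) + 0.1 * k for k in range(n)]
      for j in range(n)] for i in range(n)]
phi = solve_dirichlet(f, a, b, c, d)
back = apply_stencil(phi, a, b, c, d)
residual = max(abs(back[i][j][k] - f[i][j][k])
               for i in range(n) for j in range(n) for k in range(n))
```

The explicit permutation of step (ii) is replaced here by plain index loops, which is slower but keeps the sketch short. Other BC combinations would need the corresponding eigenvectors from Appendix A and the $g$ and $p$ factors in the reduced systems.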

5. Numerical Examples

5.1. General Background

The results of the numerical implementations of the 3D matrix decomposition scheme are presented in this section. To show the ability of the proposed method to account for various boundary conditions (BCs), an example of the BC combinations is shown in Figure 7. The combination of BCs is named in sequence for the six boundary faces: Bottom (B), West (W), South (S), East (E), North (N), and Top (T). Hence, the BCs in Figure 7 are named PDNDDP, where D, N, and P stand for Dirichlet, Neumann, and Periodic BCs, respectively. In the next sections, validations of the proposed method are given for solving the point source field (Section 5.2) and the sinusoidal source fields (Section 5.3) in various combinations of the BCs.
The proposed method is flexible with respect to the stencil weights, as long as the stencil is compact and symmetric. Hence, the convergence rate can be examined in the validation sections for the various 3D stencil weights shown in Table 1. The typical 2nd-order accurate stencil weights and the two 4th-order stencils are applied to obtain numerical solutions. Note that the 4th-order stencil-2 is a common 4th-order stencil seen in [19,25], while the 4th-order stencil-1 is extracted from [19] using the assumption of a uniform grid in each direction. The orders of accuracy are examined by successively halving the grid spacing, i.e.,
$$h = L_x / 2^{k+1}, \quad k = 3,\ 4,\ 5,\ 6,\ 7 \tag{26}$$
where $L_x$ is the length of the domain in the x-direction. The errors of the numerical simulations are defined as the absolute deviation from the exact solution, $\phi_{exact}(\mathbf{x})$, at coordinate $\mathbf{x} = (x, y, z)$, i.e.,
$$\text{Spatial error: } \varepsilon(\mathbf{x}) = \left| \phi_{numerical}(\mathbf{x}) - \phi_{exact}(\mathbf{x}) \right| \tag{27}$$
where $\phi_{numerical}(\mathbf{x})$ denotes the numerical solution using the proposed method. The statistics of the spatial errors, namely the mean (L2 norm), $\varepsilon_{mean}$, the 99.9th percentile, $\varepsilon_{99.9\%}$, and the maximum value, $\varepsilon_{max}$, are presented to indicate the numerical performance, and are defined as:
$$\text{Spatially averaged error: } \varepsilon_{mean} = \left[ \frac{1}{L_x L_y L_z} \int \varepsilon^2(\mathbf{x})\, d\mathbf{x} \right]^{0.5} \tag{28a}$$
$$\text{99.9th-percentile error: } \varepsilon_{99.9\%} = \text{99.9th percentile of } \varepsilon(\mathbf{x}) \tag{28b}$$
$$\text{Maximum error: } \varepsilon_{max} = \text{maximum of } \varepsilon(\mathbf{x}) \tag{28c}$$
In Equation (28a), Lx, Ly, and Lz are the side lengths of the 3D rectangular domain along the x-, y-, and z-directions, respectively.
Note that the use of mean error reflects the overall accuracy, but the effects of scarce large errors may be diluted. If only the peak error is examined, the overall performance may not be sensed. Hence, in this paper, the 99.9th-percentile error is also examined.
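The three statistics of Equation (28) can be evaluated from nodal values as sketched below. This is an illustrative sketch, not the paper's implementation: it assumes a uniform grid, so that the volume integral in Equation (28a) is approximated by a plain average over nodes, and it uses the nearest-rank convention for the percentile, which is one of several common definitions. The function name `error_stats` and the synthetic input are hypothetical.

```python
import math

def error_stats(numerical, exact):
    """Return (mean, 99.9th-percentile, max) of |numerical - exact| over all nodes."""
    err = sorted(abs(u - v) for u, v in zip(numerical, exact))
    n = len(err)
    # Discrete analogue of Eq. (28a) on a uniform grid: root-mean-square error
    mean = math.sqrt(sum(e * e for e in err) / n)
    # Nearest-rank percentile, rank = ceil(0.999 * n), in integer arithmetic
    rank = max(1, -(-999 * n // 1000))
    return mean, err[rank - 1], err[-1]

# Synthetic flattened fields, for illustration only
exact = [0.0] * 1000
numerical = [k / 1000 for k in range(1, 1001)]
mean_err, p999_err, max_err = error_stats(numerical, exact)
```

For a 3D field the nodal values would first be flattened into one list; the ordering does not matter for any of the three statistics.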
In Section 5.5, application examples of the proposed method are presented for some 3D vortex-in-cell flow simulations, which include the collision of two co-axial vortex rings, the flows induced by an impulsively started sphere and a circular cylinder, as well as the boundary layer flow around a floor-mounted cube at low Reynolds numbers. These examples demonstrate the need for combined BCs in practical flow simulations and how the proposed method is well-suited for this purpose.

5.2. Validation with the 3D Particle Field

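The first example is to use the proposed method to solve Poisson’s equation under the Gaussian particle source, $\zeta$, i.e.,
$$\zeta = \frac{\Gamma}{\left(\sqrt{2\pi}\,\sigma\right)^3} \exp\left(-\frac{r^2}{2\sigma^2}\right) \tag{29}$$
Here $\Gamma$ is the particle strength, $\sigma$ is the smoothing parameter, and $r$ is the distance from the particle center, $\mathbf{x}_o = x_o \mathbf{e}_1 + y_o \mathbf{e}_2 + z_o \mathbf{e}_3$, i.e.,
$$r^2 = (x - x_o)^2 + (y - y_o)^2 + (z - z_o)^2 \tag{30}$$
Of note, the selection of the point Gaussian source is inspired by the 3D point vorticity distribution in the vortex particle method (e.g., Ploumhans et al. [28]). The resulting Poisson solution to this smoothed point source, $G(r)$, is used to calculate the particle contribution to the corresponding stream-function field, i.e.,
$$\nabla^2 G = -\zeta \tag{31}$$
Using Green’s (divergence) theorem in spherical coordinates, the exact solution to Equation (31) can be obtained (written here for unit strength, $\Gamma = 1$, as used below), i.e.,
$$G = \frac{1}{\sigma}\,\frac{\sigma}{4\pi r}\,\mathrm{erf}\left(\frac{r}{\sqrt{2}\,\sigma}\right) \tag{32}$$
where $\mathrm{erf}(s) = \frac{2}{\sqrt{\pi}} \int_0^s \exp\left(-p^2\right) dp$. From Equation (32), the exact Dirichlet BCs can be obtained as well. For the Neumann BCs, the gradient of Equation (32) is also calculated:
$$\nabla G = -\frac{\mathbf{r}}{4\pi r^3} \left[ \mathrm{erf}\left(\frac{r}{\sqrt{2}\,\sigma}\right) - \frac{r}{\sigma} \sqrt{\frac{2}{\pi}} \exp\left(-\frac{r^2}{2\sigma^2}\right) \right] \tag{33}$$
where $\mathbf{r} = \mathbf{x} - \mathbf{x}_o$ is the location vector relative to the particle.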
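The exact solution and its gradient can be sanity-checked numerically. The sketch below assumes the standard error function (as provided by Python's `math.erf`) and uses the radial form of the Laplacian, $\nabla^2 G = \frac{1}{r}\frac{d^2 (rG)}{dr^2}$, evaluated by central differences. The strength factor `gamma` in `G` is included by linearity and is an assumption of this sketch. All function names are hypothetical.

```python
import math

def zeta(r, gamma=1.0, sigma=0.2):
    """Gaussian particle source."""
    return gamma / ((2.0 * math.pi) ** 1.5 * sigma ** 3) * math.exp(-r * r / (2.0 * sigma ** 2))

def G(r, gamma=1.0, sigma=0.2):
    """Candidate exact solution of Laplacian(G) = -zeta."""
    return gamma / (4.0 * math.pi * r) * math.erf(r / (math.sqrt(2.0) * sigma))

def dG_dr(r, gamma=1.0, sigma=0.2):
    """Radial component of grad G."""
    bracket = (math.erf(r / (math.sqrt(2.0) * sigma))
               - (r / sigma) * math.sqrt(2.0 / math.pi) * math.exp(-r * r / (2.0 * sigma ** 2)))
    return -gamma * bracket / (4.0 * math.pi * r * r)

def radial_laplacian(func, r, h=1e-3):
    """Central-difference estimate of (1/r) d^2(r*func)/dr^2."""
    return ((r + h) * func(r + h) - 2.0 * r * func(r) + (r - h) * func(r - h)) / (h * h * r)

def central_derivative(func, r, h=1e-3):
    """Central-difference estimate of d(func)/dr."""
    return (func(r + h) - func(r - h)) / (2.0 * h)

radii = (0.1, 0.3, 0.8)
laplace_mismatch = max(abs(radial_laplacian(G, r) + zeta(r)) for r in radii)
gradient_mismatch = max(abs(central_derivative(G, r) - dG_dr(r)) for r in radii)
```

Both mismatches are limited by the finite-difference step `h` and shrink as `h` is reduced, until round-off takes over.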
For simplicity in the demonstrations, only the effects of one particle are examined. The particle has unit strength, $\Gamma = 1$, and is located at the origin, i.e., $x_o = y_o = z_o = 0$, with $\sigma = 0.2$. A cubic domain with a side length of 2 is chosen (i.e., $L_x = L_y = L_z = 2$) and the grid points span $-L_x/2 \le x \le L_x/2$, $-L_y/2 \le y \le L_y/2$, and $-L_z/2 \le z \le L_z/2$. For this problem, combinations of Dirichlet and Neumann BCs are studied. The isosurface of the exact solution is shown in Figure 8. As can be seen, the solution has a smooth peak near the origin and gradually decays in the radial direction. The red dots in Figure 8 indicate where the errors of the 4th-order (stencil-1) numerical solutions are larger than the 99.9th-percentile value. For the all-Dirichlet BCs (Figure 8a), the largest errors occur near the peak. For the partially Neumann BCs (Figure 8b), the largest errors occur near the center of the Neumann boundary faces. Since the locations of the largest errors are different in Figure 8a,b, the mechanisms causing the error may be different for Dirichlet and Neumann BCs.
More thorough error analyses are shown in Figure 9, which includes the effects of grid spacing, stencil schemes (identified by symbols), and boundary conditions (identified by colors). Also attached in the figure are the auxiliary lines denoting the 4th- and 2nd-order convergence rates. As can be seen, the most accurate results come from the 4th-order schemes in the all-Dirichlet (DDDDDD) conditions. As one of the boundary conditions becomes Neumann, the convergence rate of the 4th-order schemes decreases to 2nd-order. When more boundary conditions become Neumann, the accuracies of both 2nd- and 4th-order schemes decrease. The accuracy reduction rates are less sensitive in the 2nd-order scheme than in the 4th-order schemes when more Neumann boundaries are added.
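Convergence rates such as those indicated by the auxiliary lines can be estimated from errors on successive grids. The sketch below computes the observed order $p = \log(\varepsilon_m / \varepsilon_{m+1}) / \log(h_m / h_{m+1})$ between neighboring refinement levels. The error values used here are synthetic, generated to follow $C h^4$ exactly for illustration, and are not results from the paper; the function name is hypothetical.

```python
import math

def observed_orders(h, err):
    """Observed convergence order between successive grid levels."""
    return [math.log(err[m] / err[m + 1]) / math.log(h[m] / h[m + 1])
            for m in range(len(h) - 1)]

L_x = 2.0
h = [L_x / 2 ** (k + 1) for k in (3, 4, 5, 6, 7)]   # grid spacings of the refinement study
err = [0.5 * hk ** 4 for hk in h]                   # synthetic 4th-order errors (assumption)
orders = observed_orders(h, err)
```

With measured errors, values near 4 or near 2 in `orders` would correspond to the 4th- and 2nd-order reference slopes, respectively.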
To see the effectiveness of the 4th-order scheme as compared to the typical 2nd-order scheme for various boundary conditions, the errors of the 4th-order scheme-2 are first divided by the error of the 2nd-order scheme and shown in Figure 10. As can be seen, since the 4th-order scheme works as it claims in the all-Dirichlet BCs, the error ratio decreases in 2nd-order for all-Dirichlet BCs. For other BCs that include Neumann, the error ratio remains constant, manifesting again the reduction of the 4th-order scheme to the 2nd-order. Although the convergence rates decrease, the error ratios of the 4th-order scheme-2 generally remain below unity, except in the extreme case (NNNNND), where the stronger coupling to boundary gradients in the 4th-order stencil dominates the solution. This observation preserves the value of using the 4th-order scheme against the 2nd-order one in Neumann boundaries.
To compare the effectiveness of the two 4th-order schemes in detail, the ratio of the errors obtained from scheme-1 is divided by those of scheme-2. As can be seen in Figure 11, most of the ratios are close to 1, except for all Dirichlet BCs. The reason scheme-1 results in higher errors under all Dirichlet BCs is not yet clear and warrants further investigation. In summary, regarding the effectiveness of the schemes, the 4th-order schemes are mostly better than the 2nd-order scheme, and the 4th-order scheme-2 is slightly better than the 4th-order scheme-1.
The cause of accuracy reduction to the 2nd order of the 4th-order schemes in the presence of the Neumann boundary conditions may be explained by its 2nd-order assumptions in extrapolating the nodal values outside the Neumann boundaries (see examples in Section 3.1). Using this hypothesis, one may retrieve the 4th-order convergence rate if these Neumann boundaries are pushed outward far enough from the region of intense gradients. To examine this point, the errors of different domain lengths in the z-direction, i.e., L z = 2 ,   4 ,   10 , for solutions in boundary condition NDDDDN are plotted in Figure 12. As expected, the 4th-order scheme is almost fully retrieved for L z = 10 under this BC, while partially retrieved for L z = 4 .

5.3. Validation with the 3D Sinusoidal Field

In this section, the solutions to Poisson’s equation with a sinusoidal source field are discussed, i.e.,
$$\omega = \sin x\, \cos y\, \sin z \tag{34}$$
$$\nabla^2 \Psi = -\omega \tag{35}$$
$$\Psi = \frac{1}{3} \sin x\, \cos y\, \sin z \tag{36}$$
The fields are periodic, and the domain spans from 0 to $2\pi$ in all three directions, i.e., $L_x = L_y = L_z = 2\pi$ and $0 \le x \le L_x$, $0 \le y \le L_y$, and $0 \le z \le L_z$. Equation (34) is inspired by the Taylor-Green formulation for the initial vorticity field ($\omega$), which is commonly used to study turbulence evolution (e.g., [29]). By solving Poisson’s equation (Equation (35)) with Periodic BCs in all directions, the exact stream function solution ($\Psi$) is obtained (Equation (36)). Since this problem is used for validation, combinations of Dirichlet, Neumann, and Periodic boundary conditions are assumed for the following discussion. Figure 13 shows the iso-surfaces of the exact solution (Equation (36)) to Poisson’s equation (Equation (35)), along with the red dots denoting the locations of errors larger than the 99.9th-percentile value under BCs NNDNDD (Figure 13a) and PPPPPP (Figure 13b). As can be seen, larger errors are distributed on the edges connecting Neumann boundaries in Figure 13a and near peaks for all Periodic conditions in Figure 13b. The difference in error locations in Figure 13a,b implies different mechanisms in error generation for Neumann and Periodic BCs. The error statistics of Dirichlet and Periodic BCs are shown in Figure 14. It can be seen that the 4th-order and 2nd-order schemes show exactly the 4th- and 2nd-order rates of convergence, respectively, under combinations of Dirichlet and Periodic boundary conditions.
When the Neumann BCs are combined with the Dirichlet or Periodic BCs, Figure 15 shows that the convergence rates of the 4th-order schemes reduce to 2nd-order, except for the BC DPNPND. The convergence rate for BC DPNPND remains 4th-order because the y-gradient of the exact solution is zero on the South and North boundaries. Unlike the point-source field discussed in the previous section, it is not possible to reduce the effects of Neumann conditions by pushing the Neumann boundary outward for this periodic field.
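The zero y-gradient follows directly from the exact solution, Equation (36). As a short worked check:

$$
\frac{\partial \Psi}{\partial y} = -\frac{1}{3} \sin x\, \sin y\, \sin z
\quad\Longrightarrow\quad
\left.\frac{\partial \Psi}{\partial y}\right|_{y=0} = \left.\frac{\partial \Psi}{\partial y}\right|_{y=2\pi} = 0 ,
$$

whereas, for example, $\partial \Psi / \partial x = \frac{1}{3} \cos x\, \cos y\, \sin z$ does not vanish identically on the faces $x = 0$ and $x = 2\pi$, so Neumann conditions on those faces involve nonzero boundary gradients.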
Figure 16 shows the error ratios of the 4th-order scheme-2 to the 2nd-order scheme. Since the 4th-order scheme works as it claims for the Dirichlet and Periodic BCs, the error ratios decrease at the rate of the 2nd-order as the grid spacing reduces. For other BCs that include Neumann, the error ratios shown in Figure 16 are less than or equal to unity for mean errors, manifesting the effectiveness of the 4th-order scheme in the overall solutions. On the other hand, the error ratios can exceed unity for the 99.9th percentile and maximum errors, suggesting that the stronger coupling to boundary gradients in the 4th-order stencil has a greater impact on the solution than in the 2nd-order stencil.
The error ratios of the 4th-order scheme-1 to the 4th-order scheme-2 are shown in Figure 17. As can be seen, scheme-1 is more accurate than scheme-2 for Dirichlet and Periodic conditions. The two schemes have nearly equal accuracy for Dirichlet-Neumann or Neumann-Periodic conditions.

5.4. Discussion on the Applicability of the Efficient Method

In this section, the advantages and disadvantages of the proposed matrix decomposition method are discussed. As mentioned in Section 1, the primary motivation for using this method is its computational efficiency. To demonstrate this, the computational times of Poisson’s solver using both the matrix inversion and matrix decomposition methods are compared for a sinusoidal field (Section 5.3) within a cubic domain ($L_x = L_y = L_z = 2\pi$). The comparison is performed for varying numbers of one-sided nodes, $n = 10, 20, 30, 40, 50$, and $60$, under a mixed boundary condition (DNPNPD). All calculations are conducted using MATLAB R2022a. For the matrix inversion method, the system matrix $M$ from Equation (18) is explicitly formed, and the solution is computed using MATLAB’s backslash (left-division) syntax, Φ = M \ F. However, since $M$ is very large (of approximate size $n^3 \times n^3$), it must be stored as a sparse matrix (using sparse(M)) to avoid excessive memory usage due to the large number of zero entries. Computational times are measured on a standard desktop computer equipped with an Intel six-core i5-10500 CPU @ 3.10 GHz and 8 GB of RAM.
The resulting computational times are presented in Figure 18, with auxiliary reference lines indicating $O(n^6)$ and $O(n^2)$ for comparison. The mean, 99.9th percentile, and maximum errors are identical for both methods, confirming their numerical equivalence. For the matrix inversion technique, the computational time increases approximately with $O(n^6)$, which corresponds to the total number of entries in the full matrix $M$. In contrast, the matrix decomposition method yields significantly lower computational times, scaling approximately as $O(n^2)$, which corresponds to the number of nodes along one side of the cubic domain. This confirms that the proposed decomposition method effectively reduces the 3D problem to 1D ones, as discussed in Section 4.2. Consequently, the matrix decomposition method achieves a speed-up factor on the order of $O(n^4)$.
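The speed-up estimate is the ratio of the two fitted scalings. As a worked illustration, with $C_1$ and $C_2$ denoting unspecified constants of proportionality (hypothetical symbols, not values reported in the paper):

$$
\frac{t_{\text{inversion}}}{t_{\text{decomposition}}} \sim \frac{C_1\, n^6}{C_2\, n^2} = \frac{C_1}{C_2}\, n^4 ,
\qquad
\frac{60^4}{10^4} = 6^4 = 1296 .
$$

If both exponents held exactly over the tested range, the speed-up ratio would therefore grow by roughly three orders of magnitude between $n = 10$ and $n = 60$.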
For the matrix decomposition method shown in Section 4.2 to preserve its efficiency, the general structure of the block matrices in Equation (18) needs to remain. Hence, one may use different spacing in different directions, but the grid needs to be uniform for the current efficient method to work [19,20]. When encountering a problem that requires finer resolution locally, e.g., flow around a solid body, a non-uniform or non-structured grid may be used to reach the overall efficiency of the grid.
If a non-uniform grid is used, one may consider some analytical function to map the non-uniform grid to a uniform one and solve Poisson’s equation in the mapped domain. Such a mapping could lead to spatially varying coefficients multiplied by the second derivatives in the mapped Poisson’s equation and generate additional first derivatives with variable coefficients on the LHS of Equation (1). The discretization of such mapped governing equations cannot be represented by the block matrices presented in the current method and, hence, cannot be solved efficiently using the current scheme. To resolve this issue, typical FD discretization with non-compact stencils and different matrix decomposition schemes [30] may be resorted to. For a non-structured grid, for example, the hierarchically divided square cells used in [31], the FD discretization can be complicated. Developing a method to efficiently solve such a system is even more difficult.
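The origin of the variable coefficients and the extra first-derivative term can be seen from a one-dimensional mapping. As a sketch, let $\xi = \xi(x)$ be a smooth map from the non-uniform coordinate $x$ to a uniform coordinate $\xi$ (the symbol $\xi$ is introduced here for illustration). The chain rule gives

$$
\frac{\partial \phi}{\partial x} = \xi_x \frac{\partial \phi}{\partial \xi},
\qquad
\frac{\partial^2 \phi}{\partial x^2} = \xi_x^{\,2}\, \frac{\partial^2 \phi}{\partial \xi^2} + \xi_{xx}\, \frac{\partial \phi}{\partial \xi} ,
$$

where $\xi_x = d\xi/dx$ and $\xi_{xx} = d^2\xi/dx^2$ generally vary in space. The first term carries a spatially varying coefficient on the second derivative, and the second term is the additional first derivative; both break the constant-coefficient block structure on which the decomposition relies.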
On the other hand, to cope with the discontinuity of source terms, the uniform (or non-uniform) grids may be used with the immersed boundary method (IBM) (e.g., examples in Section 5.5) or the immersed interface method (IIM) (e.g., [32,33,34]). Many new developments have been found in IIMs since they work better to remove the effects due to sharp discontinuity. However, inevitable changes in the local stencil weights near the discontinuity may again lead to different formats of block matrices. In this case, the other format of the matrix decomposition [30] needs to be sought.

5.5. Applications in the 3D Vortex-In-Cell Flow Simulations

This section presents applications of the proposed Poisson solver with various boundary conditions in 3D vortex-in-cell (VIC) simulations. Detailed VIC algorithms can be found in [35,36,37] and other key references. In the following examples, the stream function vector fields are obtained by solving Poisson’s equation for a given vorticity field using the 4th-order scheme-2 stencil (see Table 1). The velocity field is then computed by differentiating the stream functions, while vorticity advection, stretching, and diffusion are determined based on the vorticity governing equation. To solve Poisson’s equation, Dirichlet conditions for the stream function are derived from the Biot-Savart law and Fast Multipole Methods. Since solving Poisson’s equation for the stream function is only one component of the VIC algorithm, it does not solely determine the accuracy and computational time of the simulations. Therefore, the focus here is not on the convergence rate or the overall computational time. The outline of the flow examples, as well as the corresponding Reynolds number, time increments, grid spacing, and BCs of the Poisson’s solver, are given in Table 3.
The first example in Figure 19 presents the evolution of the head-on collision of two coaxial vortex rings. The two vortex rings are initialized at $z = \pm 1$ with an initial ring radius, $R = 1$, a vortex tube radius, $\sigma = 0.1$, and an initial circulation of the vortex tubes, $\Gamma = \pm 1$. The Reynolds number based on the circulation is $Re_\Gamma = \Gamma / \nu = 1000$, where $\nu$ is the kinematic viscosity. For this problem, the grid spacing, $h / R = 0.01$, and the time increment, $\Delta t\, \Gamma / R^2 = 0.01$, are selected in the simulations, which are similar to the simulations of [38]. The initial vorticity distribution on the cross-sections normal to the vortex tube follows the Gaussian profile. Dirichlet BCs of stream functions on all the boundaries are implemented. As the two vortex rings collide, the flow fields expand the radius of the rings (Figure 19). The stretching effects enhance the vorticity inside the core at the initial stage of the collision. The effects of diffusion then take over once the vortex tubes become thin.
The second example, shown in Figure 20, illustrates the flow around an impulsively started sphere at Reynolds number $Re_D = u D / \nu = 500$, where $u$ is the free stream velocity magnitude and $D$ is the diameter of the sphere. For this problem, the grid spacing, $h / D = 0.0179$, and the time increment, $\Delta t\, u / D = 0.01$, are selected in the simulations, which are similar to the simulations of [36]. The effects of the solid body are accounted for using Brinkman’s penalization method for immersed bodies (e.g., [37]). This approach allows the proposed method to incorporate inner boundary effects (e.g., [39]) without modification. This process imposes a strong vorticity layer on the sphere’s surface, which then diffuses into the surrounding field and is advected downstream, generating the separated flow. Dirichlet BCs of stream functions are applied to all domain boundaries. With the sine-wave perturbation of the free-stream velocity (in the z-direction) during $3 \le T \le 4$, the wake structure is revealed by the instantaneous vorticity and velocity fields at normalized time $T = u t / D = 20$.
The third example extends the second one by expanding the bluff body to infinity in the y-direction. Figure 21 shows the example of the flow induced by an impulsively started circular cylinder at Reynolds number $Re_D = u D / \nu = 1000$ at normalized time $T = u t / D = 12.8$. For this problem, the grid spacing, $h / D = 0.012$, and the time increment, $\Delta t\, u / D = 0.01$, are selected in the simulations, which are similar to the simulations of [36]. Like the previous example, a sine-wave perturbation of the free-stream velocity (in the z-direction) during $3 \le T \le 4$ triggers the vortex shedding. In this simulation, the Periodic BCs of stream functions are applied in the y-direction, i.e., the South and North boundaries. For the rest of the boundaries, the Dirichlet BCs are applied. Note that the Dirichlet conditions need to be evaluated by finite periodic shifts of the Dirichlet nodes to account for periodicity in the y-direction (see [35]).
The final example presents a boundary layer flow over a surface-mounted cube (Figure 22) at a Reynolds number of $Re_D = u D / \nu = 300$, where $u$ denotes the free-stream velocity unaffected by the boundary layer and $D$ is the cube width. For this case, the grid spacing $h / D = 0.032$ and time step $\Delta t\, u / D = 0.02$ are used, representing a slightly coarser spatial and temporal resolution compared to the simulations in [36]. Since the flow is not intended to be periodic in the y-direction, the zero Neumann BCs of stream functions are used on the South and North boundaries (lower and upper boundaries in the y-direction). All the other boundaries use Dirichlet conditions. Note that to impose the no-through flow conditions on the floor, mirrored image vorticity below the bottom boundary must be imposed first (see [35]). Then, the effects of the vorticity flux of the floor can be imposed using the panel method (e.g., [28]).

6. Conclusions

This paper presents an efficient matrix decomposition method for solving the finite-difference (FD) discretized 3D Poisson’s equation using symmetric 27-point, 4th-order accurate stencils. Key contributions include:
(i) Extension to various boundary conditions (BCs): The method generalizes previous approaches to effectively handle Dirichlet, Neumann, and Periodic BCs.
(ii) Efficient source term computation: The use of equivalent Dirichlet nodes simplifies implementation and improves accuracy when computing source terms for Dirichlet, Neumann, and Periodic BCs.
(iii) Generalized eigenvalue formulation: The method refines existing eigenvalue formulas to better handle the general 4th-order stencil weights.
(iv) Validation of accuracy: The solver is tested using Gaussian and sinusoidal sources. It achieves 4th-order convergence for Dirichlet and Periodic boundaries. When Neumann boundaries are involved, convergence drops to 2nd-order due to 2nd-order extrapolation but still yields lower mean errors than traditional 2nd-order schemes. More accurate extrapolation methods may be applied in the future to reduce the error for Neumann BCs.
(v) Application in flow simulation: The method is successfully applied to vortex-in-cell (VIC) simulations. While it primarily manages outer boundaries, it can be used with the immersed boundary method to handle internal solid objects. Neumann and Periodic boundaries also help reduce domain size and improve efficiency. Immersed interface methods may be considered in the future to better handle sharp discontinuities.

Funding

This work is supported by the National Science and Technology Council (NSTC) in Taiwan for grant NSTC-112-2221-E-032-008.

Data Availability Statement

Data are available upon request from the corresponding author.

Acknowledgments

The author thanks the National Science and Technology Council (NSTC) in Taiwan for its financial support.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A. Prescription of the Eigenvalues and Eigenvectors for Block Matrices

This section follows the 2D matrix decomposition method in Section 3.2 and gives the formulations of the eigenvectors and eigenvalues of the block matrices of Equation (13). The formulations of the eigenvalues are modified from Swarztrauber [12] to account for the general 4th-order compact stencil weights (Figure 1), while the eigenvectors are the same as in [12]. Since the mode numbers are related to the boundary conditions on the South and North boundaries, the index of the mode $k$ and location $j$ for the eigenvector matrix $Q = [q_{j,k}]$ follows
$$j = 1,\ 2,\ \ldots,\ N_y - 2 \quad \text{for Dirichlet South and North} \tag{A1a}$$
$$j = 1,\ 2,\ \ldots,\ N_y - 1 \quad \text{for Dirichlet South and Neumann North} \tag{A1b}$$
$$j = 0,\ 1,\ \ldots,\ N_y - 1 \quad \text{for Neumann or Periodic South and North} \tag{A1c}$$
Here $N_y$ is the total number of solution nodes, and $j = 0$ and $j = N_y - 1$ are the indices of the nodes where boundary conditions are applied.
(i)
Periodic South and North BCs
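If the South and North boundaries have Periodic boundary conditions, the block matrices of Equation (13) take the following form:

$$C = \begin{bmatrix} a & b & & & b \\ b & a & b & & \\ & \ddots & \ddots & \ddots & \\ & & b & a & b \\ b & & & b & a \end{bmatrix}_{N_y \times N_y}$$

Since the matrix is symmetric, its eigenvectors are orthogonal, i.e., $Q^{-1} = Q^T$. The matrix $Q$ is composed of the eigenvectors of modes $k = 1, 2, \ldots, N_y$, where $N_y$ is the number of grid points in the y-direction, i.e., $Q = [q_1, q_2, \ldots, q_{N_y}]$. For the $k$-th mode, the eigenvector $q_k$ has $N_y$ components, i.e., $q_k = [q_{0,k}, q_{1,k}, \ldots, q_{N_y-1,k}]^T$. The $j$-th component of $q_k$, where $j = 0, 1, \ldots, N_y - 1$, is

$$q_{j,k} = \frac{1}{\sqrt{N_y}} \quad \text{for the first mode, } k = 1 \tag{A3a}$$

$$q_{j,k} = \sqrt{\frac{2}{N_y}} \cos\left(\frac{2\pi (k/2)\, j}{N_y}\right) \quad \text{for even modes, } k = 2, 4, \ldots, N_y - 1 \tag{A3b}$$

$$q_{j,k} = \sqrt{\frac{2}{N_y}} \sin\left(\frac{2\pi \left((k-1)/2\right) j}{N_y}\right) \quad \text{for odd modes, } k = 3, 5, \ldots, N_y - 1 \tag{A3c}$$

$$q_{j,k} = \frac{1}{\sqrt{N_y}} (-1)^j \quad \text{for the last mode, } k = N_y \tag{A3d}$$

The eigenvalues of matrix $C$ for this case are

$$\lambda_k = a + 2b \quad \text{for } k = 1 \tag{A4a}$$

$$\lambda_k = a + 2b - 4b \sin^2\left(\frac{\pi k}{2 N_y}\right) \quad \text{for even modes, } k = 2, 4, \ldots, N_y - 1 \tag{A4b}$$

$$\lambda_k = a + 2b - 4b \sin^2\left(\frac{\pi (k-1)}{2 N_y}\right) \quad \text{for odd modes, } k = 3, 5, \ldots, N_y - 1 \tag{A4c}$$

$$\lambda_k = a - 2b \quad \text{for } k = N_y \tag{A4d}$$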
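The Periodic formulas can be checked numerically. The following is a minimal sketch, not the author's implementation: the function names, the test values of $a$ and $b$, and the requirement that $N_y$ be even (needed for the $(-1)^j$ mode to exist) are assumptions made for illustration. It builds $C$, fills $Q$ from (A3a)–(A3d), evaluates (A4a)–(A4d) through the wavenumber $m = \lfloor k/2 \rfloor$, and reports how far $C q_k$ is from $\lambda_k q_k$ and how far $Q^T Q$ is from the identity.

```python
import math


def periodic_eigensystem(n, a, b):
    """Return (C, Q, lam) for n = N_y periodic nodes (assumes n even, n >= 4)."""
    # Circulant matrix: a on the diagonal, b on both neighbours, with wrap-around.
    C = [[0.0] * n for _ in range(n)]
    for j in range(n):
        C[j][j] = a
        C[j][(j + 1) % n] = b
        C[j][(j - 1) % n] = b
    Q = [[0.0] * n for _ in range(n)]  # Q[j][k - 1] stores q_{j,k}, j = 0..n-1
    lam = [0.0] * n                    # lam[k - 1] stores lambda_k
    for k in range(1, n + 1):
        m = k // 2  # wavenumber: k/2 for even k, (k - 1)/2 for odd k
        for j in range(n):
            angle = 2.0 * math.pi * m * j / n
            if k == 1:
                q = 1.0 / math.sqrt(n)                    # (A3a)
            elif k == n:
                q = (-1) ** j / math.sqrt(n)              # (A3d)
            elif k % 2 == 0:
                q = math.sqrt(2.0 / n) * math.cos(angle)  # (A3b)
            else:
                q = math.sqrt(2.0 / n) * math.sin(angle)  # (A3c)
            Q[j][k - 1] = q
        # (A4a)-(A4d) in one expression: m = 0 gives a + 2b, m = n/2 gives a - 2b.
        lam[k - 1] = a + 2.0 * b - 4.0 * b * math.sin(math.pi * m / n) ** 2
    return C, Q, lam


def eigen_residual(C, Q, lam):
    """Largest entry of |C q_k - lambda_k q_k| over all modes k."""
    n = len(C)
    return max(
        abs(sum(C[i][j] * Q[j][k] for j in range(n)) - lam[k] * Q[i][k])
        for i in range(n) for k in range(n)
    )


def orthogonality_error(Q):
    """Largest entry of |Q^T Q - I|."""
    n = len(Q)
    return max(
        abs(sum(Q[j][k] * Q[j][l] for j in range(n)) - (1.0 if k == l else 0.0))
        for k in range(n) for l in range(n)
    )


# Arbitrary test values for a and b; both printed numbers should be near round-off.
C, Q, lam = periodic_eigensystem(8, a=-4.0, b=1.0)
print(eigen_residual(C, Q, lam), orthogonality_error(Q))
```

Because the even and odd modes share the same wavenumber $m$, they also share the same eigenvalue, which is why a single expression covers all four cases.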
(ii)
Dirichlet South and Dirichlet North BCs
If the South and North boundaries have Dirichlet boundary conditions, the block matrices of Equation (13) take the following form:

$$C = \begin{bmatrix} a & b & & & \\ b & a & b & & \\ & \ddots & \ddots & \ddots & \\ & & b & a & b \\ & & & b & a \end{bmatrix}_{(N_y - 2) \times (N_y - 2)}$$

Note that the matrix is symmetric and, hence, its eigenvectors are orthogonal, i.e., $Q^{-1} = Q^T$. The matrix $Q$ is composed of the eigenvectors of modes $k = 1, 2, \ldots, N_y - 2$, i.e., $Q = [q_1, q_2, \ldots, q_{N_y-2}]$. For the $k$-th mode, the eigenvector $q_k$ has $N_y - 2$ components, i.e., $q_k = [q_{1,k}, q_{2,k}, \ldots, q_{N_y-2,k}]^T$. The $j$-th component of $q_k$, where $j = 1, 2, \ldots, N_y - 2$, is

$$q_{j,k} = \sqrt{\frac{2}{N_y - 1}} \sin\left(\frac{\pi k j}{N_y - 1}\right) \quad \text{for } k = 1, 2, \ldots, N_y - 2$$

The eigenvalues of matrix $C$ for this case are

$$\lambda_k = a + 2b - 4b \sin^2\left(\frac{\pi k}{2 (N_y - 1)}\right) \quad \text{for } k = 1, 2, \ldots, N_y - 2$$
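To illustrate how these Dirichlet modes are used, the sketch below solves a single tridiagonal system $C\phi = f$ by transforming with $Q^T$, dividing by the eigenvalues, and transforming back with $Q$. It is an illustrative example only: the function names and the values $a = -2$, $b = 1$ (the plain 1D second-difference operator) are assumptions, not values taken from the paper.

```python
import math


def dirichlet_eigensystem(Ny, a, b):
    """Return (C, Q, lam) for Ny total nodes, i.e. Ny - 2 solution nodes."""
    n = Ny - 2
    # Symmetric tridiagonal matrix with a on the diagonal and b beside it.
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        C[i][i] = a
        if i > 0:
            C[i][i - 1] = b
        if i < n - 1:
            C[i][i + 1] = b
    scale = math.sqrt(2.0 / (Ny - 1))
    # Q[j - 1][k - 1] stores q_{j,k} for j, k = 1..Ny-2.
    Q = [[scale * math.sin(math.pi * k * j / (Ny - 1)) for k in range(1, n + 1)]
         for j in range(1, n + 1)]
    lam = [a + 2.0 * b - 4.0 * b * math.sin(math.pi * k / (2.0 * (Ny - 1))) ** 2
           for k in range(1, n + 1)]
    return C, Q, lam


def solve_with_modes(Q, lam, f):
    """Solve C phi = f as phi = Q diag(1/lam) Q^T f (valid because Q^-1 = Q^T)."""
    n = len(f)
    f_bar = [sum(Q[j][k] * f[j] for j in range(n)) for k in range(n)]       # Q^T f
    phi_bar = [f_bar[k] / lam[k] for k in range(n)]                         # decoupled
    return [sum(Q[j][k] * phi_bar[k] for k in range(n)) for j in range(n)]  # Q phi_bar


def residual(C, phi, f):
    """Largest entry of |C phi - f|."""
    n = len(f)
    return max(abs(sum(C[i][j] * phi[j] for j in range(n)) - f[i]) for i in range(n))


# Hypothetical test case: Ny = 10 nodes, unit source on the 8 solution nodes.
C, Q, lam = dirichlet_eigensystem(10, a=-2.0, b=1.0)
f = [1.0] * 8
phi = solve_with_modes(Q, lam, f)
print(residual(C, phi, f))  # should be near round-off
```

In the full solver the same transform is applied along one direction of the grid so that the remaining directions decouple; the dense sums used here stand in for the fast sine transform that would normally be used.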
(iii)
Neumann South and Neumann North BCs
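If the South and North boundaries have Neumann boundary conditions, the block matrices of Equation (13) take the following form:

$$C = \begin{bmatrix} a & 2b & & & \\ b & a & b & & \\ & \ddots & \ddots & \ddots & \\ & & b & a & b \\ & & & 2b & a \end{bmatrix}_{N_y \times N_y}$$

Because the matrix is not symmetric, the eigenvectors used in the transform ($\bar{\phi} = Q^{-1}\phi = P\phi$) and in the inverse transform ($\phi = Q\bar{\phi}$) are listed separately. For the $k$-th component of the transformed vector, $\bar{\phi}_k = \sum_{j=0}^{N_y-1} p_{k,j}\,\phi_j$ for $k = 1, 2, \ldots, N_y$, the components $p_{k,j}$ are as follows:

$$p_{k,j} = \frac{1}{N_y - 1} \quad \text{for } j = 0$$

$$p_{k,j} = \frac{2}{N_y - 1} \cos\left(\frac{\pi (k-1) j}{N_y - 1}\right) \quad \text{for } j = 1, \ldots, N_y - 2$$

$$p_{k,j} = \frac{(-1)^{k-1}}{N_y - 1} \quad \text{for } j = N_y - 1$$

For the $j$-th component of the inverse-transformed vector, $\phi_j = \sum_{k=1}^{N_y} q_{j,k}\,\bar{\phi}_k$ for $j = 0, 1, \ldots, N_y - 1$, the components $q_{j,k}$ are as follows:

$$q_{j,k} = \frac{1}{2} \quad \text{for } k = 1$$

$$q_{j,k} = \cos\left(\frac{\pi (k-1) j}{N_y - 1}\right) \quad \text{for } k = 2, \ldots, N_y - 1$$

$$q_{j,k} = \frac{(-1)^j}{2} \quad \text{for } k = N_y$$

The eigenvalues of matrix $C$ for this case are

$$\lambda_k = a + 2b - 4b \sin^2\left(\frac{\pi (k-1)}{2 (N_y - 1)}\right) \quad \text{for } k = 1, 2, \ldots, N_y$$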
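Since $P$ and $Q$ are listed separately for the Neumann case, a quick way to confirm that they form a consistent pair is to check $PQ = I$ and $PCQ = \mathrm{diag}(\lambda_k)$. The sketch below does this with plain Python lists; the function names and the test values of $a$, $b$, and $N_y$ are assumptions chosen for illustration.

```python
import math


def neumann_system(Ny, a, b):
    """Return (C, P, Q, lam) for Neumann South and North with Ny nodes (Ny >= 3)."""
    n = Ny - 1
    C = [[0.0] * Ny for _ in range(Ny)]
    for i in range(Ny):
        C[i][i] = a
        if i > 0:
            C[i][i - 1] = 2.0 * b if i == n else b  # last row carries 2b
        if i < n:
            C[i][i + 1] = 2.0 * b if i == 0 else b  # first row carries 2b
    P = [[0.0] * Ny for _ in range(Ny)]  # P[k - 1][j] stores p_{k,j}
    Q = [[0.0] * Ny for _ in range(Ny)]  # Q[j][k - 1] stores q_{j,k}
    for k in range(1, Ny + 1):
        for j in range(Ny):
            c = math.cos(math.pi * (k - 1) * j / n)
            # Forward transform: end nodes carry half the interior weight.
            if j == 0:
                P[k - 1][j] = 1.0 / n
            elif j == n:
                P[k - 1][j] = (-1) ** (k - 1) / n
            else:
                P[k - 1][j] = 2.0 * c / n
            # Inverse transform: first and last modes carry the factor 1/2.
            if k == 1:
                Q[j][k - 1] = 0.5
            elif k == Ny:
                Q[j][k - 1] = 0.5 * (-1) ** j
            else:
                Q[j][k - 1] = c
    lam = [a + 2.0 * b - 4.0 * b * math.sin(math.pi * (k - 1) / (2.0 * n)) ** 2
           for k in range(1, Ny + 1)]
    return C, P, Q, lam


def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(A[i][m] * B[m][j] for m in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def max_deviation(M, diagonal):
    """Largest entry of |M - diag(diagonal)|."""
    n = len(M)
    return max(abs(M[i][j] - (diagonal[i] if i == j else 0.0))
               for i in range(n) for j in range(n))


# Arbitrary test values; both printed numbers should be near round-off.
C, P, Q, lam = neumann_system(9, a=-4.0, b=1.0)
print(max_deviation(matmul(P, Q), [1.0] * 9))       # |P Q - I|
print(max_deviation(matmul(matmul(P, C), Q), lam))  # |P C Q - diag(lambda)|
```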
(iv)
Dirichlet South and Neumann North BCs
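In this case, the block matrices of Equation (13) take the following form:

$$C = \begin{bmatrix} a & b & & & \\ b & a & b & & \\ & \ddots & \ddots & \ddots & \\ & & b & a & b \\ & & & 2b & a \end{bmatrix}_{(N_y - 1) \times (N_y - 1)}$$

Because the matrix is not symmetric, the eigenvectors used in the transform ($\bar{\phi} = Q^{-1}\phi = P\phi$) and in the inverse transform ($\phi = Q\bar{\phi}$) are listed separately. For the $k$-th component of the transformed vector, $\bar{\phi}_k = \sum_{j=1}^{N_y-1} p_{k,j}\,\phi_j$ for $k = 1, \ldots, N_y - 1$, the components $p_{k,j}$ are as follows:

$$p_{k,j} = \frac{2}{N_y - 1} \sin\left(\frac{\pi (2k-1) j}{2 (N_y - 1)}\right) \quad \text{for } j = 1, 2, \ldots, N_y - 2$$

$$p_{k,j} = \frac{(-1)^{k+1}}{N_y - 1} \quad \text{for } j = N_y - 1$$

For the $j$-th component of the inverse-transformed vector, $\phi_j = \sum_{k=1}^{N_y-1} q_{j,k}\,\bar{\phi}_k$ for $j = 1, \ldots, N_y - 1$, the components $q_{j,k}$ are as follows:

$$q_{j,k} = \sin\left(\frac{\pi (2k-1) j}{2 (N_y - 1)}\right) \quad \text{for } j = 1, 2, \ldots, N_y - 1$$

The eigenvalues of matrix $C$ for this case are

$$\lambda_k = a + 2b - 4b \sin^2\left(\frac{\pi (2k-1)}{4 (N_y - 1)}\right) \quad \text{for } k = 1, 2, \ldots, N_y - 1$$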
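For the mixed Dirichlet–Neumann case, the same transform pair can be used to solve $C\phi = f$ as $\phi = Q\,\mathrm{diag}(1/\lambda_k)\,P f$. The following is a small sketch under stated assumptions: the function names, the values $a = -2$, $b = 1$, and the source vector are hypothetical and serve only to exercise the formulas.

```python
import math


def dirichlet_neumann_system(Ny, a, b):
    """Return (C, P, Q, lam) for Dirichlet South, Neumann North (Ny - 1 unknowns)."""
    n = Ny - 1  # solution nodes j = 1..n; j = n is the Neumann boundary node
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        C[i][i] = a
        if i > 0:
            C[i][i - 1] = 2.0 * b if i == n - 1 else b  # last row carries 2b
        if i < n - 1:
            C[i][i + 1] = b
    P = [[0.0] * n for _ in range(n)]  # P[k - 1][j - 1] stores p_{k,j}
    Q = [[0.0] * n for _ in range(n)]  # Q[j - 1][k - 1] stores q_{j,k}
    for k in range(1, n + 1):
        for j in range(1, n + 1):
            s = math.sin(math.pi * (2 * k - 1) * j / (2.0 * n))
            Q[j - 1][k - 1] = s
            # The Neumann node gets half weight; there s equals (-1)^(k+1).
            P[k - 1][j - 1] = (-1) ** (k + 1) / n if j == n else 2.0 * s / n
    lam = [a + 2.0 * b - 4.0 * b * math.sin(math.pi * (2 * k - 1) / (4.0 * n)) ** 2
           for k in range(1, n + 1)]
    return C, P, Q, lam


def solve_with_transforms(P, Q, lam, f):
    """Solve C phi = f as phi = Q diag(1/lam) P f."""
    n = len(f)
    f_bar = [sum(P[k][j] * f[j] for j in range(n)) for k in range(n)]       # P f
    phi_bar = [f_bar[k] / lam[k] for k in range(n)]                         # decoupled
    return [sum(Q[j][k] * phi_bar[k] for k in range(n)) for j in range(n)]  # Q phi_bar


def residual(C, phi, f):
    """Largest entry of |C phi - f|."""
    n = len(f)
    return max(abs(sum(C[i][j] * phi[j] for j in range(n)) - f[i]) for i in range(n))


# Hypothetical test case with Ny = 9 nodes (8 unknowns) and a ramp source.
C, P, Q, lam = dirichlet_neumann_system(9, a=-2.0, b=1.0)
f = [float(i + 1) for i in range(8)]
phi = solve_with_transforms(P, Q, lam, f)
print(residual(C, phi, f))  # should be near round-off
```

The solve only reproduces $f$ if $P$ is indeed the inverse of $Q$ and each column of $Q$ is an eigenvector of $C$, so a small residual checks both sets of formulas at once.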

References

  1. Zhong, Y.; Shirinzadeh, B.; Alici, G.; Smith, J. Deformable object simulation with Poisson equation. In Proceedings of the IEEE, International Conference on Mechatronics & Automation, Niagara Falls, ON, Canada, 29 July–1 August 2005; pp. 187–192.
  2. Nagel, J.R. Numerical solutions to Poisson equations using the finite-difference method. IEEE Antennas Propag. Mag. 2014, 56, 209–224.
  3. Radhakrishnan, A.; Xu, M.; Shahane, S.; Vanka, S.P. A non-nested multilevel method for meshless solution of the Poisson equation in heat transfer and fluid flow. arXiv 2021, arXiv:2104.13758.
  4. Versteeg, H.; Malalasekera, W. Introduction to Computational Fluid Dynamics: The Finite Volume Method, 2nd ed.; Pearson: London, UK, 2007.
  5. Smith, G.D. Numerical Solutions of Partial Differential Equations: Finite Difference Methods; Oxford University Press: New York, NY, USA, 1985.
  6. Wang, Y.; Zhang, J. Sixth order compact scheme combined with multigrid method and extrapolation technique for 2D Poisson equation. J. Comput. Phys. 2009, 228, 137–146.
  7. Ge, Y. Multigrid method and fourth-order compact difference discretization scheme with unequal meshsizes for 3D Poisson equation. J. Comput. Phys. 2010, 229, 6381–6391.
  8. Hockney, R.W. A fast direct solution of Poisson equation using Fourier analysis. JACM 1965, 12, 95–113.
  9. Buneman, O. A Compact Non-Iterative Poisson Solver; Rep. SUIPR-294; Institute for Plasma Research, Stanford University: Stanford, CA, USA, 1969.
  10. Buzbee, B.L.; Golub, G.H.; Nielson, C.W. On direct methods for solving Poisson’s equations. SIAM J. Numer. Anal. 1970, 7, 627–655.
  11. Sweet, R.A. A Generalized Cyclic Reduction Algorithm. SIAM J. Numer. Anal. 1974, 11, 506–520.
  12. Swarztrauber, P.N. The methods of cyclic reduction, Fourier analysis and the FACR algorithm for the discrete solution of Poisson’s equation on a rectangle. SIAM Rev. 1977, 19, 490–501.
  13. Cooley, J.W.; Lewis, P.A.W.; Welch, P.D. The fast Fourier transform algorithm: Programming considerations in the calculation of sine, cosine and Laplace transforms. J. Sound Vib. 1970, 12, 315–337.
  14. Swarztrauber, P.N. Symmetric FFTs. Math. Comput. 1986, 47, 323–346.
  15. Schumann, U.; Sweet, R.A. A direct method for the solution of Poisson’s equation with Neumann boundary conditions on a staggered grid of arbitrary size. J. Comput. Phys. 1976, 20, 171–182.
  16. Schumann, U.; Sweet, R.A. Fast Fourier transforms for direct solution of Poisson’s equation with staggered boundary conditions. J. Comput. Phys. 1988, 20, 123–137.
  17. Sweet, R.A. Direct methods for the solution of Poisson’s equation on a staggered grid. J. Comput. Phys. 1973, 12, 422–428.
  18. Wilhelmson, R.B.; Ericksen, J.H. Direct Solutions for Poisson’s equation in three dimensions. J. Comput. Phys. 1977, 25, 319–331.
  19. Shiferaw, A.; Chand Mittal, R. An efficient direct method to solve the three dimensional Poisson’s equation. Am. J. Comput. Math. 2011, 1, 285–293.
  20. Wang, H.; Zhang, Y.; Ma, X.; Qiu, J.; Liang, Y. An efficient implementation of fourth-order compact finite difference scheme for Poisson equation with Dirichlet boundary conditions. Comput. Math. Appl. 2016, 71, 1843–1860.
  21. Feng, H.; Zhao, S. FFT-based high order central difference schemes for three-dimensional Poisson’s equation with various types of boundary conditions. J. Comput. Phys. 2020, 410, 109391.
  22. Adams, J.C.; Swarztrauber, P.N.; Sweet, R. FISHPACK90: Efficient Fortran Subprograms for the Solution of Separable Elliptic Partial Differential Equations. Astrophysics Source Code Library. 2016. Available online: https://ascl.net/1609.005 (accessed on 7 April 2025).
  23. Hasbestan, J.J.; Senocak, I. PittPack: Open-Source FFT-Based Poisson’s Equation Solver for Computing with Accelerators. In Proceedings of the ASME 2018 International Mechanical Engineering Congress and Exposition IMECE 2018, Pittsburgh, PA, USA, 9–15 November 2018.
  24. Hasbestan, J.J.; Xiao, C.-N.; Senocak, I. PittPack: An open-source Poisson’s equation solver for extreme-scale computing with accelerators. Comput. Phys. Commun. 2020, 254, 107272.
  25. Kyei, Y.; Roop, J.P.; Tang, G. A family of sixth-order compact finite-difference schemes for the three-dimensional Poisson Equation. Adv. Numer. Anal. 2010, 2010, 352174.
  26. Deriaz, E. Compact finite difference schemes of arbitrary order for the Poisson equation in arbitrary dimensions. BIT Numer. Math. 2019, 60, 199–233.
  27. Cohen, S. Cyclic Reduction; Lecture note; Stanford University: Stanford, CA, USA, 1994.
  28. Ploumhans, P.; Winckelmans, G.S.; Salmon, J.K.; Leonard, A.; Warren, M.S. Vortex Methods for Direct Numerical Simulation of Three-Dimensional Bluff Body Flows: Application to the Sphere at Re = 300, 500, and 1000. J. Comput. Phys. 2002, 178, 427–463.
  29. Sharma, N.; Sengupta, T.K. Vorticity dynamics of the three-dimensional Taylor-Green vortex problem. Phys. Fluids 2019, 31, 3.
  30. Bialecki, B.; Fairweather, G.; Karageorghis, A. Matrix decomposition algorithms for elliptic boundary value problems: A survey. Numer. Algor. 2011, 56, 253–295.
  31. Raeli, A.; Bergmann, M.; Iollo, A. A finite-difference method for the variable coefficient Poisson equation on hierarchical Cartesian meshes. J. Comput. Phys. 2018, 355, 59–77.
  32. Feng, H.; Long, G.; Zhao, S. An augmented matched interface and boundary (MIB) method for solving elliptic interface problem. J. Comput. Appl. Math. 2019, 361, 426–443.
  33. Ren, Y.; Feng, H.; Zhao, S. A FFT accelerated high order finite difference method for elliptic boundary value problems over irregular domains. J. Comput. Phys. 2022, 448, 110762.
  34. Ren, Y.; Zhao, S. A high-order hybrid approach integrating neural networks and fast Poisson solvers for elliptic interface problems. Computation 2025, 13, 83.
  35. Cocle, R.; Winckelmans, G.; Daeninck, G. Combining the vortex-in-cell and parallel fast multipole methods for efficient domain decomposition simulations. J. Comput. Phys. 2008, 227, 9091–9120.
  36. Mimeau, C.; Cottet, G.-H.; Mortazavi, I. Direct numerical simulations of 3D flow past obstacles with a vortex penalization method. Comput. Fluids 2016, 136, 331–347.
  37. Spietz, H.J.; Hejlesen, M.M.; Walther, J.H. Iterative Brinkman penalization for simulation of impulsively started flow past a sphere and a circular disc. J. Comput. Phys. 2017, 336, 261–274.
  38. Cheng, M.; Loum, J.; Lim, T.T. Numerical simulation of head-on collision of two coaxial vortex rings. Fluid Dyn. Res. 2018, 50, 065513.
  39. Buzbee, B.L.; Dorr, F.W.; George, J.A.; Golub, G.H. The direct solution of the discrete Poisson equation on irregular regions. SIAM J. Numer. Anal. 1971, 8, 722–736.
Figure 1. The general structure of the 27-point symmetric compact stencil used for the finite-difference discretization of 3D Poisson’s equation. Here a , b , c , and d respectively denote the weights for nodes in the center (red circle), face centers (blue squares), edge centers (green triangles), and corners (grey diamonds) of the stencil cube.
Figure 2. Examples of the uniformly spaced grid of 2D rectangular domains with various combinations of boundary conditions (named in the sequence of West-South-East-North, WSEN): (a) NNDD, (b) DPNP, where D, N, and P stand for Dirichlet, Neumann, and Periodic BCs, respectively. The linear indices of the solution nodes are labeled beside the nodes.
Figure 3. Target nodes $(i, j)$ and their 8 neighbors in 2D FD discretization under various boundary conditions. Cases selected from Figure 2a are nodes (a) #11, (b) #36, (c) #6, and (d) #1; cases selected from Figure 2b are nodes (e) #1 and (f) #36.
Figure 4. LHS stencil weights as the results of discretization for nodes in Figure 3. Cases selected from Figure 2a are nodes (a) #11, (b) #36, (c) #6, and (d) #1; Cases selected from Figure 2b are nodes (e) #1, and (f) #36.
Figure 5. Stencil contribution to the RHS source terms for discretization of nodes in Figure 3. Cases selected from Figure 2a are nodes (a) #11, (b) #36, (c) #6, and (d) #1; Cases selected from Figure 2b are nodes (e) #1, and (f) #36.
Figure 6. Schematic of the equivalent stencil and the reduced 2D grid of the first reduction in the 3D matrix decomposition method.
Figure 7. An example of combined boundary conditions, PDNDDP, of a 3D rectangular domain: Here D, N, and P stand for Dirichlet, Neumann, and Periodic BCs. The BCs are labeled in the sequence of the Bottom (B), West (W), South (S), East (E), North (N), and Top (T) boundaries of the domain.
Figure 8. Iso-surface and contour lines on the xz-plane of the exact solution to the Gaussian particle source, with the red dots indicating where the errors of the numerical solutions (4th-order stencil-1, $h = 0.03125$) are beyond the 99.9-percentile value for BCs: (a) DDDDDD, (b) NNNDDD. The levels of the iso-surface and contour lines are labeled on the color bar.
Figure 9. Error statistics for the Gaussian point source problem as functions of grid spacing, $h$, stencil schemes (▽: 2nd-order stencil, +: 4th-order stencil-1, □: 4th-order stencil-2, see Table 1), and boundary conditions (identified by colors): (a) mean error, $\varepsilon_{mean}$, (b) 99.9-percentile error, $\varepsilon_{99.9\%}$, and (c) maximum error, $\varepsilon_{max}$. The black and green dashed lines represent 4th- and 2nd-order convergence, respectively.
Figure 10. Error ratios of the 4th-order scheme-2 to the 2nd-order scheme for various boundary conditions and grid spacing in the Gaussian point source problem: (a) Mean error, (b) 99.9-percentile error, and (c) maximum error.
Figure 11. Error ratios of the 4th-order scheme-1 to the 4th-order scheme-2 for various boundary conditions and grid spacing in the Gaussian point source problem: (a) Mean error, (b) 99.9-percentile error, and (c) maximum error.
Figure 12. Error statistics for the Gaussian point source problem as functions of grid spacing, $h$, stencil schemes (▽: 2nd-order stencil, +: 4th-order stencil-1, □: 4th-order stencil-2, see Table 1), and domain width in the Neumann direction (i.e., $L_z$) for the BC NDDDDN: (a) mean error, $\varepsilon_{mean}$, (b) 99.9-percentile error, $\varepsilon_{99.9\%}$, and (c) maximum error, $\varepsilon_{max}$. The black and green dashed lines represent 4th- and 2nd-order convergence, respectively.
Figure 13. Iso-surface of the exact solution to the sinusoidal source field, with the red dots indicating where the errors of the numerical solutions (4th-order stencil-1, $h = 0.098175$) are beyond the 99.9-percentile value for BCs: (a) NNDNDD, (b) PPPPPP.
Figure 14. Error statistics for the sinusoidal source problem as functions of grid spacing, $h$, stencil schemes (▽: 2nd-order stencil, +: 4th-order stencil-1, □: 4th-order stencil-2, see Table 1), and Dirichlet-Periodic BCs (identified by colors): (a) mean error, $\varepsilon_{mean}$, (b) 99.9-percentile error, $\varepsilon_{99.9\%}$, and (c) maximum error, $\varepsilon_{max}$. The black and green dashed lines represent 4th- and 2nd-order convergence, respectively.
Figure 15. Error statistics for the sinusoidal source problem as functions of grid spacing, $h$, stencil schemes (▽: 2nd-order stencil, +: 4th-order stencil-1, □: 4th-order stencil-2, see Table 1), and BCs (identified by colors): (a) mean error, $\varepsilon_{mean}$, (b) 99.9-percentile error, $\varepsilon_{99.9\%}$, and (c) maximum error, $\varepsilon_{max}$. The black and green dashed lines represent 4th- and 2nd-order convergence, respectively.
Figure 16. Error ratios of the 4th-order scheme-2 to the 2nd-order scheme for various boundary conditions and grid spacing in the sinusoidal source problem. (a) Mean error, (b) 99.9-percentile error, and (c) maximum error.
Figure 17. Error ratios of the 4th-order scheme-1 to the 4th-order scheme-2 for various boundary conditions and grid spacing in the sinusoidal source problem. (a) Mean error, (b) 99.9-percentile error, and (c) maximum error.
Figure 18. Computational times obtained from matrix inversion and matrix decomposition methods.
Figure 19. Examples of vortex-in-cell simulations of the head-on collision of coaxial vortex rings at $Re_\Gamma = 1000$. Three snapshots of vorticity iso-surfaces ($\omega_\theta = -2$, $-0.2$, $0.2$, and $2$) as well as the instantaneous streamlines on the $y = 0$ plane at non-dimensionalized times, i.e., T = 0 (initialization), T = 5, and T = 10, are presented. The simulations are obtained with Dirichlet BCs for the stream functions on all boundaries.
Figure 20. An example of vortex-in-cell simulation of flow induced by an impulsively started sphere at $Re_D = 500$ and normalized time $ut/D = 20$. (a) Instantaneous vorticity magnitude isosurfaces at levels $\omega = 0.5, 2.5, 5$. (b,c) Streamlines and vorticity magnitude contours (at the same levels) on the xy- and xz-planes passing through the center of the sphere, respectively. The simulations are obtained with stream function Dirichlet BCs on all boundaries.
Figure 21. An example of vortex-in-cell simulation of flow induced by an impulsively started circular cylinder at $Re_D = 1000$ and normalized time $ut/D = 12.8$. (a) Instantaneous vorticity magnitude isosurfaces at levels $\omega = 1, 5$, and $10$. (b,c) Streamlines and vorticity magnitude contours (at the same levels) on the xy- and xz-planes passing through the center of the cylinder, respectively. The simulations are obtained with stream function Periodic BCs along the y-direction and Dirichlet BCs on the rest of the boundaries.
Figure 22. An example of vortex-in-cell simulation of boundary layer flow passing a surface-mounted cube at $Re_D = 300$ and normalized time $ut/D = 24$. (a) Instantaneous vorticity magnitude isosurfaces at levels $\omega = 0.5, 2.5, 5$. (b) Streamlines and vorticity magnitude contours (at the same levels) on the xy-plane at half the height of the cube, i.e., $z = 0.5$. (c) Streamlines and vorticity contours (at the same levels) on the xz-plane at $y = 0$. The simulations are obtained with stream function Neumann BCs along the y-direction and Dirichlet BCs on the rest of the boundaries.
Table 1. Stencil weights for 2D [26] and 3D [19,25] compact finite difference discretization of the Poisson’s equation on a uniform grid using 2nd- or 4th-order accurate schemes.

| Domain Dimension and Field Variables or Source Terms | Order of Accuracy | Type | $a_\phi$ or $a_f$ | $b_\phi$ or $b_f$ | $c_\phi$ or $c_f$ | $d_\phi$ or $d_f$ |
| --- | --- | --- | --- | --- | --- | --- |
| 3D LHS weights for $\phi$ | 2 | - | $-6$ | $1$ | $0$ | $0$ |
| | 4 | 1 | $-25/6$ | $5/12$ | $1/8$ | $1/48$ |
| | 4 | 2 | $-4$ | $1/3$ | $1/6$ | $0$ |
| 3D RHS weights for $f$ | 2 | - | $1$ | $0$ | $0$ | $0$ |
| | 4 | 1 | $125/216$ | $25/432$ | $5/864$ | $1/1728$ |
| | 4 | 2 | $1/2$ | $1/12$ | $0$ | $0$ |
| 2D LHS weights for $\phi$ | 2 | - | $-4$ | $1$ | $0$ | - |
| | 4 | - | $-10/3$ | $2/3$ | $1/6$ | - |
| 2D RHS weights for $f$ | 2 | - | $1$ | $0$ | $0$ | - |
| | 4 | - | $2/3$ | $1/12$ | $0$ | - |
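A simple consistency check on the weights in Table 1 is that each left-hand-side stencil must vanish on a constant field (its weights sum to zero) and each right-hand-side stencil must reproduce a constant source (its weights sum to one). The sketch below verifies this with exact fractions. It assumes the center weights of the left-hand-side stencils are negative and uses the node counts of a 27-point cube (1 center, 6 faces, 12 edges, 8 corners) and of a 9-point square (1 center, 4 edge neighbors, 4 corners); the dictionary keys and the function name are illustrative only.

```python
from fractions import Fraction as F

# Number of stencil nodes sharing each weight.
COUNT_3D = (1, 6, 12, 8)  # center, face centers, edge centers, corners
COUNT_2D = (1, 4, 4)      # center, edge neighbours, corner neighbours

# Weights (a, b, c, d) in 3D and (a, b, c) in 2D, as read from Table 1.
LHS = {
    "3D 2nd-order": (F(-6), F(1), F(0), F(0)),
    "3D 4th-order type 1": (F(-25, 6), F(5, 12), F(1, 8), F(1, 48)),
    "3D 4th-order type 2": (F(-4), F(1, 3), F(1, 6), F(0)),
    "2D 2nd-order": (F(-4), F(1), F(0)),
    "2D 4th-order": (F(-10, 3), F(2, 3), F(1, 6)),
}
RHS = {
    "3D 2nd-order": (F(1), F(0), F(0), F(0)),
    "3D 4th-order type 1": (F(125, 216), F(25, 432), F(5, 864), F(1, 1728)),
    "3D 4th-order type 2": (F(1, 2), F(1, 12), F(0), F(0)),
    "2D 2nd-order": (F(1), F(0), F(0)),
    "2D 4th-order": (F(2, 3), F(1, 12), F(0)),
}


def stencil_sum(weights):
    """Sum of all stencil entries: each weight times the number of nodes using it."""
    counts = COUNT_3D if len(weights) == 4 else COUNT_2D
    return sum(count * weight for count, weight in zip(counts, weights))


for name in LHS:
    # Expect 0 for every LHS stencil and 1 for every RHS stencil.
    print(name, stencil_sum(LHS[name]), stencil_sum(RHS[name]))
```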
Table 2. Matrix sizes and coefficients of 3D finite difference discretization for various boundary conditions on the opposite sides. Notations of domain boundaries are W: West, E: East, S: South, N: North, B: Bottom, and T: Top. The $n$’s and $N$’s are the numbers of solution nodes and total nodes, respectively, along the direction labeled in the subscript. The $g$’s and $p$’s are used to discern the effect of Neumann and Periodic BCs, respectively, along the boundary labeled in the subscript.

| B, S, or W boundary | Periodic | Dirichlet | Dirichlet | Neumann | Neumann |
| --- | --- | --- | --- | --- | --- |
| T, N, or E boundary | Periodic | Dirichlet | Neumann | Dirichlet | Neumann |
| $n_z$ | $N_z$ | $N_z - 2$ | $N_z - 1$ | $N_z - 1$ | $N_z$ |
| $n_y$ | $N_y$ | $N_y - 2$ | $N_y - 1$ | $N_y - 1$ | $N_y$ |
| $n_x$ | $N_x$ | $N_x - 2$ | $N_x - 1$ | $N_x - 1$ | $N_x$ |
| $g_B$, $g_S$, $g_W$ | 1 | 1 | 1 | 2 | 2 |
| $g_T$, $g_N$, $g_E$ | 1 | 1 | 2 | 1 | 2 |
| $p_B$, $p_S$, $p_W$ | 1 | 0 | 0 | 0 | 0 |
| $p_T$, $p_N$, $p_E$ | 1 | 0 | 0 | 0 | 0 |
Table 3. Summary of Reynolds numbers ($Re_\Gamma$ and $Re_D$), grid spacings ($h$), time increments ($\Delta t$), and BCs of the Poisson’s solver used in the flow simulations. The BCs are labeled in the sequence of B-W-S-E-N-T boundaries of the rectangular domain, and D, N, and P are abbreviations of Dirichlet, Neumann, and Periodic conditions, respectively. For the vortex ring simulation, $Re_\Gamma \equiv \Gamma/\nu$. For simulations of flow passing bluff bodies, $Re_D = uD/\nu$.

| Example Flow | Reynolds Number, $Re$ | Grid Spacing, $h$ | Time Increment, $\Delta t$ | BCs |
| --- | --- | --- | --- | --- |
| Head-on collision of two vortex rings | $Re_\Gamma = 1000$ | $h/R = 0.002$ | $\Delta t\,\Gamma/R^2 = 0.01$ | DDDDDD |
| Impulsively started sphere | $Re_D = 500$ | $h/D = 0.0179$ | $\Delta t\,u/D = 0.01$ | DDDDDD |
| Impulsively started circular cylinder | $Re_D = 1000$ | $h/D = 0.012$ | $\Delta t\,u/D = 0.01$ | DDPDPD |
| Boundary layer flow passing a cube | $Re_D = 300$ | $h/D = 0.032$ | $\Delta t\,u/D = 0.02$ | DDNDND |

MDPI and ACS Style

Wu, C.-H. Enhanced Efficient 3D Poisson Solver Supporting Dirichlet, Neumann, and Periodic Boundary Conditions. Computation 2025, 13, 99. https://doi.org/10.3390/computation13040099
