A Strong-Form Off-Lattice Boltzmann Method for Irregular Point Clouds

Radial basis function generated finite differences (RBF-FD) represent the latest discretization approach for solving partial differential equations. Their benefits include high geometric flexibility, simple implementation, and opportunity for large-scale parallel computing. Compared to other meshfree methods, typically based upon moving least squares (MLS), the RBF-FD method is able to recover a high order of algebraic accuracy while remaining better conditioned. These features make RBF-FD a promising candidate for kinetic-based fluid simulations such as lattice Boltzmann methods (LB). Pursuant to this approach, we propose a characteristic-based off-lattice Boltzmann method (OLBM) using the strong form of the discrete Boltzmann equation and radial basis function generated finite differences (RBF-FD) for the approximation of spatial derivatives. Decoupling the discretizations of momentum and space enables the use of irregular point cloud, local refinement, and various symmetric velocity sets with higher order isotropy. The accuracy and computational efficiency of the proposed method are studied using the test cases of Taylor–Green vortex flow, lid-driven cavity, and periodic flow over a square array of cylinders. For scattered grids, we find the polyharmonic spline + poly RBF-FD method provides better accuracy compared to MLS. For Cartesian node layouts, the results are the opposite, with MLS offering better accuracy. Altogether, our results suggest that the RBF-FD paradigm can be applied successfully also for kinetic-based fluid simulation with lattice Boltzmann methods.


Introduction
In the last three decades the lattice Boltzmann method (LBM) has grown into a successful tool for computational fluid dynamics including single-and multi-phase flows [1,2], flows in porous media [3,4], heat and mass transfer [5][6][7], and other complex flows [8,9]. The main properties underlying this success include the algorithmic simplicity of the method with its particle-like nature, a high parallel efficiency due to the large degree of local operations, and the simple handling of boundary conditions via bounce-back type methods [10,11], among others. These properties are a result of the ingenious coupling between the velocity and spatial discretizations in such a way that the spatial grid corresponds to the characteristics of the discrete velocity space [12]. On the computational level this gives the stream-collide algorithm, where collision is a set of local arithmetic operations, and the streaming of particles occurs via a simple shift of values in the memory [13].
Simultaneously, the coupling between momentum space and physical space restricts the method to uniform Cartesian grids, which are often disadvantageous for the study of flow in complex geometries [12]. Moreover, the standard lattice symmetric stencils in 2-d and 3-d, that is, those including only the neighbors from the first Brillouin zone, such as D2Q9 and D3Q19, do not guarantee sufficient degrees of freedom and lead to the loss of Galilean invariance at finite Mach numbers. This limits LBM's domain of applicability to weakly-compressible athermal flows [14,15]. Specifically, it is the lack of a cubic term in velocity due to the second order truncation of equilibrium, which leads irregularly distributed points adapted to the specific geometry can help reduce the amount of computation for a given level of accuracy [41].
Previous efforts in this direction have mostly relied on classical PDE discretization techniques [42][43][44]. Due to the stiffness of the collision operator, most early OLBMs suffered from a stringent time-step condition, limiting them to low Reynolds number (isothermal) flows. This limitation was later overcome in [12] by combining the classic variable transformation technique from [43,45] with a discretization of the advective term along characteristics [28]. The collision term is then integrated implicitly at explicit cost. An alternative explanation for this special feature of LBM was later provided in terms of Strang splitting [46]. Since the splitting procedure brings significant gains in time-stepping efficiency, most of the (Eulerian) off-Lattice Boltzmann methods developed in the past decade have aimed to preserve this feature.
Among kinetic-based simulation methods, the Taylor least-squares interpolation supplemented LBM (TLS-LBM) by Shu et al. [47] is the first example of a meshfree LBM. It relies upon a semi-Lagrangian streaming step in combination with least-squares for the spatial approximation. While applied successfully for the simulation to several flow benchmarks, the method was only demonstrated using structured grids. As indicated by Bayona (2019) [48], the least squares approach remains viable only for low-order approximations, limiting further improvements of the TLS-LBM. More recently, several weak-form meshfree LBM methods have been developed by Musavi et al. [49,50] and Tanwar (2018) [51]. Similar to grid-based weak-form discretizations methods like FEM, these meshfree LBMs require an additional quadrature step (typically involving an additional background grid), increasing the complexity of the method and resulting in an implicit system of equations. Given the shortcoming of the weak-form meshfree LBMs, it is natural to wonder whether or not their strong-form cousins are more suitable candidates for computational kinetic simulations with discrete velocity models such as LBM.
In this paper, we demonstrate this notion by developing a strong-form OLBM (SF-OLBM) for irregular point clouds. The basic version of our SF-OLBM combines an RBF-FD discretization in space with the Lax-Wendroff method in time, resulting in an intuitive and finite-difference-like numerical scheme. While considerably simpler than the weakform meshfree LBM methods, the SF-OLBM is able to achieve comparable accuracy and computational efficiency. We demonstrate and compare these methods using a small set of numerical benchmarks including the Taylor-Green vortex flow, the lid-driven cavity, and the periodic flow in a square array of cylinders.
The rest of the paper is structured as follows. The SF-OLBM will be explained in Section 2, with details of the governing lattice Boltzmann equations, the spatial and temporal discretization methods, and the boundary conditions. A brief explanation of the point generation process concludes the section. In Section 3, the verification and validation of the proposed method is performed using the above-mentioned benchmark flow cases. This includes the evaluation of the spatial convergence rates, a demonstration of the ability to use geometry-adapted point clouds, and the effect of boundary conditions.

Method
This section presents the governing equations, the numerical formulation of the SF-OLBM, the boundary conditions, and the point cloud generation procedure.

Governing Equations
Our starting point is the discrete velocity Boltzmann equation (DVBE), for the single-particle distribution function f i (x, t) (PDF) which is a continuous function of space and time. The subscript i refers to the set of microscopic velocities c i , where i = 1, . . . , q. In the classical kinetic theory of gases, the particle velocity is a continuous variable, but in the DVBE it is limited to a finite set of velocities [46], also known as the lattice. The left-hand side of (1) describes the flight of particles along straight lines dictated by the velocities c i , while the right-hand side is the collision operator C i (f) that couples the dynamics of the individual equations through its dependence on all of the PDFs, f = ( f 1 , . . . , f q ) T . Without loss of generality we limit ourselves to the famous D2Q9 lattice, where the c i are given as: The low isotropy of this lattice makes it suitable only for incompressible and weaklycompressible flows. Since the focus of this paper is on spatio-temporal discretization, we limit ourselves to the lattice BGK collision operator: with a single relaxation time τ. The equilibrium distribution function is given by: where the value, δρ = ρ − ρ 0 , is the density fluctuation around a mean density ρ 0 . The values t i are the set of lattice weights and c s = 1/ √ 3 is the lattice speed of sound. The equilibrium above is in the incompressible form [52], meaning for steady flows it leads to results in agreement with the incompressible Navier-Stokes equation. The density fluctuation and macroscopic velocity are calculated from the particle distribution functions as the zeroth and first order moments with respect to the particle velocity, The pressure can be calculated from the state equation p = ρc 2 s . The viscosity of the fluid is ν = τc 2 s .

Numerical Solution Procedure
The success of (standard) LBM can be, in part, attributed to the unique spatio-temporal discretization based upon Strang splitting [46]. This way, the collision dynamics are integrated implicitly, at the cost of an explicit method. Solution of (1) is performed in two steps, collision:f and streaming: To achieve good bandwidth and limit computation costs it is crucial to design the discretization of (7) in a way that satisfies the hyperbolic character of advection. A survey of other OLBMs including FDM, FVM, and others, shows that almost all methods rely upon some form of hyper-viscosity, most typically through upwinding. In this work, we apply the standard Lax-Wendroff method for temporal discretization of the advection equation.

Temporal Discretization
We use the Lax-Wendroff scheme for temporal discretization. First, the particle distribution function is expanded in a Taylor series with respect to time: The superscript n refers to a discrete time instance, that is, t n = n∆t. By using the advection Equation (7) we can replace the temporal derivatives with derivatives in space, Substituting these into the Taylor series (8) and neglecting the third-order terms we arrive at the following time-stepping relation: The same time discretization has been used previously in the general characteristicbased OLBM [12], as well as the weak-form meshless OLBM variants [49,53].

Spatial Discretization
We perform the spatial discretization using a meshfree local strong-form method [54], which can be seen as a generalization of several meshfree methods including the moving least squares approximation method (MLS), radial basis function-generated finite difference method (RBF-FD), generalized finite difference method (GFDM), diffuse approximate method (DAM), local radial basis function collocation method (LRBFCM), collocated discrete least squares method (CLSM) and many others.
The basic principle of these methods is to approximate the differentials operator as a weighted sum of function values over a local support domain, {x k } n k=1 . For a linear differential operator L acting on a field u(x) the approximation is written as: where u k = u(x k ) and x c = x 1 is known as the center of the approximation. For the Lax-Wendroff method in 2-d, the operators required are the advection operator, and the second order operator, The sub-methods listed above differ in the way the weights w k are generated, using different underlying trial functions for the construction of the approximation. The detailed procedures for generating the weights can be found in the literature. Herein, we only studied the MLS [54] and RBF-FD [55] approximations. Computational routines for both types of approximation are available in specialized libraries [56]. Compared to weak-form OLBM methods, the strong-form methods obviate the need for an integration step, rendering an explicit system of equations (the mass matrix is equal to the identity matrix).
The weights for each stencil are calculated by solving a small system of linear equations at each node. For a chosen approximation method, assembly of the linear system only requires knowledge of the stencil coordinates. The weight computation is only performed once before the simulation begins (as long as the nodes are static). For the computer implementation, we assemble the weights into set of sparse matrices for each velocity direction c i . The sparsity pattern of the matrices equals the graph of nodal connectivity. After assembly and selection of the time step size ∆t, the time-update (9) is performed by repeated sparse matrix-vector product (SpMV) operations.

Boundary Conditions
For flows around objects or in bounded domains, correctly capturing the boundary conditions is a crucial part of the calculation. Typically, the boundary conditions are prescribed only in terms of the macroscopic variables and act as a constraint upon the underlying particle distribution functions. Since the strong-form OLBM assumes that the distribution functions reside at the nodes we adapt a regularized procedure [57], where the boundary populations are reconstructed according to: where x w is a wall node and u w is the prescribed wall velocity. The equilibrium part depends only on the macroscopic boundary conditions. The density ρ w can be computed using lattice symmetries, or extrapolated from the previous time step. The non-equilibrium correction f (1) i [11] is computed from the strain-rate tensor S using: where Q i = Q iαβ = c iα c iβ − c 2 s δ αβ and the symbol : means tensor contraction. The strain-rate tensor is evaluated either from the velocity field u(x, t) using the definition S = 1 2 ∇u + (∇u) T . Here, the discrete differentiation operators (10) can be used. Alternatively, the strain-rate tensor can be evaluated from the non-equilibrium part of the distribution function using the regularization technique [57]. This was the strategy we adopted here.

Point Cloud Generation
One of the advantages of meshfree methods is they can avoid the difficulties associated with traditional polygonal mesh generation techniques. Instead, several heuristic sampling techniques that can generate well-conditioned points sets have been developed in recent years [58,59]. An overview of different point cloud generators can be found in the work of Slak and Kosec [60].
In this work we have used two kinds of point-cloud generators. For the periodic flow benchmarks we have relied on a periodic Poisson disk sampling algorithm with a fixed background grid for fast neighbor searching and distance calculations. At the periodic boundaries we use modulo arithmetic to find the indexes of the neighbor cells. In two dimensions, the 24 neighbor cells are searched. If there are objects in the domain, we first place equally spaced nodes on the object boundary. Any sampled points that are outside of the domain are rejected. For non-periodic flow cases we have relied on the point cloud generator described in Ref. [60]. This generator provides node sampling according to a target density function, for example, for placing nodes more densely near boundaries. For simple domains (e.g., rectangles) we can use the classic stretching functions such as the sine and inverse tangent ones. Herein, we have decided to use a signed distance function (SDF) approach adapted from the distmesh generator. The signed distance function returns the shortest distance from any given point to the boundary of the object at a given point. For geometric shapes such as squares, rectangles, circles, and ellipses, the SDFs are known analytically and can be combined using the operators from constructive solid geometry to build complex shape domains. The inside and outside regions of the domain differ in sign, while the boundary is located precisely on the zero level set curve (in 2-d). Using a given SDF we can prescribe a nodal density function according to: where r min is the minimal allowable distance between nodes (e.g., the spacing at the boundaries), and r max is the maximal allowed spacing, for a domain with characteristic dimension L. Afterwards, the generated points are organized into a k-d tree for fast spatial searching. For two-dimensional periodic domains, the points can be mapped to the surface of a torus and the neighbor searches are then performed in three-dimensional space [55]. Since the torus mapping is only locally distance preserving, the resulting support sets are different from what we would expect if the neighbor searches were performed in 2-d. The partial differential equations should then also be mapped to the surface and solved using the three-dimensional mapped operators. Instead, we have decided to apply a 2-d tiling strategy. First, we create a tiling of the initial periodic point set that is translated in the axis-directions (in 2-d we need three tiles if a single axis is periodic and nine tiles if the domain is periodic in both). Spatial neighbor searches are performed only for the center tile. We recover the nodal indexes in the center tile by taking the remainder in the division of the tiled index with the number of nodes in the center tile. Bear in mind that when inter-nodal distances are calculated using only the node positions in the original (center) tile, a correction is needed due to the periodicity.
Finally, to improve the conditioning of the generated node sets we apply an annealing process based upon a repulsive force between nodes similar to the ones described in [60,61]. An example point cloud generated by the periodic Poisson disk sampling and the following annealing procedure are shown in Figure 1.

Results
For the verification and validation of the SF-OLBM we have used three flow benchmarks: the Taylor-Green vortex flow, lid-driven cavity flow, and the flow over a periodic square array of cylinders. These benchmarks were selected to demonstrate the beneficial properties of our approach including the adjustable spatial accuracy and the ability to use irregular point clouds.
Unless stated otherwise, the constant part of density was set to ρ 0 = 1, the minimal spacing between points was h = 1, and the time step was chosen according to δt = h CFL/|c i | max , where CFL = 0.1. The ratio of ∆t/τ was then free to vary, depending on the viscosity ν = τ/c 2 s . For stationary flows we used the following convergence criterion: where the sum is taken over the nodes in the simulation domain.

Taylor-Green Vortex Flow
To test the accuracy of the proposed OLBM in the absence of boundary effects, we first apply it to Taylor-Green vortex flow in two dimensions. The analytic solution for this fully periodic flow is given by: for velocity, and for pressure. Here, u 0 is the initial velocity amplitude, k = (k x , k y ) is the wave vector, p 0 is a reference pressure value, ν is the shear viscosity, and t c = Ä ν(k 2 x + k 2 y ) ä −1 is the vortex decay time.
Using the macroscopic values from the formulas (17) and (18), the initial values of the distribution function at t = 0 are set to their equilibrium values f i (x, 0) = f (0) i (ρ a (x, 0), u a (x, 0)).

Spatial Errors on Scattered and Cartesian Point Clouds
We evaluate the spatial convergence on scattered (S) and Cartesian (C) point clouds for three stencil sizes (9, 13, and 21). Details of the approximations, along with their abbreviations, are provided in Table 1. For the RBF-FD stencils we use polyharmonic splines (PHS) augmented with polynomials (also known as PHS + poly). The MLS-based stencils use a polynomial basis and a Gaussian weight function. The scattered point clouds (see Figure 2b) were generated according to the procedure described in Section 2.4. Due to the irregular packing the scattered node sets contain fewer nodes than the Cartesian ones for the same domain size.  The computational domain is set to 0 < x, y < L for the domain sizes L ∈ {L 0 , 2L 0 , 4L 0 , 8L 0 } where L 0 = 32. We consider only the case with unity aspect ratio k x = k y = 2π/L. The kinematic viscosity is set to ν = 0.01 and the initial velocity on the coarsest grid is chosen as u 0 = 0.01 in order to limit the effect of compressibility errors. The simulations are run until t = − ln (0.01) t c , when the initial wave amplitude has decayed by two orders of magnitude. Measurements of the error norms (see Equation (19)) and maximum velocity (wave amplitude) are performed every 100(L/L 0 ) 2 /CFL time steps for times t > − ln (0.1) t c . The delay before starting measurements is to allow any remaining inconsistency in the initial condition that could lead to oscillatory behavior of the solution to die out. Following the procedure outlined in [22] (cf. Section 5.1, p. 518), we measure the viscosity of the fluid ν m by fitting a linear function to the logarithm of amplitude values, ln (|u| max ). The slope of the fit corresponds to the viscosity coefficient.
Using the set viscosity ν, we determine the relative error in viscosity, ER ν = |ν m − ν|/ν, as a function of the number of nodes N. The results are plotted in Figures 3 and 4 for the Cartesian and scattered node sets, respectively. Additionally, we show the L 2 -norm of velocity at the final time t = − ln (0.01) t c calculated as:  For Cartesian node layouts, in Figure 3 we observe second order convergence behavior for both the MLS and RBF-FD discretizations. Interestingly, the viscosity and velocity errors of the MLS discretization are one order of magnitude lower than the RBF-FD ones. A modified equation analysis of the discrete Lax-Wendroff operators to identify the leading error terms could shed more light on this result. For the PHS + poly RBF-FD method, larger stencil sizes lead to smaller errors in agreement with previous findings [61,62]. At a fixed polynomial degree of the RBF-FD, the polynomial terms take over the initial terms in the Taylor expansion of the approximated function, while the additional RBF terms act on the remaining error, thereby slowly decreasing the error with an increase in stencil size. For the MLS approximation, the effect of stencil size is less dramatic, although a more definitive study would also need to consider the weight function influence [48].
For scattered point clouds the spatial convergence results are shown in Figure 4. Compared to the calculations on Cartesian grids, the results are here markedly different. The residual errors of the MLS and RBF-FD approximation now display similar magnitudes. With large N the residuals of RBF-FD decrease faster than the MLS ones, giving an advantage to the PHS + poly RBF-FD when used on irregular point clouds.

Comparison with Other Methods
In Figure 5, we compare the spatial and temporal accuracy of the SF-OLBM with other off-lattice approaches, including the general characteristic-based algorithm of Bardow [12] and the DUGKS method [63]. Values of the L 2 -norm (19) were extracted from [63] (see Table 1 and  The spatial accuracy measurements were performed at four increasing grid sizes with a fixed time-step ∆t = 2τ to keep the temporal errors small. For fair comparison, in Figure 5 (left) we have only included the measurements of MLS and RBF-FD with a stencil size of nine points, equal to the finite volume stencils used in the original source [63]. The MLS accuracy on the Cartesian node layout compares favorably with the DUGKS method, given that the latter is twice as expensive. As already found earlier, the RBF-FD set-up leads to poor results on regular grids.
For temporal accuracy, a grid of size 64 × 64 was used while the ratio ∆t/τ was varied at fixed τ. As can be seen from Figure 5 (right), the accuracy of MLS is close to that of the method of Bardow, while the errors of the RBF-FD stencil with nine nodes are again larger. For the highest measurement point ∆t = 50τ the RBF-FD diverged (also for larger stencil sizes) indicating insufficient dissipation at large ratios of ∆t/τ. Interestingly, the MLS error shows non-monotone behavior with a local minimum appearing close to ∆t ≈ 2.2τ. A cancellation of leading error terms (including those from the collision step) is a possible explanation. Similar minima might also occur for the DUGKS and Bardow methods but are missed by the coarse sampling used in [63].
As a rough performance estimate, we also give the computation times needed for 10,000 evolution steps measured on a single core of an Intel(R) Core(TM) i7-7700K CPU @ 4.20 GHz processor. All computations were performed using 64-bit double precision floating point numbers. The results for different mesh and stencil sizes are summarized in Table 2. We find the performance of the SF-OLBM to be comparable to or slightly better than other available reports [41,63]. Due to different hardware, amount of optimization effort, compiler flags, data structure memory layout and other computational issues, the timings should be approached with caution.

Lid-Driven Cavity Flow
The lid-driven cavity flow is a classic benchmark problem for numerical schemes in CFD. The popularity of this benchmark problem is due to its simple setup, consisting of a square cavity with side length L. The lid of the cavity moves with constant velocity U w , while the remaining walls are kept stationary. The Reynolds number of the flow is Re = U w L/ν with ν being the viscosity of the fluid. Although geometrically simple, the flow in the cavity is non-trivial, with the existence of thin boundary layers, and secondary vortices in the corners, that develop as the Reynolds number is increased. Moreover, a numerical singularity exists at the top corners of the cavity that can trigger the instability of the numerical scheme. Here, we use the lid-driven cavity flow to evaluate the spatial accuracy of the proposed OLBM at steady state and also to demonstrate the advantages of using irregular point clouds.

Spatial Accuracy on Regular Cartesian Grids
Here, we look at the spatial convergence of the SF-OLBM on a regular Cartesian grid. We use MLS including polynomials up to second order on 12-node nearest neighbor stencils. The settings are justified by the low error observed in the Taylor-Green benchmark (see Figure 4b). We fix the cavity length to L = 1, CFL number to 0.1 and the ratio ∆t/τ =≈ 5.1. Calculations are performed for regular grids of size 65 × 65, 129 × 129, and 257 × 257. The Reynolds number is set to Re = 400, with the lid velocity fixed at U w = 1, and viscosity ν = 0.025. On the coarsest grid we set Ma = 0.05 and halve it with each increase in grid size by modifying the speed of sound c s . Although the Mach number varies, the set up corresponds to an acoustic scaling Ma ∼ ∆t/∆x. Figure 6 shows the steady-state horizontal velocity profile along the vertical line through the center of the cavity for different grid sizes. The results are compared to the benchmark solutions of Ghia et al. [64] obtained by solving the Navier-Stokes equations. The OLBM results from the 129 × 129 grid are already in close agreement with the benchmark solutions. Analogously, the vertical velocity profiles along the horizontal center line are shown in Figure 7. Again, we observe good agreement with the benchmark results as the grid size is increased.  The flow pattern in the cavity can be visualized with the help of streamlines, that is, contours of the stream-function ψ. These can be obtained as the solution to the Poisson problem: where ∆ is the Laplace operator and ω = ∇ × u is the vorticity. The stream-function must also satisfy the homogeneous boundary conditions ψ = 0 at the four walls. Upon reaching steady-state, we use the pre-calculated derivative coefficients to set up a linear system of equations for the unknown stream-function values at the nodes, and calculate the vorticity values appearing on the right-hand side. The sparse system of equations is solved using the BiCGSTAB method available in the Eigen library. The calculated stream functions for three different Reynolds number are displayed in Figure 8 and agree visually with previous results [28,[63][64][65]. The calculations are performed using MLS approximation with second order polynomials on a 80 × 80 grid and a fixed CFL value of 0.8 (for Re = 1000 we use a CFL value of 0.4).

Irregular Point Cloud Calculation
One of the main benefits of the proposed OLBM is the opportunity to vary the density of the nodes in space. This means nodes can be placed in areas where they are needed, for example, boundary or internal layers. In standard LBM the grid spacing is constant, and determined by the smallest existing flow structures, meaning that the calculations are often over-resolved elsewhere. By varying the node density, we can achieve similar global accuracy with a smaller number of nodes. Unlike the multi-scale LBM, which uses a hierarchy of successively refined grids that demand specialized data structures, in the mesh-free setting we are able to use smoothly-varying nodes with practically no changes to the computational algorithm.
The actual point generation is performed using the fast point cloud generator of Slak and Kosec [60], followed by 100 steps of a relaxation procedure. The resulting point cloud with N = 16,298 nodes is shown in Figure 9. Each edge was discretized with 200 nodes; if a regular grid were used this would give a total of over 40,000 nodes, meaning a more than two-fold reduction was achieved. The RBF-FD stencil used the 21 nearest neighbors, quintic polyharmonic splines (r 5 ) and polynomials up to second order. Calculations were performed for Re = 400. The remaining parameters are chosen as Ma ≈ 0.04 and CFL = 0.1.
The remaining mismatch between velocity cross-sections and benchmark results in Figure 9 (right) being attributed to the higher dissipation of irregular grids. This is obvious when compared with the Cartesian results from the N = 129 2 = 16,641 grid (see Figures 6 and 7). A reduction in the amount of dissipation could be achieved with tuning of the RBF-FD (or MLS) approximations by including higher-order polynomials or changing the RBF (or weight function in MLS).

Flow over a Periodic Square Array of Cylinders
In this section, we study the boundary condition convergence behavior for Stokes flow through a periodic square array of cylinders. Similar benchmark computations have been used previously in [30,66,67] on lattice Boltzmann boundary conditions and the behavior of numerical error with bounce-back methods.
The set-up for this benchmark follows the descriptions in Refs. [67,68] and consists of a single cylinder of radius R in a square unit cell of length L. For the case of linear flow, analytic expressions are available for the dimensionless permeability k * , defined such that the force per unit length of cylinder in the flowing medium is: whereŪ is the average flow rate through the unit cell, and µ is the dynamic viscosity of the fluid. The dimensionless permeability k * = k * (χ) depends on the solid volume fraction χ = πR 2 /L 2 . Note that, at intermediate values of χ (e.g., between 0.2 and 0.5) away from the dilute (χ 1) and lubrication-type (χ → χ max = π/4) regimes, a small change in χ (or equivalently the radius of the cylinder) can lead to a ten-fold increase in hydrodynamic force, making this a sensitive test for the correctness of the implementation.
For simplicity, the average flow rate in Equation (21) is calculated as the volume average of velocity values in x-direction, that is, Note that the sum above is only a crude integration rule; better results can likely be obtained with more purposefully placed integration points. The force in Equation (21) contains contributions from a pure frictional force F D and a buoyancy force F B due to a pressure gradient. In the computations, the flow is driven instead by an external force b = (b x , 0) and the buoyancy force is given by F B = πR 2 b x [30,69]. The drag and lift forces are obtainable from the the surface integral of traction along the cylinder boundary, where σ(θ) is the stress tensor and n = (cos θ, sin θ) is the unit normal vector pointing from the surface into the fluid. The stress tensor is a combination of the viscous stresses and an isotropic pressure part, In LBM, the viscous part of stress tensor σ = σ αβ can be recovered from the nonequilibrium part of the distribution functions (up to third order) according to: Alternatively, the differentiation stencils can be used to compute σ = νρ 0 (∇u + (∇u) T ) from the macroscopic velocity field. However, the use of Equation (25) is preferred due to the locality of the expression [70]. Together, Equations (21)- (24) can be reorganized to compute the dimensionless permeability explicitly as: Unlike in regular LBM methods, where periodicity is enforced by simply reintroducing values exiting the domain on the opposite side, in the off-lattice method we enforce periodicity already during the construction of the derivative stencils and the calculation of weight coefficients. The point clouds are generated according to the method described in Section 2.4, starting from an initial set of nodes placed equidistantly on the cylinder's surface. We use the TRT collision model [20] with τ = 4/5 and Λ = 1/4. The body force is set to b x = 7.7 × 10 −7 and lowered if necessary to assure creeping flow. Upon reaching steady state, the Reynolds number based upon the average velocity in x-direction, that is, Re = 2R u x /ν and u x = ∑ x u x (x)/N, is checked to satisfy Re < 0.005.
First, we increase the resolution of the cylinder at fixed values of the solid fraction χ ∈ {0.3, 0.4} and measure the convergence of the dimensionless permeability k * in terms of the relative error: with respect to a reference value k * ref .
The convergence of the dimensionless permeability with respect to the cylinder's radius is displayed in Figure 10 and can be seen to approach a terminal value. Figure 11. shows the relative error (with respect to the value on the finest point cloud) as a function of the cylinder's radius with logarithmic axes. The convergence rate is slightly below second order. The results for unit cells of size 33 2 , 99 2 , 297 2 , and 352 2 lattice units are given in Table 3. From these results we can see that, while the dimensionless permeability values are converged to the accuracy of three significant digits, a small error remains in comparison to the reference semi-analytical solution [68]. Part of this error is likely attributable to the integration errors when calculating the mean flow rate using the point sum over the irregularly distributed points. By interpolating the results onto a Cartesian grid akin to a standard LBM one we have calculated a second set of voxelized error values, where this effect is expected to be reduced. A cubic interpolation method was used in order to prevent interpolation errors from interfering with the LBM ones. The permeability values measured from the voxelized solution are in close agreement with the reference values obtained with standard LBM and the bounce-back boundary condition (see Table 3).
Further work is necessary to evaluate the other families of LBM boundary conditions in the off-lattice setting including least-squares and moment based approaches that are also applicable to higher order lattices. Advanced nodal integration methods should be sought to reduce the errors in averaged quantities such as permeability.

Conclusions
In this work, we have introduced a strong-form off-lattice Boltzmann method (SF-OLBM) based upon the Lax-Wendroff discretization in time and an RBF-FD or MLS discretization of the spatial terms. For static grids, the discretization weights only need to be computed once at the beginning of the calculation and stored as a sparse matrix for use in time-stepping. The second order accuracy of this approach when using PHS + poly RBF-FD on various stencils has been shown for the Taylor-Green vortex flow. Higher accuracy can be achieved by increasing the polynomial order of the approximations with a related increase in stencil size. The moving least squares approximations delivered good accuracy on Cartesian grids. For irregular points, the existence of a turn-over point was found where RBF-FD was able to perform better.
The ability to employ local point cloud refinement was demonstrated for the lid-driven cavity flow and could increase the global accuracy with little extra computational effort. For coarse grids in the presence of boundaries we found some errors in mass conservation, which however, disappear with increasing node density and high order approximations. The flow around a periodic square array of cylinders was used to probe the accuracy of the method in the presence of curved boundaries and raised the issue of computing accurate averaged quantities.
A downside of the proposed SF-OLBM, shared with other Eulerian lattice Boltzmann methods, is the loss of locality and large computational cost associated with the derivation evaluation in the streaming step. This decrease in efficiency can, however, be off-set using higher order approximations and local point cloud refinement. The opportunity to employ a high-level coordinate-free programming style is an additional benefit that saves on development time. A comparison of the current Lax-Wendroff streaming versus other streaming methods, including the semi-Lagrangian LBM [71] and the popular DUGKS scheme [29], remains a topic of future interest.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: off-lattice Boltzmann method PHS polyharmonic spline PDF particle distribution function RBF-FD radial basis function-generated finite differences SF-OLBM strong-form off-lattice Boltzmann method TRT two-relaxation-time