Scrutinizing Dynamic Cumulant Lattice Boltzmann Large Eddy Simulations for Turbulent Channel Flows

: This technical paper outlines the predictive performance of a recently published dynamic cumulant lattice Boltzmann method (C-LBM) to model turbulent shear ﬂows at all resolutions. Emphasis is given to a simple strategy that avoids a frequently observed velocity overshoot phenomenon near rigid walls when combining the C-LBM with an all-resolution (universal) wall function. The examples included are conﬁned to turbulent channel ﬂow results for a variety of friction Reynolds numbers within 180 and 50,000, obtained on a sequence of isotropic, homogeneous grids that feature non-dimensional lattice spacings using inner coordinates from 4 to 2200. The results indicate that adjusting the near-wall distance of the ﬁrst ﬂuid node, i.e., the intersection of the wall with the ﬁrst lattice edge, to the resolution provides a reasonably simple, robust, and accurate supplement to the all-resolution C-LBM approach. The investigated wall function/C-LBM combination displays a remarkable predictive performance for all investigated resolutions. present paper proposes an adjustment of a universal wall function that comes at negligible computational expenses and favorably inﬂuences the near-wall cumulant expressions. To render the predictive beneﬁts, attention is devoted to a comprehensive range of bulk Reynolds numbers Re B ∈ [ 3000, 1.7 × 10 6 ] on grids between 4.5 × 10 5 and 27 × 10 6 nodes that feature resolutions of ∆ x + ∈ [ 4, 2200 ] . The results indicate that the uniﬁed model is able to capture all investigated ﬂows without further adjustments or user input. Future research could consider implementing the approach for complex geometries to predict separated engineering shear ﬂows.


Introduction
Significant progress has been achieved in engineering flow simulations. Nonetheless, a central issue in computational fluid dynamics research refers to modeling turbulent flows at high Reynolds numbers. The starting point of many efforts is the spatial and (or) temporal variations of the resolution properties of the discrete model, caused by either the flow dynamics and (or) local resolution differences due to limited computational resources. Related developments aim at adaptive, hybrid simulation methods that connect flow models ranging from classical, efficient statistical (Reynolds-) averaging strategies (RANS) to partially scale-resolving large eddy simulations (LES) or even fully resolved direct numerical simulations (DNS). This intention has promoted a range of grid refinement and hybrid flow modeling practices. Prominent examples of the latter involve all-resolution wall function approaches, e.g., [1][2][3], or hybrid RANS-LES methods, where the detached eddy simulation [4,5] and related improvements [6][7][8] are perhaps the most widespread industrialized strategies.
As an alternative to Navier-Stokes-based simulations, lattice Boltzmann methods (LBMs) have recently gained enhanced interest by the technical flow simulation community, e.g., [9][10][11][12][13]. The method computes the evolution of a set of particle distribution functions (PDFs) in space and time which are used to reconstruct the flow field [14]. The benefit of LBM refers to the separation of non-local and non-linear discretized expressions and the related algorithmic simplicity, which promotes very efficient parallel implementations on GPU and CPU systems [15][16][17][18][19][20]. Separating discretization aspects from flow physics modeling is more challenging in an LBM than in a Navier-Stokes framework since both aspects are related to the collision model that relaxes the PDFs towards their equilibrium state. To a large extent, the employed collision operator governs the predictive performance of an LBM simulation. An overview of recently suggested collision modeling strategies is beyond the scope of this paper and provided by Coreixas et al. [21]. The well-conditioned, parameterized cumulant collision operator, suggested by Geier et al. [22], transforms the PDFs into corresponding cumulant expressions, performs the relaxation cumulant space, and subsequently translates the result back. Variants of the cumulant LBM were very recently applied to a variety of turbulent flows, e.g., by [23][24][25]. It is particularly interesting for its predictive prospects to cover a range of resolutions with the same computational model [26]. The present authors have recently suggested two related modifications in an attempt to improve modeling turbulent flows with an identical numerical approach for a wide range of resolutions from DNS to very large eddy simulations (VLES) [27]. Both modifications are resolution-sensitive and vanish in the respective fine grid limits. The first one regularizes the higher (third-) order cumulant expressions and exclusively acts at high wave numbers [28]. The second suggestion refers to a local, cumulant-based dynamic subgrid scale model, which acts at lower wave numbers, gradually vanishes in the DNS limit, and provides the necessary subgrid scale contributions for very coarse resolutions. Along rigid boundaries, the modified cumulant model is coupled to the wall function of Asmuth et al. [29] in combination with the boundary condition of Yu et al. [30] to comprise a universal wall function; cf. Gehrke and Rung [27] for details. Figure 1 displays exemplary mean flow (a) results, supplemented by (averaged) resolved, viscous, and modeled turbulent shear stresses (b) obtained from the modified C-LBM approach for two simple channel flows at Re τ = H u τ /ν = 180 [31] and Re τ = 2000 [32]. Here, ν is the fluid viscosity, H refers to the half channel height, and u τ = |τ w |/ρ denotes the shear stress velocity, computed from the magnitude of the wall shear stress |τ w | and the fluid density ρ. (a) Comparison of predicted non-dimensional mean velocity profiles u + = u/u τ as functions of the non-dimensional distance y + = y u τ /ν returned by the modified C-LBM for N H = 24 along the channel half height (symbols), the DNS data of Kim et al. [31] (Re τ = 180) and the DNS data of Bernardini et al. [32] (Re τ = 2000), indicated by solid lines. To improve the presentation, the mean velocity profile for Re τ = 2000 is shifted by ∆u + = 5 velocity units in vertical direction. (b) Evolution of (averaged) resolved (solid/circles), viscous (dashed/squares), modeled (dotted/diamonds), and total (black line) normalized shear stress contributions.
Both simulations are performed on a homogeneous, isotropic, Cartesian mesh using N H = 24 points per channel half height. The figure reveals a remaining issue of the modified cumulant model near the wall for Re τ = 2000, where a velocity overshoot phenomenon [27,29,33] is observed. For the higher Reynolds number, the resolved stresses suffer from the missing dynamics of the wall function, and the predicted wall-adjacent modeled stresses are too small. As a consequence, the gradient of the mean flow is overpredicted near the wall to compensate for the underestimated turbulent stresses. Similar results are displayed for other Reynolds numbers in [27] when the non-dimensional grid spacing exceeds ∆x + = ∆x u τ /ν 20. Other recent publications, e.g., [34][35][36][37], that cover developments in the simulation of turbulent channel flows at various Reynolds numbers, applying different collision operators, wall function approaches, and turbulence models, also highlight the relevance of the overshoot issue and prove that this is an active field of research.
The present technical paper tries to convey the merits of a simple (virtual) wall positioning method that adjusts the location of the wall between the solid node and the first fluid node to the resolution and thereby neutralizes the overshoot phenomenon. Supplementarily, the C-LBM approach published in [27] is assessed for very high Reynolds numbers and coarse resolutions that reach down to 45 grid points between the top and bottom walls of a channel flow at Re B = H u B /ν ≈ 1.7 × 10 6 , respectively, where u B denotes to the average (bulk) velocity. Moreover, we also provide guidelines on meaningful resolution limits.
The remainder of the paper is structured as follows: Section 2 outlines the numerical model, summarizes the LES approach, and illustrates the test case in brief. Subsequently, Section 3 describes the suggested wall treatment modification and discusses the results in comparison to the standard approach. The final conclusions are drawn in Section 4.

Numerical Method
The study employs a graphics processing unit (GPU)-based lattice Boltzmann implementation for the simulation of incompressible fluid flows [15,[26][27][28]38,39]. To save space, the general approach is only briefly described, and the interested reader is referred to the rich literature on the fundamentals of LBM, e.g., [14,[40][41][42][43]. The discrete approximation comprises two steps, viz., Herein, f ijk are the dependent variables of LBMs and represent particle distribution functions (PDFs), which describe the probability of a particle located at position x x x and time t to move in the discrete direction e e e ijk . The latter denotes a directional speed matrix e e e ijk = c (i, j, k) , with {(i, j, k) ∈ N 3 0 | i, j, k ∈ {−1; 0; 1}}, and restricts the particle advection (2) with a constant lattice speed c to the immediate discrete neighbor nodes during a discrete time step ∆t.
As indicated by Figure 2a, the simulations rest upon a regular D3Q27 [44] stencil and employ a unit spatial (∆x) and temporal (∆t) spacing together with a unit lattice velocity [45] of c = ∆x/∆t = 1. The macroscopic flow properties are recovered from the PDFs by ρ = ∑ ijk f ijk and u u u = (u, v, w) = (∑ ijk e e e ijk f ijk )/ρ.

Unit Conversion
LBMs are commonly weakly compressible approaches and require the specification of a Mach number. Moreover, all properties are computed in non-dimensional form and require a conversion from physical SI (index SI) to non-dimensional LB units (without index). To this end, the Mach number is related to the (averaged) maximum, i.e., the centerline, velocity u C of the channel, which is approximately obtained from the bulk velocity via u C = 1.16 u B : Here, (c s ) SI is the speed of sound in SI units which converts to c/ √ 3 with c = 1 in non-dimensional LB units. The entrance parameter of all present simulations is the frictional Reynolds number Re τ . Applying (3) and Dean's [47] correlation (Re B ≈ 7.32 Re 8/7 τ ), the scaling factors Λ φ that transfer SI in LB units, i.e., φ = φ SI /Λ φ , read Hence, the simulation is uniquely specified by the SI values for H [m] and ν SI [m 2 /s], the discretization parameter N H , as well as the similarity parameters Re τ and Ma. As outlined above, the spatial and temporal step size as well as the lattice velocity in nondimensional LB units read ∆x = ∆t = c = 1. The non-dimensional shear viscosity follows from ν = ν SI /Λ ν .

Collision Model
During the collision step (1), the deviation of the incoming pre-collisional PDFs f ijk (x x x, t) from an equilibrium state f eq ijk is relaxed by the collision operator Ω ijk , which returns the post-collisional PDFs f * ijk (x x x, t). Different options are conceivable, which range from the simple single relaxation time (τ = 3ν/c 2 + ∆t/2) approach [48], i.e., , to more involved two- [49] or multiple-relaxationtime methods; see for example [50][51][52]. Some sophisticated approaches transform the PDFs prior to the collision and impose the relaxation in the transformed space. In the present study, the collision proceeds in cumulant [53] space, and the relaxation is performed for cumulant expressions C αβγ [16,22] that result from the transformation of the PDFs ( f ijk ) into cumulants (C αβγ ), viz., The sum of the Greek indices correlates with the cumulants' order and reaches from zero (C 000 ) to six (C 222 ), i.e., α, β, γ ∈ (0, 1, 2). Related details and an outline of the cumulant theory can be found in the seminal work of Geier et al. [16]. The transformation from PDF to cumulant space employs central moments that serve as an intermediate step, i.e., ( f ijk ) central moments cumulants (C αβγ ). The present implementation corresponds strictly to Appendix B in [27], and details are again omitted to save space. Equation (5) initially involves ten linear independent relaxation rates, which we denote by ω χ to simplify the content of this technical paper (details can be found in [16,27]). These non-dimensional rates need to be assigned to values in a realizable range of ω χ ∈ [0; 2]. The two lower-order relaxation rates are related to the shear (ω 1 ) and the bulk (ω 2 ) viscosity. In accordance with other approaches, the shear viscosity's related rate follows from The value for ω 2 is usually simply assigned to unity, which recovers the equilibrium state in (5) and improves the stability, cf. [16]. Following [22] and our previous work [27,28], we assign the rates ω {6,..,10} , addressing the fourth-, fifth-and sixth-order cumulants' relaxation, to unity, to improve the stability of the simulation. The remaining three third-order relaxation rates are parameterized according to the suggestion of Geier et al. [22] and related to their respective values ω 1 , i.e., These parameterized rates vanish in the high Reynolds number limit due to ω 1 → 2, which adversely influences the numerical stability [22,28]. Hence, the parameterization (7) needs to be regularized. The present regularization refers to the above mentioned suggestion published in [27]. Since the linear combinations of cumulant expressions are preferentially used for the relaxation, the regularization distinguishes between respective cumulant expressions ω C {3;4;5},i , viz., where the subscript 3 (4) refers to the summation (subtraction) of the cumulant expressions and C ω is a resolution-sensitive parameter, which is assigned to C ω = Re ∆x /(10 Ma) depending on the cell Reynolds number, Re ∆x = (u B ∆x/ν) SI = ∆x + Re B /Re τ , and the Mach number, Ma = 1.16 √ 3u B . For the in-depth details of the underlying collision step in the cumulant space, the reader is referred to Section II.E and Appendix B of [27].

Subgrid Scale Model
The simulation of significantly under-resolved turbulent flows utilizes a Smagorinsky approach to model the influence of subgrid stresses with an eddy viscosity, ν t = (C S ∆x) 2 S. The eddy viscosity involves the grid spacing ∆x, a Smagorinsky parameter C S [54] and a strain-rate measure S 2 = 0.5(∇ ∇ ∇u u u + u u u∇ ∇ ∇) · ·(∇ ∇ ∇u u u + u u u∇ ∇ ∇). Unlike the classical approach, we use a resolution-dependent formulation [27] and employ a dynamic Smagorinsky parameter that is linked to the third-order cumulant expression |C 210 + C 012 | and is parameterized by the cell Reynolds and Mach number to account for the resolution: The eddy viscosity enters an effective viscosity ν e = (ν + ν t ) that replaces the fluid viscosity in the definition of ω 1 , cf. Equation (6). It is to be noted that an upper threshold is set to C S = 20/Ma, which, e.g., comes into effect for Re ∆x ≈ 3250 or ∆x + ≈ 130 for Ma = 0.1. Moreover, the influence of the SGS on the parameterization of the relaxation rates ω {3,4,5} (7) is neglected, i.e., the classical ω 1 definition (6) is employed in Equation (7), since related differences are deemed negligible [27].

Suggested Model
The results displayed in this paper utilize all features addressed in Sections 2.1.2 and 2.1.3 in addition to the wall function modification outlined in Section 3. The resolution sensitivity of the suggested combination of regularization, the subgrid scale model, and the universal wall function allows us to activate these three features irrespective of the resolution quality without additional user input, i.e., for DNS, LES, and VLES simulations close to classical URANS.

Test Case and Parameter Space
The dimensions of the investigated channel are 6πH in streamwise (x), 2πH in spanwise (z), and 2H in wall-normal direction (y), cf. Figure 2b. Six Reynolds numbers are employed to analyze the capabilities of the C-LBM, i.e., Re τ = {180, 550, 2000, 5200, 20,000, 50,000}. The investigations are performed on various grids. They are characterized by the resolution of the channel half height with N H nodes and support investigating a large range of non-dimensional resolutions, i.e., 4 ∆x + = ∆x u τ /ν 2200. As indicated by Table 1, each Reynolds number is investigated on four grids that coarsen from dark green to red. The selected resolutions aim at an approximately equal change in ∆x + while also considering resolutions of particular interest. The Mach number is assigned to Ma = 0.1 in response to previous investigations [27,28]. Each simulation comprises 200 flow passes T = 6πH/u B through the domain. An initial phase of 50 T is followed by a moving average computation of the mean flow and the contributions of the viscous and modeled stresses, which usually also span 50 T. Once the mean velocity is converged, the Reynolds stresses are evaluated during the final 100 T.

Results and Discussion
The present results represent a follow-up study of a recently published paper [27], where the regularized dynamic cumulant lattice Boltzmann (LES) method summarized in Section 2.1 was applied to simulate turbulent channel flows on a homogeneous, Cartesian lattice. As indicated by selected results shown in Figure 1, fair average velocities, Reynolds stresses, spectra, and correlation lengths are reported for Re B ∈ [2800, 130,000] by the same LBM formulation in [27]. An exception refers to significant overestimated nearwall gradients observed for the higher Reynolds numbers, which might deteriorate the assessment of engineering quantities of importance, e.g., heat fluxes. The deficit is deemed to be related to the interaction of the wall function with the cumulant-based subgrid stress model. In conclusion, a simple correction is sought that alleviates these deficits.
To understand the rationale of the suggested correction, it is essential to note that the second fluid node serves as a reference location for constructing the wall function. The latter iteratively computes a consistent shear stress τ w in line with the velocity prediction. Thus, the related (instantaneous) shear stress velocity u τ = |τ w |/ρ) provides a valid < y + 2 , u + 2 > combination (with y + i = y i u τ /ν and u + i = u i /u τ ) at the second fluid node for a given wall distance y 2 and a simulated (instantaneous) velocity u 2 at this node [27].
We suggest manipulating the non-dimensional wall distance of the first fluid node y + 1 from its standard value of y + 1 = ∆x + /2 virtually, i.e., without changing the actual grid, cf. the sketch in Figure 3a. Since no deficits occur for an adequate resolution of the near-wall regime, no manipulation of the wall distance is needed for ∆x + ≤ 15. A comprehensive parameter study is performed for a wide range of Reynolds numbers Re τ ∈ [550, 50,000] to compile the appropriate wall location shift. As indicated by the symbols in Figure 3, a fair correlation for an adequate near-wall distance of the first fluid node reads On the one hand, this virtual shift of the y + -origin reduces the wall distance. On the other hand, the relative reduction decreases with increasing distance, and the manipulation substantially shifts the wall distance of the first fluid node to the left. At the same time, the reduced distance increases the third-order cumulant expression at the first fluid node inside the Smagorinsky parameter C S (9) roughly linearly for high Reynolds number flows, cf. Figure 4b and the derivation of the regularization term in [27]. Figure 3. Evolution of the suggested non-dimensional wall distance of the wall-adjacent fluid node y + 1 in normalized (a) and non-normalized (b) format, depending on the grid resolution ∆x + as described in Equation (10). Colored dots display evaluated discrete y + 1 optima of each Reynolds number and grid. Furthermore, the reasonable maximum resolution for each of the underlying Reynolds numbers is illustrated by colored dashed vertical lines.
This yields an increase in the eddy viscosity ν t = C 2 s ∆ 2 S that approximately scales with C 2 S , as indicated by Figures 4a and 5a. As indicated by cf. Figure 6a, the velocity gradient between the solid node and the reference second fluid node, i.e., the centered velocity gradient at the first fluid node, does virtually not change. The latter is easily understood by the example given in Figures 5 and 6, where |τ mod | = ν t S = C 2 S ∆ 2 S 2 increases by 24% in response to a 23% increase in the eddy viscosity and an 11% augmentation of C S . The unaltered strain rate at the first fluid node is crucial to control the modeled stress by the exclusive change in the third-order cumulant expression |C 210 + C 012 | 1 at the first fluid node in response to the shift in the wall location, while leaving the grid and the numerical approach unchanged.
It is also interesting to note that the influence of the suggested wall positioning modification is almost exclusively restricted to the computed eddy viscosity (and the cumulants) at the first fluid node, cf. Figure 4a, and therefore can be interpreted as a local adjustment of the subgrid scale model to the wall function.   Figure 1a and refers to the standard grid layout with y + 1 (∆x + ) = 0.5 ∆x + = y + 1 (∆x + , Re τ ). The green curve depicts results from the suggested y + 1 (∆x + , Re τ ) = 0.15∆x + (1 + 35/∆x + ) manipulation, cf. Equation (10). (b) The overshoot vanishes when the modeled shear stress increases due to an increase in the eddy viscosity, as displayed by the normalized (averaged) τ i partition pathlines. Figures 7 and 8 display the predicted non-dimensional mean velocity profiles of the three lower and higher Reynolds numbers, respectively. All results are obtained with the adjustment of the wall location according to Equation (10).
Since all wall-adjacent nodes are located above the buffer-layer for the higher Reynolds numbers displayed in Figure 8, the lower bound of the abscissa is set to y + 1 = 30 in order to emphasize the logarithmic layer and the wake flow. The characteristic of the logarithmic layer is in fair predictive agreement with the parameters found by Nikitin et al. [55] for turbulent channel flows up to Re τ = 80,000 (κ = 0.41 and B = 5.2). Non-dimensional mean velocity profiles for three lower Reynolds numbers. Solid lines refer to reconstructions of discrete reference data sets published by Kim et al. [31] for Re τ = 180 and Bernardini et al. [32] for Re τ = {550, 2000} as given in Gehrke and Rung [27]. The color codes of the symbols are given in Table 1 Table 1. To improve the comparison, profiles for Re τ = 20,000 [50,000] are vertically shifted by ∆u + = 4 [8].
Further analysis of the mean flow profiles reveals a sensitivity of the present C-LBM to predict wake flows [56] depending on the resolution. In line with previous LES or DES simulations, e.g., those of Cabot et al. [57] or Keating and Piomelli [58], the present C-LBM also portrays an attenuation of the wake flow when the resolution is coarsened. This is caused by a weaker reduction in the resolved stresses in the outer regime of the boundary layer, as illustrated in Figure 9. Mind that wake flow usually occurs beyond a critical Reynolds number, e.g., Re τ ≥ 1000 [59], and hence no substantial grid sensitivity is observed for the two lower Reynolds numbers in Figure 7, and no significant change in the reduction in resolved stress towards the exterior boundary layer region is displayed in Figures 9a,b when the grid is coarsened. The analysis of the wake prediction yields a heuristic criterion for the maximum grid coarseness through ∆x + max (Re τ ) = Re τ /10 , Re τ 10 4 ( to capture wake ) 10 √ Re τ , Re τ > 10 4 ( to capture wake and maintain the validity of (10) ) .

Conclusions
The paper scrutinizes the predictive performance of a D3Q27 cumulant-based LB method to capture turbulent channel flows. Emphasis is given to assessing a unified computational model utilizing adaptive, dynamic regularization and subgrid scale components based on cumulant expressions combined with a universal wall treatment. The approach aims at resolution independence and is based on a previously published suggestion of the authors [27]. The essential building blocks and main advantages of this strategy are (a) an alternative regularization that dissipates the kinetic energy of the large wave numbers and accurately computes flows with fine to moderate resolutions (∆x + 20) and (b) a third-order cumulant expression to formulate a dynamic Smagorinsky-type SGS model that acts on smaller wave numbers. Both elements are sensitized to the resolution through the cell Reynolds (spatial) and Mach numbers (temporal) and seamlessly vanish in the fine-grid limit.
A remaining deficit refers to overshoot issues due to the missing interplay between the wall function and the SGS model nearby the wall. Related misinterpretations of the averaged flow field might yield a substantially wrong prediction of the wall shear stress and heat transfer rate at higher Reynolds numbers of engineering interest. Depending on the local flow situation, the modeled shear stress contributions are up to 30% wrong in response to the overshoot phenomenon. The problem is particularly prominent for ∆x + 1 20. The present paper proposes an adjustment of a universal wall function that comes at negligible computational expenses and favorably influences the near-wall cumulant expressions.
To render the predictive benefits, attention is devoted to a comprehensive range of bulk Reynolds numbers Re B ∈ [3000, 1.7 × 10 6 ] on grids between 4.5 × 10 5 and 27 × 10 6 nodes that feature resolutions of ∆x + ∈ [4,2200]. The results indicate that the unified model is able to capture all investigated flows without further adjustments or user input. Future research could consider implementing the approach for complex geometries to predict separated engineering shear flows.