A GPU-Implemented Lattice Boltzmann Model for Large Eddy Simulation of Turbulent Flows in and around Forest Shelterbelts

: Using porous wind barriers for the microclimate modification of agricultural lands, urban areas, and surrounding roads is a ubiquitous practice. This study establishes a new method for numerically modeling the turbulent flow in and around forest shelterbelts using an advanced multiple-relaxation-time lattice Boltzmann model (MRTLBM). A detailed description is presented for a large eddy simulation (LES) of turbulent winds by implementing barrier element drag force in the MRTLBM framework. The model results for a forest shelterbelt are compared with a field observational dataset. The study indicated that our implementation of drag force in MRTLBM is an accurate method for modeling turbulent flows in and around forest patches. Sensitivity analyses of turbulent flow related to the shelterbelt structure parameters and wind directions are also carried out. The analysis indicated that the optimal wind shelter effect in reducing the mean wind speed and turbulent kinetic energy is maximized using a narrow, medium porosity shelterbelt, with the wind direction perpendicular to the shelterbelt. These conclusions are in agreement with other observational and modeling studies. Finally, the computational time of a central processing unit (CPU) and graphics processing unit (GPU) was compared for a large domain with 25 million grids to demonstrate the MRTLBM advantage of LES in regards to computational speed with a mixed forest and building environment. The GPU is approximately 300 times faster than a CPU, and real-time simulation for this large domain is achieved using the Nvidia V100 GPU.


Introduction
Natural and human-grown forest patches are prevalent landscape components in urban and complex terrains, playing an important role regarding the microclimate near the ground's surface.A forest canopy can affect wind, temperature, radiation sheltering, acoustic noise, and aesthetics in the atmospheric boundary layer.This study is focused on the effect of a forest shelterbelt on wind and the associated turbulent kinetic energy under a neutral atmospheric stability condition.The objective is to develop and validate a large eddy simulation (LES) model based on a multiple-relaxation-time lattice Boltzmann model (MRTLBM) for predicting the wind field and turbulence in and near the forest canopies.
The experimental and observational studies of forest wind barriers began early in the 20th century because forest shelterbelt usage in wind modification projects for agricultural and other purposes were common and effective.The wind shelter effect was summarized and analyzed from field observational results [1][2][3][4][5][6][7][8], most of which were focused on averaged steady mean wind speed because early anemometers were not able to reliably measure the turbulence.Some wind tunnel tests [9][10][11][12] were also carried out to simulate the wind barrier effect in different wind barrier arrangements over flat terrain.These observational studies and wind tunnel tests accumulated good knowledge and data for the validations of numerical models.Thom introduced the theoretical study and parameterization of the canopy drag force [13].This was an important step due to the implementation of the porous canopy drag force as a body force term in Navier-Stokes equations (NSE).The numerical method was further developed to simulate turbulent flow in and above the forest canopies for time and spatial averaging using the governing NSE [14][15][16][17].Two and/or three-dimensional NSE, based on numerical models using the Reynolds averaged simulation (RANS), were developed by many researchers [17][18][19][20][21][22].The steady RANS is usually a time-averaged model, parameterizing turbulence in the same manner in all scales, using the concept of balance of turbulent kinetic energy (TKE) production and dissipation.These RANS simulations capture the mean wind and turbulence intensity of a turbulent flow reasonably well when compared with steady field and laboratory data but are not able to simulate the transient structure of turbulence.LES models [23,24] resolve the turbulent eddies which are larger than the grid-scale and represent a major step forward in turbulence modeling due to their capability to simulate transient structures under unsteady flows.The LES model method parameterizes small sub-grid scale turbulence and explicitly resolves turbulent eddies that have greater length scales than those of the computation grid.However, LES models usually use a large number of computation grids and require more computational resources when compared with the steady RANS model.In recent years, NSE-based LES models were employed by several researchers [25][26][27][28][29] for turbulent flows in and around wind barriers.
The lattice Boltzmann method (LBM) for modeling the fluid dynamic problem was initiated in the early 1990s [30][31][32].Unlike traditional computational fluid dynamics methods that numerically solve continuum-based macroscopic flow velocity variables (ρ, u, v, w) in NSE, the LBM is a kinetic, particle-based mesoscopic method for solving the Boltzmann equation in which the particle probability distribution function (PDF) is the control variable.The LBM solves the particle PDF via collision and streaming for energy and momentum propagation in fluid at each lattice node.The macroscopic fluid variables (e.g., density [ρ] and velocity [u i ]) are derived from the zero, first-order statistical moments of the PDF.The LBM has several advantages, including the simplicity of its code development, its easier implementation of complex boundary conditions, and its intrinsic parallelism for massively parallel computing.In NSE-based RANS or LES models, the nonlinearity advection terms are usually computed by several neighboring grid points, depending on the order of accuracy.This makes parallel processing more difficult, and the pressure equation in the incompressible flow in NSE-solvers is a Poisson equation that takes many iterations to solve for each time step.With the LBM, the nonlinear processes are handled in an individual computation grid, making parallel processing more efficient.Furthermore, the dynamic pressure field is derived from a simple diagnostic relationship using the density fluctuation in LBM.Details about the history, fundamental theory, and methods of LBM are available in the literature [30][31][32].The LBM has undergone significant developments in the last three decades.An LBM based on the Bhatnagar-Gross-Krook collision [33], called LBGK, was the first to be developed and subsequently, the most popular method implemented at the beginning of the LBM development era [34,35].Using LBGK, the collision process is performed with a single-relaxation time parameter.While LBGK is an accurate method for solving low-Reynolds number laminar flow problems, it is not numerically stable and is inaccurate for simulating a high-Reynolds number turbulent flow.More recently, several varieties of LBM methods have been developed to remedy this critical problem [36][37][38][39][40][41].
We used the multiple-relaxation-time Lattice Boltzmann method (MRTLBM) developed by d'Humières et al. [39,40].In MRTLBM, the fluid particle collision processes are performed in a moment space, and the PDF moments are relaxed at different rates to achieve better accuracy and numerical stability for turbulent flow.This method has been demonstrated to be superior for simulating turbulent flows over single-relaxation LBGK, in which collisions are represented with a single-relaxation rate [40,42].In a recent analysis of Chai and Shi [43], they presented a new MRT-LBM with a block-lower triangular-relaxation matrix obtained from the discrete velocity Boltzmann equation through some proper discretization of time and space.Since the Reynolds number in atmospheric turbulent flows is usually large, it is critically important to use the MRTLBM or other improved LBMs for atmospheric boundary layer flow modeling.Other types of advanced LBMs have been developed for simulating turbulent flows in an infinite homogenous forest canopy over flat terrain [44,45].To our knowledge, this is the first effort to develop and validate the capability of the MRTLBM in the simulation of a forest shelterbelt.We have developed an atmospheric boundary layer environment model using the MRTLBM, using this model to simulate the laboratory case of stratified turbulent flows over a ridge terrain [46] and around and overhead urban buildings [47,48].The results show that this is an accurate and fast model for urban turbulent flow modeling using a graphical processing unit (GPU).In this continuation of our research, we extend the model's capability for the flow in and around porous forest canopies.

The ABLE-LBM Model
The numerical model used in this study is a newly developed microscale atmospheric boundary layer model that is called the atmospheric boundary layer environment-lattice Boltzmann model (ABLE-LBM).It is a four-dimensional prognostic LES model based on an MRTLBM [39][40][41][46][47][48].It is used for fine-scale (temporal: seconds; spatial: meter to tens of meters) atmospheric boundary turbulent flows and the associated transport and dispersion of temperature, moisture, and other scalars (aerosol, smoke).
Instead of numerically solving the NSE, the LBM is a mesoscopic method that tracks the PDF of air parcel (or fictitious particle) motion and collision in certain prescribed sample directions that connect to neighboring computation grids.It can start with the evolution equation of the MRTLBE coupled with the forcing term [39][40][41][46][47][48] where f( → x, t) is a single particle PDF in phase space; Ω α is the collision operator; c is the particle velocity; F is the external forces (including buoyancy and drag forces).The collision is the process of relaxing to the Maxwell-Boltzmann equilibrium distribution, f eq .In the LBM of the 3D flow problem, there are different lattice models-D3Q15, D3Q19, D3Q27where Q denotes the number of lattice directions.In the ABLE-LBM, we chose 19 lattice directions for the parcel collision and streaming.The Taylor expansion of the f eq can be written in the following form by neglecting the third order terms: where u i is the fluid macroscopic velocity; c s = c/ √ 3 is the lattice sound speed; and c α is the lattice speed in 19 directions, expressed in our model as: The w α is the weighting factor in the lattice directions: In the LBGK, only single-relaxation time is used to characterize the collision effects.It is unrealistic to represent the collision to equilibrium in the same relaxation for all the physical modes (e.g., momentum and energy) [41].In the MRTLBM, a collision matrix with different eigenvalues or multiple-relaxation times are used for different physical modes.The MRTLBM equation of the Boltzmann Equation (1) with α discrete particle velocities can be written in the following vector form with multiple-relaxation time [40][41][42]: where x and t are space vector and time, respectively, and boldface denotes column vectors with α lattice directions.We used a D3Q19 lattice model, which has 19 lattice directions (α = 0, . .., 18).The PDF f, for example, is written in the following column vector: The F is the moment of the forcing term in moment space [31]; F = [I − S/2]MF α , where F α is the body force vector; S is the diagonal relaxation matrix.The column vectors m and m eq are the velocity moment vector and its equilibrium vector, respectively; M is a 19 × 19 matrix, which linearly maps the velocity space to the velocity moment space [40], m = Mf, f = M −1 m.The moment vector components can be interpreted as the hydrodynamic quantities and their fluxes.The moment vector is arranged in the following order, with each component related to a physical property of the fluid: where ρ ′ is the density fluctuation; e is related to energy; ε is related to energy squared; j x , j y , and j z are the components of flow momentum; (j x , j y , j z ) = ρ 0 → u , q x , q y , q z are the third order moments corresponding to energy fluxes in x, y, and z directions, respectively; p xy , p yz , and p zx are related to the components of the symmetric and traceless strain-rate tensor; π xx and π ww are fourth order moments; and m x , m y , and m z are the third order moments.The orthogonal transformation matrix M is derived using the Gram-Schmidt orthogonalization procedure [40].Since M is an orthogonal matrix, its inverse M −1 can be easily determined by multiplying it by its transpose.There are four conserved quantities in this MRTLBM; these are the density ρ ′ and three components of momentum j x , j y , and j z .By using the following decomposition of density, the compressibility effect of the model and round-off error are reduced, as follows: The equilibrium function vector m eq for the other non-conserved moments are given as follows [39]: The parameters in Equation ( 9) are chosen to optimize the linear stability of the MRTLBM ω ε = ω xx = 0 and ω εj = − 475 63 .With the equilibrium moments above, the speed of sound c s in the unit of lattice speed is c ≡ dx dt = 1.The diagonal relaxation matrix S for the collision process in the moment space is S = diag(0, s 1, s 2, 0, s 4, 0, s 4, 0, s 4, s 9, s 10, s 9, s 10, s 13, s 13, s 13, s 16, s 16, s 16 ), (10) where the relaxation rates of the non-conserved moments are in the range of (0, 2) and are specifically related to the shear viscosity (ν) and bulk viscosity (ζ) as s 1 = 2/(9ζ + 1) and s 9 = s 13 = 2/(6ν + 1).(11) In this simulation, we use the following relaxation parameters for optimized stability [39,40]: All the model parameters and their physical representations, such as the relaxation coefficients matrix S, moment vector (m), equilibrium moment vector (m eq ), and the transformation matrix (M) are described in the work of d'Humières et al. and Lallemand and Luo [40,41].More detailed descriptions of the ABLE-LBM and its associated MRTLBM are available in the work of Wang et al. [46][47][48].

Modeling of Porous Element Drag Force in LBM System
Thom [13] reported that under the assumption of a high-Reynolds-number turbulent flow, the skin friction part of drag force is small and may be neglected.The vegetation drag force is mainly due to the pressure from drag, and its components in a Cartesian coordinate can be computed from a parameterized formula: where C d is the nondimensional drag coefficient; A is the canopy vegetation element density; U is the total wind speed; and u i respresents the velocity components in three Cartesian directions.It is important to note that the C d here is an averaged value and is often derived from the laboratory wind tunnel data.C d is a dynamic coefficient and is related to many factors, such as the shape and arrangement of canopy elements, as well as the Reynolds number.The magnitude of C d for a shelterbelt has a wide range of variation in the literature [5,7,9,21,28]; it is quite different for the infinite and homogenous vegetation canopies in other canopy models.In this study, we use C d = 1.6, which is derived from a wind tunnel test specifically employed for the observation of shelterbelt flow [49].In the shelterbelt case, which is usually thin in width, the greatest drag force or momentum sink is due to the canopy, causing dynamic pressure difference at the windward and lee sides of the shelterbelt.In the modeling of thin artificial wind barriers, it is customary in shelterbelt modeling to use a resistance coefficient or pressure-loss coefficient k r , in which the resistance coefficient is related to the drag coefficient and vegetation density in the following equation [17,21]: where W sb is the shelterbelt width.This equation will be useful for comparing our results with observational and other modeling results.Since the LBM lattice has 19 directions in ABLE-LBM, the drag force is then distributed in each lattice directions, according to Ref. [50], as follows: where u i are the macroscopic wind velocity vector components computed from the previous step.The F α is used in Equation ( 15) for the drag force effect in each lattice direction.The macroscopic air density and velocity are computed as follows: where p is the dynamic pressure.The pressure is computed diagnostically from the density, which is another advantage of the LBM over the NSE-based model.

Sub-Grid Turbulence Parameterization
The Lily-Smagorinsky turbulence model [23,24] is used for the sub-grid turbulence in this research.It uses the eddy viscosity that is related to the local strain-rate tensor for Reynolds stress parameterization, analogy to the molecular momentum diffusion determined by molecular viscosity.The equation for total momentum diffusion at sub-grid scales is determined by a total effective viscosity (ν total ) that includes molecular viscosity (ν) and turbulent viscosity (ν t ): where C is a constant, which is taken as 0.2; ∆ is the filter size, taken as the grid size; and S is the magnitude of the filtered strain-rate tensor: where , and u i represents the macroscopic wind vector components computed from the previous step.The strain-rate tensor components can be computed directly from the nonequilibrium moments in the LBM [51].The strain-rate tensor, S ij , is computed using the equations from: 2ρ 0 dt m 13,14, Here, ρ 0 is the background fluid density, and s i, and m i are the multiple-relaxation coefficients and moments components, respectively.For a more detailed description of the sub-grid turbulence modeling, see Wang et al. and Yu et al. [46,51].
There is an effect of the forcing term on computing the strain-rate tensor [52].Since this implementation of MRTLBM includes a drag force term of vegetation elements, a small error in the strain-rate term could be caused by using Equation (19).Our flow regime is highly turbulent compared with the results of previous analyses [52].It is worthwhile to investigate this effect in future studies.

Simulation of a Forest Shelterbelt
The first validation case uses the data from a field observation by Kurotani et al. [6] in Izumo, Japan.In their study, an inflow reference tower and five towers on the lee side of the shelterbelt were installed for mean wind and turbulent kinetic energy profiles using 3D sonic anemometers over a flat terrain.Figure 1 shows the observational setups with a reference tower mounted at three levels for the inflow conditions at x = −5 H.The other dimensions of the shelterbelt were 74 m in length and 2 m in width.There is a trunk area in the shelterbelt without leaf elements, ranging from the ground level to 1.2 m high.The five profiles (P 1 -P 5 ) were located at distances of 1 H (7 m), 2 H (14 m), 3 H (21 m), 4 H (28 m), and 5 H (35 m), respectively, on the lee side of the shelterbelt.Four levels of observation at heights of 1.5 m, 3.0 m, 4.5 m, and 6.0 m were carried out at each observational profile (P 1 -P 5 ).
dimensions of the shelterbelt were 74 m in length and 2 m in width.There is a trunk area in the shelterbelt without leaf elements, ranging from the ground level to 1.2 m high.The five profiles (P1-P5) were located at distances of 1 H (7 m), 2 H (14 m), 3 H (21 m), 4 H (28 m), and 5 H (35 m), respectively, on the lee side of the shelterbelt.Four levels of observation at heights of 1.5 m, 3.0 m, 4.5 m, and 6.0 m were carried out at each observational profile (P1-P5).The data selected for the model validation was obtained from 12:30 to 14:50 on 20 December 2000 [6].The atmospheric stratification condition was neutral, and the wind was basically perpendicular to the shelterbelt, with small deviation from −10° to 0° from the perpendicular line.The inflow mean wind speed was quite stationary at approximately 5.5 m/s at the height of 5.36 m.The LES simulation requires not only the observed mean wind profile of the approaching wind, but also the turbulent fluctuations for the inflow boundary condition.We used a simple synthetic algorithm [48] for the turbulence part of the inflow that agreed with the observed profile.The inflow boundary profiles are shown in Figure 3.The left panel shows the inflow profiles of the observed mean wind speed and turbulent kinetic energy (TKE) at the reference tower P0.The computed mean wind speed and TKE at the same point (P0) are also plotted in the right panel of Figure 3 to assure that the inflow boundary conditions agree with the observations.The mean wind profile at the reference point also agreed with the neutral logrithmic wind  The data selected for the model validation was obtained from 12:30 to 14:50 on 20 December 2000 [6].The atmospheric stratification condition was neutral, and the wind was basically perpendicular to the shelterbelt, with small deviation from −10 • to 0 • from the perpendicular line.The inflow mean wind speed was quite stationary at approximately 5.5 m/s at the height of 5.36 m.The LES simulation requires not only the observed mean wind profile of the approaching wind, but also the turbulent fluctuations for the inflow boundary condition.We used a simple synthetic algorithm [48] for the turbulence part of the inflow that agreed with the observed profile.The inflow boundary profiles are shown in Figure 3.The left panel shows the inflow profiles of the observed mean wind speed and turbulent kinetic energy (TKE) at the reference tower P 0 .The computed mean wind speed and TKE at the same point (P 0 ) are also plotted in the right panel of Figure 3 to assure that the inflow boundary conditions agree with the observations.The mean wind profile at the reference point also agreed with the neutral logrithmic wind profile, u = (u * /k)ln[(z + z 0 )/z 0 ], where u * = 0.34 m/s is the friction wind speed; k = 0. Note that the H is the shelterbelt canopy height in this case, and logrithmic profile is assumed for domains higher than 1.28 H (>9 m) where the observation was not available.
Figure 4 displays an instantaneous wind field of vertical (a) and horizontal (b) cross section results at t = 800 s of the model simulation results, just to show the characteristics of a turbulent wind field.For a clearer view of the turbulent variations, note that the scale of the vertical axis of (a) is not the same as the horizontal axis.The LES model resolves the larger scale turbulent eddies which are greater than the grid length, while the turbulent eddies smaller than the grid length are modeled using the Lily--Smagorinsky parameterization [23,24].The vertical cross section shows a typical shelterbelt zone, as illustrated in Judd et al. [9] from a wind tunnel study; a bleed flow (x = 0-2 H, Z = 0-1 H), where the turblent fluctuations are dissipated by the canopy; a displacement zone (x = 0-8 H, z > 1 H), where wind speed is greater than the upstream wind; a quiet zone (x = 2-8 H, z < 1 H), where wind speed is greatly reduced; and a recovery zone (x = 8-20 H, z = 0-2 H), where the wind gradually returns to the upstream state.The recovery zone displayed more wind variations because the upper-level higher energy flow penetrated toward the ground level via Helmoholtz instability.Generally, the turbulent eddies are elongated in the mean flowing (x) direction compared with the vertical direction.The lee wake area is still evident at x = 21 H, which qualitatively agrees with many previous studies [5,9,19,21,25,28].The leeside wakes are up to 2-3 H vertically at a distance of 9-18 H in the x direction.For a clearer view of the turbulent variations, note that the scale of the vertical axis of (a) is not the same as the horizontal axis.The LES model resolves the larger scale turbulent eddies which are greater than the grid length, while the turbulent eddies smaller than the grid length are modeled using the Lily--Smagorinsky parameterization [23,24].The vertical cross section shows a typical shelterbelt zone, as illustrated in Judd et al. [9] from a wind tunnel study; a bleed flow (x = 0-2 H, Z = 0-1 H), where the turblent fluctuations are dissipated by the canopy; a displacement zone (x = 0-8 H, z > 1 H), where wind speed is greater than the upstream wind; a quiet zone (x = 2-8 H, z < 1 H), where wind speed is greatly reduced; and a recovery zone (x = 8-20 H, z = 0-2 H), where the wind gradually returns to the upstream state.The recovery zone displayed more wind variations because the upper-level higher energy flow penetrated toward the ground level via Helmoholtz instability.Generally, the turbulent eddies are elongated in the mean flowing (x) direction compared with the vertical direction.The lee wake area is still evident at x = 21 H, which qualitatively agrees with many previous studies [5,9,19,21,25,28].The leeside wakes are up to 2-3 H vertically at a distance of 9-18 H in the x direction.
By taking the time average of the model output results at each grid point, the LES model results can be compared with the observational data [6], which showed the time average for a period of stationary turbulent flow.The time average for each component separates the wind component into average and turbulent fluctuation parts: where u i = 1 T T 0 u i dt, and TKE is computed as: where σ 2 i represents the variations in the three velocity components.By taking the time average of the model output results at each grid point, the LES model results can be compared with the observational data [6], which showed the time average for a period of stationary turbulent flow.The time average for each component separates the wind component into average and turbulent fluctuation parts: where  =   , and TKE is computed as: where  represents the variations in the three velocity components.Figures 5 and 6 are the plots of the comparisons of the average mean wind profiles and TKEs, respectively, at the observational points denoted in Figure The simulation requires some time for the inflow information to proceed through the entire simulation domain to reach a quasi-stationary flow state.To avoid the results in the un-stationary spin up time, the time average of the LES simulation was taken from t = 500 to t = 1500 s.The averaged wind vertical profiles at x = 1 H, 2 H, 3 H, 4 H, and 5 H compare reasonably well with those of the observation.The fact that the shelterbelt at the bottom has a 1.2 m open trunk space is reflected in the observational data simulation results in this region of the lee side flow.These simulation results are consistent with those of other simulations [28,29].The grid size independence of the numerical method model was also checked the refinement.In this case, the grid refinement is also a good way to show that the size is adequate for the flow problem to be modeled.In some very complex situation scales of large turbulent eddies are so different that local refinements may be needed refined-grid simulation, the grid size was set to be Δx = Δy = Δz = 0.5 m, with the nu of grid points of 530 × 262 × 84 for the same computational domain.The simulation re are shown with green lines for mean wind in Figure 5 and for TKE Figure 6.The re of the refined-grid simulation are very close to the results obtained for the origina size of 1 m.This sensitivity indicated that a 1 m uniform grid was adequate to resolv turbulent flow field for this simulation problem.
Figure 7 displays an averaged dynamic pressure field ( = ( −  )/ ) comp from the LES model at the center vertical cross section at y = 65 m.The plot show there is a larger, abrupt pressure gradient at the leading edge of the shelterbelt, with itive pressure at the windward side leading edge and negative pressure at the lee s the leading edge of the shelterbelt.The results agree well with those of other studies [ The pressure field is closely co-related with the wind field and shows a major driven for the mean wind reduction of the shelterbelt.The grid size independence of the numerical method model was also checked using the refinement.In this case, the grid refinement is also a good way to show that the grid size is adequate for the flow problem to be modeled.In some very complex situations, the scales of large turbulent eddies are so different that local refinements may be needed.In a refined-grid simulation, the grid size was set to be ∆x = ∆y = ∆z = 0.5 m, with the number of grid points of 530 × 262 × 84 for the same computational domain.The simulation results are shown with green lines for mean wind in Figure 5 and for TKE Figure 6.The results of the refined-grid simulation are very close to the results obtained for the original grid size of 1 m.This sensitivity indicated that a 1 m uniform grid was adequate to resolve the turbulent flow field for this simulation problem.
Figure 7 displays an averaged dynamic pressure field (p ′ = (p − p 0 ) /ρ 0 ) computed from the LES model at the center vertical cross section at y = 65 m.The plot shows that there is a larger, abrupt pressure gradient at the leading edge of the shelterbelt, with positive pressure at the windward side leading edge and negative pressure at the lee side of the leading edge of the shelterbelt.The results agree well with those of other studies [9,21].The pressure field is closely co-related with the wind field and shows a major driven force for the mean wind reduction of the shelterbelt.

The Sensitivity of Turbulent Flow Field to Shelterbelt Structures and Wind Directions
In the preceding model validation, a specific shelterbelt structure of the field observation [6] was considered, and the wind direction was perpendicular to the long axis of the shelterbelt.In many other shelterbelt application designs, the structures of the shelterbelt (including width, height, and vegetation element density) are more likely to be different from those in the previously mentioned experiment site.Wind directions are probably not perpendicular to the shelterbelt long axis.These factors deserve a sensitivity analysis for the application point of view.In the following sensitivity analysis, we used the same inflow mean and turbulent profiles for simplicity, and the shelterbelt was uniform in terms of vegetation element distribution from the ground up (no gap was near the

The Sensitivity of Turbulent Flow Field to Shelterbelt Structures and Wind Directions
In the preceding model validation, a specific shelterbelt structure of the field observation [6] was considered, and the wind direction was perpendicular to the long axis of the shelterbelt.In many other shelterbelt application designs, the structures of the shelterbelt (including width, height, and vegetation element density) are more likely to be different from those in the previously mentioned experiment site.Wind directions are probably not perpendicular to the shelterbelt long axis.These factors deserve a sensitivity analysis for the application point of view.In the following sensitivity analysis, we used the same inflow mean and turbulent profiles for simplicity, and the shelterbelt was uniform in terms of vegetation element distribution from the ground up (no gap was near the ground).

Vegetation Element Density
The vegetation element density of the shelterbelt is one of the important structure parameters for determining the flow fields of the shelter effects.Figure 8 shows the sensitivity of the mean wind speed and TKE with respect of different shelterbelt vegetation element densities.In this test, all the inflow winds originate from the west and blow perpendicular to the shelterbelts.The shelterbelts were kept in the same geometric dimension, although the vegetation element densities varied from small to large.The average lee side wind speed at ½ H were reduced by 30% for A = 0.19 m −1 (Figure 8a), while the speed reduction was approximately 60% for A = 2.25 m −1 (Figure 8d).The sections exhibiting the greatest wind reduction are located at a distance of 3-15 H away on the lee side of the shelterbelt.The sheltered distances for the portions exhibiting the largest wind reductions in the x direction were decreased as the vegetation element density increased.The TKE distribution was highly related to vegetation element density.The shelterbelt showed better TKE reduction in low vegetation density shelterbelts (Figure 8e,f).While the TKE decreased in x direction from 0-6 H in all cases (Figure 8e-h), the TKE increased and extended higher up in the z direction as the vegetation element density increased.This TKE increase indicated that the flow on the lee side was more turbulent as the vegetation element density increased.The shelter effects involve not only the mean speed reductions, but also the turbulent intensity alteration due to the shelterbelt, which has significant implications for some specific applications.

The Width of the Shelterbelt
Given the same vegetation element density distribution, the shelterbelt's width has a large influence on the shelter effects on the lee side [21,27,28].Figure 9 shows the averaged wind vertical cross sections at y = 65 m.The model simulation using a 2 m wide shelterbelt (Figure 9a) produced an approximately 30-40% reduction in mean wind speed, while the shelterbelt with a 15 m width (Figure 9c) showed an approximately 80-90% reduction in the mean wind speed.The TKE also showed dramatic differences; while the 0-3 H on the lee side displayed a reduction of TKE for both narrow and wide shelterbelts, the TKE from x = 3-24 H exhibited greater TKE values for the wide shelterbelt, especially in the upper level of the domain.These tests indicate that wider shelterbelts produce a greater number of large turbulent eddies on the lee side, due to more pressure being drag-produced in the wider shelterbelt.Furthermore, when the shelterbelt is wide enough, a stationary bound-

The Width of the Shelterbelt
Given the same vegetation element density distribution, the shelterbelt's width has a large influence on the shelter effects on the lee side [21,27,28].Figure 9 shows the averaged wind vertical cross sections at y = 65 m.The model simulation using a 2 m wide shelterbelt (Figure 9a) produced an approximately 30-40% reduction in mean wind speed, while the shelterbelt with a 15 m width (Figure 9c) showed an approximately 80-90% reduction in the mean wind speed.The TKE also showed dramatic differences; while the 0-3 H on the lee side displayed a reduction of TKE for both narrow and wide shelterbelts, the TKE from x = 3-24 H exhibited greater TKE values for the wide shelterbelt, especially in the upper level of the domain.These tests indicate that wider shelterbelts produce a greater number of large turbulent eddies on the lee side, due to more pressure being drag-produced in the wider shelterbelt.Furthermore, when the shelterbelt is wide enough, a stationary boundary layer could develop above the canopy-a different turbulent regime when compared with the flow around shelterbelts.The results have important implications in regards to shelterbelt designs for specific applications.

The Width of the Shelterbelt
Given the same vegetation element density distribution, the shelterbelt's width large influence on the shelter effects on the lee side [21,27,28].Figure 9 shows the ave wind vertical cross sections at y = 65 m.The model simulation using a 2 m wide shel (Figure 9a) produced an approximately 30-40% reduction in mean wind speed, wh shelterbelt with a 15 m width (Figure 9c) showed an approximately 80-90% reduct the mean wind speed.The TKE also showed dramatic differences; while the 0-3 H lee side displayed a reduction of TKE for both narrow and wide shelterbelts, the TKE x = 3-24 H exhibited greater TKE values for the wide shelterbelt, especially in the level of the domain.These tests indicate that wider shelterbelts produce a greater nu of large turbulent eddies on the lee side, due to more pressure being drag-produced wider shelterbelt.Furthermore, when the shelterbelt is wide enough, a stationary b ary layer could develop above the canopy-a different turbulent regime when com with the flow around shelterbelts.The results have important implications in rega shelterbelt designs for specific applications.

The Wind Directions and Shelter Effects
The effectiveness of the shelterbelt is related to the angles of the incoming wind direction relative to the shelterbelt long axis direction, as documented in earlier studies [8,18,21,28].In this sensitivity study, the LES model was used to test different wind angles with respect to the effectiveness of the shelterbelts.Figure 10 shows the shelterbelt effectiveness in reducing the wind speed and in altering the TKE in three approaching wind angles, with respect to the shelterbelts.All these runs used the same shelterbelt dimensions and vegetation element densities.Figure 10a shows an instantaneous wind (t = 1000 s) incoming perpendicular to the shelterbelt; Figure 10b shows an instantaneous wind occurring simultaneously to wind coming in at 45 • with respect to the shelterbelt; and Figure 10c is an instantaneous wind occurring simultaneously to an incoming wind of 60 • with respect to the shelterbelt.The sheltered areas are larger for small angles of incoming wind.The 60 • incoming wind exhibited less than a quarter of the sheltered area when compared with the perpendicular wind.However, the wind speeds exhibited a larger reduction in the affected areas in the cases of non-perpendicular wind (Figure 10b,c).This is due to the longer air parcel travel length inside the shelterbelt for the non-perpendicular cases.The bottom panel plots (Figure 10d-f) show the average TKE corresponding to the cases (Figure 10a-c), respectively.
The smaller wind angles with respect to the shelterbelt showed large areas of TKE reduction in the shelterbelts.The larger TKE at the south and north edges of the shelterbelts all displayed larger TKE in all three runs because these areas represent the interfaces of the shelterbelt's affected and nonaffected areas where the turbulent motions produced intensive turbulent activities, especially in regards to the nonstationary turbulent eddies due to the shear production.
The 60° incoming wind exhibited less than a quarter of the sheltered area when compared with the perpendicular wind.However, the wind speeds exhibited a larger reduction in the affected areas in the cases of non-perpendicular wind (Figure 10b,c).This is due to the longer air parcel travel length inside the shelterbelt for the non-perpendicular cases.The bottom panel plots (Figure 10d-f  The smaller wind angles with respect to the shelterbelt showed large areas of TKE reduction in the shelterbelts.The larger TKE at the south and north edges of the shelterbelts all displayed larger TKE in all three runs because these areas represent the interfaces of the shelterbelt's affected and nonaffected areas where the turbulent motions produced intensive turbulent activities, especially in regards to the nonstationary turbulent eddies due to the shear production.

GPU Implementation of the ABLE-LBM and Model Execution Speed
One of main advantages of the LBM as compared to the traditional CFD method is that the LBM is more suitable to the massive parallelization of the GPU architecture.Modern GPUs have thousands of computer cores and thus can run thousands of threads simultaneously, especially for memory intensive problems.The LBM algorithm is memoryintensive due to 19 direction of lattice directions.The data locality in the LBM is satisfied, and time-stepping is explicit.Each computation grid cell in the domain is assigned to a GPU thread.The simulation is written in Compute Unified Device Architecture (CUDA) (Nvidia, Santa Clara, CA, USA).In our implementation, the 3D computation grids are mapped to 1D memory.In GPUs, the threads are executed in lockstep sets called warps.The threads within each warp need to load memory together to use the hardware effectively; this is called memory coalescing.In the implementation, we manage this by ensuring that threads within a warp are accessing consecutive global memory as often as possible.For instance, when calculating the moment vector in Equation ( 5), we load all 19 lattice velocities per grid cell.We organize the velocities so all the values for each specific direction are consecutive in the memory.This way, as the threads of a warp access the same direction across consecutive grid cells, and these memory accesses can be coalesced.
A large model domain with a grid number with 25,351,101 grid points (501 × 501 × 101) was chosen to test the ABLE-LBM's computation efficiency in regards to GPU implementation.There were several large buildings and two shelterbelts in the domain.Figure 11 shows the 3D picture of the domain and the instantaneous (1000 s) turbulent wind field within the ABLE-LBM LES.The turbulent structures around buildings and shelterbelts are shown.The model resolution of the domain is 3 m (∆x = ∆y = ∆z = 3 m), covering an area of 1.5 × 1.5 km.Table 1 lists the computation efficiency of our simulations for this simulation domain.The computation speeds are different using a Tesla P100 GPU (3584 cores, 16 GB memory; Nvidia) and a Tesla V100 GPU (5120 cores, 32 GB memory; Nvidia).The model computation speed is compared with that of a CPU (Xeon X5650 @ 2.67 GHz; Intel, Santa Clara, CA, USA).The Tesla P100 is more than 129 times faster than single-CPU processing.The Tesla V100 is more than 303 times faster than single-CPU processing.The real-time simulation goal is achieved for a domain with 25 million grids using the V100.These two types of GPUs are not the most advanced GPUs.We are in the process of extending our ABLE-LBM CUDA implementation to multiple GPUs, which will be necessary to handle a larger computation domain with 1 billion grid points.
11 shows the 3D picture of the domain and the instantaneous (1000 s) turbulent wind field within the ABLE-LBM LES.The turbulent structures around buildings and shelterbelts are shown.The model resolution of the domain is 3 m (Δx = Δy = Δz = 3 m), covering an area of 1.5 × 1.5 km.Table 1 lists the computation efficiency of our simulations for this simulation domain.The computation speeds are different using a Tesla P100 GPU (3584 cores, 16 GB memory; Nvidia) and a Tesla V100 GPU (5120 cores, 32 GB memory; Nvidia).The model computation speed is compared with that of a CPU (Xeon X5650 @ 2.67 GHz; Intel, Santa Clara, CA, USA).The Tesla P100 is more than 129 times faster than single-CPU processing.The Tesla V100 is more than 303 times faster than single-CPU processing.The real-time simulation goal is achieved for a domain with 25 million grids using the V100.These two types of GPUs are not the most advanced GPUs.We are in the process of extending our ABLE-LBM CUDA implementation to multiple GPUs, which will be necessary to handle a larger computation domain with 1 billion grid points.

Summary and Conclusions
We have implemented a forest canopy drag force parameterization for turbulent flow simulations in and around shelterbelts in an MRTLBM model.Detailed descriptions of the LES of turbulent winds in and around the forest shelterbelts in an MRTLBM framework are presented.The model was evaluated first with a field observational dataset, producing reasonably accurate mean wind and TKE fields when compared with the field observations.This study indicates that our implementation of MRLBM represents a good method for modeling turbulent flows in and around forest patches.This validation only covers a neutral flow case over flat terrain; tests using different atmospheric stability conditions and complex terrain are also required.Those validations should be carried out in future studies.
Sensitivity analyses of turbulent flow related to the shelterbelt structure parameters and wind directions were also carried out in this study.The sensitivity tests indicated that the optimal wind shelter effect for reducing the mean wind speed and turbulent kinetic energy is maximized with a narrow shelterbelt with medium vegetation element density, in addition to the wind direction being perpendicular to the shelterbelt.These conclusions are in agreement with the results of previous observational and modeling studies.The sensitivity results also showed that a shelterbelt with a dense vegetation element or wide width produces higher TKE on the lee sides.These TKE behaviors on the lee side have practical implications.For example, if preventing wind damage or wind erosion is the main objective for a shelterbelt, the optimal design is a narrow and thin shelterbelt system perpendicular to the prevalent wind direction.However, it would be desirable to have a dense or wider shelterbelt if diffusing air pollutants were the main objective, since the higher turbulence will promote rapid dispersion and transport.
Finally, the computation times of a CPU and two different GPUs were compared to demonstrate MRTLBM's LES advantage in regards to computational speed for a large simulation domain (25 million grid points) with a mixed forest and buildings over a flat terrain.For an hour of turbulent flow time in the same domain, a CPU required 237 h of computation time.The computation time is drastically reduced using a GPU, i.e., It took 1.84 h using a Nvidia P100 GPU and 0.78 h using a Nvidia V100 GPU.The real-time goal was achieved using Nvidia V100 GPU for this domain.These two types of GPUs are not the most recent generation of GPUs, and we expect that computation times will be shorter using more advanced Nvidia GPUs, such as the recent A100 and B100.

Figure 1 . 17 Figure 1 .
Figure 1.The observational setup in Kurotani et al.[6].The wind was perpendicular to the shelterbelt, denoted in green.P 0 is the reference inflow profile 35 m upwind from the shelterbelt.P 1 -P 5 profiles were on the lee side of the shelterbelt.Blue dots are the 3D sonic observational points at the denoted Z heights from the ground.The setup of the domain simulation is shown in Figure2.The domain is a box with a length of 35 H, a width of 18.6 H, and a height of 6 H over flat terrain, where H = 7 m and is the height of the shelterbelt.The grid lengths are 1 m in the x, y, z directions, resulting in grid numbers of 265 × 131 × 42.The forest shelterbelt with dimensions of 2 m in width, 7 m in height, and 70 m (10 H) in length was set up at the x = 90 m in the domain.The observed wind profiles (Figure1) are placed at the y = 0 m, which is the vertical xz symmetry plane in Figure2.In this simulation, the grid lengths were ∆x = ∆y = ∆z = 1 m.The vegetation element density A = 1.18 m −1 was used in this simulation, according to methods provided in the literature[28,29,48].

Figure 2 .
Figure 2. The simulation domain setup.The observational profiles are located at the symmetric xy plane, as denoted by P0-P5.
profile,  = ( * /)[( +  )/ ], where  * = 0.34 m/s is the friction wind speed; k = 0.41 is the Von Karman constant; and  = 0.01 m is the roughness length.The surface boundry condition was set up using the logarithmic profile for the first model layer height, since the large eddy scale is smaller than the model level.An open outflow boundary condition was used at x = 175 m (25 H).A free-slip boundary condition was set for the top boundary at z = 42 m (6 H), and cyclic boundary conditions were set for the lateral boundary condition to avoid the limitation of lateral boundaries.

Figure 2 .
Figure 2. The simulation domain setup.The observational profiles are located at the symmetric xy plane, as denoted by P 0 -P 5 .

17 Figure 3 .
Figure 3.The inflow mean wind profile (left panel), U, and the TKE (right panel) at the location of reference tower (P0).The circles indicate the observed values [6], the solid lines are the LES model values, and the dashed lines are the fitted logarithmic profile, with  = 0.01 m,  * = 0.34 m/s, and k = 0.41.Note that the H is the shelterbelt canopy height in this case, and logrithmic profile is assumed for domains higher than 1.28 H (>9 m) where the observation was not available.

Figure 3 .
Figure 3.The inflow mean wind profile (left panel), U, and the TKE (right panel) at the location of reference tower (P 0 ).The circles indicate the observed values [6], the solid lines are the LES model values, and the dashed lines are the fitted logarithmic profile, with z 0 = 0.01 m, u * = 0.34 m/s, and k = 0.41.Note that the H is the shelterbelt canopy height in this case, and logrithmic profile is assumed for domains higher than 1.28 H (>9 m) where the observation was not available.

Figure 4
Figure4displays an instantaneous wind field of vertical (a) and horizontal (b) cross section results at t = 800 s of the model simulation results, just to show the characteristics of a turbulent wind field.For a clearer view of the turbulent variations, note that the scale of the vertical axis of (a) is not the same as the horizontal axis.The LES model resolves the larger scale turbulent eddies which are greater than the grid length, while the turbulent eddies smaller than the grid length are modeled using the Lily--Smagorinsky parameterization[23,24].The vertical cross section shows a typical shelterbelt zone, as illustrated in Judd et al.[9] from a wind tunnel study; a bleed flow (x = 0-2 H, Z = 0-1 H), where the turblent fluctuations are dissipated by the canopy; a displacement zone (x = 0-8 H, z > 1 H), where wind speed is greater than the upstream wind; a quiet zone (x = 2-8 H, z < 1 H), where wind speed is greatly reduced; and a recovery zone (x = 8-20 H, z = 0-2 H), where the wind gradually returns to the upstream state.The recovery zone displayed more wind variations because the upper-level higher energy flow penetrated toward the ground level via Helmoholtz instability.Generally, the turbulent eddies are elongated in the mean flowing (x) direction compared with the vertical direction.The lee wake area is still evident at x = 21 H, which qualitatively agrees with many previous studies[5,9,19,21,25,28].The leeside wakes are up to 2-3 H vertically at a distance of 9-18 H in the x direction.By taking the time average of the model output results at each grid point, the LES model results can be compared with the observational data[6], which showed the time average for a period of stationary turbulent flow.The time average for each component separates the wind component into average and turbulent fluctuation parts:

Figure 4 .
Figure 4. Plots of an instantaneous total wind speed for the xy plane (a) through y = 65 m and xy plane (b) for z = 4 m.Dashed rectangles represent the cross sections of the shelterbelt.

Figure 4 .
Figure 4. Plots of an instantaneous total wind speed for the xy plane (a) through y = 65 m and xy plane (b) for z = 4 m.Dashed rectangles represent the cross sections of the shelterbelt.

Figures 5
Figures5 and 6are the plots of the comparisons of the average mean wind profiles and TKEs, respectively, at the observational points denoted in Figure1.The simulation requires some time for the inflow information to proceed through the entire simulation domain to reach a quasi-stationary flow state.To avoid the results in the un-stationary spin up time, the time average of the LES simulation was taken from t = 500 to t = 1500 s.The averaged wind vertical profiles at x = 1 H, 2 H, 3 H, 4 H, and 5 H compare reasonably well with those of the observation.The fact that the shelterbelt at the bottom has a 1.2 m open trunk space is reflected in the observational data simulation results in this region of the lee side flow.These simulation results are consistent with those of other simulations[28,29].Atmosphere 2024, 15, x FOR PEER REVIEW 10 of 17

Figure 5 .
Figure 5.Comparison of time-averaged wind profiles (from 500 to 1500 s) at the observation points P0 to P5.The U profiles with coarse mesh (black lines), fine mesh (green lines), and observed U profiles (circles) at the observation points denoted in Figure 1.

Figure 5 .
Figure 5.Comparison of time-averaged wind profiles (from 500 to 1500 s) at the observation points P 0 to P 5 .The U profiles with coarse mesh (black lines), fine mesh (green lines), and observed U profiles (circles) at the observation points denoted in Figure1.

Figure 5 .
Figure 5.Comparison of time-averaged wind profiles (from 500 to 1500 s) at the observation P0 to P5.The U profiles with coarse mesh (black lines), fine mesh (green lines), and observed U files (circles) at the observation points denoted in Figure 1.

Figure 6 .
Figure 6.Comparison of time-averaged TKE profiles (from 500 to 1500 s) at the observation P0 to P5. TKE with coarse mesh (black lines), fine mesh (green lines), and observed TKE (circ the observation points denoted in Figure 1.

Figure 6 .
Figure 6.Comparison of time-averaged TKE profiles (from 500 to 1500 s) at the observation points P 0 to P 5 .TKE with coarse mesh (black lines), fine mesh (green lines), and observed TKE (circles) at the observation points denoted in Figure 1.

Figure 7 .
Figure 7. Dynamic pressure  computed from the model in a vertical cross section at y = 65 m.The value is normalized by mean horizontal momentum.The dashed rectangle represents the x-y cross section of the shelterbelt.

Figure 7 .
Figure 7. Dynamic pressure p ′ computed from the model in a vertical cross section at y = 65 m.The value is normalized by mean horizontal momentum.The dashed rectangle represents the x-y cross section of the shelterbelt.

Figure 8 .
Figure 8. Display of average wind speed (left panel, a-d) and corresponding average TKE (right panel, e-h) for the differing vegetation element densities (A values, m −1 ) (a,e) A = 0.19 m −1 ; (b,f) A = 0.75 m −1 ; (c,g) A = 1.50 m −1 ; and (d,h) A = 2.25 m −1 .The average wind speeds and TKEs are normalized by the wind speed at a point located at x = 70 m and z = 7 m away on the windward side.All shelterbelts in the simulation were 70 m in length, 7 m in height, and 2 m in width.All the incoming winds are perpendicular to the shelterbelts.The dashed rectangle represents the x-y cross section of the shelterbelt.

Figure 8 .
Figure 8. Display of average wind speed (left panel, a-d) and corresponding average TKE (right panel, e-h) for the differing vegetation element densities (A values, m −1 ) (a,e) A = 0.19 m −1 ; (b,f) A = 0.75 m −1 ; (c,g) A = 1.50 m −1 ; and (d,h) A = 2.25 m −1 .The average wind speeds and TKEs are normalized by the wind speed at a point located at x = 70 m and z = 7 m away on the windward side.All shelterbelts in the simulation were 70 m in length, 7 m in height, and 2 m in width.All the incoming winds are perpendicular to the shelterbelts.The dashed rectangle represents the x-y cross section of the shelterbelt.

Figure 9 .
Figure 9. Sensitivity test of shelterbelt width.The shelterbelts x-z cross sections are shown as d rectangles.The right panels are averaged wind speeds; the left panels are averaged TKEs.(a same run with a 2 m shelterbelt width, A = 1.34 m −1 ; (c,d) the same run with 15 m shelterbelt A = 1.34 m −1 .

Figure 9 .
Figure 9. Sensitivity test of shelterbelt width.The shelterbelts x-z cross sections are shown as dashed rectangles.The right panels are averaged wind speeds; the left panels are averaged TKEs.(a,b) the same run with a 2 m shelterbelt width, A = 1.34 m −1 ; (c,d) the same run with 15 m shelterbelt width, A = 1.34 m −1 .
) show the average TKE corresponding to the cases (Figure 10a-c), respectively.

Figure 10 .
Figure 10.Sensitivity of approaching wind directions with respect to the shelterbelts.All three shelterbelts had the same dimensions (height = 7 m, width = 2 m, length = 70 m) and vegetation element density A = 1.34 m −1 .All plots are instantaneous (t = 1000 s) wind speed horizontal cross sections at z = 4 m.(a) Instantaneous wind speed with incoming wind perpendicular to the shelterbelt, (b) instantaneous wind speed with incoming wind of 45° with respect to the shelterbelt, (c) instantaneous wind speed with incoming wind of 60° with respect to the shelterbelt, (d) average TKE corresponding to (a,e) average TKE corresponding to (b,f) average TKE corresponding to (c).

Figure 10 .
Figure 10.Sensitivity of approaching wind directions with respect to the shelterbelts.All three shelterbelts had the same dimensions (height = 7 m, width = 2 m, length = 70 m) and vegetation element density A = 1.34 m −1 .All plots are instantaneous (t = 1000 s) wind speed horizontal cross sections at z = 4 m.(a) Instantaneous wind speed with incoming wind perpendicular to the shelterbelt, (b) instantaneous wind speed with incoming wind of 45 • with respect to the shelterbelt, (c) instantaneous wind speed with incoming wind of 60 • with respect to the shelterbelt, (d) average TKE corresponding to (a,e) average TKE corresponding to (b,f) average TKE corresponding to (c).

Figure 11 .
Figure 11.(a) Domain buildings and forest shelterbelt setup.The northern shelterbelt is 9 m in width and 15 m in height.The eastern shelterbelt is 15 m in width and 15 m in height.(b) Instantaneous horizontal cross section of wind field at z = 12 m at t = 1000 s.The shelterbelts are denoted in dashed red rectangles, and both shelterbelts have a vegetation element density of 1 m −1 .The wind on the lee side of the shelterbelts is quite different because the northern shelterbelt is narrow than the eastern

Figure 11 .
Figure 11.(a) Domain buildings and forest shelterbelt setup.The northern shelterbelt is 9 m in width and 15 m in height.The eastern shelterbelt is 15 m in width and 15 m in height.(b) Instantaneous horizontal cross section of wind field at z = 12 m at t = 1000 s.The shelterbelts are denoted in dashed red rectangles, and both shelterbelts have a vegetation element density of 1 m −1 .The wind on the lee side of the shelterbelts is quite different because the northern shelterbelt is narrow than the eastern shelterbelt, and the local wind direction of the eastern shelterbelt is almost parallel with respect to the shelterbelt (see Section 3.2.3 for a more detailed explanation).

Table 1 .
Model execution time using a CPU and GPUs for a 1 h flow time in a domain (see Figure11) with 5,352,101 (501 × 501 × 101) grids.