Study of Realistic Urban Boundary Layer Turbulence with High-Resolution Large-Eddy Simulation

: This study examines the statistical predictability of local wind conditions in a real urban environment under realistic atmospheric boundary layer conditions by means of Large-Eddy Simulation (LES). The computational domain features a highly detailed description of a densely built coastal downtown area, which includes vegetation. A multi-scale nested LES modelling approach is utilized to achieve a setup where a fully developed boundary layer ﬂow, which is also allowed to form and evolve very large-scale turbulent motions, becomes incident with the urban surface. Under these nonideal conditions, the local scale predictability and result sensitivity to central modelling choices are scrutinized via comparative techniques. Joint time–frequency analysis with wavelets is exploited to aid targeted ﬁltering of the problematic large-scale motions, while concepts of information entropy and divergence are exploited to perform a deep probing comparison of local urban canopy turbulence signals. The study demonstrates the utility of wavelet analysis and information theory in urban turbulence research while emphasizing the importance of grid resolution when local scale predictability, particularly close to the pedestrian level, is sought. In densely built urban environments, the level of detail of vegetation drag modelling description is deemed most signiﬁcant in the immediate vicinity of the trees. analysis, M.A. and S.B.; investigation, M.A. and S.B.; resources, M.A. and T.T.; data curation, M.A. and T.T.; writing—original draft preparation, M.A. and S.B.; writing—review and editing, L.J.; visualization, M.A.; supervision, A.H. and L.J.; project administration, M.A., A.H., and L.J.; funding acquisition, M.A., A.H., and L.J.


Introduction
In a rapidly urbanizing world, it is important to improve our understanding of the turbulent dispersion and exchange processes influencing the microclimate and air quality of our cities. This poses a challenge because urban boundary layer (UBL) flows are complex flow systems that arise from the interaction between atmospheric boundary layer (ABL) turbulence and varying arrangements of urban obstacles [1]. While the general urban layout, characterized by the mean building (or building block) height, density, and orientation, governs the wind dynamics in a statistical sense, the actual aerodynamic properties of individual obstacles dictate the local flow conditions which bear the greatest significance to people residing or working in the area. Thus, it is important to aquire the ability to and very large-scale turbulence motions (LSM and VLSM) of which the time scales are so long that it is unreasonable to expect local in situ measured time series to capture sufficiently converged statistics on them. Therefore, it is necessary to examine the complexities real urban flows impose and to seek methodologies to extract relevant information even concerning the local scale processes.
This numerical study employs state-of-the-art LES modelling techniques to examine a realistic UBL flow scenario designed to bear close correspondence with observable urban turbulence under neutral stratification. The PArallel Large-eddy Simulation Model (PALM) LES solver [23], with a newly added nested multi-scale system [24], is used to perform numerical simulations. The LES model domain features a highly detailed description of downtown Helsinki, a coastal capital city of Finland accommodating a variety of urban microclimate studies [25]. The considered city domain contains a significant amount of urban vegetation, which is included in the LES model as porous media. The study transparently documents the employed model configuration where fully developed ABL turbulence becomes incident with the urban surface, generating a developing UBL flow characteristic for coastal cities.
The numerical setup is designed to allow the ABL flow to develop and evolve very large-scale streamwise structures that are observed in high Reynolds number boundary layer flows (see e.g., Hutchins et al. [26], Fang and Porté-Agel [27], and references therein). These very large-scale motions (VLSM) of ABL, of which the length scales extend well beyond the boundary layer height δ, introduce difficulties in analyzing real urban turbulence statistics as the diurnally changing ambient conditions impose limits on the considered time series length. Thus, this work deliberately adopts a confined approach, where the considered time series do not yield sufficiently converged statistics on the VLSMs, and examines the sensitivity of the urban LES solution to central numerical modelling choices via comparative techniques. Particular emphasis is placed on the aerodynamic modelling of urban vegetation due to the embedded uncertainties in determining their spatial extent and aerodynamic attributes. The sensitivity of LES-resolved UBL turbulence to the alterations in vegetation model is examined in juxtaposition to its sensitivity to the modelled height of the ABL. This aspect is considered because the considered UBL flow is developing, having gone though a sharp transition at the leading edge of the city as the ABL flow becomes incident with the shoreline. Here, the inference that, under fully developed UBL conditions, the low altitude RSL turbulence is insensitive to outer layer dynamics is considered insecure, warranting confirmation in the chosen context. In addition, the considered sensitivities in local turbulence signals are also compared against the effects arising from an elementary reduction in grid resolution.
The comparative post-processing analysis starts by evaluating deviations in two-dimensional mean flow distributions over horizontal planes and proceeds by comparing the contents of virtual tower measurements extracted at different locations within the downtown area. The local tower measurements are subjected to joint time-frequency analysis with wavelets to inspect the makeup of the simulated turbulent fluctuations with some level of temporal coherence across different frequencies (or scales). In an effort to expose deviations in the local turbulent signals that arise due to the model modifications, concepts of information entropy and divergence are exploited in conjuction with wavelet analysis to carry out a deep probing comparison of LES-resolved urban canopy turbulence. These mathematical techniques are robustly introduced to establish a solid foundation for their utility in UBL turbulence research.
Due to the high degree of complexity of the UBL turbulence, this study aims to address what is the required level of detail for the urban surface description within high-resolution LES models. This information is relevant considering that the highest quality (vegetation resolving) light detection and ranging (LiDAR) scan datasets are not publically available for many potential study sites. This work is also considered a relevant precursor for future validation efforts, providing guidelines for the urban LES modeling and further identifying the bounds within which agreement against urban in situ measurements can be sought considering the modelling uncertainties and observation limitations.

Note on Nomenclature
Throughout this document, Cartesian vectors are denoted by boldface font and scalars are denoted by normal font. Thus, for instance, the coordinate vector and its components are denoted as x = (x, y, z). In the context of LES, the velocity vector obeys the following decomposition U(t, x) = u(t, x) + u (t, x), where U is the unfiltered velocity field, u is the space filtered (grid resolved) velocity, and u is the subgrid-scale velocity. The Cartesian velocity vector components are u = (u, v, w), whereas the flow components in a horizontally rotated coordinate system (x 1 , x 2 , x 3 ), where the x 1 -axis is aligned with the mean flow's streamwise direction, are marked as u = (u 1 , u 2 , u 3 ), where u 1 is the component in the mean streamwise direction, u 2 is the component in the mean crosswind direction, and u 3 is the vertical component. These velocity components are computed as follows: where α is the local horizontal mean flow direction. The Reynolds decomposition of the resolved velocity field follows the convention u(t, x) = u(x) + u (t, x), using overbar • to indicate a temporal average and prime (•) to specify fluctuations about the mean. By default, angled brackets • ≡ • x,y denote a horizontal average, while a subscript is used in other situations. All the spatial averages in this text are intrinsic averages (see Whitaker [28], Schmid et al. [29]), where all data points representing air or porous media (vegetation) are included while points within solid obstancles are excluded.

Large-Eddy Simulation Solver PALM
In this study PALM [23] LES model is utilized to study developing urban boundary layer flows. The solver employs finite-difference discretization on staggered Cartesian grid to numerically integrate the filtered, Boussinesq-approximated Navier-Stokes equations for incompressible flows, presented here using vector notation while omitting moisture-and Coriolis acceleration-related terms as they are not relevant in this study: In Equation (2), g is the gravitational accelaration,k is the unit vector in the z direction, I is the identity matrix, T sgs = (u ⊗ u − u ⊗ u) is the subgrid-scale kinematic stress tensor as ⊗ denotes the outer vector product, π * = p * + 1/3ρ 0 tr (T sgs ) is the modified perturbation pressure (p * being the perturbation pressure), and ρ 0 is the constant density of air at ground level. In the energy equation, Equation (3), θ is the potential temperature and q sgs = u θ is the subgrid scale temperature flux vector. For the discretization of the advection and diffusion schemes, the fifth-order Wicker-Skamarock and second-order central difference schemes are employed, respectively [30]. The solution strategy of PALM is based on a 3rd-order accurate Runge-Kutta time-stepping scheme and a predictor-corrector approach, where the divergence generated by the predictor step in the velocity field is subsequently removed by a corrector step utilizing the solution to a Poisson equation for π * . An efficient iterative multigrid solver is used for the pressure solution. To attain closure in the final system of filtered equations, PALM implements the 1.5-order subgrid-scale (SGS) turbulence model by Deardorff [31] with modifications due to Moeng and Wyngaard [32] and Saiki et al. [33]. The PALM model system includes also a newly implemented surface energy model [34], which incorporates the principal surface energy budget processes occurring within urban environments. Since the objective of this study is to examine the impact of certain modelling choices and their observability in an urban flow system under neutral ABL conditions (when only the anthropogenic heat component is relevant), the functionality of the USM is not exploited here.

Nested LES Domain Approach
The PALM solver has been recently implemented with a self-nested domain approach, which allows multiple solution domains with decreasing domain sizes but increasing resolutions to be solved simultaneously in a coupled manner. This approach facilitates an efficient simulation of flow problems that contain a wide range of turbulent scales by making it possible to employ higher resolution only where necessary. In this context, a notational convention Ω (n,p) is adopted for labeling domains, where the superscript (n, p(n)) includes a unique identifier n for every so-called child domain and an optional parent entry p = p(n) to indicate whether domain n has a parent domain p within which it is fully contained. In situations where parent information is not critical, shorter notation Ω (n) may be used. By convention, the main parent domain (which defines the outer boundaries of the LES model) is denoted Ω (1) , whereas the subsequent n ∈ {2, . . . , M} child domains become Ω (2,1) , . . . , Ω (M,p(M)) , noting that M > p(M) always applies. The boundary notation is also adjusted accordingly such that, for instance, Γ t refers to the top boundary of domain Ω (3,p(3)) . The solutions of each parent and child pair are coupled either by one-way or two-way coupling schemes (see Hellsten et al. [24] for details). In this study, the LES model is composed of three one-way coupled domains forming a nested system Ω (3,2) ⊂ Ω (2,1) ⊂ Ω (1) . Table 1 lays out the relevant information on each computational domain. In the one-way coupling scheme, the parent domains Ω (p) provide time-accurate Dirichlet boundary conditions for every prognostic variable on ∂Ω (n,p(n)) , excluding the bottom boundaries Γ (n) b where identical solid wall boundary conditions are employed on all domains. The one-way coupling scheme is particularly well suited for the considered advection-driven UBL flow study where it is desired that the solution of the coarser domains remain unaltered by the finer nested domain solutions (which is contrary in the two-way coupling mode.) The boundary conditions are further described in Section 2.4.1.

Urban Surface Model in LES
This study utilizes a numerical model that features a detailed description of a real urban area ( Figure 1). The surface model features a 2.2 × 1.5 km 2 area of downtown Helsinki, the coastal capital of Finland. This region contains a dense arrangement of perimeter blocks, an irregular distribution of vegetated openings and streets (boulevards), and a network of narrow street canyons plus a central multilane artery that divides two downtown districts. The chosen area encompasses universal elements of low built urban environments that are common particularly within European cities. The urban surface model is constructed from an open access LiDAR dataset (accessible via http://hel.fi/3D) provided by the City of Helsinki. For the purpose of this and future urban LES studies, the point cloud dataset has been processed to yield a series of detailed raster maps, including topography (i.e., digital elevation model (DEM)), orography (land surface elevation), vegetation height, and land cover distributions. These raster materials are also made openly accessible for the community [35]. Information concerning the surface morphometry of the Ω (2) and Ω (3) model domains is presented in Table 2. The relevant mean canopy height is assigned H = 20 m, which very closely matches the mean building height of the Ω (3) domain and the upwind sector belonging to Ω (2) . The plan and frontal area fractions of obstacles are defined as λ P = A P /A tot and λ F = A F /A tot , where A P is the planar area of obstacles at ground level, A F is the total frontal area (visible to the flow) of obstacles, and A tot = L x × L y ( m 2 ) is the total planar area of the domain. Using the flow regime categorization for idealized urban canopies according to Hussain and Lee [36], the obtained building area fractions classify the urban area as dense where skimming flow dominates [37]. However, due to the irregularity in building morphologies and variability in street canyon orientations and vegetation distributions, such flow regime characterization offers little information about the considered urban canopy flow.
The building and tree height distributions in Figure 2 illustrate the symmetric variability in building heights (except for the single outlier at 2.8H) and the positive skew distribution of the vegetation heights. The vegetation height distribution closely resembles that of a suburban area reported by Giometto et al. [20], except that here the small plants and shrubs with h v < 2 m are not explicitly described by the dataset. The effect of the low vegetation and other fine structural details (balconies, small chimneys, roof ventilator, etc.) are taken into account via surface roughness parametrization in the LES simulation (see Section 2.4 below). Table 2. Summary of morphometric statistics for the urban area: Separate values are listed for the area within Ω (2) but outside of Ω (3) and within Ω (3) alone.

LES Modelling
To facilitate this comparative investigation of local scale urban canopy turbulence, a baseline LES simulation setup is established to act as a reference onto which each individual alteration is subsequently introduced. The wind direction is held unaltered for the all the simulations because the structural complexity and irregular orientation of the considered city plan introduces a significant amount of directional and structural variation into the roughness sublayer turbulence.

Boundary Conditions and Initialization
The inlet boundary condition on Γ in ≡ Γ (1) l is designed to represent fully developed neutral ABL turbulence, which contains the appropriate range of length scales and remains unaffected by the urban roughness. This is achieved by employing the turbulence recycling technique (see References [23,38] for details), where the inlet boundary value is constructed from two contributions Here, ψ = {u, v, w, θ, ...} is any prognostic variable such that the predetermined mean vertical inlet profile ψ (z) is held fixed while the fluctuating components ψ are recycled to the inlet from a specified recycling plane Γ rc at x = x rc downstream (drawn in Figure 1). Only the fluctuations occuring for z < 0.8L z are recycled, and therefore, δ = 0.8L z is utilized as the boundary layer height for the reference case from here on. The recycling height is modified in one of the documented simulations.
The inlet profiles ψ (z) for the prognostic variables are generated by precomputing a fully developed ABL solution, called a precursor solution, on a L x = 2048 m, L y = 512 m, L z = 512 m domain outlined in the bottom left corner of Figure 1, with periodic lateral boundary conditions and no-slip and free-slip conditions at the bottom and top, respectively. This flow is driven by a prescribed negative pressure gradient, which is imposed for z > 0.7 L z . The magnitude and direction of the pressure gradient is adjusted such that u 1pre ≡ u 1 (z = 0.95 L z ) ≈ 10 m s −1 and the flow angle with respect to the x-axis is α = arctan( v / u ) ≈ 3 • . The recycling method yields a flow solution, in which the double-averaged profiles in Figure 3 closely resemble ABL flow in geostrofic balance, featuring a constant momentum flux layer at z < L z /2. However, this setup exhibits faster convergence. The precursor simulation was initialized with a guessed mean profile and computed for ∆t = 290T pre , as T pre = L z / u 1pre , to obtain well-converged double-averaged profiles from the last 150T pre . The flow field of the urban LES simulation is, in turn, initialized by repetitively copying the precursor flow solution to the Ω (1) domain. The prognostic variables within the Ω (2) and Ω (3) domains are initialized by interpolating the parent domain values onto their respective grids.
The relatively long frontal section representing ABL flow over sea surface is constructed to facilitate the development and evolution of large-scale and very-large-scale motions, termed LSM and VLSM, respectively, that are observed in pipe, channel, and boundary layer flows (see, for example, References [26,[39][40][41][42], and references therein). The presence and relevance of these long streamwise structures, which affect the low altitude roughness sublayer flow via amplitude modulation, have been thoroughly established [43][44][45][46] in near neutral ABL flows, implying they ought to be considered a universal trait of UBL flows under similar conditions. Therefore, despite the statistical convergence difficulties they introduce, the inclusion of the large-scale boundary layer motions is deemed essential in the presented simulations as close compatibility is sought between modelled and (in the future) measured urban turbulence.
The small flow angle in the simulated ABL flow is important because the turbulence recycling setup is prone to numerically amplifying the elongated streamwise disturbances, causing the VLSM flow between the recycling planes to become correlated with the inlet [47]. The small v -component forces the LSM and VLSM structures to traverse the domain in the spanwise direction and to evolve without numerical amplification. This requires the distance between the inlet and the recycling plane to be sufficiently long. For this configuration, the distance x rc − x in = 3960 m ≈ 10δ is found to be adequate when the mean flow angle results in δ/2 displacement in the lateral direction as the flow moves from the inlet to the recycling plane.
The placement of the recycling plane is also sensitive to the disturbances that travel upstream from the leading edge of the urban surface. In this study, the disturbances generated by the real, unsymmetric shoreline of Helsinki gives rise to a weak perturbation at the recycling plane, which is subsequently transported all the way to the inlet. This perturbation manifests by adding a weak but observable systematic stripiness in the mean flow. This influence could be eliminated by substantially increasing the distance between the recycling plane and the leading edge of the city. However, a computationally more efficient technique is utilized here instead. The effect of the unsymmetric disturbances is diminished by adding a single sparse array of narrow but tall spires (0.04δ wide "towers" reaching a height ∆z = L z /2, placed ∆y = L z /2 apart) right after the inlet plane. These structures introduce a minor but regular source of small-scale disturbances that do not have an observable impact on the generated ABL flow far downstream but make the turbulence recycling boundary conditions more immune to amplification of upwind travelling disturbances from the urban roughness.
In each computational domain, the no-slip wall boundary condition is imposed for all x ∈ Γ b , including the building walls. The shear stress at the wall is computed in accordance to the logarithmic law of the wall with roughness, taking into account the effect of the roughness elements not resolved by the computational grid by assigning the subgrid-scale aerodynamic roughness length z 0,sgs = 0.03 m. The thermal landscape of the city due to anthropogenic sources is taken into account in a simplified manner by assigning a Dirichlet condition θ| x∈Γ u = 1.005 θ pre , where θ pre is the constant potential temperature of air from the neutral precursor simulation and Γ u ⊂ Γ b is the urban surface. The inclusion of anthropogenic heat is not relevant for this comparative study, but it is deemed an inherent ingredient of UBL flows under neutrally stratified atmospheric conditions.

Modeling of Trees as Porous Media
The difficulty of including trees within an urban LES model originates from the complex nature of their aerodynamic impact. The drag caused by an individual tree is much higher when it is isolated compared to being part of a forest canopy where the trees shelter each other. The earlier fundamental research on the effects of vegetation in ABL flows has considered continuous canopies where individual trees or vegetation elements are not identifiable objects with distinct spatial and aerodynamic properties or such that the canopy is constructed from a regular array of distict objects; see e.g., References [48][49][50][51][52].
Within densely built urban environments, the situation is complicated by the sporadic appearance of vegetation. Trees are typically planted individually along boulevards, esplanads, and other roads, whereas parks and other green areas may feature clusters of trees where the merging of foliage may form canopy-like patches. LiDAR measurements can provide spatially resolved information on the structural density of tree foliage, but their point cloud datasets may not provide reliable information concerning the inner structure of trees that have large crown sizes and dense leaf covers. The quality of this data depends on the amount of LiDAR beam reflections obtained from the inner part of the foliage. Even in the case of accurate structural probing of trees, their location with respect to the wind and each other strongly influence the aerodynamic properties they possess. For this reason, the modelling of vegetation within urban environments carries significant uncertainties of which the overall impact on the urban flow system needs to be established before detailed air quality and urban plan investigations featuring real urban sites can be undertaken.
In the PALM LES model, the aerodynamic drag due to vegetation is modelled as a momentum sink in the form of pressure drag (assuming the viscous drag contribution to be negligible in comparison) and is formulated as follows: where C Dv is a constant drag coefficient for vegetation and LAD = LAD(x) ( m −1 ) is the leaf area density or, synonymously, plant area density of discretized tree elements (cells in the grid occupied by foliage). The LAD is the ratio of the plant's surface area (visible to the flow) to the volume it occupies. In this study, the uncertainty in the effective drag coefficient D v (x) = C Dv LAD(x) which combines the two model inputs is scrutinized by imposing variations in both C Dv and LAD that are expected to span a relevant fraction of the uncertainty range associated with these inputs. The formulation of Equation (4) simplifies a complex problem by splitting the user specified elements of D v into a universal constant C Dv and a plant's density measure, which is solely determined by the spatial attributes of the tree. This is immediately problematic because trees are not only spatially complex objects but also elastic structures of which the structural and aerodynamic properties (e.g., branch rigidity, leaf, or needle shape reconfiguration under wind loads) can vastly vary between different species. Experimental drag coefficient measurements carried out by, for example, Mayhead [53], Rudnicki et al. [54], and Vollsinger et al. [55], reveal the wide scatter of result between different species. The situation is further complicated by the influence of wake interaction or "shading" [56,57] that occurs between trees within close proximity of each other. When this effect is included in the drag coefficient value, the implication is that the drag is caused by a population of trees and C Dv is termed bulk drag coefficient. Due to the complexity of the urban vegetation system, the drag coefficient is interpreted to encompass both meanings.
Consider the choice for the drag coefficient C Dv first. When ABL flows over continuous vegetated canopies are considered, C Dv ≈ 0.1 . . . 0.3 is a typical range of values, and often, C Dv = 0.2 is chosen [7,20,58]. This originates from Katul and Albertson [59], who found this to be the optimal value that minimized the discrepancy between two-dimensional Lagrangian stochastic turbulence model simulations and experimental measurements. However, this value is too small for urban situations where individual trees or closely packed populations of trees are sporadically distributed within the urban plan experiencing varying amounts of shading by buildings or other trees. The D v values for unshaded individual trees in crosswind are measured to vary widely (as a function of wind speed and tree species), ranging between D v = 0.2 . . . 1.2 m −1 [53,55]. Clearly, specifying one value for the urban application is not optimal, but the suitable value for the considered urban application should lie somewhere between the two canopy regimes.
In this study, the drag coefficient value is approximated by first adopting the method by Raupach et al. [60], as utilized in Ghasemian et al. [61], where the drag coefficient for windbreaks is evaluated from where k = ∆p s /(1/2)ρ|u| 2 d v is the pressure loss coefficient, ∆p s is the drop in static pressure across vegetation of thickness d v , Γ b1 is the bulk drag coefficient for a solid fence (assigned Γ b1 = 1.07 according to Jacobs [62]), and k 1 = 1.5 is an empirical constant. In accordance with Ghasemian et al. [61], choosing a representative pressure loss coefficient k = 2 (from the original measurement by Grunert et al. [63]), an initial value C Dv,ini = 0.59 is obtained. As this value is a reasonable approximation for unsheltered tree arrays, the value is adjusted 20% smaller to account for the scattered sheltering within the urban area. Thus, the value C Dv,ref = 0.8C Dv,ini = 0.47 is used as a reference in this study. To examine how the urban flow system is affected by this choice, the reference value is further scaled down (assuming a stronger impact from sheltering) by another 20% to get C Dv,mod = 0.8C Dv,ref = 0.376. These two values represent a significant uncertainty margin in D v , which provides the means to assess its necessary level of detail in relation to other modelling choices.
To probe the spatial variability of D v , two alternative techniques are used to specify the LAD distributions. See Figure 4 for visual assistance. The first, higher level of detail method used as a reference relies on the work of Tanhuanpää et al. [64], who provided an (unpublished) upgraded register for the street trees in Helsinki. This register had been processed from the same open access Helsinki3D LiDAR dataset that has been employed here to generate the urban surface datasets. The register contains metrics for each ∼ 23,000 individual street trees within the city and includes percentile profiles of first pulse returns from the interior of the tree crowns. These vertical percentile plots provide a horizontally integrated plant area density profiles for each tree. This information alone cannot be used in the LES because the three-dimensional trees within the model are discretized by the LES grid cells and the required LAD information needs to be assigned via vertical profiles for each ∆ x × ∆ y × (h v (x i , y j ) − h 0 ) grid column at location x = x i and y = y j that coincides with a tree crown. The tree height distribution h v (x, y) is obtained from the vegetation height raster dataset, and the starting height of the foliage is fixed at 4 m based on information from city officials stating that all street and park trees are trimmed below 3-4 m. For the purpose of assigning LAD information to LES, a one-dimensional beta distribution profile f (α, β, z) (see Markkanen et al. [65]) with α = 2.4 and β = 1.7 was manually optimized to yield, after horizontal averaging, the correct mean LAD profile shape of the three-dimensional street trees. The drag due to tree trunks is neglected in this study.
The second method is based on a low level of detail approximation that ignores all the vertical variabilities in LAD, thus employing a constant value in every grid cell within a tree crown. The value is derived from a reference, open grown urban deciduous tree (Tilia vulgaris), which is typical in Helsinki, of which the crown height is 13.0 (note the branchless lower part of the trunk is not included), thus yielding This value falls within the normal range of urban trees, which grow higher density crowns than forest trees, and it is about half of the green ash shelterbelt crown LAD reported in Zhou et al. [66], although it is not clear wheather the reported values are two-sided or one-sided plant area densities. Typically LAD in Equation (4) is defined as one-sided leaf/plant area density [67].

Different Simulation Configurations
This numerical study consists of four urban LES simulations, which are based on an identical reference configuration but selectively altered in accordance with the stated objectives. The reference configuration, labelled [R], adheres to the numerical method descriptions provided herein whereas the altered cases deviate from this reference only by the single declared model modification that is itemized in Table 3 Table 3. A listing of all the LES cases included in this study and their respective modification presented in juxtaposition to the reference configuration.

Model Constituent Reference Case Modified Case
LAD profile: Neutral ABL height:

Data Acquisition from LES
After the initialization by precursor solution, the main simulations were run for 215T, where T = H u * is the turnover time for the eddies shed by the urban structures and an appropriate time scale for the urban canopy flow [12]. The precursor initialization method permits a relatively short 25T spin-up time for the nonequilibrium urban boundary layer flow. The allocated time is sufficient for flushing away the initial transients within the urban canopy, allowing the upstream ABL conditions to reestablish stationary conditions (adopted from the precursor solution). All time series samples are recorded, and temporal averages are computed over ∆t avg = 190T, which is a sufficiently long time to accumulate well-converged statistics of low altitude urban canopy turbulence [15]. However, this study features a computational setup that allows the natural evolution of long streamwise velocity fluctuations, observed in ABL flows, and their interaction with the urban surface, introducing very long time scales that would necessitate up to 10 times longer averaging times to obtain stationary statistics for all the energy containing structures [26]. Limiting the sampling and averaging time in this study to 190T (which under conventional wind conditions translates to approximately 1.5-2 h) takes into account the issue of observability of local scale changes in real urban flows. Diurnally varying meteorology typically imposes restrictions on the length of the time series as changes due to alterations in the urban plan becoming muddled by the changes in meteorology. The temporal extent of the time series employed here with fixed meteorology is considered experimentally attainable (albeit rarely) under neutral ABL conditions. For example, Giometto et al. [5] employed urban tower data in 30 min blocks (here, ∼ 45T) from the Basel Urban Boundary Layer Experiment (BUBBLE) experiment [68]. The use of longer time series, such as 650T by Castro et al. [19], 400T by Coceal et al. [12], and 300T by Xie and Castro [16] are difficult to justify when the objective is to establish the degree with which the changes to the street-canyon scale turbulence caused by the changes in the urban plan can be observed. Thus, the study of real urban boundary layer flows has to rely on time series that cannot provide sufficiently converged statistics for the very large scale motions of the ABL.
The following data are collected from each simulation for comparative analysis: 1.
Horizontal 2D mean flow distributions are gathered from both the Ω (2) and All datasets are exposed to identical analysis, but in the interest of conciseness, the main text presents the analysis results of selected datasets.

Analysis Methods
The different urban LES results are compared using methods in which the scope ranges from integral to local. At the most basic level, spatially and temporally averaged vertical profiles of elementary flow quantities are juxtaposed. This is followed by a statistical comparison of differences that arise between horizontal mean flow distributions. A particular attention is assigned to the comparative analysis of the location specific time series that are gathered from the network of virtual tower measurement sites. These datasets of LES-resolved turbulence are meticulously scrutinized to extract the coherent fluctuations they contain and to expose their respective frequency distributions. This is accomplished by the means of wavelet analysis, whereas the comparative analysis of the associated signal frequency content is achieved by exploiting the concepts of information entropy and divergence. The following sections introduce the relevant analysis methodologies used in the post-processing.

Statistical Evaluation of Differences in 2D Mean Flow Distributions
To gain a statistical insight of the level of differences that arise in the horizontal distributions of the u + 1 = u/u * and w + solutions due to the modelling choices, two statistical measures recommended by Chang and Hanna [69] are employed in the comparison. Root normalized mean square difference (RN MSD) gives a measure of mean difference that is composed of random scatter and systematic bias, and fractional bias (FB) provides a specific measure for the systematic bias between two datasets. These metrics are defined as follows: ψ is a generic variable from cases [P], [B], and [D], while ψ R refers to the value from the reference simulation [R]. Because the double-averaged quantities for vertical velocities become very close to zero (particularly over Ω (3) domain and at higher elevations), conditional evaluations RN MSD (ψ|ψ > 0) and FB (ψ|ψ > 0) are utilized instead where only the data points satisfying ψ R > 0 are considered. The case ψ R < 0 is also inspected but omitted for brevity as the results do not contribute new information.
In addition, a qualitative inspection of spatial dimension of mean flow differences is obtained via relative difference distributions defined for ψ(x) as

Joint Time-Frequency Analysis with Wavelets
A crucial step of our work is to analyze the time-frequency statistical properties in the fast-scale turbulent fluctuations of the velocity field: Here, u(x l ) is the time average of the flow at the chosen urban location x l (see Section 2.5.1 for details).
The objective is to gain insight about the statistics and frequency content of these fluctuations. In the following subsections, we will illustrate the theoretical tools that will be deployed for our aim.
The first subsection will be dedicated to explaining the conventions about the Fourier transform and its discretization is adapted here; the Fourier filtering [70] methodology will be also explicated in detail. Such a procedure will be used to define the pure small-scale flow.
In the second subsection, we will introduce the wavelet transform. This mathematical instrument allows taking into account the intrinsic non-stationarity of the turbulent fluctuations, a common feature in inhomogeneous turbulence. Such a technique has already been used in environmental fluid mechanics literature [22]. Wavelet analysis will also make it possible to identify a threshold for the time scales of the flow under which we want to dedicate the present analysis. Consequently, all the scales above that threshold will be filtered out via Fourier analysis.
After we have obtained both the wavelet coefficients and the filtered fluctuations, we will analyze them from a statistical point of view. Unlike the approach presented in Hertwig et al. [22], where just sample kurtosis and skewness for wavelet coefficient histograms were considered, we will exploit other statistical indicators which provide a deeper view on the whole fluctuation probability density function. These indicators are Shannon and Rényi entropies as well as divergence functions and are commonly utilized in information theory [71][72][73]. Entropy functionals give us information about both the central region and tails of a given probability density function, while divergences make it natural to compare two density functions coming from different simulation configurations (e.g., boundary conditions, tree models, and resolution). By applying these tools to the time series, we will be precisely able to identify in which manner the simulation configurations affect the results with respect to city location, height, time instants, frequency, or time scales.
All the abovementioned quantities will be first introduced in their theoretical, continuous-time formulation. Therefore, it will be explained what conventions we will follow in the present paper to apply those functionals and transformations to our time series.

Fourier Transform: Filtering and Spectrum
Given the turbulent fluctuations u (x, t), an operation often performed is to analyze them by means of Fourier transform and antitransform. We refer to Appendix A for a thorough explanation about how to pass from continuous to discrete versions according to the conventions of the present paper and the rationale behind any following equation. Here, we will list their discrete expressions that are implemented to perform the data analysis of this work: where t k = k ∆t, ∆ f = 1/(N∆t), f n = n ∆ f , and • stands for the floor function. Notice that, in the discrete framework of the Fourier analysis, the ∆t is often dropped, as it represents a constant factor. However, here, it was retained so as to keep a direct contact to the continuous-time formulation and the wavelet transform that will be explained in the next section. As for the time series itself, all the information about it is contained in the index set n ∈ {0, . . . , N/2 }, as the fluctuations are a real quantity.
When considering only small-scale fluctuations in the flow, of which the Fourier frequencies are higher than some cutoff f c , slower Fourier modes will be removed via a plain, sharp high-pass filter. In practice, we are going to consider the small-scale fluctuations plus the average local offset. Therefore, the relationship implemented numerically to extract this quantity from the full fluctuation time series is as follows having denoted by n c the index such that n c ∆ f = n c /(N∆t) = f c . Again, we refer to Appendix A for the details behind the derivation of Equation (12) and its significance.
In the next sections, we are also going to consider another quantity related to the Fourier transform of the fluctuations, that is the spectral density [74] of a given velocity component u i . Here, we calculate it as follows: It is an even function of f as u i is a real quantity. As a direct consequence of this, the mean squared velocity is related to the spectrum through the following identity: By looking at Equation (13), it is also apparent that the spectrum S u i will be null for any f < f c .

Wavelet Transform: Time-Frequency Joint Analysis
The limitation of the Fourier transform is that any information about the intrinsic non-stationarity of turbulence is lost. Indeed, the integration covers all the signal durations, and one loses any insight about variations in the flow spectrum with respect to time, be those large modes, vortices, coherent or pulsating structures, intermittency, etc. Wavelet transform is the right instrument to overcome this issue, as it intuitively represents a time-local analysis with respect to different time-scales [75,76].
More precisely, a continuous wavelet transform (CWT) of the signal associated to a velocity component u i (x, t), e.g., u 1 , is defined as follows [75]: The graph of the function W u i (s, t) on the plane (s, t) is called scalogram. The function Ψ(t), which is complex in general, is called mother wavelet and generalizes the complex oscillating exponential functions in the Fourier transform. The mother wavelet generates a wavelet family by means of the dilation factor s and the time shift t in the sense that the analysis is carried out at time t of the velocity u i . There exist several mother wavelets, any of which must satisfy the so-called admissibility criteria. One of these criteria is that the mother wavelet must have null mean value. Consequentially, the wavelet transform of u i and u i are the same, as can be promptly noticed in Equation (15). The mother wavelet we will be utilizing in this work is a common simplified version of the complex Morlet wavelet: where t = t − t /s (see Equation (15)) is a non-dimensionalized, t-centered time and the dimesionless parameter ω 0 , which in turn generates a family of mother wavelets, can be freely chosen depending on the case considered. For the approximated form in Equation (16) it is common practice to choose it at least > 5, and we will set it to 6. The reasons behind these practices and choices are thoroughly explained in Appendix B. For our purposes, what is meaningful to pinpoint is that, thanks to this choice, the typical wavelength of the wavelet is simply ∼ s. This biunivocal correspondence provides us with a prompt physical interpretation of the decomposition of the velocity u i at time t in many wavelet pseudo-harmonics of scale s. In reality, for our analysis, it will be expedient to consider the wavelet power scalogram |W u i (x l , s, t)| 2 , which is defined as the graph of the square module of the wavelet transform with respect to the parameters (t, s). For the reasons we have illustrated previously, this graph can be promptly interpreted at a glance as the kinetic power of the velocity component concentrated at a certain scale s and location x l thanks to the immediate connection between the parameter s and the time scales involved in the processes.
As for the discretization of Equations (15) and (16), we need to account for finite size effects and aliasing. Taking into account that ∆t = 0.5 s is the time step in the time series u(x i , t n ), only time scales larger than 1.4 s can be resolved in the wavelet transform (for proof of this, see again Appendix B).
To calculate the wavelet transform in Equation (15), we just take the discrete convolution over a sequence of scales s k : with the convention that u i (x l , s k , t n ) = 0 when n < 0 or n > N − 1. The sum in Equation (17) thus is over finite number of elements in reality. Due to the 2s-wide spread of the Gaussian envelop in Ψ, this fact will result in finite-size effects. Namely, given a scale s k , each time step t n such that t n ≤ 2s k or t N−1 − t n ≤ 2s k will be affected by this finite-size effect in time. Such a distortion needs caution especially for large modes, where it may affect significantly the wavelet transform. This effect constrains the practical use of time-frequency analysis for large modes and, in practice, poses an admissible maximum for the scale s k that is significantly lower than t N−1 /2. The distortion in time at the beginning and the end of the wavelet transform is however easily visible in the scalograms; therefore, it can be handled safely. Moreover, as already said previously in this work, we are going to focus only on small-scale fluctuations, an aspect that makes these finite-size effects rather negligible in the present analysis.

Entropy and Divergence of Turbulent Signals
We illustrate the technique implemented here to estimate the Shannon entropy of the turbulent fluctuations and the related divergence between two different turbulent signal density functions. As we will see, these are natural tools to compare the turbulent fluctuations statistics between different simulation configurations.
Given a probability density function ρ(a) for a continuous random variable a, we define Shannon entropy as the following nonlinear functional of ρ: By looking at the Equation (18), the connection to the thermodynamic entropy definition from statistical mechanics is evident [77]. The definition in Equation (18) can be adapted to discrete contexts as well, and the latter is the formulation we will be using. The form it takes when a random variable b assumes N discrete values b k , k = 1, . . . , N is as follows: By comparing this formula to Equation (18), it must be noticed that it does not correspond to the continuous limit when k becomes a continuous variable, that is, when the bin width ∆ * k → 0, N → ∞, taking into account that P(b k ) = ρ(b k )∆ * k . See Bonachela et al. [72] for further details. Consider now a time series a(t 1 ), . . . , a(t N ) representing a time-sample of the random variable a (which in this study will be the time series of the squared wavelet coefficients of the small-scale fluctuations at each location x i ). A histogram is constructed from the time series values by dividing the sample values into M nonuniform interval bins ∆ * 0 , . . . , ∆ * M . These intervals will be chosen so that extreme events will be associated to larger intervals in order to capture better the small probability intervals and to improve the convergence of the histogram.
If the underlying quantity is theoretically continuous-be it the filtered velocity or the wavelet coefficients-then the probability of having an observation at a certain t within an interval ∆ * k is given by the following: This can be straightforwardly estimated by means of the associated discrete sample frequency f * (∆ * k ) in the given time series.
Once we measure the sample frequencies, we can plug them into Equation (19) with f * (∆ * k ) in the place of P k : The resulting quantity represents an estimator of the Shannon entropy. Such a simple frequentist estimator proves to be asymptotically unbiased [72,73]. In other words, although the expected value of the sample frequency E[ f * (∆ * k )] is P k , due to the nonlinearity of the entropy functional, E[H 1 [a]] = S 1 (a); namely, there is a bias.
However, this bias goes asymptotically to zero as the sample size goes to infinity. Given that we herein deal with large samples, the asymptotic limit E[H 1 [a]] → S 1 (a) is recovered.
Finally, let us introduce the so-called Shannon entropy divergence functional for continuous density functions: Oftentimes, this functional is also called Kullback-Leibler divergence, albeit based on Shannon entropy. Divergence measures how much some new probability density function ρ 2 is different from a reference ρ 1 . In a sense, it represents the increase of information that ρ 2 contains in comparison to ρ 1 . A geometrical interpretation of this can be found in Xu and Erdogmuns [78]. When ρ 1 = ρ 2 , the divergence is 0.
Similarly to the entropy case, we are going to estimate the divergence by first estimating the probabilities empirically. To do so, we will be constructing a histogram per different configuration of our numerical simulations by dividing up the value range of the time series into M bins ∆ 1 , . . . , ∆ M . For the sake of consistency, we will shortly see that the bins of the histograms will need to be the same for all the configurations. Looking at Equation (20) }. We will thus be able to estimate the discrete versions of Equation (22), by plugging the relative sample frequencies f * in the place of the bin probabilities: By looking at Equation (23), it is now evident that this methodology is well defined only if the endpoints of the bins ∆ * k are the same. Note also that, unlike the entropy functional, from Equation (23), the continuous limit in Equation (22) is recovered when we go to the continuous limit.
The current approach to estimating the divergence via the sample frequencies in the histograms is simple and consistent as we deal with fairly large samples, but certainly, it is not optimal as the convergence rate. Other more complex, sophisticated estimation methods are available in literature, making use of nearest-neighbor techniques or other machine learning algorithms [78,79]. However, the principal interest here does not lie in a precise quantitative estimation of entropy or divergence values but rather in a comparative analysis of the differences that arise in LES-simulated urban turbulence due to level of detail modifications in the model. The presented methodology will therefore be suitable to assess these differences with respect to frequency (or time scale) at different locations and for varying elevations.

Similarity Scaling Parameters
The comparison of results from different UBL flow realizations necessitates normalization with appropriate scaling parameters. In ABL flows with surface roughness and a well-defined ISL, this is typically achieved by employing the ABL adopted form of the logarithmic law of the wall where the friction velocity u * specifies the characteristic velocity scale, the displacement height z d sets the aerodynamic "zero level" for the logarithmic profile, and the roughness length z 0 provides a parametrization for the effect of drag due to surface roughness. The theoretical validity of this similarity solution for neutrally stratified boundary layers is roughly confined to elevation range z d < z < δ/3 [80], where the δ is either the boundary layer height or the channel centerline height.
Despite the nonequilibrium nature of the studied flow, the double-averaged (DA) vertical momentum flux profiles extracted from the Ω (3) domain exhibit a linear section in the range 1.5 < z/H < 4, indicating a region where, on the average, the vertical momentum flux is balanced by the negative mean pressure gradient. Even though this does not imply the existence of an ISL, this overall balance provides a robust mean to define u * = u w 2 + v w 2 1/4 z→z d , which is obtained by extrapolating the linear slope in the flux profile down to the displacement height z d . The obtained values for u * , z d , and z 0 are listed for each case in Table 4. The values for z d and z 0 are obtained from a logarithmic fit to the narrow band 1.5 < z/H < 4 in the streamwise velocity profile. The width of the band was determined to be most suitable for the logarithmic fit by trial and error. In the scaling parameters, case [B] with the largest ABL eddies removed, brings about the most notable deviations. Compared to the reference case, friction velocity became approximately 6% lower, the roughness lengths became 6% higher, and the displacement height became 3% lower. Other small but observable deviations include case [D], with reduced canopy drag, causing a 4% decrease in displacement height and a 5% increase in roughness length. Simplifying the LAD profiles of trees resulted in negligible changes in the scaling parameters.
It is acknowledged that the use of log-law scaling parameters is problematic because the inherent assumptions concerning equilibrium conditions and the existence of an ISL above an urban canopy are generally violated. For sufficiently developed urban boundary layer flows, the logarithmic velocity profile manifests above the urban canopy only via horizontal averaging [10,12,15]. Therefore, it is difficult to obtain reliable log-law scaling parameters from urban field studies as horizontal averaging would require a dense network of towers reaching the elevation range 1.5 < z/H < 4 that, for real urban flow scenarios (as in the considered study), falls entirely within the RSL. In this comparative study, the log-law scaling is found to provide the most robust mean to non-dimensionalize the flow data, but it is likely that other scaling techniques should be used in real urban validation studies.

Horizontal Mean Flow Distributions
As expected, the DA profiles can only provide information on how the chosen modeling alterations influence the urban flow system at an integral scale. Next, the mean flow distributions are compared in order to establish how the modeling choices embedded within the structural complexity of a real city alter the numerical predictions of urban wind. Figure 6 exhibits the horizontal distribution of u + 1 across the Ω (2) domain at z/H = 0.7 obtained from case [R] with ∆ = H/10 resolution. This wind map highlights the heterogeneity and complexity of urban canopy flow within a real built environment featuring a network of street canyons of different widths, buildings and building blocks of varying dimensions, and openings and parks with different vegetation covers. Close inspection of these results would warrant a long report of interesting observations, but such examination falls outside the scope of this comparative study. Juxtaposing u + 1 and w + distributions and relative difference distributions conveys the inherent difficulty associated with predicting vertical velocity variations within a densely built urban environment. A real urban canopy gives rise to a flow system involving a complex labyrinth of interacting blunt-body wakes and street canyon vortices, which manifests as a rapidly sign-altering vertical mean flow distribution. Therefore, if the objective is to predict the up and down drafts within an urban canopy, all aspects concerning model accuracy must be carefully considered. The Rel∆w + distributions in Figure 8b-d offer an instructive view of the uncertainty embedded in predicting vertical movements of the wind and the spatial dimension of wakes and vortecies within a real city.  Figure 7e depicting the vertical profile of RN MSD(u + 1 ) relative to its recommended maximum limit C RN MS = √ 1.5 lays bare how modeling discrepancies in the horizontal wind become larger within the urban canopy close to the ground but diminishes rapidly above the roof height where mean wind speed increases. However, even the highest discrepancies yielded by the case [P] over Ω (3) at low altitudes satisfy the criteria RN MSD < C RN MS with a significant margin, ensuring that the considered model uncertainties take up only a minor portion of the target margin. Similarly, the vertical profiles of FB(u + 1 ), plotted relative to its recommended magnitude limit C FB = 0.3 in (f), show only weak systematic bias between the simulations (consistently below 5% of desireable limits within range 0.35 < z/H < 5). The relatively uniform fractional bias profile implies that the discrepancies reflected in RN MSD(u + 1 ) close to the ground are principally random in nature [69]. Thus, in an integral sense, the presented solution for the horizontal wind exhibits robustness against the imposed modelling parameter changes away from the ground, but close to the ground (or pedestrian level), these sensitivities increase.
The conditionally sampled RN MSD(w + ) and FB(w + ) results in Figure 8e,f also reveal that the discrepancies in vertical velocity are primarily random in nature. The RN MSD(w + ) values within the urban canopy are confined to very modest ∼ 10% levels; case [P] (with simplified LAD profile) again exhibits marginally higher deviations close to the ground than others. Nonetheless, the results are very acceptable given the complexity of the urban canopy flow. The steady increase of RN MSD(w + ) values with height reflects the difficulty associated with the normalized differences as the mean values approach zero.

Local Turbulence Profiles
The practical limitations in conducting measurement campaigns within real urban environments assures that locally obtained time series data remain an essential point of contact between model results and experimental measurements. For this reason, it is important to examine how observable the changes caused by the model modifications are in the virtual tower measurements described in Section 2.5.1. The structural composition of real cities is generally so complex that no individual measurement station can provide street canyon flow information that is representative of the whole city, but neighborhood scale representativeness is attainable.

Mean Flow
The vertical profiles of dimensionless streamwise-rotated velocity, obtained from all 11 virtual tower locations shown in Figure 5, are depicted in Figure 9. The set of figures also includes the DA profile u + 1 from domain Ω (3) to mark the contrast between local and spatially averaged profiles. The profiles also contain bootstrapped 95th confidence intervals (computed using 1000 iterations) in grey color, but they remain within the line width. The local profiles now also include the case [R2], where the flow data is collected from the reference simulation in an identical manner as data labeled  On the other hand, the profiles from the ∆ = H/10 resolved [R2] case consistently exhibit deviations close to the ground and near high gradients. Thereby, we can conclude that, if local predictability in mean wind is sought, the effect of resolution outweighs the level of detail variations considered herein.
The dimensionless vertical velocity profiles (with their bootstrapped 95th confidence intervals) of four selected stations are displayed in Figure 10. The profiles reflect the difficulty in predicting the swirls and vortices within the urban canopy. At the shown scale, the deviations appear notable, but again, the ∆ = H/20 resolved profiles closely adhere to each other whereas the ∆ = H/10 resolved [R2] profiles depict the most discernible deviations, reaffirming the relevance of sufficient resolution in urban flows. A careful inspection reveals that making a consistent 20% mistake in vegetation drag (case [D]) has a weaker impact on the mean results than simplifying the LAD profile (case [P]) within the urban canopy. Also, eliminating the largest ABL eddies (case [B]) brings about only minor and random deviations compared to the reference case.

Velocity Variance
Mean wind profiles provide a very limited view of the interaction between atmospheric boundary layer turbulence and urban environment. Therefore, this study places emphasis on examining the dynamics of the LES-predicted urban canopy turbulence. Here, the issue of observability is also considered relevant as real urban canopy flows always contain diurnal variation, which complicates the sampling and analysis of time series sufficiently long to provide convergent statistics. For this reason, the sampling period of 190T used here in, or equivalently 150T δ as T δ = δ u (z=δ) , is considered an appropriate compromise between seeking convergent statistics and maintaining relevance in connection to real observations. (This time period translates to ∼ 1.5 h of real time under typical wind conditions.) The local vertical profiles of dimensionless variance plotted in Figure 11 (excluding stations M9 and M11 as they reside very close to the nesting boundary zone) for resolved u 1 lay bare that the variance values within the roughness sublayer where z > H have not converged properly. The drastic scatter in local σ 2 u1 /u 2 * profiles occurring between measurement stations implies that the urban roughness sublayer does contain the slowly evolving VLSM structures of which the time scale is much longer than T or T δ . Recall that the computational setup has been designed to allow these long streamwise perturbations, observed in ABL measurements, to evolve and subsist but without observable model-related amplification.
Although each simulation is identically initialized, the feedback from the urban surface to the recycling plane, no matter how weak, results in different flow realizations for each computational case. The resulting scatter in variance profiles offers evidence that the slowly evolving modes are localized, as made apparent by the local discrepancy, for example, in case [D] profiles at stations M1 and M4 versus M7. In general, the scatter diminishes as statistical convergence improves within narrow street canyons (stations M1, M2, M4, M7, and M10) and densely vegetated parks (stations M5 and M6) where the surrounding structures and obstacles effectively convert larger turbulent structures into smaller ones, enhancing the energy transfer toward the small scale eddies and hence also dissipation. The wider street canyons, particularly if parallel to the flow, allow the energetic large scale structures to penetrate the lower altitudes (see station M8), and therefore, poor convergence of variance is observed also within such street canyon, similar to the higher altitudes. Figure 11. Local vertical plots of streamwise velocity variance σ 2 u1 /u 2 * from selected measurement sites: The contribution of statistically insufficiently described very-large-scale turbulence motions is distinctly visible above the urban canopy.
The profiles of case [B], with the largest ABL eddies removed, when compared to other cases for z/H > 1 exhibit consistently lesser variance values and the least amount of local variation between stations. Therefore, it is natural to conclude that the inclusion of the largest ABL eddies in the numerical model is conducive to the existence of the largest streamwise modes. However, even in case [B], these modes exist as seen when juxtaposing M3 and M6 profiles in the upper part of the roughness sublayer.
The variance results lay bare the imminent complications that arise when seeking agreement between simulated and measured UBL flow data; in the absence of idealized conditions or a monotonous urban plan, the observed TKE within the roughness sublayer is likely to contain low-frequency fluctuations of which the spatial characteristics may be difficult to uncover from tower measurements. It is thereby considered vitally important to employ analysis techniques which expose the dynamic makeup of the urban flow system. In this study, time-frequency analysis with wavelets is exploited in the dissection of the obtained time series in a manner that is also considered highly relevant when experimental urban flow data is considered. Through wavelet analysis, the comparative study can be broadened to encompass also the frequency content of the turbulence signal.

Time-Frequency Analysis
Each time series for u 1 (t, z) obtained from the virtual towers stations is subjected to continuous wavelet transform (Equation (15)) using complex Morlet wavelet with center frequency ω 0 = 6 to generate a power scalogram |W u + 1 (t, s)| 2 for each z-coordinate. Example power scalograms obtained at elevations z/H = 0.6 and z/H = 1.7 from the M3 station are shown in Figure 12. The scalogram features time t (in seconds) on the horizontal axis and scales s = ω 0 2π f (in seconds) on the vertical axis. Due to the convenient choice for ω 0 , there is a close inverse relationship between scales and frequency, i.e., s ≈ f −1 .
The figure excludes scales s/T >≈ 4, thus, excluding wave lengths λ w > u 1 / f = u 1 4T that exceeds δ at elevation z/H = 2. The power scalogram |W u + 1 (t, s, z)| 2 provides a time-resolved breakdown of the fluctuation energy content of u 1 into contributions from different frequency ranges. It offers a thorough view into the energy composition of the obtained turbulence signal and can be subjected to further analysis via different approaches.
In this comparative study, the power scalograms are first temporally averaged to uncover how the variance values are constructed from different frequency contributions. (Power spectrum via Fourier analysis could offer the same information.) This procedure is applied to u 1 time-series profiles obtained from the virtual tower measurements. The results from cases [B], [R], and [R2] at stations M1, M3, and M6 are depicted in Figure 13. The figure contains |W u + 1 | 2 (s, z) contour maps that are juxtaposed with corresponding σ 2 u1 /u 2 * variance plots (variance corresponds to the integral of |W u + 1 | 2 over all scales). Note the different range on the z-axis for [R2] results obtained from domain Ω (2) .
The role of the LSM and VLSM (s > 2.5T) contributions in variance plots for z/H > 1 becomes immediately apparent from the displayed |W u + 1 | 2 distributions. Particularly, attention is drawn to the contributions within range 1 < z/H < 4 that highlight a confined energetic region, which arises from the interaction between the ABL turbulence and the urban canopy layer. The amplification of coherent structures within this elevation range is substantiated by the [R2] contours and variance profiles, which reveal how the effect of the urban surface vanishes for z/H > 5, providing an estimate ∼ 5H for the UBL height. These bulging UBL regions of elevated |W u + 1 | 2 values appearing in both [R] and [B] cases display remarkable similarity despite the complexity in the distributions. This provides evidence that the amplification of coherent large-scale structures just above the urban canopy height does not (significantly) involve the larger ABL eddies that scale with the boundary layer height.
The effect of LES resolution ([R] vs. [R2]) is distinctly observable for z < H in results from stations M3 and M6. The coarser solution allows large-scale modes to energize the flow at near-ground elevations, whereas the improved resolution facilitates smaller eddies and higher velocity gradients which, in turn, lead to greater kinetic energy dissipation. The effect of resolution on variance within the park trees (station M6) is particularly notable.
Establishing the spatial nature and dynamic behavior of these amplified coherent large-scale structures, possibly characteristic to developing UBL flows, deserves to be pursued further, but this falls beyond the scope of this work. Within the context of this comparative investigation, these slow turbulent modes near the urban canopy present difficulties as they smear all the statistical differences that arise from the model variations within the roughness sublayer. Thus, guided by the power scalogram results, the statistical analysis will be confined to the sufficiently converged frequency/scale range that nonetheless covers all the turbulent length scales that are directly influenced by the vegetation model LOD variations. Although the amplification of large-scale modes within the urban roughness sublayer have not been substantiated experimentally, the presence of low frequency ABL structures in real urban canopy flows should be anticipated. For this reason, the importance of conducting time-frequency analysis of urban flow data is strongly emphasized here.

Filtering Out the Largest Atmospheric Boundary Layer Modes
To perform a targeted comparison of turbulence statistics within a frequency/scale range that is representative of urban canopy turbulence and free of the observed VLSMs, the streamwise-oriented velocity time-series are subjected to a sharp Fourier filter. To specify the appropriate cutoff frequency/scale, both |W u + 1 | 2 distributions and spectral energy density plots are examined. Figure 14a displays an example spectral energy density curve obtained from case [R] and station M6 at two heights z = 0.5H and z = 3H. The M6 station is located within a park of dense vegetation. The spectra reveals that the LES-resolved u 1 signal at z/H = 3 successfully captures the theoretical −5/3 slope energy cascade well over an order of magnitude frequency range. As expected, within the vegetation covered park at z/H = 0.5, the enhanced dissipation due to trees is clearly observed. A sharp cutoff frequency f c = 0.011 s −1 or scale s c = 2.3T for the Fourier filtering is specified to eliminate the fluctuating modes in which wave lengths (λ w > u 1 s c ≈ 9H) exceed the length of two charactristic city blocks (∼ 4H/block) within the domain. Figure 14b illustrates the spectral energy density of the filtered streamwise velocity, labeled u 1 . For completeness, the vertical velocity component is also subjected to identical Fourier filtering, thus yielding w.
A selection of newly computed dimensionless variance profiles are plotted in juxtaposition with their original profiles in Figures 15 and 16 for u 1 and w, respectively. The plots for σ 2 u1 /u 2 * and σ 2 w /u 2 * feature tighter ranges on the axes to allow improved differentiation of differences. The removal of s > 2.3T contributions reduced the original variance of u 1 up to 75% (the original peak value max(σ 2 u1 ) ≈ 4u 2 * appearing in both M5 and M10 profiles approximately at height z/H = 4 reduced to σ 2 u1 ≈ 1u 2 * after filtering) whereas the variance of w was reduced by approximately 20%. This is expected given the horizontal degrees of freedom of the LES model.
The influence from the VLSM modes is still indirectly contained in the high-frequency (low scale) fluctuations via amplitude modulation, and the observed scatter in profiles for z/H > 3 offers an indication for its magnitude. The filtered variance profiles also feature 95th confidence intervals computed via bootstrapping technique to confirm that the resulting scatter does not arise from poorly converged statistics. The most significant deviations are again introduced by the coarser [R2] solution at low altitudes. This is particularly notable in the σ 2 w profiles in Figure 16b where the vertical fluctuations are clearly subdued close to the ground and most notably within the vegetation at location M5. The narrow streamwise-oriented street canoyon site M1 forms an exception in the shown profiles as all model variations agree on the profile shape that reaches its maximum within range 2 < z/H < 3 which is notably higher than at the other shown site. Sites M2 and M7 exhibit similar variance profiles.

Comparing Coherent Structure Content Using Entropy and Divergence
The final approach used to investigate the sensitivity of LES modeled urban canopy turbulence to modeling choices involves a detailed examination of the probability density distributions of coherent local u 1 fluctuations, conveyed via power scalograms |W u + 1 | 2 (t, s, z) evaluated for each urban location. The objective is to compare the energy distribution of turbulent fluctuations occuring at a specific time scale s i , location x m , and height z j . Here, the concepts of information entropy and, more specifically, Shannon entropy (see Equations (19) and (21)) and the associated divergence (Equations (22) and (23)) are exploited to perform a quantitative comparison. Information entropy provides a measure for how broadly the magnitudes of the coherent turbulent fluctuations are distributed, as quantified by a discrete probability density function ρ * ≡ ρ * (|W u + 1 | 2 ). The information divergence, on the other hand, provides a measure of how two probability density functions or sample frequency distributions f * = ρ * ∆ * differ from each other. These entropy and divergence evaluations will be carried out over a time scale range 0.025 < s/T < 1.5 considered most relevant in this study and over an elevation range 0 < z/H < 4.5, which encompasses the essential part of the urban RSL. Figure 17 demonstrates the discrete probability density distributions of power scalogram coefficients ρ * ≡ featuring selected case pairs, time scales, and elevations. The Shannon entropy (see Equation (21)) of the reference case is denoted as H 1 ( f * A ) and the modified case is denoted as . The divergence between the two sample frequency distributions is denoted as The generation of the discrete sample frequencies f * for Equations (21) and (23) must be performed with care because the evaluation of divergence requires that both discrete probability distributions, f * A and f * B , feature exactly the same bins. Therefore, the construction of histograms of the reference case A =[R] and the selected modified case B at specific time scale and elevation must be carried out simultaneously such that the bin distributions can be made suitable for both. This is achieved by first establishing the common min/max bounds for both histograms and then by redefining a quadratic distribution of bin widths from min to max using M initial bins. In this work, M = 40 was determined to provide a sufficiently resolved initial distribution of bins. To ensure each bin represents converged information, every bin ∆ k , where the number of samples n k < 50 in either of the histograms, is merged with ∆ k−1 . This process is repeated until the bin distribution has converged. The final histograms typically feature between 20 and 34 bins. In Figure 17, all histograms contain multiple merged bins at the tail end of the distributions. Figure 18 presents the Shannon entropy distributions whereas the comparison between different monitoring stations reveal systematic differences. Some fine-detail differences particularly at larger scales (s/T > 1), that are also observable as non-systematic spots in the divergence distributions, are assumed evidence for the intermittent coherence that stems from the slowly evolving LSM and VLSM modes, which had their unique realizations in each simulation (except for [R] and [R2]). However, in this context, it is not possible to determine the root cause of the sporadic divergence appearances occuring at larger time scales (s/T > 1) or higher elevations (z/H > 2) because the footprint of the flow phenomena extends far upstream. Nonetheless, it is informative to observe the magnitude range of these disordered divergence instances in relation to where systematic divergence appears.
In general, the divergence between the reference and the other ∆ = H/20 resolution simulations remained systematically very low for s/T < 1 throughout the RSL. A noteworthy exception occurs in Figure 18c Figure 9) and variance ( Figure 11) profiles for the streamwise velocity, the lower resolution results in notably lower information entropy in the resolved turbulence than in the [R] solution. As Figure 17c illustrates, the rare and stronger bursts of coherent turbulence are far less frequent in the coarser [R2] than in the [R] solution. This is a general feature that separates the turbulence in [R2] from [R]: in the vicinity of the urban roughness elements, the coarse solution fails to resolve the stronger bursts of coherent fluctuations. For instance, in the scale range 0.45 < s/T < 1.2, the near-ground turbulence (between the ground and the foliage) displays marked levels of divergence for the same reason.
Moreover, the resolution-induced divergence also dominates in the results for M5 and M6 stations. However, there is a constrast in how the turbulence solutions differ: at station M5, which is near the upwind edge of the park, the divergence is systematically elevated within the foliage at low scales (s/T < 0.75), whereas at M6, which is further downstream, the divergence is accumulated at higher time scales (s/T > 1). At M5, in addition to the infrequency of strong bursts in [R2] solution, the small amplitude coherent fluctuations of the high-resolution [R] solution adapts to the vegetation drag faster, leading to a significantly reduced peak at the left end of the distribution as well. At station M6, the wind is significantly subdued throughout the depth of the vegetation canopy and the two solutions principally differ because the coarser solution contains a reduced count of weak large-scale fluctuations.

Conclusions
In this comparative study, a realistic urban boundary layer flow scenario is numerically investigated with a high-resolution LES model. The scenario involves neutral ABL flow developed over sea surface that becomes incident with a coastal city where topographic description within the LES model is obtained from a high-resolution LiDAR dataset. The simulations are repeated with modified LES models to investigate the sensitivity of the urban flow system to modelling choices that carry uncertainties. This work employs a number of techniques in quantifying the magnitude and nature of the deviations that stem from uncertainties in specifying the aerodynamic properties of urban vegetation and the vertical extent of the ABL. The results and their comparative analysis lead to the following conclusions and subsequent recommendations for urban LES modelling.
The applied LOD modifications to the urban surface or boundary layer depth resulted in minor changes in the log-law similarity scaling parameters computed from double-averaged profiles. The elimination of the largest ABL eddies (case [B]) had the most observable impact, reducing friction velocity u * by 6%, increasing roughness length z 0 by 6%, and reducing displacement height z d by 3%.
The vertical distributions of RN MSD(u + 1 ), the root-normalized-mean-square differences (with respect to the reference case) for streamwise velocity evaluated over horizontal planes, obtain their highest values near the ground but diminish rapidly with elevation. Case [P], featuring simplified LAD-profiles of trees, yielded marginally higher RN MSD(u + 1 ) values than other model modifications, but even its maximum remained at only 10% of the recommended maximum level desired when model results are validated against experimental measurements. Thus, the model uncertainties leave a significant margin for successful experimental model validation. On the other hand, the conditional RN MSD(w + ) values for vertical velocity (evaluated using points where w + > 0 in the reference case) remain consistently around 10% from the recommended maximum throughout the canopy height again case [P] recording the highest discrepancy levels close the ground. Fractional bias (FB) values for both velocity components remained well below 10% of the recommended limit throughout the canopy and above, indicating that the model modifications generate very low levels of systematic deviations.
Overall, the ability to quantify differences in urban canopy flows becomes difficult close to the level of pedestrians and vehicles, particularly in areas where the wind speeds are reduced due to the density of the surrounding urban plan and/or drag caused by dense vegetation. Predictions concerning w within a complex urban canopy are endowed with greater uncertainty as the network of street canyons and blunt obstacles gives rise to a complex distribution of vortices and wakes. This is in contrast with the horizontally streamwise-aligned velocity, which carries most of the momentum and of which the distribution is largely dictated by energetic effects.
The u + 1 profiles obtained at 11 different virtual tower positions depict very high degrees of adherence between the cases, demonstrating that the street-canyon flow within a city section having a high plan and frontal area fraction of buildings is, nonetheless, primarily dictated by the solid obstacles. On the other hand, the profiles obtained from the reference case but extracted from the ∆ = H/10 resolved nested domain reveal the pronounced sensitivity of the solution to grid resolution. The deviations arising due to coarser grid resolution systematically outweighed the deviations due to imposed model modifications. This outcome was particularly pronounced in the comparison of local + profiles although [P] (with simplified LAD-profiles) also showed deviations where vegetation caused changes in mean vertical velocity.
Comparison of local variance profiles is complicated by the presence of the slowly evolving very-large-scale ABL modes which interact with the urban roughness and comprise up to 75% of the observed variance in the streamwise velocity component. The use of sharp Fourier filter to eliminate the direct influence of the VLSM turbulence facilitated a meaningful comparison where, again, the case with reduced spatial resolution yielded the most notable differences within the urban canopy. Case [P] also exhibited deviations in variance profiles at locations where vegetation is in the immediate vicinity. The effects in local variances caused by reducing the vegetation drag or eliminating the largest ABL eddies were minor.
The energy distribution of the turbulent fluctuations at different frequencies are examined and compared by the means of wavelet analysis exploiting the concept of information entropy to provide a measure for the diversity of coherent fluctuation amplitudes within the signal. A measure of the difference between two signals' information entropy content, in turn, is given by information divergence. The divergence distributions, where the reference signal is compared against the modified cases over different frequency and elevation values, reveal that, within the canopy in areas where strong shear layers exists or where the flow adjusts to the surrounding vegetation, the change in grid resolution clearly leads to the highest divergence values. Changing the LAD profiles also results in systematic appearances of divergence within parks below the tree crowns. However, these changes in turbulence were weaker than the changes caused by reduced grid resolution.

•
Nested domain approach is essential in realistic UBL flows due to the size requirement for the computational domain. • Sufficient grid resolution is critical when urban canopy flow is of primary interest. Grid spacing of ∆ = H/20 or better is recommended in the vicinity of the urban roughness elements and close to the ground.

•
The accuracy of the LAD profile is relevant when detailed information is sought close to the ground in proximity of trees.

•
The shape of vegetation should be modelled as accurately as possible, but the flow system within a densely arranged urban environment is not very sensitive to the precision of the constant drag coefficient C Dv in the vegetation drag model.

•
The grid resolution on the top part of the domain can be relaxed as the largest eddies do not influence the urban canopy turbulence to an observable degree. However, it is beneficial to employ sufficiently tall computational domains, with L z ≈ 20H, particularly when the transition from fully developed ABL flow to a developing UBL flow is resolved.

Appendix A. Fourier Filtering: Continuous and Discrete Version
Given the turbulent fluctuations u (t, x), where time is ideally a continous variable, the Fourier transform and antitransform read as follows: Since our time series always have constant time steps (i.e., t k = k ∆t), the convention to discretize the Fourier transform here will be ∆ f = 1/(N∆t), f n = n ∆ f , and the following: The second equality in Equation (A4) is justified because, from Equation (A3), it turns out that u ( f n , x i ) is N-periodic with respect to the index n. It is thus convenient to consider the period of the frequency series between indexes − (N − 1)/2 and N/2 , with • being the floor function. Such a convention renders it easier to associate the the continuous Fourier transform to its discrete version. All the information about the time series is contained in the positive half of the frequency indexes 0, . . . , N/2 , as the fluctuations are a real function. Now, let us suppose we recognize a cutoff frequency f c in the fluctuations u (t, x) under which the flow can be considered small-scale turbulence and above which the flow consists of slow modes only, that is large-scale fluctuations. This split can be done via a sharp Fourier filter: As a consequence of the reality of the fluctuation function, we also have that u ( f , x) = u (− f , x), the symbol standing for complex conjugation. The small-scale fluctuations are represented by u (t, x), whereas u LM (t, x) constitutes the large-mode fluctuations. It is the former subset that is analysed in the present work. However, to do so, it is expedient to consider the small-scale fluctuations plus the average local offset: In practice, our time series will be the velocity fluctuations plus the offset computed on a time sequence t 1 , . . . , t N , and as a result of this, what will be implemented is a discrete combination of Equations (A5)-(A7): having denoted by n c the index such that n c ∆ f = n c /(N∆t) = f c .
following reasons. First, we recall that it is common practice to consider the following approximation whenever ω 0 ≥ 5 [22]: As for the Fourier transform, we get a Gaussian centered in ω 0 /(2π): We can therefore notice from Equation (A11) that the Morlet wavelet can be well approximated as follows: We can also interpret this wavelet as the atom of the joint time-frequency analysis associated to the dimensionless frequency ω 0 2π at time 0. By virtue of Equation (A11), plugging the rescaled time with respect to the time scale s into the wavelet function Ψ t−t s √ s involved in Equation (A9) reveals the spirit of time-frequency joint transform, as its envelop is a Gaussian function centered around the time t. The generated wavelets have the following Fourier transform with respect to t: We can notice the oscillating complex factor due to the time translation t. Also, the Gaussian is centered around the frequency ω 0 /(2πs), with its variance widening by a factor 1/s 2 as a consequence of the uncertainty principle [75]. Each positive value of the rescaling time s biunivocally corresponds to a wavelet ordinary frequency f = ω 0 /(2πs) ∼ 1/s because of the choice ω 0 = 6. The typical wavelength (or pseudoperiod) of the wavelet consequentially is simply ∼ s. This biunivocal correspondence allows defining the spectrogram as the graph of the wavelet transform in the plane ( f , t) instead of (s, t). Finally, we need to establish how Equations (A9) and (A15) will be discretized, taking into account finite size effect and aliasing. With ∆t = 0.5 s being the time step in the time series u(t n , x i ), we already know that time scales close to 1 s are affected by aliasing [81]. Wavelet resolution also imposes constraints on the smallest time scale that can be analyzed. From Equation (A16), the wavelet bandwidth can be identified as [ω 0 /s + 3/s]/(2π). Given that the sample frequency is f s = 2 Hz, the fulfillment of the following constraint would be sufficient in order to avoid aliasing [81]: As a result of this, only time scales sufficiently larger than (ω 0 + 3)/(2π) 1.4 s can be well resolved in the wavelet transform, as stated in the main text.