Building Geometry Simplification for Improving Mesh Quality of Numerical Analysis Model

Abstract: Numerical analysis, especially the finite volume method (FVM), is one of the primary approaches employed when evaluating a building environment. A complicated geometry can degrade the mesh quality, leading to numerical diffusion and errors. Thus, this study develops and evaluates an automatic building geometry simplification method based on integrating similar surfaces for the geometry of an indoor space. A regression model showed that the complexity of the simplified geometry and its similarity to the original geometry decreased linearly with the threshold of the method. The mesh quality was significantly improved by the simplification. In particular, the maximum skewness decreased exponentially with the threshold of the method. It is expected that the simplification method and regression model presented in this study can be used to quantitatively control the mesh quality.

G.P., C.K. and M.L.; software, G.P.; validation, G.P., C.K. and M.L.; formal analysis, G.P.; investigation, G.P., M.L.; resources, G.P.; data curation, G.P.; writing—original draft preparation, G.P.; writing—review and editing, G.P. and C.K.; visualization, G.P.;


Introduction
The finite volume method (FVM) is commonly used to perform numerical analysis in many fields, including fluid dynamics, owing to its advantages in flux calculations in terms of precision [1]. In the construction field, FVM is used for environmental analyses using computational fluid dynamics (CFD), fire safety analysis [2], and some cases of heat, air, and moisture (HAM) analysis [3,4]. In particular, CFD is widely used in the analysis of aerodynamic environments, e.g., indoor ventilation and micro-environments around occupants [5].
FVM is a method that discretizes and analyzes partial differential equations in the form of algebraic equations. For the computation of algebraic equations, there is a need to divide the target model into finite volumes ("cells"), i.e., to design the mesh. During the mesh design process, a discretization error can occur, and the size of the error is affected by the geometry of the finite volume. Discretization errors not only reduce the accuracy of the simulation, but also cause numerical diffusion, which interferes with the convergence of the simulation. The accuracy and stability of the numerical analysis are quantitatively evaluated using a mesh quality that is calculated based on the geometry of cells [6,7]. The mesh quality of an FVM model is typically evaluated according to the non-orthogonality and skewness. Non-orthogonal cells and skewed cells that are generated near complex surfaces are the main sources of numerical errors. In particular, analyses of buoyancy-driven environments, such as those with natural ventilation, are significantly affected by the mesh quality [8,9].
As the fine features and complexity of the geometry adversely affect the mesh design, a geometry simplification is commonly used as a preprocessing step to improve the mesh quality of the model [10]. Among the numerical analysis processes, the simplification of a complex geometry requires a great deal of manual work and is time-consuming [11]. In addition, the geometry simplification needs to be repeated manually until the target accuracy is achieved. Moreover, the determination of unnecessary geometrical features, simplification methods, and the level of simplification may depend on an analyst's arbitrary judgment and experience. Therefore, there is a need for an objective automatic geometry simplification method for the FVM analysis and the evaluation of the mesh quality improvement achieved by the simplification. However, most studies that focus on the simplification of building geometries aim to reduce the computational cost in the visualization process, and there have been relatively few studies focusing on numerical analysis.
This study proposes a building geometry simplification method for improving the mesh quality of FVM models. The rest of the paper is organized as follows. Section 2 discusses previous studies on geometry simplification in the construction industry, and Section 3 discusses the mesh quality metrics. Section 4 presents the proposed automatic geometry simplification method for FVM analysis, while Sections 5 and 6 respectively discuss the evaluation method and results obtained showing the effects of the proposed method. Section 7 presents the conclusions and recommendations for future research.

Literature Review
To achieve the automatic simplification of three-dimensional (3D) geometry, the simplification of a surface mesh is commonly used, such as a mesh decimation algorithm. Surface mesh simplification involves partially deleting the vertices constituting the mesh [12], and was developed for models that include several millions of faces and vertices, such as human body geometry. In research studies on the application of numerical analysis in construction, surface mesh simplification has been used primarily to simplify the computer-simulated person (CSP) geometry in indoor environment analysis models [13,14]. However, building geometries consist mainly of planes that are perpendicular or parallel to each other [15]. Therefore, they are expressed as simple surface meshes with hundreds of vertices. If vertices are deleted using the surface mesh simplification algorithm, the geometry may be excessively deformed.
Most studies on the simplification of the 3D geometries of buildings have been conducted for the computational optimization of buildings and map visualization. Kada [16] proposed a method for generating a new polyhedron from the major surfaces of the sidewall of a building geometry, and excluded negligible surfaces with small areas. Thereafter, the detailed geometry of the roof was reproduced as post-processing. Rau et al. [17] varied the complexity of a horizontal cross-section of a building, and formed a prism geometry by sweeping the cross-section in the vertical direction to simplify the building model. Similarly, He et al. [18] integrated buildings with similar heights, and simplified them into a prism geometry with a flat roof in order to simplify a set of adjacent buildings. In most cases, the building geometry is composed of walls that are perpendicular or parallel to each other, with little change in the vertical direction. The abovementioned studies presented simplification methods that consider these geometric features of buildings. However, for visualization of the 3D map, only the exterior geometry of the buildings was targeted, and its applicability to indoor models was not evaluated.
Owing to the increased usage of file formats such as building information modeling (BIM) formats and geography markup language (GML) formats (e.g., CityGML), simplification studies that use the information of individual building components have been conducted for visualization. Zhao et al. [19] integrated building components using morphological operations to perform a simplification. The hierarchical connection information between components was analyzed to determine the components to be integrated. Geiger et al. [20] classified the level of detail of a building according to the steps (section and height, roof and slab, door and window) required for extracting building components from the BIM model. As these methods use the semantic information of buildings, they are dependent on the data format of the building model. Most of these building simplification methods for visualization purposes apply different simplification criteria to the sidewall and roof surfaces, and they mainly address the exterior properties of the building. As consistent and objective criteria for geometric design are required for the analysis of physical phenomena, these methods are not suitable as preprocessing approaches for numerical analyses.
Generally, simplification in FVM analysis is performed at the discretion of the researcher, and there have been a limited number of studies regarding the application of an automatic geometry simplification based on consistent criteria. Ayala et al. [21] approximated the sloping roof of an atrium model for a fire safety analysis, and it was in the form of a staircase-shaped polyhedron. The roof was simplified in four levels depending on the scale of the stairs. From a comparison of the simulation results, the temperature error of the simplified model was analyzed, and was found to be less than 10%. The staircase-like polyhedron model has the advantage of having a simple design for a high-quality hexahedron mesh. The aim of the study was to analyze the behavior of smoke. However, if the planar roof is transformed into a staircase geometry, a vortex may be formed near the roof surface, and the resistance to fluid and smoke may increase, unlike the actual geometry. Piepereit et al. [22] proposed a method for integrating the surface of a building with adjacent surfaces by sweeping the surface in the normal direction to analyze an exterior wind environment. The degree of simplification was controlled using a distance threshold in the integration. As a result of the mesh design for the original and simplified models, the maximum skewness was decreased from 0.96 to 0.80, and the number of mesh cells was reduced by 4.4%. However, the proposed method is not deterministic as the faces are removed in an arbitrary order, and it is possible to generate small angles that are difficult for meshing. These studies confirmed the applicability of geometry simplification methods through objective criteria for building FVM models. However, considering that each method was applied to a specific case, there is a need to evaluate the general effects of simplification on the degree of change in the geometry and mesh quality according to an established simplification threshold.

Mesh Quality
In FVM analysis, the differential equation of the physical phenomena needs to be discretized for the control volume (i.e., a cell in the mesh). For example, the conservation equation of the variable φ in the steady-state condition is as follows [23]:
where $V_C$ is the control volume, $\vec{v}$ is the velocity vector, Γ is the diffusivity coefficient, and $Q$ is the source. Figure 1 illustrates an approximation of the conservation equation for two given cells, as shown in Equations (2)-(4).
where $a_f$ is the area of face $f$, and $f_x$ is the interpolation factor. In Figure 1, $P$ and $Q$ are the centers of each cell, $f_c$ is the face centroid, $f_i$ is the point of the face interpolated value, and $\vec{n}_f$ is the normal of face $f$.
Appl. Sci. 2020, 10, 5425
Equation (2) is the result obtained by approximating the surface integral of φ through the flux of the surface. As the FVM stores the values at the cell centers ($φ_P$ and $φ_Q$), the value representing the face where the two cells are in contact ($φ_f$) is approximated by the value at the point $f_i$ ($φ_{f_i}$) (Equation (3)). However, this equation assumes that the normal vector of $f$ is parallel to the vector between the centers of the two cells ($\vec{n}_f \parallel \vec{s}_f$), and that the point $f_i$ represents the face ($f_i = f_c$). Numerical errors and diffusion exist for cells that do not match the abovementioned assumptions, as shown in Figure 1.
To evaluate the error, the mesh quality is evaluated based on the non-orthogonality (θ) and skewness as follows:
The non-orthogonality and skewness are metrics that are used to evaluate the deviation of the mesh from each assumption. The lower the values of both metrics, the higher the mesh quality. A mesh with high values for both metrics not only has a large error, but is also likely to experience numerical diffusion in the process of the numerical analysis. Both metrics are evaluated for all pairs of adjacent cells of the designed mesh. As a small number of low-quality cells can affect the convergence and error of the entire model, the mesh quality of the model is generally evaluated based on the maximum value. The mesh quality recommended in OpenFOAM, an open source CFD software, is a maximum non-orthogonality of 70° and a maximum skewness of 4.0 [24].
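As a rough illustration of the two metrics, the following Python sketch evaluates them for a single internal face. It assumes commonly used definitions: the non-orthogonality is the angle between the face normal and the vector joining the two cell centers, and the skewness is the distance between the face centroid f_c and the point f_i where that vector crosses the face plane, normalized by the distance between the cell centers. This normalization is an assumption and may differ from the exact definitions used in this study or in OpenFOAM; all function and variable names are illustrative.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _norm(a):
    return math.sqrt(_dot(a, a))

def face_quality(P, Q, f_c, n_f):
    """Return (non-orthogonality in degrees, skewness) for one internal face.

    P, Q : centers of the two cells sharing the face
    f_c  : centroid of the face
    n_f  : normal of the face (any length, pointing from P towards Q)
    """
    s_f = _sub(Q, P)  # vector between the two cell centers
    cos_theta = _dot(n_f, s_f) / (_norm(n_f) * _norm(s_f))
    theta = math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))
    # f_i: intersection of the line P -> Q with the plane of the face.
    # Undefined when n_f is perpendicular to s_f (theta = 90 degrees).
    t = _dot(n_f, _sub(f_c, P)) / _dot(n_f, s_f)
    f_i = tuple(p + t * s for p, s in zip(P, s_f))
    skewness = _norm(_sub(f_c, f_i)) / _norm(s_f)  # assumed normalization
    return theta, skewness

def mesh_quality_ok(faces, max_non_orth=70.0, max_skewness=4.0):
    """Check the maximum values against the limits quoted from [24]."""
    results = [face_quality(*face) for face in faces]
    return (max(r[0] for r in results) <= max_non_orth
            and max(r[1] for r in results) <= max_skewness)
```

For two unit-spaced cells whose shared face is offset sideways from the line joining the centers, the sketch reports zero non-orthogonality but a non-zero skewness, which matches the distinction between the two assumptions described above.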
The type of polyhedron for the cells also influences the numerical error. A cell is classified by the type of polyhedron, e.g., hexahedron and tetrahedron. Tetrahedron cells are easy to design, but are known to cause numerical errors and diffusion because of their low quality [25,26]. In contrast, hexahedron cells have advantages in terms of their accuracy and computational efficiency in analysis, although it is difficult to design high-quality cells in the case of complex geometries [7,9]. In addition, with hexahedron cells, a mesh containing a relatively small number of cells can be designed, thereby reducing the computational cost of the numerical analysis.

Automatic Geometry Simplification Method for FVM
The building geometry simplification method proposed in this study consists of preprocess, face classification, and polyhedron generation steps (Figure 2). This method evaluates the similarity between surfaces to classify the surfaces to be removed. When the target model is composed of multiple solid objects, such as internal spaces or structures of the building, there may be numerical errors, such as small gaps between the solids. These can cause simplification errors and increase the number of surfaces, adversely affecting the performance of the algorithm. Considering this, a single solid object is created by uniting the target solids (Figure 2b). The similarity between surfaces is evaluated by the difference in the angle between the surfaces, as well as the distance between them. For this purpose, every surface of the target model should be flat (i.e., a face). Therefore, when there is a curved surface in the target model, the model has to be approximated in the form of a polyhedron composed of faces. In the face classification step, faces are classified into major faces (solid lines in Figure 2c), which compose the surface of the simplified model, and minor faces (dotted lines), which are removed. To fill the empty spaces where minor faces have been removed, solids are generated using the boundaries of stretched major faces, and solids that are similar to the original geometry are selected (Figure 2d). The selected solids are united to obtain a simplified building geometry (Figure 2e).

Preprocess
If the target building model contains multiple solid objects, a union operation is applied to the solids. A union operation, one of the Boolean operations on solids, creates a solid containing the combined volume of the target solids, analogous to the union of sets (∪). The building model is thereby represented as a single solid (Figure 2b). To handle numerical errors on the surfaces where the solids abut, a geometry repair algorithm [27] is applied after the union.
To approximate the target building geometry into a polyhedron, the solid is represented in a boundary representation (B-rep) form, and the surface meshes are generated. B-rep is a solid modeling method that represents a solid using surface geometric information (set of surfaces, edges, and vertices). When a curved surface is present in the target geometry, a triangular surface mesh is generated by a tessellation algorithm [28], and each cell of the surface mesh is regarded as a face.

Face Classification
To preserve the geometrical characteristics in the simplification process, an insignificant face in the model is integrated into the nearby major face that determines the overall shape of the model. The surfaces of building geometries are generally parallel or perpendicular to each other, as in a cuboid. Given this characteristic, faces that have a large area and a normal direction similar to that of other faces are considered major faces. The area $a$ and index $p$ are calculated as the preservation priorities for each face of the B-rep solids. $p$ is an index that is defined in this study, and calculated as follows:
where $p_i$ is the proposed index, $a_i$ is the area (m²), and $\vec{n}_i$ is the unit normal of the target face $i$. $p$ is the area-weighted average of the scalar product of the normal of the target face and the normals of the other faces. By evaluating the priority sequence of the faces through $p$ along with the face area, it is possible to obtain a model consisting of Cartesian angles, and by preventing the formation of acute angles, the generation of high-quality meshes can be expected [15,29].
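The verbal definition of the index can be sketched in Python as follows. This is only one possible reading: the sketch assumes that the absolute value of the scalar product is used and that the average runs over all faces, including the target face itself. The absolute value is an assumption motivated by the fact that, for a closed polyhedron, the area-weighted sum of the outward normals is zero, so a signed average would vanish for every face. The function name and the `use_abs` switch are illustrative.

```python
def priority_index(areas, normals, use_abs=True):
    """Area-weighted average of the scalar products between face normals.

    areas   : list of face areas a_j
    normals : list of unit normals n_j (3-tuples), same order as areas
    use_abs : assumption, treat anti-parallel faces as similar to parallel ones
    """
    total_area = sum(areas)
    indices = []
    for n_i in normals:
        weighted = 0.0
        for a_j, n_j in zip(areas, normals):
            product = sum(x * y for x, y in zip(n_i, n_j))
            weighted += a_j * (abs(product) if use_abs else product)
        indices.append(weighted / total_area)
    return indices

# Example: a 4 m x 2 m x 1 m cuboid room (areas in m^2).
areas = [2, 2, 4, 4, 8, 8]
normals = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
p = priority_index(areas, normals)
```

Under these assumptions, the floor and ceiling of the cuboid receive the highest index because the largest share of the total area is parallel to them.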
The list of all faces of the target solid is sorted by the area $a$ and index $p$ to extract the face priority sequence ($F$), and the faces are classified into major and minor faces (Figure 3). Following the priority sequence, each face ($f_1$) is compared with every face of smaller area ($f_2$) to determine whether $f_2$ should be removed (Figure 3a). When the distance ($d$) and the angle ($\angle(\vec{n}_1, \vec{n}_2)$) between the faces are smaller than the input thresholds, the face $f_2$ is determined to be a minor face, that is, a surface to be removed. The distance ($d$) is evaluated as the distance from $c_2$ (the center of $f_2$) to $p$ (the foot of the perpendicular from $c_2$ to $f_1$), which is calculated as $d = |\vec{n}_1 \cdot (c_2 - c_1)|$, where $c_1$ is the center of $f_1$. At this time, if there is any geometry that should not be removed according to the purpose of the numerical analysis ($S_{preserve}$), it is excluded from the evaluation of minor faces. For example, the geometry of an opening such as a door, which is important for CFD analysis, was assigned as $S_{preserve}$ in Figure 3b. After classifying all minor faces ($F_{minor}$), the remaining faces are determined to be the major faces. Algorithm 1 shows the pseudocode for the face classification algorithm.

Algorithm 1 Pseudocode for face classification
Input:
$S_{target}$: Set of geometry solids of the target model
$S_{preserve}$: Set of geometry solids to preserve
$\varphi_{threshold}$: Angle threshold of simplification
$d_{threshold}$: Distance threshold of simplification
Output:
$F_{major}$: Set of major faces of target geometry $S_{target}$
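The face classification described in the text can be sketched in Python as below. This is a minimal reading of the prose, not the published pseudocode: it assumes that faces are already extracted as planar records, that the priority sequence is obtained by sorting by area and then by the index p in descending order, that a face already marked as minor does not absorb other faces, and that the distance is measured from the center of the larger face. The `Face` record and all names are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class Face:
    name: str
    area: float
    normal: tuple          # unit normal
    center: tuple
    p: float = 0.0         # priority index of the face
    preserve: bool = False # True for geometry in S_preserve (e.g., a door)

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _angle_deg(n1, n2):
    cos_a = max(-1.0, min(1.0, _dot(n1, n2)))
    return math.degrees(math.acos(cos_a))

def classify_faces(faces, angle_threshold_deg, d_threshold):
    """Return the major faces, in priority order."""
    # Assumed priority sequence: larger area first, then larger index p.
    order = sorted(faces, key=lambda f: (f.area, f.p), reverse=True)
    minor = set()
    for i, f1 in enumerate(order):
        if f1.name in minor:
            continue  # assumption: a removed face cannot absorb others
        for f2 in order[i + 1:]:
            if f2.preserve or f2.name in minor or f2.area >= f1.area:
                continue
            offset = tuple(c2 - c1 for c1, c2 in zip(f1.center, f2.center))
            d = abs(_dot(f1.normal, offset))  # distance from c2 to plane of f1
            if d < d_threshold and _angle_deg(f1.normal, f2.normal) < angle_threshold_deg:
                minor.add(f2.name)
    return [f for f in order if f.name not in minor]

# Example: a wall with a shallow recess, a floor, and a preserved door face.
faces = [
    Face("wall", 10.0, (1, 0, 0), (0.0, 0.0, 0.0)),
    Face("floor", 8.0, (0, 0, 1), (2.0, 0.0, -1.0)),
    Face("recess", 1.0, (1, 0, 0), (0.05, 1.0, 0.0)),
    Face("door", 0.5, (1, 0, 0), (0.02, -1.0, 0.0), preserve=True),
]
major = classify_faces(faces, angle_threshold_deg=90.0, d_threshold=0.1)
```

In this example the recess is merged into the wall, while the door survives because it is marked as preserved and the floor survives because it is perpendicular to the wall.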

Polyhedron Generation
After the faces are classified, the base solid of the simplified model is generated as a bounding volume. The base solid needs to have buffer space outside the original model (i.e., it must be larger than the minimum bounding volume), since new solids might be generated in the simplified model, as shown in Figure 2d. In this study, a solid that is twice the length of each dimension of the minimum bounding box was generated.
The base solid is split by each plane containing a major face, as shown in Figure 4. Among the solids split by the planes, the solids that will form the simplified model are selected by comparing them with the original model (Figure 5a). For each split solid ($S_i$), the ratio of the volume shared by the solid and the original solid ($S_i \cap S_{target}$) to the volume of the solid ($S_i$) is calculated. Solids with a volume ratio of 0.5 or more are selected. Figure 5b shows the calculated volume ratio for some of the solids. Finally, the selected split solids (Figure 5c) are united to generate the simplified building geometry. Algorithm 2 is the pseudocode for the polyhedron generation process.

Algorithm 2 Pseudocode for polyhedron generation
Input:
$S_{target}$: Set of geometry solids of the target model
$F_{major}$: Set of major faces of target geometry $S_{target}$
Output:
$S_{simplified}$: Solids of simplified geometry
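The selection by volume ratio can be illustrated with a deliberately simplified Python sketch. Instead of general Boolean operations on B-rep solids, it assumes that the original model is a set of non-overlapping axis-aligned boxes and that every major face lies on an axis-aligned plane, so that each split solid is itself a box. The base solid is assumed to be centered on the minimum bounding box. These restrictions and all names are assumptions made for illustration only.

```python
from itertools import product

def box_volume(box):
    lo, hi = box
    return (hi[0] - lo[0]) * (hi[1] - lo[1]) * (hi[2] - lo[2])

def intersection_volume(a, b):
    volume = 1.0
    for k in range(3):
        volume *= max(0.0, min(a[1][k], b[1][k]) - max(a[0][k], b[0][k]))
    return volume

def base_box(boxes, scale=2.0):
    """Bounding box enlarged to `scale` times its length in each dimension."""
    lo = [min(b[0][k] for b in boxes) for k in range(3)]
    hi = [max(b[1][k] for b in boxes) for k in range(3)]
    center = [(l + h) / 2 for l, h in zip(lo, hi)]
    half = [(h - l) / 2 * scale for l, h in zip(lo, hi)]
    return (tuple(c - s for c, s in zip(center, half)),
            tuple(c + s for c, s in zip(center, half)))

def select_cells(original_boxes, cuts, ratio_threshold=0.5):
    """Split the base box by the major planes and keep cells with ratio >= threshold.

    cuts : three lists with the x, y, and z coordinates of the major planes
    """
    base = base_box(original_boxes)
    intervals = []
    for k in range(3):
        inside = [c for c in cuts[k] if base[0][k] < c < base[1][k]]
        coords = sorted({base[0][k], base[1][k], *inside})
        intervals.append(list(zip(coords[:-1], coords[1:])))
    selected = []
    for x, y, z in product(*intervals):
        cell = ((x[0], y[0], z[0]), (x[1], y[1], z[1]))
        shared = sum(intersection_volume(cell, b) for b in original_boxes)
        if shared / box_volume(cell) >= ratio_threshold:
            selected.append(cell)
    return selected

# Example: a 4 m x 4 m x 3 m room with a 0.2 m deep niche in one wall.
room = ((0, 0, 0), (4, 4, 3))
niche = ((4, 1, 0), (4.2, 2, 3))
# Major planes after face classification; the back of the niche was removed.
cells = select_cells([room, niche], cuts=([0, 4], [0, 4], [0, 3]))
```

In the example, only the cell coinciding with the room is selected, because the niche fills far less than half of the neighboring cell; uniting the selected cells therefore yields the room without the niche.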

Geometrical Properties
In addition to the mesh quality, the changes in the geometric properties according to the simplification of the building model were evaluated in terms of the shape similarity and meshing complexity. The similarity between the original and simplified geometry was evaluated by comparing the D2 shape distribution of each geometry. The meshing complexity refers to the degree of difficulty in the mesh design, which depends on the complexity of the geometry; the meshing complexity was evaluated by the inverse topology count (ITC).

Shape Similarity
The shape distribution involves the distribution of the geometric characteristics of the model that are evaluated with a shape function, and is used to quantitatively describe and compare the 3D geometry [30,31]. There are numerous shape distributions that apply various shape functions, such as the angle (A3), distance (D1, D2), area (D3), and volume (D4); however, the D2 shape distribution is known to be most suitable for model comparison and classification [32]. The D2 shape distribution is defined as the distribution of the distance between two arbitrary points on a model surface. In practice, it is evaluated based on a histogram of the distance between the two points that are sampled randomly from the surface.
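A histogram-based estimate of the D2 shape distribution can be sketched in Python as follows. The sketch assumes that the surface is available as a list of triangles, that points are sampled uniformly by choosing triangles in proportion to their area, and that the histogram is normalized to sum to one. The number of bins, the distance range, and all names are illustrative choices rather than the settings of this study.

```python
import math
import random

def triangle_area(a, b, c):
    u = tuple(bi - ai for ai, bi in zip(a, b))
    v = tuple(ci - ai for ai, ci in zip(a, c))
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    return 0.5 * math.sqrt(sum(x * x for x in cross))

def sample_point(triangle, rng):
    """Uniform random point inside a triangle (square-root parameterization)."""
    a, b, c = triangle
    s = math.sqrt(rng.random())
    r = rng.random()
    return tuple((1 - s) * a[k] + s * (1 - r) * b[k] + s * r * c[k] for k in range(3))

def d2_distribution(triangles, n_pairs=10000, n_bins=64, max_dist=None, seed=0):
    """Normalized histogram of distances between random surface point pairs."""
    rng = random.Random(seed)
    areas = [triangle_area(*t) for t in triangles]
    distances = []
    for _ in range(n_pairs):
        t1, t2 = rng.choices(triangles, weights=areas, k=2)
        distances.append(math.dist(sample_point(t1, rng), sample_point(t2, rng)))
    upper = max_dist if max_dist is not None else max(distances)
    histogram = [0] * n_bins
    for d in distances:
        index = min(int(d / upper * n_bins), n_bins - 1)
        histogram[index] += 1
    return [count / n_pairs for count in histogram]

# Example: a unit square split into two triangles.
square = [((0, 0, 0), (1, 0, 0), (1, 1, 0)), ((0, 0, 0), (1, 1, 0), (0, 1, 0))]
hist = d2_distribution(square, n_pairs=2000, n_bins=4, max_dist=2.0, seed=1)
```

When two geometries are compared, the same `n_bins` and `max_dist` should be used for both so that the histogram bins correspond to each other.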
The similarity between the shape distributions was evaluated using the Bhattacharyya coefficient. For distributions $p_1(x)$ and $p_2(x)$ ($x \in X$), the Bhattacharyya coefficient (ρ) is calculated as follows [33]:
$\rho = \sum_{x \in X} \sqrt{p_1(x)\, p_2(x)}$
The Bhattacharyya coefficient measures the amount of overlap between the two distributions. It has a range of 0 ≤ ρ ≤ 1, and can be interpreted as a standardized similarity between the distributions.
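For two histograms defined on the same bins, the coefficient can be computed with a few lines of Python. The sketch below normalizes both histograms before comparison, which is an added convenience and assumes that neither histogram is empty; the function name is illustrative.

```python
import math

def bhattacharyya_coefficient(p1, p2):
    """Bhattacharyya coefficient of two histograms with identical bins."""
    if len(p1) != len(p2):
        raise ValueError("histograms must share the same bins")
    total1, total2 = sum(p1), sum(p2)
    return sum(math.sqrt((a / total1) * (b / total2)) for a, b in zip(p1, p2))
```

Identical histograms give a coefficient of 1, and histograms without any common non-empty bin give 0.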

Meshing Complexity
The meshing complexity of a numerical analysis model increases, and the attainable mesh quality decreases, with the complexity of the target geometry. White et al. [34] proposed the ITC as an index for evaluating the hexahedral meshing complexity, and the calculation formula is as follows:
where $C_{ITC}$ is the ITC, $F$ is the set of faces, and $E$ is the set of edges in the target model. The ITC is an index for evaluating the complexity of a geometry and the ease of hexahedral meshing based on the basic geometric information, i.e., the number of faces and edges. It has a range of $0 < C_{ITC} \le 1$, and a lower value is obtained for a model with more complex surfaces, which comprises many faces and edges. The ITC of a cube (|F| = 6, |E| = 12), which is ideal for the hexahedron mesh design, is equal to the maximum value, 1.

Building Geometry Dataset
As the simplification method proposed in this study does not consider building materials or semantic information (e.g., relationships and hierarchies of components), the 3D geometry can be extracted and applied regardless of the data format of the building model. To evaluate the proposed method for models with various geometric characteristics, the building models were obtained from BIM libraries available on the web. Two or three models were selected from each of three libraries as follows: Academic Advance Sample, Academic Kingo, and Haus30 models from [35]; Medical Clinic, Duplex Apartment, and Office Building models from [36]; and OTC Conference Center and West Riverside Hospital Parking Garage from [37]. The domain for building environment simulations, such as CFD simulations and fire safety assessments, is usually an interior space, rather than a structural part of the building. As such, the geometry of the indoor space (IfcSpace class) was selected for evaluation from among the attributes of the BIM model. To set the boundary conditions for the indoor environmental analysis model, the information of the doors and windows was required; in consideration of this, the geometries of the openings in doors and windows (IfcOpeningElement class) adjacent to the target space were extracted at the same time. A total of 40 spaces were selected, five from each BIM model, including spaces with complex geometries that could be simplified, such as curved surfaces, walls with different depths, and interior walls. To evaluate the general effect of the geometry simplification, geometries of various geometrical complexities were selected (Table 1). The characteristic length, which is an index that represents the hydrodynamic properties of a geometry, was calculated as follows:
where $L_C$ is the characteristic length (m), $V$ is the volume (m³), and $A$ is the surface area (m²) of the target geometry.

Implementation and Mesh Design
The algorithm implementation and data analysis were performed in the Python programming language. The open source library IfcOpenShell [38] was used to parse the BIM model and extract the geometry. The simplification method was implemented and the D2 shape distribution was evaluated using Open CASCADE [39], a computer-aided design (CAD) library, and its wrapper library pythonOCC [40]. The mesh of the target building model was designed using the cfMesh library [41]. cfMesh can automatically generate the mesh and boundary layers for each geometric shape, and is included in OpenFOAM [42]. A hexahedron mesh was generated using cfMesh; for complex geometries where it is difficult to generate high-quality hexahedron meshes, some cells are generated in the shape of a tetrahedron, prism, or pyramid.

Experiment Settings and Analysis
The changes in geometric shape and mesh quality were analyzed according to the threshold of the proposed geometry simplification method. As the building models were mostly composed of perpendicular or parallel surfaces, the distance threshold had a more dominant effect on the results of the simplification method than the angle threshold. Therefore, the angle threshold ($\varphi_{threshold}$) was fixed at 90°, and the degree of the simplification was controlled according to the distance threshold ($d_{threshold}$). Considering the differences in the scale and complexity of individual models, the distance threshold was standardized by a characteristic length, i.e., a metric in the length dimension representing the geometric properties. Each model was simplified with a distance threshold of 0.01, 0.05, 0.1, 0.5, 1.0, 5.0, or 10.0 times the characteristic length, and was compared with the original model. In view of a CFD simulation or fire safety analysis for the building, the geometry of the openings (windows and doors), which significantly affects the indoor aerodynamic environment, was excluded from the target of simplification. There is a trade-off between the accuracy and the computational cost of the numerical analysis according to the number of mesh cells. The researcher controls the size and number of cells considering the required accuracy of the model and the available computational resources. Considering this, meshes with cell sizes of 0.25, 0.5, 1.0, 2.0, and 4.0 m were designed for each model.
The applicability of the geometry simplification was evaluated in terms of the degree of change in the geometry and mesh quality. The similarity between the original and the simplified model was quantitatively evaluated by the Bhattacharyya coefficient between D2 shape distributions. The D2 shape distribution was calculated from the distances between 1,000,000 pairs of points that were randomly sampled from the surface of the target geometry. The mesh quality was evaluated based on the maximum non-orthogonality, maximum skewness, and ratio of hexahedron cells. In addition, the number of cells was evaluated as it is proportional to the computational cost of the numerical analysis model. The change in each index according to the simplification threshold was analyzed using a linear regression model based on ordinary least squares. Unlike other metrics, the skewness does not have an upper limit. Typically, high-quality meshes have a maximum skewness between 0 and 4; however, extremely skewed cells with skewness values of $1 \times 10^{10}$ or more can be generated near the surface of some complex geometries. To minimize the effects of these outliers in the statistical analysis, the elliptic envelope method [43] was applied to the data from the mesh design to remove the 1% of the data identified as outliers.
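For a single explanatory variable, an ordinary least squares fit reduces to the closed-form expressions sketched below in Python. The sketch is generic and does not reproduce the regression variables, transformations, or outlier removal of this study; fitting the logarithm of a response is shown in the usage lines only as one assumed way of describing an exponential trend, and the numbers are synthetic values chosen for illustration.

```python
import math

def ols_fit(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return slope, intercept, r_squared

# Synthetic illustration (not data from this study): y = 3 * exp(-0.5 * x).
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [3.0 * math.exp(-0.5 * xi) for xi in x]
slope, intercept, r2 = ols_fit(x, [math.log(yi) for yi in y])
```

With the synthetic values above, the fitted slope recovers the decay rate and the exponential of the intercept recovers the initial value.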

Simplification of Building Geometry
A total of 40 geometries were simplified under six distance thresholds each. The average execution time of the implemented algorithm was 137.5 s (median 37.3 s) with Ryzen 7 3800X processor and 32 GB RAM. Figure 6 shows the simplification of a typical building model for which the majority of adjacent surfaces are orthogonal to each other. The gray surfaces and blue surfaces represent ordinary walls and openings respectively. Figure 6a shows Space 1E24 in the Medical Clinic model. As the distance threshold (d threshold ) of the geometry simplification increased, the details of the geometry were gradually removed, and the overall geometry tended to be similar to a simple cube.  Figure 6b illustrates Space 04 from the Academic Kingo model. For d threshold = 0.5L C0 , some parts of the ceiling surface parallel to the floor were united into the remaining non-horizontal ceiling. As the opening geometry was excluded from the simplification target, when the distance threshold is 1.0L C0 or greater, some of the wall face is shifted to a parallel plane to the face of the doors.
When the model geometry contained several non-orthogonal and curved surfaces, the geometry mainly changed at low thresholds below $1.0L_{C0}$ (Figure 7). This was considered to be because the length of the perpendicular from the center, which is a criterion for surface integration, was relatively short owing to the small angle difference between faces. Figure 7a is Space 123 of the Academic Advanced Sample model.
The circular column inside the original model was represented as a regular 24-sided prism when the surface mesh was generated. Upon applying the simplification, it was transformed into an octagonal column ($d_{threshold} = 0.1L_{C0}$) and a square column ($d_{threshold} = 0.5L_{C0}$). When the distance threshold was $1.0L_{C0}$ (= 0.756 m) or longer, the column was removed because the distance threshold exceeded the diameter (0.3 m) of the column. Figure 7b is Space B112 in the OTC Conference Center model.
Figure 8 shows the averages of the geometric properties of the target model according to the simplification threshold. Based on the calculated results of the Bhattacharyya coefficient, the similarity with the original geometry tended to decrease as the degree of simplification increased. When the distance threshold was at its maximum ($d_{threshold} = 10.0L_{C0}$), the mean of the Bhattacharyya coefficient was 0.963, and the standard error was $5.83 \times 10^{-3}$ (Figure 8a). Based on the results of the statistical analysis, no significant change in the Bhattacharyya coefficient was observed according to the geometrical properties (ITC and characteristic length). In contrast, the distance threshold of the simplification had a linear relationship with the Bhattacharyya coefficient.
From the result of the regression analysis (Table 2), the similarity with the original geometry decreased by approximately 0.31% when the distance threshold increased by $1.0L_{C0}$.
According to the characteristics of the simplification method for integrating similar surfaces, the number of faces and edges in the model tended to decrease. In the original model, the average number of faces was 1243 and the number of edges was 1866; however, with the simplification under a condition of $d_{threshold} = 10.0L_{C0}$, the numbers were reduced to 224 and 335, respectively (Figure 8b). As a result, the ITC, which represents the hexahedral meshing complexity, showed an increasing trend. The average ITC increased from 0.0425 (standard error 0.0165) for the original model to 0.0929 (SE 0.0315) under a condition of $d_{threshold} = 5.0L_{C0}$, and to 0.0847 (SE 0.0212) for $d_{threshold} = 10.0L_{C0}$.

Mesh Quality Improvement by Geometry Simplification
A hexahedral mesh of each size was designed for the original and simplified geometries, and the respective mesh qualities were evaluated. Data were collected for 10,661 cases, excluding cases where simplification was not performed because the distance threshold value was too small, or where mesh generation failed owing to an excessively large mesh size compared with the volume of the model. After removing approximately 1% of these cases as outliers based on the distribution of the mesh quality, 10,549 cases remained. Among the mesh quality metrics, the maximum skewness formed a distribution with an extremely long tail on the right side (skewness of distribution 15.54); considering this distribution, the maximum skewness was log-transformed before the analysis. Figure 9 shows the proportion of hexahedron cells in the mesh and the relative number of cells according to the simplification threshold and mesh size. From the result of meshing the original geometry, the proportion of hexahedron cells among all cells tended to decrease as the mesh size increased (Figure 9a). Hexahedron cells have the advantage of producing fewer numerical errors, and a mesh can be designed with a relatively small number of them, but it is difficult to achieve a high-quality hexahedral mesh design for complex geometries. The volume range of the target building models was 0.96-1062.6 $\mathrm{m^3}$, and the characteristic length range was 0.015-1.025 m. As the mesh size was set to be larger relative to the volume and complexity of the target geometry, it was believed that the proportion of cells with geometries such as tetrahedron, pyramid, and prism increased to represent the detailed geometry of the model surface. The hexahedron cell ratio of the simplified model increased linearly according to the simplification threshold. For all mesh sizes, the hexahedron cell ratio of the model simplified with $d_{threshold} = 10.0L_{C0}$ increased by an average of 10.4% when compared with the original model.

This was believed to result from the generation of cells other than the hexahedron, as described above, which could be prevented by setting an appropriate simplification threshold and mesh size.
Figure 10 shows the changes in the maximum non-orthogonality and skewness, which affect the accuracy and stability of the FVM. Both metrics tended to decrease with an increasing simplification threshold, that is, the mesh quality improved. The total average maximum non-orthogonality of the original model was 69.6°, and the log-transformed maximum skewness ($\log \epsilon$) was 0.82. With the geometry simplification, the metrics decreased to 65.3° and 0.47 with $d_{threshold} = 1.0L_{C0}$; under the condition of $d_{threshold} = 10.0L_{C0}$, the values decreased to 51.0° and −0.13, respectively. In contrast, as a larger mesh was designed, the mesh quality improved on average; however, there was no consistent tendency. For example, the maximum skewness of the 4.0-m mesh was higher than that of the 2.0-m mesh under the $d_{threshold} \leq 1.0L_{C0}$ condition.
For a quantitative analysis of the effect of the geometry simplification on the mesh quality, a multiple linear regression analysis was performed (Tables 3-5). The independent variables included the ITC ($C_{ITC0}$) and characteristic length ($L_{C0}$) of the original geometry, the maximum size of the designed mesh ($s$), and the degree of simplification normalized with the characteristic length ($d_{threshold}/L_{C0}$). Considering the nonlinear relationship between the mesh size and quality, the square term of the mesh size ($s^2$) was added as an independent variable. In the case of maximum non-orthogonality, the effect of the characteristic length was not statistically significant (p-value = 0.350), and it was excluded from the analysis.
All of the linear regression models were evaluated as being statistically significant (p-value < $2.2 \times 10^{-16}$). For all the metrics, the mesh quality improved with the increasing simplification threshold ($d_{threshold}/L_{C0}$) and the geometry characteristic metrics ($C_{ITC0}$ and $L_{C0}$). Because the adjusted $R^2$ of all linear models was less than 0.4, some factors that affect the mesh quality were probably not considered in this study. However, considering the statistical significance of the individual variables, the geometry simplification process proposed in this study does improve the mesh quality for FVM analysis. In particular, the maximum skewness $\epsilon$ decreased exponentially with the distance threshold,
$$\epsilon = \exp\left(3.02 - 0.118\,(d_{threshold}/L_{C0}) - 1.20\,s + 0.213\,s^2 - 4.64\,C_{ITC0} - 0.720\,L_{C0}\right),$$
and the corresponding quality improvement effect was expected to be high. From the regression models, it is expected that the mesh quality can be predicted, and an appropriate simplification threshold ($d_{threshold}$) can be selected before designing a building FVM simulation model.
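As a rough illustration of how the skewness regression could be used to pick a threshold, the sketch below evaluates the fitted equation and inverts it for $d_{threshold}/L_{C0}$. The coefficients are copied from the equation above; everything else is an assumption made for the example: the function names, the use of the natural logarithm (implied by the exponential form), and the input values, which combine the average ITC of the original models (0.0425) and the characteristic length quoted for Space 123 (0.756 m) with a hypothetical 1.0-m mesh. Given the adjusted $R^2$ below 0.4, any such prediction should be read as a coarse estimate.

```python
import math

# Coefficients copied from the regression equation for maximum skewness.
B0, B_D, B_S, B_S2, B_ITC, B_LC = 3.02, -0.118, -1.20, 0.213, -4.64, -0.720


def _baseline(s, c_itc0, l_c0):
    """All terms of log(skewness) that do not depend on the threshold."""
    return B0 + B_S * s + B_S2 * s * s + B_ITC * c_itc0 + B_LC * l_c0


def predicted_max_skewness(d_ratio, s, c_itc0, l_c0):
    """Predicted maximum skewness for d_ratio = d_threshold / L_C0."""
    return math.exp(_baseline(s, c_itc0, l_c0) + B_D * d_ratio)


def required_threshold_ratio(target_skewness, s, c_itc0, l_c0):
    """Smallest d_threshold / L_C0 predicted to reach the target skewness.

    Returns 0.0 when the unsimplified geometry is already predicted to meet it.
    """
    ratio = (math.log(target_skewness) - _baseline(s, c_itc0, l_c0)) / B_D
    return max(0.0, ratio)


# Illustrative inputs only (mesh size in m, ITC, characteristic length in m).
s, c_itc0, l_c0 = 1.0, 0.0425, 0.756
ratio = required_threshold_ratio(1.5, s, c_itc0, l_c0)
print(ratio, predicted_max_skewness(ratio, s, c_itc0, l_c0))
```

The inversion is only valid inside the range of thresholds covered by the experiments (up to $10.0L_{C0}$); a result outside that range would be an extrapolation.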

Conclusions
This study proposed an automatic simplification methodology for building geometries to improve the accuracy and stability of the FVM model, and analyzed the changes in geometry and mesh quality caused by the simplification. The geometry simplification method proposed in this study integrates similar faces according to an angle and a distance threshold, and removes insignificant faces from the surfaces composing the building model. To evaluate the applicability of the simplification method, the geometries of building models obtained from a public BIM library were simplified under various conditions.
To evaluate the extent of the shape changes from the simplification method, the similarity between the original and simplified geometry was evaluated using the D2 shape distribution and Bhattacharyya coefficient. In addition, the complexity of the building model and hexahedron meshing were evaluated using the ITC. As a result, the similarity of simplified geometry to the original model and the complexity decreased in proportion to the distance threshold. Therefore, it was determined that the degree of geometry changes could be controlled linearly using the threshold of the simplification method. The mesh quality of the building model was evaluated using the maximum non-orthogonality, skewness, and the hexahedron cell ratio as each affects the accuracy and stability of FVM analysis. The mesh quality metrics showed a significant change based on the distance threshold from the geometry simplification and the complexity of the original geometry. All three mesh quality metrics improved with geometry simplification. In particular, the maximum skewness showed an exponential decrease.
To evaluate the priority of the faces to be removed among the building surfaces, an index for evaluating the similarity between the normal vector of a target face and other faces was proposed in this study. Because the priority order of all faces is evaluated with this index, the proposed algorithm guarantees identical results for the same input. Furthermore, the hexahedral meshing complexity of the geometry is expected to be reduced as Cartesian angles are formed. The experimental results showed that the mesh quality of an FVM model can be quantitatively controlled using the geometry simplification method proposed in this study. In particular, through the linear regression model, it is possible to estimate the mesh quality from the geometrical properties of the original model and the size of the mesh to be designed, as well as to evaluate the required threshold for simplification. However, considering the low adjusted $R^2$ of the regression model, there may be sources of error that were not considered in this study. In practice, it is expected that the mesh quality can be further improved by applying multiple simplification thresholds and/or traditional preprocessing methods, such as the blocking of geometries.
This study evaluated the applicability of a geometry simplification method in terms of the geometry changes and the improvement of the mesh quality. However, considering that differences may occur in the analysis results as the geometry changes, further studies are needed to evaluate the effects of geometry simplification methods on individual FVM analysis results, such as those for CFD, fire dynamics simulations, and thermal simulations. Based on these studies, it is expected that an appropriate geometry simplification threshold can be presented while considering the relationship between the error caused by the geometry change and the effect of improving the mesh quality.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.