Maintaining Semantic Information across Generic 3D Model Editing Operations

Abstract: Many of today's data models for 3D applications, such as City Geography Markup Language (CityGML) or Industry Foundation Classes (IFC), encode rich semantic information in addition to the traditional geometry and materials representation. However, 3D editing techniques fall short of maintaining the semantic information across edit operations if they are not tailored to a specific data model. While semantic information is often lost during edit operations, geometry, UV mappings, and materials are usually maintained. This article presents a data model synchronization method that preserves semantic information across editing operations, relying only on geometry, UV mappings, and materials. This enables easy integration of existing and future 3D editing techniques with rich data models. The method links the original data model to the edited geometry using point set registration, recovers the existing information based on spatial and UV search methods, and automatically labels the newly created geometry. An implementation of a Level of Detail 3 (LoD3) building editor for the Virtual Singapore project, based on interactive push-pull and procedural generation of façades, verified the method with 30 common editing tasks. The implementation synchronized changes in the 3D geometry with a CityGML data model and was applied to more than 100 test buildings.


Introduction
Many of today's data models for 3D applications, such as CityGML or IFC, encode rich semantic information in addition to the traditional geometry and materials representation. Semantic information enables a wide range of application scenarios beyond visualization. Specifically, semantic and hierarchical information of building models is important for building information analysis [1], urban analytics [2], deployment of facilities [3], prediction [4], and governance [5]. Semantic information provides the identification of key components in a generic 3D model and enables users to differentiate walls, roofs, ground, and other surfaces in a 3D building model. Hierarchical information carries the relationships between surfaces in 3D models; for instance, in 3D building models, it tells how a window is related to a wall surface. Nowadays, a variety of techniques are available for editing generic 3D models. However, the maintenance of semantic and hierarchical information through the editing process is only supported by specific applications that are closely tied to the underlying data model. We identified five main classes of popular geometry editing techniques: transformation, deletion, deformation, push-pull [6,7], and parametric (procedural) templates [8][9][10]. Most editing tools preserve only geometric information, UV mappings, and materials. In order to provide a flexible integration of existing as well as future editing tools, an automatic matching and synchronization method for the modified geometry and the data model is needed. Many of today's 3D applications provide SDKs and APIs where 3D geometry, UVs, and materials are exposed, and thus enable an easy integration of the presented method.
There are three major challenges to maintaining semantic information across the editing process: the first one is to decide to what extent to depend on the specific edit operation. On the one hand, recovering information based on an edit operation is an intuitive way to maintain semantic information. On the other hand, this would mandate the same number of data model update strategies as edit operations, severely limiting its flexibility. Therefore, an automatic matching method independent of the specific edit operation is required. The second challenge is how to deal with both rigid and non-rigid transformations of polygons as well as how to distinguish existing faces from newly added faces. New faces need to be inserted and classified according to the data model. The third challenge is the performance of the automatic matching process, since it very likely runs after each interactive edit operation, and thus may impact the user experience by its running time.
The presented automatic matching method addresses all three challenges. It is based on a closed loop of interleaving interactive edit operations and automatic data model synchronizations. Starting from a source data model with geometry, UVs and materials, the method preserves semantic information after geometry changes through a process of registration, recovering, and labeling. The method is independent of the specific editing operation, and can thus be applied to a wide range of interactive 3D tools or 3D batch processing jobs. The main contributions of the method are:

•
Independence of edit operation: The method assumes no knowledge about the edit operation. It is able to recover information based only on the source data model and the modified 3D geometry.

•
Integration flexibility: Since the method is based only on 3D geometry, UVs, and materials, which are usually exposed through SDKs and APIs, integration into existing 3D applications is simple.

•

Information preservation: The method can preserve all of the semantic information of the original model, classify newly added polygons, and label them according to the data model.
The method was implemented and tested as part of a LoD3 building editor for the Virtual Singapore project using CityGML as the underlying data model. It features standard 3D transformation tools, push-pull tools, and procedural façade templates through the integration of 3rd party libraries. These libraries disregard the data model and operate on the 3D geometry only. The CityGML data model is updated automatically after each edit operation. Thirty editing test cases were designed to verify that the data model was consistent and that the semantic information was maintained after each edit operation.

Related Work
Several research areas are relevant for understanding the problem's context: generating and maintaining hierarchical and semantic information of 3D models (CityGML in particular), 3D model matching methods, and state-of-the-art 3D editing techniques.

Semantic Information
A semantically enriched 3D scene is crucial for non-visualization-oriented applications. Alegre [11] proposed a probabilistic approach to the semantic interpretation of building façades. Verdie [12] reconstructed a 3D urban scene by classifying and abstracting the original meshes. With the assistance of a Markov Random Field, Verdie's method is able to distinguish ground, trees, façades, and roof surfaces. Zhu [13] proposed a point cloud classification method based on multi-level semantic relationships, including point-homogeneity, supervoxel-adjacency, and class-knowledge constraints. Furthermore, Wu [14] presented an algorithm to produce hierarchical labeling of an RGB-D scene for robotics. Rook [15] automatically labeled faces using a decision tree but without support for LoD3 labels. All these methods address the broader problem of creating semantic information directly from 3D faces or point cloud data, and thus do not take into account prior information available in a rich source data model.

Model Registration
Model registration between the source geometry and the edited geometry allows recovering the transformation applied by an edit operation. Various methods have been developed to fully or partially match geometries and to extract transformations between matching parts.
Sundar [16] proposed a skeleton-based method for comparing and searching 3D models. The method expresses the model as a skeleton graph and uses graph searching techniques to match 3D models. DeepShape [17] extracts high-level shape features to match geometric deformations of shapes. A shape signature is computed with the method introduced in [16], which makes it possible to measure the similarity of different shapes. This method represents the signature of an object as a shape distribution sampled from a shape function measuring global geometric properties of the object. Iterative closest point (ICP) is a widely used registration method introduced by Besl and McKay [18] and Zhang [19]. This efficient point set registration method meets the requirements of interactive edit performance and a simple geometry representation well. ICP proved to be adequate for the editing operations used in the test cases (see Table 1); if necessary, ICP variants such as [20] and [21] can be integrated. Other alternatives include coherent point drift (CPD), a probabilistic method introduced by Myronenko [22] for both rigid and non-rigid point set registration; the vector field consensus algorithm proposed by Ma [23]; and the global-local topology preservation (GLTP) algorithm for non-rigid point set registration presented by Ge [24].

Editing Techniques
Push-pull (press-pull or sculpting) is a popular, intuitive editing technique which unifies previously distinct tools such as edge move, face move, face split, and extrude into a single tool. Having a single tool minimizes tool changes and thus improves the efficiency of the editing process. It has been adopted by many commercial products, such as AutoCAD [25], SketchUp [7], and Maya [26]. In particular, PushPull++ [6], used in CityEngine [27] and ArcGIS Pro [28], employs methods for adaptive face insertion, adjacent face updates, edge collapse handling, and an intuitive user interface that automatically proposes useful drag directions. PushPull++ has been shown to reduce the complexity of common modeling tasks by up to an order of magnitude compared to existing tools; it is therefore well suited for editing building models and is integrated as part of the Virtual Singapore Editor (VSE). Esri's procedural runtime [10] is used to generate parametrized geometry from façade templates. The procedural runtime is a library implementing the computer generated architecture (CGA) shape grammar introduced in [8,9], which defines a set of operations for transforming shapes in space, with a special focus on patterns found in architecture.

CityGML
While the method is independent of the data model, the VSE is based on CityGML [29], introduced by Kolbe [30], covering the geometrical, topological, and semantic aspects of 3D city models. CityGML defines the classes and relations for relevant topographic objects in city and regional models with respect to their geometrical, topological, semantic, and appearance properties. There is wide industry support for importing CityGML (e.g., Bentley Map [31], BS Contact Geo [32], CityServer3D [33], CodeSynthesis XSD [34], FME from Safe Software [35], and many more), but editing of the data model is supported by only a few products (e.g., [36,37]).

Information Synchronization
The method regards edit operations as black boxes, but looking at current tools, it helps to distinguish five main types of editing techniques: transformation, deletion, deformation, push-pull, and template applications.
The following section gives an overview of the method and presents each step. Then, the key tasks of registration, recovery, information transfer between source and edited model, and automatic labeling are discussed.
Internally, 3D geometry is represented as face-vertex meshes [38], which incorporate a list of vertices and a set of faces referencing their vertices. Both the source geometry and the edited geometry are handled as meshes (M_s and M_e). Mesh M_s is represented by vertices v ∈ V, which are points in three-dimensional space. A vertex v_i may optionally be annotated by a two-dimensional texture coordinate t_i ∈ T. For each pair of connected vertices in a face f, an edge e = (v_i, v_j) is defined. Planar faces f ∈ F are defined by a list of counter-clockwise oriented vertices (v_1, v_2, . . . , v_n). Openings (holes) are represented as faces referencing their boundary (parent) faces. Figure 1 gives an overview of the model synchronization method. Applying an edit operation to a source geometry (M_s) results in an edited geometry (M_e) deprived of semantic information, because we assume that it is not preserved by the edit operation.
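As a concrete illustration, the face-vertex representation above can be sketched as follows (a minimal sketch; class and field names are illustrative and not taken from the VSE):

```python
from dataclasses import dataclass, field

@dataclass
class Mesh:
    """Face-vertex mesh: vertices as 3D points, faces as CCW vertex-index lists."""
    vertices: list                              # (x, y, z) tuples
    faces: list                                 # per face: CCW list of vertex indices
    uvs: list = field(default_factory=list)     # optional (u, v) per vertex
    parent: dict = field(default_factory=dict)  # opening face index -> boundary (parent) face index

    def edges(self, face_index):
        """Yield the edges e = (v_i, v_j) of a face as vertex-index pairs."""
        f = self.faces[face_index]
        for k in range(len(f)):
            yield (f[k], f[(k + 1) % len(f)])

# A unit quad as a one-face mesh:
quad = Mesh(
    vertices=[(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)],
    faces=[[0, 1, 2, 3]],
)
print(list(quad.edges(0)))  # [(0, 1), (1, 2), (2, 3), (3, 0)]
```

Both M_s and M_e would be instances of such a structure; an opening is simply a face whose index appears as a key in the parent map.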

The synchronization proceeds in three steps:

1. Registration: The ICP algorithm computes the transformation T between the source geometry M_s and the edited geometry M_e, yielding the transformed source geometry M_t = T × M_s.

2. Semantic information transfer: An octree search with faces from M_e yields matching rigidly transformed faces in M_t and their original semantic labels in M_s. Deformed faces are handled in a similar way by an additional UV space search.

3. Labeling: Each face in the edited geometry M_e is incorporated into a topological tree to detect openings and their boundary surfaces. Boundary surfaces are labeled as roof, wall, or ground faces based on their normal direction. For openings, the octree and UV space are used again to find the position in the data model.

The result is an updated data model that preserves as much semantic information as possible and in which new faces are labeled according to their topology.

Face Matching
Since M_e contains only geometric information after an edit operation, faces f_e ∈ M_e have to be matched with corresponding faces f_s ∈ M_s to find the related semantic information. The following features are used for face matching:

• Face normal f_normal.
• Length of edges f_length.
• Set of texture coordinates f_T.

If the distance between two vertices v_i and v_j is below a given threshold ε_v, they are considered the same. Likewise, if the UV distance of two texture coordinates t_i and t_j is below a given threshold ε_t, their UVs are considered the same (see Figure 2). For matching faces, semantic information from f_s ∈ M_s can be transferred to f_e ∈ M_e via f_t ∈ M_t.
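The matching features and threshold tests can be sketched as follows (a sketch under assumptions: the function names and the threshold values are illustrative, not taken from the paper):

```python
import math

# Illustrative thresholds (assumptions, not the paper's values).
EPS_V = 1e-3  # spatial threshold in meters
EPS_T = 1e-4  # UV-space threshold

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def face_normal(verts):
    """Unit normal of a planar CCW face from its first three vertices."""
    n = cross(sub(verts[1], verts[0]), sub(verts[2], verts[0]))
    length = math.sqrt(sum(c * c for c in n))
    return tuple(c / length for c in n)

def edge_lengths(verts):
    return [math.dist(verts[i], verts[(i + 1) % len(verts)]) for i in range(len(verts))]

def faces_match(verts_a, uvs_a, verts_b, uvs_b):
    """Faces match if normals, edge lengths, and UV sets agree within thresholds."""
    if len(verts_a) != len(verts_b):
        return False
    if math.dist(face_normal(verts_a), face_normal(verts_b)) > EPS_V:
        return False
    if any(abs(x - y) > EPS_V for x, y in zip(edge_lengths(verts_a), edge_lengths(verts_b))):
        return False
    return all(math.dist(ta, tb) <= EPS_T for ta, tb in zip(uvs_a, uvs_b))

quad = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
uvs = [(0, 0), (1, 0), (1, 1), (0, 1)]
moved = [(v[0] + 2, v[1], v[2]) for v in quad]  # rigidly translated copy
print(faces_match(quad, uvs, moved, uvs))  # True
```

A rigidly moved copy matches because normals, edge lengths, and UVs are all invariant under translation; a scaled copy would fail the edge-length test.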

Model Registration
Several editing operations include rigid transformations and scaling, which can be represented by a transformation matrix. The method uses the ICP algorithm to calculate the transformation matrix between the source geometry M_s and the edited geometry M_e. ICP can be applied to the source geometry as a whole to compute the global transformation of the geometry, or partially to recover local transformations.

ICP Algorithm
The ICP algorithm is a simple and efficient way to find the best match between two point sets by solving an optimization problem. The inputs of the ICP algorithm are two point sets: the source point set X (see Equation (1)) and the target point set P (see Equation (2)). The point sets may have different sizes, and the distance between two points is defined by a Euclidean metric.
The ICP algorithm finds the transformation T iteratively, applying T to X and computing the fitness error E(T) until E(T) falls below a given threshold.
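The iterative loop can be sketched as follows (a didactic point-to-point ICP with brute-force nearest neighbors and a Kabsch/SVD rigid-transform step; this is a sketch, not PCL's implementation):

```python
import numpy as np

def icp(source, target, max_iters=300, fit_eps=0.01):
    """Return (R, t) aligning `source` (N,3) to `target` (M,3)."""
    src = source.copy()
    R_total, t_total = np.eye(3), np.zeros(3)
    for _ in range(max_iters):
        # 1. Nearest-neighbor correspondences (brute force for clarity).
        d2 = ((src[:, None, :] - target[None, :, :]) ** 2).sum(-1)
        # 2. Stop once the mean fitness error falls below the threshold.
        if np.sqrt(d2.min(axis=1)).mean() < fit_eps:
            break
        nn = target[d2.argmin(axis=1)]
        # 3. Best rigid transform for the current correspondences (Kabsch/SVD).
        cs, cn = src.mean(0), nn.mean(0)
        H = (src - cs).T @ (nn - cn)
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflections
        R = Vt.T @ D @ U.T
        t = cn - R @ cs
        # 4. Apply the increment and accumulate the total transformation.
        src = src @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total

# Recover a pure translation of a 3x3x3 grid of points:
X = np.array([[i, j, k] for i in range(3) for j in range(3) for k in range(3)], dtype=float)
R, t = icp(X, X + np.array([0.3, 0.0, 0.0]))
print(np.round(t, 2))  # ≈ [0.3, 0, 0]
```

Because the grid points are well separated relative to the offset, the first nearest-neighbor assignment is already correct and the loop converges in two iterations.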

Point Set Generation
In order to increase the quality of the ICP result for a given edit operation, more points than just the vertices of the source geometry are usually required. Generating points by uniformly sampling the edges is easy to implement but will result in the same points for symmetries, and thus prevent ICP from recovering, e.g., rotation or mirroring along these symmetries (see Figure 3). Random sampling is a widely used strategy [39,40], but may add an error to the ICP result because of the noise introduced by the random sampling. Our method uses random sampling along edges and diagonals, but with a consistent random sequence based on a random seed computed from the texture coordinate t_i of vertex v_i. As illustrated in Figure 4, additional points for the ICP algorithm are generated along diagonals by connecting opposite vertices. If the source geometry does not have texture coordinates, they are generated beforehand. Initializing the random number generator from the texture coordinates results in a unique sequence of sampling points for every edge and diagonal in M_s, and hence in M_e, which can then be unambiguously registered by the ICP algorithm.
The edge and diagonal sampling is done as follows. For a segment S_ij with vertices (v_i, v_j) and texture coordinates (t_i, t_j):

1. Calculate two random seeds, seed_i and seed_j, based on the texture coordinates t_i and t_j.
2. Generate n random numbers r_i1, r_i2, . . . , r_in based on seed_i, and n random numbers r_j1, r_j2, . . . , r_jn based on seed_j. Applying these random numbers yields n sampling points s_ij,k and n sampling points s_ji,k (see Equation (4)).
3. Add the points obtained in the second step to the source and target point sets of the ICP algorithm.
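The seeded sampling of steps 1–3 can be sketched as follows (the seed derivation and the point count are assumptions chosen for illustration):

```python
import random

def seed_from_uv(t, scale=10**6):
    """Derive a deterministic integer seed from a texture coordinate (u, v)."""
    # Hashing a tuple of ints is reproducible across runs (unlike str hashing).
    return hash((round(t[0] * scale), round(t[1] * scale)))

def sample_segment(vi, vj, ti, tj, n=5):
    """n points seeded from t_i plus n points seeded from t_j, reproducibly."""
    pts = []
    for seed, (a, b) in ((seed_from_uv(ti), (vi, vj)), (seed_from_uv(tj), (vj, vi))):
        rng = random.Random(seed)
        for _ in range(n):
            r = rng.random()  # parameter along the segment in [0, 1)
            pts.append(tuple(ac + r * (bc - ac) for ac, bc in zip(a, b)))
    return pts

# Identical UVs yield the identical sample sequence, even across separate calls,
# which is what lets matching segments in M_s and M_e be registered unambiguously:
p1 = sample_segment((0, 0, 0), (1, 0, 0), (0.0, 0.0), (1.0, 0.0))
p2 = sample_segment((0, 0, 0), (1, 0, 0), (0.0, 0.0), (1.0, 0.0))
print(p1 == p2)  # True
```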

Parameter Analysis
Two ICP parameters need to be determined for our method. First, the fitness error threshold between M_s and M_e: 0.01 m is sufficient for buildings reconstructed from airborne and ground-based point clouds. Second, the maximum number of iterations.
Considering that the VSE supports five main types of editing operations, six experiments were designed to derive the value of the maximum number of iterations, among them:

3. Deletion on a face;
4. Deformation on a face;
5. Push-pull on a face;
6. Template application on a face.
These experiments were executed on a complete building model with 495 faces (see Figure 6). Figure 7 shows the source and target point sets. Figure 8 shows that deletion, deformation, push-pull, and template application terminate the ICP algorithm almost immediately. For all experiments, the fitness error falls below the threshold after no more than 90 iterations, so the maximum number of iterations can be set to any value above 90. It was set to 300 to handle potentially complicated situations.
The registration result shown in Figure 9 indicates that 300 ICP alignment iterations and a fitness error threshold of 0.01 m are sufficient to calculate the transformation matrix T. The red point set was generated from M_s, the yellow point set from M_e, and the white point set is the source point set transformed by T.

Semantic Information Transfer
To match semantic information from the source model to the edited geometry, a search for matching faces has to be performed. The implementation uses an octree to speed up the search process.

An octree is a hierarchical data structure allowing efficient search operations in three-dimensional space. To match the semantic information of the original model, we construct an octree based on the transformed source geometry M_t = T × M_s. The root node of the octree is the bounding box of the source model. The bounding box is recursively divided into eight nodes until each box contains only a single face, as shown in Figure 10. Each node stores the face features from Section 3.1. Whether a face f_e in the edited geometry matches a face in the source data model is determined by searching for the corresponding face features (see Section 3.1) in the octree. Matching faces allow the transfer of semantic information from the source model M_s to the edited geometry M_e.
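The octree construction and lookup can be sketched as follows (a minimal point octree over face centroids with semantic payloads; the structure and names are illustrative, not the paper's implementation):

```python
class Octree:
    """Recursively split a box into eight octants until a leaf holds one entry."""

    def __init__(self, lo, hi, items, max_items=1, depth=8):
        self.lo, self.hi = lo, hi
        self.children = []
        if len(items) <= max_items or depth == 0:
            self.items = items  # leaf: list of (point, payload)
            return
        self.items = []
        mid = tuple((l + h) / 2 for l, h in zip(lo, hi))
        buckets = {}
        for p, payload in items:
            key = tuple(p[i] >= mid[i] for i in range(3))  # octant of the point
            buckets.setdefault(key, []).append((p, payload))
        for key, sub in buckets.items():
            slo = tuple(mid[i] if key[i] else lo[i] for i in range(3))
            shi = tuple(hi[i] if key[i] else mid[i] for i in range(3))
            self.children.append(Octree(slo, shi, sub, max_items, depth - 1))

    def query(self, p, eps=1e-3):
        """Return payloads whose stored point is within eps of p on every axis."""
        hits = [pl for q, pl in self.items
                if all(abs(a - b) <= eps for a, b in zip(p, q))]
        for c in self.children:  # descend only octants that can contain p
            if all(c.lo[i] - eps <= p[i] <= c.hi[i] + eps for i in range(3)):
                hits += c.query(p, eps)
        return hits

# Two face centroids with their semantic labels as payloads:
centroids = [((0.5, 0.5, 0.0), "WallSurface"), ((0.5, 0.5, 3.0), "RoofSurface")]
tree = Octree((0, 0, 0), (1, 1, 3), centroids)
print(tree.query((0.5, 0.5, 3.0)))  # ['RoofSurface']
```

A real implementation would store the full feature tuple (normal, edge lengths, UV set) in each node rather than the centroid alone, but the search pattern is the same.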
As illustrated in Figure 11, the deformation of a vertex, edge, or face changes f_normal, f_centroid, and f_length. Such a deformed face will not be found by an octree search. However, deforming edit operations only affect the geometry but preserve the texture coordinates. Therefore, these faces can still be matched by a UV space search, a method also used for texture transfer (see, for example, [41]), and hence can still receive their semantic information. After the octree and UV searches, unmatched faces in the source data model M_s are deleted, and unmatched faces in the edited geometry M_e are considered new faces. For these, semantic information has to be generated through labeling, which is discussed in the next section.

Labeling
Semantic information transfer based on faces can be applied as described to any data model. However, labeling new faces is domain specific, and the approach described here is based on CityGML, the underlying data model of the implementation. According to the CityGML building model [42], the relevant classes for a LoD3 building model are divided into three levels (see Figure 12): building, boundary surface, and opening. Specifically, the boundary surfaces are roof surface, wall surface, and ground surface, whereas the openings include windows and doors. Typical edit operations creating new boundary surfaces or openings are push-pull (see Figure 13) and template applications (see Figure 14). New boundary surfaces are classified according to the following heuristic, where f_φ is the angle between the face normal f_normal and the up vector of the coordinate system, as shown in Figure 15. This basically assumes that walls are within ±15 degrees of the vertical (90 degrees):

label(f) = RoofSurface, if f_φ < 75°; WallSurface, if 75° ≤ f_φ ≤ 105°; GroundSurface, if f_φ > 105°. (5)

Figure 13. Opening generated by push-pull. This editing operation does not split the boundary surface but generates a hole in the boundary surface.

Starting from a boundary surface (labeled either by semantic information transfer or by the heuristic), topological relationships are constructed by the iterative method depicted in Figure 16. This approach builds a topological tree based on the edited geometry. First, all new faces that are co-planar with labeled faces are collected and added to the topological tree as the first level. Second, it is determined whether the remaining newly added faces share an edge with, or have an edge on, a face of a leaf node in the topological tree; if so, these faces are added to the corresponding leaf nodes. This process is repeated until no new faces are left. Finally, leaf nodes under wall surfaces with a tree depth larger than two are considered openings.
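The ±15° heuristic of Equation (5) can be written as a short classifier (a sketch; the up vector is assumed to be the z-axis, and the class names follow CityGML):

```python
import math

def classify(normal, up=(0.0, 0.0, 1.0)):
    """Label a boundary surface by the angle f_phi between its normal and up."""
    dot = sum(a * b for a, b in zip(normal, up))
    norm = math.sqrt(sum(a * a for a in normal))
    f_phi = math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))
    if f_phi < 75.0:
        return "RoofSurface"    # normal points mostly upward
    if f_phi <= 105.0:
        return "WallSurface"    # within +/-15 degrees of the vertical
    return "GroundSurface"      # normal points mostly downward

print(classify((0, 0, 1)), classify((1, 0, 0)), classify((0, 0, -1)))
# RoofSurface WallSurface GroundSurface
```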
Openings are classified as follows, with h denoting the height and w the width of the bounding box of the opening in meters. The constraints for doors are taken from [43].
In addition, roof infrastructures generated by push-pull operations (see Figure 17) are not openings, since in the topological tree they are children of roof surfaces.

Discussion
The method was implemented as part of a standalone JavaFX-based LoD3 building editor for the Virtual Singapore project. The Esri procedural runtime provides the façade template application, and the PCL (Point Cloud Library [44]) provides the ICP implementation along with other utilities for point set management. Figure 18 shows the user interface of the VSE. Besides camera control, CityGML import/export, and point cloud visualization, selection, and texture assignment, the following editing operations are supported by the editor:

• Model move, scale, rotate (see Figure 9).
• Vertex, edge, and face move (see Figure 11).
• Face split (see Figure 19).
• Vertex, edge, and face deletion (see Figure 20).

We used FZKViewer [45] as 3rd party software to verify that the CityGML attributes were maintained when editing the building model. To illustrate typical edit operations and how the method updates the data model, two use cases are presented in the following sections: face deletion for shelters and template application for openings.

Face Deletion For Shelters
A typical example of superfluous faces and incorrect semantic information are faces which are present because of occlusions at ground level. The passageway under the shelter, shown in Figure 20, was constructed as a volume instead of just the roof section of the passageway. The editor allows the user to remove superfluous faces through deletion, and the data model is updated accordingly (see Figure 21).

Template Application For Openings
Procedural modelling is used for adding regular structures; e.g., extending façade patterns into occluded areas or correcting measurement errors in the source model. Figure 22 shows hundreds of added openings in a high-rise building. Through the labeling process of the method, the openings are correctly detected and classified as windows in the data model (see Figure 23). These two use cases illustrate the automatic data model update process, which is essential for an effective editing workflow. While the increase in efficiency compared to manual labeling largely depends on the editing tasks, data model consistency has shown itself to be one of the greatest benefits of the method.

Table 1. Editing test cases (excerpt).

05 Delete edge between two non-planar faces
06 Delete edge between two co-planar faces
07 Delete face with holes
08 Delete face without holes
20 Push-pull on existing face in corner vertically
21 Push-pull on existing face in corner with an angle
22 Push-pull on existing edge
23 Push-pull after creation of a face on an existing face
24 Push-pull after creation of an edge on an existing face
25 Push-pull after creation of multiple faces on an existing face
26 Push-pull after creation of multiple faces on multiple existing faces
27 Push-pull after creation of a face on an existing face with holes
28 Push-pull after creation of a circle on an existing face
29 Apply procedural templates with one level of opening extrusion
30 Apply procedural templates with two levels of opening extrusion

Commercial Tools
Several commercial tools are available which support editing of CityGML models. The following sections discuss a selection of these tools in terms of their advantages and limitations regarding editing of LoD3 CityGML building models.

CityEditor
Trimble SketchUp [7] is a complete and extensible 3D modelling package, widely used in 3D modelling and architectural design. CityEditor [36] is an extension for SketchUp which supports import/export and editing of CityGML files. As an extension, CityEditor leverages the full capabilities of SketchUp, going far beyond the editing operations that the VSE supports. However, it lacks specific LoD3 tools, such as façade template application. Furthermore, CityEditor discards all semantic information when importing a CityGML file, and each surface is labeled "unclassified"; users need to assign the semantic information for each surface manually. CityEditor also supports limited automatic surface classification depending on the tilt of the surface. The surface classification can then be exported to a CityGML file.

CityGML Importer
FME Desktop from Safe Software [35] is data conversion software which supports more than 450 formats and applications to help integrate and transform data. CityGML Importer [46] is also based on Safe Software's data transformation technology and focuses on loading and converting CityGML data; e.g., into IMX files which can be edited in InfraWorks [47] and Map 3D [48]. While loading a CityGML file, CityGML Importer maintains the semantic information of each surface by storing it in separate layers. However, the hierarchical structure of the CityGML file is lost. For instance, building parts and boundary surfaces are at different hierarchical levels in the data model, but CityGML Importer discards the relationship and puts the layers on the same level. The VSE maintains the semantic information as well as the hierarchical structure of the CityGML data model.

Rhino City
Rhino3D [49] is a popular architectural design software which has extensive support for curves and free-form surfaces. The polygon-based editing operations of the VSE are not well suited for curved surfaces, and further development is required to provide a good user experience. Through the Grasshopper extension, parametric 3D modeling can be added to Rhino3D, which offers capabilities similar to the CGA-based template application of the VSE. RhinoCity [50] is another extension for Rhino3D; it focuses on the generation of building models and supports import/export of CityGML files. RhinoCity leverages the full editing tools of Rhino and offers a far richer set of editing operations than the VSE. It also supports automatic generation of CityGML data during the 3D model creation phase but lacks maintenance of semantic information during further editing.

AutoCAD Map 3D
AutoCAD [25] is a commercial computer-aided design (CAD) and drafting software which has several domain-specific enhancements. For instance, AutoCAD Map 3D [48] is one of AutoCAD's vertical applications. AutoCAD Map 3D incorporates geographic information system and CAD data with an industry-specific toolset for GIS and 3D mapping. AutoCAD Map 3D supports importing and exporting objects into CityGML format. However, CityGML import is limited to level of detail 2 (LoD2), whereas the VSE is designed explicitly for LoD3 editing. Furthermore, much of the semantic information is lost when exporting to CityGML format from Map 3D.

Comparisons among VSE, CityEditor, and CityGML Importer
We compared our method with CityEditor and CityGML Importer by editing the building model shown in Figure 24. Fourteen test cases from Table 1 were chosen, while the other test cases were ignored since they were not applicable in SketchUp or 3ds Max. For instance, test cases 29 and 30 are façade template applications provided by the Esri procedural runtime, which are not feasible in SketchUp and 3ds Max. One metric was employed: the minimum number of mouse and keyboard interactions required for updating the semantic information of the edited geometry.
To record the metric, one user performed the 14 test cases in each software. In the GUI of the VSE, the user can click the refresh button after an edit operation to update the semantic information of the edited geometry. In CityEditor, the user needs to assign semantic labels manually in the context menu. CityGML Importer translates CityGML files into IMX files and stores each polygon in a separate layer; the name of the layer is the semantic label of the polygon, and the user can edit the layer name to update the semantic information. The results are shown in Table 2.
There are two observations from Table 2. First, our method enables an efficient workflow when multiple new faces are added. For transformation, deletion, and deformation, no extra mouse clicks are needed to update the semantic information in SketchUp and 3ds Max. For push-pull, which generates new faces, users need several mouse clicks to manually assign semantic labels to the newly added faces; thanks to the automatic labeling introduced in Section 3.4, this manual labor is saved when updating the edited geometry. Second, our method has advantages in importing and exporting CityGML data models. As mentioned in Section 4.3.1, CityEditor loses all semantic information when loading a CityGML data model. The comparison excludes the number of mouse clicks needed to initialize the semantic labels of the model in the VSE, CityEditor, and CityGML Importer. Extra tools are needed to export the shape from 3ds Max into CityGML format, while the VSE supports direct export of CityGML files.

Table 2. Comparisons among VSE, CityEditor, and CityGML Importer.

Conclusions
This article introduced a data model synchronization method which preserves semantic information across editing operations. It is independent of the edit operation and depends only on geometry, UV mappings, and materials. This enables easy integration of existing and future 3D editing tools with rich data models in a broad range of 3D applications. The method was implemented in a LoD3 building editor for the Virtual Singapore project, including interactive push-pull and procedural generation of façades provided by 3rd party libraries. The quality of the method was verified with 30 common editing tasks and compared with four commercial tools.
The VSE internally uses a polygon-based representation for the geometry which is not well suited for free-form or curved surfaces. Further research is required for recovering surface parametrization and designing an easy to use toolset for editing curved surfaces.
Currently, the focus is on interactive editing operations only. It would be interesting to see how accurate our labeling process is in more practical cases, and how the method performs in batch processing of 3D geometry; e.g., for change detection and labeling, an important topic for maintaining and updating 3D city models.