2.1. Definition of Semantics and Enrichment
Building information modeling (BIM) is a key aspect of digitization and brings great value to planning, design, construction, operation, and all other processes throughout the building life cycle. Semantically rich BIM models can be used for life cycle assessment (LCA), an established methodology for quantifying environmental impacts that is increasingly used to assess the environmental performance of buildings [10]. Despite various initiatives to promote the BIM methodology, it still requires continuous improvement and further development before it can be fully integrated and support collaboration between the stakeholders involved. Among the whole range of BIM applications, semantics can be distinguished; it is derived from various sources, databases, and rule sets that provide access to the information of individual models [11]. Thus, each stakeholder involved should be able to search the BIM model for all relevant information and, where appropriate, provide comments that are relevant from their own point of view. A BIM model can be colloquially understood as a semantic database of a building object [12]. Semantics (i.e., geometry, topology, and meaning), which is the foundation of BIM, enables things that CAD (computer-aided design) cannot do.
The expression “enrichment” can be defined as the process of improving the quality or power of something by adding something else. Semantic enrichment, in turn, refers to a process used for various operations in many fields. In computer science, semantic enrichment is presented as one of the methods for ensuring interoperability between different databases. This process makes it possible, above all, to provide the most important and necessary information about an entity without additional work. In the reverse engineering of objects, semantic enrichment can be used in 3D reconstruction from a point cloud; one example is the processing of indoor 3D point clouds to identify all building elements. In BIM, semantic enrichment has been defined in several ways by many researchers. One definition describes it as a process that aims to automate the processing and addition of all information [13]. Another describes it as the process of analyzing and exploring a database to capture its structure and definitions at a higher level of meaning [14]. It can also mean the process of adding missing or new information [15]. A further definition states that semantic enrichment is the process in which newly acquired information is added automatically or semi-automatically to given models using specified techniques. A definition developed by Bloch and Sacks describes semantic enrichment as the classification, aggregation, and clustering of building objects, their unique identification, and the filling in of missing objects [15]. It can be concluded that semantic enrichment is a purpose-oriented process, as its requirements are defined by the intended use of the information. It is a means of ensuring easy communication of construction-related information between different environments [13].
Semantic enrichment in BIM can be divided into two categories. The first concerns individual model elements. In this case, semantic information can be organized in terms of geometric features, such as shape and location, or non-geometric features, such as types, functions, and specifications. The second is semantic information about the relationships between components: neighborhood, hierarchy, inheritance, etc. [16]. Modern BIM tools are constantly changing the way information about building objects is created, stored, and exchanged, and they provide a three-dimensional environment for designing models with geometry, alphanumeric data, and semantics. However, these tools regularly encounter misinterpretation of information that is not clearly represented in the model. BIM models are often created from 2D documentation or point clouds, and the results are purely geometric. Inadequate and insufficient semantics therefore primarily affects the quality of data exchange and can make it difficult or even impossible to use some of the solutions and applications intended for building analysis [4]. The exchange of information between several people involved in a project during design work can face interoperability barriers. IFC (Industry Foundation Classes) is one of the most popular standards for exchanging geometric and topological information, yet it still suffers from information loss or distortion when data are exchanged between multiple applications. IFC’s hierarchical structure of (i) project, (ii) site, (iii) building, (iv) building floor, and (v) component provides a degree of interoperability at the application or database level. However, the IFC schema, despite its rich and broad scope, still faces the challenge of increasing the level of semantics in models [15]. Designers and modelers are not always aware of this structure [16] or of the possibility of entering detailed non-graphical data. Without such data, there is no way to generate summaries, perform analyses, or simulate phenomena. Thus, there is an identified research gap related to BIM model enrichment at different levels of the IFC structure.
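The five-level hierarchy described above, and the idea of checking for missing non-graphical data at each level, can be illustrated with a minimal sketch. It does not use the IFC schema or any IFC library; the `Node` class, the level names, the property names, and the `required` rule set are all hypothetical and serve only to show how gaps in semantics could be located per level.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a simplified spatial hierarchy (hypothetical, not an IFC entity)."""
    level: str   # "project", "site", "building", "storey", or "component"
    name: str
    properties: dict = field(default_factory=dict)  # non-graphical data
    children: list = field(default_factory=list)

    def add(self, child):
        """Attach a child node and return it, so that calls can be chained."""
        self.children.append(child)
        return child


def missing_by_level(node, required):
    """Walk the tree and collect, per node, the required property names that are absent."""
    gaps = {}
    absent = sorted(required.get(node.level, set()) - set(node.properties))
    if absent:
        gaps[f"{node.level}:{node.name}"] = absent
    for child in node.children:
        gaps.update(missing_by_level(child, required))
    return gaps


# Illustrative model: all names and values are assumptions.
project = Node("project", "Kindergarten")
site = project.add(Node("site", "Plot A"))
building = site.add(Node("building", "Main", {"function": "education"}))
storey = building.add(Node("storey", "Ground floor"))
storey.add(Node("component", "Wall-01", {"material": "brick"}))

# Hypothetical requirements per hierarchy level.
required = {
    "building": {"function"},
    "storey": {"elevation"},
    "component": {"material", "fire_rating"},
}

gaps = missing_by_level(project, required)
print(gaps)
```

In this sketch the building level is complete, while the storey and the wall are each reported with one missing property, which is the kind of level-specific gap that enrichment would then fill.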
2.2. Importance of Non-Graphical Data
BIM is generally equated with a 3D model that contains the geometric data of an object. Such a model significantly facilitates the work of designers and opens up ever newer design possibilities; the 3D model is the key to effective design. In the construction process, however, non-graphical data are indispensable during construction and the subsequent operation of the building. The model can contain data, for example, on the cost of each element created on the construction site, on elements delivered ready for installation, or on equipment and fixtures. The model can also include information on the completion time of each construction stage or other scheduled activity [17]. It is becoming increasingly common to carry out analyses of the environmental impact of a facility during both construction and operation. This information is becoming paramount, and thanks to BIM models that are properly “data-enabled”, a huge amount of information is accessible, making it possible, for example, to monitor energy and heat consumption in the facility; calculate various types of energy indicators [18]; or track and calculate the carbon footprint produced in all scopes. A very large number of different types of information is associated with a building in all phases of its life, and it is extremely important to manage these data effectively and efficiently. Therefore, to facilitate and systematize the data used, their categorization has been adopted [17]. The BIM model is extended by further dimensions; it is thus multidimensional and can include all specialized data. In addition to 3D modeling, BIM is also defined in other dimensions: 4D (data for delivery and construction schedules), 5D (data for developing cost estimates and budgets), 6D (data for sustainability), and 7D (data to assist in facility management).
There is a huge amount of non-graphical information associated with every construction facility in operation. It can include manuals, warranties, inspections, certificates, and information about materials. Users and owners of facilities have always collected such data, as they are essential for effective management and operation. The data are usually stored in binders, on CDs with PDF files, in spreadsheets, or in files in other formats. It should be added that these data are sometimes not collected and archived during the construction of a facility or during upgrades and renovations but are only recreated when the need for them arises [17].
The most important data include, among other things, the equipment of the facility together with a complete set of information about each item: technical parameters, origin, prices, product sheets, inspections, equipment warranties, maintenance plans, installation and inspection dates, lists of variable parts, etc. This information is of extraordinary value and essential for the use of a construction facility [19]. All data are linked to the objects in the 3D model and to supplementary data. BIM models are used not only to transfer data to FM (facility management) systems but also to exchange information between different parties. The most important requirement is that any interested party should be able to obtain non-graphical data of the required scope and level of detail, matching their expectations and intended purposes of use: during the construction process, modernization, renovation, or ongoing use [17].
Given the irreplaceable role of non-graphical data in the model, it should be borne in mind that incomplete, fragmentary, and outdated data can result in inefficient management, incomprehensible results, and wasted time or even money [19]. Non-graphical data should be updated and supplemented continuously to avoid confusion. Non-graphical data can play a significant role in the space management of public facilities [20]. With BIM, a summary of each room can be created, complete with data on the area, purpose, and name of the room and the number of pieces of equipment, furniture, toys, and educational items located in it. Such information gives an idea of the size of the space and an opportunity to implement new ideas for its management [21]. Along with the above-mentioned parameters, each room can be supplemented with data for controlling occupancy, for example in teaching rooms. These data can be used to control the space and can be modified according to the activity taking place in it. They allow the number of available seats to be shown; this parameter can be assigned based on the number of desks. By adding information about the characteristics of each room, a data sheet can be created that reflects the current state and shows whether each of the listed rooms is adequately equipped. The database can be based on information on, for example, equipment and furniture manufacturers, certificates, the materials from which an item was made, year of manufacture, place of manufacture, user manuals, warranties, and contacts. Staff managing the facility will have an overview of the entire database and thus will be able to react, take preventive action, and, for example, replace obsolete items. Non-graphical data can be used not only for management inside the building but also to manage elements outside the kindergarten, namely, playground components, vegetation, paving, landscaping elements, etc. All the information contained in the model can be used for upgrades, renovations, and inventories. Assigning the right parameters already during construction will help reduce errors later. Sound technical knowledge of building and finishing materials supports the sustainable use of materials; that is, access to non-graphical data during renovation or other work helps avoid materials or components that are defective, perform poorly, or are unsustainable [22].
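The room summary described above, including a seat count derived from the number of desks, could be sketched as follows. This is only an illustration under stated assumptions: the `Room` class, its field names, the one-seat-per-desk rule, and all example values are hypothetical and are not taken from a specific BIM tool or from the cited studies.

```python
from dataclasses import dataclass, field


@dataclass
class Room:
    """Non-graphical data attached to one room (hypothetical structure)."""
    name: str
    purpose: str
    area_m2: float
    items: dict = field(default_factory=dict)  # item type -> number of pieces
    occupied: int = 0                          # seats currently in use

    @property
    def seats(self):
        # Assumption: one seat per desk, since seats are assigned from the desk count.
        return self.items.get("desk", 0)

    @property
    def free_seats(self):
        # Never report a negative number of available seats.
        return max(self.seats - self.occupied, 0)


def room_summary(rooms):
    """Build one summary row per room: area, purpose, item count, and seat availability."""
    return [
        {
            "room": r.name,
            "purpose": r.purpose,
            "area_m2": r.area_m2,
            "items": sum(r.items.values()),
            "seats": r.seats,
            "free_seats": r.free_seats,
        }
        for r in rooms
    ]


# Illustrative rooms; all values are made up for the example.
rooms = [
    Room("R1", "teaching room", 48.0, {"desk": 20, "cabinet": 3}, occupied=17),
    Room("R2", "playroom", 35.5, {"toy": 40, "shelf": 4}),
]

summary = room_summary(rooms)
for row in summary:
    print(row)
```

Keeping the seat count as a derived value rather than a stored one means that it follows automatically whenever the desk inventory of a room is updated.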
When saturating a model with non-graphical data, it is worth following the four basic FAIR principles and using standardized solutions wherever possible. Data should be findable, accessible, interoperable, and reusable [23]. Inaccurate or outdated non-graphical data, such as costs, schedules, or even the properties of individual materials, can negatively affect the quality of the entire project and thus lead to inefficient management. It is also important to make judicious use of the entire model with its wealth of non-graphical data. Sometimes adequate training is needed to meet the challenges of effective and sustainable BIM implementation in an organization [24]. This work can address this challenge and make an important contribution to the topic of semantic enrichment.
2.3. Non-Graphic Data Saturation Levels
The information detail of BIM models should be the minimum sufficient to achieve the specified goals. Varying the level of detail of individual models and components at a given stage is crucial to accomplishing the task. Levels of model detail determine how accurately the object is represented at each phase of the design and implementation of a project. Detail here refers not only to the graphical representation but also to the saturation of the design and individual model components with non-graphical data [25]. Different levels of detail can be used for each model and element: LOGD (level of graphical development), or geometric LOD (level of development), and LOI/LOMI (level of information/level of model information), or non-geometric LOD [26].
It is worth highlighting how information delivery planning affects the implementation of a project using the BIM methodology. The ISO 19650 standard [27] draws quite closely on earlier specifications and standards for information delivery planning but nevertheless substantially restructures the previous system. The standard does not use the terms LOD/LOI but replaces them with the term level of information need and forbids abbreviating it. According to the standard, the level of information need is not a classic metric like LOD or LOI. The standard specifies that the level of information need must be worked out, which means that it must be defined by the customer or the developer. Ideally, clients should define their information requirements before the project starts. In many cases, typical LOD metrics will be sufficient, but every project is different. Off-the-shelf standards may not meet the requirements, in which case one’s own standard must be described. ISO 19650 [27] states that it is up to the contracting authority to make the effort and determine what information it needs. One of the primary goals of defining the level of information need is to avoid providing too much information [27].
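The idea that information should match a defined need, without shortfall and without surplus, can be illustrated with a small check. This is not a procedure prescribed by ISO 19650; the function name `check_information_need`, the property names, and the example purpose (a cost estimate) are hypothetical and only sketch how delivered data might be compared with what a client has asked for.

```python
def check_information_need(required, delivered):
    """Compare delivered properties with the set required for one purpose.

    required  -- set of property names the client has asked for
    delivered -- dict of property name -> value found on a model element
    """
    delivered_names = set(delivered)
    return {
        # Requested but not present at all.
        "missing": sorted(required - delivered_names),
        # Present but carrying no usable value.
        "empty": sorted(
            name for name in required & delivered_names
            if delivered[name] in (None, "")
        ),
        # Present but never requested, i.e., potentially too much information.
        "surplus": sorted(delivered_names - required),
    }


# Hypothetical requirement for a cost estimate and a hypothetical delivered element.
required = {"unit_cost", "quantity", "classification_code"}
delivered = {"unit_cost": 120.0, "quantity": None, "colour": "grey"}

report = check_information_need(required, delivered)
print(report)
```

Reporting surplus separately from missing data reflects the point made above that defining the need is also meant to prevent over-delivery, not only gaps.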
The level of saturation with non-graphical data can refer to the quantity as well as the quality of data. These data comprise all the information used to comprehensively manage a building facility. Throughout the life cycle of a building, data are collected, and their volume should increase after each stage of work. At the very beginning, the data that pertain to the building or infrastructure facility itself should be identified. These data are intended to show, for example, what function the facility has; what area it is located in; and the characteristics of the building, floors, rooms, and zones. Nomenclature and numbering usually follow standard naming conventions given by manufacturers and organizations. Identified data that pertain to component group and type should also be classified based on industry standards [28]. The three most widely used systems for classification, costing, etc., are Uniformat, OmniClass, and MasterFormat. The first refers to building components related to materials and where they were built. The second system is useful for creating material libraries and organizing product literature and project information. It combines the other two systems, drawing on MasterFormat and Uniformat as the basis for some of its tables. OmniClass is considerably broader and covers construction information on objects, separated spaces, products, work results, and materials. MasterFormat, on the other hand, is used for institutional and commercial projects in Canada and the United States. The system primarily presents cost data and related information contained in the design, estimates, and specifications [29]. Next in importance is the classification of data on equipment, materials, and finishes. These data cover parameters relating to product manufacturers, models, serial numbers, acquisition dates, dealer leads, and warranties, including information on their use or expiration dates. Subsequently, specifications and attributes can be linked to data on types and other values, as well as detailed data on weight, power, energy consumption, or spare parts. The last group of non-graphical data should relate to the operation and maintenance of the facility itself. The model should be supplemented with data on maintenance statuses and their history, as well as data on space occupancy [30].
Using all data resources minimizes the risk of errors. However, a high level of non-graphical data saturation carries certain risks. Above all, too much data can cause, for example, problems in interpreting the design and difficulty in maintaining the data. To avoid such problems, a thoughtful and sensible approach to data entry is paramount. Hence, in addition to the advantages and benefits of semantic enrichment, this paper identifies some limitations. The penultimate section sets out directions for future research, which is necessary to get closer to full, seamless interoperability and to the ideas of a digital twin and a circular economy.