A Case-Based Reasoning Model for Retrieving Window Replacement Costs through Industry Foundation Class

Featured Application: This study developed a model to estimate the replacement cost of building components by retrieving cost information from industry foundation class. Case-based reasoning was applied to identify similar cases from among various alternatives.

Abstract: Building information modeling (BIM) provides facility managers with a large database consisting of 3D geometric data as well as management data. In particular, Industry Foundation Class (IFC) has been applied in many studies as it provides extensive and diverse information regarding building components. With the use of BIM combined with case-based reasoning (CBR), in this study, a model was developed to estimate replacement costs by retrieving cost information from IFC. This study focused on the replacement of windows for office buildings, and the costs associated with that replacement. Two main advantages were identified in the proposed approach. First, the replacement information required for the comparison of different cases is automatically obtained from a BIM file and parsed for predicting a cost estimate using IFC. Next, the accuracy is increased by matching various cost-related data such as contractors and manufacturers in the estimation of replacement costs with the help of CBR.


Introduction
The maintenance costs of building components could be as much as two or three times the cost of design and construction [1]. Building facility managers are responsible for managing such costs associated with the facilities. In particular, managing the replacement cost is especially important as replacement tasks generally involve a significant amount of maintenance costs. For replacing a building component, a facility manager must investigate the details of the components to determine the costs associated with each component. This type of information will differ between facilities. As these properties can vary widely, it can be difficult to resolve issues such as determining the cost of replacing a window. One of the difficulties encountered in estimating the replacement costs of building components is the retrieval of relevant cost data. Generally, the cost data of building components is stored separately in the form of spreadsheets, plans/drawings, and construction documents. As drawings and spreadsheets are stored independently, it is difficult to use/analyze them. Exchanging data has been challenging in facility management.
Meanwhile, the construction industry is experiencing explosive growth in its ability to both generate and collect data. Advances in scientific data collection and computerization have generated a flood of data [2]. Furthermore, the recent development of building information modeling (BIM) has provided facility managers with a large database consisting of three-dimensional (3D) digital data in the form of geometries as well as management data comprising the quantities and properties of building components. In particular, Industry Foundation Class (IFC), defined by buildingSMART International, has been applied in many studies as it provides extensive and diverse information regarding building components.
Therefore, the aim of this study is to develop a model to estimate the replacement cost of building components by retrieving cost information from IFC. The process of exporting IFC from BIM is not elaborated in this study because it is already well established by buildingSMART International and has been used for a long time. Case-based reasoning (CBR) is applied to identify similar cases from among various alternatives. The CBR model in this study focuses on the replacement of windows, a building component within a facility, in office buildings, and on the costs associated with that replacement. The model was developed using two computer programs: (1) Weka 3, a data mining software package that uses machine learning algorithms, was used for factor selection and weight assignment, and (2) myCBR, an open-source program for CBR applications, was used for the CBR itself.

BIM
In an effort to associate building elements in computer-aided design (CAD) drawings, there have been many efforts in the forms of Drawing Exchange Format (DXF), Drawing (DWG), Design (DGN), etc. [3]. However, owing to the complexity in CAD platforms, CAD data has not been used extensively in data retrieval and exchange [4]. Nowadays, BIM technology is becoming widely accepted in the construction industry, and this trend is extending into the facility management industry as well [5]. Its popularity is due to the effectiveness of its application in operations and the maintenance of facilities [6]. BIM provides a complete 3D digital representation and comprehensive information associated with building components [7]. The 3D representation and building information available in a BIM model aid in the activities of a facility manager.
IFC files, as an open BIM standard, contain not only the geometries of walls, columns, beams, doors, windows, and other building components, but also the specific attributes for each object, such as material type, material properties, and vendor. IFC applications include research on various subjects such as construction scheduling [8], cost analysis [9,10], quantity take-off calculations [11], bridge design [12], concrete reinforcement supply chain [13], building design review systems [14], and construction safety [15]. IFC modeling has been used for data handover from construction completion to operations and maintenance in construction operations building information exchange [16]. BIM technology has also been applied in the operation and maintenance of facilities [6], sensing and field data collection in facility operations [17], visualization techniques in field construction [18], and sustainable collaboration [19]. This research assumed that BIM models conformed to IFC standards, and proposed a model to estimate the replacement cost by retrieving cost information from IFC.

CBR
CBR is a technology that facilitates the application of knowledge from previous cases to new scenarios. CBR is based on an artificial intelligence (AI) system and is used to determine the optimum solution for a problem. However, CBR is different from other AI approaches in that it does not use generalized relationships, but uses specific knowledge from previous cases [20]. CBR can be defined as follows: "To solve a new problem by remembering a previous similar situation and by reusing the information and knowledge of that situation" [20]. Aha et al. [21] state that the CBR cycle has five general processes: retrieve, reuse, revise, review, and retain. CBR has been successfully applied to many tasks in the architecture, engineering, and construction industry. Jin et al. [22] used CBR to improve the accuracy of early-stage cost estimation by revising categorical variables in a case-based model.
Even though CBR has been applied successfully in research, attempts are still being made to improve the accuracy of the algorithm by assessing the similarity measure using weight distributions. This is because the results of the similarity assessment have an important impact on the generation of similar source cases [23]. Therefore, the method of allocating attribute weights is an important research direction in CBR modeling [24]. While some researchers have relied on their own experience to determine these weights, a machine learning algorithm was used in this research to obtain the required weight distribution.
The factor selection method begins with the full set of attributes and repeatedly removes the single attribute whose removal yields the best performance. The algorithm stops when removing an additional attribute would degrade the estimated performance [25]. The algorithm can be extended by considering both the addition and the deletion of an attribute at each step, so that at each step it either adds or deletes an attribute. This method searches for a good feature set using the induction algorithm itself as part of the evaluation function [26]. Typically, the factor subset that provides the best performance in the induction algorithm is selected. In this research, CBR is used to retrieve the case closest to the one under consideration from among various similar cases using data in IFC.
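The backward elimination procedure described above can be sketched as follows. This is a minimal illustration, not the Weka 3 implementation: the evaluation function (a leave-one-out 1-nearest-neighbour cost error), the function names, and the tiny case list are all assumptions made for the example. Ties are resolved in favour of the smaller attribute set.

```python
from typing import Sequence

# A case is (attribute values, cost). The data below are hypothetical:
# attribute 0 plays the role of "story" (informative), attribute 1 is noise.
CASES = [
    ((1, 5), 100.0), ((1, 1), 100.0),
    ((2, 5), 200.0), ((2, 1), 200.0),
    ((3, 5), 300.0), ((3, 1), 300.0),
]


def loo_error(cases: Sequence, subset: Sequence[int]) -> float:
    """Mean absolute leave-one-out error of a 1-nearest-neighbour cost estimate."""
    total = 0.0
    for i, (x, y) in enumerate(cases):
        best_dist, best_cost = float("inf"), 0.0
        for j, (x2, y2) in enumerate(cases):
            if i == j:
                continue
            dist = sum((x[k] - x2[k]) ** 2 for k in subset)
            if dist < best_dist:
                best_dist, best_cost = dist, y2
        total += abs(y - best_cost)
    return total / len(cases)


def backward_elimination(cases, n_attrs, score=loo_error):
    """Start from all attributes; drop one per step until any removal hurts."""
    selected = list(range(n_attrs))
    current = score(cases, selected)
    while len(selected) > 1:
        # Evaluate every subset obtained by removing exactly one attribute.
        trials = []
        for drop in selected:
            subset = [a for a in selected if a != drop]
            trials.append((score(cases, subset), subset))
        best_err, best_subset = min(trials, key=lambda t: t[0])
        if best_err > current:  # every removal degrades performance: stop
            break
        selected, current = best_subset, best_err
    return selected, current


selected, error = backward_elimination(CASES, n_attrs=2)
print(selected, error)
```

In this toy data the noise attribute is removed and only the informative one is kept. A bidirectional variant would also try re-adding previously removed attributes at each step.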

Methodology
This section elaborates the methodologies for developing a CBR model that can estimate the replacement cost of building components by retrieving cost information from IFC. For this, (1) building properties are obtained from IFC entities by a mapping process. (2) Weka 3 is applied to weight attributes for CBR. (3) The CBR model is developed to retrieve the replacement cost. Figure 1 illustrates the framework of this research. Owing to the complexity in the IFC structure, this research was focused on predicting the cost of window replacement.
Appl. Sci. 2019, 9, x FOR PEER REVIEW 3 of 14

Data Collection by Mapping IFC
Data were obtained from a BIM model through IFC data exchange to supply the CBR model with the necessary building component properties. The data are extracted from the 3D model through IFC extraction, and building properties are obtained from IFC entities. Each entity contains properties related to the object, which include the object location, geometry, type, and material. From the information extracted from the BIM model, necessary information such as the story, position, size, material, accessibility code, and duration is stored in the CBR repository. Figure 2 shows that there are many factors to be considered in the estimation of replacement costs. In this study, window replacement is considered as an example, wherein the factors to be taken into consideration are the story; the location, size, accessibility, and materials of the window; the installation date; and the manufacturer. All of these factors must be taken into consideration when identifying the cost and duration of a window replacement.
Properties such as the window size, material, and location are obtained from the IFC file. The window location includes the details of the building floor, whether the window is interior or exterior, and whether it is placed in a corner or in the middle of a wall. This information is pertinent for accurately estimating window installation costs. The information regarding each building component was parsed from the IFC file. This information is mapped within IFC entities such as IFCWINDOW, IFCDOOR, IFCWALL, and IFCROOF. Each IFC entity begins with an identification number that is referenced to other entities. By following the identification numbers within an IFC entity such as IFCSLAB, the system locates information such as the slab's relative location, location coordinates, orientation, thickness, and story elevation. Each window within the BIM model is identified using a number followed by "IFCWINDOW." The mapping of IFCWINDOW is followed by IFCPRODUCTDEFINITIONSHAPE. IFCPRODUCTDEFINITIONSHAPE ultimately defines the specific 3D points for the object representation. IFCWINDOW is also mapped with IFCOWNERHISTORY and IFCLOCALPLACEMENT. IFCOWNERHISTORY simply contains the modeling program used to create the IFC and the user of the file. IFCLOCALPLACEMENT identifies the window's location within the BIM model.
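To illustrate how identification numbers link IFCWINDOW to its related entities, the following minimal sketch indexes the entity lines of an IFC (STEP) text file and follows the references of each window. The excerpt, entity numbers, and function names are hypothetical, and the sketch assumes one entity per line and no "#" characters inside string values; it is not the parser used in the study.

```python
import re

# Hypothetical IFC (STEP) excerpt; numbers and values are made up for illustration.
SAMPLE_IFC = """\
#5=IFCOWNERHISTORY(#3,#4,$,.ADDED.,$,$,$,1546300800);
#120=IFCLOCALPLACEMENT(#98,#119);
#150=IFCPRODUCTDEFINITIONSHAPE($,$,(#148));
#160=IFCWINDOW('2O2Fr$t4X7Zf8NOew3FLOH',#5,'Window-01',$,$,#120,#150,'1001',1500.,900.);
"""

# "#<number>=<ENTITYNAME>(<attributes>);"
ENTITY_RE = re.compile(r"^#(\d+)\s*=\s*([A-Z0-9_]+)\s*\((.*)\);\s*$")
REF_RE = re.compile(r"#(\d+)")


def index_entities(text):
    """Map each entity number to (entity name, raw attribute string)."""
    entities = {}
    for line in text.splitlines():
        match = ENTITY_RE.match(line.strip())
        if match:
            number, name, attrs = match.groups()
            entities[int(number)] = (name, attrs)
    return entities


def referenced_entities(entities, number):
    """Return {entity name: entity number} for references found in the index."""
    _, attrs = entities[number]
    found = {}
    for ref in REF_RE.findall(attrs):
        target = entities.get(int(ref))
        if target is not None:
            found[target[0]] = int(ref)
    return found


entities = index_entities(SAMPLE_IFC)
windows = [n for n, (name, _) in entities.items() if name == "IFCWINDOW"]
for number in windows:
    print(number, referenced_entities(entities, number))
```

For a production workflow, a dedicated IFC toolkit would handle multi-line entities, escaped strings, and schema-aware attribute names more reliably than regular expressions.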
The data extracted from the IFC file is used for 3D representation to identify a particular window that is required to be replaced. The 3D model is a complete model, is interactive, and can be rotated to be viewed from various angles. The interactivity of this model is used to select a particular window, which presents the properties of the window, as shown in Figure 3.

CBR Training
The next step of the research process is CBR training, which includes case input and model training. The case input involves collecting previous related cases and defining the properties of each case. This collection of cases becomes the repository for model training. The CBR repository for window installation records the type of building component installed; when, where, and by whom it was installed; the material type; and its accessibility. The six related attributes are the story, position, size, material, duration, and accessibility code. Unlike the others, the accessibility code is generally not acquired from BIM; users enter into the CBR repository the degree of difficulty for construction workers to install the component. These multidimensional attributes of the CBR facilitate automatic data retrieval from existing installation data. Descriptions of each attribute of the proposed CBR repository are summarized in Table 1.

Table 1. CBR repository attributes and descriptions used in window installations.
Name of Factors | Description
Accessibility | The degree of difficulty for a construction worker to install a required building component. Nominal values and numbers are used: easy (1), medium (2), and difficult (3).
Cost | The installation cost, in the range of $18-$510.
Duration | The duration of installation in minutes, in the range of 14-121.
Material | The substance, or mixture of substances, that constitutes a building component. Nominal values are used for aluminum, plastic, wood, and so on.

Ashworth [27] and Kim et al. [28] stated that selecting important attributes in a CBR process results in a satisfactory degree of accuracy, speed, ease of use, ease in updating, clarity of explanation in construction cost estimations, and consistency in variables. Some systems may rely on the user's experience for determining the weights; however, an AI algorithm, factor selection, was used in this research. A key process in CBR is case retrieval, wherein the algorithm determines the factors that carry the most weight among the attributes of the cases in the case base. Some researchers set the weights based on experience and an analytic hierarchy process [29], which may involve both uncertainty and imprecision owing to the requirement for human intervention. Other researchers have used a genetic algorithm to determine the weight factors [23], which requires extensive computational power. In this research, a machine learning algorithm, factor selection, is proposed for selecting important factors [25,30]. Factor selection is the process of finding, from a large set of original attributes, the appropriate factors that describe the original dataset. The factor selection and weight assignment were accomplished with Weka 3, a data mining software package that uses machine learning algorithms to define the factor weights. The process included defining the factors and the options for those factors. After the factors were defined, the cases used in the CBR were input into Weka 3. Based on these cases, the program determines which factors influence the target factor, which is the cost. The analysis ranked the factors, from the most to the least influential on cost, as story, accessibility, position, material, and size. Figure 4 shows the weights assigned to each factor. The story factor has a much higher weight in the window cost estimation owing to its significant influence on the cost of installing a window.

Retrieval of Cases and Cost Estimation
After developing the CBR repository, the next step is to apply it to the CBR algorithm. The CBR searches for a previously solved similar problem, retrieves the corresponding solution, adapts the solution to the current problem, verifies the solution, and stores the newly solved problem for future use. In turn, the newly derived solution may be used for solving future problems, which comprises another CBR working cycle.
The CBR training uses the repository as its base for case selection, and user input is required to define a new case. The CBR software used for this research is myCBR, in which the cases are stored and from which they can be retrieved. After the CBR model is prepared, the retrieval process begins when the user inputs information regarding the new case. This information is the basis of the CBR selection from the repository. After the user provides adequate information, the CBR identifies the best-fitting case in the repository and outputs the properties of that case to the user. The model then outputs the estimated cost, which is the final step of the process. In this research, the cost estimation of a window replacement was analyzed. The window properties were obtained and stored in the CBR repository for a new window replacement activity.
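The retrieval step can be illustrated with a weighted nearest-case search. The sketch below is not myCBR's API: the class, the function names, the three stored cases, the attribute ranges, and the weight values are all assumptions. Only the ranking of the weights (story, accessibility, position, material, size) follows the text; the actual weights are those reported in Figure 4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowCase:
    story: int
    accessibility: int  # 1 = easy, 2 = medium, 3 = difficult
    position: str
    material: str
    size: float  # hypothetical unit: square metres
    cost: float  # US dollars


# Hypothetical weights; only their ordering is taken from the text.
WEIGHTS = {"story": 0.40, "accessibility": 0.25, "position": 0.15,
           "material": 0.12, "size": 0.08}
# Assumed value ranges used to normalise numeric attributes.
RANGES = {"story": 9.0, "accessibility": 2.0, "size": 4.0}

# Hypothetical repository of previous cases.
REPOSITORY = [
    WindowCase(1, 1, "middle", "aluminum", 1.5, 95.0),
    WindowCase(3, 2, "corner", "wood", 2.0, 210.0),
    WindowCase(10, 3, "corner", "aluminum", 2.5, 510.0),
]


def local_similarity(attr, a, b):
    """Numeric attributes: 1 - normalised distance. Nominal attributes: exact match."""
    if attr in RANGES:
        return max(0.0, 1.0 - abs(a - b) / RANGES[attr])
    return 1.0 if a == b else 0.0


def similarity(query, case):
    """Weighted average of local similarities over all weighted attributes."""
    score = sum(w * local_similarity(attr, query[attr], getattr(case, attr))
                for attr, w in WEIGHTS.items())
    return score / sum(WEIGHTS.values())


def retrieve(query, repository):
    """Return the stored case most similar to the query."""
    return max(repository, key=lambda case: similarity(query, case))


query = {"story": 3, "accessibility": 2, "position": "corner",
         "material": "aluminum", "size": 2.0}
best = retrieve(query, REPOSITORY)
print(best.cost, round(similarity(query, best), 3))
```

The cost of the retrieved case serves as the estimate for the new case; an adaptation step could adjust it further, which this sketch omits.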
Considering the information exchange capabilities for cost estimation in facility management, a research framework is proposed as follows:
• Existing cost data: Window installation data are collected and stored in a database.
• Data extraction: The database file is read by the prototype model, and the model then collects the desired information from the IFC representation of the BIM.
• Visual verification (optional): After the extraction of the geometric and spatial information of the building elements, the BIM is remodeled so that the completeness and accuracy of the model after the geometric transformation can be quickly verified visually.
• CBR repository: The CBR repository is then built with six factors from the extracted information and minimal user-inputted data, such as contractors and weather.
• Refinement process: If the project cost is acceptable, the process stops. If not, the whole process is repeated with appropriate modifications to the previous steps.
In Figure 5, the first component of obtaining data from an IFC file is related to the general properties of the building component. Line #276 contains the general properties of a wall. IFCWALLSTANDARDCASE is the IFC entity name referring to a particular wall. The text within the parentheses after IFCWALLSTANDARDCASE presents the properties of the wall. The string consisting of a combination of letters and numbers that appears at the beginning and end of the parentheses is used to identify a particular component. Numbers #16, #184, and #272 refer to other IFC entities that further describe the wall. "Wall-15" identifies the wall number within the IFC model.

The second component in Figure 5 is the collection of IFC material entities, which relates to the materials of the wall. The IFC entities related to the materials are in lines #297, #300, #302, #304, and #305. The entity name in line #297 is IFCMATERIAL, which contains the text "concrete", indicating the primary material used for the wall. IFCMATERIALLAYERSET in line #302 lists "Brick", which specifies the sub-material of the wall and could be related to the veneer of the wall.

The third component is the collection of location entities, which identifies the specific location of the element within the IFC model. Lines #311, #315, #319, #322, #325, #329, and #333 are IFC entities that contain the location properties of a wall. Line #311 is the IFCDIRECTION, which is the direction or rotation of the object in the 3D model. Line #315 is the IFCCARTESIANPOINT for the component. Each component has multiple IFCCARTESIANPOINT entities to represent the location of the object along the X, Y, and Z axes of the 3D model. The three numbers within the parentheses in line #315 refer to the location along the X, Y, and Z axes. This location information specifies which floor the component is on and its specific location on the floor.

Lastly, the fourth component is the shape entities of the IFC file. The shape entities provide a 3D solid shape to the components. Lines #371, #375, #379, #383, #386, #390, and #393 in Figure 5 are a portion of the IFC entities related to the wall shape. Lines #371, #375, and #386 present the direction or rotation of the wall. Lines #390 and #393 refer to other IFC entities that determine the 3D shape of the wall component.
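As a small illustration of how the three numbers in an IFCCARTESIANPOINT entity can be turned into a floor assignment, the sketch below parses one entity line and looks up the story from its Z value. The entity line, the story elevations, and the function names are hypothetical. The sketch also assumes that the Z value is already an absolute elevation, whereas in practice IFC placements are usually relative to a parent placement and must be accumulated first.

```python
import re
from bisect import bisect_right

# Matches the coordinate list of an IFCCARTESIANPOINT entity line.
POINT_RE = re.compile(r"IFCCARTESIANPOINT\s*\(\s*\(([^()]*)\)\s*\)")


def parse_point(line):
    """Return the coordinates of an IFCCARTESIANPOINT line as a tuple of floats."""
    match = POINT_RE.search(line)
    if match is None:
        raise ValueError("not an IFCCARTESIANPOINT line")
    return tuple(float(value) for value in match.group(1).split(","))


def story_for_elevation(z, story_elevations):
    """Return the 1-based story whose floor elevation is the highest one not above z.

    `story_elevations` must be sorted in ascending order.
    """
    return max(1, bisect_right(story_elevations, z))


# Hypothetical entity line and story elevations (millimetres), for illustration only.
LINE = "#315=IFCCARTESIANPOINT((1200.,350.,4500.));"
STORY_ELEVATIONS = [0.0, 4000.0, 8000.0, 12000.0]

x, y, z = parse_point(LINE)
print((x, y, z), story_for_elevation(z, STORY_ELEVATIONS))
```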

Illustrative Example
A BIM model was used in this illustrative example. The building, named PORTAL, is a four-story university office building with a floor area of 96,000 square feet that cost approximately $37 million (Figure 6).
The BIM model contained design components including architectural, electrical, structural, and heating, ventilation, and air conditioning systems. The building design information for each of these disciplines was combined in one BIM model. The BIM file extracted as an IFC file contained approximately 3,300,000 IFC entities describing every building component within the PORTAL building. The PORTAL building is used to identify accurate window replacement costs based on the information obtained from the PORTAL IFC file, previous window replacement cases used in our model, and the decision-making tool CBR.
The PORTAL building includes 180 windows, with average, maximum, and minimum replacement costs of $171.45, $510, and $18, respectively, and a standard deviation of $105.69. The distribution shows that the replacement costs vary widely, with a significant difference between the minimum and maximum costs. In addition, the median cost was $152, with the majority of the windows installed at one to three stories. However, the replacement cost could amount to as much as $510 when the installation is complicated and difficult, as in the case of a high-rise building. Table 2 gives an overview of the stories, materials, and accessibilities of the 180 window cases. As shown in Table 2, the window installations ranged from one to 10 stories; the materials were aluminum (109), wood (17), and plastic (54); and the accessibility levels ranged from easy (53) to difficult (62).
This section describes the detailed process of (1) data collection, (2) CBR training using the factor selection algorithm, (3) retrieval of the cases, and (4) cost estimation. The factor selection algorithm was used in this research to measure the significance of each factor, each of which is an input variable in the CBR model.
The first step of the process was to develop a model based on the data obtained in the PORTAL IFC file. The model was first extracted from the BIM file as an IFC file, and the IFC file was then parsed for the data required in this research. All the spatial and geometric information was extracted from the IFC file for the 3D representation of the building. In addition to the 3D representation, IFC is used to extract the window properties of the PORTAL building.
A window of the building is selected to estimate the costs associated with window replacement. Figure 7 shows the window selected for this illustrative example. The presented properties of the window are the story, position, size, material, accessibility, installation date, and manufacturer. The installation date and manufacturer are information obtained from an external source and included in the model. The other items were obtained from the IFC file and are presented in the 3D model. Figure 7 shows how the properties are obtained from an IFC file and the external input for the window properties. The properties related to one window are stored in several lines of an IFC file. A window is initially identified by the IFC entity IFCWINDOW with an IFC entity code before the entity name. In the example, #2843 is the IFC entity code for this particular window. The other IFC lines either refer to the IFCWINDOW entity or they refer back to IFCWINDOW through the IFC entity codes. The properties defined in the IFCWINDOW entity include the window ID, name, location, shape, height, and width.
The IFC entity represented by #295 in Figure 7 is the IFCRELCONTAINEDINSPATIALSTRUCTURE entity, which contains the wall association and story association data. From these data, the story and wall on which the window is located can be determined. The material association of the window is located in IFCRELASSOCIATESMATERIAL, the renovation status is located in IFCRELDEFINESBYPROPERTIES, and the window style is located in IFCRELDEFINESBYTYPE. Figure 7 thus presents how specific properties can be extracted from an IFC file to be used by facility managers for the cost estimation of window replacement.
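The lookup described above can be illustrated with a minimal sketch in Python. It is not the parser used in this study. It assumes a plain-text IFC (STEP) file with one entity per line, an IFC2x3-style IFCWINDOW whose last two attributes are the overall height and width, and the schema spelling IFCRELCONTAINEDINSPATIALSTRUCTURE for the containment relationship. The excerpt is hypothetical: only the entity codes #2843 and #295 come from the text, and the function names are illustrative.

```python
import re

# Hypothetical IFC (STEP) excerpt. Only the ids #2843 and #295 come from the
# text; every GUID, name, dimension, and other id is invented for illustration.
IFC_TEXT = """\
#2843= IFCWINDOW('2O2Fr$t4X7Zf8NOew3FLOH',#41,'Window-01',$,$,#2840,#2835,'W-01',1.5,1.2);
#295= IFCRELCONTAINEDINSPATIALSTRUCTURE('0xScRe4drECQ4DMSqUjd6d',#41,$,$,(#2843,#2901),#120);
#3010= IFCRELASSOCIATESMATERIAL('1kTvXnbbzCWw8lcMd1dR4o',#41,$,$,(#2843),#3005);
#3020= IFCRELDEFINESBYTYPE('3cUkl32yn9qRSPvBJVyWw5',#41,$,$,(#2901),#3015);
"""

# One entity per line: "#<id>= <NAME>(<arguments>);"
ENTITY_RE = re.compile(r"^#(\d+)\s*=\s*([A-Z0-9_]+)\s*\((.*)\);$")
REF_RE = re.compile(r"#(\d+)")


def parse_entities(text):
    """Map each entity id to (entity name, raw argument string)."""
    entities = {}
    for line in text.splitlines():
        match = ENTITY_RE.match(line.strip())
        if match:
            entities[int(match.group(1))] = (match.group(2), match.group(3))
    return entities


def window_size(entities, window_id):
    """Return (height, width), assuming they are the last two attributes."""
    name, args = entities[window_id]
    if name != "IFCWINDOW":
        raise ValueError(f"#{window_id} is {name}, not IFCWINDOW")
    fields = args.split(",")
    return float(fields[-2]), float(fields[-1])


def relations_for(entities, target_id):
    """Find relationship entities (IFCREL...) that refer to target_id."""
    found = {}
    for entity_id, (name, args) in entities.items():
        refs = {int(ref) for ref in REF_RE.findall(args)}
        if name.startswith("IFCREL") and target_id in refs:
            found[name] = entity_id
    return found


entities = parse_entities(IFC_TEXT)
print(window_size(entities, 2843))
print(relations_for(entities, 2843))
```

A regular-expression scan of this kind is only adequate for small excerpts; a file with millions of entities, multi-line records, or string values containing "#" would call for a dedicated IFC parser.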
Among the 180 window cases, 175 cases were included in the CBR model with the goal of identifying a case that is similar to the new cases. Five window cases were randomly excluded from 180 cases for the CBR model verification.
In this illustrative example, one window was identified from the 3D model of the building, and the properties of the window were obtained. After the model has identified the properties of the window, these properties are input into the CBR model to identify the best match. The data output from the CBR model includes the total estimation costs and total installation duration. This information is helpful to facility managers for easily determining the costs associated with replacing a window at a particular location.
When considering various factors in the CBR, it is important to decide which factors are related to the estimation of a window installation. These properties include the story, position, size, material, and accessibility. Each of these properties influences the cost and duration of the installation; however, their influence is not equal. This must be taken into consideration when determining the cost and duration of the window installation. The next section describes the method of selecting the relevant factors using a machine learning algorithm, namely factor subset selection.
The properties of the window are input into the CBR model for case retrieval. Table 3 presents the four cases retrieved for this particular illustrative example. The similarity values are calculated based on the weights input into the model. For this illustrative example, case32 is the closest match with a similarity value of 0.96 (96%). Similar cases are identified with the objective of estimating the cost associated with installing a new window in the building. The retrieved cases include the cost as well as the duration measured for the window installation. The duration value can also be used as an estimate for a new window replacement.
The presentation of the cases retrieved can help the facility manager estimate the cost of new projects based on existing cases. The cost can be obtained from case32, which has a value of $162 and is used as an estimate for the new window installation.
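The retrieval step can be sketched as a weighted nearest-neighbor search. The following Python example is a minimal illustration under stated assumptions, not the similarity function of the CBR model in this study: numeric properties are compared by a range-normalized distance, categorical properties by exact match, and the global similarity is the weighted average of these local similarities. The weights, the size range, the case identifiers, and all case values are hypothetical.

```python
# Hypothetical weights; in the study the weights come from factor selection.
WEIGHTS = {"story": 0.30, "size_m2": 0.20, "material": 0.25, "accessibility": 0.25}

# Value ranges used to normalize numeric properties. The story range follows
# the one-to-10-story span in the text; the size range is an assumption.
RANGES = {"story": 9.0, "size_m2": 4.0}

# Hypothetical case base (ids, properties, costs, and durations are invented).
CASES = [
    {"id": "case_a", "story": 3, "size_m2": 1.8, "material": "aluminum",
     "accessibility": "easy", "cost": 150.0, "duration_h": 2.0},
    {"id": "case_b", "story": 8, "size_m2": 2.0, "material": "aluminum",
     "accessibility": "difficult", "cost": 480.0, "duration_h": 6.0},
    {"id": "case_c", "story": 1, "size_m2": 1.0, "material": "wood",
     "accessibility": "easy", "cost": 60.0, "duration_h": 1.0},
]


def local_similarity(attr, a, b):
    """Similarity of one property, in [0, 1]."""
    if attr in RANGES:  # numeric: 1 minus the range-normalized distance
        return max(0.0, 1.0 - abs(a - b) / RANGES[attr])
    return 1.0 if a == b else 0.0  # categorical: exact match only


def similarity(query, case):
    """Weighted average of the local similarities over all weighted properties."""
    total = sum(WEIGHTS.values())
    score = sum(w * local_similarity(attr, query[attr], case[attr])
                for attr, w in WEIGHTS.items())
    return score / total


def retrieve(query, cases, k=4):
    """Return the k most similar cases as (similarity, case) pairs."""
    ranked = sorted(((similarity(query, c), c) for c in cases),
                    key=lambda pair: pair[0], reverse=True)
    return ranked[:k]


query = {"story": 3, "size_m2": 2.0, "material": "aluminum", "accessibility": "easy"}
best_similarity, best_case = retrieve(query, CASES, k=1)[0]
# The cost and duration of the best match serve as the estimate.
print(best_case["id"], round(best_similarity, 2), best_case["cost"])
```

Using the cost of the single best match mirrors the way the estimate is taken from the closest case in the illustrative example; averaging over the top k cases would be an alternative design that this sketch does not explore.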

Validation
To validate the model, k-fold cross-validation was conducted. K-fold cross-validation is useful for evaluating errors when limited datasets are used, as well as for preventing overfitting [31]. For every verification, five of the 180 cases were randomly excluded from the CBR repository, and their properties were then entered into the CBR model to verify that the model would retrieve a similar case. Additionally, the cost of each excluded case was compared with the cost of the corresponding retrieved case in order to validate the accuracy of the model. In this study, the verification was performed 10 times, and the average error rate was used for testing each individual model. Furthermore, multiple regression analysis (MRA) was selected for comparison, as it is one of the most widely used methods in statistics [32]. Table 4 presents the error rates for the two analysis methods. The test result shows that the proposed model is able to estimate the replacement cost of windows. The CBR verification produced an error rate of 26.37%. This error rate is 0.69 percentage points lower than that of the MRA, which was 27.06%. This difference does not provide evidence of the superiority of the proposed model over the regression model, but it does indicate that the model has a prediction performance comparable to that of the MRA model. Moreover, as more cases are accumulated, the error rate of the model is expected to decrease further.
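The verification procedure described above (five cases held out at random, repeated 10 times, with the error rates averaged) can be sketched in Python as follows. The definition of the error rate is an assumption: the sketch uses the mean absolute percentage error between the actual and estimated costs, which may differ from the measure reported in Table 4. The function names, the toy cases, and the nearest-story estimator are hypothetical.

```python
import random


def mean_abs_pct_error(actual, predicted):
    """Mean absolute percentage error, in percent (assumed error measure)."""
    return 100.0 * sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def repeated_holdout(cases, estimate, n_rounds=10, n_test=5, seed=0):
    """Average error over n_rounds random hold-outs of n_test cases each.

    `estimate(test_case, repository)` must return a cost estimate using only
    the cases left in the repository.
    """
    rng = random.Random(seed)
    round_errors = []
    for _ in range(n_rounds):
        test_idx = set(rng.sample(range(len(cases)), n_test))
        test = [c for i, c in enumerate(cases) if i in test_idx]
        repository = [c for i, c in enumerate(cases) if i not in test_idx]
        actual = [c["cost"] for c in test]
        predicted = [estimate(c, repository) for c in test]
        round_errors.append(mean_abs_pct_error(actual, predicted))
    return sum(round_errors) / len(round_errors)


def nearest_story_estimate(test_case, repository):
    """Toy estimator: cost of the repository case with the closest story."""
    closest = min(repository, key=lambda c: abs(c["story"] - test_case["story"]))
    return closest["cost"]


# Hypothetical toy cases (not the 180 cases of the study).
toy_cases = [{"story": s, "cost": 40.0 * s + 10.0 * (i % 3)}
             for i, s in enumerate([1, 1, 2, 2, 3, 3, 4, 5, 6, 7, 8, 9, 10, 10])]

print(repeated_holdout(toy_cases, nearest_story_estimate))
```

The same function can be called with any estimator, so a CBR retrieval and a regression model can be compared on identical hold-out splits by passing the same seed.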

Conclusions
BIM technology was applied in this research with the intention of aiding building facility managers in decision-making processes and budget management by using a CBR algorithm. Facility managers often rely on national averages or their own personal experience when estimating installation costs for building components; however, the integration of various data such as structured and graphical data in IFC with the CBR model resulted in an accurate cost estimation based on the existing best matching cases. Previous cases were used for obtaining a better match for new problems that the facility managers may have to deal with. This research showed that, with the CBR model, the facility managers can select a particular object from the BIM model, obtain the properties related to that object, input that data into the CBR model, and retrieve the closest case match.
The illustrative example presented in this research showed how beneficial the proposed model could be for window installation cost estimation. Window installation is an example of how replacement costs can vary depending on the location of the installation and other factors related to the window. The illustrative example took into consideration 180 cases from a real building. While it is uncommon to obtain such a high number of cases in the practice of facility management, the number of BIM projects in the industry has recently been increasing, which will aid in collecting a sufficient number of cases in the near future. Moreover, it should be noted that CBR can be used for conducting data analysis even with a few cases. Therefore, a large number of cases is desirable but not necessary for the proposed approach.
In the proposed approach, two main advantages were identified. First, this study makes a theoretical contribution to the body of knowledge by developing a CBR model based on data obtained from IFC entities. Next, the model and the study results will be valuable to practitioners when estimating the replacement cost of windows with the help of the CBR algorithm.
The model presented in this research is a prototype that demonstrates the process of retrieving replacement costs from BIM, and it has several limitations. First, owing to the complexity of the IFC structure, this research focused on predicting cost estimates for window replacement projects with six factors; however, more factors are likely to affect the replacement cost. Second, the CBR in this study was constructed based on only 180 window installation cases as an illustrative example. Third, the model did not take into account the time value of money. Therefore, further studies to overcome these limitations and improve the accuracy of the presented model are being carried out with additional factors and cases.