A Design and Its Application of Multi-Granular Fuzzy Model with Hierarchical Tree Structures

Abstract: This paper is concerned with the design of a context-based fuzzy C-means (CFCM)-based multi-granular fuzzy model (MGFM) with hierarchical tree structures. For this purpose, we propose three types of hierarchical tree structures (incremental, aggregated, and cascaded) in the design of the MGFM. In general, the conventional fuzzy inference system (FIS) suffers from long processing times and an exponential increase in the number of if–then rules when processing large-scale multivariate data. The existing granular fuzzy model (GFM) reduces this exponential growth in the number of rules. However, the GFM produces overlapping rules as the cluster centers become closer, and it is difficult to interpret when there are many input variables. To solve these problems, the CFCM-based MGFM is designed as a tree of smaller interconnected GFMs, in which the inputs of the high-level GFMs are taken from the outputs of the low-level GFMs. The hierarchical tree structure is more computationally efficient and easier to understand than a single GFM. Furthermore, since the output of the CFCM-based MGFM is a triangular fuzzy number, it is evaluated with a performance measure suited to the GFM. The prediction performance is analyzed on the automobile fuel consumption and Boston housing databases to demonstrate the validity of the proposed approach. The experimental results show that the proposed CFCM-based MGFM with a hierarchical tree structure creates a small number of meaningful rules and makes the resulting predictions explainable.


Introduction
In general, fuzzy sets can effectively represent the ambiguous and uncertain information inherent in real-world nonlinear systems. Fuzzy sets represent dynamic or static characteristics as membership functions and give the degree of fuzzy membership for given data. A fuzzy inference system (FIS) describes a system qualitatively, which makes the system easy to understand, and it is robust for systems with uncertain information. However, an FIS has to rely on the knowledge or experience of experts to obtain its fuzzy rules. Meanwhile, neural networks (NNs) can learn the input and output relationships of a system and can therefore perform processing tasks quickly. However, it is difficult to understand a system through an NN because the network holds no explicit information about the given system. A complementary model, called a neuro-fuzzy system, was proposed by combining the advantages of fuzzy models and neural networks to address these issues [1][2][3][4][5]. Studies are actively being conducted on neuro-fuzzy inference systems [6][7][8][9][10][11][12][13][14][15]. Similar to the FIS, neuro-fuzzy inference systems are expressed by fuzzy rules. With grid partitioning, the number of fuzzy rules increases exponentially as the number of input variables increases.
Meanwhile, granular computing (GC) is a paradigm for processing an increasing amount of information in computational intelligence and human-centric systems. Researchers have been investigating several design methods for GC [16][17][18][19][20]. The term GC was defined as "subsets computed with words" in a study by Zadeh [21]. GC encompasses the techniques, theories, and methodologies that use the concept of granules to solve complex real-world problems. Among many studies, Pedrycz [22] proposed a granular fuzzy model (GFM) whose output is expressed as triangular contexts rather than numerical values for a human-centric system. The GFM uses context-based fuzzy c-means (CFCM) clustering [23] to generate the contexts in the output variable and produces cluster centers in the input space with the aid of the generated contexts. Unlike conventional fuzzy c-means (FCM) clustering, CFCM clustering efficiently produces clusters by representing the properties of the data in both the input and output variables. Studies are actively being conducted on the GFM [24][25][26][27][28][29][30][31][32][33]. However, increasing the number of inputs to the GFM makes the resulting rules difficult to understand in complex system designs. The GFM produces overlapping rules as the cluster centers become closer, and its fuzzy if-then rules are difficult to interpret when there are many input variables.
Therefore, we propose a CFCM-based multi-granular fuzzy model (MGFM) with hierarchical tree structures, which is designed as a set of smaller interconnected GFMs with a smaller number of meaningful rules. The advantage of the proposed model is that it is more computationally efficient, and its fuzzy if-then rules are easier to understand, than those of the GFM itself. Furthermore, we use a performance index suited to the output characteristics of the GFM. The performance index is obtained from the concepts of coverage and specificity. Studies on various hierarchical structures are also actively being conducted [34][35][36][37][38]. This paper is organized as follows. Section 2 describes the concept and procedure of CFCM clustering and the CFCM-based GFM. Section 3 provides three hierarchical tree structures and the CFCM-based MGFM with a hierarchical tree structure. In Section 4, experiments on the automobile fuel consumption and Boston housing databases are described. Finally, conclusions are provided in Section 5.

Context-Based Fuzzy C-Means Clustering
Context-based fuzzy C-means (CFCM) clustering was proposed by Pedrycz [39]. It is an effective method of creating clusters that utilizes the correlation between the data in the input and output variables. Because CFCM clustering preserves the properties of the output variable, it produces more homogeneous clusters than conventional FCM clustering [40]. Equation (1) defines the context created by using the characteristics of the output data. Here, D denotes the dataset in the output space, and it is assumed that the context has an available value for the given data. f_k = T(d_k) denotes the degree of membership of the k-th datum in the context of the output variable, and its value lies between 0 and 1. Based on these characteristics, the requirement on the membership matrix can be expressed by Equation (2).
The modified membership matrix U can be expressed by Equation (3).
The fuzzification factor m ∈ (1, ∞) is the weight exponent and is generally set to 2. The context is generated by a triangular membership function in the output space to obtain the degree of membership f_k. The membership functions generate contexts that overlap at uniform intervals. Moreover, the interval and shape of the generated contexts can be changed according to the user's settings. Here, we use two methods: generating contexts by evenly dividing the output variable, and flexibly generating contexts from a Gaussian distribution. The procedure for the CFCM clustering algorithm is as follows:
[Step 1] Set the number of contexts to be generated and the number of clusters to be estimated for each context. In addition, initialize the membership matrix U with values between 0 and 1.
[Step 2] Generate the contexts as triangular fuzzy sets that are flexibly distributed in the output variable. The generation method of the contexts can be changed based on the user's settings.
[Step 3] Calculate the cluster centers for each context using Equation (4).
[Step 4] Compute the objective function using the equation below. The process stops if the change in the objective function from the previous iteration is less than the threshold.
Here, d_ik represents the Euclidean distance between the i-th cluster center and the k-th datum, and p denotes the number of iterations.
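The equations referenced in this section did not survive in the text. For orientation only, the standard CFCM formulation is sketched below. The symbols c (clusters per context), N (number of data), x_k (k-th input vector), and v_i (i-th cluster center) are assumed notation, and the order of the lines is only a guess at how these forms map onto the numbered equations.

```latex
\begin{align*}
& T : D \to [0,1], \qquad f_k = T(d_k) \\
& U(f) = \Big\{ u_{ik}\in[0,1] \;\Big|\; \sum_{i=1}^{c} u_{ik} = f_k \ \ \forall k, \quad
   0 < \sum_{k=1}^{N} u_{ik} < N \ \ \forall i \Big\} \\
& u_{ik} = \frac{f_k}{\displaystyle\sum_{j=1}^{c}
   \left( \frac{\lVert x_k - v_i \rVert}{\lVert x_k - v_j \rVert} \right)^{2/(m-1)}} \\
& v_i = \frac{\sum_{k=1}^{N} u_{ik}^{m}\, x_k}{\sum_{k=1}^{N} u_{ik}^{m}} \\
& J = \sum_{i=1}^{c}\sum_{k=1}^{N} u_{ik}^{m}\, d_{ik}^{2}, \qquad
   \big\lvert J^{(p)} - J^{(p-1)} \big\rvert \le \varepsilon
\end{align*}
```

The only difference from conventional FCM is the numerator f_k in the membership update: the memberships of each datum sum to its context membership rather than to 1.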

Begin
    Fix m (e.g., m = 2);
    Fix maxiterations (e.g., maxiterations = 100);
    Choose the type of context to be created in the output space (e.g., type 1 = uniform, type 2 = flexible);
    Create p contexts;
    Randomly initialize v cluster centers;
    for t = 1 to maxiterations do
        Update the membership matrix U;
        Calculate the new cluster centers V;
        Calculate the new objective function J;
        if (abs(J_t − J_{t−1}) < ε) then
            break;
        else
            J_{t−1} = J_t;
        end if
    end for
end
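The pseudo-code above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: the function names `triangular_contexts` and `cfcm`, the uniform half-overlapping context layout, and the standard CFCM update formulas are all assumptions introduced here.

```python
import numpy as np

def triangular_contexts(y, p):
    """Membership F[t, k] of output y[k] in p uniformly spaced triangular contexts.

    Assumption: neighbouring contexts overlap by half, so memberships sum to 1.
    """
    modes = np.linspace(y.min(), y.max(), p)      # modal values of the contexts
    width = modes[1] - modes[0]                   # spacing between neighbouring modes
    F = 1.0 - np.abs(y[None, :] - modes[:, None]) / width
    return np.clip(F, 0.0, 1.0), modes

def cfcm(X, f, c, m=2.0, max_iter=100, eps=1e-6, seed=0):
    """Context-based FCM for one context; memberships of datum k sum to f[k]."""
    rng = np.random.default_rng(seed)
    V = X[rng.choice(len(X), size=c, replace=False)]   # random initial centers
    J_prev = np.inf
    for _ in range(max_iter):
        # Squared Euclidean distances, shape (c, N); the offset avoids division by zero.
        d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2) + 1e-12
        # u_ik = f_k / sum_j (d_ik / d_jk)^(2 / (m - 1)), written with squared distances.
        ratio = (d2[:, None, :] / d2[None, :, :]) ** (1.0 / (m - 1.0))
        U = f[None, :] / ratio.sum(axis=1)
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)   # weighted means as new centers
        J = float((Um * d2).sum())                     # objective for this update
        if abs(J - J_prev) < eps:
            break
        J_prev = J
    return U, V
```

A typical call would build the contexts once from the training outputs and then run `cfcm(X, F[t], c)` separately for each context t, collecting the centers as the premise parameters of the rules.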

CFCM-Based Granular Fuzzy Model
As explained in Section 2.1, the granular fuzzy model (GFM) can be designed using CFCM clustering. The premise parameters are determined by the cluster centers estimated by CFCM clustering. The linguistic contexts, which are the consequent parameters, are produced in the output variable, as shown in Algorithm 1. The CFCM-based GFM consists of four layers. The first layer receives the input data, and the second layer represents the set of activation levels for all clusters related to the linguistic contexts. In the third layer, CFCM clustering is performed for each context. Hence, when a linguistic context is given, clustering occurs in the region of the input variables corresponding to that context. The number of clusters for each context is set by the user. The fourth layer is the output layer, and it is implemented in a single granulated form.
The output value Y of the CFCM-based GFM is expressed by Equation (7).
Here, the addition and multiplication symbols denote operations on information granules. W_t and z_t are the t-th context and the t-th firing strength, respectively. The GFM has a single hidden layer formed by the clusters obtained by CFCM clustering. Unlike in the design of conventional neural networks, the connection between the hidden and output layers can be expressed linguistically using contexts. The final value of the output layer can be computed as in Equation (8). The generalized additions and multiplications in Equation (8) are completed using a fuzzy calculation method. Here, A_i is the i-th fuzzy set characterized by a context. When the linguistic context is assigned as a triangular fuzzy number, it can be expressed using Equation (9). Here, a_i−, a_i, and a_i+ denote the lower limit output value, the granular model output value, and the upper limit output value of the triangular fuzzy set, respectively. The lower limit, prediction value, and upper limit of the GFM output can be expressed using Equations (10)-(12).
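As a small worked illustration of the granular output described above, the sketch below aggregates triangular contexts weighted by activation levels. It assumes non-negative activation levels, for which fuzzy addition and scalar multiplication of triangular numbers reduce to component-wise weighted sums; the function name `gfm_output` and the example numbers are hypothetical.

```python
import numpy as np

def gfm_output(z, contexts):
    """Return (lower, modal, upper) of the triangular output Y.

    z        : activation level of each context, shape (P,)
    contexts : rows (a_minus, a, a_plus) of each triangular context, shape (P, 3)
    """
    z = np.asarray(z, dtype=float)
    A = np.asarray(contexts, dtype=float)
    lower, modal, upper = z @ A        # each bound is a weighted sum over contexts
    return lower, modal, upper

# Three contexts on a normalized output range, with assumed activation levels.
y_lo, y_hat, y_hi = gfm_output(
    [0.2, 0.7, 0.1],
    [[0.0, 0.0, 0.5], [0.0, 0.5, 1.0], [0.5, 1.0, 1.0]],
)
print(y_lo, y_hat, y_hi)
```

The width y_hi − y_lo of the resulting triangle is what the specificity criterion later penalizes.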

Hierarchical Tree Structures
Increasing the number of input variables in fuzzy and granular models exponentially increases the number of fuzzy rules, which degrades both computational efficiency and performance. It also makes the operation of the model difficult to understand and makes it challenging to adjust the parameters of the membership functions and fuzzy rules. Moreover, the generalizability of fuzzy and granular models can be reduced when processing large multivariate databases. To solve these problems, fuzzy and granular models are designed in an interconnected hierarchical structure [41] rather than in a single monolithic form. Models designed in this way can be referred to as hierarchical fuzzy and granular models. In this structure, the output of the low-level model is used as the input to the high-level model. A hierarchical fuzzy or granular model designed in this manner is easier to understand and computes more efficiently than a single monolithic model with the same number of inputs.
In this paper, we present three types of hierarchical structure: the incremental, aggregated, and cascaded structures [42][43][44]. Figure 1 shows these three hierarchical structures. In the incremental structure, input values are integrated over multiple levels, and output values can be specified at each level. The input variable used at each level is selected based on its contribution to the final output value. Typically, the input variable with the highest contribution is used for the lowest-level model, while the input variable with the lowest contribution is used for the highest-level model. In other words, the lower-ranked input variables depend on the higher-ranked ones. In the aggregated structure, on the other hand, input variables are used in the lowest-level models, and the outputs of the lowest-level models are used as inputs to the higher-level models. Input variables are naturally grouped and used for specific decisions: associated input variables are fed to a low-level model, and its output is used as input to the high-level model. Finally, the cascaded structure is designed by combining the incremental and aggregated structures. This structure is suitable for models that contain both correlated and uncorrelated input variables. Thus, in this paper we use a hierarchical tree model with the cascaded structure.
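The three structures can be sketched as plain function composition. The wiring below is an assumption based on the description above (Figure 1 is not reproduced in the text), and the stand-in `mean` model is hypothetical; in the proposed method each node would be a two-input GFM.

```python
from typing import Callable, Sequence

Model = Callable[[Sequence[float]], float]

def incremental(models: Sequence[Model], inputs: Sequence[float]) -> float:
    """Chain: the first model sees two inputs; each later model sees the
    previous output plus one new input."""
    out = models[0](inputs[:2])
    for model, x in zip(models[1:], inputs[2:]):
        out = model([out, x])
    return out

def aggregated(low_models: Sequence[Model], groups, top_model: Model) -> float:
    """Tree: each low-level model sees one input group; the top model sees
    all low-level outputs."""
    return top_model([m(g) for m, g in zip(low_models, groups)])

def cascaded(low_models, groups, top_model, inc_models, extra_inputs) -> float:
    """Aggregated stage for grouped (correlated) inputs, followed by an
    incremental stage for the remaining (uncorrelated) inputs."""
    out = aggregated(low_models, groups, top_model)
    for model, x in zip(inc_models, extra_inputs):
        out = model([out, x])
    return out

def mean(v: Sequence[float]) -> float:
    """Hypothetical stand-in for a trained low-dimensional model."""
    return sum(v) / len(v)
```

Because every node has only a few inputs, the rule base of each node stays small even when the overall model has many inputs.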


CFCM-Based Multi-Granular Fuzzy Model (MGFM) with Hierarchical Tree Structure
The tree-structured CFCM-based MGFM is proposed to resolve the problems that arise when a single granular model processes a large-scale multivariate database. The CFCM-based MGFM is designed as a multi-granular model in the form of a tree by stacking models in aggregated and incremental structures. Figures 2 and 3 show the hierarchical tree structure and the diagram of the CFCM-based MGFM, respectively. This structure separates the input variables into correlated and uncorrelated groups. First, the low-level granular model calculates its output by grouping the correlated input variables in an aggregated form. Then, the calculated output and the uncorrelated input variables are processed by the granular model designed in an incremental structure. Figure 4 shows the process of generating fuzzy rules in the CFCM-based MGFM. The contexts are created in the output variable, and the clusters are created in the input regions corresponding to each context. Fuzzy rules are created from the contexts and clusters. The input variables are ranked by their correlation with the output attribute of the database and are fed to the low-level granular model in descending order of positive and negative correlation. The procedure of the tree-structured CFCM-based MGFM is as follows:


Performance Evaluation of CFCM-Based MGFM
The granular fuzzy model (GFM) is constructed from information granules generated using linguistic concepts and information. An information granule is realized as a specific fuzzy set rather than a single numerical entity. Conventional numerical data are evaluated by the standard deviation as well as the mean or median value. Information granules, in contrast, must be generated rationally so that they optimally contain the information in the data. Accordingly, the GFM is evaluated based on the concepts of coverage and specificity. The first criterion, coverage, concerns whether sufficient experimental evidence is accumulated to form an information granule that supports the existence of the data. The second criterion, specificity, requires that the resulting information granule remain highly specific.
The production of rational information granules focuses on generating meaningful information granules using the original data. To generate rational information granules, the coverage and specificity of the information granules should be satisfied [45]. Figure 5 illustrates a conceptual diagram of the coverage and specificity required to generate rational information granules.
Coverage is a criterion to indicate whether data are included in the information granules generated by the GFM, and it shows the degree to which the output values are included in the range of the triangular information granules. When $Y_k$ is a triangular context, the inclusion has a value close to 1 if $y_k$ is included in $Y_k = [y_k^-, y_k^+]$ and a value close to 0 if $y_k$ is not included. In other words, the number of target data included in the output of the GFM is counted by the coverage, and then the average value over all data is calculated, as in Equation (13). When all data are included in the output of the GFM, the coverage has a value close to 1.

$$\mathrm{Coverage} = \frac{1}{N}\sum_{k=1}^{N} \mathrm{incl}(y_k, Y_k) \tag{13}$$

Specificity indicates the precision of the constructed information granules, as in Equation (14). Narrower information granules indicate higher specificity, meaning that the resulting information granules have a well-defined meaning. The narrower the interval between the upper and lower limit values, the higher the specificity value. When the output $Y_k$ of the GFM shrinks to a narrow width, the specificity has a value close to 1.

$$\mathrm{Specificity} = \frac{1}{N}\sum_{k=1}^{N} \exp\left(-\left|y_k^+ - y_k^-\right|\right) \tag{14}$$

The GFM is evaluated by the coverage and specificity of its information granules, and we need a method that maximizes coverage and specificity simultaneously. These two characteristics of information granules trade off against each other: when the coverage has a high value, the specificity has a low value. A rational information granule can be expressed by Equation (15), and this equation is referred to as the performance index (PI).
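The two criteria can be computed directly from the lower and upper bounds of the predicted triangular outputs. The sketch below makes two assumptions that are not confirmed by the text: the inclusion measure is taken as a crisp 0/1 indicator, and the performance index is taken as the product of coverage and specificity (Equation (15) itself is not reproduced here, although the product form is consistent with the index dropping to 0 when specificity is close to 0).

```python
import numpy as np

def coverage(y, y_lo, y_hi):
    """Fraction of targets y_k that fall inside [y_lo_k, y_hi_k]."""
    y, y_lo, y_hi = map(np.asarray, (y, y_lo, y_hi))
    return float(np.mean((y >= y_lo) & (y <= y_hi)))

def specificity(y_lo, y_hi):
    """Mean of exp(-width): close to 1 for narrow granules, smaller for wide ones."""
    y_lo, y_hi = np.asarray(y_lo), np.asarray(y_hi)
    return float(np.mean(np.exp(-np.abs(y_hi - y_lo))))

def performance_index(y, y_lo, y_hi):
    """Assumed form: coverage multiplied by specificity."""
    return coverage(y, y_lo, y_hi) * specificity(y_lo, y_hi)

# Hypothetical normalized targets and predicted bounds.
y = np.array([0.2, 0.5, 0.9])
lo = np.array([0.1, 0.6, 0.8])
hi = np.array([0.3, 0.7, 1.0])
print(coverage(y, lo, hi), specificity(lo, hi), performance_index(y, lo, hi))
```

Widening every interval raises coverage but lowers specificity, which is the trade-off the performance index is meant to balance.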
The performance index plays an important role in evaluating the accuracy and clarity of models. Several studies have been conducted on methods of evaluating the performance of models. Common performance measures include the root mean square error (RMSE) and the mean absolute percentage error (MAPE). These measures are mainly used when the output of the model is a numerical value. However, the output of the GFM is expressed by fuzzy numbers in a linguistic form, so they do not apply directly. Thus, we use the performance index as the evaluation method for the GFM. The higher the value of the performance index, the more meaningful the information granules, and the better the resulting GFM. Furthermore, the performance index calculated from the GFM lets us verify the relationship between coverage and specificity.

Experimental Results and Comments
To verify the validity of the tree-structured CFCM-based MGFM proposed in this paper, we performed experiments using the automobile fuel consumption database and the Boston housing price database, which are used as benchmarking databases in the prediction field.

Database
The automobile fuel consumption database [46] was collected from the late 1970s to the early 1980s to predict automobile fuel economy. This database consists of eight variables and 398 instances. The inputs consist of the number of cylinders, engine displacement, weight, horsepower, acceleration, year, production region, and model name, and the output is fuel economy. In this experiment, we use six input variables, excluding the model name.
The Boston housing price database [47] contains information on housing prices in the city of Boston. This database has 14 variables and 506 instances. The inputs include the crime rate per capita by municipality, the proportion of residential areas over 25,000 square feet, the proportion of land occupied by non-retail commercial districts, the dummy variable for the Charles River, the nitric oxides concentration (parts per 10 million), the average number of rooms per dwelling, the proportion of owner-occupied houses built before 1940, the index of accessibility to five Boston job centers, the index of accessibility to radial roads, the property tax rate per USD 10,000, the ratio of students to teachers by municipality, the proportion of people of color by municipality, and the proportion of the lower class of the population. The output is the price of owner-occupied houses (MEDV).
The energy efficiency database [48] was collected to predict the cooling and heating loads of buildings and consists of eight input variables and 768 instances. The input variables consist of relative compactness, surface area, walled space, roofed space, total height, direction, glazed area, and glazed area distribution map, and the outputs are the cooling load and heating load.

Experimental Methods and Results Analysis
Performance in this paper is evaluated by the performance index, as explained in the previous section. Each database was normalized to values between 0 and 1. The data were divided into training data (50%) and testing data (50%). The experiment was conducted by increasing the number of contexts (P) and clusters (C) from two to six in each tree-structured CFCM-based MGFM. Furthermore, fuzzification coefficients of 1.5 and 2 were used in the experiment. In addition, the contexts were generated by both the uniform and the flexible method.
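The experimental protocol can be sketched as follows. The min-max normalization and the enumeration of settings follow the description above; the random 50/50 split with a fixed seed is an assumption, since the text does not state how the halves were chosen, and the helper names are hypothetical.

```python
import itertools
import numpy as np

def min_max_normalize(data):
    """Scale each column to [0, 1]; constant columns are mapped to 0."""
    lo, hi = data.min(axis=0), data.max(axis=0)
    return (data - lo) / np.where(hi > lo, hi - lo, 1.0)

def split_half(data, seed=0):
    """Assumed protocol: random permutation, first half training, second half testing."""
    idx = np.random.default_rng(seed).permutation(len(data))
    half = len(data) // 2
    return data[idx[:half]], data[idx[half:]]

# All combinations of fuzzification coefficient m, contexts P, and clusters C.
grid = list(itertools.product((1.5, 2.0), range(2, 7), range(2, 7)))
```

Each (m, P, C) triple in `grid` would then be used to train one tree-structured model per context-generation method and to record its performance index on both halves.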
In the case of the automobile fuel efficiency prediction experiment, Tables 1 and 2 list the performance index of the tree-structured CFCM-based MGFM with uniformly generated contexts for the training data and validation data, respectively. Here, m denotes the fuzzification coefficient, P and C represent the number of contexts and clusters, respectively, and PI denotes the performance index. As listed in Table 2, the best prediction performance is achieved when four contexts and four clusters are created. However, when the number of contexts is set to 2, the specificity of the information granules is not secured, and the specificity value is close to 0. Consequently, the performance index is 0. Figures 6 and 7 show the prediction results for the training data and validation data, respectively. In these figures, the solid black line represents the actual output values of the data, and the dotted red line represents the values predicted by the proposed method. As shown in these figures, the tree-structured CFCM-based MGFM showed good prediction performance for the automobile fuel consumption data.
Next, we performed the experiments for the Boston housing price prediction. Tables 3 and 4 list the performance index of the tree-structured CFCM-based MGFM for the training data and validation data, respectively. In the same manner, the experiment was conducted by increasing the number of contexts and clusters from two to six in each tree-structured CFCM-based MGFM. As listed in Table 4, the best performance is achieved when three contexts and four clusters are created. As in the previous experiment, the specificity is not secured when the number of contexts is set to 2, and the resulting specificity value is close to 0. Hence, the performance index is 0. Figures 8 and 9 show the prediction results for the training data and validation data, respectively. As shown in these figures, the tree-structured CFCM-based MGFM showed valid prediction performance for the Boston housing data.
Table 5 lists the experimental results for the automobile fuel consumption prediction, the Boston housing price prediction, and the additional energy efficiency example. Figure 10 compares the performance of the CFCM-based MGFM with that of the GFM itself. As listed in Table 5, for the automobile fuel consumption data, the best performance was achieved with 18 fuzzy rules (P = 6, C = 3) in each of the four two-input GFMs of the hierarchical tree structure. Here, the performance index of the best model in the hierarchical tree structure is 0.4909, and the processing time of the proposed method is 0.0104 s. In contrast, the performance index and processing time of the GFM itself with six inputs are 0.3986 and 0.4 s, respectively. The numbers of premise and consequent parameters in each simple, explainable GFM are 36 (18 centers × 2 inputs) and 18 (6 contexts × 3 points), respectively. In contrast, the numbers of premise and consequent parameters in the GFM itself, which is difficult to explain, are 108 (18 centers × 6 inputs) and 18 (6 contexts × 3 points), respectively. The experiment was performed with MATLAB 2022a on a desktop with an Intel Core i7-7700 CPU, 16 GB of RAM, and an NVIDIA GeForce GTX 1060. As listed in Table 5, both the performance index and the processing time of the proposed method compare favorably with those of the GFM itself. Furthermore, the proposed method has the advantage of being explainable, because the hierarchical model is simplified into GFMs with two inputs each.
In the case of the Boston housing price data, the best performance was achieved with 24 fuzzy rules (P = 6, C = 4) in each of the seven two-input GFMs of the hierarchical tree structure. Here, the performance index of the best model in the hierarchical tree structure is 0.4603, and the processing time of the proposed method is 1.2834 s. In contrast, the performance index and processing time of the GFM itself with 12 inputs are 0.3043 and 1.3417 s, respectively. Although the two methods have similar processing times, the proposed method achieved a higher performance index than the GFM itself. Moreover, it is difficult to obtain an explainable model from the GFM itself because its rules are generated with too many inputs.
In the case of the energy efficiency data, the best performance was achieved with 24 fuzzy rules (P = 6, C = 4) for a single GFM with two inputs in the hierarchical tree structure. Here, the performance index of the best model in the hierarchical tree structure is 0.4952, and the processing time of the proposed method is 0.2834 s. In contrast, the performance index and processing time of the GFM itself with six inputs are 0.4561 and 1.9887 s, respectively. As listed in Table 5, both the performance index and the processing time compare favorably with those of the GFM itself. Furthermore, the proposed CFCM-based MGFM is explainable, because the hierarchical model is simplified into GFMs with two inputs each.
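The parameter counts quoted for the automobile fuel consumption data follow from simple arithmetic: one rule per cluster, one center coordinate per rule and input, and three points per triangular context. The helper below is a hypothetical illustration of that arithmetic only.

```python
def parameter_counts(contexts, clusters_per_context, n_inputs):
    """Return (rules, premise parameters, consequent parameters)."""
    rules = contexts * clusters_per_context      # one rule per cluster center
    premise = rules * n_inputs                   # one coordinate per input per center
    consequent = contexts * 3                    # lower, modal, upper point per context
    return rules, premise, consequent

print(parameter_counts(6, 3, 2))   # two-input GFM inside the tree
print(parameter_counts(6, 3, 6))   # single GFM with all six inputs
```

With P = 6 and C = 3, the premise part grows from 36 to 108 parameters when the six inputs are handled by a single GFM, while the consequent part stays at 18.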

Conclusions
We proposed the CFCM-based MGFM with a hierarchical tree structure. This method is more computationally efficient and easier to understand than the GFM itself in multivariate system applications. For this purpose, we used a hierarchical tree model with the cascaded structure, which combines the incremental and aggregated structures.
Furthermore, we used a performance index based on the concepts of coverage and specificity of information granules to evaluate the proposed model. The experiments demonstrated the validity of the prediction performance of the proposed model on the well-known automobile fuel consumption, Boston housing, and energy efficiency data. As a result, it was confirmed that the proposed CFCM-based MGFM with the hierarchical tree structure can be expressed as an explainable structure with fewer rules while simplifying the existing model, and that the new performance index is a useful way to evaluate it. The best configuration of the proposed model was found by trial and error. Therefore, in future research we will apply optimization to find the best granular fuzzy model with a hierarchical tree structure.

[Step 5] Calculate a new membership matrix U by the membership update equation. Then, go to [Step 3]. The pseudo-code for the CFCM clustering algorithm is given in Algorithm 1.

Algorithm 1. The context-based fuzzy C-means clustering algorithm.

Procedure of the tree-structured CFCM-based MGFM:
[Step 1] Calculate the positive and negative correlations of the input variables with the output variable of the database to be used.
[Step 2] Using the positive and negative correlation ranks, designate the input variables with high correlation ranks as inputs to the low-level granular model of the aggregated structure and the input variables with low correlation ranks as inputs to the granular model of the incremental structure.
[Step 3] Set the number of contexts to create in the low-level granular model and the number of clusters to be produced per context. In addition, initialize the membership matrix U.
[Step 4] Create contexts and clusters using context-based FCM clustering to generate fuzzy rules automatically. The generated fuzzy rules are then used to calculate the output of the low-level granular model.
[Step 5] Process the output of the low-level granular model, the output variable of the database, and the input variables with low correlation ranks that will be used in the incremental structure so that they can be used as inputs to the high-level granular model.
[Step 6] Use the processed database as input to the high-level granular model and generate fuzzy rules using CFCM clustering to calculate the final output.

Appl. Sci. 2023, 13, x FOR PEER REVIEW
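Steps 1 and 2 of the procedure can be sketched as a correlation ranking followed by a split into two input groups. The sketch assumes the Pearson correlation and a ranking by absolute value, which is one plausible reading of "positive and negative correlation ranks"; the function names and the group size are hypothetical.

```python
import numpy as np

def rank_inputs_by_correlation(X, y):
    """Return input indices ordered by decreasing |Pearson r| with y, plus r itself."""
    r = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    order = np.argsort(-np.abs(r))
    return order, r

def split_inputs(order, n_aggregated):
    """Top-ranked inputs feed the aggregated (low-level) models; the rest are
    added one at a time in the incremental stage."""
    return order[:n_aggregated], order[n_aggregated:]

# Small synthetic example: column 2 is perfectly (negatively) correlated with y.
y = np.arange(10.0)
s = np.where(np.arange(10) % 2 == 0, 1.0, -1.0)
X = np.column_stack([y + 3.0 * s, s, -y])
order, r = rank_inputs_by_correlation(X, y)
high, low = split_inputs(order, 2)
print(order, np.round(r, 3))
```

The indices in `high` would be grouped into the low-level GFMs, and those in `low` would enter the tree at the higher levels.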

Figure 4. Diagram for fuzzy rule generation in the design of the CFCM-based MGFM.

Figure 5. The concept of coverage and specificity for rational information granule generation.

Table 1. Performance index of tree-structured CFCM-based MGFM for training data (automobile fuel consumption data).

Table 2. Performance index of tree-structured CFCM-based MGFM for validation data (automobile fuel consumption data).

Table 3. Performance index of tree-structured CFCM-based MGFM for training data (Boston house price data).

Table 4. Performance index of tree-structured CFCM-based MGFM for validation data (Boston house price data).

Table 5. Experimental results of performance index and processing time (validation data).