1. Introduction
Understanding customer preferences for products is an essential task in customer-centric product design, as customer preferences can be mapped into an optimal design [1]. Commonly used methods for analyzing customer preferences are questionnaires and participant interviews. However, these yield only questionnaire scores or answers to predefined questions, and the collected survey data contain little sentimental expression. By comparison, online reviews, which customers write freely after using products, contain rich information about customer preferences. They are important data for product design and for studying customer preferences. Applying sentiment analysis to product comments makes it possible to understand customer preferences and to collect effective data sets at almost no cost. A modeling method is then needed to assess the relationship between design characteristics and customer preferences. Recently, the adaptive neuro fuzzy inference system (ANFIS) has been frequently applied to develop customer preference models using online reviews. It can deal with the fuzzy nature of customers’ sentimental expression and the nonlinearity of the relationships. However, ANFIS cannot interpret the nonlinear relationship transparently, and no visible nonlinear equations between inputs and outputs can be obtained. Its inner workings also lack clarity, which is referred to as black box modeling. The only transparent information generated by ANFIS is the fuzzy rules, whose internal models are linear functions and therefore cannot show the nonlinear relationship. As explainable artificial intelligence develops, exhibiting explicit and understandable nonlinearity in ANFIS becomes increasingly important. To address these limitations of ANFIS, this paper proposes a chaos-driven ANFIS approach to model customer preferences with nonlinear fuzzy rules. First, the proposed approach conducts sentiment analysis, which involves analyzing online reviews, extracting dimensions of customer preferences, and computing sentiment scores.
Based on the extracted information and the settings of the product’s design characteristics, a chaos optimization algorithm (COA) is integrated into the ANFIS to generate the customer preference models. To solve the black box issue of ANFIS, we propose transforming the linear internal models of the fuzzy rules into a nonlinear structure that includes higher-order or interaction terms. Some stochastic optimization algorithms accept poor solutions with a certain probability in order to escape from local minima. In contrast, COA avoids local minima by exploiting the regularity in chaotic motions [2]. It can provide more diverse solutions and has a fast convergence speed. Furthermore, the chaotic search technique is well known for simulating the dynamic behavior of nonlinear systems [3]. COA and ANFIS are combined to create customer preference models that can handle the fuzziness and nonlinearity of the relationship and that remain transparent to readers.
The rest of the paper is organized as follows. Related works are described in Section 2. The proposed chaos-driven ANFIS approach is discussed in Section 3, and its implementation in a laptop case study is discussed in Section 4. Validations of the proposed approach are presented in Section 5, along with a comparison with five other approaches. Section 6 provides the conclusions of the study.
3. A Chaos-Driven ANFIS Approach
The data sets used for modeling in the chaos-driven ANFIS approach are the design characteristics of products and the sentiment scores of customer preferences. The design characteristics are collected from the product information, and the sentiment scores of customer preferences are derived from online reviews using sentiment analysis. COA is used to determine the nonlinear structure of the fuzzy rules in ANFIS. In the developed models, the nonlinearity and fuzziness of the relationships are explicit and explainable, as shown in the generated fuzzy rules. The procedure used in the proposed approach is described in Figure 1.
3.1. Sentiment Analysis Based on Customer Reviews
The products used as samples in the research are determined first. Semantria API, Lexalytics’ text and sentiment analysis tool, is introduced to analyze customer reviews, identify customer preferences, and determine the corresponding sentiment scores. To collect online reviews, a web crawler written in Python 3.12.9 requests the URLs of the websites. The collected customer reviews are stored in an Excel file as the data source for sentiment analysis. The procedure of sentiment analysis in Semantria includes the following six steps. First, the review content is cleaned by removing punctuation, stop words, and non-alphanumeric symbols. Second, the words in online reviews are categorized using part-of-speech tagging. In this paper, nouns describe the dimensions of customer preferences, while adjectives and adverbs refer to the related emotions. Third, the expressions of the emotions related to customer preferences are extracted. Fourth, redundant and incorrect features are deleted by feature pruning. Fifth, synonymous keywords or phrases are grouped and provided as the settings of Semantria. For example, the words “weight”, “heavy”, “light”, and “comfortable to carry” all relate to the customer preference “easy carrying” and are therefore grouped under it in the Semantria settings. Finally, the sentiment scores and polarity of customer preferences are determined based on SentiWordNet, which is an opinion lexicon that assigns three numerical scores indicating positivity, negativity, and objectivity to each synset [32].
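The cleaning, grouping, and scoring steps above can be illustrated with a toy sketch. It does not call Semantria or SentiWordNet: the stop-word list, the keyword group, and the word polarities below are hypothetical placeholders, and part-of-speech tagging and feature pruning are omitted.

```python
import re
from statistics import mean

# Hypothetical settings for illustration only (not Semantria or SentiWordNet data).
STOP_WORDS = {"the", "is", "a", "to", "and", "it", "this"}
PREFERENCE_KEYWORDS = {"easy carrying": {"weight", "heavy", "light", "carry"}}
TOY_POLARITY = {"light": 0.6, "heavy": -0.5, "comfortable": 0.7}


def clean(review):
    """Lowercase the review, drop non-alphanumeric symbols and stop words."""
    tokens = re.sub(r"[^a-z0-9\s]", " ", review.lower()).split()
    return [t for t in tokens if t not in STOP_WORDS]


def score_review(review, preference):
    """Average the toy polarities if the review mentions the preference.

    Returns None when no keyword of the preference group occurs.
    """
    tokens = clean(review)
    if not PREFERENCE_KEYWORDS[preference].intersection(tokens):
        return None
    scores = [TOY_POLARITY[t] for t in tokens if t in TOY_POLARITY]
    return mean(scores) if scores else 0.0
```

In this sketch a review contributes to “easy carrying” only when it contains one of the grouped keywords, which mirrors the role of the synonym settings described above.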
3.2. Generation of Nonlinear Fuzzy Rules
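In the original ANFIS, the fuzzy rules with two inputs are expressed as follows.

$$R_{ij}:\ \text{IF } x_1 \text{ is } A_i \text{ and } x_2 \text{ is } B_j,\ \text{THEN } f_{ij} = p_{ij} x_1 + q_{ij} x_2 + r_{ij} \tag{1}$$

where $R_{ij}$ represents the fuzzy rules; $A_i$ and $B_j$ denote the $i$th and $j$th membership functions for the inputs $x_1$ and $x_2$, respectively; and $p_{ij}$, $q_{ij}$, and $r_{ij}$ are the consequent parameters in the internal model $f_{ij}$. The above internal models in the fuzzy rules involve only linear terms and therefore describe a linear relationship between customer preference and design characteristics. To interpret the nonlinear relationship transparently, this research proposes nonlinear fuzzy rules with a polynomial structure for ANFIS, which are expressed in (2).

$$R_{ij}:\ \text{IF } x_1 \text{ is } A_i \text{ and } x_2 \text{ is } B_j,\ \text{THEN } f_{ij}^{NL} = \beta_{ij}^{1} T_1 + \beta_{ij}^{2} T_2 + \cdots + \beta_{ij}^{K} T_K + c_{ij} \tag{2}$$

where $f_{ij}^{NL}$ is the nonlinear model for $R_{ij}$; $\beta_{ij}^{1}$, $\beta_{ij}^{2}$, and $\beta_{ij}^{K}$ are the coefficients of the items $T_1, T_2, \ldots, T_K$ in the polynomial structure, each item being a single input or a product of inputs; and $c_{ij}$ is the constant in the model. In (2), the polynomial structure is determined based on the COA by minimizing the mean relative error (MRE) as shown in (3).

$$MRE = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \tag{3}$$

where $n$ denotes the size of the data set; $y_i$ and $\hat{y}_i$ represent the $i$th actual and the predicted sentiment score of customer preference, respectively.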
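As a minimal sketch of the fitness function, the MRE can be computed as the mean of the absolute relative deviations between actual and predicted scores. The function name and the absence of a percentage factor are assumptions of this sketch.

```python
def mean_relative_error(actual, predicted):
    """Mean of |y_i - y_hat_i| / |y_i| over all data sets."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equally long")
    total = sum(abs((y - y_hat) / y) for y, y_hat in zip(actual, predicted))
    return total / len(actual)
```

A smaller value indicates a better fit, so the COA keeps the polynomial structure whose trained model gives the smallest value.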
3.3. Chaos Optimization Algorithm (COA)
COA employs chaotic dynamics, instead of random search, to explore the solution space and find the optimal solution at the global level [33]. The search process in COA includes two phases, the global search and the local search. To identify good states, a global search is first conducted along the ergodic trajectory over the entire search area. If the termination requirement is met, the good states have been reached, which implies that the best solution is near. A local search is then started from these results, and a deeper search is conducted in the local space by using a small disturbance term. When the final termination criterion is satisfied, the best solution is found. In chaotic dynamics, variables are transformed from chaotic space to solution space, which determines the solution ranges. The process is described as follows.
In this paper, COA employs the logistic map. By iterating the calculation according to (4), chaotic variables are obtained.

$$c_{m+1} = \mu\, c_m \left(1 - c_m\right) \tag{4}$$

where $m$ represents the iteration number; $c_m$ is the value of the chaotic variable $c$ at the $m$th iteration, and its range is [0, 1]; $\mu$ is commonly equal to 4 to reach all the chaotic states. For example, with $\mu = 4$, a value $c_m$ is mapped to $c_{m+1} = 4 c_m (1 - c_m)$. By utilizing (4), the chaos variables are generated by successive iterations. In the optimization process, the search for optimal solutions is performed using the generated chaos variables through their own locomotion law.
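The iteration in (4) can be sketched as follows, assuming the standard logistic map with a control parameter of 4; the function name and the starting value 0.3 are illustrative choices.

```python
def logistic_sequence(c0, steps, mu=4.0):
    """Iterate c <- mu * c * (1 - c) and return the generated chaos values."""
    if not 0.0 < c0 < 1.0:
        raise ValueError("c0 must lie strictly between 0 and 1")
    values = []
    c = c0
    for _ in range(steps):
        c = mu * c * (1.0 - c)
        values.append(c)
    return values
```

Starting from 0.3, the first two values are 4 × 0.3 × 0.7 = 0.84 and 4 × 0.84 × 0.16 = 0.5376. Initial values of 0.25, 0.5, and 0.75 should be avoided because, with a control parameter of 4, they fall onto fixed points of the map (0.75 or 0) and stop exploring the interval.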
Based on (5), the optimization variable $X_m$ is obtained by transferring the chaos variable $c_m$ using the linear mapping.

$$X_m = a + (b - a)\, c_m \tag{5}$$

where $a$ and $b$ are the lower and upper limits of the elements in $X_m$. Based on (4) and (5), the chaos variable $c_m$ traverses between 0 and 1, while the optimization variable $X_m$ traverses between $a$ and $b$. For example, if the values of $a$ and $b$ are 1 and 4, respectively, and $c_m = 0.5$, then the value of the element in $X_m$ is 1 + (4 − 1) × 0.5 = 2.5, which is between 1 and 4.
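The linear mapping in (5) can be sketched element by element, since each position of the optimization variable may have its own limits. The function and argument names are illustrative.

```python
def to_solution_space(chaos_values, lower, upper):
    """Map each chaos value in [0, 1] to [a, b] via a + (b - a) * c."""
    return [a + (b - a) * c for c, a, b in zip(chaos_values, lower, upper)]
```

With limits 1 and 4 and a chaos value of 0.5, the call `to_solution_space([0.5], [1], [4])` reproduces the value 2.5 from the example above.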
The values of $X$ determine the nonlinear structure of the fuzzy rules. The elements in $X$ involve the inputs of ANFIS, which are denoted as $x_1, x_2, \ldots$, and the mathematical operations between each two inputs. In fuzzy rule (2), there are two types of mathematical operations, which are addition, “+”, and multiplication, “×”. The following (6) is the expression of $X$.

$$X = \left[X_1, X_2, X_3, \ldots, X_l\right] \tag{6}$$

where $l$ should be an odd value, which represents the number of elements in $X$. It is also the length of the chaos variable $c$. Table 1 shows the structure of $X$.
For the items at the odd positions ($X_1, X_3, \ldots$), their rounded values denote the numerical sequence of inputs. For example, if $X_1$ = 2, then the input $x_2$ is placed at the first position in the polynomial structure. For the items at the even positions, the values are either 0 or 1: 0 denotes the mathematical operation “+”, while 1 denotes the operation “×” between the two inputs. For example, if the number of items in $X$ is 5, the three odd-position items select three inputs and the two even-position items select the operations between them; a structure containing one “×” and one “+” then consists of two items, a product of two inputs and a single input, and their coefficients are determined by the least square estimation (LSE) method in the learning process of ANFIS. In this way, nonlinear structures are generated for the fuzzy rules.
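The decoding of the optimization variable into a polynomial structure can be sketched as below. The sketch assumes that “×” binds more tightly than “+”, that even-position values are rounded to 0 or 1, and that Python's built-in `round` (which rounds exact halves to the nearest even integer) is an acceptable rounding rule; the example vector is hypothetical.

```python
def decode_structure(x):
    """Decode [input, op, input, op, ...] into a list of items.

    Each item is a tuple of input indices that are multiplied together;
    the items themselves are added.
    """
    if len(x) % 2 == 0:
        raise ValueError("the optimization variable must have odd length")
    items = [[round(x[0])]]
    for pos in range(1, len(x), 2):
        operation = round(x[pos])       # 0 means "+", 1 means "x"
        next_input = round(x[pos + 1])  # index of the next input
        if operation == 1:
            items[-1].append(next_input)  # extend the current product
        else:
            items.append([next_input])    # start a new additive item
    return [tuple(item) for item in items]


def format_structure(items):
    """Render the decoded items as a readable expression."""
    return " + ".join("*".join(f"x{i}" for i in item) for item in items)
```

For the hypothetical vector `[1.2, 0.9, 2.4, 0.1, 1.7]`, the rounded values are 1, “×”, 2, “+”, 2, so the structure has two items, `x1*x2` and `x2`.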
3.4. Chaos-Driven ANFIS
ANFIS integrates a multi-layer feedforward neural network with fuzzy reasoning and maps inputs into an output [34]. An example of an ANFIS structure is shown in Figure 2; it consists of two inputs and one output, with two membership functions associated with each input.
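In the first layer, $\mu_{A_i}(x_1)$ and $\mu_{B_j}(x_2)$, $i = 1, 2$ and $j = 1, 2$, represent the membership functions of the $i$th and $j$th linguistic labels of $x_1$ and $x_2$, respectively. In this study, the triangular membership function is applied, which is shown as follows:

$$\mu_{A_i}(x_1) = \max\left(\min\left(\frac{x_1 - l_{A_i}}{m_{A_i} - l_{A_i}},\ \frac{u_{A_i} - x_1}{u_{A_i} - m_{A_i}}\right),\ 0\right)$$

and

$$\mu_{B_j}(x_2) = \max\left(\min\left(\frac{x_2 - l_{B_j}}{m_{B_j} - l_{B_j}},\ \frac{u_{B_j} - x_2}{u_{B_j} - m_{B_j}}\right),\ 0\right) \tag{7}$$

where $(l_{A_i}, m_{A_i}, u_{A_i})$ and $(l_{B_j}, m_{B_j}, u_{B_j})$ are the antecedent parameters in ANFIS, which are fuzzy numbers in the shape of a triangle. Their values are determined based on the grid partition method.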
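A triangular membership function and an evenly spaced grid partition can be sketched as follows. The evenly spaced layout, with the first and last peaks on the ends of the input range, is an assumption of this sketch; it reproduces the parameter triples reported in Section 4 when the range of the first input is taken as [11.6, 15.6].

```python
def triangular(x, left, peak, right):
    """Membership degree of x in the triangle (left, peak, right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def grid_partition(low, high, count):
    """Evenly spaced triangles whose peaks run from low to high."""
    if count < 2:
        raise ValueError("at least two membership functions are required")
    step = (high - low) / (count - 1)
    return [
        (low + (k - 1) * step, low + k * step, low + (k + 1) * step)
        for k in range(count)
    ]
```

For example, `grid_partition(11.6, 15.6, 3)` gives triangles close to (9.6, 11.6, 13.6), (11.6, 13.6, 15.6), and (13.6, 15.6, 17.6), and an input of 12.6 has a membership degree of about 0.5 in both of the first two triangles.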
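The second layer consists of four nodes, one for each combination of the membership functions of the inputs. The outputs of this layer are the firing strengths, which are calculated as follows.

$$w_{ij} = \mu_{A_i}(x_1) \times \mu_{B_j}(x_2) \tag{8}$$

In the third layer, the normalized firing strength $\bar{w}_{ij}$ is defined by (9).

$$\bar{w}_{ij} = \frac{w_{ij}}{\sum_{p}\sum_{q} w_{pq}} \tag{9}$$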
Each combination of membership functions of $x_1$ with $x_2$ is indicated by one fuzzy rule. Therefore, the fourth layer consists of 4 fuzzy rules. The importance of fuzzy rule $R_{ij}$ is measured by the value of the normalized firing strength $\bar{w}_{ij}$. A larger value of $\bar{w}_{ij}$ means greater importance of $R_{ij}$. The fuzzy rules are proposed in (2), and in a fuzzy rule, $f_{ij}^{NL}$ is the internal model with the nonlinear structure.
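The second and third layers can be sketched as below, assuming that the firing strength of a rule is the product of the two membership degrees (the usual choice in ANFIS). The function name and the dictionary keyed by rule indices are illustrative.

```python
from itertools import product


def normalized_firing_strengths(mu_x1, mu_x2):
    """Return {(i, j): normalized firing strength} for every rule.

    mu_x1 and mu_x2 hold the membership degrees of the two inputs.
    """
    raw = {
        (i, j): mu_x1[i] * mu_x2[j]
        for i, j in product(range(len(mu_x1)), range(len(mu_x2)))
    }
    total = sum(raw.values())
    if total == 0:
        raise ValueError("no rule fires for this input")
    return {rule: w / total for rule, w in raw.items()}
```

Because the normalized strengths sum to one, each value can be read directly as the relative weight of its rule for the given input.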
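A single node in the fifth layer represents the final output, which is computed using (10).

$$y = \sum_{i}\sum_{j} \bar{w}_{ij}\, f_{ij}^{NL} = \sum_{i}\sum_{j} \bar{w}_{ij} \left(\beta_{ij}^{1} T_1 + \beta_{ij}^{2} T_2 + \cdots + \beta_{ij}^{K} T_K + c_{ij}\right) \tag{10}$$

In (10), $\beta_{ij}^{1}$, $\beta_{ij}^{2}$, $\beta_{ij}^{K}$, and $c_{ij}$ are the consequent parameters of ANFIS, which are determined by the LSE method. The output (10) can be transformed into the following equation.

$$y = A\theta \tag{11}$$

where $A$ is the matrix whose $k$th row contains the products of the normalized firing strengths $\bar{w}_{ij}$ with the items of $f_{ij}^{NL}$ for the $k$th data set, $\theta$ is the vector of the consequent parameters, and $y$ is the vector of sentiment scores. Based on the process of ANFIS, matrix $A$ is obtained. An estimated value of $\theta$, denoted as $\hat{\theta}$, is obtained by minimizing the squared error $\lVert A\theta - y \rVert^2$, and is updated and calculated based on the following equations.

$$\hat{\theta}_{k+1} = \hat{\theta}_k + S_{k+1}\, a_{k+1} \left(y_{k+1} - a_{k+1}^{T} \hat{\theta}_k\right) \tag{12}$$

$$S_{k+1} = S_k - \frac{S_k\, a_{k+1} a_{k+1}^{T} S_k}{1 + a_{k+1}^{T} S_k\, a_{k+1}} \tag{13}$$

where $S_k$ represents the covariance matrix and $S_0 = \gamma I$; $\gamma$ should be set as a positive and large number; $I$ denotes the identity matrix; $a_k^{T}$ denotes the $k$th row of the matrix $A$; and $y_k$ is the $k$th element of $y$. Based on $\hat{\theta}$, the predicted output $\hat{y}$ can also be calculated using (14).

$$\hat{y} = A\hat{\theta} \tag{14}$$

where $\hat{y}$ is the predicted customer preference’s sentiment score.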
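A recursive least square estimation of the consequent parameters can be sketched in plain Python as follows. The update formulas are the standard recursive form commonly used with ANFIS and are assumed here; the names `rows`, `targets`, and `gamma` are illustrative, with `gamma` playing the role of the large positive number that initializes the covariance matrix.

```python
def recursive_lse(rows, targets, gamma=1e6):
    """Estimate theta in y = A theta by processing one row of A at a time."""
    n = len(rows[0])
    theta = [0.0] * n
    # Covariance matrix initialised as gamma times the identity matrix.
    S = [[gamma if r == c else 0.0 for c in range(n)] for r in range(n)]
    for a, y in zip(rows, targets):
        Sa = [sum(S[r][c] * a[c] for c in range(n)) for r in range(n)]
        denom = 1.0 + sum(a[r] * Sa[r] for r in range(n))
        # S stays symmetric, so (a^T S) equals (S a) and the outer product
        # S a a^T S can be written as Sa[r] * Sa[c].
        S = [[S[r][c] - Sa[r] * Sa[c] / denom for c in range(n)] for r in range(n)]
        gain = [sum(S[r][c] * a[c] for c in range(n)) for r in range(n)]
        error = y - sum(a[c] * theta[c] for c in range(n))
        theta = [theta[r] + gain[r] * error for r in range(n)]
    return theta
```

For instance, rows `[1, 1]`, `[1, 2]`, `[1, 3]` with targets 3, 5, 7 recover parameters close to 1 and 2, the intercept and slope of the line that generated the targets.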
3.5. The Procedure of the Proposed Approach
The following is a description of the chaos-driven ANFIS method to model customer preferences with the explainable nonlinearity derived from online comments.
Step 1. Based on the defined samples of products, the data sets for modeling are first prepared. Sentiment analysis is conducted according to Section 3.1 to extract the customer preferences and calculate their sentiment scores, which are the outputs of the data sets. The inputs, which are the settings of the relevant design characteristics, are obtained from the products’ information.
Step 2. The parameters used in the proposed approach are initialized, such as the iteration number of COA, the length of the chaos variable $c$, the number of fuzzy rules, the initial value of $c$, and the values of a and b in (5).
Step 3. The iteration of COA starts (m = 1). Based on (4), the chaos variable $c_m$ is computed. Using (5), the corresponding optimization variable $X_m$ is obtained. Based on the generated $X_m$, the items at the odd positions are substituted by the corresponding inputs, and the items at the even positions are replaced by the operations “+” or “×”.
Step 4. Using the generated polynomial structure, the models are built following (7)~(10), and the coefficients of the nonlinear fuzzy rules are trained and obtained using (11)~(13). The predicted sentiment scores of the customer preference are calculated based on (14). Then, in each iteration, the fitness value of COA, MRE, is computed using (3).
Step 5. The iteration counter is updated (m + 1 → m), and the processes are repeated from Step 3. When the iteration number reaches the predefined number, the iteration is stopped. The smallest MRE is found, and the corresponding iteration number is recorded. The best customer preference model is therefore the model generated in that iteration.
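Steps 3 to 5 can be summarized by the following search loop, given as a sketch under simplifying assumptions. The `fitness` argument is a placeholder: in the proposed approach it would decode the optimization variable, train the ANFIS model, and return the MRE, whereas here any function of the optimization variable can be supplied. All names are illustrative.

```python
def chaos_search(fitness, lower, upper, c0, iterations, mu=4.0):
    """Return (best_x, best_fitness, best_iteration) over a chaotic trajectory."""
    c = list(c0)
    best_x, best_fit, best_iter = None, float("inf"), 0
    for m in range(1, iterations + 1):
        # Logistic map: generate the next chaos variable.
        c = [mu * v * (1.0 - v) for v in c]
        # Linear mapping: transfer it to the solution space.
        x = [a + (b - a) * v for v, a, b in zip(c, lower, upper)]
        fit = fitness(x)
        if fit < best_fit:  # keep the smallest fitness and its iteration
            best_x, best_fit, best_iter = x, fit, m
    return best_x, best_fit, best_iter
```

A call such as `chaos_search(lambda x: abs(x[0] - 2.0), [1.0], [4.0], [0.3], 200)` searches the interval [1, 4] for the point closest to 2 and reports the iteration at which the best value was found, which corresponds to the record made in Step 5.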
4. Implementation
To assess the effectiveness of the chaos-driven ANFIS approach, a case study of laptop products is introduced in this research. Ten laptops from different brands are selected as samples, and their customer reviews are compiled from the Amazon website. The number of online reviews collected is 4990, 630, 590, 1100, 610, 590, 1980, 500, 412, and 370 for products 1 to 10, respectively. As a result of sentiment analysis using Semantria, four categories of customer preferences are identified, which are easy carrying, exterior design, quality, and performance. In this study, the customer preference easy carrying is selected for illustration and is denoted as y. Its sentiment score is shown in the last column of Table 2. Display size and weight are the two design characteristics related to easy carrying, which are denoted by $x_1$ and $x_2$, respectively. Their settings are found in the product descriptions and are shown in Table 2.
To model customer preferences, the proposed approach is applied to the above data sets. Each of the two inputs has three membership functions, so the total number of fuzzy rules is nine. To include all the inputs in the polynomial structure of fuzzy rules, the number of items in the optimization variable $X$ is set as 5. The initial values of the chaos variable $c$ are random values between 0 and 1. For the elements at odd positions, the value of a is 1, and the value of b is set as the total number of inputs. For the elements at even positions, the values of a and b are equal to 0 and 1, respectively. In COA, the number of iterations is set to 200. Extensive experimental tests were conducted over a range of iterations from 50 to 5000, and 200 iterations were selected because this setting balances computational time against model accuracy. Assuming that there are n data sets, the number of clusters will be ≤ $\sqrt{n}$ [35]. Therefore, the number of membership functions is set to 3 for each input in ANFIS. Based on the description in Section 3.4, the number of fuzzy rules in the proposed approach is 3 × 3 = 9. To describe the modeling results, the following training example is used. Products 1 and 2 are used as the validation data, while products 3~10 are used as the training data sets. The modeling process was implemented in MATLAB R2023a on a Lenovo-XiaoXinPro 16 ARP8 laptop with a 64-bit operating system.
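The rule count can be checked with a short sketch that enumerates every combination of membership functions. The helper `max_clusters` assumes that the bound on the number of clusters is the integer part of the square root of n, which is a common rule of thumb; both function names are illustrative.

```python
from itertools import product
from math import isqrt


def rule_antecedents(mf_counts):
    """List every combination of membership-function indices, one per rule."""
    return list(product(*(range(1, k + 1) for k in mf_counts)))


def max_clusters(n):
    """Largest integer not exceeding sqrt(n) (assumed bound on clusters)."""
    return isqrt(n)
```

Here `len(rule_antecedents([3, 3]))` is 9, matching 3 × 3 = 9, and `max_clusters(10)` is 3 for the ten products. Adding a third input with three membership functions would already give 27 rules, which shows how quickly the rule base grows with the number of inputs.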
The parameters of the three membership functions for $x_1$ are obtained as (9.6, 11.6, 13.6), (11.6, 13.6, 15.6), and (13.6, 15.6, 17.6), and those for $x_2$ are determined as (1.425, 2.64, 3.855), (2.64, 3.855, 5.07), and (3.855, 5.07, 6.285). By introducing the above antecedent parameters into Equation (7), the membership functions in ANFIS are generated. Using the data set of product 3 as an example, the membership degrees of its two inputs are evaluated with (7), one of which equals 0.8436, and the firing strengths $w_{ij}$ follow from (8). Following the same calculations, all the values of $w_{ij}$ can be obtained, and then $\bar{w}_{ij}$ can be determined using (9).
The optimal solution for the optimization variable
is obtained as follows.
Therefore, the generated nonlinear structure of the nine fuzzy rules in this example is transformed as follows.
where c is a constant in the fuzzy rule.
Based on the LSE method, the coefficients for the items in the above structure are calculated and shown in
Table 3. Using the data set of product 1 in validation 1 as an example,
if + 0.8753 = 0.4512.
By using the proposed approach, the establishment of a customer preference model for easy carrying is completed. The nonlinear fuzzy rules in
Table 3 are explainable as the equations can show the relationship between the inputs and the output.
5. Validation
To further evaluate the proposed chaos-driven ANFIS approach, five validation tests were performed on the case study of laptop products. Within each test, no data set was used for both training and validation. In validation tests 1 to 5, the validation data sets were products 1 and 2, products 3 and 4, products 5 and 6, products 7 and 8, and products 9 and 10, respectively, and the training data sets were the remaining 8 products. In each validation, the proposed approach is compared with five other approaches, which are fuzzy regression (FR), fuzzy least-squares regression (FLSR), ANFIS, ANFIS with subtractive cluster (ANFIS with SC), and ANFIS with K-means. In ANFIS with SC and ANFIS with K-means, ANFIS is combined with subtractive clustering and K-means, respectively, to identify the membership functions of the inputs. All six approaches were tested using the same data sets. To compare the validation results of the six approaches, the values of MRE in (3) and the variance of errors (VoE) in (17) were applied.
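As an illustration of the second measure, the sketch below computes VoE as the sample variance of the individual relative errors around their mean. This form is an assumption of the sketch, and the definition in (17) takes precedence wherever the two differ.

```python
def relative_errors(actual, predicted):
    """Absolute relative error of each prediction."""
    return [abs((y - y_hat) / y) for y, y_hat in zip(actual, predicted)]


def variance_of_errors(actual, predicted):
    """Sample variance (n - 1 denominator) of the relative errors."""
    errors = relative_errors(actual, predicted)
    if len(errors) < 2:
        raise ValueError("at least two data sets are required")
    mre = sum(errors) / len(errors)
    return sum((e - mre) ** 2 for e in errors) / (len(errors) - 1)
```

A small VoE indicates that the relative errors of the validation data sets are close to one another, so it complements the MRE, which only reports their average size.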
Further information about the parameter settings of the chaos-driven ANFIS approach can be found in Section 4. The MRE and VoE values of the six approaches in the five validations are compared in Table 4. Table 4 shows that the proposed approach yields the smallest values of MRE and VoE among the six approaches.
Based on all the validation errors from the five tests, the 95% confidence intervals of the six approaches were calculated and are compared in Table 5. The comparison shows that the proposed approach has both the narrowest interval and the lowest center point, which indicates that it yields more robust and stable models than the other five approaches.
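One way to obtain such an interval is sketched below. It assumes a two-sided interval for the mean validation error based on Student's t distribution, with ten errors per approach (two validation products in each of five tests), so that the critical value for 9 degrees of freedom, 2.262, applies. The calculation actually used for Table 5 may differ.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% critical value of Student's t with 9 degrees of freedom.
T_975_DF9 = 2.262


def confidence_interval_95(errors, t_critical=T_975_DF9):
    """Return (lower, upper) bounds of the interval around the mean error."""
    center = mean(errors)
    half_width = t_critical * stdev(errors) / sqrt(len(errors))
    return center - half_width, center + half_width
```

The center of the interval is the mean error and its width reflects the spread of the errors, so a narrow interval with a low center corresponds to a model that is both accurate and stable. A different number of errors requires a different critical value.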
The explainable nonlinear fuzzy rules generated for validation 1 are given in Table 3, and Table 6 shows the nonlinear fuzzy rules with different polynomial structures for validations 2~5.
Using validation 5 as a reference, the customer preference models developed based on FR and FLSR are expressed in (18) and (19), respectively. Nine examples of fuzzy rules derived from ANFIS, ANFIS with SC, and ANFIS with K-means are shown in Table 7.
Based on the above six methods, the customer preference models for easy carrying are developed. The models based on FR and FLSR are transparent linear models with fuzzy coefficients, but they lack nonlinear capability. ANFIS, ANFIS with SC, ANFIS with K-means, and chaos-driven ANFIS can deal with both the fuzziness and the nonlinearity of the models. However, Table 7 shows that ANFIS, ANFIS with SC, and ANFIS with K-means cannot solve the black box problem of ANFIS: the fuzzy rules they generate, which are the only transparent information from the modeling process, are linear models, and the nonlinearity is obscured. By comparison, the fuzzy rules derived from the chaos-driven ANFIS approach, as shown in Table 6, are expressed as nonlinear equations, so the nonlinearity in the relationship between customer preferences and design characteristics can be interpreted directly and explicitly. Therefore, among the six approaches, only the proposed approach mitigates the black-box problem of ANFIS by explicitly revealing the model construction and the processes involved in nonlinear modeling.
6. Conclusions
In modeling customer preferences based on customer comments, ANFIS is an effective approach to associate customer preferences with design characteristics, as it can address the fuzziness and nonlinearity of the modeling. ANFIS, however, has black box issues. Its internal decision-making process is buried within complex, interconnected layers of nodes and membership functions that are difficult for a human to decipher. To solve this problem, a chaos-driven ANFIS approach is introduced to develop models whose nonlinearity can be explained transparently. To illustrate the modeling process of the proposed approach, a case study of laptop products is introduced, and the modeling of the customer preference easy carrying is selected as an example. Five validation tests were performed to further evaluate the chaos-driven ANFIS approach, along with comparisons to FR, FLSR, ANFIS, ANFIS with SC, and ANFIS with K-means. The comparison results show that the proposed approach has smaller mean relative errors, variances of errors, and confidence intervals of validation errors than the other five approaches. While all the models established by the ANFIS-based approaches capture fuzziness and nonlinearity, only the proposed approach addresses the inherent black-box issue of ANFIS. It does so by structuring the “THEN” part of its fuzzy rules as interpretable polynomials. Each rule represents a clear, nonlinear relationship that is valid within a specific fuzzy-defined region of the input space. The entire model can therefore be broken down into a set of human-readable equations, each of which governs the system’s behavior under certain conditions. This allows the model’s nonlinearity to be addressed explicitly and its logic to be interpreted directly.
By mapping sentiment-based customer preferences directly to design characteristics, the transparent ANFIS model provides product designers with a useful decision-support tool. The designers move beyond knowing what is happening to understand precisely why it is happening. In product strategy and development, designers are no longer left guessing which feature to improve. Instead, they can strategically reallocate resources. They can proactively adjust the settings of key design characteristics, such as battery life, weight, or screen size, to refine existing products or blueprint entirely new ones. This data-driven alignment ensures that resources are invested in features that directly drive customer satisfaction, dramatically increasing the likelihood of market success. Also, this transparent ANFIS model provides product designers with a critical bridge between powerful AI capability and human-centric design. It converts abstract data into tangible and explainable logic and can empower designers to create products aligned with the changing needs of customers.
The limitation of the proposed chaos-driven ANFIS approach is the complexity of the ANFIS structure, which increases with the number of input variables of a given data set. As inputs are added, the number of nodes in layers 2~4 of ANFIS increases exponentially, leading to a massive increase in the number of parameters that must be tuned during training. Consequently, the computational time required for modeling becomes longer, and the training may even fail to converge to a solution. To mitigate this limitation and enhance the model’s practicality, future research will address the reduction of the input dimensionality in ANFIS. Optimal feature selection, or the identification of nonlinear combinations of the variables, can be carried out by introducing advanced optimization algorithms, such as evolutionary or swarm intelligence techniques. Furthermore, future work will include higher-degree polynomial terms within the fuzzy rules to investigate their influence on the performance of the models. Adaptive polynomials with variable degrees can also be developed in the models using advanced artificial intelligence techniques. In addition, the dynamic trends in customer preferences will be captured by modeling time series of customer preferences using customer reviews.