1. Introduction
Electronic word-of-mouth (eWOM) serves as a channel enabling individuals to instantly access information regarding the consumption experiences of products within consumer groups. Across various platforms, eWOM manifests in multiple forms, including star ratings, textual reviews, quality complaint texts, and feedback information, either individually or simultaneously. Through advanced data analysis tools, enterprises can extract insights into consumer behavior preferences, monitor market dynamics and evolving trends, and based on these findings, develop forward-looking marketing and management strategies to enhance market competitiveness and achieve sustainable development. Consumers, in turn, can leverage this multi-dimensional and authentic product information to assist them in making informed purchasing decisions. The core objective of automobile product ranking decision research driven by eWOM is to construct a group intelligence aggregation model through the integration of large-scale eWOM data, thereby providing personalized recommendation services for consumers. Initially, the primary focus of information aggregation was on user-provided star ratings, combined with statistical methods to generate product ranking results. However, existing research predominantly relies on eWOM data from a single platform, with relatively few studies adopting a cross-platform perspective. Moreover, the influence of quality-related eWOM, such as complaint texts and safety test information, on purchasing decisions is no less significant than that of satisfaction-related eWOM, such as ratings and reviews [
1]. For instance, one study incorporates quality-related eWOM, such as the number of faults per 100 vehicles, into the product ranking decision-making process alongside satisfaction-related eWOM and reports superior outcomes [
2]. Existing research remains limited to mining crowd intelligence knowledge from single-dimensional data sources, with few studies focusing on extracting and aggregating crowd intelligence knowledge for product ranking decisions from multi-source information, including rating eWOM, quality complaint eWOM, and safety test data [
3]. The central issue addressed in this study is how to mine valuable knowledge from multi-source and multi-dimensional eWOM data and construct a comprehensive research framework for product ranking decision-making.
In large-scale data-driven product ranking research, the quality of eWOM data plays a critical role in determining the reliability of decision-making outcomes. Due to the low posting threshold for users and the high complexity of platform reviews, the credibility of eWOM disclosed by online service platforms varies significantly. The accumulation of a substantial amount of low-credibility eWOM has exacerbated information quality issues, severely undermining the quality of data-driven decision-making. For instance, nearly 90% of Uber reviews may be rated as 5 stars, which highlights the uneven quality of eWOM data. This inconsistency leads to reduced credibility in product ranking results, making it challenging to provide effective guidance for consumers [
4]. Therefore, integrating information quality considerations into large-scale eWOM data-driven product ranking is essential for enhancing decision-making outcomes. Notably, the basic uncertain information (BUI) model, capable of simultaneously characterizing evaluation information and its credibility [5,6], offers a pivotal tool for addressing these challenges. In traditional multi-criteria decision-making (MCDM) processes, fuzzy set decision theory describes the fuzziness of object attributes through membership functions [
7,
8], while probability set decision theory relies on probability distributions to depict the objective occurrence frequency of random events [
9,
10]. In contrast, the latest BUI theory focuses on modeling information credibility, providing a novel research perspective for solving information quality problems in big data-driven MCDM. Currently, research on BUI primarily centers on aggregation operators and extended sets, leading to the development of a series of MCDM theories. Regarding aggregation operators, advancements such as the BUI ordered weighted average [
11], BUI ordered weighted geometric average [
12], and BUI generalized aggregation operator [
13] have enriched the decision-making theory system based on BUI. In terms of theoretical expansion, scholars have integrated BUI with (interval) bipolar semantic sets [
14,
15], rough sets [
16], and other frameworks, defining a range of BUI extension sets. A series of studies [
16,
17,
18,
19] have demonstrated that BUI theory holds significant advantages in modeling information quality within MCDM contexts, offering a robust theoretical foundation for product ranking research based on massive eWOM data in this study.
Incorporating an adjustment mechanism for users’ personalized preferences into product ranking and recommendation systems is equally important. As the diversified characteristics and personalized demands of customers become increasingly prominent [
20], consumer preferences in cities at different stages of development have gradually exhibited significant heterogeneous features [
21]. Some recommendation models adopt a fixed parameter system and lack sufficient flexibility for adjustments tailored to individual users. This falls short of meeting the requirements of the “user autonomous choice mechanism” emphasized in certain regulations (e.g., China’s “Regulations on the Administration of Algorithmic Recommendation for Internet Information Services”). Fortunately, some studies have developed product ranking methods grounded in users’ personalized preferences, achieving commendable results [
22,
23]. Consequently, research on personalized product ranking methods based on multi-platform information aggregation holds substantial practical significance.
In summary, mining valuable preference knowledge from the wide variety of consumer preferences embedded in multi-source eWOM, and building product ranking and recommendation algorithms on that knowledge, is of clear practical value. However, most existing studies focus on single-dimensional eWOM data sources, with relatively limited exploration of integrated recommendations derived from multi-platform and multi-dimensional eWOM data. The information quality of eWOM substantially affects the reliability of product ranking outcomes. Moreover, incorporating users’ personalized preference adjustment mechanisms into product ranking and recommendation systems aligns with societal development needs. Therefore, this study begins by analyzing multi-source eWOM information, including rating eWOM, user complaint eWOM, and safety test data. It introduces the BUI model as an information representation tool, integrates a personalized adjustment mechanism for users, and studies multi-source data-driven personalized product ranking and recommendation. Specifically, the study will (1) construct MCDM evaluation indicators for the three data dimensions; (2) transform the data of the three dimensions into BUI form through evaluation information mining and credibility conversion; (3) perform information aggregation using the ordered weighted averaging operator for basic uncertain information; and (4) propose a product ranking algorithm based on user personalization. This research can extract useful knowledge from multi-source eWOM data, assisting consumers in making informed purchasing decisions while also guiding the formulation of enterprise management strategies.
2. Literature Review
Compared with ordinary daily necessities, ranking and recommending automotive products involves greater complexity. First, given the high price of automobiles, users tend to be more cautious during the purchasing process. Second, user purchase requirements are multifaceted, and the overall decision-making process is relatively long. Research on automotive product ranking decisions can be divided into two main streams: rating-driven ranking decisions and text-review-driven ranking decisions.
Research on ranking decision based on scoring. The distributed linguistic term set continues to serve as the core representation tool for converting score information, with PROMETHEE-II and TODIM being extended to the linguistic term set environment to propose a product ranking method [
24,
25]. To address the issue of data sparsity, scholars employed the grey correlation prediction method to forecast missing score data and introduced the cloud model to construct an automobile product recommendation algorithm [
26]. A multi-level classification mechanism was designed and a group score aggregation method was proposed; the IFINWIBM operator was defined, and the degree of interaction between attributes was described through an operator parameter learning mechanism to reduce the subjectivity of manually set parameters. An online multi-dimensional rating aggregation decision model was developed to address product ranking challenges [27]. With a focus on reliability, research has been conducted on rating aggregation methods for cross-platform distributions, proposing a user weighting model based on feature information and a TOPSIS (technique for order preference by similarity to ideal solution) model using a BULI distance measure to enhance product evaluation aggregation methodologies [
19]. A novel dynamic three-way decision model was investigated, and an integrated framework for MCDM based on BULI was developed utilizing rough sets from decision theory [
16]. Criteria weights for electric vehicles were determined using Shannon entropy, and electric vehicle rankings were established using TOPSIS [
28].
Research on ranking decision-making based on text reviews. Existing research primarily employs sentiment analysis tools to extract sentiment tendencies from online text reviews and develops product ranking methods grounded in theories such as fuzzy sets and distributed linguistic term sets. To enhance the accuracy of sentiment intensity recognition, scholars have proposed an ideal-solution-based ranking decision method by incorporating interval type-2 fuzzy sets [
29]. To address the issue of single-feature information representation, BERT and q-rung orthopair fuzzy set theory are employed to construct a product ranking framework for online reviews concerning feature quality allocation [
30]. In another study, sentiment analysis is used to output sentiment intensity levels and their frequencies, which are subsequently converted into trust values for the different sentiment intensity levels; a product ranking decision-making method based on sentiment analysis and evidence theory is then constructed [31]. Based on sentiment analysis, the sentiment orientation of each review toward each alternative under each attribute is identified, and an information conversion mechanism is defined to transform the unstructured data into picture fuzzy numbers. Attribute weights are then determined using a picture fuzzy entropy measure [32]. Sentiment analysis results are transformed into hesitant intuitionistic fuzzy elements (HIFE), and ranking outcomes are obtained through the extended ORESTE method based on the hesitant intuitionistic fuzzy Chebyshev distance [
33]. Considering the advantages of probabilistic linguistic term sets in representing emotional tendencies and their distribution forms, scholars use sentiment analysis technology to output five sentiment levels and integrate TODIM and evidence theory to construct relevant product ranking decision-making methods [
34]. Multi-attribute decision-making for product ranking is combined with q-rung orthopair fuzzy set theory and prospect theory [
35].
The aforementioned research status reveals that the current studies in the field of automotive product ranking are primarily conducted separately based on rating eWOM data and text eWOM data. Specifically, some studies have concentrated on rating eWOM data to evaluate the quality of automotive products by analyzing consumer ratings for these products. Another segment of research leverages text eWOM data to mine consumer attitudes and evaluation information embedded within the text for achieving product ranking. However, few studies have effectively integrated the knowledge derived from both rating eWOM data and text eWOM data, thereby limiting the comprehensiveness and accuracy of automotive product ranking research to a certain extent. Furthermore, in rating-driven car product ranking research, some scholars have introduced the BUI theory to address information quality issues. This theory can efficiently assess and handle low-credibility scoring data, thus enhancing the reliability of ranking outcomes. Conversely, in text-data-driven product ranking research, relatively limited attention has been paid to information quality. Given the unstructured nature and complex semantics of text data, information quality challenges can significantly impact the accuracy of sorting results. Therefore, introducing effective information quality assessment methods in this domain holds substantial research significance. Meanwhile, automotive safety test data play a pivotal role in automotive product recommendations. Safety performance is one of the key factors consumers consider when purchasing a vehicle. Automotive safety test data can objectively reflect the safety levels of automotive products and provide critical reference points for consumers. For instance, with the increasing prevalence of electric vehicles, battery-related safety incidents have become more frequent. 
Scholars have analyzed Chinese safety standards from multiple dimensions, including battery materials, cells, modules, and battery systems, offering valuable suggestions for establishing improved international battery safety standards [
36].
To summarize, this study aims to integrate the knowledge of consumer behavior preferences embedded in rating eWOM, complaint text eWOM, and safety test data. Specifically, the credibility of these three types of data will be assessed individually, and the BUI model will be constructed based on the assessment outcomes. Through this model, the characteristics and information quality of the different data sources can be considered together. Subsequently, personalized product ranking and recommendation algorithms will be developed to enhance the accuracy and effectiveness of automotive product ranking and recommendation, thereby providing stronger support for the development of the automotive industry and aiding consumers’ car purchase decisions.
3. Model Construction
3.1. Construction of Evaluation System
The problem of automotive product ranking based on online eWOM data can be formulated as follows: given large-scale, multi-dimensional eWOM data, a ranking and recommendation of the set of alternative automotive products is achieved through an aggregation decision evaluation method. A practical challenge in automotive product ranking research lies in the high similarity of eWOM scores, which leads to low differentiation and makes it difficult to intuitively distinguish between automotive products. To address this issue, this paper first constructs a comprehensive evaluation system from three key dimensions: rating word-of-mouth, complaint word-of-mouth, and safety test data.
3.1.1. Evaluation Indicators and Set Definition Based on Rating Word-of-Mouth
The set
of online service platforms that provides the rating-based eWOM data and its corresponding weight set
are defined as follows:
There are seven evaluation indicators derived from the rating eWOM data, namely, space, configuration, cost-performance, interior, appearance, fuel consumption, and driving experience, as shown in
Figure 1. The rating indicator set
and its corresponding weight set
are formally defined as follows:
The user set
and its corresponding weight set
within the online service platform are formally defined as follows:
Then, the user’s rating matrix is
where
represents the score assigned by user
on platform
for attribute
of product
.
3.1.2. Evaluation Indicators and Set Definition Based on Complaint Word-of-Mouth
The set
of online service platforms providing user complaint eWOM data, along with its corresponding weight set
, is formally defined as follows:
There are eight evaluation indicators for customer complaints, namely, engine, transmission, steering, braking, tires, front and rear axle and suspension systems, body accessories, and electrical and service issues, as shown in
Figure 2. The complaint indicator set
and its corresponding weight set
are formally defined as follows:
The set
of users who provide complaint content on the service platform, along with its corresponding weight set
, is formally defined as follows:
Then, the user’s complaint risk level matrix is
where
represents the risk level of user
on platform
for product
with respect to attribute
. It is important to note that a higher grade value corresponds to a lower risk level; specifically, a grade value of 1 indicates the highest risk, while a grade value of 5 indicates the lowest risk.
Note that since automatic transmission cars do not encounter clutch-related issues and the majority of consumers currently purchase automatic transmission vehicles, the problem of clutch failure will not be taken into consideration in this study.
Section 3.2 presents a method leveraging natural language processing techniques to transform the original complaint text information into a quantifiable user complaint risk level
.
3.1.3. Evaluation Indicators and Set Definition Based on Safety Test Data
The set
of online service platforms providing safety assessment data, along with its corresponding weight set
, is formally defined as follows:
There are three evaluation indicators derived from the safety test data, namely, occupant protection, vulnerable road user (VRU) protection, and active safety, as shown in
Figure 3. The safety test indicator set
and its corresponding weight set
are formally defined as follows:
Then, the safety rating matrix
is
where
represents the safety level assigned by platform
for product
with respect to attribute
, and a higher safety level value indicates greater safety.
3.2. Information Acquisition
3.2.1. The Transformation of Complaint Text Information
As shown in Figure 1, Figure 2 and Figure 3, the rating eWOM and safety test data readily yield quantifiable rating data. In contrast, complaint eWOM is predominantly expressed in textual form and lacks direct rating data. Consequently, a method is required to transform these textual descriptions into corresponding rating data for subsequent computational analysis. This study employs a pre-trained deep-learning language model to address this issue.
XLNet is a natural language processing (NLP) model introduced by researchers at Carnegie Mellon University and Google that employs a generalized autoregressive pre-training method. The literature [
37] presents a comprehensive comparison of XLNet with models such as BERT and RoBERTa. Under identical data and hyperparameter configurations, XLNet demonstrates significantly superior performance compared to BERT in multiple natural language understanding tasks. When the experiments were extended to larger-scale datasets and more optimized hyperparameter settings were applied, XLNet exhibited outstanding performance on benchmark test sets, including SQuAD, RACE, and GLUE. In some tasks, it even outperformed RoBERTa, thereby showcasing its robust capability in handling complex natural language processing tasks. This study employs XLNet as the core model for complaint text risk scoring, primarily due to its several key characteristics that align well with the task requirements. Complaint texts are generally lengthy and contain dispersed information. XLNet, through its utilization of the long-range dependency mechanism in Transformer-XL and relative position encoding, effectively overcomes the text length limitations inherent in traditional models, enabling it to fully preserve risk signals across paragraphs. Furthermore, XLNet integrates the bidirectional global modeling capability of permutation language models, which resolves the pre-training and fine-tuning bias issues encountered by masked language models such as BERT. Consequently, XLNet is capable of more precisely capturing the intricate.
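An autoregressive (AR) language model factorizes the likelihood of a sequence into a product of conditional probabilities, with each token conditioned on the tokens that precede it. Following [37], given a text sequence $\mathbf{x} = (x_1, \ldots, x_T)$, AR language modeling is trained by maximizing the likelihood under the forward autoregressive decomposition:
$$\max_{\theta} \; \log p_{\theta}(\mathbf{x}) = \sum_{t=1}^{T} \log p_{\theta}(x_t \mid \mathbf{x}_{<t}) = \sum_{t=1}^{T} \log \frac{\exp\left(h_{\theta}(\mathbf{x}_{1:t-1})^{\top} e(x_t)\right)}{\sum_{x'} \exp\left(h_{\theta}(\mathbf{x}_{1:t-1})^{\top} e(x')\right)}$$
where $h_{\theta}(\mathbf{x}_{1:t-1})$ denotes the context representation generated by the neural model and $e(x)$ represents the embedding of $x$. Since $h_{\theta}(\mathbf{x}_{1:t-1})$ is conditioned only on the tokens up to position $t-1$ (i.e., the tokens to the left), an AR language model captures only unidirectional context.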
In this study, we engaged experts to annotate the risk levels of complaint texts and constructed a labeled dataset for subsequent model training. During the model training phase, a dataset comprising nearly ten thousand entries was utilized. To ensure the accuracy and validity of model learning, car names were anonymized in the data preprocessing stage to prevent potential biases in model learning that might arise from similarities in vehicle model names. Furthermore, during the training process, the F1 score was selected as the evaluation metric for assessing model performance. The F1 score provides a comprehensive consideration of precision and recall, enabling an objective reflection of the model’s performance in classification tasks and facilitating a more accurate evaluation of its strengths and weaknesses. The specific procedures are outlined as follows:
Step 1. Obtain the unlabeled set
of the complaint indicator set
:
Step 2. Obtain a manually annotated risk level matrix.
Step 2.1. Expert group
conducted manual labeling to derive the risk grade matrix:
Step 2.2. Develop a comprehensive, manually labeled risk level matrix:
where
.
Step 3. The consolidated manually labeled risk level matrix is divided into a training set and a test set in a 7:3 ratio.
Step 4. The model is trained, and its accuracy and precision are improved through parameter fine-tuning.
Step 5. After the model training process has been completed, the resulting trained model is utilized to generate the quantitative matrix:
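The data-handling parts of Steps 3 and 4 can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names `split_train_test` and `macro_f1` are hypothetical, the fixed random seed is an assumption, and macro averaging over the risk grades is assumed because the text specifies only "the F1 score".

```python
import random


def split_train_test(samples, train_ratio=0.7, seed=0):
    """Shuffle a copy of the labeled samples and split them 7:3 (assumed seed)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the risk grades that occur.

    Macro averaging is an assumption; the source does not state the variant.
    """
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

In practice `y_true` would hold the expert-annotated risk grades of the test set and `y_pred` the grades predicted by the fine-tuned model.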
3.2.2. Information Reliability Calculation
The preceding steps yield score data for rating eWOM, complaint eWOM, and safety tests. This subsection describes how the corresponding credibility information is obtained.
- (a) Obtain platform credibility information.
Step 1. Develop a comprehensive platform credibility assessment system. Fifteen indicators are developed to assess the credibility of the platforms, including overall enterprise capability, honor index, Baidu index, negative news index, user approval rate, comprehensiveness of annotation rules, richness of content presentation, overall stability of reviews, intuitiveness of navigation links, efficiency of the retrieval system, aesthetics of page layout, coherence of overall style, effectiveness of anti-cheating algorithms, data acquisition efficiency, and code transparency. The set
of platform credibility assessment indicators and their corresponding weight set
are presented, respectively.
Step 2. Obtain platform credibility information.
Step 2.1. Obtain a trustworthiness matrix for each platform.
Step 2.2. Assess the reliability of various platforms.
- (b) Obtain indicator credibility information.
Step 1. Reliability assessment of rating evaluation indicators.
where
represents the reliability of product
on rating platform
for attribute
. The credibility of the rating evaluation index is determined by integrating the fluctuation degree of user scores under this attribute with the credibility of the platform
.
Step 2. Reliability assessment of complaint evaluation indicators.
where
represents the reliability of product
on complaint platform
for attribute
, and
denotes the progress index of risk information in complaint texts for category
, serving as a grading reference.
Step 3. Reliability assessment of safety testing evaluation indicators.
where
represents the reliability of product
on safety testing platform
for attribute
,
denotes the evaluation time of platform
for automotive product
, and
represents the evaluation depth index of the platform. The evaluation depth index reflects the number of secondary indexes assessed, where a higher number indicates a more comprehensive evaluation. This serves as a grading criterion for performance assessment.
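To make the idea in (a) and in Step 1 of (b) concrete, the sketch below shows one possible way to turn indicator scores into a platform credibility and then discount it by the dispersion of user scores. Both formulas are assumptions for illustration only: the weighted mean in `platform_credibility` and the "one minus normalized standard deviation" discount in `rating_indicator_credibility` are not taken from the paper, and all names and the 1 to 5 rating scale are hypothetical.

```python
from statistics import pstdev


def platform_credibility(indicator_scores, indicator_weights):
    """Assumed form: weighted mean of indicator scores normalized to [0, 1]."""
    total = sum(indicator_weights)
    return sum(s * w for s, w in zip(indicator_scores, indicator_weights)) / total


def rating_indicator_credibility(user_scores, platform_cred, scale=(1, 5)):
    """Assumed form: platform credibility discounted by score fluctuation.

    Fluctuation is the population standard deviation of the user scores,
    divided by its largest possible value on the rating scale.
    """
    lo, hi = scale
    max_std = (hi - lo) / 2  # reached when half the scores sit at each end
    fluctuation = pstdev(user_scores) / max_std if len(user_scores) > 1 else 0.0
    return platform_cred * (1.0 - fluctuation)
```

Under these assumptions, unanimous user scores leave the platform credibility unchanged, while maximally polarized scores reduce the indicator credibility to zero.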
3.3. Information Aggregation
3.3.1. Fundamental Concepts
To begin, this section introduces several fundamental concepts pertinent to this study.
- (a) The basic uncertain information (BUI) [5,6] particle is represented as a pair of values
, where the first value
corresponds to the input value (typically the evaluation score), and the second value
represents the degree of certainty associated with the score, and
represents the degree of uncertainty related to the score. The set of all BUI particles is denoted by
. The BUI aggregate function
for the input of an n-dimensional variable is formally defined as
, where
represents the value mapping function and
denotes the confidence mapping function.
- (b) BUIOWA aggregation function.
The BUIOWA aggregation operator was first introduced in reference [
11], which was inspired by the concept of the Choquet integral and processes BUI input via problem decomposition and integration techniques. Its design principle and underlying concept involve partitioning the BUI input into two components—certain
and uncertain
—based on the degree of certainty. The traditional OWA operator and mean operator are then, respectively, employed for aggregation, while the integral approach is incorporated to facilitate continuous aggregation. This process enables effective processing and precise aggregation of the BUI input. For further details regarding the specific design process of this operator, please refer to reference [
11].
This paper first presents the formal definition of an uncertain ordered weighted averaging (UOWA) operator, assuming constant input variables.
Definition 1 ([11]). For any input function and a subset on which the input function values are certain (with its complement , on which the input function values are uncertain), a UOWA operator for with a family of OWA weight vectors , having orness , is a mapping such that
where the weight vectors are calculated by orness, and the convention and holds true. The attached aggregation certainty (aggregation uncertainty ) is defined by
Next, in accordance with the principle and concept of UOWA, the definition of the basic uncertain information ordered weighted averaging (BUIOWA) aggregation operator is formally introduced.
Definition 2 ([11]). A BUIOWA operator for with a family of OWA weight vectors, having orness, , is a mapping such that
where .
The specific calculation process of the BUIOWA operator is illustrated through a worked example in Appendix A.
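As a small illustration of the BUI pair described in (a), the following sketch stores an evaluation value together with its certainty degree and exposes the uncertainty as one minus the certainty. The class name `BUI` and the helper `bui_weighted_mean` are hypothetical, and the helper is a plain componentwise weighted mean shown only to make the "value mapping" and "confidence mapping" idea tangible; it is not the BUIOWA operator of [11].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BUI:
    value: float      # evaluation score
    certainty: float  # degree of certainty, assumed to lie in [0, 1]

    def __post_init__(self):
        if not 0.0 <= self.certainty <= 1.0:
            raise ValueError("certainty must lie in [0, 1]")

    @property
    def uncertainty(self):
        """Degree of uncertainty, the complement of the certainty degree."""
        return 1.0 - self.certainty


def bui_weighted_mean(items, weights):
    """Componentwise weighted mean of values and of certainty degrees."""
    total = sum(weights)
    value = sum(b.value * w for b, w in zip(items, weights)) / total
    certainty = sum(b.certainty * w for b, w in zip(items, weights)) / total
    # Clamp to guard against floating-point drift just outside [0, 1].
    return BUI(value, min(1.0, max(0.0, certainty)))
```

Keeping the two components separate is what allows a later operator to treat high-certainty and low-certainty inputs differently.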
In the process of aggregating three types of eWOM data to derive the respective comprehensive BUI scores, it is essential to consider the aggregation across three dimensions: users, metrics, and platforms. For the user dimension, this study employs an equal-weighted averaging method. This approach is straightforward and intuitive, ensuring that each user input is treated equally. For indicators and platform aggregation, the BUIOWA operator is utilized. Its weight assignment mechanism is intricately tied to its unique design principle, comprehensively reflecting the weight design concepts from both data-driven and preference-setting perspectives. On one hand, the BUIOWA operator achieves dynamic weight allocation by introducing a threshold-based partitioning method and employing differentiated calculation approaches for distinct deterministic BUI particles. This process relies entirely on the inherent credibility of the data, effectively mitigating the potential influence of low-credibility information on the aggregation outcomes, thereby demonstrating the scientific rigor of data-driven weight allocation. On the other hand, the BUIOWA operator provides decision-makers with the flexibility to customize aggregated weights by adjusting the orness value. The orness value, a critical parameter in the OWA operator, reflects the decision-maker’s preference for optimism or pessimism. By modulating the orness value, decision-makers can flexibly control the aggregation results’ sensitivity to larger or smaller input values. A higher orness value signifies a more optimistic inclination, tilting the aggregation results toward larger input values, whereas a lower orness value indicates a relatively more pessimistic stance, causing the aggregation results to lean toward smaller input values.
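The role of the orness value can be illustrated with the classical OWA operator. The sketch below uses the standard orness measure, orness(w) = (1/(n-1)) Σ (n-i) w_i with weights applied to inputs sorted in descending order. It does not reproduce how the BUIOWA operator derives its weight vectors from a given orness, and the example weight vectors are arbitrary assumptions.

```python
def owa(values, weights):
    """Classical OWA: weights are applied to the inputs sorted in descending order."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def orness(weights):
    """Standard orness measure of an OWA weight vector (requires n >= 2)."""
    n = len(weights)
    if n < 2:
        raise ValueError("orness is defined for at least two weights")
    return sum((n - i) * w for i, w in enumerate(weights, start=1)) / (n - 1)


scores = [3.0, 5.0, 4.0]
optimistic = [0.6, 0.3, 0.1]   # assumed weights, more mass on larger inputs
pessimistic = [0.1, 0.3, 0.6]  # assumed weights, more mass on smaller inputs
```

With these inputs, the optimistic vector has the higher orness and yields the larger aggregate, which matches the qualitative behavior described above.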
3.3.2. Aggregation of Rating Word-of-Mouth
This section presents a method for aggregating rating eWOM information: the ratings given by all users to multiple attributes of a product on each platform are aggregated into a comprehensive BUI evaluation for that platform. The data input of the rating eWOM aggregation algorithm is .
Step 1. The multi-attribute rating matrix
for product
on platform
is derived by aggregating the group scores in a systematic manner.
Step 2. Aggregate multi-attribute ratings.
Step 2.1. Based on the rating reliability
obtained through the rating data reliability calculation method presented in
Section 3.2, the multi-attribute rating matrix
is transformed into the BUI matrix:
Step 2.2. Calculate the BUI value of the product on each platform using the BUIOWA operator:
Step 3. Aggregate the BUI values from the multiple rating platforms to obtain the comprehensive BUI value of the product.
Finally, the comprehensive BUI results of the rating platforms are output .
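Step 1 and Step 2.1 above can be sketched as follows, assuming the equal-weighted user averaging stated in Section 3.3.1. The function names, the nested-list layout `user_ratings[u][j]`, and the representation of a BUI as a `(value, certainty)` tuple are illustrative assumptions; the BUIOWA aggregation of Step 2.2 and Step 3 is not reproduced here.

```python
def mean_attribute_ratings(user_ratings):
    """Equal-weighted mean over users.

    user_ratings[u][j] is the score given by user u to attribute j.
    """
    n_users = len(user_ratings)
    n_attrs = len(user_ratings[0])
    return [sum(row[j] for row in user_ratings) / n_users for j in range(n_attrs)]


def to_bui_row(mean_ratings, credibilities):
    """Pair each attribute's mean score with its credibility as (value, certainty)."""
    if len(mean_ratings) != len(credibilities):
        raise ValueError("one credibility is needed per attribute")
    return list(zip(mean_ratings, credibilities))
```

For example, two users scoring two attributes as `[[4, 5], [2, 3]]` give attribute means `[3.0, 4.0]`, which are then paired with the attribute credibilities obtained in Section 3.2.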
3.3.3. Aggregation of Complaint Word-of-Mouth
This section presents a complaint information aggregation method utilizing the BUIOWA aggregation operator, which can consolidate the complaint rating data obtained from the method described in
Section 3.2 into comprehensive BUI complaint rating data. The data input of the complaint eWOM aggregation algorithm is
Step 1. The multi-attribute complaint rating matrix
on complaint platform
is derived by aggregating the group scores in a systematic manner.
Step 2. Multi-attribute aggregation.
Step 2.1. Based on the complaint rating reliability
obtained through the complaint data reliability calculation method presented in
Section 3.2, the multi-attribute complaint rating matrix
is transformed into the BUI matrix:
Step 2.2. Calculate the BUI value of the product on each complaint platform using the BUIOWA operator.
Step 3. Aggregate the BUI values from the multiple complaint platforms to obtain the comprehensive BUI value of the product.
Finally, the comprehensive BUI results of the complaint platforms are output .
3.3.4. Aggregation of Safety Testing Word-of-Mouth
This section presents a method for aggregating safety testing eWOM information, which enables us to aggregate the ratings of all safety testing platforms on multiple attributes of a product into comprehensive BUI evaluation data. The data input of the safety testing eWOM aggregation algorithm is .
Step 1. Multi-attribute aggregation.
Step 1.1. Based on the safety testing rating reliability
obtained through the safety testing data reliability calculation method presented in
Section 3.2, the multi-attribute safety testing rating matrix
is transformed into the BUI matrix:
Step 1.2. The BUI value of the product on each safety testing platform was obtained using the BUIOWA operator.
Step 2. The BUI values from multiple safety testing platforms are aggregated to obtain the comprehensive BUI value of the product.
Finally, the comprehensive BUI results of each safety testing platform are output.
3.3.5. The Aggregation of eWOM Information on Multiple Platforms
This section presents a method for aggregating comprehensive BUI-type eWOM from the three types of online service platforms into comprehensive BUI information that holistically represents product evaluation. The algorithm’s input consists of the comprehensive BUI results from the three platform types.
The BUI values from multiple platform types were aggregated to obtain the comprehensive BUI value.
Finally, the comprehensive BUI results of multiple platforms are output.
3.4. Product Ranking Based on User Preferences
Based on the comprehensive BUI results from the three types of platforms and the aggregated BUI outcomes across multiple platforms, this section develops a personalized recommendation algorithm for automotive products.
Step 1. For each alternative vehicle, construct a four-element BUI group and its corresponding ranking percentage vector.
Step 1.1. Construct a four-element BUI group for each alternative car.
Step 1.2. Based on the BUI sorting method, the sorting percentage vector for the four-element BUI group was computed.
where
, respectively, represents the ranking percentage of product
on the rating platform, complaint platform, safety test platform, and multi-platform combination.
Step 2. Determine the initial recommendation results based on the user’s expected ranking percentages.
Step 2.1. Input the user’s expected ranking percentages for the four-element BUI group:
where
, respectively, denote the user’s expected ranking percentages on the rating platform, complaint platform, safety test platform, and multi-platform combination.
Step 2.2. Based on
and
, the initial set
of personalized recommendations is determined.
Step 3. Compute the aggregated BUI value across multiple platforms, taking user preferences into account.
Step 3.1. Determine the user platform preference weights , which, respectively, represent the user’s preference weights for the rating platform, the complaint platform, and the safety test platform, while ensuring that .
Step 3.2. Compute the multi-platform aggregated BUI value
by incorporating user preferences.
Step 4. By applying the BUI ranking method, the ranking results of the schemes within the initial set of personalized recommendations are obtained.
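Steps 1 to 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact procedure: the BUI ranking rule is approximated by the product of value and certainty, the ranking percentage of a product is taken as its rank divided by the number of products, a product enters the initial set only if it meets the user's expected percentage on all four lists, the preference weights are assumed to sum to 1, and all values are assumed to be oriented so that higher is better. All names and numbers are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

BUI = Tuple[float, float]  # (value, certainty)


def bui_score(b: BUI) -> float:
    # Assumed ranking rule: value * certainty (the paper's rule may differ).
    return b[0] * b[1]


def rank_percentages(buis: Dict[str, BUI]) -> Dict[str, float]:
    """Map each product to rank / n (1/n is best, 1.0 is worst)."""
    ordered = sorted(buis, key=lambda p: bui_score(buis[p]), reverse=True)
    n = len(ordered)
    return {p: (i + 1) / n for i, p in enumerate(ordered)}


def recommend(
    groups: Dict[str, List[BUI]],   # product -> [rating, complaint, safety, multi]
    expected: Sequence[float],      # four expected ranking percentages
    weights: Sequence[float],       # three platform preference weights
) -> List[str]:
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("preference weights are assumed to sum to 1")
    # Step 1.2: ranking percentage of every product on each of the four lists.
    pct = [rank_percentages({p: g[k] for p, g in groups.items()}) for k in range(4)]
    # Step 2: keep products that meet the user's expectation on all four lists.
    initial = [p for p in groups if all(pct[k][p] <= expected[k] for k in range(4))]

    # Step 3: preference-weighted BUI over the three platform types.
    def weighted(p: str) -> BUI:
        g = groups[p]
        value = sum(w * g[k][0] for k, w in enumerate(weights))
        certainty = sum(w * g[k][1] for k, w in enumerate(weights))
        return value, certainty

    # Step 4: rank the initial set by the weighted BUI.
    return sorted(initial, key=lambda p: bui_score(weighted(p)), reverse=True)


# Hypothetical four-element BUI groups for three cars.
groups = {
    "A": [(4.5, 0.9), (4.0, 0.8), (5.0, 0.9), (4.5, 0.85)],
    "B": [(4.0, 0.8), (4.5, 0.9), (4.0, 0.8), (4.2, 0.85)],
    "C": [(3.0, 0.7), (3.0, 0.6), (3.0, 0.7), (3.0, 0.65)],
}
ranking = recommend(groups, expected=(0.7, 0.7, 0.7, 0.7), weights=(1 / 3, 1 / 3, 1 / 3))
print(ranking)
```

Filtering before weighting keeps the two kinds of user input separate: the expected percentages act as hard constraints on each information source, while the weights only reorder the products that pass them.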
4. Product Ranking Recommendation Algorithm Based on Multi-Type Platform eWOM Data
In this section, the overall process of the automotive product ranking algorithm based on rating eWOM data, complaint eWOM data, and safety test data is introduced, as shown in
Figure 4.
Step 1. The alternative vehicle product set , rating platform set , complaint platform set , and safety test platform set are identified.
Step 2. Relying on crawler technology, obtain the eWOM data of candidate automotive product set on platform sets , , and .
Step 2.1. Construct the user rating matrix for the i-th automobile product on the l-th platform in platform set .
Step 2.2. Construct a complaint text matrix for the j-th attribute of the i-th automotive product on the l-th platform in platform set .
Step 2.3. Construct the safety level matrix for the l-th platform in platform set .
Step 3. To convert complaint text data into calculable numerical data, we follow the complaint text conversion method described in
Section 3.2.1. First, the annotation set is manually annotated to obtain a comprehensive, manually annotated risk level matrix
, which can then be utilized in subsequent model training.
Step 4. Based on the complaint text conversion method described in
Section 3.2.1, the XLNet model is trained using the manually annotated risk level matrix
obtained in Step 3. Subsequently, the trained model is utilized to convert the complaint text matrix
into a quantitative matrix
.
Step 5. Based on the platform reliability calculation method presented in
Section 3.2, three types of platform reliability are computed.
Step 6. Based on the credibility calculation method for eWOM information presented in
Section 3.2, the credibility of the rating eWOM index
, the credibility of the complaint eWOM index
, and the credibility of the safety test index
are computed.
Step 7. Based on the information aggregation method presented in
Section 3.3, the information aggregation results for rating eWOM
, complaint eWOM
, and safety test data
were computed using the BUIOWA aggregation operator.
Step 8. Based on the product ranking method for user personalized preferences presented in
Section 3.4, automobile product ranking recommendations tailored to users’ personalized preferences are generated.
5. A Case Study of Product Ranking Driven by Multi-Source Data
This section uses a real-world case to demonstrate the multi-source data-driven product ranking algorithm presented in
Section 4. The data originate from three types of online service platforms: user ratings, user complaints, and safety tests. We assume that all data are fully available and that the data scale is sufficiently large. Furthermore, we assume that decision-makers have a relatively low optimistic preference; specifically, the orness parameter of the BUIOWA operator is set to 0.25.
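For reference, the orness of an OWA-type weight vector is commonly defined, following Yager, as shown below. It is assumed here that the orness parameter of the BUIOWA operator follows this standard definition; the symbols $w_i$ and $n$ are notation introduced for this illustration, with $w_1$ attached to the largest argument.

```latex
\operatorname{orness}(w) \;=\; \frac{1}{n-1}\sum_{i=1}^{n}(n-i)\,w_i,
\qquad w_i \ge 0,\quad \sum_{i=1}^{n} w_i = 1 .
```

Under this definition, an orness of 1 corresponds to the maximum, an orness of 0 to the minimum, and uniform weights give 0.5. A value of 0.25 therefore places more weight on the lower-ranked arguments, which matches the low optimism assumed in this case study.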
Step 1. The alternative vehicle product set
, the rating platform set
, the complaint platform set
, and the safety test platform set
are defined as follows:
Step 2. Based on crawler technology, the real data for the candidate automotive product set
on platform sets
,
, and
were obtained. The data scale is presented in
Table 1. After organizing the data, the user rating matrix
, complaint text matrix
, and safety level matrix
were constructed.
Due to the large volume of data, only a portion of the data is presented below.
Step 2.1. The user rating matrix
is constructed (using the BMW 3 Series data from the Autohome.com platform as an example).
Step 2.2. The complaint text matrix
is constructed.
Table 2 presents a portion of the complaint text contents regarding the BMW 3 series from the 12365auto.com platform.
Step 2.3. The safety level matrix
is constructed.
Step 3. Manually annotate the dataset to obtain a comprehensive, manually annotated risk level matrix
.
Step 4. The XLNet model presented in
Section 3.2 was trained using the aforementioned comprehensive, manually annotated risk level matrix
. The variation trends of model accuracy and loss function during the training process are illustrated in
Figure 5 and
Figure 6. Subsequently, the trained model was utilized to transform the complaint text matrix
into a quantitative matrix
.
Step 5. Based on the platform reliability calculation method presented in
Section 3.2, the reliability of various platforms was computed.
Step 5.1. Obtain a reliability matrix for each platform. For each platform, there are 15 indicators to evaluate its reliability.
Step 5.2. Calculate the reliability of various platforms. The reliability of each platform is obtained by aggregating the corresponding reliability values of the 15 evaluation indicators for platform reliability.
Step 6. Based on the credibility calculation method presented in
Section 3.2, the credibility of the rating eWOM indicator
, complaint eWOM indicator
, and safety test indicator
is calculated.
Step 6.1. The credibility of the rating eWOM indicator is calculated as follows:
Step 6.2. The credibility of the complaint eWOM indicator is calculated as follows:
Step 6.3. The credibility of the safety test indicator is calculated as follows:
Step 7. Based on the information aggregation method presented in
Section 3.3, the BUIOWA aggregation function was utilized with orness
to calculate the aggregated results of rating eWOM
, complaint eWOM
, and safety test
, respectively.
Step 7.1. The multi-attribute rating matrix
for product
on platform
is derived by aggregating the ratings provided by different user groups.
The multi-attribute BUI matrix is constructed by pairing each entry of the multi-attribute rating matrix with the corresponding entry of the credibility matrix of the rating eWOM indicator. By applying the BUIOWA aggregation operator to aggregate the attribute dimensions, the BUI ratings
for nine automobile brands across two platforms can be obtained.
The BUI values of multiple rating platforms were aggregated to determine the comprehensive BUI value of the product.
Step 7.2. The multi-attribute complaint risk level matrix
for product
on platform
is derived by aggregating the ratings provided by different user groups.
The multi-attribute BUI matrix is constructed by pairing each entry of the multi-attribute complaint risk level matrix with the corresponding entry of the credibility matrix of the complaint eWOM indicator. By applying the BUIOWA aggregation operator to aggregate the attribute dimensions, the BUI ratings
for nine automobile brands across two platforms can be obtained.
The BUI values of multiple complaint platforms were aggregated to determine the comprehensive BUI value of the product.
Step 7.3. The multi-attribute safety level matrix
for product
on platform
is obtained.
The multi-attribute BUI matrix is constructed by pairing each entry of the multi-attribute safety level matrix with the corresponding entry of the credibility matrix of the safety testing evaluation indicator. By applying the BUIOWA aggregation operator to aggregate the attribute dimensions, the BUI ratings
for nine automobile brands across two platforms can be obtained.
The BUI values of multiple safety testing platforms were aggregated to determine the comprehensive BUI value of the product.
Step 8. Based on the product ranking algorithm that incorporates users’ personalized preferences as described in
Section 3.4, personalized product ranking recommendations are realized.
Step 8.1. Construct a quadruple BUI group
along with its corresponding sorting percentage vector
.
Step 8.2. Obtain the preliminary results based on the user’s expected ranking percentages.
Let the user’s expected ranking percentages for the four-element BUI group be denoted as follows.
Based on
and
, the initial set of personalized recommendations can be determined as follows.
Step 8.3. Assuming that users assign equal preference weights
to the rating platform, complaint platform, and safety test platform, the comprehensive BUI value
of multi-type platforms, based on user preferences, is computed as follows.
Step 8.4. In accordance with the BUI ranking rule, the ranking of the five types of cars in the initial recommendation set is determined as , and the final optimal car recommendation is .
6. Comparative Analysis and Management Enlightenment
6.1. Weight Sensitivity Analysis
In Step 8.3 of
Section 5, decision-makers can determine the weights of the three types of platforms according to their preferences, thereby obtaining a personalized comprehensive BUI value for multi-type platforms. In this section, sensitivity analysis will be conducted by designing various platform weight configurations.
Table 3 presents seven weight allocation cases, where the first three cases indicate that decision-makers consider only the eWOM data from a single type of platform; Case 4 to Case 6 indicate that decision-makers consider the eWOM data from two types of platforms; and Case 7 indicates that decision-makers consider the eWOM data from all three types of platforms. Based on these seven personalized weight cases, the comprehensive BUI values and ranking results for multi-type platforms, derived from user preferences, were calculated, respectively. The obtained results are presented in
Table 4.
The results indicate that the ranking outcomes under the seven scenarios exhibit significant differences, suggesting that rating eWOM data, complaint eWOM data, and safety test data encapsulate distinct product ranking knowledge. Focusing solely on a single type of eWOM data or failing to comprehensively integrate all three types of eWOM data will lead to incomplete ranking results, which are insufficient to serve as an effective reference for consumers. Only through the comprehensive consideration of rating eWOM data, complaint eWOM data, and safety test data can the most representative product ranking outcomes be achieved.
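The structure of this sensitivity analysis can be sketched in a few lines. The sketch below assumes that each of the seven cases places equal weight on the platform types it includes, and it ranks products by the product of the weighted value and weighted certainty; both choices, the case ordering, and the sample BUI values are hypothetical and are not taken from Table 3 or Table 4.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

BUI = Tuple[float, float]  # (value, certainty)


def weight_cases() -> List[Tuple[float, ...]]:
    """Seven cases: three single-platform, three two-platform, one all-platform.

    Assumption: equal weights on the selected platform types.
    Order of weights: (rating, complaint, safety test).
    """
    cases = []
    for k in (1, 2, 3):
        for chosen in combinations(range(3), k):
            cases.append(tuple(1.0 / k if i in chosen else 0.0 for i in range(3)))
    return cases


def rank_products(buis: Dict[str, Sequence[BUI]], weights: Sequence[float]) -> List[str]:
    """Rank products by an assumed score: weighted value * weighted certainty."""
    def score(product: str) -> float:
        value = sum(w * b[0] for w, b in zip(weights, buis[product]))
        certainty = sum(w * b[1] for w, b in zip(weights, buis[product]))
        return value * certainty

    return sorted(buis, key=score, reverse=True)


# Hypothetical comprehensive BUI values (rating, complaint, safety test).
cars = {
    "car_a": ((4.6, 0.9), (3.0, 0.7), (4.0, 0.8)),
    "car_b": ((4.0, 0.8), (4.5, 0.9), (3.5, 0.8)),
    "car_c": ((3.5, 0.8), (4.0, 0.8), (4.8, 0.9)),
}
for case_no, weights in enumerate(weight_cases(), start=1):
    print(case_no, weights, rank_products(cars, weights))
```

In this toy data each platform type favors a different car, so the single-platform cases produce three different rankings, which is the kind of disagreement the analysis is designed to expose.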
6.2. Theoretical Comparative Analysis
In the field of MCDM, traditional baseline methods such as TOPSIS and AHP have been extensively utilized for scheme ranking. Specifically, the TOPSIS method ranks alternatives by calculating their distances from the ideal solution, while the AHP method relies on hierarchical weight analysis to determine the relative importance of each criterion.
However, when addressing multi-source heterogeneous data (e.g., online eWOM data), these traditional methods exhibit significant limitations. Multi-source heterogeneous data often possess characteristics such as uncertainty, conflict, and quality variability, which traditional methods struggle to handle effectively. In contrast, fuzzy MCDM methods, through the integration of fuzzy theory, can more flexibly quantify subjective preferences and fuzzy information, thereby alleviating some of the shortcomings of traditional approaches. Nevertheless, these methods still lack a systematic mechanism for assessing information quality, which, to some extent, constrains their application in complex decision-making scenarios.
Against this backdrop, the large-scale group decision-making method based on the BUI model presented in this study demonstrates unique advantages.
From the perspective of information quality-driven decision-making, traditional MCDM methods rely solely on evaluation data while neglecting the credibility of the evaluation information itself. The BUI model dynamically assesses the quality of multi-source data, so that both the evaluation information and its credibility are taken into account. This approach enhances the robustness of decisions and improves the reliability of the decision results, leading to more scientific and reasonable outcomes.
Regarding large-scale group fusion, the BUIOWA operator achieves dynamic weight distribution. On the one hand, adjusting the orness parameter simulates different decision-maker attitudes, ranging from pessimistic (min-like) to optimistic (max-like) aggregation, thereby accommodating diverse decision-making scenarios. On the other hand, the BUIOWA operator effectively mitigates the influence of low-credibility information on aggregation results, thus enhancing the reliability of decision-making outcomes.
This decision-making mechanism, which integrates information quality analysis, enables BUIOWA to significantly outperform traditional MCDM methods and baseline models in scenarios such as product ranking that depend on multi-source eWOM data. It provides more reliable support for complex decision-making processes and assists decision-makers in achieving more scientific and reasonable decisions when confronted with complex multi-source heterogeneous data.
Table 5 presents the comparative analysis results of the method proposed in this paper against other MCDM methods.
6.3. The Management Enlightenment of Enterprise Sustainable Development
In the era of big data, various online service platforms encompass diverse types of consumer behavior preference data, offering enterprises critical insights into market dynamics and consumer demands. For instance, among the three categories of eWOM data examined in this study, rating eWOM data quantitatively reflect consumers’ actual experiences with enterprise products. By integrating such data, enterprises can assess the overall reputation and competitiveness of their products in the market. Complaint eWOM data highlight areas where consumers are most dissatisfied with enterprise products. A thorough analysis of these data enables enterprises to precisely identify weaknesses in their operations. Safety test data, evaluated from a professional automotive safety perspective, demonstrate the safety performance of products and provide targeted directions for safety technology development and improvement. Rather than relying on a single type of eWOM data, enterprises should comprehensively consider multi-type user eWOM data and extract consumer behavior preferences from them, which is essential for enterprise sustainability. Consequently, enterprises should proactively adjust management strategies and prioritize the application of data analysis technologies in uncovering consumer behavioral preferences. Additionally, enterprises must establish a robust data analysis framework to ensure the effective extraction of consumer preference information from multi-dimensional eWOM data. By leveraging multi-source data-driven consumer behavior analysis capabilities, enterprises can enhance their sensitivity to market changes, promptly adjust strategic directions, improve product market competitiveness, and achieve sustainable development objectives.