1. Introduction
The complex product manufacturing industry serves as a strategic pillar of national industrialization. It is driven by advanced technologies and promotes the continuous upgrading and transformation of industrial systems. The rise of the Internet economy and emerging technologies, such as big data and artificial intelligence, has prompted major economies to accelerate the integration of manufacturing with next-generation technologies. These countries are actively implementing reindustrialization strategies to revitalize advanced manufacturing. For instance, the United States launched the National Strategic Plan for Advanced Manufacturing, Germany introduced Industry 4.0, the United Kingdom released Manufacturing 2025, and China proposed Made in China 2025. This new wave of reindustrialization is not a simple return to traditional manufacturing models; rather, it is built on digitalization, networking, and intelligence. By strengthening technological innovation capabilities and constructing new competitive advantages in manufacturing, these efforts seek to re-establish global leadership in the manufacturing landscape [1].
In the development of complex products, the design phase plays a central role, directly influencing cost, quality, and schedule throughout the life cycle. Accurately capturing user requirements is vital to innovation and competitiveness [
2]. Yet in practice, such requirements are often vague, implicit, and diverse, increasing acquisition difficulty [
3,
4]. Traditional methods like interviews and surveys typically rely on small samples and expert judgment, resulting in limited data coverage and strong subjectivity [
5,
6]. Although topic modeling approaches such as Latent Dirichlet Allocation (LDA) improve scalability, they often depend on a single data type and lack behavioral input. Moreover, weighting all information sources uniformly fails to reflect their unequal contributions, which compromises decision robustness.
To address these limitations, this study integrates multi-source heterogeneous data to comprehensively capture user requirements. These data sources differ significantly in structure, granularity, and expression forms and thus exhibit heterogeneous characteristics. Such heterogeneity implies that the underlying relationships between data are not linearly symmetric in the conventional sense. Instead, they reflect a form of higher-order symmetry, in which each source contributes differently but collaboratively to the overall representation of user needs. From this perspective, “symmetry” is redefined as the structural balance among multiple heterogeneous sources in representing and supporting decision-making on user requirements. By quantifying each source’s contribution through Shapley value analysis, this study identifies a state of inter-source equilibrium—a functional symmetry characterized by influence balance and structural complementarity. This reinterpreted symmetry aligns with the core thematic interest of the journal, offering a novel and meaningful perspective for analyzing complexity in multi-source user requirement acquisition.
In response to this background, this study investigates the following three key research questions (RQs). RQ1: In current complex product design, requirement acquisition often depends on a single data source or on designers’ personal experience. Is it possible to construct a multi-source integrated acquisition method that captures user requirement information more comprehensively? RQ2: Among the existing methods of requirement acquisition, which are considered accurate and effective? Do these methods still have potential for improvement in terms of expression precision and analytical depth? RQ3: Once heterogeneous data in different forms have been collected, how can a unified and symmetric information representation be established, and how can reasonable, fair weights be assigned to the different acquisition approaches to ensure balanced user requirements and robust decision-making?
To address RQ1, this study establishes a multi-source heterogeneous user requirement acquisition model and applies it to complex product design scenarios. The term multi-source refers to the extraction of user requirements from multiple independent information channels, offering broad data coverage and diverse perspectives in information expression. Heterogeneous indicates that the collected data differ in structure, format, and analytical methods. These data types include qualitative language, quantitative ratings, and behavioral indicators, each with distinct forms and complementary levels of expression. By integrating diverse sources and structurally varied user requirement data, this model overcomes the limitations of single-source approaches. Even if a method such as text mining offers wide coverage, it is often insufficient in capturing both the depth and precision of user insights. In contrast, collecting multi-source heterogeneous data improves the comprehensiveness and reliability of requirements. This provides a stronger foundation for user-centered design in complex products.
To address RQ2, this study highlights the importance of precise and effective requirement acquisition for improving product design. Therefore, a multi-method integrated approach is developed to elicit user requirements. Specifically, the data collection involves (1) collecting user online reviews and applying BERTopic for topic modeling (this helps identify common concerns in real usage, revealing functional and usability requirements); (2) conducting semi-structured interviews and applying fuzzy Kano analysis to convert users’ subjective descriptions and expectations into structured preference and satisfaction data; and (3) designing eye-tracking experiments with static prototypes and images to capture visual attention and fixation patterns. These data support the analysis of user perception and aesthetic preferences for product appearance.
To address the challenges in RQ3, which concerns the integration, compatibility, and symmetry of multi-source heterogeneous data, this study adopts a game-theoretic perspective to model and coordinate user requirement information from different sources. An analytical framework is constructed, with the three information sources treated as cooperative participants. Their importance and actual contribution to requirement representation are comprehensively evaluated. This enables unified data structuring, rational weight assignment, and credible integration across sources. The proposed approach effectively resolves issues such as structural inconsistency, varying data quality, and subjectivity among different information forms. It improves the fairness, interpretability, and scientific rigor of the fusion process and ensures that the final ranking of user requirements possesses strong representativeness and practical value for design decision-making.
The remainder of this paper is organized as follows. Section 2 reviews existing studies on user requirement acquisition methods and complex product design, aiming to identify the research gap and the entry point for this study. Section 3 presents the research methodology, including text data collection, BERTopic-based topic modeling, eye-tracking experiments, and a game-theoretic approach for balancing different types of requirements. A multi-source heterogeneous user requirement optimization method for complex product design is proposed, and the step-by-step procedure of the method is described in detail. Section 4 takes medical equipment as an example and demonstrates the feasibility and accuracy of the proposed approach through a case study of an oxygen concentrator. Section 5 provides the results and discussion, where comparative experiments are conducted to validate the effectiveness of the proposed method. Finally, Section 6 summarizes the key findings and offers suggestions for future research directions.
  2. Literature Review
  2.1. Complex Product Design
Product design is a critical phase in the product life cycle. It determines the quality of functional performance and directly influences the feasibility of manufacturing, as well as the efficiency and value of later service stages. In this context, complex product design becomes especially important. It covers functional realization and addresses many interrelated factors, such as product structure, performance, cost, maintainability, human–machine interaction, and visual aesthetics. As such, it serves as a key stage for enhancing product competitiveness [
7].
In recent years, numerous scholars have proposed innovative theories and methods for complex product design. Chuan He et al. [
8] developed a functional design improvement model based on rainflow evolution to support the evolution and optimization of complex product functions. Haizhu Zhang et al. [
9] constructed a PDS–behavior–structure conceptual design model to address the design requirements of complex product systems (CoPS) in multidisciplinary contexts. Kang Wang et al. [
10] adopted a neural network-assisted evolutionary approach to mitigate the negative impact of product complexity on development processes from a holistic perspective. An-Jin Shie et al. [
11] integrated the fuzzy Kano model, Kansei engineering, and TRIZ theory to establish a method that maps the complex emotional needs of elderly users into design parameters. Dexin Chu et al. [
12] introduced a novel multi-skeleton modeling method to support the layout design of complex products in top-down and modular design scenarios. Xiqiang Yan et al. [
13] proposed a multidisciplinary conceptual design process model based on decomposition–mapping–negotiation–integration with feedback mechanisms. Hong Bao et al. [
14] developed a non-cooperative game-based modular design method for electromechanical products that balances reusability and assembly complexity to optimize both production efficiency and cost. Zhiyong Zhou [
15] constructed an evaluation model for healthcare product design based on Kansei engineering, incorporating eye-tracking and EEG technologies. Tianlu Zhu et al. [
16] proposed a complex product evaluation method based on hybrid Kansei engineering (HKE) modeling. A summary and comparison of these studies with the present research are provided in 
Table 1.
The reviewed studies show that complex products are used across multiple domains, including engineering instruments, transportation equipment, medical devices, and energy systems. Their complex structures and diverse design elements create high demands for effective design methodologies. Most existing studies have focused on stages such as conceptual design, functional modeling, modular division, and design evaluation. However, research on user requirements remains limited, particularly in relation to the systematic processing of multi-source heterogeneous information. In response, this study proposes a balancing and optimization method for multi-source heterogeneous user requirements in complex product design, which integrates user reviews, interview data, and eye-tracking behavior. By enabling fair weighting and complementary fusion across sources, the method effectively uncovers latent user needs and enhances the completeness, accuracy, and robustness of requirement modeling.
  2.2. User Requirement Acquisition Methods
As user-centered design continues to evolve, user requirements have become a key driving force behind technological advancement and product innovation. However, accurately, comprehensively, and objectively acquiring these requirements remains a major challenge in complex product design. Requirement acquisition methods differ in data sources, processing efficiency, and analytical depth. These differences influence the accuracy of requirement identification and the validity of design decisions.
At present, user requirement acquisition methods can be categorized into three main types: qualitative analysis, physiological signal-based approaches, and data mining techniques (as shown in 
Figure 1). Among them, qualitative methods such as literature review, web search, user interviews, and questionnaires are the most widely used in current research. For example, Zhiyong Zhou [
15] conducted literature reviews, online searches, and user surveys to summarize descriptive adjectives representing Kansei imagery for medical nursing beds. He then selected appropriate terms through voting by healthcare professionals, designers, and bedridden patients. Diana Herrera-Valenzuela et al. [
19] conducted interviews with patients and clinicians to identify a comprehensive set of requirements for gait rehabilitation using wearable robots (WR) from the perspectives of individuals with spinal cord injury (SCI) and the clinicians responsible for their recovery. Tianlu Zhu et al. [
20] collected requirements for surgical assistance devices through expert interviews and scoring methods. These studies typically employed expert reviews, focus groups, or the KJ method to refine vocabulary, followed by statistical methods such as Kano, QFD, factor analysis, or principal component analysis in the post-processing phase. For instance, Zhigang Hu et al. [
21] crafted a user requirements importance ranking system leveraging the Kano model and pairwise analysis. Although these approaches enable in-depth understanding of the target user group, they also face several challenges. These include small sample sizes, high data collection costs, and strong subjectivity in analysis. As a result, it is often difficult to obtain complete and authentic expressions of user requirements through such communication-based methods, which may compromise the comprehensiveness of the data obtained.
Many studies combine two types of sources, such as interviews with eye-tracking or online reviews with questionnaires, to improve the breadth and depth of user requirement acquisition. These combinations help reduce subjectivity and enhance contextual adaptability. However, they generally lack a unified integration framework and struggle to resolve conflicts or assign balanced weights across multiple sources. To our knowledge, no research has systematically integrated textual feedback, subjective preferences, and behavioral data within a coherent model that ensures fair representation and optimized expression of multi-source heterogeneous information.
To address these limitations, some researchers use data mining and natural language processing (NLP) techniques to extract unstructured data such as online user reviews. Using web crawling and word frequency analysis, they quantify user requirements objectively and at scale. For example, Bingkun Yuan et al. [
22] constructed a corpus consisting of academic literature and patent abstracts, and applied Latent Dirichlet Allocation (LDA) to extract usable Kansei semantics from large-scale data. This approach effectively mitigated the subjectivity issues inherent in traditional Kansei engineering. Juan Hao et al. proposed a hybrid LDA and K-means clustering algorithm based on patent texts to perform user requirement clustering, offering a new perspective for functional requirement modeling. Although these big-data-based approaches have improved the comprehensiveness of requirement acquisition, most of them still rely on a single data source. User requirements obtained from a single channel may be insufficient to support systematic design and high-precision modeling. As a result, the multidimensionality and objectivity of user requirements can be compromised. Additionally, single-source approaches to user need acquisition often present inherent limitations. For example, online reviews may suffer from self-selection bias, user interviews can be influenced by social desirability or subjective memory, and eye-tracking experiments are typically constrained by limited sample sizes. Therefore, it is necessary to complement these methods with additional data sources to ensure a more comprehensive and reliable understanding of user needs.
  2.3. Natural Language Processing Methods
Natural language text is a primary medium through which humans express needs, communicate ideas, and share opinions. As a major carrier of thought, it plays a vital role in recording and transmitting user intent. Analyzing the semantic content of large-scale, unstructured natural language data enables the extraction of valuable information, thereby supporting user requirement identification and product optimization [
23]. Natural language processing (NLP) is an interdisciplinary field that integrates computer science, artificial intelligence, and linguistics. It focuses on developing algorithms and technologies that allow computers to understand, interpret, generate, and process the semantics and structure of human language [
24]. NLP has found wide application in industry and commerce, particularly in text mining tasks such as text classification, clustering, information extraction, summarization, and sentiment analysis [
25].
With the advancement of data mining techniques, increasing research attention has been directed toward extracting user requirements from unstructured data, especially online review content from community platforms and e-commerce websites. These reviews reflect users’ experiences and attitudes toward products. While they influence potential consumers, they also provide companies with crucial insights into user preferences and product feedback [
26]. In practice, extracting user concerns and interest topics from large volumes of natural language content remains a challenging task. To address this, topic analysis has become a key method in NLP and is widely used in text mining for unstructured data. By identifying latent semantic structures and core topics within text, topic analysis offers an effective tool for understanding the content dimensions of user requirements and serves as a bridge between semantic processing and requirement modeling [
27].
Topic analysis generally involves two main steps: text representation and topic extraction. Traditional text representation methods rely on surface features such as term frequency (TF) and term frequency–inverse document frequency (TF-IDF). More recent approaches adopt deep semantic modeling based on word embeddings (e.g., Word2Vec, GloVe) and contextual embeddings (e.g., BERT) [
28]. Among topic models, Latent Dirichlet Allocation (LDA) is one of the most widely used unsupervised methods. It uses Bayesian inference to identify topic distributions for each document and extract keywords for each topic [
29]. As the demand for semantic precision increases, deep language models such as BERT have been widely applied to topic modeling. With its multi-layer bidirectional transformer architecture, BERT captures contextual semantics and provides high-quality embeddings for downstream clustering and classification. In complex user review scenarios, BERT combined with the HDBSCAN (Hierarchical Density-Based Spatial Clustering of Applications with Noise) clustering algorithm has been shown to outperform traditional models in both semantic coherence and topic distinctiveness [
30].
Given the large volume, informal language, and semantic complexity of online reviews, NLP and text mining technologies offer designers a robust, quantifiable source of user data. In particular, topic analysis enables the efficient extraction of core requirements from reviews and holds significant value for user requirement modeling.
  2.4. Game Theory
Game theory, a key branch of operations research, provides a theoretical framework for analyzing rational decision-making among multiple interdependent players in conflict or competition scenarios [
31]. Since its inception, game theory has been widely applied in economics [
32], management [
33], and engineering [
34] and has gradually become a crucial method for decision-making in product design and user requirement analysis [
35].
Based on the nature of interaction among participants, game theory can be categorized into non-cooperative and cooperative games. Non-cooperative game theory focuses on strategic competition among individuals, where each participant aims to maximize their own interests through rational behavior. This model is well-suited for scenarios involving conflicting user preferences or strategy confrontation—for example, when different stakeholders (e.g., users, engineers, and marketers) have divergent priorities in product features or when departments within an organization compete for limited design resources or budget allocations. In contrast, cooperative game theory centers on alliance formation and emphasizes collaboration and joint benefit maximization. It promotes values such as efficiency, fairness, and symmetry, making it particularly applicable to multi-source data integration and group decision-making [
36]. Compared to traditional subjective weighting methods such as the Analytic Hierarchy Process (AHP) or objective weighting techniques like entropy weighting, cooperative game theory adopts a combinatorial weighting strategy. This allows for more accurate characterization of complementarity, redundancy, and marginal contribution among information sources, thereby enhancing the scientific rigor and interpretability of the weighting process [
37]. Additionally, this approach helps mitigate information loss and imbalance issues commonly encountered in linear fusion methods, offering stronger support for requirement prioritization and strategic optimization in complex product design. It reduces human subjectivity and provides a theoretical and practical foundation for integrating high-dimensional, complex, and multi-perspective data [
38].
Although prior studies have combined natural language processing with expert scoring or used entropy-based methods to fuse requirement data, these frameworks often adopt static or linear weighting strategies that do not address the asymmetry and structural imbalance across sources. Most methods assume that each input channel contributes independently, without modeling their mutual influence or conflict. In contrast, this study introduces a cooperative game-based fusion mechanism that redistributes weights based on the relative contribution of, and interaction between, sources. This dynamic adjustment allows for a fairer and more interpretable requirement prioritization process, especially when integrating sources with different levels of granularity, reliability, and cognitive context. Therefore, the methodological contribution of this work lies not only in combining multi-source data but also in enabling inter-source coordination through a contribution-aware allocation strategy.
In this study, user requirement modeling for complex products is based on structured requirement items obtained through text mining, user interviews, and eye-tracking experiments. The relationship among multi-source heterogeneous user requirement information is illustrated in 
Figure 2. These three types of information sources reflect users’ real concerns under different contexts, cognitive levels, and modes of expression. However, user requirements obtained from different sources often exhibit asymmetry and imbalance. Each source tends to emphasize distinct aspects of user intent and differs in terms of precision, granularity, or emotional expression. As a result, simple averaging or uniform fusion approaches are insufficient to capture the real differences and actual contributions of each source. To address this, cooperative game theory is introduced as the underlying framework. Textual data, interview data, and eye-tracking data are regarded as cooperative participants. By constructing a coalition utility function and calculating the marginal contribution of each source to each requirement item, the Shapley value method is applied to achieve scientifically grounded weight allocation and requirement prioritization. This enables a balanced, symmetric, and optimized integration of multi-source heterogeneous user requirements.
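The Shapley-based weighting described above can be illustrated with a minimal sketch. It assumes a single coalition utility table for the three sources; the utility values, the source names, and the helper `shapley_values` are hypothetical and chosen only for illustration, whereas the study evaluates contributions per requirement item with its own utility function.

```python
from itertools import permutations


def shapley_values(players, utility):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            extended = coalition | {p}
            # Marginal contribution of p when joining the current coalition
            totals[p] += utility(extended) - utility(coalition)
            coalition = extended
    return {p: total / len(orderings) for p, total in totals.items()}


# Hypothetical coalition utilities (illustrative only, not from the study)
SOURCES = ("reviews", "interviews", "eye_tracking")
UTILITY = {
    frozenset(): 0.0,
    frozenset({"reviews"}): 0.5,
    frozenset({"interviews"}): 0.4,
    frozenset({"eye_tracking"}): 0.2,
    frozenset({"reviews", "interviews"}): 0.8,
    frozenset({"reviews", "eye_tracking"}): 0.6,
    frozenset({"interviews", "eye_tracking"}): 0.6,
    frozenset({"reviews", "interviews", "eye_tracking"}): 1.0,
}

phi = shapley_values(SOURCES, lambda coalition: UTILITY[coalition])

# Normalize the Shapley values so that the source weights sum to one
total = sum(phi.values())
weights = {source: value / total for source, value in phi.items()}
print(weights)
```

Enumerating all orderings is practical here because only three sources are involved (six orderings); for many players, a sampling-based approximation would be needed instead.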
  3. Materials and Methods
This section introduces the integrated research framework for multi-source user requirement optimization, as illustrated in 
Figure 3. The proposed framework consists of four key modules: (1) requirement extraction based on online reviews using the BERTopic model; (2) user preference acquisition through semi-structured interviews and fuzzy Kano analysis; (3) visual perception analysis via eye-tracking experiments; and (4) multi-source data balancing using a cooperative game-theoretic model. These components are applied sequentially to capture, classify, and integrate heterogeneous user needs in a balanced and symmetric manner. This framework was then applied to a case study of 12 commercial oxygen concentrators, with each step executed accordingly. The methods and implementation details of each module are elaborated in 
Section 3.1, 
Section 3.2, 
Section 3.3 and 
Section 3.4.
  3.1. BERTopic
Recent breakthroughs in deep learning have significantly advanced NLP, with BERT, developed by Google, being one of the most representative models [
39]. BERT has also achieved outstanding performance across multiple NLP benchmark tasks [
40]. In 2022, Maarten Grootendorst proposed a document-level topic clustering method known as BERTopic, based on the BERT model [
41]. This model combines BERT’s deep language representations with clustering and class-based term weighting to perform semantic topic analysis. Because its density-based clustering step infers the number of clusters from the data, the number of topics does not need to be specified in advance.
The algorithmic process of BERTopic involves several stages: First, text data are embedded into dense vectors using pretrained BERT models. These high-dimensional embeddings are then reduced using UMAP (Uniform Manifold Approximation and Projection) [
42], a nonlinear dimensionality reduction technique. The reduced vector matrix is subsequently clustered using HDBSCAN [
43], resulting in distinct topic groups. For each cluster, the class-based TF-IDF (C-TF-IDF) method is applied to evaluate the importance of candidate topic terms within the cluster. This yields representative keywords for each topic group, forming the final topic classification results. The detailed process is illustrated in 
Figure 4.
The BERTopic model improves upon the traditional TF-IDF weighting algorithm [44] by calculating the importance scores of words within semantic clusters (i.e., the extracted topics), thereby identifying the most representative terms for each topic. In this context, TF refers to term frequency, which indicates how often a word appears within a specific class of documents, and is defined as

$$\mathrm{TF}_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}} \tag{1}$$

In Formula (1), $\mathrm{TF}_{i,j}$ denotes the term frequency of word $i$ in a given document $j$, while $n_{i,j}$ represents the number of times the word appears in the document. The denominator indicates the total number of word occurrences in that document. IDF stands for inverse document frequency. The IDF of a term is calculated by dividing the total number of documents in a cluster by the number of documents containing that term, followed by taking the logarithm of the quotient. A higher IDF value indicates that the term appears in fewer documents and thus has stronger discriminative power across categories. The calculation formula is as follows:

$$\mathrm{IDF}_{i} = \log \frac{N}{n_{i}} \tag{2}$$

In Formula (2), $N$ represents the total number of documents, and $n_{i}$ denotes the number of documents that contain the term. Once the TF and IDF values are calculated, their product yields the TF-IDF score of the term. A higher TF-IDF score indicates that the term is more important within the document set. Based on this, the formula for computing C-TF-IDF is as follows:

$$W_{t,c} = tf_{t,c} \cdot \log\left(1 + \frac{A}{tf_{t}}\right) \tag{3}$$

In Formula (3), $W_{t,c}$ represents the importance score of term $t$ in category $c$, $tf_{t,c}$ denotes the frequency of term $t$ within category $c$, and $tf_{t}$ indicates the frequency of term $t$ across all categories. $A$ refers to the average number of words per category. This improved algorithm reduces the computational time required by traditional TF-IDF and enhances the overall efficiency of the model.
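As a minimal sketch of the class-based weighting, the following code scores each term as its frequency within a category multiplied by log(1 + A / f), where f is the term's frequency across all categories and A is the average number of words per category. The token lists, topic labels, and the helper names `c_tf_idf` and `top_terms` are hypothetical; raw term frequencies are used as in the description above, whereas library implementations may additionally normalize them.

```python
import math
from collections import Counter


def c_tf_idf(class_tokens):
    """Score terms per class with W[t, c] = tf[t, c] * log(1 + A / tf[t]).

    class_tokens maps a topic label to the tokens of all documents in that
    topic, i.e. each class is treated as one concatenated document.
    """
    # tf[t, c]: raw frequency of term t within class c
    tf = {label: Counter(tokens) for label, tokens in class_tokens.items()}

    # tf[t]: frequency of term t across all classes
    tf_all = Counter()
    for counts in tf.values():
        tf_all.update(counts)

    # A: average number of words per class
    avg_words = sum(len(tokens) for tokens in class_tokens.values()) / len(class_tokens)

    return {
        label: {t: n * math.log(1 + avg_words / tf_all[t]) for t, n in counts.items()}
        for label, counts in tf.items()
    }


def top_terms(scores, k=3):
    """Return the k highest-scoring terms per class (ties broken alphabetically)."""
    return {
        label: [t for t, _ in sorted(s.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for label, s in scores.items()
    }


# Hypothetical, already tokenized review clusters (illustrative only)
clusters = {
    "noise": ["noise", "loud", "noise", "night", "machine"],
    "portability": ["light", "carry", "machine", "travel", "light"],
}
scores = c_tf_idf(clusters)
print(top_terms(scores, k=2))
```

In this toy input, "machine" occurs in both clusters and therefore scores lower than terms that are specific to one cluster, which is the discriminative behavior the weighting is meant to provide.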
A key advantage of BERTopic is that it supports a wide range of language models and can yield stable clustering results using default hyperparameter settings, without extensive manual tuning. Although no ablation study on hyperparameter sensitivity was conducted in this work, prior literature suggests that BERTopic exhibits reasonable robustness to default settings in typical short-text scenarios [41].
To further verify the applicability and advantages of the BERTopic model in this study, a comparative analysis was conducted between BERTopic and several mainstream topic modeling approaches, including LDA, STM, KeyBERT, domain-adapted transformers, and LLM-based clustering methods. As shown in 
Table 2, BERTopic integrates BERT-based semantic embeddings with UMAP for dimensionality reduction and HDBSCAN for density-based clustering, making it particularly effective for short and unstructured texts such as user-generated reviews. It captures fine-grained semantic features and latent topics by fully considering the semantic, structural, and sequential characteristics of text, thereby enhancing topic coherence and interpretability [
45,
46]. Compared to frequency-based models such as LDA and STM, BERTopic offers more comprehensive handling of contextual semantics. LDA relies on shallow statistical features and often fails to capture deeper semantic relationships, particularly in short text scenarios [
47]. STM, as an extension of LDA, introduces document-level metadata to model topic evolution trends but is highly dependent on structured input variables and exhibits increased modeling complexity, which limits its applicability in unstructured domains like consumer product reviews [
48]. Other modern NLP techniques also present limitations in this context. KeyBERT extracts keywords based on document–keyword similarity but does not generate topic structures and performs poorly when dealing with emotionally charged or ambiguous texts. Domain-adapted transformers (e.g., SciBERT, BioBERT) improve domain-specific understanding but require extensive labeled corpora and fine-tuning, reducing their generalizability. LLM-based methods (e.g., OpenAI embeddings with K-means clustering) provide strong semantic representations but often lack topic coherence due to the decoupling of embedding and clustering stages. In contrast, BERTopic balances semantic richness, automatic determination of topic numbers, and interpretability, making it more suitable for extracting meaningful user needs from multi-source unstructured data [
49].
Based on the above, BERTopic offers strong interpretability, robust automatic clustering capabilities, and high adaptability. These advantages make it particularly suitable for analyzing user-generated online reviews, which often exhibit informal language, implicit topics, and complex contextual features. In this study, BERTopic (version 0.15.0, developed by Maarten Grootendorst, The Netherlands) is applied to perform topic modeling on preprocessed review corpora, aiming to identify complex and non-standardized user requirements expressed during the actual use of medical device products. This modeling approach enables the aggregation and classification of large volumes of semantic content, thereby providing effective semantic support for the subsequent refinement of user requirements.
  3.2. Fuzzy Kano Model
The Kano model is an effective and intuitive method for identifying and classifying user requirements. It improves the efficiency of recognizing user satisfaction and directly reflects users’ subjective perceptions of product features. Additionally, it is useful for addressing the ambiguity commonly present in user requirements [
50]. As shown in 
Figure 5, the Kano model categorizes user requirements into five types based on the level of satisfaction users experience when the requirement is fulfilled: Attractive, One-dimensional, Must-be, Indifferent, and Reverse requirements.
The fuzzy Kano model is widely adopted in user requirement analysis due to its ability to effectively address the inherent ambiguity and hesitation in user responses. Unlike the traditional Kano model, which relies on discrete categorization, the fuzzy Kano approach allows each requirement to simultaneously belong to multiple categories with varying degrees of membership. This enhances the interpretability of user satisfaction analysis and better reflects the subjective and uncertain nature of real-world user needs [
51]. Therefore, the fuzzy Kano model was chosen as the method for prioritizing user requirements in this study.
To enable the quantitative evaluation and prioritization of user requirements, this study calculates fuzzy membership degrees for each requirement based on the frequency with which it is classified under four Kano categories: Attractive (A), One-dimensional (O), Must-be (M), and Indifferent (I). These membership degrees are derived from the relative frequency of each classification within the total sample (see Formula (4)).
$$\mu_{ij} = \frac{f_{ij}}{N}, \quad j \in \{A, O, M, I\} \tag{4}$$
In Formula (4), $\mu_{ij}$ represents the fuzzy membership degree of the $i$-th user requirement under the Kano category $j$, $f_{ij}$ denotes the frequency with which the requirement is classified into category $j$, and $N$ indicates the total number of survey responses.
In addition, to quantify the influence of different types of user requirements on overall satisfaction, a weight assignment mechanism is incorporated into the fuzzy Kano analysis. Following the extended Kano model used in previous studies [
52], the model assigns weight coefficients based on the relative impact of each category on user satisfaction: 0.5 for Attractive (A), 1.0 for One-dimensional (O), 1.5 for Must-be (M), and 0 for Indifferent (I). This reflects the common understanding that failing to meet Must-be requirements leads to strong dissatisfaction, whereas Attractive features, while appreciated, are not essential [
53]. Indifferent requirements, which have minimal influence on user decisions, are assigned a weight of zero. By integrating these weighted satisfaction contributions, a Weighted Kano Importance Index is constructed to represent the priority level of each requirement (see Formula (5)).
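$$K_i = \sum_{j \in \{A, O, M, I\}} w_j \, \mu_{ij} \tag{5}$$
In Formula (5), $K_i$ is the Weighted Kano Importance Index of the $i$-th requirement, $\mu_{ij}$ is its fuzzy membership degree in category $j$, and $w_j$ denotes the weight coefficient assigned to each Kano category. The commonly used values are $w_A = 0.5$, $w_O = 1.0$, $w_M = 1.5$, and $w_I = 0$.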
To eliminate differences in value scales among various requirements, the Kano index is further normalized (see Formula (6)).
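$$K_i^{\mathrm{norm}} = \frac{K_i - K_{\min}}{K_{\max} - K_{\min}} \tag{6}$$
In Formula (6), $K_i$ is the Kano index of the $i$-th requirement, and $K_{\min}$ and $K_{\max}$ are the smallest and largest index values across all requirements. The normalization ensures that the maximum value is scaled to 1 and the minimum value to 0.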
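The calculation in Formulas (4)–(6) can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: the function names and the example classification counts are hypothetical, and only the weight coefficients (0.5, 1.0, 1.5, 0) are taken from the text.

```python
# Weight coefficients stated in the text for the four Kano categories.
KANO_WEIGHTS = {"A": 0.5, "O": 1.0, "M": 1.5, "I": 0.0}


def membership_degrees(freq, total):
    """Formula (4): relative frequency of each Kano category."""
    return {cat: count / total for cat, count in freq.items()}


def kano_index(freq, total, weights=KANO_WEIGHTS):
    """Formula (5): weighted sum of the membership degrees."""
    mu = membership_degrees(freq, total)
    return sum(weights[cat] * mu[cat] for cat in weights)


def min_max_normalize(values):
    """Formula (6): rescale so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:  # all indices equal; avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical classification counts for three requirements (10 responses each).
counts = {
    "R1": {"A": 2, "O": 3, "M": 4, "I": 1},
    "R2": {"A": 5, "O": 2, "M": 1, "I": 2},
    "R3": {"A": 1, "O": 1, "M": 1, "I": 7},
}
indices = {name: kano_index(freq, total=10) for name, freq in counts.items()}
normalized = dict(zip(indices, min_max_normalize(list(indices.values()))))
```

In this made-up example, R1 is dominated by Must-be and One-dimensional votes and therefore receives the highest index, while R3 is mostly Indifferent and maps to 0 after normalization.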
The consistency of the classification results was evaluated using the standard deviation across the four Kano categories for each requirement. The standard deviation reflects the degree of variation in user judgments. It is calculated using the following formula:
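$$\sigma_i = \sqrt{\frac{1}{n} \sum_{j=1}^{n} \left(f_{ij} - \bar{f}_i\right)^2} \tag{7}$$
In Formula (7), $\sigma_i$ denotes the standard deviation of the $i$-th requirement, $f_{ij}$ is the frequency with which the $i$-th requirement is classified into the $j$-th category, $\bar{f}_i$ is the mean frequency of the $i$-th requirement across all four categories, and n = 4 is the total number of Kano categories.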
This standard deviation measure quantifies the internal dispersion of user judgments. A lower $\sigma_i$ implies more consistent responses across participants, while a higher value suggests potential disagreement or interpretive ambiguity.
  3.3. Eye-Tracking Technology
Eye-tracking technology enables the accurate study of users’ unconscious and emotional responses by measuring either the position of gaze points or the movement of the eyeball relative to the head. According to the eye–brain hypothesis, changes in eye movement can be used to infer cognitive processes in the brain. This method serves as a supplement to subjective user feedback and is widely used in Kansei and emotional product design. By analyzing eye-tracking data such as gaze position, fixation duration, and gaze trajectory, researchers can reveal complex cognitive operations. These analyses help in understanding users’ affective and cognitive behaviors [
54]. Eye movements are closely related to brain activity and are considered important indicators of cognitive processing. Studies have shown that the human eye maintains a stable gaze for approximately 200–300 ms during a fixation period, during which visual information is focused and processed [
55]. Eye-tracking technology can help identify user needs, preferences, and points of interest. It provides a more objective and intuitive basis for requirement analysis and supports deeper insight into user behavior during product design.
In this study, attention heatmaps and total fixation duration within Areas of Interest (AOIs) are used to reflect participants’ levels of attention and interest. Total fixation duration is also used as an indicator of visual attraction. The longer the fixation time, the more the stimulus captures user attention. Shorter fixation times indicate weaker attraction. By applying eye-tracking technology and analyzing both heatmaps and AOI-based fixation data, this study aims to provide objective insight into user preferences. These methods help ensure the reliability and validity of the product evaluation results.
  3.4. Cooperative Game Model
Among cooperative game methods, the Shapley value model is one of the most representative theoretical tools. Its core idea is to compute the average marginal contribution of each participant across all possible coalition combinations and thereby derive a fair and rational weight for each party within the overall utility distribution [
56]. A key advantage of the Shapley value method is that it satisfies the axioms of efficiency, symmetry, additivity, and the null-player property, which ensures that participants with equal contributions receive equal weights and that the total utility is fully distributed among the players. In this study, the Shapley value is therefore well suited to the challenge of integrating multi-source heterogeneous user requirement data, as it enables a transparent and justifiable quantification of each source’s contribution. Rather than being applied as an isolated computational tool, the Shapley value is embedded as a core mechanism within the overall modeling framework, serving as a link between upstream requirement acquisition and downstream decision prioritization. This integration ensures conceptual coherence across data fusion, weight allocation, and final solution interpretation. Specifically, the cooperative game theory framework is introduced by modeling different sources of user requirement data as collaborative participants. The Shapley value method is then employed to evaluate the marginal contribution of each information source to the overall requirement identification outcome. This not only facilitates the weighted integration of multi-source data but also ensures that the prioritization process reflects the relative importance and reliability of each input source under a unified theoretical foundation. It is important to note that this method does not operate on the full-scale raw dataset directly. Instead, user concerns are first extracted and clustered to generate a reduced set of representative demand items. Only these filtered items are included in the cooperative game-theoretic integration stage.
This design ensures that the final number of items remains within a reasonable range, thereby keeping the computational cost of the Shapley value analysis at a manageable level even when the initial dataset is large. The process of calculating requirement weights using the Shapley value is as follows:
Let the set of participants be $N = \{1, 2, \ldots, n\}$, where each element $i \in N$ represents an information source. In this study, the sources include text mining, user interviews, and eye-tracking experiments. For any user requirement item $j$, each information source provides a normalized score $s_{ij}$, which represents the identification strength or support level of source $i$ for that specific requirement. To evaluate the true contribution of different sources to the identification of requirements, a coalition utility function $v(S)$ is defined, where $S \subseteq N$ represents any subset of information sources. The function $v(S)$ reflects the joint expressive power or recognition utility of the sources in subset $S$ for the requirement item. In this study, an average-based utility function is adopted to model coalition utility. In intuitive terms, the coalition utility function reflects how well a group of data sources jointly recognize or support a user requirement. Building on this, the Shapley value calculates the average marginal contribution of each individual source across all possible combinations. This is akin to evaluating how much each team member contributes to a collaborative project, on average. In our context, it ensures that no single data channel (e.g., reviews, interviews, or eye-tracking) dominates the evaluation, allowing for fair and symmetric weighting across heterogeneous sources. The specific formula is defined as follows:
$$v(S) = \frac{1}{|S|} \sum_{i \in S} s_{ij} \tag{8}$$
In Formula (8), $v(S)$ represents the utility value of the coalition $S$ for a given user requirement item; $|S|$ denotes the number of information sources included in subset $S$; $s_{ij}$ is the normalized score assigned by source $i$ to the given requirement item. All $s_{ij}$ values are scaled to the range $[0, 1]$ before calculation.
Based on the coalition utility function, the Shapley value is used to compute the average marginal contribution of each information source across all possible coalitions. Its mathematical definition is as follows:
$$\varphi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|! \, (n - |S| - 1)!}{n!} \left[ v(S \cup \{i\}) - v(S) \right] \tag{9}$$
In Formula (9), $\varphi_i(v)$ denotes the Shapley value of participant $i$; $S$ represents any subset of participants that does not include $i$; and $n$ refers to the total number of participants (which is 3 in this study). The formula enumerates all possible coalition combinations and calculates the average marginal gain contributed by information source $i$ when joining each coalition. This allows for a scientifically grounded allocation of weights in the final decision-making process. In practical application, the scores provided by each information source for a given requirement item must first be unified and normalized to the interval $[0, 1]$. Then, for each requirement item, a coalition utility function is constructed, and the Shapley value of each participant is computed. These Shapley values are used as weights to integrate the original scores, resulting in a composite score for the corresponding requirement item:
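$$\mathrm{Score}_j = \sum_{i=1}^{n} \varphi_i \, s_{ij} \tag{10}$$
In Formula (10), $\mathrm{Score}_j$ represents the weighted score of requirement item $j$, $s_{ij}$ is the normalized score that information source $i$ assigns to that item, and $\varphi_i$ is the Shapley value of information source $i$ computed for that item. Finally, all requirement items are ranked based on their composite scores, resulting in a user requirement priority list derived from fair contribution-based weighting.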
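The procedure in Formulas (8)–(10) can be sketched as follows. This is an illustrative Python sketch rather than the authors' code: the function names and the source labels are hypothetical, and the utility of the empty coalition is assumed to be 0, since the average in Formula (8) is undefined for an empty set. Under that assumption, the sketch yields 0.722 for the source scores reported in the case study for "D14 Friendly appearance" (0, 0.71, 1.0), which matches the integrated score given there.

```python
from itertools import combinations
from math import factorial


def coalition_utility(scores, coalition):
    """Formula (8): mean normalized score of the sources in the coalition.

    Assumption: the empty coalition has utility 0.
    """
    if not coalition:
        return 0.0
    return sum(scores[i] for i in coalition) / len(coalition)


def shapley_values(scores):
    """Formula (9): average marginal contribution of each source."""
    players = list(scores)
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for size in range(n):  # coalition sizes 0 .. n-1 that exclude i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                gain = (coalition_utility(scores, subset + (i,))
                        - coalition_utility(scores, subset))
                total += weight * gain
        phi[i] = total
    return phi


def composite_score(scores):
    """Formula (10): Shapley-weighted sum of the original source scores."""
    phi = shapley_values(scores)
    return sum(phi[i] * scores[i] for i in scores)


# Source scores reported in the case study for "D14 Friendly appearance".
d14 = {"reviews": 0.0, "interviews": 0.71, "eye_tracking": 1.0}
print(round(composite_score(d14), 3))  # 0.722
```

With only three sources, each item requires evaluating seven non-empty coalitions, so exhaustive enumeration is inexpensive. Note that under an average-based utility a source with a low score can receive a negative Shapley value, because adding it lowers the coalition mean.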
To facilitate performance benchmarking, an equal-weighted fusion method is also introduced as a baseline. In this method, each source is assigned an identical weight of 1/3. The final integrated score of each user requirement under this method is calculated as follows:
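$$\mathrm{Score}_j^{\mathrm{eq}} = \frac{1}{3} \left( s_{1j} + s_{2j} + s_{3j} \right) \tag{11}$$
In Formula (11), $s_{1j}$, $s_{2j}$, and $s_{3j}$ represent the normalized scores for requirement $j$ from the three sources, respectively.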
  4. Case
  4.1. Text-Based Requirement Mining Using BERTopic
This section presents a case study to demonstrate the proposed method. The study focuses on commercial oxygen concentrators and aims to extract user requirements from real-world usage to support design optimization. Oxygen concentrators are complex medical devices that integrate multiple subsystems, including oxygen separation, interface display, voice interaction, and physiological sensing. Their complexity and the high standards required in medical settings make them a suitable subject for this study.
The study uses two major e-commerce platforms in China, JD.com and Taobao, as data sources. The keyword “oxygen concentrator” was used to search for relevant products. These platforms provide a large number of user reviews. For example, the best-selling product has over 5000 comments, which offers a sufficient amount of data for subsequent text analysis and requirement mining. The top eight products from each platform, sorted by sales volume, were selected for web scraping. Four products appeared on both lists, resulting in 12 unique samples in the final dataset. The experimental samples collected are shown in 
Figure 6.
To ensure the timeliness and accuracy of the collected data, this study focused on user reviews of sample products posted within the past year. A total of 15,289 initial user reviews were collected from the two platforms. The raw data included product information, user comments, and follow-up reviews. To improve the quality of the text data, comments with fewer than five characters and meaningless content such as “good”, “nice”, or “thumbs up” were removed. After cleaning, a total of 12,715 valid review texts were retained for further analysis. The Jieba Python package (version 0.42.1) was used for word segmentation. The segmented texts were processed using a stopword dictionary to remove irrelevant terms. The stopword list combined entries from the Baidu Stopwords List and the Machine Intelligence Laboratory Stopwords List from Sichuan University. In addition, custom stopwords related to logistics and other non-design-related content were added to better isolate reviews relevant to product form, thereby improving the accuracy of the subsequent topic analysis.
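The cleaning steps described above can be sketched as a small pipeline. This is a simplified illustration under stated assumptions: the function names, the filler list, and the stopword set are hypothetical placeholders, and the actual stopword dictionaries (Baidu, Sichuan University, and the custom logistics terms) are not reproduced here. Segmentation uses `jieba.lcut`, imported lazily so that the filtering logic can be exercised without the package installed.

```python
def is_informative(comment, min_chars=5, filler=("good", "nice", "thumbs up")):
    """Keep comments with at least `min_chars` characters that are not pure filler."""
    text = comment.strip()
    return len(text) >= min_chars and text.lower() not in filler


def segment(text):
    """Word segmentation with Jieba (third-party package, imported lazily)."""
    import jieba
    return jieba.lcut(text)


def remove_stopwords(tokens, stopwords):
    """Drop stopwords and whitespace-only tokens."""
    return [t for t in tokens if t.strip() and t not in stopwords]


def preprocess(comments, stopwords, segmenter=segment):
    """Filter comments, segment them, and strip stopwords."""
    kept = [c for c in comments if is_informative(c)]
    return [remove_stopwords(segmenter(c), stopwords) for c in kept]
```

Passing a different `segmenter` (for example `str.split` for whitespace-delimited text) makes the same pipeline usable for quick checks on non-Chinese samples.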
The BERTopic model was applied to identify user topics and extract the corresponding keyword distributions for home-use oxygen concentrators. A total of 14 user topics were identified. Each topic and its associated details are presented in 
Table 3.
To support a more intuitive understanding of the topic distribution and keyword composition, several visualizations were generated using the BERTopic model. To illustrate the document distribution within each topic, the topic_model.visualize_documents() function from the BERTopic library was used. In the resulting visualization, each color cluster represents a group of documents belonging to the same topic. The result is shown in 
Figure 7. The visualization shows clearly defined boundaries between topics, indicating a good level of clustering performance. The meaning and label corresponding to each topic are presented in 
Table 3, which summarizes the distribution of domestic topics.
To gain a more fine-grained understanding of the content associated with each topic, the topic_model.visualize_barchart() function was used to visualize the top eight keywords for each of the 14 topics. By examining the weight and ranking of keywords across different topics, the core semantic focus of each topic can be identified. The results are shown in 
Figure 8.
The bubble chart is used to illustrate the distribution and density of user comments across different topics. The size of each bubble represents the number of samples associated with the corresponding topic. Each circle in the chart denotes a topic, and its area reflects the frequency of that topic in the overall corpus. Topics that appear closer together on the coordinate plane indicate higher semantic similarity. The topic bubble chart is shown in 
Figure 9.
A topic similarity matrix (heatmap) was used to quantify the semantic similarity between different topics. If multiple topics exhibit high similarity, it may indicate fuzzy clustering boundaries or significant semantic overlap, suggesting the need for topic merging or structural optimization. As shown in 
Figure 10, the matrix presents the similarity scores between topics, where darker colors indicate higher levels of similarity. The matrix reveals no significant overlap between most topics, indicating that the clustering performance is satisfactory.
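The four views discussed above correspond to visualization methods in the BERTopic library. The following sketch shows one way they might be produced; the wrapper function, the dictionary keys, and the `language="multilingual"` setting are assumptions for illustration, since the model configuration used in the study is not stated in this section.

```python
def fit_and_visualize(docs):
    """Fit BERTopic on preprocessed review texts and build four diagnostic views.

    Requires the third-party `bertopic` package; `docs` is a list of strings.
    """
    from bertopic import BERTopic

    # Assumption: a multilingual embedding model, because the reviews are in Chinese.
    topic_model = BERTopic(language="multilingual")
    topics, _ = topic_model.fit_transform(docs)

    figures = {
        # Document clusters colored by topic.
        "documents": topic_model.visualize_documents(docs),
        # Top eight keywords for each of the first 14 topics.
        "barchart": topic_model.visualize_barchart(top_n_topics=14, n_words=8),
        # Intertopic distance map (bubble chart).
        "intertopic": topic_model.visualize_topics(),
        # Topic-to-topic similarity heatmap.
        "similarity": topic_model.visualize_heatmap(),
    }
    return topic_model, topics, figures
```

Each entry in `figures` is a Plotly figure, so it can be displayed interactively or written to disk for inclusion in a report.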
Based on the above analysis, the 14 user demand topics identified in this stage are considered valid. According to the topic frequency categorized by BERTopic, each topic keyword was expanded into a specific demand item. Since the first topic, “oxygen generation”, refers to the basic function of an oxygen concentrator, it was excluded from further analysis. Among the extracted topics, “sound” (Topic 04) and “noise” (Topic 10) both referred to the operational noise level of the oxygen concentrator. To validate the semantic proximity of these topics, we computed the pairwise cosine similarity score using the topic vectors generated by BERTopic. The result showed a high similarity of 0.83, confirming that the two topics largely overlapped in their semantic content and user focus. Therefore, these topics were merged into a single demand item labeled “low noise” to avoid redundancy in subsequent modeling. Finally, the frequencies were normalized to generate the initial scores for each demand topic. The results are shown in 
Table 4.
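The similarity check and merge step can be illustrated with a short sketch. The topic vectors, the counts, and the 0.8 threshold below are hypothetical: the text reports only the observed similarity of 0.83 between "sound" and "noise", not a cut-off value or the underlying vectors.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def merge_topics(counts, vectors, a, b, label, threshold=0.8):
    """Return counts with topics a and b merged under `label` if similar enough."""
    if cosine_similarity(vectors[a], vectors[b]) < threshold:
        return dict(counts)  # not similar enough: leave the topics separate
    merged = {k: v for k, v in counts.items() if k not in (a, b)}
    merged[label] = counts[a] + counts[b]
    return merged


# Hypothetical two-dimensional topic vectors and document counts.
vectors = {"sound": [3.0, 4.0], "noise": [4.0, 3.0], "price": [-4.0, 3.0]}
counts = {"sound": 120, "noise": 80, "price": 50}
merged = merge_topics(counts, vectors, "sound", "noise", "low noise")
```

Here "sound" and "noise" have a similarity of 0.96 and are merged, while "price" is orthogonal to "sound" and would stay separate under the same rule.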
  4.2. Interview-Based Demand Acquisition Using the Fuzzy Kano Model
To obtain design-related appearance demands for oxygen concentrators through user research, a fuzzy Kano questionnaire was developed based on focus group discussions. The questionnaire includes 20 typical appearance-related functional attributes. It adopts a structure combining both positive and negative formulations to facilitate later classification analysis. The full questionnaire content is provided in 
Supplementary File S1. A total of 180 questionnaires were distributed during the field study, of which 173 were returned. After removing five invalid responses, 168 valid questionnaires were retained, resulting in an effective response rate of 93%, ensuring the reliability of the survey. The collected responses were further processed and analyzed following the steps of the fuzzy Kano model. Each demand item was classified into one of four categories—Attractive (A), One-dimensional (O), Must-be (M), or Indifferent (I)—based on frequency distribution. The results are shown in 
Table 5.
After obtaining the classification frequency of each user demand item, the fuzzy membership degrees were calculated based on the proportion of each category relative to the total sample size. Specifically, the frequencies of each type were divided by the total number of valid responses (168) to compute the membership degrees for the Attractive (μ_A), One-dimensional (μ_O), Must-be (μ_M), and Indifferent (μ_I) categories. The results are shown in 
Table 6.
To further quantify the perceived importance of each user need, a weighted model was applied to construct the Fuzzy Kano Importance Index. Based on the varying impacts of different Kano categories on user satisfaction, the following weights were assigned: 0.5 for Attractive (A) needs, 1.0 for One-dimensional (O) needs, 1.5 for Must-be (M) needs, and 0 for Indifferent (I) needs. Using this weighting scheme, the Kano index for each demand item was calculated. The results were then normalized to the [0,1] range using the min–max normalization (range normalization) method. The final normalized scores and ranking results for each user need are presented in 
Table 7.
To assess the consistency and reliability of the fuzzified Kano classification results, this study calculated the standard deviation of category frequencies (A, O, M, I) for each user requirement. This deviation reflects the dispersion in participants’ responses, with a lower standard deviation indicating stronger consensus regarding the classification of a given requirement. As shown in 
Table 8, most requirements exhibit moderate standard deviation values, suggesting acceptable internal consistency.
  4.3. User Requirement Acquisition Based on Eye-Tracking Experiments
Static appearance images of oxygen concentrators collected from online platforms were organized as visual stimuli for the eye-tracking experiment. To identify user requirements related to the product, expert interviews were conducted prior to the experiment. A total of ten experts were invited, including five industry specialists and five design professionals. A focus group discussion was held to identify and screen the most representative vocabulary that captures the visual characteristics of the 12 selected oxygen concentrator samples. After thorough discussion and screening, consensus was reached on 12 core descriptive terms that best represent the design features. The final list of expert-identified terms is shown in 
Table 9.
Participants in the experiment were recruited from students and faculty members in the field of industrial design, ranging in age from 21 to 50 years. All participants had prior experience related to the use of oxygen concentrators. A total of 25 individuals took part in the preliminary experiment, including 12 males and 13 females. All participants had normal color vision (no color blindness or color weakness) and corrected visual acuity of 1.0 or higher. All participants joined the study voluntarily and were clearly informed of their right to withdraw at any stage without penalty. All data were anonymized to ensure strict protection of participants’ privacy and to maintain full confidentiality and anonymity. The experiment was carried out in the experimental building of the School of Design, Art, and Media at Nanjing University of Science and Technology. The environment was kept quiet and comfortable to prevent interference from external noise during the procedure. Eye-tracking data were recorded using a Tobii X3-120 device (Tobii Technology, Danderyd, Sweden), and the collected signals were exported and analyzed via Tobii Studio (version 3.4.8, Tobii Technology, Danderyd, Sweden). The study protocol was reviewed and approved by the Institutional Review Board of Nanjing University of Science and Technology, and written informed consent was obtained from all participants prior to the eye-tracking experiment. Prior to the formal experiment, each participant was given a detailed explanation of the procedures and key considerations. Once the participant fully understood the experimental process, the lead experimenter provided them with instructions. The full procedure included five stages: an introductory page, a visual focus calibration page, an experiment instruction page, a stimulus presentation page, and a conclusion page (as shown in 
Figure 11).
During the experiment, participants were positioned at a fixed distance of around 60 cm from the eye-tracking device and were instructed to sit upright throughout the session to ensure data consistency. Prior to the presentation of stimuli, a standard nine-point calibration procedure was performed using Tobii Studio software to ensure accurate gaze tracking. Calibration was repeated as necessary until an acceptable level of accuracy was achieved. The experiment began with an introductory screen that outlined the objectives, procedures, and participation guidelines, displayed for 20 s. This was followed by a visual fixation page, where a black cross (“+”) measuring 0.5 × 0.5 cm was displayed. Participants were instructed to focus their gaze on the center of the cross for 10 s to stabilize visual attention. Upon completion of the fixation phase, the system automatically transitioned to an instruction screen that prompted participants to observe the forthcoming product designs of oxygen concentrators and focus on the images they found most visually interesting. After 5 s, the experiment proceeded to the main stimulus presentation. Each participant was then shown a single visual stimulus page displaying 12 static images of oxygen concentrator designs. This page remained on screen for 60 s to allow free exploration based on individual interest. Upon completion, the system advanced to the concluding screen, marking the end of the session. The next participant was then invited to begin the experiment.
The experiment tested participants’ visual selective attention using the presented oxygen concentrator samples. During the viewing process, the system continuously recorded attention duration and visual hotspots within predefined Areas of Interest (AOIs). Fixation durations longer than 100 ms were considered indicators of meaningful attention. Upon completion of each session, the data were exported for further analysis. Heatmaps were generated based on fixation distributions to visualize participants’ gaze concentration, and areas with higher cumulative fixation values were identified as visual attention hotspots, as shown in 
Figure 12.
Based on the eye-tracking heatmaps, users’ visual preferences regarding the appearance of oxygen concentrators can be generally inferred. However, heatmaps alone do not provide precise fixation durations at specific points. Therefore, the Tobii Studio “Statistics” function was used to export detailed data. This included the fixation duration (in seconds) within the Area of Interest (AOI) for each of the 20 participants on each sample page. The statistical results of the visual physiological indicators from the eye-tracking experiment for all participants are shown in 
Table 10.
Before conducting statistical analysis, an outlier detection step was conducted to ensure the stability of the eye-tracking data. The analysis involved calculating the first (Q1) and third (Q3) quartiles, as well as the interquartile range (IQR = Q3 − Q1), for the fixation durations of each product sample. Data points falling outside the range of Q1 − 1.5 × IQR to Q3 + 1.5 × IQR were identified as potential outliers. This is a standard statistical approach for detecting extreme values that may distort analysis. In this study, all fixation durations fell within the expected range, and no significant outliers were found. After confirming data reliability, a one-way ANOVA was conducted to test for differences in total fixation time across the 12 oxygen concentrator designs. The result showed a significant main effect of design type (F = 15.71, p < 0.001), indicating that participants allocated significantly different levels of visual attention to different product designs.
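The two checks described above can be sketched with the Python standard library. The quartile convention (`method="inclusive"`) is an assumption, as statistical packages differ in how they interpolate quartiles, and the data in the usage note are made up for illustration rather than taken from the experiment.

```python
from statistics import mean, quantiles


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA: between-group over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)
```

For example, `iqr_outliers([1, 2, 3, 4, 5, 6, 7, 8, 100])` flags only the value 100, and `one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])` gives an F statistic of 13 with 2 and 6 degrees of freedom. In the study, each group would hold the total fixation durations recorded for one of the 12 designs.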
To make the eye-tracking results more interpretable, the fixation durations were normalized. This removed unit dependency and scaled all values to a [0,1] range. Min–max normalization was applied to the fixation durations within the AOIs of the 12 samples. The resulting values were used as the weights of corresponding user needs. While this method preserves relative differences, it is sensitive to outliers. Therefore, before applying the normalization, the raw fixation data were examined for anomalies. No significant outliers were found, and the normalized results were considered stable and reliable. 
Table 11 presents the detailed results, completing the extraction of user needs from the eye-tracking experiment and providing the initial scores for further analysis.
  4.4. Multi-Source Demand Balancing Based on Cooperative Game Theory
In the cooperative game model, the three types of information sources are regarded as game participants, denoted as source A (online reviews), source B (user interviews), and source C (eye-tracking experiments).
As noted earlier, each source provided a list of user needs with corresponding scores. Identical items were merged and deduplicated. The classical Shapley value model assumes that all participants contribute rationally and meaningfully to the coalition, which may not always reflect real-world conditions. To mitigate potential data biases, normalization and filtering were applied to remove low-confidence and zero-contribution entries before forming coalitions. Specifically, entries that received no support from any of the three data sources were defined as “low-confidence and zero-contribution” items. These entries had all scores equal to 0 and were excluded from the final integration. The resulting standardized input matrix is shown in 
Table 12. The definitions and explanations of these requirements are provided in 
Table A1 (
Appendix A).
Based on information sources A, B, and C, seven non-empty coalitions were formed, with utilities v({A}), v({B}), v({C}), v({A, B}), v({A, C}), v({B, C}), and v({A, B, C}). According to Formula (8), the utility value of each user demand within each coalition was calculated. The results are shown in 
Table 13.
Based on the Shapley value calculation (Formula (9)) and the integrated scoring formula (Formula (10)), a marginal contribution analysis and weight allocation were performed for each demand item. The resulting Shapley values and integrated scores are presented in 
Table 14.
To explore how changes in the relative importance of data sources influence user requirement prioritization, four weight scenarios were simulated: (1) text mining dominant (50%, 25%, 25%), (2) user interview dominant (25%, 50%, 25%), (3) eye-tracking dominant (25%, 25%, 50%), and (4) equal weights (33.33% each). Based on the normalized initial contribution matrix, weighted scores were computed and ranked for each requirement under the four scenarios (as shown in 
Table 15).
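The scenario analysis amounts to a fixed-weight linear fusion repeated under four weight vectors. The sketch below illustrates this using the source scores quoted in the text for three items, ordered as (reviews, interviews, eye-tracking); the function names and scenario labels are hypothetical, and the full 30-item matrix in Table 12 is not reproduced.

```python
# The four weight scenarios from the text: (reviews, interviews, eye-tracking).
SCENARIOS = {
    "text_dominant": (0.50, 0.25, 0.25),
    "interview_dominant": (0.25, 0.50, 0.25),
    "eye_tracking_dominant": (0.25, 0.25, 0.50),
    "equal": (1 / 3, 1 / 3, 1 / 3),
}


def weighted_score(scores, weights):
    """Fixed-weight linear fusion of the three source scores."""
    return sum(w * s for w, s in zip(weights, scores))


def rank_under_scenarios(matrix, scenarios=SCENARIOS):
    """Rank items from highest to lowest fused score under each scenario."""
    rankings = {}
    for name, weights in scenarios.items():
        scored = {item: weighted_score(s, weights) for item, s in matrix.items()}
        rankings[name] = sorted(scored, key=scored.get, reverse=True)
    return rankings


# Source scores quoted in the text for three demand items.
matrix = {
    "D14 Friendly appearance": (0.0, 0.71, 1.0),
    "D6 Easy operation": (0.26, 1.0, 0.0),
    "D13 Rounded corners": (0.0, 0.73, 0.0),
}
rankings = rank_under_scenarios(matrix)
```

The "equal" scenario coincides with the equal-weighted baseline of Formula (11); for "D6 Easy operation" it gives (0.26 + 1.0 + 0) / 3 = 0.42.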
Based on Formula (11), the equal-weighted fusion scores for each user requirement are presented in 
Table 16.
This study takes home oxygen concentrators as a case to construct a set of 30 user demand items, integrating scores from three heterogeneous information sources. Final rankings were derived using Shapley value analysis. 
Figure 13 compares these rankings with the initial scores, clearly showing how each demand was balanced and optimized through multi-source game-theoretic integration. The results show that “D14 Friendly appearance”, “D1 Age-friendly”, and “D6 Easy operation” are top priorities after integration. In contrast, “D11 Brand trust”, “D25 Steady appearance”, and “D9 Voice control” received low composite scores, indicating weak expression or low concern across data sources.
The equal-weighted fusion results also reveal structural differences in priority assignment compared with the Shapley-based integration. For instance, “D6 Easy operation” received a high interview score (1.0) but had low support from textual data (0.26) and none from eye-tracking. Under the Shapley method, it ranked second with a score of 0.580, as the algorithm emphasized its strong marginal contribution from interviews. However, under equal weighting, its score dropped to 0.420, lowering its rank to fourth. This demonstrates how uniform weighting can dilute a strong single-source influence when other sources are weak or missing. Meanwhile, “D12 Clear interface”, supported by both interviews (0.83) and eye-tracking (0.50), received a Shapley-integrated score of 0.458, ranking fifth. Under equal-weighted fusion, its score was slightly lower at 0.444 and its rank shifted by only one place (sixth), showing that when two sources are moderately aligned, both methods yield similar outcomes. A more divergent case is “D13 Rounded corners”, which had a strong interview input (0.73) but no signals from other sources. Its Shapley-integrated score was 0.281, placing it eighth, while equal weighting pulled it down to 0.243 (ninth), again underrepresenting a dominant single-source contribution. These findings reinforce that Shapley-based integration is better suited for asymmetric or sparse multi-source inputs, while equal weighting may oversimplify source dynamics, particularly when data heterogeneity is high.
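To make the D6 comparison concrete, the calculation can be traced by hand using Formulas (8)–(11). The derivation below is an illustrative reconstruction: it labels the sources A (reviews, 0.26), B (interviews, 1.0), and C (eye-tracking, 0), and it assumes that the empty coalition has utility 0, which the text does not state explicitly.

```latex
\begin{aligned}
&v(\{A\}) = 0.26,\quad v(\{B\}) = 1.00,\quad v(\{C\}) = 0,\\
&v(\{A,B\}) = 0.63,\quad v(\{A,C\}) = 0.13,\quad v(\{B,C\}) = 0.50,\quad v(\{A,B,C\}) = 0.42,\\[4pt]
&\varphi_A = \tfrac{1}{3}(0.26 - 0) + \tfrac{1}{6}(0.63 - 1.00) + \tfrac{1}{6}(0.13 - 0) + \tfrac{1}{3}(0.42 - 0.50) = 0.020,\\
&\varphi_B = \tfrac{1}{3}(1.00 - 0) + \tfrac{1}{6}(0.63 - 0.26) + \tfrac{1}{6}(0.50 - 0) + \tfrac{1}{3}(0.42 - 0.13) = 0.575,\\
&\varphi_C = \tfrac{1}{3}(0 - 0) + \tfrac{1}{6}(0.13 - 0.26) + \tfrac{1}{6}(0.50 - 1.00) + \tfrac{1}{3}(0.42 - 0.63) = -0.175,\\[4pt]
&\mathrm{Score}_{D6} = 0.020 \times 0.26 + 0.575 \times 1.00 + (-0.175) \times 0 \approx 0.580,\\
&\mathrm{Score}^{\mathrm{eq}}_{D6} = \tfrac{1}{3}(0.26 + 1.00 + 0) = 0.420.
\end{aligned}
```

The three Shapley values sum to 0.42, the utility of the grand coalition, as the efficiency axiom requires. The interview source carries almost all of the weight, which is why the Shapley-integrated score stays close to the interview signal while the equal-weighted score is pulled down by the missing eye-tracking support.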
Further analysis reveals that the score distributions of user demands vary significantly across the three information sources. For instance, “D14 Friendly appearance” received a perfect score of 1.0 in the eye-tracking experiment and a high score of 0.71 in user interviews, yet it was completely absent from the online review data. This suggests that although users did not explicitly express this demand, it still attracted considerable visual attention. After applying Shapley-based weighting, its overall score reached 0.722, ranking first. This demonstrates that the fusion model can successfully identify latent demands that are not visible in any single source. Similarly, “D5 Home-use” received relatively high scores across all three sources, resulting in a consistently high overall score. This highlights the importance of widely recognized and commonly agreed-upon user needs. In contrast, “D10 Compact and lightweight”, despite receiving a very low score (0.03) from online reviews, was rated higher in user interviews (0.35) and eye-tracking (0.46). Its final score of 0.153 placed it in the mid-range, suggesting that the fusion model effectively balances biased or uneven input across sources. Furthermore, “D3 Low-noise”, a common feedback topic in post-use reviews, had a relatively high score (0.47) only in online comments but minimal contributions from other sources, leading to a decreased final ranking. This reflects the Shapley method’s suppressive effect on demands with source-specific dominance. These findings suggest that the cooperative game-theoretic integration mechanism not only enhances the expression of authentic user preferences but also reveals overlooked or implicit needs.
Therefore, this multi-source data structure reflects the flow from real market usage (online reviews), to individual user perception (interviews and eye-tracking), and finally to structured design input. The extracted user requirements serve as the foundation for guiding design decisions, aligning with the view that user requirements initiate and structure the product development process. In this way, the method operationalizes the market–user–design linkage described in the theoretical framework.
5. Discussion
This study systematically extracted user demands from three heterogeneous information sources: (1) functional concerns were identified via topic modeling on online reviews using the BERTopic algorithm, resulting in 12 functional demand entries; (2) user preferences were elicited through interviews and evaluated with a fuzzy Kano model, yielding 20 subjective demand items; and (3) visual perception-related demands were obtained by analyzing gaze behavior in an eye-tracking experiment, producing 12 visually driven appearance demands. Based on the multimodal data, a cooperative game theory model was employed, treating the three sources as players in a game. Coalition utility functions were constructed, and Shapley values were applied to quantify each source’s marginal contribution to each demand item, enabling a balanced and symmetric integration framework and producing a prioritized list of all identified user requirements.
The results show that the multi-source fusion model more accurately reflects user attention and hidden preferences in complex product design. For instance, “D14 Friendly appearance”, “D12 Clear interface”, and “D10 Compact and lightweight” received lower scores in text mining but scored significantly higher in interview and eye-tracking results. Their final ranking in the integrated model highlights the method’s ability to reveal implicit needs. Additionally, the Shapley-based fusion mechanism balanced the weights across sources and preserved requirement symmetry, especially when data distributions were uneven or divergent. This supports the model’s fairness and robustness in handling multi-source preference integration.
The comparative analysis between the Shapley-based fusion model and the equal-weighted fusion method reveals clear differences in their treatment of multi-source data. Although both rely on the same inputs, the resulting scores and rankings vary notably. The equal-weighted method assumes that all sources contribute equally, which can lead to information dilution or the overemphasis of single-source input. In contrast, the Shapley-based method avoids fixed weights and evaluates the marginal contribution of each source across all possible coalitions, adjusting to the actual structure and distribution of the data. This accounts for the asymmetry commonly found in real-world data sources and identifies which sources are truly influential for each requirement. As a result, weak or missing signals do not exert excessive influence, strong but isolated signals are properly reflected in the final score, and weights follow actual usefulness rather than uniform assumptions. For instance, “D6 Easy operation” received a full score of 1.00 from interview data but only 0.26 from text mining and no input from eye-tracking. Under equal-weighted fusion, its score was diluted to 0.420, whereas the Shapley-integrated score rose to 0.580, reflecting its reliance on a dominant but valid single-source contribution. Similarly, “D13 Rounded corners”, which was supported only by interviews (0.73) and lacked any signals from other sources, received 0.243 under equal weights, while its Shapley-based score rose to 0.281, giving it relatively more weight due to the isolated but significant input.
Overall, while the Shapley-based model does not always produce higher scores, it provides a flexible approach to handling uneven contributions. This enables more representative integration in cases of information asymmetry and supports more robust user requirement prioritization in complex product design contexts.
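The marginal-contribution logic discussed above can be illustrated with a minimal exact Shapley computation for three players. This is a sketch under stated assumptions, not the study’s implementation: the coalition utility used here (the strongest signal available in a coalition) is a hypothetical stand-in, because the actual utility function is not given in this section, so the output is not expected to reproduce the reported fused scores such as 0.580. The names `shapley_values`, `max_signal_utility`, and `d6` are likewise illustrative.

```python
from itertools import permutations


def shapley_values(players, utility):
    """Exact Shapley values: average each player's marginal contribution
    over every possible ordering of the players."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += utility(with_p) - utility(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}


# Source scores for "D6 Easy operation" as quoted in the text.
d6 = {"reviews": 0.26, "interviews": 1.00, "eye_tracking": 0.00}


def max_signal_utility(coalition):
    # Hypothetical utility (an assumption for illustration): the value of a
    # coalition is its strongest signal; the empty coalition is worth 0.
    return max((d6[s] for s in coalition), default=0.0)


phi = shapley_values(list(d6), max_signal_utility)

for source, value in phi.items():
    print(f"{source}: {value:.3f}")
```

With this illustrative utility, almost all of the credit goes to the interview source, a small share goes to online reviews, and eye-tracking receives none. The three values sum to the utility of the full coalition, which is the efficiency property that makes the Shapley value attractive as an allocation rule. Enumerating all orderings is practical here because three sources yield only six permutations.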
Compared to existing research, this approach is innovative in both theory and application. Traditional methods like QFD or the Kano model [57, 58] often rely on expert judgment and limited sample sizes. While NLP-based text mining alleviates sample scarcity and subjectivity, its single-source limitation fails to capture the multi-faceted nature of user cognition. In contrast, this study introduces a cooperative game theory-based fusion mechanism centered on the Shapley value to ensure fairness, symmetry, and efficiency. The model not only quantifies the individual contribution of each source but also reveals their combined value in multi-source coalitions, thus overcoming previous methodological limitations and improving interpretability and generalizability.
The significance of this study lies in its systematic demonstration of the value of fusing heterogeneous user data for complex product demand modeling. Each source has inherent limitations. Online reviews focus on post-use feedback and may neglect emotional or visual concerns. Interviews reflect detailed preferences but are limited by variation in user expression and sample size. Eye-tracking provides objective behavior data but lacks semantic richness. Individually, none of these sources can offer a complete understanding of user needs. Through cooperative integration, this study achieves complementary fusion across perceptual, emotional, and behavioral levels, substantially increasing the depth and breadth of expressed needs. Moreover, due to structural and semantic differences among data sources, naive averaging or equal-weight fusion could lead to information redundancy or overlook specific preferences. The proposed cooperative game model mitigates this by simulating coalition utilities and allocating marginal contributions systematically, ensuring symmetry in data contribution and balance in preference representation. Such symmetry and balance are reflected not only in the treatment of data sources but also in the equal analytical depth and methodological fairness applied to different types of requirements.
In complex product design, achieving balance among multi-source user requirements is an essential research task, often referred to as constructing requirement symmetry. Only when multi-source requirements achieve a degree of symmetry can designers accurately reconcile differences across sources and efficiently advance the product development process. The results of this study also demonstrate that the heterogeneous nature of data formats often leads to asymmetry in the expression of user requirements, primarily due to variations in data acquisition methods, focus areas, and cognitive representations. Therefore, while multi-source heterogeneous requirements enrich user understanding, it remains necessary to establish a cooperative mechanism for balancing and integrating these requirements, to ensure symmetry in multi-source user requirement modeling.
Despite the demonstrated accuracy and fairness of the multi-source fusion and Shapley-based weighting, some practical challenges remain. Eye-tracking requires specialized equipment and controlled environments, while the alignment and modeling of multi-source data demand additional time and analytical effort. These factors increase methodological complexity. However, for safety-critical products like home oxygen concentrators, where user needs are sensitive and design tolerances are low, this strategic investment in early-stage modeling is justified. Compared to traditional single-source or empirical weighting methods, the cooperative game-based approach ensures scientific rigor, symmetry, and fairness in demand fusion, potentially reducing costly redesigns and enhancing user satisfaction. Thus, integrating high-precision, multi-source modeling at the design stage lays a solid foundation for downstream product optimization.
6. Conclusions
To address the asymmetry and imbalance of user requirements from heterogeneous sources in complex product design, and to improve the structural symmetry and balance of decision influence, this study proposes a multi-source demand identification and weighted ranking method that integrates natural language processing, fuzzy modeling, and cooperative game theory. First, the BERTopic model was applied to online reviews for topic modeling, extracting user concern themes related to product functionality. Then, the fuzzy Kano model was used to classify affective attributes derived from user interviews, thereby constructing structured and personalized demand items. Finally, eye-tracking experiments were conducted to capture visual attention, quantifying user preferences for appearance and interface features. Based on the structured representation of multi-source data, a cooperative game model was established to define coalition utility functions, and the Shapley value was applied to assign weights based on the marginal contributions of each source. This resulted in a symmetric and balanced integration framework for prioritizing multi-source user demands. Using an oxygen concentrator as a case study, the feasibility and practicality of the proposed method were validated in identifying multilayered user preferences, coordinating demand conflicts, and optimizing design objectives.
Compared with methods using single data sources or fixed weights, the proposed multi-source integration offers both theoretical value and practical clarity. On one hand, this method utilizes the complementary strengths of multiple data sources to identify user demands across functional, emotional, and behavioral dimensions, reducing the bias and omissions inherent in single-source approaches. On the other hand, the cooperative game model systematically evaluates the marginal utility of each source in different coalitions, mitigating the weight imbalance problem that traditional averaging methods cannot address. The main contributions are summarized as follows.
First, this study establishes a heterogeneous multi-source user demand acquisition method by integrating natural language processing, fuzzy modeling, and behavioral analysis. By combining online review, interview, and eye-tracking data, the method supports multidimensional expression of user needs, covering functional feedback, emotional preferences, and visual attention. It overcomes the limitations of traditional single-source or intuition-based methods and supports comprehensive user profiling in complex product design.
Second, the BERTopic model effectively extracts functional evaluations from user reviews after product usage; user interviews identify individualized emotional needs; and eye-tracking reveals latent perceptual preferences for interfaces and appearance. Although each source provides expressive power in its respective dimension, they also present inherent limitations—textual data is prone to emotional noise, interview data is subjective, and eye-tracking data lacks semantic interpretation. This study enhances the expressiveness and analytical depth of demand acquisition through multi-source collaboration, compensating for the limitations of any single technique.
Finally, a cooperative game theory model was introduced, using Shapley values as the weighting mechanism to address the heterogeneity of data structures, expression formats, and relative importance among sources. By constructing a coalition utility function, the marginal contribution of each information source to demand prioritization was calculated under different combinations. This enabled a fair, symmetric, and interpretable weighting mechanism based on marginal value. The method balances credibility differences and reduces redundancy, enabling unified weighting of heterogeneous data. This creates a balanced decision framework where every user requirement is fairly assessed and appropriately weighted in the final design.
The proposed framework targets the optimization of existing complex products. It is best suited for products with multiple functional subsystems, interactive components, and strict performance or safety requirements, such as medical, industrial, or high-reliability consumer products. It is not intended for early-stage concept generation or speculative design. By focusing on complex products already in use, the method leverages actual user feedback to extract needs rooted in practical experience. This ensures that the resulting requirements reflect actual usage scenarios and provide reliable input for improving performance, usability, and satisfaction. The method is especially suitable for consumer-facing products that accumulate sufficient user feedback through multiple channels, including medical devices, household electronics, and other high-use products with visible form and interactive interfaces. User requirements are integrated into the design process by ranking them through a Shapley value-based fusion model. These ranked needs directly inform design decisions related to product features, appearance, and interaction.
Despite its strengths, the method has some limitations. It requires considerable time, cost, and technical effort due to its reliance on eye-tracking, fuzzy modeling, and interviews. This may restrict its applicability in time-sensitive or resource-constrained settings. The method is most effective in stable and structured product domains, and its adaptability to fast-evolving or unfamiliar design contexts remains to be verified. In addition, the current framework mainly focuses on end users, whereas other stakeholders (such as engineers, clinicians, and maintenance personnel) also contribute important requirements that have not yet been systematically addressed.
Future research should explore ways to reduce resource demands and broaden applicability. This may include introducing lower-cost behavioral data sources or using automated algorithms to streamline preference modeling. To improve generalizability, the method could be tested in rapidly evolving product categories or emerging technologies. Expanding the framework to incorporate multi-role stakeholder data will also require new acquisition channels, conflict-resolution strategies, and flexible weighting schemes. Without these extensions, the model risks oversimplifying real-world complexity and missing critical system-level requirements.