An Approach to Discovering Product/Service Improvement Priorities: Using Dynamic Importance-Performance Analysis

The cost budget and resources of a business are limited. In order to be competitive sustainably in the market, it is necessary for a businesses to discover the improvement priorities of their product/service features effectively and allocate their resources appropriately for higher customer satisfaction. Online customer review mining has been attracting increasing attention for businesses to discover priorities of product/service improvement from online customer reviews. Despite some prior related studies, their methods have several limitations, such as simply using the frequencies of mentioned product features in reviews as an indicator of importance; neglecting the market competition; and focusing only on the static importance and performance of the target product/service features. To address those limitations, this study proposes a novel approach to discovering a product/service’s improvement priorities through dynamic importance-performance analysis of online customer reviews. It first clusters similar features into a feature group and calculate the relative performance of the feature groups using sentiment analysis. Next, the importance of each feature group’s performance to overall customer satisfaction is measured by the factor categories based on the Kano’s model. The factor categories are determined by the significance values of each feature group in both positive and negative sentiment polarities derived from the constructed decision tree. Finally, feature improvement priorities of a target product/service will be discovered based on the dynamic performance trend and predicted importance using a dynamic importance-performance analysis. The evaluation results show that the dynamic importance-performance analysis approach proposed in this study is a much better approach for product/service improvement priorities discovering than the product opportunity mining approach proposed in the prior studies. This study makes new research contributions to automatic discovery of product/service improvement priorities from large-scale online customer reviews. The proposed approach can also be used for product/service performance monitoring and customer needs analysis to improve product/service design and marketing campaigns.


Introduction
In order to be competitive sustainably in the market, businesses need to understand customers' needs and preferences continuously so as to design better products/service and gain competitive advantages.However, the cost budget and resources of a business are limited, it is necessary for businesses to discover the improvement priorities of its product/service features effectively and allocate their resources appropriately for higher customer satisfaction.
Online customer reviews (OCRs) have become a popular source for businesses to get feedback from their customers and improve their product/service quality [1].To achieve this goal, a key task is to understand the strengths and weaknesses of their current products/service based on customers' comments on products/service in OCRs [2].The ever-increasing volume and unstructured textual content of OCRs, however, make navigation and analysis of OCRs and identification of customer experience with products/service very challenging and time consuming [3,4].Therefore, there is an imperative need for effective and automated discovery of product/service features improvement priorities from OCRs.A common method for identifying customers' feeling about products/service is sentiment analysis of OCRs, which is concerned with identifying emotions and opinions from textual review contents using natural language processing, text analytics, and computational linguistics [5,6].Jeong et al. [7] classified sentiment analysis approaches into the lexicon-based [8,9], text classification-based [10], and deep learning-based approaches [11][12][13].Such automated approaches need to be adapted in different contexts [14], such as different business domains in and objectives for which OCRs are analyzed, to ensure that they provide accurate practical implications.
To address this question, first of all, it should be considered that a purpose of discovering product/service's improvement priorities is to examine the performance of a target product/service from the customers' perspective, and to discover which product/service feature(s) should be improved first.In this regard, as prior studies have noted, a gap-based approach was proposed to direct product/service quality diagnosis and improvement towards a customer-oriented focus [15,16].Similarly, a product opportunity mining approach combing the values of importance and satisfaction into a single metric was proposed [7].These studies used the sentiment scores to measure the performance of each product/service feature and used the frequencies of the mentioned product/service features to measure their importance to customer satisfaction.Then they used an algorithm considering both the performance and importance of each product/service feature to determine which product/service feature(s) should be improved first.However, these studies have several limitations.First, they merely focused on the actual performance of each product/service feature, which means that they only measured the performance of the target product/service features themselves without comparing with other products/service in the market segment.Second, they calculated the frequency of mentioned product/service features in OCRs as an indicator of importance that how important each product/service feature's performance is to overall customer satisfaction.However, it is a rough way to measure the importance.A product/service feature mentioned more does not mean that its performance has greater influence on overall customer satisfaction all the time.Maybe its performance only has greater influence on overall customer satisfaction when it is low or high.It has been recognized that the relationship between feature performance and overall customer satisfaction in the real world is nonlinear and asymmetrical [17,18].To address the limitations of these studies, importance-performance analysis (IPA) is an appropriate approach to discover the improvement priorities of each product/service feature.IPA was first introduced by Martilla and James [19] as a framework for analyzing product/service's features in order to identify the critical ones.It has been used and improved by many scholars in view of the limitations of the above studies.However, most IPAs have been based on a standardized questionnaire which inevitably requires considerable time and resources.It has not been applied to analyze OCRs and could not monitor dynamic change of the information in time.
Given these considerations, to help the businesses to be competitive sustainably in the market under certain cost budget and resources, this study proposes an approach to discovering product/service improvement priorities from OCRs using dynamic IPA.First of all, this study uses sentiment analysis to derive the feature performance of both the target product/service and its major competitors in the market segment from OCRs.The relative performances are calculated to further consider the situation of the market segment.The OCRs of the last three years are analyzed and the dynamic trends of each product/service feature's relative performance are monitored.Second, a decision tree model is constructed based on the overall customer satisfaction and the OCRs' sentiment polarities of each product/service feature.The importance of each product/service feature's performance to overall customer satisfaction are then measured by the Kano's model categories based on the significance values derived from the decision tree model.Similarly, the dynamic trends of each product/service feature's importance are monitored and the future importance can be predicted based on the dynamic trends.Finally, the relative performance and predicted importance (Kano's model categories) are subsequently used as coordinates for plotting product/service features in two-dimensions.The IPA map is graphically presented as grid divided into different quadrants and different strategies are proposed to handle features in different quadrants.
There are three elements of academic contribution of this study.First, to better discover product/service improvement priorities, the feature performance of both target product/service and its major competitors in the market segment are analyzed in this study, further considering the competition and overall situation in the market segment which most prior related studies neglected.Second, considering the real word situation that the relationship between feature performance and overall customer satisfaction is nonlinear and asymmetrical, this study derived the significance values of each product/service feature to overall customer satisfaction in both positive and negative sentiment polarities from the constructed decision tree model.Third, in order to apply IPA into OCR analysis for dynamic monitoring and more accurate product/service improvement priority discovery, the relative performance and importance are derived by sentiment analysis and decision tree modeling.The OCRs of the last three years are collected for dynamic analysis.
From a practical perspective, our approach helps the businesses to monitor the dynamic trend of customer needs and decide resource allocation and improvement priorities more effectively.It is an efficient approach for businesses to get higher customer satisfaction with limited resources and to be competitive sustainably in the market.
The rest of this paper is organized as follows.The following section describes the related works on this topic.Section 3 presents the proposed approach in detail.Section 4 uses a case study of Huawei P-series smartphones to verify the feasibility and validity of the proposed approach.Finally, implications of the findings and future directions are presented in Section 5.

Related Works
In recent years, OCRs have become a popular source for businesses to get feedback from their customers and to discover their product/service defects and improvement.Thus, this study proposes an approach to discovering product/service improvement priorities from OCRs using dynamic IPA.Three groups of related works for discovering product/service defects and improvement are reviewed here.
Firstly, several studies focused on classifying the product/service features that influence the customer satisfaction with the product/service and identifying the important ones.They used a topic model-based method, fuzzy method, or a new proposed framework to classify the product/service features extracted from OCRs and identified the important ones based on the mentioned frequencies principally.Li et al. [20] first extracted and grouped feature expressions simultaneously using a topic model-based method, and then used sentiment scores to measure customer satisfaction according to a product's different features and provide opinion summaries.Guo et al. [21] classified the features of customer service voiced by hotel visitors using latent Dirichlet analysis and further identified the important ones using perceptual mapping.Besides, Wei et al. [1] proposed an automatic HQA method based on fuzzy methods to classify the product/service features that influence the customer satisfaction for automatic hotel service quality evaluating.Kang and Park [22] proposed a framework for measuring customer satisfaction of service based on the user generated contents using sentiment analysis and VIKOR approach.However, these studies mainly focused on how to better classify the product/service features that influence customer satisfaction and identity the important ones, without considering that the important product/service features to customer satisfaction may already have quite high performance so that it is wasteful to further allocate resources to improve them.
Secondly, some studies further identified the strengths and weaknesses of the product/service derived from OCRs and decided which product/service features should be improved based on not only the importance to customer satisfaction but also the performance of each product/service feature.They used the sentiment scores calculated by sentiment analysis of OCRs to measure the performance of each product/service feature and used the frequencies of the mentioned product/service features to measure their importance to customer satisfaction.Then they used an algorithm to calculate a value considering both the performance and importance of each product/service feature to determine which product/service feature(s) should be improved first.A gap-based approach was proposed to direct product or service quality diagnosis and improvement towards a customer-oriented focus [15,16].The gap-based approach used the difference value between the customer expectation score and customer perception score of each product or service feature to identify those that are in most urgent need of improvement to yield better customer satisfaction.Similarly, a product opportunity mining approach combing the values of importance and satisfaction into a single metric was proposed to identify potential opportunities for product topics obtained from social media data from the customer perspective [7].However, although these studies further considered the performance of each product/service feature, they merely measured the performance of the target product/service features themselves without comparing with other products/service in the market segment.Neglecting the situation of the market segment may easily lead to improper decision-making.Besides, they used the frequencies of the mentioned product/service features to measure their importance to customer satisfaction roughly.It has been recognized that the relationship between feature performance and overall customer satisfaction in the real world is nonlinear and asymmetrical [17,18].It is necessary to differentiate the importance of each product/service feature to customer satisfaction when their performance are high and low.
Thirdly, before OCR mining, several studies have proposed some approaches to get feedback from their customers and discover their product/service defects and improvement based on the data collected by questionnaires.Importance-performance analysis (IPA) is one of the most effective ones.IPA was first introduced by Martilla and James [19] as a framework for analyzing product/service features in order to identify the critical ones.IPA measures the performance and importance of each product/service feature based on the scores filled in the questionnaires by customers and the calculated mean performance and importance scores are subsequently used as coordinates for plotting product/service features in two dimensions.An IPA map is graphically presented as a grid divided into four quadrants and different strategies will be proposed to handle features in different quadrants.
After decades of in-depth research on IPA, a lot of researchers have proposed methods for improving IPA from both importance and performance perspectives.More and more researchers derived importance of product/service features to customer satisfaction based on the three-factor theory developed by Kano et al. [17].The theory states that the basic factors of customer satisfaction are minimum requirements.Customers are highly dissatisfied if the basic factors are not fulfilled.On the contrary, excitement factors of customer satisfaction are features that increase customer satisfaction greatly if they are fulfilled, but cause no dissatisfaction if they are not fulfilled.Furthermore, the performance of the performance factors of customer satisfaction have a linear correlation with customer satisfaction.The three-factor theory further expanded to five factors afterwards, and it is shown in Figure 1.They derived the importance of product/service features to customer satisfaction based on three-factor theory by multiple regression [23][24][25][26][27], partial correlation [28][29][30][31], explicit and implicit importance matrix [32][33][34], or the ratio of satisfied coefficient and dissatisfied coefficient [35].As for the revision of feature performance, Deng et al. [36] derived the performance of target product/service features relative to the best competitor.Taplin [37] used benchmarking against competitors to derive relative product/service feature performance.However, although IPA has been proved to be an effective approach to discover the improvement priorities of the product/service features, few studies applied it in OCR analysis [38,39].IPA based on the data collected by questionnaires inevitably requires considerable time and resources, and it cannot monitor dynamic change of the information in time.The comparison of traditional IPA based on questionnaires and dynamic IPA based on OCRs is shown in Table 1.It can be seen obviously from Table 1 that dynamic IPA is better than traditional IPA in several aspects.Therefore, in order to apply revised IPA into OCR analysis for automatic improvement priorities discovering and dynamic monitoring, it is a good idea to choose an appropriate method above to derive revised importance and relative performance of each product/service feature which can also be realized by text mining and processing.

Proposed Approach
The approach to discovering product/service improvement priorities from OCRs using dynamic IPA is built on three modules: sentiment analysis, decision tree modeling and IPA.First of all, this study uses sentiment analysis to derive the feature performance of both the target product/service and its major competitors in the market segment from OCRs.The relative performances are calculated by dividing the target product/service feature performance by the average feature performance of the major products/service in the market segment.OCRs from the past three years are analyzed and the dynamic trends of each product/service feature's relative performance are monitored.Second, decision tree model is constructed based on the overall customer satisfaction and the OCRs' sentiment polarities of each product/service feature.The significance values of each product/service feature to overall customer satisfaction in both positive and negative sentiment polarities are calculated.The importance of each product/service feature's performance to overall customer satisfaction is then measured by the Kano's model categories based on the significance values derived above.Similarly, the dynamic trends of each product/service feature's importance are monitored and the future importance can be predicted based on the dynamic trends.Finally, the relative performance is divided into four groups based on their value sizes.The relative performance and predicted importance (Kano's model categories) are subsequently used as coordinates for plotting product/service features in two-dimensions.The IPA map is graphically presented as grid divided into 16 areas and different strategies are proposed to handle features in different areas.The process of the proposed approach is shown in Figure 2.

Step 1: Data Collection and Processing
As the proposed approach not only focuses on the target product/service itself, but also considers its competitors in the market segment, the first step is to collect OCRs of the target product/service and its competitors in the market segment.Both the overall customer satisfaction rating and reviews need to be collected.These OCRs can be obtained via various online channels, such as e-commerce websites, online customer communities, and social media.Besides, the proposed approach considers the dynamic change of OCRs, so that the OCRs of products/service of recent generations (the last three years) should be collected as well.
Once a set of OCRs mentioned above has been prepared, the keywords of product/service features, opinions' modifiers and opinions are extracted from OCRs, such as (screen, very, clear).Stanford CoreNLP is used to split sentences, segment words, POS tag and parse sentences, which is one of the most used natural language analysis toolkits.It is a fairly small and self-contained natural language analysis system that is easy to use.Furthermore, most users benefit greatly from the provision of a set of stable, robust, high quality linguistic analysis components, which can be easily invoked for common scenarios [40].
In order to better extract the keywords of product/service features, opinions' modifiers, and opinions in Chinese OCRs, these keywords are extracted according to the extract rules of Language Technology Platform (LTP), which is an appropriate tool for Chinese natural language processing developed by Harbin Institute of Technology [41].The tags of LTP's sentences parsing are shown in Table 2.As the tags of Stanford CoreNLP's sentence parsing are similar with LTP's, the extract rules of LTP can be applied into Stanford CoreNLP with some adjustment.There are two extract rules of LTP [41].First, if the sentences parsing tag of a sentence is "SBV", "ATT", or "CMP", the noun, gerund, or verb in the sentence are the keywords of product/service features and the corresponding adjectives are the keywords of opinions.Second, if the sentences parsing tag of a sentence is "ADV", the corresponding adverb of the opinion keyword is its modifier.Based on these two extract rules, a set of vectors (feature, modifier, opinion) can be constructed from the OCRs.
Furthermore, the performance of each product/service feature are measured by the sentiment scores S of the vectors (feature, modifier, opinion).According to Wu et al. [41], the opinion word's sentiment score S 1 (o) is calculated based on the similarity between the opinion words and the benchmark words which have obvious sentiment polarities as where k represents k pairs of benchmark sentiment words derived from HowNet including one positive word and one negative word; key − p i represents a positive word and key − n j represents a negative word.α and β represent adjustable parameters to adjust the accuracy of the algorithm.Sim(key − p i , W) and Sim(key − n j , W) represent the similarity between the opinion word and the positive benchmark word and the similarity between the opinion word and the negative benchmark word respectively which is calculated based on Tongyici Cilin [42].Tongyici Cilin includes not only the synonyms of a word, but also some generalized related words.The amount of the words included in it have been expanded to 70,000 by Harbin Institute of Technology.
Besides, considering the modifier of the opinion word, this study calculates its modifying strength according to its similarity to the benchmark word like the methodology mentioned above.Therefore, if both opinion word and its modifier appear in the vector (feature, modifier, opinion), the sentiment score S of this vector is calculated by formula [41] or else S = S 1 .
where S 2 represents the modifying strength of the opinion's modifier.S 1 represents the sentiment score of the opinion word calculated by formula (1).∆µ represents displacement index and is set as 0.5 here.In order to calculate conveniently, the modifying strengths of modifiers are divided into different degrees with a benchmark word in each degree as shown in Table 3.When there are other modifiers, the similarity between these modifiers and benchmark words are calculated according to the similarity formula.The modifying degree of the benchmark word with the biggest similarity is used as the modifying degree of the modifiers.Besides, when there are negative modifiers, the sentiment score S = −S 1 .After the process above, a set of vectors (feature, sentiment score) can be obtained and the performance of each product/service feature are measured by these sentiment scores.

Step 2: Performance Analysis
In this step, group-based sentiment scores are calculated to measure the performance of each target product/service's feature group.The product/service features in the vectors obtained in Step 1 are divided into different groups according to the product/service hierarchy manually.After dividing the product/service features into groups, the average sentiment score of product/service features in each group is calculated as the performance of each target product/service's feature group as where P xi represents the performance of the target product/service x's ith feature group, P xik represents the performance of the target product/service x's kth feature belonging to the ith feature group, n represents the amount of the features belonging to the ith feature group.As mentioned above, the OCRs of the target product/service and its main competitors in the market segment are collected.Both the performance of the target product/service and the major products/service in the market segment are calculated.Relative performance is analyzed here rather than actual performance which neglects the comparison with target product/service's major competitors in the market segment.The relative performance is calculated by Formula (4), developed by this study as where RP xi represents the relative performance of the target product/service x's ith feature group, P C j i represents the performance of the jth competitor's ith feature group, ω C j i represents the market share of the jth competitor, m represents the amount of competitors in the market segment.
Relative performance of target product/service's feature groups in the last three years is calculated and the dynamic trends of each feature group's relative performance are drawn in the line chart.Problems of the target product/service can be detected by the identification of negative trends.Furthermore, performance comparisons of target product/service's underperforming feature groups among the target product/service and its main competitors in the market segment are delineated by radar map in detail.

Step 3: Importance Analysis
In this step, considering the real word situation that the relationship between feature performance and overall customer satisfaction is nonlinear and asymmetrical, this study derived the significance values of each product/service feature to overall customer satisfaction in both positive and negative sentiment polarities from the constructed decision tree model.The importance of each product/service feature's performance to overall customer satisfaction are then measured by the Kano's model categories based on the significance values derived above.
The constructed decision tree model allows the consideration of the influence of the presence/absence of positive/negative OCRs of each product/service feature groups on overall customer satisfaction.In other words, this approach is able to identify non-linear dependencies between overall customer satisfaction and satisfaction with each product/service feature group.The decision tree model also allows the detection of the most significant product/service feature groups that are essential for the customers [43].
The decision tree in our research is constructed based on a C4.5 algorithm which is one of the most commonly-used algorithms in decision tree modelling proposed by Quinlan [44].The independent variables of the decision tree are the customer satisfaction with each product/service feature group (presence/absence of positive and negative OCRs on each product/service feature groups) and the dependent variable of the decision tree is the overall customer satisfaction with the product/service (positive/negative).An example of the decision tree model's data set is shown in Table 4. Therefore, nodes of the decision tree are the variables of positive and negative OCRs on the feature groups.Edges of the tree are the values of the variables of positive and negative OCRs on the feature groups, i.e., 1 is presence, 0 is absence.Leaves present the sentiments of overall customer satisfaction in the OCRs, i.e., each branch leads to either a positive or a negative customer satisfaction.According to the C4.5 algorithm, the decision tree model is constructed according to gain ratio as The alternative classifying variables are chosen from the variables whose information gain are higher than the average information gain first and then the variable whose gain ratio is the highest is chosen to be the classifying variable.Based on this classifying principle, the decision tree model is constructed and the rules of the decision tree can be extracted.One example of the constructed decision tree model and its rules is shown in Figure 3. Besides, the significance of each product/service feature group can be calculated.Significance of each feature group shows how much the sentiment of an OCR depends on the feature group in positive and negative sentences.Setting the amount of feature groups as g/2, then the amount of independent sentimental variables is g.According to Yussupova et al. [45], the significance of variable m is calculated as where k l represents the amount of nodes that were split by feature group l, E l,j represents entropy of the parent node, split by feature group l, E l,j,i represents entropy of the child nodes for j, which was split by feature group l, Q l,j , Q l,j,i represents the amount of examples in the corresponding nodes, q l,j represents the amount of child nodes for j parent node.Furthermore, according to each feature group's significance values in both positive and negative sentiment polarities, the significances of each product/service feature performance on overall customer satisfaction are converted into the categories of the Kano's model (shown in Figure 1) as each product/service feature's importance to customer satisfaction.According to Füller and Matzler [35], if the significance value of positive sentiment polarity is significantly smaller than that of negative sentiment polarity, the feature group is classified into basic factor of customer satisfaction.If the significance value of positive sentiment polarity is similar to that of negative sentiment polarity, the feature group is classified into performance factor of customer satisfaction.If the significance value of positive sentiment polarity is significantly bigger than that of negative sentiment polarity, the feature group is classified into excitement factor of customer satisfaction.If both the significance value of positive sentiment polarity and negative sentiment polarity are close to 0, the feature group is classified into indifferent factor of customer satisfaction.
Similarly, the dynamic trend of the target product/service feature's importance to customer satisfaction as well as that of the market segment can be analyzed.From the dynamic trend, the future importance of each product/service feature group to customer satisfaction can be predicted.

Step 4: Importance-Performance Analysis
In this step, the levels of relative performance of product/service feature are divided into four levels as 'very low', 'low', 'high', and 'very high' and feature groups were placed in twelve regions according to Lai et al. [33], as shown in Table 5. Different resources allocation strategies are recommended according to the relative performance and importance of the feature groups.For the basic factors of customer satisfaction, when the level of relative performance is 'very low', it is recommended that businesses should make more efforts to improve them.This is because the overall customer satisfaction is really low when the level of performance of basic factors is under the average level.When the level of relative performance is 'low', placing low priority on efforts to improve them is recommended.When the level of relative performance is 'high' or 'very high', it is recommended that businesses should 'reduce' or 'slightly reduce' their resources to them because exceeded performance does not increase customer satisfaction any more.For the performance factors of customer satisfaction, when the level of relative performance is 'very low' or 'low', it is recommended that businesses should improve resources or place low priority on efforts to improve their performance.This is because the efficiency to improve the overall customer satisfaction by improving performance factors is not low.When the level of relative performance is 'high' or 'very high', it is recommended that businesses should keep or reduce resources to improve their performance, because the efficiency to improve the overall satisfaction by improving performance factors is not high enough and there may be better way to improve the overall customer satisfaction by same amount of resources.There are two resources allocation strategies for the excitement factors of customer satisfaction when the level of relative performance is 'very low' or 'low'.If firms have sufficient resources, it is recommended that they should make more effort to improve or largely improve their performance.This is because customers will only be satisfied when the level of performance of excitement factors exceeds the average level.
On the other hand, if firms only have limited resources, firms should utilize their resources more effectively and they should concentrate their efforts on improving performance factors and keeping the performance of these excitement factors at the current level.Besides, when the level of performance is 'high' or 'very high', it is still necessary to keep or place a low priority on efforts to improve them, because there is still room for increasing satisfaction.However, considering that there may be some product/service features belonging to indifferent factors of customer satisfaction, whose performance have no influence on overall customer satisfaction.For indifferent factors, it is recommended that businesses should reduce resources allocated on them no matter what performance levels of these product/service features are.Any resources allocated to improving these product/service features are wasteful as the performance of these product/service features cannot influence the overall customer satisfaction.

Case Study: Huawei P-Series Smartphones
In this section, in order to verify the feasibility and validity of the proposed approach, its process is illustrated using OCRs related to Huawei P-series smartphones and their major competitors in the market segment in the last three years.As smartphones have various features and a huge number of customers actively discuss it online, it is an appropriate product to demonstrate the improvement priorities discovering process clearly.Huawei P-series smartphones are popular smartphones and appropriate for the case study.

Data Collection and Processing
As Jingdong (https://www.jd.com) is one of the most commonly used e-commerce websites for customers to buy digital products in China and there are large-scale relatively effective OCRs, it is used as the data source of this case study.
To analyze the performance and importance of a target product and its major competitors in the market segment, based on the market positioning and pricing of Huawei P-series smartphones, the OCRs of Huawei P-series smartphones and the top sellers in the 2500-3500 RMB market segment in the last three years are collected.The amounts of each product's OCRs collected are shown in Tables 6  and 7.
Table 6.Amounts of the OCRs collected of Huawei P-series smartphones.

Amount of Performance Analysis OCRs Amount of Importance Analysis OCRs
Huawei P9 500 500 Huawei P10 500 500 Huawei P20 500 500 Stanford CoreNLP is used to split sentences, segment words, POS tag and parse sentences, and keywords of product features, opinions, and opinions' modifiers are extracted from the OCRs.The sentiment score of the vector (feature, modifier, opinion) is calculated according to Formulas ( 1) and ( 2) to measure the performance of each product feature.Some samples of Huawei P20's extracted keywords and their sentiment scores are shown in Table 8.In order to calculate the total sentiment score of the market segment, the market share based on the OCRs amounts are used as each product's weight.As most of the OCRs of a product are positive, to calculate the significance of each product's feature group in both sentiment polarities accurately, equal amounts of positive and negative OCRs were analyzed.

Performance Analysis
In order to derive the performance of the smartphone's feature groups, the features are classified into feature groups.However, as the length of most sentences in Jingdong's OCRs are short, the classification effects of the automatic algorithms completely dependent on computer are bad.Therefore, this study classified the features into feature groups manually according to Li et al. [46], which are shown in Table 9.This study used some of the OCRs' product features randomly as training samples to establish more accurate classification standards, and the rest of the product features are classified into feature groups automatically based on the classification standards established before.
The sentiment scores of the smartphone's feature groups are calculated according to Formula (3) to measure the performance of each smartphone feature group.The actual performance of Huawei P-series smartphones' feature groups and the 2500-3500 RMB market segment's feature groups in the last three years are shown in Tables 10 and 11.The higher the value of actual performance is, the better the actual performance is.The negative value of actual performance means that customers are dissatisfied with the performance of the product feature and the lower the negative value of actual performance is, the more dissatisfied the customers are.Besides, if the value of the product feature's actual performance is 0, it means that the product feature is not mentioned in the OCRs.
To analyze the dynamic actual performance trend of Huawei P-series smartphone feature groups, Figure 4 shows a line chart of their actual performance: It can be seen from the line chart that the actual performance of "Basic function" improved in 2017 but decreased recently.The actual performance of "Performance", "Customer feedback" kept decreasing from 2016 to 2018.These feature groups should be given more attention.
In order to compare with the major competitors in the market segment, the relative performance of Huawei P-series smartphones' feature groups is assessed according to Formula (4), and the results are shown in Table 12.If the values of the actual performance of both the target product feature and the market segment feature are positive, the higher the value of relative performance is, the better the relative performance of the target product feature is.On the contrary, if the values of the actual performance of both the target product feature and the market segment feature are negative, the lower the value of relative performance is, the better the relative performance of the target product feature is.Besides, when the value of relative performance is negative, the relative performance of the target product feature is high if the value of the actual performance of the target product feature is positive, while the relative performance of the target product feature is low if the value of the actual performance of the target product feature is negative.
To analyze the dynamic relative performance trend of Huawei P-series smartphones' feature groups intuitively, a line chart is drawn based on their relative performance in Figure 5. From the line chart can be seen that Huawei P-series smartphones' relative performance of 'beautify', 'basic function', 'data function', and 'entertainment function' improved in 2017 but decreased recently while only "basic function" can be found this trend in the dynamic actual performance trend.Huawei P-series smartphones' relative performance of 'performance' and 'customer feedback' kept decreasing from 2016 to 2018, which is consistent with the dynamic actual performance trend.These underperforming feature groups should be given more attention.Furthermore, to mine the details of these underperforming feature groups' comparison with the major competitors in the market segment, a radar map of these underperforming feature groups is shown in Figure 6.
As shown in Figure 6, Xiaomi MIX2s performs pretty well in all these feature groups.The actual performance of Huawei P20 in these feature groups are similar to other competitors in the market segment except 'performance', which is significantly worse.Therefore, it is necessary for Huawei to improve its P20 smartphone's 'performance' to catch up with its competitors.

Importance Analysis
By analyzing significance value of product's feature groups on overall OCR's sentiment in both negative and positive sentiment polarities, the importance of each feature group is converted into the Kano's model categories according to each factor category's characters.The importance analysis results of Huawei P-series smartphones in the last three years are shown in Tables 13-15.a large amount of resource allocated on it is just wasteful.Besides, as the predicted importance of Huawei P-series smartphones' 'screen', 'phone accessories', 'performance', and 'cost performance' are 'performance factor', resource allocation should be just put low priority if its performance is 'low' because there are some feature groups belonging to 'excitement factor' which are better choice to improve customers' satisfaction sharply.However, if its performance is 'very low' such as 'performance', it is still a suboptimal choice to improve the resource allocated on it.However, if its performance has already been 'high', the resource allocated on it should just be kept such as 'cost performance'.Furthermore, as the predicted importance of Huawei P-series smartphones' 'hardware', 'basic function', and 'customer feedback' are 'excitement factor' and their performances are 'very low' or 'low', it is necessary to largely improve or improve the resource allocated on them.Because it is easy to improve the customers' satisfaction sharply if the performance of them exceed the average value.However, as the predicted importance of Huawei P-series smartphones' 'appearance design' and 'photograph function' are 'excitement factor' and their performances are 'high', resource allocation on them should be put low priority compared with 'hardware', 'basic function', and 'customer feedback'.

Evaluation
From Section 4.1 to Section 4.4, the improvement priorities of Huawei P20's 13 feature groups are discovered, so that the proposed approach is proved feasible.In order to further verify the validity of the proposed approach, the product opportunity mining approach proposed in prior studies and the approach proposed in this study are used to discover the improvement priorities of Huawei P10's 13 feature groups for comparison.It can be seen in Table 7 that the market share of Huawei P-series smartphone increased from 13.05% in 2016 to 34.48% in 2018.Therefore, it can be believed that the actual decision of Huawei P10's improvement priorities are effective, so that it can be the ground truth to justify which approach works better.
According to the product opportunity mining approach, the improvement priorities of feature groups are determined by the value calculated by the opportunity algorithm as [7] Opportunity = Impor tan ce + Max(Impor tan ce − Satis f action, 0) The results of Huawei P10's improvement priorities using the product opportunity mining approach are shown in Table 19.According to the approach proposed by this study, the results of Huawei P10's improvement priorities are shown in Table 20.need of analysis.The levels of performance and importance of each feature group are then calculated.The performance of the feature groups is measured by the average sentiment scores of each feature group calculated based on the similarity of the benchmark words using sentiment analysis.What's more, relative performances are calculated as the ratio of target product/service's performance to the market segment's performance, considering the comparison with the major competitors in the market segment.The importance of the feature groups is measured based on Kano's model deriving from decision tree modeling.Finally, the improvement priorities of target product/service' feature groups are discovered from the dynamic performance trend and predicted importance using dynamic importance-performance analysis.The functionality of the approach was demonstrated herein using the data of Huawei P-series smartphones and their competitors in the market segment on Jingdong, one of the major e-commerce websites in China, between 2016 and 2018.The improvement priorities of Huawei P-series smartphones' 13 feature groups were discovered through this case study and the approach proposed in this study was proved to be a much better approach for product/service improvement priorities discovering than the product opportunity mining approach proposed in the prior studies.
Besides, as mentioned in the Introduction section, the cost budget and resources of a business are limited, it is necessary for businesses to discover the improvement priorities of its product/service features effectively and allocate their resources appropriately for higher customer satisfaction.This research targeted 2500-3500 RMB market segment in the case study because it is more consistent with this situation, so that it is more suitable to verify the feasibility and validity of the proposed approach.In this market segment, the cost budget and resources of a business are limited, but it still has space and it is necessary to consider how to allocate these cost budget and resources on each product feature for higher customer satisfaction.However, if this research targets low end market segment, the cost budget and resources of a business may be too little to further consider how to allocate them in detail.The proposed approach may be less practical in this market segment.On the contrary, if this research targets high end market segment, the cost budget, and resources of a business are more sufficient, the business may just allocate the cost budget and resources on each product feature to make its product perfect.The proposed approach may be less practical in this market segment as well.However, considering the principle of profit maximization, businesses in each market segment try to satisfy their customers with less cost, so that our approach is still practical in each market segment.The proposed domain-independent approach contributes to the effective resource allocation and improvement priorities discovering across various domains, including not only products but also services, using the OCRs related to the target.It thereby assists businesses to improve customer satisfaction as much as possible under certain cost constraints and resources.
We expect that this study will make both academic and practical contributions to relevant fields.From an academic perspective, the proposed approach applies dynamic importance-performance analysis into OCRs analysis for improvement priorities discovering.Some limitations of the prior studies were revealed.
First, most of the prior studies focused only on the actual performance of the target product/service, while neglecting the comparison with the major competitors in the market segment.Compared to them, this study analyzes the performance of target product/service's major competitors in the market segment as well and derives the relative performance of the target product/service.
Second, most of the prior studies derived the importance of the target product/service just based on the frequency of the mentioned features, leading to inaccurate importance judgment sometimes.Compared to them, this study uses decision tree modeling to derive the significance of each feature group in both positive and negative sentiment polarities and converts them into Kano's model categories.Based on Kano's model, feature groups belonging to different factors have different relationship between product/service performance and customer satisfaction.Thus, converting the importance of target product/service into Kano's model categories helps to satisfy the customers more effectively.Third, our approach applies dynamic importance-performance analysis into OCRs mining rather than analyzing the data collected from traditional questionnaires.Therefore, our approach can monitor the trends of customer needs in time and predict their needs more accurately.
From a practical perspective, our approach can be implemented as a software system for businesses.Because recent customer needs are more dynamic and businesses can only improve their products/service under certain cost budget and resources, our approach helps the businesses to monitor the dynamic trend of customer needs and decide resource allocation and improvement priorities more effectively.It is an efficient approach for businesses to get higher customer satisfaction with limited resources and to be competitive sustainably in the market.
Despite the contributions made by this study, further work still needs to be completed.First, in the proposed approach, the classification of features requires some degree of manual processing because it is inaccurate to classify totally by computers.Therefore, in future work, better classification algorithms need further research.Second, as there are insufficient effective features comparison opinion in OCRs, our approach can only calculate the relative performance as the ratio of target product/service's performance to market segment's performance.Thus, in future works, how to extract more effective features comparison opinion from OCRs and derive relative performance from them needs further research.Finally, our approach was applied to one example target product, but it has the potential to be applicable to various domains, such as service.Therefore, application studies in different domains will be conducted in further works.

Figure 1 .
Figure 1.Kano's model of customer satisfaction.

Figure 2 .
Figure 2. Process of the proposed approach.
) where E(D) represents the information entropy of the data set, the proportion of the ath sample group is p a and m represents the variables of positive and negative OCRs on the feature groups.Variable m has v possible values {m 1 , m 2 , . . ., m v } (v is two in this model, presence or absence) and D v represents the samples whose value of m are m v .Gain(D, m) represents the information gain of variable m and IV(m) represents the intrinsic value of variable m.

Figure 3 .
Figure 3.One example of the constructed decision tree model and its rules.

Figure 4 .
Figure 4. Dynamic actual performance trend of Huawei P-series smartphones' feature groups.

Figure 5 .
Figure 5. Dynamic relative performance trend of Huawei P-series smartphones' feature groups.

Table 1 .
Comparison of traditional IPA and dynamic IPA.

Table 3 .
Benchmark words of modifiers in each modifying degree.

Table 4 .
Example of the decision tree model's data set.

Table 5 .
Resources allocation of improved importance-performance analysis.

Table 7 .
Amounts of the OCRs collected of the 2500-3500RMB market segment in the last three years.

Table 8 .
Samples of Huawei P20's extracted keywords and their sentiment scores.

Table 9 .
Feature groups of a smartphone.

Table 10 .
Actual performance of Huawei P-series smartphones' feature groups in the last three years.

Table 11 .
Actual performance of the 2500-3500 RMB market segment's feature groups in the last three years.

Table 12 .
Relative performance of Huawei P-series smartphones' feature groups in the last three years.

Table 13 .
Importance analysis results of Huawei P9's feature groups.

Table 14 .
Importance analysis results of Huawei P10's feature groups.

Table 15 .
Importance analysis results of Huawei P20's feature groups.

Table 19 .
Analysis results using the product opportunity mining approach.