Decision-Making Model of Performance Evaluation Matrix Based on Upper Confidence Limits

Abstract: A performance evaluation matrix (PEM) is a tool for assessing customer satisfaction and the importance of service items across various services. However, inferences based on point estimates of sample data can increase the risk of misjudgment due to sampling errors. Thus, this paper creates a decision-making model for a performance evaluation matrix based on upper confidence limits to support performance evaluation and decision making in various service operating systems. The concept is that the gap between customer satisfaction and the level of importance of each service item identifies the critical-to-quality (CTQ) service items requiring improvement. Many studies have indicated that customer satisfaction and the importance of service items follow a beta distribution.


Introduction
Many studies have indicated that the performance evaluation matrix (PEM) is a tool for evaluating customer satisfaction and the importance of service items within various service operation systems [1][2][3]. Some studies have pointed out that evaluations based on sample point estimates can increase the risk of misjudgment due to sampling errors [4,5]. Therefore, this paper creates a decision-making model for a performance evaluation matrix based on upper confidence limits to support performance evaluation and improvement decisions in various service operating systems. The concept is that the gap between customer satisfaction and the level of importance of each service item identifies the critical-to-quality (CTQ) service items requiring improvement. Meanwhile, many studies have used the PEM to evaluate whether the performance of various service systems meets requirements and to decide, for each service item, whether improvements are needed, whether the status quo should be maintained, or whether resource transfer is recommended [6][7][8][9][10]. As in many other studies, and to maintain generality, this paper assumes that a service system provides q service items. Then, 2q questions are designed to conduct a survey of learners about satisfaction and importance [11]. As noted by Chen et al. [12], Lambert and Sharma [13] used the Likert scale to collect data on customer satisfaction and the importance of the q service items. The horizontal axis of the PEM represents customer satisfaction, and the vertical axis represents the importance of service items. The PEM is then divided into three areas: improvement, maintenance, and resource transfer. Hung, Huang, and Chen [1] revised the placement of the three performance areas in the above-mentioned PEM, making the evaluation rule more reasonable.
Because a beta-distributed variable takes values between zero and one, Hung et al. [1] asserted that customer satisfaction and the importance of service items should follow a beta distribution. Based on the two parameters of this distribution, they proposed standardized indices of importance and satisfaction. Both index values lie between zero and one. The satisfaction index ranges from zero, representing zero percent satisfaction, to one, representing one hundred percent satisfaction. Likewise, the importance index ranges from zero, representing zero percent importance, to one, representing one hundred percent importance.
According to Liu et al. [2], let the random variable $X_h$ represent the satisfaction with the $h$th service item. Then $X_h$ follows a beta distribution with parameters $a_h$ and $b_h$, denoted by $X_h \sim \mathrm{Beta}(a_h, b_h)$, and the satisfaction index is the mean of this distribution, $\theta_h = a_h / (a_h + b_h)$. The importance index $\gamma_h$ is defined in the same way from the beta distribution of the importance of the $h$th service item. Based on the above information, the values of these two indices are both between zero and one.
When the proportion of highly satisfied customers is higher, the value of parameter $a_h$ is higher than that of parameter $b_h$, and the value of the satisfaction index is higher as well. Likewise, when the proportion of customers with low satisfaction is higher, the value of parameter $a_h$ is lower than that of parameter $b_h$, and the value of the satisfaction index is lower. When the value of parameter $a_h$ is equal to that of parameter $b_h$, the value of the satisfaction index is exactly one-half. The importance index has the same characteristic as the satisfaction index. Thus, these two indices can reasonably reflect the distributions of customer satisfaction and the importance of service items.
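The relationship between the two beta parameters and the index value can be checked numerically. The following is a minimal sketch, assuming the index is the mean of the beta distribution, a / (a + b); this assumption matches the properties described above (for example, a value of one-half when the two parameters are equal). The function name and the parameter values are hypothetical and serve only as an illustration.

```python
def beta_mean_index(a: float, b: float) -> float:
    """Mean of a Beta(a, b) distribution, used here as an assumed form of the index."""
    if a <= 0 or b <= 0:
        raise ValueError("beta parameters must be positive")
    return a / (a + b)


# Hypothetical parameter pairs chosen to illustrate the three cases:
# a > b (mostly satisfied), a < b (mostly dissatisfied), a == b (balanced).
for a, b in [(8.0, 2.0), (2.0, 8.0), (3.0, 3.0)]:
    print(f"a={a}, b={b} -> index={beta_mean_index(a, b):.2f}")
```

Under this assumption, the index rises above one-half when the first parameter dominates and falls below one-half when the second parameter dominates.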
Some papers set the satisfaction index as the x-axis and the importance index as the y-axis to form the performance evaluation matrix; at the same time, based on the concepts of total quality management and a continuous improvement process, they use the mean value of the index as the standard of evaluation [14,15]. Then, based on the critical value derived from the statistical test, they form the evaluation area and the evaluation rule of the PEM. The critical value is affected by the sample size n and by the variance of the index estimator, so the critical values of the q service items differ. As a result, the minimum value is chosen to represent these q critical values. Although this method overcomes the problem of inconsistent evaluation criteria, it creates another problem, namely inconsistency in the level of significance. To tackle this issue, this paper instead uses the upper confidence limit of the satisfaction index of each of the q service items to test whether the value of the satisfaction index is larger than the mean value $\theta_0$, and then to decide whether to improve the satisfaction level of the service item. Next, we use the upper confidence limit of the importance index of each of the q service items to test whether the value of the importance index is larger than the mean value $\gamma_0$, and to determine the order of improvement priority. This method is based on the statistical test and hence can lower the risk of misjudgment caused by sampling errors.
The remainder of this paper is organized as follows. Section 2 derives the $100(1-\alpha)\%$ upper confidence limits of the two indices based on sample data. Section 3 uses the mean value of the satisfaction index $\theta_0$ and the mean value of the importance index $\gamma_0$ as the two dividing lines of the PEM, which split it into four evaluation quadrants. Then, we use the upper confidence limits of the two indices as the evaluation coordinates $(x_h, y_h)$ of each service item, and based on where the coordinates $(x_h, y_h)$ are located in the PEM, we can evaluate whether a service item needs improvement. When resources are limited, we can set the order of improvement priority. Section 4 uses a case study to illustrate an application of the proposed model, showing how to establish the evaluation coordinates, how to identify service items requiring improvement, and how to set the order of improvement priority when resources are limited. Section 5 presents the conclusions.

100(1 − α)% Upper Confidence Limits
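As stated, because the satisfaction index and the importance index contain unknown parameters, they need to be estimated from the sample data of respondents. Let $(X_{h,1}, X_{h,2}, \ldots, X_{h,n})$ be a random sample of satisfaction with service item $h$, where $h = 1, 2, \ldots, q$. Then, the unbiased estimator of the satisfaction index $\theta_h$ is written as follows:

$$\hat{\theta}_h = \bar{X}_h = \frac{1}{n}\sum_{j=1}^{n} X_{h,j}.$$

The expected value of $\hat{\theta}_h$ is equal to $\theta_h$, which is expressed as follows:

$$E\left[\hat{\theta}_h\right] = \frac{1}{n}\sum_{j=1}^{n} E\left[X_{h,j}\right] = \theta_h.$$

In addition, the sample standard deviation is written as follows:

$$S_{X_h} = \sqrt{\frac{1}{n-1}\sum_{j=1}^{n}\left(X_{h,j}-\bar{X}_h\right)^2}.$$

Let the random variable $Z_h$ be written as follows:

$$Z_h = \frac{\hat{\theta}_h-\theta_h}{S_{X_h}/\sqrt{n}}.$$

Based on the Central Limit Theorem (CLT), the random variable $Z_h$ is approximately distributed as a standard normal distribution for a large sample size n [16], that is, $Z_h \sim N(0, 1)$ approximately. Based on the above information, this paper derives the $100(1-\alpha)\%$ upper confidence limit of the satisfaction index $\theta_h$, as follows:

$$1-\alpha = P\left(Z_h \ge -z_\alpha\right) = P\left(\theta_h \le \hat{\theta}_h + z_\alpha \frac{S_{X_h}}{\sqrt{n}}\right).$$

Therefore, the $100(1-\alpha)\%$ upper confidence limit of the satisfaction index $\theta_h$ is

$$U_{\theta_h} = \hat{\theta}_h + z_\alpha \frac{S_{X_h}}{\sqrt{n}},$$

where $z_\alpha$ is the upper $\alpha$ quantile of the standard normal distribution.
Similarly, let $(Y_{h,1}, Y_{h,2}, \ldots, Y_{h,n})$ be a random sample of the importance of service item $h$, with sample mean $\hat{\gamma}_h = \bar{Y}_h$ and sample standard deviation $S_{Y_h}$, and let

$$T_h = \frac{\hat{\gamma}_h-\gamma_h}{S_{Y_h}/\sqrt{n}}.$$

Based on the Central Limit Theorem (CLT), the random variable $T_h$ is approximately distributed as the standard normal distribution for a large sample size n [16], that is, $T_h \sim N(0, 1)$ approximately. Based on the above information, this paper derives the $100(1-\alpha)\%$ upper confidence limit of the importance index $\gamma_h$, as follows:

$$1-\alpha = P\left(T_h \ge -z_\alpha\right) = P\left(\gamma_h \le \hat{\gamma}_h + z_\alpha \frac{S_{Y_h}}{\sqrt{n}}\right).$$

Therefore, the $100(1-\alpha)\%$ upper confidence limit of the importance index $\gamma_h$ is

$$U_{\gamma_h} = \hat{\gamma}_h + z_\alpha \frac{S_{Y_h}}{\sqrt{n}},$$

where $z_\alpha$ is the upper $\alpha$ quantile of the standard normal distribution.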
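The upper confidence limit described in this section can be computed directly from a sample. The following is a minimal sketch, assuming the index estimator is the sample mean, the sample standard deviation uses the n − 1 divisor, and the limit takes the large-sample form "mean + z_alpha * s / sqrt(n)". The function name and the scores are hypothetical and are not data from this paper.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def upper_confidence_limit(scores, alpha=0.05):
    """Large-sample 100(1 - alpha)% upper confidence limit of the mean of `scores`."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two observations are required")
    # Upper alpha quantile of the standard normal distribution.
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    # statistics.stdev uses the n - 1 divisor (sample standard deviation).
    return mean(scores) + z_alpha * stdev(scores) / sqrt(n)


# Hypothetical satisfaction scores on the unit interval for one service item.
# The sample is kept tiny for readability; the CLT argument needs a large n.
satisfaction_scores = [0.6, 0.8, 0.7, 0.9]
print(upper_confidence_limit(satisfaction_scores, alpha=0.05))
```

The same function applies unchanged to the importance scores of a service item, which gives the second evaluation coordinate.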

The Decision-Making Model
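As noted in some studies, the PEM is widely used to evaluate the performance level of q service items across various service systems [17,18]. Similar to these papers, the satisfaction index $\theta_h$ is set as the x-axis and the importance index $\gamma_h$ is set as the y-axis to form the PEM. The hypothesis of the statistical test for the satisfaction index $\theta_h$ is written as follows:

null hypothesis $H_0: \theta_h \ge \theta_0$;
alternative hypothesis $H_1: \theta_h < \theta_0$.

Similarly, the hypothesis of the statistical test for the importance index $\gamma_h$ is written as follows:

null hypothesis $H_0: \gamma_h \ge \gamma_0$;
alternative hypothesis $H_1: \gamma_h < \gamma_0$.

Let the evaluation coordinates of service item $h$ be $(x_h, y_h) = (U_{\theta_h}, U_{\gamma_h})$. The lines $x = \theta_0$ and $y = \gamma_0$ divide the PEM into four performance quadrants:

Performance Quadrant I: $\{(x_h, y_h) \mid x_h \ge \theta_0,\ y_h \ge \gamma_0\}$
Performance Quadrant II: $\{(x_h, y_h) \mid x_h < \theta_0,\ y_h \ge \gamma_0\}$
Performance Quadrant III: $\{(x_h, y_h) \mid x_h < \theta_0,\ y_h < \gamma_0\}$
Performance Quadrant IV: $\{(x_h, y_h) \mid x_h \ge \theta_0,\ y_h < \gamma_0\}$

According to Equations (22)-(25), the performance evaluation matrix and the four performance quadrants can be depicted as shown in Figure 1, with red dots representing items that need improvement and blue dots representing items that do not need improvement.
When $x_h < \theta_0$, the null hypothesis for the satisfaction index is rejected, and service item $h$ needs improvement. Among such items, $y_h \ge \gamma_0$ indicates that the importance level of service item $h$ is not lower than the mean value, so this service item should be improved first; $y_h < \gamma_0$ indicates that the importance level of service item $h$ is lower than the mean value, and improvements of this service item can be made later when resources are limited.
Based on the above concept, the evaluation rules determining which service items should be improved and the order of improvement priority are established as follows:

(1) If $x_h \ge \theta_0$, service item $h$ does not need improvement.
(2) If $x_h < \theta_0$ and $y_h \ge \gamma_0$, service item $h$ needs improvement and has high priority.
(3) If $x_h < \theta_0$ and $y_h < \gamma_0$, service item $h$ needs improvement, which can be made later when resources are limited.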
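The evaluation rules can be sketched in code. The following is a minimal illustration under stated assumptions: an item needs improvement when its satisfaction coordinate is below the mean satisfaction index, and among such items those whose importance coordinate is at least the mean importance index come first. The handling of boundary values (using "at least" for the no-improvement and high-priority cases), the function name, the group labels, and all numbers are hypothetical choices, not taken from this paper.

```python
def classify_items(coords, theta0, gamma0):
    """Group service items by evaluation coordinates (x, y).

    coords maps an item label to (x, y), the upper confidence limits of the
    satisfaction and importance indices; theta0 and gamma0 are the mean values
    of the two indices.
    """
    groups = {"improve_first": [], "improve_later": [], "no_improvement": []}
    for item, (x, y) in coords.items():
        if x >= theta0:
            # Satisfaction is not shown to be below the mean: keep as is.
            groups["no_improvement"].append(item)
        elif y >= gamma0:
            # Low satisfaction and high importance: improve first.
            groups["improve_first"].append(item)
        else:
            # Low satisfaction and low importance: improve when resources allow.
            groups["improve_later"].append(item)
    return groups


# Hypothetical coordinates for four service items.
example = {1: (0.82, 0.90), 2: (0.70, 0.88), 3: (0.65, 0.60), 4: (0.85, 0.55)}
print(classify_items(example, theta0=0.78, gamma0=0.80))
```

Keeping the three groups separate makes the improvement order explicit when resources are limited.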

Case Study
As stated, the decision-making model of the performance evaluation matrix based on upper confidence limits, developed in this paper, can assist managers of various service systems in quickly identifying CTQ service items requiring improvement and in making those improvements. For example, in the field of e-learning, many researchers have studied the use of teaching apps and computer-assisted language learning systems to increase learning effectiveness [19][20][21][22]. The proposed method can also be applied to the performance evaluation and model analysis of a real teaching setting. In technological and vocational education, students in practical courses learn through hands-on experience to combine theory and practice. Therefore, this paper uses practical courses as an example to illustrate the application of the proposed model. The questionnaire on the level of teaching satisfaction designed by Chen and Yu references Yeh et al. [23] and Yu et al. [24]. It comprises an adequate number of questions and is quite comprehensive. Based on the version of Chen and Yu, this paper designed a questionnaire to survey the level of satisfaction with the practical course, which includes seventeen questions in five dimensions: (1) Teaching preparation, (2) Teaching attitude, (3) Teaching ability, (4) Teaching management, and (5) Assessment. For each question, respondents are asked to provide both their level of satisfaction and the level of importance, so there are in fact thirty-four questions (q = 17, 2q = 34) to be answered. The questionnaire is as follows:
This paper used the above-shown questionnaire to collect data on the level of satisfaction and level of importance of students participating in practical courses. The respondents were students participating in a practical course at Taiwan's University of Technology and Science. In total, 367 questionnaires were issued and 301 were returned (a response rate of 82.02%).
Next, we calculated the values of the sample statistics of each service item, as follows: According to Equations (16) and (17), we obtain Based on the data from Table 1, we plotted the 17 evaluation coordinates $(x_h, y_h)$ as dots on the PEM below:

Results and Discussion
In Figure 2, the evaluation coordinates of these 17 service items are plotted in the four evaluation quadrants, with red dots representing items that need improvement and blue dots representing items that do not need improvement.
Quadrant 1: items 4, 5, 6, 7, 8, 9, 10, 11, 14, 15, and 17. According to the evaluation rule, service items whose satisfaction coordinate is larger than $\theta_0 = 0.7864$ should not be improved, so service items 4, 5, 6, 7, 8, 9, 10, 11, 14, 15, and 17 do not need improvement.
Quadrant 2: items 3, 12, and 13. According to the evaluation rules, service items whose satisfaction coordinate is smaller than the mean value should be improved.
This model takes the averages of the two evaluation indices as the testing standards and directs improvement toward the service items with relatively weak performance, which is in line with the spirit of continuous improvement in total quality management. In addition, when resources are limited, priority should be given to improving service items with high importance to raise the efficiency of improvement. This case illustrates how the proposed model is applied, which makes it convenient for enterprises to use the model for evaluation and improvement.

Conclusions
This paper has discussed the relationship between the values of the two indices and the proportions of customer satisfaction and importance of service items. We found that when the values of the indices are larger, the proportion of customers reporting high satisfaction or high importance is higher as well. Thus, these two indices can adequately reflect the actual distributions of customer satisfaction and the importance of service items. This paper set the mean value of the satisfaction index $\theta_0$ and that of the importance index $\gamma_0$ as the two dividing lines of the PEM, which divided the matrix into four evaluation quadrants. This defined the four evaluation quadrants, making the tool easy for managers to apply to their own systems. Since these two indices have unknown parameters, evaluating them directly by point estimation carries a risk of misjudgment caused by sampling errors. Therefore, following the rule of using upper confidence limits to conduct the statistical test, this paper used the upper confidence limits of the two indices as the evaluation coordinates $(x_h, y_h)$ of each service item. Based on where the coordinates $(x_h, y_h)$ are located in the PEM, we can evaluate whether a service item needs improvement, and when resources are limited, we can set the order of improvement priority. Then, we used a real case to explain how to apply the proposed model, construct evaluation coordinates, identify CTQ service items, and determine the order of priority for improvements when resources are limited. Because this method is based on statistical tests, the risk of misjudgment caused by sampling errors can be reduced. This rule is in line with the spirit of continuous improvement in total quality management.
At the same time, when resources are limited, prioritizing the improvement of high-importance service items can raise the efficiency of improvement. The case illustrates how the proposed model is applied, which makes it convenient for enterprises to use the model for evaluation and improvement.

Research Limitations and Future Research
Based on the recommendations of many studies, this paper uses the beta distribution as the distribution of customer importance and satisfaction. However, the statistical inference of the two indices relies on the central limit theorem, so the sample size must be large enough, which is a limitation of this study. In addition, the beta distribution describes a continuous random variable, whereas customer voices must be collected through discrete scales. How to discretize the scale in a reasonable way is the focus of future research.