Constructing Fuzzy Hypothesis Methods to Determine Critical-To-Quality Service Items

Abstract: This paper constructs a performance evaluation matrix (PEM) based on the beta distribution. A beta-distributed variable lies between zero and one, which makes it a suitable indicator of customer ratings of importance and satisfaction from 0% to 100%. In the spirit of continuous improvement promoted by total quality management, the average ratings are set as the standard, and the coordinates of each item's satisfaction and importance are then located in the performance areas. This makes it easy to identify critical-to-quality items that require improvement. However, collecting data by questionnaire inevitably involves sampling error, and the opinions of customers are often ambiguous. To address these problems, we construct a fuzzy testing method based on confidence intervals. Using confidence intervals reduces the chance of misjudgment caused by sampling error and captures the voice of the customer more faithfully, so the result is more reasonable than that of the traditional statistical testing principle. The proposed methods are applied to the assessment of a computer-assisted language learning (CALL) system to demonstrate their use.


Introduction
Lambert and Sharma [1] presented the performance evaluation matrix (PEM) for operating systems that collect users' or customers' perceptions. Compared with other assessment methods that require complicated data comparison, the PEM makes it easy to determine which service items most urgently require improvement, maintenance, or adjustment. This is achieved by locating the items according to (1) customer satisfaction with the items as they are and (2) how important the customers deem them. The PEM is widely applied for performance evaluation and improvement in a range of industries and institutions [2-8]. In the PEM, customer perception of the importance of an item is represented by the vertical axis, and customer satisfaction with the item is represented by the horizontal axis. Each axis is divided into three equal parts, forming a three-by-three matrix with nine performance blocks. The three blocks on the diagonal are regarded as maintenance areas because their importance and satisfaction levels are equal; the three blocks on the upper-left are regarded as improvement areas because their importance level is higher than their satisfaction level; and the three blocks on the lower-right are regarded as adjustment areas because their satisfaction level is higher than their importance level.
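The three-by-three classification described above can be illustrated with a short Python sketch. It is a minimal illustration rather than the procedure of Lambert and Sharma [1]: it assumes that both ratings have already been rescaled to the interval [0, 1], the treatment of values lying exactly on a cut point is an arbitrary choice, and the function name `pem_area` is hypothetical.

```python
def pem_area(satisfaction: float, importance: float) -> str:
    """Classify one item in the 3x3 PEM; both ratings are assumed to lie in [0, 1]."""

    def band(value: float) -> int:
        # Cut the axis into three equal parts: 0 = low, 1 = middle, 2 = high.
        if not 0.0 <= value <= 1.0:
            raise ValueError("ratings must lie in [0, 1]")
        return min(int(value * 3), 2)

    s, i = band(satisfaction), band(importance)
    if i > s:
        return "improvement"  # upper-left: importance band above satisfaction band
    if i < s:
        return "adjustment"   # lower-right: satisfaction band above importance band
    return "maintenance"      # diagonal: both ratings fall in the same band


print(pem_area(satisfaction=0.2, importance=0.9))
```

An item with low satisfaction and high importance falls in the upper-left of the matrix and is therefore flagged for improvement.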
To make the PEM more applicable, Hung, Huang, and Chen [9] altered the position of the performance areas in the PEM to rationalize the evaluation principles. They divided the PEM into three equally sized performance areas: the upper-left area is the improvement area, the middle area is the maintenance area, and the lower-right area is the adjustment area. Satisfaction and importance are viewed as random variables that are assumed to follow the beta distribution. The importance and satisfaction indices are thereby standardized to values between zero and one. Hung, Huang, and Chen [9] also pointed out that the conventional PEM is subject to sampling error. They therefore proposed deriving the joint confidence interval of the importance and satisfaction indices based on the central limit theorem and replacing the point estimates in the PEM with these joint confidence intervals. Although this method accounts for sampling error, it is complicated and difficult to use [4,6,9].
In addition to sampling error, another limitation of the PEM is that customer or user opinions are often fuzzy [6,10-12]. Wang et al. [6] therefore proposed a fuzzy semantic scale for practical application and constructed a convenient and comprehensive calculation method. However, while this enables the collection of data that are closer to the feelings of the interviewees, it increases both the quantity of data and the complexity of collection. Consequently, Yu, Chang, and Chen [13] as well as Chen et al. [14] revised and simplified the data collection and calculation of the fuzzy semantic scales. Their modifications incorporate the Likert scale, which is easy to use, but data collection for the fuzzy semantic scale remains relatively complicated.
To overcome the above limitations, we draw on the fuzzy testing methods of Buckley [15], Chen et al. [16], Lee et al. [17], and Sarkar et al. [18], and the method in this paper extends them. It uses calculated confidence intervals to construct a triangular shaped fuzzy number similar to the triangular fuzzy number described by Sarkar et al. [18]. The difference is that the two end points of a triangular fuzzy number are fixed, whereas those of a triangular shaped fuzzy number are changeable. In addition, because industries weigh cost against benefit and must control timeliness and effectiveness, their samples are usually not large [19]. Using confidence intervals as the foundation not only reduces the risk of misjudgment arising from sampling error but also supports more accurate decisions. First, the satisfaction index of each service item is tested to determine whether it is below the mean (i.e., whether the item requires improvement). Second, the importance index of each item requiring improvement is tested to determine whether it is above the mean. When resources are limited, the most important service items are prioritized for improvement. The strengths of Buckley's fuzzy test are that it retains the plain format of the Likert scale while deriving the confidence intervals of the importance and satisfaction indices to create the fuzzy membership functions of the two indices, and that, through fuzzy hypothesis testing, it distinguishes the items regarded as critical to quality. We demonstrate the efficacy of the method through a case study of a computer-assisted language learning (CALL) system.
The rest of this paper is organized as follows. Section 2 presents the theoretical model developed by Hung et al. [9], including the definitions of the two evaluation indices and their properties. Section 3 uses the confidence interval of the satisfaction index from Section 2 to create a fuzzy membership function and proposes a fuzzy testing criterion to evaluate whether the satisfaction index of each service item is below the mean. Section 4 constructs the fuzzy membership function from the confidence interval of the importance index defined in Section 2 and proposes a fuzzy testing criterion to evaluate whether the importance index is above the mean. Section 5 illustrates the method with a case study of a CALL system. Section 6 presents our conclusions.

Performance Indices
For the sake of generality, we follow Hung et al. [9] and assume that there are $q$ service items, each measured by two questions (one for satisfaction and one for importance), so that the responses form a satisfaction sample matrix and an importance sample matrix. The estimator of the satisfaction index $\theta_i$ and the estimator of the importance index $\theta_i'$ are obtained separately from these two matrices, and the standard deviations of the two indices are expressed in the same way.
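As an illustration of how such index estimates might be computed from questionnaire data, the following Python sketch rescales $k$-point Likert responses to the interval [0, 1] and returns the sample mean and sample standard deviation for one service item. The rescaling $(x - 1)/(k - 1)$, the use of the sample mean as the estimator, the example responses, and the function name `index_estimate` are all assumptions made for this sketch and are not taken from the equations of this section.

```python
from statistics import mean, stdev


def index_estimate(ratings, k=5):
    """Return (estimated index, sample standard deviation) for one service item.

    Assumption: each rating on a 1..k Likert scale is rescaled to [0, 1]
    via (x - 1) / (k - 1) before the mean and standard deviation are taken.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(not 1 <= r <= k for r in ratings):
        raise ValueError("ratings must lie on the 1..k scale")
    scaled = [(r - 1) / (k - 1) for r in ratings]
    return mean(scaled), stdev(scaled)


# Hypothetical responses: one row per service item, one column per respondent.
satisfaction = [[4, 5, 3, 4], [2, 3, 2, 1]]
estimates = [index_estimate(row) for row in satisfaction]

# The standard used for comparison is the mean of the item-level estimates.
theta_0 = mean([est for est, _ in estimates])
print(estimates, theta_0)
```

The last step mirrors the idea of taking the average over all items as the standard against which each item is compared.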

Fuzzy Hypothesis Testing for Satisfaction Index
If we let $X_{ij} = x_{ij}$, the data matrix of the random variable $X_{ij}$ becomes the matrix of observed values, from which the observed means, standard deviations, and values of $\theta_i^*$ for each satisfaction service item are computed. Chen et al. [14] drew on the concept of continuous improvement promoted by total quality management and set $\theta_0$ as the mean of all the satisfaction indices. When the satisfaction index of service item $i$ is not lower than the mean ($\theta_i \geq \theta_0$), service item $i$ does not require improvement. This corresponds to testing the null hypothesis $H_0: \theta_i \geq \theta_0$ against the alternative $H_1: \theta_i < \theta_0$. The resulting test statistic defines a critical region with critical value $C_{0i}$, where $\beta$ is the significance level, and the decision rule compares the observed value with $C_{0i}$. The confidence interval of the index is built around the observed value $\theta_{0i}^*$ of $\theta_i^*$, where $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of $N(0,1)$.
According to Buckley [15] and Chen et al. [16], the $\alpha$-cuts of the triangular shaped fuzzy number $\tilde{\theta}_i^*$ are obtained from this confidence interval, which gives the fuzzy membership function $\eta_i(x)$ of $\tilde{\theta}_i^*$. A second triangular shaped fuzzy number, with membership function $\eta_0(x)$, is obtained similarly. The functions $\eta_i(x)$ and $\eta_0(x)$ are presented graphically in Figure 1. In what follows, $[\cdot]$ denotes the greatest integer less than or equal to its argument.
Let $A_{Ti}$ denote the total area under the graph of $\tilde{\theta}_i^*$. For $j = 0, 1, \ldots, 100$, this area is divided into 100 trapezoid-like blocks, with block $j$ denoted $A_{Tij}$. For practical computation, the number of blocks $h_i$ is an integer. Let $A_{Ri}$ denote the area under the graph of $\tilde{\theta}_i^*$ that lies to the right of the vertical reference line; $A_{Ri}$ is divided into $h_i$ trapezoid-like blocks, with block $j$ denoted $A_{Rij}$. The areas $A_{Ti}$ and $A_{Ri}$ are then obtained by summing the areas of the blocks $A_{Tij}$ and $A_{Rij}$, respectively. As noted by Buckley [15], two numbers may be employed to set the evaluation rules that compare these areas. According to these evaluation rules, the $q$ service items can be subjected to fuzzy hypothesis testing to find all service items requiring improvement.
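The area comparison described in this section can be sketched numerically in Python. The sketch is illustrative only and rests on several assumptions that are not recoverable from the text: the $\alpha$-cut at level $\alpha$ is taken to be the $100(1-\alpha)\%$ normal-approximation confidence interval around the observed index, following Buckley's general construction [15]; cuts below $\alpha = 0.01$ reuse the 0.01 cut; `se` stands for the standard error of the estimate; the position of the vertical reference line is passed in as `line`; and the thresholds `phi1` and `phi2`, like all function names, are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # N(0, 1)


def alpha_cut(theta_hat, se, alpha):
    """Assumed alpha-cut: the 100(1 - alpha)% confidence interval around theta_hat."""
    level = max(alpha, 0.01)  # cuts below 0.01 reuse the 0.01 cut
    z = STD_NORMAL.inv_cdf(1 - level / 2)
    return theta_hat - z * se, theta_hat + z * se


def right_area_share(theta_hat, se, line, blocks=100):
    """Share of the area under the fuzzy number lying to the right of x = line.

    The area is accumulated over `blocks` trapezoid-like slices in alpha.
    """
    if se <= 0:
        raise ValueError("se must be positive")

    def widths(alpha):
        lo, hi = alpha_cut(theta_hat, se, alpha)
        return hi - lo, max(0.0, hi - max(lo, line))

    total_area = right_area = 0.0
    for j in range(blocks):
        t0, r0 = widths(j / blocks)
        t1, r1 = widths((j + 1) / blocks)
        total_area += (t0 + t1) / 2 / blocks
        right_area += (r0 + r1) / 2 / blocks
    return right_area / total_area


def fuzzy_decision(share, phi1, phi2):
    """Three-way rule with two thresholds (assumed direction: small share rejects H0)."""
    if not phi1 < phi2:
        raise ValueError("phi1 must be smaller than phi2")
    if share <= phi1:
        return "reject H0"
    if share < phi2:
        return "no decision"
    return "do not reject H0"


# Hypothetical numbers: observed index 0.60, standard error 0.02, reference line 0.63.
share = right_area_share(0.60, 0.02, line=0.63)
print(share, fuzzy_decision(share, phi1=0.2, phi2=0.4))
```

Under these assumptions, a small share means that most of the fuzzy estimate lies to the left of the reference line, which is read here as evidence that the index is below the standard.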

Fuzzy Hypothesis Testing for Importance Index
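If we let $Y_{ij} = y_{ij}$, the data matrix of the random variable $Y_{ij}$ becomes the matrix of observed importance values. The importance index is tested in the same manner as the satisfaction index: the test has a critical region determined by a critical value, where $\beta$ is the significance level, together with a corresponding decision rule. According to Buckley [15], the $\alpha$-cuts of the triangular shaped fuzzy number of the importance index estimator are obtained, which gives the triangular shaped fuzzy number and its membership function. A second triangular shaped fuzzy number and its membership function are obtained similarly, and both membership functions are presented graphically in Figure 2.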
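For orientation, the crisp (non-fuzzy) counterpart of such a test can be sketched in Python. The sketch assumes a one-sided normal-approximation test at significance level $\beta$, with critical value equal to the standard plus or minus $z_\beta$ times the standard error; this form, the numerical values, and the function names are assumptions for illustration and are not taken from the equations of this section.

```python
from math import sqrt
from statistics import NormalDist


def critical_value(theta0, sd, n, beta, tail):
    """Assumed critical value of a one-sided z-test at significance level beta."""
    if n <= 0 or sd < 0 or not 0 < beta < 1:
        raise ValueError("invalid n, sd, or beta")
    z = NormalDist().inv_cdf(1 - beta)  # upper beta quantile of N(0, 1)
    se = sd / sqrt(n)
    if tail == "upper":   # testing whether the index is above the standard
        return theta0 + z * se
    if tail == "lower":   # testing whether the index is below the standard
        return theta0 - z * se
    raise ValueError("tail must be 'upper' or 'lower'")


def importance_above_mean(theta_hat, theta0, sd, n, beta=0.05):
    """True when the observed importance index reaches the upper critical value."""
    return theta_hat >= critical_value(theta0, sd, n, beta, "upper")


# Hypothetical values: standard 0.50, sample sd 0.20, n = 100 respondents.
print(critical_value(0.50, 0.20, 100, 0.05, "upper"))
print(importance_above_mean(0.60, 0.50, 0.20, 100))
```

The fuzzy test replaces this single cut-off comparison with a comparison of areas under membership functions, so the crisp rule serves only as a baseline.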

Case Study
E-learning represents a rapidly growing trend in education. According to the 2018 survey report on digital industries by the Industrial Development Bureau of the Ministry of Economic Affairs, the output value of digital services in Taiwan in 2017 was as high as NT $5.01 trillion, an increase of 19.6% compared to 2016. E-learning is often used for language teaching [20,21]. Digital English learning in the Asia-Pacific area is forecast to grow by as much as 25.83% per year from 2018 to 2022 [22]. We therefore selected a CALL system operating at three private universities in central Taiwan as a case study to demonstrate the presented method.
We applied the web-based e-learning system (WELS) questionnaire proposed by Shee and Wang [23] to explore students' satisfaction and perception of importance when using the selected CALL system. The WELS questionnaire contains 13 questions (shown in Table 1). Five-point Likert scales were applied as follows: for satisfaction, (1) strongly disagree, (2) disagree, (3) average, (4) agree, and (5) strongly agree; for importance, (1) strongly disagree, (2) disagree, (3) average, (4) agree, and (5) strongly agree. The sample consisted of students using the CALL system at three private universities. A total of 507 questionnaires were distributed and 433 were recovered, representing a recovery rate of 85%. Among these, 18 questionnaires were deemed invalid, making the effective recovery rate 82%. In this paper, we establish a calculation process to complete the evaluation procedure based on the equations in Section 3.
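The reported recovery rates can be checked with a few lines of Python. The sketch assumes that both rates are computed relative to the 507 questionnaires distributed, which is the reading consistent with the reported 82%; the variable names are hypothetical.

```python
distributed, recovered, invalid = 507, 433, 18

valid = recovered - invalid               # questionnaires retained for analysis
recovery_rate = recovered / distributed   # share of distributed questionnaires returned
effective_rate = valid / distributed      # share of distributed questionnaires that are valid

print(f"valid = {valid}, recovery = {recovery_rate:.1%}, effective = {effective_rate:.1%}")
```

Rounded to whole percentages, the two rates reproduce the 85% and 82% stated in the text.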
Step 1: The mean and standard deviation of each service item are calculated with reference to Equations (22) and (23), and the results are entered in Table 1. Service item 1 ($i = 1$) is taken as an example to describe the calculation of these three statistics, and likewise the calculation of the critical value $C_{0i}$, whose values are also given in Table 1.
According to the statistical test, service items 5, 6, 7, and 8 do not require improvement, yet their $\theta_{0i}^*$ values are much smaller than $\theta_0$. In practice, they should be listed as improvement items, and they are reclassified as requiring improvement once the fuzzy hypothesis testing method constructed in this paper is applied. This result is more reasonable than that of the traditional statistical testing principle. Once the improvement items have been confirmed by fuzzy hypothesis testing, and given limited resources, the items with an importance index above the mean are prioritized for improvement. The calculation process follows Section 4.
Step 1: The mean and standard deviation of each service item are calculated, and the results are entered in Table 2. Service item 5 ($i = 5$) is taken as an example to describe the calculation of these three statistics. When the test does not indicate that the importance index of service item $i$ is above the mean, service item $i$ should not be prioritized.

Conclusions
Hung et al. [9] indicated that because a beta-distributed variable lies between zero and one, it is a suitable indicator of the degree of importance and satisfaction from 0% to 100%. We applied the proposed methods to the evaluation of a CALL system. According to statistical testing, service items 5, 6, 7, and 8 did not need improvement. However, the $\theta_{0i}^*$ values of these items were much smaller than $\theta_0$, so in practice they would be considered as requiring improvement. Following application of the proposed fuzzy hypothesis testing method, these items were re-categorized as requiring improvement. The proposed methods were then applied to determine the importance of these service items, and all four were deemed a priority. The major managerial insights of this study are as follows:
(1) We have constructed a performance evaluation matrix in the spirit of continuous improvement promoted by total quality management. Through the fuzzy hypothesis testing method presented by Buckley [15], we have identified the characteristics considered critical to quality and determined which items should be prioritized for improvement when resources are limited.
(2) According to Chen et al. [16], fuzzy hypothesis testing gives practitioners more reasonable results than traditional statistical testing. The fuzzy hypothesis testing in this paper combines statistical inference with experts' experience to support accurate decisions. As a result, service items 5, 6, 7, and 8 are not overlooked for improvement, and losses to the industry can be minimized [16,24-27].
(3) Based on the work of Buckley [15] and Chen et al. [16], we introduced confidence intervals to decrease the chance of misjudgment arising from sampling error.
(4) This method of data collection preserves the voice of the customer, and it is relatively simple to apply, thereby increasing customers' willingness to participate.
(5) This method requires only a small sample size. It is therefore cost-effective for industry and helps control timeliness and effectiveness.