Abstract
This paper constructs a performance evaluation matrix (PEM) based on the beta distribution. A beta-distributed variable takes values between zero and one, which makes it a suitable indicator for describing customer ratings of importance and satisfaction from 0% to 100%. In the spirit of continuous improvement advocated by total quality management, the average ratings are set as the standard, and the coordinates of each item’s satisfaction and importance are then located in the performance areas. This makes it easy to identify critical-to-quality items that require improvement. However, collecting data by questionnaire inevitably involves sampling error, and the opinions of customers are often ambiguous. To address these problems, we construct a fuzzy testing method based on confidence intervals. The use of confidence intervals decreases the chance of misjudgment caused by sampling error and captures customers’ voices more closely. In the case study, this yields more reasonable results than the traditional statistical testing principle. The proposed methods are applied to the assessment of a computer-assisted language learning (CALL) system to demonstrate their applicability.
1. Introduction
Lambert and Sharma [] presented the performance evaluation matrix (PEM) for operating systems that collect users’ or customers’ perceptions. Compared with other assessment methods that need complicated data comparison, the PEM makes it easy to determine which service items most urgently require improvement, maintenance, or adjustment. This is achieved by locating the items according to (1) customer satisfaction with the items as they are and (2) how important the customers deem them. The PEM is widely applied for performance evaluation and improvement in a range of industries and institutions [,,,,,,]. In the PEM, customer perception of the importance of an item is represented by the vertical axis, and customer satisfaction with the item itself is represented by the horizontal axis. Each axis is cut into three equal shares, forming a three by three matrix with nine performance blocks. The three performance areas on the diagonal are regarded as maintenance areas because their importance and satisfaction are equal; the three performance areas on the upper-left are regarded as improvement areas because their importance level is higher than customers’ satisfaction level; and the three areas on the lower-right are viewed as adjustment areas because their satisfaction rating is higher than their importance.
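The block layout described above can be sketched in a few lines of Python. This is an illustrative sketch rather than code from Lambert and Sharma []; the function name `pem_area` and the assumption that both ratings have already been rescaled to the interval [0, 1] are ours.

```python
def pem_area(importance: float, satisfaction: float) -> str:
    """Classify an item into one of the three PEM area types.

    Assumes both ratings are already rescaled to [0, 1] (an assumption of
    this sketch). Each axis is cut into three equal shares, giving a
    3 x 3 grid of performance blocks.
    """
    if not (0.0 <= importance <= 1.0 and 0.0 <= satisfaction <= 1.0):
        raise ValueError("ratings must lie in [0, 1]")
    # Block index 0, 1, or 2 along each axis; a rating of exactly 1.0
    # falls in the top block.
    imp_block = min(int(importance * 3), 2)
    sat_block = min(int(satisfaction * 3), 2)
    if imp_block > sat_block:
        return "improvement"   # upper-left: importance above satisfaction
    if imp_block < sat_block:
        return "adjustment"    # lower-right: satisfaction above importance
    return "maintenance"       # diagonal: importance and satisfaction match


print(pem_area(importance=0.9, satisfaction=0.2))
```

An item with high importance and low satisfaction lands in an improvement block, which is the case the PEM is designed to surface.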
To make the PEM more applicable, Hung, Huang, and Chen [] altered the position of the performance areas in the PEM to rationalize the evaluation principles. They divided the PEM into three equally sized performance areas: the upper-left area is for improvement, the middle area is for maintenance, and the lower-right area is for adjustment. Satisfaction and importance are viewed as random variables and assumed to follow beta distributions. The importance and satisfaction indices are thereby standardized with values between zero and one. Hung, Huang, and Chen [] also pointed out that the conventional PEM is subject to sampling error. They therefore proposed deriving the joint confidence interval of the importance and satisfaction indices based on the central limit theorem and replacing the point estimates of the PEM with these joint confidence intervals. Although this method addresses sampling error, it is complicated and difficult to use [,,].
In addition to sampling error, another limitation encountered by the PEM is that customer or user opinions are often fuzzy [,,,]. Therefore, Wang et al. [] proposed a fuzzy semantic scale for practical application and also constructed a convenient and comprehensive calculation method. However, while this enables the collection of data that is closer to the feelings of the interviewees, it increases the quantity of data as well as the complexities of collection. Consequently, Yu, Chang, and Chen [] as well as Chen et al. [] revised and simplified the data collection and calculation of the fuzzy semantic scales. Their modifications include the Likert scale, which is easy to use, but data collection for the fuzzy semantic scale remains relatively complicated.
To overcome the above limitations, we draw on the fuzzy testing methods of Buckley [], Chen et al. [], Lee et al. [], and Sarkar et al. []. The method in this paper extends those in the papers cited above. It uses calculated confidence intervals to construct a triangular shaped fuzzy number, which is similar to the triangular fuzzy number mentioned by Sarkar et al. [] and meets the requirements. However, whereas the values of the two points of a triangular fuzzy number are fixed, those of a triangular shaped fuzzy number are changeable. In addition, because industries weigh cost against benefit and must control timeliness and effectiveness, their samples are usually not large []. Because the method in this paper uses confidence intervals as its foundation, it reduces the risk of misjudgment arising from sampling error and supports more accurate decisions. First, the satisfaction index of each service item is checked to determine whether it is below the mean (i.e., requiring improvement). Second, the importance index of items requiring improvement is tested to determine whether it is above the mean. In cases of limited resources, the most important service items are prioritized for improvement. The strengths of Buckley’s fuzzy test are that it maintains the plain format of the Likert scale while deriving confidence intervals for the importance and satisfaction indices to create the fuzzy membership functions of the two indices, and that, through fuzzy hypothesis testing, it distinguishes the items regarded as critical to quality. We demonstrate the efficacy of the method through a case study of a computer-assisted language learning (CALL) system.
The rest of this paper is organized as follows. Section 2 presents the theoretical model developed by Hung et al. [], including the definitions of the two evaluation indices and their properties. Section 3 uses the confidence interval of the satisfaction index from Section 2 to create a fuzzy membership function, and a fuzzy testing criterion is proposed to evaluate whether the satisfaction index of each service item is below the mean. In Section 4, we construct the fuzzy membership function from the confidence interval of the importance index defined in Section 2 and propose a fuzzy testing criterion to evaluate whether the importance index is above the mean. In Section 5, a case study of a CALL system illustrates the use of the method. In Section 6, we present our conclusions.
2. Performance Indices
For the sake of generality, we follow Hung et al. [] in this study and assume that there are q service items, each measured by two questions (one for importance and one for satisfaction). This creates a total of 2q questions. Hung et al. [] also assumed that these ratings follow beta distributions. Therefore, this study takes random variables and respectively to represent the distributions associated with importance and satisfaction. The importance and satisfaction indices can be shown as the following:
As noted by Hung et al. [], index is between 0 and 1. More than half of customers are satisfied if index exceeds 0.50, whereas more than half of customers are dissatisfied if index is below 0.50. Thus, higher values for index reflect higher levels of customer satisfaction. Index behaves similarly. Index is used as the horizontal axis and index as the vertical axis to construct the performance evaluation matrix. Since the indices have unknown parameters, they must be estimated from sample data. If n customers are interviewed, the sample data of satisfaction and importance can be shown as the following:
Satisfaction sample matrix: ,
Importance sample matrix: .
The estimator of satisfaction index and the estimator of importance index can be shown separately, as the following:
and
In addition, the standard deviation of these two indices can be expressed respectively, as the following:
and
The expected values of and can be expressed respectively, as the following:
where
Similarly,
where
Thus, and are unbiased estimators of and respectively.
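The estimation step can be illustrated with a small Python sketch. It assumes that each k-point Likert score is rescaled linearly to [0, 1] and that the index is estimated by the sample mean of the rescaled scores; the exact index definition is the one given by the equations of this section, so the rescaling, the n - 1 divisor in the standard deviation, and the function names `rescale` and `index_estimate` should all be read as assumptions of the sketch.

```python
from math import sqrt
from statistics import mean, stdev


def rescale(scores, k=5):
    """Map k-point Likert scores (1..k) linearly onto [0, 1]."""
    return [(x - 1) / (k - 1) for x in scores]


def index_estimate(scores, k=5):
    """Return (point estimate, sample standard deviation, standard error)."""
    y = rescale(scores, k)
    n = len(y)
    s = stdev(y)                 # sample standard deviation (n - 1 divisor)
    return mean(y), s, s / sqrt(n)


# Hypothetical answers of eight respondents to one satisfaction question.
answers = [4, 3, 5, 2, 4, 3, 4, 5]
estimate, s, se = index_estimate(answers)
print(f"estimate={estimate:.4f}, s={s:.4f}, se={se:.4f}")
```

The standard error returned here is the quantity that a central-limit-theorem argument would use to approximate the sampling distribution of the estimator.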
If we let
and
then by the central limit theorem (CLT), and are distributed as for ; that is
3. Fuzzy Hypothesis Testing for Satisfaction Index
If we let , then the data matrix of random variable can be shown as
then the observation values for the means, standard deviations, and of each satisfaction service item can be expressed as the following:
Chen et al. [] drew on the concept of continuous improvement promoted by total quality management and set as the mean of all the satisfaction indices as follows:
When the satisfaction index of service item i is higher than the mean (), service item i does not require improvement. The following hypothesis test is equivalent:
This leads to the following test statistic:
The critical region is where is determined by
Thus, the critical value can be shown as the following:
where is the significance level and the decision rule is as the following:
(1) Reject if (i.e., service item i needs improvement),
(2) Do not reject if (i.e., service item i does not need improvement).
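The decision rule above can be sketched as a one-sided test under a normal approximation. The sketch assumes a null hypothesis that the satisfaction index is at least the mean, a critical value of the form "mean minus a standard normal quantile times the standard error", and rejection when the estimate is at or below that value; these forms, the default significance level of 0.05, and the names `critical_value` and `needs_improvement` are assumptions of the sketch rather than a reproduction of the equations of this section.

```python
from math import sqrt
from statistics import NormalDist


def critical_value(theta0, s, n, alpha=0.05):
    """Lower critical value for H0: index >= theta0 (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # upper alpha quantile of N(0, 1)
    return theta0 - z_alpha * s / sqrt(n)


def needs_improvement(theta_hat, theta0, s, n, alpha=0.05):
    """Reject H0 (item needs improvement) if the estimate is at or below C0."""
    return theta_hat <= critical_value(theta0, s, n, alpha)


# Hypothetical inputs: mean index 0.50, standard deviation 0.20, n = 100.
print(round(critical_value(0.50, 0.20, 100), 4))
print(needs_improvement(0.45, 0.50, 0.20, 100))
```

An estimate only slightly below the mean is not rejected by this crisp rule, which is the situation the fuzzy test in the remainder of this section is meant to handle more gently.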
Since is distributed as for ,
Since the observed value of is , the observed value of confidence intervals of is
where is the upper quantile of . According to Buckley [] and Chen et al. [], the of triangular fuzzy number is
Obviously, when , then . Thus, the triangular fuzzy number of is where
Then the fuzzy membership function of is
Similarly to , the of triangular fuzzy number is
The triangular shaped fuzzy number is where
Then the fuzzy membership function of is
Subsequently, we can express and graphically as the following (see Figure 1):
Figure 1. Membership functions of and .
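A membership function of this kind can be sketched in Python by following the general idea of Buckley []: each α-cut is a (1 − α) × 100% confidence interval, and all cuts below α = 0.01 reuse the 0.01 cut. The sketch assumes a symmetric normal-approximation interval centered at the estimate; the function names and the hypothetical inputs are ours, and the exact end points are those defined by the equations of this section.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def alpha_cut(theta_hat, s, n, alpha):
    """Return the alpha-cut (lower, upper) built from a confidence interval."""
    alpha = max(alpha, 0.01)              # cuts below 0.01 reuse the 0.01 cut
    z = STD_NORMAL.inv_cdf(1 - alpha / 2)
    half_width = z * s / sqrt(n)
    return theta_hat - half_width, theta_hat + half_width


def membership(x, theta_hat, s, n):
    """Membership grade of x: the alpha whose cut has x as an end point."""
    z = abs(x - theta_hat) * sqrt(n) / s
    alpha = 2 * (1 - STD_NORMAL.cdf(z))
    return alpha if alpha >= 0.01 else 0.0   # zero outside the 0.01 cut


# Hypothetical estimate 0.50 with standard deviation 0.20 and n = 100.
lower, upper = alpha_cut(0.50, 0.20, 100, alpha=0.05)
print(round(lower, 4), round(upper, 4))
print(round(membership(lower, 0.50, 0.20, 100), 4))
```

Because the sides follow normal quantiles rather than straight lines, the graph is triangular in shape but not a true triangle, and its spread shrinks as the sample size grows.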
Obviously,
where . We let and
Among these, represents the largest integer less than or equal to . If we let represent the total area under the graph of , then for . is divided into 100 blocks similar to trapezoids and block j can be shown as the following:
We also let
Suppose such that , then
For the purposes of practical manipulation, since is an integer,
If we let represent the area under the graph of but to the right of the vertical line through point , then . is divided into h trapezoid-like blocks and block j can be expressed as the following:
Now, we compute the area of and as
and can be shown as the following:
and
As noted by Buckley [], we may employ two numbers () as the following:
(1) If , then do not reject and infer that service item i does not need improvement ().
(2) If , then make no decision on whether to reject/not reject.
(3) If , then reject and infer that service item i needs improvement ().
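The area computation and the three-way rule can be sketched numerically. The sketch slices a triangular shaped membership function into 100 horizontal trapezoid-like blocks, computes the total area and the area to the right of a vertical line, and compares their ratio with two thresholds. Which curve and which vertical line enter the ratio are fixed by the equations of this section; here they are generic arguments. The default thresholds 0.2 and 0.4 are inferred from the values mentioned in the case study, and the treatment of the boundary cases is an assumption.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def cut(center, se, alpha):
    """Alpha-cut of the fuzzy number; cuts below 0.01 reuse the 0.01 cut."""
    z = STD_NORMAL.inv_cdf(1 - max(alpha, 0.01) / 2)
    return center - z * se, center + z * se


def area(center, se, right_of=None, blocks=100):
    """Area under the membership graph, optionally only right of a line."""
    step = 1.0 / blocks
    total, previous = 0.0, None
    for j in range(blocks + 1):
        lower, upper = cut(center, se, j * step)
        if right_of is not None:
            lower = max(lower, right_of)      # keep only the part right of the line
        width = max(upper - lower, 0.0)
        if previous is not None:
            total += step * (previous + width) / 2   # one trapezoid-like block
        previous = width
    return total


def fuzzy_decision(ratio, phi1=0.2, phi2=0.4):
    """Three-way rule on the area ratio (boundary handling is assumed)."""
    if ratio <= phi1:
        return "do not reject"
    if ratio < phi2:
        return "no decision"
    return "reject"


# Hypothetical fuzzy number centered at 0.50 with standard error 0.02.
ratio = area(0.50, 0.02, right_of=0.51) / area(0.50, 0.02)
print(round(ratio, 4), fuzzy_decision(ratio))
```

A vertical line through the center splits the area in half, so the ratio moves smoothly between 0 and 1 as the line shifts, instead of flipping at a single critical value.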
According to the above-mentioned evaluation rules, q service items can be subjected to fuzzy hypothesis testing to find all service items requiring improvement.
4. Fuzzy Hypothesis Testing for Importance Index
If we let , then the data matrix of random variable can be shown as
Then the observation values for the means, standard deviations, and of each importance service item can be expressed as the following:
and
All service items requiring improvement are represented by the Set “SI”. For example, if , service items 3, 7, 8, and 11 require improvement. These service items are then checked for importance. If their importance index is higher than the mean, they are prioritized for improvement in cases of limited resources. This can be represented by hypothesis testing as the following:
This gives the following test statistic:
where
The critical region is where is determined by
where is the significance level and the decision rule is
Similar to the satisfaction index , the confidence intervals of is
According to Buckley [], the of triangular fuzzy number is
Obviously, when , then . Thus, the triangular shaped fuzzy number of is where
The membership function of triangular fuzzy number is
Similarly to , the of triangular fuzzy number of is
The triangular fuzzy number of is where
Then the fuzzy membership function of is
Subsequently, and can be represented graphically as the following (see Figure 2):
Figure 2. Membership functions of and .
Obviously,
where . Similarly to , if we let ,
and represent the total area under the graph of , then . is divided into 100 trapezoid-like blocks and block j therein can be shown as the following:
We also let
Suppose such that . Then
If we let represent the area under the graph of but to the right of the vertical line through point , then . is cut by h′ into h′ trapezoid-like blocks and block j can be expressed as the following:
Now, we compute areas and as
and can be shown as the following:
and
As noted by Buckley [], we may employ two numbers () as the following:
(1) If , then do not reject and infer that service item i must be prioritized for improvement.
(2) If , then make no decision on whether to reject/not reject.
(3) If , then reject and conclude that service item i is not a priority.
5. Case Study
E-learning represents a rapidly growing trend in education. According to the 2018 survey report on digital industries by the Industrial Development Bureau of the Ministry of Economic Affairs, the output value of digital services in Taiwan in 2017 was as high as NT $5.01 trillion, an increase of 19.6% compared to 2016. E-learning is often utilized for language teaching [,]. Digital English learning in the Asia-Pacific area is forecast to grow by as much as 25.83% per year from 2018 to 2022 []. We therefore selected a CALL system operating at three private universities in central Taiwan as a case study to demonstrate the proposed method.
This paper applied the web-based e-learning system (WELS) questionnaire proposed by Shee and Wang [] to explore students’ satisfaction and perception of importance when using the selected CALL system. The WELS questionnaire contains 13 questions (shown in Table 1). Five-point Likert scales were applied as follows: for both satisfaction and importance, (1) strongly disagree, (2) disagree, (3) average, (4) agree, and (5) strongly agree. The sample consisted of students at three private universities using the CALL system. A total of 507 questionnaires were distributed and 433 were recovered, representing a recovery rate of 85%. Among these, 18 questionnaires were deemed invalid, making the effective recovery rate 82%. In this paper, we establish a calculation process to complete the evaluation procedure based on the equations in Section 3.
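The recovery rates quoted above follow directly from the questionnaire counts, as this short check shows; the variable names are ours.

```python
distributed = 507   # questionnaires distributed
returned = 433      # questionnaires recovered
invalid = 18        # recovered questionnaires deemed invalid

valid = returned - invalid
recovery_rate = returned / distributed
effective_rate = valid / distributed

print(f"valid questionnaires: {valid}")
print(f"recovery rate: {recovery_rate:.1%}")
print(f"effective recovery rate: {effective_rate:.1%}")
```

The valid sample therefore consists of 415 questionnaires, and the two rates round to the 85% and 82% reported in the text.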
Table 1. Satisfaction Indices for CALL System.
Step 1: The mean and standard deviation of each service item are calculated: is calculated by Equation (15) and and are calculated by Equations (22) and (23). The results are entered in Table 1. We take service item 1 (i = 1) as an example to describe the calculation of these three statistics as follows:
Step 2: Based on Equation (16), we calculate = 0.5370 and set up hypothesis testing as the following:
,
.
Step 3: We set the significance level and, according to Equation (18), calculate and enter the result in Table 1. We take service item 1 (i = 1) as an example to show the calculation of as follows:
Step 4: Based on Equations (38) and (39), we calculate and respectively, then calculate and enter the result in Table 1. Because service items 5, 6, 7, and 8 of dimension 2 all need improvement, we take these four items as examples to describe the calculation of as follows:
Step 5: We set and . According to the evaluation rules presented in Section 3, (marked *). Therefore, we reject and conclude that service item i needs improvement. The values of items 5, 6, 7, and 8 are larger than 0.4, which means that service items 5, 6, 7, and 8 need improvement.
As stated above, according to the statistical testing principle (i.e., if , do not reject and service item i does not need improvement), service items 5, 6, 7, and 8 do not need improvement. However, the values of service items 5, 6, 7, and 8 are much smaller than . In practice, they should be listed as improvement items. Therefore, they are reclassified as requiring improvement after applying the fuzzy hypothesis testing method constructed in this paper. In this case, the result is more reasonable than that of the traditional statistical testing principle.
The improvement items are confirmed by fuzzy hypothesis testing. In cases of limited resources, the items with an importance index above the mean will be prioritized for improvement. The calculation process is constructed as described in Section 4.
Step 1: The mean and standard deviation of each service item are calculated: is calculated by Equation (42) and and are calculated by Equations (49) and (50). The results are entered in Table 2. We take service item 5 (i = 5) as an example to describe the calculation of these three statistics as follows:
Table 2. Importance Indices for CALL System.
Step 2: Based on Equation (44), we calculate 0.7209 and set up hypothesis testing as the following:
Step 3: We set the significance level and, according to Equation (46), calculate . The results are entered in Table 2. We take service item 5 (i = 5) as an example to describe the calculation of as follows:
Step 4: Based on Equations (64) and (65), we calculate and respectively, then calculate and enter the result in Table 2. We take service items 5, 6, 7, and 8 of dimension 2 as examples to show the calculation of as follows:
Step 5: We let and . According to the evaluation rules of Section 4, if , then do not reject and infer that service item i must be prioritized. The values of items 5, 6, 7, and 8 are smaller than 0.2, which means that service items 5, 6, 7, and 8 must be prioritized.
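The two-stage procedure of this section (first find the items that need improvement, then prioritize the important ones) can be summarized in a short sketch. The area ratios below are made-up values for four hypothetical items and are not taken from Table 1 or Table 2; the thresholds 0.2 and 0.4 are the ones mentioned in Step 5, and the function name `two_stage_screening` and the handling of boundary values are assumptions of the sketch.

```python
def two_stage_screening(sat_ratio, imp_ratio, phi1=0.2, phi2=0.4):
    """Return (items needing improvement, items prioritized for improvement)."""
    # Stage 1: an item enters the set SI when its satisfaction test rejects
    # the null hypothesis (ratio at or above phi2 in this sketch).
    si = sorted(item for item, r in sat_ratio.items() if r >= phi2)
    # Stage 2: among SI, an item is prioritized when its importance test
    # does not reject the null hypothesis (ratio at or below phi1).
    priority = [item for item in si if imp_ratio[item] <= phi1]
    return si, priority


# Made-up ratios for four hypothetical items (not values from Table 1 or 2).
sat = {"A": 0.05, "B": 0.46, "C": 0.30, "D": 0.48}
imp = {"A": 0.10, "B": 0.08, "C": 0.12, "D": 0.35}
si, priority = two_stage_screening(sat, imp)
print(si, priority)
```

Items whose ratio falls between the two thresholds at either stage receive no decision in this sketch, mirroring the "make no decision" branch of the evaluation rules.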
6. Conclusions
Hung et al. [] indicated that, because a beta-distributed variable lies between zero and one, it is a suitable indicator of the degree of importance and satisfaction from 0% to 100%. We applied the proposed methods to the evaluation of a CALL system. We found that according to statistical testing, service items 5, 6, 7, and 8 did not need improvement. However, the values of service items 5, 6, 7, and 8 were much smaller than . In practice, these items would therefore be considered as requiring improvement. Following application of the proposed fuzzy hypothesis testing method, these items were re-categorized. The proposed methods were then applied to determine the importance of these service items, and all four were deemed a priority. The major managerial insights of this study are:
(1) We have constructed a performance evaluation matrix, on the basis of the spirit of continuous improvement promoted by total quality management. Through the fuzzy hypothesis testing method presented by Buckley [], we have identified the characteristics considered critical to quality and determined which items should be prioritized for improvement in cases of limited resources.
(2) According to Chen et al. [], fuzzy hypothesis testing yields results that are more reasonable in practice than those of traditional statistical testing. The fuzzy hypothesis testing in this paper combines statistical inference with experts’ experience to support accurate decisions. As a result, service items 5, 6, 7, and 8 are not overlooked for improvement, and losses to the industry can be minimized [,,,,].
(3) Based on the work of Buckley [] and Chen et al. [], we introduced confidence intervals to decrease the chance of misjudgment arising from sampling errors.
(4) This method of data collection preserves the voice of the customer, and it is relatively simple to apply, thereby increasing customers’ willingness to participate.
(5) This method requires only a small sample size. It therefore meets industry cost-benefit considerations and helps control timeliness and effectiveness.
Author Contributions
Data curation, C.-M.Y.; Formal analysis, K.-S.C.; Investigation, C.-H.Y. and C.-M.Y.; Methodology, K.-S.C.; Project administration, C.-C.L.; Validation, C.-H.Y.; Writing—original draft, C.-H.Y., K.-S.C. and C.-M.Y.; Writing—review & editing, C.-C.L., K.-S.C. and C.-M.Y. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Lambert, D.M.; Sharma, A. A customer-based competitive analysis for logistics decisions. Int. J. Phys. Distrib. Logist. Manag. 1990, 20, 17–24.
- Ghosh, P.; Ojha, M.K. Determining passenger satisfaction out of platform-based amenities: A study of Kanpur Central Railway Station. Transp. Policy 2017, 60, 108–118.
- Martínez-Caro, E.; Cegarra-Navarro, J.G.; Cepeda-Carrión, G. An application of the performance-evaluation model for e-learning quality in higher education. Total Qual. Manag. Bus. Excell. 2015, 26, 632–647.
- Chen, K.S.; Chen, H.T. Applying importance-performance analysis with simple regression model and priority indices to assess hotels’ service performance. J. Test. Eval. 2014, 42, 455–466.
- Goel, G.; Ghosh, P.; Ojha, M.K.; Kumar, S. Journey towards World Class Stations: An Assessment of Platform Amenities at Allahabad Junction. J. Public Transp. 2016, 19, 68–83.
- Wang, K.J.; Chang, T.C.; Chen, K.S. Determining critical service quality from the view of performance influence. Total Qual. Manag. Bus. Excell. 2015, 26, 368–384.
- Wong, R.C.P.; Szeto, W.Y. An alternative methodology for evaluating the service quality of urban taxis. Transp. Policy 2018, 69, 132–140.
- Wu, J.; Wang, Y.; Zhang, R.; Cai, J. An approach to discovering product/service improvement priorities: Using dynamic importance-performance analysis. Sustainability 2018, 10, 3564.
- Hung, Y.H.; Huang, M.L.; Chen, K.S. Service quality evaluation by service quality performance matrix. Total Qual. Manag. Bus. Excell. 2003, 14, 79–89.
- Hu, H.Y.; Lee, Y.C.; Yen, T.M. Service quality gaps analysis based on Fuzzy linguistic SERVQUAL with a case study in hospital out-patient services. TQM J. 2010, 22, 499–515.
- Tian, Z.P.; Nie, R.X.; Wang, J.Q. Social network analysis-based consensus-supporting framework for large-scale group decision-making with incomplete interval type-2 fuzzy information. Inf. Sci. 2019, 502, 446–471.
- Zadeh, L.A. The concept of a linguistic variable and its application to approximate reasoning-I. Inf. Sci. 1975, 8, 199–249.
- Yu, C.M.; Chang, H.T.; Chen, K.S. Developing a performance evaluation matrix to enhance the learner satisfaction of an e-learning system. Total Qual. Manag. Bus. Excell. 2018, 29, 272–745.
- Chen, K.S.; Chang, H.T.; Yu, C.M. Development and application of performance improvement verification model: A case study of an e-learning system. Total Qual. Manag. Bus. Excell. 2019, 30, 936–952.
- Buckley, J.J. Fuzzy statistics: Hypothesis testing. Soft Comput. 2005, 9, 512–518.
- Chen, K.S.; Wang, C.H.; Tan, K.H. Developing a fuzzy green supplier selection model using six sigma quality indices. Int. J. Prod. Econ. 2019, 212, 1–7.
- Lee, T.S.; Wang, C.H.; Yu, C.M. Fuzzy evaluation model for enhancing E-learning system. Mathematics 2019, 7, 918.
- Sarkar, B.; Omair, M.; Kim, N. A cooperative advertising collaboration policy in supply chain management under uncertain conditions. Appl. Soft Comput. J. 2020, 88, 105948.
- Chen, K.S. Fuzzy testing decision-making model for intelligent manufacturing process with Taguchi capability index. J. Intell. Fuzzy Syst. 2020, 38, 2129–2139.
- Hwang, G.J.; Tsai, C.C. Research trend in mobile and ubiquitous learning: A review of publications in selected journal from 2001 to 2010. Br. J. Educ. Technol. 2011, 42, E65–E70.
- Wu, W.H.; Jim Wu, Y.C.; Chen, C.Y.; Kao, H.Y.; Lin, C.H.; Huang, S.H. Review of trends from mobile learning studies: A meta-analysis. Comput. Educ. 2012, 59, 817–827.
- TechNavio. Digital English Language Learning Market in APAC 2018–2022; TechNavio: London, UK, 2018.
- Shee, D.Y.; Wang, Y.S. Multi-criteria evaluation of the web-based e-learning system: A methodology based on learner satisfaction and its applications. Comput. Educ. 2008, 50, 894–905.
- Chen, K.S. Two-tailed Buckley fuzzy testing for operating performance index. J. Comput. Appl. Math. 2019, 361, 55–63.
- Chen, K.S. Fuzzy testing of operating performance index based on confidence intervals. Ann. Oper. Res. 2019.
- Chen, K.S.; Chang, T.C. Construction and fuzzy hypothesis testing of Taguchi Six Sigma quality index. Int. J. Prod. Res. 2019.
- Lin, K.P.; Yu, C.M.; Chen, K.S. Production data analysis system using novel process capability indices-based circular economy. Ind. Manag. Data Syst. 2019, 119, 1655–1668.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).