A Text Analytics-Based Importance Performance Analysis and Its Application to Airline Service

Abstract: We introduce a new importance-performance analysis (IPA) methodology while making use of direct service experience perceptions represented by online reviews with numerical ratings. The proposed IPA, which we call the text analytics-based IPA (TAIPA), allows the real-time calculation of importance using the probability distribution of word frequency via the latent Dirichlet allocation (LDA) application to online reviews, and of performance using numerical rating values. The importance is also adjusted with the help of a sentiment analysis of online reviews to provide more precise measurements for service experience perceptions. To ensure an evaluation of the entire service process, we employ service encounters, in which service experiences occur and thus most customer perceptions are created, as a set of attributes composed of LDA topics that contain direct perceptions of service experiences. We investigate statistical correlations between TAIPA calculations and typical benchmarks of firm performance in the air-transport industry to verify how effective the proposed TAIPA is with respect to the degree that customer satisfaction is represented. As a primary result, TAIPA is more effective than comparison targets in that it shows stronger correlations with firm performance. TAIPA is specialized in determining which service step (i.e., a one-to-one relationship with a service encounter) needs to be improved. Moreover, TAIPA is flexible in considering multiple competitors.


Introduction
Over the last few decades, airline service providers have struggled to produce sustainable profits under varied forms of unfavorable conditions [1,2]. These conditions include global economic crises, regional political instability, unstable oil prices, frequent terror threats and attacks, occasional natural disasters, and so forth [3]. In addition, the level of competition within the industry has continued to intensify since the establishment of open skies agreements [4,5]. To survive in such a hostile and competitive business environment, it is paramount in airline marketing and operational strategies to better understand passenger needs for service satisfaction more quickly than competitors for the following reasons [1,2]. An elaborate analysis of the critical factors that improve service quality for passengers is essential because high-quality service leads to customer satisfaction [6][7][8][9]. Customer satisfaction drives further growth potential and opportunities for future performance, and results in a core source of provider competitiveness [10,11]. Special attention has been continuously paid to measure exact passenger perceptions of service experiences through critical attributes of customer satisfaction. It is noted that attributes of customer satisfaction and their importance could vary depending on individual customers. An importance-performance analysis (IPA) allows a customer to address differences in an attribute's importance that a customer recognizes while providing an analytical framework to evaluate comprehensively an entire service process from the customer perspective [12]. In the IPA application, a respondent completes a questionnaire and rates every attribute for its importance and performance based on his/her own perceptions [12]. 
Nevertheless, a survey-based IPA frequently fails to elicit exact customer experience perceptions because the questionnaire is composed of a series of identical questions designed by researchers, and it is thus limited in its ability to capture different customer characteristics [13]. Recently, there has been a strong tendency in research to measure service quality and customer satisfaction using multiple innovative methods rather than using typical questionnaire forms (e.g., [14][15][16][17][18]).
In this trend, text-analytics techniques have grown in popularity as a way to utilize direct service experience perceptions represented in textual data. They are mainly applied to online reviews, whose chief characteristic is that they capture direct customer perceptions and thus help estimate a service accurately. Online reviews are voluntarily authored by passengers and regarded as the most direct and immediate feedback from customer experiences [19][20][21][22]. Their features are not cultivated, intended or planned but faithfully customer-perceived. TripAdvisor has been one of the most well-known travel review sites in which customers post their travel experiences regarding hospitality services [15,23]. There are many published results of text-analytics techniques working with the reviews provided in TripAdvisor for service quality and satisfaction [22,24,25,26]. Liau and Tan [24] tried to describe passenger perceptions by conducting a sentiment analysis, a popular text analytics approach, and a clustering method. Based on text analysis, Martin-Domingo, Martin, and Mandsberg [25] attempted to identify passenger satisfaction utilizing online reviews to improve the level of airport service. Nam, Ha, and Lee [22] redesigned the airline service to be more customer-focused using topic modeling. More relevant research results can be found in Sections 2.1 and 3.2.
Despite the easy and wide applications of IPA in service literature, related studies show that a typical IPA needs to overcome a few major challenges. First, it is not clear how well service attributes can be selected and organized to effectively reflect critical customer perceptions and thus be customer experience-based as much as applicable [27,28]. It is necessary to incorporate direct customer perceptions of service experiences into the selection of service attributes rather than simply relying on survey designs from previous studies. If this is not done, there exists the possibility that the evaluation criteria might not account for exact customer tastes and perceptions. Second, researchers must take into consideration how well relative importance can be derived using meaningful implicit (e.g., statistical) approaches while comparing all service attributes simultaneously [10,27,28,29,30]. It should be determined whether a multidimensional measure is present [27,31], as well as whether it is able to account for market-driven importance as compared with other competitors in the same market [18,32,33,34]. If the importance is not relatively measured, the IPA results become biased [27]. For example, when absolute importance is used, it frequently has a positive upward tendency, i.e., the so-called ceiling effect [10,18,27,35]. Third, it should be determined how well the analysis guarantees the statistical independence between the importance and performance of attributes in the IPA space [10,27,35]. Unless the statistical independence is ensured in the IPA grid, the decision based on the IPA will not be effective. The more that orthogonality between importance and performance is guaranteed, the greater the explanatory power of the IPA.
Previous studies have shown that dependency is less problematic when implicit approaches are used to derive attributes [10,18,28,35], and there have been many implicit approaches developed and proposed to fix the dependency issue [18,29,30,31,36,37].
To cope with these challenges, we propose a text analytics-based IPA (hereafter TAIPA) which uses online reviews and online ratings. The core idea of TAIPA is as follows. First, TAIPA extracts latent topics from customer-authored online reviews using the latent Dirichlet allocation (LDA). Since a topic is a probability distribution of word frequency from reviews enclosing passenger perceptions [22,38,39,40,41], and the perceptions of service experiences are the sum of experience perceptions from all service encounters in the corresponding service steps [22,42], a topic itself or a group of topics with similar characteristics is well represented as a service encounter [22,43]. To include customer perceptions of the entire service process, TAIPA employs service encounters, in which customer service experiences occur and, thus, most customer perceptions are created, as a set of attributes composed of LDA topics that include direct perceptions of service experiences. The attributes of TAIPA become a complete set of actual customer experience-based criteria for an evaluation. Second, TAIPA assigns the importance of attributes using the results of the LDA modeling. As the probability of word frequency represents weight in a topic [38,39], the probability of a specific word is regarded as importance. Then TAIPA computes word importance, topic importance, and attribute importance consecutively based on the following relations. A topic is composed of multiple words, and an attribute is composed of multiple topics. The attribute importance becomes multidimensional, and relative as well since LDA produces probability distributions for topics at the same time. In addition, importance is adjusted via a sentiment analysis of online reviews to provide more precise measurements for customer experience perceptions.
Third, TAIPA uses online ratings of customers as the performance of attributes because customers frequently expose service performance by means of rating values while perceiving the service [15,16,18,41,44,45]. The rationale of this process will be discussed in detail in Section 3.2. TAIPA subsequently generates IP pairs of all attributes and positions the pairs in the IPA grid. Since attributes are represented by service encounters and service encounters are one-to-one related with service steps, TAIPA is specialized in determining which service step needs to be improved. As a version of TAIPA that considers competitors, TAIPCA extracts attributes from the online reviews of all comparable companies to find shared attributes on average among these competitors. The remaining TAIPCA steps are executed in the same way as those in TAIPA. We also analyze statistical correlations between IP pairs to judge the statistical independency issue. In particular, we compare statistical correlations between performance and importance with or without sentiment score adjustments. Finally, we explain how TAIPA is effective in terms of representing customer satisfaction using statistical analyses.

Survey-Based versus Data Analytics-Based Service Evaluation
One of the key elements for a sustainable business is to accurately assess customer satisfaction perceived through customer experiences [8]. Customer satisfaction has a strong relationship with service quality which can be estimated by the gap between the customers' expectations and perceived service performance [8,12,31,46]. Therefore, customer perceptions have been regarded as a critical measurement in service literature. This trend is evident in the aviation management field, where most studies thus far have endeavored to analyze customer perceptions and estimate service quality or customer satisfaction as precisely as possible by mainly relying on survey-based investigations. The survey-based study results are as follows. Koklic, Kukar-Kinney, and Vegelj [47] investigated how differences in airline business models could affect passenger perceptions. They found that there were no significantly different perceptions between low-cost carrier (LCC) passengers and full-service carrier (FSC) passengers. In contrast, Suhartanto and Noor [48] found that there existed significant differences between LCC and FSC passengers. They asserted that FSC customers were satisfied with the service more easily than LCC customers. Hussain, Al Nasser, and Hussain [49] showed that customer satisfaction was affected by service quality, perceived value and brand image. Rajaguru [50] concluded that value for money positively influenced the customer satisfaction of airline passengers. An and Noh [51] demonstrated the effects of inflight service quality on airline passenger perception. They collected data from two different types of seat classes, and they found that there were considerable differences that could have an influence on customer satisfaction. In the practical field, Skytrax, a world-renowned company for airline service evaluations, provides an annual report of airline rankings by assessing airline services through the traditional survey-based method.
The survey is designed to evaluate the overall quality of service processes, including cabin services, ground and airport, and onboard products. The results are used as the basis for customer satisfaction estimation. Several weaknesses in survey-based methods for estimating customer satisfaction have been pointed out [18,20,43]. They include sample size issues, collection time and cost issues, challenges in collecting real-time samples, statistical errors via incomplete sample representativeness, and missing and invalid samples. Above all, it is concerning that customer perceptions of service experiences can frequently be collected only in a restricted manner since respondents can answer only the questions within the questionnaire provided by the survey designer. To counter this weakness, online reviews have been recognized as a recent data source worth exploiting. Various studies using reviews have demonstrated their value and usefulness in related research [14,17,25,26,41,45,52]. Online reviews are easily gathered with big data in real-time [22]. As voluntary responses and evaluations, the reviews hold direct and honest perceptions of customers without format restrictions, and collection costs and time commitments are minimized. Online reviews are less likely to lead to statistical errors due to an insufficient level of sample representativeness, which is inevitable in typical survey-based sampling, because they are not sampled but fully collected given the collecting principle. Lee and Yu [41] proposed a method of measuring customer satisfaction using online reviews and ratings generated by airport users. Based on a sentiment analysis, Gitto and Mancuso [17] made an effort to analyze traveler satisfaction making use of online reviews to upgrade the level of airport service. The discovery of airport passenger satisfaction is crucial information for airport managers.
Stamolampros, Korfiatis, Kourouthanassis, and Symitsi [26] explored the cultural influences implied in online reviews and ratings through topic modeling. According to the results of their study, customers were more likely to be satisfied because providers recognized the specific level of cultural influences. Jia [45] analyzed the motivation and customer satisfaction of yoga customers through LDA implementation. For the hotel industry, Berezina et al. [14] used text analytics to evaluate customer satisfaction, and they provided insight into the impact of customer satisfaction on corporate management. Kim, Lim, and Brymer [52] investigated the influence of online reviews in the hotel industry. They found that online reviews such as overall ratings and negative responses had to be managed seriously to increase the degree of customer satisfaction.

Importance-Performance Analysis (IPA) as Service Evaluation
IPA is a relatively simple analytical tool that arranges attributes to quadrants composed of two dimensions of importance and performance [10,12,27,28]. As the tool uses both axes when evaluating attributes, it has been widely applied in various research fields, especially in tourism and hospitality studies [10]. Before the introduction of IPA, the survey method was usually performed using only one of the two dimensions when assessing attributes [12]. The capability of simultaneously assessing both dimensions of attributes led to more precise understandings of customer perceptions. According to the results of various studies [27,53,54,55], IPA is a suitable technique for assessing customer satisfaction. Matzler, Sauerwein, and Heischmidt [56] measured the customer satisfaction of bank users by proposing IPA combined with the three-factor theory. Matzler, Bailom, Hinterhuber, Renzl, and Pichler [57] performed theoretical and empirical research in the automobile industry. They provided strategic insight by displaying critical attributes that affected customer satisfaction in an IPA grid. Boley, McGehee, and Hammett [58] used IPA to make a plan to strengthen sustainable tourism based on residents' perceptions. By analyzing attributes affecting the customer satisfaction of hotel customers, and displaying the attributes in the IPA space, Susskind [59] found that energy saving activities could have an impact on the customer satisfaction of the hotel business. Ziegler, Dearden, and Rollins [60] and Tonge and Moore [61] used IPA to assess the customer satisfaction of amusement park tourists. The researchers suggested practical improvements for the management of amusement parks while interpreting the IPA grid. Results are also available from several IPA studies applied to the air transport industry. Tsafarakis, Kokotas, and Pantouvakis [31] measured customer satisfaction using a multiple-criteria decision method and interpreted multiple attributes using an IPA grid.
They proposed a methodology of customer-satisfaction assessment for an entire airline service process through analyzing several important attributes. Chang and Yang [62] attempted to identify the precise perceptions of kiosk users by applying IPA. They determined the service attributes of the kiosk service based on the critical incident technique.
The traditional IPA is limited in its ability to reflect the customer perceptions of service experiences. To remedy this, we derive attributes and calculate their importance and performance using online reviews with ratings generated by a huge number of passengers. Because online reviews with ratings contain a sufficient representation of passenger experience perceptions, we can derive passenger-oriented attributes. The traditional IPA was designed to select attributes of the target product/service explicitly. However, due to the shortcomings of explicitly derived attributes, there have been many implicit approaches to compose attributes [18,29,30,31,36,37]. The approaches to derive attributes can be categorized into two types, explicit and implicit, depending on how attributes can be derived [18,28,36,63]. When the attributes of explicit methods are measured by questions in the questionnaire, respondents score the importance and performance of attributes sequentially. Therefore, if respondents recognize an attribute as important, they tend to overestimate the performance of the attribute as well. In this case, importance and performance are likely to be dependent on each other with a positive slope. Namely, they show statistical correlations or even causal relationships between them [10,27,28]. As a result, the ceiling effect is occasionally observed in the IPA grid if attributes of explicit approaches are employed [27,28,36,57]. By contrast, implicit methods reorganize or reinterpret existing attributes to meaningful new ones through inferring processes such as statistical modeling and the subsequent changing processes of their characteristics and dimensions. When being changed implicitly, the statistical dependence between two dimensions is often lessened [18,28].
Table 1 summarizes the size of the statistical correlation between importance and performance using datasets given in the papers, which have been selected according to journal reputations and citation numbers in the related IPA literature. To statistically compare the difference in the correlation sizes between the two groups, a t-test was conducted. The results of the t-test are summarized in Table 2. We conclude that the statistical correlations of the group of explicit methods are significantly larger than those of the group of implicit methods. After applying multiple data-analytics methods to online reviews, Bi et al. [18] recently proposed a revised form of IPA (hereafter BIPA) and presented the results as applied to hotel service. Basically, BIPA extracts the important attributes from online reviews using LDA, as TAIPA does. Then BIPA estimates importance by employing an ensemble neural network-based model (ENNM), and it computes performance based on sentiment weights, which are derived from an improved one-versus-one strategy-based support vector machine (IOVO-SVM). Bi et al. [18] also developed a version of BIPA that could track attribute trends by collecting online reviews consecutively at different time periods. Because online reviews need to be monitored over time and can be easily crawled at any time according to an analyst's will, the version of trend tracking over time can be a useful IPA tool. In addition, a version of BIPA that takes competitors into account was developed to compare identical attributes across competing services. However, this version can consider only a single competitor in the market, which is an unrealistic assumption in practice. Compared to BIPA, the conceptual and operational mechanism of TAIPA is much simpler and more intuitive. The results of one-to-one comparisons between TAIPA and BIPA in terms of key features in IPA applications will be discussed in detail in Section 4.3.
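The comparison of correlation sizes between the explicit and implicit groups can be sketched in a few lines. The source states only that a t-test was conducted; the Welch (unequal-variance) form used here, the function names, and the toy correlation values are assumptions made for illustration and are not the values reported in Tables 1 and 2.

```python
from math import sqrt
from statistics import mean, variance

def pearson(x, y):
    """Pearson correlation between importance values x and performance values y."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def welch_t(group_a, group_b):
    """Welch t statistic and degrees of freedom for two independent groups."""
    va, vb = variance(group_a) / len(group_a), variance(group_b) / len(group_b)
    t = (mean(group_a) - mean(group_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(group_a) - 1) + vb ** 2 / (len(group_b) - 1))
    return t, df

# Hypothetical importance-performance correlations per study (NOT from Table 1).
explicit_corrs = [0.8, 0.7, 0.9]
implicit_corrs = [0.2, 0.1, 0.3]
t_stat, dof = welch_t(explicit_corrs, implicit_corrs)
```

A large positive `t_stat` would indicate that the explicit group shows larger correlations than the implicit group; a p-value would additionally require the t distribution, which the standard library does not provide.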

Data and Its Meaning in Text Analytics-Based IPA (TAIPA)
The data used for this study were collected from TripAdvisor, one of the most popular travel review sites [70], for the top 10 airline services posted between 1 December 2017 and 30 November 2018. The top 10 airlines were chosen based upon the annual report of best service airlines in Skytrax for 2018. Since they are picked from over 300 airlines, the level of their service quality is regarded as the highest standard of airline services [71]. To minimize the bias from a small sample size, we eliminated one airline that had fewer than 1000 online reviews. Table 3 summarizes the top nine airlines with their codes and 41,959 online reviews with numerical ratings posted voluntarily by passengers. Note that each online review has one rating value. Generational distinctions about Internet use have gradually disappeared [72], and more than 80% of online consumers are using online reviews for service or product purchases [70]. Given the fact that airlines' online sales, which began in 1995, had grown to half a trillion dollars by 2015 [73], and about 30 million users visit TripAdvisor every month [23], it is plausible to assume that passengers in TripAdvisor are representative of the flying population in general.
After experiencing service, customers express their perception of service experiences and various aspects that make up the experiences in the form of user generated content [41,44,74]. User generated content generally consists of a brief assessment of overall service performance with numerical ratings, and a detailed description of the assessment as a form of textual reviews [52]. TripAdvisor provides a spot for an overall evaluation of the service experience with scaled rating values from 1 to 5, and a blank space for the textual detailed comments of customers who experienced the service. When a customer accesses TripAdvisor to leave an evaluation of his/her service experience, the customer first gives a rating score among five choices as a general impression of the service performance. This type of evaluation can indicate how well the service is performed from a customer's point of view in general. Then the customer describes his/her special impressions to share with others while experiencing the service. The textual form of descriptions includes memorable and important experiences that elicit primarily good or bad feelings [15,26]. This type of evaluation suggests how the service is specifically important from a customer's point of view.

Analytical Framework of TAIPA
In this study, we revise IPA to evaluate service as precisely as possible from a customer's point of view. In the customer satisfaction literature, it is very important to understand the entire process of service [22,75]. This consists of a series of multiple service steps, which have a one-to-one relationship with service encounters [8,76,77,78]. Customers directly interact with various types of service components (e.g., service providers and servicescapes) at a point and/or moment called a service encounter, which is the crux of service delivery [22,76,78]. Customer service experiences are made up of experiences from service encounters in all the corresponding service steps [22,42,79,80], and customers recognize and assess the service based on resultant perceptions from interactions created in the service encounters [76,78]. This is why service providers try to synchronize their services with the resultant customer perceptions to deliver better services. Therefore, when measuring customer satisfaction, it is essential to consider a service encounter as a critical element, and it is reasonable to assume that service encounters are customer satisfaction attributes in IPA [81][82][83]. Nam et al. [22] used an LDA text analysis that derived airline service encounters from online reviews for the purpose of redesigning airline service steps. They presented seven service encounters, equivalent to service steps, using a service blueprint format through a two-step matching process of LDA topics containing the direct perceptions of passengers. In this study, while developing TAIPA, we employ the seven service encounters used in Nam et al. [22] from flight reservation to arrival at destination (i.e., reservation, pre-boarding service, boarding and ground service, take-off safety check, meal and beverage service, passenger relaxation, deplaning and post-deplaning). Figure 1 depicts the basic structure of the proposed TAIPA methodology.
When applied to LDA, customer perceptions are extracted and modeled as topics. Because an LDA topic is composed of words closely related to each other while containing passenger perceptions of service encounters [21,22,43], a topic itself or a group of topics with similar characteristics can represent a service encounter [22,43]. Service encounters then comprise a set of attributes in TAIPA. Moreover, since an LDA topic is a probability distribution of word frequency from online reviews, the importance of a word is measured by the probability (weight) of the word. This suggests that a word is mentioned more frequently when it is more important [21,22]. The importance of a topic is calculated by the sum of the importance of words in the same topic, and the importance of an attribute is measured by the sum of the importance of topics in the same attribute.
Meanwhile, online reviews often involve customers' emotional expressions and their sentiment polarity and strength need to be gauged. If importance is only measured by frequency-based probability without considering the sentiments of customers, customer perceptions might be misunderstood. Therefore, we analyze sentiments using the Google natural language application programming interface (Google NLP) to adjust importance purely based on frequency while incorporating customer emotions contained in online reviews. There are a number of studies that have tried to incorporate sentimental aspects to obtain a more precise understanding of customer experiences [41,74,[84][85][86]. Fang, Ye, Kucukusta, and Law [86] utilized a sentiment analysis for a more reliable investigation of textual comments to achieve sustainable and smart tourism. Xiang, Schwartz, Gerdes, and Uysal [74] found that the application of a sentimental approach could enrich findings of the study for hotel customer experiences. Li, Li, Hu, Zhang, and Hu [85] analyzed the reviews of tourists through a sentiment classification combining the results of topic modeling. They captured more accurate determinants of customer satisfaction by classifying hotel customers' sentiments. Lee and Yu [41] also proposed a conceptual framework employing sentiment scores with LDA to specify the relationship between Google star ratings and airport service quality (ASQ) ratings.
Customers recognize service performance through providers' service deliveries that occur in service encounters [22,42,79,80]. The degree of service performance perceived by customers suggests the level of stimulus, which is evaluated according to customers' subjective criteria in service encounters [76,78,87]. Related research in the hospitality industry, including for restaurants and hotels, indicates that customers reveal service performance in the form of online ratings as they experience and perceive the service [14,15,41], and this shows that ratings can play an important role in estimating customer satisfaction [16,18,41,45].
Therefore, we use online ratings posted with reviews in TripAdvisor as the performance of attributes in the proposed approach. Lee and Yu [41] showed that the Google star ratings composed by the users of airport services represented the level of the service performance of the airport. O'Connor [15] confirmed that customers expressed the entire assessment with ratings on a 5-point scale for the service that they had experienced while taking into account various elements of the service process.
After all of the related calculations are completed, TAIPA matches pairs of importance adjusted by the sentiments (Is) and performance (P) for each attribute and places the pairs in the TAIPA space. When placing the pairs, TAIPA standardizes them with the mean and standard deviation of each of the two dimensions (Is and P) because the scale ranges of the two dimensions are considerably different. Note that the adjusted importance must be between −1 and 1 (see details in Section 3.3), and performance must be between 1 and 5. Due to the standardization, all attributes should be interpreted in a relative manner within a target company. The standardization, however, makes it easier to compare among multiple competitors. We will show how to specifically apply TAIPA to a real problem in the following section.
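The standardization and grid placement described above can be illustrated with a minimal sketch. It assumes z-scores computed with the population standard deviation and a split of the grid at zero on both standardized axes; the source does not specify either choice. The quadrant labels follow the conventional IPA wording, and the attribute values are hypothetical.

```python
from statistics import mean, pstdev

def standardize(values):
    """Return z-scores of a {attribute: value} mapping (population std. dev. assumed)."""
    data = list(values.values())
    mu, sigma = mean(data), pstdev(data)
    return {name: (v - mu) / sigma for name, v in values.items()}

def quadrant(z_importance, z_performance):
    """Map a standardized IP pair to a conventional IPA quadrant label."""
    if z_importance >= 0:
        return "keep up the good work" if z_performance >= 0 else "concentrate here"
    return "possible overkill" if z_performance >= 0 else "low priority"

# Hypothetical sentiment-adjusted importance (Is) and performance (P) values.
importance = {"reservation": 0.02, "meal and beverage service": 0.10,
              "passenger relaxation": 0.08, "take-off safety check": 0.03}
performance = {"reservation": 4.5, "meal and beverage service": 3.0,
               "passenger relaxation": 4.4, "take-off safety check": 3.2}

z_i, z_p = standardize(importance), standardize(performance)
grid = {name: quadrant(z_i[name], z_p[name]) for name in importance}
```

Because both axes are centered on the company's own means, the resulting labels are relative within that company, which matches the interpretation given in the text.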

Deriving Importance, Sentiment Score, and Performance
Because an accurate determination of importance with a configuration of attributes is the most critical work in the IPA application, Azzopardi and Nash [28] and Oh [27] supplied useful guidelines and insights. As explained previously, the LDA topic modeling is responsible for these decisions and configurations in the TAIPA application. The LDA modeling provides topics suitable to structure a set of attributes and attribute importance, which is presented in the form of the probabilities of specific words in a topic. Based upon the perplexity function in terms of the number of topics (= k), the LDA modeling is optimized at around k = 18 for all airlines. For example, Appendix A gives the results of the LDA application to SQ with service encounters labeled in the second rows. The top 15 words by probability in every topic are used for topic labeling [21,22], and the probability is used for importance. The top 15 words account for 56.0% of a topic on average (see the last rows of the tables in Appendix A). To obtain the sentiment scores of reviews, TAIPA uses Google NLP, which works with lexical resources accumulated from a large number of unspecified common internet users, including those on TripAdvisor. Thus, the results of the sentiment analysis for words reflect a general usage of lexicons that most users agree with. Google NLP produces the sentiment scores of texts from −1 (negative) to 1 (positive).
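As a small illustration of the top-word selection, the sketch below ranks the words of one topic by probability, keeps the top n, and reports the share of the topic's probability mass they cover. The function name and the toy distribution are hypothetical, and the sketch assumes that a topic's word probabilities sum to 1 over the vocabulary.

```python
def top_words(topic_word_probs, n=15):
    """Return the n most probable words of a topic and the probability mass they cover."""
    ranked = sorted(topic_word_probs.items(), key=lambda item: item[1], reverse=True)
    top = ranked[:n]
    coverage = sum(prob for _, prob in top)
    return top, coverage

# Hypothetical topic-word distribution (a real topic spans the whole vocabulary).
toy_topic = {"seat": 0.4, "crew": 0.3, "food": 0.2, "delay": 0.1}
top, coverage = top_words(toy_topic, n=2)
```

Averaging `coverage` over all topics with n = 15 would give the kind of figure the text reports as 56.0%.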
To illustrate the calculation of the importance of a service encounter, we combine the LDA modeling results in the table in Appendix A with sentiment scores. For example, if we want to calculate an attribute's importance for the pre-boarding service in SQ, two topics (T4 and T9) that comprise the service encounter are selected from the table in Appendix A. The top 15 words of T4 and T9 have been used to label their topics as pre-boarding service. The sentiment scores of the 15 words from Google NLP are also inserted in the table. The combined data are summarized in Table 4. The importance of a topic and an attribute can then be obtained as follows:
(1) Adjusted importance of word $w$ in topic $k$: $WI_{w,k} = I_w \times s_w$, where $w = 1, 2, \ldots, 15$, $I_w$ is the importance of the word, and $s_w$ is the sentiment score of the word.
(2) Importance of topic $k$: $TI_k = \sum_{w=1}^{15} WI_{w,k}$, where $k = 1, 2, \ldots, 18$.
(3) Importance of attribute $i$: $AI_i = \sum_{n} TI_n / m$, where $i = 1, 2, \ldots, 7$, $n$ is the topic index in the same attribute, and $m$ is the number of topics in the same attribute.
As a result, the importance of T4 is 0.093 and that of T9 is 0.088, and their average value (= 0.091) is finally determined as the importance of the pre-boarding service. Because importance is between 0 and 1, and the sentiment score is between −1 and 1, the importance adjusted with sentiment scores must be between −1 and 1.
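Equations (1)-(3) translate directly into a few lines of code. The sketch below is illustrative only: the function names and the input layout (a list of `(I_w, s_w)` pairs per topic) are assumptions, and no values from Table 4 are reproduced.

```python
def topic_importance(words):
    """Equations (1) and (2): sum of I_w * s_w over a topic's top words.

    words is a list of (importance, sentiment) pairs, where importance
    is the LDA word probability and sentiment lies in [-1, 1].
    """
    return sum(i_w * s_w for i_w, s_w in words)


def attribute_importance(topics):
    """Equation (3): mean topic importance over the m topics that make
    up one attribute (service encounter)."""
    return sum(topic_importance(t) for t in topics) / len(topics)
```

With the topic-level values reported in the text, the pre-boarding attribute is the mean of 0.093 and 0.088, that is 0.0905, which agrees with the reported 0.091 up to rounding of the topic values.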
Performance, the X axis of the TAIPA grid, indicates how well the service is performed from a customer's point of view in general. Since an online rating for performance is currently given at the review level (one review with one rating), a rating should be converted to the word level to relate it to the importance of the word. To exploit the structure of the LDA modeling as it is, the performance of a word for every airline is calculated as the average rating of all the online reviews that include the specific word from the LDA modeling results. For example, the word 'excel' belongs to T4, and there are 990 reviews of SQ containing 'excel'. The performance of 'excel' in SQ is measured using the average rating of the 990 online reviews. By doing this, TAIPA can determine the performance at the word level. The performance of a topic and an attribute can be obtained as follows:
(4) Performance of word $w$ in topic $k$: $WP_{w,k} = \sum R_w / E_w$, where $w = 1, 2, \ldots, 15$, $R_w$ is the rating of a review including the word, and $E_w$ is the number of reviews including the word.
(5) Performance of topic $k$: $TP_k = \sum_{w=1}^{15} WP_{w,k} / 15$, where $k = 1, 2, \ldots, 18$.
(6) Performance of attribute $i$: $AP_i = \sum_{n} TP_n / m$, where $i = 1, 2, \ldots, 7$, $n$ is the topic index in the same attribute, and $m$ is the number of topics in the same attribute.
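A minimal sketch of Equations (4)-(6) follows. The representation of a review as a `(set_of_tokens, rating)` pair and the function names are assumptions made for illustration. The sketch also assumes that every topic word occurs in at least one review, which holds when the words come from an LDA model fitted on the same reviews.

```python
def word_performance(word, reviews):
    """Equation (4): mean rating of the reviews that contain the word.

    reviews is a list of (tokens, rating) pairs, where tokens is the set
    of preprocessed words of one review and rating is its 1-5 score.
    """
    ratings = [rating for tokens, rating in reviews if word in tokens]
    return sum(ratings) / len(ratings)


def topic_performance(words, reviews):
    """Equation (5): mean word performance over the topic's top words
    (15 words in the paper)."""
    return sum(word_performance(w, reviews) for w in words) / len(words)


def attribute_performance(topics, reviews):
    """Equation (6): mean topic performance over the topics that make
    up one attribute."""
    return sum(topic_performance(t, reviews) for t in topics) / len(topics)
```

Because every word-level score is itself an average of review ratings, attribute-level performance tends to track the airline's overall average rating, which matches the similarity the paper notes for Figure 3.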

TAIPA of Nine Airlines
Before placing the pairs of sentiment-adjusted importance and performance for the attributes in the grid, we examine whether the sentiment score adjustment has been effective in lessening the statistical dependency by analyzing the statistical correlations between the pairs. Specifically, we compare the correlations between performance and importance with and without sentiment score adjustments. Figure 2 summarizes Pearson's correlation coefficients between performance and importance with and without sentiment adjustments, in the order of Skytrax rankings. Except for three of the nine airlines (CX, GA, and BR), the correlation between performance and importance with adjustments is smaller than that without adjustments. As such, sentiment adjustments appear able to lessen the degree of statistical dependency, and we can thus conclude that TAIPA derives more effective and reliable strategic suggestions and decisions in this example.
Figure 3 shows the performance distribution of the nine airlines for the service encounter of passenger relaxation service. The graph shows the attribute performance perceived by the passengers of the nine airlines, with performance calculated via Equations (4)-(6). LH passenger perceptions of service performance were the worst among the nine comparable airlines. In contrast, NH passenger perceptions were the best. The pattern of the distribution for passenger relaxation was similar to that of the average ratings of the samples we used. This similarity derives from the method of calculating performance: to keep the original form of the LDA modeling, performance at the word level is calculated as the average rating of all online reviews that include the specific word from the LDA modeling results. For other service encounters, the situation will not be very different.
Depending on the positions of the pairs in a specific quadrant of the IPA grid, strategic suggestions are given to improve the service [10,27]. Figure 4 presents the results of the TAIPA analysis of each airline. As explained previously, TAIPA standardizes the pairs (Is, P) with their means and standard deviations when locating them, because the ranges over which the two axes vary are significantly different: importance adjusted by sentiment varies from −1 to 1, and performance ranges from 1 to 5. Note the differences in the scales of the Y axis between the distributions of performance in Figure 3 and of the adjusted importance in Figure 5, which will be discussed in the next section. We follow the traditional way of interpreting attribute locations in the quadrants: "keep up the good work" in the 1st quadrant (Q1), "concentrate here" in the 2nd quadrant (Q2), "low priority" in the 3rd quadrant (Q3), and "possible overkill" in the 4th quadrant (Q4) (see TG in Figure 4). Due to the standardization, all attributes should be read in relative terms within a single target company. In the example of TG, which ranked last among the nine airlines, the airline could maintain the current level of service at the steps of reservation (Q1) and boarding and deplaning (Q3) compared to other service steps. Although the reasons differ in terms of future action plans, TG does not need to change these steps. Relative to other steps, the airline needs to change meal service (Q2) and take-off, relaxation, and pre-boarding (Q4). For the steps in Q4, the airline needs to reduce its service capability. In contrast, TG should strengthen its meal service by rearranging its own service capabilities and leveraging the reductions in the Q4 steps.
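The quadrant reading used above can be sketched as a small lookup on the standardized coordinates, with performance on the X axis and adjusted importance on the Y axis. The function name and the treatment of points lying exactly on an axis (counted as the non-negative side) are assumptions, since the text does not discuss ties.

```python
ADVICE = {
    "Q1": "keep up the good work",  # high importance, high performance
    "Q2": "concentrate here",       # high importance, low performance
    "Q3": "low priority",           # low importance, low performance
    "Q4": "possible overkill",      # low importance, high performance
}


def quadrant(z_p, z_is):
    """Map standardized performance (X) and importance (Y) to a quadrant.

    Assumption: a value of exactly 0 is treated as "high" on that axis.
    """
    if z_is >= 0:
        return "Q1" if z_p >= 0 else "Q2"
    return "Q4" if z_p >= 0 else "Q3"
```

For instance, an attribute with below-average performance and above-average importance, such as TG's meal service in the example, maps to Q2 and the advice "concentrate here".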

Figure 2. Correlation between performance and importance with or without sentiment adjustments.
Although attributes need to be interpreted relatively within a target company, the standardization of pairs makes it possible to compare patterns between TAIPA plots, since the plots are shown on an equal grid. We can roughly form three groups with respect to the pattern of attribute importance. The performance of the first group (SQ and QR) is relatively evenly distributed from −1.5 to 1.5 along the X axis. However, the importance of this group varies within a very small range along the Y axis, except for one attribute. In the second group (EK and BR), the pattern of the importance distribution is slightly looser than that of the first group. Here the importance is divided into two parts by the X axis: the variation within a part is still small, but the variation between parts increases. For EK, the two parts (take-off, reservation, and meal service vs. pre-boarding, deplaning, boarding, and relaxation) are clearly separated by the X axis. In the last group (NH, CX, LH, GA, and TG), attributes are scattered along both axes, and no particular pattern of importance is observed.
If we interpret the groups using the given Skytrax rankings, we may conclude that the two highest-ranking carriers (SQ and QR) tend to consider all service steps to be equally important, if one service step is ignored as an outlier in the importance distribution. In contrast, the low-ranking group (its members are still world-class, and NH is third in Skytrax) shows a random pattern in the importance distribution. The mid-ranking group of EK and BR, which falls between the first and last groups in terms of importance, tends to focus on a few service steps while showing a clear-cut distinction between important and unimportant service steps.

TAIPA Considering Competitors
A typical IPA has been devised for the evaluation of a single company. However, a customer is more likely to make a choice among competitors within the same market or business sector when purchasing a product/service. To represent a more realistic circumstance, the typical form should be revised to adopt this condition. First, attributes should be reorganized to show commonality among competitors. Second, an attribute's importance should be market-driven as compared to all the competitors in the same market. Third, an attribute's performance should be evaluated simultaneously in comparison with the competitors. A few studies have been developed to resolve the problem while assuming there is only one competitor in the market [18,32,34]. They focus on attribute differences between one company and one competitor in performance and importance.
When TAIPA is revised for this aim, a more realistic approach can be designed to accommodate the problem. Since TAIPA is sufficiently flexible to include multiple datasets of comparable companies, the version of TAIPA for competitors (i.e., TAIPCA) provides strategic suggestions that are market-driven and simultaneously consider multiple competitors in the same market, as opposed to the existing approaches, which consider one competitor. The only difference between TAIPA and TAIPCA applications is the number of company datasets that the LDA analysis uses. TAIPA uses the dataset of a single target company, while TAIPCA works with the datasets of multiple target companies that need to be compared. In the latter case, the LDA results produce attributes shared among the targets. The LDA results also give the shared distribution of their importance, as in the TAIPA application. Figure 5 depicts the joint distribution of importance with respect to service encounters after the sentiment adjustment. That is, the importance distribution has been derived from all passenger perceptions for the nine airlines. Of interest, we observe that the importance of the first three service encounters (from reservation to boarding) is higher than that of the others in this example. Passengers of the nine airlines perceive service as more important in the beginning stages, referred to as the moment of truth (MOT), where the first customer interactions take place [88,89]. Because the MOT is known to be a primary determinant of customer satisfaction, airlines need to sharpen their service capabilities around those MOT steps or encounters. Figure 6 shows the plot of the TAIPCA application for the nine airlines. When plotting pairs, the adjusted importance (Is) and performance (P) for every attribute are standardized to compare the airlines within an equivalent grid.
As seen in the figure, the attribute locations are determined only by performance because TAIPCA uses the identical shared distribution of importance. Note that the attribute locations are identical along the Y axis and vary only along the X axis. With respect to the attribute distributions, the nine airlines can be classified into four groups. A simple criterion is reused from the classification in the previous section. The airlines are not required to change if attributes are located in Q1 and Q3, whereas they are required to change for improvement if attributes are located in Q2 and Q4. Major action plans are suggested for the attributes in Q1 and Q2 because they are important. The first group includes NH, EK, LH, and GA. The primary observation in this group is that the MOT encounters are located in Q1, while nothing is observed in Q2. This indicates that the airlines effectively provide the service based on customer perceptions: they demonstrate high performance for the important steps, and the vacancy in Q2 indicates that there are no important attributes with low performance. The second group includes QR and CX. They show the same locations for all service steps except for the deplaning step. Before anything else, the reservation service in Q2 should be improved. The third group contains SQ and BR. They show the same placements for all service steps except for the reservation and take-off steps. Both airlines should change the boarding step, and SQ also needs to improve the reservation step. The last group is composed solely of TG. No service steps are observed in Q1. All of the MOT steps are in Q2, and considerable effort is required for improvement.

Figure 6. TAIPCA plot of nine airlines (panels: NH, EK, TG, LH, GA, SQ, QR, CX, BR).

BIPA versus TAIPA
Bi et al. [18] introduced a new IPA that combines various data analytics tools, one of which is LDA analysis, and applied it to hotel service. BIPA and TAIPA share two common features. First, both approaches use online reviews as a data source to overcome the weaknesses of data collected by survey methods. Online reviews are effective in terms of collection costs and time, and they are easily accumulated as a form of big data in real time. Thus, inferences derived from them are relatively less exposed to statistical errors due to insufficient sample representativeness. Second, both methods employ LDA analysis to extract customer-focused attributes based on the direct perceptions of service experiences, which are faithfully embedded in online reviews. However, there are considerable differences between BIPA and TAIPA. They take significantly different approaches to calculating importance and performance. TAIPA relies on the fact that LDA modeling simultaneously supplies topics suitable for organizing attributes as well as attribute importance, which is expressed in the form of a probability distribution. Moreover, sentiment scores from Google NLP adjust importance to give more accurate calculations of customer perceptions. This adjustment also helped to reduce the statistical dependency between importance and performance (see Figure 2). In TAIPA, passenger online ratings are used as the performance of attributes, as in much of the previous literature. In contrast, BIPA computes importance using a neural network-based analysis of numerical data, and it estimates performance using a sentiment analysis derived from a support vector machine-based analysis of textual data. Finally, a major difference in TAIPA is the use of service encounters as service attributes to fully cover the entire service process.
Therefore, strategic action plans from a TAIPA analysis would be more direct and offer more details regarding steps that require change. Table 5 summarizes the major differences between BIPA and TAIPA with respect to critical features in practical applications.

TAIPA as Customer Satisfaction Measure
As discussed in detail in Section 2, a number of studies have strived to estimate customer satisfaction as precisely as possible by employing valid models. To verify how effectively TAIPA describes customer satisfaction, we propose a measure of customer satisfaction based on TAIPA calculations. According to the related research [27,90,91], importance has been treated as a weighting variable for the performance of attributes. Pezeshki, Mousavi, and Grant [92] showed that this measure could be estimated by importance, which is a function of performance. Oh and Park [93] employed the relationship between attribute importance and performance to measure customer satisfaction in the lodging industry. They multiplied the two dimensions of importance and performance and used the product as a weighted customer satisfaction measurement model. Similar to the study by Oh and Park [93], a gauge of customer satisfaction (CS) in TAIPA can be represented by the multiplication of importance and performance, as given in Equation (10). Using Equation (10), TAIPA is verified in terms of how well it represents customer satisfaction. First, we investigate statistical correlations between TAIPA calculations and typical benchmarks of firm performance in the air transport industry. Services that meet the customer's needs usually lead to a high level of customer satisfaction, and a high level of customer satisfaction is closely related to the positive performance of the business [94,95].
Although several studies have claimed that customer satisfaction could have an endogenous relationship with firm performance, they still show that customer satisfaction is positively related to firm performance [96][97][98]. Here our concern is only the correlation, not the causality, between customer satisfaction and business performance. We compute Pearson's correlation coefficients between the measurements of customer satisfaction in Equation (10) and major benchmarks of performance for the nine airlines. Revenue passenger kilometers (RPK), available seat kilometers (ASK), and total revenue are used as the benchmarks and are divided by the number of aircraft of each airline (i.e., RPK/aircraft, ASK/aircraft, and total revenue/aircraft) to remove the effects of business scale. Since online ratings have often been used as measures of customer satisfaction [96][97][98], the related information from TripAdvisor is adopted for comparison. To analyze the correlation with the benchmarks, the average ratings of the nine airlines for the sampling period (1 Dec. 2017~30 Nov. 2018, refer to Section 3.1) are used, and they are denoted as TripAdvisor(sample). In addition, the averages of all of the ratings for the nine airlines as of the end of November 2018 are used for the correlation calculation, and they are denoted as TripAdvisor(population). Figure 7 shows the correlation coefficients. The measure of customer satisfaction in TAIPA shows a stronger correlation with firm performance than those of TripAdvisor in all cases. This suggests the potential of the TAIPA measure as another way to gauge business performance. In addition, we conclude that the measurement of customer satisfaction works better when online reviews and ratings are combined than when online ratings are used alone.
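The correlation check described above can be sketched with a plain implementation of Pearson's coefficient and a per-aircraft normalization. The function names are hypothetical, and no airline data from the paper are reproduced here. The inputs would be one satisfaction score and one benchmark value per airline.

```python
from math import sqrt


def per_aircraft(totals, fleet_sizes):
    """Divide each airline's benchmark total (e.g., RPK, ASK, or revenue)
    by its number of aircraft to remove the effect of business scale."""
    return [total / fleet for total, fleet in zip(totals, fleet_sizes)]


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)
```

With only nine airlines, a coefficient computed this way rests on very few observations, so it is best read as descriptive evidence of association rather than as a test of causality, in line with the caveat in the text.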
Second, we employ Spearman's rank statistics to briefly check the similarities between the rankings produced by TAIPA and the comparison targets. Based on the measurements in Equation (10), TAIPA can rank the nine airlines. The rankings of TripAdvisor(sample) and TripAdvisor(population) are determined by the values explained above. Table 6 summarizes Spearman's correlation coefficients between the methods. We can see a relatively clear positive relationship between TripAdvisor and Skytrax (>0.5) but not with TAIPA. This indicates that TAIPA is a rather different measure from the other three methods. Given that the Pearson correlation study above suggests that TAIPA is more effective than the others, this Spearman result is in line with the Pearson result.
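The ranking comparison can be sketched with the classical Spearman formula for rankings without ties, which suits a ranking of nine distinct airlines. The helper names are hypothetical, and the assumption of no tied scores is ours, since the text does not discuss ties.

```python
def ranks_from_scores(scores):
    """Convert scores to ranks, with 1 for the highest score.

    Assumption: all scores are distinct (no ties).
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def spearman_from_ranks(rank_a, rank_b):
    """Spearman's rho for two tie-free rankings of the same items:
    rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), with d the rank difference."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))
```

A value near 1 means two methods order the airlines almost identically, whereas a value near 0 indicates that the rankings are largely unrelated.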

Conclusions
Related previous studies have emphasized the need to improve IPA to be a more effective and valid analysis framework for real-life applications. The proposed TAIPA in this paper is designed to achieve this aim while utilizing direct service experience perceptions represented by online reviews with numerical ratings. TAIPA provides the following academic implications in the IPA literature. First, TAIPA organizes attributes to incorporate customer perceptions as accurately as possible by using text analytics approaches. The attributes become a set of actual customer experience-based criteria for customer evaluations. If attributes are constructed to combine exact customer experiences, the results of the analysis will be correct and valid, and strategic decisions stemming from the results will be effective and reasonable. Second, TAIPA statistically computes the relative importance based on the probability distribution generated by the LDA application to reviews, and it calculates performance based on the statistics of online ratings with the structure of LDA results. Using review sentiments, importance has been adjusted to supply more accurate calculations, and the orthogonality between importance and performance has been maintained in the TAIPA space. These efforts are able to mitigate the ceiling effect. Third, TAIPA is sufficiently flexible to include multiple competitors. Unlike the survey-based IPA, TAIPA has no restrictions on the use of big data comprising online reviews with ratings. Using multiple competitor datasets, TAIPCA produces joint attributes and the shared distribution of attribute importance. By simultaneously comparing all competitors in the same market, the attribute's importance satisfies the condition that the importance should be market-driven. Fourth, the effectiveness of TAIPA in representing customer satisfaction is verified based on the two kinds of statistical correlation studies: Pearson's and Spearman's correlation statistics. 
Also, TAIPA is a simpler and more intuitive IPA methodology than the existing approach.
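As an illustration of the two correlation statistics mentioned above, the following minimal sketch computes Pearson's and Spearman's coefficients between a per-airline satisfaction score and a firm performance benchmark. It is not the implementation used in this study: the function names and the sample values are hypothetical and serve only to show the form of the calculation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson's correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for five airlines (illustrative only, not study data).
satisfaction_score = [0.62, 0.48, 0.71, 0.55, 0.80]
firm_benchmark = [3.1, 2.4, 3.9, 2.2, 4.5]

r = pearson(satisfaction_score, firm_benchmark)
rho = spearman(satisfaction_score, firm_benchmark)
```

Pearson's coefficient captures linear association, whereas Spearman's depends only on the rank order, so reporting both guards against a result driven by a single extreme airline.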
The practical implications of this work are as follows. First, to evaluate the entire service process, TAIPA makes use of service encounters, in which customer perceptions are for the most part created. As a result, TAIPA is well suited to determining which service step needs to be improved because of its one-to-one relationship with service encounters. Thus, TAIPA can give strategic guidance for efficient, priority-based service improvement. Second, the proposed TAIPA comprises a series of evaluation processes, from structuring attributes to scoring attributes, that are based entirely on the customer's perspective. Because TAIPA evaluates a service using real-time customer perceptions rather than fixed criteria, practitioners can employ TAIPA as a service evaluation tool that reflects the latest customer characteristics. A thorough service evaluation from the customer's perspective is a starting point for business sustainability. Third, because the market-driven importance of attributes can be updated in real time, TAIPA promptly offers strategic suggestions and decisions in response to market changes such as the entry of new competitors and a competitor's rise or fall. Flexible responses to market changes lead to customer satisfaction. Customer satisfaction provides a concrete basis for provider competitiveness and is a critical requirement for a sustainable service.
Finally, we share possible directions for future research related to TAIPA. We have shown that the TAIPA measure is effective in terms of the degree to which it represents customer satisfaction. To do so, we conducted statistical analyses against benchmarks of business performance. The types of analyses and the benchmarks of firm performance need to be expanded. For example, a more sophisticated correlation analysis with firm performance could be obtained by applying TAIPA to longitudinal data. There may be a time lag before customer satisfaction is reflected in indices of business performance, and vice versa. Thus, a TAIPA version for time-series data is required to account for the effects of this time lag. Moreover, a more meaningful correlation analysis would be possible if companies measured their performance at the service step level. If performance were reported at the level of service steps rather than at the level of the entire company, correlation results would be more realistic because they would draw on more degrees of freedom in the data. In addition, we have observed empirically that sentiment adjustments reduce the statistical dependency between importance and performance. Along with further verification on various examples, the theoretical rationale behind this observation should be established where possible.
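To make the time-lag idea concrete, the sketch below shows one simple way such an analysis could be set up: a satisfaction series is correlated with a performance series shifted by a candidate lag, and the lag with the highest Pearson correlation is selected. This is only an assumption about how a time-series extension might proceed, not a method developed in this paper; the function names and the sample series are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def lagged_pearson(satisfaction, performance, lag):
    """Correlate satisfaction[t] with performance[t + lag].

    A positive lag means satisfaction leads performance; a negative lag
    covers the reverse direction.
    """
    if len(satisfaction) != len(performance):
        raise ValueError("series must have the same length")
    n = len(satisfaction)
    if lag >= 0:
        x = satisfaction[: n - lag]
        y = performance[lag:]
    else:
        x = satisfaction[-lag:]
        y = performance[: n + lag]
    return pearson(x, y)


def best_lag(satisfaction, performance, max_lag):
    """Return the lag in [-max_lag, max_lag] with the highest correlation."""
    candidates = range(-max_lag, max_lag + 1)
    return max(candidates, key=lambda k: lagged_pearson(satisfaction, performance, k))


# Hypothetical per-period series (illustrative only, not study data).
# Here performance repeats the satisfaction pattern two periods later.
satisfaction = [1, 3, 2, 5, 4, 6, 8, 7]
performance = [2, 1, 1, 3, 2, 5, 4, 6]

selected_lag = best_lag(satisfaction, performance, max_lag=3)
```

Each additional period of lag removes one overlapping observation, so with short series the selected lag should be interpreted with caution.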