NLP-Based Customer Loyalty Improvement Recommender System (CLIRS2)

Abstract: Structured data on customer feedback is becoming more costly and time-consuming to collect and organize. On the other hand, unstructured opinionated data, e.g., in the form of free-text comments, is proliferating and available on public websites, such as social media websites, blogs, forums, and websites that provide recommendations. This research proposes a novel method to develop a knowledge-based recommender system from unstructured (text) data. The method is based on applying an opinion mining algorithm, extracting an aspect-based sentiment score per text item, and transforming text into a structured form. An action rule mining algorithm is applied to the data table constructed from sentiment mining. The proposed application of the method is the problem of improving customer satisfaction ratings. The results obtained from the dataset of customer comments related to repair services were evaluated in terms of accuracy and coverage. Further, the results were incorporated into the framework of a web-based, user-friendly recommender system that advises businesses on how to maximally increase their profits by introducing minimal sets of changes in their service. Experiments and evaluation results from comparing the structured data-based version of the system, CLIRS (Customer Loyalty Improvement Recommender System), with the unstructured data-based version of the system (CLIRS2) are provided.


Introduction
This research addresses the scarcity of applying intelligent systems to the problem of customer satisfaction improvement. Current solutions mainly streamline collecting customer feedback. Recommender systems (RS) have been applied in the e-commerce setting to help the customers in Business-To-Customer (B2C) make decisions. However, not many RS have been applied in the Business-to-Business (B2B) setting. For the B2B relationship, the information of interest would be the contracts, product prices, quality of service, and payment terms. Past studies have shown that consumer satisfaction of B2B is lower than that of B2C, which indicates that the enterprises in the B2B market do not understand their customers profoundly and do not make an efficient response to their needs. Applying RS technology to B2B can bring numerous advantages: streamline the business transaction process, improve customer relationship management, improve customer satisfaction, and make dealers understand each other better. B2B participants can receive different useful suggestions from the system to help them do better business. In addition, the recommender system in B2B can be linked with the enterprise's back-end information system and augment the company's marketing. The system can analyze sales history and customers' comments and give advice on marketing issues, such as how to improve products, the customers' purchase trends, etc.
The article starts with an overview of the currently available data analytics tools in the customer relationship management domain. Then, we show how to develop a knowledge-based recommender system from unstructured data and present an algorithm that transforms unstructured text data into a structured form using aspect-based opinion mining. Next, we apply data mining techniques for actionable knowledge discovery on the transformed data. Finally, we present a method to generate business recommendations for improving customer loyalty. A comparison of the knowledge-based approach with structured data and unstructured data is presented. We also provide a discussion on future work and recommender system improvements.

Recommender Systems
A new generation of intelligent systems, known as recommender systems (RS), has been applied mostly in e-commerce settings, supporting customers in online purchases of commodity products such as books, movies, or CDs. Recommender systems aim to predict the preferences of an individual (user/customer) and provide suggestions for further resources or items that are likely to be of interest. Formally, recommender systems are defined as programs that attempt to recommend the most suitable items (products or services) to particular users (individuals or businesses) by predicting a user's interest in an item based on information about similar items, the users, and the interactions between items and users [1]. They are usually deployed as a part of many e-commerce sites, offering several important business benefits: increasing the number of items sold, selling more diverse items, increasing user satisfaction and loyalty, and helping to understand what the user wants [2][3][4][5]. The current generation of recommendation methods can be divided into four groups [6]:
• Collaborative filtering is a technique based on comparing user to user to produce recommendations.
• Content-based is a technique that associates the derived content of the items with the user profile.
• Knowledge-based relies on external knowledge about items.
• Hybrid is any combination of the methods above.

Survey Design and Data Collection
There are many tools readily available on the market that support customer feedback design and data collection. Client Heartbeat [7] is a ready-made tool built specifically to measure customer satisfaction, track changes in satisfaction levels, and identify customers 'at risk'. SurveyMonkey [8] is an online survey tool that helps to create any type of survey. However, it lacks features for measuring satisfaction and receiving actionable feedback. Customer Sure [9] is a tool that facilitates the distribution of customer surveys and the gathering of results. It contains certain intelligent techniques to analyze customer satisfaction scores over time and observe trends. Floqapp [10] offers customer satisfaction survey templates, collects the data, and puts them into reports. SurveyGizmo [11] is a tool for gathering customer feedback. It offers customizable customer satisfaction surveys; however, it lacks features to intelligently analyze the data. Temper [12] helps with gauging satisfaction as opposed to just being a survey tool. It measures and tracks customer satisfaction over a period of time. The Qualtrics Insight Platform [13] is a platform for actionable customer, market, and employee insights. It offers not only customer feedback collection, analysis, and sharing, but also insight capabilities, including tools for end-to-end customer experience management programs, customer and market research, and employee engagement.
These types of tools mostly facilitate the design of surveys and data collection; however, they offer very limited analytics and insight into customer feedback. The analytics are mostly confined to simple trend analysis (tracking whether a score increased or decreased over time) and data aggregation reports.

Customer Relationship Management (CRM) Systems
Other types of systems traditionally supporting customer relationship management are CRM systems. CRM is described as "managerial efforts to manage business interactions with customers by combining business processes and technologies that seek to understand a company's customers" [14], i.e., structuring and managing the relationship with customers. CRM covers all the processes related to customer acquisition, customer cultivation, and customer retention. CRM also involves the development of the offer: which products to sell to which customers and through which channel. CRM seeks to retain customers and design marketing campaigns. Sometimes CRM strategy is incorporated into other enterprise systems. An enterprise data warehouse has become a critical component of a successful CRM strategy [15]. Data mining techniques in this area are useful for extracting marketing knowledge and further supporting marketing decisions. The CRM systems must analyze the data using statistical tools and data mining. There are two critical components of marketing intelligence: customer data transformation and customer knowledge discovery.

Decision Support Systems
CRM systems are sometimes augmented with decision support systems (DSS). A DSS is an interactive computer-based system designed to help in decision-making situations by utilizing data and models to solve unstructured problems [14]. The aim of DSSs is to improve and expedite the processes by which management makes and communicates decisions; in most cases, the emphasis in DSSs is on increasing individual and organizational effectiveness. DSSs, in general, can improve strategic planning and strategic control. Research indicates that data-driven or data-informed organizations improve decision-making, increase profitability, and drive innovation. Strategic planning requires a large amount of information to be available for mining. Proper integration of DSSs and CRM presents new opportunities for enhancing the quality of support provided by each system.

Knowledge-Based Recommender System for B2B
In this research, we propose a novel approach to handle customer management and managerial decision support. We propose a knowledge-based and text mining-based approach for developing a recommender system. In our domain, users of an RS are business decision-makers and items of interest are strategic business actions. An item is understood as any combination of a number of business strategies: price competitiveness, staff attitude, technician's knowledge, etc.

Recommender System for B2B
The idea of applying a recommender system in the area of strategic business planning is quite novel and currently under-researched. On the other hand, the principles of the proposed approach are generalizable to multiple and diverse environments and applications. One proposed definition of a recommender system for B2B e-commerce was given in [16]: "a software agent that can learn the interests, needs, and other important business characteristics of the business dealers and then make recommendations accordingly". The systems use product/service knowledge (either hand-coded knowledge provided by experts or knowledge learned from the behavior of consumers) to guide the business dealers through the often overwhelming task of identifying actions that will maximize business metrics of the companies.

NPS-Based Recommendations
The goal of the recommendations is to improve customer loyalty, as measured by net promoter score (NPS), a current standard metric for measuring customer satisfaction (NPS®, Net Promoter®, and Net Promoter® Score are registered trademarks of Satmetrix Systems, Inc., Bain and Company, and Fred Reichheld). The idea of the NPS metric is based on categorizing a customer into one of three categories: a promoter (a loyal customer), a passive, or a detractor (a disloyal customer hurting the reputation of a company). The NPS metric is calculated as the percentage of promoters minus the percentage of detractors.
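The NPS definition above can be illustrated with a minimal Python sketch. It assumes each customer has already been assigned one of the three status labels; the function name and the lowercase label strings are illustrative choices, not part of CLIRS.

```python
from collections import Counter

def net_promoter_score(statuses):
    """Return NPS in percentage points from an iterable of status labels.

    Labels are assumed to be "promoter", "passive", or "detractor".
    """
    counts = Counter(statuses)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one customer is required")
    # Percentage of promoters minus percentage of detractors.
    return 100.0 * (counts["promoter"] - counts["detractor"]) / total

# Example: 5 promoters, 3 passives, 2 detractors out of 10 customers.
example_statuses = ["promoter"] * 5 + ["passive"] * 3 + ["detractor"] * 2
example_nps = net_promoter_score(example_statuses)  # 50% - 20% = 30.0
```

Passives count towards the total but cancel out of the numerator, which is why converting a detractor directly into a promoter moves the score twice as much as converting a passive.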

Knowledge Discovery Techniques
The proposed approach for improving NPS is based on a novel technique in data mining, called action rules [17], and on aggregating the results of multiple action rules to calculate the predicted NPS impact [18]. Action rules are patterns mined from large sets of numerical customer feedback that provide aggregated knowledge about actions necessary to change the value of a decision attribute; in our domain, this is the label of customer status, changed from a detractor to a promoter. The effect of multiple action rules is aggregated to generate recommendations with the goal of providing minimal sets of changes that bring the maximal expected impact on NPS improvement.

Sentiment Mining Techniques
Text mining can be useful to identify the frequency with which topics are mentioned and the sentiment associated with the topics. Sentiment analysis is generally defined as analyzing people's opinions, sentiments, evaluations, attitudes, and emotions from written language. We propose an algorithm for calculating the overall impact that additionally takes into account the results of text mining on text customer feedback related to the numerical scores. In particular, we propose an aspect-based sentiment analysis approach for text mining. Aspect-based sentiment analysis is based on the idea that an opinion consists of sentiment (positive or negative) and a target of the opinion, that is, a specific aspect or feature of the object [19].

Data Visualization Techniques
An important aspect to be considered in an RS is the output design and the presentation of the output, e.g., how to deliver the recommendations to the users and whether the system explains the results to the user. In this research, it is proposed to provide explainable output to allow the system's users to make more informed and accurate decisions about which recommendations to utilize. To attain this goal, it is proposed to apply novel web-based visualization techniques to help business users understand the algorithms behind recommendations, compare recommendation options, perform sensitivity analysis, and explore the raw data behind recommendations [20].

CLIRS: Data-Driven Customer Loyalty Improvement RS
The proposed data-driven approach for building a knowledge-based RS is based on a large dataset of customer feedback (around 300,000 records) collected from companies in the heavy equipment industry by a collaborating consulting company. Each record in the dataset represents a survey, with features describing the company and their particular service being assessed and the customer being surveyed. Feedback is represented by numerical scores in different areas and additional notes in the free-form text. Since there are limitations to a human (manual) approach to deriving novel insights or patterns from vast amounts of records, we propose applying data mining techniques with the goal to extract meaningful patterns about customer sentiment.
The results of data mining and text mining are incorporated into the framework of the customer loyalty improvement recommender system (CLIRS) [21]. The algorithm for generating recommendations includes the following steps (Figure 1 and Algorithm 1):
1. Data preprocessing. The original data are pre-processed, including standardization, missing value imputation, and feature extraction. Feature extraction was applied to geographic attributes describing companies. For example, an area code was extracted based on the Contact Phone, and County was added based on the Zip and Contact Phone columns. The comprehensive survey data are divided into single-company datasets.
2. Semantic Similarity. We introduce the concept of semantic similarity between two companies, which is based on the distance between classification models extracted from the datasets of the companies. Companies are similar if the semantic categorization of promoter, passive, and detractor, as extracted by a classification model, is similar. Assuming that RC[1] and RC[2] are the sets of classification rules extracted from the single-client datasets (of clients C1 and C2): where the above three sets are collections of classification rules defining correspondingly: "Promoter", "Passive" and "Detractor": In a similar way, we define: Based on the above, the concept of semantic similarity between clients C1, C2, denoted by SemSim(C1, C2), is defined as follows: The metric is used to find clients similar to a current client in semantic terms. It calculates the distance between each pair of clients. The smaller the distance is, the more similar the clients are.

3. Clustering procedure. The single-company datasets are extended with (merged with) datasets of companies that are semantically similar. The extension procedure aims at finding a "similar" company that has a higher NPS, so that knowledge from that company (or companies) can help the current client improve its NPS. We propose a hierarchical clustering algorithm for forming "clusters" of semantically similar companies (called the "HAMIS" procedure: Hierarchical Agglomerative Method for Improving NPS [22]). The algorithm uses the semantic similarity metric defined above to create a hierarchical structure called a "dendrogram". The result of the procedure is single-client datasets expanded with their semantic neighbors based on the dendrogram structure. We developed an interactive visualization to display the resulting dendrogram structure (see Figure 2).
4. Actionable knowledge mining. The datasets expanded in the previous step are mined for actionable knowledge with the goal of improving NPS by changing customers from detractors to promoters. In our domain, action rules show minimal sets of business changes that need to be undertaken in order to change a customer from being a detractor or passive to being a promoter. Action rules are composed of so-called stable attributes, flexible attributes, and the target attribute. For stable attributes, characteristics of a survey type and a customer were chosen for the experimental setup. For flexible attributes, we chose benchmark questions from surveys, which represent areas of improvement in the customer satisfaction problem. We confined the analysis to core benchmarks, which were extracted using a feature selection method based on computing decision reducts. The target attribute that we desire to change is the NPS status (from a detractor to a promoter). Each such rule is characterized by support and confidence. The support of a rule is understood as the number of customers affected by the rule; that is, the greater the support, the more customers there are who can potentially be changed from detractor to promoter. Confidence indicates how probable it is that the customer's status changes. An example of an extracted action rule is given in Listing 1, where B1, B4, and B9 denote codes of benchmark questions, and therefore specific areas of customer service. The interpretation of the rule is as follows: if the values of the benchmark questions are changed from the values given on the left side to the values given on the right side, the customer's status is expected to change from detractor to promoter with the given confidence.
5. Text mining. The text mining procedure is applied to the text comments that accompany numerical surveys. The algorithm is based on extracting sentiment towards different aspects of service (so-called aspect-based sentiment analysis). This procedure involves text pre-processing (parsing and POS tagging), extracting sentiment based on a list of opinion words, aspect-based sentiment extraction (based on a pre-defined dictionary of features and their aspects), and text summarization. The details of the procedure and the results are described in [23].
6. NPS impact calculation. We combine results from action rule mining and text mining to generate recommendations and calculate an expected NPS impact. The procedure combines different aspects and checks their NPS effect, calculated based on the support and confidence of the corresponding action rules. NPS impact is calculated as the product of support and confidence (support × confidence), which provides information about the expected number of customers affected by changes. Only sets of aspects with the highest impact are presented to the business user.
7. Visualization. Visualization techniques are proposed for displaying the recommendations in a web-based format. The visualization is interactive, and the set of recommendations changes based on the feasibility of aspects as assigned by the business user (Figure 3). Each recommendable item (visualized as a "bubble") consists of a set of aspects that should be improved in order to impact NPS positively. Each recommendable item is characterized by an attractiveness (color-coded), presented in a two-dimensional chart as a function of NPS impact and feasibility, the latter assigned by the user based on their business knowledge. Each recommendable item can be further analyzed by drilling down into the raw comments associated with the aspects included in the recommendation.

Algorithm 1 (excerpt):
    ...(promoterCommentFilePath, clientName, tagger, parser, positiveSeed, negativeSeed, featureClasses)
    listOfResultD ← miner.MiningProcessOfSegments(detractorCommentFilePath, clientName, tagger, parser, positiveSeed, negativeSeed, featureClasses)
    // Trigger action rules.
    atomicmetaActions ← actionRulesTriggeration().atomicActionTrigger
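The impact calculation in step 6 can be illustrated with a short Python sketch. The first function is the support × confidence product stated in the text. The second, which converts the expected number of converted detractors into NPS percentage points, is an assumption that follows from the NPS definition and is not necessarily the formula used inside CLIRS. All names are hypothetical.

```python
def expected_converted(support, confidence):
    """Expected number of customers changed by a rule (support x confidence)."""
    return support * confidence

def nps_gain_points(converted_detractors, total_customers):
    """NPS gain in percentage points if detractors become promoters.

    Assumption: every converted customer moves from detractor to promoter,
    which adds one promoter and removes one detractor (hence the factor 2).
    """
    if total_customers <= 0:
        raise ValueError("total_customers must be positive")
    return 100.0 * 2 * converted_detractors / total_customers

# Hypothetical rule: it covers 40 detractors and holds with confidence 0.75.
converted = expected_converted(40, 0.75)     # 30.0 customers
gain = nps_gain_points(converted, 1000)      # 6.0 points in a base of 1000
```

A rule that moves passives rather than detractors would contribute half as much per customer, so a fuller version would track the starting status of the customers each rule covers.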

CLIRS2: Sentiment Analysis Based Customer Loyalty Improvement RS
The business setting and requirements changed as we developed the data-driven version of the recommender system for improving customer loyalty (CLIRS). Namely, the business recognized that structured surveys conducted by calling end customers take too much time, and customers are reluctant or too impatient to complete them. The business decided to switch to a shorter version of the surveys, consisting of 3-5 numerical questions and mostly open-ended questions (see Figure 4). This represented a challenge for the research, as the existing version of the system was built mainly on numerical data and used text comments only as an augmentation of the recommender algorithm. In the future, the dataset will reverse its structure; that is, the text feedback will be accompanied by a few numerical questions, with most insight provided by the free-form text.
The procedure for generating data-driven recommendations (as presented in Figure 1) had to be revised since the structure of the primary source of customer feedback changed from the numerical format to text format. The new procedure (Figure 5) for generating recommendations performs mining on the text first and contains an additional step of transforming text into a structured form. Based on the revised procedure, a new version of the Customer Loyalty Improvement Recommender System (CLIRS2) was developed. In general, this new version of the system presents a novel method for developing an RS from solely text data. In contrast to the previous process (Figure 1), the procedure revised for text data (Figure 5) performs text mining first in order to build a dataset suitable for action rule mining. In addition, the procedure for generating recommendations (meta nodes) has changed and is described in more detail later.

Opinion Mining
Our approach to aspect-based opinion mining consists of four steps:
1. Identify opinion sentences and their orientation with localization.
2. Summarize each opinion sentence using discovered dependency templates.
3. Summarize opinions based on identified feature (aspect) words.
4. Generate meta-actions with regard to given suggestions.
Our approach is dictionary-based, that is, we use pre-defined dictionaries of both opinion words and aspect words. The detailed procedure is presented in [23]. This approach uses the recognition of opinion words, associating opinion words with aspect words, using Stanford lexical dependency patterns [24], and clustering similar opinion-aspect pairs. We propose a two-level definition of aspects: general aspects (such as technician, staff, pricing), and more detailed aspects, such as technician knowledge, staff attitude, and pricing competitiveness (see Figures 6 and 7).

Sentiment Table
The goal of this step is to build a data table in a format suitable for data mining algorithms. That way, we can adapt the new data format to our previous approach based on mining actionable knowledge to generate recommendations. The data table (see the sample in Table 1) is built with the following procedure:
• The attributes (columns) of the table are defined as aspects in the sentiment mining. Depending on whether we consider service surveys or parts surveys, we use the corresponding hierarchy of aspects (Figure 6 or Figure 7, respectively) to build the table columns. We propose a two-level hierarchy for defining aspects, and only "leaf nodes" form columns of the table.
• Each row represents a uniquely identified survey, in which feedback is given as open-ended text.
• Each cell represents the sentiment score detected towards the aspect given by the column in the survey given by the row.
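The construction described above can be sketched in Python as follows. This is a minimal illustration, not the CLIRS2 implementation: it assumes the opinion mining step has already produced, for each survey, a mapping from leaf aspects to sentiment scores, and it follows the convention stated in the Sentiment Score section (0 for an aspect that is not mentioned, a null value for an aspect that does not apply to the survey type). The function name and the "parts availability" aspect are hypothetical.

```python
def build_sentiment_table(surveys, aspects, applicable):
    """Build a survey-by-aspect table of sentiment scores.

    surveys:    {survey_id: {aspect: score}} as produced by opinion mining
    aspects:    ordered list of leaf aspects (the table columns)
    applicable: set of aspects that apply to this type of survey
    """
    table = {}
    for survey_id, scores in surveys.items():
        row = {}
        for aspect in aspects:
            if aspect not in applicable:
                row[aspect] = None               # not applicable (NULL)
            else:
                row[aspect] = scores.get(aspect, 0)  # 0 = not mentioned
        table[survey_id] = row
    return table

aspects = ["technician knowledge", "staff attitude",
           "pricing competitiveness", "parts availability"]
service_aspects = {"technician knowledge", "staff attitude",
                   "pricing competitiveness"}
mined = {
    "S1": {"staff attitude": 2},
    "S2": {"technician knowledge": -1, "pricing competitiveness": -2},
}
sentiment_table = build_sentiment_table(mined, aspects, service_aspects)
```

Keeping "not mentioned" (0) distinct from "not applicable" (None) matters for the later rule mining step, because a neutral score is a value a rule can match while a null is not.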

Sentiment Score
We propose to extend the scale of sentiment polarity from positive/negative to the range {−2, −1, 0, 1, 2}. We adopted a dictionary-based approach for detecting "opinion" words. We use not only simple lists of positive and negative words but also dictionaries, such as SentiWordNet and AFINN, that assign a polarity score to words. We propose a mapping of the sentiment values assigned in the dictionaries to our discrete scale. For example, SentiWordNet uses a continuous scale <−1; 1>, and the AFINN dictionary uses a discrete scale of <−5; 5>. Therefore, standardization was added to our algorithm. We also added programmatic detection of words indicating strong opinions, such as "very" and "really", to adjust the sentiment detected on an opinion word. We assign polarity "0" (neutral) if the aspect is not mentioned in the comment, and the NULL value if an aspect is not applicable to the type of survey.
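One possible standardization is sketched below in Python. The source does not specify the exact mapping, so the thresholds here are assumptions: a nonzero dictionary score in the weaker half of its scale maps to ±1, a score in the stronger half maps to ±2, and an intensifier shifts the value one step further from zero, capped at ±2. The function names are hypothetical.

```python
INTENSIFIERS = {"very", "really"}

def to_discrete(score, max_abs):
    """Map a dictionary score in [-max_abs, max_abs] onto {-2, -1, 0, 1, 2}.

    Assumed rule: |score| up to half of the scale maps to magnitude 1,
    anything stronger maps to magnitude 2; zero stays zero.
    """
    if score == 0:
        return 0
    magnitude = 1 if abs(score) <= max_abs / 2 else 2
    return magnitude if score > 0 else -magnitude

def intensify(value, preceding_word):
    """Strengthen a nonzero score by one step if preceded by an intensifier."""
    if value != 0 and preceding_word.lower() in INTENSIFIERS:
        step = 1 if value > 0 else -1
        return max(-2, min(2, value + step))
    return value

afinn_example = to_discrete(3, 5)                # AFINN-style scale: 2
swn_example = to_discrete(-0.25, 1.0)            # SentiWordNet-style scale: -1
boosted = intensify(to_discrete(2, 5), "very")   # 1 strengthened to 2
```

Mapping every nonzero dictionary score to at least ±1 keeps weak opinions distinguishable from the 0 reserved for aspects that are not mentioned at all.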

Action Rule Mining
In the constructed decision table, each survey is represented as an N-vector (where N is the number of aspects/columns) with values in the range <−2; 2>, where the nth value denotes the sentiment score detected in the survey towards the nth aspect. The transformation produces a table with numerical values, which makes it possible to apply data mining algorithms as in the previous approach. Each survey is labeled with its NPS status category, that is, whether the customer expressing the opinion in the survey was labeled as a promoter, detractor, or passive. We propose to mine for action rules from the resulting "sentiment table", with patterns in the format given in Listing 2. The actionable patterns show how to change the customer from being an unfavorable detractor to a promoter given certain changes in sentiment towards certain aspects. We propose to combine the effects of such rules to affect the maximal number of customers given minimal business changes.
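To make the idea of an action rule over the sentiment table concrete, the following Python sketch finds the surveys that a rule could apply to. It is a simplified illustration under stated assumptions, not the mining algorithm itself: a rule is represented by a hypothetical dictionary mapping each aspect to a (current value, target value) pair, and a survey matches when it carries the starting NPS status and its scores equal the rule's current values.

```python
def matching_surveys(rule_changes, table, labels, from_status="detractor"):
    """Return ids of surveys whose status and scores match a rule's left side.

    rule_changes: {aspect: (value_before, value_after)}
    table:        {survey_id: {aspect: score}}
    labels:       {survey_id: "promoter" | "passive" | "detractor"}
    """
    matched = []
    for survey_id, row in table.items():
        if labels[survey_id] != from_status:
            continue
        if all(row.get(aspect) == before
               for aspect, (before, _after) in rule_changes.items()):
            matched.append(survey_id)
    return matched

table = {
    "S1": {"staff attitude": -1, "pricing competitiveness": 0},
    "S2": {"staff attitude": -1, "pricing competitiveness": -2},
    "S3": {"staff attitude": 2, "pricing competitiveness": 0},
}
labels = {"S1": "detractor", "S2": "detractor", "S3": "promoter"}

# Hypothetical rule: raise sentiment towards staff attitude from -1 to 2.
rule = {"staff attitude": (-1, 2)}
affected = matching_surveys(rule, table, labels)  # ["S1", "S2"]
```

The number of matched surveys corresponds to the intuition behind support described earlier, that is, how many customers a rule could potentially move towards promoter status.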

Recommendations and Predicted NPS Impact
The procedure of generating recommendations for improving NPS changed in comparison to the approach that used quantitative data. The recommendations are built as the sets of aspects that need to be addressed. The procedure creates the sets starting from minimal sets, adding a new aspect, and recalculating the NPS impact given action rules that contain that aspect. The revised procedure for generating recommendations (called "meta nodes") is presented in Algorithm 2.
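The set-growing idea can be sketched in Python as follows. This is not the authors' Algorithm 2; it is an illustrative approximation under several assumptions: each action rule is a hypothetical record with the set of aspects it requires, a support, and a confidence; a set of aspects triggers every rule whose aspects it contains; and the impact of a set is the sum of support × confidence over the triggered rules, which ignores customers covered by more than one rule.

```python
def impact(aspect_set, rules):
    """Sum support x confidence over rules fully covered by aspect_set."""
    return sum(r["support"] * r["confidence"]
               for r in rules if r["aspects"] <= aspect_set)

def grow_meta_nodes(rules, max_size=3):
    """Grow aspect sets from singletons, keeping extensions that raise impact."""
    all_aspects = sorted(set().union(*(r["aspects"] for r in rules)))
    frontier = [frozenset([a]) for a in all_aspects]
    nodes = {node: impact(node, rules) for node in frontier}
    for _ in range(max_size - 1):
        next_frontier = []
        for node in frontier:
            for aspect in all_aspects:
                candidate = node | {aspect}
                if aspect in node or candidate in nodes:
                    continue
                score = impact(candidate, rules)
                if score > nodes[node]:      # keep only improving extensions
                    nodes[candidate] = score
                    next_frontier.append(candidate)
        frontier = next_frontier
    # Sets that trigger no rule are not useful recommendations.
    return {node: score for node, score in nodes.items() if score > 0}

rules = [
    {"aspects": frozenset({"staff attitude"}),
     "support": 20, "confidence": 0.5},
    {"aspects": frozenset({"staff attitude", "pricing competitiveness"}),
     "support": 10, "confidence": 0.8},
]
meta_nodes = grow_meta_nodes(rules)
```

In this example, addressing staff attitude alone yields an impact of 10 expected customers, while adding pricing competitiveness triggers the second rule as well and raises the impact to 18; pricing competitiveness alone triggers nothing and is dropped.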

Aspect-Based Sentiment Analysis
The aspect-based sentiment analysis approach was evaluated on a representative subset of 70 text comments (35 from detractors and 35 from promoters). The evaluation metrics included: (1) accuracy, computed as the percentage of correctly recognized opinions (vs. human recognition); (2) coverage, computed as the percentage of opinions extracted vs. those recognized by a human; and (3) "All", an overall quality metric calculated as a weighted combination of accuracy (weight 0.5) and coverage (weight 0.5). In the first test case, a comment is assumed to be covered if at least one opinionated aspect in the comment was recognized (coverage per comment). In the second test case, coverage was measured by dividing by the actual number of opinionated aspects (coverage per opinion). Table 2 presents the final results and compares the accomplished machine sentiment recognition with human recognition (Hum). By introducing modifications, the coverage increased to 57%, as calculated per comment, and to 48%, as calculated per opinion. However, the accuracy decreased to 88%, as calculated per comment. On the other hand, when calculated per opinion, it increased to 96%. Overall, the algorithm was improved significantly. The weighted metric increased from 63% to 72% when measuring per comment, and from 67% to 72% when measuring per opinion. Nevertheless, there is still a gap versus human recognition.
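The two coverage variants and the weighted "All" metric can be written down as a small Python sketch. The function names and the per-comment counts in the example are hypothetical; only the definitions and the 0.5/0.5 weights come from the text.

```python
def coverage_per_comment(recognized_counts):
    """Share of comments with at least one recognized opinionated aspect."""
    covered = sum(1 for n in recognized_counts if n > 0)
    return covered / len(recognized_counts)

def coverage_per_opinion(recognized_counts, actual_counts):
    """Recognized opinionated aspects divided by the human-annotated total."""
    return sum(recognized_counts) / sum(actual_counts)

def overall_quality(accuracy, coverage, w_accuracy=0.5, w_coverage=0.5):
    """Weighted combination of accuracy and coverage (the "All" metric)."""
    return w_accuracy * accuracy + w_coverage * coverage

# Hypothetical counts for four comments: aspects found by the algorithm
# and aspects found by a human annotator.
recognized = [1, 0, 2, 0]
actual = [2, 1, 2, 1]
per_comment = coverage_per_comment(recognized)          # 2 of 4 comments
per_opinion = coverage_per_opinion(recognized, actual)  # 3 of 6 opinions

# With the rounded per-opinion figures reported above (96% and 48%).
all_per_opinion = overall_quality(0.96, 0.48)
```

With the rounded per-opinion figures the weighted metric comes out at 72%, in line with the reported value; the per-comment figures (88% and 57%) give about 72.5%, so the reported 72% presumably reflects unrounded inputs.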

NLP-Driven Recommendations
The next step in evaluation was a comparison of the approach that uses both quantitative and qualitative customer feedback with the approach that uses solely qualitative (text) data for generating customer loyalty improvement recommendations. The former procedure was built into the CLIRS system (Customer Loyalty Improvement Recommender System), and the latter into the modified version of the system, CLIRS2.
The results of the comparison are presented in Table 3. The test cases of two companies were chosen (the names of the companies were anonymized with ordering numbers: Clients 16 and 3). We used the following metrics for comparison:
• The number of action rules extracted and read into the system
• Meta actions (MA) extracted (results of sentiment mining): the number of opinionated aspect-based segments of text
• Effective meta nodes (MN): the number of generated recommendations
• Maximal NPS impact: the predicted percentage impact on NPS associated with the most impactful meta node (recommendation)
The change in the recommendation algorithm was made invisible from the user's perspective. The same web-based interface was adopted for CLIRS2, and the final recommendations can be explored in a similar way (compare Figures 3 and 8). As previously, the recommendations can be analyzed one at a time by clicking the corresponding bubble on the visualization chart and exploring the table underneath with details on the actual raw comments associated with the recommendation (Figure 9).

Discussion
From the results of testing the two versions of the system on the chosen companies' datasets (Table 3), it can be concluded that the scale of the predicted NPS impact decreased when switching to CLIRS2. For example, the maximal NPS impact in CLIRS was 8.58% for Client 16 and 4.8% for Client 3. In CLIRS2, which was tested on text notes associated with the structured survey, these impact measures decreased to 1.34% and 0.74%, respectively. The difference in scale can also be seen by comparing the scales of the visualization charts in Figures 3 and 8. The difference in the calculated NPS impact between CLIRS and CLIRS2 results from the loss of knowledge associated with quantitative data on customer feedback. In addition, it must be noted that CLIRS2 was tested on text data that were supplemental to the numerical feedback, and, in the future, the text feedback is expected to be more informative than what is currently available as "additional notes". Future work on CLIRS2 includes improving the coverage of the sentiment mining algorithm, which is the basis for generating recommendations from actionable knowledge mined on the sentiment table. The improvement in opinion mining is expected to increase the impact of the recommendations produced by CLIRS2. In addition, the open-ended surveys for collecting qualitative feedback need to include hints to customers with regard to aspects that they should cover in their open-ended opinions to maximize sentiment coverage.

Conclusions
There are many computer tools supporting survey design and customer data collection used commercially. Customer relationship management (CRM) systems collect enterprise-wide data on customers and sometimes have data analysis capabilities and decision support systems built on top of them. However, the use of intelligent systems in the field, especially recommender systems with machine learning, and in particular with text mining techniques embedded in the recommendation engine, is still scarce. This research addresses this scarcity by presenting an approach to building a sentiment analysis-based recommender system for business. The same approach can be adopted whenever open-ended qualitative feedback data are available as the only source of customer feedback. We present the strategy of adapting the version of the system that generates data-driven recommendations based on both numerical and text data (CLIRS) into a new version of the system (CLIRS2) that produces quantifiable recommendations based on text feedback only. We successfully adapted the RS without any visible interface change for its end business users. This allows for flexibility in switching between versions depending on the format of data available as customer feedback. The unique contribution of this research is proving the hypothesis that it is possible to build a knowledge-based recommender system from unstructured data. We present a novel method for aspect-based sentiment analysis and for transforming unstructured data into a structured format based on the opinion mining procedure. The hypothesis was tested in the problem area of customer loyalty measured by net promoter score. From the business point of view, the value of using the CLIRS/CLIRS2 systems lies in their ability to identify and quantify the business areas (aspects) most critical to customer satisfaction. The system improves strategic decision-making processes by helping companies prioritize strategic actions with the highest expected ROI (return on investment). Additionally, the system provides for interactivity and sensitivity analysis by assigning feasibility to business changes. The system operates on the data of multiple companies in the same industry, which allows under-performing companies to "learn" from the knowledge of similar, but better-performing companies. Finally, the system's recommendations are explainable. The system is able to justify its recommendations by displaying the matched cases of customer surveys and their feedback associated with the recommendations for improvement. This approach provides fine-grained insights into customer experience improvement.

Informed Consent Statement: All subjects gave their informed consent for inclusion before they participated in the study.