Application of Recommender System for Spending Habits Based Campaign Management

Banks increasingly try to match each customer profile with a suitable campaign. In this study, we developed a recommendation system that directs each customer to an appropriate campaign. Using data received from a private bank, we analyzed the users' credit card transactions, modeled their spending habits, and used the resulting models to recommend the most suitable campaigns. Within the scope of the study, 662,088 credit card transactions performed by 4997 customers within three months were analyzed, and three campaigns were proposed for each customer. The ALS (Alternating Least Squares) algorithm was used on Spark to build the recommendation system. The primary purpose of the study is to increase customer satisfaction by making personalized campaign offers based on each user's spending habits, instead of applying the same campaigns collectively to all customers.


Introduction
Recommender systems are a class of information retrieval systems. Their main purpose is to improve the consumer experience by providing items that are relevant to the user. With the growing usage of credit cards, banks store an enormous amount of data about their customers' profiles, such as spending habits, location, and demographic information. This data can be used for campaign management through recommender systems. Campaign management means providing a suitable campaign to a suitable customer at the right moment. In the banking sector, campaign management is part of customer relationship management (CRM). CRM is a strategy that allows companies to analyze customer profiles, determine their needs and areas of profitability, and take the necessary actions to achieve both customer satisfaction and profitability [1]. CRM covers many management units, such as campaign management, human resources management, sales management, and service management. Today, there is intense competition among banks, which is an advantage for customers. Customers expect lower transaction fees, higher interest rates, new products, and appropriate campaigns from banks [2]. Campaign management is therefore essential to ensure customer satisfaction, and campaign recommender systems can support it. One recent study on the subject is Reference [3], in which recommender systems were used for CRM to increase customer satisfaction. Recommendation systems are algorithms that provide the most meaningful and accurate products for the user by filtering useful content from big data. The data for recommendation systems can be collected not only directly from user opinions, such as ratings, but also indirectly, from purchase history, time spent on web pages, email content, etc. [4]. Generally, recommendation systems are classified as collaborative filtering, content-based filtering, and hybrid systems, as shown in Figure 1.
Content-based filtering systems produce recommendations based on item specifications or features, and can be used, for example, for web page or news recommendations. Collaborative filtering approaches assume that similar users like similar items. These systems have affected the way the online world of e-commerce and social media functions, with popular examples being Netflix movie recommendations, Amazon product recommendations, and friend recommendations on Facebook [5]. The authors of [6] used Amazon movie data to build a recommendation system and compared the results of a content-based approach using review counts, a collaborative approach using rating values, and a hybrid approach combining the two. As another example of hybrid systems, Srikanth and Nagalakshmi (2020) [7] built a song recommendation system using the SVD (Singular Value Decomposition) algorithm. Kulkarni (2017) [8] developed a book recommendation system using Apache Spark. That study proposed a solution to one of the hardest problems of recommendation systems, the cold start problem, which is the lack of evaluation values for new items or new users: popular books are recommended when no evaluation value is available. In another approach to the cold start problem, Aggarval and Bahuguna (2017) [9] built a recommendation engine for a MovieLens data set in which suggestions for new users are produced from the users' demographic characteristics. Dutta and Bandyopadhyay (2020) [10] used recommender systems to investigate customer behavior on term deposit subscriptions using feature data that includes the customer's age, job profile, marital status, etc. Their recommender system has an accuracy of 88.32%. Another banking application is the integration of a recommendation system into the delivery of personalized customer services. Nieves et al. (2019) [11] developed a hybrid recommendation system for banking products such as mortgages and loans, with the aim of improving customer support services and reducing entity management costs.
In this study, a spending habit-based recommender system is proposed for campaign management. For this purpose, the spending habits of 4997 customers are analyzed and modeled from 662,088 credit card transactions obtained from a private bank. The developed engine recommends to each customer the three most suitable campaigns among sixteen proposed campaigns. The ALS (Alternating Least Squares) algorithm was used on Spark to build the recommendation system. By recommending campaigns according to each customer's spending habits, we aimed to increase customer satisfaction.

Method
The purpose of this study was to build a recommendation system based on spending habits using collaborative filtering algorithms. These algorithms aim to fill in the missing values of a user-item association matrix. In this study, the ALS (Alternating Least Squares) method was used on Apache Spark to fit a matrix factorization (MF) model. A rating matrix R of size U × M can be decomposed into two low-rank matrices, P and Q, of size U × K and M × K, respectively, where K is the rank of the factorization, that is, the number of latent factors [12]. The matrix factorization model fills the empty cells of the original matrix R using the low-rank matrices P and Q:

$$\hat{R} = P Q^{\top} \approx R$$

To make strong recommendations, the predicted values should be as close as possible to the original values. The error is the difference between each original value and its predicted value. To minimize this error, the Stochastic Gradient Descent (SGD) and Alternating Least Squares algorithms are commonly used. In this study, the ALS algorithm was used. ALS is an iterative algorithm that computes one factor matrix with a least-squares step while holding the other fixed, then exchanges their roles, and repeats until the solution converges [13].
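The factorization and the alternating least-squares step described above can be written out as follows. This is a sketch under assumptions that the text does not state: $\Omega$ denotes the set of observed cells of R, $\lambda$ is an L2 regularization weight (assumed here because a regularization hyperparameter is tuned in the findings), $p_u$ and $q_i$ are the rows of P and Q written as column vectors, and $Q_u$ and $r_u$ collect the rows of Q and the observed values for the items that user u interacted with. It shows the explicit-feedback form; implicit-feedback variants add per-cell confidence weights, which are omitted here.

```latex
\begin{align}
\hat{r}_{ui} &= p_u^{\top} q_i \\
e_{ui} &= r_{ui} - \hat{r}_{ui} \\
\min_{P,\,Q} \; & \sum_{(u,i) \in \Omega} e_{ui}^{2}
  + \lambda \Big( \sum_{u} \lVert p_u \rVert^{2} + \sum_{i} \lVert q_i \rVert^{2} \Big) \\
p_u &\leftarrow \big( Q_u^{\top} Q_u + \lambda I_K \big)^{-1} Q_u^{\top} r_u
\end{align}
```

With Q held fixed, the last line is the closed-form minimizer of the objective with respect to $p_u$. The update for $q_i$ is symmetric, with the roles of P and Q exchanged.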
In collaborative filtering recommendation systems with implicit feedback, only positive feedback is available: if a user has no feedback for an item in the dataset, it does not mean that the user dislikes it [14]. Moreover, because user reactions to the recommendations cannot be tracked in implicit feedback-based systems, precision-based metrics are not very appropriate. In this study, a recall-based evaluation metric [15] known as Mean Percentage Ranking (MPR) was used. MPR averages the percentile position of each held-out item in the user's ranked recommendation list, so lower values indicate better recommendations.
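A minimal sketch of how MPR could be computed is given below. It assumes the commonly used weighted form $\mathrm{MPR} = \sum_{u,i} r_{ui}\,\mathrm{rank}_{ui} / \sum_{u,i} r_{ui}$, where $\mathrm{rank}_{ui}$ is the percentile position of item i in user u's ranked list (0 for the top item, 1 for the last) and $r_{ui}$ is the held-out interaction count. The function name and the dictionary-based input layout are hypothetical and chosen only for illustration.

```python
def mean_percentage_ranking(scores, held_out):
    """Weighted mean percentile rank of held-out items (lower is better).

    scores:   {user: {item: predicted score}} covering every candidate item
    held_out: {(user, item): interaction count} from the evaluation split
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for (user, item), count in held_out.items():
        user_scores = scores[user]
        # Order this user's items from most to least recommended.
        ranked = sorted(user_scores, key=user_scores.get, reverse=True)
        n_items = len(ranked)
        # Percentile position: 0.0 for the top item, 1.0 for the last one.
        pct = ranked.index(item) / (n_items - 1) if n_items > 1 else 0.0
        weighted_sum += count * pct
        total_weight += count
    if total_weight == 0:
        raise ValueError("held_out contains no interactions")
    return weighted_sum / total_weight
```

Under this definition, a model that ranks items at random is expected to score about 0.5, which gives a reference point for reading the reported values.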

Dataset and Processing
This work used credit card transaction data obtained from a private bank. The dataset covered 4997 customers and 662,088 credit card transactions, and included the encrypted customer number, merchant category code (MCC), age, marital status, education level, transaction amount, and transaction date. An MCC is a four-digit number assigned by a bank or a card organization such as Visa or Mastercard to identify the market segment of a credit card transaction [16].
First, all MCCs were merged into sixteen merchant category groups (MCGs) according to their fields; these groups also serve as the campaign groups in the study. The transaction data were then grouped by user ID and MCG to find each user's transaction count for each MCG. After data processing, the final version of the data set used in this study had 79,952 rows, each consisting of a user ID, an MCG, and the transaction count for that MCG.
The 16 MCGs derived from the data obtained from the private bank cover a wide range of categories, from education to insurance, as shown in Table 1.
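The grouping step can be sketched in plain Python as follows. The `MCC_TO_MCG` lookup is a hypothetical placeholder, because the actual assignment of MCCs to the sixteen groups is not listed in the text. The function name and the `(user_id, mcc)` input layout are also assumptions.

```python
from collections import Counter

# Hypothetical MCC -> MCG lookup with a few placeholder entries; the real
# grouping of MCCs into the sixteen MCGs is not given in the text.
MCC_TO_MCG = {5411: 8, 5812: 15, 4511: 13, 4900: 5}


def count_by_user_and_mcg(transactions, mcc_to_mcg):
    """Turn (user_id, mcc) transactions into sorted (user_id, mcg, count) rows."""
    counts = Counter()
    for user_id, mcc in transactions:
        mcg = mcc_to_mcg.get(mcc)
        if mcg is None:
            # This toy lookup has no group for the code, so the row is skipped.
            continue
        counts[(user_id, mcg)] += 1
    return sorted((user, mcg, n) for (user, mcg), n in counts.items())
```

Each output row has the same shape as the final data set described above: a user ID, an MCG, and the number of transactions in that group.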

Research Model
The model developed for the campaign recommendation system is given in Figure 2. As the figure shows, the first step of the study is data preprocessing. All customer and transaction data were imported as Microsoft SQL Server tables. The MCCs were then grouped into MCGs, and the credit card transaction data were organized according to the MCGs. In its final form, the data set consisted of the user ID, the MCG, and the count of transactions.
In this study, we built a recommendation system with implicit feedback using Apache Spark 3.0. Apache Spark is an open-source engine for big data processing and machine learning. The ALS algorithm was used through PySpark to build the recommendation system. The data were split into 60% for training, 20% for validation, and 20% for testing, and the models were evaluated using MPR. The most successful model was selected, and three campaigns were recommended for each user.
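The split and the model fit could look roughly like the sketch below. It is not the authors' code: the helper names, the column names `user_id`, `mcg`, and `txn_count`, and the fixed seed are assumptions, and the default hyperparameters are simply the best values reported in the findings. `ALS` and `recommendForAllUsers` come from `pyspark.ml.recommendation`; Spark's ALS expects integer user and item IDs, so encrypted customer numbers would first need to be mapped to integers.

```python
import random


def split_rows(rows, seed=42):
    """Shuffle rows and split them 60/20/20 into train, validation, and test."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 6 // 10
    n_valid = len(shuffled) * 2 // 10
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


def fit_and_recommend(spark, train_rows, rank=16, reg_param=0.01, max_iter=20):
    """Fit implicit-feedback ALS on (user_id, mcg, txn_count) rows.

    Returns a DataFrame with the top three MCGs for every user.
    """
    # Imported here so that split_rows can be used without Spark installed.
    from pyspark.ml.recommendation import ALS

    train_df = spark.createDataFrame(train_rows, ["user_id", "mcg", "txn_count"])
    als = ALS(
        rank=rank,
        regParam=reg_param,
        maxIter=max_iter,
        implicitPrefs=True,        # treat transaction counts as implicit feedback
        userCol="user_id",
        itemCol="mcg",
        ratingCol="txn_count",
        coldStartStrategy="drop",  # drop predictions for unseen users or items
    )
    model = als.fit(train_df)
    return model.recommendForAllUsers(3)
```

A call such as `fit_and_recommend(spark, train)` with an active `SparkSession` would return one row per user with an array of three `(mcg, score)` recommendations.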

Findings
This study aimed to develop a recommendation system with implicit feedback using the ALS algorithm for matrix factorization. Matrix factorization represents user preferences with latent factors, which are features in a lower-dimensional latent space projected from the user-item interaction matrix. ALS is an optimization algorithm for minimizing the loss function. Hyperparameter tuning yields a tuple of hyperparameters that provides an optimal model [13]. The Spark ALS model supports tuning of hyperparameters such as the regularization parameter and the rank [17]. For the first model, trained with ten iterations, the best configuration had 16 latent factors, a regularization parameter of 0.01, and an MPR value of 0.263.
The second model was created using the same parameters, but with 20 iterations. The results of the two models are given in Table 2. For the second model, the best configuration had 16 latent factors, a regularization parameter of 0.01, and an MPR value of 0.213. Because this was the better result of the two models, the recommendations were produced with it.
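The selection of the best configuration can be sketched as a small grid search that keeps the lowest validation MPR. The function name and the `evaluate` callback are hypothetical. The text reports only the best values, not the candidate grid, so any grid passed in is an assumption.

```python
from itertools import product


def select_best(ranks, reg_params, evaluate):
    """Return (mpr, rank, reg_param) for the configuration with the lowest MPR.

    evaluate(rank, reg_param) is expected to train a model with these
    hyperparameters and return its MPR on the validation split.
    """
    best = None
    for rank, reg_param in product(ranks, reg_params):
        mpr = evaluate(rank, reg_param)
        # Lower MPR is better, so keep the smallest value seen so far.
        if best is None or mpr < best[0]:
            best = (mpr, rank, reg_param)
    return best
```

Running the same search once per iteration count and comparing the winners mirrors the comparison of the ten-iteration and twenty-iteration models.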
When the recommendations were examined, the most frequent first recommendation was vacation and travel; the other campaigns offered as the first recommendation are shown in Figure 3. Within the scope of the study, three campaigns were offered to every user, and the distribution of the suggested campaigns is given in Figure 4. MCG 8 (Supermarket) and MCG 15 (Restaurant payments) were the most recommended campaigns, each recommended to around 2000 customers, followed by MCG 13 (Vacation and Travel) and MCG 5 (Bill Payments), which were recommended to around 1800 and 1500 customers, respectively. MCG 1 (Kids), MCG 11 (Unclassified expenses), and MCG 12 (Insurance) were the least recommended campaigns, each recommended to fewer than 200 customers.
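A distribution like the one summarized above can be tallied from the per-user recommendation lists with a few lines of Python. The function name and the `{user_id: [mcg, ...]}` input layout are assumptions made for this sketch.

```python
from collections import Counter


def campaign_distribution(recommendations):
    """Count how many users each MCG was recommended to.

    recommendations: {user_id: [mcg, mcg, mcg]} with the top-3 list per user
    """
    counts = Counter()
    for mcgs in recommendations.values():
        # A set ensures each user is counted at most once per campaign.
        counts.update(set(mcgs))
    return counts
```

Calling `most_common()` on the result lists the campaigns from most to least recommended.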

Discussion and Conclusions
In this study, the ALS algorithm was used to build a recommendation system based on spending habits. A total of 662,088 credit card transactions performed by 4997 customers within three months were analyzed, and three campaigns were proposed for each customer. According to the evaluations, the best model had 16 latent factors and an MPR value of 0.213. The most recommended campaigns were supermarket and restaurant payments, while the least recommended were kids, unclassified expenses, and insurance. For future work, we aim to develop a hybrid recommender system that combines collaborative filtering and content-based filtering, using both user evaluation values and user demographic features, to increase the performance of the recommendation system.