Matching Consignees / Shippers Recommendation System in Courier Service Using Data Analytics

Abstract: The purpose of this research was to create a Matching Consignees / Shippers Recommendation System (MCSRS). We used the association rule to identify product associations, the clustering technique to group shippers and consignees according to behaviors when receiving goods from similar shipper groups, and the decision tree to identify possible matches between shippers and consignees. Finally, Monte Carlo simulation was used to estimate potential revenue. The case study is a courier company in Thailand. The results showed that garment products and clothes were the products with the highest association. Shippers and consignees of these products were segmented according to recency, frequency, monetary factors, number of customers, number of product items, weight, and day. Three rules were proposed that enabled the assignment of 8 consignees to 56 shippers with an estimated increase in revenue of 36%. This approach helps decision-makers to quickly develop an effective, cost-saving, and inclusive new marketing strategy.


Introduction
Recommendation systems (RS) have increasingly been employed owing to technological advancements in understanding individual customer behavior resulting in increased customer satisfaction [1][2][3]. In e-commerce, this commonly applies to product recommendations. An RS suggests products that customers will likely prefer by considering the relationship between a customer's purchase history and the product's review rating [4,5]. For example, the popular e-commerce site Amazon has developed an RS by identifying customers who have purchased and rated items. Based on the collaborative filtering algorithm and association rule, Amazon has increased its revenue by 30% [1]. Netflix, the video rental and streaming service, held a competition to improve its RS, called Cinematch. In 2009, a team combined 107 recommendation algorithms resulting in a single prediction, improving predictive accuracy. LinkedIn and Facebook provide recommendations for people you might know, jobs you might like, groups you might want to follow, or companies in which you may be interested. LinkedIn uses Apache Hadoop to build its specialized collaborative-filtering capabilities [2]. This is due to technological advancements that allow data to be collected, stored, and processed effectively, resulting in proactive data that enhances business opportunities, captures new markets and customers, and increases margins. Big data, artificial intelligence, and the Internet of things are examples of how datafication is applied in today's businesses with many algorithmic approaches available for recommendation engines depending on the type of organization [6].

Literature Review
Understanding customer demand can help a company estimate and manage production capacity, research and development, advertising, and investments [14]. Correctly matching the demand with the right supply can allow businesses to gain customer satisfaction and competitive advantage. Boysen et al. [15] presented a demand and supply matching recommendation for parking spaces, and driver requests in sharing economy using the classification scheme. Products and services also depend on the resources and production capabilities available to the business. A firm needs to know how a new product or service would fit customer demand and company supply [16]. Klassen and Rohleder [17] reported that customer demand influences facility design and equipment in service sectors. To ensure adequate service levels, services that experience large, fast, and unpredictable changes in the market tend to have enough capacity to serve the higher levels of expected demand. Therefore, demand and capacity are generally required simultaneously for decisions. To develop new services efficiently, firms must be able to appropriately identify and understand demand, including customer needs and behavior [18].
The business model should be designed collectively to implement the plan. Business models help decide the structure of components, the relationship between the elements, and the dynamics to plan systematically. The Business Model Canvas developed by Osterwalder and Pigneur [19] defines how companies generate revenue and make a profit through the overall structures of process, customers, suppliers, channels, resources, and capabilities. The critical goal of service businesses is to understand customers and their behavior, necessitating identification of key consumer groups [20], while prioritizing the customer to allocate attention and resources to designated customer groups [13].
Researchers have tried to adapt DA tools for the service industry using association rules, clustering techniques, and decision trees to solve for several objectives. Hung and Zhang [21] conducted three DA techniques to discover various patterns of online behaviors and to predict outcomes. Birant [22] proposed a three-step DA for improved customer satisfaction. Liao et al. [23] used association rules and clustering techniques for mining customer knowledge among online customers. Pitchayadejanant and Nakpathom [24] created association rules and cluster analysis to identify patterns in tourism and to suggest related activities. The literature review in the present study focuses on DA tools that could be used to identify demand for the RS: (1) to identify potential customers, (2) to identify potential products, (3) to predict the suggested matches of shippers with consignees, and (4) to simulate the estimated revenue. Five analysis techniques are reviewed: behavior segmentation, association rules, clustering techniques, decision trees, and Monte Carlo simulation. Each of these is discussed further below.

Shippers/Consignees Behavior Segmentation
Customer segmentation is the process of dividing customers into similar groups according to certain characteristics. Customer information is examined in order to determine and retain profitable and loyal customers and then to develop an effective marketing strategy for each cluster of customers [25]. In the courier business, there are two types of customers: shippers and consignees. Shippers are direct customers who send their products via a courier. Consignees are indirect customers or customers of those shippers. Many factors need to be considered to correctly segment their behavior [26].
The Recency, Frequency, and Monetary Concept (RFM) was introduced by Bult and Wansbeek [27] and is used to analyze aspects of customer behavior, such as the length of time since last purchase (recency), the number of purchases within a certain time period (frequency), and the amount of money spent over a certain time period (monetary). The integration of RFM and DA techniques has been proposed for different applications, such as identifying customers and analyzing profitability. Aggelis and Christodoulakis [28] applied RFM to measure customer profitability and need for banking services. Khajvand et al. [29] proposed a customer lifetime value for customers of a health and beauty company, using an RFM marketing analysis method for segmentation and adding a parameter (count item). Other research combined DA with RFM, including a self-organizing map [30,31], neural networks and decision trees [32], rough set theory [33], chi-square automatic interaction detection [34], genetic algorithm [35], and sequential pattern mining [36,37].
Several studies have considered different versions of RFM analysis. For example, the Length, Recency, Frequency, and Monetary (LRFM) model was proposed by Wu et al. [38]. In addition to standard variables, this considers the length of the relationship between an organization and customers, with L defined as the number of time periods (e.g., days) from first to last purchase. Alternatively, Hosseini et al. [39] proposed a weighted form of RFM, WRFM, that calculates the weight of RFM through the multiplication of wR, wF, and wM, according to their relative importance, to make intuitive judgments about ranking and ordering. Chen et al. [40] developed a Length, Recency, Frequency, Monetary, and Profit (LRFMP) model for a logistics company to predict customer churn. Other versions include Timely RFM (TRFM), which considers the relationship between product properties and purchase periodicity [41]; Recency, Frequency, Duration and Lifetime (RFDL), which measures the duration of website visits [42]; Frequency, Recency, Amount, and Type of merchandise or service (FRAT), which aims to provide a personalized u-commerce recommendation service [43]; Group-RFM (GRFM), which adds product category groups [44]; Recency, Monetary and Loyalty (RML), which adapts RFM into annual transaction environments; and Recency, Frequency, Reach (RFR), proposed for application to social media, where variable examples are the last post for recency, the total number of posts for frequency, and the network of friends for reach [22].
In this study, we extended RFM with four additional variables, namely the number of customers (NC), the number of product items (NP) [42], weight (W), and day (D), to describe the behavioral patterns of shippers and consignees in the courier business. R is the time between the date of previous services and the reference date. Low recency customers tend to repurchase more than high recency customers [22]. F is the total number of services used within a specific period. M is the total spent in THB over a specific time. NP represents business diversity and reflects the previous history of shippers or customers. For example, a shipper who repeatedly sends the same product to individual customers or many customers may also operate as a manufacturer or a distributor of one specific product. On the other hand, if a shipper sends various types of products, the shipper may be a larger enterprise. NC represents the number of customers served by shippers and denotes the type of business. For example, if a shipper delivers products to many customers, it can be assumed that the shipper works in retailing, wholesaling, or online shopping. W represents the average weight of each delivery, which reflects delivery volume. D represents the average number of days between each transaction. A lower number of days means that sending/receiving occurs more frequently, which could suggest that the customer or shipper is an entrepreneur or business.
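The variable definitions above can be illustrated with a minimal Python sketch. The record field names (`shipper`, `consignee`, `product`, `date`, `charge_thb`, `weight_kg`) and the function name are hypothetical, and treating D as undefined for a shipper with a single transaction is an assumption; the study's exact calculations are those given in its Table 4.

```python
from collections import defaultdict
from datetime import date


def behaviour_variables(transactions, reference_date):
    """Compute R, F, M, NC, NP, W and D per shipper (illustrative field names)."""
    by_shipper = defaultdict(list)
    for row in transactions:
        by_shipper[row["shipper"]].append(row)

    result = {}
    for shipper, rows in by_shipper.items():
        dates = sorted(r["date"] for r in rows)
        # Gaps in days between consecutive transactions, used for D.
        gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
        result[shipper] = {
            "R": (reference_date - dates[-1]).days,            # days since last service
            "F": len(rows),                                     # number of services used
            "M": sum(r["charge_thb"] for r in rows),            # total spent in THB
            "NC": len({r["consignee"] for r in rows}),          # distinct customers
            "NP": len({r["product"] for r in rows}),            # distinct product items
            "W": sum(r["weight_kg"] for r in rows) / len(rows), # average weight
            # Assumption: D is undefined (None) when there is only one transaction.
            "D": sum(gaps) / len(gaps) if gaps else None,
        }
    return result
```

The same function can be applied to consignees by grouping on the consignee field instead of the shipper field.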

Association Rule
The association rule, an important data mining technique, is widely used to extract patterns of interest [45]. It is mainly used to determine relationships between features that occur synchronously in databases and is expressed as X ==> Y. The left part of the rule is the antecedent, and the right is the consequent. Sets X and Y are disjoint; that is, the same item cannot be found in both the antecedent and consequent. A rule is judged on the basis of its support and confidence levels [46]. Association rules must surpass the minimum support (Min sup) and minimum confidence (Min conf) thresholds defined by a user [47]. Support determines the frequency of the transactions that satisfy both X and Y within the set of all transactions. If a rule has low support, it might be happening simply by chance. Conversely, confidence is the conditional probability that Y will occur, given that X has occurred. Thus, the higher the confidence, the more likely it is for Y to be present in transactions that contain X. Chen et al. [48] proposed an association rule to identify associations between a customer profile and the product items purchased, to establish a method of mining changes in customer behavior. Yoshimura et al. [49] applied an association rule to extract frequent combinations of stores to characterize shoppers' behaviors. Pitchayadejanant and Nakpathom [24] created an association rule to identify patterns of tourism activities in Thai orchards and to suggest related activities.
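The verbal definitions of support and confidence given above correspond to the standard textbook formulas, written here in LaTeX for reference. The symbols T (the set of all transactions) and t (a single transaction) are notation introduced for this illustration and are not taken from the source.

```latex
\mathrm{support}(X \Rightarrow Y) = \frac{\lvert \{\, t \in T : X \cup Y \subseteq t \,\} \rvert}{\lvert T \rvert},
\qquad
\mathrm{confidence}(X \Rightarrow Y) = \frac{\mathrm{support}(X \Rightarrow Y)}{\mathrm{support}(X)}
```

For instance, if 2 of 5 transactions contain both X and Y and 3 of 5 contain X, the support is 0.4 and the confidence is 2/3.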
The Apriori algorithm provides a basic association rule, reading data once for every iteration [50], while Han et al. [51] proposed a Frequent Pattern Growth (FP-Growth) algorithm to increase the efficiency of the data mining process by scanning the file only twice. Their approach efficiently searches for frequent itemsets with minimum accesses to the original database. Moreover, it reduces the problem of an excessively large number of candidate itemsets. Many studies have compared the two [52][53][54] and specific adaptive methods have been proposed [55][56][57] to improve efficiency. This study uses FP-Growth as a tool for identifying product recommendations.

Clustering Technique
A cluster is a collection of data objects that are similar to each other and dissimilar to objects of other clusters [58]. Among the many clustering algorithms, K-means and hierarchical clustering are the two most prominent. Kaushik and Mathur [59] compared the strengths and weaknesses of K-means and hierarchical clustering techniques. Their study indicated that K-means delivers better performance and is suitable for large datasets. K-means is very sensitive to the choice of a starting point for partitioning items into K initial clusters and is used to assign each record in the dataset to only one initial cluster [60]. Although the major limitation is the requirement of the number of clusters k as an input, there exists a technique to find the suitable number of groups of datasets called "k-optimal" that can yield effective and accurate solutions [61]. Validation studies are conducted using the elbow method, which runs K-means clustering on the dataset for a range of values of k. A line chart of the average centroid distance is then plotted against k. If the line chart shows a distinct bend, resembling an elbow, that bend represents the optimal value of k [62]. The performance of different clustering methods has been compared in several reports [25,63]. Clustering techniques are widely integrated with RFM concepts for discovering patterns and relationships hidden in datasets. This is not only to develop a product/service to satisfy customers but also to track customer purchasing behaviors and to present distinct products/services for each segment. There are many applications of RFM and clustering techniques in different sectors, including the retail industry [31,64,65], banking [28], patient analytics in dental clinics [38], and e-commerce [23]. The literature also shows that several studies have developed modified K-means approaches that can improve efficiency and run time. However, these may lead to complications, and the procedures are complex when applied to large amounts of data or real data sets. Therefore, this study applies the traditional K-means algorithm because of its efficiency in classifying a large number of continuous numerical values of high dimensions.

Decision Tree
The decision tree algorithm is a supervised classification technique used to classify unknown patterns or to create a predictive model to divide subjects into groups or predict the values of a target variable. ID3 is the original decision tree method proposed by Quinlan [66] who subsequently proposed C4.5. According to Singh and Gupta [67], the decision trees (i.e., ID3, CART, and C4.5) have different characteristics when applied to different types of data sets. C4.5 can handle numerical attributes and eliminate the ID3 bias by dealing with missing values and noisy data [68]. The performance of decision trees is determined based on the accuracy and confidence of rules. Accuracy is defined on the basis of a confusion matrix for evaluating classification models.
Decision trees are widely used in several sectors. Skrbinjek and Dermol [69] applied decision trees in education to determine the relationship between student satisfaction and their performance in an e-classroom. They proposed five student satisfaction factors: student satisfaction (SATISFACT), grade (GRADE), number of students' attempts to pass the examination (EXAPPROACH), average student responses, and active engagement (views and posts) (EINVOLV), and student workload (WORKLOAD). Gonoodi et al. [70] used a decision tree algorithm to evaluate risk factors associated with vitamin D deficiency. The model investigated 14 variables, with sensitivity, specificity, accuracy, and receiver operating characteristics used for validation. Sheu et al. [71] applied decision trees to analyze the influences of internal cognition and the external environment on the loyalty of animations, comics, and games (ACG) consumers. Their results were used to suggest policies concerning products' extensional design, marketing, and CRM in the ACG industry. Hsu and Wang [72] used decision trees to identify and classify patterns in the body shapes of soldiers, generating results that are useful for manufacturers to define a suitable range of clothing sizes and to generate regular size patterns to facilitate production. Mitik et al. [73] proposed a hybrid system, using C4.5 decision trees and Naïve Bayes, to classify customers' interest in an offered product and clusters for product and channel suggestions. The results increased the overall profit/cost ratio. Tayefi et al. [74] developed a decision tree to identify risk factors associated with hypertension that can be used in programs for hypertension management. Tseng et al. [75] applied a decision tree and artificial neural network analysis to analyze historical cases of oral cancer. Dongming et al. [76] used a decision tree to predict the soil quality grade for precision fertilization in agriculture.
Many studies have applied clustering techniques, decision trees derived from customer demographics, and RFM variables. The integration of decision trees and RFM was studied by Chen et al. [40], who developed a customer churn prediction model using a decision tree. LRFMP variables were also considered and found to have an effect on customer churn. Olson et al. [32] analyzed customers' possible responses to specific product promotion. They compared three data analytics techniques: logistic regression, decision trees, and neural networks, and discussed the relative tradeoffs among these in the context of customer segmentation. Bunnak et al. [77] applied the RFM concept to determine customer loyalty according to the type of customer. Customer loyalty was partitioned into five classes using a K-means clustering algorithm and customer types (platinum, gold, and silver) were heuristically assigned. A classification system using decision trees was used to determine the loyalty of new future customers. Moedjiono et al. [78] applied K-means clustering with RFM for customer segmentation and decision trees, for internet and cable service providers to identify solvent customers who refused to pay after using services.

Monte Carlo Simulation
Monte Carlo (MC) simulation relies on repeated random sampling and statistical analysis to compute results [79,80]. It is closely related to random experiments, i.e., experiments for which the specific result is not known in advance [81]. These models typically depend on a number of input parameters, which result in one or more outputs. Mathematical models are used in natural sciences, social sciences, and engineering disciplines to describe interactions in a system of mathematical expressions [82,83].
In MC simulation, each problem is representative of a broad class of similar problems; for a detailed discussion, refer to Glasserman [84]. The technique has been applied in fields such as financial analysis, reliability analysis, Six Sigma, and mathematical and statistical physics. Armaghani et al. [85] applied MC simulation to develop a predictive model for flyrock estimation based on multiple regression analyses. In the present research, MC was used to estimate expected revenue from shippers and consignees through integration with a data mining tool to generate input data before the simulation.
The literature review shows that each tool can deal with the specific problems of several fields and industries. However, this study aims to close these gaps by integrating DA techniques, association rules, clustering techniques, and decision tree for new business development decisions in the logistics business. Therefore, this study implements DA to develop a recommendation approach for matching suitable consignees/shippers using a case study of courier service business. The MCSRS integrates association rules, clustering techniques, and decision tree learning with RFM analysis to better understand the customer behavior. The specific sets of questions are listed in Table 1.

Methodology
The method comprises four steps (Figure 1). Data of 461,708 transactions, involving 94,725 shippers and 137,652 consignees, were obtained from a 2018 sales database. The data were pre-processed as follows. Outliers were detected and replaced; e.g., an unexpectedly high cargo weight was replaced by the mean of all samples. Inconsistent spellings of the same name, e.g., Mr.Brown Emily, Mr BrownEmily, and MrBrown Emily, were made consistent. Unnecessary attributes, such as employee name and payment type, were removed. Thereafter, the data were transformed into a format suitable for association, clustering, and classification. Steps 1-3 were conducted using RapidMiner Studio 9 software and are described below in detail.

Step 1: Product Association
To identify product associations, rules were extracted from the product category of consignees' transactions using the FP-Growth algorithm [51]. This involved the following: (1) FP-Growth was used to determine items in the set that have been frequently delivered together in a certain fraction of transactions. (2) The association rules were evaluated against the Min sup and Min conf thresholds [47].
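The threshold-based evaluation in sub-step (2) can be sketched in Python as follows. This is not FP-Growth: it is a brute-force count of item pairs, adequate only for small illustrative data, and the function name and basket contents are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_sup, min_conf):
    """Return single-item rules (x, y, support, confidence) that pass both thresholds.

    Brute-force pair counting, used here instead of FP-Growth for brevity.
    """
    n = len(baskets)
    item_count = Counter()
    pair_count = Counter()
    for basket in baskets:
        items = sorted(set(basket))
        item_count.update(items)
        pair_count.update(combinations(items, 2))

    rules = []
    for (a, b), both in pair_count.items():
        support = both / n
        if support < min_sup:
            continue
        # Each frequent pair yields two candidate rules: a ==> b and b ==> a.
        for x, y in ((a, b), (b, a)):
            confidence = both / item_count[x]
            if confidence >= min_conf:
                rules.append((x, y, support, confidence))
    # Highest support first, then highest confidence.
    return sorted(rules, key=lambda r: (-r[2], -r[3], r[0], r[1]))
```

Each basket here stands for the set of product categories received by one consignee; a production analysis would use a dedicated FP-Growth implementation instead.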
Step 2: Shipper/Consignee Clustering
K-means is used to assign each record in the dataset to one initial cluster. Each record is assigned to the cluster to which it is most similar, using the Euclidean distance as the measure of similarity, as per Equation (3):

$$d(x, y) = \sqrt{\sum_{i=1}^{m} (x_i - y_i)^2} \tag{3}$$

where x and y are two records described by m variables. A suitable number of groups of the dataset is referred to as K-optimal, determined by K-means clustering. Validation is conducted by the elbow method, which runs K-means clustering on the dataset for a range of values of k. For each value of k, the average within-centroid distance is calculated using Equation (4):

$$\text{Average within-centroid distance} = \frac{1}{n} \sum_{j=1}^{n} d_j \tag{4}$$

where n is the number of dataset points and d_j is the distance between dataset point j and its cluster centroid.
A line chart of the average within-centroid distance is then plotted for each value of k. If the line chart shows a distinct bend, the 'elbow' at that bend represents the optimal value of k. This idea is similar to using the sum of squared errors [62].
This step divided shippers of Y and consignees of X==>Y into groups with similar R, F, M, NC, NP, W, and D values. Data were partitioned into k clusters using the K-means technique [86]. This involved four sub-steps: (1) Determining R, F, M, NC, NP, W, and D variables for each shipper and consignee (2) Correlating variables to investigate relationships (3) Determining optimal numbers of clusters, with the K-optimal method [62]. The K-means technique was used to cluster a group of Y shippers and X==>Y consignees. (4) Analyzing different clusters of Y shippers and screening potential X==>Y consignee clusters.
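The clustering and elbow procedure described in this step can be sketched in plain Python. This is a minimal illustration of standard K-means (Lloyd's algorithm) with a seeded random initialization, not the RapidMiner implementation used in the study; all function names and the sample points are hypothetical.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iterations=100, seed=0):
    """Standard K-means; returns (centroids, assignment of each point to a centroid index)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: initial centroids drawn from the data
    assignment = None
    for _ in range(iterations):
        new_assignment = [
            min(range(k), key=lambda c: euclidean(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assignment


def average_within_centroid_distance(points, centroids, assignment):
    """Mean distance between each point and the centroid of its cluster."""
    return sum(euclidean(p, centroids[a]) for p, a in zip(points, assignment)) / len(points)


def elbow_curve(points, k_values, seed=0):
    """Average within-centroid distance for each candidate k."""
    curve = {}
    for k in k_values:
        centroids, assignment = kmeans(points, k, seed=seed)
        curve[k] = average_within_centroid_distance(points, centroids, assignment)
    return curve
```

Plotting the values returned by `elbow_curve` against k gives the line chart in which the bend is sought. In practice the seven variables would typically be normalized first so that no single variable dominates the distance.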

Step 3: Prediction of Shippers/Consignees Matching
A C4.5 decision tree was used to identify consignees' behavior using R, F, M, NC, NP, W, and D variables to predict shipper cluster suggestions. C4.5 generates a tree by splitting the given data. It calculates overall entropy and information gain for all attributes. The attribute with the highest information gain is then chosen for the decision. At each node of the tree, C4.5 chooses one attribute that most effectively splits the training data into subsets with the best cut-off point, according to entropy and information gain [22], as per Equations (5) and (6):

$$\text{Entropy}(t) = -\sum_{i} p(i \mid t) \log_2 p(i \mid t) \tag{5}$$

$$\text{Information Gain}\ (\nabla) = \text{Entropy}(\text{parent node}) - \text{Entropy}(\text{child node}) \tag{6}$$

where p(i|t) is the fraction of records belonging to class i at a given node t [67]. The process used was as follows: (1) Classification of the delivery behavior of X==>Y consignees and Y shipper clusters through a decision tree algorithm. Predictions of X consignees were used as testing data (Figure 2). (2) Evaluation of the classification model through 10-fold cross-validation. Accuracy and confidence of rule thresholds were determined for the expected value of minimum revenue in Step 4.
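The entropy-based choice of a cut-off point can be illustrated with a short Python sketch. It follows the information-gain description above and is not a full C4.5 implementation; the child-node entropy is taken as the size-weighted average over the two subsets, which is the usual convention, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Gain from splitting a numeric attribute at value <= threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty separates nothing
    n = len(labels)
    # Child entropy: size-weighted average of the two subsets.
    child = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - child


def best_threshold(values, labels):
    """Cut-off point with the highest information gain among observed values."""
    candidates = sorted(set(values))[:-1]  # the largest value cannot split the data
    return max(candidates, key=lambda t: information_gain(values, labels, t))
```

Here `values` would be one behavioral variable (for example F) for each consignee and `labels` the shipper cluster to which that consignee is linked.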



Step 4: Revenue Simulation
Expected revenue for shippers and consignees in suggested pairings was estimated using the MC and fitness thresholds approach in @Risk 7.6 software. The variables considered in Equation (7) are defined as follows: F(x)_{C/G Ratio} is the revenue ratio of X/Y products (a continuous probability distribution); F(x)_R is the total revenue of consignees in THB (a continuous probability distribution); the fitness of association (FA) is the support value of association multiplied by the confidence of association (CA); and the fitness of classification (FC) is the rule confidence of classification (RC) multiplied by the accuracy of classification (AC).
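A Monte Carlo estimate built from these variables might look like the following Python sketch. Two points are assumptions and not taken from the source: the four quantities are combined as a simple product (Equation (7) itself is not reproduced in this text), and both continuous distributions are modeled as triangular. All numeric inputs in the usage note are illustrative.

```python
import random
import statistics


def simulate_expected_revenue(ratio_dist, revenue_dist, support, confidence_assoc,
                              rule_conf, accuracy, n_trials=10_000, seed=42):
    """Monte Carlo estimate of expected revenue; returns (mean, standard deviation).

    ratio_dist and revenue_dist are (low, high, mode) triples for triangular
    distributions (an assumption; the study fits its own distributions in @Risk).
    """
    rng = random.Random(seed)
    fa = support * confidence_assoc  # fitness of association
    fc = rule_conf * accuracy        # fitness of classification
    samples = []
    for _ in range(n_trials):
        ratio = rng.triangular(*ratio_dist)      # revenue ratio of X/Y products
        revenue = rng.triangular(*revenue_dist)  # total consignee revenue in THB
        # Assumption: the terms are combined as a product.
        samples.append(ratio * revenue * fa * fc)
    return statistics.mean(samples), statistics.stdev(samples)
```

For example, `simulate_expected_revenue((0.5, 1.5, 1.0), (1000, 3000, 2000), 0.09, 0.8, 0.9, 0.8)` returns the mean and spread of the simulated revenue for one hypothetical pairing; fixing `seed` makes the run reproducible.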

The Case and Data Collection
This research used the biggest bus company in northern Thailand, the G Company, for its case study. Over the years, the company has faced strong competition from other types of public transportation, such as taxis, trains, and especially low-cost airlines. The boom in low-cost airlines has affected many market segments owing to competitive prices and promotions that impact customer numbers and profits. To address this crisis, the company has tried to develop two new business units. The first is a charter service, a non-regular route public carrier. The second is a courier service that uses the available space under routed buses. Since 2010, the G Company has provided a port-to-port courier service along its bus routes, using 115 service ports covering 22 provinces from northern to southern Thailand. It provides a courier service as well as packing, short-term warehousing, and door-to-door services in some areas.
The company's courier service has become an important business with several types of customers, exhibiting a 17.03% revenue growth rate in 2018. The strength of this business is derived from the high frequency of bus travel and low cost, given that the service capitalizes on already available space under passenger service buses. The company has a customer database, providing scope for development of further business opportunities. All sale activities are recorded in the transaction database. However, this is not used for any further analysis at present.
Sales transaction data from 2018 were generated and collected in an Excel spreadsheet. Courier revenue during this year totaled 79 million THB. Attributes collected in the database included shippers'/consignees' names, dates, times, product group names, product category names, start/destination stations, quantities, weights, freight charges, membership application dates, and payment types (Table 2). Products were originally classified into 66 categories (items), then further aggregated into 13 general groups (Figure 3). The products most frequently transported were food and drink (23%); clothes, garments, and accessories (21%); agricultural products (16%); and furniture, décor/kitchenware (10%).


Identifying Potential Products Using Association Rule
The association rule was used to determine the relationships between product categories. Data were separated into two groups, shipper and consignee datasets, to investigate (a) which products are often delivered together and (b) customer behaviors seen when sending/receiving products (Table 3). Association diagrams for hidden product categories are shown in Figure 4. Sequential association rules were applied, and Rules 1 through 11 (Table 3) are those with the highest support rates (>0.005). Rule 1 indicates that more than 9% of consignees receive garment products (in the group of handicrafts, souvenirs, and gifts) and also receive clothes (in the group of clothes, garments, and accessories), with a confidence level of 80.81%, indicating a strong relationship between these two items. Rule 2 indicates that 1.9% of consignees who receive garments and related products also receive clothes. Rule 3 indicates that 1.8% of consignees who receive bags and handbags also receive clothes. In this study, Rule 1 is selected as an example rule.

Rule 1: {Garment products ==> Clothes}
Using Rule 1, a list of garment products consignees can be provided to shippers who are clothes suppliers. There can be many shippers and consignees who have not traded with each other. However, not all garment products consignees may want to have clothes delivered. There are 1435 consignees who receive garment products and 14,818 clothes shippers; therefore, marketing investment costs may be high. This necessitates a screening process to identify possible matches. The clustering technique is used for this purpose.


Variable Analysis
The clustering technique was applied after generating association rules to identify shippers with similar variables, as defined and calculated in Table 4.

Statistical Correlation Analysis
As shown in Table 5, M is positively correlated with F, with a coefficient of 0.895. For R and D, lower values are preferable: the lower the value of R, the more likely the customer is to be a repeat customer, and the lower the value of D, the more frequently the customer uses the service. Overall, however, the results show relatively low correlations between variables.
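As an illustration of how a coefficient such as the one between M and F is computed, the following sketch implements the Pearson correlation coefficient. It assumes that Table 5 reports Pearson coefficients, which the text does not state explicitly; the sample values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-shipper values: frequency (F) and money spent (M, in THB).
frequency = [1, 2, 4, 8, 10]
monetary = [150, 320, 500, 1300, 1500]
print(f"r(F, M) = {pearson(frequency, monetary):.3f}")
```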

K-means Clustering
Using the product association example of "Garment products ==> Clothes," clothes shippers, as well as consignees of garment products and clothes, were divided into groups, and five clusters were identified. Cluster_0 was the largest group, with 14,534 shippers. This group delivers products twice a year, with an average of 314 THB spent per year. These shippers are not business organizations, as F, NC, and NP are relatively low. Based on D, which was > 272 days on average, this group does not appear to serve regular customers. The next group, Cluster_2, comprised three shippers with the highest average scores for R, F, M, NC, NP, W, and D. The average number of customers was 302, and the average number of products was 15. D was less than 1 day, and the average weight was 36.3 kg.
Cluster_2 is the most interesting group, with the most business potential. Meanwhile, Clusters_1, 3, and 4 were made up of shippers with higher average scores than Cluster_0 but lower scores than Cluster_2, with 13, 30, and 288 shippers, respectively. For simple ranking, the seven variables were normalized, and simple additive weighting (SAW) was used [87]. This indicated that Cluster_2 was the most preferred target, followed by Clusters_4, 3, 1, and 0 (Table 6). Consignees who receive garment products and clothes were also clustered. Of 2139 consignees, 236 were targeted as potential customers.
Figure 5 shows the 3D scatter plot of RFM values for the various clusters, with a higher RFM indicating more regular customers. Cluster_2 consists of shippers who are more loyal in terms of the frequency of service usage or the amount of money spent on the company's services. This group is followed by Clusters_4, 3, 1, and 0, with the latter consisting of one-time customers.
Variables NC, NP, W, and D were used to analyze shipping to gain insight into behavior and type or size of business, and to provide recommendations for shippers and customers. Figure 6 shows scatter plots for clothes shipper clusters. Most clusters have NC < 50, with the exception of Clusters_2 and 4, which have NC > 100. The number of products varies with each cluster. Clusters_1, 2, 3, and 4, however, have a clear pattern for W and D, in contrast to Cluster_0.
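The cluster ranking step can be sketched as follows. This is a minimal illustration of simple additive weighting under stated assumptions: min-max normalization, equal weights, and a distinction between benefit criteria (higher is better) and cost criteria (lower is better, as described for D). The text does not specify the normalization or weights actually used, and the cluster names and centroid values below are hypothetical.

```python
def min_max_normalize(values, benefit=True):
    """Scale values to [0, 1]; for cost criteria the scale is reversed."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0] * len(values)
    if benefit:
        return [(v - lo) / (hi - lo) for v in values]
    return [(hi - v) / (hi - lo) for v in values]


def saw_scores(alternatives, criteria, weights):
    """Weighted sum of normalized criterion values per alternative.

    alternatives: {name: {criterion: value}}
    criteria:     {criterion: True for benefit, False for cost}
    weights:      {criterion: weight}, assumed to sum to 1
    """
    names = list(alternatives)
    scores = {name: 0.0 for name in names}
    for criterion, is_benefit in criteria.items():
        column = [alternatives[name][criterion] for name in names]
        normalized = min_max_normalize(column, is_benefit)
        for name, value in zip(names, normalized):
            scores[name] += weights[criterion] * value
    return scores


# Hypothetical cluster centroids (not the values of Table 6).
centroids = {
    "cluster_a": {"F": 2, "M": 300, "D": 270},
    "cluster_b": {"F": 150, "M": 90000, "D": 1},
    "cluster_c": {"F": 40, "M": 20000, "D": 30},
}
criteria = {"F": True, "M": True, "D": False}  # D: fewer days is better
weights = {"F": 1 / 3, "M": 1 / 3, "D": 1 / 3}  # equal weights (assumption)

scores = saw_scores(centroids, criteria, weights)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

The same function extends to all seven variables by adding entries to `criteria` and `weights`.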

Predicting Possible Matching of Shippers and Consignees Using Decision Tree
Shippers in Cluster_2 had the greatest potential for new business. However, it was necessary to estimate how many matching pairs could be created with the RS and at what level of accuracy. Decision tree learning was used to discover product delivery behavior and to forecast which groups of customers are likely to acquire various products. The dataset used for the behavioral study consisted of training and testing data (Figure 2). The training data covered 263 consignees who received garment products and clothes over 867 transactions and were used to create the classification model. The testing data included 1435 customers of garment products (those who have never received clothes). The model was validated with 10-fold cross-validation. Model efficiency was determined by the number of correct forecasts made in every class. The model learned and predicted the behavior of customers who had never received clothes with an accuracy of 62.40%, resulting in eight forecasting rules (Figure 7).

Rules were then selected based on the rule's level of confidence, the value of the cluster, and the number of members in a cluster, among other factors. Three example rules are as follows:
• Rule 1 suggests matching 1 consignee to 13 shippers with a confidence level of 100%. This rule applies to 'big lot receivers' who receive products with an average shipping weight > 303.13 kg and was suggested to 13 shippers in Cluster_4. These are clothes shippers who send high-weight, high-frequency deliveries to a number of customers and with a variety of product items (rank 2). New consignees with parameters within this behavior range can also be suggested to shippers in Cluster_4.
• Rule 7 suggests matching 1 consignee to 3 shippers with a confidence level of 100%. This rule applies to 'high-frequency receivers' who receive garment products with an average weight < 303.13 kg, spend < 24,502 THB per year, and receive > 2 deliveries/week, but at a lower weight than the 'big lot receivers' group. This consignee group was suggested to 3 shippers in Cluster_2. These rank 1 clothes shippers send high-frequency, high-weight deliveries, have a high number of customers, and manage a variety of product items. New consignees with parameters within this range of behavior can be suggested to shippers in Cluster_2.
• Rule 4 suggests matching 6 consignees to 40 shippers with a confidence level of 60%. It applies to 'moderate weight and frequency' consignees who received garment products from > 1 shipper, with an average shipment weight of 144.30-303.13 kg and < 24,502 THB spent per year. This group was suggested to 40 shippers in Cluster_3. These are clothes shippers positioned in rank 3. New consignees positioned within this range of behavior can be suggested to the 40 shippers in Cluster_3.
Consignees grouped according to Rules 1, 7, and 4 were segmented by F and W, as shown in Figure 8.
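The three example rules can be expressed as a small rule-based matcher for new consignees. This is a sketch only: the thresholds are the ones quoted in the rules above, but the field names, the order in which the rules are checked, and the treatment of boundary values are assumptions, since the full decision tree (Figure 7) is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConsigneeProfile:
    # Hypothetical field names for the behavior variables used by the rules.
    avg_weight_kg: float
    spend_thb_per_year: float
    deliveries_per_week: float
    n_shippers: int


def suggest_shipper_cluster(profile: ConsigneeProfile) -> Optional[str]:
    """Return the shipper cluster suggested by Rules 1, 7, or 4, else None.

    Assumption: rules are checked in the order 1, 7, 4; the actual tree
    may resolve overlapping conditions differently.
    """
    # Rule 1: 'big lot receivers'
    if profile.avg_weight_kg > 303.13:
        return "Cluster_4"
    # Rule 7: 'high-frequency receivers'
    if (
        profile.avg_weight_kg < 303.13
        and profile.spend_thb_per_year < 24502
        and profile.deliveries_per_week > 2
    ):
        return "Cluster_2"
    # Rule 4: 'moderate weight and frequency' consignees
    if (
        profile.n_shippers > 1
        and 144.30 <= profile.avg_weight_kg <= 303.13
        and profile.spend_thb_per_year < 24502
    ):
        return "Cluster_3"
    return None


# Hypothetical new consignee.
new_consignee = ConsigneeProfile(
    avg_weight_kg=200.0,
    spend_thb_per_year=10000.0,
    deliveries_per_week=1.0,
    n_shippers=2,
)
print(suggest_shipper_cluster(new_consignee))
```

Consignees that satisfy none of the three rules return `None`, reflecting that only a subset of the eight forecasting rules is covered here.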

Simulating Expected Revenue Using Monte Carlo Simulation
MC simulation, run with @RISK software, was used for dimensional analysis of revenue. Three rules were used to evaluate the feasibility of reaching the expected revenue level, a minimum base requirement when assessing the fitness of models. The variables considered are given in Equation (7). The F(x) C/G Ratio is taken from the distribution fitting of the clothes/garment products revenue ratio of 2400 consignees. F(x) R is taken from the total revenue distribution fitting of consignee numbers according to the rule. SA and CA are derived from Table 3. RC was derived from Table 7, while AC was 0.624.

Table 8 shows the distribution model of expected revenue obtained using MC modeling, together with summary input and output parameters. For example, the expected revenue of 1038 THB for Rule 1 was obtained based on an F(x) C/G Ratio of Triang (4.05, 6.5015), F(x) R of 3884, SA of 0.09, CA of 0.8, RC of 1, and AC of 0.624.
Predictions of the expected revenue result from simulations of 10,000 iterations, and Table 9 shows the results of MC modeling (90% confidence level). For Rule 1, Lognorm (1082, 939.8) was the best-fitting model, and a value of 1038 THB/year was simulated as the most likely expected revenue, with a range of 192-2675 THB/year. For Rule 7, Lognorm (3186.2, 2789.7) was also the best-fitting model, with an expected revenue of 3063 THB/year and a range of 558-8161 THB/year. For Rule 4, the most likely expected revenue was 12,266 THB/year, with a range of 334-22,200 THB/year. Based on these rules, revenue is expected to grow by 16,368 THB/year, equivalent to a 36% revenue increase from eight consignees. Even more revenue can be generated by considering all rules and all product categories and creating classification rules for suggesting supply-demand relationships.
Table 9. Obtained minimum, mean, and maximum expected revenue (THB) within 90% confidence level using MC modeling of Rules 1, 7, 4, and the total.
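To illustrate how a 90% range can be read from a fitted lognormal output such as Lognorm (1082, 939.8), the sketch below draws 10,000 samples and reports the 5th and 95th percentiles. It assumes that the two Lognorm parameters are the mean and standard deviation of the distribution itself; it samples the fitted output only and does not reproduce Equation (7) or the @RISK model, so its percentiles need not match the values in Table 9. All function names are hypothetical.

```python
import math
import random


def lognormal_params(mean, sd):
    """Convert the mean and sd of a lognormal variable to (mu, sigma)
    of the underlying normal distribution."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)


def percentile(sorted_values, q):
    """Simple rank-based percentile on an already sorted list (0 <= q <= 1)."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]


def simulate_revenue(mean, sd, iterations=10_000, seed=0):
    """Draw lognormal revenue samples and summarize them."""
    mu, sigma = lognormal_params(mean, sd)
    rng = random.Random(seed)  # fixed seed for reproducibility
    draws = sorted(rng.lognormvariate(mu, sigma) for _ in range(iterations))
    return {
        "mean": sum(draws) / iterations,
        "p05": percentile(draws, 0.05),
        "p95": percentile(draws, 0.95),
    }


# Rule 1 output distribution as reported in the text (THB/year).
result = simulate_revenue(1082.0, 939.8)
print({key: round(value) for key, value in result.items()})
```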

Conclusions and Discussion
The purpose of this study was to demonstrate the benefits of implementing DA to create a model for pairing consignees and shippers for a Thai courier service. DA was performed using a database of 461,708 sales transactions between 94,725 shippers and 137,652 consignees. To answer RQ1, the association rule was used to learn which categories of products consignees often received together and which shippers could be matched with consignees based on the type of products received. Our results suggested pairing clothes shippers with garment product consignees, drawn from a total of 1435 garment product consignees and 14,818 clothes shippers. To answer RQ2, the clustering technique was used to identify customers with similar behaviors in order to link the five identified shipper clusters to the appropriate consignee groups. To answer RQ3, a decision tree was used to classify consignees based on behavior patterns exhibited when receiving goods from similar shipper groups. Three rules were used to link 8 consignees to 56 shippers (Figure 9). Finally, RQ4 was answered by using MC simulation to predict an estimated revenue increase of 36%, or 16,368 THB/year. This approach could recommend new products to new shippers and consignees without affecting the original consignee-shipper relationships. The results presented here consider only one pair of product categories and three prediction rules. If all product categories were considered, this could generate an additional 6.6% of total revenue (5.2 million THB/year) and an additional 31,000 transactions. This study also had limited data (461,708 transactions from one year), and the product categories used were rather broad. More specific identification of products would deliver more accurate results.
Future research can consider an analysis of other product pairs that could result in increased revenue. Model validation can be conducted by interviewing consignees and determining if they would be interested in these suggestions. This would help confirm that the approach is credible and usable in real situations.